[1.198.033] Dedicated server fails to restart on its own

Dalten shared this bug 4 months ago
Won't Fix

Hello Keen Support,

When scheduling an automated restart on the dedicated server (every 8h in our case), the new session sometimes gets stuck and fails to start up properly, possibly indicating that the previous server session failed to exit gracefully and still has a socket registered, listening on the TCP port. This seems to happen regularly, about once every 4 or 5 cycles. Our current fix is to kill all dedicated server tasks and start the server up manually again. Please find attached an example startup log and let me know what additional info you need.

On the dedicated server GUI, this is the line where the startup process gets held up:

Error binding server endpoint: Only one usage of each socket address (protocol/network address/port) is normally permitted
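
One way to confirm that something is still holding the port before the new session starts is to try binding the same address from a small script. The following is only a minimal sketch in Python, not part of the dedicated server: the function name `port_is_free` and the port number in the usage comment are assumptions for illustration. If the bind fails, another process, possibly a leftover server session, still owns the address.

```python
import socket


def port_is_free(port, host="0.0.0.0", kind=socket.SOCK_STREAM):
    """Return True if a socket of the given kind can be bound to host:port right now.

    Pass kind=socket.SOCK_DGRAM to probe a UDP port instead of a TCP port.
    """
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": some other process holds it.
            return False
    return True


# Usage (the port is an assumption; use the one from your own server config):
#   if not port_is_free(27016):
#       print("Port is still held by another process")
```

The probe socket is closed again as soon as the check finishes, so it does not keep the port occupied itself.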

Community, I'm not interested in using Torch, don't bother suggesting that.

Comments (6)

Hello, Dalten,

thanks for letting us know about your concern. However, when I tried, I was not able to reproduce this issue.

My test server was able to restart properly each and every time, even with different restart time settings etc.

I did get the error that you mention, along with a second error:

"This world can not be loaded. It has been created in a newer version of the game or in different branch of the game. You can change the branch, before launching the game, in game properties."

But only when I did in fact try to run a server from a newer version of the game on the older client. This error appeared when I tried to run the newer-version server AND it was still happening even after I realized this and started a completely new game so that it matched the server app version.

My new server was Mars Planet, while the server still tried to run the Lone Survivor one. The Lone Survivor was somehow stuck there even after being properly stopped. It got fixed only by closing the server app completely and running it again.

/965b5024939847305982eb77c5411a71

Is it possible that you did something similar? Are you running multiple servers there on that machine? Are you switching them often? From time to time? Never? Did you also get in the console two different names of the server/world, when starting? Can you check?

If yes, that would be great, because we might be onto something here!

If not... well, I don't know, since it's not happening to me without me making this silly mistake. In that case, it would probably be best if you could share the save file of the affected server with me.

You can access your save files by typing %appdata% into your Windows search bar and you will be redirected to the hidden Roaming folder. After that just follow: \Roaming\SpaceEngineersDedicated\Saves.

Please zip the file and attach it here. If you are having difficulty attaching files, you can optionally use Google Drive. When sharing a Google Drive link, please make sure it is set to be downloadable by anyone with the link.
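
If zipping by hand is inconvenient, the step can also be scripted. This is a minimal sketch in Python under stated assumptions: the function name `zip_world_save` and the world folder name `MyWorld` in the usage comment are hypothetical, and the only path taken from this thread is `%appdata%\SpaceEngineersDedicated\Saves`.

```python
import shutil
from pathlib import Path


def zip_world_save(world_dir, out_dir):
    """Pack one world save folder into <out_dir>/<folder name>.zip and return its path."""
    world_dir = Path(world_dir)
    if not world_dir.is_dir():
        raise FileNotFoundError(f"No such save folder: {world_dir}")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" itself, so pass the name without an extension.
    archive_base = out_dir / world_dir.name
    archive = shutil.make_archive(
        str(archive_base),
        "zip",
        root_dir=str(world_dir.parent),  # archive paths are relative to the Saves folder
        base_dir=world_dir.name,         # only this world folder is included
    )
    return Path(archive)


# Usage (the world folder name "MyWorld" is hypothetical):
#   import os
#   saves = Path(os.environ["APPDATA"]) / "SpaceEngineersDedicated" / "Saves"
#   print(zip_world_save(saves / "MyWorld", Path.home() / "Desktop"))
```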

Thanks in advance for your reply and, if it comes to that, for providing the save.

Kind Regards

Keen Software House: QA Department

Good morning Ondrej,

Thank you for the reply. Fortunately, this server runs only one instance and one world of Space Engineers, and the world was based on the Star System template. This server never changes between worlds. This physical server is dedicated to just this function, so it runs no other game servers or web services of any kind.

The server and world names are different, but we have had this configuration working fine for two years. Please see the startup log and config below:

/0e50bd70c21f659a84631437ce947076

/3a9da780eb9ce85ee4e38f8c8da9aafa


Is there currently a bug limiting this? If so, should I unify the server and world names? A lot of community servers have different server and world names:

/30677f8edab8cd1a7029493579047d9f

I have a copy of the world save uploaded to Google Drive for you. Do you have a way for me to privately share the link with you (email, perhaps)? I would rather not post it publicly.

On a side note, do you have any documented requirements regarding Antivirus policies, exemptions, etc? I'm running Sophos Home on this server. I appreciate all your help with this so far. Thank you.

Hello, Dalten,

yeah, sure. You can send it to: ondrej.borz@keenswh.com and I will take a look.

As for your other question: no, it's completely fine to have different names in the server settings. My example was just the one time I realized I had made that mistake. But since you do not run any other servers there, this is not the case :)

Thanks in advance for the save file.

Kind Regards

Keen Software House: QA Department

Good morning Ondrej,

I just emailed you the save file. I look forward to an update.

Since I mentioned observing this event every 4-5 cycles, you may have to leave your instance running for a long period before you notice the service getting stuck. It will stay in that state perpetually until you manually kill the service.

Good morning Ondrej,

The issue just happened again, after a couple of days of having my server on a 10-hour restart cycle instead of 8 hours. Here is the startup log. Please let me know what other info I can provide.

Good morning Ondrej,

Any news on your end? The issue happened again today, and also 3 days ago on 6/24. We have been on a 10h cycle since then.

Hello, Dalten,

sorry it is taking so long, but please believe me that I'm trying my best to make it happen on my side as well.

We have other work and issues to take care of, but whenever I have some time to set up the server and keep it running, I do so. So far, no luck, though :(

I will keep you updated if I manage to make it happen! You can count on that.

Please believe me that I have not abandoned this issue just because there are no new comments from me. Take it to mean that, sadly, there is nothing to share yet :(

Hope you understand. And thanks for your patience!

Kind Regards

Keen Software House: QA Department

Good afternoon Ondrej,

That's fine, thank you for the progress update. If this will take a while to resolve, it's OK as long as we keep the conversation going. In the meantime, may I offer you any other information about the system or environment?

Hello, Dalten,

so I spent most of today working on your issue, as I wanted to have some decent info for you, and... I have four points for you.

1 - It didn't happen to me on my own dedicated server, which I started, ran and let restart itself.

2 - It did happen to me with your save after a couple of restarts, that's true. But there are 40+ mods on the server, and it seems that some of them might make this issue happen more often. I don't know which, sadly :(

3 - When I stripped your save of all the mods, I still experienced this issue, but not as often (for example 1 out of 15 restarts, I would say). I did get the error(s) that we were talking about here, but when I let the restart go through this loop two or three times, it did actually connect properly after that.

4 - I found that one of my colleagues experienced this issue more than a year ago and put it into our internal system. After some digging, the devs commented on that one that there seems to be nothing they can do about it. It simply happens because, on restart, the server behaves as if the previous DS instance was not ended correctly, and that might have more than a couple of causes.

So... to summarize it:

After this faulty restart happens, please let the error loop a couple of times (as I said, two or three times was enough in my case). Make sure you have the proper boxes checked on the Maintenance tab.

Or, if you get really stuck in this loop for a long time, you will sadly be forced to restart the server manually (using the Stop button; don't kill it through Task Manager if possible), so that the faulty session is terminated properly and you can then run a new instance of the server.

I know that it is not much help for you, but that's the best I can provide right now.

Hope you understand. Will close this thread now.

Kind Regards

Keen Software House: QA Department

Good afternoon Ondrej,

I am happy to hear that you were able to replicate the issue. When the issue occurs on my end, it appears to stay in this loop perpetually. One evening it started, and I only found and addressed it 8h later, in the morning.

I would like to suggest to your dev team that a check be added to the server startup routine: if a running process is found with this name, kill it, then proceed with the rest of startup. For example, in Windows batch, the command would be:

taskkill /F /FI "IMAGENAME eq SpaceEngineersDedicated.exe"

Alternatively, add this to the end of the list of tasks done during a graceful server shutdown.
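
The same idea can also be sketched outside the server as a small wrapper script. The following Python sketch is only an illustration, not anything Keen ships: the function names and the launch command in the usage comment are assumptions, and it simply wraps the `taskkill` command shown above, so it is Windows-only.

```python
import subprocess

# Process image name taken from the taskkill command above.
IMAGE_NAME = "SpaceEngineersDedicated.exe"


def taskkill_command(image_name=IMAGE_NAME):
    """Build the taskkill invocation that force-kills every process with this image name."""
    return ["taskkill", "/F", "/FI", f"IMAGENAME eq {image_name}"]


def kill_stale_then_start(server_cmd, image_name=IMAGE_NAME,
                          run=subprocess.run, spawn=subprocess.Popen):
    """Kill any leftover server process, then launch a fresh one.

    `run` and `spawn` are injectable so the sequence can be exercised without
    actually touching any process.
    """
    # check=False: a failure to kill should not prevent the start attempt.
    run(taskkill_command(image_name), check=False)
    return spawn(server_cmd)


# Usage (the install path is hypothetical):
#   kill_stale_then_start([r"C:\SpaceEngineersDedicated\SpaceEngineersDedicated.exe"])
```

Note that `/F` terminates the process forcibly and skips any graceful shutdown, so a wrapper like this should only run when no healthy server instance is expected to be up.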

Hello, Dalten,

thanks for keeping the conversation going :)

But please believe me that there is nothing more I can do on my side here. Could you please write this suggestion in our Feedback section? The devs look there, so it might be better to suggest this solution there than to try to push it through here/me.

As I already said, even the old bug in our system was resolved as not an issue, so I don't see how this one might end up differently. Especially when it really seems to be aggravated by one or more of the mods that you are using: as there are hundreds and probably thousands of mods out there, we cannot and do not provide support for those, just for the main vanilla (without mods) game. And since you are able to get it running again manually, it's not a major blocker for you, I guess.

If I can advise something... you could try to set the restart interval to a time when you are at home and can monitor whether the server restarted properly, maybe just once a day? That might save you from the situation you described above, where the server misbehaved through the whole night before you were able to spot it in the morning.

I know it's not much, but please believe me that this is all I've got.

Do suggest it in our Feedback section and you will definitely have a better chance that some of the devs will take a look at it right away.

Hope you understand.

Kind Regards

Keen Software House: QA Department

I am disappointed that your team is willing to give up on this issue. I have made the suggested cycle change to 24h and hopefully the issue becomes more manageable. Perhaps in the future, your company would be willing to invest more in development resources than pink mansions, who knows...
