FWIW, you can auto-register resources and environments along with an agent just as with manual assignments: https://docs.gocd.org/current/advanced_usage/agent_auto_register.html
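For anyone finding this thread later, here is a minimal sketch of how the per-agent resource tags could be scripted instead of assigned by hand. It assumes the `agent.auto.register.*` property names and the `config/autoregister.properties` location described on the page linked above. The helper names, the key value, and the environment name are made up for illustration, and the SQL/SQL-A tags are just the ones mentioned further down the thread. Check it against the docs for your version before relying on it.

```python
from pathlib import Path


def render_autoregister(key, resources, environments, hostname=None):
    """Return the text of an autoregister.properties file for one agent.

    key          -- the server's auto-register key (placeholder value below)
    resources    -- list of resource tags, e.g. ["SQL", "SQL-A"]
    environments -- list of environment names (hypothetical names below)
    hostname     -- optional display name for the agent
    """
    lines = [
        f"agent.auto.register.key={key}",
        f"agent.auto.register.resources={','.join(resources)}",
        f"agent.auto.register.environments={','.join(environments)}",
    ]
    if hostname:
        lines.append(f"agent.auto.register.hostname={hostname}")
    return "\n".join(lines) + "\n"


def write_autoregister(agent_dir, **kwargs):
    """Write config/autoregister.properties under the agent's working directory."""
    config_dir = Path(agent_dir) / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "autoregister.properties"
    target.write_text(render_autoregister(**kwargs), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Preview only; nothing is written to disk here.
    print(render_autoregister(
        key="REPLACE-WITH-YOUR-AUTO-REGISTER-KEY",  # placeholder, not a real key
        resources=["SQL", "SQL-A"],                 # tags mentioned in this thread
        environments=["example-env"],               # hypothetical environment name
        hostname="Go Agent 01",
    ))
```

Keeping one small table of agent name to tags in a script like this means a re-registered agent could come back with its tags already attached, rather than someone comparing them by eye against the abandoned registration.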
The agent UUID is in the database, but the token will not be, I believe. No idea what might cause this on the server side, but I imagine there'd be errors in the logs that correspond to the timing, if that's possibly related.

-Chad

On Fri, Apr 7, 2023 at 11:07 PM <chant...@gmail.com> wrote:

> I’ll see if I can aggregate all the logs.
>
> For the resources, we assign values to particular agents so we can dynamically assign agents based on what a job step should be doing. For example if it needs to be local to a SQL server we’ll tag that with a code and something like SQL, SQL-A if it’s in a cluster, etc. and then it all auto-assigns to the first valid agent. I’m not sure that we can do auto-registration with detail so we’ve just done it by hand.
>
> I don’t think it’s crossover because I have 4-5 agents that I need to fix that are the only one on the server but I’ll check and see what they say. Would something interrupting the SQL instance cause this to occur? A write to the agents table being missed or timing out, etc.? I’ve had some that have both the token and guid but this one just has a guid. I’m going to make sure the SQL folders are being handled gently and that AV wouldn’t be the interference.
>
> I was asking about the guid/token to see if that was stored somewhere I could retrieve or could be manually registered to make sure it reattached the agent to the assignments (environment and resources) instead of enabling the agent and doing all the assignments again. Some of these have quite a few resources to eyeball for comparisons.
>
> I’ll share back as soon as I find either more questions or a solution someone else might find useful.
>
> Thanks!
>
> *From:* go-cd@googlegroups.com <go-cd@googlegroups.com> *On Behalf Of *Chad Wilson
> *Sent:* Tuesday, April 4, 2023 11:05 AM
> *To:* go-cd@googlegroups.com
> *Subject:* Re: [go-cd] Agents going offline randomly on 22.3
>
> Was the setup working at some point and then something changed?
>
> It sounds to me like you have some problem with
>
> - agents' identities getting confused with one another (shared GUIDs), or
> - accidentally sharing working folders between two agent processes (double-starting an agent perhaps?) or
> - token getting removed after it is first issued (by something...)
>
> Do you have any automated re-provisioning of the agents or other automation here that could be interfering with the config/token or guid.txt files?
>
> I can't really think of any other reason this would happen, and there's not really much information here to debug. If the agents aren't getting confused with one another, what this looks like is the agent still knows its GUID, but assuming it was previously working, the token it was previously issued has been lost off disk. To my knowledge the agent only actively deletes a token when the registration of the agent is denied by the server due to a 403 FORBIDDEN error after you reject registration, so if you have missing tokens for agents that were previously OK, perhaps you want to see what could be deleting the token?
>
> You also may need to follow through an agent's full log and timeline to see how that could have happened, correlating to other events and search the server log for the agent's GUID to see what might be happening - snippets like the below aren't complete enough to be helpful. Or have a look through https://github.com/gocd/gocd/issues/5170
>
> And no, you can't recreate GUID/token from PostgreSQL, but not sure what you mean here. Removing the GUID and token and restarting the agent should be sufficient to get it to re-register reliably - as long as the root problem is addressed that is causing the agents
>
> As for the resource tags, is there a reason you're doing that manually? You may be able to use auto registration of agents to automate that? https://docs.gocd.org/current/advanced_usage/agent_auto_register.html
>
> -Chad
>
> On Tue, Apr 4, 2023 at 10:51 PM Funkycybermonk <chant...@gmail.com> wrote:
>
> Hello! I'm running 22.3 and I keep having agents go offline. For example, on a particular server (mirror setup to other environments) I have several agents running side-by-side on an admin server and then an agent on various individual servers. At the moment for this particular example, I have 12 of 15 agents that are running perfectly fine. They all enabled and took their configs originally but now the two that are offline are just looping the below message. Generally I can go to each server, stop the agent, delete the contents of the config folder and restart and it may after 1 or more tries create a new entry. The new entry now is missing all the resource tags so we have to note all the tags from the abandoned agent registration and add it to the new one.
>
> We have a significant number of agents around in multiple environments but this happens to maybe 10-20% of them. All agents were provisioned in the same way, started and registered in the same way.
>
> Sometimes they have a token, and guid file but sometimes there is only a guid while the error message loops. In this particular agent case, I have two that just went offline from a clean install. Both showed up initially and enabled but are now showing offline. They are on the same server but each has a different name "Go Agent 01" "Go Agent 02" etc.:
>
> 2023-04-03 18:46:28,930 INFO [scheduler-3] SslInfrastructureService:78 - [Agent Registration] Starting to register agent.
> 2023-04-03 18:46:28,930 INFO [scheduler-3] SslInfrastructureService:88 - [Agent Registration] Fetching token from server.
> 2023-04-03 18:46:28,932 ERROR [scheduler-3] TokenRequester:59 - Received status code from server 409
> 2023-04-03 18:46:28,933 ERROR [scheduler-3] TokenRequester:60 - Reason for failure A token has already been issued for this agent.
> 2023-04-03 18:46:28,933 ERROR [scheduler-3] SslInfrastructureService:106 - [Agent Registration] There was a problem registering with the GoCD server.
> java.lang.RuntimeException: A token has already been issued for this agent.
>
> I have tried to see if I could recreate the token and guid files but I can't seem to get them to be accepted when I think their values are correct. If there is a way to recreate the guid and token from the PostgreSQL server I can do that but I haven't found anything so far that seems to work for recreating those.
>
> Is there any reason that the agent would register and then lose its registration that we can try to avoid? Over the last month or two we've lost registration and set agents back up roughly 50-80 times across all areas.
>
> Thanks in advance for any assistance!
>
> --
> You received this message because you are subscribed to the Google Groups "go-cd" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/go-cd/b5b7ee4f-d21a-41a5-8162-3c883ae01542n%40googlegroups.com <https://groups.google.com/d/msgid/go-cd/b5b7ee4f-d21a-41a5-8162-3c883ae01542n%40googlegroups.com?utm_medium=email&utm_source=footer>.