OK, I think we found the problem. We saw peak disk I/O times of more than 1 s, up to 6 s, on the GoCD server, which caused the pipelines to time out. So the root cause was the storage underneath the VM. We are still investigating what caused the peak, but we noticed the same peak on several VMs at the same time on the same storage.
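In case it helps someone else who wants an alert outside the stage/job scope: below is a minimal sketch of polling the server health messages API that Chad pointed to. The endpoint path `/go/api/server_health_messages`, the `Accept` media type, the bearer-token header, and the `level`/`message` field names are my reading of the linked API docs and should be checked against your GoCD version. The function names and the example URL are made up for illustration. We have not confirmed that the storage stall actually produced such a health message.

```python
import json
import urllib.request

# Assumed from the GoCD API docs; verify against your server version.
HEALTH_PATH = "/go/api/server_health_messages"
ACCEPT = "application/vnd.go.cd.v1+json"


def fetch_health_messages(base_url, token=None, timeout=10):
    """Fetch the server health messages and return them as a list of dicts."""
    request = urllib.request.Request(
        base_url.rstrip("/") + HEALTH_PATH,
        headers={"Accept": ACCEPT},
    )
    if token:
        # Assumes a GoCD personal access token is used for authentication.
        request.add_header("Authorization", "Bearer " + token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def select_alerts(messages, levels=("ERROR",)):
    """Keep only the messages whose 'level' is one of the given levels."""
    wanted = {level.upper() for level in levels}
    return [m for m in messages if str(m.get("level", "")).upper() in wanted]


def format_alert(message):
    """Render one health message as a single line for a mail or chat alert."""
    return "[{}] {}".format(message.get("level", "?"), message.get("message", ""))


if __name__ == "__main__":
    # Hypothetical server URL; replace with your own.
    for alert in select_alerts(fetch_health_messages("https://gocd.example.com")):
        print(format_alert(alert))
```

Run from cron or an existing monitoring system, this would sit alongside the in-pipeline fail stage rather than replace it, since it has no pipeline/stage/job scope.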

On Thu, 21 Nov 2024 at 10:47, Chad Wilson <[email protected]> wrote:

> What does the GoCD server think the status of that job and stage is? What does "the pipeline crashed" mean? If the stage is shown as passed by the GoCD server, what downstream problem did this cause? Did subsequent stages or pipelines not trigger correctly?
>
> The error looks like your agent had some kind of problem talking to the server or reporting its status. If that's the case then there is potentially a chicken-and-egg problem here that might prevent reporting at the level of scope you suggest, depending on the root cause of the issue:
>
> - If the agent couldn't talk to the server to report its status and the error was not recoverable by the agent, then you'd probably need to monitor agents for such connectivity errors. Agents do have a health API <https://docs.gocd.org/24.4.0/advanced_usage/agent-health-check-api.html> exposed which reports their ability to connect to the server. This could be monitored externally, but would not have "pipeline/stage/job" scope.
> - If the GoCD server itself had a problem preventing it from correctly updating the status from the agent, it would depend on what the cause of that error is/was and whether it happened within the scope of a stage/job. If the stage/job was left in an indeterminate state, there'd potentially be a similar problem with knowing how to report the status at the pipeline/stage's scope. The server has its own internal error reporting/tracking (the one that drives the red errors/warnings in the UI), which also has its own API for external consumption <https://api.gocd.org/current/#server-health-messages>, but we'd need to know what the root cause was and whether it triggered such an error/warning.
>
> -Chad
>
> On Thu, Nov 21, 2024 at 5:26 PM 'Hans Dampf' via go-cd <[email protected]> wrote:
>
>> Hi,
>>
>> [image: fail.png]
>>
>> We ran into this problem tonight. The stage passed, but then the pipeline crashed. The crash itself is not the main problem.
>>
>> The main problem is that we have an extra fail stage configured to send a mail with enriched information. My guess is that this fail stage did not get triggered because the previous stage succeeded.
>>
>> I found this part in the documentation, but I'm not sure whether it would have worked in this case:
>> https://docs.gocd.org/current/configuration/dev_notifications.html
>>
>> If not, then there should be a way to catch these events outside the stage and job level, but still within the pipeline, and generate an alert.
>>
>> Regards

--
You received this message because you are subscribed to the Google Groups "go-cd" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/go-cd/CANhjCLATop%3DTVwT4vzwYaBjN6s4n5sJrwbEfn1%2Brmq%2Bk5XZ2uQ%40mail.gmail.com.
