I administer a go-server on RHEL6U6 (kernel 2.6.32) with many pipelines and users. Recently we began to see frequent incidents in which the go-server stops responding to all HTTP requests. I spent a considerable amount of time trying to find the cause without much luck, until I noticed that running strace -fp <go-server pid> causes the server to respond again. Some cursory investigation suggests this kind of behavior may indicate a race condition within the process, one that is resolved when strace slows the process down a bit.
The usual JVM metrics look very healthy. The OS looks healthy too, though we get occasional spikes in load when many resources are fetched. Iowait is fine. We were running go-server v17.x on jdk1.8.0_121 when the issue started, and we still see it after upgrading to go-server v18.9.0 and jdk1.8.0_181. I don't see anything in the go-server logs that consistently coincides with the start of an incident, though there is a ton of noise in the logs. Any idea what might be causing this? My only ideas are a bad pipeline or user behavior, or a hypervisor issue, as this server is virtualized on VMware.
