Thanks for your help on this, Michael. Yes, the fix seems pretty simple and porting to older Go releases will ensure people are not getting those super awkward messages randomly.
I have created a Trello card with a checklist so we can track all the distributions that we need to fix, and included you, John, and Zygmunt. On Sun, Mar 12, 2017 at 5:38 AM, Michael Hudson-Doyle < michael.hud...@canonical.com> wrote: > PS: I guess I should back port that Go fix to all supported Go releases? > > On 12 March 2017 at 21:37, Michael Hudson-Doyle < > michael.hud...@canonical.com> wrote: > > > Before we get into this, what is the actual problem here? Just the ugly > > messages? > > > > On 11 March 2017 at 02:58, Alfonso Sanchez-Beato <alfonso.sanchez-beato@ > > canonical.com> wrote: > > > >> On Fri, Mar 10, 2017 at 10:22 AM, John Lenton < > john.len...@canonical.com> > >> wrote: > >> > >> > Hello! > >> > > >> > We're seeing a weird issue with either go, pthreads, or the kernel. If > >> > you're knowledgeable about one or more of those things, could you take > >> > a look? Thank you. > >> > > >> > The issue manifests as nasty warnings from the "snap run" command, > >> > which is also the first step into a snapped app or service. It looks > >> > like > >> > > >> > runtime/cgo: pthread_create failed: Resource temporarily unavailable > >> > > >> > a very stripped-down reproducer is http://pastebin.ubuntu.com/ > 24150663/ > >> > > >> > build that, run it in a loop, and you'll see a bunch of those messages > >> > (and some weirder ones, but let's take it one step at a time) > >> > > > > Turns out this was fixed in Go 1.8: https://go-review. > > googlesource.com/c/33894/ > > > > > >> > if you comment out the 'import "C"' line the message will change but > >> > still happen, which makes me think that at least in part this is a Go > >> > issue (or that we're holding it wrong). > >> > > > > ... but only in the non-cgo case, you can (occasionally) still get > > messages like: > > > > runtime: failed to create new OS thread (have 5 already; errno=11) > > runtime: may need to increase max user processes (ulimit -u) > > fatal error: newosproc > > > > if you comment out the import "C". I guess we should report that > upstream. > > > > > >> > Note that the exec does work; the warning seems to come from a > >> > different thread than the one doing the Exec (the other clue that > >> > points in this direction is that sometimes the message is truncated). > >> > You can verify the fact that it does run by changing the /bin/true to > >> > /bin/echo os.Args[1], but because this issue is obviously a race > >> > somewhere, this change makes it less likely to happen (from ~10% down > >> > to ~.5% of runs, in my machines). > >> > > >> > One thing that makes this harder to debug is that strace'ing the > >> > process hangs (hard, kill -9 of strace to get out) before reproducing > >> > the issue. This probably means we need to trace it at a lower level, > >> > and I don't know enough about tracing a process group from inside the > >> > kernel to be able to do that; what I can find about kernel-level > >> > tracing is around syscalls or devices. > >> > > >> > Ideas? > >> > > >> > >> I found this related thread: > >> > >> https://groups.google.com/forum/#!msg/golang-nuts/8gszDBRZh_ > >> 4/lhROTfN9TxIJ > >> > >> << > >> I believe this can happen on GNU/Linux if your program uses cgo and if > >> thread A is in the Go runtime starting up a new thread B while thread > >> C is execing a program. The underlying cause is that while one thread > >> is calling exec the Linux kernel will fail attempts by other threads > >> to call clone by returning EAGAIN. (Look for uses of the in_exec > >> field in the kernel sources.) > >> >> > >> > > > > Yeah, this seems to be very accurate. It's also why it seems this is a > > cosmetic problem only, some thread not calling exec fails, but well, the > > thread is about to die anyway. > > > > > >> Something like adding a little sleep removes the traces, for instance: > >> > >> http://paste.ubuntu.com/24151637/ > >> > >> where the program run sleep for 1ms before calling Exec. For smaller > units > >> (say, 20 us) the issue still happens. > >> > >> It looks to me that right before running main(), go creates some > threads, > >> calling clone() and probably getting the race described in the thread. > As > >> anyway you are running Exec I guess the traces are harmless, you do not > >> need the go threads. Nonetheless, I think that the go run time should > >> retry > >> instead of printing that trace. > > > > > > Cheers, > > mwh > > > -- > Snapcraft mailing list > Snapcraft@lists.snapcraft.io > Modify settings or unsubscribe at: https://lists.ubuntu.com/ > mailman/listinfo/snapcraft > -- gustavo @ http://niemeyer.net -- Snapcraft mailing list Snapcraft@lists.snapcraft.io Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/snapcraft