This thread and the ticket linked by Michael got me curious about whether we could write our own routine for spawning processes that doesn't invoke the usual copy-on-write semantics.
The struct returned by exec.Command has a SysProcAttr field where you can set (Linux-specific) flags to pass to the clone syscall. The CLONE_VM flag looks promising, but it upsets the Go runtime when used (fatal error: runtime: stack growth during syscall). If CLONE_VM | CLONE_VFORK is used the executable runs - I see the output from echo - but the call to Run never returns. I'm not sure why, given that with CLONE_VFORK the parent process is supposed to unblock once the child calls execve (which it does). I could poke at this further but I need to get back to other things. Looking at https://github.com/golang/go/issues/5838, I'm not the only one who's tried this and run into similar problems.

Another approach - and one that Go might do internally one day - is to tell the kernel not to allow copy-on-write for all (or at least large) memory blocks. Tweaking the allocations in Nate's example like this:

    bigs := make([][]byte, 6)
    for i := range bigs {
        bigs[i] = make([]byte, GB)
        syscall.Madvise(bigs[i], syscall.MADV_DONTFORK)
    }

allows the fork to work. It fails as before without the Madvise calls. This isn't particularly practical for us, but it's an interesting data point anyway.

- Menno

On 4 June 2015 at 02:07, John Meinel <j...@arbash-meinel.com> wrote:
> Yeah, I'm pretty sure this machine is on "0" and we've just overcommitted
> enough that Linux is refusing to overcommit more. I'm pretty sure juju was
> at least at 2GB of pages, where 1GB was in RAM and 1GB was in swap. And if
> we've already overcommitted to 9.7GB over 6.2GB linux probably decided that
> another 2GB was "obvious overcommits" that it would refuse.
>
> John
> =:->
>
> On Wed, Jun 3, 2015 at 5:32 PM, Gustavo Niemeyer <gust...@niemeyer.net> wrote:
>
>> From https://www.kernel.org/doc/Documentation/vm/overcommit-accounting:
>>
>> The Linux kernel supports the following overcommit handling modes
>>
>> 0 - Heuristic overcommit handling. Obvious overcommits of
>> address space are refused.
>> Used for a typical system. It
>> ensures a seriously wild allocation fails while allowing
>> overcommit to reduce swap usage. root is allowed to
>> allocate slightly more memory in this mode. This is the
>> default.
>>
>> 1 - Always overcommit. Appropriate for some scientific
>> applications. Classic example is code using sparse arrays
>> and just relying on the virtual memory consisting almost
>> entirely of zero pages.
>>
>> 2 - Don't overcommit. The total address space commit
>> for the system is not permitted to exceed swap + a
>> configurable amount (default is 50%) of physical RAM.
>> Depending on the amount you use, in most situations
>> this means a process will not be killed while accessing
>> pages but will receive errors on memory allocation as
>> appropriate.
>>
>> Useful for applications that want to guarantee their
>> memory allocations will be available in the future
>> without having to initialize every page.
>>
>> On Wed, Jun 3, 2015 at 7:40 AM, John Meinel <j...@arbash-meinel.com> wrote:
>>
>>> So interestingly we are already fairly heavily overcommitted. We have
>>> 4GB of RAM and 4GB of swap available. And cat /proc/meminfo is saying:
>>> CommitLimit:     6214344 kB
>>> Committed_AS:    9764580 kB
>>>
>>> John
>>> =:->
>>>
>>> On Wed, Jun 3, 2015 at 9:28 AM, Gustavo Niemeyer <gust...@niemeyer.net> wrote:
>>>
>>>> Ah, and you can also suggest increasing the swap. It would not actually
>>>> be used, but the system would be able to commit to the amount of memory
>>>> required, if it really had to.
>>>>
>>>> On Jun 3, 2015 1:24 AM, "Gustavo Niemeyer" <gust...@niemeyer.net> wrote:
>>>>
>>>>> Hey John,
>>>>>
>>>>> It's probably an overcommit issue. Even if you don't have the memory
>>>>> in use, cloning it would mean the new process would have a chance to change
>>>>> that memory and thus require real memory pages, which the system obviously
>>>>> cannot give it.
>>>>> You can workaround that by explicitly enabling overcommit,
>>>>> which means the potential to crash late in strange places in the bad case,
>>>>> but would be totally okay for the exec situation.
>>>>>
>>>>> So we're running into this failure mode again at one of our sites.
>>>>>
>>>>> Specifically, the system is running with a reasonable number of nodes
>>>>> (~100) and has been running for a while. It appears that it wanted to
>>>>> restart itself (I don't think it restarted jujud, but I do think it at
>>>>> least restarted a lot of the workers.)
>>>>> Anyway, we have a fair number of things that we "exec" during startup
>>>>> (kvm-ok, restart rsyslog, etc).
>>>>> But when we get into this situation (whatever it actually is) then we
>>>>> can't exec anything and we start getting failures.
>>>>>
>>>>> Now, this *might* be a golang bug.
>>>>>
>>>>> When I was trying to debug it in the past, I created a small program
>>>>> that just allocated big slices of memory (10MB strings, IIRC) and then
>>>>> tried to run "echo hello" until it started failing.
>>>>> IIRC the failure point was when I wasn't using swap and the allocated
>>>>> memory was 50% of total available memory. (I have 8GB of RAM, it would
>>>>> start failing once we had allocated 4GB of strings).
>>>>> When I tried digging into the golang code, it looked like they use
>>>>> clone(2) as the "create a new process for exec" function. And it seemed it
>>>>> wasn't playing nicely with copy-on-write. At least, it appeared that
>>>>> instead of doing a simple copy-on-write clone without allocating any new
>>>>> memory and then exec into a new process, it actually required having
>>>>> enough RAM available for the new process.
>>>>>
>>>>> On the customer site, though, jujud has a RES size of only 1GB, and
>>>>> they have 4GB of available RAM and swap is enabled (2GB of 4GB swap
>>>>> currently in use).
>>>>>
>>>>> The only workaround I can think of is for us to create a "forker"
>>>>> process right away at startup that we just send RPC requests to run a
>>>>> command for us and return the results. ATM I don't think we do any fork and
>>>>> run interactively such that we need the stdin/stdout file handles inside
>>>>> our process.
>>>>>
>>>>> I'd rather just have golang fork() work even when the current process
>>>>> is using a large amount of RAM.
>>>>>
>>>>> Any of the golang folks know what is going on?
>>>>>
>>>>> John
>>>>> =:->
>>>>>
>>>>> --
>>>>> Juju-dev mailing list
>>>>> Juju-dev@lists.ubuntu.com
>>>>> Modify settings or unsubscribe at:
>>>>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>>
>>
>> --
>>
>> gustavo @ http://niemeyer.net
>
> --
> Juju-dev mailing list
> Juju-dev@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju-dev