Re: [Caml-list] More re GC hanging
On 01/09/2010, Damien Doligez wrote:
> On 2010-08-15, at 12:45, Adrien wrote:
>> First, remove all non-tail-rec functions: no more List.map, @ or
>> List.concat. All lists were pretty short (definitely less than 1000
>> elements) but maybe the amount of calls generated garbage or something
>> like that: I couldn't get much information about the problem so I tried
>> what I could think of and being tail-rec couldn't be a bad thing anyway.
>> The idea was to create fewer values since I first suspected a memory
>> fragmentation issue (iirc I had thousands of fragments), so I also
>> memoized some functions.
>
> That has nothing to do with the GC getting huge counts.

I know, but I first had crashes which didn't show the huge counts, so I did
what I had planned to do for some time. Also, I was actually generating
lots of garbage (well, maybe not 10^20 ;-)).

> Also, if you
> have fragmentation problems, you can try changing the allocation
> policy with this statement:
>
>   Gc.set {(Gc.get ()) with Gc.allocation_policy = 1};;
>
> I'm still waiting for feedback on that alternate allocation policy :-)

I had tried that, it didn't change anything.

>> Then, as Basile mentioned, call something like Gc.compact ()
>> regularly. The overhead is actually pretty small, especially if run
>> regularly.
>
> That's good for tracking down problems, but I wouldn't recommend it
> for production code.
>
>> Finally, C bindings. I created a few while not having access to the
>> internet and they are quite dirty. I highly doubt they play perfectly
>> well with the garbage collector: they seem ok but probably aren't
>> perfect. That's definitely something to check, even if the bindings
>> were written by someone else because working nicely with the GC can be
>> quite hard.
>>
>> Now, the problem seems to be gone but I can't say for sure. One really
>> annoying thing was that adding a line like 'print_endline "pouet";'
>> would make the out-of-memory problem go away! Same when getting stats
>> from the GC.
>
> That almost certainly indicates a problem with your C bindings: some
> pointer gets garbled and the GC may or may not stumble upon it.

That's also what I think: calling Gc.compact () doesn't solve the problem,
it only changes the planet alignment and the phase of the moon.

Sorry for the late reaction, I was pretty short on time during the past ten
days but it's going to be better now. :-)

I took a quick look at the C stubs and noticed that a few variables of type
'value' were not introduced with CAMLlocalX(), in particular in the creation
of a list. I don't know if that's enough to fix the problem since it wasn't
happening anymore on my computer and I'm now waiting for someone to be able
to test.

--
Adrien Nader
___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
Re: [Caml-list] More re GC hanging
On 2010-08-15, at 12:45, Adrien wrote:

> First, remove all non-tail-rec functions: no more List.map, @ or
> List.concat. All lists were pretty short (definitely less than 1000
> elements) but maybe the amount of calls generated garbage or something
> like that: I couldn't get much information about the problem so I tried
> what I could think of and being tail-rec couldn't be a bad thing anyway.
> The idea was to create fewer values since I first suspected a memory
> fragmentation issue (iirc I had thousands of fragments), so I also
> memoized some functions.

That has nothing to do with the GC getting huge counts. Also, if you
have fragmentation problems, you can try changing the allocation
policy with this statement:

  Gc.set {(Gc.get ()) with Gc.allocation_policy = 1};;

I'm still waiting for feedback on that alternate allocation policy :-)

> Then, as Basile mentioned, call something like Gc.compact ()
> regularly. The overhead is actually pretty small, especially if run
> regularly.

That's good for tracking down problems, but I wouldn't recommend it
for production code.

> Finally, C bindings. I created a few while not having access to the
> internet and they are quite dirty. I highly doubt they play perfectly
> well with the garbage collector: they seem ok but probably aren't
> perfect. That's definitely something to check, even if the bindings
> were written by someone else because working nicely with the GC can be
> quite hard.
>
> Now, the problem seems to be gone but I can't say for sure. One really
> annoying thing was that adding a line like 'print_endline "pouet";'
> would make the out-of-memory problem go away! Same when getting stats
> from the GC.

That almost certainly indicates a problem with your C bindings: some
pointer gets garbled and the GC may or may not stumble upon it.

-- Damien
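For readers who want to try the allocation-policy suggestion from inside a program rather than the toplevel, here is a minimal sketch. The helper name `set_allocation_policy` is made up for illustration. It assumes that policy 0 is the default next-fit policy and 1 is the alternate first-fit policy discussed above; recent runtimes may accept the field but ignore it, so the value is read back rather than assumed.

```ocaml
(* Sketch: switch the major-heap allocation policy, as in the statement
   quoted above. Assumption: 0 = next-fit (default), 1 = first-fit. *)
let set_allocation_policy (policy : int) : unit =
  Gc.set { (Gc.get ()) with Gc.allocation_policy = policy }

let () =
  set_allocation_policy 1;
  (* Read the setting back to see what the runtime actually reports. *)
  Printf.printf "allocation_policy = %d\n" (Gc.get ()).Gc.allocation_policy
```

Calling it once at program start-up, before the heap grows, is probably the simplest way to compare runs with and without the alternate policy.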
Re: [Caml-list] More re GC hanging
Adrien writes:

> Also, I found out that I had a pretty ugly error in my C bindings but
> it looks like it had no bad impact.
> Basically, I had 'external ml_f : *string* -> string array' but the C
> side read 'value ml_f()': the C function took *no* argument while
> ocaml was passing one (I wasn't actually using the argument). Has
> anything been developed against that? Anything to warn about errors
> in bindings?

That actually causes no problem on any architecture afaik. The parameter
will be placed in a register or on the stack and never accessed. The GC
might move the value around, making the register/stack value invalid, but
you never access it anyway.

I don't think there is anything that will verify that the external
declaration and the C side have the same number of arguments.

Regards,
Goswin
Re: [Caml-list] More re GC hanging
David and Basile, you are absolutely right about the redirection issue.
It's also pretty obvious, actually. I guess I need to pay more attention.

Back to the original problem, I thought I had somehow gotten rid of it but
it still happens on someone else's computer. Calling 'Gc.compact' regularly
seems to work around the problem but calling 'Array.make' might actually
have the same effect: it might not fix the problem, only prevent it from
being triggered. I'll try to reproduce it this week.

Also, I found out that I had a pretty ugly error in my C bindings but
it looks like it had no bad impact.
Basically, I had 'external ml_f : *string* -> string array' but the C
side read 'value ml_f()': the C function took *no* argument while
ocaml was passing one (I wasn't actually using the argument). Has
anything been developed against that? Anything to warn about errors
in bindings?

Finally, I don't think it has to do with the bug on 64bit systems with
ASLR, at least not directly: I'm using ocaml 3.11.2 and tried with ASLR
disabled. But I need to make a reproducer: the very high word count did not
always show up (although the out-of-memory error always did).

---
Adrien Nader
Re: [Caml-list] More re GC hanging
On Fri, Aug 20, 2010 at 6:21 PM, Richard Jones wrote:
>
> On Sun, Aug 15, 2010 at 09:03:41AM +0200, Basile Starynkevitch wrote:
> > On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> > > I haven't yet come up with a solution to the GC hanging problem I
> > > mentioned the other day.
> >
> > I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> > for that.
>
> Would that be:
> https://bugzilla.redhat.com/show_bug.cgi?id=445545
> (fixed in OCaml 3.11)?
>
> Rich.

Many systems ship old versions of ocaml. Debian stable has a pretty old
version (3.10.x?), which is quite frustrating IMHO. At any rate, bug
reporters must always indicate which ocaml version they're using.

Cheers,

--
Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy
Re: [Caml-list] More re GC hanging
On Sun, Aug 15, 2010 at 09:03:41AM +0200, Basile Starynkevitch wrote:
> On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> > I haven't yet come up with a solution to the GC hanging problem I
> > mentioned the other day.
> >
> > But here's something that looks funny. [..]
> > After turning on the Gc verbose option, I see:
> > [...]
> > !<>Sweeping 9223372036854775807 words
> > Starting new major GC cycle
> > Marking 9223372036854775807 words
> > Subphase = 10
> > Sweeping 9223372036854775807 words
> >
> > Those are some big mark and sweep numbers at the end!
>
> I guess this is related to the fact that recent Linux kernels have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by
>   echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.
>
> I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> for that.

Would that be:
https://bugzilla.redhat.com/show_bug.cgi?id=445545
(fixed in OCaml 3.11)?

Rich.

--
Richard Jones
Red Hat
Re: [Caml-list] More re GC hanging
Paul Steckler writes:

> I haven't yet come up with a solution to the GC hanging problem I
> mentioned the other day.
>
> But here's something that looks funny. I changed the default minor
> heap size, the major heap increment, the allocation policy. I also
> threw in a `Gc.major_slice 0' in the code.
> After turning on the Gc verbose option, I see:
>
>   New heap increment size: 1000k bytes
>   New allocation policy: 1
>   New minor heap size: 500k bytes
>   <>Starting new major GC cycle
>   allocated_words = 9404
>   extra_heap_resources = 49000u
>   amount of work to do = 249956u
>   ordered work = 0 words
>   computed work = 44081 words
>   Marking 44081 words
>   Subphase = 10
>   !<>Sweeping 9223372036854775807 words
>   Starting new major GC cycle
>   Marking 9223372036854775807 words
>   Subphase = 10
>   Sweeping 9223372036854775807 words
>
> Those are some big mark and sweep numbers at the end!
>
> I'm using the x64 version of Fedora 12. Maybe the 64-bit garbage
> collector has some integer overflow problems?
>
> I wasn't seeing those huge numbers in other experiments where the Gc
> hangs, so maybe that's not the underlying problem.
>
> -- Paul

I wondered about that as well. I think this is some uninitialized value
in the GC statistics.

Regards,
Goswin
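The settings described in the message above can be approximated from OCaml code. The following is only a sketch: the helper names `words_of_kbytes` and `tune_gc` are hypothetical, the sizes are taken from the log (500k bytes, 1000k bytes, policy 1), and the original program's actual code is not shown in the thread. The `Gc` fields are expressed in words, so the byte counts are converted first. The sketch also prints OCaml's own `max_int` for comparison: on a 64-bit system that is 2^62 - 1, whereas the 9223372036854775807 in the log is 2^63 - 1, the largest signed 64-bit C integer, so the figure comes from the C side of the runtime rather than from an OCaml integer.

```ocaml
(* Convert a size in kilobytes to machine words (8 bytes each on 64-bit). *)
let words_of_kbytes (kb : int) : int = kb * 1024 / (Sys.word_size / 8)

(* Hypothetical helper mirroring the settings reported in the log above.
   [verbose] is the Gc.verbose bit mask; 0 keeps the runtime silent. *)
let tune_gc ?(verbose = 0) () =
  Gc.set { (Gc.get ()) with
           Gc.minor_heap_size = words_of_kbytes 500;
           Gc.major_heap_increment = words_of_kbytes 1000;
           Gc.allocation_policy = 1;
           Gc.verbose = verbose }

let () =
  tune_gc ();
  (* Ask for one slice of major GC work, as in the original message. *)
  ignore (Gc.major_slice 0);
  Printf.printf "OCaml max_int on this system = %d\n" max_int
```

To get messages like the ones quoted, one could call `tune_gc ~verbose:(0x01 lor 0x20 lor 0x40) ()`, where 0x01 reports major GC cycles, 0x20 parameter changes and 0x40 the slice-size computation; which bits the original poster used is not stated, so that mask is a guess.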
Re: [Caml-list] More re GC hanging
On Sun, 2010-08-15 at 12:45 +0200, Adrien wrote:
>
> Finally, C bindings. I created a few while not having access to the
> internet and they are quite dirty. I highly doubt they play perfectly
> well with the garbage collector: they seem ok but probably aren't
> perfect. That's definitely something to check, even if the bindings
> were written by someone else because working nicely with the GC can be
> quite hard.

Then I suggest posting here a representative (rather complex) example of
your C binding glue code.

>
> Also, on my computer, I have the following behaviour:
> 11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
> zsh: permission denied: /proc/sys/kernel/randomize_va_space

This is normal. The sudo applies to the echo command, not to the
redirection. You want to redirect as root, not as the user, so

  sudo sh -c 'echo 0 > /proc/sys/kernel/randomize_va_space'

should work.

Cheers.

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
RE: [Caml-list] More re GC hanging
Adrien wrote:
> Hi,
> Also, on my computer, I have the following behaviour:
> 11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
> zsh: permission denied: /proc/sys/kernel/randomize_va_space
> r...@jarjar:~# echo 0 > /proc/sys/kernel/randomize_va_space
> r...@jarjar:~#
> I can't use sudo to write to most files in /proc or /sys: I have to log in
> as root ('su -' does the job just fine).

The redirection (> /proc/sys...) is not part of the sudo invocation. It is
a separate instruction to the *shell* to redirect the output of the
preceding command to a file, and so it runs with *your* uid.

There are two ways to achieve what you're after: a verbose one described in
the sudo man page (essentially you pass the whole command line to sudo,
quoted) or the easier way:

  echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

You can add > /dev/null if tee's echoing of the 0 to stdout is for some
reason annoying.

David
Re: [Caml-list] More re GC hanging
Hi,

I recently had similar output from the GC (huge count of words) which I
noticed after my program started to exit with an out-of-memory error. It
doesn't seem to be happening anymore but I'm not sure I "fixed" it. There
are three things I thought of to get rid of it. (btw, I'm on 64bit linux)

First, remove all non-tail-rec functions: no more List.map, @ or
List.concat. All lists were pretty short (definitely less than 1000
elements) but maybe the amount of calls generated garbage or something
like that: I couldn't get much information about the problem so I tried
what I could think of and being tail-rec couldn't be a bad thing anyway.
The idea was to create fewer values since I first suspected a memory
fragmentation issue (iirc I had thousands of fragments), so I also
memoized some functions.

Then, as Basile mentioned, call something like Gc.compact ()
regularly. The overhead is actually pretty small, especially if run
regularly.

Finally, C bindings. I created a few while not having access to the
internet and they are quite dirty. I highly doubt they play perfectly
well with the garbage collector: they seem ok but probably aren't
perfect. That's definitely something to check, even if the bindings
were written by someone else because working nicely with the GC can be
quite hard.

Now, the problem seems to be gone but I can't say for sure. One really
annoying thing was that adding a line like 'print_endline "pouet";'
would make the out-of-memory problem go away! Same when getting stats
from the GC.

As for the problem with randomize_va_space on 64bit, I thought it had been
fixed in 3.11 so I haven't looked at it (and in the absence of internet
access, I was unable to get details for that problem). I just tried about a
hundred runs with VA-space-randomization on and without Gc.compact calls,
and they ran without problem. Fortunately, everything is tracked in git so
I can get the older, non-working code if needed.

Also, on my computer, I have the following behaviour:
11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
zsh: permission denied: /proc/sys/kernel/randomize_va_space
r...@jarjar:~# echo 0 > /proc/sys/kernel/randomize_va_space
r...@jarjar:~#
I can't use sudo to write to most files in /proc or /sys: I have to log in
as root ('su -' does the job just fine).

Hope this helps.

---
Adrien Nader
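The first step described in the message above (tail-recursive replacements for List.map, @ and List.concat, plus memoization) could look roughly like the sketch below. The names `map_tr`, `append_tr`, `concat_tr` and `memoize` are illustrative only; the thread does not show the code that was actually written. Note that these versions build an intermediate reversed list, so they trade stack depth for some extra short-lived allocation rather than creating fewer values.

```ocaml
(* Tail-recursive stand-ins for List.map, (@) and List.concat,
   built on List.rev_map and List.rev_append from the standard library. *)
let map_tr f l = List.rev (List.rev_map f l)

let append_tr a b = List.rev_append (List.rev a) b

let concat_tr ll =
  List.rev (List.fold_left (fun acc l -> List.rev_append l acc) [] ll)

(* Simple memoization backed by a hash table (hypothetical helper):
   each distinct argument is computed once and then looked up. *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

let () =
  let calls = ref 0 in
  let square = memoize (fun n -> incr calls; n * n) in
  let squares = map_tr square [3; 3; 4] in
  (* The second 3 is served from the table, so only two real calls happen. *)
  Printf.printf "%d results, %d computations\n" (List.length squares) !calls
```

As pointed out elsewhere in the thread, none of this addresses a GC that reports huge word counts; it only reduces stack use and repeated work.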
Re: [Caml-list] More re GC hanging
> For some reason, I was able to edit that file using emacs, even when
> echo wouldn't work.

Maybe you wrote "sudo echo 0 > file" or something similar, which performs
the echo as root but the redirection as a normal user?
Re: [Caml-list] More re GC hanging
On Sun, 2010-08-15 at 18:40 +1000, Paul Steckler wrote:
> > I guess this is related to the fact that recent Linux kernels have turned
> > on the randomize virtual address space feature -designed to improve
> > system security. You could disable it by
> >   echo 0 > /proc/sys/kernel/randomize_va_space
> > but first learn more about it.
>
> For some reason, I was able to edit that file using emacs, even when
> echo wouldn't work.

To check that it did work as expected (which I doubt), do

  cat /proc/sys/kernel/randomize_va_space

It should give 0.

>
> But the hanging problem persists.

Are you sure that you don't have badly coded C routines that you call from
your Ocaml code? Don't forget the correct use of CAMLparam & CAMLlocal;
read again carefully
http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html
and perhaps other material about precise garbage collectors.

Are you sure you don't have a memory leak in your Ocaml code? This could
happen when a reference value refers to a "big" value you don't need
anymore.

Cheers.

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
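The kind of leak mentioned at the end of the message above, a reference that still points at a big value nobody needs, can be illustrated with a small sketch. Everything here (`cache`, `load`, `release`, `live_words`) is a made-up example, not code from the program under discussion. It measures live words after a full major collection, before and after the reference is cleared.

```ocaml
(* Hypothetical cache: a global reference keeps its contents reachable,
   so the GC can never reclaim the array while the reference holds it. *)
let cache : float array option ref = ref None

let load () = cache := Some (Array.make 1_000_000 0.0)

(* Dropping the reference lets the next major collection free the array. *)
let release () = cache := None

(* Live words in the major heap after a full major collection. *)
let live_words () =
  Gc.full_major ();
  (Gc.stat ()).Gc.live_words

let () =
  load ();
  let before = live_words () in
  release ();
  let after = live_words () in
  Printf.printf "live words: %d before, %d after release\n" before after
```

If the second figure does not drop in a real program, something else is probably still holding on to the value.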
Re: [Caml-list] More re GC hanging
> I guess this is related to the fact that recent Linux kernels have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by
>   echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.

For some reason, I was able to edit that file using emacs, even when
echo wouldn't work.

But the hanging problem persists.

-- Paul
Re: [Caml-list] More re GC hanging
On Sun, Aug 15, 2010 at 5:03 PM, Basile Starynkevitch wrote:
> I guess this is related to the fact that recent Linux kernels have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by
>   echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.

Can't do that, even as root: permission denied.

> I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> for that.

I installed 3.12, same hanging issue.

-- Paul
Re: [Caml-list] More re GC hanging
On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> I haven't yet come up with a solution to the GC hanging problem I
> mentioned the other day.
>
> But here's something that looks funny. [..]
> After turning on the Gc verbose option, I see:
[...]
> !<>Sweeping 9223372036854775807 words
> Starting new major GC cycle
> Marking 9223372036854775807 words
> Subphase = 10
> Sweeping 9223372036854775807 words
>
> Those are some big mark and sweep numbers at the end!

I guess this is related to the fact that recent Linux kernels have turned
on the randomize virtual address space feature -designed to improve
system security. You could disable it by

  echo 0 > /proc/sys/kernel/randomize_va_space

but first learn more about it.

I believe recent Ocaml versions (did you try 3.12?) have GC improvements
for that.

Cheers.

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***