On Thursday 05 February 2009 00:43, Dennis Nezic wrote:
> On Mon, 2 Feb 2009 17:26:40 -0500, Dennis Nezic wrote:
> > On Tue, 27 Jan 2009 20:13:59 +0000, Matthew Toseland wrote:
> > > On Tuesday 27 January 2009 20:03, Dennis Nezic wrote:
> > > > On Tue, 27 Jan 2009 12:44:59 -0500, Dennis Nezic wrote:
> > > > > On Wed, 21 Jan 2009 17:28:47 +0000, Matthew Toseland wrote:
> > > > > > Give it more memory. If you can't give it more memory, throw
> > > > > > the box out the window and buy a new one. If you can't do that,
> > > > > > wait for the db4o branch.
> > > > > 
> > > > > Or, more likely, throw freenet out the window :|.
> > > > > 
> > > > > > Seriously, EVERY time I have investigated these sorts of
> > > > > > issues the answer has been either that it is showing constant
> > > > > > Full GCs because it has slightly too little memory, or that
> > > > > > there is external CPU load. Are you absolutely completely
> > > > > > totally 100000000000000000000000000% sure that that is not
> > > > > > the problem? AFAICS there are two posters here, and just
> > > > > > because one of them is sure that the problem isn't memory
> > > > > > doesn't necessarily mean that the other one's problems are
> > > > > > not due to memory?
> > > > > 
> > > > > My node crashed/restarted again due to MessageCore/PacketSender
> > > > > freezing for 3 minutes. The problem appears to be CPU usage,
> > > > > since my memory usage has basically plateaued by the time the
> > > > > crash occurs, though I suppose the two factors may not be
> > > > > entirely unrelated. My CPU load (i.e. as reported by uptime)
> > > > > would sometimes rise pretty dramatically, with the 15-minute
> > > > > load number hovering between 3 and 4, which brought my system
> > > > > to a crawl; I guess this eventually "froze" some threads in
> > > > > freenet and triggered the shutdown.
> > > > 
> > > > Restarting the node "fixes" the CPU-load problem, even though the
> > > > node is then doing exactly the same work as before, at least from
> > > > the user's perspective. So, clearly, the problem is not just "slow
> > > > and obsolete" hardware as you suggest, but something internal to
> > > > the code that grows out of control over time, over the course of
> > > > dozens of hours.
> > > 
> > > I.e. memory usage. QED!
> > > 
> > > Memory usage was plateauing = memory usage was constantly at the
> > > (low) maximum, and it was using 100% CPU in a vain attempt to
> > > reclaim the last byte of memory. This is the most likely
> > > explanation by far: can you categorically rule it out by checking
> > > freenet.loggc? You did add the wrapper.conf line I mentioned?
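
For reference, the GC-logging entries in wrapper.conf look roughly like
this (the property indices below are just examples; they only need to be
unused elsewhere in the file):

    wrapper.java.additional.10=-verbose:gc
    wrapper.java.additional.11=-Xloggc:freenet.loggc
    wrapper.java.additional.12=-XX:+PrintGCDetails
    wrapper.java.additional.13=-XX:+PrintGCTimeStamps

Adding the timestamps makes it much easier to correlate long pauses with
the freezes.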
> > 
> > Hrm. Upon closer inspection of my latest loggc,
> > http://dennisn.dyndns.org/guest/pubstuff/loggc-freezes.log.bz2
> > 
> > it appears that memory may in fact be an issue. But I don't think it's
> > the memory limit itself. For this last test I set my Java memory limit
> > to 250MB, and the logs show it never went much above 200MB. BUT,
> > looking at the last few Full GCs, the time they took to complete
> > increased rapidly near the end, and the last Full GC took over 3
> > minutes(!), which probably triggered the "freeze".
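
A quick way to see that trend is to pull just the Full GC lines out of
the log, e.g.:

    bzcat loggc-freezes.log.bz2 | grep "Full GC"

The pause is the "secs" figure at the end of each line (assuming the log
was written with -verbose:gc / -Xloggc as above).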
> > 
> > My system only has 384MB of physical RAM and 400MB of swap in a
> > swapfile (all of which is on RAID5/LVM :b). My current theory is that
> > the terribly long Full GCs are due to long disk I/O times from
> > accessing the RAID5/LVM swapfile. "man java" shows an interesting
> > option, "-Xincgc", which seems to avoid Full GCs:
> > 
> > "
> > Enable the incremental garbage collector. The incremental
> > garbage collector, which is off by default, will reduce the
> > occasional long garbage-collection pauses during program exe-
> > cution. The incremental garbage collector will at times exe-
> > cute concurrently with the program and during such times will
> > reduce the processor capacity available to the program.
> > "
> > 
> > I'll see if that has any effect. (Is there any way to make the JVM
> > more forgiving, so that it can tolerate garbage collections longer
> > than 3 minutes?)
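
For the record, -Xincgc goes through the wrapper like any other JVM
argument, e.g. something like:

    wrapper.java.additional.14=-Xincgc

As I understand it, on the Sun JVM this is shorthand for the incremental
mode of the concurrent collector (roughly -XX:+UseConcMarkSweepGC plus
-XX:+CMSIncrementalMode); it trades some throughput for shorter pauses,
but it won't help much if the real problem is the heap being paged out
to swap.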
> > 
> > Here is my vmstat, in 60-second samples, without freenet running.
> > Clearly, with under 10MB of physical memory free, the swapfile will be
> > used heavily :o.
> > 
> > # vmstat -S M 60
> > ---------memory---------- ---swap-- -----io---- --system-- ----cpu----
> > swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
> >  101      9     37    179    0    0    33   155  471   192 19  2 74  5
> >  101      5     37    179    0    0    79    21  442   179 20  4 66 11
> >  101      8     37    179    0    0    71     5  441   114 68  3 25  3
> >  101      8     37    180    0    0    59    32  465   168 19  2 73  5
> > 
> > Here is the same vmstat with freenet running:
> > 
> >  196      4      5     45    0    0   267   137  518   540 39  3 38 20
> >  196      4      6     49    0    0    80   184  486   371 36  2 54  8
> >  196      7      6     46    0    0    18    39  486   303 30  1 63  6
> >  196     11      7     41    0    0    88   109  472   341 31  2 62  4
> > 
> > More swap space is in use, and there is more disk I/O: bi and bo
> > (blocks read from and written to disk) have almost doubled, us (time
> > spent running non-kernel code) has more than doubled, and wa (time
> > spent waiting for I/O, the last column) is somewhat higher.
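
For what it's worth, the si/so columns (memory swapped in/out per
second) are the ones that would show actual swap thrashing, and they
read 0 in these 60-second averages; it may be worth re-running vmstat
with a much shorter interval while a Full GC is in progress, e.g.:

    vmstat -S M 5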
> > 
> > My fingers are crossed with this -Xincgc option.
> 
> It didn't appear to have much effect. It still did a few Full GCs (3
> in 1.8 days, so far more rarely), but there was no significant
> improvement in load or memory management.
> 
> http://dennisn.dyndns.org/guest/pubstuff/loggc-freezes-xinc.log.bz2
> 
> As before, the GCs become increasingly long and erratic near the end.
> This time, the last Full GC took 99s, with a bunch of long 11s-40s GCs
> around the same time, which almost certainly contributed the most to
> the dreaded "3 minute freeze". As usual, in the hour before the freeze
> the CPU load is higher than normal (as are the GC timings); it then
> spikes even higher, followed by the freeze and node shutdown.
> 
> I'll try lowering the memory I allocate to freenet, and try to free up
> some more of my precious RAM. (My datastore is currently only 5GB, so
> the bloom filters shouldn't be a problem.) I'll also try to monitor my
> swapfile activity, to double-check that that is the issue here :|.
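
Swap occupancy is also easy to spot-check directly, e.g.:

    free -m
    swapon -s

If the "used" swap figure keeps growing while the node runs (as the swpd
column above already suggests), that is consistent with the heap being
pushed out to disk.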
> 
> (Is there no way to get rid of the constant stupid Java GC? Perhaps we
> should move to C? :P)

If GC is taking more than a fraction of a second then there is something
seriously wrong on your system; most likely part of the VM has been swapped
out. And maybe you'd prefer us to be constantly chasing exploitable buffer
overflows and memory leaks rather than doing productive work. :)
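
A rough way to check whether that is what's happening: compare the java
process's resident size with its virtual size while the node is busy,
e.g. something like:

    ps -o pid,vsz,rss,comm -C java

If RSS is much smaller than the configured heap size, part of the heap is
sitting in swap, and every Full GC has to page the whole heap back in,
which would explain multi-minute pauses.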
