On Thursday 14 February 2008 13:12, Malkus Lindroos wrote:
> Matthew Toseland wrote:
> > On Linux, during CPU intensive node activities - resuming requests, decoding
> > or encoding a large splitfile etc - the threads that do the core of Freenet's
> > work (the packet sender and packet receiver threads, request senders etc),
> > get starved of CPU by the CPU-intensive threads doing the FEC decoding (etc).
> > The result is the node is dramatically slowed down and stops accepting
> > requests because of this (we use the average round trip time for a message as
> > effectively a measure of system load). It takes a while to recover afterwards
> > because we use an averager to smooth it out.
> > In theory this shouldn't happen, because we set thread priorities: 
> > MAX_PRIORITY for important stuff, MIN_PRIORITY for FEC decodes etc.
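
To illustrate why recovery is slow once a smoothed round trip time has spiked, here is a minimal sketch of a decaying (exponentially weighted) average. This is a generic technique and not Freenet's actual averager; the class name, the weight of 0.1, and the sample values are all assumptions made for the example.

```java
/** Hypothetical decaying average; NOT Freenet's real averager class. */
final class DecayingAverage {
    private final double weight; // share given to each new sample, in (0, 1]
    private double value;
    private boolean seeded;

    DecayingAverage(double weight) {
        if (weight <= 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be in (0, 1]");
        }
        this.weight = weight;
    }

    /** Folds one sample (e.g. a round trip time in ms) into the average. */
    void report(double sample) {
        if (!seeded) {
            value = sample; // the first sample seeds the average
            seeded = true;
        } else {
            value = value * (1.0 - weight) + sample * weight;
        }
    }

    double current() {
        return value;
    }

    public static void main(String[] args) {
        DecayingAverage rtt = new DecayingAverage(0.1); // assumed weight
        rtt.report(100);  // normal round trip time (made-up figure)
        rtt.report(5000); // one sample taken while the node was starved
        int samples = 0;
        while (rtt.current() > 200) { // count normal samples until "recovered"
            rtt.report(100);
            samples++;
        }
        System.out.println("normal samples needed to recover: " + samples);
    }
}
```

A smaller weight smooths out noise better but makes a single bad period linger for longer, which is the trade-off described above.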

> Seems rather strange. Starving the CPU would imply the Linux kernel for 
> some reason wouldn't interrupt (preempt) the Java threads. Hence in 
> theory it should not happen in any case, regardless of the priorities set. 

You misunderstand - we set priorities, and they are ignored, because of 
Linux's limitations / because Java doesn't work around them.
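
For reference, setting priorities from Java looks roughly like the sketch below. It uses only the standard `Thread` API; the class name, thread names, and empty thread bodies are placeholders and not Freenet's real code. The calls record the requested priority, but as discussed in this thread, the JVM on Linux may not pass it on to the kernel scheduler.

```java
/** Hypothetical sketch of priority assignment; not taken from Freenet. */
final class PrioritySketch {
    /** Creates (but does not start) a daemon thread with the given priority. */
    static Thread newThread(String name, int priority, Runnable body) {
        Thread t = new Thread(body, name);
        t.setPriority(priority); // a request to the JVM, not a guarantee
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder bodies; a real node would do network I/O and FEC work here.
        Thread sender = newThread("PacketSender", Thread.MAX_PRIORITY, () -> { });
        Thread fec = newThread("FECDecoder", Thread.MIN_PRIORITY, () -> { });
        sender.start();
        fec.start();
        sender.join();
        fec.join();
        // getPriority() reports what was requested, not what the OS applied.
        System.out.println(sender.getName() + " priority " + sender.getPriority());
        System.out.println(fec.getName() + " priority " + fec.getPriority());
    }
}
```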

> Having even tens of CPU-intensive threads running at the same time with 
> equal priorities should still not starve the CPU, assuming that the 
> CPU load in normal use is around 10%.

Yes, there is a question here of why there is such a big impact on threads 
that shouldn't use *that* much CPU...
> 
> At least on my system the problem is not so much the CPU as the 
> disk I/O. Doing the operations you described is also very disk-intensive, 
> and disk I/O may almost jam the system, or at least Freenet, with CFQ, as 
> CFQ also prioritizes disk I/O according to nice values. CFQ was recently 
> made the default I/O scheduler, at least in Ubuntu.

Nice-values are not currently set, as we have discussed. If they were set it 
might improve matters.
> 
> The easiest solution would be to increase timeouts in the wrapper - did 
> that in wrapper.conf:
> wrapper.jvm_exit.timeout=2700
> wrapper.restart.delay=60
> wrapper.startup.timeout=300
> wrapper.shutdown.timeout=300
> wrapper.ping.interval=30
> wrapper.ping.timeout=60
> wrapper.startup.delay=1
> 
> However, the node now seems to complain about the message core freezing 
> for over three minutes - that can easily happen with the load on the system 
> and Freenet at the bottom of every priority. 

You have lots of other CPU-intensive processes running? The watchdog threads 
use virtually no CPU so should always be able to run.

> This should also be made configurable, because now it causes frequent 
> (useless) node restarts.

You can turn it off (there is a config option), but if you have to, 
something is wrong.

W.r.t. nice values, once we have implemented the proposed solution, Freenet 
will need some slack within which to vary its own threads' nice values - 
please run Freenet at nice 17, or ideally more like 10, so that we can have 
low priority threads at 19, normal threads at say 15, and high priority 
threads at 10.
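
One way such a mapping could work is sketched below: Java priorities 1..10 are spread linearly across the range from the nice value the process was started with up to 19. This is only an illustration of the arithmetic, not the proposed implementation; the class and method names are made up, and actually applying a nice value to a thread would need native code, which is not shown.

```java
/** Hypothetical mapping from Java thread priority to a Linux nice value. */
final class NiceMapping {
    static final int MAX_NICE = 19; // weakest priority on Linux

    /**
     * @param javaPriority Thread.MIN_PRIORITY (1) .. Thread.MAX_PRIORITY (10)
     * @param baseNice     nice value the node process was started with
     * @return nice value in [baseNice, 19]; higher Java priority gives lower nice
     */
    static int niceFor(int javaPriority, int baseNice) {
        if (javaPriority < Thread.MIN_PRIORITY || javaPriority > Thread.MAX_PRIORITY) {
            throw new IllegalArgumentException("bad Java priority: " + javaPriority);
        }
        if (baseNice < -20 || baseNice > MAX_NICE) {
            throw new IllegalArgumentException("bad nice value: " + baseNice);
        }
        int slack = MAX_NICE - baseNice;                         // room to vary in
        int steps = Thread.MAX_PRIORITY - Thread.MIN_PRIORITY;   // 9 steps
        long offset = Math.round((javaPriority - Thread.MIN_PRIORITY) * slack / (double) steps);
        return MAX_NICE - (int) offset;
    }

    public static void main(String[] args) {
        for (int base : new int[] {10, 17}) {
            System.out.println("base " + base
                + ": min=" + niceFor(Thread.MIN_PRIORITY, base)
                + " norm=" + niceFor(Thread.NORM_PRIORITY, base)
                + " max=" + niceFor(Thread.MAX_PRIORITY, base));
        }
    }
}
```

Starting at nice 10 reproduces the 19 / 15 / 10 split mentioned above, while starting at 17 leaves only three distinct levels to work with, hence the request for more slack.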
> 
> P.S.
> You have of course tested the priorities by passing the Java option 
> (wrapper.conf):
> wrapper.java.additional.8=-XX:+UseThreadPriorities

This is the default, isn't it?