Matthew Toseland wrote:
> On Linux, during CPU intensive node activities - resuming requests, decoding 
> or encoding a large splitfile etc - the threads that do the core of Freenet's 
> work (the packet sender and packet receiver threads, request senders etc), 
> get starved of CPU by the CPU-intensive threads doing the FEC decoding (etc). 
> The result is the node is dramatically slowed down and stops accepting 
> requests because of this (we use the average round trip time for a message as 
> effectively a measure of system load). It takes a while to recover afterwards 
> because we use an averager to smooth it out.
> In theory this shouldn't happen, because we set thread priorities: 
> MAX_PRIORITY for important stuff, MIN_PRIORITY for FEC decodes etc.
That seems rather strange. Threads being starved of CPU would imply that 
the Linux kernel for some reason does not preempt the Java threads. In 
theory that should not happen even without setting priorities. Even tens 
of CPU-intensive threads running at the same time with equal priorities 
should not starve the others, assuming that the CPU load in normal use 
is around 10%.

At least on my system the problem is not so much the CPU as the disk 
I/O. The operations you described are also very disk-intensive, and disk 
I/O can almost jam the system, or at least Freenet, under CFQ, because 
CFQ also prioritizes disk I/O according to nice values. CFQ recently 
became the default I/O scheduler, at least in Ubuntu.

The easiest solution would be to increase the timeouts in the wrapper. I 
did that in wrapper.conf:
wrapper.jvm_exit.timeout=2700
wrapper.restart.delay=60
wrapper.startup.timeout=300
wrapper.shutdown.timeout=300
wrapper.ping.interval=30
wrapper.ping.timeout=60
wrapper.startup.delay=1

However, the node now seems to complain about the message core freezing 
for over three minutes. That can easily happen given the load on the 
system and with Freenet at the bottom of every priority. This threshold 
should also be made configurable, because at the moment it causes 
frequent (useless) node restarts.

P.S.
You have of course tested the priorities while passing this Java option 
(in wrapper.conf):
wrapper.java.additional.8=-XX:+UseThreadPriorities
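
A minimal sketch of the kind of test I mean, in case it is useful. The 
class name PriorityTest and the measure() helper are made up for this 
example and are not part of Freenet. It runs one MAX_PRIORITY and one 
MIN_PRIORITY busy loop for a fixed time and counts the iterations of 
each, so the two counts can be compared with and without the option 
above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical test class, not part of Freenet.
public class PriorityTest {

    // Runs one MAX_PRIORITY and one MIN_PRIORITY busy loop for the given
    // number of milliseconds and returns {maxCount, minCount}.
    static long[] measure(long millis) throws InterruptedException {
        final AtomicLong[] counts = { new AtomicLong(), new AtomicLong() };
        final int[] priorities = { Thread.MAX_PRIORITY, Thread.MIN_PRIORITY };
        final Thread[] threads = new Thread[2];
        final long deadline = System.nanoTime() + millis * 1_000_000L;

        for (int i = 0; i < 2; i++) {
            final AtomicLong counter = counts[i];
            threads[i] = new Thread(() -> {
                // Pure CPU work: count iterations until the deadline passes.
                while (System.nanoTime() < deadline) {
                    counter.incrementAndGet();
                }
            });
            threads[i].setPriority(priorities[i]);
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();

        return new long[] { counts[0].get(), counts[1].get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] c = measure(5000);
        System.out.println("MAX_PRIORITY iterations: " + c[0]);
        System.out.println("MIN_PRIORITY iterations: " + c[1]);
    }
}
```

On a multi-core machine the two threads will simply run on different 
cores, so the comparison only means something if the JVM is pinned to a 
single core (for example with taskset -c 0 on Linux). Roughly equal 
counts would suggest the priorities are being ignored.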

-- 
Malkus Lindroos

> Unfortunately, while thread priorities are used on Windows, they are *not* 
> used on Linux. Linux only supports thread priorities for realtime threads. 
> Practically if you run two java threads on a single core system one with MAX 
> and one with MIN priority, both will get the same amount of cpu time on 
> average. We have tested this.
>
> Further, the fairness features in the scheduler in 2.6.23 don't seem to help 
> matters very much.
>
> Nextgens suggested we use the java realtime API. The only reference I found 
> to it was extremely unhelpful:
>
> "Java RTS is only available from Sun's OEM Sales team, which has 
> knowledgeable sales staff worldwide. Please contact them at this address. 
> You can also call Sales at +1-800-786-0404, Prompt 1, Prompt 3."
> ( http://java.sun.com/javase/technologies/realtime/ )
>
> In other words, RTSJ is only available for big-budget closed source embedded 
> stuff.
>
> A further complication is that if the node is starved of CPU the wrapper will 
> restart it, and if it continues to be starved of CPU the wrapper shuts it 
> down; this is presumably detected by a thread running within the JVM.
>
> The first part of the solution is true request resuming: At the moment, on 
> startup, every request has to pull every key it used from the datastore, 
> decode each layer, and generally do a lot of work which has already been done 
> but we didn't save.
>
> Beyond that, the only realistic option appears to be an external daemon 
> running at a lower nice level to do CPU-intensive jobs and FEC encoding and 
> decoding in particular.
>
> Thoughts?
>
> We could ask Tom Marble (the senior java performance guy, at least he was 
> last year) to open source RTSJ :) ... but it wouldn't be ready for a long 
> time, even if sun did take the plunge (and there's every reason not to).
>
>
> Sources:
> http://kerneltrap.org/node/6080
> http://www.md.pp.ru/~eu/jdk6options.html#ThreadPriorityPolicy