On Linux, during CPU-intensive node activities (resuming requests, decoding or encoding a large splitfile, etc.), the threads that do the core of Freenet's work (the packet sender and packet receiver threads, request senders, etc.) get starved of CPU by the CPU-intensive threads doing FEC decoding and similar jobs. The result is that the node slows down dramatically and stops accepting requests, because we use the average round-trip time for a message as, effectively, a measure of system load. It takes a while to recover afterwards because we use an averager to smooth that measurement out.
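
To illustrate why recovery is slow, here is a minimal sketch of an exponentially smoothed round-trip-time average. It is not the node's actual averager: the class name RttAverager, the decay weight of 0.1 and the sample values are all assumptions made up for the example. It only shows that a short burst of slow round trips keeps the smoothed figure high for many samples after the burst has ended.

```java
/** Hypothetical exponential averager; not Freenet's real implementation. */
public class RttAverager {
    private final double decay; // weight given to each new sample (assumed value)
    private double average;
    private boolean seeded;

    public RttAverager(double decay) {
        if (decay <= 0.0 || decay > 1.0) {
            throw new IllegalArgumentException("decay must be in (0, 1]");
        }
        this.decay = decay;
    }

    /** Folds one round-trip-time sample (in milliseconds) into the average. */
    public synchronized void report(double rttMillis) {
        if (!seeded) {
            average = rttMillis; // the first sample seeds the average
            seeded = true;
        } else {
            average = average * (1.0 - decay) + rttMillis * decay;
        }
    }

    public synchronized double current() {
        return average;
    }

    public static void main(String[] args) {
        RttAverager avg = new RttAverager(0.1);
        // Normal operation: round trips of about 100 ms.
        for (int i = 0; i < 20; i++) avg.report(100.0);
        // CPU starvation: ten round trips of 5 seconds each.
        for (int i = 0; i < 10; i++) avg.report(5000.0);
        System.out.println("average after burst: " + avg.current());
        // Count the normal samples needed before the average drops below 200 ms.
        int samples = 0;
        while (avg.current() > 200.0) {
            avg.report(100.0);
            samples++;
        }
        System.out.println("normal samples needed to recover: " + samples);
    }
}
```

A smaller decay weight smooths out noise better but stretches the recovery period further, which is the trade-off behind the slow recovery described above.
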

In theory this shouldn't happen, because we set thread priorities: MAX_PRIORITY for important stuff, MIN_PRIORITY for FEC decodes etc. Unfortunately, while thread priorities are used on Windows, they are *not* used on Linux. Linux only supports thread priorities for realtime threads. In practice, if you run two Java threads on a single-core system, one with MAX and one with MIN priority, both will get the same amount of CPU time on average. We have tested this. Further, the fairness features in the scheduler in 2.6.23 don't seem to help matters very much.

Nextgens suggested we use the Java realtime API. The only reference I found to it was extremely unhelpful:

"Java RTS is only available from Sun's OEM Sales team, which has knowledgeable sales staff worldwide. Please contact them at this address. You can also call Sales at +1-800-786-0404, Prompt 1, Prompt 3." ( http://java.sun.com/javase/technologies/realtime/ )

In other words, RTSJ is only available for big-budget closed-source embedded stuff.

A further complication is that if the node is starved of CPU the wrapper will restart it, and if it continues to be starved of CPU the wrapper shuts it down; this is presumably detected by a thread running within the JVM.

The first part of the solution is true request resuming. At the moment, on startup, every request has to pull every key it used from the datastore, decode each layer, and generally redo a lot of work which has already been done but which we didn't save. Beyond that, the only realistic option appears to be an external daemon running at a lower nice level to do CPU-intensive jobs, and FEC encoding and decoding in particular.

Thoughts? We could ask Tom Marble (the senior Java performance guy, at least he was last year) to open source RTSJ :) ... but it wouldn't be ready for a long time, even if Sun did take the plunge (and there's every reason not to).
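
For anyone who wants to repeat the priority experiment mentioned above, a rough sketch follows. This is not the code we actually ran; the class name PriorityExperiment and the two-second duration are made up for illustration. It starts two busy-looping threads, one at MAX_PRIORITY and one at MIN_PRIORITY, and reports how many iterations each managed. To get a meaningful comparison the JVM needs to be confined to one core, for example with "taskset -c 0 java PriorityExperiment" on Linux.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical test harness: compares CPU share of MAX vs MIN priority threads. */
public class PriorityExperiment {

    /** Busy-loops until told to stop, counting iterations. */
    static final class Spinner implements Runnable {
        private final AtomicBoolean stop;
        volatile long count; // written once, when the loop exits

        Spinner(AtomicBoolean stop) {
            this.stop = stop;
        }

        @Override
        public void run() {
            long local = 0;
            while (!stop.get()) {
                local++;
            }
            count = local;
        }
    }

    /** Runs both spinners for the given time; returns {highCount, lowCount}. */
    public static long[] run(long millis) throws InterruptedException {
        AtomicBoolean stop = new AtomicBoolean(false);
        Spinner high = new Spinner(stop);
        Spinner low = new Spinner(stop);
        Thread highThread = new Thread(high, "high-priority");
        Thread lowThread = new Thread(low, "low-priority");
        highThread.setPriority(Thread.MAX_PRIORITY);
        lowThread.setPriority(Thread.MIN_PRIORITY);
        lowThread.start();
        highThread.start();
        Thread.sleep(millis);
        stop.set(true);
        highThread.join();
        lowThread.join();
        return new long[] { high.count, low.count };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] counts = run(2000);
        System.out.println("MAX_PRIORITY iterations: " + counts[0]);
        System.out.println("MIN_PRIORITY iterations: " + counts[1]);
    }
}
```

If priorities were honoured, the MAX_PRIORITY count would be far larger than the MIN_PRIORITY one when both threads share a core; roughly equal counts indicate the priorities are being ignored.
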

Sources:
http://kerneltrap.org/node/6080
http://www.md.pp.ru/~eu/jdk6options.html#ThreadPriorityPolicy
