My local setup has a fairly beefy system (1GB RAM, striped IDE disks, Athlon XP 1700+) running two nodes. My load average is 7 to 11, and has been for the last hour. This includes a bzip2 process running in the background to do a backup, but that only contributes one to the load, leaving 6-10. My unstable branch node:
Global mean traffic (queries per hour):2084.9733333333334
Local mean traffic (queries per hour): 495.99005815483434
Current advertise probability: 0.2083009983444445
Current proportion of requests being accepted: 1.0
Current routingTime 76ms
Active pooled jobs 77 (64.166664%)
Available threads 20
Current estimated load 64.166664%

And my stable branch node:

Global mean traffic (queries per hour):3594.457142857143
Local mean traffic (queries per hour): 1613.9878950907869
Current advertise probability: 0.02
Current proportion of requests being accepted: 0.39
Current routingTime 2099ms
Active pooled jobs 62 (51.666664%) [QueryRejecting all incoming requests!]
Available threads 24
Current estimated load 100.0%

Okay, so one is overloaded, and the other is not. Hawk runs one node, and:

Global mean traffic (queries per hour):1804.1
Local mean traffic (queries per hour): 6745.6153500224855
Current advertise probability: 0.02
Current proportion of requests being accepted: 0.17
...
Active pooled jobs 104 (86.666664%) [QueryRejecting all incoming

However, hawk's sysload is around 1. What conclusions can be drawn? Mainly, that system load has absolutely nothing to do with either the local mean traffic or the load percentage. The problem is that our load balancing mechanism, which decides when a node needs more traffic, relies on the local mean traffic and the accept ratio (which is strongly influenced by the load percentage). Thus a node that is struggling to cope with traffic and severely affecting the user experience may well think it is not overloaded and solicit more traffic through a high datasource reset probability.

One suggestion has been to incorporate the bandwidth utilization percentage (if we have a limit set). This may make sense... or it may not. Bandwidth utilization has little to do with the actual cost of the request...
One thing I have suggested in the past is to measure the actual CPU usage, or the system load, on an architecture-specific basis (on Linux, parse /proc/stat; on other Unix, parse the output of vmstat; on Windoze, use JNI; etc). I would like to point out here that transparent portability to arbitrary future or obscure platforms IS NOT AND NEVER HAS BEEN A CORE PROJECT GOAL. I have functional Java code to parse /proc/stat and determine CPU usage every few seconds (on Linux). I believe fish has JNI code to find the CPU usage from the JVM on Windoze.

Since load balancing is fairly critical to a functioning Freenet, I would like to add this code to Fred, at least as an option. Does anyone agree with me?
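For anyone who wants to see what the /proc/stat approach involves, here is a minimal sketch. It is not the code mentioned above and not anything that exists in Fred; the class name ProcStatCpu and its methods are made up for illustration. It assumes the aggregate "cpu" line has the usual user, nice, system, idle fields (optionally followed by iowait, irq, softirq, steal on newer kernels) and counts idle plus iowait as idle time. CPU usage is the busy share of the jiffies that elapsed between two samples.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper, for illustration only; not part of Fred.
class ProcStatCpu {

    /** Parses the aggregate "cpu" line and returns {busyJiffies, totalJiffies}. */
    static long[] parseCpuLine(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 5 || !f[0].equals("cpu")) {
            throw new IllegalArgumentException("not an aggregate cpu line: " + line);
        }
        long total = 0;
        long idle = 0;
        // Fields 1..8: user nice system idle [iowait irq softirq steal].
        // Any later guest fields are skipped; they are already counted in user/nice.
        int last = Math.min(f.length - 1, 8);
        for (int i = 1; i <= last; i++) {
            long v = Long.parseLong(f[i]);
            total += v;
            if (i == 4 || i == 5) {
                idle += v; // idle and, where present, iowait
            }
        }
        return new long[] { total - idle, total };
    }

    /** Fraction of CPU time (0.0 to 1.0) spent busy between two samples. */
    static double usage(long[] prev, long[] cur) {
        long dTotal = cur[1] - prev[1];
        if (dTotal <= 0) {
            return 0.0; // no time elapsed, or the counters went backwards
        }
        return (double) (cur[0] - prev[0]) / dTotal;
    }

    /** Reads the first line of /proc/stat (Linux only). */
    static long[] readSample() throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader("/proc/stat"))) {
            String line = r.readLine();
            if (line == null) {
                throw new IOException("/proc/stat is empty");
            }
            return parseCpuLine(line);
        }
    }

    /** Takes two samples intervalMillis apart and returns the busy fraction. */
    static double sampleUsage(long intervalMillis)
            throws IOException, InterruptedException {
        long[] prev = readSample();
        Thread.sleep(intervalMillis);
        return usage(prev, readSample());
    }
}
```

A node could call something like ProcStatCpu.sampleUsage(3000) from a background thread every few seconds. How the result would feed into the estimated load figure is a separate question that this sketch does not try to answer.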
