My local setup has a fairly beefy system (1GB RAM, striped IDE disks,
Athlon XP 1700+) running two nodes. My load average is 7 to 11, and has
been for the last hour. This includes a bzip2 process running in the
background to do a backup, but that only contributes one to the load,
leaving 6-10. My unstable branch node:

Global mean traffic (queries per hour):2084.9733333333334
Local mean traffic (queries per hour): 495.99005815483434
Current advertise probability: 0.2083009983444445
Current proportion of requests being accepted: 1.0
Current routingTime 76ms
Active pooled jobs 77 (64.166664%)
Available threads 20
Current estimated load 64.166664%

And my stable branch node:

Global mean traffic (queries per hour):3594.457142857143
Local mean traffic (queries per hour): 1613.9878950907869
Current advertise probability: 0.02
Current proportion of requests being accepted: 0.39
Current routingTime 2099ms
Active pooled jobs 62 (51.666664%) [QueryRejecting all incoming requests!]
Available threads 24
Current estimated load 100.0%

Okay, so one is overloaded, and the other is not.

Hawk runs one node, and:

Global mean traffic (queries per hour):1804.1
Local mean traffic (queries per hour): 6745.6153500224855
Current advertise probability: 0.02
Current proportion of requests being accepted: 0.17
...
Active pooled jobs     104 (86.666664%) [QueryRejecting all incoming

However, hawk's sysload is around 1.

What conclusions can be drawn? Mainly, that system load has absolutely
nothing to do with either the local mean traffic or the load percentage.
The problem with this is that our load balancing mechanism, when deciding
whether it needs more traffic, relies on the local mean traffic and the
accept ratio (which is strongly influenced by the load percentage). Thus
a node that is struggling to cope with traffic and severely affecting
the user experience may well think it is not overloaded and solicit more
traffic through a high datasource reset probability.

One suggestion has been to incorporate the bandwidth utilization
percentage (if we have a limit set). This may make sense... or it may
not. Bandwidth utilization has little to do with the actual cost of the
request... One thing I have suggested in the past has been to measure
the actual CPU usage, or the system load, on an architecture-specific
basis (on Linux, parse /proc/stat; on other Unix, parse the output of
vmstat; on Windoze, use JNI; etc). I would like to point out here that
transparent portability to arbitrary future or obscure platforms IS NOT
AND NEVER HAS BEEN A CORE PROJECT GOAL. I have functional Java code to
parse /proc/stat and determine CPU usage every few seconds (on Linux). I
believe fish has JNI code to find the CPU usage from the JVM on
Windoze. Since load balancing is fairly critical to a functioning
Freenet, I would like to add this code to Fred at least as an option.
Does anyone agree with me?
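
To make the /proc/stat idea concrete, here is a minimal sketch. It is
not the code mentioned above. The class name ProcStatCpu, its method
names and the 3 second interval are made up for illustration. It
assumes the aggregate "cpu" line lists user, nice, system and idle
jiffies in that order, and counts everything except idle as busy.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper, not part of Fred: estimates CPU usage on Linux
// from two samples of the aggregate "cpu" line in /proc/stat.
public class ProcStatCpu {

    /** Parses a "cpu ..." line into { busyJiffies, totalJiffies }. */
    public static long[] parseCpuLine(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 5 || !f[0].equals("cpu")) {
            throw new IllegalArgumentException("not an aggregate cpu line: " + line);
        }
        // Fields are: user nice system idle [iowait irq softirq steal ...].
        // Stop after the eighth counter; later kernels append guest
        // counters that are already included in user time.
        int last = Math.min(f.length - 1, 8);
        long total = 0;
        for (int i = 1; i <= last; i++) {
            total += Long.parseLong(f[i]);
        }
        long idle = Long.parseLong(f[4]);
        return new long[] { total - idle, total };
    }

    /** Fraction of time (0.0 to 1.0) spent non-idle between two samples. */
    public static double usage(long[] before, long[] after) {
        long dTotal = after[1] - before[1];
        if (dTotal <= 0) {
            return 0.0; // no time elapsed between samples
        }
        return (double) (after[0] - before[0]) / dTotal;
    }

    /** Reads the current aggregate counters from /proc/stat. */
    public static long[] sample() throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader("/proc/stat"))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (line.startsWith("cpu ")) {
                    return parseCpuLine(line);
                }
            }
        }
        throw new IOException("no aggregate cpu line in /proc/stat");
    }

    public static void main(String[] args) throws Exception {
        long[] before = sample();
        Thread.sleep(3000); // sampling interval is an arbitrary choice
        long[] after = sample();
        System.out.printf("CPU usage: %.1f%%%n", 100.0 * usage(before, after));
    }
}
```

The counters are cumulative since boot, so only the difference between
two samples means anything. How such a figure would be fed into the
load estimate is left open here.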