Damn, I sent this mail to myself instead of to devl.. Another try then..
Sorry for losing the formatting.

The attachments I am talking about can be retrieved from
http://194.236.28.174/freenetstuff/prof.zip

/N

----- Original Message -----
From: "Niklas Bergh" <[email protected]>
To: <nb at tific.com>
Sent: Friday, May 16, 2003 5:35 PM
Subject: Profiling of fred


I decided to play around with a profiler and fred.. The test case I
selected was: 'Why is fproxy slow at delivering web pages even though
most of the data is cached locally'/'Why is TFE slow'. I deliberately
skipped drawing any deep conclusions from the collected data since I am
not well enough acquainted with the code yet (and because I am too tired
right now). Please discuss.

Setup:
1. Firewalled v6032 node (not transient, but not able to receive
incoming connections)
2. Source modified to not kill off any surplus threads
(QThreadFactory.run)
3. Node configured with maximumThreads=800
4. 5 lightweight web spiders, each running 10
concurrent requests against fred
5. Eclipse profiler configured to profile using the
System.currentTimeMillis() call (low-overhead profiling); see the sketch
after this list
6. Datastore status (was pretty much static during the tests):
        Maximum size:                 10 000 MiB
        Free space:                   782 716 KiB
        Used space:                   9 457 284 KiB
        Percent used:                 92
        Total keys:                   23574
        Most recent file timestamp:   Fri May 16 13:43:36 GMT+01:00 2003
        Least recent file timestamp:  Thu Apr 10 10:11:26 GMT+01:00 2003
7. Sun JVM 1.4.1
8. maximumThreads=700
9. Instrumentation (and thereby profiling) disabled for any methods
whose bytecode was less than 250 bytes. Because of this they don't
show up in the stats (except as an addition to the caller's execution
time, of course)
10. The profiling was done on a P3 650 MHz with 1 GB of RAM
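
For those not familiar with the Eclipse Profiler: its millisecond-based
instrumentation boils down to wrapping every profiled method in a pair of
System.currentTimeMillis() calls. The sketch referred to in item 5 is below;
it is only my hand-written illustration of the idea (the class, method and
accumulator names are made up), not the profiler's actual generated code:

    import java.util.HashMap;
    import java.util.Map;

    /** Illustration only of low-overhead, millisecond-based instrumentation:
     *  the profiler injects the equivalent of this wrapper at the bytecode
     *  level. The names (TimingSketch, record, fetchKey) are made up. */
    public class TimingSketch {

        // method name -> long[] {invocation count, accumulated millis}
        private static final Map totals = new HashMap();

        static synchronized void record(String method, long millis) {
            long[] t = (long[]) totals.get(method);
            if (t == null) { t = new long[2]; totals.put(method, t); }
            t[0]++;
            t[1] += millis;
        }

        /** A "profiled" method: the timing wrapper is all the profiler adds. */
        static byte[] fetchKey(String key) {
            long start = System.currentTimeMillis();   // cheap wall-clock sample
            try {
                return doFetch(key);                    // the real work
            } finally {
                record("fetchKey", System.currentTimeMillis() - start);
            }
        }

        private static byte[] doFetch(String key) {
            return new byte[0];                         // stand-in for the actual work
        }
    }

Presumably this also explains item 9: for methods whose bytecode is under 250
bytes the wrapper itself would dominate the measurement, so they are left
uninstrumented.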


Assumptions:
1. Since I have viewed TFE multiple times before starting these
tests my datastore ought to contain most of the data used by TFE (and if
not, then the tests in themselves should definitely make fred cache the
data)
2. Since 50 concurrent requests should consume most of the
machine's CPU, pretty much all of the profiling data I collect should be
related to the TFE loading

Actions performed:
1. Spidered all of TFE without profiling (called 'First run'
below), using 5*10 threads, a 20-minute request timeout and 5 retries
(a simplified sketch of the spider loop follows this list)
2. Spidered all of TFE without profiling (called 'Second run'
below), using 5*10 threads, a 20-minute request timeout and 5 retries
3. Spidered all of TFE without profiling (called 'Third run'
below), using 5*10 threads, a 20-minute request timeout and 5 retries
4. Spidered all of TFE without profiling (called 'Fourth run'
below), using 5*10 threads, a 1-minute request timeout and 0 retries
5. Spidered all of TFE with profiling (called 'Fifth run'
below), using 5*10 threads, a 10-minute request timeout and 0 retries.
Profiling data collection was done ONLY during the time when the spiders
were running. An export of some of the collected data is attached; the more
detailed Eclipse Profiler stored data is available upon request (much
preferred over the puny HTML export version, since sorting, filtering
and such things are available there). Please note that some of the
supplied HTML files are JavaScript-expandable ('+' sign).
It might be helpful to have a look at
http://eclipsecolorer.sourceforge.net if you have problems interpreting
some of the data.
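
The spiders themselves are nothing fancy. Below is a simplified sketch of
what each of the five was doing; it is not the actual spider source, the
fproxy address/port is a placeholder, and in the real runs the key list came
from thelist.html:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    /** Simplified sketch of one spider: 10 worker threads pulling keys off a
     *  shared list and fetching them through fproxy, with a fixed timeout and
     *  retry count. Not the actual spider source. */
    public class SpiderSketch {

        static final int THREADS = 10;
        static final int TIMEOUT_MS = 20 * 60 * 1000;   // 20-minute request timeout
        static final int RETRIES = 5;

        public static void main(String[] args) throws Exception {
            // Keys to fetch are taken from the command line in this sketch.
            final List keys = new ArrayList(Arrays.asList(args));

            Thread[] workers = new Thread[THREADS];
            for (int i = 0; i < THREADS; i++) {
                workers[i] = new Thread(new Runnable() {
                    public void run() {
                        String key;
                        while ((key = nextKey(keys)) != null) {
                            for (int attempt = 0; attempt <= RETRIES; attempt++) {
                                if (fetch(key)) break;   // success, stop retrying
                            }
                        }
                    }
                });
                workers[i].start();
            }
            for (int i = 0; i < THREADS; i++) workers[i].join();
        }

        static String nextKey(List keys) {
            synchronized (keys) {
                if (keys.isEmpty()) return null;
                return (String) keys.remove(0);
            }
        }

        /** Fetch one key via the local fproxy and just consume the bytes. */
        static boolean fetch(String key) {
            try {
                URL url = new URL("http://127.0.0.1:8888/" + key);   // placeholder fproxy address
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setReadTimeout(TIMEOUT_MS);
                InputStream in = conn.getInputStream();
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) { /* discard, we only measure throughput */ }
                in.close();
                return true;
            } catch (Exception e) {
                return false;   // timeout, DNF etc. -> the caller may retry
            }
        }
    }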


Possibly interesting facts:
At all times during the tests, network I/O was very close
to 0
At all times during the tests, kernel time (disk I/O?!)
was significant: about 25% when the profiler wasn't attached, about 15%
when it was.
At nearly all times during the tests, 95-100% of the CPU
was used by javaw.exe (the only exception was when the profiler pulled data
from the node; for about one second every 5 seconds it used 50% CPU)
While the actual CPU load was around 100%, fred's load indicator said
15%
The test set of TFE is 313 files (thelist.html and images used
in it)
About 240 of the files were retrievable using a 20-minute
timeout and 5 retries (approx. total size: 2 528 290 bytes)
About 240 of the files were retrievable using a 1-minute
timeout and 0 retries (approx. total size: 2 528 290 bytes)
First run average retrieval speed was 5*1800 bytes/sec
Second run average retrieval speed was 5*1600 bytes/sec
Third run average retrieval speed was 5*1700 bytes/sec
Fourth run average retrieval speed was 5*1700 bytes/sec
Fifth run average retrieval speed was 5*700 bytes/sec
In every run the retrieval speed was higher initially than at the
end, by a factor of approximately 3 (probably because more threads were
stuck waiting for DNF files in the later part of the test, see 'Caveats'
below)
It would seem that file retrievability increased slightly during
my tests.. Later tests were able to retrieve about 2-3 files more than
the initial ones
Restart of the node was done only between runs 4 and 5
After the restart I didn't request anything manually from the
node, it was just the spiders running
The machine running the test had more than enough RAM. At no
point during the tests did memory consumption come within 200 MB of the
physical memory present
The tests took somewhere between 30 and 60 minutes each (I forgot to
time them, but it was something like that)
While one spider managed to retrieve 254 files, another one
running during the same time period managed to retrieve 238, a third one
managed 239, and so on. This might tie in to the issues with
disappearing NIMs and JPEGs that people have reported
Some snapshots of thread usage:
        Second run:
                freenet.OpenConnectionManager$ConnectionJob        19
                freenet.client.InternalClient$ClientMessageVector  16
                freenet.interfaces.LocalInterface$ConnectionShell  51
        Third run:
                freenet.OpenConnectionManager$ConnectionJob        27
                freenet.client.InternalClient$ClientMessageVector  28
                freenet.interfaces.LocalInterface$ConnectionShell  51
        Fourth run:
                freenet.OpenConnectionManager$ConnectionJob        17
                freenet.client.InternalClient$ClientMessageVector  52
                freenet.interfaces.LocalInterface$ConnectionShell  71
        Fifth run:
                freenet.OpenConnectionManager$ConnectionJob        14
                freenet.client.InternalClient$ClientMessageVector  32
                freenet.interfaces.LocalInterface$ConnectionShell  51
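
If anyone wants to take similar snapshots on their own node, the node's own
thread/diagnostics listing is the natural source; as a crude in-JVM
alternative you can enumerate the live threads and group them by name, along
the lines of the sketch below. Note that this is an approximation and an
assumption on my part (the table above counts per pooled job class, which a
plain thread-name grouping will only roughly match):

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    /** Crude sketch: count the JVM's live threads grouped by (normalized)
     *  thread name. Only an approximation of per-job-class numbers. */
    public class ThreadSnapshot {
        public static void main(String[] args) {
            // Walk up to the root thread group so every thread in the VM is seen.
            ThreadGroup root = Thread.currentThread().getThreadGroup();
            while (root.getParent() != null) root = root.getParent();

            Thread[] threads = new Thread[root.activeCount() * 2];
            int n = root.enumerate(threads, true);   // recurse into subgroups

            Map counts = new HashMap();              // normalized name -> Integer count
            for (int i = 0; i < n; i++) {
                // Collapse e.g. "QThread-1", "QThread-2", ... into one bucket.
                String name = threads[i].getName().replaceAll("[-0-9]+$", "");
                Integer c = (Integer) counts.get(name);
                counts.put(name, new Integer(c == null ? 1 : c.intValue() + 1));
            }
            for (Iterator it = counts.entrySet().iterator(); it.hasNext();) {
                Map.Entry e = (Map.Entry) it.next();
                System.out.println(e.getValue() + "\t" + e.getKey());
            }
        }
    }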



Caveats:
Since the spiders are each using only 10 threads, it is
possible that they were stuck waiting for the 73 files which weren't
retrievable much of the time. That would explain the low throughput.
However, if that were the case, then I suppose the CPU shouldn't be
pegged at 100% all the time?!
When using 0 retries and a 1-minute timeout I saw some problems
with retrieval. It would seem that 2 of the spider applications even had
problems getting thelist.html using those settings
During all of the runs a couple of the threads were occupied at
freenet.Message: QueryRejected. Who were they rejecting? Me? Current
routingTime: 18ms, Active pooled jobs: 120 (15.000001%), Current
estimated load: 15.000001% (presumably 120 active jobs out of the
800-thread maximum, which would also explain the 15% load indicator
mentioned above).
Even though the profiler wasn't collecting data all the time, the
node was executing instrumented code all the time.. According to the author
of the profiler this should only slow the application down by about 30%
(not verified by me)

Ideas for analyzing the profiling data / figuring out where to modify
the code (a small helper sketch follows the list):
Compare number of bytes retrieved to number of invocations of
methods and time consumed in them
Compare number of files retrieved to number of invocations of
methods and time consumed in them
Compare number of files that failed retrieval to number of
invocations of methods and time consumed in them
Compare number of files requested to number of requests sent out
over the network
Compare number of files requested to number of DNFs and/or RNFs
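
As a starting point for the first three comparisons, something along these
lines could be used. It assumes the relevant profiler numbers are first
copied into a simple CSV of "method,invocations,totalMillis" (a format of my
own, not anything the profiler exports directly); the 240 files and
2 528 290 bytes come from the runs above:

    import java.io.BufferedReader;
    import java.io.FileReader;

    /** Sketch: normalize profiler numbers against the amount of useful work
     *  done. Assumes a hand-made CSV of "method,invocations,totalMillis". */
    public class ProfileRatios {
        public static void main(String[] args) throws Exception {
            final long filesRetrieved = 240;       // approx files retrieved per run (from above)
            final long bytesRetrieved = 2528290;   // approx total size (from above)

            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = line.split(",");
                if (f.length < 3) continue;        // skip malformed lines
                String method = f[0].trim();
                long calls  = Long.parseLong(f[1].trim());
                long millis = Long.parseLong(f[2].trim());
                System.out.println(method
                        + "\tcalls/file=" + (calls / (double) filesRetrieved)
                        + "\tcalls/KiB="  + (calls / (bytesRetrieved / 1024.0))
                        + "\tms/call="    + (millis / (double) Math.max(calls, 1)));
            }
            in.close();
        }
    }

The same idea extends to the last two points by swapping in the number of
outgoing requests, DNFs or RNFs as the denominator.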


Ideas for the future:
Try to eliminate a bottleneck or two and try again
Run with a spider that can handle more concurrent requests
Listen to people's feedback and collect some more data if
needed...


Brrghh.. Sorry if I missed something... I am starting to get a little
bit tired of this now.. Please ask if there is something that needs
clarification.


/N

_______________________________________________
devl mailing list
devl at freenetproject.org
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
