Re: one more squid-2.6 rel?

2006-09-19 Thread Adrian Chadd
On Wed, Sep 20, 2006, Steven Wilton wrote:
> I'm about to get MASK assignment working in wccp2. I'd like to see that in
> the next release if possible.

Nice! I guess the CCEs helped.



Adrian



RE: one more squid-2.6 rel?

2006-09-19 Thread Steven Wilton
I'm about to get MASK assignment working in wccp2. I'd like to see that in
the next release if possible.

Steven

> -Original Message-
> From: Adrian Chadd [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, 20 September 2006 9:03 AM
> To: squid-dev@squid-cache.org
> Subject: one more squid-2.6 rel?
> 
> Hiya,
> 
> What do you all think about another squid-2.6 release? A few bugfixes
> have gone into Squid-2.6. It'd also be good to say "this 
> stable release
> has stable COSS support."
> 
> 
> 
> 
> Adrian
> 
> 




one more squid-2.6 rel?

2006-09-19 Thread Adrian Chadd
Hiya,

What do you all think about another squid-2.6 release? A few bugfixes
have gone into Squid-2.6. It'd also be good to say "this stable release
has stable COSS support."




Adrian



Re: more profiling

2006-09-19 Thread Henrik Nordstrom
tis 2006-09-19 klockan 21:10 +0300 skrev Andres Kroonmaa:

> Because the gprof call graph is determinate, but the profile
> information is a statistical approximation. For the vast
> majority of cases it's good enough. E.g. in this
> case it seems that gprof wasn't that much off after all,
> as Adrian found a bug that caused abnormal cleanups.
> Sometimes gprof can produce stats that are misleading.

The gprof manual has some good guidelines on what thresholds to use in
determining if the "self" profile result is meaningful or not. It's in
principle a direct relation to sampling period and reported runtime.

The callgraph-based runtime accounting (children, and total including
children) is more difficult, as it evenly distributes the run time among
the different call paths going via the function, so some paths may get a
heavier weight than they should and others less.

Regards
Henrik




Re: more profiling

2006-09-19 Thread Andres Kroonmaa
On 19 Sep 2006 at 14:12, Gonzalo Arana wrote:

> On 9/19/06, Andres Kroonmaa <[EMAIL PROTECTED]> wrote:
> > > On Tue, Sep 19, 2006, Gonzalo Arana wrote:
> > >
> > > > There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> > > > stalls CPU pipes.  That's not what Intel documentation says (page 213
> > > > -numbered as 4-209- of the Intel Architecture Software Developer
> > > > Manual, volume 2b, Instruction Reference N-Z).
> >
> > Well, this is a somewhat mixed issue. Intel documented
> > the usage of rdtsc (at the time when I coded this) with a
> > requirement of a cpuid+rdtsc pair. Cpuid flushes all
> 
> I guess that's because time stamp counter may have different values on
> different CPUs.  Am I right?

If you meant MP or multi-core systems, then no, the tsc is in
sync on all cpus.

My understanding was that Intel required the cpuid+rdtsc pair
because it expected rdtsc to be used by someone profiling
single-op code sections, who would then blame Intel when the
measured section of code was executed before or after the
time-measuring rdtsc's. cpuid was the simplest way to
guarantee in-order code execution on the cpu.

> > superscalar cpus, but I went on with the assumption that as
> > long as probe start and probe stop are similar pieces of
> > code, the added time uncertainty is largely cancelling
> > out as we are measuring time *between* two invocations.
> 
> Sounds reasonable to me: both the start and the stop could have (on
> average) the same offset error, so they would cancel each other out.  I
> just wonder if branch prediction doesn't give us some bias in this.

At some point we must stop worrying. After all, there are
*too many* things that impact precision as you approach
clock-tick resolution. An uncertainty of ~50 clock ticks is
damn good by any standard.

> We could have a
> matrix of profile information:
> M[caller][callee].  If you wish to get deeper levels, just add a new
> dimension to the 'profile matrix': M[grandfather][father][callee].

IMO it would add too much overhead, so it wouldn't be
usable in production mode.

> Anyway, if we trust & rely on the gprof call tree, there is no point in
> doing any of this.
> Just as a note: why don't we trust the gprof profile information but we
> do trust the gprof call graph?

Because the gprof call graph is determinate, but the profile
information is a statistical approximation. For the vast
majority of cases it's good enough. E.g. in this
case it seems that gprof wasn't that much off after all,
as Adrian found a bug that caused abnormal cleanups.
Sometimes gprof can produce stats that are misleading.


 Andres Kroonmaa
 Elion




Re: more profiling

2006-09-19 Thread Gonzalo Arana

On 9/19/06, Andres Kroonmaa <[EMAIL PROTECTED]> wrote:

On 19 Sep 2006 at 21:10, Adrian Chadd wrote:
> On Tue, Sep 19, 2006, Gonzalo Arana wrote:
>
> > Is hires profiling *that* heavy? I've used it in my production squids
> > (while I've used squid3) and the overhead was negligible.
>
> It doesn't seem to be that heavy.

hires profiling was designed for the lightest possible
overhead. I added special probes to measure its own
overhead, run regularly in events (PROF_OVERHEAD), and it
shows around 100 clock ticks on average on Adrian's 2.6G
cpu.

> > There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> > stalls CPU pipes.  That's not what Intel documentation says (page 213
> > -numbered as 4-209- of the Intel Architecture Software Developer
> > Manual, volume 2b, Instruction Reference N-Z).
> >
> > So, it should be harmless to profile as much code as possible, am I right?
>
> That's what I'm thinking! Things like perfsuite seem to do a pretty good job
> of it without requiring re-compilation as well.

Well, this is a somewhat mixed issue. Intel documented
the usage of rdtsc (at the time when I coded this) with a
requirement of a cpuid+rdtsc pair. Cpuid flushes all


I guess that's because time stamp counter may have different values on
different CPUs.  Am I right?


prefetch pipes and serializes execution, and that is what
they meant by stalled pipes. In my implementation I
played around with both and found that the cpuid inclusion
added unnecessary overhead - it indeed stalls pipes. For


So, timestamp counters seem to be in sync on multiple processors.


netburst Intel cores that can become quite a notable
overhead. So I excluded this and use a simple rdtsc command
alone. This, on the other hand, by definition causes some
precision error due to possible out-of-order execution on
superscalar cpus, but I went on with the assumption that as
long as probe start and probe stop are similar pieces of
code, the added time uncertainty largely cancels
out, as we are measuring time *between* two invocations.


Sounds reasonable to me: both the start and the stop could have (on
average) the same offset error, so they would cancel each other out.  I
just wonder if branch prediction doesn't give us some bias in this.


> > We could build something like gprof call graph (with some
> > limitations).  Adding this shouldn't be *that* difficult, right?
> >
> > Is there interest in improving the profiling code this way? (i.e.:
> > somewhat automated probe collection & adding call graph support).
>
> It'd be a pretty interesting experiment. gprof seems good enough
> to obtain call graph information (and call graph information only)
> and I'd rather we put our efforts towards fixing what we can find
> and porting over the remaining stuff from 2.6 into 3. We really
> need to concentrate on fixing up -3 rather than adding shinier things.
> Yet :)

How would you go on with adding call graphs without
adding too much overhead? I think it would be hard to
beat gprof on that one.


I'm sorry, by 'adding call graph' I meant 'adding some basic call tree
information' (I've used the wrong words, sorry).  We could have a
matrix of profile information:
M[caller][callee].  If you wish to get deeper levels, just add a new
dimension to the 'profile matrix': M[grandfather][father][callee].
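A minimal sketch of what such a matrix could look like (the probe ids are hypothetical stand-ins; a real version would reuse the existing profiling.h probes and feed elapsed rdtsc ticks into edge_stop):

```c
#include <stdint.h>

/* Hypothetical probe ids for illustration only. */
enum probe_id { PROBE_NONE, PROBE_clientRequest, PROBE_storeWrite, PROBE_MAX };

/* M[caller][callee]: per-edge call counts and cumulated ticks. */
static uint64_t edge_calls[PROBE_MAX][PROBE_MAX];
static uint64_t edge_ticks[PROBE_MAX][PROBE_MAX];

/* The probe currently on top; it is the caller of whatever starts next. */
static enum probe_id prof_current = PROBE_NONE;

/* Start returns the previous top so stop can restore it. The extra
 * array index and assignment per probe is the added overhead relative
 * to a flat probe list. */
static enum probe_id edge_start(enum probe_id callee) {
    enum probe_id caller = prof_current;
    edge_calls[caller][callee]++;
    prof_current = callee;
    return caller;
}

static void edge_stop(enum probe_id caller, enum probe_id callee,
                      uint64_t elapsed_ticks) {
    edge_ticks[caller][callee] += elapsed_ticks;
    prof_current = caller;
}
```

Each extra ancestor dimension multiplies the table size by PROBE_MAX and adds work on every start/stop, which is where Andres' overhead concern comes from.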

Anyway, if we trust & rely on the gprof call tree, there is no point in
doing any of this.
Just as a note: why don't we trust the gprof profile information but we
do trust the gprof call graph?

Regards,

--
Gonzalo A. Arana


Re: more profiling

2006-09-19 Thread Andres Kroonmaa
On 19 Sep 2006 at 21:10, Adrian Chadd wrote:
> On Tue, Sep 19, 2006, Gonzalo Arana wrote:
> 
> > Is hires profiling *that* heavy? I've used it in my production squids
> > (while I've used squid3) and the overhead was negligible.
> 
> It doesn't seem to be that heavy.

hires profiling was designed for the lightest possible
overhead. I added special probes to measure its own
overhead, run regularly in events (PROF_OVERHEAD), and it
shows around 100 clock ticks on average on Adrian's 2.6G
cpu.

> > There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> > stalls CPU pipes.  That's not what Intel documentation says (page 213
> > -numbered as 4-209- of the Intel Architecture Software Developer
> > Manual, volume 2b, Instruction Reference N-Z).
> > 
> > So, it should be harmless to profile as much code as possible, am I right?
> 
> That's what I'm thinking! Things like perfsuite seem to do a pretty good job
> of it without requiring re-compilation as well.

Well, this is a somewhat mixed issue. Intel documented
the usage of rdtsc (at the time when I coded this) with a
requirement of a cpuid+rdtsc pair. Cpuid flushes all
prefetch pipes and serializes execution, and that is what
they meant by stalled pipes. In my implementation I
played around with both and found that the cpuid inclusion
added unnecessary overhead - it indeed stalls pipes. For
netburst Intel cores that can become quite a notable
overhead. So I excluded this and use a simple rdtsc command
alone. This, on the other hand, by definition causes some
precision error due to possible out-of-order execution on
superscalar cpus, but I went on with the assumption that as
long as probe start and probe stop are similar pieces of
code, the added time uncertainty largely cancels
out, as we are measuring time *between* two invocations.

I notice that someone has added an rdtsc macro for the windows
platform and went with the documented cpuid+rdtsc pair.
My suggestion would be to omit the cpuid. It gains
nothing in terms of precision, since stalling the pipes
adds more overhead than the error introduced by omitting it.
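To make the trade-off concrete, a hedged sketch of the two probe variants being discussed (GNU-style inline asm assumed; function names are made up, and the non-x86 branch is only there so the sketch builds anywhere):

```c
#include <stdint.h>
#include <time.h>

/* Plain rdtsc: cheap, but may be reordered around neighbouring
 * instructions on superscalar cpus - the error Andres accepts. */
static inline uint64_t probe_rdtsc(void) {
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Non-x86 stand-in only; not part of the argument. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* The documented variant: cpuid drains the prefetch pipes and
 * serializes execution first - exact ordering, extra stall. */
static inline uint64_t probe_rdtsc_serialized(void) {
#if defined(__x86_64__) || defined(__i386__)
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                         : "0"(eax), "2"(ecx));
#endif
    return probe_rdtsc();
}

/* Back-to-back start/stop, PROF_OVERHEAD-style, to estimate what the
 * probe itself costs. */
static uint64_t probe_overhead(uint64_t (*now)(void)) {
    uint64_t t0 = now();
    uint64_t t1 = now();
    return t1 - t0;
}
```

On netburst cores the serialized variant's cpuid stall is exactly the extra overhead being argued against here.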

> > We could build something like gprof call graph (with some
> > limitations).  Adding this shouldn't be *that* difficult, right?
> > 
> > Is there interest in improving the profiling code this way? (i.e.:
> > somewhat automated probe collection & adding call graph support).
> 
> It'd be a pretty interesting experiment. gprof seems good enough
> to obtain call graph information (and call graph information only)
> and I'd rather we put our efforts towards fixing what we can find
> and porting over the remaining stuff from 2.6 into 3. We really
> need to concentrate on fixing up -3 rather than adding shinier things.
> Yet :)

How would you go on with adding call graphs without
adding too much overhead? I think it would be hard to
beat gprof on that one.


 Andres Kroonmaa
 Elion




Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Gonzalo Arana wrote:

> >Hey, want to join? :)
> 
> Tempting invitation :)  but for the moment I'll be committed to
> other projects (I believe for 2 months or so). But once I finish
> on-going projects, I'd love to join.
> 
> I manage about 20 squid servers for a hundred thousand ADSL & dial-up
> clients.  As soon as I return to squid development, I can provide a
> real-world testbench for squid3.

Nice! That'll be really helpful, thanks.

(Now, I -should- be committed to other projects, but you know me,
I can't get away from hacking on Squid.)



Adrian



Re: more profiling

2006-09-19 Thread Gonzalo Arana

On 9/19/06, Adrian Chadd <[EMAIL PROTECTED]> wrote:

On Tue, Sep 19, 2006, Gonzalo Arana wrote:






> >Bout the only really crinkly point I see atm is the zero-sized reply
> >stuff. I have a sneaking sense that the forwarder code is still slightly
> >broken.
>
> Nothing the squid-guru-team cannot solve I hope :).

Hey, want to join? :)


Tempting invitation :)  but for the moment I'll be committed to
other projects (I believe for 2 months or so). But once I finish
on-going projects, I'd love to join.

I manage about 20 squid servers for a hundred thousand ADSL & dial-up
clients.  As soon as I return to squid development, I can provide a
real-world testbench for squid3.

Regards,

--
Gonzalo A. Arana


Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Gonzalo Arana wrote:

> >Cute! It'd still be a good idea to explicitly state beginning/end where
> >appropriate. What might be nice is a "i was deallocated at the end of the
> >function rather than being deallocated explicitly" counter so things
> >could be noted?
> 
> I don't understand the "so things could be noted" meaning :(, sorry.

"written down to be looked over at a later time".

> >Bout the only really crinkly point I see atm is the zero-sized reply
> >stuff. I have a sneaking sense that the forwarder code is still slightly
> >broken.
> 
> Nothing the squid-guru-team cannot solve I hope :).

Hey, want to join? :)




Adrian



Re: more profiling

2006-09-19 Thread Gonzalo Arana

On 9/19/06, Adrian Chadd <[EMAIL PROTECTED]> wrote:

On Tue, Sep 19, 2006, Gonzalo Arana wrote:



> There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> stalls CPU pipes.  That's not what Intel documentation says (page 213
> -numbered as 4-209- of the Intel Architecture Software Developer
> Manual, volume 2b, Instruction Reference N-Z).
>
> So, it should be harmless to profile as much code as possible, am I right?

That's what I'm thinking! Things like perfsuite seem to do a pretty good job
of it without requiring re-compilation as well.


That seems promising.


> This could be automatically done by the compiler, if the profile probe
> was contained in an object.  The object will get automatically
> destroyed (and therefore the profiling probe will stop) when the
> function exits.

Cute! It'd still be a good idea to explicitly state beginning/end where
appropriate. What might be nice is an "I was deallocated at the end of the
function rather than being deallocated explicitly" counter so things
could be noted?


I don't understand the "so things could be noted" meaning :(, sorry.


> We could build something like gprof call graph (with some
> limitations).  Adding this shouldn't be *that* difficult, right?
>
> Is there interest in improving the profiling code this way? (i.e.:
> somewhat automated probe collection & adding call graph support).

It'd be a pretty interesting experiment. gprof seems good enough
to obtain call graph information (and call graph information only)
and I'd rather we put our efforts towards fixing what we can find
and porting over the remaining stuff from 2.6 into 3. We really
need to concentrate on fixing up -3 rather than adding shinier things.
Yet :)


Agreed, getting a stable squid3 is a priority.  It would be good for
the goals of a squid3 release to get better profiling
information.  But if we can trust gprof's call graph, then this
profiling code improvement is not needed right now.


I'm going to continue doing microbenchmarks to tax certain parts of
Squid (request parsing, reply parsing, connection creation/teardown,
storage memory management, small/large object proxying/caching,
probably should do some range request tests as well) to find the really
crinkly points and iron them out before the -3 release.

Bout the only really crinkly point I see atm is the zero-sized reply
stuff. I have a sneaking sense that the forwarder code is still slightly
broken.


Nothing the squid-guru-team cannot solve I hope :).

Regards,

--
Gonzalo A. Arana


Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Gonzalo Arana wrote:

> Is hires profiling *that* heavy? I've used it in my production squids
> (while I've used squid3) and the overhead was negligible.

It doesn't seem to be that heavy.

> There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> stalls CPU pipes.  That's not what Intel documentation says (page 213
> -numbered as 4-209- of the Intel Architecture Software Developer
> Manual, volume 2b, Instruction Reference N-Z).
> 
> So, it should be harmless to profile as much code as possible, am I right?

That's what I'm thinking! Things like perfsuite seem to do a pretty good job
of it without requiring re-compilation as well.

> This could be automatically done by the compiler, if the profile probe
> was contained in an object.  The object will get automatically
> destroyed (and therefore the profiling probe will stop) when the
> function exits.

Cute! It'd still be a good idea to explicitly state beginning/end where
appropriate. What might be nice is an "I was deallocated at the end of the
function rather than being deallocated explicitly" counter so things
could be noted?

> 
> We could build something like gprof call graph (with some
> limitations).  Adding this shouldn't be *that* difficult, right?
> 
> Is there interest in improving the profiling code this way? (i.e.:
> somewhat automated probe collection & adding call graph support).

It'd be a pretty interesting experiment. gprof seems good enough
to obtain call graph information (and call graph information only)
and I'd rather we put our efforts towards fixing what we can find
and porting over the remaining stuff from 2.6 into 3. We really
need to concentrate on fixing up -3 rather than adding shinier things.
Yet :)

I'm going to continue doing microbenchmarks to tax certain parts of
Squid (request parsing, reply parsing, connection creation/teardown,
storage memory management, small/large object proxying/caching,
probably should do some range request tests as well) to find the really
crinkly points and iron them out before the -3 release.

Bout the only really crinkly point I see atm is the zero-sized reply
stuff. I have a sneaking sense that the forwarder code is still slightly
broken.




Adrian






Re: more profiling

2006-09-19 Thread Gonzalo Arana

On 9/19/06, Adrian Chadd <[EMAIL PROTECTED]> wrote:

On Tue, Sep 19, 2006, Andres Kroonmaa wrote:



...


> Since then quite alot of changes have happened, so I'd
> suggest to look at the gprof stats to decide what funcs
> to probe with hires prof and add them.

Yeah, I'm thinking that too.


Is hires profiling *that* heavy? I've used it in my production squids
(while I've used squid3) and the overhead was negligible.

There is a comment in profiling.h claiming that rdtsc (for x86 arch)
stalls CPU pipes.  That's not what Intel documentation says (page 213
-numbered as 4-209- of the Intel Architecture Software Developer
Manual, volume 2b, Instruction Reference N-Z).

So, it should be harmless to profile as much code as possible, am I right?


> Also review the probes already there - you'd want to make
> sure a probe isn't left "running" at any function exit
> point - this would lead to sections of code being
> accounted to a probe incorrectly.


This could be automatically done by the compiler, if the profile probe
was contained in an object.  The object will get automatically
destroyed (and therefore the profiling probe will stop) when the
function exits.
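As a sketch of that idea in C rather than C++: GCC/Clang's cleanup attribute gives the same scope-exit guarantee as the destructor Gonzalo describes. The counters and function names here are toy stand-ins for the real probe accounting:

```c
/* Toy probe state standing in for the real per-probe accounting. */
static int active_probes;  /* probes currently running */
static int auto_stops;     /* stops performed by scope exit */

static void probe_start(int *token) { (void)token; active_probes++; }

/* Runs automatically whenever the annotated variable leaves scope,
 * on every return path - the probe can never be left "running". */
static void probe_autostop(int *token) {
    (void)token;
    active_probes--;
    auto_stops++;
}

#define SCOPED_PROBE(name) \
    int name __attribute__((cleanup(probe_autostop))) = (probe_start(&name), 0)

static int parse_request(int fail_early) {
    SCOPED_PROBE(p);
    if (fail_early)
        return -1;  /* early exit: the probe is still stopped */
    return 0;       /* normal exit: ditto */
}
```

In squid3 proper this would just be a small C++ class whose constructor starts the probe and whose destructor stops it.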


> There's something fishy with "best case" timers. They
> shouldn't be zero, ever. Ditto "worst case" - they *can*
> get high due to task switches, but your worst cases look
> way too high, on P4 2.6G there should be 2.6G ticks per
> second. Your worst case looks like probes have been
> running for 8.9secs straight, seems unlikely.
> So there seems to be a need to get hires profiling
> up to date with the current squid code base.



I did notice that but I don't know enough about the code to go digging.


On the x86 architecture, the timestamp counter may not run at the
external clock rate.  On different processor versions the meaning of a
tick in this clock may vary.  From the Intel Architecture Software
Developer's Manual, Volume 3b (chapter 18.9):
"For Pentium 4 processors, Intel Xeon, Intel Core Solo and Intel
Core Duo ...: the timestamp counter increments at a constant rate.
That rate may be set by the maximum core-clock to bus-clock ratio of
the processor or may be set by the frequency at which the processor is
booted.  The specific processor configuration determines the
behaviour."

The only important issue is that the TS counter runs at a constant rate,
but the rate itself is somewhat unknown (it could be measured anyway).


That said, the traces look much nicer now. There's definitely something
weird going on with the nested traces though.

I just don't have the time to go through the profiling code. It's definitely
nicer to use than gprof, but it'd be nice to keep counts of call graphs.



That's all I really use gprof for these days.


We could build something like the gprof call graph (with some
limitations).  Adding this shouldn't be *that* difficult, right?

Is there interest in improving the profiling code this way? (i.e.:
somewhat automated probe collection & adding call graph support).

Regards,

--
Gonzalo A. Arana


Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Andres Kroonmaa wrote:

> Is that all? or did you paste only part of the probes?
> When I used it on my prod caches, I enabled quite a lot
> more probes. When the profiling feature was included,
> there were other concurrent changes (eg chunked mempools)
> and in the submitted patch many probes got left out.

Take a look:

http://www.creative.net.au/diffs/profiling-1.html




Adrian



Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Andres Kroonmaa wrote:

> Is that all? or did you paste only part of the probes?
> When I used it on my prod caches, I enabled quite a lot
> more probes. When the profiling feature was included,
> there were other concurrent changes (eg chunked mempools)
> and in the submitted patch many probes got left out.

Only part of the probing.

> Since then quite a lot of changes have happened, so I'd
> suggest looking at the gprof stats to decide what funcs
> to probe with hires prof and add them.

Yeah, I'm thinking that too.

> Also review the probes already there - you'd want to make
> sure a probe isn't left "running" at any function exit
> point - this would lead to sections of code being
> accounted to a probe incorrectly.
> 
> There's something fishy with "best case" timers. They
> shouldn't be zero, ever. Ditto "worst case" - they *can*
> get high due to task switches, but your worst cases look
> way too high, on P4 2.6G there should be 2.6G ticks per
> second. Your worst case looks like probes have been
> running for 8.9secs straight, seems unlikely.
> So there seems to be a need to get hires profiling
> up to date with the current squid code base.

I did notice that but I don't know enough about the code to go digging.
That said, the traces look much nicer now. There's definitely something
weird going on with the nested traces though.

I just don't have the time to go through the profiling code. It's definitely
nicer to use than gprof, but it'd be nice to keep counts of call graphs.
That's all I really use gprof for these days.

> Unfortunately, I can't participate for now; my company
> has been restructured and caching has been thrown out, so
> I don't have any suitable platform at the moment. ;(

My current employer is responsible for a few Squid caches here and
there, but they're small installs for < 100 people. Squid-2.6 is a
negligible load on the proxy servers.

I'm doing all this stuff for fun. I got sick of having no hardware
and bought some second-hand equipment to play with.



adrian



Re: more profiling

2006-09-19 Thread Andres Kroonmaa
On 19 Sep 2006 at 13:17, Adrian Chadd wrote:

> Here's the hourly snapshot. 
> 
> Adrian
> 
> Last 1 hour averages: (Cumulated time: 25411206816740, 2782.06 sec)
> 
> Probe Name                      Events     cumulated time   best case  average  worst case    Rate / sec  % in int
> PROF_UNACCOUNTED                105984696   1728110104996           0    16305    424549192      38095.75     6.801
> PROF_OVERHEAD                      115170        18666044           0      162       212816         41.40     0.000
> HttpStateData_readReply           6748802  20584289534326           0  3050065  23144291000       2425.83    81.005
> StoreEntry_write                 82448181  19892984791130           0   241278  23142919280      29635.65    78.284
> storeGetMemSpace                 82448181  18827597315536           0   228356  23142884908      29635.65    74.092
> comm_check_incoming               1746752   1530441372606           0   876164     92138904        627.86     6.023
> comm_handle_ready_fd              1652867   1265826211860           0   765836    233867732        594.12     4.981
> HttpStateData_processReplyBody    6747441   1252785308450           0   185668   9434162500       2425.34     4.930
> MemObject_write                  82448181   1023697835718           0    12416     87664104      29635.65     4.029
> storeWriteComplete               82448181    684813646988           0     8305     46559868      29635.65     2.695
> 

Is that all? or did you paste only part of the probes?
When I used it on my prod caches, I enabled quite a lot
more probes. When the profiling feature was included,
there were other concurrent changes (eg chunked mempools)
and in the submitted patch many probes got left out.

Since then quite a lot of changes have happened, so I'd
suggest looking at the gprof stats to decide what funcs
to probe with hires prof and add them.

Also review the probes already there - you'd want to make
sure a probe isn't left "running" at any function exit
point - this would lead to sections of code being
accounted to a probe incorrectly.

There's something fishy with the "best case" timers. They
shouldn't be zero, ever. Ditto "worst case" - they *can*
get high due to task switches, but your worst cases look
way too high; on a P4 2.6G there should be 2.6G ticks per
second. Your worst case looks like a probe has been
running for 8.9 secs straight, which seems unlikely.
So there seems to be a need to get hires profiling
up to date with the current squid code base.
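For reference, the arithmetic behind that 8.9-second estimate, reading the largest worst-case figure in the snapshot as ~23144291000 ticks on the 2.6 GHz box:

```c
/* Convert a raw probe tick count to seconds at a given tick rate. */
static double ticks_to_seconds(double ticks, double ticks_per_sec) {
    return ticks / ticks_per_sec;
}
```

23144291000 / 2.6e9 comes out at roughly 8.90 seconds, which matches the "running for 8.9 secs straight" reading above.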

Unfortunately, I can't participate for now; my company
has been restructured and caching has been thrown out, so
I don't have any suitable platform at the moment. ;(


 Andres Kroonmaa
 Elion




Re: more profiling

2006-09-19 Thread Adrian Chadd
On Tue, Sep 19, 2006, Adrian Chadd wrote:

> You did say the getStats() call was very expensive.
> 
> Aha! Could someone please beat me to fixing this? I need to attend to studies
> for a few days.

Too late!

http://www.creative.net.au/diffs/20060919-squid3-mempools.diff

This seems to be doing the right thing with the previous workload.
I've filled up a 512mb memory cache and it's humming along fine.

Last 5 min averages: (Cumulated time: 798074067062, 299.90 sec)

  Probe Name                      Events    cumulated time  best case  average  worst case  Rate / sec  % in int
PROF_UNACCOUNTED                 10507385    124237869296          0    11823    63590628    35035.76    15.567
PROF_OVERHEAD                        9600         1309928          0      136       73084       32.01     0.000
comm_check_incoming                842083    551269798010          0   654650    74381524     2807.84    69.075
HttpStateData_readReply            286968     69808228176          0   243261    25017588      956.86     8.747
StoreEntry_write                  2645375     41428750244          0    15660    24068808     8820.72     5.191
HttpStateData_processReplyBody     286968     38964175080          0   135778    24096428      956.86     4.882
MemObject_write                   2645375     35661568784          0    13480    12717732     8820.72     4.468
comm_handle_ready_fd               808881     35570432888          0    43974     5329668     2697.13     4.457
storeWriteComplete                2645375     25746981780          0     9732    12706604     8820.72     3.226
comm_read_handler                  591574     23191715880          0    39203     2373324     1972.54     2.906
commHandleWrite                    579897     20575621632          0    35481     1868996     1933.61     2.578


I'll commit this tomorrow if no one objects.



Adrian