On 9/19/06, Adrian Chadd <[EMAIL PROTECTED]> wrote:
On Tue, Sep 19, 2006, Gonzalo Arana wrote:

> There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> stalls CPU pipes.  That's not what Intel documentation says (page 213
> -numbered as 4-209- of the Intel Architecture Software Developer
> Manual, volume 2b, Instruction Reference N-Z).
>
> So, it should be harmless to profile as much code as possible, am I right?

That's what I'm thinking! Things like perfsuite seem to do a pretty good job
of it without requiring re-compilation as well.

That seems promising.

> This could be automatically done by the compiler, if the profile probe
> was contained in an object.  The object will get automatically
> destroyed (and therefore the profiling probe will stop) when the
> function exits.

Cute! It'd still be a good idea to explicitly state beginning/end where
appropriate. What might be nice is an "I was deallocated at the end of the
function rather than being deallocated explicitly" counter so things
could be noted?

Sorry, I don't understand what you mean by "so things could be noted" :(.
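
For reference, here is a minimal sketch of the probe-in-an-object idea,
together with one possible reading of the counter suggested above. All
names (ProbeStats, ScopedProbe, parse_request) are hypothetical and are
not taken from profiling.h, and std::chrono::steady_clock stands in for
rdtsc so that the sketch stays portable.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical per-probe statistics; not Squid's actual profiling structures.
struct ProbeStats {
    explicit ProbeStats(const char *n) : name(n) {}
    const char *name;
    std::uint64_t calls = 0;
    std::uint64_t total_ns = 0;
    std::uint64_t implicit_stops = 0;  // probe ended by the destructor, not by stop()
};

// Starts timing on construction and stops on stop() or on destruction,
// whichever comes first.
class ScopedProbe {
    using Clock = std::chrono::steady_clock;  // assumption: stand-in for rdtsc

public:
    explicit ScopedProbe(ProbeStats &stats)
        : stats_(stats), start_(Clock::now()), running_(true) {}

    ScopedProbe(const ScopedProbe &) = delete;
    ScopedProbe &operator=(const ScopedProbe &) = delete;

    // Explicit end of the measured region.
    void stop() { finish(false); }

    // Implicit end: runs on every exit path from the enclosing scope.
    ~ScopedProbe() { finish(true); }

private:
    void finish(bool implicit) {
        if (!running_) return;  // already stopped explicitly
        running_ = false;
        auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now() - start_);
        stats_.calls++;
        stats_.total_ns += static_cast<std::uint64_t>(elapsed.count());
        if (implicit) stats_.implicit_stops++;
    }

    ProbeStats &stats_;
    Clock::time_point start_;
    bool running_;
};

static ProbeStats parse_stats("parse_request");

// Hypothetical function under measurement.
int parse_request(int n) {
    ScopedProbe probe(parse_stats);
    if (n < 0) return -1;  // early return: the destructor stops the probe
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += i;
    probe.stop();          // explicit end of the measured region
    return sum;
}
```

With this reading, implicit_stops counts the calls that left the function
without reaching an explicit stop(), so a high value would point at early
returns in the probed code.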

> We could build something like gprof call graph (with some
> limitations).  Adding this shouldn't be *that* difficult, right?
>
> Is there interest in improving the profiling code this way? (i.e.:
> somewhat automated probe collection & adding call graph support).

It'd be a pretty interesting experiment. gprof seems good enough
to obtain call graph information (and call graph information only)
and I'd rather we put our efforts towards fixing what we can find
and porting over the remaining stuff from 2.6 into 3. We really
need to concentrate on fixing up -3 rather than adding shinier things.
Yet :)

Agreed, getting a stable squid3 is the priority.  Better profiling
information would help toward the goal of a squid3 release, but if we
can trust gprof's call graph, then this profiling code improvement is
not needed right now.

I'm going to continue doing microbenchmarks to tax certain parts of
Squid (request parsing, reply parsing, connection creation/teardown,
storage memory management, small/large object proxying/caching,
probably should do some range request tests as well) to find the really
crinkly points and iron them out before the -3 release.

About the only really crinkly point I see atm is the zero-sized reply
stuff. I have a sneaking sense that the forwarder code is still slightly
broken.

Nothing the squid-guru-team cannot solve I hope :).

Regards,

--
Gonzalo A. Arana
