On 9/19/06, Adrian Chadd <[EMAIL PROTECTED]> wrote:
On Tue, Sep 19, 2006, Gonzalo Arana wrote:
> There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> stalls CPU pipes. That's not what Intel documentation says (page 213
> -numbered as 4-209- of the Intel Architecture Software Developer
> Manual, volume 2b, Instruction Reference N-Z).
>
> So, it should be harmless to profile as much code as possible, am I right?

That's what I'm thinking! Things like perfsuite seem to do a pretty good job of it without requiring re-compilation as well.
That seems promising.
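One way to check the probe overhead on a given machine is to time back-to-back counter reads. Below is a rough sketch, assuming GCC or Clang on x86, where `__rdtsc()` comes from `<x86intrin.h>`; on other targets it falls back to `std::chrono::steady_clock`. The names `read_ticks` and `average_probe_cost` are made up for illustration and are not part of Squid's profiling.h.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   // provides __rdtsc() on GCC and Clang
#endif

// Hypothetical helper: return a raw tick count.
// On x86 this is the time-stamp counter; elsewhere it is a clock reading.
static inline std::uint64_t read_ticks() {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Hypothetical helper: estimate the average cost, in ticks, of one
// counter read by issuing many reads back to back.
static double average_probe_cost(int iterations) {
    assert(iterations > 0);
    std::uint64_t sink = 0;
    const std::uint64_t start = read_ticks();
    for (int i = 0; i < iterations; ++i)
        sink ^= read_ticks();            // keep the reads from being optimized away
    const std::uint64_t stop = read_ticks();
    volatile std::uint64_t keep = sink;  // force the compiler to keep 'sink'
    (void)keep;
    return static_cast<double>(stop - start) / iterations;
}
```

Since rdtsc is not a serializing instruction, out-of-order execution can blur the result, so the number is only a ballpark figure for how much each probe costs.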
> This could be automatically done by the compiler, if the profile probe
> was contained in an object. The object will get automatically
> destroyed (and therefore the profiling probe will stop) when the
> function exits.

Cute! It'd still be a good idea to explicitly state beginning/end where appropriate. What might be nice is an "I was deallocated at the end of the function rather than being deallocated explicitly" counter so things could be noted?
I don't understand what "so things could be noted" means :(, sorry.
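For what it's worth, here is one possible reading of the idea, as a minimal sketch: a probe object that stops in its destructor, plus a counter recording how often it was stopped implicitly (by leaving scope) rather than by an explicit call. `ProbeStats`, `ScopedProbe` and `parse_request` are hypothetical names, and `std::chrono::steady_clock` stands in for whatever counter the real profiling code uses; none of this is Squid's actual API.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical per-probe statistics (not Squid's real structures).
struct ProbeStats {
    std::uint64_t calls = 0;
    std::uint64_t explicit_stops = 0;  // stop() was called by hand
    std::uint64_t implicit_stops = 0;  // destructor had to stop the probe
    std::chrono::steady_clock::duration total{};
};

// Starts timing on construction; stops on stop() or on destruction.
class ScopedProbe {
public:
    explicit ScopedProbe(ProbeStats &stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {
        ++stats_.calls;
    }
    ScopedProbe(const ScopedProbe &) = delete;
    ScopedProbe &operator=(const ScopedProbe &) = delete;

    // Explicit end of the measured region.
    void stop() {
        assert(running_ && "probe stopped twice");
        finish();
        ++stats_.explicit_stops;
    }

    // Implicit end: the function exited without calling stop().
    ~ScopedProbe() {
        if (running_) {
            finish();
            ++stats_.implicit_stops;
        }
    }

private:
    void finish() {
        stats_.total += std::chrono::steady_clock::now() - start_;
        running_ = false;
    }

    ProbeStats &stats_;
    std::chrono::steady_clock::time_point start_;
    bool running_ = true;
};

// Illustrative function with one early return and one explicit stop.
static int parse_request(ProbeStats &stats, bool early_return) {
    ScopedProbe probe(stats);
    if (early_return)
        return -1;      // probe stops in its destructor (implicit)
    probe.stop();       // probe stops here (explicit)
    return 0;
}
```

A high `implicit_stops` count for a probe would then point at functions whose exit paths were never annotated by hand, which may be what "noted" was getting at.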
> We could build something like gprof call graph (with some
> limitations). Adding this shouldn't be *that* difficult, right?
>
> Is there interest in improving the profiling code this way? (i.e.:
> somewhat automated probe collection & adding call graph support).

It'd be a pretty interesting experiment. gprof seems good enough to obtain call graph information (and call graph information only) and I'd rather we put our efforts towards fixing what we can find and porting over the remaining stuff from 2.6 into 3. We really need to concentrate on fixing up -3 rather than adding shinier things. Yet :)
Agreed, getting a stable squid3 is a priority. Better profiling information would help towards the goal of a squid3 release, but if we can trust gprof's call graph, then this profiling code improvement is not needed right now.
I'm going to continue doing microbenchmarks to tax certain parts of Squid (request parsing, reply parsing, connection creation/teardown, storage memory management, small/large object proxying/caching, probably should do some range request tests as well) to find the really crinkly points and iron them out before the -3 release. About the only really crinkly point I see atm is the zero-sized reply stuff. I have a sneaking sense that the forwarder code is still slightly broken.
Nothing the squid-guru-team cannot solve, I hope :).

Regards,
--
Gonzalo A. Arana