On 27/12/10 17:25, Nigel Kersten wrote:
> On Mon, Dec 20, 2010 at 3:26 AM, Brice Figureau
> <[email protected]> wrote:
>> On 20/12/10 05:32, [email protected] wrote:
>>> Hi Brice,
>>>
>>> This really just seems like eye candy to me for those who run top and
>>> not really useful to anyone who wants to know "Why was my puppet
>>> master slow at 3am this morning?".
>>
>> I never pretended this patch would solve this issue :)
>> This is a first step toward a broader solution that I hope can cover
>> this use case.
>>
>>> How about asynchronously &
>>> semantically logging it somewhere to a file or syslog so that those
>>> who want to know can fire up Dashboard/Splunk/perl/python/ruby/awk on
>>> the logs and find the answers.
>>
>> Compilation time and various other pieces of information are already
>> logged; if you think any information is missing, please let us know
>> and we'll add more logging.
>>
>> But I think we need something more powerful and fine-grained, that
>> could capture different metrics than only the time spent doing
>> something. Ultimately I'd like to offer a framework that can capture
>> elapsed time, count, CPU time, and memory, per probe. What is not
>> clear at this stage is how to get access to those metrics, how long
>> to keep them, how to aggregate them, where probes should be placed,
>> and so on...
>>
>> If anyone has any good ideas about these subjects (especially what
>> scenarios should be handled), that would be terrific.
>
> Something I've seen work particularly well is to implement something
> like Apache mod_status/mod_info, where you have an internal HTTP
> server that exposes various metrics.
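For illustration, the kind of per-probe framework described above (elapsed time and invocation count per probe, with a snapshot that a mod_status-style endpoint could serve) might be sketched roughly like this. The `Instrumentation` module and its method names are invented for this sketch, not actual Puppet code:

```ruby
# Hypothetical sketch of a per-probe metrics registry: each named probe
# accumulates a call count and total elapsed wall-clock time. CPU time
# and memory probes would follow the same pattern.
module Instrumentation
  Probe = Struct.new(:count, :total_time) do
    def mean_time
      count.zero? ? 0.0 : total_time / count
    end
  end

  @probes = Hash.new { |h, k| h[k] = Probe.new(0, 0.0) }
  @mutex  = Mutex.new

  # Wrap a block with a named probe, accumulating elapsed time and
  # invocation count; returns the block's result unchanged.
  def self.probe(name)
    start = Time.now
    result = yield
    elapsed = Time.now - start
    @mutex.synchronize do
      p = @probes[name]
      p.count += 1
      p.total_time += elapsed
    end
    result
  end

  # Snapshot of all collected metrics, e.g. for exposure through a
  # status/HTTP endpoint.
  def self.report
    @mutex.synchronize do
      @probes.each_with_object({}) do |(name, p), h|
        h[name] = { count: p.count,
                    total_time: p.total_time,
                    mean_time: p.mean_time }
      end
    end
  end
end
```

A caller would simply wrap the code to measure, e.g. `Instrumentation.probe("catalog.compile") { compile_catalog }`, and an aggregating script could merge the `report` hashes from several worker processes.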
That's more or less what I had in mind. The problem we have to solve,
though, is not how to expose this information (we already have a status
indirection that is unused, which we could leverage), but rather which
metrics would actually be interesting. I still plan to produce a patch
that gives us a way to collect metrics and access them, but I don't yet
have a clear vision of which ones those should be :(

> I'm not entirely clear how we could do this effectively with our
> common deployment pattern of multiple daemons behind
> Apache/nginx/pound, but I've found this sort of model far more useful
> than just having debug logging in the past, and there are wider
> benefits like being able to scrape this data more easily than parsing
> syslog.

That's correct. My plan regarding multiple processes was to not care:
I'm sure it's easy to write a script of some kind to accumulate the
metrics from the different processes.

> I'm not really fond of the process name hack, but I'm not sure I have
> a better suggestion either, so I'm echoing Brice's call for more input
> from the community.

As I said earlier, this is just an experimental (but fun anyway) way to
dig inside an opaque process :)
I'm busy on another experimental Puppet project right now, but I'll
soon resume this instrumentation work.
-- 
Brice Figureau
My Blog: http://www.masterzen.fr/
