On Wed, Dec 9, 2009 at 2:24 AM, Rainer Heilke <rhei...@dragonhearth.com> wrote:
> Peter Tribble wrote:
>>
>> I was only thinking of kstat -p output as prototype/illustration. Although
>> being able to munge the output with the normal awk/sed/grep/perl and
>> chuck it straight into one's plotting package of choice does have some
>> appeal!
>
> And if the data corrupts, there's a possibility you may be able to get
> through/around the corruption. When sar data corrupts, everything for the
> rest of the day is gone. Realistically speaking, is 5MB of text data that
> big of a deal today?
>
>> I was thinking more of using cron for some of that.
>
> Since cron is more than likely already running (does _anyone_ turn it off?),
> I'd go with it, myself.
>
>> Keeping the storage under control is clearly going to be something that's
>> going to need quite a bit of thought. One of my aims is to actually have
>> very much more data to chew on, so that storage would naturally be
>> expected to increase.
>
> Yes, but see above. I'd rather have 10MB of text data than 0.5MB of binary I
> can't do anything with when everything starts circling the drain.
>
>> - Only save the processed data that sar uses. Compatible, but useless.
>
> Actually, that's better than sar. But now I'm just repeating myself. :-)
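The munging Peter mentions really is trivial with kstat -p output. A quick
sketch (a printf stands in for the Solaris-only kstat call here, so the
pipeline can be seen end to end):

```shell
# kstat -p prints one "module:instance:name:statistic<TAB>value" pair per
# line. The printf below stands in for a real `kstat -p cpu:0:sys:syscall`
# invocation, which only exists on Solaris.
printf 'cpu:0:sys:syscall\t123456\n' |
    awk -F'\t' '{ print $2 }'    # just the value, ready for a plotting package
```

From there it is one pipe away from gnuplot or anything else that reads a
column of numbers.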
And to support my previous claims that sar is useless because it has a
propensity to corrupt data:

$ du -h /var/adm/sa/*
 2.3M   /var/adm/sa/sa16
 1.3M   /var/adm/sa/sa17
 336K   /var/adm/sa/sa20
$ sar -u -f /var/adm/sa/sa16

SunOS xxxx 5.10 Generic_137137-09 sun4u    10/16/2009

08:24:09    %usr    %sys    %wio   %idle
08:24:09        unix restarts
11:44:38        unix restarts
15:32:16        unix restarts
19:50:38        unix restarts
21:15:02        unix restarts
21:44:33        unix restarts
22:08:38        unix restarts
$ sar -A -f /var/adm/sa/sa16 | grep -v 'unix restarts' | grep .
SunOS xxxx 5.10 Generic_137137-09 sun4u    10/16/2009
08:24:09    %usr    %sys    %wio   %idle
08:24:09   device   %busy   avque   r+w/s  blks/s  avwait  avserv
08:24:09  runq-sz %runocc swpq-sz %swpocc
08:24:09  bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
08:24:09  swpin/s bswin/s swpot/s bswot/s pswch/s
08:24:09  scall/s sread/s swrit/s  fork/s  exec/s rchar/s wchar/s
08:24:09   iget/s namei/s dirbk/s
08:24:09   rawch/s  canch/s  outch/s  rcvin/s  xmtin/s  mdmin/s
08:24:09  proc-sz    ov  inod-sz    ov  file-sz    ov   lock-sz
08:24:09    msg/s  sema/s
08:24:09   atch/s  pgin/s ppgin/s  pflt/s  vflt/s slock/s
08:24:09  pgout/s ppgout/s pgfree/s pgscan/s %ufs_ipf
08:24:09  freemem freeswap
08:24:09  sml_mem   alloc    fail  lg_mem   alloc    fail  ovsz_alloc  fail

2.3 MB of corrupt sar data on an S10u6 system. The other days are the same.

>
>> - Compress in the time domain, so you don't keep saving the kstats
>> that don't change. (A quick test on my desktop - only 10% or so of
>> the statistics actually change over an hour.)
>
> I like this--only keep the data that changes. You need to be careful about
> how you parse (sed/grep/awk/Perl) it, though.

If any sort of compression is used (such as not repeating values that are
observed to be the same as the last time), it is essential that there is a
robust decompression mechanism (API and command). The output of such a
command can be piped to traditional tools.
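To show that both halves are cheap, here is a sketch of the compress and
decompress steps over kstat -p style "name<TAB>value" lines. The file
names and sample values are invented, and the printf calls stand in for
successive kstat -p runs:

```shell
# Two successive snapshots (stand-ins for successive `kstat -p` captures).
printf 'cpu:0:sys:intr\t1000\ncpu:0:sys:syscall\t5000\n' > snap.0
printf 'cpu:0:sys:intr\t1200\ncpu:0:sys:syscall\t5000\n' > snap.1

# Compress: keep only statistics that are new or whose value changed.
awk -F'\t' 'NR==FNR { prev[$1] = $2; next }
            !($1 in prev) || prev[$1] != $2' snap.0 snap.1 > delta.1

# Decompress: overlay the delta on the previous full snapshot, yielding
# output identical to snap.1 that pipes straight into grep/sed/awk/perl.
awk -F'\t' 'NR==FNR { cur[$1] = $2; next }
            ($1 in cur) { print $1 "\t" cur[$1]; next }
            { print }' delta.1 snap.0
```

The point is that the decompressor's output is indistinguishable from a
full capture, so all the traditional parsing habits keep working.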
>> However, I think that if this turns out to be useful, then it will be seen
>> to be much less of an issue. And I can see users wanting to increase
>> the sampling rate to get at more detail.
>
> As I've had to do on several occasions with the data I was collecting. One
> system was at every 5 minutes and that wasn't enough.
>
> We really need to make sure Sun (I'm thinking of the support wings) buys
> into anything we do, though. As long as they insist on sar data, we're
> hooped.

Sampling at a higher rate with automatic roll-up and pruning is very
helpful. This is the key to the success of rrdtool. It is also the approach
used by the analytics found in Sun's commercial webstack offerings. Having
seen rrdtool behind the scenes in the webstack, I wonder whether it is also
used in the Amber Road analytics.

The interface to sar is a handful of commands. It should not be hard to
provide the same commands and have them get their data from whatever format
is chosen as a successor. When support asks for sar data, you can give it
to them. When that does not show the full story, the modern statistics that
are important can be pulled. Somewhat importantly, they will tell a more
detailed story (more measurements at the same intervals and sample times)
rather than a slightly different story (different sample times, different
intervals, maybe confusing/conflicting terminology).

And on the wishlist:

- A web service or similar that makes it possible to collect this data or
  do analysis that spans systems.
- APIs and tools that can talk to these web services and aid in data
  visualization.

FWIW, I have repeatedly asked Sun to sell me Amber Road style analytics for
use on Solaris boxes. I can't say that I have heard any indication that
they will ever deliver such a thing. Perhaps this effort can be a step in
that direction.
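To make the roll-up idea above concrete, here is a minimal sketch of the
consolidation step that rrdtool performs, assuming per-minute kstat -p
style captures (the min.* file names and sample values are invented for
illustration):

```shell
# Fake three per-minute captures; each would really be `kstat -p > min.N`.
i=1
while [ $i -le 3 ]; do
    printf 'cpu:0:sys:syscall\t%d\n' $((i * 100)) > min.$i
    i=$((i + 1))
done

# Roll up: average each statistic across the fine-grained samples. The
# one consolidated line is what gets kept; the raw min.* files can then
# be pruned, so storage stays bounded no matter the sampling rate.
awk -F'\t' '{ sum[$1] += $2; n[$1]++ }
            END { for (k in sum) printf "%s\t%g\n", k, sum[k] / n[k] }' \
    min.1 min.2 min.3
```

rrdtool does the same thing with multiple consolidation functions (MIN,
MAX, AVERAGE) and several retention tiers, but the principle is no more
than this.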
-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
sysadmin-discuss mailing list
sysadmin-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/sysadmin-discuss