Re: [ceph-users] Useful visualizations / metrics

2014-04-13 Thread Greg Poirier
Villalta [ja...@rubixnet.com] *Sent:* 12 April 2014 16:41 *To:* Greg Poirier *Cc:* ceph-users@lists.ceph.com *Subject:* Re: [ceph-users] Useful visualizations / metrics > I know Ceph throws some warnings if there is high write latency. But I would be mo

Re: [ceph-users] Useful visualizations / metrics

2014-04-13 Thread Dan Van Der Ster
-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com] on behalf of Jason Villalta [ja...@rubixnet.com] Sent: 12 April 2014 16:41 To: Greg Poirier Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Useful visualizations / metrics I know ceph throws some warnings if there is high write

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Craig Lewis
I've been graphing disk latency, osd latency, and RGW latency.  It's a bit tricky to pull out of ceph --admin-daemon ceph-osd.0.asok perf dump though.  perf dump gives you the total ops and total op time.  You have to track the delta of those two values, then di
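A minimal sketch of the delta calculation described above, assuming the message (cut off at "then di") goes on to divide the op-time delta by the op-count delta. The counter layout is also an assumption: each latency counter is taken to be a dict with an "avgcount" (total ops) and a "sum" (total op time in seconds) field, and the function name interval_latency is hypothetical.

```python
def interval_latency(prev, curr):
    """Average seconds per op between two samples of one latency counter.

    prev and curr are assumed to look like {"avgcount": <ops>, "sum": <seconds>},
    taken from two successive perf dump readings of the same counter.
    """
    ops = curr["avgcount"] - prev["avgcount"]
    secs = curr["sum"] - prev["sum"]
    # No ops in the interval, or the counters went backwards (for example
    # after a daemon restart): there is no meaningful average to report.
    if ops <= 0 or secs < 0:
        return None
    return secs / ops


# Two hypothetical samples taken one polling interval apart.
earlier = {"avgcount": 100, "sum": 2.0}
later = {"avgcount": 150, "sum": 3.0}
print(interval_latency(earlier, later))
```

Returning None for an empty or reset interval leaves a gap in the graph rather than plotting a misleading zero or a negative latency.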

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Greg Poirier
We are collecting system metrics through sysstat every minute and getting those to OpenTSDB via Sensu. We have a plethora of metrics, but I am finding it difficult to create meaningful visualizations. We have alerting for things like individual OSDs reaching capacity thresholds, memory spikes on OS

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Mark Nelson
One thing I do right now for ceph performance testing is run a copy of collectl during every test. This gives you a TON of information about CPU usage, network stats, disk stats, etc. It's pretty easy to import the output data into gnuplot. Mark Seger (the creator of collectl) also has some

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Jason Villalta
I know Ceph throws some warnings if there is high write latency. But I would be most interested in the delay for I/O requests, linking directly to IOPS. If IOPS start to drop because the disks are overwhelmed, then latency for requests would be increasing. This would tell me that I need to add more

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Greg Poirier
Curious as to how you define cluster latency. On Sat, Apr 12, 2014 at 7:21 AM, Jason Villalta wrote: > Hi, I have not done anything with metrics yet, but the only ones I personally would be interested in are total capacity utilization and cluster latency. > > Just my 2 cents. > > > On Sat, A

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Jason Villalta
Hi, I have not done anything with metrics yet, but the only ones I personally would be interested in are total capacity utilization and cluster latency. Just my 2 cents. On Sat, Apr 12, 2014 at 10:02 AM, Greg Poirier wrote: > I'm in the process of building a dashboard for our Ceph nodes. I was >