Right now we're just scraping the output of ifconfig:

ifconfig p2p1 | grep -e 'RX\|TX' | grep packets | awk '{print $3}'

It's clunky, but it works. I'm sure there's a cleaner way, but this was
expedient.
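
One possibly cleaner route, as a rough sketch only: on Linux the kernel
exposes per-interface counters as plain files under
/sys/class/net/<iface>/statistics/, so the numbers can be read directly
instead of parsing ifconfig output. The function name nic_errors and the
SYSFS_NET override variable are made up for this example and are not part
of what we actually run.

```shell
#!/bin/sh
# Sketch: print the RX/TX error counters for one interface from sysfs.
# nic_errors and SYSFS_NET are hypothetical names used for illustration.

nic_errors() {
    iface="$1"
    # SYSFS_NET can be overridden for testing; defaults to the Linux sysfs path.
    stats_dir="${SYSFS_NET:-/sys/class/net}/$iface/statistics"
    if [ ! -d "$stats_dir" ]; then
        echo "no statistics for interface: $iface" >&2
        return 1
    fi
    # Each counter is a single-number file maintained by the kernel.
    for counter in rx_errors tx_errors; do
        printf '%s %s\n' "$counter" "$(cat "$stats_dir/$counter")"
    done
}
```

Calling "nic_errors p2p1" would print one line per counter. These values
are cumulative totals, so an error rate would still need the difference
between two samples taken some interval apart.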

QH


On Tue, Mar 31, 2015 at 5:05 PM, Francois Lafont <flafdiv...@free.fr> wrote:

> Hi,
>
> Quentin Hartman wrote:
>
> > Since I have been in ceph-land today, it reminded me that I needed to close
> > the loop on this. I was finally able to isolate this problem down to a
> > faulty NIC on the ceph cluster network. It "worked", but it was
> > accumulating a huge number of Rx errors. My best guess is some receive
> > buffer cache failed? Anyway, having a NIC go weird like that is totally
> > consistent with all the weird problems I was seeing, the corrupted PGs, and
> > the inability for the cluster to settle down.
> >
> > As a result we've added NIC error rates to our monitoring suite on the
> > cluster so we'll hopefully see this coming if it ever happens again.
>
> Good for you. ;)
>
> Could you post here the command that you use to get NIC error rates?
>
> --
> François Lafont
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>