Re: OSD's slow down to a crawl

2013-01-09 Thread Mark Nelson
-- From: Sage Weil [mailto:s...@inktank.com] Sent: Friday, 21 December 2012 1:14 AM To: Matthew Anderson Cc: 'Mark Nelson'; ceph-devel@vger.kernel.org Subject: RE: OSD's slow down to a crawl On Thu, 20 Dec 2012, Matthew Anderson wrote: Hi Sage, Logs are attached. I took the osd logs fr

RE: OSD's slow down to a crawl

2013-01-09 Thread Matthew Anderson
on'; ceph-devel@vger.kernel.org Subject: RE: OSD's slow down to a crawl On Fri, 21 Dec 2012, Matthew Anderson wrote: > Hi Sage, > > I've tried to reproduce the error again with logging on every OSD and > got the above. RADOS bench had stalled on a write request like the

RE: OSD's slow down to a crawl

2012-12-21 Thread Sage Weil
--Original Message- > From: Sage Weil [mailto:s...@inktank.com] > Sent: Friday, 21 December 2012 1:14 AM > To: Matthew Anderson > Cc: 'Mark Nelson'; ceph-devel@vger.kernel.org > Subject: RE: OSD's slow down to a crawl > > On Thu, 20 Dec 2012, Matthew Anderson wrote:

RE: OSD's slow down to a crawl

2012-12-20 Thread Sage Weil
21 December 2012 12:30 AM > To: Matthew Anderson > Cc: 'Mark Nelson'; ceph-devel@vger.kernel.org > Subject: RE: OSD's slow down to a crawl > > Can you do a similar test, but with full logging on? > > ceph tell osd.0 injectargs '--debug-ms 1 --debug-files

RE: OSD's slow down to a crawl

2012-12-20 Thread Matthew Anderson
config or the kernel stats for the IB card. -----Original Message----- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Mark Nelson Sent: Friday, 21 December 2012 12:22 AM To: Matthew Anderson Cc: ceph-devel@vger.kernel.org Subject: Re: OSD's

RE: OSD's slow down to a crawl

2012-12-20 Thread Sage Weil
th (MB/sec): 0 > Average Latency: 29.2801 > Stddev Latency: 11.9768 > Max latency: 52.6277 > Min latency: 11.4904 > > -----Original Message----- > From: Mark Nelson [mailto:mark.nel...@inktank.com] > Sent: Thursd

Re: OSD's slow down to a crawl

2012-12-20 Thread Mark Nelson
From: Mark Nelson [mailto:mark.nel...@inktank.com] Sent: Thursday, 20 December 2012 11:59 PM To: Matthew Anderson Cc: ceph-devel@vger.kernel.org Subject: Re: OSD's slow down to a crawl Out of curiosity, if you fire up a rados bench instance on one of the nodes with say, 256 concurrent w

RE: OSD's slow down to a crawl

2012-12-20 Thread Matthew Anderson
@vger.kernel.org Subject: Re: OSD's slow down to a crawl Out of curiosity, if you fire up a rados bench instance on one of the nodes with say, 256 concurrent writes, do any of the writes complete? Mark On 12/20/2012 09:51 AM, Matthew Anderson wrote: > Hi Mark, > > Thanks for the quick

Re: OSD's slow down to a crawl

2012-12-20 Thread Mark Nelson
atch_interval": "0.001", "journaler_batch_max": "0", "mds_data": "\/var\/lib\/ceph\/mds\/ceph-24", "mds_max_file_size": "1099511627776", "mds_cache_size": "10", "mds_cache_mid"

RE: OSD's slow down to a crawl

2012-12-20 Thread Matthew Anderson
"104857600", "objecter_inflight_ops": "1024", "journaler_allow_split_entries": "true", "journaler_write_head_interval": "15", "journaler_prefetch_periods": "10", "journaler_prezero_periods": &

Re: OSD's slow down to a crawl

2012-12-20 Thread Mark Nelson
On 12/20/2012 09:16 AM, Matthew Anderson wrote: Hi All, I've run into an issue where OSDs slow right down to the point that they no longer appear to be processing write IO and everything grinds to a halt. Once they've stopped performing IO they can be brought back to life by restarting them