Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Lincoln Bryant
Hello again,

Well, I disabled offloads on the NIC -- didn’t work for me. I also tried 
setting net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere in the thread 
to no avail.
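For reference, that was roughly the following (eth0 is just a placeholder for
the actual interfaces, and the exact set of offloads depends on the NIC/driver):

  ethtool -K eth0 tso off gso off gro off lro off   # disable offloads per interface
  sysctl -w net.ipv4.tcp_moderate_rcvbuf=0          # also added to /etc/sysctl.conf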

Today I was watching iostat on an OSD box ('iostat -xm 5') when the cluster got 
into “slow” state:

Device:         rrqm/s   wrqm/s     r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00    13.57   84.23   167.47     0.45     2.78    26.26     2.06    8.18   3.85  96.93
sdc               0.00    46.71    5.59   289.22     0.03     2.54    17.85     3.18   10.77   0.97  28.72
sdd               0.00    16.57   45.11    91.62     0.25     0.55    12.01     0.75    5.51   2.45  33.47
sde               0.00    13.57    6.99   143.31     0.03     2.53    34.97     1.99   13.27   2.12  31.86
sdf               0.00    18.76    4.99   158.48     0.10     1.09    14.88     1.26    7.69   1.24  20.26
sdg               0.00    25.55   81.64   237.52     0.44     2.89    21.36     4.14   12.99   2.58  82.22
sdh               0.00    89.42   16.17   492.42     0.09     3.81    15.69    17.12   33.66   0.73  36.95
sdi               0.00    20.16   17.76   189.62     0.10     1.67    17.46     3.45   16.63   1.57  32.55
sdj               0.00    31.54    0.00   185.23     0.00     1.91    21.15     3.33   18.00   0.03   0.62
sdk               0.00    26.15    2.40   133.33     0.01     0.84    12.79     1.07    7.87   0.85  11.58
sdl               0.00    25.55    9.38   123.95     0.05     1.15    18.44     0.50    3.74   1.58  21.10
sdm               0.00     6.39   92.61    47.11     0.47     0.26    10.65     1.27    9.07   6.92  96.73

The %util is rather high on some disks, but I’m not an expert at looking at 
iostat so I’m not sure how worrisome this is. Does anything here stand out to 
anyone? 

At the time of that iostat, Ceph was reporting a lot of blocked ops on the OSD 
associated with sde (as well as about 30 other OSDs), but it doesn’t look all 
that busy. Some simple ‘dd’ tests seem to indicate the disk is fine.
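In case it helps, this is roughly how I'm correlating blocked ops with disks
(osd.111 is just an example ID):

  ceph health detail | grep -i blocked       # which OSDs have blocked requests
  ceph daemon osd.111 dump_ops_in_flight     # on the OSD host: what that OSD is stuck on
  df -h /var/lib/ceph/osd/ceph-111           # map the OSD back to its backing device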

Similarly, iotop seems OK on this host:

    TID  PRIO  USER   DISK READ   DISK WRITE  SWAPIN   IO>    COMMAND
 472477  be/4  root    0.00 B/s     5.59 M/s  0.00 %  0.57 %  ceph-osd -i 111 --pid-file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
 470621  be/4  root    0.00 B/s    10.09 M/s  0.00 %  0.40 %  ceph-osd -i 111 --pid-file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
3495447  be/4  root    0.00 B/s   272.19 K/s  0.00 %  0.36 %  ceph-osd -i 114 --pid-file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
3488389  be/4  root    0.00 B/s   596.80 K/s  0.00 %  0.16 %  ceph-osd -i 109 --pid-file /var/run/ceph/osd.109.pid -c /etc/ceph/ceph.conf --cluster ceph
3488060  be/4  root    0.00 B/s   600.83 K/s  0.00 %  0.15 %  ceph-osd -i 108 --pid-file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph
3505573  be/4  root    0.00 B/s   528.25 K/s  0.00 %  0.10 %  ceph-osd -i 119 --pid-file /var/run/ceph/osd.119.pid -c /etc/ceph/ceph.conf --cluster ceph
3495434  be/4  root    0.00 B/s     2.02 K/s  0.00 %  0.10 %  ceph-osd -i 114 --pid-file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
3502327  be/4  root    0.00 B/s   506.07 K/s  0.00 %  0.09 %  ceph-osd -i 118 --pid-file /var/run/ceph/osd.118.pid -c /etc/ceph/ceph.conf --cluster ceph
3489100  be/4  root    0.00 B/s   106.86 K/s  0.00 %  0.09 %  ceph-osd -i 110 --pid-file /var/run/ceph/osd.110.pid -c /etc/ceph/ceph.conf --cluster ceph
3496631  be/4  root    0.00 B/s   229.85 K/s  0.00 %  0.05 %  ceph-osd -i 115 --pid-file /var/run/ceph/osd.115.pid -c /etc/ceph/ceph.conf --cluster ceph
3505561  be/4  root    0.00 B/s     2.02 K/s  0.00 %  0.03 %  ceph-osd -i 119 --pid-file /var/run/ceph/osd.119.pid -c /etc/ceph/ceph.conf --cluster ceph
3488059  be/4  root    0.00 B/s     2.02 K/s  0.00 %  0.03 %  ceph-osd -i 108 --pid-file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph
3488391  be/4  root   46.37 K/s   431.47 K/s  0.00 %  0.02 %  ceph-osd -i 109 --pid-file /var/run/ceph/osd.109.pid -c /etc/ceph/ceph.conf --cluster ceph
3500639  be/4  root    0.00 B/s   221.78 K/s  0.00 %  0.02 %  ceph-osd -i 117 --pid-file /var/run/ceph/osd.117.pid -c /etc/ceph/ceph.conf --cluster ceph
3488392  be/4  root   34.28 K/s   185.49 K/s  0.00 %  0.02 %  ceph-osd -i 109 --pid-file /var/run/ceph/osd.109.pid -c /etc/ceph/ceph.conf --cluster ceph
3488062  be/4  root    4.03 K/s    66.54 K/s  0.00 %  0.02 %  ceph-osd -i 108 --pid-file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph

These are all 6TB Seagates in single-disk RAID 0 on a PERC H730 Mini controller.

I did try removing the disk with 20k non-medium errors, but that didn’t seem to 
help. 
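For what it's worth, SMART data can usually be pulled through the
MegaRAID-based PERC with something like the following; the megaraid,N index
has to be matched to the physical slot, so treat the numbers as placeholders:

  smartctl -a -d megaraid,0 /dev/sda                                   # full SMART output for one slot
  for n in $(seq 0 15); do smartctl -H -d megaraid,$n /dev/sda; done   # quick health sweep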

Thanks for any insight!

Cheers,
Lincoln Bryant

> On Sep 9, 2015, at 1:09 PM, Lincoln Bryant  wrote:
> 
> Hi Jan,
> 
> I’ll take a look at all of those things 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Lincoln Bryant
Hi Nick,

Thanks for responding. Yes, I am.

—Lincoln

> On Sep 17, 2015, at 11:53 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> 
> You are getting a fair amount of reads on the disks whilst doing these 
> writes. You're not using cache tiering are you?
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Lincoln Bryant
>> Sent: 17 September 2015 17:42
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
>> are blocked
>> 
>> Hello again,
>> 
>> Well, I disabled offloads on the NIC -- didn’t work for me. I also tried 
>> setting
>> net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere in the thread to
>> no avail.
>> 
>> Today I was watching iostat on an OSD box ('iostat -xm 5') when the cluster
>> got into “slow” state:
>> 
>> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
>> avgqu-sz
>> await  svctm  %util
>> sdb   0.0013.57   84.23  167.47 0.45 2.7826.26   
>>   2.068.18   3.85
>> 96.93
>> sdc   0.0046.715.59  289.22 0.03 2.5417.85   
>>   3.18   10.77   0.97
>> 28.72
>> sdd   0.0016.57   45.11   91.62 0.25 0.5512.01   
>>   0.755.51   2.45
>> 33.47
>> sde   0.0013.576.99  143.31 0.03 2.5334.97   
>>   1.99   13.27   2.12
>> 31.86
>> sdf   0.0018.764.99  158.48 0.10 1.0914.88   
>>   1.267.69   1.24
>> 20.26
>> sdg   0.0025.55   81.64  237.52 0.44 2.8921.36   
>>   4.14   12.99   2.58
>> 82.22
>> sdh   0.0089.42   16.17  492.42 0.09 3.8115.69   
>>  17.12   33.66   0.73
>> 36.95
>> sdi   0.0020.16   17.76  189.62 0.10 1.6717.46   
>>   3.45   16.63   1.57
>> 32.55
>> sdj   0.0031.540.00  185.23 0.00 1.9121.15   
>>   3.33   18.00   0.03
>> 0.62
>> sdk   0.0026.152.40  133.33 0.01 0.8412.79   
>>   1.077.87   0.85
>> 11.58
>> sdl   0.0025.559.38  123.95 0.05 1.1518.44   
>>   0.503.74   1.58
>> 21.10
>> sdm   0.00 6.39   92.61   47.11 0.47 0.2610.65   
>>   1.279.07   6.92
>> 96.73
>> 
>> The %util is rather high on some disks, but I’m not an expert at looking at
>> iostat so I’m not sure how worrisome this is. Does anything here stand out to
>> anyone?
>> 
>> At the time of that iostat, Ceph was reporting a lot of blocked ops on the 
>> OSD
>> associated with sde (as well as about 30 other OSDs), but it doesn’t look all
>> that busy. Some simple ‘dd’ tests seem to indicate the disk is fine.
>> 
>> Similarly, iotop seems OK on this host:
>> 
>>  TID  PRIO  USER DISK READ  DISK WRITE  SWAPIN IO>COMMAND
>> 472477 be/4 root0.00 B/s5.59 M/s  0.00 %  0.57 % ceph-osd -i 111 
>> --pid-
>> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 470621 be/4 root0.00 B/s   10.09 M/s  0.00 %  0.40 % ceph-osd -i 111 
>> --pid-
>> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3495447 be/4 root0.00 B/s  272.19 K/s  0.00 %  0.36 % ceph-osd -i 
>> 114 --
>> pid-file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3488389 be/4 root 0.00 B/s  596.80 K/s  0.00 %  0.16 % ceph-osd -i 109 --
>> pid-file /var/run/ceph/osd.109.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3488060 be/4 root0.00 B/s  600.83 K/s  0.00 %  0.15 % ceph-osd -i 
>> 108 --
>> pid-file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3505573 be/4 root0.00 B/s  528.25 K/s  0.00 %  0.10 % ceph-osd -i 
>> 119 --
>> pid-file /var/run/ceph/osd.119.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3495434 be/4 root0.00 B/s2.02 K/s  0.00 %  0.10 % ceph-osd -i 
>> 114 --pid-
>> file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3502327 be/4 root0.00 B/s  506.07 K/s  0.00 %  0.09 % ceph-osd -i 
>> 118 --
>> pid-file /var/run/ceph/osd.118.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3489100 be/4 root0.00 B/s  106.86 K/s  0.00 %  0.09 % ceph-osd -i 
>> 110 --
>> pid-file /var/run/ceph/osd.110.pid -c /etc/ceph/ceph.conf --cluster ceph
>> 3496631 be/4 root0.00 B/s  229.

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Lincoln Bryant
Just a small update — the blocked ops did disappear after doubling the 
target_max_bytes. We’ll see if it sticks! I’ve thought I’ve solved this blocked 
ops problem about 10 times now :)

Assuming this is the issue, is there any workaround for this problem (or is it 
working as intended)? (Should I set up a cron to run cache-try-flush-evict-all 
every night? :))
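If it comes to that, what I have in mind is roughly the following ('cachepool'
is a placeholder for our cache pool name):

  # flush and evict everything currently held in the cache tier
  rados -p cachepool cache-try-flush-evict-all
  # or, instead, make the tiering agent start flushing earlier
  ceph osd pool set cachepool cache_target_dirty_ratio 0.4
  ceph osd pool set cachepool cache_target_full_ratio 0.8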

Another curious thing is that a rolling restart of all OSDs also seems to fix 
the problem — for a time. I’m not sure how that would fit in if this is the 
problem.

—Lincoln

> On Sep 17, 2015, at 12:07 PM, Lincoln Bryant <linco...@uchicago.edu> wrote:
> 
> We have CephFS utilizing a cache tier + EC backend. The cache tier and ec 
> pool sit on the same spinners — no SSDs. Our cache tier has a 
> target_max_bytes of 5TB and the total storage is about 1PB. 
> 
> I do have a separate test pool with 3x replication and no cache tier, and I 
> still see significant performance drops and blocked ops with no/minimal 
> client I/O from CephFS. Right now I have 530 blocked ops with 20MB/s of 
> client write I/O and no active scrubs. The rados bench on my test pool looks 
> like this:
> 
>  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>0   0 0 0 0 0 - 0
>1  319463   251.934   252   0.31017  0.217719
>2  31   10372   143.96936  0.978544  0.260631
>3  31   10372   95.9815 0 -  0.260631
>4  31   11180   79.985616   2.29218  0.476458
>5  31   11281   64.7886 42.5559   0.50213
>6  31   11281   53.9905 0 -   0.50213
>7  31   11584   47.9917 6   3.71826  0.615882
>8  31   11584   41.9928 0 -  0.615882
>9  31   1158437.327 0 -  0.615882
>   10  31   11786   34.3942   2.7   6.73678  0.794532
> 
> I’m really leaning more toward it being a weird controller/disk problem. 
> 
> As a test, I suppose I could double the target_max_bytes, just so the cache 
> tier stops evicting while client I/O is writing?
> 
> —Lincoln
> 
>> On Sep 17, 2015, at 11:59 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>> 
>> Ah rightthis is where it gets interesting.
>> 
>> You are probably hitting a cache full on a PG somewhere which is either 
>> making everything wait until it flushes or something like that. 
>> 
>> What cache settings have you got set?
>> 
>> I assume you have SSD's for the cache tier? Can you share the size of the 
>> pool.
>> 
>> If possible could you also create a non tiered test pool and do some 
>> benchmarks on that to rule out any issue with the hardware and OSD's.
>> 
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>>> Lincoln Bryant
>>> Sent: 17 September 2015 17:54
>>> To: Nick Fisk <n...@fisk.me.uk>
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
>>> are blocked
>>> 
>>> Hi Nick,
>>> 
>>> Thanks for responding. Yes, I am.
>>> 
>>> —Lincoln
>>> 
>>>> On Sep 17, 2015, at 11:53 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>>>> 
>>>> You are getting a fair amount of reads on the disks whilst doing these
>>> writes. You're not using cache tiering are you?
>>>> 
>>>>> -Original Message-
>>>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>>>>> Of Lincoln Bryant
>>>>> Sent: 17 September 2015 17:42
>>>>> To: ceph-users@lists.ceph.com
>>>>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance ::
>>>>> Ops are blocked
>>>>> 
>>>>> Hello again,
>>>>> 
>>>>> Well, I disabled offloads on the NIC -- didn’t work for me. I also
>>>>> tried setting net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere
>>>>> in the thread to no avail.
>>>>> 
>>>>> Today I was watching iostat on an OSD box ('iostat -xm 5') when the
>>>>> cluster got into “slow” state:
>>>>> 
>>>>> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s 
>>>>> avgrq-sz avgqu-
>>> sz
>>>>> await  svctm  %util
>>>>> sdb   0.0013.57

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Nick Fisk
You are getting a fair amount of reads on the disks whilst doing these writes.
You're not using cache tiering, are you?

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Lincoln Bryant
> Sent: 17 September 2015 17:42
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
> are blocked
> 
> Hello again,
> 
> Well, I disabled offloads on the NIC -- didn’t work for me. I also tried 
> setting
> net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere in the thread to
> no avail.
> 
> Today I was watching iostat on an OSD box ('iostat -xm 5') when the cluster
> got into “slow” state:
> 
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> avgqu-sz
> await  svctm  %util
> sdb   0.0013.57   84.23  167.47 0.45 2.7826.26
>  2.068.18   3.85
> 96.93
> sdc   0.0046.715.59  289.22 0.03 2.5417.85
>  3.18   10.77   0.97
> 28.72
> sdd   0.0016.57   45.11   91.62 0.25 0.5512.01
>  0.755.51   2.45
> 33.47
> sde   0.0013.576.99  143.31 0.03 2.5334.97
>  1.99   13.27   2.12
> 31.86
> sdf   0.0018.764.99  158.48 0.10 1.0914.88
>  1.267.69   1.24
> 20.26
> sdg   0.0025.55   81.64  237.52 0.44 2.8921.36
>  4.14   12.99   2.58
> 82.22
> sdh   0.0089.42   16.17  492.42 0.09 3.8115.69
> 17.12   33.66   0.73
> 36.95
> sdi   0.0020.16   17.76  189.62 0.10 1.6717.46
>  3.45   16.63   1.57
> 32.55
> sdj   0.0031.540.00  185.23 0.00 1.9121.15
>  3.33   18.00   0.03
> 0.62
> sdk   0.0026.152.40  133.33 0.01 0.8412.79
>  1.077.87   0.85
> 11.58
> sdl   0.0025.559.38  123.95 0.05 1.1518.44
>  0.503.74   1.58
> 21.10
> sdm   0.00 6.39   92.61   47.11 0.47 0.2610.65
>  1.279.07   6.92
> 96.73
> 
> The %util is rather high on some disks, but I’m not an expert at looking at
> iostat so I’m not sure how worrisome this is. Does anything here stand out to
> anyone?
> 
> At the time of that iostat, Ceph was reporting a lot of blocked ops on the OSD
> associated with sde (as well as about 30 other OSDs), but it doesn’t look all
> that busy. Some simple ‘dd’ tests seem to indicate the disk is fine.
> 
> Similarly, iotop seems OK on this host:
> 
>   TID  PRIO  USER DISK READ  DISK WRITE  SWAPIN IO>COMMAND
> 472477 be/4 root0.00 B/s5.59 M/s  0.00 %  0.57 % ceph-osd -i 111 
> --pid-
> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
> 470621 be/4 root0.00 B/s   10.09 M/s  0.00 %  0.40 % ceph-osd -i 111 
> --pid-
> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3495447 be/4 root0.00 B/s  272.19 K/s  0.00 %  0.36 % ceph-osd -i 114 
> --
> pid-file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3488389 be/4 root  0.00 B/s  596.80 K/s  0.00 %  0.16 % ceph-osd -i 109 --
> pid-file /var/run/ceph/osd.109.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3488060 be/4 root0.00 B/s  600.83 K/s  0.00 %  0.15 % ceph-osd -i 108 
> --
> pid-file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3505573 be/4 root0.00 B/s  528.25 K/s  0.00 %  0.10 % ceph-osd -i 119 
> --
> pid-file /var/run/ceph/osd.119.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3495434 be/4 root0.00 B/s2.02 K/s  0.00 %  0.10 % ceph-osd -i 114 
> --pid-
> file /var/run/ceph/osd.114.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3502327 be/4 root0.00 B/s  506.07 K/s  0.00 %  0.09 % ceph-osd -i 118 
> --
> pid-file /var/run/ceph/osd.118.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3489100 be/4 root0.00 B/s  106.86 K/s  0.00 %  0.09 % ceph-osd -i 110 
> --
> pid-file /var/run/ceph/osd.110.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3496631 be/4 root0.00 B/s  229.85 K/s  0.00 %  0.05 % ceph-osd -i 115 
> --
> pid-file /var/run/ceph/osd.115.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3505561 be/4 root  0.00 B/s2.02 K/s  0.00 %  0.03 % ceph-osd -i 119 --
> pid-file /var/run/ceph/osd.119.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3488059 be/4 root0.00 B/s2.02 K/s  0.00 %  0.03 % ceph-osd -i 108 
> --pid-
> file /var/run/ceph/osd.108.pid -c /etc/ceph/ceph.conf --cluster ceph
> 3488391 be/4 root   46.37 K/s  431.47 K/s  0.00 %  0.02 % ceph-osd -i 109 
&

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Nick Fisk
Ah right... this is where it gets interesting.

You are probably hitting a full cache on a PG somewhere, which is either
making everything wait until it flushes, or something like that.

What cache settings have you got set?

I assume you have SSDs for the cache tier? Can you share the size of the pool?

If possible, could you also create a non-tiered test pool and do some
benchmarks on that to rule out any issues with the hardware and OSDs.
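Something along these lines would do; the pool name and PG count are only
examples, so size them for your cluster:

  ceph osd pool create benchtest 128 128 replicated
  rados bench -p benchtest 60 write --no-cleanup
  rados bench -p benchtest 60 seq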

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Lincoln Bryant
> Sent: 17 September 2015 17:54
> To: Nick Fisk <n...@fisk.me.uk>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
> are blocked
> 
> Hi Nick,
> 
> Thanks for responding. Yes, I am.
> 
> —Lincoln
> 
> > On Sep 17, 2015, at 11:53 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> >
> > You are getting a fair amount of reads on the disks whilst doing these
> writes. You're not using cache tiering are you?
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Lincoln Bryant
> >> Sent: 17 September 2015 17:42
> >> To: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] Ceph cluster NO read / write performance ::
> >> Ops are blocked
> >>
> >> Hello again,
> >>
> >> Well, I disabled offloads on the NIC -- didn’t work for me. I also
> >> tried setting net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere
> >> in the thread to no avail.
> >>
> >> Today I was watching iostat on an OSD box ('iostat -xm 5') when the
> >> cluster got into “slow” state:
> >>
> >> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> >> avgqu-
> sz
> >> await  svctm  %util
> >> sdb   0.0013.57   84.23  167.47 0.45 2.7826.26 
> >> 2.068.18
> 3.85
> >> 96.93
> >> sdc   0.0046.715.59  289.22 0.03 2.5417.85 
> >> 3.18   10.77
> 0.97
> >> 28.72
> >> sdd   0.0016.57   45.11   91.62 0.25 0.5512.01 
> >> 0.755.51
> 2.45
> >> 33.47
> >> sde   0.0013.576.99  143.31 0.03 2.5334.97 
> >> 1.99   13.27
> 2.12
> >> 31.86
> >> sdf   0.0018.764.99  158.48 0.10 1.0914.88 
> >> 1.267.69   1.24
> >> 20.26
> >> sdg   0.0025.55   81.64  237.52 0.44 2.8921.36 
> >> 4.14   12.99
> 2.58
> >> 82.22
> >> sdh   0.0089.42   16.17  492.42 0.09 3.8115.69 
> >>17.12   33.66
> 0.73
> >> 36.95
> >> sdi   0.0020.16   17.76  189.62 0.10 1.6717.46 
> >> 3.45   16.63
> 1.57
> >> 32.55
> >> sdj   0.0031.540.00  185.23 0.00 1.9121.15 
> >> 3.33   18.00
> 0.03
> >> 0.62
> >> sdk   0.0026.152.40  133.33 0.01 0.8412.79 
> >> 1.077.87
> 0.85
> >> 11.58
> >> sdl   0.0025.559.38  123.95 0.05 1.1518.44 
> >> 0.503.74   1.58
> >> 21.10
> >> sdm   0.00 6.39   92.61   47.11 0.47 0.2610.65 
> >> 1.279.07
> 6.92
> >> 96.73
> >>
> >> The %util is rather high on some disks, but I’m not an expert at
> >> looking at iostat so I’m not sure how worrisome this is. Does
> >> anything here stand out to anyone?
> >>
> >> At the time of that iostat, Ceph was reporting a lot of blocked ops
> >> on the OSD associated with sde (as well as about 30 other OSDs), but
> >> it doesn’t look all that busy. Some simple ‘dd’ tests seem to indicate the
> disk is fine.
> >>
> >> Similarly, iotop seems OK on this host:
> >>
> >>  TID  PRIO  USER DISK READ  DISK WRITE  SWAPIN IO>COMMAND
> >> 472477 be/4 root0.00 B/s5.59 M/s  0.00 %  0.57 % ceph-osd -i 
> >> 111 --
> pid-
> >> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
> >> 470621 be/4 root0.00 B/s   10.09 M/s  0.00 %  0.40 % ceph-osd -i 
> >> 111 --
> pid-
> >> file /var/run/ceph/osd.111.pid -c /etc/ceph/ceph.conf --cluster ceph
> >> 3495447 be/4 root0.00 B/s  272.19 K/s  0.00 %  

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Lincoln Bryant
We have CephFS utilizing a cache tier + EC backend. The cache tier and EC pool
sit on the same spinners — no SSDs. Our cache tier has a target_max_bytes of
5TB and the total storage is about 1PB.
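For reference, those settings can be read back per pool like this ('cachepool'
standing in for our cache pool's actual name):

  ceph osd pool get cachepool target_max_bytes
  ceph osd pool get cachepool cache_target_dirty_ratio
  ceph osd pool get cachepool cache_target_full_ratio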

I do have a separate test pool with 3x replication and no cache tier, and I 
still see significant performance drops and blocked ops with no/minimal client 
I/O from CephFS. Right now I have 530 blocked ops with 20MB/s of client write 
I/O and no active scrubs. The rados bench on my test pool looks like this:

  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      31        94        63   251.934       252   0.31017  0.217719
    2      31       103        72   143.969        36  0.978544  0.260631
    3      31       103        72   95.9815         0         -  0.260631
    4      31       111        80   79.9856        16   2.29218  0.476458
    5      31       112        81   64.7886         4    2.5559   0.50213
    6      31       112        81   53.9905         0         -   0.50213
    7      31       115        84   47.9917         6   3.71826  0.615882
    8      31       115        84   41.9928         0         -  0.615882
    9      31       115        84    37.327         0         -  0.615882
   10      31       117        86   34.3942       2.7   6.73678  0.794532

I’m really leaning more toward it being a weird controller/disk problem. 

As a test, I suppose I could double the target_max_bytes, just so the cache 
tier stops evicting while client I/O is writing?
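That would just be something like the following, going from 5TB to 10TB (the
pool name is a placeholder and the value is in bytes):

  ceph osd pool set cachepool target_max_bytes 10995116277760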

—Lincoln

> On Sep 17, 2015, at 11:59 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> 
> Ah rightthis is where it gets interesting.
> 
> You are probably hitting a cache full on a PG somewhere which is either 
> making everything wait until it flushes or something like that. 
> 
> What cache settings have you got set?
> 
> I assume you have SSD's for the cache tier? Can you share the size of the 
> pool.
> 
> If possible could you also create a non tiered test pool and do some 
> benchmarks on that to rule out any issue with the hardware and OSD's.
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Lincoln Bryant
>> Sent: 17 September 2015 17:54
>> To: Nick Fisk <n...@fisk.me.uk>
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
>> are blocked
>> 
>> Hi Nick,
>> 
>> Thanks for responding. Yes, I am.
>> 
>> —Lincoln
>> 
>>> On Sep 17, 2015, at 11:53 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>>> 
>>> You are getting a fair amount of reads on the disks whilst doing these
>> writes. You're not using cache tiering are you?
>>> 
>>>> -Original Message-----
>>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>>>> Of Lincoln Bryant
>>>> Sent: 17 September 2015 17:42
>>>> To: ceph-users@lists.ceph.com
>>>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance ::
>>>> Ops are blocked
>>>> 
>>>> Hello again,
>>>> 
>>>> Well, I disabled offloads on the NIC -- didn’t work for me. I also
>>>> tried setting net.ipv4.tcp_moderate_rcvbuf = 0 as suggested elsewhere
>>>> in the thread to no avail.
>>>> 
>>>> Today I was watching iostat on an OSD box ('iostat -xm 5') when the
>>>> cluster got into “slow” state:
>>>> 
>>>> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
>>>> avgqu-
>> sz
>>>> await  svctm  %util
>>>> sdb   0.0013.57   84.23  167.47 0.45 2.7826.26 
>>>> 2.068.18
>> 3.85
>>>> 96.93
>>>> sdc   0.0046.715.59  289.22 0.03 2.5417.85 
>>>> 3.18   10.77
>> 0.97
>>>> 28.72
>>>> sdd   0.0016.57   45.11   91.62 0.25 0.5512.01 
>>>> 0.755.51
>> 2.45
>>>> 33.47
>>>> sde   0.0013.576.99  143.31 0.03 2.5334.97 
>>>> 1.99   13.27
>> 2.12
>>>> 31.86
>>>> sdf   0.0018.764.99  158.48 0.10 1.0914.88 
>>>> 1.267.69   1.24
>>>> 20.26
>>>> sdg   0.0025.55   81.64  237.52 0.44 2.8921.36 
>>>> 4.14   12.99
>> 2.58
>>>> 82.22
>>>> sdh   0.0089.42   16.17  492.42 0.09

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Lincoln Bryant
Hi Nick,

Thanks for the detailed response and insight. SSDs are definitely on the
to-buy list.

I will certainly try to rule out any hardware issues in the meantime.

Cheers,
Lincoln

> On Sep 17, 2015, at 12:53 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> 
> It's probably helped but I fear that your overall design is not going to work 
> well for you. Cache Tier + Base tier + journals on the same disks is going to 
> really hurt.
> 
> The problem when using cache tiering (especially with EC pools in future 
> releases) is that to modify a block that isn't in the cache tier you have to 
> promote it 1st, which often kicks another block out the cache.
> 
> So worse case you could have for a single write
> 
> R from EC -> W to CT + jrnl W -> W actual data to CT + jrnl W -> R from CT -> 
> W to EC + jrnl W
> 
> Plus any metadata updates. Either way you looking at probably somewhere near 
> a 10x write amplification for 4MB writes, which will quickly overload your 
> disks leading to very slow performance. Smaller IO's would still cause 4MB 
> blocks to be shifted between pools. What makes it worse is that these 
> promotions/evictions tend to happen to hot PG's and not spread round the 
> whole cluster meaning that a single hot OSD can hold up writes across the 
> whole pool.
> 
> I know it's not what you want to hear, but I can't think of anything you can 
> do to help in this instance unless you are willing to get some SSD journals 
> and maybe move the Cache pool on to separate disks or SSD's. Basically try 
> and limit the amount of random IO the disks have to do.
> 
> Of course please do try and find a time to stop all IO and then run the test 
> on the test 3 way pool, to rule out any hardware/OS issues. 
> 
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Lincoln Bryant
>> Sent: 17 September 2015 18:36
>> To: Nick Fisk <n...@fisk.me.uk>
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
>> are blocked
>> 
>> Just a small update — the blocked ops did disappear after doubling the
>> target_max_bytes. We’ll see if it sticks! I’ve thought I’ve solved this 
>> blocked
>> ops problem about 10 times now :)
>> 
>> Assuming this is the issue, is there any workaround for this problem (or is 
>> it
>> working as intended)? (Should I set up a cron to run 
>> cache-try-flush-evict-all
>> every night? :))
>> 
>> Another curious thing is that a rolling restart of all OSDs also seems to 
>> fix the
>> problem — for a time. I’m not sure how that would fit in if this is the
>> problem.
>> 
>> —Lincoln
>> 
>>> On Sep 17, 2015, at 12:07 PM, Lincoln Bryant <linco...@uchicago.edu>
>> wrote:
>>> 
>>> We have CephFS utilizing a cache tier + EC backend. The cache tier and ec
>> pool sit on the same spinners — no SSDs. Our cache tier has a
>> target_max_bytes of 5TB and the total storage is about 1PB.
>>> 
>>> I do have a separate test pool with 3x replication and no cache tier, and I
>> still see significant performance drops and blocked ops with no/minimal
>> client I/O from CephFS. Right now I have 530 blocked ops with 20MB/s of
>> client write I/O and no active scrubs. The rados bench on my test pool looks
>> like this:
>>> 
>>> sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>   0   0 0 0 0 0 - 0
>>>   1  319463   251.934   252   0.31017  0.217719
>>>   2  31   10372   143.96936  0.978544  0.260631
>>>   3  31   10372   95.9815 0 -  0.260631
>>>   4  31   11180   79.985616   2.29218  0.476458
>>>   5  31   11281   64.7886 42.5559   0.50213
>>>   6  31   11281   53.9905 0 -   0.50213
>>>   7  31   11584   47.9917 6   3.71826  0.615882
>>>   8  31   11584   41.9928 0 -  0.615882
>>>   9  31   1158437.327 0 -  0.615882
>>>  10  31   11786   34.3942   2.7   6.73678  0.794532
>>> 
>>> I’m really leaning more toward it being a weird controller/disk problem.
>>> 
>>> As a test, I suppose I could double the target_max_bytes, just so the cache
>> tier stops evicting while client I/O is writing?
>>> 
>>> —Linco

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-17 Thread Nick Fisk
It's probably helped but I fear that your overall design is not going to work 
well for you. Cache Tier + Base tier + journals on the same disks is going to 
really hurt.

The problem when using cache tiering (especially with EC pools in future
releases) is that to modify a block that isn't in the cache tier you have to
promote it first, which often kicks another block out of the cache.

So, worst case, a single write could involve:

R from EC -> W to CT + jrnl W -> W actual data to CT + jrnl W -> R from CT -> W 
to EC + jrnl W

Plus any metadata updates. Either way, you're looking at probably somewhere
near a 10x write amplification for 4MB writes, which will quickly overload
your disks, leading to very slow performance. Smaller IOs would still cause
4MB blocks to be shifted between pools. What makes it worse is that these
promotions/evictions tend to happen to hot PGs rather than being spread around
the whole cluster, meaning that a single hot OSD can hold up writes across the
whole pool.
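If you want to confirm that's what is happening, you can watch the tiering
counters on one of the busy OSDs; a rough sketch (osd.111 is just an example
ID, and the exact counter names can vary between releases):

  ceph daemon osd.111 perf dump | grep -E 'tier_(promote|flush|evict)'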

I know it's not what you want to hear, but I can't think of anything you can
do to help in this instance unless you are willing to get some SSD journals
and maybe move the cache pool onto separate disks or SSDs. Basically, try to
limit the amount of random IO the disks have to do.

Of course, please do try to find a time to stop all IO and then run the test
on the 3-way replicated test pool, to rule out any hardware/OS issues.


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Lincoln Bryant
> Sent: 17 September 2015 18:36
> To: Nick Fisk <n...@fisk.me.uk>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops
> are blocked
> 
> Just a small update — the blocked ops did disappear after doubling the
> target_max_bytes. We’ll see if it sticks! I’ve thought I’ve solved this 
> blocked
> ops problem about 10 times now :)
> 
> Assuming this is the issue, is there any workaround for this problem (or is it
> working as intended)? (Should I set up a cron to run cache-try-flush-evict-all
> every night? :))
> 
> Another curious thing is that a rolling restart of all OSDs also seems to fix 
> the
> problem — for a time. I’m not sure how that would fit in if this is the
> problem.
> 
> —Lincoln
> 
> > On Sep 17, 2015, at 12:07 PM, Lincoln Bryant <linco...@uchicago.edu>
> wrote:
> >
> > We have CephFS utilizing a cache tier + EC backend. The cache tier and ec
> pool sit on the same spinners — no SSDs. Our cache tier has a
> target_max_bytes of 5TB and the total storage is about 1PB.
> >
> > I do have a separate test pool with 3x replication and no cache tier, and I
> still see significant performance drops and blocked ops with no/minimal
> client I/O from CephFS. Right now I have 530 blocked ops with 20MB/s of
> client write I/O and no active scrubs. The rados bench on my test pool looks
> like this:
> >
> >  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> >0   0 0 0 0 0 - 0
> >1  319463   251.934   252   0.31017  0.217719
> >2  31   10372   143.96936  0.978544  0.260631
> >3  31   10372   95.9815 0 -  0.260631
> >4  31   11180   79.985616   2.29218  0.476458
> >5  31   11281   64.7886 42.5559   0.50213
> >6  31   11281   53.9905 0 -   0.50213
> >7  31   11584   47.9917 6   3.71826  0.615882
> >8  31   11584   41.9928 0 -  0.615882
> >9  31   1158437.327 0 -  0.615882
> >   10  31   11786   34.3942   2.7   6.73678  0.794532
> >
> > I’m really leaning more toward it being a weird controller/disk problem.
> >
> > As a test, I suppose I could double the target_max_bytes, just so the cache
> tier stops evicting while client I/O is writing?
> >
> > —Lincoln
> >
> >> On Sep 17, 2015, at 11:59 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> >>
> >> Ah rightthis is where it gets interesting.
> >>
> >> You are probably hitting a cache full on a PG somewhere which is either
> making everything wait until it flushes or something like that.
> >>
> >> What cache settings have you got set?
> >>
> >> I assume you have SSD's for the cache tier? Can you share the size of the
> pool.
> >>
> >> If possible could you also create a non tiered test pool and do some
> benchmarks on that to rule o

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-11 Thread Shinobu Kinjo
If you really want to improve the performance of a *distributed* filesystem
like Ceph, Lustre, or GPFS,
you must also look at the networking side of the Linux kernel.

 L5: Socket
 L4: TCP
 L3: IP
 L2: Queuing

In this discussion, the problem could be in L2, i.e. queuing in the descriptor
ring. We may have to take a closer look at the qdisc and whether qlen is large
enough.
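A quick way to check that (eth0 is just an example interface name):

  tc -s qdisc show dev eth0    # per-qdisc stats; look at the dropped/overlimits counters
  ip -s link show eth0         # RX/TX errors and drops on the interface itself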

But this case:

> 399 16 32445 32429 325.054 84 0.0233839 0.193655
 to
> 400 16 32445 32429 324.241 0 - 0.193655

is probably a different story ;-)

> needless to say, very strange. 

Yes, it is quite strange like my English...

Shinobu

- Original Message -
From: "Vickey Singh" <vickey.singh22...@gmail.com>
To: "Jan Schermer" <j...@schermer.cz>
Cc: ceph-users@lists.ceph.com
Sent: Thursday, September 10, 2015 2:22:22 AM
Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops   
are blocked

Hello Jan 

On Wed, Sep 9, 2015 at 11:59 AM, Jan Schermer < j...@schermer.cz > wrote: 


Just to recapitulate - the nodes are doing "nothing" when it drops to zero? Not 
flushing something to drives (iostat)? Not cleaning pagecache (kswapd and 
similiar)? Not out of any type of memory (slab, min_free_kbytes)? Not network 
link errors, no bad checksums (those are hard to spot, though)? 

Unless you find something I suggest you try disabling offloads on the NICs and 
see if the problem goes away. 

Could you please elaborate this point , how do you disable / offload on the NIC 
? what does it mean ? how to do it ? how its gonna help. 

Sorry i don't know about this. 

- Vickey - 




Jan 

> On 08 Sep 2015, at 18:26, Lincoln Bryant < linco...@uchicago.edu > wrote: 
> 
> For whatever it’s worth, my problem has returned and is very similar to 
> yours. Still trying to figure out what’s going on over here. 
> 
> Performance is nice for a few seconds, then goes to 0. This is a similar 
> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc) 
> 
> 384 16 29520 29504 307.287 1188 0.0492006 0.208259 
> 385 16 29813 29797 309.532 1172 0.0469708 0.206731 
> 386 16 30105 30089 311.756 1168 0.0375764 0.205189 
> 387 16 30401 30385 314.009 1184 0.036142 0.203791 
> 388 16 30695 30679 316.231 1176 0.0372316 0.202355 
> 389 16 30987 30971 318.42 1168 0.0660476 0.200962 
> 390 16 31282 31266 320.628 1180 0.0358611 0.199548 
> 391 16 31568 31552 322.734 1144 0.0405166 0.198132 
> 392 16 31857 31841 324.859 1156 0.0360826 0.196679 
> 393 16 32090 32074 326.404 932 0.0416869 0.19549 
> 394 16 32205 32189 326.743 460 0.0251877 0.194896 
> 395 16 32302 32286 326.897 388 0.0280574 0.194395 
> 396 16 32348 32332 326.537 184 0.0256821 0.194157 
> 397 16 32385 32369 326.087 148 0.0254342 0.193965 
> 398 16 32424 32408 325.659 156 0.0263006 0.193763 
> 399 16 32445 32429 325.054 84 0.0233839 0.193655 
> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 
> 0.193655 
> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 
> 400 16 32445 32429 324.241 0 - 0.193655 
> 401 16 32445 32429 323.433 0 - 0.193655 
> 402 16 32445 32429 322.628 0 - 0.193655 
> 403 16 32445 32429 321.828 0 - 0.193655 
> 404 16 32445 32429 321.031 0 - 0.193655 
> 405 16 32445 32429 320.238 0 - 0.193655 
> 406 16 32445 32429 319.45 0 - 0.193655 
> 407 16 32445 32429 318.665 0 - 0.193655 
> 
> needless to say, very strange. 
> 
> —Lincoln 
> 
> 
>> On Sep 7, 2015, at 3:35 PM, Vickey Singh < vickey.singh22...@gmail.com > 
>> wrote: 
>> 
>> Adding ceph-users. 
>> 
>> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh < vickey.singh22...@gmail.com 
>> > wrote: 
>> 
>> 
>> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke < ulem...@polarzone.de > wrote: 
>> Hi Vickey, 
>> Thanks for your time in replying to my problem. 
>> 
>> I had the same rados bench output after changing the motherboard of the 
>> monitor node with the lowest IP... 
>> Due to the new mainboard, I assume the hw-clock was wrong during startup. 
>> Ceph health show no errors, but all VMs aren't able to do IO (very high load 
>> on the VMs - but no traffic). 
>> I stopped the mon, but this don't changed anything. I had to restart all 
>> other mons to get IO again. After that I started the first mon also (with 
>> the right time now) and all worked fine again... 
>> 
>> Thanks i will try to restart all OSD / MONS and report back , if it solves 
>> my problem 
>> 
>> Another posibility: 
>> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage 
>> collection? 
>> 
>> No i don't have journals on SSD , they are on the same OSD disk. 
>> 
>> 
>> 
>> Udo 
>> 
>> 
>> On 07.09.2015 16:36,

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-11 Thread Shinobu Kinjo
Dropwatch (dropwatch.stp) would help us see who dropped packets,
and where they were dropped.

To do further investigation on the networking side,
I always check:

  /sys/class/net/<iface>/statistics/*

The tc command is also quite useful.

Have we already checked whether there is any bo (blocks written
out) activity using vmstat?

Running vmstat, tcpdump, and tc concurrently would give you more
information to solve the problem.
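For example (eth0 as a placeholder):

  grep . /sys/class/net/eth0/statistics/*drop* /sys/class/net/eth0/statistics/*err*
  vmstat 1    # watch the bo column while the benchmark stalls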

Shinobu

- Original Message -
From: "Shinobu Kinjo" <ski...@redhat.com>
To: "Vickey Singh" <vickey.singh22...@gmail.com>
Cc: ceph-users@lists.ceph.com
Sent: Friday, September 11, 2015 10:32:27 PM
Subject: Re: [ceph-users] Ceph cluster NO read / write performance ::   Ops 
are blocked

If you really want to improve performance of *distributed* filesystem
like Ceph, Lustre, GPFS,
you must consider from networking of the linux kernel.

 L5: Socket
 L4: TCP
 L3: IP
 L2: Queuing

In this discussion, problem could be in L2 which is queuing in descriptor.
We may have to take a closer look at qdisc, if qlen is good enough or not.

But this case:

> 399 16 32445 32429 325.054 84 0.0233839 0.193655
 to
> 400 16 32445 32429 324.241 0 - 0.193655

probably different story -;

> needless to say, very strange. 

Yes, it is quite strange like my English...

Shinobu

- Original Message -
From: "Vickey Singh" <vickey.singh22...@gmail.com>
To: "Jan Schermer" <j...@schermer.cz>
Cc: ceph-users@lists.ceph.com
Sent: Thursday, September 10, 2015 2:22:22 AM
Subject: Re: [ceph-users] Ceph cluster NO read / write performance :: Ops   
are blocked

Hello Jan 

On Wed, Sep 9, 2015 at 11:59 AM, Jan Schermer < j...@schermer.cz > wrote: 


Just to recapitulate - the nodes are doing "nothing" when it drops to zero? Not 
flushing something to drives (iostat)? Not cleaning pagecache (kswapd and 
similiar)? Not out of any type of memory (slab, min_free_kbytes)? Not network 
link errors, no bad checksums (those are hard to spot, though)? 

Unless you find something I suggest you try disabling offloads on the NICs and 
see if the problem goes away. 

Could you please elaborate this point , how do you disable / offload on the NIC 
? what does it mean ? how to do it ? how its gonna help. 

Sorry i don't know about this. 

- Vickey - 




Jan 

> On 08 Sep 2015, at 18:26, Lincoln Bryant < linco...@uchicago.edu > wrote: 
> 
> For whatever it’s worth, my problem has returned and is very similar to 
> yours. Still trying to figure out what’s going on over here. 
> 
> Performance is nice for a few seconds, then goes to 0. This is a similar 
> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc) 
> 
> 384 16 29520 29504 307.287 1188 0.0492006 0.208259 
> 385 16 29813 29797 309.532 1172 0.0469708 0.206731 
> 386 16 30105 30089 311.756 1168 0.0375764 0.205189 
> 387 16 30401 30385 314.009 1184 0.036142 0.203791 
> 388 16 30695 30679 316.231 1176 0.0372316 0.202355 
> 389 16 30987 30971 318.42 1168 0.0660476 0.200962 
> 390 16 31282 31266 320.628 1180 0.0358611 0.199548 
> 391 16 31568 31552 322.734 1144 0.0405166 0.198132 
> 392 16 31857 31841 324.859 1156 0.0360826 0.196679 
> 393 16 32090 32074 326.404 932 0.0416869 0.19549 
> 394 16 32205 32189 326.743 460 0.0251877 0.194896 
> 395 16 32302 32286 326.897 388 0.0280574 0.194395 
> 396 16 32348 32332 326.537 184 0.0256821 0.194157 
> 397 16 32385 32369 326.087 148 0.0254342 0.193965 
> 398 16 32424 32408 325.659 156 0.0263006 0.193763 
> 399 16 32445 32429 325.054 84 0.0233839 0.193655 
> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 
> 0.193655 
> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 
> 400 16 32445 32429 324.241 0 - 0.193655 
> 401 16 32445 32429 323.433 0 - 0.193655 
> 402 16 32445 32429 322.628 0 - 0.193655 
> 403 16 32445 32429 321.828 0 - 0.193655 
> 404 16 32445 32429 321.031 0 - 0.193655 
> 405 16 32445 32429 320.238 0 - 0.193655 
> 406 16 32445 32429 319.45 0 - 0.193655 
> 407 16 32445 32429 318.665 0 - 0.193655 
> 
> needless to say, very strange. 
> 
> —Lincoln 
> 
> 
>> On Sep 7, 2015, at 3:35 PM, Vickey Singh < vickey.singh22...@gmail.com > 
>> wrote: 
>> 
>> Adding ceph-users. 
>> 
>> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh < vickey.singh22...@gmail.com 
>> > wrote: 
>> 
>> 
>> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke < ulem...@polarzone.de > wrote: 
>> Hi Vickey, 
>> Thanks for your time in replying to my problem. 
>> 
>> I had the same rados bench output after changing the motherboard of the 
>> monitor node with the lowest IP... 
>> Due to the new mainboard, I assume the hw-clock was wrong during startup. 
>> Ceph health show no errors, but all VMs aren't able to do IO (v

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Jan Schermer
Just to recapitulate - the nodes are doing "nothing" when it drops to zero? Not
flushing something to drives (iostat)? Not cleaning pagecache (kswapd and
similar)? Not out of any type of memory (slab, min_free_kbytes)? No network
link errors, no bad checksums (those are hard to spot, though)?
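To make that concrete, these are the kinds of checks I mean (eth0 as a
placeholder interface):

  iostat -xm 5                            # is anything actually being flushed to the drives?
  sar -B 1                                # pagecache scanning / kswapd activity
  cat /proc/sys/vm/min_free_kbytes        # compare against MemFree in /proc/meminfo
  ethtool -S eth0 | grep -iE 'err|drop'   # link errors and drop counters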

Unless you find something I suggest you try disabling offloads on the NICs and 
see if the problem goes away.

Jan

> On 08 Sep 2015, at 18:26, Lincoln Bryant  wrote:
> 
> For whatever it’s worth, my problem has returned and is very similar to 
> yours. Still trying to figure out what’s going on over here.
> 
> Performance is nice for a few seconds, then goes to 0. This is a similar 
> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc)
> 
>  384  16 29520 29504   307.287  1188 0.0492006  0.208259
>  385  16 29813 29797   309.532  1172 0.0469708  0.206731
>  386  16 30105 30089   311.756  1168 0.0375764  0.205189
>  387  16 30401 30385   314.009  1184  0.036142  0.203791
>  388  16 30695 30679   316.231  1176 0.0372316  0.202355
>  389  16 30987 30971318.42  1168 0.0660476  0.200962
>  390  16 31282 31266   320.628  1180 0.0358611  0.199548
>  391  16 31568 31552   322.734  1144 0.0405166  0.198132
>  392  16 31857 31841   324.859  1156 0.0360826  0.196679
>  393  16 32090 32074   326.404   932 0.0416869   0.19549
>  394  16 32205 32189   326.743   460 0.0251877  0.194896
>  395  16 32302 32286   326.897   388 0.0280574  0.194395
>  396  16 32348 32332   326.537   184 0.0256821  0.194157
>  397  16 32385 32369   326.087   148 0.0254342  0.193965
>  398  16 32424 32408   325.659   156 0.0263006  0.193763
>  399  16 32445 32429   325.05484 0.0233839  0.193655
> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 
> 0.193655
>  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>  400  16 32445 32429   324.241 0 -  0.193655
>  401  16 32445 32429   323.433 0 -  0.193655
>  402  16 32445 32429   322.628 0 -  0.193655
>  403  16 32445 32429   321.828 0 -  0.193655
>  404  16 32445 32429   321.031 0 -  0.193655
>  405  16 32445 32429   320.238 0 -  0.193655
>  406  16 32445 32429319.45 0 -  0.193655
>  407  16 32445 32429   318.665 0 -  0.193655
> 
> needless to say, very strange.
> 
> —Lincoln
> 
> 
>> On Sep 7, 2015, at 3:35 PM, Vickey Singh  wrote:
>> 
>> Adding ceph-users.
>> 
>> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh  
>> wrote:
>> 
>> 
>> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke  wrote:
>> Hi Vickey,
>> Thanks for your time in replying to my problem.
>> 
>> I had the same rados bench output after changing the motherboard of the 
>> monitor node with the lowest IP...
>> Due to the new mainboard, I assume the hw-clock was wrong during startup. 
>> Ceph health show no errors, but all VMs aren't able to do IO (very high load 
>> on the VMs - but no traffic).
>> I stopped the mon, but this don't changed anything. I had to restart all 
>> other mons to get IO again. After that I started the first mon also (with 
>> the right time now) and all worked fine again...
>> 
>> Thanks i will try to restart all OSD / MONS and report back , if it solves 
>> my problem 
>> 
>> Another posibility:
>> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage 
>> collection?
>> 
>> No i don't have journals on SSD , they are on the same OSD disk. 
>> 
>> 
>> 
>> Udo
>> 
>> 
>> On 07.09.2015 16:36, Vickey Singh wrote:
>>> Dear Experts
>>> 
>>> Can someone please help me , why my cluster is not able write data.
>>> 
>>> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
>>> 
>>> 
>>> Ceph Hammer  0.94.2
>>> CentOS 6 (3.10.69-1)
>>> 
>>> The Ceph status says OPS are blocked , i have tried checking , what all i 
>>> know 
>>> 
>>> - System resources ( CPU , net, disk , memory )-- All normal 
>>> - 10G network for public and cluster network  -- no saturation 
>>> - Add disks are physically healthy 
>>> - No messages in /var/log/messages OR dmesg
>>> - Tried restarting OSD which are blocking operation , but no luck
>>> - Tried writing through RBD  and Rados bench , both are giving same problemm
>>> 
>>> Please help me to fix this problem.
>>> 
>>> #  rados bench -p rbd 60 write
>>> Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 
>>> objects
>>> Object prefix: benchmark_data_stor1_1791844
>>>   sec Cur ops   started  finished  

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Vickey Singh
Hey Lincoln



On Tue, Sep 8, 2015 at 7:26 PM, Lincoln Bryant 
wrote:

> For whatever it’s worth, my problem has returned and is very similar to
> yours. Still trying to figure out what’s going on over here.
>
> Performance is nice for a few seconds, then goes to 0. This is a similar
> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc)
>
>   384  16 29520 29504   307.287  1188 0.0492006  0.208259
>   385  16 29813 29797   309.532  1172 0.0469708  0.206731
>   386  16 30105 30089   311.756  1168 0.0375764  0.205189
>   387  16 30401 30385   314.009  1184  0.036142  0.203791
>   388  16 30695 30679   316.231  1176 0.0372316  0.202355
>   389  16 30987 30971318.42  1168 0.0660476  0.200962
>   390  16 31282 31266   320.628  1180 0.0358611  0.199548
>   391  16 31568 31552   322.734  1144 0.0405166  0.198132
>   392  16 31857 31841   324.859  1156 0.0360826  0.196679
>   393  16 32090 32074   326.404   932 0.0416869   0.19549
>   394  16 32205 32189   326.743   460 0.0251877  0.194896
>   395  16 32302 32286   326.897   388 0.0280574  0.194395
>   396  16 32348 32332   326.537   184 0.0256821  0.194157
>   397  16 32385 32369   326.087   148 0.0254342  0.193965
>   398  16 32424 32408   325.659   156 0.0263006  0.193763
>   399  16 32445 32429   325.05484 0.0233839  0.193655
> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat:
> 0.193655
>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>   400  16 32445 32429   324.241 0 -  0.193655
>   401  16 32445 32429   323.433 0 -  0.193655
>   402  16 32445 32429   322.628 0 -  0.193655
>   403  16 32445 32429   321.828 0 -  0.193655
>   404  16 32445 32429   321.031 0 -  0.193655
>   405  16 32445 32429   320.238 0 -  0.193655
>   406  16 32445 32429319.45 0 -  0.193655
>   407  16 32445 32429   318.665 0 -  0.193655
>
> needless to say, very strange.
>

It's indeed very strange.

(Regarding the solution you gave me in the email below) Have you tried
restarting all OSDs?

By the way, my problem got fixed (though I am afraid it can come back at
any time) by doing:

# service ceph restart osd   on all OSD nodes (this didn't help)
# set noout, nodown, nobackfill, norecover and then reboot all OSD nodes
(it worked). After that, all the rados bench writes started to work.
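For reference, the flags part is just the standard cluster flags, set before
the reboots and cleared again afterwards:

  ceph osd set noout
  ceph osd set nodown
  ceph osd set nobackfill
  ceph osd set norecover
  # ... reboot the OSD nodes one at a time ...
  ceph osd unset norecover
  ceph osd unset nobackfill
  ceph osd unset nodown
  ceph osd unset noout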

[I know it's hilarious; it feels like I'm watching *The IT Crowd*: 'Hello,
IT, have you tried turning it OFF and ON again?']

It would be really helpful if someone could provide a real solution.




>
> —Lincoln
>
>
> > On Sep 7, 2015, at 3:35 PM, Vickey Singh 
> wrote:
> >
> > Adding ceph-users.
> >
> > On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh <
> vickey.singh22...@gmail.com> wrote:
> >
> >
> > On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke 
> wrote:
> > Hi Vickey,
> > Thanks for your time in replying to my problem.
> >
> > I had the same rados bench output after changing the motherboard of the
> monitor node with the lowest IP...
> > Due to the new mainboard, I assume the hw-clock was wrong during
> startup. Ceph health show no errors, but all VMs aren't able to do IO (very
> high load on the VMs - but no traffic).
> > I stopped the mon, but this don't changed anything. I had to restart all
> other mons to get IO again. After that I started the first mon also (with
> the right time now) and all worked fine again...
> >
> > Thanks i will try to restart all OSD / MONS and report back , if it
> solves my problem
> >
> > Another posibility:
> > Do you use journal on SSDs? Perhaps the SSDs can't write to garbage
> collection?
> >
> > No i don't have journals on SSD , they are on the same OSD disk.
> >
> >
> >
> > Udo
> >
> >
> > On 07.09.2015 16:36, Vickey Singh wrote:
> >> Dear Experts
> >>
> >> Can someone please help me , why my cluster is not able write data.
> >>
> >> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
> >>
> >>
> >> Ceph Hammer  0.94.2
> >> CentOS 6 (3.10.69-1)
> >>
> >> The Ceph status says OPS are blocked , i have tried checking , what all
> i know
> >>
> >> - System resources ( CPU , net, disk , memory )-- All normal
> >> - 10G network for public and cluster network  -- no saturation
> >> - Add disks are physically healthy
> >> - No messages in /var/log/messages OR dmesg
> >> - Tried restarting OSD which are blocking operation , but no luck
> >> - Tried writing through RBD  and Rados bench , both are giving same
> problemm
> >>
> >> Please help me to 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Bill Sanders
We were experiencing something similar in our setup (rados bench does some
work, then comes to a screeching halt).  No pattern to which OSDs were
causing the problem, though.  Sounds like similar hardware (this was on a
Dell R720xd, and yeah, that controller is suuuper frustrating).

For us, setting tcp_moderate_rcvbuf to 0 on all nodes solved the issue.

echo 0 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf

Or set it in /etc/sysctl.conf:

net.ipv4.tcp_moderate_rcvbuf = 0
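
Then reload and verify (assuming the standard sysctl tooling):

  sysctl -p                                    # re-read /etc/sysctl.conf
  cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf   # should now print 0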

We figured this out independently after I posted this thread, "Slow/Hung
IOs":
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-January/045674.html

Hope this helps

Bill Sanders

On Wed, Sep 9, 2015 at 11:09 AM, Lincoln Bryant 
wrote:

> Hi Jan,
>
> I’ll take a look at all of those things and report back (hopefully :))
>
> I did try setting all of my OSDs to writethrough instead of writeback on
> the controller, which was significantly more consistent in performance
> (from 1100MB/s down to 300MB/s, but still occasionally dropping to 0MB/s).
> Still plenty of blocked ops.
>
> I was wondering if not-so-nicely failing OSD(s) might be the cause. My
> controller (PERC H730 Mini) seems frustratingly terse with SMART
> information, but at least one disk has a “Non-medium error count” of over
> 20,000..
>
> I’ll try disabling offloads as well.
>
> Thanks much for the suggestions!
>
> Cheers,
> Lincoln
>
> > On Sep 9, 2015, at 3:59 AM, Jan Schermer  wrote:
> >
> > Just to recapitulate - the nodes are doing "nothing" when it drops to
> zero? Not flushing something to drives (iostat)? Not cleaning pagecache
> (kswapd and similiar)? Not out of any type of memory (slab,
> min_free_kbytes)? Not network link errors, no bad checksums (those are hard
> to spot, though)?
> >
> > Unless you find something I suggest you try disabling offloads on the
> NICs and see if the problem goes away.
> >
> > Jan
> >
> >> On 08 Sep 2015, at 18:26, Lincoln Bryant  wrote:
> >>
> >> For whatever it’s worth, my problem has returned and is very similar to
> yours. Still trying to figure out what’s going on over here.
> >>
> >> Performance is nice for a few seconds, then goes to 0. This is a
> similar setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3,
> etc)
> >>
> >> 384  16 29520 29504   307.287  1188 0.0492006  0.208259
> >> 385  16 29813 29797   309.532  1172 0.0469708  0.206731
> >> 386  16 30105 30089   311.756  1168 0.0375764  0.205189
> >> 387  16 30401 30385   314.009  1184  0.036142  0.203791
> >> 388  16 30695 30679   316.231  1176 0.0372316  0.202355
> >> 389  16 30987 30971318.42  1168 0.0660476  0.200962
> >> 390  16 31282 31266   320.628  1180 0.0358611  0.199548
> >> 391  16 31568 31552   322.734  1144 0.0405166  0.198132
> >> 392  16 31857 31841   324.859  1156 0.0360826  0.196679
> >> 393  16 32090 32074   326.404   932 0.0416869   0.19549
> >> 394  16 32205 32189   326.743   460 0.0251877  0.194896
> >> 395  16 32302 32286   326.897   388 0.0280574  0.194395
> >> 396  16 32348 32332   326.537   184 0.0256821  0.194157
> >> 397  16 32385 32369   326.087   148 0.0254342  0.193965
> >> 398  16 32424 32408   325.659   156 0.0263006  0.193763
> >> 399  16 32445 32429   325.05484 0.0233839  0.193655
> >> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat:
> 0.193655
> >> sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> >> 400  16 32445 32429   324.241 0 -  0.193655
> >> 401  16 32445 32429   323.433 0 -  0.193655
> >> 402  16 32445 32429   322.628 0 -  0.193655
> >> 403  16 32445 32429   321.828 0 -  0.193655
> >> 404  16 32445 32429   321.031 0 -  0.193655
> >> 405  16 32445 32429   320.238 0 -  0.193655
> >> 406  16 32445 32429319.45 0 -  0.193655
> >> 407  16 32445 32429   318.665 0 -  0.193655
> >>
> >> needless to say, very strange.
> >>
> >> —Lincoln
> >>
> >>
> >>> On Sep 7, 2015, at 3:35 PM, Vickey Singh 
> wrote:
> >>>
> >>> Adding ceph-users.
> >>>
> >>> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh <
> vickey.singh22...@gmail.com> wrote:
> >>>
> >>>
> >>> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke 
> wrote:
> >>> Hi Vickey,
> >>> Thanks for your time in replying to my problem.
> >>>
> >>> I had the same rados bench output after changing the motherboard of
> the monitor node with the lowest IP...
> >>> Due to the new mainboard, I assume the hw-clock was wrong during
> startup. Ceph health 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Vickey Singh
Hello Jan

On Wed, Sep 9, 2015 at 11:59 AM, Jan Schermer  wrote:

> Just to recapitulate - the nodes are doing "nothing" when it drops to
> zero? Not flushing something to drives (iostat)? Not cleaning pagecache
> (kswapd and similiar)? Not out of any type of memory (slab,
> min_free_kbytes)? Not network link errors, no bad checksums (those are hard
> to spot, though)?
>
> Unless you find something I suggest you try disabling offloads on the NICs
> and see if the problem goes away.
>

Could you please elaborate on this point: how do you disable offloads on the
NIC? What does that mean, how do you do it, and how is it going to help?

Sorry, I don't know about this.

- Vickey -



>
> Jan
>
> > On 08 Sep 2015, at 18:26, Lincoln Bryant  wrote:
> >
> > For whatever it’s worth, my problem has returned and is very similar to
> yours. Still trying to figure out what’s going on over here.
> >
> > Performance is nice for a few seconds, then goes to 0. This is a similar
> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc)
> >
> >  384  16 29520 29504   307.287  1188 0.0492006  0.208259
> >  385  16 29813 29797   309.532  1172 0.0469708  0.206731
> >  386  16 30105 30089   311.756  1168 0.0375764  0.205189
> >  387  16 30401 30385   314.009  1184  0.036142  0.203791
> >  388  16 30695 30679   316.231  1176 0.0372316  0.202355
> >  389  16 30987 30971318.42  1168 0.0660476  0.200962
> >  390  16 31282 31266   320.628  1180 0.0358611  0.199548
> >  391  16 31568 31552   322.734  1144 0.0405166  0.198132
> >  392  16 31857 31841   324.859  1156 0.0360826  0.196679
> >  393  16 32090 32074   326.404   932 0.0416869   0.19549
> >  394  16 32205 32189   326.743   460 0.0251877  0.194896
> >  395  16 32302 32286   326.897   388 0.0280574  0.194395
> >  396  16 32348 32332   326.537   184 0.0256821  0.194157
> >  397  16 32385 32369   326.087   148 0.0254342  0.193965
> >  398  16 32424 32408   325.659   156 0.0263006  0.193763
> >  399  16 32445 32429   325.05484 0.0233839  0.193655
> > 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat:
> 0.193655
> >  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> >  400  16 32445 32429   324.241 0 -  0.193655
> >  401  16 32445 32429   323.433 0 -  0.193655
> >  402  16 32445 32429   322.628 0 -  0.193655
> >  403  16 32445 32429   321.828 0 -  0.193655
> >  404  16 32445 32429   321.031 0 -  0.193655
> >  405  16 32445 32429   320.238 0 -  0.193655
> >  406  16 32445 32429319.45 0 -  0.193655
> >  407  16 32445 32429   318.665 0 -  0.193655
> >
> > needless to say, very strange.
> >
> > —Lincoln
> >
> >
> >> On Sep 7, 2015, at 3:35 PM, Vickey Singh 
> wrote:
> >>
> >> Adding ceph-users.
> >>
> >> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh <
> vickey.singh22...@gmail.com> wrote:
> >>
> >>
> >> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke 
> wrote:
> >> Hi Vickey,
> >> Thanks for your time in replying to my problem.
> >>
> >> I had the same rados bench output after changing the motherboard of the
> monitor node with the lowest IP...
> >> Due to the new mainboard, I assume the hw-clock was wrong during
> startup. Ceph health show no errors, but all VMs aren't able to do IO (very
> high load on the VMs - but no traffic).
> >> I stopped the mon, but this don't changed anything. I had to restart
> all other mons to get IO again. After that I started the first mon also
> (with the right time now) and all worked fine again...
> >>
> >> Thanks i will try to restart all OSD / MONS and report back , if it
> solves my problem
> >>
> >> Another posibility:
> >> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage
> collection?
> >>
> >> No i don't have journals on SSD , they are on the same OSD disk.
> >>
> >>
> >>
> >> Udo
> >>
> >>
> >> On 07.09.2015 16:36, Vickey Singh wrote:
> >>> Dear Experts
> >>>
> >>> Can someone please help me , why my cluster is not able write data.
> >>>
> >>> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
> >>>
> >>>
> >>> Ceph Hammer  0.94.2
> >>> CentOS 6 (3.10.69-1)
> >>>
> >>> The Ceph status says OPS are blocked , i have tried checking , what
> all i know
> >>>
> >>> - System resources ( CPU , net, disk , memory )-- All normal
> >>> - 10G network for public and cluster network  -- no saturation
> >>> - Add disks are physically healthy
> >>> - No messages in /var/log/messages OR dmesg
> >>> - Tried 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Lincoln Bryant
Hi Jan,

I’ll take a look at all of those things and report back (hopefully :))

I did try setting all of my OSDs to writethrough instead of writeback on the 
controller, which gave significantly more consistent performance (throughput 
dropped from ~1100MB/s to ~300MB/s, but it still occasionally fell to 0MB/s). 
Still plenty of blocked ops. 
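
In case anyone wants to flip the same knob, on LSI-based PERC controllers the
cache policy can usually be switched per logical drive with MegaCli (or the
newer storcli). Treat the lines below as a sketch and double-check the syntax
against your controller's documentation; the binary name also varies
(MegaCli, MegaCli64, megacli):

# megacli -LDSetProp WT -LAll -aAll    (write-through on all logical drives)
# megacli -LDSetProp WB -LAll -aAll    (back to write-back)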

I was wondering if one or more not-so-gracefully failing OSDs might be the 
cause. My controller (PERC H730 Mini) is frustratingly terse with SMART 
information, but at least one disk has a “Non-medium error count” of over 
20,000.
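
For the SMART data, smartctl's megaraid passthrough can often read the full
attributes of drives hidden behind these controllers -- something like the
line below, where the number after "megaraid," is the drive's device ID on
the controller (not the sdX letter), so adjust it to your setup:

# smartctl -a -d megaraid,5 /dev/sdb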

I’ll try disabling offloads as well. 
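
If I've understood the suggestion correctly, that comes down to ethtool -- a
sketch, with eth0 standing in for whichever interface carries the Ceph
traffic (and note the change does not persist across reboots):

# ethtool -k eth0                                  (show current offload settings)
# ethtool -K eth0 tso off gso off gro off lro off  (disable the common offloads)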

Thanks much for the suggestions!

Cheers,
Lincoln

> On Sep 9, 2015, at 3:59 AM, Jan Schermer  wrote:
> 
> Just to recapitulate - the nodes are doing "nothing" when it drops to zero? 
> Not flushing something to drives (iostat)? Not cleaning pagecache (kswapd and 
> similiar)? Not out of any type of memory (slab, min_free_kbytes)? Not network 
> link errors, no bad checksums (those are hard to spot, though)?
> 
> Unless you find something I suggest you try disabling offloads on the NICs 
> and see if the problem goes away.
> 
> Jan
> 
>> On 08 Sep 2015, at 18:26, Lincoln Bryant  wrote:
>> 
>> For whatever it’s worth, my problem has returned and is very similar to 
>> yours. Still trying to figure out what’s going on over here.
>> 
>> Performance is nice for a few seconds, then goes to 0. This is a similar 
>> setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc)
>> 
>> 384  16 29520 29504   307.287  1188 0.0492006  0.208259
>> 385  16 29813 29797   309.532  1172 0.0469708  0.206731
>> 386  16 30105 30089   311.756  1168 0.0375764  0.205189
>> 387  16 30401 30385   314.009  1184  0.036142  0.203791
>> 388  16 30695 30679   316.231  1176 0.0372316  0.202355
>> 389  16 30987 30971318.42  1168 0.0660476  0.200962
>> 390  16 31282 31266   320.628  1180 0.0358611  0.199548
>> 391  16 31568 31552   322.734  1144 0.0405166  0.198132
>> 392  16 31857 31841   324.859  1156 0.0360826  0.196679
>> 393  16 32090 32074   326.404   932 0.0416869   0.19549
>> 394  16 32205 32189   326.743   460 0.0251877  0.194896
>> 395  16 32302 32286   326.897   388 0.0280574  0.194395
>> 396  16 32348 32332   326.537   184 0.0256821  0.194157
>> 397  16 32385 32369   326.087   148 0.0254342  0.193965
>> 398  16 32424 32408   325.659   156 0.0263006  0.193763
>> 399  16 32445 32429   325.05484 0.0233839  0.193655
>> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 
>> 0.193655
>> sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>> 400  16 32445 32429   324.241 0 -  0.193655
>> 401  16 32445 32429   323.433 0 -  0.193655
>> 402  16 32445 32429   322.628 0 -  0.193655
>> 403  16 32445 32429   321.828 0 -  0.193655
>> 404  16 32445 32429   321.031 0 -  0.193655
>> 405  16 32445 32429   320.238 0 -  0.193655
>> 406  16 32445 32429319.45 0 -  0.193655
>> 407  16 32445 32429   318.665 0 -  0.193655
>> 
>> needless to say, very strange.
>> 
>> —Lincoln
>> 
>> 
>>> On Sep 7, 2015, at 3:35 PM, Vickey Singh  
>>> wrote:
>>> 
>>> Adding ceph-users.
>>> 
>>> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh  
>>> wrote:
>>> 
>>> 
>>> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke  wrote:
>>> Hi Vickey,
>>> Thanks for your time in replying to my problem.
>>> 
>>> I had the same rados bench output after changing the motherboard of the 
>>> monitor node with the lowest IP...
>>> Due to the new mainboard, I assume the hw-clock was wrong during startup. 
>>> Ceph health show no errors, but all VMs aren't able to do IO (very high 
>>> load on the VMs - but no traffic).
>>> I stopped the mon, but this don't changed anything. I had to restart all 
>>> other mons to get IO again. After that I started the first mon also (with 
>>> the right time now) and all worked fine again...
>>> 
>>> Thanks i will try to restart all OSD / MONS and report back , if it solves 
>>> my problem 
>>> 
>>> Another posibility:
>>> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage 
>>> collection?
>>> 
>>> No i don't have journals on SSD , they are on the same OSD disk. 
>>> 
>>> 
>>> 
>>> Udo
>>> 
>>> 
>>> On 07.09.2015 16:36, Vickey Singh wrote:
 Dear Experts
 
 Can someone please help me , why my cluster is not able write data.
 
 See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-08 Thread Lincoln Bryant
For whatever it’s worth, my problem has returned and is very similar to yours. 
Still trying to figure out what’s going on over here.

Performance is nice for a few seconds, then goes to 0. This is a similar setup 
to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3, etc)

  384  16 29520 29504   307.287  1188 0.0492006  0.208259
  385  16 29813 29797   309.532  1172 0.0469708  0.206731
  386  16 30105 30089   311.756  1168 0.0375764  0.205189
  387  16 30401 30385   314.009  1184  0.036142  0.203791
  388  16 30695 30679   316.231  1176 0.0372316  0.202355
  389  16 30987 30971318.42  1168 0.0660476  0.200962
  390  16 31282 31266   320.628  1180 0.0358611  0.199548
  391  16 31568 31552   322.734  1144 0.0405166  0.198132
  392  16 31857 31841   324.859  1156 0.0360826  0.196679
  393  16 32090 32074   326.404   932 0.0416869   0.19549
  394  16 32205 32189   326.743   460 0.0251877  0.194896
  395  16 32302 32286   326.897   388 0.0280574  0.194395
  396  16 32348 32332   326.537   184 0.0256821  0.194157
  397  16 32385 32369   326.087   148 0.0254342  0.193965
  398  16 32424 32408   325.659   156 0.0263006  0.193763
  399  16 32445 32429   325.05484 0.0233839  0.193655
2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 0.193655
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  400  16 32445 32429   324.241 0 -  0.193655
  401  16 32445 32429   323.433 0 -  0.193655
  402  16 32445 32429   322.628 0 -  0.193655
  403  16 32445 32429   321.828 0 -  0.193655
  404  16 32445 32429   321.031 0 -  0.193655
  405  16 32445 32429   320.238 0 -  0.193655
  406  16 32445 32429319.45 0 -  0.193655
  407  16 32445 32429   318.665 0 -  0.193655

needless to say, very strange.

—Lincoln


> On Sep 7, 2015, at 3:35 PM, Vickey Singh  wrote:
> 
> Adding ceph-users.
> 
> On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh  
> wrote:
> 
> 
> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke  wrote:
> Hi Vickey,
> Thanks for your time in replying to my problem.
>  
> I had the same rados bench output after changing the motherboard of the 
> monitor node with the lowest IP...
> Due to the new mainboard, I assume the hw-clock was wrong during startup. 
> Ceph health show no errors, but all VMs aren't able to do IO (very high load 
> on the VMs - but no traffic).
> I stopped the mon, but this don't changed anything. I had to restart all 
> other mons to get IO again. After that I started the first mon also (with the 
> right time now) and all worked fine again...
> 
> Thanks i will try to restart all OSD / MONS and report back , if it solves my 
> problem 
> 
> Another posibility:
> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage 
> collection?
> 
> No i don't have journals on SSD , they are on the same OSD disk. 
> 
> 
> 
> Udo
> 
> 
> On 07.09.2015 16:36, Vickey Singh wrote:
>> Dear Experts
>> 
>> Can someone please help me , why my cluster is not able write data.
>> 
>> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
>> 
>> 
>> Ceph Hammer  0.94.2
>> CentOS 6 (3.10.69-1)
>> 
>> The Ceph status says OPS are blocked , i have tried checking , what all i 
>> know 
>> 
>> - System resources ( CPU , net, disk , memory )-- All normal 
>> - 10G network for public and cluster network  -- no saturation 
>> - Add disks are physically healthy 
>> - No messages in /var/log/messages OR dmesg
>> - Tried restarting OSD which are blocking operation , but no luck
>> - Tried writing through RBD  and Rados bench , both are giving same problemm
>> 
>> Please help me to fix this problem.
>> 
>> #  rados bench -p rbd 60 write
>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 
>> objects
>>  Object prefix: benchmark_data_stor1_1791844
>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>  0   0 0 0 0 0 - 0
>>  1  16   125   109   435.873   436  0.022076 0.0697864
>>  2  16   139   123   245.94856  0.246578 0.0674407
>>  3  16   139   123   163.969 0 - 0.0674407
>>  4  16   139   123   122.978 0 - 0.0674407
>>  5  16   139   12398.383 0 - 0.0674407
>>  6  16   139   123   81.9865 0 - 0.0674407
>>  7  16   

[ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Vickey Singh
Dear Experts

Can someone please help me understand why my cluster is not able to write data?

See the output below: cur MB/s is 0 and the avg MB/s keeps decreasing.


Ceph Hammer  0.94.2
CentOS 6 (3.10.69-1)

The Ceph status says ops are blocked. I have checked everything I know of:

- System resources (CPU, network, disk, memory) -- all normal
- 10G network for public and cluster networks -- no saturation
- All disks are physically healthy
- No messages in /var/log/messages or dmesg
- Tried restarting the OSDs that are blocking operations, but no luck
- Tried writing through RBD and rados bench; both show the same problem

Please help me to fix this problem.

#  rados bench -p rbd 60 write
 Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or
0 objects
 Object prefix: benchmark_data_stor1_1791844
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
 0   0 0 0 0 0 - 0
 1  16   125   109   435.873   436  0.022076 0.0697864
 2  16   139   123   245.94856  0.246578 0.0674407
 3  16   139   123   163.969 0 - 0.0674407
 4  16   139   123   122.978 0 - 0.0674407
 5  16   139   12398.383 0 - 0.0674407
 6  16   139   123   81.9865 0 - 0.0674407
 7  16   139   123   70.2747 0 - 0.0674407
 8  16   139   123   61.4903 0 - 0.0674407
 9  16   139   123   54.6582 0 - 0.0674407
10  16   139   123   49.1924 0 - 0.0674407
11  16   139   123   44.7201 0 - 0.0674407
12  16   139   123   40.9934 0 - 0.0674407
13  16   139   123   37.8401 0 - 0.0674407
14  16   139   123   35.1373 0 - 0.0674407
15  16   139   123   32.7949 0 - 0.0674407
16  16   139   123   30.7451 0 - 0.0674407
17  16   139   123   28.9364 0 - 0.0674407
18  16   139   123   27.3289 0 - 0.0674407
19  16   139   123   25.8905 0 - 0.0674407
2015-09-07 15:54:52.694071min lat: 0.022076 max lat: 0.46117 avg lat:
0.0674407
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
20  16   139   12324.596 0 - 0.0674407
21  16   139   123   23.4247 0 - 0.0674407
22  16   139   123 22.36 0 - 0.0674407
23  16   139   123   21.3878 0 - 0.0674407
24  16   139   123   20.4966 0 - 0.0674407
25  16   139   123   19.6768 0 - 0.0674407
26  16   139   123 18.92 0 - 0.0674407
27  16   139   123   18.2192 0 - 0.0674407
28  16   139   123   17.5686 0 - 0.0674407
29  16   139   123   16.9628 0 - 0.0674407
30  16   139   123   16.3973 0 - 0.0674407
31  16   139   123   15.8684 0 - 0.0674407
32  16   139   123   15.3725 0 - 0.0674407
33  16   139   123   14.9067 0 - 0.0674407
34  16   139   123   14.4683 0 - 0.0674407
35  16   139   123   14.0549 0 - 0.0674407
36  16   139   123   13.6645 0 - 0.0674407
37  16   139   123   13.2952 0 - 0.0674407
38  16   139   123   12.9453 0 - 0.0674407
39  16   139   123   12.6134 0 - 0.0674407
2015-09-07 15:55:12.697124min lat: 0.022076 max lat: 0.46117 avg lat:
0.0674407
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
40  16   139   123   12.2981 0 - 0.0674407
41  16   139   123   11.9981 0 - 0.0674407




cluster 86edf8b8-b353-49f1-ab0a-a4827a9ea5e8
 health HEALTH_WARN
1 requests are blocked > 32 sec
 monmap e3: 3 mons at {stor0111=
10.100.1.111:6789/0,stor0113=10.100.1.113:6789/0,stor011
5=10.100.1.115:6789/0}
election epoch 32, quorum 0,1,2 stor0111,stor0113,stor0115
 osdmap e19536: 50 osds: 50 up, 50 in
  pgmap v928610: 2752 pgs, 9 pools, 30476 GB data, 4183 kobjects
91513 GB used, 47642 GB / 135 TB avail
2752 active+clean


Tried using RBD


# dd if=/dev/zero of=file1 bs=4K count=1 oflag=direct
1+0 records in
1+0 records out
4096 bytes (41 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Lincoln Bryant

Hi Vickey,

I had this exact same problem last week, resolved by rebooting all of my 
OSD nodes. I have yet to figure out why it happened, though. I _suspect_ 
in my case it's due to a failing controller on a particular box I've had 
trouble with in the past.


I tried setting 'noout', stopping my OSDs one host at a time, then 
rerunning RADOS bench in between to see if I could nail down the 
problematic machine. Depending on your # of hosts, this might work for 
you. Admittedly, I got impatient with this approach and just ended up 
restarting everything (which worked!) :)
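
In case it helps, that boils down to something like the following per host
(the init commands assume the stock sysvinit scripts on EL6 -- adjust to
however your OSDs are managed):

# ceph osd set noout
# service ceph stop osd           (on the host under suspicion)
  ... rerun rados bench from a client ...
# service ceph start osd          (on the same host)
# ceph osd unset noout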


If you have a bunch of blocked ops, you could maybe try a 'pg query' on 
the PGs involved and see if there's a common OSD with all of your 
blocked ops. In my experience, it's not necessarily the one reporting.
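
Concretely, 'ceph health detail' lists which OSDs the slow/blocked requests
are sitting on, and you can then query individual PGs -- a quick sketch, with
3.1f as a placeholder pgid:

# ceph health detail              (shows "N ops are blocked ... on osd.X" lines)
# ceph pg 3.1f query              (substitute a pgid hosted on a suspect OSD)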


Anecdotally, I've had trouble with Intel 10Gb NICs and custom kernels as 
well. I've seen a NIC appear to be happy (no message in dmesg, machine 
appears to be communicating normally, etc) but when I went to iperf it, 
I was getting super pitiful performance (like KB/s). I don't know what 
kind of NICs you're using, but you may want to iperf everything just in 
case.
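
The quick-and-dirty check is just the usual pair, run in both directions
between each pair of hosts (stor0111 here is only an example target):

# iperf -s                         (on one node)
# iperf -c stor0111 -t 30 -P 4     (from another node)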


--Lincoln

On 9/7/2015 9:36 AM, Vickey Singh wrote:

Dear Experts

Can someone please help me , why my cluster is not able write data.

See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.


Ceph Hammer  0.94.2
CentOS 6 (3.10.69-1)

The Ceph status says OPS are blocked , i have tried checking , what all i
know

- System resources ( CPU , net, disk , memory )-- All normal
- 10G network for public and cluster network  -- no saturation
- Add disks are physically healthy
- No messages in /var/log/messages OR dmesg
- Tried restarting OSD which are blocking operation , but no luck
- Tried writing through RBD  and Rados bench , both are giving same problemm

Please help me to fix this problem.

#  rados bench -p rbd 60 write
  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or
0 objects
  Object prefix: benchmark_data_stor1_1791844
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  16   125   109   435.873   436  0.022076 0.0697864
  2  16   139   123   245.94856  0.246578 0.0674407
  3  16   139   123   163.969 0 - 0.0674407
  4  16   139   123   122.978 0 - 0.0674407
  5  16   139   12398.383 0 - 0.0674407
  6  16   139   123   81.9865 0 - 0.0674407
  7  16   139   123   70.2747 0 - 0.0674407
  8  16   139   123   61.4903 0 - 0.0674407
  9  16   139   123   54.6582 0 - 0.0674407
 10  16   139   123   49.1924 0 - 0.0674407
 11  16   139   123   44.7201 0 - 0.0674407
 12  16   139   123   40.9934 0 - 0.0674407
 13  16   139   123   37.8401 0 - 0.0674407
 14  16   139   123   35.1373 0 - 0.0674407
 15  16   139   123   32.7949 0 - 0.0674407
 16  16   139   123   30.7451 0 - 0.0674407
 17  16   139   123   28.9364 0 - 0.0674407
 18  16   139   123   27.3289 0 - 0.0674407
 19  16   139   123   25.8905 0 - 0.0674407
2015-09-07 15:54:52.694071min lat: 0.022076 max lat: 0.46117 avg lat:
0.0674407
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
 20  16   139   12324.596 0 - 0.0674407
 21  16   139   123   23.4247 0 - 0.0674407
 22  16   139   123 22.36 0 - 0.0674407
 23  16   139   123   21.3878 0 - 0.0674407
 24  16   139   123   20.4966 0 - 0.0674407
 25  16   139   123   19.6768 0 - 0.0674407
 26  16   139   123 18.92 0 - 0.0674407
 27  16   139   123   18.2192 0 - 0.0674407
 28  16   139   123   17.5686 0 - 0.0674407
 29  16   139   123   16.9628 0 - 0.0674407
 30  16   139   123   16.3973 0 - 0.0674407
 31  16   139   123   15.8684 0 - 0.0674407
 32  16   139   123   15.3725 0 - 0.0674407
 33  16   139   123   14.9067 0 - 0.0674407
 34  16   139   123   14.4683 0 - 0.0674407
 35  16   139   123   14.0549 0   

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Vickey Singh
On Mon, Sep 7, 2015 at 7:39 PM, Lincoln Bryant 
wrote:

> Hi Vickey,
>
>
Thanks a lot for replying to my problem.


> I had this exact same problem last week, resolved by rebooting all of my
> OSD nodes. I have yet to figure out why it happened, though. I _suspect_ in
> my case it's due to a failing controller on a particular box I've had
> trouble with in the past.
>

Mine is a 5-node cluster with 12 OSDs per node, and in the past there have
never been any hardware problems.


> I tried setting 'noout', stopping my OSDs one host at a time, then
> rerunning RADOS bench between to see if I could nail down the problematic
> machine. Depending on your # of hosts, this might work for you. Admittedly,
> I got impatient with this approach though and just ended up restarting
> everything (which worked!) :)
>

So do you mean you intentionally brought one node's OSDs down, so that some
OSDs were down but none of them were out (noout)? Then you waited for some
time for the cluster to become healthy, and then reran rados bench?


>
> If you have a bunch of blocked ops, you could maybe try a 'pg query' on
> the PGs involved and see if there's a common OSD with all of your blocked
> ops. In my experience, it's not necessarily the one reporting.
>

Yeah, I have 55 OSDs, and each time a different random OSD shows blocked ops,
so I can't blame any specific OSD. After a few minutes the blocked OSD becomes
clean, and after some time some other OSD blocks ops.
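
Next time it happens I can try to capture what the reporting OSD is stuck on
via its admin socket -- something like the commands below, with NN standing
in for whichever OSD is reporting the blocked ops:

# ceph daemon osd.NN dump_ops_in_flight
# ceph daemon osd.NN dump_historic_ops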


Thanks, I will try to restart all OSD / monitor daemons and see if this fixes
it. Is there anything I need to keep in mind when restarting OSDs (apart from
setting nodown / noout)?


>
> Anecdotally, I've had trouble with Intel 10Gb NICs and custom kernels as
> well. I've seen a NIC appear to be happy (no message in dmesg, machine
> appears to be communicating normally, etc) but when I went to iperf it, I
> was getting super pitiful performance (like KB/s). I don't know what kind
> of NICs you're using, but you may want to iperf everything just in case.
>

Yeah, I did that; iperf shows no problem.

Is there anything else I should do?


>
> --Lincoln
>
>
> On 9/7/2015 9:36 AM, Vickey Singh wrote:
>
> Dear Experts
>
> Can someone please help me , why my cluster is not able write data.
>
> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
>
>
> Ceph Hammer  0.94.2
> CentOS 6 (3.10.69-1)
>
> The Ceph status says OPS are blocked , i have tried checking , what all i
> know
>
> - System resources ( CPU , net, disk , memory )-- All normal
> - 10G network for public and cluster network  -- no saturation
> - Add disks are physically healthy
> - No messages in /var/log/messages OR dmesg
> - Tried restarting OSD which are blocking operation , but no luck
> - Tried writing through RBD  and Rados bench , both are giving same problemm
>
> Please help me to fix this problem.
>
> #  rados bench -p rbd 60 write
>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or
> 0 objects
>  Object prefix: benchmark_data_stor1_1791844
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>  0   0 0 0 0 0 - 0
>  1  16   125   109   435.873   436  0.022076 0.0697864
>  2  16   139   123   245.94856  0.246578 0.0674407
>  3  16   139   123   163.969 0 - 0.0674407
>  4  16   139   123   122.978 0 - 0.0674407
>  5  16   139   12398.383 0 - 0.0674407
>  6  16   139   123   81.9865 0 - 0.0674407
>  7  16   139   123   70.2747 0 - 0.0674407
>  8  16   139   123   61.4903 0 - 0.0674407
>  9  16   139   123   54.6582 0 - 0.0674407
> 10  16   139   123   49.1924 0 - 0.0674407
> 11  16   139   123   44.7201 0 - 0.0674407
> 12  16   139   123   40.9934 0 - 0.0674407
> 13  16   139   123   37.8401 0 - 0.0674407
> 14  16   139   123   35.1373 0 - 0.0674407
> 15  16   139   123   32.7949 0 - 0.0674407
> 16  16   139   123   30.7451 0 - 0.0674407
> 17  16   139   123   28.9364 0 - 0.0674407
> 18  16   139   123   27.3289 0 - 0.0674407
> 19  16   139   123   25.8905 0 - 0.0674407
> 2015-09-07 15:54:52.694071min lat: 0.022076 max lat: 0.46117 avg lat:
> 0.0674407
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> 20  16   139   12324.596 0 - 0.0674407
> 21  16   139   123   23.4247 0 - 0.0674407
> 22  16   139   123 

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Vickey Singh
Adding ceph-users.

On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh 
wrote:

>
>
> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke  wrote:
>
>> Hi Vickey,
>>
> Thanks for your time in replying to my problem.
>
>
>> I had the same rados bench output after changing the motherboard of the
>> monitor node with the lowest IP...
>> Due to the new mainboard, I assume the hw-clock was wrong during startup.
>> Ceph health show no errors, but all VMs aren't able to do IO (very high
>> load on the VMs - but no traffic).
>> I stopped the mon, but this don't changed anything. I had to restart all
>> other mons to get IO again. After that I started the first mon also (with
>> the right time now) and all worked fine again...
>>
>
> Thanks i will try to restart all OSD / MONS and report back , if it solves
> my problem
>
>>
>> Another posibility:
>> Do you use journal on SSDs? Perhaps the SSDs can't write to garbage
>> collection?
>>
>
> No i don't have journals on SSD , they are on the same OSD disk.
>
>>
>>
>>
>> Udo
>>
>>
>> On 07.09.2015 16:36, Vickey Singh wrote:
>>
>> Dear Experts
>>
>> Can someone please help me , why my cluster is not able write data.
>>
>> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
>>
>>
>> Ceph Hammer  0.94.2
>> CentOS 6 (3.10.69-1)
>>
>> The Ceph status says OPS are blocked , i have tried checking , what all i
>> know
>>
>> - System resources ( CPU , net, disk , memory )-- All normal
>> - 10G network for public and cluster network  -- no saturation
>> - Add disks are physically healthy
>> - No messages in /var/log/messages OR dmesg
>> - Tried restarting OSD which are blocking operation , but no luck
>> - Tried writing through RBD  and Rados bench , both are giving same
>> problemm
>>
>> Please help me to fix this problem.
>>
>> #  rados bench -p rbd 60 write
>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds
>> or 0 objects
>>  Object prefix: benchmark_data_stor1_1791844
>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>  0   0 0 0 0 0 - 0
>>  1  16   125   109   435.873   436  0.022076 0.0697864
>>  2  16   139   123   245.94856  0.246578 0.0674407
>>  3  16   139   123   163.969 0 - 0.0674407
>>  4  16   139   123   122.978 0 - 0.0674407
>>  5  16   139   12398.383 0 - 0.0674407
>>  6  16   139   123   81.9865 0 - 0.0674407
>>  7  16   139   123   70.2747 0 - 0.0674407
>>  8  16   139   123   61.4903 0 - 0.0674407
>>  9  16   139   123   54.6582 0 - 0.0674407
>> 10  16   139   123   49.1924 0 - 0.0674407
>> 11  16   139   123   44.7201 0 - 0.0674407
>> 12  16   139   123   40.9934 0 - 0.0674407
>> 13  16   139   123   37.8401 0 - 0.0674407
>> 14  16   139   123   35.1373 0 - 0.0674407
>> 15  16   139   123   32.7949 0 - 0.0674407
>> 16  16   139   123   30.7451 0 - 0.0674407
>> 17  16   139   123   28.9364 0 - 0.0674407
>> 18  16   139   123   27.3289 0 - 0.0674407
>> 19  16   139   123   25.8905 0 - 0.0674407
>> 2015-09-07 15:54:52.694071min lat: 0.022076 max lat: 0.46117 avg lat:
>> 0.0674407
>>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>> 20  16   139   12324.596 0 - 0.0674407
>> 21  16   139   123   23.4247 0 - 0.0674407
>> 22  16   139   123 22.36 0 - 0.0674407
>> 23  16   139   123   21.3878 0 - 0.0674407
>> 24  16   139   123   20.4966 0 - 0.0674407
>> 25  16   139   123   19.6768 0 - 0.0674407
>> 26  16   139   123 18.92 0 - 0.0674407
>> 27  16   139   123   18.2192 0 - 0.0674407
>> 28  16   139   123   17.5686 0 - 0.0674407
>> 29  16   139   123   16.9628 0 - 0.0674407
>> 30  16   139   123   16.3973 0 - 0.0674407
>> 31  16   139   123   15.8684 0 - 0.0674407
>> 32  16   139   123   15.3725 0 - 0.0674407
>> 33  16   139   123   14.9067 0 - 0.0674407
>> 34  16   139   123   14.4683 0 - 0.0674407
>> 35  

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Udo Lembke
Hi Vickey,
I had the same rados bench output after changing the motherboard of the
monitor node with the lowest IP...
Due to the new mainboard, I assume the hw-clock was wrong during startup.
Ceph health showed no errors, but none of the VMs were able to do IO
(very high load on the VMs - but no traffic).
I stopped that mon, but it didn't change anything. I had to restart all the
other mons to get IO again. After that I started the first mon again
(with the correct time now) and everything worked fine again...
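
If clocks are the culprit, 'ceph health detail' should complain about clock
skew between the monitors; checking NTP on each mon host shows whether they
are actually in sync, e.g.:

# ceph health detail      (look for "clock skew detected on mon. ...")
# ntpq -p                 (on each monitor node)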

Another possibility:
Do you use journals on SSDs? Perhaps the SSDs can't keep up because of
garbage collection?


Udo

On 07.09.2015 16:36, Vickey Singh wrote:
> Dear Experts
>
> Can someone please help me , why my cluster is not able write data.
>
> See the below output  cur MB/S  is 0  and Avg MB/s is decreasing.
>
>
> Ceph Hammer  0.94.2
> CentOS 6 (3.10.69-1)
>
> The Ceph status says OPS are blocked , i have tried checking , what
> all i know 
>
> - System resources ( CPU , net, disk , memory )-- All normal 
> - 10G network for public and cluster network  -- no saturation 
> - Add disks are physically healthy 
> - No messages in /var/log/messages OR dmesg
> - Tried restarting OSD which are blocking operation , but no luck
> - Tried writing through RBD  and Rados bench , both are giving same
> problemm
>
> Please help me to fix this problem.
>
> #  rados bench -p rbd 60 write
>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60
> seconds or 0 objects
>  Object prefix: benchmark_data_stor1_1791844
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>  0   0 0 0 0 0 - 0
>  1  16   125   109   435.873   436  0.022076 0.0697864
>  2  16   139   123   245.94856  0.246578 0.0674407
>  3  16   139   123   163.969 0 - 0.0674407
>  4  16   139   123   122.978 0 - 0.0674407
>  5  16   139   12398.383 0 - 0.0674407
>  6  16   139   123   81.9865 0 - 0.0674407
>  7  16   139   123   70.2747 0 - 0.0674407
>  8  16   139   123   61.4903 0 - 0.0674407
>  9  16   139   123   54.6582 0 - 0.0674407
> 10  16   139   123   49.1924 0 - 0.0674407
> 11  16   139   123   44.7201 0 - 0.0674407
> 12  16   139   123   40.9934 0 - 0.0674407
> 13  16   139   123   37.8401 0 - 0.0674407
> 14  16   139   123   35.1373 0 - 0.0674407
> 15  16   139   123   32.7949 0 - 0.0674407
> 16  16   139   123   30.7451 0 - 0.0674407
> 17  16   139   123   28.9364 0 - 0.0674407
> 18  16   139   123   27.3289 0 - 0.0674407
> 19  16   139   123   25.8905 0 - 0.0674407
> 2015-09-07 15:54:52.694071min lat: 0.022076 max lat: 0.46117 avg lat:
> 0.0674407
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> 20  16   139   12324.596 0 - 0.0674407
> 21  16   139   123   23.4247 0 - 0.0674407
> 22  16   139   123 22.36 0 - 0.0674407
> 23  16   139   123   21.3878 0 - 0.0674407
> 24  16   139   123   20.4966 0 - 0.0674407
> 25  16   139   123   19.6768 0 - 0.0674407
> 26  16   139   123 18.92 0 - 0.0674407
> 27  16   139   123   18.2192 0 - 0.0674407
> 28  16   139   123   17.5686 0 - 0.0674407
> 29  16   139   123   16.9628 0 - 0.0674407
> 30  16   139   123   16.3973 0 - 0.0674407
> 31  16   139   123   15.8684 0 - 0.0674407
> 32  16   139   123   15.3725 0 - 0.0674407
> 33  16   139   123   14.9067 0 - 0.0674407
> 34  16   139   123   14.4683 0 - 0.0674407
> 35  16   139   123   14.0549 0 - 0.0674407
> 36  16   139   123   13.6645 0 - 0.0674407
> 37  16   139   123   13.2952 0 - 0.0674407
> 38  16   139   123   12.9453 0 - 0.0674407
> 39  16   139   123   12.6134 0 - 0.0674407
> 2015-09-07 15:55:12.697124min lat: 0.022076 max lat: 0.46117 avg lat:
> 0.0674407
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>