Hi Chris,

We checked memory as well and have plenty of free memory (12 GB used / 125 GB
available) on every DN.

We actually enabled some debug logs yesterday and found many messages like:

1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7ff9bdb42700' had timed out 
after 15
1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7ff9bb33d700' had timed 
out after 15

Those messages are seen on one of the secondary OSDs whenever we identify blocked
requests (waiting for subops) on the primary OSD.
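From what I understand, the "15" in those messages corresponds to osd_op_thread_timeout (default 15 seconds), i.e. the OSD op worker threads are stalling for more than 15 s. Assuming that is right, the value can be checked or temporarily raised on the affected OSD (osd.210 below is just an example id; the first command has to run on that OSD's host):

  ceph daemon osd.210 config get osd_op_thread_timeout
  ceph tell osd.210 injectargs '--osd_op_thread_timeout 30'

Raising it would only hide the symptom though; the threads would still be blocked somewhere.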

Is this related to some connection issue with the monitors?

Thanks

Thomas



From: Chris Taylor [mailto:ctay...@eyonic.com]
Sent: Tuesday, 15 November 2016 00:54
To: Brad Hubbard
Cc: Thomas Danan; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph cluster having blocked requests very frequently


Maybe a long shot, but have you checked OSD memory usage? Are the OSD hosts low 
on RAM and swapping to disk?

I am not familiar with your issue, but thought that might cause it.



Chris



On 2016-11-14 3:29 pm, Brad Hubbard wrote:
Have you looked for clues in the output of dump_historic_ops?
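For example, on the host of one of the OSDs reporting slow requests (osd.210 is just an example id):

  ceph daemon osd.210 dump_historic_ops

It keeps the slowest recent ops together with a timeline of events for each one (reached_pg, waiting for subops, journal/commit steps, ...), which usually shows where the time is actually being spent.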

On Tue, Nov 15, 2016 at 1:45 AM, Thomas Danan
<thomas.da...@mycom-osi.com> wrote:
Thanks Luis,

Here are some answers ....

Journals are not on SSDs; they are collocated on the OSD daemon hosts.
We looked at the disk performance and did not notice anything wrong; r/w latencies
are acceptable (< 20 ms).
We have not seen any issue on the network either.

There is only one pool in the cluster, so pool size = cluster size. The replication
factor is the default (3) and there is no erasure coding.

We tried stopping deep scrubs, but without any notable effect.
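(For reference, the usual way to stop scrubbing cluster-wide is via the flags:

  ceph osd set noscrub
  ceph osd set nodeep-scrub

and "ceph osd unset ..." to re-enable them; scrubs already in progress still run to completion, so it takes a while before the effect is visible.)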

We have one near-full OSD and are adding new DNs, but I doubt this is the issue.

I doubt we are hitting the cluster's limits, but if that were the case, adding new
DNs should help. Also, writes to the primary OSD work fine, whereas writes to the
secondary OSDs are often blocked. Finally, recovery can be very fast (several GB/s)
and never seems blocked, while client read/write IO is around several hundred MB/s
and is blocked far too often when writing replicas.
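One thing that could be cross-checked (just a sketch, not something we have fully analysed): compare the OSD ids reported in the "waiting for subops from ..." messages with the per-OSD latencies from:

  ceph osd perf

If the same few OSDs always show high fs_commit_latency / fs_apply_latency there, that would point at slow disks or hosts rather than at the network or the replication code.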

Thomas


From: Luis Periquito [mailto:periqu...@gmail.com]
Sent: Monday, 14 November 2016 16:23
To: Thomas Danan
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph cluster having blocked requests very frequently

Without knowing the cluster architecture it's hard to know exactly what may be 
happening.

How is the cluster hardware? Where are the journals? How busy are the disks (% 
time busy)? What is the pool size? Are these replicated or EC pools?

Have you tried tuning the deep-scrub processes? Have you tried stopping them
altogether? Are the journals on SSDs? My first feeling is that the cluster may be
hitting its limits (also, you have at least one OSD getting full)...
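(By "% time busy" I mean something like the following on a few OSD hosts, looking at the %util and await columns for the OSD data disks:

  iostat -x 5

Sustained %util close to 100%, or await well above the disks' normal service time, would point at the spindles/journals rather than at the network.)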

On Mon, Nov 14, 2016 at 3:16 PM, Thomas Danan
<thomas.da...@mycom-osi.com> wrote:
Hi All,

We have a production cluster that is suffering from intermittent blocked requests
(25 requests are blocked > 32 sec). The blocked requests occur frequently and affect
all OSDs.
In the OSD daemon logs, I can see related messages:

2016-11-11 18:25:29.917518 7fd28b989700 0 log_channel(cluster) log [WRN] : slow 
request 30.429723 seconds old, received at 2016-11-11 18:24:59.487570: 
osd_op(client.2406272.1:336025615 rbd_data.66e952ae8944a.0000000000350167 
[set-alloc-hint object_size 4194304 write_size 4194304,write 0~524288] 
0.8d3c9da5 snapc 248=[248,216] ondisk+write e201514) currently waiting for 
subops from 210,499,821

So I guess the issue is related to the replication process when writing new data to
the cluster. Note that it is never the same secondary OSDs that show up in the OSD
daemon logs.
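If someone wants to double-check that, the ids can be aggregated from the logs with something along these lines (the path below is the default log location):

  grep 'waiting for subops from' /var/log/ceph/ceph-osd.*.log \
    | sed 's/.*subops from //' | tr ',' '\n' \
    | sort -n | uniq -c | sort -rn | head

In our case no particular set of OSDs stands out.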
As a result we are experiencing very high write IO latency on the Ceph client side
(it can be up to 1 hour!!!).
We have checked network health as well as disk health but were not able to find any
issue.

I wanted to know if this issue has already been observed elsewhere, or if you have
ideas on how to investigate or work around it.
Many thanks...

Thomas

The cluster is composed of 37 DNs, 851 OSDs and 5 MONs.
The Ceph clients access the cluster through RBD.
The cluster runs Hammer 0.94.5.

cluster 1a26e029-3734-4b0e-b86e-ca2778d0c990
health HEALTH_WARN
25 requests are blocked > 32 sec
1 near full osd(s)
noout flag(s) set
monmap e3: 5 mons at 
{NVMBD1CGK190D00=10.137.81.13:6789/0,nvmbd1cgy050d00=10.137.78.226:6789/0,nvmbd1cgy070d00=10.137.78.232:6789/0,nvmbd1cgy090d00=10.137.78.228:6789/0,nvmbd1cgy130d00=10.137.78.218:6789/0}
election epoch 664, quorum 0,1,2,3,4 
nvmbd1cgy130d00,nvmbd1cgy050d00,nvmbd1cgy090d00,nvmbd1cgy070d00,NVMBD1CGK190D00
osdmap e205632: 851 osds: 850 up, 850 in
flags noout
pgmap v25919096: 10240 pgs, 1 pools, 197 TB data, 50664 kobjects
597 TB used, 233 TB / 831 TB avail
10208 active+clean
32 active+clean+scrubbing+deep
client io 97822 kB/s rd, 205 MB/s wr, 2402 op/s
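For reference, the per-OSD breakdown of those blocked requests should be visible with:

  ceph health detail

which prints lines like "N ops are blocked > 32.768 sec on osd.X", so it is easy to see whether they concentrate on particular OSDs or hosts.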



Thank you
Thomas Danan
Director of Product Development

Office        +33 1 49 03 77 53
Mobile        +33 7 76 35 76 43
Skype         thomas.danan
www.mycom-osi.com







--
Cheers,
Brad


________________________________

This electronic message contains information from Mycom which may be privileged 
or confidential. The information is intended to be for the use of the 
individual(s) or entity named above. If you are not the intended recipient, be 
aware that any disclosure, copying, distribution or any other use of the 
contents of this information is prohibited. If you have received this 
electronic message in error, please notify us by post or telephone (to the 
numbers or correspondence address above) or by email (at the email address 
above) immediately.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
