Hi David,

There’s a good amount of backstory to our configuration, but I’m happy to 
report I found the source of my problem.

We were applying some “optimizations” for our 10GbE via sysctl, including 
disabling net.ipv4.tcp_sack. Re-enabling net.ipv4.tcp_sack resolved the issue.
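
For anyone hitting the same symptom, the setting can be checked and restored
at runtime with sysctl. The commands below assume the override lives in
/etc/sysctl.conf or under /etc/sysctl.d/; adjust to wherever yours is set:

  # check the current value (1 = SACK enabled, which is the kernel default)
  sysctl net.ipv4.tcp_sack

  # re-enable it immediately
  sysctl -w net.ipv4.tcp_sack=1

  # make it permanent: remove (or set to 1) the net.ipv4.tcp_sack = 0 line in
  # /etc/sysctl.conf or /etc/sysctl.d/*.conf, then reload
  sysctl --system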

Thanks,
Tom

From: David Turner [mailto:david.tur...@storagecraft.com]
Sent: Monday, August 01, 2016 12:06 PM
To: Helander, Thomas <thomas.helan...@kla-tencor.com>; ceph-users@lists.ceph.com
Subject: RE: Read Stalls with Multiple OSD Servers

Why are you running RAID 6 OSDs?  Ceph's strength is having a lot of individual
OSDs that can fail and be replaced.  With your processors/RAM, you should be
running each disk as its own OSD; that would make much better use of your dual
processor setup.  Ceph runs best at roughly one core per OSD, so extra cores are
more or less wasted in the storage node.  You only have 2 storage nodes, so you
can't take advantage of a lot of the benefits of Ceph.  Your setup looks much
better suited to a Gluster cluster than a Ceph cluster.  I don't know what your
needs are, but that's what it looks like from here.
________________________________

David Turner | Cloud Operations Engineer | StorageCraft Technology 
Corporation<https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943

________________________________

From: Helander, Thomas [thomas.helan...@kla-tencor.com]
Sent: Monday, August 01, 2016 11:10 AM
To: David Turner; ceph-users@lists.ceph.com
Subject: RE: Read Stalls with Multiple OSD Servers
Hi David,

Thanks for the quick response and suggestion. I do have just a basic network 
config (one network, no VLANs) and am able to ping between the storage servers 
using hostnames and IPs.

Thanks,
Tom

From: David Turner [mailto:david.tur...@storagecraft.com]
Sent: Monday, August 01, 2016 9:14 AM
To: Helander, Thomas <thomas.helan...@kla-tencor.com>; ceph-users@lists.ceph.com
Subject: RE: Read Stalls with Multiple OSD Servers

This could be explained by your OSDs not being able to communicate with each
other.  We have 2 VLANs between our storage nodes, the public and private
networks for Ceph to use.  We added 2 new nodes in a new rack on new switches,
and as soon as we added a single OSD from one of them to the cluster, peering
never finished and we had a lot of blocked requests that never went away.

In testing we found that the rest of the cluster could not communicate with
these nodes on the private VLAN, and after fixing the network switch config,
everything worked perfectly for adding in the 2 new nodes.

If you are using a basic network configuration with only one network and/or
VLAN, then this is likely not your issue.  But to check and make sure, you
should test pinging between your nodes on all of the IPs they have.
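
Concretely, the addresses worth testing are whatever ceph.conf lists as the
public and cluster networks on each node, for example (the IPs below are
placeholders):

  # see which networks ceph expects to use
  grep -E 'public[ _]network|cluster[ _]network' /etc/ceph/ceph.conf

  # then ping every other node on each of those subnets
  ping -c 3 <peer public IP>
  ping -c 3 <peer cluster IP>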
________________________________

David Turner | Cloud Operations Engineer | StorageCraft Technology 
Corporation<https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943

________________________________

From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Helander, 
Thomas [thomas.helan...@kla-tencor.com]
Sent: Monday, August 01, 2016 10:06 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Read Stalls with Multiple OSD Servers
Hi,

I’m running a three-server cluster (one monitor, two OSD servers) and am having
a problem where, after adding the second OSD server, my read rate drops
significantly and eventually the reads stall (writes improve as expected).
Attached is a log of the rados benchmarks for the two configurations, and below
is my hardware configuration. I’m not using replicas (capacity is more
important than uptime for our use case) and am using a single 10GbE network.
The pool (rbd) is configured with 128 placement groups.
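
For reference, the benchmarks are ordinary rados bench write and read runs
against that pool, along these lines (the 60-second duration and the choice of
a sequential read pass are just examples):

  # write pass, keeping the objects so the read pass has data to fetch
  rados bench -p rbd 60 write --no-cleanup

  # sequential read pass (the reads are where the stall shows up)
  rados bench -p rbd 60 seq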

I’ve checked the CPU utilization of the ceph-osd processes and they all hover 
around 10% until the stall. After the stall, the CPU usage is 0% and the disks 
all show zero operations via iostat. Iperf reports 9.9Gb/s between the monitor 
and OSD servers.

I’m looking for any advice/help on how to identify the source of this issue as 
my attempts so far have proven fruitless…
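
I'm guessing the next step is something along these lines, but I'm not sure
what to look for in the output (osd.0 is just an example ID, and the daemon
command has to run on the host carrying that OSD):

  # overall state and any slow/blocked request warnings
  ceph -s
  ceph health detail

  # per-OSD commit/apply latency
  ceph osd perf

  # what a stuck OSD is actually working on
  ceph daemon osd.0 dump_ops_in_flight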

Monitor server:
2x E5-2680V3
32GB DDR4
2x 4TB HDD in RAID1 on an Avago/LSI 3108 with Cachevault, configured as 
write-back
10GbE

OSD servers:
2x E5-2680V3
128GB DDR4
2x 8+2 RAID6 using 8TB SAS12 drives on an Avago/LSI 9380 controller with
Cachevault, configured as write-back
  - Each RAID6 is an OSD
10GbE

Thanks,

Tom Helander

KLA-Tencor
One Technology Dr | M/S 5-2042R | Milpitas, CA | 95035

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
