Re: [ceph-users] Poor ceph cluster performance
Cody wrote:
> > And this exact problem was one of the reasons why we migrated
> > everything to PXE boot where the OS runs from RAM.
>
> Hi Paul,
>
> I totally agree with and admire your diskless approach. If I may ask,
> what kind of OS image do you use? A 1 GB footprint sounds really small.

It's based on Debian, because Debian makes live boot really easy with
squashfs + overlayfs. We also have a half-finished CentOS/RHEL-based
version somewhere, but that requires far more RAM because it doesn't use
overlayfs (or didn't when we last checked; I guess we need to check
RHEL 8 again).

The current image size is 400 MB, plus 30 MB for the kernel and initrd,
and it comes with everything you need for Ceph. We don't even run
aggressive compression on the squashfs; it's just LZO.

You can test it for yourself in a VM: https://croit.io/croit-virtual-demo

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
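For anyone curious how the squashfs-over-PXE pattern works in general (this is not croit's actual setup): Debian's live-boot package lets the initrd fetch the squashfs over HTTP and keep it entirely in RAM, with overlayfs providing the writable layer. A minimal PXELINUX sketch along those lines, with hypothetical server names and paths:

    # pxelinux.cfg/default -- hypothetical live-boot-from-RAM entry
    DEFAULT ceph-node
    LABEL ceph-node
      KERNEL vmlinuz
      INITRD initrd.img
      # boot=live enables live-boot; fetch= downloads the squashfs
      # into RAM over HTTP, so no local OS disk is ever touched.
      APPEND boot=live fetch=http://boot-server/images/filesystem.squashfs ip=dhcp

The kernel and initrd come over TFTP as usual; once booted, the node has no dependency on a local disk.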
Re: [ceph-users] Poor ceph cluster performance
And this exact problem was one of the reasons why we migrated everything
to PXE boot where the OS runs from RAM. That kind of failure is just the
worst to debug...

Also, 1 GB of RAM is cheaper than a separate OS disk.

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue., Nov 27, 2018 at 19:22, Cody wrote:
> The root cause was a failed OS drive on one storage node. The server
> was responsive to ping, but I was unable to log in. After a reboot via
> IPMI, the docker daemon failed to start due to I/O errors and dmesg
> complained about the failing OS disk.
Re: [ceph-users] Poor ceph cluster performance
Hi everyone,

Many, many thanks to all of you!

The root cause was a failed OS drive on one storage node. The server was
responsive to ping, but I was unable to log in. After a reboot via IPMI,
the docker daemon failed to start due to I/O errors and dmesg complained
about the failing OS disk. I failed to catch the problem initially since
'ceph -s' kept showing HEALTH_OK and the cluster remained "functional"
despite the slow performance.

I really appreciate all the tips and advice I received from you all and
learned a lot. I will carry your advice (e.g. using bluestore,
enterprise SSDs/HDDs, separating public and cluster traffic, etc.) into
my next PoC round.

Thank you very much!

Best regards,
Cody

On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov wrote:
> 0.84 MB/s sequential write is impossibly bad; it's not normal with any
> kind of device, even with a 1G network. You probably have some kind of
> problem in your setup.
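For the archives: two quick checks that can catch a dying OS disk early are the SMART status and the kernel log. A sketch (the device path below is just an example):

    # Overall health verdict plus the SMART attribute table:
    $ sudo smartctl -H -A /dev/sda

    # Recent kernel messages about I/O errors:
    $ dmesg | grep -i 'i/o error'

Either of these run on the unresponsive node would likely have pointed at the OS disk before the reboot did.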
Re: [ceph-users] Poor ceph cluster performance
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffic
> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

0.84 MB/s sequential write is impossibly bad; it's not normal with any
kind of device, even with a 1G network. You probably have some kind of
problem in your setup: maybe the network RTT is very high, maybe the OSD
or mon nodes are shared with other running tasks and overloaded, or
maybe your disks are already dead... :))

> As I moved on to test block devices, I got the following error message:
>
> # rbd map image01 --pool testbench --name client.admin

You don't need to map it to run benchmarks; use `fio --ioengine=rbd`
(however, you'll still need /etc/ceph/ceph.client.admin.keyring).

--
With best regards,
  Vitaliy Filippov
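For reference, a sequential-write run with fio's rbd engine along those lines might look like the sketch below; it reuses the pool and image names from the original post and assumes a fio build compiled with rbd support:

    # Needs fio built with rbd support and a readable
    # /etc/ceph/ceph.client.admin.keyring; no 'rbd map' required.
    $ fio --ioengine=rbd --clientname=admin --pool=testbench \
          --rbdname=image01 --rw=write --bs=4M --iodepth=16 \
          --runtime=60 --time_based --name=rbd-seq-write

Because this talks to the cluster through librbd directly, it sidesteps the kernel-client keyring problem entirely.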
Re: [ceph-users] Poor ceph cluster performance
Hi,

Most likely the issue is with your consumer grade journal SSD. Run this
against your SSD to check whether it can handle sync writes:

fio --filename=<device> --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test

On Tue, Nov 27, 2018 at 2:06 AM Cody wrote:
> I have a Ceph cluster deployed together with OpenStack using TripleO.
> While the Ceph cluster shows a healthy status, its performance is
> painfully slow.
> [...]
> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
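One caveat worth adding: with --rw=write pointed at a raw device, that test overwrites whatever is on it. If the journal partitions are in use, a safer variant writes to a scratch file on a filesystem on that SSD (the path below is hypothetical):

    # Safer variant: sync 4k writes to a 1 GB scratch file on the SSD's
    # filesystem instead of the raw device; destroys nothing else.
    $ fio --filename=/mnt/ssd/fio-scratch --size=1G --direct=1 --sync=1 \
          --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 \
          --time_based --group_reporting --name=journal-test

Consumer SSDs without power-loss protection often show very low sync-write IOPS in this test, which by itself can explain a crawling filestore journal.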
Re: [ceph-users] Poor ceph cluster performance
Quoting Cody (codeology@gmail.com):
> The Ceph OSD part of the cluster uses 3 identical servers with the
> following specifications:
>
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffic

This will hamper throughput a lot.

> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

OK, let's stop here first: consumer grade SSD. Percona did a nice
writeup about "fsync" speed on consumer grade SSDs [1]. As I don't know
what drives you use, this might or might not be the issue.

> This is not beefy enough in any way, but I am running for PoC only,
> with minimum utilization.
>
> Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
> nodes. Ceph-ansible version is 3.1 and is using Filestore with the
> non-colocated scenario (1 SSD for every 2 OSDs). Connection speed
> among Controllers, Computes, and OSD nodes can reach ~900Mbps, tested
> using iperf.

Why filestore, if I may ask? I guess bluestore with the bluestore
journal on SSD and data on SATA should give you better performance, if
the SSDs are suitable for the job at all.

What version of Ceph are you using?

Metrics can give you a lot of insight. Did you take a look at those?
For example, the Ceph mgr dashboard?

> I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
> the following results:
>
> [...]
>
> Apparently, the results are pathetic...
>
> As I moved on to test block devices, I got the following error message:
>
> # rbd map image01 --pool testbench --name client.admin
> rbd: failed to add secret 'client.admin' to kernel

What replication factor are you using?

Make sure you have the client.admin keyring on the node where you are
issuing this command. If the keyring is present where Ceph expects it to
be, then you can omit --name client.admin. On a monitor node you can
extract the admin keyring with: ceph auth export client.admin. Put the
output of that in /etc/ceph/ceph.client.admin.keyring and this should
work.

> Any suggestions on the above error and/or debugging would be greatly
> appreciated!

Gr. Stefan

[1]: https://www.percona.com/blog/2018/07/18/why-consumer-ssd-reviews-are-useless-for-database-performance-use-case/

--
| BIT BV  http://www.bit.nl/            Kamer van Koophandel 09090351
| GPG: 0xD14839C6                       +31 318 648 688 / i...@bit.nl
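A sketch of the keyring steps Stefan describes, reusing the pool and image names from the original post:

    # On a monitor node, export the admin keyring:
    $ ceph auth export client.admin

    # Copy the keyring section of that output into
    # /etc/ceph/ceph.client.admin.keyring on the client node. With the
    # keyring in its default location, --name client.admin can be omitted:
    $ rbd map image01 --pool testbench

The "failed to add secret 'client.admin' to kernel" error typically means the kernel rbd client could not find or parse the key, so getting the keyring into the default path is the first thing to verify.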
[ceph-users] Poor ceph cluster performance
Hello,

I have a Ceph cluster deployed together with OpenStack using TripleO.
While the Ceph cluster shows a healthy status, its performance is
painfully slow. After eliminating the possibility of network issues, I
have zeroed in on the Ceph cluster itself, but have no experience in
further debugging and tuning.

The Ceph OSD part of the cluster uses 3 identical servers with the
following specifications:

CPU: 2 x E5-2603 @1.8GHz
RAM: 16GB
Network: 1G port shared for Ceph public and cluster traffic
Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

This is not beefy enough in any way, but I am running it for PoC only,
with minimum utilization.

Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller
nodes. Ceph-ansible version is 3.1 and is using Filestore with the
non-colocated scenario (1 SSD for every 2 OSDs). Connection speed among
Controllers, Computes, and OSD nodes can reach ~900Mbps, tested using
iperf.

I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
the following results:

Write Test:

Total time run:         80.313004
Total writes made:      17
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     0.846687
Stddev Bandwidth:       0.320051
Max bandwidth (MB/sec): 2
Min bandwidth (MB/sec): 0
Average IOPS:           0
Stddev IOPS:            0
Max IOPS:               0
Min IOPS:               0
Average Latency(s):     66.6582
Stddev Latency(s):      15.5529
Max latency(s):         80.3122
Min latency(s):         29.7059

Sequential Read Test:

Total time run:       25.951049
Total reads made:     17
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   2.62032
Average IOPS:         0
Stddev IOPS:          0
Max IOPS:             1
Min IOPS:             0
Average Latency(s):   24.4129
Max latency(s):       25.9492
Min latency(s):       0.117732

Random Read Test:

Total time run:       66.355433
Total reads made:     46
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   2.77295
Average IOPS:         0
Stddev IOPS:          3
Max IOPS:             27
Min IOPS:             0
Average Latency(s):   21.4531
Max latency(s):       66.1885
Min latency(s):       0.0395266

Apparently, the results are pathetic...

As I moved on to test block devices, I got the following error message:

# rbd map image01 --pool testbench --name client.admin
rbd: failed to add secret 'client.admin' to kernel

Any suggestions on the above error and/or debugging would be greatly
appreciated!

Thank you very much to all.

Cody

[1] https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance
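For readers without access to the linked guide: results in this format come from rados bench, roughly along the lines below (the pool name matches the one used for rbd above; the PG count of 100 follows the guide's example and would need tuning for a real cluster):

    # Create a benchmark pool:
    $ ceph osd pool create testbench 100 100

    # 10-second write test, keeping the objects for the read tests:
    $ rados bench -p testbench 10 write --no-cleanup

    # Sequential and random read tests against those objects:
    $ rados bench -p testbench 10 seq
    $ rados bench -p testbench 10 rand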