Re: [ceph-users] Benchmark performance when using SSD as the journal
Hi Dave, have you looked at the Intel P4600 vs. the P4500? The P4600 has better random writes and a better drive-writes-per-day rating, I believe. Thanks, Joe
Re: [ceph-users] Benchmark performance when using SSD as the journal
Hi Dave, the main line in the SSD spec sheet you should look at is "Enhanced Power Loss Data Protection: Yes". That feature makes the SSD's cache effectively non-volatile, so the drive can safely ignore fsync()s and transactional write performance becomes equal to non-transactional. So your SSDs should be OK for a journal.

rados bench is a poor tool for this kind of testing because of its 4M default block size and the very small number of objects it creates. Better to test with fio -ioengine=rbd -bs=4k -rw=randwrite, using -sync=1 -iodepth=1 for latency or -iodepth=128 for maximum random load.

Another recent thing I discovered is that turning off the volatile write cache for all drives (for i in /dev/sd*; do hdparm -W 0 $i; done) increased write IOPS by an order of magnitude.
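A minimal sketch of those fio invocations against an RBD image, assuming a pool named rbd and a pre-created test image named fio-test (both names are placeholders for this example; the rbd ioengine drives the image directly through librbd):

# Latency-oriented: single queued, synced 4k random write
fio -ioengine=rbd -clientname=admin -pool=rbd -rbdname=fio-test -direct=1 -sync=1 -rw=randwrite -bs=4k -iodepth=1 -runtime=60 -time_based -name=rbd-sync-latency

# Throughput-oriented: deep queue for maximum random load
fio -ioengine=rbd -clientname=admin -pool=rbd -rbdname=fio-test -direct=1 -rw=randwrite -bs=4k -iodepth=128 -runtime=60 -time_based -name=rbd-max-randwrite

# Disable the volatile write cache on every SATA drive, as suggested above
for i in /dev/sd*; do hdparm -W 0 $i; done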
Re: [ceph-users] Benchmark performance when using SSD as the journal
Hi Roos, I will try with the configuration, thank you very much! Best Regards, Dave Chen
Re: [ceph-users] Benchmark performance when using SSD as the journal
Thanks Mokhtar! This is what I am looking for, thanks for your explanation! Best Regards, Dave Chen
Re: [ceph-users] Benchmark performance when using SSD as the journal
Try comparing results from something like this test:

[global]
ioengine=posixaio
invalidate=1
ramp_time=30
iodepth=1
runtime=180
time_based
direct=1
filename=/mnt/cephfs/ssd/fio-bench.img

[write-4k-seq]
stonewall
bs=4k
rw=write
#write_bw_log=sdx-4k-write-seq.results
#write_iops_log=sdx-4k-write-seq.results

[randwrite-4k-seq]
stonewall
bs=4k
rw=randwrite
#write_bw_log=sdx-4k-randwrite-seq.results
#write_iops_log=sdx-4k-randwrite-seq.results

[read-4k-seq]
stonewall
bs=4k
rw=read
#write_bw_log=sdx-4k-read-seq.results
#write_iops_log=sdx-4k-read-seq.results

[randread-4k-seq]
stonewall
bs=4k
rw=randread
#write_bw_log=sdx-4k-randread-seq.results
#write_iops_log=sdx-4k-randread-seq.results

[rw-4k-seq]
stonewall
bs=4k
rw=rw
#write_bw_log=sdx-4k-rw-seq.results
#write_iops_log=sdx-4k-rw-seq.results

[randrw-4k-seq]
stonewall
bs=4k
rw=randrw
#write_bw_log=sdx-4k-randrw-seq.results
#write_iops_log=sdx-4k-randrw-seq.results

[write-128k-seq]
stonewall
bs=128k
rw=write
#write_bw_log=sdx-128k-write-seq.results
#write_iops_log=sdx-128k-write-seq.results

[randwrite-128k-seq]
stonewall
bs=128k
rw=randwrite
#write_bw_log=sdx-128k-randwrite-seq.results
#write_iops_log=sdx-128k-randwrite-seq.results

[read-128k-seq]
stonewall
bs=128k
rw=read
#write_bw_log=sdx-128k-read-seq.results
#write_iops_log=sdx-128k-read-seq.results

[randread-128k-seq]
stonewall
bs=128k
rw=randread
#write_bw_log=sdx-128k-randread-seq.results
#write_iops_log=sdx-128k-randread-seq.results

[rw-128k-seq]
stonewall
bs=128k
rw=rw
#write_bw_log=sdx-128k-rw-seq.results
#write_iops_log=sdx-128k-rw-seq.results

[randrw-128k-seq]
stonewall
bs=128k
rw=randrw
#write_bw_log=sdx-128k-randrw-seq.results
#write_iops_log=sdx-128k-randrw-seq.results

[write-1024k-seq]
stonewall
bs=1024k
rw=write
#write_bw_log=sdx-1024k-write-seq.results
#write_iops_log=sdx-1024k-write-seq.results

[randwrite-1024k-seq]
stonewall
bs=1024k
rw=randwrite
#write_bw_log=sdx-1024k-randwrite-seq.results
#write_iops_log=sdx-1024k-randwrite-seq.results

[read-1024k-seq]
stonewall
bs=1024k
rw=read
#write_bw_log=sdx-1024k-read-seq.results
#write_iops_log=sdx-1024k-read-seq.results

[randread-1024k-seq]
stonewall
bs=1024k
rw=randread
#write_bw_log=sdx-1024k-randread-seq.results
#write_iops_log=sdx-1024k-randread-seq.results

[rw-1024k-seq]
stonewall
bs=1024k
rw=rw
#write_bw_log=sdx-1024k-rw-seq.results
#write_iops_log=sdx-1024k-rw-seq.results

[randrw-1024k-seq]
stonewall
bs=1024k
rw=randrw
#write_bw_log=sdx-1024k-randrw-seq.results
#write_iops_log=sdx-1024k-randrw-seq.results

[write-4096k-seq]
stonewall
bs=4096k
rw=write
#write_bw_log=sdx-4096k-write-seq.results
#write_iops_log=sdx-4096k-write-seq.results

[randwrite-4096k-seq]
stonewall
bs=4096k
rw=randwrite
#write_bw_log=sdx-4096k-randwrite-seq.results
#write_iops_log=sdx-4096k-randwrite-seq.results

[read-4096k-seq]
stonewall
bs=4096k
rw=read
#write_bw_log=sdx-4096k-read-seq.results
#write_iops_log=sdx-4096k-read-seq.results

[randread-4096k-seq]
stonewall
bs=4096k
rw=randread
#write_bw_log=sdx-4096k-randread-seq.results
#write_iops_log=sdx-4096k-randread-seq.results

[rw-4096k-seq]
stonewall
bs=4096k
rw=rw
#write_bw_log=sdx-4096k-rw-seq.results
#write_iops_log=sdx-4096k-rw-seq.results

[randrw-4096k-seq]
stonewall
bs=4096k
rw=randrw
#write_bw_log=sdx-4096k-randrw-seq.results
#write_iops_log=sdx-4096k-randrw-seq.results
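To run it, save the jobs above to a file (here called cephfs-bench.fio, a name chosen just for this example); the stonewall flag makes each job wait for the previous one to finish, so the jobs run one at a time:

fio cephfs-bench.fio --output=cephfs-bench.txt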
Re: [ceph-users] Benchmark performance when using SSD as the journal
Hi Dave, the SSD journal will help boost IOPS and latency, which will be more apparent at small block sizes. The rados bench default block size is 4M; use the -b option to specify the size. Try 4k, 32k, 64k, and so on. As a side note, this is a RADOS-level test, so the rbd image size is not relevant here. Maged.
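For example, a 4k write test followed by the read phases might look like the sketch below (the 60-second duration and the rbd pool name are placeholders; --no-cleanup keeps the written objects around so the seq/rand read phases have something to read):

rados bench -p rbd 60 write -b 4096 --no-cleanup
rados bench -p rbd 60 seq
rados bench -p rbd 60 rand
# remove the benchmark objects afterwards
rados -p rbd cleanup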
Re: [ceph-users] Benchmark performance when using SSD as the journal
Thanks Martin for your suggestion! I will definitely try BlueStore later. The version of Ceph I am using is v10.2.10 (Jewel); do you think BlueStore is stable enough on Jewel, or should I upgrade Ceph to Luminous? Best Regards, Dave Chen
Re: [ceph-users] Benchmark performance when using SSD as the journal
Thanks Merrick! I haven't tried BlueStore yet, but I believe what you said. I tried again with "rbd bench-write" on filestore, and the result shows more than a 50% performance increase with the SSD as the journal, so I still cannot understand why "rados bench" shows no difference. What is the rationale behind that? Do you know? Best Regards, Dave Chen
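For reference, a Jewel-era rbd bench-write invocation along these lines might look like the sketch below (the image name test-img and the sizes are placeholders, not the values actually used in the test above):

rbd bench-write rbd/test-img --io-size 4096 --io-threads 16 --io-total 1G --io-pattern rand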
Re: [ceph-users] Benchmark performance when using SSD as the journal
Please never use the datasheet values to select your SSD. We have never had a single one that delivers the advertised performance in a Ceph journal use case. However, do not use filestore anymore, especially with newer kernel versions; use BlueStore instead. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io YouTube: https://goo.gl/PGE1Bx
Re: [ceph-users] Benchmark performance when using SSD as the journal
Well, as you mentioned journals, I guess you were using filestore in your test? You could go down the route of BlueStore and put the WAL + DB onto the SSD and the BlueStore data onto the HDD; you should notice an increase in performance over both methods you have tried on filestore.
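A hedged sketch of how that layout could be created with ceph-volume (available from Luminous onward, so not on the Jewel cluster described above; the device names are placeholders and the SSD partitions must already exist):

# HDD holds the BlueStore data, SSD partitions hold the RocksDB and WAL
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdf1 --block.wal /dev/sdf2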
Re: [ceph-users] Benchmark performance when using SSD as the journal
Thanks Merrick!

I checked the Intel spec [1]; the performance Intel quotes is:

· Sequential Read (up to) 500 MB/s
· Sequential Write (up to) 330 MB/s
· Random Read (100% Span) 72000 IOPS
· Random Write (100% Span) 2 IOPS

I think these indicators should be much better than a general HDD, and I have run read and write commands with "rados bench" respectively, so there should be some difference. And is there any kind of configuration that could give us a performance gain with this SSD (Intel S4500)?

[1] https://ark.intel.com/products/120521/Intel-SSD-DC-S4500-Series-480GB-2-5in-SATA-6Gb-s-3D1-TLC-

Best Regards, Dave Chen
Re: [ceph-users] Benchmark performance when using SSD as the journal
Only certain SSDs are good for Ceph journals, as can be seen at https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

The SSD you're using isn't listed, but from a quick search online it appears to be an SSD designed for read workloads as an "upgrade" from an HDD, so it is probably not designed for the high write requirements a journal demands. Therefore, when it's being hit by the workload of 3 OSDs, you're not going to get much more performance out of it than you would just using the disk, which is what you're seeing.
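The linked article gauges journal suitability with single-threaded synchronous 4k writes to the raw device; a minimal sketch along those lines is below (the device path /dev/sdX is a placeholder, and the test overwrites data on it, so use a scratch drive):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test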
[ceph-users] Benchmark performance when using SSD as the journal
Hi all,

We want to compare the performance of an HDD partition as the journal (inline on the OSD disk) versus an SSD partition as the journal. Here is what we have done: we have 3 nodes used as Ceph OSD nodes, each with 3 OSDs on it. First, we created the OSDs with the journal on a partition of the OSD disk and ran the "rados bench" utility to test the performance; we then migrated the journal from the HDD to an SSD (Intel S4500) and ran "rados bench" again. The expected result was that the SSD journal should be much better than the HDD, but the results show nearly no change.

The configuration of Ceph is as below:
pool size: 3
osd size: 3*3
pg (pgp) num: 300
osd nodes are separated across three different nodes
rbd image size: 10G (10240M)

The utility I used is:
rados bench -p rbd $duration write
rados bench -p rbd $duration seq
rados bench -p rbd $duration rand

Is there anything wrong with what I did? Could anyone give me some suggestions?

Best Regards,
Dave Chen
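A quick sanity check that the journal migration actually took effect; this is a sketch that assumes the default filestore OSD layout created by ceph-disk on Jewel:

# the journal symlink of each OSD should now point at the SSD partition
ls -l /var/lib/ceph/osd/ceph-*/journal

# shows which journal partition each OSD is using
ceph-disk list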