Re: [ceph-users] qemu-img convert vs rbd import performance
It's already in qemu 2.9:
http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d

"This patch introduces 2 new cmdline parameters. The -m parameter to
specify the number of coroutines running in parallel (defaults to 8). And
the -W parameter to allow qemu-img to write to the target out of order
rather than sequentially. This improves performance as the writes do not
have to wait for each other to complete."

And performance increased dramatically! I ran it with Luminous and qemu
2.9.0 (the graph is from the host running qemu-img and shows network
bandwidth to the ceph cluster):
http://storage6.static.itmages.ru/i/17/1223/h_1514004003_2271300_d3ee031fda.png

From 11:05 to 11:28: 35% of 100 GB. I started googling for news about
qemu, found this message, and appended -m 16 -W. Network interface
utilization rose from ~150 Mbit/s to ~2500 Mbit/s (this is a convert from
one rbd pool to another).

k
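For reference, a convert between two rbd pools with those flags would look
roughly like this (a sketch; pool and image names are placeholders, and
-m/-W require qemu-img >= 2.9):

    qemu-img convert -p -f raw -O raw -m 16 -W \
        rbd:pool-a/source-image rbd:pool-b/target-image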
Re: [ceph-users] qemu-img convert vs rbd import performance
>> Is there anything changed from Hammer to Jewel that might be affecting
>> the qemu-img convert performance?

Maybe the object map for exclusive lock? (I think it could be a little bit
slower when the objects are created first.)

You could test it: create the target rbd volume, disable exclusive-lock
and object-map, and try qemu-img convert.
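One way the suggested test might look -- a sketch, not verified: image
name and size are placeholders, fast-diff and object-map are disabled
before exclusive-lock because they depend on it, and the -n flag (skip
creation of the pre-made target volume) needs a newer qemu-img than the
1.5.3 mentioned later in this thread:

    rbd create volumes/convert-test --size 20480 --image-format 2
    rbd feature disable volumes/convert-test fast-diff
    rbd feature disable volumes/convert-test object-map
    rbd feature disable volumes/convert-test exclusive-lock
    qemu-img convert -p -n -f raw -O raw /path/to/source.raw rbd:volumes/convert-test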
Re: [ceph-users] qemu-img convert vs rbd import performance
Thanks Alexandre!

We were using ceph Hammer before and we never had these performance issues
with qemu-img convert. Is there anything changed from Hammer to Jewel that
might be affecting the qemu-img convert performance?

On Fri, Jul 21, 2017 at 2:24 PM, Alexandre DERUMIER <aderum...@odiso.com> wrote:
> It's already in qemu 2.9
> http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d
> [...]
-- 
Regards,
mahesh j
Re: [ceph-users] qemu-img convert vs rbd import performance
It's already in qemu 2.9:
http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d

"This patch introduces 2 new cmdline parameters. The -m parameter to
specify the number of coroutines running in parallel (defaults to 8). And
the -W parameter to allow qemu-img to write to the target out of order
rather than sequentially. This improves performance as the writes do not
have to wait for each other to complete."
Re: [ceph-users] qemu-img convert vs rbd import performance
Hi,

there is an RFC here:

"[RFC] qemu-img: make convert async"
https://patchwork.kernel.org/patch/9552415/

maybe it could help
Re: [ceph-users] qemu-img convert vs rbd import performance
Thanks for the information Jason! We have a few concerns:

1. Following is our ceph configuration. Is there something that needs to
   be changed here?

   # cat /etc/ceph/ceph.conf
   [global]
   fsid = 0e1bd4fe-4e2d-4e30-8bc5-cb94ecea43f0
   mon_initial_members = cephlarge
   mon_host = 10.0.0.188
   auth_cluster_required = cephx
   auth_service_required = cephx
   auth_client_required = cephx
   osd pool default size = 2
   public network = 10.0.0.0/16
   osd max object name len = 256
   osd max object namespace len = 64

   [client]
   rbd cache = true
   rbd readahead trigger requests = 5
   rbd readahead max bytes = 419430400
   rbd readahead disable after bytes = 0
   rbd_concurrent_management_ops = 50

2. We are using an ext4 filesystem for ceph. Does this hamper the write
   performance of qemu-img convert?

3. qemu-img-1.5.3-126.el7_3.10.x86_64 is the version we are using.

4. Ceph version is Jewel v10.

5. Is there a way we can control latency so that qemu-img performance can
   be increased?

Please provide your suggestions.

On Thu, Jul 20, 2017 at 6:50 PM, Jason Dillaman wrote:
> Running a similar 20G import test within a single OSD VM-based cluster,
> I see the following: [...]

-- 
Regards,
mahesh j
Re: [ceph-users] qemu-img convert vs rbd import performance
Running a similar 20G import test within a single OSD VM-based cluster, I
see the following:

$ time qemu-img convert -p -O raw -f raw ~/image rbd:rbd/image
    (100.00/100%)

real 3m20.722s
user 0m18.859s
sys 0m20.628s

$ time rbd import ~/image
Importing image: 100% complete...done.

real 2m11.907s
user 0m12.236s
sys 0m20.971s

Examining the IO patterns from qemu-img, I can see that it is effectively
using synchronous IO (i.e. only a single write is in-flight at a time),
whereas "rbd import" will send up to 10 (by default) IO requests
concurrently. Therefore, the higher the latencies to your cluster, the
worse qemu-img will perform as compared to "rbd import".

On Thu, Jul 20, 2017 at 5:07 AM, Mahesh Jambhulkar
<mahesh.jambhul...@trilio.io> wrote:
> Adding rbd readahead disable after bytes = 0 did not help. [...]

-- 
Jason
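The queue depth "rbd import" uses comes from rbd_concurrent_management_ops
and can be overridden per run on the command line (as done later in this
thread). A sketch that raises it above the default of 10 -- file and pool
names are placeholders:

    time rbd import ~/image --dest-pool rbd --rbd-concurrent-management-ops=20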
Re: [ceph-users] qemu-img convert vs rbd import performance
Adding "rbd readahead disable after bytes = 0" did not help.

[root@cephlarge mnt]# time qemu-img convert -p -O raw /mnt/data/workload_326e8a43-a90a-4fe9-8aab-6d33bcdf5a05/snapshot_9f0cee13-8200-4562-82ec-1fb9f234bcd8/vm_id_05e9534e-5c84-4487-9613-1e0e227e4c1a/vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb/66582225-6539-4e5e-9b7a-59aa16739df1 rbd:volumes/24291e4b-93d2-47ad-80a8-bf3c395319b9
    (100.00/100%)

real    4858m13.822s
user    73m39.656s
sys     32m11.891s

It took 80 hours to complete.

Also, it's not feasible to test this with the huge 465 GB file every time,
so I tested qemu-img convert with a 20 GB file:

Parameters                                        Time taken
-t writeback                                      38 mins
-t none                                           38 mins
-S 4k                                             38 mins
With client options mentioned by Irek Fasikhov    40 mins

The time taken is almost the same.

On Thu, Jul 13, 2017 at 6:40 PM, Jason Dillaman <jdill...@redhat.com> wrote:
> On Thu, Jul 13, 2017 at 8:57 AM, Irek Fasikhov <malm...@gmail.com> wrote:
> > rbd readahead disable after bytes = 0
>
> There isn't any reading from an RBD image in this example -- plus
> readahead disables itself automatically after the first 50 MBs of IO
> (i.e. after the OS should have had enough time to start its own
> readahead logic).
>
> --
> Jason

-- 
Regards,
mahesh j
Re: [ceph-users] qemu-img convert vs rbd import performance
On Thu, Jul 13, 2017 at 8:57 AM, Irek Fasikhov wrote:
> rbd readahead disable after bytes = 0

There isn't any reading from an RBD image in this example -- plus
readahead disables itself automatically after the first 50 MBs of IO
(i.e. after the OS should have had enough time to start its own readahead
logic).

-- 
Jason
Re: [ceph-users] qemu-img convert vs rbd import performance
I'll refer you to the original thread about this [1] that was awaiting an
answer.

I would recommend dropping the "-t none" option, since that might severely
slow down sequential write operations if "qemu-img convert" is performing
512-byte IO operations. You might also want to consider adding the "-S 4k"
option to potentially re-sparsify a non-sparse image (i.e. so it doesn't
waste time writing zeroes to the RBD image).

[1] https://www.spinics.net/lists/ceph-users/msg37064.html

On Thu, Jul 13, 2017 at 8:29 AM, Mahesh Jambhulkar wrote:
> Seeing some performance issues on my ceph cluster with qemu-img convert
> writing directly to ceph, compared with the normal rbd import command. [...]
-- 
Jason
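Applying that advice to the command from the original report would give
something like the following sketch -- no "-t none", plus "-S 4k" (the
source path is shortened to a placeholder):

    time qemu-img convert -p -O raw -S 4k \
        /path/to/66582225-6539-4e5e-9b7a-59aa16739df1 \
        rbd:vms/volume-5ad883a0cd65435bb6ffbfa1243bbdc6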
Re: [ceph-users] qemu-img convert vs rbd import performance
Hi.

You need to add to the ceph.conf:

[client]
rbd cache = true
rbd readahead trigger requests = 5
rbd readahead max bytes = 419430400
rbd readahead disable after bytes = 0
rbd_concurrent_management_ops = 50

2017-07-13 15:29 GMT+03:00 Mahesh Jambhulkar:
> Seeing some performance issues on my ceph cluster with qemu-img convert
> writing directly to ceph, compared with the normal rbd import command. [...]
[ceph-users] qemu-img convert vs rbd import performance
Seeing some performance issues on my ceph cluster with qemu-img convert
writing directly to ceph, compared with the normal rbd import command.

Direct data copy (without qemu-img convert) took 5 hours 43 minutes for
465 GB of data:

[root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# time rbd import 66582225-6539-4e5e-9b7a-59aa16739df1 -p volumes 66582225-6539-4e5e-9b7a-59aa16739df1_directCopy --image-format 2
rbd: --pool is deprecated for import, use --dest-pool
Importing image: 100% complete...done.

real    343m38.028s
user    4m40.779s
sys     7m18.916s

[root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# rbd info volumes/66582225-6539-4e5e-9b7a-59aa16739df1_directCopy
rbd image '66582225-6539-4e5e-9b7a-59aa16739df1_directCopy':
        size 465 GB in 119081 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.373174b0dc51
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:

qemu-img convert is still in progress and has completed merely 10% in more
than 40 hours (for the same 465 GB of data):

[root@cephlarge mnt]# time qemu-img convert -p -t none -O raw /mnt/data/workload_326e8a43-a90a-4fe9-8aab-6d33bcdf5a05/snapshot_9f0cee13-8200-4562-82ec-1fb9f234bcd8/vm_id_05e9534e-5c84-4487-9613-1e0e227e4c1a/vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb/66582225-6539-4e5e-9b7a-59aa16739df1 rbd:volumes/24291e4b-93d2-47ad-80a8-bf3c395319b9
    (0.00/100%)
    (10.00/100%)

rbd bench-write shows a speed of ~21 MB/s:

[root@cephlarge ~]# rbd bench-write image01 --pool=rbdbench
bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC     BYTES/SEC
    2      6780   3133.53   12834946.35
    3      6831   1920.65    7866998.17
    4      8896   2040.50    8357871.83
    5     13058   2562.61   10496432.34
    6     17225   2836.78   11619432.99
    7     20345   2736.84   11210076.25
    8     23534   3761.57   15407392.94
    9     25689   3601.35   14751109.98
   10     29670   3391.53   13891695.57
   11     33169   3218.29   13182107.64
   12     36356   3135.34   12842344.21
   13     38431   2972.62   12175863.99
   14     47780   4389.77   17980497.11
   15     55452   5156.40   21120627.26
   16     59298   4772.32   19547440.33
   17     61437   5151.20   21099315.94
   18     67702   5861.64   24009295.97
   19     77086   5895.03   24146032.34
   20     85474   5936.09   24314243.88
   21     93848   7499.73   30718898.25
   22    100115   7783.39   31880760.34
   23    105405   7524.76   30821410.70
   24    111677   6797.12   27841003.78
   25    116971   6274.51   25700386.48
   26    121156   5468.77   22400087.81
   27    126484   5345.83   21896515.02
   28    137937   6412.41   26265239.30
   29    143229   6347.28   25998461.13
   30    149505   6548.76   26823729.97
   31    159978   7815.37   32011752.09
   32    171431   8821.65   36133479.15
   33    181084   8795.28   36025472.27
   35    182856   6322.41   25896605.75
   36    186891   5592.25   22905872.73
   37    190906   4876.30   19973339.07
   38    190943   3076.87   12602853.89
   39    190974   1536.79    6294701.64
   40    195323   2344.75    9604081.07
   41    198479   2703.00   11071492.89
   42    208893   3918.55   16050365.70
   43    214172   4702.42   19261091.89
   44    215263   5167.53   21166212.98
   45    219435   5392.57   22087961.94
   46    225731   5242.85   21474728.85
   47    234101   5009.43   20518607.70
   48    243529   6326.00   25911280.08
   49    254058   7944.90   32542315.10
elapsed:    50  ops:   262144  ops/sec:  5215.19  bytes/sec: 21361431.86
[root@cephlarge ~]#

This CEPH deployment has 2 OSDs.

It would be of great help if anyone can give me pointers.

-- 
Regards,
mahesh j
Re: [ceph-users] qemu-img convert vs rbd import performance
Perhaps just one cluster has low latency and the other has excessively
high latency? You can use "rbd bench-write" to verify.

On Wed, Jun 28, 2017 at 8:04 PM, Murali Balcha wrote:
> We will give it a try. I have another cluster of similar configuration
> and the converts are working fine. We have not changed any queue depth
> setting on that setup either. If it turns out to be queue depth, how can
> we set the queue depth for the qemu-img convert operation?
>
> Thank you.
>
> Sent from my iPhone
>
>> On Jun 28, 2017, at 7:56 PM, Jason Dillaman wrote:
>>
>> Given that your time difference is roughly 10x, best guess is that
>> qemu-img is sending the IO operations synchronously (queue depth = 1),
>> whereas, by default, "rbd import" will send up to 10 write requests in
>> parallel to the backing OSDs. [...]

-- 
Jason
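A quick way to run that comparison on each cluster -- a sketch using a
throwaway image (pool/image names and size are placeholders; the io_size
and io_threads values match the benchmark shown later in this thread):

    rbd create rbdbench/image01 --size 1024
    rbd bench-write image01 --pool=rbdbench --io-size 4096 --io-threads 16
    rbd rm rbdbench/image01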
Re: [ceph-users] qemu-img convert vs rbd import performance
Given that your time difference is roughly 10x, my best guess is that
qemu-img is sending the IO operations synchronously (queue depth = 1),
whereas, by default, "rbd import" will send up to 10 write requests in
parallel to the backing OSDs. That assumes you have really high latency.
You can re-run "rbd import" with "--rbd-concurrent-management-ops=1" to
change your queue depth to 1 and see if it's similar to the qemu-img
runtime.

On Wed, Jun 28, 2017 at 5:46 PM, Murali Balcha wrote:
> Need some help resolving performance issues on my ceph cluster. We are
> running into acute performance issues when we use qemu-img convert;
> however, the rbd import operation works perfectly alright. [...]

-- 
Jason
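A sketch of that check, reusing the file from the earlier import test (the
destination image name is a placeholder):

    time rbd import --rbd-concurrent-management-ops=1 \
        66582225-6539-4e5e-9b7a-59aa16739df1 --dest-pool volumes \
        66582225-6539-4e5e-9b7a-59aa16739df1_qd1 --image-format 2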
[ceph-users] qemu-img convert vs rbd import performance
Need some help resolving performance issues on my ceph cluster. We are
running into acute performance issues when we use qemu-img convert;
however, the rbd import operation works perfectly alright. Please ignore
the image format for a minute. I am trying to understand why rbd import
performs well on the same cluster while the qemu-img convert operation
takes an inordinate amount of time. Here are the performance numbers:

1. The qemu-img convert command for 465 GB of data took more than 48 hours
to copy the image to ceph:

[root@redhat-compute4 ~]# qemu-img convert -p -t none -O raw /var/triliovault-mounts/MTAuMC4wLjc3Oi92YXIvbmZzX3NoYXJl/workload_326e8a43-a90a-4fe9-8aab-6d33bcdf5a05/snapshot_9f0cee13-8200-4562-82ec-1fb9f234bcd8/vm_id_05e9534e-5c84-4487-9613-1e0e227e4c1a/vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb/66582225-6539-4e5e-9b7a-59aa16739df1 rbd:vms/volume-5ad883a0cd65435bb6ffbfa1243bbdc6
    (100.00/100%)
You have new mail in /var/spool/mail/root
[root@redhat-compute4 ~]#

2. Just copying the file to ceph (without qemu-img convert) took only
3 hours 18 minutes:

[root@redhat-compute4 vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# time rbd import 66582225-6539-4e5e-9b7a-59aa16739df1 -p volumes 66582225-6539-4e5e-9b7a-59aa16739df1 --image-format 2
Importing image: 100% complete...done.

real    198m9.069s
user    5m32.724s
sys     18m32.213s

[root@redhat-compute4 vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# rbd info volumes/66582225-6539-4e5e-9b7a-59aa16739df1
rbd image '66582225-6539-4e5e-9b7a-59aa16739df1':
        size 465 GB in 119081 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.753102ae8944a
        format: 2
        features: layering
        flags:

I would appreciate it if anyone can give me pointers on where to look.

Best,
Murali Balcha
O 508.233.3912 | M 508.494.5007 | murali.bal...@trilio.io | trilio.io