[Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Signed-off-by: Peter Lieven p...@kamp.de --- qemu-img-cmds.hx |4 ++-- qemu-img.c | 25 - qemu-img.texi|4 +++- 3 files changed, 25 insertions(+), 8 deletions(-) diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index da1d965..e0b8ab4 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -34,9 +34,9 @@ STEXI ETEXI DEF(convert, img_convert, -convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] filename [filename2 [...]] output_filename) +convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] [-m iobuf_size] filename [filename2 [...]] output_filename) STEXI -@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} +@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] [-m @var{iobuf_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} ETEXI DEF(info, img_info, diff --git a/qemu-img.c b/qemu-img.c index e2d1a0a..0ce5d14 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -105,6 +105,7 @@ static void help(void) conversion. If the number of bytes is 0, the source will not be scanned for\n unallocated or zero sectors, and the destination image will always be\n fully allocated\n + '-m' specify I/O buffer size in bytes (default 2M)\n '--output' takes the format in which the output must be done (human or json)\n '-n' skips the target volume creation (useful if the volume is created\n prior to running qemu-img)\n @@ -1135,6 +1136,7 @@ static int img_convert(int argc, char **argv) sector_num_next_status = 0; uint64_t bs_sectors; uint8_t * buf = NULL; +size_t bufsectors = IO_BUF_SIZE / BDRV_SECTOR_SIZE; const uint8_t *buf1; BlockDriverInfo bdi; QEMUOptionParameter *param = NULL, *create_options = NULL; @@ -1152,7 +1154,7 @@ static int img_convert(int argc, char **argv) compress = 0; skip_create = 0; for(;;) { -c = getopt(argc, argv, f:O:B:s:hce6o:pS:t:qn); +c = getopt(argc, argv, f:O:B:s:hce6o:pS:m:t:qn); if (c == -1) { break; } @@ -1200,6 +1202,19 @@ static int img_convert(int argc, char **argv) min_sparse = sval / BDRV_SECTOR_SIZE; break; } +case 'm': +{ +int64_t sval; +char *end; +sval = strtosz_suffix(optarg, end, STRTOSZ_DEFSUFFIX_B); +if (sval BDRV_SECTOR_SIZE || *end) { +error_report(Invalid I/O buffer size specified); +return 1; +} + +bufsectors = sval / BDRV_SECTOR_SIZE; +break; +} case 'p': progress = 1; break; @@ -1371,7 +1386,7 @@ static int img_convert(int argc, char **argv) bs_i = 0; bs_offset = 0; bdrv_get_geometry(bs[0], bs_sectors); -buf = qemu_blockalign(out_bs, IO_BUF_SIZE); +buf = qemu_blockalign(out_bs, bufsectors * BDRV_SECTOR_SIZE); if (skip_create) { int64_t output_length = bdrv_getlength(out_bs); @@ -1394,7 +1409,7 @@ static int img_convert(int argc, char **argv) goto out; } cluster_size = bdi.cluster_size; -if (cluster_size = 0 || cluster_size IO_BUF_SIZE) { +if (cluster_size = 0 || cluster_size bufsectors * BDRV_SECTOR_SIZE) { error_report(invalid cluster size); ret = -1; goto out; @@ -1531,8 +1546,8 @@ static int img_convert(int argc, char **argv) sector_num_next_status = sector_num + n1; } -if (nb_sectors = (IO_BUF_SIZE / 512)) { -n = (IO_BUF_SIZE / 512); +if (nb_sectors = bufsectors) { +n = bufsectors; } else { n = nb_sectors; } diff --git a/qemu-img.texi b/qemu-img.texi index da36975..87f9d0f 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -179,7 +179,7 @@ Error on reading data @end table -@item convert [-c] [-p] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} +@item convert [-c] [-p] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] [-m @var{iobuf_size}] @var{filename}
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
Il 25/11/2013 14:57, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. (It would be interesting to add a double-buffer to qemu-img convert, using asynchronous I/O...). Paolo Signed-off-by: Peter Lieven p...@kamp.de --- qemu-img-cmds.hx |4 ++-- qemu-img.c | 25 - qemu-img.texi|4 +++- 3 files changed, 25 insertions(+), 8 deletions(-) diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index da1d965..e0b8ab4 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -34,9 +34,9 @@ STEXI ETEXI DEF(convert, img_convert, -convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] filename [filename2 [...]] output_filename) +convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] [-m iobuf_size] filename [filename2 [...]] output_filename) STEXI -@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} +@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] [-m @var{iobuf_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} ETEXI DEF(info, img_info, diff --git a/qemu-img.c b/qemu-img.c index e2d1a0a..0ce5d14 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -105,6 +105,7 @@ static void help(void) conversion. If the number of bytes is 0, the source will not be scanned for\n unallocated or zero sectors, and the destination image will always be\n fully allocated\n + '-m' specify I/O buffer size in bytes (default 2M)\n '--output' takes the format in which the output must be done (human or json)\n '-n' skips the target volume creation (useful if the volume is created\n prior to running qemu-img)\n @@ -1135,6 +1136,7 @@ static int img_convert(int argc, char **argv) sector_num_next_status = 0; uint64_t bs_sectors; uint8_t * buf = NULL; +size_t bufsectors = IO_BUF_SIZE / BDRV_SECTOR_SIZE; const uint8_t *buf1; BlockDriverInfo bdi; QEMUOptionParameter *param = NULL, *create_options = NULL; @@ -1152,7 +1154,7 @@ static int img_convert(int argc, char **argv) compress = 0; skip_create = 0; for(;;) { -c = getopt(argc, argv, f:O:B:s:hce6o:pS:t:qn); +c = getopt(argc, argv, f:O:B:s:hce6o:pS:m:t:qn); if (c == -1) { break; } @@ -1200,6 +1202,19 @@ static int img_convert(int argc, char **argv) min_sparse = sval / BDRV_SECTOR_SIZE; break; } +case 'm': +{ +int64_t sval; +char *end; +sval = strtosz_suffix(optarg, end, STRTOSZ_DEFSUFFIX_B); +if (sval BDRV_SECTOR_SIZE || *end) { +error_report(Invalid I/O buffer size specified); +return 1; +} + +bufsectors = sval / BDRV_SECTOR_SIZE; +break; +} case 'p': progress = 1; break; @@ -1371,7 +1386,7 @@ static int img_convert(int argc, char **argv) bs_i = 0; bs_offset = 0; bdrv_get_geometry(bs[0], bs_sectors); -buf = qemu_blockalign(out_bs, IO_BUF_SIZE); +buf = qemu_blockalign(out_bs, bufsectors * BDRV_SECTOR_SIZE); if (skip_create) { int64_t output_length = bdrv_getlength(out_bs); @@ -1394,7 +1409,7 @@ static int img_convert(int argc, char **argv) goto out; } cluster_size = bdi.cluster_size; -if (cluster_size = 0 || cluster_size IO_BUF_SIZE) { +if (cluster_size = 0 || cluster_size bufsectors * BDRV_SECTOR_SIZE) { error_report(invalid cluster size); ret = -1; goto out; @@ -1531,8 +1546,8 @@ static int img_convert(int argc, char **argv) sector_num_next_status = sector_num + n1; } -if (nb_sectors = (IO_BUF_SIZE / 512)) { -n = (IO_BUF_SIZE / 512); +if (nb_sectors = bufsectors) { +n = bufsectors; } else { n = nb_sectors; } diff --git a/qemu-img.texi b/qemu-img.texi index da36975..87f9d0f 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -179,7 +179,7 @@ Error on reading data @end table
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
On 25.11.2013 15:54, Paolo Bonzini wrote: Il 25/11/2013 14:57, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. That would be possible. (It would be interesting to add a double-buffer to qemu-img convert, using asynchronous I/O...). That would be the best benefit of all. But I thought this would be too complicated. Peter Paolo Signed-off-by: Peter Lieven p...@kamp.de --- qemu-img-cmds.hx |4 ++-- qemu-img.c | 25 - qemu-img.texi|4 +++- 3 files changed, 25 insertions(+), 8 deletions(-) diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index da1d965..e0b8ab4 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -34,9 +34,9 @@ STEXI ETEXI DEF(convert, img_convert, -convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] filename [filename2 [...]] output_filename) +convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] [-S sparse_size] [-m iobuf_size] filename [filename2 [...]] output_filename) STEXI -@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} +@item convert [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] [-m @var{iobuf_size}] @var{filename} [@var{filename2} [...]] @var{output_filename} ETEXI DEF(info, img_info, diff --git a/qemu-img.c b/qemu-img.c index e2d1a0a..0ce5d14 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -105,6 +105,7 @@ static void help(void) conversion. If the number of bytes is 0, the source will not be scanned for\n unallocated or zero sectors, and the destination image will always be\n fully allocated\n + '-m' specify I/O buffer size in bytes (default 2M)\n '--output' takes the format in which the output must be done (human or json)\n '-n' skips the target volume creation (useful if the volume is created\n prior to running qemu-img)\n @@ -1135,6 +1136,7 @@ static int img_convert(int argc, char **argv) sector_num_next_status = 0; uint64_t bs_sectors; uint8_t * buf = NULL; +size_t bufsectors = IO_BUF_SIZE / BDRV_SECTOR_SIZE; const uint8_t *buf1; BlockDriverInfo bdi; QEMUOptionParameter *param = NULL, *create_options = NULL; @@ -1152,7 +1154,7 @@ static int img_convert(int argc, char **argv) compress = 0; skip_create = 0; for(;;) { -c = getopt(argc, argv, f:O:B:s:hce6o:pS:t:qn); +c = getopt(argc, argv, f:O:B:s:hce6o:pS:m:t:qn); if (c == -1) { break; } @@ -1200,6 +1202,19 @@ static int img_convert(int argc, char **argv) min_sparse = sval / BDRV_SECTOR_SIZE; break; } +case 'm': +{ +int64_t sval; +char *end; +sval = strtosz_suffix(optarg, end, STRTOSZ_DEFSUFFIX_B); +if (sval BDRV_SECTOR_SIZE || *end) { +error_report(Invalid I/O buffer size specified); +return 1; +} + +bufsectors = sval / BDRV_SECTOR_SIZE; +break; +} case 'p': progress = 1; break; @@ -1371,7 +1386,7 @@ static int img_convert(int argc, char **argv) bs_i = 0; bs_offset = 0; bdrv_get_geometry(bs[0], bs_sectors); -buf = qemu_blockalign(out_bs, IO_BUF_SIZE); +buf = qemu_blockalign(out_bs, bufsectors * BDRV_SECTOR_SIZE); if (skip_create) { int64_t output_length = bdrv_getlength(out_bs); @@ -1394,7 +1409,7 @@ static int img_convert(int argc, char **argv) goto out; } cluster_size = bdi.cluster_size; -if (cluster_size = 0 || cluster_size IO_BUF_SIZE) { +if (cluster_size = 0 || cluster_size bufsectors * BDRV_SECTOR_SIZE) { error_report(invalid cluster size); ret = -1; goto out; @@ -1531,8 +1546,8 @@ static int img_convert(int argc, char **argv) sector_num_next_status = sector_num + n1; } -if (nb_sectors = (IO_BUF_SIZE / 512)) { -n = (IO_BUF_SIZE / 512); +if (nb_sectors = bufsectors) { +n = bufsectors; } else { n = nb_sectors; } diff --git a/qemu-img.texi b/qemu-img.texi index da36975..87f9d0f 100644 ---
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
On 25.11.2013 15:54, Paolo Bonzini wrote: Il 25/11/2013 14:57, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. If you say patch 5 and 3 are ok. What could be done is to remove this knob and increase the iobuf_size to cluster_size if cluster_size is greater. I do not want to increase the default iobuf size to anything greater than 2MB. I do not know why this was choosen, but maybe there was a reason for it. The storages we use here have a very strange page size of 15MB. If I sent aligned 15MB chunks the performace is about +50% compared to the original qemu-img. Peter
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
Il 25/11/2013 16:07, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. If you say patch 5 and 3 are ok. What could be done is to remove this knob and increase the iobuf_size to cluster_size if cluster_size is greater. Yes, that makes sense because it avoids unnecessary COW. I do not want to increase the default iobuf size to anything greater than 2MB. I do not know why this was choosen, but maybe there was a reason for it. I think it is fine to increase it to the cluster_size or even to the optimal transfer length (new BlockLimits field). The storages we use here have a very strange page size of 15MB. If I sent aligned 15MB chunks the performace is about +50% compared to the original qemu-img. I think composing the optimal transfer length and optimal unmap granularity (cluster_size) will work well: * clamp maximum size to optimal transfer length * then, round final sector down to unmap granularity Paolo
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
On 25.11.2013 16:14, Paolo Bonzini wrote: Il 25/11/2013 16:07, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. If you say patch 5 and 3 are ok. What could be done is to remove this knob and increase the iobuf_size to cluster_size if cluster_size is greater. Yes, that makes sense because it avoids unnecessary COW. I do not want to increase the default iobuf size to anything greater than 2MB. I do not know why this was choosen, but maybe there was a reason for it. I think it is fine to increase it to the cluster_size or even to the optimal transfer length (new BlockLimits field). okay scsi speaking: bs-bl.optimal_transfer_length = max(iscsilun-bl.opt_xfer_len, iscsilun-bl.opt_unmap_gran) ? bdi-cluster_size as in Patch 3? The storages we use here have a very strange page size of 15MB. If I sent aligned 15MB chunks the performace is about +50% compared to the original qemu-img. I think composing the optimal transfer length and optimal unmap granularity (cluster_size) will work well: * clamp maximum size to optimal transfer length. or increase it if its larger. for the dell equallogic storages we use the opt_unmap_gran is 15MB!!! * then, round final sector down to unmap granularity okay, i will try your modification to only round down the last sector. would you use bdi-cluster_size or the unmap alignment field from the bs-bl? Peter
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
Il 25/11/2013 16:24, Peter Lieven ha scritto: On 25.11.2013 16:14, Paolo Bonzini wrote: Il 25/11/2013 16:07, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. If you say patch 5 and 3 are ok. What could be done is to remove this knob and increase the iobuf_size to cluster_size if cluster_size is greater. Yes, that makes sense because it avoids unnecessary COW. I do not want to increase the default iobuf size to anything greater than 2MB. I do not know why this was choosen, but maybe there was a reason for it. I think it is fine to increase it to the cluster_size or even to the optimal transfer length (new BlockLimits field). okay scsi speaking: bs-bl.optimal_transfer_length = max(iscsilun-bl.opt_xfer_len, iscsilun-bl.opt_unmap_gran) ? I was thinking more of bs-bl.optimal_transfer_length = lun2qemu(iscsilun-bl.opt_xfer_len); ... iobuf_size = max(bs-bl.optimal_transfer_length, cluster_sectors)*512; iobuf_size = min(16MB, max(2MB, iobuf_size)); bdi-cluster_size as in Patch 3? Yes, that one's fine. * clamp maximum size to optimal transfer length. or increase it if its larger. for the dell equallogic storages we use the opt_unmap_gran is 15MB!!! You're right, maximum size is really bounded anyway by iobuf_size. * then, round final sector down to unmap granularity okay, i will try your modification to only round down the last sector. would you use bdi-cluster_size or the unmap alignment field from the bs-bl? bdi-cluster_size. The qemu-img reason for the rounding is to avoid unnecessary COW, which is what bdi-cluster_size is about. It just so happens that it helps your benchmark too. :) Paolo Peter
Re: [Qemu-devel] [PATCH 1.8 3/6] qemu-img: add option to specify alternate iobuffer size
On 25.11.2013 16:48, Paolo Bonzini wrote: Il 25/11/2013 16:24, Peter Lieven ha scritto: On 25.11.2013 16:14, Paolo Bonzini wrote: Il 25/11/2013 16:07, Peter Lieven ha scritto: since the convert process is basically a sync operation it might be benificial in some case to change the hardcoded I/O buffer size to an alternate (greater) value. Do you really need the extra knob? You can just add to BlockLimits the optimal transfer length, and use it unconditionally. If you say patch 5 and 3 are ok. What could be done is to remove this knob and increase the iobuf_size to cluster_size if cluster_size is greater. Yes, that makes sense because it avoids unnecessary COW. I do not want to increase the default iobuf size to anything greater than 2MB. I do not know why this was choosen, but maybe there was a reason for it. I think it is fine to increase it to the cluster_size or even to the optimal transfer length (new BlockLimits field). okay scsi speaking: bs-bl.optimal_transfer_length = max(iscsilun-bl.opt_xfer_len, iscsilun-bl.opt_unmap_gran) ? I was thinking more of bs-bl.optimal_transfer_length = lun2qemu(iscsilun-bl.opt_xfer_len); ... iobuf_size = max(bs-bl.optimal_transfer_length, cluster_sectors)*512; iobuf_size = min(16MB, max(2MB, iobuf_size)); okay. same result ;-) Peter ... KAMP Netzwerkdienste GmbH Vestische Str. 89-91 | 46117 Oberhausen Tel: +49 (0) 208.89 402-50 | Fax: +49 (0) 208.89 402-40 p...@kamp.de | http://www.kamp.de Geschäftsführer: Heiner Lante | Michael Lante Amtsgericht Duisburg | HRB Nr. 12154 USt-Id-Nr.: DE 120607556 ...