Re: [Qemu-devel] Regression: block: Add .bdrv_co_pwrite_zeroes()

Peter Lieven Thu, 21 Jul 2016 00:04:42 -0700

Hi Eric,

On 21.07.2016 at 01:35, Eric Blake wrote:

On 07/04/2016 07:49 AM, Peter Lieven wrote:

Hi,


the above commit:

commit d05aa8bb4a8b6aa9a915ec5074fb12ae632d2323
Author: Eric Blake <ebl...@redhat.com>
Date:   Wed Jun 1 15:10:03 2016 -0600

     block: Add .bdrv_co_pwrite_zeroes()

introduces a regression (at least for me).

The limits from the iSCSI Block Limits VPD are not required to be a
power of two.
We use Dell Equallogic iSCSI SANs, for instance. They have an internal
page size of 15MB, and they advertise this page size as max_ws_len,
opt_transfer_len and opt_discard_alignment.

Since I don't have access to this device, let me double check: if you
put a breakpoint in iscsi.c:iscsi_refresh_limits(), can you dump the
contents of the struct iscsilun->bl?  What is the block size of this
device (512, 4096, something else)?


I can choose between 512 and 4096. 512 is the default.

Here are the advertised limits in the Block Limits VPD:

$ iscsi-inq -e 1 -c $((0xb0)) iscsi://XXX/0
wsnz:0
maximum compare and write length:1
optimal transfer length granularity:0
maximum transfer length:0
optimal transfer length:0
maximum prefetch xdread xdwrite transfer length:0
maximum unmap lba count:30720
maximum unmap block descriptor count:2
optimal unmap granularity:30720
ugavalid:1
unmap granularity alignment:0
maximum write same length:30720
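
To put those numbers in bytes, here is a rough sanity check. It is only a sketch: it assumes the 512-byte default block size mentioned above, and the helper name is made up for illustration.

```python
# Values taken from the Block Limits VPD output above (units: logical blocks).
OPT_UNMAP_GRAN = 30720
MAX_WS_LEN = 30720

# Assumption: the LUN uses the default 512-byte block size.
BLOCK_SIZE = 512

MiB = 1024 * 1024


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (hypothetical helper)."""
    return n > 0 and (n & (n - 1)) == 0


gran_bytes = OPT_UNMAP_GRAN * BLOCK_SIZE
max_ws_bytes = MAX_WS_LEN * BLOCK_SIZE

print("optimal unmap granularity:", gran_bytes // MiB, "MiB")
print("maximum write same length:", max_ws_bytes // MiB, "MiB")
print("granularity is a power of two:", is_power_of_two(gran_bytes))
```

With a 4096-byte block size the byte values would differ, so the 15MB figure only follows from these counts under the 512-byte assumption.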


Also, while the device is advertising that the optimal discard alignment
is 15M, that does not tell me the minimum granularity that it can
actually discard.  Can you determine that value?  That is, if I try to
discard only 1M, does that actually result in a 1M allocation hole, or
is it ignored?  It sounds like qemu should be tracking 2 separate
values: the minimum discard granularity (I suspect this number is a
power of 2, at least the block size, and perhaps precisely equal to the
block size), and the maximum discard granularity that results in the
fewest/fastest discard of the entire device (not necessarily a power of
2).  Or, maybe that merely means that qemu's pdiscard_alignment should
be the MINIMUM granularity, and NOT the non-power-of-2
iscsilun->bl.opt_unmap_gran.


As far as I know there is no minimum discard granularity. Only optimum
and maximum.


Or put another way, I get that I can't discard more than 15M at a time.
But I highly suspect that I do not have to align my discard requests to
15M boundaries.  That is, if the discard granularity is 1M, then in
qemu-io, 'discard 1M 15M' should result in a 15M hole, and should be no
different from the result of 'discard 1M 14M; discard 15M 1M'.  But if
qemu sticks to pdiscard_alignment == iscsilun->bl.opt_unmap_gran of 15M,
then both operations mistakenly discard nothing (because it is not
aligned to a 15M boundary).
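
The arithmetic behind that claim can be sketched as follows. This is not QEMU's actual code, only an illustration that assumes a request is trimmed to the whole alignment-sized chunks it covers; aligned_subrange is a hypothetical helper.

```python
MiB = 1024 * 1024


def aligned_subrange(offset, length, align):
    """Return (start, length) of the part of [offset, offset + length)
    made up of whole align-sized chunks, or None if there is none.

    Uses plain division so that non-power-of-2 alignments work too.
    """
    start = -(-offset // align) * align          # round offset up
    end = ((offset + length) // align) * align   # round end down
    if end <= start:
        return None
    return start, end - start


# The three qemu-io requests from the text, as (offset, length) in bytes.
requests = [(1 * MiB, 15 * MiB), (1 * MiB, 14 * MiB), (15 * MiB, 1 * MiB)]

for align in (15 * MiB, 1 * MiB):
    for offset, length in requests:
        print(align // MiB, offset // MiB, length // MiB,
              aligned_subrange(offset, length, align))
```

Under the 15M alignment all three requests trim down to nothing, while under a 1M alignment each one passes through unchanged.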


I do not know what the storage does internally. But I agree the block
provisioning info will not change. However, if you issue a discard 1M 15M
and later a discard 0 1M, it might still report the first block as
unallocated afterwards.


Peter
