Re: [PATCH v3 02/24] modules: collect module meta-data
Oh well. Let's add a to-do marker. Paolo Il mer 23 giu 2021, 09:36 Gerd Hoffmann ha scritto: > On Tue, Jun 22, 2021 at 06:03:45PM +0200, Paolo Bonzini wrote: > > On 21/06/21 14:52, Gerd Hoffmann wrote: > > > ninja: error: 'libui-curses.a.p/meson-generated_.._config-host.h.o', > needed by 'ui-curses.modinfo.test', missing and no known rule to make it > > > > > > Hmm, not sure where this comes from. meson doesn't try to link > > > config-host.h.o into libui-curses.a, so why does extract_all_objects() > > > return it? > > > > > > Test patch (incremental to this series) below. > > > > Bug in Meson, fix at https://github.com/mesonbuild/meson/pull/8900. > You can > > just ignore missing files. > > Well, it's ninja throwing the error not the modinfo script, the script > doesn't even run ... > > take care, > Gerd > >
Re: [RFC PATCH 0/9] hw/sd: Allow card size not power of 2 again
On 210623 2000, Philippe Mathieu-Daudé wrote: > Hi Ubi-Wan Kenubi and Tom, > > In commit a9bcedd (SD card size has to be power of 2) we decided > to restrict SD card size to avoid security problems (CVE-2020-13253) > but this became not practical to some users. > > This RFC series tries to remove the limitation, keeping our > functional tests working. It is unfinished work because I had to > attend other topics, but sending it early as RFC to get feedback. > I'll keep working when I get more time, except if one if you can > help me. > > Alexander, could you generate a qtest reproducer with the fuzzer > corpus? See: https://bugs.launchpad.net/qemu/+bug/1878054 I think that bug was already fixed - the reproducer no logner causes a timeout on 6.0. Did I misunderstand something? I applied this series and ran the OSS-Fuzz corpus for the sdhci-v3 config. The only problem it found is this assert() (that exists without the patch anyways): https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=29225 Let me know if this is something you think I should report on gitlab.. I'll leave the fuzzer running for another 24h or so, but otherwise I'm happy to leave a Tested-by, once there is a V1 series -Alex > Thanks, > > Phil. > > Philippe Mathieu-Daudé (9): > hw/sd: When card is in wrong state, log which state it is > hw/sd: Extract address_in_range() helper, log invalid accesses > tests/acceptance: Tag NetBSD tests as 'os:netbsd' > tests/acceptance: Extract image_expand() helper > tests/acceptance: Use image_expand() in > test_arm_orangepi_uboot_netbsd9 > tests/acceptance: Use image_expand() in test_arm_orangepi_bionic_20_08 > tests/acceptance: Do not expand SD card image in test_arm_orangepi_sd > tests/acceptance: Remove now unused pow2ceil() > hw/sd: Allow card size not power of 2 again > > hw/sd/sd.c | 60 +- > tests/acceptance/boot_linux_console.py | 39 - > tests/acceptance/ppc_prep_40p.py | 2 + > 3 files changed, 52 insertions(+), 49 deletions(-) > > -- > 2.31.1 >
[PATCH] hw/sd: sdhci: Enable 64-bit system bus capability in the default SD/MMC host controller
The default SD/MMC host controller uses SD spec v2.00. 64-bit system bus capability was added in v2. In this change, we arrive at 0x157834b4 by computing (0x057834b4 | (1ul << 28)) where 28 represents the BUS64BIT SDHC_CAPAB field. Signed-off-by: Joanne Koong --- hw/sd/sdhci-internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h index e8c753d6d1..a76fc704e5 100644 --- a/hw/sd/sdhci-internal.h +++ b/hw/sd/sdhci-internal.h @@ -316,16 +316,16 @@ extern const VMStateDescription sdhci_vmstate; * - 3.3v and 1.8v voltages * - SDMA/ADMA1/ADMA2 * - high-speed + * - 64-bit system bus * max host controller R/W buffers size: 512B * max clock frequency for SDclock: 52 MHz * timeout clock frequency: 52 MHz * * does not support: * - 3.0v voltage - * - 64-bit system bus * - suspend/resume */ -#define SDHC_CAPAB_REG_DEFAULT 0x057834b4 +#define SDHC_CAPAB_REG_DEFAULT 0x157834b4 #define DEFINE_SDHCI_COMMON_PROPERTIES(_state) \ DEFINE_PROP_UINT8("sd-spec-version", _state, sd_spec_version, 2), \ -- 2.20.1
[RFC PATCH 6/9] tests/acceptance: Use image_expand() in test_arm_orangepi_bionic_20_08
U-Boot expects the SD card size to be at least 2GiB, so expand the SD card image to this size before using it. Signed-off-by: Philippe Mathieu-Daudé --- TODO: U-Boot reference? --- tests/acceptance/boot_linux_console.py | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index b10f7257503..c4c0f0b393d 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -820,11 +820,13 @@ def test_arm_orangepi_bionic_20_08(self): :avocado: tags=arch:arm :avocado: tags=machine:orangepi-pc :avocado: tags=device:sd +:avocado: tags=u-boot """ -# This test download a 275 MiB compressed image and expand it -# to 1036 MiB, but the underlying filesystem is 1552 MiB... -# As we expand it to 2 GiB we are safe. +# This test download a 275 MiB compressed image and inflate it +# to 1036 MiB, but 1/ the underlying filesystem is 1552 MiB, +# 2/ U-Boot loads TFTP filenames from the last sector below +# 2 GiB, so we need to expand the image further more to 2 GiB. image_url = ('https://archive.armbian.com/orangepipc/archive/' 'Armbian_20.08.1_Orangepipc_bionic_current_5.8.5.img.xz') @@ -833,7 +835,7 @@ def test_arm_orangepi_bionic_20_08(self): image_path_xz = self.fetch_asset(image_url, asset_hash=image_hash, algorithm='sha256') image_path = archive.extract(image_path_xz, self.workdir) -image_pow2ceil_expand(image_path) +image_expand(image_path, 2 * 1024 * 1024 * 1024) self.vm.set_console() self.vm.add_args('-drive', 'file=' + image_path + ',if=sd,format=raw', -- 2.31.1
[PATCH 5/9] tests/acceptance: Use image_expand() in test_arm_orangepi_uboot_netbsd9
The NetBSD OrangePi image must be at least 2GiB, not less. Expand the SD card image to this size before using it. Signed-off-by: Philippe Mathieu-Daudé --- tests/acceptance/boot_linux_console.py | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index 61069f0064f..b10f7257503 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -868,7 +868,12 @@ def test_arm_orangepi_uboot_netbsd9(self): :avocado: tags=device:sd :avocado: tags=os:netbsd """ -# This test download a 304MB compressed image and expand it to 2GB +# This test download a 304MB compressed image and expand it to 2GB, +# which is the minimum card size required by the NetBSD installer: +# https://wiki.netbsd.org/ports/evbarm/raspberry_pi/#index7h2 +# "A 2 GB card is the smallest workable size that the installation +# image will fit on." +NETBSD_SDCARD_MINSIZE = 2 * 1024 * 1024 * 1024 deb_url = ('http://snapshot.debian.org/archive/debian/' '20200108T145233Z/pool/main/u/u-boot/' 'u-boot-sunxi_2020.01%2Bdfsg-1_armhf.deb') @@ -886,7 +891,7 @@ def test_arm_orangepi_uboot_netbsd9(self): image_path_gz = self.fetch_asset(image_url, asset_hash=image_hash) image_path = os.path.join(self.workdir, 'armv7.img') archive.gzip_uncompress(image_path_gz, image_path) -image_pow2ceil_expand(image_path) +image_expand(image_path, NETBSD_SDCARD_MINSIZE) image_drive_args = 'if=sd,format=raw,snapshot=on,file=' + image_path # dd if=u-boot-sunxi-with-spl.bin of=armv7.img bs=1K seek=8 conv=notrunc -- 2.31.1
[PATCH 4/9] tests/acceptance: Extract image_expand() helper
To be able to expand an image to a non-power-of-2 value, extract image_expand() from image_pow2ceil_expand(). Signed-off-by: Philippe Mathieu-Daudé --- tests/acceptance/boot_linux_console.py | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index 20d57c1a8c6..61069f0064f 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -35,15 +35,19 @@ def pow2ceil(x): return 1 if x == 0 else 2**(x - 1).bit_length() +""" +Expand file size +""" +def image_expand(path, size): +if size != os.path.getsize(path): +with open(path, 'ab+') as fd: +fd.truncate(size) + """ Expand file size to next power of 2 """ def image_pow2ceil_expand(path): -size = os.path.getsize(path) -size_aligned = pow2ceil(size) -if size != size_aligned: -with open(path, 'ab+') as fd: -fd.truncate(size_aligned) +image_expand(path, pow2ceil(os.path.getsize(path))) class LinuxKernelTest(Test): KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 ' -- 2.31.1
[RFC PATCH 9/9] hw/sd: Allow card size not power of 2 again
In commit a9bcedd15a5 ("hw/sd/sdcard: Do not allow invalid SD card sizes") we tried to protect us from CVE-2020-13253 by only allowing card with power-of-2 sizes. However doing so we disrupted valid user cases. As a compromise, allow any card size, but warn only power of 2 sizes are supported, still suggesting the user how to increase a card with 'qemu-img resize'. Cc: Tom Yan Cc: Warner Losh Buglink: https://bugs.launchpad.net/qemu/+bug/1910586 Signed-off-by: Philippe Mathieu-Daudé --- hw/sd/sd.c | 25 + 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 9c8dd11bad1..cab4aab1475 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -2131,23 +2131,16 @@ static void sd_realize(DeviceState *dev, Error **errp) blk_size = blk_getlength(sd->blk); if (blk_size > 0 && !is_power_of_2(blk_size)) { int64_t blk_size_aligned = pow2ceil(blk_size); -char *blk_size_str; +g_autofree char *blk_size_s = size_to_str(blk_size); +g_autofree char *blk_size_aligned_s = size_to_str(blk_size_aligned); -blk_size_str = size_to_str(blk_size); -error_setg(errp, "Invalid SD card size: %s", blk_size_str); -g_free(blk_size_str); - -blk_size_str = size_to_str(blk_size_aligned); -error_append_hint(errp, - "SD card size has to be a power of 2, e.g. %s.\n" - "You can resize disk images with" - " 'qemu-img resize '\n" - "(note that this will lose data if you make the" - " image smaller than it currently is).\n", - blk_size_str); -g_free(blk_size_str); - -return; +warn_report("SD card size is not a power of 2 (%s). It might work" +" but is not supported by QEMU. If possible, resize" +" your disk image to %s with:", +blk_size_s, blk_size_aligned_s); +warn_report(" 'qemu-img resize '"); +warn_report("(note that this will lose data if you make the" +" image smaller than it currently is)."); } ret = blk_set_perm(sd->blk, BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE, -- 2.31.1
[RFC PATCH 7/9] tests/acceptance: Do not expand SD card image in test_arm_orangepi_sd
XXX it seems to work... Signed-off-by: Philippe Mathieu-Daudé --- tests/acceptance/boot_linux_console.py | 1 - 1 file changed, 1 deletion(-) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index c4c0f0b393d..48c0ba09117 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -783,7 +783,6 @@ def test_arm_orangepi_sd(self): rootfs_path_xz = self.fetch_asset(rootfs_url, asset_hash=rootfs_hash) rootfs_path = os.path.join(self.workdir, 'rootfs.cpio') archive.lzma_uncompress(rootfs_path_xz, rootfs_path) -image_pow2ceil_expand(rootfs_path) self.vm.set_console() kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE + -- 2.31.1
[PATCH 8/9] tests/acceptance: Remove now unused pow2ceil()
We don't use pow2ceil() anymore, remove it. Signed-off-by: Philippe Mathieu-Daudé --- tests/acceptance/boot_linux_console.py | 12 1 file changed, 12 deletions(-) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index 48c0ba09117..77bc80c505d 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -29,12 +29,6 @@ except CmdNotFoundError: P7ZIP_AVAILABLE = False -""" -Round up to next power of 2 -""" -def pow2ceil(x): -return 1 if x == 0 else 2**(x - 1).bit_length() - """ Expand file size """ @@ -43,12 +37,6 @@ def image_expand(path, size): with open(path, 'ab+') as fd: fd.truncate(size) -""" -Expand file size to next power of 2 -""" -def image_pow2ceil_expand(path): -image_expand(path, pow2ceil(os.path.getsize(path))) - class LinuxKernelTest(Test): KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 ' -- 2.31.1
[PATCH 3/9] tests/acceptance: Tag NetBSD tests as 'os:netbsd'
Avocado allows us to select set of tests using tags. When wanting to run all tests using a NetBSD guest OS, it is convenient to have them tagged, add the 'os:netbsd' tag. Signed-off-by: Philippe Mathieu-Daudé --- tests/acceptance/boot_linux_console.py | 1 + tests/acceptance/ppc_prep_40p.py | 2 ++ 2 files changed, 3 insertions(+) diff --git a/tests/acceptance/boot_linux_console.py b/tests/acceptance/boot_linux_console.py index cded547d1d4..20d57c1a8c6 100644 --- a/tests/acceptance/boot_linux_console.py +++ b/tests/acceptance/boot_linux_console.py @@ -862,6 +862,7 @@ def test_arm_orangepi_uboot_netbsd9(self): :avocado: tags=arch:arm :avocado: tags=machine:orangepi-pc :avocado: tags=device:sd +:avocado: tags=os:netbsd """ # This test download a 304MB compressed image and expand it to 2GB deb_url = ('http://snapshot.debian.org/archive/debian/' diff --git a/tests/acceptance/ppc_prep_40p.py b/tests/acceptance/ppc_prep_40p.py index 96ba13b8943..2993ee3b078 100644 --- a/tests/acceptance/ppc_prep_40p.py +++ b/tests/acceptance/ppc_prep_40p.py @@ -27,6 +27,7 @@ def test_factory_firmware_and_netbsd(self): """ :avocado: tags=arch:ppc :avocado: tags=machine:40p +:avocado: tags=os:netbsd :avocado: tags=slowness:high """ bios_url = ('http://ftpmirror.your.org/pub/misc/' @@ -64,6 +65,7 @@ def test_openbios_and_netbsd(self): """ :avocado: tags=arch:ppc :avocado: tags=machine:40p +:avocado: tags=os:netbsd """ drive_url = ('https://cdn.netbsd.org/pub/NetBSD/iso/7.1.2/' 'NetBSD-7.1.2-prep.iso') -- 2.31.1
[PATCH 2/9] hw/sd: Extract address_in_range() helper, log invalid accesses
Multiple commands have to check the address requested is valid. Extract this code pattern as a new address_in_range() helper, and log invalid accesses as guest errors. Signed-off-by: Philippe Mathieu-Daudé --- hw/sd/sd.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index d8fdf84f4db..9c8dd11bad1 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -937,6 +937,18 @@ static void sd_lock_command(SDState *sd) sd->card_status &= ~CARD_IS_LOCKED; } +static bool address_in_range(SDState *sd, const char *desc, + uint64_t addr, uint32_t length) +{ +if (addr + length > sd->size) { +qemu_log_mask(LOG_GUEST_ERROR, "%s offset %lu > card %lu [%%%u]\n", + desc, addr, sd->size, length); +sd->card_status |= ADDRESS_ERROR; +return false; +} +return true; +} + static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) { uint32_t rca = 0x; @@ -1218,8 +1230,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) switch (sd->state) { case sd_transfer_state: -if (addr + sd->blk_len > sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "READ_BLOCK", addr, sd->blk_len)) { return sd_r1; } @@ -1264,8 +1275,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) switch (sd->state) { case sd_transfer_state: -if (addr + sd->blk_len > sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "WRITE_BLOCK", addr, sd->blk_len)) { return sd_r1; } @@ -1325,8 +1335,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) switch (sd->state) { case sd_transfer_state: -if (addr >= sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "SET_WRITE_PROT", addr, 1)) { return sd_r1b; } @@ -1348,8 +1357,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) switch (sd->state) { case sd_transfer_state: -if (addr >= sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "CLR_WRITE_PROT", addr, 1)) { return sd_r1b; } @@ -1826,8 +1834,8 @@ void sd_write_byte(SDState *sd, uint8_t value) case 25: /* CMD25: WRITE_MULTIPLE_BLOCK */ if (sd->data_offset == 0) { /* Start of the block - let's check the address is valid */ -if (sd->data_start + sd->blk_len > sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "WRITE_MULTIPLE_BLOCK", + sd->data_start, sd->blk_len)) { break; } if (sd->size <= SDSC_MAX_CAPACITY) { @@ -1999,8 +2007,8 @@ uint8_t sd_read_byte(SDState *sd) case 18: /* CMD18: READ_MULTIPLE_BLOCK */ if (sd->data_offset == 0) { -if (sd->data_start + io_len > sd->size) { -sd->card_status |= ADDRESS_ERROR; +if (!address_in_range(sd, "READ_MULTIPLE_BLOCK", + sd->data_start, io_len)) { return 0x00; } BLK_READ_BLOCK(sd->data_start, io_len); -- 2.31.1
[PATCH 1/9] hw/sd: When card is in wrong state, log which state it is
We report the card is in an inconsistent state, but don't precise in which state it is. Add this information, as it is useful when debugging problems. Signed-off-by: Philippe Mathieu-Daudé --- hw/sd/sd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 282d39a7042..d8fdf84f4db 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1504,7 +1504,8 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_illegal; } -qemu_log_mask(LOG_GUEST_ERROR, "SD: CMD%i in a wrong state\n", req.cmd); +qemu_log_mask(LOG_GUEST_ERROR, "SD: CMD%i in a wrong state: %s\n", + req.cmd, sd_state_name(sd->state)); return sd_illegal; } -- 2.31.1
[RFC PATCH 0/9] hw/sd: Allow card size not power of 2 again
Hi Ubi-Wan Kenubi and Tom, In commit a9bcedd (SD card size has to be power of 2) we decided to restrict SD card size to avoid security problems (CVE-2020-13253) but this became not practical to some users. This RFC series tries to remove the limitation, keeping our functional tests working. It is unfinished work because I had to attend other topics, but sending it early as RFC to get feedback. I'll keep working when I get more time, except if one if you can help me. Alexander, could you generate a qtest reproducer with the fuzzer corpus? See: https://bugs.launchpad.net/qemu/+bug/1878054 Thanks, Phil. Philippe Mathieu-Daudé (9): hw/sd: When card is in wrong state, log which state it is hw/sd: Extract address_in_range() helper, log invalid accesses tests/acceptance: Tag NetBSD tests as 'os:netbsd' tests/acceptance: Extract image_expand() helper tests/acceptance: Use image_expand() in test_arm_orangepi_uboot_netbsd9 tests/acceptance: Use image_expand() in test_arm_orangepi_bionic_20_08 tests/acceptance: Do not expand SD card image in test_arm_orangepi_sd tests/acceptance: Remove now unused pow2ceil() hw/sd: Allow card size not power of 2 again hw/sd/sd.c | 60 +- tests/acceptance/boot_linux_console.py | 39 - tests/acceptance/ppc_prep_40p.py | 2 + 3 files changed, 52 insertions(+), 49 deletions(-) -- 2.31.1
Re: SD/MMC host controller + 64-bit system bus
Great!! I'm happy to do so. Thanks for the reply! On Tue, Jun 22, 2021 at 1:51 PM Philippe Mathieu-Daudé wrote: > Hi Joanne, > > On 6/22/21 8:07 PM, Joanne Koong wrote: > > Hello! I noticed that the default SD/MMC host controller only supports a > > 32-bit system bus. Is there a reason 64-bit system buses aren't > > supported by default? > > We aim to support the spec v2.00, so this is a bug in the model, 64-bit > system bus should be supported. Do you mind sending a patch? > > Thanks, > > Phil. >
Re: [PATCH v2 2/1] qemu-img: Add "backing":true to unallocated map segments
On Wed, Jun 23, 2021 at 7:04 PM Kevin Wolf wrote: > > Am 23.06.2021 um 15:58 hat Nir Soffer geschrieben: > > On Wed, Jun 23, 2021 at 11:58 AM Kevin Wolf wrote: > > > > > > Am 22.06.2021 um 18:56 hat Nir Soffer geschrieben: > > > > On Tue, Jun 22, 2021 at 6:38 PM Kevin Wolf wrote: > > > > > > > > > > Am 11.06.2021 um 21:03 hat Eric Blake geschrieben: > > > > > > To save the user from having to check 'qemu-img info > > > > > > --backing-chain' > > > > > > or other followup command to determine which "depth":n goes beyond > > > > > > the > > > > > > chain, add a boolean field "backing" that is set only for > > > > > > unallocated > > > > > > portions of the disk. > > > > > > > > > > > > Signed-off-by: Eric Blake > > > > > > --- > > > > > > > > > > > > Touches the same iotest output as 1/1. If we decide that switching > > > > > > to > > > > > > "depth":n+1 is too risky, and that the mere addition of > > > > > > "backing":true > > > > > > while keeping "depth":n is good enough, then we'd have just one > > > > > > patch, > > > > > > instead of this double churn. Preferences? > > > > > > > > > > I think the additional flag is better because it's guaranteed to be > > > > > backwards compatible, and because you don't need to know the number of > > > > > layers to infer whether a cluster was allocated in the whole backing > > > > > chain. And by exposing ALLOCATED we definitely give access to the > > > > > whole > > > > > information that exists in QEMU. > > > > > > > > > > However, to continue with the bike shedding: I won't insist on > > > > > "allocated" even if that is what the flag is called internally and > > > > > consistency is usually helpful, but "backing" is misleading, too, > > > > > because intuitively it doesn't cover the top layer or standalone > > > > > images > > > > > without a backing file. How about something like "present"? > > > > > > > > Looks hard to document: > > > > > > > > # @present: if present and false, the range is not allocated within the > > > > # backing chain (since 6.1) > > > > > > I'm not sure why you would document it with a double negative. > > > > > > > And is not consistent with "offset". It would work better as: > > > > > > > > # @present: if present, the range is allocated within the backing > > > > # chain (since 6.1) > > > > > > Completely ignoring the value? I would have documented it like this, but > > > with "if true..." instead of "if present...". > > > > This is fine, but it means that this flag will present in all ranges, > > instead of only in unallocated ranges (what this patch is doing). > > An argument for always having the flag would be that it's probably > useful for a tool to know whether a given block is actually absent or > whether it's just running an old qemu-img. Good point, this is the best option. The disadvantage is a bigger output but if you use json you don't care about the size of the output. > If we didn't care about this, I would still define the actual value, but > also document a default. > > Kevin >
Re: [PATCH v2 2/1] qemu-img: Add "backing":true to unallocated map segments
Am 23.06.2021 um 15:58 hat Nir Soffer geschrieben: > On Wed, Jun 23, 2021 at 11:58 AM Kevin Wolf wrote: > > > > Am 22.06.2021 um 18:56 hat Nir Soffer geschrieben: > > > On Tue, Jun 22, 2021 at 6:38 PM Kevin Wolf wrote: > > > > > > > > Am 11.06.2021 um 21:03 hat Eric Blake geschrieben: > > > > > To save the user from having to check 'qemu-img info --backing-chain' > > > > > or other followup command to determine which "depth":n goes beyond the > > > > > chain, add a boolean field "backing" that is set only for unallocated > > > > > portions of the disk. > > > > > > > > > > Signed-off-by: Eric Blake > > > > > --- > > > > > > > > > > Touches the same iotest output as 1/1. If we decide that switching to > > > > > "depth":n+1 is too risky, and that the mere addition of "backing":true > > > > > while keeping "depth":n is good enough, then we'd have just one patch, > > > > > instead of this double churn. Preferences? > > > > > > > > I think the additional flag is better because it's guaranteed to be > > > > backwards compatible, and because you don't need to know the number of > > > > layers to infer whether a cluster was allocated in the whole backing > > > > chain. And by exposing ALLOCATED we definitely give access to the whole > > > > information that exists in QEMU. > > > > > > > > However, to continue with the bike shedding: I won't insist on > > > > "allocated" even if that is what the flag is called internally and > > > > consistency is usually helpful, but "backing" is misleading, too, > > > > because intuitively it doesn't cover the top layer or standalone images > > > > without a backing file. How about something like "present"? > > > > > > Looks hard to document: > > > > > > # @present: if present and false, the range is not allocated within the > > > # backing chain (since 6.1) > > > > I'm not sure why you would document it with a double negative. > > > > > And is not consistent with "offset". It would work better as: > > > > > > # @present: if present, the range is allocated within the backing > > > # chain (since 6.1) > > > > Completely ignoring the value? I would have documented it like this, but > > with "if true..." instead of "if present...". > > This is fine, but it means that this flag will present in all ranges, > instead of only in unallocated ranges (what this patch is doing). An argument for always having the flag would be that it's probably useful for a tool to know whether a given block is actually absent or whether it's just running an old qemu-img. If we didn't care about this, I would still define the actual value, but also document a default. Kevin
Re: [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use
On 08.06.21 15:16, Paolo Bonzini wrote: From: Joelle van Dyne iOS hosts do not have these defined so we fallback to the default behaviour. Co-authored-by: Warner Losh Reviewed-by: Peter Maydell Signed-off-by: Joelle van Dyne Message-Id: <20210315180341.31638-...@getutm.app> Signed-off-by: Paolo Bonzini --- block/file-posix.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 5821e1afed..4e2f7cf508 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2322,8 +2322,11 @@ static int64_t raw_getlength(BlockDriverState *bs) again: #endif if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) { +size = 0; #ifdef DIOCGMEDIASIZE -if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) +if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) { +size = 0; +} #elif defined(DIOCGPART) { struct partinfo pi; @@ -2332,9 +2335,7 @@ again: else size = 0; } -if (size == 0) -#endif -#if defined(__APPLE__) && defined(__MACH__) +#elif defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE) In v3, I was wondering whether it’s intentional that the following DKIOCGETBLOCKCOUNT/SIZE block would no longer be used as a fallback if the DIOCGMEDIASIZE ioctl failed (which it was before this patch). I’m still wondering. Max { uint64_t sectors = 0; uint32_t sector_size = 0; @@ -2342,19 +2343,15 @@ again: if (ioctl(fd, DKIOCGETBLOCKCOUNT, §ors) == 0 && ioctl(fd, DKIOCGETBLOCKSIZE, §or_size) == 0) { size = sectors * sector_size; -} else { -size = lseek(fd, 0LL, SEEK_END); -if (size < 0) { -return -errno; -} } } -#else -size = lseek(fd, 0LL, SEEK_END); +#endif +if (size == 0) { +size = lseek(fd, 0LL, SEEK_END); +} if (size < 0) { return -errno; } -#endif #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__) switch(s->type) { case FTYPE_CD:
Re: [PATCH v4 5/7] block: feature detection for host block support
On 08.06.21 15:16, Paolo Bonzini wrote: From: Joelle van Dyne On Darwin (iOS), there are no system level APIs for directly accessing host block devices. We detect this at configure time. Signed-off-by: Joelle van Dyne Message-Id: <20210315180341.31638-...@getutm.app> Signed-off-by: Paolo Bonzini --- block/file-posix.c | 33 ++--- meson.build | 6 +- qapi/block-core.json | 10 +++--- 3 files changed, 34 insertions(+), 15 deletions(-) [...] diff --git a/qapi/block-core.json b/qapi/block-core.json index 2ea294129e..2dd41be156 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -897,7 +897,8 @@ 'discriminator': 'driver', 'data': { 'file': 'BlockStatsSpecificFile', - 'host_device': 'BlockStatsSpecificFile', + 'host_device': { 'type': 'BlockStatsSpecificFile', + 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' }, 'nvme': 'BlockStatsSpecificNvme' } } ## @@ -2814,7 +2815,9 @@ { 'enum': 'BlockdevDriver', 'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs', 'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', -'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi', +'gluster', 'host_cdrom', +{'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' }, +'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd', { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' }, @@ -3996,7 +3999,8 @@ 'ftps': 'BlockdevOptionsCurlFtps', 'gluster':'BlockdevOptionsGluster', 'host_cdrom': 'BlockdevOptionsFile', - 'host_device':'BlockdevOptionsFile', + 'host_device': { 'type': 'BlockdevOptionsFile', + 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' }, 'http': 'BlockdevOptionsCurlHttp', 'https': 'BlockdevOptionsCurlHttps', 'iscsi': 'BlockdevOptionsIscsi', As I asked in v3: What about host_cdrom? Max
Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
On 08.06.21 21:14, Vladimir Sementsov-Ogievskiy wrote: 08.06.2021 16:16, Paolo Bonzini wrote: Even though it was only called for devices that have bs->sg set (which must be character devices), sg_get_max_segments looked at /sys/dev/block which only works for block devices. On Linux the sg driver has its own way to provide the maximum number of iovecs in a scatter/gather list, so add support for it. The block device path is kept because it will be reinstated in the next patches. Signed-off-by: Paolo Bonzini --- block/file-posix.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index f37dfc10b3..536998a1d6 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd) goto out; } + if (S_ISCHR(st.st_mode)) { Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss? I dismissed this in v3, because I didn’t understand why you’d raise this point. The function is called sg_*(), and it’s only called if bs->sg is true anyway. So clearly we can use SG_ ioctls, because the whole function is intended only for SG devices anyway. This time, I looked forward, and perhaps starting at patch 4 I can understand where you’re coming from, because then the function is used for host devices in general. So now I don’t particularly mind. I think it’s still clear that if there’s a host device here that’s a character device, then that’s going to be an SG device, so I don’t really have a preference between S_ISCHR() and bs->sg. Max + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) { + return ret; + } + return -ENOTSUP; + } + + if (!S_ISBLK(st.st_mode)) { + return -ENOTSUP; + } + sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments", major(st.st_rdev), minor(st.st_rdev)); sysfd = open(sysfspath, O_RDONLY);
[PATCH v2 6/6] block/iscsi: Do not force-cap *pnum
bdrv_co_block_status() does it for us, we do not need to do it here. The advantage of not capping *pnum is that bdrv_co_block_status() can cache larger data regions than requested by its caller. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/iscsi.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index 4d2a416ce7..852384086b 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -781,9 +781,6 @@ retry: iscsi_allocmap_set_allocated(iscsilun, offset, *pnum); } -if (*pnum > bytes) { -*pnum = bytes; -} out_unlock: qemu_mutex_unlock(&iscsilun->mutex); g_free(iTask.err_str); -- 2.31.1
[PATCH v2 4/6] block/file-posix: Do not force-cap *pnum
bdrv_co_block_status() does it for us, we do not need to do it here. The advantage of not capping *pnum is that bdrv_co_block_status() can cache larger data regions than requested by its caller. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/file-posix.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index b3fbb9bd63..aeb370d5bb 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2689,7 +2689,8 @@ static int find_allocation(BlockDriverState *bs, off_t start, * the specified offset) that are known to be in the same * allocated/unallocated state. * - * 'bytes' is the max value 'pnum' should be set to. + * 'bytes' is a soft cap for 'pnum'. If the information is free, 'pnum' may + * well exceed it. */ static int coroutine_fn raw_co_block_status(BlockDriverState *bs, bool want_zero, @@ -2727,7 +2728,7 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs, } else if (data == offset) { /* On a data extent, compute bytes to the end of the extent, * possibly including a partial sector at EOF. */ -*pnum = MIN(bytes, hole - offset); +*pnum = hole - offset; /* * We are not allowed to return partial sectors, though, so @@ -2746,7 +2747,7 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs, } else { /* On a hole, compute bytes to the beginning of the next extent. */ assert(hole == offset); -*pnum = MIN(bytes, data - offset); +*pnum = data - offset; ret = BDRV_BLOCK_ZERO; } *map = offset; -- 2.31.1
[PATCH v2 3/6] block: Clarify that @bytes is no limit on *pnum
.bdrv_co_block_status() implementations are free to return a *pnum that exceeds @bytes, because bdrv_co_block_status() in block/io.c will clamp *pnum as necessary. On the other hand, if drivers' implementations return values for *pnum that are as large as possible, our recently introduced block-status cache will become more effective. So, make a note in block_int.h that @bytes is no upper limit for *pnum. Suggested-by: Eric Blake Signed-off-by: Max Reitz --- include/block/block_int.h | 5 + 1 file changed, 5 insertions(+) diff --git a/include/block/block_int.h b/include/block/block_int.h index fcb599dd1c..f85b96ed99 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -347,6 +347,11 @@ struct BlockDriver { * clamped to bdrv_getlength() and aligned to request_alignment, * as well as non-NULL pnum, map, and file; in turn, the driver * must return an error or set pnum to an aligned non-zero value. + * + * Note that @bytes is just a hint on how big of a region the + * caller wants to inspect. It is not a limit on *pnum. + * Implementations are free to return larger values of *pnum if + * doing so does not incur a performance penalty. */ int coroutine_fn (*bdrv_co_block_status)(BlockDriverState *bs, bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum, -- 2.31.1
[PATCH v2 1/6] block: Drop BDS comment regarding bdrv_append()
There is a comment above the BDS definition stating care must be taken to consider handling newly added fields in bdrv_append(). Actually, this comment should have said "bdrv_swap()" as of 4ddc07cac (nine years ago), and in any case, bdrv_swap() was dropped in 8e419aefa (six years ago). So no such care is necessary anymore. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/block/block_int.h | 6 -- 1 file changed, 6 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 057d88b1fc..a8f9598102 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -832,12 +832,6 @@ struct BdrvChild { QLIST_ENTRY(BdrvChild) next_parent; }; -/* - * Note: the function bdrv_append() copies and swaps contents of - * BlockDriverStates, so if you add new fields to this struct, please - * inspect bdrv_append() to determine if the new fields need to be - * copied as well. - */ struct BlockDriverState { /* Protected by big QEMU lock or read-only after opening. No special * locking needed during I/O... -- 2.31.1
[PATCH v2 5/6] block/gluster: Do not force-cap *pnum
bdrv_co_block_status() does it for us, we do not need to do it here. The advantage of not capping *pnum is that bdrv_co_block_status() can cache larger data regions than requested by its caller. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/gluster.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/block/gluster.c b/block/gluster.c index e8ee14c8e9..8ef7bb18d5 100644 --- a/block/gluster.c +++ b/block/gluster.c @@ -1461,7 +1461,8 @@ exit: * the specified offset) that are known to be in the same * allocated/unallocated state. * - * 'bytes' is the max value 'pnum' should be set to. + * 'bytes' is a soft cap for 'pnum'. If the information is free, 'pnum' may + * well exceed it. * * (Based on raw_co_block_status() from file-posix.c.) */ @@ -1500,12 +1501,12 @@ static int coroutine_fn qemu_gluster_co_block_status(BlockDriverState *bs, } else if (data == offset) { /* On a data extent, compute bytes to the end of the extent, * possibly including a partial sector at EOF. */ -*pnum = MIN(bytes, hole - offset); +*pnum = hole - offset; ret = BDRV_BLOCK_DATA; } else { /* On a hole, compute bytes to the beginning of the next extent. */ assert(hole == offset); -*pnum = MIN(bytes, data - offset); +*pnum = data - offset; ret = BDRV_BLOCK_ZERO; } -- 2.31.1
[PATCH v2 2/6] block: block-status cache for data regions
As we have attempted before (https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06451.html, "file-posix: Cache lseek result for data regions"; https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00934.html, "file-posix: Cache next hole"), this patch seeks to reduce the number of SEEK_DATA/HOLE operations the file-posix driver has to perform. The main difference is that this time it is implemented as part of the general block layer code. The problem we face is that on some filesystems or in some circumstances, SEEK_DATA/HOLE is unreasonably slow. Given the implementation is outside of qemu, there is little we can do about its performance. We have already introduced the want_zero parameter to bdrv_co_block_status() to reduce the number of SEEK_DATA/HOLE calls unless we really want zero information; but sometimes we do want that information, because for files that consist largely of zero areas, special-casing those areas can give large performance boosts. So the real problem is with files that consist largely of data, so that inquiring the block status does not gain us much performance, but where such an inquiry itself takes a lot of time. To address this, we want to cache data regions. Most of the time, when bad performance is reported, it is in places where the image is iterated over from start to end (qemu-img convert or the mirror job), so a simple yet effective solution is to cache only the current data region. (Note that only caching data regions but not zero regions means that returning false information from the cache is not catastrophic: Treating zeroes as data is fine. While we try to invalidate the cache on zero writes and discards, such incongruences may still occur when there are other processes writing to the image.) We only use the cache for nodes without children (i.e. protocol nodes), because that is where the problem is: Drivers that rely on block-status implementations outside of qemu (e.g. SEEK_DATA/HOLE). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/307 Signed-off-by: Max Reitz --- include/block/block_int.h | 47 ++ block.c | 84 +++ block/io.c| 61 ++-- 3 files changed, 189 insertions(+), 3 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index a8f9598102..fcb599dd1c 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -832,6 +832,22 @@ struct BdrvChild { QLIST_ENTRY(BdrvChild) next_parent; }; +/* + * Allows bdrv_co_block_status() to cache one data region for a + * protocol node. + * + * @valid: Whether the cache is valid (should be accessed with atomic + * functions so this can be reset by RCU readers) + * @data_start: Offset where we know (or strongly assume) is data + * @data_end: Offset where the data region ends (which is not necessarily + *the start of a zeroed region) + */ +typedef struct BdrvBlockStatusCache { +bool valid; +int64_t data_start; +int64_t data_end; +} BdrvBlockStatusCache; + struct BlockDriverState { /* Protected by big QEMU lock or read-only after opening. No special * locking needed during I/O... @@ -997,6 +1013,11 @@ struct BlockDriverState { /* BdrvChild links to this node may never be frozen */ bool never_freeze; + +/* Lock for block-status cache RCU writers */ +CoMutex bsc_modify_lock; +/* Always non-NULL, but must only be dereferenced under an RCU read guard */ +BdrvBlockStatusCache *block_status_cache; }; struct BlockBackendRootState { @@ -1422,4 +1443,30 @@ static inline BlockDriverState *bdrv_primary_bs(BlockDriverState *bs) */ void bdrv_drain_all_end_quiesce(BlockDriverState *bs); +/** + * Check whether the given offset is in the cached block-status data + * region. + * + * If it is, and @pnum is not NULL, *pnum is set to + * `bsc.data_end - offset`, i.e. how many bytes, starting from + * @offset, are data (according to the cache). + * Otherwise, *pnum is not touched. + */ +bool bdrv_bsc_is_data(BlockDriverState *bs, int64_t offset, int64_t *pnum); + +/** + * If [offset, offset + bytes) overlaps with the currently cached + * block-status region, invalidate the cache. + * + * (To be used by I/O paths that cause data regions to be zero or + * holes.) + */ +void bdrv_bsc_invalidate_range(BlockDriverState *bs, + int64_t offset, int64_t bytes); + +/** + * Mark the range [offset, offset + bytes) as a data region. + */ +void bdrv_bsc_fill(BlockDriverState *bs, int64_t offset, int64_t bytes); + #endif /* BLOCK_INT_H */ diff --git a/block.c b/block.c index 3f456892d0..9ab9459f7a 100644 --- a/block.c +++ b/block.c @@ -49,6 +49,8 @@ #include "qemu/timer.h" #include "qemu/cutils.h" #include "qemu/id.h" +#include "qemu/range.h" +#include "qemu/rcu.h" #include "block/coroutines.h" #ifdef CONFIG_BSD @@ -398,6 +400,9 @@ BlockDriverState *bdrv
[PATCH v2 0/6] block: block-status cache for data regions
Hi, See the cover letter from v1 for the general idea: https://lists.nongnu.org/archive/html/qemu-block/2021-06/msg00843.html The biggest change here in v2 is that instead of having a common CoMutex protect the block-status cache, we’re using RCU now. So to read from the cache, or even to invalidate it, no lock is needed, only to update it with new data. Disclaimer: I have no experience with RCU in practice so far, neither in qemu nor anywhere else. So I hope I’ve used it correctly... Differences to v1 in detail: - Patch 2: - Moved BdrvBlockStatusCache.lock up to BDS, it is now the RCU writer lock - BDS.block_status_cache is now a pointer, so it can be replaced with RCU - Moved all cache access functionality into helper functions (bdrv_bsc_is_data(), bdrv_bsc_invalidate_range(), bdrv_bsc_fill()) in block.c - Guard BSC accesses with RCU (BSC.valid is to be accessed atomically, which allows resetting it without taking an RCU write lock) - Check QLIST_EMPTY(&bs->children) not just when reading from the cache, but when filling it, too (so we don’t need an RCU update when it won’t make sense) - Patch 3: Added - Dropped the block/nbd patch (because it would make NBD query a larger range; the patch’s intent was to get more information for free, which this would not be) git-backport-diff against v1: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/6:[] [--] 'block: Drop BDS comment regarding bdrv_append()' 002/6:[0169] [FC] 'block: block-status cache for data regions' 003/6:[down] 'block: Clarify that @bytes is no limit on *pnum' 004/6:[] [--] 'block/file-posix: Do not force-cap *pnum' 005/6:[] [--] 'block/gluster: Do not force-cap *pnum' 006/6:[] [--] 'block/iscsi: Do not force-cap *pnum' Max Reitz (6): block: Drop BDS comment regarding bdrv_append() block: block-status cache for data regions block: Clarify that @bytes is no limit on *pnum block/file-posix: Do not force-cap *pnum block/gluster: Do not force-cap *pnum block/iscsi: Do not force-cap *pnum include/block/block_int.h | 54 +++-- block.c | 84 +++ block/file-posix.c| 7 ++-- block/gluster.c | 7 ++-- block/io.c| 61 ++-- block/iscsi.c | 3 -- 6 files changed, 200 insertions(+), 16 deletions(-) -- 2.31.1
Re: Running iotest linters from check-python-* CI jobs
On 6/23/21 6:55 AM, Kevin Wolf wrote: Am 22.06.2021 um 18:24 hat John Snow geschrieben: On 6/22/21 11:52 AM, Max Reitz wrote: On 22.06.21 16:57, John Snow wrote: Hi Kevin: At one point I had the idea to augment the Python linter CI jobs to also run the iotest linters. I thought it would be convenient to ensure that while I changed around the QMP and Machine packages that I didn't introduce regressions in iotest 297 either. I sent an RFC, got feedback from Vladimir (Who seemed broadly in favor of the idea), and then wrote a v2 that I never sent. RFC: Message-Id: <20210604163907.1511224-1-js...@redhat.com> mreitz stated (on IRC) in no uncertain terms they were not happy with the idea of 297 becoming gating CI, so I held off on pursuing the idea. I wanted to reach out and see if you had feelings on the matter, or if I should indeed just shelve it entirely. I like the general idea of making such checks gating CI, because if we already have them, what are we gaining by only manually finding violations? The more interesting part is defining our standards, i.e. writing config files for the tools, and we already do that for 297 anyway. The potential problem I could see is different linter versions, but you already addressed that below. (Through great personal pain, I assure you. Why do you think I have cooled off on pylint so drastically?...) My main point was that I don’t want to have to have an opinion on this topic. ;) Sorry if I put words in your mouth! I wanted to take your feedback/reaction seriously. It’s true that I’m not happy about linters being part of gating CI, but I also stated that I cannot defend this gut feeling, and that I feel like it’s “objectively” wrong. Therefore, I don’t want to be part of such a discussion, if I can avoid it. (To my defense, in virtiofsd-rs I myself made a linter part of the gating CI. That’s because we already had another linter in it, and because my gut feeling is much easier to suppress when it’s about a small project with few maintainers to annoy. It has nothing to do with me hating Python coding style guidelines, because I probably hate Rust coding style guidelines just as much.) 😅 I'll fully admit that pylint in particular is very, very annoying. My RFC does not increase the strictness of its use for iotests, at least. Yeah, I'm not sure if pylint ever warned about something that I actually cared to get changed... Most times it's just failing to meet some questionable arbitrary style requirements. I've seen it say something useful here and again ... though after converting everything to mypy strict, as you say, it's generally not as useful. It seems most useful for cleaning up old python2 era code, but less so for actively written and loved python3 code. I have grown way less attached to it after my efforts to clean up all the Python in the tree. Still, I have some... vague attachment to the idea that enforcing a style guide is "nice" for consistency reasons. Maybe I'll drop it eventually, or just continue to use it with rather permissive configurations, I don't know. Nothing to worry about today, I think. On the other hand, I assume you count mypy as a linter, too, and the messages of that one I treat more like compiler warnings or errors. They are actually useful, and if your code doesn't pass, then I usually do care about it getting fixed. The "linters" I run in the Python jobs right now are mypy, pylint, isort, and flake8. It's all the stuff in python/tests/. I would actually prefer our mypy config to become stricter over time. The mypy config for python/ and (most of, and soon to be all) scripts/qapi/ is 100% strict. It probably wouldn't be *so* bad to finish strictifying all of the non-unit-test files for iotests -- I had a hack/WIP series I posted about a year ago that tried to do most of it. I have since learned a few tricks while doing the qapi series to turn strictness on/off for individual modules which might allow us to do a more gradual conversion. IIRC there was a QED or a qcow module that liberally used dynamic typing that was non-trivial to convert to a statically typed subset of python. Perfectly reasonable code, just not to mypy. There are three linting standards for Python in the tree right now: 1. Those applied to scripts/qapi/ (Manually run only) 2. Those applied to tests/iotests/ (via 297) 3. Those applied to python/qemu/ (via CI) The python/qemu/ ones are the strictest and most annoying. scripts/qapi/ has an almost identical set of rules that will be integrated to python/qemu/ once I move the QAPI generator there. The iotests ones are separate and I intend to keep separate -- I think it should remain up to the block maintainers what their own tolerance level for annoying yappy errors are. I have no desire to change that. (I definitely have no desire to scrub and audit everything in iotests to bring it up to speed with the stricter standard. They're just tests, af
Re: [PATCH v2 2/1] qemu-img: Add "backing":true to unallocated map segments
On Wed, Jun 23, 2021 at 11:58 AM Kevin Wolf wrote: > > Am 22.06.2021 um 18:56 hat Nir Soffer geschrieben: > > On Tue, Jun 22, 2021 at 6:38 PM Kevin Wolf wrote: > > > > > > Am 11.06.2021 um 21:03 hat Eric Blake geschrieben: > > > > To save the user from having to check 'qemu-img info --backing-chain' > > > > or other followup command to determine which "depth":n goes beyond the > > > > chain, add a boolean field "backing" that is set only for unallocated > > > > portions of the disk. > > > > > > > > Signed-off-by: Eric Blake > > > > --- > > > > > > > > Touches the same iotest output as 1/1. If we decide that switching to > > > > "depth":n+1 is too risky, and that the mere addition of "backing":true > > > > while keeping "depth":n is good enough, then we'd have just one patch, > > > > instead of this double churn. Preferences? > > > > > > I think the additional flag is better because it's guaranteed to be > > > backwards compatible, and because you don't need to know the number of > > > layers to infer whether a cluster was allocated in the whole backing > > > chain. And by exposing ALLOCATED we definitely give access to the whole > > > information that exists in QEMU. > > > > > > However, to continue with the bike shedding: I won't insist on > > > "allocated" even if that is what the flag is called internally and > > > consistency is usually helpful, but "backing" is misleading, too, > > > because intuitively it doesn't cover the top layer or standalone images > > > without a backing file. How about something like "present"? > > > > Looks hard to document: > > > > # @present: if present and false, the range is not allocated within the > > # backing chain (since 6.1) > > I'm not sure why you would document it with a double negative. > > > And is not consistent with "offset". It would work better as: > > > > # @present: if present, the range is allocated within the backing > > # chain (since 6.1) > > Completely ignoring the value? I would have documented it like this, but > with "if true..." instead of "if present...". This is fine, but it means that this flag will present in all ranges, instead of only in unallocated ranges (what this patch is doing). > > > Or: > > > > # @absent: if present, the range is not allocated within the backing > > # chain (since 6.1) > > This is possible, too, but generally positive flags are preferable to > negative ones, and the internal one is already positive. > > > This is used by libnbd now: > > https://github.com/libguestfs/libnbd/commit/1d01d2ac4f6443b160b7d81119d555e1aaedb56d > > > > But I'm fine with "backing", It is consistent with BLK_BACKING_FILE, > > meaning this area exposes data from a backing file (if one exists). > > > > We use "backing" internally to be consistent with future qemu-img. > > I just realised that I actually misunderstood "backing" to mean the > opposite of what it is in this patch! > > It really means "the data comes from some imaginary additional backing > file that doesn't exist in the backing chain", while I understood it as > "something in the (real) backing chain contains the data". > > "present" or "absent" should be much less prone to such > misunderstandings. > > Kevin >
Re: [PATCH] block: BDRV_O_NO_IO for backing file on creation
Am 22.06.2021 um 16:00 hat Max Reitz geschrieben: > When creating an image file with a backing file, we generally try to > open the backing file (unless -u was specified), mostly to verify that > it is there, but also to get the file size if none was specified for the > new image. > > For neither of these things do we need data I/O, and so we can pass > BDRV_O_NO_IO when opening the backing file. This allows us to open even > encrypted backing images without requiring the user to provide a secret. > > This makes the -u switch in iotests 189 and 198 unnecessary (and the > $size parameter), so drop it, because this way we get regression tests > for this patch here. > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/441 > Signed-off-by: Max Reitz Thanks, applied to the block branch. Kevin
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On 6/23/21 11:11 AM, Bin Meng wrote: > On Wed, Jun 23, 2021 at 4:52 PM Philippe Mathieu-Daudé > wrote: >> >> On 6/23/21 10:39 AM, Bin Meng wrote: >>> On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: The number of blocks is defined in the lower bits [15:0] >>> >>> I checked the physical spec v8.00 and it says bits [31:0] for CMD23 >>> argument. >> >> Watch out, we only support 1-3: >> > > Yes > >> enum SDPhySpecificationVersion { >> SD_PHY_SPECv1_10_VERS = 1, >> SD_PHY_SPECv2_00_VERS = 2, >> SD_PHY_SPECv3_01_VERS = 3, >> }; >> > > However the physical sepc v8.00 should document any difference between > ver 3.0 and ver 8.0 if there are indeed any, but for CMD23 it does > not. So it means it's 32-bit since day 1. > > To double check, I just downloaded the spec 3.01 and confirmed it's > still 32-bit. OK, so patch is incorrect then. Thanks, Phil.
Re: Running iotest linters from check-python-* CI jobs
Am 22.06.2021 um 18:24 hat John Snow geschrieben: > On 6/22/21 11:52 AM, Max Reitz wrote: > > On 22.06.21 16:57, John Snow wrote: > > > Hi Kevin: > > > > > > At one point I had the idea to augment the Python linter CI jobs to > > > also run the iotest linters. I thought it would be convenient to > > > ensure that while I changed around the QMP and Machine packages that > > > I didn't introduce regressions in iotest 297 either. > > > > > > I sent an RFC, got feedback from Vladimir (Who seemed broadly in > > > favor of the idea), and then wrote a v2 that I never sent. > > > > > > RFC: Message-Id: <20210604163907.1511224-1-js...@redhat.com> > > > > > > mreitz stated (on IRC) in no uncertain terms they were not happy > > > with the idea of 297 becoming gating CI, so I held off on pursuing > > > the idea. I wanted to reach out and see if you had feelings on the > > > matter, or if I should indeed just shelve it entirely. I like the general idea of making such checks gating CI, because if we already have them, what are we gaining by only manually finding violations? The more interesting part is defining our standards, i.e. writing config files for the tools, and we already do that for 297 anyway. The potential problem I could see is different linter versions, but you already addressed that below. > > My main point was that I don’t want to have to have an opinion on this > > topic. ;) > > > > Sorry if I put words in your mouth! I wanted to take your feedback/reaction > seriously. > > > It’s true that I’m not happy about linters being part of gating CI, but > > I also stated that I cannot defend this gut feeling, and that I feel > > like it’s “objectively” wrong. Therefore, I don’t want to be part of > > such a discussion, if I can avoid it. > > (To my defense, in virtiofsd-rs I myself made a linter part of the > > gating CI. That’s because we already had another linter in it, and > > because my gut feeling is much easier to suppress when it’s about a > > small project with few maintainers to annoy. It has nothing to do with > > me hating Python coding style guidelines, because I probably hate Rust > > coding style guidelines just as much.) > > > > 😅 > > I'll fully admit that pylint in particular is very, very annoying. My RFC > does not increase the strictness of its use for iotests, at least. Yeah, I'm not sure if pylint ever warned about something that I actually cared to get changed... Most times it's just failing to meet some questionable arbitrary style requirements. On the other hand, I assume you count mypy as a linter, too, and the messages of that one I treat more like compiler warnings or errors. They are actually useful, and if your code doesn't pass, then I usually do care about it getting fixed. I would actually prefer our mypy config to become stricter over time. > There are three linting standards for Python in the tree right now: > > 1. Those applied to scripts/qapi/ (Manually run only) > 2. Those applied to tests/iotests/ (via 297) > 3. Those applied to python/qemu/ (via CI) > > The python/qemu/ ones are the strictest and most annoying. scripts/qapi/ has > an almost identical set of rules that will be integrated to python/qemu/ > once I move the QAPI generator there. > > The iotests ones are separate and I intend to keep separate -- I think it > should remain up to the block maintainers what their own tolerance level for > annoying yappy errors are. I have no desire to change that. > > (I definitely have no desire to scrub and audit everything in iotests to > bring it up to speed with the stricter standard. They're just tests, after > all. It's not worth it.) Right, individual tests aren't that important, especially concerning style, though I feel shared files like iotests.py and the test infrastructure itself are probably worth it. > > In any case, I had understood you wanted to make 297 part of the > > non-gating CI anyway, though, so I wonder what of the things I said made > > you shelve that idea. > > I just don't like pursuing things that might increase your maintenance > burden or make your day worse. I know you don't want to be involved, but > this kind of necessarily involves you at least indirectly, so ... It > genuinely felt a little rude to press onward without getting a bit more > information first. > > I figured I'd ask Kevin what his feelings are to see if that > un-muddies the waters. So my hope is that it would in fact decrease the maintenance burden because we would catch bugs in the tests in time, and dealing with false positives would cost us less time than dealing with such bugs. But then, this is something that is mostly a point for mypy, not for pylint. > > (Another concern I had was linter updates breaking CI, but you promised > > to keep the linter in a steady configuration so this wouldn’t happen. So > > all in all, I can’t remember I brought any argument that would ac > > buttually speak against your idea.) > > > > Right. I can't prom
Re: [PATCH 0/2] introduce QEMU_AUTO_VFREE
Am 19.06.2021 um 16:21 hat Vladimir Sementsov-Ogievskiy geschrieben: > Hi all! > > There is a good movement to use g_autofree macro, that helps to > automatically call g_free on exit from code block. > > We lack similar possibility for qemu_memalign() functions family. Let's > add, it seems rather simple with help of "cleanup" attribute. > > I'll update more places with a follow-up if this is accepted. Good idea. Thanks, applied to the block branch. Kevin
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 22/06/21 12:39, Vladimir Sementsov-Ogievskiy wrote: 22.06.2021 13:20, Paolo Bonzini wrote: On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote: OK, I agree, let's keep it. You can also have a finished job, but get a stale value for error_is_read or ret. The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it. Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true? That means that we must do it not in assertion (to not be compiled out): bool finished = load_acquire() assert(finished); ... read reat and error_is_read ... In reality you must have synchronized in some other way; that outside synchronization outside block-copy.c is what guarantees that the assertion does not fail. The simplest cases are: - a mutex: "unlock" counts as release, "lock" counts as acquire; - a bottom half: "schedule" counts as release, running counts as acquire. Therefore, removing the assertion would work just fine because the synchronization would be like this: write ret/error_is_read write finished trigger bottom half or something -->bottom half runs read ret/error_is_read So there is no need at all to read ->finished, much less to load it outside the assertion. At the same time there are two problems with "assert(qatomic_read(&call_state->finished))". Note that they are not logic issues, they are maintenance issues. First, if *there is a bug elsewhere* and the above synchronization doesn't happen, it may manifest sometimes as an assertion failure and sometimes as a memory reordering. This is what I was talking about in the previous message, and it is probably the last thing that you want when debugging; since we're adding asserts defensively, we might as well do it well. Second, if somebody later carelessly changes the function to if (qatomic_read(&call_state->finished)) { ... } else { error_setg(...); } that would be broken. Using qatomic_load_acquire makes the code more future-proof against a change like the one above. Paolo
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On 6/23/21 11:12 AM, Bin Meng wrote: > On Wed, Jun 23, 2021 at 4:55 PM Cédric Le Goater wrote: >> >> On 6/23/21 10:39 AM, Bin Meng wrote: >>> On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: The number of blocks is defined in the lower bits [15:0] >>> >>> I checked the physical spec v8.00 and it says bits [31:0] for CMD23 >>> argument. >> >> May be that's an eMMC thing. That's what I read from the specs : >> >> [31] Reliable Write Request >> [30:16] set to 0 >> [15:0] number of blocks > > I don't think we ever claimed eMMC support in QEMU. Did we? Is that a question ? C.
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On Wed, Jun 23, 2021 at 4:55 PM Cédric Le Goater wrote: > > On 6/23/21 10:39 AM, Bin Meng wrote: > > On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: > >> > >> The number of blocks is defined in the lower bits [15:0] > > > > I checked the physical spec v8.00 and it says bits [31:0] for CMD23 > > argument. > > May be that's an eMMC thing. That's what I read from the specs : > > [31] Reliable Write Request > [30:16] set to 0 > [15:0] number of blocks I don't think we ever claimed eMMC support in QEMU. Did we? Regards, Bin
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On Wed, Jun 23, 2021 at 4:52 PM Philippe Mathieu-Daudé wrote: > > On 6/23/21 10:39 AM, Bin Meng wrote: > > On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: > >> > >> The number of blocks is defined in the lower bits [15:0] > > > > I checked the physical spec v8.00 and it says bits [31:0] for CMD23 > > argument. > > Watch out, we only support 1-3: > Yes > enum SDPhySpecificationVersion { > SD_PHY_SPECv1_10_VERS = 1, > SD_PHY_SPECv2_00_VERS = 2, > SD_PHY_SPECv3_01_VERS = 3, > }; > However the physical sepc v8.00 should document any difference between ver 3.0 and ver 8.0 if there are indeed any, but for CMD23 it does not. So it means it's 32-bit since day 1. To double check, I just downloaded the spec 3.01 and confirmed it's still 32-bit. Regards, Bin
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On 6/23/21 10:52 AM, Philippe Mathieu-Daudé wrote: > On 6/23/21 10:39 AM, Bin Meng wrote: >> On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: >>> >>> The number of blocks is defined in the lower bits [15:0] >> >> I checked the physical spec v8.00 and it says bits [31:0] for CMD23 argument. > > Watch out, we only support 1-3: > > enum SDPhySpecificationVersion { > SD_PHY_SPECv1_10_VERS = 1, > SD_PHY_SPECv2_00_VERS = 2, > SD_PHY_SPECv3_01_VERS = 3, > }; > Yes. block count was increased to 32-bit in v4 if I am correct. Any how, it is a bit more complex than the patch I sent which is fixing an issue I saw with eMMC. Thanks, C.
Re: [PATCH v2 2/1] qemu-img: Add "backing":true to unallocated map segments
Am 22.06.2021 um 18:56 hat Nir Soffer geschrieben: > On Tue, Jun 22, 2021 at 6:38 PM Kevin Wolf wrote: > > > > Am 11.06.2021 um 21:03 hat Eric Blake geschrieben: > > > To save the user from having to check 'qemu-img info --backing-chain' > > > or other followup command to determine which "depth":n goes beyond the > > > chain, add a boolean field "backing" that is set only for unallocated > > > portions of the disk. > > > > > > Signed-off-by: Eric Blake > > > --- > > > > > > Touches the same iotest output as 1/1. If we decide that switching to > > > "depth":n+1 is too risky, and that the mere addition of "backing":true > > > while keeping "depth":n is good enough, then we'd have just one patch, > > > instead of this double churn. Preferences? > > > > I think the additional flag is better because it's guaranteed to be > > backwards compatible, and because you don't need to know the number of > > layers to infer whether a cluster was allocated in the whole backing > > chain. And by exposing ALLOCATED we definitely give access to the whole > > information that exists in QEMU. > > > > However, to continue with the bike shedding: I won't insist on > > "allocated" even if that is what the flag is called internally and > > consistency is usually helpful, but "backing" is misleading, too, > > because intuitively it doesn't cover the top layer or standalone images > > without a backing file. How about something like "present"? > > Looks hard to document: > > # @present: if present and false, the range is not allocated within the > # backing chain (since 6.1) I'm not sure why you would document it with a double negative. > And is not consistent with "offset". It would work better as: > > # @present: if present, the range is allocated within the backing > # chain (since 6.1) Completely ignoring the value? I would have documented it like this, but with "if true..." instead of "if present...". > Or: > > # @absent: if present, the range is not allocated within the backing > # chain (since 6.1) This is possible, too, but generally positive flags are preferable to negative ones, and the internal one is already positive. > This is used by libnbd now: > https://github.com/libguestfs/libnbd/commit/1d01d2ac4f6443b160b7d81119d555e1aaedb56d > > But I'm fine with "backing", It is consistent with BLK_BACKING_FILE, > meaning this area exposes data from a backing file (if one exists). > > We use "backing" internally to be consistent with future qemu-img. I just realised that I actually misunderstood "backing" to mean the opposite of what it is in this patch! It really means "the data comes from some imaginary additional backing file that doesn't exist in the backing chain", while I understood it as "something in the (real) backing chain contains the data". "present" or "absent" should be much less prone to such misunderstandings. Kevin
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On 6/23/21 10:39 AM, Bin Meng wrote: > On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: >> >> The number of blocks is defined in the lower bits [15:0] > > I checked the physical spec v8.00 and it says bits [31:0] for CMD23 argument. May be that's an eMMC thing. That's what I read from the specs : [31] Reliable Write Request [30:16] set to 0 [15:0] number of blocks Thanks, C.
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On 6/23/21 10:39 AM, Bin Meng wrote: > On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: >> >> The number of blocks is defined in the lower bits [15:0] > > I checked the physical spec v8.00 and it says bits [31:0] for CMD23 argument. Watch out, we only support 1-3: enum SDPhySpecificationVersion { SD_PHY_SPECv1_10_VERS = 1, SD_PHY_SPECv2_00_VERS = 2, SD_PHY_SPECv3_01_VERS = 3, }; > >> >> Signed-off-by: Cédric Le Goater >> --- >> hw/sd/sd.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> > > Regards, > Bin >
Re: [PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
On Wed, Jun 23, 2021 at 4:30 PM Cédric Le Goater wrote: > > The number of blocks is defined in the lower bits [15:0] I checked the physical spec v8.00 and it says bits [31:0] for CMD23 argument. > > Signed-off-by: Cédric Le Goater > --- > hw/sd/sd.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Regards, Bin
[PATCH] sd: mmc: Fix SET_BLOCK_COUNT command argument
The number of blocks is defined in the lower bits [15:0] Signed-off-by: Cédric Le Goater --- hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index a73d80661a10..a2553a502edc 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1358,7 +1358,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) } switch (sd->state) { case sd_transfer_state: -sd->multi_blk_cnt = req.arg; +sd->multi_blk_cnt = req.arg & 0x; return sd_r1; default: -- 2.31.1
Re: [PATCH v3 02/24] modules: collect module meta-data
On Tue, Jun 22, 2021 at 06:03:45PM +0200, Paolo Bonzini wrote: > On 21/06/21 14:52, Gerd Hoffmann wrote: > > ninja: error: 'libui-curses.a.p/meson-generated_.._config-host.h.o', needed > > by 'ui-curses.modinfo.test', missing and no known rule to make it > > > > Hmm, not sure where this comes from. meson doesn't try to link > > config-host.h.o into libui-curses.a, so why does extract_all_objects() > > return it? > > > > Test patch (incremental to this series) below. > > Bug in Meson, fix at https://github.com/mesonbuild/meson/pull/8900. You can > just ignore missing files. Well, it's ninja throwing the error not the modinfo script, the script doesn't even run ... take care, Gerd