Re: [PATCH v6 04/13] hw/arm: Add NPCM730 and NPCM750 SoC models
On 7/17/20 6:59 PM, Havard Skinnemoen wrote: > +Markus Armbruster > > On Fri, Jul 17, 2020 at 5:20 AM Cédric Le Goater wrote: >> >> On 7/17/20 8:02 AM, Havard Skinnemoen wrote: >>> The Nuvoton NPCM7xx SoC family are used to implement Baseboard >>> Management Controllers in servers. While the family includes four SoCs, >>> this patch implements limited support for two of them: NPCM730 (targeted >>> for Data Center applications) and NPCM750 (targeted for Enterprise >>> applications). >>> >>> This patch includes little more than the bare minimum needed to boot a >>> Linux kernel built with NPCM7xx support in direct-kernel mode: >>> >>> - Two Cortex-A9 CPU cores with built-in periperhals. >>> - Global Configuration Registers. >>> - Clock Management. >>> - 3 Timer Modules with 5 timers each. >>> - 4 serial ports. >>> >>> The chips themselves have a lot more features, some of which will be >>> added to the model at a later stage. >>> >>> Reviewed-by: Tyrone Ting >>> Reviewed-by: Joel Stanley >>> Reviewed-by: Philippe Mathieu-Daudé >>> Signed-off-by: Havard Skinnemoen >>> --- >>> include/hw/arm/npcm7xx.h | 85 >>> hw/arm/npcm7xx.c | 407 +++ >>> hw/arm/Kconfig | 5 + >>> hw/arm/Makefile.objs | 1 + >>> 4 files changed, 498 insertions(+) >>> create mode 100644 include/hw/arm/npcm7xx.h >>> create mode 100644 hw/arm/npcm7xx.c ... >>> +static void npcm7xx_realize(DeviceState *dev, Error **errp) >>> +{ >>> +NPCM7xxState *s = NPCM7XX(dev); >>> +NPCM7xxClass *nc = NPCM7XX_GET_CLASS(s); >>> +int i; >>> + >>> +if (memory_region_size(s->dram) > NPCM7XX_DRAM_SZ) { >>> +error_setg(errp, "%s: NPCM7xx cannot address more than %" PRIu64 >>> + " MiB of DRAM", __func__, NPCM7XX_DRAM_SZ / MiB); >>> +return; >>> +} >>> + >>> +/* CPUs */ >>> +for (i = 0; i < nc->num_cpus; i++) { >>> +object_property_set_int(OBJECT(&s->cpu[i]), "mp-affinity", >>> +arm_cpu_mp_affinity(i, >>> NPCM7XX_MAX_NUM_CPUS), >>> +&error_abort); >>> +object_property_set_int(OBJECT(&s->cpu[i]), "reset-cbar", >>> +NPCM7XX_GIC_CPU_IF_ADDR, &error_abort); >>> +object_property_set_bool(OBJECT(&s->cpu[i]), "reset-hivecs", true, >>> + &error_abort); >>> + >>> +/* Disable security extensions. */ >>> +object_property_set_bool(OBJECT(&s->cpu[i]), "has_el3", false, >>> + &error_abort); >>> + >>> +if (!qdev_realize(DEVICE(&s->cpu[i]), NULL, errp)) { >>> +return; >>> +} >>> +} >>> + >>> +/* A9MPCORE peripherals. Can only fail if we pass bad parameters here. >>> */ >>> +object_property_set_int(OBJECT(&s->a9mpcore), "num-cpu", nc->num_cpus, >>> +&error_abort); >>> +object_property_set_int(OBJECT(&s->a9mpcore), "num-irq", >>> NPCM7XX_NUM_IRQ, >>> +&error_abort); [1] >>> +sysbus_realize(SYS_BUS_DEVICE(&s->a9mpcore), &error_abort); [2] >> >> shouldn't we test the return value and use errp ? I don't know what >> was agreed upon. Per https://www.mail-archive.com/qemu-devel@nongnu.org/msg723217.html: >> 1. Internal code failing to set simple properties to predefined >> values is a programming error, so error_abort is appropriate. > > That would be my advice. > >> 2. qdev_realize() may fail due to user input, so errors should be >> propagated. > > In general, yes. For a specific device, you may know it can't fail, > and then &error_abort may be okay. So it looks correct. > > I'm not sure if I got it 100% right, but what I tried to do was to see > which submodules could possibly fail due to user input, and propagate > errors from those modules only. > > For example, the GCR can fail if the user-provided memory size can't > be encoded into registers, so that one clearly needs to be propagated. > > Other modules don't take any parameters at all, so they can only fail > due to programming errors, hence error_abort. > > I wasn't able to find any way command line options could cause the > a9mpcore module to fail, but that's one of the cases I'm very unsure > about, so I'll be happy to propagate errors from that if you (or > anyone else) think it's needed. > > I'm also not sure about the CPUs, but ended up going the other way > since there's a -cpu option, and it's plausible that the user could > cause it to fail even though I couldn't find any specific options to > trigger an error.
Re: [PATCH v6 00/13] Add Nuvoton NPCM730/NPCM750 SoCs and two BMC machines
On Fri, Jul 17, 2020 at 1:32 PM Cédric Le Goater wrote: > > On 7/17/20 8:02 AM, Havard Skinnemoen wrote: > > I also pushed this and the previous two patchsets to my qemu fork on github. > > The branches are named npcm7xx-v[1-6]. > > > > https://github.com/hskinnemoen/qemu > > > > This patch series models enough of the Nuvoton NPCM730 and NPCM750 SoCs to > > boot > > an OpenBMC image built for quanta-gsj. This includes device models for: > > > > - Global Configuration Registers > > - Clock Control > > - Timers > > - Fuses > > - Memory Controller > > - Flash Controller > > > > These modules, along with the existing Cortex A9 CPU cores and built-in > > peripherals, are integrated into a NPCM730 or NPCM750 SoC, which in turn > > form > > the foundation for the quanta-gsj and npcm750-evb machines, respectively. > > The > > two SoCs are very similar; the only difference is that NPCM730 is missing > > some > > peripherals that NPCM750 has, and which are not considered essential for > > datacenter use (e.g. graphics controllers). For more information, see > > > > https://www.nuvoton.com/products/cloud-computing/ibmc/ > > > > Both quanta-gsj and npcm750-evb correspond to real boards supported by > > OpenBMC. > > At the end of the series, qemu can boot an OpenBMC image built for one of > > these > > boards with some minor modifications. > > > > The patches in this series were developed by Google and reviewed by > > Nuvoton. We > > will be maintaining the machine and peripheral support together. > > > > The data sheet for these SoCs is not generally available. Please let me > > know if > > more comments are needed to understand the device behavior. > > I think this series is ready to go. > > Patch 6 "roms: Add virtual Boot ROM for NPCM7xx SoCs" needs a few > add-ons which can come later. > > Patch 13 "tests/acceptance: console boot tests for quanta-gsj" needs > an Acked-by by our test gurus. Aspeed needs the same kind of tests > if possible. It depends on the pressure that the QEMU CI will put on > the web servers hosting the images. Hmm, I didn't expect this to be a problem when hosting the images on github...? > I think that the rest of the comments are minor and can be fixed > as follow-ups when 5.2 is being assembled. Great! Thanks a lot for reviewing and testing. I've incorporated the feedback I got today, but I'm keeping them as separate commits for now. I can either fold them into a v7 series or post them as follow-ups. Thanks! Havard
[PATCH v2] goldfish_rtc: Fix non-atomic read behaviour of TIME_LOW/TIME_HIGH
The specification says: 0x00 TIME_LOW R: Get current time, then return low-order 32-bits. 0x04 TIME_HIGH R: Return high 32-bits from previous TIME_LOW read. ... To read the value, the kernel must perform an IO_READ(TIME_LOW), which returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), which returns a signed 32-bit value, corresponding to the higher half of the full value. However, we were just returning the current time for both. If the guest is unlucky enough to read TIME_LOW and TIME_HIGH either side of an overflow of the lower half, it will see time be in the future, before jumping backwards on the next read, and Linux currently relies on the atomicity guaranteed by the spec so is affected by this. Fix this violation of the spec by caching the correct value for TIME_HIGH whenever TIME_LOW is read, and returning that value for any TIME_HIGH read. Signed-off-by: Jessica Clarke --- Changes since v1: * Add time_high to goldfish_rtc_vmstate and increment version. hw/rtc/goldfish_rtc.c | 17 ++--- include/hw/rtc/goldfish_rtc.h | 1 + 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c index 01e9d2b083..6ddd45cce0 100644 --- a/hw/rtc/goldfish_rtc.c +++ b/hw/rtc/goldfish_rtc.c @@ -94,12 +94,22 @@ static uint64_t goldfish_rtc_read(void *opaque, hwaddr offset, GoldfishRTCState *s = opaque; uint64_t r = 0; +/* + * From the documentation linked at the top of the file: + * + * To read the value, the kernel must perform an IO_READ(TIME_LOW), which + * returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), which + * returns a signed 32-bit value, corresponding to the higher half of the + * full value. + */ switch (offset) { case RTC_TIME_LOW: -r = goldfish_rtc_get_count(s) & 0x; +r = goldfish_rtc_get_count(s); +s->time_high = r >> 32; +r &= 0x; break; case RTC_TIME_HIGH: -r = goldfish_rtc_get_count(s) >> 32; +r = s->time_high; break; case RTC_ALARM_LOW: r = s->alarm_next & 0x; @@ -216,7 +226,7 @@ static const MemoryRegionOps goldfish_rtc_ops = { static const VMStateDescription goldfish_rtc_vmstate = { .name = TYPE_GOLDFISH_RTC, -.version_id = 1, +.version_id = 2, .pre_save = goldfish_rtc_pre_save, .post_load = goldfish_rtc_post_load, .fields = (VMStateField[]) { @@ -225,6 +235,7 @@ static const VMStateDescription goldfish_rtc_vmstate = { VMSTATE_UINT32(alarm_running, GoldfishRTCState), VMSTATE_UINT32(irq_pending, GoldfishRTCState), VMSTATE_UINT32(irq_enabled, GoldfishRTCState), +VMSTATE_UINT32(time_high, GoldfishRTCState), VMSTATE_END_OF_LIST() } }; diff --git a/include/hw/rtc/goldfish_rtc.h b/include/hw/rtc/goldfish_rtc.h index 16f9f9e29d..9bd8924f5f 100644 --- a/include/hw/rtc/goldfish_rtc.h +++ b/include/hw/rtc/goldfish_rtc.h @@ -41,6 +41,7 @@ typedef struct GoldfishRTCState { uint32_t alarm_running; uint32_t irq_pending; uint32_t irq_enabled; +uint32_t time_high; } GoldfishRTCState; #endif -- 2.20.1
Re: [PATCH] goldfish_rtc: Fix non-atomic read behaviour of TIME_LOW/TIME_HIGH
On 7/17/20 5:20 PM, Jessica Clarke wrote: > The specification says: > >0x00 TIME_LOW R: Get current time, then return low-order 32-bits. >0x04 TIME_HIGH R: Return high 32-bits from previous TIME_LOW read. > >... > >To read the value, the kernel must perform an IO_READ(TIME_LOW), >which returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), >which returns a signed 32-bit value, corresponding to the higher half >of the full value. > > However, we were just returning the current time for both. If the guest > is unlucky enough to read TIME_LOW and TIME_HIGH either side of an > overflow of the lower half, it will see time be in the future, before > jumping backwards on the next read, and Linux currently relies on the > atomicity guaranteed by the spec so is affected by this. Fix this > violation of the spec by caching the correct value for TIME_HIGH > whenever TIME_LOW is read, and returning that value for any TIME_HIGH > read. > > Signed-off-by: Jessica Clarke > --- > hw/rtc/goldfish_rtc.c | 14 -- > include/hw/rtc/goldfish_rtc.h | 1 + > 2 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c > index 01e9d2b083..9b577bf159 100644 > --- a/hw/rtc/goldfish_rtc.c > +++ b/hw/rtc/goldfish_rtc.c > @@ -94,12 +94,22 @@ static uint64_t goldfish_rtc_read(void *opaque, hwaddr > offset, > GoldfishRTCState *s = opaque; > uint64_t r = 0; > > +/* > + * From the documentation linked at the top of the file: > + * > + * To read the value, the kernel must perform an IO_READ(TIME_LOW), > which > + * returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), > which > + * returns a signed 32-bit value, corresponding to the higher half of > the > + * full value. > + */ > switch (offset) { > case RTC_TIME_LOW: > -r = goldfish_rtc_get_count(s) & 0x; > +r = goldfish_rtc_get_count(s); > +s->time_high = r >> 32; > +r &= 0x; > break; > case RTC_TIME_HIGH: > -r = goldfish_rtc_get_count(s) >> 32; > +r = s->time_high; > break; > case RTC_ALARM_LOW: > r = s->alarm_next & 0x; > diff --git a/include/hw/rtc/goldfish_rtc.h b/include/hw/rtc/goldfish_rtc.h > index 16f9f9e29d..9bd8924f5f 100644 > --- a/include/hw/rtc/goldfish_rtc.h > +++ b/include/hw/rtc/goldfish_rtc.h > @@ -41,6 +41,7 @@ typedef struct GoldfishRTCState { > uint32_t alarm_running; > uint32_t irq_pending; > uint32_t irq_enabled; > +uint32_t time_high; > } GoldfishRTCState; You need to add the new field to goldfish_rtc_vmstate, and increment the version. r~
[PATCH] goldfish_rtc: Fix non-atomic read behaviour of TIME_LOW/TIME_HIGH
The specification says: 0x00 TIME_LOW R: Get current time, then return low-order 32-bits. 0x04 TIME_HIGH R: Return high 32-bits from previous TIME_LOW read. ... To read the value, the kernel must perform an IO_READ(TIME_LOW), which returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), which returns a signed 32-bit value, corresponding to the higher half of the full value. However, we were just returning the current time for both. If the guest is unlucky enough to read TIME_LOW and TIME_HIGH either side of an overflow of the lower half, it will see time be in the future, before jumping backwards on the next read, and Linux currently relies on the atomicity guaranteed by the spec so is affected by this. Fix this violation of the spec by caching the correct value for TIME_HIGH whenever TIME_LOW is read, and returning that value for any TIME_HIGH read. Signed-off-by: Jessica Clarke --- hw/rtc/goldfish_rtc.c | 14 -- include/hw/rtc/goldfish_rtc.h | 1 + 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/hw/rtc/goldfish_rtc.c b/hw/rtc/goldfish_rtc.c index 01e9d2b083..9b577bf159 100644 --- a/hw/rtc/goldfish_rtc.c +++ b/hw/rtc/goldfish_rtc.c @@ -94,12 +94,22 @@ static uint64_t goldfish_rtc_read(void *opaque, hwaddr offset, GoldfishRTCState *s = opaque; uint64_t r = 0; +/* + * From the documentation linked at the top of the file: + * + * To read the value, the kernel must perform an IO_READ(TIME_LOW), which + * returns an unsigned 32-bit value, before an IO_READ(TIME_HIGH), which + * returns a signed 32-bit value, corresponding to the higher half of the + * full value. + */ switch (offset) { case RTC_TIME_LOW: -r = goldfish_rtc_get_count(s) & 0x; +r = goldfish_rtc_get_count(s); +s->time_high = r >> 32; +r &= 0x; break; case RTC_TIME_HIGH: -r = goldfish_rtc_get_count(s) >> 32; +r = s->time_high; break; case RTC_ALARM_LOW: r = s->alarm_next & 0x; diff --git a/include/hw/rtc/goldfish_rtc.h b/include/hw/rtc/goldfish_rtc.h index 16f9f9e29d..9bd8924f5f 100644 --- a/include/hw/rtc/goldfish_rtc.h +++ b/include/hw/rtc/goldfish_rtc.h @@ -41,6 +41,7 @@ typedef struct GoldfishRTCState { uint32_t alarm_running; uint32_t irq_pending; uint32_t irq_enabled; +uint32_t time_high; } GoldfishRTCState; #endif -- 2.20.1
Re: tests/vm infrastructure fails to notice that QEMU dying is a failure
On 7/17/20 9:25 AM, Philippe Mathieu-Daudé wrote: On 7/17/20 3:22 PM, Philippe Mathieu-Daudé wrote: Cc'ing John & Cleber. On 7/17/20 3:08 PM, Peter Maydell wrote: If you run vm-build-openbsd, our makefile/scripting infrastructure seems to fail to notice that death of the QEMU process that's running the VM should be a failure, and ends up allowing make to return a success condition. I have a script which runs a VM build which basically does: #!/bin/sh -e make -C "build" "vm-build-openbsd" J=8 V=1 echo "OK DONE openbsd" It just gave me this output (tail end of logfile). We're executing tests, and then the qemu-system-x86_64 that's running the OpenBSD VM gets a signal 9 (sigkill), for unclear reasons (oom killer??). The python scripting gets an exception, but doesn't exit with a failure status to make, which then thinks all is fine, exits success itself and allows the set -e script to proceed to print the "OK DONE" line... The signal 9 is likely due to QEMU freezing on exit. The machine.py code waits about 3 seconds after trying to shut QEMU down gracefully before sending SIGKILL and cleaning up forcefully. PASS 30 qos-test /arm/imx25-pdk/imx.i2c/i2c-bus/pca9552/pca9552-tests/tx-rx PASS 31 qos-test /arm/imx25-pdk/imx.i2c/i2c-bus/pca9552/pca9552-tests/rx-autoinc PASS 32 qos-test /arm/imx25-pdk/imx.i2c/i2c-bus/ds1338/ds1338-tests/tx-rx DEBUG:QMP.qemu-26462:>>> {'execute': 'quit'} DEBUG:QMP.qemu-26462:<<< {'timestamp': {'seconds': 1594984057, 'microseconds': 485197}, 'event': 'NIC_RX_FILTER_CHANGED', 'data': {'path': '/machine/peripheral-anon/device[0]/virtio-backend'}} DEBUG:QMP.qemu-26462:<<< {'timestamp': {'seconds': 1594985855, 'microseconds': 169552}, 'event': 'RTC_CHANGE', 'data': {'offset': 0}} DEBUG:QMP.qemu-26462:<<< {'timestamp': {'seconds': 1594987655, 'microseconds': 169187}, 'event': 'RTC_CHANGE', 'data': {'offset': 0}} DEBUG:QMP.qemu-26462:<<< {'timestamp': {'seconds': 1594989456, 'microseconds': 88866}, 'event': 'RTC_CHANGE', 'data': {'offset': 0}} DEBUG:QMP.qemu-26462:<<< {'return': {}} WARNING:qemu.machine:qemu received signal 9; command: "qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/var/tmp/qemu-26462-monitor.sock -mon chardev=mon,mode=control -machine pc -chardev socket,id=console,path=/var/tmp/qemu-26462-console.sock,server,nowait -serial chardev:console -nodefaults -m 4G -cpu max -netdev user,id=vnet,hostfwd=:127.0.0.1:0-:22 -device virtio-net-pci,netdev=vnet -vnc 127.0.0.1:0,to=20 -smp 8 -enable-kvm -drive file=/home/peter.maydell/.cache/qemu-vm/images/openbsd.img,snapshot=on,if=none,id=drive0,cache=writeback -device virtio-blk,drive=drive0,bootindex=0 -drive file=/home/peter.maydell/qemu-openbsd/build/vm-test-yzwn6xdc.tmp/data-993a1.tar,if=none,id=data-993a1,cache=writeback,format=raw -device virtio-blk,drive=data-993a1,serial=data-993a1,bootindex=1" Error in atexit._run_exitfuncs: Traceback (most recent call last): File "/home/peter.maydell/qemu-openbsd/tests/vm/../../python/qemu/machine.py", line 436, in _do_shutdown self._soft_shutdown(has_quit, timeout) File "/home/peter.maydell/qemu-openbsd/tests/vm/../../python/qemu/machine.py", line 419, in _soft_shutdown self._popen.wait(timeout=timeout) File "/usr/lib/python3.6/subprocess.py", line 1469, in wait raise TimeoutExpired(self.args, timeout) This is where it times out waiting for the process to exit of its own volition. This is the "Inner" exception ... subprocess.TimeoutExpired: Command '['qemu-system-x86_64', '-display', 'none', '-vga', 'none', '-chardev', 'socket,id=mon,path=/var/tmp/qemu-26462-monitor.sock', '-mon', 'chardev=mon,mode=control', '-machine', 'pc', '-chardev', 'socket,id=console,path=/var/tmp/qemu-26462-console.sock,server,nowait', '-serial', 'chardev:console', '-nodefaults', '-m', '4G', '-cpu', 'max', '-netdev', 'user,id=vnet,hostfwd=:127.0.0.1:0-:22', '-device', 'virtio-net-pci,netdev=vnet', '-vnc', '127.0.0.1:0,to=20', '-smp', '8', '-enable-kvm', '-drive', 'file=/home/peter.maydell/.cache/qemu-vm/images/openbsd.img,snapshot=on,if=none,id=drive0,cache=writeback', '-device', 'virtio-blk,drive=drive0,bootindex=0', '-drive', 'file=/home/peter.maydell/qemu-openbsd/build/vm-test-yzwn6xdc.tmp/data-993a1.tar,if=none,id=data-993a1,cache=writeback,format=raw', '-device', 'virtio-blk,drive=data-993a1,serial=data-993a1,bootindex=1']' timed out after 3 seconds The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/peter.maydell/qemu-openbsd/tests/vm/../../python/qemu/machine.py", line 466, in shutdown self._do_shutdown(has_quit, timeout=timeout) File "/home/peter.maydell/qemu-openbsd/tests/vm/../../python/qemu/machine.py", line 440, in _do_shutdown from exc qemu.machine.AbnormalShutdown: Could not perform graceful shutdown And this is the "Outer" exception, the machine.py cleanup handler letting you know that while it did perform cleanup, QEMU was not shut down
Re: [PATCH-for-5.2] hw/vfio: Move some target-independent devices to common-objs
On Wed, 15 Jul 2020 15:13:22 +0200 Philippe Mathieu-Daudé wrote: > These devices do not depend on the target CPU configuration > (32 or 64-bit, big / little endian). Move them to common-obj > to compile them once for all the targets. > > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/vfio/Makefile.objs | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs > index 9bb1c09e84..f663a800b1 100644 > --- a/hw/vfio/Makefile.objs > +++ b/hw/vfio/Makefile.objs > @@ -1,8 +1,9 @@ > obj-y += common.o spapr.o > -obj-$(CONFIG_VFIO_PCI) += pci.o pci-quirks.o display.o > +obj-$(CONFIG_VFIO_PCI) += pci.o pci-quirks.o > obj-$(CONFIG_VFIO_CCW) += ccw.o > obj-$(CONFIG_VFIO_PLATFORM) += platform.o > -obj-$(CONFIG_VFIO_XGMAC) += calxeda-xgmac.o > -obj-$(CONFIG_VFIO_AMD_XGBE) += amd-xgbe.o > obj-$(CONFIG_VFIO_AP) += ap.o > -obj-$(CONFIG_VFIO_IGD) += igd.o > +common-obj-$(CONFIG_VFIO_PCI) += display.o > +common-obj-$(CONFIG_VFIO_XGMAC) += calxeda-xgmac.o > +common-obj-$(CONFIG_VFIO_AMD_XGBE) += amd-xgbe.o > +common-obj-$(CONFIG_VFIO_IGD) += igd.o What thing, or lack of thing, are you looking for in these files that aren't in the others? For instance if igd.o is common, but pci-quirks.o is not, where igd.o is just a quirk that got split out of pci-quirks.o so that we could more easily compile it out, it feels like we might be stumbling into cases where adding or removing code that would shift objects into or out of the common-obj target. I don't know how to maintain that. Thanks, Alex
[Bug 1880287] Re: gcc crashes in hppa emulation
Test still crashes the VM and chroot with up-to-date debian chroot, including updated gcc-9.3.0-14. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1880287 Title: gcc crashes in hppa emulation Status in QEMU: New Bug description: There seems to be a translation bug in the qemu-hppa (qemu v5.0.0) emulation: A stripped down testcase (taken from Linux kernel build) is attached. In there is "a.sh", a shell script which calls gcc-9 (fails with both debian gcc-9.3.0-11 or gcc-9.3.0-12). and "a.iii", the preprocessed source. When starting a.sh, in the emulation gcc crashes with segfault. On real hardware gcc succeeds to compile the source. In a hppa-user chroot running "apt update && apt install gcc-9" should be sufficient to get the needed reproducer environment. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1880287/+subscriptions
Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj
On Fri, Jul 17, 2020 at 1:52 PM Philippe Mathieu-Daudé wrote: > > On 7/17/20 9:18 PM, Havard Skinnemoen wrote: > > On Fri, Jul 17, 2020 at 2:00 AM Philippe Mathieu-Daudé > > wrote: > >> > >> On 7/17/20 10:27 AM, Philippe Mathieu-Daudé wrote: > >>> On 7/17/20 10:03 AM, Thomas Huth wrote: > On 17/07/2020 09.48, Philippe Mathieu-Daudé wrote: > > +Thomas > > > On 7/16/20 10:56 PM, Havard Skinnemoen wrote: > >> On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen > >> wrote: > >>> > >>> On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé > >>> wrote: > > On 7/15/20 11:00 AM, Markus Armbruster wrote: > > Now my point. Why first make up user configuration, then use that > > to > > create a BlockBackend, when you could just go ahead and create the > > BlockBackend? > > CLI issue mostly. > > We can solve it similarly to the recent "sdcard: Do not allow > invalid SD > card sizes" patch: > > if (!dinfo) { > error_setg(errp, "Missing SPI flash drive"); > error_append_hint(errp, "You can use a dummy drive using:\n"); > error_append_hint(errp, "-drive if=mtd,driver=null-co," > "read-ones=on,size=64M\n); > return; > } > > having npcm7xx_connect_flash() taking an Error* argument, > and MachineClass::init() call it with &error_fatal. > >>> > >>> Erroring out if the user specifies a configuration that can't possibly > >>> boot sounds good to me. Better than trying to come up with defaults > >>> that are still not going to result in a bootable system. > >>> > >>> For testing recovery paths, I think it makes sense to explicitly > >>> specify a null device as you suggest. > >> > >> Hmm, one problem. qom-test fails with > >> > >> qemu-system-aarch64: Missing SPI flash drive > >> You can add a dummy drive using: > >> -drive if=mtd,driver=null-co,read-zeroes=on,size=32M > >> Broken pipe > >> /usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166: > >> kill_qemu() tried to terminate QEMU process but encountered exit > >> status 1 (expected 0) > >> ERROR qom-test - too few tests run (expected 68, got 7) > >> > >> So it looks like we might need a different solution to this, unless we > >> want to make generic tests more machine-aware... > > I didn't follow the other mails in this thread, but what we usually do > in such a case: Add a "if (qtest_enabled())" check to the device or the > machine to ignore the error if it is running in qtest mode. > >>> > >>> Hmm I'm not sure it works in this case. We could do: > >>> > >>> if (!dinfo) { > >>> if (qtest) { > >>> /* create null drive for qtest */ > >>> opts = ...; > >>> dinfo = drive_new(opts, IF_MTD, &error_abort); > >>> } else { > >>> /* teach user to use proper CLI */ > >>> error_setg(errp, "Missing SPI flash drive"); > >>> error_append_hint(errp, "You can use a dummy drive using:\n"); > >>> error_append_hint(errp, "-drive if=mtd,driver=null-co," > >>> "read-ones=on,size=64M\n); > >>> } > >>> } > >>> > >>> But I'm not sure Markus will enjoy it :) > >>> > >>> Markus, any better idea about how to handle that with automatic qtests? > >> > >> FWIW IDE device has a concept of "Anonymous BlockBackend for an empty > >> drive": > >> > >> static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp) > >> { > >> IDEBus *bus = DO_UPCAST(IDEBus, qbus, dev->qdev.parent_bus); > >> IDEState *s = bus->ifs + dev->unit; > >> int ret; > >> > >> if (!dev->conf.blk) { > >> if (kind != IDE_CD) { > >> error_setg(errp, "No drive specified"); > >> return; > >> } else { > >> /* Anonymous BlockBackend for an empty drive */ > >> dev->conf.blk = blk_new(qemu_get_aio_context(), 0, > >> BLK_PERM_ALL); > >> ret = blk_attach_dev(dev->conf.blk, &dev->qdev); > >> assert(ret == 0); > >> } > >> } > > > > Could someone please remind me what problem we're trying to solve here? > > Sorry, out of the scope of your series, which is fine with the current > code base :) > > > Currently, if the user (or test) doesn't provide a drive, we pass NULL > > as the block backend to m25p80. This means we'll take the code path in > > m25p_realize() that does > > > > trace_m25p80_binding_no_bdrv(s); > > s->storage = blk_blockalign(NULL, s->size); > > memset(s->storage, 0xFF, s->size); > > > > which will look like a freshly chip-erased flash chip. > > > > Are we looking for a more elegant way to replace those three lines of > > code (+ a couple of conditionals in
Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj
On 7/17/20 9:18 PM, Havard Skinnemoen wrote: > On Fri, Jul 17, 2020 at 2:00 AM Philippe Mathieu-Daudé > wrote: >> >> On 7/17/20 10:27 AM, Philippe Mathieu-Daudé wrote: >>> On 7/17/20 10:03 AM, Thomas Huth wrote: On 17/07/2020 09.48, Philippe Mathieu-Daudé wrote: > +Thomas > On 7/16/20 10:56 PM, Havard Skinnemoen wrote: >> On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen >> wrote: >>> >>> On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé >>> wrote: On 7/15/20 11:00 AM, Markus Armbruster wrote: > Now my point. Why first make up user configuration, then use that to > create a BlockBackend, when you could just go ahead and create the > BlockBackend? CLI issue mostly. We can solve it similarly to the recent "sdcard: Do not allow invalid SD card sizes" patch: if (!dinfo) { error_setg(errp, "Missing SPI flash drive"); error_append_hint(errp, "You can use a dummy drive using:\n"); error_append_hint(errp, "-drive if=mtd,driver=null-co," "read-ones=on,size=64M\n); return; } having npcm7xx_connect_flash() taking an Error* argument, and MachineClass::init() call it with &error_fatal. >>> >>> Erroring out if the user specifies a configuration that can't possibly >>> boot sounds good to me. Better than trying to come up with defaults >>> that are still not going to result in a bootable system. >>> >>> For testing recovery paths, I think it makes sense to explicitly >>> specify a null device as you suggest. >> >> Hmm, one problem. qom-test fails with >> >> qemu-system-aarch64: Missing SPI flash drive >> You can add a dummy drive using: >> -drive if=mtd,driver=null-co,read-zeroes=on,size=32M >> Broken pipe >> /usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166: >> kill_qemu() tried to terminate QEMU process but encountered exit >> status 1 (expected 0) >> ERROR qom-test - too few tests run (expected 68, got 7) >> >> So it looks like we might need a different solution to this, unless we >> want to make generic tests more machine-aware... I didn't follow the other mails in this thread, but what we usually do in such a case: Add a "if (qtest_enabled())" check to the device or the machine to ignore the error if it is running in qtest mode. >>> >>> Hmm I'm not sure it works in this case. We could do: >>> >>> if (!dinfo) { >>> if (qtest) { >>> /* create null drive for qtest */ >>> opts = ...; >>> dinfo = drive_new(opts, IF_MTD, &error_abort); >>> } else { >>> /* teach user to use proper CLI */ >>> error_setg(errp, "Missing SPI flash drive"); >>> error_append_hint(errp, "You can use a dummy drive using:\n"); >>> error_append_hint(errp, "-drive if=mtd,driver=null-co," >>> "read-ones=on,size=64M\n); >>> } >>> } >>> >>> But I'm not sure Markus will enjoy it :) >>> >>> Markus, any better idea about how to handle that with automatic qtests? >> >> FWIW IDE device has a concept of "Anonymous BlockBackend for an empty >> drive": >> >> static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp) >> { >> IDEBus *bus = DO_UPCAST(IDEBus, qbus, dev->qdev.parent_bus); >> IDEState *s = bus->ifs + dev->unit; >> int ret; >> >> if (!dev->conf.blk) { >> if (kind != IDE_CD) { >> error_setg(errp, "No drive specified"); >> return; >> } else { >> /* Anonymous BlockBackend for an empty drive */ >> dev->conf.blk = blk_new(qemu_get_aio_context(), 0, >> BLK_PERM_ALL); >> ret = blk_attach_dev(dev->conf.blk, &dev->qdev); >> assert(ret == 0); >> } >> } > > Could someone please remind me what problem we're trying to solve here? Sorry, out of the scope of your series, which is fine with the current code base :) > Currently, if the user (or test) doesn't provide a drive, we pass NULL > as the block backend to m25p80. This means we'll take the code path in > m25p_realize() that does > > trace_m25p80_binding_no_bdrv(s); > s->storage = blk_blockalign(NULL, s->size); > memset(s->storage, 0xFF, s->size); > > which will look like a freshly chip-erased flash chip. > > Are we looking for a more elegant way to replace those three lines of > code (+ a couple of conditionals in the writeback paths)? Yes, I am. Anyway, unrelated to your work, sorry if it confused you. > > But we don't even have a dummy device that looks like an erased flash chip... No, this is still the design stage, but your series has a quality that let us foreseen a bit where we are heading... > > Havard >
[PATCH v1 2/3] python/qemu: Change ConsoleSocket to optionally drain socket.
The primary purpose of this change is to clean up machine.py's console_socket property to return a single type, a ConsoleSocket. ConsoleSocket now derives from a socket, which means that in the default case (of not draining), machine.py will see the same behavior as it did prior to ConsoleSocket. Signed-off-by: Robert Foley --- python/qemu/console_socket.py | 92 +-- python/qemu/machine.py| 13 ++--- 2 files changed, 59 insertions(+), 46 deletions(-) diff --git a/python/qemu/console_socket.py b/python/qemu/console_socket.py index 09986bc215..70869fbbdc 100644 --- a/python/qemu/console_socket.py +++ b/python/qemu/console_socket.py @@ -13,68 +13,75 @@ which can drain a socket and optionally dump the bytes to file. # the COPYING file in the top-level directory. # -import asyncore import socket import threading from collections import deque import time -class ConsoleSocket(asyncore.dispatcher): +class ConsoleSocket(socket.socket): """ ConsoleSocket represents a socket attached to a char device. -Drains the socket and places the bytes into an in memory buffer -for later processing. +Optionally (if drain==True), drains the socket and places the bytes +into an in memory buffer for later processing. Optionally a file path can be passed in and we will also dump the characters to this file for debugging purposes. """ -def __init__(self, address, file=None): +def __init__(self, address, file=None, drain=False): self._recv_timeout_sec = 300 self._sleep_time = 0.5 self._buffer = deque() -self._asyncore_thread = None -self._sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) -self._sock.connect(address) +socket.socket.__init__(self, socket.AF_UNIX, socket.SOCK_STREAM) +self.connect(address) self._logfile = None if file: self._logfile = open(file, "w") -asyncore.dispatcher.__init__(self, sock=self._sock) self._open = True -self._thread_start() +if drain: +self._drain_thread = self._thread_start() +else: +self._drain_thread = None -def _thread_start(self): -"""Kick off a thread to wait on the asyncore.loop""" -if self._asyncore_thread is not None: -return -self._asyncore_thread = threading.Thread(target=asyncore.loop, - kwargs={'timeout':1}) -self._asyncore_thread.daemon = True -self._asyncore_thread.start() +def _drain_fn(self): +"""Drains the socket and runs while the socket is open.""" +while self._open: +try: +self._drain_socket() +except socket.timeout: +# The socket is expected to timeout since we set a +# short timeout to allow the thread to exit when +# self._open is set to False. +time.sleep(self._sleep_time) -def handle_close(self): -"""redirect close to base class""" -# Call the base class close, but not self.close() since -# handle_close() occurs in the context of the thread which -# self.close() attempts to join. -asyncore.dispatcher.close(self) +def _thread_start(self): +"""Kick off a thread to drain the socket.""" +# Configure socket to not block and timeout. +# This allows our drain thread to not block +# on recieve and exit smoothly. +socket.socket.setblocking(self, False) +socket.socket.settimeout(self, 1) +drain_thread = threading.Thread(target=self._drain_fn) +drain_thread.daemon = True +drain_thread.start() +return drain_thread def close(self): """Close the base object and wait for the thread to terminate""" if self._open: self._open = False -asyncore.dispatcher.close(self) -if self._asyncore_thread is not None: -thread, self._asyncore_thread = self._asyncore_thread, None +if self._drain_thread is not None: +thread, self._drain_thread = self._drain_thread, None thread.join() +socket.socket.close(self) if self._logfile: self._logfile.close() self._logfile = None -def handle_read(self): +def _drain_socket(self): """process arriving characters into in memory _buffer""" -data = asyncore.dispatcher.recv(self, 1) +data = socket.socket.recv(self, 1) # latin1 is needed since there are some chars # we are receiving that cannot be encoded to utf-8 # such as 0xe2, 0x80, 0xA6. @@ -85,27 +92,38 @@ class ConsoleSocket(asyncore.dispatcher): for c in string: self._buffer.extend(c) -def recv(self, buffer_size=1): +def re
[PATCH v1 3/3] tests/vm: add shutdown timeout in basevm.py
We are adding the shutdown timeout to solve an issue we now see where the aarch64 VMs timeout on shutdown under TCG. There is a new 3 second timeout in machine.py, which we override in basevm.py when shutting down. Signed-off-by: Robert Foley --- tests/vm/basevm.py | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index 7acb48b876..3fac20e929 100644 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -80,6 +80,8 @@ class BaseVM(object): arch = "#arch" # command to halt the guest, can be overridden by subclasses poweroff = "poweroff" +# Time to wait for shutdown to finish. +shutdown_timeout_default = 30 # enable IPv6 networking ipv6 = True # This is the timeout on the wait for console bytes. @@ -87,7 +89,7 @@ class BaseVM(object): # Scale up some timeouts under TCG. # 4 is arbitrary, but greater than 2, # since we found we need to wait more than twice as long. -tcg_ssh_timeout_multiplier = 4 +tcg_timeout_multiplier = 4 def __init__(self, args, config=None): self._guest = None self._genisoimage = args.genisoimage @@ -141,9 +143,12 @@ class BaseVM(object): if args.jobs and args.jobs > 1: self._args += ["-smp", "%d" % args.jobs] if kvm_available(self.arch): +self._shutdown_timeout = self.shutdown_timeout_default self._args += ["-enable-kvm"] else: logging.info("KVM not available, not using -enable-kvm") +self._shutdown_timeout = \ +self.shutdown_timeout_default * self.tcg_timeout_multiplier self._data_args = [] if self._config['qemu_args'] != None: @@ -423,7 +428,7 @@ class BaseVM(object): def wait_ssh(self, wait_root=False, seconds=300, cmd="exit 0"): # Allow more time for VM to boot under TCG. if not kvm_available(self.arch): -seconds *= self.tcg_ssh_timeout_multiplier +seconds *= self.tcg_timeout_multiplier starttime = datetime.datetime.now() endtime = starttime + datetime.timedelta(seconds=seconds) cmd_success = False @@ -441,14 +446,14 @@ class BaseVM(object): raise Exception("Timeout while waiting for guest ssh") def shutdown(self): -self._guest.shutdown() +self._guest.shutdown(timeout=self._shutdown_timeout) def wait(self): -self._guest.wait() +self._guest.wait(timeout=self._shutdown_timeout) def graceful_shutdown(self): self.ssh_root(self.poweroff) -self._guest.wait() +self._guest.wait(timeout=self._shutdown_timeout) def qmp(self, *args, **kwargs): return self._guest.qmp(*args, **kwargs) -- 2.17.1
[PATCH v1 1/3] python/qemu: Cleanup changes to ConsoleSocket
The changes to console_socket.py and machine.py are to cleanup for pylint and flake8. Reviewed-by: Alex Bennée Signed-off-by: Robert Foley --- python/qemu/console_socket.py | 57 ++- python/qemu/machine.py| 7 +++-- python/qemu/pylintrc | 2 +- 3 files changed, 34 insertions(+), 32 deletions(-) diff --git a/python/qemu/console_socket.py b/python/qemu/console_socket.py index 830cb7c628..09986bc215 100644 --- a/python/qemu/console_socket.py +++ b/python/qemu/console_socket.py @@ -1,12 +1,9 @@ -#!/usr/bin/env python3 -# -# This python module implements a ConsoleSocket object which is -# designed always drain the socket itself, and place -# the bytes into a in memory buffer for later processing. -# -# Optionally a file path can be passed in and we will also -# dump the characters to this file for debug. -# +""" +QEMU Console Socket Module: + +This python module implements a ConsoleSocket object, +which can drain a socket and optionally dump the bytes to file. +""" # Copyright 2020 Linaro # # Authors: @@ -15,20 +12,27 @@ # This code is licensed under the GPL version 2 or later. See # the COPYING file in the top-level directory. # + import asyncore import socket import threading -import io -import os -import sys from collections import deque import time -import traceback + class ConsoleSocket(asyncore.dispatcher): +""" +ConsoleSocket represents a socket attached to a char device. +Drains the socket and places the bytes into an in memory buffer +for later processing. + +Optionally a file path can be passed in and we will also +dump the characters to this file for debugging purposes. +""" def __init__(self, address, file=None): self._recv_timeout_sec = 300 +self._sleep_time = 0.5 self._buffer = deque() self._asyncore_thread = None self._sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) @@ -70,31 +74,28 @@ class ConsoleSocket(asyncore.dispatcher): def handle_read(self): """process arriving characters into in memory _buffer""" -try: -data = asyncore.dispatcher.recv(self, 1) -# latin1 is needed since there are some chars -# we are receiving that cannot be encoded to utf-8 -# such as 0xe2, 0x80, 0xA6. -string = data.decode("latin1") -except: -print("Exception seen.") -traceback.print_exc() -return +data = asyncore.dispatcher.recv(self, 1) +# latin1 is needed since there are some chars +# we are receiving that cannot be encoded to utf-8 +# such as 0xe2, 0x80, 0xA6. +string = data.decode("latin1") if self._logfile: self._logfile.write("{}".format(string)) self._logfile.flush() for c in string: self._buffer.extend(c) -def recv(self, n=1, sleep_delay_s=0.1): -"""Return chars from in memory buffer""" +def recv(self, buffer_size=1): +"""Return chars from in memory buffer. + Maintains the same API as socket.socket.recv. +""" start_time = time.time() -while len(self._buffer) < n: -time.sleep(sleep_delay_s) +while len(self._buffer) < buffer_size: +time.sleep(self._sleep_time) elapsed_sec = time.time() - start_time if elapsed_sec > self._recv_timeout_sec: raise socket.timeout -chars = ''.join([self._buffer.popleft() for i in range(n)]) +chars = ''.join([self._buffer.popleft() for i in range(buffer_size)]) # We choose to use latin1 to remain consistent with # handle_read() and give back the same data as the user would # receive if they were reading directly from the diff --git a/python/qemu/machine.py b/python/qemu/machine.py index 80c4d4a8b6..9956360a79 100644 --- a/python/qemu/machine.py +++ b/python/qemu/machine.py @@ -27,7 +27,7 @@ import socket import tempfile from typing import Optional, Type from types import TracebackType -from qemu.console_socket import ConsoleSocket +from . import console_socket from . import qmp @@ -674,8 +674,9 @@ class QEMUMachine: """ if self._console_socket is None: if self._drain_console: -self._console_socket = ConsoleSocket(self._console_address, - file=self._console_log_path) +self._console_socket = console_socket.ConsoleSocket( +self._console_address, +file=self._console_log_path) else: self._console_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) diff --git a/python/qemu/pylintrc b/python/qemu/pylintrc index 5d6ae7367d..3f69205000 100644 --- a/python/qemu/pylintrc +++ b/python/qemu/pylint
[PATCH v1 0/3] python/qemu: follow-up changes for ConsoleSocket
For v1, we added a few minor changes, and also added one new patch in tests/vm to add a shutdown timeout. This fixes an issue we saw in testing the aarch64 VMs with TCG. This patch series introduces a few follow-up changes after the introduction of ConsoleSocket. The first patch introduces cleanup changes for pylint and flake8. The second patch allows machine.py to use a single type for the console_socket, a ConsoleSocket. Since machine.py will use ConsoleSocket for both the draining and non-draining cases, we changed ConsoleSocket to handle the case where it does not drain the socket at all and essentially behaves like a socket. Robert Foley (3): python/qemu: Cleanup changes to ConsoleSocket python/qemu: Change ConsoleSocket to optionally drain socket. tests/vm: add shutdown timeout in basevm.py python/qemu/console_socket.py | 137 +++--- python/qemu/machine.py| 14 ++-- python/qemu/pylintrc | 2 +- tests/vm/basevm.py| 15 ++-- 4 files changed, 94 insertions(+), 74 deletions(-) -- 2.17.1
Re: [PATCH v6 00/13] Add Nuvoton NPCM730/NPCM750 SoCs and two BMC machines
On 7/17/20 8:02 AM, Havard Skinnemoen wrote: > I also pushed this and the previous two patchsets to my qemu fork on github. > The branches are named npcm7xx-v[1-6]. > > https://github.com/hskinnemoen/qemu > > This patch series models enough of the Nuvoton NPCM730 and NPCM750 SoCs to > boot > an OpenBMC image built for quanta-gsj. This includes device models for: > > - Global Configuration Registers > - Clock Control > - Timers > - Fuses > - Memory Controller > - Flash Controller > > These modules, along with the existing Cortex A9 CPU cores and built-in > peripherals, are integrated into a NPCM730 or NPCM750 SoC, which in turn form > the foundation for the quanta-gsj and npcm750-evb machines, respectively. The > two SoCs are very similar; the only difference is that NPCM730 is missing some > peripherals that NPCM750 has, and which are not considered essential for > datacenter use (e.g. graphics controllers). For more information, see > > https://www.nuvoton.com/products/cloud-computing/ibmc/ > > Both quanta-gsj and npcm750-evb correspond to real boards supported by > OpenBMC. > At the end of the series, qemu can boot an OpenBMC image built for one of > these > boards with some minor modifications. > > The patches in this series were developed by Google and reviewed by Nuvoton. > We > will be maintaining the machine and peripheral support together. > > The data sheet for these SoCs is not generally available. Please let me know > if > more comments are needed to understand the device behavior. I think this series is ready to go. Patch 6 "roms: Add virtual Boot ROM for NPCM7xx SoCs" needs a few add-ons which can come later. Patch 13 "tests/acceptance: console boot tests for quanta-gsj" needs an Acked-by by our test gurus. Aspeed needs the same kind of tests if possible. It depends on the pressure that the QEMU CI will put on the web servers hosting the images. I think that the rest of the comments are minor and can be fixed as follow-ups when 5.2 is being assembled. Thanks, C. > Changes since v5: > > - Boot ROM included, as a git submodule and a binary blob, and loaded by > default, so the -bios option is usually not necessary anymore. > - Two acceptance tests added (openbmc image boot, and direct kernel boot). > - npcm7xx_load_kernel() moved to SoC code. > - NPCM7XX_TIMER_REF_HZ definition moved to CLK header. > - Comments added clarifying available SPI flash chip selects. > - Error handling adjustments: > - Errors from CPU and GCR realization are propagated through the SoC > since they may be triggered by user-configurable parameters. > - Machine init uses error_fatal instead of error_abort for SoC > realization flash init. This makes error messages more helpful. > - Comments added to indicate whether peripherals may fail to realize. > - Use ERRP_GUARD() instead of Error *err when possible. > - Default CPU type is now set, and attempting to set it to anything else > will fail. > - Format string fixes (use HWADDR_PRIx, etc.) > - Simplified memory size encoding and error checking in npcm7xx_gcr. > - Encapsulate non-obvious pointer subtraction into helper functions in the > FIU and TIMER modules. > - Incorporate review feedback into the FIU module: > - Add select/deselect trace events. > - Use npcm7xx_fiu_{de,}select() consistently. > - Use extract/deposit in more places for consistency. > - Use -Wimplicit-fallthrough compatible fallthrough comments. > - Use qdev_init_gpio_out_named instead of sysbus_init_irq for chip > selects. > - Incorporate review feedback into the TIMER module: > - Assert that we never pause a timer that has already expired, instead > of > trying to handle it. This should be safe since QEMU_CLOCK_VIRTUAL is > stopped while this code is running. > - Simplify the switch blocks in the read and write handlers. > > I made a change to error out if a flash drive was not specified, but reverted > it because it caused make check to fail (qom-test). When specifying a NULL > block device, the m25p flash device initializes its in-memory storage with > 0xff > and doesn't attempt to write anything back. This seems correct to me. > > Changes since v4: > > - OTP cleanups suggested by Philippe Mathieu-Daudé. > - Added fuse array definitions based on public Nuvoton bootblock code. > - Moved class structure to .c file since it's only used internally. > - Readability improvements. > - Split the first patch and folded parts of it into three other patches so > that CONFIG_NPCM7XX is only enabled after the initial NPCM7xx machine > support is added. > - DRAM init moved to machine init code. > - Consistently use lower-case hex literals. > - Switched to fine-grained unimplemented devices, based on public bootblock > source code. Added a tiny SRAM that got left out previously. > - Simplifie
Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj
On 7/17/20 9:18 PM, Havard Skinnemoen wrote: > On Fri, Jul 17, 2020 at 2:00 AM Philippe Mathieu-Daudé > wrote: >> >> On 7/17/20 10:27 AM, Philippe Mathieu-Daudé wrote: >>> On 7/17/20 10:03 AM, Thomas Huth wrote: On 17/07/2020 09.48, Philippe Mathieu-Daudé wrote: > +Thomas > On 7/16/20 10:56 PM, Havard Skinnemoen wrote: >> On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen >> wrote: >>> >>> On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé >>> wrote: On 7/15/20 11:00 AM, Markus Armbruster wrote: > Now my point. Why first make up user configuration, then use that to > create a BlockBackend, when you could just go ahead and create the > BlockBackend? CLI issue mostly. We can solve it similarly to the recent "sdcard: Do not allow invalid SD card sizes" patch: if (!dinfo) { error_setg(errp, "Missing SPI flash drive"); error_append_hint(errp, "You can use a dummy drive using:\n"); error_append_hint(errp, "-drive if=mtd,driver=null-co," "read-ones=on,size=64M\n); return; } having npcm7xx_connect_flash() taking an Error* argument, and MachineClass::init() call it with &error_fatal. >>> >>> Erroring out if the user specifies a configuration that can't possibly >>> boot sounds good to me. Better than trying to come up with defaults >>> that are still not going to result in a bootable system. >>> >>> For testing recovery paths, I think it makes sense to explicitly >>> specify a null device as you suggest. >> >> Hmm, one problem. qom-test fails with >> >> qemu-system-aarch64: Missing SPI flash drive >> You can add a dummy drive using: >> -drive if=mtd,driver=null-co,read-zeroes=on,size=32M >> Broken pipe >> /usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166: >> kill_qemu() tried to terminate QEMU process but encountered exit >> status 1 (expected 0) >> ERROR qom-test - too few tests run (expected 68, got 7) >> >> So it looks like we might need a different solution to this, unless we >> want to make generic tests more machine-aware... I didn't follow the other mails in this thread, but what we usually do in such a case: Add a "if (qtest_enabled())" check to the device or the machine to ignore the error if it is running in qtest mode. >>> >>> Hmm I'm not sure it works in this case. We could do: >>> >>> if (!dinfo) { >>> if (qtest) { >>> /* create null drive for qtest */ >>> opts = ...; >>> dinfo = drive_new(opts, IF_MTD, &error_abort); >>> } else { >>> /* teach user to use proper CLI */ >>> error_setg(errp, "Missing SPI flash drive"); >>> error_append_hint(errp, "You can use a dummy drive using:\n"); >>> error_append_hint(errp, "-drive if=mtd,driver=null-co," >>> "read-ones=on,size=64M\n); >>> } >>> } >>> >>> But I'm not sure Markus will enjoy it :) >>> >>> Markus, any better idea about how to handle that with automatic qtests? >> >> FWIW IDE device has a concept of "Anonymous BlockBackend for an empty >> drive": >> >> static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp) >> { >> IDEBus *bus = DO_UPCAST(IDEBus, qbus, dev->qdev.parent_bus); >> IDEState *s = bus->ifs + dev->unit; >> int ret; >> >> if (!dev->conf.blk) { >> if (kind != IDE_CD) { >> error_setg(errp, "No drive specified"); >> return; >> } else { >> /* Anonymous BlockBackend for an empty drive */ >> dev->conf.blk = blk_new(qemu_get_aio_context(), 0, >> BLK_PERM_ALL); >> ret = blk_attach_dev(dev->conf.blk, &dev->qdev); >> assert(ret == 0); >> } >> } > > Could someone please remind me what problem we're trying to solve here? > > Currently, if the user (or test) doesn't provide a drive, we pass NULL > as the block backend to m25p80. This means we'll take the code path in > m25p_realize() that does > > trace_m25p80_binding_no_bdrv(s); > s->storage = blk_blockalign(NULL, s->size); > memset(s->storage, 0xFF, s->size); > > which will look like a freshly chip-erased flash chip. which is perfect. C.
Re: [PATCH v1 1/5] shippable: add one more qemu to registry url
On 7/17/20 12:51 PM, Alex Bennée wrote: > The registry url is //qemu/ > > Perhaps we should rationalise that some day but for now. > > Signed-off-by: Alex Bennée > --- > .shippable.yml | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/.shippable.yml b/.shippable.yml > index f6b742432e5..89d8be4291b 100644 > --- a/.shippable.yml > +++ b/.shippable.yml > @@ -27,7 +27,7 @@ env: >TARGET_LIST=ppc64-softmmu,ppc64-linux-user,ppc64abi32-linux-user > build: >pre_ci_boot: > -image_name: registry.gitlab.com/qemu-project/qemu/${IMAGE} > +image_name: registry.gitlab.com/qemu-project/qemu/qemu/${IMAGE} > image_tag: latest > pull: true > options: "-e HOME=/root" > Reviewed-by: Philippe Mathieu-Daudé
[Bug 1878641] Re: Abort() in mch_update_pciexbar
Proposed fix: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05612.html -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1878641 Title: Abort() in mch_update_pciexbar Status in QEMU: New Bug description: Hello, I found an input which triggers an abort() in mch_update_pciexbar: #0 0x7686d761 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x7685755b in __GI_abort () at abort.c:79 #2 0x5705c7ae in mch_update_pciexbar (mch=0x62905920) at /home/alxndr/Development/qemu/hw/pci-host/q35.c:324 #3 0x5705bb6a in mch_write_config (d=0x62905920, address=0x60, val=0x8400056e, len=0x4) at /home/alxndr/Development/qemu/hw/pci-host/q35.c:480 #4 0x570954fb in pci_host_config_write_common (pci_dev=0x62905920, addr=0x60, limit=0x100, val=0x8400056e, len=0x4) at /home/alxndr/Development/qemu/hw/pci/pci_host.c:81 #5 0x5709606e in pci_data_write (s=0x61d96080, addr=0xf260, val=0x8400056e, len=0x4) at /home/alxndr/Development/qemu/hw/pci/pci_host.c:118 #6 0x570967d0 in pci_host_data_write (opaque=0x62905200, addr=0x0, val=0x8400056e, len=0x4) at /home/alxndr/Development/qemu/hw/pci/pci_host.c:165 #7 0x564938b5 in memory_region_write_accessor (mr=0x62905610, addr=0x0, value=0x7fff9c70, size=0x4, shift=0x0, mask=0x, attrs=...) at /home/alxndr/Development/qemu/memory.c:483 #8 0x5649328a in access_with_adjusted_size (addr=0x0, value=0x7fff9c70, size=0x4, access_size_min=0x1, access_size_max=0x4, access_fn=0x56493360 , mr=0x62905610, attrs=...) at /home/alxndr/Development/qemu/memory.c:544 #9 0x56491df6 in memory_region_dispatch_write (mr=0x62905610, addr=0x0, data=0x8400056e, op=MO_32, attrs=...) at /home/alxndr/Development/qemu/memory.c:1476 #10 0x562cbbf4 in flatview_write_continue (fv=0x60633b00, addr=0xcfc, attrs=..., ptr=0x7fffa4e0, len=0x4, addr1=0x0, l=0x4, mr=0x62905610) at /home/alxndr/Development/qemu/exec.c:3137 #11 0x562bbad9 in flatview_write (fv=0x60633b00, addr=0xcfc, attrs=..., buf=0x7fffa4e0, len=0x4) at /home/alxndr/Development/qemu/exec.c:3177 #12 0x562bb609 in address_space_write (as=0x5968f940 , addr=0xcfc, attrs=..., buf=0x7fffa4e0, len=0x4) at /home/alxndr/Development/qemu/exec.c:3268 #13 0x56478c0a in cpu_outl (addr=0xcfc, val=0x8400056e) at /home/alxndr/Development/qemu/ioport.c:80 #14 0x5648166f in qtest_process_command (chr=0x59691d00 , words=0x6039ebf0) at /home/alxndr/Development/qemu/qtest.c:396 #15 0x5647f187 in qtest_process_inbuf (chr=0x59691d00 , inbuf=0x6190f680) at /home/alxndr/Development/qemu/qtest.c:710 #16 0x5647e8b4 in qtest_read (opaque=0x59691d00 , buf=0x7fffca40 "outl 0xcf8 0xf260\noutl 0xcfc 0x8400056e\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n\377\377\377\177", size=0xd2) at /home/alxndr/Development/qemu/qtest.c:722 #17 0x579c260c in qemu_chr_be_write_impl (s=0x60f01f30, buf=0x7fffca40 "outl 0xcf8 0xf260\noutl 0xcfc 0x8400056e\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n\377\377\377\177", len=0xd2) at /home/alxndr/Development/qemu/chardev/char.c:183 #18 0x579c275b in qemu_chr_be_write (s=0x60f01f30, buf=0x7fffca40 "outl 0xcf8 0xf260\noutl 0xcfc 0x8400056e\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n\377\377\377\177", len=0xd2) at /home/alxndr/Development/qemu/chardev/char.c:195 #19 0x579cb97a in fd_chr_read (chan=0x608026a0, cond=G_IO_IN, opaque=0x60f01f30) at /home/alxndr/Development/qemu/chardev/char-fd.c:68 #20 0x57a530ea in qio_channel_fd_source_dispatch (source=0x60c2ef00, callback=0x579cb540 , user_data=0x60f01f30) at /home/alxndr/Development/qemu/io/channel-watch.c:84 #21 0x77ca8898 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 #22 0x57c10b85 in glib_pollfds_poll () at /home/alxndr/Development/qemu/util/main-loop.c:219 #23 0x57c0f57e in os_host_main_loop_wait (timeout=0x0) at /home/alxndr/Development/qemu/util/main-loop.c:242 #24 0x57c0f177 in main_loop_wait (nonblocking=0x0) at /home/alxndr/Development/qemu/util/main-loop.c:518 #25 0x5689fd1e in qemu_main_loop () at /home/alxndr/Development/qemu/softmmu/vl.c:1664 #26 0x57a6a29d in main (argc=0x17, argv=0x7fffe1
Re: [RFC PATCH-for-5.1] hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field
On 7/17/20 8:38 PM, Richard Henderson wrote: > On 7/17/20 11:17 AM, Philippe Mathieu-Daudé wrote: >> case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD: >> -default: >> -abort(); >> +qemu_log_mask(LOG_GUEST_ERROR, "Q35: Reserved PCIEXBAR LENGTH\n"); >> +return; >> } > > Did you really want to remove the default case? The field takes 2 bits, 3 cases are covered... > I guess the mask means that only *_RVD is left over, but the default sorta > self-documents that. OK I don't have problem keeping it - until a compiler start warning about it "default case already covered" ;) > > Either way, > Reviewed-by: Richard Henderson Thanks! > > > r~ >
Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj
On Fri, Jul 17, 2020 at 2:00 AM Philippe Mathieu-Daudé wrote: > > On 7/17/20 10:27 AM, Philippe Mathieu-Daudé wrote: > > On 7/17/20 10:03 AM, Thomas Huth wrote: > >> On 17/07/2020 09.48, Philippe Mathieu-Daudé wrote: > >>> +Thomas > >> > >>> On 7/16/20 10:56 PM, Havard Skinnemoen wrote: > On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen > wrote: > > > > On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé > > wrote: > >> > >> On 7/15/20 11:00 AM, Markus Armbruster wrote: > >>> Now my point. Why first make up user configuration, then use that to > >>> create a BlockBackend, when you could just go ahead and create the > >>> BlockBackend? > >> > >> CLI issue mostly. > >> > >> We can solve it similarly to the recent "sdcard: Do not allow invalid > >> SD > >> card sizes" patch: > >> > >> if (!dinfo) { > >> error_setg(errp, "Missing SPI flash drive"); > >> error_append_hint(errp, "You can use a dummy drive using:\n"); > >> error_append_hint(errp, "-drive if=mtd,driver=null-co," > >> "read-ones=on,size=64M\n); > >> return; > >> } > >> > >> having npcm7xx_connect_flash() taking an Error* argument, > >> and MachineClass::init() call it with &error_fatal. > > > > Erroring out if the user specifies a configuration that can't possibly > > boot sounds good to me. Better than trying to come up with defaults > > that are still not going to result in a bootable system. > > > > For testing recovery paths, I think it makes sense to explicitly > > specify a null device as you suggest. > > Hmm, one problem. qom-test fails with > > qemu-system-aarch64: Missing SPI flash drive > You can add a dummy drive using: > -drive if=mtd,driver=null-co,read-zeroes=on,size=32M > Broken pipe > /usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166: > kill_qemu() tried to terminate QEMU process but encountered exit > status 1 (expected 0) > ERROR qom-test - too few tests run (expected 68, got 7) > > So it looks like we might need a different solution to this, unless we > want to make generic tests more machine-aware... > >> > >> I didn't follow the other mails in this thread, but what we usually do > >> in such a case: Add a "if (qtest_enabled())" check to the device or the > >> machine to ignore the error if it is running in qtest mode. > > > > Hmm I'm not sure it works in this case. We could do: > > > > if (!dinfo) { > > if (qtest) { > > /* create null drive for qtest */ > > opts = ...; > > dinfo = drive_new(opts, IF_MTD, &error_abort); > > } else { > > /* teach user to use proper CLI */ > > error_setg(errp, "Missing SPI flash drive"); > > error_append_hint(errp, "You can use a dummy drive using:\n"); > > error_append_hint(errp, "-drive if=mtd,driver=null-co," > > "read-ones=on,size=64M\n); > > } > > } > > > > But I'm not sure Markus will enjoy it :) > > > > Markus, any better idea about how to handle that with automatic qtests? > > FWIW IDE device has a concept of "Anonymous BlockBackend for an empty > drive": > > static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp) > { > IDEBus *bus = DO_UPCAST(IDEBus, qbus, dev->qdev.parent_bus); > IDEState *s = bus->ifs + dev->unit; > int ret; > > if (!dev->conf.blk) { > if (kind != IDE_CD) { > error_setg(errp, "No drive specified"); > return; > } else { > /* Anonymous BlockBackend for an empty drive */ > dev->conf.blk = blk_new(qemu_get_aio_context(), 0, > BLK_PERM_ALL); > ret = blk_attach_dev(dev->conf.blk, &dev->qdev); > assert(ret == 0); > } > } Could someone please remind me what problem we're trying to solve here? Currently, if the user (or test) doesn't provide a drive, we pass NULL as the block backend to m25p80. This means we'll take the code path in m25p_realize() that does trace_m25p80_binding_no_bdrv(s); s->storage = blk_blockalign(NULL, s->size); memset(s->storage, 0xFF, s->size); which will look like a freshly chip-erased flash chip. Are we looking for a more elegant way to replace those three lines of code (+ a couple of conditionals in the writeback paths)? But we don't even have a dummy device that looks like an erased flash chip... Havard
Re: [RFC PATCH-for-5.1] hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field
On 7/17/20 11:17 AM, Philippe Mathieu-Daudé wrote: > case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD: > -default: > -abort(); > +qemu_log_mask(LOG_GUEST_ERROR, "Q35: Reserved PCIEXBAR LENGTH\n"); > +return; > } Did you really want to remove the default case? I guess the mask means that only *_RVD is left over, but the default sorta self-documents that. Either way, Reviewed-by: Richard Henderson r~
Re: device compatibility interface for live migration with assigned devices
On Fri, 17 Jul 2020 19:03:44 +0100 "Dr. David Alan Gilbert" wrote: > * Alex Williamson (alex.william...@redhat.com) wrote: > > On Wed, 15 Jul 2020 16:20:41 +0800 > > Yan Zhao wrote: > > > > > On Tue, Jul 14, 2020 at 02:59:48PM -0600, Alex Williamson wrote: > > > > On Tue, 14 Jul 2020 18:19:46 +0100 > > > > "Dr. David Alan Gilbert" wrote: > > > > > > > > > * Alex Williamson (alex.william...@redhat.com) wrote: > > > > > > On Tue, 14 Jul 2020 11:21:29 +0100 > > > > > > Daniel P. Berrangé wrote: > > > > > > > > > > > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote: > > > > > > > > hi folks, > > > > > > > > we are defining a device migration compatibility interface that > > > > > > > > helps upper > > > > > > > > layer stack like openstack/ovirt/libvirt to check if two > > > > > > > > devices are > > > > > > > > live migration compatible. > > > > > > > > The "devices" here could be MDEVs, physical devices, or hybrid > > > > > > > > of the two. > > > > > > > > e.g. we could use it to check whether > > > > > > > > - a src MDEV can migrate to a target MDEV, > > > > > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV, > > > > > > > > - a src MDEV can migration to a target VF in SRIOV. > > > > > > > > (e.g. SIOV/SRIOV backward compatibility case) > > > > > > > > > > > > > > > > The upper layer stack could use this interface as the last step > > > > > > > > to check > > > > > > > > if one device is able to migrate to another device before > > > > > > > > triggering a real > > > > > > > > live migration procedure. > > > > > > > > we are not sure if this interface is of value or help to you. > > > > > > > > please don't > > > > > > > > hesitate to drop your valuable comments. > > > > > > > > > > > > > > > > > > > > > > > > (1) interface definition > > > > > > > > The interface is defined in below way: > > > > > > > > > > > > > > > > __userspace > > > > > > > > /\ \ > > > > > > > > / \write > > > > > > > > / read \ > > > > > > > >/__ ___\|/_ > > > > > > > > | migration_version | | migration_version |-->check > > > > > > > > migration > > > > > > > > - - > > > > > > > > compatibility > > > > > > > > device Adevice B > > > > > > > > > > > > > > > > > > > > > > > > a device attribute named migration_version is defined under > > > > > > > > each device's > > > > > > > > sysfs node. e.g. > > > > > > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version). > > > > > > > > userspace tools read the migration_version as a string from the > > > > > > > > source device, > > > > > > > > and write it to the migration_version sysfs attribute in the > > > > > > > > target device. > > > > > > > > > > > > > > > > The userspace should treat ANY of below conditions as two > > > > > > > > devices not compatible: > > > > > > > > - any one of the two devices does not have a migration_version > > > > > > > > attribute > > > > > > > > - error when reading from migration_version attribute of one > > > > > > > > device > > > > > > > > - error when writing migration_version string of one device to > > > > > > > > migration_version attribute of the other device > > > > > > > > > > > > > > > > The string read from migration_version attribute is defined by > > > > > > > > device vendor > > > > > > > > driver and is completely opaque to the userspace. > > > > > > > > for a Intel vGPU, string format can be defined like > > > > > > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" > > > > > > > > + "aggregator count". > > > > > > > > > > > > > > > > for an NVMe VF connecting to a remote storage. it could be > > > > > > > > "PCI ID" + "driver version" + "configured remote storage URL" > > > > > > > > > > > > > > > > for a QAT VF, it may be > > > > > > > > "PCI ID" + "driver version" + "supported encryption set". > > > > > > > > > > > > > > > > (to avoid namespace confliction from each vendor, we may prefix > > > > > > > > a driver name to > > > > > > > > each migration_version string. e.g. > > > > > > > > i915-v1-8086-591d-i915-GVTg_V5_8-1) > > > > > > > > > > > > It's very strange to define it as opaque and then proceed to > > > > > > describe > > > > > > the contents of that opaque string. The point is that its contents > > > > > > are defined by the vendor driver to describe the device, driver > > > > > > version, > > > > > > and possibly metadata about the configuration of the device. One > > > > > > instance of a device might generate a different string from another. > > > > > > The string that a device produces is not necessarily the only string > > > > > > the vendor driver will accept, for example the driver might support > > > > > > backwards compatible migrations. > > > > > > > > > > (As I've
Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers
On 7/17/20 6:46 PM, Laszlo Ersek wrote: > On 07/17/20 11:26, Laszlo Ersek wrote: >> On 07/16/20 17:09, Philippe Mathieu-Daudé wrote: >>> On 7/16/20 4:42 PM, Laszlo Ersek wrote: Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals". The compound literal produced by the make_floatx80() macro is not such a constant expression, per 6.6p7-9. (An implementation may accept it, according to 6.6p10, but is not required to.) Therefore using "floatx80_zero" and make_floatx80() for initializing "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6 actually chokes on them: > target/i386/fpu_helper.c:871:5: error: initializer element is not constant > { make_floatx80(0xbfff, 0x8000ULL), > ^ >>> >>> This reminds me of: >>> >>> commit 6fa9ba09dbf4eb8b52bcb47d6820957f1b77ee0b >>> Author: Kamil Rytarowski >>> Date: Mon Sep 4 23:23:06 2017 +0200 >>> >>> target/m68k: Switch fpu_rom from make_floatx80() to make_floatx80_init() >>> >>> GCC 4.7.2 on SunOS reports that the values assigned to array members >>> are not >>> real constants: >>> >>> target/m68k/fpu_helper.c:32:5: error: initializer element is not >>> constant >>> target/m68k/fpu_helper.c:32:5: error: (near initialization for >>> 'fpu_rom[0]') >>> rules.mak:66: recipe for target 'target/m68k/fpu_helper.o' failed >>> >>> Convert the array to make_floatx80_init() to fix it. >>> Replace floatx80_pi-like constants with make_floatx80_init() as they are >>> defined as make_floatx80(). >>> >>> Reviewed-by: Philippe Mathieu-Daudé >>> We've had the make_floatx80_init() macro for this purpose since commit 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that macro again. Fixes: eca30647fc07 Fixes: ff57bb7b6326 Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html Cc: Alex Bennée Cc: Aurelien Jarno Cc: Eduardo Habkost Cc: Joseph Myers Cc: Paolo Bonzini Cc: Peter Maydell Cc: Richard Henderson Signed-off-by: Laszlo Ersek --- Notes: I can see that there are test cases under "tests/tcg/i386", but I don't know how to run them. >>> >>> Yeah it is not easy to figure... >>> >>> Try 'make run-tcg-tests-i386-softmmu' >>> but you need docker :^) >> >> That worked, thanks! Even without Docker: I just had to add >> >> --cross-cc-i386=gcc >> >> to my ./configure flags. >> > > Also -- I meant to, but I forgot to put "for-5.1" in the subject prefix; > sorry about that. Alex, as Paolo is not available, can this go via your tree? > > Laszlo >
[PULL for-5.1 3/3] tcg/cpu-exec: precise single-stepping after an interrupt
When single-stepping with a debugger attached to QEMU, and when an interrupt is raised, the debugger misses the first instruction after the interrupt. Tested-by: Luc Michel Reviewed-by: Luc Michel Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Message-Id: <20200717163029.2737546-1-richard.hender...@linaro.org> Signed-off-by: Richard Henderson --- accel/tcg/cpu-exec.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 6a3d3a3cfc..66d38f9d85 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -588,7 +588,13 @@ static inline bool cpu_handle_interrupt(CPUState *cpu, else { if (cc->cpu_exec_interrupt(cpu, interrupt_request)) { replay_interrupt(); -cpu->exception_index = -1; +/* + * After processing the interrupt, ensure an EXCP_DEBUG is + * raised when single-stepping so that GDB doesn't miss the + * next instruction. + */ +cpu->exception_index = +(cpu->singlestep_enabled ? EXCP_DEBUG : -1); *last_tb = NULL; } /* The target hook may have updated the 'cpu->interrupt_request'; -- 2.25.1
[RFC PATCH-for-5.1] hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field
libFuzzer triggered the following assertion: cat << EOF | qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -d guest_errors -trace pci\* outl 0xcf8 0xf260 outl 0xcfc 0x8400056e EOF pci_cfg_write mch 00:0 @0x60 <- 0x8400056e Aborted (core dumped) This is because guest wrote MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD (reserved value) to the PCIE XBAR register. There is no indication on the datasheet about what occurs when this value is written. Simply ignore it on QEMU (and report an guest error): pci_cfg_write mch 00:0 @0x60 <- 0x8400056e Q35: Reserved PCIEXBAR LENGTH pci_cfg_read mch 00:0 @0x0 -> 0x8086 pci_cfg_read mch 00:0 @0x0 -> 0x29c08086 ... Cc: qemu-sta...@nongnu.org Reported-by: Alexander Bulekov BugLink: https://bugs.launchpad.net/qemu/+bug/1878641 Fixes: df2d8b3ed4 ("q35: Introduce q35 pc based chipset emulator") Signed-off-by: Philippe Mathieu-Daudé --- RFC because I have no idea how to propagate the error. --- hw/pci-host/q35.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index b67cb9c29f..a3f839570d 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -29,6 +29,7 @@ */ #include "qemu/osdep.h" +#include "qemu/log.h" #include "hw/i386/pc.h" #include "hw/pci-host/q35.h" #include "hw/qdev-properties.h" @@ -318,8 +319,8 @@ static void mch_update_pciexbar(MCHPCIState *mch) addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK; break; case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD: -default: -abort(); +qemu_log_mask(LOG_GUEST_ERROR, "Q35: Reserved PCIEXBAR LENGTH\n"); +return; } addr = pciexbar & addr_mask; pcie_host_mmcfg_update(pehb, enable, addr, length); -- 2.21.3
[PULL for-5.1 0/3] tcg patch queue
The following changes since commit 95d1fbabae0cd44156ac4b96d512d143ca7dfd5e: Merge remote-tracking branch 'remotes/kraxel/tags/fixes-20200716-pull-request' into staging (2020-07-16 18:50:51 +0100) are available in the Git repository at: https://github.com/rth7680/qemu.git tags/pull-tcg-20200717 for you to fetch changes up to ba3c35d9c4026361fd380b269dc6def9510b7166: tcg/cpu-exec: precise single-stepping after an interrupt (2020-07-17 11:09:34 -0700) Fix vector min/max fallback expansion Fix singlestep from exception and interrupt Luc Michel (1): tcg/cpu-exec: precise single-stepping after an exception Richard Henderson (2): tcg: Save/restore vecop_list around minmax fallback tcg/cpu-exec: precise single-stepping after an interrupt accel/tcg/cpu-exec.c | 19 ++- tcg/tcg-op-vec.c | 2 ++ 2 files changed, 20 insertions(+), 1 deletion(-)
[PULL for-5.1 1/3] tcg: Save/restore vecop_list around minmax fallback
Forgetting this asserts when tcg_gen_cmp_vec is called from within tcg_gen_cmpsel_vec. Fixes: 72b4c792c7a Signed-off-by: Richard Henderson --- tcg/tcg-op-vec.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index f784517d84..ed6fb55fe1 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -657,7 +657,9 @@ static void do_minmax(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b, TCGOpcode opc, TCGCond cond) { if (!do_op3(vece, r, a, b, opc)) { +const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); tcg_gen_cmpsel_vec(cond, vece, r, a, b, a, b); +tcg_swap_vecop_list(hold_list); } } -- 2.25.1
[PULL for-5.1 2/3] tcg/cpu-exec: precise single-stepping after an exception
From: Luc Michel When single-stepping with a debugger attached to QEMU, and when an exception is raised, the debugger misses the first instruction after the exception: $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S $ aarch64-linux-gnu-gdb GNU gdb (GDB) 9.2 [...] (gdb) tar rem :1234 Remote debugging using :1234 warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. 0x in ?? () (gdb) # writing nop insns to 0x200 and 0x204 (gdb) set *0x200 = 0xd503201f (gdb) set *0x204 = 0xd503201f (gdb) # 0x0 address contains 0 which is an invalid opcode. (gdb) # The CPU should raise an exception and jump to 0x200 (gdb) si 0x0204 in ?? () With this commit, the same run steps correctly on the first instruction of the exception vector: (gdb) si 0x0200 in ?? () Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Signed-off-by: Luc Michel Message-Id: <20200716193947.3058389-1-luc.mic...@greensocs.com> Signed-off-by: Richard Henderson --- accel/tcg/cpu-exec.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index d95c4848a4..6a3d3a3cfc 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -504,6 +504,17 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) cc->do_interrupt(cpu); qemu_mutex_unlock_iothread(); cpu->exception_index = -1; + +if (unlikely(cpu->singlestep_enabled)) { +/* + * After processing the exception, ensure an EXCP_DEBUG is + * raised when single-stepping so that GDB doesn't miss the + * next instruction. + */ +*ret = EXCP_DEBUG; +cpu_handle_debug_exception(cpu); +return true; +} } else if (!replay_has_interrupt()) { /* give a chance to iothread in replay mode */ *ret = EXCP_INTERRUPT; -- 2.25.1
Re: [PATCH v1 4/5] util: add qemu_get_host_physmem utility function
On 7/17/20 3:51 AM, Alex Bennée wrote: > +size_t qemu_get_host_physmem(void) > +{ > +#ifdef _SC_PHYS_PAGES > +long pages = sysconf(_SC_PHYS_PAGES); > +if (pages > 0) { > +return pages * qemu_real_host_page_size; > +} > +#endif > +return 0; > +} Is it worth examining our own RLIMIT_{AS,DATA} as well? I suppose that's not usually what is tweaked in the example of a ram-limited container... r~
Re: [PATCH] tcg/cpu-exec: precise single-stepping after an interrupt
On 7/17/20 6:30 PM, Richard Henderson wrote: When single-stepping with a debugger attached to QEMU, and when an interrupt is raised, the debugger misses the first instruction after the interrupt. Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Signed-off-by: Richard Henderson Reviewed-by: Luc Michel Tested-by: Luc Michel --- accel/tcg/cpu-exec.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 6a3d3a3cfc..66d38f9d85 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -588,7 +588,13 @@ static inline bool cpu_handle_interrupt(CPUState *cpu, else { if (cc->cpu_exec_interrupt(cpu, interrupt_request)) { replay_interrupt(); -cpu->exception_index = -1; +/* + * After processing the interrupt, ensure an EXCP_DEBUG is + * raised when single-stepping so that GDB doesn't miss the + * next instruction. + */ +cpu->exception_index = +(cpu->singlestep_enabled ? EXCP_DEBUG : -1); *last_tb = NULL; } /* The target hook may have updated the 'cpu->interrupt_request';
Re: device compatibility interface for live migration with assigned devices
* Alex Williamson (alex.william...@redhat.com) wrote: > On Wed, 15 Jul 2020 16:20:41 +0800 > Yan Zhao wrote: > > > On Tue, Jul 14, 2020 at 02:59:48PM -0600, Alex Williamson wrote: > > > On Tue, 14 Jul 2020 18:19:46 +0100 > > > "Dr. David Alan Gilbert" wrote: > > > > > > > * Alex Williamson (alex.william...@redhat.com) wrote: > > > > > On Tue, 14 Jul 2020 11:21:29 +0100 > > > > > Daniel P. Berrangé wrote: > > > > > > > > > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote: > > > > > > > hi folks, > > > > > > > we are defining a device migration compatibility interface that > > > > > > > helps upper > > > > > > > layer stack like openstack/ovirt/libvirt to check if two devices > > > > > > > are > > > > > > > live migration compatible. > > > > > > > The "devices" here could be MDEVs, physical devices, or hybrid of > > > > > > > the two. > > > > > > > e.g. we could use it to check whether > > > > > > > - a src MDEV can migrate to a target MDEV, > > > > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV, > > > > > > > - a src MDEV can migration to a target VF in SRIOV. > > > > > > > (e.g. SIOV/SRIOV backward compatibility case) > > > > > > > > > > > > > > The upper layer stack could use this interface as the last step > > > > > > > to check > > > > > > > if one device is able to migrate to another device before > > > > > > > triggering a real > > > > > > > live migration procedure. > > > > > > > we are not sure if this interface is of value or help to you. > > > > > > > please don't > > > > > > > hesitate to drop your valuable comments. > > > > > > > > > > > > > > > > > > > > > (1) interface definition > > > > > > > The interface is defined in below way: > > > > > > > > > > > > > > __userspace > > > > > > > /\ \ > > > > > > > / \write > > > > > > > / read \ > > > > > > >/__ ___\|/_ > > > > > > > | migration_version | | migration_version |-->check > > > > > > > migration > > > > > > > - - compatibility > > > > > > > device Adevice B > > > > > > > > > > > > > > > > > > > > > a device attribute named migration_version is defined under each > > > > > > > device's > > > > > > > sysfs node. e.g. > > > > > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version). > > > > > > > userspace tools read the migration_version as a string from the > > > > > > > source device, > > > > > > > and write it to the migration_version sysfs attribute in the > > > > > > > target device. > > > > > > > > > > > > > > The userspace should treat ANY of below conditions as two devices > > > > > > > not compatible: > > > > > > > - any one of the two devices does not have a migration_version > > > > > > > attribute > > > > > > > - error when reading from migration_version attribute of one > > > > > > > device > > > > > > > - error when writing migration_version string of one device to > > > > > > > migration_version attribute of the other device > > > > > > > > > > > > > > The string read from migration_version attribute is defined by > > > > > > > device vendor > > > > > > > driver and is completely opaque to the userspace. > > > > > > > for a Intel vGPU, string format can be defined like > > > > > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + > > > > > > > "aggregator count". > > > > > > > > > > > > > > for an NVMe VF connecting to a remote storage. it could be > > > > > > > "PCI ID" + "driver version" + "configured remote storage URL" > > > > > > > > > > > > > > for a QAT VF, it may be > > > > > > > "PCI ID" + "driver version" + "supported encryption set". > > > > > > > > > > > > > > (to avoid namespace confliction from each vendor, we may prefix a > > > > > > > driver name to > > > > > > > each migration_version string. e.g. > > > > > > > i915-v1-8086-591d-i915-GVTg_V5_8-1) > > > > > > > > > > It's very strange to define it as opaque and then proceed to describe > > > > > the contents of that opaque string. The point is that its contents > > > > > are defined by the vendor driver to describe the device, driver > > > > > version, > > > > > and possibly metadata about the configuration of the device. One > > > > > instance of a device might generate a different string from another. > > > > > The string that a device produces is not necessarily the only string > > > > > the vendor driver will accept, for example the driver might support > > > > > backwards compatible migrations. > > > > > > > > (As I've said in the previous discussion, off one of the patch series) > > > > > > > > My view is it makes sense to have a half-way house on the opaqueness of > > > > this string; I'd expect to have an ID and version that are human > > > > readable, maybe a device ID/name that's human interpretable and then a > > > >
Re: [PATCH v1 4/5] util: add qemu_get_host_physmem utility function
On 7/17/20 7:24 AM, Christian Ehrhardt wrote: > > +size_t qemu_get_host_physmem(void) > > +{ > > +#ifdef _SC_PHYS_PAGES > > + long pages = sysconf(_SC_PHYS_PAGES); > > + if (pages > 0) { > > + return pages * qemu_real_host_page_size; > > The Linux man page warns that this product may overflow so maybe you could > return pages here. > > > The caller might be even less aware of that than this function - so maybe > better handle it here. > How about handling overflows and cutting it to MiB before returning? Indeed, the caller may be less aware, so we should handle it here. But I don't think truncating to MiB helps at all, because again, the caller has to handle overflow. Better, I think, to saturate the result to ~(size_t)0 and leave it at that. r~
Re: [PATCH v2] virtiofsd: Remove "norace" from cmdline help and docs
* Sergio Lopez (s...@redhat.com) wrote: > Commit 93bb3d8d4cda ("virtiofsd: remove symlink fallbacks") removed > the implementation of the "norace" option, so remove it from the > cmdline help and the documentation too. > > Signed-off-by: Sergio Lopez > Reviewed-by: Philippe Mathieu-Daudé > Reviewed-by: Stefano Garzarella Thanks; I've added it for my list for the next virtiofsd pull. > --- > v2: > * Drop "norace" from the documentation too (Stefano Garzarella) > --- > docs/tools/virtiofsd.rst | 3 --- > tools/virtiofsd/helper.c | 2 -- > 2 files changed, 5 deletions(-) > > diff --git a/docs/tools/virtiofsd.rst b/docs/tools/virtiofsd.rst > index 824e713491..58666a4495 100644 > --- a/docs/tools/virtiofsd.rst > +++ b/docs/tools/virtiofsd.rst > @@ -63,9 +63,6 @@ Options > Print only log messages matching LEVEL or more severe. LEVEL is one of > ``err``, ``warn``, ``info``, or ``debug``. The default is ``info``. > > - * norace - > -Disable racy fallback. The default is false. > - >* posix_lock|no_posix_lock - > Enable/disable remote POSIX locks. The default is ``posix_lock``. > > diff --git a/tools/virtiofsd/helper.c b/tools/virtiofsd/helper.c > index 3105b6c23a..7bc5d7dc5a 100644 > --- a/tools/virtiofsd/helper.c > +++ b/tools/virtiofsd/helper.c > @@ -159,8 +159,6 @@ void fuse_cmdline_help(void) > "-o max_idle_threadsthe maximum number of idle worker > " > "threads\n" > " allowed (default: 10)\n" > - "-o norace disable racy fallback\n" > - " default: false\n" > "-o posix_lock|no_posix_lock\n" > " enable/disable remote posix > lock\n" > " default: posix_lock\n" > -- > 2.26.2 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Bug 1878255] Re: Assertion failure in bdrv_aio_cancel, through ide
Proposed fix: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05595.html -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1878255 Title: Assertion failure in bdrv_aio_cancel, through ide Status in QEMU: New Bug description: Hello, While fuzzing, I found an input that triggers an assertion failure in bdrv_aio_cancel, through ide: #1 0x7685755b in __GI_abort () at abort.c:79 #2 0x56a8d396 in bdrv_aio_cancel (acb=0x60761290) at /home/alxndr/Development/qemu/block/io.c:2746 #3 0x56a58525 in blk_aio_cancel (acb=0x2) at /home/alxndr/Development/qemu/block/block-backend.c:1540 #4 0x56552f5b in ide_reset (s=) at /home/alxndr/Development/qemu/hw/ide/core.c:1318 #5 0x56552aeb in ide_bus_reset (bus=0x62d17398) at /home/alxndr/Development/qemu/hw/ide/core.c:2422 #6 0x56579ba5 in ahci_reset_port (s=, port=) at /home/alxndr/Development/qemu/hw/ide/ahci.c:650 #7 0x5657bd8d in ahci_port_write (s=0x61e14d70, port=0x2, offset=, val=0x10) at /home/alxndr/Development/qemu/hw/ide/ahci.c:360 #8 0x5657bd8d in ahci_mem_write (opaque=, addr=, val=, size=) at /home/alxndr/Development/qemu/hw/ide/ahci.c:513 #9 0x560028d7 in memory_region_write_accessor (mr=, addr=, value=, size=, shift=, mask=, attrs=...) at /home/alxndr/Development/qemu/memory.c:483 #10 0x56002280 in access_with_adjusted_size (addr=, value=, size=, access_size_min=, access_size_max=, access_fn=, mr=0x61e14da0, attrs=...) at /home/alxndr/Development/qemu/memory.c:544 #11 0x56002280 in memory_region_dispatch_write (mr=, addr=, data=0x10, op=, attrs=...) at /home/alxndr/Development/qemu/memory.c:1476 #12 0x55f171d4 in flatview_write_continue (fv=, addr=0xe106c22c, attrs=..., ptr=, len=0x1, addr1=0x7fffb8d0, l=, mr=0x61e14da0) at /home/alxndr/Development/qemu/exec.c:3137 #13 0x55f0fb98 in flatview_write (fv=0x6063b180, addr=, attrs=..., buf=, len=) at /home/alxndr/Development/qemu/exec.c:3177 I can reproduce it in qemu 5.0 using: cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -qtest stdio -monitor none -serial none -M pc-q35-5.0 -nographic outl 0xcf8 0x8000fa24 outl 0xcfc 0xe106c000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0x0 0x3 0x2780e7 write 0xe106c22c 0xd 0x1130c218021130c218021130c2 write 0xe106c218 0x15 0x110010110010110010110010110010110010110010 EOF I also attached the commands to this launchpad report, in case the formatting is broken: qemu-system-i386 -qtest stdio -monitor none -serial none -M pc-q35-5.0 -nographic < attachment Please let me know if I can provide any further info. -Alex To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1878255/+subscriptions
[Bug 1878043] Re: memcpy param-overlap in Slirp ip_stripoptions through e1000e
Committed in upstream libslirp: commit d620bac888923524f8b8407dbf35f6d2b3b7ddb2 (origin/lp1878043, lp1878043) Author: Dr. David Alan Gilbert Date: Fri Jul 17 18:17:41 2020 +0100 ip_stripoptions use memmove -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1878043 Title: memcpy param-overlap in Slirp ip_stripoptions through e1000e Status in QEMU: In Progress Bug description: Hello, While fuzzing, I found an input that triggers an overlapping memcpy (caught by AddressSanitizer). Overlapping memcpys are undefined behavior according to the POSIX and C standards, and can lead to bugs. ==1==ERROR: AddressSanitizer: memcpy-param-overlap: memory ranges [0x625000264940,0x62500026699a) and [0x625000264948, 0x6250002669a2) overlap #0 0x5622d7b6a3d4 in __asan_memcpy (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96c3d4) #1 0x5622d896a2d2 in ip_stripoptions /home/alxndr/Development/qemu/slirp/src/ip_input.c:457:5 #2 0x5622d8963378 in udp_input /home/alxndr/Development/qemu/slirp/src/udp.c:86:9 #3 0x5622d89351ea in slirp_input /home/alxndr/Development/qemu/slirp/src/slirp.c:840:13 #4 0x5622d852e162 in net_slirp_receive /home/alxndr/Development/qemu/net/slirp.c:126:5 #5 0x5622d8515851 in nc_sendv_compat /home/alxndr/Development/qemu/net/net.c:700:15 #6 0x5622d8515851 in qemu_deliver_packet_iov /home/alxndr/Development/qemu/net/net.c:728:15 #7 0x5622d851786d in qemu_net_queue_deliver_iov /home/alxndr/Development/qemu/net/queue.c:179:11 #8 0x5622d851786d in qemu_net_queue_send_iov /home/alxndr/Development/qemu/net/queue.c:224:11 #9 0x5622d851b1c1 in net_hub_receive_iov /home/alxndr/Development/qemu/net/hub.c:74:9 #10 0x5622d851b1c1 in net_hub_port_receive_iov /home/alxndr/Development/qemu/net/hub.c:125:12 #11 0x5622d851572b in qemu_deliver_packet_iov /home/alxndr/Development/qemu/net/net.c:726:15 #12 0x5622d851786d in qemu_net_queue_deliver_iov /home/alxndr/Development/qemu/net/queue.c:179:11 #13 0x5622d851786d in qemu_net_queue_send_iov /home/alxndr/Development/qemu/net/queue.c:224:11 #14 0x5622d828bf87 in net_tx_pkt_sendv /home/alxndr/Development/qemu/hw/net/net_tx_pkt.c:546:9 #15 0x5622d828bf87 in net_tx_pkt_send /home/alxndr/Development/qemu/hw/net/net_tx_pkt.c:620:9 #16 0x5622d82b5f22 in e1000e_tx_pkt_send /home/alxndr/Development/qemu/hw/net/e1000e_core.c:666:16 #17 0x5622d82b5f22 in e1000e_process_tx_desc /home/alxndr/Development/qemu/hw/net/e1000e_core.c:743:17 #18 0x5622d82b5f22 in e1000e_start_xmit /home/alxndr/Development/qemu/hw/net/e1000e_core.c:934:9 #19 0x5622d82b2be0 in e1000e_set_tdt /home/alxndr/Development/qemu/hw/net/e1000e_core.c:2451:9 #20 0x5622d82a30fc in e1000e_core_write /home/alxndr/Development/qemu/hw/net/e1000e_core.c:3261:9 #21 0x5622d7c9e336 in memory_region_write_accessor /home/alxndr/Development/qemu/memory.c:483:5 #22 0x5622d7c9dcdf in access_with_adjusted_size /home/alxndr/Development/qemu/memory.c:544:18 #23 0x5622d7c9dcdf in memory_region_dispatch_write /home/alxndr/Development/qemu/memory.c:1476:16 #24 0x5622d7bb31d3 in flatview_write_continue /home/alxndr/Development/qemu/exec.c:3137:23 #25 0x5622d7babb97 in flatview_write /home/alxndr/Development/qemu/exec.c:3177:14 #26 0x5622d7babb97 in address_space_write /home/alxndr/Development/qemu/exec.c:3268:18 0x625000264940 is located 64 bytes inside of 8354-byte region [0x625000264900,0x6250002669a2) allocated by thread T0 here: #0 0x5622d7b6b06d in malloc (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96d06d) #1 0x7f724b932500 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x54500) 0x625000264948 is located 72 bytes inside of 8354-byte region [0x625000264900,0x6250002669a2) allocated by thread T0 here: #0 0x5622d7b6b06d in malloc (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96d06d) #1 0x7f724b932500 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x54500) I can reproduce it in qemu 5.0 built with --enable-sanitizers using: cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -M pc-q35-5.0 -accel qtest -qtest stdio -nographic -monitor none -serial none outl 0xcf8 0x80001010 outl 0xcfc 0xe102 outl 0xcf8 0x80001014 outl 0xcf8 0x80001004 outw 0xcfc 0x7 outl 0xcf8 0x800010a2 outl 0xcf8 0x8000fa24 outl 0xcfc 0xe1069000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0xe1069100 0xe 0xff818420f9e10019 write 0x820b 0xc 0x080047bb0c02e1004011 write 0xe1020403 0x36 0xb7e1000f009006e1625c5eb7e1000f009006e1625c5eb7e1000f009006e1 EOF I also attached the trace to this launchpad report
Re: [PATCH v1 3/5] semihosting: don't send the trailing '\0'
On 7/17/20 3:51 AM, Alex Bennée wrote: > From: KONRAD Frederic > > Don't send the trailing 0 from the string. > > Signed-off-by: KONRAD Frederic > Signed-off-by: Alex Bennée > Reviewed-by: Philippe Mathieu-Daudé > Message-Id: <1592215252-26742-2-git-send-email-frederic.kon...@adacore.com> > --- > hw/semihosting/console.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) Reviewed-by: Richard Henderson > diff --git a/hw/semihosting/console.c b/hw/semihosting/console.c > index 22e7827824a..9b4fee92602 100644 > --- a/hw/semihosting/console.c > +++ b/hw/semihosting/console.c > @@ -52,7 +52,9 @@ static GString *copy_user_string(CPUArchState *env, > target_ulong addr) > > do { > if (cpu_memory_rw_debug(cpu, addr++, &c, 1, 0) == 0) { > -s = g_string_append_c(s, c); > +if (c) { > +s = g_string_append_c(s, c); > +} > } else { > qemu_log_mask(LOG_GUEST_ERROR, >"%s: passed inaccessible address " TARGET_FMT_lx, > Next cycle, we could clean up this loop a bit, rather than testing c != 0 twice. E.g. while (1) { if (cpu_memory_rw_debug(cpu, addr++, &c, 1, 0)) { error; break; } if (c == 0) { break; } s = g_string_append_c(s, c); } r~
Re: [PATCH v1 2/5] semihosting: defer connect_chardevs a little more to use serialx
On 7/17/20 3:51 AM, Alex Bennée wrote: > From: KONRAD Frederic > > With that we can just use -semihosting-config chardev=serial0. > > Signed-off-by: KONRAD Frederic > Message-Id: <1592215252-26742-1-git-send-email-frederic.kon...@adacore.com> > [AJB: tweak commit message] > Signed-off-by: Alex Bennée > --- > softmmu/vl.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) Reviewed-by: Richard Henderson r~
[Bug 1878043] Re: memcpy param-overlap in Slirp ip_stripoptions through e1000e
Created patch and merge request in upstream libslirp: https://gitlab.freedesktop.org/dgilbert/libslirp/-/commit/d620bac888923524f8b8407dbf35f6d2b3b7ddb2 ** Changed in: qemu Assignee: (unassigned) => Dr. David Alan Gilbert (dgilbert-h) ** Changed in: qemu Status: New => In Progress -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1878043 Title: memcpy param-overlap in Slirp ip_stripoptions through e1000e Status in QEMU: In Progress Bug description: Hello, While fuzzing, I found an input that triggers an overlapping memcpy (caught by AddressSanitizer). Overlapping memcpys are undefined behavior according to the POSIX and C standards, and can lead to bugs. ==1==ERROR: AddressSanitizer: memcpy-param-overlap: memory ranges [0x625000264940,0x62500026699a) and [0x625000264948, 0x6250002669a2) overlap #0 0x5622d7b6a3d4 in __asan_memcpy (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96c3d4) #1 0x5622d896a2d2 in ip_stripoptions /home/alxndr/Development/qemu/slirp/src/ip_input.c:457:5 #2 0x5622d8963378 in udp_input /home/alxndr/Development/qemu/slirp/src/udp.c:86:9 #3 0x5622d89351ea in slirp_input /home/alxndr/Development/qemu/slirp/src/slirp.c:840:13 #4 0x5622d852e162 in net_slirp_receive /home/alxndr/Development/qemu/net/slirp.c:126:5 #5 0x5622d8515851 in nc_sendv_compat /home/alxndr/Development/qemu/net/net.c:700:15 #6 0x5622d8515851 in qemu_deliver_packet_iov /home/alxndr/Development/qemu/net/net.c:728:15 #7 0x5622d851786d in qemu_net_queue_deliver_iov /home/alxndr/Development/qemu/net/queue.c:179:11 #8 0x5622d851786d in qemu_net_queue_send_iov /home/alxndr/Development/qemu/net/queue.c:224:11 #9 0x5622d851b1c1 in net_hub_receive_iov /home/alxndr/Development/qemu/net/hub.c:74:9 #10 0x5622d851b1c1 in net_hub_port_receive_iov /home/alxndr/Development/qemu/net/hub.c:125:12 #11 0x5622d851572b in qemu_deliver_packet_iov /home/alxndr/Development/qemu/net/net.c:726:15 #12 0x5622d851786d in qemu_net_queue_deliver_iov /home/alxndr/Development/qemu/net/queue.c:179:11 #13 0x5622d851786d in qemu_net_queue_send_iov /home/alxndr/Development/qemu/net/queue.c:224:11 #14 0x5622d828bf87 in net_tx_pkt_sendv /home/alxndr/Development/qemu/hw/net/net_tx_pkt.c:546:9 #15 0x5622d828bf87 in net_tx_pkt_send /home/alxndr/Development/qemu/hw/net/net_tx_pkt.c:620:9 #16 0x5622d82b5f22 in e1000e_tx_pkt_send /home/alxndr/Development/qemu/hw/net/e1000e_core.c:666:16 #17 0x5622d82b5f22 in e1000e_process_tx_desc /home/alxndr/Development/qemu/hw/net/e1000e_core.c:743:17 #18 0x5622d82b5f22 in e1000e_start_xmit /home/alxndr/Development/qemu/hw/net/e1000e_core.c:934:9 #19 0x5622d82b2be0 in e1000e_set_tdt /home/alxndr/Development/qemu/hw/net/e1000e_core.c:2451:9 #20 0x5622d82a30fc in e1000e_core_write /home/alxndr/Development/qemu/hw/net/e1000e_core.c:3261:9 #21 0x5622d7c9e336 in memory_region_write_accessor /home/alxndr/Development/qemu/memory.c:483:5 #22 0x5622d7c9dcdf in access_with_adjusted_size /home/alxndr/Development/qemu/memory.c:544:18 #23 0x5622d7c9dcdf in memory_region_dispatch_write /home/alxndr/Development/qemu/memory.c:1476:16 #24 0x5622d7bb31d3 in flatview_write_continue /home/alxndr/Development/qemu/exec.c:3137:23 #25 0x5622d7babb97 in flatview_write /home/alxndr/Development/qemu/exec.c:3177:14 #26 0x5622d7babb97 in address_space_write /home/alxndr/Development/qemu/exec.c:3268:18 0x625000264940 is located 64 bytes inside of 8354-byte region [0x625000264900,0x6250002669a2) allocated by thread T0 here: #0 0x5622d7b6b06d in malloc (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96d06d) #1 0x7f724b932500 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x54500) 0x625000264948 is located 72 bytes inside of 8354-byte region [0x625000264900,0x6250002669a2) allocated by thread T0 here: #0 0x5622d7b6b06d in malloc (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x96d06d) #1 0x7f724b932500 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x54500) I can reproduce it in qemu 5.0 built with --enable-sanitizers using: cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -M pc-q35-5.0 -accel qtest -qtest stdio -nographic -monitor none -serial none outl 0xcf8 0x80001010 outl 0xcfc 0xe102 outl 0xcf8 0x80001014 outl 0xcf8 0x80001004 outw 0xcfc 0x7 outl 0xcf8 0x800010a2 outl 0xcf8 0x8000fa24 outl 0xcfc 0xe1069000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0xe1069100 0xe 0xff818420f9e10019 write 0x820b 0xc 0x080047bb0c02e1004011 write 0xe1020403 0x36 0xb7e1000f009006e1625c5eb7e1000f009006e1625c5eb7
[PATCH for-5.2] spapr: Simplify error handling in spapr_phb_realize()
The spapr_phb_realize() function has a local_err variable which is used to: 1) check failures of spapr_irq_findone() and spapr_irq_claim() 2) prepend extra information to the error message Recent work from Markus Armbruster highlighted we get better code when testing the return value of a function, rather than setting up all the local_err boiler plate. For similar reasons, it is now preferred to use ERRP_GUARD() and error_prepend() rather than error_propagate_prepend(). Since spapr_irq_findone() and spapr_irq_claim() return negative values in case of failure, do both changes. This is just cleanup, no functional impact. Signed-off-by: Greg Kurz --- Since we add ERRP_GUARD(), we could theoretically check *errp rather than the return value, and thus avoid the uint32_t to int32_t change but I personally find it clearer the other way. --- hw/ppc/spapr_pci.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 21681215d405..b1ce51327db4 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -1796,6 +1796,7 @@ static void spapr_phb_destroy_msi(gpointer opaque) static void spapr_phb_realize(DeviceState *dev, Error **errp) { +ERRP_GUARD(); /* We don't use SPAPR_MACHINE() in order to exit gracefully if the user * tries to add a sPAPR PHB to a non-pseries machine. */ @@ -1813,7 +1814,6 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) uint64_t msi_window_size = 4096; SpaprTceTable *tcet; const unsigned windows_supported = spapr_phb_windows_supported(sphb); -Error *local_err = NULL; if (!spapr) { error_setg(errp, TYPE_SPAPR_PCI_HOST_BRIDGE " needs a pseries machine"); @@ -1964,13 +1964,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) /* Initialize the LSI table */ for (i = 0; i < PCI_NUM_PINS; i++) { -uint32_t irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i; +int32_t irq = SPAPR_IRQ_PCI_LSI + sphb->index * PCI_NUM_PINS + i; if (smc->legacy_irq_allocation) { -irq = spapr_irq_findone(spapr, &local_err); -if (local_err) { -error_propagate_prepend(errp, local_err, -"can't allocate LSIs: "); +irq = spapr_irq_findone(spapr, errp); +if (irq < 0) { +error_prepend(errp, "can't allocate LSIs: "); /* * Older machines will never support PHB hotplug, ie, this is an * init only path and QEMU will terminate. No need to rollback. @@ -1979,9 +1978,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) } } -spapr_irq_claim(spapr, irq, true, &local_err); -if (local_err) { -error_propagate_prepend(errp, local_err, "can't allocate LSIs: "); +if (spapr_irq_claim(spapr, irq, true, errp) < 0) { +error_prepend(errp, "can't allocate LSIs: "); goto unrealize; }
Re: [RFC PATCH-for-5.1] hw/ide: Do not block for AIO while resetting a drive
Patchew URL: https://patchew.org/QEMU/20200717171938.1249-1-f4...@amsat.org/ Hi, This series failed the docker-quick@centos7 build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash make docker-image-centos7 V=1 NETWORK=1 time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1 === TEST SCRIPT END === CC aarch64-softmmu/target/arm/translate-sve.o CC aarch64-softmmu/trace/generated-helpers.o LINKaarch64-softmmu/qemu-system-aarch64 collect2: error: ld returned 1 exit status collect2: error: ld returned 1 exit status make[1]: *** [qemu-system-aarch64] Error 1 make[1]: *** [qemu-system-x86_64] Error 1 make: *** [aarch64-softmmu/all] Error 2 make: *** Waiting for unfinished jobs make: *** [x86_64-softmmu/all] Error 2 Traceback (most recent call last): File "./tests/docker/docker.py", line 708, in sys.exit(main()) --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=3bb1a90ba07d44f8a68428ba8cc1aac8', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-s7o9muvt/src/docker-src.2020-07-17-13.24.31.3740:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=3bb1a90ba07d44f8a68428ba8cc1aac8 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-s7o9muvt/src' make: *** [docker-run-test-quick@centos7] Error 2 real2m44.977s user0m8.984s The full log is available at http://patchew.org/logs/20200717171938.1249-1-f4...@amsat.org/testing.docker-quick@centos7/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
[RFC PATCH-for-5.1] hw/ide: Do not block for AIO while resetting a drive
libFuzzer triggered the following assertion: cat << EOF | qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -trace ide\* outl 0xcf8 0x8000fa24 outl 0xcfc 0xe106c000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0x0 0x3 0x2780e7 write 0xe106c22c 0xd 0x1130c218021130c218021130c2 write 0xe106c218 0x15 0x110010110010110010110010110010110010110010 EOF ide_exec_cmd IDE exec cmd: bus 0x56170a77a2b8; state 0x56170a77a340; cmd 0xe7 ide_reset IDEstate 0x56170a77a340 Aborted (core dumped) (gdb) bt #1 0x74f93895 in abort () at /lib64/libc.so.6 #2 0x55dc6c00 in bdrv_aio_cancel (acb=0x56765550) at block/io.c:2745 #3 0x55dac202 in blk_aio_cancel (acb=0x56765550) at block/block-backend.c:1546 #4 0x55b1bd74 in ide_reset (s=0x57213340) at hw/ide/core.c:1318 #5 0x55b1e3a1 in ide_bus_reset (bus=0x572132b8) at hw/ide/core.c:2422 #6 0x55b2aa27 in ahci_reset_port (s=0x5720eb50, port=2) at hw/ide/ahci.c:650 #7 0x55b29fd7 in ahci_port_write (s=0x5720eb50, port=2, offset=44, val=16) at hw/ide/ahci.c:360 #8 0x55b2a564 in ahci_mem_write (opaque=0x5720eb50, addr=556, val=16, size=1) at hw/ide/ahci.c:513 #9 0x5598415b in memory_region_write_accessor (mr=0x5720eb80, addr=556, value=0x7fffb838, size=1, shift=0, mask=255, attrs=...) at softmmu/memory.c:483 Looking at bdrv_aio_cancel: 2728 /* async I/Os */ 2729 2730 void bdrv_aio_cancel(BlockAIOCB *acb) 2731 { 2732 qemu_aio_ref(acb); 2733 bdrv_aio_cancel_async(acb); 2734 while (acb->refcnt > 1) { 2735 if (acb->aiocb_info->get_aio_context) { 2736 aio_poll(acb->aiocb_info->get_aio_context(acb), true); 2737 } else if (acb->bs) { 2738 /* qemu_aio_ref and qemu_aio_unref are not thread-safe, so 2739 * assert that we're not using an I/O thread. Thread-safe 2740 * code should use bdrv_aio_cancel_async exclusively. 2741 */ 2742 assert(bdrv_get_aio_context(acb->bs) == qemu_get_aio_context()); 2743 aio_poll(bdrv_get_aio_context(acb->bs), true); 2744 } else { 2745 abort(); <=== 2746 } 2747 } 2748 qemu_aio_unref(acb); 2749 } Our context is already referenced but we don't have a getter, neither a block driver state. Maybe because we are called from a vCPU context? Avoid this case by calling the pending callback directly. In this case this is WIN_FLUSH_CACHE. I'm not sure for the other READ/WRITE commands. Reported-by: Alexander Bulekov Fixes: bef0fd5958 ("ide: convert ide_sector_read() to asynchronous I/O") BugLink: https://bugs.launchpad.net/qemu/+bug/1878255 Signed-off-by: Philippe Mathieu-Daudé --- RFC because I don't understand AIO operations well. After RFC Cc: qemu-sta...@nongnu.org ide_is_pio_out() verifies a PIO OUT checking: s->end_transfer_func == ide_dummy_transfer_stop An alternative might be: if (s->pio_aiocb && s->end_transfer_func == ide_dummy_transfer_stop) { /* If there is a pending AIO callback, invoke it now. */ blk_aio_cancel_async(s->pio_aiocb); s->pio_aiocb = NULL; } Or if we want to limit to WIN_FLUSH_CACHE: if (s->pio_aiocb && s->bus->error_status & IDE_RETRY_FLUSH) { /* If there is a pending AIO callback, invoke it now. */ blk_aio_cancel_async(s->pio_aiocb); s->pio_aiocb = NULL; } Cc: Stefan Hajnoczi Last minute chat: 19:01 f4bug: use bdrv_aio_cancel_async() if possible because it won't block the current thread. 19:02 f4bug: For example, in device emulation code where the guest has requested to cancel an I/O request it's often possible to use the async version. 19:02 f4bug: But in synchronous code like device reset it may be necessary to use the synchronous (blocking) bdrv_aio_cancel() API instead. :( 19:14 f4bug: The way to decide is: will the current function return to the event loop and is there someone who will handle the request completion callback when cancel finishes? 19:14 f4bug: If the next line of code requires the request to finished then async cancel cannot be used. 19:15 f4bug: On the other hand, if the function can return and it's okay for the request to cancel at some future time then you can use async. So I'll revisit this patch :) --- hw/ide/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/ide/core.c b/hw/ide/core.c index d997a78e47..e3a9ce7d25 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -1315,7 +1315,8 @@ static void ide_reset(IDEState *s) trace_ide_reset(s); if (s->pio_aiocb) { -blk_aio_cancel(s->pio_aiocb); +/* If there is a pending AIO callback, invoke it now. */ +blk_aio_cancel_async(s->pio_aiocb); s->pio_aiocb = NULL; } -- 2.21.3
Re: sysbus_create_simple Vs qdev_create
On Fri, 17 Jul 2020 at 17:32, Daniel P. Berrangé wrote: > Personally I've not seen a problem with the term "parent" in > this scenario. The class inheritance metaphor maps reasonably > clearly to a parent/child metaphor. It's not bad in itself; it's just that it means almost all of our objects are in three different kinds of parent-child relationship simultaneously, which is confusing if you're not used to it... thanks -- PMM
Re: Inter-VM device emulation (call on Mon 20th July 2020)
On Fri, Jul 17, 2020 at 11:58:40AM +0300, Nikos Dragazis wrote: > On 15/7/20 7:44 μ.μ., Alex Bennée wrote: > > > Stefan Hajnoczi writes: > > > > > On Wed, Jul 15, 2020 at 01:28:07PM +0200, Jan Kiszka wrote: > > > > On 15.07.20 13:23, Stefan Hajnoczi wrote: > > > > > Let's have a call to figure out: > > > > > > > > > > 1. What is unique about these approaches and how do they overlap? > > > > > 2. Can we focus development and code review efforts to get something > > > > > merged sooner? > > > > > > > > > > Jan and Nikos: do you have time to join on Monday, 20th of July at > > > > > 15:00 > > > > > UTC? > > > > > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 > > > > > > > > > Not at that slot, but one hour earlier or later would work for me (so > > > > far). > > > Nikos: Please let us know which of Jan's timeslots works best for you. > > I'm in - the earlier slot would be preferential for me to avoid clashing > > with > > family time. > > > > I'm OK with all timeslots. Great, let's do 16:00 UTC. I have a meeting at 14:00 UTC so I can't make the earlier slot and it sounds like Andra-Irina and Alexander Graf do too. Sorry, Alex (Bennée), not optimal but it's hard to find a slot that is perfect for everyone. Stefan signature.asc Description: PGP signature
Re: [PATCH] fuzz: Fix leak when assembling datadir path string
On 200717 1847, Thomas Huth wrote: > On 17/07/2020 18.35, Alexander Bulekov wrote: > > We freed the string containing the final datadir path, but did not free > > the path to the executable's directory that we get from > > g_path_get_dirname(). Fix that. > > > > Reported-by: Thomas Huth > > Signed-off-by: Alexander Bulekov > > --- > > > > I ran it with Thomas' fixed build-oss-fuzz job: > > https://gitlab.com/a1xndr/qemu/-/jobs/644463736 > > Looks like the fuzzer triggered a crash there, see line 5850 ... > shouldn't the job fail in that case? ... i.e. is the fuzzer still > exiting with return code 0? Ah. We run each input in a forked process. If the child crashes, the parent can continue forking+fuzzing, as if nothing happened. This also unfortunately means that the job might succeed even if there is a crash in the actual fuzz target code, as long as the error only happens in the child processes. Maybe we could add an env variable to have the parent exit -1 if the child crashes, but then the job would fail even for non-fuzzer issues (such as this virtio-net crash). -Alex > > > diff --git a/tests/qtest/fuzz/fuzz.c b/tests/qtest/fuzz/fuzz.c > > index 6bc17ef313..031594a686 100644 > > --- a/tests/qtest/fuzz/fuzz.c > > +++ b/tests/qtest/fuzz/fuzz.c > > @@ -143,7 +143,7 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, char > > ***envp) > > { > > > > char *target_name; > > -char *dir; > > +char *bindir, *datadir; > > bool serialize = false; > > > > /* Initialize qgraph and modules */ > > @@ -164,11 +164,13 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, > > char ***envp) > > * location of the executable. Using this we add exec_dir/pc-bios > > to > > * the datadirs. > > */ > > -dir = g_build_filename(g_path_get_dirname(**argv), "pc-bios", > > NULL); > > -if (g_file_test(dir, G_FILE_TEST_IS_DIR)) { > > -qemu_add_data_dir(dir); > > +bindir = g_path_get_dirname(**argv); > > +datadir = g_build_filename(bindir, "pc-bios", NULL); > > +g_free(bindir); > > +if (g_file_test(datadir, G_FILE_TEST_IS_DIR)) { > > +qemu_add_data_dir(datadir); > > } > > -g_free(dir); > > +g_free(datadir); > > } else if (*argc > 1) { /* The target is specified as an argument */ > > target_name = (*argv)[1]; > > if (!strstr(target_name, "--fuzz-target=")) { > > > > Patch looks fine, thanks! > > Reviewed-by: Thomas Huth >
Re: [PATCH v6 04/13] hw/arm: Add NPCM730 and NPCM750 SoC models
+Markus Armbruster On Fri, Jul 17, 2020 at 5:20 AM Cédric Le Goater wrote: > > On 7/17/20 8:02 AM, Havard Skinnemoen wrote: > > The Nuvoton NPCM7xx SoC family are used to implement Baseboard > > Management Controllers in servers. While the family includes four SoCs, > > this patch implements limited support for two of them: NPCM730 (targeted > > for Data Center applications) and NPCM750 (targeted for Enterprise > > applications). > > > > This patch includes little more than the bare minimum needed to boot a > > Linux kernel built with NPCM7xx support in direct-kernel mode: > > > > - Two Cortex-A9 CPU cores with built-in periperhals. > > - Global Configuration Registers. > > - Clock Management. > > - 3 Timer Modules with 5 timers each. > > - 4 serial ports. > > > > The chips themselves have a lot more features, some of which will be > > added to the model at a later stage. > > > > Reviewed-by: Tyrone Ting > > Reviewed-by: Joel Stanley > > Reviewed-by: Philippe Mathieu-Daudé > > Signed-off-by: Havard Skinnemoen > > --- > > include/hw/arm/npcm7xx.h | 85 > > hw/arm/npcm7xx.c | 407 +++ > > hw/arm/Kconfig | 5 + > > hw/arm/Makefile.objs | 1 + > > 4 files changed, 498 insertions(+) > > create mode 100644 include/hw/arm/npcm7xx.h > > create mode 100644 hw/arm/npcm7xx.c > > > > diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h > > new file mode 100644 > > index 00..e68d9c79e6 > > --- /dev/null > > +++ b/include/hw/arm/npcm7xx.h > > @@ -0,0 +1,85 @@ > > +/* > > + * Nuvoton NPCM7xx SoC family. > > + * > > + * Copyright 2020 Google LLC > > + * > > + * This program is free software; you can redistribute it and/or modify it > > + * under the terms of the GNU General Public License as published by the > > + * Free Software Foundation; either version 2 of the License, or > > + * (at your option) any later version. > > + * > > + * This program is distributed in the hope that it will be useful, but > > WITHOUT > > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > > + * for more details. > > + */ > > +#ifndef NPCM7XX_H > > +#define NPCM7XX_H > > + > > +#include "hw/boards.h" > > +#include "hw/cpu/a9mpcore.h" > > +#include "hw/misc/npcm7xx_clk.h" > > +#include "hw/misc/npcm7xx_gcr.h" > > +#include "hw/timer/npcm7xx_timer.h" > > +#include "target/arm/cpu.h" > > + > > +#define NPCM7XX_MAX_NUM_CPUS(2) > > + > > +/* The first half of the address space is reserved for DDR4 DRAM. */ > > +#define NPCM7XX_DRAM_BA (0x) > > +#define NPCM7XX_DRAM_SZ (2 * GiB) > > + > > +/* Magic addresses for setting up direct kernel booting and SMP boot > > stubs. */ > > +#define NPCM7XX_LOADER_START(0x) /* Start of SDRAM */ > > +#define NPCM7XX_SMP_LOADER_START(0x) /* Boot ROM */ > > +#define NPCM7XX_SMP_BOOTREG_ADDR(0xf080013c) /* GCR.SCRPAD */ > > +#define NPCM7XX_GIC_CPU_IF_ADDR (0xf03fe100) /* GIC within A9 */ > > + > > +typedef struct NPCM7xxState { > > +DeviceState parent; > > + > > +ARMCPU cpu[NPCM7XX_MAX_NUM_CPUS]; > > +A9MPPrivState a9mpcore; > > + > > +MemoryRegionsram; > > +MemoryRegionirom; > > +MemoryRegionram3; > > +MemoryRegion*dram; > > + > > +NPCM7xxGCRState gcr; > > +NPCM7xxCLKState clk; > > +NPCM7xxTimerCtrlState tim[3]; > > +} NPCM7xxState; > > + > > +#define TYPE_NPCM7XX"npcm7xx" > > +#define NPCM7XX(obj)OBJECT_CHECK(NPCM7xxState, (obj), TYPE_NPCM7XX) > > + > > +#define TYPE_NPCM730"npcm730" > > +#define TYPE_NPCM750"npcm750" > > + > > +typedef struct NPCM7xxClass { > > +DeviceClass parent; > > + > > +/* Bitmask of modules that are permanently disabled on this chip. */ > > +uint32_tdisabled_modules; > > +/* Number of CPU cores enabled in this SoC class (may be 1 or 2). */ > > +uint32_tnum_cpus; > > +} NPCM7xxClass; > > + > > +#define NPCM7XX_CLASS(klass)\ > > +OBJECT_CLASS_CHECK(NPCM7xxClass, (klass), TYPE_NPCM7XX) > > +#define NPCM7XX_GET_CLASS(obj) \ > > +OBJECT_GET_CLASS(NPCM7xxClass, (obj), TYPE_NPCM7XX) > > + > > +/** > > + * npcm7xx_load_kernel - Loads memory with everything needed to boot > > + * @machine - The machine containing the SoC to be booted. > > + * @soc - The SoC containing the CPU to be booted. > > + * > > + * This will set up the ARM boot info structure for the specific NPCM7xx > > + * derivative and call arm_load_kernel() to set up loading of the kernel, > > etc. > > + * into memory, if requested by the user. > > + */ > > +void npcm7xx_load_kernel(MachineState *machine, NPCM7xxState *soc); > > + > > +#endif /* NPCM7XX_H */ > > diff --git a/hw/arm
Re: [PATCH] fuzz: Fix leak when assembling datadir path string
Patchew URL: https://patchew.org/QEMU/20200717163523.1591-1-alx...@bu.edu/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #! /bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-mingw@fedora J=14 NETWORK=1 === TEST SCRIPT END === Configure options: --enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/tmp/qemu-test/install --python=/usr/bin/python3 --cross-prefix=x86_64-w64-mingw32- --enable-trace-backends=simple --enable-gnutls --enable-nettle --enable-curl --enable-vnc --enable-bzip2 --enable-guest-agent --enable-docs ERROR: configure test passed without -Werror but failed with -Werror. This is probably a bug in the configure script. The failing command will be at the bottom of config.log. You can run configure with --disable-werror to bypass this check. --- funcs: do_compiler do_cc compile_object check_define main lines: 93 128 636 662 0 x86_64-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -c -o config-temp/qemu-conf.o config-temp/qemu-conf.c config-temp/qemu-conf.c:2:2: error: #error __linux__ not defined 2 | #error __linux__ not defined | ^ --- funcs: do_compiler do_cc compile_object check_define main lines: 93 128 636 714 0 x86_64-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -c -o config-temp/qemu-conf.o config-temp/qemu-conf.c config-temp/qemu-conf.c:2:2: error: #error __i386__ not defined 2 | #error __i386__ not defined | ^ --- funcs: do_compiler do_cc compile_object check_define main lines: 93 128 636 717 0 x86_64-w64-mingw32-gcc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -c -o config-temp/qemu-conf.o config-temp/qemu-conf.c config-temp/qemu-conf.c:2:2: error: #error __ILP32__ not defined 2 | #error __ILP32__ not defined | ^ --- lines: 93 134 987 0 x86_64-w64-mingw32-gcc -mthreads -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -o config-temp/qemu-conf.exe config-temp/qemu-conf.c -liberty /usr/lib/gcc/x86_64-w64-mingw32/9.2.1/../../../../x86_64-w64-mingw32/bin/ld: cannot find -liberty collect2: error: ld returned 1 exit status funcs: do_compiler do_cc compile_object main lines: 93 128 1998 0 --- funcs: do_compiler do_cc compile_prog cc_has_warning_flag main lines: 93 134 2098 2102 0 x86_64-w64-mingw32-gcc -m64 -mcx16 -mthreads -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Werror -Winitializer-overrides -o config-temp/qemu-conf.exe config-temp/qemu-conf.c -m64 x86_64-w64-mingw32-gcc: error: unrecognized command line option '-Winitializer-overrides' funcs: do_compiler do_cc compile_prog cc_has_warning_flag main lines: 93 134 2098 2102 0 --- funcs: do_compiler do_cc compile_prog cc_has_warning_flag main lines: 93 134 2098 2102 0 x86_64-w64-mingw32-gcc -m64 -mcx16 -mthreads -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-missing-include-dirs -Wno-shift-negative-value -Werror -Wstring-plus-int -o config-temp/qemu-conf.exe config-temp/qemu-conf.c -m64 x86_64-w64-mingw32-gcc: error: unrecognized command line option '-Wstring-plus-int' funcs: do_compiler do_cc compile_prog cc_has_warning_flag main lines: 93 134 2098 2102 0 x86_64-w64-mingw32-gcc -m64 -mcx16 -mthreads -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99 -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -
[PATCH] fuzz: Fix leak when assembling datadir path string
We freed the string containing the final datadir path, but did not free the path to the executable's directory that we get from g_path_get_dirname(). Fix that. Reported-by: Thomas Huth Signed-off-by: Alexander Bulekov --- I ran it with Thomas' fixed build-oss-fuzz job: https://gitlab.com/a1xndr/qemu/-/jobs/644463736 tests/qtest/fuzz/fuzz.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/tests/qtest/fuzz/fuzz.c b/tests/qtest/fuzz/fuzz.c index 6bc17ef313..031594a686 100644 --- a/tests/qtest/fuzz/fuzz.c +++ b/tests/qtest/fuzz/fuzz.c @@ -143,7 +143,7 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, char ***envp) { char *target_name; -char *dir; +char *bindir, *datadir; bool serialize = false; /* Initialize qgraph and modules */ @@ -164,11 +164,13 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, char ***envp) * location of the executable. Using this we add exec_dir/pc-bios to * the datadirs. */ -dir = g_build_filename(g_path_get_dirname(**argv), "pc-bios", NULL); -if (g_file_test(dir, G_FILE_TEST_IS_DIR)) { -qemu_add_data_dir(dir); +bindir = g_path_get_dirname(**argv); +datadir = g_build_filename(bindir, "pc-bios", NULL); +g_free(bindir); +if (g_file_test(datadir, G_FILE_TEST_IS_DIR)) { +qemu_add_data_dir(datadir); } -g_free(dir); +g_free(datadir); } else if (*argc > 1) { /* The target is specified as an argument */ target_name = (*argv)[1]; if (!strstr(target_name, "--fuzz-target=")) { -- 2.26.2
[GIT PULL] IPMI updates
The following changes since commit 95d1fbabae0cd44156ac4b96d512d143ca7dfd5e: Merge remote-tracking branch 'remotes/kraxel/tags/fixes-20200716-pull-request' into staging (2020-07-16 18:50:51 +0100) are available in the Git repository at: https://github.com/cminyard/qemu.git tags/for-qemu-ipmi-5 for you to fetch changes up to e3f7320caa1cc08a9b752e555b79abd6218c7c7a: ipmi: add SET_SENSOR_READING command (2020-07-17 11:39:46 -0500) Man page update and new set sensor command Some minor man page updates for fairly obvious things. The set sensor command addition has been in the Power group's tree for a long time and I have neglected to submit it. -corey Corey Minyard (2): ipmi: Add man page pieces for the IPMI PCI devices ipmi: Fix a man page entry Cédric Le Goater (1): ipmi: add SET_SENSOR_READING command hw/ipmi/ipmi_bmc_sim.c | 223 + qemu-options.hx| 11 ++- 2 files changed, 233 insertions(+), 1 deletion(-)
Re: [PATCH] fuzz: Fix leak when assembling datadir path string
On 17/07/2020 18.35, Alexander Bulekov wrote: > We freed the string containing the final datadir path, but did not free > the path to the executable's directory that we get from > g_path_get_dirname(). Fix that. > > Reported-by: Thomas Huth > Signed-off-by: Alexander Bulekov > --- > > I ran it with Thomas' fixed build-oss-fuzz job: > https://gitlab.com/a1xndr/qemu/-/jobs/644463736 Looks like the fuzzer triggered a crash there, see line 5850 ... shouldn't the job fail in that case? ... i.e. is the fuzzer still exiting with return code 0? > diff --git a/tests/qtest/fuzz/fuzz.c b/tests/qtest/fuzz/fuzz.c > index 6bc17ef313..031594a686 100644 > --- a/tests/qtest/fuzz/fuzz.c > +++ b/tests/qtest/fuzz/fuzz.c > @@ -143,7 +143,7 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, char > ***envp) > { > > char *target_name; > -char *dir; > +char *bindir, *datadir; > bool serialize = false; > > /* Initialize qgraph and modules */ > @@ -164,11 +164,13 @@ int LLVMFuzzerInitialize(int *argc, char ***argv, char > ***envp) > * location of the executable. Using this we add exec_dir/pc-bios to > * the datadirs. > */ > -dir = g_build_filename(g_path_get_dirname(**argv), "pc-bios", NULL); > -if (g_file_test(dir, G_FILE_TEST_IS_DIR)) { > -qemu_add_data_dir(dir); > +bindir = g_path_get_dirname(**argv); > +datadir = g_build_filename(bindir, "pc-bios", NULL); > +g_free(bindir); > +if (g_file_test(datadir, G_FILE_TEST_IS_DIR)) { > +qemu_add_data_dir(datadir); > } > -g_free(dir); > +g_free(datadir); > } else if (*argc > 1) { /* The target is specified as an argument */ > target_name = (*argv)[1]; > if (!strstr(target_name, "--fuzz-target=")) { > Patch looks fine, thanks! Reviewed-by: Thomas Huth
Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers
On 07/17/20 11:26, Laszlo Ersek wrote: > On 07/16/20 17:09, Philippe Mathieu-Daudé wrote: >> On 7/16/20 4:42 PM, Laszlo Ersek wrote: >>> Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an >>> object that has static storage duration shall be constant expressions or >>> string literals". >>> >>> The compound literal produced by the make_floatx80() macro is not such a >>> constant expression, per 6.6p7-9. (An implementation may accept it, >>> according to 6.6p10, but is not required to.) >>> >>> Therefore using "floatx80_zero" and make_floatx80() for initializing >>> "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6 >>> actually chokes on them: >>> target/i386/fpu_helper.c:871:5: error: initializer element is not constant { make_floatx80(0xbfff, 0x8000ULL), ^ >> >> This reminds me of: >> >> commit 6fa9ba09dbf4eb8b52bcb47d6820957f1b77ee0b >> Author: Kamil Rytarowski >> Date: Mon Sep 4 23:23:06 2017 +0200 >> >> target/m68k: Switch fpu_rom from make_floatx80() to make_floatx80_init() >> >> GCC 4.7.2 on SunOS reports that the values assigned to array members >> are not >> real constants: >> >> target/m68k/fpu_helper.c:32:5: error: initializer element is not >> constant >> target/m68k/fpu_helper.c:32:5: error: (near initialization for >> 'fpu_rom[0]') >> rules.mak:66: recipe for target 'target/m68k/fpu_helper.o' failed >> >> Convert the array to make_floatx80_init() to fix it. >> Replace floatx80_pi-like constants with make_floatx80_init() as they are >> defined as make_floatx80(). >> >> Reviewed-by: Philippe Mathieu-Daudé >> >>> >>> We've had the make_floatx80_init() macro for this purpose since commit >>> 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that >>> macro again. >>> >>> Fixes: eca30647fc07 >>> Fixes: ff57bb7b6326 >>> Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html >>> Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html >>> Cc: Alex Bennée >>> Cc: Aurelien Jarno >>> Cc: Eduardo Habkost >>> Cc: Joseph Myers >>> Cc: Paolo Bonzini >>> Cc: Peter Maydell >>> Cc: Richard Henderson >>> Signed-off-by: Laszlo Ersek >>> --- >>> >>> Notes: >>> I can see that there are test cases under "tests/tcg/i386", but I don't >>> know how to run them. >> >> Yeah it is not easy to figure... >> >> Try 'make run-tcg-tests-i386-softmmu' >> but you need docker :^) > > That worked, thanks! Even without Docker: I just had to add > > --cross-cc-i386=gcc > > to my ./configure flags. > Also -- I meant to, but I forgot to put "for-5.1" in the subject prefix; sorry about that. Laszlo
Re: sysbus_create_simple Vs qdev_create
On Fri, Jul 17, 2020 at 12:23:12PM -0400, Eduardo Habkost wrote: > On Fri, Jul 17, 2020 at 07:10:57AM +0200, Markus Armbruster wrote: > > Eduardo Habkost writes: > > > > > I'd also note that the use of "parent" in the code is also > > > ambiguous. It can mean: > > > > > > * QOM parent type, i.e. TypeInfo.parent. Related fields: > > > * parent_class members of class structs > > > * parent_obj members of object structs > > > > I hate the use of "parent" and "child" for a super- / subtype relation. > > > > Correcting the terminology there would be short term pain for long term > > gain. Worthwhile? > > I don't know. It looks like the terminology came from GObject. One day I would love it if we got QOM to actually use GObject, so from that POV I'd be inclined to stick with the "parent" term. Personally I've not seen a problem with the term "parent" in this scenario. The class inheritance metaphor maps reasonably clearly to a parent/child metaphor. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[PATCH] tcg/cpu-exec: precise single-stepping after an interrupt
When single-stepping with a debugger attached to QEMU, and when an interrupt is raised, the debugger misses the first instruction after the interrupt. Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Signed-off-by: Richard Henderson --- accel/tcg/cpu-exec.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 6a3d3a3cfc..66d38f9d85 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -588,7 +588,13 @@ static inline bool cpu_handle_interrupt(CPUState *cpu, else { if (cc->cpu_exec_interrupt(cpu, interrupt_request)) { replay_interrupt(); -cpu->exception_index = -1; +/* + * After processing the interrupt, ensure an EXCP_DEBUG is + * raised when single-stepping so that GDB doesn't miss the + * next instruction. + */ +cpu->exception_index = +(cpu->singlestep_enabled ? EXCP_DEBUG : -1); *last_tb = NULL; } /* The target hook may have updated the 'cpu->interrupt_request'; -- 2.25.1
Re: [PATCH v3 3/9] vfio: add quirk device write method
On Fri, 17 Jul 2020 16:57:40 +0100 Peter Maydell wrote: > On Fri, 17 Jul 2020 at 16:54, Alex Williamson > wrote: > > > > On Thu, 16 Jul 2020 18:46:33 +0100 > > Peter Maydell wrote: > > > > > Alex (Williamson) -- as the vfio maintainer, do you have a view > > > on whether we should be logging write accesses to port 0x3c3 > > > here as guest-errors or unimplemented-QEMU-functionality? > > > > > > Guest-error seems plausible to me, anyway. > > > > I believe the intention was that writes would be dropped, so if this > > generates logging that is going to cause users to file bugs, that would > > be undesirable. Thanks, > > It will only log if the user explicitly turns on "log things the > guest does which are bugs in it, like writing to read-only > registers" (with the '-d guest_errors' option). IIRC, this quirk is based on observation more so than an actual spec, so whether a log of such an event is interpreted as a guest error or an emulation error might be up for debate. Aside from that nit, and lack of bandwidth to research how hardware handles writes, Acked-by: Alex Williamson Thanks, Alex
Re: sysbus_create_simple Vs qdev_create
On Fri, Jul 17, 2020 at 07:10:57AM +0200, Markus Armbruster wrote: > Eduardo Habkost writes: > > > I'd also note that the use of "parent" in the code is also > > ambiguous. It can mean: > > > > * QOM parent type, i.e. TypeInfo.parent. Related fields: > > * parent_class members of class structs > > * parent_obj members of object structs > > I hate the use of "parent" and "child" for a super- / subtype relation. > > Correcting the terminology there would be short term pain for long term > gain. Worthwhile? I don't know. It looks like the terminology came from GObject. > > > * QOM composition tree parent object, i.e. Object::parent > > * qdev device parent bus, i.e. DeviceState::parent_bus > > * parent device of of qdev bus, i.e. BusState::parent > > These are tree relations. Use of "parent" and "child" is perfectly > fine. The terms are fine but still ambiguous, as devices belong to two separate trees at the same time (the QOM composition tree, and the qdev tree). I never understood why we have two separate independent object trees. -- Eduardo
Re: [virtio-comment] [RFC] ivshmem v2: Shared memory device specification
On 15.07.20 15:27, Stefan Hajnoczi wrote: On Mon, May 25, 2020 at 09:58:28AM +0200, Jan Kiszka wrote: IVSHMEM Device Specification ** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION! ** Hi Jan, Thanks for posting this! I have a posted comments where I wasn't sure what the spec meant. The goal of the Inter-VM Shared Memory (IVSHMEM) device model is to define the minimally needed building blocks a hypervisor has to provide for enabling guest-to-guest communication. The details of communication protocols shall remain opaque to the hypervisor so that guests are free to define them as simple or sophisticated as they need. For that purpose, the IVSHMEM provides the following features to its users: - Interconnection between up to 65536 peers - Multi-purpose shared memory region - common read/writable section - output sections that are read/writable for one peer and only readable for the others - section with peer states - Event signaling via interrupt to remote sides - Support for life-cycle management via state value exchange and interrupt notification on changes, backed by a shared memory section - Free choice of protocol to be used on top - Protocol type declaration - Register can be implemented either memory-mapped or via I/O, depending on platform support and lower VM-exit costs - Unprivileged access to memory-mapped or I/O registers feasible - Single discovery and configuration via standard PCI, no complexity by additionally defining a platform device model Hypervisor Model In order to provide a consistent link between peers, all connected instances of IVSHMEM devices need to be configured, created and run by the hypervisor according to the following requirements: - The instances of the device shall appear as a PCI device to their users. - The read/write shared memory section has to be of the same size for all peers. The size can be zero. - If shared memory output sections are present (non-zero section size), there must be one reserved for each peer with exclusive write access. All output sections must have the same size and must be readable for all peers. - The State Table must have the same size for all peers, must be large enough to hold the state values of all peers, and must be read-only for the user. Who/what is the "user"? I guess this simply means that the State Table is read-only and only the hypervisor can update the table entries? Read-only for the guest, right. I used the term "user" here and elsewhere, switching to "guest" might be better. - State register changes (explicit writes, peer resets) have to be propagated to the other peers by updating the corresponding State Table entry and issuing an interrupt to all other peers if they enabled reception. - Interrupts events triggered by a peer have to be delivered to the target peer, provided the receiving side is valid and has enabled the reception. - All peers must have the same interrupt delivery features available, i.e. MSI-X with the same maximum number of vectors on platforms supporting this mechanism, otherwise INTx with one vector. Guest-side Programming Model An IVSHMEM device appears as a PCI device to its users. Unless otherwise noted, it conforms to the PCI Local Bus Specification, Revision 3.0. As such, it is discoverable via the PCI configuration space and provides a number of standard and custom PCI configuration registers. ### Shared Memory Region Layout The shared memory region is divided into several sections. +-+ - | | : | Output Section for peer n-1 | : Output Section Size | (n = Maximum Peers) | : +-+ - : : : : : : +-+ - | | : | Output Section for peer 1 | : Output Section Size | | : +-+ - | | : | Output Section for peer 0 | : Output Section Size | | : +-+ - | | : | Read/Write Section | : R/W Section Size | | : +-+ - | | : | State Table | : State Table Size | | : +-+ <-- Shared memory base address The first section consists of the mandatory State Table. Its size is defined by the State Table Size register and cannot be zero. This section is read-only for all peers. The second section
Re: device compatibility interface for live migration with assigned devices
On Thu, 16 Jul 2020 16:32:30 +0800 Yan Zhao wrote: > On Thu, Jul 16, 2020 at 12:16:26PM +0800, Jason Wang wrote: > > > > On 2020/7/14 上午7:29, Yan Zhao wrote: > > > hi folks, > > > we are defining a device migration compatibility interface that helps > > > upper > > > layer stack like openstack/ovirt/libvirt to check if two devices are > > > live migration compatible. > > > The "devices" here could be MDEVs, physical devices, or hybrid of the two. > > > e.g. we could use it to check whether > > > - a src MDEV can migrate to a target MDEV, > > > - a src VF in SRIOV can migrate to a target VF in SRIOV, > > > - a src MDEV can migration to a target VF in SRIOV. > > >(e.g. SIOV/SRIOV backward compatibility case) > > > > > > The upper layer stack could use this interface as the last step to check > > > if one device is able to migrate to another device before triggering a > > > real > > > live migration procedure. > > > we are not sure if this interface is of value or help to you. please don't > > > hesitate to drop your valuable comments. > > > > > > > > > (1) interface definition > > > The interface is defined in below way: > > > > > > __userspace > > >/\ \ > > > / \write > > > / read \ > > > /__ ___\|/_ > > >| migration_version | | migration_version |-->check migration > > >- - compatibility > > > device Adevice B > > > > > > > > > a device attribute named migration_version is defined under each device's > > > sysfs node. e.g. > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version). > > > > > > Are you aware of the devlink based device management interface that is > > proposed upstream? I think it has many advantages over sysfs, do you > > consider to switch to that? Advantages, such as? > not familiar with the devlink. will do some research of it. > > > > > > > userspace tools read the migration_version as a string from the source > > > device, > > > and write it to the migration_version sysfs attribute in the target > > > device. > > > > > > The userspace should treat ANY of below conditions as two devices not > > > compatible: > > > - any one of the two devices does not have a migration_version attribute > > > - error when reading from migration_version attribute of one device > > > - error when writing migration_version string of one device to > > >migration_version attribute of the other device > > > > > > The string read from migration_version attribute is defined by device > > > vendor > > > driver and is completely opaque to the userspace. > > > > > > My understanding is that something opaque to userspace is not the > > philosophy > > but the VFIO live migration in itself is essentially a big opaque stream to > userspace. > > > of Linux. Instead of having a generic API but opaque value, why not do in a > > vendor specific way like: > > > > 1) exposing the device capability in a vendor specific way via sysfs/devlink > > or other API > > 2) management read capability in both src and dst and determine whether we > > can do the migration > > > > This is the way we plan to do with vDPA. > > > yes, in another reply, Alex proposed to use an interface in json format. > I guess we can define something like > > { "self" : > [ > { "pciid" : "8086591d", > "driver" : "i915", > "gvt-version" : "v1", > "mdev_type" : "i915-GVTg_V5_2", > "aggregator" : "1", > "pv-mode" : "none", > } > ], > "compatible" : > [ > { "pciid" : "8086591d", > "driver" : "i915", > "gvt-version" : "v1", > "mdev_type" : "i915-GVTg_V5_2", > "aggregator" : "1" > "pv-mode" : "none", > }, > { "pciid" : "8086591d", > "driver" : "i915", > "gvt-version" : "v1", > "mdev_type" : "i915-GVTg_V5_4", > "aggregator" : "2" > "pv-mode" : "none", > }, > { "pciid" : "8086591d", > "driver" : "i915", > "gvt-version" : "v2", > "mdev_type" : "i915-GVTg_V5_4", > "aggregator" : "2" > "pv-mode" : "none, ppgtt, context", > } > ... > ] > } > > But as those fields are mostly vendor specific, the userspace can > only do simple string comparing, I guess the list would be very long as > it needs to enumerate all possible targets. This ignores so much of what I tried to achieve in my example :( > also, in some fileds like "gvt-version", is there a simple way to express > things like v2+? That's not a reasonable thing to express anyway, how can you be certain that v3 won't break compatibility with v2? Sean proposed a versioning scheme that accounts for this, using an x.y.z version expressing the major, minor, and bugfix versions, where there is no compatibility across major versions, mi
Re: [PATCH v3 3/9] vfio: add quirk device write method
On Fri, 17 Jul 2020 at 16:54, Alex Williamson wrote: > > On Thu, 16 Jul 2020 18:46:33 +0100 > Peter Maydell wrote: > > > Alex (Williamson) -- as the vfio maintainer, do you have a view > > on whether we should be logging write accesses to port 0x3c3 > > here as guest-errors or unimplemented-QEMU-functionality? > > > > Guest-error seems plausible to me, anyway. > > I believe the intention was that writes would be dropped, so if this > generates logging that is going to cause users to file bugs, that would > be undesirable. Thanks, It will only log if the user explicitly turns on "log things the guest does which are bugs in it, like writing to read-only registers" (with the '-d guest_errors' option). thanks -- PMM
Re: [PATCH v3 3/9] vfio: add quirk device write method
On Thu, 16 Jul 2020 18:46:33 +0100 Peter Maydell wrote: > On Tue, 30 Jun 2020 at 13:30, P J P wrote: > > > > From: Prasad J Pandit > > > > Add vfio quirk device mmio write method to avoid NULL pointer > > dereference issue. > > > > Reported-by: Lei Sun > > Reviewed-by: Li Qiang > > Signed-off-by: Prasad J Pandit > > --- > > hw/vfio/pci-quirks.c | 8 > > 1 file changed, 8 insertions(+) > > > > Update v3: Add Reviewed-by: ... > > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09406.html > > > > diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c > > index d304c81148..cc6d5dbc23 100644 > > --- a/hw/vfio/pci-quirks.c > > +++ b/hw/vfio/pci-quirks.c > > @@ -14,6 +14,7 @@ > > #include "config-devices.h" > > #include "exec/memop.h" > > #include "qemu/units.h" > > +#include "qemu/log.h" > > #include "qemu/error-report.h" > > #include "qemu/main-loop.h" > > #include "qemu/module.h" > > @@ -264,8 +265,15 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque, > > return data; > > } > > > > +static void vfio_ati_3c3_quirk_write(void *opaque, hwaddr addr, > > +uint64_t data, unsigned size) > > +{ > > +qemu_log_mask(LOG_GUEST_ERROR, "%s not implemented\n", __func__); > > +} > > + > > static const MemoryRegionOps vfio_ati_3c3_quirk = { > > .read = vfio_ati_3c3_quirk_read, > > +.write = vfio_ati_3c3_quirk_write, > > .endianness = DEVICE_LITTLE_ENDIAN, > > }; > > > Alex (Williamson) -- as the vfio maintainer, do you have a view > on whether we should be logging write accesses to port 0x3c3 > here as guest-errors or unimplemented-QEMU-functionality? > > Guest-error seems plausible to me, anyway. I believe the intention was that writes would be dropped, so if this generates logging that is going to cause users to file bugs, that would be undesirable. Thanks, Alex
Re: [PATCH] e1000e: using bottom half to send packets
On Fri, 17 Jul 2020 at 04:11, Jason Wang wrote: > I think several things were missed in this patch (take virtio-net as a > reference), do we need the following things: > > - Cancel the bh when VM is stopped. Similarly, what should we do with the bh when the device is reset ? > - A throttle to prevent bh from executing too much timer? > - A flag to record whether or not this a pending tx (and migrate it?) thanks -- PMM
Re: [PATCH] e1000e: using bottom half to send packets
Jason Wang 于2020年7月17日周五 下午1:39写道: > > > On 2020/7/17 下午12:46, Li Qiang wrote: > > Jason Wang 于2020年7月17日周五 上午11:10写道: > >> > >> On 2020/7/17 上午12:14, Li Qiang wrote: > >>> Alexander Bulekov reported a UAF bug related e1000e packets send. > >>> > >>> -->https://bugs.launchpad.net/qemu/+bug/1886362 > >>> > >>> This is because the guest trigger a e1000e packet send and set the > >>> data's address to e1000e's MMIO address. So when the e1000e do DMA > >>> it will write the MMIO again and trigger re-entrancy and finally > >>> causes this UAF. > >>> > >>> Paolo suggested to use a bottom half whenever MMIO is doing complicate > >>> things in here: > >>> -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html > >>> > >>> Reference here: > >>> 'The easiest solution is to delay processing of descriptors to a bottom > >>> half whenever MMIO is doing something complicated. This is also better > >>> for latency because it will free the vCPU thread more quickly and leave > >>> the work to the I/O thread.' > >> > >> I think several things were missed in this patch (take virtio-net as a > >> reference), do we need the following things: > >> > > Thanks Jason, > > In fact I know this, I'm scared for touching this but I want to try. > > Thanks for your advice. > > > >> - Cancel the bh when VM is stopped. > > Ok. I think add a vm state change notifier for e1000e can address this. > > > >> - A throttle to prevent bh from executing too much timer? > > Ok, I think add a config timeout and add a timer in e1000e can address this. > > > Sorry, a typo. I meant we probably need a tx_burst as what virtio-net did. > > > > > >> - A flag to record whether or not this a pending tx (and migrate it?) > > Is just a flag enough? Could you explain more about the idea behind > > processing the virtio-net/e1000e using bh like this? > > > Virtio-net use a tx_waiting variable to record whether or not there's a > pending bh. (E.g bh is cancelled due to vmstop, we need reschedule it > after vmresume). Maybe we can do something simpler by just schecule bh > unconditionally during vm resuming. > > > > For example, if the guest trigger a lot of packets send and if the bh > > is scheduled in IO thread. So will we lost packets? > > > We don't since we don't populate virtqueue which means packets are > queued there. > This remind of me a question: If we use tx_burst like in virtion-net. For detail: If we sent out 'tx_burst' packets per bh. Then we set 'tx_waiting' and then schedule another bh. However if between two bh schedule, the guest change the e1000e register such 'r->dh' 'r->dlen'. The data is fully corrupted. In fact this issue does exist in my origin patch. That's What if following happend: vcpu thread: guest write e1000e MMIO to trigger packets send vcpu thread: schedule a bh vcpu thread: return IO thread: begin to run the bh and start send packets vcpu thread: write register again such as 'r->dh' 'r->dlen'.. So here the IO thread and vcpu thread will race the register? If I remember correctly, the virtio net has no such problem because it uses ring buffer and the backedn(virtio device) uses the shadow index to index the ring buffer data. What's your idea here? Thanks, Li Qiang > Thanks > > > > How we avoid this in virtio-net. > > > > Thanks, > > Li Qiang > > > > > > > >> Thanks > >> > >> > >>> This patch fixes this UAF. > >>> > >>> Signed-off-by: Li Qiang > >>> --- > >>>hw/net/e1000e_core.c | 25 + > >>>hw/net/e1000e_core.h | 2 ++ > >>>2 files changed, 19 insertions(+), 8 deletions(-) > >>> > >>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c > >>> index bcd186cac5..6165b04b68 100644 > >>> --- a/hw/net/e1000e_core.c > >>> +++ b/hw/net/e1000e_core.c > >>> @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, > >>> uint32_t val) > >>>static void > >>>e1000e_set_tctl(E1000ECore *core, int index, uint32_t val) > >>>{ > >>> -E1000E_TxRing txr; > >>>core->mac[index] = val; > >>> > >>>if (core->mac[TARC0] & E1000_TARC_ENABLE) { > >>> -e1000e_tx_ring_init(core, &txr, 0); > >>> -e1000e_start_xmit(core, &txr); > >>> +qemu_bh_schedule(core->tx[0].tx_bh); > >>>} > >>> > >>>if (core->mac[TARC1] & E1000_TARC_ENABLE) { > >>> -e1000e_tx_ring_init(core, &txr, 1); > >>> -e1000e_start_xmit(core, &txr); > >>> +qemu_bh_schedule(core->tx[1].tx_bh); > >>>} > >>>} > >>> > >>>static void > >>>e1000e_set_tdt(E1000ECore *core, int index, uint32_t val) > >>>{ > >>> -E1000E_TxRing txr; > >>>int qidx = e1000e_mq_queue_idx(TDT, index); > >>>uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1; > >>> > >>>core->mac[index] = val & 0x; > >>> > >>>if (core->mac[tarc_reg] & E1000_TARC_ENABLE) { > >>> -e1000e_tx_ring_init(core, &txr, qidx); > >>> -e1000e_start_xmit(core, &txr); > >>> +qemu_bh_schedule(core->tx[
Re: Implement standard file operation with QEMU
On Fri, 17 Jul 2020 at 16:28, casmac wrote: >What I want to realize is to be able to call standard file operations > (open, read, write etc) in the application program, and execute such programs > in QEMU. But I am building under system mode. >TI provide copilation toolchain and a library that provide partial > functionality from libc. I am hoping to use TI's toolkit to generate object > code which contains calls to hook functions, and then use QEMU's host I/O > implementation to realize low-level file operation. For example: > _stream[fildes]->WRITE <---hook to ---> qemu_semihosting_console_outs If the QEMU guest architecture you're using supports semihosting (eg arm, mips, lm32, m68k, niso2, xtensa), then you can use it to implement that kind of guest-libc functionality. What you need to do is entirely in the guest code: the implementation of the hook function for write would need to invoke the correct semihosting call. (For instance on Arm this is "put the arguments into the correct guest registers/in-memory structures, then invoke the right SVC instruction".) If the guest architecture you're using does not have a semihosting ABI, then you would need to define one, which is a moderate amount of work and also requires some agreement about what that ABI definition should look like (eg for risc-v we asked the risc-v folks to write up a basic spec document so that everybody implementing semihosting was working to the same set of requirements). The other usual way to implement the low-level hook functions would be for them to talk to emulated devices: eg a write-to-console function could be implemented to send the string to a UART device. thanks -- PMM
Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception
On 7/17/20 4:01 AM, Luc Michel wrote: > I wrote a small test case for the interrupt side that can be run on the > virt board: ... > This is with your fix. Without it, the second stepi stops on 0x284. Awesome, thanks. > Do you want me to send it? If yes, how should I give credit to you? > Should I put your Signed-off-by: in it? I'll put it together. You can then add your Reviewed/Tested-by. r~
Re: [PATCH] gitlab-ci.yml: Add oss-fuzz build tests
On 17/07/2020 15.20, Alexander Bulekov wrote: > On 200717 0951, Thomas Huth wrote: >> On 17/07/2020 07.40, Thomas Huth wrote: [...] >> I think I've got it basically working like this: >> >> build-oss-fuzz: >> <<: *native_build_job_definition >> variables: >> IMAGE: fedora >> script: >> - mkdir build-oss-fuzz >> - CC="clang" CXX="clang++" CFLAGS="-fsanitize=address" >> ./scripts/oss-fuzz/build.sh >> - for fuzzer in $(find build-oss-fuzz/DEST_DIR/ -executable -type f >> | grep -v slirp); do >> grep "LLVMFuzzerTestOneInput" ${fuzzer} > /dev/null 2>&1 || >> continue ; >> echo Testing ${fuzzer} ... ; >> "${fuzzer}" -runs=1000 || exit 1 ; >> done >> >> However, it still triggered a memory leak and thus the pipeline failed: >> >> https://gitlab.com/huth/qemu/-/jobs/643472695 >> >> ... and this was with all the other "leak fix" patches already applied. >> Is there an easy way to debug that issue? That information from the >> LeakSanitizer looks pretty sparse... > > Strange... Especially since Philippe didn't get the same error. We > should probably add -seed=1 after -runs=1000, to make sure the fuzzer is > using the same rng seed. That said, I just ran the fuzzer for 50k > iterations, and still could not reproduce the leak... > > This environment variable should fix the partial stacktrace, so we don't > have to guess next time: > ASAN_OPTIONS="fast_unwind_on_malloc=0" Thanks, that did the trick: https://gitlab.com/huth/qemu/-/jobs/644354706#L3616 ... that also explains why I haven't seen it in my other tests where I am using the --fuzz-target parameter : The leak only happens if the fuzz-target is encoded in the program name - looks like we have to free the memory returned by g_path_get_dirname() there... Thomas
Re: [PATCH for-5.1] Makefile: Remove config-devices.mak on "make clean"
On 7/17/20 5:25 PM, Peter Maydell wrote: > The config-devices.mak files are generated by "make", and so they > should be deleted by "make clean". > > (This is different from config-host.mak and config-all-disas.mak, > which are created by "configure" and so only deleted by > "make distclen".) typo "distclean" > > If we don't delete these files on "make clean", then the build > tree is left in a state where it has the config-devices.mak > file but not the config-devices.mak.d file, and make will not > realize that it needs to rebuild config-devices.mak if, for > instance, hw/sd/Kconfig changes. > > NB: config-all-devices.mak is also generated by "make", but we > already remove it on "make clean". > Suggested-by: Paolo Bonzini ? Reviewed-by: Philippe Mathieu-Daudé > Signed-off-by: Peter Maydell > --- > I didn't remove the existing 'rm -f $(SUBDIR_DEVICES_MAK)' > from the 'distclean' rules on the basis that config-all-devices.mak > is explicitly removed in both 'distclean' and 'clean', despite > 'distclean' depending on 'clean'... > --- > Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/Makefile b/Makefile > index 32345c610ee..c2120d8d48d 100644 > --- a/Makefile > +++ b/Makefile > @@ -775,6 +775,7 @@ clean: recurse-clean > rm -f storage-daemon/qapi/qapi-gen-timestamp > rm -rf qga/qapi-generated > rm -f config-all-devices.mak > + rm -f $(SUBDIR_DEVICES_MAK) > > VERSION ?= $(shell cat VERSION) > >
Re: Implement standard file operation with QEMU
Hello Phil, What I want to realize is to be able to call standard file operations (open, read, write etc) in the application program, and execute such programs in QEMU. But I am building under system mode. TI provide copilation toolchain and a library that provide partial functionality from libc. I am hoping to use TI's toolkit to generate object code which contains calls to hook functions, and then use QEMU's host I/O implementation to realize low-level file operation. For example: _stream[fildes]->WRITE <---hook to ---> qemu_semihosting_console_outs Are you suggesting that this is done by TCG helpers? best regards, xiaolei -- Original -- From: "Philippe Mathieu-Daudé"
[Bug 1878642] Re: Assertion failure in pci_bus_get_irq_level
Proposed fix: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05564.html ** Changed in: qemu Status: New => Confirmed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1878642 Title: Assertion failure in pci_bus_get_irq_level Status in QEMU: Confirmed Bug description: Hello, I found an input which triggers an assertion failure in pci_bus_get_irq_level: qemu-system-i386: /home/alxndr/Development/qemu/hw/pci/pci.c:268: int pci_bus_get_irq_level(PCIBus *, int): Assertion `irq_num < bus->nirq' failed. Aborted #0 0x7686d761 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x7685755b in __GI_abort () at abort.c:79 #2 0x7685742f in __assert_fail_base (fmt=0x769bdb48 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x57f9bca0 "irq_num < bus->nirq", file=0x57f9bbe0 "/home/alxndr/Development/qemu/hw/pci/pci.c", line=0x10c, function=) at assert.c:92 #3 0x76866092 in __GI___assert_fail (assertion=0x57f9bca0 "irq_num < bus->nirq", file=0x57f9bbe0 "/home/alxndr/Development/qemu/hw/pci/pci.c", line=0x10c, function=0x57f9bc40 <__PRETTY_FUNCTION__.pci_bus_get_irq_level> "int pci_bus_get_irq_level(PCIBus *, int)") at assert.c:101 #4 0x57060c34 in pci_bus_get_irq_level (bus=0x61d96080, irq_num=0xef) at /home/alxndr/Development/qemu/hw/pci/pci.c:268 #5 0x56657391 in ich9_lpc_update_apic (lpc=0x62a06200, gsi=0xff) at /home/alxndr/Development/qemu/hw/isa/lpc_ich9.c:249 #6 0x56658ea7 in ich9_set_sci (opaque=0x62a06200, irq_num=0x0, level=0x1) at /home/alxndr/Development/qemu/hw/isa/lpc_ich9.c:354 #7 0x56ccefc6 in qemu_set_irq (irq=0x6062af80, level=0x1) at /home/alxndr/Development/qemu/hw/core/irq.c:44 #8 0x56bc06fd in acpi_update_sci (regs=0x62a06c80, irq=0x6062af80) at /home/alxndr/Development/qemu/hw/acpi/core.c:723 #9 0x56bccb08 in ich9_pm_update_sci_fn (regs=0x62a06c80) at /home/alxndr/Development/qemu/hw/acpi/ich9.c:56 #10 0x56bc10ee in acpi_pm_evt_write (opaque=0x62a06c80, addr=0x2, val=0x2049, width=0x2) at /home/alxndr/Development/qemu/hw/acpi/core.c:456 #11 0x564938b5 in memory_region_write_accessor (mr=0x62a06db0, addr=0x2, value=0x7fff9c70, size=0x2, shift=0x0, mask=0x, attrs=...) at /home/alxndr/Development/qemu/memory.c:483 #12 0x5649328a in access_with_adjusted_size (addr=0x2, value=0x7fff9c70, size=0x2, access_size_min=0x1, access_size_max=0x4, access_fn=0x56493360 , mr=0x62a06db0, attrs=...) at /home/alxndr/Development/qemu/memory.c:544 #13 0x56491df6 in memory_region_dispatch_write (mr=0x62a06db0, addr=0x2, data=0x2049, op=MO_16, attrs=...) at /home/alxndr/Development/qemu/memory.c:1476 #14 0x562cbbf4 in flatview_write_continue (fv=0x60633fe0, addr=0x5d02, attrs=..., ptr=0x7fffa4e0, len=0x4, addr1=0x2, l=0x2, mr=0x62a06db0) at /home/alxndr/Development/qemu/exec.c:3137 #15 0x562bbad9 in flatview_write (fv=0x60633fe0, addr=0x5d02, attrs=..., buf=0x7fffa4e0, len=0x4) at /home/alxndr/Development/qemu/exec.c:3177 #16 0x562bb609 in address_space_write (as=0x5968f940 , addr=0x5d02, attrs=..., buf=0x7fffa4e0, len=0x4) at /home/alxndr/Development/qemu/exec.c:3268 #17 0x56478c0a in cpu_outl (addr=0x5d02, val=0xedf82049) at /home/alxndr/Development/qemu/ioport.c:80 #18 0x5648166f in qtest_process_command (chr=0x59691d00 , words=0x6039ef20) at /home/alxndr/Development/qemu/qtest.c:396 #19 0x5647f187 in qtest_process_inbuf (chr=0x59691d00 , inbuf=0x6190f680) at /home/alxndr/Development/qemu/qtest.c:710 #20 0x5647e8b4 in qtest_read (opaque=0x59691d00 , buf=0x7fffca40 "outl 0xcf8 0x8400f841\noutl 0xcfc 0xebed205d\noutl 0x5d02 0xedf82049\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n", size=0xe9) at /home/alxndr/Development/qemu/qtest.c:722 #21 0x579c260c in qemu_chr_be_write_impl (s=0x60f01f30, buf=0x7fffca40 "outl 0xcf8 0x8400f841\noutl 0xcfc 0xebed205d\noutl 0x5d02 0xedf82049\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n", len=0xe9) at /home/alxndr/Development/qemu/chardev/char.c:183 #22 0x579c275b in qemu_chr_be_write (s=0x60f01f30, buf=0x7fffca40 "outl 0xcf8 0x8400f841\noutl 0xcfc 0xebed205d\noutl 0x5d02 0xedf82049\n-M pc-q35-5.0 -device intel-hda,id=hda0 -device hda-output,bus=hda0.0 -device hda-micro,bus=hda0.0 -device hda-duplex,bus=hda0.0 -display none -nodefaults -nographic\n", len=0
[Bug 1887309] Re: Floating-point exception in ide_set_sector
Proposed fix: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05528.html ** Changed in: qemu Status: New => Confirmed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1887309 Title: Floating-point exception in ide_set_sector Status in QEMU: Confirmed Bug description: Hello, Here is a reproducer: cat << EOF | ./i386-softmmu/qemu-system-i386 -M pc,accel=qtest\ -qtest null -nographic -vga qxl -qtest stdio -nodefaults\ -drive if=none,id=drive0,file=null-co://,file.read-zeroes=on,format=raw\ -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw\ -device ide-cd,drive=drive0 -device ide-hd,drive=drive1 outw 0x176 0x3538 outl 0xcf8 0x8903 outl 0xcfc 0x184275c outb 0x376 0x2f outb 0x376 0x0 outw 0x176 0xa1a4 outl 0xcf8 0x8920 outb 0xcfc 0xff outb 0xf8 0x9 EOF The stack-trace: ==16513==ERROR: UndefinedBehaviorSanitizer: FPE on unknown address 0x556783603fdc (pc 0x556783603fdc bp 0x7fff82143b10 sp 0x7fff82143ab0 T16513) #0 0x556783603fdc in ide_set_sector /home/alxndr/Development/qemu/hw/ide/core.c:626:26 #1 0x556783603fdc in ide_dma_cb /home/alxndr/Development/qemu/hw/ide/core.c:883:9 #2 0x55678349d74d in dma_complete /home/alxndr/Development/qemu/dma-helpers.c:120:9 #3 0x55678349d74d in dma_blk_cb /home/alxndr/Development/qemu/dma-helpers.c:138:9 #4 0x556783962f25 in blk_aio_complete /home/alxndr/Development/qemu/block/block-backend.c:1402:9 #5 0x556783962f25 in blk_aio_complete_bh /home/alxndr/Development/qemu/block/block-backend.c:1412:5 #6 0x556783ac030f in aio_bh_call /home/alxndr/Development/qemu/util/async.c:136:5 #7 0x556783ac030f in aio_bh_poll /home/alxndr/Development/qemu/util/async.c:164:13 #8 0x556783a9a863 in aio_dispatch /home/alxndr/Development/qemu/util/aio-posix.c:380:5 #9 0x556783ac1b4c in aio_ctx_dispatch /home/alxndr/Development/qemu/util/async.c:306:5 #10 0x7f4f1c0fe9ed in g_main_context_dispatch (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e9ed) #11 0x556783acdccb in glib_pollfds_poll /home/alxndr/Development/qemu/util/main-loop.c:219:9 #12 0x556783acdccb in os_host_main_loop_wait /home/alxndr/Development/qemu/util/main-loop.c:242:5 #13 0x556783acdccb in main_loop_wait /home/alxndr/Development/qemu/util/main-loop.c:518:11 #14 0x5567833613e5 in qemu_main_loop /home/alxndr/Development/qemu/softmmu/vl.c:1664:9 #15 0x556783a07a4d in main /home/alxndr/Development/qemu/softmmu/main.c:49:5 #16 0x7f4f1ac84e0a in __libc_start_main /build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16 #17 0x5567830a9089 in _start (/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x7d2089) With -trace ide* 12163@1594585516.671265:ide_reset IDEstate 0x56162a269058 [R +0.024963] outw 0x176 0x3538 12163@1594585516.673676:ide_ioport_write IDE PIO wr @ 0x176 (Device/Head); val 0x38; bus 0x56162a268c00 IDEState 0x56162a268c88 12163@1594585516.673683:ide_ioport_write IDE PIO wr @ 0x177 (Command); val 0x35; bus 0x56162a268c00 IDEState 0x56162a269058 12163@1594585516.673686:ide_exec_cmd IDE exec cmd: bus 0x56162a268c00; state 0x56162a269058; cmd 0x35 OK [S +0.025002] OK [R +0.025012] outl 0xcf8 0x8903 OK [S +0.025018] OK [R +0.025026] outl 0xcfc 0x184275c OK [S +0.025210] OK [R +0.025219] outb 0x376 0x2f 12163@1594585516.673916:ide_cmd_write IDE PIO wr @ 0x376 (Device Control); val 0x2f; bus 0x56162a268c00 OK [S +0.025229] OK [R +0.025234] outb 0x376 0x0 12163@1594585516.673928:ide_cmd_write IDE PIO wr @ 0x376 (Device Control); val 0x00; bus 0x56162a268c00 OK [S +0.025240] OK [R +0.025246] outw 0x176 0xa1a4 12163@1594585516.673940:ide_ioport_write IDE PIO wr @ 0x176 (Device/Head); val 0xa4; bus 0x56162a268c00 IDEState 0x56162a269058 12163@1594585516.673943:ide_ioport_write IDE PIO wr @ 0x177 (Command); val 0xa1; bus 0x56162a268c00 IDEState 0x56162a268c88 12163@1594585516.673946:ide_exec_cmd IDE exec cmd: bus 0x56162a268c00; state 0x56162a268c88; cmd 0xa1 OK [S +0.025265] OK [R +0.025270] outl 0xcf8 0x8920 OK [S +0.025274] OK [R +0.025279] outb 0xcfc 0xff OK [S +0.025443] OK [R +0.025451] outb 0xf8 0x9 12163@1594585516.674221:ide_dma_cb IDEState 0x56162a268c88; sector_num=0 n=1 cmd=DMA READ OK [S +0.025724] OK UndefinedBehaviorSanitizer:DEADLYSIGNAL ==12163==ERROR: UndefinedBehaviorSanitizer: FPE on unknown address 0x5616279cffdc (pc 0x5616279cffdc bp 0x7ffcdaabae90 sp 0x7ffcdaabae30 T12163) -Alex To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1887309/+subscriptions
[PATCH for-5.1] Makefile: Remove config-devices.mak on "make clean"
The config-devices.mak files are generated by "make", and so they should be deleted by "make clean". (This is different from config-host.mak and config-all-disas.mak, which are created by "configure" and so only deleted by "make distclen".) If we don't delete these files on "make clean", then the build tree is left in a state where it has the config-devices.mak file but not the config-devices.mak.d file, and make will not realize that it needs to rebuild config-devices.mak if, for instance, hw/sd/Kconfig changes. NB: config-all-devices.mak is also generated by "make", but we already remove it on "make clean". Signed-off-by: Peter Maydell --- I didn't remove the existing 'rm -f $(SUBDIR_DEVICES_MAK)' from the 'distclean' rules on the basis that config-all-devices.mak is explicitly removed in both 'distclean' and 'clean', despite 'distclean' depending on 'clean'... --- Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/Makefile b/Makefile index 32345c610ee..c2120d8d48d 100644 --- a/Makefile +++ b/Makefile @@ -775,6 +775,7 @@ clean: recurse-clean rm -f storage-daemon/qapi/qapi-gen-timestamp rm -rf qga/qapi-generated rm -f config-all-devices.mak + rm -f $(SUBDIR_DEVICES_MAK) VERSION ?= $(shell cat VERSION) -- 2.20.1
Re: [PATCH v2 05/20] block/block-copy: implement block_copy_async
17.07.2020 17:00, Max Reitz wrote: On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote: We'll need async block-copy invocation to use in backup directly. Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/block/block-copy.h | 13 + block/block-copy.c | 40 ++ 2 files changed, 53 insertions(+) diff --git a/include/block/block-copy.h b/include/block/block-copy.h index 6397505f30..ada0d99566 100644 --- a/include/block/block-copy.h +++ b/include/block/block-copy.h @@ -19,7 +19,10 @@ #include "qemu/co-shared-resource.h" typedef void (*ProgressBytesCallbackFunc)(int64_t bytes, void *opaque); +typedef void (*BlockCopyAsyncCallbackFunc)(int ret, bool error_is_read, + void *opaque); typedef struct BlockCopyState BlockCopyState; +typedef struct BlockCopyCallState BlockCopyCallState; BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target, int64_t cluster_size, bool use_copy_range, @@ -41,6 +44,16 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s, int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes, bool *error_is_read); +/* + * Run block-copy in a coroutine, return state pointer. If finished early + * returns NULL (@cb is called anyway). Any special reason for doing so? Seems like the code would be a tad simpler if we just returned it either way. (And off the top of my head I’d guess it’d be easier for the caller if the returned value was always non-NULL, too.) Sounds reasonable, will check + */ +BlockCopyCallState *block_copy_async(BlockCopyState *s, + int64_t offset, int64_t bytes, + bool ratelimit, int max_workers, + int64_t max_chunk, + BlockCopyAsyncCallbackFunc cb); + BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s); void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip); diff --git a/block/block-copy.c b/block/block-copy.c index 75882a094c..a0477d90f3 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -34,9 +34,11 @@ typedef struct BlockCopyCallState { BlockCopyState *s; int64_t offset; int64_t bytes; +BlockCopyAsyncCallbackFunc cb; /* State */ bool failed; +bool finished; /* OUT parameters */ bool error_is_read; @@ -676,6 +678,13 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) */ } while (ret > 0); +if (call_state->cb) { +call_state->cb(ret, call_state->error_is_read, + call_state->s->progress_opaque); I find it weird to pass progress_opaque here. Shouldn’t we just have a dedicated opaque object for this CB? I remember, it should be refactored later. But seems strange here, better to change. +} + +call_state->finished = true; + return ret; } @@ -697,6 +706,37 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes, return ret; } +static void coroutine_fn block_copy_async_co_entry(void *opaque) +{ +block_copy_common(opaque); +} + +BlockCopyCallState *block_copy_async(BlockCopyState *s, + int64_t offset, int64_t bytes, + bool ratelimit, int max_workers, + int64_t max_chunk, + BlockCopyAsyncCallbackFunc cb) +{ +BlockCopyCallState *call_state = g_new(BlockCopyCallState, 1); +Coroutine *co = qemu_coroutine_create(block_copy_async_co_entry, + call_state); + +*call_state = (BlockCopyCallState) { +.s = s, +.offset = offset, +.bytes = bytes, +.cb = cb, +}; + +qemu_coroutine_enter(co); Do we need/want any already-in-coroutine shenanigans here? No: the aim of the function is to start a new coroutine in parallel, independently of are we already in some other coroutine or not. + +if (call_state->finished) { +g_free(call_state); +return NULL; +} + +return call_state; +} BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s) { return s->copy_bitmap; -- Best regards, Vladimir
Re: device compatibility interface for live migration with assigned devices
On Wed, 15 Jul 2020 15:37:19 +0800 Alex Xu wrote: > Alex Williamson 于2020年7月15日周三 上午5:00写道: > > > On Tue, 14 Jul 2020 18:19:46 +0100 > > "Dr. David Alan Gilbert" wrote: > > > > > * Alex Williamson (alex.william...@redhat.com) wrote: > > > > On Tue, 14 Jul 2020 11:21:29 +0100 > > > > Daniel P. Berrangé wrote: > > > > > > > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote: > > > > > > hi folks, > > > > > > we are defining a device migration compatibility interface that > > helps upper > > > > > > layer stack like openstack/ovirt/libvirt to check if two devices > > are > > > > > > live migration compatible. > > > > > > The "devices" here could be MDEVs, physical devices, or hybrid of > > the two. > > > > > > e.g. we could use it to check whether > > > > > > - a src MDEV can migrate to a target MDEV, > > > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV, > > > > > > - a src MDEV can migration to a target VF in SRIOV. > > > > > > (e.g. SIOV/SRIOV backward compatibility case) > > > > > > > > > > > > The upper layer stack could use this interface as the last step to > > check > > > > > > if one device is able to migrate to another device before > > triggering a real > > > > > > live migration procedure. > > > > > > we are not sure if this interface is of value or help to you. > > please don't > > > > > > hesitate to drop your valuable comments. > > > > > > > > > > > > > > > > > > (1) interface definition > > > > > > The interface is defined in below way: > > > > > > > > > > > > __userspace > > > > > > /\ \ > > > > > > / \write > > > > > > / read \ > > > > > >/__ ___\|/_ > > > > > > | migration_version | | migration_version |-->check migration > > > > > > - - compatibility > > > > > > device Adevice B > > > > > > > > > > > > > > > > > > a device attribute named migration_version is defined under each > > device's > > > > > > sysfs node. e.g. > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version). > > > > > > userspace tools read the migration_version as a string from the > > source device, > > > > > > and write it to the migration_version sysfs attribute in the > > target device. > > > > > > > > > > > > The userspace should treat ANY of below conditions as two devices > > not compatible: > > > > > > - any one of the two devices does not have a migration_version > > attribute > > > > > > - error when reading from migration_version attribute of one device > > > > > > - error when writing migration_version string of one device to > > > > > > migration_version attribute of the other device > > > > > > > > > > > > The string read from migration_version attribute is defined by > > device vendor > > > > > > driver and is completely opaque to the userspace. > > > > > > for a Intel vGPU, string format can be defined like > > > > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + > > "aggregator count". > > > > > > > > > > > > for an NVMe VF connecting to a remote storage. it could be > > > > > > "PCI ID" + "driver version" + "configured remote storage URL" > > > > > > > > > > > > for a QAT VF, it may be > > > > > > "PCI ID" + "driver version" + "supported encryption set". > > > > > > > > > > > > (to avoid namespace confliction from each vendor, we may prefix a > > driver name to > > > > > > each migration_version string. e.g. > > i915-v1-8086-591d-i915-GVTg_V5_8-1) > > > > > > > > It's very strange to define it as opaque and then proceed to describe > > > > the contents of that opaque string. The point is that its contents > > > > are defined by the vendor driver to describe the device, driver > > version, > > > > and possibly metadata about the configuration of the device. One > > > > instance of a device might generate a different string from another. > > > > The string that a device produces is not necessarily the only string > > > > the vendor driver will accept, for example the driver might support > > > > backwards compatible migrations. > > > > > > (As I've said in the previous discussion, off one of the patch series) > > > > > > My view is it makes sense to have a half-way house on the opaqueness of > > > this string; I'd expect to have an ID and version that are human > > > readable, maybe a device ID/name that's human interpretable and then a > > > bunch of other cruft that maybe device/vendor/version specific. > > > > > > I'm thinking that we want to be able to report problems and include the > > > string and the user to be able to easily identify the device that was > > > complaining and notice a difference in versions, and perhaps also use > > > it in compatibility patterns to find compatible hosts; but that does > > > get tricky when it's a 'ask the d
Re: [PATCH v2 03/20] qapi: backup: add x-use-copy-range parameter
17.07.2020 16:15, Max Reitz wrote: On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote: Add parameter to enable/disable copy_range. Keep current default for now (enabled). Why x-, though? I can’t think of a reason why we would have to remove this. I add some x- arguments in these series: 1. I'm unsure about default values (still, it can be changed even without x- prefixes as I understand) 2. Probably all this should be wrapped into common "block-copy" options, to not add same arguments to different block-jobs, when they all (I believe in bright future :) move on block-copy. So, this series is not about API, but for new backup architecture, and experimental APIs are needed only to experiment with performance and adjust some iotests. Signed-off-by: Vladimir Sementsov-Ogievskiy --- qapi/block-core.json | 4 +++- block/backup-top.h | 1 + include/block/block-copy.h | 2 +- include/block/block_int.h | 1 + block/backup-top.c | 4 +++- block/backup.c | 4 +++- block/block-copy.c | 4 ++-- block/replication.c| 1 + blockdev.c | 5 + 9 files changed, 20 insertions(+), 6 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 6fbacddab2..0c7600e4ec 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1405,6 +1405,8 @@ #above node specified by @drive. If this option is not given, #a node name is autogenerated. (Since: 4.2) # +# @x-use-copy-range: use copy offloading if possible. Default true. (Since 5.1) Would it make more sense to invert it to disable-copy-range? First, this would make for a cleaner meaning, because it would allow dropping the “if possible” part. Setting use-copy-range=true would intuitively imply to me that I get an error if copy-range cannot be used. Sure, there’s this little “if possible” in the documentation, but it goes against my intuition. disable-copy-range=true is intuitively clear. Second, this would give us a default of false, which is marginally nicer. Reasonable. But we also should consider disabling copy_range by default as I propose.. But yes, if we keep x- for now, and don't change default for copy_range in this series, I can invert it just for readability. Thanks a lot for reviewing this! -- Best regards, Vladimir
[RFC PATCH-for-5.1] hw/isa/lpc_ich9: Ignore reserved/invalid SCI IRQ
libFuzzer triggered the following assertion: cat << EOF | qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -d guest_errors -trace pci\* outl 0xcf8 0x8400f841 outl 0xcfc 0xebed205d outl 0x5d02 0xedf82049 EOF pci_cfg_write ICH9-LPC 31:0 @0x41 <- 0xebed205d hw/pci/pci.c:268: int pci_bus_get_irq_level(PCIBus *, int): Assertion `irq_num < bus->nirq' failed. This is because ich9_lpc_sci_irq() returns -1 for reserved (illegal) values, but ich9_lpc_pmbase_sci_update() considers it valid and store it in a 8-bit unsigned type. Then the 255 value is used as GSI IRQ, resulting in a PIRQ value of 247, more than ICH9_LPC_NB_PIRQS (8). Fix by simply ignoring the invalid access (and reporting it): pci_cfg_write ICH9-LPC 31:0 @0x41 <- 0xebed205d ICH9 LPC: SCI IRQ SEL #3 is reserved pci_cfg_read mch 00:0 @0x0 -> 0x8086 pci_cfg_read mch 00:0 @0x0 -> 0x29c08086 ... Cc: qemu-sta...@nongnu.org Reported-by: Alexander Bulekov Fixes: 8f242cb724 ("ich9: implement SCI_IRQ_SEL register") BugLink: https://bugs.launchpad.net/qemu/+bug/1878642 Signed-off-by: Philippe Mathieu-Daudé --- RFC because I have no idea how to report the error. The LPC is on Device 31 Function 0. Should it trigger some error to the PCI bridge? I can't find that info in the ICH9 datasheet. Signed-off-by: Philippe Mathieu-Daudé --- include/hw/i386/ich9.h | 1 + hw/isa/lpc_ich9.c | 14 +++--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h index a98d10b252..60483b16cd 100644 --- a/include/hw/i386/ich9.h +++ b/include/hw/i386/ich9.h @@ -144,6 +144,7 @@ typedef struct ICH9LPCState { #define ICH9_LPC_PMBASE_BASE_ADDRESS_MASK Q35_MASK(32, 15, 7) #define ICH9_LPC_PMBASE_RTE 0x1 #define ICH9_LPC_PMBASE_DEFAULT 0x1 + #define ICH9_LPC_ACPI_CTRL 0x44 #define ICH9_LPC_ACPI_CTRL_ACPI_EN 0x80 #define ICH9_LPC_ACPI_CTRL_SCI_IRQ_SEL_MASK Q35_MASK(8, 2, 0) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index cd6e169d47..5ed28a726c 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -29,6 +29,7 @@ */ #include "qemu/osdep.h" +#include "qemu/log.h" #include "cpu.h" #include "qapi/visitor.h" #include "qemu/range.h" @@ -312,10 +313,12 @@ void ich9_generate_smi(void) cpu_interrupt(first_cpu, CPU_INTERRUPT_SMI); } +/* Returns -1 on error, IRQ number on success */ static int ich9_lpc_sci_irq(ICH9LPCState *lpc) { -switch (lpc->d.config[ICH9_LPC_ACPI_CTRL] & -ICH9_LPC_ACPI_CTRL_SCI_IRQ_SEL_MASK) { +uint8_t sel = lpc->d.config[ICH9_LPC_ACPI_CTRL] & + ICH9_LPC_ACPI_CTRL_SCI_IRQ_SEL_MASK; +switch (sel) { case ICH9_LPC_ACPI_CTRL_9: return 9; case ICH9_LPC_ACPI_CTRL_10: @@ -328,6 +331,8 @@ static int ich9_lpc_sci_irq(ICH9LPCState *lpc) return 21; default: /* reserved */ +qemu_log_mask(LOG_GUEST_ERROR, + "ICH9 LPC: SCI IRQ SEL #%u is reserved\n", sel); break; } return -1; @@ -450,7 +455,7 @@ ich9_lpc_pmbase_sci_update(ICH9LPCState *lpc) { uint32_t pm_io_base = pci_get_long(lpc->d.config + ICH9_LPC_PMBASE); uint8_t acpi_cntl = pci_get_long(lpc->d.config + ICH9_LPC_ACPI_CTRL); -uint8_t new_gsi; +int new_gsi; if (acpi_cntl & ICH9_LPC_ACPI_CTRL_ACPI_EN) { pm_io_base &= ICH9_LPC_PMBASE_BASE_ADDRESS_MASK; @@ -461,6 +466,9 @@ ich9_lpc_pmbase_sci_update(ICH9LPCState *lpc) ich9_pm_iospace_update(&lpc->pm, pm_io_base); new_gsi = ich9_lpc_sci_irq(lpc); +if (new_gsi == -1) { +return; +} if (lpc->sci_level && new_gsi != lpc->sci_gsi) { qemu_set_irq(lpc->pm.irq, 0); lpc->sci_gsi = new_gsi; -- 2.21.3
[PULL 0/1] bitmaps patches for 2020-07-17 [-rc1]
The following changes since commit 151f76c689b1ff4c2c59e6d8469a0d4fe5346f55: Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging (2020-07-16 21:46:18 +0100) are available in the Git repository at: https://repo.or.cz/qemu/ericb.git tags/pull-bitmaps-2020-07-17 for you to fetch changes up to 7cb015197b383a62f5729d2c92b1050db0185c1c: migration/block-dirty-bitmap: fix add_bitmaps_to_list (2020-07-17 08:18:51 -0500) I had been waiting to see if I had more than one patch to bundle, but given that we are now coming up on -rc1 and this is a bugfix, it's time for the pull request of this in isolation. bitmaps patches for 2020-07-17 - improve corner-case of bitmap migration Vladimir Sementsov-Ogievskiy (1): migration/block-dirty-bitmap: fix add_bitmaps_to_list migration/block-dirty-bitmap.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) -- 2.27.0
[PULL 1/1] migration/block-dirty-bitmap: fix add_bitmaps_to_list
From: Vladimir Sementsov-Ogievskiy We shouldn't fail when finding an unnamed bitmap in a unnamed node or node with auto-generated node name, as bitmap migration ignores such bitmaps in the first place. Fixes: 82640edb88faa Fixes: 4ff5cc121b089 Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20200626130658.76498-1-vsement...@virtuozzo.com> Reviewed-by: Eric Blake [eblake: commit message grammar tweaks] Signed-off-by: Eric Blake --- migration/block-dirty-bitmap.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 47bc0f650c1e..b0dbf9eeed43 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -274,7 +274,11 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name) DirtyBitmapMigBitmapState *dbms; Error *local_err = NULL; -bitmap = bdrv_dirty_bitmap_first(bs); +FOR_EACH_DIRTY_BITMAP(bs, bitmap) { +if (bdrv_dirty_bitmap_name(bitmap)) { +break; +} +} if (!bitmap) { return 0; } -- 2.27.0
Re: [PATCH 2/7] pc-bios: s390x: Cleanup jump to ipl code
On 15/07/2020 11.40, Janosch Frank wrote: > jump_to_IPL_code takes a 64 bit address, masks it with the short psw > address mask and later branches to it using a full 64 bit register. > > * As the masking is not necessary, let's remove it > * Without the mask we can save the ipl address to a static 64 bit > function ptr as we later branch to it > * Let's also clean up the variable names and remove the now unneeded > ResetInfo > > Signed-off-by: Janosch Frank > --- > pc-bios/s390-ccw/jump2ipl.c | 27 +++ > 1 file changed, 11 insertions(+), 16 deletions(-) > > diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c > index 767012bf0c..aef37cea76 100644 > --- a/pc-bios/s390-ccw/jump2ipl.c > +++ b/pc-bios/s390-ccw/jump2ipl.c > @@ -13,20 +13,15 @@ > #define KERN_IMAGE_START 0x01UL > #define RESET_PSW_MASK (PSW_MASK_SHORTPSW | PSW_MASK_64) > > -typedef struct ResetInfo { > -uint64_t ipl_psw; > -uint32_t ipl_continue; > -} ResetInfo; > - > -static ResetInfo save; > +static void (*ipl_continue)(void); > +static uint64_t psw_save; Christian, do you remember whether there was a reason that we saved the "ipl_continue" in the low-core in the past? The changes here look ok to me, but I still wonder why it has been more "complicated" before...? Acked-by: Thomas Huth
Re: [PATCH 1/7] pc-bios: s390x: Fix bootmap.c zipl component entry data handling
On 15/07/2020 11.40, Janosch Frank wrote: > The two main types of zipl component entries are execute and > load/data. The last member of the component entry struct therefore > denotes either a PSW or an address. Let's make this a bit more clear > by introducing a union and cleaning up the code that uses that struct > member. > > The execute type component entries written by zipl contain short PSWs, > not addresses. Let's mask them and only pass the address part to > jump_to_IPL_code(uint64_t address) because it expects an address as > visible by the name of the argument. > > Signed-off-by: Janosch Frank > --- > pc-bios/s390-ccw/bootmap.c | 5 +++-- > pc-bios/s390-ccw/bootmap.h | 7 ++- > 2 files changed, 9 insertions(+), 3 deletions(-) Reviewed-by: Thomas Huth
Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems
On Fri, Jul 17, 2020 at 03:55:15PM +0100, Alex Bennée wrote: > > Daniel P. Berrangé writes: > > > On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote: > >> It turns out there are some 64 bit systems that have relatively low > >> amounts of physical memory available to them (typically CI system). > >> Even with swapping available a 1GB translation buffer that fills up > >> can put the machine under increased memory pressure. Detect these low > >> memory situations and reduce tb_size appropriately. > >> > >> Fixes: 600e17b261 > >> Signed-off-by: Alex Bennée > >> Cc: BALATON Zoltan > >> Cc: Christian Ehrhardt > >> --- > >> accel/tcg/translate-all.c | 7 ++- > >> 1 file changed, 6 insertions(+), 1 deletion(-) > >> > >> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c > >> index 2afa46bd2b1..2ff0ba6d19b 100644 > >> --- a/accel/tcg/translate-all.c > >> +++ b/accel/tcg/translate-all.c > >> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t > >> tb_size) > >> { > >> /* Size the buffer. */ > >> if (tb_size == 0) { > >> -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > >> +size_t phys_mem = qemu_get_host_physmem(); > >> +if (phys_mem > 0 && phys_mem < (2 * > >> DEFAULT_CODE_GEN_BUFFER_SIZE)) { > >> +tb_size = phys_mem / 4; > >> +} else { > >> +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > >> +} > > > > I'm not convinced this is going to work when running QEMU inside a > > container environment with RAM cap, because the physmem level is > > completely unrelated to the RAM the container is permitted to actually > > use in practice. ie host has 32 GB of RAM, but the container QEMU is > > in only has 1 GB permitted. > > What will happen when the mmap happens? Will a capped container limit > the attempted mmap? I would hope the container case at least gave > different feedback than a "silent" OOM. IIRC it should trigger the OOM killer on process(s) within the container. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v3 for-5.1 0/2] Fix crash due to NBD export leak
17.07.2020 15:01, Kevin Wolf wrote: Am 14.07.2020 um 18:22 hat Vladimir Sementsov-Ogievskiy geschrieben: Hi all! We've faced crash bug, which is reproducing on master branch as well. The case is described in 01, where fix is suggested. New iotest in 02 crashes without that fix. v3: resend for convenience, as all preparatory patches are merged. 01-02: add Eric's r-b and t-b marks This is a crash-fix, so it would be good to fix in 5.1. Still neither Eric nor I are sure in patch 01: is AIO_WAIT_WHILE used correctly? Anything specific you had doubts about? At first sight it looks good to me. It's always called in the main loop and we don't hold any AioContext locks, so using NULL as the context is fine. You also made sure to use aio_wait_kick() so that we won't hang even though the condition has become false. I'm applying this to my block branch now. If your doubts were about something more subtle that I missed, we can unstage/revert the patch. Kevin Nothing specific, thanks! -- Best regards, Vladimir
Re: device compatibility interface for live migration with assigned devices
On Wed, 15 Jul 2020 16:20:41 +0800 Yan Zhao wrote: > On Tue, Jul 14, 2020 at 02:59:48PM -0600, Alex Williamson wrote: > > On Tue, 14 Jul 2020 18:19:46 +0100 > > "Dr. David Alan Gilbert" wrote: > > > > > * Alex Williamson (alex.william...@redhat.com) wrote: > > > > On Tue, 14 Jul 2020 11:21:29 +0100 > > > > Daniel P. Berrangé wrote: > > > > > > > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote: > > > > > > hi folks, > > > > > > we are defining a device migration compatibility interface that > > > > > > helps upper > > > > > > layer stack like openstack/ovirt/libvirt to check if two devices are > > > > > > live migration compatible. > > > > > > The "devices" here could be MDEVs, physical devices, or hybrid of > > > > > > the two. > > > > > > e.g. we could use it to check whether > > > > > > - a src MDEV can migrate to a target MDEV, > > > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV, > > > > > > - a src MDEV can migration to a target VF in SRIOV. > > > > > > (e.g. SIOV/SRIOV backward compatibility case) > > > > > > > > > > > > The upper layer stack could use this interface as the last step to > > > > > > check > > > > > > if one device is able to migrate to another device before > > > > > > triggering a real > > > > > > live migration procedure. > > > > > > we are not sure if this interface is of value or help to you. > > > > > > please don't > > > > > > hesitate to drop your valuable comments. > > > > > > > > > > > > > > > > > > (1) interface definition > > > > > > The interface is defined in below way: > > > > > > > > > > > > __userspace > > > > > > /\ \ > > > > > > / \write > > > > > > / read \ > > > > > >/__ ___\|/_ > > > > > > | migration_version | | migration_version |-->check migration > > > > > > - - compatibility > > > > > > device Adevice B > > > > > > > > > > > > > > > > > > a device attribute named migration_version is defined under each > > > > > > device's > > > > > > sysfs node. e.g. > > > > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version). > > > > > > userspace tools read the migration_version as a string from the > > > > > > source device, > > > > > > and write it to the migration_version sysfs attribute in the target > > > > > > device. > > > > > > > > > > > > The userspace should treat ANY of below conditions as two devices > > > > > > not compatible: > > > > > > - any one of the two devices does not have a migration_version > > > > > > attribute > > > > > > - error when reading from migration_version attribute of one device > > > > > > - error when writing migration_version string of one device to > > > > > > migration_version attribute of the other device > > > > > > > > > > > > The string read from migration_version attribute is defined by > > > > > > device vendor > > > > > > driver and is completely opaque to the userspace. > > > > > > for a Intel vGPU, string format can be defined like > > > > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + > > > > > > "aggregator count". > > > > > > > > > > > > for an NVMe VF connecting to a remote storage. it could be > > > > > > "PCI ID" + "driver version" + "configured remote storage URL" > > > > > > > > > > > > for a QAT VF, it may be > > > > > > "PCI ID" + "driver version" + "supported encryption set". > > > > > > > > > > > > (to avoid namespace confliction from each vendor, we may prefix a > > > > > > driver name to > > > > > > each migration_version string. e.g. > > > > > > i915-v1-8086-591d-i915-GVTg_V5_8-1) > > > > > > > > It's very strange to define it as opaque and then proceed to describe > > > > the contents of that opaque string. The point is that its contents > > > > are defined by the vendor driver to describe the device, driver version, > > > > and possibly metadata about the configuration of the device. One > > > > instance of a device might generate a different string from another. > > > > The string that a device produces is not necessarily the only string > > > > the vendor driver will accept, for example the driver might support > > > > backwards compatible migrations. > > > > > > (As I've said in the previous discussion, off one of the patch series) > > > > > > My view is it makes sense to have a half-way house on the opaqueness of > > > this string; I'd expect to have an ID and version that are human > > > readable, maybe a device ID/name that's human interpretable and then a > > > bunch of other cruft that maybe device/vendor/version specific. > > > > > > I'm thinking that we want to be able to report problems and include the > > > string and the user to be able to easily identify the device that was > > > complaining and notice a difference in versions, and perhaps also use > > >
Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems
Daniel P. Berrangé writes: > On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote: >> It turns out there are some 64 bit systems that have relatively low >> amounts of physical memory available to them (typically CI system). >> Even with swapping available a 1GB translation buffer that fills up >> can put the machine under increased memory pressure. Detect these low >> memory situations and reduce tb_size appropriately. >> >> Fixes: 600e17b261 >> Signed-off-by: Alex Bennée >> Cc: BALATON Zoltan >> Cc: Christian Ehrhardt >> --- >> accel/tcg/translate-all.c | 7 ++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c >> index 2afa46bd2b1..2ff0ba6d19b 100644 >> --- a/accel/tcg/translate-all.c >> +++ b/accel/tcg/translate-all.c >> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t >> tb_size) >> { >> /* Size the buffer. */ >> if (tb_size == 0) { >> -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; >> +size_t phys_mem = qemu_get_host_physmem(); >> +if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) { >> +tb_size = phys_mem / 4; >> +} else { >> +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; >> +} > > I'm not convinced this is going to work when running QEMU inside a > container environment with RAM cap, because the physmem level is > completely unrelated to the RAM the container is permitted to actually > use in practice. ie host has 32 GB of RAM, but the container QEMU is > in only has 1 GB permitted. What will happen when the mmap happens? Will a capped container limit the attempted mmap? I would hope the container case at least gave different feedback than a "silent" OOM. > I don't have much of a better suggestion, as I don't think we want > to get into reading the cgroups memory limits. It does feel like the > assumption we can blindly use a 1GB cache though is invalid even > with this patch applied. > > Regards, > Daniel -- Alex Bennée
[PATCH] file-posix: Handle `EINVAL` fallocate return value
From: Antoine Damhet The `detect-zeroes=unmap` option may issue unaligned `FALLOC_FL_PUNCH_HOLE` requests, raw block devices can (and will) return `EINVAL`, qemu should then write the zeroes to the blockdev instead of issuing an `IO_ERROR`. Signed-off-by: Antoine Damhet --- block/file-posix.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/file-posix.c b/block/file-posix.c index 8067e238cb..b2fabcc1b8 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1620,7 +1620,11 @@ static int handle_aiocb_write_zeroes_unmap(void *opaque) #ifdef CONFIG_FALLOCATE_PUNCH_HOLE int ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, aiocb->aio_offset, aiocb->aio_nbytes); -if (ret != -ENOTSUP) { +switch (ret) { +case -ENOTSUP: +case -EINVAL: +break; +default: return ret; } #endif -- 2.27.0
[PATCH RESEND] file-posix: Handle `EINVAL` fallocate return value
From: Antoine Damhet The `detect-zeroes=unmap` option may issue unaligned `FALLOC_FL_PUNCH_HOLE` requests, raw block devices can (and will) return `EINVAL`, qemu should then write the zeroes to the blockdev instead of issuing an `IO_ERROR`. Signed-off-by: Antoine Damhet --- I am resending this patch because: 1. I sent the first version from the wrong email address 2. I forgot to send it to the devel mailing list Please forgive me for the inconvenience. block/file-posix.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/file-posix.c b/block/file-posix.c index 8067e238cb..b2fabcc1b8 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1620,7 +1620,11 @@ static int handle_aiocb_write_zeroes_unmap(void *opaque) #ifdef CONFIG_FALLOCATE_PUNCH_HOLE int ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, aiocb->aio_offset, aiocb->aio_nbytes); -if (ret != -ENOTSUP) { +switch (ret) { +case -ENOTSUP: +case -EINVAL: +break; +default: return ret; } #endif -- 2.27.0
Re: [PATCH] linux-user: Add strace support for printing arguments for ioctls used for terminals and serial lines
Le 14/07/2020 à 22:04, Filip Bozuta a écrit : > Functions "print_ioctl()" and "print_syscall_ret_ioctl()" are used > to print arguments of "ioctl()" with "-strace". These functions > use "thunk_print()", which is defined in "thunk.c", to print the > contents of ioctl's third arguments that are not basic types. > However, this function doesn't handle ioctls of group ioctl_tty which > are used for terminals and serial lines. These ioctls use a type > "struct termios" which thunk type is defined in a non standard > way using "STRUCT_SPECIAL()". This means that this type is not decoded > regularly using "thunk_convert()" and uses special converting functions > "target_to_host_termios()" and "host_to_target_termios()", which are defined > in "syscall.c" to decode it's values. For simillar reasons, this type is > also not printed regularly using "thunk_print()". That is the reason > why a separate printing function "print_termios()" is defined in > file "strace.c". This function decodes and prints flag values of the > "termios" structure. > > Implementation notes: > > Function "print_termios()" was implemented in "strace.c" using > an existing function "print_flags()" to print flag values of > "struct termios" fields. These flag values were defined > using an existing macro "FLAG_TARGET()" that generates aproppriate > target flag values and string representations of these flags. > This function was declared in "qemu.h" so that it can be accessed > in "syscall.c". Type "StructEntry" defined in "exec/user/thunk.h" > contains information that is used to decode structure values. > Field "void print(void *arg)" was added in this structure as a > special print function. Also, function "thunk_print()" was changed > a little so that it uses this special print function in case > it is defined. This printing function was instantiated with the > defined "print_termios()" in "syscall.c" in "struct_termios_def". > > Signed-off-by: Filip Bozuta > --- > include/exec/user/thunk.h | 1 + > linux-user/qemu.h | 2 + > linux-user/strace.c | 193 ++ > linux-user/syscall.c | 1 + > thunk.c | 23 +++-- > 5 files changed, 211 insertions(+), 9 deletions(-) > > diff --git a/include/exec/user/thunk.h b/include/exec/user/thunk.h > index 7992475c9f..a5bbb2c733 100644 > --- a/include/exec/user/thunk.h > +++ b/include/exec/user/thunk.h > @@ -55,6 +55,7 @@ typedef struct { > int *field_offsets[2]; > /* special handling */ > void (*convert[2])(void *dst, const void *src); > +void (*print)(void *arg); > int size[2]; > int align[2]; > const char *name; > diff --git a/linux-user/qemu.h b/linux-user/qemu.h > index 5c964389c1..e51a0ededb 100644 > --- a/linux-user/qemu.h > +++ b/linux-user/qemu.h > @@ -706,6 +706,8 @@ static inline uint64_t target_offset64(uint64_t word0, > uint64_t word1) > } > #endif /* TARGET_ABI_BITS != 32 */ > > +extern void print_termios(void *arg); > + > /** > * preexit_cleanup: housekeeping before the guest exits > * > diff --git a/linux-user/strace.c b/linux-user/strace.c > index 5235b2260c..7b5408cf4a 100644 > --- a/linux-user/strace.c > +++ b/linux-user/strace.c > @@ -1193,6 +1193,138 @@ UNUSED static struct flags falloc_flags[] = { > #endif > }; > > +UNUSED static struct flags termios_iflags[] = { > +FLAG_TARGET(IGNBRK), > +FLAG_TARGET(BRKINT), > +FLAG_TARGET(IGNPAR), > +FLAG_TARGET(PARMRK), > +FLAG_TARGET(INPCK), > +FLAG_TARGET(ISTRIP), > +FLAG_TARGET(INLCR), > +FLAG_TARGET(IGNCR), > +FLAG_TARGET(ICRNL), > +FLAG_TARGET(IUCLC), > +FLAG_TARGET(IXON), > +FLAG_TARGET(IXANY), > +FLAG_TARGET(IXOFF), > +FLAG_TARGET(IMAXBEL), IUTF8 is missing > +FLAG_END, > +}; > + > +UNUSED static struct flags termios_oflags[] = { > +FLAG_TARGET(OPOST), > +FLAG_TARGET(OLCUC), > +FLAG_TARGET(ONLCR), > +FLAG_TARGET(OCRNL), > +FLAG_TARGET(ONOCR), > +FLAG_TARGET(ONLRET), > +FLAG_TARGET(OFILL), > +FLAG_TARGET(OFDEL), > +FLAG_END, > +}; > + the following entries are not flags: flags are bits, while we have enumerated values in these cases. NL0, NL1 = 0,1 > +UNUSED static struct flags termios_oflags_NLDLY[] = { > +FLAG_TARGET(NL0), > +FLAG_TARGET(NL1), > +FLAG_END, > +}; CR0, CR1, CR2, CR3 = 0, 1, 2, 3 > +UNUSED static struct flags termios_oflags_CRDLY[] = { > +FLAG_TARGET(CR0), > +FLAG_TARGET(CR1), > +FLAG_TARGET(CR2), > +FLAG_TARGET(CR3), > +FLAG_END, > +}; > TAB0 is 0 so it cannot be detected > +UNUSED static struct flags termios_oflags_TABDLY[] = { > +FLAG_TARGET(TAB0), > +FLAG_TARGET(TAB1), > +FLAG_TARGET(TAB2), > +FLAG_TARGET(TAB3), > +FLAG_END, > +}; VT0 is 0 > +UNUSED static struct flags termios_oflags_VTDLY[] = { > +FLAG_TARGET(VT0), > +FLAG_TARGET(VT1), > +FLAG_END, > +}; FF0 is 0 > +UNUSE
[PATCH v2 1/4] linux-user: Add support for a group of btrfs inode ioctls
This patch implements functionality of following ioctls: BTRFS_IOC_INO_LOOKUP - Reading tree root id and path Read tree root id and path for a given file or directory. The name and tree root id are returned in an ioctl's third argument that represents a pointer to a following type: struct btrfs_ioctl_ino_lookup_args { __u64 treeid; __u64 objectid; char name[BTRFS_INO_LOOKUP_PATH_MAX]; }; Before calling this ioctl, field 'objectid' should be filled with the object id value for which the tree id and path are to be read. Value 'BTRFS_FIRST_FREE_OBJECTID' represents the object id for the first available btrfs object (directory or file). BTRFS_IOC_INO_PATHS - Reading paths to all files Read path to all files with a certain inode number. The paths are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_ino_path_args { __u64 inum; /* in */ __u64 size; /* in */ __u64 reserved[4]; /* struct btrfs_data_container *fspath; out */ __u64 fspath; /* out */ }; Before calling this ioctl, the 'inum' and 'size' field should be filled with the aproppriate inode number and size of the directory where file paths should be looked for. For now, the paths are returned in an '__u64' (unsigned long long) value 'fspath'. BTRFS_IOC_LOGICAL_INO - Reading inode numbers Read inode numbers for files on a certain logical adress. The inode numbers are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_logical_ino_args { __u64 logical;/* in */ __u64 size; /* in */ __u64 reserved[3];/* must be 0 for now */ __u64 flags; /* in, v2 only */ /* struct btrfs_data_container *inodes;out */ __u64 inodes; }; Before calling this ioctl, the 'logical' and 'size' field should be filled with the aproppriate logical adress and size of where the inode numbers of files should be looked for. For now, the inode numbers are returned in an '__u64' (unsigned long long) value 'inodes'. BTRFS_IOC_LOGICAL_INO_V2 - Reading inode numbers Same as the above mentioned ioctl except that it allows passing a flags 'BTRFS_LOGICAL_INO_ARGS_IGNORE_OFFSET'. BTRFS_IOC_INO_LOOKUP_USER - Reading subvolume name and path Read name and path of a subvolume. The tree root id and path are read in an ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_ino_lookup_user_args { /* in, inode number containing the subvolume of 'subvolid' */ __u64 dirid; /* in */ __u64 treeid; /* out, name of the subvolume of 'treeid' */ char name[BTRFS_VOL_NAME_MAX + 1]; /* * out, constructed path from the directory with which the ioctl is * called to dirid */ char path[BTRFS_INO_LOOKUP_USER_PATH_MAX]; }; Before calling this ioctl, the 'dirid' and 'treeid' field should be filled with aproppriate values which represent the inode number of the directory that contains the subvolume and treeid of the subvolume. Implementation notes: All of the ioctls in this patch use structure types as third arguments. That is the reason why aproppriate thunk definitions were added in file 'syscall_types.h'. Signed-off-by: Filip Bozuta --- linux-user/ioctls.h| 20 linux-user/syscall_defs.h | 10 ++ linux-user/syscall_types.h | 24 3 files changed, 54 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index c6303a0406..a7f5664487 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -202,6 +202,10 @@ IOCTL(BTRFS_IOC_SNAP_DESTROY, IOC_W, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_vol_args))) #endif +#ifdef BTRFS_IOC_INO_LOOKUP + IOCTL(BTRFS_IOC_INO_LOOKUP, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_args))) +#endif #ifdef BTRFS_IOC_SUBVOL_GETFLAGS IOCTL(BTRFS_IOC_SUBVOL_GETFLAGS, IOC_R, MK_PTR(TYPE_ULONGLONG)) #endif @@ -212,6 +216,14 @@ IOCTL(BTRFS_IOC_DEV_INFO, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_dev_info_args))) #endif +#ifdef BTRFS_IOC_INO_PATHS + IOCTL(BTRFS_IOC_INO_PATHS, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_path_args))) +#endif +#ifdef BTRFS_IOC_LOGICAL_INO + IOCTL(BTRFS_IOC_LOGICAL_INO, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_logical_ino_args
[PATCH v2 3/4] linux-user: Add support for btrfs ioctls used to manage quota
This patch implements functionality for following ioctls: BTRFS_IOC_QUOTA_CTL - Enabling/Disabling quota support Enable or disable quota support for a btrfs filesystem. Quota support is enabled or disabled using the ioctls third argument which represents a pointer to a following type: struct btrfs_ioctl_quota_ctl_args { __u64 cmd; __u64 status; }; Before calling this ioctl, the 'cmd' field should be filled with one of the values 'BTRFS_QUOTA_CTL_ENABLE' (enabling quota) 'BTRFS_QUOTA_CTL_DISABLE' (disabling quota). BTRFS_IOC_QGROUP_CREATE - Creating/Removing a subvolume quota group Create or remove a subvolume quota group. The subvolume quota group is created or removed using the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_qgroup_create_args { __u64 create; __u64 qgroupid; }; Before calling this ioctl, the 'create' field should be filled with the aproppriate value depending on if the user wants to create or remove a quota group (0 for removing, everything else for creating). Also, the 'qgroupid' field should be filled with the value for the quota group id that is to be created. BTRFS_IOC_QGROUP_ASSIGN - Asigning or removing a quota group as child group Asign or remove a quota group as child quota group of another group in the btrfs filesystem. The asignment is done using the ioctl's third argument which represents a pointert to a following type: struct btrfs_ioctl_qgroup_assign_args { __u64 assign; __u64 src; __u64 dst; }; Before calling this ioctl, the 'assign' field should be filled with the aproppriate value depending on if the user wants to asign or remove a quota group as a child quota group of another group (0 for removing, everythin else for asigning). Also, the 'src' and 'dst' fields should be filled with the aproppriate quota group id values depending on which quota group needs to asigned or removed as child quota group of another group ('src' gets asigned or removed as child group of 'dst'). BTRFS_IOC_QGROUP_LIMIT - Limiting the size of a quota group Limit the size of a quota group. The size of the quota group is limited with the ioctls third argument which represents a pointer to a following type: struct btrfs_ioctl_qgroup_limit_args { __u64 qgroupid; struct btrfs_qgroup_limit lim; }; Before calling this ioctl, the 'qgroup' id field should be filled with aproppriate value of the quota group id for which the size is to be limited. The second field is of following type: struct btrfs_qgroup_limit { __u64 flags; __u64 max_rfer; __u64 max_excl; __u64 rsv_rfer; __u64 rsv_excl; }; The 'max_rfer' field should be filled with the size to which the quota group should be limited. The 'flags' field can be used for passing additional options and can have values which can be found on: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/btrfs.h#L67 BTRFS_IOC_QUOTA_RESCAN_STATUS - Checking status of running rescan operation Check status of a running rescan operation. The status is checked using the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_quota_rescan_args { __u64 flags; __u64 progress; __u64 reserved[6]; }; If there is a rescan operation running, 'flags' field is set to 1, and 'progress' field is set to aproppriate value which represents the progress of the operation. BTRFS_IOC_QUOTA_RESCAN - Starting a rescan operation Start ar rescan operation to Trash all quota groups and scan the metadata again with the current config. Before calling this ioctl, BTRFS_IOC_QUOTA_RESCAN_STATUS sould be run to check if there is already a rescan operation runing. After that ioctl call, the received 'struct btrfs_ioctl_quota_rescan_args' should be than passed as this ioctls third argument. BTRFS_IOC_QUOTA_RESCAN_WAIT - Waiting for a rescan operation to finish Wait until a rescan operation is finished (if there is a rescan operation running). The third ioctls argument is ignored. Implementation notes: Almost all of the ioctls in this patch use structure types as third arguments. That is the reason why aproppriate thunk definitions were added in file 'syscall_types.h'. Signed-off-by: Filip Bozuta --- linux-user/ioctls.h| 27 +++ linux-user/syscall_defs.h | 14 ++ linux-user/syscall_types.h | 29 + 3 files changed, 70 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index 2c553103e6..8665f504bf 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -227,6 +227,33 @@ IOCTL(BTRFS_IOC_LOGICAL_INO,
[PATCH v2 2/4] linux-user: Add support for two btrfs ioctls used for subvolume
This patch implements functionality for following ioctl: BTRFS_IOC_DEFAULT_SUBVOL - Setting a default subvolume Set a default subvolume for a btrfs filesystem. The third ioctl's argument is a '__u64' (unsigned long long) which represents the id of a subvolume that is to be set as the default. BTRFS_IOC_GET_SUBVOL_ROOTREF - Getting tree and directory id of subvolumes Read tree and directory id of subvolumes from a btrfs filesystem. The tree and directory id's are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_get_subvol_rootref_args { /* in/out, minimum id of rootref's treeid to be searched */ __u64 min_treeid; /* out */ struct { __u64 treeid; __u64 dirid; } rootref[BTRFS_MAX_ROOTREF_BUFFER_NUM]; /* out, number of found items */ __u8 num_items; __u8 align[7]; }; Before calling this ioctl, 'min_treeid' field should be filled with value that represent the minimum value for the tree id. Implementation notes: Ioctl BTRFS_IOC_GET_SUBVOL_ROOTREF uses the above mentioned structure type as third argument. That is the reason why a aproppriate thunk structure definition is added in file 'syscall_types.h'. Signed-off-by: Filip Bozuta --- linux-user/ioctls.h| 7 +++ linux-user/syscall_defs.h | 4 linux-user/syscall_types.h | 11 +++ 3 files changed, 22 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index a7f5664487..2c553103e6 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -206,6 +206,9 @@ IOCTL(BTRFS_IOC_INO_LOOKUP, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_args))) #endif +#ifdef BTRFS_IOC_DEFAULT_SUBVOL + IOCTL(BTRFS_IOC_DEFAULT_SUBVOL, IOC_W, MK_PTR(TYPE_ULONGLONG)) +#endif #ifdef BTRFS_IOC_SUBVOL_GETFLAGS IOCTL(BTRFS_IOC_SUBVOL_GETFLAGS, IOC_R, MK_PTR(TYPE_ULONGLONG)) #endif @@ -248,6 +251,10 @@ IOCTL(BTRFS_IOC_GET_SUBVOL_INFO, IOC_R, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_get_subvol_info_args))) #endif +#ifdef BTRFS_IOC_GET_SUBVOL_ROOTREF + IOCTL(BTRFS_IOC_GET_SUBVOL_ROOTREF, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_get_subvol_rootref_args))) +#endif #ifdef BTRFS_IOC_INO_LOOKUP_USER IOCTL(BTRFS_IOC_INO_LOOKUP_USER, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_user_args))) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 7bb105428b..f4b4fc4a20 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -984,6 +984,8 @@ struct target_rtc_pll_info { 15, struct btrfs_ioctl_vol_args) #define TARGET_BTRFS_IOC_INO_LOOKUP TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 18, struct btrfs_ioctl_ino_lookup_args) +#define TARGET_BTRFS_IOC_DEFAULT_SUBVOL TARGET_IOW(BTRFS_IOCTL_MAGIC, \ + 19, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_GETFLAGSTARGET_IOR(BTRFS_IOCTL_MAGIC, \ 25, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_SETFLAGSTARGET_IOW(BTRFS_IOCTL_MAGIC, \ @@ -1006,6 +1008,8 @@ struct target_rtc_pll_info { 59, struct btrfs_ioctl_logical_ino_args) #define TARGET_BTRFS_IOC_GET_SUBVOL_INFOTARGET_IOR(BTRFS_IOCTL_MAGIC, \ 60, struct btrfs_ioctl_get_subvol_info_args) +#define TARGET_BTRFS_IOC_GET_SUBVOL_ROOTREF TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ +61, struct btrfs_ioctl_get_subvol_rootref_args) #define TARGET_BTRFS_IOC_INO_LOOKUP_USERTARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 62, struct btrfs_ioctl_ino_lookup_user_args) diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h index 980c29000a..d2f1b30ff3 100644 --- a/linux-user/syscall_types.h +++ b/linux-user/syscall_types.h @@ -381,6 +381,17 @@ STRUCT(btrfs_ioctl_dev_info_args, MK_ARRAY(TYPE_ULONGLONG, 379), /* unused */ MK_ARRAY(TYPE_CHAR, BTRFS_DEVICE_PATH_NAME_MAX)) /* path */ +STRUCT(rootref, + TYPE_ULONGLONG, /* treeid */ + TYPE_ULONGLONG) /* dirid */ + +STRUCT(btrfs_ioctl_get_subvol_rootref_args, +TYPE_ULONGLONG, /* min_treeid */ +MK_ARRAY(MK_STRUCT(STRUCT_rootref), + BTRFS_MAX_ROOTREF_BUFFER_NUM), /* rootref */ +TYPE_CHAR, /* num_items */ +MK_ARRAY(TYPE_CHAR, 7)) /* align */ + STRUCT(btrfs_ioctl_get_dev_stats, TYPE_ULONGLONG, /* devid */ TYPE_ULONGLONG, /* nr_items */ -- 2.25.1
[PATCH v2 4/4] linux-user: Add support for btrfs ioctls used to scrub a filesystem
This patch implements functionality for following ioctls: BTRFS_IOC_SCRUB - Starting a btrfs filesystem scrub Start a btrfs filesystem scrub. The third ioctls argument is a pointer to a following type: struct btrfs_ioctl_scrub_args { __u64 devid;/* in */ __u64 start;/* in */ __u64 end; /* in */ __u64 flags;/* in */ struct btrfs_scrub_progress progress; /* out */ /* pad to 1k */ __u64 unused[(1024-32-sizeof(struct btrfs_scrub_progress))/8]; }; Before calling this ioctl, field 'devid' should be filled with value that represents the device id of the btrfs filesystem for which the scrub is to be started. BTRFS_IOC_SCRUB_CANCEL - Canceling scrub of a btrfs filesystem Cancel a btrfs filesystem scrub if it is running. The third ioctls argument is ignored. BTRFS_IOC_SCRUB_PROGRESS - Getting status of a running scrub Read the status of a running btrfs filesystem scrub. The third ioctls argument is a pointer to the above mentioned 'struct btrfs_ioctl_scrub_args'. Similarly as with 'BTRFS_IOC_SCRUB', the 'devid' field should be filled with value that represents the id of the btrfs device for which the scrub has started. The status of a running scrub is returned in the field 'progress' which is of type 'struct btrfs_scrub_progress' and its definition can be found at: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/btrfs.h#L150 Implementation nots: Ioctls in this patch use type 'struct btrfs_ioctl_scrub_args' as their third argument. That is the reason why an aproppriate thunk type definition is added in file 'syscall_types.h'. Signed-off-by: Filip Bozuta --- linux-user/ioctls.h| 11 +++ linux-user/syscall_defs.h | 6 ++ linux-user/syscall_types.h | 26 ++ 3 files changed, 43 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index 8665f504bf..bf80615438 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -215,6 +215,17 @@ #ifdef BTRFS_IOC_SUBVOL_SETFLAGS IOCTL(BTRFS_IOC_SUBVOL_SETFLAGS, IOC_W, MK_PTR(TYPE_ULONGLONG)) #endif +#ifdef BTRFS_IOC_SCRUB + IOCTL(BTRFS_IOC_SCRUB, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_scrub_args))) +#endif +#ifdef BTRFS_IOC_SCRUB_CANCEL + IOCTL(BTRFS_IOC_SCRUB_CANCEL, 0, TYPE_NULL) +#endif +#ifdef BTRFS_IOC_SCRUB_PROGRESS + IOCTL(BTRFS_IOC_SCRUB_PROGRESS, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_scrub_args))) +#endif #ifdef BTRFS_IOC_DEV_INFO IOCTL(BTRFS_IOC_DEV_INFO, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_dev_info_args))) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 3f771ae5d1..589ec3e9b0 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -990,6 +990,12 @@ struct target_rtc_pll_info { 25, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_SETFLAGSTARGET_IOW(BTRFS_IOCTL_MAGIC, \ 26, abi_ullong) +#define TARGET_BTRFS_IOC_SCRUB TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ + 27, struct btrfs_ioctl_scrub_args) +#define TARGET_BTRFS_IOC_SCRUB_CANCEL TARGET_IO(BTRFS_IOCTL_MAGIC, \ + 28) +#define TARGET_BTRFS_IOC_SCRUB_PROGRESS TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ + 29, struct btrfs_ioctl_scrub_args) #define TARGET_BTRFS_IOC_DEV_INFO TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 30, struct btrfs_ioctl_dev_info_args) #define TARGET_BTRFS_IOC_INO_PATHS TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h index b4f462b5c6..345193270c 100644 --- a/linux-user/syscall_types.h +++ b/linux-user/syscall_types.h @@ -373,6 +373,32 @@ STRUCT(btrfs_ioctl_ino_lookup_user_args, MK_ARRAY(TYPE_CHAR, BTRFS_VOL_NAME_MAX + 1), /* name */ MK_ARRAY(TYPE_CHAR, BTRFS_INO_LOOKUP_USER_PATH_MAX)) /* path */ +STRUCT(btrfs_scrub_progress, + TYPE_ULONGLONG, /* data_extents_scrubbed */ + TYPE_ULONGLONG, /* tree_extents_scrubbed */ + TYPE_ULONGLONG, /* data_bytes_scrubbed */ + TYPE_ULONGLONG, /* tree_bytes_scrubbed */ + TYPE_ULONGLONG, /* read_errors */ + TYPE_ULONGLONG, /* csum_errors */ + TYPE_ULONGLONG, /* verify_errors */ + TYPE_ULONGLONG, /* no_csum */ + TYPE_ULONGLONG, /* csum_discards */ + TYPE_ULONGLONG, /* super_errors */ + TYPE_ULONGLONG, /* malloc_errors */ + TYPE_ULONGLONG, /* uncorrectable_errors */ + TYPE_ULONGLONG, /* corrected_er */ + TYPE_ULONGLONG, /
[PATCH v2 0/4] Add support for a group of btrfs ioctls - 2
This series covers support for following btrfs ioctls *BTRFS_IOC_DEFAULT_SUBVOL*BTRFS_IOC_QUOTA_RESCAN *BTRFS_IOC_GET_SUBVOL_ROOTREF*BTRFS_IOC_QUOTA_RESCAN_WAIT *BTRFS_IOC_QUOTA_CTL *BTRFS_IOC_SCRUB *BTRFS_IOC_QGROUP_CREATE *BTRFS_IOC_SCRUB_CANCEL *BTRFS_IOC_QGROUP_ASSIGN *BTRFS_IOC_SCRUB_PROGRESS *BTRFS_IOC_INO_PATHS *BTRFS_IOC_QGROUP_LIMIT *BTRFS_IOC_LOGICAL_INO *BTRFS_IOC_QUOTA_RESCAN_STATUS *BTRFS_IOC_LOGICAL_INO_V2 *BTRFS_IOC_INO_LOOKUP_USER *BTRFS_IOC_INO_LOOKUP The functionalities of individual ioctls were described in this series patch commit messages. Since all of these ioctls are added in kernel version 3.9, their definitions in file 'linux-user/ioctls.h' are enwrapped in an #ifdef directive. Testing method: Mini test programs were written for these ioctls. These test programs can be found on a repository which is located on the link: https://github.com/bozutaf/btrfs-tests bozutaf/btrfs-tests Contribute to bozutaf/btrfs-tests development by creating an account on GitHub. github.com These test programs were compiled (sometimes using cross compilers) for following architectures: * Intel 64-bit (little endian) * Power pc 32-bit (big endian) * Power pc 64-bit (big endian) The corresponding native programs were executed without using QEMU on an intel x86_64 host. All applicable compiled programs were in turn executed through QEMU and the results obtained were the same ones gotten for native execution v2: * added Signed-off-by tag (forgot in v1...) Based-on: <20200709155203.21106-1-filip.boz...@syrmia.com> Filip Bozuta (4): linux-user: Add support for a group of btrfs inode ioctls linux-user: Add support for two btrfs ioctls used for subvolume linux-user: Add support for btrfs ioctls used to manage quota linux-user: Add support for btrfs ioctls used to scrub a filesystem linux-user/ioctls.h| 65 +++ linux-user/syscall_defs.h | 34 ++ linux-user/syscall_types.h | 90 ++ 3 files changed, 189 insertions(+) -- 2.25.1
Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems
On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote: > It turns out there are some 64 bit systems that have relatively low > amounts of physical memory available to them (typically CI system). > Even with swapping available a 1GB translation buffer that fills up > can put the machine under increased memory pressure. Detect these low > memory situations and reduce tb_size appropriately. > > Fixes: 600e17b261 > Signed-off-by: Alex Bennée > Cc: BALATON Zoltan > Cc: Christian Ehrhardt > --- > accel/tcg/translate-all.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c > index 2afa46bd2b1..2ff0ba6d19b 100644 > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size) > { > /* Size the buffer. */ > if (tb_size == 0) { > -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +size_t phys_mem = qemu_get_host_physmem(); > +if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) { > +tb_size = phys_mem / 4; > +} else { > +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +} I'm not convinced this is going to work when running QEMU inside a container environment with RAM cap, because the physmem level is completely unrelated to the RAM the container is permitted to actually use in practice. ie host has 32 GB of RAM, but the container QEMU is in only has 1 GB permitted. I don't have much of a better suggestion, as I don't think we want to get into reading the cgroups memory limits. It does feel like the assumption we can blindly use a 1GB cache though is invalid even with this patch applied. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: TB Cache size grows out of control with qemu 5.0
Christian Ehrhardt writes: > On Thu, Jul 16, 2020 at 6:27 PM Alex Bennée wrote: > >> >> Christian Ehrhardt writes: >> >> > On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan >> wrote: >> > >> >> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two >> >> following it. >> >> >> > >> > Thank you Zoltan for pointing out this commit, I agree that this seems >> to be >> > the trigger for the issues I'm seeing. Unfortunately the common CI host >> size >> > is 1-2G. For example on Ubuntu Autopkgtests 1.5G. >> > Those of them running guests do so in 0.5-1G size in TCG mode >> > (as they often can't rely on having KVM available). >> > >> > The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing >> > on memory pressure (never existed) makes these systems go OOM-Killing >> > the qemu process. >> >> Ooops. I admit the assumption was that most people running system >> emulation would be doing it on beefier machines. >> >> > The patches indicated that the TB flushes on a full guest boot are a >> > good indicator of the TB size efficiency. From my old checks I had: >> > >> > - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4 >> > TB flush count 14, 14, 16 >> > - Qemu 5.0 512M guest with 1G default >> > TB flush count 1, 1, 1 >> > >> > I agree that ram/4 seems odd, especially on huge guests that is a lot >> > potentially wasted. And most environments have a bit of breathing >> > room 1G is too big in small host systems and the common CI system falls >> > into this category. So I tuned it down to 256M for a test. >> > >> > - Qemu 4.2 512M guest with tb-size 256M >> > TB flush count 5, 5, 5 >> > - Qemu 5.0 512M guest with tb-size 256M >> > TB flush count 5, 5, 5 >> > - Qemu 5.0 512M guest with 256M default in code >> > TB flush count 5, 5, 5 >> > >> > So performance wise the results are as much in-between as you'd think >> from a >> > TB size in between. And the memory consumption which (for me) is the >> actual >> > current issue to fix would be back in line again as expected. >> >> So I'm glad you have the workaround. >> >> > >> > So on one hand I'm suggesting something like: >> > --- a/accel/tcg/translate-all.c >> > +++ b/accel/tcg/translate-all.c >> > @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re >> > * Users running large scale system emulation may want to tweak their >> > * runtime setup via the tb-size control on the command line. >> > */ >> > -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB) >> > +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB) >> >> The problem we have is any number we pick here is arbitrary. And while >> it did regress your use-case changing it again just pushes a performance >> regression onto someone else. > > > Thanks for your feedback Alex! > > That is true "for you" since 5.0 is released from upstreams POV. > But from the downstreams POV no 5.0 exists for Ubuntu yet and I'd break > many places releasing it like that. > Sadly the performance gain to the other cases will most likely go unnoticed. > > >> The most (*) 64 bit desktop PCs have 16Gb >> of RAM, almost all have more than 8gb. And there is a workaround. >> > > Due to our work around virtualization the values representing > "most 64 bit desktop PCs" aren't the only thing that matters :-) > > ... > > >> > This is a bit more tricky than it seems as ram_sizes is no more >> > present in that context but it is enough to discuss it. >> > That should serve all cases - small and large - better as a pure >> > static default of 1G or always ram/4? >> >> I'm definitely against re-introducing ram_size into the mix. The >> original commit (a1b18df9a4) that broke this introduced an ordering >> dependency which we don't want to bring back. >> > > I agree with that reasoning, but currently without any size dependency > the "arbitrary value" we picked to be 1G is even more fixed than it was > before. > Compared to pre v5.0 for now I can only decide to > a) tune it down -> performance impact for huge guests > b) keep it at 1G -> functional breakage with small hosts > > I'd be more amenable to something that took into account host memory and >> limited the default if it was smaller than a threshold. Is there a way >> to probe that that doesn't involve slurping /proc/meminfo? >> > > I agree that a host-size dependency might be the better way to go, > yet I have no great cross-platform resilient way to get that. > Maybe we can make it like "if I can get some value consider it, > otherwise use the current default". > That would improve many places already, while keeping the rest at the > current behavior. > > >> > >> > P.S. I added Alex being the Author of the offending patch and >> Richard/Paolo >> > for being listed in the Maintainers file for TCG. >> >> >> > From Zoltan (unifying the thread a bit): > >> I agree that this should be dependent on host memory size not guest >> ram_size but it might be tricky to get that value because different host >> OSes would ne
Re: [PATCH v1 4/5] util: add qemu_get_host_physmem utility function
On Fri, Jul 17, 2020 at 3:32 PM BALATON Zoltan wrote: > On Fri, 17 Jul 2020, Alex Bennée wrote: > > This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES > > isn't standardised but at least appears in the man pages for > > Open/FreeBSD. The result is advisory so any users of it shouldn't just > > fail if we can't work it out. > > > > The win32 stub currently returns 0 until someone with a Windows system > > can develop and test a patch. > > > > Signed-off-by: Alex Bennée > > Cc: BALATON Zoltan > > Cc: Christian Ehrhardt > > --- > > include/qemu/osdep.h | 10 ++ > > util/oslib-posix.c | 11 +++ > > util/oslib-win32.c | 6 ++ > > 3 files changed, 27 insertions(+) > > > > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h > > index 4841b5c6b5f..7ff209983e2 100644 > > --- a/include/qemu/osdep.h > > +++ b/include/qemu/osdep.h > > @@ -665,4 +665,14 @@ static inline void qemu_reset_optind(void) > > */ > > char *qemu_get_host_name(Error **errp); > > > > +/** > > + * qemu_get_host_physmem: > > + * > > + * Operating system agnostiv way of querying host memory. > > Typo: agnostiv -> agnostic > > > + * > > + * Returns amount of physical memory on the system. This is purely > > + * advisery and may return 0 if we can't work it out. > > + */ > > +size_t qemu_get_host_physmem(void); > > + > > #endif > > diff --git a/util/oslib-posix.c b/util/oslib-posix.c > > index 36bf8593f8c..d9da782b896 100644 > > --- a/util/oslib-posix.c > > +++ b/util/oslib-posix.c > > @@ -839,3 +839,14 @@ char *qemu_get_host_name(Error **errp) > > > > return g_steal_pointer(&hostname); > > } > > + > > +size_t qemu_get_host_physmem(void) > > +{ > > +#ifdef _SC_PHYS_PAGES > > +long pages = sysconf(_SC_PHYS_PAGES); > > +if (pages > 0) { > > +return pages * qemu_real_host_page_size; > > The Linux man page warns that this product may overflow so maybe you could > return pages here. > The caller might be even less aware of that than this function - so maybe better handle it here. How about handling overflows and cutting it to MiB before returning? > > +} > > +#endif > > +return 0; > > +} > > diff --git a/util/oslib-win32.c b/util/oslib-win32.c > > index 7eedbe5859a..31030463cc9 100644 > > --- a/util/oslib-win32.c > > +++ b/util/oslib-win32.c > > @@ -828,3 +828,9 @@ char *qemu_get_host_name(Error **errp) > > > > return g_utf16_to_utf8(tmp, size, NULL, NULL, NULL); > > } > > + > > +size_t qemu_get_host_physmem(void) > > +{ > > +/* currently unimplemented */ > > +return 0; > > +} > > For Windows this may help: > > https://stackoverflow.com/questions/5553665/get-ram-system-size > > not sure about other OSes. > > Regards, > BALATON Zoltan -- Christian Ehrhardt Staff Engineer, Ubuntu Server Canonical Ltd
Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems
On Fri, Jul 17, 2020 at 12:51 PM Alex Bennée wrote: > It turns out there are some 64 bit systems that have relatively low > amounts of physical memory available to them (typically CI system). > Even with swapping available a 1GB translation buffer that fills up > can put the machine under increased memory pressure. Detect these low > memory situations and reduce tb_size appropriately. > > Fixes: 600e17b261 > Signed-off-by: Alex Bennée > Cc: BALATON Zoltan > Cc: Christian Ehrhardt > --- > accel/tcg/translate-all.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c > index 2afa46bd2b1..2ff0ba6d19b 100644 > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t > tb_size) > { > /* Size the buffer. */ > if (tb_size == 0) { > -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +size_t phys_mem = qemu_get_host_physmem(); > +if (phys_mem > 0 && phys_mem < (2 * > DEFAULT_CODE_GEN_BUFFER_SIZE)) { > +tb_size = phys_mem / 4; > In my experiments I've found that /8 more closely matches the former behavior on small hosts while at the same time not affecting common large hosts. > +} else { > +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +} > } > if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) { > tb_size = MIN_CODE_GEN_BUFFER_SIZE; > -- > 2.20.1 > > -- Christian Ehrhardt Staff Engineer, Ubuntu Server Canonical Ltd
[PATCH 0/4] Add support for a group of btrfs ioctls - 2
This series covers support for following btrfs ioctls *BTRFS_IOC_DEFAULT_SUBVOL*BTRFS_IOC_QUOTA_RESCAN *BTRFS_IOC_GET_SUBVOL_ROOTREF*BTRFS_IOC_QUOTA_RESCAN_WAIT *BTRFS_IOC_QUOTA_CTL *BTRFS_IOC_SCRUB *BTRFS_IOC_QGROUP_CREATE *BTRFS_IOC_SCRUB_CANCEL *BTRFS_IOC_QGROUP_ASSIGN *BTRFS_IOC_SCRUB_PROGRESS *BTRFS_IOC_INO_PATHS *BTRFS_IOC_QGROUP_LIMIT *BTRFS_IOC_LOGICAL_INO *BTRFS_IOC_QUOTA_RESCAN_STATUS *BTRFS_IOC_LOGICAL_INO_V2 *BTRFS_IOC_INO_LOOKUP_USER *BTRFS_IOC_INO_LOOKUP The functionalities of individual ioctls were described in this series patch commit messages. Since all of these ioctls are added in kernel version 3.9, their definitions in file 'linux-user/ioctls.h' are enwrapped in an #ifdef directive. Testing method: Mini test programs were written for these ioctls. These test programs can be found on a repository which is located on the link: https://github.com/bozutaf/btrfs-tests These test programs were compiled (sometimes using cross compilers) for following architectures: * Intel 64-bit (little endian) * Power pc 32-bit (big endian) * Power pc 64-bit (big endian) The corresponding native programs were executed without using QEMU on an intel x86_64 host. All applicable compiled programs were in turn executed through QEMU and the results obtained were the same ones gotten for native execution. Based-on: <20200709155203.21106-1-filip.boz...@syrmia.com> Filip Bozuta (4): linux-user: Add support for a group of btrfs inode ioctls linux-user: Add support for two btrfs ioctls used for subvolume linux-user: Add support for btrfs ioctls used to manage quota linux-user: Add support for btrfs ioctls used to scrub a filesystem linux-user/ioctls.h| 65 +++ linux-user/syscall_defs.h | 34 ++ linux-user/syscall_types.h | 90 ++ 3 files changed, 189 insertions(+) -- 2.25.1
Re: [RFC v2 1/1] memory: Delete assertion in memory_region_unregister_iommu_notifier
On Thu, Jul 16, 2020 at 10:54:31AM +0800, Jason Wang wrote: > > On 2020/7/16 上午9:00, Peter Xu wrote: > > On Mon, Jul 13, 2020 at 12:04:16PM +0800, Jason Wang wrote: > > > On 2020/7/10 下午9:30, Peter Xu wrote: > > > > On Fri, Jul 10, 2020 at 02:34:11PM +0800, Jason Wang wrote: > > > > > On 2020/7/9 下午10:10, Peter Xu wrote: > > > > > > On Thu, Jul 09, 2020 at 01:58:33PM +0800, Jason Wang wrote: > > > > > > > > > - If we care the performance, it's better to implement the > > > > > > > > > MAP event for > > > > > > > > > vhost, otherwise it could be a lot of IOTLB miss > > > > > > > > I feel like these are two things. > > > > > > > > > > > > > > > > So far what we are talking about is whether vt-d should have > > > > > > > > knowledge about > > > > > > > > what kind of events one iommu notifier is interested in. I > > > > > > > > still think we > > > > > > > > should keep this as answered in question 1. > > > > > > > > > > > > > > > > The other question is whether we want to switch vhost from > > > > > > > > UNMAP to MAP/UNMAP > > > > > > > > events even without vDMA, so that vhost can establish the > > > > > > > > mapping even before > > > > > > > > IO starts. IMHO it's doable, but only if the guest runs DPDK > > > > > > > > workloads. When > > > > > > > > the guest is using dynamic iommu page mappings, I feel like > > > > > > > > that can be even > > > > > > > > slower, because then the worst case is for each IO we'll need > > > > > > > > to vmexit twice: > > > > > > > > > > > > > > > > - The first vmexit caused by an invalidation to MAP the > > > > > > > > page tables, so vhost > > > > > > > > will setup the page table before IO starts > > > > > > > > > > > > > > > > - IO/DMA triggers and completes > > > > > > > > > > > > > > > > - The second vmexit caused by another invalidation to > > > > > > > > UNMAP the page tables > > > > > > > > > > > > > > > > So it seems to be worse than when vhost only uses UNMAP like > > > > > > > > right now. At > > > > > > > > least we only have one vmexit (when UNMAP). We'll have a vhost > > > > > > > > translate() > > > > > > > > request from kernel to userspace, but IMHO that's cheaper than > > > > > > > > the vmexit. > > > > > > > Right but then I would still prefer to have another notifier. > > > > > > > > > > > > > > Since vtd_page_walk has nothing to do with device IOTLB. IOMMU > > > > > > > have a > > > > > > > dedicated command for flushing device IOTLB. But the check for > > > > > > > vtd_as_has_map_notifier is used to skip the device which can do > > > > > > > demand > > > > > > > paging via ATS or device specific way. If we have two different > > > > > > > notifiers, > > > > > > > vhost will be on the device iotlb notifier so we don't need it at > > > > > > > all? > > > > > > But we can still have iommu notifier that only registers to UNMAP > > > > > > even after we > > > > > > introduce dev-iotlb notifier? We don't want to do page walk for > > > > > > them as well. > > > > > > TCG should be the only one so far, but I don't know.. maybe there > > > > > > can still be > > > > > > new ones? > > > > > I think you're right. But looking at the codes, it looks like the > > > > > check of > > > > > vtd_as_has_map_notifier() was only used in: > > > > > > > > > > 1) vtd_iommu_replay() > > > > > 2) vtd_iotlb_page_invalidate_notify() (PSI) > > > > > > > > > > For the replay, it's expensive anyhow. For PSI, I think it's just > > > > > about one > > > > > or few mappings, not sure it will have obvious performance impact. > > > > > > > > > > And I had two questions: > > > > > > > > > > 1) The codes doesn't check map for DSI or GI, does this match what > > > > > spec > > > > > said? (It looks to me the spec is unclear in this part) > > > > Both DSI/GI should cover maps too? E.g. vtd_sync_shadow_page_table() in > > > > vtd_iotlb_domain_invalidate(). > > > > > > I meant the code doesn't check whether there's an MAP notifier :) > > It's actually checked, because it loops over vtd_as_with_notifiers, and only > > MAP notifiers register to that. :) > > > I may miss something but I don't find the code to block UNMAP notifiers? > > vhost_iommu_region_add() > memory_region_register_iommu_notifier() > memory_region_update_iommu_notify_flags() > imrc->notify_flag_changed() > vtd_iommu_notify_flag_changed() > > ? Yeah I think you're right. I might have confused with some previous implementations. Maybe we should also do similar thing for DSI/GI just like what we do in PSI. > > > > > 2) for the replay() I don't see other implementations (either spapr or > > > > > generic one) that did unmap (actually they skip unmap explicitly), any > > > > > reason for doing this in intel IOMMU? > > > > I could be wrong, but I'd guess it's because vt-d implemented the > > > > caching mode > > > > by leveraging the same invalidation strucuture, so it's harder to make > > > > all > > > > things
Re: [PATCH] accel/tcg: reduce default code gen buffer on small hosts
On Fri, Jul 17, 2020 at 4:07 PM Christian Ehrhardt < christian.ehrha...@canonical.com> wrote: > Since v5.0.0 and 600e17b2 "accel/tcg: increase default code gen buffer > size for 64 bit" in particular qemu with TCG regularly gets OOM Killed > on small hosts. > > The former 47a2def4 "accel/tcg: remove link between guest ram and TCG > cache size" removed the link to guest size which is right, but at least > some connection to the host size needs to be retained to avoid growing > out of control on common CI setups which run at 1-2G host sizes. > > The lower value of 1/8th of the host memory size and the default (of > currently 1G) will be taken to initialize the TB. There already is a > Min/Max check in place to not reach ridiculously small values. > > Fixes: 600e17b2 > Just found "[PATCH v1 0/5] candidate fixes for 5.1-rc1 (shippable, semihosting, OOM tcg)" which was submitted while I was prepping this one (this is a busy day since I'll be off for a week). Please ignore this patch here and give the series of Alex a look as it is the more advanced version :-). > Signed-off-by: Christian Ehrhardt > --- > accel/tcg/translate-all.c | 23 +++ > 1 file changed, 23 insertions(+) > > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c > index 2afa46bd2b..ffcd67060e 100644 > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -977,6 +977,29 @@ static inline size_t size_code_gen_buffer(size_t > tb_size) > /* Size the buffer. */ > if (tb_size == 0) { > tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +/* > + * A static default of 1G turned out to break (OOM Kill) many > common > + * CI setups that run at 1-2G Host memory size. > + * At the same time the former default of ram_size/4 wasted > performance > + * on large host systems when running small guests. > + * Common CI guest sizes are 0.5-1G which meant ~128M-256M TB > size. > + * A Default of 1/8th of the host size will get small hosts a > + * similar TB size than they had prior to v5.0 and common bare > metal > + * systems (>=8G) the new 1G default that was set in v5.0 > + */ > +#if defined _SC_PHYS_PAGES && defined _SC_PAGESIZE > +{ > +unsigned long max = DEFAULT_CODE_GEN_BUFFER_SIZE; > +double pages = (double)sysconf(_SC_PHYS_PAGES); > + > +if (pages > 0 && pagesize > 0) { > +max = (unsigned long)((pages * qemu_real_host_page_size) > / 8); > +} > +if (max < DEFAULT_CODE_GEN_BUFFER_SIZE) { > +tb_size = max; > +} > +} > +#endif > } > if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) { > tb_size = MIN_CODE_GEN_BUFFER_SIZE; > -- > 2.27.0 > > -- Christian Ehrhardt Staff Engineer, Ubuntu Server Canonical Ltd
[PATCH] accel/tcg: reduce default code gen buffer on small hosts
Since v5.0.0 and 600e17b2 "accel/tcg: increase default code gen buffer size for 64 bit" in particular qemu with TCG regularly gets OOM Killed on small hosts. The former 47a2def4 "accel/tcg: remove link between guest ram and TCG cache size" removed the link to guest size which is right, but at least some connection to the host size needs to be retained to avoid growing out of control on common CI setups which run at 1-2G host sizes. The lower value of 1/8th of the host memory size and the default (of currently 1G) will be taken to initialize the TB. There already is a Min/Max check in place to not reach ridiculously small values. Fixes: 600e17b2 Signed-off-by: Christian Ehrhardt --- accel/tcg/translate-all.c | 23 +++ 1 file changed, 23 insertions(+) diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c index 2afa46bd2b..ffcd67060e 100644 --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -977,6 +977,29 @@ static inline size_t size_code_gen_buffer(size_t tb_size) /* Size the buffer. */ if (tb_size == 0) { tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; +/* + * A static default of 1G turned out to break (OOM Kill) many common + * CI setups that run at 1-2G Host memory size. + * At the same time the former default of ram_size/4 wasted performance + * on large host systems when running small guests. + * Common CI guest sizes are 0.5-1G which meant ~128M-256M TB size. + * A Default of 1/8th of the host size will get small hosts a + * similar TB size than they had prior to v5.0 and common bare metal + * systems (>=8G) the new 1G default that was set in v5.0 + */ +#if defined _SC_PHYS_PAGES && defined _SC_PAGESIZE +{ +unsigned long max = DEFAULT_CODE_GEN_BUFFER_SIZE; +double pages = (double)sysconf(_SC_PHYS_PAGES); + +if (pages > 0 && pagesize > 0) { +max = (unsigned long)((pages * qemu_real_host_page_size) / 8); +} +if (max < DEFAULT_CODE_GEN_BUFFER_SIZE) { +tb_size = max; +} +} +#endif } if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) { tb_size = MIN_CODE_GEN_BUFFER_SIZE; -- 2.27.0
[PATCH 3/4] linux-user: Add support for btrfs ioctls used to manage quota
This patch implements functionality for following ioctls: BTRFS_IOC_QUOTA_CTL - Enabling/Disabling quota support Enable or disable quota support for a btrfs filesystem. Quota support is enabled or disabled using the ioctls third argument which represents a pointer to a following type: struct btrfs_ioctl_quota_ctl_args { __u64 cmd; __u64 status; }; Before calling this ioctl, the 'cmd' field should be filled with one of the values 'BTRFS_QUOTA_CTL_ENABLE' (enabling quota) 'BTRFS_QUOTA_CTL_DISABLE' (disabling quota). BTRFS_IOC_QGROUP_CREATE - Creating/Removing a subvolume quota group Create or remove a subvolume quota group. The subvolume quota group is created or removed using the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_qgroup_create_args { __u64 create; __u64 qgroupid; }; Before calling this ioctl, the 'create' field should be filled with the aproppriate value depending on if the user wants to create or remove a quota group (0 for removing, everything else for creating). Also, the 'qgroupid' field should be filled with the value for the quota group id that is to be created. BTRFS_IOC_QGROUP_ASSIGN - Asigning or removing a quota group as child group Asign or remove a quota group as child quota group of another group in the btrfs filesystem. The asignment is done using the ioctl's third argument which represents a pointert to a following type: struct btrfs_ioctl_qgroup_assign_args { __u64 assign; __u64 src; __u64 dst; }; Before calling this ioctl, the 'assign' field should be filled with the aproppriate value depending on if the user wants to asign or remove a quota group as a child quota group of another group (0 for removing, everythin else for asigning). Also, the 'src' and 'dst' fields should be filled with the aproppriate quota group id values depending on which quota group needs to asigned or removed as child quota group of another group ('src' gets asigned or removed as child group of 'dst'). BTRFS_IOC_QGROUP_LIMIT - Limiting the size of a quota group Limit the size of a quota group. The size of the quota group is limited with the ioctls third argument which represents a pointer to a following type: struct btrfs_ioctl_qgroup_limit_args { __u64 qgroupid; struct btrfs_qgroup_limit lim; }; Before calling this ioctl, the 'qgroup' id field should be filled with aproppriate value of the quota group id for which the size is to be limited. The second field is of following type: struct btrfs_qgroup_limit { __u64 flags; __u64 max_rfer; __u64 max_excl; __u64 rsv_rfer; __u64 rsv_excl; }; The 'max_rfer' field should be filled with the size to which the quota group should be limited. The 'flags' field can be used for passing additional options and can have values which can be found on: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/btrfs.h#L67 BTRFS_IOC_QUOTA_RESCAN_STATUS - Checking status of running rescan operation Check status of a running rescan operation. The status is checked using the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_quota_rescan_args { __u64 flags; __u64 progress; __u64 reserved[6]; }; If there is a rescan operation running, 'flags' field is set to 1, and 'progress' field is set to aproppriate value which represents the progress of the operation. BTRFS_IOC_QUOTA_RESCAN - Starting a rescan operation Start ar rescan operation to Trash all quota groups and scan the metadata again with the current config. Before calling this ioctl, BTRFS_IOC_QUOTA_RESCAN_STATUS sould be run to check if there is already a rescan operation runing. After that ioctl call, the received 'struct btrfs_ioctl_quota_rescan_args' should be than passed as this ioctls third argument. BTRFS_IOC_QUOTA_RESCAN_WAIT - Waiting for a rescan operation to finish Wait until a rescan operation is finished (if there is a rescan operation running). The third ioctls argument is ignored. Implementation notes: Almost all of the ioctls in this patch use structure types as third arguments. That is the reason why aproppriate thunk definitions were added in file 'syscall_types.h'. --- linux-user/ioctls.h| 27 +++ linux-user/syscall_defs.h | 14 ++ linux-user/syscall_types.h | 29 + 3 files changed, 70 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index 2c553103e6..8665f504bf 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -227,6 +227,33 @@ IOCTL(BTRFS_IOC_LOGICAL_INO, IOC_RW, MK_PTR(MK
[PATCH 4/4] linux-user: Add support for btrfs ioctls used to scrub a filesystem
This patch implements functionality for following ioctls: BTRFS_IOC_SCRUB - Starting a btrfs filesystem scrub Start a btrfs filesystem scrub. The third ioctls argument is a pointer to a following type: struct btrfs_ioctl_scrub_args { __u64 devid;/* in */ __u64 start;/* in */ __u64 end; /* in */ __u64 flags;/* in */ struct btrfs_scrub_progress progress; /* out */ /* pad to 1k */ __u64 unused[(1024-32-sizeof(struct btrfs_scrub_progress))/8]; }; Before calling this ioctl, field 'devid' should be filled with value that represents the device id of the btrfs filesystem for which the scrub is to be started. BTRFS_IOC_SCRUB_CANCEL - Canceling scrub of a btrfs filesystem Cancel a btrfs filesystem scrub if it is running. The third ioctls argument is ignored. BTRFS_IOC_SCRUB_PROGRESS - Getting status of a running scrub Read the status of a running btrfs filesystem scrub. The third ioctls argument is a pointer to the above mentioned 'struct btrfs_ioctl_scrub_args'. Similarly as with 'BTRFS_IOC_SCRUB', the 'devid' field should be filled with value that represents the id of the btrfs device for which the scrub has started. The status of a running scrub is returned in the field 'progress' which is of type 'struct btrfs_scrub_progress' and its definition can be found at: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/btrfs.h#L150 Implementation nots: Ioctls in this patch use type 'struct btrfs_ioctl_scrub_args' as their third argument. That is the reason why an aproppriate thunk type definition is added in file 'syscall_types.h'. --- linux-user/ioctls.h| 11 +++ linux-user/syscall_defs.h | 6 ++ linux-user/syscall_types.h | 26 ++ 3 files changed, 43 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index 8665f504bf..bf80615438 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -215,6 +215,17 @@ #ifdef BTRFS_IOC_SUBVOL_SETFLAGS IOCTL(BTRFS_IOC_SUBVOL_SETFLAGS, IOC_W, MK_PTR(TYPE_ULONGLONG)) #endif +#ifdef BTRFS_IOC_SCRUB + IOCTL(BTRFS_IOC_SCRUB, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_scrub_args))) +#endif +#ifdef BTRFS_IOC_SCRUB_CANCEL + IOCTL(BTRFS_IOC_SCRUB_CANCEL, 0, TYPE_NULL) +#endif +#ifdef BTRFS_IOC_SCRUB_PROGRESS + IOCTL(BTRFS_IOC_SCRUB_PROGRESS, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_scrub_args))) +#endif #ifdef BTRFS_IOC_DEV_INFO IOCTL(BTRFS_IOC_DEV_INFO, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_dev_info_args))) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 3f771ae5d1..589ec3e9b0 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -990,6 +990,12 @@ struct target_rtc_pll_info { 25, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_SETFLAGSTARGET_IOW(BTRFS_IOCTL_MAGIC, \ 26, abi_ullong) +#define TARGET_BTRFS_IOC_SCRUB TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ + 27, struct btrfs_ioctl_scrub_args) +#define TARGET_BTRFS_IOC_SCRUB_CANCEL TARGET_IO(BTRFS_IOCTL_MAGIC, \ + 28) +#define TARGET_BTRFS_IOC_SCRUB_PROGRESS TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ + 29, struct btrfs_ioctl_scrub_args) #define TARGET_BTRFS_IOC_DEV_INFO TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 30, struct btrfs_ioctl_dev_info_args) #define TARGET_BTRFS_IOC_INO_PATHS TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h index b4f462b5c6..345193270c 100644 --- a/linux-user/syscall_types.h +++ b/linux-user/syscall_types.h @@ -373,6 +373,32 @@ STRUCT(btrfs_ioctl_ino_lookup_user_args, MK_ARRAY(TYPE_CHAR, BTRFS_VOL_NAME_MAX + 1), /* name */ MK_ARRAY(TYPE_CHAR, BTRFS_INO_LOOKUP_USER_PATH_MAX)) /* path */ +STRUCT(btrfs_scrub_progress, + TYPE_ULONGLONG, /* data_extents_scrubbed */ + TYPE_ULONGLONG, /* tree_extents_scrubbed */ + TYPE_ULONGLONG, /* data_bytes_scrubbed */ + TYPE_ULONGLONG, /* tree_bytes_scrubbed */ + TYPE_ULONGLONG, /* read_errors */ + TYPE_ULONGLONG, /* csum_errors */ + TYPE_ULONGLONG, /* verify_errors */ + TYPE_ULONGLONG, /* no_csum */ + TYPE_ULONGLONG, /* csum_discards */ + TYPE_ULONGLONG, /* super_errors */ + TYPE_ULONGLONG, /* malloc_errors */ + TYPE_ULONGLONG, /* uncorrectable_errors */ + TYPE_ULONGLONG, /* corrected_er */ + TYPE_ULONGLONG, /* last_physical */ + TYP
[PATCH 1/4] linux-user: Add support for a group of btrfs inode ioctls
This patch implements functionality of following ioctls: BTRFS_IOC_INO_LOOKUP - Reading tree root id and path Read tree root id and path for a given file or directory. The name and tree root id are returned in an ioctl's third argument that represents a pointer to a following type: struct btrfs_ioctl_ino_lookup_args { __u64 treeid; __u64 objectid; char name[BTRFS_INO_LOOKUP_PATH_MAX]; }; Before calling this ioctl, field 'objectid' should be filled with the object id value for which the tree id and path are to be read. Value 'BTRFS_FIRST_FREE_OBJECTID' represents the object id for the first available btrfs object (directory or file). BTRFS_IOC_INO_PATHS - Reading paths to all files Read path to all files with a certain inode number. The paths are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_ino_path_args { __u64 inum; /* in */ __u64 size; /* in */ __u64 reserved[4]; /* struct btrfs_data_container *fspath; out */ __u64 fspath; /* out */ }; Before calling this ioctl, the 'inum' and 'size' field should be filled with the aproppriate inode number and size of the directory where file paths should be looked for. For now, the paths are returned in an '__u64' (unsigned long long) value 'fspath'. BTRFS_IOC_LOGICAL_INO - Reading inode numbers Read inode numbers for files on a certain logical adress. The inode numbers are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_logical_ino_args { __u64 logical;/* in */ __u64 size; /* in */ __u64 reserved[3];/* must be 0 for now */ __u64 flags; /* in, v2 only */ /* struct btrfs_data_container *inodes;out */ __u64 inodes; }; Before calling this ioctl, the 'logical' and 'size' field should be filled with the aproppriate logical adress and size of where the inode numbers of files should be looked for. For now, the inode numbers are returned in an '__u64' (unsigned long long) value 'inodes'. BTRFS_IOC_LOGICAL_INO_V2 - Reading inode numbers Same as the above mentioned ioctl except that it allows passing a flags 'BTRFS_LOGICAL_INO_ARGS_IGNORE_OFFSET'. BTRFS_IOC_INO_LOOKUP_USER - Reading subvolume name and path Read name and path of a subvolume. The tree root id and path are read in an ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_ino_lookup_user_args { /* in, inode number containing the subvolume of 'subvolid' */ __u64 dirid; /* in */ __u64 treeid; /* out, name of the subvolume of 'treeid' */ char name[BTRFS_VOL_NAME_MAX + 1]; /* * out, constructed path from the directory with which the ioctl is * called to dirid */ char path[BTRFS_INO_LOOKUP_USER_PATH_MAX]; }; Before calling this ioctl, the 'dirid' and 'treeid' field should be filled with aproppriate values which represent the inode number of the directory that contains the subvolume and treeid of the subvolume. Implementation notes: All of the ioctls in this patch use structure types as third arguments. That is the reason why aproppriate thunk definitions were added in file 'syscall_types.h'. --- linux-user/ioctls.h| 20 linux-user/syscall_defs.h | 10 ++ linux-user/syscall_types.h | 24 3 files changed, 54 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index c6303a0406..a7f5664487 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -202,6 +202,10 @@ IOCTL(BTRFS_IOC_SNAP_DESTROY, IOC_W, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_vol_args))) #endif +#ifdef BTRFS_IOC_INO_LOOKUP + IOCTL(BTRFS_IOC_INO_LOOKUP, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_args))) +#endif #ifdef BTRFS_IOC_SUBVOL_GETFLAGS IOCTL(BTRFS_IOC_SUBVOL_GETFLAGS, IOC_R, MK_PTR(TYPE_ULONGLONG)) #endif @@ -212,6 +216,14 @@ IOCTL(BTRFS_IOC_DEV_INFO, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_dev_info_args))) #endif +#ifdef BTRFS_IOC_INO_PATHS + IOCTL(BTRFS_IOC_INO_PATHS, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_path_args))) +#endif +#ifdef BTRFS_IOC_LOGICAL_INO + IOCTL(BTRFS_IOC_LOGICAL_INO, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_logical_ino_args))) +#endif #ifdef BTRFS_IOC_
[PATCH 2/4] linux-user: Add support for two btrfs ioctls used for subvolume
This patch implements functionality for following ioctl: BTRFS_IOC_DEFAULT_SUBVOL - Setting a default subvolume Set a default subvolume for a btrfs filesystem. The third ioctl's argument is a '__u64' (unsigned long long) which represents the id of a subvolume that is to be set as the default. BTRFS_IOC_GET_SUBVOL_ROOTREF - Getting tree and directory id of subvolumes Read tree and directory id of subvolumes from a btrfs filesystem. The tree and directory id's are returned in the ioctl's third argument which represents a pointer to a following type: struct btrfs_ioctl_get_subvol_rootref_args { /* in/out, minimum id of rootref's treeid to be searched */ __u64 min_treeid; /* out */ struct { __u64 treeid; __u64 dirid; } rootref[BTRFS_MAX_ROOTREF_BUFFER_NUM]; /* out, number of found items */ __u8 num_items; __u8 align[7]; }; Before calling this ioctl, 'min_treeid' field should be filled with value that represent the minimum value for the tree id. Implementation notes: Ioctl BTRFS_IOC_GET_SUBVOL_ROOTREF uses the above mentioned structure type as third argument. That is the reason why a aproppriate thunk structure definition is added in file 'syscall_types.h'. --- linux-user/ioctls.h| 7 +++ linux-user/syscall_defs.h | 4 linux-user/syscall_types.h | 11 +++ 3 files changed, 22 insertions(+) diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h index a7f5664487..2c553103e6 100644 --- a/linux-user/ioctls.h +++ b/linux-user/ioctls.h @@ -206,6 +206,9 @@ IOCTL(BTRFS_IOC_INO_LOOKUP, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_args))) #endif +#ifdef BTRFS_IOC_DEFAULT_SUBVOL + IOCTL(BTRFS_IOC_DEFAULT_SUBVOL, IOC_W, MK_PTR(TYPE_ULONGLONG)) +#endif #ifdef BTRFS_IOC_SUBVOL_GETFLAGS IOCTL(BTRFS_IOC_SUBVOL_GETFLAGS, IOC_R, MK_PTR(TYPE_ULONGLONG)) #endif @@ -248,6 +251,10 @@ IOCTL(BTRFS_IOC_GET_SUBVOL_INFO, IOC_R, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_get_subvol_info_args))) #endif +#ifdef BTRFS_IOC_GET_SUBVOL_ROOTREF + IOCTL(BTRFS_IOC_GET_SUBVOL_ROOTREF, IOC_RW, + MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_get_subvol_rootref_args))) +#endif #ifdef BTRFS_IOC_INO_LOOKUP_USER IOCTL(BTRFS_IOC_INO_LOOKUP_USER, IOC_RW, MK_PTR(MK_STRUCT(STRUCT_btrfs_ioctl_ino_lookup_user_args))) diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h index 7bb105428b..f4b4fc4a20 100644 --- a/linux-user/syscall_defs.h +++ b/linux-user/syscall_defs.h @@ -984,6 +984,8 @@ struct target_rtc_pll_info { 15, struct btrfs_ioctl_vol_args) #define TARGET_BTRFS_IOC_INO_LOOKUP TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 18, struct btrfs_ioctl_ino_lookup_args) +#define TARGET_BTRFS_IOC_DEFAULT_SUBVOL TARGET_IOW(BTRFS_IOCTL_MAGIC, \ + 19, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_GETFLAGSTARGET_IOR(BTRFS_IOCTL_MAGIC, \ 25, abi_ullong) #define TARGET_BTRFS_IOC_SUBVOL_SETFLAGSTARGET_IOW(BTRFS_IOCTL_MAGIC, \ @@ -1006,6 +1008,8 @@ struct target_rtc_pll_info { 59, struct btrfs_ioctl_logical_ino_args) #define TARGET_BTRFS_IOC_GET_SUBVOL_INFOTARGET_IOR(BTRFS_IOCTL_MAGIC, \ 60, struct btrfs_ioctl_get_subvol_info_args) +#define TARGET_BTRFS_IOC_GET_SUBVOL_ROOTREF TARGET_IOWR(BTRFS_IOCTL_MAGIC,\ +61, struct btrfs_ioctl_get_subvol_rootref_args) #define TARGET_BTRFS_IOC_INO_LOOKUP_USERTARGET_IOWR(BTRFS_IOCTL_MAGIC,\ 62, struct btrfs_ioctl_ino_lookup_user_args) diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h index 980c29000a..d2f1b30ff3 100644 --- a/linux-user/syscall_types.h +++ b/linux-user/syscall_types.h @@ -381,6 +381,17 @@ STRUCT(btrfs_ioctl_dev_info_args, MK_ARRAY(TYPE_ULONGLONG, 379), /* unused */ MK_ARRAY(TYPE_CHAR, BTRFS_DEVICE_PATH_NAME_MAX)) /* path */ +STRUCT(rootref, + TYPE_ULONGLONG, /* treeid */ + TYPE_ULONGLONG) /* dirid */ + +STRUCT(btrfs_ioctl_get_subvol_rootref_args, +TYPE_ULONGLONG, /* min_treeid */ +MK_ARRAY(MK_STRUCT(STRUCT_rootref), + BTRFS_MAX_ROOTREF_BUFFER_NUM), /* rootref */ +TYPE_CHAR, /* num_items */ +MK_ARRAY(TYPE_CHAR, 7)) /* align */ + STRUCT(btrfs_ioctl_get_dev_stats, TYPE_ULONGLONG, /* devid */ TYPE_ULONGLONG, /* nr_items */ -- 2.25.1