Re: [PATCH] net: check payload length limit for all frames
On 2020/7/17 下午1:06, P J P wrote: Hello Jason, all +-- On Fri, 17 Jul 2020, Jason Wang wrote --+ | On 2020/7/17 上午9:21, Alexander Bulekov wrote: | > On 200717 0853, Li Qiang wrote: | >> Which issue are you trying to solve, any reference linking? | >> I also send a patch related this part and also a UAF. | > | > I reported a UAF privately to QEMU-Security in May. I believe the one Li | > is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362 | > | > When I saw Prasad's email, I was worried that I reported the same bug | > twice, but I can still reproduce LP#1886362 with Prasad's patch. | > | > On the other hand, I cannot reproduce either issue with Li's patch: | > Message-Id: <20200716161453.61295-1-liq...@163.com> | > | > Based on this, I think there were two distinct issues. Yes, LP#1886362 looks different that the one fixed here. | Could you describe the issue you saw in details? (E.g the calltrace?) The | commit log does not explain where we can get OOB or UAF. I should've included the backtrace in the commit message. It crossed my mind after I sent the patch. Sorry. Thanks but I don't see a direct relation between 64K limit and this calltrace. Maybe you can elaborate more on this? Thanks === 1040323==ERROR: AddressSanitizer: heap-use-after-free on address 0x606344d8 at pc 0x5571b8fb9ce7 bp 0x7ffede5a2290 sp 0x7ffede5a2280 READ of size 8 at 0x606344d8 thread T0 #0 0x5571b8fb9ce6 in e1000e_write_packet_to_guest hw/net/e1000e_core.c:1587 #1 0x5571b8fba8fc in e1000e_receive_iov hw/net/e1000e_core.c:1709 #2 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #3 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #4 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #5 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #6 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #7 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #8 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #9 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451 #10 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #11 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 ... 0x606344d8 is located 24 bytes inside of 64-byte region [0x606344c0,0x60634500) freed by thread T0 here: #0 0x7f89e3b87307 in __interceptor_free (/lib64/libasan.so.6+0xb0307) #1 0x7f89e37c7a6c in g_free (/lib64/libglib-2.0.so.0+0x58a6c) #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #12 0x5571b8fbe011 in e1000e_set_tctl hw/net/e1000e_core.c:2431 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 ... previously allocated by thread T0 here: #0 0x7f89e3b87667 in __interceptor_malloc (/lib64/libasan.so.6+0xb0667) #1 0x7f89e37c7978 in g_malloc (/lib64/libglib-2.0.so.0+0x58978) #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #12 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 === | >>> --- a/hw/net/net_tx_pkt.c | >>> +++ b/hw/net/net_tx_pkt.c | >>> @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, | >>> NetClientState *nc) | >>>* Since underlying infrastructure does not support IP datagrams | >>>longer | >>>* than 64K we should drop such packets and don't even try to send | >>>*/ | >>> -if (VIRTIO_NET_HDR_GSO_NONE !=
Re: [PATCH] gitlab-ci.yml: Add oss-fuzz build tests
On 16/07/2020 18.33, Alexander Bulekov wrote: > This tries to build and run the fuzzers with the same build-script used > by oss-fuzz. This doesn't guarantee that the builds on oss-fuzz will > also succeed, since oss-fuzz provides its own compiler and fuzzer vars, > but it can catch changes that are not compatible with the the > ./scripts/oss-fuzz/build.sh script. > The strange way of finding fuzzer binaries stems from the method used by > oss-fuzz: > https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/targets_list > > Signed-off-by: Alexander Bulekov > --- > > Similar to Thomas' patch: > >> Note: This patch needs two other patches merged first to work correctly: > >> - 'fuzz: Expect the cmdline in a freeable GString' from Alexander > >> - 'qom: Plug memory leak in "info qom-tree"' from Markus > > Otherwise the test will fail due to detected memory leaks. > > Fair warning: I haven't been able to trigger this new job yet. I tried > to run the pipeline with these changes on my forked repo on gitlab, but > did not reach the build-oss-fuzz. I think this is due to some failures > in the Containers-layer-2 stage: > > ... > Error response from daemon: manifest for > registry.gitlab.com/a1xndr/qemu/qemu/debian-all-test-cross:latest not > found: manifest unknown: manifest unknown > #2 [internal] load .dockerignore > #2 transferring context: > #2 transferring context: 2B 0.1s done > #2 DONE 0.1s > #1 [internal] load build definition from tmpg8j4xoop.docker > #1 transferring dockerfile: 2.21kB 0.1s done > #1 DONE 0.2s > #3 [internal] load metadata for docker.io/qemu/debian10:latest > #3 ERROR: pull access denied, repository does not exist or may require > authorization: server message: insufficient_scope: authorization failed These look like the problems that we've seen with the main repo until two days ago, too, e.g.: https://gitlab.com/qemu-project/qemu/-/jobs/640410842 Maybe Alex (Bennée) can comment on how to resolve them? > > .gitlab-ci.yml | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml > index e96f8794b9..a50df420c9 100644 > --- a/.gitlab-ci.yml > +++ b/.gitlab-ci.yml > @@ -182,6 +182,20 @@ build-fuzzer: > || exit 1 ; >done As mentioned in my other mail, I think you can replace my build-fuzzer job once this is working. > +build-oss-fuzz: > + <<: *native_build_job_definition > + variables: > +IMAGE: fedora > + script: > +- OUT_DIR="./build" CC=clang-9 CXX=clang++-9 CFLAGS="-fsanitize=address" > + LIB_FUZZING_ENGINE="-fsanitize=fuzzer" CFL That "CFL" at the end seems to be a typo (leftover from "CFLAGS")? Also the fedora container does not have clang-9 : https://gitlab.com/huth/qemu/-/jobs/643383032#L28 I think it is at clang 10 already, so maybe just use CC=clang (without version number)? > + ./scripts/oss-fuzz/build.sh > +- for fuzzer in $(find ./build-oss-fuzz/DEST_DIR/ -executable -type f); > do > +grep "LLVMFuzzerTestOneInput" ${fuzzer} > /dev/null 2>&1 || continue > ; > +echo Testing ${fuzzer} ... ; > +"${fuzzer}" -runs=1000 || exit 1 ; > + done Should we exclude the virtio-net tests, since they could leak network traffic to the host? Thomas
Re: [PATCH] e1000e: using bottom half to send packets
On 2020/7/17 下午12:46, Li Qiang wrote: Jason Wang 于2020年7月17日周五 上午11:10写道: On 2020/7/17 上午12:14, Li Qiang wrote: Alexander Bulekov reported a UAF bug related e1000e packets send. -->https://bugs.launchpad.net/qemu/+bug/1886362 This is because the guest trigger a e1000e packet send and set the data's address to e1000e's MMIO address. So when the e1000e do DMA it will write the MMIO again and trigger re-entrancy and finally causes this UAF. Paolo suggested to use a bottom half whenever MMIO is doing complicate things in here: -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html Reference here: 'The easiest solution is to delay processing of descriptors to a bottom half whenever MMIO is doing something complicated. This is also better for latency because it will free the vCPU thread more quickly and leave the work to the I/O thread.' I think several things were missed in this patch (take virtio-net as a reference), do we need the following things: Thanks Jason, In fact I know this, I'm scared for touching this but I want to try. Thanks for your advice. - Cancel the bh when VM is stopped. Ok. I think add a vm state change notifier for e1000e can address this. - A throttle to prevent bh from executing too much timer? Ok, I think add a config timeout and add a timer in e1000e can address this. Sorry, a typo. I meant we probably need a tx_burst as what virtio-net did. - A flag to record whether or not this a pending tx (and migrate it?) Is just a flag enough? Could you explain more about the idea behind processing the virtio-net/e1000e using bh like this? Virtio-net use a tx_waiting variable to record whether or not there's a pending bh. (E.g bh is cancelled due to vmstop, we need reschedule it after vmresume). Maybe we can do something simpler by just schecule bh unconditionally during vm resuming. For example, if the guest trigger a lot of packets send and if the bh is scheduled in IO thread. So will we lost packets? We don't since we don't populate virtqueue which means packets are queued there. Thanks How we avoid this in virtio-net. Thanks, Li Qiang Thanks This patch fixes this UAF. Signed-off-by: Li Qiang --- hw/net/e1000e_core.c | 25 + hw/net/e1000e_core.h | 2 ++ 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index bcd186cac5..6165b04b68 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t val) static void e1000e_set_tctl(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; core->mac[index] = val; if (core->mac[TARC0] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 0); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[0].tx_bh); } if (core->mac[TARC1] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 1); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[1].tx_bh); } } static void e1000e_set_tdt(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; int qidx = e1000e_mq_queue_idx(TDT, index); uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1; core->mac[index] = val & 0x; if (core->mac[tarc_reg] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , qidx); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[qidx].tx_bh); } } @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state) } } +static void e1000e_core_tx_bh(void *opaque) +{ +struct e1000e_tx *tx = opaque; +E1000ECore *core = tx->core; +E1000E_TxRing txr; + +e1000e_tx_ring_init(core, , tx - >tx[0]); +e1000e_start_xmit(core, ); +} + void e1000e_core_pci_realize(E1000ECore *core, const uint16_t *eeprom_templ, @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core, for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_init(>tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS, core->has_vnet); +core->tx[i].core = core; +core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, >tx[i]); } net_rx_pkt_init(>rx_pkt, core->has_vnet); @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core) for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_reset(core->tx[i].tx_pkt); net_tx_pkt_uninit(core->tx[i].tx_pkt); +qemu_bh_delete(core->tx[i].tx_bh); +core->tx[i].tx_bh = NULL; } net_rx_pkt_uninit(core->rx_pkt); diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h index aee32f7e48..94ddc6afc2 100644 --- a/hw/net/e1000e_core.h +++ b/hw/net/e1000e_core.h @@ -77,6 +77,8 @@ struct E1000Core { unsigned char sum_needed; bool cptse;
Re: [PATCH] gitlab-ci.yml: Add fuzzer tests
On 16/07/2020 18.46, Alexander Bulekov wrote: > On 200716 1209, Thomas Huth wrote: >> So far we neither compile-tested nor run any of the new fuzzers in our CI, >> which led to some build failures of the fuzzer code in the past weeks. >> To avoid this problem, add a job to compile the fuzzer code and run some >> loops (which likely don't find any new bugs via fuzzing, but at least we >> know that the code can still be run). >> >> A nice side-effect of this test is that the leak tests are enabled here, >> so we should now notice some of the memory leaks in our code base earlier. >> >> Signed-off-by: Thomas Huth > > Thank you for this, Thomas. I just sent a patch to add a job that runs > similar tests with the build-script that we use on oss-fuzz > Patch: <20200716163330.29141-1-alx...@bu.edu> Good idea! ... but it does not work quite yet, I gave it a try and ended up here: https://gitlab.com/huth/qemu/-/jobs/643353500 > I know that these jobs have a lot of overlap, but there are enough > quirks in the oss-fuzz build-script that I, personally, don't think > they are redundant. I think we should finally go with your approach and compile the fuzzing test via the script. But since that seems to need some more work first, let's go with my patch now, so that we have something for 5.1-rc1, and then when your patch is ready, replace my "build-fuzzer" job with yours, ok? >> +build-fuzzer: >> + <<: *native_build_job_definition >> + variables: >> +IMAGE: fedora >> + script: >> +- mkdir build >> +- cd build >> +- ../configure --cc=clang --cxx=clang++ --enable-fuzzing >> + --target-list=x86_64-softmmu > > https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg02310.html > When/if this gets merged, enable-fuzzing won't build with > AddressSanitizer, by default, so I would add --enable-sanitizers, just > to be safe. Ok, thanks, I'll add that. >> +- make -j"$JOBS" all check-build x86_64-softmmu/fuzz >> +- make check >> +- for fuzzer in i440fx-qos-fork-fuzz i440fx-qos-noreset-fuzz >> +i440fx-qtest-reboot-fuzz virtio-scsi-flags-fuzz virtio-scsi-fuzz ; >> do > > Any reason for these particular fuzzers? I know the virtio-net ones find > crashes pretty quickly, but I dont think that causes a non-zero exit. I did not include the virtio-net fuzzers because I read that warning "May result in network traffic emitted from the process. Run in an isolated network environment." in the help text ... so I was not sure whether they are really suitable for the CI on the gitlab machines? Thomas
Re: [RFC PATCH-for-5.2 v2 2/2] block/block-backend: Let blk_attach_dev() provide helpful error
Daniel P. Berrangé writes: > On Thu, Jul 16, 2020 at 02:37:04PM +0200, Philippe Mathieu-Daudé wrote: >> Let blk_attach_dev() take an Error* object to return helpful >> information. Adapt the callers. >> >> $ qemu-system-arm -M n800 >> qemu-system-arm: sd_init failed: cannot attach blk 'sd0' to device >> 'sd-card' >> blk 'sd0' is already attached by device 'omap2-mmc' >> Drive 'sd0' is already in use because it has been automatically connected >> to another device >> (do you need 'if=none' in the drive options?) >> >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> v2: Rebased after 668f62ec62 ("error: Eliminate error_propagate()") >> --- >> include/sysemu/block-backend.h | 2 +- >> block/block-backend.c| 11 ++- >> hw/block/fdc.c | 4 +--- >> hw/block/swim.c | 4 +--- >> hw/block/xen-block.c | 5 +++-- >> hw/core/qdev-properties-system.c | 16 +--- >> hw/ide/qdev.c| 4 +--- >> hw/scsi/scsi-disk.c | 4 +--- >> 8 files changed, 27 insertions(+), 23 deletions(-) >> >> diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h >> index 8203d7f6f9..118fbad0b4 100644 >> --- a/include/sysemu/block-backend.h >> +++ b/include/sysemu/block-backend.h >> @@ -113,7 +113,7 @@ BlockDeviceIoStatus blk_iostatus(const BlockBackend >> *blk); >> void blk_iostatus_disable(BlockBackend *blk); >> void blk_iostatus_reset(BlockBackend *blk); >> void blk_iostatus_set_err(BlockBackend *blk, int error); >> -int blk_attach_dev(BlockBackend *blk, DeviceState *dev); >> +int blk_attach_dev(BlockBackend *blk, DeviceState *dev, Error **errp); >> void blk_detach_dev(BlockBackend *blk, DeviceState *dev); >> DeviceState *blk_get_attached_dev(BlockBackend *blk); >> char *blk_get_attached_dev_id(BlockBackend *blk); >> diff --git a/block/block-backend.c b/block/block-backend.c >> index 63ff940ef9..b7be0a4619 100644 >> --- a/block/block-backend.c >> +++ b/block/block-backend.c >> @@ -884,12 +884,21 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, >> uint64_t *shared_perm) >> >> /* >> * Attach device model @dev to @blk. >> + * >> + * @blk: Block backend >> + * @dev: Device to attach the block backend to >> + * @errp: pointer to NULL initialized error object >> + * >> * Return 0 on success, -EBUSY when a device model is attached already. >> */ >> -int blk_attach_dev(BlockBackend *blk, DeviceState *dev) >> +int blk_attach_dev(BlockBackend *blk, DeviceState *dev, Error **errp) >> { >> trace_blk_attach_dev(blk_name(blk), object_get_typename(OBJECT(dev))); >> if (blk->dev) { >> +error_setg(errp, "cannot attach blk '%s' to device '%s'", >> + blk_name(blk), object_get_typename(OBJECT(dev))); >> +error_append_hint(errp, "blk '%s' is already attached by device >> '%s'\n", >> + blk_name(blk), >> object_get_typename(OBJECT(blk->dev))); > > I would have a preference for expanding the main error message and not > using a hint. Any hint is completely thrown away when using QMP :-( Hints work best in cases like error message hint suggesting things to try to fix it, in CLI syntax error message rejecting a configuration value hint listing possible values, in CLI syntax Why "in CLI syntax"? Well, we need to use *some* syntax to be useful. Hints have always been phrased for the CLI, simply because the hint feature grew out of my disgust over the loss of lovingly written CLI hints in the conversion to Error. Hints in CLI syntax would be misleading QMP. We never extended QMP to transport hints. Hints may tempt you in a case like error message that is painfully long, because it really tries hard to explain, which is laudable in theory, but sadly illegible in practice; also, interminable sentences, meh, this is an error message, not a Joyce novel to get something like terse error message Explanation that is rather long, because it really tries hard to explain. It can be multiple sentences, and lines are wrapped at a reasonable length. Comes out okay in the CLI, but the explanation is lost in QMP. I don't have a good solution to offer for errors that genuinely need explaining.
Re: sysbus_create_simple Vs qdev_create
Eduardo Habkost writes: > I'd also note that the use of "parent" in the code is also > ambiguous. It can mean: > > * QOM parent type, i.e. TypeInfo.parent. Related fields: > * parent_class members of class structs > * parent_obj members of object structs I hate the use of "parent" and "child" for a super- / subtype relation. Correcting the terminology there would be short term pain for long term gain. Worthwhile? > * QOM composition tree parent object, i.e. Object::parent > * qdev device parent bus, i.e. DeviceState::parent_bus > * parent device of of qdev bus, i.e. BusState::parent These are tree relations. Use of "parent" and "child" is perfectly fine.
Re: [PATCH] net: check payload length limit for all frames
Hello Jason, all +-- On Fri, 17 Jul 2020, Jason Wang wrote --+ | On 2020/7/17 上午9:21, Alexander Bulekov wrote: | > On 200717 0853, Li Qiang wrote: | >> Which issue are you trying to solve, any reference linking? | >> I also send a patch related this part and also a UAF. | > | > I reported a UAF privately to QEMU-Security in May. I believe the one Li | > is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362 | > | > When I saw Prasad's email, I was worried that I reported the same bug | > twice, but I can still reproduce LP#1886362 with Prasad's patch. | > | > On the other hand, I cannot reproduce either issue with Li's patch: | > Message-Id: <20200716161453.61295-1-liq...@163.com> | > | > Based on this, I think there were two distinct issues. Yes, LP#1886362 looks different that the one fixed here. | Could you describe the issue you saw in details? (E.g the calltrace?) The | commit log does not explain where we can get OOB or UAF. I should've included the backtrace in the commit message. It crossed my mind after I sent the patch. Sorry. === 1040323==ERROR: AddressSanitizer: heap-use-after-free on address 0x606344d8 at pc 0x5571b8fb9ce7 bp 0x7ffede5a2290 sp 0x7ffede5a2280 READ of size 8 at 0x606344d8 thread T0 #0 0x5571b8fb9ce6 in e1000e_write_packet_to_guest hw/net/e1000e_core.c:1587 #1 0x5571b8fba8fc in e1000e_receive_iov hw/net/e1000e_core.c:1709 #2 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #3 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #4 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #5 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #6 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #7 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #8 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #9 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451 #10 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #11 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 ... 0x606344d8 is located 24 bytes inside of 64-byte region [0x606344c0,0x60634500) freed by thread T0 here: #0 0x7f89e3b87307 in __interceptor_free (/lib64/libasan.so.6+0xb0307) #1 0x7f89e37c7a6c in g_free (/lib64/libglib-2.0.so.0+0x58a6c) #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #12 0x5571b8fbe011 in e1000e_set_tctl hw/net/e1000e_core.c:2431 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 ... previously allocated by thread T0 here: #0 0x7f89e3b87667 in __interceptor_malloc (/lib64/libasan.so.6+0xb0667) #1 0x7f89e37c7978 in g_malloc (/lib64/libglib-2.0.so.0+0x58978) #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934 #12 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109 === | >>> --- a/hw/net/net_tx_pkt.c | >>> +++ b/hw/net/net_tx_pkt.c | >>> @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, | >>> NetClientState *nc) | >>>* Since underlying infrastructure does not support IP datagrams | >>>longer | >>>* than 64K we should drop such packets and don't even try to send | >>>*/ | >>> -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) { | >>> -if (pkt->payload_len > | >>> -ETH_MAX_IP_DGRAM_LEN - | >>> -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { | >>> -return false; | >>> -} | >>> +if
Re: [PATCH] e1000e: using bottom half to send packets
Jason Wang 于2020年7月17日周五 上午11:10写道: > > > On 2020/7/17 上午12:14, Li Qiang wrote: > > Alexander Bulekov reported a UAF bug related e1000e packets send. > > > > -->https://bugs.launchpad.net/qemu/+bug/1886362 > > > > This is because the guest trigger a e1000e packet send and set the > > data's address to e1000e's MMIO address. So when the e1000e do DMA > > it will write the MMIO again and trigger re-entrancy and finally > > causes this UAF. > > > > Paolo suggested to use a bottom half whenever MMIO is doing complicate > > things in here: > > -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html > > > > Reference here: > > 'The easiest solution is to delay processing of descriptors to a bottom > > half whenever MMIO is doing something complicated. This is also better > > for latency because it will free the vCPU thread more quickly and leave > > the work to the I/O thread.' > > > I think several things were missed in this patch (take virtio-net as a > reference), do we need the following things: > Thanks Jason, In fact I know this, I'm scared for touching this but I want to try. Thanks for your advice. > - Cancel the bh when VM is stopped. Ok. I think add a vm state change notifier for e1000e can address this. > - A throttle to prevent bh from executing too much timer? Ok, I think add a config timeout and add a timer in e1000e can address this. > - A flag to record whether or not this a pending tx (and migrate it?) Is just a flag enough? Could you explain more about the idea behind processing the virtio-net/e1000e using bh like this? For example, if the guest trigger a lot of packets send and if the bh is scheduled in IO thread. So will we lost packets? How we avoid this in virtio-net. Thanks, Li Qiang > > Thanks > > > > > > This patch fixes this UAF. > > > > Signed-off-by: Li Qiang > > --- > > hw/net/e1000e_core.c | 25 + > > hw/net/e1000e_core.h | 2 ++ > > 2 files changed, 19 insertions(+), 8 deletions(-) > > > > diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c > > index bcd186cac5..6165b04b68 100644 > > --- a/hw/net/e1000e_core.c > > +++ b/hw/net/e1000e_core.c > > @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, > > uint32_t val) > > static void > > e1000e_set_tctl(E1000ECore *core, int index, uint32_t val) > > { > > -E1000E_TxRing txr; > > core->mac[index] = val; > > > > if (core->mac[TARC0] & E1000_TARC_ENABLE) { > > -e1000e_tx_ring_init(core, , 0); > > -e1000e_start_xmit(core, ); > > +qemu_bh_schedule(core->tx[0].tx_bh); > > } > > > > if (core->mac[TARC1] & E1000_TARC_ENABLE) { > > -e1000e_tx_ring_init(core, , 1); > > -e1000e_start_xmit(core, ); > > +qemu_bh_schedule(core->tx[1].tx_bh); > > } > > } > > > > static void > > e1000e_set_tdt(E1000ECore *core, int index, uint32_t val) > > { > > -E1000E_TxRing txr; > > int qidx = e1000e_mq_queue_idx(TDT, index); > > uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1; > > > > core->mac[index] = val & 0x; > > > > if (core->mac[tarc_reg] & E1000_TARC_ENABLE) { > > -e1000e_tx_ring_init(core, , qidx); > > -e1000e_start_xmit(core, ); > > +qemu_bh_schedule(core->tx[qidx].tx_bh); > > } > > } > > > > @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, > > RunState state) > > } > > } > > > > +static void e1000e_core_tx_bh(void *opaque) > > +{ > > +struct e1000e_tx *tx = opaque; > > +E1000ECore *core = tx->core; > > +E1000E_TxRing txr; > > + > > +e1000e_tx_ring_init(core, , tx - >tx[0]); > > +e1000e_start_xmit(core, ); > > +} > > + > > void > > e1000e_core_pci_realize(E1000ECore *core, > > const uint16_t *eeprom_templ, > > @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core, > > for (i = 0; i < E1000E_NUM_QUEUES; i++) { > > net_tx_pkt_init(>tx[i].tx_pkt, core->owner, > > E1000E_MAX_TX_FRAGS, core->has_vnet); > > +core->tx[i].core = core; > > +core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, >tx[i]); > > } > > > > net_rx_pkt_init(>rx_pkt, core->has_vnet); > > @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core) > > for (i = 0; i < E1000E_NUM_QUEUES; i++) { > > net_tx_pkt_reset(core->tx[i].tx_pkt); > > net_tx_pkt_uninit(core->tx[i].tx_pkt); > > +qemu_bh_delete(core->tx[i].tx_bh); > > +core->tx[i].tx_bh = NULL; > > } > > > > net_rx_pkt_uninit(core->rx_pkt); > > diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h > > index aee32f7e48..94ddc6afc2 100644 > > --- a/hw/net/e1000e_core.h > > +++ b/hw/net/e1000e_core.h > > @@ -77,6 +77,8 @@ struct E1000Core { > > unsigned char sum_needed; > > bool cptse; > > struct NetTxPkt *tx_pkt; > > +QEMUBH *tx_bh; > > +
[PATCH] Fix vhost-user buffer over-read on ram hot-unplug
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol feature introduced a shadow-table, used by the backend to dynamically determine how a vdev's memory regions have changed since the last vhost_user_set_mem_table() call. On hot-remove, a memmove() operation is used to overwrite the removed shadow region descriptor(s). The size parameter of this memmove was off by 1 such that if a VM with a backend supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's shadow-table (by performing the maximum number of supported hot-add operatons) and attempted to remove the last region, Qemu would read an out of bounds value and potentially crash. This change fixes the memmove() bounds such that this erroneous read can never happen. Signed-off-by: Peter Turschmid Signed-off-by: Raphael Norwitz --- hw/virtio/vhost-user.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 3123121..d7e2423 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -672,7 +672,7 @@ static int send_remove_regions(struct vhost_dev *dev, memmove(>shadow_regions[shadow_reg_idx], >shadow_regions[shadow_reg_idx + 1], sizeof(struct vhost_memory_region) * -(u->num_shadow_regions - shadow_reg_idx)); +(u->num_shadow_regions - shadow_reg_idx - 1)); u->num_shadow_regions--; } -- 1.8.3.1
Re: [virtio-dev] [RFC for Linux v4 0/2] virtio_balloon: Add VIRTIO_BALLOON_F_CONT_PAGES to report continuous pages
> 2020年7月16日 18:45,Michael S. Tsirkin 写道: > > On Thu, Jul 16, 2020 at 03:01:18PM +0800, teawater wrote: >> >> >>> 2020年7月16日 14:38,Michael S. Tsirkin 写道: >>> >>> On Thu, Jul 16, 2020 at 10:41:50AM +0800, Hui Zhu wrote: The first, second and third version are in [1], [2] and [3]. Code of current version for Linux and qemu is available in [4] and [5]. Update of this version: 1. Report continuous pages will increase the speed. So added deflate continuous pages. 2. According to the comments from David in [6], added 2 new vqs inflate_cont_vq and deflate_cont_vq to report continuous pages with format 32 bits pfn and 32 bits size. Following is the introduction of the function. These patches add VIRTIO_BALLOON_F_CONT_PAGES to virtio_balloon. With this flag, balloon tries to use continuous pages to inflate and deflate. Opening this flag can bring two benefits: 1. Report continuous pages will increase memory report size of each time call tell_host. Then it will increase the speed of balloon inflate and deflate. 2. Host THPs will be splitted when qemu release the page of balloon inflate. Inflate balloon with continuous pages will let QEMU release the pages of same THPs. That will help decrease the splitted THPs number in the host. Following is an example in a VM with 1G memory 1CPU. This test setups an environment that has a lot of fragmentation pages. Then inflate balloon will split the THPs. >> >> // This is the THP number before VM execution in the host. // None use THP. cat /proc/meminfo | grep AnonHugePages: AnonHugePages: 0 kB >> These lines are from host. >> // After VM start, use usemem // (https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git) // punch-holes function generates 400m fragmentation pages in the guest // kernel. usemem --punch-holes -s -1 800m & >> These lines are from guest. They setups the environment that has a lot of >> fragmentation pages. >> // This is the THP number after this command in the host. // Some THP is used by VM because usemem will access 800M memory // in the guest. cat /proc/meminfo | grep AnonHugePages: AnonHugePages:911360 kB >> These lines are from host. >> // Connect to the QEMU monitor, setup balloon, and set it size to 600M. (qemu) device_add virtio-balloon-pci,id=balloon1 (qemu) info balloon balloon: actual=1024 (qemu) balloon 600 (qemu) info balloon balloon: actual=600 >> These lines are from host. >> // This is the THP number after inflate the balloon in the host. cat /proc/meminfo | grep AnonHugePages: AnonHugePages: 88064 kB >> These lines are from host. >> // Set the size back to 1024M in the QEMU monitor. (qemu) balloon 1024 (qemu) info balloon balloon: actual=1024 >> These lines are from host. >> // Use usemem to increase the memory usage of QEMU. killall usemem usemem 800m >> These lines are from guest. >> // This is the THP number after this operation. cat /proc/meminfo | grep AnonHugePages: AnonHugePages: 65536 kB >> These lines are from host. >> >> >> Following example change to use continuous pages balloon. The number of splitted THPs is decreased. // This is the THP number before VM execution in the host. // None use THP. cat /proc/meminfo | grep AnonHugePages: AnonHugePages: 0 kB >> These lines are from host. >> // After VM start, use usemem punch-holes function generates 400M // fragmentation pages in the guest kernel. usemem --punch-holes -s -1 800m & >> These lines are from guest. They setups the environment that has a lot of >> fragmentation pages. >> // This is the THP number after this command in the host. // Some THP is used by VM because usemem will access 800M memory // in the guest. cat /proc/meminfo | grep AnonHugePages: AnonHugePages:911360 kB >> These lines are from host. >> // Connect to the QEMU monitor, setup balloon, and set it size to 600M. (qemu) device_add virtio-balloon-pci,id=balloon1,cont-pages=on (qemu) info balloon balloon: actual=1024 (qemu) balloon 600 (qemu) info balloon balloon: actual=600 >> These lines are from host. >> // This is the THP number after inflate the balloon in the host. cat /proc/meminfo | grep AnonHugePages: AnonHugePages:616448 kB // Set the size back to 1024M in the QEMU monitor. (qemu) balloon 1024 (qemu) info balloon balloon: actual=1024 >> These lines are from host. >> // Use usemem to increase the memory usage of QEMU. killall usemem usemem 800m >> These lines are from guest. >> // This is the THP number after this operation. cat
[PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image
The following changes since commit 1038a309ec829f05a3a3e52a9951cfdb24dfd02c: spapr: Add a new level of NUMA for GPUs (2020-07-17 10:36:28 +1000) are available in the Git repository at: g...@github.com:aik/qemu.git tags/qemu-slof-20200717 for you to fetch changes up to 7f5258dd8327d574de455a2271788474fa25548d: pseries: Update SLOF firmware image (2020-07-17 13:23:00 +1000) Alexey Kardashevskiy (1): pseries: Update SLOF firmware image pc-bios/README | 2 +- pc-bios/slof.bin | Bin 965112 -> 968368 bytes roms/SLOF| 2 +- 3 files changed, 2 insertions(+), 2 deletions(-) *** Note: this is not for master, this is for pseries This adds tcgbios (this was posted earlier [1] but got lost) and fixes FDT update at ibm,client-architecture-support for huge guests. The full list of changes: Alexey Kardashevskiy (4): make: Define default rule for .c when V=1 or V=2 version: update to 20200513 fdt: Avoid recursion when traversing tree version: update to 20200717 Gustavo Romero (1): board-qemu: Fix comment about SLOF start address Stefan Berger (6): tcgbios: Only write logs for PCRs that are allocated tcgbios: Fix the vendorInfoSize to be of type uint8_t tcgbios: Add support for SHA3 type of algorithms elf: Implement elf_get_file_size to determine size of an ELF image tcgbios: Implement tpm_hash_log_extend_event_buffer tcgbios: Measure the bootloader file read from disk [1] https://patchwork.ozlabs.org/project/qemu-devel/patch/20200513024355.121476-1-...@ozlabs.ru/
Re: [PATCH] net: check payload length limit for all frames
On 2020/7/17 上午9:21, Alexander Bulekov wrote: On 200717 0853, Li Qiang wrote: P J P 于2020年7月17日周五 上午3:26写道: From: Prasad J Pandit While sending packets, the check that packet 'payload_len' is within 64kB limit, seems to happen only for GSO frames. It may lead to use-after-free or out-of-bounds access like issues when sending non-GSO frames. Check the 'payload_len' limit for all packets, irrespective of the gso type. Hello Prasad, Which issue are you trying to solve, any reference linking? I also send a patch related this part and also a UAF. Thanks, Li Qiang Hi Li, Prasad, I reported a UAF privately to QEMU-Security in May. I believe the one Li is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362 When I saw Prasad's email, I was worried that I reported the same bug twice, but I can still reproduce LP#1886362 with Prasad's patch. On the other hand, I cannot reproduce either issue with Li's patch: Message-Id: <20200716161453.61295-1-liq...@163.com> Based on this, I think there were two distinct issues. Both of the crashes rely on e1000e tx loopback into e1000e MMIO. Since Li's patch adds a TX bh, it seems to mitigate such types of issues. Sorry about any confusion. -Alex Could you describe the issue you saw in details? (E.g the calltrace?) The commit log does not explain where we can get OOB or UAF. Thanks Reported-by: Alexander Bulekov Signed-off-by: Prasad J Pandit --- hw/net/net_tx_pkt.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c index 162f802dd7..e66998a8f9 100644 --- a/hw/net/net_tx_pkt.c +++ b/hw/net/net_tx_pkt.c @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc) * Since underlying infrastructure does not support IP datagrams longer * than 64K we should drop such packets and don't even try to send */ -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) { -if (pkt->payload_len > -ETH_MAX_IP_DGRAM_LEN - -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { -return false; -} +if (pkt->payload_len > +ETH_MAX_IP_DGRAM_LEN - +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { +return false; } if (pkt->has_virt_hdr || -- 2.26.2
Re: [PATCH] e1000e: using bottom half to send packets
On 2020/7/17 上午12:14, Li Qiang wrote: Alexander Bulekov reported a UAF bug related e1000e packets send. -->https://bugs.launchpad.net/qemu/+bug/1886362 This is because the guest trigger a e1000e packet send and set the data's address to e1000e's MMIO address. So when the e1000e do DMA it will write the MMIO again and trigger re-entrancy and finally causes this UAF. Paolo suggested to use a bottom half whenever MMIO is doing complicate things in here: -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html Reference here: 'The easiest solution is to delay processing of descriptors to a bottom half whenever MMIO is doing something complicated. This is also better for latency because it will free the vCPU thread more quickly and leave the work to the I/O thread.' I think several things were missed in this patch (take virtio-net as a reference), do we need the following things: - Cancel the bh when VM is stopped. - A throttle to prevent bh from executing too much timer? - A flag to record whether or not this a pending tx (and migrate it?) Thanks This patch fixes this UAF. Signed-off-by: Li Qiang --- hw/net/e1000e_core.c | 25 + hw/net/e1000e_core.h | 2 ++ 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index bcd186cac5..6165b04b68 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t val) static void e1000e_set_tctl(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; core->mac[index] = val; if (core->mac[TARC0] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 0); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[0].tx_bh); } if (core->mac[TARC1] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 1); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[1].tx_bh); } } static void e1000e_set_tdt(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; int qidx = e1000e_mq_queue_idx(TDT, index); uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1; core->mac[index] = val & 0x; if (core->mac[tarc_reg] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , qidx); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[qidx].tx_bh); } } @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state) } } +static void e1000e_core_tx_bh(void *opaque) +{ +struct e1000e_tx *tx = opaque; +E1000ECore *core = tx->core; +E1000E_TxRing txr; + +e1000e_tx_ring_init(core, , tx - >tx[0]); +e1000e_start_xmit(core, ); +} + void e1000e_core_pci_realize(E1000ECore *core, const uint16_t *eeprom_templ, @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core, for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_init(>tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS, core->has_vnet); +core->tx[i].core = core; +core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, >tx[i]); } net_rx_pkt_init(>rx_pkt, core->has_vnet); @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core) for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_reset(core->tx[i].tx_pkt); net_tx_pkt_uninit(core->tx[i].tx_pkt); +qemu_bh_delete(core->tx[i].tx_bh); +core->tx[i].tx_bh = NULL; } net_rx_pkt_uninit(core->rx_pkt); diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h index aee32f7e48..94ddc6afc2 100644 --- a/hw/net/e1000e_core.h +++ b/hw/net/e1000e_core.h @@ -77,6 +77,8 @@ struct E1000Core { unsigned char sum_needed; bool cptse; struct NetTxPkt *tx_pkt; +QEMUBH *tx_bh; +E1000ECore *core; } tx[E1000E_NUM_QUEUES]; struct NetRxPkt *rx_pkt;
Re: [PATCH] net: check payload length limit for all frames
On 200717 0853, Li Qiang wrote: > P J P 于2020年7月17日周五 上午3:26写道: > > > > From: Prasad J Pandit > > > > While sending packets, the check that packet 'payload_len' > > is within 64kB limit, seems to happen only for GSO frames. > > It may lead to use-after-free or out-of-bounds access like > > issues when sending non-GSO frames. Check the 'payload_len' > > limit for all packets, irrespective of the gso type. > > > > Hello Prasad, > Which issue are you trying to solve, any reference linking? > > I also send a patch related this part and also a UAF. > > Thanks, > Li Qiang Hi Li, Prasad, I reported a UAF privately to QEMU-Security in May. I believe the one Li is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362 When I saw Prasad's email, I was worried that I reported the same bug twice, but I can still reproduce LP#1886362 with Prasad's patch. On the other hand, I cannot reproduce either issue with Li's patch: Message-Id: <20200716161453.61295-1-liq...@163.com> Based on this, I think there were two distinct issues. Both of the crashes rely on e1000e tx loopback into e1000e MMIO. Since Li's patch adds a TX bh, it seems to mitigate such types of issues. Sorry about any confusion. -Alex > > Reported-by: Alexander Bulekov > > Signed-off-by: Prasad J Pandit > > --- > > hw/net/net_tx_pkt.c | 10 -- > > 1 file changed, 4 insertions(+), 6 deletions(-) > > > > diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c > > index 162f802dd7..e66998a8f9 100644 > > --- a/hw/net/net_tx_pkt.c > > +++ b/hw/net/net_tx_pkt.c > > @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, > > NetClientState *nc) > > * Since underlying infrastructure does not support IP datagrams longer > > * than 64K we should drop such packets and don't even try to send > > */ > > -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) { > > -if (pkt->payload_len > > > -ETH_MAX_IP_DGRAM_LEN - > > -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { > > -return false; > > -} > > +if (pkt->payload_len > > > +ETH_MAX_IP_DGRAM_LEN - > > +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { > > +return false; > > } > > > > if (pkt->has_virt_hdr || > > -- > > 2.26.2 > > > >
[PATCH] usb: only build hcd-dwc2 host controller for RASPI target
The hcd-dwc2 host controller is currently built for all targets. Since for now hcd-dwc2 is only implemented on RASPI, restrict its build to that target only. Signed-off-by: Paul Zimmerman --- Hi Gerd, Do we want to apply this before the 5.1.0 release? It seems a waste to build this code for every target when it's only used on one. Sorry I didn't realize this earlier. Thanks, Paul hw/arm/Kconfig | 1 + hw/usb/Kconfig | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig index 4a224a6351..bc3a423940 100644 --- a/hw/arm/Kconfig +++ b/hw/arm/Kconfig @@ -315,6 +315,7 @@ config RASPI select FRAMEBUFFER select PL011 # UART select SDHCI +select USB_DWC2 config STM32F205_SOC bool diff --git a/hw/usb/Kconfig b/hw/usb/Kconfig index d4d8c37c28..5e63dc75f8 100644 --- a/hw/usb/Kconfig +++ b/hw/usb/Kconfig @@ -48,7 +48,6 @@ config USB_MUSB config USB_DWC2 bool -default y select USB config TUSB6010 -- 2.17.1
Re: [PATCH] net: check payload length limit for all frames
P J P 于2020年7月17日周五 上午3:26写道: > > From: Prasad J Pandit > > While sending packets, the check that packet 'payload_len' > is within 64kB limit, seems to happen only for GSO frames. > It may lead to use-after-free or out-of-bounds access like > issues when sending non-GSO frames. Check the 'payload_len' > limit for all packets, irrespective of the gso type. > Hello Prasad, Which issue are you trying to solve, any reference linking? I also send a patch related this part and also a UAF. Thanks, Li Qiang > Reported-by: Alexander Bulekov > Signed-off-by: Prasad J Pandit > --- > hw/net/net_tx_pkt.c | 10 -- > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c > index 162f802dd7..e66998a8f9 100644 > --- a/hw/net/net_tx_pkt.c > +++ b/hw/net/net_tx_pkt.c > @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, > NetClientState *nc) > * Since underlying infrastructure does not support IP datagrams longer > * than 64K we should drop such packets and don't even try to send > */ > -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) { > -if (pkt->payload_len > > -ETH_MAX_IP_DGRAM_LEN - > -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { > -return false; > -} > +if (pkt->payload_len > > +ETH_MAX_IP_DGRAM_LEN - > +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { > +return false; > } > > if (pkt->has_virt_hdr || > -- > 2.26.2 > >
Re: [PATCH] spapr_pci: Robustify support of PCI bridges
On Thu, Jul 16, 2020 at 04:57:54PM +0200, Greg Kurz wrote: > On Thu, 16 Jul 2020 16:23:52 +0200 > Markus Armbruster wrote: > > > David Gibson writes: > > > > > On Thu, Jul 16, 2020 at 12:32:44PM +0200, Greg Kurz wrote: > > >> On Thu, 16 Jul 2020 14:45:40 +1000 > > >> David Gibson wrote: > > >> > > >> > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote: > > >> > > Some recent error handling cleanups unveiled issues with our support > > >> > > of > > >> > > PCI bridges: > > >> > > > > >> > > 1) QEMU aborts when using non-standard PCI bridge types, > > >> > >unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error > > >> > > handling" > > >> > > > > >> > > $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge > > >> > > Unexpected error in object_property_find() at qom/object.c:1240: > > >> > > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' > > >> > > not found > > >> > > Aborted (core dumped) > > >> > > > >> > Oops, I thought we had a check that we actually had a "pci-bridge" > > >> > device before continuing with the hotplug, but I guess not. > > >> > > >> Ah... are you suggesting we should explicitly check the actual type > > >> of the bridge rather than looking for the "chassis_nr" property ? > > > > > > Uh.. I thought about it, but I don't think it matters much which way > > > we do it. > > > > Would it make sense to add the "chassis_nr" property to *all* PCI > > bridge devices? > > > > I see that the "PCI Express to PCI/PCI-X Bridge Specification" mentions > a "Chassis Number Register" which looks very similar to the what exists > in standard PCI-to-PCI brdiges. This doesn't seem to be implemented in > our "pcie-pci-bridge" device model though, but of course I have no idea > why :) We could consider it, but I don't think there's a lot to be gained by it at this stage. I don't think there's really any reason to want to use bridges other than plain "pci-bridge" on the pseries machine. PCI is a bit weird on pseries, since it's explicitly paravirt. Although you can use extended config space, and thereby PCI-E devices on it, the topology really looks pretty much identical to vanilla PCI. So, I don't think there's any reason to use PCI-E bridges on pseries. Other than PCI-E bridges of various sorts, a quick scan suggests all the other bridge types in qemu are weird variants that are mostly specific to some particular platform. I don't see any reason we'd want those on pseries either. > Maybe Michael or Marcel (cc'd) can share some thoughts about that ? -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH] spapr_pci: Robustify support of PCI bridges
On Thu, Jul 16, 2020 at 04:42:00PM +0200, Greg Kurz wrote: > On Thu, 16 Jul 2020 16:01:18 +0200 > Markus Armbruster wrote: > > > David Gibson writes: > > > > > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote: > > >> Some recent error handling cleanups unveiled issues with our support of > > >> PCI bridges: > > >> > > >> 1) QEMU aborts when using non-standard PCI bridge types, > > >>unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error > > >> handling" > > >> > > >> $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge > > >> Unexpected error in object_property_find() at qom/object.c:1240: > > >> qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not > > >> found > > >> Aborted (core dumped) > > > > > > Oops, I thought we had a check that we actually had a "pci-bridge" > > > device before continuing with the hotplug, but I guess not. > > > > > >> This happens because we assume all PCI bridge types to have a > > >> "chassis_nr" > > >> property. This property only exists with the standard PCI bridge type > > >> "pci-bridge" actually. We could possibly revert 7ef1553dac but it seems > > >> much simpler to check the presence of "chassis_nr" earlier. > > > > > > Hrm, right, 7ef1553dac was not really correct since add_drcs() really > > > can fail. > > > > Right. I failed to see that we can run into a bridge without a > > "chassis_nr" here. And I missed it on review, as well. > > >> 2) QEMU abort if same "chassis_nr" value is used several times, > > >>unveiled by commit d2623129a7de "qom: Drop parameter @errp of > > >>object_property_add() & friends" > > >> > > >> $ qemu-system-ppc64 -M pseries -device pci-bridge,chassis_nr=1 \ > > >> -device pci-bridge,chassis_nr=1 > > >> Unexpected error in object_property_try_add() at qom/object.c:1167: > > >> qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add > > >> duplicate property '4100' to object (type 'container') > > >> Aborted (core dumped) > > > > Before d2623129a7de, the error got *ignored* in > > spapr_dr_connector_new(): > > > > SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type, > > uint32_t id) > > { > > SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type)); > > char *prop_name; > > > > drc->id = id; > > drc->owner = owner; > > prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", > > spapr_drc_index(drc)); > > object_property_add_child(owner, prop_name, OBJECT(drc), > > _abort); > > object_unref(OBJECT(drc)); > > --->object_property_set_bool(OBJECT(drc), true, "realized", NULL); > > g_free(prop_name); > > > > return drc; > > } > > > > I doubt that's healthy. Indeed. > This isn't. The object_property_set_bool() was later converted to > qdev_realize() (thanks again for the cleanups!) but the problem > remains. Realize can fail and I see now reason we don't do proper > error handling when it comes to the DRCs. > > I'll look into fixing that. > > > >> This happens because we assume that "chassis_nr" values are unique, but > > >> nobody enforces that and we end up generating duplicate DRC ids. The PCI > > >> code doesn't really care for duplicate "chassis_nr" properties since it > > >> is only used to initialize the "Chassis Number Register" of the bridge, > > >> with no functional impact on QEMU. So, even if passing the same value > > >> several times might look weird, it never broke anything before, so > > >> I guess we don't necessarily want to enforce strict checking in the PCI > > >> code now. > > > > > > Yeah, I guess. I'm pretty sure that the chassis number of bridges is > > > supposed to be system-unique (well, unique within the PCI domain at > > > least, I guess) as part of the hardware spec. So specifying multiple > > > chassis ids the same is a user error, but we need a better failure > > > mode. > > > > > >> Workaround both issues in the PAPR code: check that the bridge has a > > >> unique and non null "chassis_nr" when plugging it into its parent bus. > > >> > > >> Fixes: 05929a6c5dfe ("spapr: Don't use bus number for building DRC ids") > > > > > > Arguably, it's really fixing 7ef1553dac. > > > > I agree 7ef1553dac broke the "use a bridge that doesn't have property > > 'chassis_nr' case. > > > > I suspect the "duplicate chassis_nr" case has always been broken, and > > d2623129a7de merely uncovered it. > > Yes. I agree. > > If we can trigger the abort with hot-plug, then d2623129a7de made things > > materially worse (new way to accidentally kill your guest and maybe lose > > data), and I'd add a Fixes: blaming it. > > > > Yes it does. > > David, > > Maybe consider folding a third Fixes: tag into this patch ? Done. > > >> Reported-by: Thomas Huth > > >> Signed-off-by: Greg Kurz > > > > > > I had a few misgivings about the details of this, but I
Re: [PATCH v4] spapr: Add a new level of NUMA for GPUs
On Thu, Jul 16, 2020 at 05:56:55PM -0500, Reza Arbab wrote: > NUMA nodes corresponding to GPU memory currently have the same > affinity/distance as normal memory nodes. Add a third NUMA associativity > reference point enabling us to give GPU nodes more distance. > > This is guest visible information, which shouldn't change under a > running guest across migration between different qemu versions, so make > the change effective only in new (pseries > 5.0) machine types. > > Before, `numactl -H` output in a guest with 4 GPUs (nodes 2-5): > > node distances: > node 0 1 2 3 4 5 > 0: 10 40 40 40 40 40 > 1: 40 10 40 40 40 40 > 2: 40 40 10 40 40 40 > 3: 40 40 40 10 40 40 > 4: 40 40 40 40 10 40 > 5: 40 40 40 40 40 10 > > After: > > node distances: > node 0 1 2 3 4 5 > 0: 10 40 80 80 80 80 > 1: 40 10 80 80 80 80 > 2: 80 80 10 80 80 80 > 3: 80 80 80 10 80 80 > 4: 80 80 80 80 10 80 > 5: 80 80 80 80 80 10 > > These are the same distances as on the host, mirroring the change made > to host firmware in skiboot commit f845a648b8cb ("numa/associativity: > Add a new level of NUMA for GPU's"). Applied to ppc-for-5.1. > > Signed-off-by: Reza Arbab > --- > v4: > * Use nvslot->numa_id for distinction at all levels of ibm,associativity > * Use ARRAY_SIZE(refpoints) > * Rebase > > v3: > * Squash into one patch > * Add PHB compat property > --- > hw/ppc/spapr.c | 21 +++-- > hw/ppc/spapr_pci.c | 2 ++ > hw/ppc/spapr_pci_nvlink2.c | 13 ++--- > include/hw/pci-host/spapr.h | 1 + > include/hw/ppc/spapr.h | 1 + > 5 files changed, 33 insertions(+), 5 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 299908cc7396..0ae293ec9431 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -890,10 +890,16 @@ static int spapr_dt_rng(void *fdt) > static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt) > { > MachineState *ms = MACHINE(spapr); > +SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(ms); > int rtas; > GString *hypertas = g_string_sized_new(256); > GString *qemu_hypertas = g_string_sized_new(256); > -uint32_t refpoints[] = { cpu_to_be32(0x4), cpu_to_be32(0x4) }; > +uint32_t refpoints[] = { > +cpu_to_be32(0x4), > +cpu_to_be32(0x4), > +cpu_to_be32(0x2), > +}; > +uint32_t nr_refpoints = ARRAY_SIZE(refpoints); > uint64_t max_device_addr = MACHINE(spapr)->device_memory->base + > memory_region_size((spapr)->device_memory->mr); > uint32_t lrdr_capacity[] = { > @@ -945,8 +951,12 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void > *fdt) > qemu_hypertas->str, qemu_hypertas->len)); > g_string_free(qemu_hypertas, TRUE); > > +if (smc->pre_5_1_assoc_refpoints) { > +nr_refpoints = 2; > +} > + > _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points", > - refpoints, sizeof(refpoints))); > + refpoints, nr_refpoints * sizeof(refpoints[0]))); > > _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains", > maxdomains, sizeof(maxdomains))); > @@ -4584,9 +4594,16 @@ DEFINE_SPAPR_MACHINE(5_1, "5.1", true); > */ > static void spapr_machine_5_0_class_options(MachineClass *mc) > { > +SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc); > +static GlobalProperty compat[] = { > +{ TYPE_SPAPR_PCI_HOST_BRIDGE, "pre-5.1-associativity", "on" }, > +}; > + > spapr_machine_5_1_class_options(mc); > compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len); > +compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat)); > mc->numa_mem_supported = true; > +smc->pre_5_1_assoc_refpoints = true; > } > > DEFINE_SPAPR_MACHINE(5_0, "5.0", false); > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 2a6a48744aaa..16739334e35f 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -2035,6 +2035,8 @@ static Property spapr_phb_properties[] = { > pcie_ecs, true), > DEFINE_PROP_UINT64("gpa", SpaprPhbState, nv2_gpa_win_addr, 0), > DEFINE_PROP_UINT64("atsd", SpaprPhbState, nv2_atsd_win_addr, 0), > +DEFINE_PROP_BOOL("pre-5.1-associativity", SpaprPhbState, > + pre_5_1_assoc, false), > DEFINE_PROP_END_OF_LIST(), > }; > > diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c > index dd8cd6db9654..76ae77ebc851 100644 > --- a/hw/ppc/spapr_pci_nvlink2.c > +++ b/hw/ppc/spapr_pci_nvlink2.c > @@ -362,9 +362,9 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, > void *fdt) > _abort); > uint32_t associativity[] = { > cpu_to_be32(0x4), > -SPAPR_GPU_NUMA_ID, > -SPAPR_GPU_NUMA_ID, > -
Re: [PATCH qemu v9] spapr: Implement Open Firmware client interface
On Thu, Jul 16, 2020 at 07:04:56PM +1000, Alexey Kardashevskiy wrote: > Ping? I kinda realize it is not going to replace SLOF any time soon but > still... Yeah, I know. I just haven't had time to consider it. Priority starvation. > On 07/07/2020 10:34, Alexey Kardashevskiy wrote: > > Ping? > > > > > > On 24/06/2020 10:28, Alexey Kardashevskiy wrote: > >> Ping? > >> > >> On 02/06/2020 21:40, Alexey Kardashevskiy wrote: > >>> Ping? > >>> > >>> On 13/05/2020 13:58, Alexey Kardashevskiy wrote: > The PAPR platform which describes an OS environment that's presented by > a combination of a hypervisor and firmware. The features it specifies > require collaboration between the firmware and the hypervisor. > > Since the beginning, the runtime component of the firmware (RTAS) has > been implemented as a 20 byte shim which simply forwards it to > a hypercall implemented in qemu. The boot time firmware component is > SLOF - but a build that's specific to qemu, and has always needed to be > updated in sync with it. Even though we've managed to limit the amount > of runtime communication we need between qemu and SLOF, there's some, > and it has become increasingly awkward to handle as we've implemented > new features. > > This implements a boot time OF client interface (CI) which is > enabled by a new "x-vof" pseries machine option (stands for "Virtual Open > Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall > which implements Open Firmware Client Interface (OF CI). This allows > using a smaller stateless firmware which does not have to manage > the device tree. > > The new "vof.bin" firmware image is included with source code under > pc-bios/. It also includes RTAS blob. > > This implements a handful of CI methods just to get -kernel/-initrd > working. In particular, this implements the device tree fetching and > simple memory allocator - "claim" (an OF CI memory allocator) and updates > "/memory@0/available" to report the client about available memory. > > This implements changing some device tree properties which we know how > to deal with, the rest is ignored. To allow changes, this skips > fdt_pack() when x-vof=on as not packing the blob leaves some room for > appending. > > In absence of SLOF, this assigns phandles to device tree nodes to make > device tree traversing work. > > When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree. > > This adds basic instances support which are managed by a hash map > ihandle -> [phandle]. > > Before the guest started, the used memory is: > 0..4000 - the initial firmware > 1..18 - stack > > This OF CI does not implement "interpret". > > With this basic support, this can only boot into kernel directly. > However this is just enough for the petitboot kernel and initradmdisk to > boot from any possible source. Note this requires reasonably recent guest > kernel with: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 > > Signed-off-by: Alexey Kardashevskiy > --- > > > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
[PATCH v4] spapr: Add a new level of NUMA for GPUs
NUMA nodes corresponding to GPU memory currently have the same affinity/distance as normal memory nodes. Add a third NUMA associativity reference point enabling us to give GPU nodes more distance. This is guest visible information, which shouldn't change under a running guest across migration between different qemu versions, so make the change effective only in new (pseries > 5.0) machine types. Before, `numactl -H` output in a guest with 4 GPUs (nodes 2-5): node distances: node 0 1 2 3 4 5 0: 10 40 40 40 40 40 1: 40 10 40 40 40 40 2: 40 40 10 40 40 40 3: 40 40 40 10 40 40 4: 40 40 40 40 10 40 5: 40 40 40 40 40 10 After: node distances: node 0 1 2 3 4 5 0: 10 40 80 80 80 80 1: 40 10 80 80 80 80 2: 80 80 10 80 80 80 3: 80 80 80 10 80 80 4: 80 80 80 80 10 80 5: 80 80 80 80 80 10 These are the same distances as on the host, mirroring the change made to host firmware in skiboot commit f845a648b8cb ("numa/associativity: Add a new level of NUMA for GPU's"). Signed-off-by: Reza Arbab --- v4: * Use nvslot->numa_id for distinction at all levels of ibm,associativity * Use ARRAY_SIZE(refpoints) * Rebase v3: * Squash into one patch * Add PHB compat property --- hw/ppc/spapr.c | 21 +++-- hw/ppc/spapr_pci.c | 2 ++ hw/ppc/spapr_pci_nvlink2.c | 13 ++--- include/hw/pci-host/spapr.h | 1 + include/hw/ppc/spapr.h | 1 + 5 files changed, 33 insertions(+), 5 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 299908cc7396..0ae293ec9431 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -890,10 +890,16 @@ static int spapr_dt_rng(void *fdt) static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt) { MachineState *ms = MACHINE(spapr); +SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(ms); int rtas; GString *hypertas = g_string_sized_new(256); GString *qemu_hypertas = g_string_sized_new(256); -uint32_t refpoints[] = { cpu_to_be32(0x4), cpu_to_be32(0x4) }; +uint32_t refpoints[] = { +cpu_to_be32(0x4), +cpu_to_be32(0x4), +cpu_to_be32(0x2), +}; +uint32_t nr_refpoints = ARRAY_SIZE(refpoints); uint64_t max_device_addr = MACHINE(spapr)->device_memory->base + memory_region_size((spapr)->device_memory->mr); uint32_t lrdr_capacity[] = { @@ -945,8 +951,12 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt) qemu_hypertas->str, qemu_hypertas->len)); g_string_free(qemu_hypertas, TRUE); +if (smc->pre_5_1_assoc_refpoints) { +nr_refpoints = 2; +} + _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points", - refpoints, sizeof(refpoints))); + refpoints, nr_refpoints * sizeof(refpoints[0]))); _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains", maxdomains, sizeof(maxdomains))); @@ -4584,9 +4594,16 @@ DEFINE_SPAPR_MACHINE(5_1, "5.1", true); */ static void spapr_machine_5_0_class_options(MachineClass *mc) { +SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc); +static GlobalProperty compat[] = { +{ TYPE_SPAPR_PCI_HOST_BRIDGE, "pre-5.1-associativity", "on" }, +}; + spapr_machine_5_1_class_options(mc); compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len); +compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat)); mc->numa_mem_supported = true; +smc->pre_5_1_assoc_refpoints = true; } DEFINE_SPAPR_MACHINE(5_0, "5.0", false); diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 2a6a48744aaa..16739334e35f 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -2035,6 +2035,8 @@ static Property spapr_phb_properties[] = { pcie_ecs, true), DEFINE_PROP_UINT64("gpa", SpaprPhbState, nv2_gpa_win_addr, 0), DEFINE_PROP_UINT64("atsd", SpaprPhbState, nv2_atsd_win_addr, 0), +DEFINE_PROP_BOOL("pre-5.1-associativity", SpaprPhbState, + pre_5_1_assoc, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c index dd8cd6db9654..76ae77ebc851 100644 --- a/hw/ppc/spapr_pci_nvlink2.c +++ b/hw/ppc/spapr_pci_nvlink2.c @@ -362,9 +362,9 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, void *fdt) _abort); uint32_t associativity[] = { cpu_to_be32(0x4), -SPAPR_GPU_NUMA_ID, -SPAPR_GPU_NUMA_ID, -SPAPR_GPU_NUMA_ID, +cpu_to_be32(nvslot->numa_id), +cpu_to_be32(nvslot->numa_id), +cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id) }; uint64_t size = object_property_get_uint(nv_mrobj, "size", NULL); @@ -375,6 +375,13 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState
Re: TB Cache size grows out of control with qemu 5.0
On Thu, 16 Jul 2020, Alex Bennée wrote: Christian Ehrhardt writes: On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan wrote: See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two following it. Thank you Zoltan for pointing out this commit, I agree that this seems to be the trigger for the issues I'm seeing. Unfortunately the common CI host size is 1-2G. For example on Ubuntu Autopkgtests 1.5G. Those of them running guests do so in 0.5-1G size in TCG mode (as they often can't rely on having KVM available). The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing on memory pressure (never existed) makes these systems go OOM-Killing the qemu process. Ooops. I admit the assumption was that most people running system emulation would be doing it on beefier machines. The patches indicated that the TB flushes on a full guest boot are a good indicator of the TB size efficiency. From my old checks I had: - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4 TB flush count 14, 14, 16 - Qemu 5.0 512M guest with 1G default TB flush count 1, 1, 1 I agree that ram/4 seems odd, especially on huge guests that is a lot potentially wasted. And most environments have a bit of breathing room 1G is too big in small host systems and the common CI system falls into this category. So I tuned it down to 256M for a test. - Qemu 4.2 512M guest with tb-size 256M TB flush count 5, 5, 5 - Qemu 5.0 512M guest with tb-size 256M TB flush count 5, 5, 5 - Qemu 5.0 512M guest with 256M default in code TB flush count 5, 5, 5 So performance wise the results are as much in-between as you'd think from a TB size in between. And the memory consumption which (for me) is the actual current issue to fix would be back in line again as expected. So I'm glad you have the workaround. So on one hand I'm suggesting something like: --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re * Users running large scale system emulation may want to tweak their * runtime setup via the tb-size control on the command line. */ -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB) +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB) The problem we have is any number we pick here is arbitrary. And while it did regress your use-case changing it again just pushes a performance regression onto someone else. The most (*) 64 bit desktop PCs have 16Gb of RAM, almost all have more than 8gb. And there is a workaround. * random number from Steams HW survey. #endif #endif OTOH I understand someone else might want to get the more speedy 1G especially for large guests. If someone used to run a 4G guest in TCG the TB Size was 1G all along. How about picking the smaller of (1G || ram-size/4) as default? This might then look like: --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe { /* Size the buffer. */ if (tb_size == 0) { -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; +unsigned long max_default = (unsigned long)(ram_size / 4); +if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) { +tb_size = max_default; +} else { + tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; +} } if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) { tb_size = MIN_CODE_GEN_BUFFER_SIZE; This is a bit more tricky than it seems as ram_sizes is no more present in that context but it is enough to discuss it. That should serve all cases - small and large - better as a pure static default of 1G or always ram/4? I'm definitely against re-introducing ram_size into the mix. The original commit (a1b18df9a4) that broke this introduced an ordering dependency which we don't want to bring back. I'd be more amenable to something that took into account host memory and limited the default if it was smaller than a threshold. Is there a way to probe that that doesn't involve slurping /proc/meminfo? I agree that this should be dependent on host memory size not guest ram_size but it might be tricky to get that value because different host OSes would need different ways. Maybe a new qemu_host_mem_size portability function will be needed that implements this for different host OSes. POSIX may or may not have sysconf _SC_PHYS_PAGES and _SC_AVPHYS_PAGES and linux has sysinfo but don't know how reliable these are. Regards, BALATON Zoltan
Re: [GIT PULL] I2C updates
On Thu, Jul 16, 2020 at 09:45:41PM +0100, Peter Maydell wrote: > On Thu, 16 Jul 2020 at 18:49, Corey Minyard wrote: > > > > The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09: > > > > Merge remote-tracking branch > > 'remotes/mcayland/tags/qemu-openbios-20200707' into staging (2020-07-10 > > 16:43:40 +0100) > > > > are available in the Git repository at: > > > > https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5 > > > > for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42: > > > > hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500) > > > > > > Minor changes to: > > > > Add an SMBus config entry > > > > Cleanup/simplify/document some I2C interfaces > > > > > > Philippe Mathieu-Daudé (6): > > hw/i2c/Kconfig: Add an entry for the SMBus > > hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus() > > hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new() > > hw/i2c: Rename i2c_realize_and_unref() as > > i2c_slave_realize_and_unref() > > hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple() > > hw/i2c: Document the I2C qdev helpers > > Hi; this failed to build on x86-64 Linux (incremental build): Hmm, I did test this, and I just rebuilt, then rebased on the end of master and rebuilt, without issue. It looks like the smbus code is not being included, but I don't see how that can be. -corey > > LINKi386-softmmu/qemu-system-i386 > ../hw/i2c/smbus_eeprom.o: In function `smbus_eeprom_vmstate_needed': > /home/petmay01/linaro/qemu-for-merges/hw/i2c/smbus_eeprom.c:94: > undefined reference to `smbus_vmstate_needed' > ../hw/i2c/smbus_eeprom.o:(.data.rel+0x50): undefined reference to > `vmstate_smbus_device' > ../hw/i2c/pm_smbus.o: In function `smb_transaction': > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:93: undefined > reference to `smbus_quick_command' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:97: undefined > reference to `smbus_receive_byte' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:100: undefined > reference to `smbus_send_byte' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:105: undefined > reference to `smbus_read_byte' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:108: undefined > reference to `smbus_write_byte' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:114: undefined > reference to `smbus_read_word' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:117: undefined > reference to `smbus_write_word' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:149: undefined > reference to `smbus_read_block' > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:174: undefined > reference to `smbus_write_block' > ../hw/i2c/pm_smbus.o: In function `smb_ioport_writeb': > /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:290: undefined > reference to `smbus_write_block' > ../hw/ipmi/smbus_ipmi.o:(.data.rel+0x50): undefined reference to > `vmstate_smbus_device' > collect2: error: ld returned 1 exit status > > (similarly for other qemu-system-* binary links) > > thanks > -- PMM
Re: sysbus_create_simple Vs qdev_create
On Wed, Jul 15, 2020 at 04:37:18PM +0200, Markus Armbruster wrote: > Pratik Parvati writes: > > > Hi Markus and Philippe, > > > > Thanks for your reply. Now I am pretty clear about Qdev and sysbus helper > > function. > > > > Can you please explain to me in brief on buses and device hierarchies (i.e. > > BusState and DeviceState) and how they are related to each other? As I can > > see, the DeviceState class inherits the BusState > > > > struct DeviceState { > > /*< private >*/ > > Object parent_obj; > > /*< public >*/ > > > > const char *id; > > char *canonical_path; > > bool realized; > > bool pending_deleted_event; > > QemuOpts *opts; > > int hotplugged; > > bool allow_unplug_during_migration; > > BusState *parent_bus; \\ BusState is inherited here > > QLIST_HEAD(, NamedGPIOList) gpios; > > QLIST_HEAD(, BusState) child_bus; > > int num_child_bus; > > int instance_id_alias; > > int alias_required_for_version; > > ResettableState reset; > > }; > > > > and BusState, in turn, inherits the DeviceState as > > > > /** > > * BusState: > > * @hotplug_handler: link to a hotplug handler associated with bus. > > * @reset: ResettableState for the bus; handled by Resettable interface. > > */struct BusState { > > Object obj; > > DeviceState *parent; \\ DeviceState is inherited here > > char *name; > > HotplugHandler *hotplug_handler; > > int max_index; > > bool realized; > > int num_children; > > QTAILQ_HEAD(, BusChild) children; > > QLIST_ENTRY(BusState) sibling; > > ResettableState reset; > > }; > > > > I am a bit confused. Can you brief me this relation! > > We sorely lack introductory documentation on both qdev and QOM. I > believe most developers are more or less confused about them most of the > time. I've done a bit of work on both, so I'm probably less confused > than average. I'm cc'ing maintainers in the hope of reducing average > confusion among participants in this thread. > > DeviceState does not inherit from BusState, and BusState does not > inherit from DeviceState. The relation you marked in the code is > actually "has a". > > A DeviceState may have a BusState, namely the bus the device is plugged > into. "May", because some devices are bus-less (their > DEVICE_GET_CLASS(dev)->bus_type is null), and the others get plugged > into their bus only at realize time. > > Example: PCI device "pci-serial" plugs into a PCI bus. > > Example: device "serial" does not plug into a bus (its used as component > device of "pci-serial" and other devices). > > Example: device "pc-dimm" does not plug into a bus. > > A bus has a DeviceState, namely the device providing this bus. > > Example: device "i440FX-pcihost" provides PCI bus "pci.0". > > Both DeviceState and BusState are QOM subtypes of Object. I prefer to > avoid use of "inherit" for that, because it can mean different things to > different people. I'd also note that the use of "parent" in the code is also ambiguous. It can mean: * QOM parent type, i.e. TypeInfo.parent. Related fields: * parent_class members of class structs * parent_obj members of object structs * QOM composition tree parent object, i.e. Object::parent * qdev device parent bus, i.e. DeviceState::parent_bus * parent device of of qdev bus, i.e. BusState::parent -- Eduardo
Re: [PATCH 2/2] configure: add support for Control-Flow Integrity
On 7/2/2020 5:52 AM, Daniel P. Berrangé wrote: The need to maintain this list of functions makes me feel very uneasy. How can we have any confidence that this list of functions is accurate ? How will maintainers ensure that they correctly update it as they are writing/changing code, and how will they test the result ? Hi Daniel, I gave it some thought and studied more of clang's options. It is possible to disable cfi on specific functions by using an __attribute__ keyword, instead of providing a list in an external file. In terms of maintaining, this is much better since we are removing the need to update the list. I would suggest defining a macro, __disable_cfi__, that can be prepended to a function. The macro would expand to nothing if cfi is disabled, or to the proper attribute if it is enabled. Here's example code snippet /* Disable CFI checks. * The callback function has been loaded from an external library so we do not * have type information */ __disable_cfi__ void qemu_plugin_vcpu_syscall_ret(CPUState *cpu, int64_t num, int64_t ret) { ... } This would take care of renaming and removal of functions that need cfi. It would also probably be beneficial to the developers since they can see immediately that the function they are working with needs to have CFI checks disabled, and why. If you think this is a better approach, I'll submit v2 with this approach instead of the external function list. For new code, however, the best thing is proper education and testing. I'll work on a document for docs/devel to detail what it is and how to make code cfi-safe. A good approach should be to test code changes with CFI enabled. CFI is, after all, a sanitizer and therefore it makes sense to use it also during development, together with ASan, UBSan and the likes. Unfortunately, since it works only with clang, I don't think this can ever be a hard requirement. Daniele
hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore
Hi Gerd, I'm trying to build qemu 5.1.0-rc0 in Fedora. I'm hitting some issues. Using this configure line: ./configure --prefix=/usr --libdir=/usr/lib64 --sysconfdir=/etc --localstatedir=/var --libexecdir=/usr/libexec --interp-prefix=/usr/qemu-%M --with-pkgversion=qemu-5.1.0-0.1.rc0.fc33 '--extra-ldflags=-Wl,--build-id -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' --enable-trace-backend=dtrace --audio-drv-list=pa,sdl,alsa,oss --enable-kvm --target-list=x86_64-softmmu --enable-pie --enable-modules --enable-spice Build and then run: $ ./x86_64-softmmu/qemu-system-x86_64 -device \? | grep qxl Failed to open module: /home/crobinso/src/qemu/x86_64-softmmu/../hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore That error breaks iotests 127: --- /home/crobinso/src/qemu/tests/qemu-iotests/127.out 2020-07-15 04:00:10.589138586 -0400 +++ /home/crobinso/src/qemu/tests/qemu-iotests/127.out.bad 2020-07-16 16:44:37.717248172 -0400 @@ -1,4 +1,5 @@ QA output created by 127 +Failed to open module: /home/crobinso/src/qemu/x86_64-softmmu/../hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536 Formatting 'TEST_DIR/t.IMGFMT.overlay0', fmt=IMGFMT size=65536 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT Formatting 'TEST_DIR/t.IMGFMT.overlay1', fmt=IMGFMT size=65536 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT Doing a build with every target and running 'make check' will show undefined symbol errors for other targets with hw-display-qxl too. Most reference the qxl_io_log call but some are like: Failed to open module: /home/crobinso/src/qemu/microblaze-softmmu/../hw-display-qxl.so: undefined symbol: vga_ioport_read Also as a side note though I think it's pre-existing: running the test suite with --enable-modules while there are host installed modules is very noisy with lots of repetitive warnings like: Failed to initialize module: /usr/lib64/qemu/audio-oss.so Note: only modules from the same build can be loaded. Failed to initialize module: /usr/lib64/qemu/audio-pa.so Note: only modules from the same build can be loaded. Failed to initialize module: /usr/lib64/qemu/audio-sdl.so Note: only modules from the same build can be loaded. Failed to initialize module: /usr/lib64/qemu/ui-curses.so Note: only modules from the same build can be loaded. Failed to initialize module: /usr/lib64/qemu/ui-gtk.so Note: only modules from the same build can be loaded. It would be nice if those could be avoided somehow. Maybe QEMU_MODULE_DIR can help? Thanks, Cole
Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception
On 7/16/20 1:12 PM, Peter Maydell wrote: > On Thu, 16 Jul 2020 at 11:08, Luc Michel wrote: >> >> When single-stepping with a debugger attached to QEMU, and when an >> exception is raised, the debugger misses the first instruction after the >> exception: > > This is a long-standing bug; thanks for looking at it. > (https://bugs.launchpad.net/qemu/+bug/757702) > > >> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c >> index d95c4848a4..e85fab5d40 100644 >> --- a/accel/tcg/cpu-exec.c >> +++ b/accel/tcg/cpu-exec.c >> @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, >> int *ret) >> CPUClass *cc = CPU_GET_CLASS(cpu); >> qemu_mutex_lock_iothread(); >> cc->do_interrupt(cpu); >> qemu_mutex_unlock_iothread(); >> cpu->exception_index = -1; >> + >> +if (unlikely(cpu->singlestep_enabled)) { >> +/* >> + * After processing the exception, ensure an EXCP_DEBUG is >> + * raised when single-stepping so that GDB doesn't miss the >> + * next instruction. >> + */ >> +cpu->exception_index = EXCP_DEBUG; >> +return cpu_handle_exception(cpu, ret); >> +} > > I like the idea of being able to do this generically in > the main loop. > > How about interrupts? If we are single-stepping and we > take an interrupt I guess we want to stop before the first > insn of the interrupt handler rather than after it, which > would imply a similar change to cpu_handle_interrupt(). Fair. I think something like this: if (cc->cpu_exec_interrupt(cpu, interrupt_request)) { replay_interrupt(); - cpu->exception_index = -1; + cpu->exception_index = + (cpu->singlestep_enabled ? EXCP_DEBUG : -1); *last_tb = NULL; } I'm not quite sure how to test this though... Probably best to keep this a separate patch anyway. r~
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
On 7/16/20 10:15 PM, Peter Maydell wrote: > On Thu, 16 Jul 2020 at 20:52, Michael Roth wrote: >> But is it intermittent, environment-dependent? I'm trying to understand how >> to >> replicate Peter's result since it seems like it would be straightforward >> reproducer. > > I blew away all my build trees and recreated them from > scratch, and the issue went away. I'm suspicious that the > complete lack of .d files was induced by a failed earlier > pullreq attempt and left the build tree in a messed up state > where it wouldn't notice that it needed to rebuild files. If it happens again, can you try to revert aaa1b70a0b ("Makefile: simplify MINIKCONF rules") on top of the tag you are testing, and re-run the testing?
Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj
On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen wrote: > > On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé > wrote: > > > > On 7/15/20 11:00 AM, Markus Armbruster wrote: > > > Now my point. Why first make up user configuration, then use that to > > > create a BlockBackend, when you could just go ahead and create the > > > BlockBackend? > > > > CLI issue mostly. > > > > We can solve it similarly to the recent "sdcard: Do not allow invalid SD > > card sizes" patch: > > > > if (!dinfo) { > > error_setg(errp, "Missing SPI flash drive"); > > error_append_hint(errp, "You can use a dummy drive using:\n"); > > error_append_hint(errp, "-drive if=mtd,driver=null-co," > > "read-ones=on,size=64M\n); > > return; > > } > > > > having npcm7xx_connect_flash() taking an Error* argument, > > and MachineClass::init() call it with _fatal. > > Erroring out if the user specifies a configuration that can't possibly > boot sounds good to me. Better than trying to come up with defaults > that are still not going to result in a bootable system. > > For testing recovery paths, I think it makes sense to explicitly > specify a null device as you suggest. Hmm, one problem. qom-test fails with qemu-system-aarch64: Missing SPI flash drive You can add a dummy drive using: -drive if=mtd,driver=null-co,read-zeroes=on,size=32M Broken pipe /usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0) ERROR qom-test - too few tests run (expected 68, got 7) So it looks like we might need a different solution to this, unless we want to make generic tests more machine-aware...
Re: [GIT PULL] I2C updates
On Thu, 16 Jul 2020 at 18:49, Corey Minyard wrote: > > The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09: > > Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20200707' > into staging (2020-07-10 16:43:40 +0100) > > are available in the Git repository at: > > https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5 > > for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42: > > hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500) > > > Minor changes to: > > Add an SMBus config entry > > Cleanup/simplify/document some I2C interfaces > > > Philippe Mathieu-Daudé (6): > hw/i2c/Kconfig: Add an entry for the SMBus > hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus() > hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new() > hw/i2c: Rename i2c_realize_and_unref() as i2c_slave_realize_and_unref() > hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple() > hw/i2c: Document the I2C qdev helpers Hi; this failed to build on x86-64 Linux (incremental build): LINKi386-softmmu/qemu-system-i386 ../hw/i2c/smbus_eeprom.o: In function `smbus_eeprom_vmstate_needed': /home/petmay01/linaro/qemu-for-merges/hw/i2c/smbus_eeprom.c:94: undefined reference to `smbus_vmstate_needed' ../hw/i2c/smbus_eeprom.o:(.data.rel+0x50): undefined reference to `vmstate_smbus_device' ../hw/i2c/pm_smbus.o: In function `smb_transaction': /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:93: undefined reference to `smbus_quick_command' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:97: undefined reference to `smbus_receive_byte' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:100: undefined reference to `smbus_send_byte' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:105: undefined reference to `smbus_read_byte' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:108: undefined reference to `smbus_write_byte' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:114: undefined reference to `smbus_read_word' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:117: undefined reference to `smbus_write_word' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:149: undefined reference to `smbus_read_block' /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:174: undefined reference to `smbus_write_block' ../hw/i2c/pm_smbus.o: In function `smb_ioport_writeb': /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:290: undefined reference to `smbus_write_block' ../hw/ipmi/smbus_ipmi.o:(.data.rel+0x50): undefined reference to `vmstate_smbus_device' collect2: error: ld returned 1 exit status (similarly for other qemu-system-* binary links) thanks -- PMM
[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64
Writing to SCTLR can cause QEMU to flush its TLB (as an internal implementation detail), so if adding SCTLR writes is sufficient to cause the problem to go away, I would be suspicious that your guest code is missing necessary TLB maintenance instructions. QEMU 3.1 and 4.1 are quite old -- can you reproduce with 5.0 or (ideally) head-of-git ? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1887854 Title: Spurious Data Abort on qemu-system-aarch64 Status in QEMU: New Bug description: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by the RTEMS source builder (4.1+minor patches). Edit: This bug can be worked around by getting and setting SCTLR without changing its value before each data abort would occur. This test needs 6 of these workarounds to operate successfully. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1887854/+subscriptions
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
On Thu, 16 Jul 2020 at 20:52, Michael Roth wrote: > But is it intermittent, environment-dependent? I'm trying to understand how to > replicate Peter's result since it seems like it would be straightforward > reproducer. I blew away all my build trees and recreated them from scratch, and the issue went away. I'm suspicious that the complete lack of .d files was induced by a failed earlier pullreq attempt and left the build tree in a messed up state where it wouldn't notice that it needed to rebuild files. -- PMM
Re: [PULL 0/2] Fixes 20200716 patches
On Thu, 16 Jul 2020 at 10:34, Gerd Hoffmann wrote: > > The following changes since commit 8746309137ba470d1b2e8f5ce86ac228625db940: > > Update version for v5.1.0-rc0 release (2020-07-15 19:08:07 +0100) > > are available in the Git repository at: > > git://git.kraxel.org/qemu tags/fixes-20200716-pull-request > > for you to fetch changes up to 4084e35068772cf4f81bbae5174019f277c61084: > > usb: fix storage regression (2020-07-16 10:20:27 +0200) > > > fixes: usb storage regression, vfio display ramfb bug > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1 for any user-visible changes. -- PMM
Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception
On Thu, 16 Jul 2020 at 11:08, Luc Michel wrote: > > When single-stepping with a debugger attached to QEMU, and when an > exception is raised, the debugger misses the first instruction after the > exception: This is a long-standing bug; thanks for looking at it. (https://bugs.launchpad.net/qemu/+bug/757702) > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index d95c4848a4..e85fab5d40 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, > int *ret) > CPUClass *cc = CPU_GET_CLASS(cpu); > qemu_mutex_lock_iothread(); > cc->do_interrupt(cpu); > qemu_mutex_unlock_iothread(); > cpu->exception_index = -1; > + > +if (unlikely(cpu->singlestep_enabled)) { > +/* > + * After processing the exception, ensure an EXCP_DEBUG is > + * raised when single-stepping so that GDB doesn't miss the > + * next instruction. > + */ > +cpu->exception_index = EXCP_DEBUG; > +return cpu_handle_exception(cpu, ret); > +} I like the idea of being able to do this generically in the main loop. How about interrupts? If we are single-stepping and we take an interrupt I guess we want to stop before the first insn of the interrupt handler rather than after it, which would imply a similar change to cpu_handle_interrupt(). thanks -- PMM
[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64
** Description changed: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by the RTEMS source builder (4.1+minor patches). Edit: This bug can be worked around by getting and setting SCTLR without - changing its value. + changing its value before each data abort would occur. This test needs 6 + of these workarounds to operate successfully. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1887854 Title: Spurious Data Abort on qemu-system-aarch64 Status in QEMU: New Bug description: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by the RTEMS source builder (4.1+minor patches). Edit: This bug can be worked around by getting and setting SCTLR without changing its value before each data abort would occur. This test needs 6 of these workarounds to operate successfully. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1887854/+subscriptions
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
On 7/16/20 9:52 PM, Michael Roth wrote: > Quoting Philippe Mathieu-Daudé (2020-07-16 12:59:28) >> On 7/16/20 7:55 PM, Michael Roth wrote: >>> Quoting Peter Maydell (2020-07-16 05:53:17) The first merge I tried to process after bumping VERSION for rc0 failed on test-qga like this: MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv er.pl --test-name="test-qga" PASS 1 test-qga /qga/sync-delimited PASS 2 test-qga /qga/sync PASS 3 test-qga /qga/ping ** ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: assertion failed (version == QEMU_VERSION): ("5.0.9 0" == "5.0.50") ERROR test-qga - Bail out! ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: assertion failed (versio n == QEMU_VERSION): ("5.0.90" == "5.0.50") Aborted (core dumped) /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659: recipe for target 'check-unit' failed Looking at timestamps on files, tests/test-qga.o never got rebuilt, even though config-host.h has been updated (and so has the new QEMU_VERSION). Any idea what's gone wrong here? Also weird: this build tree has no .d files in it. >>> >>> I've been trying to reproduce with: >>> >>> make >>> make check-unit >>> *bump VERSION >>> make check-unit >>> >>> but test-qga.o gets rebuilt as expected and the test passed. >>> >>> This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you >>> aware >>> of any other factors that might be needed to reproduce this? >> >> The problem is not for qga, it affects all QEMU objects. > > But is it intermittent, environment-dependent? I'm trying to understand how to > replicate Peter's result since it seems like it would be straightforward > reproducer. How to reproduce: https://www.mail-archive.com/qemu-devel@nongnu.org/msg723531.html > >> >>> thanks -- PMM >>> >> >
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
Quoting Philippe Mathieu-Daudé (2020-07-16 12:59:28) > On 7/16/20 7:55 PM, Michael Roth wrote: > > Quoting Peter Maydell (2020-07-16 05:53:17) > >> The first merge I tried to process after bumping VERSION for rc0 > >> failed on test-qga like this: > >> > >> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} > >> tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv > >> er.pl --test-name="test-qga" > >> PASS 1 test-qga /qga/sync-delimited > >> PASS 2 test-qga /qga/sync > >> PASS 3 test-qga /qga/ping > >> ** > >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: > >> assertion failed (version == QEMU_VERSION): ("5.0.9 > >> 0" == "5.0.50") > >> ERROR test-qga - Bail out! > >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: > >> assertion failed (versio > >> n == QEMU_VERSION): ("5.0.90" == "5.0.50") > >> Aborted (core dumped) > >> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659: > >> recipe for target 'check-unit' failed > >> > >> Looking at timestamps on files, tests/test-qga.o never got rebuilt, > >> even though config-host.h has been updated (and so has the new > >> QEMU_VERSION). Any idea what's gone wrong here? > >> > >> Also weird: this build tree has no .d files in it. > > > > I've been trying to reproduce with: > > > > make > > make check-unit > > *bump VERSION > > make check-unit > > > > but test-qga.o gets rebuilt as expected and the test passed. > > > > This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you > > aware > > of any other factors that might be needed to reproduce this? > > The problem is not for qga, it affects all QEMU objects. But is it intermittent, environment-dependent? I'm trying to understand how to replicate Peter's result since it seems like it would be straightforward reproducer. > > > > >> > >> thanks > >> -- PMM > > >
[Bug 1887854] [NEW] Spurious Data Abort on qemu-system-aarch64
Public bug reported: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 >From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 >From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by the RTEMS source builder (4.1+minor patches). Edit: This bug can be worked around by getting and setting SCTLR without changing its value. ** Affects: qemu Importance: Undecided Status: New ** Attachment added: "Qemu execution log and binary" https://bugs.launchpad.net/bugs/1887854/+attachment/5393233/+files/data_abort.tar.gz ** Description changed: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by the RTEMS source builder (4.1+minor patches). + + Edit: This bug can be worked around by getting and setting SCTLR without + changing its value. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1887854 Title: Spurious Data Abort on qemu-system-aarch64 Status in QEMU: New Bug description: When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not yet publically available), the test generates a spurious data abort (the MMU and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). The abort information is as follows: Taking exception 4 [Data Abort] ...from EL1 to EL1 ...with ESR 0x25/0x9610 ...with FAR 0x104010ca28 ...with ELR 0x400195d8 ...to EL1 PC 0x40018200 PSTATE 0x3c5 The ESR indicates that a synchronous external abort has occurred. ESR EC field: 0b100101 From the ARMv8 technical manual: Data Abort taken without a change in Exception level. Used for MMU faults generated by data accesses, alignment faults other than those caused by Stack Pointer misalignment, and synchronous External aborts, including synchronous parity or ECC errors. Not used for debug related exceptions. ESR ISS field: 0b1 From the ARMv8 technical manual: Synchronous External abort, not on translation table walk or hardware update of translation table. The following command line is used to invoke qemu: qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic -serial mon:stdio -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as built by
[PATCH v2] tcg/cpu-exec: precise single-stepping after an exception
When single-stepping with a debugger attached to QEMU, and when an exception is raised, the debugger misses the first instruction after the exception: $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S $ aarch64-linux-gnu-gdb GNU gdb (GDB) 9.2 [...] (gdb) tar rem :1234 Remote debugging using :1234 warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. 0x in ?? () (gdb) # writing nop insns to 0x200 and 0x204 (gdb) set *0x200 = 0xd503201f (gdb) set *0x204 = 0xd503201f (gdb) # 0x0 address contains 0 which is an invalid opcode. (gdb) # The CPU should raise an exception and jump to 0x200 (gdb) si 0x0204 in ?? () With this commit, the same run steps correctly on the first instruction of the exception vector: (gdb) si 0x0200 in ?? () Signed-off-by: Luc Michel --- v2: - remove RFC tag - inline the recursive call to cpu_handle_exception [Richard] --- accel/tcg/cpu-exec.c | 12 1 file changed, 12 insertions(+) diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index d95c4848a4..59b1b5fe76 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -502,10 +502,22 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) CPUClass *cc = CPU_GET_CLASS(cpu); qemu_mutex_lock_iothread(); cc->do_interrupt(cpu); qemu_mutex_unlock_iothread(); cpu->exception_index = -1; + +if (unlikely(cpu->singlestep_enabled)) { +/* + * After processing the exception, ensure an EXCP_DEBUG is + * raised when single-stepping so that GDB doesn't miss the + * next instruction. + */ +*ret = EXCP_DEBUG; +cpu_handle_debug_exception(cpu); +return true; +} + } else if (!replay_has_interrupt()) { /* give a chance to iothread in replay mode */ *ret = EXCP_INTERRUPT; return true; } -- 2.27.0
[PATCH] net: check payload length limit for all frames
From: Prasad J Pandit While sending packets, the check that packet 'payload_len' is within 64kB limit, seems to happen only for GSO frames. It may lead to use-after-free or out-of-bounds access like issues when sending non-GSO frames. Check the 'payload_len' limit for all packets, irrespective of the gso type. Reported-by: Alexander Bulekov Signed-off-by: Prasad J Pandit --- hw/net/net_tx_pkt.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c index 162f802dd7..e66998a8f9 100644 --- a/hw/net/net_tx_pkt.c +++ b/hw/net/net_tx_pkt.c @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc) * Since underlying infrastructure does not support IP datagrams longer * than 64K we should drop such packets and don't even try to send */ -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) { -if (pkt->payload_len > -ETH_MAX_IP_DGRAM_LEN - -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { -return false; -} +if (pkt->payload_len > +ETH_MAX_IP_DGRAM_LEN - +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) { +return false; } if (pkt->has_virt_hdr || -- 2.26.2
Re: [PATCH for-5.1] i386: hvf: Explicitly set CR4 guest/host mask
On Thu, Jul 16, 2020 at 02:14:57PM -0400, Eduardo Habkost wrote: > On Tue, Jul 14, 2020 at 08:20:04PM +0200, Paolo Bonzini wrote: > > Hi Roman, please ask Peter to apply it directly because I won't be able to > > send a pull request in the next couple of weeks. > > > > Paolo > > > > Il mar 14 lug 2020, 12:39 Roman Bolshakov ha > > scritto: > > > > > On Tue, Jul 14, 2020 at 12:07:27PM +0300, Roman Bolshakov wrote: > > > > Removal of register reset omitted initialization of CR4 guest/host mask. > > > > x86_64 guests aren't booting without it. > > > > > > > > Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset") > > > > Signed-off-by: Roman Bolshakov > > > > > > > > > > If one has a chance to test or review it, it'd be very helpful. That'd > > > allow to include it in RC0. > > > > > I'll queue it for my -rc1 pull request. > Thanks! -Roman
[PULL 4/6] target/i386: fix model number and add missing features for Icelake-Server CPU model
From: Chenyi Qiang Add the missing features(sha_ni, avx512ifma, rdpid, fsrm, vmx-rdseed-exit, vmx-pml, vmx-eptp-switching) and change the model number to 106 in the Icelake-Server-v4 CPU model. Signed-off-by: Chenyi Qiang Message-Id: <20200714084148.26690-3-chenyi.qi...@intel.com> Signed-off-by: Eduardo Habkost --- target/i386/cpu.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 3885826bc4..132ef90421 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -3512,6 +3512,20 @@ static X86CPUDefinition builtin_x86_defs[] = { { /* end of list */ } }, }, +{ +.version = 4, +.props = (PropValue[]) { +{ "sha-ni", "on" }, +{ "avx512ifma", "on" }, +{ "rdpid", "on" }, +{ "fsrm", "on" }, +{ "vmx-rdseed-exit", "on" }, +{ "vmx-pml", "on" }, +{ "vmx-eptp-switching", "on" }, +{ "model", "106" }, +{ /* end of list */ } +}, +}, { /* end of list */ } } }, -- 2.26.2
[PULL 1/6] i368/cpu: Clear env->user_features after loading versioned CPU model
From: Xiaoyao Li Features defined in versioned CPU model are recorded in env->user_features since they are updated as property. It's unwated because they are not user specified. Simply clear env->user_features as a fix. It won't clear user specified features because user specified features are filled to env->user_features later in x86_cpu_expand_features(). Cc: Chenyi Qiang Suggested-by: Eduardo Habkost Signed-off-by: Xiaoyao Li Message-Id: <20200713174436.41070-2-xiaoyao...@intel.com> [ehabkost: fix coding style] Signed-off-by: Eduardo Habkost --- target/i386/cpu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1e5123251d..caf0334f3a 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5159,6 +5159,13 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model) object_property_set_str(OBJECT(cpu), "vendor", vendor, _abort); x86_cpu_apply_version_props(cpu, model); + +/* + * Properties in versioned CPU model are not user specified features. + * We can simply clear env->user_features here since it will be filled later + * in x86_cpu_expand_features() based on plus_features and minus_features. + */ +memset(>user_features, 0, sizeof(env->user_features)); } #ifndef CONFIG_USER_ONLY -- 2.26.2
[PULL 6/6] i386: hvf: Explicitly set CR4 guest/host mask
From: Roman Bolshakov Removal of register reset omitted initialization of CR4 guest/host mask. x86_64 guests aren't booting without it. Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset") Signed-off-by: Roman Bolshakov Message-Id: <20200714090726.41082-1-r.bolsha...@yadro.com> Signed-off-by: Eduardo Habkost --- target/i386/hvf/vmx.h | 1 + 1 file changed, 1 insertion(+) diff --git a/target/i386/hvf/vmx.h b/target/i386/hvf/vmx.h index 75ba1e2a5f..587b1b8375 100644 --- a/target/i386/hvf/vmx.h +++ b/target/i386/hvf/vmx.h @@ -166,6 +166,7 @@ static inline void macvm_set_cr4(hv_vcpuid_t vcpu, uint64_t cr4) wvmcs(vcpu, VMCS_GUEST_CR4, guest_cr4); wvmcs(vcpu, VMCS_CR4_SHADOW, cr4); +wvmcs(vcpu, VMCS_CR4_MASK, CR4_VMXE); hv_vcpu_invalidate_tlb(vcpu); hv_vcpu_flush(vcpu); -- 2.26.2
[PULL 0/6] x86 fixes for -rc1
The following changes since commit ee5128bb00f90dd301991d80d1db5224ce924c84: Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2020-07-16 13:12:05 +0100) are available in the Git repository at: git://github.com/ehabkost/qemu.git tags/x86-next-pull-request for you to fetch changes up to 818b9f111d64b40661d08f5e23236ac1ca5df505: i386: hvf: Explicitly set CR4 guest/host mask (2020-07-16 14:15:13 -0400) x86 fixes for -rc1 Fixes for x86 that missed hard freeze: * Don't trigger warnings for features set by CPU model versions (Xiaoyao Li) * Missing features in Icelake-Server, Skylake-Server, Cascadelake-Server CPU models (Chenyi Qiang) * Fix hvf x86_64 guest boot crash (Roman Bolshakov) Chenyi Qiang (3): target/i386: add fast short REP MOV support target/i386: fix model number and add missing features for Icelake-Server CPU model target/i386: add the missing vmx features for Skylake-Server and Cascadelake-Server CPU models Roman Bolshakov (1): i386: hvf: Explicitly set CR4 guest/host mask Xiaoyao Li (2): i368/cpu: Clear env->user_features after loading versioned CPU model i386/cpu: Don't add unavailable_features to env->user_features target/i386/cpu.h | 2 ++ target/i386/hvf/vmx.h | 1 + target/i386/cpu.c | 38 -- 3 files changed, 39 insertions(+), 2 deletions(-) -- 2.26.2
[PULL 5/6] target/i386: add the missing vmx features for Skylake-Server and Cascadelake-Server CPU models
From: Chenyi Qiang Add the missing vmx features in Skylake-Server and Cascadelake-Server CPU models based on the output of Paolo's script. Signed-off-by: Chenyi Qiang Message-Id: <20200714084148.26690-4-chenyi.qi...@intel.com> Signed-off-by: Eduardo Habkost --- target/i386/cpu.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 132ef90421..588f32e136 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -3034,6 +3034,13 @@ static X86CPUDefinition builtin_x86_defs[] = { { /* end of list */ } } }, +{ +.version = 4, +.props = (PropValue[]) { +{ "vmx-eptp-switching", "on" }, +{ /* end of list */ } +} +}, { /* end of list */ } } }, @@ -3158,6 +3165,13 @@ static X86CPUDefinition builtin_x86_defs[] = { { /* end of list */ } }, }, +{ .version = 4, + .note = "ARCH_CAPABILITIES, no TSX", + .props = (PropValue[]) { + { "vmx-eptp-switching", "on" }, + { /* end of list */ } + }, +}, { /* end of list */ } } }, -- 2.26.2
[PULL 3/6] target/i386: add fast short REP MOV support
From: Chenyi Qiang For CPUs support fast short REP MOV[CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g Icelake and Tigerlake, expose it to the guest VM. Reviewed-by: Eduardo Habkost Signed-off-by: Chenyi Qiang Message-Id: <20200714084148.26690-2-chenyi.qi...@intel.com> Signed-off-by: Eduardo Habkost --- target/i386/cpu.h | 2 ++ target/i386/cpu.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 37fffa5cac..e1a5c174dc 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -775,6 +775,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS]; #define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2) /* AVX512 Multiply Accumulation Single Precision */ #define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3) +/* Fast Short Rep Mov */ +#define CPUID_7_0_EDX_FSRM (1U << 4) /* AVX512 Vector Pair Intersection to a Pair of Mask Registers */ #define CPUID_7_0_EDX_AVX512_VP2INTERSECT (1U << 8) /* SERIALIZE instruction */ diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 93b62b2eca..3885826bc4 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -984,7 +984,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { .type = CPUID_FEATURE_WORD, .feat_names = { NULL, NULL, "avx512-4vnniw", "avx512-4fmaps", -NULL, NULL, NULL, NULL, +"fsrm", NULL, NULL, NULL, "avx512-vp2intersect", NULL, "md-clear", NULL, NULL, NULL, "serialize", NULL, "tsx-ldtrk", NULL, NULL /* pconfig */, NULL, -- 2.26.2
[PULL 2/6] i386/cpu: Don't add unavailable_features to env->user_features
From: Xiaoyao Li Features unavailable due to absent of their dependent features should not be added to env->user_features. env->user_features only contains the feature explicity specified with -feature/+feature by user. Fixes: 99e24dbdaa68 ("target/i386: introduce generic feature dependency mechanism") Signed-off-by: Xiaoyao Li Message-Id: <20200713174436.41070-3-xiaoyao...@intel.com> Signed-off-by: Eduardo Habkost --- target/i386/cpu.c | 1 - 1 file changed, 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index caf0334f3a..93b62b2eca 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6371,7 +6371,6 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp) unavailable_features & env->user_features[d->to.index], "This feature depends on other features that were not requested"); -env->user_features[d->to.index] |= unavailable_features; env->features[d->to.index] &= ~unavailable_features; } } -- 2.26.2
Re: [PATCH for-5.1] i386: hvf: Explicitly set CR4 guest/host mask
On Tue, Jul 14, 2020 at 08:20:04PM +0200, Paolo Bonzini wrote: > Hi Roman, please ask Peter to apply it directly because I won't be able to > send a pull request in the next couple of weeks. > > Paolo > > Il mar 14 lug 2020, 12:39 Roman Bolshakov ha > scritto: > > > On Tue, Jul 14, 2020 at 12:07:27PM +0300, Roman Bolshakov wrote: > > > Removal of register reset omitted initialization of CR4 guest/host mask. > > > x86_64 guests aren't booting without it. > > > > > > Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset") > > > Signed-off-by: Roman Bolshakov > > > > > > > If one has a chance to test or review it, it'd be very helpful. That'd > > allow to include it in RC0. > > I'll queue it for my -rc1 pull request. -- Eduardo
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
On 7/16/20 7:55 PM, Michael Roth wrote: > Quoting Peter Maydell (2020-07-16 05:53:17) >> The first merge I tried to process after bumping VERSION for rc0 >> failed on test-qga like this: >> >> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} >> tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv >> er.pl --test-name="test-qga" >> PASS 1 test-qga /qga/sync-delimited >> PASS 2 test-qga /qga/sync >> PASS 3 test-qga /qga/ping >> ** >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: >> assertion failed (version == QEMU_VERSION): ("5.0.9 >> 0" == "5.0.50") >> ERROR test-qga - Bail out! >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: >> assertion failed (versio >> n == QEMU_VERSION): ("5.0.90" == "5.0.50") >> Aborted (core dumped) >> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659: >> recipe for target 'check-unit' failed >> >> Looking at timestamps on files, tests/test-qga.o never got rebuilt, >> even though config-host.h has been updated (and so has the new >> QEMU_VERSION). Any idea what's gone wrong here? >> >> Also weird: this build tree has no .d files in it. > > I've been trying to reproduce with: > > make > make check-unit > *bump VERSION > make check-unit > > but test-qga.o gets rebuilt as expected and the test passed. > > This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you > aware > of any other factors that might be needed to reproduce this? The problem is not for qga, it affects all QEMU objects. > >> >> thanks >> -- PMM >
Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception
On 7/16/20 3:04 AM, Luc Michel wrote: > When single-stepping with a debugger attached to QEMU, and when an > exception is raised, the debugger misses the first instruction after the > exception: > > $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S > > $ aarch64-linux-gnu-gdb > GNU gdb (GDB) 9.2 > [...] > (gdb) tar rem :1234 > Remote debugging using :1234 > warning: No executable has been specified and target does not support > determining executable automatically. Try using the "file" command. > 0x in ?? () > (gdb) # writing nop insns to 0x200 and 0x204 > (gdb) set *0x200 = 0xd503201f > (gdb) set *0x204 = 0xd503201f > (gdb) # 0x0 address contains 0 which is an invalid opcode. > (gdb) # The CPU should raise an exception and jump to 0x200 > (gdb) si > 0x0204 in ?? () > > With this commit, the same run steps correctly on the first instruction > of the exception vector: > > (gdb) si > 0x0200 in ?? () > > Signed-off-by: Luc Michel > --- > > RFC because I'm really not sure this is the good place to do that since > EXCP_DEBUG are usually raised in each target translate.c. It could also > have implications with record/replay I'm not aware of. > > --- > accel/tcg/cpu-exec.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index d95c4848a4..e85fab5d40 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, > int *ret) > CPUClass *cc = CPU_GET_CLASS(cpu); > qemu_mutex_lock_iothread(); > cc->do_interrupt(cpu); > qemu_mutex_unlock_iothread(); > cpu->exception_index = -1; > + > +if (unlikely(cpu->singlestep_enabled)) { > +/* > + * After processing the exception, ensure an EXCP_DEBUG is > + * raised when single-stepping so that GDB doesn't miss the > + * next instruction. > + */ > +cpu->exception_index = EXCP_DEBUG; > +return cpu_handle_exception(cpu, ret); Plausible. Although recursion on an inline function is going to defeat the inline, in general. We could expand the recursion by hand with if (unlikely(cpu->singlestep_enabled)) { *ret = EXCP_DEBUG; cpu_handle_debug_exception(cpu); return true; } which might even be clearer. r~ > +} > + > } else if (!replay_has_interrupt()) { > /* give a chance to iothread in replay mode */ > *ret = EXCP_INTERRUPT; > return true; > } >
Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??
Quoting Peter Maydell (2020-07-16 05:53:17) > The first merge I tried to process after bumping VERSION for rc0 > failed on test-qga like this: > > MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} > tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv > er.pl --test-name="test-qga" > PASS 1 test-qga /qga/sync-delimited > PASS 2 test-qga /qga/sync > PASS 3 test-qga /qga/ping > ** > ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: > assertion failed (version == QEMU_VERSION): ("5.0.9 > 0" == "5.0.50") > ERROR test-qga - Bail out! > ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info: > assertion failed (versio > n == QEMU_VERSION): ("5.0.90" == "5.0.50") > Aborted (core dumped) > /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659: > recipe for target 'check-unit' failed > > Looking at timestamps on files, tests/test-qga.o never got rebuilt, > even though config-host.h has been updated (and so has the new > QEMU_VERSION). Any idea what's gone wrong here? > > Also weird: this build tree has no .d files in it. I've been trying to reproduce with: make make check-unit *bump VERSION make check-unit but test-qga.o gets rebuilt as expected and the test passed. This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you aware of any other factors that might be needed to reproduce this? > > thanks > -- PMM
Re: [PULL v1 0/2] Merge tpm 2020/07/15 v1
On Wed, 15 Jul 2020 at 20:23, Stefan Berger wrote: > > Hello! > > This series fixes a couple of minor issues with the PPC64 TPM SPAPR interface > and a test case. > >Stefan > > The following changes since commit 8746309137ba470d1b2e8f5ce86ac228625db940: > > Update version for v5.1.0-rc0 release (2020-07-15 19:08:07 +0100) > > are available in the Git repository at: > > git://github.com/stefanberger/qemu-tpm.git tags/pull-tpm-2020-07-15-1 > > for you to fetch changes up to df8a7568932e4c3c930fdfeb228dd72b4bb71a1f: > > tests: tpm: Skip over pcrUpdateCounter byte in result comparison > (2020-07-15 14:57:33 -0400) > > --- > Stefan Berger (2): > tpm: tpm_spapr: Exit on TPM backend failures > tests: tpm: Skip over pcrUpdateCounter byte in result comparison Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1 for any user-visible changes. -- PMM
[GIT PULL] I2C updates
The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09: Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20200707' into staging (2020-07-10 16:43:40 +0100) are available in the Git repository at: https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5 for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42: hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500) Minor changes to: Add an SMBus config entry Cleanup/simplify/document some I2C interfaces Philippe Mathieu-Daudé (6): hw/i2c/Kconfig: Add an entry for the SMBus hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus() hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new() hw/i2c: Rename i2c_realize_and_unref() as i2c_slave_realize_and_unref() hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple() hw/i2c: Document the I2C qdev helpers hw/arm/aspeed.c | 82 +++-- hw/arm/musicpal.c | 4 +-- hw/arm/nseries.c| 8 ++--- hw/arm/pxa2xx.c | 5 +-- hw/arm/realview.c | 2 +- hw/arm/spitz.c | 4 +-- hw/arm/stellaris.c | 2 +- hw/arm/tosa.c | 2 +- hw/arm/versatilepb.c| 2 +- hw/arm/vexpress.c | 2 +- hw/arm/z2.c | 4 +-- hw/display/sii9022.c| 2 +- hw/i2c/Kconfig | 8 +++-- hw/i2c/Makefile.objs| 3 +- hw/i2c/aspeed_i2c.c | 3 +- hw/i2c/core.c | 15 - hw/ppc/e500.c | 2 +- hw/ppc/sam460ex.c | 2 +- include/hw/i2c/aspeed_i2c.h | 2 +- include/hw/i2c/i2c.h| 54 +++-- 20 files changed, 131 insertions(+), 77 deletions(-)
Re: [PATCH v3 3/9] vfio: add quirk device write method
On Tue, 30 Jun 2020 at 13:30, P J P wrote: > > From: Prasad J Pandit > > Add vfio quirk device mmio write method to avoid NULL pointer > dereference issue. > > Reported-by: Lei Sun > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- > hw/vfio/pci-quirks.c | 8 > 1 file changed, 8 insertions(+) > > Update v3: Add Reviewed-by: ... > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09406.html > > diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c > index d304c81148..cc6d5dbc23 100644 > --- a/hw/vfio/pci-quirks.c > +++ b/hw/vfio/pci-quirks.c > @@ -14,6 +14,7 @@ > #include "config-devices.h" > #include "exec/memop.h" > #include "qemu/units.h" > +#include "qemu/log.h" > #include "qemu/error-report.h" > #include "qemu/main-loop.h" > #include "qemu/module.h" > @@ -264,8 +265,15 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque, > return data; > } > > +static void vfio_ati_3c3_quirk_write(void *opaque, hwaddr addr, > +uint64_t data, unsigned size) > +{ > +qemu_log_mask(LOG_GUEST_ERROR, "%s not implemented\n", __func__); > +} > + > static const MemoryRegionOps vfio_ati_3c3_quirk = { > .read = vfio_ati_3c3_quirk_read, > +.write = vfio_ati_3c3_quirk_write, > .endianness = DEVICE_LITTLE_ENDIAN, > }; Alex (Williamson) -- as the vfio maintainer, do you have a view on whether we should be logging write accesses to port 0x3c3 here as guest-errors or unimplemented-QEMU-functionality? Guest-error seems plausible to me, anyway. Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v3 1/9] hw/pci-host: add pci-intack write method
On Tue, 30 Jun 2020 at 13:29, P J P wrote: > > From: Prasad J Pandit > > Add pci-intack mmio write method to avoid NULL pointer dereference > issue. > > Reported-by: Lei Sun > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- > hw/pci-host/prep.c | 8 > 1 file changed, 8 insertions(+) > > Update v3: Add Reviewed-by: ... > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09395.html > > diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c > index 367e408b91..3c8ff6af03 100644 > --- a/hw/pci-host/prep.c > +++ b/hw/pci-host/prep.c > @@ -26,6 +26,7 @@ > #include "qemu/osdep.h" > #include "qemu-common.h" > #include "qemu/units.h" > +#include "qemu/log.h" > #include "qapi/error.h" > #include "hw/pci/pci.h" > #include "hw/pci/pci_bus.h" > @@ -119,8 +120,15 @@ static uint64_t raven_intack_read(void *opaque, hwaddr > addr, > return pic_read_irq(isa_pic); > } > > +static void raven_intack_write(void *opaque, hwaddr addr, > +uint64_t data, unsigned size) > +{ > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); > +} > + > static const MemoryRegionOps raven_intack_ops = { > .read = raven_intack_read, > +.write = raven_intack_write, > .valid = { > .max_access_size = 1, > }, I suspect this may be a read-only register (and so a guest error rather than unimp) but I'm not sure I've found the correct Raven PCI controller datasheet, and if I have then there's a lot of unimplemented functionality in our model. So UNIMP is fine. This controller is only used by the ppc '40p' machine. Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v3 9/9] memory: assert MemoryRegionOps callbacks are defined
On Tue, 30 Jun 2020 at 13:31, P J P wrote: > > From: Prasad J Pandit > > When registering a MemoryRegionOps object, assert that its > read/write callback methods are defined. This avoids potential > guest crash via a NULL pointer dereference. > > Suggested-by: Peter Maydell > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- Reviewed-by: Peter Maydell thanks -- PMM
[PATCH v4 for-5.2 1/2] spapr: Use error_append_hint() in spapr_caps.c
We have a dedicated error API for hints. Use it instead of embedding the hint in the error message, as recommanded in the "qapi/error.h" header file. Since spapr_caps_apply() passes _fatal, all functions must also call the ERRP_GUARD() macro for error_append_hint() to be functional. While here, have cap_fwnmi_apply(), which already uses error_append_hint(), to call ERRP_GUARD() as well. Signed-off-by: Greg Kurz Reviewed-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Laurent Vivier --- hw/ppc/spapr_caps.c | 89 +-- 1 file changed, 50 insertions(+), 39 deletions(-) diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c index 3225fc5a2edc..275f5bd0342c 100644 --- a/hw/ppc/spapr_caps.c +++ b/hw/ppc/spapr_caps.c @@ -180,24 +180,24 @@ static void spapr_cap_set_pagesize(Object *obj, Visitor *v, const char *name, static void cap_htm_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { +ERRP_GUARD(); if (!val) { /* TODO: We don't support disabling htm yet */ return; } if (tcg_enabled()) { -error_setg(errp, - "No Transactional Memory support in TCG," - " try appending -machine cap-htm=off"); +error_setg(errp, "No Transactional Memory support in TCG"); +error_append_hint(errp, "Try appending -machine cap-htm=off\n"); } else if (kvm_enabled() && !kvmppc_has_cap_htm()) { error_setg(errp, -"KVM implementation does not support Transactional Memory," - " try appending -machine cap-htm=off" -); + "KVM implementation does not support Transactional Memory"); +error_append_hint(errp, "Try appending -machine cap-htm=off\n"); } } static void cap_vsx_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { +ERRP_GUARD(); PowerPCCPU *cpu = POWERPC_CPU(first_cpu); CPUPPCState *env = >env; @@ -209,13 +209,14 @@ static void cap_vsx_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) * rid of anything that doesn't do VMX */ g_assert(env->insns_flags & PPC_ALTIVEC); if (!(env->insns_flags2 & PPC2_VSX)) { -error_setg(errp, "VSX support not available," - " try appending -machine cap-vsx=off"); +error_setg(errp, "VSX support not available"); +error_append_hint(errp, "Try appending -machine cap-vsx=off\n"); } } static void cap_dfp_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { +ERRP_GUARD(); PowerPCCPU *cpu = POWERPC_CPU(first_cpu); CPUPPCState *env = >env; @@ -224,8 +225,8 @@ static void cap_dfp_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) return; } if (!(env->insns_flags2 & PPC2_DFP)) { -error_setg(errp, "DFP support not available," - " try appending -machine cap-dfp=off"); +error_setg(errp, "DFP support not available"); +error_append_hint(errp, "Try appending -machine cap-dfp=off\n"); } } @@ -239,6 +240,7 @@ SpaprCapPossible cap_cfpc_possible = { static void cap_safe_cache_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { +ERRP_GUARD(); uint8_t kvm_val = kvmppc_get_cap_safe_cache(); if (tcg_enabled() && val) { @@ -247,9 +249,9 @@ static void cap_safe_cache_apply(SpaprMachineState *spapr, uint8_t val, cap_cfpc_possible.vals[val]); } else if (kvm_enabled() && (val > kvm_val)) { error_setg(errp, - "Requested safe cache capability level not supported by kvm," - " try appending -machine cap-cfpc=%s", - cap_cfpc_possible.vals[kvm_val]); + "Requested safe cache capability level not supported by KVM"); +error_append_hint(errp, "Try appending -machine cap-cfpc=%s\n", + cap_cfpc_possible.vals[kvm_val]); } } @@ -263,6 +265,7 @@ SpaprCapPossible cap_sbbc_possible = { static void cap_safe_bounds_check_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { +ERRP_GUARD(); uint8_t kvm_val = kvmppc_get_cap_safe_bounds_check(); if (tcg_enabled() && val) { @@ -271,9 +274,9 @@ static void cap_safe_bounds_check_apply(SpaprMachineState *spapr, uint8_t val, cap_sbbc_possible.vals[val]); } else if (kvm_enabled() && (val > kvm_val)) { error_setg(errp, -"Requested safe bounds check capability level not supported by kvm," - " try appending -machine cap-sbbc=%s", - cap_sbbc_possible.vals[kvm_val]); +"Requested safe bounds check capability level not supported by KVM"); +error_append_hint(errp, "Try appending -machine cap-sbbc=%s\n", + cap_sbbc_possible.vals[kvm_val]); } } @@ -290,6 +293,7 @@ SpaprCapPossible cap_ibs_possible = {
[PATCH v4 for-5.2 2/2] spapr: Forbid nested KVM-HV in pre-power9 compat mode
Nested KVM HV only works if the kernel is using the radix MMU mode, ie. the CPU is POWER9 and it is not running in some pre-power9 compat mode. Otherwise, the KVM HV module fails to load in the guest with -ENODEV. It might be painful for a user to discover this late that nested cannot work with their setup. Erroring out at machine init instead seems to be the best we can do. Signed-off-by: Greg Kurz Reviewed-by: Laurent Vivier --- hw/ppc/spapr_caps.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c index 275f5bd0342c..10a80a8159f4 100644 --- a/hw/ppc/spapr_caps.c +++ b/hw/ppc/spapr_caps.c @@ -382,6 +382,8 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr, uint8_t val, Error **errp) { ERRP_GUARD(); +PowerPCCPU *cpu = POWERPC_CPU(first_cpu); + if (!val) { /* capability disabled by default */ return; @@ -391,6 +393,14 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr, error_setg(errp, "No Nested KVM-HV support in TCG"); error_append_hint(errp, "Try appending -machine cap-nested-hv=off\n"); } else if (kvm_enabled()) { +if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_00, 0, + spapr->max_compat_pvr)) { +error_setg(errp, "Nested KVM-HV only supported on POWER9"); +error_append_hint(errp, + "Try appending -machine max-cpu-compat=power9\n"); +return; +} + if (!kvmppc_has_cap_nested_kvm_hv()) { error_setg(errp, "KVM implementation does not support Nested KVM-HV");
[PATCH v4 for-5.2 0/2] spapr: Improve error reporting in spapr_caps.c
Nested KVM HV only works if the kernel is using the radix MMU mode, ie. the CPU is POWER9 and it is not running in some pre-power9 compat mode. Otherwise, the KVM HV module fails to load in the guest with -ENODEV. It might be painful for a user to discover this late that nested cannot work with their setup. It seems a better fit for QEMU to do a sanity check when applying the nested-hv sPAPR capability and print out an informative error message. sPAPR capabilities are checked at machine init. If a capability cannot be used, an error message is printed and QEMU exits. In most places, the error message also contains an hint for the user. But we should use error_append_hint() for that, as explained in the "qapi/error.h" header. So this series first converts spapr_caps.c to using error_append_hint(). This requires to add some ERRP_GUARD() because spapr_caps_apply() passes _fatal. Then it adds a sanity check for the nested-hv case with an error message and hint. v4: - Same as v3 but rebased on ppc-for-5.2, updated changelogs and cover v3: - Add preliminary patch to use warn_report() instead of a convoluted error_setg()+warn_report_err() sequence v2: - Fix indentation and add some missing \n in patch 2 - Add ERRP_AUTO_PROPAGATE() to cap_nested_kvm_hv_apply() in patch 2 instead of patch 3 --- Greg Kurz (2): spapr: Use error_append_hint() in spapr_caps.c spapr: Forbid nested KVM-HV in pre-power9 compat mode hw/ppc/spapr_caps.c | 99 +++ 1 file changed, 60 insertions(+), 39 deletions(-) -- Greg
Re: [PATCH 2/2] python/qemu: Change ConsoleSocket to optionally drain socket.
On Thu, 16 Jul 2020 at 09:42, Alex Bennée wrote: > > > +self._drain_thread = None > > +socket.socket.__init__(self, socket.AF_UNIX, socket.SOCK_STREAM) > > +self.connect(address) > > +self._drain = drain > > We end up with two variables that represent the fact we have draining > happening. Could we rationalise it into: > > if drain: > self._drain_thread = self._thread_start() > else > self._drain_thread = None # if this is needed > > And then tests for: > > if not self._drain: > > become > > if self._drain_thread is None: Good point, this is simpler. Will update. > > +if self._drain and self._drain_thread is not None: > > +thread, self._drain_thread = self._drain_thread, None > Would self._drain ever not have self._drain_thread set? No, I believe that if drain is set, it results in _drain_thread also being set. This will be cleaned up once we drop the _drain. > > > thread.join() > > +socket.socket.close(self) > > > diff --git a/python/qemu/machine.py b/python/qemu/machine.py > > index 6769359766..62709d86e4 100644 > > --- a/python/qemu/machine.py > > +++ b/python/qemu/machine.py > > @@ -22,7 +22,6 @@ import logging > > import os > > import subprocess > > import shutil > > -import socket > > FYI minor patch conflict here with master OK, will rebase and fix this conflict. Thanks & Regards, -Rob > > > import tempfile > > from typing import Optional, Type > > from types import TracebackType > > @@ -591,12 +590,8 @@ class QEMUMachine: > > Returns a socket connected to the console > > """ > > if self._console_socket is None: > > -if self._drain_console: > > -self._console_socket = console_socket.ConsoleSocket( > > -self._console_address, > > -file=self._console_log_path) > > -else: > > -self._console_socket = socket.socket(socket.AF_UNIX, > > - socket.SOCK_STREAM) > > -self._console_socket.connect(self._console_address) > > +self._console_socket = console_socket.ConsoleSocket( > > +self._console_address, > > +file=self._console_log_path, > > +drain=self._drain_console) > > return self._console_socket > > > -- > Alex Bennée
Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method
On Thu, 16 Jul 2020 at 17:55, P J P wrote: > > +-- On Thu, 16 Jul 2020, Peter Maydell wrote --+ > | > +static void imx7_digprog_write(void *opaque, hwaddr addr, > | > +uint64_t data, unsigned size) > | > +{ > | > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); > | > +} > | > | This covers a single register (DIGPROG) which is read-only (it returns a > | silicon revision number). So this is not a LOG_UNIMP, but a LOG_GUEST_ERROR: > | > | qemu_log_mask(LOG_GUEST_ERROR, "Guest write to read-only > | ANALOG_DIGPROG register\n"); > > Should this be g_assert_not_reached() in that case? No, because a malicious guest can write to the register (and cause the function to be called), it is merely that it is a bug in guest code for it to do that. -- PMM
Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method
+-- On Thu, 16 Jul 2020, Peter Maydell wrote --+ | > +static void imx7_digprog_write(void *opaque, hwaddr addr, | > +uint64_t data, unsigned size) | > +{ | > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); | > +} | | This covers a single register (DIGPROG) which is read-only (it returns a | silicon revision number). So this is not a LOG_UNIMP, but a LOG_GUEST_ERROR: | | qemu_log_mask(LOG_GUEST_ERROR, "Guest write to read-only | ANALOG_DIGPROG register\n"); Should this be g_assert_not_reached() in that case? Thank you. -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
Re: [PATCH] introduce VFIO-over-socket protocol specificaion
Patchew URL: https://patchew.org/QEMU/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/ Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com Subject: [PATCH] introduce VFIO-over-socket protocol specificaion === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' aa6155e introduce VFIO-over-socket protocol specificaion === OUTPUT BEGIN === WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? #20: new file mode 100644 ERROR: trailing whitespace #167: FILE: docs/devel/vfio-over-socket.rst:143: +configuration space. $ ERROR: trailing whitespace #360: FILE: docs/devel/vfio-over-socket.rst:336: +|| +-++ | $ ERROR: trailing whitespace #572: FILE: docs/devel/vfio-over-socket.rst:548: +be supported in future versions. $ ERROR: trailing whitespace #587: FILE: docs/devel/vfio-over-socket.rst:563: +| Command | 5| $ ERROR: trailing whitespace #874: FILE: docs/devel/vfio-over-socket.rst:850: +the interrupt. $ ERROR: trailing whitespace #878: FILE: docs/devel/vfio-over-socket.rst:854: +guest unmasks the interrupt. $ ERROR: trailing whitespace #908: FILE: docs/devel/vfio-over-socket.rst:884: +appears after the 16 byte message header. $ total: 7 errors, 1 warnings, 1135 lines checked Commit aa6155e147a9 (introduce VFIO-over-socket protocol specificaion) has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [PATCH] introduce VFIO-over-socket protocol specificaion
Patchew URL: https://patchew.org/QEMU/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #! /bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-mingw@fedora J=14 NETWORK=1 === TEST SCRIPT END === CC block/qapi.o CC block/file-win32.o Warning, treated as error: /tmp/qemu-test/src/docs/devel/vfio-over-socket.rst:281:Malformed table. +--+-+---+ --- CC crypto/hmac.o CC crypto/hmac-nettle.o CC crypto/desrfb.o make: *** [Makefile:1090: docs/devel/index.html] Error 2 make: *** Waiting for unfinished jobs Traceback (most recent call last): File "./tests/docker/docker.py", line 708, in --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=739a80f85e9146d59e3826f0d63daa6c', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-gwtqm3ly/src/docker-src.2020-07-16-12.51.55.8263:/var/tmp/qemu:z,ro', 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=739a80f85e9146d59e3826f0d63daa6c make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-gwtqm3ly/src' make: *** [docker-run-test-mingw@fedora] Error 2 real3m4.939s user0m8.716s The full log is available at http://patchew.org/logs/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/testing.docker-mingw@fedora/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [PATCH v3 2/9] pci-host: add pcie-msi read method
On Tue, 30 Jun 2020 at 13:30, P J P wrote: > > From: Prasad J Pandit > > Add pcie-msi mmio read method to avoid NULL pointer dereference > issue. This change is specific to the designware pci host controller; it would be nice to have "designware" in the commit subject. > Reported-by: Lei Sun > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- > hw/pci-host/designware.c | 9 + > 1 file changed, 9 insertions(+) > > Update v3: Add Reviewed-by: ... > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09400.html > +static uint64_t designware_pcie_root_msi_read(void *opaque, hwaddr addr, > + unsigned size) > +{ > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); > +return 0; This is the right change, but is missing an explanation of why it's right: Looking at the data sheet, in the real hardware MSI interrupts are handled by looking at writes to see whether they match the configured address; if so then the write gets quashed and never gets put out onto the AXI bus (to the CPU, memory, etc). This only happens for writes, so reads from the magic address are just allowed to pass through and will read from the system address space like any other read from any other address. That's not trivial to implement, though, and well-behaved guests won't care, so we can just explain why we're not doing anything with a suitable comment: /* * Attempts to read from the MSI address are undefined in * the PCI specifications. For this hardware, the datasheet * specifies that a read from the magic address is simply not * intercepted by the MSI controller, and will go out to the * AHB/AXI bus like any other PCI-device-initiated DMA read. * This is not trivial to implement in QEMU, so since * well-behaved guests won't ever ask a PCI device to DMA from * this address we just log the missing functionality. */ qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); return 0; > +} > + > static void designware_pcie_root_msi_write(void *opaque, hwaddr addr, > uint64_t val, unsigned len) > { > @@ -77,6 +85,7 @@ static void designware_pcie_root_msi_write(void *opaque, > hwaddr addr, > } > > static const MemoryRegionOps designware_pci_host_msi_ops = { > +.read = designware_pcie_root_msi_read, > .write = designware_pcie_root_msi_write, > .endianness = DEVICE_LITTLE_ENDIAN, > .valid = { With that comment, Reviewed-by: Peter Maydell thanks -- PMM
[PATCH] gitlab-ci.yml: Add oss-fuzz build tests
This tries to build and run the fuzzers with the same build-script used by oss-fuzz. This doesn't guarantee that the builds on oss-fuzz will also succeed, since oss-fuzz provides its own compiler and fuzzer vars, but it can catch changes that are not compatible with the the ./scripts/oss-fuzz/build.sh script. The strange way of finding fuzzer binaries stems from the method used by oss-fuzz: https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/targets_list Signed-off-by: Alexander Bulekov --- Similar to Thomas' patch: > Note: This patch needs two other patches merged first to work correctly: > - 'fuzz: Expect the cmdline in a freeable GString' from Alexander > - 'qom: Plug memory leak in "info qom-tree"' from Markus Otherwise the test will fail due to detected memory leaks. Fair warning: I haven't been able to trigger this new job yet. I tried to run the pipeline with these changes on my forked repo on gitlab, but did not reach the build-oss-fuzz. I think this is due to some failures in the Containers-layer-2 stage: ... Error response from daemon: manifest for registry.gitlab.com/a1xndr/qemu/qemu/debian-all-test-cross:latest not found: manifest unknown: manifest unknown #2 [internal] load .dockerignore #2 transferring context: #2 transferring context: 2B 0.1s done #2 DONE 0.1s #1 [internal] load build definition from tmpg8j4xoop.docker #1 transferring dockerfile: 2.21kB 0.1s done #1 DONE 0.2s #3 [internal] load metadata for docker.io/qemu/debian10:latest #3 ERROR: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed #4 [1/3] FROM docker.io/qemu/debian10:latest #4 resolve docker.io/qemu/debian10:latest 0.1s done #4 ERROR: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed -- > [internal] load metadata for docker.io/qemu/debian10:latest: -- -- > [1/3] FROM docker.io/qemu/debian10:latest: -- failed to solve with frontend dockerfile.v0: failed to build LLB: failed to load cache key: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed ... .gitlab-ci.yml | 14 ++ 1 file changed, 14 insertions(+) diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index e96f8794b9..a50df420c9 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -182,6 +182,20 @@ build-fuzzer: || exit 1 ; done +build-oss-fuzz: + <<: *native_build_job_definition + variables: +IMAGE: fedora + script: +- OUT_DIR="./build" CC=clang-9 CXX=clang++-9 CFLAGS="-fsanitize=address" + LIB_FUZZING_ENGINE="-fsanitize=fuzzer" CFL + ./scripts/oss-fuzz/build.sh +- for fuzzer in $(find ./build-oss-fuzz/DEST_DIR/ -executable -type f); do +grep "LLVMFuzzerTestOneInput" ${fuzzer} > /dev/null 2>&1 || continue ; +echo Testing ${fuzzer} ... ; +"${fuzzer}" -runs=1000 || exit 1 ; + done + build-tci: <<: *native_build_job_definition variables: -- 2.26.2
Re: [PATCH] gitlab-ci.yml: Add fuzzer tests
On 200716 1209, Thomas Huth wrote: > So far we neither compile-tested nor run any of the new fuzzers in our CI, > which led to some build failures of the fuzzer code in the past weeks. > To avoid this problem, add a job to compile the fuzzer code and run some > loops (which likely don't find any new bugs via fuzzing, but at least we > know that the code can still be run). > > A nice side-effect of this test is that the leak tests are enabled here, > so we should now notice some of the memory leaks in our code base earlier. > > Signed-off-by: Thomas Huth Thank you for this, Thomas. I just sent a patch to add a job that runs similar tests with the build-script that we use on oss-fuzz Patch: <20200716163330.29141-1-alx...@bu.edu> I know that these jobs have a lot of overlap, but there are enough quirks in the oss-fuzz build-script that I, personally, don't think they are redundant. A couple notes below, and I haven't been able to test on my own fork of qemu on gitlab, yet due to some pipeline errors, but otherwise Reviewed-by: Alexander Bulekov > --- > .gitlab-ci.yml | 20 +++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml > index 5eeba2791b..e96f8794b9 100644 > --- a/.gitlab-ci.yml > +++ b/.gitlab-ci.yml > @@ -161,9 +161,27 @@ build-clang: > IMAGE: fedora > CONFIGURE_ARGS: --cc=clang --cxx=clang++ > TARGETS: alpha-softmmu arm-softmmu m68k-softmmu mips64-softmmu > - ppc-softmmu s390x-softmmu x86_64-softmmu arm-linux-user > + ppc-softmmu s390x-softmmu arm-linux-user > MAKE_CHECK_ARGS: check > > +build-fuzzer: > + <<: *native_build_job_definition > + variables: > +IMAGE: fedora > + script: > +- mkdir build > +- cd build > +- ../configure --cc=clang --cxx=clang++ --enable-fuzzing > + --target-list=x86_64-softmmu https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg02310.html When/if this gets merged, enable-fuzzing won't build with AddressSanitizer, by default, so I would add --enable-sanitizers, just to be safe. > +- make -j"$JOBS" all check-build x86_64-softmmu/fuzz > +- make check > +- for fuzzer in i440fx-qos-fork-fuzz i440fx-qos-noreset-fuzz > +i440fx-qtest-reboot-fuzz virtio-scsi-flags-fuzz virtio-scsi-fuzz ; do Any reason for these particular fuzzers? I know the virtio-net ones find crashes pretty quickly, but I dont think that causes a non-zero exit. > + echo Testing ${fuzzer} ... ; > + x86_64-softmmu/qemu-fuzz-x86_64 --fuzz-target=${fuzzer} -runs=1000 > +|| exit 1 ; > + done > + > build-tci: ><<: *native_build_job_definition >variables: > -- > 2.18.1 >
Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs
On 7/16/20 1:00 PM, Reza Arbab wrote: On Thu, Jul 16, 2020 at 06:42:11AM -0300, Daniel Henrique Barboza wrote: Which would translate here to: uint32_t associativity[] = { cpu_to_be32(0x4), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), }; Sure, that's how it originally was in v1 of the patch. Yeah, Greg commented this in v2 about this chunk: This is a guest visible change. It should theoretically be controlled with a compat property of the PHB (look for "static GlobalProperty" in spapr.c). But since this code is only used for GPU passthrough and we don't support migration of such devices, I guess it's okay. Maybe just mention it in the changelog. -- By all means you can mention in changelog/code comment why the associativity of the GPU is nvslot->numa_id 4 times in a row, but I believe this format is still clearer for us to understand. It also makes it equal to skiboot. And it deprecates the SPAPR_GPU_NUMA_ID macro, allowing us to use its value (1) for other internal purposes regarding NUMA without collision with the GPU semantics. Thanks, DHB I'll send a v4 today. It's been a while so I need to rebase anyway.
Re: [PATCH v3 6/9] spapr_pci: add spapr msi read method
On Tue, 30 Jun 2020 at 13:31, P J P wrote: > > From: Prasad J Pandit > > Add spapr msi mmio read method to avoid NULL pointer dereference > issue. > > Reported-by: Lei Sun > Acked-by: David Gibson > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- > hw/ppc/spapr_pci.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > Update v3: Add Acked-by: ... > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg08054.html > > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 329002ac04..7033352834 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -52,6 +52,7 @@ > #include "sysemu/kvm.h" > #include "sysemu/hostmem.h" > #include "sysemu/numa.h" > +#include "qemu/log.h" > > /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */ > #define RTAS_QUERY_FN 0 > @@ -738,6 +739,12 @@ static PCIINTxRoute spapr_route_intx_pin_to_irq(void > *opaque, int pin) > return route; > } > > +static uint64_t spapr_msi_read(void *opaque, hwaddr addr, unsigned size) > +{ > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); > +return 0; > +} > + > /* > * MSI/MSIX memory region implementation. > * The handler handles both MSI and MSIX. > @@ -755,8 +762,10 @@ static void spapr_msi_write(void *opaque, hwaddr addr, > } > > static const MemoryRegionOps spapr_msi_ops = { > -/* There is no .read as the read result is undefined by PCI spec */ > -.read = NULL, > +/* .read result is undefined by PCI spec QEMU multiline comments should have the '/*' on a line of its own. > + * define .read method to avoid assert failure in memory_region_init_io > + */ If this is undefined behaviour per the PCI spec then LOG_UNIMP is the wrong thing -- this should either be LOG_GUEST_ERROR (if the guest can do this or program the h/w to do this) or assert() (if the only way this could happen would be a bug in a QEMU model of a PCI device). > +.read = spapr_msi_read, > .write = spapr_msi_write, > .endianness = DEVICE_LITTLE_ENDIAN > }; > -- > 2.26.2 thanks -- PMM
Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers
Laszlo Ersek writes: > Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an > object that has static storage duration shall be constant expressions or > string literals". > > The compound literal produced by the make_floatx80() macro is not such a > constant expression, per 6.6p7-9. (An implementation may accept it, > according to 6.6p10, but is not required to.) > > Therefore using "floatx80_zero" and make_floatx80() for initializing > "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6 > actually chokes on them: > >> target/i386/fpu_helper.c:871:5: error: initializer element is not constant >> { make_floatx80(0xbfff, 0x8000ULL), >> ^ > > We've had the make_floatx80_init() macro for this purpose since commit > 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that > macro again. > > Fixes: eca30647fc07 > Fixes: ff57bb7b6326 > Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html > Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html > Cc: Alex Bennée > Cc: Aurelien Jarno > Cc: Eduardo Habkost > Cc: Joseph Myers > Cc: Paolo Bonzini > Cc: Peter Maydell > Cc: Richard Henderson > Signed-off-by: Laszlo Ersek > --- > > Notes: > I can see that there are test cases under "tests/tcg/i386", but I don't > know how to run them. You can run the TCG tests with: make check-tcg or more specifically: make run-tcg-tests-i386-linux-user there is also a: make check-softfloat although in this case nothing is affected. softfloat bits: Acked-by: Alex Bennée > > include/fpu/softfloat.h | 1 + > target/i386/fpu_helper.c | 426 +++ > 2 files changed, 214 insertions(+), 213 deletions(-) > > diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h > index f1a19df066b7..659218b5c787 100644 > --- a/include/fpu/softfloat.h > +++ b/include/fpu/softfloat.h > @@ -822,6 +822,7 @@ static inline bool floatx80_invalid_encoding(floatx80 a) > } > > #define floatx80_zero make_floatx80(0x, 0xLL) > +#define floatx80_zero_init make_floatx80_init(0x, 0xLL) > #define floatx80_one make_floatx80(0x3fff, 0x8000LL) > #define floatx80_ln2 make_floatx80(0x3ffe, 0xb17217f7d1cf79acLL) > #define floatx80_pi make_floatx80(0x4000, 0xc90fdaa22168c235LL) > diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c > index f5e6c4b88d4e..4ea73874d836 100644 > --- a/target/i386/fpu_helper.c > +++ b/target/i386/fpu_helper.c > @@ -868,201 +868,201 @@ struct f2xm1_data { > }; > > static const struct f2xm1_data f2xm1_table[65] = { > -{ make_floatx80(0xbfff, 0x8000ULL), > - make_floatx80(0x3ffe, 0x8000ULL), > - make_floatx80(0xbffe, 0x8000ULL) }, > -{ make_floatx80(0xbffe, 0xf8002e7eULL), > - make_floatx80(0x3ffe, 0x82cd8698ac2b9160ULL), > - make_floatx80(0xbffd, 0xfa64f2cea7a8dd40ULL) }, > -{ make_floatx80(0xbffe, 0xefffe960ULL), > - make_floatx80(0x3ffe, 0x85aac367cc488345ULL), > - make_floatx80(0xbffd, 0xf4aa7930676ef976ULL) }, > -{ make_floatx80(0xbffe, 0xe8006f10ULL), > - make_floatx80(0x3ffe, 0x88980e8092da5c14ULL), > - make_floatx80(0xbffd, 0xeecfe2feda4b47d8ULL) }, > -{ make_floatx80(0xbffe, 0xe0008a45ULL), > - make_floatx80(0x3ffe, 0x8b95c1e3ea8ba2a5ULL), > - make_floatx80(0xbffd, 0xe8d47c382ae8bab6ULL) }, > -{ make_floatx80(0xbffe, 0xd7ff8a9eULL), > - make_floatx80(0x3ffe, 0x8ea4398b45cd8116ULL), > - make_floatx80(0xbffd, 0xe2b78ce97464fdd4ULL) }, > -{ make_floatx80(0xbffe, 0xd00019a0ULL), > - make_floatx80(0x3ffe, 0x91c3d373ab11b919ULL), > - make_floatx80(0xbffd, 0xdc785918a9dc8dceULL) }, > -{ make_floatx80(0xbffe, 0xc7ff14dfULL), > - make_floatx80(0x3ffe, 0x94f4efa8fef76836ULL), > - make_floatx80(0xbffd, 0xd61620ae02112f94ULL) }, > -{ make_floatx80(0xbffe, 0xc0006530ULL), > - make_floatx80(0x3ffe, 0x9837f0518db87fbbULL), > - make_floatx80(0xbffd, 0xcf901f5ce48f008aULL) }, > -{ make_floatx80(0xbffe, 0xb7ff1723ULL), > - make_floatx80(0x3ffe, 0x9b8d39b9d54eb74cULL), > - make_floatx80(0xbffd, 0xc8e58c8c55629168ULL) }, > -{ make_floatx80(0xbffe, 0xb000b5e1ULL), > - make_floatx80(0x3ffe, 0x9ef5326091a0c366ULL), > - make_floatx80(0xbffd, 0xc2159b3edcbe7934ULL) }, > -{ make_floatx80(0xbffe, 0xa8006f8aULL), > - make_floatx80(0x3ffe, 0xa27043030c49370aULL), > - make_floatx80(0xbffd, 0xbb1f79f9e76d91ecULL) }, > -{ make_floatx80(0xbffe, 0x9fff816aULL), > - make_floatx80(0x3ffe, 0xa5fed6a9b15171cfULL), > - make_floatx80(0xbffd, 0xb40252ac9d5d1c62ULL) }, > -{ make_floatx80(0xbffe, 0x97ffb621ULL), > - make_floatx80(0x3ffe, 0xa9a15ab4ea7c30e6ULL), > - make_floatx80(0xbffd,
Re: [PATCH v3 5/9] nvram: add nrf51_soc flash read method
On Tue, 30 Jun 2020 at 13:31, P J P wrote: > > From: Prasad J Pandit > > Add nrf51_soc mmio read method to avoid NULL pointer dereference > issue. > > Reported-by: Lei Sun > Signed-off-by: Prasad J Pandit > --- > hw/nvram/nrf51_nvm.c | 5 + > 1 file changed, 5 insertions(+) > > Update v3: use g_assert_not_reached() > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09560.html > > diff --git a/hw/nvram/nrf51_nvm.c b/hw/nvram/nrf51_nvm.c > index f2283c1a8d..82e89d965b 100644 > --- a/hw/nvram/nrf51_nvm.c > +++ b/hw/nvram/nrf51_nvm.c > @@ -273,6 +273,10 @@ static const MemoryRegionOps io_ops = { > .endianness = DEVICE_LITTLE_ENDIAN, > }; > > +static uint64_t flash_read(void *opaque, hwaddr offset, unsigned size) > +{ Could use a comment about why this is unreachable, since it's not totally obvious: /* * This is a rom_device MemoryRegion which is always in * romd_mode (we never put it in MMIO mode), so reads always * go directly to RAM and never come here. */ > +g_assert_not_reached(); > +} > Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v2 5/9] nvram: add nrf51_soc flash read method
On Mon, 29 Jun 2020 at 12:18, Li Qiang wrote: > > P J P 于2020年6月25日周四 上午3:01写道: > > > > From: Prasad J Pandit > > > > Add nrf51_soc mmio read method to avoid NULL pointer dereference > > issue. > > > > Reported-by: Lei Sun > > Signed-off-by: Prasad J Pandit > > --- > > hw/nvram/nrf51_nvm.c | 8 > > 1 file changed, 8 insertions(+) > > > > Update v2: return ldl_le_p() > > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg04972.html > > > > diff --git a/hw/nvram/nrf51_nvm.c b/hw/nvram/nrf51_nvm.c > > index f2283c1a8d..8000ed530a 100644 > > --- a/hw/nvram/nrf51_nvm.c > > +++ b/hw/nvram/nrf51_nvm.c > > @@ -273,6 +273,13 @@ static const MemoryRegionOps io_ops = { > > .endianness = DEVICE_LITTLE_ENDIAN, > > }; > > > > +static uint64_t flash_read(void *opaque, hwaddr offset, unsigned size) > > +{ > > +NRF51NVMState *s = NRF51_NVM(opaque); > > + > > +assert(offset + size <= s->flash_size); > > +return ldl_le_p(s->storage + offset); > > +} > > The 'flash_ops' is for ROM, though I don't see where it calls > 'memory_region_rom_device_set_romd' > to ROMD, so this MR is in MMIO mode and it needs a read callback. I think that 'romd mode' (ie reads-go-directly-to-RAM) is the default: memory_region_initfn() sets romd_mode to true. So unless the device actively calls memory_region_rom_device_set_romd(mr, false) then the read callback can't be reached. thanks -- PMM
Re: TB Cache size grows out of control with qemu 5.0
Christian Ehrhardt writes: > On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan wrote: > >> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two >> following it. >> > > Thank you Zoltan for pointing out this commit, I agree that this seems to be > the trigger for the issues I'm seeing. Unfortunately the common CI host size > is 1-2G. For example on Ubuntu Autopkgtests 1.5G. > Those of them running guests do so in 0.5-1G size in TCG mode > (as they often can't rely on having KVM available). > > The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing > on memory pressure (never existed) makes these systems go OOM-Killing > the qemu process. Ooops. I admit the assumption was that most people running system emulation would be doing it on beefier machines. > The patches indicated that the TB flushes on a full guest boot are a > good indicator of the TB size efficiency. From my old checks I had: > > - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4 > TB flush count 14, 14, 16 > - Qemu 5.0 512M guest with 1G default > TB flush count 1, 1, 1 > > I agree that ram/4 seems odd, especially on huge guests that is a lot > potentially wasted. And most environments have a bit of breathing > room 1G is too big in small host systems and the common CI system falls > into this category. So I tuned it down to 256M for a test. > > - Qemu 4.2 512M guest with tb-size 256M > TB flush count 5, 5, 5 > - Qemu 5.0 512M guest with tb-size 256M > TB flush count 5, 5, 5 > - Qemu 5.0 512M guest with 256M default in code > TB flush count 5, 5, 5 > > So performance wise the results are as much in-between as you'd think from a > TB size in between. And the memory consumption which (for me) is the actual > current issue to fix would be back in line again as expected. So I'm glad you have the workaround. > > So on one hand I'm suggesting something like: > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re > * Users running large scale system emulation may want to tweak their > * runtime setup via the tb-size control on the command line. > */ > -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB) > +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB) The problem we have is any number we pick here is arbitrary. And while it did regress your use-case changing it again just pushes a performance regression onto someone else. The most (*) 64 bit desktop PCs have 16Gb of RAM, almost all have more than 8gb. And there is a workaround. * random number from Steams HW survey. > #endif > #endif > > OTOH I understand someone else might want to get the more speedy 1G > especially for large guests. If someone used to run a 4G guest in TCG the > TB Size was 1G all along. > How about picking the smaller of (1G || ram-size/4) as default? > > This might then look like: > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe > { > /* Size the buffer. */ > if (tb_size == 0) { > -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +unsigned long max_default = (unsigned long)(ram_size / 4); > +if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) { > +tb_size = max_default; > +} else { > + tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE; > +} > } > if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) { > tb_size = MIN_CODE_GEN_BUFFER_SIZE; > > This is a bit more tricky than it seems as ram_sizes is no more > present in that context but it is enough to discuss it. > That should serve all cases - small and large - better as a pure > static default of 1G or always ram/4? I'm definitely against re-introducing ram_size into the mix. The original commit (a1b18df9a4) that broke this introduced an ordering dependency which we don't want to bring back. I'd be more amenable to something that took into account host memory and limited the default if it was smaller than a threshold. Is there a way to probe that that doesn't involve slurping /proc/meminfo? > > P.S. I added Alex being the Author of the offending patch and Richard/Paolo > for being listed in the Maintainers file for TCG. -- Alex Bennée
Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method
On Tue, 30 Jun 2020 at 13:31, P J P wrote: > > From: Prasad J Pandit > > Add digprog mmio write method to avoid assert failure during > initialisation. > > Reviewed-by: Li Qiang > Signed-off-by: Prasad J Pandit > --- > hw/misc/imx7_ccm.c | 7 +++ > 1 file changed, 7 insertions(+) > > Update v3: Add Reviewed-by: ... > -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09452.html > > diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c > index 02fc1ae8d0..5ac5ecf74c 100644 > --- a/hw/misc/imx7_ccm.c > +++ b/hw/misc/imx7_ccm.c > @@ -131,8 +131,15 @@ static const struct MemoryRegionOps imx7_set_clr_tog_ops > = { > }, > }; > > +static void imx7_digprog_write(void *opaque, hwaddr addr, > +uint64_t data, unsigned size) > +{ > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__); > +} This covers a single register (DIGPROG) which is read-only (it returns a silicon revision number). So this is not a LOG_UNIMP, but a LOG_GUEST_ERROR: qemu_log_mask(LOG_GUEST_ERROR, "Guest write to read-only ANALOG_DIGPROG register\n"); thanks -- PMM
Re: [PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object
Markus Armbruster 于2020年7月16日周四 下午11:07写道: > > To make deallocating partially constructed objects work, the > visit_type_STRUCT() need to succeed without doing anything when passed > a null object. > > Commit cdd2b228b9 "qapi: Smooth visitor error checking in generated > code" broke that. To reproduce, run tests/test-qobject-input-visitor > with AddressSanitizer: > > ==4353==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 16 byte(s) in 1 object(s) allocated from: > #0 0x7f192d0c5d28 in __interceptor_calloc > (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28) > #1 0x7f192cd21b10 in g_malloc0 > (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51b10) > #2 0x556725f6bbee in visit_next_list qapi/qapi-visit-core.c:86 > #3 0x556725f49e15 in visit_type_UserDefOneList > tests/test-qapi-visit.c:474 > #4 0x556725f4489b in test_visitor_in_fail_struct_in_list > tests/test-qobject-input-visitor.c:1086 > #5 0x7f192cd42f29 > (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72f29) > > SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s). > > Test case /visitor/input/fail/struct-in-list feeds a list with a bad > element to the QObject input visitor. Visiting that element duly > fails, and aborts the visit with the list only partially constructed: > the faulty object is null. Cleaning up the partially constructed list > visits that null object, fails, and aborts the visit before the list > node gets freed. > > Fix the the generated visit_type_STRUCT() to succeed for null objects. > > Fixes: cdd2b228b973d2a29edf7696ef6e8b08ec329019 > Reported-by: Li Qiang > Signed-off-by: Markus Armbruster Oh, I also sent this too. Not matter, just ignore my patch. Tested-by: Li Qiang Reviewed-by: Li Qiang > --- > scripts/qapi/visit.py | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/scripts/qapi/visit.py b/scripts/qapi/visit.py > index 3fb2f30510..cdabc5fa28 100644 > --- a/scripts/qapi/visit.py > +++ b/scripts/qapi/visit.py > @@ -249,6 +249,7 @@ bool visit_type_%(c_name)s(Visitor *v, const char *name, > %(c_name)s **obj, Error > if (!*obj) { > /* incomplete */ > assert(visit_is_dealloc(v)); > +ok = true; > goto out_obj; > } > if (!visit_type_%(c_name)s_members(v, *obj, errp)) { > -- > 2.26.2 > >
[PATCH] e1000e: using bottom half to send packets
Alexander Bulekov reported a UAF bug related e1000e packets send. -->https://bugs.launchpad.net/qemu/+bug/1886362 This is because the guest trigger a e1000e packet send and set the data's address to e1000e's MMIO address. So when the e1000e do DMA it will write the MMIO again and trigger re-entrancy and finally causes this UAF. Paolo suggested to use a bottom half whenever MMIO is doing complicate things in here: -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html Reference here: 'The easiest solution is to delay processing of descriptors to a bottom half whenever MMIO is doing something complicated. This is also better for latency because it will free the vCPU thread more quickly and leave the work to the I/O thread.' This patch fixes this UAF. Signed-off-by: Li Qiang --- hw/net/e1000e_core.c | 25 + hw/net/e1000e_core.h | 2 ++ 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index bcd186cac5..6165b04b68 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t val) static void e1000e_set_tctl(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; core->mac[index] = val; if (core->mac[TARC0] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 0); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[0].tx_bh); } if (core->mac[TARC1] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , 1); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[1].tx_bh); } } static void e1000e_set_tdt(E1000ECore *core, int index, uint32_t val) { -E1000E_TxRing txr; int qidx = e1000e_mq_queue_idx(TDT, index); uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1; core->mac[index] = val & 0x; if (core->mac[tarc_reg] & E1000_TARC_ENABLE) { -e1000e_tx_ring_init(core, , qidx); -e1000e_start_xmit(core, ); +qemu_bh_schedule(core->tx[qidx].tx_bh); } } @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state) } } +static void e1000e_core_tx_bh(void *opaque) +{ +struct e1000e_tx *tx = opaque; +E1000ECore *core = tx->core; +E1000E_TxRing txr; + +e1000e_tx_ring_init(core, , tx - >tx[0]); +e1000e_start_xmit(core, ); +} + void e1000e_core_pci_realize(E1000ECore *core, const uint16_t *eeprom_templ, @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core, for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_init(>tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS, core->has_vnet); +core->tx[i].core = core; +core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, >tx[i]); } net_rx_pkt_init(>rx_pkt, core->has_vnet); @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core) for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_reset(core->tx[i].tx_pkt); net_tx_pkt_uninit(core->tx[i].tx_pkt); +qemu_bh_delete(core->tx[i].tx_bh); +core->tx[i].tx_bh = NULL; } net_rx_pkt_uninit(core->rx_pkt); diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h index aee32f7e48..94ddc6afc2 100644 --- a/hw/net/e1000e_core.h +++ b/hw/net/e1000e_core.h @@ -77,6 +77,8 @@ struct E1000Core { unsigned char sum_needed; bool cptse; struct NetTxPkt *tx_pkt; +QEMUBH *tx_bh; +E1000ECore *core; } tx[E1000E_NUM_QUEUES]; struct NetRxPkt *rx_pkt; -- 2.17.1
Re: [PATCH v3 7/9] tz-ppc: add dummy read/write methods
On Tue, 30 Jun 2020 at 13:31, P J P wrote: > > From: Prasad J Pandit > > Add tz-ppc-dummy mmio read/write methods to avoid assert failure > during initialisation. > > Signed-off-by: Prasad J Pandit > -- Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format
16.07.2020 18:52, Andrey Shinkevich wrote: On 16.07.2020 18:40, Vladimir Sementsov-Ogievskiy wrote: 16.07.2020 18:34, Andrey Shinkevich wrote: On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote: 14.07.2020 00:36, Andrey Shinkevich wrote: As __dict__ is being extended with class members we do not want to print, make a light copy of the initial __dict__ and extend the copy by adding lists we have to print in the JSON output. Signed-off-by: Andrey Shinkevich --- tests/qemu-iotests/qcow2_format.py | 4 1 file changed, 4 insertions(+) diff --git a/tests/qemu-iotests/qcow2_format.py b/tests/qemu-iotests/qcow2_format.py index e0e14b5..83c3482 100644 --- a/tests/qemu-iotests/qcow2_format.py +++ b/tests/qemu-iotests/qcow2_format.py @@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta): self.__dict__ = dict((field[2], values[i]) for i, field in enumerate(self.fields)) + self.fields_dict = self.__dict__.copy() No, I don't like that. Keeping two copies of all the data is bad idea. If you want to select some fields, add a method (dump_json() ?) Which will collect the fields you want in a dict and return that new dict. But don't store object attributes twice. That is what I did in the versions before but it looks clumsy to me. Every single class lists almost all the items of the __dict__ again in the additional method. Most of them should be listed automatically by base class method, which will iterate through .fields Not really. It makes a light copy that stores only references to the desired fields. I'm not about memory consumption, I'm about the design. Keeping two representations of same thing is always painful to maintain. + def dump(self, dump_json=None): for f in self.fields: value = self.__dict__[f[2]] @@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct): self.bitmap_directory = \ [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size) for _ in range(self.nb_bitmaps)] + self.fields_dict.update(bitmap_directory=self.bitmap_directory) def dump(self, dump_json=None): super().dump(dump_json) @@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct): table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))] self.bitmap_table = Qcow2BitmapTable(raw_table=table, cluster_size=self.cluster_size) + self.fields_dict.update(bitmap_table=self.bitmap_table) def dump(self, dump_json=None): print(f'{"Bitmap name":<25} {self.name}') -- Best regards, Vladimir
Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs
On Thu, Jul 16, 2020 at 06:42:11AM -0300, Daniel Henrique Barboza wrote: Which would translate here to: uint32_t associativity[] = { cpu_to_be32(0x4), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), cpu_to_be32(nvslot->numa_id), }; Sure, that's how it originally was in v1 of the patch. I'll send a v4 today. It's been a while so I need to rebase anyway. -- Reza Arbab
Re: [PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object
Patchew URL: https://patchew.org/QEMU/20200716150617.4027356-1-arm...@redhat.com/ Hi, This series failed the docker-quick@centos7 build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash make docker-image-centos7 V=1 NETWORK=1 time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1 === TEST SCRIPT END === TESTiotest-qcow2: 020 TESTcheck-unit: tests/test-char ** ERROR:/tmp/qemu-test/src/tests/test-char.c:1204:char_serial_test: 'chr' should not be NULL ERROR test-char - Bail out! ERROR:/tmp/qemu-test/src/tests/test-char.c:1204:char_serial_test: 'chr' should not be NULL TESTiotest-qcow2: 021 make: *** [check-unit] Error 1 make: *** Waiting for unfinished jobs TESTiotest-qcow2: 022 TESTiotest-qcow2: 024 --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=df5ad74d767b4fb79503019b6f0a4007', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-x2oom0v2/src/docker-src.2020-07-16-11.47.51.27809:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=df5ad74d767b4fb79503019b6f0a4007 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-x2oom0v2/src' make: *** [docker-run-test-quick@centos7] Error 2 real16m21.927s user0m9.780s The full log is available at http://patchew.org/logs/20200716150617.4027356-1-arm...@redhat.com/testing.docker-quick@centos7/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
[Bug 1880822] Re: CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in DoS
Fixed in commit 790762e54871143415bffcec4cb3c022c3cd. ** Changed in: qemu Status: In Progress => Fix Committed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1880822 Title: CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in DoS Status in QEMU: Fix Committed Bug description: An out-of-bounds read access issue was found in the SD Memory Card emulator of the QEMU. It occurs while performing block write commands via sdhci_write(), if a guest user has sent 'address' which is OOB of 's->wp_groups'. A guest user/process may use this flaw to crash the QEMU process resulting in DoS. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1880822/+subscriptions
Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format
On 16.07.2020 18:40, Vladimir Sementsov-Ogievskiy wrote: 16.07.2020 18:34, Andrey Shinkevich wrote: On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote: 14.07.2020 00:36, Andrey Shinkevich wrote: As __dict__ is being extended with class members we do not want to print, make a light copy of the initial __dict__ and extend the copy by adding lists we have to print in the JSON output. Signed-off-by: Andrey Shinkevich --- tests/qemu-iotests/qcow2_format.py | 4 1 file changed, 4 insertions(+) diff --git a/tests/qemu-iotests/qcow2_format.py b/tests/qemu-iotests/qcow2_format.py index e0e14b5..83c3482 100644 --- a/tests/qemu-iotests/qcow2_format.py +++ b/tests/qemu-iotests/qcow2_format.py @@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta): self.__dict__ = dict((field[2], values[i]) for i, field in enumerate(self.fields)) + self.fields_dict = self.__dict__.copy() No, I don't like that. Keeping two copies of all the data is bad idea. If you want to select some fields, add a method (dump_json() ?) Which will collect the fields you want in a dict and return that new dict. But don't store object attributes twice. That is what I did in the versions before but it looks clumsy to me. Every single class lists almost all the items of the __dict__ again in the additional method. Andrey Not really. It makes a light copy that stores only references to the desired fields. I'm not about memory consumption, I'm about the design. Keeping two representations of same thing is always painful to maintain. + def dump(self, dump_json=None): for f in self.fields: value = self.__dict__[f[2]] @@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct): self.bitmap_directory = \ [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size) for _ in range(self.nb_bitmaps)] + self.fields_dict.update(bitmap_directory=self.bitmap_directory) def dump(self, dump_json=None): super().dump(dump_json) @@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct): table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))] self.bitmap_table = Qcow2BitmapTable(raw_table=table, cluster_size=self.cluster_size) + self.fields_dict.update(bitmap_table=self.bitmap_table) def dump(self, dump_json=None): print(f'{"Bitmap name":<25} {self.name}')
[PATCH] introduce VFIO-over-socket protocol specificaion
This patch introduces the VFIO-over-socket protocol specification, which is designed to allow devices to be emulated outside QEMU, in a separate process. VFIO-over-socket reuses the existing VFIO defines, structs and concepts. It has been earlier discussed as an RFC in: "RFC: use VFIO over a UNIX domain socket to implement device offloading" Signed-off-by: John G Johnson Signed-off-by: Thanos Makatos --- docs/devel/vfio-over-socket.rst | 1135 +++ 1 files changed, 1135 insertions(+), 0 deletions(-) create mode 100644 docs/devel/vfio-over-socket.rst diff --git a/docs/devel/vfio-over-socket.rst b/docs/devel/vfio-over-socket.rst new file mode 100644 index 000..723b944 --- /dev/null +++ b/docs/devel/vfio-over-socket.rst @@ -0,0 +1,1135 @@ +*** +VFIO-over-socket Protocol Specification +*** + +Version 0.1 + +Introduction + +VFIO-over-socket, also known as vfio-user, is a protocol that allows a device +to be virtualized in a separate process outside of QEMU. VFIO-over-socket +devices consist of a generic VFIO device type, living inside QEMU, which we +call the client, and the core device implementation, living outside QEMU, which +we call the server. VFIO-over-socket can be the main transport mechanism for +multi-process QEMU, however it can be used by other applications offering +device virtualization. Explaining the advantages of a +disaggregated/multi-process QEMU, and device virtualization outside QEMU in +general, is beyond the scope of this document. + +This document focuses on specifying the VFIO-over-socket protocol. VFIO has +been chosen for the following reasons: + +1) It is a mature and stable API, backed by an extensively used framework. +2) The existing VFIO client implementation (qemu/hw/vfio/) can be largely + reused. + +In a proof of concept implementation it has been demonstrated that using VFIO +over a UNIX domain socket is a viable option. VFIO-over-socket is designed with +QEMU in mind, however it could be used by other client applications. The +VFIO-over-socket protocol does not require that QEMU's VFIO client +implementation is used in QEMU. None of the VFIO kernel modules are required +for supporting the protocol, neither in the client nor the server, only the +source header files are used. + +The main idea is to allow a virtual device to function in a separate process in +the same host over a UNIX domain socket. A UNIX domain socket (AF_UNIX) is +chosen because we can trivially send file descriptors over it, which in turn +allows: + +* Sharing of guest memory for DMA with the virtual device process. +* Sharing of virtual device memory with the guest for fast MMIO. +* Efficient sharing of eventfd's for triggering interrupts. + +However, other socket types could be used which allows the virtual device +process to run in a separate guest in the same host (AF_VSOCK) or remotely +(AF_INET). Theoretically the underlying transport doesn't necessarily have to +be a socket, however we don't examine such alternatives. In this document we +focus on using a UNIX domain socket and introduce basic support for the other +two types of sockets without considering performance implications. + +This document does not yet describe any internal details of the server-side +implementation, however QEMU's VFIO client implementation will have to be +adapted according to this protocol in order to support VFIO-over-socket virtual +devices. + +VFIO + +VFIO is a framework that allows a physical device to be securely passed through +to a user space process; the kernel does not drive the device at all. +Typically, the user space process is a VM and the device is passed through to +it in order to achieve high performance. VFIO provides an API and the required +functionality in the kernel. QEMU has adopted VFIO to allow a guest virtual +machine to directly access physical devices, instead of emulating them in +software + +VFIO-over-socket reuses the core VFIO concepts defined in its API, but +implements them as messages to be sent over a UNIX-domain socket. It does not +change the kernel-based VFIO in any way, in fact none of the VFIO kernel +modules need to be loaded to use VFIO-over-socket. It is also possible for QEMU +to concurrently use the current kernel-based VFIO for one guest device, and use +VFIO-over-socket for another device in the same guest. + +VFIO Device Model +- +A device under VFIO presents a standard VFIO model to the user process. Many +of the VFIO operations in the existing kernel model use the ioctl() system +call, and references to the existing model are called the ioctl() +implementation in this document. + +The following sections describe the set of messages that implement the VFIO +device model over a UNIX domain socket. In many cases, the messages are direct +translations of data structures used in the ioctl() implementation. Messages +derived
[PATCH] osdep.h: Add doc comment for qemu_get_thread_id()
Add a documentation comment for qemu_get_thread_id(): since this is rather host-OS-specific it's useful if people writing the implementation and people thinking of using the function know what the purpose and limitations are. Signed-off-by: Peter Maydell --- Based on conversation with Dan on IRC, and prompted by the recent patch to add OpenBSD support. Q: should we document exactly what the thread-id value is for each host platform in the QMP documentation ? Somebody writing a management layer app should ideally not have to grovel through the application to figure out what they should do with the integer value they get back from query-cpus... include/qemu/osdep.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 4841b5c6b5f..8279f72e5ed 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -515,6 +515,20 @@ bool qemu_has_ofd_lock(void); bool qemu_write_pidfile(const char *pidfile, Error **errp); +/** + * qemu_get_thread_id: Return OS-specific ID of current thread + * + * This function returns an OS-specific identifier of the + * current thread. This will be used for the "thread-id" field in + * the response to the QMP query-cpus and query-iothreads commands. + * The intention is that a VM management layer application can then + * use it to tie specific QEMU vCPU and IO threads to specific host + * CPUs using whatever the host OS's CPU affinity setting API is. + * New implementations of this function for new host OSes should + * return the most sensible integer ID that works for that purpose. + * + * This function should not be used for anything else inside QEMU. + */ int qemu_get_thread_id(void); #ifndef CONFIG_IOVEC -- 2.20.1
Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format
16.07.2020 18:34, Andrey Shinkevich wrote: On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote: 14.07.2020 00:36, Andrey Shinkevich wrote: As __dict__ is being extended with class members we do not want to print, make a light copy of the initial __dict__ and extend the copy by adding lists we have to print in the JSON output. Signed-off-by: Andrey Shinkevich --- tests/qemu-iotests/qcow2_format.py | 4 1 file changed, 4 insertions(+) diff --git a/tests/qemu-iotests/qcow2_format.py b/tests/qemu-iotests/qcow2_format.py index e0e14b5..83c3482 100644 --- a/tests/qemu-iotests/qcow2_format.py +++ b/tests/qemu-iotests/qcow2_format.py @@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta): self.__dict__ = dict((field[2], values[i]) for i, field in enumerate(self.fields)) + self.fields_dict = self.__dict__.copy() No, I don't like that. Keeping two copies of all the data is bad idea. If you want to select some fields, add a method (dump_json() ?) Which will collect the fields you want in a dict and return that new dict. But don't store object attributes twice. Not really. It makes a light copy that stores only references to the desired fields. I'm not about memory consumption, I'm about the design. Keeping two representations of same thing is always painful to maintain. + def dump(self, dump_json=None): for f in self.fields: value = self.__dict__[f[2]] @@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct): self.bitmap_directory = \ [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size) for _ in range(self.nb_bitmaps)] + self.fields_dict.update(bitmap_directory=self.bitmap_directory) def dump(self, dump_json=None): super().dump(dump_json) @@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct): table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))] self.bitmap_table = Qcow2BitmapTable(raw_table=table, cluster_size=self.cluster_size) + self.fields_dict.update(bitmap_table=self.bitmap_table) def dump(self, dump_json=None): print(f'{"Bitmap name":<25} {self.name}') -- Best regards, Vladimir
Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format
On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote: 14.07.2020 00:36, Andrey Shinkevich wrote: As __dict__ is being extended with class members we do not want to print, make a light copy of the initial __dict__ and extend the copy by adding lists we have to print in the JSON output. Signed-off-by: Andrey Shinkevich --- tests/qemu-iotests/qcow2_format.py | 4 1 file changed, 4 insertions(+) diff --git a/tests/qemu-iotests/qcow2_format.py b/tests/qemu-iotests/qcow2_format.py index e0e14b5..83c3482 100644 --- a/tests/qemu-iotests/qcow2_format.py +++ b/tests/qemu-iotests/qcow2_format.py @@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta): self.__dict__ = dict((field[2], values[i]) for i, field in enumerate(self.fields)) + self.fields_dict = self.__dict__.copy() No, I don't like that. Keeping two copies of all the data is bad idea. If you want to select some fields, add a method (dump_json() ?) Which will collect the fields you want in a dict and return that new dict. But don't store object attributes twice. Not really. It makes a light copy that stores only references to the desired fields. Andrey + def dump(self, dump_json=None): for f in self.fields: value = self.__dict__[f[2]] @@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct): self.bitmap_directory = \ [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size) for _ in range(self.nb_bitmaps)] + self.fields_dict.update(bitmap_directory=self.bitmap_directory) def dump(self, dump_json=None): super().dump(dump_json) @@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct): table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))] self.bitmap_table = Qcow2BitmapTable(raw_table=table, cluster_size=self.cluster_size) + self.fields_dict.update(bitmap_table=self.bitmap_table) def dump(self, dump_json=None): print(f'{"Bitmap name":<25} {self.name}')
Re: [PATCH 2/2] i386/cpu: Mask off unsupported XSAVE components
On 7/16/2020 11:14 PM, Eduardo Habkost wrote: On Thu, Jul 16, 2020 at 04:20:19PM +0800, Xiaoyao Li wrote: When setting up XSAVE components, it needs to mask off those unsupported by KVM. Signed-off-by: Xiaoyao Li We must never disable CPUID features silently based on host capabilities, otherwise we can't guarantee guest ABI stability when migrating to another host. Filtering of features should involve a call to mark_unavailable_features() (or some equivalent mechanism) so we can report the missing features properly through QMP. Could you explain what's the bug you are trying to fix? The loop at x86_cpu_filter_features() is already supposed to disable features unsupported by the host. Sorry, I forgot x86_cpu_filter_features() totally when code inspection. --- target/i386/cpu.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index f5f11603e805..efc92334b7b1 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6274,8 +6274,10 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu) } } -env->features[FEAT_XSAVE_COMP_LO] = mask; -env->features[FEAT_XSAVE_COMP_HI] = mask >> 32; +env->features[FEAT_XSAVE_COMP_LO] = mask & +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_LO, cpu->migratable); +env->features[FEAT_XSAVE_COMP_HI] = (mask >> 32) & +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_HI, cpu->migratable); } /* Steps involved on loading and filtering CPUID data -- 2.18.4
Re: [PATCH v7 02/47] block: Add chain helper functions
16.07.2020 17:50, Max Reitz wrote: On 13.07.20 12:18, Vladimir Sementsov-Ogievskiy wrote: 25.06.2020 18:21, Max Reitz wrote: Add some helper functions for skipping filters in a chain of block nodes. Signed-off-by: Max Reitz --- include/block/block_int.h | 3 +++ block.c | 55 +++ 2 files changed, 58 insertions(+) diff --git a/include/block/block_int.h b/include/block/block_int.h index bb3457c5e8..5da793bfc3 100644 This patch raises two questions: 1. How to treat filters at the end of the backing chain? It was my understanding that this configuration would be impossible. OK for me, but I'd prefer to have a comment and assertions. - child-access function will return no filter child for such nodes, it's correct of course - filer skipping functions will return this filter.. How much is it correct - I don't know. Consider a chain top --- backing ---> filter-with-no-child How would it be possible to have filter without a child? if bdrv_backing_chain_next(top) returns NULL, it's incorrect, because top actually have backing, and on read it will read from it for unallocated clusters (and this should crash). So, probably, returning filter as a backing-chain-next is a valid thing to do. Or we should assert that we are not in such situation (which may crash more often than trying to really read from nonexistent child). so, returning NULL, may even less correct than returning a filter.. 2. How to tread nodes with drv=NULL, but with filter child (with BDRV_CHILD_FILTERED role). drv=NULL is a bug on its own that should be fixed... (The idea we had at some point was to introduce a special driver that just always returns -EIO on everything, and to replace corrupt qcow2 nodes by that. Or, well, we might just return -EIO from the qcow2 driver, actually, if we never use drv=NULL anywhere else.) In any case, drv=NULL is an edge case that I think never has anything to do with filters. So, again some good comment and assertion won't hurt. - child-access functions returns no filtered child for such nodes - filter skipping functions will stop on it.. === Isn't it better to drop drv->is_filter at all? And call filter nodes with a bs->file or bs->backing child in BDRV_CHILD_FILTERED role? This automatically closes the two questions: - node without a child in BDRV_CHILD_FILTERED is automatically non-filter. So, filter driver is responsible for having such child. - node without a drv may still be a filter if it have BDRV_CHILD_FILTERED.. Still, not very useful. Anyway, is_filter and BDRV_CHILD_FILTERED are in contradiction, and it seems good to get rid of is_filter. But I may miss something. [..] + +static BlockDriverState *bdrv_do_skip_filters(BlockDriverState *bs, + bool stop_on_explicit_filter) +{ + BdrvChild *c; + + if (!bs) { + return NULL; + } + + while (!(stop_on_explicit_filter && !bs->implicit)) { + c = bdrv_filter_child(bs); + if (!c) { + break; + } + bs = c->bs; + } + /* + * Note that this treats nodes with bs->drv == NULL as not being + * filters (bs->drv == NULL should be replaced by something else + * anyway). + * The advantage of this behavior is that this function will thus + * always return a non-NULL value (given a non-NULL @bs). I don't see, how it is follows from first sentence? We can skip nodes with a child of BDRV_CHILD_FILTERED and drv=NULL as well, and still return non-NULL bs at the end... My idea was that nodes with bs->drv == NULL might not even have children. If we treated them like filters and skipped through them, we would have to return NULL if there is no child. Didn't you mean "treat nodes without filter child as not being filters, even if they have drv->is_filter == true"? This is a real reason for the second sentence. Hm. I implicitly always assume that filters always have a filter child, so I tend to not even question that part. Hmm. Still, the relationship in the comment seems unclear to me, the code itself is simpler :) Okay, I'm actually OK with this all. I'd prefer to have assertions and comments on corner-cases I mentioned, but patch is OK as is: Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: [PATCH 1/2] i386/cpu: Clear FEAT_XSAVE_COMP_{LO, HI} when XSAVE is not available
On 7/16/2020 11:15 PM, Eduardo Habkost wrote: On Thu, Jul 16, 2020 at 04:20:18PM +0800, Xiaoyao Li wrote: Per Intel SDM vol 1, 13.2, if CPUID.1:ECX.XSAVE[bit 26] is 0, the processor provides no further enumeration through CPUID function 0DH. Can you explain what's the bug you are trying to fix? env->features[FEAT_XSAVE_COMP_*] is already initialized as zero. When "-cpu host", it's not zero I think. Signed-off-by: Xiaoyao Li --- target/i386/cpu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1e5123251d74..f5f11603e805 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6261,6 +6261,8 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu) uint64_t mask; if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) { +env->features[FEAT_XSAVE_COMP_LO] = 0; +env->features[FEAT_XSAVE_COMP_HI] = 0; return; } -- 2.18.4
Re: [PATCH v7 21/47] block: Use CAFs in bdrv_refresh_filename()
On 15.07.20 14:52, Andrey Shinkevich wrote: > On 25.06.2020 18:21, Max Reitz wrote: >> bdrv_refresh_filename() and the kind of related bdrv_dirname() should >> look to the primary child when they wish to copy the underlying file's >> filename. >> >> Signed-off-by: Max Reitz >> --- >> block.c | 29 + >> 1 file changed, 21 insertions(+), 8 deletions(-) >> >> diff --git a/block.c b/block.c >> index 8131d0b5eb..7c827fefa0 100644 >> --- a/block.c >> +++ b/block.c >> @@ -6797,6 +6797,7 @@ void bdrv_refresh_filename(BlockDriverState *bs) >> { >> BlockDriver *drv = bs->drv; >> BdrvChild *child; >> + BlockDriverState *primary_child_bs; >> QDict *opts; >> bool backing_overridden; >> bool generate_json_filename; /* Whether our default >> implementation should >> @@ -6866,20 +6867,30 @@ void bdrv_refresh_filename(BlockDriverState *bs) >> qobject_unref(bs->full_open_options); >> bs->full_open_options = opts; >> + primary_child_bs = bdrv_primary_bs(bs); >> + >> if (drv->bdrv_refresh_filename) { >> /* Obsolete information is of no use here, so drop the old >> file name >> * information before refreshing it */ >> bs->exact_filename[0] = '\0'; >> drv->bdrv_refresh_filename(bs); >> - } else if (bs->file) { >> - /* Try to reconstruct valid information from the underlying >> file */ >> + } else if (primary_child_bs) { >> + /* >> + * Try to reconstruct valid information from the underlying >> + * file -- this only works for format nodes (filter nodes >> + * cannot be probed and as such must be selected by the user >> + * either through an options dict, or through a special >> + * filename which the filter driver must construct in its >> + * .bdrv_refresh_filename() implementation). >> + */ > > > The caller may not be aware of a filter node and intend to refresh the > name of underlying format node. > > In that case, the filter driver should redirect the call to the format > node. It shouldn’t. We can only return a plain filename if passing that filename to qemu (e.g. to -drive) will result in the same block graph configuration. This is what the comment means by “filter nodes cannot be probed”: If there is a filter node, we must generate a json:{} filename, because otherwise reopening the block device with -drive by passing the filename generated here would result in a configuration where the filter is missing. > What are situations the name of the filter itself should be refreshed in? Hypothetically, if a filename could specify a filter. For example, say the filename “filter[copy-on-read]:foo.qcow2” would result in qemu creating a COR filter on top of a qcow2 node, then we could generate such a filename. In practice, filters cannot be configured through plain filenames (apart from blkverify, but that’s a debugging feature, so it doesn’t really matter), so there is no such situation. All filter nodes should have an empty exact_filename and thus get a json:{} pseudo-filename. > If there are any, should we do both actions or choose either? > > Andrey > > >> bs->exact_filename[0] = '\0'; >> /* >> * We can use the underlying file's filename if: >> * - it has a filename, >> + * - the current BDS is not a filter, > > > Should we check the function input parameter for being a filter's BS > here, in this function, and handle the case here or let the filter > driver function do that or else the caller should check it? bdrv_refresh_filename() is called whenever some node in the block graph has changed, just to refresh its filename (after that change). The caller generally doesn’t really care about the result, so it doesn’t matter whether the node is a filter or not (i.e., whether it gets a plain filename or not). I don’t think the caller should check it, and in this implementation we simply have to handle filter nodes correctly: That is, not give them a plain filename. Max signature.asc Description: OpenPGP digital signature
[Bug 1887820] [NEW] TCG test targets missing from 'make check-help'
Public bug reported: We can run the TCG tests using: $ make run-tcg-tests-$TARGET-softmmu This is not listed in 'make check-help'. ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1887820 Title: TCG test targets missing from 'make check-help' Status in QEMU: New Bug description: We can run the TCG tests using: $ make run-tcg-tests-$TARGET-softmmu This is not listed in 'make check-help'. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1887820/+subscriptions
Re: [PATCH 1/2] i386/cpu: Clear FEAT_XSAVE_COMP_{LO,HI} when XSAVE is not available
On Thu, Jul 16, 2020 at 04:20:18PM +0800, Xiaoyao Li wrote: > Per Intel SDM vol 1, 13.2, if CPUID.1:ECX.XSAVE[bit 26] is 0, the > processor provides no further enumeration through CPUID function 0DH. Can you explain what's the bug you are trying to fix? env->features[FEAT_XSAVE_COMP_*] is already initialized as zero. > > Signed-off-by: Xiaoyao Li > --- > target/i386/cpu.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 1e5123251d74..f5f11603e805 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -6261,6 +6261,8 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu) > uint64_t mask; > > if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) { > +env->features[FEAT_XSAVE_COMP_LO] = 0; > +env->features[FEAT_XSAVE_COMP_HI] = 0; > return; > } > > -- > 2.18.4 > -- Eduardo
Re: [PATCH v7 20/47] block: Iterate over children in refresh_limits
On 14.07.20 20:37, Andrey Shinkevich wrote: > On 25.06.2020 18:21, Max Reitz wrote: >> Instead of looking at just bs->file and bs->backing, we should look at >> all children that could end up receiving forwarded requests. >> >> Signed-off-by: Max Reitz >> --- >> block/io.c | 32 >> 1 file changed, 16 insertions(+), 16 deletions(-) >> >> diff --git a/block/io.c b/block/io.c >> index c2af7711d6..37057f13e0 100644 >> --- a/block/io.c >> +++ b/block/io.c >> @@ -135,6 +135,8 @@ static void bdrv_merge_limits(BlockLimits *dst, >> const BlockLimits *src) >> void bdrv_refresh_limits(BlockDriverState *bs, Error **errp) >> { >> BlockDriver *drv = bs->drv; >> + BdrvChild *c; >> + bool have_limits; >> Error *local_err = NULL; >> memset(>bl, 0, sizeof(bs->bl)); >> @@ -149,14 +151,21 @@ void bdrv_refresh_limits(BlockDriverState *bs, >> Error **errp) >> drv->bdrv_co_preadv_part) ? 1 : 512; >> /* Take some limits from the children as a default */ >> - if (bs->file) { >> - bdrv_refresh_limits(bs->file->bs, _err); >> - if (local_err) { >> - error_propagate(errp, local_err); >> - return; >> + have_limits = false; >> + QLIST_FOREACH(c, >children, next) { >> + if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | >> BDRV_CHILD_COW)) >> + { >> + bdrv_refresh_limits(c->bs, _err); >> + if (local_err) { >> + error_propagate(errp, local_err); >> + return; >> + } >> + bdrv_merge_limits(>bl, >bs->bl); >> + have_limits = true; >> } >> - bdrv_merge_limits(>bl, >file->bs->bl); >> - } else { >> + } >> + >> + if (!have_limits) { > > > This conditioned piece of code worked with (bs->file == NULL) only. > > Now, it works only if there are neither bs->file, nor bs->backing, nor > else filtered children. > > Is it OK and doesn't break the logic for all cases? Hm. Good question. I think the answer is it’s OK. For DATA and FILTERED, it makes absolute sense to just use their alignments. For COW, maybe not so much? But if there’s a COW child, there has to be a DATA child as well (in practice). So we’ll always consider its alignment, too. (And hypothetically speaking, if there was a COW child but no DATA child, then the only alignment we need to observe is in fact the one of the COW child.) Max signature.asc Description: OpenPGP digital signature
Re: [PATCH 2/2] i386/cpu: Mask off unsupported XSAVE components
On Thu, Jul 16, 2020 at 04:20:19PM +0800, Xiaoyao Li wrote: > When setting up XSAVE components, it needs to mask off those unsupported > by KVM. > > Signed-off-by: Xiaoyao Li We must never disable CPUID features silently based on host capabilities, otherwise we can't guarantee guest ABI stability when migrating to another host. Filtering of features should involve a call to mark_unavailable_features() (or some equivalent mechanism) so we can report the missing features properly through QMP. Could you explain what's the bug you are trying to fix? The loop at x86_cpu_filter_features() is already supposed to disable features unsupported by the host. > --- > target/i386/cpu.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index f5f11603e805..efc92334b7b1 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -6274,8 +6274,10 @@ static void x86_cpu_enable_xsave_components(X86CPU > *cpu) > } > } > > -env->features[FEAT_XSAVE_COMP_LO] = mask; > -env->features[FEAT_XSAVE_COMP_HI] = mask >> 32; > +env->features[FEAT_XSAVE_COMP_LO] = mask & > +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_LO, > cpu->migratable); > +env->features[FEAT_XSAVE_COMP_HI] = (mask >> 32) & > +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_HI, > cpu->migratable); > } > > /* Steps involved on loading and filtering CPUID data > -- > 2.18.4 > -- Eduardo
Re: [PATCH v7 19/47] vmdk: Drop vmdk_co_flush()
On 14.07.20 16:52, Andrey Shinkevich wrote: > On 25.06.2020 18:21, Max Reitz wrote: >> Before HEAD^, we needed this because bdrv_co_flush() by itself would >> only flush bs->file. With HEAD^, bdrv_co_flush() will flush all >> children on which a WRITE or WRITE_UNCHANGED permission has been taken. >> Thus, vmdk no longer needs to do it itself. >> >> Signed-off-by: Max Reitz >> --- >> block/vmdk.c | 16 >> 1 file changed, 16 deletions(-) >> >> diff --git a/block/vmdk.c b/block/vmdk.c >> index 62da465126..a23890e6ec 100644 >> --- a/block/vmdk.c >> +++ b/block/vmdk.c >> @@ -2802,21 +2802,6 @@ static void vmdk_close(BlockDriverState *bs) >> error_free(s->migration_blocker); >> } >> -static coroutine_fn int vmdk_co_flush(BlockDriverState *bs) >> -{ >> - BDRVVmdkState *s = bs->opaque; >> - int i, err; >> - int ret = 0; >> - >> - for (i = 0; i < s->num_extents; i++) { >> - err = bdrv_co_flush(s->extents[i].file->bs); >> - if (err < 0) { >> - ret = err; >> - } >> - } >> - return ret; >> -} >> - >> static int64_t vmdk_get_allocated_file_size(BlockDriverState *bs) >> { >> int i; >> @@ -3075,7 +3060,6 @@ static BlockDriver bdrv_vmdk = { >> .bdrv_close = vmdk_close, >> .bdrv_co_create_opts = vmdk_co_create_opts, >> .bdrv_co_create = vmdk_co_create, >> - .bdrv_co_flush_to_disk = vmdk_co_flush, > > > After HEAD^ applied, wouldn't we get an endless recursion in > bdrv_co_flush() if the HEAD (this patch) had not been merged into HEAD^? Hm, how so? HEAD^ just flushes all children, just like vmdk_co_flush() does. So it seems to me just like double the work. (Which is unfortunate but shouldn’t be a problem.) Max signature.asc Description: OpenPGP digital signature
Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers
On 7/16/20 4:42 PM, Laszlo Ersek wrote: > Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an > object that has static storage duration shall be constant expressions or > string literals". > > The compound literal produced by the make_floatx80() macro is not such a > constant expression, per 6.6p7-9. (An implementation may accept it, > according to 6.6p10, but is not required to.) > > Therefore using "floatx80_zero" and make_floatx80() for initializing > "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6 > actually chokes on them: > >> target/i386/fpu_helper.c:871:5: error: initializer element is not constant >> { make_floatx80(0xbfff, 0x8000ULL), >> ^ This reminds me of: commit 6fa9ba09dbf4eb8b52bcb47d6820957f1b77ee0b Author: Kamil Rytarowski Date: Mon Sep 4 23:23:06 2017 +0200 target/m68k: Switch fpu_rom from make_floatx80() to make_floatx80_init() GCC 4.7.2 on SunOS reports that the values assigned to array members are not real constants: target/m68k/fpu_helper.c:32:5: error: initializer element is not constant target/m68k/fpu_helper.c:32:5: error: (near initialization for 'fpu_rom[0]') rules.mak:66: recipe for target 'target/m68k/fpu_helper.o' failed Convert the array to make_floatx80_init() to fix it. Replace floatx80_pi-like constants with make_floatx80_init() as they are defined as make_floatx80(). Reviewed-by: Philippe Mathieu-Daudé > > We've had the make_floatx80_init() macro for this purpose since commit > 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that > macro again. > > Fixes: eca30647fc07 > Fixes: ff57bb7b6326 > Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html > Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html > Cc: Alex Bennée > Cc: Aurelien Jarno > Cc: Eduardo Habkost > Cc: Joseph Myers > Cc: Paolo Bonzini > Cc: Peter Maydell > Cc: Richard Henderson > Signed-off-by: Laszlo Ersek > --- > > Notes: > I can see that there are test cases under "tests/tcg/i386", but I don't > know how to run them. Yeah it is not easy to figure... Try 'make run-tcg-tests-i386-softmmu' but you need docker :^) (There is also 'make check-softfloat', listed in 'make check-help') > > include/fpu/softfloat.h | 1 + > target/i386/fpu_helper.c | 426 +++ > 2 files changed, 214 insertions(+), 213 deletions(-) > > diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h > index f1a19df066b7..659218b5c787 100644 > --- a/include/fpu/softfloat.h > +++ b/include/fpu/softfloat.h > @@ -822,6 +822,7 @@ static inline bool floatx80_invalid_encoding(floatx80 a) > } > > #define floatx80_zero make_floatx80(0x, 0xLL) > +#define floatx80_zero_init make_floatx80_init(0x, 0xLL) > #define floatx80_one make_floatx80(0x3fff, 0x8000LL) > #define floatx80_ln2 make_floatx80(0x3ffe, 0xb17217f7d1cf79acLL) > #define floatx80_pi make_floatx80(0x4000, 0xc90fdaa22168c235LL) > diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c > index f5e6c4b88d4e..4ea73874d836 100644 > --- a/target/i386/fpu_helper.c > +++ b/target/i386/fpu_helper.c > @@ -868,201 +868,201 @@ struct f2xm1_data { > }; ...
Re: [PATCH v6 0/3] modify CPU model info
On Tue, Jul 14, 2020 at 04:41:45PM +0800, Chenyi Qiang wrote: > Add the missing VMX features in Skylake-Server, Cascadelake-Server and > Icelake-Server CPU models. In Icelake-Server CPU model, it lacks sha_ni, > avx512ifma, rdpid and fsrm. The model number of Icelake-Server also needs > to be fixed. > > To apply this patchset, a bug related to env->user_features need to be > fixed first. The patch is available at > https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04399.html > Queued for 5.1, thanks! -- Eduardo
[PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object
To make deallocating partially constructed objects work, the visit_type_STRUCT() need to succeed without doing anything when passed a null object. Commit cdd2b228b9 "qapi: Smooth visitor error checking in generated code" broke that. To reproduce, run tests/test-qobject-input-visitor with AddressSanitizer: ==4353==ERROR: LeakSanitizer: detected memory leaks Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f192d0c5d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28) #1 0x7f192cd21b10 in g_malloc0 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51b10) #2 0x556725f6bbee in visit_next_list qapi/qapi-visit-core.c:86 #3 0x556725f49e15 in visit_type_UserDefOneList tests/test-qapi-visit.c:474 #4 0x556725f4489b in test_visitor_in_fail_struct_in_list tests/test-qobject-input-visitor.c:1086 #5 0x7f192cd42f29 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72f29) SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s). Test case /visitor/input/fail/struct-in-list feeds a list with a bad element to the QObject input visitor. Visiting that element duly fails, and aborts the visit with the list only partially constructed: the faulty object is null. Cleaning up the partially constructed list visits that null object, fails, and aborts the visit before the list node gets freed. Fix the the generated visit_type_STRUCT() to succeed for null objects. Fixes: cdd2b228b973d2a29edf7696ef6e8b08ec329019 Reported-by: Li Qiang Signed-off-by: Markus Armbruster --- scripts/qapi/visit.py | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/qapi/visit.py b/scripts/qapi/visit.py index 3fb2f30510..cdabc5fa28 100644 --- a/scripts/qapi/visit.py +++ b/scripts/qapi/visit.py @@ -249,6 +249,7 @@ bool visit_type_%(c_name)s(Visitor *v, const char *name, %(c_name)s **obj, Error if (!*obj) { /* incomplete */ assert(visit_is_dealloc(v)); +ok = true; goto out_obj; } if (!visit_type_%(c_name)s_members(v, *obj, errp)) { -- 2.26.2
Re: [PATCH] spapr_pci: Robustify support of PCI bridges
On Thu, 16 Jul 2020 16:23:52 +0200 Markus Armbruster wrote: > David Gibson writes: > > > On Thu, Jul 16, 2020 at 12:32:44PM +0200, Greg Kurz wrote: > >> On Thu, 16 Jul 2020 14:45:40 +1000 > >> David Gibson wrote: > >> > >> > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote: > >> > > Some recent error handling cleanups unveiled issues with our support of > >> > > PCI bridges: > >> > > > >> > > 1) QEMU aborts when using non-standard PCI bridge types, > >> > >unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error > >> > > handling" > >> > > > >> > > $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge > >> > > Unexpected error in object_property_find() at qom/object.c:1240: > >> > > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not > >> > > found > >> > > Aborted (core dumped) > >> > > >> > Oops, I thought we had a check that we actually had a "pci-bridge" > >> > device before continuing with the hotplug, but I guess not. > >> > >> Ah... are you suggesting we should explicitly check the actual type > >> of the bridge rather than looking for the "chassis_nr" property ? > > > > Uh.. I thought about it, but I don't think it matters much which way > > we do it. > > Would it make sense to add the "chassis_nr" property to *all* PCI > bridge devices? > I see that the "PCI Express to PCI/PCI-X Bridge Specification" mentions a "Chassis Number Register" which looks very similar to the what exists in standard PCI-to-PCI brdiges. This doesn't seem to be implemented in our "pcie-pci-bridge" device model though, but of course I have no idea why :) Maybe Michael or Marcel (cc'd) can share some thoughts about that ? > [...] >