Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
On 4/2/2024 5:01 AM, Eugenio Perez Martin wrote: On Tue, Apr 2, 2024 at 8:19 AM Si-Wei Liu wrote: On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote: On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu wrote: Hi Michael, On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote: On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote: Hi Eugenio, I thought this new code looks good to me and the original issue I saw with x-svq=on should be gone. However, after rebase my tree on top of this, there's a new failure I found around setting up guest mappings at early boot, please see attached the specific QEMU config and corresponding event traces. Haven't checked into the detail yet, thinking you would need to be aware of ahead. Regards, -Siwei Eugenio were you able to reproduce? Siwei did you have time to look into this? Didn't get a chance to look into the detail yet in the past week, but thought it may have something to do with the (internals of) iova tree range allocation and the lookup routine. It started to fall apart at the first vhost_vdpa_dma_unmap call showing up in the trace events, where it should've gotten IOVA=0x201000, but an incorrect IOVA address 0x1000 was ended up returning from the iova tree lookup routine. HVAGPAIOVA - Map [0x7f7903e0, 0x7f7983e0)[0x0, 0x8000) [0x1000, 0x8000) [0x7f7983e0, 0x7f9903e0)[0x1, 0x208000) [0x80001000, 0x201000) [0x7f7903ea, 0x7f7903ec)[0xfeda, 0xfedc) [0x201000, 0x221000) Unmap [0x7f7903ea, 0x7f7903ec)[0xfeda, 0xfedc) [0x1000, 0x2) ??? shouldn't it be [0x201000, 0x221000) ??? It looks the SVQ iova tree lookup routine vhost_iova_tree_find_iova(), which is called from vhost_vdpa_listener_region_del(), can't properly deal with overlapped region. Specifically, q35's mch_realize() has the following: 579 memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), "smram-open-high", 580 mch->ram_memory, MCH_HOST_BRIDGE_SMRAM_C_BASE, 581 MCH_HOST_BRIDGE_SMRAM_C_SIZE); 582 memory_region_add_subregion_overlap(mch->system_memory, 0xfeda, 583 &mch->open_high_smram, 1); 584 memory_region_set_enabled(&mch->open_high_smram, false); #0 0x564c30bf6980 in iova_tree_find_address_iterator (key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at ../util/iova-tree.c:96 #1 0x7f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0 #2 0x564c30bf6b53 in iova_tree_find_iova (tree=, map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114 #3 0x564c309da0a9 in vhost_iova_tree_find_iova (tree=, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70 #4 0x564c3085e49d in vhost_vdpa_listener_region_del (listener=0x564c331024c8, section=0x7fffb6d74aa0) at ../hw/virtio/vhost-vdpa.c:444 #5 0x564c309f4931 in address_space_update_topology_pass (as=as@entry=0x564c31ab1840 , old_view=old_view@entry=0x564c33364cc0, new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at ../system/memory.c:977 #6 0x564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840 ) at ../system/memory.c:1079 #7 0x564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1132 #8 0x564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1117 #9 0x564c307cce64 in mch_realize (d=, errp=) at ../hw/pci-host/q35.c:584 However, it looks like iova_tree_find_address_iterator() only check if the translated address (HVA) falls in to the range when trying to locate the desired IOVA, causing the first DMAMap that happens to overlap in the translated address (HVA) space to be returned prematurely: 89 static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value, 90 gpointer data) 91 { : : 99 if (map->translated_addr + map->size < needle->translated_addr || 100 needle->translated_addr + needle->size < map->translated_addr) { 101 return false; 102 } 103 104 args->result = map; 105 return true; 106 } In the QEMU trace file, it reveals that the first DMAMap as below gets returned incorrectly instead the second, the latter of which is what the actual IOVA corresponds to: HVA GPA IOVA [0x7f7903e0, 0x7f7983e0)[0x0, 0x8000) [0x1000, 0x80001000) [0x7f7903ea, 0x7f7903ec)[0xfeda, 0xfedc) [0x201000, 0x221000) I think the analysis is totally accurate as no code expects to unmap / map overlapping regions. In particular, vdpa kernel does not expect i
Re: [PATCH] sh4: mac.w: memory accesses are 16-bit words
On Tue, 02 Apr 2024 22:26:25 +0900, Philippe Mathieu-Daudé wrote: > > On 2/4/24 11:37, Zack Buhman wrote: > > Before this change, executing a code sequence such as: > > > > mova tblm,r0 > > movr0,r1 > > mova tbln,r0 > > clrs > > clrmac > > mac.w @r0+,@r1+ > > mac.w @r0+,@r1+ > > > > .align 4 > >tblm:.word 0x1234 > > .word 0x5678 > >tbln:.word 0x9abc > > .word 0xdefg > > > > Does not result in correct behavior: > > > > Expected behavior: > >first macw : macl = 0x1234 * 0x9abc + 0x0 > > mach = 0x0 > > > >second macw: macl = 0x5678 * 0xdefg + 0xb00a630 > > mach = 0x0 > > > > Observed behavior (qemu-sh4eb, prior to this commit): > > > >first macw : macl = 0x5678 * 0xdefg + 0x0 > > mach = 0x0 > > > >second macw: (unaligned longword memory access, SIGBUS) > > > > Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory > > access, not a 32-bit longword memory access. > > > > Signed-off-by: Zack Buhman > > --- > > target/sh4/translate.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/target/sh4/translate.c b/target/sh4/translate.c > > index a9b1bc7524..6643c14dde 100644 > > --- a/target/sh4/translate.c > > +++ b/target/sh4/translate.c > > @@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx) > > TCGv arg0, arg1; > > arg0 = tcg_temp_new(); > > tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, > > -MO_TESL | MO_ALIGN); > > +MO_TESW | MO_ALIGN); > > arg1 = tcg_temp_new(); > > tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, > > -MO_TESL | MO_ALIGN); > > +MO_TESW | MO_ALIGN); > > Apparently invalid since its introduction in commit fdf9b3e831. > > Reviewed-by: Philippe Mathieu-Daudé > > > gen_helper_macw(tcg_env, arg0, arg1); > > tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2); > > tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2); > SH4 Software manual said. https://www.renesas.com/us/en/document/mas/sh-4-software-manual > This instruction performs signed multiplication of the 16-bit operands > whose addresses are the contents of general registers Rm and Rn, > adds the 32-bit result to the MAC register contents, and stores the > result in the MAC register. Operands Rm and Rn are each incremented > by 2 each time they are read. Reviewed-by: Yoshinori Sato -- Yosinori Sato
Re: [PATCH] hw/virtio: Fix packed virtqueue flush used_idx
On Wed, Mar 27, 2024 at 02:15:18PM +0800, Wafer wrote: > For indirect descriptors the elelm->ndescs was one, > For direct descriptors the elele->ndesc was the numbe of entries. > elem->ndescs = (desc_cache == &indirect_desc_cache) ? 1 : elem_entries; > > When flushing multiple elemes, > the used_idx should be added to all the privious numeric entry value. > > Signed-off-by: Wafer Thanks for the patch. It's kind of hard to figure out what you are trying to say with all the typos and grammar errors in the commit log. What's up with that? Please describe the following in the commit log: - current behaviour is abc - this is wrong because the virtio spec says def - as a result we observed guest doing pqr and then stu - to fix do ghi - with this fix the guest does xyz as expected - tested by klm Also I think you might want to add: Fixes: 86044b24e8 ("virtio: basic packed virtqueue support") Cc: "Jason Wang" > --- > hw/virtio/virtio.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > index d229755eae..44f1d2fcfc 100644 > --- a/hw/virtio/virtio.c > +++ b/hw/virtio/virtio.c > @@ -957,12 +957,17 @@ static void virtqueue_packed_flush(VirtQueue *vq, > unsigned int count) > return; > } > > +/* > + * When the descriptor's flag was 'INDIRECT', the value of 'ndescs' is > one. > + * When the descriptor's flag was 'chain', the value of 'ndescs' > + * is the number of entries. > + */ There's no such thing as "the flag" - descriptors do have a "flags" field though. And there's no 'chain' value either. maybe just " For indirect elems, ndescs is 1. For all other elems, ndescs is the number of descriptors chained by NEXT (as set in virtqueue_packed_pop). > +ndescs += vq->used_elems[0].ndescs; > for (i = 1; i < count; i++) { > -virtqueue_packed_fill_desc(vq, &vq->used_elems[i], i, false); > +virtqueue_packed_fill_desc(vq, &vq->used_elems[i], ndescs, false); > ndescs += vq->used_elems[i].ndescs; > } > virtqueue_packed_fill_desc(vq, &vq->used_elems[0], 0, true); > -ndescs += vq->used_elems[0].ndescs; > > vq->inuse -= ndescs; > vq->used_idx += ndescs; The patch itself seems correct to me. > -- > 2.27.0
Re: [PATCH v2] input-linux: Add option to not grab a device upon guest startup
> Missed one little thing: the doc comment needs a (since 9.1). Thanks for the input, I just submitted a v3 with the missing "(since 9.1)": https://patchew.org/QEMU/20240403055002.890760-1-justinien.bou...@gmail.com/ Regards, Justinien Bouron
[PATCH v3] input-linux: Add option to not grab a device upon guest startup
Depending on your use-case, it might be inconvenient to have qemu grab the input device from the host immediately upon starting the guest. Added a new bool option to input-linux: grab-on-startup. If true, the device is grabbed as soon as the guest is started, otherwise it is not grabbed until the toggle combination is entered. To avoid breaking existing setups, the default value of grab-on-startup is true, i.e. same behaviour as before this change. Signed-off-by: Justinien Bouron --- Changes since v2: - Added missing (since 9.1) to the new option in qapi/qom.json qapi/qom.json| 14 +- ui/input-linux.c | 20 +++- 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/qapi/qom.json b/qapi/qom.json index 85e6b4f84a..2067e41991 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -508,13 +508,25 @@ # @grab-toggle: the key or key combination that toggles device grab # (default: ctrl-ctrl) # +# @grab-on-startup: if true, grab the device immediately upon starting +# the guest. Otherwise, don't grab the device until the +# combination is entered. This does not influence other devices +# even if grab_all is true, i.e. in the unlikely scenario where +# device1 has grab_all=true + grab-on-startup=true and device2 has +# grab-on-startup=false, only device1 is grabbed on startup, then, +# once the grab combination is entered, grabbing is toggled off +# for both devices (because device1 enforces the grab_all +# property) until the combination is entered again at which point +# both devices will be grabbed. (default: true) (since 9.1). +# # Since: 2.6 ## { 'struct': 'InputLinuxProperties', 'data': { 'evdev': 'str', '*grab_all': 'bool', '*repeat': 'bool', -'*grab-toggle': 'GrabToggleKeys' } } +'*grab-toggle': 'GrabToggleKeys', +'*grab-on-startup': 'bool'} } ## # @EventLoopBaseProperties: diff --git a/ui/input-linux.c b/ui/input-linux.c index e572a2e905..68b5c6d485 100644 --- a/ui/input-linux.c +++ b/ui/input-linux.c @@ -44,6 +44,7 @@ struct InputLinux { boolgrab_request; boolgrab_active; boolgrab_all; +boolgrab_on_startup; boolkeydown[KEY_CNT]; int keycount; int wheel; @@ -400,7 +401,7 @@ static void input_linux_complete(UserCreatable *uc, Error **errp) if (il->keycount) { /* delay grab until all keys are released */ il->grab_request = true; -} else { +} else if (il->grab_on_startup) { input_linux_toggle_grab(il); } QTAILQ_INSERT_TAIL(&inputs, il, next); @@ -491,6 +492,19 @@ static void input_linux_set_grab_toggle(Object *obj, int value, il->grab_toggle = value; } +static bool input_linux_get_grab_on_startup(Object *obj, Error **errp) +{ +InputLinux *il = INPUT_LINUX(obj); +return il->grab_on_startup; +} + +static void input_linux_set_grab_on_startup(Object *obj, bool value, +Error **errp) +{ +InputLinux *il = INPUT_LINUX(obj); +il->grab_on_startup = value; +} + static void input_linux_instance_init(Object *obj) { } @@ -498,6 +512,7 @@ static void input_linux_instance_init(Object *obj) static void input_linux_class_init(ObjectClass *oc, void *data) { UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); +ObjectProperty *grab_on_startup_prop; ucc->complete = input_linux_complete; @@ -514,6 +529,9 @@ static void input_linux_class_init(ObjectClass *oc, void *data) &GrabToggleKeys_lookup, input_linux_get_grab_toggle, input_linux_set_grab_toggle); +grab_on_startup_prop = object_class_property_add_bool(oc, "grab-on-startup", +input_linux_get_grab_on_startup, input_linux_set_grab_on_startup); +object_property_set_default_bool(grab_on_startup_prop, true); } static const TypeInfo input_linux_info = { -- 2.44.0
Re: [PATCH v2] input-linux: Add option to not grab a device upon guest startup
Justinien Bouron writes: > Just a ping to make sure this patch hasn't been lost in the noise. > The relevant patchew page is > https://patchew.org/QEMU/20240322034311.2980970-1-justinien.bou...@gmail.com/. > > Any chance to get this merged before the next release? Since it was posted after the soft freeze, none. https://wiki.qemu.org/Planning/9.0 You need to target 9.1, a bit over approximately four months from now.
Re: [PATCH v2] input-linux: Add option to not grab a device upon guest startup
Markus Armbruster writes: > Justinien Bouron writes: > >> Depending on your use-case, it might be inconvenient to have qemu grab >> the input device from the host immediately upon starting the guest. >> >> Added a new bool option to input-linux: grab-on-startup. If true, the >> device is grabbed as soon as the guest is started, otherwise it is not >> grabbed until the toggle combination is entered. To avoid breaking >> existing setups, the default value of grab-on-startup is true, i.e. same >> behaviour as before this change. >> >> Signed-off-by: Justinien Bouron > > QAPI schema > Acked-by: Markus Armbruster Missed one little thing: the doc comment needs a (since 9.1).
Re: [PATCH] virtio-net: fix qemu set used ring flag even vhost started
On 4/3/2024 1:10 PM, Michael Tokarev wrote: External email: Use caution opening links or attachments 02.04.2024 07:51, Yajun Wu: When vhost-user or vhost-kernel is handling virtio net datapath, qemu should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by qemu in following code path: ... This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. This seems to be qemu-stable material, even if the case when the issue happens is "very rare", -- is it not? If you mean this fix needs back port to previous version? Yes. Thanks, /mjt Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..8035e01fdf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; +if (n->vhost_started) { +return; +} + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return;
Re: TCG change broke MorphOS boot on sam460ex
On Tue Apr 2, 2024 at 9:32 PM AEST, BALATON Zoltan wrote: > On Thu, 21 Mar 2024, BALATON Zoltan wrote: > > On 27/2/24 17:47, BALATON Zoltan wrote: > >> Hello, > >> > >> Commit 18a536f1f8 (accel/tcg: Always require can_do_io) broke booting > >> MorphOS on sam460ex (this was before 8.2.0 and I thought I've verified it > >> before that release but apparently missed it back then). It can be > >> reproduced with https://www.morphos-team.net/morphos-3.18.iso and > >> following > >> command: > >> > >> qemu-system-ppc -M sam460ex -serial stdio -d unimp,guest_errors \ > >> -drive if=none,id=cd,format=raw,file=morphos-3.18.iso \ > >> -device ide-cd,drive=cd,bus=ide.1 > > Any idea on this one? While MorphOS boots on other machines and other OSes > seem to boot on this machine it may still suggest there's some problem > somewhere as this worked before. So it may worth investigating it to make > sure there's no bug that could affect other OSes too even if they boot. I > don't know how to debug this so some help would be needed. In the bad case it crashes after running this TB: IN: 0x00c01354: 38c00040 li r6, 0x40 0x00c01358: 38e10204 addi r7, r1, 0x204 0x00c0135c: 39010104 addi r8, r1, 0x104 0x00c01360: 39410004 addi r10, r1, 4 0x00c01364: 3920 li r9, 0 0x00c01368: 7cc903a6 mtctrr6 0x00c0136c: 84c70004 lwzu r6, 4(r7) 0x00c01370: 7cc907a4 tlbwehi r6, r9 0x00c01374: 84c80004 lwzu r6, 4(r8) 0x00c01378: 7cc90fa4 tlbwelo r6, r9 0x00c0137c: 84ca0004 lwzu r6, 4(r10) 0x00c01380: 7cc917a4 tlbwehi r6, r9 0x00c01384: 39290001 addi r9, r9, 1 0x00c01388: 4200ffe4 bdnz 0xc0136c IN: 0x00c01374: unable to read memory "unable to read memory" is the tracer, it does actually translate the address, but it points to a wayward real address which returns 0 to TCG, which is an invalid instruction. The good case instead doesn't exit the TB after 0x00c01370 but after the complete loop at the bdnz. That look like this after the same first TB: IN: 0x00c0136c: 84c70004 lwzu r6, 4(r7) 0x00c01370: 7cc907a4 tlbwehi r6, r9 0x00c01374: 84c80004 lwzu r6, 4(r8) 0x00c01378: 7cc90fa4 tlbwelo r6, r9 0x00c0137c: 84ca0004 lwzu r6, 4(r10) 0x00c01380: 7cc917a4 tlbwehi r6, r9 0x00c01384: 39290001 addi r9, r9, 1 0x00c01388: 4200ffe4 bdnz 0xc0136c IN: 0x00c0138c: 4c00012c isync All the tlbwe are executed in the same TB. MMU tracing shows the first tlbwehi creates a new valid(!) TLB for 0x-0x1 that has a garbage RPN because the tlbwelo did not run yet. What's happening in the bad case is that the translator breaks and "re-fetches" instructions in the middle of that sequence, and that's where the bogus translation causes 0 to be returned. The good case the whole block is executed in the same fetch which creates correct translations. So it looks like a morphos bug, the can-do-io change just happens to cause it to re-fetch in that place, but that could happen for a number of reasons, so you can't rely on TLB *only* changing or ifetch *only* re-fetching at a sync point like isync. I would expect code like this to write an invalid entry with tlbwehi, then tlbwelo to set the correct RPN, then make the entry valid with the second tlbwehi. It would probably fix the bug if you just did the first tlbwehi with r6=0 (or at least without the 0x200 bit set). Thanks, Nick
Re: [PATCH] Fix incorrect disassembly format for certain RISC-V instructions
02.04.2024 15:47, Simeon Krastnikov: [Content-Type: text/html] Your patch is html-damaged. There's probably a way to extract the text/plain version of it but it's better if you re-send it without html. Not saying anything about the changes though. Thanks, /mjt
Re: [PATCH] virtio-net: fix qemu set used ring flag even vhost started
02.04.2024 07:51, Yajun Wu: When vhost-user or vhost-kernel is handling virtio net datapath, qemu should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by qemu in following code path: ... This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. This seems to be qemu-stable material, even if the case when the issue happens is "very rare", -- is it not? Thanks, /mjt Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..8035e01fdf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; +if (n->vhost_started) { +return; +} + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return;
Re: [PATCH 0/2] P11 support for QEMU
On Mon Apr 1, 2024 at 3:55 PM AEST, Aditya Gupta wrote: > This patch series adds support for Power11 pseries and powernv machine targets > to emulate VMs running on Power11. > > Most of the P11 support code has been taken from P10 code in QEMU. > And has been tested in pseries, powernv, with and without compat mode. Thanks for this. I wonder if we could try to get rid of some of the code / structure duplication for creating a new machine. I don't want to add a bunch of CPP generator macros or squash too much code together with lots of flags, but maybe there's something we can do. Since this is a very small change from P10, it might be a good time to work out some refactoring. Even a hw/ppc/pnv_powernv10.c and hw/ppc/pnv_powernv11.c, and target/ppc/cpu_init_power10.c and cpu_init_power11.c might be an improvement because you could easily diff them (hopefully we could do better than that, but just a thought). Thanks, Nick > > Git Tree for Testing: https://github.com/adi-g15-ibm/qemu/tree/p11 > > Aditya Gupta (2): > ppc: pseries: add P11 cpu type > ppc: powernv11: add base support for P11 PowerNV
Re: [PATCH 2/2] CXL/cxl_type3: reset DVSEC CXL Control in ct3d_reset
On 02/04/2024 17:17, Jonathan Cameron wrote: > On Tue, 2 Apr 2024 09:46:47 +0800 > Li Zhijian wrote: > >> After the kernel commit >> 0cab68720598 ("cxl/pci: Fix disabling memory if DVSEC CXL Range does not >> match a CFMWS window") > > Fixes tag seems appropriate. > >> CXL type3 devices cannot be enabled again after the reboot because this >> flag was not reset. >> >> This flag could be changed by the firmware or OS, let it have a >> reset(default) value in reboot so that the OS can read its clean status. > > Good find. I think we should aim for a fix that is less fragile to future > code rearrangement etc. > >> >> Signed-off-by: Li Zhijian >> --- >> hw/mem/cxl_type3.c | 14 +- >> 1 file changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c >> index ad2fe7d463fb..3fe136053390 100644 >> --- a/hw/mem/cxl_type3.c >> +++ b/hw/mem/cxl_type3.c >> @@ -305,7 +305,8 @@ static void build_dvsecs(CXLType3Dev *ct3d) >> >> dvsec = (uint8_t *)&(CXLDVSECDevice){ >> .cap = 0x1e, >> -.ctrl = 0x2, >> +#define CT3D_DEVSEC_CXL_CTRL 0x2 >> +.ctrl = CT3D_DEVSEC_CXL_CTRL, > Naming doesn't make it clear the define is a reset value / default value>> >.status2 = 0x2, >> .range1_size_hi = range1_size_hi, >> .range1_size_lo = range1_size_lo, >> @@ -906,6 +907,16 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr >> host_addr, uint64_t data, >> return address_space_write(as, dpa_offset, attrs, &data, size); >> } >> >> +/* Reset DVSEC CXL Control */ >> +static void ct3d_dvsec_cxl_ctrl_reset(CXLType3Dev *ct3d) >> +{ >> +uint16_t offset = first_dvsec_offset(ct3d); > > This relies to much on the current memory layout. We should doing a search > of config space to find the right entry, Of course, this option is reliable and portable. My thought was that build_dvsecs() and the _reset() are static(internal used), their callers should have the responsibility to keep the configure space/DVSECS layout consistent. So I'm wondering if is it too heavy to have a *new* _find() subroutine for it. Another option could be rebuild the all the DVSECs simply after updated the *offset*, just like: void reset_devses() { cxl->dvsec_offset = OFFSET_TO_FIRST_DVSEC; build_dvsecs(); } It's reasonable because we ought to ensure *all* the DVSECs being reset in next boot. Let me know your thought. Thanks Zhijian > or we should cache a pointer to > the relevant structure when we fill it in the first time. > >> +CXLDVSECDevice *dvsec; >> + >> +dvsec = (CXLDVSECDevice *)(ct3d->cxl_cstate.pdev->config + offset); >> +dvsec->ctrl = CT3D_DEVSEC_CXL_CTRL; >> +} >> + >> static void ct3d_reset(DeviceState *dev) >> { >> CXLType3Dev *ct3d = CXL_TYPE3(dev); >> @@ -914,6 +925,7 @@ static void ct3d_reset(DeviceState *dev) >> >> cxl_component_register_init_common(reg_state, write_msk, >> CXL2_TYPE3_DEVICE); >> cxl_device_register_init_t3(ct3d); >> +ct3d_dvsec_cxl_ctrl_reset(ct3d); >> >> /* >>* Bring up an endpoint to target with MCTP over VDM. >
Re: [PATCH v11 18/23] hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()
On 2024/4/3 0:12, Peter Maydell wrote: > On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: >> >> Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for >> ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit. >> >> If FEAT_GICv3_NMI is supported, ich_ap_write() should consider >> ICV_AP1R_EL1.NMI >> bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit >> should be set or clear according to the Non-maskable property. And the RPR >> priority should also update the NMI bit according to the APR priority NMI >> bit. >> >> By the way, add gicv3_icv_nmiar1_read trace event. >> >> If the hpp irq is a NMI, the icv iar read should return 1022 and trap for >> NMI again >> >> Signed-off-by: Jinjie Ruan >> Reviewed-by: Richard Henderson >> --- >> v11: >> - Deal with NMI in the callers instead of ich_highest_active_virt_prio(). >> - Set either NMI or a group-priority bit, not both. >> - Only set AP NMI bits in the 0 reg. >> - Handle NMI in hppvi_index(), icv_hppi_can_preempt() and icv_eoir_write(). >> v10: >> - Rename ICH_AP1R_EL2_NMI to ICV_AP1R_EL1_NMI. >> - Add ICV_RPR_EL1_NMI definition. >> - Set ICV_RPR_EL1.NMI according to the ICV_AP1R_EL1.NMI in >> ich_highest_active_virt_prio(). >> v9: >> - Correct the INTID_NMI logic. >> v8: >> - Fix an unexpected interrupt bug when sending VNMI by running qemu VM. >> v7: >> - Add Reviewed-by. >> v6: >> - Implement icv_nmiar1_read(). >> --- >> hw/intc/arm_gicv3_cpuif.c | 97 ++- >> hw/intc/gicv3_internal.h | 4 ++ >> hw/intc/trace-events | 1 + >> 3 files changed, 91 insertions(+), 11 deletions(-) >> >> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c >> index f99f2570a6..a7bc44b30c 100644 >> --- a/hw/intc/arm_gicv3_cpuif.c >> +++ b/hw/intc/arm_gicv3_cpuif.c >> @@ -157,6 +157,10 @@ static int ich_highest_active_virt_prio(GICv3CPUState >> *cs) >> int i; >> int aprmax = ich_num_aprs(cs); >> >> +if (cs->gic->nmi_support && cs->ich_apr[GICV3_G1NS][0] & >> ICV_AP1R_EL1_NMI) { >> +return 0x80; > > This should be 0, not 0x80 -- see the register description for > ICH_LR_EL2 Priority field: when NMI is 1, the virtual interrupt's > priority is treated as 0x0. > >> +} >> + >> for (i = 0; i < aprmax; i++) { >> uint32_t apr = cs->ich_apr[GICV3_G0][i] | >> cs->ich_apr[GICV3_G1NS][i]; >> @@ -191,6 +195,7 @@ static int hppvi_index(GICv3CPUState *cs) >> * correct behaviour. >> */ >> int prio = 0xff; >> +bool nmi = false; >> >> if (!(cs->ich_vmcr_el2 & (ICH_VMCR_EL2_VENG0 | ICH_VMCR_EL2_VENG1))) { >> /* Both groups disabled, definitely nothing to do */ >> @@ -200,6 +205,11 @@ static int hppvi_index(GICv3CPUState *cs) >> for (i = 0; i < cs->num_list_regs; i++) { >> uint64_t lr = cs->ich_lr_el2[i]; >> int thisprio; >> +bool thisnmi = false; >> + >> +if (cs->gic->nmi_support) { >> +thisnmi = lr & ICH_LR_EL2_NMI; >> +} > > You could write this a little more concisely as > bool thisnmi = cs->gic_nmi_support && (lr & ICH_LR_EL2_NMI); > >> if (ich_lr_state(lr) != ICH_LR_EL2_STATE_PENDING) { >> /* Not Pending */ >> @@ -219,9 +229,13 @@ static int hppvi_index(GICv3CPUState *cs) >> >> thisprio = ich_lr_prio(lr); >> >> -if (thisprio < prio) { >> +if ((thisprio < prio) || ((thisprio == prio) && (thisnmi & >> (!nmi { >> prio = thisprio; >> idx = i; >> + >> +if (cs->gic->nmi_support) { >> +nmi = thisnmi; >> +} > > You don't need to check nmi_support here because if nmi_support > is false then thisnmi will also be false, and so we will never > set nmi to anything other than false. > >> } >> } >> >> @@ -326,6 +340,12 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, >> uint64_t lr) >> return true; >> } >> >> +if ((prio & mask) == (rprio & mask) && >> +cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI) && >> +(!(cs->ich_apr[GICV3_G1NS][0] & ICV_AP1R_EL1_NMI))) { >> +return true; >> +} >> + >> return false; >> } > > This isn't the only change needed to icv_hppi_can_preempt(): > virtual NMIs are never masked by the MPR, so that check also > must be changed. If we pull out a variable: > > bool is_nmi = cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI); > > we can use that both to gate the vpmr check: > > if (!is_nmi && prio >= vpmr) { > > and also in the priority check you have here. > >> @@ -736,13 +765,19 @@ static void icv_activate_irq(GICv3CPUState *cs, int >> idx, int grp) >> */ >> uint32_t mask = icv_gprio_mask(cs, grp); >> int prio = ich_lr_prio(cs->ich_lr_el2[idx]) & mask; >> +bool nmi = cs->ich_lr_el2[idx] & ICH_LR_EL2_NMI; >> int aprbit = prio >> (8 - cs->vprebits); >> int regno = aprbit / 32; >> int regbit = aprbit % 32; >>
Re: [PATCH v11 18/23] hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()
On 2024/4/3 0:12, Peter Maydell wrote: > On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: >> >> Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for >> ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit. >> >> If FEAT_GICv3_NMI is supported, ich_ap_write() should consider >> ICV_AP1R_EL1.NMI >> bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit >> should be set or clear according to the Non-maskable property. And the RPR >> priority should also update the NMI bit according to the APR priority NMI >> bit. >> >> By the way, add gicv3_icv_nmiar1_read trace event. >> >> If the hpp irq is a NMI, the icv iar read should return 1022 and trap for >> NMI again >> >> Signed-off-by: Jinjie Ruan >> Reviewed-by: Richard Henderson >> --- >> v11: >> - Deal with NMI in the callers instead of ich_highest_active_virt_prio(). >> - Set either NMI or a group-priority bit, not both. >> - Only set AP NMI bits in the 0 reg. >> - Handle NMI in hppvi_index(), icv_hppi_can_preempt() and icv_eoir_write(). >> v10: >> - Rename ICH_AP1R_EL2_NMI to ICV_AP1R_EL1_NMI. >> - Add ICV_RPR_EL1_NMI definition. >> - Set ICV_RPR_EL1.NMI according to the ICV_AP1R_EL1.NMI in >> ich_highest_active_virt_prio(). >> v9: >> - Correct the INTID_NMI logic. >> v8: >> - Fix an unexpected interrupt bug when sending VNMI by running qemu VM. >> v7: >> - Add Reviewed-by. >> v6: >> - Implement icv_nmiar1_read(). >> --- >> hw/intc/arm_gicv3_cpuif.c | 97 ++- >> hw/intc/gicv3_internal.h | 4 ++ >> hw/intc/trace-events | 1 + >> 3 files changed, 91 insertions(+), 11 deletions(-) >> >> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c >> index f99f2570a6..a7bc44b30c 100644 >> --- a/hw/intc/arm_gicv3_cpuif.c >> +++ b/hw/intc/arm_gicv3_cpuif.c >> @@ -157,6 +157,10 @@ static int ich_highest_active_virt_prio(GICv3CPUState >> *cs) >> int i; >> int aprmax = ich_num_aprs(cs); >> >> +if (cs->gic->nmi_support && cs->ich_apr[GICV3_G1NS][0] & >> ICV_AP1R_EL1_NMI) { >> +return 0x80; > > This should be 0, not 0x80 -- see the register description for > ICH_LR_EL2 Priority field: when NMI is 1, the virtual interrupt's > priority is treated as 0x0. > >> +} >> + >> for (i = 0; i < aprmax; i++) { >> uint32_t apr = cs->ich_apr[GICV3_G0][i] | >> cs->ich_apr[GICV3_G1NS][i]; >> @@ -191,6 +195,7 @@ static int hppvi_index(GICv3CPUState *cs) >> * correct behaviour. >> */ >> int prio = 0xff; >> +bool nmi = false; >> >> if (!(cs->ich_vmcr_el2 & (ICH_VMCR_EL2_VENG0 | ICH_VMCR_EL2_VENG1))) { >> /* Both groups disabled, definitely nothing to do */ >> @@ -200,6 +205,11 @@ static int hppvi_index(GICv3CPUState *cs) >> for (i = 0; i < cs->num_list_regs; i++) { >> uint64_t lr = cs->ich_lr_el2[i]; >> int thisprio; >> +bool thisnmi = false; >> + >> +if (cs->gic->nmi_support) { >> +thisnmi = lr & ICH_LR_EL2_NMI; >> +} > > You could write this a little more concisely as > bool thisnmi = cs->gic_nmi_support && (lr & ICH_LR_EL2_NMI); If the following if check continues, the operations are unnecessary, so I think it's more appropriate to put it as thisprio operations do it. > >> if (ich_lr_state(lr) != ICH_LR_EL2_STATE_PENDING) { >> /* Not Pending */ >> @@ -219,9 +229,13 @@ static int hppvi_index(GICv3CPUState *cs) >> >> thisprio = ich_lr_prio(lr); >> >> -if (thisprio < prio) { >> +if ((thisprio < prio) || ((thisprio == prio) && (thisnmi & >> (!nmi { >> prio = thisprio; >> idx = i; >> + >> +if (cs->gic->nmi_support) { >> +nmi = thisnmi; >> +} > > You don't need to check nmi_support here because if nmi_support > is false then thisnmi will also be false, and so we will never > set nmi to anything other than false. > >> } >> } >> >> @@ -326,6 +340,12 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, >> uint64_t lr) >> return true; >> } >> >> +if ((prio & mask) == (rprio & mask) && >> +cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI) && >> +(!(cs->ich_apr[GICV3_G1NS][0] & ICV_AP1R_EL1_NMI))) { >> +return true; >> +} >> + >> return false; >> } > > This isn't the only change needed to icv_hppi_can_preempt(): > virtual NMIs are never masked by the MPR, so that check also > must be changed. If we pull out a variable: > > bool is_nmi = cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI); > > we can use that both to gate the vpmr check: > > if (!is_nmi && prio >= vpmr) { > > and also in the priority check you have here. > >> @@ -736,13 +765,19 @@ static void icv_activate_irq(GICv3CPUState *cs, int >> idx, int grp) >> */ >> uint32_t mask = icv_gprio_mask(cs, grp); >> int prio = ich_lr_prio(cs->ich_lr_el2[idx]) & mask; >> +bool nmi = cs->ich_lr_e
Re: [PATCH] hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabled
On 4/2/2024 10:31 PM, Michael S. Tsirkin wrote: On Tue, Apr 02, 2024 at 09:18:44PM +0800, Xiaoyao Li wrote: On 4/2/2024 6:02 PM, Michael S. Tsirkin wrote: On Tue, Apr 02, 2024 at 04:25:16AM -0400, Xiaoyao Li wrote: Set MADT.FLAGS[bit 0].PCAT_COMPAT based on x86ms->pic. Signed-off-by: Xiaoyao Li Please include more info in the commit log: what is the behaviour you observe, why it is wrong, how does the patch fix it, what is guest behaviour before and after. Sorry, I thought it was straightforward. A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When PIC is not enabled for x86 machine, the PCAT_COMPAT bit needs to be cleared. Otherwise, the guest thinks there is a present PIC even it is booted with pic=off on QEMU. (I haven't seen real issue from Linux guest. The user of PIC inside guest seems only the pit calibration. Whether pit calibration is triggered depends on other things. But logically, current code is wrong, we need to fix it anyway. @Isaku, please share more info if you have) + Kirill, It seems to have issue with legacy irqs with PCAT_COMPAT set 1 while no PIC on QEMU side. Kirill, could you elaborate it? That's sufficient, thanks! Pls put this in commit log and resubmit. The commit log and the subject should not repeat what the diff already states. --- hw/i386/acpi-common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c index 20f19269da40..0cc2919bb851 100644 --- a/hw/i386/acpi-common.c +++ b/hw/i386/acpi-common.c @@ -107,7 +107,9 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker, acpi_table_begin(&table, table_data); /* Local APIC Address */ build_append_int_noprefix(table_data, APIC_DEFAULT_ADDRESS, 4); -build_append_int_noprefix(table_data, 1 /* PCAT_COMPAT */, 4); /* Flags */ +/* Flags. bit 0: PCAT_COMPAT */ +build_append_int_noprefix(table_data, + x86ms->pic != ON_OFF_AUTO_OFF ? 1 : 0 , 4); for (i = 0; i < apic_ids->len; i++) { pc_madt_cpu_entry(i, apic_ids, table_data, false); -- 2.34.1
Re: [PATCH v2] linux-user/syscall: xtensa: fix ipc_perm conversion
On 3/29/24 12:33, Max Filippov wrote: target_ipc_perm::mode and target_ipc_perm::__seq fields are 32-bit wide on xtensa and thus need to use tswap32. The issue is spotted by the libc-test http://nsz.repo.hu/git/?p=libc-test on big-endian xtensa core. Cc: qemu-sta...@nongnu.org Fixes: a3da8be5126b ("target/xtensa: linux-user: fix sysv IPC structures") Signed-off-by: Max Filippov --- Changes v1->v2: - split into a separate patch linux-user/syscall.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index e384e1424890..d9bfd31c1cad 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -3758,12 +3758,13 @@ static inline abi_long target_to_host_ipc_perm(struct ipc_perm *host_ip, host_ip->gid = tswap32(target_ip->gid); host_ip->cuid = tswap32(target_ip->cuid); host_ip->cgid = tswap32(target_ip->cgid); -#if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_PPC) +#if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_PPC) || \ +defined(TARGET_XTENSA) host_ip->mode = tswap32(target_ip->mode); #else host_ip->mode = tswap16(target_ip->mode); #endif Oof. I think we should rewrite these along the lines of __get_user_e, using __builtin_choose_expr and sizeof to choose the correct routine. r~
Re: [RFC PATCH 00/12] SMMUv3 nested translation support
Hi Mostafa, On Mon, Mar 25, 2024 at 10:13:56AM +, Mostafa Saleh wrote: > > Currently, QEMU supports emulating either stage-1 or stage-2 SMMUs > but not nested instances. > This patch series adds support for nested translation in SMMUv3, > this is controlled by property “arm-smmuv3.stage=nested”, and > advertised to guests as (IDR0.S1P == 1 && IDR0.S2P == 2) IIUIC, with this series, vSMMU will support a virtualized 2-stage translation in a guest VM, right? I wonder how it would interact with the ongoing 2-stage nesting support with host and guest. Or is it supposed to be just a total orthogonal feature without any interaction with the host system? Thanks Nicolin > Main changes(architecture): > > 1) CDs are considered IPA and translated with stage-2. > 2) TTBx and tables for stage-1 are considered IPA and translated >with stage-2. > 3) Translate the IPA address with stage-2. > > TLBs: > == > TLBs are the most tricky part. > > 1) General design >Unified(Combined) design is used, where a new tag is added "stage" >which has 2 valid values: >- STAGE_1: Meaning this entry translates VA to PADDR, it can be > cached from fully nested configuration or from stage-1 only. > It doesn't support separate cached entries (VA to IPA). > >- STAGE_2: Meaning this translates IPA to PADDR, cached from > stage-2 only configuration. > >TLBs are also modified to cache 2 permissions, a new permission added >"parent_perm." > >For non-nested configuration, perm == parent_perm and nothing >changes. This is used to know which stage to use in case there is >a permission fault from a TLB entry. > > 2) Caching in TLB >Stage-1 and stage-2 are inserted in the TLB as is. >For nested translation, both entries are combined into one TLB >entry. Everything is used from stage-1, except: >- transatled_addr from stage-2. >- parent_perm is from stage-2. >- addr_mask: is the minimum of both. > > 3) TLB Lookup >For stage-1 and nested translations, it look for STAGE_1 entries. >For stage-2 it look for STAGE_2 TLB entries. > > 4) TLB invalidation >- Stage-1 commands (CMD_TLBI_NH_VAA, SMMU_CMD_TLBI_NH_VA, > SMMU_CMD_TLBI_NH_ALL): Invalidate TLBs tagged with SMMU_STAGE_1. >- Stage-2 commands (CMD_TLBI_S2_IPA): Invalidate TLBs tagged with > SMMU_STAGE_2. >- All (SMMU_CMD_TLBI_S12_VMALL): Will invalidate both, this is > communicated to the TLB as SMMU_NESTED which is (SMMU_STAGE_1 | > SMMU_STAGE_2) which uses it as a mask. > >As far as I understand, this is compliant with the ARM >architecture, based on: >- ARM ARM DDI 0487J.a: RLGSCG, RTVTYQ, RGNJPZ >- ARM IHI 0070F.b: 16.2 Caching > >An alternative approach would be to instantiate 2 TLBs, one per >each stage. I haven’t investigated that. > > Others > === > - Advertise SMMUv3.2-S2FWB, it is NOP for QEMU as it doesn’t support > attributes. > > - OAS: A typical setup with nesting is to share CPU stage-2 with the > SMMU, and according to the user manual, SMMU OAS must match the > system physical address. > > This was discussed before in > https://lore.kernel.org/all/20230226220650.1480786-11-smost...@google.com/ > The implementation here, follows the discussion, where migration is > added and oas is set up from the board (virt). However, the OAS is > chosen based on the CPU PARANGE as there is no fixed one. > > - For nested configuration, IOVA notifier only notifies for stage-1 > invalidations (as far as I understand this is the intended > behaviour as it notifies for IOVA) > > - Stop ignoring VMID for stage-1 if stage-2 is also supported. > > > Future improvements: > = > 1) One small improvement, that I don’t think it’s worth the extra >complexity, is in case of Stage-1 TLB miss for nested translation, >we can do stage-1 walk and lookup for stage-2 TLBs, instead of >doing the full walk. > > 2) Patch 0006 (hw/arm/smmuv3: Translate CD and TT using stage-2 table) >introduces a macro to use functions that rely on cfg for stage-2, >I don’t like it. However, I didn’t find a simple way around it, >either we change many functions to have a separate stage argument, >or add another arg in config, which is probably more code. > > Testing > > 1) IOMMUFD + VFIO >Kernel: > https://lore.kernel.org/all/cover.1683688960.git.nicol...@nvidia.com/ >VMM: > https://qemu-devel.nongnu.narkive.com/o815DqpI/rfc-v5-0-8-arm-smmuv3-emulation-support > >By assigning > “virtio-net-pci,netdev=net0,disable-legacy=on,iommu_platform=on,ats=on”, >to a guest VM (on top of QEMU guest) with VIFO and IOMMUFD. > > 2) Work in progress prototype I am hacking on for nesting on KVM >(this is nowhere near complete, and misses many stuff but it >doesn't require VMs/VFIO) also with virtio-net-pci and git >cloning a bunch of stuff and also obs
Re: [PATCH 2/3] target/hppa: mask offset bits in gva
On 4/2/24 08:29, Sven Schnelle wrote: Richard Henderson writes: On 4/1/24 20:01, Sven Schnelle wrote: Implement dr2 and the mfdiag/mtdiag instructions. dr2 contains a bit which enables/disables space id hashing. Seabios would then set this bit when booting. Linux would disable it again during boot (this would be the same like on real hardware), while HP-UX would leave it enabled. This is how it works on real physical hardware, and since qemu should mimic real hardware as best as it can, IMHO we should implement it exactly like this. Helge Pointer to documentation? There's no documentation about that in the public. There's this code since the beginning of linux on hppa in the linux kernel (arch/parisc/kernel/pacache.S): /* Disable Space Register Hashing for PCXL */ .word 0x141c0600 /* mfdiag %dr0, %r28 */ depwi 0,28,2, %r28 /* Clear DHASH_EN & IHASH_EN */ .word 0x141c0240 /* mtdiag %r28, %dr0 */ b,n srdis_done srdis_pa20: /* Disable Space Register Hashing for PCXU,PCXU+,PCXW,PCXW+,PCXW2 */ .word 0x144008bc/* mfdiag %dr2, %r28 */ depdi 0, 54,1, %r28 /* clear DIAG_SPHASH_ENAB (bit 54) */ .word 0x145c1840/* mtdiag %r28, %dr2 */ So PCXL (32 bit) uses dr0, while 64 bit uses dr2. This still is the same on my C8000 - i see firmware still contains code reading dr2 to figure out whether space id hashing is enabled. The mfdiag/mtdiag instructions are described in the PCXL/PCXL2 ERS. https://parisc.wiki.kernel.org/index.php/File:PCXL_ers.pdf https://parisc.wiki.kernel.org/index.php/File:Pcxl2_ers.pdf There was a discussion mentioning disabling Space ID hashing in Linux: https://yhbt.net/lore/linux-parisc/199912161642.iaa11...@lucy.cup.hp.com/
[ANNOUNCE] QEMU 9.0.0-rc2 is now available
Hello, On behalf of the QEMU Team, I'd like to announce the availability of the third release candidate for the QEMU 9.0 release. This release is meant for testing purposes and should not be used in a production environment. http://download.qemu.org/qemu-9.0.0-rc2.tar.xz http://download.qemu.org/qemu-9.0.0-rc2.tar.xz.sig http://download.qemu.org/qemu-9.0.0-rc2.tar.bz2 http://download.qemu.org/qemu-9.0.0-rc2.tar.bz2.sig You can help improve the quality of the QEMU 9.0 release by testing this release and reporting bugs using our GitLab issue tracker: https://gitlab.com/qemu-project/qemu/-/milestones/11 The release plan, as well a documented known issues for release candidates, are available at: http://wiki.qemu.org/Planning/9.0 Please add entries to the ChangeLog for the 9.0 release below: http://wiki.qemu.org/ChangeLog/9.0 Thank you to everyone involved! Changes since rc1: e5c6528dce: Update version for v9.0.0-rc2 release (Peter Maydell) 4c54f5bc8e: hw/net/virtio-net: fix qemu set used ring flag even vhost started (Yajun Wu) 95a3645527: hw/xen_evtchn: Initialize flush_kvm_routes (Artem Chernyshev) 0fa5eefa16: gpio/pca955x: Update maintainer email address (Glenn Miles) 8cdb368d19: hw/nvme: fix -Werror=maybe-uninitialized (Marc-André Lureau) c65288de4d: plugins: fix -Werror=maybe-uninitialized false-positive (Marc-André Lureau) e193d4bdb8: block: Remove unnecessary NULL check in bdrv_pad_request() (Kevin Wolf) aab1b3eeb4: hw/i386/pc: Restrict CXL to PCI-based machines (Philippe Mathieu-Daudé) 3325af5355: MAINTAINERS: Fix error-report.c entry (Zhao Liu) 4fbb7687cf: qtest/libqos: Reduce size_to_prdtl() declaration scope (Philippe Mathieu-Daudé) d6fd5d8346: accel/hvf: Un-inline hvf_arch_supports_guest_debug() (Philippe Mathieu-Daudé) 0b796f3810: hw/arm/smmu: Avoid using inlined functions with external linkage again (Philippe Mathieu-Daudé) 870120b467: target/ppc: Rename init_excp_4xx_softmmu() -> init_excp_4xx() (Philippe Mathieu-Daudé) 0eaf7fb9a8: gdbstub/system: Rename 'user_ctx' argument as 'ctx' (Philippe Mathieu-Daudé) 25f34eb708: gdbstub: Correct invalid mentions of 'softmmu' by 'system' (Philippe Mathieu-Daudé) 93019696aa: accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition (Philippe Mathieu-Daudé) 7805132bc3: hmp: Add help information for watchdog action: inject-nmi (Dayu Liu) f6822fee96: Fix some typos in documentation (found by codespell) (Stefan Weil) 393770d7a0: raspi4b: Reduce RAM to 1Gb on 32-bit hosts (Cédric Le Goater) 27c335a464: tests/qtest: Fix STM32L4x5 GPIO test on 32-bit (Cédric Le Goater) 44e25fbc19: hw/intc/arm_gicv3: ICC_HPPIR* return SPURIOUS if int group is disabled (Peter Maydell) e12055: docs: sbsa: update specs, add dt note (Marcin Juszkiewicz) fbe5ac5671: target/arm: take HSTR traps of cp15 accesses to EL2, not EL1 (Peter Maydell) 9988c7b50e: fpu/softfloat: Remove mention of TILE-Gx target (Philippe Mathieu-Daudé) 8e0cd23f71: usb-audio: Fix invalid values in AudioControl descriptors (Joonas Kankaala) 1d2f2b35bc: gitlab-ci/cirrus: switch from 'master' to 'latest' (Michael Tokarev) d0ad271a76: migration/postcopy: Ensure postcopy_start() sets errp if it fails (Avihai Horon) 30158d8850: migration: Set migration error in migration_completion() (Avihai Horon) b07a5bb736: tests/avocado: ppc_hv_tests.py set alpine time before setup-alpine (Nicholas Piggin) 74eb04af18: tests/avocado: Fix ppc_hv_tests.py xorriso dependency guard (Nicholas Piggin) 434531619f: target/ppc: Do not clear MSR[ME] on MCE interrupts to supervisor (Nicholas Piggin) ed399ade3c: target/ppc: Fix GDB register indexing on secondary CPUs (Benjamin Gray) 978897a572: target/ppc: Restore [H]DEXCR to 64-bits (Benjamin Gray) d7d9c6071e: target/ppc/mmu-radix64: Use correct string format in walk_tree() (Philippe Mathieu-Daudé) beb0b62c3e: hw/ppc/spapr: Include missing 'sysemu/tcg.h' header (Philippe Mathieu-Daudé) 58cb91b34d: spapr: nested: use bitwise NOT operator for flags check (Harsh Prateek Bora) dafa0ecc97: accel/tcg: Use CPUState.get_pc in cpu_io_recompile (Richard Henderson) 13af3af196: disas: Show opcodes for target_disas and monitor_disas (Richard Henderson) 2911e9b95f: tcg/optimize: Fix sign_mask for logical right-shift (Richard Henderson) 4a3aa11e1f: target/hppa: Clear psw_n for BE on use_nullify_skip path (Richard Henderson) 3bdf20819e: target/hppa: Add diag instructions to set/restore shadow registers (Helge Deller) 381931275a: target/hppa: Move diag argument handling to decodetree (Richard Henderson) 558c09bef8: target/hppa: Generate getshadowregs inline (Richard Henderson) d9b33018a0: Revert "tap: setting error appropriately when calling net_init_tap_one()" (Akihiko Odaki) decfde6b0e: tap-win32: Remove unnecessary stubs (Akihiko Odaki) 89a8de364b: hw/net/net_tx_pkt: Fix virtio header without checksum offloading (Akihiko Odaki) ba6bb2ec95: ebpf: Fix indirections table setting (Akihiko Odaki) 1c188fc8cb: virtio-net: Fix vhost virtqueue notifiers for R
Re: [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN
On Tue, Apr 02, 2024 at 07:20:01AM +, Wang, Wei W wrote: > > IIUC we still need that pre_7_2 stuff and keep the postponed connect() to > > make sure the ordering is done properly. Wei, could you elaborate the patch > > you mentioned? Maybe I missed some spots. > > Sure. I saw your patch (5655aab079) ensures that the preempt channel to be > created when PONG is received from the destination, which indicates that > the main channel has been ready on the destination side. > > Why was it decided to postpone the preempt channel's connect() for newer QEMU > only? The old QEMU (before 7.2) still performs preempt channel connect() in > migrate_fd_connect(), doesn't it have the disordering issue? Yes it has, but the solution actually was not compatible with older binaries, so migration was broken back then.. See: https://lore.kernel.org/qemu-devel/20230326172540.2571110-1-pet...@redhat.com/ My memory was that we didn't have a good solution to fix it without breaking the old protocol, so it can only be fixed on newer binaries. Thanks, -- Peter Xu
Re: [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN
On Tue, Apr 02, 2024 at 05:28:36PM +0800, Wang, Lei wrote: > On 4/2/2024 15:25, Wang, Wei W wrote:> On Tuesday, April 2, 2024 2:56 PM, > Wang, > Lei4 wrote: > >> On 4/2/2024 0:13, Peter Xu wrote:> On Fri, Mar 29, 2024 at 08:54:07AM > >> +, > >> Wang, Wei W wrote: > On Friday, March 29, 2024 11:32 AM, Wang, Lei4 wrote: > > When using the post-copy preemption feature to perform post-copy > > live migration, the below scenario could lead to a deadlock and the > > migration will never finish: > > > > - Source connect() the preemption channel in postcopy_start(). > > - Source and the destination side TCP stack finished the 3-way > > handshake > >thus the connection is successful. > > - The destination side main thread is handling the loading of the bulk > >> RAM > >pages thus it doesn't start to handle the pending connection event > > in the > >event loop. and doesn't post the semaphore > >> postcopy_qemufile_dst_done for > >the preemption thread. > > - The source side sends non-iterative device states, such as the virtio > >states. > > - The destination main thread starts to receive the virtio states, this > >process may lead to a page fault (e.g., > > virtio_load()->vring_avail_idx() > >may trigger a page fault since the avail ring page may not be > > received > >yet). > >>> > >>> Ouch. Yeah I think this part got overlooked when working on the > >>> preempt channel. > >>> > > - The page request is sent back to the source side. Source sends the > > page > >content to the destination side preemption thread. > > - Since the event is not arrived and the semaphore > >postcopy_qemufile_dst_done is not posted, the preemption thread in > >destination side is blocked, and cannot handle receiving the page. > > - The QEMU main load thread on the destination side is stuck at the > > page > >fault, and cannot yield and handle the connect() event for the > >preemption channel to unblock the preemption thread. > > - The postcopy will stuck there forever since this is a deadlock. > > > > The key point to reproduce this bug is that the source side is > > sending pages at a rate faster than the destination handling, > > otherwise, the qemu_get_be64() in > > ram_load_precopy() will have a chance to yield since at that time > > there are no pending data in the buffer to get. This will make this > > bug harder to be reproduced. > >>> > >>> How hard would this reproduce? > >> > >> We can manually make this easier to reproduce by adding the following code > >> to make the destination busier to load the pages: > >> > >> diff --git a/migration/ram.c b/migration/ram.c index 0ad9fbba48..0b42877e1f > >> 100644 > >> --- a/migration/ram.c > >> +++ b/migration/ram.c > >> @@ -4232,6 +4232,7 @@ static int ram_load_precopy(QEMUFile *f) { > >> MigrationIncomingState *mis = migration_incoming_get_current(); > >> int flags = 0, ret = 0, invalid_flags = 0, len = 0, i = 0; > >> +volatile unsigned long long a; > >> > >> if (!migrate_compress()) { > >> invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE; @@ -4347,6 > >> +4348,7 @@ static int ram_load_precopy(QEMUFile *f) > >> break; > >> > >> case RAM_SAVE_FLAG_PAGE: > >> +for (a = 0; a < 1; a++); > >> qemu_get_buffer(f, host, TARGET_PAGE_SIZE); > >> break; > >> > > > > Which version of QEMU are you using? > > I tried with the latest upstream QEMU (e.g. v8.2.0 release, 1600b9f46b1bd), > > it's > > always reproducible without any changes (with local migration tests). > > I'm using the latest tip: > > 6af9d12c88b9720f209912f6e4b01fefe5906d59 > > and it cannot be reproduced without the modification. It does look like there's some mistery in reproducability. I can't reproduce it locally with v8.2.0 tag, neither can I reproduce it with the code above. I've requested our QE team to involve, but before that I may need help on any verifications. > > > > > > >>> > >>> I'm thinking whether this should be 9.0 material or 9.1. It's pretty > >>> late for 9.0 though, but we can still discuss. > >>> > > > > Fix this by yielding the load coroutine when receiving > > MIG_CMD_POSTCOPY_LISTEN so the main event loop can handle the > > connection event before loading the non-iterative devices state to > > avoid the deadlock condition. > > > > Signed-off-by: Lei Wang > > This seems to be a regression issue caused by this commit: > 737840e2c6ea (migration: Use the number of transferred bytes > directly) > > Adding qemu_fflush back to migration_rate_exceeded() or > ram_save_iterate seems to work (might not be a good fix though). > > > --- > > migration/savevm.c | 5 + > > 1 file changed, 5 insertions(
Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
On Mon, Apr 01, 2024 at 11:26:25PM +0200, Yu Zhang wrote: > Hello Peter und Zhjian, > > Thank you so much for letting me know about this. I'm also a bit surprised at > the plan for deprecating the RDMA migration subsystem. It's not too late, since it looks like we do have users not yet notified from this, we'll redo the deprecation procedure even if it'll be the final plan, and it'll be 2 releases after this. > > > IMHO it's more important to know whether there are still users and whether > > they would still like to see it around. > > > I admit RDMA migration was lack of testing(unit/CI test), which led to the > > a few > > obvious bugs being noticed too late. > > Yes, we are a user of this subsystem. I was unaware of the lack of test > coverage > for this part. As soon as 8.2 was released, I saw that many of the > migration test > cases failed and came to realize that there might be a bug between 8.1 > and 8.2, but > was unable to confirm and report it quickly to you. > > The maintenance of this part could be too costly or difficult from > your point of view. It may or may not be too costly, it's just that we need real users of RDMA taking some care of it. Having it broken easily for >1 releases definitely is a sign of lack of users. It is an implication to the community that we should consider dropping some features so that we can get the best use of the community resources for the things that may have a broader audience. One thing majorly missing is a RDMA tester to guard all the merges to not break RDMA paths, hopefully in CI. That should not rely on RDMA hardwares but just to sanity check the migration+rdma code running all fine. RDMA taught us the lesson so we're requesting CI coverage for all other new features that will be merged at least for migration subsystem, so that we plan to not merge anything that is not covered by CI unless extremely necessary in the future. For sure CI is not the only missing part, but I'd say we should start with it, then someone should also take care of the code even if only in maintenance mode (no new feature to add on top). > > My concern is, this plan will forces a few QEMU users (not sure how > many) like us > either to stick to the RDMA migration by using an increasingly older > version of QEMU, > or to abandon the currently used RDMA migration. RDMA doesn't get new features anyway, if there's specific use case for RDMA migrations, would it work if such scenario uses the old binary? Is it possible to switch to the TCP protocol with some good NICs? Per our best knowledge, RDMA users are rare, and please let anyone know if you are aware of such users. IIUC the major reason why RDMA stopped being the trend is because the network is not like ten years ago; I don't think I have good knowledge in RDMA at all nor network, but my understanding is it's pretty easy to fetch modern NIC to outperform RDMAs, then it may make little sense to maintain multiple protocols, considering RDMA migration code is so special so that it has the most custom code comparing to other protocols. Thanks, -- Peter Xu
Re: [PATCH 13/19] hw/virtio-blk: fix -Werror=maybe-uninitialized false-positive
On Thu, Mar 28, 2024 at 02:20:46PM +0400, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > ../hw/block/virtio-blk.c:1212:12: error: ‘rq’ may be used uninitialized > [-Werror=maybe-uninitialized] > > Signed-off-by: Marc-André Lureau > --- > hw/block/virtio-blk.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Stefan Hajnoczi signature.asc Description: PGP signature
Re: [PATCH 01/19] util/coroutine: fix -Werror=maybe-uninitialized false-positive
On Thu, Mar 28, 2024 at 02:20:34PM +0400, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > ../util/qemu-coroutine.c:150:8: error: ‘batch’ may be used uninitialized > [-Werror=maybe-uninitialized] > > Signed-off-by: Marc-André Lureau > --- > util/qemu-coroutine.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Stefan Hajnoczi signature.asc Description: PGP signature
Re: [PATCH v2 2/4] mirror: allow specifying working bitmap
On 07.03.24 16:47, Fiona Ebner wrote: From: John Snow for the mirror job. The bitmap's granularity is used as the job's granularity. The new @bitmap parameter is marked unstable in the QAPI and can currently only be used for @sync=full mode. Clusters initially dirty in the bitmap as well as new writes are copied to the target. Using block-dirty-bitmap-clear and block-dirty-bitmap-merge API, callers can simulate the three kinds of @BitmapSyncMode (which is used by backup): 1. always: default, just pass bitmap as working bitmap. 2. never: copy bitmap and pass copy to the mirror job. 3. on-success: copy bitmap and pass copy to the mirror job and if successful, merge bitmap into original afterwards. When the target image is a fresh "diff image", i.e. one that was not used as the target of a previous mirror and the target image's cluster size is larger than the bitmap's granularity, or when @copy-mode=write-blocking is used, there is a pitfall, because the cluster in the target image will be allocated, but not contain all the data corresponding to the same region in the source image. An idea to avoid the limitation would be to mark clusters which are affected by unaligned writes and are not allocated in the target image dirty, so they would be copied fully later. However, for migration, the invariant that an actively synced mirror stays actively synced (unless an error happens) is useful, because without that invariant, migration might inactivate block devices when mirror still got work to do and run into an assertion failure [0]. Another approach would be to read the missing data from the source upon unaligned writes to be able to write the full target cluster instead. But certain targets like NBD do not allow querying the cluster size. To avoid limiting/breaking the use case of syncing to an existing target, which is arguably more common than the diff image use case, document the limiation in QAPI. This patch was originally based on one by Ma Haocong, but it has since been modified pretty heavily, first by John and then again by Fiona. [0]: https://lore.kernel.org/qemu-devel/1db7f571-cb7f-c293-04cc-cd856e060...@proxmox.com/ Suggested-by: Ma Haocong Signed-off-by: Ma Haocong Signed-off-by: John Snow [FG: switch to bdrv_dirty_bitmap_merge_internal] Signed-off-by: Fabian Grünbichler Signed-off-by: Thomas Lamprecht [FE: rebase for 9.0 get rid of bitmap mode parameter use caller-provided bitmap as working bitmap turn bitmap parameter experimental] Signed-off-by: Fiona Ebner --- block/mirror.c | 95 -- blockdev.c | 39 +-- include/block/block_int-global-state.h | 5 +- qapi/block-core.json | 37 +- tests/unit/test-block-iothread.c | 2 +- 5 files changed, 146 insertions(+), 32 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index 1609354db3..5c9a00b574 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -51,7 +51,7 @@ typedef struct MirrorBlockJob { BlockDriverState *to_replace; /* Used to block operations on the drive-mirror-replace target */ Error *replace_blocker; -bool is_none_mode; +MirrorSyncMode sync_mode; Could you please split this change to separate preparation patch? BlockMirrorBackingMode backing_mode; /* Whether the target image requires explicit zero-initialization */ bool zero_target; @@ -73,6 +73,11 @@ typedef struct MirrorBlockJob { size_t buf_size; int64_t bdev_length; unsigned long *cow_bitmap; +/* + * Whether the bitmap is created locally or provided by the caller (for + * incremental sync). + */ +bool dirty_bitmap_is_local; BdrvDirtyBitmap *dirty_bitmap; BdrvDirtyBitmapIter *dbi; [..] +if (bitmap_name) { +if (sync != MIRROR_SYNC_MODE_FULL) { +error_setg(errp, "Sync mode '%s' not supported with bitmap.", + MirrorSyncMode_str(sync)); +return; +} +if (granularity) { +error_setg(errp, "Granularity and bitmap cannot both be set"); +return; +} + +bitmap = bdrv_find_dirty_bitmap(bs, bitmap_name); +if (!bitmap) { +error_setg(errp, "Dirty bitmap '%s' not found", bitmap_name); +return; +} + +if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) { Why allow read-only bitmaps? +return; +} +} + if (!bdrv_backing_chain_next(bs) && sync == MIRROR_SYNC_MODE_TOP) { sync = MIRROR_SYNC_MODE_FULL; } @@ -2889,10 +2913,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs, [..] +# @unstable: Member @bitmap is experimental. +# # Since: 1.3 ## { 'struct': 'DriveMirror', 'data': { '*job-id': 'str', 'device': 'str', 'target': 'str', '*format': 'str', '*node-name': 'str',
Re: [PATCH 06/19] block/stream: fix -Werror=maybe-uninitialized false-positives
On 02.04.24 18:34, Eric Blake wrote: On Tue, Apr 02, 2024 at 12:58:43PM +0300, Vladimir Sementsov-Ogievskiy wrote: Again, same false-positives, because of WITH_GRAPH_RDLOCK_GUARD().. Didn't you try to change WITH_ macros somehow, so that compiler believe in our good intentions? #define WITH_QEMU_LOCK_GUARD_(x, var) \ for (g_autoptr(QemuLockable) var = \ qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))); \ var; \ qemu_lockable_auto_unlock(var), var = NULL) I can't think of a clever way to rewrite this. The compiler probably thinks the loop may not run, due to the "var" condition. But how to convince it otherwise? it's hard to introduce another variable too.. hmm. maybe like this? #define WITH_QEMU_LOCK_GUARD_(x, var) \ for (g_autoptr(QemuLockable) var = \ qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))), \ var2 = (void *)(true); \ var2; \ qemu_lockable_auto_unlock(var), var2 = NULL) probably, it would be simpler for compiler to understand the logic this way. Could you check? Wouldn't that attach __attribute__((cleanup(xxx))) to var2, at which point we could cause the compiler to call xxx((void*)(true)) if the user does an early return inside the lock guard, with disastrous consequences? Or is the __attribute__ applied only to the first out of two declarations in a list? Oh, most probably you are right, seems g_autoptr apply it to both variables. Also, we don't need qemu_lockable_auto_unlock(var) separate call, if we zero-out another variable. So, me fixing: #define WITH_QEMU_LOCK_GUARD_(x, var) \ for (QemuLockable *var __attribute__((cleanup(qemu_lockable_auto_unlock))) = qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))), \ *var2 = (void *)(true); \ var2; \ var2 = NULL) (and we'll need to modify qemu_lockable_auto_unlock() to take "QemuLockable **x" argument) -- Best regards, Vladimir
Re: [RFC PATCH 03/12] hw/arm/smmu: Add stage to TLB
Hi Eric, On Tue, Apr 02, 2024 at 07:15:20PM +0200, Eric Auger wrote: > Hi Mostafa, > > On 3/25/24 11:13, Mostafa Saleh wrote: > > TLBs for nesting will be extended to be combined, a new index is added > > "stage", with 2 valid values: > > - SMMU_STAGE_1: Meaning this translates VA to PADDR, this entry can > >be cached from fully nested configuration or from stage-1 only. > >We don't support separate cached entries (VA to IPA) > > > > - SMMU_STAGE_2: Meaning this translates IPA to PADDR, cached from > >stage-2 only configuration. > > > > For TLB invalidation: > > - by VA: Invalidate TLBs tagged with SMMU_STAGE_1 > > - by IPA: Invalidate TLBs tagged with SMMU_STAGE_2 > > - All: Will invalidate both, this is communicated to the TLB as > >SMMU_NESTED which is (SMMU_STAGE_1 | SMMU_STAGE_2) which uses > >it as a mask. > > I don't really get why you need this extra stage field in the key. Why > aren't the asid and vmid tags enough? > Looking again, I think we can do it with ASID and VMID only, but that requires some rework in the invalidation path. With nested SMMUs, we can cache entries from: - Stage-1 (or nested): Tagged with VMID and ASID - Stage-2: Tagged with VMID only (ASID = -1) That should be enough for caching/lookup, but for invalidation, we should be able to invalidate IPAs which are cached from stage-2. At the moment, we represent ASIDs with < 0 as a wildcard for invalidation or stage-2 and they were mutually exclusive. An example is: - CMD_TLBI_NH_VAA: Invalidate stage-1 for a VMID, all ASIDs (we use ASID = -1) - CMD_TLBI_NH_VA: Invalidate stage-1 for a VMID, an ASID ( > 0) - CMD_TLBI_S2_IPA: Invalidate stage-2 for a VMID (we use ASID = -1) We need to distinguish between case 1) and 3) otherwise we over invalidate. Similarly, CMD_TLBI_NH_ALL(invalidate all stage-1 by VMID) and CMD_TLBI_S12_VMALL(invalidate both stages by VMID). I guess we can add variants of these functions that operate on ASIDs (>= 0) or (< 0) which is basically stage-1 or stage-2. Another case I can think of which is not implemented in QEMU is global entries, where we would like to look up entries for all ASIDs (-1), but that’s not a problem for now. I don’t have a strong opinion, I can try to do it this way. Thanks, Mostafa > Eric > > > > This briefly described in the user manual (ARM IHI 0070 F.b) in > > "16.2.1 Caching combined structures". > > > > Signed-off-by: Mostafa Saleh > > --- > > hw/arm/smmu-common.c | 27 +-- > > hw/arm/smmu-internal.h | 2 ++ > > hw/arm/smmuv3.c | 5 +++-- > > hw/arm/trace-events | 3 ++- > > include/hw/arm/smmu-common.h | 8 ++-- > > 5 files changed, 30 insertions(+), 15 deletions(-) > > > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > > index 20630eb670..677dcf9a13 100644 > > --- a/hw/arm/smmu-common.c > > +++ b/hw/arm/smmu-common.c > > @@ -38,7 +38,7 @@ static guint smmu_iotlb_key_hash(gconstpointer v) > > > > /* Jenkins hash */ > > a = b = c = JHASH_INITVAL + sizeof(*key); > > -a += key->asid + key->vmid + key->level + key->tg; > > +a += key->asid + key->vmid + key->level + key->tg + key->stage; > > b += extract64(key->iova, 0, 32); > > c += extract64(key->iova, 32, 32); > > > > @@ -54,14 +54,14 @@ static gboolean smmu_iotlb_key_equal(gconstpointer v1, > > gconstpointer v2) > > > > return (k1->asid == k2->asid) && (k1->iova == k2->iova) && > > (k1->level == k2->level) && (k1->tg == k2->tg) && > > - (k1->vmid == k2->vmid); > > + (k1->vmid == k2->vmid) && (k1->stage == k2->stage); > > } > > > > SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint16_t vmid, uint64_t > > iova, > > -uint8_t tg, uint8_t level) > > +uint8_t tg, uint8_t level, SMMUStage stage) > > { > > SMMUIOTLBKey key = {.asid = asid, .vmid = vmid, .iova = iova, > > -.tg = tg, .level = level}; > > +.tg = tg, .level = level, .stage = stage}; > > > > return key; > > } > > @@ -81,7 +81,8 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, > > SMMUTransCfg *cfg, > > SMMUIOTLBKey key; > > > > key = smmu_get_iotlb_key(cfg->asid, cfg->s2cfg.vmid, > > - iova & ~mask, tg, level); > > + iova & ~mask, tg, level, > > + SMMU_STAGE_TO_TLB_TAG(cfg->stage)); > > entry = g_hash_table_lookup(bs->iotlb, &key); > > if (entry) { > > break; > > @@ -109,15 +110,16 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg > > *cfg, SMMUTLBEntry *new) > > { > > SMMUIOTLBKey *key = g_new0(SMMUIOTLBKey, 1); > > uint8_t tg = (new->granule - 10) / 2; > > +SMMUStage stage_tag = SMMU_STAGE_TO_TLB_TAG(cfg->stage); > > > > if (g_hash_table_size(bs->iotlb) >= SMMU_IOTLB_MAX_SIZE) { > >
Re: [PATCH] tests/qtest: Standardize qtest function caller strings.
ping ! On 27/03/24 4:18 pm, Het Gala wrote: On 27/03/24 2:37 am, Fabiano Rosas wrote: Het Gala writes: Some comments, mostly just thinking out loud... For --> migrate // //O:/... For --> validate ///O:/O:/ /O:/O:/... Do we need an optional 'capability' element? I'm not sure how practical is to leave that as 'others', because that puts it at the end of the string. We'd want the element that's more important/with more variants to be towards the start of the string so we can run all tests of the same kind with the -r option. While also looking at different functions for figuring out the transport and invocation, my observation was that, there might be many capabilities added to the same test, while it might not be important also. Ex: /migrate/multifd/tcp/plain 1. multifd is defined as a migration mode. 2. It is also a capability, and comes in 2 parts [multifd, multifd-channels] though one is a capability and another is parameter Similarly in other examples of compression, there are many capabilities and parameters added, but it might be not important to mention that ? Secondly, there are multiple migration capabilities IIRC (> 15). And a test requiring multiple capabilities, the overall string would be too long, and not that important also to mention all capabilities. Just thinking out of mind - Can we have selective list of capabilities ? 1. multifd 2. compress (again, there might be confusion with multifd compression methods like zstd, zlib and just 'compress') 3. zero-page (This will have sub capabilities ?) test-type:: migrate | validate We could alternatively drop migration|migrate|validate. They are kind of superfluous. I agree with the above comment. 'migrate' and 'validate' have a different set of variables required, some necessary, while other optional. IMO this will help is in streamlining the design further. migration-mode a. migrate --> :: precopy | postcopy | multifd b. validate -->:: (what to validate) methods :: preempt | recovery | reboot | suspend | simple I want some inputs here. 1. is there a better variable name rather than 'methods' 2. 'simple' does not fit perfect here IMO. transport:: tcp | fd | unix | file invocation :: uri | channels | both CompressionType :: zlib | zstd | none s/none/nocomp/ ? We're already familiar with that. Ack. Will change that. encryptionType :: tls | plain s/plain/notls/ ? What if there is another encryption technique in future ? Or maybe we simply omit the noop options. It would make the string way shorter in most cases. This might be a better approach. Can have some keys/variables as optional while some necessary. For ex: for 'migrate' - transport and invocation might be necessary while it might not be necessary for 'validate' qtests validate-test-result :: success | failure others :: other comments/capability that needs to be addressed. Can be multiple (more than one applicable, separated by using '-' in between) O: optional Signed-off-by: Het Gala Suggested-by: Fabiano Rosas --- tests/qtest/migration-test.c | 143 ++- 1 file changed, 72 insertions(+), 71 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index bd9f4b9dbb..bf4d000b76 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c Regards, Het Gala Regards, Het Gala
Re: [RFC PATCH 03/12] hw/arm/smmu: Add stage to TLB
Hi Mostafa, On 3/25/24 11:13, Mostafa Saleh wrote: > TLBs for nesting will be extended to be combined, a new index is added > "stage", with 2 valid values: > - SMMU_STAGE_1: Meaning this translates VA to PADDR, this entry can >be cached from fully nested configuration or from stage-1 only. >We don't support separate cached entries (VA to IPA) > > - SMMU_STAGE_2: Meaning this translates IPA to PADDR, cached from >stage-2 only configuration. > > For TLB invalidation: > - by VA: Invalidate TLBs tagged with SMMU_STAGE_1 > - by IPA: Invalidate TLBs tagged with SMMU_STAGE_2 > - All: Will invalidate both, this is communicated to the TLB as >SMMU_NESTED which is (SMMU_STAGE_1 | SMMU_STAGE_2) which uses >it as a mask. I don't really get why you need this extra stage field in the key. Why aren't the asid and vmid tags enough? Eric > > This briefly described in the user manual (ARM IHI 0070 F.b) in > "16.2.1 Caching combined structures". > > Signed-off-by: Mostafa Saleh > --- > hw/arm/smmu-common.c | 27 +-- > hw/arm/smmu-internal.h | 2 ++ > hw/arm/smmuv3.c | 5 +++-- > hw/arm/trace-events | 3 ++- > include/hw/arm/smmu-common.h | 8 ++-- > 5 files changed, 30 insertions(+), 15 deletions(-) > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > index 20630eb670..677dcf9a13 100644 > --- a/hw/arm/smmu-common.c > +++ b/hw/arm/smmu-common.c > @@ -38,7 +38,7 @@ static guint smmu_iotlb_key_hash(gconstpointer v) > > /* Jenkins hash */ > a = b = c = JHASH_INITVAL + sizeof(*key); > -a += key->asid + key->vmid + key->level + key->tg; > +a += key->asid + key->vmid + key->level + key->tg + key->stage; > b += extract64(key->iova, 0, 32); > c += extract64(key->iova, 32, 32); > > @@ -54,14 +54,14 @@ static gboolean smmu_iotlb_key_equal(gconstpointer v1, > gconstpointer v2) > > return (k1->asid == k2->asid) && (k1->iova == k2->iova) && > (k1->level == k2->level) && (k1->tg == k2->tg) && > - (k1->vmid == k2->vmid); > + (k1->vmid == k2->vmid) && (k1->stage == k2->stage); > } > > SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint16_t vmid, uint64_t iova, > -uint8_t tg, uint8_t level) > +uint8_t tg, uint8_t level, SMMUStage stage) > { > SMMUIOTLBKey key = {.asid = asid, .vmid = vmid, .iova = iova, > -.tg = tg, .level = level}; > +.tg = tg, .level = level, .stage = stage}; > > return key; > } > @@ -81,7 +81,8 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg > *cfg, > SMMUIOTLBKey key; > > key = smmu_get_iotlb_key(cfg->asid, cfg->s2cfg.vmid, > - iova & ~mask, tg, level); > + iova & ~mask, tg, level, > + SMMU_STAGE_TO_TLB_TAG(cfg->stage)); > entry = g_hash_table_lookup(bs->iotlb, &key); > if (entry) { > break; > @@ -109,15 +110,16 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg > *cfg, SMMUTLBEntry *new) > { > SMMUIOTLBKey *key = g_new0(SMMUIOTLBKey, 1); > uint8_t tg = (new->granule - 10) / 2; > +SMMUStage stage_tag = SMMU_STAGE_TO_TLB_TAG(cfg->stage); > > if (g_hash_table_size(bs->iotlb) >= SMMU_IOTLB_MAX_SIZE) { > smmu_iotlb_inv_all(bs); > } > > *key = smmu_get_iotlb_key(cfg->asid, cfg->s2cfg.vmid, new->entry.iova, > - tg, new->level); > + tg, new->level, stage_tag); > trace_smmu_iotlb_insert(cfg->asid, cfg->s2cfg.vmid, new->entry.iova, > -tg, new->level); > +tg, new->level, stage_tag); > g_hash_table_insert(bs->iotlb, key, new); > } > > @@ -159,18 +161,22 @@ static gboolean > smmu_hash_remove_by_asid_vmid_iova(gpointer key, gpointer value, > if (info->vmid >= 0 && info->vmid != SMMU_IOTLB_VMID(iotlb_key)) { > return false; > } > +if (!(info->stage & SMMU_IOTLB_STAGE(iotlb_key))) { > +return false; > +} > return ((info->iova & ~entry->addr_mask) == entry->iova) || > ((entry->iova & ~info->mask) == info->iova); > } > > void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova, > - uint8_t tg, uint64_t num_pages, uint8_t ttl) > + uint8_t tg, uint64_t num_pages, uint8_t ttl, > + SMMUStage stage) > { > /* if tg is not set we use 4KB range invalidation */ > uint8_t granule = tg ? tg * 2 + 10 : 12; > > if (ttl && (num_pages == 1) && (asid >= 0)) { > -SMMUIOTLBKey key = smmu_get_iotlb_key(asid, vmid, iova, tg, ttl); > +SMMUIOTLBKey key = smmu_get_iotlb_key(asid, vmid, iova, tg, ttl, > stage); > > if (g_hash_table_remove
Re: [PULL 00/15] Misc HW patches for 2024-04-02
On Tue, 2 Apr 2024 at 15:25, Philippe Mathieu-Daudé wrote: > > The following changes since commit 7fcf7575f3d201fc84ae168017ffdfd6c86257a6: > > Merge tag 'pull-target-arm-20240402' of > https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-04-02 > 11:34:49 +0100) > > are available in the Git repository at: > > https://github.com/philmd/qemu.git tags/hw-misc-20240402 > > for you to fetch changes up to 4c54f5bc8e1d38f15cc35b6a6932d8fbe219c692: > > hw/net/virtio-net: fix qemu set used ring flag even vhost started > (2024-04-02 16:15:07 +0200) > > > Misc HW patch queue > > - MAINTAINERS updates (Zhao, Glenn) > - Replace incorrect mentions of 'softmmu' by 'system' (Phil) > - Avoid using inlined functions with external linkage (Phil) > - Restrict CXL to x86 PC PCI-based machines (Phil) > - Remove unnecessary NULL check in bdrv_pad_request (Kevin) > - Fix a pair of -Werror=maybe-uninitialized (Marc-André) > - Initialize variable in xen_evtchn_soft_reset (Artem) > - Do not access virtio-net tx queue until vhost is started (Yajun) Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes. -- PMM
Re: [RFC PATCH 02/12] hw/arm/smmu: Split smmuv3_translate()
Hi Mostafa, On 3/25/24 11:13, Mostafa Saleh wrote: > smmuv3_translate() does everything from STE/CD parsing to TLB lookup > and PTW. > > Soon, when nesting is supported, stage-1 data (tt, CD) needs to be > translated using stage-2. > > Split smmuv3_translate() to 3 functions: > > - smmu_translate(): in smmu-common.c, which does the TLB lookup, PTW, > TLB insertion, all the functions are already there, this just puts > them together. > This also simplifies the code as it consolidates event generation > in case of TLB lookup permission failure or in TT selection. > > - smmuv3_do_translate(): in smmuv3.c, Calls smmu_translate() and does > the event population in case of errors. > > - smmuv3_translate(), now calls smmuv3_do_translate() for >translation while the rest is the same. > > Also, add stage in trace_smmuv3_translate_success() > > Signed-off-by: Mostafa Saleh > --- > hw/arm/smmu-common.c | 59 > hw/arm/smmuv3.c | 175 +-- > hw/arm/trace-events | 2 +- > include/hw/arm/smmu-common.h | 5 + > 4 files changed, 130 insertions(+), 111 deletions(-) > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > index 3a7c350aca..20630eb670 100644 > --- a/hw/arm/smmu-common.c > +++ b/hw/arm/smmu-common.c > @@ -554,6 +554,65 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, > IOMMUAccessFlags perm, > g_assert_not_reached(); > } > > +SMMUTLBEntry *smmu_translate(SMMUState *bs, SMMUTransCfg *cfg, dma_addr_t > addr, > + IOMMUAccessFlags flag, SMMUPTWEventInfo *info) > +{ > +uint64_t page_mask, aligned_addr; > +SMMUTLBEntry *cached_entry = NULL; > +SMMUTransTableInfo *tt; > +int status; > + > +/* > + * Combined attributes used for TLB lookup, as only one stage is > supported, > + * it will hold attributes based on the enabled stage. > + */ > +SMMUTransTableInfo tt_combined; > + > +if (cfg->stage == SMMU_STAGE_1) { > +/* Select stage1 translation table. */ > +tt = select_tt(cfg, addr); > +if (!tt) { > +info->type = SMMU_PTW_ERR_TRANSLATION; > +info->stage = SMMU_STAGE_1; > +return NULL; > +} > +tt_combined.granule_sz = tt->granule_sz; > +tt_combined.tsz = tt->tsz; > + > +} else { > +/* Stage2. */ > +tt_combined.granule_sz = cfg->s2cfg.granule_sz; > +tt_combined.tsz = cfg->s2cfg.tsz; > +} > + > +/* > + * TLB lookup looks for granule and input size for a translation stage, > + * as only one stage is supported right now, choose the right values > + * from the configuration. > + */ > +page_mask = (1ULL << tt_combined.granule_sz) - 1; > +aligned_addr = addr & ~page_mask; > + > +cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, aligned_addr); > +if (cached_entry) { > +if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) { > +info->type = SMMU_PTW_ERR_PERMISSION; > +info->stage = cfg->stage; > +return NULL; > +} > +return cached_entry; > +} > + > +cached_entry = g_new0(SMMUTLBEntry, 1); > +status = smmu_ptw(cfg, aligned_addr, flag, cached_entry, info); > +if (status) { > +g_free(cached_entry); > +return NULL; > +} > +smmu_iotlb_insert(bs, cfg, cached_entry); > +return cached_entry; > +} > + > /** > * The bus number is used for lookup when SID based invalidation occurs. > * In that case we lazily populate the SMMUPciBus array from the bus hash > diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c > index 50e5a72d54..f081ff0cc4 100644 > --- a/hw/arm/smmuv3.c > +++ b/hw/arm/smmuv3.c > @@ -827,6 +827,67 @@ static void smmuv3_flush_config(SMMUDevice *sdev) > g_hash_table_remove(bc->configs, sdev); > } > > +/* Do translation with TLB lookup. */ > +static SMMUTranslationStatus smmuv3_do_translate(SMMUv3State *s, hwaddr addr, > + SMMUTransCfg *cfg, > + SMMUEventInfo *event, > + IOMMUAccessFlags flag, > + SMMUTLBEntry **out_entry) > +{ > +SMMUPTWEventInfo ptw_info = {}; > +SMMUState *bs = ARM_SMMU(s); > +SMMUTLBEntry *cached_entry = NULL; > + > +cached_entry = smmu_translate(bs, cfg, addr, flag, &ptw_info); > +if (!cached_entry) { > +/* All faults from PTW has S2 field. */ > +event->u.f_walk_eabt.s2 = (ptw_info.stage == SMMU_STAGE_2); > +switch (ptw_info.type) { > +case SMMU_PTW_ERR_WALK_EABT: > +event->type = SMMU_EVT_F_WALK_EABT; > +event->u.f_walk_eabt.addr = addr; > +event->u.f_walk_eabt.rnw = flag & 0x1; > +event->u.f_walk_eabt.class = 0x1; > +event->u.f_walk_eabt.addr
Re: [RFC PATCH 01/12] hw/arm/smmu: Use enum for SMMU stage
Hi Mostafa, On 3/25/24 11:13, Mostafa Saleh wrote: > Currently, translation stage is represented as an int, where 1 is stage-1 and > 2 is stage-2, when nested is added, 3 would be confusing to represent nesting, > so we use an enum instead. > > While keeping the same values, this is useful for: > - Doing tricks with bit masks, where BIT(0) is stage-1 and BIT(1) is >stage-2 and both is nested. > - Tracing, as stage is printed as int. > > Signed-off-by: Mostafa Saleh Reviewed-by: Eric Auger Eric > --- > hw/arm/smmu-common.c | 14 +++--- > hw/arm/smmuv3.c | 15 --- > include/hw/arm/smmu-common.h | 11 +-- > 3 files changed, 24 insertions(+), 16 deletions(-) > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > index 4caedb4998..3a7c350aca 100644 > --- a/hw/arm/smmu-common.c > +++ b/hw/arm/smmu-common.c > @@ -304,7 +304,7 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, >SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) > { > dma_addr_t baseaddr, indexmask; > -int stage = cfg->stage; > +SMMUStage stage = cfg->stage; > SMMUTransTableInfo *tt = select_tt(cfg, iova); > uint8_t level, granule_sz, inputsize, stride; > > @@ -392,7 +392,7 @@ static int smmu_ptw_64_s1(SMMUTransCfg *cfg, > info->type = SMMU_PTW_ERR_TRANSLATION; > > error: > -info->stage = 1; > +info->stage = SMMU_STAGE_1; > tlbe->entry.perm = IOMMU_NONE; > return -EINVAL; > } > @@ -415,7 +415,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, >dma_addr_t ipa, IOMMUAccessFlags perm, >SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) > { > -const int stage = 2; > +const SMMUStage stage = SMMU_STAGE_2; > int granule_sz = cfg->s2cfg.granule_sz; > /* ARM DDI0487I.a: Table D8-7. */ > int inputsize = 64 - cfg->s2cfg.tsz; > @@ -513,7 +513,7 @@ static int smmu_ptw_64_s2(SMMUTransCfg *cfg, > info->type = SMMU_PTW_ERR_TRANSLATION; > > error: > -info->stage = 2; > +info->stage = SMMU_STAGE_2; > tlbe->entry.perm = IOMMU_NONE; > return -EINVAL; > } > @@ -532,9 +532,9 @@ error: > int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, > SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info) > { > -if (cfg->stage == 1) { > +if (cfg->stage == SMMU_STAGE_1) { > return smmu_ptw_64_s1(cfg, iova, perm, tlbe, info); > -} else if (cfg->stage == 2) { > +} else if (cfg->stage == SMMU_STAGE_2) { > /* > * If bypassing stage 1(or unimplemented), the input address is > passed > * directly to stage 2 as IPA. If the input address of a transaction > @@ -543,7 +543,7 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, > IOMMUAccessFlags perm, > */ > if (iova >= (1ULL << cfg->oas)) { > info->type = SMMU_PTW_ERR_ADDR_SIZE; > -info->stage = 1; > +info->stage = SMMU_STAGE_1; > tlbe->entry.perm = IOMMU_NONE; > return -EINVAL; > } > diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c > index 9eb56a70f3..50e5a72d54 100644 > --- a/hw/arm/smmuv3.c > +++ b/hw/arm/smmuv3.c > @@ -34,7 +34,8 @@ > #include "smmuv3-internal.h" > #include "smmu-internal.h" > > -#define PTW_RECORD_FAULT(cfg) (((cfg)->stage == 1) ? (cfg)->record_faults > : \ > +#define PTW_RECORD_FAULT(cfg) (((cfg)->stage == SMMU_STAGE_1) ? \ > + (cfg)->record_faults : \ > (cfg)->s2cfg.record_faults) > > /** > @@ -402,7 +403,7 @@ static bool s2_pgtable_config_valid(uint8_t sl0, uint8_t > t0sz, uint8_t gran) > > static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste) > { > -cfg->stage = 2; > +cfg->stage = SMMU_STAGE_2; > > if (STE_S2AA64(ste) == 0x0) { > qemu_log_mask(LOG_UNIMP, > @@ -678,7 +679,7 @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, > SMMUEventInfo *event) > > /* we support only those at the moment */ > cfg->aa64 = true; > -cfg->stage = 1; > +cfg->stage = SMMU_STAGE_1; > > cfg->oas = oas2bits(CD_IPS(cd)); > cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas); > @@ -762,7 +763,7 @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, > SMMUTransCfg *cfg, > return ret; > } > > -if (cfg->aborted || cfg->bypassed || (cfg->stage == 2)) { > +if (cfg->aborted || cfg->bypassed || (cfg->stage == SMMU_STAGE_2)) { > return 0; > } > > @@ -882,7 +883,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion > *mr, hwaddr addr, > goto epilogue; > } > > -if (cfg->stage == 1) { > +if (cfg->stage == SMMU_STAGE_1) { > /* Select stage1 translation table. */ > tt = select_tt(cfg, addr); > if (!tt) { > @@ -919,7 +920,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion > *mr, hwaddr addr, >
Re: [PATCH v11 13/23] hw/intc: Enable FEAT_GICv3_NMI Feature
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > Added properties to enable FEAT_GICv3_NMI feature, setup distributor > and redistributor registers to indicate NMI support. The subject line is misleading, since we don't actually enable the FEAT_GICv3_NMI feature here. I suggest: hw/intc/arm_gicv3: Add has-nmi property to GICv3 device Add a property has-nmi to the GICv3 device, and use this to set the NMI bit in the GICD_TYPER register. This isn't visible to guests yet because the property defaults to false and we won't set it in the board code until we've landed all of the changes needed to implement FEAT_GICV3_NMI. > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson > --- Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v11 14/23] hw/intc/arm_gicv3: Add irq non-maskable property
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > A SPI, PPI or SGI interrupt can have non-maskable property. So maintain > non-maskable property in PendingIrq and GICR/GICD. Since add new device > state, it also needs to be migrated, so also save NMI info in > vmstate_gicv3_cpu and vmstate_gicv3. > > Signed-off-by: Jinjie Ruan > Acked-by: Richard Henderson > --- > v11: > - Put vmstate_gicv3_cpu_nmi and vmstate_gicv3_gicd_nmi into existing list. > - Remove the excess != 0. > v10: > - superprio -> nmi, gicr_isuperprio -> gicr_inmir0. > - Save NMI state in vmstate_gicv3_cpu and vmstate_gicv3. > - Update the commit message. > v3: > - Place this ahead of implement GICR_INMIR. > - Add Acked-by. > --- > hw/intc/arm_gicv3_common.c | 38 ++ > include/hw/intc/arm_gicv3_common.h | 4 > 2 files changed, 42 insertions(+) > > diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c > index 2d2cea6858..189258e1ca 100644 > --- a/hw/intc/arm_gicv3_common.c > +++ b/hw/intc/arm_gicv3_common.c > @@ -164,6 +164,24 @@ const VMStateDescription vmstate_gicv3_gicv4 = { > } > }; > > +static bool nmi_needed(void *opaque) > +{ > +GICv3CPUState *cs = opaque; > + > +return cs->gic->nmi_support; > +} I think we should call this function gicv3_cpu_nmi_needed()... > + > +static const VMStateDescription vmstate_gicv3_cpu_nmi = { > +.name = "arm_gicv3_cpu/nmi", > +.version_id = 1, > +.minimum_version_id = 1, > +.needed = nmi_needed, > +.fields = (const VMStateField[]) { > +VMSTATE_UINT32(gicr_inmir0, GICv3CPUState), > +VMSTATE_END_OF_LIST() > +} > +}; > + > static const VMStateDescription vmstate_gicv3_cpu = { > .name = "arm_gicv3_cpu", > .version_id = 1, > @@ -196,6 +214,7 @@ static const VMStateDescription vmstate_gicv3_cpu = { > &vmstate_gicv3_cpu_virt, > &vmstate_gicv3_cpu_sre_el1, > &vmstate_gicv3_gicv4, > +&vmstate_gicv3_cpu_nmi, > NULL > } > }; > @@ -238,6 +257,24 @@ const VMStateDescription > vmstate_gicv3_gicd_no_migration_shift_bug = { > } > }; > > +static bool needed_nmi(void *opaque) > +{ > +GICv3State *cs = opaque; > + > +return cs->nmi_support; > +} ...and this one gicv3_nmi_needed(). Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v2] input-linux: Add option to not grab a device upon guest startup
Just a ping to make sure this patch hasn't been lost in the noise. The relevant patchew page is https://patchew.org/QEMU/20240322034311.2980970-1-justinien.bou...@gmail.com/. Any chance to get this merged before the next release? Regards, Justinien
Re: [PATCH v11 22/23] target/arm: Add FEAT_NMI to max
On Sat, 30 Mar 2024 at 10:34, Jinjie Ruan wrote: > > Enable FEAT_NMI on the 'max' CPU. > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v11 21/23] hw/intc/arm_gicv3: Report the VINMI interrupt
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > In vCPU Interface, if the vIRQ has the non-maskable property, report > vINMI to the corresponding vPE. > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson > --- Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v11 20/23] hw/intc/arm_gicv3: Report the NMI interrupt in gicv3_cpuif_update()
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > In CPU Interface, if the IRQ has the non-maskable property, report NMI to > the corresponding PE. > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson > --- Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v11 19/23] hw/intc/arm_gicv3: Implement NMI interrupt prioirty
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > If GICD_CTLR_DS bit is zero and the NMI is non-secure, the NMI prioirty Typo in multiple places in this commit message and in the subject line: should be "priority". > is higher than 0x80, otherwise it is higher than 0x0. And save NMI > super prioirty information in hppi.superprio to deliver NMI exception. > Since both GICR and GICD can deliver NMI, it is both necessary to check > whether the pending irq is NMI in gicv3_redist_update_noirqset and > gicv3_update_noirqset. And In irqbetter(), only a non-NMI with the same > priority and a smaller interrupt number can be preempted but not NMI. > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH v11 18/23] hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()
On Sat, 30 Mar 2024 at 10:33, Jinjie Ruan wrote: > > Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for > ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit. > > If FEAT_GICv3_NMI is supported, ich_ap_write() should consider > ICV_AP1R_EL1.NMI > bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit > should be set or clear according to the Non-maskable property. And the RPR > priority should also update the NMI bit according to the APR priority NMI bit. > > By the way, add gicv3_icv_nmiar1_read trace event. > > If the hpp irq is a NMI, the icv iar read should return 1022 and trap for > NMI again > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson > --- > v11: > - Deal with NMI in the callers instead of ich_highest_active_virt_prio(). > - Set either NMI or a group-priority bit, not both. > - Only set AP NMI bits in the 0 reg. > - Handle NMI in hppvi_index(), icv_hppi_can_preempt() and icv_eoir_write(). > v10: > - Rename ICH_AP1R_EL2_NMI to ICV_AP1R_EL1_NMI. > - Add ICV_RPR_EL1_NMI definition. > - Set ICV_RPR_EL1.NMI according to the ICV_AP1R_EL1.NMI in > ich_highest_active_virt_prio(). > v9: > - Correct the INTID_NMI logic. > v8: > - Fix an unexpected interrupt bug when sending VNMI by running qemu VM. > v7: > - Add Reviewed-by. > v6: > - Implement icv_nmiar1_read(). > --- > hw/intc/arm_gicv3_cpuif.c | 97 ++- > hw/intc/gicv3_internal.h | 4 ++ > hw/intc/trace-events | 1 + > 3 files changed, 91 insertions(+), 11 deletions(-) > > diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c > index f99f2570a6..a7bc44b30c 100644 > --- a/hw/intc/arm_gicv3_cpuif.c > +++ b/hw/intc/arm_gicv3_cpuif.c > @@ -157,6 +157,10 @@ static int ich_highest_active_virt_prio(GICv3CPUState > *cs) > int i; > int aprmax = ich_num_aprs(cs); > > +if (cs->gic->nmi_support && cs->ich_apr[GICV3_G1NS][0] & > ICV_AP1R_EL1_NMI) { > +return 0x80; This should be 0, not 0x80 -- see the register description for ICH_LR_EL2 Priority field: when NMI is 1, the virtual interrupt's priority is treated as 0x0. > +} > + > for (i = 0; i < aprmax; i++) { > uint32_t apr = cs->ich_apr[GICV3_G0][i] | > cs->ich_apr[GICV3_G1NS][i]; > @@ -191,6 +195,7 @@ static int hppvi_index(GICv3CPUState *cs) > * correct behaviour. > */ > int prio = 0xff; > +bool nmi = false; > > if (!(cs->ich_vmcr_el2 & (ICH_VMCR_EL2_VENG0 | ICH_VMCR_EL2_VENG1))) { > /* Both groups disabled, definitely nothing to do */ > @@ -200,6 +205,11 @@ static int hppvi_index(GICv3CPUState *cs) > for (i = 0; i < cs->num_list_regs; i++) { > uint64_t lr = cs->ich_lr_el2[i]; > int thisprio; > +bool thisnmi = false; > + > +if (cs->gic->nmi_support) { > +thisnmi = lr & ICH_LR_EL2_NMI; > +} You could write this a little more concisely as bool thisnmi = cs->gic_nmi_support && (lr & ICH_LR_EL2_NMI); > if (ich_lr_state(lr) != ICH_LR_EL2_STATE_PENDING) { > /* Not Pending */ > @@ -219,9 +229,13 @@ static int hppvi_index(GICv3CPUState *cs) > > thisprio = ich_lr_prio(lr); > > -if (thisprio < prio) { > +if ((thisprio < prio) || ((thisprio == prio) && (thisnmi & (!nmi > { > prio = thisprio; > idx = i; > + > +if (cs->gic->nmi_support) { > +nmi = thisnmi; > +} You don't need to check nmi_support here because if nmi_support is false then thisnmi will also be false, and so we will never set nmi to anything other than false. > } > } > > @@ -326,6 +340,12 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, > uint64_t lr) > return true; > } > > +if ((prio & mask) == (rprio & mask) && > +cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI) && > +(!(cs->ich_apr[GICV3_G1NS][0] & ICV_AP1R_EL1_NMI))) { > +return true; > +} > + > return false; > } This isn't the only change needed to icv_hppi_can_preempt(): virtual NMIs are never masked by the MPR, so that check also must be changed. If we pull out a variable: bool is_nmi = cs->gic->nmi_support && (lr & ICH_LR_EL2_NMI); we can use that both to gate the vpmr check: if (!is_nmi && prio >= vpmr) { and also in the priority check you have here. > @@ -736,13 +765,19 @@ static void icv_activate_irq(GICv3CPUState *cs, int > idx, int grp) > */ > uint32_t mask = icv_gprio_mask(cs, grp); > int prio = ich_lr_prio(cs->ich_lr_el2[idx]) & mask; > +bool nmi = cs->ich_lr_el2[idx] & ICH_LR_EL2_NMI; > int aprbit = prio >> (8 - cs->vprebits); > int regno = aprbit / 32; > int regbit = aprbit % 32; > > cs->ich_lr_el2[idx] &= ~ICH_LR_EL2_STATE_PENDING_BIT; > cs->ich_lr_el2[idx] |= ICH_LR_EL2_STATE_ACTIVE_BIT; > -cs->ich_apr[grp][regno] |= (1 << regbit); > + > +if (cs->gic->nmi_support && nmi) { > +
Re: [PATCH v11 17/23] hw/intc/arm_gicv3: Add NMI handling CPU interface registers
On Sat, 30 Mar 2024 at 10:34, Jinjie Ruan wrote: > > Add the NMIAR CPU interface registers which deal with acknowledging NMI. > > When introduce NMI interrupt, there are some updates to the semantics for the > register ICC_IAR1_EL1 and ICC_HPPIR1_EL1. For ICC_IAR1_EL1 register, it > should return 1022 if the intid has non-maskable property. And for > ICC_NMIAR1_EL1 register, it should return 1023 if the intid do not have > non-maskable property. Howerever, these are not necessary for ICC_HPPIR1_EL1 > register. > > And the APR and RPR has NMI bits which should be handled correctly. > > Signed-off-by: Jinjie Ruan > Reviewed-by: Richard Henderson I have a few more comments below but otherwise I think this patch is looking OK. > @@ -898,12 +922,24 @@ static bool icc_hppi_can_preempt(GICv3CPUState *cs) > */ > int rprio; > uint32_t mask; > +ARMCPU *cpu = ARM_CPU(cs->cpu); > +CPUARMState *env = &cpu->env; > > if (icc_no_enabled_hppi(cs)) { > return false; > } > > -if (cs->hppi.prio >= cs->icc_pmr_el1) { > +if (cs->gic->nmi_support && cs->hppi.nmi) { If cs->hppi.nmi is set then nmi_support must be true, as we won't ever set hppi.nmi if the GIC doesn't have NMI support (because the registers where the guest can set the NMI bit will not be writeable and so the nmi bits will always be zero). Elsewhere in the code we assume this (eg in icc_iar1_read() below), so I think we can also assume it here. > +if (!(cs->gic->gicd_ctlr & GICD_CTLR_DS) && > +cs->hppi.grp == GICV3_G1NS) { > +if (cs->icc_pmr_el1 < 0x80) { > +return false; > +} > +if (arm_is_secure(env) && cs->icc_pmr_el1 == 0x80) { > +return false; > +} > +} > +} else if (cs->hppi.prio >= cs->icc_pmr_el1) { > /* Priority mask masks this interrupt */ > return false; > } > @@ -923,6 +959,18 @@ static bool icc_hppi_can_preempt(GICv3CPUState *cs) > return true; > } > > +if (cs->gic->nmi_support && cs->hppi.nmi && > +(cs->hppi.prio & mask) == (rprio & mask)) { > +if ((cs->hppi.grp == GICV3_G1NS) && > +!(cs->icc_apr[GICV3_G1NS][0] & ICC_AP1R_EL1_NMI)) { > +return true; > +} > +if ((cs->hppi.grp == GICV3_G1) && > +!(cs->icc_apr[GICV3_G1][0] & ICC_AP1R_EL1_NMI)) { > +return true; > +} You can write this part a bit more concisely: if (!(cs->icc_apr[cs->hppi.grp][0] & ICC_AP1R_EL1_NMI)) { return true; } rather than writing out the G1 and G1NS cases separately. > +} > + > return false; > } > @@ -1159,13 +1212,16 @@ static uint64_t icc_iar0_read(CPUARMState *env, const > ARMCPRegInfo *ri) > static uint64_t icc_iar1_read(CPUARMState *env, const ARMCPRegInfo *ri) > { > GICv3CPUState *cs = icc_cs_from_env(env); > +int el = arm_current_el(env); > uint64_t intid; > > if (icv_access(env, HCR_IMO)) { > return icv_iar_read(env, ri); > } > > -if (!icc_hppi_can_preempt(cs)) { > +if (cs->hppi.nmi && env->cp15.sctlr_el[el] & SCTLR_NMI) { > +intid = INTID_NMI; > +} else if (!icc_hppi_can_preempt(cs)) { > intid = INTID_SPURIOUS; > } else { > intid = icc_hppir1_value(cs, env); This is the wrong order -- the cases where icc_hppi_can_preempt() returns false need to take priority over the case where we return INTID_NMI. You can get that by structuring it similarly to the pseudocode: if (!icc_hppi_can_preempt(cs)) { intid = INTID_SPURIOUS; } else { intid = icc_hppir1_value(cs, env); } if (!gicv3_intid_is_special(intid)) { if (cs->hppi.nmi && env->cp15.sctlr_el[el] & SCTLR_NMI) { intid = INTID_NMI; } else { icc_activate_irq(cs, intid); } } > @@ -1803,6 +1901,22 @@ static uint64_t icc_rpr_read(CPUARMState *env, const > ARMCPRegInfo *ri) > } > } > > +if (cs->gic->nmi_support) { > +/* NMI info is reported in the high bits of RPR */ > +if (arm_feature(env, ARM_FEATURE_EL3) && !arm_is_secure(env)) { > +if (cs->icc_apr[GICV3_G1NS][0] & ICC_AP1R_EL1_NMI) { > +prio |= ICC_RPR_EL1_NSNMI; This should be ICC_RPR_EL1_NMI. Compare the pseudocode for ICC_RPR_EL1: in the EL3-nonsecure case we set pPriority<63>. > +} > +} else { > +if (cs->icc_apr[GICV3_G1NS][0] & ICC_AP1R_EL1_NMI) { > +prio |= ICC_RPR_EL1_NSNMI; > +} > +if (cs->icc_apr[GICV3_G1][0] & ICC_AP1R_EL1_NMI) { > +prio |= ICC_RPR_EL1_NMI; > +} > +} > +} > + > trace_gicv3_icc_rpr_read(gicv3_redist_affid(cs), prio); > return prio; > } thanks -- PMM
Re: [PATCH 06/19] block/stream: fix -Werror=maybe-uninitialized false-positives
On Tue, Apr 02, 2024 at 12:58:43PM +0300, Vladimir Sementsov-Ogievskiy wrote: > > > Again, same false-positives, because of WITH_GRAPH_RDLOCK_GUARD().. > > > > > > Didn't you try to change WITH_ macros somehow, so that compiler believe > > > in our good intentions? > > > > > > > > > #define WITH_QEMU_LOCK_GUARD_(x, var) \ > > for (g_autoptr(QemuLockable) var = \ > > qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))); \ > > var; \ > > qemu_lockable_auto_unlock(var), var = NULL) > > > > I can't think of a clever way to rewrite this. The compiler probably > > thinks the loop may not run, due to the "var" condition. But how to > > convince it otherwise? it's hard to introduce another variable too.. > > > hmm. maybe like this? > > #define WITH_QEMU_LOCK_GUARD_(x, var) \ > for (g_autoptr(QemuLockable) var = \ > qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))), \ > var2 = (void *)(true); \ > var2; \ > qemu_lockable_auto_unlock(var), var2 = NULL) > > > probably, it would be simpler for compiler to understand the logic this way. > Could you check? Wouldn't that attach __attribute__((cleanup(xxx))) to var2, at which point we could cause the compiler to call xxx((void*)(true)) if the user does an early return inside the lock guard, with disastrous consequences? Or is the __attribute__ applied only to the first out of two declarations in a list? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PULL 0/4] Trivial patches for 2024-04-02
On Tue, 2 Apr 2024 at 15:30, Laurent Vivier wrote: > To post PR I generally use git-publish and I have a hook that checks that. > > $ cat .git/hooks/pre-publish-send-email > !/bin/bash > > NAME=$(git config --get user.name) > EMAIL=$(git config --get user.email) > > for PATCH in $1/*.patch; do > if [ $(basename $PATCH) = "-cover-letter.patch" ]; then > continue > fi > if ! grep -q "^Signed-off-by: $NAME <$EMAIL>" $PATCH; then > echo "Error: Missing sender S-o-B in $PATCH" > exit 1 > fi > if grep "^From: " $PATCH | grep -q "" ; then > echo "Error: Author email address is mangled by the mailing list in > $PATCH" > exit 1 > fi You should check for qemu-.*, not just qemu-devel... thanks -- PMM
Re: [PULL 0/7] lsi, vga fixes for 2024-04-02
On Tue, 2 Apr 2024 at 14:20, Paolo Bonzini wrote: > > The following changes since commit b9dbf6f9bf533564f6a4277d03906fcd32bb0245: > > Merge tag 'pull-tcg-20240329' of https://gitlab.com/rth7680/qemu into > staging (2024-03-30 14:54:57 +) > > are available in the Git repository at: > > https://gitlab.com/bonzini/qemu.git tags/for-upstream > > for you to fetch changes up to eac4af186f6db46fc90ec571a855bd6fa4cb7841: > > pc_q35: remove unnecessary m->alias assignment (2024-04-02 15:14:02 +0200) > > > * lsi53c895a: fix assertion failure with invalid Block Move > * vga: fix assertion failure with 4- and 16-color modes > * remove unnecessary assignment > > > Paolo Bonzini (7): > vga: merge conditionals on shift control register > vga: move computation of dirty memory region later > vga: adjust dirty memory region if pel panning is active > vga: do not treat horiz pel panning value of 8 as "enabled" > lsi53c895a: avoid out of bounds access to s->msg[] > lsi53c895a: detect invalid Block Move instruction > pc_q35: remove unnecessary m->alias assignment This seems to break the avocado test tests/avocado/ppc_prep_40p.py:IbmPrep40pMachine.test_openbios_and_netbsd and it's consistent even with retrying the job: https://gitlab.com/qemu-project/qemu/-/jobs/6529626987 https://gitlab.com/qemu-project/qemu/-/jobs/6528696711 https://gitlab.com/qemu-project/qemu/-/jobs/6529196532 The debug log says: 14:23:32 DEBUG| Transitioning from 'Runstate.CONNECTING' to 'Runstate.RUNNING'. 14:23:32 DEBUG| Opening console file 14:23:32 DEBUG| Opening console socket 14:23:32 DEBUG| >> = 14:23:32 DEBUG| >> OpenBIOS 1.1 [Mar 7 2023 22:21] 14:23:32 DEBUG| >> Configuration device id QEMU version 1 machine id 0 14:23:32 DEBUG| >> CPUs: 0 14:23:32 DEBUG| >> Memory: 128M 14:23:32 DEBUG| >> UUID: ---- 14:23:32 DEBUG| >> CPU type PowerPC,604 14:23:32 DEBUG| milliseconds isn't unique. 14:23:32 DEBUG| Output device screen not found. 14:23:32 DEBUG| Output device screen not found. 14:23:32 DEBUG| Trying cd:,\\:tbxi... 14:23:32 DEBUG| Trying cd:,\ppc\bootinfo.txt... 14:23:32 DEBUG| Trying cd:,%BOOT... 14:23:32 DEBUG| No valid state has been set by load or init-program and then the test times out because it never sees the NetBSD console output it's waiting for. Successful job for a previous pullreq: https://gitlab.com/qemu-project/qemu/-/jobs/6527774374 Here the debug log says: 12:36:14 DEBUG| >> = 12:36:14 DEBUG| >> OpenBIOS 1.1 [Mar 7 2023 22:21] 12:36:14 DEBUG| >> Configuration device id QEMU version 1 machine id 0 12:36:14 DEBUG| >> CPUs: 0 12:36:14 DEBUG| >> Memory: 128M 12:36:14 DEBUG| >> UUID: ---- 12:36:14 DEBUG| >> CPU type PowerPC,604 12:36:14 DEBUG| milliseconds isn't unique. 12:36:14 DEBUG| Output device screen not found. 12:36:14 DEBUG| Output device screen not found. 12:36:14 DEBUG| Trying cd:,\\:tbxi... 12:36:14 DEBUG| >> Not a bootable ELF image 12:36:15 DEBUG| >> switching to new context: 12:36:15 DEBUG| >> NetBSD/prep BOOT, Revision 1.9 12:36:15 DEBUG| Shutting down VM appliance; timeout=30 12:36:15 DEBUG| Attempting graceful termination 12:36:15 DEBUG| Closing console file 12:36:15 DEBUG| Closing console socket 12:36:15 DEBUG| Politely asking QEMU to terminate This machine uses the lsi53c810 SCSI controller, and it's failing to load from the CDROM, so my guess is the problem is in one of the two SCSI patches. thanks -- PMM
[PATCH 0/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR
There is a crash in the Non-standard guest image. The root cause of the issue is that an IRQFD was used After it release During the booting process of the Vyatta image, the behavior of the called function in qemu is as follows: 1. vhost_net_stop() was called. This will call the function virtio_pci_set_guest_notifiers() with assgin= false, and virtio_pci_set_guest_notifiers(??? will release the irqfd for vector 0 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTOR 3.vhost_net_start() was called (at this time the configure vector is still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with assgin= true, so the irqfd for vector 0 was not "init" during this process 4. The system continues to boot, and msix_fire_vector_notifier() was called unmask the vector 0 and then met the crash [msix_fire_vector_notifier] 112 called vector 0 is_masked 1 [msix_fire_vector_notifier] 112 called vector 0 is_masked 0 To fix this, we need to call the function "kvm_virtio_pci_vector_use_one()" when the vector changes back from VIRTIO_NO_VECTOR Signed-off-by: Cindy Lu Cindy Lu (1): virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR hw/virtio/virtio-pci.c | 31 --- 1 file changed, 28 insertions(+), 3 deletions(-) -- 2.43.0
[PATCH 1/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR
When the guest calls virtio_stop and then virtio_reset, the vector will change to VIRTIO_NO_VECTOR and the IRQFD for this vector will be released. After that If you want to change the vector back, it will cause a crash. To fix this, we need to call the function "kvm_virtio_pci_vector_use_one()" when the vector changes back from VIRTIO_NO_VECTOR Signed-off-by: Cindy Lu --- hw/virtio/virtio-pci.c | 31 --- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index e433879542..45f3ab38c3 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -874,12 +874,14 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy *proxy, int queue_no, return 0; } -static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no) +static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no, + bool recovery) { unsigned int vector; int ret; EventNotifier *n; PCIDevice *dev = &proxy->pci_dev; +VirtIOIRQFD *irqfd; VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); @@ -890,10 +892,21 @@ static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no) if (vector >= msix_nr_vectors_allocated(dev)) { return 0; } +/* + * if this is recovery and irqfd still in use, means the irqfd was not + * release before and don't need to set up again + */ +if (recovery) { +irqfd = &proxy->vector_irqfd[vector]; +if (irqfd->users != 0) { +return 0; +} +} ret = kvm_virtio_pci_vq_vector_use(proxy, vector); if (ret < 0) { goto undo; } + /* * If guest supports masking, set up irqfd now. * Otherwise, delay until unmasked in the frontend. @@ -932,14 +945,14 @@ static int kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs) if (!virtio_queue_get_num(vdev, queue_no)) { return -1; } -ret = kvm_virtio_pci_vector_use_one(proxy, queue_no); +ret = kvm_virtio_pci_vector_use_one(proxy, queue_no, false); } return ret; } static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy) { -return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX); +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, false); } static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy, @@ -1570,7 +1583,13 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr, } else { val = VIRTIO_NO_VECTOR; } +vector = vdev->config_vector; vdev->config_vector = val; +/*check if the vector need to recovery*/ +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) && +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { +kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, true); +} break; case VIRTIO_PCI_COMMON_STATUS: if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) { @@ -1611,6 +1630,12 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr, val = VIRTIO_NO_VECTOR; } virtio_queue_set_vector(vdev, vdev->queue_sel, val); + +/*check if the vector need to recovery*/ +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) && +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { +kvm_virtio_pci_vector_use_one(proxy, vdev->queue_sel, true); +} break; case VIRTIO_PCI_COMMON_Q_ENABLE: if (val == 1) { -- 2.43.0
Re: [PATCH] hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabled
On Tue, Apr 02, 2024 at 09:18:44PM +0800, Xiaoyao Li wrote: > On 4/2/2024 6:02 PM, Michael S. Tsirkin wrote: > > On Tue, Apr 02, 2024 at 04:25:16AM -0400, Xiaoyao Li wrote: > > > Set MADT.FLAGS[bit 0].PCAT_COMPAT based on x86ms->pic. > > > > > > Signed-off-by: Xiaoyao Li > > > > Please include more info in the commit log: > > what is the behaviour you observe, why it is wrong, > > how does the patch fix it, what is guest behaviour > > before and after. > > Sorry, I thought it was straightforward. > > A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system > also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. > > When PIC is not enabled for x86 machine, the PCAT_COMPAT bit needs to be > cleared. Otherwise, the guest thinks there is a present PIC even it is > booted with pic=off on QEMU. > > (I haven't seen real issue from Linux guest. The user of PIC inside guest > seems only the pit calibration. Whether pit calibration is triggered depends > on other things. But logically, current code is wrong, we need to fix it > anyway. > > @Isaku, please share more info if you have) > That's sufficient, thanks! Pls put this in commit log and resubmit. > > The commit log and the subject should not repeat > > what the diff already states. > > > > > --- > > > hw/i386/acpi-common.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c > > > index 20f19269da40..0cc2919bb851 100644 > > > --- a/hw/i386/acpi-common.c > > > +++ b/hw/i386/acpi-common.c > > > @@ -107,7 +107,9 @@ void acpi_build_madt(GArray *table_data, BIOSLinker > > > *linker, > > > acpi_table_begin(&table, table_data); > > > /* Local APIC Address */ > > > build_append_int_noprefix(table_data, APIC_DEFAULT_ADDRESS, 4); > > > -build_append_int_noprefix(table_data, 1 /* PCAT_COMPAT */, 4); /* > > > Flags */ > > > +/* Flags. bit 0: PCAT_COMPAT */ > > > +build_append_int_noprefix(table_data, > > > + x86ms->pic != ON_OFF_AUTO_OFF ? 1 : 0 , 4); > > > for (i = 0; i < apic_ids->len; i++) { > > > pc_madt_cpu_entry(i, apic_ids, table_data, false); > > > -- > > > 2.34.1 > >
Re: [PULL 0/4] Trivial patches for 2024-04-02
Le 02/04/2024 à 12:41, Michael Tokarev a écrit : Author: Stefan Weil via *SIGH* This happened *again*. (you'll need to tell git log "--no-mailmap" to not get confused by the mapping we have for the last time one of these slipped through...) Now this is interesting. And this is exactly why I haven't noticed it - I did pay attention to Author lines this time. -- because it is displayed with mailmap applied. How very useful. I have to use `git show --no-mailmap' to see the original " via.." version. To post PR I generally use git-publish and I have a hook that checks that. $ cat .git/hooks/pre-publish-send-email !/bin/bash NAME=$(git config --get user.name) EMAIL=$(git config --get user.email) for PATCH in $1/*.patch; do if [ $(basename $PATCH) = "-cover-letter.patch" ]; then continue fi if ! grep -q "^Signed-off-by: $NAME <$EMAIL>" $PATCH; then echo "Error: Missing sender S-o-B in $PATCH" exit 1 fi if grep "^From: " $PATCH | grep -q "" ; then echo "Error: Author email address is mangled by the mailing list in $PATCH" exit 1 fi done exit 0
[PULL 11/15] plugins: fix -Werror=maybe-uninitialized false-positive
From: Marc-André Lureau ../plugins/loader.c:405:15: error: ‘ctx’ may be used uninitialized [-Werror=maybe-uninitialized] Signed-off-by: Marc-André Lureau Reviewed-by: Pierrick Bouvier Message-ID: <20240328102052.3499331-15-marcandre.lur...@redhat.com> Signed-off-by: Philippe Mathieu-Daudé --- plugins/loader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plugins/loader.c b/plugins/loader.c index 9768b78eb6..513a429c57 100644 --- a/plugins/loader.c +++ b/plugins/loader.c @@ -390,7 +390,7 @@ void plugin_reset_uninstall(qemu_plugin_id_t id, bool reset) { struct qemu_plugin_reset_data *data; -struct qemu_plugin_ctx *ctx; +struct qemu_plugin_ctx *ctx = NULL; WITH_QEMU_LOCK_GUARD(&plugin.lock) { ctx = plugin_id_to_ctx_locked(id); -- 2.41.0
[PULL 08/15] MAINTAINERS: Fix error-report.c entry
From: Zhao Liu The commit 15002f60f792 ("util: rename qemu-error.c to match its header name") renamed util/qemu-error.c to util/error-report.c but missed to change the corresponding entry. To avoid get_maintainer.pl failing, update the error-report.c entry. Fixes: 15002f60f7 ("util: rename qemu-error.c to match its header name") Signed-off-by: Zhao Liu Reviewed-by: Philippe Mathieu-Daudé Message-ID: <20240327115539.3860270-1-zhao1@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index a07af6b9d4..197a06b42f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3013,7 +3013,7 @@ F: include/qapi/error.h F: include/qemu/error-report.h F: qapi/error.json F: util/error.c -F: util/qemu-error.c +F: util/error-report.c F: scripts/coccinelle/err-bad-newline.cocci F: scripts/coccinelle/error-use-after-free.cocci F: scripts/coccinelle/error_propagate_null.cocci -- 2.41.0
[PULL 13/15] gpio/pca955x: Update maintainer email address
From: Glenn Miles It was noticed that my linux.vnet.ibm.com address does not always work so dropping the vnet to see if that works better. Signed-off-by: Glenn Miles Message-ID: <20240328194914.2145709-1-mil...@linux.vnet.ibm.com> Signed-off-by: Philippe Mathieu-Daudé --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 197a06b42f..e71183eef9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1545,7 +1545,7 @@ F: pc-bios/skiboot.lid F: tests/qtest/pnv* pca955x -M: Glenn Miles +M: Glenn Miles L: qemu-...@nongnu.org L: qemu-...@nongnu.org S: Odd Fixes -- 2.41.0
[PULL 10/15] block: Remove unnecessary NULL check in bdrv_pad_request()
From: Kevin Wolf Coverity complains that the check introduced in commit 3f934817 suggests that qiov could be NULL and we dereference it before reaching the check. In fact, all of the callers pass a non-NULL pointer, so just remove the misleading check. Resolves: Coverity CID 1542668 Signed-off-by: Kevin Wolf Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Fiona Ebner Message-ID: <20240327192750.204197-1-kw...@redhat.com> Signed-off-by: Philippe Mathieu-Daudé --- block/io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 395bea3bac..7217cf811b 100644 --- a/block/io.c +++ b/block/io.c @@ -1730,7 +1730,7 @@ static int bdrv_pad_request(BlockDriverState *bs, * For prefetching in stream_populate(), no qiov is passed along, because * only copy-on-read matters. */ -if (qiov && *qiov) { +if (*qiov) { sliced_iov = qemu_iovec_slice(*qiov, *qiov_offset, *bytes, &sliced_head, &sliced_tail, &sliced_niov); -- 2.41.0
[PULL 12/15] hw/nvme: fix -Werror=maybe-uninitialized
From: Marc-André Lureau ../hw/nvme/ctrl.c:6081:21: error: ‘result’ may be used uninitialized [-Werror=maybe-uninitialized] It's not obvious that 'result' is set in all code paths. When &result is a returned argument, it's even less clear. Looking at various assignments, 0 seems to be a suitable default value. Signed-off-by: Marc-André Lureau Reviewed-by: Klaus Jensen Message-ID: <20240328102052.3499331-18-marcandre.lur...@redhat.com> Signed-off-by: Philippe Mathieu-Daudé --- hw/nvme/ctrl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index c2b17de987..127c3d2383 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -5894,7 +5894,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) uint32_t dw10 = le32_to_cpu(cmd->cdw10); uint32_t dw11 = le32_to_cpu(cmd->cdw11); uint32_t nsid = le32_to_cpu(cmd->nsid); -uint32_t result; +uint32_t result = 0; uint8_t fid = NVME_GETSETFEAT_FID(dw10); NvmeGetFeatureSelect sel = NVME_GETFEAT_SELECT(dw10); uint16_t iv; -- 2.41.0
[PULL 15/15] hw/net/virtio-net: fix qemu set used ring flag even vhost started
From: Yajun Wu When vhost-user or vhost-kernel is handling virtio net datapath, QEMU should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by QEMU in following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko Acked-by: Michael S. Tsirkin Message-ID: <20240402045109.97729-1-yaj...@nvidia.com> [PMD: Use unlikely()] Signed-off-by: Philippe Mathieu-Daudé --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..58014a92ad 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; +if (unlikely(n->vhost_started)) { +return; +} + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return; -- 2.41.0
[PULL 14/15] hw/xen_evtchn: Initialize flush_kvm_routes
From: Artem Chernyshev In xen_evtchn_soft_reset() variable flush_kvm_routes can be used before being initialized. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Oleg Sviridov Signed-off-by: Artem Chernyshev Reviewed-by: Philippe Mathieu-Daudé Message-ID: <20240329113939.257033-1-artem.chernys...@red-soft.ru> Signed-off-by: Philippe Mathieu-Daudé --- hw/i386/kvm/xen_evtchn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index a5052c0ea3..07bd0c9ab8 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -1097,7 +1097,7 @@ static int close_port(XenEvtchnState *s, evtchn_port_t port, int xen_evtchn_soft_reset(void) { XenEvtchnState *s = xen_evtchn_singleton; -bool flush_kvm_routes; +bool flush_kvm_routes = false; int i; if (!s) { -- 2.41.0
[PULL 01/15] accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition
The CONFIG_SOFTMMU_GATE definition was never used, remove it. Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Thomas Huth Reviewed-by: Richard Henderson Message-Id: <20240313213339.82071-2-phi...@linaro.org> --- accel/tcg/plugin-gen.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c index 8028786c7b..cd78ef94a1 100644 --- a/accel/tcg/plugin-gen.c +++ b/accel/tcg/plugin-gen.c @@ -57,12 +57,6 @@ #include "exec/helper-info.c.inc" #undef HELPER_H -#ifdef CONFIG_SOFTMMU -# define CONFIG_SOFTMMU_GATE 1 -#else -# define CONFIG_SOFTMMU_GATE 0 -#endif - /* * plugin_cb_start TCG op args[]: * 0: enum plugin_gen_from -- 2.41.0
[PULL 04/15] target/ppc: Rename init_excp_4xx_softmmu() -> init_excp_4xx()
Unify with other init_excp_FOO() in the same file. Signed-off-by: Philippe Mathieu-Daudé Acked-by: Nicholas Piggin Message-Id: <20240313213339.82071-5-phi...@linaro.org> --- target/ppc/cpu_init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c index 22fdea093b..6241de62ce 100644 --- a/target/ppc/cpu_init.c +++ b/target/ppc/cpu_init.c @@ -1642,7 +1642,7 @@ static void register_8xx_sprs(CPUPPCState *env) /*/ /* Exception vectors models */ -static void init_excp_4xx_softmmu(CPUPPCState *env) +static void init_excp_4xx(CPUPPCState *env) { #if !defined(CONFIG_USER_ONLY) env->excp_vectors[POWERPC_EXCP_CRITICAL] = 0x0100; @@ -2120,7 +2120,7 @@ static void init_proc_405(CPUPPCState *env) env->id_tlbs = 0; env->tlb_type = TLB_EMB; #endif -init_excp_4xx_softmmu(env); +init_excp_4xx(env); env->dcache_line_size = 32; env->icache_line_size = 32; /* Allocate hardware IRQ controller */ -- 2.41.0
[PULL 09/15] hw/i386/pc: Restrict CXL to PCI-based machines
CXL is based on PCIe. In is pointless to initialize its context on non-PCI machines. Signed-off-by: Philippe Mathieu-Daudé Acked-by: Jonathan Cameron Message-ID: <20240327161642.33574-1-phi...@linaro.org> --- hw/i386/pc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index e80f02bef4..5c21b0c4db 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1738,7 +1738,9 @@ static void pc_machine_initfn(Object *obj) pcms->pcspk = isa_new(TYPE_PC_SPEAKER); object_property_add_alias(OBJECT(pcms), "pcspk-audiodev", OBJECT(pcms->pcspk), "audiodev"); -cxl_machine_init(obj, &pcms->cxl_devices_state); +if (pcmc->pci_enabled) { +cxl_machine_init(obj, &pcms->cxl_devices_state); +} pcms->machine_done.notify = pc_machine_done; qemu_add_machine_init_done_notifier(&pcms->machine_done); -- 2.41.0
[PULL 03/15] gdbstub/system: Rename 'user_ctx' argument as 'ctx'
Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Message-Id: <20240313213339.82071-4-phi...@linaro.org> --- gdbstub/internals.h | 8 gdbstub/system.c| 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gdbstub/internals.h b/gdbstub/internals.h index 66c939c67f..32f9f63297 100644 --- a/gdbstub/internals.h +++ b/gdbstub/internals.h @@ -187,7 +187,7 @@ typedef union GdbCmdVariant { #define get_param(p, i)(&g_array_index(p, GdbCmdVariant, i)) -void gdb_handle_query_rcmd(GArray *params, void *user_ctx); /* system */ +void gdb_handle_query_rcmd(GArray *params, void *ctx); /* system */ void gdb_handle_query_offsets(GArray *params, void *user_ctx); /* user */ void gdb_handle_query_xfer_auxv(GArray *params, void *user_ctx); /*user */ void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx); /*user */ @@ -201,11 +201,11 @@ void gdb_handle_query_supported_user(const char *gdb_supported); /* user */ bool gdb_handle_set_thread_user(uint32_t pid, uint32_t tid); /* user */ bool gdb_handle_detach_user(uint32_t pid); /* user */ -void gdb_handle_query_attached(GArray *params, void *user_ctx); /* both */ +void gdb_handle_query_attached(GArray *params, void *ctx); /* both */ /* system only */ -void gdb_handle_query_qemu_phy_mem_mode(GArray *params, void *user_ctx); -void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *user_ctx); +void gdb_handle_query_qemu_phy_mem_mode(GArray *params, void *ctx); +void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *ctx); /* sycall handling */ void gdb_handle_file_io(GArray *params, void *user_ctx); diff --git a/gdbstub/system.c b/gdbstub/system.c index a3ce384cd1..d235403855 100644 --- a/gdbstub/system.c +++ b/gdbstub/system.c @@ -488,13 +488,13 @@ bool gdb_can_reverse(void) */ void gdb_handle_query_qemu_phy_mem_mode(GArray *params, -void *user_ctx) +void *ctx) { g_string_printf(gdbserver_state.str_buf, "%d", phy_memory_mode); gdb_put_strbuf(); } -void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *user_ctx) +void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *ctx) { if (!params->len) { gdb_put_packet("E22"); @@ -509,7 +509,7 @@ void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *user_ctx) gdb_put_packet("OK"); } -void gdb_handle_query_rcmd(GArray *params, void *user_ctx) +void gdb_handle_query_rcmd(GArray *params, void *ctx) { const guint8 zero = 0; int len; @@ -539,7 +539,7 @@ void gdb_handle_query_rcmd(GArray *params, void *user_ctx) * Execution state helpers */ -void gdb_handle_query_attached(GArray *params, void *user_ctx) +void gdb_handle_query_attached(GArray *params, void *ctx) { gdb_put_packet("1"); } -- 2.41.0
[PULL 07/15] qtest/libqos: Reduce size_to_prdtl() declaration scope
Since size_to_prdtl() is only used within ahci.c, declare it statically. This removes the last use of "inlined function with external linkage". See previous commit and commit 9de9fa5cf2 for rationale. Suggested-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Peter Maydell Reviewed-by: Thomas Huth Message-Id: <20240326171009.26696-4-phi...@linaro.org> --- tests/qtest/libqos/ahci.h | 1 - tests/qtest/libqos/ahci.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/tests/qtest/libqos/ahci.h b/tests/qtest/libqos/ahci.h index 48017864bf..a0487a1557 100644 --- a/tests/qtest/libqos/ahci.h +++ b/tests/qtest/libqos/ahci.h @@ -599,7 +599,6 @@ void ahci_port_check_cmd_sanity(AHCIQState *ahci, AHCICommand *cmd); /* Misc */ bool is_atapi(AHCIQState *ahci, uint8_t port); -unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd); /* Command: Macro level execution */ void ahci_guest_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd, diff --git a/tests/qtest/libqos/ahci.c b/tests/qtest/libqos/ahci.c index a2c94c6e06..6d59c7551a 100644 --- a/tests/qtest/libqos/ahci.c +++ b/tests/qtest/libqos/ahci.c @@ -662,7 +662,7 @@ unsigned ahci_pick_cmd(AHCIQState *ahci, uint8_t port) g_assert_not_reached(); } -inline unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd) +static unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd) { /* Each PRD can describe up to 4MiB */ g_assert_cmphex(bytes_per_prd, <=, 4096 * 1024); -- 2.41.0
[PULL 06/15] accel/hvf: Un-inline hvf_arch_supports_guest_debug()
See previous commit and commit 9de9fa5cf2 ("Avoid using inlined functions with external linkage") for rationale. Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Peter Maydell Reviewed-by: Richard Henderson Message-Id: <20240313184954.42513-3-phi...@linaro.org> --- target/arm/hvf/hvf.c | 2 +- target/i386/hvf/hvf.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c index e5f0f60093..65a5601804 100644 --- a/target/arm/hvf/hvf.c +++ b/target/arm/hvf/hvf.c @@ -2246,7 +2246,7 @@ void hvf_arch_update_guest_debug(CPUState *cpu) hvf_arch_set_traps(); } -inline bool hvf_arch_supports_guest_debug(void) +bool hvf_arch_supports_guest_debug(void) { return true; } diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c index 11ffdd4c69..1ed8ed5154 100644 --- a/target/i386/hvf/hvf.c +++ b/target/i386/hvf/hvf.c @@ -708,7 +708,7 @@ void hvf_arch_update_guest_debug(CPUState *cpu) { } -inline bool hvf_arch_supports_guest_debug(void) +bool hvf_arch_supports_guest_debug(void) { return false; } -- 2.41.0
[PULL 05/15] hw/arm/smmu: Avoid using inlined functions with external linkage again
Similarly to commit 9de9fa5cf2 ("hw/arm/smmu-common: Avoid using inlined functions with external linkage"): None of our code base require / use inlined functions with external linkage. Some places use internal inlining in the hot path. These two functions are certainly not in any hot path and don't justify any inlining, so these are likely oversights rather than intentional. Fix: C compiler for the host machine: clang (clang 15.0.0 "Apple clang version 15.0.0 (clang-1500.3.9.4)") ... hw/arm/smmu-common.c:203:43: error: static function 'smmu_hash_remove_by_vmid' is used in an inline function with external linkage [-Werror,-Wstatic-in-inline] g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid); ^ include/hw/arm/smmu-common.h:197:1: note: use 'static' to give inline function 'smmu_iotlb_inv_vmid' internal linkage void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid); ^ static hw/arm/smmu-common.c:139:17: note: 'smmu_hash_remove_by_vmid' declared here static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value, ^ Fixes: ccc3ee3871 ("hw/arm/smmuv3: Add CMDs related to stage-2") Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Reviewed-by: Eric Auger Message-Id: <20240313184954.42513-2-phi...@linaro.org> --- hw/arm/smmu-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 4caedb4998..c4b540656c 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -197,7 +197,7 @@ void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid) g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid); } -inline void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid) +void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid) { trace_smmu_iotlb_inv_vmid(vmid); g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid); -- 2.41.0
[PULL 00/15] Misc HW patches for 2024-04-02
The following changes since commit 7fcf7575f3d201fc84ae168017ffdfd6c86257a6: Merge tag 'pull-target-arm-20240402' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-04-02 11:34:49 +0100) are available in the Git repository at: https://github.com/philmd/qemu.git tags/hw-misc-20240402 for you to fetch changes up to 4c54f5bc8e1d38f15cc35b6a6932d8fbe219c692: hw/net/virtio-net: fix qemu set used ring flag even vhost started (2024-04-02 16:15:07 +0200) Misc HW patch queue - MAINTAINERS updates (Zhao, Glenn) - Replace incorrect mentions of 'softmmu' by 'system' (Phil) - Avoid using inlined functions with external linkage (Phil) - Restrict CXL to x86 PC PCI-based machines (Phil) - Remove unnecessary NULL check in bdrv_pad_request (Kevin) - Fix a pair of -Werror=maybe-uninitialized (Marc-André) - Initialize variable in xen_evtchn_soft_reset (Artem) - Do not access virtio-net tx queue until vhost is started (Yajun) Artem Chernyshev (1): hw/xen_evtchn: Initialize flush_kvm_routes Glenn Miles (1): gpio/pca955x: Update maintainer email address Kevin Wolf (1): block: Remove unnecessary NULL check in bdrv_pad_request() Marc-André Lureau (2): plugins: fix -Werror=maybe-uninitialized false-positive hw/nvme: fix -Werror=maybe-uninitialized Philippe Mathieu-Daudé (8): accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition gdbstub: Correct invalid mentions of 'softmmu' by 'system' gdbstub/system: Rename 'user_ctx' argument as 'ctx' target/ppc: Rename init_excp_4xx_softmmu() -> init_excp_4xx() hw/arm/smmu: Avoid using inlined functions with external linkage again accel/hvf: Un-inline hvf_arch_supports_guest_debug() qtest/libqos: Reduce size_to_prdtl() declaration scope hw/i386/pc: Restrict CXL to PCI-based machines Yajun Wu (1): hw/net/virtio-net: fix qemu set used ring flag even vhost started Zhao Liu (1): MAINTAINERS: Fix error-report.c entry MAINTAINERS | 4 ++-- gdbstub/internals.h | 26 +- tests/qtest/libqos/ahci.h | 1 - accel/tcg/plugin-gen.c| 6 -- block/io.c| 2 +- gdbstub/system.c | 10 +- hw/arm/smmu-common.c | 2 +- hw/i386/kvm/xen_evtchn.c | 2 +- hw/i386/pc.c | 4 +++- hw/net/virtio-net.c | 4 hw/nvme/ctrl.c| 2 +- plugins/loader.c | 2 +- target/arm/hvf/hvf.c | 2 +- target/i386/hvf/hvf.c | 2 +- target/ppc/cpu_init.c | 4 ++-- tests/qtest/libqos/ahci.c | 2 +- 16 files changed, 37 insertions(+), 38 deletions(-) -- 2.41.0
[PULL 02/15] gdbstub: Correct invalid mentions of 'softmmu' by 'system'
Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Thomas Huth Reviewed-by: Richard Henderson Message-Id: <20240313213339.82071-3-phi...@linaro.org> --- gdbstub/internals.h | 20 ++-- gdbstub/system.c| 2 +- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/gdbstub/internals.h b/gdbstub/internals.h index e83b179920..66c939c67f 100644 --- a/gdbstub/internals.h +++ b/gdbstub/internals.h @@ -115,7 +115,7 @@ void gdb_read_byte(uint8_t ch); /* * Packet acknowledgement - we handle this slightly differently - * between user and softmmu mode, mainly to deal with the differences + * between user and system mode, mainly to deal with the differences * between the flexible chardev and the direct fd approaches. * * We currently don't support a negotiated QStartNoAckMode @@ -125,7 +125,7 @@ void gdb_read_byte(uint8_t ch); * gdb_got_immediate_ack() - check ok to continue * * Returns true to continue, false to re-transmit for user only, the - * softmmu stub always returns true. + * system stub always returns true. */ bool gdb_got_immediate_ack(void); /* utility helpers */ @@ -135,12 +135,12 @@ CPUState *gdb_first_attached_cpu(void); void gdb_append_thread_id(CPUState *cpu, GString *buf); int gdb_get_cpu_index(CPUState *cpu); unsigned int gdb_get_max_cpus(void); /* both */ -bool gdb_can_reverse(void); /* softmmu, stub for user */ +bool gdb_can_reverse(void); /* system emulation, stub for user */ int gdb_target_sigtrap(void); /* user */ void gdb_create_default_process(GDBState *s); -/* signal mapping, common for softmmu, specialised for user-mode */ +/* signal mapping, common for system, specialised for user-mode */ int gdb_signal_to_target(int sig); int gdb_target_signal_to_gdb(int sig); @@ -157,12 +157,12 @@ void gdb_continue(void); int gdb_continue_partial(char *newstates); /* - * Helpers with separate softmmu and user implementations + * Helpers with separate system and user implementations */ void gdb_put_buffer(const uint8_t *buf, int len); /* - * Command handlers - either specialised or softmmu or user only + * Command handlers - either specialised or system or user only */ void gdb_init_gdbserver_state(void); @@ -187,7 +187,7 @@ typedef union GdbCmdVariant { #define get_param(p, i)(&g_array_index(p, GdbCmdVariant, i)) -void gdb_handle_query_rcmd(GArray *params, void *user_ctx); /* softmmu */ +void gdb_handle_query_rcmd(GArray *params, void *user_ctx); /* system */ void gdb_handle_query_offsets(GArray *params, void *user_ctx); /* user */ void gdb_handle_query_xfer_auxv(GArray *params, void *user_ctx); /*user */ void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx); /*user */ @@ -203,7 +203,7 @@ bool gdb_handle_detach_user(uint32_t pid); /* user */ void gdb_handle_query_attached(GArray *params, void *user_ctx); /* both */ -/* softmmu only */ +/* system only */ void gdb_handle_query_qemu_phy_mem_mode(GArray *params, void *user_ctx); void gdb_handle_set_qemu_phy_mem_mode(GArray *params, void *user_ctx); @@ -213,11 +213,11 @@ bool gdb_handled_syscall(void); void gdb_disable_syscalls(void); void gdb_syscall_reset(void); -/* user/softmmu specific syscall handling */ +/* user/system specific syscall handling */ void gdb_syscall_handling(const char *syscall_packet); /* - * Break/Watch point support - there is an implementation for softmmu + * Break/Watch point support - there is an implementation for system * and user mode. */ bool gdb_supports_guest_debug(void); diff --git a/gdbstub/system.c b/gdbstub/system.c index 83fd452800..a3ce384cd1 100644 --- a/gdbstub/system.c +++ b/gdbstub/system.c @@ -1,5 +1,5 @@ /* - * gdb server stub - softmmu specific bits + * gdb server stub - system specific bits * * Debug integration depends on support from the individual * accelerators so most of this involves calling the ops helpers. -- 2.41.0
Re: [PATCH] virtio-net: fix qemu set used ring flag even vhost started
On 2/4/24 14:41, Yajun Wu wrote: On 4/2/2024 8:11 PM, Philippe Mathieu-Daudé wrote: External email: Use caution opening links or attachments Hi Yajun, On 2/4/24 06:51, Yajun Wu wrote: When vhost-user or vhost-kernel is handling virtio net datapath, qemu "QEMU" Ack. should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by qemu in "QEMU" Ack. following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..8035e01fdf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; + if (n->vhost_started) { Since you mentioned "in a very rare case", maybe use unlikely()? Ack. Thanks, queued squashing: -if (n->vhost_started) { +if (unlikely(n->vhost_started)) { return; } + return; + } + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return;
Re: [PULL v2 0/4] Trivial patches for 2024-04-02
On Tue, 2 Apr 2024 at 11:42, Michael Tokarev wrote: > > The following changes since commit 6af9d12c88b9720f209912f6e4b01fefe5906d59: > > Merge tag 'migration-20240331-pull-request' of > https://gitlab.com/peterx/qemu into staging (2024-04-01 13:12:40 +0100) > > are available in the Git repository at: > > https://gitlab.com/mjt0k/qemu.git tags/pull-trivial-patches > > for you to fetch changes up to 7805132bc30b2619355b10bbfb67217ac838c677: > > hmp: Add help information for watchdog action: inject-nmi (2024-04-02 > 13:38:51 +0300) > > > trivial patches for 2024-04-02 > > spelling fixes for the release, minor doc fixes and usb-audio fix. > > v2: fix Stefan Weil email > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes. -- PMM
Re: Patch for qemu-project/qemu#2247 issue
On 2/4/24 11:59, Michael Tokarev wrote: 02.04.2024 12:50, Philippe Mathieu-Daudé пишет: On 1/4/24 18:52, Michael Tokarev wrote: 01.04.2024 12:43, liu.d...@zte.com.cn wrote: hmp: Add help information for watchdog action: inject-nmi virsh qemu-monitor-command --hmp help information of watchdog_action missing inject-nmi which already supported in Commit 795dc6e4 Signed-off-by: Dayu Liu Applied to trivial-patches tree, in the following form: Author: Dayu Liu Date: Mon Apr 1 17:43:55 2024 +0800 hmp: Add help information for watchdog action: inject-nmi virsh qemu-monitor-command --hmp help information of watchdog_action missing inject-nmi which already supported in Commit 795dc6e4 Fixes: 795dc6e46d ("watchdog: Add new Virtual Watchdog action INJECT-NMI") I don't think that commit is broken and needs Fixing. I see your point though - to have more formal way to mark "related" commits, it isn't always fixing something. I sent a pullreq for this a couple hours ago anyway. No worries ;)
Re: [PATCH] sh4: mac.w: memory accesses are 16-bit words
On 2/4/24 11:37, Zack Buhman wrote: Before this change, executing a code sequence such as: mova tblm,r0 movr0,r1 mova tbln,r0 clrs clrmac mac.w @r0+,@r1+ mac.w @r0+,@r1+ .align 4 tblm:.word 0x1234 .word 0x5678 tbln:.word 0x9abc .word 0xdefg Does not result in correct behavior: Expected behavior: first macw : macl = 0x1234 * 0x9abc + 0x0 mach = 0x0 second macw: macl = 0x5678 * 0xdefg + 0xb00a630 mach = 0x0 Observed behavior (qemu-sh4eb, prior to this commit): first macw : macl = 0x5678 * 0xdefg + 0x0 mach = 0x0 second macw: (unaligned longword memory access, SIGBUS) Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory access, not a 32-bit longword memory access. Signed-off-by: Zack Buhman --- target/sh4/translate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/sh4/translate.c b/target/sh4/translate.c index a9b1bc7524..6643c14dde 100644 --- a/target/sh4/translate.c +++ b/target/sh4/translate.c @@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx) TCGv arg0, arg1; arg0 = tcg_temp_new(); tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, -MO_TESL | MO_ALIGN); +MO_TESW | MO_ALIGN); arg1 = tcg_temp_new(); tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, -MO_TESL | MO_ALIGN); +MO_TESW | MO_ALIGN); Apparently invalid since its introduction in commit fdf9b3e831. Reviewed-by: Philippe Mathieu-Daudé gen_helper_macw(tcg_env, arg0, arg1); tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2); tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);
[PULL 2/7] vga: move computation of dirty memory region later
Move the computation of region_start and region_end after the value of "bits" is known. This makes it possible to distinguish modes that support horizontal pel panning from modes that do not. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 50 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/hw/display/vga.c b/hw/display/vga.c index 4795a0012e2..b4ceff70eb8 100644 --- a/hw/display/vga.c +++ b/hw/display/vga.c @@ -1501,31 +1501,6 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) disp_width = width; depth = s->get_bpp(s); -region_start = (s->params.start_addr * 4); -region_end = region_start + (ram_addr_t)s->params.line_offset * height; -region_end += width * depth / 8; /* scanline length */ -region_end -= s->params.line_offset; -if (region_end > s->vbe_size || depth == 0 || depth == 15) { -/* - * We land here on: - * - wraps around (can happen with cirrus vbe modes) - * - depth == 0 (256 color palette video mode) - * - depth == 15 - * - * Take the safe and slow route: - * - create a dirty bitmap snapshot for all vga memory. - * - force shadowing (so all vga memory access goes - * through vga_read_*() helpers). - * - * Given this affects only vga features which are pretty much - * unused by modern guests there should be no performance - * impact. - */ -region_start = 0; -region_end = s->vbe_size; -force_shadow = true; -} - /* bits 5-6: 0 = 16-color mode, 1 = 4-color mode, 2 = 256-color mode. */ shift_control = (s->gr[VGA_GFX_MODE] >> 5) & 3; double_scan = (s->cr[VGA_CRTC_MAX_SCAN] >> 7); @@ -1597,6 +1572,31 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) } } +region_start = (s->params.start_addr * 4); +region_end = region_start + (ram_addr_t)s->params.line_offset * height; +region_end += width * depth / 8; /* scanline length */ +region_end -= s->params.line_offset; +if (region_end > s->vbe_size || depth == 0 || depth == 15) { +/* + * We land here on: + * - wraps around (can happen with cirrus vbe modes) + * - depth == 0 (256 color palette video mode) + * - depth == 15 + * + * Take the safe and slow route: + * - create a dirty bitmap snapshot for all vga memory. + * - force shadowing (so all vga memory access goes + * through vga_read_*() helpers). + * + * Given this affects only vga features which are pretty much + * unused by modern guests there should be no performance + * impact. + */ +region_start = 0; +region_end = s->vbe_size; +force_shadow = true; +} + /* * Check whether we can share the surface with the backend * or whether we need a shadow surface. We share native -- 2.44.0
[PULL 1/7] vga: merge conditionals on shift control register
There are two sets of conditionals using the shift control bits: one to verify the palette and adjust disp_width, one to compute the "v" and "bits" variables. Merge them into one, with the extra benefit that we now have the "bits" value available early and can use it to compute region_end. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 89 +++- 1 file changed, 42 insertions(+), 47 deletions(-) diff --git a/hw/display/vga.c b/hw/display/vga.c index bc5b83421bf..4795a0012e2 100644 --- a/hw/display/vga.c +++ b/hw/display/vga.c @@ -1546,12 +1546,54 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) } if (shift_control == 0) { +full_update |= update_palette16(s); if (sr(s, VGA_SEQ_CLOCK_MODE) & 8) { disp_width <<= 1; +v = VGA_DRAW_LINE4D2; +} else { +v = VGA_DRAW_LINE4; } +bits = 4; + } else if (shift_control == 1) { +full_update |= update_palette16(s); if (sr(s, VGA_SEQ_CLOCK_MODE) & 8) { disp_width <<= 1; +v = VGA_DRAW_LINE2D2; +} else { +v = VGA_DRAW_LINE2; +} +bits = 4; + +} else { +switch (depth) { +default: +case 0: +full_update |= update_palette256(s); +v = VGA_DRAW_LINE8D2; +bits = 4; +break; +case 8: +full_update |= update_palette256(s); +v = VGA_DRAW_LINE8; +bits = 8; +break; +case 15: +v = s->big_endian_fb ? VGA_DRAW_LINE15_BE : VGA_DRAW_LINE15_LE; +bits = 16; +break; +case 16: +v = s->big_endian_fb ? VGA_DRAW_LINE16_BE : VGA_DRAW_LINE16_LE; +bits = 16; +break; +case 24: +v = s->big_endian_fb ? VGA_DRAW_LINE24_BE : VGA_DRAW_LINE24_LE; +bits = 24; +break; +case 32: +v = s->big_endian_fb ? VGA_DRAW_LINE32_BE : VGA_DRAW_LINE32_LE; +bits = 32; +break; } } @@ -1607,53 +1649,6 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) } } -if (shift_control == 0) { -full_update |= update_palette16(s); -if (sr(s, VGA_SEQ_CLOCK_MODE) & 8) { -v = VGA_DRAW_LINE4D2; -} else { -v = VGA_DRAW_LINE4; -} -bits = 4; -} else if (shift_control == 1) { -full_update |= update_palette16(s); -if (sr(s, VGA_SEQ_CLOCK_MODE) & 8) { -v = VGA_DRAW_LINE2D2; -} else { -v = VGA_DRAW_LINE2; -} -bits = 4; -} else { -switch(s->get_bpp(s)) { -default: -case 0: -full_update |= update_palette256(s); -v = VGA_DRAW_LINE8D2; -bits = 4; -break; -case 8: -full_update |= update_palette256(s); -v = VGA_DRAW_LINE8; -bits = 8; -break; -case 15: -v = s->big_endian_fb ? VGA_DRAW_LINE15_BE : VGA_DRAW_LINE15_LE; -bits = 16; -break; -case 16: -v = s->big_endian_fb ? VGA_DRAW_LINE16_BE : VGA_DRAW_LINE16_LE; -bits = 16; -break; -case 24: -v = s->big_endian_fb ? VGA_DRAW_LINE24_BE : VGA_DRAW_LINE24_LE; -bits = 24; -break; -case 32: -v = s->big_endian_fb ? VGA_DRAW_LINE32_BE : VGA_DRAW_LINE32_LE; -bits = 32; -break; -} -} vga_draw_line = vga_draw_line_table[v]; if (!is_buffer_shared(surface) && s->cursor_invalidate) { -- 2.44.0
Re: [PATCH] hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabled
On 4/2/2024 6:02 PM, Michael S. Tsirkin wrote: On Tue, Apr 02, 2024 at 04:25:16AM -0400, Xiaoyao Li wrote: Set MADT.FLAGS[bit 0].PCAT_COMPAT based on x86ms->pic. Signed-off-by: Xiaoyao Li Please include more info in the commit log: what is the behaviour you observe, why it is wrong, how does the patch fix it, what is guest behaviour before and after. Sorry, I thought it was straightforward. A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When PIC is not enabled for x86 machine, the PCAT_COMPAT bit needs to be cleared. Otherwise, the guest thinks there is a present PIC even it is booted with pic=off on QEMU. (I haven't seen real issue from Linux guest. The user of PIC inside guest seems only the pit calibration. Whether pit calibration is triggered depends on other things. But logically, current code is wrong, we need to fix it anyway. @Isaku, please share more info if you have) The commit log and the subject should not repeat what the diff already states. --- hw/i386/acpi-common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c index 20f19269da40..0cc2919bb851 100644 --- a/hw/i386/acpi-common.c +++ b/hw/i386/acpi-common.c @@ -107,7 +107,9 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker, acpi_table_begin(&table, table_data); /* Local APIC Address */ build_append_int_noprefix(table_data, APIC_DEFAULT_ADDRESS, 4); -build_append_int_noprefix(table_data, 1 /* PCAT_COMPAT */, 4); /* Flags */ +/* Flags. bit 0: PCAT_COMPAT */ +build_append_int_noprefix(table_data, + x86ms->pic != ON_OFF_AUTO_OFF ? 1 : 0 , 4); for (i = 0; i < apic_ids->len; i++) { pc_madt_cpu_entry(i, apic_ids, table_data, false); -- 2.34.1
[PULL 4/7] vga: do not treat horiz pel panning value of 8 as "enabled"
Horizontal pel panning bit 3 is only used in text mode. In graphics mode, it can be treated as if it was zero, thus not extending the dirty memory region. Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/display/vga.c b/hw/display/vga.c index 40acd19e72a..77f59e8c113 100644 --- a/hw/display/vga.c +++ b/hw/display/vga.c @@ -1571,7 +1571,9 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) break; } } -hpel = bits <= 8 ? s->params.hpel : 0; + +/* Horizontal pel panning bit 3 is only used in text mode. */ +hpel = bits <= 8 ? s->params.hpel & 7 : 0; region_start = (s->params.start_addr * 4); region_end = region_start + (ram_addr_t)s->params.line_offset * height; -- 2.44.0
[PULL 3/7] vga: adjust dirty memory region if pel panning is active
When pel panning is active, one more byte is read from each of the VGA memory planes. This has to be accounted in the computation of region_end, otherwise vga_draw_graphic() fails an assertion: qemu-system-i386: ../system/physmem.c:946: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed. Reported-by: Helge Konetzka Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2244 Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/hw/display/vga.c b/hw/display/vga.c index b4ceff70eb8..40acd19e72a 100644 --- a/hw/display/vga.c +++ b/hw/display/vga.c @@ -1571,11 +1571,15 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) break; } } +hpel = bits <= 8 ? s->params.hpel : 0; region_start = (s->params.start_addr * 4); region_end = region_start + (ram_addr_t)s->params.line_offset * height; region_end += width * depth / 8; /* scanline length */ region_end -= s->params.line_offset; +if (hpel) { +region_end += 4; +} if (region_end > s->vbe_size || depth == 0 || depth == 15) { /* * We land here on: @@ -1660,7 +1664,6 @@ static void vga_draw_graphic(VGACommonState *s, int full_update) width, height, v, line_offset, s->cr[9], s->cr[VGA_CRTC_MODE], s->params.line_compare, sr(s, VGA_SEQ_CLOCK_MODE)); #endif -hpel = bits <= 8 ? s->params.hpel : 0; addr1 = (s->params.start_addr * 4); bwidth = DIV_ROUND_UP(width * bits, 8); if (hpel) { -- 2.44.0
[PULL 0/7] lsi, vga fixes for 2024-04-02
The following changes since commit b9dbf6f9bf533564f6a4277d03906fcd32bb0245: Merge tag 'pull-tcg-20240329' of https://gitlab.com/rth7680/qemu into staging (2024-03-30 14:54:57 +) are available in the Git repository at: https://gitlab.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to eac4af186f6db46fc90ec571a855bd6fa4cb7841: pc_q35: remove unnecessary m->alias assignment (2024-04-02 15:14:02 +0200) * lsi53c895a: fix assertion failure with invalid Block Move * vga: fix assertion failure with 4- and 16-color modes * remove unnecessary assignment Paolo Bonzini (7): vga: merge conditionals on shift control register vga: move computation of dirty memory region later vga: adjust dirty memory region if pel panning is active vga: do not treat horiz pel panning value of 8 as "enabled" lsi53c895a: avoid out of bounds access to s->msg[] lsi53c895a: detect invalid Block Move instruction pc_q35: remove unnecessary m->alias assignment hw/display/vga.c | 146 +-- hw/i386/pc_q35.c | 1 - hw/scsi/lsi53c895a.c | 28 +++--- 3 files changed, 94 insertions(+), 81 deletions(-) -- 2.44.0
[PULL 6/7] lsi53c895a: detect invalid Block Move instruction
The spec for the lsi53c895a says: "If the instruction is a Block Move and a value of 0x00 is loaded into the DBC register, an illegal instruction interrupt occurs if the LSI53C895A is not in target mode, Command phase". Because QEMU only operates in initiator mode, generate the interrupt unconditionally if the low 24 bits are 0x00. Reported-by: Chuhong Yuan Signed-off-by: Paolo Bonzini --- hw/scsi/lsi53c895a.c | 9 + 1 file changed, 9 insertions(+) diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index eb9828dd5ef..1e18d88983b 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -1205,6 +1205,15 @@ again: break; } s->dbc = insn & 0xff; +if (!s->dbc) { +/* + * If the instruction is a Block Move and a value of 0x00 is + * loaded into the DBC register, an illegal instruction interrupt + * occurs if the LSI53C895A is not in target mode, Command phase. + */ +lsi_script_dma_interrupt(s, LSI_DSTAT_IID); +break; +} s->rbc = s->dbc; /* ??? Set ESA. */ s->ia = s->dsp - 8; -- 2.44.0
[PULL 7/7] pc_q35: remove unnecessary m->alias assignment
The assignment is already inherited from pc-q35-8.2. Signed-off-by: Paolo Bonzini --- hw/i386/pc_q35.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index b5922b44afa..c7bc8a2041f 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -393,7 +393,6 @@ static void pc_q35_8_1_machine_options(MachineClass *m) { PCMachineClass *pcmc = PC_MACHINE_CLASS(m); pc_q35_8_2_machine_options(m); -m->alias = NULL; pcmc->broken_32bit_mem_addr_check = true; compat_props_add(m->compat_props, hw_compat_8_1, hw_compat_8_1_len); compat_props_add(m->compat_props, pc_compat_8_1, pc_compat_8_1_len); -- 2.44.0
[PULL 5/7] lsi53c895a: avoid out of bounds access to s->msg[]
If no bytes are there to process in the message in phase, the input data latch (s->sidl) is set to s->msg[-1]. Just do nothing since no DMA is performed. Reported-by: Chuhong Yuan Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- hw/scsi/lsi53c895a.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index 71f759a59dd..eb9828dd5ef 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -927,13 +927,18 @@ static void lsi_do_msgin(LSIState *s) assert(len > 0 && len <= LSI_MAX_MSGIN_LEN); if (len > s->dbc) len = s->dbc; -pci_dma_write(PCI_DEVICE(s), s->dnad, s->msg, len); -/* Linux drivers rely on the last byte being in the SIDL. */ -s->sidl = s->msg[len - 1]; -s->msg_len -= len; -if (s->msg_len) { -memmove(s->msg, s->msg + len, s->msg_len); -} else { + +if (len) { +pci_dma_write(PCI_DEVICE(s), s->dnad, s->msg, len); +/* Linux drivers rely on the last byte being in the SIDL. */ +s->sidl = s->msg[len - 1]; +s->msg_len -= len; +if (s->msg_len) { +memmove(s->msg, s->msg + len, s->msg_len); +} +} + +if (!s->msg_len) { /* ??? Check if ATN (not yet implemented) is asserted and maybe switch to PHASE_MO. */ switch (s->msg_action) { -- 2.44.0
[PATCH] Makefile: preserve --jobserver-auth argument when calling ninja
Qemu wraps its call to ninja in a Makefile. Since ninja, as opposed to make, utilizes all CPU cores by default, the qemu Makefile translates the absense of a `-jN` argument into `-j1`. This breaks jobserver functionality, so update the -jN mangling to take the --jobserver-auth argument into considerationa too. Signed-off-by: Martin Hundebøll --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 8f36990335..183756018f 100644 --- a/Makefile +++ b/Makefile @@ -142,7 +142,7 @@ MAKE.k = $(findstring k,$(firstword $(filter-out --%,$(MAKEFLAGS MAKE.q = $(findstring q,$(firstword $(filter-out --%,$(MAKEFLAGS MAKE.nq = $(if $(word 2, $(MAKE.n) $(MAKE.q)),nq) NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \ -$(filter-out -j, $(lastword -j1 $(filter -l% -j%, $(MAKEFLAGS \ +$(or $(filter -l% -j%, $(MAKEFLAGS)), $(if $(filter --jobserver-auth=%, $(MAKEFLAGS)),, -j1)) \ -d keepdepfile ninja-cmd-goals = $(or $(MAKECMDGOALS), all) ninja-cmd-goals += $(foreach g, $(MAKECMDGOALS), $(.ninja-goals.$g)) -- 2.44.0
[PATCH] Fix incorrect disassembly format for certain RISC-V instructions
* The immediate argument to lui/auipc should be an integer in the interval [0x0, 0xf]; e.g., 'auipc 0xf' and not 'auipc -1' * The floating-point rounding mode is the last operand to the function, not the first; e.g., 'fcvt.w.s a0, fa0, rtz' and not 'fcvt.w.s rtz, a0, fa0'. Note that fcvt.d.w[u] and fcvt.w[u].d are unaffected by the rounding mode and hence it is omitted from their disassembly. * When aq and rl are both present, they are not separated by a '.'; e.g., 'lr.d.aqrl' and not 'lr.d.aq.rl'. Based on the following assembly reference: https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md Signed-off-by: Simeon Krastnikov --- disas/riscv.c | 144 ++ disas/riscv.h | 10 ++-- 2 files changed, 79 insertions(+), 75 deletions(-) diff --git a/disas/riscv.c b/disas/riscv.c index e236c8b5b7..71a3ab878f 100644 --- a/disas/riscv.c +++ b/disas/riscv.c @@ -1311,98 +1311,98 @@ const rv_opcode_data rvi_opcode_data[] = { { "csrrci", rv_codec_i_csr, rv_fmt_rd_csr_zimm, NULL, 0, 0, 0 }, { "flw", rv_codec_i, rv_fmt_frd_offset_rs1, NULL, 0, 0, 0 }, { "fsw", rv_codec_s, rv_fmt_frs2_offset_rs1, NULL, 0, 0, 0 }, -{ "fmadd.s", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fmsub.s", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fnmsub.s", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fnmadd.s", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fadd.s", rv_codec_r_m, rv_fmt_rm_frd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fsub.s", rv_codec_r_m, rv_fmt_rm_frd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fmul.s", rv_codec_r_m, rv_fmt_rm_frd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fdiv.s", rv_codec_r_m, rv_fmt_rm_frd_frs1_frs2, NULL, 0, 0, 0 }, +{ "fmadd.s", rv_codec_r4_m, rv_fmt_frd_frs1_frs2_frs3_rm, NULL, 0, 0, 0 }, +{ "fmsub.s", rv_codec_r4_m, rv_fmt_frd_frs1_frs2_frs3_rm, NULL, 0, 0, 0 }, +{ "fnmsub.s", rv_codec_r4_m, rv_fmt_frd_frs1_frs2_frs3_rm, NULL, 0, 0, 0 }, +{ "fnmadd.s", rv_codec_r4_m, rv_fmt_frd_frs1_frs2_frs3_rm, NULL, 0, 0, 0 }, +{ "fadd.s", rv_codec_r_m, rv_fmt_frd_frs1_frs2_rm, NULL, 0, 0, 0 }, +{ "fsub.s", rv_codec_r_m, rv_fmt_frd_frs1_frs2_rm, NULL, 0, 0, 0 }, +{ "fmul.s", rv_codec_r_m, rv_fmt_frd_frs1_frs2_rm, NULL, 0, 0, 0 }, +{ "fdiv.s", rv_codec_r_m, rv_fmt_frd_frs1_frs2_rm, NULL, 0, 0, 0 }, { "fsgnj.s", rv_codec_r, rv_fmt_frd_frs1_frs2, rvcp_fsgnj_s, 0, 0, 0 }, { "fsgnjn.s", rv_codec_r, rv_fmt_frd_frs1_frs2, rvcp_fsgnjn_s, 0, 0, 0 }, { "fsgnjx.s", rv_codec_r, rv_fmt_frd_frs1_frs2, rvcp_fsgnjx_s, 0, 0, 0 }, { "fmin.s", rv_codec_r, rv_fmt_frd_frs1_frs2, NULL, 0, 0, 0 }, { "fmax.s", rv_codec_r, rv_fmt_frd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fsqrt.s", rv_codec_r_m, rv_fmt_rm_frd_frs1, NULL, 0, 0, 0 }, +{ "fsqrt.s", rv_codec_r_m, rv_fmt_frd_frs1_rm, NULL, 0, 0, 0 }, { "fle.s", rv_codec_r, rv_fmt_rd_frs1_frs2, NULL, 0, 0, 0 }, { "flt.s", rv_codec_r, rv_fmt_rd_frs1_frs2, NULL, 0, 0, 0 }, { "feq.s", rv_codec_r, rv_fmt_rd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fcvt.w.s", rv_codec_r_m, rv_fmt_rm_rd_frs1, NULL, 0, 0, 0 }, -{ "fcvt.wu.s", rv_codec_r_m, rv_fmt_rm_rd_frs1, NULL, 0, 0, 0 }, -{ "fcvt.s.w", rv_codec_r_m, rv_fmt_rm_frd_rs1, NULL, 0, 0, 0 }, -{ "fcvt.s.wu", rv_codec_r_m, rv_fmt_rm_frd_rs1, NULL, 0, 0, 0 }, +{ "fcvt.w.s", rv_codec_r_m, rv_fmt_rd_frs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.wu.s", rv_codec_r_m, rv_fmt_rd_frs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.s.w", rv_codec_r_m, rv_fmt_frd_rs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.s.wu", rv_codec_r_m, rv_fmt_frd_rs1_rm, NULL, 0, 0, 0 }, { "fmv.x.s", rv_codec_r, rv_fmt_rd_frs1, NULL, 0, 0, 0 }, { "fclass.s", rv_codec_r, rv_fmt_rd_frs1, NULL, 0, 0, 0 }, { "fmv.s.x", rv_codec_r, rv_fmt_frd_rs1, NULL, 0, 0, 0 }, -{ "fcvt.l.s", rv_codec_r_m, rv_fmt_rm_rd_frs1, NULL, 0, 0, 0 }, -{ "fcvt.lu.s", rv_codec_r_m, rv_fmt_rm_rd_frs1, NULL, 0, 0, 0 }, -{ "fcvt.s.l", rv_codec_r_m, rv_fmt_rm_frd_rs1, NULL, 0, 0, 0 }, -{ "fcvt.s.lu", rv_codec_r_m, rv_fmt_rm_frd_rs1, NULL, 0, 0, 0 }, +{ "fcvt.l.s", rv_codec_r_m, rv_fmt_rd_frs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.lu.s", rv_codec_r_m, rv_fmt_rd_frs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.s.l", rv_codec_r_m, rv_fmt_frd_rs1_rm, NULL, 0, 0, 0 }, +{ "fcvt.s.lu", rv_codec_r_m, rv_fmt_frd_rs1_rm, NULL, 0, 0, 0 }, { "fld", rv_codec_i, rv_fmt_frd_offset_rs1, NULL, 0, 0, 0 }, { "fsd", rv_codec_s, rv_fmt_frs2_offset_rs1, NULL, 0, 0, 0 }, -{ "fmadd.d", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fmsub.d", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fnmsub.d", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fnmadd.d", rv_codec_r4_m, rv_fmt_rm_frd_frs1_frs2_frs3, NULL, 0, 0, 0 }, -{ "fadd.d", rv_codec_r_m, rv_fmt_rm_frd_frs1_frs2, NULL, 0, 0, 0 }, -{ "fsub.d", rv_cod
[PATCH] sh4: mac.w: memory accesses are 16-bit words
Before this change, executing a code sequence such as: mova tblm,r0 movr0,r1 mova tbln,r0 clrs clrmac mac.w @r0+,@r1+ mac.w @r0+,@r1+ .align 4 tblm:.word 0x1234 .word 0x5678 tbln:.word 0x9abc .word 0xdefg Does not result in correct behavior: Expected behavior: first macw : macl = 0x1234 * 0x9abc + 0x0 mach = 0x0 second macw: macl = 0x5678 * 0xdefg + 0xb00a630 mach = 0x0 Observed behavior (qemu-sh4eb, prior to this commit): first macw : macl = 0x5678 * 0xdefg + 0x0 mach = 0x0 second macw: (unaligned longword memory access, SIGBUS) Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory access, not a 32-bit longword memory access. Signed-off-by: Zack Buhman --- target/sh4/translate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/sh4/translate.c b/target/sh4/translate.c index a9b1bc7524..6643c14dde 100644 --- a/target/sh4/translate.c +++ b/target/sh4/translate.c @@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx) TCGv arg0, arg1; arg0 = tcg_temp_new(); tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, -MO_TESL | MO_ALIGN); +MO_TESW | MO_ALIGN); arg1 = tcg_temp_new(); tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, -MO_TESL | MO_ALIGN); +MO_TESW | MO_ALIGN); gen_helper_macw(tcg_env, arg0, arg1); tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2); tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2); -- 2.41.0
Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER
Am 29.03.2024 um 04:45 hat Shaoqin Huang geschrieben: > Hi Daniel, > > On 3/25/24 16:55, Daniel P. Berrangé wrote: > > On Mon, Mar 25, 2024 at 01:35:58PM +0800, Shaoqin Huang wrote: > > > Hi Daniel, > > > > > > Thanks for your reviewing. I see your comments in the v7. > > > > > > I have some doubts about what you said about the QAPI. Do you want me to > > > convert the current design into the QAPI parsing like the > > > IOThreadVirtQueueMapping? And we need to add new json definition in the > > > qapi/ directory? > > I have defined the QAPI for kvm-pmu-filter like below: > > +## > +# @FilterAction: > +# > +# The Filter Action > +# > +# @a: Allow > +# > +# @d: Disallow > +# > +# Since: 9.0 > +## > +{ 'enum': 'FilterAction', > + 'data': [ 'a', 'd' ] } > + > +## > +# @SingleFilter: > +# > +# Lazy > +# > +# @action: the action > +# > +# @start: the start > +# > +# @end: the end > +# > +# Since: 9.0 > +## > + > +{ 'struct': 'SingleFilter', > + 'data': { 'action': 'FilterAction', 'start': 'int', 'end': 'int' } } > + > +## > +# @KVMPMUFilter: > +# > +# Lazy > +# > +# @filter: the filter > +# > +# Since: 9.0 > +## > + > +{ 'struct': 'KVMPMUFilter', > + 'data': { 'filter': ['SingleFilter'] }} > > And I guess I can use it by adding code like below: > > --- a/hw/core/qdev-properties-system.c > +++ b/hw/core/qdev-properties-system.c > @@ -1206,3 +1206,35 @@ const PropertyInfo qdev_prop_iothread_vq_mapping_list > = { > .set = set_iothread_vq_mapping_list, > .release = release_iothread_vq_mapping_list, > }; > + > +/* --- kvm-pmu-filter ---*/ > + > +static void get_kvm_pmu_filter(Object *obj, Visitor *v, > +const char *name, void *opaque, Error **errp) > +{ > +KVMPMUFilter **prop_ptr = object_field_prop_ptr(obj, opaque); > + > +visit_type_KVMPMUFilter(v, name, prop_ptr, errp); > +} > + > +static void set_kvm_pmu_filter(Object *obj, Visitor *v, > +const char *name, void *opaque, Error **errp) > +{ > +KVMPMUFilter **prop_ptr = object_field_prop_ptr(obj, opaque); > +KVMPMUFilter *list; > + > +printf("running the %s\n", __func__); > +if (!visit_type_KVMPMUFilter(v, name, &list, errp)) { > +return; > +} > + > +printf("The name is %s\n", name); > +*prop_ptr = list; > +} > + > +const PropertyInfo qdev_prop_kvm_pmu_filter = { > +.name = "KVMPMUFilter", > +.description = "der der", > +.get = get_kvm_pmu_filter, > +.set = set_kvm_pmu_filter, > +}; > > +#define DEFINE_PROP_KVM_PMU_FILTER(_name, _state, _field) \ > +DEFINE_PROP(_name, _state, _field, qdev_prop_kvm_pmu_filter, \ > +KVMPMUFilter *) > > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -2439,6 +2441,7 @@ static Property arm_cpu_properties[] = { > mp_affinity, ARM64_AFFINITY_INVALID), > DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID), > DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1), > +DEFINE_PROP_KVM_PMU_FILTER("kvm-pmu-filter", ARMCPU, kvm_pmu_filter), > DEFINE_PROP_END_OF_LIST() > }; > > And I guess I can use the new json format input like below: > > qemu-system-aarch64 \ > -cpu host, '{"filter": [{"action": "a", "start": 0x10, "end": "0x11"}]}' > > But it doesn't work. It seems like because the -cpu option doesn't > support json format parameter. > > Maybe I'm wrong. So I want to double check with if the -cpu option > support json format nowadays? As far as I can see, -cpu doesn't support JSON yet. But even if it did, your command line would be invalid because the 'host,' part isn't JSON. > If the -cpu option doesn't support json format, how I can use the QAPI > for kvm-pmu-filter property? This would probably mean QAPIfying all CPUs first, which sounds like a major effort. Kevin
Re: [PATCH 1/1] migration/multifd: solve zero page causing multiple page faults
Yuan Liu writes: > Implemented recvbitmap tracking of received pages in multifd. > > If the zero page appears for the first time in the recvbitmap, this > page is not checked and set. > > If the zero page has already appeared in the recvbitmap, there is no > need to check the data but directly set the data to 0, because it is > unlikely that the zero page will be migrated multiple times. > > Signed-off-by: Yuan Liu Reviewed-by: Fabiano Rosas
Re: [PATCH] virtio-net: fix qemu set used ring flag even vhost started
On 4/2/2024 8:11 PM, Philippe Mathieu-Daudé wrote: External email: Use caution opening links or attachments Hi Yajun, On 2/4/24 06:51, Yajun Wu wrote: When vhost-user or vhost-kernel is handling virtio net datapath, qemu "QEMU" Ack. should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by qemu in "QEMU" Ack. following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..8035e01fdf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; +if (n->vhost_started) { Since you mentioned "in a very rare case", maybe use unlikely()? Ack. +return; +} + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return;
Re: [PATCH] block: Remove unnecessary NULL check in bdrv_pad_request()
Am 02.04.2024 um 12:53 hat Philippe Mathieu-Daudé geschrieben: > On 27/3/24 20:27, Kevin Wolf wrote: > > Coverity complains that the check introduced in commit 3f934817 suggests > > that qiov could be NULL and we dereference it before reaching the check. > > In fact, all of the callers pass a non-NULL pointer, so just remove the > > misleading check. > > > > Resolves: Coverity CID 1542668 > > Signed-off-by: Kevin Wolf > > --- > > block/io.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Since I'm not seeing other block related patch for 9.0 and > I'm preparing a pull request, I'm queuing this one. Thanks, Phil. I didn't send a pull request because I didn't have anything else and silencing a Coverity false positive didn't seem urgent for 9.0, but it certainly doesn't hurt either. Kevin
Re: [PATCH] virtio-net: fix qemu set used ring flag even vhost started
Hi Yajun, On 2/4/24 06:51, Yajun Wu wrote: When vhost-user or vhost-kernel is handling virtio net datapath, qemu "QEMU" should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by qemu in "QEMU" following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. Signed-off-by: Yajun Wu Reviewed-by: Jiri Pirko --- hw/net/virtio-net.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a6ff000cd9..8035e01fdf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -2865,6 +2865,10 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; +if (n->vhost_started) { Since you mentioned "in a very rare case", maybe use unlikely()? +return; +} + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return;
Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
On Tue, Apr 2, 2024 at 8:19 AM Si-Wei Liu wrote: > > > > On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote: > > On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu wrote: > >> Hi Michael, > >> > >> On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote: > >>> On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote: > Hi Eugenio, > > I thought this new code looks good to me and the original issue I saw > with > x-svq=on should be gone. However, after rebase my tree on top of this, > there's a new failure I found around setting up guest mappings at early > boot, please see attached the specific QEMU config and corresponding > event > traces. Haven't checked into the detail yet, thinking you would need to > be > aware of ahead. > > Regards, > -Siwei > >>> Eugenio were you able to reproduce? Siwei did you have time to > >>> look into this? > >> Didn't get a chance to look into the detail yet in the past week, but > >> thought it may have something to do with the (internals of) iova tree > >> range allocation and the lookup routine. It started to fall apart at the > >> first vhost_vdpa_dma_unmap call showing up in the trace events, where it > >> should've gotten IOVA=0x201000, but an incorrect IOVA address > >> 0x1000 was ended up returning from the iova tree lookup routine. > >> > >> HVAGPAIOVA > >> - > >> Map > >> [0x7f7903e0, 0x7f7983e0)[0x0, 0x8000) [0x1000, 0x8000) > >> [0x7f7983e0, 0x7f9903e0)[0x1, 0x208000) > >> [0x80001000, 0x201000) > >> [0x7f7903ea, 0x7f7903ec)[0xfeda, 0xfedc) > >> [0x201000, 0x221000) > >> > >> Unmap > >> [0x7f7903ea, 0x7f7903ec)[0xfeda, 0xfedc) [0x1000, > >> 0x2) ??? > >> shouldn't it be [0x201000, > >> 0x221000) ??? > >> > It looks the SVQ iova tree lookup routine vhost_iova_tree_find_iova(), > which is called from vhost_vdpa_listener_region_del(), can't properly > deal with overlapped region. Specifically, q35's mch_realize() has the > following: > > 579 memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), > "smram-open-high", > 580 mch->ram_memory, > MCH_HOST_BRIDGE_SMRAM_C_BASE, > 581 MCH_HOST_BRIDGE_SMRAM_C_SIZE); > 582 memory_region_add_subregion_overlap(mch->system_memory, 0xfeda, > 583 &mch->open_high_smram, 1); > 584 memory_region_set_enabled(&mch->open_high_smram, false); > > #0 0x564c30bf6980 in iova_tree_find_address_iterator > (key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at > ../util/iova-tree.c:96 > #1 0x7f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0 > #2 0x564c30bf6b53 in iova_tree_find_iova (tree=, > map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114 > #3 0x564c309da0a9 in vhost_iova_tree_find_iova (tree= out>, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70 > #4 0x564c3085e49d in vhost_vdpa_listener_region_del > (listener=0x564c331024c8, section=0x7fffb6d74aa0) at > ../hw/virtio/vhost-vdpa.c:444 > #5 0x564c309f4931 in address_space_update_topology_pass > (as=as@entry=0x564c31ab1840 , > old_view=old_view@entry=0x564c33364cc0, > new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at > ../system/memory.c:977 > #6 0x564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840 > ) at ../system/memory.c:1079 > #7 0x564c309f86d0 in memory_region_transaction_commit () at > ../system/memory.c:1132 > #8 0x564c309f86d0 in memory_region_transaction_commit () at > ../system/memory.c:1117 > #9 0x564c307cce64 in mch_realize (d=, > errp=) at ../hw/pci-host/q35.c:584 > > However, it looks like iova_tree_find_address_iterator() only check if > the translated address (HVA) falls in to the range when trying to locate > the desired IOVA, causing the first DMAMap that happens to overlap in > the translated address (HVA) space to be returned prematurely: > > 89 static gboolean iova_tree_find_address_iterator(gpointer key, > gpointer value, > 90 gpointer data) > 91 { > : > : > 99 if (map->translated_addr + map->size < needle->translated_addr || > 100 needle->translated_addr + needle->size < map->translated_addr) { > 101 return false; > 102 } > 103 > 104 args->result = map; > 105 return true; > 106 } > > In the QEMU trace file, it reveals that the first DMAMap as below gets > returned incorrectly instead the second, the latter of which is what the > actual IOVA corresponds to: > > HVA GPA > IOVA > [0x7f7903e0, 0x7f7983e0)
[PATCH v2 1/2] scripts/checkpatch: Avoid author email mangled by qemu-*@nongnu.org
Commit f5177798d8 ("scripts: report on author emails that are mangled by the mailing list") added a check for qemu-devel@ list, extend the regexp to cover more such qemu-trivial@, qemu-block@ and qemu-ppc@. Signed-off-by: Philippe Mathieu-Daudé --- scripts/checkpatch.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 7026895074..12e9028b10 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -1573,7 +1573,7 @@ sub process { $is_patch = 1; } - if ($line =~ /^(Author|From): .* via .*/) { + if ($line =~ /^(Author|From): .* via .*/) { ERROR("Author email address is mangled by the mailing list\n" . $herecurr); } -- 2.41.0
[PATCH v2 2/2] scripts/checkpatch: Do not use mailmap
The .mailmap file fixes mistake we already did. Do not use it when running checkpatch.pl, otherwise we might commit the very same mistakes. Reported-by: Peter Maydell Signed-off-by: Philippe Mathieu-Daudé --- scripts/checkpatch.pl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 12e9028b10..76a0b79266 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -435,8 +435,8 @@ if ($chk_branch) { my @patches; my %git_commits = (); my $HASH; - open($HASH, "-|", "git", "log", "--reverse", "--no-merges", "--format=%H %s", $ARGV[0]) || - die "$P: git log --reverse --no-merges --format='%H %s' $ARGV[0] failed - $!\n"; + open($HASH, "-|", "git", "log", "--reverse", "--no-merges", "--no-mailmap", "--format=%H %s", $ARGV[0]) || + die "$P: git log --reverse --no-merges --no-mailmap --format='%H %s' $ARGV[0] failed - $!\n"; for my $line (<$HASH>) { $line =~ /^([0-9a-fA-F]{40,40}) (.*)$/; @@ -460,7 +460,7 @@ if ($chk_branch) { "-c", "diff.renamelimit=0", "-c", "diff.renames=True", "-c", "diff.algorithm=histogram", - "show", + "show", "--no-mailmap", "--patch-with-stat", $hash) || die "$P: git show $hash - $!\n"; while (<$FILE>) { -- 2.41.0
Re: [PATCH 1/2] scripts/checkpatch: Avoid author email mangled by qemu-trivial@
On Tue, 2 Apr 2024 at 12:30, Philippe Mathieu-Daudé wrote: > > Commit f5177798d8 ("scripts: report on author emails > that are mangled by the mailing list") added a check > for qemu-devel@ list, complete with qemu-trivial@. > > Signed-off-by: Philippe Mathieu-Daudé > --- > scripts/checkpatch.pl | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl > index 7026895074..4fe4cfd631 100755 > --- a/scripts/checkpatch.pl > +++ b/scripts/checkpatch.pl > @@ -1573,7 +1573,7 @@ sub process { > $is_patch = 1; > } > > - if ($line =~ /^(Author|From): .* via > .*/) { > + if ($line =~ /^(Author|From): .* via > .*/) { I recommend checking against qemu-.* rather than trying to capture explicitly all the suffixes we have. (For instance there's a line in mailmap for a commit that was attributed to qemu-block@, which this change still wouldn't catch.) thanks -- PMM
[PATCH v2 0/2] scripts/checkpatch: Do not use mailmap and check qemu-trivial@ author
Since v1: - Extend regexp to cover all lists (Peter) * qemu-trivial@ is not checked * mailmap hides the mistakes we want to catch See https://lore.kernel.org/qemu-devel/60faa39d-52e8-46f1-8bd9-9d9661794...@tls.msk.ru/ Philippe Mathieu-Daudé (2): scripts/checkpatch: Avoid author email mangled by qemu-*@nongnu.org scripts/checkpatch: Do not use mailmap scripts/checkpatch.pl | 8 1 file changed, 4 insertions(+), 4 deletions(-) -- 2.41.0
Re: [PULL 0/5] target-arm queue
On Tue, 2 Apr 2024 at 11:29, Peter Maydell wrote: > > Nothing exciting here: two minor bug fixes, some fixes for > running on a 32-bit host, and a docs tweak. > > thanks > -- PMM > > The following changes since commit 6af9d12c88b9720f209912f6e4b01fefe5906d59: > > Merge tag 'migration-20240331-pull-request' of > https://gitlab.com/peterx/qemu into staging (2024-04-01 13:12:40 +0100) > > are available in the Git repository at: > > https://git.linaro.org/people/pmaydell/qemu-arm.git > tags/pull-target-arm-20240402 > > for you to fetch changes up to 393770d7a02135e7468018f52da610712f151ec0: > > raspi4b: Reduce RAM to 1Gb on 32-bit hosts (2024-04-02 10:13:48 +0100) > > > target-arm queue: > * take HSTR traps of cp15 accesses to EL2, not EL1 > * docs: sbsa: update specs, add dt note > * hw/intc/arm_gicv3: ICC_HPPIR* return SPURIOUS if int group is disabled > * tests/qtest: Fix STM32L4x5 GPIO test on 32-bit > * raspi4b: Reduce RAM to 1Gb on 32-bit hosts Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes. -- PMM
Re: [PATCH 1/2] scripts/checkpatch: Avoid author email mangled by qemu-trivial@
On 2/4/24 13:53, Philippe Mathieu-Daudé wrote: On 2/4/24 13:52, Peter Maydell wrote: On Tue, 2 Apr 2024 at 12:30, Philippe Mathieu-Daudé wrote: Commit f5177798d8 ("scripts: report on author emails that are mangled by the mailing list") added a check for qemu-devel@ list, complete with qemu-trivial@. Signed-off-by: Philippe Mathieu-Daudé --- scripts/checkpatch.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 7026895074..4fe4cfd631 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -1573,7 +1573,7 @@ sub process { $is_patch = 1; } - if ($line =~ /^(Author|From): .* via .*/) { + if ($line =~ /^(Author|From): .* via .*/) { I recommend checking against qemu-.* rather than trying to capture explicitly all the suffixes we have. (For instance there's a line in mailmap for a commit that was attributed to qemu-block@, which this change still wouldn't catch.) Oh, good point. And also qemu-...@nongnu.org: commit 5cbd51a5a58098444ffa246ece2013849be04299 Author: BALATON Zoltan via Date: Sun Jan 3 02:09:33 2021 +0100
Re: [PATCH 1/2] scripts/checkpatch: Avoid author email mangled by qemu-trivial@
On 2/4/24 13:52, Peter Maydell wrote: On Tue, 2 Apr 2024 at 12:30, Philippe Mathieu-Daudé wrote: Commit f5177798d8 ("scripts: report on author emails that are mangled by the mailing list") added a check for qemu-devel@ list, complete with qemu-trivial@. Signed-off-by: Philippe Mathieu-Daudé --- scripts/checkpatch.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 7026895074..4fe4cfd631 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -1573,7 +1573,7 @@ sub process { $is_patch = 1; } - if ($line =~ /^(Author|From): .* via .*/) { + if ($line =~ /^(Author|From): .* via .*/) { I recommend checking against qemu-.* rather than trying to capture explicitly all the suffixes we have. (For instance there's a line in mailmap for a commit that was attributed to qemu-block@, which this change still wouldn't catch.) Oh, good point.
Re: [PATCH for-9.0 3/4] vga: adjust dirty memory region if pel panning is active
On 2/4/24 13:34, Paolo Bonzini wrote: When pel panning is active, one more byte is read from each of the VGA memory planes. This has to be accounted in the computation of region_end, otherwise vga_draw_graphic() fails an assertion: qemu-system-i386: ../system/physmem.c:946: cpu_physical_memory_snapshot_get_dirty: Assertion `start + length <= snap->end' failed. Reported-by: Helge Konetzka Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2244 Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH for-9.0 2/4] vga: move computation of dirty memory region later
On 2/4/24 13:34, Paolo Bonzini wrote: Move the computation of region_start and region_end after the value of "bits" is known. This makes it possible to distinguish modes that support horizontal pel panning from modes that do not. Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 50 1 file changed, 25 insertions(+), 25 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PULL 0/4] Trivial patches for 2024-04-02
On Tue, 2 Apr 2024 at 11:41, Michael Tokarev wrote: > > > > Author: Stefan Weil via > > *SIGH* This happened *again*. > > > (you'll need to tell git log "--no-mailmap" to not get confused > > by the mapping we have for the last time one of these slipped > > through...) > > Now this is interesting. And this is exactly why I haven't noticed > it - I did pay attention to Author lines this time. -- because > it is displayed with mailmap applied. How very useful. > > I have to use `git show --no-mailmap' to see the original " via.." > version. FWIW my apply-pullreq script makes this check for this: if git shortlog --author='qemu-.*@nongnu\.org' master..staging | grep .; then echo "ERROR: pull request includes commits attributed to list" exit 1 fi (This doesn't pass --no-mailmap, because git shortlog doesn't take that option, and the --author match is done on the "true" author, not the mapped one. This is the kind of orthogonality in command line UI that we have come to expect from git ;-)) -- PMM
Re: [PATCH for-9.0 1/4] vga: merge conditionals on shift control register
On 2/4/24 13:34, Paolo Bonzini wrote: There are two sets of conditionals using the shift control bits: one to verify the palette and adjust disp_width, one to compute the "v" and "bits" variables. Merge them into one, with the extra benefit that we now have the "bits" value available early and can use it to compute region_end. Signed-off-by: Paolo Bonzini --- hw/display/vga.c | 89 +++- 1 file changed, 42 insertions(+), 47 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: Intention to work on GSoC project
On Tue, Apr 2, 2024 at 6:58 AM Sahil wrote: > > Hi, > > On Monday, April 1, 2024 11:53:11 PM IST daleyoung4...@gmail.com wrote: > > Hi, > > > > On Monday, March 25, 2024 21:20:32 CST Sahil wrote: > > > Q1. > > > Section 2.7.4 of the virtio spec [3] states that in an available > > > descriptor, the "Element Length" stores the length of the buffer element. > > > In the next few lines, it also states that the "Element Length" is > > > reserved for used descriptors and is ignored by drivers. This sounds a > > > little contradictory given that drivers write available desciptors in the > > > descriptor ring. > > When VIRTQ_DESC_F_WRITE is set, the device will use "Element Length" to > > specify the length it writes. When VIRTQ_DESC_F_WRITE is not set, which > > means the buffer is read-only for the device, "Element Length" will not be > > changed by the device, so drivers just ignore it. > > > Thank you for the clarification. I think I misunderstood what I had read > in the virtio spec. What I have understood now is that "Element Length" > has different meanings for available and used descriptors. > > Correct me if I am wrong - for available descriptors, it represents the > length of the buffer. For used descriptors, it represents the length of > the buffer that is written to by the device if it's write-only, otherwise it > has no meaning and hence can be ignored by drivers. > Both answers are correct. > > > Q2. > > > In the Red Hat article, just below the first listing ("Memory layout of a > > > packed virtqueue descriptor"), there's the following line referring to the > > > buffer id in > > > > > > "virtq_desc": > > > > This time, the id field is not an index for the device to look for the > > > > buffer: it is an opaque value for it, only has meaning for the driver. > > > > > > But the device returns the buffer id when it writes the used descriptor to > > > the descriptor ring. The "only has meaning for the driver" part has got me > > > a little confused. Which buffer id is this that the device returns? Is it > > > related to the buffer id in the available descriptor? > > > > In my understanding, buffer id is the element that avail descriptor marks to > > identify when adding descriptors to table. Device will returns the buffer > > id in the processed descriptor or the last descriptor in a chain, and write > > it to the descriptor that used idx refers to (first one in the chain). Then > > used idx increments. > > > > The Packed Virtqueue blog [1] is helpful, but some details in the examples > > are making me confused. > > > > Q1. > > In the last step of the two-entries descriptor table example, it says both > > buffers #0 and #1 are available for the device. I understand descriptor[0] > > is available and descriptor[1] is not, but there is no ID #0 now. So does > > the device got buffer #0 by notification beforehand? If so, does it mean > > buffer #0 will be lost when notifications are disabled? > > > I guess you mean the table labeled "Figure: Full two-entries descriptor table". Take into account that the descriptor table is not the state of all the descriptors. That information must be maintained by the device and the driver internally. The descriptor table is used as a circular buffer, where one part is writable by the driver and the other part is writable by the device. For the device to override the descriptor table entry where descriptor id 0 used to be does not mean that the descriptor id 0 is used. It just means that the device communicates to the driver that descriptor 1 is used, and both sides need to keep the descriptor state coherently. > I too have a similar question and understanding the relation between buffer > ids in the used and available descriptors might give more insight into this. > For > available descriptors, the buffer id is used to associate descriptors with a > particular buffer. I am still not very sure about ids in used descriptors. > > Regarding Q1, both buffers #0 and #1 are available. In the mentioned figure, > both descriptor[0] and descriptor[1] are available. This figure follows the > figure > with the caption "Using first buffer out of order". So in the first figure > the device > reads buffer #1 and writes the used descriptor but it still has buffer #0 to > read. > That still belongs to the device while buffer #1 can now be handled by the > driver > once again. So in the next figure, the driver makes buffer #1 available > again. The > device can still read buffer #0 from the previous batch of available > descriptors. > > Based on what I have understood, the driver can't touch the descriptor > corresponding to buffer #0 until the device acknowledges it. I did find the > figure a little confusing as well. I think once the meaning of buffer id is > clear > from the driver's and device's perspective, it'll be easier to understand the > figure. > I think you got it right. Please let me know if you have further questions. > I am also not very sure about what hap
Re: [PATCH v3 11/17] esp.c: rework esp_cdb_length() into esp_cdb_ready()
On 24/3/24 20:17, Mark Cave-Ayland wrote: The esp_cdb_length() function is only used as part of a calculation to determine whether the cmdfifo contains an entire SCSI CDB. Rework esp_cdb_length() into a new esp_cdb_ready() function which both enables us to handle the case where scsi_cdb_length() returns -1, plus simplify the logic for its callers. Suggested-by: Paolo Bonzini Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 30 ++ 1 file changed, 14 insertions(+), 16 deletions(-) Reviewed-by: Philippe Mathieu-Daudé