Re: [RFC PATCH] python: add qmp-send program to send raw qmp commands to qemu
John Snow writes: > On Tue, Apr 5, 2022, 5:03 AM Damien Hedde > wrote: [...] >> If it stays in QEMU tree, what licensing should I use ? LGPL does not >> hurt, no ? >> > > Whichever you please. GPLv2+ would be convenient and harmonizes well with > other tools. LGPL is only something I started doing so that the "qemu.qmp" > package would be LGPL. Licensing the tools as LGPL was just a sin of > convenience so I could claim a single license for the whole wheel/egg/tgz. > > (I didn't want to make separate qmp and qmp-tools packages.) > > Go with what you feel is best. Any license other than GPLv2+ needs justification in the commit message. [...]
[PATCH qemu] ppc/vof: Fix uninitialized string tracing
There are error paths which do not initialize propname, but the trace_exit label prints it anyway. This initializes the problematic string.

Spotted by Coverity CID 1487241.

Signed-off-by: Alexey Kardashevskiy
---
 hw/ppc/vof.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/vof.c b/hw/ppc/vof.c
index 2b63a6287561..5ce3ca32c998 100644
--- a/hw/ppc/vof.c
+++ b/hw/ppc/vof.c
@@ -294,7 +294,7 @@ static uint32_t vof_setprop(MachineState *ms, void *fdt, Vof *vof,
                             uint32_t nodeph, uint32_t pname,
                             uint32_t valaddr, uint32_t vallen)
 {
-    char propname[OF_PROPNAME_LEN_MAX + 1];
+    char propname[OF_PROPNAME_LEN_MAX + 1] = "";
     uint32_t ret = PROM_ERROR;
     int offset, rc;
     char trval[64] = "";
--
2.30.2
RE: [PATCH V2 1/4] intel-iommu: don't warn guest errors when getting rid2pasid entry
> From: Jason Wang > Sent: Wednesday, April 6, 2022 11:33 AM > To: Tian, Kevin > Cc: Liu, Yi L ; m...@redhat.com; pet...@redhat.com; > yi.y@linux.intel.com; qemu-devel@nongnu.org > Subject: Re: [PATCH V2 1/4] intel-iommu: don't warn guest errors when > getting rid2pasid entry > > On Sat, Apr 2, 2022 at 3:34 PM Tian, Kevin wrote: > > > > > From: Jason Wang > > > Sent: Wednesday, March 30, 2022 4:37 PM > > > On Wed, Mar 30, 2022 at 4:16 PM Tian, Kevin > wrote: > > > > > > > > > From: Jason Wang > > > > > Sent: Tuesday, March 29, 2022 12:52 PM > > > > > > > > > > > >>> > > > > > >>> Currently the implementation of vtd_ce_get_rid2pasid_entry() is > also > > > > > >>> problematic. According to VT-d spec, RID2PASID field is effective > only > > > > > >>> when ecap.rps is true otherwise PASID#0 is used for RID2PASID. I > > > didn't > > > > > >>> see ecap.rps is set, neither is it checked in that function. It > > > > > >>> works possibly > > > > > >>> just because Linux currently programs 0 to RID2PASID... > > > > > >> > > > > > >> This seems to be another issue since the introduction of scalable > mode. > > > > > > > > > > > > yes. this is not introduced in this series. The current scalable > > > > > > mode > > > > > > vIOMMU support was following 3.0 spec, while RPS is added in 3.1. > > > Needs > > > > > > to be fixed. > > > > > > > > > > > > > > > Interesting, so this is more complicated when dealing with migration > > > > > compatibility. So what I suggest is probably something like: > > > > > > > > > > -device intel-iommu,version=$version > > > > > > > > > > Then we can maintain migration compatibility correctly. For 3.0 we > can > > > > > go without RPS and 3.1 and above we need to implement RPS. > > > > > > > > This is sensible. Probably a new version number is created only when > > > > it breaks compatibility with an old version, i.e. not necessarily to > > > > follow > > > > every release from VT-d spec. 
In this case we definitely need one from > > > > 3.0 to 3.1+ given RID2PASID working on a 3.0 implementation will > > > > trigger a reserved fault due to RPS not set on a 3.1 implementation. > > > > > > 3.0 should be fine, but I need to check whether there's another > > > difference for PASID mode. > > > > > > It would be helpful if there's a chapter in the spec to describe the > > > difference of behaviours. > > > > There is a section called 'Revision History' in the start of the VT-d spec. > > It talks about changes in each revision, e.g.: > > -- > > June 2019, 3.1: > > > > Added support for RID-PASID capability (RPS field in ECAP_REG). > > Good to know that, does it mean, except for this revision history, all > the other semantics keep backward compatibility across the version? Yes and if you find anything not clarified properly I can help forward to the spec owner. Thanks Kevin
Re: [PATCH V2 1/4] intel-iommu: don't warn guest errors when getting rid2pasid entry
On Sat, Apr 2, 2022 at 3:34 PM Tian, Kevin wrote: > > > From: Jason Wang > > Sent: Wednesday, March 30, 2022 4:37 PM > > On Wed, Mar 30, 2022 at 4:16 PM Tian, Kevin wrote: > > > > > > > From: Jason Wang > > > > Sent: Tuesday, March 29, 2022 12:52 PM > > > > > > > > > >>> > > > > >>> Currently the implementation of vtd_ce_get_rid2pasid_entry() is also > > > > >>> problematic. According to VT-d spec, RID2PASID field is effective > > > > >>> only > > > > >>> when ecap.rps is true otherwise PASID#0 is used for RID2PASID. I > > didn't > > > > >>> see ecap.rps is set, neither is it checked in that function. It > > > > >>> works possibly > > > > >>> just because Linux currently programs 0 to RID2PASID... > > > > >> > > > > >> This seems to be another issue since the introduction of scalable > > > > >> mode. > > > > > > > > > > yes. this is not introduced in this series. The current scalable mode > > > > > vIOMMU support was following 3.0 spec, while RPS is added in 3.1. > > Needs > > > > > to be fixed. > > > > > > > > > > > > Interesting, so this is more complicated when dealing with migration > > > > compatibility. So what I suggest is probably something like: > > > > > > > > -device intel-iommu,version=$version > > > > > > > > Then we can maintain migration compatibility correctly. For 3.0 we can > > > > go without RPS and 3.1 and above we need to implement RPS. > > > > > > This is sensible. Probably a new version number is created only when > > > it breaks compatibility with an old version, i.e. not necessarily to > > > follow > > > every release from VT-d spec. In this case we definitely need one from > > > 3.0 to 3.1+ given RID2PASID working on a 3.0 implementation will > > > trigger a reserved fault due to RPS not set on a 3.1 implementation. > > > > 3.0 should be fine, but I need to check whether there's another > > difference for PASID mode. > > > > It would be helpful if there's a chapter in the spec to describe the > > difference of behaviours. 
> > There is a section called 'Revision History' in the start of the VT-d spec. > It talks about changes in each revision, e.g.: > -- > June 2019, 3.1: > > Added support for RID-PASID capability (RPS field in ECAP_REG). Good to know that, does it mean, except for this revision history, all the other semantics keep backward compatibility across the version? > -- > > > > > > > > > > > > > > Since most of the advanced features has not been implemented, we may > > > > probably start just from 3.4 (assuming it's the latest version). And all > > > > of the following effort should be done for 3.4 in order to productize > > > > it. > > > > > > > > > > Agree. btw in your understanding is intel-iommu in a production quality > > > now? > > > > Red Hat supports vIOMMU for the guest DPDK path now. > > > > For scalable-mode we need to see some use cases then we can evaluate. > > virtio SVA could be a possible use case, but it requires more work e.g > > PRS queue. > > Yes it's not ready for full evaluation yet. > > The current state before your change is exactly feature-on-par with the > legacy mode, except using scalable format in certain structures. That alone > is not worthy of a formal evaluation. Right. Thanks > > > > > > If not, do we want to apply this version scheme only when it > > > reaches the production quality or also in the experimental phase? > > > > Yes. E.g if we think scalable mode is mature, we can enable 3.0. > > > > Nice to know. > > Thanks > Kevin
Re: [PATCH V2 4/4] intel-iommu: PASID support
On Sat, Apr 2, 2022 at 3:27 PM Tian, Kevin wrote: > > > From: Jason Wang > > Sent: Wednesday, March 30, 2022 4:32 PM > > > > > > > > > > > > > > If there is certain fault > > > > > triggered by a request with PASID, we do want to report this > > information > > > > > upward. > > > > > > > > I tend to do it increasingly on top of this series (anyhow at least > > > > RID2PASID is introduced before this series) > > > > > > Yes, RID2PASID should have been recorded too but it's not done correctly. > > > > > > If you do it in separate series, it implies that you will introduce > > > another > > > "x-pasid-fault' to guard the new logic related to PASID fault recording? > > > > Something like this, as said previously, if it's a real problem, it > > exists since the introduction of rid2pasid, not specific to this > > patch. > > > > But I can add the fault recording if you insist. > > I prefer to including the fault recording given it's simple and makes this > change more complete in concept. That's fine. Thanks > > > > > > > > > > > Earlier when Yi proposed Qemu changes for guest SVA [1] he aimed for > > a > > > > > coarse-grained knob design: > > > > > -- > > > > > Intel VT-d 3.0 introduces scalable mode, and it has a bunch of > > capabilities > > > > > related to scalable mode translation, thus there are multiple > > combinations. > > > > > While this vIOMMU implementation wants simplify it for user by > > providing > > > > > typical combinations. User could config it by "x-scalable-mode" > > > > > option. > > > > The > > > > > usage is as below: > > > > > "-device intel-iommu,x-scalable-mode=["legacy"|"modern"]" > > > > > > > > > > - "legacy": gives support for SL page table > > > > > - "modern": gives support for FL page table, pasid, virtual > > > > > command > > > > > - if not configured, means no scalable mode support, if not > > > > > proper > > > > >configured, will throw error > > > > > -- > > > > > > > > > > Which way do you prefer to? 
> > > > > > > > > > [1] https://lists.gnu.org/archive/html/qemu-devel/2020- > > 02/msg02805.html > > > > > > > > My understanding is that, if we want to deploy Qemu in a production > > > > environment, we can't use the "x-" prefix. We need a full > > > > implementation of each cap. > > > > > > > > E.g > > > > -device intel-iommu,first-level=on,scalable-mode=on etc. > > > > > > > > > > You meant each cap will get a separate control option? > > > > > > But that way requires the management stack or admin to have deep > > > knowledge about how combinations of different capabilities work, e.g. > > > if just turning on scalable mode w/o first-level cannot support vSVA > > > on assigned devices. Is this a common practice when defining Qemu > > > parameters? > > > > We can have a safe and good default value for each cap. E.g > > > > In qemu 8.0 we think scalable is mature, we can make scalable to be > > enabled by default > > in qemu 8.1 we think first-level is mature, we can make first level to > > be enabled by default. > > > > OK, that is a workable way. > > Thanks > Kevin
Re: [PATCH V2 4/4] intel-iommu: PASID support
On Sat, Apr 2, 2022 at 3:24 PM Tian, Kevin wrote: > > > From: Jason Wang > > Sent: Wednesday, March 30, 2022 4:32 PM > > > > On Wed, Mar 30, 2022 at 4:02 PM Tian, Kevin wrote: > > > > > > > From: Jason Wang > > > > Sent: Tuesday, March 29, 2022 12:49 PM > > > > > > > > On Mon, Mar 28, 2022 at 3:03 PM Tian, Kevin > > wrote: > > > > > > > > > > > From: Jason Wang > > > > > > Sent: Monday, March 21, 2022 1:54 PM > > > > > > > > > > > > +/* > > > > > > + * vtd-spec v3.4 3.14: > > > > > > + * > > > > > > + * """ > > > > > > + * Requests-with-PASID with input address in range 0xFEEx_ > > are > > > > > > + * translated normally like any other request-with-PASID > > > > > > through > > > > > > + * DMA-remapping hardware. However, if such a request is > > processed > > > > > > + * using pass-through translation, it will be blocked as > > > > > > described > > > > > > + * in the paragraph below. > > > > > > > > > > While PASID+PT is blocked as described in the below paragraph, the > > > > > paragraph itself applies to all situations: > > > > > > > > > > 1) PT + noPASID > > > > > 2) translation + noPASID > > > > > 3) PT + PASID > > > > > 4) translation + PASID > > > > > > > > > > because... > > > > > > > > > > > + * > > > > > > + * Software must not program paging-structure entries to remap > > any > > > > > > + * address to the interrupt address range. Untranslated > > > > > > requests > > > > > > + * and translation requests that result in an address in the > > > > > > + * interrupt range will be blocked with condition code LGN.4 or > > > > > > + * SGN.8. > > > > > > > > > > ... if you look at the definition of LGN.4 or SGN.8: > > > > > > > > > > LGN.4: When legacy mode (RTADDR_REG.TTM=00b) is enabled, > > hardware > > > > > detected an output address (i.e. address after remapping) in > > > > > the > > > > > interrupt address range (0xFEEx_). 
For Translated > > > > > requests and > > > > > requests with pass-through translation type (TT=10), the > > > > > output > > > > > address is the same as the address in the request > > > > > > > > > > The last sentence in the first paragraph above just highlights the > > > > > fact > > that > > > > > when input address of PT is in interrupt range then it is blocked by > > LGN.4 > > > > > or SGN.8 due to output address also in interrupt range. > > > > > > > > > > > + * """ > > > > > > + * > > > > > > + * We enable per-as memory region (iommu_ir_fault) for catching > > > > > > + * the translation for interrupt range through PASID + PT. > > > > > > + */ > > > > > > +if (pt && as->pasid != PCI_NO_PASID) { > > > > > > +memory_region_set_enabled(&as->iommu_ir_fault, true); > > > > > > +} else { > > > > > > +memory_region_set_enabled(&as->iommu_ir_fault, false); > > > > > > +} > > > > > > + > > > > > > > > > > Given above this should be a bug fix for nopasid first and then apply > > > > > it > > > > > to pasid path too. > > > > Actually, nopasid path patches were posted here. > > > > > > > > https://www.mail-archive.com/qemu-de...@nongnu.org/msg867878.html > > > > > > > > Thanks > > > > > > > > > > Can you elaborate why they are handled differently? > > > > It's because that patch is for the case where pasid mode is not > > implemented. We might need it for -stable. > > > > So will that patch be replaced after this one goes in? That patch will be merged first if I understand correctly. Then this patch could be applied on top. > By any means > the new iommu_ir_fault region could be applied to both nopasid > and pasid i.e. no need toggle it when address space is switched. Actually it's needed only when PT is enabled. When PT is disabled, the translation is done via iommu_translate. Considering the previous patch will be merged, I will fix this !PT in the next version. Thanks > > Thanks > Kevin
Re: [PATCH] vdpa: Add missing tracing to batch mapping functions
On 2022/4/5 2:36 PM, Eugenio Pérez wrote:

These functions were not traced properly.

Signed-off-by: Eugenio Pérez

Acked-by: Jason Wang

---
 hw/virtio/vhost-vdpa.c | 2 ++
 hw/virtio/trace-events | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 8adf7c0b92..9e5fe15d03 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -129,6 +129,7 @@ static void vhost_vdpa_listener_begin_batch(struct vhost_vdpa *v)
         .iotlb.type = VHOST_IOTLB_BATCH_BEGIN,
     };

+    trace_vhost_vdpa_listener_begin_batch(v, fd, msg.type, msg.iotlb.type);
     if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
         error_report("failed to write, fd=%d, errno=%d (%s)", fd, errno, strerror(errno));
@@ -163,6 +164,7 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
     msg.type = v->msg_type;
     msg.iotlb.type = VHOST_IOTLB_BATCH_END;

+    trace_vhost_vdpa_listener_commit(v, fd, msg.type, msg.iotlb.type);
     if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
         error_report("failed to write, fd=%d, errno=%d (%s)", fd, errno, strerror(errno));
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a5102eac9e..48d9d5 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -25,6 +25,8 @@ vhost_user_postcopy_waker_nomatch(const char *rb, uint64_t rb_offset) "%s + 0x%"
 # vhost-vdpa.c
 vhost_vdpa_dma_map(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint64_t uaddr, uint8_t perm, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" uaddr: 0x%"PRIx64" perm: 0x%"PRIx8" type: %"PRIu8
 vhost_vdpa_dma_unmap(void *vdpa, int fd, uint32_t msg_type, uint64_t iova, uint64_t size, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" iova: 0x%"PRIx64" size: 0x%"PRIx64" type: %"PRIu8
+vhost_vdpa_listener_begin_batch(void *v, int fd, uint32_t msg_type, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
+vhost_vdpa_listener_commit(void *v, int fd, uint32_t msg_type, uint8_t type) "vdpa:%p fd: %d msg_type: %"PRIu32" type: %"PRIu8
 vhost_vdpa_listener_region_add(void *vdpa, uint64_t iova, uint64_t llend, void *vaddr, bool readonly) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64" vaddr: %p read-only: %d"
 vhost_vdpa_listener_region_del(void *vdpa, uint64_t iova, uint64_t llend) "vdpa: %p iova 0x%"PRIx64" llend 0x%"PRIx64
 vhost_vdpa_add_status(void *dev, uint8_t status) "dev: %p status: 0x%"PRIx8
--
2.27.0
Re: [PATCH v4] vdpa: reset the backend device in the end of vhost_net_stop()
On 4/1/2022 7:20 PM, Jason Wang wrote: Adding Michael. On Sat, Apr 2, 2022 at 7:08 AM Si-Wei Liu wrote: On 3/31/2022 7:53 PM, Jason Wang wrote: On Fri, Apr 1, 2022 at 9:31 AM Michael Qiu wrote: Currently, when VM poweroff, it will trigger vdpa device(such as mlx bluefield2 VF) reset many times(with 1 datapath queue pair and one control queue, triggered 3 times), this leads to below issue: vhost VQ 2 ring restore failed: -22: Invalid argument (22) This because in vhost_net_stop(), it will stop all vhost device bind to this virtio device, and in vhost_dev_stop(), qemu tries to stop the device , then stop the queue: vhost_virtqueue_stop(). In vhost_dev_stop(), it resets the device, which clear some flags in low level driver, and in next loop(stop other vhost backends), qemu try to stop the queue corresponding to the vhost backend, the driver finds that the VQ is invalied, this is the root cause. To solve the issue, vdpa should set vring unready, and remove reset ops in device stop: vhost_dev_start(hdev, false). and implement a new function vhost_dev_reset, only reset backend device after all vhost(per-queue) stoped. Typo. Signed-off-by: Michael Qiu Acked-by: Jason Wang Rethink this patch, consider there're devices that don't support set_vq_ready(). I wonder if we need 1) uAPI to tell the user space whether or not it supports set_vq_ready() I guess what's more relevant here is to define the uAPI semantics for unready i.e. set_vq_ready(0) for resuming/stopping virtqueue processing, as starting vq is comparatively less ambiguous. Yes. Considering the likelihood that this interface may be used for live migration, it would be nice to come up with variants such as 1) discard inflight request v.s. 2) waiting for inflight processing to be done, Or inflight descriptor reporting (which seems to be tricky). But we can start from net that a discarding may just work. and 3) timeout in waiting. 
Actually, that's the plan and Eugenio is proposing something like this via virtio spec: https://lists.oasis-open.org/archives/virtio-dev/202111/msg00020.html Thanks for the pointer, I seem to recall I saw it some time back though I wonder if there's follow-up for the v3? My impression was that this is still a work-in-progress spec proposal, while the semantics of various F_STOP scenario is unclear yet and not all of the requirements (ex: STOP_FAILED, rewind & !IN_ORDER) for live migration do seem to get accommodated? 2) userspace will call SET_VRING_ENABLE() when the device supports otherwise it will use RESET. Are you looking to making virtqueue resume-able through the new SET_VRING_ENABLE() uAPI? I think RESET is inevitable in some case, i.e. when guest initiates device reset by writing 0 to the status register. Yes, that's all my plan. For suspend/resume and live migration use cases, indeed RESET can be substituted with SET_VRING_ENABLE. Again, it'd need quite some code refactoring to accommodate this change. Although I'm all for it, it'd be the best to lay out the plan for multiple phases rather than overload this single patch too much. You can count my time on this endeavor if you don't mind. :) You're welcome, I agree we should choose a way to go first: 1) manage to use SET_VRING_ENABLE (more like a workaround anyway) For networking device and the vq suspend/resume and live migration use cases to support, I thought it might suffice? We may drop inflight or unused ones for Ethernet... What other part do you think may limit its extension to become a general uAPI or add new uAPI to address similar VQ stop requirement if need be? Or we might well define subsystem specific uAPI to stop the virtqueue, for vdpa device specifically?
I think the point here is given that we would like to avoid guest side modification to support live migration, we can define specific uAPI for specific live migration requirement without having to involve guest driver change. It'd be easy to get started this way and generalize them all to a full blown _S_STOP when things are eventually settled. 2) go with virtio-spec (may take a while) I feel it might be still quite early for now to get to a full blown _S_STOP spec level amendment that works for all types of virtio (vendor) devices. Generally there can be very specific subsystem-dependent ways to stop each type of virtio devices that satisfies the live migration of virtio subsystem devices. For now the discussion mostly concerns with vq index rewind, inflight handling, notification interrupt and configuration space such kind of virtio level things, but real device backend has implication on the other parts such as the order of IO/DMA quiescing and interrupt masking. If the subsystem virtio guest drivers today somehow don't support any of those _S_STOP new behaviors, I guess it's with little point to introduce the same
Re: [PATCH v5 11/13] KVM: Zap existing KVM mappings when pages changed in the private fd
On Thu, Mar 10, 2022 at 10:09:09PM +0800, Chao Peng wrote: > KVM gets notified when memory pages changed in the memory backing store. > When userspace allocates the memory with fallocate() or frees memory > with fallocate(FALLOC_FL_PUNCH_HOLE), memory backing store calls into > KVM fallocate/invalidate callbacks respectively. To ensure KVM never > maps both the private and shared variants of a GPA into the guest, in > the fallocate callback, we should zap the existing shared mapping and > in the invalidate callback we should zap the existing private mapping. > > In the callbacks, KVM firstly converts the offset range into the > gfn_range and then calls existing kvm_unmap_gfn_range() which will zap > the shared or private mapping. Both callbacks pass in a memslot > reference but we need 'kvm' so add a reference in memslot structure. > > Signed-off-by: Yu Zhang > Signed-off-by: Chao Peng > --- > include/linux/kvm_host.h | 3 ++- > virt/kvm/kvm_main.c | 36 > 2 files changed, 38 insertions(+), 1 deletion(-) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 9b175aeca63f..186b9b981a65 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -236,7 +236,7 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t > cr2_or_gpa, > int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); > #endif > > -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER > +#if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || defined(CONFIG_MEMFILE_NOTIFIER) > struct kvm_gfn_range { > struct kvm_memory_slot *slot; > gfn_t start; > @@ -568,6 +568,7 @@ struct kvm_memory_slot { > loff_t private_offset; > struct memfile_pfn_ops *pfn_ops; > struct memfile_notifier notifier; > + struct kvm *kvm; > }; > > static inline bool kvm_slot_is_private(const struct kvm_memory_slot *slot) > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 67349421eae3..52319f49d58a 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -841,8 +841,43 @@ static int 
kvm_init_mmu_notifier(struct kvm *kvm) > #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */ > > #ifdef CONFIG_MEMFILE_NOTIFIER > +static void kvm_memfile_notifier_handler(struct memfile_notifier *notifier, > + pgoff_t start, pgoff_t end) > +{ > + int idx; > + struct kvm_memory_slot *slot = container_of(notifier, > + struct kvm_memory_slot, > + notifier); > + struct kvm_gfn_range gfn_range = { > + .slot = slot, > + .start = start - (slot->private_offset >> PAGE_SHIFT), > + .end = end - (slot->private_offset >> PAGE_SHIFT), > + .may_block = true, > + }; > + struct kvm *kvm = slot->kvm; > + > + gfn_range.start = max(gfn_range.start, slot->base_gfn); > + gfn_range.end = min(gfn_range.end, slot->base_gfn + slot->npages); > + > + if (gfn_range.start >= gfn_range.end) > + return; > + > + idx = srcu_read_lock(&kvm->srcu); > + KVM_MMU_LOCK(kvm); > + kvm_unmap_gfn_range(kvm, &gfn_range); > + kvm_flush_remote_tlbs(kvm); > + KVM_MMU_UNLOCK(kvm); > + srcu_read_unlock(&kvm->srcu, idx); Should this also invalidate gfn_to_pfn_cache mappings? Otherwise it seems possible the kernel might end up inadvertently writing to now-private guest memory via a now-stale gfn_to_pfn_cache entry.
Re: [PATCH 1/7] virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
On 4/1/2022 7:10 PM, Jason Wang wrote:
On Sat, Apr 2, 2022 at 6:32 AM Si-Wei Liu wrote:
On 3/31/2022 1:39 AM, Jason Wang wrote:
On Wed, Mar 30, 2022 at 11:48 PM Si-Wei Liu wrote:
On 3/30/2022 2:00 AM, Jason Wang wrote:
On Wed, Mar 30, 2022 at 2:33 PM Si-Wei Liu wrote:

With MQ enabled vdpa device and non-MQ supporting guest e.g. booting vdpa with mq=on over OVMF of single vqp, below assert failure is seen:

../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.

0  0x7f8ce3ff3387 in raise () at /lib64/libc.so.6
1  0x7f8ce3ff4a78 in abort () at /lib64/libc.so.6
2  0x7f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
3  0x7f8ce3fec252 in () at /lib64/libc.so.6
4  0x558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
5  0x558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
6  0x558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
7  0x558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false) at ../hw/virtio/virtio-pci.c:974
8  0x558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
9  0x558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:361
10 0x558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
11 0x558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
12 0x558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
13 0x558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
14 0x558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:492
15 0x558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
16 0x558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1504
17 0x558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
18 0x558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
19 0x558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>) at ../softmmu/physmem.c:2914
20 0x558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
21 0x558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
22 0x558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
23 0x558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
24 0x7f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
25 0x7f8ce40bb9fd in clone () at /lib64/libc.so.6

The cause for the assert failure is due to that the vhost_dev index for the ctrl vq was not aligned with actual one in use by the guest. Upon multiqueue feature negotiation in virtio_net_set_multiqueue(), if guest doesn't support multiqueue, the guest vq layout would shrink to a single queue pair, consisting of 3 vqs in total (rx, tx and ctrl).
This results in ctrl_vq taking a different vhost_dev group index than the default. We can map vq to the correct vhost_dev group by checking if MQ is supported by guest and successfully negotiated. Since the MQ feature is only present along with CTRL_VQ, we make sure the index 2 is only meant for the control vq while MQ is not supported by guest. Be noted if QEMU or guest doesn't support control vq, there's no bother exposing vhost_dev and guest notifier for the control vq. Since vhost_net_start/stop implies DRIVER_OK is set in device status, feature negotiation should be completed when reaching virtio_net_vhost_status(). Fixes: 22288fe ("virtio-net: vhost control virtqueue support") Suggested-by: Jason Wang Signed-off-by: Si-Wei Liu --- hw/net/virtio-net.c | 19 --- 1 file changed, 16 insertions(+), 3
[PATCH for-7.1 09/11] pc-bios: Add NPCM8xx Bootrom
The bootrom is a minimal bootrom that can be used to bring up an NPCM845 Linux kernel. Its source code can be found at github.com/google/vbootrom/tree/master/npcm8xx Signed-off-by: Hao Wu Reviwed-by: Titus Rwantare --- pc-bios/npcm8xx_bootrom.bin | Bin 0 -> 608 bytes 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 pc-bios/npcm8xx_bootrom.bin diff --git a/pc-bios/npcm8xx_bootrom.bin b/pc-bios/npcm8xx_bootrom.bin new file mode 100644 index ..6370d6475635c4d445d2b927311edcd591949c82 GIT binary patch literal 608 zcmdUrKTE?<6vfX=0{*3B5ET?nwWA^;qEk()n=Xb9-4dxoSBrz#p|QJQL~zokn{Eyc z?PBXUkU+aB?k?IbNQftG5ej|*FC2c{bKkr7zLy3jhNxj`gc_y5h=Ru)PgZC)Y`f zTqA9Am28qLHlr*^#;re-)dpxT0U42|O+cWOcx=B;{6xXH04vx?cjm z+%U{oFx!aPpV3>ZKz0i$XA-yq{f}x4;|pbw;l#@9zGd|z-rs*H@V-o%PEV)D-)8n2%DyH5@w_^Y8 LH5R3RMV#gjxYTW} literal 0 HcmV?d1 -- 2.35.1.1094.g7c7d902a7c-goog
[PATCH for-7.1 11/11] hw/arm: Add NPCM845 Evaluation board
Signed-off-by: Hao Wu Reviwed-by: Patrick Venture --- hw/arm/meson.build | 2 +- hw/arm/npcm8xx_boards.c | 257 +++ include/hw/arm/npcm8xx.h | 20 +++ 3 files changed, 278 insertions(+), 1 deletion(-) create mode 100644 hw/arm/npcm8xx_boards.c diff --git a/hw/arm/meson.build b/hw/arm/meson.build index cf824241c5..e813cd72fa 100644 --- a/hw/arm/meson.build +++ b/hw/arm/meson.build @@ -14,7 +14,7 @@ arm_ss.add(when: 'CONFIG_MUSICPAL', if_true: files('musicpal.c')) arm_ss.add(when: 'CONFIG_NETDUINO2', if_true: files('netduino2.c')) arm_ss.add(when: 'CONFIG_NETDUINOPLUS2', if_true: files('netduinoplus2.c')) arm_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx.c', 'npcm7xx_boards.c')) -arm_ss.add(when: 'CONFIG_NPCM8XX', if_true: files('npcm8xx.c')) +arm_ss.add(when: 'CONFIG_NPCM8XX', if_true: files('npcm8xx.c', 'npcm8xx_boards.c')) arm_ss.add(when: 'CONFIG_NSERIES', if_true: files('nseries.c')) arm_ss.add(when: 'CONFIG_SX1', if_true: files('omap_sx1.c')) arm_ss.add(when: 'CONFIG_CHEETAH', if_true: files('palm.c')) diff --git a/hw/arm/npcm8xx_boards.c b/hw/arm/npcm8xx_boards.c new file mode 100644 index 00..2290473d12 --- /dev/null +++ b/hw/arm/npcm8xx_boards.c @@ -0,0 +1,257 @@ +/* + * Machine definitions for boards featuring an NPCM8xx SoC. + * + * Copyright 2022 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ + +#include "qemu/osdep.h" + +#include "chardev/char.h" +#include "hw/arm/npcm8xx.h" +#include "hw/core/cpu.h" +#include "hw/loader.h" +#include "hw/qdev-core.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu-common.h" +#include "qemu/datadir.h" +#include "qemu/units.h" +#include "sysemu/block-backend.h" + +#define NPCM845_EVB_POWER_ON_STRAPS 0x17ff + +static const char npcm8xx_default_bootrom[] = "npcm8xx_bootrom.bin"; + +static void npcm8xx_load_bootrom(MachineState *machine, NPCM8xxState *soc) +{ +const char *bios_name = machine->firmware ?: npcm8xx_default_bootrom; +g_autofree char *filename = NULL; +int ret; + +filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name); +if (!filename) { +error_report("Could not find ROM image '%s'", bios_name); +if (!machine->kernel_filename) { +/* We can't boot without a bootrom or a kernel image. */ +exit(1); +} +return; +} +ret = load_image_mr(filename, machine->ram); +if (ret < 0) { +error_report("Failed to load ROM image '%s'", filename); +exit(1); +} +} + +static void npcm8xx_connect_flash(NPCM7xxFIUState *fiu, int cs_no, + const char *flash_type, DriveInfo *dinfo) +{ +DeviceState *flash; +qemu_irq flash_cs; + +flash = qdev_new(flash_type); +if (dinfo) { +qdev_prop_set_drive(flash, "drive", blk_by_legacy_dinfo(dinfo)); +} +qdev_realize_and_unref(flash, BUS(fiu->spi), &error_fatal); + +flash_cs = qdev_get_gpio_in_named(flash, SSI_GPIO_CS, 0); +qdev_connect_gpio_out_named(DEVICE(fiu), "cs", cs_no, flash_cs); +} + +static void npcm8xx_connect_dram(NPCM8xxState *soc, MemoryRegion *dram) +{ +memory_region_add_subregion(get_system_memory(), NPCM8XX_DRAM_BA, dram); + +object_property_set_link(OBJECT(soc), "dram-mr", OBJECT(dram), + &error_abort); +} + +static NPCM8xxState *npcm8xx_create_soc(MachineState *machine, +uint32_t hw_straps) +{ +NPCM8xxMachineClass *nmc = NPCM8XX_MACHINE_GET_CLASS(machine); +MachineClass *mc = MACHINE_CLASS(nmc); +Object *obj; + +if (strcmp(machine->cpu_type, mc->default_cpu_type)
!= 0) { +error_report("This board can only be used with %s", + mc->default_cpu_type); +exit(1); +} + +obj = object_new_with_props(nmc->soc_type, OBJECT(machine), "soc", +&error_abort, NULL); +object_property_set_uint(obj, "power-on-straps", hw_straps, &error_abort); + +return NPCM8XX(obj); +} + +static I2CBus *npcm8xx_i2c_get_bus(NPCM8xxState *soc, uint32_t num) +{ +g_assert(num < ARRAY_SIZE(soc->smbus)); +return I2C_BUS(qdev_get_child_bus(DEVICE(&soc->smbus[num]), "i2c-bus")); +} + +static void npcm8xx_init_pwm_splitter(NPCM8xxMachine *machine, + NPCM8xxState *soc, const int *fan_counts) +{ +SplitIRQ *splitters = machine->fan_splitter; + +/* + * PWM 0~3 belong to module 0 output 0~3. + * PWM 4~7 belong to
[PATCH for-7.1 05/11] hw/misc: Store DRAM size in NPCM8XX GCR Module
The NPCM8XX boot block stores the DRAM size in the SCRPAD_B register of the GCR module. Since we don't simulate a detailed memory controller, we need to store this information directly, similar to the NPCM7XX's INTCR3 register. Signed-off-by: Hao Wu Reviewed-by: Titus Rwantare --- hw/misc/npcm_gcr.c | 33 ++--- include/hw/misc/npcm_gcr.h | 1 + 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/hw/misc/npcm_gcr.c b/hw/misc/npcm_gcr.c index 2349949599..14c298602a 100644 --- a/hw/misc/npcm_gcr.c +++ b/hw/misc/npcm_gcr.c @@ -267,7 +267,7 @@ static const struct MemoryRegionOps npcm_gcr_ops = { }, }; -static void npcm_gcr_enter_reset(Object *obj, ResetType type) +static void npcm7xx_gcr_enter_reset(Object *obj, ResetType type) { NPCMGCRState *s = NPCM_GCR(obj); NPCMGCRClass *c = NPCM_GCR_GET_CLASS(obj); @@ -283,6 +283,23 @@ static void npcm_gcr_enter_reset(Object *obj, ResetType type) } } +static void npcm8xx_gcr_enter_reset(Object *obj, ResetType type) +{ +NPCMGCRState *s = NPCM_GCR(obj); +NPCMGCRClass *c = NPCM_GCR_GET_CLASS(obj); + +switch (type) { +case RESET_TYPE_COLD: +memcpy(s->regs, c->cold_reset_values, c->nr_regs * sizeof(uint32_t)); +/* These 3 registers are at the same location in both 7xx and 8xx. */ +s->regs[NPCM8XX_GCR_PWRON] = s->reset_pwron; +s->regs[NPCM8XX_GCR_MDLR] = s->reset_mdlr; +s->regs[NPCM8XX_GCR_INTCR3] = s->reset_intcr3; +s->regs[NPCM8XX_GCR_SCRPAD_B] = s->reset_scrpad_b; +break; +} +} + static void npcm_gcr_realize(DeviceState *dev, Error **errp) { ERRP_GUARD(); @@ -326,6 +343,14 @@ static void npcm_gcr_realize(DeviceState *dev, Error **errp) * https://github.com/Nuvoton-Israel/u-boot/blob/2aef993bd2aafeb5408dbaad0f3ce099ee40c4aa/board/nuvoton/poleg/poleg.c#L244 */ s->reset_intcr3 |= ctz64(dram_size / NPCM7XX_GCR_MIN_DRAM_SIZE) << 8; + +/* + * The boot block starting from 0.0.6 for NPCM8xx SoCs stores the DRAM size + * in the SCRPAD2 registers. We need to set this field correctly since + * the initialization is skipped as we mentioned above.
+ * https://github.com/Nuvoton-Israel/u-boot/blob/npcm8mnx-v2019.01_tmp/board/nuvoton/arbel/arbel.c#L737 + */ +s->reset_scrpad_b = dram_size; } static void npcm_gcr_init(Object *obj) @@ -355,12 +380,10 @@ static Property npcm_gcr_properties[] = { static void npcm_gcr_class_init(ObjectClass *klass, void *data) { -ResettableClass *rc = RESETTABLE_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); dc->realize = npcm_gcr_realize; dc->vmsd = _npcm_gcr; -rc->phases.enter = npcm_gcr_enter_reset; device_class_set_props(dc, npcm_gcr_properties); } @@ -369,24 +392,28 @@ static void npcm7xx_gcr_class_init(ObjectClass *klass, void *data) { NPCMGCRClass *c = NPCM_GCR_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); +ResettableClass *rc = RESETTABLE_CLASS(klass); QEMU_BUILD_BUG_ON(NPCM7XX_GCR_REGS_END > NPCM_GCR_MAX_NR_REGS); QEMU_BUILD_BUG_ON(NPCM7XX_GCR_REGS_END != NPCM7XX_GCR_NR_REGS); dc->desc = "NPCM7xx System Global Control Registers"; c->nr_regs = NPCM7XX_GCR_NR_REGS; c->cold_reset_values = npcm7xx_cold_reset_values; +rc->phases.enter = npcm7xx_gcr_enter_reset; } static void npcm8xx_gcr_class_init(ObjectClass *klass, void *data) { NPCMGCRClass *c = NPCM_GCR_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); +ResettableClass *rc = RESETTABLE_CLASS(klass); QEMU_BUILD_BUG_ON(NPCM8XX_GCR_REGS_END > NPCM_GCR_MAX_NR_REGS); QEMU_BUILD_BUG_ON(NPCM8XX_GCR_REGS_END != NPCM8XX_GCR_NR_REGS); dc->desc = "NPCM8xx System Global Control Registers"; c->nr_regs = NPCM8XX_GCR_NR_REGS; c->cold_reset_values = npcm8xx_cold_reset_values; +rc->phases.enter = npcm8xx_gcr_enter_reset; } static const TypeInfo npcm_gcr_info[] = { diff --git a/include/hw/misc/npcm_gcr.h b/include/hw/misc/npcm_gcr.h index ac3d781c2e..bd69199d51 100644 --- a/include/hw/misc/npcm_gcr.h +++ b/include/hw/misc/npcm_gcr.h @@ -39,6 +39,7 @@ typedef struct NPCMGCRState { uint32_t reset_pwron; uint32_t reset_mdlr; uint32_t reset_intcr3; +uint32_t reset_scrpad_b; } NPCMGCRState; typedef struct NPCMGCRClass { -- 
2.35.1.1094.g7c7d902a7c-goog
[PATCH for-7.1 08/11] hw/net: Add NPCM8XX PCS Module
The PCS exists in NPCM8XX's GMAC1 and is used to control the SGMII PHY. This implementation contains all the default registers and the soft reset feature that are required to load the Linux kernel driver. Further features have not been implemented yet. Signed-off-by: Hao Wu Reviewed-by: Titus Rwantare --- hw/net/meson.build| 1 + hw/net/npcm_pcs.c | 409 ++ hw/net/trace-events | 4 + include/hw/net/npcm_pcs.h | 42 4 files changed, 456 insertions(+) create mode 100644 hw/net/npcm_pcs.c create mode 100644 include/hw/net/npcm_pcs.h diff --git a/hw/net/meson.build b/hw/net/meson.build index 685b75badb..4cba3e66db 100644 --- a/hw/net/meson.build +++ b/hw/net/meson.build @@ -37,6 +37,7 @@ softmmu_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c')) softmmu_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c')) softmmu_ss.add(when: 'CONFIG_SUNGEM', if_true: files('sungem.c')) softmmu_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c')) +softmmu_ss.add(when: 'CONFIG_NPCM8XX', if_true: files('npcm_pcs.c')) softmmu_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_eth.c')) softmmu_ss.add(when: 'CONFIG_COLDFIRE', if_true: files('mcf_fec.c')) diff --git a/hw/net/npcm_pcs.c b/hw/net/npcm_pcs.c new file mode 100644 index 00..efe5f68d9c --- /dev/null +++ b/hw/net/npcm_pcs.c @@ -0,0 +1,409 @@ +/* + * Nuvoton NPCM8xx PCS Module + * + * Copyright 2022 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ + +/* + * Disclaimer: + * Currently we only implemented the default values of the registers and + * the soft reset feature. These are required to boot up the GMAC module + * in Linux kernel for NPCM845 boards. Other functionalities are not modeled. + */ + +#include "qemu/osdep.h" + +#include "exec/hwaddr.h" +#include "hw/registerfields.h" +#include "hw/net/npcm_pcs.h" +#include "migration/vmstate.h" +#include "qemu/log.h" +#include "qemu/units.h" +#include "trace.h" + +#define NPCM_PCS_IND_AC_BA 0x1fe +#define NPCM_PCS_IND_SR_CTL 0x1e00 +#define NPCM_PCS_IND_SR_MII 0x1f00 +#define NPCM_PCS_IND_SR_TIM 0x1f07 +#define NPCM_PCS_IND_VR_MII 0x1f80 + +REG16(NPCM_PCS_SR_CTL_ID1, 0x08) +REG16(NPCM_PCS_SR_CTL_ID2, 0x0a) +REG16(NPCM_PCS_SR_CTL_STS, 0x10) + +REG16(NPCM_PCS_SR_MII_CTRL, 0x00) +REG16(NPCM_PCS_SR_MII_STS, 0x02) +REG16(NPCM_PCS_SR_MII_DEV_ID1, 0x04) +REG16(NPCM_PCS_SR_MII_DEV_ID2, 0x06) +REG16(NPCM_PCS_SR_MII_AN_ADV, 0x08) +REG16(NPCM_PCS_SR_MII_LP_BABL, 0x0a) +REG16(NPCM_PCS_SR_MII_AN_EXPN, 0x0c) +REG16(NPCM_PCS_SR_MII_EXT_STS, 0x1e) + +REG16(NPCM_PCS_SR_TIM_SYNC_ABL, 0x10) +REG16(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR, 0x12) +REG16(NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR, 0x14) +REG16(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR, 0x16) +REG16(NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR, 0x18) +REG16(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR, 0x1a) +REG16(NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR, 0x1c) +REG16(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR, 0x1e) +REG16(NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR, 0x20) + +REG16(NPCM_PCS_VR_MII_MMD_DIG_CTRL1, 0x000) +REG16(NPCM_PCS_VR_MII_AN_CTRL, 0x002) +REG16(NPCM_PCS_VR_MII_AN_INTR_STS, 0x004) +REG16(NPCM_PCS_VR_MII_TC, 0x006) +REG16(NPCM_PCS_VR_MII_DBG_CTRL, 0x00a) +REG16(NPCM_PCS_VR_MII_EEE_MCTRL0, 0x00c) +REG16(NPCM_PCS_VR_MII_EEE_TXTIMER, 0x010) +REG16(NPCM_PCS_VR_MII_EEE_RXTIMER, 0x012) +REG16(NPCM_PCS_VR_MII_LINK_TIMER_CTRL, 0x014) +REG16(NPCM_PCS_VR_MII_EEE_MCTRL1, 0x016) +REG16(NPCM_PCS_VR_MII_DIG_STS, 0x020) +REG16(NPCM_PCS_VR_MII_ICG_ERRCNT1, 
0x022) +REG16(NPCM_PCS_VR_MII_MISC_STS, 0x030) +REG16(NPCM_PCS_VR_MII_RX_LSTS, 0x040) +REG16(NPCM_PCS_VR_MII_MP_TX_BSTCTRL0, 0x070) +REG16(NPCM_PCS_VR_MII_MP_TX_LVLCTRL0, 0x074) +REG16(NPCM_PCS_VR_MII_MP_TX_GENCTRL0, 0x07a) +REG16(NPCM_PCS_VR_MII_MP_TX_GENCTRL1, 0x07c) +REG16(NPCM_PCS_VR_MII_MP_TX_STS, 0x090) +REG16(NPCM_PCS_VR_MII_MP_RX_GENCTRL0, 0x0b0) +REG16(NPCM_PCS_VR_MII_MP_RX_GENCTRL1, 0x0b2) +REG16(NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0, 0x0ba) +REG16(NPCM_PCS_VR_MII_MP_MPLL_CTRL0, 0x0f0) +REG16(NPCM_PCS_VR_MII_MP_MPLL_CTRL1, 0x0f2) +REG16(NPCM_PCS_VR_MII_MP_MPLL_STS, 0x110) +REG16(NPCM_PCS_VR_MII_MP_MISC_CTRL2, 0x126) +REG16(NPCM_PCS_VR_MII_MP_LVL_CTRL, 0x130) +REG16(NPCM_PCS_VR_MII_MP_MISC_CTRL0, 0x132) +REG16(NPCM_PCS_VR_MII_MP_MISC_CTRL1, 0x134) +REG16(NPCM_PCS_VR_MII_DIG_CTRL2, 0x1c2) +REG16(NPCM_PCS_VR_MII_DIG_ERRCNT_SEL, 0x1c4) + +/* Register Fields */ +#define NPCM_PCS_SR_MII_CTRL_RSTBIT(15) + +static const uint16_t
[PATCH for-7.1 07/11] hw/misc: Support 8-bytes memop in NPCM GCR module
The NPCM8xx GCR device can be accessed with 64-bit memory operations. This patch supports that. Signed-off-by: Hao Wu Reviewed-by: Patrick Venture --- hw/misc/npcm_gcr.c | 98 +--- hw/misc/trace-events | 4 +- 2 files changed, 77 insertions(+), 25 deletions(-) diff --git a/hw/misc/npcm_gcr.c b/hw/misc/npcm_gcr.c index 14c298602a..aa81db23d7 100644 --- a/hw/misc/npcm_gcr.c +++ b/hw/misc/npcm_gcr.c @@ -201,6 +201,7 @@ static uint64_t npcm_gcr_read(void *opaque, hwaddr offset, unsigned size) uint32_t reg = offset / sizeof(uint32_t); NPCMGCRState *s = opaque; NPCMGCRClass *c = NPCM_GCR_GET_CLASS(s); +uint64_t value; if (reg >= c->nr_regs) { qemu_log_mask(LOG_GUEST_ERROR, @@ -209,9 +210,23 @@ static uint64_t npcm_gcr_read(void *opaque, hwaddr offset, unsigned size) return 0; } -trace_npcm_gcr_read(offset, s->regs[reg]); +switch (size) { +case 4: +value = s->regs[reg]; +break; + +case 8: +value = s->regs[reg] + (((uint64_t)s->regs[reg + 1]) << 32); +break; + +default: +g_assert_not_reached(); +} -return s->regs[reg]; +if (s->regs[reg] != 0) { +trace_npcm_gcr_read(offset, value); +} +return value; } static void npcm_gcr_write(void *opaque, hwaddr offset, @@ -222,7 +237,7 @@ static void npcm_gcr_write(void *opaque, hwaddr offset, NPCMGCRClass *c = NPCM_GCR_GET_CLASS(s); uint32_t value = v; -trace_npcm_gcr_write(offset, value); +trace_npcm_gcr_write(offset, v); if (reg >= c->nr_regs) { qemu_log_mask(LOG_GUEST_ERROR, @@ -231,29 +246,65 @@ static void npcm_gcr_write(void *opaque, hwaddr offset, return; } -switch (reg) { -case NPCM7XX_GCR_PDID: -case NPCM7XX_GCR_PWRON: -case NPCM7XX_GCR_INTSR: -qemu_log_mask(LOG_GUEST_ERROR, - "%s: register @ 0x%04" HWADDR_PRIx " is read-only\n", - __func__, offset); -return; - -case NPCM7XX_GCR_RESSR: -case NPCM7XX_GCR_CP2BST: -/* Write 1 to clear */ -value = s->regs[reg] & ~value; +switch (size) { +case 4: +switch (reg) { +case NPCM7XX_GCR_PDID: +case NPCM7XX_GCR_PWRON: +case NPCM7XX_GCR_INTSR: +qemu_log_mask(LOG_GUEST_ERROR, + "%s: register @ 
0x%04" HWADDR_PRIx " is read-only\n", + __func__, offset); +return; + +case NPCM7XX_GCR_RESSR: +case NPCM7XX_GCR_CP2BST: +/* Write 1 to clear */ +value = s->regs[reg] & ~value; +break; + +case NPCM7XX_GCR_RLOCKR1: +case NPCM7XX_GCR_MDLR: +/* Write 1 to set */ +value |= s->regs[reg]; +break; +}; +s->regs[reg] = value; break; -case NPCM7XX_GCR_RLOCKR1: -case NPCM7XX_GCR_MDLR: -/* Write 1 to set */ -value |= s->regs[reg]; +case 8: +s->regs[reg] = value; +s->regs[reg + 1] = v >> 32; break; -}; -s->regs[reg] = value; +default: +g_assert_not_reached(); +} +} + +static bool npcm_gcr_check_mem_op(void *opaque, hwaddr offset, + unsigned size, bool is_write, + MemTxAttrs attrs) +{ +NPCMGCRClass *c = NPCM_GCR_GET_CLASS(opaque); + +if (offset >= c->nr_regs * sizeof(uint32_t)) { +return false; +} + +switch (size) { +case 4: +return true; +case 8: +if (offset >= NPCM8XX_GCR_SCRPAD_00 * sizeof(uint32_t) && +offset < (NPCM8XX_GCR_NR_REGS - 1) * sizeof(uint32_t)) { +return true; +} else { +return false; +} +default: +return false; +} } static const struct MemoryRegionOps npcm_gcr_ops = { @@ -262,7 +313,8 @@ static const struct MemoryRegionOps npcm_gcr_ops = { .endianness = DEVICE_LITTLE_ENDIAN, .valid = { .min_access_size= 4, -.max_access_size= 4, +.max_access_size= 8, +.accepts= npcm_gcr_check_mem_op, .unaligned = false, }, }; diff --git a/hw/misc/trace-events b/hw/misc/trace-events index 02650acfff..2ffec963e7 100644 --- a/hw/misc/trace-events +++ b/hw/misc/trace-events @@ -103,8 +103,8 @@ npcm_clk_read(uint64_t offset, uint32_t value) " offset: 0x%04" PRIx64 " value: npcm_clk_write(uint64_t offset, uint32_t value) "offset: 0x%04" PRIx64 " value: 0x%08" PRIx32 # npcm_gcr.c -npcm_gcr_read(uint64_t offset, uint32_t value) " offset: 0x%04" PRIx64 " value: 0x%08" PRIx32 -npcm_gcr_write(uint64_t offset, uint32_t value) "offset: 0x%04" PRIx64 " value: 0x%08" PRIx32 +npcm_gcr_read(uint64_t offset, uint64_t value) " offset: 0x%04" PRIx64 " value: 0x%08" PRIx64 +npcm_gcr_write(uint64_t 
offset, uint64_t
[PATCH for-7.1 06/11] hw/intc: Add a property to allow GIC to reset into non secure mode
This property allows certain boards like NPCM8xx to boot the kernel directly into non-secure mode. This is necessary since we do not support secure boot features for NPCM8xx yet. Signed-off-by: Hao Wu Reviewed-by: Patrick Venture --- hw/intc/arm_gic_common.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c index 7b44d5625b..7ddc5cfbd0 100644 --- a/hw/intc/arm_gic_common.c +++ b/hw/intc/arm_gic_common.c @@ -358,6 +358,8 @@ static Property arm_gic_common_properties[] = { /* True if the GIC should implement the virtualization extensions */ DEFINE_PROP_BOOL("has-virtualization-extensions", GICState, virt_extn, 0), DEFINE_PROP_UINT32("num-priority-bits", GICState, n_prio_bits, 8), +/* True if we want to boot the kernel directly into NonSecure */ +DEFINE_PROP_BOOL("irq-reset-nonsecure", GICState, irq_reset_nonsecure, 0), DEFINE_PROP_END_OF_LIST(), }; -- 2.35.1.1094.g7c7d902a7c-goog
[PATCH for-7.1 03/11] hw/misc: Support NPCM8XX GCR module
The NPCM8XX has a different set of global control registers than the 7XX. This patch supports that. Signed-off-by: Hao Wu Reviewed-by: Titus Rwantare --- MAINTAINERS | 9 +- hw/misc/meson.build | 2 +- hw/misc/npcm7xx_gcr.c | 269 hw/misc/npcm_gcr.c| 413 ++ hw/misc/trace-events | 6 +- include/hw/arm/npcm7xx.h | 4 +- include/hw/misc/{npcm7xx_gcr.h => npcm_gcr.h} | 29 +- 7 files changed, 445 insertions(+), 287 deletions(-) delete mode 100644 hw/misc/npcm7xx_gcr.c create mode 100644 hw/misc/npcm_gcr.c rename include/hw/misc/{npcm7xx_gcr.h => npcm_gcr.h} (56%) diff --git a/MAINTAINERS b/MAINTAINERS index 4ad2451e03..c31ed09527 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -791,14 +791,15 @@ F: hw/net/mv88w8618_eth.c F: include/hw/net/mv88w8618_eth.h F: docs/system/arm/musicpal.rst -Nuvoton NPCM7xx +Nuvoton NPCM M: Havard Skinnemoen M: Tyrone Ting +M: Hao Wu L: qemu-...@nongnu.org S: Supported -F: hw/*/npcm7xx* -F: include/hw/*/npcm7xx* -F: tests/qtest/npcm7xx* +F: hw/*/npcm* +F: include/hw/*/npcm* +F: tests/qtest/npcm* F: pc-bios/npcm7xx_bootrom.bin F: roms/vbootrom F: docs/system/arm/nuvoton.rst diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 6fb69612e0..13f8fee5b6 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -61,7 +61,7 @@ softmmu_ss.add(when: 'CONFIG_IMX', if_true: files( softmmu_ss.add(when: 'CONFIG_MAINSTONE', if_true: files('mst_fpga.c')) softmmu_ss.add(when: 'CONFIG_NPCM7XX', if_true: files( 'npcm7xx_clk.c', - 'npcm7xx_gcr.c', + 'npcm_gcr.c', 'npcm7xx_mft.c', 'npcm7xx_pwm.c', 'npcm7xx_rng.c', diff --git a/hw/misc/npcm7xx_gcr.c b/hw/misc/npcm7xx_gcr.c deleted file mode 100644 index eace9e1967..00 --- a/hw/misc/npcm7xx_gcr.c +++ /dev/null @@ -1,269 +0,0 @@ -/* - * Nuvoton NPCM7xx System Global Control Registers.
- * - * Copyright 2020 Google LLC - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - */ - -#include "qemu/osdep.h" - -#include "hw/misc/npcm7xx_gcr.h" -#include "hw/qdev-properties.h" -#include "migration/vmstate.h" -#include "qapi/error.h" -#include "qemu/cutils.h" -#include "qemu/log.h" -#include "qemu/module.h" -#include "qemu/units.h" - -#include "trace.h" - -#define NPCM7XX_GCR_MIN_DRAM_SIZE (128 * MiB) -#define NPCM7XX_GCR_MAX_DRAM_SIZE (2 * GiB) - -enum NPCM7xxGCRRegisters { -NPCM7XX_GCR_PDID, -NPCM7XX_GCR_PWRON, -NPCM7XX_GCR_MFSEL1 = 0x0c / sizeof(uint32_t), -NPCM7XX_GCR_MFSEL2, -NPCM7XX_GCR_MISCPE, -NPCM7XX_GCR_SPSWC = 0x038 / sizeof(uint32_t), -NPCM7XX_GCR_INTCR, -NPCM7XX_GCR_INTSR, -NPCM7XX_GCR_HIFCR = 0x050 / sizeof(uint32_t), -NPCM7XX_GCR_INTCR2 = 0x060 / sizeof(uint32_t), -NPCM7XX_GCR_MFSEL3, -NPCM7XX_GCR_SRCNT, -NPCM7XX_GCR_RESSR, -NPCM7XX_GCR_RLOCKR1, -NPCM7XX_GCR_FLOCKR1, -NPCM7XX_GCR_DSCNT, -NPCM7XX_GCR_MDLR, -NPCM7XX_GCR_SCRPAD3, -NPCM7XX_GCR_SCRPAD2, -NPCM7XX_GCR_DAVCLVLR= 0x098 / sizeof(uint32_t), -NPCM7XX_GCR_INTCR3, -NPCM7XX_GCR_VSINTR = 0x0ac / sizeof(uint32_t), -NPCM7XX_GCR_MFSEL4, -NPCM7XX_GCR_CPBPNTR = 0x0c4 / sizeof(uint32_t), -NPCM7XX_GCR_CPCTL = 0x0d0 / sizeof(uint32_t), -NPCM7XX_GCR_CP2BST, -NPCM7XX_GCR_B2CPNT, -NPCM7XX_GCR_CPPCTL, -NPCM7XX_GCR_I2CSEGSEL, -NPCM7XX_GCR_I2CSEGCTL, -NPCM7XX_GCR_VSRCR, -NPCM7XX_GCR_MLOCKR, -NPCM7XX_GCR_SCRPAD = 0x013c / sizeof(uint32_t), -NPCM7XX_GCR_USB1PHYCTL, -NPCM7XX_GCR_USB2PHYCTL, -NPCM7XX_GCR_REGS_END, -}; - -static const uint32_t 
cold_reset_values[NPCM7XX_GCR_NR_REGS] = { -[NPCM7XX_GCR_PDID] = 0x04a92750, /* Poleg A1 */ -[NPCM7XX_GCR_MISCPE]= 0x, -[NPCM7XX_GCR_SPSWC] = 0x0003, -[NPCM7XX_GCR_INTCR] = 0x035e, -[NPCM7XX_GCR_HIFCR] = 0x004e, -[NPCM7XX_GCR_INTCR2]= (1U << 19), /* DDR initialized */ -[NPCM7XX_GCR_RESSR] = 0x8000, -[NPCM7XX_GCR_DSCNT] = 0x00c0, -[NPCM7XX_GCR_DAVCLVLR] = 0x5a00f3cf, -[NPCM7XX_GCR_SCRPAD]= 0x0008, -[NPCM7XX_GCR_USB1PHYCTL]= 0x034730e4, -[NPCM7XX_GCR_USB2PHYCTL]= 0x034730e4, -}; - -static uint64_t npcm7xx_gcr_read(void *opaque, hwaddr offset, unsigned size) -{ -uint32_t reg = offset /
[PATCH for-7.1 01/11] docs/system/arm: Add Description for NPCM8XX SoC
The NPCM8XX SoC is the successor of the NPCM7XX. It features quad-core Cortex-A35 (Armv8, 64-bit) CPUs and some additional peripherals. Signed-off-by: Hao Wu Reviewed-by: Patrick Venture --- docs/system/arm/nuvoton.rst | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst index ef2792076a..bead17fa7e 100644 --- a/docs/system/arm/nuvoton.rst +++ b/docs/system/arm/nuvoton.rst @@ -1,12 +1,13 @@ Nuvoton iBMC boards (``*-bmc``, ``npcm750-evb``, ``quanta-gsj``) -The `Nuvoton iBMC`_ chips (NPCM7xx) are a family of ARM-based SoCs that are +The `Nuvoton iBMC`_ chips are a family of ARM-based SoCs that are designed to be used as Baseboard Management Controllers (BMCs) in various -servers. They all feature one or two ARM Cortex-A9 CPU cores, as well as an -assortment of peripherals targeted for either Enterprise or Data Center / -Hyperscale applications. The former is a superset of the latter, so NPCM750 has -all the peripherals of NPCM730 and more. +servers. Currently there are two families: the NPCM7XX series and the +NPCM8XX series. The NPCM7XX series features one or two ARM Cortex-A9 CPU cores, +while the NPCM8XX series features four ARM Cortex-A35 CPU cores. Both series contain a +different assortment of peripherals targeted for either Enterprise or Data +Center / Hyperscale applications. .. _Nuvoton iBMC: https://www.nuvoton.com/products/cloud-computing/ibmc/ @@ -27,6 +28,8 @@ There are also two more SoCs, NPCM710 and NPCM705, which are single-core variants of NPCM750 and NPCM730, respectively. These are currently not supported by QEMU. +The NPCM8xx SoC is the successor of the NPCM7xx SoC.
+ Supported devices - @@ -61,6 +64,8 @@ Missing devices * System Wake-up Control (SWC) * Shared memory (SHM) * eSPI slave interface + * Block-transfer interface (8XX only) + * Virtual UART (8XX only) * Ethernet controller (GMAC) * USB device (USBD) @@ -76,6 +81,11 @@ Missing devices * Video capture * Encoding compression engine * Security features + * I3C buses (8XX only) + * Temperature sensor interface (8XX only) + * Virtual UART (8XX only) + * Flash monitor (8XX only) + * JTAG master (8XX only) Boot options -- 2.35.1.1094.g7c7d902a7c-goog
[PATCH for-7.1 04/11] hw/misc: Support NPCM8XX CLK Module Registers
The NPCM8XX adds a few new registers to the CLK module and has a different set of reset values. This patch supports them. It doesn't yet support the new clock values generated by these registers: currently no modules use these new clock values, so they are not necessary at this point. Implementing these clocks might be required when implementing those modules. Signed-off-by: Hao Wu Reviewed-by: Titus Rwantare --- hw/misc/meson.build | 2 +- hw/misc/{npcm7xx_clk.c => npcm_clk.c} | 238 ++ hw/misc/trace-events | 6 +- include/hw/arm/npcm7xx.h | 4 +- include/hw/misc/{npcm7xx_clk.h => npcm_clk.h} | 43 ++-- 5 files changed, 219 insertions(+), 74 deletions(-) rename hw/misc/{npcm7xx_clk.c => npcm_clk.c} (81%) rename include/hw/misc/{npcm7xx_clk.h => npcm_clk.h} (83%) diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 13f8fee5b6..b4e9d3f857 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -60,7 +60,7 @@ softmmu_ss.add(when: 'CONFIG_IMX', if_true: files( )) softmmu_ss.add(when: 'CONFIG_MAINSTONE', if_true: files('mst_fpga.c')) softmmu_ss.add(when: 'CONFIG_NPCM7XX', if_true: files( - 'npcm7xx_clk.c', + 'npcm_clk.c', 'npcm_gcr.c', 'npcm7xx_mft.c', 'npcm7xx_pwm.c', diff --git a/hw/misc/npcm7xx_clk.c b/hw/misc/npcm_clk.c similarity index 81% rename from hw/misc/npcm7xx_clk.c rename to hw/misc/npcm_clk.c index bc2b879feb..f4601a3e9a 100644 --- a/hw/misc/npcm7xx_clk.c +++ b/hw/misc/npcm_clk.c @@ -1,5 +1,5 @@ /* - * Nuvoton NPCM7xx Clock Control Registers. + * Nuvoton NPCM7xx/8xx Clock Control Registers.
* * Copyright 2020 Google LLC * @@ -16,7 +16,7 @@ #include "qemu/osdep.h" -#include "hw/misc/npcm7xx_clk.h" +#include "hw/misc/npcm_clk.h" #include "hw/timer/npcm7xx_timer.h" #include "hw/qdev-clock.h" #include "migration/vmstate.h" @@ -75,13 +75,65 @@ enum NPCM7xxCLKRegisters { NPCM7XX_CLK_REGS_END, }; +enum NPCM8xxCLKRegisters { +NPCM8XX_CLK_CLKEN1, +NPCM8XX_CLK_CLKSEL, +NPCM8XX_CLK_CLKDIV1, +NPCM8XX_CLK_PLLCON0, +NPCM8XX_CLK_PLLCON1, +NPCM8XX_CLK_SWRSTR, +NPCM8XX_CLK_IPSRST1 = 0x20 / sizeof(uint32_t), +NPCM8XX_CLK_IPSRST2, +NPCM8XX_CLK_CLKEN2, +NPCM8XX_CLK_CLKDIV2, +NPCM8XX_CLK_CLKEN3, +NPCM8XX_CLK_IPSRST3, +NPCM8XX_CLK_WD0RCR, +NPCM8XX_CLK_WD1RCR, +NPCM8XX_CLK_WD2RCR, +NPCM8XX_CLK_SWRSTC1, +NPCM8XX_CLK_SWRSTC2, +NPCM8XX_CLK_SWRSTC3, +NPCM8XX_CLK_TIPRSTC, +NPCM8XX_CLK_PLLCON2, +NPCM8XX_CLK_CLKDIV3, +NPCM8XX_CLK_CORSTC, +NPCM8XX_CLK_PLLCONG, +NPCM8XX_CLK_AHBCKFI, +NPCM8XX_CLK_SECCNT, +NPCM8XX_CLK_CNTR25M, +/* Registers unique to NPCM8XX SoC */ +NPCM8XX_CLK_CLKEN4, +NPCM8XX_CLK_IPSRST4, +NPCM8XX_CLK_BUSTO, +NPCM8XX_CLK_CLKDIV4, +NPCM8XX_CLK_WD0RCRB, +NPCM8XX_CLK_WD1RCRB, +NPCM8XX_CLK_WD2RCRB, +NPCM8XX_CLK_SWRSTC1B, +NPCM8XX_CLK_SWRSTC2B, +NPCM8XX_CLK_SWRSTC3B, +NPCM8XX_CLK_TIPRSTCB, +NPCM8XX_CLK_CORSTCB, +NPCM8XX_CLK_IPSRSTDIS1, +NPCM8XX_CLK_IPSRSTDIS2, +NPCM8XX_CLK_IPSRSTDIS3, +NPCM8XX_CLK_IPSRSTDIS4, +NPCM8XX_CLK_CLKENDIS1, +NPCM8XX_CLK_CLKENDIS2, +NPCM8XX_CLK_CLKENDIS3, +NPCM8XX_CLK_CLKENDIS4, +NPCM8XX_CLK_THRTL_CNT, +NPCM8XX_CLK_REGS_END, +}; + /* * These reset values were taken from version 0.91 of the NPCM750R data sheet. * * All are loaded on power-up reset. CLKENx and SWRSTR should also be loaded on * core domain reset, but this reset type is not yet supported by QEMU. 
*/ -static const uint32_t cold_reset_values[NPCM7XX_CLK_NR_REGS] = { +static const uint32_t npcm7xx_cold_reset_values[NPCM7XX_CLK_NR_REGS] = { [NPCM7XX_CLK_CLKEN1]= 0x, [NPCM7XX_CLK_CLKSEL]= 0x004a, [NPCM7XX_CLK_CLKDIV1] = 0x5413f855, @@ -103,6 +155,46 @@ static const uint32_t cold_reset_values[NPCM7XX_CLK_NR_REGS] = { [NPCM7XX_CLK_AHBCKFI] = 0x00c8, }; +/* + * These reset values were taken from version 0.92 of the NPCM8xx data sheet. + */ +static const uint32_t npcm8xx_cold_reset_values[NPCM8XX_CLK_NR_REGS] = { +[NPCM8XX_CLK_CLKEN1]= 0x, +[NPCM8XX_CLK_CLKSEL]= 0x154a, +[NPCM8XX_CLK_CLKDIV1] = 0x5413f855, +[NPCM8XX_CLK_PLLCON0] = 0x00222101 | PLLCON_LOKI, +[NPCM8XX_CLK_PLLCON1] = 0x00202101 | PLLCON_LOKI, +[NPCM8XX_CLK_IPSRST1] = 0x1000, +[NPCM8XX_CLK_IPSRST2] = 0x8000, +[NPCM8XX_CLK_CLKEN2]= 0x, +[NPCM8XX_CLK_CLKDIV2] = 0xaa4f8f9f, +[NPCM8XX_CLK_CLKEN3]= 0x, +[NPCM8XX_CLK_IPSRST3] = 0x0300, +[NPCM8XX_CLK_WD0RCR]= 0x, +[NPCM8XX_CLK_WD1RCR]= 0x, +[NPCM8XX_CLK_WD2RCR]= 0x, +[NPCM8XX_CLK_SWRSTC1] = 0x0003, +[NPCM8XX_CLK_SWRSTC2] = 0x0001, +
[PATCH for-7.1 10/11] hw/arm: Add NPCM8XX SoC
This file contains a basic NPCM8XX SoC implementation. It's forked from the NPCM7XX SoC with some changes. Signed-off-by: Hao Wu Reviewed-by: Patrick Venture Reviewed-by: Titus Rwantare --- configs/devices/aarch64-softmmu/default.mak | 1 + hw/arm/Kconfig | 11 + hw/arm/meson.build | 1 + hw/arm/npcm8xx.c| 806 include/hw/arm/npcm8xx.h| 106 +++ 5 files changed, 925 insertions(+) create mode 100644 hw/arm/npcm8xx.c create mode 100644 include/hw/arm/npcm8xx.h diff --git a/configs/devices/aarch64-softmmu/default.mak b/configs/devices/aarch64-softmmu/default.mak index cf43ac8da1..1c3cf6dda1 100644 --- a/configs/devices/aarch64-softmmu/default.mak +++ b/configs/devices/aarch64-softmmu/default.mak @@ -6,3 +6,4 @@ include ../arm-softmmu/default.mak CONFIG_XLNX_ZYNQMP_ARM=y CONFIG_XLNX_VERSAL=y CONFIG_SBSA_REF=y +CONFIG_NPCM8XX=y diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig index 97f3b38019..ed5d37ba01 100644 --- a/hw/arm/Kconfig +++ b/hw/arm/Kconfig @@ -408,6 +408,17 @@ config NPCM7XX select UNIMP select PCA954X +config NPCM8XX +bool +select ARM_GIC +select SMBUS +select PL310 # cache controller +select NPCM7XX +select SERIAL +select SSI +select UNIMP + + config FSL_IMX25 bool imply I2C_DEVICES diff --git a/hw/arm/meson.build b/hw/arm/meson.build index 721a8eb8be..cf824241c5 100644 --- a/hw/arm/meson.build +++ b/hw/arm/meson.build @@ -14,6 +14,7 @@ arm_ss.add(when: 'CONFIG_MUSICPAL', if_true: files('musicpal.c')) arm_ss.add(when: 'CONFIG_NETDUINO2', if_true: files('netduino2.c')) arm_ss.add(when: 'CONFIG_NETDUINOPLUS2', if_true: files('netduinoplus2.c')) arm_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx.c', 'npcm7xx_boards.c')) +arm_ss.add(when: 'CONFIG_NPCM8XX', if_true: files('npcm8xx.c')) arm_ss.add(when: 'CONFIG_NSERIES', if_true: files('nseries.c')) arm_ss.add(when: 'CONFIG_SX1', if_true: files('omap_sx1.c')) arm_ss.add(when: 'CONFIG_CHEETAH', if_true: files('palm.c')) diff --git a/hw/arm/npcm8xx.c b/hw/arm/npcm8xx.c new file mode 100644 index 00..afcf8330d5 --- /dev/null
+++ b/hw/arm/npcm8xx.c @@ -0,0 +1,806 @@ +/* + * Nuvoton NPCM8xx SoC family. + * + * Copyright 2022 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + */ + +#include "qemu/osdep.h" + +#include "hw/arm/boot.h" +#include "hw/arm/npcm8xx.h" +#include "hw/char/serial.h" +#include "hw/intc/arm_gic.h" +#include "hw/loader.h" +#include "hw/misc/unimp.h" +#include "hw/qdev-clock.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qemu/units.h" +#include "sysemu/sysemu.h" + +#define ARM_PHYS_TIMER_PPI 30 +#define ARM_VIRT_TIMER_PPI 27 +#define ARM_HYP_TIMER_PPI 26 +#define ARM_SEC_TIMER_PPI 29 + +/* + * This covers the whole MMIO space. We'll use this to catch any MMIO accesses + * that aren't handled by a device. + */ +#define NPCM8XX_MMIO_BA (0x8000) +#define NPCM8XX_MMIO_SZ (0x7ffd) + +/* OTP fuse array */ +#define NPCM8XX_OTP_BA (0xf0189000) + +/* GIC Distributor */ +#define NPCM8XX_GICD_BA (0xdfff9000) +#define NPCM8XX_GICC_BA (0xdfffa000) + +/* Core system modules. 
*/ +#define NPCM8XX_CPUP_BA (0xf03fe000) +#define NPCM8XX_GCR_BA (0xf080) +#define NPCM8XX_CLK_BA (0xf0801000) +#define NPCM8XX_MC_BA (0xf0824000) +#define NPCM8XX_RNG_BA (0xf000b000) + +/* ADC Module */ +#define NPCM8XX_ADC_BA (0xf000c000) + +/* Internal AHB SRAM */ +#define NPCM8XX_RAM3_BA (0xc0008000) +#define NPCM8XX_RAM3_SZ (4 * KiB) + +/* Memory blocks at the end of the address space */ +#define NPCM8XX_RAM2_BA (0xfffb) +#define NPCM8XX_RAM2_SZ (256 * KiB) +#define NPCM8XX_ROM_BA (0x0100) +#define NPCM8XX_ROM_SZ (64 * KiB) + +/* SDHCI Modules */ +#define NPCM8XX_MMC_BA (0xf0842000) + +/* Run PLL1 at 1600 MHz */ +#define NPCM8XX_PLLCON1_FIXUP_VAL (0x00402101) +/* Run the CPU from PLL1 and UART from PLL2 */ +#define NPCM8XX_CLKSEL_FIXUP_VAL(0x004aaba9) + +/* Clock configuration values to be fixed up when bypassing bootloader */ + +/* + * Interrupt lines going into the GIC. This does not include internal Cortex-A9 + * interrupts. + */ +enum NPCM8xxInterrupt { +NPCM8XX_ADC_IRQ = 0, +NPCM8XX_KCS_HIB_IRQ = 9, +NPCM8XX_MMC_IRQ = 26, +NPCM8XX_TIMER0_IRQ =
[PATCH for-7.1 02/11] hw/ssi: Make flash size a property in NPCM7XX FIU
This allows different FIUs to have different flash sizes, useful for the NPCM8XX, which has multiple FIU modules of different sizes.

Signed-off-by: Hao Wu
Reviewed-by: Patrick Venture
---
 hw/arm/npcm7xx.c             | 6 ++
 hw/ssi/npcm7xx_fiu.c         | 6 ++
 include/hw/ssi/npcm7xx_fiu.h | 1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index d85cc02765..9946b94120 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -274,17 +274,21 @@ static const struct {
     hwaddr regs_addr;
     int cs_count;
     const hwaddr *flash_addr;
+    size_t flash_size;
 } npcm7xx_fiu[] = {
     {
         .name = "fiu0",
         .regs_addr = 0xfb000000,
         .cs_count = ARRAY_SIZE(npcm7xx_fiu0_flash_addr),
         .flash_addr = npcm7xx_fiu0_flash_addr,
+        .flash_size = 128 * MiB,
+
     }, {
         .name = "fiu3",
         .regs_addr = 0xc0000000,
         .cs_count = ARRAY_SIZE(npcm7xx_fiu3_flash_addr),
         .flash_addr = npcm7xx_fiu3_flash_addr,
+        .flash_size = 128 * MiB,
     },
 };
@@ -686,6 +690,8 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
         object_property_set_int(OBJECT(sbd), "cs-count",
                                 npcm7xx_fiu[i].cs_count, &error_abort);
+        object_property_set_int(OBJECT(sbd), "flash-size",
+                                npcm7xx_fiu[i].flash_size, &error_abort);
         sysbus_realize(sbd, &error_abort);
         sysbus_mmio_map(sbd, 0, npcm7xx_fiu[i].regs_addr);

diff --git a/hw/ssi/npcm7xx_fiu.c b/hw/ssi/npcm7xx_fiu.c
index 4eedb2927e..ea490f1332 100644
--- a/hw/ssi/npcm7xx_fiu.c
+++ b/hw/ssi/npcm7xx_fiu.c
@@ -28,9 +28,6 @@
 
 #include "trace.h"
 
-/* Up to 128 MiB of flash may be accessed directly as memory. */
-#define NPCM7XX_FIU_FLASH_WINDOW_SIZE (128 * MiB)
-
 /* Each module has 4 KiB of register space. Only a fraction of it is used. */
 #define NPCM7XX_FIU_CTRL_REGS_SIZE (4 * KiB)
 
@@ -525,7 +522,7 @@ static void npcm7xx_fiu_realize(DeviceState *dev, Error **errp)
         flash->fiu = s;
         memory_region_init_io(&flash->direct_access, OBJECT(s),
                               &npcm7xx_fiu_flash_ops, &s->flash[i], "flash",
-                              NPCM7XX_FIU_FLASH_WINDOW_SIZE);
+                              s->flash_size);
         sysbus_init_mmio(sbd, &flash->direct_access);
     }
 }
@@ -543,6 +540,7 @@ static const VMStateDescription vmstate_npcm7xx_fiu = {
 
 static Property npcm7xx_fiu_properties[] = {
     DEFINE_PROP_INT32("cs-count", NPCM7xxFIUState, cs_count, 0),
+    DEFINE_PROP_SIZE("flash-size", NPCM7xxFIUState, flash_size, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/ssi/npcm7xx_fiu.h b/include/hw/ssi/npcm7xx_fiu.h
index a3a1704289..1785ea16f4 100644
--- a/include/hw/ssi/npcm7xx_fiu.h
+++ b/include/hw/ssi/npcm7xx_fiu.h
@@ -60,6 +60,7 @@ struct NPCM7xxFIUState {
     int32_t cs_count;
     int32_t active_cs;
     qemu_irq *cs_lines;
+    size_t flash_size;
     NPCM7xxFIUFlash *flash;
     SSIBus *spi;
 
-- 
2.35.1.1094.g7c7d902a7c-goog
[PATCH for-7.1 00/11] hw/arm: Add NPCM8XX support
NPCM8XX BMCs are the successors of the NPCM7XX BMCs. They feature a quad-core Arm Cortex-A35 that supports both 32-bit and 64-bit operation. This patch set aims to support basic functionality of the NPCM8XX BMCs.

The patch set includes:
1. Most devices are derived from the 7XX models, with some modifications.
2. A minimal vBootROM similar to the 7XX one, constructed at https://github.com/google/vbootrom/tree/master/npcm8xx and included in the patch set.
3. A new NPCM8XX SoC and an evaluation board machine, npcm845-evb.

The OpenBMC for the NPCM845 evaluation board can be found at:
https://github.com/Nuvoton-Israel/openbmc/tree/npcm-v2.10/meta-evb/meta-evb-nuvoton/meta-evb-npcm845

The patch set can boot the evaluation board image built from the source above to the login prompt.

Hao Wu (11):
  docs/system/arm: Add Description for NPCM8XX SoC
  hw/ssi: Make flash size a property in NPCM7XX FIU
  hw/misc: Support NPCM8XX GCR module
  hw/misc: Support NPCM8XX CLK Module Registers
  hw/misc: Store DRAM size in NPCM8XX GCR Module
  hw/intc: Add a property to allow GIC to reset into non secure mode
  hw/misc: Support 8-bytes memop in NPCM GCR module
  hw/net: Add NPCM8XX PCS Module
  pc-bios: Add NPCM8xx Bootrom
  hw/arm: Add NPCM8XX SoC
  hw/arm: Add NPCM845 Evaluation board

 MAINTAINERS                                   |   9 +-
 configs/devices/aarch64-softmmu/default.mak   |   1 +
 docs/system/arm/nuvoton.rst                   |  20 +-
 hw/arm/Kconfig                                |  11 +
 hw/arm/meson.build                            |   1 +
 hw/arm/npcm7xx.c                              |   6 +
 hw/arm/npcm8xx.c                              | 806 ++
 hw/arm/npcm8xx_boards.c                       | 257 ++
 hw/intc/arm_gic_common.c                      |   2 +
 hw/misc/meson.build                           |   4 +-
 hw/misc/npcm7xx_gcr.c                         | 269 --
 hw/misc/{npcm7xx_clk.c => npcm_clk.c}         | 238 --
 hw/misc/npcm_gcr.c                            | 492 +++
 hw/misc/trace-events                          |  12 +-
 hw/net/meson.build                            |   1 +
 hw/net/npcm_pcs.c                             | 409 +
 hw/net/trace-events                           |   4 +
 hw/ssi/npcm7xx_fiu.c                          |   6 +-
 include/hw/arm/npcm7xx.h                      |   8 +-
 include/hw/arm/npcm8xx.h                      | 126 +++
 include/hw/misc/{npcm7xx_clk.h => npcm_clk.h} |  43 +-
 include/hw/misc/{npcm7xx_gcr.h => npcm_gcr.h} |  30 +-
 include/hw/net/npcm_pcs.h                     |  42 +
 include/hw/ssi/npcm7xx_fiu.h                  |   1 +
 pc-bios/npcm8xx_bootrom.bin                   | Bin 0 -> 608 bytes
 25 files changed, 2428 insertions(+), 370 deletions(-)
 create mode 100644 hw/arm/npcm8xx.c
 create mode 100644 hw/arm/npcm8xx_boards.c
 delete mode 100644 hw/misc/npcm7xx_gcr.c
 rename hw/misc/{npcm7xx_clk.c => npcm_clk.c} (81%)
 create mode 100644 hw/misc/npcm_gcr.c
 create mode 100644 hw/net/npcm_pcs.c
 create mode 100644 include/hw/arm/npcm8xx.h
 rename include/hw/misc/{npcm7xx_clk.h => npcm_clk.h} (83%)
 rename include/hw/misc/{npcm7xx_gcr.h => npcm_gcr.h} (55%)
 create mode 100644 include/hw/net/npcm_pcs.h
 create mode 100644 pc-bios/npcm8xx_bootrom.bin
-- 
2.35.1.1094.g7c7d902a7c-goog
Re: [PATCH] block/stream: Drain subtree around graph change
05.04.2022 17:41, Kevin Wolf wrote: Am 05.04.2022 um 14:12 hat Vladimir Sementsov-Ogievskiy geschrieben: Thanks Kevin! I have already run out of arguments in the battle against using subtree-drains to isolate graph modification operations from each other in different threads in the mailing list) (Note also, that the top-most version of this patch is "[PATCH v2] block/stream: Drain subtree around graph change") Oops, I completely missed the v2. Thanks! About avoiding polling during graph-modifying operations, there is a problem: some IO operations are involved into block-graph modifying operations. At least it's rewriting "backing_file_offset" and "backing_file_size" fields in qcow2 header. We can't just separate rewriting metadata from graph modifying operation: this way another graph-modifying operation may interleave and we'll write outdated metadata. Hm, generally we don't update image metadata when we reconfigure the graph. Most changes are temporary (like insertion of filter nodes) and the image header only contains a "default configuration" to be used on the next start. There are only a few places that update the image header; I think it's generally block job completions. They obviously update the in-memory graph, too, but they don't write to the image file (and therefore potentially poll) in the middle of updating the in-memory graph, but they do both in separate steps. I think this is okay. We must just avoid polling in the middle of graph updates because if something else changes the graph there, it's not clear any more that we're really doing what the caller had in mind. Hmm, interesting where is polling in described case? First possible place I can find is bdrv_parent_drained_begin_single() in bdrv_replace_child_noperm(). Another is bdrv_apply_subtree_drain() in bdrv_child_cb_attach(). No idea how to get rid of them. Hmm. 
I think the core problem here is that when we wait in drained_begin(), nobody protects us from attaching one more node to the drained subgraph. And we should handle this; that's the complexity. So I still think we need a kind of global lock for graph-modifying operations. Or per-BDS locks, as you propose. But in that case we need to be sure that by taking all needed per-BDS locks we'll avoid deadlocking.

I guess this depends on the exact granularity of the locks we're using. If you take the lock only while updating a single edge, I don't think you could easily deadlock. If you hold it for more complex operations, it becomes harder to tell without checking the code.

I think keeping the whole operation, like reopen_multiple, or some job's .prepare(), etc., under one critical section is simplest to analyze. Could this be something like this?

uint8_t graph_locked;

void graph_lock(AioContext *ctx)
{
    AIO_POLL_WHILE(ctx, qatomic_cmpxchg(&graph_locked, 0, 1) == 1);
}

void graph_unlock(void)
{
    qatomic_set(&graph_locked, 0);
    aio_wait_kick();
}

-- 
Best regards,
Vladimir
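Outside QEMU, the acquire/release protocol Vladimir sketches can be modeled with C11 atomics. This is a minimal sketch under the assumption that the event-loop integration (the AIO polling macro and aio_wait_kick()) is replaced by a plain compare-and-swap loop; the names mirror the proposal but are not QEMU API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal model of the proposed global graph lock: a single flag
 * acquired with a compare-and-swap loop.  In QEMU the busy-wait would
 * instead poll the AioContext so pending I/O keeps making progress
 * while a graph modification is waiting for the lock. */
static atomic_uchar graph_locked;

static void graph_lock(void)
{
    unsigned char expected = 0;

    /* Loop until we atomically flip the flag from 0 to 1. */
    while (!atomic_compare_exchange_weak(&graph_locked, &expected, 1)) {
        expected = 0; /* cmpxchg stored the observed value; reset it */
    }
}

static void graph_unlock(void)
{
    /* Release the lock; in QEMU this would also kick any waiters
     * blocked in the polling loop. */
    atomic_store(&graph_locked, 0);
}
```

The key property is the same as in the proposed snippet: only one thread at a time can move the flag from 0 to 1, so whole graph-modifying operations can be serialized under one critical section.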
Re: [PATCH] acpi: Bodge acpi_index migration
On Tue, Apr 05, 2022 at 08:06:58PM +0100, Dr. David Alan Gilbert (git) wrote: The patch is fine but pls repost as text not as application/octet-stream. Thanks! -- MST
Re: [PATCH] acpi: Bodge acpi_index migration
On Tue, 5 Apr 2022 20:06:58 +0100 "Dr. David Alan Gilbert (git)" wrote: > From: "Dr. David Alan Gilbert" > > The 'acpi_index' field is a statically configured field, which for > some reason is migrated; this never makes much sense because it's > command line static. > > However, on piix4 it's conditional, and the condition/test function > ends up having the wrong pointer passed to it (it gets a PIIX4PMState > not the AcpiPciHpState it was expecting, because VMSTATE_PCI_HOTPLUG > is a macro and not another struct). This means the field is randomly > loaded/saved based on a random pointer. In 6.x this random pointer > randomly seems to get 0 for everyone (!); in 7.0rc it's getting junk > and trying to load a field that the source didn't send. FWIW, after some hunting and pecking, 6.2 (64bit): (gdb) p &((struct AcpiPciHpState *)0)->acpi_index $1 = (uint32_t *) 0xc04 (gdb) p &((struct PIIX4PMState *)0)->ar.tmr.io.addr $2 = (hwaddr *) 0xc00 f53faa70bb63: (gdb) p &((struct AcpiPciHpState *)0)->acpi_index $1 = (uint32_t *) 0xc04 (gdb) p &((struct PIIX4PMState *)0)->io_gpe.coalesced.tqh_circ.tql_prev $2 = (struct QTailQLink **) 0xc00 So yeah, it seems 0xc04 will always be part of a pointer on current mainline. I can't really speak to the ACPIPMTimer MemoryRegion in the PIIX4PMState, maybe if there's a hwaddr it's always 32bit and the upper dword is reliably zero? Thanks, Alex > The migration > stream gets out of line and hits the section footer. > > The bodge is on piix4 never to load the field: > a) Most 6.x builds never send it, so most of the time the migration > will work. > b) We can backport this fix to 6.x to remove the boobytrap. > c) It should never have made a difference anyway since the acpi-index > is command line configured and should be correct on the destination > anyway > d) ich9 is still sending/receiving this (unconditionally all the time) > but due to (c) should never notice. We could follow up to make it > skip. 
> > It worries me just when (a) actually happens. > > Fixes: b32bd76 ("pci: introduce acpi-index property for PCI device") > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/932 > > Signed-off-by: Dr. David Alan Gilbert > --- > hw/acpi/acpi-pci-hotplug-stub.c | 4 > hw/acpi/pcihp.c | 6 -- > hw/acpi/piix4.c | 11 ++- > include/hw/acpi/pcihp.h | 2 -- > 4 files changed, 10 insertions(+), 13 deletions(-) > > diff --git a/hw/acpi/acpi-pci-hotplug-stub.c b/hw/acpi/acpi-pci-hotplug-stub.c > index 734e4c5986..a43f6dafc9 100644 > --- a/hw/acpi/acpi-pci-hotplug-stub.c > +++ b/hw/acpi/acpi-pci-hotplug-stub.c > @@ -41,7 +41,3 @@ void acpi_pcihp_reset(AcpiPciHpState *s, bool > acpihp_root_off) > return; > } > > -bool vmstate_acpi_pcihp_use_acpi_index(void *opaque, int version_id) > -{ > -return false; > -} > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > index 6351bd3424..bf65bbea49 100644 > --- a/hw/acpi/pcihp.c > +++ b/hw/acpi/pcihp.c > @@ -554,12 +554,6 @@ void acpi_pcihp_init(Object *owner, AcpiPciHpState *s, > PCIBus *root_bus, > OBJ_PROP_FLAG_READ); > } > > -bool vmstate_acpi_pcihp_use_acpi_index(void *opaque, int version_id) > -{ > - AcpiPciHpState *s = opaque; > - return s->acpi_index; > -} > - > const VMStateDescription vmstate_acpi_pcihp_pci_status = { > .name = "acpi_pcihp_pci_status", > .version_id = 1, > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c > index cc37fa3416..48aeedd5f0 100644 > --- a/hw/acpi/piix4.c > +++ b/hw/acpi/piix4.c > @@ -267,6 +267,15 @@ static bool piix4_vmstate_need_smbus(void *opaque, int > version_id) > return pm_smbus_vmstate_needed(); > } > > +/* > + * This is a fudge to turn off the acpi_index field, whose > + * test was always broken on piix4. 
> + */ > +static bool vmstate_test_never(void *opaque, int version_id) > +{ > +return false; > +} > + > /* qemu-kvm 1.2 uses version 3 but advertised as 2 > * To support incoming qemu-kvm 1.2 migration, change version_id > * and minimum_version_id to 2 below (which breaks migration from > @@ -297,7 +306,7 @@ static const VMStateDescription vmstate_acpi = { > struct AcpiPciHpPciStatus), > VMSTATE_PCI_HOTPLUG(acpi_pci_hotplug, PIIX4PMState, > vmstate_test_use_acpi_hotplug_bridge, > -vmstate_acpi_pcihp_use_acpi_index), > +vmstate_test_never), > VMSTATE_END_OF_LIST() > }, > .subsections = (const VMStateDescription*[]) { > diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h > index af1a169fc3..7e268c2c9c 100644 > --- a/include/hw/acpi/pcihp.h > +++ b/include/hw/acpi/pcihp.h > @@ -73,8 +73,6 @@ void acpi_pcihp_reset(AcpiPciHpState *s, bool > acpihp_root_off); > > extern const VMStateDescription vmstate_acpi_pcihp_pci_status; > > -bool
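The failure mode Alex is probing with gdb (a field-test callback written for one struct receiving a pointer to another, so it reads whatever happens to sit at the same byte offset) can be reproduced in miniature. The structs below are hypothetical stand-ins, not the real QEMU types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for AcpiPciHpState: the type the test function expects. */
struct Inner {
    char pad[8];
    uint32_t acpi_index;     /* the field the test means to inspect */
};

/* Stand-in for PIIX4PMState: the type actually passed in. */
struct Outer {
    char pad[8];
    uint32_t unrelated;      /* a different field at the same offset */
};

/* Field-test callback, as used for conditional vmstate fields:
 * returns nonzero if the field should be saved/loaded.  The void*
 * parameter is what hides the wrong-type bug from the compiler. */
static int use_acpi_index(void *opaque)
{
    struct Inner *s = opaque;

    return s->acpi_index != 0;
}
```

Because the opaque pointer erases the type, handing `use_acpi_index()` a `struct Outer *` compiles cleanly and silently tests `unrelated` instead, which is exactly why the field was "randomly" loaded or saved depending on what lived at that offset in each build.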
Re: [RFC PATCH 0/4] hw/i2c: i2c slave mode support
> On Mar 31, 2022, at 9:57 AM, Klaus Jensen wrote: > > From: Klaus Jensen > > Hi all, > > This RFC series adds I2C "slave mode" support for the Aspeed I2C > controller as well as the necessary infrastructure in the i2c core to > support this. > > Background > ~~ > We are working on an emulated NVM Express Management Interface[1] for > testing and validation purposes. NVMe-MI is based on the MCTP > protocol[2] which may use a variety of underlying transports. The one we > are interested in is I2C[3]. > > The first general trickery here is that all MCTP transactions are based > on the SMBus Block Write bus protocol[4]. This means that the slave must > be able to master the bus to communicate. As you know, hw/i2c/core.c > currently does not support this use case. This is great, I’m attempting to use your changes right now for the same thing (MCTP). > > The second issue is how to interact with these mastering devices. Jeremy > and Matt (CC'ed) have been working on an MCTP stack for the Linux Kernel > (already upstream) and an I2C binding driver[5] is currently under > review. This binding driver relies on I2C slave mode support in the I2C > controller. > > This series > ~~~ > Patch 1 adds support for multiple masters in the i2c core, allowing > slaves to master the bus and safely issue i2c_send/recv(). Patch 2 adds > an asynchronous send i2c_send_async(I2CBus *, uint8) on the bus that > must be paired with an explicit ack using i2c_ack(I2CBus *). > > Patch 3 adds the slave mode functionality to the emulated Aspeed I2C > controller. The implementation is probably buggy since I had to rely on > the implementation of the kernel driver to reverse engineer the behavior > of the controller slave mode (I do not have access to a spec sheet for > the Aspeed, but maybe someone can help me out with that?). > > Finally, patch 4 adds an example device using this new API. 
The device > is a simple "echo" device that upon being sent a set of bytes uses the > first byte as the address of the slave to echo to. > > With this combined I am able to boot up Linux on an emulated Aspeed 2600 > evaluation board and have the i2c echo device write into a Linux slave > EEPROM. Assuming the echo device is on address 0x42: > > # echo slave-24c02 0x1064 > /sys/bus/i2c/devices/i2c-15/new_device > i2c i2c-15: new_device: Instantiated device slave-24c02 at 0x64 > # i2cset -y 15 0x42 0x64 0x00 0xaa i > # hexdump /sys/bus/i2c/devices/15-1064/slave-eeprom > 000 ffaa > 010 > * > 100 When I try this with my system, it seems like the i2c-echo device takes over the bus but never echoes the data to the EEPROM. Am I missing something to make this work? It seems like the “i2c_send_async” calls aren’t happening, which must be because the bottom half isn’t being scheduled, right? After the i2c_do_start_transfer, how is the bottom half supposed to be scheduled again? Is the slave receiving (the EEPROM) supposed to call i2c_ack or something? root@bmc-oob:~# echo 24c02 0x1064 > /sys/bus/i2c/devices/i2c-8/new_device [ 135.559719] at24 8-1064: 256 byte 24c02 EEPROM, writable, 1 bytes/write [ 135.562661] i2c i2c-8: new_device: Instantiated device 24c02 at 0x64 root@bmc-oob:~# i2cset -y 8 0x42 0x64 0x00 0xaa i i2c_echo_event: start send i2c_echo_send: data[0] = 0x64 i2c_echo_send: data[1] = 0x00 i2c_echo_send: data[2] = 0xaa i2c_echo_event: scheduling bottom-half i2c_echo_bh: attempting to gain mastery of bus i2c_echo_bh: starting a send to address 0x64 root@bmc-oob:~# hexdump -C /sys/bus/i2c/devices/8-1064/eeprom 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || * 0100 Thanks again for this, it’s exactly what I needed. 
> > [1]: https://nvmexpress.org/developers/nvme-mi-specification/ > [2]: > https://www.dmtf.org/sites/default/files/standards/documents/DSP0236_1.3.1.pdf > > [3]: > https://www.dmtf.org/sites/default/files/standards/documents/DSP0237_1.2.0.pdf > > [4]: http://www.smbus.org/specs/SMBus_3_1_20180319.pdf > [5]: > https://lore.kernel.org/linux-i2c/20220218055106.1944485-1-m...@codeconstruct.com.au/ > > Klaus Jensen (4): > hw/i2c: support multiple masters > hw/i2c: add async send > hw/i2c: add slave mode for aspeed_i2c > hw/misc: add a toy i2c echo device > > hw/i2c/aspeed_i2c.c | 95 +--- > hw/i2c/core.c | 57 +- > hw/i2c/trace-events | 2 +- > hw/misc/i2c-echo.c | 144 > hw/misc/meson.build | 2 + > include/hw/i2c/aspeed_i2c.h | 8 ++ > include/hw/i2c/i2c.h| 19 + > 7 files changed, 316 insertions(+), 11 deletions(-) > create mode 100644 hw/misc/i2c-echo.c > > -- > 2.35.1 > >
[PATCH for-7.1 1/1] hw/ppc: check if spapr_drc_index() returns NULL in spapr_nvdimm.c
spapr_nvdimm_flush_completion_cb() and flush_worker_cb() are using the DRC object returned by spapr_drc_index() without checking it for NULL. In this case we would be dereferencing a NULL pointer when doing SPAPR_NVDIMM(drc->dev) and PC_DIMM(drc->dev). This can happen if, during a scm_flush(), the DRC object is wrongly freed/released by another part of the code (i.e. hotunplug the device). spapr_drc_index() would then return NULL in the callbacks. Fixes: Coverity CID 1487108, 1487178 Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr_nvdimm.c | 26 ++ 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c index c4c97da5de..e92d92fdae 100644 --- a/hw/ppc/spapr_nvdimm.c +++ b/hw/ppc/spapr_nvdimm.c @@ -447,9 +447,19 @@ static int flush_worker_cb(void *opaque) { SpaprNVDIMMDeviceFlushState *state = opaque; SpaprDrc *drc = spapr_drc_by_index(state->drcidx); -PCDIMMDevice *dimm = PC_DIMM(drc->dev); -HostMemoryBackend *backend = MEMORY_BACKEND(dimm->hostmem); -int backend_fd = memory_region_get_fd(>mr); +PCDIMMDevice *dimm; +HostMemoryBackend *backend; +int backend_fd; + +if (!drc) { +error_report("papr_scm: Could not find nvdimm device with DRC 0x%u", + state->drcidx); +return H_HARDWARE; +} + +dimm = PC_DIMM(drc->dev); +backend = MEMORY_BACKEND(dimm->hostmem); +backend_fd = memory_region_get_fd(>mr); if (object_property_get_bool(OBJECT(backend), "pmem", NULL)) { MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem); @@ -475,7 +485,15 @@ static void spapr_nvdimm_flush_completion_cb(void *opaque, int hcall_ret) { SpaprNVDIMMDeviceFlushState *state = opaque; SpaprDrc *drc = spapr_drc_by_index(state->drcidx); -SpaprNVDIMMDevice *s_nvdimm = SPAPR_NVDIMM(drc->dev); +SpaprNVDIMMDevice *s_nvdimm; + +if (!drc) { +error_report("papr_scm: Could not find nvdimm device with DRC 0x%u", + state->drcidx); +return; +} + +s_nvdimm = SPAPR_NVDIMM(drc->dev); state->hcall_ret = hcall_ret; QLIST_REMOVE(state, node); -- 2.35.1
[PATCH for-7.1 0/1] Coverity fixes in hw/ppc/spapr_nvdimm.c
Hi, This is a simple patch to fix 2 Coverity issues in hw/ppc/spapr_nvdimm.c. Aiming it to 7.1 because it's not critical enough for 7.0. Daniel Henrique Barboza (1): hw/ppc: check if spapr_drc_index() returns NULL in spapr_nvdimm.c hw/ppc/spapr_nvdimm.c | 26 ++ 1 file changed, 22 insertions(+), 4 deletions(-) -- 2.35.1
[PATCH v2 8/9] target/ppc: Implemented vector module word/doubleword
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vmodsw: Vector Modulo Signed Word vmoduw: Vector Modulo Unsigned Word vmodsd: Vector Modulo Signed Doubleword vmodud: Vector Modulo Unsigned Doubleword Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/insn32.decode| 5 + target/ppc/translate/vmx-impl.c.inc | 10 ++ 2 files changed, 15 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 3eb920ac76..36b42e41d2 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -719,3 +719,8 @@ VDIVESD 000100 . . . 0001011@VX VDIVEUD 000100 . . . 01011001011@VX VDIVESQ 000100 . . . 0111011@VX VDIVEUQ 000100 . . . 0101011@VX + +VMODSW 000100 . . . 0001011@VX +VMODUW 000100 . . . 11010001011@VX +VMODSD 000100 . . . 1001011@VX +VMODUD 000100 . . . 11011001011@VX diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index 23f215dbea..c5178a0f1e 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -3340,6 +3340,11 @@ static void do_diveu_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b) DO_VDIV_VMOD(do_divesw, 32, do_dives_i32, true) DO_VDIV_VMOD(do_diveuw, 32, do_diveu_i32, false) +DO_VDIV_VMOD(do_modsw, 32, tcg_gen_rem_i32, true) +DO_VDIV_VMOD(do_moduw, 32, tcg_gen_remu_i32, false) +DO_VDIV_VMOD(do_modsd, 64, tcg_gen_rem_i64, true) +DO_VDIV_VMOD(do_modud, 64, tcg_gen_remu_i64, false) + TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_divesw, NULL) TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_diveuw, NULL) TRANS_FLAGS2(ISA310, VDIVESD, do_vx_helper, gen_helper_VDIVESD) @@ -3347,6 +3352,11 @@ TRANS_FLAGS2(ISA310, VDIVEUD, do_vx_helper, gen_helper_VDIVEUD) TRANS_FLAGS2(ISA310, VDIVESQ, do_vx_helper, gen_helper_VDIVESQ) TRANS_FLAGS2(ISA310, VDIVEUQ, do_vx_helper, gen_helper_VDIVEUQ) +TRANS_VDIV_VMOD(ISA310, VMODSW, MO_32, do_modsw , NULL) +TRANS_VDIV_VMOD(ISA310, VMODUW, MO_32, do_moduw, NULL) +TRANS_VDIV_VMOD(ISA310, VMODSD, MO_64, 
NULL, do_modsd) +TRANS_VDIV_VMOD(ISA310, VMODUD, MO_64, NULL, do_modud) + #undef DO_VDIV_VMOD #undef GEN_VR_LDX -- 2.31.1
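For reference, the semantics of one lane of vmodsw can be modeled in plain C. The guard mirrors the quadword helper in the next patch, which yields zero for the architecturally undefined cases (division by zero, and the INT32_MIN % -1 overflow, which would trap on most hosts); the function name is invented for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* One lane of Vector Modulo Signed Word, modeled as a scalar.
 * C's % operator already matches PowerISA's truncating remainder, but
 * the two trapping cases must be guarded explicitly. */
static int32_t vmodsw_lane(int32_t a, int32_t b)
{
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        return 0; /* result is undefined by the architecture */
    }
    return a % b;
}
```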
[PATCH v2 9/9] target/ppc: Implemented vector module quadword
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vmodsq: Vector Modulo Signed Quadword vmoduq: Vector Modulo Unsigned Quadword Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/helper.h | 2 ++ target/ppc/insn32.decode| 2 ++ target/ppc/int_helper.c | 21 + target/ppc/translate/vmx-impl.c.inc | 2 ++ 4 files changed, 27 insertions(+) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 67ecff2c9a..881e03959a 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -177,6 +177,8 @@ DEF_HELPER_FLAGS_3(VDIVESD, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VDIVEUD, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VDIVESQ, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VDIVEUQ, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VMODSQ, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VMODUQ, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_3(vslo, void, avr, avr, avr) DEF_HELPER_3(vsro, void, avr, avr, avr) DEF_HELPER_3(vsrv, void, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 36b42e41d2..b53efe1915 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -724,3 +724,5 @@ VMODSW 000100 . . . 0001011@VX VMODUW 000100 . . . 11010001011@VX VMODSD 000100 . . . 1001011@VX VMODUD 000100 . . . 11011001011@VX +VMODSQ 000100 . . . 1111011@VX +VMODUQ 000100 . . . 
1101011    @VX

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 17a10c4412..72b2b06078 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1121,6 +1121,27 @@ void helper_VDIVEUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+void helper_VMODSQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 neg1 = int128_makes64(-1);
+    Int128 int128_min = int128_make128(0, INT64_MIN);
+    if (likely(int128_nz(b->s128) &&
+               (int128_ne(a->s128, int128_min) || int128_ne(b->s128, neg1)))) {
+        t->s128 = int128_rems(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
+void helper_VMODUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    if (likely(int128_nz(b->s128))) {
+        t->s128 = int128_remu(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
     ppc_avr_t result;

diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index c5178a0f1e..7ced7ad655 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3356,6 +3356,8 @@ TRANS_VDIV_VMOD(ISA310, VMODSW, MO_32, do_modsw, NULL)
 TRANS_VDIV_VMOD(ISA310, VMODUW, MO_32, do_moduw, NULL)
 TRANS_VDIV_VMOD(ISA310, VMODSD, MO_64, NULL, do_modsd)
 TRANS_VDIV_VMOD(ISA310, VMODUD, MO_64, NULL, do_modud)
+TRANS_FLAGS2(ISA310, VMODSQ, do_vx_helper, gen_helper_VMODSQ)
+TRANS_FLAGS2(ISA310, VMODUQ, do_vx_helper, gen_helper_VMODUQ)
 
 #undef DO_VDIV_VMOD
 
-- 
2.31.1
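On GCC/Clang hosts the same guard logic can be checked against the compiler's native 128-bit type instead of QEMU's Int128 wrappers. This is a scalar sketch of helper_VMODSQ's logic, not the helper itself:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of VMODSQ using __int128 (a GCC/Clang extension).
 * Guards the two undefined cases: b == 0, and INT128_MIN % -1 (the
 * only signed overflow case for remainder). */
static __int128 vmodsq_scalar(__int128 a, __int128 b)
{
    /* Build INT128_MIN without shifting a signed value (which is UB). */
    const __int128 int128_min = (__int128)((unsigned __int128)1 << 127);

    if (b == 0 || (a == int128_min && b == -1)) {
        return 0; /* architecturally undefined; mirror the helper */
    }
    return a % b;
}
```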
[PATCH v2 7/9] target/ppc: Implemented remaining vector divide extended
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vdivesd: Vector Divide Extended Signed Doubleword vdiveud: Vector Divide Extended Unsigned Doubleword vdivesq: Vector Divide Extended Signed Quadword vdiveuq: Vector Divide Extended Unsigned Quadword Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/helper.h | 4 ++ target/ppc/insn32.decode| 4 ++ target/ppc/int_helper.c | 64 + target/ppc/translate/vmx-impl.c.inc | 4 ++ 4 files changed, 76 insertions(+) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 4cfdf7b3ec..67ecff2c9a 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -173,6 +173,10 @@ DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VDIVSQ, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VDIVUQ, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVESD, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVEUD, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVESQ, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVEUQ, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_3(vslo, void, avr, avr, avr) DEF_HELPER_3(vsro, void, avr, avr, avr) DEF_HELPER_3(vsrv, void, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 8c115c9c60..3eb920ac76 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -715,3 +715,7 @@ VDIVUQ 000100 . . . 0001011@VX VDIVESW 000100 . . . 01110001011@VX VDIVEUW 000100 . . . 01010001011@VX +VDIVESD 000100 . . . 0001011@VX +VDIVEUD 000100 . . . 01011001011@VX +VDIVESQ 000100 . . . 0111011@VX +VDIVEUQ 000100 . . . 
0101011    @VX

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index ba5d4193ff..17a10c4412 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1057,6 +1057,70 @@ void helper_VDIVUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+void helper_VDIVESD(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+    int64_t high;
+    uint64_t low;
+    for (i = 0; i < 2; i++) {
+        high = a->s64[i];
+        low = 0;
+        if (unlikely((high == INT64_MIN && b->s64[i] == -1) || !b->s64[i])) {
+            t->s64[i] = a->s64[i]; /* Undefined behavior */
+        } else {
+            divs128(&low, &high, b->s64[i]);
+            t->s64[i] = low;
+        }
+    }
+}
+
+void helper_VDIVEUD(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+    uint64_t high, low;
+    for (i = 0; i < 2; i++) {
+        high = a->u64[i];
+        low = 0;
+        if (unlikely(!b->u64[i])) {
+            t->u64[i] = a->u64[i]; /* Undefined behavior */
+        } else {
+            divu128(&low, &high, b->u64[i]);
+            t->u64[i] = low;
+        }
+    }
+}
+
+void helper_VDIVESQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 high, low;
+    Int128 int128_min = int128_make128(0, INT64_MIN);
+    Int128 neg1 = int128_makes64(-1);
+
+    high = a->s128;
+    low = int128_zero();
+    if (unlikely(!int128_nz(b->s128) ||
+                 (int128_eq(b->s128, neg1) && int128_eq(high, int128_min)))) {
+        t->s128 = a->s128; /* Undefined behavior */
+    } else {
+        divs256(&low, &high, b->s128);
+        t->s128 = low;
+    }
+}
+
+void helper_VDIVEUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 high, low;
+
+    high = a->s128;
+    low = int128_zero();
+    if (unlikely(!int128_nz(b->s128))) {
+        t->s128 = a->s128; /* Undefined behavior */
+    } else {
+        divu256(&low, &high, b->s128);
+        t->s128 = low;
+    }
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
     ppc_avr_t result;

diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 8799e945bd..23f215dbea 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3342,6 +3342,10 @@ DO_VDIV_VMOD(do_diveuw, 32, do_diveu_i32, false)
 
 TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_divesw, NULL)
 TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_diveuw, NULL)
+TRANS_FLAGS2(ISA310, VDIVESD, do_vx_helper, gen_helper_VDIVESD)
+TRANS_FLAGS2(ISA310, VDIVEUD, do_vx_helper, gen_helper_VDIVEUD)
+TRANS_FLAGS2(ISA310, VDIVESQ, do_vx_helper, gen_helper_VDIVESQ)
+TRANS_FLAGS2(ISA310, VDIVEUQ, do_vx_helper, gen_helper_VDIVEUQ)
 
 #undef DO_VDIV_VMOD
 
-- 
2.31.1
[PATCH v2 4/9] target/ppc: Implemented vector divide extended word
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vdivesw: Vector Divide Extended Signed Word vdiveuw: Vector Divide Extended Unsigned Word Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/insn32.decode| 3 ++ target/ppc/translate/vmx-impl.c.inc | 48 + 2 files changed, 51 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 3a88a0b5bc..8c115c9c60 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -712,3 +712,6 @@ VDIVSD 000100 . . . 00111001011@VX VDIVUD 000100 . . . 00011001011@VX VDIVSQ 000100 . . . 0011011@VX VDIVUQ 000100 . . . 0001011@VX + +VDIVESW 000100 . . . 01110001011@VX +VDIVEUW 000100 . . . 01010001011@VX diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index bac0db7128..8799e945bd 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -3295,6 +3295,54 @@ TRANS_VDIV_VMOD(ISA310, VDIVUD, MO_64, NULL, do_divud) TRANS_FLAGS2(ISA310, VDIVSQ, do_vx_helper, gen_helper_VDIVSQ) TRANS_FLAGS2(ISA310, VDIVUQ, do_vx_helper, gen_helper_VDIVUQ) +static void do_dives_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b) +{ +TCGv_i64 val1, val2; + +val1 = tcg_temp_new_i64(); +val2 = tcg_temp_new_i64(); + +tcg_gen_ext_i32_i64(val1, a); +tcg_gen_ext_i32_i64(val2, b); + +/* (a << 32)/b */ +tcg_gen_shli_i64(val1, val1, 32); +tcg_gen_div_i64(val1, val1, val2); + +/* if quotient doesn't fit in 32 bits the result is undefined */ +tcg_gen_extrl_i64_i32(t, val1); + +tcg_temp_free_i64(val1); +tcg_temp_free_i64(val2); +} + +static void do_diveu_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b) +{ +TCGv_i64 val1, val2; + +val1 = tcg_temp_new_i64(); +val2 = tcg_temp_new_i64(); + +tcg_gen_extu_i32_i64(val1, a); +tcg_gen_extu_i32_i64(val2, b); + +/* (a << 32)/b */ +tcg_gen_shli_i64(val1, val1, 32); +tcg_gen_divu_i64(val1, val1, val2); + +/* if quotient doesn't fit in 32 bits the result is undefined */ +tcg_gen_extrl_i64_i32(t, 
val1); + +tcg_temp_free_i64(val1); +tcg_temp_free_i64(val2); +} + +DO_VDIV_VMOD(do_divesw, 32, do_dives_i32, true) +DO_VDIV_VMOD(do_diveuw, 32, do_diveu_i32, false) + +TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_divesw, NULL) +TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_diveuw, NULL) + #undef DO_VDIV_VMOD #undef GEN_VR_LDX -- 2.31.1
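The "extended" divide implemented by do_diveu_i32 above widens the dividend to 64 bits, shifts it left by 32, and divides, so one lane can be modeled as a scalar for checking the arithmetic. The function name is invented, and the divide-by-zero fallback (returning a, as the doubleword helpers do) is an assumption for the sketch:

```c
#include <assert.h>
#include <stdint.h>

/* One lane of Vector Divide Extended Unsigned Word: (a << 32) / b,
 * computed with a 64-bit intermediate and truncated back to 32 bits.
 * When the true quotient does not fit in 32 bits the architecture
 * leaves the result undefined, so the truncated value is arbitrary. */
static uint32_t vdiveuw_lane(uint32_t a, uint32_t b)
{
    uint64_t num = (uint64_t)a << 32;

    if (b == 0) {
        return a; /* undefined; mirrors the doubleword helper's choice */
    }
    return (uint32_t)(num / b);
}
```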
[PATCH v2 5/9] host-utils: Implemented unsigned 256-by-128 division
From: "Lucas Mateus Castro (alqotel)" Based on already existing QEMU implementation, created an unsigned 256 bit by 128 bit division needed to implement the vector divide extended unsigned instruction from PowerISA3.1 Signed-off-by: Lucas Mateus Castro (alqotel) --- include/qemu/host-utils.h | 15 + include/qemu/int128.h | 20 ++ util/host-utils.c | 128 ++ 3 files changed, 163 insertions(+) diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h index ca979dc6cc..6da6a93f69 100644 --- a/include/qemu/host-utils.h +++ b/include/qemu/host-utils.h @@ -32,6 +32,7 @@ #include "qemu/compiler.h" #include "qemu/bswap.h" +#include "qemu/int128.h" #ifdef CONFIG_INT128 static inline void mulu64(uint64_t *plow, uint64_t *phigh, @@ -153,6 +154,19 @@ static inline int clo64(uint64_t val) return clz64(~val); } +/* + * clz128 - count leading zeros in a 128-bit value. + * @val: The value to search + */ +static inline int clz128(Int128 a) +{ +if (int128_gethi(a)) { +return clz64(int128_gethi(a)); +} else { +return clz64(int128_getlo(a)) + 64; +} +} + /** * ctz32 - count trailing zeros in a 32-bit value. 
* @val: The value to search @@ -849,4 +863,5 @@ static inline uint64_t udiv_qrnnd(uint64_t *r, uint64_t n1, #endif } +Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor); #endif diff --git a/include/qemu/int128.h b/include/qemu/int128.h index 3af01f38cd..2a9ee956aa 100644 --- a/include/qemu/int128.h +++ b/include/qemu/int128.h @@ -128,11 +128,21 @@ static inline bool int128_ge(Int128 a, Int128 b) return a >= b; } +static inline bool int128_uge(Int128 a, Int128 b) +{ +return ((__uint128_t)a) >= ((__uint128_t)b); +} + static inline bool int128_lt(Int128 a, Int128 b) { return a < b; } +static inline bool int128_ult(Int128 a, Int128 b) +{ +return (__uint128_t)a < (__uint128_t)b; +} + static inline bool int128_le(Int128 a, Int128 b) { return a <= b; @@ -373,11 +383,21 @@ static inline bool int128_ge(Int128 a, Int128 b) return a.hi > b.hi || (a.hi == b.hi && a.lo >= b.lo); } +static inline bool int128_uge(Int128 a, Int128 b) +{ +return (uint64_t)a.hi > (uint64_t)b.hi || (a.hi == b.hi && a.lo >= b.lo); +} + static inline bool int128_lt(Int128 a, Int128 b) { return !int128_ge(a, b); } +static inline bool int128_ult(Int128 a, Int128 b) +{ +return !int128_uge(a, b); +} + static inline bool int128_le(Int128 a, Int128 b) { return int128_ge(b, a); diff --git a/util/host-utils.c b/util/host-utils.c index bcc772b8ec..c6a01638c7 100644 --- a/util/host-utils.c +++ b/util/host-utils.c @@ -266,3 +266,131 @@ void ulshift(uint64_t *plow, uint64_t *phigh, int32_t shift, bool *overflow) *plow = *plow << shift; } } +/* + * Unsigned 256-by-128 division. + * Returns the remainder via r. + * Returns lower 128 bit of quotient. + * Needs a normalized divisor (most significant bit set to 1). 
+ * + * Adapted from include/qemu/host-utils.h udiv_qrnnd, + * from the GNU Multi Precision Library - longlong.h __udiv_qrnnd + * (https://gmplib.org/repo/gmp/file/tip/longlong.h) + * + * Licensed under the GPLv2/LGPLv3 + */ +static Int128 udiv256_qrnnd(Int128 *r, Int128 n1, Int128 n0, Int128 d) +{ +Int128 d0, d1, q0, q1, r1, r0, m; +uint64_t mp0, mp1; + +d0 = int128_make64(int128_getlo(d)); +d1 = int128_make64(int128_gethi(d)); + +r1 = int128_remu(n1, d1); +q1 = int128_divu(n1, d1); +mp0 = int128_getlo(q1); +mp1 = int128_gethi(q1); +mulu128(&mp0, &mp1, int128_getlo(d0)); +m = int128_make128(mp0, mp1); +r1 = int128_make128(int128_gethi(n0), int128_getlo(r1)); +if (int128_ult(r1, m)) { +q1 = int128_sub(q1, int128_one()); +r1 = int128_add(r1, d); +if (int128_uge(r1, d)) { +if (int128_ult(r1, m)) { +q1 = int128_sub(q1, int128_one()); +r1 = int128_add(r1, d); +} +} +} +r1 = int128_sub(r1, m); + +r0 = int128_remu(r1, d1); +q0 = int128_divu(r1, d1); +mp0 = int128_getlo(q0); +mp1 = int128_gethi(q0); +mulu128(&mp0, &mp1, int128_getlo(d0)); +m = int128_make128(mp0, mp1); +r0 = int128_make128(int128_getlo(n0), int128_getlo(r0)); +if (int128_ult(r0, m)) { +q0 = int128_sub(q0, int128_one()); +r0 = int128_add(r0, d); +if (int128_uge(r0, d)) { +if (int128_ult(r0, m)) { +q0 = int128_sub(q0, int128_one()); +r0 = int128_add(r0, d); +} +} +} +r0 = int128_sub(r0, m); + +*r = r0; +return int128_or(int128_lshift(q1, 64), q0); +} + +/* + * Unsigned 256-by-128 division. + * Returns quotient via plow and phigh. + * Also returns the remainder via the function return value. + */ +Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor) +{ +Int128 dhi = *phigh; +Int128 dlo = *plow;
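The estimate-and-correct structure of udiv256_qrnnd() is easier to see at word level. Below is a Python model of the same GMP-style __udiv_qrnnd step (parameter names are mine): divide the double-wide value (n1:n0) by a normalized divisor, estimating each quotient "digit" with the divisor's high half and correcting the estimate at most twice. All arithmetic is done modulo 2**bits to mirror the C code's wrapping behaviour, which is what makes the `r >= d` carry test work:

```python
def udiv_qrnnd(n1: int, n0: int, d: int, bits: int = 64):
    """Divide the 2*bits-wide value (n1:n0) by a normalized divisor d
    (top bit set, n1 < d), returning (quotient, remainder)."""
    assert d >> (bits - 1) and n1 < d      # preconditions of the C code
    half, mask = bits // 2, (1 << bits) - 1
    hmask = (1 << half) - 1
    d1, d0 = d >> half, d & hmask

    # First quotient digit: estimate with the divisor's high half...
    q1, r1 = divmod(n1, d1)
    m = (q1 * d0) & mask
    r1 = ((r1 << half) | (n0 >> half)) & mask
    if r1 < m:                             # ...then correct at most twice
        q1, r1 = q1 - 1, (r1 + d) & mask
        if r1 >= d and r1 < m:             # r1 >= d <=> the add didn't wrap
            q1, r1 = q1 - 1, (r1 + d) & mask
    r1 = (r1 - m) & mask

    # Second quotient digit, same scheme, bringing in the low half of n0.
    q0, r0 = divmod(r1, d1)
    m = (q0 * d0) & mask
    r0 = ((r0 << half) | (n0 & hmask)) & mask
    if r0 < m:
        q0, r0 = q0 - 1, (r0 + d) & mask
        if r0 >= d and r0 < m:
            q0, r0 = q0 - 1, (r0 + d) & mask
    r0 = (r0 - m) & mask

    return (((q1 << half) | q0) & mask, r0)
```

Because Python integers are arbitrary precision, the model can be checked directly against exact division, which is a convenient way to convince oneself the two correction steps are sufficient.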
[PATCH v2 2/9] target/ppc: Implemented vector divide instructions
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vdivsw: Vector Divide Signed Word vdivuw: Vector Divide Unsigned Word vdivsd: Vector Divide Signed Doubleword vdivud: Vector Divide Unsigned Doubleword Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/insn32.decode| 7 target/ppc/translate/vmx-impl.c.inc | 59 + 2 files changed, 66 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index ac2d3da9a7..597768558b 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -703,3 +703,10 @@ XVTLSBB 00 ... -- 00010 . 111011011 . - @XX2_bf_xb _s s:uint8_t @XL_s ..-- s:1 .. - _s RFEBB 010011-- . 0010010010 - @XL_s + +## Vector Division Instructions + +VDIVSW 000100 . . . 00110001011@VX +VDIVUW 000100 . . . 00010001011@VX +VDIVSD 000100 . . . 00111001011@VX +VDIVUD 000100 . . . 00011001011@VX diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index 6101bca3fd..be35d6fdf3 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -3236,6 +3236,65 @@ TRANS(VMULHSD, do_vx_mulh, true , do_vx_vmulhd_i64) TRANS(VMULHUW, do_vx_mulh, false, do_vx_vmulhw_i64) TRANS(VMULHUD, do_vx_mulh, false, do_vx_vmulhd_i64) +#define TRANS_VDIV_VMOD(FLAGS, NAME, VECE, FNI4_FUNC, FNI8_FUNC)\ +static bool trans_##NAME(DisasContext *ctx, arg_VX *a) \ +{ \ +static const GVecGen3 op = {\ +.fni4 = FNI4_FUNC, \ +.fni8 = FNI8_FUNC, \ +.vece = VECE\ +}; \ +\ +REQUIRE_VECTOR(ctx);\ +REQUIRE_INSNS_FLAGS2(ctx, FLAGS); \ +\ +tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),\ + avr_full_offset(a->vrb), 16, 16, ); \ +\ +return true;\ +} + +#define DO_VDIV_VMOD(NAME, SZ, DIV, SIGNED) \ +static void NAME(TCGv_i##SZ t, TCGv_i##SZ a, TCGv_i##SZ b) \ +{ \ +/* \ + * If N/0 the instruction used by the backend might deliver\ + * an invalid division signal to the process, so if b = 0 return \ + * N/1 and if signed instruction, the same for a = 
int_min, b = -1 \ + */ \ +if (SIGNED) { \ +TCGv_i##SZ t0 = tcg_temp_new_i##SZ(); \ +TCGv_i##SZ t1 = tcg_temp_new_i##SZ(); \ +tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t0, a, INT##SZ##_MIN); \ +tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t1, b, -1); \ +tcg_gen_and_i##SZ(t0, t0, t1); \ +tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t1, b, 0); \ +tcg_gen_or_i##SZ(t0, t0, t1); \ +tcg_gen_movi_i##SZ(t1, 0); \ +tcg_gen_movcond_i##SZ(TCG_COND_NE, b, t0, t1, t0, b); \ +DIV(t, a, b); \ +tcg_temp_free_i##SZ(t0);\ +tcg_temp_free_i##SZ(t1);\ +} else {\ +TCGv_i##SZ zero = tcg_constant_i##SZ(0);\ +TCGv_i##SZ one = tcg_constant_i##SZ(1); \ +tcg_gen_movcond_i##SZ(TCG_COND_EQ, b, b, zero, one, b); \ +DIV(t, a, b); \ +} \ +} + +DO_VDIV_VMOD(do_divsw, 32, tcg_gen_div_i32, true) +DO_VDIV_VMOD(do_divuw, 32, tcg_gen_divu_i32, false) +DO_VDIV_VMOD(do_divsd, 64, tcg_gen_div_i64, true) +DO_VDIV_VMOD(do_divud, 64, tcg_gen_divu_i64, false) + +TRANS_VDIV_VMOD(ISA310, VDIVSW,
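The point of the DO_VDIV_VMOD guard above is that the host division instruction the TCG backend emits may trap (e.g. SIGFPE on x86) for N/0 or INT_MIN/-1, even though the PowerISA result for those inputs is merely undefined. The movcond therefore swaps in a harmless divisor of 1 before the real division runs. A rough Python model of that logic (names are mine; vdivsw here models one 32-bit lane):

```python
INT32_MIN = -(1 << 31)

def safe_divisor(a: int, b: int, signed: bool) -> int:
    """Model of the DO_VDIV_VMOD guard: divide by 1 instead of executing
    a division the host might trap on (N/0, or INT_MIN/-1 for signed
    ops); the architectural result is undefined in those cases anyway."""
    invalid = (b == 0) or (signed and a == INT32_MIN and b == -1)
    return 1 if invalid else b

def vdivsw(a: int, b: int) -> int:
    """One signed 32-bit lane of vdivsw, with C-style truncation."""
    b = safe_divisor(a, b, signed=True)
    q = abs(a) // abs(b)            # truncating (C-style) division
    return -q if (a < 0) != (b < 0) else q
```

Note that after the guard, N/0 yields N and INT_MIN/-1 yields INT_MIN, both valid instances of "undefined result".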
[PATCH v2 3/9] target/ppc: Implemented vector divide quadword
From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: vdivsq: Vector Divide Signed Quadword vdivuq: Vector Divide Unsigned Quadword Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/helper.h | 2 ++ target/ppc/insn32.decode| 2 ++ target/ppc/int_helper.c | 21 + target/ppc/translate/vmx-impl.c.inc | 2 ++ 4 files changed, 27 insertions(+) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 57da11c77e..4cfdf7b3ec 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -171,6 +171,8 @@ DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVSQ, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(VDIVUQ, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_3(vslo, void, avr, avr, avr) DEF_HELPER_3(vsro, void, avr, avr, avr) DEF_HELPER_3(vsrv, void, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 597768558b..3a88a0b5bc 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -710,3 +710,5 @@ VDIVSW 000100 . . . 00110001011@VX VDIVUW 000100 . . . 00010001011@VX VDIVSD 000100 . . . 00111001011@VX VDIVUD 000100 . . . 00011001011@VX +VDIVSQ 000100 . . . 0011011@VX +VDIVUQ 000100 . . . 
0001011@VX diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 492f34c499..ba5d4193ff 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -1036,6 +1036,27 @@ void helper_XXPERMX(ppc_vsr_t *t, ppc_vsr_t *s0, ppc_vsr_t *s1, ppc_vsr_t *pcv, *t = tmp; } +void helper_VDIVSQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b) +{ +Int128 neg1 = int128_makes64(-1); +Int128 int128_min = int128_make128(0, INT64_MIN); +if (likely(int128_nz(b->s128) && + (int128_ne(a->s128, int128_min) || int128_ne(b->s128, neg1 { +t->s128 = int128_divs(a->s128, b->s128); +} else { +t->s128 = a->s128; /* Undefined behavior */ +} +} + +void helper_VDIVUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b) +{ +if (int128_nz(b->s128)) { +t->s128 = int128_divu(a->s128, b->s128); +} else { +t->s128 = a->s128; /* Undefined behavior */ +} +} + void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { ppc_avr_t result; diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index be35d6fdf3..bac0db7128 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -3292,6 +3292,8 @@ TRANS_VDIV_VMOD(ISA310, VDIVSW, MO_32, do_divsw, NULL) TRANS_VDIV_VMOD(ISA310, VDIVUW, MO_32, do_divuw, NULL) TRANS_VDIV_VMOD(ISA310, VDIVSD, MO_64, NULL, do_divsd) TRANS_VDIV_VMOD(ISA310, VDIVUD, MO_64, NULL, do_divud) +TRANS_FLAGS2(ISA310, VDIVSQ, do_vx_helper, gen_helper_VDIVSQ) +TRANS_FLAGS2(ISA310, VDIVUQ, do_vx_helper, gen_helper_VDIVUQ) #undef DO_VDIV_VMOD -- 2.31.1
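The quadword case goes through a helper instead of inline TCG, so helper_VDIVSQ above can check the two undefined cases explicitly before calling int128_divs(). A Python model of the helper's behaviour (the "return a" fallback mirrors the patch's choice for the undefined cases):

```python
INT128_MIN = -(1 << 127)

def vdivsq(a: int, b: int) -> int:
    """Model of helper_VDIVSQ: truncating 128-bit signed division.
    For the two cases the ISA leaves undefined (b == 0, and
    INT128_MIN / -1, whose true quotient 2**127 is unrepresentable),
    the helper simply returns a."""
    if b == 0 or (a == INT128_MIN and b == -1):
        return a
    q = abs(a) // abs(b)            # truncating (C-style) division
    return -q if (a < 0) != (b < 0) else q
```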
[PATCH v2 1/9] qemu/int128: add int128_urshift
From: Matheus Ferst Implement an unsigned right shift for Int128 values and add the same tests cases of int128_rshift in the unit test. Signed-off-by: Matheus Ferst Signed-off-by: Lucas Mateus Castro (alqotel) --- include/qemu/int128.h| 19 +++ tests/unit/test-int128.c | 32 2 files changed, 51 insertions(+) diff --git a/include/qemu/int128.h b/include/qemu/int128.h index 2c4064256c..3af01f38cd 100644 --- a/include/qemu/int128.h +++ b/include/qemu/int128.h @@ -83,6 +83,11 @@ static inline Int128 int128_rshift(Int128 a, int n) return a >> n; } +static inline Int128 int128_urshift(Int128 a, int n) +{ +return (__uint128_t)a >> n; +} + static inline Int128 int128_lshift(Int128 a, int n) { return a << n; @@ -299,6 +304,20 @@ static inline Int128 int128_rshift(Int128 a, int n) } } +static inline Int128 int128_urshift(Int128 a, int n) +{ +uint64_t h = a.hi; +if (!n) { +return a; +} +h = h >> (n & 63); +if (n >= 64) { +return int128_make64(h); +} else { +return int128_make128((a.lo >> n) | ((uint64_t)a.hi << (64 - n)), h); +} +} + static inline Int128 int128_lshift(Int128 a, int n) { uint64_t l = a.lo << (n & 63); diff --git a/tests/unit/test-int128.c b/tests/unit/test-int128.c index b86a3c76e6..ae0f552193 100644 --- a/tests/unit/test-int128.c +++ b/tests/unit/test-int128.c @@ -206,6 +206,37 @@ static void test_rshift(void) test_rshift_one(0xFFFE8000U, 0, 0xFFFEULL, 0x8000ULL); } +static void __attribute__((__noinline__)) ATTRIBUTE_NOCLONE +test_urshift_one(uint32_t x, int n, uint64_t h, uint64_t l) +{ +Int128 a = expand(x); +Int128 r = int128_urshift(a, n); +g_assert_cmpuint(int128_getlo(r), ==, l); +g_assert_cmpuint(int128_gethi(r), ==, h); +} + +static void test_urshift(void) +{ +test_urshift_one(0x0001U, 64, 0xULL, 0x0001ULL); +test_urshift_one(0x8001U, 64, 0xULL, 0x8001ULL); +test_urshift_one(0x7FFEU, 64, 0xULL, 0x7FFEULL); +test_urshift_one(0xFFFEU, 64, 0xULL, 0xFFFEULL); +test_urshift_one(0x0001U, 60, 0xULL, 0x0010ULL); +test_urshift_one(0x8001U, 60, 0x0008ULL, 
0x0010ULL); +test_urshift_one(0x00018000U, 60, 0xULL, 0x0018ULL); +test_urshift_one(0x80018000U, 60, 0x0008ULL, 0x0018ULL); +test_urshift_one(0x7FFEU, 60, 0x0007ULL, 0xFFE0ULL); +test_urshift_one(0xFFFEU, 60, 0x000FULL, 0xFFE0ULL); +test_urshift_one(0x7FFE8000U, 60, 0x0007ULL, 0xFFE8ULL); +test_urshift_one(0xFFFE8000U, 60, 0x000FULL, 0xFFE8ULL); +test_urshift_one(0x00018000U, 0, 0x0001ULL, 0x8000ULL); +test_urshift_one(0x80018000U, 0, 0x8001ULL, 0x8000ULL); +test_urshift_one(0x7FFEU, 0, 0x7FFEULL, 0xULL); +test_urshift_one(0xFFFEU, 0, 0xFFFEULL, 0xULL); +test_urshift_one(0x7FFE8000U, 0, 0x7FFEULL, 0x8000ULL); +test_urshift_one(0xFFFE8000U, 0, 0xFFFEULL, 0x8000ULL); +} + int main(int argc, char **argv) { g_test_init(, , NULL); @@ -219,5 +250,6 @@ int main(int argc, char **argv) g_test_add_func("/int128/int128_ge", test_ge); g_test_add_func("/int128/int128_gt", test_gt); g_test_add_func("/int128/int128_rshift", test_rshift); +g_test_add_func("/int128/int128_urshift", test_urshift); return g_test_run(); } -- 2.31.1
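The non-CONFIG_INT128 fallback above has three cases: no shift, a shift of 64 or more (only the high limb survives), and a shift that mixes bits from both limbs. A Python model of the same limb arithmetic, checked against a plain big-integer shift (the model assumes 0 <= n <= 127, as the C code does):

```python
MASK64 = (1 << 64) - 1

def int128_urshift(lo: int, hi: int, n: int):
    """Model of the fallback int128_urshift(): logical right shift of a
    128-bit value kept as two 64-bit limbs, returning (lo, hi)."""
    if n == 0:
        return lo, hi
    h = hi >> (n & 63)              # n & 63 == n - 64 when 64 <= n < 128
    if n >= 64:
        return h, 0
    return ((lo >> n) | (hi << (64 - n))) & MASK64, h
```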
[PATCH v2 6/9] host-utils: Implemented signed 256-by-128 division
From: "Lucas Mateus Castro (alqotel)" Based on already existing QEMU implementation created a signed 256 bit by 128 bit division needed to implement the vector divide extended signed quadword instruction from PowerISA 3.1 Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson --- include/qemu/host-utils.h | 1 + util/host-utils.c | 51 +++ 2 files changed, 52 insertions(+) diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h index 6da6a93f69..d0b444a40f 100644 --- a/include/qemu/host-utils.h +++ b/include/qemu/host-utils.h @@ -864,4 +864,5 @@ static inline uint64_t udiv_qrnnd(uint64_t *r, uint64_t n1, } Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor); +Int128 divs256(Int128 *plow, Int128 *phigh, Int128 divisor); #endif diff --git a/util/host-utils.c b/util/host-utils.c index c6a01638c7..d221657e43 100644 --- a/util/host-utils.c +++ b/util/host-utils.c @@ -394,3 +394,54 @@ Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor) return rem; } } + +/* + * Signed 256-by-128 division. + * Returns quotient via plow and phigh. + * Also returns the remainder via the function return value. 
+ */ +Int128 divs256(Int128 *plow, Int128 *phigh, Int128 divisor) +{ +bool neg_quotient = false, neg_remainder = false; +Int128 unsig_hi = *phigh, unsig_lo = *plow; +Int128 rem; + +if (!int128_nonneg(*phigh)) { +neg_quotient = !neg_quotient; +neg_remainder = !neg_remainder; + +if (!int128_nz(unsig_lo)) { +unsig_hi = int128_neg(unsig_hi); +} else { +unsig_hi = int128_not(unsig_hi); +unsig_lo = int128_neg(unsig_lo); +} +} + +if (!int128_nonneg(divisor)) { +neg_quotient = !neg_quotient; + +divisor = int128_neg(divisor); +} + +rem = divu256(&unsig_lo, &unsig_hi, divisor); + +if (neg_quotient) { +if (!int128_nz(unsig_lo)) { +*phigh = int128_neg(unsig_hi); +*plow = int128_zero(); +} else { +*phigh = int128_not(unsig_hi); +*plow = int128_neg(unsig_lo); +} +} else { +*phigh = unsig_hi; +*plow = unsig_lo; +} + +if (neg_remainder) { +return int128_neg(rem); +} else { +return rem; +} +} -- 2.31.1
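divs256() follows the classic signed-via-unsigned scheme: negate negative operands, divide unsigned, then restore the signs of quotient and remainder separately (so the result is truncating division, matching C semantics). The only fiddly part is negating a 256-bit value held as two limbs, which needs the carry special case when the low limb is zero. A Python model (function names and the tuple return are mine):

```python
M128 = (1 << 128) - 1

def neg256(lo: int, hi: int):
    """Two's-complement negate of a 256-bit value held as two 128-bit
    limbs, with the same carry handling as divs256() above."""
    if lo == 0:
        return 0, (-hi) & M128
    return (-lo) & M128, (~hi) & M128

def divs256(lo: int, hi: int, d: int):
    """Model of divs256(): the dividend is the 256-bit two's-complement
    value (hi:lo); d is a Python signed int.  Returns
    (quot_lo, quot_hi, remainder)."""
    neg_q = neg_r = False
    if hi >> 127:                       # dividend negative
        neg_q = neg_r = True
        lo, hi = neg256(lo, hi)
    if d < 0:
        neg_q = not neg_q
        d = -d
    q, r = divmod((hi << 128) | lo, d)  # both operands now non-negative
    qlo, qhi = q & M128, (q >> 128) & M128
    if neg_q:
        qlo, qhi = neg256(qlo, qhi)
    return qlo, qhi, -r if neg_r else r
```

As in the C code, the remainder takes the sign of the dividend and the quotient is negative exactly when the operand signs differ.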
[PATCH v2 0/9] VDIV/VMOD Implementation
From: "Lucas Mateus Castro (alqotel)" This patch series is an implementation of the vector divide, vector divide extended and vector modulo instructions from PowerISA 3.1 The first patch is Matheus' patch, used here since the divs256 and divu256 functions use int128_urshift. v2 changes: - Dropped int128_lshift patch - Added missing int_min/-1 check - Changed invalid division to a division by 1 - Created new macro responsible for invalid division check (replacing DIV_VEC, REM_VEC and the check in dives_i32/diveu_i32) - Turned GVecGen3 array into single element Lucas Mateus Castro (alqotel) (8): target/ppc: Implemented vector divide instructions target/ppc: Implemented vector divide quadword target/ppc: Implemented vector divide extended word host-utils: Implemented unsigned 256-by-128 division host-utils: Implemented signed 256-by-128 division target/ppc: Implemented remaining vector divide extended target/ppc: Implemented vector module word/doubleword target/ppc: Implemented vector module quadword Matheus Ferst (1): qemu/int128: add int128_urshift include/qemu/host-utils.h | 16 +++ include/qemu/int128.h | 39 ++ target/ppc/helper.h | 8 ++ target/ppc/insn32.decode| 23 target/ppc/int_helper.c | 106 target/ppc/translate/vmx-impl.c.inc | 125 +++ tests/unit/test-int128.c| 32 + util/host-utils.c | 179 8 files changed, 528 insertions(+) -- 2.31.1
Re: [PATCH v2 2/4] target/ppc: init 'lpcr' in kvmppc_enable_cap_large_decr()
On 4/1/22 00:40, David Gibson wrote: On Thu, Mar 31, 2022 at 03:46:57PM -0300, Daniel Henrique Barboza wrote: On 3/31/22 14:36, Richard Henderson wrote: On 3/31/22 11:17, Daniel Henrique Barboza wrote: Hmm... this is seeming a bit like whack-a-mole. Could we instead use one of the valgrind hinting mechanisms to inform it that kvm_get_one_reg() writes the variable at *target? I didn't find a way of doing that looking in the memcheck helpers (https://valgrind.org/docs/manual/mc-manual.html section 4.7). That would be a good way of solving this warning because we would put stuff inside a specific function X and all callers of X would be covered by it. What I did find instead is a memcheck macro called VALGRIND_MAKE_MEM_DEFINED that tells Valgrind that the var was initialized. This patch would then be something as follows: diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c index dc93b99189..b0e22fa283 100644 --- a/target/ppc/kvm.c +++ b/target/ppc/kvm.c @@ -56,6 +56,10 @@ #define DEBUG_RETURN_GUEST 0 #define DEBUG_RETURN_GDB 1 +#ifdef CONFIG_VALGRIND_H +#include <valgrind/memcheck.h> +#endif + const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO }; @@ -2539,6 +2543,10 @@ int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable) CPUState *cs = CPU(cpu); uint64_t lpcr; +#ifdef CONFIG_VALGRIND_H + VALGRIND_MAKE_MEM_DEFINED(&lpcr, sizeof(uint64_t)); +#endif + kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr); /* Do we need to modify the LPCR? */ CONFIG_VALGRIND_H needs 'valgrind-devel' installed. I agree that this "Valgrind is complaining about variable initialization" is a whack-a-mole situation that will keep happening in the future if we keep adding this same code pattern (passing as reference an uninitialized var). For now, given that we have only 4 instances to fix it in ppc code (as far as I'm aware of), and we don't have a better way of telling Valgrind that we know what we're doing, I think we're better off initializing these vars.
I would instead put this annotation inside kvm_get_one_reg, so that it covers all kvm hosts. But it's too late to do this for 7.0. I wasn't planning on pushing these changes for 7.0 since they aren't fixing mem leaks or anything really bad. It's more of a quality of life improvement when using Valgrind. I also tried to put this annotation in kvm_get_one_reg() and it didn't solve the warning. That's weird, I'm pretty sure that should work. I'd double check to make sure you had all the parameters right (e.g. could you have marked the pointer itself as initialized, rather than the memory it points to). You're right. I got confused with different setups here and there and thought that it didn't work. I sent a patch to kvm-all.c that tries to do that: https://lists.gnu.org/archive/html/qemu-devel/2022-04/msg00507.html As for this series, for now I'm willing to take it since it improves the situation with simple initializations. We can reconsider it if we make good progress through the common code. At any rate these are 7.1 patches, so we have time. Thanks, Daniel I didn't find a way of telling Valgrind "consider that every time this function is called with parameter X it initializes X". That would be a good solution to put in the common KVM files and fix the problem for everybody. Daniel r~
Re: [RFC PATCH 1/1] kvm-all.c: hint Valgrind that kvm_get_one_reg() inits memory
On 4/5/22 11:30, Peter Maydell wrote: On Tue, 5 Apr 2022 at 14:07, Daniel Henrique Barboza wrote: There are a lot of Valgrind warnings about conditional jumps depending on uninitialized values like this one (taken from a pSeries guest): Conditional jump or move depends on uninitialised value(s) at 0xB011DC: kvmppc_enable_cap_large_decr (kvm.c:2544) by 0x92F28F: cap_large_decr_cpu_apply (spapr_caps.c:523) by 0x930C37: spapr_caps_cpu_apply (spapr_caps.c:921) by 0x955D3B: spapr_reset_vcpu (spapr_cpu_core.c:73) (...) Uninitialised value was created by a stack allocation at 0xB01150: kvmppc_enable_cap_large_decr (kvm.c:2538) In this case, the alleged uninitialized value is the 'lpcr' variable that is written by kvm_get_one_reg() and then used in an if clause: int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable) { CPUState *cs = CPU(cpu); uint64_t lpcr; kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr); /* Do we need to modify the LPCR? */ if (!!(lpcr & LPCR_LD) != !!enable) { < Valgrind warns here (...) A quick fix is to init the variable that kvm_get_one_reg() is going to write ('lpcr' in the example above). Another idea is to convince Valgrind that kvm_get_one_reg() inits the 'void *target' memory in case the ioctl() is successful. This will put some boilerplate in the function but it will bring benefit for its other callers. Doesn't Valgrind have a way of modelling ioctls where it knows what data is read and written ? In general ioctl-using programs don't need to have special case "I am running under valgrind" handling, so this seems to me like valgrind is missing support for this particular ioctl. I don't know if Valgrind is capable of doing that. Guess it's worth a look. More generally, how much use is running QEMU with KVM enabled under valgrind anyway? Valgrind has no way of knowing about writes to memory that the guest vCPUs do... At least in the hosts I have access to, I wasn't able to get a pSeries guest booting up to prompt with Valgrind + TCG.
It was painfully slow. Valgrind + KVM is slow but doable. Granted, vCPUs reads/writes can't be profiled with it when using KVM, but for everything else is alright. Thanks, Daniel thanks -- PMM
Re: [PATCH 4/7] virtio: don't read pending event on host notifier if disabled
On 4/1/2022 7:00 PM, Jason Wang wrote: On Sat, Apr 2, 2022 at 4:37 AM Si-Wei Liu wrote: On 3/31/2022 1:36 AM, Jason Wang wrote: On Thu, Mar 31, 2022 at 12:41 AM Si-Wei Liu wrote: On 3/30/2022 2:14 AM, Jason Wang wrote: On Wed, Mar 30, 2022 at 2:33 PM Si-Wei Liu wrote: Previous commit prevents vhost-user and vhost-vdpa from using userland vq handler via disable_ioeventfd_handler. The same needs to be done for host notifier cleanup too, as the virtio_queue_host_notifier_read handler still tends to read pending event left behind on ioeventfd and attempts to handle outstanding kicks from QEMU userland vq. If vq handler is not disabled on cleanup, it may lead to sigsegv with recursive virtio_net_set_status call on the control vq: 0 0x7f8ce3ff3387 in raise () at /lib64/libc.so.6 1 0x7f8ce3ff4a78 in abort () at /lib64/libc.so.6 2 0x7f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6 3 0x7f8ce3fec252 in () at /lib64/libc.so.6 4 0x558f52d79421 in vhost_vdpa_get_vq_index (dev=, idx=) at ../hw/virtio/vhost-vdpa.c:563 5 0x558f52d79421 in vhost_vdpa_get_vq_index (dev=, idx=) at ../hw/virtio/vhost-vdpa.c:558 6 0x558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=) at ../hw/virtio/vhost.c:1557 I feel it's probably a bug elsewhere e.g when we fail to start vhost-vDPA, it's the charge of the Qemu to poll host notifier and we will fallback to the userspace vq handler. Apologies, an incorrect stack trace was pasted which actually came from patch #1. 
I will post a v2 with the corresponding one as below: 0 0x55f800df1780 in qdev_get_parent_bus (dev=0x0) at ../hw/core/qdev.c:376 1 0x55f800c68ad8 in virtio_bus_device_iommu_enabled (vdev=vdev@entry=0x0) at ../hw/virtio/virtio-bus.c:331 2 0x55f800d70d7f in vhost_memory_unmap (dev=) at ../hw/virtio/vhost.c:318 3 0x55f800d70d7f in vhost_memory_unmap (dev=, buffer=0x7fc19bec5240, len=2052, is_write=1, access_len=2052) at ../hw/virtio/vhost.c:336 4 0x55f800d71867 in vhost_virtqueue_stop (dev=dev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590, vq=0x55f8037cceb0, idx=0) at ../hw/virtio/vhost.c:1241 5 0x55f800d7406c in vhost_dev_stop (hdev=hdev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590) at ../hw/virtio/vhost.c:1839 6 0x55f800bf00a7 in vhost_net_stop_one (net=0x55f8037ccc30, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:315 7 0x55f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:423 8 0x55f800d4e628 in virtio_net_set_status (status=, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296 9 0x55f800d4e628 in virtio_net_set_status (vdev=vdev@entry=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370 I don't understand why virtio_net_handle_ctrl() call virtio_net_set_stauts()... The pending request left over on the ctrl vq was a VIRTIO_NET_CTRL_MQ command, i.e. in virtio_net_handle_mq(): Completely forget that the code was actually written by me :\ 1413 n->curr_queue_pairs = queue_pairs; 1414 /* stop the backend before changing the number of queue_pairs to avoid handling a 1415 * disabled queue */ 1416 virtio_net_set_status(vdev, vdev->status); 1417 virtio_net_set_queue_pairs(n); Noted before the vdpa multiqueue support, there was never a vhost_dev for ctrl_vq exposed, i.e. there's no host notifier set up for the ctrl_vq on vhost_kernel as it is emulated in QEMU software. 
10 0x55f800d534d8 in virtio_net_handle_ctrl (iov_cnt=, iov=, cmd=0 '\000', n=0x55f8044ec590) at ../hw/net/virtio-net.c:1408 11 0x55f800d534d8 in virtio_net_handle_ctrl (vdev=0x55f8044ec590, vq=0x7fc1a7e888d0) at ../hw/net/virtio-net.c:1452 12 0x55f800d69f37 in virtio_queue_host_notifier_read (vq=0x7fc1a7e888d0) at ../hw/virtio/virtio.c:2331 13 0x55f800d69f37 in virtio_queue_host_notifier_read (n=n@entry=0x7fc1a7e8894c) at ../hw/virtio/virtio.c:3575 14 0x55f800c688e6 in virtio_bus_cleanup_host_notifier (bus=, n=n@entry=14) at ../hw/virtio/virtio-bus.c:312 15 0x55f800d73106 in vhost_dev_disable_notifiers (hdev=hdev@entry=0x55f8035b51b0, vdev=vdev@entry=0x55f8044ec590) at ../../../include/hw/virtio/virtio-bus.h:35 16 0x55f800bf00b2 in vhost_net_stop_one (net=0x55f8035b51b0, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:316 17 0x55f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:423 18 0x55f800d4e628 in virtio_net_set_status (status=, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296 19 0x55f800d4e628 in virtio_net_set_status (vdev=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370 20 0x55f800d6c4b2 in virtio_set_status (vdev=0x55f8044ec590, val=) at ../hw/virtio/virtio.c:1945 21 0x55f800d11d9d in vm_state_notify
[PATCH] acpi: Bodge acpi_index migration
Re: [PULL 0/3] Misc changes for 2022-04-05
On Tue, 5 Apr 2022 at 10:25, Paolo Bonzini wrote: > > The following changes since commit 20661b75ea6093f5e59079d00a778a972d6732c5: > > Merge tag 'pull-ppc-20220404' of https://github.com/legoater/qemu into > staging (2022-04-04 15:48:55 +0100) > > are available in the Git repository at: > > https://gitlab.com/bonzini/qemu.git tags/for-upstream > > for you to fetch changes up to 776a6a32b4982a68d3b7a77cbfaae6c2b363a0b8: > > docs/system/i386: Add measurement calculation details to > amd-memory-encryption (2022-04-05 10:42:06 +0200) > > > * fix vss-win32 compilation with clang++ > > * update Coverity model > > * add measurement calculation to amd-memory-encryption docs > > > Dov Murik (1): > docs/system/i386: Add measurement calculation details to > amd-memory-encryption > > Helge Konetzka (1): > qga/vss-win32: fix compilation with clang++ > > Paolo Bonzini (1): > coverity: update model for latest tools Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0 for any user-visible changes. -- PMM
Re: [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory
On Tue, Apr 05, 2022, Andy Lutomirski wrote: > On Tue, Apr 5, 2022, at 3:36 AM, Quentin Perret wrote: > > On Monday 04 Apr 2022 at 15:04:17 (-0700), Andy Lutomirski wrote: > >> The best I can come up with is a special type of shared page that is not > >> GUP-able and maybe not even mmappable, having a clear option for > >> transitions to fail, and generally preventing the nasty cases from > >> happening in the first place. > > > > Right, that sounds reasonable to me. > > At least as a v1, this is probably more straightforward than allowing mmap(). > Also, there's much to be said for a simpler, limited API, to be expanded if > genuinely needed, as opposed to starting out with a very featureful API. Regarding "genuinely needed", IMO the same applies to supporting this at all. Without numbers from something at least approximating a real use case, we're just speculating on which will be the most performant approach. > >> Maybe there could be a special mode for the private memory fds in which > >> specific pages are marked as "managed by this fd but actually shared". > >> pread() and pwrite() would work on those pages, but not mmap(). (Or maybe > >> mmap() but the resulting mappings would not permit GUP.) And > >> transitioning them would be a special operation on the fd that is specific > >> to pKVM and wouldn't work on TDX or SEV. > > > > Aha, didn't think of pread()/pwrite(). Very interesting. > > There are plenty of use cases for which pread()/pwrite()/splice() will be as > fast or even much faster than mmap()+memcpy(). ... > resume guest > *** host -> hypervisor -> guest *** > Guest unshares the page. > *** guest -> hypervisor *** > Hypervisor removes PTE. TLBI. > *** hypervisor -> guest *** > > Obviously considerable cleverness is needed to make a virt IOMMU like this > work well, but still. 
> > Anyway, my suggestion is that the fd backing proposal get slightly modified > to get it ready for multiple subtypes of backing object, which should be a > pretty minimal change. Then, if someone actually needs any of this > cleverness, it can be added later. In the mean time, the > pread()/pwrite()/splice() scheme is pretty good. Tangentially related to getting private-fd ready for multiple things, what about implementing the pread()/pwrite()/splice() scheme in pKVM itself? I.e. read() on the VM fd, with the offset corresponding to gfn in some way. Ditto for mmap() on the VM fd, though that would require additional changes outside of pKVM. That would allow pKVM to support in-place conversions without the private-fd having to differentiate between the type of protected VM, and without having to provide new APIs from the private-fd. TDX, SNP, etc... Just Work by not supporting the pKVM APIs. And assuming we get multiple consumers down the road, pKVM will need to be able to communicate the "true" state of a page to other consumers, because in addition to being a consumer, pKVM is also an owner/enforcer analogous to the TDX Module and the SEV PSP.
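The pread()/pwrite()/splice() scheme discussed above amounts to the host touching guest-private memory only through positioned I/O on the backing fd, never through a CPU mapping of the pages. A minimal Python sketch of that access pattern (using an ordinary temp file as a stand-in for the private-memory fd, which is an assumption purely for illustration):

```python
import os
import tempfile

def fd_copy_in_out(payload: bytes, offset: int) -> bytes:
    """Sketch of the fd-based access model: data moves via positioned
    reads/writes (os.pwrite/os.pread) on the backing fd, so the host
    never needs mmap() on the pages at all."""
    fd, path = tempfile.mkstemp()
    try:
        os.pwrite(fd, payload, offset)             # host -> "guest" page
        return os.pread(fd, len(payload), offset)  # "guest" page -> host
    finally:
        os.close(fd)
        os.unlink(path)
```

Positioned I/O is also naturally offset-addressed, which matches the gfn-to-offset mapping the proposal relies on.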
Re: [PATCH] block/stream: Drain subtree around graph change
Am 05/04/2022 um 19:53 schrieb Emanuele Giuseppe Esposito: > > > Am 05/04/2022 um 17:04 schrieb Kevin Wolf: >> Am 05.04.2022 um 15:09 hat Emanuele Giuseppe Esposito geschrieben: >>> Am 05/04/2022 um 12:14 schrieb Kevin Wolf: I think all of this is really relevant for Emanuele's work, which involves adding AIO_WAIT_WHILE() deep inside graph update functions. I fully expect that we would see very similar problems, and just stacking drain sections over drain sections that might happen to usually fix things, but aren't guaranteed to, doesn't look like a good solution. >>> >>> Yes, I think at this point we all agreed to drop subtree_drain as >>> replacement for AioContext. >>> >>> The alternative is what Paolo proposed in the other thread " Removal of >>> AioContext lock, bs->parents and ->children: proof of concept" >>> I am not sure which thread you replied first :) >> >> This one, I think. :-) >> >>> I think that proposal is not far from your idea, and it avoids to >>> introduce or even use drains at all. >>> Not sure why you called it a "step backwards even from AioContext locks". >> >> I was only referring to the lock locality there. AioContext locks are >> really coarse, but still a finer granularity than a single global lock. >> >> In the big picture, it's still be better than the AioContext lock, but >> that's because it's a different type of lock, not because it has better >> locality. >> >> So I was just wondering if we can't have the different type of lock and >> make it local to the BDS, too. > > I guess this is the right time to discuss this. > > I think that a global lock will be easier to handle, and we already have > a concrete implementation (cpus-common). > > I think that the reads in some sense are already BDS-specific, because > each BDS that is reading has an internal a flag. > Writes, on the other hand, are global. If a write is happening, no other > read at all can run, even if it has nothing to do with it. 
> > The question then is: how difficult would it be to implement a BDS-specific > write? > From the API perspective, change > bdrv_graph_wrlock(void); > into > bdrv_graph_wrlock(BlockDriverState *parent, BlockDriverState *child); > I am not sure if/how complicated it will be. For sure all the global > variables would end up in the BDS struct. > > On the other side, making the read generic instead could also be interesting. > Think about drain: it is a recursive function, and it doesn't really > make sense to take the rdlock for each node it traverses. Otherwise, a simple solution for drains that requires no change at all is to just take the rdlock on the bs calling drain, and since each write waits for all reads to complete, it will work anyway. The only detail is that assert_bdrv_graph_readable() will then need to iterate through all nodes to be sure that at least one of them is actually reading. So yeah, I know this might be hard to realize without an implementation, but my conclusion is to leave the lock as it is for now. > Even though I don't know an easy way to replace the ->has_waiter and > ->reading_graph flags... > > Emanuele >
Re: [RFC PATCH] python: add qmp-send program to send raw qmp commands to qemu
On Tue, Apr 5, 2022, 5:03 AM Damien Hedde wrote: > > > On 4/4/22 22:34, John Snow wrote: > > On Wed, Mar 16, 2022 at 5:55 AM Damien Hedde > wrote: > >> > >> It takes an input file containing raw qmp commands (concatenated json > >> dicts) and sends all commands one by one to a qmp server. When one > >> command fails, it exits. > >> > >> As a convenience, it can also wrap the qemu process to avoid having > >> to start qemu in the background. When wrapping qemu, the program returns > >> only when the qemu process terminates. > >> > >> Signed-off-by: Damien Hedde > >> --- > >> > >> Hi all, > >> > >> Following our discussion, I've started this. What do you think? > >> > >> I tried to follow Daniel's qmp-shell-wrap. I think it is > >> better to have similar options (eg: logging). There is also room > >> for factorizing code if we want to keep them aligned and ease > >> maintenance. > >> > >> There are still some pylint issues (too many branches in main and it > >> does not like my context manager if/else line). But it's kind of a > >> mess to fix these, so I think it's enough for a first version. > > > > Yeah, don't worry about these. You can just tell pylint to shut up > > while you prototype. Sometimes it's just not worth spending more time > > on a more beautiful factoring. Oh well. > > > >> > >> I named it qmp-send as Daniel proposed; maybe qmp-test better matches > >> what I'm doing there? > >> > > > > I think I agree with Dan's response. > > > >> Thanks, > >> Damien > >> --- > >> python/qemu/aqmp/qmp_send.py | 229 +++ > > > > I recommend putting this in qemu/util/qmp_send.py instead. > > > > I'm in the process of pulling out the AQMP lib and hosting it > > separately. Scripts like this I think should stay in the QEMU tree, so > > moving it to util instead is probably best. Otherwise, I'll *really* > > have to commit to the syntax, and that's probably a bigger hurdle than > > you want to deal with. > > If it stays in the QEMU tree, what licensing should I use ?
LGPL does not > hurt, no ? > Whichever you please. GPLv2+ would be convenient and harmonizes well with other tools. LGPL is only something I started doing so that the "qemu.qmp" package would be LGPL. Licensing the tools as LGPL was just a sin of convenience so I could claim a single license for the whole wheel/egg/tgz. (I didn't want to make separate qmp and qmp-tools packages.) Go with what you feel is best. > > > >> scripts/qmp/qmp-send | 11 ++ > >> 2 files changed, 240 insertions(+) > >> create mode 100644 python/qemu/aqmp/qmp_send.py > >> create mode 100755 scripts/qmp/qmp-send > >> > >> diff --git a/python/qemu/aqmp/qmp_send.py b/python/qemu/aqmp/qmp_send.py > >> new file mode 100644 > >> index 00..cbca1d0205 > >> --- /dev/null > >> +++ b/python/qemu/aqmp/qmp_send.py > > > > Seems broadly fine to me, but I didn't review closely this time. If it > > works for you, it works for me. > > > > As for making QEMU hang: there's a few things you could do, take a > > look at iotests and see how they handle timeout blocks in synchronous > > code -- iotests.py line 696 or so, "class Timeout". When writing async > > code, you can also do stuff like this: > > > > async def foo(): > > await asyncio.wait_for(qmp.execute("some-command", args_etc), > timeout=30) > > > > See https://docs.python.org/3/library/asyncio-task.html#asyncio.wait_for > > > > --js > > > > Thanks for the tip, > -- > Damien > Oh, and one more. the legacy.py bindings for AQMP also support a configurable timeout that applies to most API calls by default. see https://gitlab.com/jsnow/qemu.qmp/-/blob/main/qemu/qmp/legacy.py#L285 (Branch still in limbo here, but it should still be close to the same in qemu.git) I believe this is used by iotests.py when it sets up its machine.py subclass ("VM", iirc) so that most qmp invocations in iotests have a default timeout and won't hang tests indefinitely. --js >
Re: [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory
On Tue, Apr 05, 2022, Quentin Perret wrote: > On Monday 04 Apr 2022 at 15:04:17 (-0700), Andy Lutomirski wrote: > > >> - it can be very useful for protected VMs to do shared=>private > > >>conversions. Think of a VM receiving some data from the host in a > > >>shared buffer, and then it wants to operate on that buffer without > > >>risking to leak confidential informations in a transient state. In > > >>that case the most logical thing to do is to convert the buffer back > > >>to private, do whatever needs to be done on that buffer (decrypting a > > >>frame, ...), and then share it back with the host to consume it; > > > > > > If performance is a motivation, why would the guest want to do two > > > conversions instead of just doing internal memcpy() to/from a private > > > page? I would be quite surprised if multiple exits and TLB shootdowns is > > > actually faster, especially at any kind of scale where zapping stage-2 > > > PTEs will cause lock contention and IPIs. > > > > I don't know the numbers or all the details, but this is arm64, which is a > > rather better architecture than x86 in this regard. So maybe it's not so > > bad, at least in very simple cases, ignoring all implementation details. > > (But see below.) Also the systems in question tend to have fewer CPUs than > > some of the massive x86 systems out there. > > Yep. I can try and do some measurements if that's really necessary, but > I'm really convinced the cost of the TLBI for the shared->private > conversion is going to be significantly smaller than the cost of memcpy > the buffer twice in the guest for us. It's not just the TLB shootdown, the VM-Exits aren't free. And barring non-trivial improvements to KVM's MMU, e.g. sharding of mmu_lock, modifying the page tables will block all other updates and MMU operations. Taking mmu_lock for read, should arm64 ever convert to a rwlock, is not an option because KVM needs to block other conversions to avoid races. 
Hmm, though batching multiple pages into a single request would mitigate most of the overhead. > There are variations of that idea: e.g. allow userspace to mmap the > entire private fd but w/o taking a reference on pages mapped with > PROT_NONE. And then the VMM can use mprotect() in response to > share/unshare requests. I think Marc liked that idea as it keeps the > userspace API closer to normal KVM -- there actually is a > straightforward gpa->hva relation. Not sure how much that would impact > the implementation at this point. > > For the shared=>private conversion, this would be something like so: > > - the guest issues a hypercall to unshare a page; > > - the hypervisor forwards the request to the host; > > - the host kernel forwards the request to userspace; > > - userspace then munmap()s the shared page; > > - KVM then tries to take a reference to the page. If it succeeds, it >re-enters the guest with a flag of some sort saying that the share >succeeded, and the hypervisor will adjust pgtables accordingly. If >KVM failed to take a reference, it flags this and the hypervisor will >be responsible for communicating that back to the guest. This means >the guest must handle failures (possibly fatal). > > (There are probably many ways in which we can optimize this, e.g. by > having the host proactively munmap() pages it no longer needs so that > the unshare hypercall from the guest doesn't need to exit all the way > back to host userspace.) ... > > Maybe there could be a special mode for the private memory fds in which > > specific pages are marked as "managed by this fd but actually shared". > > pread() and pwrite() would work on those pages, but not mmap(). (Or maybe > > mmap() but the resulting mappings would not permit GUP.) Unless I misunderstand what you intend by pread()/pwrite(), I think we'd need to allow mmap(), otherwise e.g. uaccess from the kernel wouldn't work. 
> > And transitioning them would be a special operation on the fd that is > > specific to pKVM and wouldn't work on TDX or SEV. To keep things feature agnostic (IMO, baking TDX vs SEV vs pKVM info into private-fd is a really bad idea), this could be handled by adding a flag and/or callback into the notifier/client stating whether or not it supports mapping a private-fd, and then mapping would be allowed if and only if all consumers support/allow mapping. > > Hmm. Sean and Chao, are we making a bit of a mistake by making these fds > > technology-agnostic? That is, would we want to distinguish between a TDX > > backing fd, a SEV backing fd, a software-based backing fd, etc? API-wise > > this could work by requiring the fd to be bound to a KVM VM instance and > > possibly even configured a bit before any other operations would be > > allowed. I really don't want to distinguish between between each exact feature, but I've no objection to adding flags/callbacks to track specific properties of the downstream consumers, e.g. "can this
Re: [PATCH] block/stream: Drain subtree around graph change
Am 05/04/2022 um 17:04 schrieb Kevin Wolf: > Am 05.04.2022 um 15:09 hat Emanuele Giuseppe Esposito geschrieben: >> Am 05/04/2022 um 12:14 schrieb Kevin Wolf: >>> I think all of this is really relevant for Emanuele's work, which >>> involves adding AIO_WAIT_WHILE() deep inside graph update functions. I >>> fully expect that we would see very similar problems, and just stacking >>> drain sections over drain sections that might happen to usually fix >>> things, but aren't guaranteed to, doesn't look like a good solution. >> >> Yes, I think at this point we all agreed to drop subtree_drain as >> replacement for AioContext. >> >> The alternative is what Paolo proposed in the other thread "Removal of >> AioContext lock, bs->parents and ->children: proof of concept" >> I am not sure which thread you replied to first :) > > This one, I think. :-) > >> I think that proposal is not far from your idea, and it avoids >> introducing or even using drains at all. >> Not sure why you called it a "step backwards even from AioContext locks". > > I was only referring to the lock locality there. AioContext locks are > really coarse, but still a finer granularity than a single global lock. > > In the big picture, it'd still be better than the AioContext lock, but > that's because it's a different type of lock, not because it has better > locality. > > So I was just wondering if we can't have the different type of lock and > make it local to the BDS, too. I guess this is the right time to discuss this. I think that a global lock will be easier to handle, and we already have a concrete implementation (cpus-common). I think that the reads in some sense are already BDS-specific, because each BDS that is reading has an internal flag. Writes, on the other hand, are global. If a write is happening, no other read at all can run, even if it has nothing to do with it. The question then is: how difficult would it be to implement a BDS-specific write?
From the API perspective, change bdrv_graph_wrlock(void); into bdrv_graph_wrlock(BlockDriverState *parent, BlockDriverState *child); I am not sure if/how complicated it will be. For sure all the global variables would end up in the BDS struct. On the other side, making the read generic instead could also be interesting. Think about drain: it is a recursive function, and it doesn't really make sense to take the rdlock for each node it traverses. Even though I don't know an easy way to replace the ->has_waiter and ->reading_graph flags... Emanuele
Re: [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory
On Tue, Apr 5, 2022, at 3:36 AM, Quentin Perret wrote: > On Monday 04 Apr 2022 at 15:04:17 (-0700), Andy Lutomirski wrote: >> >> >> On Mon, Apr 4, 2022, at 10:06 AM, Sean Christopherson wrote: >> > On Mon, Apr 04, 2022, Quentin Perret wrote: >> >> On Friday 01 Apr 2022 at 12:56:50 (-0700), Andy Lutomirski wrote: >> >> FWIW, there are a couple of reasons why I'd like to have in-place >> >> conversions: >> >> >> >> - one goal of pKVM is to migrate some things away from the Arm >> >>Trustzone environment (e.g. DRM and the likes) and into protected VMs >> >>instead. This will give Linux a fighting chance to defend itself >> >>against these things -- they currently have access to _all_ memory. >> >>And transitioning pages between Linux and Trustzone (donations and >> >>shares) is fast and non-destructive, so we really do not want pKVM to >> >>regress by requiring the hypervisor to memcpy things; >> > >> > Is there actually a _need_ for the conversion to be non-destructive? >> > E.g. I assume >> > the "trusted" side of things will need to be reworked to run as a pKVM >> > guest, at >> > which point reworking its logic to understand that conversions are >> > destructive and >> > slow-ish doesn't seem too onerous. >> > >> >> - it can be very useful for protected VMs to do shared=>private >> >>conversions. Think of a VM receiving some data from the host in a >> >>shared buffer, and then it wants to operate on that buffer without >> >>risking to leak confidential informations in a transient state. In >> >>that case the most logical thing to do is to convert the buffer back >> >>to private, do whatever needs to be done on that buffer (decrypting a >> >>frame, ...), and then share it back with the host to consume it; >> > >> > If performance is a motivation, why would the guest want to do two >> > conversions >> > instead of just doing internal memcpy() to/from a private page? 
I >> > would be quite >> > surprised if multiple exits and TLB shootdowns is actually faster, >> > especially at >> > any kind of scale where zapping stage-2 PTEs will cause lock contention >> > and IPIs. >> >> I don't know the numbers or all the details, but this is arm64, which is a >> rather better architecture than x86 in this regard. So maybe it's not so >> bad, at least in very simple cases, ignoring all implementation details. >> (But see below.) Also the systems in question tend to have fewer CPUs than >> some of the massive x86 systems out there. > > Yep. I can try and do some measurements if that's really necessary, but > I'm really convinced the cost of the TLBI for the shared->private > conversion is going to be significantly smaller than the cost of memcpy > the buffer twice in the guest for us. To be fair, although the cost for > the CPU update is going to be low, the cost for IOMMU updates _might_ be > higher, but that very much depends on the hardware. On systems that use > e.g. the Arm SMMU, the IOMMUs can use the CPU page-tables directly, and > the iotlb invalidation is done on the back of the CPU invalidation. So, > on systems with sane hardware the overhead is *really* quite small. > > Also, memcpy requires double the memory, it is pretty bad for power, and > it causes memory traffic which can't be a good thing for things running > concurrently. > >> If we actually wanted to support transitioning the same page between shared >> and private, though, we have a bit of an awkward situation. Private to >> shared is conceptually easy -- do some bookkeeping, reconstitute the direct >> map entry, and it's done. The other direction is a mess: all existing uses >> of the page need to be torn down. If the page has been recently used for >> DMA, this includes IOMMU entries. >> >> Quentin: let's ignore any API issues for now. Do you have a concept of how >> a nondestructive shared -> private transition could work well, even in >> principle? 
> > I had a high level idea for the workflow, but I haven't looked into the > implementation details. > > The idea would be to allow KVM *or* userspace to take a reference > to a page in the fd in an exclusive manner. KVM could take a reference > on a page (which would be necessary before donating it to a guest) > using some kind of memfile_notifier as proposed in this series, and > userspace could do the same some other way (mmap presumably?). In both > cases, the operation might fail. > > I would imagine the boot and private->shared flow as follows: > > - the VMM uses fallocate on the private fd, and associates the <offset, size> with a memslot; > > - the guest boots, and as part of that KVM takes references to all the > pages that are donated to the guest. If userspace happens to have a > mapping to a page, KVM will fail to take the reference, which would > be fatal for the guest. > > - once the guest has booted, it issues a hypercall to share a page back > with the host; > > - KVM is notified,
Re: [PATCH v4 10/11] tests/tcg/s390x: Tests for Vector Enhancements Facility 2
Recommendation for comment? /* vri-d encoding matches vrr for 4b imm. .insn does not handle this encoding variant. */ Christian: I will push another patch version as soon as that's decided. (unless you prefer to choose the comment and edit during staging) On Tue, Apr 5, 2022 at 6:13 AM David Hildenbrand wrote: > > On 01.04.22 17:25, Christian Borntraeger wrote: > > Am 01.04.22 um 17:02 schrieb David Miller: > >> vrr is almost a perfect match (it is for this, larger than imm4 would > >> need to be split). > >> > >> .long : this would be uglier. > >> use enough to be filled with nops after ? > >> or use a 32b and 16b instead if it's in .text it should make no difference. > > > > I will let Richard or David decide what they prefer. > > > > I don't particularly care as long as there is a comment stating why we > need this hack. > > -- > Thanks, > > David / dhildenb >
Re: [PATCH v5 0/9] Add support for AST1030 SoC
Hello Jamin, On 4/1/22 10:38, Jamin Lin wrote: Changes from v5: - remove TYPE_ASPEED_MINIBMC_MACHINE and ASPEED_MINIBMC_MACHINE - remove ast1030_machine_instance_init function Changes from v4: - drop the ASPEED_SMC_FEATURE_WDT_CONTROL flag in hw/ssi/aspeed_smc.c Changes from v3: - remove AspeedMiniBmcMachineState state structure and AspeedMiniBmcMachineClass class - remove redundant new line in hw/arm/aspeed_ast10xx.c Do we want to be in sync with the zephyr naming and use ast10x0.c ? https://github.com/zephyrproject-rtos/zephyr/tree/main/soc/arm/aspeed This is just a question. Don't resend for this. Thanks, C.
Re: [PATCH v3 3/3] qcow2: Add errp to rebuild_refcount_structure()
On Tue, Apr 05, 2022 at 03:46:52PM +0200, Hanna Reitz wrote: > Instead of fprint()-ing error messages in rebuild_refcount_structure() > and its rebuild_refcounts_write_refblocks() helper, pass them through an > Error object to qcow2_check_refcounts() (which will then print it). > > Suggested-by: Eric Blake > Signed-off-by: Hanna Reitz > --- > block/qcow2-refcount.c | 33 +++-- > 1 file changed, 19 insertions(+), 14 deletions(-) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index c5669eaa51..ed0ecfaa89 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -2465,7 +2465,8 @@ static int64_t alloc_clusters_imrt(BlockDriverState *bs, > static int rebuild_refcounts_write_refblocks( > BlockDriverState *bs, void **refcount_table, int64_t *nb_clusters, > int64_t first_cluster, int64_t end_cluster, > -uint64_t **on_disk_reftable_ptr, uint32_t > *on_disk_reftable_entries_ptr > +uint64_t **on_disk_reftable_ptr, uint32_t > *on_disk_reftable_entries_ptr, > +Error **errp > ) > { > BDRVQcow2State *s = bs->opaque; > @@ -2516,8 +2517,8 @@ static int rebuild_refcounts_write_refblocks( >nb_clusters, >_free_cluster); > if (refblock_offset < 0) { > -fprintf(stderr, "ERROR allocating refblock: %s\n", > -strerror(-refblock_offset)); > +error_setg_errno(errp, -refblock_offset, > + "ERROR allocating refblock"); Most uses of error_setg* don't ALL_CAPS the first word. But this is pre-existing, so I'm not insisting you change it here. > return refblock_offset; > } > > @@ -2539,6 +2540,7 @@ static int rebuild_refcounts_write_refblocks( >on_disk_reftable_entries * >REFTABLE_ENTRY_SIZE); > if (!on_disk_reftable) { > +error_setg(errp, "ERROR allocating reftable memory"); > return -ENOMEM; Ah, so this is also a corner case bug fix, where we didn't have a message on all error paths. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [qemu.qmp PATCH 10/13] docs: add versioning policy to README
On Tue, Apr 5, 2022, 5:16 AM Damien Hedde wrote: > > > On 3/30/22 20:24, John Snow wrote: > > The package is in an alpha state, but there's a method to the madness. > > > > Signed-off-by: John Snow > > --- > > README.rst | 21 + > > 1 file changed, 21 insertions(+) > > > > diff --git a/README.rst b/README.rst > > index 8593259..88efe84 100644 > > --- a/README.rst > > +++ b/README.rst > > @@ -154,6 +154,27 @@ fail. These checks use their own virtual > environments and won't pollute > > your working space. > > > > > > +Stability and Versioning > > + > > + > > +This package uses a major.minor.micro SemVer versioning, with the > > +following additional semantics during the alpha/beta period (Major > > +version 0): > > + > > +This package treats 0.0.z versions as "alpha" versions. Each micro > > +version update may change the API incompatibly. Early users are advised > > +to pin against explicit versions, but check for updates often. > > + > > +A planned 0.1.z version will introduce the first "beta", whereafter each > > +micro update will be backwards compatible, but each minor update will > > +not be. The first beta version will be released after legacy.py is > > +removed, and the API is tentatively "stable". > > + > > +Thereafter, normal SemVer/PEP440 rules will apply; micro updates will > > +always be bugfixes, and minor updates will be reserved for backwards > > +compatible feature changes. > > + > > + > > Changelog > > - > > > > Looks reasonable to me. > Reviewed-by: Damien Hedde > Thanks! I'm hoping to make it easier to spin up more dev tooling outside of the qemu tree. If you've got any wishlist items, feel free to let me know. It's still early days for Python packages outside of the qemu tree, so nearly everything is on the table still. (the jsnow/python staging branch has some 17 patches in it that will be checked in to QEMU when development re-opens. The forked qemu.qmp repo will be based off of qemu.git after those patches go in. 
There's a bit of shakeup where I delete the old qmp lib and replace it with what's currently aqmp. It should hopefully not be a huge nuisance to your work, but if there's issues, let me know.) Thanks, --John Snow
Re: [RFC PATCH] docs/devel: start documenting writing VirtIO devices
Cornelia Huck writes: > On Wed, Mar 16 2022, Alex Bennée wrote: > >> Cornelia Huck writes: >> >>> On Wed, Mar 09 2022, Alex Bennée wrote: > +Writing VirtIO backends for QEMU + + +This document attempts to outline the information a developer needs to +know to write backends for QEMU. It is specifically focused on +implementing VirtIO devices. >>> >>> I think you first need to define a bit more clearly what you consider a >>> "backend". For virtio, it is probably "everything a device needs to >>> function as a specific device type like net, block, etc., which may be >>> implemented by different methods" (as you describe further below). >> >> How about: >> >> This document attempts to outline the information a developer needs to >> know to write device emulations in QEMU. It is specifically focused on >> implementing VirtIO devices. For VirtIO, the frontend is the driver >> running on the guest. The backend is everything that QEMU needs to >> do to handle the emulation of the VirtIO device. This can be done >> entirely in QEMU, divided between QEMU and the kernel (vhost), or >> handled by a separate process which is configured by QEMU >> (vhost-user). > > I'm afraid that confuses me even more :) > > This sounds to me like frontend == driver (in virtio spec terminology) > and backend == device. Is that really what you meant? I think so. To be honest it's the different types of backend (in QEMU, vhost and vhost-user) I'm trying to be clear about here. The frontend/driver is just mentioned for completeness. > >> >>> + +Front End Transports + + +VirtIO supports a number of different front end transports. The +details of the device remain the same but there are differences in +command line for specifying the device (e.g. -device virtio-foo +and -device virtio-foo-pci). For example: + +.. 
code:: c + + static const TypeInfo vhost_user_blk_info = { + .name = TYPE_VHOST_USER_BLK, + .parent = TYPE_VIRTIO_DEVICE, + .instance_size = sizeof(VHostUserBlk), + .instance_init = vhost_user_blk_instance_init, + .class_init = vhost_user_blk_class_init, + }; + +defines ``TYPE_VHOST_USER_BLK`` as a child of the generic +``TYPE_VIRTIO_DEVICE``. >>> >>> That's not what I'd consider a "front end", though? >> >> Yeah, clumsy wording. I'm trying to find a good example to show how >> QOM can be used to abstract the core device operation and the wrappers >> for different transports. However, in the code base there seems to be >> considerable variation in how this is done. Any advice as to the >> best exemplary device to follow is greatly welcomed. > > I'm not sure which of the examples we can really consider a "good" > device; the normal modus operandi when writing a new device seems to be > "pick the first device you can think of and copy whatever it > does". Yeah, the QEMU curse. Hence trying to document the "best" approach, or at least make the picking of a reference a little less random ;-) > Personally, I usually look at blk or net, but those carry a lot of > legacy baggage; so maybe a modern virtio-1-only device like gpu? That > one also has the advantage of not being pci-only. > > Does anyone else have a good suggestion here? Sorry, I totally forgot to include you in the Cc of the v1 posting: Subject: [PATCH v1 09/13] docs/devel: start documenting writing VirtIO devices Date: Mon, 21 Mar 2022 15:30:33 + Message-Id: <20220321153037.3622127-10-alex.ben...@linaro.org> although expect a v2 soonish (once I can get a reasonable qos-test vhost-user test working). > >> And then for the PCI device it wraps around the +base device (although explicitly initialising via +virtio_instance_init_common): + +.. 
code:: c + + struct VHostUserBlkPCI { + VirtIOPCIProxy parent_obj; + VHostUserBlk vdev; + }; >>> >>> The VirtIOPCIProxy seems to materialize a bit out of thin air >>> here... maybe the information simply needs to be structured in a >>> different way? Perhaps: >>> >>> - describe that virtio devices consist of a part that implements the >>> device functionality, which ultimately derives from VirtIODevice (the >>> "backend"), and a part that exposes a way for the operating system to >>> discover and use the device (the "frontend", what the virtio spec >>> calls a "transport") >>> - decribe how the "frontend" part works (maybe mention VirtIOPCIProxy, >>> VirtIOMMIOProxy, and VirtioCcwDevice as specialized proxy devices for >>> PCI, MMIO, and CCW devices) >>> - list the different types of "backends" (as you did below), and give >>> two examples of how VirtIODevice is extended (a plain one, and a >>> vhost-user one) >>> - explain how frontend and backend together create an actual device >>> (with the two device
Re: [PATCH] docs/ccid: convert to restructuredText
On 4/5/22 16:29, oxr...@gmx.us wrote: From: Lucas Ramage Buglink: https://gitlab.com/qemu-project/qemu/-/issues/527 Signed-off-by: Lucas Ramage Provided 2 minors tweaks (see below: missing empty line, and empty line at EOF), Reviewed-by: Damien Hedde Note that I'm not competent regarding the content of this doc. But it corresponds to the previous version and the doc generation works. --- docs/ccid.txt| 182 --- docs/system/device-emulation.rst | 1 + docs/system/devices/ccid.rst | 171 + 3 files changed, 172 insertions(+), 182 deletions(-) delete mode 100644 docs/ccid.txt create mode 100644 docs/system/devices/ccid.rst diff --git a/docs/ccid.txt b/docs/ccid.txt deleted file mode 100644 index 2b85b1bd42..00 --- a/docs/ccid.txt +++ /dev/null @@ -1,182 +0,0 @@ -QEMU CCID Device Documentation. - -Contents -1. USB CCID device -2. Building -3. Using ccid-card-emulated with hardware -4. Using ccid-card-emulated with certificates -5. Using ccid-card-passthru with client side hardware -6. Using ccid-card-passthru with client side certificates -7. Passthrough protocol scenario -8. libcacard - -1. USB CCID device - -The USB CCID device is a USB device implementing the CCID specification, which -lets one connect smart card readers that implement the same spec. For more -information see the specification: - - Universal Serial Bus - Device Class: Smart Card - CCID - Specification for - Integrated Circuit(s) Cards Interface Devices - Revision 1.1 - April 22rd, 2005 - -Smartcards are used for authentication, single sign on, decryption in -public/private schemes and digital signatures. A smartcard reader on the client -cannot be used on a guest with simple usb passthrough since it will then not be -available on the client, possibly locking the computer when it is "removed". On -the other hand this device can let you use the smartcard on both the client and -the guest machine. It is also possible to have a completely virtual smart card -reader and smart card (i.e. 
not backed by a physical device) using this device. - -2. Building - -The cryptographic functions and access to the physical card is done via the -libcacard library, whose development package must be installed prior to -building QEMU: - -In redhat/fedora: -yum install libcacard-devel -In ubuntu: -apt-get install libcacard-dev - -Configuring and building: -./configure --enable-smartcard && make - - -3. Using ccid-card-emulated with hardware - -Assuming you have a working smartcard on the host with the current -user, using libcacard, QEMU acts as another client using ccid-card-emulated: - -qemu -usb -device usb-ccid -device ccid-card-emulated - - -4. Using ccid-card-emulated with certificates stored in files - -You must create the CA and card certificates. This is a one time process. -We use NSS certificates: - -mkdir fake-smartcard -cd fake-smartcard -certutil -N -d sql:$PWD -certutil -S -d sql:$PWD -s "CN=Fake Smart Card CA" -x -t TC,TC,TC -n fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe" -n id-cert -c fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe (signing)" --nsCertType smime -n signing-cert -c fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe (encryption)" --nsCertType sslClient -n encryption-cert -c fake-smartcard-ca - -Note: you must have exactly three certificates. 
- -You can use the emulated card type with the certificates backend: - -qemu -usb -device usb-ccid -device ccid-card-emulated,backend=certificates,db=sql:$PWD,cert1=id-cert,cert2=signing-cert,cert3=encryption-cert - -To use the certificates in the guest, export the CA certificate: - -certutil -L -r -d sql:$PWD -o fake-smartcard-ca.cer -n fake-smartcard-ca - -and import it in the guest: - -certutil -A -d /etc/pki/nssdb -i fake-smartcard-ca.cer -t TC,TC,TC -n fake-smartcard-ca - -In a Linux guest you can then use the CoolKey PKCS #11 module to access -the card: - -certutil -d /etc/pki/nssdb -L -h all - -It will prompt you for the PIN (which is the password you assigned to the -certificate database early on), and then show you all three certificates -together with the manually imported CA cert: - -Certificate NicknameTrust Attributes -fake-smartcard-ca CT,C,C -John Doe:CAC ID Certificate u,u,u -John Doe:CAC Email Signature Certificateu,u,u -John Doe:CAC Email Encryption Certificate u,u,u - -If this does not happen, CoolKey is not installed or not registered with -NSS. Registration can be done from Firefox or the command line: - -modutil -dbdir /etc/pki/nssdb -add "CAC Module" -libfile /usr/lib64/pkcs11/libcoolkeypk11.so -modutil -dbdir /etc/pki/nssdb -list - - -5. Using ccid-card-passthru with client side hardware - -on the host specify the ccid-card-passthru device
[RFC v2 7/8] blkio: implement BDRV_REQ_REGISTERED_BUF optimization
Avoid bounce buffers when QEMUIOVector elements are within previously registered bdrv_register_buf() buffers. The idea is that emulated storage controllers will register guest RAM using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF flag on I/O requests. Therefore no blkio_add_mem_region() calls are necessary in the performance-critical I/O code path. This optimization doesn't apply if the I/O buffer is internally allocated by QEMU (e.g. qcow2 metadata). There we still take the slow path because BDRV_REQ_REGISTERED_BUF is not set. Signed-off-by: Stefan Hajnoczi --- block/blkio.c | 106 +++--- 1 file changed, 101 insertions(+), 5 deletions(-) diff --git a/block/blkio.c b/block/blkio.c index 562e972003..41894c7015 100644 --- a/block/blkio.c +++ b/block/blkio.c @@ -1,7 +1,9 @@ #include "qemu/osdep.h" #include <blkio.h> #include "block/block_int.h" +#include "exec/memory.h" #include "qapi/error.h" +#include "qemu/error-report.h" #include "qapi/qmp/qdict.h" #include "qemu/module.h" @@ -25,6 +27,9 @@ typedef struct { /* Can we skip adding/deleting blkio_mem_regions? */ bool needs_mem_regions; + +/* Are file descriptors necessary for blkio_mem_regions?
*/ +bool needs_mem_region_fd; } BDRVBlkioState; static void blkio_aiocb_complete(BlkioAIOCB *acb, int ret) @@ -157,6 +162,8 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset, BlockCompletionFunc *cb, void *opaque) { BDRVBlkioState *s = bs->opaque; +bool needs_mem_regions = +s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF); struct iovec *iov = qiov->iov; int iovcnt = qiov->niov; BlkioAIOCB *acb; @@ -166,7 +173,7 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset, acb = blkio_aiocb_get(bs, cb, opaque); -if (s->needs_mem_regions) { +if (needs_mem_regions) { if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) { qemu_aio_unref(&acb->common); return NULL; @@ -181,7 +188,7 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset, ret = blkioq_readv(s->blkioq, offset, iov, iovcnt, acb, 0); if (ret < 0) { -if (s->needs_mem_regions) { +if (needs_mem_regions) { blkio_free_mem_region(s->blkio, &acb->mem_region); qemu_iovec_destroy(&acb->qiov); } @@ -202,6 +209,8 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset, { uint32_t blkio_flags = (flags & BDRV_REQ_FUA) ?
BLKIO_REQ_FUA : 0; BDRVBlkioState *s = bs->opaque; +bool needs_mem_regions = +s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF); struct iovec *iov = qiov->iov; int iovcnt = qiov->niov; BlkioAIOCB *acb; @@ -211,7 +220,7 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset, acb = blkio_aiocb_get(bs, cb, opaque); -if (s->needs_mem_regions) { +if (needs_mem_regions) { if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) { qemu_aio_unref(&acb->common); return NULL; @@ -225,7 +234,7 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset, ret = blkioq_writev(s->blkioq, offset, iov, iovcnt, acb, blkio_flags); if (ret < 0) { -if (s->needs_mem_regions) { +if (needs_mem_regions) { blkio_free_mem_region(s->blkio, &acb->mem_region); } qemu_aio_unref(&acb->common); @@ -273,6 +282,80 @@ static void blkio_io_unplug(BlockDriverState *bs) } } +static void blkio_register_buf(BlockDriverState *bs, void *host, size_t size) +{ +BDRVBlkioState *s = bs->opaque; +int ret; +struct blkio_mem_region region = (struct blkio_mem_region){ +.addr = host, +.len = size, +.fd = -1, +}; + +if (((uintptr_t)host | size) % s->mem_region_alignment) { +error_report_once("%s: skipping unaligned buf %p with size %zu", + __func__, host, size); +return; /* skip unaligned */ +} + +/* Attempt to find the fd for a MemoryRegion */ +if (s->needs_mem_region_fd) { +int fd = -1; +ram_addr_t offset; +MemoryRegion *mr; + +/* + * bdrv_register_buf() is called with the BQL held so mr lives at least + * until this function returns. + */ +mr = memory_region_from_host(host, &offset); +if (mr) { +fd = memory_region_get_fd(mr); +} +if (fd == -1) { +error_report_once("%s: skipping fd-less buf %p with size %zu", + __func__, host, size); +return; /* skip if there is no fd */ +} + +region.fd = fd; +region.fd_offset = offset; +} + +WITH_QEMU_LOCK_GUARD(&s->lock) { +ret = blkio_add_mem_region(s->blkio, &region); +} + +if (ret < 0) { +error_report_once("Failed to add blkio mem
[RFC v2 8/8] virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint
Register guest RAM using BlockRAMRegistrar and set the BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory accesses in I/O requests. This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely on DMA mapping/unmapping. Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-blk.h | 2 ++ hw/block/virtio-blk.c | 13 + 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index d311c57cca..7f589b4146 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -19,6 +19,7 @@ #include "hw/block/block.h" #include "sysemu/iothread.h" #include "sysemu/block-backend.h" +#include "sysemu/block-ram-registrar.h" #include "qom/object.h" #define TYPE_VIRTIO_BLK "virtio-blk-device" @@ -64,6 +65,7 @@ struct VirtIOBlock { struct VirtIOBlockDataPlane *dataplane; uint64_t host_features; size_t config_size; +BlockRAMRegistrar blk_ram_registrar; }; typedef struct VirtIOBlockReq { diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 540c38f829..a18cf05f14 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -21,6 +21,7 @@ #include "hw/block/block.h" #include "hw/qdev-properties.h" #include "sysemu/blockdev.h" +#include "sysemu/block-ram-registrar.h" #include "sysemu/sysemu.h" #include "sysemu/runstate.h" #include "hw/virtio/virtio-blk.h" @@ -421,11 +422,13 @@ static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb, } if (is_write) { -blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0, -virtio_blk_rw_complete, mrb->reqs[start]); +blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov, +BDRV_REQ_REGISTERED_BUF, virtio_blk_rw_complete, +mrb->reqs[start]); } else { -blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0, - virtio_blk_rw_complete, mrb->reqs[start]); +blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov, + BDRV_REQ_REGISTERED_BUF, virtio_blk_rw_complete, + mrb->reqs[start]); } } @@ 
-1228,6 +1231,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) } s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s); +blk_ram_registrar_init(&s->blk_ram_registrar, s->blk); blk_set_dev_ops(s->blk, &virtio_block_ops, s); blk_set_guest_block_size(s->blk, s->conf.conf.logical_block_size); @@ -1255,6 +1259,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev) } qemu_coroutine_decrease_pool_batch_size(conf->num_queues * conf->queue_size / 2); +blk_ram_registrar_destroy(&s->blk_ram_registrar); qemu_del_vm_change_state_handler(s->change); blockdev_mark_auto_del(s->blk); virtio_cleanup(vdev); -- 2.35.1
[RFC v2 5/8] block: add BlockRAMRegistrar
Emulated devices and other BlockBackend users wishing to take advantage of blk_register_buf() all have the same repetitive job: register RAMBlocks with the BlockBackend using RAMBlockNotifier. Add a BlockRAMRegistrar API to do this. A later commit will use this from hw/block/virtio-blk.c. Signed-off-by: Stefan Hajnoczi --- MAINTAINERS | 1 + include/sysemu/block-ram-registrar.h | 30 + block/block-ram-registrar.c | 39 block/meson.build| 1 + 4 files changed, 71 insertions(+) create mode 100644 include/sysemu/block-ram-registrar.h create mode 100644 block/block-ram-registrar.c diff --git a/MAINTAINERS b/MAINTAINERS index d839301f68..655f79c9f7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2462,6 +2462,7 @@ F: block* F: block/ F: hw/block/ F: include/block/ +F: include/sysemu/block-*.h F: qemu-img* F: docs/tools/qemu-img.rst F: qemu-io* diff --git a/include/sysemu/block-ram-registrar.h b/include/sysemu/block-ram-registrar.h new file mode 100644 index 00..09d63f64b2 --- /dev/null +++ b/include/sysemu/block-ram-registrar.h @@ -0,0 +1,30 @@ +/* + * BlockBackend RAM Registrar + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef BLOCK_RAM_REGISTRAR_H +#define BLOCK_RAM_REGISTRAR_H + +#include "exec/ramlist.h" + +/** + * struct BlockRAMRegistrar: + * + * Keeps RAMBlock memory registered with a BlockBackend using + * blk_register_buf() including hotplugged memory. + * + * Emulated devices or other BlockBackend users initialize a BlockRAMRegistrar + * with blk_ram_registrar_init() before submitting I/O requests with the + * BLK_REQ_REGISTERED_BUF flag set. 
+ */ +typedef struct { +BlockBackend *blk; +RAMBlockNotifier notifier; +} BlockRAMRegistrar; + +void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk); +void blk_ram_registrar_destroy(BlockRAMRegistrar *r); + +#endif /* BLOCK_RAM_REGISTRAR_H */ diff --git a/block/block-ram-registrar.c b/block/block-ram-registrar.c new file mode 100644 index 00..32a14b69ae --- /dev/null +++ b/block/block-ram-registrar.c @@ -0,0 +1,39 @@ +/* + * BlockBackend RAM Registrar + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "sysemu/block-backend.h" +#include "sysemu/block-ram-registrar.h" + +static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size, +size_t max_size) +{ +BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier); +blk_register_buf(r->blk, host, max_size); +} + +static void ram_block_removed(RAMBlockNotifier *n, void *host, size_t size, + size_t max_size) +{ +BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier); +blk_unregister_buf(r->blk, host, max_size); +} + +void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk) +{ +r->blk = blk; +r->notifier = (RAMBlockNotifier){ +.ram_block_added = ram_block_added, +.ram_block_removed = ram_block_removed, +}; + +ram_block_notifier_add(&r->notifier); +} + +void blk_ram_registrar_destroy(BlockRAMRegistrar *r) +{ +ram_block_notifier_remove(&r->notifier); +} diff --git a/block/meson.build b/block/meson.build index 787667384a..b315593054 100644 --- a/block/meson.build +++ b/block/meson.build @@ -46,6 +46,7 @@ block_ss.add(files( ), zstd, zlib, gnutls) softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c')) +softmmu_ss.add(files('block-ram-registrar.c')) if get_option('qcow1').allowed() block_ss.add(files('qcow.c')) -- 2.35.1
[RFC v2 2/8] numa: call ->ram_block_removed() in ram_block_notifer_remove()
When a RAMBlockNotifier is added, ->ram_block_added() is called with all existing RAMBlocks. There is no equivalent ->ram_block_removed() call when a RAMBlockNotifier is removed. The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine with this asymmetry because it does not rely on RAMBlockNotifier for cleanup. It walks its internal list of DMA mappings and unmaps them by itself. Future users of RAMBlockNotifier may not have an internal data structure that records added RAMBlocks so they will need ->ram_block_removed() callbacks. This patch makes ram_block_notifier_remove() symmetric with respect to callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings after ram_block_notifier_remove() has been called. This is necessary since users like block/nvme.c may create additional DMA mappings that do not originate from the RAMBlockNotifier. Reviewed-by: David Hildenbrand Signed-off-by: Stefan Hajnoczi --- hw/core/numa.c | 17 + util/vfio-helpers.c | 5 - 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/hw/core/numa.c b/hw/core/numa.c index 1aa05dcf42..6bf9694d20 100644 --- a/hw/core/numa.c +++ b/hw/core/numa.c @@ -822,6 +822,19 @@ static int ram_block_notify_add_single(RAMBlock *rb, void *opaque) return 0; } +static int ram_block_notify_remove_single(RAMBlock *rb, void *opaque) +{ +const ram_addr_t max_size = qemu_ram_get_max_length(rb); +const ram_addr_t size = qemu_ram_get_used_length(rb); +void *host = qemu_ram_get_host_addr(rb); +RAMBlockNotifier *notifier = opaque; + +if (host) { +notifier->ram_block_removed(notifier, host, size, max_size); +} +return 0; +} + void ram_block_notifier_add(RAMBlockNotifier *n) { QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next); @@ -835,6 +848,10 @@ void ram_block_notifier_add(RAMBlockNotifier *n) void ram_block_notifier_remove(RAMBlockNotifier *n) { QLIST_REMOVE(n, next); + +if (n->ram_block_removed) { +qemu_ram_foreach_block(ram_block_notify_remove_single, n); +} } void
ram_block_notify_add(void *host, size_t size, size_t max_size) diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c index b037d5faa5..dc90496592 100644 --- a/util/vfio-helpers.c +++ b/util/vfio-helpers.c @@ -847,10 +847,13 @@ void qemu_vfio_close(QEMUVFIOState *s) if (!s) { return; } + +ram_block_notifier_remove(&s->ram_notifier); + for (i = 0; i < s->nr_mappings; ++i) { qemu_vfio_undo_mapping(s, &s->mappings[i], NULL); } -ram_block_notifier_remove(&s->ram_notifier); + g_free(s->usable_iova_ranges); s->nb_iova_ranges = 0; qemu_vfio_reset(s); -- 2.35.1
[RFC v2 6/8] stubs: add memory_region_from_host() and memory_region_get_fd()
The blkio block driver will need to look up the file descriptor for a given pointer. This is possible in softmmu builds where the memory API is available for querying guest RAM. Add stubs so tools like qemu-img that link the block layer still build successfully. In this case there is no guest RAM but that is fine. Bounce buffers and their file descriptors will be allocated with libblkio's blkio_alloc_mem_region() so we won't rely on QEMU's memory_region_get_fd() in that case. Signed-off-by: Stefan Hajnoczi --- stubs/memory.c| 13 + stubs/meson.build | 1 + 2 files changed, 14 insertions(+) create mode 100644 stubs/memory.c diff --git a/stubs/memory.c b/stubs/memory.c new file mode 100644 index 00..e9ec4e384b --- /dev/null +++ b/stubs/memory.c @@ -0,0 +1,13 @@ +#include "qemu/osdep.h" +#include "exec/memory.h" + +MemoryRegion *memory_region_from_host(void *host, ram_addr_t *offset) +{ +return NULL; +} + +int memory_region_get_fd(MemoryRegion *mr) +{ +return -1; +} + diff --git a/stubs/meson.build b/stubs/meson.build index 6f80fec761..1e274d2db2 100644 --- a/stubs/meson.build +++ b/stubs/meson.build @@ -25,6 +25,7 @@ stub_ss.add(files('is-daemonized.c')) if libaio.found() stub_ss.add(files('linux-aio.c')) endif +stub_ss.add(files('memory.c')) stub_ss.add(files('migr-blocker.c')) stub_ss.add(files('module-opts.c')) stub_ss.add(files('monitor.c')) -- 2.35.1
[RFC v2 4/8] block: add BDRV_REQ_REGISTERED_BUF request flag
Block drivers may optimize I/O requests accessing buffers previously registered with bdrv_register_buf(). Checking whether all elements of a request's QEMUIOVector are within previously registered buffers is expensive, so we need a hint from the user to avoid costly checks. Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all QEMUIOVector elements in an I/O request are known to be within previously registered buffers. bdrv_aligned_preadv() is strict in validating supported read flags and its assertions fail when it sees BDRV_REQ_REGISTERED_BUF. There is no harm in passing BDRV_REQ_REGISTERED_BUF to block drivers that do not support it, so update the assertions to ignore BDRV_REQ_REGISTERED_BUF. Care must be taken to clear the flag when the block layer or filter drivers replace QEMUIOVector elements with bounce buffers since these have not been registered with bdrv_register_buf(). A lot of the changes in this commit deal with clearing the flag in those cases. Ensuring that the flag is cleared properly is somewhat invasive to implement across the block layer and it's hard to spot when future code changes accidentally break it. Another option might be to add a flag to QEMUIOVector itself and clear it in qemu_iovec_*() functions that modify elements. That is more robust but somewhat of a layering violation, so I haven't attempted that. Signed-off-by: Stefan Hajnoczi --- include/block/block-common.h | 9 + block/blkverify.c| 4 ++-- block/crypto.c | 2 ++ block/io.c | 30 +++--- block/mirror.c | 2 ++ block/raw-format.c | 2 ++ 6 files changed, 40 insertions(+), 9 deletions(-) diff --git a/include/block/block-common.h b/include/block/block-common.h index fdb7306e78..061606e867 100644 --- a/include/block/block-common.h +++ b/include/block/block-common.h @@ -80,6 +80,15 @@ typedef enum { */ BDRV_REQ_MAY_UNMAP = 0x4, +/* + * An optimization hint when all QEMUIOVector elements are within + * previously registered bdrv_register_buf() memory ranges. 
+ * + * Code that replaces the user's QEMUIOVector elements with bounce buffers + * must take care to clear this flag. + */ +BDRV_REQ_REGISTERED_BUF = 0x8, + BDRV_REQ_FUA= 0x10, BDRV_REQ_WRITE_COMPRESSED = 0x20, diff --git a/block/blkverify.c b/block/blkverify.c index e4a37af3b2..d624f4fd05 100644 --- a/block/blkverify.c +++ b/block/blkverify.c @@ -235,8 +235,8 @@ blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, qemu_iovec_init(&raw_qiov, qiov->niov); qemu_iovec_clone(&raw_qiov, qiov, buf); -ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags, -false); +ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, +flags & ~BDRV_REQ_REGISTERED_BUF, false); cmp_offset = qemu_iovec_compare(qiov, &raw_qiov); if (cmp_offset != -1) { diff --git a/block/crypto.c b/block/crypto.c index 1ba82984ef..c900355adb 100644 --- a/block/crypto.c +++ b/block/crypto.c @@ -473,6 +473,8 @@ block_crypto_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block); uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block); +flags &= ~BDRV_REQ_REGISTERED_BUF; + assert(!(flags & ~BDRV_REQ_FUA)); assert(payload_offset < INT64_MAX); assert(QEMU_IS_ALIGNED(offset, sector_size)); diff --git a/block/io.c b/block/io.c index a8a7920e29..139e36c2e1 100644 --- a/block/io.c +++ b/block/io.c @@ -1556,11 +1556,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child, max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX), align); -/* TODO: We would need a per-BDS .supported_read_flags and +/* + * TODO: We would need a per-BDS .supported_read_flags and * potential fallback support, if we ever implement any read flags * to pass through to drivers. For now, there aren't any - * passthrough flags. */ -assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH))); + * passthrough flags except the BDRV_REQ_REGISTERED_BUF optimization hint.
+ */ +assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH | + BDRV_REQ_REGISTERED_BUF))); /* Handle Copy on Read and associated serialisation */ if (flags & BDRV_REQ_COPY_ON_READ) { @@ -1601,7 +1604,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child, goto out; } -assert(!(flags & ~bs->supported_read_flags)); +assert(!(flags & ~(bs->supported_read_flags | BDRV_REQ_REGISTERED_BUF))); max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align); if (bytes <= max_bytes && bytes <= max_transfer) { @@ -1790,7 +1793,8 @@ static void
[RFC v2 3/8] block: pass size to bdrv_unregister_buf()
The only implementor of bdrv_register_buf() is block/nvme.c, where the size is not needed when unregistering a buffer. This is because util/vfio-helpers.c can look up mappings by address. Future block drivers that implement bdrv_register_buf() may not be able to do their job given only the buffer address. Add a size argument to bdrv_unregister_buf(). Also document the assumptions about bdrv_register_buf()/bdrv_unregister_buf() calls. The same values that were given to bdrv_register_buf() must be given to bdrv_unregister_buf(). gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local variable might be uninitialized, so it's necessary to silence the compiler. Signed-off-by: Stefan Hajnoczi --- include/block/block-global-state.h | 5 - include/block/block_int-common.h| 2 +- include/sysemu/block-backend-global-state.h | 2 +- block/block-backend.c | 4 ++-- block/io.c | 6 +++--- block/nvme.c| 2 +- qemu-img.c | 4 ++-- 7 files changed, 14 insertions(+), 11 deletions(-) diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h index 25bb69bbef..2295a7c767 100644 --- a/include/block/block-global-state.h +++ b/include/block/block-global-state.h @@ -244,9 +244,12 @@ void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp); * Register/unregister a buffer for I/O. For example, VFIO drivers are * interested to know the memory areas that would later be used for I/O, so * that they can prepare IOMMU mapping etc., to get better performance. + * + * Buffers must not overlap and they must be unregistered with the same values that they were registered with. 
*/ void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size); -void bdrv_unregister_buf(BlockDriverState *bs, void *host); +void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size); void bdrv_cancel_in_flight(BlockDriverState *bs); diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h index 8947abab76..b7a7cbd3a5 100644 --- a/include/block/block_int-common.h +++ b/include/block/block_int-common.h @@ -435,7 +435,7 @@ struct BlockDriver { * DMA mapping for hot buffers. */ void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size); -void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host); +void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host, size_t size); /* * This field is modified only under the BQL, and is part of diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h index 2e93a74679..989ec0364b 100644 --- a/include/sysemu/block-backend-global-state.h +++ b/include/sysemu/block-backend-global-state.h @@ -107,7 +107,7 @@ void blk_io_limits_update_group(BlockBackend *blk, const char *group); void blk_set_force_allow_inactivate(BlockBackend *blk); void blk_register_buf(BlockBackend *blk, void *host, size_t size); -void blk_unregister_buf(BlockBackend *blk, void *host); +void blk_unregister_buf(BlockBackend *blk, void *host, size_t size); const BdrvChild *blk_root(BlockBackend *blk); diff --git a/block/block-backend.c b/block/block-backend.c index e0e1aff4b1..8af00d8a36 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -2591,10 +2591,10 @@ void blk_register_buf(BlockBackend *blk, void *host, size_t size) bdrv_register_buf(blk_bs(blk), host, size); } -void blk_unregister_buf(BlockBackend *blk, void *host) +void blk_unregister_buf(BlockBackend *blk, void *host, size_t size) { GLOBAL_STATE_CODE(); -bdrv_unregister_buf(blk_bs(blk), host); +bdrv_unregister_buf(blk_bs(blk), host, size); } int coroutine_fn 
blk_co_copy_range(BlockBackend *blk_in, int64_t off_in, diff --git a/block/io.c b/block/io.c index 3280144a17..a8a7920e29 100644 --- a/block/io.c +++ b/block/io.c @@ -3365,16 +3365,16 @@ void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size) } } -void bdrv_unregister_buf(BlockDriverState *bs, void *host) +void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size) { BdrvChild *child; GLOBAL_STATE_CODE(); if (bs->drv && bs->drv->bdrv_unregister_buf) { -bs->drv->bdrv_unregister_buf(bs, host); +bs->drv->bdrv_unregister_buf(bs, host, size); } QLIST_FOREACH(child, &bs->children, next) { -bdrv_unregister_buf(child->bs, host); +bdrv_unregister_buf(child->bs, host, size); } } diff --git a/block/nvme.c b/block/nvme.c index 552029931d..88485e77f1 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1592,7 +1592,7 @@ static void nvme_register_buf(BlockDriverState *bs, void *host, size_t size) } } -static void nvme_unregister_buf(BlockDriverState *bs, void *host) +static void nvme_unregister_buf(BlockDriverState *bs, void *host, size_t
[RFC v2 1/8] blkio: add io_uring block driver using libblkio
libblkio (https://gitlab.com/libblkio/libblkio/) is a library for high-performance disk I/O. It currently supports io_uring with additional drivers planned. One of the reasons for developing libblkio is that other applications besides QEMU can use it. This will be particularly useful for vhost-user-blk which applications may wish to use for connecting to qemu-storage-daemon. libblkio also gives us an opportunity to develop in Rust behind a C API that is easy to consume from QEMU. This commit adds an io_uring BlockDriver to QEMU using libblkio. For now I/O buffers are copied through bounce buffers if the libblkio driver requires it. Later commits add an optimization for pre-registering guest RAM to avoid bounce buffers. It will be easy to add other libblkio drivers since they will share the majority of code. Signed-off-by: Stefan Hajnoczi --- MAINTAINERS | 6 + meson_options.txt | 2 + qapi/block-core.json | 18 +- meson.build | 9 + block/blkio.c | 537 ++ tests/qtest/modules-test.c| 3 + block/meson.build | 1 + scripts/meson-buildoptions.sh | 3 + 8 files changed, 578 insertions(+), 1 deletion(-) create mode 100644 block/blkio.c diff --git a/MAINTAINERS b/MAINTAINERS index 4ad2451e03..d839301f68 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3349,6 +3349,12 @@ L: qemu-bl...@nongnu.org S: Maintained F: block/vdi.c +blkio +M: Stefan Hajnoczi +L: qemu-bl...@nongnu.org +S: Maintained +F: block/blkio.c + iSCSI M: Ronnie Sahlberg M: Paolo Bonzini diff --git a/meson_options.txt b/meson_options.txt index 52b11cead4..1e82e770e7 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -101,6 +101,8 @@ option('bzip2', type : 'feature', value : 'auto', description: 'bzip2 support for DMG images') option('cap_ng', type : 'feature', value : 'auto', description: 'cap_ng support') +option('blkio', type : 'feature', value : 'auto', + description: 'libblkio block device driver') option('bpf', type : 'feature', value : 'auto', description: 'eBPF support') option('cocoa', type : 'feature', 
value : 'auto', diff --git a/qapi/block-core.json b/qapi/block-core.json index 4a7a6940a3..c04e1e325b 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2924,7 +2924,9 @@ 'file', 'snapshot-access', 'ftp', 'ftps', 'gluster', {'name': 'host_cdrom', 'if': 'HAVE_HOST_BLOCK_DEVICE' }, {'name': 'host_device', 'if': 'HAVE_HOST_BLOCK_DEVICE' }, -'http', 'https', 'iscsi', +'http', 'https', +{ 'name': 'io_uring', 'if': 'CONFIG_BLKIO' }, +'iscsi', 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd', { 'name': 'replication', 'if': 'CONFIG_REPLICATION' }, @@ -3656,6 +3658,18 @@ '*debug': 'int', '*logfile': 'str' } } +## +# @BlockdevOptionsIoUring: +# +# Driver specific block device options for the io_uring backend. +# +# @filename: path to the image file +# +# Since: 6.3 +## +{ 'struct': 'BlockdevOptionsIoUring', + 'data': { 'filename': 'str' } } + ## # @IscsiTransport: # @@ -4254,6 +4268,8 @@ 'if': 'HAVE_HOST_BLOCK_DEVICE' }, 'http': 'BlockdevOptionsCurlHttp', 'https': 'BlockdevOptionsCurlHttps', + 'io_uring': { 'type': 'BlockdevOptionsIoUring', + 'if': 'CONFIG_BLKIO' }, 'iscsi': 'BlockdevOptionsIscsi', 'luks': 'BlockdevOptionsLUKS', 'nbd':'BlockdevOptionsNbd', diff --git a/meson.build b/meson.build index 861de93c4f..0ab17c8767 100644 --- a/meson.build +++ b/meson.build @@ -636,6 +636,13 @@ if not get_option('virglrenderer').auto() or have_system or have_vhost_user_gpu required: get_option('virglrenderer'), kwargs: static_kwargs) endif +blkio = not_found +if not get_option('blkio').auto() or have_block + blkio = dependency('blkio', + method: 'pkg-config', + required: get_option('blkio'), + kwargs: static_kwargs) +endif curl = not_found if not get_option('curl').auto() or have_block curl = dependency('libcurl', version: '>=7.29.0', @@ -1519,6 +1526,7 @@ config_host_data.set('CONFIG_LIBUDEV', libudev.found()) config_host_data.set('CONFIG_LZO', lzo.found()) 
config_host_data.set('CONFIG_MPATH', mpathpersist.found()) config_host_data.set('CONFIG_MPATH_NEW_API', mpathpersist_new_api) +config_host_data.set('CONFIG_BLKIO', blkio.found()) config_host_data.set('CONFIG_CURL', curl.found()) config_host_data.set('CONFIG_CURSES', curses.found()) config_host_data.set('CONFIG_GBM', gbm.found()) @@ -3672,6 +3680,7 @@ summary_info += {'PAM': pam} summary_info += {'iconv support': iconv}
[RFC v2 0/8] blkio: add libblkio BlockDriver
v2:
- Add BDRV_REQ_REGISTERED_BUF to bs.supported_write_flags [Stefano]
- Use new blkioq_get_num_completions() API
- Implement .bdrv_refresh_limits()

This patch series adds a QEMU BlockDriver for libblkio (https://gitlab.com/libblkio/libblkio/), a library for high-performance block device I/O. Currently libblkio has basic io_uring support with additional drivers in development. The first patch adds the core BlockDriver and most of the libblkio API usage. The remainder of the patch series reworks the existing QEMU bdrv_register_buf() API so that virtio-blk emulation can efficiently map guest RAM for libblkio - some libblkio drivers require that I/O buffer memory is pre-registered (think VFIO, vhost, etc). This block driver is functional enough to boot guests. See the BlockDriver struct in block/blkio.c for a list of APIs that still need to be implemented (write_zeroes and discard are in development, the others are not). I'm also waiting for libblkio to define queuing behavior and iovec lifetime requirements before sending this as a non-RFC patch. Regarding the design: each libblkio driver is a separately named BlockDriver. That means there is an "io_uring" BlockDriver and not a generic "libblkio" BlockDriver. In the future there will be additional BlockDrivers, all defined in block/blkio.c. This way QAPI and open parameters are type-safe and mandatory parameters can be checked by QEMU.
Stefan Hajnoczi (8): blkio: add io_uring block driver using libblkio numa: call ->ram_block_removed() in ram_block_notifer_remove() block: pass size to bdrv_unregister_buf() block: add BDRV_REQ_REGISTERED_BUF request flag block: add BlockRAMRegistrar stubs: add memory_region_from_host() and memory_region_get_fd() blkio: implement BDRV_REQ_REGISTERED_BUF optimization virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint MAINTAINERS | 7 + meson_options.txt | 2 + qapi/block-core.json| 18 +- meson.build | 9 + include/block/block-common.h| 9 + include/block/block-global-state.h | 5 +- include/block/block_int-common.h| 2 +- include/hw/virtio/virtio-blk.h | 2 + include/sysemu/block-backend-global-state.h | 2 +- include/sysemu/block-ram-registrar.h| 30 + block/blkio.c | 633 block/blkverify.c | 4 +- block/block-backend.c | 4 +- block/block-ram-registrar.c | 39 ++ block/crypto.c | 2 + block/io.c | 36 +- block/mirror.c | 2 + block/nvme.c| 2 +- block/raw-format.c | 2 + hw/block/virtio-blk.c | 13 +- hw/core/numa.c | 17 + qemu-img.c | 4 +- stubs/memory.c | 13 + tests/qtest/modules-test.c | 3 + util/vfio-helpers.c | 5 +- block/meson.build | 2 + scripts/meson-buildoptions.sh | 3 + stubs/meson.build | 1 + 28 files changed, 845 insertions(+), 26 deletions(-) create mode 100644 include/sysemu/block-ram-registrar.h create mode 100644 block/blkio.c create mode 100644 block/block-ram-registrar.c create mode 100644 stubs/memory.c -- 2.35.1
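For reference, with QEMU configured against the blkio dependency, the new driver would be selected by name on the command line per the BlockdevOptionsIoUring schema in patch 1. A hypothetical invocation, assuming an existing raw image test.img (node and drive names are arbitrary):

```
qemu-system-x86_64 \
    -blockdev driver=io_uring,node-name=disk0,filename=test.img \
    -device virtio-blk-pci,drive=disk0
```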
Re: [qemu.qmp PATCH 02/13] fork qemu.qmp from qemu.git
On Tue, Apr 5, 2022, 4:51 AM Kashyap Chamarthy wrote: > On Mon, Apr 04, 2022 at 02:56:10PM -0400, John Snow wrote: > > On Mon, Apr 4, 2022 at 2:54 PM John Snow wrote: > > [...] > > > > > > .gitignore | 2 +- > > > > > Makefile | 16 > > > > > setup.cfg | 24 +--- > > > > > setup.py | 2 +- > > > > > 4 files changed, 11 insertions(+), 33 deletions(-) > > > > > > > > The changes here look fine to me (and thanks for making it a "micro > > > > change"). I'll let sharper eyes than mine to give a closer look at > the > > > > `git filter-repo` surgery. Although, that looks fine to me too. > > > > > > > > [...] > > > > > > > > > .PHONY: distclean > > > > > distclean: clean > > > > > - rm -rf qemu.egg-info/ .venv/ .tox/ $(QEMU_VENV_DIR) dist/ > > > > > + rm -rf qemu.qmp.egg-info/ .venv/ .tox/ $(QEMU_VENV_DIR) dist/ > > > > > rm -f .coverage .coverage.* > > > > > rm -rf htmlcov/ > > > > > diff --git a/setup.cfg b/setup.cfg > > > > > index e877ea5..4ffab73 100644 > > > > > --- a/setup.cfg > > > > > +++ b/setup.cfg > > > > > @@ -1,5 +1,5 @@ > > > > > [metadata] > > > > > -name = qemu > > > > > +name = qemu.qmp > > > > > version = file:VERSION > > > > > maintainer = QEMU Developer Team > > > > > > > > In the spirit of patch 04 ("update maintainer metadata"), do you also > > > > want to update here too? s/QEMU Developer Team/QEMU Project? > > > > > > > > > > Good spot. > > > > ...Or, uh. That's exactly what I update in patch 04. Are you asking me > > to fold in that change earlier? I'm confused now. > > Oops, perils of reviewing late in the day. I missed to notice it's the > same file. You're right; please ignore my remark. Sorry for the noise. > I made the same mistake upon reading the feedback, so we're both guilty Thanks Kashyap, I appreciate the review. There's three more series here to apply to the new forked package (not yet re-sent to the ML): (2) Adding GitLab CI configuration. Not relevant for you, probably. (3) Adding Sphinx documentation. 
This builds jsnow.gitlab.io/qemu.qmp/ - I'd be appreciative of your feedback on this. I'm interested both in proofreading and in design feedback here. All comments welcome. [More rigorous changes to the design might be a "later" thing, but the feedback is welcome all the same.]

(4) Adding automatic package builds and git-based versioning to GitLab. Maybe also not too relevant for you.

> > > -- > /kashyap > Thanks for your time!
Re: [PULL 00/10] QAPI patches patches for 2022-04-05
On Tue, 5 Apr 2022 at 11:35, Markus Armbruster wrote: > > I double-checked these patches affect *only* generated documentation. > Safe enough for 7.0, I think. But I'm quite content to hold on to > them until after the release, if that's preferred. > > The following changes since commit 20661b75ea6093f5e59079d00a778a972d6732c5: > > Merge tag 'pull-ppc-20220404' of https://github.com/legoater/qemu into > staging (2022-04-04 15:48:55 +0100) > > are available in the Git repository at: > > git://repo.or.cz/qemu/armbru.git tags/pull-qapi-2022-04-05 > > for you to fetch changes up to 8230f3389c7d7215d0c3946d415f54b3e9c07f73: > > qapi: Fix calc-dirty-rate example (2022-04-05 12:30:45 +0200) > > > QAPI patches patches for 2022-04-05 > > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0 for any user-visible changes. -- PMM
Re: [RFC PATCH] docs/devel: start documenting writing VirtIO devices
On Wed, Mar 16 2022, Alex Bennée wrote: > Cornelia Huck writes: > >> On Wed, Mar 09 2022, Alex Bennée wrote: >>> +Writing VirtIO backends for QEMU >>> + >>> + >>> +This document attempts to outline the information a developer needs to >>> +know to write backends for QEMU. It is specifically focused on >>> +implementing VirtIO devices. >> >> I think you first need to define a bit more clearly what you consider a >> "backend". For virtio, it is probably "everything a device needs to >> function as a specific device type like net, block, etc., which may be >> implemented by different methods" (as you describe further below). > > How about: > > This document attempts to outline the information a developer needs to > know to write device emulations in QEMU. It is specifically focused on > implementing VirtIO devices. For VirtIO the frontend is the driver > running on the guest. The backend is everything that QEMU needs to > do to handle the emulation of the VirtIO device. This can be done > entirely in QEMU, divided between QEMU and the kernel (vhost) or > handled by a separate process which is configured by QEMU > (vhost-user).

I'm afraid that confuses me even more :) This sounds to me like frontend == driver (in virtio spec terminology) and backend == device. Is that really what you meant?

> >> >>> + >>> +Front End Transports >>> + >>> + >>> +VirtIO supports a number of different front end transports. The >>> +details of the device remain the same but there are differences in >>> +command line for specifying the device (e.g. -device virtio-foo >>> +and -device virtio-foo-pci). For example: >>> + >>> +..
code:: c >>> + >>> + static const TypeInfo vhost_user_blk_info = { >>> + .name = TYPE_VHOST_USER_BLK, >>> + .parent = TYPE_VIRTIO_DEVICE, >>> + .instance_size = sizeof(VHostUserBlk), >>> + .instance_init = vhost_user_blk_instance_init, >>> + .class_init = vhost_user_blk_class_init, >>> + }; >>> + >>> +defines ``TYPE_VHOST_USER_BLK`` as a child of the generic >>> +``TYPE_VIRTIO_DEVICE``. >> >> That's not what I'd consider a "front end", though? > > Yeah, clumsy wording. I'm trying to find a good example to show how > QOM can be used to abstract the core device operation and the wrappers > for different transports. However in the code base there seems to be > considerable variation about how this is done. Any advice as to the > best exemplary device to follow is greatly welcomed.

I'm not sure which of the examples we can really consider a "good" device; the normal modus operandi when writing a new device seems to be "pick the first device you can think of and copy whatever it does". Personally, I usually look at blk or net, but those carry a lot of legacy baggage; so maybe a modern virtio-1-only device like gpu? That one also has the advantage of not being pci-only. Does anyone else have a good suggestion here?

> >>> And then for the PCI device it wraps around the >>> +base device (although explicitly initialising via >>> +virtio_instance_init_common): >>> + >>> +.. code:: c >>> + >>> + struct VHostUserBlkPCI { >>> + VirtIOPCIProxy parent_obj; >>> + VHostUserBlk vdev; >>> + }; >> >> The VirtIOPCIProxy seems to materialize a bit out of thin air >> here... maybe the information simply needs to be structured in a >> different way?
Perhaps: >> >> - describe that virtio devices consist of a part that implements the >> device functionality, which ultimately derives from VirtIODevice (the >> "backend"), and a part that exposes a way for the operating system to >> discover and use the device (the "frontend", what the virtio spec >> calls a "transport") >> - describe how the "frontend" part works (maybe mention VirtIOPCIProxy, >> VirtIOMMIOProxy, and VirtioCcwDevice as specialized proxy devices for >> PCI, MMIO, and CCW devices) >> - list the different types of "backends" (as you did below), and give >> two examples of how VirtIODevice is extended (a plain one, and a >> vhost-user one) >> - explain how frontend and backend together create an actual device >> (with the two device examples, and maybe also with the plain one >> plugged as both PCI and CCW?); maybe also mention that MMIO is a bit >> different? (it always confuses me) > > OK I'll see how I can restructure things to make it clearer. Do we also > have to take into account the object hierarchy for different types of > device (i.e. block or net)? Or is that all plumbing into QEMU's > sub-system internals done in the VirtIO device objects?

An example of how a device plugs into a bigger infrastructure like the block layer might be helpful, but it also might complicate the documentation (as you probably won't need to do anything like that if you write a device that does not use any established infrastructure.) Maybe just gloss over it for now?

> >>> + >>> +Back End Implementations >>> + >>> + >>>
Re: [PATCH] block/stream: Drain subtree around graph change
On 05.04.2022 at 15:09, Emanuele Giuseppe Esposito wrote: > On 05/04/2022 at 12:14, Kevin Wolf wrote: > > I think all of this is really relevant for Emanuele's work, which > > involves adding AIO_WAIT_WHILE() deep inside graph update functions. I > > fully expect that we would see very similar problems, and just stacking > > drain sections over drain sections that might happen to usually fix > > things, but aren't guaranteed to, doesn't look like a good solution. > > Yes, I think at this point we all agreed to drop subtree_drain as > replacement for AioContext. > > The alternative is what Paolo proposed in the other thread "Removal of > AioContext lock, bs->parents and ->children: proof of concept" > I am not sure which thread you replied to first :)

This one, I think. :-)

> I think that proposal is not far from your idea, and it avoids > introducing or even using drains at all. > Not sure why you called it a "step backwards even from AioContext locks".

I was only referring to the lock locality there. AioContext locks are really coarse, but still a finer granularity than a single global lock. In the big picture, it'd still be better than the AioContext lock, but that's because it's a different type of lock, not because it has better locality. So I was just wondering if we can't have the different type of lock and make it local to the BDS, too.

Kevin
Re: [PATCH] ui/cursor: fix integer overflow in cursor_alloc (CVE-2022-4206)
On Tue, Apr 5, 2022 at 1:10 PM Gerd Hoffmann wrote: > > > > +++ b/ui/cursor.c > > > @@ -46,6 +46,13 @@ static QEMUCursor *cursor_parse_xpm(const char *xpm[]) > > > > > > /* parse pixel data */ > > > c = cursor_alloc(width, height); > > > + > > > +if (!c) { > > > +fprintf(stderr, "%s: cursor %ux%u alloc error\n", > > > +__func__, width, height); > > > +return NULL; > > > +} > > > > > > > I think you could simply abort() in this function. It is used with static > > data (ui/cursor*.xpm) > > Yes, that should never happen. > > Missing: vmsvga_cursor_define() calls cursor_alloc() with guest-supplied > values too.

I skipped that because the check (cursor.width > 256 || cursor.height > 256) is already done in vmsvga_fifo_run before calling vmsvga_cursor_define. Do you want me to add another check in vmsvga_cursor_define and return NULL if cursor_alloc fails?

> take care, > Gerd

-- Mauro Matteo Cascella Red Hat Product Security PGP-Key ID: BB3410B0
[PATCH v1] configure: judge build dir permission
If this patch is applied, issue:
https://gitlab.com/qemu-project/qemu/-/issues/321
can be closed.

Signed-off-by: Guo Zhi
---
 configure | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 7c08c18358..9cfa78efd2 100755
--- a/configure
+++ b/configure
@@ -24,7 +24,13 @@ then
     then
         if test -f $MARKER
         then
-            rm -rf build
+            if test -w $MARKER
+            then
+                rm -rf build
+            else
+                echo "ERROR: ./build dir already exists and can not be removed due to permission"
+                exit 1
+            fi
         else
             echo "ERROR: ./build dir already exists and was not previously created by configure"
             exit 1
--
2.35.1
Re: [PATCH] block/stream: Drain subtree around graph change
On 05.04.2022 at 14:12, Vladimir Sementsov-Ogievskiy wrote: > Thanks Kevin! I have already run out of arguments in the battle > against using subtree-drains to isolate graph modification operations > from each other in different threads in the mailing list) > > (Note also, that the top-most version of this patch is "[PATCH v2] > block/stream: Drain subtree around graph change")

Oops, I completely missed the v2. Thanks!

> About avoiding polling during graph-modifying operations, there is a > problem: some IO operations are involved in block-graph modifying > operations. At least it's rewriting "backing_file_offset" and > "backing_file_size" fields in the qcow2 header. > > We can't just separate rewriting metadata from the graph modifying > operation: this way another graph-modifying operation may interleave > and we'll write outdated metadata.

Hm, generally we don't update image metadata when we reconfigure the graph. Most changes are temporary (like insertion of filter nodes) and the image header only contains a "default configuration" to be used on the next start. There are only a few places that update the image header; I think it's generally block job completions. They obviously update the in-memory graph, too, but they don't write to the image file (and therefore potentially poll) in the middle of updating the in-memory graph, but they do both in separate steps. I think this is okay. We must just avoid polling in the middle of graph updates because if something else changes the graph there, it's not clear any more that we're really doing what the caller had in mind.

> So I still think we need a kind of global lock for graph-modifying > operations. Or a kind of per-BDS locks as you propose. But in this case > we need to be sure that, taking all the needed per-BDS locks, we avoid > deadlocking.

I guess this depends on the exact granularity of the locks we're using. If you take the lock only while updating a single edge, I don't think you could easily deadlock.
If you hold it for more complex operations, it becomes harder to tell without checking the code. Kevin
[PATCH v1] hw/ppc: change indentation to spaces from TABs
There are still some files in the QEMU PPC code base that use TABs for indentation instead of using spaces. The TABs should be replaced so that we have a consistent coding style. If this patch is applied, issue: https://gitlab.com/qemu-project/qemu/-/issues/374 can be closed. Signed-off-by: Guo Zhi --- hw/core/uboot_image.h | 185 - hw/ppc/ppc440_bamboo.c | 6 +- hw/ppc/spapr_rtas.c| 18 ++-- include/hw/ppc/ppc.h | 10 +-- 4 files changed, 109 insertions(+), 110 deletions(-) diff --git a/hw/core/uboot_image.h b/hw/core/uboot_image.h index 608022de6e..980e9cc014 100644 --- a/hw/core/uboot_image.h +++ b/hw/core/uboot_image.h @@ -12,7 +12,7 @@ * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License along @@ -32,128 +32,127 @@ /* * Operating System Codes */ -#define IH_OS_INVALID 0 /* Invalid OS */ -#define IH_OS_OPENBSD 1 /* OpenBSD */ -#define IH_OS_NETBSD 2 /* NetBSD */ -#define IH_OS_FREEBSD 3 /* FreeBSD */ -#define IH_OS_4_4BSD 4 /* 4.4BSD */ -#define IH_OS_LINUX5 /* Linux*/ -#define IH_OS_SVR4 6 /* SVR4 */ -#define IH_OS_ESIX 7 /* Esix */ -#define IH_OS_SOLARIS 8 /* Solaris */ -#define IH_OS_IRIX 9 /* Irix */ -#define IH_OS_SCO 10 /* SCO */ -#define IH_OS_DELL 11 /* Dell */ -#define IH_OS_NCR 12 /* NCR */ -#define IH_OS_LYNXOS 13 /* LynxOS */ -#define IH_OS_VXWORKS 14 /* VxWorks */ -#define IH_OS_PSOS 15 /* pSOS */ -#define IH_OS_QNX 16 /* QNX */ -#define IH_OS_U_BOOT 17 /* Firmware */ -#define IH_OS_RTEMS18 /* RTEMS*/ -#define IH_OS_ARTOS19 /* ARTOS*/ -#define IH_OS_UNITY20 /* Unity OS */ +#define IH_OS_INVALID 0 /* Invalid OS */ +#define IH_OS_OPENBSD 1 /* OpenBSD */ +#define IH_OS_NETBSD 2 /* NetBSD */ +#define IH_OS_FREEBSD 3 /* FreeBSD */ 
+#define IH_OS_4_4BSD 4 /* 4.4BSD */ +#define IH_OS_LINUX 5 /* Linux */ +#define IH_OS_SVR46 /* SVR4 */ +#define IH_OS_ESIX7 /* Esix */ +#define IH_OS_SOLARIS 8 /* Solaris */ +#define IH_OS_IRIX9 /* Irix */ +#define IH_OS_SCO 10 /* SCO */ +#define IH_OS_DELL11 /* Dell */ +#define IH_OS_NCR 12 /* NCR */ +#define IH_OS_LYNXOS 13 /* LynxOS */ +#define IH_OS_VXWORKS 14 /* VxWorks */ +#define IH_OS_PSOS15 /* pSOS */ +#define IH_OS_QNX 16 /* QNX */ +#define IH_OS_U_BOOT 17 /* Firmware */ +#define IH_OS_RTEMS 18 /* RTEMS */ +#define IH_OS_ARTOS 19 /* ARTOS */ +#define IH_OS_UNITY 20 /* Unity OS */ /* * CPU Architecture Codes (supported by Linux) */ -#define IH_CPU_INVALID 0 /* Invalid CPU */ -#define IH_CPU_ALPHA 1 /* Alpha*/ -#define IH_CPU_ARM 2 /* ARM */ -#define IH_CPU_I3863 /* Intel x86*/ -#define IH_CPU_IA644 /* IA64 */ -#define IH_CPU_MIPS5 /* MIPS */ -#define IH_CPU_MIPS64 6 /* MIPS 64 Bit */ -#define IH_CPU_PPC 7 /* PowerPC */ -#define IH_CPU_S3908 /* IBM S390 */ -#define IH_CPU_SH 9 /* SuperH */ -#define IH_CPU_SPARC 10 /* Sparc*/ -#define IH_CPU_SPARC64 11 /* Sparc 64 Bit */ -#define IH_CPU_M68K12 /* M68K */ -#define IH_CPU_NIOS13 /* Nios-32 */ -#define IH_CPU_MICROBLAZE 14 /* MicroBlaze */ -#define IH_CPU_NIOS2 15 /* Nios-II */ -#define IH_CPU_BLACKFIN16 /* Blackfin */ -#define IH_CPU_AVR32 17 /* AVR32*/ +#define IH_CPU_INVALID0 /* Invalid CPU */ +#define IH_CPU_ALPHA 1 /* Alpha */ +#define IH_CPU_ARM2 /* ARM */ +#define IH_CPU_I386 3 /* Intel x86 */ +#define IH_CPU_IA64 4 /* IA64 */ +#define IH_CPU_MIPS 5 /* MIPS */ +#define IH_CPU_MIPS64 6 /* MIPS 64 Bit */ +#define IH_CPU_PPC7 /* PowerPC */ +#define IH_CPU_S390 8 /* IBM S390 */ +#define IH_CPU_SH 9 /* SuperH */ +#define IH_CPU_SPARC 10 /* Sparc */ +#define IH_CPU_SPARC6411 /* Sparc 64 Bit */ +#define IH_CPU_M68K 12 /* M68K */ +#define IH_CPU_NIOS 13 /* Nios-32 */ +#define IH_CPU_MICROBLAZE 14 /* MicroBlaze */
Re: [PATCH v3 3/5] tests/qtest/libqos: Skip hotplug tests if pci root bus is not hotpluggable
Eric Auger writes: > ARM does not support hotplug on pcie.0. Add a flag on the bus > which tells if devices can be hotplugged and skip hotplug tests > if the bus cannot be hotplugged. This is a temporary solution to > enable the other pci tests on aarch64. > > Signed-off-by: Eric Auger > Acked-by: Thomas Huth Reviewed-by: Alex Bennée -- Alex Bennée
Re: [RFC PATCH 1/1] kvm-all.c: hint Valgrind that kvm_get_one_reg() inits memory
On Tue, 5 Apr 2022 at 14:07, Daniel Henrique Barboza wrote: > > There are a lot of Valgrind warnings about conditional jumps depending on > uninitialized values, like this one (taken from a pSeries guest): > > Conditional jump or move depends on uninitialised value(s) > at 0xB011DC: kvmppc_enable_cap_large_decr (kvm.c:2544) > by 0x92F28F: cap_large_decr_cpu_apply (spapr_caps.c:523) > by 0x930C37: spapr_caps_cpu_apply (spapr_caps.c:921) > by 0x955D3B: spapr_reset_vcpu (spapr_cpu_core.c:73) > (...) > Uninitialised value was created by a stack allocation > at 0xB01150: kvmppc_enable_cap_large_decr (kvm.c:2538) > > In this case, the alleged uninitialized value is the 'lpcr' variable that > is written by kvm_get_one_reg() and then used in an if clause: > > int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable) > { > CPUState *cs = CPU(cpu); > uint64_t lpcr; > > kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr); > /* Do we need to modify the LPCR? */ > if (!!(lpcr & LPCR_LD) != !!enable) { < Valgrind warns here > (...) > > A quick fix is to init the variable that kvm_get_one_reg() is going to > write ('lpcr' in the example above). Another idea is to convince > Valgrind that kvm_get_one_reg() inits the 'void *target' memory in case > the ioctl() is successful. This will put some boilerplate in the > function but it will bring benefit for its other callers.

Doesn't Valgrind have a way of modelling ioctls where it knows what data is read and written? In general ioctl-using programs don't need to have special-case "I am running under valgrind" handling, so this seems to me like valgrind is missing support for this particular ioctl.

More generally, how much use is running QEMU with KVM enabled under valgrind anyway? Valgrind has no way of knowing about writes to memory that the guest vCPUs do...

thanks -- PMM
[PATCH] docs/ccid: convert to restructuredText
From: Lucas Ramage Buglink: https://gitlab.com/qemu-project/qemu/-/issues/527 Signed-off-by: Lucas Ramage --- docs/ccid.txt| 182 --- docs/system/device-emulation.rst | 1 + docs/system/devices/ccid.rst | 171 + 3 files changed, 172 insertions(+), 182 deletions(-) delete mode 100644 docs/ccid.txt create mode 100644 docs/system/devices/ccid.rst diff --git a/docs/ccid.txt b/docs/ccid.txt deleted file mode 100644 index 2b85b1bd42..00 --- a/docs/ccid.txt +++ /dev/null @@ -1,182 +0,0 @@ -QEMU CCID Device Documentation. - -Contents -1. USB CCID device -2. Building -3. Using ccid-card-emulated with hardware -4. Using ccid-card-emulated with certificates -5. Using ccid-card-passthru with client side hardware -6. Using ccid-card-passthru with client side certificates -7. Passthrough protocol scenario -8. libcacard - -1. USB CCID device - -The USB CCID device is a USB device implementing the CCID specification, which -lets one connect smart card readers that implement the same spec. For more -information see the specification: - - Universal Serial Bus - Device Class: Smart Card - CCID - Specification for - Integrated Circuit(s) Cards Interface Devices - Revision 1.1 - April 22rd, 2005 - -Smartcards are used for authentication, single sign on, decryption in -public/private schemes and digital signatures. A smartcard reader on the client -cannot be used on a guest with simple usb passthrough since it will then not be -available on the client, possibly locking the computer when it is "removed". On -the other hand this device can let you use the smartcard on both the client and -the guest machine. It is also possible to have a completely virtual smart card -reader and smart card (i.e. not backed by a physical device) using this device. - -2. 
Building - -The cryptographic functions and access to the physical card is done via the -libcacard library, whose development package must be installed prior to -building QEMU: - -In redhat/fedora: -yum install libcacard-devel -In ubuntu: -apt-get install libcacard-dev - -Configuring and building: -./configure --enable-smartcard && make - - -3. Using ccid-card-emulated with hardware - -Assuming you have a working smartcard on the host with the current -user, using libcacard, QEMU acts as another client using ccid-card-emulated: - -qemu -usb -device usb-ccid -device ccid-card-emulated - - -4. Using ccid-card-emulated with certificates stored in files - -You must create the CA and card certificates. This is a one time process. -We use NSS certificates: - -mkdir fake-smartcard -cd fake-smartcard -certutil -N -d sql:$PWD -certutil -S -d sql:$PWD -s "CN=Fake Smart Card CA" -x -t TC,TC,TC -n fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe" -n id-cert -c fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe (signing)" --nsCertType smime -n signing-cert -c fake-smartcard-ca -certutil -S -d sql:$PWD -t ,, -s "CN=John Doe (encryption)" --nsCertType sslClient -n encryption-cert -c fake-smartcard-ca - -Note: you must have exactly three certificates. 
- -You can use the emulated card type with the certificates backend: - -qemu -usb -device usb-ccid -device ccid-card-emulated,backend=certificates,db=sql:$PWD,cert1=id-cert,cert2=signing-cert,cert3=encryption-cert - -To use the certificates in the guest, export the CA certificate: - -certutil -L -r -d sql:$PWD -o fake-smartcard-ca.cer -n fake-smartcard-ca - -and import it in the guest: - -certutil -A -d /etc/pki/nssdb -i fake-smartcard-ca.cer -t TC,TC,TC -n fake-smartcard-ca - -In a Linux guest you can then use the CoolKey PKCS #11 module to access -the card: - -certutil -d /etc/pki/nssdb -L -h all - -It will prompt you for the PIN (which is the password you assigned to the -certificate database early on), and then show you all three certificates -together with the manually imported CA cert: - -Certificate NicknameTrust Attributes -fake-smartcard-ca CT,C,C -John Doe:CAC ID Certificate u,u,u -John Doe:CAC Email Signature Certificateu,u,u -John Doe:CAC Email Encryption Certificate u,u,u - -If this does not happen, CoolKey is not installed or not registered with -NSS. Registration can be done from Firefox or the command line: - -modutil -dbdir /etc/pki/nssdb -add "CAC Module" -libfile /usr/lib64/pkcs11/libcoolkeypk11.so -modutil -dbdir /etc/pki/nssdb -list - - -5. Using ccid-card-passthru with client side hardware - -on the host specify the ccid-card-passthru device with a suitable chardev: - -qemu -chardev socket,server=on,host=0.0.0.0,port=2001,id=ccid,wait=off \ - -usb -device usb-ccid -device ccid-card-passthru,chardev=ccid - -on the client run vscclient, built when you built QEMU: - -vscclient 2001 - - -6. Using ccid-card-passthru with client
Re: [PATCH 2/2] hw/xen/xen_pt: Resolve igd_passthrough_isa_bridge_create() indirection
On Sat, Mar 26, 2022 at 05:58:24PM +0100, Bernhard Beschow wrote: > Now that igd_passthrough_isa_bridge_create() is implemented within the > xen context it may use Xen* data types directly and become > xen_igd_passthrough_isa_bridge_create(). This resolves an indirection. > > Signed-off-by: Bernhard Beschow Acked-by: Anthony PERARD Thanks, -- Anthony PERARD
Re: [PATCH 1/2] hw/xen/xen_pt: Confine igd-passthrough-isa-bridge to XEN
On Sat, Mar 26, 2022 at 05:58:23PM +0100, Bernhard Beschow wrote: > igd-passthrough-isa-bridge is only requested in xen_pt but was > implemented in pc_piix.c. This caused xen_pt to depend on i386/pc, > which is hereby resolved. > > Signed-off-by: Bernhard Beschow Acked-by: Anthony PERARD Thanks, -- Anthony PERARD
Re: [PATCH] block/stream: Drain subtree around graph change
On 05.04.2022 at 13:47, Hanna Reitz wrote: > On 05.04.22 12:14, Kevin Wolf wrote: > > On 24.03.2022 at 13:57, Hanna Reitz wrote: > > > When the stream block job cuts out the nodes between top and base in > > > stream_prepare(), it does not drain the subtree manually; it fetches the > > > base node, and tries to insert it as the top node's backing node with > > > bdrv_set_backing_hd(). bdrv_set_backing_hd() however will drain, and so > > > the actual base node might change (because the base node is actually not > > > part of the stream job) before the old base node passed to > > > bdrv_set_backing_hd() is installed. > > > > > > This has two implications: > > > > > > First, the stream job does not keep a strong reference to the base node. > > > Therefore, if it is deleted in bdrv_set_backing_hd()'s drain (e.g. > > > because some other block job is drained to finish), we will get a > > > use-after-free. We should keep a strong reference to that node. > > > > > > Second, even with such a strong reference, the problem remains that the > > > base node might change before bdrv_set_backing_hd() actually runs and as > > > a result the wrong base node is installed. > > > > > > Both effects can be seen in 030's TestParallelOps.test_overlapping_5() > > > case, which has five nodes, and simultaneously streams from the middle > > > node to the top node, and commits the middle node down to the base node. > > > As it is, this will sometimes crash, namely when we encounter the > > > above-described use-after-free. > > > > > > Taking a strong reference to the base node, we no longer get a crash, > > > but the resulting block graph is less than ideal: The expected result is > > > obviously that all middle nodes are cut out and the base node is the > > > immediate backing child of the top node.
However, if stream_prepare() > > > takes a strong reference to its base node (the middle node), and then > > > the commit job finishes in bdrv_set_backing_hd(), supposedly dropping > > > that middle node, the stream job will just reinstall it again. > > > > > > Therefore, we need to keep the whole subtree drained in > > > stream_prepare() > > That doesn't sound right. I think in reality it's "if we take the really > > big hammer and drain the whole subtree, then the bit that we really need > > usually happens to be covered, too". > > > > When you have a long backing chain and merge the two topmost overlays > > with streaming, then it's none of the stream job's business whether > > there is I/O going on for the base image way down the chain. Subtree > > drains do much more than they should in this case. > > Yes, see the discussion I had with Vladimir. He convinced me that this > can’t be an indefinite solution, but that we need locking for graph changes > that’s separate from draining, because (1) those are different things, and > (2) changing the graph should influence I/O as little as possible. > > I found this the best solution to fix a known case of a use-after-free for > 7.1, though.

I'm not arguing against a short-term band-aid solution (I assume you mean for 7.0?) as long as we agree that this is what it is. The commit message just sounded as if this were the right solution rather than a hack, so I wanted to make the point.

> > At the same time they probably do too little, because what you're > > describing you're protecting against is not I/O, but graph modifications > > done by callbacks invoked in the AIO_WAIT_WHILE() when replacing the > > backing file. The callback could be invoked by I/O on an entirely different subgraph (maybe if the other thing is a mirror job) or it > > could be a BH or anything else really.
bdrv_drain_all() would increase > > your chances, but I'm not sure if even that would be guaranteed to be > > enough - because it's really another instance of abusing drain for > > locking, we're not really interested in the _I/O_ of the node. > > The most common instances of graph modification I see are QMP and block jobs > finishing. The former will not be deterred by draining, and we do know of > one instance where that is a problem (see the bdrv_next() discussion). > Generally, it isn’t though. (If it is, this case here won’t be the only > thing that breaks.) To be honest, I would be surprised if other things weren't broken if QMP commands come in with unfortunate timing. > As for the latter, most block jobs are parents of the nodes they touch > (stream is one notable exception with how it handles its base, and changing > that did indeed cause us headache before), and so will at least be paused > when a drain occurs on a node they touch. Since pausing doesn’t affect jobs > that have exited their main loop, there might be some problem with > concurrent jobs that are also finished but yielding, but I couldn’t find > such a case. True, the way that we implement drain in the block job actually means that they fully pause and therefore can't complete even if they wouldn't
Re: [PATCH v3 2/5] tests/qtest/libqos/pci: Introduce pio_limit
Eric Auger writes: > At the moment the IO space limit is hardcoded to > QPCI_PIO_LIMIT = 0x1. When accesses are performed to a bar, > the base address of the latter is compared against the limit > to decide whether we perform an IO or a memory access. > > On ARM, we cannot keep this PIO limit as the arm-virt machine > uses [0x3eff, 0x3f00 ] for the IO space map and we > are mandated to allocate at 0x0. > > Add a new flag in QPCIBar indicating whether it is an IO bar > or a memory bar. This flag is set on QPCIBar allocation and > provisioned based on the BAR configuration. Then the new flag > is used in access functions and in the iomap() function. > > Signed-off-by: Eric Auger > Reviewed-by: Thomas Huth Reviewed-by: Alex Bennée -- Alex Bennée
Re: [RFC PATCH] tests/qtest: attempt to enable tests for virtio-gpio (!working)
"Dr. David Alan Gilbert" writes: > * Alex Bennée (alex.ben...@linaro.org) wrote: >> >> (expanding the CC list for help, anyone have a better idea about how >> vhost-user qtests should work/see obvious issues with this patch?) > > How exactly does it fail?

➜ env QTEST_QEMU_BINARY=./qemu-system-aarch64 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/alex/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=137 ./tests/qtest/qos-test -p /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile
# random seed: R02S5d7667675b4f6dd3b8559f8db621296c
# starting QEMU: exec ./qemu-system-aarch64 -qtest unix:/tmp/qtest-1245871.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1245871.qmp,id=char0 -mon chardev=char0,mode=control -display none -machine none -accel qtest
# Start of aarch64 tests
# Start of virt tests
# Start of generic-pcihost tests
# Start of pci-bus-generic tests
# Start of pci-bus tests
# Start of vhost-user-gpio-pci tests
# Start of vhost-user-gpio tests
# Start of vhost-user-gpio-tests tests
# Start of read-guest-mem tests
# child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile/subprocess [1245877]) exit status: 1 (error)
# child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile/subprocess [1245877]) stdout: ""
# child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile/subprocess [1245877]) stderr: "qemu-system-aarch64: -device vhost-user-gpio-pci,id=gpio0,chardev=chr-vhost-user-test,vhostforce=on: Duplicate ID 'gpio0' for device\nsocket_accept failed: Resource temporarily unavailable\n**\nERROR:../../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake: assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)\n"
** ERROR:../../tests/qtest/qos-test.c:189:subprocess_run_one_test: child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile/subprocess [1245877]) failed unexpectedly
Bail out! ERROR:../../tests/qtest/qos-test.c:189:subprocess_run_one_test: child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-gpio-pci/vhost-user-gpio/vhost-user-gpio-tests/read-guest-mem/memfile/subprocess [1245877]) failed unexpectedly
fish: “env QTEST_QEMU_BINARY=./qemu-sy…” terminated by signal SIGABRT (Abort)

Although it would be nice if I could individually run qos-tests with all the make machinery setting things up.

> Dave

>> Alex Bennée writes: >> >> > We don't have a virtio-gpio implementation in QEMU and only >> > support a vhost-user backend. The QEMU side of the code is minimal so >> > it should be enough to instantiate the device and pass some vhost-user >> > messages over the control socket. To do this we hook into the existing >> > vhost-user-test code and just add the bits required for gpio. >> > >> > Based-on: 20220118203833.316741-1-eric.au...@redhat.com >> > Signed-off-by: Alex Bennée >> > Cc: Viresh Kumar >> > Cc: Paolo Bonzini >> > >> > --- >> > >> > This goes as far as to add things to the QOS tree but so far it's >> > failing to properly start QEMU with the chardev socket needed to >> > communicate between the mock vhost-user daemon and QEMU itself.
>> > --- >> > tests/qtest/libqos/virtio-gpio.h | 34 +++ >> > tests/qtest/libqos/virtio-gpio.c | 98 >> > tests/qtest/vhost-user-test.c| 34 +++ >> > tests/qtest/libqos/meson.build | 1 + >> > 4 files changed, 167 insertions(+) >> > create mode 100644 tests/qtest/libqos/virtio-gpio.h >> > create mode 100644 tests/qtest/libqos/virtio-gpio.c >> > >> > diff --git a/tests/qtest/libqos/virtio-gpio.h >> > b/tests/qtest/libqos/virtio-gpio.h >> > new file mode 100644 >> > index 00..abe6967ae9 >> > --- /dev/null >> > +++ b/tests/qtest/libqos/virtio-gpio.h >> > @@ -0,0 +1,34 @@ >> > +/* >> > + * virtio-gpio structures >> > + * >> > + * Copyright (c) 2022 Linaro Ltd >> > + * >> > + * SPDX-License-Identifier: GPL-2.0-or-later >> > + */ >> > + >> > +#ifndef TESTS_LIBQOS_VIRTIO_GPIO_H >> > +#define TESTS_LIBQOS_VIRTIO_GPIO_H >> > + >> > +#include "qgraph.h" >> > +#include "virtio.h" >> > +#include "virtio-pci.h" >> > + >> > +typedef struct QVhostUserGPIO QVhostUserGPIO; >> > +typedef struct QVhostUserGPIOPCI QVhostUserGPIOPCI; >> > +typedef struct QVhostUserGPIODevice QVhostUserGPIODevice; >> > + >> > +struct QVhostUserGPIO { >> > +QVirtioDevice *vdev; >> > +}; >> > + >> > +struct QVhostUserGPIOPCI { >> > +QVirtioPCIDevice pci_vdev; >> > +QVhostUserGPIO gpio; >> > +}; >> > + >> > +struct
[PATCH v3 2/3] iotests/108: Test new refcount rebuild algorithm
One clear problem with how qcow2's refcount structure rebuild algorithm used to be before "qcow2: Improve refcount structure rebuilding" was that it is prone to failure for qcow2 images on block devices: There is generally unused space after the actual image, and if that exceeds what one refblock covers, the old algorithm would invariably write the reftable past the block device's end, which cannot work. The new algorithm does not have this problem. Test it with three tests: (1) Create an image with more empty space at the end than what one refblock covers, see whether rebuilding the refcount structures results in a change in the image file length. (It should not.) (2) Leave precisely enough space somewhere at the beginning of the image for the new reftable (and the refblock for that place), see whether the new algorithm puts the reftable there. (It should.) (3) Test the original problem: Create (something like) a block device with a fixed size, then create a qcow2 image in there, write some data, and then have qemu-img check rebuild the refcount structures. Before HEAD^, the reftable would have been written past the image file end, i.e. outside of what the block device provides, which cannot work. HEAD^ should have fixed that. ("Something like a block device" means a loop device if we can use one ("sudo -n losetup" works), or a FUSE block export with growable=false otherwise.) Reviewed-by: Eric Blake Signed-off-by: Hanna Reitz --- tests/qemu-iotests/108 | 259 - tests/qemu-iotests/108.out | 81 2 files changed, 339 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/108 b/tests/qemu-iotests/108 index 56339ab2c5..ed02b3267b 100755 --- a/tests/qemu-iotests/108 +++ b/tests/qemu-iotests/108 @@ -30,13 +30,20 @@ status=1# failure is the default! 
_cleanup() { - _cleanup_test_img +_cleanup_test_img +if [ -f "$TEST_DIR/qsd.pid" ]; then +qsd_pid=$(cat "$TEST_DIR/qsd.pid") +kill -KILL "$qsd_pid" +fusermount -u "$TEST_DIR/fuse-export" &>/dev/null +fi +rm -f "$TEST_DIR/fuse-export" } trap "_cleanup; exit \$status" 0 1 2 3 15 # get standard environment, filters and checks . ./common.rc . ./common.filter +. ./common.qemu # This tests qcow2-specific low-level functionality _supported_fmt qcow2 @@ -47,6 +54,22 @@ _supported_os Linux # files _unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file +# This test either needs sudo -n losetup or FUSE exports to work +if sudo -n losetup &>/dev/null; then +loopdev=true +else +loopdev=false + +# QSD --export fuse will either yield "Parameter 'id' is missing" +# or "Invalid parameter 'fuse'", depending on whether there is +# FUSE support or not. +error=$($QSD --export fuse 2>&1) +if [[ $error = *"Invalid parameter 'fuse'" ]]; then +_notrun 'Passwordless sudo for losetup or FUSE support required, but' \ +'neither is available' +fi +fi + echo echo '=== Repairing an image without any refcount table ===' echo @@ -138,6 +161,240 @@ _make_test_img 64M poke_file "$TEST_IMG" $((0x10008)) "\xff\xff\xff\xff\xff\xff\x00\x00" _check_test_img -r all +echo +echo '=== Check rebuilt reftable location ===' + +# In an earlier version of the refcount rebuild algorithm, the +# reftable was generally placed at the image end (unless something was +# allocated in the area covered by the refblock right before the image +# file end, then we would try to place the reftable in that refblock). +# This was later changed so the reftable would be placed in the +# earliest possible location. Test this. + +echo +echo '--- Does the image size increase? ---' +echo + +# First test: Just create some image, write some data to it, and +# resize it so there is free space at the end of the image (enough +# that it spans at least one full refblock, which for cluster_size=512 +# images, spans 128k). 
With the old algorithm, the reftable would +# have then been placed at the end of the image file, but with the new +# one, it will be put in that free space. +# We want to check whether the size of the image file increases due to +# rebuilding the refcount structures (it should not). + +_make_test_img -o 'cluster_size=512' 1M +# Write something +$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io + +# Add free space +file_len=$(stat -c '%s' "$TEST_IMG") +truncate -s $((file_len + 256 * 1024)) "$TEST_IMG" + +# Corrupt the image by saying the image header was not allocated +rt_offset=$(peek_file_be "$TEST_IMG" 48 8) +rb_offset=$(peek_file_be "$TEST_IMG" $rt_offset 8) +poke_file "$TEST_IMG" $rb_offset "\x00\x00" + +# Check whether rebuilding the refcount structures increases the image +# file size +file_len=$(stat -c '%s' "$TEST_IMG") +echo +# The only leaks there can be are the old refcount structures that are +# leaked during rebuilding, no need to clutter
[PATCH v3 3/3] qcow2: Add errp to rebuild_refcount_structure()
Instead of fprintf()-ing error messages in rebuild_refcount_structure() and its rebuild_refcounts_write_refblocks() helper, pass them through an Error object to qcow2_check_refcounts() (which will then print it). Suggested-by: Eric Blake Signed-off-by: Hanna Reitz --- block/qcow2-refcount.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index c5669eaa51..ed0ecfaa89 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -2465,7 +2465,8 @@ static int64_t alloc_clusters_imrt(BlockDriverState *bs, static int rebuild_refcounts_write_refblocks( BlockDriverState *bs, void **refcount_table, int64_t *nb_clusters, int64_t first_cluster, int64_t end_cluster, -uint64_t **on_disk_reftable_ptr, uint32_t *on_disk_reftable_entries_ptr +uint64_t **on_disk_reftable_ptr, uint32_t *on_disk_reftable_entries_ptr, +Error **errp ) { BDRVQcow2State *s = bs->opaque; @@ -2516,8 +2517,8 @@ static int rebuild_refcounts_write_refblocks( nb_clusters, &first_free_cluster); if (refblock_offset < 0) { -fprintf(stderr, "ERROR allocating refblock: %s\n", -strerror(-refblock_offset)); +error_setg_errno(errp, -refblock_offset, + "ERROR allocating refblock"); return refblock_offset; } @@ -2539,6 +2540,7 @@ static int rebuild_refcounts_write_refblocks( on_disk_reftable_entries * REFTABLE_ENTRY_SIZE); if (!on_disk_reftable) { +error_setg(errp, "ERROR allocating reftable memory"); return -ENOMEM; } @@ -2562,7 +2564,7 @@ static int rebuild_refcounts_write_refblocks( ret = qcow2_pre_write_overlap_check(bs, 0, refblock_offset, s->cluster_size, false); if (ret < 0) { -fprintf(stderr, "ERROR writing refblock: %s\n", strerror(-ret)); +error_setg_errno(errp, -ret, "ERROR writing refblock"); return ret; } @@ -2578,7 +2580,7 @@ static int rebuild_refcounts_write_refblocks( ret = bdrv_pwrite(bs->file, refblock_offset, on_disk_refblock, s->cluster_size); if (ret < 0) { -fprintf(stderr, "ERROR writing refblock: %s\n", strerror(-ret));
+error_setg_errno(errp, -ret, "ERROR writing refblock"); return ret; } @@ -2601,7 +2603,8 @@ static int rebuild_refcounts_write_refblocks( static int rebuild_refcount_structure(BlockDriverState *bs, BdrvCheckResult *res, void **refcount_table, - int64_t *nb_clusters) + int64_t *nb_clusters, + Error **errp) { BDRVQcow2State *s = bs->opaque; int64_t reftable_offset = -1; @@ -2652,7 +2655,7 @@ static int rebuild_refcount_structure(BlockDriverState *bs, rebuild_refcounts_write_refblocks(bs, refcount_table, nb_clusters, 0, *nb_clusters, &on_disk_reftable, - &on_disk_reftable_entries); + &on_disk_reftable_entries, errp); if (reftable_size_changed < 0) { res->check_errors++; ret = reftable_size_changed; @@ -2676,8 +2679,8 @@ static int rebuild_refcount_structure(BlockDriverState *bs, refcount_table, nb_clusters, &first_free_cluster); if (reftable_offset < 0) { -fprintf(stderr, "ERROR allocating reftable: %s\n", -strerror(-reftable_offset)); +error_setg_errno(errp, -reftable_offset, + "ERROR allocating reftable"); res->check_errors++; ret = reftable_offset; goto fail; @@ -2695,7 +2698,7 @@ static int rebuild_refcount_structure(BlockDriverState *bs, reftable_start_cluster, reftable_end_cluster, &on_disk_reftable, - &on_disk_reftable_entries); + &on_disk_reftable_entries, errp); if (reftable_size_changed < 0) { res->check_errors++; ret = reftable_size_changed; @@ -2725,7 +2728,7 @@ static int rebuild_refcount_structure(BlockDriverState *bs, ret =
[PATCH v3 1/3] qcow2: Improve refcount structure rebuilding
When rebuilding the refcount structures (when qemu-img check -r found errors with refcount = 0, but reference count > 0), the new refcount table defaults to being put at the image file end[1]. There is no good reason for that except that it means we will not have to rewrite any refblocks we already wrote to disk. Changing the code to rewrite those refblocks is not too difficult, though, so let us do that. That is beneficial for images on block devices, where we cannot really write beyond the end of the image file. Use this opportunity to add extensive comments to the code, and refactor it a bit, getting rid of the backwards-jumping goto. [1] Unless there is something allocated in the area pointed to by the last refblock, so we have to write that refblock. In that case, we try to put the reftable in there. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1519071 Closes: https://gitlab.com/qemu-project/qemu/-/issues/941 Reviewed-by: Eric Blake Signed-off-by: Hanna Reitz --- block/qcow2-refcount.c | 332 + 1 file changed, 235 insertions(+), 97 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index b91499410c..c5669eaa51 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -2438,111 +2438,140 @@ static int64_t alloc_clusters_imrt(BlockDriverState *bs, } /* - * Creates a new refcount structure based solely on the in-memory information - * given through *refcount_table. All necessary allocations will be reflected - * in that array. + * Helper function for rebuild_refcount_structure(). * - * On success, the old refcount structure is leaked (it will be covered by the - * new refcount structure). + * Scan the range of clusters [first_cluster, end_cluster) for allocated + * clusters and write all corresponding refblocks to disk. The refblock + * and allocation data is taken from the in-memory refcount table + * *refcount_table[] (of size *nb_clusters), which is basically one big + * (unlimited size) refblock for the whole image. 
+ * + * For these refblocks, clusters are allocated using said in-memory + * refcount table. Care is taken that these allocations are reflected + * in the refblocks written to disk. + * + * The refblocks' offsets are written into a reftable, which is + * *on_disk_reftable_ptr[] (of size *on_disk_reftable_entries_ptr). If + * that reftable is of insufficient size, it will be resized to fit. + * This reftable is not written to disk. + * + * (If *on_disk_reftable_ptr is not NULL, the entries within are assumed + * to point to existing valid refblocks that do not need to be allocated + * again.) + * + * Return whether the on-disk reftable array was resized (true/false), + * or -errno on error. */ -static int rebuild_refcount_structure(BlockDriverState *bs, - BdrvCheckResult *res, - void **refcount_table, - int64_t *nb_clusters) +static int rebuild_refcounts_write_refblocks( +BlockDriverState *bs, void **refcount_table, int64_t *nb_clusters, +int64_t first_cluster, int64_t end_cluster, +uint64_t **on_disk_reftable_ptr, uint32_t *on_disk_reftable_entries_ptr +) { BDRVQcow2State *s = bs->opaque; -int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0; +int64_t cluster; int64_t refblock_offset, refblock_start, refblock_index; -uint32_t reftable_size = 0; -uint64_t *on_disk_reftable = NULL; +int64_t first_free_cluster = 0; +uint64_t *on_disk_reftable = *on_disk_reftable_ptr; +uint32_t on_disk_reftable_entries = *on_disk_reftable_entries_ptr; void *on_disk_refblock; -int ret = 0; -struct { -uint64_t reftable_offset; -uint32_t reftable_clusters; -} QEMU_PACKED reftable_offset_and_clusters; - -qcow2_cache_empty(bs, s->refcount_block_cache); +bool reftable_grown = false; +int ret; -write_refblocks: -for (; cluster < *nb_clusters; cluster++) { +for (cluster = first_cluster; cluster < end_cluster; cluster++) { +/* Check all clusters to find refblocks that contain non-zero entries */ if (!s->get_refcount(*refcount_table, cluster)) { continue; } +/* + * This cluster is 
allocated, so we need to create a refblock + * for it. The data we will write to disk is just the + * respective slice from *refcount_table, so it will contain + * accurate refcounts for all clusters belonging to this + * refblock. After we have written it, we will therefore skip + * all remaining clusters in this refblock. + */ + refblock_index = cluster >> s->refcount_block_bits; refblock_start = refblock_index << s->refcount_block_bits; -/* Don't allocate a cluster in a refblock already written to disk */ -if (first_free_cluster < refblock_start)
[PATCH v3 0/3] qcow2: Improve refcount structure rebuilding
Hi, v2 cover letter: https://lists.nongnu.org/archive/html/qemu-block/2022-03/msg01260.html v1 cover letter: https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg00651.html This series fixes the qcow2 refcount structure rebuilding mechanism for when the qcow2 image file doesn’t allow writes beyond the end of file (e.g. because it’s on an LVM block device). v3: - Added patch 3 (didn’t squash this into patch 1, because (a) Eric gave his R-b on 1 as-is, and (b) I ended up retouching rebuild_refcount_structure() as a whole, not just the new helper, so a dedicated patch made more sense) - In patch 1: Changed `assert(reftable_size_changed == true)` to just `assert(reftable_size_changed)` - In patch 2: In comments, replaced “were” by “was” git-backport-diff against v2: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/3:[0002] [FC] 'qcow2: Improve refcount structure rebuilding' 002/3:[0006] [FC] 'iotests/108: Test new refcount rebuild algorithm' 003/3:[down] 'qcow2: Add errp to rebuild_refcount_structure()' Hanna Reitz (3): qcow2: Improve refcount structure rebuilding iotests/108: Test new refcount rebuild algorithm qcow2: Add errp to rebuild_refcount_structure() block/qcow2-refcount.c | 353 ++--- tests/qemu-iotests/108 | 259 ++- tests/qemu-iotests/108.out | 81 + 3 files changed, 587 insertions(+), 106 deletions(-) -- 2.35.1
Re: [PATCH v9 27/45] hw/cxl/host: Add support for CXL Fixed Memory Windows.
Jonathan Cameron writes: > From: Jonathan Cameron > > The concept of these is introduced in [1] in terms of the > description provided by the CEDT ACPI table. The principle is more general. > Unlike the routing once traffic hits the CXL root bridges, the host system > memory address routing is implementation defined and effectively > static once observable by standard / generic system software. > Each CXL Fixed Memory Window (CFMW) is a region of PA space > which has fixed system dependent routing configured so that > accesses can be routed to the CXL devices below a set of target > root bridges. The accesses may be interleaved across multiple > root bridges. > > For QEMU we could have fully specified these regions in terms > of a base PA + size, but as the absolute address does not matter > it is simpler to let individual platforms place the memory regions. > > Examples: > -cxl-fixed-memory-window targets.0=cxl.0,size=128G > -cxl-fixed-memory-window targets.0=cxl.1,size=128G > -cxl-fixed-memory-window > targets.0=cxl.0,targets.1=cxl.1,size=256G,interleave-granularity=2k > > Specifies > * 2x 128G regions not interleaved across root bridges, one for each of > the root bridges with ids cxl.0 and cxl.1 > * 256G region interleaved across root bridges with ids cxl.0 and cxl.1 > with a 2k interleave granularity. > > When system software enumerates the devices below a given root bridge > it can then decide which CFMW to use. If non-interleaved operation is desired > (or possible) it can use the appropriate CFMW for the root bridge in > question. If there are suitable devices to interleave across the > two root bridges then it may use the 3rd CFMW. > > A number of other designs were considered but the following constraints > made it hard to adapt existing QEMU approaches to this particular problem. > 1) The size must be known before a specific architecture / board brings >up its PA memory map. We need to set up an appropriate region.
> 2) Using links to the host bridges provides a clean command line interface >but these links cannot be established until command line devices have >been added. > > Hence the two-step process used here of first establishing the size, > interleave-ways and granularity + caching the ids of the host bridges > and then, once available, finding the actual host bridges so they can > be used later to support interleave decoding. > > [1] CXL 2.0 ECN: CEDT CFMWS & QTG DSM (computeexpresslink.org / > specifications) > > Signed-off-by: Jonathan Cameron QAPI schema Acked-by: Markus Armbruster
Re: [PATCH] block/stream: Drain subtree around graph change
Am 05/04/2022 um 12:14 schrieb Kevin Wolf: > I think all of this is really relevant for Emanuele's work, which > involves adding AIO_WAIT_WHILE() deep inside graph update functions. I > fully expect that we would see very similar problems, and just stacking > drain sections over drain sections that might happen to usually fix > things, but aren't guaranteed to, doesn't look like a good solution. Yes, I think at this point we all agreed to drop subtree_drain as a replacement for the AioContext lock. The alternative is what Paolo proposed in the other thread "Removal of AioContext lock, bs->parents and ->children: proof of concept" I am not sure which thread you replied to first :) I think that proposal is not far from your idea, and it avoids introducing or even using drains at all. Not sure why you called it a "step backwards even from AioContext locks". Emanuele
[RFC PATCH 1/1] kvm-all.c: hint Valgrind that kvm_get_one_reg() inits memory
There are a lot of Valgrind warnings about conditional jumps depending on uninitialized values like this one (taken from a pSeries guest): Conditional jump or move depends on uninitialised value(s) at 0xB011DC: kvmppc_enable_cap_large_decr (kvm.c:2544) by 0x92F28F: cap_large_decr_cpu_apply (spapr_caps.c:523) by 0x930C37: spapr_caps_cpu_apply (spapr_caps.c:921) by 0x955D3B: spapr_reset_vcpu (spapr_cpu_core.c:73) (...) Uninitialised value was created by a stack allocation at 0xB01150: kvmppc_enable_cap_large_decr (kvm.c:2538) In this case, the alleged uninitialized value is the 'lpcr' variable that is written by kvm_get_one_reg() and then used in an if clause: int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable) { CPUState *cs = CPU(cpu); uint64_t lpcr; kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr); /* Do we need to modify the LPCR? */ if (!!(lpcr & LPCR_LD) != !!enable) { < Valgrind warns here (...) A quick fix is to init the variable that kvm_get_one_reg() is going to write ('lpcr' in the example above). Another idea is to convince Valgrind that kvm_get_one_reg() inits the 'void *target' memory in case the ioctl() is successful. This will put some boilerplate in the function but it will bring benefit for its other callers. This patch uses the memcheck VALGRIND_MAKE_MEM_DEFINED() macro to mark the 'target' variable as initialized if the ioctl is successful. Cc: Paolo Bonzini Signed-off-by: Daniel Henrique Barboza --- accel/kvm/kvm-all.c | 17 + 1 file changed, 17 insertions(+) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 5f1377ca04..d9acba23c7 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -53,6 +53,10 @@ #include <sys/eventfd.h> #endif +#ifdef CONFIG_VALGRIND_H +#include <valgrind/memcheck.h> +#endif + /* KVM uses PAGE_SIZE in its definition of KVM_COALESCED_MMIO_MAX. We * need to use the real host PAGE_SIZE, as that's what KVM will use.
*/ @@ -3504,6 +3508,19 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target) if (r) { trace_kvm_failed_reg_get(id, strerror(-r)); } + +#ifdef CONFIG_VALGRIND_H +if (r == 0) { +switch (id & KVM_REG_SIZE_MASK) { +case KVM_REG_SIZE_U32: +VALGRIND_MAKE_MEM_DEFINED(target, sizeof(uint32_t)); +break; +case KVM_REG_SIZE_U64: +VALGRIND_MAKE_MEM_DEFINED(target, sizeof(uint64_t)); +break; +} +} +#endif return r; } -- 2.35.1
[RFC PATCH 0/1] add Valgrind hint in kvm_get_one_reg()
Hi, Valgrind is not happy with how we're using KVM functions that receive a parameter by reference and write to it. This results in a lot of complaints about uninitialized values when using these functions because, by default, Valgrind doesn't know that the variable is being initialized in the function. This is the overall pattern that Valgrind does not like: --- uint64_t val; (...) kvm_get_one_reg(..., &val); if (val) {...} --- Valgrind complains that the 'if' clause is using an uninitialized variable. A quick fix is to init 'val' and be done with it. The drawback is that every single caller of kvm_get_one_reg() must also be bothered with initializing these variables to avoid the warnings. David suggested in [1] that, instead, we should add a Valgrind hint in the common KVM functions to fix this issue for everyone. This is what this patch accomplishes. kvm_get_one_reg() has 20+ callers so I believe this extra boilerplate is worth the benefits. There are more common instances of KVM functions that Valgrind complains about. If we're good with the approach taken here we can think about adding this hint for more functions. [1] https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg07351.html Daniel Henrique Barboza (1): kvm-all.c: hint Valgrind that kvm_get_one_reg() inits memory accel/kvm/kvm-all.c | 17 + 1 file changed, 17 insertions(+) -- 2.35.1
Re: [PULL 0/2] target-arm queue
On Tue, 5 Apr 2022 at 10:26, Peter Maydell wrote: > > Couple of trivial fixes for rc3... > > The following changes since commit 20661b75ea6093f5e59079d00a778a972d6732c5: > > Merge tag 'pull-ppc-20220404' of https://github.com/legoater/qemu into > staging (2022-04-04 15:48:55 +0100) > > are available in the Git repository at: > > https://git.linaro.org/people/pmaydell/qemu-arm.git > tags/pull-target-arm-20220405 > > for you to fetch changes up to 80b952bb694a90f7e530d407b01066894e64a443: > > docs/system/devices/can.rst: correct links to CTU CAN FD IP core > documentation. (2022-04-05 09:29:28 +0100) > > > target-arm queue: > * docs/system/devices/can.rst: correct links to CTU CAN FD IP core > documentation. > * xlnx-bbram: hw/nvram: Fix uninitialized Error * > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0 for any user-visible changes. -- PMM
Re: [RFC PATCH] python: add qmp-send program to send raw qmp commands to qemu
On 4/5/22 07:41, Markus Armbruster wrote: Daniel P. Berrangé writes: On Wed, Mar 16, 2022 at 10:54:55AM +0100, Damien Hedde wrote: It takes an input file containing raw QMP commands (concatenated JSON dicts) and sends all commands one by one to a QMP server. When one command fails, it exits. As a convenience, it can also wrap the qemu process to avoid having to start qemu in the background. When wrapping qemu, the program returns only when the qemu process terminates. Signed-off-by: Damien Hedde [...] I named it qmp-send as Daniel proposed; maybe qmp-test better matches what I'm doing there? 'qmp-test' is a use case specific name. I think it is better to name it based on functionality provided rather than anticipated use case, since use cases evolve over time, hence 'qmp-send'. Well, it doesn't just send, it also receives. qmpcat, like netcat and socat? Anyone against qmpcat? -- Damien
Re: [PATCH v2] hw/ppc/ppc405_boards: Initialize g_autofree pointer
On Tue, 5 Apr 2022 at 13:40, Bernhard Beschow wrote: > > Resolves the only compiler warning when building a full QEMU under Arch Linux: > > Compiling C object libqemu-ppc-softmmu.fa.p/hw_ppc_ppc405_boards.c.o > In file included from /usr/include/glib-2.0/glib.h:114, >from qemu/include/glib-compat.h:32, >from qemu/include/qemu/osdep.h:132, >from ../src/hw/ppc/ppc405_boards.c:25: > ../src/hw/ppc/ppc405_boards.c: In function ‘ref405ep_init’: > /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: warning: ‘filename’ > may be used uninitialized in this function [-Wmaybe-uninitialized] > 28 | g_free (*pp); > | ^~~~ > ../src/hw/ppc/ppc405_boards.c:265:26: note: ‘filename’ was declared here > 265 | g_autofree char *filename; > | ^~~~ > > Signed-off-by: Bernhard Beschow > --- Reviewed-by: Peter Maydell thanks -- PMM
[PATCH v2] hw/ppc/ppc405_boards: Initialize g_autofree pointer
Resolves the only compiler warning when building a full QEMU under Arch Linux: Compiling C object libqemu-ppc-softmmu.fa.p/hw_ppc_ppc405_boards.c.o In file included from /usr/include/glib-2.0/glib.h:114, from qemu/include/glib-compat.h:32, from qemu/include/qemu/osdep.h:132, from ../src/hw/ppc/ppc405_boards.c:25: ../src/hw/ppc/ppc405_boards.c: In function ‘ref405ep_init’: /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: warning: ‘filename’ may be used uninitialized in this function [-Wmaybe-uninitialized] 28 | g_free (*pp); | ^~~~ ../src/hw/ppc/ppc405_boards.c:265:26: note: ‘filename’ was declared here 265 | g_autofree char *filename; | ^~~~ Signed-off-by: Bernhard Beschow --- hw/ppc/ppc405_boards.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c index 7e1a4ac955..3bed7002d2 100644 --- a/hw/ppc/ppc405_boards.c +++ b/hw/ppc/ppc405_boards.c @@ -262,13 +262,13 @@ static void ref405ep_init(MachineState *machine) /* allocate and load BIOS */ if (machine->firmware) { MemoryRegion *bios = g_new(MemoryRegion, 1); -g_autofree char *filename; +g_autofree char *filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, + machine->firmware); long bios_size; memory_region_init_rom(bios, NULL, "ef405ep.bios", BIOS_SIZE, _fatal); -filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, machine->firmware); if (!filename) { error_report("Could not find firmware '%s'", machine->firmware); exit(1); -- 2.35.1
Re: [PATCH] hw/ppc/ppc405_boards: Initialize g_autofree pointer
Am 5. April 2022 12:00:19 UTC schrieb Peter Maydell : >On Tue, 5 Apr 2022 at 12:32, Bernhard Beschow wrote: >> >> Resolves the only compiler warning when building a full QEMU under Arch >> Linux: >> >> Compiling C object libqemu-ppc-softmmu.fa.p/hw_ppc_ppc405_boards.c.o >> In file included from /usr/include/glib-2.0/glib.h:114, >>from qemu/include/glib-compat.h:32, >>from qemu/include/qemu/osdep.h:132, >>from ../src/hw/ppc/ppc405_boards.c:25: >> ../src/hw/ppc/ppc405_boards.c: In function ‘ref405ep_init’: >> /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: warning: ‘filename’ >> may be used uninitialized in this function [-Wmaybe-uninitialized] >> 28 | g_free (*pp); >> | ^~~~ >> ../src/hw/ppc/ppc405_boards.c:265:26: note: ‘filename’ was declared here >> 265 | g_autofree char *filename; >> | ^~~~ >> >> Signed-off-by: Bernhard Beschow >> --- >> hw/ppc/ppc405_boards.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c >> index 7e1a4ac955..326353ea25 100644 >> --- a/hw/ppc/ppc405_boards.c >> +++ b/hw/ppc/ppc405_boards.c >> @@ -262,7 +262,7 @@ static void ref405ep_init(MachineState *machine) >> /* allocate and load BIOS */ >> if (machine->firmware) { >> MemoryRegion *bios = g_new(MemoryRegion, 1); >> -g_autofree char *filename; >> +g_autofree char *filename = NULL; >> long bios_size; >> >> memory_region_init_rom(bios, NULL, "ef405ep.bios", BIOS_SIZE, > >The compiler's wrong here, because there's no way to get to the free >without passing through the actual initialization: Yep. It breaks compilation with -Werror, though, which is useful for development. 
> >filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, machine->firmware); > >I think I would prefer a fix which hoisted that up to the declaration, >rather than setting it to NULL and then unconditionally overwriting that >(which some future compiler version might notice and warn about): > > g_autofree char *filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, >machine->firmware); Ack - I prefer that solution and I'll submit v2. I'm often confused as to when to use RAII in QEMU and when not to. Best regards, Bernhard > >thanks >-- PMM
[PATCH] [PATCH RFC v3] Implements Backend Program conventions for vhost-user-scsi
Signed-off-by: Sakshi Kaushik --- contrib/vhost-user-scsi/vhost-user-scsi.c | 76 +++ 1 file changed, 51 insertions(+), 25 deletions(-) diff --git a/contrib/vhost-user-scsi/vhost-user-scsi.c b/contrib/vhost-user-scsi/vhost-user-scsi.c index 4f6e3e2a24..74ec44d190 100644 --- a/contrib/vhost-user-scsi/vhost-user-scsi.c +++ b/contrib/vhost-user-scsi/vhost-user-scsi.c @@ -351,34 +351,58 @@ fail: /** vhost-user-scsi **/ +int opt_fdnum = -1; +char *opt_socket_path; +gboolean opt_print_caps; +char *iscsi_uri; + +static GOptionEntry entries[] = { +{ "print-capabilities", 'c', 0, G_OPTION_ARG_NONE, &opt_print_caps, + "Print capabilities", NULL }, +{ "fd", 'f', 0, G_OPTION_ARG_INT, &opt_fdnum, + "Use inherited fd socket", "FDNUM" }, +{ "iscsi_uri", 'i', 0, G_OPTION_ARG_FILENAME, &iscsi_uri, + "Use inherited fd socket", "FDNUM" }, +{ "socket-path", 's', 0, G_OPTION_ARG_FILENAME, &opt_socket_path, + "Use UNIX socket path", "PATH" } +}; + int main(int argc, char **argv) { VusDev *vdev_scsi = NULL; -char *unix_fn = NULL; -char *iscsi_uri = NULL; -int lsock = -1, csock = -1, opt, err = EXIT_SUCCESS; - -while ((opt = getopt(argc, argv, "u:i:")) != -1) { -switch (opt) { -case 'h': -goto help; -case 'u': -unix_fn = g_strdup(optarg); -break; -case 'i': -iscsi_uri = g_strdup(optarg); -break; -default: -goto help; -} +int lsock = -1, csock = -1, err = EXIT_SUCCESS; + +GError *error = NULL; +GOptionContext *context; + +context = g_option_context_new(NULL); +g_option_context_add_main_entries(context, entries, NULL); +if (!g_option_context_parse(context, &argc, &argv, &error)) { +g_printerr("Option parsing failed: %s\n", error->message); +exit(EXIT_FAILURE); } -if (!unix_fn || !iscsi_uri) { + +if (opt_print_caps) { +g_print("{\n"); +g_print(" \"type\": \"scsi\",\n"); +g_print("}\n"); +goto out; +} + +if (!opt_socket_path || !iscsi_uri) { goto help; } -lsock = unix_sock_new(unix_fn); -if (lsock < 0) { -goto err; +if (opt_socket_path) { +lsock = unix_sock_new(opt_socket_path); +if (lsock < 0) { +exit(EXIT_FAILURE); +} }
else if (opt_fdnum < 0) { +g_print("%s\n", g_option_context_get_help(context, true, NULL)); +exit(EXIT_FAILURE); +} else { +lsock = opt_fdnum; } csock = accept(lsock, NULL, NULL); @@ -408,7 +432,7 @@ out: if (vdev_scsi) { g_main_loop_unref(vdev_scsi->loop); g_free(vdev_scsi); -unlink(unix_fn); +unlink(opt_socket_path); } if (csock >= 0) { close(csock); @@ -416,7 +440,7 @@ out: if (lsock >= 0) { close(lsock); } -g_free(unix_fn); +g_free(opt_socket_path); g_free(iscsi_uri); return err; @@ -426,10 +450,12 @@ err: goto out; help: -fprintf(stderr, "Usage: %s [ -u unix_sock_path -i iscsi_uri ] | [ -h ]\n", +fprintf(stderr, "Usage: %s [ -s socket-path -i iscsi_uri -f fd -p print-capabilities ] | [ -h ]\n", argv[0]); -fprintf(stderr, " -u path to unix socket\n"); +fprintf(stderr, " -s path to unix socket\n"); fprintf(stderr, " -i iscsi uri for lun 0\n"); +fprintf(stderr, " -f fd, file-descriptor\n"); +fprintf(stderr, " -p denotes print-capabilities\n"); fprintf(stderr, " -h print help and quit\n"); goto err; -- 2.17.1
Re: [PATCH] block/stream: Drain subtree around graph change
05.04.2022 13:14, Kevin Wolf wrote:

Am 24.03.2022 um 13:57 hat Hanna Reitz geschrieben:

When the stream block job cuts out the nodes between top and base in stream_prepare(), it does not drain the subtree manually; it fetches the base node, and tries to insert it as the top node's backing node with bdrv_set_backing_hd(). bdrv_set_backing_hd() however will drain, and so the actual base node might change (because the base node is actually not part of the stream job) before the old base node passed to bdrv_set_backing_hd() is installed.

This has two implications: First, the stream job does not keep a strong reference to the base node. Therefore, if it is deleted in bdrv_set_backing_hd()'s drain (e.g. because some other block job is drained to finish), we will get a use-after-free. We should keep a strong reference to that node.

Second, even with such a strong reference, the problem remains that the base node might change before bdrv_set_backing_hd() actually runs and as a result the wrong base node is installed.

Both effects can be seen in 030's TestParallelOps.test_overlapping_5() case, which has five nodes, and simultaneously streams from the middle node to the top node, and commits the middle node down to the base node. As it is, this will sometimes crash, namely when we encounter the above-described use-after-free.

Taking a strong reference to the base node, we no longer get a crash, but the resulting block graph is less than ideal: The expected result is obviously that all middle nodes are cut out and the base node is the immediate backing child of the top node. However, if stream_prepare() takes a strong reference to its base node (the middle node), and then the commit job finishes in bdrv_set_backing_hd(), supposedly dropping that middle node, the stream job will just reinstall it again.

Therefore, we need to keep the whole subtree drained in stream_prepare()

That doesn't sound right.
I think in reality it's "if we take the really big hammer and drain the whole subtree, then the bit that we really need usually happens to be covered, too".

When you have a long backing chain and merge the two topmost overlays with streaming, then it's none of the stream job's business whether there is I/O going on for the base image way down the chain. Subtree drains do much more than they should in this case.

At the same time they probably do too little, because what you're describing you're protecting against is not I/O, but graph modifications done by callbacks invoked in the AIO_WAIT_WHILE() when replacing the backing file. The callback could be invoked by I/O on an entirely different subgraph (maybe if the other thing is a mirror job) or it could be a BH or anything else really. bdrv_drain_all() would increase your chances, but I'm not sure if even that would be guaranteed to be enough - because it's really another instance of abusing drain for locking, we're not really interested in the _I/O_ of the node.

so that the graph modification it performs is effectively atomic, i.e. that the base node it fetches is still the base node when bdrv_set_backing_hd() sets it as the top node's backing node.

I think the way to keep graph modifications atomic is avoid polling in the middle. Not even running any callbacks is a lot safer than trying to make sure there can't be undesired callbacks that want to run. So probably adding drain (or anything else that polls) in bdrv_set_backing_hd() was a bad idea. It could assert that the parent node is drained, but it should be the caller's responsibility to do so.

What streaming completion should look like is probably something like this:

1. Drain above_base, this also drains all parents up to the top node (needed because in-flight I/O using an edge that is removed isn't going to end well)
2. Without any polling involved:
   a. Find base (it can't change without polling)
   b. Update top->backing to point to base
3. End of drain.
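The three steps above can be sketched as pseudocode. The function names are assumptions for illustration, not the real QEMU API; the only point is that nothing between steps 2a and 2b polls or runs callbacks:

```
/* Pseudocode -- names are assumed, not actual QEMU functions */
stream_complete_atomically(top, above_base):
    bdrv_drained_begin(above_base)     /* 1. also quiesces parents up to top */
    base = backing_of(above_base)      /* 2a. cannot change: no polling here */
    set_backing_edge(top, base)        /* 2b. pure pointer update, no drain  */
    bdrv_drained_end(above_base)       /* 3. resume I/O                      */
```

Because no AIO_WAIT_WHILE() runs between 2a and 2b, no concurrently finishing job can swap base out from under the update, which is exactly the race the patch under discussion tries to paper over with a subtree drain.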
You don't have to keep extra references or deal with surprise removals of nodes because the whole thing is atomic when you don't poll. Other threads can't interfere either because graph modification requires the BQL.

There is no reason to keep base drained because its I/O doesn't interfere with the incoming edge that we're changing.

I think all of this is really relevant for Emanuele's work, which involves adding AIO_WAIT_WHILE() deep inside graph update functions. I fully expect that we would see very similar problems, and just stacking drain sections over drain sections that might happen to usually fix things, but aren't guaranteed to, doesn't look like a good solution.

Thanks Kevin! I have already run out of arguments in the battle against using subtree-drains to isolate graph modification operations from each other in different threads in the mailing list)

(Note also, that the top-most version of this patch is "[PATCH v2] block/stream: Drain subtree around graph change")

About
Re: [PATCH] block/stream: Drain subtree around graph change
On 05.04.22 13:47, Hanna Reitz wrote:

On 05.04.22 12:14, Kevin Wolf wrote:

[...]

At the same time they probably do too little, because what you're describing you're protecting against is not I/O, but graph modifications done by callbacks invoked in the AIO_WAIT_WHILE() when replacing the backing file. The callback could be invoked by I/O on an entirely different subgraph (maybe if the other thing is a mirror job) or it could be a BH or anything else really. bdrv_drain_all() would increase your chances, but I'm not sure if even that would be guaranteed to be enough - because it's really another instance of abusing drain for locking, we're not really interested in the _I/O_ of the node.

[...]

I’m not sure what you’re arguing for, so I can only assume. Perhaps you’re arguing for reverting this patch, which I wouldn’t want to do, because at least it fixes the one known use-after-free case. Perhaps you’re arguing that we need something better, and then I completely agree.

Perhaps I should also note that what actually fixes the use-after-free is the bdrv_ref()/unref() pair. The drained section is just there to ensure that the graph is actually correct (i.e. if a concurrently finishing job removes @base before the stream job’s bdrv_set_backing_hd() can set it as the top node’s backing node, that we won’t reinstate this @base that the other job just removed). So even if this does too little, at least there won’t be a use-after-free.

OTOH, if it does much too much, we can drop the drain and keep the ref/unref. I don’t want to have a release with a use-after-free that I know of, but I’d be fine if the block graph is “just” outdated.

Hanna
Re: [PATCH] hw/ppc/ppc405_boards: Initialize g_autofree pointer
On Tue, 5 Apr 2022 at 12:32, Bernhard Beschow wrote:
>
> Resolves the only compiler warning when building a full QEMU under Arch Linux:
>
>   Compiling C object libqemu-ppc-softmmu.fa.p/hw_ppc_ppc405_boards.c.o
> In file included from /usr/include/glib-2.0/glib.h:114,
>                  from qemu/include/glib-compat.h:32,
>                  from qemu/include/qemu/osdep.h:132,
>                  from ../src/hw/ppc/ppc405_boards.c:25:
> ../src/hw/ppc/ppc405_boards.c: In function ‘ref405ep_init’:
> /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: warning: ‘filename’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>    28 |   g_free (*pp);
>       |   ^~~~
> ../src/hw/ppc/ppc405_boards.c:265:26: note: ‘filename’ was declared here
>   265 |     g_autofree char *filename;
>       |                      ^~~~
>
> Signed-off-by: Bernhard Beschow
> ---
>  hw/ppc/ppc405_boards.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
> index 7e1a4ac955..326353ea25 100644
> --- a/hw/ppc/ppc405_boards.c
> +++ b/hw/ppc/ppc405_boards.c
> @@ -262,7 +262,7 @@ static void ref405ep_init(MachineState *machine)
>      /* allocate and load BIOS */
>      if (machine->firmware) {
>          MemoryRegion *bios = g_new(MemoryRegion, 1);
> -        g_autofree char *filename;
> +        g_autofree char *filename = NULL;
>          long bios_size;
>
>          memory_region_init_rom(bios, NULL, "ef405ep.bios", BIOS_SIZE,

The compiler's wrong here, because there's no way to get to the free without passing through the actual initialization:

    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, machine->firmware);

I think I would prefer a fix which hoisted that up to the declaration, rather than setting it to NULL and then unconditionally overwriting that (which some future compiler version might notice and warn about):

    g_autofree char *filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, machine->firmware);

thanks
-- PMM
Re: [PATCH] ui/cursor: fix integer overflow in cursor_alloc (CVE-2022-4206)
On Tue, 5 Apr 2022 at 11:50, Mauro Matteo Cascella wrote:
>
> Prevent potential integer overflow by limiting 'width' and 'height' to
> 512x512. Also change 'datasize' type to size_t. Refer to security
> advisory https://starlabs.sg/advisories/22-4206/ for more information.
>
> Fixes: CVE-2022-4206
> Signed-off-by: Mauro Matteo Cascella
>
> diff --git a/ui/cursor.c b/ui/cursor.c
> index 1d62ddd4d0..7cfb08a030 100644
> --- a/ui/cursor.c
> +++ b/ui/cursor.c
> @@ -46,6 +46,13 @@ static QEMUCursor *cursor_parse_xpm(const char *xpm[])
>
>      /* parse pixel data */
>      c = cursor_alloc(width, height);
> +
> +    if (!c) {
> +        fprintf(stderr, "%s: cursor %ux%u alloc error\n",
> +                __func__, width, height);
> +        return NULL;

Side note, we should probably clean up the error handling in this file to not be "print to stderr" at some point...

> +    }
> +
>      for (pixel = 0, y = 0; y < height; y++, line++) {
>          for (x = 0; x < height; x++, pixel++) {
>              idx = xpm[line][x];
> @@ -91,7 +98,10 @@ QEMUCursor *cursor_builtin_left_ptr(void)
>  QEMUCursor *cursor_alloc(int width, int height)
>  {
>      QEMUCursor *c;
> -    int datasize = width * height * sizeof(uint32_t);
> +    size_t datasize = width * height * sizeof(uint32_t);
> +
> +    if (width > 512 || height > 512)
> +        return NULL;

Coding style requires braces on if statements.

thanks
-- PMM
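To see the overflow being guarded against: in `width * height * sizeof(uint32_t)`, the two int operands are multiplied first, so a large attacker-controlled cursor size can wrap before the result is ever widened, yielding a short allocation followed by out-of-bounds writes. A sketch of the checked computation (a hypothetical helper for illustration, not the QEMU function; 0 here stands in for cursor_alloc()'s new NULL return):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Reject out-of-range dimensions before multiplying, and do the
 * multiplication in size_t.  With both dimensions capped at 512 the
 * product is at most 512 * 512 * 4 = 1 MiB, nowhere near overflow.
 */
static size_t cursor_datasize_checked(int width, int height)
{
    if (width < 1 || height < 1 || width > 512 || height > 512) {
        return 0;   /* caller treats this like cursor_alloc() returning NULL */
    }
    return (size_t)width * height * sizeof(uint32_t);
}
```

The cast on the first operand is what forces the whole chain into size_t arithmetic; the explicit bound check is still needed on 32-bit hosts, where size_t is no wider than the int product.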
Re: [PATCH] block/stream: Drain subtree around graph change
On 05.04.22 12:14, Kevin Wolf wrote:

Am 24.03.2022 um 13:57 hat Hanna Reitz geschrieben:

When the stream block job cuts out the nodes between top and base in stream_prepare(), it does not drain the subtree manually; it fetches the base node, and tries to insert it as the top node's backing node with bdrv_set_backing_hd(). bdrv_set_backing_hd() however will drain, and so the actual base node might change (because the base node is actually not part of the stream job) before the old base node passed to bdrv_set_backing_hd() is installed.

This has two implications: First, the stream job does not keep a strong reference to the base node. Therefore, if it is deleted in bdrv_set_backing_hd()'s drain (e.g. because some other block job is drained to finish), we will get a use-after-free. We should keep a strong reference to that node.

Second, even with such a strong reference, the problem remains that the base node might change before bdrv_set_backing_hd() actually runs and as a result the wrong base node is installed.

Both effects can be seen in 030's TestParallelOps.test_overlapping_5() case, which has five nodes, and simultaneously streams from the middle node to the top node, and commits the middle node down to the base node. As it is, this will sometimes crash, namely when we encounter the above-described use-after-free.

Taking a strong reference to the base node, we no longer get a crash, but the resulting block graph is less than ideal: The expected result is obviously that all middle nodes are cut out and the base node is the immediate backing child of the top node. However, if stream_prepare() takes a strong reference to its base node (the middle node), and then the commit job finishes in bdrv_set_backing_hd(), supposedly dropping that middle node, the stream job will just reinstall it again.

Therefore, we need to keep the whole subtree drained in stream_prepare()

That doesn't sound right.
I think in reality it's "if we take the really big hammer and drain the whole subtree, then the bit that we really need usually happens to be covered, too".

When you have a long backing chain and merge the two topmost overlays with streaming, then it's none of the stream job's business whether there is I/O going on for the base image way down the chain. Subtree drains do much more than they should in this case.

Yes, see the discussion I had with Vladimir. He convinced me that this can’t be an indefinite solution, but that we need locking for graph changes that’s separate from draining, because (1) those are different things, and (2) changing the graph should influence I/O as little as possible. I found this the best solution to fix a known case of a use-after-free for 7.1, though.

At the same time they probably do too little, because what you're describing you're protecting against is not I/O, but graph modifications done by callbacks invoked in the AIO_WAIT_WHILE() when replacing the backing file. The callback could be invoked by I/O on an entirely different subgraph (maybe if the other thing is a mirror job) or it could be a BH or anything else really. bdrv_drain_all() would increase your chances, but I'm not sure if even that would be guaranteed to be enough - because it's really another instance of abusing drain for locking, we're not really interested in the _I/O_ of the node.

The most common instances of graph modification I see are QMP and block jobs finishing. The former will not be deterred by draining, and we do know of one instance where that is a problem (see the bdrv_next() discussion). Generally, it isn’t though. (If it is, this case here won’t be the only thing that breaks.)

As for the latter, most block jobs are parents of the nodes they touch (stream is one notable exception with how it handles its base, and changing that did indeed cause us headache before), and so will at least be paused when a drain occurs on a node they touch.
Since pausing doesn’t affect jobs that have exited their main loop, there might be some problem with concurrent jobs that are also finished but yielding, but I couldn’t find such a case.

I’m not sure what you’re arguing for, so I can only assume. Perhaps you’re arguing for reverting this patch, which I wouldn’t want to do, because at least it fixes the one known use-after-free case. Perhaps you’re arguing that we need something better, and then I completely agree.

so that the graph modification it performs is effectively atomic, i.e. that the base node it fetches is still the base node when bdrv_set_backing_hd() sets it as the top node's backing node.

I think the way to keep graph modifications atomic is avoid polling in the middle. Not even running any callbacks is a lot safer than trying to make sure there can't be undesired callbacks that want to run. So probably adding drain (or anything else that polls) in bdrv_set_backing_hd() was a bad idea. It could assert that the parent node is drained, but it should be the
Re: [PULL 0/3] Misc changes for 2022-04-05
On Tue, 5 Apr 2022 at 10:25, Paolo Bonzini wrote:
>
> The following changes since commit 20661b75ea6093f5e59079d00a778a972d6732c5:
>
>   Merge tag 'pull-ppc-20220404' of https://github.com/legoater/qemu into staging (2022-04-04 15:48:55 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 776a6a32b4982a68d3b7a77cbfaae6c2b363a0b8:
>
>   docs/system/i386: Add measurement calculation details to amd-memory-encryption (2022-04-05 10:42:06 +0200)
>
> * fix vss-win32 compilation with clang++
> * update Coverity model
> * add measurement calculation to amd-memory-encryption docs

Hi; this tag doesn't match what your pullreq cover letter claims it is -- it is pointing at 267b85d4e3d15, not 776a6a32b49, and it has way more than 3 patches in it.

thanks
-- PMM