from:"Michael S. Tsirkin"

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Tue, Apr 19, 2016 at 11:01:38AM -0700, Andy Lutomirski wrote:
> On Tue, Apr 19, 2016 at 10:49 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Tue, Apr 19, 2016 at 12:26:44PM -0400, David Woodhouse wrote:
> >> On Tue, 2016-04-19 at 19:20 +0300, Michael S. Tsirkin wrote:
> >> >
> >> > > I thought that PLATFORM served that purpose.  Wouldn't the host
> >> > > advertise PLATFORM support and, if the guest doesn't ack it, the host
> >> > > device would skip translation?  Or is that problematic for vfio?
> >> >
> >> > Exactly that's problematic for security.
> >> > You can't allow guest driver to decide whether device skips security.
> >>
> >> Right. Because fundamentally, this *isn't* a property of the endpoint
> >> device, and doesn't live in virtio itself.
> >>
> >> It's a property of the platform IOMMU, and lives there.
> >
> > It's a property of the hypervisor virtio implementation, and lives there.
> 
> It is now, but QEMU could, in principle, change the way it thinks
> about it so that virtio devices would use the QEMU DMA API but ask
> QEMU to pass everything through 1:1.  This would be entirely invisible
> to guests but would make it be a property of the IOMMU implementation.
> At that point, maybe QEMU could find a (platform dependent) way to
> tell the guest what's going on.
> 
> FWIW, as far as I can tell, PPC and SPARC really could, in principle,
> set up 1:1 mappings in the guest so that the virtio devices would work
> regardless of whether QEMU is ignoring the IOMMU or not -- I think the
> only obstacle is that the PPC and SPARC 1:1 mappings are currently set
> up with an offset.  I don't know too much about those platforms, but
> presumably the layout could be changed so that 1:1 really was 1:1.
> 
> --Andy

Sure. Do you see any reason why the decision to do this can't be
keyed off the virtio feature bit?

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Tue, Apr 19, 2016 at 12:26:44PM -0400, David Woodhouse wrote:
> On Tue, 2016-04-19 at 19:20 +0300, Michael S. Tsirkin wrote:
> > 
> > > I thought that PLATFORM served that purpose.  Wouldn't the host
> > > advertise PLATFORM support and, if the guest doesn't ack it, the host
> > > device would skip translation?  Or is that problematic for vfio?
> > 
> > Exactly that's problematic for security.
> > You can't allow guest driver to decide whether device skips security.
> 
> Right. Because fundamentally, this *isn't* a property of the endpoint
> device, and doesn't live in virtio itself.
> 
> It's a property of the platform IOMMU, and lives there.

It's a property of the hypervisor virtio implementation, and lives there.

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Tue, Apr 19, 2016 at 09:12:03AM -0700, Andy Lutomirski wrote:
> On Tue, Apr 19, 2016 at 9:09 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Tue, Apr 19, 2016 at 09:02:14AM -0700, Andy Lutomirski wrote:
> >> On Tue, Apr 19, 2016 at 3:27 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> >> > On Mon, Apr 18, 2016 at 12:24:15PM -0700, Andy Lutomirski wrote:
> >> >> On Mon, Apr 18, 2016 at 11:29 AM, David Woodhouse <dw...@infradead.org> wrote:
> >> >> > For x86, you *can* enable virtio-behind-IOMMU if your DMAR tables tell
> >> >> > the truth, and even legacy kernels ought to cope with that.
> >> >> > FSVO 'ought to' where I suspect some of them will actually crash with a
> >> >> > NULL pointer dereference if there's no "catch-all" DMAR unit in the
> >> >> > tables, which puts it back into the same camp as ARM and Power.
> >> >>
> >> >> I think x86 may get a bit of a free pass here.  AFAIK the QEMU IOMMU
> >> >> implementation on x86 has always been "experimental", so it just might
> >> >> be okay to change it in a way that causes some older kernels to OOPS.
> >> >>
> >> >> --Andy
> >> >
> >> > Since it's experimental, it might be OK to change *guest kernels*
> >> > such that they oops on old QEMU.
> >> > But guest kernels were not experimental - so we need a QEMU mode that
> >> > makes them work fine. The more functionality is available in this QEMU
> >> > mode, the better, because it's going to be the default for a while. For
> >> > the same reason, it is preferable to also have new kernels not crash in
> >> > this mode.
> >> >
> >>
> >> People add QEMU features that need new guest kernels all the time.
> >> If you enable virtio-scsi and try to boot a guest that's too old, it
> >> won't work.  So I don't see anything fundamentally wrong with saying
> >> that the non-experimental QEMU Q35 IOMMU mode won't boot if the guest
> >> kernel is too old.  It might be annoying, since old kernels do work on
> >> actual Q35 hardware, but it at least seems to be that it might be
> >> okay.
> >>
> >> --Andy
> >
> > Yes but we need a mode that makes both old and new kernels work, and
> > that should be the default for a while.  This is what the
> > IOMMU_PASSTHROUGH flag was about: old kernels ignore it and bypass DMA
> > API, new kernels go "oh compatibility mode" and bypass the IOMMU
> > within DMA API.
> 
> I thought that PLATFORM served that purpose.  Wouldn't the host
> advertise PLATFORM support and, if the guest doesn't ack it, the host
> device would skip translation?  Or is that problematic for vfio?

Exactly that's problematic for security.
You can't allow guest driver to decide whether device skips security.

> >
> > --
> > MST
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Tue, Apr 19, 2016 at 09:02:14AM -0700, Andy Lutomirski wrote:
> On Tue, Apr 19, 2016 at 3:27 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Mon, Apr 18, 2016 at 12:24:15PM -0700, Andy Lutomirski wrote:
> >> On Mon, Apr 18, 2016 at 11:29 AM, David Woodhouse <dw...@infradead.org> wrote:
> >> > For x86, you *can* enable virtio-behind-IOMMU if your DMAR tables tell
> >> > the truth, and even legacy kernels ought to cope with that.
> >> > FSVO 'ought to' where I suspect some of them will actually crash with a
> >> > NULL pointer dereference if there's no "catch-all" DMAR unit in the
> >> > tables, which puts it back into the same camp as ARM and Power.
> >>
> >> I think x86 may get a bit of a free pass here.  AFAIK the QEMU IOMMU
> >> implementation on x86 has always been "experimental", so it just might
> >> be okay to change it in a way that causes some older kernels to OOPS.
> >>
> >> --Andy
> >
> > Since it's experimental, it might be OK to change *guest kernels*
> > such that they oops on old QEMU.
> > But guest kernels were not experimental - so we need a QEMU mode that
> > makes them work fine. The more functionality is available in this QEMU
> > mode, the better, because it's going to be the default for a while. For
> > the same reason, it is preferable to also have new kernels not crash in
> > this mode.
> >
> 
> People add QEMU features that need new guest kernels all the time.
> If you enable virtio-scsi and try to boot a guest that's too old, it
> won't work.  So I don't see anything fundamentally wrong with saying
> that the non-experimental QEMU Q35 IOMMU mode won't boot if the guest
> kernel is too old.  It might be annoying, since old kernels do work on
> actual Q35 hardware, but it at least seems to be that it might be
> okay.
> 
> --Andy

Yes but we need a mode that makes both old and new kernels work, and
that should be the default for a while.  This is what the
IOMMU_PASSTHROUGH flag was about: old kernels ignore it and bypass DMA
API, new kernels go "oh compatibility mode" and bypass the IOMMU
within DMA API.

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Tue, Apr 19, 2016 at 09:00:27AM -0700, Andy Lutomirski wrote:
> On Apr 19, 2016 2:13 AM, "Michael S. Tsirkin" <m...@redhat.com> wrote:
> >
> >
> > I guess you are right in that we should split this part out.
> > What I wanted is really the combination
> > PASSTHROUGH && !PLATFORM so that we can say "ok we don't
> > need to guess, this device actually bypasses the IOMMU".
> 
> What happens when you use a device like this on Xen or with a similar
> software translation layer?

I think you don't use it on Xen since virtio doesn't bypass an IOMMU there.
If you do you have misconfigured your device.

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 12:24:15PM -0700, Andy Lutomirski wrote:
> On Mon, Apr 18, 2016 at 11:29 AM, David Woodhouse <dw...@infradead.org> wrote:
> > For x86, you *can* enable virtio-behind-IOMMU if your DMAR tables tell
> > the truth, and even legacy kernels ought to cope with that.
> > FSVO 'ought to' where I suspect some of them will actually crash with a
> > NULL pointer dereference if there's no "catch-all" DMAR unit in the
> > tables, which puts it back into the same camp as ARM and Power.
> 
> I think x86 may get a bit of a free pass here.  AFAIK the QEMU IOMMU
> implementation on x86 has always been "experimental", so it just might
> be okay to change it in a way that causes some older kernels to OOPS.
> 
> --Andy

Since it's experimental, it might be OK to change *guest kernels*
such that they oops on old QEMU.
But guest kernels were not experimental - so we need a QEMU mode that
makes them work fine. The more functionality is available in this QEMU
mode, the better, because it's going to be the default for a while. For
the same reason, it is preferable to also have new kernels not crash in
this mode.

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-19 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 02:29:33PM -0400, David Woodhouse wrote:
> On Mon, 2016-04-18 at 19:27 +0300, Michael S. Tsirkin wrote:
> > I balk at adding more hacks to a broken system. My goals are
> > merely to
> > - make things work correctly with an IOMMU and new guests,
> >   so people can use userspace drivers with virtio devices
> > - prevent security risks when guest kernel mistakenly thinks
> >   it's protected by an IOMMU, but in fact isn't
> > - avoid breaking any working configurations
> 
> AFAICT the VIRTIO_F_IOMMU_PASSTHROUGH thing seems orthogonal to this.
> That's just an optimisation, for telling an OS "you don't really need
> to bother with the IOMMU, even though it works".
> 
> There are two main reasons why an operating system might want to use
> the IOMMU via the DMA API for native drivers: 
>  - To protect against driver bugs triggering rogue DMA.
>  - To protect against hardware (or firmware) bugs.
> 
> With virtio, the first reason still exists. But the second is moot
> because the device is part of the hypervisor and if the hypervisor is
> untrustworthy then you're screwed anyway... but then again, in SoC
> devices you could replace 'hypervisor' with 'chip' and the same is
> true, isn't it? Is there *really* anything virtio-specific here?
>
> Sure, I want my *external* network device on a PCIe card with software-
> loadable firmware to be behind an IOMMU because I don't trust it as far
> as I can throw it. But for on-SoC devices surely the situation is
> *just* the same as devices provided by a hypervisor?

Depends on how SoC is designed I guess.  At the moment specifically QEMU
runs everything in a single memory space so an IOMMU table lookup does
not offer any extra protection. That's not a must, one could come
up with modular hypervisor designs - it's just what we have ATM.

> And some people want that external network device to use passthrough
> anyway, for performance reasons.

That's a policy decision though.

> On the whole, there are *plenty* of reasons why we might want to have a
> passthrough mapping on a per-device basis,

That's true. And driver security also might differ, for example maybe I
trust a distro-supplied driver more than an out of tree one.  Or maybe I
trust a distro-supplied userspace driver more than a closed-source one.
And maybe I trust devices from same vendor as my chip more than a 3rd
party one.  So one can generalize this even further, think about device
and driver security/trust level as an integer and platform protection as an
integer.

If platform IOMMU offers you extra protection over trusting the device
(trust < protection) it improves your security to use the platform to limit
the device. If trust >= protection it just adds overhead without
increasing the security.
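
To make that comparison concrete, here is a minimal sketch in C. The
integer scale, the function name and the example values are assumptions
made up for illustration; nothing like this exists in QEMU or Linux today.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: both values live on the same made-up scale, where a
 * larger number means "more trusted" (for the device/driver pair) or
 * "stronger isolation" (for the platform IOMMU).
 */
static bool iommu_improves_security(int trust, int protection)
{
    /* The scale is only meaningful for non-negative levels. */
    assert(trust >= 0 && protection >= 0);

    /*
     * trust < protection: the platform can confine the device more tightly
     * than we would otherwise rely on, so translation buys something.
     * trust >= protection: translation only adds overhead.
     */
    return trust < protection;
}
```

With this model a policy layer could decide per device whether to request
a passthrough mapping, which is the generalization I had in mind.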

> and I really struggle to
> find justification for having this 'hint' in a virtio-specific way.

It's a way. No system seems to expose this information in a more generic
way at the moment, and it's portable. Would you like to push for some
kind of standardization of such a hint? I would be interested
to hear about that.

> And it's complicating the discussion of the *actual* fix we're looking
> at.

I guess you are right in that we should split this part out.
What I wanted is really the combination
PASSTHROUGH && !PLATFORM so that we can say "ok we don't
need to guess, this device actually bypasses the IOMMU".

And I thought it's a nice idea to use PASSTHROUGH && PLATFORM
as a hint since it seemed to be unused.
But maybe the best thing to do for now is to say
- hosts should not set PASSTHROUGH && PLATFORM
- guests should ignore PASSTHROUGH if PLATFORM is set

and then we can come back to this optimization idea later
if it's appropriate.

So yes I think we need the two bits but no we don't need to
mix the hint discussion in here.
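
As a rough sketch of how a guest could decode the two bits under the
rules above (the enum, its values and the function names are
hypothetical; only the two bit numbers come from the RFC patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as proposed in the RFC patch. */
#define VIRTIO_F_IOMMU_PASSTHROUGH 33
#define VIRTIO_F_IOMMU_PLATFORM    34

/* Hypothetical names for the three outcomes a guest has to distinguish. */
enum virtio_dma_mode {
    VIRTIO_DMA_LEGACY_GUESS, /* neither bit: status quo, guest must guess */
    VIRTIO_DMA_BYPASS,       /* PASSTHROUGH && !PLATFORM: device bypasses the IOMMU */
    VIRTIO_DMA_PLATFORM      /* PLATFORM: device obeys what the platform says */
};

static bool virtio_has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64); /* guard the shift below */
    return (features & (1ULL << bit)) != 0;
}

static enum virtio_dma_mode virtio_decode_dma_mode(uint64_t features)
{
    /* Guests ignore PASSTHROUGH whenever PLATFORM is set. */
    if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM)) {
        return VIRTIO_DMA_PLATFORM;
    }
    if (virtio_has_feature(features, VIRTIO_F_IOMMU_PASSTHROUGH)) {
        return VIRTIO_DMA_BYPASS;
    }
    return VIRTIO_DMA_LEGACY_GUESS;
}
```

A host that follows the "should not set PASSTHROUGH && PLATFORM" rule
never exercises the combined case; the decoder just treats it as PLATFORM.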

> > Looking at guest code, it looks like virtio was always
> > bypassing the IOMMU even if configured, but no other
> > guest driver did.
> > 
> > This makes me think the problem where guest drivers
> > ignore the IOMMU is virtio specific
> > and so a virtio specific solution seems cleaner.
> > 
> > The problem for assigned devices is IMHO different: they bypass
> > the guest IOMMU too but no guest driver knows about this,
> > so guests do not work. Seems cleaner to fix QEMU to make
> > existing guests work.
> 
> I certainly agree that it's better to fix QEMU. Whether devices are
> behind an IOMMU or not, the DMAR tables we expose to a guest should
> tell the truth.
> 
> Part of the issue here is virtio-specific; part isn't.
> 
> Basically, we have a conjunction of two separate bugs which happened to
> work (for virtio) — the IOMMU support in QEMU wasn't working for virtio
> (and as

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-18 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 11:51:41AM -0400, David Woodhouse wrote:
> On Mon, 2016-04-18 at 18:30 +0300, Michael S. Tsirkin wrote:
> > 
> > > Setting (only) VIRTIO_F_IOMMU_PASSTHROUGH indicates to the guest that
> > > its own operating system's IOMMU code is expected to be broken, and
> > > that the virtio driver should eschew the DMA API?
> > 
> > No - it tells guest that e.g. the ACPI tables (or whatever the
> > equivalent is) do not match reality with respect to this device
> > since IOMMU is ignored by hypervisor.
> > Hypervisor has no idea what guest IOMMU code does - hopefully
> > it is not actually broken.
> 
> OK, that makes sense — thanks.
> 
> So where the platform *does* have a way to coherently tell the guest
> that some devices are behind an IOMMU and some aren't, we should never
> see VIRTIO_F_IOMMU_PASSTHROUGH && !VIRTIO_F_IOMMU_PLATFORM. (Except
> perhaps temporarily on x86 until we *do* fix the DMAR tables to tell
> the truth; qv.)
> 
> This should *only* be a crutch for platforms which cannot properly
> convey that information from the hypervisor to the guest. It should be
> clearly documented "thou shalt not use this unless you've first
> attempted to fix the broken platform to get it right for itself".
> 
> And if we look at it as such... does it make more sense for this to be
> a more *generic* qemu←→guest interface? That way the software hacks can
> live in the OS IOMMU code where they belong, and prevent assignment to
> nested guests for example. And can cover cases like assigned PCI
> devices in existing qemu/x86 which need the same treatment.
>
> Put another way: if we're going to add code to the guest OS to look at
> this information, why can't we add that code in the guest's IOMMU
> support instead, to look at an out-of-band qemu-specific "ignore IOMMU
> for these devices" list instead?

I balk at adding more hacks to a broken system. My goals are
merely to
- make things work correctly with an IOMMU and new guests,
  so people can use userspace drivers with virtio devices
- prevent security risks when guest kernel mistakenly thinks
  it's protected by an IOMMU, but in fact isn't
- avoid breaking any working configurations

Looking at guest code, it looks like virtio was always
bypassing the IOMMU even if configured, but no other
guest driver did.

This makes me think the problem where guest drivers
ignore the IOMMU is virtio specific
and so a virtio specific solution seems cleaner.

The problem for assigned devices is IMHO different: they bypass
the guest IOMMU too but no guest driver knows about this,
so guests do not work. Seems cleaner to fix QEMU to make
existing guests work.


> > The status quo is that the IOMMU might well be bypassed
> > and then you need to program physical addresses into the device,
> > but maybe not. If DMA API does not give you physical addresses, you
> > need to bypass it, but hypervisor does not know or care.
> 
> Right. The status quo is that qemu doesn't provide correct information
> about IOMMU topology to guests, and they have to have heuristics to
> work out whether to eschew the IOMMU for a given device or not. This is
> true for virtio and assigned PCI devices alike.

True but I think we should fix QEMU to shadow IOMMU
page tables for assigned devices. This seems rather
possible with VT-D, and there are patches already on list.

It looks like this will fix all legacy guests which is
much nicer than what you suggest which will only help new guests.

> Furthermore, some platforms don't *have* a standard way for qemu to
> 'tell the truth' to the guests, and that's where the real fun comes in.
> But still, I'd like to see a generic solution for that lack instead of
> a virtio-specific hack.

But the issue is not just these holes.  E.g. with VT-D it is only easy
to emulate because there's a "caching mode" hook. It is fundamentally
paravirtualization.  So a completely generic solution would be a
paravirtualized IOMMU interface, replacing VT-D for VMs. It might be
justified if many platforms have hard to emulate interfaces.



> -- 
> dwmw2
> 
>

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-18 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 11:22:03AM -0400, David Woodhouse wrote:
> On Mon, 2016-04-18 at 17:23 +0300, Michael S. Tsirkin wrote:
> > 
> > This patch doesn't change DMAR tables, it creates a way for virtio
> > device to tell guest "I obey what DMAR tables tell you, you can stop
> > doing hacks".
> > 
> > And as PPC guys seem adamant that platform tools there are no good for
> > that purpose, there's another bit that says "ignore what platform tells
> > you, I'm not a real device - I'm part of hypervisor and I bypass the
> > IOMMU".
> 
> ...
> 
> +/* Request IOMMU passthrough (if available)
> + * Without VIRTIO_F_IOMMU_PLATFORM: bypass the IOMMU even if enabled.
> + * With VIRTIO_F_IOMMU_PLATFORM: suggest disabling IOMMU.
> + */
> +#define VIRTIO_F_IOMMU_PASSTHROUGH 33
> +
> +/* Do not bypass the IOMMU (if configured) */
> +#define VIRTIO_F_IOMMU_PLATFORM 34
> 
> OK... let's see if I can reconcile those descriptions coherently.
> 
> Setting (only) VIRTIO_F_IOMMU_PASSTHROUGH indicates to the guest that
> its own operating system's IOMMU code is expected to be broken, and
> that the virtio driver should eschew the DMA API?

No - it tells guest that e.g. the ACPI tables (or whatever the
equivalent is) do not match reality with respect to this device
since IOMMU is ignored by hypervisor.
Hypervisor has no idea what guest IOMMU code does - hopefully
it is not actually broken.

> And that the guest OS
> cannot further assign the affected device to any of *its* nested
> guests? Not that the broken IOMMU code in said guest OS will know the
> latter, of course.
> 
> With VIRTIO_F_IOMMU_PLATFORM set, VIRTIO_F_IOMMU_PASSTHROUGH is just a
> *hint*, suggesting that the guest OS should *request* a passthrough
> mapping from the IOMMU?

Right. But it'll work correctly if you don't.

> Via a driver←→IOMMU API which doesn't yet exist
> in Linux, since we only have 'iommu=pt' on the command line for that?
> 
> And having *neither* of those bits sets is the status quo, which means
> that your OS code might well be broken and need you to eschew the DMA
> API, but maybe not.


The status quo is that the IOMMU might well be bypassed
and then you need to program physical addresses into the device,
but maybe not. If DMA API does not give you physical addresses, you
need to bypass it, but hypervisor does not know or care.


> 
> -- 
> dwmw2
> 
>

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-18 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 10:03:52AM -0400, David Woodhouse wrote:
> On Mon, 2016-04-18 at 16:12 +0300, Michael S. Tsirkin wrote:
> > I'm not sure I understand the issue.  The public API is not about how
> > the driver works.  It doesn't say "don't use DMA API" anywhere, does it?
> > It's about telling device whether to obey the IOMMU and
> > about discovering whether a device is in fact under the IOMMU.
> 
> Apologies, I was wrongly reading this as a kernel patch.
> 
> After a brief struggle with "telling device whether to obey the IOMMU",
> which is obviously completely impossible from the guest kernel, I
> realise my mistake :)
> 
> So... on x86 how does this get reflected in the DMAR tables that the
> guest BIOS presents to the guest kernel, so that the guest kernel
> *knows* which devices are behind which IOMMU?

This patch doesn't change DMAR tables, it creates a way for virtio
device to tell guest "I obey what DMAR tables tell you, you can stop
doing hacks".

And as PPC guys seem adamant that platform tools there are no good for
that purpose, there's another bit that says "ignore what platform tells
you, I'm not a real device - I'm part of hypervisor and I bypass the
IOMMU".


> (And are you fixing the case of assigned PCI devices, which aren't
> behind any IOMMU, at the same time as you answer that? :)

No - Aviv B.D. has patches on list to fix that.

-- 
MST

Re: [Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-18 Thread Michael S. Tsirkin

On Mon, Apr 18, 2016 at 07:58:37AM -0400, David Woodhouse wrote:
> On Mon, 2016-04-18 at 14:47 +0300, Michael S. Tsirkin wrote:
> > This adds a flag to enable/disable bypassing the IOMMU by
> > virtio devices.
> 
> I'm still deeply unhappy with having this kind of hack in the virtio
> code at all, as you know. Drivers should just use the DMA API and if
> the *platform* wants to make it a no-op for a specific device, then it
> can.
> 
> Remember, this isn't just virtio either. Don't we have *precisely* the
> same issue with assigned PCI devices on a system with an emulated Intel
> IOMMU? The assigned PCI devices aren't covered by the emulated IOMMU,
> and the platform needs to know to bypass *those* too.
> 
> Now, we've had this conversation, and we accepted the hack in virtio
> for now until the platforms (especially SPARC and Power IIRC) can get
> their act together and make their DMA API implementations not broken.
> 
> But now you're adding this hack to the public API where we have to
> support it for ever. Please, can't we avoid that?

I'm not sure I understand the issue.  The public API is not about how
the driver works.  It doesn't say "don't use DMA API" anywhere, does it?
It's about telling device whether to obey the IOMMU and
about discovering whether a device is in fact under the IOMMU.

Once DMA API allows bypassing IOMMU per device we'll be
able to drop the ugly hack from virtio drivers, simply keying it
off the given flag.


> -- 
> dwmw2
> 
>

[Qemu-block] [PATCH RFC] fixup! virtio: convert to use DMA api

2016-04-18 Thread Michael S. Tsirkin

This adds a flag to enable/disable bypassing the IOMMU by
virtio devices.

This is on top of patch
http://article.gmane.org/gmane.comp.emulators.qemu/403467
virtio: convert to use DMA api

Tested with patchset
http://article.gmane.org/gmane.linux.kernel.virtualization/27545
virtio-pci: iommu support

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

---
 include/hw/virtio/virtio-access.h              | 3 ++-
 include/hw/virtio/virtio.h                     | 6 +++++-
 include/standard-headers/linux/virtio_config.h | 8 ++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h
index 967cc75..bb6f34e 100644
--- a/include/hw/virtio/virtio-access.h
+++ b/include/hw/virtio/virtio-access.h
@@ -23,7 +23,8 @@ static inline AddressSpace *virtio_get_dma_as(VirtIODevice *vdev)
 BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 
-if (k->get_dma_as) {
+if ((vdev->host_features & (0x1ULL << VIRTIO_F_IOMMU_PLATFORM)) &&
+k->get_dma_as) {
 return k->get_dma_as(qbus->parent);
 }
 return &address_space_memory;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b12faa9..34d3041 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -228,7 +228,11 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 DEFINE_PROP_BIT64("notify_on_empty", _state, _field,  \
   VIRTIO_F_NOTIFY_ON_EMPTY, true), \
 DEFINE_PROP_BIT64("any_layout", _state, _field, \
-  VIRTIO_F_ANY_LAYOUT, true)
+  VIRTIO_F_ANY_LAYOUT, true), \
+DEFINE_PROP_BIT64("iommu_passthrough", _state, _field, \
+  VIRTIO_F_IOMMU_PASSTHROUGH, false), \
+DEFINE_PROP_BIT64("iommu_platform", _state, _field, \
+  VIRTIO_F_IOMMU_PLATFORM, false)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
diff --git a/include/standard-headers/linux/virtio_config.h b/include/standard-headers/linux/virtio_config.h
index bcc445b..5564dab 100644
--- a/include/standard-headers/linux/virtio_config.h
+++ b/include/standard-headers/linux/virtio_config.h
@@ -61,4 +61,12 @@
 /* v1.0 compliant. */
 #define VIRTIO_F_VERSION_1 32
 
+/* Request IOMMU passthrough (if available)
+ * Without VIRTIO_F_IOMMU_PLATFORM: bypass the IOMMU even if enabled.
+ * With VIRTIO_F_IOMMU_PLATFORM: suggest disabling IOMMU.
+ */
+#define VIRTIO_F_IOMMU_PASSTHROUGH 33
+
+/* Do not bypass the IOMMU (if configured) */
+#define VIRTIO_F_IOMMU_PLATFORM 34
 #endif /* _LINUX_VIRTIO_CONFIG_H */
-- 
MST
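
For readers without a QEMU tree at hand, the host-side effect of the
virtio-access.h hunk can be sketched standalone as follows. All the
mock_* types and names are stand-ins invented for this example; the real
code uses AddressSpace, VirtioBusClass and friends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 34 /* bit number from the RFC patch */

/* Stand-in for QEMU's AddressSpace. */
struct mock_addr_space {
    const char *name;
};

static struct mock_addr_space mock_system_memory = { "system-memory" };
static struct mock_addr_space mock_iommu_as = { "iommu" };

/* Stand-in for the bus class hook that returns the translated space. */
struct mock_bus {
    struct mock_addr_space *(*get_dma_as)(void);
};

struct mock_vdev {
    uint64_t host_features;
    struct mock_bus *bus;
};

static struct mock_addr_space *mock_get_iommu_as(void)
{
    return &mock_iommu_as;
}

/*
 * Only go through the bus-provided (IOMMU) address space when the host
 * offers VIRTIO_F_IOMMU_PLATFORM; otherwise access guest memory directly.
 */
static struct mock_addr_space *mock_virtio_get_dma_as(const struct mock_vdev *vdev)
{
    assert(vdev != NULL && vdev->bus != NULL);

    if ((vdev->host_features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) &&
        vdev->bus->get_dma_as) {
        return vdev->bus->get_dma_as();
    }
    return &mock_system_memory;
}
```

Note the decision is keyed off the host feature set, not what the guest
acked, so the guest driver does not get to pick whether translation is
skipped.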

[Qemu-block] [PULL 14/15] virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio

2016-04-08 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/virtio/virtio.h  |  6 ++
 hw/block/dataplane/virtio-blk.c |  7 +++
 hw/scsi/virtio-scsi-dataplane.c | 12 
 hw/virtio/virtio.c  | 17 +
 4 files changed, 14 insertions(+), 28 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index fa3f93b..6a37065 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -142,9 +142,6 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 void (*handle_output)(VirtIODevice *,
   VirtQueue *));
 
-void virtio_set_queue_aio(VirtQueue *vq,
-  void (*handle_output)(VirtIODevice *, VirtQueue *));
-
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
 void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num);
@@ -254,7 +251,8 @@ EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
 void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
bool set_handler);
 void virtio_queue_aio_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
-bool assign, bool set_handler);
+void (*fn)(VirtIODevice *,
+   VirtQueue *));
 void virtio_irq(VirtQueue *vq);
 VirtQueue *virtio_vector_first_queue(VirtIODevice *vdev, uint16_t vector);
 VirtQueue *virtio_vector_next_queue(VirtQueue *vq);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 65c7f70..3cb97c9 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -237,8 +237,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
-virtio_set_queue_aio(s->vq, virtio_blk_data_plane_handle_output);
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
+virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx,
+   virtio_blk_data_plane_handle_output);
 aio_context_release(s->ctx);
 return;
 
@@ -273,8 +273,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_context_acquire(s->ctx);
 
 /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
-virtio_set_queue_aio(s->vq, NULL);
+virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 39ad086..1a49f1e 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -81,8 +81,7 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue *vq, int n,
 return rc;
 }
 
-virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, true, true);
-virtio_set_queue_aio(vq, fn);
+virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, fn);
 return 0;
 }
 
@@ -99,13 +98,10 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
 int i;
 
-virtio_queue_aio_set_host_notifier_handler(vs->ctrl_vq, s->ctx, false, false);
-virtio_set_queue_aio(vs->ctrl_vq, NULL);
-virtio_queue_aio_set_host_notifier_handler(vs->event_vq, s->ctx, false, false);
-virtio_set_queue_aio(vs->event_vq, NULL);
+virtio_queue_aio_set_host_notifier_handler(vs->ctrl_vq, s->ctx, NULL);
+virtio_queue_aio_set_host_notifier_handler(vs->event_vq, s->ctx, NULL);
 for (i = 0; i < vs->conf.num_queues; i++) {
-virtio_queue_aio_set_host_notifier_handler(vs->cmd_vqs[i], s->ctx, false, false);
-virtio_set_queue_aio(vs->cmd_vqs[i], NULL);
+virtio_queue_aio_set_host_notifier_handler(vs->cmd_vqs[i], s->ctx, NULL);
 }
 }
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index eb04ac0..f745c4a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1159,14 +1159,6 @

[Qemu-block] [PULL 12/15] virtio-blk: use aio handler for data plane

2016-04-08 Thread Michael S. Tsirkin

In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.

This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant.  Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
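
The idea, reduced to a standalone sketch (the mock_queue type and every
function name below are invented for illustration and simplify what the
patch does in virtio.c and virtio-blk.c):

```c
#include <assert.h>
#include <stddef.h>

struct mock_queue;
typedef void (*mock_queue_handler)(struct mock_queue *q);

struct mock_queue {
    mock_queue_handler handle_output;     /* vcpu thread / io thread path */
    mock_queue_handler handle_aio_output; /* AioContext path, NULL if unused */
    int regular_calls;
    int aio_calls;
};

static void mock_regular_handler(struct mock_queue *q)
{
    q->regular_calls++;
}

static void mock_aio_handler(struct mock_queue *q)
{
    q->aio_calls++;
}

/* Install (or, with fn == NULL, remove) the dedicated aio handler. */
static void mock_queue_set_aio(struct mock_queue *q, mock_queue_handler fn)
{
    assert(q != NULL);
    q->handle_aio_output = fn;
}

/*
 * While an aio handler is installed the regular handler is not run, so
 * the non-reentrant regular path can never race with the dataplane path.
 */
static void mock_queue_notify(struct mock_queue *q)
{
    assert(q != NULL);
    if (q->handle_aio_output) {
        q->handle_aio_output(q);
    } else if (q->handle_output) {
        q->handle_output(q);
    }
}
```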

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  2 ++
 hw/block/dataplane/virtio-blk.c | 13 +
 hw/block/virtio-blk.c   | 27 +--
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 59ae1e4..8f2b056 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -86,4 +86,6 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
 
 void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb);
 
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
+
 #endif
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2870d21..65c7f70 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -184,6 +184,17 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 g_free(s);
 }
 
+static void virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
+VirtQueue *vq)
+{
+VirtIOBlock *s = (VirtIOBlock *)vdev;
+
+assert(s->dataplane);
+assert(s->dataplane_started);
+
+virtio_blk_handle_vq(s, vq);
+}
+
 /* Context: QEMU global mutex held */
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 {
@@ -226,6 +237,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
+virtio_set_queue_aio(s->vq, virtio_blk_data_plane_handle_output);
 virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
 aio_context_release(s->ctx);
 return;
@@ -262,6 +274,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
 /* Stop notifications for new requests from guest */
 virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
+virtio_set_queue_aio(s->vq, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 151fe78..3f88f8c 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -578,20 +578,11 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 }
 }
 
-static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 {
-VirtIOBlock *s = VIRTIO_BLK(vdev);
 VirtIOBlockReq *req;
 MultiReqBuffer mrb = {};
 
-/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
- * dataplane here instead of waiting for .set_status().
- */
-if (s->dataplane && !s->dataplane_started) {
-virtio_blk_data_plane_start(s->dataplane);
-return;
-}
-
 blk_io_plug(s->blk);
 
 while ((req = virtio_blk_get_request(s))) {
@@ -605,6 +596,22 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 blk_io_unplug(s->blk);
 }
 
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+VirtIOBlock *s = (VirtIOBlock *)vdev;
+
+if (s->dataplane) {
+/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
+ * dataplane here instead of waiting for .set_status().
+ */
+virtio_blk_data_plane_start(s->dataplane);
+if (!s->dataplane_disabled) {
+return;
+}
+}
+virtio_blk_handle_vq(s, vq);
+}
+
 static void virtio_blk_dma_restart_bh(void *opaque)
 {
 VirtIOBlock *s = opaque;
-- 
MST
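
The dispatch rule in the regular handler above can be modelled outside
QEMU.  What follows is only a sketch under stated assumptions: DemoBlk,
the demo_* functions and the start_will_fail knob are hypothetical
stand-ins, not QEMU code.  It shows that a kick is processed in-line only
when there is no dataplane, or when starting dataplane failed and left it
disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for VirtIOBlock. */
typedef struct {
    bool has_dataplane;       /* models s->dataplane != NULL */
    bool dataplane_started;
    bool dataplane_disabled;  /* set when starting dataplane failed */
    bool start_will_fail;     /* test knob, not part of QEMU */
    int  handled_inline;      /* kicks processed by the regular handler */
} DemoBlk;

/* Models virtio_blk_data_plane_start(): idempotent, may end up disabled. */
static void demo_dispatch_start(DemoBlk *s)
{
    if (s->dataplane_started) {
        return;
    }
    if (s->start_will_fail) {
        s->dataplane_disabled = true;
    }
    s->dataplane_started = true;
}

/* Models virtio_blk_handle_vq(): the shared request-processing loop. */
static void demo_dispatch_handle_vq(DemoBlk *s)
{
    s->handled_inline++;
}

/* Models the regular (non-aio) virtio_blk_handle_output(). */
static void demo_dispatch_handle_output(DemoBlk *s)
{
    if (s->has_dataplane) {
        demo_dispatch_start(s);
        if (!s->dataplane_disabled) {
            return;  /* the aio handler owns the queue from now on */
        }
    }
    demo_dispatch_handle_vq(s);
}
```

Calling demo_dispatch_handle_output() on a device whose dataplane starts
cleanly leaves handled_inline at 0; with start_will_fail set, or with no
dataplane at all, the same kick is handled in-line.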

[Qemu-block] [PULL 09/15] virtio-blk: fix disabled mode

2016-04-08 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  | 1 +
 hw/block/dataplane/virtio-blk.c | 7 +++
 hw/block/virtio-blk.c   | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index ae84d92..59ae1e4 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -53,6 +53,7 @@ typedef struct VirtIOBlock {
 unsigned short sector_mask;
 bool original_wce;
 VMChangeStateEntry *change;
+bool dataplane_disabled;
 bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index e666dd4..2870d21 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -29,7 +29,6 @@
 struct VirtIOBlockDataPlane {
 bool starting;
 bool stopping;
-bool disabled;
 
 VirtIOBlkConf *conf;
 
@@ -234,7 +233,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
   fail_host_notifier:
 k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
-s->disabled = true;
+vblk->dataplane_disabled = true;
 s->starting = false;
 vblk->dataplane_started = true;
 }
@@ -251,8 +250,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 }
 
 /* Better luck next time. */
-if (s->disabled) {
-s->disabled = false;
+if (vblk->dataplane_disabled) {
+vblk->dataplane_disabled = false;
 vblk->dataplane_started = false;
 return;
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 870d345..151fe78 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -54,7 +54,7 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
 
 stb_p(&req->in->status, status);
 virtqueue_push(s->vq, &req->elem, req->in_len);
-if (s->dataplane) {
+if (s->dataplane_started && !s->dataplane_disabled) {
 virtio_blk_data_plane_notify(s->dataplane);
 } else {
 virtio_notify(vdev, s->vq);
-- 
MST

[Qemu-block] [PULL 03/15] xen: piix reuse pci generic class init function

2016-04-08 Thread Michael S. Tsirkin

piix3_ide_xen_class_init is identical to piix3_ide_class_init
except it's buggy as it does not set exit and does not disable
hotplug properly.

Switch to the generic one.

Reviewed-by: Stefano Stabellini <sstabell...@kernel.org>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/ide/piix.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index df46147..0a4cbcb 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -258,22 +258,10 @@ static const TypeInfo piix3_ide_info = {
 .class_init= piix3_ide_class_init,
 };
 
-static void piix3_ide_xen_class_init(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
-
-k->realize = pci_piix_ide_realize;
-k->vendor_id = PCI_VENDOR_ID_INTEL;
-k->device_id = PCI_DEVICE_ID_INTEL_82371SB_1;
-k->class_id = PCI_CLASS_STORAGE_IDE;
-set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-}
-
 static const TypeInfo piix3_ide_xen_info = {
 .name  = "piix3-ide-xen",
 .parent= TYPE_PCI_IDE,
-.class_init= piix3_ide_xen_class_init,
+.class_init= piix3_ide_class_init,
 };
 
 static void piix4_ide_class_init(ObjectClass *klass, void *data)
-- 
MST

Re: [Qemu-block] [PATCH] xen: piix reuse pci generic class init function

2016-04-06 Thread Michael S. Tsirkin

On Wed, Apr 06, 2016 at 04:56:12PM -0700, Stefano Stabellini wrote:
> On Sun, 3 Apr 2016, Michael S. Tsirkin wrote:
> > piix3_ide_xen_class_init is identical to piix3_ide_class_init
> > except it's buggy as it does not set exit and does not disable
> > hotplug properly.
> > 
> > Switch to the generic one.
> > 
> > Reviewed-by: Stefano Stabellini <sstabell...@kernel.org>
> > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> 
> Hey John,
> 
> are you going to take the patch or do you want me to handle it?
> 
> Cheers,
> 
> Stefano

it's in my tree already.

> 
> >  hw/ide/piix.c | 14 +-
> >  1 file changed, 1 insertion(+), 13 deletions(-)
> > 
> > diff --git a/hw/ide/piix.c b/hw/ide/piix.c
> > index df46147..0a4cbcb 100644
> > --- a/hw/ide/piix.c
> > +++ b/hw/ide/piix.c
> > @@ -258,22 +258,10 @@ static const TypeInfo piix3_ide_info = {
> >  .class_init= piix3_ide_class_init,
> >  };
> >  
> > -static void piix3_ide_xen_class_init(ObjectClass *klass, void *data)
> > -{
> > -DeviceClass *dc = DEVICE_CLASS(klass);
> > -PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > -
> > -k->realize = pci_piix_ide_realize;
> > -k->vendor_id = PCI_VENDOR_ID_INTEL;
> > -k->device_id = PCI_DEVICE_ID_INTEL_82371SB_1;
> > -k->class_id = PCI_CLASS_STORAGE_IDE;
> > -set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> > -}
> > -
> >  static const TypeInfo piix3_ide_xen_info = {
> >  .name  = "piix3-ide-xen",
> >  .parent= TYPE_PCI_IDE,
> > -.class_init= piix3_ide_xen_class_init,
> > +.class_init= piix3_ide_class_init,
> >  };
> >  
> >  static void piix4_ide_class_init(ObjectClass *klass, void *data)
> > -- 
> > MST
> >

Re: [Qemu-block] [PATCH] virtio-blk: assert on starting/stopping

2016-04-04 Thread Michael S. Tsirkin

On Mon, Apr 04, 2016 at 10:25:34AM +0200, Cornelia Huck wrote:
> On Mon, 4 Apr 2016 10:19:42 +0200
> Paolo Bonzini  wrote:
> 
> > On 04/04/2016 10:10, Cornelia Huck wrote:
> > > > This will be fixed by Cornelia's rework, and is an example of why I
> > > > think patch 1/9 is a good idea (IOW, assign=false is harmful).
> > > 
> > > So what do we want to do for 2.6? The aio handler rework (without the
> > > cleanup) is needed. Do we want to include the minimal version of my
> > > "keep handler assigned" patch (the one without the api rework) as well,
> > > as it fixes a latent bug?
> > 
> > I would, but Michael is more conservative in general.  Since the
> > difference between a bug and a feature is very fuzzy here, I would just
> > omit my patch 9.
> 
> I'd omit patch 9 as well, but the knowledge that the "handler
> deassigned" bug is still lurking makes me uncomfortable.

It's not a bug as such - that logic was relying on the handler
invoking itself being a nop, and that assumption broke with the
dataplane rework.

> Would like to see a test from someone with a large setup, anyway (and I
> need to enhance my test setup, I guess...)

Now that Christian sent the backtrace I feel we understand
the issues, but more testing is always good :)

-- 
MST

[Qemu-block] [PATCH] virtio-blk: assert on starting/stopping

2016-04-03 Thread Michael S. Tsirkin

Reentrancy cannot happen while the BQL is being held,
so we should never enter this condition.

Cc: Christian Borntraeger <borntrae...@de.ibm.com>
Cc: Cornelia Huck <cornelia.h...@de.ibm.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---

This is a replacement for [PATCH 9/9] virtio: remove starting/stopping
checks.  Christian, could you please give it a spin with debug enabled?
Since you reported above that Paolo's patch triggers segfaults, I expect
this one to trigger assertions as well, which should give us more info on
the root cause.

 hw/block/dataplane/virtio-blk.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index fd06726..04e0e0d 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -203,10 +203,12 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 int r;
 
-if (vblk->dataplane_started || s->starting) {
+if (vblk->dataplane_started) {
 return;
 }
 
+assert(!s->starting);
+
 s->starting = true;
 s->vq = virtio_get_queue(s->vdev, 0);
 
@@ -257,10 +259,12 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
-if (!vblk->dataplane_started || s->stopping) {
+if (!vblk->dataplane_started) {
 return;
 }
 
+assert(!s->stopping);
+
 /* Better luck next time. */
 if (vblk->dataplane_disabled) {
 vblk->dataplane_disabled = false;
-- 
MST
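
The intent of the patch above, reduced to a standalone sketch: an
already-started device is a legitimate no-op, while a nested call during
start is treated as a bug and trips an assertion.  GuardPlane and
guard_plane_start() are hypothetical names for illustration; the real
function does much more between setting and clearing the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dataplane state touched by the patch. */
typedef struct {
    bool starting;     /* true only while guard_plane_start() is running */
    bool started;      /* true once start has completed */
    int  start_count;  /* how many times the start body actually ran */
} GuardPlane;

static void guard_plane_start(GuardPlane *s)
{
    if (s->started) {
        return;              /* already running: nothing to do */
    }

    /* A nested call here would mean reentrancy, which is assumed to be
     * impossible while the caller holds the big lock. */
    assert(!s->starting);

    s->starting = true;
    s->start_count++;        /* placeholder for the real setup work */
    s->starting = false;
    s->started = true;
}
```

Calling guard_plane_start() twice in a row runs the setup work once; only
a call made while starting is still set would abort.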

[Qemu-block] [PATCH] xen: piix reuse pci generic class init function

2016-04-03 Thread Michael S. Tsirkin

piix3_ide_xen_class_init is identical to piix3_ide_class_init
except it's buggy as it does not set exit and does not disable
hotplug properly.

Switch to the generic one.

Reviewed-by: Stefano Stabellini <sstabell...@kernel.org>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/ide/piix.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index df46147..0a4cbcb 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -258,22 +258,10 @@ static const TypeInfo piix3_ide_info = {
 .class_init= piix3_ide_class_init,
 };
 
-static void piix3_ide_xen_class_init(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
-
-k->realize = pci_piix_ide_realize;
-k->vendor_id = PCI_VENDOR_ID_INTEL;
-k->device_id = PCI_DEVICE_ID_INTEL_82371SB_1;
-k->class_id = PCI_CLASS_STORAGE_IDE;
-set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-}
-
 static const TypeInfo piix3_ide_xen_info = {
 .name  = "piix3-ide-xen",
 .parent= TYPE_PCI_IDE,
-.class_init= piix3_ide_xen_class_init,
+.class_init= piix3_ide_class_init,
 };
 
 static void piix4_ide_class_init(ObjectClass *klass, void *data)
-- 
MST

Re: [Qemu-block] [PATCH 2/2] virtio-blk: use aio handler for data plane

2016-03-29 Thread Michael S. Tsirkin

On Tue, Mar 29, 2016 at 04:05:46PM +0200, Paolo Bonzini wrote:
> 
> 
> On 29/03/2016 15:42, Michael S. Tsirkin wrote:
> > +if (s->dataplane) {
> > +/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
> > + * dataplane here instead of waiting for .set_status().
> > + */
> > +if (!s->dataplane_started) {
> > +virtio_blk_data_plane_start(s->dataplane);
> > +}
> > +return;
> > +}
> > +
> > +virtio_blk_handle_vq(s, vq);
> 
> Another small comment, this can be written simply as
> 
> if (s->dataplane) {
> virtio_blk_data_plane_start(s->dataplane);

True, it's best not to poke at dataplane_started.

> } else {
> virtio_blk_handle_vq(s, vq);
> }
> 

I prefer the return style I think, to stress the
fact that this is an unusual, unexpected case.

> Paolo

Re: [Qemu-block] [PATCH 2/2] virtio-blk: use aio handler for data plane

2016-03-29 Thread Michael S. Tsirkin

On Tue, Mar 29, 2016 at 03:56:18PM +0200, Paolo Bonzini wrote:
> 
> 
> On 29/03/2016 15:42, Michael S. Tsirkin wrote:
> > @@ -262,6 +274,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
> >  
> >  /* Stop notifications for new requests from guest */
> >  virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
> 
> I think that this should have been ", true, false" even before your
> patch; I'd prefer to fix it even if the problem is latent.

Makes sense - post a patch?

> The patch looks good, and it might even be an API improvement
> independent of Conny's patches.  The reentrancy _is_ hard to understand,
> and eliminating it makes the new API not just a hack.
> 
> In that case I would unify the new function with
> virtio_queue_aio_set_host_notifier_handler.  In other words
> 
> - virtio_queue_aio_set_host_notifier_handler(vq, ctx, NULL) is
> the same as
> 
>  virtio_set_queue_aio(s->vq, NULL);
>  virtio_queue_aio_set_host_notifier_handler(vq, ctx, true, false);
> 
> - virtio_queue_aio_set_host_notifier_handler(vq, ctx, fn) is the same as
> 
>  virtio_queue_aio_set_host_notifier_handler(vq, ctx, true, true);
>  virtio_set_queue_aio(vq, fn);
> 
> Thanks,
> 
> Paolo

In that case, we'll have to also change scsi to use the new API.
A bit more work, to be sure.
Does scsi have the same problem as blk?
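
For reference, the unification Paolo describes could look roughly like
the sketch below.  This is a model under assumptions, not the QEMU
implementation: SketchQueue and the sketch_* functions are made up, the
AioContext argument is left out, and the two helpers merely record what
the real two-call sequence would have been asked to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchQueue SketchQueue;
typedef void (*SketchHandler)(SketchQueue *vq);

struct SketchQueue {
    SketchHandler aio_handler;   /* handler used only in aio mode */
    bool notifier_assigned;      /* models the "assign" argument */
    bool notifier_has_handler;   /* models the "set_handler" argument */
};

/* Models the old virtio_set_queue_aio(vq, fn). */
static void sketch_set_queue_aio(SketchQueue *vq, SketchHandler fn)
{
    vq->aio_handler = fn;
}

/* Models the old four-argument call, minus the context. */
static void sketch_set_host_notifier(SketchQueue *vq, bool assign,
                                     bool set_handler)
{
    vq->notifier_assigned = assign;
    vq->notifier_has_handler = set_handler;
}

/* Unified entry point: a NULL handler means "stop handling in aio". */
static void sketch_aio_set_handler(SketchQueue *vq, SketchHandler fn)
{
    if (fn) {
        sketch_set_host_notifier(vq, true, true);
        sketch_set_queue_aio(vq, fn);
    } else {
        sketch_set_queue_aio(vq, NULL);
        sketch_set_host_notifier(vq, true, false);
    }
}

/* Trivial handler so callers have something to register. */
static void sketch_noop_handler(SketchQueue *vq)
{
    (void)vq;
}
```

With this shape a caller can no longer set the aio handler and forget the
notifier (or the reverse), which is the point of folding the two calls
into one.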

> > +virtio_set_queue_aio(s->vq, NULL);
> >  
> >  /* Drain and switch bs back to the QEMU main loop */
> >  blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());

[Qemu-block] [PATCH 2/2] virtio-blk: use aio handler for data plane

2016-03-29 Thread Michael S. Tsirkin

In addition to handling IO in vcpu thread and
in io thread, blk dataplane introduces yet another mode:
handling it by aio.

This reuses the same handler as previous modes,
which triggers races as these were not designed to be reentrant.

As a temporary fix, use a separate handler just for aio, and
disable regular handlers when dataplane is active.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  2 ++
 hw/block/dataplane/virtio-blk.c | 13 +
 hw/block/virtio-blk.c   | 28 ++--
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index ae84d92..df517ff 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -85,4 +85,6 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
 
 void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb);
 
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
+
 #endif
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 36f3d2b..7d1f3b2 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -184,6 +184,17 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 g_free(s);
 }
 
+static void virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
+VirtQueue *vq)
+{
+VirtIOBlock *s = VIRTIO_BLK(vdev);
+
+assert(s->dataplane);
+assert(s->dataplane_started);
+
+virtio_blk_handle_vq(s, vq);
+}
+
 /* Context: QEMU global mutex held */
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 {
@@ -226,6 +237,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
+virtio_set_queue_aio(s->vq, virtio_blk_data_plane_handle_output);
 virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
 aio_context_release(s->ctx);
 return;
@@ -262,6 +274,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
 /* Stop notifications for new requests from guest */
 virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
+virtio_set_queue_aio(s->vq, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cb710f1..5717f09 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -577,20 +577,11 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 }
 }
 
-static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 {
-VirtIOBlock *s = VIRTIO_BLK(vdev);
 VirtIOBlockReq *req;
 MultiReqBuffer mrb = {};
 
-/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
- * dataplane here instead of waiting for .set_status().
- */
-if (s->dataplane && !s->dataplane_started) {
-virtio_blk_data_plane_start(s->dataplane);
-return;
-}
-
 blk_io_plug(s->blk);
 
 while ((req = virtio_blk_get_request(s))) {
@@ -604,6 +595,23 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 blk_io_unplug(s->blk);
 }
 
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+VirtIOBlock *s = VIRTIO_BLK(vdev);
+
+if (s->dataplane) {
+/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
+ * dataplane here instead of waiting for .set_status().
+ */
+if (!s->dataplane_started) {
+virtio_blk_data_plane_start(s->dataplane);
+}
+return;
+}
+
+virtio_blk_handle_vq(s, vq);
+}
+
 static void virtio_blk_dma_restart_bh(void *opaque)
 {
 VirtIOBlock *s = opaque;
-- 
MST

[Qemu-block] [PULL v2 13/51] fdc: add function to determine drive chs limits

2016-03-15 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.

This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Marcel Apfelbaum <mar...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Laszlo Ersek <ler...@redhat.com>
Cc: Kevin O'Connor <ke...@koconnor.net>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: John Snow <js...@redhat.com>
---
 include/hw/block/fdc.h |  2 ++
 hw/block/fdc.c | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index adce14f..1749dab 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -15,5 +15,7 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs);
 
 #endif
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 9838d21..fc3aef9 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2557,6 +2557,29 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 return isa->state.drives[i].drive;
 }
 
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs)
+{
+const FDFormat *fdf;
+
+*maxc = *maxh = *maxs = 0;
+for (fdf = fd_formats; fdf->drive != FLOPPY_DRIVE_TYPE_NONE; fdf++) {
+if (fdf->drive != type) {
+continue;
+}
+if (*maxc < fdf->max_track) {
+*maxc = fdf->max_track;
+}
+if (*maxh < fdf->max_head) {
+*maxh = fdf->max_head;
+}
+if (*maxs < fdf->last_sect) {
+*maxs = fdf->last_sect;
+}
+}
+(*maxc)--;
+}
+
 static const VMStateDescription vmstate_isa_fdc ={
 .name = "fdc",
 .version_id = 2,
-- 
MST
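
The lookup added by the patch above can be tried in isolation.  The
sketch below copies its logic onto a small made-up format table;
DemoDriveType, DemoFormat, demo_formats and the geometry values are
illustrative assumptions, not QEMU's fd_formats.  The track field is
treated as a count, which is why the result is decremented once at the
end, as in the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    DEMO_DRIVE_NONE = 0,   /* terminates the table */
    DEMO_DRIVE_144,
    DEMO_DRIVE_120
} DemoDriveType;

typedef struct {
    DemoDriveType drive;
    uint8_t last_sect;     /* highest sector number */
    uint8_t max_track;     /* number of tracks (a count) */
    uint8_t max_head;      /* highest head number */
} DemoFormat;

/* Illustrative entries only. */
static const DemoFormat demo_formats[] = {
    { DEMO_DRIVE_144, 18, 80, 1 },
    { DEMO_DRIVE_144, 21, 82, 1 },
    { DEMO_DRIVE_120, 15, 80, 1 },
    { DEMO_DRIVE_NONE, 0, 0, 0 },
};

/* Assumes at least one table entry matches the requested type. */
static void demo_get_drive_max_chs(DemoDriveType type, uint8_t *maxc,
                                   uint8_t *maxh, uint8_t *maxs)
{
    const DemoFormat *fdf;

    *maxc = *maxh = *maxs = 0;
    for (fdf = demo_formats; fdf->drive != DEMO_DRIVE_NONE; fdf++) {
        if (fdf->drive != type) {
            continue;
        }
        if (*maxc < fdf->max_track) {
            *maxc = fdf->max_track;
        }
        if (*maxh < fdf->max_head) {
            *maxh = fdf->max_head;
        }
        if (*maxs < fdf->last_sect) {
            *maxs = fdf->last_sect;
        }
    }
    (*maxc)--;  /* convert the track count into the highest cylinder */
}
```

Each maximum is taken independently, so the returned triple need not
correspond to any single format in the table.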

[Qemu-block] [PULL 13/53] fdc: add function to determine drive chs limits

2016-03-11 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.

This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Marcel Apfelbaum <mar...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Laszlo Ersek <ler...@redhat.com>
Cc: Kevin O'Connor <ke...@koconnor.net>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: John Snow <js...@redhat.com>
---
 include/hw/block/fdc.h |  2 ++
 hw/block/fdc.c | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index adce14f..1749dab 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -15,5 +15,7 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs);
 
 #endif
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 9838d21..fc3aef9 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2557,6 +2557,29 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 return isa->state.drives[i].drive;
 }
 
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs)
+{
+const FDFormat *fdf;
+
+*maxc = *maxh = *maxs = 0;
+for (fdf = fd_formats; fdf->drive != FLOPPY_DRIVE_TYPE_NONE; fdf++) {
+if (fdf->drive != type) {
+continue;
+}
+if (*maxc < fdf->max_track) {
+*maxc = fdf->max_track;
+}
+if (*maxh < fdf->max_head) {
+*maxh = fdf->max_head;
+}
+if (*maxs < fdf->last_sect) {
+*maxs = fdf->last_sect;
+}
+}
+(*maxc)--;
+}
+
 static const VMStateDescription vmstate_isa_fdc ={
 .name = "fdc",
 .version_id = 2,
-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers

2016-03-04 Thread Michael S. Tsirkin

On Fri, Mar 04, 2016 at 01:57:05PM +0100, Markus Armbruster wrote:
> "Michael S. Tsirkin" <m...@redhat.com> writes:
> 
> > On Fri, Mar 04, 2016 at 09:42:02AM +0100, Markus Armbruster wrote:
> >> Plugging an MSI-capable device into an MSI-incapable board works just
> >> fine, both for physical and for virtual hardware.  What doesn't work is
> >> plugging an MSI-capable device into an MSI-capable board with *broken*
> >> MSI support.
> >> 
> >> As a convenience feature, we summarily and silently remove a device's
> >> MSI capability when we detect such a broken board.  At least that's what
> >> the commit message you quoted claims.
> >
> > And this makes sense, right?
> 
> Yes, except for the part where we ignore the user's explicit orders,
> and, to a lesser degree, for the part where we create variants of
> physical devices that don't exist physically.
> 
> >> In reality, we remove it not just for broken boards, but even for
> >> MSI-incapable boards.
> >> 
> >> I take issue with "summarily and silently", and "even for MSI-incapable
> >> boards".
> >> 
> >> When multiple variants of a device exist, and the user didn't ask for a
> >> specific one, then picking the one that works best with the board is
> >> just fine.
> >> 
> >> It's absolutely not fine when the user did ask for a specific one.  When
> >> I ask for msi=on, I mean it.  If it can't work with this board, tell me.
> >> But silently ignoring my orders is a bug.
> >
> > I agree. msi is not the only case either.  We really need - for any boolean
> > flag - a way to figure out whether it was set by user.
> > With that in place we could fix it.
> 
> QMP provides that directly as "optional bool", but qdev properties do
> "optional" differently, and when you see the default value, you don't know
> whether it comes from the user or not.
> 
> Another solution is an on/off/auto type.  We got it already in the QAPI
> schema, as OnOffAuto, and my recent "[PATCH 32/38] qdev: New
> DEFINE_PROP_ON_OFF_AUTO" makes it available as qdev property.  With
> default set to auto, we should be set.

Should we somehow change all on/off properties to on/off/auto?
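
To make the on/off/auto idea concrete, here is a rough sketch of how a
tri-state request might be resolved.  It is an illustration under
assumptions only: DemoOnOffAuto and demo_resolve_msi() are hypothetical
names, not the QAPI type or any existing QEMU helper.  An explicit "on"
that cannot be honoured is reported as an error rather than silently
dropped; "auto" picks whatever works.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    DEMO_MSI_AUTO,   /* user did not say: pick what works */
    DEMO_MSI_ON,     /* user explicitly asked for MSI */
    DEMO_MSI_OFF     /* user explicitly asked for no MSI */
} DemoOnOffAuto;

/*
 * Returns 0 and stores the decision in *enabled, or -1 when an explicit
 * "on" cannot be honoured.  *enabled is left untouched on failure.
 */
static int demo_resolve_msi(DemoOnOffAuto requested, bool msi_nonbroken,
                            bool *enabled)
{
    switch (requested) {
    case DEMO_MSI_ON:
        if (!msi_nonbroken) {
            return -1;           /* tell the user instead of ignoring them */
        }
        *enabled = true;
        return 0;
    case DEMO_MSI_OFF:
        *enabled = false;
        return 0;
    case DEMO_MSI_AUTO:
    default:
        *enabled = msi_nonbroken;
        return 0;
    }
}
```

With the default set to auto, existing command lines keep working, and
only a user who explicitly asked for msi=on sees a failure.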

> > However, almost no one uses the msi/msi-x flag - we introduced
> > them only for one reason - for backwards compatibility.
> > The fact that each time we need a compatibility flag
> > we also expose a new user interface is very unfortunate.
> > IMO it was a design mistake made a long time ago and it won't
> > be easy to fix now.
> 
> I agree there's no easy fix, but we can try to find a non-easy one.
> 
> For backward compatibility, we need to configure some device models for
> certain machine types.  We use the only object configuration mechanism
> we have: properties.  The properties we use for compatibility are all
> exposed to the user.
> 
> We could easily provide a flag to mark properties private, and only
> accept non-private properties at external interfaces.  This should help
> stopping growth of the problem, but it's not an easy fix for reducing
> it, because making a property private retroactively is problematic.  We
> could have a flag to mark them deprecated instead, warn on use, remove
> them from documentation, and perhaps drop them a couple of releases
> later.

Sounds good.
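
A rough sketch of what such flags might look like, purely as an
assumption for discussion: DemoProp, the DEMO_PROP_* flags, the property
names and demo_external_set() are all invented here and do not correspond
to existing qdev code.  Private properties are refused at the external
interface, deprecated ones are accepted with a warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum {
    DEMO_PROP_PRIVATE    = 1 << 0,  /* settable by machine types only */
    DEMO_PROP_DEPRECATED = 1 << 1   /* still accepted, but warns */
};

typedef struct {
    const char *name;
    unsigned flags;
} DemoProp;

/* Illustrative property table. */
static const DemoProp demo_props[] = {
    { "serial",     0 },
    { "msi",        DEMO_PROP_DEPRECATED },
    { "compat-foo", DEMO_PROP_PRIVATE },
};

typedef enum {
    DEMO_SET_OK,
    DEMO_SET_OK_DEPRECATED,
    DEMO_SET_REJECTED
} DemoSetResult;

/* Models a property being set from an external (user-facing) interface. */
static DemoSetResult demo_external_set(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_props) / sizeof(demo_props[0]); i++) {
        if (strcmp(demo_props[i].name, name) != 0) {
            continue;
        }
        if (demo_props[i].flags & DEMO_PROP_PRIVATE) {
            return DEMO_SET_REJECTED;
        }
        if (demo_props[i].flags & DEMO_PROP_DEPRECATED) {
            fprintf(stderr, "warning: property '%s' is deprecated\n", name);
            return DEMO_SET_OK_DEPRECATED;
        }
        return DEMO_SET_OK;
    }
    return DEMO_SET_REJECTED;  /* unknown property */
}
```

Machine-type compatibility code would bypass this check and set private
properties directly, so the compatibility mechanism itself is unchanged.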

> > And for the above reason I personally do not intend to
> > spend time designing a specific hack just for the msi
> > property.
> >
> >> It's fine to have emulations of MSI-capable boards where MSI doesn't yet
> >> work.  Even if that means we have to reject MSI-capable devices.
> >
> > I don't know what "reject" means here. Removing msi capability?
> > In that case I agree.
> 
> By "reject" I mean "reject the user's order to plug in an MSI-capable
> device."

I don't believe anyone tweaks these properties in practice.

However, I have to wonder. Assume that you have supplied
a device with 10 properties. QEMU cannot support them.
At your suggestion, the device is rejected. How does the user
know which property to tweak? Try all values for them all?


> >> It's absolutely not fine to reject them for MSI-incapable boards, where
> >> they'd work just fine.
> >
> > I think that as long as users did not ask for msi explicitly,
> > and board is msi incapable, it does not matter much whether
> > device has msi capability or not - guest will not try
> > to use it anyway.
> 
> If the device comes in MSI-capable and MSI-incapable variants, and the
> user didn't specif

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers

2016-03-04 Thread Michael S. Tsirkin

On Fri, Mar 04, 2016 at 09:42:02AM +0100, Markus Armbruster wrote:
> Plugging an MSI-capable device into an MSI-incapable board works just
> fine, both for physical and for virtual hardware.  What doesn't work is
> plugging an MSI-capable device into an MSI-capable board with *broken*
> MSI support.
> 
> As a convenience feature, we summarily and silently remove a device's
> MSI capability when we detect such a broken board.  At least that's what
> the commit message you quoted claims.

And this makes sense, right?

> In reality, we remove it not just for broken boards, but even for
> MSI-incapable boards.
> 
> I take issue with "summarily and silently", and "even for MSI-incapable
> boards".
> 
> When multiple variants of a device exist, and the user didn't ask for a
> specific one, then picking the one that works best with the board is
> just fine.
> 
> It's absolutely not fine when the user did ask for a specific one.  When
> I ask for msi=on, I mean it.  If it can't work with this board, tell me.
> But silently ignoring my orders is a bug.

I agree. msi is not the only case either.  We really need - for any boolean
flag - a way to figure out whether it was set by user.
With that in place we could fix it.

However, almost no one uses the msi/msi-x flag - we introduced
them only for one reason - for backwards compatibility.
The fact that each time we need a compatibility flag
we also expose a new user interface is very unfortunate.
IMO it was a design mistake made a long time ago and it won't
be easy to fix now.

And for the above reason I personally do not intend to
spend time designing a specific hack just for the msi
property.

> It's fine to have emulations of MSI-capable boards where MSI doesn't yet
> work.  Even if that means we have to reject MSI-capable devices.

I don't know what "reject" means here. Removing msi capability?
In that case I agree.

> It's absolutely not fine to reject them for MSI-incapable boards, where
> they'd work just fine.

I think that as long as users did not ask for msi explicitly,
and board is msi incapable, it does not matter much whether
device has msi capability or not - guest will not try
to use it anyway.

> Furthermore, I doubt the wisdom of creating virtual devices that don't
> exist physically just to have something that works in our broken boards.
> If the physical device exists in MSI-capable and MSI-incapable variants,
> emulating both is fine.  But if it only ever existed MSI-capable, having
> the PCI core(!) drop the MSI capability is a questionable idea.  I
> suspect that the need for this dubious hack would be much smaller if we
> didn't foolishly treat every MSI-incapable board as broken MSI-capable
> board.
> 
> If you conclude that cleaning up this carelessly made mess is not worth
> the bother now, then at least explain the mess in the code, please.
> It's not obvious, and figuring out what's going on and why it is the way
> it is has taken me several hours, and Marcel's help.

I think it's worth cleaning up, or at least documenting.
Fixing it will take much more than the patch proposed here,
and we can not start with this patch as it will cause
regressions.
Adding a comment won't be too much work.
How about the below?

-->

msi_supported -> msi_nonbroken

Rename controller flag to make it clearer what it means.
Add some documentation as well.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

---

diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h
index 50e452b..8124908 100644
--- a/include/hw/pci/msi.h
+++ b/include/hw/pci/msi.h
@@ -29,7 +29,7 @@ struct MSIMessage {
 uint32_t data;
 };
 
-extern bool msi_supported;
+extern bool msi_nonbroken;
 
 void msi_set_message(PCIDevice *dev, MSIMessage msg);
 MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector);
diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 694d398..3c7c8fa 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -186,7 +186,7 @@ static void kvm_apic_realize(DeviceState *dev, Error **errp)
   APIC_SPACE_SIZE);
 
 if (kvm_has_gsi_routing()) {
-msi_supported = true;
+msi_nonbroken = true;
 }
 }
 
diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
index 2b8d709..21d68ee 100644
--- a/hw/i386/xen/xen_apic.c
+++ b/hw/i386/xen/xen_apic.c
@@ -44,7 +44,7 @@ static void xen_apic_realize(DeviceState *dev, Error **errp)
 s->vapic_control = 0;
 memory_region_init_io(&s->io_memory, OBJECT(s), &xen_apic_io_ops, s,
   "xen-apic-msi", APIC_SPACE_SIZE);
-msi_supported = true;
+msi_nonbroken = true;
 }
 
 static void xen_apic_set_base(APICCommonState *s, uint64_t val)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index a299462..28c2ea5 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -874,7 +874,7 @@ static void apic_

[Qemu-block] [PULL 14/16] fdc: add function to determine drive chs limits

2016-03-03 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.

This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Marcel Apfelbaum <mar...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Laszlo Ersek <ler...@redhat.com>
Cc: Kevin O'Connor <ke...@koconnor.net>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: John Snow <js...@redhat.com>
---
 include/hw/block/fdc.h |  2 ++
 hw/block/fdc.c | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index adce14f..1749dab 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -15,5 +15,7 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs);
 
 #endif
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 9838d21..fc3aef9 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2557,6 +2557,29 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 return isa->state.drives[i].drive;
 }
 
+void isa_fdc_get_drive_max_chs(FloppyDriveType type,
+   uint8_t *maxc, uint8_t *maxh, uint8_t *maxs)
+{
+const FDFormat *fdf;
+
+*maxc = *maxh = *maxs = 0;
+for (fdf = fd_formats; fdf->drive != FLOPPY_DRIVE_TYPE_NONE; fdf++) {
+if (fdf->drive != type) {
+continue;
+}
+if (*maxc < fdf->max_track) {
+*maxc = fdf->max_track;
+}
+if (*maxh < fdf->max_head) {
+*maxh = fdf->max_head;
+}
+if (*maxs < fdf->last_sect) {
+*maxs = fdf->last_sect;
+}
+}
+(*maxc)--;
+}
+
 static const VMStateDescription vmstate_isa_fdc ={
 .name = "fdc",
 .version_id = 2,
-- 
MST
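
The scan performed by isa_fdc_get_drive_max_chs() in the patch above can be sketched standalone, for readers without the QEMU tree at hand. DriveType, Format, formats[] and drive_max_chs() are hypothetical stand-ins, and the table values are illustrative only, not QEMU's fd_formats[]. The sketch assumes max_track is a track count, as the final decrement in the patch suggests. Unlike the patch, it skips that decrement when no format matches, so the uint8_t does not wrap to 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's FloppyDriveType. */
typedef enum {
    DRIVE_144,
    DRIVE_288,
    DRIVE_120,
    DRIVE_NONE          /* sentinel: terminates the format table */
} DriveType;

/* Hypothetical stand-in for QEMU's FDFormat. */
typedef struct {
    DriveType drive;
    uint8_t last_sect;  /* highest sector number */
    uint8_t max_track;  /* number of tracks (assumed to be a count) */
    uint8_t max_head;   /* highest head number */
} Format;

/* Illustrative values only. */
static const Format formats[] = {
    { DRIVE_144, 18, 80, 1 },
    { DRIVE_144, 21, 82, 1 },
    { DRIVE_288, 36, 80, 1 },
    { DRIVE_120, 15, 80, 1 },
    { DRIVE_NONE, 0, 0, 0 },
};

/* Report the largest cylinder, head and sector over all formats of 'type'. */
static void drive_max_chs(DriveType type,
                          uint8_t *maxc, uint8_t *maxh, uint8_t *maxs)
{
    const Format *f;

    assert(maxc != NULL && maxh != NULL && maxs != NULL);
    *maxc = *maxh = *maxs = 0;
    for (f = formats; f->drive != DRIVE_NONE; f++) {
        if (f->drive != type) {
            continue;
        }
        if (*maxc < f->max_track) {
            *maxc = f->max_track;
        }
        if (*maxh < f->max_head) {
            *maxh = f->max_head;
        }
        if (*maxs < f->last_sect) {
            *maxs = f->last_sect;
        }
    }
    /* Convert the track count to the highest cylinder index.
     * Guarded here (the patch decrements unconditionally). */
    if (*maxc > 0) {
        (*maxc)--;
    }
}
```

A caller would pass three uint8_t out-parameters, for example drive_max_chs(DRIVE_144, &c, &h, &s), and use the results when filling in the per-drive limits.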

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers

2016-03-03 Thread Michael S. Tsirkin

On Thu, Mar 03, 2016 at 04:03:16PM +0100, Markus Armbruster wrote:
> "Michael S. Tsirkin" <m...@redhat.com> writes:
> 
> > On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote:
> >> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote:
> >> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote:
> >> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int 
> >> >>>>nr_vectors,
> >> >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp)
> >> >>>>  {
> >> >>>>  unsigned int vectors_order;
> >> >>>>-uint16_t flags;
> >> >>>>+uint16_t flags; /* Message Control register value */
> >> >>>>  uint8_t cap_size;
> >> >>>>  int config_offset;
> >> >>>>
> >> >>>>  if (!msi_supported) {
> >> >>>>+error_setg(errp, "MSI is not supported by interrupt 
> >> >>>>controller");
> >> >>>>  return -ENOTSUP;
> >> >>>
> >> >>>First failure mode: board does not support MSI (-ENOTSUP).
> >> >>>
> >> >>>Question to the PCI guys: why is this even an error?  A device with
> >> >>>capability MSI should work just fine in such a board.
> >> >>
> >> >>Hi Markus,
> >> >>
> >> >>Adding Jan Kiszka, maybe he can help.
> >> >>
> >> >>That's a fair question. Is there any history for this decision?
> >> >>The board not supporting MSI has nothing to do with the capability being 
> >> >>there.
> >> >>The HW should not change because the board does not support it.
> >> >>
> >> >>The capability should be present but not active.
> >> >
> >> >Digging in git log will tell you why we have the msi_supported flag:
> >> >
> >> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support 
> >> >functions")
> >> >
> >> >  This is a safety measure to avoid breaking platforms which should 
> >> > support
> >> >  MSI-X but currently lack this in the interrupt controller emulation.
> >> >
> >> >in other words, on some platforms we must hide msi support from devices
> >> >because otherwise guests will try to use it, and our emulation is
> >> >incomplete.
> >> 
> >> 
> >> OK, thanks. So the flag should be "msi_broken" or 
> >> "msi_present_but_not_implemented" and not
> >> "msi_supported" that leads (at least me) to the assumption that some 
> >> platform *does not support msi*
> >> rather than it supports it, but we don't emulate it.
> 
> I agree the name is badly misleading for this role.
> 
> Now let me see how this contraption actually works.  msi_supported is
> global, initialized to false, and becomes globally true when
> 
> 1. certain MSI-capable interrupt controllers realize: "apic",
>   "kvm-apic" if kvm_has_gsi_routing(), "xen-apic", "arm-gicv2m",
>   "openpic" models 1 and 2, "kvm-openpic" models 1 and 2
> 
> 2. "s390-pcihost" class-initializes
> 
> 3. machine "spapr-machine" initializes
> 
> Issues:
> 
> * "Global" is problematic.  What if a board has more than one interrupt
>   controller?  What if one of them sets msi_supported, but the other one
>   is of the kind Michael described, i.e. guests know it has MSI, but our
>   emulation doesn't actually work?
> 
> * "Initialize to false" is problematic.  We don't clear msi_supported
>   when we have a broken interrupt controller, we set it when we have a
>   working one.  The consequence is that boards with non-MSI interrupt
>   controllers are treated just like boards with broken interrupt
>   controllers.
> 
>   Here's how msi_supported is documented:
> 
> /* Flag for interrupt controller to declare MSI/MSI-X support */
> bool msi_supported;
> 
>   This matches how the code works.  However, it contradicts the
>   commit message Michael quoted.  The most plausible explanation is that
>   the commit is flawed.
> 
> * Class-initialize (2.) looks wrong to me.  msi_supported becomes true
>   when QOM type "s390-pcihost" is created, regardless of whether
>   instances get created, let alone used.
> 
> * I'm not su

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers

2016-03-03 Thread Michael S. Tsirkin

On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote:
> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote:
> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote:
> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int 
> >>>>nr_vectors,
> >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp)
> >>>>  {
> >>>>  unsigned int vectors_order;
> >>>>-uint16_t flags;
> >>>>+uint16_t flags; /* Message Control register value */
> >>>>  uint8_t cap_size;
> >>>>  int config_offset;
> >>>>
> >>>>  if (!msi_supported) {
> >>>>+error_setg(errp, "MSI is not supported by interrupt controller");
> >>>>  return -ENOTSUP;
> >>>
> >>>First failure mode: board does not support MSI (-ENOTSUP).
> >>>
> >>>Question to the PCI guys: why is this even an error?  A device with
> >>>capability MSI should work just fine in such a board.
> >>
> >>Hi Markus,
> >>
> >>Adding Jan Kiszka, maybe he can help.
> >>
> >>That's a fair question. Is there any history for this decision?
> >>The board not supporting MSI has nothing to do with the capability being 
> >>there.
> >>The HW should not change because the board does not support it.
> >>
> >>The capability should be present but not active.
> >
> >Digging in git log will tell you why we have the msi_supported flag:
> >
> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support 
> >functions")
> >
> > This is a safety measure to avoid breaking platforms which should 
> > support
> > MSI-X but currently lack this in the interrupt controller emulation.
> >
> >in other words, on some platforms we must hide msi support from devices
> >because otherwise guests will try to use it, and our emulation is
> >incomplete.
> 
> 
> OK, thanks. So the flag should be "msi_broken" or 
> "msi_present_but_not_implemented" and not
> "msi_supported" that leads (at least me) to the assumption that some platform 
> *does not support msi*
> rather than it supports it, but we don't emulate it.
> 
> 
> >
> >And the conclusion from that is that for msi_init to fail silently is
> >at the moment the right thing to do.
> 
> But this is not the only thing we do, we are modifying the PCI devices. We 
> could fail starting the VM
> if a device supporting MSI is added on a platform with broken msi, but this 
> will prevent us from using
> assigned devices. Emulated devices should be created with a specific 
> "msi=off" flag.
> 
> Thanks,
> Marcel

That will just break a bunch of valid configurations, for no real
benefit to users.

> >
> >The only other reason for it to fail is pci config space corruption,
> >this probably never happens in practice.
> >

Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers

2016-03-03 Thread Michael S. Tsirkin

On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote:
> >>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int 
> >>nr_vectors,
> >>+ bool msi64bit, bool msi_per_vector_mask, Error **errp)
> >>  {
> >>  unsigned int vectors_order;
> >>-uint16_t flags;
> >>+uint16_t flags; /* Message Control register value */
> >>  uint8_t cap_size;
> >>  int config_offset;
> >>
> >>  if (!msi_supported) {
> >>+error_setg(errp, "MSI is not supported by interrupt controller");
> >>  return -ENOTSUP;
> >
> >First failure mode: board does not support MSI (-ENOTSUP).
> >
> >Question to the PCI guys: why is this even an error?  A device with
> >capability MSI should work just fine in such a board.
> 
> Hi Markus,
> 
> Adding Jan Kiszka, maybe he can help.
> 
> That's a fair question. Is there any history for this decision?
> The board not supporting MSI has nothing to do with the capability being 
> there.
> The HW should not change because the board does not support it.
> 
> The capability should be present but not active.

Digging in git log will tell you why we have the msi_supported flag:

commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support 
functions")

This is a safety measure to avoid breaking platforms which should 
support
MSI-X but currently lack this in the interrupt controller emulation.

in other words, on some platforms we must hide msi support from devices
because otherwise guests will try to use it, and our emulation is
incomplete.

And the conclusion from that is that for msi_init to fail silently is
at the moment the right thing to do.

The only other reason for it to fail is pci config space corruption,
this probably never happens in practice.

-- 
MST
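
The behaviour described above (a global flag that hides MSI from devices, with the init failure swallowed by the caller) can be sketched as follows. This is a minimal model under stated assumptions, not QEMU code: Device, dev_msi_init() and dev_realize() are hypothetical names, and only the flag name msi_supported is taken from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Global, initialized to false; a (hypothetical) interrupt controller
 * model sets it once its MSI emulation is known to work. */
static bool msi_supported;

/* Hypothetical device model. */
typedef struct {
    const char *name;
    bool has_msi_cap;   /* is the MSI capability exposed to the guest? */
} Device;

/* Refuse to add the capability while the flag is unset. */
static int dev_msi_init(Device *dev)
{
    if (!msi_supported) {
        return -ENOTSUP;
    }
    dev->has_msi_cap = true;
    return 0;
}

/* The failure is deliberately not fatal: the device is still created,
 * just without the MSI capability, so the guest never tries to use it. */
static void dev_realize(Device *dev)
{
    if (dev_msi_init(dev) < 0) {
        dev->has_msi_cap = false;
    }
}
```

The design point under discussion is visible here: because the flag starts out false, a board that never had MSI and a board whose MSI emulation is incomplete take the same path.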

[Qemu-block] [PULL v3 12/21] virtio-blk: fix "disabled data plane" mode

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path.  This patch simplifies the
handling of disabled mode a bit.  In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.

This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start.  Unfortunately, these calls will not
do anything.  To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, where
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.

Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  1 +
 hw/block/dataplane/virtio-blk.c | 21 +
 hw/block/virtio-blk.c   |  2 +-
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 199bb0e..781969d 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -56,6 +56,7 @@ typedef struct VirtIOBlock {
 /* Function to push to vq and notify guest */
 void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
 Notifier migration_state_notifier;
+bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 03b81bc..cc521c1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -28,7 +28,6 @@
 #include "qom/object_interfaces.h"
 
 struct VirtIOBlockDataPlane {
-bool started;
 bool starting;
 bool stopping;
 bool disabled;
@@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 VirtQueue *vq;
 int r;
 
-if (s->started || s->disabled) {
-return;
-}
-
-if (s->starting) {
+if (vblk->dataplane_started || s->starting) {
 return;
 }
 
@@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 vblk->complete_request = complete_request_vring;
 
 s->starting = false;
-s->started = true;
+vblk->dataplane_started = true;
 trace_virtio_blk_data_plane_start(s);
 
 blk_set_aio_context(s->conf->conf.blk, s->ctx);
@@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
 vring_teardown(&s->vring, s->vdev, 0);
-s->disabled = true;
   fail_vring:
+s->disabled = true;
 s->starting = false;
+vblk->dataplane_started = true;
 }
 
 /* Context: QEMU global mutex held */
@@ -331,13 +327,14 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
+if (!vblk->dataplane_started || s->stopping) {
+return;
+}
 
 /* Better luck next time. */
 if (s->disabled) {
 s->disabled = false;
-return;
-}
-if (!s->started || s->stopping) {
+vblk->dataplane_started = false;
 return;
 }
 s->stopping = true;
@@ -364,6 +361,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, 1, false);
 
-s->started = false;
+vblk->dataplane_started = false;
 s->stopping = false;
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c427698..e04c8f5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -589,7 +589,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
  * dataplane here instead of waiting for .set_status().
  */
-if (s->dataplane) {
+if (s->dataplane && !s->dataplane_started) {
 virtio_blk_data_plane_start(s->dataplane);
 return;
 }
-- 
MST
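
The state handling this patch describes can be modelled in a few lines. The sketch below is a simplified stand-in, not the QEMU code: DataPlane, Blk, fail_setup, start_attempts and handled_inline are hypothetical, and the real start path does far more work. It shows why setting the started flag even on failure stops the repeated calls to the start function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of VirtIOBlockDataPlane. */
typedef struct {
    bool starting;
    bool disabled;        /* setup failed; use the normal virtio path */
    bool fail_setup;      /* test knob: force setup to fail */
    int start_attempts;   /* how many times setup was really attempted */
} DataPlane;

/* Hypothetical, reduced model of VirtIOBlock. */
typedef struct {
    bool dataplane_started;
    DataPlane *dataplane;
    int handled_inline;   /* requests served on the normal path */
} Blk;

static void data_plane_start(Blk *b)
{
    DataPlane *s = b->dataplane;

    if (b->dataplane_started || s->starting) {
        return;
    }
    s->starting = true;
    s->start_attempts++;
    if (s->fail_setup) {
        s->disabled = true;       /* fall back to the normal path */
    }
    s->starting = false;
    b->dataplane_started = true;  /* set on success and on failure */
}

static void data_plane_stop(Blk *b)
{
    DataPlane *s = b->dataplane;

    if (!b->dataplane_started) {
        return;
    }
    if (s->disabled) {
        s->disabled = false;      /* better luck next time */
    }
    b->dataplane_started = false;
}

static void handle_output(Blk *b)
{
    /* Only try to start once; afterwards fall through. */
    if (b->dataplane && !b->dataplane_started) {
        data_plane_start(b);
        return;
    }
    b->handled_inline++;
}
```

With the old check (dataplane pointer only), every call to handle_output() in disabled mode would re-enter the start function and return without serving the request.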

[Qemu-block] [PULL v3 09/21] vring: make vring_enable_notification return void

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Make the API more similar to the regular virtqueue API.  This will
help when modifying the code to not use vring.c anymore.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/dataplane/vring.h | 2 +-
 hw/block/dataplane/virtio-blk.c | 3 ++-
 hw/virtio/dataplane/vring.c | 3 +--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index e80985e..e1c2a65 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -42,7 +42,7 @@ static inline void vring_set_broken(Vring *vring)
 bool vring_setup(Vring *vring, VirtIODevice *vdev, int n);
 void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
 void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz);
 void vring_push(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0d99781..03b81bc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -128,7 +128,8 @@ static void handle_notify(EventNotifier *e)
 /* Re-enable guest->host notifies and stop processing the vring.
  * But if the guest has snuck in more descriptors, keep processing.
  */
-if (vring_enable_notification(s->vdev, &s->vring)) {
+vring_enable_notification(s->vdev, &s->vring);
+if (!vring_more_avail(s->vdev, &s->vring)) {
 break;
 }
 } else { /* fatal error */
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 4308d9f..157e8b8 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -175,7 +175,7 @@ void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
  *
  * Return true if the vring is empty, false if there are more requests.
  */
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 {
 if (virtio_vdev_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
 vring_avail_event(&vring->vr) = vring->vr.avail->idx;
@@ -183,7 +183,6 @@ bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 vring_clear_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
 }
 smp_mb(); /* ensure update is seen before reading avail_idx */
-return !vring_more_avail(vdev, vring);
 }
 
 /* This is stolen from linux/drivers/vhost/vhost.c:vhost_notify() */
-- 
MST
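
After this change the caller enables notifications and then checks separately for new entries. The pattern can be sketched with C11 atomics as below. Ring, ring_drain() and the other helpers are hypothetical, the ring is reduced to an index pair, and the sketch assumes a single consumer; it is not the vring.c implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced ring: only the indices and the suppress flag. */
typedef struct {
    _Atomic uint16_t avail_idx;  /* advanced by the producer (guest) */
    uint16_t last_avail_idx;     /* consumer's private position */
    atomic_bool no_notify;       /* consumer asks producer not to kick */
} Ring;

static void ring_disable_notification(Ring *r)
{
    atomic_store(&r->no_notify, true);
}

/* Returns void, as in the patch: the caller re-checks explicitly. */
static void ring_enable_notification(Ring *r)
{
    atomic_store_explicit(&r->no_notify, false, memory_order_relaxed);
    /* Make the flag update visible before avail_idx is read again. */
    atomic_thread_fence(memory_order_seq_cst);
}

static bool ring_more_avail(Ring *r)
{
    return atomic_load_explicit(&r->avail_idx, memory_order_acquire)
           != r->last_avail_idx;
}

/* Consume everything available; returns the number of entries consumed. */
static int ring_drain(Ring *r)
{
    int n = 0;

    ring_disable_notification(r);
    for (;;) {
        while (ring_more_avail(r)) {
            r->last_avail_idx++;   /* "process" one entry */
            n++;
        }
        ring_enable_notification(r);
        if (!ring_more_avail(r)) {
            break;                 /* nothing snuck in: safe to sleep */
        }
        ring_disable_notification(r);  /* raced with producer: go again */
    }
    return n;
}
```

The re-check after enabling is what closes the window in which the producer adds an entry while notifications are still suppressed.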

[Qemu-block] [PULL v3 08/21] block-migration: acquire AioContext as necessary

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

This is needed because dataplane will run during block migration as well.

The block device migration code is quite liberal in taking the iothread
mutex.  For simplicity, keep it the same way, even though one could
actually choose between the BQL (for regular BlockDriverStates) and
the AioContext (for dataplane BlockDriverStates).  When the block layer
is made fully thread safe, aio_context_acquire shall go away altogether.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
---
 migration/block.c | 65 ---
 1 file changed, 52 insertions(+), 13 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 3a8330a..72883d7 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -54,17 +54,25 @@ typedef struct BlkMigDevState {
 int shared_base;
 int64_t total_sectors;
 QSIMPLEQ_ENTRY(BlkMigDevState) entry;
+Error *blocker;
 
 /* Only used by migration thread.  Does not need a lock.  */
 int bulk_completed;
 int64_t cur_sector;
 int64_t cur_dirty;
 
-/* Protected by block migration lock.  */
+/* Data in the aio_bitmap is protected by block migration lock.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 unsigned long *aio_bitmap;
+
+/* Protected by block migration lock.  */
 int64_t completed_sectors;
+
+/* During migration this is protected by iothread lock / AioContext.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 BdrvDirtyBitmap *dirty_bitmap;
-Error *blocker;
 } BlkMigDevState;
 
 typedef struct BlkMigBlock {
@@ -100,7 +108,7 @@ typedef struct BlkMigState {
 int prev_progress;
 int bulk_completed;
 
-/* Lock must be taken _inside_ the iothread lock.  */
+/* Lock must be taken _inside_ the iothread lock and any AioContexts.  */
 QemuMutex lock;
 } BlkMigState;
 
@@ -264,11 +272,13 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 
 if (bmds->shared_base) {
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bs));
 while (cur_sector < total_sectors &&
!bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
   &nr_sectors)) {
 cur_sector += nr_sectors;
 }
+aio_context_release(bdrv_get_aio_context(bs));
 qemu_mutex_unlock_iothread();
 }
 
@@ -302,11 +312,21 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 block_mig_state.submitted++;
 blk_mig_unlock();
 
+/* We do not know if bs is under the main thread (and thus does
+ * not acquire the AioContext when doing AIO) or rather under
+ * dataplane.  Thus acquire both the iothread mutex and the
+ * AioContext.
+ *
+ * This is ugly and will disappear when we make bdrv_* thread-safe,
+ * without the need to acquire the AioContext.
+ */
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
 nr_sectors, blk_mig_read_cb, blk);
 
 bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 qemu_mutex_unlock_iothread();
 
 bmds->cur_sector = cur_sector + nr_sectors;
@@ -321,8 +341,10 @@ static int set_dirty_tracking(void)
 int ret;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
   NULL, NULL);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 if (!bmds->dirty_bitmap) {
 ret = -errno;
 goto fail;
@@ -333,18 +355,24 @@ static int set_dirty_tracking(void)
 fail:
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
 if (bmds->dirty_bitmap) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 }
 }
 return ret;
 }
 
+/* Called with iothread lock taken.  */
+
 static void unset_dirty_tracking(void)
 {
 BlkMigDevState *bmds;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));

[Qemu-block] [PULL v2 14/23] virtio-blk: fix "disabled data plane" mode

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path.  This patch simplifies the
handling of disabled mode a bit.  In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.

This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start.  Unfortunately, these calls will not
do anything.  To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, where
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.

Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  1 +
 hw/block/dataplane/virtio-blk.c | 21 +
 hw/block/virtio-blk.c   |  2 +-
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 199bb0e..781969d 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -56,6 +56,7 @@ typedef struct VirtIOBlock {
 /* Function to push to vq and notify guest */
 void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
 Notifier migration_state_notifier;
+bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 03b81bc..cc521c1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -28,7 +28,6 @@
 #include "qom/object_interfaces.h"
 
 struct VirtIOBlockDataPlane {
-bool started;
 bool starting;
 bool stopping;
 bool disabled;
@@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 VirtQueue *vq;
 int r;
 
-if (s->started || s->disabled) {
-return;
-}
-
-if (s->starting) {
+if (vblk->dataplane_started || s->starting) {
 return;
 }
 
@@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 vblk->complete_request = complete_request_vring;
 
 s->starting = false;
-s->started = true;
+vblk->dataplane_started = true;
 trace_virtio_blk_data_plane_start(s);
 
 blk_set_aio_context(s->conf->conf.blk, s->ctx);
@@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
 vring_teardown(&s->vring, s->vdev, 0);
-s->disabled = true;
   fail_vring:
+s->disabled = true;
 s->starting = false;
+vblk->dataplane_started = true;
 }
 
 /* Context: QEMU global mutex held */
@@ -331,13 +327,14 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
+if (!vblk->dataplane_started || s->stopping) {
+return;
+}
 
 /* Better luck next time. */
 if (s->disabled) {
 s->disabled = false;
-return;
-}
-if (!s->started || s->stopping) {
+vblk->dataplane_started = false;
 return;
 }
 s->stopping = true;
@@ -364,6 +361,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, 1, false);
 
-s->started = false;
+vblk->dataplane_started = false;
 s->stopping = false;
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c427698..e04c8f5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -589,7 +589,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
  * dataplane here instead of waiting for .set_status().
  */
-if (s->dataplane) {
+if (s->dataplane && !s->dataplane_started) {
 virtio_blk_data_plane_start(s->dataplane);
 return;
 }
-- 
MST

[Qemu-block] [PULL v2 11/23] vring: make vring_enable_notification return void

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Make the API more similar to the regular virtqueue API.  This will
help when modifying the code to not use vring.c anymore.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/dataplane/vring.h | 2 +-
 hw/block/dataplane/virtio-blk.c | 3 ++-
 hw/virtio/dataplane/vring.c | 3 +--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index e80985e..e1c2a65 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -42,7 +42,7 @@ static inline void vring_set_broken(Vring *vring)
 bool vring_setup(Vring *vring, VirtIODevice *vdev, int n);
 void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
 void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz);
 void vring_push(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0d99781..03b81bc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -128,7 +128,8 @@ static void handle_notify(EventNotifier *e)
 /* Re-enable guest->host notifies and stop processing the vring.
  * But if the guest has snuck in more descriptors, keep processing.
  */
-if (vring_enable_notification(s->vdev, &s->vring)) {
+vring_enable_notification(s->vdev, &s->vring);
+if (!vring_more_avail(s->vdev, &s->vring)) {
 break;
 }
 } else { /* fatal error */
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 4308d9f..157e8b8 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -175,7 +175,7 @@ void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
  *
  * Return true if the vring is empty, false if there are more requests.
  */
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 {
 if (virtio_vdev_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
 vring_avail_event(&vring->vr) = vring->vr.avail->idx;
@@ -183,7 +183,6 @@ bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 vring_clear_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
 }
 smp_mb(); /* ensure update is seen before reading avail_idx */
-return !vring_more_avail(vdev, vring);
 }
 
 /* This is stolen from linux/drivers/vhost/vhost.c:vhost_notify() */
-- 
MST

[Qemu-block] [PULL v2 15/23] virtio-blk: do not use vring in dataplane

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 hw/block/dataplane/virtio-blk.h |   1 +
 include/hw/virtio/virtio-blk.h  |   3 --
 hw/block/dataplane/virtio-blk.c | 112 +---
 hw/block/virtio-blk.c   |  49 +++---
 4 files changed, 19 insertions(+), 146 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.h b/hw/block/dataplane/virtio-blk.h
index c88d40e..0714c11 100644
--- a/hw/block/dataplane/virtio-blk.h
+++ b/hw/block/dataplane/virtio-blk.h
@@ -26,5 +26,6 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_drain(VirtIOBlockDataPlane *s);
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s);
 
 #endif /* HW_DATAPLANE_VIRTIO_BLK_H */
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 781969d..ae84d92 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -53,9 +53,6 @@ typedef struct VirtIOBlock {
 unsigned short sector_mask;
 bool original_wce;
 VMChangeStateEntry *change;
-/* Function to push to vq and notify guest */
-void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
-Notifier migration_state_notifier;
 bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index cc521c1..36f3d2b 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -18,8 +18,6 @@
 #include "qemu/thread.h"
 #include "qemu/error-report.h"
 #include "hw/virtio/virtio-access.h"
-#include "hw/virtio/dataplane/vring.h"
-#include "hw/virtio/dataplane/vring-accessors.h"
 #include "sysemu/block-backend.h"
 #include "hw/virtio/virtio-blk.h"
 #include "virtio-blk.h"
@@ -35,7 +33,7 @@ struct VirtIOBlockDataPlane {
 VirtIOBlkConf *conf;
 
 VirtIODevice *vdev;
-Vring vring;/* virtqueue vring */
+VirtQueue *vq;  /* virtqueue vring */
 EventNotifier *guest_notifier;  /* irq */
 QEMUBH *bh; /* bh for guest notification */
 
@@ -48,94 +46,26 @@ struct VirtIOBlockDataPlane {
  */
 IOThread *iothread;
 AioContext *ctx;
-EventNotifier host_notifier;/* doorbell */
 
 /* Operation blocker on BDS */
 Error *blocker;
-void (*saved_complete_request)(struct VirtIOBlockReq *req,
-   unsigned char status);
 };
 
 /* Raise an interrupt to signal guest, if necessary */
-static void notify_guest(VirtIOBlockDataPlane *s)
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s)
 {
-if (!vring_should_notify(s->vdev, &s->vring)) {
-return;
-}
-
-event_notifier_set(s->guest_notifier);
+qemu_bh_schedule(s->bh);
 }
 
 static void notify_guest_bh(void *opaque)
 {
 VirtIOBlockDataPlane *s = opaque;
 
-notify_guest(s);
-}
-
-static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
-{
-VirtIOBlockDataPlane *s = req->dev->dataplane;
-stb_p(&req->in->status, status);
-
-vring_push(s->vdev, &req->dev->dataplane->vring, &req->elem, req->in_len);
-
-/* Suppress notification to guest by BH and its scheduled
- * flag because requests are completed as a batch after io
- * plug & unplug is introduced, and the BH can still be
- * executed in dataplane aio context even after it is
- * stopped, so needn't worry about notification loss with BH.
- */
-qemu_bh_schedule(s->bh);
-}
-
-static void handle_notify(EventNotifier *e)
-{
-VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
-   host_notifier);
-VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
-
-event_notifier_test_and_clear(&s->host_notifier);
-blk_io_plug(s->conf->conf.blk);
-for (;;) {
-MultiReqBuffer mrb = {};
-
-/* Disable guest->host notifies to avoid unnecessary vmexits */
-vring_disable_notification(s->vdev, &s->vring);
-
-for (;;) {
-VirtIOBlockReq *req = vring_pop(s->vdev, &s->vring,
-sizeof(VirtIOBlockReq));
-
-if (req == NULL) {
-break; /* no more requests */
-}
-
-virtio_blk_init_request(vblk, req);
-trace_virtio_blk_data_plane_process_request(s, req->elem.out_num,

[Qemu-block] [PULL v2 10/23] block-migration: acquire AioContext as necessary

2016-02-25 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

This is needed because dataplane will run during block migration as well.

The block device migration code is quite liberal in taking the iothread
mutex.  For simplicity, keep it the same way, even though one could
actually choose between the BQL (for regular BlockDriverStates) and
the AioContext (for dataplane BlockDriverStates).  When the block layer
is made fully thread safe, aio_context_acquire shall go away altogether.
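
For readers new to the locking rule in this message, the nesting can be
sketched with plain pthread mutexes.  This is only an illustration under
assumptions, not QEMU code: big_lock, ctx_lock and with_both_locks() are
hypothetical stand-ins for the iothread mutex, one device's AioContext and
a caller in migration/block.c.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: big_lock plays the iothread mutex (BQL),
 * ctx_lock plays the AioContext of one block device. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_depth;      /* only modified while big_lock is held */

/* Run fn(opaque) with both locks held, always in the same order:
 * outer lock first, per-device lock second, released in reverse. */
static int with_both_locks(int (*fn)(void *opaque), void *opaque)
{
    int ret;

    pthread_mutex_lock(&big_lock);
    lock_depth++;
    pthread_mutex_lock(&ctx_lock);
    lock_depth++;

    assert(lock_depth == 2);    /* both locks are held here */
    ret = fn(opaque);

    lock_depth--;
    pthread_mutex_unlock(&ctx_lock);
    lock_depth--;
    pthread_mutex_unlock(&big_lock);
    return ret;
}

/* Example callback: reads a value that both locks protect. */
static int read_counter(void *opaque)
{
    return *(int *)opaque;
}
```

A caller would wrap each block-layer access, for example
with_both_locks(read_counter, &value).  Taking the two locks in one fixed
order everywhere is what keeps the nesting free of deadlocks; the block
migration lock would then be taken innermost.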

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
---
 migration/block.c | 65 ---
 1 file changed, 52 insertions(+), 13 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 3a8330a..72883d7 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -54,17 +54,25 @@ typedef struct BlkMigDevState {
 int shared_base;
 int64_t total_sectors;
 QSIMPLEQ_ENTRY(BlkMigDevState) entry;
+Error *blocker;
 
 /* Only used by migration thread.  Does not need a lock.  */
 int bulk_completed;
 int64_t cur_sector;
 int64_t cur_dirty;
 
-/* Protected by block migration lock.  */
+/* Data in the aio_bitmap is protected by block migration lock.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 unsigned long *aio_bitmap;
+
+/* Protected by block migration lock.  */
 int64_t completed_sectors;
+
+/* During migration this is protected by iothread lock / AioContext.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 BdrvDirtyBitmap *dirty_bitmap;
-Error *blocker;
 } BlkMigDevState;
 
 typedef struct BlkMigBlock {
@@ -100,7 +108,7 @@ typedef struct BlkMigState {
 int prev_progress;
 int bulk_completed;
 
-/* Lock must be taken _inside_ the iothread lock.  */
+/* Lock must be taken _inside_ the iothread lock and any AioContexts.  */
 QemuMutex lock;
 } BlkMigState;
 
@@ -264,11 +272,13 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 
 if (bmds->shared_base) {
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bs));
 while (cur_sector < total_sectors &&
!bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
   &nr_sectors)) {
 cur_sector += nr_sectors;
 }
+aio_context_release(bdrv_get_aio_context(bs));
 qemu_mutex_unlock_iothread();
 }
 
@@ -302,11 +312,21 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 block_mig_state.submitted++;
 blk_mig_unlock();
 
+/* We do not know if bs is under the main thread (and thus does
+ * not acquire the AioContext when doing AIO) or rather under
+ * dataplane.  Thus acquire both the iothread mutex and the
+ * AioContext.
+ *
+ * This is ugly and will disappear when we make bdrv_* thread-safe,
+ * without the need to acquire the AioContext.
+ */
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
 nr_sectors, blk_mig_read_cb, blk);
 
 bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 qemu_mutex_unlock_iothread();
 
 bmds->cur_sector = cur_sector + nr_sectors;
@@ -321,8 +341,10 @@ static int set_dirty_tracking(void)
 int ret;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
   NULL, NULL);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 if (!bmds->dirty_bitmap) {
 ret = -errno;
 goto fail;
@@ -333,18 +355,24 @@ static int set_dirty_tracking(void)
 fail:
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
 if (bmds->dirty_bitmap) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 }
 }
 return ret;
 }
 
+/* Called with iothread lock taken.  */
+
 static void unset_dirty_tracking(void)
 {
 BlkMigDevState *bmds;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));

[Qemu-block] [PULL 15/23] virtio-blk: do not use vring in dataplane

2016-02-24 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 hw/block/dataplane/virtio-blk.h |   1 +
 include/hw/virtio/virtio-blk.h  |   3 --
 hw/block/dataplane/virtio-blk.c | 112 +---
 hw/block/virtio-blk.c   |  49 +++---
 4 files changed, 19 insertions(+), 146 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.h b/hw/block/dataplane/virtio-blk.h
index c88d40e..0714c11 100644
--- a/hw/block/dataplane/virtio-blk.h
+++ b/hw/block/dataplane/virtio-blk.h
@@ -26,5 +26,6 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_drain(VirtIOBlockDataPlane *s);
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s);
 
 #endif /* HW_DATAPLANE_VIRTIO_BLK_H */
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 781969d..ae84d92 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -53,9 +53,6 @@ typedef struct VirtIOBlock {
 unsigned short sector_mask;
 bool original_wce;
 VMChangeStateEntry *change;
-/* Function to push to vq and notify guest */
-void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
-Notifier migration_state_notifier;
 bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index cc521c1..36f3d2b 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -18,8 +18,6 @@
 #include "qemu/thread.h"
 #include "qemu/error-report.h"
 #include "hw/virtio/virtio-access.h"
-#include "hw/virtio/dataplane/vring.h"
-#include "hw/virtio/dataplane/vring-accessors.h"
 #include "sysemu/block-backend.h"
 #include "hw/virtio/virtio-blk.h"
 #include "virtio-blk.h"
@@ -35,7 +33,7 @@ struct VirtIOBlockDataPlane {
 VirtIOBlkConf *conf;
 
 VirtIODevice *vdev;
-Vring vring;/* virtqueue vring */
+VirtQueue *vq;  /* virtqueue vring */
 EventNotifier *guest_notifier;  /* irq */
 QEMUBH *bh; /* bh for guest notification */
 
@@ -48,94 +46,26 @@ struct VirtIOBlockDataPlane {
  */
 IOThread *iothread;
 AioContext *ctx;
-EventNotifier host_notifier;/* doorbell */
 
 /* Operation blocker on BDS */
 Error *blocker;
-void (*saved_complete_request)(struct VirtIOBlockReq *req,
-   unsigned char status);
 };
 
 /* Raise an interrupt to signal guest, if necessary */
-static void notify_guest(VirtIOBlockDataPlane *s)
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s)
 {
-if (!vring_should_notify(s->vdev, &s->vring)) {
-return;
-}
-
-event_notifier_set(s->guest_notifier);
+qemu_bh_schedule(s->bh);
 }
 
 static void notify_guest_bh(void *opaque)
 {
 VirtIOBlockDataPlane *s = opaque;
 
-notify_guest(s);
-}
-
-static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
-{
-VirtIOBlockDataPlane *s = req->dev->dataplane;
-stb_p(&req->in->status, status);
-
-vring_push(s->vdev, &req->dev->dataplane->vring, &req->elem, req->in_len);
-
-/* Suppress notification to guest by BH and its scheduled
- * flag because requests are completed as a batch after io
- * plug & unplug is introduced, and the BH can still be
- * executed in dataplane aio context even after it is
- * stopped, so needn't worry about notification loss with BH.
- */
-qemu_bh_schedule(s->bh);
-}
-
-static void handle_notify(EventNotifier *e)
-{
-VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
-   host_notifier);
-VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
-
-event_notifier_test_and_clear(&s->host_notifier);
-blk_io_plug(s->conf->conf.blk);
-for (;;) {
-MultiReqBuffer mrb = {};
-
-/* Disable guest->host notifies to avoid unnecessary vmexits */
-vring_disable_notification(s->vdev, &s->vring);
-
-for (;;) {
-VirtIOBlockReq *req = vring_pop(s->vdev, &s->vring,
-sizeof(VirtIOBlockReq));
-
-if (req == NULL) {
-break; /* no more requests */
-}
-
-virtio_blk_init_request(vblk, req);
-trace_virtio_blk_data_plane_process_request(s, req->elem.out_num,

[Qemu-block] [PULL 14/23] virtio-blk: fix "disabled data plane" mode

2016-02-24 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path.  This patch simplifies the
handling of disabled mode a bit.  In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.

This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start.  Unfortunately, these calls will not
do anything.  To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, where
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.

Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.
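
The flag handling described above can be modelled in a few lines of C.
This is a reduced sketch under assumptions, not the patch itself: struct
plane and the plane_*() helpers are hypothetical, and setup_ok stands in
for notifier and vring setup succeeding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the flags discussed above. */
struct plane {
    bool started;    /* set after any start attempt, even a failed one */
    bool disabled;   /* start failed: fall back to the normal virtio path */
    int start_calls; /* counts real start attempts, for illustration */
};

/* Try to start; setup_ok stands in for the real setup succeeding. */
static void plane_start(struct plane *p, bool setup_ok)
{
    if (p->started) {
        return;
    }
    assert(!p->disabled);   /* invariant: disabled implies started */
    p->start_calls++;
    p->disabled = !setup_ok;
    p->started = true;      /* also set in disabled mode */
}

/* Stop; a disabled plane just forgets the failure so that the next
 * start can try again. */
static void plane_stop(struct plane *p)
{
    if (!p->started) {
        return;
    }
    p->disabled = false;
    p->started = false;
}

/* Returns true if the request must be handled on the normal path. */
static bool plane_handle_output(struct plane *p, bool setup_ok)
{
    if (!p->started) {
        plane_start(p, setup_ok);
        return false;       /* mirrors the early return after starting */
    }
    return p->disabled;
}
```

With started set even when setup fails, repeated kicks in disabled mode
call plane_start() only once instead of on every notification.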

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/virtio-blk.h  |  1 +
 hw/block/dataplane/virtio-blk.c | 21 +
 hw/block/virtio-blk.c   |  2 +-
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 199bb0e..781969d 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -56,6 +56,7 @@ typedef struct VirtIOBlock {
 /* Function to push to vq and notify guest */
 void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
 Notifier migration_state_notifier;
+bool dataplane_started;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 03b81bc..cc521c1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -28,7 +28,6 @@
 #include "qom/object_interfaces.h"
 
 struct VirtIOBlockDataPlane {
-bool started;
 bool starting;
 bool stopping;
 bool disabled;
@@ -264,11 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 VirtQueue *vq;
 int r;
 
-if (s->started || s->disabled) {
-return;
-}
-
-if (s->starting) {
+if (vblk->dataplane_started || s->starting) {
 return;
 }
 
@@ -300,7 +295,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 vblk->complete_request = complete_request_vring;
 
 s->starting = false;
-s->started = true;
+vblk->dataplane_started = true;
 trace_virtio_blk_data_plane_start(s);
 
 blk_set_aio_context(s->conf->conf.blk, s->ctx);
@@ -319,9 +314,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
 vring_teardown(&s->vring, s->vdev, 0);
-s->disabled = true;
   fail_vring:
+s->disabled = true;
 s->starting = false;
+vblk->dataplane_started = true;
 }
 
 /* Context: QEMU global mutex held */
@@ -331,13 +327,14 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
+if (!vblk->dataplane_started || s->stopping) {
+return;
+}
 
 /* Better luck next time. */
 if (s->disabled) {
 s->disabled = false;
-return;
-}
-if (!s->started || s->stopping) {
+vblk->dataplane_started = false;
 return;
 }
 s->stopping = true;
@@ -364,6 +361,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, 1, false);
 
-s->started = false;
+vblk->dataplane_started = false;
 s->stopping = false;
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c427698..e04c8f5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -589,7 +589,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
  * dataplane here instead of waiting for .set_status().
  */
-if (s->dataplane) {
+if (s->dataplane && !s->dataplane_started) {
 virtio_blk_data_plane_start(s->dataplane);
 return;
 }
-- 
MST

[Qemu-block] [PULL 11/23] vring: make vring_enable_notification return void

2016-02-24 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Make the API more similar to the regular virtqueue API.  This will
help when modifying the code to not use vring.c anymore.
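
The calling pattern this leads to (enable notifications, then let the
caller re-check for work) can be sketched on a toy ring.  Everything here
is hypothetical: struct toy_ring and the toy_*() helpers only imitate the
shape of the vring functions, and the fence stands in for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy ring: the producer bumps avail_idx, the consumer
 * tracks how far it has read in last_avail_idx. */
struct toy_ring {
    unsigned avail_idx;
    unsigned last_avail_idx;
    bool notify_enabled;
};

static void toy_disable_notification(struct toy_ring *r)
{
    r->notify_enabled = false;
}

/* Returns nothing, like the reworked vring_enable_notification(). */
static void toy_enable_notification(struct toy_ring *r)
{
    r->notify_enabled = true;
    /* order the flag update before the caller re-reads avail_idx */
    atomic_thread_fence(memory_order_seq_cst);
}

static bool toy_more_avail(const struct toy_ring *r)
{
    return r->avail_idx != r->last_avail_idx;
}

/* Drain the ring; returns how many entries were consumed. */
static unsigned toy_drain(struct toy_ring *r)
{
    unsigned done = 0;

    for (;;) {
        toy_disable_notification(r);
        while (toy_more_avail(r)) {
            r->last_avail_idx++;
            done++;
        }
        toy_enable_notification(r);
        if (!toy_more_avail(r)) {   /* the check the caller now owns */
            break;
        }
    }
    assert(r->notify_enabled);      /* we only leave with notifies on */
    return done;
}
```

Re-checking after enabling closes the window in which the producer could
add an entry while notifications were still off.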

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 include/hw/virtio/dataplane/vring.h | 2 +-
 hw/block/dataplane/virtio-blk.c | 3 ++-
 hw/virtio/dataplane/vring.c | 3 +--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index e80985e..e1c2a65 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -42,7 +42,7 @@ static inline void vring_set_broken(Vring *vring)
 bool vring_setup(Vring *vring, VirtIODevice *vdev, int n);
 void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
 void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz);
 void vring_push(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0d99781..03b81bc 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -128,7 +128,8 @@ static void handle_notify(EventNotifier *e)
 /* Re-enable guest->host notifies and stop processing the vring.
  * But if the guest has snuck in more descriptors, keep processing.
  */
-if (vring_enable_notification(s->vdev, &s->vring)) {
+vring_enable_notification(s->vdev, &s->vring);
+if (!vring_more_avail(s->vdev, &s->vring)) {
 break;
 }
 } else { /* fatal error */
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 4308d9f..157e8b8 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -175,7 +175,7 @@ void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
  *
  * Return true if the vring is empty, false if there are more requests.
  */
-bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
+void vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 {
 if (virtio_vdev_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
 vring_avail_event(&vring->vr) = vring->vr.avail->idx;
@@ -183,7 +183,6 @@ bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
 vring_clear_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
 }
 smp_mb(); /* ensure update is seen before reading avail_idx */
-return !vring_more_avail(vdev, vring);
 }
 
 /* This is stolen from linux/drivers/vhost/vhost.c:vhost_notify() */
-- 
MST

[Qemu-block] [PULL 10/23] block-migration: acquire AioContext as necessary

2016-02-24 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

This is needed because dataplane will run during block migration as well.

The block device migration code is quite liberal in taking the iothread
mutex.  For simplicity, keep it the same way, even though one could
actually choose between the BQL (for regular BlockDriverStates) and
the AioContext (for dataplane BlockDriverStates).  When the block layer
is made fully thread safe, aio_context_acquire shall go away altogether.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
Reviewed-by: Fam Zheng <f...@redhat.com>
---
 migration/block.c | 65 ---
 1 file changed, 52 insertions(+), 13 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 3a8330a..72883d7 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -54,17 +54,25 @@ typedef struct BlkMigDevState {
 int shared_base;
 int64_t total_sectors;
 QSIMPLEQ_ENTRY(BlkMigDevState) entry;
+Error *blocker;
 
 /* Only used by migration thread.  Does not need a lock.  */
 int bulk_completed;
 int64_t cur_sector;
 int64_t cur_dirty;
 
-/* Protected by block migration lock.  */
+/* Data in the aio_bitmap is protected by block migration lock.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 unsigned long *aio_bitmap;
+
+/* Protected by block migration lock.  */
 int64_t completed_sectors;
+
+/* During migration this is protected by iothread lock / AioContext.
+ * Allocation and free happen during setup and cleanup respectively.
+ */
 BdrvDirtyBitmap *dirty_bitmap;
-Error *blocker;
 } BlkMigDevState;
 
 typedef struct BlkMigBlock {
@@ -100,7 +108,7 @@ typedef struct BlkMigState {
 int prev_progress;
 int bulk_completed;
 
-/* Lock must be taken _inside_ the iothread lock.  */
+/* Lock must be taken _inside_ the iothread lock and any AioContexts.  */
 QemuMutex lock;
 } BlkMigState;
 
@@ -264,11 +272,13 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 
 if (bmds->shared_base) {
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bs));
 while (cur_sector < total_sectors &&
!bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
   &nr_sectors)) {
 cur_sector += nr_sectors;
 }
+aio_context_release(bdrv_get_aio_context(bs));
 qemu_mutex_unlock_iothread();
 }
 
@@ -302,11 +312,21 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
 block_mig_state.submitted++;
 blk_mig_unlock();
 
+/* We do not know if bs is under the main thread (and thus does
+ * not acquire the AioContext when doing AIO) or rather under
+ * dataplane.  Thus acquire both the iothread mutex and the
+ * AioContext.
+ *
+ * This is ugly and will disappear when we make bdrv_* thread-safe,
+ * without the need to acquire the AioContext.
+ */
 qemu_mutex_lock_iothread();
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
 nr_sectors, blk_mig_read_cb, blk);
 
 bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 qemu_mutex_unlock_iothread();
 
 bmds->cur_sector = cur_sector + nr_sectors;
@@ -321,8 +341,10 @@ static int set_dirty_tracking(void)
 int ret;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
   NULL, NULL);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 if (!bmds->dirty_bitmap) {
 ret = -errno;
 goto fail;
@@ -333,18 +355,24 @@ static int set_dirty_tracking(void)
 fail:
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
 if (bmds->dirty_bitmap) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));
 }
 }
 return ret;
 }
 
+/* Called with iothread lock taken.  */
+
 static void unset_dirty_tracking(void)
 {
 BlkMigDevState *bmds;
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
+aio_context_acquire(bdrv_get_aio_context(bmds->bs));
 bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+aio_context_release(bdrv_get_aio_context(bmds->bs));

[Qemu-block] [PULL v2 10/45] vring: slim down allocation of VirtQueueElements

2016-02-06 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list.  The cost of the copy is minimal compared to that
of a large malloc.
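
The collect-then-copy idea can be sketched as follows.  This is a
simplified illustration under assumptions, not the code in the patch:
struct scratch, struct sized_elem and sized_elem_from_scratch() are
hypothetical, guest addresses are left out, and a single memcpy() replaces
the per-direction copy loops.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

#define SCRATCH_MAX 1024   /* stand-in for VIRTQUEUE_MAX_SIZE */

/* Fixed-size scratch area; meant to live on the caller's stack. */
struct scratch {
    unsigned out_num;
    unsigned in_num;
    struct iovec iov[SCRATCH_MAX];  /* out entries first, then in entries */
};

/* Heap element sized for exactly out_num + in_num entries. */
struct sized_elem {
    unsigned out_num;
    unsigned in_num;
    struct iovec sg[];              /* flexible array member */
};

/* Copy the collected entries into an allocation that is just big enough.
 * Returns NULL if malloc fails; the caller releases it with free(). */
static struct sized_elem *sized_elem_from_scratch(const struct scratch *s)
{
    size_t n = (size_t)s->out_num + s->in_num;
    struct sized_elem *e;

    assert(n <= SCRATCH_MAX);
    e = malloc(sizeof(*e) + n * sizeof(e->sg[0]));
    if (e == NULL) {
        return NULL;
    }
    e->out_num = s->out_num;
    e->in_num = s->in_num;
    memcpy(e->sg, s->iov, n * sizeof(e->sg[0]));
    return e;
}
```

The scratch area is large but costs nothing to allocate on the stack,
while the heap allocation shrinks to the handful of entries a typical
request uses.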

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/virtio/dataplane/vring.c | 53 ++---
 1 file changed, 36 insertions(+), 17 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 57ada3b..4308d9f 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -218,8 +218,14 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 new, old);
 }
 
-
-static int get_desc(Vring *vring, VirtQueueElement *elem,
+typedef struct VirtQueueCurrentElement {
+unsigned in_num;
+unsigned out_num;
+hwaddr addr[VIRTQUEUE_MAX_SIZE];
+struct iovec iov[VIRTQUEUE_MAX_SIZE];
+} VirtQueueCurrentElement;
+
+static int get_desc(Vring *vring, VirtQueueCurrentElement *elem,
 struct vring_desc *desc)
 {
 unsigned *num;
@@ -230,12 +236,12 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
 
 if (desc->flags & VRING_DESC_F_WRITE) {
 num = &elem->in_num;
-iov = &elem->in_sg[*num];
-addr = &elem->in_addr[*num];
+iov = &elem->iov[elem->out_num + *num];
+addr = &elem->addr[elem->out_num + *num];
 } else {
 num = &elem->out_num;
-iov = &elem->out_sg[*num];
-addr = &elem->out_addr[*num];
+iov = &elem->iov[*num];
+addr = &elem->addr[*num];
 
 /* If it's an output descriptor, they're all supposed
  * to come before any input descriptors. */
@@ -299,7 +305,8 @@ static bool read_vring_desc(VirtIODevice *vdev,
 
 /* This is stolen from linux/drivers/vhost/vhost.c. */
 static int get_indirect(VirtIODevice *vdev, Vring *vring,
-VirtQueueElement *elem, struct vring_desc *indirect)
+VirtQueueCurrentElement *cur_elem,
+struct vring_desc *indirect)
 {
 struct vring_desc desc;
 unsigned int i = 0, count, found = 0;
@@ -351,7 +358,7 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 return -EFAULT;
 }
 
-ret = get_desc(vring, elem, &desc);
+ret = get_desc(vring, cur_elem, &desc);
 if (ret < 0) {
 vring->broken |= (ret == -EFAULT);
 return ret;
@@ -394,6 +401,7 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 struct vring_desc desc;
 unsigned int i, head, found = 0, num = vring->vr.num;
 uint16_t avail_idx, last_avail_idx;
+VirtQueueCurrentElement cur_elem;
 VirtQueueElement *elem = NULL;
 int ret;
 
@@ -403,10 +411,7 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 goto out;
 }
 
-elem = virtqueue_alloc_element(sz, VIRTQUEUE_MAX_SIZE, VIRTQUEUE_MAX_SIZE);
-
-/* Initialize elem so it can be safely unmapped */
-elem->in_num = elem->out_num = 0;
+cur_elem.in_num = cur_elem.out_num = 0;
 
 /* Check it isn't doing very strange things with descriptor numbers. */
 last_avail_idx = vring->last_avail_idx;
@@ -433,8 +438,6 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
  * the index we've seen. */
 head = vring_get_avail_ring(vdev, vring, last_avail_idx % num);
 
-elem->index = head;
-
 /* If their number is silly, that's an error. */
 if (unlikely(head >= num)) {
 error_report("Guest says index %u > %u is available", head, num);
@@ -461,14 +464,14 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 barrier();
 
 if (desc.flags & VRING_DESC_F_INDIRECT) {
-ret = get_indirect(vdev, vring, elem, &desc);
+ret = get_indirect(vdev, vring, &cur_elem, &desc);
 if (ret < 0) {
 goto out;
 }
 continue;
 }
 
-ret = get_desc(vring, elem, &desc);
+ret = get_desc(vring, &cur_elem, &desc);
 if (ret < 0) {
 goto out;
 }
@@ -483,6 +486,18 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 virtio_tswap16(vdev, vring->last_avail_idx);
 }
 
+/* Now copy what we have collected and mapped */
+elem = virtqueue_alloc_element(sz, cur_elem.out_num, cur_elem.in_num);
+elem->index = head;
+for (i = 0; i < cur_elem.out_num; i++) {
+elem->out_addr[i] = cur_elem.addr[i];
+elem->out_sg[i] = cur_elem.iov[i];
+}
+for (i = 0; i < cur_elem.in_num; i++) {
+elem->in_addr[i] = cur_elem.addr[cur_elem.out_num + i];
+

[Qemu-block] [PULL v2 06/45] virtio: move allocation to virtqueue_pop/vring_pop

2016-02-06 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0.  We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.

The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items.  Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
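
The 48K figure can be checked with a little arithmetic.  The sketch below
assumes VIRTQUEUE_MAX_SIZE is 1024 and that each entry costs one 64-bit
guest address plus one struct iovec per direction; the helper names are
hypothetical and the small fixed header fields are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define MAX_ENTRIES 1024    /* assumed value of VIRTQUEUE_MAX_SIZE */

/* Bytes used by one direction's bookkeeping for n entries:
 * one guest address plus one iovec per entry. */
static size_t per_direction_bytes(size_t n)
{
    return n * (sizeof(uint64_t) + sizeof(struct iovec));
}

/* Old scheme: both directions always sized for MAX_ENTRIES. */
static size_t fixed_elem_bytes(void)
{
    return 2 * per_direction_bytes(MAX_ENTRIES);
}

/* New scheme: sized for the entries the request actually has. */
static size_t sized_elem_bytes(size_t out_num, size_t in_num)
{
    assert(out_num <= MAX_ENTRIES && in_num <= MAX_ENTRIES);
    return per_direction_bytes(out_num) + per_direction_bytes(in_num);
}
```

On a typical 64-bit host, where struct iovec is 16 bytes, the fixed layout
comes to 2 * 1024 * 24 = 49152 bytes (48 KiB), while a request with one
out and two in entries needs 72 bytes for the same arrays.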

The patch is pretty large, but changes to each device are testable
more or less independently.  Splitting it would mostly add churn.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
---
 hw/9pfs/virtio-9p.h |  2 +-
 include/hw/virtio/dataplane/vring.h |  2 +-
 include/hw/virtio/virtio-balloon.h  |  2 +-
 include/hw/virtio/virtio-blk.h  |  3 +-
 include/hw/virtio/virtio-net.h  |  2 +-
 include/hw/virtio/virtio-scsi.h |  2 +-
 include/hw/virtio/virtio-serial.h   |  2 +-
 include/hw/virtio/virtio.h  |  2 +-
 hw/9pfs/9p.c|  2 +-
 hw/9pfs/virtio-9p-device.c  | 17 
 hw/block/dataplane/virtio-blk.c | 11 +++--
 hw/block/virtio-blk.c   | 15 +++
 hw/char/virtio-serial-bus.c | 80 +++--
 hw/display/virtio-gpu.c | 21 ++
 hw/input/virtio-input.c | 24 +++
 hw/net/virtio-net.c | 69 
 hw/scsi/virtio-scsi-dataplane.c | 15 +++
 hw/scsi/virtio-scsi.c   | 18 -
 hw/virtio/dataplane/vring.c | 18 +
 hw/virtio/virtio-balloon.c  | 22 ++
 hw/virtio/virtio-rng.c  | 10 +++--
 hw/virtio/virtio.c  | 12 --
 22 files changed, 209 insertions(+), 142 deletions(-)

diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index 1cdf0a2..7f6d885 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -11,7 +11,7 @@ typedef struct V9fsVirtioState
 VirtQueue *vq;
 size_t config_size;
 V9fsPDU pdus[MAX_REQ];
-VirtQueueElement elems[MAX_REQ];
+VirtQueueElement *elems[MAX_REQ];
 V9fsState state;
 } V9fsVirtioState;
 
diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index a596e4c..e80985e 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -44,7 +44,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
-int vring_pop(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem);
+void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz);
 void vring_push(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
 int len);
 
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 09c2ce4..35f62ac 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -37,7 +37,7 @@ typedef struct VirtIOBalloon {
 uint32_t num_pages;
 uint32_t actual;
 uint64_t stats[VIRTIO_BALLOON_S_NR];
-VirtQueueElement stats_vq_elem;
+VirtQueueElement *stats_vq_elem;
 size_t stats_vq_offset;
 QEMUTimer *stats_timer;
 int64_t stats_last_update;
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 403ab86..199bb0e 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -80,8 +80,7 @@ typedef struct MultiReqBuffer {
 bool is_write;
 } MultiReqBuffer;
 
-VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s);
-
+void virtio_blk_init_request(VirtIOBlock *s, VirtIOBlockReq *req);
 void virtio_blk_free_request(VirtIOBlockReq *req);
 
 void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index f3cc25f..2ce3b03 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -47,7 +47,7 @@ typedef struct VirtIONetQueue {
 QEMUBH *tx_bh;
 int tx_waiting;
 struct {
-VirtQueueElement elem;
+VirtQueueElement *elem;
 } async_tx;
 struct VirtIONet *n;
 } VirtIONetQueue;
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index eb9d25b..a8029aa 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -160,7 +160,7 @@ void virtio_scsi_common_unrealize(DeviceState *dev, Error **errp);
 vo

[Qemu-block] [PULL v2 07/45] virtio: introduce qemu_get/put_virtqueue_element

2016-02-06 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Move allocation to virtio functions also when loading/saving a
VirtQueueElement.  This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
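A minimal sketch of why a dedicated get/put pair helps, assuming a
hypothetical element type (Elem) and a plain memory buffer in place of
QEMUFile.  It is not the code from this patch: the names elem_put and
elem_get and the host-endian wire format are illustrative only.  Fields are
written one by one, so the in-memory layout can change without changing the
stream, and the loader allocates the caller's larger request struct itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element: fixed header plus a pointer to a variable array. */
typedef struct Elem {
    uint32_t index;
    uint32_t num;
    uint64_t *addr;
} Elem;

/* Serialize field by field; pointers are never written to the stream.
 * The caller must supply at least 8 + 8 * e->num bytes.
 * Returns the number of bytes written. */
static size_t elem_put(uint8_t *buf, const Elem *e)
{
    size_t off = 0;

    memcpy(buf + off, &e->index, sizeof(e->index));
    off += sizeof(e->index);
    memcpy(buf + off, &e->num, sizeof(e->num));
    off += sizeof(e->num);
    memcpy(buf + off, e->addr, e->num * sizeof(e->addr[0]));
    off += e->num * sizeof(e->addr[0]);
    return off;
}

/* Allocate sz bytes for the caller's request struct (which must start
 * with an Elem) plus room for the array, then fill it from the stream.
 * A real loader would also validate num before trusting it. */
static void *elem_get(const uint8_t *buf, size_t sz)
{
    const size_t align = _Alignof(uint64_t);
    uint32_t index, num;
    size_t addr_ofs;
    char *base;
    Elem *e;

    assert(sz >= sizeof(Elem));
    memcpy(&index, buf, sizeof(index));
    memcpy(&num, buf + sizeof(index), sizeof(num));

    /* Round sz up so the trailing array is suitably aligned. */
    addr_ofs = (sz + align - 1) / align * align;
    base = calloc(1, addr_ofs + num * sizeof(uint64_t));
    if (!base) {
        return NULL;
    }
    e = (Elem *)base;
    e->index = index;
    e->num = num;
    e->addr = (uint64_t *)(base + addr_ofs);
    memcpy(e->addr, buf + sizeof(index) + sizeof(num),
           num * sizeof(uint64_t));
    return e;
}
```

The size argument mirrors the sz parameter of qemu_get_virtqueue_element:
a device passes the size of its own request struct and gets back one
allocation that holds both the request and the element.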
 include/hw/virtio/virtio.h  |  2 ++
 hw/block/virtio-blk.c   | 10 +++---
 hw/char/virtio-serial-bus.c | 10 +++---
 hw/scsi/virtio-scsi.c   |  7 ++-
 hw/virtio/virtio.c  | 13 +
 5 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 21fda17..44da9a8 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -153,6 +153,8 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
 
 void virtqueue_map(VirtQueueElement *elem);
 void *virtqueue_pop(VirtQueue *vq, size_t sz);
+void *qemu_get_virtqueue_element(QEMUFile *f, size_t sz);
+void qemu_put_virtqueue_element(QEMUFile *f, VirtQueueElement *elem);
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
   unsigned int out_bytes);
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index bf70b52..c427698 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -808,8 +808,7 @@ static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 
 while (req) {
 qemu_put_sbyte(f, 1);
-qemu_put_buffer(f, (unsigned char *)&req->elem,
-sizeof(VirtQueueElement));
+qemu_put_virtqueue_element(f, &req->elem);
 req = req->next;
 }
 qemu_put_sbyte(f, 0);
@@ -832,14 +831,11 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
 while (qemu_get_sbyte(f)) {
-VirtIOBlockReq *req = g_new(VirtIOBlockReq, 1);
+VirtIOBlockReq *req;
+req = qemu_get_virtqueue_element(f, sizeof(VirtIOBlockReq));
 virtio_blk_init_request(s, req);
-qemu_get_buffer(f, (unsigned char *)&req->elem,
-sizeof(VirtQueueElement));
 req->next = s->rq;
 s->rq = req;
-
-virtqueue_map(&req->elem);
 }
 
 return 0;
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index cf3d12b..99cb683 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -646,9 +646,7 @@ static void virtio_serial_save_device(VirtIODevice *vdev, QEMUFile *f)
 if (elem_popped) {
 qemu_put_be32s(f, &port->iov_idx);
 qemu_put_be64s(f, &port->iov_offset);
-
-qemu_put_buffer(f, (unsigned char *)port->elem,
-sizeof(VirtQueueElement));
+qemu_put_virtqueue_element(f, port->elem);
 }
 }
 }
@@ -723,10 +721,8 @@ static int fetch_active_ports_list(QEMUFile *f, int version_id,
 qemu_get_be32s(f, &port->iov_idx);
 qemu_get_be64s(f, &port->iov_offset);
 
-port->elem = g_new(VirtQueueElement, 1);
-qemu_get_buffer(f, (unsigned char *)port->elem,
-sizeof(VirtQueueElement));
-virtqueue_map(port->elem);
+port->elem =
+qemu_get_virtqueue_element(f, sizeof(VirtQueueElement));
 
 /*
  *  Port was throttled on source machine.  Let's
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 50a3cb2..5b29bac 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -189,7 +189,7 @@ static void virtio_scsi_save_request(QEMUFile *f, SCSIRequest *sreq)
 
 assert(n < vs->conf.num_queues);
 qemu_put_be32s(f, &n);
-qemu_put_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
+qemu_put_virtqueue_element(f, &req->elem);
 }
 
 static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
@@ -202,12 +202,9 @@ static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
 
 qemu_get_be32s(f, &n);
 assert(n < vs->conf.num_queues);
-req = g_malloc(sizeof(VirtIOSCSIReq) + vs->cdb_size);
-qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
+req = qemu_get_virtqueue_element(f, sizeof(VirtIOSCSIReq) + vs->cdb_size);
 virtio_scsi_init_req(s, vs->cmd_vqs[n], req);
 
-virtqueue_map(&req->elem);
-
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICmdReq) + vs->cdb_size,
   sizeof(VirtIOSCSICmdResp) + vs->sense_size) < 0) {
 error_report("invalid SCSI request migration data");
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
ind

Re: [Qemu-block] [PULL 06/49] virtio: move allocation to virtqueue_pop/vring_pop

2016-02-06 Thread Michael S. Tsirkin

On Fri, Feb 05, 2016 at 12:52:55PM +, Peter Maydell wrote:
> On 4 February 2016 at 21:51, Michael S. Tsirkin <m...@redhat.com> wrote:
> > From: Paolo Bonzini <pbonz...@redhat.com>
> >
> > The return code of virtqueue_pop/vring_pop is unused except to check for
> > errors or 0.  We can thus easily move allocation inside the functions
> > and just return a pointer to the VirtQueueElement.
> >
> > The advantage is that we will be able to allocate only the space that
> > is needed for the actual size of the s/g list instead of the full
> > VIRTQUEUE_MAX_SIZE items.  Currently VirtQueueElement takes about 48K
> > of memory, and this kind of allocation puts a lot of stress on malloc.
> > By cutting the size by two or three orders of magnitude, malloc can
> > use much more efficient algorithms.
> >
> > The patch is pretty large, but changes to each device are testable
> > more or less independently.  Splitting it would mostly add churn.
> >
> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> > Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
> > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> > Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
> > ---
> >  hw/9pfs/virtio-9p.h |  2 +-
> >  include/hw/virtio/dataplane/vring.h |  2 +-
> >  include/hw/virtio/virtio-balloon.h  |  2 +-
> >  include/hw/virtio/virtio-blk.h  |  3 +-
> >  include/hw/virtio/virtio-net.h  |  2 +-
> >  include/hw/virtio/virtio-scsi.h |  2 +-
> >  include/hw/virtio/virtio-serial.h   |  2 +-
> >  include/hw/virtio/virtio.h  |  2 +-
> >  hw/9pfs/9p.c|  2 +-
> >  hw/9pfs/virtio-9p-device.c  | 17 
> >  hw/block/dataplane/virtio-blk.c | 11 +++--
> >  hw/block/virtio-blk.c   | 15 +++
> >  hw/char/virtio-serial-bus.c | 80 +++--
> >  hw/display/virtio-gpu.c | 21 ++
> >  hw/input/virtio-input.c | 24 +++
> >  hw/net/virtio-net.c | 69 
> >  hw/scsi/virtio-scsi-dataplane.c | 15 +++
> >  hw/scsi/virtio-scsi.c   | 18 -
> >  hw/virtio/dataplane/vring.c | 18 +
> >  hw/virtio/virtio-balloon.c  | 22 ++
> >  hw/virtio/virtio-rng.c  | 10 +++--
> >  hw/virtio/virtio.c  | 12 --
> >  roms/seabios|  2 +-
> >  23 files changed, 210 insertions(+), 143 deletions(-)
> 
> > --- a/roms/seabios
> > +++ b/roms/seabios
> > @@ -1 +1 @@
> > -Subproject commit 01a84bea2d28a19d2405c1ecac4bdef17683cc0c
> > +Subproject commit 33fbe13a3e2a01e0ba1087a8feed801a0451db21
> > --
> > MST
> 
> Hi. This commit in this pull request includes a seabios submodule
> update, but the commit message says nothing about it. Is it
> really supposed to be here?
> 
> thanks
> -- PMM

Not sure how it got here - I didn't notice.
I'll redo the pull request.

-- 
MST

[Qemu-block] [PULL 10/49] vring: slim down allocation of VirtQueueElements

2016-02-04 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list.  The cost of the copy is minimal compared to that
of a large malloc.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
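The build-on-stack-then-copy idea can be sketched in isolation as below.
This is a simplified illustration, not the patch itself: Scratch, Packed,
scratch_add and scratch_pack are hypothetical names, and a single flexible
array stands in for the separate addr/iov arrays.  As in the patch, output
entries are stored first and input entries follow at offset out_num.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DESC 1024   /* assumed worst-case descriptor count */

/* Worst-case sized scratch area; cheap because it lives on the stack. */
typedef struct Scratch {
    unsigned out_num;
    unsigned in_num;
    uint64_t addr[MAX_DESC];
} Scratch;

/* Heap object sized for exactly out_num + in_num entries. */
typedef struct Packed {
    unsigned out_num;
    unsigned in_num;
    uint64_t addr[];    /* out entries first, then in entries */
} Packed;

/* Record one descriptor.  Returns 0 on success and -1 if the scratch
 * area is full or an output descriptor follows an input descriptor. */
static int scratch_add(Scratch *s, uint64_t addr, int is_write)
{
    if (s->out_num + s->in_num >= MAX_DESC) {
        return -1;
    }
    if (is_write) {
        s->addr[s->out_num + s->in_num] = addr;
        s->in_num++;
    } else {
        if (s->in_num) {
            return -1;  /* outputs must all come before inputs */
        }
        s->addr[s->out_num++] = addr;
    }
    return 0;
}

/* Copy what was collected into a right-sized allocation. */
static Packed *scratch_pack(const Scratch *s)
{
    size_t n = (size_t)s->out_num + s->in_num;
    Packed *p = malloc(sizeof(*p) + n * sizeof(p->addr[0]));

    if (!p) {
        return NULL;
    }
    p->out_num = s->out_num;
    p->in_num = s->in_num;
    memcpy(p->addr, s->addr, n * sizeof(p->addr[0]));
    return p;
}
```

The copy touches only the entries actually used, which is why it is cheap
next to a worst-case sized malloc.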
 hw/virtio/dataplane/vring.c | 53 ++---
 1 file changed, 36 insertions(+), 17 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 57ada3b..4308d9f 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -218,8 +218,14 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 new, old);
 }
 
-
-static int get_desc(Vring *vring, VirtQueueElement *elem,
+typedef struct VirtQueueCurrentElement {
+unsigned in_num;
+unsigned out_num;
+hwaddr addr[VIRTQUEUE_MAX_SIZE];
+struct iovec iov[VIRTQUEUE_MAX_SIZE];
+} VirtQueueCurrentElement;
+
+static int get_desc(Vring *vring, VirtQueueCurrentElement *elem,
 struct vring_desc *desc)
 {
 unsigned *num;
@@ -230,12 +236,12 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
 
 if (desc->flags & VRING_DESC_F_WRITE) {
 num = &elem->in_num;
-iov = &elem->in_sg[*num];
-addr = &elem->in_addr[*num];
+iov = &elem->iov[elem->out_num + *num];
+addr = &elem->addr[elem->out_num + *num];
 } else {
 num = &elem->out_num;
-iov = &elem->out_sg[*num];
-addr = &elem->out_addr[*num];
+iov = &elem->iov[*num];
+addr = &elem->addr[*num];
 
 /* If it's an output descriptor, they're all supposed
  * to come before any input descriptors. */
@@ -299,7 +305,8 @@ static bool read_vring_desc(VirtIODevice *vdev,
 
 /* This is stolen from linux/drivers/vhost/vhost.c. */
 static int get_indirect(VirtIODevice *vdev, Vring *vring,
-VirtQueueElement *elem, struct vring_desc *indirect)
+VirtQueueCurrentElement *cur_elem,
+struct vring_desc *indirect)
 {
 struct vring_desc desc;
 unsigned int i = 0, count, found = 0;
@@ -351,7 +358,7 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 return -EFAULT;
 }
 
-ret = get_desc(vring, elem, &desc);
+ret = get_desc(vring, cur_elem, &desc);
 if (ret < 0) {
 vring->broken |= (ret == -EFAULT);
 return ret;
@@ -394,6 +401,7 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 struct vring_desc desc;
 unsigned int i, head, found = 0, num = vring->vr.num;
 uint16_t avail_idx, last_avail_idx;
+VirtQueueCurrentElement cur_elem;
 VirtQueueElement *elem = NULL;
 int ret;
 
@@ -403,10 +411,7 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 goto out;
 }
 
-elem = virtqueue_alloc_element(sz, VIRTQUEUE_MAX_SIZE, VIRTQUEUE_MAX_SIZE);
-
-/* Initialize elem so it can be safely unmapped */
-elem->in_num = elem->out_num = 0;
+cur_elem.in_num = cur_elem.out_num = 0;
 
 /* Check it isn't doing very strange things with descriptor numbers. */
 last_avail_idx = vring->last_avail_idx;
@@ -433,8 +438,6 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
  * the index we've seen. */
 head = vring_get_avail_ring(vdev, vring, last_avail_idx % num);
 
-elem->index = head;
-
 /* If their number is silly, that's an error. */
 if (unlikely(head >= num)) {
 error_report("Guest says index %u > %u is available", head, num);
@@ -461,14 +464,14 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 barrier();
 
 if (desc.flags & VRING_DESC_F_INDIRECT) {
-ret = get_indirect(vdev, vring, elem, &desc);
+ret = get_indirect(vdev, vring, &cur_elem, &desc);
 if (ret < 0) {
 goto out;
 }
 continue;
 }
 
-ret = get_desc(vring, elem, &desc);
+ret = get_desc(vring, &cur_elem, &desc);
 if (ret < 0) {
 goto out;
 }
@@ -483,6 +486,18 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 virtio_tswap16(vdev, vring->last_avail_idx);
 }
 
+/* Now copy what we have collected and mapped */
+elem = virtqueue_alloc_element(sz, cur_elem.out_num, cur_elem.in_num);
+elem->index = head;
+for (i = 0; i < cur_elem.out_num; i++) {
+elem->out_addr[i] = cur_elem.addr[i];
+elem->out_sg[i] = cur_elem.iov[i];
+}
+for (i = 0; i < cur_elem.in_num; i++) {
+elem->in_addr[i] = cur_elem.addr[cur_elem.out_num + i];
+

[Qemu-block] [PULL 08/49] virtio: introduce virtqueue_alloc_element

2016-02-04 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the
VirtQueueElement.  For now, virtqueue_pop and vring_pop keep
allocating a very large VirtQueueElement.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
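For readers who want to try the layout computation outside QEMU, here is a
standalone sketch of the single-allocation scheme.  It follows the shape of
virtqueue_alloc_element in the diff below, but Element, element_alloc and
ALIGN_UP are stand-in names, hwaddr is assumed to be a 64-bit integer, and
malloc replaces g_malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/uio.h>    /* struct iovec */

typedef uint64_t hwaddr;    /* stand-in for QEMU's hwaddr */

typedef struct Element {
    unsigned int index;
    unsigned int out_num;
    unsigned int in_num;
    hwaddr *in_addr;
    hwaddr *out_addr;
    struct iovec *in_sg;
    struct iovec *out_sg;
} Element;

#define ALIGN_UP(n, a) (((n) + (a) - 1) / (a) * (a))

/* One allocation holds, in order: sz bytes for the caller's struct
 * (which starts with an Element), in_addr[], out_addr[], in_sg[] and
 * out_sg[].  Returns NULL if malloc fails. */
static void *element_alloc(size_t sz, unsigned out_num, unsigned in_num)
{
    size_t in_addr_ofs = ALIGN_UP(sz, _Alignof(hwaddr));
    size_t out_addr_ofs = in_addr_ofs + in_num * sizeof(hwaddr);
    size_t out_addr_end = out_addr_ofs + out_num * sizeof(hwaddr);
    size_t in_sg_ofs = ALIGN_UP(out_addr_end, _Alignof(struct iovec));
    size_t out_sg_ofs = in_sg_ofs + in_num * sizeof(struct iovec);
    size_t out_sg_end = out_sg_ofs + out_num * sizeof(struct iovec);
    char *base;
    Element *elem;

    assert(sz >= sizeof(Element));
    base = malloc(out_sg_end);
    if (!base) {
        return NULL;
    }
    elem = (Element *)base;
    elem->out_num = out_num;
    elem->in_num = in_num;
    elem->in_addr = (hwaddr *)(base + in_addr_ofs);
    elem->out_addr = (hwaddr *)(base + out_addr_ofs);
    elem->in_sg = (struct iovec *)(base + in_sg_ofs);
    elem->out_sg = (struct iovec *)(base + out_sg_ofs);
    return elem;
}
```

Because the arrays live in the same block as the element, a single free()
of the returned pointer releases everything.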
 include/hw/virtio/virtio.h  |   9 ++--
 hw/virtio/dataplane/vring.c |   3 +-
 hw/virtio/virtio.c  | 110 +++-
 3 files changed, 105 insertions(+), 17 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 44da9a8..108cdb0 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -46,10 +46,10 @@ typedef struct VirtQueueElement
 unsigned int index;
 unsigned int out_num;
 unsigned int in_num;
-hwaddr in_addr[VIRTQUEUE_MAX_SIZE];
-hwaddr out_addr[VIRTQUEUE_MAX_SIZE];
-struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
-struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
+hwaddr *in_addr;
+hwaddr *out_addr;
+struct iovec *in_sg;
+struct iovec *out_sg;
 } VirtQueueElement;
 
 #define VIRTIO_QUEUE_MAX 1024
@@ -143,6 +143,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
+void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num);
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 4fb84bb..57ada3b 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -403,8 +403,7 @@ void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz)
 goto out;
 }
 
-assert(sz >= sizeof(VirtQueueElement));
-elem = g_malloc(sz);
+elem = virtqueue_alloc_element(sz, VIRTQUEUE_MAX_SIZE, VIRTQUEUE_MAX_SIZE);
 
 /* Initialize elem so it can be safely unmapped */
 elem->in_num = elem->out_num = 0;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 28fa7fe..661a1e1 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -494,11 +494,30 @@ static void virtqueue_map_iovec(struct iovec *sg, hwaddr *addr,
 void virtqueue_map(VirtQueueElement *elem)
 {
 virtqueue_map_iovec(elem->in_sg, elem->in_addr, &elem->in_num,
-MIN(ARRAY_SIZE(elem->in_sg), ARRAY_SIZE(elem->in_addr)),
-1);
+VIRTQUEUE_MAX_SIZE, 1);
 virtqueue_map_iovec(elem->out_sg, elem->out_addr, &elem->out_num,
-MIN(ARRAY_SIZE(elem->out_sg), ARRAY_SIZE(elem->out_addr)),
-0);
+VIRTQUEUE_MAX_SIZE, 0);
+}
+
+void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num)
+{
+VirtQueueElement *elem;
+size_t in_addr_ofs = QEMU_ALIGN_UP(sz, __alignof__(elem->in_addr[0]));
+size_t out_addr_ofs = in_addr_ofs + in_num * sizeof(elem->in_addr[0]);
+size_t out_addr_end = out_addr_ofs + out_num * sizeof(elem->out_addr[0]);
+size_t in_sg_ofs = QEMU_ALIGN_UP(out_addr_end, __alignof__(elem->in_sg[0]));
+size_t out_sg_ofs = in_sg_ofs + in_num * sizeof(elem->in_sg[0]);
+size_t out_sg_end = out_sg_ofs + out_num * sizeof(elem->out_sg[0]);
+
+assert(sz >= sizeof(VirtQueueElement));
+elem = g_malloc(out_sg_end);
+elem->out_num = out_num;
+elem->in_num = in_num;
+elem->in_addr = (void *)elem + in_addr_ofs;
+elem->out_addr = (void *)elem + out_addr_ofs;
+elem->in_sg = (void *)elem + in_sg_ofs;
+elem->out_sg = (void *)elem + out_sg_ofs;
+return elem;
 }
 
 void *virtqueue_pop(VirtQueue *vq, size_t sz)
@@ -513,8 +532,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
 }
 
 /* When we start there are none of either input nor output. */
-assert(sz >= sizeof(VirtQueueElement));
-elem = g_malloc(sz);
+elem = virtqueue_alloc_element(sz, VIRTQUEUE_MAX_SIZE, VIRTQUEUE_MAX_SIZE);
 elem->out_num = elem->in_num = 0;
 
 max = vq->vring.num;
@@ -541,14 +559,14 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
 struct iovec *sg;
 
 if (vring_desc_flags(vdev, desc_pa, i) & VRING_DESC_F_WRITE) {
-if (elem->in_num >= ARRAY_SIZE(elem->in_sg)) {
+if (elem->in_num >= VIRTQUEUE_MAX_SIZE) {
 error_report("Too many write descriptors in indirect table");
 exit(1);
 }
 elem->in_addr[elem->in_num] = vring_desc_addr(vdev, desc_pa, i);
 sg = &elem->in_sg[elem->in_num++];
 } else {
-if (elem->out_num >= ARRAY_SIZE(elem-&g

[Qemu-block] [PULL 07/49] virtio: introduce qemu_get/put_virtqueue_element

2016-02-04 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

Move allocation to virtio functions also when loading/saving a
VirtQueueElement.  This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/virtio/virtio.h  |  2 ++
 hw/block/virtio-blk.c   | 10 +++---
 hw/char/virtio-serial-bus.c | 10 +++---
 hw/scsi/virtio-scsi.c   |  7 ++-
 hw/virtio/virtio.c  | 13 +
 5 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 21fda17..44da9a8 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -153,6 +153,8 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
 
 void virtqueue_map(VirtQueueElement *elem);
 void *virtqueue_pop(VirtQueue *vq, size_t sz);
+void *qemu_get_virtqueue_element(QEMUFile *f, size_t sz);
+void qemu_put_virtqueue_element(QEMUFile *f, VirtQueueElement *elem);
 int virtqueue_avail_bytes(VirtQueue *vq, unsigned int in_bytes,
   unsigned int out_bytes);
 void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int *in_bytes,
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index bf70b52..c427698 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -808,8 +808,7 @@ static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 
 while (req) {
 qemu_put_sbyte(f, 1);
-qemu_put_buffer(f, (unsigned char *)&req->elem,
-sizeof(VirtQueueElement));
+qemu_put_virtqueue_element(f, &req->elem);
 req = req->next;
 }
 qemu_put_sbyte(f, 0);
@@ -832,14 +831,11 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
 while (qemu_get_sbyte(f)) {
-VirtIOBlockReq *req = g_new(VirtIOBlockReq, 1);
+VirtIOBlockReq *req;
+req = qemu_get_virtqueue_element(f, sizeof(VirtIOBlockReq));
 virtio_blk_init_request(s, req);
-qemu_get_buffer(f, (unsigned char *)&req->elem,
-sizeof(VirtQueueElement));
 req->next = s->rq;
 s->rq = req;
-
-virtqueue_map(&req->elem);
 }
 
 return 0;
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index cf3d12b..99cb683 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -646,9 +646,7 @@ static void virtio_serial_save_device(VirtIODevice *vdev, QEMUFile *f)
 if (elem_popped) {
 qemu_put_be32s(f, &port->iov_idx);
 qemu_put_be64s(f, &port->iov_offset);
-
-qemu_put_buffer(f, (unsigned char *)port->elem,
-sizeof(VirtQueueElement));
+qemu_put_virtqueue_element(f, port->elem);
 }
 }
 }
@@ -723,10 +721,8 @@ static int fetch_active_ports_list(QEMUFile *f, int version_id,
 qemu_get_be32s(f, &port->iov_idx);
 qemu_get_be64s(f, &port->iov_offset);
 
-port->elem = g_new(VirtQueueElement, 1);
-qemu_get_buffer(f, (unsigned char *)port->elem,
-sizeof(VirtQueueElement));
-virtqueue_map(port->elem);
+port->elem =
+qemu_get_virtqueue_element(f, sizeof(VirtQueueElement));
 
 /*
  *  Port was throttled on source machine.  Let's
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 50a3cb2..5b29bac 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -189,7 +189,7 @@ static void virtio_scsi_save_request(QEMUFile *f, SCSIRequest *sreq)
 
 assert(n < vs->conf.num_queues);
 qemu_put_be32s(f, &n);
-qemu_put_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
+qemu_put_virtqueue_element(f, &req->elem);
 }
 
 static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
@@ -202,12 +202,9 @@ static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
 
 qemu_get_be32s(f, &n);
 assert(n < vs->conf.num_queues);
-req = g_malloc(sizeof(VirtIOSCSIReq) + vs->cdb_size);
-qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
+req = qemu_get_virtqueue_element(f, sizeof(VirtIOSCSIReq) + vs->cdb_size);
 virtio_scsi_init_req(s, vs->cmd_vqs[n], req);
 
-virtqueue_map(&req->elem);
-
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICmdReq) + vs->cdb_size,
   sizeof(VirtIOSCSICmdResp) + vs->sense_size) < 0) {
 error_report("invalid SCSI request migration data");
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
ind

[Qemu-block] [PULL 47/49] expose floppy drive geometry and CMOS type

2016-02-04 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

Make it possible to query the geometry and the CMOS type of a floppy
drive outside of the respective source files.

It will be useful, in particular, when dynamically building ACPI tables:
it makes it possible to populate the corresponding ACPI objects properly
and thus lets BIOS-less systems access the floppy drives.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Signed-off-by: Igor Mammedov <imamm...@redhat.com>
Reviewed-by: John Snow <js...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/block/fdc.h |  2 ++
 include/hw/i386/pc.h   |  1 +
 hw/block/fdc.c | 11 +++
 hw/i386/pc.c   |  2 +-
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index adce14f..d87859e 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -15,5 +15,7 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
+void isa_fdc_get_drive_geometry(ISADevice *fdc, int i, uint8_t *cylinders,
+uint8_t *heads, uint8_t *sectors);
 
 #endif
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 8b3546e..472754c 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -265,6 +265,7 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
 void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
 
 ISADevice *pc_find_fdc0(void);
+int cmos_get_fd_drive_type(FloppyDriveType fd0);
 
 /* acpi_piix.c */
 
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 818e8a4..245184b 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2509,6 +2509,17 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 return isa->state.drives[i].drive;
 }
 
+void isa_fdc_get_drive_geometry(ISADevice *fdc, int i, uint8_t *cylinders,
+uint8_t *heads, uint8_t *sectors)
+{
+FDCtrlISABus *isa = ISA_FDC(fdc);
+FDrive *drv = &isa->state.drives[i];
+
+*cylinders = drv->max_track;
+*heads = (drv->flags & FDISK_DBL_SIDES) ? 2 : 1;
+*sectors = drv->last_sect;
+}
+
 static const VMStateDescription vmstate_isa_fdc ={
 .name = "fdc",
 .version_id = 2,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ce185bb..fd8524f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -200,7 +200,7 @@ static void pic_irq_request(void *opaque, int irq, int level)
 
 #define REG_EQUIPMENT_BYTE  0x14
 
-static int cmos_get_fd_drive_type(FloppyDriveType fd0)
+int cmos_get_fd_drive_type(FloppyDriveType fd0)
 {
 int val;
 
-- 
MST

[Qemu-block] [PULL 06/49] virtio: move allocation to virtqueue_pop/vring_pop

2016-02-04 Thread Michael S. Tsirkin

From: Paolo Bonzini <pbonz...@redhat.com>

The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0.  We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.

The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items.  Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.

The patch is pretty large, but changes to each device are testable
more or less independently.  Splitting it would mostly add churn.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
---
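The 48K figure quoted above can be checked with a few lines of C.  This is
a sketch under stated assumptions: hwaddr is taken to be a 64-bit integer,
VIRTQUEUE_MAX_SIZE is taken to be 1024 (which is what the 48K figure
implies), and OldElement and arrays_size are illustrative names.  On an
LP64 host, where struct iovec is 16 bytes, the four arrays occupy
1024 * (8 + 8 + 16 + 16) = 49152 bytes, i.e. 48K.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>    /* struct iovec */

#define VIRTQUEUE_MAX_SIZE 1024     /* assumed value, see lead-in */

typedef uint64_t hwaddr;            /* stand-in for QEMU's hwaddr */
static_assert(sizeof(hwaddr) == 8, "hwaddr stand-in must be 64-bit");

/* Layout with four worst-case arrays embedded in every element. */
typedef struct OldElement {
    unsigned int index;
    unsigned int out_num;
    unsigned int in_num;
    hwaddr in_addr[VIRTQUEUE_MAX_SIZE];
    hwaddr out_addr[VIRTQUEUE_MAX_SIZE];
    struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
    struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
} OldElement;

/* Bytes needed for the arrays alone when sized to the actual request
 * (ignoring alignment padding). */
static size_t arrays_size(unsigned out_num, unsigned in_num)
{
    return ((size_t)out_num + in_num) *
           (sizeof(hwaddr) + sizeof(struct iovec));
}
```

A request with one output and two input descriptors needs 3 * 24 = 72
bytes of array space on such a host, which is where the "two or three
orders of magnitude" saving comes from.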
 hw/9pfs/virtio-9p.h |  2 +-
 include/hw/virtio/dataplane/vring.h |  2 +-
 include/hw/virtio/virtio-balloon.h  |  2 +-
 include/hw/virtio/virtio-blk.h  |  3 +-
 include/hw/virtio/virtio-net.h  |  2 +-
 include/hw/virtio/virtio-scsi.h |  2 +-
 include/hw/virtio/virtio-serial.h   |  2 +-
 include/hw/virtio/virtio.h  |  2 +-
 hw/9pfs/9p.c|  2 +-
 hw/9pfs/virtio-9p-device.c  | 17 
 hw/block/dataplane/virtio-blk.c | 11 +++--
 hw/block/virtio-blk.c   | 15 +++
 hw/char/virtio-serial-bus.c | 80 +++--
 hw/display/virtio-gpu.c | 21 ++
 hw/input/virtio-input.c | 24 +++
 hw/net/virtio-net.c | 69 
 hw/scsi/virtio-scsi-dataplane.c | 15 +++
 hw/scsi/virtio-scsi.c   | 18 -
 hw/virtio/dataplane/vring.c | 18 +
 hw/virtio/virtio-balloon.c  | 22 ++
 hw/virtio/virtio-rng.c  | 10 +++--
 hw/virtio/virtio.c  | 12 --
 roms/seabios|  2 +-
 23 files changed, 210 insertions(+), 143 deletions(-)

diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index 1cdf0a2..7f6d885 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -11,7 +11,7 @@ typedef struct V9fsVirtioState
 VirtQueue *vq;
 size_t config_size;
 V9fsPDU pdus[MAX_REQ];
-VirtQueueElement elems[MAX_REQ];
+VirtQueueElement *elems[MAX_REQ];
 V9fsState state;
 } V9fsVirtioState;
 
diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index a596e4c..e80985e 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -44,7 +44,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
-int vring_pop(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem);
+void *vring_pop(VirtIODevice *vdev, Vring *vring, size_t sz);
 void vring_push(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
 int len);
 
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 09c2ce4..35f62ac 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -37,7 +37,7 @@ typedef struct VirtIOBalloon {
 uint32_t num_pages;
 uint32_t actual;
 uint64_t stats[VIRTIO_BALLOON_S_NR];
-VirtQueueElement stats_vq_elem;
+VirtQueueElement *stats_vq_elem;
 size_t stats_vq_offset;
 QEMUTimer *stats_timer;
 int64_t stats_last_update;
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 403ab86..199bb0e 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -80,8 +80,7 @@ typedef struct MultiReqBuffer {
 bool is_write;
 } MultiReqBuffer;
 
-VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s);
-
+void virtio_blk_init_request(VirtIOBlock *s, VirtIOBlockReq *req);
 void virtio_blk_free_request(VirtIOBlockReq *req);
 
 void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index f3cc25f..2ce3b03 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -47,7 +47,7 @@ typedef struct VirtIONetQueue {
 QEMUBH *tx_bh;
 int tx_waiting;
 struct {
-VirtQueueElement elem;
+VirtQueueElement *elem;
 } async_tx;
 struct VirtIONet *n;
 } VirtIONetQueue;
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index eb9d25b..a8029aa 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -160,7 +160,7 @@ void virtio_scsi_common_unr

[Qemu-block] [PULL v2 55/59] i386/pc: expose identifying the floppy controller

2016-01-09 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

Factor out and expose the function that locates the floppy controller in
the system.
This will make it possible to dynamically populate the relevant objects
in the ACPI tables.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Eduardo Habkost <ehabk...@redhat.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Richard Henderson <r...@twiddle.net>
Cc: qemu-block@nongnu.org
Cc: qemu-sta...@nongnu.org
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/i386/pc.h |  2 ++
 hw/i386/pc.c | 44 ++--
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index b0d6283..819 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -267,6 +267,8 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
 
 void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
 
+ISADevice *pc_find_fdc0(void);
+
 /* acpi_piix.c */
 
 I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 459260b..c36b8cf 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -360,6 +360,31 @@ static const char * const fdc_container_path[] = {
 "/unattached", "/peripheral", "/peripheral-anon"
 };
 
+/*
+ * Locate the FDC at IO address 0x3f0, in order to configure the CMOS registers
+ * and ACPI objects.
+ */
+ISADevice *pc_find_fdc0(void)
+{
+int i;
+Object *container;
+CheckFdcState state = { 0 };
+
+for (i = 0; i < ARRAY_SIZE(fdc_container_path); i++) {
+container = container_get(qdev_get_machine(), fdc_container_path[i]);
+object_child_foreach(container, check_fdc, &state);
+}
+
+if (state.multiple) {
+error_report("warning: multiple floppy disk controllers with "
+ "iobase=0x3f0 have been found;\n"
+ "the one being picked for CMOS setup might not reflect "
+ "your intent");
+}
+
+return state.floppy;
+}
+
 static void pc_cmos_init_late(void *opaque)
 {
 pc_cmos_init_late_arg *arg = opaque;
@@ -368,8 +393,6 @@ static void pc_cmos_init_late(void *opaque)
 int8_t heads, sectors;
 int val;
 int i, trans;
-Object *container;
-CheckFdcState state = { 0 };
 
 val = 0;
 if (ide_get_geometry(arg->idebus[0], 0,
@@ -399,22 +422,7 @@ static void pc_cmos_init_late(void *opaque)
 }
 rtc_set_memory(s, 0x39, val);
 
-/*
- * Locate the FDC at IO address 0x3f0, and configure the CMOS registers
- * accordingly.
- */
-for (i = 0; i < ARRAY_SIZE(fdc_container_path); i++) {
-container = container_get(qdev_get_machine(), fdc_container_path[i]);
-object_child_foreach(container, check_fdc, &state);
-}
-
-if (state.multiple) {
-error_report("warning: multiple floppy disk controllers with "
- "iobase=0x3f0 have been found;\n"
- "the one being picked for CMOS setup might not reflect "
- "your intent");
-}
-pc_cmos_init_floppy(s, state.floppy);
+pc_cmos_init_floppy(s, pc_find_fdc0());
 
 qemu_unregister_reset(pc_cmos_init_late, opaque);
 }
-- 
MST

[Qemu-block] [PULL 55/59] i386/pc: expose identifying the floppy controller

2016-01-08 Thread Michael S. Tsirkin

From: Roman Kagan <rka...@virtuozzo.com>

Factor out and expose the function that locates the floppy controller in
the system. This allows the relevant objects in the ACPI tables to be
populated dynamically.

Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
Cc: "Michael S. Tsirkin" <m...@redhat.com>
Cc: Eduardo Habkost <ehabk...@redhat.com>
Cc: Igor Mammedov <imamm...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Richard Henderson <r...@twiddle.net>
Cc: qemu-block@nongnu.org
Cc: qemu-sta...@nongnu.org
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 include/hw/i386/pc.h |  2 ++
 hw/i386/pc.c | 44 ++--
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index b0d6283..819 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -267,6 +267,8 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
 
 void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
 
+ISADevice *pc_find_fdc0(void);
+
 /* acpi_piix.c */
 
 I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 459260b..c36b8cf 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -360,6 +360,31 @@ static const char * const fdc_container_path[] = {
 "/unattached", "/peripheral", "/peripheral-anon"
 };
 
+/*
+ * Locate the FDC at IO address 0x3f0, in order to configure the CMOS registers
+ * and ACPI objects.
+ */
+ISADevice *pc_find_fdc0(void)
+{
+int i;
+Object *container;
+CheckFdcState state = { 0 };
+
+for (i = 0; i < ARRAY_SIZE(fdc_container_path); i++) {
+container = container_get(qdev_get_machine(), fdc_container_path[i]);
+object_child_foreach(container, check_fdc, &state);
+}
+
+if (state.multiple) {
+error_report("warning: multiple floppy disk controllers with "
+ "iobase=0x3f0 have been found;\n"
+ "the one being picked for CMOS setup might not reflect "
+ "your intent");
+}
+
+return state.floppy;
+}
+
 static void pc_cmos_init_late(void *opaque)
 {
 pc_cmos_init_late_arg *arg = opaque;
@@ -368,8 +393,6 @@ static void pc_cmos_init_late(void *opaque)
 int8_t heads, sectors;
 int val;
 int i, trans;
-Object *container;
-CheckFdcState state = { 0 };
 
 val = 0;
 if (ide_get_geometry(arg->idebus[0], 0,
@@ -399,22 +422,7 @@ static void pc_cmos_init_late(void *opaque)
 }
 rtc_set_memory(s, 0x39, val);
 
-/*
- * Locate the FDC at IO address 0x3f0, and configure the CMOS registers
- * accordingly.
- */
-for (i = 0; i < ARRAY_SIZE(fdc_container_path); i++) {
-container = container_get(qdev_get_machine(), fdc_container_path[i]);
-object_child_foreach(container, check_fdc, &state);
-}
-
-if (state.multiple) {
-error_report("warning: multiple floppy disk controllers with "
- "iobase=0x3f0 have been found;\n"
- "the one being picked for CMOS setup might not reflect "
- "your intent");
-}
-pc_cmos_init_floppy(s, state.floppy);
+pc_cmos_init_floppy(s, pc_find_fdc0());
 
 qemu_unregister_reset(pc_cmos_init_late, opaque);
 }
-- 
MST
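
The lookup that this patch factors out follows a common pattern: walk a set of candidates, keep the first match, and record whether more than one matched so the caller can warn. The following is a minimal standalone sketch of that pattern, not the QEMU code. `DemoDevice`, `DemoScan`, `demo_check` and `demo_find` are hypothetical names, and a plain array stands in for the QOM containers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device; not a QEMU type. */
typedef struct {
    uint32_t iobase;
} DemoDevice;

/* Scan result: the first match, plus whether more than one matched. */
typedef struct {
    const DemoDevice *found;
    bool multiple;
} DemoScan;

/* Visit one device: keep the first match, flag any later one. */
static void demo_check(const DemoDevice *dev, uint32_t iobase, DemoScan *scan)
{
    if (dev->iobase != iobase) {
        return;
    }
    if (scan->found) {
        scan->multiple = true;
    } else {
        scan->found = dev;
    }
}

/* Scan an array of devices and return the accumulated state. */
static DemoScan demo_find(const DemoDevice *devs, size_t n, uint32_t iobase)
{
    DemoScan scan = { NULL, false };

    for (size_t i = 0; i < n; i++) {
        demo_check(&devs[i], iobase, &scan);
    }
    return scan;
}
```

A caller would report a warning when `multiple` is set and then use `found`, which may be NULL when nothing matched.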

Re: [Qemu-block] [PATCH v5 0/6] i386: expose floppy-related objects in SSDT

2016-01-07 Thread Michael S. Tsirkin

On Thu, Jan 07, 2016 at 12:56:09PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 06, 2016 at 03:04:40PM +0100, Igor Mammedov wrote:
> > On Wed, 30 Dec 2015 23:11:50 +0300
> > Roman Kagan <rka...@virtuozzo.com> wrote:
> > 
> > > Windows on UEFI systems is only capable of detecting the presence and
> > > the type of floppy drives via corresponding ACPI objects.
> > > 
> > > Those objects are added in patch 5; the preceding ones pave the way to
> > > it, by making the necessary data public and by moving the whole
> > > floppy drive controller description into runtime-generated SSDT.
> > > 
> > > Note that the series conflicts with Igor's patchset for dynamic DSDT, in
> > > particular, with "[PATCH v2 27/51] pc: acpi: move FDC0 device from DSDT
> > > to SSDT"; I haven't managed to avoid that while trying to meet
> > > maintainer's comments.
> > 
> > Tested with XPsp3 WS2008R2 WS2012R2, no regressions so far it boots fine 
> > and can read floppy.
> > 
> > So for whole series:
> > Reviewed-by: Igor Mammedov <imamm...@redhat.com>
> > 
> 
> Igor, could you pls rebase this on top of your patches?
> I've merged them in my tree.

Pls remember to keep the author information intact though.

> > > Roman Kagan (6):
> > >   i386/pc: expose identifying the floppy controller
> > >   i386/acpi: make floppy controller object dynamic
> > >   tests/acpi: update test data
> > >   expose floppy drive geometry and CMOS type
> > >   i386: populate floppy drive information in SSDT
> > >   tests/acpi: update test data
> > > 
> > > Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
> > > Cc: "Michael S. Tsirkin" <m...@redhat.com>
> > > Cc: Eduardo Habkost <ehabk...@redhat.com>
> > > Cc: Igor Mammedov <imamm...@redhat.com>
> > > Cc: John Snow <js...@redhat.com>
> > > Cc: Kevin Wolf <kw...@redhat.com>
> > > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > > Cc: Richard Henderson <r...@twiddle.net>
> > > Cc: qemu-block@nongnu.org
> > > Cc: qemu-sta...@nongnu.org
> > > ---
> > > changes since v4:
> > >  - re-split out code changes from test data updates
> > > 
> > > changes since v3:
> > >  - make FDC object fully dynamic in a separate patch
> > >  - split out support patches
> > >  - include test data updates with the respective patches to maintain
> > >bisectability
> > > 
> > > changes since v2:
> > >  - explicit endianness for buffer data
> > >  - reorder code to reduce conflicts with dynamic DSDT patchset
> > >  - update test data
> > > 
> > >  hw/block/fdc.c  |  11 +
> > >  hw/i386/acpi-build.c|  92 
> > > 
> > >  hw/i386/acpi-dsdt-isa.dsl   |  18 ---
> > >  hw/i386/acpi-dsdt.dsl   |   1 -
> > >  hw/i386/pc.c|  46 ++
> > >  hw/i386/q35-acpi-dsdt.dsl   |   7 +--
> > >  include/hw/block/fdc.h  |   2 +
> > >  include/hw/i386/pc.h|   3 ++
> > >  tests/acpi-test-data/pc/DSDT| Bin 3028 -> 2946 bytes
> > >  tests/acpi-test-data/pc/SSDT| Bin 2486 -> 2635 bytes
> > >  tests/acpi-test-data/pc/SSDT.bridge | Bin 4345 -> 4494 bytes
> > >  tests/acpi-test-data/q35/DSDT   | Bin 7666 -> 7578 bytes
> > >  12 files changed, 137 insertions(+), 43 deletions(-)
> > >

Re: [Qemu-block] [PATCH v5 0/6] i386: expose floppy-related objects in SSDT

2016-01-07 Thread Michael S. Tsirkin

On Wed, Jan 06, 2016 at 03:04:40PM +0100, Igor Mammedov wrote:
> On Wed, 30 Dec 2015 23:11:50 +0300
> Roman Kagan <rka...@virtuozzo.com> wrote:
> 
> > Windows on UEFI systems is only capable of detecting the presence and
> > the type of floppy drives via corresponding ACPI objects.
> > 
> > Those objects are added in patch 5; the preceding ones pave the way to
> > it, by making the necessary data public and by moving the whole
> > floppy drive controller description into runtime-generated SSDT.
> > 
> > Note that the series conflicts with Igor's patchset for dynamic DSDT, in
> > particular, with "[PATCH v2 27/51] pc: acpi: move FDC0 device from DSDT
> > to SSDT"; I haven't managed to avoid that while trying to meet
> > maintainer's comments.
> 
> Tested with XPsp3 WS2008R2 WS2012R2, no regressions so far it boots fine and 
> can read floppy.
> 
> So for whole series:
> Reviewed-by: Igor Mammedov <imamm...@redhat.com>
> 

Igor, could you pls rebase this on top of your patches?
I've merged them in my tree.

> > Roman Kagan (6):
> >   i386/pc: expose identifying the floppy controller
> >   i386/acpi: make floppy controller object dynamic
> >   tests/acpi: update test data
> >   expose floppy drive geometry and CMOS type
> >   i386: populate floppy drive information in SSDT
> >   tests/acpi: update test data
> > 
> > Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
> > Cc: "Michael S. Tsirkin" <m...@redhat.com>
> > Cc: Eduardo Habkost <ehabk...@redhat.com>
> > Cc: Igor Mammedov <imamm...@redhat.com>
> > Cc: John Snow <js...@redhat.com>
> > Cc: Kevin Wolf <kw...@redhat.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Richard Henderson <r...@twiddle.net>
> > Cc: qemu-block@nongnu.org
> > Cc: qemu-sta...@nongnu.org
> > ---
> > changes since v4:
> >  - re-split out code changes from test data updates
> > 
> > changes since v3:
> >  - make FDC object fully dynamic in a separate patch
> >  - split out support patches
> >  - include test data updates with the respective patches to maintain
> >bisectability
> > 
> > changes since v2:
> >  - explicit endianness for buffer data
> >  - reorder code to reduce conflicts with dynamic DSDT patchset
> >  - update test data
> > 
> >  hw/block/fdc.c  |  11 +
> >  hw/i386/acpi-build.c|  92 
> > 
> >  hw/i386/acpi-dsdt-isa.dsl   |  18 ---
> >  hw/i386/acpi-dsdt.dsl   |   1 -
> >  hw/i386/pc.c|  46 ++
> >  hw/i386/q35-acpi-dsdt.dsl   |   7 +--
> >  include/hw/block/fdc.h  |   2 +
> >  include/hw/i386/pc.h|   3 ++
> >  tests/acpi-test-data/pc/DSDT| Bin 3028 -> 2946 bytes
> >  tests/acpi-test-data/pc/SSDT| Bin 2486 -> 2635 bytes
> >  tests/acpi-test-data/pc/SSDT.bridge | Bin 4345 -> 4494 bytes
> >  tests/acpi-test-data/q35/DSDT   | Bin 7666 -> 7578 bytes
> >  12 files changed, 137 insertions(+), 43 deletions(-)
> >

Re: [Qemu-block] [PATCH] SCSI device: fix to incomplete QOMify

2016-01-06 Thread Michael S. Tsirkin

On Wed, Jan 06, 2016 at 05:37:46PM +0800, Cao jin wrote:
> Signed-off-by: Cao jin <caoj.f...@cn.fujitsu.com>

Acked-by: Michael S. Tsirkin <m...@redhat.com>

> ---
>  hw/scsi/megasas.c | 12 ++--
>  hw/scsi/scsi-bus.c|  4 ++--
>  hw/scsi/virtio-scsi.c |  2 +-
>  3 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> index d7dc667..78239bf 100644
> --- a/hw/scsi/megasas.c
> +++ b/hw/scsi/megasas.c
> @@ -744,7 +744,7 @@ static int megasas_ctrl_get_info(MegasasState *s, 
> MegasasCmd *cmd)
>  info.device.type = MFI_INFO_DEV_SAS3G;
>  info.device.port_count = 8;
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  uint16_t pd_id;
>  
>  if (num_pd_disks < 8) {
> @@ -960,7 +960,7 @@ static int megasas_dcmd_pd_get_list(MegasasState *s, 
> MegasasCmd *cmd)
>  max_pd_disks = MFI_MAX_SYS_PDS;
>  }
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  uint16_t pd_id;
>  
>  if (num_pd_disks >= max_pd_disks)
> @@ -1136,7 +1136,7 @@ static int megasas_dcmd_ld_get_list(MegasasState *s, 
> MegasasCmd *cmd)
>  max_ld_disks = MFI_MAX_LD;
>  }
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  
>  if (num_ld_disks >= max_ld_disks) {
>  break;
> @@ -1187,7 +1187,7 @@ static int megasas_dcmd_ld_list_query(MegasasState *s, 
> MegasasCmd *cmd)
>  max_ld_disks = MFI_MAX_LD;
>  }
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  
>  if (num_ld_disks >= max_ld_disks) {
>  break;
> @@ -1327,7 +1327,7 @@ static int megasas_dcmd_cfg_read(MegasasState *s, 
> MegasasCmd *cmd)
>  ld_offset = array_offset + sizeof(struct mfi_array) * num_pd_disks;
>  
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  uint16_t sdev_id = ((sdev->id & 0xFF) << 8) | (sdev->lun & 0xFF);
>  struct mfi_array *array;
>  struct mfi_ld_config *ld;
> @@ -2237,7 +2237,7 @@ static void megasas_soft_reset(MegasasState *s)
>   * after the initial reset.
>   */
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> -SCSIDevice *sdev = DO_UPCAST(SCSIDevice, qdev, kid->child);
> +SCSIDevice *sdev = SCSI_DEVICE(kid->child);
>  
>  sdev->unit_attention = SENSE_CODE(NO_SENSE);
>  scsi_device_unit_attention_reported(sdev);
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 00bddc9..fea0257 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -1850,7 +1850,7 @@ void scsi_device_purge_requests(SCSIDevice *sdev, 
> SCSISense sense)
>  
>  static char *scsibus_get_dev_path(DeviceState *dev)
>  {
> -SCSIDevice *d = DO_UPCAST(SCSIDevice, qdev, dev);
> +SCSIDevice *d = SCSI_DEVICE(dev);
>  DeviceState *hba = dev->parent_bus->parent;
>  char *id;
>  char *path;
> @@ -2023,7 +2023,7 @@ static void scsi_device_class_init(ObjectClass *klass, 
> void *data)
>  static void scsi_dev_instance_init(Object *obj)
>  {
>  DeviceState *dev = DEVICE(obj);
> -SCSIDevice *s = DO_UPCAST(SCSIDevice, qdev, dev);
> +SCSIDevice *s = SCSI_DEVICE(dev);
>  
>  device_add_bootindex_property(obj, &s->conf.bootindex,
>"bootindex", NULL,
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 3a4f520..607593c 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -352,7 +352,7 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, 
> VirtIOSCSIReq *req)
>  target = req->req.tmf.lun[1];
>  s->resetting++;
>  QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
> - d = DO_UPCAST(SCSIDevice, qdev, kid->child);
> + d = SCSI_DEVICE(kid->child);
>   if (d->channel == 0 && d->id == target) {
>  qdev_reset_all(&d->qdev);
>   }
> -- 
> 2.1.0
> 
>

Re: [Qemu-block] [Qemu-devel] [PATCH v5 4/6] expose floppy drive geometry and CMOS type

2016-01-04 Thread Michael S. Tsirkin

On Mon, Jan 04, 2016 at 03:44:42PM -0500, John Snow wrote:
> 
> 
> On 12/30/2015 03:11 PM, Roman Kagan wrote:
> > Make it possible to query the geometry and the CMOS type of a floppy
> > drive outside of the respective source files.
> > 
> > It will be useful, in particular, when dynamically building ACPI tables,
> > and will allow to properly populate the corresponding ACPI objects and
> > thus enable BIOS-less systems to access the floppy drives.
> > 
> > Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
> > Cc: "Michael S. Tsirkin" <m...@redhat.com>
> > Cc: Eduardo Habkost <ehabk...@redhat.com>
> > Cc: Igor Mammedov <imamm...@redhat.com>
> > Cc: John Snow <js...@redhat.com>
> > Cc: Kevin Wolf <kw...@redhat.com>
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: Richard Henderson <r...@twiddle.net>
> > Cc: qemu-block@nongnu.org
> > Cc: qemu-sta...@nongnu.org
> > ---
> > no changes since v4
> > 
> > changes since v3:
> >  - split out into a separate patch to faciliate review
> > 
> >  hw/block/fdc.c | 11 +++
> >  hw/i386/pc.c   |  2 +-
> >  include/hw/block/fdc.h |  2 ++
> >  include/hw/i386/pc.h   |  1 +
> >  4 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> > index 4292ece..c858c5f 100644
> > --- a/hw/block/fdc.c
> > +++ b/hw/block/fdc.c
> > @@ -2408,6 +2408,17 @@ FDriveType isa_fdc_get_drive_type(ISADevice *fdc, 
> > int i)
> >  return isa->state.drives[i].drive;
> >  }
> >  
> > +void isa_fdc_get_drive_geometry(ISADevice *fdc, int i, uint8_t *cylinders,
> > +uint8_t *heads, uint8_t *sectors)
> > +{
> > +FDCtrlISABus *isa = ISA_FDC(fdc);
> > > +FDrive *drv = &isa->state.drives[i];
> > +
> > +*cylinders = drv->max_track;
> > +*heads = (drv->flags & FDISK_DBL_SIDES) ? 2 : 1;
> > +*sectors = drv->last_sect;
> > +}
> > +
> >  static const VMStateDescription vmstate_isa_fdc ={
> >  .name = "fdc",
> >  .version_id = 2,
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index c36b8cf..99fab83 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -199,7 +199,7 @@ static void pic_irq_request(void *opaque, int irq, int 
> > level)
> >  
> >  #define REG_EQUIPMENT_BYTE  0x14
> >  
> > -static int cmos_get_fd_drive_type(FDriveType fd0)
> > +int cmos_get_fd_drive_type(FDriveType fd0)
> >  {
> >  int val;
> >  
> > diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
> > index d48b2f8..adaf3dc 100644
> > --- a/include/hw/block/fdc.h
> > +++ b/include/hw/block/fdc.h
> > @@ -22,5 +22,7 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
> > DriveInfo **fds, qemu_irq *fdc_tc);
> >  
> >  FDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
> > +void isa_fdc_get_drive_geometry(ISADevice *fdc, int i, uint8_t *cylinders,
> > +uint8_t *heads, uint8_t *sectors);
> >  
> >  #endif
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 819..d044a9a 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -268,6 +268,7 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
> >  void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
> >  
> >  ISADevice *pc_find_fdc0(void);
> > +int cmos_get_fd_drive_type(FDriveType fd0);
> >  
> >  /* acpi_piix.c */
> >  
> > 
> 
> Patches 1,4:
> 
> Reviewed-by: John Snow <js...@redhat.com>
> 
> Aside: Why did they have you split out the test changes to be separate
> from the code? Doesn't that introduce commits where the tests now fail?
> 
> --js

It's only a warning not a failure.

[Qemu-block] [PATCH for-2.5] blkdebug: silence warning under qtest

2015-11-30 Thread Michael S. Tsirkin

make check always outputs these warnings, which is not nice.
Disable blkdebug warnings under qtest.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 block/blkdebug.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 6860a2b..dee3a0e 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -30,6 +30,7 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
+#include "sysemu/qtest.h"
 
 typedef struct BDRVBlkdebugState {
 int state;
@@ -583,9 +584,13 @@ static void suspend_request(BlockDriverState *bs, 
BlkdebugRule *rule)
 remove_rule(rule);
 QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);
 
-printf("blkdebug: Suspended request '%s'\n", r.tag);
+if (!qtest_enabled()) {
+printf("blkdebug: Suspended request '%s'\n", r.tag);
+}
 qemu_coroutine_yield();
-printf("blkdebug: Resuming request '%s'\n", r.tag);
+if (!qtest_enabled()) {
+printf("blkdebug: Resuming request '%s'\n", r.tag);
+}
 
 QLIST_REMOVE(&r, next);
 g_free(r.tag);
-- 
MST

[Qemu-block] [PULL 08/16] virtio-blk: convert to virtqueue_map

2015-10-29 Thread Michael S. Tsirkin

Drop deprecated use of virtqueue_map_sg.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Igor Mammedov <imamm...@redhat.com>
---
 hw/block/virtio-blk.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8beb26b..3e230de 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -839,10 +839,7 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 req->next = s->rq;
 s->rq = req;
 
-virtqueue_map_sg(req->elem.in_sg, req->elem.in_addr,
-req->elem.in_num, 1);
-virtqueue_map_sg(req->elem.out_sg, req->elem.out_addr,
-req->elem.out_num, 0);
+virtqueue_map(&req->elem);
 }
 
 return 0;
-- 
MST

[Qemu-block] [PATCH 1/2] dataplane: simplify indirect descriptor read

2015-10-28 Thread Michael S. Tsirkin

Use address_space_read to make sure we handle the case of an indirect
descriptor crossing DIMM boundary correctly.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---

Warning: compile-tested only.

 hw/virtio/dataplane/vring.c | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 68f1994..0b92fcf 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -257,6 +257,21 @@ static void copy_in_vring_desc(VirtIODevice *vdev,
 host->next = virtio_lduw_p(vdev, &guest->next);
 }
 
+static bool read_vring_desc(VirtIODevice *vdev,
+hwaddr guest,
+struct vring_desc *host)
+{
+if (address_space_read(&address_space_memory, guest,
MEMTXATTRS_UNSPECIFIED,
+   (uint8_t *)host, sizeof *host)) {
+return false;
+}
+host->addr = virtio_tswap64(vdev, host->addr);
+host->len = virtio_tswap32(vdev, host->len);
+host->flags = virtio_tswap16(vdev, host->flags);
+host->next = virtio_tswap16(vdev, host->next);
+return true;
+}
+
 /* This is stolen from linux/drivers/vhost/vhost.c. */
 static int get_indirect(VirtIODevice *vdev, Vring *vring,
 VirtQueueElement *elem, struct vring_desc *indirect)
@@ -284,23 +299,16 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 }
 
 do {
-struct vring_desc *desc_ptr;
-MemoryRegion *mr;
-
 /* Translate indirect descriptor */
-desc_ptr = vring_map(&mr,
- indirect->addr + found * sizeof(desc),
- sizeof(desc), false);
-if (!desc_ptr) {
-error_report("Failed to map indirect descriptor "
+if (!read_vring_desc(vdev, indirect->addr + found * sizeof(desc),
+ &desc)) {
+error_report("Failed to read indirect descriptor "
  "addr %#" PRIx64 " len %zu",
  (uint64_t)indirect->addr + found * sizeof(desc),
  sizeof(desc));
 vring->broken = true;
 return -EFAULT;
 }
-copy_in_vring_desc(vdev, desc_ptr, &desc);
-memory_region_unref(mr);
 
 /* Ensure descriptor has been loaded before accessing fields */
 barrier(); /* read_barrier_depends(); */
-- 
MST
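
The idea behind read_vring_desc() is to copy the descriptor out of guest memory and then fix up the byte order of each field, instead of mapping the memory and dereferencing it. Below is a small standalone sketch of that copy-and-decode step. It assumes the virtio 1.0 little-endian descriptor layout (64-bit addr, 32-bit len, 16-bit flags, 16-bit next) and uses a plain byte array in place of the guest address space. `DemoDesc`, `demo_le` and `demo_read_desc` are illustrative names, not QEMU functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor fields in host byte order (same fields as struct vring_desc). */
typedef struct {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
} DemoDesc;

/* Read an n-byte little-endian integer starting at p. */
static uint64_t demo_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    for (size_t i = 0; i < n; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/*
 * Decode the 16-byte descriptor at offset `off` of a guest memory image.
 * Returns false if the descriptor does not fit inside the image.
 */
static bool demo_read_desc(const uint8_t *mem, size_t mem_len, size_t off,
                           DemoDesc *out)
{
    if (off > mem_len || mem_len - off < 16) {
        return false;
    }
    const uint8_t *p = mem + off;

    out->addr = demo_le(p, 8);
    out->len = (uint32_t)demo_le(p + 8, 4);
    out->flags = (uint16_t)demo_le(p + 12, 2);
    out->next = (uint16_t)demo_le(p + 14, 2);
    return true;
}
```

Because the bytes are copied before being decoded, the sketch does not care whether the descriptor straddles two separately mapped regions; only the bounds check matters.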

[Qemu-block] [PATCH] fixup! dataplane: support non-contigious s/g

2015-10-28 Thread Michael S. Tsirkin

Should fix issues Stefan reported.

---

Built only.

 hw/virtio/dataplane/vring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 9ae9424..23f667e 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -261,8 +261,8 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
 
 /* The MemoryRegion is looked up again and unref'ed later, leave the
  * ref in place.  */
-iov->iov_len = len;
-*addr = desc->addr;
+(iov++)->iov_len = len;
+*addr++ = desc->addr;
 desc->len -= len;
 desc->addr += len;
 *num += 1;
-- 
MST
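
The fixup advances the output pointers so that every chunk of a split descriptor lands in its own slot rather than overwriting the first one. The sketch below shows the same loop shape in isolation. It is an assumption-laden illustration: a fixed `boundary` stands in for whatever limits the length of one mapping, and `DemoSeg` and `demo_split` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One output segment: where a chunk starts and how long it is. */
typedef struct {
    uint64_t addr;
    size_t len;
} DemoSeg;

/*
 * Split [addr, addr + len) into chunks that never cross a multiple of
 * `boundary` (which must be non-zero), writing one segment per chunk.
 * Returns the number of segments written, or -1 if `max` is too small.
 */
static int demo_split(uint64_t addr, size_t len, uint64_t boundary,
                      DemoSeg *segs, int max)
{
    int num = 0;

    while (len > 0) {
        uint64_t room = boundary - (addr % boundary);
        size_t chunk = len < room ? len : (size_t)room;

        if (num == max) {
            return -1;
        }
        segs->addr = addr;
        segs->len = chunk;
        segs++;             /* without this, every chunk overwrites slot 0 */
        num++;
        addr += chunk;
        len -= chunk;
    }
    return num;
}
```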

[Qemu-block] [PATCH 3/6] virtio-blk: convert to virtqueue_map

2015-10-27 Thread Michael S. Tsirkin

Drop deprecated use of virtqueue_map_sg.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/block/virtio-blk.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8beb26b..3e230de 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -839,10 +839,7 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 req->next = s->rq;
 s->rq = req;
 
-virtqueue_map_sg(req->elem.in_sg, req->elem.in_addr,
-req->elem.in_num, 1);
-virtqueue_map_sg(req->elem.out_sg, req->elem.out_addr,
-req->elem.out_num, 0);
+virtqueue_map(&req->elem);
 }
 
 return 0;
-- 
MST

Re: [Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-19 Thread Michael S. Tsirkin

On Mon, Oct 19, 2015 at 01:53:50PM +0200, Cornelia Huck wrote:
> On Sun, 18 Oct 2015 10:59:59 +0300
> "Michael S. Tsirkin" <m...@redhat.com> wrote:
> 
> > On Fri, Oct 16, 2015 at 01:07:28PM +0200, Christian Borntraeger wrote:
> 
> > > Lets keep this patch as is to have scsi=off as default for virtio 1.0
> > > 
> > > (some iotests do fail because of this)
> > > 
> > > Christian
> > 
> > What fails, exactly?
> 
> For example, testcase 068 (on s390x, since we default virtio-1 to on):

I see, thanks.
So it's just the assertion that we have in code that fires.
Sure, scsi must be off by default before virtio 1 is on.


> --- /data/git/yyy/qemu/tests/qemu-iotests/068.out 2015-03-09 
> 12:32:35.245444052 +0100
> +++ 068.out.bad   2015-10-19 13:48:00.023772655 +0200
> @@ -3,9 +3,8 @@
>  === Saving and reloading a VM state to/from a qcow2 image ===
>  
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
> +qemu-system-s390x: -hda 
> /data/git/yyy/qemu/build/tests/qemu-iotests/scratch/t.qcow2: Please set 
> scsi=off for virtio-blk devices in order to use virtio 1.0
>  QEMU X.Y.Z monitor - type 'help' for more information
> -(qemu) savevm 0
> -(qemu) quit
> +(qemu) qemu-system-s390x: -hda 
> /data/git/yyy/qemu/build/tests/qemu-iotests/scratch/t.qcow2: Please set 
> scsi=off for virtio-blk devices in order to use virtio 1.0
>  QEMU X.Y.Z monitor - type 'help' for more information
> -(qemu) quit
> -*** done
> +(qemu) *** done

Re: [Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-18 Thread Michael S. Tsirkin

On Fri, Oct 16, 2015 at 01:07:28PM +0200, Christian Borntraeger wrote:
> Am 16.10.2015 um 12:44 schrieb Cornelia Huck:
> > On Fri, 16 Oct 2015 12:32:52 +0200
> > Christian Borntraeger  wrote:
> > 
> >> Am 16.10.2015 um 12:25 schrieb Cornelia Huck:
> >>> Devices that are compliant with virtio-1 do not support scsi
> >>> passthrough any more (and it has not been a recommended setup
> >>> anyway for quite some time). To avoid having to switch it off
> >>> explicitly in newer qemus that turn on virtio-1 by default, let's
> >>> switch the default to scsi=false for 2.5.
> >>>
> >>> Signed-off-by: Cornelia Huck 
> >>> ---
> >>>  hw/block/virtio-blk.c | 2 +-
> >>>  include/hw/compat.h   | 6 +-
> >>>  2 files changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> >>> index 8beb26b..999dbd7 100644
> >>> --- a/hw/block/virtio-blk.c
> >>> +++ b/hw/block/virtio-blk.c
> >>> @@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
> >>>  DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
> >>>  DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
> >>>  #ifdef __linux__
> >>> -DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
> >>> +DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
> >>>  #endif
> >>>  DEFINE_PROP_BIT("request-merging", VirtIOBlock, 
> >>> conf.request_merging, 0,
> >>>  true),
> >>> diff --git a/include/hw/compat.h b/include/hw/compat.h
> >>> index 095de5d..93e71af 100644
> >>> --- a/include/hw/compat.h
> >>> +++ b/include/hw/compat.h
> >>> @@ -2,7 +2,11 @@
> >>>  #define HW_COMPAT_H
> >>>
> >>>  #define HW_COMPAT_2_4 \
> >>> -/* empty */
> >>> +{\
> >>> +.driver   = "virtio-blk-device",\
> >>> +.property = "scsi",\
> >>> +.value= "true",\
> >>
> >> does that work?
> > 
> > It did for me :)
> > 
> >>
> >> If yes, would it make sense to convert the things in HW_COMPAT_2_3 from
> >> pci to device, e.g.
> >>
> >>
> >> {\
> >> -   .driver   = "virtio-blk-pci",\
> >> +   .driver   = "virtio-blk-device",\
> >> .property = "any_layout",\
> >> .value= "off",\
> >> ...
> > 
> > Not sure: We don't have 2.3 compat for ccw... but would give a better
> > template for later changes.
> 
> Yes. But this can be an addon patch. 
> 
> Lets keep this patch as is to have scsi=off as default for virtio 1.0
> 
> (some iotests do fail because of this)
> 
> Christian

What fails, exactly?

Re: [Qemu-block] [PATCH v2 2/6] hw/virtio/virtio-pci: Use pow2ceil() rather than hand-calculation

2015-08-12 Thread Michael S. Tsirkin

On Fri, Jul 24, 2015 at 01:33:08PM +0100, Peter Maydell wrote:
> Use the utility function pow2ceil() for rounding up to the next
> largest power of 2, rather than inline calculation.
> 
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>

Reviewed-by: Michael S. Tsirkin <m...@redhat.com>

> ---
>  hw/virtio/virtio-pci.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 283401a..845f52f 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -1497,9 +1497,7 @@ static void virtio_pci_device_plugged(DeviceState *d, 
> Error **errp)
>  if (legacy) {
>  size = VIRTIO_PCI_REGION_SIZE(&proxy->pci_dev)
>  + virtio_bus_get_vdev_config_len(bus);
> -if (size & (size - 1)) {
> -size = 1 << qemu_fls(size);
> -}
> +size = pow2ceil(size);
>  
>  memory_region_init_io(&proxy->bar, OBJECT(proxy),
>  &virtio_pci_config_ops,
> -- 
> 1.9.1

Re: [Qemu-block] [PATCH v2 1/6] hw/pci: Use pow2ceil() rather than hand-calculation

2015-08-12 Thread Michael S. Tsirkin

On Fri, Jul 24, 2015 at 01:33:07PM +0100, Peter Maydell wrote:
> A couple of places in hw/pci use an inline calculation to round a
> size up to the next largest power of 2. We have a utility routine
> for this, so use it.
> 
> (The behaviour of the old code is different if the size value
> is 0 -- it would leave it as 0 rather than rounding up to 1,
> but in both cases we know the size can't be 0.
> In the case where the size value had bit 31 set, the old code
> would invoke undefined behaviour; the new code will give a
> result of 0. Presumably that could never happen either.)
> 
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>

Reviewed-by: Michael S. Tsirkin <m...@redhat.com>

> ---
>  hw/pci/msix.c | 4 +---
>  hw/pci/pci.c  | 4 +---
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> index 7716bf3..2fdada4 100644
> --- a/hw/pci/msix.c
> +++ b/hw/pci/msix.c
> @@ -314,9 +314,7 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsigned 
> short nentries,
>  bar_size = bar_pba_offset + bar_pba_size;
>  }
>  
> -if (bar_size & (bar_size - 1)) {
> -bar_size = 1 << qemu_fls(bar_size);
> -}
> +bar_size = pow2ceil(bar_size);
>  
>  name = g_strdup_printf("%s-msix", dev->name);
>  memory_region_init(&dev->msix_exclusive_bar, OBJECT(dev), name, 
> bar_size);
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index a017614..502da8d 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2065,9 +2065,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool 
> is_default_rom,
>  g_free(path);
>  return;
>  }
> -if (size & (size - 1)) {
> -size = 1 << qemu_fls(size);
> -}
> +size = pow2ceil(size);
>  
>  vmsd = qdev_get_vmsd(DEVICE(pdev));
>  
> -- 
> 1.9.1
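
The edge cases discussed in the quoted commit message (a size of 0, and a size with bit 31 set) can be seen by putting the two approaches side by side. This is a standalone sketch, not the QEMU implementation: `demo_fls`, `demo_round_inline` and `demo_pow2ceil` are hypothetical helpers, and the overflow handling in both is a choice made here to keep the example free of undefined behaviour.

```c
#include <stdint.h>

/* 1-based index of the highest set bit; 0 when x is 0. */
static int demo_fls(uint32_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * The inline pattern removed by the patch. A size of 0 stays 0. For a
 * size above 2^31 the original would shift by 32, which is undefined,
 * so this sketch returns 0 in that case instead.
 */
static uint32_t demo_round_inline(uint32_t size)
{
    if (size > UINT32_C(0x80000000)) {
        return 0;
    }
    if (size & (size - 1)) {
        size = UINT32_C(1) << demo_fls(size);
    }
    return size;
}

/*
 * A pow2ceil-style helper: rounds up to a power of two, maps 0 to 1,
 * and returns 0 when the result would not fit in 64 bits.
 */
static uint64_t demo_pow2ceil(uint64_t value)
{
    uint64_t p = 1;

    while (p < value) {
        if (p > UINT64_MAX / 2) {
            return 0;
        }
        p <<= 1;
    }
    return p;
}
```

For ordinary non-zero sizes the two agree; they differ only at 0, which matches the difference described in the commit message.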

[Qemu-block] [PULL 08/10] virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device

2015-07-28 Thread Michael S. Tsirkin

From: Jason Wang <jasow...@redhat.com>

Chapter 6.3 of spec said


Transitional devices MUST offer, and if offered by the device
transitional drivers MUST accept the following:

VIRTIO_F_ANY_LAYOUT (27)


So this patch only clears VIRTIO_F_ANY_LAYOUT for legacy devices.

Cc: Stefan Hajnoczi <stefa...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasow...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Paolo Bonzini <pbonz...@redhat.com>
---
 hw/block/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index ebd9d84..44f9b8e 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -731,7 +731,6 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, 
uint64_t features,
 virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
 virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
 virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
-virtio_clear_feature(&features, VIRTIO_F_ANY_LAYOUT);
 if (__virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
 if (s->conf.scsi) {
 error_setg(errp, "Please set scsi=off for virtio-blk devices in 
order to use virtio 1.0");
@@ -739,6 +738,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, 
uint64_t features,
 }
 virtio_add_feature(&features, VIRTIO_F_ANY_LAYOUT);
 } else {
+virtio_clear_feature(&features, VIRTIO_F_ANY_LAYOUT);
 virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
 }
 }
 
-- 
MST
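
The effect of moving the clear into the legacy branch can be sketched with plain 64-bit feature masks. The bit numbers below come from the virtio specification (VIRTIO_BLK_F_SCSI is 7, VIRTIO_F_ANY_LAYOUT is 27, VIRTIO_F_VERSION_1 is 32). Everything else is a hypothetical stand-in for the QEMU helpers: the `demo_*` names, the `scsi` flag and the `err` out-parameter, and returning 0 on error is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VIRTIO_BLK_F_SCSI    7
#define DEMO_VIRTIO_F_ANY_LAYOUT  27
#define DEMO_VIRTIO_F_VERSION_1   32

static void demo_add(uint64_t *f, unsigned bit)
{
    *f |= UINT64_C(1) << bit;
}

static void demo_clear(uint64_t *f, unsigned bit)
{
    *f &= ~(UINT64_C(1) << bit);
}

static bool demo_has(uint64_t f, unsigned bit)
{
    return (f >> bit) & 1;
}

/*
 * Adjust the offered features. With VERSION_1, ANY_LAYOUT is always
 * offered and scsi passthrough is rejected. Without it (legacy),
 * ANY_LAYOUT is cleared and the SCSI feature is offered.
 */
static uint64_t demo_blk_features(uint64_t features, bool scsi, bool *err)
{
    *err = false;
    if (demo_has(features, DEMO_VIRTIO_F_VERSION_1)) {
        if (scsi) {
            *err = true;
            return 0;
        }
        demo_add(&features, DEMO_VIRTIO_F_ANY_LAYOUT);
    } else {
        demo_clear(&features, DEMO_VIRTIO_F_ANY_LAYOUT);
        demo_add(&features, DEMO_VIRTIO_BLK_F_SCSI);
    }
    return features;
}
```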

[Qemu-block] [PULL 06/10] virtio: get_features() can fail

2015-07-28 Thread Michael S. Tsirkin

From: Jason Wang <jasow...@redhat.com>

Signed-off-by: Jason Wang <jasow...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Paolo Bonzini <pbonz...@redhat.com>
---
 include/hw/virtio/virtio.h  | 4 +++-
 hw/9pfs/virtio-9p-device.c  | 3 ++-
 hw/block/virtio-blk.c   | 3 ++-
 hw/char/virtio-serial-bus.c | 3 ++-
 hw/display/virtio-gpu.c | 3 ++-
 hw/input/virtio-input.c | 3 ++-
 hw/net/virtio-net.c | 3 ++-
 hw/scsi/vhost-scsi.c| 3 ++-
 hw/scsi/virtio-scsi.c   | 3 ++-
 hw/virtio/virtio-balloon.c  | 3 ++-
 hw/virtio/virtio-bus.c  | 3 ++-
 hw/virtio/virtio-rng.c  | 2 +-
 12 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index ff91711..59f0763 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -101,7 +101,9 @@ typedef struct VirtioDeviceClass {
 /* This is what a VirtioDevice must implement */
 DeviceRealize realize;
 DeviceUnrealize unrealize;
-uint64_t (*get_features)(VirtIODevice *vdev, uint64_t requested_features);
+uint64_t (*get_features)(VirtIODevice *vdev,
+ uint64_t requested_features,
+ Error **errp);
 uint64_t (*bad_features)(VirtIODevice *vdev);
 void (*set_features)(VirtIODevice *vdev, uint64_t val);
 int (*validate_features)(VirtIODevice *vdev);
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 3f4c9e7..93a407c 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -21,7 +21,8 @@
 #include "virtio-9p-coth.h"
 #include "hw/virtio/virtio-access.h"
 
-static uint64_t virtio_9p_get_features(VirtIODevice *vdev, uint64_t features)
+static uint64_t virtio_9p_get_features(VirtIODevice *vdev, uint64_t features,
+   Error **errp)
 {
 virtio_add_feature(features, VIRTIO_9P_MOUNT_TAG);
 return features;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 015b9b5..a6cf008 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -722,7 +722,8 @@ static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
 aio_context_release(blk_get_aio_context(s->blk));
 }
 
-static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features)
+static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
+Error **errp)
 {
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 929e49c..bc56f5d 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -500,7 +500,8 @@ static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
 }
 }
 
-static uint64_t get_features(VirtIODevice *vdev, uint64_t features)
+static uint64_t get_features(VirtIODevice *vdev, uint64_t features,
+ Error **errp)
 {
 VirtIOSerial *vser;
 
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 990a26b..a67d927 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -89,7 +89,8 @@ static void virtio_gpu_set_config(VirtIODevice *vdev, const 
uint8_t *config)
 }
 }
 
-static uint64_t virtio_gpu_get_features(VirtIODevice *vdev, uint64_t features)
+static uint64_t virtio_gpu_get_features(VirtIODevice *vdev, uint64_t features,
+Error **errp)
 {
 return features;
 }
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 7f5b8d6..7b25d27 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -166,7 +166,8 @@ static void virtio_input_set_config(VirtIODevice *vdev,
 virtio_notify_config(vdev);
 }
 
-static uint64_t virtio_input_get_features(VirtIODevice *vdev, uint64_t f)
+static uint64_t virtio_input_get_features(VirtIODevice *vdev, uint64_t f,
+  Error **errp)
 {
 return f;
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e203058..1510839 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -446,7 +446,8 @@ static void virtio_net_set_queues(VirtIONet *n)
 
 static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue);
 
-static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features)
+static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
+Error **errp)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
 NetClientState *nc = qemu_get_queue(n->nic);
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 52549f8..a69918b 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -153,7 +153,8 @@ static void vhost_scsi_stop(VHostSCSI *s)
 }
 
 static uint64_t vhost_scsi_get_features(VirtIODevice *vdev,
-uint64_t features)
+uint64_t
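
The calling convention this patch introduces (a get_features callback
that may report failure through an Error ** out-parameter) can be
sketched outside QEMU roughly as follows. Everything here is a
simplified assumption for illustration: the Error struct,
sketch_error_set(), struct sketch_dev and sketch_plug() are hypothetical
stand-ins, not QEMU's Error API or its virtio bus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, much-simplified stand-in for an error object. */
typedef struct Error {
    char msg[96];
} Error;

/* Hypothetical device with the one knob that matters for the sketch. */
struct sketch_dev {
    bool scsi;
};

#define SKETCH_F_VERSION_1 32   /* VIRTIO_F_VERSION_1 in the virtio spec */

/* Store a message for the caller, if the caller asked for one. */
static void sketch_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;
    }
    err = malloc(sizeof(*err));
    if (err == NULL) {
        abort();                /* keep the sketch simple on OOM */
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* A callback in the new style: may fail, and says why through errp. */
static uint64_t sketch_get_features(struct sketch_dev *dev,
                                    uint64_t features, Error **errp)
{
    if (((features >> SKETCH_F_VERSION_1) & 1) && dev->scsi) {
        sketch_error_set(errp, "scsi=on conflicts with virtio 1.0");
        return 0;
    }
    return features;
}

/* A caller that checks the error before trusting the returned bits. */
static bool sketch_plug(struct sketch_dev *dev, uint64_t host_features,
                        uint64_t *result, Error **errp)
{
    Error *local_err = NULL;
    uint64_t features;

    assert(dev != NULL && result != NULL);
    features = sketch_get_features(dev, host_features, &local_err);
    if (local_err != NULL) {
        if (errp != NULL) {
            *errp = local_err;  /* hand the error to our caller */
        } else {
            free(local_err);
        }
        return false;
    }
    *result = features;
    return true;
}
```

The return value alone cannot signal failure, since 0 is a valid feature
set; that is presumably why the error travels through a separate
out-parameter.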

[Qemu-block] [PULL 04/10] virtio: set any_layout in virtio core

2015-07-28 Thread Michael S. Tsirkin

Exceptions:
- virtio-blk
- compat machine types

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 include/hw/compat.h| 22 +-
 include/hw/virtio/virtio.h |  4 +++-
 hw/block/virtio-blk.c  |  1 +
 hw/net/virtio-net.c|  2 --
 hw/scsi/virtio-scsi.c  |  2 --
 5 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/hw/compat.h b/include/hw/compat.h
index 4a43466..94c8097 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -2,7 +2,27 @@
 #define HW_COMPAT_H
 
 #define HW_COMPAT_2_3 \
-/* empty */
+{\
+.driver   = "virtio-blk-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-balloon-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-serial-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-9p-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-rng-pci",\
+.property = "any_layout",\
+.value= "off",\
+},
 
 #define HW_COMPAT_2_2 \
 /* empty */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 0634c15..ff91711 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -218,7 +218,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 DEFINE_PROP_BIT64("event_idx", _state, _field,\
   VIRTIO_RING_F_EVENT_IDX, true), \
 DEFINE_PROP_BIT64("notify_on_empty", _state, _field,  \
-  VIRTIO_F_NOTIFY_ON_EMPTY, true)
+  VIRTIO_F_NOTIFY_ON_EMPTY, true), \
+DEFINE_PROP_BIT64("any_layout", _state, _field, \
+  VIRTIO_F_ANY_LAYOUT, true)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 6aefda4..015b9b5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -731,6 +731,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, 
uint64_t features)
 virtio_add_feature(features, VIRTIO_BLK_F_TOPOLOGY);
 virtio_add_feature(features, VIRTIO_BLK_F_BLK_SIZE);
 virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
+virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
 
 if (s->conf.config_wce) {
 virtio_add_feature(features, VIRTIO_BLK_F_CONFIG_WCE);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 304d3dd..e203058 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1777,8 +1777,6 @@ static void virtio_net_instance_init(Object *obj)
 }
 
 static Property virtio_net_properties[] = {
-DEFINE_PROP_BIT(any_layout, VirtIONet, host_features,
-VIRTIO_F_ANY_LAYOUT, true),
 DEFINE_PROP_BIT(csum, VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
 DEFINE_PROP_BIT(guest_csum, VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_CSUM, true),
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index f7d3c7c..d17698d 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -953,8 +953,6 @@ static Property virtio_scsi_properties[] = {
   0x),
 DEFINE_PROP_UINT32(cmd_per_lun, VirtIOSCSI, parent_obj.conf.cmd_per_lun,
   128),
-DEFINE_PROP_BIT(any_layout, VirtIOSCSI, host_features,
-  VIRTIO_F_ANY_LAYOUT, true),
 DEFINE_PROP_BIT(hotplug, VirtIOSCSI, host_features,
VIRTIO_SCSI_F_HOTPLUG, true),
 DEFINE_PROP_BIT(param_change, VirtIOSCSI, host_features,
-- 
MST
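
How the new core default and the compat entries interact can be sketched
as below. This is a simplified model, not QEMU's property machinery:
struct compat_prop and any_layout_default() are hypothetical names, and
the table only mirrors the HW_COMPAT_2_3 entries from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified form of a machine-type compat entry. */
struct compat_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/* Mirrors the HW_COMPAT_2_3 entries added by the patch. */
static const struct compat_prop compat_2_3[] = {
    { "virtio-blk-pci",     "any_layout", "off" },
    { "virtio-balloon-pci", "any_layout", "off" },
    { "virtio-serial-pci",  "any_layout", "off" },
    { "virtio-9p-pci",      "any_layout", "off" },
    { "virtio-rng-pci",     "any_layout", "off" },
};

/*
 * Effective any_layout default for a driver: the core default is "on",
 * and a matching compat entry for an older machine type overrides it.
 */
static bool any_layout_default(const char *driver,
                               const struct compat_prop *compat, size_t n)
{
    bool value = true;          /* new default set in the virtio core */

    assert(driver != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(compat[i].driver, driver) == 0 &&
            strcmp(compat[i].property, "any_layout") == 0) {
            value = strcmp(compat[i].value, "on") == 0;
        }
    }
    return value;
}
```

virtio-net and virtio-scsi do not appear in the table because, per the
diff, they already had a per-device any_layout property defaulting to
true, so older machine types see no change for them.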

[Qemu-block] [PULL 10/10] virtio: minor cleanup

2015-07-28 Thread Michael S. Tsirkin

There's no need for blk to set ANY_LAYOUT, it's
done by virtio core as necessary.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/block/virtio-blk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 44f9b8e..1556c9c 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -736,7 +736,6 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
 error_setg(errp, "Please set scsi=off for virtio-blk devices in order to use virtio 1.0");
 return 0;
 }
-virtio_add_feature(features, VIRTIO_F_ANY_LAYOUT);
 } else {
 virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
 virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
-- 
MST

Re: [Qemu-block] [PATCH V4 3/3] virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device

2015-07-27 Thread Michael S. Tsirkin

On Mon, Jul 27, 2015 at 12:30:19PM +0200, Paolo Bonzini wrote:
 
 
 On 27/07/2015 11:49, Jason Wang wrote:
  So this patch only clears VIRTIO_F_ANY_LAYOUT for legacy devices.
  
  Cc: Stefan Hajnoczi stefa...@redhat.com
  Cc: Kevin Wolf kw...@redhat.com
  Cc: qemu-block@nongnu.org
  Signed-off-by: Jason Wang jasow...@redhat.com
  ---
   hw/block/virtio-blk.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
  index 9acbc3a..1d3f26c 100644
  --- a/hw/block/virtio-blk.c
  +++ b/hw/block/virtio-blk.c
  @@ -731,7 +731,6 @@ static uint64_t virtio_blk_get_features(VirtIODevice 
  *vdev, uint64_t features,
   virtio_add_feature(features, VIRTIO_BLK_F_GEOMETRY);
   virtio_add_feature(features, VIRTIO_BLK_F_TOPOLOGY);
   virtio_add_feature(features, VIRTIO_BLK_F_BLK_SIZE);
  -virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
   if (__virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
   if (s->conf.scsi) {
   error_setg(errp, "Virtio 1.0 does not support scsi passthrough!");
  @@ -739,6 +738,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice 
  *vdev, uint64_t features,
   }
   virtio_add_feature(features, VIRTIO_F_ANY_LAYOUT);
   } else {
  +virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
   virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
   }
 
 This patch is unnecessary, since the feature is added back below under
 if (__virtio_has_feature(features, VIRTIO_F_VERSION_1)).
 
 Paolo

It's needed so we can apply
virtio: set any_layout in virtio core

Re: [Qemu-block] [PATCH V4 3/3] virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device

2015-07-27 Thread Michael S. Tsirkin

On Mon, Jul 27, 2015 at 03:28:51PM +0200, Cornelia Huck wrote:
 On Mon, 27 Jul 2015 14:22:37 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Mon, Jul 27, 2015 at 12:30:19PM +0200, Paolo Bonzini wrote:
   
   
   On 27/07/2015 11:49, Jason Wang wrote:
So this patch only clears VIRTIO_F_ANY_LAYOUT for legacy devices.

Cc: Stefan Hajnoczi stefa...@redhat.com
Cc: Kevin Wolf kw...@redhat.com
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang jasow...@redhat.com
---
 hw/block/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 9acbc3a..1d3f26c 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -731,7 +731,6 @@ static uint64_t 
virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
 virtio_add_feature(features, VIRTIO_BLK_F_GEOMETRY);
 virtio_add_feature(features, VIRTIO_BLK_F_TOPOLOGY);
 virtio_add_feature(features, VIRTIO_BLK_F_BLK_SIZE);
-virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
 if (__virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
 if (s->conf.scsi) {
 error_setg(errp, "Virtio 1.0 does not support scsi passthrough!");
@@ -739,6 +738,7 @@ static uint64_t 
virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
 }
 virtio_add_feature(features, VIRTIO_F_ANY_LAYOUT);
 } else {
+virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
 virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
 }
   
   This patch is unnecessary, since the feature is added back below under
   if (__virtio_has_feature(features, VIRTIO_F_VERSION_1)).
   
   Paolo
  
  It's needed so we can apply
  virtio: set any_layout in virtio core
 
 So what's the plan on all those virtio feature patches? It's hard to
 keep track of what is based upon what, and what the end result looks
 like.

I pushed everything out to my pci branch, pls take a look.
This is what I have:

b787b35 acpi: fix pvpanic device is not shown in ui
49009db hw/acpi/ich9: clean up stale comment about KVM not supporting SMM
e513e9c hw/acpi/ich9: clear smi_en on reset
c9b11f9 virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device
efb8206 virtio-blk: fail get_features when both scsi and 1.0 were set
9d5b731 virtio: get_features() can fail
2746269 virtio-pci: fix memory MR cleanup for modern
0a5 virtio: set any_layout in virtio core
cd4bfbb virtio-9p: fix any_layout
7882080 virtio-serial: fix ANY_LAYOUT
5f45607 virtio: hide legacy features from modern guests


 I don't have a good feeling about doing this that late in the 2.4
 cycle.

Well there will always be bugs. Given modern is disabled by default,
even if more bugs surface after release it's not the end of the
world.

-- 
MST

Re: [Qemu-block] [PATCH RFC] virtio: set any_layout in virtio core

2015-07-23 Thread Michael S. Tsirkin

On Thu, Jul 23, 2015 at 04:14:36PM +0800, Jason Wang wrote:
 
 
 On 07/22/2015 05:36 PM, Michael S. Tsirkin wrote:
  Virtio 1 requires this, 
 
 I think you mean transitional not virtio 1?
 
  and all devices are clean by now,
  so let's do it!
 
  Exceptions:
  - virtio-blk
  - compat machine types
 
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  ---
 
  Untested - consider this pseudo-code - it just seems easier to write it
  in C than try to explain it.
 
   include/hw/compat.h| 22 +-
   include/hw/virtio/virtio.h |  4 +++-
   hw/block/virtio-blk.c  |  1 +
   hw/net/virtio-net.c|  2 --
   hw/scsi/virtio-scsi.c  |  2 --
   5 files changed, 25 insertions(+), 6 deletions(-)
 
  diff --git a/include/hw/compat.h b/include/hw/compat.h
  index 4a43466..94c8097 100644
  --- a/include/hw/compat.h
  +++ b/include/hw/compat.h
  @@ -2,7 +2,27 @@
   #define HW_COMPAT_H
   
   #define HW_COMPAT_2_3 \
  -/* empty */
  +{\
  +.driver   = virtio-blk-pci,\
  +.property = any_layout,\
  +.value= off,\
  +},{\
  +.driver   = virtio-balloon-pci,\
  +.property = any_layout,\
  +.value= off,\
  +},{\
  +.driver   = virtio-serial-pci,\
  +.property = any_layout,\
  +.value= off,\
 
 In send_control_msg() it has
 
 ...
 if (!virtqueue_pop(vq, &elem)) {
 return 0;
 }
 
 memcpy(elem.in_sg[0].iov_base, buf, len);
 ...
 
 So looks like serial is not ready for any layout.
 
  +},{\
  +.driver   = virtio-9p-pci,\
  +.property = any_layout,\
  +.value= off,\
 
 In handle_9p_output() it has
 
 ...
 BUG_ON(pdu->elem.out_sg[0].iov_len < 7);
 ...
 
 So looks like 9p does not support any layout at least.

I guess we could add code to disable virtio 1 for these two.
But it seems easier to fix them.
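
For reference, the kind of fix meant here is to stop assuming that the
first descriptor is large enough and instead walk the whole
scatter-gather list. A minimal sketch, assuming POSIX struct iovec;
sg_copy_from_buf() is a hypothetical name and not the helper QEMU
actually uses:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to len bytes from buf into the scatter-gather list, spilling
 * into later elements when an earlier one is too small. Returns the
 * number of bytes actually copied, which is less than len if the list
 * is too short overall.
 */
static size_t sg_copy_from_buf(const struct iovec *sg, unsigned int sg_num,
                               const void *buf, size_t len)
{
    const unsigned char *src = buf;
    size_t done = 0;

    assert(sg != NULL || sg_num == 0);
    for (unsigned int i = 0; i < sg_num && done < len; i++) {
        size_t chunk = sg[i].iov_len;

        if (chunk > len - done) {
            chunk = len - done;
        }
        if (chunk == 0) {
            continue;           /* skip empty elements */
        }
        memcpy(sg[i].iov_base, src + done, chunk);
        done += chunk;
    }
    return done;
}
```

With ANY_LAYOUT the driver is free to split a message across
descriptors however it likes, so a single memcpy() into in_sg[0], or a
BUG_ON() on the length of out_sg[0], bakes in a layout assumption the
feature is meant to remove.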




  +},{\
  +.driver   = virtio-rng-pci,\
  +.property = any_layout,\
  +.value= off,\
  +},
   
   #define HW_COMPAT_2_2 \
   /* empty */
  diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
  index 473fb75..fbb3c06 100644
  --- a/include/hw/virtio/virtio.h
  +++ b/include/hw/virtio/virtio.h
  @@ -214,7 +214,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
   DEFINE_PROP_BIT64(event_idx, _state, _field,\
 VIRTIO_RING_F_EVENT_IDX, true), \
   DEFINE_PROP_BIT64(notify_on_empty, _state, _field,  \
  -  VIRTIO_F_NOTIFY_ON_EMPTY, true)
  +  VIRTIO_F_NOTIFY_ON_EMPTY, true) \
  +DEFINE_PROP_BIT64(any_layout, _state, _field,  \
  +  VIRTIO_F_ANY_LAYOUT, true)
   
   hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
   hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
  diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
  index 6aefda4..015b9b5 100644
  --- a/hw/block/virtio-blk.c
  +++ b/hw/block/virtio-blk.c
  @@ -731,6 +731,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice 
  *vdev, uint64_t features)
   virtio_add_feature(features, VIRTIO_BLK_F_TOPOLOGY);
   virtio_add_feature(features, VIRTIO_BLK_F_BLK_SIZE);
   virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
  +virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
   
   if (s-conf.config_wce) {
   virtio_add_feature(features, VIRTIO_BLK_F_CONFIG_WCE);
  diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
  index 304d3dd..e203058 100644
  --- a/hw/net/virtio-net.c
  +++ b/hw/net/virtio-net.c
  @@ -1777,8 +1777,6 @@ static void virtio_net_instance_init(Object *obj)
   }
   
   static Property virtio_net_properties[] = {
  -DEFINE_PROP_BIT(any_layout, VirtIONet, host_features,
  -VIRTIO_F_ANY_LAYOUT, true),
   DEFINE_PROP_BIT(csum, VirtIONet, host_features, VIRTIO_NET_F_CSUM, 
  true),
   DEFINE_PROP_BIT(guest_csum, VirtIONet, host_features,
   VIRTIO_NET_F_GUEST_CSUM, true),
  diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
  index f7d3c7c..d17698d 100644
  --- a/hw/scsi/virtio-scsi.c
  +++ b/hw/scsi/virtio-scsi.c
  @@ -953,8 +953,6 @@ static Property virtio_scsi_properties[] = {
 0x),
   DEFINE_PROP_UINT32(cmd_per_lun, VirtIOSCSI, 
  parent_obj.conf.cmd_per_lun,
 128),
  -DEFINE_PROP_BIT(any_layout, VirtIOSCSI, host_features,
  -  VIRTIO_F_ANY_LAYOUT, true),
   DEFINE_PROP_BIT(hotplug, VirtIOSCSI, host_features,
  VIRTIO_SCSI_F_HOTPLUG, true),
   DEFINE_PROP_BIT(param_change, VirtIOSCSI, host_features,

[Qemu-block] [PATCH RFC] virtio: set any_layout in virtio core

2015-07-22 Thread Michael S. Tsirkin

Virtio 1 requires this, and all devices are clean by now,
so let's do it!

Exceptions:
- virtio-blk
- compat machine types

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---

Untested - consider this pseudo-code - it just seems easier to write it
in C than try to explain it.

 include/hw/compat.h| 22 +-
 include/hw/virtio/virtio.h |  4 +++-
 hw/block/virtio-blk.c  |  1 +
 hw/net/virtio-net.c|  2 --
 hw/scsi/virtio-scsi.c  |  2 --
 5 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/hw/compat.h b/include/hw/compat.h
index 4a43466..94c8097 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -2,7 +2,27 @@
 #define HW_COMPAT_H
 
 #define HW_COMPAT_2_3 \
-/* empty */
+{\
+.driver   = "virtio-blk-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-balloon-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-serial-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-9p-pci",\
+.property = "any_layout",\
+.value= "off",\
+},{\
+.driver   = "virtio-rng-pci",\
+.property = "any_layout",\
+.value= "off",\
+},
 
 #define HW_COMPAT_2_2 \
 /* empty */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 473fb75..fbb3c06 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -214,7 +214,9 @@ typedef struct VirtIORNGConf VirtIORNGConf;
 DEFINE_PROP_BIT64("event_idx", _state, _field,\
   VIRTIO_RING_F_EVENT_IDX, true), \
 DEFINE_PROP_BIT64("notify_on_empty", _state, _field,  \
-  VIRTIO_F_NOTIFY_ON_EMPTY, true)
+  VIRTIO_F_NOTIFY_ON_EMPTY, true), \
+DEFINE_PROP_BIT64("any_layout", _state, _field,  \
+  VIRTIO_F_ANY_LAYOUT, true)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 6aefda4..015b9b5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -731,6 +731,7 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, 
uint64_t features)
 virtio_add_feature(features, VIRTIO_BLK_F_TOPOLOGY);
 virtio_add_feature(features, VIRTIO_BLK_F_BLK_SIZE);
 virtio_add_feature(features, VIRTIO_BLK_F_SCSI);
+virtio_clear_feature(features, VIRTIO_F_ANY_LAYOUT);
 
 if (s->conf.config_wce) {
 virtio_add_feature(features, VIRTIO_BLK_F_CONFIG_WCE);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 304d3dd..e203058 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1777,8 +1777,6 @@ static void virtio_net_instance_init(Object *obj)
 }
 
 static Property virtio_net_properties[] = {
-DEFINE_PROP_BIT(any_layout, VirtIONet, host_features,
-VIRTIO_F_ANY_LAYOUT, true),
 DEFINE_PROP_BIT(csum, VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
 DEFINE_PROP_BIT(guest_csum, VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_CSUM, true),
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index f7d3c7c..d17698d 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -953,8 +953,6 @@ static Property virtio_scsi_properties[] = {
   0x),
 DEFINE_PROP_UINT32(cmd_per_lun, VirtIOSCSI, parent_obj.conf.cmd_per_lun,
   128),
-DEFINE_PROP_BIT(any_layout, VirtIOSCSI, host_features,
-  VIRTIO_F_ANY_LAYOUT, true),
 DEFINE_PROP_BIT(hotplug, VirtIOSCSI, host_features,
VIRTIO_SCSI_F_HOTPLUG, true),
 DEFINE_PROP_BIT(param_change, VirtIOSCSI, host_features,
-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 01:46:38PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 13:59:00 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Tue, Jul 14, 2015 at 07:43:44PM +0200, Cornelia Huck wrote:
Yes, and that's because as written, transitional devices must set
ANY_LAYOUT, but that's incompatible with scsi.
   
   Hm, I had a patch before that dynamically allowed different feature
   sets for legacy or modern, not only a subset. Probably won't apply
   anymore, but I'd like to be able to do the following:
   
   - driver reads features without negotiating a revision: driver is
     legacy, offer legacy bits
   - driver negotiates revision 0: ditto
   - driver negotiates revision >= 1: driver is modern, offer modern bits
   
   That way we could offer SCSI and !ANY_LAYOUT (if scsi is enabled) in the
   first two cases, and a new qemu could still offer scsi to old guests.
   
   Would it be worth pursuing that idea?
  
  Frankly, I don't think so: I don't see why it makes sense
  to expose more features on the legacy interface than
  on the modern one. Imagine updating drivers to fix a bug
  and losing some features. How does this make sense?
 
 I don't think one should be a strict subset of the other. But I think
 we don't want to withdraw features from legacy guests on qemu updates
 either?

Absolutely. For now one has to enable the modern interface
explicitly. Around 2.5 we might switch that around, we'll
need to think hard about compatibility at that point.
In any case, we must definitely keep the old capability for old machine
types.

  
  I think the virtio TC's assumption was that the scsi passthrough was a
  bad idea, so in QEMU we only keep it around for legacy devices to avoid
  regressions.
 
 I'm not opposing this :)
 
  
  If you disagree and think transitional devices need the SCSI feature,
  either try to convince pbonzini or rewrite the spec yourself
  to support it in the virtio 1 mode.
 
 This seems to boil down to the different meaning of transitional for
 ccw and pci, see the other thread.

Before the revision is negotiated, ccw won't know whether
it's a legacy driver - is that the difference?
Fine, but revision is negotiated way before features are
probed so why does it make a practical difference?

-- 
MST

Re: [Qemu-block] [PATCH V2 3/5] virtio-blk: disable scsi passthrough by default

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 01:29:59PM +0800, Jason Wang wrote:
 Disable scsi passthrough by default since it was incompatible with
 virtio 1.0. For legacy machine types, keep this on by default.
 
 Cc: Stefan Hajnoczi stefa...@redhat.com
 Cc: Kevin Wolf kw...@redhat.com
 Cc: qemu-block@nongnu.org
 Signed-off-by: Jason Wang jasow...@redhat.com

Seems risky for 2.4. Modern is off by default for now. Can't we limit
the change to when modern is enabled?

I suggested changing this from bool to on/off/auto, and
make auto mean !modern.
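
A minimal sketch of what on/off/auto could look like, purely to make the
suggestion concrete. The enum and function names are hypothetical and
this is not how the property was eventually implemented:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for the scsi property. */
enum scsi_setting {
    SCSI_OFF,
    SCSI_ON,
    SCSI_AUTO,
};

/*
 * Resolve the user-visible setting to an effective value. "auto" means
 * "on unless the modern interface is enabled".
 */
static bool scsi_effective(enum scsi_setting setting, bool modern)
{
    switch (setting) {
    case SCSI_ON:
        return true;
    case SCSI_OFF:
        return false;
    case SCSI_AUTO:
        return !modern;
    }
    assert(!"invalid scsi setting");
    return false;
}
```

With this, existing setups (modern off) keep scsi on under "auto", and
only an explicit scsi=on combined with modern remains as a conflicting
configuration that has to be rejected elsewhere.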


 ---
  hw/block/virtio-blk.c | 2 +-
  include/hw/compat.h   | 6 +-
  2 files changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
 index 761d763..362fe53 100644
 --- a/hw/block/virtio-blk.c
 +++ b/hw/block/virtio-blk.c
 @@ -964,7 +964,7 @@ static Property virtio_blk_properties[] = {
  DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
  DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
  #ifdef __linux__
 -DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
 +DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
  #endif
  DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
  true),
 diff --git a/include/hw/compat.h b/include/hw/compat.h
 index 4a43466..56039d8 100644
 --- a/include/hw/compat.h
 +++ b/include/hw/compat.h
 @@ -2,7 +2,11 @@
  #define HW_COMPAT_H
  
  #define HW_COMPAT_2_3 \
 -/* empty */
 +{\
 +.driver   = "virtio-blk-pci",\
 +.property = "scsi",\
 +.value= "on",\
 +},
  
  #define HW_COMPAT_2_2 \
  /* empty */
 -- 
 2.1.4

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 05:38:53PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 17:39:18 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 04:30:51PM +0200, Cornelia Huck wrote:
   On Wed, 15 Jul 2015 17:11:57 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
Fine, but revision is negotiated way before features are
probed so why does it make a practical difference?
   
   Legacy drivers (that don't know about the set-revision command) will
   read features without revision negotiation - we need to offer them the
   legacy feature set.
  
  Right. So simply do if (revision < 1) return features & 0x
  and that will do this, will it not?
 
 Not for bits that we want to offer for legacy but not for modern.

I don't think this selective offering works, at least for scsi.
scsi is a backend feature: if you connect a modern device
in front of it, the device simply does not work.
It therefore makes no sense to attach a transitional device
to such a backend.
   
   My point is that we're losing legacy features with that approach, and
   it would not be possible to offer them to legacy guests with newer
   qemus (at least with ccw).
  
  What's wrong with adding a disable-modern flag, like pci has?
  User can set that to get a legacy device.
 
 The whole idea behind the revision-stuff was that we don't need
 something like disable-modern. If the device is able to figure out on
 its own if it is to act as a modern or a legacy device, why require
 user intervention?

It's about compatibility, e.g. being able to test legacy mode
in transitional drivers in guests.
Consider also bugs, e.g. the fact that linux guests lack WCE
in modern mode ATM.

  
   What about the other way around (i.e. scsi is configured, therefore the
   device is legacy-only)? We'd only retain the scsi bit if it is actually
   wanted by the user's configuration. I would need to enforce a max
   revision of 0 for such a device in ccw, and pci could disable modern
   for it.
  
  Will have to think about it.
  But I think a flag to disable/enable modern is useful in any case,
  and it seems sufficient.
 
 I don't like the idea of disabling modern or legacy for ccw, where the
 differences between both are very minor.
 
 I also don't think requiring the user to specify a new flag on upgrade
 just to present the same features as before is a good idea: it is
 something that is easily missed and may lead to much headscratching.

And doing this on a driver upgrade won't?  As I said, if you believe
this feature has value, argue that we shouldn't drop scsi off in virtio
1.0 then.

-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 03:40:22PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 16:16:07 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 02:43:51PM +0200, Cornelia Huck wrote:
   On Wed, 15 Jul 2015 15:01:01 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Wed, Jul 15, 2015 at 01:46:38PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 13:59:00 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Tue, Jul 14, 2015 at 07:43:44PM +0200, Cornelia Huck wrote:
Yes, and that's because as written, transitional devices must set
ANY_LAYOUT, but that's incompatible with scsi.
   
   Hm, I had a patch before that dynamically allowed different feature
   sets for legacy or modern, not only a subset. Probably won't apply
   anymore, but I'd like to be able to do the following:
   
   - driver reads features without negotiating a revision: driver is
     legacy, offer legacy bits
   - driver negotiates revision 0: ditto
   - driver negotiates revision >= 1: driver is modern, offer modern bits
   
   That way we could offer SCSI and !ANY_LAYOUT (if scsi is enabled) in the
   first two cases, and a new qemu could still offer scsi to old guests.
   
   Would it be worth pursuing that idea?
  
  Frankly, I don't think so: I don't see why it makes sense
  to expose more features on the legacy interface than
  on the modern one. Imagine updating drivers to fix a bug
  and losing some features. How does this make sense?
 
 I don't think one should be a strict subset of the other. But I think
 we don't want to withdraw features from legacy guests on qemu updates
 either?

Absolutely. For now one has to enable the modern interface
explicitly. Around 2.5 we might switch that around, we'll
need to think hard about compatibility at that point.
In any case, we must definitely keep the old capability for old machine
types.
   
   ccw only offers revision 0 (legacy) in 2.4. I plan to introduce
   revision 1 in 2.5 and force revision to 0 for 2.4 compatibility (as 2.4
   is the first versioned ccw machine).
  
  I was talking about pci here actually.
 
 Sure, and these are my plans for ccw ;)
 
  

  
  I think the virtio TC's assumption was that the scsi passthrough was a
  bad idea, so in QEMU we only keep it around for legacy devices to avoid
  regressions.
  regressions.
 
 I'm not opposing this :)
 
  
  If you disagree and think transitional devices need the SCSI feature,
  either try to convince pbonzini or rewrite the spec yourself
  to support it in the virtio 1 mode.
 
 This seems to boil down to the different meaning of transitional for
 ccw and pci, see the other thread.

Before the revision is negotiated, ccw won't know whether
it's a legacy driver - is that the difference?
   
   I'd say it doesn't know whether the driver intends to use the modern
   interface.
  
  That's also the case for pci.
 
 But does pci know the moment it first tries to get the device's
 features? And does pci assume modern as default for transitional
 devices?

I don't think it does.

  
Fine, but revision is negotiated way before features are
probed so why does it make a practical difference?
   
   Legacy drivers (that don't know about the set-revision command) will
   read features without revision negotiation - we need to offer them the
   legacy feature set.
  
  Right. So simply do if (revision < 1) return features & 0x
  and that will do this, will it not?
 
 Not for bits that we want to offer for legacy but not for modern.

I don't think this selective offering works, at least for scsi.
scsi is a backend feature: if you connect a modern device
in front of it, the device simply does not work.
It therefore makes no sense to attach a transitional device
to such a backend.

-- 
MST

Re: [Qemu-block] [PATCH V2 3/5] virtio-blk: disable scsi passthrough by default

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 02:47:24PM +0200, Paolo Bonzini wrote:

 On 15/07/2015 14:21, Michael S. Tsirkin wrote:
   Disable scsi passthrough by default since it was incompatible with
   virtio 1.0. For legacy machine types, keep this on by default.

   Cc: Stefan Hajnoczi stefa...@redhat.com
   Cc: Kevin Wolf kw...@redhat.com
   Cc: qemu-block@nongnu.org
   Signed-off-by: Jason Wang jasow...@redhat.com
  Seems risky for 2.4.  modern is off by default for now. Can't we limit
  the change to when modern is enabled?

 That would have the effect of disabling a feature when you turn on modern.

What's wrong with that?

  I suggested changing this from bool to on/off/auto, and
  make auto mean !modern.

 No, please do it like Jason did.  The SCSI feature effectively had to be
 enabled explicitly already, the requests were marked as unsupported.

 Paolo

I didn't know. How is it enabled?

-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 04:30:51PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 17:11:57 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  Fine, but revision is negotiated way before features are
  probed so why does it make a practical difference?
 
 Legacy drivers (that don't know about the set-revision command) will
 read features without revision negotiation - we need to offer them the
 legacy feature set.

Right. So simply do if (revision < 1) return features & 0x
and that will do this, will it not?
   
   Not for bits that we want to offer for legacy but not for modern.
  
  I don't think this selective offering works at least for scsi.
  scsi is a backend feature, if you connect a modern device
  in front the device simply does not work.
  It therefore makes no sense to attach a transitional device
  to such a backend.
 
 My point is that we're losing legacy features with that approach, and
 it would not be possible to offer them to legacy guests with newer
 qemus (at least with ccw).

What's wrong with adding a disable-modern flag, like pci has?
User can set that to get a legacy device.

 What about the other way around (i.e. scsi is configured, therefore the
 device is legacy-only)? We'd only retain the scsi bit if it is actually
 wanted by the user's configuration. I would need to enforce a max
 revision of 0 for such a device in ccw, and pci could disable modern
 for it.

Will have to think about it.
But I think a flag to disable/enable modern is useful in any case,
and it seems sufficient.

-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-15 Thread Michael S. Tsirkin

On Wed, Jul 15, 2015 at 02:43:51PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 15:01:01 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 01:46:38PM +0200, Cornelia Huck wrote:
   On Wed, 15 Jul 2015 13:59:00 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Tue, Jul 14, 2015 at 07:43:44PM +0200, Cornelia Huck wrote:
  Yes, and that's because as written, transitional devices must set
  ANY_LAYOUT, but that's incompatible with scsi.
 
 Hm, I had a patch before that dynamically allowed different feature
 sets for legacy or modern, not only a subset. Probably won't apply
 anymore, but I'd like to be able to do the following:
 
 - driver reads features without negotiating a revision: driver is
   legacy, offer legacy bits
 - driver negotiates revision 0: ditto
 - driver negotiates revision >= 1: driver is modern, offer modern bits
 
 That way we could offer SCSI and !ANY_LAYOUT (if scsi is enabled) in the
 first two cases, and a new qemu could still offer scsi to old guests.
 
 Would it be worth pursuing that idea?

Frankly, I don't think so: I don't see why it makes sense
to expose more features on the legacy interface than
on the modern one. Imagine updating drivers to fix a bug
and losing some features. How does this make sense?
   
   I don't think one should be a strict subset of the other. But I think
   we don't want to withdraw features from legacy guests on qemu updates
   either?
  
  Absolutely. For now one has to enable the modern interface
  explicitly. Around 2.5 we might switch that around, we'll
  need to think hard about compatibility at that point.
  In any case, we must definitely keep the old capability for old machine
  types.
 
 ccw only offers revision 0 (legacy) in 2.4. I plan to introduce
 revision 1 in 2.5 and force revision to 0 for 2.4 compatibility (as 2.4
 is the first versioned ccw machine).

I was talking about pci here actually.

  

I think the virtio TC's assumption was that the scsi passthrough was a
bad idea, so in QEMU we only keep it around for legacy devices to avoid
regressions.
   
   I'm not opposing this :)
   

If you disagree and think transitional devices need the SCSI feature,
either try to convince pbonzini or rewrite the spec yourself
to support it in the virtio 1 mode.
   
   This seems to boil down to the different meaning of transitional for
   ccw and pci, see the other thread.
  
  Before the revision is negotiated, ccw won't know whether
  it's a legacy driver - is that the difference?
 
 I'd say it doesn't know whether the driver intends to use the modern
 interface.

That's also the case for pci.

  Fine, but revision is negotiated way before features are
  probed so why does it make a practical difference?
 
 Legacy drivers (that don't know about the set-revision command) will
 read features without revision negotiation - we need to offer them the
 legacy feature set.

Right. So simply do if (revision < 1) return features & 0x
and that will do this, will it not?
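
For illustration, a minimal sketch of that check. The function name and
the mask constant are hypothetical, not QEMU code; the 32-bit mask
assumes that legacy drivers only ever see feature bits 0..31.

```c
#include <stdint.h>

/* Assumption: a legacy driver only sees feature bits 0..31. */
#define LEGACY_FEATURE_MASK 0xffffffffULL

/* Hypothetical helper: return the feature bits to offer for a given
 * negotiated revision. Revision 0, or no revision negotiation at all,
 * is treated as legacy. */
static uint64_t offered_features(uint64_t features, unsigned revision)
{
    if (revision < 1) {
        return features & LEGACY_FEATURE_MASK;
    }
    return features;
}
```

Note that a plain mask can only drop bits; it cannot offer a bit to
legacy drivers that is withheld from modern ones.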

-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-13 Thread Michael S. Tsirkin

On Mon, Jul 13, 2015 at 05:00:51PM +0800, Jason Wang wrote:
 
 
 On 07/13/2015 03:46 PM, Michael S. Tsirkin wrote:
  On Mon, Jul 13, 2015 at 01:46:48PM +0800, Jason Wang wrote:
  VIRTIO_BLK_F_SCSI was no longer supported in 1.0. So disable it.
 
  Cc: Stefan Hajnoczi stefa...@redhat.com
  Cc: Kevin Wolf kw...@redhat.com
  Cc: qemu-block@nongnu.org
  Signed-off-by: Jason Wang jasow...@redhat.com
  Interesting, I noticed we have a field scsi - see
  commit 1ba1f2e319afdcb485963cd3f426fdffd1b725f2
  Author: Paolo Bonzini pbonz...@redhat.com
  Date:   Fri Dec 23 15:39:03 2011 +0100
 
  virtio-blk: refuse SG_IO requests with scsi=off
 
  but it doesn't seem to be propagated to guest features in
  any way.
 
  Maybe we should fix that, making that flag AutoOnOff?
 
 Looks ok but auto may need some compat work since default is true.

Right. Auto would then mean !modern.

  Then, if user explicitly requested scsi=on with a modern
  interface then we can error out cleanly.
 
  Given scsi flag is currently ignored, I think
  this can be a patch on top.
 
 Looks like virtio_blk_handle_scsi_req() check this:
 
 if (!blk->conf.scsi) {
     status = VIRTIO_BLK_S_UNSUPP;
     goto fail;
 }
 
 
  ---
   hw/block/virtio-blk.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)
 
  diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
  index 6aefda4..f30ad25 100644
  --- a/hw/block/virtio-blk.c
  +++ b/hw/block/virtio-blk.c
  @@ -730,7 +730,8 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features)
       virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
       virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
       virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
  -    virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
  +    if (!__virtio_has_feature(features, VIRTIO_F_VERSION_1))
  +        virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
   
       if (s->conf.config_wce) {
           virtio_add_feature(&features, VIRTIO_BLK_F_CONFIG_WCE);
  -- 
  2.1.4

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-13 Thread Michael S. Tsirkin

On Mon, Jul 13, 2015 at 02:30:24PM +0200, Cornelia Huck wrote:
 On Mon, 13 Jul 2015 15:22:52 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Mon, Jul 13, 2015 at 01:51:56PM +0200, Cornelia Huck wrote:
   On Mon, 13 Jul 2015 11:56:51 +0200
   Kevin Wolf kw...@redhat.com wrote:
   
Am 13.07.2015 um 11:00 hat Jason Wang geschrieben:
 
 
 On 07/13/2015 03:46 PM, Michael S. Tsirkin wrote:
  On Mon, Jul 13, 2015 at 01:46:48PM +0800, Jason Wang wrote:
  VIRTIO_BLK_F_SCSI was no longer supported in 1.0. So disable it.
 
  Cc: Stefan Hajnoczi stefa...@redhat.com
  Cc: Kevin Wolf kw...@redhat.com
  Cc: qemu-block@nongnu.org
  Signed-off-by: Jason Wang jasow...@redhat.com
  Interesting, I noticed we have a field scsi - see
  commit 1ba1f2e319afdcb485963cd3f426fdffd1b725f2
  Author: Paolo Bonzini pbonz...@redhat.com
  Date:   Fri Dec 23 15:39:03 2011 +0100
 
  virtio-blk: refuse SG_IO requests with scsi=off
 
  but it doesn't seem to be propagated to guest features in
  any way.
 
  Maybe we should fix that, making that flag AutoOnOff?
 
 Looks ok but auto may need some compat work since default is true.
 
  Then, if user explicitly requested scsi=on with a modern
  interface then we can error out cleanly.
 
  Given scsi flag is currently ignored, I think
  this can be a patch on top.
 
 Looks like virtio_blk_handle_scsi_req() check this:
 
 if (!blk->conf.scsi) {
     status = VIRTIO_BLK_S_UNSUPP;
     goto fail;
 }

So we should be checking the same condition for the feature flag and
error out in the init function if we have a VERSION_1 device and
blk->conf.scsi is set.
   
   Hm, I wonder how this plays with transports that want to make the
   virtio-1 vs. legacy decision post-init? For virtio-ccw, I basically
   only want to offer VERSION_1 if the driver negotiated revision >= 1.
   I'd need to check for !scsi as well before I can add this feature bit
   then? Have the init function set a blocker for VERSION_1 so that the
   driver may only negotiate revision 0?
  
  
  We already handle this, do we not?
 (...)
  So guest that doesn't negotiate revision >= 1 never gets to see
  VIRTIO_F_VERSION_1.
 
 Not my question :) I was wondering about scsi vs. virtio-1 devices. And
 as I basically only want to make the decision on whether to offer
 VERSION_1 when the guest negotiated a revision, I cannot fence scsi
 during init, no?

No, I don't think there's a lot of value in offering scsi only to
old guests that don't negotiate revision >= 1.

If user asked for virtio 1 support then that by proxy implies scsi
passthrough does not work, and it won't work for legacy
guests too.


  
  Maybe we should go further and additionally clear all bits >= 32 if
  VIRTIO_F_VERSION_1 is clear, but that can wait
  and we have no bits like that in 2.4.
  
 Spec says bits >= 32 are only valid if we have VERSION_1, doesn't it?
 Sounds sensible.

Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-13 Thread Michael S. Tsirkin

On Mon, Jul 13, 2015 at 03:20:59PM +0200, Cornelia Huck wrote:
 On Mon, 13 Jul 2015 15:36:00 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Mon, Jul 13, 2015 at 02:30:24PM +0200, Cornelia Huck wrote:
   On Mon, 13 Jul 2015 15:22:52 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Mon, Jul 13, 2015 at 01:51:56PM +0200, Cornelia Huck wrote:
 On Mon, 13 Jul 2015 11:56:51 +0200
 Kevin Wolf kw...@redhat.com wrote:
 
  Am 13.07.2015 um 11:00 hat Jason Wang geschrieben:
   
   
   On 07/13/2015 03:46 PM, Michael S. Tsirkin wrote:
On Mon, Jul 13, 2015 at 01:46:48PM +0800, Jason Wang wrote:
VIRTIO_BLK_F_SCSI was no longer supported in 1.0. So disable it.
   
Cc: Stefan Hajnoczi stefa...@redhat.com
Cc: Kevin Wolf kw...@redhat.com
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang jasow...@redhat.com
Interesting, I noticed we have a field scsi - see
commit 1ba1f2e319afdcb485963cd3f426fdffd1b725f2
Author: Paolo Bonzini pbonz...@redhat.com
Date:   Fri Dec 23 15:39:03 2011 +0100
   
virtio-blk: refuse SG_IO requests with scsi=off
   
but it doesn't seem to be propagated to guest features in
any way.
   
Maybe we should fix that, making that flag AutoOnOff?
   
   Looks ok but auto may need some compat work since default is true.
   
Then, if user explicitly requested scsi=on with a modern
interface then we can error out cleanly.
   
Given scsi flag is currently ignored, I think
this can be a patch on top.
   
   Looks like virtio_blk_handle_scsi_req() check this:
   
   if (!blk->conf.scsi) {
       status = VIRTIO_BLK_S_UNSUPP;
       goto fail;
   }
  
  So we should be checking the same condition for the feature flag and
  error out in the init function if we have a VERSION_1 device and
  blk->conf.scsi is set.
 
 Hm, I wonder how this plays with transports that want to make the
 virtio-1 vs. legacy decision post-init? For virtio-ccw, I basically
 only want to offer VERSION_1 if the driver negotiated revision >= 1.
 I'd need to check for !scsi as well before I can add this feature bit
 then? Have the init function set a blocker for VERSION_1 so that the
 driver may only negotiate revision 0?


We already handle this, do we not?
   (...)
So guest that doesn't negotiate revision >= 1 never gets to see
VIRTIO_F_VERSION_1.
   
   Not my question :) I was wondering about scsi vs. virtio-1 devices. And
   as I basically only want to make the decision on whether to offer
   VERSION_1 when the guest negotiated a revision, I cannot fence scsi
   during init, no?
  
  No, I don't think there's a lot of value in offering scsi only to
  old guests that don't negotiate revision >= 1.
  
  If user asked for virtio 1 support then that by proxy implies scsi
  passthrough does not work, and it won't work for legacy
  guests too.
 
 This would imply that any transitional device cannot offer scsi,
 doesn't it?

Yes, and that's because as written, transitional devices must set
ANY_LAYOUT, but that's incompatible with scsi.

 We have two layers interacting here: virtio-blk which may or may not
 offer scsi support, and the transport layer which may or may not offer
 VERSION_1 support. Failing scsi commands if VERSION_1 has been
 negotiated makes sense to me; but I don't want to disable scsi config a
 priori because the driver might negotiate VERSION_1. This would imply
 that virtio-blk over virtio-ccw would never offer scsi once we enable
 virtio-1 support, and it kind of defeats the purpose of a transitional
 device for me.
 
 (The other way round - fail negotiating revision 1 if the device was
 configured with scsi support - makes more sense to me.)
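
A minimal sketch of that "other way round" idea: refuse revision 1 when
the device was configured with scsi. The struct, its field names and the
function are hypothetical and only model the decision; they are not the
virtio-ccw code.

```c
#include <stdbool.h>

/* Hypothetical device state, for illustration only. */
struct blk_dev {
    bool scsi;          /* scsi passthrough configured by the user */
    unsigned revision;  /* negotiated transport revision, 0 = legacy */
};

/* Accept a requested revision unless it would put a scsi-enabled
 * device into virtio-1 mode. Returns 0 on success, -1 on rejection;
 * on rejection the previously negotiated revision is kept. */
static int set_revision(struct blk_dev *dev, unsigned requested)
{
    if (requested >= 1 && dev->scsi) {
        return -1;      /* device stays legacy-only */
    }
    dev->revision = requested;
    return 0;
}
```

With this shape the scsi setting is not touched at init time; only the
revision negotiation is constrained.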

Re: [Qemu-block] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-13 Thread Michael S. Tsirkin

On Mon, Jul 13, 2015 at 01:46:48PM +0800, Jason Wang wrote:
 VIRTIO_BLK_F_SCSI was no longer supported in 1.0. So disable it.
 
 Cc: Stefan Hajnoczi stefa...@redhat.com
 Cc: Kevin Wolf kw...@redhat.com
 Cc: qemu-block@nongnu.org
 Signed-off-by: Jason Wang jasow...@redhat.com

Interesting, I noticed we have a field scsi - see
commit 1ba1f2e319afdcb485963cd3f426fdffd1b725f2
Author: Paolo Bonzini pbonz...@redhat.com
Date:   Fri Dec 23 15:39:03 2011 +0100

virtio-blk: refuse SG_IO requests with scsi=off

but it doesn't seem to be propagated to guest features in
any way.

Maybe we should fix that, making that flag AutoOnOff?
Then, if user explicitly requested scsi=on with a modern
interface then we can error out cleanly.

Given scsi flag is currently ignored, I think
this can be a patch on top.
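
Roughly, the tri-state could resolve as sketched below, taking "auto" to
mean !modern as suggested elsewhere in this discussion. The enum and
function names are made up for illustration and are not QEMU's own
on/off/auto property type.

```c
#include <stdbool.h>

/* Illustrative tri-state; not QEMU's property type. */
typedef enum { SCSI_AUTO, SCSI_ON, SCSI_OFF } ScsiSetting;

/* Resolve the scsi setting against the interface type.
 * Returns 0 and stores the effective value in *enabled, or -1 if the
 * combination is rejected (explicit scsi=on with a modern interface). */
static int resolve_scsi(ScsiSetting setting, bool modern, bool *enabled)
{
    switch (setting) {
    case SCSI_ON:
        if (modern) {
            return -1;          /* error out cleanly */
        }
        *enabled = true;
        return 0;
    case SCSI_OFF:
        *enabled = false;
        return 0;
    case SCSI_AUTO:
    default:
        *enabled = !modern;     /* auto means !modern */
        return 0;
    }
}
```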

 ---
  hw/block/virtio-blk.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
 index 6aefda4..f30ad25 100644
 --- a/hw/block/virtio-blk.c
 +++ b/hw/block/virtio-blk.c
 @@ -730,7 +730,8 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features)
      virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
      virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
      virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
 -    virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
 +    if (!__virtio_has_feature(features, VIRTIO_F_VERSION_1))
 +        virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
  
      if (s->conf.config_wce) {
          virtio_add_feature(&features, VIRTIO_BLK_F_CONFIG_WCE);
 -- 
 2.1.4

[Qemu-block] [PULL v2 02/16] Revert "dataplane: allow virtio-1 devices"

2015-07-08 Thread Michael S. Tsirkin

From: Cornelia Huck <cornelia.h...@de.ibm.com>

This reverts commit f5a5628cf0b65b223fa0c9031714578dfac4cf04.

This was an old patch that had been already superseded by b0e5d90eb
("dataplane: endianness-aware accesses").

Signed-off-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 hw/virtio/dataplane/vring.c | 47 -
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index bed9b11..07fd69c 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -158,18 +158,15 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 }
 
 
-static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
+static int get_desc(Vring *vring, VirtQueueElement *elem,
                     struct vring_desc *desc)
 {
     unsigned *num;
     struct iovec *iov;
     hwaddr *addr;
     MemoryRegion *mr;
-    int is_write = virtio_tswap16(vdev, desc->flags) & VRING_DESC_F_WRITE;
-    uint32_t len = virtio_tswap32(vdev, desc->len);
-    uint64_t desc_addr = virtio_tswap64(vdev, desc->addr);
 
-    if (is_write) {
+    if (desc->flags & VRING_DESC_F_WRITE) {
         num = &elem->in_num;
         iov = &elem->in_sg[*num];
         addr = &elem->in_addr[*num];
@@ -193,17 +190,18 @@ static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
     }
 
     /* TODO handle non-contiguous memory across region boundaries */
-    iov->iov_base = vring_map(&mr, desc_addr, len, is_write);
+    iov->iov_base = vring_map(&mr, desc->addr, desc->len,
+                              desc->flags & VRING_DESC_F_WRITE);
     if (!iov->iov_base) {
         error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
-                     (uint64_t)desc_addr, len);
+                     (uint64_t)desc->addr, desc->len);
         return -EFAULT;
     }
 
     /* The MemoryRegion is looked up again and unref'ed later, leave the
      * ref in place.  */
-    iov->iov_len = len;
-    *addr = desc_addr;
+    iov->iov_len = desc->len;
+    *addr = desc->addr;
     *num += 1;
     return 0;
 }
@@ -225,23 +223,21 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
     struct vring_desc desc;
     unsigned int i = 0, count, found = 0;
     int ret;
-    uint32_t len = virtio_tswap32(vdev, indirect->len);
-    uint64_t addr = virtio_tswap64(vdev, indirect->addr);
 
     /* Sanity check */
-    if (unlikely(len % sizeof(desc))) {
+    if (unlikely(indirect->len % sizeof(desc))) {
         error_report("Invalid length in indirect descriptor: "
                      "len %#x not multiple of %#zx",
-                     len, sizeof(desc));
+                     indirect->len, sizeof(desc));
         vring->broken = true;
         return -EFAULT;
     }
 
-    count = len / sizeof(desc);
+    count = indirect->len / sizeof(desc);
     /* Buffers are chained via a 16 bit next field, so
      * we can have at most 2^16 of these. */
     if (unlikely(count > USHRT_MAX + 1)) {
-        error_report("Indirect buffer length too big: %d", len);
+        error_report("Indirect buffer length too big: %d", indirect->len);
         vring->broken = true;
         return -EFAULT;
     }
@@ -252,12 +248,12 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 
         /* Translate indirect descriptor */
         desc_ptr = vring_map(&mr,
-                             addr + found * sizeof(desc),
+                             indirect->addr + found * sizeof(desc),
                              sizeof(desc), false);
         if (!desc_ptr) {
             error_report("Failed to map indirect descriptor "
                          "addr %#" PRIx64 " len %zu",
-                         (uint64_t)addr + found * sizeof(desc),
+                         (uint64_t)indirect->addr + found * sizeof(desc),
                          sizeof(desc));
             vring->broken = true;
             return -EFAULT;
@@ -275,20 +271,19 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
             return -EFAULT;
         }
 
-        if (unlikely(virtio_tswap16(vdev, desc.flags)
-                     & VRING_DESC_F_INDIRECT)) {
+        if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
             error_report("Nested indirect descriptor");
             vring->broken = true;
             return -EFAULT;
         }
 
-        ret = get_desc(vdev, vring, elem, &desc);
+        ret = get_desc(vring, elem, &desc);
         if (ret < 0) {
             vring->broken |= (ret == -EFAULT);
             return ret;
         }
-        i = virtio_tswap16(vdev, desc.next);
-    } while (virtio_tswap16(vdev, desc.flags) & VRING_DESC_F_NEXT);
+        i = desc.next;
+    } while (desc.flags & VRING_DESC_F_NEXT);
     return 0;
 }
 
@@ -389,7 +384,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 /* Ensure descriptor is loaded

[Qemu-block] [PULL 02/13] Revert "dataplane: allow virtio-1 devices"

2015-07-07 Thread Michael S. Tsirkin

From: Cornelia Huck <cornelia.h...@de.ibm.com>

This reverts commit f5a5628cf0b65b223fa0c9031714578dfac4cf04.

This was an old patch that had been already superseded by b0e5d90eb
("dataplane: endianness-aware accesses").

Signed-off-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Stefan Hajnoczi <stefa...@redhat.com>
---
 hw/virtio/dataplane/vring.c | 47 -
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index bed9b11..07fd69c 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -158,18 +158,15 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 }
 
 
-static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
+static int get_desc(Vring *vring, VirtQueueElement *elem,
                     struct vring_desc *desc)
 {
     unsigned *num;
     struct iovec *iov;
     hwaddr *addr;
     MemoryRegion *mr;
-    int is_write = virtio_tswap16(vdev, desc->flags) & VRING_DESC_F_WRITE;
-    uint32_t len = virtio_tswap32(vdev, desc->len);
-    uint64_t desc_addr = virtio_tswap64(vdev, desc->addr);
 
-    if (is_write) {
+    if (desc->flags & VRING_DESC_F_WRITE) {
         num = &elem->in_num;
         iov = &elem->in_sg[*num];
         addr = &elem->in_addr[*num];
@@ -193,17 +190,18 @@ static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
     }
 
     /* TODO handle non-contiguous memory across region boundaries */
-    iov->iov_base = vring_map(&mr, desc_addr, len, is_write);
+    iov->iov_base = vring_map(&mr, desc->addr, desc->len,
+                              desc->flags & VRING_DESC_F_WRITE);
     if (!iov->iov_base) {
         error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
-                     (uint64_t)desc_addr, len);
+                     (uint64_t)desc->addr, desc->len);
         return -EFAULT;
     }
 
     /* The MemoryRegion is looked up again and unref'ed later, leave the
      * ref in place.  */
-    iov->iov_len = len;
-    *addr = desc_addr;
+    iov->iov_len = desc->len;
+    *addr = desc->addr;
     *num += 1;
     return 0;
 }
@@ -225,23 +223,21 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
     struct vring_desc desc;
     unsigned int i = 0, count, found = 0;
     int ret;
-    uint32_t len = virtio_tswap32(vdev, indirect->len);
-    uint64_t addr = virtio_tswap64(vdev, indirect->addr);
 
     /* Sanity check */
-    if (unlikely(len % sizeof(desc))) {
+    if (unlikely(indirect->len % sizeof(desc))) {
         error_report("Invalid length in indirect descriptor: "
                      "len %#x not multiple of %#zx",
-                     len, sizeof(desc));
+                     indirect->len, sizeof(desc));
         vring->broken = true;
         return -EFAULT;
     }
 
-    count = len / sizeof(desc);
+    count = indirect->len / sizeof(desc);
     /* Buffers are chained via a 16 bit next field, so
      * we can have at most 2^16 of these. */
     if (unlikely(count > USHRT_MAX + 1)) {
-        error_report("Indirect buffer length too big: %d", len);
+        error_report("Indirect buffer length too big: %d", indirect->len);
         vring->broken = true;
         return -EFAULT;
     }
@@ -252,12 +248,12 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 
         /* Translate indirect descriptor */
         desc_ptr = vring_map(&mr,
-                             addr + found * sizeof(desc),
+                             indirect->addr + found * sizeof(desc),
                              sizeof(desc), false);
         if (!desc_ptr) {
             error_report("Failed to map indirect descriptor "
                          "addr %#" PRIx64 " len %zu",
-                         (uint64_t)addr + found * sizeof(desc),
+                         (uint64_t)indirect->addr + found * sizeof(desc),
                          sizeof(desc));
             vring->broken = true;
             return -EFAULT;
@@ -275,20 +271,19 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
             return -EFAULT;
         }
 
-        if (unlikely(virtio_tswap16(vdev, desc.flags)
-                     & VRING_DESC_F_INDIRECT)) {
+        if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
             error_report("Nested indirect descriptor");
             vring->broken = true;
             return -EFAULT;
         }
 
-        ret = get_desc(vdev, vring, elem, &desc);
+        ret = get_desc(vring, elem, &desc);
         if (ret < 0) {
             vring->broken |= (ret == -EFAULT);
             return ret;
         }
-        i = virtio_tswap16(vdev, desc.next);
-    } while (virtio_tswap16(vdev, desc.flags) & VRING_DESC_F_NEXT);
+        i = desc.next;
+    } while (desc.flags & VRING_DESC_F_NEXT);
     return 0;
 }
 
@@ -389,7 +384,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 /* Ensure descriptor is loaded

[Qemu-block] [PULL 01/13] dataplane: fix cross-endian issues

2015-07-07 Thread Michael S. Tsirkin

From: Greg Kurz <gk...@linux.vnet.ibm.com>

Accesses to vring_avail_event and vring_used_event must honor the queue
endianness.

This patch allows cross-endian setups to use dataplane (tested with ppc64
on ppc64le, and vice-versa).
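
As a rough sketch of what honoring the queue endianness means for a
16-bit ring field: the value is byte-swapped only when the queue's byte
order differs from the host's. The helper names and the needs_swap flag
below are assumptions for illustration; the patch itself uses
virtio_tswap16(), which takes the device as its first argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Illustrative stand-in for a per-device swap helper: convert a 16-bit
 * ring field between queue byte order and host byte order.
 * needs_swap is an assumed flag, true when the two orders differ. */
static uint16_t ring_tswap16(bool needs_swap, uint16_t v)
{
    return needs_swap ? swap16(v) : v;
}
```

The conversion is its own inverse, so the same helper serves both for
reading the used event index and for writing the avail event index.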

Suggested-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>
---
 hw/virtio/dataplane/vring.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 3589185..bed9b11 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -153,7 +153,8 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
         return true;
     }
 
-    return vring_need_event(vring_used_event(&vring->vr), new, old);
+    return vring_need_event(virtio_tswap16(vdev, vring_used_event(&vring->vr)),
+                            new, old);
 }
 
 
@@ -407,7 +408,8 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
     /* On success, increment avail index. */
     vring->last_avail_idx++;
     if (virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
-        vring_avail_event(&vring->vr) = vring->last_avail_idx;
+        vring_avail_event(&vring->vr) =
+            virtio_tswap16(vdev, vring->last_avail_idx);
     }
 
     return head;
-- 
MST

[Qemu-block] [PULL 09/10] pci: Don't register a specialized 'config_write' if default behavior is intended

2015-06-17 Thread Michael S. Tsirkin

From: Shmulik Ladkani <shmulik.ladk...@ravellosystems.com>

Few devices have their specialized 'config_write' methods which simply
call 'pci_default_write_config' followed by a 'msix_write_config' or
'msi_write_config' calls, using exact same arguments.

This is unnecessary as 'pci_default_write_config' already invokes
'msi_write_config' and 'msix_write_config'.

Also, since 'pci_default_write_config' is the default 'config_write'
handler, we can simply avoid the registration of these specialized
versions.
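
A simplified model of why the wrappers are redundant, assuming a default
handler that already forwards to the MSI and MSI-X handlers and is used
whenever no specialized handler is registered. All type and function
names below are invented for this sketch and do not match the QEMU PCI
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model; names are assumptions, not QEMU's. */
struct dev;
typedef void (*config_write_fn)(struct dev *d, uint32_t addr,
                                uint32_t val, int len);

struct dev {
    config_write_fn config_write;  /* NULL means "use the default" */
    int msi_calls;                 /* counts MSI handler invocations */
    int msix_calls;                /* counts MSI-X handler invocations */
};

static void msi_write(struct dev *d) { d->msi_calls++; }
static void msix_write(struct dev *d) { d->msix_calls++; }

/* The default handler already forwards to the MSI and MSI-X handlers. */
static void default_write_config(struct dev *d, uint32_t addr,
                                 uint32_t val, int len)
{
    (void)addr; (void)val; (void)len;
    msi_write(d);
    msix_write(d);
}

/* Dispatch: fall back to the default when no handler is registered. */
static void write_config(struct dev *d, uint32_t addr, uint32_t val, int len)
{
    config_write_fn fn = d->config_write ? d->config_write
                                         : default_write_config;
    fn(d, addr, val, len);
}
```

A wrapper that calls the default and then the MSI handlers again would
invoke each MSI handler twice per config write in this model.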

Cc: Leonid Shatz <leonid.sh...@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladk...@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/misc/ivshmem.c    | 1 -
 hw/net/vmxnet3.c     | 9 ---------
 hw/scsi/megasas.c    | 8 --------
 hw/scsi/vmw_pvscsi.c | 8 --------
 4 files changed, 26 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 5d272c8..231c35f 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -698,7 +698,6 @@ static void ivshmem_write_config(PCIDevice *pci_dev, uint32_t address,
                                  uint32_t val, int len)
 {
     pci_default_write_config(pci_dev, address, val, len);
-    msix_write_config(pci_dev, address, val, len);
 }
 
 static int pci_ivshmem_init(PCIDevice *dev)
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index dfb328d..34ffafd 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2481,14 +2481,6 @@ static const VMStateDescription vmstate_vmxnet3 = {
     }
 };
 
-static void
-vmxnet3_write_config(PCIDevice *pci_dev, uint32_t addr, uint32_t val, int len)
-{
-    pci_default_write_config(pci_dev, addr, val, len);
-    msix_write_config(pci_dev, addr, val, len);
-    msi_write_config(pci_dev, addr, val, len);
-}
-
 static Property vmxnet3_properties[] = {
     DEFINE_NIC_PROPERTIES(VMXNET3State, conf),
     DEFINE_PROP_END_OF_LIST(),
@@ -2507,7 +2499,6 @@ static void vmxnet3_class_init(ObjectClass *class, void *data)
     c->class_id = PCI_CLASS_NETWORK_ETHERNET;
     c->subsystem_vendor_id = PCI_VENDOR_ID_VMWARE;
     c->subsystem_id = PCI_DEVICE_ID_VMWARE_VMXNET3;
-    c->config_write = vmxnet3_write_config,
     dc->desc = "VMWare Paravirtualized Ethernet v3";
     dc->reset = vmxnet3_qdev_reset;
     dc->vmsd = &vmstate_vmxnet3;
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 91a5d97..51ba9e0 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2407,13 +2407,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp)
     }
 }
 
-static void
-megasas_write_config(PCIDevice *pci, uint32_t addr, uint32_t val, int len)
-{
-    pci_default_write_config(pci, addr, val, len);
-    msi_write_config(pci, addr, val, len);
-}
-
 static Property megasas_properties_gen1[] = {
     DEFINE_PROP_UINT32("max_sge", MegasasState, fw_sge,
                        MEGASAS_DEFAULT_SGE),
@@ -2516,7 +2509,6 @@ static void megasas_class_init(ObjectClass *oc, void *data)
     dc->vmsd = info->vmsd;
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
     dc->desc = info->desc;
-    pc->config_write = megasas_write_config;
 }
 
 static const TypeInfo megasas_info = {
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index c6148d3..9c71f31 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1174,13 +1174,6 @@ static const VMStateDescription vmstate_pvscsi = {
     }
 };
 
-static void
-pvscsi_write_config(PCIDevice *pci, uint32_t addr, uint32_t val, int len)
-{
-    pci_default_write_config(pci, addr, val, len);
-    msi_write_config(pci, addr, val, len);
-}
-
 static Property pvscsi_properties[] = {
     DEFINE_PROP_UINT8("use_msg", PVSCSIState, use_msg, 1),
     DEFINE_PROP_END_OF_LIST(),
@@ -1202,7 +1195,6 @@ static void pvscsi_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_pvscsi;
     dc->props = pvscsi_properties;
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-    k->config_write = pvscsi_write_config;
     hc->unplug = pvscsi_hot_unplug;
     hc->plug = pvscsi_hotplug;
 }
-- 
MST

[Qemu-block] [PATCH v2 3/3] block/nfs: switch to error_init_local

2015-06-17 Thread Michael S. Tsirkin

We probably should just switch everyone, this is
just to demonstrate the API usage.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---
 block/nfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nfs.c b/block/nfs.c
index ca9e24e..de4b8c3 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -385,7 +385,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
     NFSClient *client = bs->opaque;
     int64_t ret;
     QemuOpts *opts;
-    Error *local_err = NULL;
+    Error *local_err = error_init_local(errp);
 
     client->aio_context = bdrv_get_aio_context(bs);
 
-- 
MST

Re: [Qemu-block] [Qemu-devel] [PATCH RFC 3/3] block/nfs: switch to error_init_local

2015-06-16 Thread Michael S. Tsirkin

On Tue, Jun 16, 2015 at 09:08:16AM -0600, Eric Blake wrote:
 On 06/16/2015 06:53 AM, Michael S. Tsirkin wrote:
  We probably should just switch everyone, this is
  just to demonstrate the API usage.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  ---
   block/nfs.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
 And indeed this is the reason things are still at RFC level.  I like the
 idea.  It doesn't change anything for a bug-free program, but where we
 DO have a bug, we now get a stacktrace that aborts as soon as possible
 rather than delaying to the propagation point and losing some information.
 
  
  diff --git a/block/nfs.c b/block/nfs.c
  index ca9e24e..de4b8c3 100644
  --- a/block/nfs.c
  +++ b/block/nfs.c
  @@ -385,7 +385,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
       NFSClient *client = bs->opaque;
       int64_t ret;
       QemuOpts *opts;
  -    Error *local_err = NULL;
  +    Error *local_err = error_init_local(errp);
 
 Should be a fairly mechanical patch to catch all the spots; although
 there are multiple spellings (not all callers name it local_err).

I'll try to write an spatch to do this.

 -- 
 Eric Blake   eblake redhat com    +1-919-301-3266
 Libvirt virtualization library http://libvirt.org

[Qemu-block] [PULL 04/42] dataplane: allow virtio-1 devices

2015-06-11 Thread Michael S. Tsirkin

From: Cornelia Huck <cornelia.h...@de.ibm.com>

Handle endianness conversion for virtio-1 virtqueues correctly.

Note that dataplane now needs to be built per-target.

Signed-off-by: Cornelia Huck <cornelia.h...@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
---
 hw/virtio/dataplane/vring.c | 47 +
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 5c7b8c2..fabb810 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -157,15 +157,18 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 }
 
 
-static int get_desc(Vring *vring, VirtQueueElement *elem,
+static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
                     struct vring_desc *desc)
 {
     unsigned *num;
     struct iovec *iov;
     hwaddr *addr;
     MemoryRegion *mr;
+    int is_write = virtio_tswap16(vdev, desc->flags) & VRING_DESC_F_WRITE;
+    uint32_t len = virtio_tswap32(vdev, desc->len);
+    uint64_t desc_addr = virtio_tswap64(vdev, desc->addr);
 
-    if (desc->flags & VRING_DESC_F_WRITE) {
+    if (is_write) {
         num = &elem->in_num;
         iov = &elem->in_sg[*num];
         addr = &elem->in_addr[*num];
@@ -189,18 +192,17 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
     }
 
     /* TODO handle non-contiguous memory across region boundaries */
-    iov->iov_base = vring_map(&mr, desc->addr, desc->len,
-                              desc->flags & VRING_DESC_F_WRITE);
+    iov->iov_base = vring_map(&mr, desc_addr, len, is_write);
     if (!iov->iov_base) {
         error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
-                     (uint64_t)desc->addr, desc->len);
+                     (uint64_t)desc_addr, len);
         return -EFAULT;
     }
 
     /* The MemoryRegion is looked up again and unref'ed later, leave the
      * ref in place.  */
-    iov->iov_len = desc->len;
-    *addr = desc->addr;
+    iov->iov_len = len;
+    *addr = desc_addr;
     *num += 1;
     return 0;
 }
@@ -222,21 +224,23 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
     struct vring_desc desc;
     unsigned int i = 0, count, found = 0;
     int ret;
+    uint32_t len = virtio_tswap32(vdev, indirect->len);
+    uint64_t addr = virtio_tswap64(vdev, indirect->addr);
 
     /* Sanity check */
-    if (unlikely(indirect->len % sizeof(desc))) {
+    if (unlikely(len % sizeof(desc))) {
         error_report("Invalid length in indirect descriptor: "
                      "len %#x not multiple of %#zx",
-                     indirect->len, sizeof(desc));
+                     len, sizeof(desc));
         vring->broken = true;
         return -EFAULT;
     }
 
-    count = indirect->len / sizeof(desc);
+    count = len / sizeof(desc);
     /* Buffers are chained via a 16 bit next field, so
      * we can have at most 2^16 of these. */
     if (unlikely(count > USHRT_MAX + 1)) {
-        error_report("Indirect buffer length too big: %d", indirect->len);
+        error_report("Indirect buffer length too big: %d", len);
         vring->broken = true;
         return -EFAULT;
     }
@@ -247,12 +251,12 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 
         /* Translate indirect descriptor */
         desc_ptr = vring_map(&mr,
-                             indirect->addr + found * sizeof(desc),
+                             addr + found * sizeof(desc),
                              sizeof(desc), false);
         if (!desc_ptr) {
             error_report("Failed to map indirect descriptor "
                          "addr %#" PRIx64 " len %zu",
-                         (uint64_t)indirect->addr + found * sizeof(desc),
+                         (uint64_t)addr + found * sizeof(desc),
                          sizeof(desc));
             vring->broken = true;
             return -EFAULT;
@@ -270,19 +274,20 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
             return -EFAULT;
         }
 
-        if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
+        if (unlikely(virtio_tswap16(vdev, desc.flags)
+                     & VRING_DESC_F_INDIRECT)) {
             error_report("Nested indirect descriptor");
             vring->broken = true;
             return -EFAULT;
         }
 
-        ret = get_desc(vring, elem, &desc);
+        ret = get_desc(vdev, vring, elem, &desc);
         if (ret < 0) {
             vring->broken |= (ret == -EFAULT);
             return ret;
         }
-        i = desc.next;
-    } while (desc.flags & VRING_DESC_F_NEXT);
+        i = virtio_tswap16(vdev, desc.next);
+    } while (virtio_tswap16(vdev, desc.flags) & VRING_DESC_F_NEXT);
     return 0;
 }
 
@@ -383,7 +388,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 /* Ensure descriptor is loaded before accessing fields */
 barrier();
 
-if (desc.flags

[Qemu-block] [PULL v2 41/60] i386/pc: '-drive if=floppy' should imply a board-default FDC

2015-06-01 Thread Michael S. Tsirkin

From: Laszlo Ersek <ler...@redhat.com>

Even if board code decides not to request the creation of the FDC (keyed
off board-level factors, to be determined later), we should create the FDC
nevertheless if the user passes '-drive if=floppy' on the command line.

Otherwise '-drive if=floppy' would break without explicit '-device
isa-fdc' on such boards.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Gerd Hoffmann <kra...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Gabriel L. Somlo <gso...@gmail.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Markus Armbruster <arm...@redhat.com>
---
 hw/i386/pc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b2fc501..1eb1db0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1490,6 +1490,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
 
     for(i = 0; i < MAX_FD; i++) {
         fd[i] = drive_get(IF_FLOPPY, 0, i);
+        create_fdctrl |= !!fd[i];
     }
     *floppy = create_fdctrl ? fdctrl_init_isa(isa_bus, fd) : NULL;
 }
-- 
MST
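
The one-line change in this patch can be read as "the board's request, OR-ed with whether any floppy drive was configured". A minimal standalone sketch of that decision follows. The names `struct demo_drive`, `DEMO_MAX_FD` and `demo_want_fdc` are hypothetical stand-ins for illustration only; the real code works on `DriveInfo` pointers returned by `drive_get()` and loops up to `MAX_FD`, whose value is assumed here to be 2.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed number of floppy slots on the controller (stand-in for MAX_FD). */
#define DEMO_MAX_FD 2

/* Hypothetical stand-in for a configured drive. */
struct demo_drive {
    int unit;
};

/*
 * Decide whether a floppy controller should be created.
 * board_requests_fdc is the board-level default; any non-NULL entry in
 * fd[] means the user asked for a floppy drive and forces creation.
 */
static bool demo_want_fdc(bool board_requests_fdc,
                          struct demo_drive *fd[DEMO_MAX_FD])
{
    bool create_fdctrl = board_requests_fdc;

    for (size_t i = 0; i < DEMO_MAX_FD; i++) {
        /* !! collapses the pointer to 0 or 1 before OR-ing it in. */
        create_fdctrl |= !!fd[i];
    }
    return create_fdctrl;
}
```

With this shape a board can stop requesting the controller by default without breaking command lines that still pass a floppy drive.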

Re: [Qemu-block] [PATCH v2] virtio: make features 64bit wide

2015-06-01 Thread Michael S. Tsirkin

On Mon, Jun 01, 2015 at 09:23:28AM +0200, Gerd Hoffmann wrote:
> On Fr, 2015-05-29 at 16:53 +0200, Michael S. Tsirkin wrote:
> > On Fri, May 29, 2015 at 09:51:20AM +0200, Gerd Hoffmann wrote:
> > > Make features 64bit wide everywhere.  Exception: command line flags
> > > remain 32bit and are copied into the lower 32 host_features at
> > > initialization time.
> > > 
> > > On migration a full 64bit guest_features field is sent if one of the
> > > high bits is set, additionally to the lower 32bit guest_features field
> > > which must stay for compatibility reasons.  That way we send the lower
> > > 32 feature bits twice, but that way the code is simpler because we don't
> > > have to split and compose the 64bit features into two 32bit fields.
> > > 
> > > This depends on move host_features patch by Cornelia.
> > > 
> > > Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
> > 
> > 
> > Thanks, this is very close to what I had in mind.
> > Question: why do we need the feature_flags field?
> > What's wrong with setting bits in host_features directly?
> 
> DEFINE_PROP_BIT works on uint32_t.
> 
> Alternative approach would be to introduce a DEFINE_PROP_BIT64 and use
> that for DEFINE_VIRTIO_COMMON_FEATURES.
> 
> cheers,
>   Gerd
> 


Yes - previous versions of this patch did exactly that.
Can you do DEFINE_PROP_BIT64 please?
We'll need DEFINE_PROP_BIT64 down the road anyway when
we add properties > 32.

If you prefer, I'm fine with this being a patch on top.

Let me know.


-- 
MST
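
To make the 32-bit limitation discussed in this thread concrete, here is a small standalone sketch of 64-bit feature-bit handling and of the migration rule quoted above (always send the low 32 bits, send the full 64-bit field only when a high bit is set). All `demo_*` names are hypothetical and are not the QEMU helpers; in particular this does not show how `DEFINE_PROP_BIT` or a future `DEFINE_PROP_BIT64` is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set one feature bit in a 64-bit mask. */
static void demo_add_feature(uint64_t *features, unsigned int fbit)
{
    /* The cast matters: a plain int shift is not valid for bits 32 and up. */
    *features |= (uint64_t)1 << fbit;
}

/* Test one feature bit in a 64-bit mask. */
static bool demo_has_feature(uint64_t features, unsigned int fbit)
{
    return (features & ((uint64_t)1 << fbit)) != 0;
}

/* The legacy 32-bit field carries only the low half of the mask. */
static uint32_t demo_low32(uint64_t features)
{
    return (uint32_t)features;
}

/* The extra 64-bit field is only needed when a high bit is set. */
static bool demo_needs_64bit_field(uint64_t guest_features)
{
    return (guest_features >> 32) != 0;
}
```

A property helper that stores into a `uint32_t` cannot express bit 32 or above at all, which is the reason given in the thread for wanting a 64-bit variant.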

[Qemu-block] [PULL 42/57] i386/pc_q35: don't insist on board FDC if there's no default floppy

2015-05-31 Thread Michael S. Tsirkin

From: Laszlo Ersek <ler...@redhat.com>

The no_floppy = 1 machine class setting causes default_floppy in
main() to become zero. Consequently, default_drive() will not call
drive_add() and drive_new() for IF_FLOPPY, index=0, meaning that no
default floppy drive will be created for the virtual machine. In that
case, board code should also not insist on the creation of the
board-default FDC.

The board-default FDC will still be created if the user requests a floppy
drive with -drive if=floppy.

Additionally, separate FDCs can be specified manually with -device
isa-fdc. They allow the

  -device isa-fdc,driveA=...

syntax that is more flexible than the one required by the board-default
FDC:

  -global isa-fdc.driveA=...

This patch doesn't change the behavior observably, as all Q35 machine
types have no_floppy = 0.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Gerd Hoffmann <kra...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Gabriel L. Somlo <gso...@gmail.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Markus Armbruster <arm...@redhat.com>
---
 hw/i386/pc_q35.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 9ca317c..9f036c8 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -89,6 +89,7 @@ static void pc_q35_init(MachineState *machine)
     PcGuestInfo *guest_info;
     ram_addr_t lowmem;
     DriveInfo *hd[MAX_SATA_PORTS];
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
 
     /* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
      * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
@@ -163,7 +164,6 @@ static void pc_q35_init(MachineState *machine)
     guest_info->legacy_acpi_table_size = 0;
 
     if (smbios_defaults) {
-        MachineClass *mc = MACHINE_GET_CLASS(machine);
         /* These values are guest ABI, do not change */
         smbios_set_defaults("QEMU", "Standard PC (Q35 + ICH9, 2009)",
                             mc->name, smbios_legacy_mode, smbios_uuid_encoded);
@@ -250,7 +250,7 @@ static void pc_q35_init(MachineState *machine)
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, gsi, &rtc_state, true, &floppy,
+    pc_basic_device_init(isa_bus, gsi, &rtc_state, !mc->no_floppy, &floppy,
                          (pc_machine->vmport != ON_OFF_AUTO_ON), 0xff0104);
 
     /* connect pm stuff to lpc */
-- 
MST

[Qemu-block] [PULL 43/57] i386: drop FDC in pc-q35-2.4+ if neither it nor floppy drives are wanted

2015-05-31 Thread Michael S. Tsirkin

From: Laszlo Ersek <ler...@redhat.com>

It is Very annoying to carry forward an outdatEd coNtroller with a mOdern
Machine type.

Hence, let us not instantiate the FDC when all of the following apply:
- the machine type is pc-q35-2.4 or later,
- -device isa-fdc is not passed on the command line (nor in the config
  file),
- no -drive if=floppy,... is requested.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Gerd Hoffmann <kra...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Gabriel L. Somlo <gso...@gmail.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: qemu-block@nongnu.org
Suggested-by: Markus Armbruster <arm...@redhat.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
Acked-by: Paolo Bonzini <pbonz...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Markus Armbruster <arm...@redhat.com>
---
 hw/i386/pc_q35.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 9f036c8..66220b3 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -392,6 +392,7 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
     pc_q35_machine_options(m);
     m->default_machine_opts = "firmware=bios-256k.bin";
     m->default_display = "std";
+    m->no_floppy = 1;
     m->alias = "q35";
 }
 
-- 
MST

[Qemu-block] [PULL 40/57] i386/pc: pc_basic_device_init(): delegate FDC creation request

2015-05-31 Thread Michael S. Tsirkin

From: Laszlo Ersek <ler...@redhat.com>

This patch introduces no observable change, but it allows the callers of
pc_basic_device_init(), ie. pc_init1() and pc_q35_init(), to request (or
not request) the creation of the FDC explicitly.

At the moment both callers pass constant create_fdctrl=true (hence no
observable change).

Assuming a board passes create_fdctrl=false, floppy will be NULL on
output, and (beyond the FDC not being created) that NULL will be passed on
to pc_cmos_init(). Luckily, pc_cmos_init() already handles that case.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Gerd Hoffmann <kra...@redhat.com>
Cc: John Snow <js...@redhat.com>
Cc: Gabriel L. Somlo <gso...@gmail.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Reviewed-by: Markus Armbruster <arm...@redhat.com>
---
 include/hw/i386/pc.h | 1 +
 hw/i386/pc.c | 3 ++-
 hw/i386/pc_piix.c| 2 +-
 hw/i386/pc_q35.c | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 0510aea..27bd748 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -197,6 +197,7 @@ qemu_irq *pc_allocate_cpu_irq(void);
 DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
 void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
   ISADevice **rtc_state,
+  bool create_fdctrl,
   ISADevice **floppy,
   bool no_vmport,
   uint32 hpet_irqs);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index aeed45d..b2fc501 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1395,6 +1395,7 @@ static const MemoryRegionOps ioportF0_io_ops = {
 
 void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
   ISADevice **rtc_state,
+  bool create_fdctrl,
   ISADevice **floppy,
   bool no_vmport,
   uint32 hpet_irqs)
@@ -1490,7 +1491,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
     for(i = 0; i < MAX_FD; i++) {
         fd[i] = drive_get(IF_FLOPPY, 0, i);
     }
-    *floppy = fdctrl_init_isa(isa_bus, fd);
+    *floppy = create_fdctrl ? fdctrl_init_isa(isa_bus, fd) : NULL;
 }
 
 void pc_nic_init(ISABus *isa_bus, PCIBus *pci_bus)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index e77486c..6e7fa42 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -242,7 +242,7 @@ static void pc_init1(MachineState *machine)
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy,
+    pc_basic_device_init(isa_bus, gsi, &rtc_state, true, &floppy,
                          (pc_machine->vmport != ON_OFF_AUTO_ON), 0x4);
 
     pc_nic_init(isa_bus, pci_bus);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 68b4867..9ca317c 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -250,7 +250,7 @@ static void pc_q35_init(MachineState *machine)
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy,
+    pc_basic_device_init(isa_bus, gsi, &rtc_state, true, &floppy,
                          (pc_machine->vmport != ON_OFF_AUTO_ON), 0xff0104);
 
     /* connect pm stuff to lpc */
-- 
MST

Re: [Qemu-block] [PATCH v2] virtio: make features 64bit wide

2015-05-29 Thread Michael S. Tsirkin

On Fri, May 29, 2015 at 09:51:20AM +0200, Gerd Hoffmann wrote:
> Make features 64bit wide everywhere.  Exception: command line flags
> remain 32bit and are copied into the lower 32 host_features at
> initialization time.
> 
> On migration a full 64bit guest_features field is sent if one of the
> high bits is set, additionally to the lower 32bit guest_features field
> which must stay for compatibility reasons.  That way we send the lower
> 32 feature bits twice, but that way the code is simpler because we don't
> have to split and compose the 64bit features into two 32bit fields.
> 
> This depends on move host_features patch by Cornelia.
> 
> Signed-off-by: Gerd Hoffmann <kra...@redhat.com>


Thanks, this is very close to what I had in mind.
Question: why do we need the feature_flags field?
What's wrong with setting bits in host_features directly?


> ---
>  hw/9pfs/virtio-9p-device.c  |  2 +-
>  hw/block/virtio-blk.c       |  2 +-
>  hw/char/virtio-serial-bus.c |  2 +-
>  hw/net/virtio-net.c         | 18 ---
>  hw/scsi/vhost-scsi.c        |  4 ++--
>  hw/scsi/virtio-scsi.c       |  4 ++--
>  hw/virtio/virtio-balloon.c  |  2 +-
>  hw/virtio/virtio-rng.c      |  2 +-
>  hw/virtio/virtio.c          | 54 +++--
>  include/hw/virtio/virtio.h  | 21 +-
>  10 files changed, 77 insertions(+), 34 deletions(-)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 30492ec..60f9ff9 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -21,7 +21,7 @@
>  #include "virtio-9p-coth.h"
>  #include "hw/virtio/virtio-access.h"
>  
> -static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
> +static uint64_t virtio_9p_get_features(VirtIODevice *vdev, uint64_t features)
>  {
>      virtio_add_feature(&features, VIRTIO_9P_MOUNT_TAG);
>      return features;
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index e6afe97..cd539aa 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -718,7 +718,7 @@ static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
>      aio_context_release(blk_get_aio_context(s->blk));
>  }
>  
> -static uint32_t virtio_blk_get_features(VirtIODevice *vdev, uint32_t features)
> +static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features)
>  {
>      VirtIOBlock *s = VIRTIO_BLK(vdev);
>  
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index 6e2ad82..95be9fc 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -498,7 +498,7 @@ static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
>      }
>  }
>  
> -static uint32_t get_features(VirtIODevice *vdev, uint32_t features)
> +static uint64_t get_features(VirtIODevice *vdev, uint64_t features)
>  {
>      VirtIOSerial *vser;
>  
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3af6faf..b21ef6b 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -435,7 +435,7 @@ static void virtio_net_set_queues(VirtIONet *n)
>  
>  static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue);
>  
> -static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
> +static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features)
>  {
>      VirtIONet *n = VIRTIO_NET(vdev);
>      NetClientState *nc = qemu_get_queue(n->nic);
> @@ -468,9 +468,9 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev, uint32_t features)
>      return vhost_net_get_features(get_vhost_net(nc->peer), features);
>  }
>  
> -static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
> +static uint64_t virtio_net_bad_features(VirtIODevice *vdev)
>  {
> -    uint32_t features = 0;
> +    uint64_t features = 0;
>  
>      /* Linux kernel 2.6.25.  It understood MAC (as everyone must),
>       * but also these: */
> @@ -1032,10 +1032,12 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
>              if (i == 0)
>                  return -1;
>              error_report("virtio-net unexpected empty queue: "
> -                "i %zd mergeable %d offset %zd, size %zd, "
> -                "guest hdr len %zd, host hdr len %zd guest features 0x%x",
> -                i, n->mergeable_rx_bufs, offset, size,
> -                n->guest_hdr_len, n->host_hdr_len, vdev->guest_features);
> +                         "i %zd mergeable %d offset %zd, size %zd, "
> +                         "guest hdr len %zd, host hdr len %zd "
> +                         "guest features 0x%" PRIx64,
> +                         i, n->mergeable_rx_bufs, offset, size,
> +                         n->guest_hdr_len, n->host_hdr_len,
> +                         vdev->guest_features);
>              exit(1);
>          }
>  
> @@ -1549,7 +1551,7 @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
>                               vdev, idx, mask);
>  }
>  
> -static void virtio_net_set_config_size(VirtIONet *n, uint32_t host_features)
> +static void
