date:20160105

Re: [Qemu-devel] [vfio-users] [PATCH v2 1/3] input: add qemu_input_qcode_to_linux + qemu_input_linux_to_qcode

2016-01-05 Thread Gerd Hoffmann

On Di, 2016-01-05 at 15:44 +0100, sL1pKn07 SpinFlo wrote:
> 2016-01-05 9:06 GMT+01:00 Jonathan Scruggs :
> > I notice no bugs as of yet.
> 
> Hi
> I found one (if you can call it that) bug:
> 
> I use a physical USB switch to share the keyboard/mouse with another PC.
> When I switch to the other PC, the signal is lost.

How does the switch work?  Disconnect the usb devices (i.e. they are
gone from lsusb output)?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH for v2.3.0] fw_cfg: add check to validate current entry value

2016-01-05 Thread P J P

+-- On Tue, 5 Jan 2016, Stefan Weil wrote --+
| > -s->cur_offset < e->len) {
| > +if (s->cur_entry != FW_CFG_INVALID
| > +&& s->cur_entry & FW_CFG_WRITE_CHANNEL
| > +&& e->callback
| > +&& s->cur_offset < e->len) {
| 
| I suggest to test e != NULL instead of s->cur_entry != FW_CFG_INVALID.
| 
| Of course both variants are equivalent, but e != NULL might be easier
| to review and make work of static code analyzers easier, too.

  Yes, I've sent a revised patch with this change.

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [PATCH v2 for v2.3.0] fw_cfg: add check to validate current entry value

2016-01-05 Thread P J P

From: Prasad J Pandit 

When processing firmware configurations, an OOB r/w access occurs
if 's->cur_entry' is set to be invalid(FW_CFG_INVALID=0x).
Add a check to validate 's->cur_entry' to avoid such access.

Reported-by: Donghai Zdh 
Signed-off-by: Prasad J Pandit 
---
 hw/nvram/fw_cfg.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

Updated as per review in
  -> https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00398.html

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 68eff77..e73c0fb 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -233,12 +233,15 @@ static void fw_cfg_reboot(FWCfgState *s)
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
 {
 int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
-FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
+FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
+ &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
 
 trace_fw_cfg_write(s, value);
 
-if (s->cur_entry & FW_CFG_WRITE_CHANNEL && e->callback &&
-s->cur_offset < e->len) {
+if (s->cur_entry & FW_CFG_WRITE_CHANNEL
+&& e != NULL
+&& e->callback
+&& s->cur_offset < e->len) {
 e->data[s->cur_offset++] = value;
 if (s->cur_offset == e->len) {
 e->callback(e->callback_opaque, e->data);
@@ -267,7 +270,8 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
 static uint8_t fw_cfg_read(FWCfgState *s)
 {
 int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
-FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
+FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
+ &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
 uint8_t ret;
 
 if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
-- 
2.4.3
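
A minimal sketch of the guard pattern discussed above, assuming a toy selector table rather than the real fw_cfg code. All names here (ToyState, ToyEntry, TOY_SEL_*) and the constant values are hypothetical and chosen for illustration only. The idea is the one from the review: resolve the entry pointer once, return NULL for the "nothing selected" value, and test e != NULL before touching the entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector layout for this sketch only. */
#define TOY_SEL_INVALID 0xffffu  /* "no entry selected" marker */
#define TOY_SEL_ARCH    0x8000u  /* picks the second table */
#define TOY_SEL_MASK    0x3fffu  /* index bits */
#define TOY_N_ENTRIES   4

typedef struct ToyEntry {
    const uint8_t *data;
    uint32_t len;
} ToyEntry;

typedef struct ToyState {
    ToyEntry entries[2][TOY_N_ENTRIES];
    uint16_t cur_entry;
    uint32_t cur_offset;
} ToyState;

/* Return the selected entry, or NULL if the selector is invalid
 * or its index falls outside the table. */
static ToyEntry *toy_lookup(ToyState *s)
{
    unsigned idx;

    if (s->cur_entry == TOY_SEL_INVALID) {
        return NULL;
    }
    idx = s->cur_entry & TOY_SEL_MASK;
    if (idx >= TOY_N_ENTRIES) {
        return NULL;
    }
    return &s->entries[!!(s->cur_entry & TOY_SEL_ARCH)][idx];
}

/* Read one byte from the selected entry; yields 0 when nothing
 * valid is selected or the offset is past the end. */
static uint8_t toy_read(ToyState *s)
{
    ToyEntry *e = toy_lookup(s);

    if (e == NULL || e->data == NULL || s->cur_offset >= e->len) {
        return 0;
    }
    return e->data[s->cur_offset++];
}
```

Keeping the NULL test in one place means every caller gets the same check, which is also the form a static analyzer can follow most easily.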

Re: [Qemu-devel] [PATCH v3 4/7] bcm2835_peripherals: add rollup device for bcm2835 peripherals

2016-01-05 Thread Andrew Baumann

> From: Alistair Francis [mailto:alistai...@gmail.com]
> Sent: Tuesday, 5 January 2016 18:14
> On Thu, Dec 31, 2015 at 4:31 PM, Andrew Baumann
>  wrote:
> > This device maintains all the non-CPU peripherals on bcm2835 (Pi1)
> > which are also present on bcm2836 (Pi2). It also implements the
> > private address spaces used for DMA and mailboxes.
[...]
> > +obj = object_property_get_link(OBJECT(dev), "ram", &err);
> > +if (obj == NULL) {
> > +error_setg(errp, "%s: required ram link not found: %s",
> > +   __func__, error_get_pretty(err));
> > +return;
> > +}
> 
> I only had a quick read of this patch, but this RAM linking looks fine
> to me. Out of curiosity is there a reason you use
> object_property_get_link() instead of object_property_add_link() in
> the init?

I'm not sure I understand your question... it wouldn't work the other way. I
allocate the ram and add the link using object_property_add_const_link() in
hw/arm/raspi.c. This file needs to consume the ram to set up alias mappings, so
it is using get_link(). (Note there's also a level of indirection; raspi creates
bcm2836, which does nothing but get the link set by its parent and add it to
its bcm2835_peripherals child.)

I suppose I could do it the other way around (allocate and set link in
bcm2835_peripherals, based on a size passed from the board), but it seemed more
logical to treat the RAM as created/owned by the board rather than the SoC.

Cheers,
Andrew

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-05 Thread Jason Wang



On 01/05/2016 12:52 AM, Dr. David Alan Gilbert wrote:
> * Jason Wang (jasow...@redhat.com) wrote:
>>
>> On 01/04/2016 04:16 PM, Zhang Chen wrote:
>>>
>>> On 01/04/2016 01:37 PM, Jason Wang wrote:
>>>> On 12/31/2015 04:40 PM, Zhang Chen wrote:
>>>>> On 12/31/2015 10:36 AM, Jason Wang wrote:
>>>>>> On 12/22/2015 06:42 PM, Zhang Chen wrote:
>>>>>>> From: zhangchen
>>>>>>>
>>>>>>> Hi,all
>>>>>>>
>>>>>>> This patch add an colo-proxy object, COLO-Proxy is a part of COLO,
>>>>>>> based on qemu netfilter and it's a plugin for qemu netfilter. the
>>>>>>> function
>>>>>>> keep Secondary VM connect normal to Primary VM and compare packets
>>>>>>> sent by PVM to sent by SVM.if the packet difference,notify COLO do
>>>>>>> checkpoint and send all primary packet has queued.
>>>>>> Thanks for the work. I don't object this method but still not
>>>>>> convinced
>>>>>> that qemu is the best place for this.
>>>>>>
>>>>>> As been raised in the past discussion, it's almost impossible to
>>>>>> cooperate with vhost backends. If we want this to be used in
>>>>>> production
>>>>>> environment, need to think of a solution for vhost. There's no such
>>>>>> worry if we decouple this from qemu.
>>>>>>
>>>>>>> You can also get the series from:
>>>>>>>
>>>>>>> https://github.com/zhangckid/qemu/tree/colo-v2.2-periodic-mode-with-colo-proxyV2
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Usage:
>>>>>>>
>>>>>>> primary:
>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>> -object
>>>>>>> colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=host:port
>>>>>>>
>>>>>>> secondary:
>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>> -object
>>>>>>> colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=host:port
>>>>>> Have a quick glance at how secondary mode work. What it does is just
>>>>>> forwarding packets between a nic and a socket, qemu socket backend did
>>>>>> exact the same job. You could even use socket in primary node and let
>>>>>> packet compare module talk to both primary and secondary node.
>>>>> If we use qemu socket backend , the same netdev will used by qemu
>>>>> socket and
>>>>> qemu netfilter. this will against qemu net design. and then, when colo
>>>>> do failover,
>>>>> secondary do not have backend to use. that's the real problem.
>>>> Then, maybe it's time to implement changing the netdev of a nic. The
>>>> point here is that what secondary mode did is in fact a netdev backend
>>>> instead of a filter ...
>>> Currently, you are right. in colo-proxy V2 code, I just compare IP
>>> packet to
>>> decide whether to do checkpoint.
>>> But, in colo-proxy V3 I will compare tcp,icmp,udp packet to decide it.
>>> because that can reduce frequency of checkpoint and improve
>>> performance. To keep tcp connection well, colo secondary need to record
>>> primary guest's init seq and adjust secondary guest's ack. if colo do
>>> failover,
>>> secondary also need do this to old tcp connection. qemu socket
>>> can't do this job.
>> So a question here: is it a must to do things (e.g TCP analysis stuffs)
>> at secondary? Looks like we could do this at primary node. And I saw
>> you're doing packet comparing in primary node, any advantages of doing
>> this in primary instead of secondary?
> It needs to do this on the secondary; the trick is that things like TCP 
> sequence
> numbers are likely to be different on the primary and secondary; the kernel 
> colo-proxy
> implementation solved this problem by rewriting the sequence numbers on
> the secondary to match the primary, after a failover, the secondary has
> to keep doing that rewrite to ensure existing connections are OK.
> Thus it's holding some state about the current connections.

I see.

> I think also, to be able to do a 2nd failover (i.e. recover from the 1st 
> failure
> and then sometime later have another) you'd have to sync this
> state over to a new host, so again that says the state needs to be part of
> qemu or at least easily available to it.
>
> Dave

Right, if it does thing like tcp seq rewrite (which is missed in current
version), it works much more like a netfilter. Wonder if the function is
generic enough for users other than colo.

Thanks

>
>>> and another problem is do failover, if we use qemu socket
>>> to be backend in secondary, when colo do failover, I don't know how to
>>> change
>>> secondary be a normal qemu, if you know, please tell me.
>> Current qemu couldn't do this, but I mean we implement something like
>> nic_change_backend which can change nic's peer(s). With this, in
>> secondary, we can replace the socket backend with whatever you want (e.g
>> tap or other).
>>
>> Thanks
>>
>>>
>>> Thanks for your revew
>>> zhangchen 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
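
The sequence-number rewrite described in this thread can be sketched as follows. This is only an illustration of the idea (record the difference between the two guests' initial sequence numbers, then shift outgoing sequence numbers and incoming acks by it); it is not the colo-proxy implementation, and ToyConn and the toy_* helpers are hypothetical names. The per-connection delta is exactly the kind of state that has to survive a failover.

```c
#include <stdint.h>

/* Per-connection state kept on the secondary (hypothetical layout). */
typedef struct ToyConn {
    uint32_t seq_delta;  /* primary ISN minus secondary ISN, mod 2^32 */
} ToyConn;

/* Record the offset between the two guests' initial sequence numbers.
 * Unsigned arithmetic wraps modulo 2^32, matching TCP sequence space. */
static void toy_conn_init(ToyConn *c, uint32_t primary_isn,
                          uint32_t secondary_isn)
{
    c->seq_delta = primary_isn - secondary_isn;
}

/* Outgoing packet from the secondary guest: shift its sequence number
 * into the primary's sequence space. */
static uint32_t toy_rewrite_out_seq(const ToyConn *c, uint32_t seq)
{
    return seq + c->seq_delta;
}

/* Incoming packet for the secondary guest: shift the peer's ack back
 * into the secondary's own sequence space. */
static uint32_t toy_rewrite_in_ack(const ToyConn *c, uint32_t ack)
{
    return ack - c->seq_delta;
}
```

Because the delta must keep being applied to existing connections after the primary is gone, a second failover would need this state copied to the new host, which is the argument made above for keeping it inside or close to qemu.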

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-05 Thread Jason Wang



On 01/04/2016 07:17 PM, Zhang Chen wrote:
>
>
> On 01/04/2016 05:46 PM, Jason Wang wrote:
>>
>> On 01/04/2016 04:16 PM, Zhang Chen wrote:
>>>
>>> On 01/04/2016 01:37 PM, Jason Wang wrote:
>>>> On 12/31/2015 04:40 PM, Zhang Chen wrote:
>>>>> On 12/31/2015 10:36 AM, Jason Wang wrote:
>>>>>> On 12/22/2015 06:42 PM, Zhang Chen wrote:
>>>>>>> From: zhangchen
>>>>>>>
>>>>>>> Hi,all
>>>>>>>
>>>>>>> This patch add an colo-proxy object, COLO-Proxy is a part of COLO,
>>>>>>> based on qemu netfilter and it's a plugin for qemu netfilter. the
>>>>>>> function
>>>>>>> keep Secondary VM connect normal to Primary VM and compare packets
>>>>>>> sent by PVM to sent by SVM.if the packet difference,notify COLO do
>>>>>>> checkpoint and send all primary packet has queued.
>>>>>> Thanks for the work. I don't object this method but still not
>>>>>> convinced
>>>>>> that qemu is the best place for this.
>>>>>>
>>>>>> As been raised in the past discussion, it's almost impossible to
>>>>>> cooperate with vhost backends. If we want this to be used in
>>>>>> production
>>>>>> environment, need to think of a solution for vhost. There's no such
>>>>>> worry if we decouple this from qemu.
>>>>>>
>>>>>>> You can also get the series from:
>>>>>>>
>>>>>>> https://github.com/zhangckid/qemu/tree/colo-v2.2-periodic-mode-with-colo-proxyV2
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Usage:
>>>>>>>
>>>>>>> primary:
>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>> -object
>>>>>>> colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=host:port
>>>>>>>
>>>>>>> secondary:
>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>> -object
>>>>>>> colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=host:port
>>>>>> Have a quick glance at how secondary mode work. What it does is just
>>>>>> forwarding packets between a nic and a socket, qemu socket
>>>>>> backend did
>>>>>> exact the same job. You could even use socket in primary node and
>>>>>> let
>>>>>> packet compare module talk to both primary and secondary node.
>>>>> If we use qemu socket backend , the same netdev will used by qemu
>>>>> socket and
>>>>> qemu netfilter. this will against qemu net design. and then, when
>>>>> colo
>>>>> do failover,
>>>>> secondary do not have backend to use. that's the real problem.
>>>> Then, maybe it's time to implement changing the netdev of a nic. The
>>>> point here is that what secondary mode did is in fact a netdev backend
>>>> instead of a filter ...
>>> Currently, you are right. in colo-proxy V2 code, I just compare IP
>>> packet to
>>> decide whether to do checkpoint.
>>> But, in colo-proxy V3 I will compare tcp,icmp,udp packet to decide it.
>>> because that can reduce frequency of checkpoint and improve
>>> performance. To keep tcp connection well, colo secondary need to record
>>> primary guest's init seq and adjust secondary guest's ack. if colo do
>>> failover,
>>> secondary also need do this to old tcp connection. qemu socket
>>> can't do this job.
>> So a question here: is it a must to do things (e.g TCP analysis stuffs)
>> at secondary? Looks like we could do this at primary node. And I saw
>> you're doing packet comparing in primary node, any advantages of doing
>> this in primary instead of secondary?
>
> We think must  to do this in secondary, because if colo do
> failover,secondary
> must continues do TCP analysis stuffs to before tcp connection(if not,
> tcp connection
> will disconnect in that time), in this time primary already down or
> disconnect to
> secondary.so we can't make primary do this  TCP analysis stuffs.it can
> not ensure
> FT function.
>
> Thanks
> zhangchen

Makes sense.

Thanks

>
>>> and another problem is do failover, if we use qemu socket
>>> to be backend in secondary, when colo do failover, I don't know how to
>>> change
>>> secondary be a normal qemu, if you know, please tell me.
>> Current qemu couldn't do this, but I mean we implement something like
>> nic_change_backend which can change nic's peer(s). With this, in
>> secondary, we can replace the socket backend with whatever you want (e.g
>> tap or other).
>>
>> Thanks
>>
>>>
>>> Thanks for your revew
>>> zhangchen
>>
>>
>> .
>>
>

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-05 Thread lihuiba

At 2016-01-05 21:55:56, "Eric Blake"  wrote:
>On 01/05/2016 05:10 AM, lihuiba wrote:
>
>>>> In our production environment, we need to extend a qcow2 image with
>>>> snapshots in it.
>
>>> The thing is that one would need to update all the inactive L1 tables. I
>>> don't think it should be too difficult, it's just that apparently so far
>>> nobody ever had the need for this feature.
>
>Is resizing a snapshot really what you want?  Ideally, a snapshot tracks
>the data from a point in time, including the metadata of the size being
>tracked at that time.  Extending the snapshots then reverting to that
>snapshot means your guest would see a larger disk on revert than it did
>at the time the snapshot was created, which guests might not handle very
>well.
I want to make resizing (extending only) and snapshots independent of each
other; otherwise, going back and forth between snapshots may cause the disk to
shrink and grow. That would introduce some technical trouble, and possibly
confuse users as well.

Re: [Qemu-devel] [PATCH v2] l2tpv3: fix cookie decoding

2016-01-05 Thread Jason Wang



On 01/05/2016 07:26 AM, Alexis Dambricourt wrote:
> If a 32 bits l2tpv3 frame cookie MSB is set to 1, the cast to uint64_t
> cookie will spread 1 to the four most significant bytes.
> Then the condition (cookie != s->rx_cookie) becomes false.
>
> Signed-off-by: Alexis Dambricourt 
> ---
>  net/l2tpv3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/l2tpv3.c b/net/l2tpv3.c
> index 8e68e54..21d6119 100644
> --- a/net/l2tpv3.c
> +++ b/net/l2tpv3.c
> @@ -325,7 +325,7 @@ static int l2tpv3_verify_header(NetL2TPV3State *s, uint8_t *buf)
>  if (s->cookie_is_64) {
>  cookie = ldq_be_p(buf + s->cookie_offset);
>  } else {
> -cookie = ldl_be_p(buf + s->cookie_offset);
> +cookie = ldl_be_p(buf + s->cookie_offset) & 0xULL;
>  }
>  if (cookie != s->rx_cookie) {
>  if (!s->header_mismatch) {

Applied to my -net.

Thanks
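
The sign-extension effect behind this fix can be reproduced in isolation. The sketch below does not use QEMU's ldl_be_p(); it assumes a stand-in loader, toy_load_be32(), that returns a signed 32-bit value, and shows how widening that value to uint64_t differs with and without a 32-bit mask. All toy_* names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a big-endian 32-bit load that returns a signed value.
 * The unsigned-to-signed conversion is implementation-defined in C;
 * this sketch assumes the usual two's-complement behaviour. */
static int32_t toy_load_be32(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

/* Widening a negative int32_t to uint64_t sets the upper 32 bits. */
static uint64_t toy_cookie_unmasked(const uint8_t *p)
{
    return (uint64_t)toy_load_be32(p);
}

/* Masking after the widening keeps only the low 32 bits. */
static uint64_t toy_cookie_masked(const uint8_t *p)
{
    return (uint64_t)toy_load_be32(p) & 0xffffffffULL;
}
```

With the most significant bit of the cookie set, only the masked variant compares equal to a 32-bit cookie stored in a 64-bit field; for cookies with that bit clear, both variants agree.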

[Qemu-devel] [PATCH v3 4/4] Xen PCI passthru: convert to realize()

2016-01-05 Thread Cao jin

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.c | 53 -
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 3787c26..b058f61 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -760,10 +760,10 @@ static void xen_pt_destroy(PCIDevice *d) {
 }
 /* init */
 
-static int xen_pt_initfn(PCIDevice *d)
+static void xen_pt_realize(PCIDevice *d, Error **errp)
 {
 XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
-int rc = 0;
+int i, rc = 0;
 uint8_t machine_irq = 0, scratch;
 uint16_t cmd = 0;
 int pirq = XEN_PT_UNASSIGNED_PIRQ;
@@ -780,8 +780,8 @@ static int xen_pt_initfn(PCIDevice *d)
 s->hostaddr.slot, s->hostaddr.function,
 &local_err);
 if (local_err) {
-XEN_PT_ERR(d, "Failed to \"open\" the real pci device.\n");
-return -1;
+error_propagate(errp, local_err);
+return;
 }
 
 s->is_virtfn = s->real_device.is_virtfn;
@@ -801,19 +801,19 @@ static int xen_pt_initfn(PCIDevice *d)
 if ((s->real_device.domain == 0) && (s->real_device.bus == 0) &&
 (s->real_device.dev == 2) && (s->real_device.func == 0)) {
 if (!is_igd_vga_passthrough(&s->real_device)) {
-XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying"
-   " to passthrough IGD GFX.\n");
+error_setg(errp, "Need to enable igd-passthru if you're trying"
+" to passthrough IGD GFX.");
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 xen_pt_setup_vga(s, &s->real_device, &local_err);
 if (local_err) {
 error_append_hint(&local_err, "Setup VGA BIOS of passthrough"
 " GFX failed!");
-XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");
+error_propagate(errp, local_err);
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 /* Register ISA bridge for passthrough GFX. */
@@ -827,27 +827,26 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_pt_config_init(s, &local_err);
 if (local_err) {
 error_append_hint(&local_err, "PCI Config space initialisation failed");
-rc = -1;
+error_propagate(errp, local_err);
 goto err_out;
 }
 
 /* Bind interrupt */
 rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_INTERRUPT_PIN!");
 goto err_out;
 }
 if (!scratch) {
-XEN_PT_LOG(d, "no pin interrupt\n");
+error_setg(errp, "no pin interrupt");
 goto out;
 }
 
 machine_irq = s->real_device.irq;
 rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq);
-
 if (rc < 0) {
-XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n",
-   machine_irq, pirq, errno);
+error_setg_errno(errp, errno, "Mapping machine irq %u to"
+ " pirq %i failed", machine_irq, pirq);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -868,8 +867,8 @@ static int xen_pt_initfn(PCIDevice *d)
PCI_SLOT(d->devfn),
e_intx);
 if (rc < 0) {
-XEN_PT_ERR(d, "Binding of interrupt %i failed! (err: %d)\n",
-   e_intx, errno);
+error_setg_errno(errp, errno, "Binding of interrupt %i failed!",
+ e_intx);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -877,8 +876,8 @@ static int xen_pt_initfn(PCIDevice *d)
 
 if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
 if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) {
-XEN_PT_ERR(d, "Unmapping of machine interrupt %i failed!"
-   " (err: %d)\n", machine_irq, errno);
+error_setg_errno(errp, errno, "Unmapping of machine"
+" interrupt %i failed!", machine_irq);
 }
 }
 s->machine_irq = 0;
@@ -891,14 +890,14 @@ out:
 
 rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_COMMAND!");
 goto err_out;
 } else {
 val |= cmd;
 rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
 if (rc) {
-XEN_PT

[Qemu-devel] [PATCH v3 2/4] Add Error **errp for xen_pt_setup_vga()

2016-01-05 Thread Cao jin

To catch the error msg. Also modify the caller

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.c  |  5 -
 hw/xen/xen_pt.h  |  3 ++-
 hw/xen/xen_pt_graphics.c | 11 ++-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 1bd4109..fbce55c 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -807,7 +807,10 @@ static int xen_pt_initfn(PCIDevice *d)
 return -1;
 }
 
-if (xen_pt_setup_vga(s, &s->real_device) < 0) {
+xen_pt_setup_vga(s, &s->real_device, &local_err);
+if (local_err) {
+error_append_hint(&local_err, "Setup VGA BIOS of passthrough"
+" GFX failed!");
 XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");
 xen_host_pci_device_put(&s->real_device);
 return -1;
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index c545280..dc74d3e 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -328,5 +328,6 @@ static inline bool is_igd_vga_passthrough(XenHostPCIDevice *dev)
 }
 int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
 int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev);
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev);
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp);
 #endif /* !XEN_PT_H */
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index df6069b..a0a7e9c 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -161,7 +161,8 @@ struct pci_data {
 uint16_t reserved;
 } __attribute__((packed));
 
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp)
 {
 unsigned char *bios = NULL;
 struct rom_header *rom;
@@ -172,13 +173,14 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
 struct pci_data *pd = NULL;
 
 if (!is_igd_vga_passthrough(dev)) {
-return -1;
+error_setg(errp, "Need to enable igd-passthrough");
+return;
 }
 
 bios = get_vgabios(s, &bios_size, dev);
 if (!bios) {
-XEN_PT_ERR(&s->dev, "VGA: Can't getting VBIOS!\n");
-return -1;
+error_setg(errp, "VGA: Can't getting VBIOS!");
+return;
 }
 
 /* Currently we fixed this address as a primary. */
@@ -203,7 +205,6 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
 
 /* Currently we fixed this address as a primary for legacy BIOS. */
 cpu_physical_memory_rw(0xc, bios, bios_size, 1);
-return 0;
 }
 
 uint32_t igd_read_opregion(XenPCIPassthroughState *s)
-- 
2.1.0

[Qemu-devel] [PATCH v3 3/4] Add Error **errp for xen_pt_config_init()

2016-01-05 Thread Cao jin

To catch the error msg. Also modify the caller

Signed-off-by: Cao jin 
---
 hw/xen/xen_pt.c |  7 ---
 hw/xen/xen_pt.h |  2 +-
 hw/xen/xen_pt_config_init.c | 51 -
 3 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index fbce55c..3787c26 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -824,9 +824,10 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_pt_register_regions(s, &cmd);
 
 /* reinitialize each config register to be emulated */
-rc = xen_pt_config_init(s);
-if (rc) {
-XEN_PT_ERR(d, "PCI Config space initialisation failed.\n");
+xen_pt_config_init(s, &local_err);
+if (local_err) {
+error_append_hint(&local_err, "PCI Config space initialisation failed");
+rc = -1;
 goto err_out;
 }
 
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index dc74d3e..466525f 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -228,7 +228,7 @@ struct XenPCIPassthroughState {
 bool listener_set;
 };
 
-int xen_pt_config_init(XenPCIPassthroughState *s);
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp);
 void xen_pt_config_delete(XenPCIPassthroughState *s);
 XenPTRegGroup *xen_pt_find_reg_grp(XenPCIPassthroughState *s, uint32_t address);
 XenPTReg *xen_pt_find_reg(XenPTRegGroup *reg_grp, uint32_t address);
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 3d8686d..72b10dd 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1899,8 +1899,9 @@ static uint8_t find_cap_offset(XenPCIPassthroughState *s, uint8_t cap)
 return 0;
 }
 
-static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
-  XenPTRegGroup *reg_grp, XenPTRegInfo *reg)
+static void xen_pt_config_reg_init(XenPCIPassthroughState *s,
+  XenPTRegGroup *reg_grp, XenPTRegInfo *reg,
+  Error **errp)
 {
 XenPTReg *reg_entry;
 uint32_t data = 0;
@@ -1919,12 +1920,13 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
reg_grp->base_offset + reg->offset, &data);
 if (rc < 0) {
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Init emulate register fail");
+return;
 }
 if (data == XEN_PT_INVALID_REG) {
 /* free unused BAR register entry */
 g_free(reg_entry);
-return 0;
+return;
 }
 /* Sync up the data to dev.config */
 offset = reg_grp->base_offset + reg->offset;
@@ -1942,7 +1944,8 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 if (rc) {
 /* Serious issues when we cannot read the host values! */
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Cannot read host values");
+return;
 }
 /* Set bits in emu_mask are the ones we emulate. The dev.config shall
  * contain the emulated view of the guest - therefore we flip the mask
@@ -1967,10 +1970,10 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 val = data;
 
 if (val & ~size_mask) {
-XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04x expands past register size(%d)!\n",
-   offset, val, reg->size);
+error_setg(errp, "Offset 0x%04x:0x%04x expands past"
+" register size(%d)!", offset, val, reg->size);
 g_free(reg_entry);
-return -ENXIO;
+return;
 }
 /* This could be just pci_set_long as we don't modify the bits
  * past reg->size, but in case this routine is run in parallel or the
@@ -1990,13 +1993,12 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 }
 /* list add register entry */
 QLIST_INSERT_HEAD(&reg_grp->reg_tbl_list, reg_entry, entries);
-
-return 0;
 }
 
-int xen_pt_config_init(XenPCIPassthroughState *s)
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp)
 {
 int i, rc;
+Error *local_err = NULL;
 
 QLIST_INIT(&s->reg_grps);
 
@@ -2039,11 +2041,12 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
   reg_grp_offset,
   &reg_grp_entry->size);
 if (rc < 0) {
-XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, rc:%d\n",
-   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
+error_setg(&local_err, "Failed to initialize %d/%ld, type=0x%x,"
+   " rc:%d", i, ARRAY_SIZE(xen_pt_emu_reg_grps),
xen_pt_emu_reg_grps[i].grp_type, rc);
+error_propagate(errp, local_err);
 xen_pt_config_delete(s);
-return rc;
+ret

[Qemu-devel] [PATCH v3 0/4] Convert to realize()

2016-01-05 Thread Cao jin

v3 changelog:
1. use following style when we want to check the returned error

 Error *err = NULL;
 foo(arg, &err);
 if (err) {
 handle the error...
 error_propagate(errp, err);
 }
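
For readers unfamiliar with the style above, here is a self-contained sketch of the same control flow. It deliberately does not use QEMU's Error API; ToyError and the toy_* helpers are hypothetical stand-ins written for this illustration, and their behaviour is only an approximation of the real functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for an error object (not QEMU's Error). */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Store a new error in *errp unless the caller passed NULL or an
 * error is already pending there. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    ToyError *e;

    if (errp == NULL || *errp != NULL) {
        return;
    }
    e = malloc(sizeof(*e));
    if (e == NULL) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Hand a local error to the caller, or free it if the caller is not
 * interested (dst == NULL) or already holds an error. */
static void toy_error_propagate(ToyError **dst, ToyError *local)
{
    if (local == NULL) {
        return;
    }
    if (dst != NULL && *dst == NULL) {
        *dst = local;
    } else {
        free(local);
    }
}

/* A callee that reports failure only through errp. */
static void toy_open_device(int slot, ToyError **errp)
{
    if (slot < 0) {
        toy_error_set(errp, "invalid slot");
    }
}

/* The caller follows the changelog's style: local error, check it,
 * handle, then propagate. */
static void toy_realize(int slot, ToyError **errp)
{
    ToyError *local_err = NULL;

    toy_open_device(slot, &local_err);
    if (local_err) {
        /* handle the error (cleanup would go here) ... */
        toy_error_propagate(errp, local_err);
        return;
    }
}
```

Using a local error object lets the caller test for failure even when its own errp argument is NULL, which a direct test of *errp could not do safely.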

Cao jin (4):
  Add Error **errp for xen_host_pci_device_get()
  Add Error **errp for xen_pt_setup_vga()
  Add Error **errp for xen_pt_config_init()
  Xen PCI passthru: convert to realize()

 hw/xen/xen-host-pci-device.c | 106 +--
 hw/xen/xen-host-pci-device.h |   5 +-
 hw/xen/xen_pt.c  |  73 -
 hw/xen/xen_pt.h  |   5 +-
 hw/xen/xen_pt_config_init.c  |  51 +++--
 hw/xen/xen_pt_graphics.c |  11 +++--
 6 files changed, 141 insertions(+), 110 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v3 1/4] Add Error **errp for xen_host_pci_device_get()

2016-01-05 Thread Cao jin

To catch the error msg. Also modify the caller

Signed-off-by: Cao jin 
---
 hw/xen/xen-host-pci-device.c | 106 +--
 hw/xen/xen-host-pci-device.h |   5 +-
 hw/xen/xen_pt.c  |  12 +++--
 3 files changed, 71 insertions(+), 52 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 7d8a023..952cae0 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -49,7 +49,7 @@ static int xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
 
 /* This size should be enough to read the first 7 lines of a resource file */
 #define XEN_HOST_PCI_RESOURCE_BUFFER_SIZE 400
-static int xen_host_pci_get_resource(XenHostPCIDevice *d)
+static void xen_host_pci_get_resource(XenHostPCIDevice *d, Error **errp)
 {
 int i, rc, fd;
 char path[PATH_MAX];
@@ -60,23 +60,24 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 
 rc = xen_host_pci_sysfs_path(d, "resource", path, sizeof (path));
 if (rc) {
-return rc;
+error_setg_errno(errp, errno, "snprintf err");
+return;
 }
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_errno(errp, errno, "open %s err", path);
+return;
 }
 
 do {
 rc = read(fd, &buf, sizeof (buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
 buf[rc] = 0;
-rc = 0;
 
 s = buf;
 for (i = 0; i < PCI_NUM_REGIONS; i++) {
@@ -129,20 +130,20 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 d->rom.bus_flags = flags & IORESOURCE_BITS;
 }
 }
+
 if (i != PCI_NUM_REGIONS) {
 /* Invalid format or input to short */
-rc = -ENODEV;
+error_setg(errp, "Invalid format or input to short: %s", buf);
 }
 
 out:
 close(fd);
-return rc;
 }
 
 /* This size should be enough to read a long from a file */
 #define XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE 22
-static int xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
-  unsigned int *pvalue, int base)
+static void xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
+  unsigned int *pvalue, int base, Error **errp)
 {
 char path[PATH_MAX];
 char buf[XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE];
@@ -152,47 +153,52 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
 
 rc = xen_host_pci_sysfs_path(d, name, path, sizeof (path));
 if (rc) {
-return rc;
+error_setg_errno(errp, errno, "snprintf err");
+return;
 }
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_errno(errp, errno, "open %s err", path);
+return;
 }
+
 do {
 rc = read(fd, &buf, sizeof (buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
+
 buf[rc] = 0;
 value = strtol(buf, &endptr, base);
 if (endptr == buf || *endptr != '\n') {
-rc = -1;
+error_setg(errp, "format invalid: %s", buf);
 } else if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
-rc = -errno;
+error_setg_errno(errp, errno, "strtol err");
 } else {
-rc = 0;
 *pvalue = value;
 }
+
 out:
 close(fd);
-return rc;
 }
 
-static inline int xen_host_pci_get_hex_value(XenHostPCIDevice *d,
+static inline void xen_host_pci_get_hex_value(XenHostPCIDevice *d,
  const char *name,
- unsigned int *pvalue)
+ unsigned int *pvalue,
+ Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 16);
+xen_host_pci_get_value(d, name, pvalue, 16, errp);
 }
 
-static inline int xen_host_pci_get_dec_value(XenHostPCIDevice *d,
+static inline void xen_host_pci_get_dec_value(XenHostPCIDevice *d,
  const char *name,
- unsigned int *pvalue)
+ unsigned int *pvalue,
+ Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 10);
+xen_host_pci_get_value(d, name, pvalue, 10, errp);
 }
 
 static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
@@ -206,20 +212,21 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
 return !stat(path, &buf);
 }
 
-static int xen_ho
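
The value-parsing step in xen_host_pci_get_value() above (strtol, then checks on endptr and ERANGE) can be tried on its own with a sketch like the following. toy_parse_value() is a hypothetical helper: it returns negative errno-style codes instead of filling an Error object, and it clears errno before calling strtol, which is an addition of this sketch rather than something shown in the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a newline-terminated number, as read from a sysfs-style file.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on
 * overflow. *pvalue is written only on success. */
static int toy_parse_value(const char *buf, int base, unsigned int *pvalue)
{
    char *endptr;
    long value;

    errno = 0;  /* so a stale ERANGE cannot be misread below */
    value = strtol(buf, &endptr, base);
    if (endptr == buf || *endptr != '\n') {
        /* no digits consumed, or trailing text before the newline */
        return -EINVAL;
    }
    if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
        return -ERANGE;
    }
    *pvalue = (unsigned int)value;
    return 0;
}
```

The three branches correspond to the "format invalid", "strtol err", and success cases in the patch; only the reporting mechanism differs.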

Re: [Qemu-devel] [v15 12/15] vfio: add bus in reset flag

2016-01-05 Thread Chen Fan



On 01/06/2016 03:58 AM, Alex Williamson wrote:

On Tue, 2016-01-05 at 09:20 +0800, Cao jin wrote:

From: Chen Fan 

Mark the host bus as being in reset, to avoid multiple devices triggering
the host bus reset many times.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 6 ++
  include/hw/vfio/vfio-common.h | 1 +
  2 files changed, 7 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ee88db3..aa0d945 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2249,6 +2249,11 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  
  trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
  
+if (vdev->vbasedev.bus_in_reset) {
+vdev->vbasedev.bus_in_reset = false;
+return 0;
+}
+
  vfio_pci_pre_reset(vdev);
  vdev->vbasedev.needs_reset = false;
  
@@ -2312,6 +2317,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  }
  vfio_pci_pre_reset(tmp);
  tmp->vbasedev.needs_reset = false;
+tmp->vbasedev.bus_in_reset = true;
  multi = true;
  break;
  }
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f037f3c..44b19d7 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -95,6 +95,7 @@ typedef struct VFIODevice {
  bool reset_works;
  bool needs_reset;
  bool no_mmap;
+bool bus_in_reset;
  VFIODeviceOps *ops;
  unsigned int num_irqs;
  unsigned int num_regions;

I imagine this should be a VFIOPCIDevice field, it has no use in the
common code.  The name is also a bit confusing; when I suggested a
bus_in_reset flag, I was referring to a property on the bus itself that
the existing device_reset could query to switch modes rather than add a
separate callback as you've done in this series.  This works, but it's
perhaps more intrusive than I was thinking.  It will need to get
approval by qdev folks.

Maybe I don't get your point, but I don't think adding a bus_in_reset flag
to the bus makes much sense. For instance, suppose devices A and B from
different host buses are assigned to the same virtual bus, and all checks
pass. If an AER error occurs on device A, we should reset the virtual bus to
recover device A; we then also need to reset device B and do a host bus
reset for device B. But here bus_in_reset would just indicate that device B
does not need a host bus reset, which is incorrect. Right?



In any case, this bus_in_reset field is tracking whether a device has
already been reset as part of a hot reset, sort of a more bus-based
version with opposite polarity of needs_reset.  It doesn't actually
track the bus reset state at all, it tracks whether we should skip the
next call to hot reset for that device.  So it should probably be
something like VFIOPCIDevice.skip_hot_reset (though that's not a great
name either).
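
The "skip the next hot reset" semantics described above can be sketched in
isolation. This is only a toy model under stated assumptions: ToyDevice,
toy_hot_reset() and the host_resets counter are hypothetical names invented
for illustration, not the real VFIOPCIDevice or vfio_pci_hot_reset().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VFIOPCIDevice; only the fields needed here. */
typedef struct ToyDevice {
    bool needs_reset;
    bool skip_hot_reset;   /* set when a sibling's bus reset already covered us */
    int host_resets;       /* counts real host bus resets, for illustration */
} ToyDevice;

/* Reset every device sharing a host bus with 'dev' in a single operation. */
static int toy_hot_reset(ToyDevice *dev, ToyDevice **group, size_t n)
{
    size_t i;

    assert(dev != NULL && group != NULL);

    if (dev->skip_hot_reset) {
        /* Already reset as part of a sibling's hot reset: consume the flag. */
        dev->skip_hot_reset = false;
        return 0;
    }

    dev->needs_reset = false;
    for (i = 0; i < n; i++) {
        if (group[i] != dev) {
            /* The bus reset covers this sibling, so its own call can be skipped. */
            group[i]->needs_reset = false;
            group[i]->skip_hot_reset = true;
        }
    }
    dev->host_resets++;    /* the one real bus reset */
    return 0;
}
```

Note that the flag is per device and is cleared on first use, which is why
it tracks "skip the next call" rather than the state of the bus itself.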

I also wonder if a "hot" reset callback in qbus is really too PCI
centered, should it just be "bus_reset"?

Finally, it would be great if you could mention in the cover email
which patches are new or more than superficially modified from the
previous version so we can more easily focus on the new code to review.
  Thanks!


Oh, I am sorry; thanks for pointing that out. I will detail the changes
starting with the next version.

Thanks,
Chen



Alex



Re: [Qemu-devel] [PATCH v3 4/7] bcm2835_peripherals: add rollup device for bcm2835 peripherals

2016-01-05 Thread Alistair Francis

On Thu, Dec 31, 2015 at 4:31 PM, Andrew Baumann
 wrote:
> This device maintains all the non-CPU peripherals on bcm2835 (Pi1)
> which are also present on bcm2836 (Pi2). It also implements the
> private address spaces used for DMA and mailboxes.
>
> Signed-off-by: Andrew Baumann 
> ---
>
> Notes:
> v3:
>  * clean up raspi_platform.h
>  * s/_/-/ in type/property/child names
>  * use memory_region_init where appropriate rather than 
> memory_region_init_io
>  * pass ram as link property
>
> v2 changes:
>  * adapted to use common SDHCI emulation
>
>  hw/arm/Makefile.objs |   1 +
>  hw/arm/bcm2835_peripherals.c | 205 
> +++
>  include/hw/arm/bcm2835_peripherals.h |  42 +++
>  include/hw/arm/raspi_platform.h  | 128 ++
>  4 files changed, 376 insertions(+)
>  create mode 100644 hw/arm/bcm2835_peripherals.c
>  create mode 100644 include/hw/arm/bcm2835_peripherals.h
>  create mode 100644 include/hw/arm/raspi_platform.h
>
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index 2195b60..82cc142 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -11,6 +11,7 @@ obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o 
> pxa2xx_pic.o
>  obj-$(CONFIG_DIGIC) += digic.o
>  obj-y += omap1.o omap2.o strongarm.o
>  obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
> +obj-$(CONFIG_RASPI) += bcm2835_peripherals.o
>  obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
>  obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp.o xlnx-ep108.o
>  obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
> diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
> new file mode 100644
> index 000..879a41d
> --- /dev/null
> +++ b/hw/arm/bcm2835_peripherals.c
> @@ -0,0 +1,205 @@
> +/*
> + * Raspberry Pi emulation (c) 2012 Gregory Estrade
> + * Upstreaming code cleanup [including bcm2835_*] (c) 2013 Jan Petrous
> + *
> + * Raspberry Pi 2 emulation and refactoring Copyright (c) 2015, Microsoft
> + * Written by Andrew Baumann
> + *
> + * This code is licensed under the GNU GPLv2 and later.
> + */
> +
> +#include "hw/arm/bcm2835_peripherals.h"
> +#include "hw/misc/bcm2835_mbox_defs.h"
> +#include "hw/arm/raspi_platform.h"
> +
> +/* Peripheral base address on the VC (GPU) system bus */
> +#define BCM2835_VC_PERI_BASE 0x7e00
> +
> +/* Capabilities for SD controller: no DMA, high-speed, default clocks etc. */
> +#define BCM2835_SDHC_CAPAREG 0x52034b4
> +
> +static void bcm2835_peripherals_init(Object *obj)
> +{
> +BCM2835PeripheralState *s = BCM2835_PERIPHERALS(obj);
> +
> +/* Memory region for peripheral devices, which we export to our parent */
> +memory_region_init_io(&s->peri_mr, obj, NULL, s, "bcm2835-peripherals",
> +  0x100);
> +object_property_add_child(obj, "peripheral-io", OBJECT(&s->peri_mr), 
> NULL);
> +sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->peri_mr);
> +
> +/* Internal memory region for peripheral bus addresses (not exported) */
> +memory_region_init(&s->gpu_bus_mr, obj, "bcm2835-gpu", (uint64_t)1 << 
> 32);
> +object_property_add_child(obj, "gpu-bus", OBJECT(&s->gpu_bus_mr), NULL);
> +
> +/* Internal memory region for request/response communication with
> + * mailbox-addressable peripherals (not exported)
> + */
> +memory_region_init(&s->mbox_mr, obj, "bcm2835-mbox",
> +   MBOX_CHAN_COUNT << MBOX_AS_CHAN_SHIFT);
> +
> +/* Interrupt Controller */
> +object_initialize(&s->ic, sizeof(s->ic), TYPE_BCM2835_IC);
> +object_property_add_child(obj, "ic", OBJECT(&s->ic), NULL);
> +qdev_set_parent_bus(DEVICE(&s->ic), sysbus_get_default());
> +
> +/* UART0 */
> +s->uart0 = SYS_BUS_DEVICE(object_new("pl011"));
> +object_property_add_child(obj, "uart0", OBJECT(s->uart0), NULL);
> +qdev_set_parent_bus(DEVICE(s->uart0), sysbus_get_default());
> +
> +/* Mailboxes */
> +object_initialize(&s->mboxes, sizeof(s->mboxes), TYPE_BCM2835_MBOX);
> +object_property_add_child(obj, "mbox", OBJECT(&s->mboxes), NULL);
> +qdev_set_parent_bus(DEVICE(&s->mboxes), sysbus_get_default());
> +
> +object_property_add_const_link(OBJECT(&s->mboxes), "mbox-mr",
> +   OBJECT(&s->mbox_mr), &error_abort);
> +
> +/* Property channel */
> +object_initialize(&s->property, sizeof(s->property), 
> TYPE_BCM2835_PROPERTY);
> +object_property_add_child(obj, "property", OBJECT(&s->property), NULL);
> +qdev_set_parent_bus(DEVICE(&s->property), sysbus_get_default());
> +
> +object_property_add_const_link(OBJECT(&s->property), "dma-mr",
> +   OBJECT(&s->gpu_bus_mr), &error_abort);
> +
> +/* Extended Mass Media Controller */
> +object_initialize(&s->sdhci, sizeof(s->sdhci), TYPE_SYSBUS_SDHCI);
> +object_property_add_child(obj, "sdhci", OBJECT(&s->sdhci), NULL);
> +qdev_set_parent_bus(DEVICE(&s->sdh

Re: [Qemu-devel] [PATCH] hw/dma/xilinx_axidma: debug printf fixups

2016-01-05 Thread Alistair Francis

On Tue, Jan 5, 2016 at 7:32 AM, Andrew Jones  wrote:
> On Tue, Jan 05, 2016 at 07:07:22AM -0700, Eric Blake wrote:
>> On 01/05/2016 06:22 AM, Andrew Jones wrote:
>> > (Found by grepping for broken PRI users.)
>> >
>> > Signed-off-by: Andrew Jones 
>> > ---
>> >  hw/dma/xilinx_axidma.c | 8 
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c
>> > index b1cfa11356a26..2ab0772cd19ae 100644
>> > --- a/hw/dma/xilinx_axidma.c
>> > +++ b/hw/dma/xilinx_axidma.c
>> > @@ -180,10 +180,10 @@ static inline int streamid_from_addr(hwaddr addr)
>> >  #ifdef DEBUG_ENET
>> >  static void stream_desc_show(struct SDesc *d)
>> >  {
>> > -qemu_log("buffer_addr  = " PRIx64 "\n", d->buffer_address);
>> > -qemu_log("nxtdesc  = " PRIx64 "\n", d->nxtdesc);
>> > -qemu_log("control  = %x\n", d->control);
>> > -qemu_log("status   = %x\n", d->status);
>> > +qemu_log("buffer_addr  = 0x%" PRIx64 "\n", d->buffer_address);
>> > +qemu_log("nxtdesc  = 0x%" PRIx64 "\n", d->nxtdesc);
>> > +qemu_log("control  = 0x%x\n", d->control);
>> > +qemu_log("status   = 0x%x\n", d->status);
>>
>> This is dead code.  Nothing uses stream_desc_show() even when DEBUG_ENET
>> is defined.  I'd just delete the function and #ifdef altogether, instead.
>
> Sounds good, but I guess I'll leave the keep+fix vs. throw decision to the
> maintainers, rather than to submit a v2 ripping it out.

I don't see any reason to keep dead code around. I think it should be removed.

If you send a V2 removing it (or a new patch altogether) I'll review it.
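
For reference, the class of bug the original patch fixed (a PRI macro used
without its leading '%') can be shown with a small standalone sketch.
format_u64_field() is a hypothetical helper made up for this example; it is
not part of xilinx_axidma.c.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 64-bit descriptor field as "name = 0x...". */
static int format_u64_field(char *buf, size_t len, const char *name, uint64_t v)
{
    assert(buf != NULL && name != NULL);
    /* PRIx64 expands only to the length modifier and conversion letter
     * (for example "lx" or "llx"), so the '%' has to be written explicitly
     * in front of it. Without the '%', the letters are printed literally
     * and the argument is ignored. */
    return snprintf(buf, len, "%s = 0x%" PRIx64 "\n", name, v);
}
```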

Thanks,

Alistair

>
> Thanks,
> drew
>
>>
>> --
>> Eric Blake   eblake redhat com+1-919-301-3266
>> Libvirt virtualization library http://libvirt.org
>>
>
>
>

Re: [Qemu-devel] [PATCH v8 17/35] qapi: Drop unused 'kind' for struct/enum visit

2016-01-05 Thread Eric Blake

On 12/21/2015 10:08 AM, Eric Blake wrote:
> visit_start_struct() and visit_type_enum() had a 'kind' argument
> that was usually set to either the stringized version of the
> corresponding qapi type name, or to NULL (although some clients
> didn't even get that right).  But nothing ever used the argument.
> It's even hard to argue that it would be useful in a debugger,
> as a stack backtrace also tells which type is being visited.
> 
> Therefore, drop the 'kind' argument as dead.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v8: rebase to 'name' motion
> v7: new patch

> +++ b/qapi/qapi-visit-core.c
> @@ -2,6 +2,7 @@
>   * Core Definitions for QAPI Visitor Classes
>   *
>   * Copyright IBM, Corp. 2011
> + * Copyright (C) 2015 Red Hat, Inc.

Hmm (and more of a note to myself): Now that the year has rolled over,
if I have a reason to spin v9, I'll have to remember to update all my
Copyright lines to include 2016.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH v8 14.5/35] qapi: Update docs to match recent generator changes

2016-01-05 Thread Eric Blake

On 01/05/2016 05:01 PM, Eric Blake wrote:
> Several commits have been changing the generator, but not updating
> the docs to match (an obvious one is commit 9f08c8ec that made
> list types lazy, and thereby dropped UserDefOneList).  Rework the
> example to show slightly more output (we don't want to show too
> much; that's what the testsuite is for), and regenerate the output
> to match recent changes, including the previous patch's reordering
> of parameter positions.
> 
> Reported-by: Marc-André Lureau 
> Signed-off-by: Eric Blake 
> 
> ---
> v9: new patch

There's even more churn to the generated code later in this series; if I
have a good reason to spin v9, I'm probably going to sink this to the
bottom and do it only once, rather than trying to piecemeal each change
in place.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH 06/13] HBitmap: Introduce "meta" bitmap to track bit changes

2016-01-05 Thread John Snow



On 01/04/2016 05:27 AM, Fam Zheng wrote:
> Upon each bit toggle, the corresponding bit in the meta bitmap will be
> set.
> 
> Signed-off-by: Fam Zheng 
> ---
>  include/qemu/hbitmap.h |  8 +++
>  util/hbitmap.c | 61 
> +-
>  2 files changed, 54 insertions(+), 15 deletions(-)
> 
> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> index bb94a00..ed672e7 100644
> --- a/include/qemu/hbitmap.h
> +++ b/include/qemu/hbitmap.h
> @@ -181,6 +181,14 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap 
> *hb, uint64_t first);
>   */
>  unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
>  
> +/* hbitmap_create_meta
> + * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
> + *
> + * @hb: The HBitmap to operate on.
> + * @chunk_size: How many bits in @hb does one bit in the meta track.
> + */
> +HBitmap *hbitmap_create_meta(HBitmap *hb, int chunk_size);
> +
>  /**
>   * hbitmap_iter_next:
>   * @hbi: HBitmapIter to operate on.
> diff --git a/util/hbitmap.c b/util/hbitmap.c
> index 50b888f..55d3182 100644
> --- a/util/hbitmap.c
> +++ b/util/hbitmap.c
> @@ -81,6 +81,9 @@ struct HBitmap {
>   */
>  int granularity;
>  
> +/* A meta dirty bitmap to track the dirtiness of bits in this HBitmap. */
> +HBitmap *meta;
> +
>  /* A number of progressively less coarse bitmaps (i.e. level 0 is the
>   * coarsest).  Each bit in level N represents a word in level N+1 that
>   * has a set bit, except the last level where each bit represents the
> @@ -212,25 +215,27 @@ static uint64_t hb_count_between(HBitmap *hb, uint64_t 
> start, uint64_t last)
>  }
>  
>  /* Setting starts at the last layer and propagates up if an element
> - * changes from zero to non-zero.
> + * changes.
>   */

Isn't this comment wrong anyway? hb_set_elem does not propagate upward
by itself.
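
To make the change in return value concrete, here is a standalone sketch of
the two rules side by side. The toy_* names are hypothetical; the mask
computation mirrors the one in the quoted hunk, but this is not the actual
util/hbitmap.c code. Neither helper propagates anything by itself; in the
patch that is left to the caller, hb_set_between().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Build a mask with bits start..last (inclusive) set. */
static unsigned long toy_mask(int start, int last)
{
    unsigned long mask;

    assert(0 <= start && start <= last && last < TOY_BITS_PER_LONG);
    mask = 2UL << last;      /* wraps to 0 when last is the top bit */
    mask -= 1UL << start;
    return mask;
}

/* Old rule: report a change only when the word was previously all-zero. */
static bool toy_set_elem_old(unsigned long *elem, int start, int last)
{
    bool changed = (*elem == 0);

    *elem |= toy_mask(start, last);
    return changed;
}

/* New rule: report a change whenever at least one bit actually flipped. */
static bool toy_set_elem_new(unsigned long *elem, int start, int last)
{
    unsigned long old = *elem;

    *elem |= toy_mask(start, last);
    return old != *elem;
}
```

With a word that already has bit 0 set, setting bits 4..7 is invisible to
the old rule but reported by the new one, which is what a meta bitmap that
tracks every toggle would need.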

>  static inline bool hb_set_elem(unsigned long *elem, uint64_t start, uint64_t 
> last)
>  {
>  unsigned long mask;
> -bool changed;
> +unsigned long old;
>  
>  assert((last >> BITS_PER_LEVEL) == (start >> BITS_PER_LEVEL));
>  assert(start <= last);
>  
>  mask = 2UL << (last & (BITS_PER_LONG - 1));
>  mask -= 1UL << (start & (BITS_PER_LONG - 1));
> -changed = (*elem == 0);
> +old = *elem;
>  *elem |= mask;
> -return changed;
> +return old != *elem;
>  }
>  
> -/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
> -static void hb_set_between(HBitmap *hb, int level, uint64_t start, uint64_t 
> last)
> +/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
> + * Returns true if at least one bit is changed. */
> +static bool hb_set_between(HBitmap *hb, int level, uint64_t start,
> +   uint64_t last)
>  {
>  size_t pos = start >> BITS_PER_LEVEL;
>  size_t lastpos = last >> BITS_PER_LEVEL;
> @@ -259,22 +264,27 @@ static void hb_set_between(HBitmap *hb, int level, 
> uint64_t start, uint64_t last
>  if (level > 0 && changed) {
>  hb_set_between(hb, level - 1, pos, lastpos);
>  }
> +return changed;
>  }
>  
>  void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count)
>  {
>  /* Compute range in the last layer.  */
> +uint64_t first, n;
>  uint64_t last = start + count - 1;
>  
>  trace_hbitmap_set(hb, start, count,
>start >> hb->granularity, last >> hb->granularity);
>  
> -start >>= hb->granularity;
> +first = start >> hb->granularity;
>  last >>= hb->granularity;
> -count = last - start + 1;
> +n = last - first + 1;
>  
> -hb->count += count - hb_count_between(hb, start, last);
> -hb_set_between(hb, HBITMAP_LEVELS - 1, start, last);
> +hb->count += n - hb_count_between(hb, first, last);
> +if (hb_set_between(hb, HBITMAP_LEVELS - 1, first, last) &&
> +hb->meta) {
> +hbitmap_set(hb->meta, start, count);
> +}
>  }
>  
>  /* Resetting works the other way round: propagate up if the new
> @@ -295,8 +305,10 @@ static inline bool hb_reset_elem(unsigned long *elem, 
> uint64_t start, uint64_t l
>  return blanked;
>  }
>  
> -/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
> -static void hb_reset_between(HBitmap *hb, int level, uint64_t start, 
> uint64_t last)
> +/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
> + * Returns true if at least one bit is changed. */
> +static bool hb_reset_between(HBitmap *hb, int level, uint64_t start,
> + uint64_t last)
>  {
>  size_t pos = start >> BITS_PER_LEVEL;
>  size_t lastpos = last >> BITS_PER_LEVEL;
> @@ -339,21 +351,28 @@ static void hb_reset_between(HBitmap *hb, int level, 
> uint64_t start, uint64_t la
>  if (level > 0 && changed) {
>  hb_reset_between(hb, level - 1, pos, lastpos);
>  }
> +
> +return changed;
> +
>  }
>  
>  void hbitmap_reset(HBitmap *hb, uint64_t s

[Qemu-devel] Clang -Werror build fails against spice-server-devel 0.12.6-1.fc22

2016-01-05 Thread John Snow

Just as a heads up:

/home/bos/jhuston/src/qemu/hw/display/qxl.c:1071:6: error: 'set_mm_time'
is deprecated [-Werror,-Wdeprecated-declarations]
.set_mm_time = interface_set_mm_time,
 ^
/usr/include/spice-server/spice-qxl.h:162:12: note: 'set_mm_time' has
been explicitly marked deprecated here
void (*set_mm_time)(QXLInstance *qin, uint32_t mm_time)
SPICE_GNUC_DEPRECATED;
   ^
1 error generated.
/home/bos/jhuston/src/qemu/rules.mak:57: recipe for target
'hw/display/qxl.o' failed


Seems to have been triggered by a recent update to
spice-server-devel.x86_64 0.12.6-1.fc22.
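
A minimal sketch of how this kind of failure arises, and one generic way to
silence it locally. Everything here is an assumption for illustration:
ToyInterface and its members are made up, this is not the real spice-server
QXLInterface, and the pragma approach is not necessarily what QEMU should
adopt.

```c
#include <assert.h>

/* Hypothetical interface table; not the real spice-server QXLInterface. */
typedef struct ToyInterface {
    int version;
    /* Marked deprecated by the (imaginary) library header. */
    void (*set_mm_time)(int mm_time) __attribute__((deprecated));
} ToyInterface;

static int toy_last_mm_time = -1;

static void toy_set_mm_time(int mm_time)
{
    assert(mm_time >= 0);
    toy_last_mm_time = mm_time;
}

/* Naming the deprecated member in the initializer is what Clang flags under
 * -Wdeprecated-declarations; the pragmas suppress that one diagnostic for
 * this definition only. Clang also honours the "GCC diagnostic" spelling. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
static const ToyInterface toy_iface = {
    .version = 1,
    .set_mm_time = toy_set_mm_time,
};
#pragma GCC diagnostic pop
```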

[Qemu-devel] [PATCH v8 14.5/35] qapi: Update docs to match recent generator changes

2016-01-05 Thread Eric Blake

Several commits have been changing the generator, but not updating
the docs to match (an obvious one is commit 9f08c8ec that made
list types lazy, and thereby dropped UserDefOneList).  Rework the
example to show slightly more output (we don't want to show too
much; that's what the testsuite is for), and regenerate the output
to match recent changes, including the previous patch's reordering
of parameter positions.

Reported-by: Marc-André Lureau 
Signed-off-by: Eric Blake 

---
v9: new patch
---
 docs/qapi-code-gen.txt | 103 -
 1 file changed, 59 insertions(+), 44 deletions(-)

diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index 128f074..f9b1d0c 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -1,7 +1,7 @@
 = How to use the QAPI code generator =

 Copyright IBM Corp. 2011
-Copyright (C) 2012-2015 Red Hat, Inc.
+Copyright (C) 2012-2016 Red Hat, Inc.

 This work is licensed under the terms of the GNU GPL, version 2 or
 later. See the COPYING file in the top-level directory.
@@ -721,29 +721,34 @@ the names of built-in types.  Clients should examine 
member

 == Code generation ==

-Schemas are fed into four scripts to generate all the code/files that,
+Schemas are fed into five scripts to generate all the code/files that,
 paired with the core QAPI libraries, comprise everything required to
 take JSON commands read in by a Client JSON Protocol server, unmarshal
 the arguments into the underlying C types, call into the corresponding
-C function, and map the response back to a Client JSON Protocol
-response to be returned to the user.
+C function, map the response back to a Client JSON Protocol response
+to be returned to the user, and introspect the commands.

-As an example, we'll use the following schema, which describes a single
-complex user-defined type (which will produce a C struct, along with a list
-node structure that can be used to chain together a list of such types in
-case we want to accept/return a list of this type with a command), and a
-command which takes that type as a parameter and returns the same type:
+As an example, we'll use the following schema, which describes a
+single complex user-defined type, along with a command which takes a
+list of that type as a parameter.  The user is responsible for writing
+the implementation of qmp_my_command(); everything else is produced by
+the generator.

 $ cat example-schema.json
 { 'struct': 'UserDefOne',
-  'data': { 'integer': 'int', 'string': 'str' } }
+  'data': { 'integer': 'int', '*string': 'str' } }

 { 'command': 'my-command',
-  'data':{'arg1': 'UserDefOne'},
+  'data':{ 'arg1': ['UserDefOne'] },
   'returns': 'UserDefOne' }

 { 'event': 'MY_EVENT' }

+For a more thorough look at generated code, the testsuite includes
+tests/qapi-schema/qapi-schema-test.json that covers more examples of
+what the generator will accept, and compiles the resulting C code as
+part of 'make check-unit'.
+
 === scripts/qapi-types.py ===

 Used to generate the C types defined by a schema. The following files are
@@ -776,7 +781,7 @@ Example:

 qdv = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(qdv);
-visit_type_UserDefOne(v, &obj, NULL, NULL);
+visit_type_UserDefOne(v, NULL, &obj, NULL);
 qapi_dealloc_visitor_cleanup(qdv);
 }

@@ -791,7 +796,7 @@ Example:

 qdv = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(qdv);
-visit_type_UserDefOneList(v, &obj, NULL, NULL);
+visit_type_UserDefOneList(v, NULL, &obj, NULL);
 qapi_dealloc_visitor_cleanup(qdv);
 }
 $ cat qapi-generated/example-qapi-types.h
@@ -808,6 +813,7 @@ Example:

 struct UserDefOne {
 int64_t integer;
+bool has_string;
 char *string;
 };

@@ -854,34 +860,42 @@ Example:
 {
 Error *err = NULL;

-visit_type_int(v, &(*obj)->integer, "integer", &err);
+visit_type_int(v, "integer", &(*obj)->integer, &err);
 if (err) {
 goto out;
 }
-visit_type_str(v, &(*obj)->string, "string", &err);
-if (err) {
-goto out;
-}
-
-out:
-error_propagate(errp, err);
-}
-
-void visit_type_UserDefOne(Visitor *v, UserDefOne **obj, const char *name, 
Error **errp)
-{
-Error *err = NULL;
-
-visit_start_struct(v, (void **)obj, "UserDefOne", name, 
sizeof(UserDefOne), &err);
-if (!err) {
-if (*obj) {
-visit_type_UserDefOne_fields(v, obj, errp);
+if (visit_optional(v, "string", &(*obj)->has_string)) {
+visit_type_str(v, "string", &(*obj)->string, &err);
+if (err) {
+goto out;
 }
-visit_end_struct(v, &err);
 }
+
+out:
+error_propagate(errp, err);
+}
+
+void visit_type_UserDefOne(Visitor *v, const char *name,

Re: [Qemu-devel] [PATCH 08/17] isa: add an ISA DMA interface, and store it within the ISA bus

2016-01-05 Thread John Snow



On 12/29/2015 03:04 AM, Hervé Poussineau wrote:
> This will allow the global DMA_*() functions to be deprecated.
> 
> Signed-off-by: Hervé Poussineau 
> ---
>  hw/isa/isa-bus.c| 21 +
>  include/hw/isa/isa.h| 38 ++
>  include/qemu/typedefs.h |  1 +
>  3 files changed, 60 insertions(+)
> 
> diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
> index 43e0cd8..8887433 100644
> --- a/hw/isa/isa-bus.c
> +++ b/hw/isa/isa-bus.c
> @@ -36,6 +36,12 @@ static void isa_bus_class_init(ObjectClass *klass, void 
> *data)
>  k->get_fw_dev_path = isabus_get_fw_dev_path;
>  }
>  
> +static const TypeInfo isa_dma_info = {
> +.name = TYPE_ISADMA,
> +.parent = TYPE_INTERFACE,
> +.class_size = sizeof(IsaDmaClass),
> +};
> +
>  static const TypeInfo isa_bus_info = {
>  .name = TYPE_ISA_BUS,
>  .parent = TYPE_BUS,
> @@ -92,6 +98,20 @@ void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq)
>  dev->nirqs++;
>  }
>  
> +void isa_bus_dma(ISABus *bus, IsaDma *dma8, IsaDma *dma16)
> +{
> +assert(bus && dma8 && dma16);
> +assert(!bus->dma[0] && !bus->dma[1]);
> +bus->dma[0] = dma8;
> +bus->dma[1] = dma16;
> +}
> +
> +IsaDma *isa_get_dma(ISABus *bus, int nchan)
> +{
> +assert(bus);
> +return bus->dma[nchan > 3 ? 1 : 0];
> +}
> +
>  static inline void isa_init_ioport(ISADevice *dev, uint16_t ioport)
>  {
>  if (dev && (dev->ioport_id == 0 || ioport < dev->ioport_id)) {
> @@ -233,6 +253,7 @@ static const TypeInfo isa_device_type_info = {
>  
>  static void isabus_register_types(void)
>  {
> +type_register_static(&isa_dma_info);
>  type_register_static(&isa_bus_info);
>  type_register_static(&isabus_bridge_info);
>  type_register_static(&isa_device_type_info);
> diff --git a/include/hw/isa/isa.h b/include/hw/isa/isa.h
> index d84852b..193ceb2 100644
> --- a/include/hw/isa/isa.h
> +++ b/include/hw/isa/isa.h
> @@ -34,6 +34,41 @@ static inline uint16_t applesmc_port(void)
>  return 0;
>  }
>  
> +#define TYPE_ISADMA "isa-dma"
> +
> +#define ISADMA_CLASS(klass) \
> +OBJECT_CLASS_CHECK(IsaDmaClass, (klass), TYPE_ISADMA)
> +#define ISADMA_GET_CLASS(obj) \
> +OBJECT_GET_CLASS(IsaDmaClass, (obj), TYPE_ISADMA)
> +#define ISADMA(obj) \
> +INTERFACE_CHECK(IsaDma, (obj), TYPE_ISADMA)
> +
> +typedef struct IsaDma {
> +Object parent;
> +} IsaDma;
> +

You define IsaDma here, and

> +typedef enum {
> +ISADMA_TRANSFER_VERIFY,
> +ISADMA_TRANSFER_READ,
> +ISADMA_TRANSFER_WRITE,
> +ISADMA_TRANSFER_ILLEGAL,
> +} IsaDmaTransferMode;
> +
> +typedef struct IsaDmaClass {
> +InterfaceClass parent;
> +
> +IsaDmaTransferMode (*get_transfer_mode)(IsaDma *obj, int nchan);
> +bool (*has_autoinitialization)(IsaDma *obj, int nchan);
> +int (*read_memory)(IsaDma *obj, int nchan, void *buf, int pos, int len);
> +int (*write_memory)(IsaDma *obj, int nchan, void *buf, int pos, int len);
> +void (*hold_DREQ)(IsaDma *obj, int nchan);
> +void (*release_DREQ)(IsaDma *obj, int nchan);
> +void (*schedule)(IsaDma *obj);
> +void (*register_channel)(IsaDma *obj, int nchan,
> + DMA_transfer_handler transfer_handler,
> + void *opaque);
> +} IsaDmaClass;
> +
>  typedef struct ISADeviceClass {
>  DeviceClass parent_class;
>  } ISADeviceClass;
> @@ -46,6 +81,7 @@ struct ISABus {
>  MemoryRegion *address_space;
>  MemoryRegion *address_space_io;
>  qemu_irq *irqs;
> +IsaDma *dma[2];
>  };
>  
>  struct ISADevice {
> @@ -63,6 +99,8 @@ ISABus *isa_bus_new(DeviceState *dev, MemoryRegion 
> *address_space,
>  void isa_bus_irqs(ISABus *bus, qemu_irq *irqs);
>  qemu_irq isa_get_irq(ISADevice *dev, int isairq);
>  void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq);
> +void isa_bus_dma(ISABus *bus, IsaDma *dma8, IsaDma *dma16);
> +IsaDma *isa_get_dma(ISABus *bus, int nchan);
>  MemoryRegion *isa_address_space(ISADevice *dev);
>  MemoryRegion *isa_address_space_io(ISADevice *dev);
>  ISADevice *isa_create(ISABus *bus, const char *name);
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 78fe6e8..6ed91b4 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -33,6 +33,7 @@ typedef struct I2CBus I2CBus;
>  typedef struct I2SCodec I2SCodec;
>  typedef struct ISABus ISABus;
>  typedef struct ISADevice ISADevice;
> +typedef struct IsaDma IsaDma;

again here. Clang gets a little whiny about that.
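
For context: repeating an identical typedef is only permitted from C11 on,
and Clang reports it (-Wtypedef-redefinition) when building for an older
standard. A common way to avoid the duplicate is to keep the typedef in the
central header and define only the struct body elsewhere. The sketch below
uses hypothetical names (ToyDma, toy_dma_index) and is not the proposed
isa.h; the channel split copies the nchan > 3 test from isa_get_dma() in
the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* As in a central typedefs header: introduce the name exactly once. */
typedef struct ToyDma ToyDma;

/* As in the device header: define the struct body without a second typedef. */
struct ToyDma {
    int nchan;
};

/* Pick the controller slot for a channel: 0..3 use slot 0, higher use slot 1. */
static int toy_dma_index(const ToyDma *dma)
{
    assert(dma != NULL);
    return dma->nchan > 3 ? 1 : 0;
}
```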

>  typedef struct LoadStateEntry LoadStateEntry;
>  typedef struct MACAddr MACAddr;
>  typedef struct MachineClass MachineClass;
> 

-- 
—js

Re: [Qemu-devel] [PATCH 05/13] block: Hide HBitmap in block dirty bitmap interface

2016-01-05 Thread John Snow

Should we skip adding the typedef for HBitmapIter if we're just going to
use this instead?

On 01/04/2016 05:27 AM, Fam Zheng wrote:
> HBitmap is an implementation detail of block dirty bitmap that should be
> hidden from users. Introduce a BdrvDirtyBitmapIter to encapsulate the
> underlying HBitmapIter.
> 
> A small difference in the interface: previously an HBitmapIter was
> initialized in place; now the new BdrvDirtyBitmapIter must be dynamically
> allocated because the structure definition is in block.c.
> 
> Two current users are converted too.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/backup.c   | 14 --
>  block/dirty-bitmap.c | 36 +++-
>  block/mirror.c   | 14 --
>  include/block/dirty-bitmap.h |  7 +--
>  include/qemu/typedefs.h  |  1 +
>  5 files changed, 53 insertions(+), 19 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 56ddec0..2388039 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -326,14 +326,14 @@ static int coroutine_fn 
> backup_run_incremental(BackupBlockJob *job)
>  int64_t end;
>  int64_t last_cluster = -1;
>  BlockDriverState *bs = job->common.bs;
> -HBitmapIter hbi;
> +BdrvDirtyBitmapIter *dbi;
>  
>  granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
>  clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
> -bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
> +dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
>  
>  /* Find the next dirty sector(s) */
> -while ((sector = hbitmap_iter_next(&hbi)) != -1) {
> +while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
>  cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
>  
>  /* Fake progress updates for any clusters we skipped */
> @@ -345,7 +345,7 @@ static int coroutine_fn 
> backup_run_incremental(BackupBlockJob *job)
>  for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
>  do {
>  if (yield_and_check(job)) {
> -return ret;
> +goto out;
>  }
>  ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
>  BACKUP_SECTORS_PER_CLUSTER, 
> &error_is_read,
> @@ -353,7 +353,7 @@ static int coroutine_fn 
> backup_run_incremental(BackupBlockJob *job)
>  if ((ret < 0) &&
>  backup_error_action(job, error_is_read, -ret) ==
>  BLOCK_ERROR_ACTION_REPORT) {
> -return ret;
> +goto out;
>  }
>  } while (ret < 0);
>  }
> @@ -361,7 +361,7 @@ static int coroutine_fn 
> backup_run_incremental(BackupBlockJob *job)
>  /* If the bitmap granularity is smaller than the backup granularity,
>   * we need to advance the iterator pointer to the next cluster. */
>  if (granularity < BACKUP_CLUSTER_SIZE) {
> -bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
> +bdrv_set_dirty_iter(dbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
>  }
>  
>  last_cluster = cluster - 1;
> @@ -373,6 +373,8 @@ static int coroutine_fn 
> backup_run_incremental(BackupBlockJob *job)
>  job->common.offset += ((end - last_cluster - 1) * 
> BACKUP_CLUSTER_SIZE);
>  }
>  
> +out:
> +bdrv_dirty_iter_free(dbi);
>  return ret;
>  }
>  
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 7924c38..53cf88d 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -41,9 +41,15 @@ struct BdrvDirtyBitmap {
>  char *name; /* Optional non-empty unique ID */
>  int64_t size;   /* Size of the bitmap (Number of sectors) */
>  bool disabled;  /* Bitmap is read-only */
> +int active_iterators;   /* How many iterators are active */
>  QLIST_ENTRY(BdrvDirtyBitmap) list;
>  };
>  
> +struct BdrvDirtyBitmapIter {
> +HBitmapIter hbi;
> +BdrvDirtyBitmap *bitmap;
> +};
> +
>  BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char 
> *name)
>  {
>  BdrvDirtyBitmap *bm;
> @@ -221,6 +227,7 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
> BdrvDirtyBitmap *bitmap)
>  BdrvDirtyBitmap *bm, *next;
>  QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
>  if (bm == bitmap) {
> +assert(!bitmap->active_iterators);

Should we add any assertions into the truncate function, too?
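
A rough sketch of what that could look like, using a toy model rather than
the real block layer. ToyBitmap, ToyBitmapIter and the toy_* functions are
hypothetical; only the active_iterators counter mirrors the field added by
the patch.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyBitmap {
    int active_iterators;   /* how many iterators currently reference us */
    long size;
} ToyBitmap;

typedef struct ToyBitmapIter {
    ToyBitmap *bitmap;
} ToyBitmapIter;

/* Allocate an iterator and register it with its bitmap. */
static ToyBitmapIter *toy_iter_new(ToyBitmap *bm)
{
    ToyBitmapIter *it = malloc(sizeof(*it));

    if (!it) {
        abort();
    }
    it->bitmap = bm;
    bm->active_iterators++;
    return it;
}

/* Unregister and free an iterator; a NULL iterator is ignored. */
static void toy_iter_free(ToyBitmapIter *it)
{
    if (!it) {
        return;
    }
    assert(it->bitmap->active_iterators > 0);
    it->bitmap->active_iterators--;
    free(it);
}

/* Resizing while an iterator is live would invalidate its position,
 * so guard it the same way release is guarded. */
static void toy_bitmap_truncate(ToyBitmap *bm, long new_size)
{
    assert(!bm->active_iterators);
    bm->size = new_size;
}
```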

>  assert(!bdrv_dirty_bitmap_frozen(bm));
>  QLIST_REMOVE(bitmap, list);
>  hbitmap_free(bitmap->bitmap);
> @@ -299,9 +306,29 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
> *bitmap)
>  return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
>  }
>  
> -void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
> +BdrvDirtyBitm

Re: [Qemu-devel] [PATCH v8 14/35] qapi: Swap visit_* arguments for consistent 'name' placement

2016-01-05 Thread Eric Blake

On 01/05/2016 08:32 AM, Eric Blake wrote:

>>
>> However, docs/qapi-code-gen.txt should be updated in a follow-up patch.
> 
> D'oh - I knew I'd forget something :)  You're right, of course.

For that matter, several recent patches have tweaked generated code
without updating the docs, such as 9f08c8ec dropping the automatic
creation of a corresponding UserDefOneList type for the example as
given.  I'll bring it back in sync, but now I have to decide how much
(or how little) to actually demonstrate in the docs
(qapi-schema-test.json is a much more thorough testing mechanism for all
the corner cases; the docs only have to demonstrate a couple of common
examples).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org


Re: [Qemu-devel] [PATCH 04/13] block: Remove unused typedef of BlockDriverDirtyHandler

2016-01-05 Thread John Snow



On 01/04/2016 05:27 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  include/block/block.h | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 97e9b5e..0f42964 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -320,8 +320,6 @@ BlockDriverState *check_to_replace_node(BlockDriverState 
> *parent_bs,
>  const char *node_name, Error **errp);
>  
>  /* async block I/O */
> -typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
> - int sector_num);
>  BlockAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
> QEMUIOVector *iov, int nb_sectors,
> BlockCompletionFunc *cb, void *opaque);
> 

Can be pulled independently of this series.

Reviewed-by: John Snow

Re: [Qemu-devel] [PATCH 03/13] block: Move block dirty bitmap code to separate files

2016-01-05 Thread John Snow



On 01/04/2016 05:27 AM, Fam Zheng wrote:
> The only change is making bdrv_dirty_bitmap_truncate public. It is used in
> block.c.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c  | 339 ---
>  block/Makefile.objs  |   2 +-
>  block/dirty-bitmap.c | 366 
> +++
>  include/block/block.h|  37 +
>  include/block/dirty-bitmap.h |  42 +
>  5 files changed, 410 insertions(+), 376 deletions(-)
>  create mode 100644 block/dirty-bitmap.c
>  create mode 100644 include/block/dirty-bitmap.h
> 
> diff --git a/block.c b/block.c
> index 411edbf..b544190 100644
> --- a/block.c
> +++ b/block.c
> @@ -55,23 +55,6 @@
>  #include 
>  #endif
>  
> -/**
> - * A BdrvDirtyBitmap can be in three possible states:
> - * (1) successor is NULL and disabled is false: full r/w mode
> - * (2) successor is NULL and disabled is true: read only mode ("disabled")
> - * (3) successor is set: frozen mode.
> - * A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
> - * or enabled. A frozen bitmap can only abdicate() or reclaim().
> - */
> -struct BdrvDirtyBitmap {
> -HBitmap *bitmap;/* Dirty sector bitmap implementation */
> -BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
> -char *name; /* Optional non-empty unique ID */
> -int64_t size;   /* Size of the bitmap (Number of sectors) */
> -bool disabled;  /* Bitmap is read-only */
> -QLIST_ENTRY(BdrvDirtyBitmap) list;
> -};
> -
>  #define NOT_DONE 0x7fff /* used while emulated sync operation in 
> progress */
>  
>  struct BdrvStates bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states);
> @@ -87,7 +70,6 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
> char *filename,
>   BlockDriverState *parent,
>   const BdrvChildRole *child_role, Error **errp);
>  
> -static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
>  /* If non-zero, use only whitelisted block drivers */
>  static int use_bdrv_whitelist;
>  
> @@ -3375,327 +3357,6 @@ void bdrv_lock_medium(BlockDriverState *bs, bool 
> locked)
>  }
>  }
>  
> -BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char 
> *name)
> -{
> -BdrvDirtyBitmap *bm;
> -
> -assert(name);
> -QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
> -if (bm->name && !strcmp(name, bm->name)) {
> -return bm;
> -}
> -}
> -return NULL;
> -}
> -
> -void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
> -{
> -assert(!bdrv_dirty_bitmap_frozen(bitmap));
> -g_free(bitmap->name);
> -bitmap->name = NULL;
> -}
> -
> -BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
> -  uint32_t granularity,
> -  const char *name,
> -  Error **errp)
> -{
> -int64_t bitmap_size;
> -BdrvDirtyBitmap *bitmap;
> -uint32_t sector_granularity;
> -
> -assert((granularity & (granularity - 1)) == 0);
> -
> -if (name && bdrv_find_dirty_bitmap(bs, name)) {
> -error_setg(errp, "Bitmap already exists: %s", name);
> -return NULL;
> -}
> -sector_granularity = granularity >> BDRV_SECTOR_BITS;
> -assert(sector_granularity);
> -bitmap_size = bdrv_nb_sectors(bs);
> -if (bitmap_size < 0) {
> -error_setg_errno(errp, -bitmap_size, "could not get length of 
> device");
> -errno = -bitmap_size;
> -return NULL;
> -}
> -bitmap = g_new0(BdrvDirtyBitmap, 1);
> -bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
> -bitmap->size = bitmap_size;
> -bitmap->name = g_strdup(name);
> -bitmap->disabled = false;
> -QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
> -return bitmap;
> -}
> -
> -bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
> -{
> -return bitmap->successor;
> -}
> -
> -bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
> -{
> -return !(bitmap->disabled || bitmap->successor);
> -}
> -
> -DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
> -{
> -if (bdrv_dirty_bitmap_frozen(bitmap)) {
> -return DIRTY_BITMAP_STATUS_FROZEN;
> -} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
> -return DIRTY_BITMAP_STATUS_DISABLED;
> -} else {
> -return DIRTY_BITMAP_STATUS_ACTIVE;
> -}
> -}
> -
> -/**
> - * Create a successor bitmap destined to replace this bitmap after an 
> operation.
> - * Requires that the bitmap is not frozen and has no successor.
> - */
> -int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
> -   BdrvDirtyBitmap *bitmap, Error **errp)
> -{
> -uint64_t granularity;
> -BdrvDirtyBitmap *child

Re: [Qemu-devel] [PATCH 02/13] typedefs: Add BdrvDirtyBitmap and HBitmapIter

2016-01-05 Thread John Snow



On 01/04/2016 05:27 AM, Fam Zheng wrote:
> Following patches to refactor and move block dirty bitmap code could use this.
> 
> Signed-off-by: Fam Zheng 
> ---
>  include/qemu/typedefs.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 78fe6e8..e83934e 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -10,6 +10,7 @@ typedef struct AddressSpace AddressSpace;
>  typedef struct AioContext AioContext;
>  typedef struct AllwinnerAHCIState AllwinnerAHCIState;
>  typedef struct AudioState AudioState;
> +typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
>  typedef struct BlockBackend BlockBackend;
>  typedef struct BlockBackendRootState BlockBackendRootState;
>  typedef struct BlockDriverState BlockDriverState;
> @@ -28,6 +29,7 @@ typedef struct EventNotifier EventNotifier;
>  typedef struct FWCfgIoState FWCfgIoState;
>  typedef struct FWCfgMemState FWCfgMemState;
>  typedef struct FWCfgState FWCfgState;
> +typedef struct HBitmapIter HBitmapIter;
>  typedef struct HCIInfo HCIInfo;
>  typedef struct I2CBus I2CBus;
>  typedef struct I2SCodec I2SCodec;
> 

Should the existing typedefs be removed?

>> include/block/block.h:typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
>> include/block/block.h:struct HBitmapIter;
>> include/qemu/hbitmap.h:typedef struct HBitmapIter HBitmapIter;

[Qemu-devel] [PATCH v4 1/1] xlnx-zynqmp: Add support for high DDR memory regions

2016-01-05 Thread Alistair Francis

The Xilinx ZynqMP SoC and EP108 board support three memory regions:
 - A 2GB region starting at 0
 - A 32GB region starting at 32GB
 - A 256GB region starting at 768GB

This patch adds support for the first two memory regions, which are
automatically created based on the size specified by the QEMU memory
command line argument.

On hardware the physical memory is one continuous region, which the DDRC
then maps into the three different regions. As we don't model the DDRC,
this is done at startup by QEMU. The board creates the memory region and
then passes it to the SoC. The SoC then maps the memory regions.

Signed-off-by: Alistair Francis 
Reviewed-by: Peter Crosthwaite 
---
V4:
 - Small fixes
 - Localisation of ram_size
V3:
 - Assert on the RAM sizes
 - Remove ram_size property
 - General fixes
V2:
 - Create one continuous memory region and pass it to the SoC

Also, the Xilinx ZynqMP TRM is available at:
http://www.xilinx.com/products/silicon-devices/soc/zynq-ultrascale-mpsoc.html?resultsTablePreSelect=documenttype:User%20Guides#documentation
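
As a rough illustration of the layout described in the commit message (a 2GB
region at 0 and a 32GB region at 32GB), the split of a single RAM size into
a low and a high part could be sketched as below. This is not the code from
the patch; every SKETCH_* / sketch_* name is made up for the example, and
only the region sizes and base addresses come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical; only the sizes come from the commit text. */
#define SKETCH_GIB        (1024ULL * 1024 * 1024)
#define SKETCH_LOW_MAX    (2 * SKETCH_GIB)   /* 2GB region starting at 0 */
#define SKETCH_HIGH_BASE  (32 * SKETCH_GIB)  /* second region starts at 32GB */
#define SKETCH_HIGH_MAX   (32 * SKETCH_GIB)  /* ... and is 32GB long */

typedef struct SketchDdrSplit {
    uint64_t low_size;   /* bytes mapped at address 0 */
    uint64_t high_base;  /* where the remainder is mapped */
    uint64_t high_size;  /* bytes mapped at high_base (0 if none) */
} SketchDdrSplit;

/* Split one contiguous RAM size across the low and high windows. */
static SketchDdrSplit sketch_split_ddr(uint64_t ram_size)
{
    SketchDdrSplit split = { 0, SKETCH_HIGH_BASE, 0 };

    /* Mirrors the "assert on the RAM sizes" idea: callers check first. */
    assert(ram_size <= SKETCH_LOW_MAX + SKETCH_HIGH_MAX);

    if (ram_size > SKETCH_LOW_MAX) {
        /* Fill the low window, spill the rest into the high window. */
        split.low_size = SKETCH_LOW_MAX;
        split.high_size = ram_size - SKETCH_LOW_MAX;
    } else {
        split.low_size = ram_size;
    }
    return split;
}
```

With 3GB of RAM this gives 2GB in the low window and 1GB at the 32GB mark;
with 1GB everything stays in the low window.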

 hw/arm/xlnx-ep108.c  | 35 ++-
 hw/arm/xlnx-zynqmp.c | 37 +
 include/hw/arm/xlnx-zynqmp.h | 12 
 3 files changed, 67 insertions(+), 17 deletions(-)

diff --git a/hw/arm/xlnx-ep108.c b/hw/arm/xlnx-ep108.c
index 85b978f..1c34774 100644
--- a/hw/arm/xlnx-ep108.c
+++ b/hw/arm/xlnx-ep108.c
@@ -25,9 +25,6 @@ typedef struct XlnxEP108 {
 MemoryRegion ddr_ram;
 } XlnxEP108;
 
-/* Max 2GB RAM */
-#define EP108_MAX_RAM_SIZE 0x8000ull
-
 static struct arm_boot_info xlnx_ep108_binfo;
 
 static void xlnx_ep108_init(MachineState *machine)
@@ -35,20 +32,12 @@ static void xlnx_ep108_init(MachineState *machine)
 XlnxEP108 *s = g_new0(XlnxEP108, 1);
 Error *err = NULL;
 
-object_initialize(&s->soc, sizeof(s->soc), TYPE_XLNX_ZYNQMP);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(&s->soc),
-  &error_abort);
-
-object_property_set_bool(OBJECT(&s->soc), true, "realized", &err);
-if (err) {
-error_report("%s", error_get_pretty(err));
-exit(1);
-}
-
-if (machine->ram_size > EP108_MAX_RAM_SIZE) {
+/* Create the memory region to pass to the SoC */
+if (machine->ram_size > XLNX_ZYNQMP_MAX_RAM_SIZE) {
 error_report("WARNING: RAM size " RAM_ADDR_FMT " above max supported, "
- "reduced to %llx", machine->ram_size, EP108_MAX_RAM_SIZE);
-machine->ram_size = EP108_MAX_RAM_SIZE;
+ "reduced to %llx", machine->ram_size,
+ XLNX_ZYNQMP_MAX_RAM_SIZE);
+machine->ram_size = XLNX_ZYNQMP_MAX_RAM_SIZE;
 }
 
 if (machine->ram_size < 0x0800) {
@@ -58,7 +47,19 @@ static void xlnx_ep108_init(MachineState *machine)
 
 memory_region_allocate_system_memory(&s->ddr_ram, NULL, "ddr-ram",
  machine->ram_size);
-memory_region_add_subregion(get_system_memory(), 0, &s->ddr_ram);
+
+object_initialize(&s->soc, sizeof(s->soc), TYPE_XLNX_ZYNQMP);
+object_property_add_child(OBJECT(machine), "soc", OBJECT(&s->soc),
+  &error_abort);
+
+object_property_set_link(OBJECT(&s->soc), OBJECT(&s->ddr_ram),
+ "ddr-ram", &error_abort);
+
+object_property_set_bool(OBJECT(&s->soc), true, "realized", &err);
+if (err) {
+error_report("%s", error_get_pretty(err));
+exit(1);
+}
 
 xlnx_ep108_binfo.ram_size = machine->ram_size;
 xlnx_ep108_binfo.kernel_filename = machine->kernel_filename;
diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
index 87553bb..b9b8bee 100644
--- a/hw/arm/xlnx-zynqmp.c
+++ b/hw/arm/xlnx-zynqmp.c
@@ -90,6 +90,11 @@ static void xlnx_zynqmp_init(Object *obj)
   &error_abort);
 }
 
+object_property_add_link(obj, "ddr-ram", TYPE_MEMORY_REGION,
+ (Object **)&s->ddr_ram,
+ qdev_prop_allow_set_link_before_realize,
+ OBJ_PROP_LINK_UNREF_ON_RELEASE, &error_abort);
+
 object_initialize(&s->gic, sizeof(s->gic), TYPE_ARM_GIC);
 qdev_set_parent_bus(DEVICE(&s->gic), sysbus_get_default());
 
@@ -119,10 +124,42 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error 
**errp)
 XlnxZynqMPState *s = XLNX_ZYNQMP(dev);
 MemoryRegion *system_memory = get_system_memory();
 uint8_t i;
+uint64 ram_size;
 const char *boot_cpu = s->boot_cpu ? s->boot_cpu : "apu-cpu[0]";
+ram_addr_t ddr_low_size, ddr_high_size;
 qemu_irq gic_spi[GIC_NUM_SPI_INTR];
 Error *err = NULL;
 
+ram_size = memory_region_size(s->ddr_ram);
+
+/* Create the DDR Memory Regions. User friendly checks should happen at
+ * the board level
+ */
+if (ram_size > XLNX_ZYNQMP_MAX_LOW_RAM_SIZE) {
+/* The RAM size is above the maxi

Re: [Qemu-devel] [PATCH 10/17] magnum: disable floppy DMA for now

2016-01-05 Thread John Snow



On 12/29/2015 03:04 AM, Hervé Poussineau wrote:
> Floppy uses the DMA controller in rc4030 chipset, and not the i8259 from the 
> ISA bus.
> It's better to disable DMA than to call the wrong DMA controller.
> 

I'll trust that these platforms' FDCs are already terribly broken and
unusable; I've not tested them personally.

> Signed-off-by: Hervé Poussineau 
> ---
>  hw/block/fdc.c | 5 +++--
>  hw/mips/mips_jazz.c| 3 ++-
>  include/hw/block/fdc.h | 2 +-
>  3 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> index 4292ece..cfdd625 100644
> --- a/hw/block/fdc.c
> +++ b/hw/block/fdc.c
> @@ -2255,7 +2255,7 @@ ISADevice *fdctrl_init_isa(ISABus *bus, DriveInfo **fds)
>  return isadev;
>  }
>  
> -void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
> +void fdctrl_init_sysbus(qemu_irq irq, int dma_chann, IsaDma *dma,
>  hwaddr mmio_base, DriveInfo **fds)
>  {
>  FDCtrl *fdctrl;
> @@ -2266,7 +2266,8 @@ void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
>  dev = qdev_create(NULL, "sysbus-fdc");
>  sys = SYSBUS_FDC(dev);
>  fdctrl = &sys->state;
> -fdctrl->dma_chann = dma_chann; /* FIXME */
> +fdctrl->dma = dma;

You haven't added this field yet; so this breaks the bisect.

> +fdctrl->dma_chann = dma ? dma_chann : -1;
>  if (fds[0]) {
>  qdev_prop_set_drive_nofail(dev, "driveA", 
> blk_by_legacy_dinfo(fds[0]));
>  }
> diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c
> index 64a0de2..300c199 100644
> --- a/hw/mips/mips_jazz.c
> +++ b/hw/mips/mips_jazz.c
> @@ -296,7 +296,8 @@ static void mips_jazz_init(MachineState *machine,
>  for (n = 0; n < MAX_FD; n++) {
>  fds[n] = drive_get(IF_FLOPPY, 0, n);
>  }
> -fdctrl_init_sysbus(qdev_get_gpio_in(rc4030, 1), 0, 0x80003000, fds);
> +/* FIXME: we should enable DMA with a custom IsaDma device */
> +fdctrl_init_sysbus(qdev_get_gpio_in(rc4030, 1), 0, NULL, 0x80003000, 
> fds);
>  
>  /* Real time clock */
>  rtc_init(isa_bus, 1980, NULL);
> diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
> index d48b2f8..f92e44f 100644
> --- a/include/hw/block/fdc.h
> +++ b/include/hw/block/fdc.h
> @@ -16,7 +16,7 @@ typedef enum FDriveType {
>  #define TYPE_ISA_FDC "isa-fdc"
>  
>  ISADevice *fdctrl_init_isa(ISABus *bus, DriveInfo **fds);
> -void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
> +void fdctrl_init_sysbus(qemu_irq irq, int dma_chann, IsaDma *dma,
>  hwaddr mmio_base, DriveInfo **fds);
>  void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
> DriveInfo **fds, qemu_irq *fdc_tc);
>

Re: [Qemu-devel] [PATCH 09/17] i8257: implement the IsaDma interface

2016-01-05 Thread John Snow

Accidental duplicate send of patch #09 with minor differences in the
commit message... Assuming this one is the "correct" one and just
discarding the other.

On 12/29/2015 03:04 AM, Hervé Poussineau wrote:
> Rewrite the global DMA_*() functions to use the IsaDma interface.
> Note that these functions will be deleted in a few commits.
> 
> Signed-off-by: Hervé Poussineau 
> ---
>  hw/dma/i8257.c | 148 
> +
>  1 file changed, 117 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/dma/i8257.c b/hw/dma/i8257.c
> index 20231d6..cf4b0a7 100644
> --- a/hw/dma/i8257.c
> +++ b/hw/dma/i8257.c
> @@ -77,8 +77,6 @@ typedef struct I8257State {
>  int running;
>  } I8257State;
>  
> -static I8257State *dma_controllers[2];
> -
>  enum {
>  CMD_MEMORY_TO_MEMORY = 0x01,
>  CMD_FIXED_ADDRESS= 0x02,
> @@ -321,31 +319,36 @@ static uint64_t i8257_read_cont(void *opaque, hwaddr 
> nport, unsigned size)
>  return val;
>  }
>  
> -int DMA_get_channel_mode (int nchan)
> +static IsaDmaTransferMode i8257_dma_get_transfer_mode(IsaDma *obj, int nchan)
> +{
> +I8257State *d = I8257(obj);
> +return (d->regs[nchan & 3].mode >> 2) & 3;
> +}
> +
> +static bool i8257_dma_has_autoinitialization(IsaDma *obj, int nchan)
>  {
> -return dma_controllers[nchan > 3]->regs[nchan & 3].mode;
> +I8257State *d = I8257(obj);
> +return (d->regs[nchan & 3].mode >> 4) & 1;
>  }
>  
> -void DMA_hold_DREQ (int nchan)
> +static void i8257_dma_hold_DREQ(IsaDma *obj, int nchan)
>  {
> -int ncont, ichan;
> +I8257State *d = I8257(obj);
> +int ichan;
>  
> -ncont = nchan > 3;
>  ichan = nchan & 3;
> -linfo ("held cont=%d chan=%d\n", ncont, ichan);
> -dma_controllers[ncont]->status |= 1 << (ichan + 4);
> -i8257_dma_run(dma_controllers[ncont]);
> +d->status |= 1 << (ichan + 4);
> +i8257_dma_run(d);
>  }
>  
> -void DMA_release_DREQ (int nchan)
> +static void i8257_dma_release_DREQ(IsaDma *obj, int nchan)
>  {
> -int ncont, ichan;
> +I8257State *d = I8257(obj);
> +int ichan;
>  
> -ncont = nchan > 3;
>  ichan = nchan & 3;
> -linfo ("released cont=%d chan=%d\n", ncont, ichan);
> -dma_controllers[ncont]->status &= ~(1 << (ichan + 4));
> -i8257_dma_run(dma_controllers[ncont]);
> +d->status &= ~(1 << (ichan + 4));
> +i8257_dma_run(d);
>  }
>  
>  static void i8257_channel_run(I8257State *d, int ichan)
> @@ -405,24 +408,26 @@ out:
>  }
>  }
>  
> -void DMA_register_channel (int nchan,
> -   DMA_transfer_handler transfer_handler,
> -   void *opaque)
> +static void i8257_dma_register_channel(IsaDma *obj, int nchan,
> +   DMA_transfer_handler transfer_handler,
> +   void *opaque)
>  {
> +I8257State *d = I8257(obj);
>  struct dma_regs *r;
> -int ichan, ncont;
> +int ichan;
>  
> -ncont = nchan > 3;
>  ichan = nchan & 3;
>  
> -r = dma_controllers[ncont]->regs + ichan;
> +r = d->regs + ichan;
>  r->transfer_handler = transfer_handler;
>  r->opaque = opaque;
>  }
>  
> -int DMA_read_memory (int nchan, void *buf, int pos, int len)
> +static int i8257_dma_read_memory(IsaDma *obj, int nchan, void *buf, int pos,
> + int len)
>  {
> -struct dma_regs *r = &dma_controllers[nchan > 3]->regs[nchan & 3];
> +I8257State *d = I8257(obj);
> +struct dma_regs *r = &d->regs[nchan & 3];
>  hwaddr addr = ((r->pageh & 0x7f) << 24) | (r->page << 16) | r->now[ADDR];
>  
>  if (r->mode & 0x20) {
> @@ -442,9 +447,11 @@ int DMA_read_memory (int nchan, void *buf, int pos, int 
> len)
>  return len;
>  }
>  
> -int DMA_write_memory (int nchan, void *buf, int pos, int len)
> +static int i8257_dma_write_memory(IsaDma *obj, int nchan, void *buf, int pos,
> + int len)
>  {
> -struct dma_regs *r = &dma_controllers[nchan > 3]->regs[nchan & 3];
> +I8257State *s = I8257(obj);
> +struct dma_regs *r = &s->regs[nchan & 3];
>  hwaddr addr = ((r->pageh & 0x7f) << 24) | (r->page << 16) | r->now[ADDR];
>  
>  if (r->mode & 0x20) {
> @@ -467,10 +474,10 @@ int DMA_write_memory (int nchan, void *buf, int pos, 
> int len)
>  /* request the emulator to transfer a new DMA memory block ASAP (even
>   * if the idle bottom half would not have exited the iothread yet).
>   */
> -void DMA_schedule(void)
> +static void i8257_dma_schedule(IsaDma *obj)
>  {
> -if (dma_controllers[0]->dma_bh_scheduled ||
> -dma_controllers[1]->dma_bh_scheduled) {
> +I8257State *d = I8257(obj);
> +if (d->dma_bh_scheduled) {
>  qemu_notify_event();
>  }
>  }
> @@ -604,11 +611,85 @@ static Property i8257_properties[] = {
>  static void i8257_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +IsaDmaClass *idc = ISADMA_CLASS(klass);
>

Re: [Qemu-devel] [PATCH v1] kvm/x86: Hyper-V tsc page setup

2016-01-05 Thread Peter Hornyack

On Thu, Dec 24, 2015 at 1:33 AM, Andrey Smetanin
 wrote:
> Lately tsc page was implemented but filled with empty
> values. This patch setup tsc page scale and offset based
> on vcpu tsc, tsc_khz and  HV_X64_MSR_TIME_REF_COUNT value.
>
> The valid tsc page drops HV_X64_MSR_TIME_REF_COUNT msr
> reads count to zero which potentially improves performance.
>
> The patch applies on top of
> 'kvm: Make vcpu->requests as 64 bit bitmap'
> previously sent.
>
> Signed-off-by: Andrey Smetanin 
> CC: Paolo Bonzini 
> CC: Gleb Natapov 
> CC: Roman Kagan 
> CC: Denis V. Lunev 
> CC: qemu-devel@nongnu.org
Reviewed-by: Peter Hornyack 

>
> ---
>  arch/x86/kvm/hyperv.c| 117 
> +--
>  arch/x86/kvm/hyperv.h|   2 +
>  arch/x86/kvm/x86.c   |  12 +
>  include/linux/kvm_host.h |   1 +
>  4 files changed, 117 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index d50675a..504fdc7 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -753,6 +753,105 @@ static int kvm_hv_msr_set_crash_data(struct kvm_vcpu 
> *vcpu,
> return 0;
>  }
>
> +static u64 calc_tsc_page_scale(u32 tsc_khz)
> +{
> +   /*
> +* reftime (in 100ns) = tsc * tsc_scale / 2^64 + tsc_offset
> +* so reftime_delta = (tsc_delta * tsc_scale) / 2^64
> +* so tsc_scale = (2^64 * reftime_delta)/tsc_delta
> +* so tsc_scale = (2^64 * 10 * 10^6) / tsc_hz = (2^64 * 10000) / tsc_khz
> +* so tsc_scale = (2^63 * 2 * 10000) / tsc_khz
> +*/
> +   return mul_u64_u32_div(1ULL << 63, 2 * 10000, tsc_khz);
> +}
> +
> +static int write_tsc_page(struct kvm *kvm, u64 gfn,
> + PHV_REFERENCE_TSC_PAGE tsc_ref)
> +{
> +   if (kvm_write_guest(kvm, gfn_to_gpa(gfn),
> +   tsc_ref, sizeof(*tsc_ref)))
> +   return 1;
> +   mark_page_dirty(kvm, gfn);
> +   return 0;
> +}
> +
> +static int read_tsc_page(struct kvm *kvm, u64 gfn,
> +PHV_REFERENCE_TSC_PAGE tsc_ref)
> +{
> +   if (kvm_read_guest(kvm, gfn_to_gpa(gfn),
> +  tsc_ref, sizeof(*tsc_ref)))
> +   return 1;
> +   return 0;
> +}
> +
> +static u64 calc_tsc_page_time(struct kvm_vcpu *vcpu,
> + PHV_REFERENCE_TSC_PAGE tsc_ref)
> +{
> +
> +   u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> +
> +   return mul_u64_u64_shr(tsc, tsc_ref->tsc_scale, 64)
> +   + tsc_ref->tsc_offset;
> +}
> +
> +static int setup_blank_tsc_page(struct kvm_vcpu *vcpu, u64 gfn)
> +{
> +   HV_REFERENCE_TSC_PAGE tsc_ref;
> +
> +   memset(&tsc_ref, 0, sizeof(tsc_ref));
> +   return write_tsc_page(vcpu->kvm, gfn, &tsc_ref);
> +}
> +
> +int kvm_hv_setup_tsc_page(struct kvm_vcpu *vcpu)
> +{
> +   struct kvm *kvm = vcpu->kvm;
> +   struct kvm_hv *hv = &kvm->arch.hyperv;
> +   HV_REFERENCE_TSC_PAGE tsc_ref;
> +   u32 tsc_khz;
> +   int r;
> +   u64 gfn, ref_time, tsc_scale, tsc_offset, tsc;
> +
> +   if (WARN_ON_ONCE(!(hv->hv_tsc_page & 
> HV_X64_MSR_TSC_REFERENCE_ENABLE)))
> +   return -EINVAL;
> +
> +   gfn = hv->hv_tsc_page >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
> +   vcpu_debug(vcpu, "tsc page gfn 0x%llx\n", gfn);
> +
> +   tsc_khz = vcpu->arch.virtual_tsc_khz;
> +   if (!tsc_khz) {
> +   vcpu_unimpl(vcpu, "no tsc khz\n");
> +   return setup_blank_tsc_page(vcpu, gfn);
> +   }
> +
> +   r = read_tsc_page(kvm, gfn, &tsc_ref);
> +   if (r) {
> +   vcpu_err(vcpu, "can't access tsc page gfn 0x%llx\n", gfn);
> +   return r;
> +   }
> +
> +   tsc_scale = calc_tsc_page_scale(tsc_khz);
> +   ref_time = get_time_ref_counter(kvm);
> +   tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> +
> +   /* tsc_offset = reftime - tsc * tsc_scale / 2^64 */
> +   tsc_offset = ref_time - mul_u64_u64_shr(tsc, tsc_scale, 64);
> +   vcpu_debug(vcpu, "tsc khz %u tsc %llu scale %llu offset %llu\n",
> +  tsc_khz, tsc, tsc_scale, tsc_offset);
> +
> +   tsc_ref.tsc_sequence++;
> +   if (tsc_ref.tsc_sequence == 0)

Also avoid tsc_sequence == 0x here. In the Hyper-V TLFS 4.0
(Win2012 R2) 0x is the special sequence number to disable the
reference TSC page.

> +   tsc_ref.tsc_sequence = 1;
> +
> +   tsc_ref.tsc_scale = tsc_scale;
> +   tsc_ref.tsc_offset = tsc_offset;
> +
> +   vcpu_debug(vcpu, "tsc page calibration time %llu vs. reftime %llu\n",
> +  calc_tsc_page_time(vcpu, &tsc_ref),
> +  get_time_ref_counter(kvm));
> +
> +   return write_tsc_page(kvm, gfn, &tsc_ref);
> +}
> +
>  static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>  bool host)
>  {
> @@ -790,23 +889,11 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 
> msr, u64 data,
>

Re: [Qemu-devel] [PATCH] macio: fix overflow in lba to offset conversion for ATAPI devices

2016-01-05 Thread John Snow



On 01/04/2016 12:30 PM, Mark Cave-Ayland wrote:
> As the IDEState lba field is an int32_t, make sure we cast to int64_t before
> shifting to calculate the offset. Otherwise we end up with an overflow when
> trying to access sectors beyond 2GB as can occur when using DVD images.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/ide/macio.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 3ee962f..a78b6e0 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -280,7 +280,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int 
> ret)
>  }
>  
>  /* Calculate current offset */
> -offset = (int64_t)(s->lba << 11) + s->io_buffer_index;
> +offset = ((int64_t)(s->lba) << 11) + s->io_buffer_index;
>  
>  pmac_dma_read(s->blk, offset, io->len, pmac_ide_atapi_transfer_cb, io);
>  return;
> 

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js
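
For anyone skimming the thread, the difference between the two expressions
can be shown in isolation. This is a minimal sketch, not the macio code: the
function names are invented, and the "old" behaviour is modelled with
unsigned 32-bit arithmetic because the original signed overflow is undefined
in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of a 2048-byte sector.
 * The widening cast happens before the shift, so the shift is carried
 * out in 64 bits and cannot overflow for a non-negative int32_t lba. */
static int64_t sketch_lba_to_offset(int32_t lba, int32_t buffer_index)
{
    assert(lba >= 0);
    return ((int64_t)lba << 11) + buffer_index;
}

/* Model of shifting first and widening afterwards: the shift is done in
 * 32 bits, so any bits above bit 31 are already lost when the result is
 * widened. Unsigned arithmetic is used to keep the sketch well defined. */
static uint32_t sketch_shift_in_32_bits(int32_t lba)
{
    return (uint32_t)lba << 11;
}
```

An lba of 0x200000 is the first sector at the 4GB mark: the 64-bit version
returns 4294967296, while the 32-bit shift wraps around to 0.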

Re: [Qemu-devel] [PATCH for v2.3.0] fw_cfg: add check to validate current entry value

2016-01-05 Thread Stefan Weil

Am 05.01.2016 um 15:55 schrieb P J P:
> From: Prasad J Pandit 
> 
> When processing firmware configurations, an OOB r/w access occurs
> if 's->cur_entry' is set to be invalid(FW_CFG_INVALID=0x).
> Add a check to validate 's->cur_entry' to avoid such access.
> 
> Reported-by: Donghai Zdh 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/nvram/fw_cfg.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 68eff77..ce026bc 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -233,12 +233,15 @@ static void fw_cfg_reboot(FWCfgState *s)
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
>  {
>  int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> -FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> +FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> + &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>  
>  trace_fw_cfg_write(s, value);
>  
> -if (s->cur_entry & FW_CFG_WRITE_CHANNEL && e->callback &&
> -s->cur_offset < e->len) {
> +if (s->cur_entry != FW_CFG_INVALID
> +&& s->cur_entry & FW_CFG_WRITE_CHANNEL
> +&& e->callback
> +&& s->cur_offset < e->len) {

I suggest to test e != NULL instead of s->cur_entry != FW_CFG_INVALID.

Of course both variants are equivalent, but e != NULL might be easier
to review and make work of static code analyzers easier, too.
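
Roughly what I have in mind, as a standalone sketch rather than the real
fw_cfg code: all Sketch* names, the table size and the sentinel value are
invented for the example. The lookup returns NULL for the invalid selector,
and every later use is guarded by the pointer test alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Everything below is hypothetical and only illustrates the pattern. */
#define SKETCH_NUM_ENTRIES 4
#define SKETCH_KEY_INVALID UINT16_MAX  /* made-up sentinel for this sketch */

typedef struct SketchEntry {
    uint8_t *data;
    uint32_t len;
} SketchEntry;

typedef struct SketchState {
    SketchEntry entries[SKETCH_NUM_ENTRIES];
    uint16_t cur_key;
    uint32_t cur_offset;
} SketchState;

/* Single place that turns the selector into a pointer (or NULL). */
static SketchEntry *sketch_current_entry(SketchState *s)
{
    if (s->cur_key == SKETCH_KEY_INVALID) {
        return NULL;
    }
    return &s->entries[s->cur_key % SKETCH_NUM_ENTRIES];
}

/* Returns 1 if the byte was stored, 0 if the write was ignored. */
static int sketch_write(SketchState *s, uint8_t value)
{
    SketchEntry *e = sketch_current_entry(s);

    /* e != NULL is the only validity check a reader has to follow. */
    if (e != NULL && e->data != NULL && s->cur_offset < e->len) {
        e->data[s->cur_offset++] = value;
        return 1;
    }
    return 0;
}
```

The advantage is that a reviewer (or an analyzer) sees the NULL test right
next to the dereference instead of having to connect it to the sentinel.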


>  e->data[s->cur_offset++] = value;
>  if (s->cur_offset == e->len) {
>  e->callback(e->callback_opaque, e->data);
> @@ -267,7 +270,8 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>  static uint8_t fw_cfg_read(FWCfgState *s)
>  {
>  int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> -FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> +FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> + &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>  uint8_t ret;
>  
>  if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= 
> e->len)
> 





Re: [Qemu-devel] [Qemu-block] [PATCH 05/10] block: Inactivate BDS when migration completes

2016-01-05 Thread John Snow



On 12/22/2015 03:43 PM, Eric Blake wrote:
> On 12/22/2015 09:46 AM, Kevin Wolf wrote:
>> So far, live migration with shared storage meant that the image is in a
>> not-really-ready don't-touch-me state on the destination while the
>> source is still actively using it, but after completing the migration,
>> the image was fully opened on both sides. This is bad.
>>
>> This patch adds a block driver callback to inactivate images on the
>> source before completing the migration. Inactivation means that it goes
>> to a state as if it was just live migrated to the qemu instance on the
>> source (i.e. BDRV_O_INCOMING is set). You're then supposed to continue
>> either on the source or on the destination, which takes ownership of the
>> image.
>>
>> A typical migration looks like this now with respect to disk images:
>>
>> 1. Destination qemu is started, the image is opened with
>>BDRV_O_INCOMING. The image is fully opened on the source.
>>
>> 2. Migration is about to complete. The source flushes the image and
>>inactivates it. Now both sides have the image opened with
>>BDRV_O_INCOMING and are expecting the other side to still modify it.
> 
> The name BDRV_O_INCOMING now doesn't quite match semantics on the
> source, but I don't have any better suggestions.  BDRV_O_LIMITED_USE?
> BDRV_O_HANDOFF?  At any rate, I fully agree with your logic of locking
> things down on the source to mark that the destination is about to take
> over write access to the file.
> 

INCOMING is handy as it keeps the code simple, even if it's weird to
read. Is it worth adding the extra ifs/case statements everywhere to add
in BDRV_O_HANDOFF? Maybe in the future someone will use BDRV_O_INCOMING
to mean something more specific (data is incoming, not just in the
process of being handed off) that could cause problems.

Maybe even just renaming BDRV_O_INCOMING right now to be BDRV_O_HANDOFF
would accomplish the semantics we want on both source and destination
without needing two flags.

Follow your dreams, Go with what you feel.

>>
>> 3. One side (the destination on success) continues and calls
>>bdrv_invalidate_all() in order to take ownership of the image again.
>>This removes BDRV_O_INCOMING on the resuming side; the flag remains
>>set on the other side.
>>
>> This ensures that the same image isn't written to by both instances
>> (unless both are resumed, but then you get what you deserve). This is
>> important because .bdrv_close for non-BDRV_O_INCOMING images could write
>> to the image file, which is definitely forbidden while another host is
>> using the image.
> 
> And indeed, this is a prereq to your patch that modifies the file on
> close to clear the new 'open-for-writing' flag :)
> 
>>
>> Signed-off-by: Kevin Wolf 
>> ---
>>  block.c   | 34 ++
>>  include/block/block.h |  1 +
>>  include/block/block_int.h |  1 +
>>  migration/migration.c |  7 +++
>>  qmp.c | 12 
>>  5 files changed, 55 insertions(+)
>>
> 
>> @@ -1536,6 +1540,9 @@ static void migration_completion(MigrationState *s, 
>> int current_active_state,
>>  if (!ret) {
>>  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>>  if (ret >= 0) {
>> +ret = bdrv_inactivate_all();
>> +}
>> +if (ret >= 0) {
>>  qemu_file_set_rate_limit(s->file, INT64_MAX);
> 
> Isn't the point of the rate limit change to allow any pending operations
> to flush without artificial slow limits?  Will inactivating the device
> be too slow if rate limits are still slow?
> 

This sets the rate limit for the migration pipe, doesn't it? My reading
was that this removes any artificial limits for the sake of post-copy,
but we shouldn't be flushing any writes to disk at this point, so I
think this order won't interfere with anything.

> But offhand, I don't have any strong proof that a different order is
> required, so yours makes sense to me.
> 
> You may want a second opinion, but I'm okay if you add:
> Reviewed-by: Eric Blake 
> 
Reviewed-by: John Snow

Re: [Qemu-devel] [Qemu-block] [PATCH] send readcapacity10 when readcapacity16 failed

2016-01-05 Thread ronnie sahlberg

MMC devices:
READ CAPACITY 10 support is mandatory.
No support for READ CAPACITY 16

SBC devices:
READ CAPACITY 10 is mandatory
READ CAPACITY 16 support is only required when you have thin provisioning
or protection information (or if the device is >2^32 blocks)
Almost all, but apparently not all, SBC devices support both.

For SBC devices you probably want to start with RC16 and only fall back to
RC10 if you get INVALID_OPCODE.
You start with RC16 since this is the way to discover if you have thin
provisioning or special protection information.

For MMC devices you could use the same "try RC16 first and fall back to
RC10" approach, but as probably almost no MMC devices support RC16 it makes
little sense to do so.
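
The policy above could be sketched roughly as follows. This is only an
illustration, not libiscsi or QEMU code: the Sketch* types, the status codes
and the function-pointer hooks are all invented, and the two stub commands
at the end exist only so the sketch can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes and device kinds for this sketch only. */
typedef enum { SK_RC_OK, SK_RC_INVALID_OPCODE, SK_RC_ERROR } SketchRcStatus;
typedef enum { SK_DEV_SBC, SK_DEV_MMC } SketchDevKind;

/* Hypothetical hook that issues one READ CAPACITY variant. */
typedef SketchRcStatus (*SketchReadCapFn)(void *dev, uint64_t *num_blocks);

typedef struct SketchCapacity {
    uint64_t num_blocks;
    bool from_rc16;  /* true if RC16 answered */
} SketchCapacity;

static SketchRcStatus sketch_probe_capacity(SketchDevKind kind, void *dev,
                                            SketchReadCapFn rc16,
                                            SketchReadCapFn rc10,
                                            SketchCapacity *out)
{
    assert(rc16 != NULL && rc10 != NULL && out != NULL);

    if (kind == SK_DEV_SBC) {
        /* SBC: try RC16 first, it is what reports the extra information. */
        SketchRcStatus st = rc16(dev, &out->num_blocks);
        if (st == SK_RC_OK) {
            out->from_rc16 = true;
            return SK_RC_OK;
        }
        if (st != SK_RC_INVALID_OPCODE) {
            return st;  /* a real failure, do not mask it with RC10 */
        }
        /* INVALID_OPCODE: the target lacks RC16, fall back below. */
    }

    /* MMC devices, and SBC devices without RC16, use RC10. */
    out->from_rc16 = false;
    return rc10(dev, &out->num_blocks);
}

/* Stub commands so the sketch can be tried without a target. */
static SketchRcStatus sketch_rc16_unsupported(void *dev, uint64_t *num_blocks)
{
    (void)dev;
    (void)num_blocks;
    return SK_RC_INVALID_OPCODE;
}

static SketchRcStatus sketch_rc16_ok(void *dev, uint64_t *num_blocks)
{
    (void)dev;
    *num_blocks = 5000;
    return SK_RC_OK;
}

static SketchRcStatus sketch_rc10_ok(void *dev, uint64_t *num_blocks)
{
    (void)dev;
    *num_blocks = 1000;
    return SK_RC_OK;
}
```

The point of checking for INVALID_OPCODE specifically is that other RC16
failures are still reported instead of being hidden by the fallback.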

On Tue, Jan 5, 2016 at 11:42 AM, John Snow  wrote:

>
>
> On 12/28/2015 10:32 PM, Zhu Lingshan wrote:
> > When play with Dell MD3000 target, for sure it
> > is a TYPE_DISK, but readcapacity16 would fail.
> > Then we find that readcapacity10 succeeded. It
> > looks like the target just support readcapacity10
> > even through it is a TYPE_DISK or have some
> > TYPE_ROM characteristics.
> >
> > This patch can give a chance to send
> > readcapacity16 when readcapacity10 failed.
> > This patch is not harmful to original pathes
> >
> > Signed-off-by: Zhu Lingshan 
> > ---
> >  block/iscsi.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/iscsi.c b/block/iscsi.c
> > index bd1f1bf..c8d167f 100644
> > --- a/block/iscsi.c
> > +++ b/block/iscsi.c
> > @@ -1243,8 +1243,9 @@ static void iscsi_readcapacity_sync(IscsiLun
> *iscsilun, Error **errp)
> >  iscsilun->lbprz = !!rc16->lbprz;
> >  iscsilun->use_16_for_rw = (rc16->returned_lba >
> 0x);
> >  }
> > +break;
> >  }
> > -break;
> > +//fall through to try readcapacity10 instead
> >  case TYPE_ROM:
> >  task = iscsi_readcapacity10_sync(iscsilun->iscsi,
> iscsilun->lun, 0, 0);
> >  if (task != NULL && task->status == SCSI_STATUS_GOOD) {
> >
>
> For the uninitiated, why does readcapacity16 fail?
>
> My gut feeling is that this is a hack, because:
>
> - Either readcapacity16 should work, or
> - We shouldn't be choosing 10/16 based on the target type to begin with
>
> but I don't know much about iSCSI, so maybe You, Paolo or Peter could
> fill me in.
>
> --js
>

Re: [Qemu-devel] [PATCH 1/8] ipmi: fix SDR length value

2016-01-05 Thread Eric Blake

On 01/05/2016 10:29 AM, Cédric Le Goater wrote:

[meta-comment] Your messages were not marked in-reply-to: the 0/8 cover
letter, but came through as separate threads.  This makes it harder to
follow, especially in mail clients that sort top-level threads by most
recent activity on the thread.

> The IPMI BMC simulator populates the SDR table with a set of initial
> SDRs. The length of each SDR is taken from the record itself (byte 4)
> which does not include the size of the header. But, the full length
> (header + data) is required by the sdr_add_entry() routine.
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  Maybe we could use a sdr struct/typedef to clarify the code. See
>  patch 7: "ipmi: introduce an ipmi_bmc_init_sensor() API"
> 
>  hw/ipmi/ipmi_bmc_sim.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
> index 0a59e539f549..559e1398d669 100644
> --- a/hw/ipmi/ipmi_bmc_sim.c
> +++ b/hw/ipmi/ipmi_bmc_sim.c
> @@ -362,7 +362,7 @@ static int sdr_find_entry(IPMISdr *sdr, uint16_t recid,
>  
>  while (pos < sdr->next_free) {
>  uint16_t trec = sdr->sdr[pos] | (sdr->sdr[pos + 1] << 8);
> -unsigned int nextpos = pos + sdr->sdr[pos + 4];
> +unsigned int nextpos = pos + sdr->sdr[pos + 4] + 5;

5 feels like a magic number; should you use a #define and name it?
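
As a rough illustration of what naming the constant could look like (SDR_HEADER_LEN, SDR_LEN_OFFSET and both helpers are hypothetical names, not identifiers from ipmi_bmc_sim.c):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed names; the patch itself uses a bare 5 and a bare 4. */
#define SDR_HEADER_LEN 5 /* bytes in a record before the data */
#define SDR_LEN_OFFSET 4 /* header byte holding the data length */

/* Full size (header + data) of the record that starts at pos. */
static size_t sdr_record_size(const uint8_t *sdr, size_t pos)
{
    return (size_t)sdr[pos + SDR_LEN_OFFSET] + SDR_HEADER_LEN;
}

/* Walk the table record by record and count the entries. */
static size_t sdr_count_records(const uint8_t *sdr, size_t used)
{
    size_t pos = 0;
    size_t n = 0;

    while (pos + SDR_HEADER_LEN <= used) {
        pos += sdr_record_size(sdr, pos);
        n++;
    }
    return n;
}
```

With a helper like this, the "+ 5" in sdr_find_entry() and the "i + 5" bound check in ipmi_sim_init() could share one definition.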


> @@ -1709,20 +1709,20 @@ static void ipmi_sim_init(Object *obj)
>  for (i = 0;;) {
>  int len;
>  if ((i + 5) > sizeof(init_sdrs)) {
> -error_report("Problem with recid 0x%4.4x: \n", i);
> +error_report("Problem with recid 0x%4.4x\n", i);

Please drop the trailing \n as long as you are touching this.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [v15 12/15] vfio: add bus in reset flag

2016-01-05 Thread Alex Williamson

On Tue, 2016-01-05 at 09:20 +0800, Cao jin wrote:
> From: Chen Fan 
> 
> Mark the host bus as being in reset, so that multiple devices do not
> trigger the host bus reset many times.
> 
> Signed-off-by: Chen Fan 
> ---
>  hw/vfio/pci.c | 6 ++
>  include/hw/vfio/vfio-common.h | 1 +
>  2 files changed, 7 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index ee88db3..aa0d945 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2249,6 +2249,11 @@ static int vfio_pci_hot_reset(VFIOPCIDevice
> *vdev, bool single)
>  
>  trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" :
> "multi");
>  
> +if (vdev->vbasedev.bus_in_reset) {
> +vdev->vbasedev.bus_in_reset = false;
> +return 0;
> +}
> +
>  vfio_pci_pre_reset(vdev);
>  vdev->vbasedev.needs_reset = false;
>  
> @@ -2312,6 +2317,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice
> *vdev, bool single)
>  }
>  vfio_pci_pre_reset(tmp);
>  tmp->vbasedev.needs_reset = false;
> +tmp->vbasedev.bus_in_reset = true;
>  multi = true;
>  break;
>  }
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-
> common.h
> index f037f3c..44b19d7 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -95,6 +95,7 @@ typedef struct VFIODevice {
>  bool reset_works;
>  bool needs_reset;
>  bool no_mmap;
> +bool bus_in_reset;
>  VFIODeviceOps *ops;
>  unsigned int num_irqs;
>  unsigned int num_regions;

I imagine this should be a VFIOPCIDevice field, it has no use in the
common code.  The name is also a bit confusing; when I suggested a
bus_in_reset flag, I was referring to a property on the bus itself that
the existing device_reset could query to switch modes rather than add a
separate callback as you've done in this series.  This works, but it's
perhaps more intrusive than I was thinking.  It will need to get
approval by qdev folks.

In any case, this bus_in_reset field is tracking whether a device has
already been reset as part of a hot reset, sort of a more bus-based
version with opposite polarity of needs_reset.  It doesn't actually
track the bus reset state at all, it tracks whether we should skip the
next call to hot reset for that device.  So it should probably be
something like VFIOPCIDevice.skip_hot_reset (though that's not a great
name either).
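
To spell out the semantics I'm describing, here is a toy model. Dev, hot_reset() and reset_all() are made up for illustration and are not the vfio code; the flag means "this device was already reset as part of someone else's hot reset, so skip its next one".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for VFIOPCIDevice. */
typedef struct Dev {
    bool skip_hot_reset; /* already reset by a sibling's bus reset */
    int resets;          /* number of real resets this device has seen */
} Dev;

/*
 * Hot reset triggered on behalf of `initiator`. Every device on the bus
 * is reset once; the siblings are flagged so that their own upcoming
 * reset call becomes a no-op.
 */
static void hot_reset(Dev *initiator, Dev *bus, size_t n)
{
    if (initiator->skip_hot_reset) {
        initiator->skip_hot_reset = false; /* consume the flag */
        return;
    }
    for (size_t i = 0; i < n; i++) {
        bus[i].resets++;
        if (&bus[i] != initiator) {
            bus[i].skip_hot_reset = true;
        }
    }
}

/* Models the reset handler of each device being called in turn. */
static void reset_all(Dev *bus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        hot_reset(&bus[i], bus, n);
    }
}
```

Note that nothing here records the state of the bus; the flag only remembers a pending skip per device, which is why the bus_in_reset name is misleading.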

I also wonder if a "hot" reset callback in qbus is really too PCI
centered, should it just be "bus_reset"?

Finally, it would be great if you could mention in the cover email
which patches are new or more than superficially modified from the
previous version so we can more easily focus on the new code to review.
 Thanks!

Alex

[Qemu-devel] [PATCH 13/22] fsdev: rename virtio-9p-marshal.{c, h} to 9p-iov-marshal.{c, h}

2016-01-05 Thread Wei Liu

And rename v9fs_marshal to v9fs_iov_marshal, v9fs_unmarshal to
v9fs_iov_unmarshal.

The rationale behind this change is that this marshalling interface is
used by both virtio and the proxy helper. Rename the files and functions
to reflect the true nature of this interface.

Xen transport is going to have its own marshalling interface.

Signed-off-by: Wei Liu 
---
 Makefile|   2 +-
 fsdev/{virtio-9p-marshal.c => 9p-iov-marshal.c} | 118 +---
 fsdev/9p-iov-marshal.h  |  10 ++
 fsdev/Makefile.objs |   2 +-
 fsdev/virtfs-proxy-helper.c |   4 +-
 fsdev/virtio-9p-marshal.h   |  10 --
 hw/9pfs/9p-proxy.h  |   4 +-
 hw/9pfs/virtio-9p.h |   6 +-
 8 files changed, 83 insertions(+), 73 deletions(-)
 rename fsdev/{virtio-9p-marshal.c => 9p-iov-marshal.c} (63%)
 create mode 100644 fsdev/9p-iov-marshal.h
 delete mode 100644 fsdev/virtio-9p-marshal.h

diff --git a/Makefile b/Makefile
index 7e881d8..d0de2d4 100644
--- a/Makefile
+++ b/Makefile
@@ -240,7 +240,7 @@ qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) 
$(qom-obj-y) libqemuu
 
 qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
 
-fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o 
fsdev/9p-marshal.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o 
fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
 
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
diff --git a/fsdev/virtio-9p-marshal.c b/fsdev/9p-iov-marshal.c
similarity index 63%
rename from fsdev/virtio-9p-marshal.c
rename to fsdev/9p-iov-marshal.c
index d120bd2..d17983e 100644
--- a/fsdev/virtio-9p-marshal.c
+++ b/fsdev/9p-iov-marshal.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p backend
+ * 9p backend
  *
  * Copyright IBM, Corp. 2010
  *
@@ -22,7 +22,7 @@
 #include 
 
 #include "qemu/compiler.h"
-#include "virtio-9p-marshal.h"
+#include "9p-iov-marshal.h"
 #include "qemu/bswap.h"
 
 static ssize_t v9fs_packunpack(void *addr, struct iovec *sg, int sg_count,
@@ -76,8 +76,8 @@ static ssize_t v9fs_pack(struct iovec *in_sg, int in_num, 
size_t offset,
 return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
 }
 
-ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
-   int bswap, const char *fmt, ...)
+ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
+   int bswap, const char *fmt, ...)
 {
 int i;
 va_list ap;
@@ -127,8 +127,8 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, 
size_t offset,
 }
 case 's': {
 V9fsString *str = va_arg(ap, V9fsString *);
-copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
-"w", &str->size);
+copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
+"w", &str->size);
 if (copied > 0) {
 offset += copied;
 str->data = g_malloc(str->size + 1);
@@ -144,8 +144,8 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, 
size_t offset,
 }
 case 'B': {
 V9fsBlob *blob = va_arg(ap, V9fsBlob *);
-copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
-"w", &blob->size);
+copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
+"w", &blob->size);
 if (copied > 0) {
 offset += copied;
 blob->data = g_malloc(blob->size);
@@ -159,31 +159,36 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, 
size_t offset,
 }
 case 'Q': {
 V9fsQID *qidp = va_arg(ap, V9fsQID *);
-copied = v9fs_unmarshal(out_sg, out_num, offset, bswap, "bdq",
-&qidp->type, &qidp->version, &qidp->path);
+copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
+"bdq", &qidp->type, &qidp->version,
+&qidp->path);
 break;
 }
 case 'S': {
 V9fsStat *statp = va_arg(ap, V9fsStat *);
-copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
-"wwdQdddqsddd",
-&statp->size, &statp->type, &statp->dev,
-&statp->qid, &statp->mode, &statp->atime,
-&statp->mtime, &statp->length,
-&statp->name, &statp->uid, &statp->gid,
-&statp->muid, &statp->extension,
-

Re: [Qemu-devel] [Qemu-block] [PATCH] send readcapacity10 when readcapacity16 failed

2016-01-05 Thread John Snow



On 12/28/2015 10:32 PM, Zhu Lingshan wrote:
> When testing with a Dell MD3000 target, which is certainly
> a TYPE_DISK, readcapacity16 fails while readcapacity10
> succeeds. It looks like the target only supports
> readcapacity10 even though it is a TYPE_DISK, or it has
> some TYPE_ROM characteristics.
> 
> This patch gives a chance to send readcapacity10 when
> readcapacity16 has failed. It does not harm the
> original paths.
> 
> Signed-off-by: Zhu Lingshan 
> ---
>  block/iscsi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/iscsi.c b/block/iscsi.c
> index bd1f1bf..c8d167f 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -1243,8 +1243,9 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
> Error **errp)
>  iscsilun->lbprz = !!rc16->lbprz;
>  iscsilun->use_16_for_rw = (rc16->returned_lba > 
> 0x);
>  }
> +break;
>  }
> -break;
> +//fall through to try readcapacity10 instead
>  case TYPE_ROM:
>  task = iscsi_readcapacity10_sync(iscsilun->iscsi, iscsilun->lun, 
> 0, 0);
>  if (task != NULL && task->status == SCSI_STATUS_GOOD) {
> 

For the uninitiated, why does readcapacity16 fail?

My gut feeling is that this is a hack, because:

- Either readcapacity16 should work, or
- We shouldn't be choosing 10/16 based on the target type to begin with

but I don't know much about iSCSI, so maybe You, Paolo or Peter could
fill me in.

--js

[Qemu-devel] [PATCH 12/22] 9pfs: use V9fsBlob to transmit xattr

2016-01-05 Thread Wei Liu

And make v9fs_pack a static function. Now we only need to export
v9fs_{,un}marshal to the device.

Signed-off-by: Wei Liu 
---
 fsdev/virtio-9p-marshal.c |  4 ++--
 fsdev/virtio-9p-marshal.h |  3 ---
 hw/9pfs/virtio-9p.c   | 21 +
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/fsdev/virtio-9p-marshal.c b/fsdev/virtio-9p-marshal.c
index c3ac316..d120bd2 100644
--- a/fsdev/virtio-9p-marshal.c
+++ b/fsdev/virtio-9p-marshal.c
@@ -70,8 +70,8 @@ static ssize_t v9fs_unpack(void *dst, struct iovec *out_sg, 
int out_num,
 return v9fs_packunpack(dst, out_sg, out_num, offset, size, 0);
 }
 
-ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
-  const void *src, size_t size)
+static ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
+ const void *src, size_t size)
 {
 return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
 }
diff --git a/fsdev/virtio-9p-marshal.h b/fsdev/virtio-9p-marshal.h
index 0709bcd..766a48e 100644
--- a/fsdev/virtio-9p-marshal.h
+++ b/fsdev/virtio-9p-marshal.h
@@ -3,9 +3,6 @@
 
 #include "9p-marshal.h"
 
-
-ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
-  const void *src, size_t size);
 ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, ...);
 ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 084fa6a..5da25ec 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1561,6 +1561,7 @@ static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, 
V9fsFidState *fidp,
 size_t offset = 7;
 int read_count;
 int64_t xattr_len;
+V9fsBlob blob;
 
 xattr_len = fidp->fs.xattr.len;
 read_count = xattr_len - off;
@@ -1572,14 +1573,18 @@ static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, 
V9fsFidState *fidp,
  */
 read_count = 0;
 }
-err = pdu_marshal(pdu, offset, "d", read_count);
-if (err < 0) {
-return err;
-}
-offset += err;
-err = v9fs_pack(pdu->elem.in_sg, pdu->elem.in_num, offset,
-((char *)fidp->fs.xattr.value) + off,
-read_count);
+
+v9fs_blob_init(&blob);
+
+blob.data = g_malloc(read_count);
+memcpy(blob.data, ((char *)fidp->fs.xattr.value) + off,
+   read_count);
+blob.size = read_count;
+
+err = pdu_marshal(pdu, offset, "dB", read_count, &blob);
+
+v9fs_blob_free(&blob);
+
 if (err < 0) {
 return err;
 }
-- 
2.1.4

[Qemu-devel] [PATCH 17/22] 9pfs: factor out virtio_pdu_{, un}marshal

2016-01-05 Thread Wei Liu

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p-device.c | 14 ++
 hw/9pfs/virtio-9p.c|  6 ++
 hw/9pfs/virtio-9p.h|  5 +
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index f3091cc..d77247f 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -156,6 +156,20 @@ static void virtio_9p_device_unrealize(DeviceState *dev, 
Error **errp)
 g_free(s->tag);
 }
 
+ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
+const char *fmt, va_list ap)
+{
+return v9fs_iov_vmarshal(pdu->elem.in_sg, pdu->elem.in_num,
+ offset, 1, fmt, ap);
+}
+
+ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
+  const char *fmt, va_list ap)
+{
+return v9fs_iov_vunmarshal(pdu->elem.out_sg, pdu->elem.out_num,
+   offset, 1, fmt, ap);
+}
+
 /* virtio-9p device */
 
 static Property virtio_9p_properties[] = {
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 780c398..db79a48 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -45,8 +45,7 @@ ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char 
*fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-ret = v9fs_iov_vmarshal(pdu->elem.in_sg, pdu->elem.in_num,
-offset, 1, fmt, ap);
+ret = virtio_pdu_vmarshal(pdu, offset, fmt, ap);
 va_end(ap);
 
 return ret;
@@ -58,8 +57,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char 
*fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-ret = v9fs_iov_vunmarshal(pdu->elem.out_sg, pdu->elem.out_num,
-  offset, 1, fmt, ap);
+ret = virtio_pdu_vunmarshal(pdu, offset, fmt, ap);
 va_end(ap);
 
 return ret;
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index d6f3ac0..e298949 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -323,6 +323,11 @@ extern int v9fs_name_to_path(V9fsState *s, V9fsPath 
*dirpath,
 ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
 ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
 
+ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
+const char *fmt, va_list ap);
+ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
+  const char *fmt, va_list ap);
+
 #define TYPE_VIRTIO_9P "virtio-9p-device"
 #define VIRTIO_9P(obj) \
 OBJECT_CHECK(V9fsState, (obj), TYPE_VIRTIO_9P)
-- 
2.1.4

[Qemu-devel] [PATCH 20/22] 9pfs: break out generic code from virtio-9p.{c, h}

2016-01-05 Thread Wei Liu

The vast majority of code in virtio-9p.c is actually generic code.
Rename that file to 9p.c and move virtio specific code to
virtio-9p-device.c. Rename virtio-9p.h to 9p.h and split out virtio
specific code to new virtio-9p.h.

Finally fix up virtio-pci.h header file inclusion.

Note that V9fsState and V9fsPDU are still tied to virtio at the moment.
They will be handled later.

Signed-off-by: Wei Liu 
---
 hw/9pfs/9p-handle.c   |   2 +-
 hw/9pfs/9p-local.c|   2 +-
 hw/9pfs/9p-posix-acl.c|   2 +-
 hw/9pfs/9p-proxy.c|   2 +-
 hw/9pfs/9p-synth.c|   2 +-
 hw/9pfs/9p-xattr-user.c   |   2 +-
 hw/9pfs/9p-xattr.c|   2 +-
 hw/9pfs/{virtio-9p.c => 9p.c} |  48 +--
 hw/9pfs/9p.h  | 328 ++
 hw/9pfs/Makefile.objs |   2 +-
 hw/9pfs/coth.h|   9 +-
 hw/9pfs/virtio-9p-device.c|  48 +++
 hw/9pfs/virtio-9p.h   | 319 +---
 hw/virtio/virtio-pci.h|   1 +
 14 files changed, 399 insertions(+), 370 deletions(-)
 rename hw/9pfs/{virtio-9p.c => 9p.c} (98%)
 create mode 100644 hw/9pfs/9p.h

diff --git a/hw/9pfs/9p-handle.c b/hw/9pfs/9p-handle.c
index 51a9d15..58b77b4 100644
--- a/hw/9pfs/9p-handle.c
+++ b/hw/9pfs/9p-handle.c
@@ -11,7 +11,7 @@
  *
  */
 
-#include "virtio-9p.h"
+#include "9p.h"
 #include "9p-xattr.h"
 #include 
 #include 
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index ac553e0..bf63eab 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -11,7 +11,7 @@
  *
  */
 
-#include "virtio-9p.h"
+#include "9p.h"
 #include "9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"   /* local_ops */
 #include 
diff --git a/hw/9pfs/9p-posix-acl.c b/hw/9pfs/9p-posix-acl.c
index 073af39..8df8228 100644
--- a/hw/9pfs/9p-posix-acl.c
+++ b/hw/9pfs/9p-posix-acl.c
@@ -13,7 +13,7 @@
 
 #include 
 #include "qemu/xattr.h"
-#include "virtio-9p.h"
+#include "9p.h"
 #include "fsdev/file-op-9p.h"
 #include "9p-xattr.h"
 
diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
index 67c1fb9..73d00dd 100644
--- a/hw/9pfs/9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -11,7 +11,7 @@
  */
 #include 
 #include 
-#include "virtio-9p.h"
+#include "9p.h"
 #include "qemu/error-report.h"
 #include "fsdev/qemu-fsdev.h"
 #include "9p-proxy.h"
diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index b1064e3..090ae0c 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -13,7 +13,7 @@
  */
 
 #include "hw/virtio/virtio.h"
-#include "virtio-9p.h"
+#include "9p.h"
 #include "9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"
 #include "9p-synth.h"
diff --git a/hw/9pfs/9p-xattr-user.c b/hw/9pfs/9p-xattr-user.c
index 163b158..c490ec3 100644
--- a/hw/9pfs/9p-xattr-user.c
+++ b/hw/9pfs/9p-xattr-user.c
@@ -12,7 +12,7 @@
  */
 
 #include 
-#include "virtio-9p.h"
+#include "9p.h"
 #include "fsdev/file-op-9p.h"
 #include "9p-xattr.h"
 
diff --git a/hw/9pfs/9p-xattr.c b/hw/9pfs/9p-xattr.c
index 1d7861b..741dd03 100644
--- a/hw/9pfs/9p-xattr.c
+++ b/hw/9pfs/9p-xattr.c
@@ -11,7 +11,7 @@
  *
  */
 
-#include "virtio-9p.h"
+#include "9p.h"
 #include "fsdev/file-op-9p.h"
 #include "9p-xattr.h"
 
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/9p.c
similarity index 98%
rename from hw/9pfs/virtio-9p.c
rename to hw/9pfs/9p.c
index d6850ba..2e9982f 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/9p.c
@@ -16,6 +16,7 @@
 #include "qemu/error-report.h"
 #include "qemu/iov.h"
 #include "qemu/sockets.h"
+#include "9p.h"
 #include "virtio-9p.h"
 #include "fsdev/qemu-fsdev.h"
 #include "9p-xattr.h"
@@ -65,13 +66,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
char *fmt, ...)
 
 static void pdu_push_and_notify(V9fsPDU *pdu)
 {
-V9fsState *s = pdu->s;
-
-/* push onto queue and notify */
-virtqueue_push(s->vq, &pdu->elem, pdu->size);
-
-/* FIXME: we should batch these completions */
-virtio_notify(VIRTIO_DEVICE(s), s->vq);
+virtio_9p_push_and_notify(pdu);
 }
 
 static int omode_to_uflags(int8_t mode)
@@ -598,7 +593,7 @@ static int fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp, 
V9fsQID *qidp)
 return 0;
 }
 
-static V9fsPDU *pdu_alloc(V9fsState *s)
+V9fsPDU *pdu_alloc(V9fsState *s)
 {
 V9fsPDU *pdu = NULL;
 
@@ -610,7 +605,7 @@ static V9fsPDU *pdu_alloc(V9fsState *s)
 return pdu;
 }
 
-static void pdu_free(V9fsPDU *pdu)
+void pdu_free(V9fsPDU *pdu)
 {
 if (pdu) {
 V9fsState *s = pdu->s;
@@ -3257,7 +3252,7 @@ static inline bool is_read_only_op(V9fsPDU *pdu)
 }
 }
 
-static void pdu_submit(V9fsPDU *pdu)
+void pdu_submit(V9fsPDU *pdu)
 {
 Coroutine *co;
 CoroutineEntry *handler;
@@ -3276,36 +3271,3 @@ static void pdu_submit(V9fsPDU *pdu)
 co = qemu_coroutine_create(handler);
 qemu_coroutine_enter(co, pdu);
 }
-
-void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
-{
-V9fsState *s = (V9fsState *)vdev;
-V9fsPDU *pdu;
-ssize_t len;
-
-while ((pdu = pdu_alloc(s)) &&
-(len = virtqueue_pop(

[Qemu-devel] [PATCH 16/22] 9pfs: make pdu_{, un}marshal proper functions

2016-01-05 Thread Wei Liu

Factor out v9fs_iov_v{,un}marshal. Implement pdu_{,un}marshal with those
functions.
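
For readers unfamiliar with the pattern, a minimal standalone sketch follows. buf_vmarshal and buf_marshal are hypothetical names, and the flat buffer, the two format letters and the little-endian byte order are simplifications assumed for the sketch rather than taken from 9p-iov-marshal.c. The worker takes a va_list so that any number of variadic front ends can forward to it.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h> /* ssize_t */

/* Core worker: consumes arguments from a va_list owned by the caller. */
static ssize_t buf_vmarshal(uint8_t *buf, size_t len,
                            const char *fmt, va_list ap)
{
    size_t off = 0;

    for (size_t i = 0; fmt[i]; i++) {
        switch (fmt[i]) {
        case 'b': { /* one byte; passed as int after default promotion */
            uint8_t v = (uint8_t)va_arg(ap, int);
            if (off + 1 > len) {
                return -1;
            }
            buf[off++] = v;
            break;
        }
        case 'w': { /* 16-bit word, stored little-endian in this sketch */
            uint16_t v = (uint16_t)va_arg(ap, int);
            if (off + 2 > len) {
                return -1;
            }
            buf[off++] = (uint8_t)(v & 0xff);
            buf[off++] = (uint8_t)(v >> 8);
            break;
        }
        default:
            return -1;
        }
    }
    return (ssize_t)off;
}

/* Variadic front end: owns va_start/va_end and forwards to the worker. */
static ssize_t buf_marshal(uint8_t *buf, size_t len, const char *fmt, ...)
{
    ssize_t ret;
    va_list ap;

    va_start(ap, fmt);
    ret = buf_vmarshal(buf, len, fmt, ap);
    va_end(ap);

    return ret;
}
```

The early returns in the worker need no va_end() because the va_list is started and ended by the front end, which mirrors why the patch drops the va_end() calls from the error paths.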

Signed-off-by: Wei Liu 
---
 fsdev/9p-iov-marshal.c | 42 ++
 fsdev/9p-iov-marshal.h |  5 +
 hw/9pfs/virtio-9p.c| 26 ++
 hw/9pfs/virtio-9p.h|  6 ++
 4 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/fsdev/9p-iov-marshal.c b/fsdev/9p-iov-marshal.c
index d17983e..4db7133 100644
--- a/fsdev/9p-iov-marshal.c
+++ b/fsdev/9p-iov-marshal.c
@@ -76,15 +76,13 @@ static ssize_t v9fs_pack(struct iovec *in_sg, int in_num, 
size_t offset,
 return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
 }
 
-ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
-   int bswap, const char *fmt, ...)
+ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
+int bswap, const char *fmt, va_list ap)
 {
 int i;
-va_list ap;
 ssize_t copied = 0;
 size_t old_offset = offset;
 
-va_start(ap, fmt);
 for (i = 0; fmt[i]; i++) {
 switch (fmt[i]) {
 case 'b': {
@@ -195,25 +193,34 @@ ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int 
out_num, size_t offset,
 break;
 }
 if (copied < 0) {
-va_end(ap);
 return copied;
 }
 offset += copied;
 }
-va_end(ap);
 
 return offset - old_offset;
 }
 
-ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
- int bswap, const char *fmt, ...)
+ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
+   int bswap, const char *fmt, ...)
 {
-int i;
+ssize_t ret;
 va_list ap;
+
+va_start(ap, fmt);
+ret = v9fs_iov_vunmarshal(out_sg, out_num, offset, bswap, fmt, ap);
+va_end(ap);
+
+return ret;
+}
+
+ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
+  int bswap, const char *fmt, va_list ap)
+{
+int i;
 ssize_t copied = 0;
 size_t old_offset = offset;
 
-va_start(ap, fmt);
 for (i = 0; fmt[i]; i++) {
 switch (fmt[i]) {
 case 'b': {
@@ -316,12 +323,23 @@ ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, 
size_t offset,
 break;
 }
 if (copied < 0) {
-va_end(ap);
 return copied;
 }
 offset += copied;
 }
-va_end(ap);
 
 return offset - old_offset;
 }
+
+ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
+ int bswap, const char *fmt, ...)
+{
+ssize_t ret;
+va_list ap;
+
+va_start(ap, fmt);
+ret = v9fs_iov_vmarshal(in_sg, in_num, offset, bswap, fmt, ap);
+va_end(ap);
+
+return ret;
+}
diff --git a/fsdev/9p-iov-marshal.h b/fsdev/9p-iov-marshal.h
index 72c0cb3..410a1ea 100644
--- a/fsdev/9p-iov-marshal.h
+++ b/fsdev/9p-iov-marshal.h
@@ -7,4 +7,9 @@ ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, 
size_t offset,
int bswap, const char *fmt, ...);
 ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
  int bswap, const char *fmt, ...);
+
+ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
+int bswap, const char *fmt, va_list ap);
+ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
+  int bswap, const char *fmt, va_list ap);
 #endif
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 07e5eff..780c398 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -39,6 +39,32 @@ enum {
 Oappend = 0x80,
 };
 
+ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
+{
+ssize_t ret;
+va_list ap;
+
+va_start(ap, fmt);
+ret = v9fs_iov_vmarshal(pdu->elem.in_sg, pdu->elem.in_num,
+offset, 1, fmt, ap);
+va_end(ap);
+
+return ret;
+}
+
+ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
+{
+ssize_t ret;
+va_list ap;
+
+va_start(ap, fmt);
+ret = v9fs_iov_vunmarshal(pdu->elem.out_sg, pdu->elem.out_num,
+  offset, 1, fmt, ap);
+va_end(ap);
+
+return ret;
+}
+
 static int omode_to_uflags(int8_t mode)
 {
 int ret = 0;
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index 3a7e136..d6f3ac0 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -320,10 +320,8 @@ extern void v9fs_path_copy(V9fsPath *lhs, V9fsPath *rhs);
 extern int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
  const char *name, V9fsPath *path);
 
-#define pdu_marshal(pdu, offset, fmt, args...)  \
-v9fs_iov_marshal(pdu->elem.in_sg, pdu->elem.in_num, offset, 1, fmt, ##args)
-#define pdu_unmarshal(pdu, offset, fmt, args...)  \
-

[Qemu-devel] [PATCH 14/22] 9pfs: PDU processing functions don't need to take V9fsState as argument

2016-01-05 Thread Wei Liu

V9fsState can be referenced by pdu->s. Initialise it in the device
realization function.
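
The idea can be sketched with reduced stand-in types. State, PDU, state_init and the completed counter are hypothetical and only model the back-pointer; they are not the real V9fsState or V9fsPDU.

```c
#include <stddef.h>

#define MAX_REQ 4 /* kept small for illustration */

typedef struct State State;

typedef struct PDU {
    State *s; /* back-pointer to the owning state */
    int id;
} PDU;

struct State {
    PDU pdus[MAX_REQ];
    int completed;
};

/* Done once at setup time so every PDU can find its owner later. */
static void state_init(State *s)
{
    s->completed = 0;
    for (size_t i = 0; i < MAX_REQ; i++) {
        s->pdus[i].s = s;
        s->pdus[i].id = (int)i;
    }
}

/* No State argument needed: it is recovered from the PDU itself. */
static void complete_pdu(PDU *pdu)
{
    State *s = pdu->s;

    s->completed++;
}
```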

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p-device.c |  1 +
 hw/9pfs/virtio-9p.c| 98 +-
 2 files changed, 46 insertions(+), 53 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 885b940..f3091cc 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -69,6 +69,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error 
**errp)
 QLIST_INIT(&s->active_list);
 for (i = 0; i < (MAX_REQ - 1); i++) {
 QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
+s->pdus[i].s = s;
 }
 
 s->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 5da25ec..f605895 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -575,9 +575,10 @@ static V9fsPDU *alloc_pdu(V9fsState *s)
 return pdu;
 }
 
-static void free_pdu(V9fsState *s, V9fsPDU *pdu)
+static void free_pdu(V9fsPDU *pdu)
 {
 if (pdu) {
+V9fsState *s = pdu->s;
 /*
  * Cancelled pdu are added back to the freelist
  * by flush request .
@@ -594,9 +595,10 @@ static void free_pdu(V9fsState *s, V9fsPDU *pdu)
  * because we always expect to have enough space to encode
  * error details
  */
-static void complete_pdu(V9fsState *s, V9fsPDU *pdu, ssize_t len)
+static void complete_pdu(V9fsPDU *pdu, ssize_t len)
 {
 int8_t id = pdu->id + 1; /* Response */
+V9fsState *s = pdu->s;
 
 if (len < 0) {
 int err = -len;
@@ -636,7 +638,7 @@ static void complete_pdu(V9fsState *s, V9fsPDU *pdu, 
ssize_t len)
 /* Now wakeup anybody waiting in flush for this request */
 qemu_co_queue_next(&pdu->complete);
 
-free_pdu(s, pdu);
+free_pdu(pdu);
 }
 
 static mode_t v9mode_to_mode(uint32_t mode, V9fsString *extension)
@@ -931,7 +933,7 @@ static void v9fs_version(void *opaque)
 offset += err;
 trace_v9fs_version_return(pdu->tag, pdu->id, s->msize, version.data);
 out:
-complete_pdu(s, pdu, offset);
+complete_pdu(pdu, offset);
 v9fs_string_free(&version);
 }
 
@@ -995,7 +997,7 @@ static void v9fs_attach(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 v9fs_string_free(&uname);
 v9fs_string_free(&aname);
 }
@@ -1009,7 +1011,6 @@ static void v9fs_stat(void *opaque)
 struct stat stbuf;
 V9fsFidState *fidp;
 V9fsPDU *pdu = opaque;
-V9fsState *s = pdu->s;
 
 err = pdu_unmarshal(pdu, offset, "d", &fid);
 if (err < 0) {
@@ -1042,7 +1043,7 @@ static void v9fs_stat(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 }
 
 static void v9fs_getattr(void *opaque)
@@ -1105,7 +1106,7 @@ static void v9fs_getattr(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, retval);
+complete_pdu(pdu, retval);
 }
 
 /* Attribute flags */
@@ -1129,7 +1130,6 @@ static void v9fs_setattr(void *opaque)
 size_t offset = 7;
 V9fsIattr v9iattr;
 V9fsPDU *pdu = opaque;
-V9fsState *s = pdu->s;
 
 err = pdu_unmarshal(pdu, offset, "dI", &fid, &v9iattr);
 if (err < 0) {
@@ -1203,7 +1203,7 @@ static void v9fs_setattr(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 }
 
 static int v9fs_walk_marshal(V9fsPDU *pdu, uint16_t nwnames, V9fsQID *qids)
@@ -1245,7 +1245,7 @@ static void v9fs_walk(void *opaque)
 
 err = pdu_unmarshal(pdu, offset, "ddw", &fid, &newfid, &nwnames);
 if (err < 0) {
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 return ;
 }
 offset += err;
@@ -1313,7 +1313,7 @@ out:
 v9fs_path_free(&dpath);
 v9fs_path_free(&path);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 if (nwnames && nwnames <= P9_MAXWELEM) {
 for (name_idx = 0; name_idx < nwnames; name_idx++) {
 v9fs_string_free(&wnames[name_idx]);
@@ -1430,7 +1430,7 @@ static void v9fs_open(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 }
 
 static void v9fs_lcreate(void *opaque)
@@ -1487,7 +1487,7 @@ static void v9fs_lcreate(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu->s, pdu, err);
+complete_pdu(pdu, err);
 v9fs_string_free(&name);
 }
 
@@ -1499,7 +1499,6 @@ static void v9fs_fsync(void *opaque)
 size_t offset = 7;
 V9fsFidState *fidp;
 V9fsPDU *pdu = opaque;
-V9fsState *s = pdu->s;
 
 err = pdu_unmarshal(pdu, offset, "dd", &fid, &datasync);
 if (err < 0) {
@@ -1518,7 +1517,7 @@ static void v9fs_fsync(void *opaque)
 }
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(s, pdu, err);
+complete_pdu(pdu, err);
 }
 
 static void v9fs_clunk

[Qemu-devel] [PATCH 19/22] 9pfs: break out virtio_init_iov_from_pdu

2016-01-05 Thread Wei Liu

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p-device.c | 12 
 hw/9pfs/virtio-9p.c|  8 +---
 hw/9pfs/virtio-9p.h|  2 ++
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index d77247f..5cad654 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -170,6 +170,18 @@ ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
offset, 1, fmt, ap);
 }
 
+void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
+  unsigned int *pniov, bool is_write)
+{
+if (is_write) {
+*piov = pdu->elem.out_sg;
+*pniov = pdu->elem.out_num;
+} else {
+*piov = pdu->elem.in_sg;
+*pniov = pdu->elem.in_num;
+}
+}
+
 /* virtio-9p device */
 
 static Property virtio_9p_properties[] = {
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 5475f29..d6850ba 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1702,13 +1702,7 @@ static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, 
V9fsPDU *pdu,
 struct iovec *iov;
 unsigned int niov;
 
-if (is_write) {
-iov = pdu->elem.out_sg;
-niov = pdu->elem.out_num;
-} else {
-iov = pdu->elem.in_sg;
-niov = pdu->elem.in_num;
-}
+virtio_init_iov_from_pdu(pdu, &iov, &niov, is_write);
 
 qemu_iovec_init_external(&elem, iov, niov);
 qemu_iovec_init(qiov, niov);
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index e298949..5024ad0 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -327,6 +327,8 @@ ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
 const char *fmt, va_list ap);
 ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
   const char *fmt, va_list ap);
+void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
+  unsigned int *pniov, bool is_write);
 
 #define TYPE_VIRTIO_9P "virtio-9p-device"
 #define VIRTIO_9P(obj) \
-- 
2.1.4

[Qemu-devel] [PATCH 18/22] 9pfs: factor out pdu_push_and_notify

2016-01-05 Thread Wei Liu

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index db79a48..5475f29 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -63,6 +63,17 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
char *fmt, ...)
 return ret;
 }
 
+static void pdu_push_and_notify(V9fsPDU *pdu)
+{
+V9fsState *s = pdu->s;
+
+/* push onto queue and notify */
+virtqueue_push(s->vq, &pdu->elem, pdu->size);
+
+/* FIXME: we should batch these completions */
+virtio_notify(VIRTIO_DEVICE(s), s->vq);
+}
+
 static int omode_to_uflags(int8_t mode)
 {
 int ret = 0;
@@ -653,11 +664,7 @@ static void pdu_complete(V9fsPDU *pdu, ssize_t len)
 pdu->size = len;
 pdu->id = id;
 
-/* push onto queue and notify */
-virtqueue_push(s->vq, &pdu->elem, len);
-
-/* FIXME: we should batch these completions */
-virtio_notify(VIRTIO_DEVICE(s), s->vq);
+pdu_push_and_notify(pdu);
 
 /* Now wakeup anybody waiting in flush for this request */
 qemu_co_queue_next(&pdu->complete);
-- 
2.1.4

[Qemu-devel] [PATCH 10/22] fsdev: break out 9p-marshal.{c, h} from virtio-9p-marshal.{c, h}

2016-01-05 Thread Wei Liu

Break out some generic functions for marshaling 9p state. Pure code
motion plus minor fixes for build system.

Signed-off-by: Wei Liu 
---
 Makefile  |  2 +-
 fsdev/9p-marshal.c| 57 
 fsdev/9p-marshal.h| 84 +++
 fsdev/Makefile.objs   |  2 +-
 fsdev/virtio-9p-marshal.c | 31 -
 fsdev/virtio-9p-marshal.h | 79 +---
 6 files changed, 144 insertions(+), 111 deletions(-)
 create mode 100644 fsdev/9p-marshal.c
 create mode 100644 fsdev/9p-marshal.h

diff --git a/Makefile b/Makefile
index 82b2fc8..7e881d8 100644
--- a/Makefile
+++ b/Makefile
@@ -240,7 +240,7 @@ qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(qom-obj-y) libqemuu
 
 qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
 
-fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
 
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
diff --git a/fsdev/9p-marshal.c b/fsdev/9p-marshal.c
new file mode 100644
index 000..610978e
--- /dev/null
+++ b/fsdev/9p-marshal.c
@@ -0,0 +1,57 @@
+/*
+ * 9p backend
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   
+ *  Wei Liu 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "qemu/compiler.h"
+#include "9p-marshal.h"
+
+void v9fs_string_free(V9fsString *str)
+{
+g_free(str->data);
+str->data = NULL;
+str->size = 0;
+}
+
+void v9fs_string_null(V9fsString *str)
+{
+v9fs_string_free(str);
+}
+
+void GCC_FMT_ATTR(2, 3)
+v9fs_string_sprintf(V9fsString *str, const char *fmt, ...)
+{
+va_list ap;
+
+v9fs_string_free(str);
+
+va_start(ap, fmt);
+str->size = g_vasprintf(&str->data, fmt, ap);
+va_end(ap);
+}
+
+void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs)
+{
+v9fs_string_free(lhs);
+v9fs_string_sprintf(lhs, "%s", rhs->data);
+}
diff --git a/fsdev/9p-marshal.h b/fsdev/9p-marshal.h
new file mode 100644
index 000..e91b24e
--- /dev/null
+++ b/fsdev/9p-marshal.h
@@ -0,0 +1,84 @@
+#ifndef _QEMU_9P_MARSHAL_H
+#define _QEMU_9P_MARSHAL_H
+
+typedef struct V9fsString
+{
+uint16_t size;
+char *data;
+} V9fsString;
+
+typedef struct V9fsQID
+{
+int8_t type;
+int32_t version;
+int64_t path;
+} V9fsQID;
+
+typedef struct V9fsStat
+{
+int16_t size;
+int16_t type;
+int32_t dev;
+V9fsQID qid;
+int32_t mode;
+int32_t atime;
+int32_t mtime;
+int64_t length;
+V9fsString name;
+V9fsString uid;
+V9fsString gid;
+V9fsString muid;
+/* 9p2000.u */
+V9fsString extension;
+int32_t n_uid;
+int32_t n_gid;
+int32_t n_muid;
+} V9fsStat;
+
+typedef struct V9fsIattr
+{
+int32_t valid;
+int32_t mode;
+int32_t uid;
+int32_t gid;
+int64_t size;
+int64_t atime_sec;
+int64_t atime_nsec;
+int64_t mtime_sec;
+int64_t mtime_nsec;
+} V9fsIattr;
+
+typedef struct V9fsStatDotl {
+uint64_t st_result_mask;
+V9fsQID qid;
+uint32_t st_mode;
+uint32_t st_uid;
+uint32_t st_gid;
+uint64_t st_nlink;
+uint64_t st_rdev;
+uint64_t st_size;
+uint64_t st_blksize;
+uint64_t st_blocks;
+uint64_t st_atime_sec;
+uint64_t st_atime_nsec;
+uint64_t st_mtime_sec;
+uint64_t st_mtime_nsec;
+uint64_t st_ctime_sec;
+uint64_t st_ctime_nsec;
+uint64_t st_btime_sec;
+uint64_t st_btime_nsec;
+uint64_t st_gen;
+uint64_t st_data_version;
+} V9fsStatDotl;
+
+static inline void v9fs_string_init(V9fsString *str)
+{
+str->data = NULL;
+str->size = 0;
+}
+extern void v9fs_string_free(V9fsString *str);
+extern void v9fs_string_null(V9fsString *str);
+extern void v9fs_string_sprintf(V9fsString *str, const char *fmt, ...);
+extern void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs);
+
+#endif
diff --git a/fsdev/Makefile.objs b/fsdev/Makefile.objs
index c27dad3..8357851 100644
--- a/fsdev/Makefile.objs
+++ b/fsdev/Makefile.objs
@@ -1,7 +1,7 @@
 ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
 # only pull in the actual virtio-9p device if we also enabled virtio.
-common-obj-y = qemu-fsdev.o virtio-9p-marshal.o
+common-obj-y = qemu-fsdev.o 9p-marshal.o virtio-9p-marshal.o
 else
 common-obj-y = qemu-fsdev-dummy.o
 endif
diff --git a/fsdev/virtio-9p-marshal.c b/fsdev/virtio-9p-marshal.c
index 7748d32..f236bab 100644
--- a/fsdev/virtio-9p-marshal.c
+++ b/fsdev/virtio-9p-marshal.c
@@ -25,37 +25,6 @@
 #include "virtio-9

[Qemu-devel] [PATCH 21/22] 9pfs: factor out v9fs_device_{, un}realize_common

2016-01-05 Thread Wei Liu

Signed-off-by: Wei Liu 
---
 hw/9pfs/9p.c   | 95 ++
 hw/9pfs/9p.h   |  2 +
 hw/9pfs/virtio-9p-device.c | 90 ---
 3 files changed, 104 insertions(+), 83 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 2e9982f..4f2defd 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3271,3 +3271,98 @@ void pdu_submit(V9fsPDU *pdu)
 co = qemu_coroutine_create(handler);
 qemu_coroutine_enter(co, pdu);
 }
+
+/* Returns 0 on success, 1 on failure. */
+int v9fs_device_realize_common(V9fsState *s, Error **errp)
+{
+int i, len;
+struct stat stat;
+FsDriverEntry *fse;
+V9fsPath path;
+int rc = 1;
+
+/* initialize pdu allocator */
+QLIST_INIT(&s->free_list);
+QLIST_INIT(&s->active_list);
+for (i = 0; i < (MAX_REQ - 1); i++) {
+QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
+s->pdus[i].s = s;
+}
+
+v9fs_path_init(&path);
+
+fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
+
+if (!fse) {
+/* We don't have a fsdev identified by fsdev_id */
+error_setg(errp, "9pfs device couldn't find fsdev with the "
+   "id = %s",
+   s->fsconf.fsdev_id ? s->fsconf.fsdev_id : "NULL");
+goto out;
+}
+
+if (!s->fsconf.tag) {
+/* we haven't specified a mount_tag */
+error_setg(errp, "fsdev with id %s needs mount_tag arguments",
+   s->fsconf.fsdev_id);
+goto out;
+}
+
+s->ctx.export_flags = fse->export_flags;
+s->ctx.fs_root = g_strdup(fse->path);
+s->ctx.exops.get_st_gen = NULL;
+len = strlen(s->fsconf.tag);
+if (len > MAX_TAG_LEN - 1) {
+error_setg(errp, "mount tag '%s' (%d bytes) is longer than "
+   "maximum (%d bytes)", s->fsconf.tag, len, MAX_TAG_LEN - 1);
+goto out;
+}
+
+s->tag = g_strdup(s->fsconf.tag);
+s->ctx.uid = -1;
+
+s->ops = fse->ops;
+
+s->fid_list = NULL;
+qemu_co_rwlock_init(&s->rename_lock);
+
+if (s->ops->init(&s->ctx) < 0) {
+error_setg(errp, "9pfs Failed to initialize fs-driver with id:%s"
+   " and export path:%s", s->fsconf.fsdev_id, s->ctx.fs_root);
+goto out;
+}
+
+/*
+ * Check details of export path, We need to use fs driver
+ * call back to do that. Since we are in the init path, we don't
+ * use co-routines here.
+ */
+if (s->ops->name_to_path(&s->ctx, NULL, "/", &path) < 0) {
+error_setg(errp,
+   "error in converting name to path %s", strerror(errno));
+goto out;
+}
+if (s->ops->lstat(&s->ctx, &path, &stat)) {
+error_setg(errp, "share path %s does not exist", fse->path);
+goto out;
+} else if (!S_ISDIR(stat.st_mode)) {
+error_setg(errp, "share path %s is not a directory", fse->path);
+goto out;
+}
+v9fs_path_free(&path);
+
+rc = 0;
+out:
+if (rc) {
+g_free(s->ctx.fs_root);
+g_free(s->tag);
+v9fs_path_free(&path);
+}
+return rc;
+}
+
+void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
+{
+g_free(s->ctx.fs_root);
+g_free(s->tag);
+}
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 6ed2f1b..76c7cec 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -318,6 +318,8 @@ extern void v9fs_path_free(V9fsPath *path);
 extern void v9fs_path_copy(V9fsPath *lhs, V9fsPath *rhs);
 extern int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
  const char *name, V9fsPath *path);
+extern int v9fs_device_realize_common(V9fsState *s, Error **errp);
+extern void v9fs_device_unrealize_common(V9fsState *s, Error **errp);
 
 ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
 ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index fdf79a2..f6e7ec7 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -104,93 +104,18 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 V9fsState *s = VIRTIO_9P(dev);
-int i, len;
-struct stat stat;
-FsDriverEntry *fse;
-V9fsPath path;
-
-virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P,
-sizeof(struct virtio_9p_config) + MAX_TAG_LEN);
-
-/* initialize pdu allocator */
-QLIST_INIT(&s->free_list);
-QLIST_INIT(&s->active_list);
-for (i = 0; i < (MAX_REQ - 1); i++) {
-QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
-s->pdus[i].s = s;
-}
-
-s->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
-
-v9fs_path_init(&path);
-
-fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
-
-if (!fse) {
-/* We don't have a fsdev identified by fsdev_id */
-error_setg(errp, "Virtio-9p device couldn't find fsdev with the "
-

[Qemu-devel] [PATCH 22/22] 9pfs: disentangle V9fsState

2016-01-05 Thread Wei Liu

V9fsState now only contains generic fields. Introduce V9fsVirtioState
for virtio transport.  Change virtio-pci and virtio-ccw to use
V9fsVirtioState. Handle transport enumeration in generic routines.
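
As a rough illustration of the layout described above, the sketch below embeds a generic state inside a transport-specific state, recovers the outer structure from a pointer to the inner one, and dispatches on a transport enum. All `Demo*` / `demo_*` names are hypothetical stand-ins and not the structures in this patch; the real V9fsState and V9fsVirtioState carry many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport enumeration, mirroring the idea of enum p9_transport. */
enum demo_transport {
    DEMO_VIRTIO = 0x1,
};

/* Generic part: knows nothing about the transport. */
typedef struct DemoState {
    int msize;
} DemoState;

/* Transport-specific part: embeds the generic state as a member. */
typedef struct DemoVirtioState {
    int vq_id;          /* stands in for transport-only fields */
    DemoState state;    /* generic state lives inside the transport state */
} DemoVirtioState;

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static DemoVirtioState *demo_to_virtio(DemoState *s)
{
    assert(s != NULL);
    return demo_container_of(s, DemoVirtioState, state);
}

/*
 * Generic routine: switch on the transport and hand over to the
 * transport-specific side. Unknown transports return -1.
 */
static int demo_push_and_notify(enum demo_transport t, DemoState *s)
{
    switch (t) {
    case DEMO_VIRTIO:
        return demo_to_virtio(s)->vq_id;
    default:
        return -1;
    }
}
```

The point of the sketch is only that generic code holds a `DemoState *` and never needs the transport type in its own structure definition; adding a second transport would mean a new enum value and a new `case`.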

Signed-off-by: Wei Liu 
---
 hw/9pfs/9p.c   | 41 +++-
 hw/9pfs/9p.h   | 19 +++
 hw/9pfs/virtio-9p-device.c | 80 +-
 hw/9pfs/virtio-9p.h| 13 +++-
 hw/s390x/virtio-ccw.h  |  2 +-
 hw/virtio/virtio-pci.h |  2 +-
 6 files changed, 109 insertions(+), 48 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 4f2defd..f9c5451 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -46,7 +46,13 @@ ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-ret = virtio_pdu_vmarshal(pdu, offset, fmt, ap);
+switch (pdu->transport) {
+case VIRTIO:
+ret = virtio_pdu_vmarshal(pdu, offset, fmt, ap);
+break;
+default:
+ret = -1;
+}
 va_end(ap);
 
 return ret;
@@ -58,7 +64,13 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-ret = virtio_pdu_vunmarshal(pdu, offset, fmt, ap);
+switch (pdu->transport) {
+case VIRTIO:
+ret = virtio_pdu_vunmarshal(pdu, offset, fmt, ap);
+break;
+default:
+ret = -1;
+}
 va_end(ap);
 
 return ret;
@@ -66,7 +78,11 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...)
 
 static void pdu_push_and_notify(V9fsPDU *pdu)
 {
-virtio_9p_push_and_notify(pdu);
+switch (pdu->transport) {
+case VIRTIO:
+virtio_9p_push_and_notify(pdu);
+break;
+}
 }
 
 static int omode_to_uflags(int8_t mode)
@@ -1697,7 +1713,11 @@ static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, V9fsPDU *pdu,
 struct iovec *iov;
 unsigned int niov;
 
-virtio_init_iov_from_pdu(pdu, &iov, &niov, is_write);
+switch (pdu->transport) {
+case VIRTIO:
+virtio_init_iov_from_pdu(pdu, &iov, &niov, is_write);
+break;
+}
 
 qemu_iovec_init_external(&elem, iov, niov);
 qemu_iovec_init(qiov, niov);
@@ -3273,8 +3293,10 @@ void pdu_submit(V9fsPDU *pdu)
 }
 
 /* Returns 0 on success, 1 on failure. */
-int v9fs_device_realize_common(V9fsState *s, Error **errp)
+int v9fs_device_realize_common(V9fsState *s, enum p9_transport transport,
+   Error **errp)
 {
+V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
 int i, len;
 struct stat stat;
 FsDriverEntry *fse;
@@ -3285,8 +3307,10 @@ int v9fs_device_realize_common(V9fsState *s, Error **errp)
 QLIST_INIT(&s->free_list);
 QLIST_INIT(&s->active_list);
 for (i = 0; i < (MAX_REQ - 1); i++) {
-QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
-s->pdus[i].s = s;
+QLIST_INSERT_HEAD(&s->free_list, &v->pdus[i], next);
+v->pdus[i].s = s;
+v->pdus[i].idx = i;
+v->pdus[i].transport = transport;
 }
 
 v9fs_path_init(&path);
@@ -3361,7 +3385,8 @@ out:
 return rc;
 }
 
-void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
+void v9fs_device_unrealize_common(V9fsState *s, enum p9_transport transport,
+  Error **errp)
 {
 g_free(s->ctx.fs_root);
 g_free(s->tag);
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 76c7cec..c36a02b 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -14,6 +14,10 @@
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
 
+enum p9_transport {
+VIRTIO = 0x1,
+};
+
 enum {
 P9_TLERROR = 6,
 P9_RLERROR,
@@ -131,9 +135,10 @@ struct V9fsPDU
 uint8_t id;
 uint8_t cancelled;
 CoQueue complete;
-VirtQueueElement elem;
 struct V9fsState *s;
 QLIST_ENTRY(V9fsPDU) next;
+uint32_t idx; /* index inside the array */
+enum p9_transport transport;
 };
 
 
@@ -205,16 +210,12 @@ struct V9fsFidState
 
 typedef struct V9fsState
 {
-VirtIODevice parent_obj;
-VirtQueue *vq;
-V9fsPDU pdus[MAX_REQ];
 QLIST_HEAD(, V9fsPDU) free_list;
 QLIST_HEAD(, V9fsPDU) active_list;
 V9fsFidState *fid_list;
 FileOperations *ops;
 FsContext ctx;
 char *tag;
-size_t config_size;
 enum p9_proto_version proto_version;
 int32_t msize;
 /*
@@ -318,8 +319,12 @@ extern void v9fs_path_free(V9fsPath *path);
 extern void v9fs_path_copy(V9fsPath *lhs, V9fsPath *rhs);
 extern int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
  const char *name, V9fsPath *path);
-extern int v9fs_device_realize_common(V9fsState *s, Error **errp);
-extern void v9fs_device_unrealize_common(V9fsState *s, Error **errp);
+
+extern int v9fs_device_realize_common(V9fsState *s, enum p9_transport transport,
+  Error **errp);
+extern void v9fs_device_unrealize_common(V9fsState *s,
+

[Qemu-devel] [PATCH 15/22] 9pfs: PDU processing functions should start pdu_ prefix

2016-01-05 Thread Wei Liu

This matches the naming convention of pdu_marshal and pdu_unmarshal.

Signed-off-by: Wei Liu 
---
This patch can be squashed if necessary.
---
 hw/9pfs/virtio-9p.c | 88 ++---
 1 file changed, 44 insertions(+), 44 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index f605895..07e5eff 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -563,7 +563,7 @@ static int fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp, V9fsQID *qidp)
 return 0;
 }
 
-static V9fsPDU *alloc_pdu(V9fsState *s)
+static V9fsPDU *pdu_alloc(V9fsState *s)
 {
 V9fsPDU *pdu = NULL;
 
@@ -575,7 +575,7 @@ static V9fsPDU *alloc_pdu(V9fsState *s)
 return pdu;
 }
 
-static void free_pdu(V9fsPDU *pdu)
+static void pdu_free(V9fsPDU *pdu)
 {
 if (pdu) {
 V9fsState *s = pdu->s;
@@ -595,7 +595,7 @@ static void free_pdu(V9fsPDU *pdu)
  * because we always expect to have enough space to encode
  * error details
  */
-static void complete_pdu(V9fsPDU *pdu, ssize_t len)
+static void pdu_complete(V9fsPDU *pdu, ssize_t len)
 {
 int8_t id = pdu->id + 1; /* Response */
 V9fsState *s = pdu->s;
@@ -638,7 +638,7 @@ static void complete_pdu(V9fsPDU *pdu, ssize_t len)
 /* Now wakeup anybody waiting in flush for this request */
 qemu_co_queue_next(&pdu->complete);
 
-free_pdu(pdu);
+pdu_free(pdu);
 }
 
 static mode_t v9mode_to_mode(uint32_t mode, V9fsString *extension)
@@ -933,7 +933,7 @@ static void v9fs_version(void *opaque)
 offset += err;
 trace_v9fs_version_return(pdu->tag, pdu->id, s->msize, version.data);
 out:
-complete_pdu(pdu, offset);
+pdu_complete(pdu, offset);
 v9fs_string_free(&version);
 }
 
@@ -997,7 +997,7 @@ static void v9fs_attach(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 v9fs_string_free(&uname);
 v9fs_string_free(&aname);
 }
@@ -1043,7 +1043,7 @@ static void v9fs_stat(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static void v9fs_getattr(void *opaque)
@@ -1106,7 +1106,7 @@ static void v9fs_getattr(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, retval);
+pdu_complete(pdu, retval);
 }
 
 /* Attribute flags */
@@ -1203,7 +1203,7 @@ static void v9fs_setattr(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static int v9fs_walk_marshal(V9fsPDU *pdu, uint16_t nwnames, V9fsQID *qids)
@@ -1245,7 +1245,7 @@ static void v9fs_walk(void *opaque)
 
 err = pdu_unmarshal(pdu, offset, "ddw", &fid, &newfid, &nwnames);
 if (err < 0) {
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 return ;
 }
 offset += err;
@@ -1313,7 +1313,7 @@ out:
 v9fs_path_free(&dpath);
 v9fs_path_free(&path);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 if (nwnames && nwnames <= P9_MAXWELEM) {
 for (name_idx = 0; name_idx < nwnames; name_idx++) {
 v9fs_string_free(&wnames[name_idx]);
@@ -1430,7 +1430,7 @@ static void v9fs_open(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static void v9fs_lcreate(void *opaque)
@@ -1487,7 +1487,7 @@ static void v9fs_lcreate(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 v9fs_string_free(&name);
 }
 
@@ -1517,7 +1517,7 @@ static void v9fs_fsync(void *opaque)
 }
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static void v9fs_clunk(void *opaque)
@@ -1550,7 +1550,7 @@ static void v9fs_clunk(void *opaque)
 err = offset;
 }
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, V9fsFidState *fidp,
@@ -1765,7 +1765,7 @@ static void v9fs_read(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static size_t v9fs_readdir_data_size(V9fsString *name)
@@ -1888,7 +1888,7 @@ static void v9fs_readdir(void *opaque)
 out:
 put_fid(pdu, fidp);
 out_nofid:
-complete_pdu(pdu, retval);
+pdu_complete(pdu, retval);
 }
 
 static int v9fs_xattr_write(V9fsState *s, V9fsPDU *pdu, V9fsFidState *fidp,
@@ -1955,7 +1955,7 @@ static void v9fs_write(void *opaque)
 
 err = pdu_unmarshal(pdu, offset, "dqd", &fid, &off, &count);
 if (err < 0) {
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 return;
 }
 offset += err;
@@ -2018,7 +2018,7 @@ out:
 put_fid(pdu, fidp);
 out_nofid:
 qemu_iovec_destroy(&qiov_full);
-complete_pdu(pdu, err);
+pdu_complete(pdu, err);
 }
 
 static void v9fs_create(void *opaque)
@@ -2185,7 +2185,7 @@ static void v9fs_create(void *opaque)
 out:

[Qemu-devel] [PATCH 11/22] fsdev: 9p-marshal: introduce V9fsBlob

2016-01-05 Thread Wei Liu

Introduce a concept of blob. It will be used to pack / unpack xattr
value.

With this change there is no need to expose v9fs_pack to device code
anymore.
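
To make the blob idea concrete, here is a minimal sketch of a size-prefixed pack / unpack pair over a flat buffer. It assumes a 16-bit little-endian length followed by the raw bytes. The `DemoBlob` type and `demo_blob_*` helpers are hypothetical and work on a plain byte array, whereas the real code operates on iovecs through v9fs_pack / v9fs_unpack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical blob type with the same shape as V9fsBlob in this patch. */
typedef struct DemoBlob {
    uint16_t size;
    void *data;
} DemoBlob;

/* Write size (2 bytes, little-endian) then payload; return bytes written or -1. */
static long demo_blob_pack(uint8_t *buf, size_t buflen, const DemoBlob *blob)
{
    assert(buf != NULL && blob != NULL);
    if (buflen < 2u + blob->size) {
        return -1;                      /* not enough room in the buffer */
    }
    buf[0] = (uint8_t)(blob->size & 0xff);
    buf[1] = (uint8_t)(blob->size >> 8);
    if (blob->size) {
        memcpy(buf + 2, blob->data, blob->size);
    }
    return 2 + (long)blob->size;
}

/* Read size then payload into freshly allocated memory; return bytes read or -1. */
static long demo_blob_unpack(const uint8_t *buf, size_t buflen, DemoBlob *blob)
{
    uint16_t size;

    assert(buf != NULL && blob != NULL);
    if (buflen < 2) {
        return -1;                      /* size field is truncated */
    }
    size = (uint16_t)(buf[0] | (buf[1] << 8));
    if (buflen - 2 < size) {
        return -1;                      /* payload is truncated */
    }
    blob->data = malloc(size ? size : 1);
    if (!blob->data) {
        return -1;
    }
    memcpy(blob->data, buf + 2, size);
    blob->size = size;
    return 2 + (long)size;
}
```

On failure the unpack helper leaves the caller's blob untouched, so a caller that initialised it to `{0, NULL}` can free it unconditionally.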

Signed-off-by: Wei Liu 
---
 fsdev/9p-marshal.c|  7 +++
 fsdev/9p-marshal.h| 14 ++
 fsdev/virtio-9p-marshal.c | 26 ++
 3 files changed, 47 insertions(+)

diff --git a/fsdev/9p-marshal.c b/fsdev/9p-marshal.c
index 610978e..b457d49 100644
--- a/fsdev/9p-marshal.c
+++ b/fsdev/9p-marshal.c
@@ -55,3 +55,10 @@ void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs)
 v9fs_string_free(lhs);
 v9fs_string_sprintf(lhs, "%s", rhs->data);
 }
+
+void v9fs_blob_free(V9fsBlob *blob)
+{
+g_free(blob->data);
+blob->data = NULL;
+blob->size = 0;
+}
diff --git a/fsdev/9p-marshal.h b/fsdev/9p-marshal.h
index e91b24e..5a0150b 100644
--- a/fsdev/9p-marshal.h
+++ b/fsdev/9p-marshal.h
@@ -7,6 +7,12 @@ typedef struct V9fsString
 char *data;
 } V9fsString;
 
+typedef struct V9fsBlob
+{
+uint16_t size;
+void *data;
+} V9fsBlob;
+
 typedef struct V9fsQID
 {
 int8_t type;
@@ -81,4 +87,12 @@ extern void v9fs_string_null(V9fsString *str);
 extern void v9fs_string_sprintf(V9fsString *str, const char *fmt, ...);
 extern void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs);
 
+static inline void v9fs_blob_init(V9fsBlob *blob)
+{
+blob->data = NULL;
+blob->size = 0;
+}
+
+extern void v9fs_blob_free(V9fsBlob *blob);
+
 #endif
diff --git a/fsdev/virtio-9p-marshal.c b/fsdev/virtio-9p-marshal.c
index f236bab..c3ac316 100644
--- a/fsdev/virtio-9p-marshal.c
+++ b/fsdev/virtio-9p-marshal.c
@@ -142,6 +142,21 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
 }
 break;
 }
+case 'B': {
+V9fsBlob *blob = va_arg(ap, V9fsBlob *);
+copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
+"w", &blob->size);
+if (copied > 0) {
+offset += copied;
+blob->data = g_malloc(blob->size);
+copied = v9fs_unpack(blob->data, out_sg, out_num, offset,
+ blob->size);
+if (copied < 0) {
+v9fs_blob_free(blob);
+}
+}
+break;
+}
 case 'Q': {
 V9fsQID *qidp = va_arg(ap, V9fsQID *);
 copied = v9fs_unmarshal(out_sg, out_num, offset, bswap, "bdq",
@@ -241,6 +256,17 @@ ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
 }
 break;
 }
+case 'B': {
+V9fsBlob *blob = va_arg(ap, V9fsBlob *);
+copied = v9fs_marshal(in_sg, in_num, offset, bswap,
+  "w", blob->size);
+if (copied > 0) {
+offset += copied;
+copied = v9fs_pack(in_sg, in_num, offset, blob->data,
+   blob->size);
+}
+break;
+}
 case 'Q': {
 V9fsQID *qidp = va_arg(ap, V9fsQID *);
 copied = v9fs_marshal(in_sg, in_num, offset, bswap, "bdq",
-- 
2.1.4

Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Frame v12 25/38] qmp event: Add event notification for COLO error

2016-01-05 Thread John Snow



On 12/22/2015 08:24 PM, Wen Congyang wrote:
> On 12/19/2015 06:02 PM, Markus Armbruster wrote:
>> Copying qemu-block because this seems related to generalising block jobs
>> to background jobs.
>>
>> zhanghailiang  writes:
>>
>>> If some errors happen during VM's COLO FT stage, it's important to notify the users
>>> of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's
>>> failover work immediately.
>>> If users don't want to get involved in COLO's failover verdict,
>>> it is still necessary to notify users that we exited COLO mode.
>>>
>>> Cc: Markus Armbruster 
>>> Cc: Michael Roth 
>>> Signed-off-by: zhanghailiang 
>>> Signed-off-by: Li Zhijian 
>>> ---
>>> v11:
>>> - Fix several typos found by Eric
>>>
>>> Signed-off-by: zhanghailiang 
>>> ---
>>>  docs/qmp-events.txt | 17 +
>>>  migration/colo.c| 11 +++
>>>  qapi-schema.json| 16 
>>>  qapi/event.json | 17 +
>>>  4 files changed, 61 insertions(+)
>>>
>>> diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
>>> index d2f1ce4..19f68fc 100644
>>> --- a/docs/qmp-events.txt
>>> +++ b/docs/qmp-events.txt
>>> @@ -184,6 +184,23 @@ Example:
>>>  Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
>>>  event.
>>>  
>>> +COLO_EXIT
>>> +-
>>> +
>>> +Emitted when VM finishes COLO mode due to some errors happening or
>>> +at the request of users.
>>
>> How would the event's recipient distinguish between "due to error" and
>> "at the user's request"?
>>
>>> +
>>> +Data:
>>> +
>>> + - "mode": COLO mode, primary or secondary side (json-string)
>>> + - "reason":  the exit reason, internal error or external request. (json-string)
>>> + - "error": error message (json-string, operation)
>>> +
>>> +Example:
>>> +
>>> +{"timestamp": {"seconds": 2032141960, "microseconds": 417172},
>>> + "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
>>> +
>>
>> Pardon my ignorance again...  Does "VM finishes COLO mode" mean we have
>> some kind of COLO background job, and it just finished for whatever
>> reason?
>>
>> If yes, this COLO job could be an instance of the general background job
>> concept we're trying to grow from the existing block job concept.
>>
>> I'm not asking you to rebase your work onto the background job
>> infrastructure, not least for the simple reason that it doesn't exist,
>> yet.  But I think it would be fruitful to compare your COLO job
>> management QMP interface with the one we have for block jobs.  Not only
>> may that avoid unnecessary inconsistency, it could also help shape the
>> general background job interface.
> 
> COLO is not a block job. If live migration is a background job, COLO
> is also a background job.
> 

Right. We are contemplating expanding the "block job" subsystem to be a
generic "background job" system. Live Migration might be one target to
be converted into this Jobs API, COLO might also be a fit.

The framework doesn't exist yet, though.

>>
>> Quick overview of the block job QMP interface:
>>
>> * Commands to create a job: block-commit, block-stream, drive-mirror,
>>   drive-backup.
>>
>> * Get information on jobs: query-block-jobs
>>
>> * Pause a job: block-job-pause
>>
>> * Resume a job: block-job-resume
>>
>> * Cancel a job: block-job-cancel
>>
>> * Block job completion events: BLOCK_JOB_COMPLETED, BLOCK_JOB_CANCELLED
>>
>> * Block job error event: BLOCK_JOB_ERROR
>>
>> * Block job synchronous completion: event BLOCK_JOB_READY and command
>>   block-job-complete
> 
> What is background job infrastructure? Do you mean implement all the above
> interfaces for each background job?
> 
> Thanks
> Wen Congyang
> 

Markus is laying out how Block Jobs currently work for some background
on how the job system exists today. He's highlighting the commands to
create, query, pause, resume, and cancel jobs; as well as demonstrating
the QMP events that the Block Job system uses to indicate completion,
cancellation, error and convergence.

We're thinking of making a generic background job system that would
replace the blockjobs API with a new generic Jobs API that looks very
similar.

Something like this:

Commands:
query: query-jobs
pause: job-pause
resume: job-resume
cancel: job-cancel
complete: job-complete (finalizes a long running command that has converged)

Events:
completion: JOB_COMPLETED, JOB_CANCELLED
error: JOB_ERROR
convergence indicator: JOB_READY
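
Purely as an illustration of how the commands and events listed above might relate, here is a small state-machine sketch in C. Everything in it is an assumption for discussion: the `Demo*` types, the state names, and the transition rules are hypothetical, since the generic job system does not exist yet. Only the event names are taken from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical job states; READY means the job has converged. */
typedef enum {
    DEMO_JOB_RUNNING,
    DEMO_JOB_PAUSED,
    DEMO_JOB_READY,
    DEMO_JOB_FINISHED
} DemoJobState;

/* Hypothetical commands, one per proposed job-* command. */
typedef enum {
    DEMO_CMD_PAUSE,
    DEMO_CMD_RESUME,
    DEMO_CMD_CANCEL,
    DEMO_CMD_COMPLETE
} DemoJobCmd;

/*
 * Apply a command to a job. Returns the name of the event that would be
 * emitted, or NULL when the command emits nothing or does not apply.
 */
static const char *demo_job_command(DemoJobState *state, DemoJobCmd cmd)
{
    assert(state != NULL);
    switch (cmd) {
    case DEMO_CMD_PAUSE:
        if (*state == DEMO_JOB_RUNNING) {
            *state = DEMO_JOB_PAUSED;
        }
        return NULL;
    case DEMO_CMD_RESUME:
        if (*state == DEMO_JOB_PAUSED) {
            *state = DEMO_JOB_RUNNING;
        }
        return NULL;
    case DEMO_CMD_CANCEL:
        if (*state != DEMO_JOB_FINISHED) {
            *state = DEMO_JOB_FINISHED;
            return "JOB_CANCELLED";
        }
        return NULL;
    case DEMO_CMD_COMPLETE:
        /* Only a converged job can be finalized. */
        if (*state == DEMO_JOB_READY) {
            *state = DEMO_JOB_FINISHED;
            return "JOB_COMPLETED";
        }
        return NULL;
    }
    return NULL;
}
```

In this sketch a COLO session would be one such job: an exit at the user's request would look like a cancel, while an internal failure would surface through JOB_ERROR, which the sketch leaves out.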

The system doesn't exist yet, but your proposed events that indicate
success/failure etc for COLO caught Markus' attention as perhaps quite
neatly fitting into the above proposed system.

--js

>>
>>>  DEVICE_DELETED
>>>  --
>>>  
>>> diff --git a/migration/colo.c b/migration/colo.c
>>> index d1dd4e1..d06c14f 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -18,6 +18,7 @@
>>>  #include "qemu/error-report.h"
>>>  #include "qemu/sockets.h"
>>>  #include "migration/failover.h"
>>> +#include "qapi-event

Re: [Qemu-devel] [PATCH v2 0/3] virtio: cross-endian helpers fixes

2016-01-05 Thread Greg Kurz

On Wed, 23 Dec 2015 17:28:23 +0100
Greg Kurz  wrote:

> On Wed, 23 Dec 2015 15:47:00 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Thu, Dec 17, 2015 at 09:52:46AM +0100, Greg Kurz wrote:
> > > This series tries to rework cross-endian helpers for better clarity.
> > > It does not change behaviour, except perhaps patch 3/3 even if I could not
> > > measure any performance gain.
> > 
> > Breaks build:
> > 
> > >   CC    mips64-softmmu/hw/mips/mips_malta.o
> > /home/mst/scm/qemu/hw/net/vhost_net.c: In function
> > ‘vhost_net_set_vnet_endian’:
> > /home/mst/scm/qemu/hw/net/vhost_net.c:208:10: error: implicit
> > declaration of function ‘virtio_legacy_is_cross_endian’
> > [-Werror=implicit-function-declaration]
> >  (virtio_legacy_is_cross_endian(dev) &&
> > !virtio_is_big_endian(dev))) {
> >   ^
> > /home/mst/scm/qemu/hw/net/vhost_net.c:208:9: error: nested extern
> > declaration of ‘virtio_legacy_is_cross_endian’ [-Werror=nested-externs]
> >  (virtio_legacy_is_cross_endian(dev) &&
> > !virtio_is_big_endian(dev))) {
> >  ^
> > cc1: all warnings being treated as errors
> > /home/mst/scm/qemu/rules.mak:57: recipe for target 'hw/net/vhost_net.o'
> > failed
> > make[1]: *** [hw/net/vhost_net.o] Error 1
> > Makefile:186: recipe for target 'subdir-i386-softmmu' failed
> > make: *** [subdir-i386-softmmu] Error 2
> > 
> > 
> > please always build all architectures.
> > 
> 
> Ok. I'll do so from now on.
> 

The break isn't architecture related actually. It is because this series
depends on the "virtio-net/vhost-net: share cross-endian enablement" series
I had posted before... my bad. Since these series are mostly cleanup of the
cross-endian code, I'll repost a single series with all the patches.

> > > ---
> > > 
> > > Greg Kurz (3):
> > >   virtio: move cross-endian helper to vhost
> > >   vhost: move virtio 1.0 check to cross-endian helper
> > > >   virtio: optimize virtio_access_is_big_endian() for little-endian targets
> > > 
> > > 
> > >  hw/virtio/vhost.c |   22 ++
> > >  include/hw/virtio/virtio-access.h |   16 +++-
> > >  2 files changed, 21 insertions(+), 17 deletions(-)
> > 
> 
>

[Qemu-devel] [PATCH 07/22] 9pfs: rename virtio-9p-xattr{, -user}.{c, h} to 9p-xattr{, -user}.{c, h}

2016-01-05 Thread Wei Liu

These three files are not virtio specific. Rename them to generic
names.

Fix comments and header inclusion in various files.

Signed-off-by: Wei Liu 
---
 hw/9pfs/9p-handle.c | 2 +-
 hw/9pfs/9p-local.c  | 2 +-
 hw/9pfs/9p-posix-acl.c  | 2 +-
 hw/9pfs/9p-synth.c  | 2 +-
 hw/9pfs/{virtio-9p-xattr-user.c => 9p-xattr-user.c} | 5 ++---
 hw/9pfs/{virtio-9p-xattr.c => 9p-xattr.c}   | 5 ++---
 hw/9pfs/{virtio-9p-xattr.h => 9p-xattr.h}   | 6 +++---
 hw/9pfs/Makefile.objs   | 4 ++--
 hw/9pfs/virtio-9p-device.c  | 2 +-
 hw/9pfs/virtio-9p.c | 2 +-
 10 files changed, 15 insertions(+), 17 deletions(-)
 rename hw/9pfs/{virtio-9p-xattr-user.c => 9p-xattr-user.c} (97%)
 rename hw/9pfs/{virtio-9p-xattr.c => 9p-xattr.c} (97%)
 rename hw/9pfs/{virtio-9p-xattr.h => 9p-xattr.h} (97%)

diff --git a/hw/9pfs/9p-handle.c b/hw/9pfs/9p-handle.c
index a48dbc9..51a9d15 100644
--- a/hw/9pfs/9p-handle.c
+++ b/hw/9pfs/9p-handle.c
@@ -12,7 +12,7 @@
  */
 
 #include "virtio-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 #include 
 #include 
 #include 
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 877ad86..ac553e0 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -12,7 +12,7 @@
  */
 
 #include "virtio-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"   /* local_ops */
 #include 
 #include 
diff --git a/hw/9pfs/9p-posix-acl.c b/hw/9pfs/9p-posix-acl.c
index 1ee7bdc..073af39 100644
--- a/hw/9pfs/9p-posix-acl.c
+++ b/hw/9pfs/9p-posix-acl.c
@@ -15,7 +15,7 @@
 #include "qemu/xattr.h"
 #include "virtio-9p.h"
 #include "fsdev/file-op-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 
 #define MAP_ACL_ACCESS "user.virtfs.system.posix_acl_access"
 #define MAP_ACL_DEFAULT "user.virtfs.system.posix_acl_default"
diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index 6d34b89..b1064e3 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -14,7 +14,7 @@
 
 #include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"
 #include "9p-synth.h"
 #include "qemu/rcu.h"
diff --git a/hw/9pfs/virtio-9p-xattr-user.c b/hw/9pfs/9p-xattr-user.c
similarity index 97%
rename from hw/9pfs/virtio-9p-xattr-user.c
rename to hw/9pfs/9p-xattr-user.c
index 46133e0..163b158 100644
--- a/hw/9pfs/virtio-9p-xattr-user.c
+++ b/hw/9pfs/9p-xattr-user.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p user. xattr callback
+ * 9p user. xattr callback
  *
  * Copyright IBM, Corp. 2010
  *
@@ -12,10 +12,9 @@
  */
 
 #include 
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "fsdev/file-op-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 
 
 static ssize_t mp_user_getxattr(FsContext *ctx, const char *path,
diff --git a/hw/9pfs/virtio-9p-xattr.c b/hw/9pfs/9p-xattr.c
similarity index 97%
rename from hw/9pfs/virtio-9p-xattr.c
rename to hw/9pfs/9p-xattr.c
index 0718388..1d7861b 100644
--- a/hw/9pfs/virtio-9p-xattr.c
+++ b/hw/9pfs/9p-xattr.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p  xattr callback
+ * 9p  xattr callback
  *
  * Copyright IBM, Corp. 2010
  *
@@ -11,10 +11,9 @@
  *
  */
 
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "fsdev/file-op-9p.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 
 
 static XattrOperations *get_xattr_operations(XattrOperations **h,
diff --git a/hw/9pfs/virtio-9p-xattr.h b/hw/9pfs/9p-xattr.h
similarity index 97%
rename from hw/9pfs/virtio-9p-xattr.h
rename to hw/9pfs/9p-xattr.h
index 327b32b..4d39a20 100644
--- a/hw/9pfs/virtio-9p-xattr.h
+++ b/hw/9pfs/9p-xattr.h
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p
+ * 9p
  *
  * Copyright IBM, Corp. 2010
  *
@@ -10,8 +10,8 @@
  * the COPYING file in the top-level directory.
  *
  */
-#ifndef _QEMU_VIRTIO_9P_XATTR_H
-#define _QEMU_VIRTIO_9P_XATTR_H
+#ifndef _QEMU_9P_XATTR_H
+#define _QEMU_9P_XATTR_H
 
 #include "qemu/xattr.h"
 
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index ba62571..838c5e1 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y  = virtio-9p.o
-common-obj-y += 9p-local.o virtio-9p-xattr.o
-common-obj-y += virtio-9p-xattr-user.o 9p-posix-acl.o
+common-obj-y += 9p-local.o 9p-xattr.o
+common-obj-y += 9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o 9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  9p-handle.o
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 667b54a..92ac19b 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -17,7 +17,7 @@
 #include "qemu/sockets.h"
 #include "virtio-9p.h"
 #include "fsdev/qemu-fsdev.h"
-#include "virtio-9p-xattr.h"
+#include "9p-xattr.h"
 #include "coth.h"
 #include "hw/virtio/virtio-access.h"
 
diff -

[Qemu-devel] [PATCH 00/22] 9pfs: disentangling virtio and generic code

2016-01-05 Thread Wei Liu

Hi all

In the summer of 2015, one of our OPW interns, Linda Jacobson, explored the
possibility of making 9pfs work on Xen. It turned out that a lot of code in
QEMU can be reused.

This series refactors 9pfs related code:

1. Rename a bunch of files and functions to make clear they are generic.
2. Only export two functions (marshal and unmarshal) from transport to generic
   code.
3. disentangle virtio transport code and generic 9pfs code.
4. Some function name clean-up.
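
As a rough illustration of item 2, the split between generic code and a transport could look like the sketch below. All names here (Transport, TransportOps, generic_put, flat_marshal and so on) are hypothetical and are not taken from this series; the flat buffer merely stands in for a virtqueue or a Xen ring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of these come from the series. */
typedef struct Transport Transport;

typedef struct TransportOps {
    /* Copy len bytes from src into the transport buffer at offset. */
    int (*marshal)(Transport *t, size_t offset, const void *src, size_t len);
    /* Copy len bytes from the transport buffer at offset into dst. */
    int (*unmarshal)(Transport *t, size_t offset, void *dst, size_t len);
} TransportOps;

struct Transport {
    const TransportOps *ops;
    unsigned char buf[64];   /* stand-in for a virtqueue or Xen ring */
};

/* One possible transport: a flat in-memory buffer with bounds checks. */
static int flat_marshal(Transport *t, size_t offset, const void *src,
                        size_t len)
{
    if (offset > sizeof(t->buf) || len > sizeof(t->buf) - offset) {
        return -1;
    }
    memcpy(t->buf + offset, src, len);
    return (int)len;
}

static int flat_unmarshal(Transport *t, size_t offset, void *dst, size_t len)
{
    if (offset > sizeof(t->buf) || len > sizeof(t->buf) - offset) {
        return -1;
    }
    memcpy(dst, t->buf + offset, len);
    return (int)len;
}

static const TransportOps flat_ops = { flat_marshal, flat_unmarshal };

/* Generic code: it only sees the two hooks, never the buffer layout. */
static int generic_put(Transport *t, size_t offset, const void *src,
                       size_t len)
{
    assert(t != NULL && t->ops != NULL);
    return t->ops->marshal(t, offset, src, len);
}

static int generic_get(Transport *t, size_t offset, void *dst, size_t len)
{
    assert(t != NULL && t->ops != NULL);
    return t->ops->unmarshal(t, offset, dst, len);
}
```

With this shape, a second transport would only need to supply its own pair of hooks; the generic callers stay unchanged.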

To make sure this series doesn't break compilation, the following rune was
used to compile every commit:

$ git rebase -i origin/master --exec "make -j16 clean; ./configure --target-list=x86_64-softmmu --enable-virtfs --enable-kvm; make -j16 && echo ok."

Three use cases are tested:

1. Local file system driver with passthrough security policy
./qemu-system-x86_64 -L . -hda /dev/DATA/jessie -vnc 0.0.0.0:0,to=99 -fsdev local,path=/root/qemu,security_model=passthrough,id=fs1 -device virtio-9p-pci,fsdev=fs1,mount_tag=qemu &

2. Local file system driver with mapped security policy
./qemu-system-x86_64 -L . -hda /dev/DATA/jessie -vnc 0.0.0.0:0,to=99 -fsdev local,path=/root/qemu,security_model=mapped,id=fs1 -device virtio-9p-pci,fsdev=fs1,mount_tag=qemu &

3. Proxy file system driver
./virtfs-proxy-helper -p /root/qemu -n -s virtfs-helper-sock -u 0 -g 0 &
./qemu-system-x86_64 -L . -hda /dev/DATA/jessie -vnc 0.0.0.0:0,to=99 -fsdev proxy,socket=virtfs-helper-sock,id=fs1 -device virtio-9p-pci,fsdev=fs1,mount_tag=qemu &

During each of the tests, mounting, unmounting, read and write operations were
performed. In the "mapped" test, getfattr on the host was used to inspect the
attributes.

Let me know if you would like to see more tests.

The Xen transport is still under development. I figure it would be better to
post this series as soon as possible because rebasing such a huge series from
time to time is error-prone.

Comments are welcome. Thanks!

Wei.
---
Cc: "Aneesh Kumar K.V" 
Cc: Greg Kurz 
Cc: "Michael S. Tsirkin" 
Cc: Stefano Stabellini 
---

Wei Liu (22):
  9pfs: rename virtio-9p-coth.{c,h} to coth.{c,h}
  9pfs: rename virtio-9p-handle.c to 9p-handle.c
  9pfs: rename virtio-9p-local.c to 9p-local.c
  9pfs: rename virtio-9p-posix-acl.c to 9p-posix-acl.c
  9pfs: rename virtio-9p-proxy.{c,h} to 9p-proxy.{c,h}
  9pfs: rename virtio-9p-synth.{c,h} to 9p-synth.{c,h}
  9pfs: rename virtio-9p-xattr{,-user}.{c,h} to 9p-xattr{,-user}.{c,h}
  9pfs: merge hw/virtio/virtio-9p.h into hw/9pfs/virtio-9p.h
  9pfs: remove dead code
  fsdev: break out 9p-marshal.{c,h} from virtio-9p-marshal.{c,h}
  fsdev: 9p-marshal: introduce V9fsBlob
  9pfs: use V9fsBlob to transmit xattr
  fsdev: rename virtio-9p-marshal.{c,h} to 9p-iov-marshal.{c,h}
  9pfs: PDU processing functions don't need to take V9fsState as
argument
  9pfs: PDU processing functions should start pdu_ prefix
  9pfs: make pdu_{,un}marshal proper functions
  9pfs: factor out virtio_pdu_{,un}marshal
  9pfs: factor out pdu_push_and_notify
  9pfs: break out virtio_init_iov_from_pdu
  9pfs: break out generic code from virtio-9p.{c,h}
  9pfs: factor out v9fs_device_{,un}realize_common
  9pfs: disentangle V9fsState

 Makefile   |   2 +-
 fsdev/{virtio-9p-marshal.c => 9p-iov-marshal.c}| 205 ++-
 fsdev/9p-iov-marshal.h |  15 +
 fsdev/9p-marshal.c |  64 
 fsdev/{virtio-9p-marshal.h => 9p-marshal.h}|  26 +-
 fsdev/Makefile.objs|   2 +-
 fsdev/virtfs-proxy-helper.c|   6 +-
 hw/9pfs/{virtio-9p-handle.c => 9p-handle.c}|   7 +-
 hw/9pfs/{virtio-9p-local.c => 9p-local.c}  |   7 +-
 hw/9pfs/{virtio-9p-posix-acl.c => 9p-posix-acl.c}  |   7 +-
 hw/9pfs/{virtio-9p-proxy.c => 9p-proxy.c}  |   7 +-
 hw/9pfs/{virtio-9p-proxy.h => 9p-proxy.h}  |  10 +-
 hw/9pfs/{virtio-9p-synth.c => 9p-synth.c}  |   6 +-
 hw/9pfs/{virtio-9p-synth.h => 9p-synth.h}  |   6 +-
 .../{virtio-9p-xattr-user.c => 9p-xattr-user.c}|   7 +-
 hw/9pfs/{virtio-9p-xattr.c => 9p-xattr.c}  |   7 +-
 hw/9pfs/{virtio-9p-xattr.h => 9p-xattr.h}  |   6 +-
 hw/9pfs/{virtio-9p.c => 9p.c}  | 301 ++--
 hw/9pfs/9p.h   | 335 ++
 hw/9pfs/Makefile.objs  |  14 +-
 hw/9pfs/codir.c|   2 +-
 hw/9pfs/cofile.c   |   2 +-
 hw/9pfs/cofs.c |   2 +-
 hw/9pfs/{virtio-9p-coth.c => coth.c}   |   4 +-
 hw/9pfs/{virtio-9p-coth.h => coth.h}   |  15 +-
 hw/9pfs/coxattr.c  |   2 +-
 hw/9pfs/virtio-9p-device.c | 196 ++-
 hw/9pfs/virtio-9p.h| 391 +
 hw/s390x/virtio-ccw.h  |   2 +-
 hw/

[Qemu-devel] [PATCH 04/22] 9pfs: rename virtio-9p-posix-acl.c to 9p-posix-acl.c

2016-01-05 Thread Wei Liu

This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu 
---
 hw/9pfs/{virtio-9p-posix-acl.c => 9p-posix-acl.c} | 3 +--
 hw/9pfs/Makefile.objs | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)
 rename hw/9pfs/{virtio-9p-posix-acl.c => 9p-posix-acl.c} (98%)

diff --git a/hw/9pfs/virtio-9p-posix-acl.c b/hw/9pfs/9p-posix-acl.c
similarity index 98%
rename from hw/9pfs/virtio-9p-posix-acl.c
rename to hw/9pfs/9p-posix-acl.c
index 09dad07..1ee7bdc 100644
--- a/hw/9pfs/virtio-9p-posix-acl.c
+++ b/hw/9pfs/9p-posix-acl.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p system.posix* xattr callback
+ * 9p system.posix* xattr callback
  *
  * Copyright IBM, Corp. 2010
  *
@@ -13,7 +13,6 @@
 
 #include 
 #include "qemu/xattr.h"
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "fsdev/file-op-9p.h"
 #include "virtio-9p-xattr.h"
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 5059681..0721462 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y  = virtio-9p.o
 common-obj-y += 9p-local.o virtio-9p-xattr.o
-common-obj-y += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
+common-obj-y += virtio-9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  9p-handle.o
-- 
2.1.4

[Qemu-devel] [PATCH 05/22] 9pfs: rename virtio-9p-proxy.{c, h} to 9p-proxy.{c, h}

2016-01-05 Thread Wei Liu

Those two files are not virtio specific. Rename them to use generic
names.

Fix includes in various C files. Change define guards and comments
in header files.

Signed-off-by: Wei Liu 
---
 fsdev/virtfs-proxy-helper.c   | 2 +-
 hw/9pfs/{virtio-9p-proxy.c => 9p-proxy.c} | 5 ++---
 hw/9pfs/{virtio-9p-proxy.h => 9p-proxy.h} | 6 +++---
 hw/9pfs/Makefile.objs | 2 +-
 4 files changed, 7 insertions(+), 8 deletions(-)
 rename hw/9pfs/{virtio-9p-proxy.c => 9p-proxy.c} (99%)
 rename hw/9pfs/{virtio-9p-proxy.h => 9p-proxy.h} (95%)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index ad1da0d..7753654 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -24,7 +24,7 @@
 #include "qemu/sockets.h"
 #include "qemu/xattr.h"
 #include "virtio-9p-marshal.h"
-#include "hw/9pfs/virtio-9p-proxy.h"
+#include "hw/9pfs/9p-proxy.h"
 #include "fsdev/virtio-9p-marshal.h"
 
 #define PROGNAME "virtfs-proxy-helper"
diff --git a/hw/9pfs/virtio-9p-proxy.c b/hw/9pfs/9p-proxy.c
similarity index 99%
rename from hw/9pfs/virtio-9p-proxy.c
rename to hw/9pfs/9p-proxy.c
index 1bc7881..67c1fb9 100644
--- a/hw/9pfs/virtio-9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p Proxy callback
+ * 9p Proxy callback
  *
  * Copyright IBM, Corp. 2011
  *
@@ -11,11 +11,10 @@
  */
 #include 
 #include 
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "qemu/error-report.h"
 #include "fsdev/qemu-fsdev.h"
-#include "virtio-9p-proxy.h"
+#include "9p-proxy.h"
 
 typedef struct V9fsProxy {
 int sockfd;
diff --git a/hw/9pfs/virtio-9p-proxy.h b/hw/9pfs/9p-proxy.h
similarity index 95%
rename from hw/9pfs/virtio-9p-proxy.h
rename to hw/9pfs/9p-proxy.h
index 005c1ad..56150b9 100644
--- a/hw/9pfs/virtio-9p-proxy.h
+++ b/hw/9pfs/9p-proxy.h
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p Proxy callback
+ * 9p Proxy callback
  *
  * Copyright IBM, Corp. 2011
  *
@@ -9,8 +9,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.  See
  * the COPYING file in the top-level directory.
  */
-#ifndef _QEMU_VIRTIO_9P_PROXY_H
-#define _QEMU_VIRTIO_9P_PROXY_H
+#ifndef _QEMU_9P_PROXY_H
+#define _QEMU_9P_PROXY_H
 
 #define PROXY_MAX_IO_SZ (64 * 1024)
 #define V9FS_FD_VALID INT_MAX
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 0721462..cd5d146 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -4,6 +4,6 @@ common-obj-y += virtio-9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  9p-handle.o
-common-obj-y += virtio-9p-proxy.o
+common-obj-y += 9p-proxy.o
 
 obj-y += virtio-9p-device.o
-- 
2.1.4

[Qemu-devel] [PATCH 09/22] 9pfs: remove dead code

2016-01-05 Thread Wei Liu

Some structures in virtio-9p.h have been unused since 2011 when relevant
functions switched to use coroutines.

The declaration of pdu_packunpack and function do_pdu_unpack are
useless.

The function virtio_9p_set_fd_limit is unused.

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p.c | 11 -
 hw/9pfs/virtio-9p.h | 68 -
 2 files changed, 79 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 30ff828..084fa6a 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -3287,14 +3287,3 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 }
 free_pdu(s, pdu);
 }
-
-static void __attribute__((__constructor__)) virtio_9p_set_fd_limit(void)
-{
-struct rlimit rlim;
-if (getrlimit(RLIMIT_NOFILE, &rlim) < 0) {
-fprintf(stderr, "Failed to get the resource limit\n");
-exit(1);
-}
-open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur/3);
-open_fd_rc = rlim.rlim_cur/2;
-}
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index ac4cb00..3c78d3c 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -227,65 +227,6 @@ typedef struct V9fsState
 V9fsConf fsconf;
 } V9fsState;
 
-typedef struct V9fsStatState {
-V9fsPDU *pdu;
-size_t offset;
-V9fsStat v9stat;
-V9fsFidState *fidp;
-struct stat stbuf;
-} V9fsStatState;
-
-typedef struct V9fsOpenState {
-V9fsPDU *pdu;
-size_t offset;
-int32_t mode;
-V9fsFidState *fidp;
-V9fsQID qid;
-struct stat stbuf;
-int iounit;
-} V9fsOpenState;
-
-typedef struct V9fsReadState {
-V9fsPDU *pdu;
-size_t offset;
-int32_t count;
-int32_t total;
-int64_t off;
-V9fsFidState *fidp;
-struct iovec iov[128]; /* FIXME: bad, bad, bad */
-struct iovec *sg;
-off_t dir_pos;
-struct dirent *dent;
-struct stat stbuf;
-V9fsString name;
-V9fsStat v9stat;
-int32_t len;
-int32_t cnt;
-int32_t max_count;
-} V9fsReadState;
-
-typedef struct V9fsWriteState {
-V9fsPDU *pdu;
-size_t offset;
-int32_t len;
-int32_t count;
-int32_t total;
-int64_t off;
-V9fsFidState *fidp;
-struct iovec iov[128]; /* FIXME: bad, bad, bad */
-struct iovec *sg;
-int cnt;
-} V9fsWriteState;
-
-typedef struct V9fsMkState {
-V9fsPDU *pdu;
-size_t offset;
-V9fsQID qid;
-struct stat stbuf;
-V9fsString name;
-V9fsString fullname;
-} V9fsMkState;
-
 /* 9p2000.L open flags */
 #define P9_DOTL_RDONLY
 #define P9_DOTL_WRONLY0001
@@ -345,15 +286,6 @@ typedef struct V9fsGetlock
 extern int open_fd_hw;
 extern int total_open_fd;
 
-size_t pdu_packunpack(void *addr, struct iovec *sg, int sg_count,
-  size_t offset, size_t size, int pack);
-
-static inline size_t do_pdu_unpack(void *dst, struct iovec *sg, int sg_count,
-size_t offset, size_t size)
-{
-return pdu_packunpack(dst, sg, sg_count, offset, size, 0);
-}
-
 static inline void v9fs_path_write_lock(V9fsState *s)
 {
 if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
-- 
2.1.4

[Qemu-devel] [PATCH 01/22] 9pfs: rename virtio-9p-coth.{c, h} to coth.{c, h}

2016-01-05 Thread Wei Liu

Those two files are not virtio specific. Rename them to use generic
names.

Fix includes in various C files. Change define guards and comments in
header files.

Signed-off-by: Wei Liu 
---
 hw/9pfs/Makefile.objs| 2 +-
 hw/9pfs/codir.c  | 2 +-
 hw/9pfs/cofile.c | 2 +-
 hw/9pfs/cofs.c   | 2 +-
 hw/9pfs/{virtio-9p-coth.c => coth.c} | 4 ++--
 hw/9pfs/{virtio-9p-coth.h => coth.h} | 6 +++---
 hw/9pfs/coxattr.c| 2 +-
 hw/9pfs/virtio-9p-device.c   | 2 +-
 hw/9pfs/virtio-9p.c  | 2 +-
 9 files changed, 12 insertions(+), 12 deletions(-)
 rename hw/9pfs/{virtio-9p-coth.c => coth.c} (95%)
 rename hw/9pfs/{virtio-9p-coth.h => coth.h} (98%)

diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 1e9b595..76dadbe 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,7 +1,7 @@
 common-obj-y  = virtio-9p.o
 common-obj-y += virtio-9p-local.o virtio-9p-xattr.o
 common-obj-y += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
-common-obj-y += virtio-9p-coth.o cofs.o codir.o cofile.o
+common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  virtio-9p-handle.o
 common-obj-y += virtio-9p-proxy.o
diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c
index ec9cc7f..5a4f74d 100644
--- a/hw/9pfs/codir.c
+++ b/hw/9pfs/codir.c
@@ -15,7 +15,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 
 int v9fs_co_readdir_r(V9fsPDU *pdu, V9fsFidState *fidp, struct dirent *dent,
   struct dirent **result)
diff --git a/hw/9pfs/cofile.c b/hw/9pfs/cofile.c
index 7cb55ee..893df2c 100644
--- a/hw/9pfs/cofile.c
+++ b/hw/9pfs/cofile.c
@@ -15,7 +15,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 
 int v9fs_co_st_gen(V9fsPDU *pdu, V9fsPath *path, mode_t st_mode,
V9fsStatDotl *v9stat)
diff --git a/hw/9pfs/cofs.c b/hw/9pfs/cofs.c
index e1953a9..7b4202b 100644
--- a/hw/9pfs/cofs.c
+++ b/hw/9pfs/cofs.c
@@ -15,7 +15,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 
 static ssize_t __readlink(V9fsState *s, V9fsPath *path, V9fsString *buf)
 {
diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/coth.c
similarity index 95%
rename from hw/9pfs/virtio-9p-coth.c
rename to hw/9pfs/coth.c
index ab9425c..56772d6 100644
--- a/hw/9pfs/virtio-9p-coth.c
+++ b/hw/9pfs/coth.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p backend
+ * 9p backend
  *
  * Copyright IBM, Corp. 2010
  *
@@ -16,7 +16,7 @@
 #include "block/thread-pool.h"
 #include "qemu/coroutine.h"
 #include "qemu/main-loop.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 
 /* Called from QEMU I/O thread.  */
 static void coroutine_enter_cb(void *opaque, int ret)
diff --git a/hw/9pfs/virtio-9p-coth.h b/hw/9pfs/coth.h
similarity index 98%
rename from hw/9pfs/virtio-9p-coth.h
rename to hw/9pfs/coth.h
index 4ac1aaf..209fc6a 100644
--- a/hw/9pfs/virtio-9p-coth.h
+++ b/hw/9pfs/coth.h
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p backend
+ * 9p backend
  *
  * Copyright IBM, Corp. 2010
  *
@@ -12,8 +12,8 @@
  *
  */
 
-#ifndef _QEMU_VIRTIO_9P_COTH_H
-#define _QEMU_VIRTIO_9P_COTH_H
+#ifndef _QEMU_9P_COTH_H
+#define _QEMU_9P_COTH_H
 
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
diff --git a/hw/9pfs/coxattr.c b/hw/9pfs/coxattr.c
index 55c0d23..0590cbf 100644
--- a/hw/9pfs/coxattr.c
+++ b/hw/9pfs/coxattr.c
@@ -15,7 +15,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 
 int v9fs_co_llistxattr(V9fsPDU *pdu, V9fsPath *path, void *value, size_t size)
 {
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index b42d3b3..667b54a 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -18,7 +18,7 @@
 #include "virtio-9p.h"
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-xattr.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 #include "hw/virtio/virtio-access.h"
 
 static uint64_t virtio_9p_get_features(VirtIODevice *vdev, uint64_t features,
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index f972731..0f178de 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -19,7 +19,7 @@
 #include "virtio-9p.h"
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-xattr.h"
-#include "virtio-9p-coth.h"
+#include "coth.h"
 #include "trace.h"
 #include "migration/migration.h"
 
-- 
2.1.4

[Qemu-devel] [PATCH 02/22] 9pfs: rename virtio-9p-handle.c to 9p-handle.c

2016-01-05 Thread Wei Liu

This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu 
---
 hw/9pfs/{virtio-9p-handle.c => 9p-handle.c} | 3 +--
 hw/9pfs/Makefile.objs   | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)
 rename hw/9pfs/{virtio-9p-handle.c => 9p-handle.c} (99%)

diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/9p-handle.c
similarity index 99%
rename from hw/9pfs/virtio-9p-handle.c
rename to hw/9pfs/9p-handle.c
index 13eabb9..a48dbc9 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/9p-handle.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p handle callback
+ * 9p handle callback
  *
  * Copyright IBM, Corp. 2011
  *
@@ -11,7 +11,6 @@
  *
  */
 
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "virtio-9p-xattr.h"
 #include 
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 76dadbe..9fdd8a4 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -3,7 +3,7 @@ common-obj-y += virtio-9p-local.o virtio-9p-xattr.o
 common-obj-y += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
-common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  virtio-9p-handle.o
+common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  9p-handle.o
 common-obj-y += virtio-9p-proxy.o
 
 obj-y += virtio-9p-device.o
-- 
2.1.4

[Qemu-devel] [PATCH 06/22] 9pfs: rename virtio-9p-synth.{c, h} to 9p-synth.{c, h}

2016-01-05 Thread Wei Liu

These two files are not virtio specific. Rename them to use generic
names.

Fix includes in various C files. Change define guards and comments
in header files.

Signed-off-by: Wei Liu 
---
 hw/9pfs/{virtio-9p-synth.c => 9p-synth.c} | 2 +-
 hw/9pfs/{virtio-9p-synth.h => 9p-synth.h} | 6 +++---
 hw/9pfs/Makefile.objs | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename hw/9pfs/{virtio-9p-synth.c => 9p-synth.c} (99%)
 rename hw/9pfs/{virtio-9p-synth.h => 9p-synth.h} (94%)

diff --git a/hw/9pfs/virtio-9p-synth.c b/hw/9pfs/9p-synth.c
similarity index 99%
rename from hw/9pfs/virtio-9p-synth.c
rename to hw/9pfs/9p-synth.c
index a0ab9a8..6d34b89 100644
--- a/hw/9pfs/virtio-9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -16,7 +16,7 @@
 #include "virtio-9p.h"
 #include "virtio-9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"
-#include "virtio-9p-synth.h"
+#include "9p-synth.h"
 #include "qemu/rcu.h"
 #include "qemu/rcu_queue.h"
 #include 
diff --git a/hw/9pfs/virtio-9p-synth.h b/hw/9pfs/9p-synth.h
similarity index 94%
rename from hw/9pfs/virtio-9p-synth.h
rename to hw/9pfs/9p-synth.h
index ab05a8e..eaf5a0c 100644
--- a/hw/9pfs/virtio-9p-synth.h
+++ b/hw/9pfs/9p-synth.h
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p
+ * 9p
  *
  * Copyright IBM, Corp. 2011
  *
@@ -10,8 +10,8 @@
  * the COPYING file in the top-level directory.
  *
  */
-#ifndef HW_9PFS_VIRTIO9P_SYNTH_H
-#define HW_9PFS_VIRTIO9P_SYNTH_H 1
+#ifndef HW_9PFS_SYNTH_H
+#define HW_9PFS_SYNTH_H 1
 
 #include 
 #include 
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index cd5d146..ba62571 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -2,7 +2,7 @@ common-obj-y  = virtio-9p.o
 common-obj-y += 9p-local.o virtio-9p-xattr.o
 common-obj-y += virtio-9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
-common-obj-y += coxattr.o virtio-9p-synth.o
+common-obj-y += coxattr.o 9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  9p-handle.o
 common-obj-y += 9p-proxy.o
 
-- 
2.1.4

[Qemu-devel] [PATCH 08/22] 9pfs: merge hw/virtio/virtio-9p.h into hw/9pfs/virtio-9p.h

2016-01-05 Thread Wei Liu

The deleted file only contained V9fsConf which wasn't virtio specific.
Merge that to the general header of 9pfs.

Fixed header inclusions as I went along.

Signed-off-by: Wei Liu 
---
 hw/9pfs/virtio-9p-device.c|  1 -
 hw/9pfs/virtio-9p.h   |  8 +++-
 hw/virtio/virtio-pci.h|  1 -
 include/hw/virtio/virtio-9p.h | 24 
 4 files changed, 7 insertions(+), 27 deletions(-)
 delete mode 100644 include/hw/virtio/virtio-9p.h

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 92ac19b..885b940 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -12,7 +12,6 @@
  */
 
 #include "hw/virtio/virtio.h"
-#include "hw/virtio/virtio-9p.h"
 #include "hw/i386/pc.h"
 #include "qemu/sockets.h"
 #include "virtio-9p.h"
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index d7a4dc1..ac4cb00 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -9,7 +9,6 @@
 #include 
 #include "standard-headers/linux/virtio_9p.h"
 #include "hw/virtio/virtio.h"
-#include "hw/virtio/virtio-9p.h"
 #include "fsdev/file-op-9p.h"
 #include "fsdev/virtio-9p-marshal.h"
 #include "qemu/thread.h"
@@ -156,6 +155,13 @@ enum {
 P9_FID_XATTR,
 };
 
+typedef struct V9fsConf
+{
+/* tag name for the device */
+char *tag;
+char *fsdev_id;
+} V9fsConf;
+
 typedef struct V9fsXattr
 {
 int64_t copied_len;
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index a104ff2..7cf5974 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -23,7 +23,6 @@
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-balloon.h"
 #include "hw/virtio/virtio-bus.h"
-#include "hw/virtio/virtio-9p.h"
 #include "hw/virtio/virtio-input.h"
 #include "hw/virtio/virtio-gpu.h"
 #ifdef CONFIG_VIRTFS
diff --git a/include/hw/virtio/virtio-9p.h b/include/hw/virtio/virtio-9p.h
deleted file mode 100644
index 65789db..000
--- a/include/hw/virtio/virtio-9p.h
+++ /dev/null
@@ -1,24 +0,0 @@
-/*
- * Virtio 9p
- *
- * Copyright IBM, Corp. 2010
- *
- * Authors:
- *  Aneesh Kumar K.V 
- *
- * This work is licensed under the terms of the GNU GPL, version 2.  See
- * the COPYING file in the top-level directory.
- *
- */
-
-#ifndef QEMU_VIRTIO_9P_DEVICE_H
-#define QEMU_VIRTIO_9P_DEVICE_H
-
-typedef struct V9fsConf
-{
-/* tag name for the device */
-char *tag;
-char *fsdev_id;
-} V9fsConf;
-
-#endif
-- 
2.1.4

[Qemu-devel] [PATCH 03/22] 9pfs: rename virtio-9p-local.c to 9p-local.c

2016-01-05 Thread Wei Liu

This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu 
---
 hw/9pfs/{virtio-9p-local.c => 9p-local.c} | 3 +--
 hw/9pfs/Makefile.objs | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)
 rename hw/9pfs/{virtio-9p-local.c => 9p-local.c} (99%)

diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/9p-local.c
similarity index 99%
rename from hw/9pfs/virtio-9p-local.c
rename to hw/9pfs/9p-local.c
index f1f2e25..877ad86 100644
--- a/hw/9pfs/virtio-9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -1,5 +1,5 @@
 /*
- * Virtio 9p Posix callback
+ * 9p Posix callback
  *
  * Copyright IBM, Corp. 2010
  *
@@ -11,7 +11,6 @@
  *
  */
 
-#include "hw/virtio/virtio.h"
 #include "virtio-9p.h"
 #include "virtio-9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"   /* local_ops */
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 9fdd8a4..5059681 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,5 +1,5 @@
 common-obj-y  = virtio-9p.o
-common-obj-y += virtio-9p-local.o virtio-9p-xattr.o
+common-obj-y += 9p-local.o virtio-9p-xattr.o
 common-obj-y += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
-- 
2.1.4

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Konrad Rzeszutek Wilk

> You can create a dummy device in guest for the duration of migration.
> Use guest agent to move IP address there and that should be enough to trick 
> most guests.

If you are doing this, why not bond the physical NIC with a virtual device
and unplug the physical NIC?

Re: [Qemu-devel] [RFC v6 02/14] softmmu: Add new TLB_EXCL flag

2016-01-05 Thread Alex Bennée


alvise rigo  writes:

> On Tue, Jan 5, 2016 at 5:10 PM, Alex Bennée  wrote:
>>
>> Alvise Rigo  writes:
>>
>>> Add a new TLB flag to force all the accesses made to a page to follow
>>> the slow-path.
>>>
>>> In the case we remove a TLB entry marked as EXCL, we unset the
>>> corresponding exclusive bit in the bitmap.
>>>
>>> Suggested-by: Jani Kokkonen 
>>> Suggested-by: Claudio Fontana 
>>> Signed-off-by: Alvise Rigo 
>>> ---
>>>  cputlb.c|  38 +++-
>>>  include/exec/cpu-all.h  |   8 
>>>  include/exec/cpu-defs.h |   1 +
>>>  include/qom/cpu.h   |  14 ++
>>>  softmmu_template.h  | 114 
>>> ++--
>>>  5 files changed, 152 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/cputlb.c b/cputlb.c
>>> index bf1d50a..7ee0c89 100644
>>> --- a/cputlb.c
>>> +++ b/cputlb.c
>>> @@ -394,6 +394,16 @@ void tlb_set_page_with_attrs(CPUState *cpu, 
>>> target_ulong vaddr,
>>>  env->tlb_v_table[mmu_idx][vidx] = *te;
>>>  env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
>>>
>>> +if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write &
>>> TLB_EXCL))) {
>>
>> Why do we care about TLB_MMIO flags here? Does it actually happen? Would
>> bad things happen if we enforced exclusivity for an MMIO write? Do the
>> other flags matter?
>
> In the previous version of the patch series it came out that the
> accesses to MMIO regions have to be supported since, for instance, the
> GDB stub relies on them.
> The last two patches actually finalize the MMIO support.
>
>>
>> There should be a comment as to why MMIO is mentioned I think.
>
> OK.
>
>>
>>> +/* We are removing an exclusive entry, set the page to dirty. This
>>> + * is not be necessary if the vCPU has performed both SC and LL. */
>>> +hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & 
>>> TARGET_PAGE_MASK) +
>>> +  (te->addr_write & 
>>> TARGET_PAGE_MASK);
>>> +if (!cpu->ll_sc_context) {
>>> +cpu_physical_memory_unset_excl(hw_addr, cpu->cpu_index);
>>> +}
>>> +}
>>> +
>>>  /* refill the tlb */
>>>  env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
>>>  env->iotlb[mmu_idx][index].attrs = attrs;
>>> @@ -419,7 +429,15 @@ void tlb_set_page_with_attrs(CPUState *cpu, 
>>> target_ulong vaddr,
>>> + xlat)) {
>>>  te->addr_write = address | TLB_NOTDIRTY;
>>>  } else {
>>> -te->addr_write = address;
>>> +if (!(address & TLB_MMIO) &&
>>> +cpu_physical_memory_atleast_one_excl(section->mr->ram_addr
>>> +   + xlat)) {
>>> +/* There is at least one vCPU that has flagged the address 
>>> as
>>> + * exclusive. */
>>> +te->addr_write = address | TLB_EXCL;
>>> +} else {
>>> +te->addr_write = address;
>>> +}
>>>  }
>>>  } else {
>>>  te->addr_write = -1;
>>> @@ -471,6 +489,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, 
>>> target_ulong addr)
>>>  return qemu_ram_addr_from_host_nofail(p);
>>>  }
>>>
>>> +/* For every vCPU compare the exclusive address and reset it in case of a
>>> + * match. Since only one vCPU is running at once, no lock has to be held to
>>> + * guard this operation. */
>>> +static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
>>> +{
>>> +CPUState *cpu;
>>> +
>>> +CPU_FOREACH(cpu) {
>>> +if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
>>> +ranges_overlap(cpu->excl_protected_range.begin,
>>> +   cpu->excl_protected_range.end -
>>> +   cpu->excl_protected_range.begin,
>>> +   addr, size)) {
>>> +cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
>>> +}
>>> +}
>>> +}
>>> +
>>>  #define MMUSUFFIX _mmu
>>>
>>>  #define SHIFT 0
>>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
>>> index 83b1781..f8d8feb 100644
>>> --- a/include/exec/cpu-all.h
>>> +++ b/include/exec/cpu-all.h
>>> @@ -277,6 +277,14 @@ CPUArchState *cpu_copy(CPUArchState *env);
>>>  #define TLB_NOTDIRTY(1 << 4)
>>>  /* Set if TLB entry is an IO callback.  */
>>>  #define TLB_MMIO(1 << 5)
>>> +/* Set if TLB entry references a page that requires exclusive access.  */
>>> +#define TLB_EXCL(1 << 6)
>>> +
>>> +/* Do not allow a TARGET_PAGE_MASK which covers one or more bits defined
>>> + * above. */
>>> +#if TLB_EXCL >= TARGET_PAGE_SIZE
>>> +#error TARGET_PAGE_MASK covering the low bits of the TLB virtual address
>>> +#endif
>>>
>>>  void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>>>  void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
>>> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
>>
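
For readers following the quoted lookup_and_reset_cpus_ll_addr() hunk, here is a minimal standalone sketch of the same overlap-and-reset idea. The names ExclRange, RESET_ADDR, overlap() and reset_overlapping() are stand-ins invented for this illustration, not QEMU APIs, and the sketch ignores address wrap-around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RESET_ADDR UINT64_MAX   /* stand-in for EXCLUSIVE_RESET_ADDR */

/* Half-open protected range [begin, end), one per vCPU. */
typedef struct {
    uint64_t begin;
    uint64_t end;
} ExclRange;

/* True if [a, a + alen) and [b, b + blen) share at least one byte.
 * Wrap-around at the top of the address space is not handled here. */
static bool overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Reset every armed range touched by an access to [addr, addr + size).
 * Returns the number of ranges that were reset. */
static size_t reset_overlapping(ExclRange *r, size_t n,
                                uint64_t addr, uint64_t size)
{
    size_t cleared = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (r[i].begin != RESET_ADDR &&
            overlap(r[i].begin, r[i].end - r[i].begin, addr, size)) {
            r[i].begin = RESET_ADDR;
            cleared++;
        }
    }
    return cleared;
}
```

As in the quoted patch, a range whose begin already equals the reset value is skipped, so a second conflicting access is a no-op for that vCPU.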

Re: [Qemu-devel] [PATCH] sdhci: add quirk property for card insert interrupt status on Raspberry Pi

2016-01-05 Thread Andrew Baumann

> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
> Sent: Monday, 4 January 2016 22:18
> On Mon, Jan 4, 2016 at 2:12 PM, Andrew Baumann
>  wrote:
> >> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
> >> Sent: Thursday, 31 December 2015 21:38
> >> On Thu, Dec 31, 2015 at 1:40 PM, Andrew Baumann
> >>  wrote:
> >> > This quirk is a workaround for the following hardware behaviour, on
> >> > which UEFI (specifically, the bootloader for Windows on Pi2) depends:
> >> >
> >> > 1. at boot with an SD card present, the interrupt status/enable
> >> >registers are initially zero
> >> > 2. upon enabling it in the interrupt enable register, the card insert
> >> >bit in the interrupt status register is immediately set
> >> > 3. after a subsequent controller reset, the card insert interrupt does
> >> >not fire, even if enabled in the interrupt enable register
> >> >
> >>
> >> This is a baffling symptom. Does prnsts card ejection state fully work
> >> with physical card ejections and insertions both before and after the
> >> subsequent controller reset?
> >
> > I just tested this, by polling prnsts and intsts in a tight loop at board
> > startup. At power on with a card inserted, prnsts reads 1FFF. Subsequent
> > removal of the card, re-insertion etc. does not change its value.
> 
> Does either the subsequent reset or the interrupt ack change it? I'm
> assuming it is stuck permanently at 1fff.

That's correct -- there's no change.

> > After enabling interrupts, I reliably see a card insert interrupt in intsts.
> > If I then write zero to the interrupt enable register, the pending card
> > insert interrupt remains, which seems to dispel the "mask on read" theory.
> > Once acked or reset, the card insert interrupt never recurs. I never saw a
> > card removal interrupt.
> >
> 
> So
> 
> * interrupt status is initially 0
> * writing one to enable triggers the ghost
> * it can only be cleared with a status ack
(or reset)
> * you can never get a second ghost

Correct.

[...]
> > but either way there is a reliable ghost of a card insertion interrupt that
> > is signalled at power on, and remains pending until it is either acked or
> > the controller reset, after which point it never recurs. And I'd really like
> > to model that somehow without making a mess of sdhci.c :) Any ideas?
> >
> 
> Ok, I think it can be explained as a bad top-level connection as
> follows. The pin is mis-connected in such a way that it sees
> one edge on the POR reset and never sees any action again. The
> controller considers this pin edge-triggered and has the penning quirk
> as well, that is, it saves edge interrupts until they are enabled and
> then releases them singly to the status register.
> 
> This doesn't explain why the controller doesn't see the interrupt on
> the soft reset, but perhaps that is explained by the spec, as I don't
> see anywhere that says that the interrupt has to retrigger for a
> constantly inserted card over a controller reset. Might be
> implementation specific.
> 
> Looking at the set_cb stuff, I think the guard on your original quirk
> implementation may be missing for the sd_set_cb() in sdhci_initfn().
> If this guard were added that quirk would be more complete, as
> currently it probably is seeing action on changes of state.
> 
> I think the way to correct the original quirk is to:
> 
> * make both sd_set_cb()'s conditional
> * manually call insert_eject_cb() on the POR reset (call the CB
> instead of register it).
> 
> Note that sdhci has no device::reset callback. You could add this to
> implement your POR reset.
> 
> You then have the problem of the prnsts register, which I assume is
> getting blasted by the reset memset. That can be managed by
> specifically preserving those two bits of prnsts through the reset
> (with an accompanying comment that this is needed for your quirk).

Assuming the user doesn't eject/change the SD card at runtime then my original 
patch isn't necessary at all. (I'm happy with that outcome, which is why I 
submitted the revert patch.) Because the memset in reset clears norintstsen, 
the sdhci_insert_eject_cb will never signal an insert interrupt. If we wanted a 
quirk to disable insert/eject interrupts, then what you've proposed seems like 
the right thing to do, although I think we'd need to preserve more than two 
bits of prnsts; we'd also need the write protect bit, and it's probably safe to 
just keep the upper half of the register.
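
To make the "keep the upper half of the register" idea concrete, here is a minimal sketch. The struct, the mask and the function name are hypothetical stand-ins, not the real SDHCIState or sdhci.c code; it only shows saving the bits before the memset and restoring them afterwards.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down register file; the real device state has many
 * more fields. */
typedef struct DemoSDHCIRegs {
    uint32_t prnsts;       /* present state */
    uint16_t norintsts;    /* normal interrupt status */
    uint16_t norintstsen;  /* normal interrupt status enable */
} DemoSDHCIRegs;

/* Assumed mask: preserve the upper half of prnsts across a reset. */
#define DEMO_PRNSTS_PRESERVE_MASK 0xffff0000u

static void demo_sdhci_reset(DemoSDHCIRegs *r)
{
    /* Needed for the quirk: remember the card state bits first. */
    uint32_t saved = r->prnsts & DEMO_PRNSTS_PRESERVE_MASK;

    memset(r, 0, sizeof(*r));
    r->prnsts = saved;
}
```

Everything else, including norintstsen, is still cleared, so the behaviour described above (no insert interrupt after reset) is unchanged.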

> Your patch as-is here doesn't seem to address the penning behaviour
> (where the interrupt status remains clear until it is enabled), maybe
> that can be added as a second quirk if needed later?

My second patch does handle this in a way that's good enough to boot UEFI: a 
card insert interrupt is pending at power on, and goes away on enable/ack or 
reset. However, it deviates from the hardware in that disabling an interrupt 
(intstsen) implicitly masks it out from the intsts (and this is true for any 
inter

Re: [Qemu-devel] [PATCH for v2.3.0] fw_cfg: add check to validate current entry value

2016-01-05 Thread P J P

+-- On Tue, 5 Jan 2016, P J P wrote --+
| An OOB r/w access issue was reported by Mr Donghai Zdh, CC'd here.

Mr Donghai CC'd now.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

Re: [Qemu-devel] [PATCH v8 34/35] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-05 Thread Eric Blake

On 01/05/2016 10:22 AM, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
>> Returning a partial object on error is an invitation for a careless
>> caller to leak memory.  As no one outside the testsuite was actually
>> relying on these semantics, it is cleaner to just document and
>> guarantee that ALL pointer-based visit_type_FOO() functions always
>> leave a safe value in *obj during an input visitor (either the new
>> object on success, or NULL if an error is encountered).
>>
>> Since input visitors have blind assignment semantics, we have to
>> track the result of whether an assignment is made all the way down
>> to each visitor callback implementation, to avoid making decisions
>> based on potentially uninitialized storage.
>>
>> Note that we still leave *obj unchanged after a scalar-based
>> visit_type_FOO(); I did not feel like auditing all uses of
>> visit_type_Enum() to see if the callers would tolerate a specific
>> sentinel value (not to mention having to decide whether it would
>> be better to use 0 or ENUM__MAX as that sentinel).
>>
>> Signed-off-by: Eric Blake 
>>
> 
> nice cleanup, few issues:
> 

>> @@ -237,6 +254,10 @@ void visit_type_str(Visitor *v, const char *name, char 
>> **obj, Error **errp)
>>  }
>>   */
>>  v->type_str(v, name, obj, errp);
>> +if (!visit_is_output(v) && err) {
>> +*obj = NULL;
>> +}
> 
> This will always evaluate to false, you need to change the line above I 
> suppose
> 
>> +error_propagate(errp, err);

Oh right, that needs to be v->type_str(..., &err).

I'll have to double-check that no assertions trigger with the fixed
code, and provide the fixup patch. I don't know if Markus will want me
to spin a v9, but I'll wait for his comments before deciding if a full
respin is needed.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature
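
For readers following the exchange, the fix amounts to the usual local-error pattern: hand the callback a local Error pointer, test that, then propagate. The sketch below uses tiny stand-ins for the Error type and for error_propagate() (the demo_ names are hypothetical, and the visit_is_output() check is left out), so it illustrates the control flow only, not the real QAPI code.

```c
#include <stddef.h>

/* Minimal stand-ins for QEMU's Error API, for illustration only. */
typedef struct Error { const char *msg; } Error;

static void demo_error_propagate(Error **dst, Error *local)
{
    if (local && dst && !*dst) {
        *dst = local;
    }
}

/* Hypothetical input-visitor callback that fails after writing a
 * partial result into *obj. */
static Error demo_failure = { "demo failure" };
static char demo_partial[] = "partial";

static void demo_type_str_fail(char **obj, Error **errp)
{
    *obj = demo_partial;
    *errp = &demo_failure;
}

static void demo_visit_type_str(void (*type_str)(char **, Error **),
                                char **obj, Error **errp)
{
    Error *err = NULL;

    type_str(obj, &err);   /* pass &err, not errp, so the test can fire */
    if (err) {
        *obj = NULL;       /* never hand a partial object to the caller */
    }
    demo_error_propagate(errp, err);
}
```

With errp passed straight to the callback, the local err would stay NULL and the `*obj = NULL` branch would be dead code, which is the problem pointed out in the review.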

Re: [Qemu-devel] [PATCH 3/6] device_tree: introduce qemu_fdt_node_path

2016-01-05 Thread Peter Maydell

On 5 January 2016 at 16:20, Eric Auger  wrote:
> Hi Peter,
> On 12/18/2015 03:23 PM, Peter Maydell wrote:
>> On 17 December 2015 at 12:29, Eric Auger  wrote:
>>> This new helper routine returns the node path of a device
>>> referred to by its node name and compat string.
>>>
>>> Signed-off-by: Eric Auger 

>>> +
>>> +*node_path = NULL;
>>> +offset = fdt_node_offset_by_compatible(fdt, -1, compat);
>>> +while (offset != -FDT_ERR_NOTFOUND) {
>>> +if (offset < 0) {
>>> +continue;
>>
>> I don't understand this continue -- if the fdt function returned any
>> error other than -FDT_ERR_NOTFOUND then this will cause us to go
>> into an infinite loop around this while(). Did you mean 'break' ?
>> (Though if you just want to break then fixing the while condition
>> would be better.)
> My first understanding of the API was fdt_node_offset_by_compatible
> would increment the offset even if an error occurred; so I envisioned to
> continue parsing the tree, looking for another node with same features.

Your code doesn't call fdt_node_offset_by_compatible again
in the case where it's trying to continue, though...

thanks
-- PMM
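
A minimal sketch of the loop shape being asked for, under the assumption that the lookup is re-issued from the current offset on every iteration and that any negative return ends the walk. demo_offset_by_compatible() is a stand-in working on a plain array, not libfdt's fdt_node_offset_by_compatible().

```c
#include <stddef.h>

#define DEMO_ERR_NOTFOUND 1  /* stand-in for FDT_ERR_NOTFOUND */

/* Stand-in lookup: returns the first matching offset strictly after
 * 'start', or a negative error code when there is none. */
static int demo_offset_by_compatible(const int *matches, size_t n, int start)
{
    for (size_t i = 0; i < n; i++) {
        if (matches[i] > start) {
            return matches[i];
        }
    }
    return -DEMO_ERR_NOTFOUND;
}

/* Count matching nodes. The loop condition is 'offset >= 0', so an
 * unexpected error code terminates the loop instead of spinning. */
static int demo_count_matches(const int *matches, size_t n)
{
    int count = 0;
    int offset = demo_offset_by_compatible(matches, n, -1);

    while (offset >= 0) {
        count++;
        /* Advance by querying again from the current offset. */
        offset = demo_offset_by_compatible(matches, n, offset);
    }
    return count;
}
```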

Re: [Qemu-devel] [PATCH 4/6] device_tree: qemu_fdt_getprop converted to use the error API

2016-01-05 Thread Peter Maydell

On 5 January 2016 at 16:20, Eric Auger  wrote:
> Hi Peter,
> On 12/18/2015 03:36 PM, Peter Maydell wrote:
>> On 17 December 2015 at 12:29, Eric Auger  wrote:
>>> Current qemu_fdt_getprop exits if the property is not found. It is
>>> sometimes needed to read an optional property, in which case we do
>>> not wish to exit but simply return a null value.
>>>
>>> This patch converts qemu_fdt_getprop to accept an Error **, and existing
>>> users are converted to pass &error_fatal. This preserves the existing
>>> behaviour. Then, to use the API with optional semantics, a NULL
>>> parameter can be passed.
>>>
>>> Signed-off-by: Eric Auger 
>>>
>>> ---
>>>
>>> RFC -> v1:
>>> - get rid of qemu_fdt_getprop_optional and implement Peter's suggestion
>>>   that consists in using the error API
>>
>> This doesn't seem to me like a great way for qemu_fdt_getprop to
>> report "property not found", because there's no way for the caller
>> to distinguish "property not found" from "function went wrong
>> some other way" (since Errors just report human readable strings,
>> and in any case you're not distinguishing -FDT_ERR_NOTFOUND
>> from any of the other FDT errors).
> Not sure I get what you mean here. In case fdt_getprop fails, as long as
> the caller provided a lenp != NULL, *lenp contains the error code so
> qemu_fdt_getprop's caller can discriminate a -FDT_ERR_NOTFOUND from any
> other errors. Do I miss something?

There's no documentation of this behaviour of qemu_fdt_getprop
in either this commit message or in a doc comment in the header,
so I didn't realise that was your intention.

thanks
-- PMM
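
The convention Eric describes (a NULL return with a negative error code left in *lenp) could be documented with a caller-side sketch like the one below. demo_getprop() and the DEMO_ERR_ constants are hypothetical stand-ins so the example is self-contained; they are not the qemu_fdt_getprop or libfdt definitions.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ERR_NOTFOUND  1   /* stand-in for FDT_ERR_NOTFOUND */
#define DEMO_ERR_BADOFFSET 4   /* stand-in for some other failure */

/* Stand-in lookup: on failure returns NULL and stores a negative error
 * code in *lenp; on success stores the property length there. */
static const void *demo_getprop(int node_ok, const char *name, int *lenp)
{
    static const char value[] = "okay";

    if (!node_ok) {
        if (lenp) {
            *lenp = -DEMO_ERR_BADOFFSET;
        }
        return NULL;
    }
    if (strcmp(name, "status") != 0) {
        if (lenp) {
            *lenp = -DEMO_ERR_NOTFOUND;
        }
        return NULL;
    }
    if (lenp) {
        *lenp = (int)sizeof(value);
    }
    return value;
}

/* Returns 1 if the property exists, 0 if it is merely absent, and -1
 * for any other failure. */
static int demo_optional_prop(int node_ok, const char *name)
{
    int len = 0;
    const void *p = demo_getprop(node_ok, name, &len);

    if (p) {
        return 1;
    }
    return len == -DEMO_ERR_NOTFOUND ? 0 : -1;
}
```

The caller only gets this distinction when it passes a non-NULL lenp, which is why the behaviour is worth stating in a doc comment.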

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-05 Thread John Snow



On 01/05/2016 08:55 AM, Eric Blake wrote:
> On 01/05/2016 05:10 AM, lihuiba wrote:
> 
> >>>> In our production environment, we need to extend a qcow2 image with
> >>>> snapshots in it.
> 
>>> The thing is that one would need to update all the inactive L1 tables. I
>>> don't think it should be too difficult, it's just that apparently so far
>>> nobody ever had the need for this feature.
> 
> Is resizing a snapshot really what you want?  Ideally, a snapshot tracks
> the data from a point in time, including the metadata of the size being
> tracked at that time.  Extending the snapshots then reverting to that
> snapshot means your guest would see a larger disk on revert than it did
> at the time the snapshot was created, which guests might not handle very
> well.
> 

Unless reverting to a snapshot also implied undoing the re-size.

[Qemu-devel] [PATCH 2/8] ipmi: add get and set SENSOR_TYPE commands

2016-01-05 Thread Cédric Le Goater

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 51 --
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 559e1398d669..061db8437479 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -37,13 +37,15 @@
 #define IPMI_CMD_CHASSIS_CONTROL  0x02
 
 #define IPMI_NETFN_SENSOR_EVENT   0x04
-#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x2e
+#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x30
 
 #define IPMI_CMD_SET_SENSOR_EVT_ENABLE0x28
 #define IPMI_CMD_GET_SENSOR_EVT_ENABLE0x29
 #define IPMI_CMD_REARM_SENSOR_EVTS0x2a
 #define IPMI_CMD_GET_SENSOR_EVT_STATUS0x2b
 #define IPMI_CMD_GET_SENSOR_READING   0x2d
+#define IPMI_CMD_SET_SENSOR_TYPE  0x2e
+#define IPMI_CMD_GET_SENSOR_TYPE  0x2f
 
 /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
 #define IPMI_NETFN_APP_MAXCMD 0x36
@@ -1576,6 +1578,49 @@ static void get_sensor_reading(IPMIBmcSim *ibs,
 return;
 }
 
+static void set_sensor_type(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+
+
+IPMI_CHECK_CMD_LEN(5);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+goto out;
+}
+sens = ibs->sensors + cmd[2];
+sens->sensor_type = cmd[3];
+sens->evt_reading_type_code = cmd[4] & 0x7f;
+
+ out:
+return;
+}
+
+static void get_sensor_type(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+
+
+IPMI_CHECK_CMD_LEN(3);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+goto out;
+}
+sens = ibs->sensors + cmd[2];
+IPMI_ADD_RSP_DATA(sens->sensor_type);
+IPMI_ADD_RSP_DATA(sens->evt_reading_type_code);
+ out:
+return;
+}
+
 static const IPMICmdHandler chassis_cmds[IPMI_NETFN_CHASSIS_MAXCMD] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
 [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
@@ -1592,7 +1637,9 @@ sensor_event_cmds[IPMI_NETFN_SENSOR_EVENT_MAXCMD] = {
 [IPMI_CMD_GET_SENSOR_EVT_ENABLE] = get_sensor_evt_enable,
 [IPMI_CMD_REARM_SENSOR_EVTS] = rearm_sensor_evts,
 [IPMI_CMD_GET_SENSOR_EVT_STATUS] = get_sensor_evt_status,
-[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading
+[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading,
+[IPMI_CMD_SET_SENSOR_TYPE] = set_sensor_type,
+[IPMI_CMD_GET_SENSOR_TYPE] = get_sensor_type,
 };
 static const IPMINetfn sensor_event_netfn = {
 .cmd_nums = IPMI_NETFN_SENSOR_EVENT_MAXCMD,
-- 
2.1.4
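
A note on the sensor number check used in both handlers: if the table is declared as `sensors[MAX_SENSORS]`, the valid indexes run from 0 to MAX_SENSORS - 1, so a `>` comparison appears to let index MAX_SENSORS through. The sketch below shows the bounds test with `>=`; the DEMO_ names and the table size are hypothetical, not taken from ipmi_bmc_sim.c.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SENSORS 20   /* hypothetical size, for illustration */

typedef struct DemoSensor {
    bool present;
    uint8_t sensor_type;
} DemoSensor;

static DemoSensor demo_sensors[DEMO_MAX_SENSORS];

/* Valid indexes are 0 .. DEMO_MAX_SENSORS - 1, hence '>=' rather
 * than '>'. Returns NULL for an out-of-range or absent sensor. */
static DemoSensor *demo_lookup_sensor(uint8_t num)
{
    if (num >= DEMO_MAX_SENSORS || !demo_sensors[num].present) {
        return NULL;
    }
    return &demo_sensors[num];
}
```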

[Qemu-devel] [PATCH 0/8] ipmi: a couple of enhancements to the BMC simulator

2016-01-05 Thread Cédric Le Goater

Hi,

Here are a few patches adding a couple of IPMI commands to the BMC
simulator. The last patches provide an API to extend the initial SDRs
and generate events. These will be useful for the upcoming powernv
platform.

Based on 38a762fec63f and also available here:

  https://github.com/legoater/qemu/commits/ipmi

Thanks,


Cédric Le Goater (8):
  ipmi: fix SDR length value
  ipmi: add get and set SENSOR_TYPE commands
  ipmi: add GET_SYS_RESTART_CAUSE chassis command
  ipmi: add FRU support
  ipmi: add ACPI power and GUID commands
  ipmi: add SET_SENSOR_READING command (tentative try)
  ipmi: introduce an ipmi_bmc_init_sensor() API
  ipmi: introduce an ipmi_bmc_gen_event() API

 hw/ipmi/ipmi_bmc_sim.c | 483 ++---
 include/hw/ipmi/ipmi.h |  39 
 2 files changed, 500 insertions(+), 22 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH 8/8] ipmi: introduce an ipmi_bmc_gen_event() API

2016-01-05 Thread Cédric Le Goater

It will be used to fill the message buffer with custom events expected
by some systems. Typically, an Open PowerNV platform guest is notified
with an OEM SEL message before a shutdown or a reboot.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 24 
 include/hw/ipmi/ipmi.h |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 9618db44ce69..cf105c343596 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -447,6 +447,30 @@ static int attn_irq_enabled(IPMIBmcSim *ibs)
 IPMI_BMC_MSG_FLAG_EVT_BUF_FULL_SET(ibs));
 }
 
+void ipmi_bmc_gen_event(IPMIBmc *b, uint8_t *evt, bool log)
+{
+IPMIBmcSim *ibs = IPMI_BMC_SIMULATOR(b);
+IPMIInterface *s = ibs->parent.intf;
+IPMIInterfaceClass *k = IPMI_INTERFACE_GET_CLASS(s);
+
+if (!IPMI_BMC_EVENT_MSG_BUF_ENABLED(ibs)) {
+return;
+}
+
+if (log && IPMI_BMC_EVENT_LOG_ENABLED(ibs)) {
+sel_add_event(ibs, evt);
+}
+
+if (ibs->msg_flags & IPMI_BMC_MSG_FLAG_EVT_BUF_FULL) {
+goto out;
+}
+
+memcpy(ibs->evtbuf, evt, 16);
+ibs->msg_flags |= IPMI_BMC_MSG_FLAG_EVT_BUF_FULL;
+k->set_atn(s, 1, attn_irq_enabled(ibs));
+ out:
+return;
+}
 static void gen_event(IPMIBmcSim *ibs, unsigned int sens_num, uint8_t deassert,
   uint8_t evd1, uint8_t evd2, uint8_t evd3)
 {
diff --git a/include/hw/ipmi/ipmi.h b/include/hw/ipmi/ipmi.h
index ce1f539754be..0006299e263b 100644
--- a/include/hw/ipmi/ipmi.h
+++ b/include/hw/ipmi/ipmi.h
@@ -247,4 +247,6 @@ typedef uint8_t ipmi_sdr_compact_buffer[sizeof(struct 
ipmi_sdr_compact)];
 int ipmi_bmc_init_sensor(IPMIBmc *b, const uint8_t *entry,
  unsigned int len, uint8_t *sensor_num);
 
+void ipmi_bmc_gen_event(IPMIBmc *b, uint8_t *evt, bool log);
+
 #endif
-- 
2.1.4

[Qemu-devel] [PATCH 6/8] ipmi: add SET_SENSOR_READING command (tentative try)

2016-01-05 Thread Cédric Le Goater

SET_SENSOR_READING is a complex IPMI command (IPMI spec : "35.17 Set
Sensor Reading And Event Status Command"). Here is a minimal
framework fitting the Open PowerNV platform needs. This command is
used on this platform to set the "System Firmware Progress" sensor and
the "Boot Count" sensor.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 141 -
 1 file changed, 140 insertions(+), 1 deletion(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index c3a06d0ac7e4..4f7c74da4b6b 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -39,7 +39,7 @@
 #define IPMI_CMD_GET_SYS_RESTART_CAUSE0x09
 
 #define IPMI_NETFN_SENSOR_EVENT   0x04
-#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x30
+#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x31
 
 #define IPMI_CMD_SET_SENSOR_EVT_ENABLE0x28
 #define IPMI_CMD_GET_SENSOR_EVT_ENABLE0x29
@@ -48,6 +48,7 @@
 #define IPMI_CMD_GET_SENSOR_READING   0x2d
 #define IPMI_CMD_SET_SENSOR_TYPE  0x2e
 #define IPMI_CMD_GET_SENSOR_TYPE  0x2f
+#define IPMI_CMD_SET_SENSOR_READING   0x30
 
 /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
 #define IPMI_NETFN_APP_MAXCMD 0x36
@@ -1794,6 +1795,143 @@ static void get_sensor_type(IPMIBmcSim *ibs,
 return;
 }
 
+static void set_sensor_reading(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+uint8_t evd1;
+uint8_t evd2;
+uint8_t evd3;
+
+IPMI_CHECK_CMD_LEN(5);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+goto out;
+}
+
+sens = ibs->sensors + cmd[2];
+
+/* Sensor Reading operation */
+switch ((cmd[3]) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value to sensor reading byte */
+sens->reading = cmd[4];
+break;
+case 2:
+case 3:
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+/* Deassertion bits operation */
+switch ((cmd[3] >> 2) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value */
+if (cmd_len > 7) {
+sens->deassert_states = cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states = cmd[8] << 8;
+}
+
+case 2: /* mask on */
+if (cmd_len > 7) {
+sens->deassert_states |= cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states |= cmd[8] << 8;
+}
+break;
+case 3: /* mask off */
+if (cmd_len > 7) {
+sens->deassert_states &= cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states &= (cmd[8] << 8);
+}
+break;
+}
+
+/* Assertion bits operation */
+switch ((cmd[3] >> 4) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value */
+if (cmd_len > 5) {
+sens->assert_states = cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states = cmd[6] << 8;
+}
+
+case 2: /* mask on */
+if (cmd_len > 5) {
+sens->assert_states |= cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states |= cmd[6] << 8;
+}
+break;
+case 3: /* mask off */
+if (cmd_len > 5) {
+sens->assert_states &= cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states &= (cmd[6] << 8);
+}
+break;
+}
+
+evd1 = evd2 = evd3 = 0x0;
+if (cmd_len > 9) {
+evd1 = cmd[9];
+}
+if (cmd_len > 10) {
+evd2 = cmd[10];
+}
+if (cmd_len > 11) {
+evd3 = cmd[11];
+}
+
+/* Event Data Bytes operation */
+switch ((cmd[3] >> 6) & 0x3) {
+case 0: /* Do not use the event data in message */
+evd1 = evd2 = evd3 = 0x0;
+break;
+case 1: /* Write given values to event data bytes excluding bits
+ * [3:0] Event Data 1. */
+evd1 &= 0xf0;
+break;
+case 2: /* Write given values to event data bytes including bits
+ * [3:0] Event Data 1. */
+break;
+case 3:
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+if (IPMI_SENSOR_IS_DISCRETE(sens)) {
+unsigned int bit = evd1 & 0xf;
+uint16_t mask = (1 << bit);
+
+if (sens->assert_states & mask & sens->assert_enable) {
+gen_event(ibs, cmd[2], 0, evd1, evd2, evd3);
+}
+
+if (sens->deassert_states & mask & sens->deassert_enable) {
+gen_event(ibs, cmd[2], 1, evd1, evd2, evd3);
+}
+}
+
+out:
+return;
+}
+
 static const IPMICmdHandler chassis_cmds[IPM
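
One way to make the assertion and deassertion switch statements easier to review is to assemble the 16-bit operand from the two optional request bytes first and then apply a single operation. This is only a sketch with hypothetical helper names; "mask off" follows the patch's `&=` reading rather than any particular interpretation of the spec. If I read the C correctly, the two successive `&=` lines in case 3 would leave the state at zero whenever both bytes are present, since the first clears the high byte and the second clears the low byte.

```c
#include <stdint.h>

/* Build the 16-bit operand from the optional low and high request
 * bytes. have_lo/have_hi mirror the cmd_len checks in the patch. */
static uint16_t demo_operand(int have_lo, uint8_t lo, int have_hi, uint8_t hi)
{
    uint16_t v = 0;

    if (have_lo) {
        v |= lo;
    }
    if (have_hi) {
        v |= (uint16_t)(hi << 8);
    }
    return v;
}

/* op: 0 = do not change, 1 = write, 2 = mask on, 3 = mask off. */
static uint16_t demo_apply_state_op(uint16_t states, unsigned op, uint16_t val)
{
    switch (op & 0x3) {
    case 1:
        return val;
    case 2:
        return states | val;
    case 3:
        return states & val;   /* same '&' semantics as the patch */
    default:
        return states;
    }
}
```

Each case returns explicitly, so there is no reliance on falling through from "write" into "mask on".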

[Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-05 Thread Cédric Le Goater

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 55 ++
 1 file changed, 55 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 60586a67104e..c3a06d0ac7e4 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include "sysemu/sysemu.h"
 #include "qemu/timer.h"
 #include "hw/ipmi/ipmi.h"
 #include "qemu/error-report.h"
@@ -54,6 +55,9 @@
 #define IPMI_CMD_GET_DEVICE_ID0x01
 #define IPMI_CMD_COLD_RESET   0x02
 #define IPMI_CMD_WARM_RESET   0x03
+#define IPMI_CMD_SET_POWER_STATE  0x06
+#define IPMI_CMD_GET_POWER_STATE  0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
 #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
 #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
 #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -215,6 +219,9 @@ struct IPMIBmcSim {
 
 uint8_t restart_cause;
 
+uint8_t power_state[2];
+uint8_t uuid[16];
+
 IPMISel sel;
 IPMISdr sdr;
 IPMIFru fru;
@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
 k->reset(s, false);
 }
 }
+static void set_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->power_state[0] = cmd[2];
+ibs->power_state[1] = cmd[3];
+ out:
+return;
+}
+
+static void get_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->power_state[0]);
+IPMI_ADD_RSP_DATA(ibs->power_state[1]);
+ out:
+return;
+}
+
+static void get_device_guid(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+unsigned int i;
+
+for (i = 0; i < 16; i++) {
+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
+}
+ out:
+return;
+}
 
 static void set_bmc_global_enables(IPMIBmcSim *ibs,
uint8_t *cmd, unsigned int cmd_len,
@@ -1781,6 +1824,9 @@ static const IPMICmdHandler 
app_cmds[IPMI_NETFN_APP_MAXCMD] = {
 [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
 [IPMI_CMD_COLD_RESET] = cold_reset,
 [IPMI_CMD_WARM_RESET] = warm_reset,
+[IPMI_CMD_SET_POWER_STATE] = set_power_state,
+[IPMI_CMD_GET_POWER_STATE] = get_power_state,
+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
 [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
 [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
 [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
@@ -1907,6 +1953,15 @@ static void ipmi_sim_init(Object *obj)
 i += len;
 }
 
+ibs->power_state[0] = 0;
+ibs->power_state[1] = 0;
+
+if (qemu_uuid_set) {
+memcpy(&ibs->uuid, qemu_uuid, 16);
+} else {
+memset(&ibs->uuid, 0, 16);
+}
+
 ipmi_init_sensors_from_sdrs(ibs);
 register_cmds(ibs);
 
-- 
2.1.4

[Qemu-devel] [PATCH 4/8] ipmi: add FRU support

2016-01-05 Thread Cédric Le Goater

This patch provides simplistic FRU support for the IPMI BMC
simulator.  The FRU area contains 32 entries * 256 bytes which should
be enough to start some simulation.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 119 +
 1 file changed, 119 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 5db94491b130..60586a67104e 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -81,6 +81,9 @@
 #define IPMI_CMD_ENTER_SDR_REP_UPD_MODE   0x2A
 #define IPMI_CMD_EXIT_SDR_REP_UPD_MODE0x2B
 #define IPMI_CMD_RUN_INIT_AGENT   0x2C
+#define IPMI_CMD_GET_FRU_AREA_INFO0x10
+#define IPMI_CMD_READ_FRU_DATA0x11
+#define IPMI_CMD_WRITE_FRU_DATA   0x12
 #define IPMI_CMD_GET_SEL_INFO 0x40
 #define IPMI_CMD_GET_SEL_ALLOC_INFO   0x41
 #define IPMI_CMD_RESERVE_SEL  0x42
@@ -123,6 +126,14 @@ typedef struct IPMISdr {
 uint8_t overflow;
 } IPMISdr;
 
+/* theoretically, the offset being 16bits, it should be 65536 */
+#define MAX_FRU_SIZE 256
+#define MAX_FRU_ID 32
+
+typedef struct IPMIFru {
+uint8_t data[MAX_FRU_SIZE][MAX_FRU_ID];
+} IPMIFru;
+
 typedef struct IPMISensor {
 uint8_t status;
 uint8_t reading;
@@ -206,6 +217,7 @@ struct IPMIBmcSim {
 
 IPMISel sel;
 IPMISdr sdr;
+IPMIFru fru;
 IPMISensor sensors[MAX_SENSORS];
 
 /* Odd netfns are for responses, so we only need the even ones. */
@@ -1305,6 +1317,110 @@ static void get_sel_info(IPMIBmcSim *ibs,
 return;
 }
 
+static void get_fru_area_info(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ uint8_t *rsp, unsigned int *rsp_len,
+ unsigned int max_rsp_len)
+{
+uint8_t fruid;
+uint16_t fru_entry_size;
+
+IPMI_CHECK_CMD_LEN(3);
+
+fruid = cmd[2];
+
+if (fruid > MAX_FRU_ID) {
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+fru_entry_size = MAX_FRU_SIZE;
+
+IPMI_ADD_RSP_DATA(fru_entry_size & 0xff);
+IPMI_ADD_RSP_DATA(fru_entry_size >> 8 & 0xff);
+IPMI_ADD_RSP_DATA(0x0);
+out:
+return;
+}
+
+#define min(x, y) ((x) < (y) ? (x) : (y))
+#define max(x, y) ((x) > (y) ? (x) : (y))
+
+static void read_fru_data(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ uint8_t *rsp, unsigned int *rsp_len,
+ unsigned int max_rsp_len)
+{
+uint8_t fruid;
+uint16_t offset;
+int i;
+uint8_t *fru_entry;
+unsigned int count;
+
+IPMI_CHECK_CMD_LEN(5);
+
+fruid = cmd[2];
+offset = (cmd[3] | cmd[4] << 8);
+
+if (fruid > MAX_FRU_ID) {
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+if (offset >= MAX_FRU_SIZE - 1) {
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+fru_entry = ibs->fru.data[fruid];
+
+count = min(cmd[5], MAX_FRU_SIZE - offset);
+
+IPMI_ADD_RSP_DATA(count & 0xff);
+for (i = 0; i < count; i++) {
+IPMI_ADD_RSP_DATA(fru_entry[offset + i]);
+}
+
+ out:
+return;
+}
+
+static void write_fru_data(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ uint8_t *rsp, unsigned int *rsp_len,
+ unsigned int max_rsp_len)
+{
+uint8_t fruid;
+uint16_t offset;
+uint8_t *fru_entry;
+unsigned int count;
+
+IPMI_CHECK_CMD_LEN(5);
+
+fruid = cmd[2];
+offset = (cmd[3] | cmd[4] << 8);
+
+if (fruid > MAX_FRU_ID) {
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+if (offset >= MAX_FRU_SIZE - 1) {
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+goto out;
+}
+
+fru_entry = ibs->fru.data[fruid];
+
+count = min(cmd_len - 5, MAX_FRU_SIZE - offset);
+
+memcpy(fru_entry + offset, cmd + 5, count);
+
+IPMI_ADD_RSP_DATA(count & 0xff);
+ out:
+return;
+}
+
 static void reserve_sel(IPMIBmcSim *ibs,
 uint8_t *cmd, unsigned int cmd_len,
 uint8_t *rsp, unsigned int *rsp_len,
@@ -1682,6 +1798,9 @@ static const IPMINetfn app_netfn = {
 };
 
 static const IPMICmdHandler storage_cmds[IPMI_NETFN_STORAGE_MAXCMD] = {
+[IPMI_CMD_GET_FRU_AREA_INFO] = get_fru_area_info,
+[IPMI_CMD_READ_FRU_DATA] = read_fru_data,
+[IPMI_CMD_WRITE_FRU_DATA] = write_fru_data,
 [IPMI_CMD_GET_SDR_REP_INFO] = get_sdr_rep_info,
 [IPMI_CMD_RESERVE_SDR_REP] = reserve_sdr_rep,
 [IPMI_CMD_GET_SDR] = get_sdr,
-- 
2.1.4
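
A sketch of the storage layout and a bounds-checked read, assuming the intent stated in the commit message (32 entries of 256 bytes each). Note that the patch declares `data[MAX_FRU_SIZE][MAX_FRU_ID]` but indexes it as `data[fruid]`, which looks like the dimensions are swapped; the sketch puts the FRU id first and also uses `>=` for the id check. All DEMO_ names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_FRU_SIZE 256
#define DEMO_MAX_FRU_ID   32

/* One row per FRU id, DEMO_MAX_FRU_SIZE bytes per row. */
typedef struct DemoFru {
    uint8_t data[DEMO_MAX_FRU_ID][DEMO_MAX_FRU_SIZE];
} DemoFru;

/* Copy up to 'want' bytes from FRU 'id' starting at 'offset' into 'out'.
 * Returns the number of bytes copied, or -1 for an invalid id or offset. */
static int demo_fru_read(const DemoFru *fru, unsigned id, unsigned offset,
                         unsigned want, uint8_t *out)
{
    unsigned avail;

    if (id >= DEMO_MAX_FRU_ID || offset >= DEMO_MAX_FRU_SIZE) {
        return -1;
    }
    avail = DEMO_MAX_FRU_SIZE - offset;
    if (want > avail) {
        want = avail;   /* clamp to the end of the entry */
    }
    memcpy(out, fru->data[id] + offset, want);
    return (int)want;
}
```

In the real handlers the requested count comes from cmd[5], so the command length check would presumably need to cover that byte as well.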

[Qemu-devel] [PATCH 7/8] ipmi: introduce an ipmi_bmc_init_sensor() API

2016-01-05 Thread Cédric Le Goater

This routine will let qemu platforms populate the sdr/sensor tables of
the IPMI BMC simulator with their custom needs.

The patch adds a compact sensor record typedef to ease definition of
sdrs. To be used in the code the following way:

static ipmi_sdr_compact_buffer my_init_sdrs[] = {
{   /* Firmware Progress Sensor */
0xff, 0xff, 0x51, 0x02,   43, 0x20, 0x00, 0xff,
0x22, 0x00, 0xff, 0x40, 0x0f, 0x6f, 0x07, 0x00,
0x00, 0x00, 0xff, 0xff, 0xc0, 0x00, 0x00, 0x01,
0x81, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xd0,
'F',  'W',  ' ',  'B',  'o',  'o',  't',  ' ',
'P',  'r',  'o',  'g',  'r',  'e',  's',  's',
},
...
};

struct ipmi_sdr_compact *sdr =
   (struct ipmi_sdr_compact *) &my_init_sdrs[0];

ipmi_bmc_init_sensor(IPMI_BMC(obj), my_init_sdrs[0],
 sdr->rec_length + 5, &sdr->sensor_owner_number);

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 61 +-
 include/hw/ipmi/ipmi.h | 37 ++
 2 files changed, 87 insertions(+), 11 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 4f7c74da4b6b..9618db44ce69 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -527,6 +527,22 @@ static void sensor_set_discrete_bit(IPMIBmcSim *ibs, 
unsigned int sensor,
 }
 }
 
+static void ipmi_init_sensor(IPMISensor *sens, const uint8_t *sdr)
+{
+IPMI_SENSOR_SET_PRESENT(sens, 1);
+IPMI_SENSOR_SET_SCAN_ON(sens, (sdr[10] >> 6) & 1);
+IPMI_SENSOR_SET_EVENTS_ON(sens, (sdr[10] >> 5) & 1);
+sens->assert_suppt = sdr[14] | (sdr[15] << 8);
+sens->deassert_suppt = sdr[16] | (sdr[17] << 8);
+sens->states_suppt = sdr[18] | (sdr[19] << 8);
+sens->sensor_type = sdr[12];
+sens->evt_reading_type_code = sdr[13] & 0x7f;
+
+/* Enable all the events that are supported. */
+sens->assert_enable = sens->assert_suppt;
+sens->deassert_enable = sens->deassert_suppt;
+}
+
 static void ipmi_init_sensors_from_sdrs(IPMIBmcSim *s)
 {
 unsigned int i, pos;
@@ -553,19 +569,42 @@ static void ipmi_init_sensors_from_sdrs(IPMIBmcSim *s)
 }
 sens = s->sensors + sdr[7];
 
-IPMI_SENSOR_SET_PRESENT(sens, 1);
-IPMI_SENSOR_SET_SCAN_ON(sens, (sdr[10] >> 6) & 1);
-IPMI_SENSOR_SET_EVENTS_ON(sens, (sdr[10] >> 5) & 1);
-sens->assert_suppt = sdr[14] | (sdr[15] << 8);
-sens->deassert_suppt = sdr[16] | (sdr[17] << 8);
-sens->states_suppt = sdr[18] | (sdr[19] << 8);
-sens->sensor_type = sdr[12];
-sens->evt_reading_type_code = sdr[13] & 0x7f;
+ipmi_init_sensor(sens, sdr);
+}
+}
+
+int ipmi_bmc_init_sensor(IPMIBmc *b, const uint8_t *sdr,
+ unsigned int len, uint8_t *sensor_num)
+{
+IPMIBmcSim *ibs = IPMI_BMC_SIMULATOR(b);
+int ret;
+unsigned int i;
+IPMISensor *sens;
 
-/* Enable all the events that are supported. */
-sens->assert_enable = sens->assert_suppt;
-sens->deassert_enable = sens->deassert_suppt;
+for (i = 0; i < MAX_SENSORS; i++) {
+sens = ibs->sensors + i;
+if (!IPMI_SENSOR_GET_PRESENT(sens)) {
+break;
+}
+}
+
+if (i == MAX_SENSORS) {
+return 1;
 }
+
+ret = sdr_add_entry(ibs, sdr, len, NULL);
+if (ret) {
+return ret;
+}
+
+ipmi_init_sensor(sens, sdr);
+if (sensor_num) {
+*sensor_num = i;
+}
+
+/* patch sensor in sdr table. This is a little hacky. */
+ibs->sdr.sdr[ibs->sdr.next_free - len + 7] = i;
+return 0;
 }
 
 static int ipmi_register_netfn(IPMIBmcSim *s, unsigned int netfn,
diff --git a/include/hw/ipmi/ipmi.h b/include/hw/ipmi/ipmi.h
index 32bac0fa8877..ce1f539754be 100644
--- a/include/hw/ipmi/ipmi.h
+++ b/include/hw/ipmi/ipmi.h
@@ -210,4 +210,41 @@ IPMIFwInfo *ipmi_next_fwinfo(IPMIFwInfo *current);
 #define ipmi_debug(fs, ...)
 #endif
 
+/*
+ * 43.2 SDR Type 02h. Compact Sensor Record
+ */
+struct ipmi_sdr_compact {
+uint16_t rec_id;
+uint8_t  sdr_version;   /* 0x51 */
+uint8_t  rec_type;  /* 0x02 Compact Sensor Record */
+uint8_t  rec_length;
+uint8_t  sensor_owner_id;
+uint8_t  sensor_owner_lun;
+uint8_t  sensor_owner_number;   /* byte 8 */
+uint8_t  entity_id;
+uint8_t  entity_instance;
+uint8_t  sensor_init;
+uint8_t  sensor_caps;
+uint8_t  sensor_type;
+uint8_t  reading_type;
+uint8_t  assert_mask[2];/* byte 16 */
+uint8_t  deassert_mask[2];
+uint8_t  discrete_mask[2];
+uint8_t  sensor_unit1;
+uint8_t  sensor_unit2;
+uint8_t  sensor_unit3;
+uint8_t  sensor_direction[2];   /* byte 24 */
+uint8_t  positive_threshold;
+uint8_t  negative_threshold;
+uint8_t  reserved[3];
+uint8_t  oem;
+uint8_t  id_str_len;/* byte 32 */
+u
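
The `sdr->rec_length + 5` in the usage example above follows from the record layout: rec_length counts only the bytes after the 5-byte header (rec_id, sdr_version, rec_type, rec_length). A minimal sketch of walking a packed blob of records with that rule, using hypothetical names and no QEMU types:

```c
#include <stdint.h>
#include <stddef.h>

/* rec_id (2 bytes) + sdr_version + rec_type + rec_length */
#define DEMO_SDR_HEADER_LEN 5

/* Count complete records in a packed SDR blob. Stops at the first
 * record that would run past the end of the buffer. */
static unsigned demo_count_sdrs(const uint8_t *blob, size_t size)
{
    size_t pos = 0;
    unsigned count = 0;

    while (pos + DEMO_SDR_HEADER_LEN <= size) {
        /* Byte 4 of each record holds the body length only. */
        size_t total = (size_t)blob[pos + 4] + DEMO_SDR_HEADER_LEN;

        if (pos + total > size) {
            break;
        }
        count++;
        pos += total;
    }
    return count;
}
```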

[Qemu-devel] [PATCH 3/8] ipmi: add GET_SYS_RESTART_CAUSE chassis command

2016-01-05 Thread Cédric Le Goater

This is a simulator. Just return an unknown cause (0).

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 061db8437479..5db94491b130 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -30,11 +30,12 @@
 #include "qemu/error-report.h"
 
 #define IPMI_NETFN_CHASSIS0x00
-#define IPMI_NETFN_CHASSIS_MAXCMD 0x03
+#define IPMI_NETFN_CHASSIS_MAXCMD 0x0a
 
 #define IPMI_CMD_GET_CHASSIS_CAPABILITIES 0x00
 #define IPMI_CMD_GET_CHASSIS_STATUS   0x01
 #define IPMI_CMD_CHASSIS_CONTROL  0x02
+#define IPMI_CMD_GET_SYS_RESTART_CAUSE0x09
 
 #define IPMI_NETFN_SENSOR_EVENT   0x04
 #define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x30
@@ -201,6 +202,8 @@ struct IPMIBmcSim {
 uint8_t mfg_id[3];
 uint8_t product_id[2];
 
+uint8_t restart_cause;
+
 IPMISel sel;
 IPMISdr sdr;
 IPMISensor sensors[MAX_SENSORS];
@@ -754,6 +757,17 @@ static void chassis_control(IPMIBmcSim *ibs,
 return;
 }
 
+static void chassis_get_sys_restart_cause(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->restart_cause & 0xf); /* Restart Cause */
+IPMI_ADD_RSP_DATA(0);  /* Channel 0 */
+ out:
+return;
+}
+
 static void get_device_id(IPMIBmcSim *ibs,
   uint8_t *cmd, unsigned int cmd_len,
   uint8_t *rsp, unsigned int *rsp_len,
@@ -1624,7 +1638,8 @@ static void get_sensor_type(IPMIBmcSim *ibs,
 static const IPMICmdHandler chassis_cmds[IPMI_NETFN_CHASSIS_MAXCMD] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
 [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
-[IPMI_CMD_CHASSIS_CONTROL] = chassis_control
+[IPMI_CMD_CHASSIS_CONTROL] = chassis_control,
+[IPMI_CMD_GET_SYS_RESTART_CAUSE] = chassis_get_sys_restart_cause
 };
 static const IPMINetfn chassis_netfn = {
 .cmd_nums = IPMI_NETFN_CHASSIS_MAXCMD,
@@ -1746,6 +1761,7 @@ static void ipmi_sim_init(Object *obj)
 ibs->bmc_global_enables = (1 << IPMI_BMC_EVENT_LOG_BIT);
 ibs->device_id = 0x20;
 ibs->ipmi_version = 0x02; /* IPMI 2.0 */
+ibs->restart_cause = 0;
 for (i = 0; i < 4; i++) {
 ibs->sel.last_addition[i] = 0xff;
 ibs->sel.last_clear[i] = 0xff;
-- 
2.1.4

[Qemu-devel] [PATCH 1/8] ipmi: fix SDR length value

2016-01-05 Thread Cédric Le Goater

The IPMI BMC simulator populates the SDR table with a set of initial
SDRs. The length of each SDR is taken from the record itself (byte 4)
which does not include the size of the header. But, the full length
(header + data) is required by the sdr_add_entry() routine.

Signed-off-by: Cédric Le Goater 
---

 Maybe we could use a sdr struct/typedef to clarify the code. See
 patch 7: "ipmi: introduce an ipmi_bmc_init_sensor() API"

 hw/ipmi/ipmi_bmc_sim.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 0a59e539f549..559e1398d669 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -362,7 +362,7 @@ static int sdr_find_entry(IPMISdr *sdr, uint16_t recid,
 
 while (pos < sdr->next_free) {
 uint16_t trec = sdr->sdr[pos] | (sdr->sdr[pos + 1] << 8);
-unsigned int nextpos = pos + sdr->sdr[pos + 4];
+unsigned int nextpos = pos + sdr->sdr[pos + 4] + 5;
 
 if (trec == recid) {
 if (nextrec) {
@@ -1198,7 +1198,7 @@ static void get_sdr(IPMIBmcSim *ibs,
 rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
 goto out;
 }
-if (cmd[6] > (ibs->sdr.sdr[pos + 4])) {
+if (cmd[6] > (ibs->sdr.sdr[pos + 4] + 5)) {
 rsp[2] = IPMI_CC_PARM_OUT_OF_RANGE;
 goto out;
 }
@@ -1207,7 +1207,7 @@ static void get_sdr(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA((nextrec >> 8) & 0xff);
 
 if (cmd[7] == 0xff) {
-cmd[7] = ibs->sdr.sdr[pos + 4] - cmd[6];
+cmd[7] = ibs->sdr.sdr[pos + 4] + 5 - cmd[6];
 }
 
 if ((cmd[7] + *rsp_len) > max_rsp_len) {
@@ -1709,20 +1709,20 @@ static void ipmi_sim_init(Object *obj)
 for (i = 0;;) {
 int len;
 if ((i + 5) > sizeof(init_sdrs)) {
-error_report("Problem with recid 0x%4.4x: \n", i);
+error_report("Problem with recid 0x%4.4x\n", i);
 return;
 }
-len = init_sdrs[i + 4];
+len = init_sdrs[i + 4] + 5;
 recid = init_sdrs[i] | (init_sdrs[i + 1] << 8);
 if (recid == 0x) {
 break;
 }
-if ((i + len + 5) > sizeof(init_sdrs)) {
+if ((i + len) > sizeof(init_sdrs)) {
 error_report("Problem with recid 0x%4.4x\n", i);
 return;
 }
 sdr_add_entry(ibs, init_sdrs + i, len, NULL);
-i += len + 5;
+i += len;
 }
 
 ipmi_init_sensors_from_sdrs(ibs);
-- 
2.1.4
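
The length rule in this patch can be illustrated outside of QEMU. The sketch
below walks a flat SDR table using "byte 4 + 5" as the full record size. The
names sdr_full_len and sdr_count are hypothetical, and treating record id
0xffff as the end marker is an assumption made for this example only:

```c
#include <stddef.h>
#include <stdint.h>

#define SDR_HEADER_LEN 5 /* record id (2 bytes), version, type, body length */

/* Full size of the record at sdr[pos]: header plus the body length byte. */
size_t sdr_full_len(const uint8_t *sdr, size_t pos)
{
    return (size_t)sdr[pos + 4] + SDR_HEADER_LEN;
}

/*
 * Count the records in a table. Stops at the assumed 0xffff end marker.
 * Returns -1 if a header or a record body runs past the end of the table.
 */
int sdr_count(const uint8_t *sdr, size_t size)
{
    size_t pos = 0;
    int n = 0;

    while (pos + SDR_HEADER_LEN <= size) {
        unsigned recid = sdr[pos] | (sdr[pos + 1] << 8);
        size_t len;

        if (recid == 0xffff) {
            return n;
        }
        len = sdr_full_len(sdr, pos);
        if (pos + len > size) {
            return -1;
        }
        pos += len; /* advance by header + body, not by the body alone */
        n++;
    }
    return pos == size ? n : -1;
}
```

Advancing by the body length alone would land in the middle of a record,
which is the kind of mismatch the patch removes.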

Re: [Qemu-devel] [RFC v6 02/14] softmmu: Add new TLB_EXCL flag

2016-01-05 Thread alvise rigo

On Tue, Jan 5, 2016 at 5:10 PM, Alex Bennée  wrote:
>
> Alvise Rigo  writes:
>
>> Add a new TLB flag to force all the accesses made to a page to follow
>> the slow-path.
>>
>> In the case we remove a TLB entry marked as EXCL, we unset the
>> corresponding exclusive bit in the bitmap.
>>
>> Suggested-by: Jani Kokkonen 
>> Suggested-by: Claudio Fontana 
>> Signed-off-by: Alvise Rigo 
>> ---
>>  cputlb.c|  38 +++-
>>  include/exec/cpu-all.h  |   8 
>>  include/exec/cpu-defs.h |   1 +
>>  include/qom/cpu.h   |  14 ++
>>  softmmu_template.h  | 114 ++--
>>  5 files changed, 152 insertions(+), 23 deletions(-)
>>
>> diff --git a/cputlb.c b/cputlb.c
>> index bf1d50a..7ee0c89 100644
>> --- a/cputlb.c
>> +++ b/cputlb.c
>> @@ -394,6 +394,16 @@ void tlb_set_page_with_attrs(CPUState *cpu, 
>> target_ulong vaddr,
>>  env->tlb_v_table[mmu_idx][vidx] = *te;
>>  env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
>>
>> +if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write &
>> TLB_EXCL))) {
>
> Why do we care about TLB_MMIO flags here? Does it actually happen? Would
> bad things happen if we enforced exclusivity for an MMIO write? Do the
> other flags matter?

In the previous version of the patch series it came out that the
accesses to MMIO regions have to be supported since, for instance, the
GDB stub relies on them.
The last two patches actually finalize the MMIO support.

>
> There should be a comment as to why MMIO is mentioned I think.

OK.

>
>> +/* We are removing an exclusive entry, set the page to dirty. This
>> + * is not be necessary if the vCPU has performed both SC and LL. */
>> +hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & 
>> TARGET_PAGE_MASK) +
>> +  (te->addr_write & 
>> TARGET_PAGE_MASK);
>> +if (!cpu->ll_sc_context) {
>> +cpu_physical_memory_unset_excl(hw_addr, cpu->cpu_index);
>> +}
>> +}
>> +
>>  /* refill the tlb */
>>  env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
>>  env->iotlb[mmu_idx][index].attrs = attrs;
>> @@ -419,7 +429,15 @@ void tlb_set_page_with_attrs(CPUState *cpu, 
>> target_ulong vaddr,
>> + xlat)) {
>>  te->addr_write = address | TLB_NOTDIRTY;
>>  } else {
>> -te->addr_write = address;
>> +if (!(address & TLB_MMIO) &&
>> +cpu_physical_memory_atleast_one_excl(section->mr->ram_addr
>> +   + xlat)) {
>> +/* There is at least one vCPU that has flagged the address 
>> as
>> + * exclusive. */
>> +te->addr_write = address | TLB_EXCL;
>> +} else {
>> +te->addr_write = address;
>> +}
>>  }
>>  } else {
>>  te->addr_write = -1;
>> @@ -471,6 +489,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, 
>> target_ulong addr)
>>  return qemu_ram_addr_from_host_nofail(p);
>>  }
>>
>> +/* For every vCPU compare the exclusive address and reset it in case of a
>> + * match. Since only one vCPU is running at once, no lock has to be held to
>> + * guard this operation. */
>> +static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
>> +{
>> +CPUState *cpu;
>> +
>> +CPU_FOREACH(cpu) {
>> +if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
>> +ranges_overlap(cpu->excl_protected_range.begin,
>> +   cpu->excl_protected_range.end -
>> +   cpu->excl_protected_range.begin,
>> +   addr, size)) {
>> +cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
>> +}
>> +}
>> +}
>> +
>>  #define MMUSUFFIX _mmu
>>
>>  #define SHIFT 0
>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
>> index 83b1781..f8d8feb 100644
>> --- a/include/exec/cpu-all.h
>> +++ b/include/exec/cpu-all.h
>> @@ -277,6 +277,14 @@ CPUArchState *cpu_copy(CPUArchState *env);
>>  #define TLB_NOTDIRTY(1 << 4)
>>  /* Set if TLB entry is an IO callback.  */
>>  #define TLB_MMIO(1 << 5)
>> +/* Set if TLB entry references a page that requires exclusive access.  */
>> +#define TLB_EXCL(1 << 6)
>> +
>> +/* Do not allow a TARGET_PAGE_MASK which covers one or more bits defined
>> + * above. */
>> +#if TLB_EXCL >= TARGET_PAGE_SIZE
>> +#error TARGET_PAGE_MASK covering the low bits of the TLB virtual address
>> +#endif
>>
>>  void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>>  void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
>> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
>> index 5093be2..b34d7ae 100644
>> --- a/include/exec/cpu-defs.h
>> +++ b/include/exec/cpu-defs.h
>> @@ -27,6 +27,7 @@
>>  #include 
>>  #include "qemu/osdep.
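
The reset loop in lookup_and_reset_cpus_ll_addr() quoted above can be
sketched standalone as follows. This is only an illustration: struct
excl_range, the UINT64_MAX sentinel and the overlap() helper are assumptions
standing in for the patch's EXCLUSIVE_RESET_ADDR and QEMU's ranges_overlap(),
and the "end" field is treated as exclusive.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXCL_RESET_ADDR UINT64_MAX /* assumed sentinel: no range protected */

struct excl_range {
    uint64_t begin;
    uint64_t end; /* exclusive */
};

/* True if [first1, first1 + len1) and [first2, first2 + len2) intersect. */
bool overlap(uint64_t first1, uint64_t len1, uint64_t first2, uint64_t len2)
{
    return first1 < first2 + len2 && first2 < first1 + len1;
}

/*
 * Drop the protected range of every vCPU whose range intersects the access
 * [addr, addr + size). Returns how many ranges were reset.
 */
int reset_overlapping(struct excl_range *cpus, int n,
                      uint64_t addr, uint64_t size)
{
    int resets = 0;

    for (int i = 0; i < n; i++) {
        if (cpus[i].begin != EXCL_RESET_ADDR &&
            overlap(cpus[i].begin, cpus[i].end - cpus[i].begin, addr, size)) {
            cpus[i].begin = EXCL_RESET_ADDR;
            resets++;
        }
    }
    return resets;
}
```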

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Laszlo Ersek

On 01/05/16 18:08, Igor Mammedov wrote:
> On Mon, 4 Jan 2016 21:17:31 +0100
> Laszlo Ersek  wrote:
> 
>> Michael CC'd me on the grandparent of the email below. I'll try to add
>> my thoughts in a single go, with regard to OVMF.
>>
>> On 12/30/15 20:52, Michael S. Tsirkin wrote:
>>> On Wed, Dec 30, 2015 at 04:55:54PM +0100, Igor Mammedov wrote:  
>>>> On Mon, 28 Dec 2015 14:50:15 +0200
>>>> "Michael S. Tsirkin"  wrote:
>>>>
>>>>> On Mon, Dec 28, 2015 at 10:39:04AM +0800, Xiao Guangrong wrote:
>>>>>>
>>>>>> Hi Michael, Paolo,
>>>>>>
>>>>>> Now it is the time to return to the challenge that how to reserve guest
>>>>>> physical region internally used by ACPI.
>>>>>>
>>>>>> Igor suggested that:
>>>>>> | An alternative place to allocate reserve from could be high memory.
>>>>>> | For pc we have "reserved-memory-end" which currently makes sure
>>>>>> | that hotpluggable memory range isn't used by firmware
>>>>>> (https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00926.html)
>>>>>>
>>
>> OVMF has no support for the "reserved-memory-end" fw_cfg file. The
>> reason is that nobody wrote that patch, nor asked for the patch to be
>> written. (Not implying that just requesting the patch would be
>> sufficient for the patch to be written.)
>>
>>>>> I don't want to tie things to reserved-memory-end because this
>>>>> does not scale: next time we need to reserve memory,
>>>>> we'll need to find yet another way to figure out what is where.
>>>> Could you elaborate a bit more on a problem you're seeing?
>>>>
>>>> To me it looks like it scales rather well.
>>>> For example lets imagine that we adding a device
>>>> that has some on device memory that should be mapped into GPA
>>>> code to do so would look like:
>>>>
>>>>   pc_machine_device_plug_cb(dev)
>>>>   {
>>>>    ...
>>>>    if (dev == OUR_NEW_DEVICE_TYPE) {
>>>>        memory_region_add_subregion(as, current_reserved_end, &dev->mr);
>>>>        set_new_reserved_end(current_reserved_end +
>>>>                             memory_region_size(&dev->mr));
>>>>    }
>>>>   }
>>>>
>>>> we can practically add any number of new devices that way.
>>>
>>> Yes but we'll have to build a host side allocator for these, and that's
>>> nasty. We'll also have to maintain these addresses indefinitely (at
>>> least per machine version) as they are guest visible.
>>> Not only that, there's no way for guest to know if we move things
>>> around, so basically we'll never be able to change addresses.
>>>
>>>   

>>>>> I would like ./hw/acpi/bios-linker-loader.c interface to be extended to
>>>>> support 64 bit RAM instead
>>
>> This looks quite doable in OVMF, as long as the blob to allocate from
>> high memory contains *zero* ACPI tables.
>>
>> (
>> Namely, each ACPI table is installed from the containing fw_cfg blob
>> with EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable(), and the latter has its
>> own allocation policy for the *copies* of ACPI tables it installs.
>>
>> This allocation policy is left unspecified in the section of the UEFI
>> spec that governs EFI_ACPI_TABLE_PROTOCOL.
>>
>> The current policy in edk2 (= the reference implementation) seems to be
>> "allocate from under 4GB". It is currently being changed to "try to
>> allocate from under 4GB, and if that fails, retry from high memory". (It
>> is motivated by Aarch64 machines that may have no DRAM at all under 4GB.)
>> )
>>
>>>>> (and maybe a way to allocate and
>>>>> zero-initialize buffer without loading it through fwcfg),
>>
>> Sounds reasonable.
>>
>>>>> this way bios
>>>>> does the allocation, and addresses can be patched into acpi.
>>>> and then guest side needs to parse/execute some AML that would
>>>> initialize QEMU side so it would know where to write data.
>>>
>>> Well not really - we can put it in a data table, by itself
>>> so it's easy to find.  
>>
>> Do you mean acpi_tb_find_table(), acpi_get_table_by_index() /
>> acpi_get_table_with_size()?
>>
>>>
>>> AML is only needed if access from ACPI is desired.
>>>
>>>   
>>>> bios-linker-loader is a great interface for initializing some
>>>> guest owned data and linking it together but I think it adds
>>>> unnecessary complexity and is misused if it's used to handle
>>>> device owned data/on device memory in this and VMGID cases.
>>>
>>> I want a generic interface for guest to enumerate these things.  linker
>>> seems quite reasonable but if you see a reason why it won't do, or want
>>> to propose a better interface, fine.  
>>
>> * The guest could do the following:
>> - while processing the ALLOCATE commands, it would make a note where in
>> GPA space each fw_cfg blob gets allocated
>> - at the end the guest would prepare a temporary array with a predefined
>> record format, that associates each fw_cfg blob's name with the concrete
>> allocation address
>> - it would create an FWCfgDmaAccess structure pointing at this array,
>> with a new "control" bit set (or something similar)
>> - the guest could write the address of the FWCfgDmaAccess struct to the
>> appropria

Re: [Qemu-devel] [PATCH v8 32/35] qapi: Split visit_end_struct() into pieces

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> As mentioned in previous patches, we want to call visit_end_struct()
> functions unconditionally, so that visitors can release resources
> tied up since the matching visit_start_struct() without also having
> to worry about error priority if more than one error occurs.
>
> Even though error_propagate() can be safely used to ignore a second
> error during cleanup caused by a first error, it is simpler if the
> cleanup cannot set an error, and we instead split the task of
> checking that an input visitor has no unvisited input as a new
> function visit_check_struct(), called only if all prior steps are
> successful.
>
> Generated code has diffs resembling:
>
> |@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
> | goto out_obj;
> | }
> | visit_type_ACPIOSTInfo_fields(v, obj, &err);
> |+if (err) {
> |+goto out_obj;
> |+}
> |+visit_check_struct(v, &err);
> | out_obj:
> |-error_propagate(errp, err);
> |-err = NULL;
> |-visit_end_struct(v, &err);
> |+visit_end_struct(v);
> | out:
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau
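
The control flow described in the commit message (check only after success,
end unconditionally) can be sketched with a mock visitor. Everything below is
a hypothetical stand-in for illustration: Err, Visitor and the helper
functions are not the QAPI API, they only mirror the shape of the generated
code quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { const char *msg; } Err; /* stand-in for Error */

typedef struct {
    int depth;        /* open structs; must return to 0 after a visit */
    int unvisited;    /* leftover input keys, as an input visitor might see */
    bool fields_fail; /* force a failure while visiting the members */
} Visitor;

Err err_fields = { "field visit failed" };
Err err_extra = { "unvisited input" };

void start_struct(Visitor *v) { v->depth++; }

void visit_fields(Visitor *v, Err **errp)
{
    if (v->fields_fail) {
        *errp = &err_fields;
    }
}

/* Reports leftover input; only meaningful if nothing failed before it. */
void check_struct(Visitor *v, Err **errp)
{
    if (v->unvisited) {
        *errp = &err_extra;
    }
}

/* Cleanup only: it cannot fail, so it can never mask an earlier error. */
void end_struct(Visitor *v) { v->depth--; }

Err *visit_object(Visitor *v)
{
    Err *err = NULL;

    start_struct(v);
    visit_fields(v, &err);
    if (!err) {
        check_struct(v, &err);
    }
    end_struct(v); /* always runs, releasing what start_struct() took */
    return err;
}
```

With this shape the first error always wins, and no error_propagate() dance
is needed around the cleanup call.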

Re: [Qemu-devel] [PATCH v8 34/35] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Returning a partial object on error is an invitation for a careless
> caller to leak memory.  As no one outside the testsuite was actually
> relying on these semantics, it is cleaner to just document and
> guarantee that ALL pointer-based visit_type_FOO() functions always
> leave a safe value in *obj during an input visitor (either the new
> object on success, or NULL if an error is encountered).
>
> Since input visitors have blind assignment semantics, we have to
> track the result of whether an assignment is made all the way down
> to each visitor callback implementation, to avoid making decisions
> based on potentially uninitialized storage.
>
> Note that we still leave *obj unchanged after a scalar-based
> visit_type_FOO(); I did not feel like auditing all uses of
> visit_type_Enum() to see if the callers would tolerate a specific
> sentinel value (not to mention having to decide whether it would
> be better to use 0 or ENUM__MAX as that sentinel).
>
> Signed-off-by: Eric Blake 
>

Nice cleanup, a few issues:

> ---
> v8: rebase to earlier changes
> v7: rebase to earlier changes, enhance commit message, also fix
> visit_type_str() and visit_type_any()
> v6: rebase on top of earlier doc and formatting improvements, mention
> that *obj can be uninitialized on entry to an input visitor, rework
> semantics to keep valgrind happy on uninitialized input, break some
> long lines
> ---
>  include/qapi/visitor-impl.h|  6 ++---
>  include/qapi/visitor.h | 53 
> --
>  qapi/opts-visitor.c| 11 ++---
>  qapi/qapi-dealloc-visitor.c|  9 ---
>  qapi/qapi-visit-core.c | 39 ++-
>  qapi/qmp-input-visitor.c   | 18 +-
>  qapi/qmp-output-visitor.c  |  6 +++--
>  qapi/string-input-visitor.c|  6 +++--
>  qapi/string-output-visitor.c   |  3 ++-
>  scripts/qapi-visit.py  | 40 +++
>  tests/test-qmp-commands.c  | 13 +--
>  tests/test-qmp-input-strict.c  | 19 +++
>  tests/test-qmp-input-visitor.c | 10 ++--
>  13 files changed, 154 insertions(+), 79 deletions(-)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 94d65fa..c32e5f5 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -26,7 +26,7 @@ struct Visitor
>  {
>  /* Must be provided to visit structs (the string visitors do not
>   * currently visit structs). */
> -void (*start_struct)(Visitor *v, const char *name, void **obj,
> +bool (*start_struct)(Visitor *v, const char *name, void **obj,
>   size_t size, Error **errp);
>  /* May be NULL; most useful for input visitors. */
>  void (*check_struct)(Visitor *v, Error **errp);
> @@ -34,13 +34,13 @@ struct Visitor
>  void (*end_struct)(Visitor *v);
>
>  /* May be NULL; most useful for input visitors. */
> -void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
> +bool (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
>Error **errp);
>  /* May be NULL */
>  void (*end_implicit_struct)(Visitor *v);
>
>  /* Must be set */
> -void (*start_list)(Visitor *v, const char *name, GenericList **list,
> +bool (*start_list)(Visitor *v, const char *name, GenericList **list,
> Error **errp);
>  /* Must be set */
>  GenericList *(*next_list)(Visitor *v, GenericList *element, Error 
> **errp);
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index 4638863..4eae633 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -31,6 +31,27 @@
>   * visitor-impl.h.
>   */
>
> +/* All qapi types have a corresponding function with a signature
> + * roughly compatible with this:
> + *
> + * void visit_type_FOO(Visitor *v, void *obj, const char *name, Error 
> **errp);
> + *
> + * where *@obj is itself a pointer or a scalar.  The visit functions for
> + * built-in types are declared here, while the functions for qapi-defined
> + * struct, union, enum, and list types are generated (see qapi-visit.h).
> + * Input visitors may receive an uninitialized *@obj, and guarantee a
> + * safe value is assigned (a new object on success, or NULL on failure).
> + * Output visitors expect *@obj to be properly initialized on entry.
> + *
> + * Additionally, all qapi structs have a generated function compatible
> + * with this:
> + *
> + * void qapi_free_FOO(void *obj);
> + *
> + * which behaves like free(), even if @obj is NULL or was only partially
> + * allocated before encountering an error.
> + */
> +
>  /* This struct is layout-compatible with all other *List structs
>   * created by the qapi generator. */
>  typedef struct GenericList
> @@ -62,11 +83,12 @@ typedef struct GenericList
>   * Set *@errp on failure; for example, if the input stream does no

Re: [Qemu-devel] [PATCH v8 33/35] qapi: Simplify semantics of visit_next_list()

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> We have two uses of list visits in the entire code base: one in
> spapr_drc (which completely avoids visit_next_list(), feeding in
> integers from a different source than uint8List), and one in
> qapi-visit.py (that is, all other list visitors are generated
> in qapi-visit.c, and share the same paradigm based on a qapi
> FooList type).  What's more, the semantics of the list visit are
> somewhat baroque, with the following pseudocode when FooList is
> used:
>
> start()
> prev = head
> while (cur = next(prev)) {
> visit(cur)
> prev = &cur
> }
>
> Note that these semantics (advance before visit) require that
> the first call to next() return the list head, while all other
> calls return the next element of the list; that is, every visitor
> implementation is required to track extra state to decide whether
> to return the input as-is, or to advance.  It also requires an
> argument of 'GenericList **' to next(), solely because the first
> iteration might need to modify the caller's GenericList head, so
> that all other calls have to do a layer of dereferencing.
>
> We can greatly simplify things by hoisting the special case
> into the start() routine, and flipping the order in the loop
> to visit before advance:
>
> start(head)
> element = *head
> while (element) {
> visit(element)
> element = next(element)
> }
>
> With the simpler semantics, visitors have less state to track,
> the argument to next() is reduced to 'GenericList *', and it
> also becomes obvious whether an input visitor is allocating a
> FooList during visit_start_list() (rather than the old way of
> not knowing if an allocation happened until the first
> visit_next_list()).
>
> The signature of visit_start_list() is chosen to match
> visit_start_struct(), with the new parameter after 'name'.
>
> The spapr_drc case requires that visit_start_list() has to pay
> attention to whether visit_next_list() will even be used to
> visit a FooList qapi struct; this is done by passing NULL for
> list, similarly to how NULL is passed to visit_start_struct()
> when a qapi type is not used in those visits.  It was easy to
> provide these semantics for qmp-output and dealloc visitors,
> and a bit harder for qmp-input (it required hoisting the
> advance of the current qlist entry out of qmp_input_next_list()
> into qmp_input_get_object()).  But it turned out that the
> string and opts visitors munge enough state during
> visit_next_list() to make those conversions simpler if they
> require a GenericList visit for now; an assertion will remind
> us to adjust things if we need the semantics in the future.
>
> Signed-off-by: Eric Blake 

nice cleanup
Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau
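
The "visit before advance" loop from the second pseudocode block can be
written out as plain C. This is a sketch under assumptions: Node, list_start
and list_next are hypothetical names, not the visitor callbacks themselves,
and "visiting" is reduced to summing integers.

```c
#include <stddef.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* start() alone deals with the head, so next() needs no first-call state. */
Node *list_start(Node **head)
{
    return *head;
}

/* next() takes a plain element pointer and simply advances. */
Node *list_next(Node *element)
{
    return element->next;
}

int sum_list(Node **head)
{
    int sum = 0;
    Node *element = list_start(head);

    while (element) {
        sum += element->value;        /* visit(element) */
        element = list_next(element); /* then advance */
    }
    return sum;
}
```

An empty list is handled by the loop condition alone, with no special case
inside list_next().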

Re: [Qemu-devel] [PATCH v8 35/35] RFC: qapi: Adjust layout of FooList types

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> By sticking the next pointer first, we don't need a union with
> 64-bit padding for smaller types.  On 32-bit platforms, this
> can reduce the size of uint8List from 16 bytes (or 12, depending
> on whether 64-bit ints can tolerate 4-byte alignment) down to 8.
> It has no effect on 64-bit platforms (where alignment still
> dictates a 16-byte struct); but fewer anonymous unions is still
> a win in my book.

small win, but a win

> However, this requires visit_start_list() and visit_next_list()
> to gain a size parameter, to know what size element to allocate.

If this is the only "drawback", I'd be fine with it.

> I debated about going one step further, to allow for fewer casts,
> by doing:
> typedef struct GenericList GenericList;
> struct GenericList {
> GenericList *next;
> };
> struct FooList {
> GenericList base;
> Foo value;
> };
> so that you convert to 'GenericList *' by '&foolist->base', and
> back by 'container_of(generic, FooList, base)' (as opposed to
> the existing '(GenericList *)foolist' and '(FooList *)generic').
> But doing that would require hoisting the declaration of
> GenericList prior to inclusion of qapi-types.h, rather than its
> current spot in visitor.h; it also makes iteration a bit more
> verbose through 'foolist->base.next' instead of 'foolist->next'.
>
> Signed-off-by: Eric Blake 
>

Otherwise, the patch looks good to me. I'll wait for the non-RFC version,
if all agree with it, before giving my Reviewed-by tag.

-- 
Marc-André Lureau
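
The alternative layout debated in the commit message (embedding a GenericList
base as the first member) can be sketched as below. U8List and the two
conversion helpers are hypothetical names for illustration, and container_of
is defined locally rather than taken from any QEMU header.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct GenericList GenericList;
struct GenericList {
    GenericList *next;
};

/* Next pointer first: no union is needed to pad a small value member. */
typedef struct U8List {
    GenericList base;
    uint8_t value;
} U8List;

/* Recover the containing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

GenericList *u8list_to_generic(U8List *list)
{
    return &list->base;
}

U8List *u8list_from_generic(GenericList *generic)
{
    return container_of(generic, U8List, base);
}
```

On common ABIs sizeof(U8List) comes out as two pointers' worth (8 bytes on
32-bit, 16 on 64-bit), in line with the sizes given in the commit message.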

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Xiao Guangrong




On 01/06/2016 12:43 AM, Michael S. Tsirkin wrote:


Yes - if address is static, you need to put it outside
the table. Can come right before or right after this.


Also if OperationRegion() is used, then one has to patch
DefOpRegion directly as RegionOffset must be Integer,
using variable names is not permitted there.


I am not sure the comment was understood correctly.
The comment says really "we can't use DataTableRegion
so here is an alternative".

so how are you going to access data at which patched
NameString point to?
for that you'd need a normal patched OperationRegion
as well since DataTableRegion isn't usable.


For VMGENID you would patch the method that
returns the address - you do not need an op region
as you never access it.

I don't know about NVDIMM. Maybe OperationRegion can
use the patched NameString? Will need some thought.


The ACPI spec says that the offsetTerm in OperationRegion
is evaluated as Int, so a named object is allowed to be
used in OperationRegion. That is exactly what my patchset
is doing (http://marc.info/?l=kvm&m=145193395624537&w=2):

+dsm_mem = aml_arg(3);
+aml_append(method, aml_store(aml_call0(NVDIMM_GET_DSM_MEM), dsm_mem));

+aml_append(method, aml_operation_region("NRAM", AML_SYSTEM_MEMORY,
+dsm_mem, TARGET_PAGE_SIZE));

We hide the int64 object, which is patched by the BIOS, inside the method
NVDIMM_GET_DSM_MEM to keep Windows XP happy.

However, the disadvantages I see are:
a) as Igor pointed out, we need a way to tell QEMU what the patched
   address is; in NVDIMM ACPI, we used a 64-bit IO port to pass the address
   to QEMU.

b) BIOS-allocated memory is RAM based, so it prevents us from using MMIO in
   ACPI. MMIO is a more scalable resource than IO ports, as it has a larger
   region and supports 64-bit operations.

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Igor Mammedov

On Mon, 4 Jan 2016 21:17:31 +0100
Laszlo Ersek  wrote:

> Michael CC'd me on the grandparent of the email below. I'll try to add
> my thoughts in a single go, with regard to OVMF.
> 
> On 12/30/15 20:52, Michael S. Tsirkin wrote:
> > On Wed, Dec 30, 2015 at 04:55:54PM +0100, Igor Mammedov wrote:  
> >> On Mon, 28 Dec 2015 14:50:15 +0200
> >> "Michael S. Tsirkin"  wrote:
> >>  
> >>> On Mon, Dec 28, 2015 at 10:39:04AM +0800, Xiao Guangrong wrote:  
> 
> >>>> Hi Michael, Paolo,
> >>>>
> >>>> Now it is the time to return to the challenge that how to reserve guest
> >>>> physical region internally used by ACPI.
> >>>>
> >>>> Igor suggested that:
> >>>> | An alternative place to allocate reserve from could be high memory.
> >>>> | For pc we have "reserved-memory-end" which currently makes sure
> >>>> | that hotpluggable memory range isn't used by firmware
> >>>> (https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00926.html)
> >>>>
> 
> OVMF has no support for the "reserved-memory-end" fw_cfg file. The
> reason is that nobody wrote that patch, nor asked for the patch to be
> written. (Not implying that just requesting the patch would be
> sufficient for the patch to be written.)
> 
> >>> I don't want to tie things to reserved-memory-end because this
> >>> does not scale: next time we need to reserve memory,
> >>> we'll need to find yet another way to figure out what is where.  
> >> Could you elaborate a bit more on a problem you're seeing?
> >>
> >> To me it looks like it scales rather well.
> >> For example lets imagine that we adding a device
> >> that has some on device memory that should be mapped into GPA
> >> code to do so would look like:
> >>
> >>   pc_machine_device_plug_cb(dev)
> >>   {
> >>...
> >>if (dev == OUR_NEW_DEVICE_TYPE) {
> >>memory_region_add_subregion(as, current_reserved_end, &dev->mr);
> >>set_new_reserved_end(current_reserved_end + 
> >> memory_region_size(&dev->mr));
> >>}
> >>   }
> >>
> >> we can practically add any number of new devices that way.  
> > 
> > Yes but we'll have to build a host side allocator for these, and that's
> > nasty. We'll also have to maintain these addresses indefinitely (at
> > least per machine version) as they are guest visible.
> > Not only that, there's no way for guest to know if we move things
> > around, so basically we'll never be able to change addresses.
> > 
> >   
> >>
> >>> I would like ./hw/acpi/bios-linker-loader.c interface to be extended to
> >>> support 64 bit RAM instead  
> 
> This looks quite doable in OVMF, as long as the blob to allocate from
> high memory contains *zero* ACPI tables.
> 
> (
> Namely, each ACPI table is installed from the containing fw_cfg blob
> with EFI_ACPI_TABLE_PROTOCOL.InstallAcpiTable(), and the latter has its
> own allocation policy for the *copies* of ACPI tables it installs.
> 
> This allocation policy is left unspecified in the section of the UEFI
> spec that governs EFI_ACPI_TABLE_PROTOCOL.
> 
> The current policy in edk2 (= the reference implementation) seems to be
> "allocate from under 4GB". It is currently being changed to "try to
> allocate from under 4GB, and if that fails, retry from high memory". (It
> is motivated by Aarch64 machines that may have no DRAM at all under 4GB.)
> )
> 
> >>> (and maybe a way to allocate and
> >>> zero-initialize buffer without loading it through fwcfg),  
> 
> Sounds reasonable.
> 
> >>> this way bios
> >>> does the allocation, and addresses can be patched into acpi.  
> >> and then guest side needs to parse/execute some AML that would
> >> initialize QEMU side so it would know where to write data.  
> > 
> > Well not really - we can put it in a data table, by itself
> > so it's easy to find.  
> 
> Do you mean acpi_tb_find_table(), acpi_get_table_by_index() /
> acpi_get_table_with_size()?
> 
> > 
> > AML is only needed if access from ACPI is desired.
> > 
> >   
> >> bios-linker-loader is a great interface for initializing some
> >> guest owned data and linking it together but I think it adds
> >> unnecessary complexity and is misused if it's used to handle
> >> device owned data/on device memory in this and VMGID cases.  
> > 
> > I want a generic interface for guest to enumerate these things.  linker
> > seems quite reasonable but if you see a reason why it won't do, or want
> > to propose a better interface, fine.  
> 
> * The guest could do the following:
> - while processing the ALLOCATE commands, it would make a note where in
> GPA space each fw_cfg blob gets allocated
> - at the end the guest would prepare a temporary array with a predefined
> record format, that associates each fw_cfg blob's name with the concrete
> allocation address
> - it would create an FWCfgDmaAccess structure pointing at this array,
> with a new "control" bit set (or something similar)
> - the guest could write the address of the FWCfgDmaAccess struct to the
> appropriate register, as always.
> 
> * Another idea would be a GET_ALLOCATION_ADD

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Laszlo Ersek

On 01/05/16 17:43, Michael S. Tsirkin wrote:
> On Tue, Jan 05, 2016 at 05:30:25PM +0100, Igor Mammedov wrote:
>>>> bios-linker-loader is a great interface for initializing some
>>>> guest owned data and linking it together but I think it adds
>>>> unnecessary complexity and is misused if it's used to handle
>>>> device owned data/on device memory in this and VMGID cases.
>>>
>>> I want a generic interface for guest to enumerate these things.  linker
>>> seems quite reasonable but if you see a reason why it won't do, or want
>>> to propose a better interface, fine.
>>>
>>> PCI would do, too - though windows guys had concerns about
>>> returning PCI BARs from ACPI.
>> There were potential issues with pSeries bootloader that treated
>> PCI_CLASS_MEMORY_RAM as conventional RAM but it was fixed.
>> Could you point out to discussion about windows issues?
>>
>> What VMGEN patches that used PCI for mapping purposes were
>> stuck at, was that it was suggested to use PCI_CLASS_MEMORY_RAM
>> class id but we couldn't agree on it.
>>
>> VMGEN v13 with full discussion is here
>> https://patchwork.ozlabs.org/patch/443554/
>> So to continue with this route we would need to pick some other
>> driver less class id so windows won't prompt for driver or
>> maybe supply our own driver stub to guarantee that no one
>> would touch it. Any suggestions?
> 
> Pick any device/vendor id pair for which windows specifies no driver.
> There's a small risk that this will conflict with some
> guest but I think it's minimal.
> 
> 
>>>
>>>
>>>> There was RFC on list to make BIOS boot from NVDIMM already
>>>> doing some ACPI table lookup/parsing. Now if they were forced
>>>> to also parse and execute AML to initialize QEMU with guest
>>>> allocated address that would complicate them quite a bit.
>>>
>>> If they just need to find a table by name, it won't be
>>> too bad, would it?
>> that's what they were doing scanning memory for static NVDIMM table.
>> However if it were DataTable, BIOS side would have to execute
>> AML so that the table address could be told to QEMU.
> 
> Not at all. You can find any table by its signature without
> parsing AML.
> 
> 
>> In case of direct mapping or PCI BAR there is no need to initialize
>> QEMU side from AML.
>> That also saves us IO port where this address should be written
>> if bios-linker-loader approach is used.
>>
>>>
>>>> While with NVDIMM control memory region mapped directly by QEMU,
>>>> respective patches don't need in any way to initialize QEMU,
>>>> all they would need just read necessary data from control region.
>>>>
>>>> Also using bios-linker-loader takes away some usable RAM
>>>> from guest and in the end that doesn't scale,
>>>> the more devices I add the less usable RAM is left for guest OS
>>>> while all the device needs is a piece of GPA address space
>>>> that would belong to it.
>>>
>>> I don't get this comment. I don't think it's MMIO that is wanted.
>>> If it's backed by qemu virtual memory then it's RAM.
>> Then why not allocate video card VRAM the same way, and try to explain to
>> the user that a guest started with '-m 128 -device cirrus-vga,vgamem_mb=64Mb'
>> only has 64Mb of available RAM because we think that on-device VRAM
>> is also RAM?
>>
>> Maybe I've used MMIO term wrongly here but it roughly reflects the idea
>> that on device memory (whether it's VRAM, NVDIMM control block or VMGEN
>> area) is not allocated from guest's usable RAM (as described in E820)
>> but rather directly mapped in guest's GPA and doesn't consume available
>> RAM as guest sees it. That's also the way it's done on real hardware.
>>
>> What we need in case of VMGEN ID and NVDIMM is on device memory
>> that could be directly accessed by guest.
>> Both direct mapping or PCI BAR do that job and we could use simple
>> static AML without any patching.
> 
> At least with VMGEN the issue is that there's an AML method
> that returns the physical address.
> Then if guest OS moves the BAR (which is legal), it will break
> since caller has no way to know it's related to the BAR.
> 
> 
>
> See patch at the bottom that might be handy.
>   
>> he also innovated a way to use 64-bit address in DSDT/SSDT.rev = 1:
>> | when writing ASL one shall make sure that only XP supported
>> | features are in global scope, which is evaluated when tables
>> | are loaded and features of rev2 and higher are inside methods.
>> | That way XP doesn't crash as far as it doesn't evaluate unsupported
>> | opcode and one can guard those opcodes checking _REV object if 
>> necessary.
>> (https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg01010.html) 
>>  
>
> Yes, this technique works.
>
> An alternative is to add an XSDT, XP ignores that.
> XSDT at the moment breaks OVMF (because it loads both
> the RSDT and the XSDT, which is wrong), but I think
> Laszlo was working on a fix for that.  
 Using XSDT would increase ACPI tables occupied RAM
 a

Re: [Qemu-devel] [PATCH 5/6] hw/arm/sysbus-fdt: helpers for clock node generation

2016-01-05 Thread Eric Auger

On 12/18/2015 03:54 PM, Peter Maydell wrote:
> On 17 December 2015 at 12:29, Eric Auger  wrote:
>> Some passthrough'ed devices depend on clock nodes. Those need to be
>> generated in the guest device tree. This patch introduces some helpers
>> to build a clock node from information retrieved in the host device tree.
>>
>> - inherit_properties copies properties from a host device tree node to
>>   a guest device tree node
>> - fdt_build_clock_node builds a guest clock node and checks the host
>>   fellow clock is a fixed one.
>>
>> fdt_build_clock_node will become static as soon as it gets used. A
>> dummy pre-declaration is needed for compilation of this patch.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> RFC -> v1:
>> - use the new proto of qemu_fdt_getprop
>> - remove newline in error_report
>> - fix some style issues
>> ---
>>  hw/arm/sysbus-fdt.c | 111 
>> 
>>  1 file changed, 111 insertions(+)
>>
>> diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
>> index 9d28797..c2d813b 100644
>> --- a/hw/arm/sysbus-fdt.c
>> +++ b/hw/arm/sysbus-fdt.c
>> @@ -21,6 +21,7 @@
>>   *
>>   */
>>
>> +#include 
>>  #include "hw/arm/sysbus-fdt.h"
>>  #include "qemu/error-report.h"
>>  #include "sysemu/device_tree.h"
>> @@ -56,6 +57,116 @@ typedef struct NodeCreationPair {
>>  int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
>>  } NodeCreationPair;
>>
>> +/* helpers */
>> +
>> +struct HostProperty {
>> +const char *name;
>> +bool optional;
>> +};
>> +
>> +typedef struct HostProperty HostProperty;
> 
> You can combine the typedef and the struct declaration if you like.

ok
> 
>> +
>> +/**
>> + * inherit_properties
>> + *
>> + * copies properties listed in an array from host device tree to
>> + * guest device tree. If a non optional property is not found, the
>> + * function self-asserts. An optional property is ignored if not found
>> + * in the host device tree.
>> + * @props: array of HostProperty to copy
>> + * @nb_props: number of properties in the array
>> + * @host_dt: host device tree blob
>> + * @guest_dt: guest device tree blob
>> + * @node_path: host dt node path where the property is supposed to be
>> +  found
>> + * @nodename: guest node name the properties should be added to
>> + */
>> +static void inherit_properties(HostProperty *props, int nb_props,
>> +   void *host_fdt, void *guest_fdt,
>> +   char *node_path, char *nodename)
>> +{
>> +int i, prop_len;
>> +const void *r;
>> +
>> +for (i = 0; i < nb_props; i++) {
>> +r = qemu_fdt_getprop(host_fdt, node_path,
>> + props[i].name,
>> + &prop_len,
>> + props[i].optional ? NULL : &error_fatal);
> 
> We'll get an error here if the host device tree doesn't match
> up correctly, right? Is the error message going to be sufficiently
> informative to the end user about what's gone wrong? (What does
> it end up looking like?)

Hm, you're right. In the case of a mandatory property there is an
auto-assert with a good error message.
In the case of an optional property, no message is currently output to
the end user and the host property is ignored (not propagated to the
guest), with potential functional consequences. I intend to use an error
object instead and print the error message if the error is anything other
than FDT_ERR_NOTFOUND.
> 
>> +if (r) {
>> +qemu_fdt_setprop(guest_fdt, nodename,
>> + props[i].name, r, prop_len);
>> +}
>> +}
>> +}
>> +
>> +/* clock properties whose values are copied/pasted from host */
>> +static HostProperty clock_inherited_properties[] = {
>> +{"compatible", 0},
>> +{"#clock-cells", 0},
>> +{"clock-frequency", 1},
>> +{"clock-output-names", 1},
>> +};
>> +
>> +/**
>> + * fdt_build_clock_node
>> + *
>> + * Build a guest clock node, used as a dependency from a passthrough'ed
>> + * device. Most information are retrieved from the host clock node.
>> + * Also check the host clock is a fixed one.
>> + *
>> + * @host_fdt: host device tree blob from which info are retrieved
>> + * @guest_fdt: guest device tree blob where the clock node is added
>> + * @host_phandle: phandle of the clock in host device tree
>> + * @guest_phandle: phandle to assign to the guest node
>> + */
>> +int fdt_build_clock_node(void *host_fdt, void *guest_fdt,
>> + uint32_t host_phandle,
>> + uint32_t guest_phandle);
>> +int fdt_build_clock_node(void *host_fdt, void *guest_fdt,
>> + uint32_t host_phandle,
>> + uint32_t guest_phandle)
>> +{
>> +char node_path[256];
> 
> Please don't use hardcoded fixed buffer lengths (see previous patch
> review comments).
OK
> 
>> +char *nodename;
>> +const void *r;
>> +int ret, prop_len;
>> +
>> +ret = fdt_node_offset_by_p

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 05:30:25PM +0100, Igor Mammedov wrote:
> > > bios-linker-loader is a great interface for initializing some
> > > guest owned data and linking it together but I think it adds
> > > unnecessary complexity and is misused if it's used to handle
> > > device owned data/on device memory in this and VMGID cases.  
> > 
> > I want a generic interface for guest to enumerate these things.  linker
> > seems quite reasonable but if you see a reason why it won't do, or want
> > to propose a better interface, fine.
> > 
> > PCI would do, too - though windows guys had concerns about
> > returning PCI BARs from ACPI.
> There were potential issues with pSeries bootloader that treated
> PCI_CLASS_MEMORY_RAM as conventional RAM but it was fixed.
> Could you point out to discussion about windows issues?
> 
> What VMGEN patches that used PCI for mapping purposes were
> stuck at, was that it was suggested to use PCI_CLASS_MEMORY_RAM
> class id but we couldn't agree on it.
> 
> VMGEN v13 with full discussion is here
> https://patchwork.ozlabs.org/patch/443554/
> So to continue with this route we would need to pick some other
> driver less class id so windows won't prompt for driver or
> maybe supply our own driver stub to guarantee that no one
> would touch it. Any suggestions?

Pick any device/vendor id pair for which windows specifies no driver.
There's a small risk that this will conflict with some
guest but I think it's minimal.


> > 
> > 
> > > There was RFC on list to make BIOS boot from NVDIMM already
> > > doing some ACPI table lookup/parsing. Now if they were forced
> > > to also parse and execute AML to initialize QEMU with guest
> > > allocated address that would complicate them quite a bit.  
> > 
> > If they just need to find a table by name, it won't be
> > too bad, would it?
> that's what they were doing scanning memory for static NVDIMM table.
> However if it were DataTable, BIOS side would have to execute
> AML so that the table address could be told to QEMU.

Not at all. You can find any table by its signature without
parsing AML.
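
For illustration, a minimal sketch of locating a table by signature through
the RSDT, with no AML involved. It assumes a flat byte buffer standing in for
guest memory and offsets standing in for physical addresses; acpi_find_table
is a made-up name, and checksum validation is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HEADER_SIZE 36u  /* size of the standard ACPI table header */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk the RSDT located at rsdt_off inside mem and return the offset of the
 * first table whose 4-byte signature equals sig, or 0 if none matches.
 */
static uint32_t acpi_find_table(const uint8_t *mem, size_t mem_size,
                                uint32_t rsdt_off, const char sig[4])
{
    uint32_t rsdt_len, i;

    assert(mem != NULL && sig != NULL);
    if (rsdt_off > mem_size || mem_size - rsdt_off < ACPI_HEADER_SIZE) {
        return 0;
    }
    rsdt_len = read_le32(mem + rsdt_off + 4);   /* header Length field */
    if (rsdt_len < ACPI_HEADER_SIZE || rsdt_len > mem_size - rsdt_off) {
        return 0;
    }
    /* RSDT entries are 32-bit table addresses following the header. */
    for (i = ACPI_HEADER_SIZE; i + 4 <= rsdt_len; i += 4) {
        uint32_t tbl = read_le32(mem + rsdt_off + i);

        if (tbl <= mem_size && mem_size - tbl >= 4 &&
            memcmp(mem + tbl, sig, 4) == 0) {
            return tbl;
        }
    }
    return 0;
}
```

A caller would pass the signature it is after, for example
acpi_find_table(mem, size, rsdt_off, "NFIT"), and read the table body from
the returned offset.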


> In case of direct mapping or PCI BAR there is no need to initialize
> QEMU side from AML.
> That also saves us IO port where this address should be written
> if bios-linker-loader approach is used.
> 
> > 
> > > While with NVDIMM control memory region mapped directly by QEMU,
> > > respective patches don't need in any way to initialize QEMU,
> > > all they would need just read necessary data from control region.
> > > 
> > > Also using bios-linker-loader takes away some usable RAM
> > > from guest and in the end that doesn't scale,
> > > the more devices I add the less usable RAM is left for guest OS
> > > while all the device needs is a piece of GPA address space
> > > that would belong to it.  
> > 
> > I don't get this comment. I don't think it's MMIO that is wanted.
> > If it's backed by qemu virtual memory then it's RAM.
> Then why don't allocate video card VRAM the same way and try to explain
> user that a guest started with '-m 128 -device cirrus-vga,vgamem_mb=64Mb'
> only has 64Mb of available RAM because of we think that on device VRAM
> is also RAM.
> 
> Maybe I've used MMIO term wrongly here but it roughly reflects the idea
> that on device memory (whether it's VRAM, NVDIMM control block or VMGEN
> area) is not allocated from guest's usable RAM (as described in E820)
> but rather directly mapped in guest's GPA and doesn't consume available
> RAM as guest sees it. That's also the way it's done on real hardware.
> 
> What we need in case of VMGEN ID and NVDIMM is on device memory
> that could be directly accessed by guest.
> Both direct mapping or PCI BAR do that job and we could use simple
> static AML without any patching.

At least with VMGEN the issue is that there's an AML method
that returns the physical address.
Then if guest OS moves the BAR (which is legal), it will break
since caller has no way to know it's related to the BAR.


> > > > 
> > > > See patch at the bottom that might be handy.
> > > >   
> > > > > he also innovated a way to use 64-bit address in DSDT/SSDT.rev = 1:
> > > > > | when writing ASL one shall make sure that only XP supported
> > > > > | features are in global scope, which is evaluated when tables
> > > > > | are loaded and features of rev2 and higher are inside methods.
> > > > > | That way XP doesn't crash as far as it doesn't evaluate unsupported
> > > > > | opcode and one can guard those opcodes checking _REV object if 
> > > > > necessary.
> > > > > (https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg01010.html)
> > > > >   
> > > > 
> > > > Yes, this technique works.
> > > > 
> > > > An alternative is to add an XSDT, XP ignores that.
> > > > XSDT at the moment breaks OVMF (because it loads both
> > > > the RSDT and the XSDT, which is wrong), but I think
> > > > Laszlo was working on a fix for that.  
> > > Using XSDT would increase ACPI tables occupied RAM
> > > as it would duplicate DSDT + non XP sup

Re: [Qemu-devel] [RFC v6 03/14] Add CPUClass hook to set exclusive range

2016-01-05 Thread Alex Bennée


Alvise Rigo  writes:

> Allow each architecture to set the exclusive range at any LoadLink
> operation through a CPUClass hook.

nit: space or continue paragraph.

> This comes in handy to emulate, for instance, the exclusive monitor
> implemented in some ARM architectures (more precisely, the Exclusive
> Reservation Granule).
>
> Suggested-by: Jani Kokkonen 
> Suggested-by: Claudio Fontana 
> Signed-off-by: Alvise Rigo 

Reviewed-by: Alex Bennée 

> ---
>  include/qom/cpu.h | 4 
>  qom/cpu.c | 7 +++
>  2 files changed, 11 insertions(+)
>
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index c6bb6b6..9e409ce 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -175,6 +175,10 @@ typedef struct CPUClass {
>  void (*cpu_exec_exit)(CPUState *cpu);
>  bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
>
> +/* Atomic instruction handling */
> +void (*cpu_set_excl_protected_range)(CPUState *cpu, hwaddr addr,
> + hwaddr size);
> +
>  void (*disas_set_info)(CPUState *cpu, disassemble_info *info);
>  } CPUClass;
>
> diff --git a/qom/cpu.c b/qom/cpu.c
> index fb80d13..a5c25a8 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -203,6 +203,12 @@ static bool cpu_common_exec_interrupt(CPUState *cpu, int 
> int_req)
>  return false;
>  }
>
> +static void cpu_common_set_excl_range(CPUState *cpu, hwaddr addr, hwaddr 
> size)
> +{
> +cpu->excl_protected_range.begin = addr;
> +cpu->excl_protected_range.end = addr + size;
> +}
> +
>  void cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
>  int flags)
>  {
> @@ -355,6 +361,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
>  k->cpu_exec_enter = cpu_common_noop;
>  k->cpu_exec_exit = cpu_common_noop;
>  k->cpu_exec_interrupt = cpu_common_exec_interrupt;
> +k->cpu_set_excl_protected_range = cpu_common_set_excl_range;
>  dc->realize = cpu_common_realizefn;
>  /*
>   * Reason: CPUs still need special care by board code: wiring up


--
Alex Bennée
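
For readers following along, a reduced sketch of the hook pattern in this
patch: a default implementation installed in the class, and an architecture
override that widens the range to a reservation granule. Cpu, CpuClass,
GRANULE and granule_set_excl_range are simplified stand-ins invented for the
example; only the default mirrors cpu_common_set_excl_range() from the diff:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr;            /* stand-in for QEMU's hwaddr type */

typedef struct Cpu Cpu;

/* Reduced stand-in for CPUClass: only the hook added by the patch. */
typedef struct CpuClass {
    void (*set_excl_protected_range)(Cpu *cpu, hwaddr addr, hwaddr size);
} CpuClass;

/* Reduced stand-in for CPUState. */
struct Cpu {
    const CpuClass *cc;
    hwaddr excl_begin;
    hwaddr excl_end;                /* exclusive upper bound */
};

/* Default behaviour, as in cpu_common_set_excl_range(). */
static void common_set_excl_range(Cpu *cpu, hwaddr addr, hwaddr size)
{
    cpu->excl_begin = addr;
    cpu->excl_end = addr + size;
}

#define GRANULE 64u                 /* hypothetical granule size in bytes */

/* Hypothetical override: widen the range to whole granules. */
static void granule_set_excl_range(Cpu *cpu, hwaddr addr, hwaddr size)
{
    const hwaddr mask = (hwaddr)GRANULE - 1;

    cpu->excl_begin = addr & ~mask;
    cpu->excl_end = (addr + size + mask) & ~mask;
}

/* What a LoadLink helper would do: dispatch through the class hook. */
static void load_link(Cpu *cpu, hwaddr addr, hwaddr size)
{
    assert(cpu->cc && cpu->cc->set_excl_protected_range);
    cpu->cc->set_excl_protected_range(cpu, addr, size);
}
```

With the default hook a 4-byte LoadLink at 0x1004 protects exactly those 4
bytes; with the granule override the same access protects 0x1000 up to (but
not including) 0x1040.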

Re: [Qemu-devel] How to reserve guest physical region for ACPI

2016-01-05 Thread Igor Mammedov

On Wed, 30 Dec 2015 21:52:32 +0200
"Michael S. Tsirkin"  wrote:

> On Wed, Dec 30, 2015 at 04:55:54PM +0100, Igor Mammedov wrote:
> > On Mon, 28 Dec 2015 14:50:15 +0200
> > "Michael S. Tsirkin"  wrote:
> >   
> > > On Mon, Dec 28, 2015 at 10:39:04AM +0800, Xiao Guangrong wrote:  
> > > > 
> > > > Hi Michael, Paolo,
> > > > 
> > > > Now it is the time to return to the challenge that how to reserve guest
> > > > physical region internally used by ACPI.
> > > > 
> > > > Igor suggested that:
> > > > | An alternative place to allocate reserve from could be high memory.
> > > > | For pc we have "reserved-memory-end" which currently makes sure
> > > > | that hotpluggable memory range isn't used by firmware
> > > > (https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00926.html)
> > > >   
> > > 
> > > I don't want to tie things to reserved-memory-end because this
> > > does not scale: next time we need to reserve memory,
> > > we'll need to find yet another way to figure out what is where.  
> > Could you elaborate a bit more on a problem you're seeing?
> > 
> > To me it looks like it scales rather well.
> > For example let's imagine that we are adding a device
> > that has some on device memory that should be mapped into GPA
> > code to do so would look like:
> > 
> >   pc_machine_device_plug_cb(dev)
> >   {
> >...
> >if (dev == OUR_NEW_DEVICE_TYPE) {
> >memory_region_add_subregion(as, current_reserved_end, &dev->mr);
> >set_new_reserved_end(current_reserved_end + 
> > memory_region_size(&dev->mr));
> >}
> >   }
> > 
> > we can practically add any number of new devices that way.  
> 
> Yes but we'll have to build a host side allocator for these, and that's
> nasty. We'll also have to maintain these addresses indefinitely (at
> least per machine version) as they are guest visible.
> Not only that, there's no way for guest to know if we move things
> around, so basically we'll never be able to change addresses.
The simplistic GPA allocator in the snippet above does the job.

If one unconditionally adds a device in a new version then yes,
the code has to have compat code based on machine version.
But that applies to any device that has state to migrate
or to any address space layout change.

However, a device that directly maps addresses doesn't have to
have a fixed address; it could behave the same way as a
PCI device with BARs, with the only difference being that its
MemoryRegions are mapped before the guest is running vs
BARs mapped by BIOS.
It could be worthwhile to create a generic base device class
that would do the above. Then it could be inherited from and
extended by concrete device implementations.
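
A rough sketch of the kind of host-side bump allocator the quoted snippet
implies, assuming a single reserved window and power-of-two alignment.
ReservedGpa and reserved_gpa_alloc are illustrative names, not existing QEMU
APIs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the reserved GPA window; not a QEMU type. */
typedef struct ReservedGpa {
    uint64_t next;   /* first free guest physical address */
    uint64_t limit;  /* one past the last usable address */
} ReservedGpa;

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Carve size bytes out of the window. On success store the base address
 * in *base and advance the window; on failure leave the window untouched.
 */
static bool reserved_gpa_alloc(ReservedGpa *r, uint64_t size, uint64_t align,
                               uint64_t *base)
{
    uint64_t start = align_up(r->next, align);

    /* start < r->next catches wrap-around in align_up(). */
    if (start < r->next || start > r->limit || size > r->limit - start) {
        return false;
    }
    *base = start;
    r->next = start + size;
    return true;
}
```

The returned base plays the role of current_reserved_end in the snippet, and
the advanced window plays the role of set_new_reserved_end().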

> >
> > > I would like ./hw/acpi/bios-linker-loader.c interface to be extended to
> > > support 64 bit RAM instead (and maybe a way to allocate and
> > > zero-initialize buffer without loading it through fwcfg), this way bios
> > > does the allocation, and addresses can be patched into acpi.  
> > and then guest side needs to parse/execute some AML that would
> > initialize QEMU side so it would know where to write data.  
> 
> Well not really - we can put it in a data table, by itself
> so it's easy to find.
> 
> AML is only needed if access from ACPI is desired.
In both cases (VMGEN, NVDIMM) access from ACPI is required,
as a minimum to write the address back to QEMU and, for NVDIMM,
to pass _DSM method data between guest and QEMU.

> 
> 
> > bios-linker-loader is a great interface for initializing some
> > guest owned data and linking it together but I think it adds
> > unnecessary complexity and is misused if it's used to handle
> > device owned data/on device memory in this and VMGID cases.  
> 
> I want a generic interface for guest to enumerate these things.  linker
> seems quite reasonable but if you see a reason why it won't do, or want
> to propose a better interface, fine.
> 
> PCI would do, too - though windows guys had concerns about
> returning PCI BARs from ACPI.
There were potential issues with the pSeries bootloader, which treated
PCI_CLASS_MEMORY_RAM as conventional RAM, but that was fixed.
Could you point me to the discussion about the Windows issues?

The VMGEN patches that used PCI for mapping purposes got stuck
because it was suggested to use the PCI_CLASS_MEMORY_RAM
class id and we couldn't agree on it.

VMGEN v13 with the full discussion is here:
https://patchwork.ozlabs.org/patch/443554/
So to continue down this route we would need to pick some other
driverless class id so that Windows won't prompt for a driver, or
maybe supply our own driver stub to guarantee that no one
would touch it. Any suggestions?

> 
> 
> > There was RFC on list to make BIOS boot from NVDIMM already
> > doing some ACPI table lookup/parsing. Now if they were forced
> > to also parse and execute AML to initialize QEMU with guest
> > allocated address that would complicate them quite a bit.  
> 
> If they just need to find a table by name, it won't be
> too bad, would it?
that's what they were doing scanning memory for static NVDIMM table.
However i

Re: [Qemu-devel] [PATCH v2] trace-events: fix broken format strings

2016-01-05 Thread Greg Kurz

Cc'ing Peter because we'd like this patch to go directly to the master branch.

On Tue,  5 Jan 2016 16:37:35 +0100
Andrew Jones  wrote:

> Fixes compiling with --enable-trace-backends
> 
> Signed-off-by: Andrew Jones 
> ---
> v2: also remove trailing null strings [Laurent]
> 
> 

Reviewed-by: Greg Kurz 

>  trace-events | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/trace-events b/trace-events
> index 6f036384a84f8..98ec748270a39 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1799,15 +1799,15 @@ qcrypto_tls_session_new(void *session, void *creds, 
> const char *hostname, const
>  vhost_user_event(const char *chr, int event) "chr: %s got event: %d"
> 
>  # linux-user/signal.c
> -user_setup_frame(void *env, uint64_t frame_addr) "env=%p frame_addr="PRIx64""
> -user_setup_rt_frame(void *env, uint64_t frame_addr) "env=%p 
> frame_addr="PRIx64""
> -user_do_rt_sigreturn(void *env, uint64_t frame_addr) "env=%p 
> frame_addr="PRIx64""
> -user_do_sigreturn(void *env, uint64_t frame_addr) "env=%p 
> frame_addr="PRIx64""
> +user_setup_frame(void *env, uint64_t frame_addr) "env=%p 
> frame_addr=0x%"PRIx64
> +user_setup_rt_frame(void *env, uint64_t frame_addr) "env=%p 
> frame_addr=0x%"PRIx64
> +user_do_rt_sigreturn(void *env, uint64_t frame_addr) "env=%p 
> frame_addr=0x%"PRIx64
> +user_do_sigreturn(void *env, uint64_t frame_addr) "env=%p 
> frame_addr=0x%"PRIx64
>  user_force_sig(void *env, int target_sig, int host_sig) "env=%p signal %d 
> (host %d)"
>  user_handle_signal(void *env, int target_sig) "env=%p signal %d"
>  user_host_signal(void *env, int host_sig, int target_sig) "env=%p signal %d 
> (target %d("
>  user_queue_signal(void *env, int target_sig) "env=%p signal %d"
> -user_s390x_restore_sigregs(void *env, uint64_t sc_psw_addr, uint64_t 
> env_psw_addr) "env=%p frame psw.addr "PRIx64 " current psw.addr "PRIx64""
> +user_s390x_restore_sigregs(void *env, uint64_t sc_psw_addr, uint64_t 
> env_psw_addr) "env=%p frame psw.addr 0x%"PRIx64 " current psw.addr 0x%"PRIx64
> 
>  # io/task.c
>  qio_task_new(void *task, void *source, void *func, void *opaque) "Task new 
> task=%p source=%p func=%p opaque=%p"
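
A small standalone illustration of what the patch changes. The old strings
concatenate PRIx64 without a leading '%', so the resulting format has no
conversion for the uint64_t argument; the fixed strings use "0x%" PRIx64.
The macro and function names below are made up for the example:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Old style: expands to something like "frame_addr=lx", with no '%'. */
#define BROKEN_FMT "frame_addr=" PRIx64 ""
/* Fixed style: expands to something like "frame_addr=0x%lx". */
#define FIXED_FMT  "frame_addr=0x%" PRIx64

/* Return true if fmt contains at least one conversion specification. */
static bool has_conversion(const char *fmt)
{
    return strchr(fmt, '%') != NULL;
}

/* Format an address the way the fixed trace-events lines do. */
static int format_frame_addr(char *buf, size_t len, uint64_t frame_addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, FIXED_FMT, frame_addr);
}
```

The exact expansion of PRIx64 depends on the platform, which is why the
comments only say "something like".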

Re: [Qemu-devel] [PATCH 3/6] device_tree: introduce qemu_fdt_node_path

2016-01-05 Thread Eric Auger

Hi Peter,
On 12/18/2015 03:23 PM, Peter Maydell wrote:
> On 17 December 2015 at 12:29, Eric Auger  wrote:
>> This new helper routine returns the node path of a device
>> referred to by its node name and compat string.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> RFC -> v1:
>> - improve error handling according to Alex' comments
>> ---
>>  device_tree.c| 45 
>> 
>>  include/sysemu/device_tree.h |  3 +++
>>  2 files changed, 48 insertions(+)
>>
>> diff --git a/device_tree.c b/device_tree.c
>> index e556a99..b5d7e0b 100644
>> --- a/device_tree.c
>> +++ b/device_tree.c
>> @@ -233,6 +233,51 @@ static int findnode_nofail(void *fdt, const char 
>> *node_path)
>>  return offset;
>>  }
>>
>> +/**
>> + * qemu_fdt_node_path
>> + *
>> + * return the node path of a device, given its node name and its
>> + * compat string
>> + * fdt: pointer to the dt blob
>> + * name: device node name
>> + * compat: compatibility string of the device
>> + *
>> + * upon success, the path is output at node_path address
>> + * returns 0 on success, < 0 on failure
>> + */
> 
> Can we put the doc comment in the header file, since this is
> a globally visible function? Also it would be nice to follow the
> doc-comment syntax standards about marking up arguments with '@'
> and so on.
sure
> 
>> +int qemu_fdt_node_path(void *fdt, const char *name, char *compat,
>> +   char **node_path)
>> +{
>> +int offset, len, ret;
>> +const char *iter_name;
>> +char path[256];
> 
> Rather than a fixed buffer size, we should check whether
> fdt_get_path returns -FDT_ERR_NOSPACE and if so enlarge the
> buffer and try again.
OK
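
One possible shape for the retry loop, as a sketch only. mock_get_path stands
in for fdt_get_path() so the example is self-contained; it reports
-ERR_NOSPACE when the buffer is too small, which is the behaviour the review
comment relies on. get_path_alloc, the starting size and the 1 MiB cap are
arbitrary choices made for the example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ERR_NOSPACE 3  /* same numeric value as libfdt's FDT_ERR_NOSPACE */

/* Mock of fdt_get_path(): copy a fixed path, or fail if buf is too small. */
static int mock_get_path(const char *full_path, char *buf, int buflen)
{
    size_t need = strlen(full_path) + 1;

    if (buflen < 0 || (size_t)buflen < need) {
        return -ERR_NOSPACE;
    }
    memcpy(buf, full_path, need);
    return 0;
}

/*
 * Start small and double the buffer until the path fits. Returns a
 * heap-allocated string the caller frees, or NULL on any other error.
 */
static char *get_path_alloc(const char *full_path)
{
    int len = 16;
    char *buf = NULL;

    assert(full_path != NULL);
    for (;;) {
        char *tmp = realloc(buf, (size_t)len);
        int ret;

        if (!tmp) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        ret = mock_get_path(full_path, buf, len);
        if (ret == 0) {
            return buf;
        }
        if (ret != -ERR_NOSPACE || len > (1 << 20)) {
            free(buf);
            return NULL;
        }
        len *= 2;
    }
}
```

In QEMU itself the glib allocators would be the natural choice instead of
realloc()/free().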
> 
>> +
>> +*node_path = NULL;
>> +offset = fdt_node_offset_by_compatible(fdt, -1, compat);
>> +while (offset != -FDT_ERR_NOTFOUND) {
>> +if (offset < 0) {
>> +continue;
> 
> I don't understand this continue -- if the fdt function returned any
> error other than -FDT_ERR_NOTFOUND then this will cause us to go
> into an infinite loop around this while(). Did you mean 'break' ?
> (Though if you just want to break then fixing the while condition
> would be better.)
My first understanding of the API was that fdt_node_offset_by_compatible
would advance the offset even if an error occurred, so I intended to
continue parsing the tree, looking for another node with the same features.
But I think that is overkill anyway and I will abort instead.
> 
>> +}
>> +iter_name = fdt_get_name(fdt, offset, &len);
>> +if (!iter_name) {
>> +continue;
> 
> This also seems like it ought to be a break, except you need to
> set offset to the error code first (which fdt_get_name() will
> have returned in 'len').
yes will set offset to len and then break;
> 
>> +}
>> +
>> +if (!strncmp(iter_name, name, len)) {
> 
> Do we really want strncmp rather than strcmp here ? (ie
> "find first node whose name has 'name' as a prefix" rather
> than "find first node whose name matches 'name').
true strcmp is OK
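
To make the difference concrete, a tiny sketch with made-up node names. In
the patch, len is the length of iter_name as returned by fdt_get_name(), so
strncmp() only compares that many characters:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Comparison as written in the patch: bounded by the length of iter_name. */
static bool match_strncmp(const char *iter_name, const char *name)
{
    assert(iter_name != NULL && name != NULL);
    return strncmp(iter_name, name, strlen(iter_name)) == 0;
}

/* Exact comparison, as agreed in the review. */
static bool match_strcmp(const char *iter_name, const char *name)
{
    assert(iter_name != NULL && name != NULL);
    return strcmp(iter_name, name) == 0;
}
```

With iter_name "clock" and name "clock@0" the bounded comparison reports a
match although the names differ; strcmp() does not.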

Thanks

Eric
> 
>> +goto found;
>> +}
>> +offset = fdt_node_offset_by_compatible(fdt, offset, compat);
>> +}
>> +return offset;
>> +
>> +found:
>> +ret = fdt_get_path(fdt, offset, path, 256);
>> +if (!ret) {
>> +*node_path = g_strdup(path);
>> +}
>> +return ret;
>> +}
>> +
>>  int qemu_fdt_setprop(void *fdt, const char *node_path,
>>   const char *property, const void *val, int size)
>>  {
>> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
>> index 307e53d..f9e6e6e 100644
>> --- a/include/sysemu/device_tree.h
>> +++ b/include/sysemu/device_tree.h
>> @@ -18,6 +18,9 @@ void *create_device_tree(int *sizep);
>>  void *load_device_tree(const char *filename_path, int *sizep);
>>  void *load_device_tree_from_sysfs(void);
>>
>> +int qemu_fdt_node_path(void *fdt, const char *name, char *compat,
>> +   char **node_path);
>> +
>>  int qemu_fdt_setprop(void *fdt, const char *node_path,
>>   const char *property, const void *val, int size);
>>  int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
>> --
>> 1.9.1
> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH 4/6] device_tree: qemu_fdt_getprop converted to use the error API

2016-01-05 Thread Eric Auger

Hi Peter,
On 12/18/2015 03:36 PM, Peter Maydell wrote:
> On 17 December 2015 at 12:29, Eric Auger  wrote:
>> Current qemu_fdt_getprop exits if the property is not found. It is
>> sometimes needed to read an optional property, in which case we do
>> not wish to exit but simply returns a null value.
>>
>> This patch converts qemu_fdt_getprop to accept an Error **, and existing
>> users are converted to pass &error_fatal. This preserves the existing
>> behaviour. Then to use the API with your optional semantic a null
>> parameter can be conveyed.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> RFC -> v1:
>> - get rid of qemu_fdt_getprop_optional and implement Peter's suggestion
>>   that consists in using the error API
> 
> This doesn't seem to me like a great way for qemu_fdt_getprop to
> report "property not found", because there's no way for the caller
> to distinguish "property not found" from "function went wrong
> some other way" (since Errors just report human readable strings,
> and in any case you're not distinguishing -FDT_ERR_NOTFOUND
> from any of the other FDT errors).
I'm not sure I get what you mean here. If fdt_getprop fails, then as long as
the caller provided a non-NULL lenp, *lenp contains the error code, so
qemu_fdt_getprop's caller can distinguish -FDT_ERR_NOTFOUND from any
other error. Am I missing something?
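
A sketch of the caller-side check described here. mock_getprop imitates the
fdt_getprop() convention (NULL return, negative error code in *lenp);
MockProp, PropStatus and classify_prop are names invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ERR_NOTFOUND  1   /* same value as libfdt's FDT_ERR_NOTFOUND */
#define ERR_BADOFFSET 4   /* same value as libfdt's FDT_ERR_BADOFFSET */

typedef struct MockProp {
    const char *name;
    const char *value;
} MockProp;

/* Look a property up; on failure return NULL with the error in *lenp. */
static const void *mock_getprop(const MockProp *props, int nb_props,
                                const char *name, int *lenp)
{
    int i;

    assert(lenp != NULL);
    for (i = 0; i < nb_props; i++) {
        if (strcmp(props[i].name, name) == 0) {
            *lenp = (int)strlen(props[i].value) + 1;
            return props[i].value;
        }
    }
    *lenp = -ERR_NOTFOUND;
    return NULL;
}

typedef enum PropStatus { PROP_OK, PROP_ABSENT, PROP_ERROR } PropStatus;

/* Separate "property does not exist" from every other failure. */
static PropStatus classify_prop(const void *r, int len)
{
    if (r) {
        return PROP_OK;
    }
    return len == -ERR_NOTFOUND ? PROP_ABSENT : PROP_ERROR;
}
```

An optional property would then be skipped silently on PROP_ABSENT and
reported on PROP_ERROR.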
> 
> If we want to handle "ok if property doesn't exist" then we
> could either (a) make the function return NULL on doesn't-exist
> and error_report in the other error cases, with the existing
> single caller extending its error checking appropriately, or
> (b) have the caller that cares about property-may-not-exist
> call fdt_getprop() directly.
> 
>> Signed-off-by: Eric Auger 
>> ---
>>  device_tree.c| 11 ++-
>>  include/sysemu/device_tree.h |  3 ++-
>>  2 files changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/device_tree.c b/device_tree.c
>> index b5d7e0b..c3720c2 100644
>> --- a/device_tree.c
>> +++ b/device_tree.c
>> @@ -331,18 +331,18 @@ int qemu_fdt_setprop_string(void *fdt, const char 
>> *node_path,
>>  }
>>
>>  const void *qemu_fdt_getprop(void *fdt, const char *node_path,
>> - const char *property, int *lenp)
>> + const char *property, int *lenp, Error **errp)
>>  {
>>  int len;
>>  const void *r;
>> +
>>  if (!lenp) {
>>  lenp = &len;
>>  }
>>  r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
>>  if (!r) {
>> -error_report("%s: Couldn't get %s/%s: %s", __func__,
>> - node_path, property, fdt_strerror(*lenp));
>> -exit(1);
>> +error_setg(errp, "%s: Couldn't get %s/%s: %s", __func__,
>> +  node_path, property, fdt_strerror(*lenp));
>>  }
>>  return r;
>>  }
>> @@ -351,7 +351,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
>> *node_path,
>> const char *property)
>>  {
>>  int len;
>> -const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
>> +const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len,
>> + &error_fatal);
>>  if (len != 4) {
>>  error_report("%s: %s/%s not 4 bytes long (not a cell?)",
>>   __func__, node_path, property);
>> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
>> index f9e6e6e..284fd3b 100644
>> --- a/include/sysemu/device_tree.h
>> +++ b/include/sysemu/device_tree.h
>> @@ -33,7 +33,8 @@ int qemu_fdt_setprop_phandle(void *fdt, const char 
>> *node_path,
>>   const char *property,
>>   const char *target_node_path);
>>  const void *qemu_fdt_getprop(void *fdt, const char *node_path,
>> - const char *property, int *lenp);
>> + const char *property, int *lenp,
>> + Error **errp);
> 
> If we change the function it would be nice to add a brief
> doc comment while we're touching the prototype in the header.
sure

Thanks

Eric
> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Alexander Duyck

On Tue, Jan 5, 2016 at 1:40 AM, Michael S. Tsirkin  wrote:
> On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
>> >> The two mechanisms referenced above would likely require coordination with
>> >> QEMU and as such are open to discussion.  I haven't attempted to address
>> >> them as I am not sure there is a consensus as of yet.  My personal
>> >> preference would be to add a vendor-specific configuration block to the
>> >> emulated pci-bridge interfaces created by QEMU that would allow us to
>> >> essentially extend shpc to support guest live migration with pass-through
>> >> devices.
>> >
>> > shpc?
>>
>> That is kind of what I was thinking.  We basically need some mechanism
>> to allow for the host to ask the device to quiesce.  It has been
>> proposed to possibly even look at something like an ACPI interface
>> since I know ACPI is used by QEMU to manage hot-plug in the standard
>> case.
>>
>> - Alex
>
>
> Start by using hot-unplug for this!
>
> Really use your patch guest side, and write host side
> to allow starting migration with the device, but
> defer completing it.

Yeah, I'm fully on board with this idea, though I'm not really working
on this right now since last I knew the folks on this thread from
Intel were working on it.  My patches were mostly meant to be a nudge
in this direction so that we could get away from the driver specific
code.

> So
>
> 1.- host tells guest to start tracking memory writes
> 2.- guest acks
> 3.- migration starts
> 4.- most memory is migrated
> 5.- host tells guest to eject device
> 6.- guest acks
> 7.- stop vm and migrate rest of state
>

Sounds about right.  The only way this differs from what I see as the
final solution for this is that instead of fully ejecting the device
in step 5 the driver would instead pause the device and give the host
something like 10 seconds to stop the VM and resume with the same
device connected if it is available.  We would probably also need to
look at a solution that would force the device to be ejected or abort
prior to starting the migration if it doesn't give us the ack in step
2.
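
The sequence can be written down as a small state machine, sketched here
under the assumption that a missing ack aborts the migration in both waiting
states (only the step 2 case is discussed above). All names are illustrative:

```c
#include <assert.h>
#include <stddef.h>

typedef enum MigState {
    MIG_IDLE,
    MIG_WAIT_TRACK_ACK,   /* step 1 sent, waiting for step 2 */
    MIG_COPYING,          /* steps 3-4: migration running */
    MIG_WAIT_EJECT_ACK,   /* step 5 sent, waiting for step 6 */
    MIG_STOP_AND_FINISH,  /* step 7: stop VM, migrate the rest */
    MIG_ABORTED
} MigState;

typedef enum MigEvent {
    EV_HOST_NEXT,         /* host initiates the next step */
    EV_GUEST_ACK,
    EV_GUEST_TIMEOUT
} MigEvent;

/* Single transition; events that do not apply leave the state unchanged. */
static MigState mig_next(MigState s, MigEvent ev)
{
    switch (s) {
    case MIG_IDLE:
        return ev == EV_HOST_NEXT ? MIG_WAIT_TRACK_ACK : s;
    case MIG_WAIT_TRACK_ACK:
        if (ev == EV_GUEST_ACK) {
            return MIG_COPYING;
        }
        return ev == EV_GUEST_TIMEOUT ? MIG_ABORTED : s;
    case MIG_COPYING:
        return ev == EV_HOST_NEXT ? MIG_WAIT_EJECT_ACK : s;
    case MIG_WAIT_EJECT_ACK:
        if (ev == EV_GUEST_ACK) {
            return MIG_STOP_AND_FINISH;
        }
        return ev == EV_GUEST_TIMEOUT ? MIG_ABORTED : s;
    default:
        return s;
    }
}

/* Feed a sequence of events starting from MIG_IDLE. */
static MigState mig_run(const MigEvent *evs, size_t n)
{
    MigState s = MIG_IDLE;
    size_t i;

    assert(evs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        s = mig_next(s, evs[i]);
    }
    return s;
}
```

Replacing the eject in step 5 with a pause-and-resume would only change what
MIG_WAIT_EJECT_ACK means, not the shape of the machine.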

> It will already be a win since hot unplug after migration starts and
> most memory has been migrated is better than hot unplug before migration
> starts.

Right.  Generally the longer the VF can be maintained as a part of the
guest the longer the network performance is improved versus using a
purely virtual interface.

> Then measure downtime and profile. Then we can look at ways
> to quiesce device faster which really means step 5 is replaced
> with "host tells guest to quiesce device and dirty (or just unmap!)
> all memory mapped for write by device".

Step 5 will be the spot where we really need to start modifying
drivers.  Specifically we probably need to go through and clean-up
things so that we can reduce as many of the delays in the driver
suspend/resume path as possible.  I suspect there is quite a bit that
can be done there that would probably also improve boot and shutdown
times since those are also impacted by the devices.

- Alex

Re: [Qemu-devel] [PATCH v8 26/35] qapi: Add type.is_empty() helper

2016-01-05 Thread Eric Blake

On 01/05/2016 07:04 AM, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
>> And use it in qapi-types and qapi-event.  Down the road, we may
>> want to lift our artificial restriction of no variants at the
>> top level of an event, at which point, inlining our check for
>> whether members is empty will no longer be sufficient.  More
>> immediately, the new .is_empty() helper will help fix a bug in
>> qapi-visit.
>>
> 
> which bug? (I guess it's related to the next patch)

Yes, the next patch. I guess I can spell it out better in this commit
message, though.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [RFC v6 02/14] softmmu: Add new TLB_EXCL flag

2016-01-05 Thread Alex Bennée


Alvise Rigo  writes:

> Add a new TLB flag to force all the accesses made to a page to follow
> the slow-path.
>
> In the case we remove a TLB entry marked as EXCL, we unset the
> corresponding exclusive bit in the bitmap.
>
> Suggested-by: Jani Kokkonen 
> Suggested-by: Claudio Fontana 
> Signed-off-by: Alvise Rigo 
> ---
>  cputlb.c|  38 +++-
>  include/exec/cpu-all.h  |   8 
>  include/exec/cpu-defs.h |   1 +
>  include/qom/cpu.h   |  14 ++
>  softmmu_template.h  | 114 ++--
>  5 files changed, 152 insertions(+), 23 deletions(-)
>
> diff --git a/cputlb.c b/cputlb.c
> index bf1d50a..7ee0c89 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -394,6 +394,16 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
>  env->tlb_v_table[mmu_idx][vidx] = *te;
>  env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
>
> +if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write & TLB_EXCL))) {

Why do we care about TLB_MMIO flags here? Does it actually happen? Would
bad things happen if we enforced exclusivity for an MMIO write? Do the
other flags matter?

I think there should be a comment explaining why MMIO is checked here.
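
For reference, the condition under discussion can be looked at in
isolation.  The sketch below reuses the flag values from the quoted patch;
the 4 KiB page size and the helper names are assumptions made purely for
illustration, and the code does not answer the question of whether the
MMIO test is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the quoted patch. */
#define TLB_NOTDIRTY (1 << 4)
#define TLB_MMIO     (1 << 5)
#define TLB_EXCL     (1 << 6)

/* Assumed for illustration only: 4 KiB target pages. */
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_SIZE (1 << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK (~(uint64_t)(TARGET_PAGE_SIZE - 1))

/* Same guard as in the patch: flag bits must fit below the page size. */
#if TLB_EXCL >= TARGET_PAGE_SIZE
#error TLB flag bits overlap the page-aligned part of the address
#endif

/* Hypothetical helper mirroring the condition in the hunk: the entry
 * being evicted is marked exclusive and is not an MMIO entry. */
bool evicting_excl_ram_entry(uint64_t addr_write)
{
    return !(addr_write & TLB_MMIO) && (addr_write & TLB_EXCL);
}

/* Hypothetical helper: strip the flag bits, keeping the page address. */
uint64_t tlb_page_of(uint64_t addr_write)
{
    return addr_write & TARGET_PAGE_MASK;
}
```

With both flags set, the helper returns false, which is exactly the case
the question above is about.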

> +/* We are removing an exclusive entry, set the page to dirty. This
> + * is not be necessary if the vCPU has performed both SC and LL. */
> +hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK) +
> +  (te->addr_write & TARGET_PAGE_MASK);
> +if (!cpu->ll_sc_context) {
> +cpu_physical_memory_unset_excl(hw_addr, cpu->cpu_index);
> +}
> +}
> +
>  /* refill the tlb */
>  env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
>  env->iotlb[mmu_idx][index].attrs = attrs;
> @@ -419,7 +429,15 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
> + xlat)) {
>  te->addr_write = address | TLB_NOTDIRTY;
>  } else {
> -te->addr_write = address;
> +if (!(address & TLB_MMIO) &&
> +cpu_physical_memory_atleast_one_excl(section->mr->ram_addr
> +   + xlat)) {
> +/* There is at least one vCPU that has flagged the address as
> + * exclusive. */
> +te->addr_write = address | TLB_EXCL;
> +} else {
> +te->addr_write = address;
> +}
>  }
>  } else {
>  te->addr_write = -1;
> @@ -471,6 +489,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
>  return qemu_ram_addr_from_host_nofail(p);
>  }
>
> +/* For every vCPU compare the exclusive address and reset it in case of a
> + * match. Since only one vCPU is running at once, no lock has to be held to
> + * guard this operation. */
> +static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
> +{
> +CPUState *cpu;
> +
> +CPU_FOREACH(cpu) {
> +if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
> +ranges_overlap(cpu->excl_protected_range.begin,
> +   cpu->excl_protected_range.end -
> +   cpu->excl_protected_range.begin,
> +   addr, size)) {
> +cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
> +}
> +}
> +}
> +
>  #define MMUSUFFIX _mmu
>
>  #define SHIFT 0
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 83b1781..f8d8feb 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -277,6 +277,14 @@ CPUArchState *cpu_copy(CPUArchState *env);
>  #define TLB_NOTDIRTY    (1 << 4)
>  /* Set if TLB entry is an IO callback.  */
>  #define TLB_MMIO        (1 << 5)
> +/* Set if TLB entry references a page that requires exclusive access.  */
> +#define TLB_EXCL        (1 << 6)
> +
> +/* Do not allow a TARGET_PAGE_MASK which covers one or more bits defined
> + * above. */
> +#if TLB_EXCL >= TARGET_PAGE_SIZE
> +#error TARGET_PAGE_MASK covering the low bits of the TLB virtual address
> +#endif
>
>  void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>  void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
> index 5093be2..b34d7ae 100644
> --- a/include/exec/cpu-defs.h
> +++ b/include/exec/cpu-defs.h
> @@ -27,6 +27,7 @@
>  #include 
>  #include "qemu/osdep.h"
>  #include "qemu/queue.h"
> +#include "qemu/range.h"
>  #include "tcg-target.h"
>  #ifndef CONFIG_USER_ONLY
>  #include "exec/hwaddr.h"
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 51a1323..c6bb6b6 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -29,6 +29,7 @@
>  #include "qemu/queue.h"
>  #include "qemu/thread.h"
>  #include "qemu/typedefs.h"
> +#include "qemu/range.h"
>

Re: [Qemu-devel] [PATCH v8 22/35] qapi: Add visit_type_null() visitor

2016-01-05 Thread Eric Blake

On 01/05/2016 07:05 AM, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
>> Right now, qmp-output-visitor happens to produce a QNull result
>> if nothing is actually visited between the creation of the visitor
>> and the request for the resulting QObject.  A stronger protocol
>> would require that a QMP output visit MUST visit something.  But
>> to still be able to produce a JSON 'null' output, we need a new
>> visitor function that states our intentions.
>>
>> This patch introduces the new visit_type_null() interface, and
>> a later patch will then wire it up into the qmp output visitor.
>>
>> Signed-off-by: Eric Blake 
>>
> 
> overall looks good to me:
> Reviewed-by: Marc-André Lureau 
> 
> Just a small remark below,
> 

>> +/* Must be provided to visit explicit null values (right now, only the
>> + * dealloc visitor supports this).  */
>> +void (*type_null)(Visitor *v, const char *name, Error **errp);
>>
> 
> It's not clear to me what you mean by "Must be provided" if only one
> visitor implements it.
> 

>> +void visit_type_null(Visitor *v, const char *name, Error **errp)
>> +{
>> +v->type_null(v, name, errp);
>> +}
> 
> Shouldn't it be optional then and a if (v->type_null) be added?

No one else (yet) uses a visitor that explicitly visits JSON 'null'.  If
someone adds the use of such a visitor, this code will crash at the
attempt to use the missing callback, which will tell the developer to
add the callback.  Adding a conditional would paper over the issue and
make it harder to find.

At any rate, it matches the existing pattern of 'must be provided if you
plan to visit X' for visitors that lack the callback - see visit_type_any().
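
To illustrate the trade-off, here is a cut-down, standalone sketch of the
dispatch.  It is not the real QEMU Visitor: the struct carries only the one
callback, the Error ** parameter is replaced by a plain int out-parameter,
and counting_type_null(), make_counting_visitor(), and the nulls_seen field
are hypothetical names added for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Visitor Visitor;

/* Stand-in for QEMU's Visitor, reduced to the callback under discussion. */
struct Visitor {
    void (*type_null)(Visitor *v, const char *name, int *err);
    int nulls_seen;       /* hypothetical bookkeeping for the example */
};

/* Unconditional dispatch, as in the patch: a visitor that lacks the
 * callback crashes on first use instead of silently doing nothing. */
void visit_type_null(Visitor *v, const char *name, int *err)
{
    v->type_null(v, name, err);
}

/* Hypothetical callback that records each explicit null it visits. */
static void counting_type_null(Visitor *v, const char *name, int *err)
{
    (void)name;
    *err = 0;
    v->nulls_seen++;
}

/* Hypothetical constructor for a visitor that provides the callback. */
Visitor make_counting_visitor(void)
{
    Visitor v = { .type_null = counting_type_null, .nulls_seen = 0 };
    return v;
}
```

A visitor built without type_null would dereference a NULL function
pointer inside visit_type_null(), which is the early failure described
above; an `if (v->type_null)` guard would turn that into a silent no-op.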

Maybe I should write a followup patch that implements it for
qmp-input-visitor, as well as a testsuite that proves we can round-trip
a null literal from JSON to QMP back to JSON.  And when it comes to
blockdev commands for reopening a device (such as changing its
throttling options), we've toyed with the idea of explicitly
distinguishing between leaving an option unchanged (omit 'option'
altogether) and resetting the option to its default (doable with
'option':null, but requires that we allow parsing explicit null).
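
A minimal sketch of that three-way distinction, assuming a single integer
setting: the type and function names are hypothetical and are not part of
QAPI or the blockdev code.

```c
#include <assert.h>

/* Hypothetical: how one optional setting arrived in a reopen request. */
enum opt_kind {
    OPT_ABSENT,           /* 'option' omitted altogether */
    OPT_NULL,             /* 'option': null */
    OPT_VALUE             /* 'option': <number> */
};

struct opt_int {
    enum opt_kind kind;
    long value;           /* meaningful only when kind == OPT_VALUE */
};

/* Compute the new setting from the current one, the default, and the
 * request: omitted keeps the current value, null resets to the default,
 * and an explicit value replaces it. */
long apply_opt(long current, long dflt, struct opt_int req)
{
    switch (req.kind) {
    case OPT_NULL:
        return dflt;
    case OPT_VALUE:
        return req.value;
    case OPT_ABSENT:
    default:
        return current;
    }
}
```

Without a way to parse an explicit null, the OPT_NULL case cannot be told
apart from OPT_ABSENT, which is why the input visitor would need support
for it.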

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

