date:20160121

Re: [Qemu-devel] [PATCH v10 0/2] mirror: Improve zero write and discard

2016-01-21 Thread Fam Zheng

On Wed, 01/13 10:50, Fam Zheng wrote:
> v10: Fix and simplify mirror_cow_align. [Max]

Jeff, are you happy to take these patches?

Fam

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Wen Congyang

On 01/22/2016 03:42 PM, Jason Wang wrote:
> 
> 
> On 01/22/2016 02:47 PM, Wen Congyang wrote:
>> On 01/22/2016 02:21 PM, Jason Wang wrote:
>>>
>>> On 01/22/2016 01:56 PM, Wen Congyang wrote:
 On 01/22/2016 01:41 PM, Jason Wang wrote:
>>
>> On 01/22/2016 11:28 AM, Wen Congyang wrote:
 On 01/22/2016 11:15 AM, Jason Wang wrote:
>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
 On 01/20/2016 06:19 PM, Jason Wang wrote:
 On 01/20/2016 06:01 PM, Wen Congyang wrote:
 On 01/20/2016 02:54 PM, Jason Wang wrote:
 On 01/20/2016 11:29 AM, Zhang Chen wrote:
 Sure.

 Two main comments/suggestions:

 - TCP analysis is missed in current version, 
 maybe you point a git tree
 (or another version of RFC) to me for a better 
 understanding of the
 design. (Just a skeleton for TCP should be 
 sufficient to discuss).
 - I prefer to make the code as reusable as 
 possible. So it's better to
 split/decouple the reusable parts from the 
 codes. So a vague idea is:

 1) Decouple the packet comparing from the 
 netfilter. You've achieved
 this 99% since the work has been done in a 
 thread. Just let the thread
 poll sockets directly, then the comparing have 
 the possibility to be
 reused by other kinds of dataplane.
 2) Implement traffic mirror/redirector as 
 filter.
 3) Implement TCP seq rewriting as a filter.

 Then, in primary node, you need just a traffic 
 mirror, which did:
 - mirror ingress traffic to secondary node
 - mirror outgress traffic to packet comparing 
 thread

 And in secondadry node, you need two filters:
 - A TCP seq rewriter which adjust tcp sequence 
 number.
 - A traffic redirector which redirect packet 
 from a socket as ingress
 traffic, and redirect outgress traffic to the 
 socket which could be
 polled by remote packet comparing thread.
   Thoughts?

 Thanks

 Thanks
 zhangchen
 Hi, Jason.
 We consider your suggestion to split/decouple
 the reusable parts from the codes.
 Due to filter plugin are traversed one by one in 
 order
 we will split colo-proxy to three filters in each 
 side.

 But in this plan,primary and secondary both have 
 socket
 server,startup is a problem.
 I believe this issue could be solved by reusing socket 
 chardev.

  Primary qemu  

 Secondary qemu
 +--+

 +---+
 | 
 +-+
   |   | 
 +--+
  |
 | |
  |  |   | 
 |

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Jason Wang



On 01/22/2016 02:47 PM, Wen Congyang wrote:
> On 01/22/2016 02:21 PM, Jason Wang wrote:
>>
>> On 01/22/2016 01:56 PM, Wen Congyang wrote:
>>> On 01/22/2016 01:41 PM, Jason Wang wrote:
>
> On 01/22/2016 11:28 AM, Wen Congyang wrote:
>>> On 01/22/2016 11:15 AM, Jason Wang wrote:
> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>> Sure.
>>>
>>> Two main comments/suggestions:
>>>
>>> - TCP analysis is missed in current version, 
>>> maybe you point a git tree
>>> (or another version of RFC) to me for a better 
>>> understanding of the
>>> design. (Just a skeleton for TCP should be 
>>> sufficient to discuss).
>>> - I prefer to make the code as reusable as 
>>> possible. So it's better to
>>> split/decouple the reusable parts from the 
>>> codes. So a vague idea is:
>>>
>>> 1) Decouple the packet comparing from the 
>>> netfilter. You've achieved
>>> this 99% since the work has been done in a 
>>> thread. Just let the thread
>>> poll sockets directly, then the comparing have 
>>> the possibility to be
>>> reused by other kinds of dataplane.
>>> 2) Implement traffic mirror/redirector as 
>>> filter.
>>> 3) Implement TCP seq rewriting as a filter.
>>>
>>> Then, in primary node, you need just a traffic 
>>> mirror, which did:
>>> - mirror ingress traffic to secondary node
>>> - mirror outgress traffic to packet comparing 
>>> thread
>>>
>>> And in secondadry node, you need two filters:
>>> - A TCP seq rewriter which adjust tcp sequence 
>>> number.
>>> - A traffic redirector which redirect packet 
>>> from a socket as ingress
>>> traffic, and redirect outgress traffic to the 
>>> socket which could be
>>> polled by remote packet comparing thread.
>>>   Thoughts?
>>>
>>> Thanks
>>>
>>> Thanks
>>> zhangchen
>>> Hi, Jason.
>>> We consider your suggestion to split/decouple
>>> the reusable parts from the codes.
>>> Due to filter plugin are traversed one by one in 
>>> order
>>> we will split colo-proxy to three filters in each 
>>> side.
>>>
>>> But in this plan,primary and secondary both have 
>>> socket
>>> server,startup is a problem.
>>> I believe this issue could be solved by reusing socket 
>>> chardev.
>>>
>>>  Primary qemu   
>>>
>>> Secondary qemu
>>> +--+
>>>   
>>> +---+
>>> | 
>>> +-+
>>>   |   | 
>>> +--+
>>>  |
>>> | | 
>>> |  |   | 
>>> |   
>>>| |
>>> | |guest

[Qemu-devel] [PATCH] virtio-pci: Fix vq enabled flag on reset for virtio 1.0

2016-01-21 Thread Fam Zheng

Virtio 1.0 requires queue_enable status in virtio_pci_common_cfg to be
cleared upon device reset (virtio-v1.0.pdf, section 4.1.4.3.1), but we
don't do it.

This causes trouble on Linux's virtio-blk-pci probe since seabios 1.9.0
update (commit ad30c0b0d), if modern enabled. Kernel spits:

...
[5.624492] virtio_blk: probe of virtio0 failed with error -2
...

To fix it, call the virtio-pci's reset handler instead of the device
attached on the bus.

Signed-off-by: Fam Zheng 
---
 hw/virtio/virtio-pci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 94667e6..f0ab812 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1310,6 +1310,8 @@ static uint64_t virtio_pci_common_read(void *opaque, 
hwaddr addr,
 return val;
 }
 
+static void virtio_pci_reset(DeviceState *qdev);
+
 static void virtio_pci_common_write(void *opaque, hwaddr addr,
 uint64_t val, unsigned size)
 {
@@ -1351,7 +1353,7 @@ static void virtio_pci_common_write(void *opaque, hwaddr 
addr,
 }
 
 if (vdev->status == 0) {
-virtio_reset(vdev);
+virtio_pci_reset(DEVICE(proxy));
 msix_unuse_all_vectors(&proxy->pci_dev);
 }
 
-- 
2.4.3

Re: [Qemu-devel] RFC: running the user interface in a thread ...

2016-01-21 Thread Dave Airlie

On 22 January 2016 at 16:59, Gerd Hoffmann  wrote:
>   Hi,
>
>> In theory Mesa could help here, but GL isn't thread friendly at all,
>> so it probably won't help in the virgl
>> case even if it did. Since most GL apps compile a shader and block on
>> using it straight away doing it
>> in a thread won't help unblock things.
>>
>> So I think it would be best to have all the virgl vq processing happen
>> in it's own thread with some API
>> to the UI to do UI resizes (the most difficult) and dirty regions etc.
>
> We can move only virgl into its own thread, but then we'll have two
> threads (virgl and main which runs ui) which use opengl.  So I was
> thinking maybe it is better to have a single thread which runs both
> virgl and ui (and thats why I've started this thread ...).

In theory as long as we have separate contexts bound in each thread it
should be fine.

OpenGL is okay as long as one context is bound in one thread at a time.

Since the only sharing we have between contexts are textures, they
should be okay
between shared contexts.

Dave.

Re: [Qemu-devel] RFC: running the user interface in a thread ...

2016-01-21 Thread Gerd Hoffmann

  Hi,

> In theory Mesa could help here, but GL isn't thread friendly at all,
> so it probably won't help in the virgl
> case even if it did. Since most GL apps compile a shader and block on
> using it straight away doing it
> in a thread won't help unblock things.
> 
> So I think it would be best to have all the virgl vq processing happen
> in it's own thread with some API
> to the UI to do UI resizes (the most difficult) and dirty regions etc.

We can move only virgl into its own thread, but then we'll have two
threads (virgl and main which runs ui) which use opengl.  So I was
thinking maybe it is better to have a single thread which runs both
virgl and ui (and thats why I've started this thread ...).

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Wen Congyang

On 01/22/2016 02:21 PM, Jason Wang wrote:
> 
> 
> On 01/22/2016 01:56 PM, Wen Congyang wrote:
>> On 01/22/2016 01:41 PM, Jason Wang wrote:


 On 01/22/2016 11:28 AM, Wen Congyang wrote:
>> On 01/22/2016 11:15 AM, Jason Wang wrote:

 On 01/20/2016 06:30 PM, Wen Congyang wrote:
>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>
>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>> Sure.
>>
>> Two main comments/suggestions:
>>
>> - TCP analysis is missed in current version, 
>> maybe you point a git tree
>> (or another version of RFC) to me for a better 
>> understanding of the
>> design. (Just a skeleton for TCP should be 
>> sufficient to discuss).
>> - I prefer to make the code as reusable as 
>> possible. So it's better to
>> split/decouple the reusable parts from the 
>> codes. So a vague idea is:
>>
>> 1) Decouple the packet comparing from the 
>> netfilter. You've achieved
>> this 99% since the work has been done in a 
>> thread. Just let the thread
>> poll sockets directly, then the comparing have 
>> the possibility to be
>> reused by other kinds of dataplane.
>> 2) Implement traffic mirror/redirector as filter.
>> 3) Implement TCP seq rewriting as a filter.
>>
>> Then, in primary node, you need just a traffic 
>> mirror, which did:
>> - mirror ingress traffic to secondary node
>> - mirror outgress traffic to packet comparing 
>> thread
>>
>> And in secondadry node, you need two filters:
>> - A TCP seq rewriter which adjust tcp sequence 
>> number.
>> - A traffic redirector which redirect packet 
>> from a socket as ingress
>> traffic, and redirect outgress traffic to the 
>> socket which could be
>> polled by remote packet comparing thread.
>>   Thoughts?
>>
>> Thanks
>>
>> Thanks
>> zhangchen
>> Hi, Jason.
>> We consider your suggestion to split/decouple
>> the reusable parts from the codes.
>> Due to filter plugin are traversed one by one in 
>> order
>> we will split colo-proxy to three filters in each 
>> side.
>>
>> But in this plan,primary and secondary both have 
>> socket
>> server,startup is a problem.
>> I believe this issue could be solved by reusing socket 
>> chardev.
>>
>>  Primary qemu
>>   
>> Secondary qemu
>> +--+
>>   
>> +---+
>> | 
>> +-+
>>   |   | 
>> +--+
>>  |
>> | |  
>>|  |   | 
>> |
>>   | |
>> | |guest 
>>|  |   | 
>> |guest

Re: [Qemu-devel] [PATCH] net: set endianness on all backend devices

2016-01-21 Thread Jason Wang



On 01/21/2016 04:42 PM, Laurent Vivier wrote:
> ping
>
> [added Jason in cc:]
>
> On 13/01/2016 20:26, Laurent Vivier wrote:
>> commit 5be7d9f1b1452613b95c6ba70b8d7ad3d0797991
>>vhost-net: tell tap backend about the vnet endianness
>>
>> makes vhost net to set the endianness of the device, but only for
>> the first device.
>>
>> In case of multiqueue, we have multiple devices... This patch sets the
>> endianness for all the devices of the interface.
>>
>> Signed-off-by: Laurent Vivier 
>> ---
>>  hw/net/vhost_net.c | 23 +++
>>  1 file changed, 11 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>> index 318c3e6..10e233a 100644
>> --- a/hw/net/vhost_net.c
>> +++ b/hw/net/vhost_net.c
>> @@ -300,21 +300,19 @@ int vhost_net_start(VirtIODevice *dev, NetClientState 
>> *ncs,
>>  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
>>  VirtioBusState *vbus = VIRTIO_BUS(qbus);
>>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>> -int r, e, i;
>> +int r, e, i, j;
>>  
>>  if (!k->set_guest_notifiers) {
>>  error_report("binding does not support guest notifiers");
>> -r = -ENOSYS;
>> -goto err;
>> +return -ENOSYS;
>>  }
>>  
>> -r = vhost_net_set_vnet_endian(dev, ncs[0].peer, true);
>> -if (r < 0) {
>> -goto err;
>> -}
>> -
>> -for (i = 0; i < total_queues; i++) {
>> -vhost_net_set_vq_index(get_vhost_net(ncs[i].peer), i * 2);
>> +for (j = 0; j < total_queues; j++) {
>> +r = vhost_net_set_vnet_endian(dev, ncs[j].peer, true);
>> +if (r < 0) {
>> +goto err_endian;
>> +}
>> +vhost_net_set_vq_index(get_vhost_net(ncs[j].peer), j * 2);
>>  }
>>  
>>  r = k->set_guest_notifiers(qbus->parent, total_queues * 2, true);
>> @@ -343,8 +341,9 @@ err_start:
>>  fflush(stderr);
>>  }
>>  err_endian:
>> -vhost_net_set_vnet_endian(dev, ncs[0].peer, false);
>> -err:
>> +while (--j >= 0) {
>> +vhost_net_set_vnet_endian(dev, ncs[j].peer, false);
>> +}
>>  return r;
>>  }
>>  

Reviewed-by: Jason Wang

Re: [Qemu-devel] [PATCH v2 2/9] ipmi: replace goto by a return statement

2016-01-21 Thread Greg Kurz

On Thu, 21 Jan 2016 18:18:47 +0100
Cédric Le Goater  wrote:

> Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
> IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
> handle hidden errors. Using directly a return statement as the same

s/as/has

> effect and it removes the fact that 'out' needs to be defined.
> 
> The code exits in ipmi_sim_handle_command() are a little different
> from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
> handled before making use of it. This might be a bit excessive as a
> minimum response len is currently 300 bytes and the patch checks that
> at least 3 are available.
> 
> Signed-off-by: Cédric Le Goater 
> ---

Reviewed-by: Greg Kurz 

>  hw/ipmi/ipmi_bmc_sim.c | 140 
> +++--
>  1 file changed, 41 insertions(+), 99 deletions(-)
> 
> diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
> index 0a59e539f549..e42c7e86c344 100644
> --- a/hw/ipmi/ipmi_bmc_sim.c
> +++ b/hw/ipmi/ipmi_bmc_sim.c
> @@ -258,7 +258,7 @@ struct IPMIBmcSim {
>  do {   \
>  if (*rsp_len >= max_rsp_len) { \
>  rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;   \
> -goto out;  \
> +return;\
>  }  \
>  rsp[(*rsp_len)++] = (b);   \
>  } while (0)
> @@ -267,7 +267,7 @@ struct IPMIBmcSim {
>  #define IPMI_CHECK_CMD_LEN(l) \
>  if (cmd_len < l) { \
>  rsp[2] = IPMI_CC_REQUEST_DATA_LENGTH_INVALID;  \
> -goto out; \
> +return; \
>  }
> 
>  /* Check that the reservation in the command is valid. */
> @@ -275,7 +275,7 @@ struct IPMIBmcSim {
>  do {   \
>  if ((cmd[off] | (cmd[off + 1] << 8)) != r) {   \
>  rsp[2] = IPMI_CC_INVALID_RESERVATION;  \
> -goto out;  \
> +return;\
>  }  \
>  } while (0)
> 
> @@ -453,14 +453,12 @@ static void gen_event(IPMIBmcSim *ibs, unsigned int 
> sens_num, uint8_t deassert,
>  }
> 
>  if (ibs->msg_flags & IPMI_BMC_MSG_FLAG_EVT_BUF_FULL) {
> -goto out;
> +return;
>  }
> 
>  memcpy(ibs->evtbuf, evt, 16);
>  ibs->msg_flags |= IPMI_BMC_MSG_FLAG_EVT_BUF_FULL;
>  k->set_atn(s, 1, attn_irq_enabled(ibs));
> - out:
> -return;
>  }
> 
>  static void sensor_set_discrete_bit(IPMIBmcSim *ibs, unsigned int sensor,
> @@ -581,6 +579,11 @@ static void ipmi_sim_handle_command(IPMIBmc *b,
> 
>  /* Set up the response, set the low bit of NETFN. */
>  /* Note that max_rsp_len must be at least 3 */
> +if (max_rsp_len < 3) {
> +rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;
> +goto out;
> +}
> +
>  IPMI_ADD_RSP_DATA(cmd[0] | 0x04);
>  IPMI_ADD_RSP_DATA(cmd[1]);
>  IPMI_ADD_RSP_DATA(0); /* Assume success */
> @@ -698,8 +701,6 @@ static void chassis_capabilities(IPMIBmcSim *ibs,
>  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
>  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
>  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
> - out:
> -return;
>  }
> 
>  static void chassis_status(IPMIBmcSim *ibs,
> @@ -711,8 +712,6 @@ static void chassis_status(IPMIBmcSim *ibs,
>  IPMI_ADD_RSP_DATA(0);
>  IPMI_ADD_RSP_DATA(0);
>  IPMI_ADD_RSP_DATA(0);
> - out:
> -return;
>  }
> 
>  static void chassis_control(IPMIBmcSim *ibs,
> @@ -746,10 +745,8 @@ static void chassis_control(IPMIBmcSim *ibs,
>  break;
>  default:
>  rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
> -goto out;
> +return;
>  }
> - out:
> -return;
>  }
> 
>  static void get_device_id(IPMIBmcSim *ibs,
> @@ -768,8 +765,6 @@ static void get_device_id(IPMIBmcSim *ibs,
>  IPMI_ADD_RSP_DATA(ibs->mfg_id[2]);
>  IPMI_ADD_RSP_DATA(ibs->product_id[0]);
>  IPMI_ADD_RSP_DATA(ibs->product_id[1]);
> - out:
> -return;
>  }
> 
>  static void set_global_enables(IPMIBmcSim *ibs, uint8_t val)
> @@ -822,8 +817,6 @@ static void set_bmc_global_enables(IPMIBmcSim *ibs,
>  {
>  IPMI_CHECK_CMD_LEN(3);
>  set_global_enables(ibs, cmd[2]);
> - out:
> -return;
>  }
> 
>  static void get_bmc_global_enables(IPMIBmcSim *ibs,
> @@ -832,8 +825,6 @@ static void get_bmc_global_enables(IPMIBmcSim *ibs,
> unsigned int max_rsp_len)
>  {
>  IPMI_ADD_RSP_DATA(ibs->bmc_global_enables);
> - out:
> -return;
>  }
> 
>  static void clr_msg_flags(IPMIBmcSim *ibs,
> @@ -847,8 +838,6 @@ static void clr_msg_flags(IPMIBmcSim *ibs,
>  IPMI_CHECK_CMD_LEN(3);
>  ibs->msg_flags &= ~cmd[2];
>  k->set_atn(s, attn_se

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Jason Wang



On 01/22/2016 01:56 PM, Wen Congyang wrote:
> On 01/22/2016 01:41 PM, Jason Wang wrote:
>> > 
>> > 
>> > On 01/22/2016 11:28 AM, Wen Congyang wrote:
>>> >> On 01/22/2016 11:15 AM, Jason Wang wrote:
 >>>
 >>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>  On 01/20/2016 06:19 PM, Jason Wang wrote:
>>> >>
>>> >> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>  On 01/20/2016 02:54 PM, Jason Wang wrote:
>>> >> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>> >> Sure.
>>> >>
>>> >> Two main comments/suggestions:
>>> >>
>>> >> - TCP analysis is missed in current version, 
>>> >> maybe you point a git tree
>>> >> (or another version of RFC) to me for a better 
>>> >> understanding of the
>>> >> design. (Just a skeleton for TCP should be 
>>> >> sufficient to discuss).
>>> >> - I prefer to make the code as reusable as 
>>> >> possible. So it's better to
>>> >> split/decouple the reusable parts from the 
>>> >> codes. So a vague idea is:
>>> >>
>>> >> 1) Decouple the packet comparing from the 
>>> >> netfilter. You've achieved
>>> >> this 99% since the work has been done in a 
>>> >> thread. Just let the thread
>>> >> poll sockets directly, then the comparing have 
>>> >> the possibility to be
>>> >> reused by other kinds of dataplane.
>>> >> 2) Implement traffic mirror/redirector as filter.
>>> >> 3) Implement TCP seq rewriting as a filter.
>>> >>
>>> >> Then, in primary node, you need just a traffic 
>>> >> mirror, which did:
>>> >> - mirror ingress traffic to secondary node
>>> >> - mirror outgress traffic to packet comparing 
>>> >> thread
>>> >>
>>> >> And in secondadry node, you need two filters:
>>> >> - A TCP seq rewriter which adjust tcp sequence 
>>> >> number.
>>> >> - A traffic redirector which redirect packet 
>>> >> from a socket as ingress
>>> >> traffic, and redirect outgress traffic to the 
>>> >> socket which could be
>>> >> polled by remote packet comparing thread.
>>> >>   Thoughts?
>>> >>
>>> >> Thanks
>>> >>
>  Thanks
>  zhangchen
>  Hi, Jason.
>  We consider your suggestion to split/decouple
>  the reusable parts from the codes.
>  Due to filter plugin are traversed one by one in 
>  order
>  we will split colo-proxy to three filters in each 
>  side.
> 
>  But in this plan,primary and secondary both have 
>  socket
>  server,startup is a problem.
>>> >> I believe this issue could be solved by reusing socket 
>>> >> chardev.
>>> >>
>   Primary qemu
>    
>  Secondary qemu
>  +--+
>    
>  +---+
>  | 
>  +-+
>    |   | 
>  +--+
>   |
>  | |  
> |  |   | 
>  |
>    | |
>  | |guest 
> |  |   | 
>  |guest   
>    | |
>

Re: [Qemu-devel] [PATCH] e1000: eliminate infinite loops on out-of-bounds transfer start

2016-01-21 Thread Jason Wang



On 01/22/2016 02:11 PM, Michael Tokarev wrote:
> 22.01.2016 06:09, Jason Wang wrote:
>> On 01/19/2016 09:17 PM, Laszlo Ersek wrote:
>>> The start_xmit() and e1000_receive_iov() functions implement DMA transfers
>>> iterating over a set of descriptors that the guest's e1000 driver
>>> prepares:
> ...
>> Applied in my -net.
> This is CVE-2016-1981, btw.
>
> /mjt
>

Add this into commit log.

Thanks

Re: [Qemu-devel] [PATCH] e1000: eliminate infinite loops on out-of-bounds transfer start

2016-01-21 Thread Michael Tokarev

22.01.2016 06:09, Jason Wang wrote:
> On 01/19/2016 09:17 PM, Laszlo Ersek wrote:
>> The start_xmit() and e1000_receive_iov() functions implement DMA transfers
>> iterating over a set of descriptors that the guest's e1000 driver
>> prepares:
...
> Applied in my -net.

This is CVE-2016-1981, btw.

/mjt

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Wen Congyang

On 01/22/2016 01:33 PM, Jason Wang wrote:
> 
> 
> On 01/20/2016 06:34 PM, Wen Congyang wrote:
>> On 01/20/2016 06:03 PM, Jason Wang wrote:
>>>
>>> On 01/20/2016 05:49 PM, Wen Congyang wrote:
 On 01/20/2016 05:20 PM, Jason Wang wrote:
> On 01/20/2016 03:44 PM, Wen Congyang wrote:
 ...
 -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
 -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
 -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
 -netdev tap,id=hn0
 -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
 -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
 ...

 packet comparer compares the packets from two chardev: comparer0 and
 comparer1.
 traffic-mirrorer mirror tx to secondary node through chardev mirrorer0,
 and mirror rx to packet comparer through chardev comparer0.

 In secondary node:

 ...
 -chardev socket,id=redirector0,host=ip_primary,port=Y
 -chardev socket,id=redirector1,host=ip_primary,port=Z
 -netdev tap,id=hn0
 -traffic-redirector 
 netdev=hn0,id,r0,indev=redirector0,outdev=redirector1
 -colo-rewriter netdev=hn0,id=c0
 ...

 traffic-redirector redirect the rx traffic from primary node through
 redirector0 and redirect the tx traffic to promary node through 
 redirector1.
 colo-rewriter rewrite seq number as a normal netfilter.
>> What are traffic-mirrorer and colo-comparer, traffic-redirector, 
>> colo-rewriter?
>> A netfilter driver?
> traffic-mirrorer/redirector is a type of netfilter that just
> mirror/redirect packets between netdev and chardev (just the mirror
> client/sever and redirect client/sever in the above graph)
> colo-rewriter is a type of netfilter that did ack/seq adjust (just the
> TCP rewriter in the above graph)
> colo-comparer is a thread object that did packet comparing (similar to
> "compare" in the above graph but not a netfiler)
 Thanks. I have another question:
 IIRC, both rx and tx packets walk through all netfilter objects in the 
 same order.

 tx packet(sent to the guest): we want that redirector hanldes it first
 rx packet(sent from the guest): we want that colo-rewriter handles it first
 Change the order or use two traffic-redirectors?

 Thanks
 Wen Congyang
>>> Interesting question.
>>>
>>> Two redirectors sounds ok or maybe we can go through rx filters in a
>>> reverse order?
>> netdev <---> filter1 <> filter2 <>  <> emulated device 
>> <> guest
>> So I think we can go through rx filters in a reverse order. But it changes
>> the behavior. So I am not sure if we can do it.
> 
> I think we can. Both dump and buffer filter does not require strict
> order, so it's a good time and change to do this.

OK, we will do it.

Thanks
Wen Congyang

> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> .
>>>
>>
>>
> 
> 
> 
> .
>

Re: [Qemu-devel] [PATCH v6 0/6] Xen PCI passthru: Convert to realize()

2016-01-21 Thread Cao jin




On 01/21/2016 11:41 PM, Stefano Stabellini wrote:

Hi Cao,

I appreciate the reminder, but it looks like Eric hasn't reviewed patch
3/5. Am I wrong?



Seems I made a little mistake:-[
Already see your pull request, Thank you:)
And thanks for Eric`s review:)

--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Wen Congyang

On 01/22/2016 01:41 PM, Jason Wang wrote:
> 
> 
> On 01/22/2016 11:28 AM, Wen Congyang wrote:
>> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>>
>>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
 On 01/20/2016 06:19 PM, Jason Wang wrote:
>>
>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
 On 01/20/2016 02:54 PM, Jason Wang wrote:
>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>> Sure.
>>
>> Two main comments/suggestions:
>>
>> - TCP analysis is missed in current version, maybe you point a 
>> git tree
>> (or another version of RFC) to me for a better understanding of 
>> the
>> design. (Just a skeleton for TCP should be sufficient to 
>> discuss).
>> - I prefer to make the code as reusable as possible. So it's 
>> better to
>> split/decouple the reusable parts from the codes. So a vague 
>> idea is:
>>
>> 1) Decouple the packet comparing from the netfilter. You've 
>> achieved
>> this 99% since the work has been done in a thread. Just let the 
>> thread
>> poll sockets directly, then the comparing have the possibility 
>> to be
>> reused by other kinds of dataplane.
>> 2) Implement traffic mirror/redirector as filter.
>> 3) Implement TCP seq rewriting as a filter.
>>
>> Then, in primary node, you need just a traffic mirror, which did:
>> - mirror ingress traffic to secondary node
>> - mirror outgress traffic to packet comparing thread
>>
>> And in secondadry node, you need two filters:
>> - A TCP seq rewriter which adjust tcp sequence number.
>> - A traffic redirector which redirect packet from a socket as 
>> ingress
>> traffic, and redirect outgress traffic to the socket which could 
>> be
>> polled by remote packet comparing thread.
>>   Thoughts?
>>
>> Thanks
>>
 Thanks
 zhangchen
 Hi, Jason.
 We consider your suggestion to split/decouple
 the reusable parts from the codes.
 Due to filter plugin are traversed one by one in order
 we will split colo-proxy to three filters in each side.

 But in this plan,primary and secondary both have socket
 server,startup is a problem.
>> I believe this issue could be solved by reusing socket chardev.
>>
  Primary qemu  
 Secondary qemu
 +--+  
 +---+
 | +-+  |   
 | 
 +--+ |
 | | |  |   
 | 
 |  | |
 | |guest|  |   
 | 
 |guest | |
 | | |  |   
 | 
 |  | |
 | +---^--+--+  |   
 | 
 +-++---+ |
 | |  | |  
 |^| |
 | |  | |  
 ||| |
 | +-+ 
 ||| |
 |  netfilter  |  | ||  
 |  
 netfilter|| |
 | +-+  ||  
 | 
 +--+ |
 | |   |  | filter excute order  |  ||  
 | 
 | ||  filter excute order  | |
 | |   |  |+---> |  ||  
 | 
 | || +---> | |
 | |

Re: [Qemu-devel] [PATCH v2 2/9] ipmi: replace goto by a return statement

2016-01-21 Thread Marcel Apfelbaum


On 01/21/2016 07:18 PM, Cédric Le Goater wrote:

Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
handle hidden errors. Using directly a return statement as the same
effect and it removes the fact that 'out' needs to be defined.

The code exits in ipmi_sim_handle_command() are a little different
from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
handled before making use of it. This might be a bit excessive as a
minimum response len is currently 300 bytes and the patch checks that
at least 3 are available.

Signed-off-by: Cédric Le Goater 
---
  hw/ipmi/ipmi_bmc_sim.c | 140 +++--
  1 file changed, 41 insertions(+), 99 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 0a59e539f549..e42c7e86c344 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -258,7 +258,7 @@ struct IPMIBmcSim {
  do {   \
  if (*rsp_len >= max_rsp_len) { \
  rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;   \
-goto out;  \
+return;\
  }  \
  rsp[(*rsp_len)++] = (b);   \
  } while (0)
@@ -267,7 +267,7 @@ struct IPMIBmcSim {
  #define IPMI_CHECK_CMD_LEN(l) \
  if (cmd_len < l) { \
  rsp[2] = IPMI_CC_REQUEST_DATA_LENGTH_INVALID;  \
-goto out; \
+return; \
  }

  /* Check that the reservation in the command is valid. */
@@ -275,7 +275,7 @@ struct IPMIBmcSim {
  do {   \
  if ((cmd[off] | (cmd[off + 1] << 8)) != r) {   \
  rsp[2] = IPMI_CC_INVALID_RESERVATION;  \
-goto out;  \
+return;\
  }  \
  } while (0)

@@ -453,14 +453,12 @@ static void gen_event(IPMIBmcSim *ibs, unsigned int 
sens_num, uint8_t deassert,
  }

  if (ibs->msg_flags & IPMI_BMC_MSG_FLAG_EVT_BUF_FULL) {
-goto out;
+return;
  }

  memcpy(ibs->evtbuf, evt, 16);
  ibs->msg_flags |= IPMI_BMC_MSG_FLAG_EVT_BUF_FULL;
  k->set_atn(s, 1, attn_irq_enabled(ibs));
- out:
-return;
  }

  static void sensor_set_discrete_bit(IPMIBmcSim *ibs, unsigned int sensor,
@@ -581,6 +579,11 @@ static void ipmi_sim_handle_command(IPMIBmc *b,

  /* Set up the response, set the low bit of NETFN. */
  /* Note that max_rsp_len must be at least 3 */
+if (max_rsp_len < 3) {
+rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;
+goto out;
+}
+
  IPMI_ADD_RSP_DATA(cmd[0] | 0x04);
  IPMI_ADD_RSP_DATA(cmd[1]);
  IPMI_ADD_RSP_DATA(0); /* Assume success */
@@ -698,8 +701,6 @@ static void chassis_capabilities(IPMIBmcSim *ibs,
  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
  IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
- out:
-return;
  }

  static void chassis_status(IPMIBmcSim *ibs,
@@ -711,8 +712,6 @@ static void chassis_status(IPMIBmcSim *ibs,
  IPMI_ADD_RSP_DATA(0);
  IPMI_ADD_RSP_DATA(0);
  IPMI_ADD_RSP_DATA(0);
- out:
-return;
  }

  static void chassis_control(IPMIBmcSim *ibs,
@@ -746,10 +745,8 @@ static void chassis_control(IPMIBmcSim *ibs,
  break;
  default:
  rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
-goto out;
+return;
  }
- out:
-return;
  }

  static void get_device_id(IPMIBmcSim *ibs,
@@ -768,8 +765,6 @@ static void get_device_id(IPMIBmcSim *ibs,
  IPMI_ADD_RSP_DATA(ibs->mfg_id[2]);
  IPMI_ADD_RSP_DATA(ibs->product_id[0]);
  IPMI_ADD_RSP_DATA(ibs->product_id[1]);
- out:
-return;
  }

  static void set_global_enables(IPMIBmcSim *ibs, uint8_t val)
@@ -822,8 +817,6 @@ static void set_bmc_global_enables(IPMIBmcSim *ibs,
  {
  IPMI_CHECK_CMD_LEN(3);
  set_global_enables(ibs, cmd[2]);
- out:
-return;
  }

  static void get_bmc_global_enables(IPMIBmcSim *ibs,
@@ -832,8 +825,6 @@ static void get_bmc_global_enables(IPMIBmcSim *ibs,
 unsigned int max_rsp_len)
  {
  IPMI_ADD_RSP_DATA(ibs->bmc_global_enables);
- out:
-return;
  }

  static void clr_msg_flags(IPMIBmcSim *ibs,
@@ -847,8 +838,6 @@ static void clr_msg_flags(IPMIBmcSim *ibs,
  IPMI_CHECK_CMD_LEN(3);
  ibs->msg_flags &= ~cmd[2];
  k->set_atn(s, attn_set(ibs), attn_irq_enabled(ibs));
- out:
-return;
  }

  static void get_msg_flags(IPMIBmcSim *ibs,
@@ -857,8 +846,6 @@ static void get_msg_flags(IPMIBmcSim *ibs,
unsigned int max_rsp_len)
  {
  IPMI_ADD_RSP_DATA(ibs->msg_flag

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Jason Wang



On 01/22/2016 11:28 AM, Wen Congyang wrote:
> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>
>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>
> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
> On 01/20/2016 11:29 AM, Zhang Chen wrote:
> Sure.
>
> Two main comments/suggestions:
>
> - TCP analysis is missed in current version, maybe you point a 
> git tree
> (or another version of RFC) to me for a better understanding of 
> the
> design. (Just a skeleton for TCP should be sufficient to discuss).
> - I prefer to make the code as reusable as possible. So it's 
> better to
> split/decouple the reusable parts from the codes. So a vague idea 
> is:
>
> 1) Decouple the packet comparing from the netfilter. You've 
> achieved
> this 99% since the work has been done in a thread. Just let the 
> thread
> poll sockets directly, then the comparing have the possibility to 
> be
> reused by other kinds of dataplane.
> 2) Implement traffic mirror/redirector as filter.
> 3) Implement TCP seq rewriting as a filter.
>
> Then, in primary node, you need just a traffic mirror, which did:
> - mirror ingress traffic to secondary node
> - mirror outgress traffic to packet comparing thread
>
> And in secondadry node, you need two filters:
> - A TCP seq rewriter which adjust tcp sequence number.
> - A traffic redirector which redirect packet from a socket as 
> ingress
> traffic, and redirect outgress traffic to the socket which could 
> be
> polled by remote packet comparing thread.
>   Thoughts?
>
> Thanks
>
>>> Thanks
>>> zhangchen
>>> Hi, Jason.
>>> We consider your suggestion to split/decouple
>>> the reusable parts from the codes.
>>> Due to filter plugin are traversed one by one in order
>>> we will split colo-proxy to three filters in each side.
>>>
>>> But in this plan,primary and secondary both have socket
>>> server,startup is a problem.
> I believe this issue could be solved by reusing socket chardev.
>
>>>  Primary qemu  
>>> Secondary qemu
>>> +--+  
>>> +---+
>>> | +-+  |   
>>> | 
>>> +--+ |
>>> | | |  |   
>>> | 
>>> |  | |
>>> | |guest|  |   
>>> | 
>>> |guest | |
>>> | | |  |   
>>> | 
>>> |  | |
>>> | +---^--+--+  |   
>>> | 
>>> +-++---+ |
>>> | |  | |  
>>> |^| |
>>> | |  | |  
>>> ||| |
>>> | +-+ 
>>> ||| |
>>> |  netfilter  |  | ||  
>>> |  
>>> netfilter|| |
>>> | +-+  ||  
>>> | 
>>> +--+ |
>>> | |   |  | filter excute order  |  ||  
>>> | 
>>> | ||  filter excute order  | |
>>> | |   |  |+---> |  ||  
>>> | 
>>> | || +---> | |
>>> | |   |  |  |  ||  
>>> | 
>>> | ||   TCP | |
>>> | | +-

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Jason Wang



On 01/20/2016 06:34 PM, Wen Congyang wrote:
> On 01/20/2016 06:03 PM, Jason Wang wrote:
>>
>> On 01/20/2016 05:49 PM, Wen Congyang wrote:
>>> On 01/20/2016 05:20 PM, Jason Wang wrote:
 On 01/20/2016 03:44 PM, Wen Congyang wrote:
>>> ...
>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>> -netdev tap,id=hn0
>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>> ...
>>>
>>> packet comparer compares the packets from two chardev: comparer0 and
>>> comparer1.
>>> traffic-mirrorer mirror tx to secondary node through chardev mirrorer0,
>>> and mirror rx to packet comparer through chardev comparer0.
>>>
>>> In secondary node:
>>>
>>> ...
>>> -chardev socket,id=redirector0,host=ip_primary,port=Y
>>> -chardev socket,id=redirector1,host=ip_primary,port=Z
>>> -netdev tap,id=hn0
>>> -traffic-redirector 
>>> netdev=hn0,id,r0,indev=redirector0,outdev=redirector1
>>> -colo-rewriter netdev=hn0,id=c0
>>> ...
>>>
>>> traffic-redirector redirect the rx traffic from primary node through
>>> redirector0 and redirect the tx traffic to promary node through 
>>> redirector1.
>>> colo-rewriter rewrite seq number as a normal netfilter.
> What are traffic-mirrorer and colo-comparer, traffic-redirector, 
> colo-rewriter?
> A netfilter driver?
 traffic-mirrorer/redirector is a type of netfilter that just
 mirror/redirect packets between netdev and chardev (just the mirror
 client/sever and redirect client/sever in the above graph)
 colo-rewriter is a type of netfilter that did ack/seq adjust (just the
 TCP rewriter in the above graph)
 colo-comparer is a thread object that did packet comparing (similar to
 "compare" in the above graph but not a netfiler)
>>> Thanks. I have another question:
>>> IIRC, both rx and tx packets walk through all netfilter objects in the same 
>>> order.
>>>
>>> tx packet(sent to the guest): we want that redirector hanldes it first
>>> rx packet(sent from the guest): we want that colo-rewriter handles it first
>>> Change the order or use two traffic-redirectors?
>>>
>>> Thanks
>>> Wen Congyang
>> Interesting question.
>>
>> Two redirectors sounds ok or maybe we can go through rx filters in a
>> reverse order?
> netdev <---> filter1 <> filter2 <>  <> emulated device <> 
> guest
> So I think we can go through rx filters in a reverse order. But it changes
> the behavior. So I am not sure if we can do it.

I think we can. Both dump and buffer filter does not require strict
order, so it's a good time and change to do this.

>
> Thanks
> Wen Congyang
>
>>
>> .
>>
>
>

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Wen Congyang

On 01/22/2016 11:15 AM, Jason Wang wrote:
> 
> 
> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>> On 01/20/2016 06:19 PM, Jason Wang wrote:


 On 01/20/2016 06:01 PM, Wen Congyang wrote:
>> On 01/20/2016 02:54 PM, Jason Wang wrote:

 On 01/20/2016 11:29 AM, Zhang Chen wrote:
 Sure.

 Two main comments/suggestions:

 - TCP analysis is missed in current version, maybe you point a git 
 tree
 (or another version of RFC) to me for a better understanding of the
 design. (Just a skeleton for TCP should be sufficient to discuss).
 - I prefer to make the code as reusable as possible. So it's 
 better to
 split/decouple the reusable parts from the codes. So a vague idea 
 is:

 1) Decouple the packet comparing from the netfilter. You've 
 achieved
 this 99% since the work has been done in a thread. Just let the 
 thread
 poll sockets directly, then the comparing have the possibility to 
 be
 reused by other kinds of dataplane.
 2) Implement traffic mirror/redirector as filter.
 3) Implement TCP seq rewriting as a filter.

 Then, in primary node, you need just a traffic mirror, which did:
 - mirror ingress traffic to secondary node
 - mirror outgress traffic to packet comparing thread

 And in secondadry node, you need two filters:
 - A TCP seq rewriter which adjust tcp sequence number.
 - A traffic redirector which redirect packet from a socket as 
 ingress
 traffic, and redirect outgress traffic to the socket which could be
 polled by remote packet comparing thread.
   Thoughts?

 Thanks

>> Thanks
>> zhangchen
>>
>> Hi, Jason.
>> We consider your suggestion to split/decouple
>> the reusable parts from the codes.
>> Due to filter plugin are traversed one by one in order
>> we will split colo-proxy to three filters in each side.
>>
>> But in this plan,primary and secondary both have socket
>> server,startup is a problem.
 I believe this issue could be solved by reusing socket chardev.

>>
>>  Primary qemu  
>> Secondary qemu
>> +--+  
>> +---+
>> | +-+  |   | 
>> +--+ |
>> | | |  |   | 
>> |  | |
>> | |guest|  |   | 
>> |guest | |
>> | | |  |   | 
>> |  | |
>> | +---^--+--+  |   | 
>> +-++---+ |
>> | |  | |  
>> |^| |
>> | |  | |  
>> ||| |
>> | +-+ 
>> ||| |
>> |  netfilter  |  | ||  | 
>>  
>> netfilter|| |
>> | +-+  ||  | 
>> +--+ |
>> | |   |  | filter excute order  |  ||  | 
>> | ||  filter excute order  | |
>> | |   |  |+---> |  ||  | 
>> | || +---> | |
>> | |   |  |  |  ||  | 
>> | ||   TCP | |
>> | | +-+-+ +--v-+++ +-+  |  ||  | 
>> | +---+   +---++---v+rewriter+  ++ | |
>> | | |   | ||||  |  ||  | 
>> | |   |

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-21 Thread Jason Wang



On 01/20/2016 06:30 PM, Wen Congyang wrote:
> On 01/20/2016 06:19 PM, Jason Wang wrote:
>> > 
>> > 
>> > On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>> >> On 01/20/2016 02:54 PM, Jason Wang wrote:
 >>>
 >>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>> > Sure.
>> >
>> > Two main comments/suggestions:
>> >
>> > - TCP analysis is missed in current version, maybe you point a git 
>> > tree
>> > (or another version of RFC) to me for a better understanding of the
>> > design. (Just a skeleton for TCP should be sufficient to discuss).
>> > - I prefer to make the code as reusable as possible. So it's 
>> > better to
>> > split/decouple the reusable parts from the codes. So a vague idea 
>> > is:
>> >
>> > 1) Decouple the packet comparing from the netfilter. You've 
>> > achieved
>> > this 99% since the work has been done in a thread. Just let the 
>> > thread
>> > poll sockets directly, then the comparing have the possibility to 
>> > be
>> > reused by other kinds of dataplane.
>> > 2) Implement traffic mirror/redirector as filter.
>> > 3) Implement TCP seq rewriting as a filter.
>> >
>> > Then, in primary node, you need just a traffic mirror, which did:
>> > - mirror ingress traffic to secondary node
>> > - mirror outgress traffic to packet comparing thread
>> >
>> > And in secondadry node, you need two filters:
>> > - A TCP seq rewriter which adjust tcp sequence number.
>> > - A traffic redirector which redirect packet from a socket as 
>> > ingress
>> > traffic, and redirect outgress traffic to the socket which could be
>> > polled by remote packet comparing thread.
>> >   Thoughts?
>> >
>> > Thanks
>> >
>>> >> Thanks
>>> >> zhangchen
> 
>  Hi, Jason.
>  We consider your suggestion to split/decouple
>  the reusable parts from the codes.
>  Due to filter plugin are traversed one by one in order
>  we will split colo-proxy to three filters in each side.
> 
>  But in this plan,primary and secondary both have socket
>  server,startup is a problem.
 >>> I believe this issue could be solved by reusing socket chardev.
 >>>
> 
>   Primary qemu  
>  Secondary qemu
>  +--+  
>  +---+
>  | +-+  |   | 
>  +--+ |
>  | | |  |   | 
>  |  | |
>  | |guest|  |   | 
>  |guest | |
>  | | |  |   | 
>  |  | |
>  | +---^--+--+  |   | 
>  +-++---+ |
>  | |  | |  
>  |^| |
>  | |  | |  
>  ||| |
>  | +-+ 
>  ||| |
>  |  netfilter  |  | ||  | 
>   
>  netfilter|| |
>  | +-+  ||  | 
>  +--+ |
>  | |   |  | filter excute order  |  ||  | 
>  | ||  filter excute order  | |
>  | |   |  |+---> |  ||  | 
>  | || +---> | |
>  | |   |  |  |  ||  | 
>  | ||   TCP | |
>  | | +-+-+ +--v-+++ +-+  |  ||  | 
>  | +---+   +---++---v+rewriter+  ++ | |
>  | | |   | ||||  |  ||  | 
>  | |   |   || |  || | |
>

Re: [Qemu-devel] [PATCH] e1000: eliminate infinite loops on out-of-bounds transfer start

2016-01-21 Thread Jason Wang



On 01/19/2016 09:17 PM, Laszlo Ersek wrote:
> The start_xmit() and e1000_receive_iov() functions implement DMA transfers
> iterating over a set of descriptors that the guest's e1000 driver
> prepares:
>
> - the TDLEN and RDLEN registers store the total size of the descriptor
>   area,
>
> - while the TDH and RDH registers store the offset (in whole tx / rx
>   descriptors) into the area where the transfer is supposed to start.
>
> Each time a descriptor is processed, the TDH and RDH register is bumped
> (as appropriate for the transfer direction).
>
> QEMU already contains logic to deal with bogus transfers submitted by the
> guest:
>
> - Normally, the transmit case wants to increase TDH from its initial value
>   to TDT. (TDT is allowed to be numerically smaller than the initial TDH
>   value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
>   that QEMU currently has here is a check against reaching the original
>   TDH value again -- a complete wraparound, which should never happen.
>
> - In the receive case RDH is increased from its initial value until
>   "total_size" bytes have been received; preferably in a single step, or
>   in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
>   RX descriptors are skipped without receiving data, while RDH is
>   incremented just the same. QEMU tries to prevent an infinite loop
>   (processing only null RX descriptors) by detecting whether RDH assumes
>   its original value during the loop. (Again, wrapping from RDLEN to 0 is
>   normal.)
>
> What both directions miss is that the guest could program TDLEN and RDLEN
> so low, and the initial TDH and RDH so high, that these registers will
> immediately be truncated to zero, and then never reassume their initial
> values in the loop -- a full wraparound will never occur.
>
> The condition that expresses this is:
>
>   xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)
>
> i.e., TDH or RDH start out after the last whole rx or tx descriptor that
> fits into the TDLEN or RDLEN sized area.
>
> This condition could be checked before we enter the loops, but
> pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
> bogus DMA addresses, so we just extend the existing failsafes with the
> above condition.
>
> Cc: "Michael S. Tsirkin" 
> Cc: Petr Matousek 
> Cc: Stefano Stabellini 
> Cc: Prasad Pandit 
> Cc: Michael Roth 
> Cc: Jason Wang 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
> Signed-off-by: Laszlo Ersek 
> Reviewed-by: Jason Wang 
> ---
>
> Notes:
> Regarding the public posting: we made an honest effort to vet this
> vulnerability, and the impact seems low -- no host side reads/writes,
> "just" a DoS (infinite loop). We decided the patch could be posted
> publicly, for the usual review process. Jason and Prasad checked the
> patch in the internal discussion already, but comments, improvements
> etc. are clearly welcome. The CVE request is underway. Thanks.
>
>  hw/net/e1000.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index bec06e9..34d0823 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -908,7 +908,8 @@ start_xmit(E1000State *s)
>   * bogus values to TDT/TDLEN.
>   * there's nothing too intelligent we could do about this.
>   */
> -if (s->mac_reg[TDH] == tdh_start) {
> +if (s->mac_reg[TDH] == tdh_start ||
> +tdh_start >= s->mac_reg[TDLEN] / sizeof(desc)) {
>  DBGOUT(TXERR, "TDH wraparound @%x, TDT %x, TDLEN %x\n",
> tdh_start, s->mac_reg[TDT], s->mac_reg[TDLEN]);
>  break;
> @@ -1165,7 +1166,8 @@ e1000_receive_iov(NetClientState *nc, const struct 
> iovec *iov, int iovcnt)
>  if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
>  s->mac_reg[RDH] = 0;
>  /* see comment in start_xmit; same here */
> -if (s->mac_reg[RDH] == rdh_start) {
> +if (s->mac_reg[RDH] == rdh_start ||
> +rdh_start >= s->mac_reg[RDLEN] / sizeof(desc)) {
>  DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n",
> rdh_start, s->mac_reg[RDT], s->mac_reg[RDLEN]);
>  set_ics(s, 0, E1000_ICS_RXO);

Applied in my -net.

Thanks

[Qemu-devel] [PATCH v7 15/15] iotests: Add "qemu-img map" test for VMDK extents

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/059 | 10 ++
 tests/qemu-iotests/059.out | 38 ++
 2 files changed, 48 insertions(+)

diff --git a/tests/qemu-iotests/059 b/tests/qemu-iotests/059
index 0ded0c3..261d8b0 100755
--- a/tests/qemu-iotests/059
+++ b/tests/qemu-iotests/059
@@ -133,6 +133,16 @@ $QEMU_IO -c "write -P 0xa 900G 512" "$TEST_IMG" | 
_filter_qemu_io
 $QEMU_IO -c "read -v 900G 1024" "$TEST_IMG" | _filter_qemu_io
 
 echo
+echo "=== Testing qemu-img map on extents ==="
+for fmt in twoGbMaxExtentSparse twoGbMaxExtentFlat; do
+IMGOPTS="subformat=$fmt" _make_test_img 31G
+$QEMU_IO -c "write 65024 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "write 2147483136 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "write 5G 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG map "$TEST_IMG" | _filter_testdir
+done
+
+echo
 echo "=== Testing afl image with a very large capacity ==="
 _use_sample_img afl9.vmdk.bz2
 _img_info
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 9d506cb..9f5e5cc 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -2335,6 +2335,44 @@ e103f0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00  
 read 1024/1024 bytes at offset 966367641600
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Testing qemu-img map on extents ===
+Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=33285996544 
subformat=twoGbMaxExtentSparse
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 2147483136
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 5368709120
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset  Length  Mapped to   File
+0   0x2 0x5 
TEST_DIR/iotest-version3-s001.vmdk
+0x7fff  0x1 0x7 
TEST_DIR/iotest-version3-s001.vmdk
+0x8000  0x1 0x5 
TEST_DIR/iotest-version3-s002.vmdk
+0x14000 0x1 0x5 
TEST_DIR/iotest-version3-s003.vmdk
+Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=33285996544 
subformat=twoGbMaxExtentFlat
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 2147483136
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 5368709120
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset  Length  Mapped to   File
+0   0x8000  0   
TEST_DIR/iotest-version3-f001.vmdk
+0x8000  0x8000  0   
TEST_DIR/iotest-version3-f002.vmdk
+0x1 0x8000  0   
TEST_DIR/iotest-version3-f003.vmdk
+0x18000 0x8000  0   
TEST_DIR/iotest-version3-f004.vmdk
+0x2 0x8000  0   
TEST_DIR/iotest-version3-f005.vmdk
+0x28000 0x8000  0   
TEST_DIR/iotest-version3-f006.vmdk
+0x3 0x8000  0   
TEST_DIR/iotest-version3-f007.vmdk
+0x38000 0x8000  0   
TEST_DIR/iotest-version3-f008.vmdk
+0x4 0x8000  0   
TEST_DIR/iotest-version3-f009.vmdk
+0x48000 0x8000  0   
TEST_DIR/iotest-version3-f010.vmdk
+0x5 0x8000  0   
TEST_DIR/iotest-version3-f011.vmdk
+0x58000 0x8000  0   
TEST_DIR/iotest-version3-f012.vmdk
+0x6 0x8000  0   
TEST_DIR/iotest-version3-f013.vmdk
+0x68000 0x8000  0   
TEST_DIR/iotest-version3-f014.vmdk
+0x7 0x8000  0   
TEST_DIR/iotest-version3-f015.vmdk
+0x78000 0x4000  0   
TEST_DIR/iotest-version3-f016.vmdk
+
 === Testing afl image with a very large capacity ===
 qemu-img: Can't get size of device 'image': File too large
 *** done
-- 
2.4.3

Re: [Qemu-devel] [PATCH RFC for-2.6 1/3] HBitmap: Introduce "meta" bitmap to track bit changes

2016-01-21 Thread Fam Zheng

On Thu, 01/21 13:58, Vladimir Sementsov-Ogievskiy wrote:
> >>I thing now, that the same may be accomplished for BdrvDirtyBitmap,
> >>all we need is return "changed" status from hb_set_between and then
> >>from hbitmap_set.
> >That is true, but it makes further optimizing to *really* only set the 
> >toggled
> >meta bits much more difficult (i.e. when only a few bits between start and 
> >last
> >are changed).
> 
> Are you going add some optimization in following versions (v3 ?) or
> as separate series, or other plan?

I aim at a seperate series for that.

Fam

[Qemu-devel] [PATCH v7 14/15] qemu-img: Make MapEntry a QAPI struct

2016-01-21 Thread Fam Zheng

The "flags" bit mask is expanded to two booleans, "data" and "zero";
"bs" is replaced with "filename" string.

Refactor the merge conditions in img_map() into entry_mergeable().

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 qapi/block-core.json | 27 
 qemu-img.c   | 71 +++-
 2 files changed, 69 insertions(+), 29 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0a915ed..30c2e5f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -186,6 +186,33 @@
'*fragmented-clusters': 'int', '*compressed-clusters': 'int' } }
 
 ##
+# @MapEntry:
+#
+# Mapping information from a virtual block range to a host file range
+#
+# @start: the start byte of the mapped virtual range
+#
+# @length: the number of bytes of the mapped virtual range
+#
+# @data: whether the mapped range has data
+#
+# @zero: whether the virtual blocks are zeroed
+#
+# @depth: the depth of the mapping
+#
+# @offset: #optional the offset in file that the virtual sectors are mapped to
+#
+# @filename: #optional filename that is referred to by @offset
+#
+# Since: 2.6
+#
+##
+{ 'struct': 'MapEntry',
+  'data': {'start': 'int', 'length': 'int', 'data': 'bool',
+   'zero': 'bool', 'depth': 'int', '*offset': 'int',
+   '*filename': 'str' } }
+
+##
 # @BlockdevCacheInfo
 #
 # Cache mode information for a block device
diff --git a/qemu-img.c b/qemu-img.c
index c8bc63f..f121980 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2147,47 +2147,37 @@ static int img_info(int argc, char **argv)
 return 0;
 }
 
-
-typedef struct MapEntry {
-int flags;
-int depth;
-int64_t start;
-int64_t length;
-int64_t offset;
-BlockDriverState *bs;
-} MapEntry;
-
 static void dump_map_entry(OutputFormat output_format, MapEntry *e,
MapEntry *next)
 {
 switch (output_format) {
 case OFORMAT_HUMAN:
-if ((e->flags & BDRV_BLOCK_DATA) &&
-!(e->flags & BDRV_BLOCK_OFFSET_VALID)) {
+if (e->data && !e->has_offset) {
 error_report("File contains external, encrypted or compressed 
clusters.");
 exit(1);
 }
-if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) 
{
+if (e->data && !e->zero) {
 printf("%#-16"PRIx64"%#-16"PRIx64"%#-16"PRIx64"%s\n",
-   e->start, e->length, e->offset, e->bs->filename);
+   e->start, e->length,
+   e->has_offset ? e->offset : 0,
+   e->has_filename ? e->filename : "");
 }
 /* This format ignores the distinction between 0, ZERO and ZERO|DATA.
  * Modify the flags here to allow more coalescing.
  */
-if (next &&
-(next->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) != 
BDRV_BLOCK_DATA) {
-next->flags &= ~BDRV_BLOCK_DATA;
-next->flags |= BDRV_BLOCK_ZERO;
+if (next && (!next->data || next->zero)) {
+next->data = false;
+next->zero = true;
 }
 break;
 case OFORMAT_JSON:
-printf("%s{ \"start\": %"PRId64", \"length\": %"PRId64", \"depth\": 
%d,"
-   " \"zero\": %s, \"data\": %s",
+printf("%s{ \"start\": %"PRId64", \"length\": %"PRId64","
+   " \"depth\": %"PRId64", \"zero\": %s, \"data\": %s",
(e->start == 0 ? "[" : ",\n"),
e->start, e->length, e->depth,
-   (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
-   (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
-if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
+   e->zero ? "true" : "false",
+   e->data ? "true" : "false");
+if (e->has_offset) {
 printf(", \"offset\": %"PRId64"", e->offset);
 }
 putchar('}');
@@ -2233,13 +2223,39 @@ static int get_block_status(BlockDriverState *bs, 
int64_t sector_num,
 
 e->start = sector_num * BDRV_SECTOR_SIZE;
 e->length = nb_sectors * BDRV_SECTOR_SIZE;
-e->flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
+e->data = !!(ret & BDRV_BLOCK_DATA);
+e->zero = !!(ret & BDRV_BLOCK_ZERO);
 e->offset = ret & BDRV_BLOCK_OFFSET_MASK;
+e->has_offset = !!(ret & BDRV_BLOCK_OFFSET_VALID);
 e->depth = depth;
-e->bs = file;
+if (file && e->has_offset) {
+e->has_filename = true;
+e->filename = file->filename;
+}
 return 0;
 }
 
+static inline bool entry_mergeable(const MapEntry *curr, const MapEntry *next)
+{
+if (curr->length == 0) {
+return false;
+}
+if (curr->zero != next->zero ||
+curr->data != next->data ||
+curr->depth != next->depth ||
+curr->has_filename != next->has_filename ||
+curr->has_offset != next->has_offset) {
+return false;
+}
+if (curr->has_filename && strcmp(curr->filename, next->filenam

[Qemu-devel] [PATCH v7 12/15] block: Use returned *file in bdrv_co_get_block_status

2016-01-21 Thread Fam Zheng

Now that all drivers return the right "file" pointer, we can use it.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
---
 block/io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 0836991..d704d32 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1552,13 +1552,13 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 }
 }
 
-if (bs->file &&
+if (*file && *file != bs &&
 (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
 (ret & BDRV_BLOCK_OFFSET_VALID)) {
 BlockDriverState *file2;
 int file_pnum;
 
-ret2 = bdrv_co_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
+ret2 = bdrv_co_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
 *pnum, &file_pnum, &file2);
 if (ret2 >= 0) {
 /* Ignore errors.  This is just providing extra information, it
-- 
2.4.3

[Qemu-devel] [PATCH v7 13/15] qemu-img: In "map", use the returned "file" from bdrv_get_block_status

2016-01-21 Thread Fam Zheng

Now all drivers should return a correct "file", we can make use of it,
even with the recursion into backing chain above.

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 qemu-img.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index e653b2f..c8bc63f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2236,7 +2236,7 @@ static int get_block_status(BlockDriverState *bs, int64_t 
sector_num,
 e->flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
 e->offset = ret & BDRV_BLOCK_OFFSET_MASK;
 e->depth = depth;
-e->bs = bs;
+e->bs = file;
 return 0;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 11/15] vmdk: Return extent's file in bdrv_get_block_status

2016-01-21 Thread Fam Zheng

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
---
 block/vmdk.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index e1d3e27..f8f7fcf 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1274,6 +1274,7 @@ static int64_t coroutine_fn 
vmdk_co_get_block_status(BlockDriverState *bs,
  0, 0);
 qemu_co_mutex_unlock(&s->lock);
 
+index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 switch (ret) {
 case VMDK_ERROR:
 ret = -EIO;
@@ -1286,14 +1287,15 @@ static int64_t coroutine_fn 
vmdk_co_get_block_status(BlockDriverState *bs,
 break;
 case VMDK_OK:
 ret = BDRV_BLOCK_DATA;
-if (extent->file == bs->file && !extent->compressed) {
-ret |= BDRV_BLOCK_OFFSET_VALID | offset;
+if (!extent->compressed) {
+ret |= BDRV_BLOCK_OFFSET_VALID;
+ret |= (offset + (index_in_cluster << BDRV_SECTOR_BITS))
+& BDRV_BLOCK_OFFSET_MASK;
 }
-
+*file = extent->file->bs;
 break;
 }
 
-index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 n = extent->cluster_sectors - index_in_cluster;
 if (n > nb_sectors) {
 n = nb_sectors;
-- 
2.4.3

[Qemu-devel] [PATCH v7 10/15] vpc: Assign bs->file->bs to file in vpc_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/vpc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/vpc.c b/block/vpc.c
index a070307..f504536 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -589,6 +589,7 @@ static int64_t coroutine_fn 
vpc_co_get_block_status(BlockDriverState *bs,
 
 if (be32_to_cpu(footer->type) == VHD_FIXED) {
 *pnum = nb_sectors;
+*file = bs->file->bs;
 return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
 }
@@ -610,6 +611,7 @@ static int64_t coroutine_fn 
vpc_co_get_block_status(BlockDriverState *bs,
 /* *pnum can't be greater than one block for allocated
  * sectors since there is always a bitmap in between. */
 if (allocated) {
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
 }
 if (nb_sectors == 0) {
-- 
2.4.3

[Qemu-devel] [PATCH v7 09/15] vdi: Assign bs->file->bs to file in vdi_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/vdi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/vdi.c b/block/vdi.c
index 294c438..b403243 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -551,6 +551,7 @@ static int64_t coroutine_fn 
vdi_co_get_block_status(BlockDriverState *bs,
 offset = s->header.offset_data +
   (uint64_t)bmap_entry * s->block_size +
   sector_in_block * SECTOR_SIZE;
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 04/15] raw: Assign bs to file in raw_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/raw-posix.c | 1 +
 block/raw_bsd.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 3ef9b25..8866121 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1861,6 +1861,7 @@ static int64_t coroutine_fn 
raw_co_get_block_status(BlockDriverState *bs,
 *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
 ret = BDRV_BLOCK_ZERO;
 }
+*file = bs;
 return ret | BDRV_BLOCK_OFFSET_VALID | start;
 }
 
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 9a8933b..fd355d5 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -119,6 +119,7 @@ static int64_t coroutine_fn 
raw_co_get_block_status(BlockDriverState *bs,
 BlockDriverState **file)
 {
 *pnum = nb_sectors;
+*file = bs;
 return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
 }
-- 
2.4.3

[Qemu-devel] [PATCH v7 07/15] qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qed.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index 8f6f841..404be1e 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -693,6 +693,7 @@ typedef struct {
 uint64_t pos;
 int64_t status;
 int *pnum;
+BlockDriverState **file;
 } QEDIsAllocatedCB;
 
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t 
len)
@@ -704,6 +705,7 @@ static void qed_is_allocated_cb(void *opaque, int ret, 
uint64_t offset, size_t l
 case QED_CLUSTER_FOUND:
 offset |= qed_offset_into_cluster(s, cb->pos);
 cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
+*cb->file = cb->bs->file->bs;
 break;
 case QED_CLUSTER_ZERO:
 cb->status = BDRV_BLOCK_ZERO;
@@ -735,6 +737,7 @@ static int64_t coroutine_fn 
bdrv_qed_co_get_block_status(BlockDriverState *bs,
 .pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
 .status = BDRV_BLOCK_OFFSET_MASK,
 .pnum = pnum,
+.file = file,
 };
 QEDRequest request = { .l2_table = NULL };
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 08/15] sheepdog: Assign bs to file in sd_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/sheepdog.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 2ea05a6..a0098c1 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2739,6 +2739,9 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t 
sector_num, int nb_sectors,
 if (*pnum > nb_sectors) {
 *pnum = nb_sectors;
 }
+if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
+*file = bs;
+}
 return ret;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 03/15] qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qcow2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index d4ea6b4..8babecd 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1349,6 +1349,7 @@ static int64_t coroutine_fn 
qcow2_co_get_block_status(BlockDriverState *bs,
 !s->cipher) {
 index_in_cluster = sector_num & (s->cluster_sectors - 1);
 cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+*file = bs->file->bs;
 status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
 }
 if (ret == QCOW2_CLUSTER_ZERO) {
-- 
2.4.3

[Qemu-devel] [PATCH v7 02/15] qcow: Assign bs->file->bs to file in qcow_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qcow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow.c b/block/qcow.c
index 4202797..251910c 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -510,6 +510,7 @@ static int64_t coroutine_fn 
qcow_co_get_block_status(BlockDriverState *bs,
 return BDRV_BLOCK_DATA;
 }
 cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 06/15] parallels: Assign bs->file->bs to file in parallels_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/parallels.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/parallels.c b/block/parallels.c
index e2de308..645521d 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -274,6 +274,7 @@ static int64_t coroutine_fn 
parallels_co_get_block_status(BlockDriverState *bs,
 return 0;
 }
 
+*file = bs->file->bs;
 return (offset << BDRV_SECTOR_BITS) |
 BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
 }
-- 
2.4.3

[Qemu-devel] [PATCH v7 01/15] block: Add "file" output parameter to block status query functions

2016-01-21 Thread Fam Zheng

The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.

Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.

The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.

Signed-off-by: Fam Zheng 
---
 block/io.c| 38 ++
 block/iscsi.c |  6 --
 block/mirror.c|  3 ++-
 block/parallels.c |  2 +-
 block/qcow.c  |  2 +-
 block/qcow2.c |  2 +-
 block/qed.c   |  3 ++-
 block/raw-posix.c |  3 ++-
 block/raw_bsd.c   |  3 ++-
 block/sheepdog.c  |  2 +-
 block/vdi.c   |  2 +-
 block/vmdk.c  |  2 +-
 block/vpc.c   |  2 +-
 block/vvfat.c |  2 +-
 include/block/block.h | 11 +++
 include/block/block_int.h |  3 ++-
 qemu-img.c| 13 +
 17 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/block/io.c b/block/io.c
index 5bb353a..0836991 100644
--- a/block/io.c
+++ b/block/io.c
@@ -664,6 +664,7 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t 
sector_num,
 int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
 {
 int64_t target_sectors, ret, nb_sectors, sector_num = 0;
+BlockDriverState *file;
 int n;
 
 target_sectors = bdrv_nb_sectors(bs);
@@ -676,7 +677,7 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags 
flags)
 if (nb_sectors <= 0) {
 return 0;
 }
-ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
+ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, &file);
 if (ret < 0) {
 error_report("error getting block status at sector %" PRId64 ": 
%s",
  sector_num, strerror(-ret));
@@ -1466,6 +1467,7 @@ int bdrv_flush_all(void)
 typedef struct BdrvCoGetBlockStatusData {
 BlockDriverState *bs;
 BlockDriverState *base;
+BlockDriverState **file;
 int64_t sector_num;
 int nb_sectors;
 int *pnum;
@@ -1490,7 +1492,8 @@ typedef struct BdrvCoGetBlockStatusData {
  */
 static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
  int64_t sector_num,
- int nb_sectors, int *pnum)
+ int nb_sectors, int *pnum,
+ BlockDriverState **file)
 {
 int64_t total_sectors;
 int64_t n;
@@ -1520,16 +1523,19 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 return ret;
 }
 
-ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
+*file = NULL;
+ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum,
+file);
 if (ret < 0) {
 *pnum = 0;
+*file = NULL;
 return ret;
 }
 
 if (ret & BDRV_BLOCK_RAW) {
 assert(ret & BDRV_BLOCK_OFFSET_VALID);
 return bdrv_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
- *pnum, pnum);
+ *pnum, pnum, file);
 }
 
 if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
@@ -1549,10 +1555,11 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 if (bs->file &&
 (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
 (ret & BDRV_BLOCK_OFFSET_VALID)) {
+BlockDriverState *file2;
 int file_pnum;
 
 ret2 = bdrv_co_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
-*pnum, &file_pnum);
+*pnum, &file_pnum, &file2);
 if (ret2 >= 0) {
 /* Ignore errors.  This is just providing extra information, it
  * is useful but not necessary.
@@ -1577,14 +1584,15 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status_above(BlockDriverState *bs,
 BlockDriverState *base,
 int64_t sector_num,
 int nb_sectors,
-int *pnum)
+int *pnum,
+BlockDriverState **file)
 {
 BlockDriverState *p;
 int64_t ret = 0;
 
 assert(bs != base);
 for (p = bs; p != base; p = backing_bs(p)) {
-ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);
+ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
 if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
 break;
 }
@@ -1603,7 +1611,8 @@ static void coroutine_fn 
bdrv_get_block_status_above_co_entry(void *opaque)
 data->ret = bdrv_co_ge

[Qemu-devel] [PATCH v7 05/15] iscsi: Assign bs to file in iscsi_co_get_block_status

2016-01-21 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/iscsi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index e182557..9fe76f4 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -625,6 +625,9 @@ out:
 if (iTask.task != NULL) {
 scsi_free_scsi_task(iTask.task);
 }
+if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
+*file = bs;
+}
 return ret;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v7 00/15] qemu-img map: Allow driver to return file of the allocated block

2016-01-21 Thread Fam Zheng

v7: Rebase, update patch 1 for two new bdrv_get_block_status_above() callers in
qemu-img.c. [Max]
Add Max's rev-by in patch 12.

Original cover letter
-

I stumbled upon this when looking at external bitmap formats.

Current "qemu-img map" command only displays filename if the data is allocated
in bs (bs->file) itself, or in the backing chain. Otherwise, it displays an
unfriendly error message:

$ qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/test.vmdk 1G

$ qemu-img map /tmp/test.vmdk
Offset  Length  Mapped to   File
qemu-img: File contains external, encrypted or compressed clusters.

This can be improved. This series extends the .bdrv_co_get_block_status
callback, to let block driver return the BDS of file; then updates all driver
to implement it; and lastly, it changes qemu-img to use this information in
"map" command:


$ qemu-img map /tmp/test.vmdk
Offset  Length  Mapped to   File
0   0x4000  0   /tmp/test-flat.vmdk

$ qemu-img map --output json /tmp/test.vmdk
[{"length": 1073741824, "start": 0, "zero": false, "offset": 0, "depth": 0,
  "file": "/tmp/test-flat.vmdk", "data": true}
]


Fam Zheng (15):
  block: Add "file" output parameter to block status query functions
  qcow: Assign bs->file->bs to file in qcow_co_get_block_status
  qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
  raw: Assign bs to file in raw_co_get_block_status
  iscsi: Assign bs to file in iscsi_co_get_block_status
  parallels: Assign bs->file->bs to file in
parallels_co_get_block_status
  qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
  sheepdog: Assign bs to file in sd_co_get_block_status
  vdi: Assign bs->file->bs to file in vdi_co_get_block_status
  vpc: Assign bs->file->bs to file in vpc_co_get_block_status
  vmdk: Return extent's file in bdrv_get_block_status
  block: Use returned *file in bdrv_co_get_block_status
  qemu-img: In "map", use the returned "file" from bdrv_get_block_status
  qemu-img: Make MapEntry a QAPI struct
  iotests: Add "qemu-img map" test for VMDK extents

 block/io.c | 42 +++
 block/iscsi.c  |  9 +++--
 block/mirror.c |  3 +-
 block/parallels.c  |  3 +-
 block/qcow.c   |  3 +-
 block/qcow2.c  |  3 +-
 block/qed.c|  6 +++-
 block/raw-posix.c  |  4 ++-
 block/raw_bsd.c|  4 ++-
 block/sheepdog.c   |  5 ++-
 block/vdi.c|  3 +-
 block/vmdk.c   | 12 ---
 block/vpc.c|  4 ++-
 block/vvfat.c  |  2 +-
 include/block/block.h  | 11 +++---
 include/block/block_int.h  |  3 +-
 qapi/block-core.json   | 27 +++
 qemu-img.c | 84 --
 tests/qemu-iotests/059 | 10 ++
 tests/qemu-iotests/059.out | 38 +
 20 files changed, 206 insertions(+), 70 deletions(-)

-- 
2.4.3

Re: [Qemu-devel] COLO hybrid mode and other changes

2016-01-21 Thread Hailiang Zhang


Nice work :)

On 2016/1/22 3:08, Dr. David Alan Gilbert wrote:

Hi,
   I've got a world with a few COLO changes in that you might
like to look at; they're all experimental, but it seems
to be working for me.

   The world is at:
 https://github.com/orbitfp7/qemu/commits/orbit-wp4-colo-jan16

It's based off:
 The December/2.4 colo framework world with periodic checkpoints
 + The december user space colo-proxy
 + The integration code that was in the colo-proxy git repo
 (those needed some updates to go into the 2.4-periodic world)

And then I've added:
- An updated version of my hybrid mode code that switches between
  COLO and periodic checkpoints depending on the frequency of
  miscompares
- A minimum checkpoint time, so that on fast miscompares
  we do get some runtime (as in the earlier colo series)
- I'm using a condition-variable to flag the miscompares from
  the colo-proxy to the colo thread; this seems cleaner and it
  avoids having to do short waits;  it might be a bit tricky
  to make portable.
- Add back some statistcs in 'info migrate'
- An HMP command equivalent to x-blockdev-change.
- RDMA transport for COLO

I don't intend to post any of these changes in full to the list
until your main COLO code is in; although if any of my changes
are useful to you, then feel free to include them in sets you post.



Thanks for your effort. I will look at them later.

Hailiang


Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

.

Re: [Qemu-devel] [PATCH v6 01/15] block: Add "file" output parameter to block status query functions

2016-01-21 Thread Fam Zheng

On Thu, 01/21 16:57, Max Reitz wrote:
> On 08.01.2016 03:08, Fam Zheng wrote:
> > The added parameter can be used to return the BDS pointer which the
> > valid offset is referring to. Its value should be ignored unless
> > BDRV_BLOCK_OFFSET_VALID in ret is set.
> > 
> > Until block drivers fill in the right value, let's clear it explicitly
> > right before calling .bdrv_get_block_status.
> > 
> > The "bs->file" condition in bdrv_co_get_block_status is kept now to keep 
> > iotest
> > case 102 passing, and will be fixed once all drivers return the right file
> > pointer.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/io.c| 38 ++
> >  block/iscsi.c |  6 --
> >  block/mirror.c|  3 ++-
> >  block/parallels.c |  2 +-
> >  block/qcow.c  |  2 +-
> >  block/qcow2.c |  2 +-
> >  block/qed.c   |  3 ++-
> >  block/raw-posix.c |  3 ++-
> >  block/raw_bsd.c   |  3 ++-
> >  block/sheepdog.c  |  2 +-
> >  block/vdi.c   |  2 +-
> >  block/vmdk.c  |  2 +-
> >  block/vpc.c   |  2 +-
> >  block/vvfat.c |  2 +-
> >  include/block/block.h | 11 +++
> >  include/block/block_int.h |  3 ++-
> >  qemu-img.c|  7 +--
> >  17 files changed, 60 insertions(+), 33 deletions(-)
> 
> Patch itself is good, but the two bdrv_get_block_status_above() calls
> you added in qemu-img.c in 25ad8e6e need to be amended now, too.

Sure, thanks, will rebase and resend.

Re: [Qemu-devel] [PATCH 1/2] hmp: add hmp command for incremental backup

2016-01-21 Thread 张敏

On 16/1/22 上午12:39, Eric Blake wrote:
> On 01/21/2016 04:22 AM, Rudy Zhang wrote:
>> Add hmp command for incremental backup in drive-backup.
>> It need a bitmap to backup data from drive-image to incremental image,
>> so before it need add bitmap for this device to track io.
>> Usage:
>> drive_backup [-n] [-f] device target [bitmap] [format]
>>
>> Signed-off-by: Rudy Zhang 
>> ---
>>   hmp-commands.hx |  5 +++--
>>   hmp.c   | 16 ++--
>>   2 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index bb52e4d..7378aaa 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -1180,12 +1180,13 @@ ETEXI
>>
>>   {
>>   .name   = "drive_backup",
>> -.args_type  = "reuse:-n,full:-f,device:B,target:s,format:s?",
>> -.params = "[-n] [-f] device target [format]",
>> +.args_type  = 
>> "reuse:-n,full:-f,device:B,target:s,bitmap:s?,format:s?",
>> +.params = "[-n] [-f] device target [bitmap] [format]",
> This is HMP, so it may not matter, but this is not backwards compatible.
>   Scripts targetting the old style of passing a format will now have that
> format string interpreted as a bitmap name with no format.  Better would
> be to stick [bitmap] at the end, not the middle.

But I have a question: If I don't want to input a 'format', only use 'bitmap',
it will let 'bitmap' as 'format', This problem how to do it.

>
>> @@ -1098,6 +1100,17 @@ void hmp_drive_backup(Monitor *mon, const QDict 
>> *qdict)
>>   return;
>>   }
>>
>> +if (full && bitmap) {
>> +error_setg(&err, "Parameter 'bitmap' if conflict with '-f'");
> s/if conflict/conflicts/

oh，made a mistake.

>> +hmp_handle_error(mon, &err);
>> +return;
>> +} else if (full)
>> +sync = MIRROR_SYNC_MODE_FULL;
> Needs {}.  Run your patch through scripts/checkpatch.pl, to flag this
> and other style violations.

I have checked these patches，but I ignored these warnings.

Re: [Qemu-devel] RFC: running the user interface in a thread ...

2016-01-21 Thread Dave Airlie

On 21 January 2016 at 19:05, Paolo Bonzini  wrote:
>
>
> On 21/01/2016 09:44, Dave Airlie wrote:
>> I've hacked on this before, but only with SDL and it was pretty dirty,
>> and it gave a fairly decent
>> speed up.
>>
>> My thoughts are to use dataplane like design to process the queue in a
>> separate thread,
>> and then have some sort of channel between the UI and virtio-gpu
>> thread to handle things like
>> screen resize etc.
>>
>> http://cgit.freedesktop.org/~airlied/qemu/commit/?h=dave3d&id=fe22a0955255afef12b947c4a91efa57e7a7c429
>> is my previous horror patch.
>
> Instead of having a full-blown thread, are there things (such as the
> TGSI->GLSL conversion) that could be simply offloaded to a userspace
> thread pool, either in QEMU or in virglrenderer?

Not really, the TGSI->GLSL overhead is quite minor, the problem is the
produced GLSL then gets passed to
OpenGL to compile, and depending on the quality of the original
program and my conversion code, the
GLSL compiler can get quite upset.

In theory Mesa could help here, but GL isn't thread friendly at all,
so it probably won't help in the virgl
case even if it did. Since most GL apps compile a shader and block on
using it straight away doing it
in a thread won't help unblock things.

So I think it would be best to have all the virgl vq processing happen
in it's own thread with some API
to the UI to do UI resizes (the most difficult) and dirty regions etc.

Dave.

Re: [Qemu-devel] [PATCH] virt-acpi-build: add always-on property for timer

2016-01-21 Thread Shannon Zhao



On 2016/1/21 20:54, Andrew Jones wrote:
> This patch is the ACPI equivalent of "hw/arm/virt: Add always-on
> property to the virt board timer". The timer is always on, and
> thus setting this informs Linux that it may switch off the periodic
> timer. Switching off the periodic timer substantially reduces the
> number of interrupts the host needs to inject.
> 
> Testing note: AArch64 guests (the only ones currently booting with
> ACPI) do not actually need this patch to determine it can turn the
> periodic timer off. I therefore used a hacked guest kernel to ensure
> this patch works as the equivalent DT patch does.
> 
Reviewed-by: Shannon Zhao 

> Signed-off-by: Andrew Jones 
> ---
>  hw/arm/virt-acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 87fbe7c97d99b..f6e538f3d02ea 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -443,7 +443,7 @@ build_gtdt(GArray *table_data, GArray *linker)
>  gtdt->secure_el1_flags = ACPI_EDGE_SENSITIVE;
>  
>  gtdt->non_secure_el1_interrupt = ARCH_TIMER_NS_EL1_IRQ + 16;
> -gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
> +gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;
>  
>  gtdt->virtual_timer_interrupt = ARCH_TIMER_VIRT_IRQ + 16;
>  gtdt->virtual_timer_flags = ACPI_EDGE_SENSITIVE;
> 

-- 
Shannon

Re: [Qemu-devel] [PATCH v9 21/37] qapi: Document visitor interfaces, add assertions

2016-01-21 Thread Eric Blake

On 01/21/2016 01:08 PM, Markus Armbruster wrote:
> All right, this one's a bear.  Not because the patch is bad, but because
> what it tries to do is bloody difficult.

Is there any reasonable split (such as adding some of the assertions in
earlier patches) that would make it any easier? Or do we just bite the
bullet and do it?

> 
> Eric Blake  writes:
> 
>> The visitor interface for mapping between QObject/QemuOpts/string
>> and qapi has formerly been documented only by reading source code,
> 
> Polite way to say "is scandalously undocumented".

Indeed.

> 
>> making it difficult to propose changes to either scripts/qapi*.py
>> or to clients without knowing whether those changes would be safe.
>> This adds documentation, including mentioning when parameters can
>> be NULL, and where there are still some interface warts that would
>> be nice to remove.  In particular, I have plans to remove
>> visit_start_union() in a future patch.
> 
> Suggest
> 
> The visitor interface for mapping between QObject/QemuOpts/string
> and QAPI is pretty much undocumented, making changes to visitor
> core, individual visitors, and users of visitors difficult.
> 
> Correct this by retrofitting proper contracts.  Document some
> interface warts that would be nice to remove.  In particular, I have
> plans to remove visit_start_union() in a future patch.

Your suggestions here, and elsewhere, are good and will be in my next
spin.  I'll trim to just the places where you have more than just a
wording suggestion.

>>  include/qapi/visitor-impl.h |  31 ++-
>>  include/qapi/visitor.h  | 196 
>> 
>>  qapi/qapi-visit-core.c  |  39 -
>>  3 files changed, 262 insertions(+), 4 deletions(-)
>>
> 
> My review probably makes more sense if you skip ahead to visitor.h, then
> come back here.

If I remember, I'll use -O when generating v10 to force visitor.h first,
other .h second, and .c last (I don't always remember to do it; maybe I
should add it into my handy .git alias that I use for firing off long
series).  [I really wish the git people would make it possible to
automate -O via 'git config', and to make it easier to have a
per-project preferred order file]

> 
>> +
>>  struct Visitor
>>  {
>> -/* Must be set */
>> +/* Must be provided to visit structs (the string visitors do not
>> + * currently visit structs). */
> 
> Uh, the string visitors don't decide what gets visited, their users do.
> The string visitors don't support visiting structs.  The restriction
> needs to be documented, but this isn't the place to do it, is it?

Good point.  I think what I will do is split out a separate patch that
documents, per visitor with limitation, what callers cannot (yet) do
with that visitor.

>> +/* May be NULL; most useful for input visitors. */
> 
> "Optional" would be a bit terser than "May be NULL".
> 
> Why is it "most useful for input visitors"?  For what it's worth, the
> dealloc visitor finds it useful, too...

and a later patch adds it to the QemuOpts input visitor (Zoltan's patch
has been sitting for how many months now?).  I'll come up with something.

> 
>>  void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
>>Error **errp);
>>  /* May be NULL */
>>  void (*end_implicit_struct)(Visitor *v);
>>
>> +/* Must be set */
>>  void (*start_list)(Visitor *v, const char *name, Error **errp);
>> +/* Must be set */
>>  GenericList *(*next_list)(Visitor *v, GenericList **list, Error **errp);
>>  /* Must be set */
>>  void (*end_list)(Visitor *v);
> 
> A visitor could omit these two with similar consequences to omitting
> start_struct() and end_struct(): attempts to visit lists crash then.  In
> fact, the string visitors omitted them until commit 659268f and 69e2556,
> respectively.

Which is why, as you pointed out, it may be better to document the
limitations in the string visitor rather than here, and in this file
maybe just mention at the top something along the lines that "must be
set" really means that "only needs to be set if your callers are
expecting a visit to encounter this type; the corresponding crash on
calling NULL is your hint to write missing functionality in your visitor".

>>  /* May be NULL; only needed for input visitors. */
>void (*get_next_type)(Visitor *v, const char *name, QType *type,
>  bool promote_int, Error **errp);
> 
> I figure it must be set for input visitors, because without it, the
> generated visit function will assume QTYPE_NONE, and fail.
> 
> Suggest "Must be set for input visitors."

Correct, and nice wording.

> 
> Looks like we say "must be set to visit this" when at least one visitor
> doesn't provide, and "must be set" when all our current visitors
> provide.  Hmm.
> 
>>  void (*type_any)(Visitor *v, const char *name, QObject **obj,
>>   Error **errp);

[Qemu-devel] [PATCH v2] .travis.yml: migrate to container builds

2016-01-21 Thread Alex Bennée

This moves the Travis tests from the legacy VM infrastructure (which
only seems to run 5-6 jobs at once) to the new container based approach.

The principle difference is there is no sudo in the containers so all
packages are installed using the apt add-on. This means one of the build
combinations can be dropped as it was only for checking the build with
additional packages.

Signed-off-by: Alex Bennée 

--
v2
  - re-enabled the user-space-tracing build
  - disable optional extras (instead of having optional build)
  - remove stray linux-user target from make check target
---
 .travis.yml | 44 
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 6ca0260..4a0c23a 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,9 +1,37 @@
+sudo: false
 language: c
 python:
   - "2.4"
 compiler:
   - gcc
   - clang
+addons:
+  apt:
+packages:
+  - libaio-dev
+  - libattr1-dev
+  - libbrlapi-dev
+  - libcap-ng-dev
+  - libgnutls-dev
+  - libgtk-3-dev
+  - libiscsi-dev
+  - liblttng-ust-dev
+  - libncurses5-dev
+  - libnss3-dev
+  - libpixman-1-dev
+  - libpng12-dev
+  - librados-dev
+  - libsdl1.2-dev
+  - libseccomp-dev
+  - libspice-protocol-dev
+  - libspice-server-dev
+  - libssh2-1-dev
+  - liburcu-dev
+  - libusb-1.0-0-dev
+  - libvte-2.90-dev
+  - sparse
+  - uuid-dev
+
 notifications:
   irc:
 channels:
@@ -14,11 +42,6 @@ env:
   global:
 - TEST_CMD=""
 - EXTRA_CONFIG=""
-# Development packages, EXTRA_PKGS saved for additional builds
-- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
-- NET_PKGS="libseccomp-dev libgnutls-dev libssh2-1-dev  
libspice-server-dev libspice-protocol-dev libnss3-dev"
-- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev 
libpixman-1-dev"
-- EXTRA_PKGS=""
   matrix:
 # Group major targets together with their linux-user counterparts
 - TARGETS=alpha-softmmu,alpha-linux-user
@@ -43,8 +66,6 @@ git:
 before_install:
   - wget -O - 
http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
   - git submodule update --init --recursive
-  - sudo apt-get update -qq
-  - sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
 before_script:
   - ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
 script:
@@ -54,7 +75,7 @@ matrix:
   include:
 # Make check target (we only do this once)
 - env:
-- 
TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,microblazeel-softmmu,mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,unicore32-softmmu,unicore32-linux-user,lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
+- 
TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,microblazeel-softmmu,mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,unicore32-softmmu,lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
   TEST_CMD="make check"
   compiler: gcc
 # Debug related options
@@ -64,16 +85,16 @@ matrix:
 - env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
   compiler: gcc
-# All the extra -dev packages
+# Disable a few of the optional features
 - env: TARGETS=i386-softmmu,x86_64-softmmu
-   EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev 
uuid-dev libusb-1.0.0-dev"
+   EXTRA_CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr 
--disable-brlapi --disable-uuid --disable-libusb"
   compiler: gcc
 # Currently configure doesn't force --disable-pie
 - env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
   compiler: gcc
+# Sparse
 - env: TARGETS=i386-softmmu,x86_64-softmmu
-   EXTRA_PKGS="sparse"
EXTRA_CONFIG="--enable-sparse"
   compiler: gcc
 # All the trace backends (apart from dtrace)
@@ -87,7 +108,6 @@ matrix:
EXTRA_CONFIG="--enable-trace-backends=ftrace"
   compiler: gcc
 - env: TARGETS=i386-softmmu,x86_64-softmmu
-  EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
   EXTRA_CONFIG="--enable-trace-backends=ust"
   compiler: gcc
 - env: TARGETS=i386-softmmu,x86_64-softmmu
-- 
2.6.4

Re: [Qemu-devel] [PATCH 09/10] virtio: read avail_idx from VQ only when necessary

2016-01-21 Thread Vincenzo Maffione

2016-01-19 19:48 GMT+01:00 Paolo Bonzini :
>
>
> On 19/01/2016 17:54, Michael S. Tsirkin wrote:
>> On Fri, Jan 15, 2016 at 01:41:57PM +0100, Paolo Bonzini wrote:
>>> From: Vincenzo Maffione 
>>>
>>> The virtqueue_pop() implementation needs to check if the avail ring
>>> contains some pending buffers. To perform this check, it is not
>>> always necessary to fetch the avail_idx in the VQ memory, which is
>>> expensive. This patch introduces a shadow variable tracking avail_idx
>>> and modifies virtio_queue_empty() to access avail_idx in physical
>>> memory only when necessary.
>>>
>>> Signed-off-by: Vincenzo Maffione 
>>> Message-Id: 
>>> 
>>> Signed-off-by: Paolo Bonzini 
>>
>> Is the cost due to the page walk?
>
> Yes, as with all the other patches.  But unlike patches 7 and 10 where
> we just reduce the number of walks, for patch 8 and 9 it's difficult to
> beat a local cache. :)
>
>>> @@ -1579,6 +1595,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
>>> version_id)
>>>  return -1;
>>>  }
>>>  vdev->vq[i].used_idx = vring_used_idx(&vdev->vq[i]);
>>> +vdev->vq[i].shadow_avail_idx = vring_avail_idx(&vdev->vq[i]);
>>>  }
>>>  }
>>
>>
>> shadow_avail_idx also should be updated on vhost stop,
>
> That's virtio_queue_set_last_avail_idx, right?
>
> Paolo

Right. Sorry, I missed that.

Vincenzo



-- 
Vincenzo Maffione

Re: [Qemu-devel] [PATCH v4 10/12] fdc: rework pick_geometry

2016-01-21 Thread Eric Blake

On 01/21/2016 01:14 PM, John Snow wrote:
> 
> 
> On 01/20/2016 06:45 PM, Eric Blake wrote:
>> On 01/19/2016 11:51 PM, John Snow wrote:
>>> This one is the crazy one.
>>>
>>> fd_revalidate currently uses pick_geometry to tell if the diskette
>>> geometry has changed upon an eject/insert event, but it won't allow us
>>> to insert a 1.44MB diskette into a 2.88MB drive. This is inflexible.
>>>
>>> The new algorithm applies a new heuristic to guessing disk geometries
>>> that allows us to switch diskette types as long as the physical size
>>> matches before falling back to the old heuristic.
>>>
>>> The old one is roughly:
>>>  - If the size (sectors) and type matches, choose it.
>>>  - Fall back to the first geometry that matched our type.
>>>
> 
> For sake of review, we'll call these choices (A) and (B).
> 
>>> The new one is:
>>>  - If the size (sectors) and type matches, choose it.
>>>  - If the size (sectors) and physical size match, choose it.
>>>  - If the size (sectors) matches at all, choose it begrudgingly.
>>
>> This is the one I worry about; details below.
>>
>>>  - Fall back to the first geometry that matched our type.
>>>
> 
> And these choices will be (1) through (4) as they are annotated in the
> comment below in this way as well.
> 
> (1) is identical to (A), and (4) is identical to (B).

Agreed.

> That leaves (2) and (3) are new behaviors.
> 
> There is strong rationale for (2).
> (3) was invented as a last-ditch effort before (4), see below!
> 

>>
>> Thus, I think you still have a bug. :(
>>
> 
> It's definitely new behavior. In either case, I think we can agree that
> QEMU is choosing a garbage format as a last-ditch effort to see if
> something works.
> 
> The old behavior of "Choose a wrong geometry of the right type as a
> last-chance effort" was something I thought that maybe I could "improve"
> by doing:
> 
> "Choose a wrong geometry of the right type as a last-chance effort,
> unless we found something that is the correct number of sectors, in
> which case maybe use that as a backup strategy because it might work
> better."

I don't think the geometry for a 120 row is going to work right for a
guest expecting 144/288 behavior.

> 
> It is decidedly "new" behavior, but it is similarly to the old strategy
> a recourse action when faced with the absence of a more precise match
> (right size and type (1) or right size and correct physical size (2))
> 
> The thought was basically: "I bet the floppy drive code would cope
> better by setting a geometry that is at the very least the right number
> of sectors over one that's clearly wrong."

I think the point of this function is:

If the user told us a disk type and gave us an image with an exact
number of sectors, use the geometry associated with that size.  Behavior
(2) refines this to further pick sizes that are compatible with the
current disk type (with a 288 drive, the exact geometry for 2880 sectors
visiting a 144 disk is probably better than the behavior (B) choice for
the first 288 disk).

If the user did not tell us a disk type, favor a geometry that matches
the number of sectors.  If the size can occur from more than one disk
type, we used to favor the disk type that occurred first in the list
(2880 sectors maps to 3.5", not 5.25"; while 720 sectors favors 5.25");
due to behavior (2) the new code may instead favor the disk size that
matches the default fallback of "auto" (if auto falls back to 144 or
288, 720 sectors will now select that size rather than 5.25").

If the user told us a disk type but we don't have a sector match, then
pick the first geometry associated with that disk type (hopefully the
host image is smaller than that choice, and we pad out the host file to
pretend that it matches the most common use of that disk type from the
guest's view).

Finally, the user told us nothing and we have no size heuristic to fall
back on, so pick a (hopefully sane) default of the first table entry
(old behavior) or the "auto" fallback (new behavior).

> 
>>>  }
>>> -if (first_match == -1) {
>>> -first_match = i;
>>> +} else if (type_match == -1) {
>>
>> Here, we are visiting a row where sectors doesn't match.
>>
> 
> Yes, the old "hail mary" behavior present from 2003-2016, what I
> annotated as type (B) in my reply above. Now documented as behavior (4).
> 
>>> +if ((parse->drive == drv->drive) ||
>>> +(magic && (parse->drive == get_fallback_drive_type(drv 
>>> {
>>> +/* (4) type matches, or type matches the autodetect 
>>> default if
>>> + * we are using the autodetect mechanism. */
>>> +type_match = i;
>>
>> In which case, if the user told us a size, or the default for the "auto"
>> size matches, we set 'type_match' (in effect, picking the first
>> 144/288/120 row that matches their explicit or default type).
>>
> 
> Right, this is the old behavior. "We couldn't find the right geometry,
> so we're just

Re: [Qemu-devel] [PATCH 2/2] migration/virtio: Remove simple .get/.put use

2016-01-21 Thread Sascha Silbe

Dear David,

"Dr. David Alan Gilbert"  writes:

> 
> 

Interesting, didn't know about --global yet. Thanks for posting the
example. That will come in handy for my tests.

> Thanks I've reused a chunk of that;  I'll post the fix soon.
> Thanks for your help on this.

Thanks, looking forward to the next version of your patch. Will use the
the previous one in the meantime as a local work-around.

Sascha
-- 
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641

Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] hmp: add hmp command for incremental backup

2016-01-21 Thread John Snow



On 01/21/2016 11:39 AM, Eric Blake wrote:
> On 01/21/2016 04:22 AM, Rudy Zhang wrote:
>> Add hmp command for incremental backup in drive-backup.
>> It need a bitmap to backup data from drive-image to incremental image,
>> so before it need add bitmap for this device to track io.
>> Usage:
>>  drive_backup [-n] [-f] device target [bitmap] [format]
>>
>> Signed-off-by: Rudy Zhang 
>> ---
>>  hmp-commands.hx |  5 +++--
>>  hmp.c   | 16 ++--
>>  2 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>> index bb52e4d..7378aaa 100644
>> --- a/hmp-commands.hx
>> +++ b/hmp-commands.hx
>> @@ -1180,12 +1180,13 @@ ETEXI
>>  
>>  {
>>  .name   = "drive_backup",
>> -.args_type  = "reuse:-n,full:-f,device:B,target:s,format:s?",
>> -.params = "[-n] [-f] device target [format]",
>> +.args_type  = 
>> "reuse:-n,full:-f,device:B,target:s,bitmap:s?,format:s?",
>> +.params = "[-n] [-f] device target [bitmap] [format]",
> 
> This is HMP, so it may not matter, but this is not backwards compatible.
>  Scripts targetting the old style of passing a format will now have that
> format string interpreted as a bitmap name with no format.  Better would
> be to stick [bitmap] at the end, not the middle.
> 

I'd like to also point out that the method of intuiting the backup mode
based on the parameters present won't expand very well as we add more
modes, especially if there's ever an addition for a differential backup
mode.

Maybe it's fine because it's HMP and backwards compatibility is not a
real concern ...

> 
>> @@ -1098,6 +1100,17 @@ void hmp_drive_backup(Monitor *mon, const QDict 
>> *qdict)
>>  return;
>>  }
>>  
>> +if (full && bitmap) {
>> +error_setg(&err, "Parameter 'bitmap' if conflict with '-f'");
> 
> s/if conflict/conflicts/
> 
>> +hmp_handle_error(mon, &err);
>> +return;
>> +} else if (full)
>> +sync = MIRROR_SYNC_MODE_FULL;
> 
> Needs {}.  Run your patch through scripts/checkpatch.pl, to flag this
> and other style violations.
> 

-- 
—js

[Qemu-devel] [PATCH qemu] migration/vmstate: document VMStateFlags

2016-01-21 Thread Sascha Silbe

The VMState API is rather sparsely documented. Start by describing the
meaning of all VMStateFlags.

Signed-off-by: Sascha Silbe 
---
Since I had to dive into the code for debugging the migration breakage
anyway, I figured I could just as well write down the result in the
form of comments so the next one trying to make some sense out of it
will have a better start.

This is based on my understanding of the code, not necessarily what
the original author(s) intended.

 include/migration/vmstate.h | 93 -
 1 file changed, 84 insertions(+), 9 deletions(-)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 97d44d3..b1bbf68 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -88,20 +88,95 @@ struct VMStateInfo {
 };
 
 enum VMStateFlags {
+/* Ignored */
 VMS_SINGLE   = 0x001,
+
+/* The struct member at opaque + VMStateField.offset is a pointer
+ * to the actual field (e.g. struct a { uint8_t *b;
+ * }). Dereference the pointer before using it as basis for
+ * further pointer arithmetic (see e.g. VMS_ARRAY). Does not
+ * affect the meaning of VMStateField.num_offset or
+ * VMStateField.size_offset; see VMS_VARRAY* and VMS_VBUFFER for
+ * those. */
 VMS_POINTER  = 0x002,
+
+/* The field is an array of fixed size. VMStateField.num contains
+ * the number of entries in the array. The size of each entry is
+ * given by VMStateField.size and / or opaque +
+ * VMStateField.size_offset; see VMS_VBUFFER and
+ * VMS_MULTIPLY. Each array entry will be processed individually
+ * (VMStateField.info.get()/put() if VMS_STRUCT is not set,
+ * recursion into VMStateField.vmsd if VMS_STRUCT is set). May not
+ * be combined with VMS_VARRAY*. */
 VMS_ARRAY= 0x004,
+
+/* The field is itself a struct, containing one or more
+ * fields. Recurse into VMStateField.vmsd. Most useful in
+ * combination with VMS_ARRAY / VMS_VARRAY*, recursing into each
+ * array entry. */
 VMS_STRUCT   = 0x008,
-VMS_VARRAY_INT32 = 0x010,  /* Array with size in int32_t field*/
-VMS_BUFFER   = 0x020,  /* static sized buffer */
+
+/* The field is an array of variable size. The int32_t at opaque +
+ * VMStateField.num_offset contains the number of entries in the
+ * array. See the VMS_ARRAY description regarding array handling
+ * in general. May not be combined with VMS_ARRAY or any other
+ * VMS_VARRAY*. */
+VMS_VARRAY_INT32 = 0x010,
+
+/* Ignored */
+VMS_BUFFER   = 0x020,
+
+/* The field is a (fixed-size or variable-size) array of pointers
+ * (e.g. struct a { uint8_t *b[]; }). Dereference each array entry
+ * before using it. Note: Does not imply any one of VMS_ARRAY /
+ * VMS_VARRAY*; these need to be set explicitly. */
 VMS_ARRAY_OF_POINTER = 0x040,
-VMS_VARRAY_UINT16= 0x080,  /* Array with size in uint16_t field */
-VMS_VBUFFER  = 0x100,  /* Buffer with size in int32_t field */
-VMS_MULTIPLY = 0x200,  /* multiply "size" field by field_size */
-VMS_VARRAY_UINT8 = 0x400,  /* Array with size in uint8_t field*/
-VMS_VARRAY_UINT32= 0x800,  /* Array with size in uint32_t field*/
-VMS_MUST_EXIST   = 0x1000, /* Field must exist in input */
-VMS_ALLOC= 0x2000, /* Alloc a buffer on the destination */
+
+/* The field is an array of variable size. The uint16_t at opaque
+ * + VMStateField.num_offset contains the number of entries in the
+ * array. See the VMS_ARRAY description regarding array handling
+ * in general. May not be combined with VMS_ARRAY or any other
+ * VMS_VARRAY*. */
+VMS_VARRAY_UINT16= 0x080,
+
+/* The size of the individual entries (a single array entry if
+ * VMS_ARRAY or VMS_VARRAY are set, or the field itself if neither
+ * is set) is variable (i.e. not known at compile-time), but the
+ * same for all entries. Use the int32_t at opaque +
+ * VMStateField.size_offset (subject to VMS_MULTIPLY) to determine
+ * the size of each (and every) entry. */
+VMS_VBUFFER  = 0x100,
+
+/* Multiply the entry size given by the int32_t at opaque +
+ * VMStateField.size_offset (see VMS_VBUFFER description) with
+ * VMStateField.size to determine the number of bytes to be
+ * allocated. Only valid in combination with VMS_VBUFFER. */
+VMS_MULTIPLY = 0x200,
+
+/* The field is an array of variable size. The uint8_t at opaque +
+ * VMStateField.num_offset contains the number of entries in the
+ * array. See the VMS_ARRAY description regarding array handling
+ * in general. May not be combined with VMS_ARRAY or any other
+ * VMS_VARRAY*. */
+VMS_VARRAY_UINT8 = 0x400,
+
+/* The field is an array of variable size. The uint32_t at opaque
+ * + VMStateField.num_offset

Re: [Qemu-devel] [PATCH v4 10/12] fdc: rework pick_geometry

2016-01-21 Thread John Snow



On 01/20/2016 06:45 PM, Eric Blake wrote:
> On 01/19/2016 11:51 PM, John Snow wrote:
>> This one is the crazy one.
>>
>> fd_revalidate currently uses pick_geometry to tell if the diskette
>> geometry has changed upon an eject/insert event, but it won't allow us
>> to insert a 1.44MB diskette into a 2.88MB drive. This is inflexible.
>>
>> The new algorithm applies a new heuristic to guessing disk geometries
>> that allows us to switch diskette types as long as the physical size
>> matches before falling back to the old heuristic.
>>
>> The old one is roughly:
>>  - If the size (sectors) and type matches, choose it.
>>  - Fall back to the first geometry that matched our type.
>>

For sake of review, we'll call these choices (A) and (B).

>> The new one is:
>>  - If the size (sectors) and type matches, choose it.
>>  - If the size (sectors) and physical size match, choose it.
>>  - If the size (sectors) matches at all, choose it begrudgingly.
> 
> This is the one I worry about; details below.
> 
>>  - Fall back to the first geometry that matched our type.
>>

And these choices will be (1) through (4) as they are annotated in the
comment below in this way as well.

(1) is identical to (A), and (4) is identical to (B).
That leaves (2) and (3) are new behaviors.

There is strong rationale for (2).
(3) was invented as a last-ditch effort before (4), see below!

>> Signed-off-by: John Snow 
>> ---
> 
>> -int i, first_match, match;
>> +int i;
>> +int match, size_match, type_match;
>> +bool magic = drv->drive == FLOPPY_DRIVE_TYPE_AUTO;
>>  
>>  /* We can only pick a geometry if we have a diskette. */
>>  if ((!drv->media_inserted) || (drv->drive == FLOPPY_DRIVE_TYPE_NONE)) {
>>  return -1;
>>  }
>>  
>> +/* We need to determine the likely geometry of the inserted medium.
>> + * In order of preference, we look for:
>> + * (1) The same drive type and number of sectors,
>> + * (2) The same diskette size and number of sectors,
>> + * (3) The same number of sectors,
>> + * (4) The same drive type.
>> + *
>> + * In all cases, matches that occur higher in the drive table will take
>> + * precedence over matches that occur later in the table.
> 
> Yay - comments!  Makes it somewhat easier to follow.
> 
>> + */
>>  blk_get_geometry(blk, &nb_sectors);
>> -match = -1;
>> -first_match = -1;
>> +match = size_match = type_match = -1;
>>  for (i = 0; ; i++) {
>>  parse = &fd_formats[i];
>>  if (parse->drive == FLOPPY_DRIVE_TYPE_NONE) {
>>  break;
>>  }
>> -if (drv->drive == parse->drive ||
>> -drv->drive == FLOPPY_DRIVE_TYPE_AUTO) {
>> -size = (parse->max_head + 1) * parse->max_track *
>> -parse->last_sect;
>> -if (nb_sectors == size) {
>> -match = i;
>> -break;
>> +size = (parse->max_head + 1) * parse->max_track * parse->last_sect;
>> +if (nb_sectors == size) {
> 
> To make sure I understand this:
> 
> The first three conditions are reached only when visiting a row where
> nb_sectors matches.
> 

Yes.

>> +if (magic || parse->drive == drv->drive) {
>> +/* (1) perfect match */
>> +goto out;
> 
> The conditional means we get here only if the user specified 'magic'
> (aka the new "auto" behavior of picking the first drive that works) or
> an actual size (144/288/120).  Since the sectors match, we are set; and
> this matches the old behavior (either drive type and sectors match, or
> drive type was unspecified and sectors match).
> 

Yes. This is the old (A), identical to what is now documented as (1).

>> +} else if (drive_size(parse->drive) == drive_size(drv->drive)) {
>> +/* (2) physical size match */
>> +match = (match == -1) ? i : match;
> 
> Here, drive_size(parse->drive) will never be FDRIVE_SIZE_UNKNOWN, but
> drive_size(drv->drive) depends on what the user told us.  If they didn't
> tell us what disk size to target, including if they asked for "auto",
> this rule will never match.  If they DID tell us a disk size
> (144/288/120), then this favors their disk size.  More concretely, if
> they asked for 2880 sectors and a 288 disk, the old algorithm would fail
> (no 288 entry has 2880 sectors), but now succeeds (the 144 entry with
> 2880 sectors has the same 3.5" disk size).  This is a good bug fix.
> 

Yes. This is new behavior, documented as (2).

>> +} else {
>> +/* (3) nsectors match only */
> 
> nb_sectors?
> 

Yes, I can update the comment to be less fuzzy.

>> +size_match = (size_match == -1) ? i : size_match;
> 
> Here, we don't have a drive match.  Either the user did not specify a
> drive type [backwards-compatible behavior - the old loop found the first
> entry with a matching size among every row of the table], or they

The old behavior you are refe

Re: [Qemu-devel] [PATCH v9 21/37] qapi: Document visitor interfaces, add assertions

2016-01-21 Thread Markus Armbruster

All right, this one's a bear.  Not because the patch is bad, but because
what it tries to do is bloody difficult.

Eric Blake  writes:

> The visitor interface for mapping between QObject/QemuOpts/string
> and qapi has formerly been documented only by reading source code,

Polite way to say "is scandalously undocumented".

> making it difficult to propose changes to either scripts/qapi*.py
> or to clients without knowing whether those changes would be safe.
> This adds documentation, including mentioning when parameters can
> be NULL, and where there are still some interface warts that would
> be nice to remove.  In particular, I have plans to remove
> visit_start_union() in a future patch.

Suggest

The visitor interface for mapping between QObject/QemuOpts/string
and QAPI is pretty much undocumented, making changes to visitor
core, individual visitors, and users of visitors difficult.

Correct this by retrofitting proper contracts.  Document some
interface warts that would be nice to remove.  In particular, I have
plans to remove visit_start_union() in a future patch.

> Add some asserts to strengthen the claims of the documentation; some
> of these were only made possible by recent cleanup commits.  These
> were made easier with the addition of a new visit_is_output()
> helper (since all 2 output visitors of our 6 overall visitors use
> the same .type_enum() callback).

I find past tense confusing here, it makes me think visit_is_output()
was added in a previous patch.  Perhaps:

Add assertions to (partially) enforce the contract.  Some of these
were only made possible by recent cleanup commits.

Add a new visit_is_output() helper to make the assertions more
readable.  Its implementation is a bit hacky: it relies on the fact
that all output visitors use the same .type_enum() callback.

> Signed-off-by: Eric Blake 
> Reviewed-by: Marc-André Lureau 
>
> ---
> v9: no change
> v8: rebase to 'name' motion
> v7: retitle; more wording changes, add asserts to enforce the
> wording, place later in series to rebase on fixes that would
> otherwise trip the new assertions
> v6: mention that input visitors blindly assign over *obj; wording
> improvements
> ---
>  include/qapi/visitor-impl.h |  31 ++-
>  include/qapi/visitor.h  | 196 
> 
>  qapi/qapi-visit-core.c  |  39 -
>  3 files changed, 262 insertions(+), 4 deletions(-)
>

My review probably makes more sense if you skip ahead to visitor.h, then
come back here.

> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 7f512cf..aab46bc 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -15,23 +15,37 @@
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
>
> +/* This file describes the callback interface for implementing a
> + * QObject visitor.  For the client interface, see visitor.h.  When

"QAPI visitor", for consistency with visitor.h's and qapi-visit-core.c's
file comment.

> + * implementing the callbacks, it is easiest to declare a struct with
> + * 'Visitor visitor;' as the first member.  Semantics for the
> + * callbacks are generally similar to the counterpart public
> + * interface.  */

Thanks for adding this overview comment.

Suggest to say "A callback's contract matches the corresponding public
functions' contract unless stated otherwise."  Then state otherwise
where needed.

> +
>  struct Visitor
>  {
> -/* Must be set */
> +/* Must be provided to visit structs (the string visitors do not
> + * currently visit structs). */

Uh, the string visitors don't decide what gets visited, their users do.
The string visitors don't support visiting structs.  The restriction
needs to be documented, but this isn't the place to do it, is it?
Suggest to drop the parenthesis.

>  void (*start_struct)(Visitor *v, const char *name, void **obj,
>   size_t size, Error **errp);
> +/* Must be provided if start_struct is present. */
>  void (*end_struct)(Visitor *v, Error **errp);
>
> +/* May be NULL; most useful for input visitors. */

"Optional" would be a bit terser than "May be NULL".

Why is it "most useful for input visitors"?  For what it's worth, the
dealloc visitor finds it useful, too...

>  void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
>Error **errp);
>  /* May be NULL */
>  void (*end_implicit_struct)(Visitor *v);
>
> +/* Must be set */
>  void (*start_list)(Visitor *v, const char *name, Error **errp);
> +/* Must be set */
>  GenericList *(*next_list)(Visitor *v, GenericList **list, Error **errp);
>  /* Must be set */
>  void (*end_list)(Visitor *v);

A visitor could omit these two with similar consequences to omitting
start_struct() and end_struct(): attempts to visit lists crash then.  In
fact, the string visitors omitted them until commit 659268f an

[Qemu-devel] COLO hybrid mode and other changes

2016-01-21 Thread Dr. David Alan Gilbert

Hi,
  I've got a world with a few COLO changes in that you might
like to look at; they're all experimental, but it seems
to be working for me.

  The world is at:
https://github.com/orbitfp7/qemu/commits/orbit-wp4-colo-jan16

It's based off:
The December/2.4 colo framework world with periodic checkpoints
+ The december user space colo-proxy
+ The integration code that was in the colo-proxy git repo
(those needed some updates to go into the 2.4-periodic world)

And then I've added:
   - An updated version of my hybrid mode code that switches between
 COLO and periodic checkpoints depending on the frequency of
 miscompares
   - A minimum checkpoint time, so that on fast miscompares
 we do get some runtime (as in the earlier colo series)
   - I'm using a condition-variable to flag the miscompares from
 the colo-proxy to the colo thread; this seems cleaner and it
 avoids having to do short waits;  it might be a bit tricky
 to make portable.
   - Add back some statistcs in 'info migrate'
   - An HMP command equivalent to x-blockdev-change.
   - RDMA transport for COLO

I don't intend to post any of these changes in full to the list
until your main COLO code is in; although if any of my changes
are useful to you, then feel free to include them in sets you post.

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH v11 7/7] arm_mptimer: Convert to use ptimer

2016-01-21 Thread Dmitry Osipenko

Current ARM MPTimer implementation uses QEMUTimer for the actual timer,
this implementation isn't complete and mostly tries to duplicate of what
generic ptimer is already doing fine.

Conversion to ptimer brings the following benefits and fixes:
- Simple timer pausing implementation
- Fixes counter value preservation after stopping the timer
- Correctly handles prescaler != 0 cases
- Code simplification and reduction

Bump VMSD to version 3, since VMState is changed and is not compatible
with the previous implementation.

Signed-off-by: Dmitry Osipenko 
---
 hw/timer/arm_mptimer.c | 131 +
 include/hw/timer/arm_mptimer.h |   5 +-
 2 files changed, 67 insertions(+), 69 deletions(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 3e59c2a..a5f46df 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -19,8 +19,9 @@
  * with this program; if not, see .
  */
 
+#include "hw/ptimer.h"
 #include "hw/timer/arm_mptimer.h"
-#include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "qom/cpu.h"
 
 /* This device implements the per-cpu private timer and watchdog block
@@ -42,33 +43,34 @@ static inline void timerblock_update_irq(TimerBlock *tb)
 }
 
 /* Return conversion factor from mpcore timer ticks to qemu timer ticks.  */
-static inline uint32_t timerblock_scale(TimerBlock *tb)
+static inline uint32_t timerblock_scale(uint32_t control)
 {
-return (((tb->control >> 8) & 0xff) + 1) * 10;
+return (((control >> 8) & 0xff) + 1) * 10;
 }
 
-static void timerblock_reload(TimerBlock *tb, int restart)
+static inline void timerblock_set_count(struct ptimer_state *timer,
+uint32_t control, uint64_t *count)
 {
-if (tb->count == 0) {
-return;
+/* PTimer would immediately trigger interrupt for periodic timer
+ * when counter set to 0, MPtimer under certain condition only.  */
+if ((control & 3) == 3 && (*count == 0) && (control & 0xff00) == 0) {
+*count = ptimer_get_limit(timer);
 }
-if (restart) {
-tb->tick = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+ptimer_set_count(timer, *count);
+}
+
+static inline void timerblock_run(struct ptimer_state *timer, uint32_t control,
+  bool cond)
+{
+if (cond) {
+ptimer_run(timer, !(control & 2));
 }
-tb->tick += (int64_t)tb->count * timerblock_scale(tb);
-timer_mod(tb->timer, tb->tick);
 }
 
 static void timerblock_tick(void *opaque)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
 tb->status = 1;
-if (tb->control & 2) {
-tb->count = tb->load;
-timerblock_reload(tb, 0);
-} else {
-tb->count = 0;
-}
 timerblock_update_irq(tb);
 }
 
@@ -76,21 +78,11 @@ static uint64_t timerblock_read(void *opaque, hwaddr addr,
 unsigned size)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
-int64_t val;
 switch (addr) {
 case 0: /* Load */
-return tb->load;
+return ptimer_get_limit(tb->timer);
 case 4: /* Counter.  */
-if (((tb->control & 1) == 0) || (tb->count == 0)) {
-return 0;
-}
-/* Slow and ugly, but hopefully won't happen too often.  */
-val = tb->tick - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-val /= timerblock_scale(tb);
-if (val < 0) {
-val = 0;
-}
-return val;
+return ptimer_get_count(tb->timer);
 case 8: /* Control.  */
 return tb->control;
 case 12: /* Interrupt status.  */
@@ -104,39 +96,51 @@ static void timerblock_write(void *opaque, hwaddr addr,
  uint64_t value, unsigned size)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
-int64_t old;
+uint32_t control = tb->control;
 switch (addr) {
 case 0: /* Load */
-tb->load = value;
-/* Fall through.  */
-case 4: /* Counter.  */
-if ((tb->control & 1) && tb->count) {
-/* Cancel the previous timer.  */
-timer_del(tb->timer);
+/* Setting load to 0 stops the timer if prescaler == 0.  */
+if ((control & 1) && (value == 0) && (control & 0xff00) == 0) {
+ptimer_stop(tb->timer);
+control &= ~1;
 }
-tb->count = value;
-if (tb->control & 1) {
-timerblock_reload(tb, 1);
+ptimer_set_limit(tb->timer, value, 1);
+timerblock_run(tb->timer, control, (control & 1));
+break;
+case 4: /* Counter.  */
+/* Setting counter to 0 stops the one-shot timer if prescaler == 0
+ * and periodic if load = 0.  */
+if ((control & 1) && (value == 0) && (control & 0xff00) == 0) {
+if (!(control & 2) || ptimer_get_limit(tb->timer) == 0) {
+ptimer_stop(tb->timer);
+control &= ~1;
+}
 }
+ti

[Qemu-devel] [PATCH v11 5/7] hw/ptimer: Introduce ptimer_get_limit

2016-01-21 Thread Dmitry Osipenko

Currently ptimer users are used to store copy of the limit value, because
ptimer doesn't provide facility to retrieve the limit. Let's provide it.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c| 5 +
 include/hw/ptimer.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index a85d1ae..142cc64 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -224,6 +224,11 @@ void ptimer_set_limit(ptimer_state *s, uint64_t limit, int 
reload)
 }
 }
 
+uint64_t ptimer_get_limit(ptimer_state *s)
+{
+return s->limit;
+}
+
 const VMStateDescription vmstate_ptimer = {
 .name = "ptimer",
 .version_id = 1,
diff --git a/include/hw/ptimer.h b/include/hw/ptimer.h
index 8ebacbb..e397db5 100644
--- a/include/hw/ptimer.h
+++ b/include/hw/ptimer.h
@@ -19,6 +19,7 @@ typedef void (*ptimer_cb)(void *opaque);
 ptimer_state *ptimer_init(QEMUBH *bh);
 void ptimer_set_period(ptimer_state *s, int64_t period);
 void ptimer_set_freq(ptimer_state *s, uint32_t freq);
+uint64_t ptimer_get_limit(ptimer_state *s);
 void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload);
 uint64_t ptimer_get_count(ptimer_state *s);
 void ptimer_set_count(ptimer_state *s, uint64_t count);
-- 
2.7.0

[Qemu-devel] [PATCH v11 0/7] PTimer fixes/features and ARM MPTimer conversion

2016-01-21 Thread Dmitry Osipenko

Changelog for ARM MPTimer QEMUTimer to ptimer conversion:

V2: Fixed changing periodic timer counter value "on the fly". I added a
test to the gist to cover that issue.

V3: Fixed starting the timer with load = 0 and counter != 0, added tests
to the gist for this issue. Changed vmstate version for all VMSD's,
since loadvm doesn't check version of nested VMSD.

V4: Fixed spurious IT bit set for the timer starting in the periodic mode
with counter = 0. Test added.

V5: Code cleanup, now depends on ptimer_set_limit() fix.

V6: No code change, added test to check ptimer_get_count() with corrected
.limit value.

V7: No change.

V8: No change.

V9: No change.

V10: Correctly handle cases when counter = load = 0 and prescaler != 0,
 i.e. triggering interrupt in that case. Call ptimer_* only when
 certain MPTimer state was changed, like prescaler change. Factor out
 timerblock_set_count from timerblock_run and inline both.
 Tests updated.

V11: Fixed missed periodic timer stopping on setting counter => 0 with
 load = 0 and prescaler = 0.

ARM MPTimer tests: https://gist.github.com/digetx/dbd46109503b1a91941a


Patches for ptimer are introduced since V5 of "ARM MPTimer conversion".

Changelog for the ptimer patches:

V5: Only fixed ptimer_set_limit() for the disabled timer.

V6: As was pointed by Peter Maydell, there are other issues beyond
ptimer_set_limit(), so V6 supposed to cover all those issues.

V7: Added accidentally removed !use_icount check.
Added missed "else" statement.

V8: Adjust period instead of the limit and do it for periodic timer only
(.limit adjusting bug). Added patch/fix for freq/period change and
ptimer_get_count() improvement.

V9: Don't do wrap around if counter == 0, otherwise polled periodic
timer won't ever return counter = 0.

V10: Addressed V8/9 review comments.
 Adjust timer period based on delta instead of limit.
 Don't wrap around when in icount mode.
 New patches: "on the fly" mode switch, silence error msg when
  delta = load = 0, introduce ptimer_get_limit.

V11: Dropped timer tick from "Perform tick and counter wrap around if
 timer already expired" patch since it would cause bogus tick after
 QEMU been reset if ptimer was stopped and it's QEMUtimer expired
 during reset.
 Patch "Legalize running with delta = load = 0" now explicitly
 forbids period = 0.

Dmitry Osipenko (7):
  hw/ptimer: Fix issues caused by the adjusted timer limit value
  hw/ptimer: Perform counter wrap around if timer already expired
  hw/ptimer: Update .delta on period/freq change
  hw/ptimer: Support "on the fly" timer mode switch
  hw/ptimer: Introduce ptimer_get_limit
  hw/ptimer: Legalize running with delta = load = 0 and abort on period
= 0
  arm_mptimer: Convert to use ptimer

 hw/core/ptimer.c   | 111 --
 hw/timer/arm_mptimer.c | 131 +
 include/hw/ptimer.h|   1 +
 include/hw/timer/arm_mptimer.h |   5 +-
 4 files changed, 135 insertions(+), 113 deletions(-)

-- 
2.7.0

[Qemu-devel] [PATCH v11 1/7] hw/ptimer: Fix issues caused by the adjusted timer limit value

2016-01-21 Thread Dmitry Osipenko

Multiple issues here related to the timer with a adjusted .limit value:

1) ptimer_get_count() returns incorrect counter value for the disabled
timer after loading the counter with a small value, because adjusted limit
value is used instead of the original.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 0, 1)
4) ptimer_get_count(t) <-- would return 1 instead of 0

2) ptimer_get_count() might return incorrect value for the timer running
with a adjusted limit value.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 10, 1)
4) ptimer_run(t)
5) ptimer_get_count(t) <-- might return value > 10

3) Neither ptimer_set_period() nor ptimer_set_freq() are adjusting the
limit value, so it is still possible to make timer timeout value
arbitrary small.

For instance:
1) ptimer_set_period(t, 1)
2) ptimer_set_limit(t, 1, 0)
3) ptimer_set_period(t, 1) <-- bypass limit correction

Fix all of the above issues by adjusting timer period instead of the limit.
Perform the adjustment for periodic timer only. Use the delta value instead
of the limit to make decision whether adjustment is required, as limit could
be altered while timer is running, resulting in incorrect value returned by
ptimer_get_count.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 51 +++
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index edf077c..6dc1677 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -34,6 +34,9 @@ static void ptimer_trigger(ptimer_state *s)
 
 static void ptimer_reload(ptimer_state *s)
 {
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
+
 if (s->delta == 0) {
 ptimer_trigger(s);
 s->delta = s->limit;
@@ -44,10 +47,24 @@ static void ptimer_reload(ptimer_state *s)
 return;
 }
 
+/*
+ * Artificially limit timeout rate to something
+ * achievable under QEMU.  Otherwise, QEMU spends all
+ * its time generating timer interrupts, and there
+ * is no forward progress.
+ * About ten microseconds is the fastest that really works
+ * on the current generation of host machines.
+ */
+
+if ((s->enabled == 1) && (s->delta * period < 1) && !use_icount) {
+period = 1 / s->delta;
+period_frac = 0;
+}
+
 s->last_event = s->next_event;
-s->next_event = s->last_event + s->delta * s->period;
-if (s->period_frac) {
-s->next_event += ((int64_t)s->period_frac * s->delta) >> 32;
+s->next_event = s->last_event + s->delta * period;
+if (period_frac) {
+s->next_event += ((int64_t)period_frac * s->delta) >> 32;
 }
 timer_mod(s->timer, s->next_event);
 }
@@ -82,6 +99,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
 uint64_t div;
 int clz1, clz2;
 int shift;
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
+
+if ((s->enabled == 1) && !use_icount && (s->delta * period < 
1)) {
+period = 1 / s->delta;
+period_frac = 0;
+}
 
 /* We need to divide time by period, where time is stored in
rem (64-bit integer) and period is stored in period/period_frac
@@ -94,7 +118,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
 */
 
 rem = s->next_event - now;
-div = s->period;
+div = period;
 
 clz1 = clz64(rem);
 clz2 = clz64(div);
@@ -103,13 +127,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
 rem <<= shift;
 div <<= shift;
 if (shift >= 32) {
-div |= ((uint64_t)s->period_frac << (shift - 32));
+div |= ((uint64_t)period_frac << (shift - 32));
 } else {
 if (shift != 0)
-div |= (s->period_frac >> (32 - shift));
+div |= (period_frac >> (32 - shift));
 /* Look at remaining bits of period_frac and round div up if 
necessary.  */
-if ((uint32_t)(s->period_frac << shift))
+if ((uint32_t)(period_frac << shift))
 div += 1;
 }
 counter = rem / div;
@@ -181,19 +205,6 @@ void ptimer_set_freq(ptimer_state *s, uint32_t freq)
count = limit.  */
 void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload)
 {
-/*
- * Artificially limit timeout rate to something
- * achievable under QEMU.  Otherwise, QEMU spends all
- * its time generating timer interrupts, and there
- * is no forward progress.
- * About ten microseconds is the fastest that really works
- * on the current generation of host machines.
- */
-
-if (!use_icount && limit *

[Qemu-devel] [PATCH v11 6/7] hw/ptimer: Legalize running with delta = load = 0 and abort on period = 0

2016-01-21 Thread Dmitry Osipenko

Currently ptimer would print error message and clear enable flag for an
arming timer that has delta = load = 0. That actually could be a valid case
for some hardware, like instant IRQ trigger for oneshot timer or continuous
in periodic mode. Support those cases by removing the error message and
stopping the timer when delta = 0. Abort execution when period = 0 instead
of printing the error message, otherwise there is a chance to miss that error.

In addition, don't re-load oneshot timer when delta = 0 and remove duplicated
code from ptimer_tick(), since ptimer_reload would invoke trigger and stop
the timer.

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 142cc64..cec59e1 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -39,11 +39,14 @@ static void ptimer_reload(ptimer_state *s)
 
 if (s->delta == 0) {
 ptimer_trigger(s);
+}
+
+if (s->delta == 0 && s->enabled == 1) {
 s->delta = s->limit;
 }
-if (s->delta == 0 || s->period == 0) {
-fprintf(stderr, "Timer with period zero, disabling\n");
-s->enabled = 0;
+
+if (s->delta == 0) {
+ptimer_stop(s);
 return;
 }
 
@@ -72,27 +75,22 @@ static void ptimer_reload(ptimer_state *s)
 static void ptimer_tick(void *opaque)
 {
 ptimer_state *s = (ptimer_state *)opaque;
-ptimer_trigger(s);
 s->delta = 0;
-if (s->enabled == 2) {
-s->enabled = 0;
-} else {
-ptimer_reload(s);
-}
+ptimer_reload(s);
 }
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
 uint64_t counter;
 
-if (s->enabled) {
+if (s->enabled && s->delta != 0) {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 int64_t next = s->next_event;
 bool expired = (now - next >= 0);
 bool oneshot = (s->enabled == 2);
 
 /* Figure out the current counter value.  */
-if (s->period == 0 || (expired && (oneshot || use_icount))) {
+if (expired && (oneshot || use_icount)) {
 /* Prevent timer underflowing if it should already have
triggered.  */
 counter = 0;
@@ -164,10 +162,7 @@ void ptimer_run(ptimer_state *s, int oneshot)
 {
 bool was_disabled = !s->enabled;
 
-if (was_disabled && s->period == 0) {
-fprintf(stderr, "Timer with period zero, disabling\n");
-return;
-}
+g_assert(s->period != 0);
 s->enabled = oneshot ? 2 : 1;
 if (was_disabled) {
 s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
@@ -190,6 +185,7 @@ void ptimer_stop(ptimer_state *s)
 /* Set counter increment interval in nanoseconds.  */
 void ptimer_set_period(ptimer_state *s, int64_t period)
 {
+g_assert(period != 0);
 s->delta = ptimer_get_count(s);
 s->period = period;
 s->period_frac = 0;
@@ -202,6 +198,7 @@ void ptimer_set_period(ptimer_state *s, int64_t period)
 /* Set counter frequency in Hz.  */
 void ptimer_set_freq(ptimer_state *s, uint32_t freq)
 {
+g_assert(freq != 0);
 s->delta = ptimer_get_count(s);
 s->period = 10ll / freq;
 s->period_frac = (10ll << 32) / freq;
-- 
2.7.0

[Qemu-devel] [PATCH v11 2/7] hw/ptimer: Perform counter wrap around if timer already expired

2016-01-21 Thread Dmitry Osipenko

ptimer_get_count() might be called while QEMU timer already been expired.
In that case ptimer would return counter = 0, which might be undesirable
in case of polled timer. Do counter wrap around for periodic timer to keep
it distributed. In order to achieve more accurate emulation behaviour of
certain hardware, don't perform wrap around when in icount mode and return
counter = 0 in that case (that doesn't affect polled counter distribution).

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 6dc1677..cb50d30 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -83,14 +83,16 @@ static void ptimer_tick(void *opaque)
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
-int64_t now;
 uint64_t counter;
 
 if (s->enabled) {
-now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+int64_t next = s->next_event;
+bool expired = (now - next >= 0);
+bool oneshot = (s->enabled == 2);
+
 /* Figure out the current counter value.  */
-if (now - s->next_event > 0
-|| s->period == 0) {
+if (s->period == 0 || (expired && (oneshot || use_icount))) {
 /* Prevent timer underflowing if it should already have
triggered.  */
 counter = 0;
@@ -102,7 +104,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
 uint32_t period_frac = s->period_frac;
 uint64_t period = s->period;
 
-if ((s->enabled == 1) && !use_icount && (s->delta * period < 
1)) {
+if (!oneshot && (s->delta * period < 1) && !use_icount) {
 period = 1 / s->delta;
 period_frac = 0;
 }
@@ -117,7 +119,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
backwards.
 */
 
-rem = s->next_event - now;
+rem = expired ? now - next : next - now;
 div = period;
 
 clz1 = clz64(rem);
@@ -137,6 +139,11 @@ uint64_t ptimer_get_count(ptimer_state *s)
 div += 1;
 }
 counter = rem / div;
+
+if (expired && (counter != 0)) {
+/* Wrap around periodic counter.  */
+counter = s->limit - counter % s->limit;
+}
 }
 } else {
 counter = s->delta;
-- 
2.7.0

[Qemu-devel] [PATCH v11 4/7] hw/ptimer: Support "on the fly" timer mode switch

2016-01-21 Thread Dmitry Osipenko

Allow switching between periodic <-> oneshot modes while timer is running.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 904a77b..a85d1ae 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -162,16 +162,17 @@ void ptimer_set_count(ptimer_state *s, uint64_t count)
 
 void ptimer_run(ptimer_state *s, int oneshot)
 {
-if (s->enabled) {
-return;
-}
-if (s->period == 0) {
+bool was_disabled = !s->enabled;
+
+if (was_disabled && s->period == 0) {
 fprintf(stderr, "Timer with period zero, disabling\n");
 return;
 }
 s->enabled = oneshot ? 2 : 1;
-s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-ptimer_reload(s);
+if (was_disabled) {
+s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+ptimer_reload(s);
+}
 }
 
 /* Pause a timer.  Note that this may cause it to "lose" time, even if it
-- 
2.7.0

[Qemu-devel] [PATCH v11 3/7] hw/ptimer: Update .delta on period/freq change

2016-01-21 Thread Dmitry Osipenko

Delta value must be updated on period/freq change, otherwise running timer
would be restarted (counter reloaded with old delta). Only m68k/mcf520x
and arm/arm_timer devices are currently doing freq change correctly, i.e.
stopping the timer. Perform delta update to fix affected devices and
eliminate potential further mistakes.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index cb50d30..904a77b 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -189,6 +189,7 @@ void ptimer_stop(ptimer_state *s)
 /* Set counter increment interval in nanoseconds.  */
 void ptimer_set_period(ptimer_state *s, int64_t period)
 {
+s->delta = ptimer_get_count(s);
 s->period = period;
 s->period_frac = 0;
 if (s->enabled) {
@@ -200,6 +201,7 @@ void ptimer_set_period(ptimer_state *s, int64_t period)
 /* Set counter frequency in Hz.  */
 void ptimer_set_freq(ptimer_state *s, uint32_t freq)
 {
+s->delta = ptimer_get_count(s);
 s->period = 10ll / freq;
 s->period_frac = (10ll << 32) / freq;
 if (s->enabled) {
-- 
2.7.0

Re: [Qemu-devel] virtio ring layout changes for optimal single-stream performance

2016-01-21 Thread Michael S. Tsirkin

On Thu, Jan 21, 2016 at 04:38:36PM +0100, Cornelia Huck wrote:
> On Thu, 21 Jan 2016 15:39:26 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > Hi all!
> > I have been experimenting with alternative virtio ring layouts,
> > in order to speed up single stream performance.
> > 
> > I have just posted a benchmark I wrote for the purpose, and a (partial)
> > alternative layout implementation.  This achieves 20-40% reduction in
> > virtio overhead in the (default) polling mode.
> > 
> > http://article.gmane.org/gmane.linux.kernel.virtualization/26889
> > 
> > The layout is trying to be as simple as possible, to reduce
> > the number of cache lines bouncing between CPUs.
> 
> Some kind of diagram or textual description would really help to review
> this.
> 
> > 
> > For benchmarking, the idea is to emulate virtio in user-space,
> > artificially adding overhead for e.g. signalling to match what happens
> > in case of a VM.
> 
> Hm... is this overhead comparable enough between different platform so
> that you can get a halfway realistic scenario?

On x86 is seems pretty stable.
It's a question of setting VMEXIT_CYCLES and VMENTRY_CYCLES correctly.

> What about things like
> endianness conversions?

I didn't bother with them yet.

> > 
> > I'd be very curious to get feedback on this, in particular, some people
> > discussed using vectored operations to format virtio ring - would it
> > conflict with this work?
> > 
> > You are all welcome to post enhancements or more layout alternatives as
> > patches.
> 
> Let me see if I can find time to experiment a bit.

Re: [Qemu-devel] [PATCH] gtk: use qemu_chr_alloc() to allocate CharDriverState

2016-01-21 Thread Hervé Poussineau


Le 21/01/2016 12:56, Daniel P. Berrange a écrit :

The gd_vc_handler() callback is using g_malloc0() to
allocate the CharDriverState struct. As a result the
logfd field is getting initialized to 0, instead of
-1 when no logfile is requested.

The result is that when running

  $ qemu-system-i386 -nodefaults -chardev vc,id=mon0 -mon chardev=mon0

qemu duplicates all monitor output to stdout as well
as the GTK window.

Not using qemu_chr_alloc() was already a bug, but harmless
until this commit

   commit d0d7708ba29cbcc343364a46bff981e0ff88366f
   Author: Daniel P. Berrange 
   Date:   Mon Jan 11 12:44:41 2016 +

 qemu-char: add logfile facility to all chardev backends

which exposed the problem as a behaviour regression

Reported-by: Hervé Poussineau 
Signed-off-by: Daniel P. Berrange 


Tested-by: Hervé Poussineau 

Hervé

Re: [Qemu-devel] [PATCH] gtk: use qemu_chr_alloc() to allocate CharDriverState

2016-01-21 Thread Peter Maydell

On 21 January 2016 at 11:56, Daniel P. Berrange  wrote:
> The gd_vc_handler() callback is using g_malloc0() to
> allocate the CharDriverState struct. As a result the
> logfd field is getting initialized to 0, instead of
> -1 when no logfile is requested.
>
> The result is that when running
>
>  $ qemu-system-i386 -nodefaults -chardev vc,id=mon0 -mon chardev=mon0
>
> qemu duplicates all monitor output to stdout as well
> as the GTK window.
>
> Not using qemu_chr_alloc() was already a bug, but harmless
> until this commit

A quick check with coccinelle:

@@
typedef CharDriverState;
CharDriverState *x;
@@

- x = g_malloc0(...)
+ x = qemu_chr_alloc(foo)


revealed only this ui/gtk.c allocation plus the actual
implementation of qemu_chr_alloc() as places where we try
to do a manual g_malloc0() of a CharDriverState. So I
think this is the only bit that needs changing.

thanks
-- PMM

Re: [Qemu-devel] [PULL 0/11] xen-20160121

2016-01-21 Thread Peter Maydell

On 21 January 2016 at 17:01, Stefano Stabellini
 wrote:
> The following changes since commit 17c8a2197888bac8ec0256b16919b721c76c5e62:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-01-13' 
> into staging (2016-01-14 13:07:38 +)
>
> are available in the git repository at:
>
>
>   git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20160121
>
> for you to fetch changes up to 5a11d0f7549e24a10e178a9dc8ff5e698031d9a6:
>
>   Xen PCI passthru: convert to realize() (2016-01-21 16:45:54 +)
>
> 
> Xen 2016/01/21

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v4 12/12] fdc: change auto fallback drive for ISA FDC to 288

2016-01-21 Thread Eric Blake

On 01/19/2016 11:51 PM, John Snow wrote:
> The 2.88 drive is more suitable as a default because
> it can still read 1.44 images correctly, but the reverse
> is not true.
> 
> Since there exist virtio-win drivers that are shipped on
> 2.88 floppy images, this patch will allow VMs booted without
> a floppy disk inserted to later insert a 2.88MB floppy and
> have that work.
> 
> This patch has been tested with msdos, freedos, fedora,
> windows 8 and windows 10 without issue: if problems do
> arise for certain guests being unable to cope with 2.88MB
> drives as the default, they are in the minority and can use
> type=144 as needed (or insert a proper boot medium and omit
> type=144/288 or use type=auto) to obtain different drive types.
> 
> As icing, the default will remain auto/144 for any pre-2.6
> machine types, hopefully minimizing the impact of this change
> in legacy hw to basically zero.
> 
> Signed-off-by: John Snow 
> ---
>  hw/block/fdc.c  | 2 +-
>  include/hw/compat.h | 4 
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 11/12] qtest/fdc: Support for 2.88MB drives

2016-01-21 Thread Eric Blake

On 01/19/2016 11:51 PM, John Snow wrote:
> The old test assumes a 1.44MB drive.
> Assert that the QEMU default drive is now either 1.44 or 2.88.
> 
> Signed-off-by: John Snow 
> ---
>  tests/fdc-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PULL 11/11] Xen PCI passthru: convert to realize()

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.c |   53 -
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 9eef3df..d33221b 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -760,10 +760,10 @@ static void xen_pt_destroy(PCIDevice *d) {
 }
 /* init */
 
-static int xen_pt_initfn(PCIDevice *d)
+static void xen_pt_realize(PCIDevice *d, Error **errp)
 {
 XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
-int rc = 0;
+int i, rc = 0;
 uint8_t machine_irq = 0, scratch;
 uint16_t cmd = 0;
 int pirq = XEN_PT_UNASSIGNED_PIRQ;
@@ -781,8 +781,8 @@ static int xen_pt_initfn(PCIDevice *d)
 &err);
 if (err) {
 error_append_hint(&err, "Failed to \"open\" the real pci device");
-error_report_err(err);
-return -1;
+error_propagate(errp, err);
+return;
 }
 
 s->is_virtfn = s->real_device.is_virtfn;
@@ -802,19 +802,19 @@ static int xen_pt_initfn(PCIDevice *d)
 if ((s->real_device.domain == 0) && (s->real_device.bus == 0) &&
 (s->real_device.dev == 2) && (s->real_device.func == 0)) {
 if (!is_igd_vga_passthrough(&s->real_device)) {
-XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying"
-   " to passthrough IGD GFX.\n");
+error_setg(errp, "Need to enable igd-passthru if you're trying"
+" to passthrough IGD GFX");
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 xen_pt_setup_vga(s, &s->real_device, &err);
 if (err) {
 error_append_hint(&err, "Setup VGA BIOS of passthrough"
 " GFX failed");
-error_report_err(err);
+error_propagate(errp, err);
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 /* Register ISA bridge for passthrough GFX. */
@@ -836,20 +836,19 @@ static int xen_pt_initfn(PCIDevice *d)
 /* Bind interrupt */
 rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_INTERRUPT_PIN");
 goto err_out;
 }
 if (!scratch) {
-XEN_PT_LOG(d, "no pin interrupt\n");
+error_setg(errp, "no pin interrupt");
 goto out;
 }
 
 machine_irq = s->real_device.irq;
 rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq);
-
 if (rc < 0) {
-XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n",
-   machine_irq, pirq, errno);
+error_setg_errno(errp, errno, "Mapping machine irq %u to"
+ " pirq %i failed", machine_irq, pirq);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -870,8 +869,8 @@ static int xen_pt_initfn(PCIDevice *d)
PCI_SLOT(d->devfn),
e_intx);
 if (rc < 0) {
-XEN_PT_ERR(d, "Binding of interrupt %i failed! (err: %d)\n",
-   e_intx, errno);
+error_setg_errno(errp, errno, "Binding of interrupt %u failed",
+ e_intx);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -879,8 +878,8 @@ static int xen_pt_initfn(PCIDevice *d)
 
 if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
 if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) {
-XEN_PT_ERR(d, "Unmapping of machine interrupt %i failed!"
-   " (err: %d)\n", machine_irq, errno);
+error_setg_errno(errp, errno, "Unmapping of machine"
+" interrupt %u failed", machine_irq);
 }
 }
 s->machine_irq = 0;
@@ -893,14 +892,14 @@ out:
 
 rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_COMMAND");
 goto err_out;
 } else {
 val |= cmd;
 rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
 if (rc) {
-XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: 
%d)\n",
-   val, rc);
+error_setg_errno(errp, errno, "Failed to write PCI_COMMAND"
+ " val = 0x%x", val);
 goto err_out;
 }
 }
@@ -910,15 +909,19 @@ out

Re: [Qemu-devel] [PATCH v9 01/37] qobject: Document more shortcomings in our number handling

2016-01-21 Thread Daniel P. Berrange

On Thu, Jan 21, 2016 at 07:21:49AM +0100, Markus Armbruster wrote:
> Eric Blake  writes:
> 
> > On 01/20/2016 02:02 AM, Markus Armbruster wrote:
> >
> >>> @@ -519,6 +519,8 @@ static QObject *parse_literal(JSONParserContext *ctxt)
> >>>  }
> >>>  case JSON_FLOAT:
> >>>  /* FIXME dependent on locale */
> >>> +/* FIXME our lexer matches RFC7159 in forbidding Inf or NaN,
> >> 
> >> For what it's worth, the RFC spells this "RFC 7159".
> >
> > Looks like we use space more often than not, but that we're
> > inconsistent.  For example:
> >
> > slirp/tcp.h: * Per RFC 793, September, 1981.
> > slirp/tcp.h: * Per RFC793, September, 1981.
> >
> > Will fix if I need to respin, otherwise I assume you can do it.
> 
> Okay.
> 
> >>> +/* FIXME: snprintf() is locale dependent; but JSON requires
> >>> + * numbers to be formatted as if in the C locale. */
> >> 
> >> The new FIXME matches the existing one in parse_literal(), visible
> >> above.
> >> 
> >> However, dependence on C locale is a pervasive issue in QEMU.  These two
> >> comments could give readers the idea that it's an issue just here.
> >> Suggest to add something like "Dependence on C locale is a pervasive
> >> issue in QEMU."
> >
> > Good idea.
> >
> >> 
> >>> +/* FIXME: This risks printing Inf or NaN, which are not valid
> >>> + * JSON values. */
> >>> +/* FIXME: the default precision of %f may be insufficient to
> >>> + * tell this number apart from others. */
> >> 
> >> Yup.
> >> 
> >> The easy way to avoid loss of precision is %a, but of course that's way
> >> too much sophistication for JSON.
> >> 
> >> Avoiding loss of precision with a decimal format is non-trivial; see
> >> Steele, Jr., Guy L., and White, Jon L. How to print floating-point
> >> numbers accurately, SIGPLAN ’90, and later improvements collected here:
> >> http://stackoverflow.com/questions/7153979/algorithm-to-convert-an-ieee-754-double-to-a-string
> >
> > Ah, memories.  I read and implemented that paper when working on the
> > jikes compiler for the Java language back in the late nineties, as it is
> > the Java language specification which had the very neat property of
> > requiring the shortest decimal string that will unambiguously round back
> > to the same floating point pattern.
> >
> > One alternative is to always output a guaranteed unambiguous decimal
> > string (although not necessarily the shortest), by using %.17f, using
> >  DBL_DECIMAL_DIG.  (Note that DBL_DIG of 15 is NOT sufficient -
> > it is the lower limit that says that a decimal->float->decimal will not
> > change the decimal; but we want the converse where a
> > float->decimal->float will not change the float.  There are stretches of
> > numbers where the pigeonhole principle applies; you can think of it this
> > way: there is no way to map all possible 2^10 (1024) binary values
> > inside 2^3 (1000) decimal digits without at least 24 of them needing one
> > more decimal digit.  But by the same arguments, DBL_DECIMAL_DIG is an
> > upper limit and usually more than you need.)
> >
> > So, the question is whether we want to always output 17 digits, or
> > whether we want to do the poor-man's truncation scheme (easy to
> > understand, but not optimal use of the processor), or go all the way to
> > the algorithm of that paper (faster but lots harder to understand).  For
> > reference, here's the poor-man's algorithm in pseudocode:
> 
> I don't think we want to implement floating-point formatting ourselves.

FWIW in libvirt we use per-thread locale switching, so that we can flip
into the C locale temporarily when formatting doubles. If those are not
available we have a bit of a nasty hack to post-process the locale
dependant string back into what we (hope) is C-locale formatting.
We don't do anything aobut the Inf/NaN problem though.

http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virutil.c;h=bb9604a0c1ffb9c99e454e84878a8c376f773046;hb=HEAD#l454

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH v9 10/37] qapi: Make all visitors supply uint64 callbacks

2016-01-21 Thread Eric Blake

On 01/21/2016 01:56 AM, Markus Armbruster wrote:

>>> Before: nobody implements type_uint64(), and the core falls back to
>>> type_int64(), casting negative values to large positive ones.  With an
>>> implementation of type_int64() that parses large positive values as
>>> negative, the two casts cancel out.
>>>
>>> After: everybody implements type_uint64() incorrectly, by reusing
>>> type_int64() code.  The problem moves from visitor core to visitor
>>> implementations, where we can actually fix it if we care.
>>>
>>> Correct?
>>
>> Close. opts-visitor.c already implemented type_uint64, but also has its
>> known bugs (and Paolo has been pinged about his claim for fixes in that
>> file...)
> 
> Ah, yes.  opts_type_uint64() doesn't reuse opts_type_int64(), but
> largely duplicates it.  Potentially less wrong :)
> 
>> But otherwise, yes, in this patch, we are not fixing insanity so much as
>> localizing and better documenting it.
> 
> Let me try to clarify the commit message a bit:
> 
> This patch does not address the disparity in handling large values
> as negatives.  It merely moves the fallback from uint64 to int64
> from the visitor core to the visitors, where the issue can actually
> be fixed, by implementing the missing type_uint64() callbacks on top
> of the respective type_int64() callbacks, with a FIXME comment
> explaining why that's wrong.

Looks good.  I think we're starting to rack up enough tweaks to make it
worth me posting a v10 respin to fold them all in.

 The dealloc visitor no longer needs a type_size callback,
 since that now safely falls back to the type_uint64 callback.
>>>
>>> Did it need the callback before this patch?
>>
>> Not really.  Should I split out the deletion of that callback as a
>> separate patch?
> 
> Probably cleanest, but merely adjusting the commit message would also
> work for me:
> 
> The dealloc visitor doesn't actually need a type_size() callback,
> since the visitor core can safely fall back to the type_uint64()
> callback.  Drop it.

Okay, I won't bother to split this one; the commit message tweak seems
sufficient.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 9/9] ipmi: add SET_SENSOR_READING command (tentative try)

2016-01-21 Thread Cédric Le Goater

SET_SENSOR_READING is a complex IPMI command (IPMI spec : "35.17 Set
Sensor Reading And Event Status Command"). Here is a very minimum
framework fitting the Open PowerNV platform needs. This command is
used on this platform to set the "System Firmware Progress" sensor and
the "Boot Count" sensor.

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
---
 hw/ipmi/ipmi_bmc_sim.c | 135 +
 1 file changed, 135 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 53c75cb21c1a..0aa7e67ae217 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -46,6 +46,7 @@
 #define IPMI_CMD_GET_SENSOR_READING   0x2d
 #define IPMI_CMD_SET_SENSOR_TYPE  0x2e
 #define IPMI_CMD_GET_SENSOR_TYPE  0x2f
+#define IPMI_CMD_SET_SENSOR_READING   0x30
 
 /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
 
@@ -1616,6 +1617,139 @@ static void get_sensor_type(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA(sens->evt_reading_type_code);
 }
 
+static void set_sensor_reading(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+uint8_t evd1;
+uint8_t evd2;
+uint8_t evd3;
+
+IPMI_CHECK_CMD_LEN(5);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+return;
+}
+
+sens = ibs->sensors + cmd[2];
+
+/* Sensor Reading operation */
+switch ((cmd[3]) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value to sensor reading byte */
+sens->reading = cmd[4];
+break;
+case 2:
+case 3:
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+return;
+}
+
+/* Deassertion bits operation */
+switch ((cmd[3] >> 2) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value */
+if (cmd_len > 7) {
+sens->deassert_states = cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states = cmd[8] << 8;
+}
+
+case 2: /* mask on */
+if (cmd_len > 7) {
+sens->deassert_states |= cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states |= cmd[8] << 8;
+}
+break;
+case 3: /* mask off */
+if (cmd_len > 7) {
+sens->deassert_states &= cmd[7];
+}
+if (cmd_len > 8) {
+sens->deassert_states &= (cmd[8] << 8);
+}
+break;
+}
+
+/* Assertion bits operation */
+switch ((cmd[3] >> 4) & 0x3) {
+case 0: /* Do not change */
+break;
+case 1: /* write given value */
+if (cmd_len > 5) {
+sens->assert_states = cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states = cmd[6] << 8;
+}
+
+case 2: /* mask on */
+if (cmd_len > 5) {
+sens->assert_states |= cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states |= cmd[6] << 8;
+}
+break;
+case 3: /* mask off */
+if (cmd_len > 5) {
+sens->assert_states &= cmd[5];
+}
+if (cmd_len > 6) {
+sens->assert_states &= (cmd[6] << 8);
+}
+break;
+}
+
+evd1 = evd2 = evd3 = 0x0;
+if (cmd_len > 9) {
+evd1 = cmd[9];
+}
+if (cmd_len > 10) {
+evd2 = cmd[10];
+}
+if (cmd_len > 11) {
+evd3 = cmd[11];
+}
+
+/* Event Data Bytes operation */
+switch ((cmd[3] >> 6) & 0x3) {
+case 0: /* Do not use the event data in message */
+evd1 = evd2 = evd3 = 0x0;
+break;
+case 1: /* Write given values to event data bytes excluding bits
+ * [3:0] Event Data 1. */
+evd1 &= 0xf0;
+break;
+case 2: /* Write given values to event data bytes including bits
+ * [3:0] Event Data 1. */
+break;
+case 3:
+rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
+return;
+}
+
+if (IPMI_SENSOR_IS_DISCRETE(sens)) {
+unsigned int bit = evd1 & 0xf;
+uint16_t mask = (1 << bit);
+
+if (sens->assert_states & mask & sens->assert_enable) {
+gen_event(ibs, cmd[2], 0, evd1, evd2, evd3);
+}
+
+if (sens->deassert_states & mask & sens->deassert_enable) {
+gen_event(ibs, cmd[2], 1, evd1, evd2, evd3);
+}
+}
+}
 
 static const IPMICmdHandler chassis_cmds[] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
@@ -1636,6 +1770,7 @@ static const IPMICmdHandler sensor_event_cmds[] = {
 [IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading,
 [IPMI_CMD_SET_SENSOR_TYPE] = set_sensor_type,
 [IPMI_CMD_GET_SENSOR_TYPE] = get_sensor_type,
+[IPMI_CMD_SET_SENSOR_RE

[Qemu-devel] [PATCH v2 8/9] ipmi: add ACPI power and GUID commands

2016-01-21 Thread Cédric Le Goater

Signed-off-by: Cédric Le Goater 
---

Changes since v1:
 - added ACPI to command names.
 
 hw/ipmi/ipmi_bmc_sim.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index e882af3f1b40..53c75cb21c1a 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include "sysemu/sysemu.h"
 #include "qemu/timer.h"
 #include "hw/ipmi/ipmi.h"
 #include "qemu/error-report.h"
@@ -51,6 +52,9 @@
 #define IPMI_CMD_GET_DEVICE_ID0x01
 #define IPMI_CMD_COLD_RESET   0x02
 #define IPMI_CMD_WARM_RESET   0x03
+#define IPMI_CMD_SET_ACPI_POWER_STATE 0x06
+#define IPMI_CMD_GET_ACPI_POWER_STATE 0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
 #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
 #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
 #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -200,6 +204,9 @@ struct IPMIBmcSim {
 
 uint8_t restart_cause;
 
+uint8_t acpi_power_state[2];
+uint8_t uuid[16];
+
 IPMISel sel;
 IPMISdr sdr;
 IPMISensor sensors[MAX_SENSORS];
@@ -828,6 +835,36 @@ static void warm_reset(IPMIBmcSim *ibs,
 k->reset(s, false);
 }
 }
+static void set_acpi_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->acpi_power_state[0] = cmd[2];
+ibs->acpi_power_state[1] = cmd[3];
+}
+
+static void get_acpi_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->acpi_power_state[0]);
+IPMI_ADD_RSP_DATA(ibs->acpi_power_state[1]);
+}
+
+static void get_device_guid(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+unsigned int i;
+
+for (i = 0; i < 16; i++) {
+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
+}
+}
 
 static void set_bmc_global_enables(IPMIBmcSim *ibs,
uint8_t *cmd, unsigned int cmd_len,
@@ -1609,6 +1646,9 @@ static const IPMICmdHandler app_cmds[] = {
 [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
 [IPMI_CMD_COLD_RESET] = cold_reset,
 [IPMI_CMD_WARM_RESET] = warm_reset,
+[IPMI_CMD_SET_ACPI_POWER_STATE] = set_acpi_power_state,
+[IPMI_CMD_GET_ACPI_POWER_STATE] = get_acpi_power_state,
+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
 [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
 [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
 [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
@@ -1734,6 +1774,15 @@ static void ipmi_sim_init(Object *obj)
 i += len;
 }
 
+ibs->acpi_power_state[0] = 0;
+ibs->acpi_power_state[1] = 0;
+
+if (qemu_uuid_set) {
+memcpy(&ibs->uuid, qemu_uuid, 16);
+} else {
+memset(&ibs->uuid, 0, 16);
+}
+
 ipmi_init_sensors_from_sdrs(ibs);
 register_cmds(ibs);
 
-- 
2.1.4

[Qemu-devel] [PATCH v2 4/9] ipmi: introduce a struct ipmi_sdr_compact

2016-01-21 Thread Cédric Le Goater

Currently, sdr attributes are identified using byte offsets and this
can be a bit confusing.

This patch adds a struct ipmi_sdr_compact conforming to the IPMI specs
and replaces byte offsets with names. It also introduces and uses a
struct ipmi_sdr_header in sections of the code where no assumption is
made on the type of SDR. This leave rooms to potential usage of other
types in the future.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 65 +++---
 include/hw/ipmi/ipmi.h | 44 ++
 2 files changed, 84 insertions(+), 25 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index fc596a548df7..31f990199154 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -323,11 +323,15 @@ static void sdr_inc_reservation(IPMISdr *sdr)
 static int sdr_add_entry(IPMIBmcSim *ibs, const uint8_t *entry,
  unsigned int len, uint16_t *recid)
 {
+struct ipmi_sdr_header *sdrh_entry = (struct ipmi_sdr_header *) entry;
+struct ipmi_sdr_header *sdrh =
+(struct ipmi_sdr_header *) &ibs->sdr.sdr[ibs->sdr.next_free];
+
 if ((len < 5) || (len > 255)) {
 return 1;
 }
 
-if (entry[4] != len - 5) {
+if (sdrh_entry->rec_length != len - 5) {
 return 1;
 }
 
@@ -336,10 +340,10 @@ static int sdr_add_entry(IPMIBmcSim *ibs, const uint8_t 
*entry,
 return 1;
 }
 
-memcpy(ibs->sdr.sdr + ibs->sdr.next_free, entry, len);
-ibs->sdr.sdr[ibs->sdr.next_free] = ibs->sdr.next_rec_id & 0xff;
-ibs->sdr.sdr[ibs->sdr.next_free+1] = (ibs->sdr.next_rec_id >> 8) & 0xff;
-ibs->sdr.sdr[ibs->sdr.next_free+2] = 0x51; /* Conform to IPMI 1.5 spec */
+memcpy(sdrh, entry, len);
+sdrh->rec_id[0] = ibs->sdr.next_rec_id & 0xff;
+sdrh->rec_id[1] = (ibs->sdr.next_rec_id >> 8) & 0xff;
+sdrh->sdr_version = 0x51; /* Conform to IPMI 1.5 spec */
 
 if (recid) {
 *recid = ibs->sdr.next_rec_id;
@@ -357,8 +361,10 @@ static int sdr_find_entry(IPMISdr *sdr, uint16_t recid,
 unsigned int pos = *retpos;
 
 while (pos < sdr->next_free) {
-uint16_t trec = sdr->sdr[pos] | (sdr->sdr[pos + 1] << 8);
-unsigned int nextpos = pos + sdr->sdr[pos + 4];
+struct ipmi_sdr_header *sdrh =
+(struct ipmi_sdr_header *) &sdr->sdr[pos];
+uint16_t trec = ipmi_sdr_recid(sdrh);
+unsigned int nextpos = pos + sdrh->rec_length;
 
 if (trec == recid) {
 if (nextrec) {
@@ -507,29 +513,32 @@ static void ipmi_init_sensors_from_sdrs(IPMIBmcSim *s)
 
 pos = 0;
 for (i = 0; !sdr_find_entry(&s->sdr, i, &pos, NULL); i++) {
-uint8_t *sdr = s->sdr.sdr + pos;
-unsigned int len = sdr[4];
+struct ipmi_sdr_compact *sdr =
+(struct ipmi_sdr_compact *) &s->sdr.sdr[pos];
+unsigned int len = sdr->header.rec_length;
 
 if (len < 20) {
 continue;
 }
-if ((sdr[3] < 1) || (sdr[3] > 2)) {
+if (sdr->header.rec_type != IPMI_SDR_COMPACT_TYPE) {
 continue; /* Not a sensor SDR we set from */
 }
 
-if (sdr[7] > MAX_SENSORS) {
+if (sdr->sensor_owner_number > MAX_SENSORS) {
 continue;
 }
-sens = s->sensors + sdr[7];
+sens = s->sensors + sdr->sensor_owner_number;
 
 IPMI_SENSOR_SET_PRESENT(sens, 1);
-IPMI_SENSOR_SET_SCAN_ON(sens, (sdr[10] >> 6) & 1);
-IPMI_SENSOR_SET_EVENTS_ON(sens, (sdr[10] >> 5) & 1);
-sens->assert_suppt = sdr[14] | (sdr[15] << 8);
-sens->deassert_suppt = sdr[16] | (sdr[17] << 8);
-sens->states_suppt = sdr[18] | (sdr[19] << 8);
-sens->sensor_type = sdr[12];
-sens->evt_reading_type_code = sdr[13] & 0x7f;
+IPMI_SENSOR_SET_SCAN_ON(sens, (sdr->sensor_init >> 6) & 1);
+IPMI_SENSOR_SET_EVENTS_ON(sens, (sdr->sensor_init >> 5) & 1);
+sens->assert_suppt = sdr->assert_mask[0] | (sdr->assert_mask[1] << 8);
+sens->deassert_suppt =
+sdr->deassert_mask[0] | (sdr->deassert_mask[1] << 8);
+sens->states_suppt =
+sdr->discrete_mask[0] | (sdr->discrete_mask[1] << 8);
+sens->sensor_type = sdr->sensor_type;
+sens->evt_reading_type_code = sdr->reading_type & 0x7f;
 
 /* Enable all the events that are supported. */
 sens->assert_enable = sens->assert_suppt;
@@ -1155,6 +1164,7 @@ static void get_sdr(IPMIBmcSim *ibs,
 {
 unsigned int pos;
 uint16_t nextrec;
+struct ipmi_sdr_header *sdrh;
 
 IPMI_CHECK_CMD_LEN(8);
 if (cmd[6]) {
@@ -1166,7 +1176,10 @@ static void get_sdr(IPMIBmcSim *ibs,
 rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
 return;
 }
-if (cmd[6] > (ibs->sdr.sdr[pos + 4])) {
+
+sdrh = (struct ipmi_sdr_header *) &ibs->sdr.sdr[pos];
+
+if (cmd[6] > sdrh->rec_length) {
 rsp[2] = IPMI_CC_PARM_OUT_OF_RANGE;
 return;
 }
@@ -1175,7

[Qemu-devel] [PATCH v2 2/9] ipmi: replace goto by a return statement

2016-01-21 Thread Cédric Le Goater

Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
handle hidden errors. Using directly a return statement as the same
effect and it removes the fact that 'out' needs to be defined.

The code exits in ipmi_sim_handle_command() are a little different
from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
handled before making use of it. This might be a bit excessive as a
minimum response len is currently 300 bytes and the patch checks that
at least 3 are available.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 140 +++--
 1 file changed, 41 insertions(+), 99 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 0a59e539f549..e42c7e86c344 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -258,7 +258,7 @@ struct IPMIBmcSim {
 do {   \
 if (*rsp_len >= max_rsp_len) { \
 rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;   \
-goto out;  \
+return;\
 }  \
 rsp[(*rsp_len)++] = (b);   \
 } while (0)
@@ -267,7 +267,7 @@ struct IPMIBmcSim {
 #define IPMI_CHECK_CMD_LEN(l) \
 if (cmd_len < l) { \
 rsp[2] = IPMI_CC_REQUEST_DATA_LENGTH_INVALID;  \
-goto out; \
+return; \
 }
 
 /* Check that the reservation in the command is valid. */
@@ -275,7 +275,7 @@ struct IPMIBmcSim {
 do {   \
 if ((cmd[off] | (cmd[off + 1] << 8)) != r) {   \
 rsp[2] = IPMI_CC_INVALID_RESERVATION;  \
-goto out;  \
+return;\
 }  \
 } while (0)
 
@@ -453,14 +453,12 @@ static void gen_event(IPMIBmcSim *ibs, unsigned int 
sens_num, uint8_t deassert,
 }
 
 if (ibs->msg_flags & IPMI_BMC_MSG_FLAG_EVT_BUF_FULL) {
-goto out;
+return;
 }
 
 memcpy(ibs->evtbuf, evt, 16);
 ibs->msg_flags |= IPMI_BMC_MSG_FLAG_EVT_BUF_FULL;
 k->set_atn(s, 1, attn_irq_enabled(ibs));
- out:
-return;
 }
 
 static void sensor_set_discrete_bit(IPMIBmcSim *ibs, unsigned int sensor,
@@ -581,6 +579,11 @@ static void ipmi_sim_handle_command(IPMIBmc *b,
 
 /* Set up the response, set the low bit of NETFN. */
 /* Note that max_rsp_len must be at least 3 */
+if (max_rsp_len < 3) {
+rsp[2] = IPMI_CC_REQUEST_DATA_TRUNCATED;
+goto out;
+}
+
 IPMI_ADD_RSP_DATA(cmd[0] | 0x04);
 IPMI_ADD_RSP_DATA(cmd[1]);
 IPMI_ADD_RSP_DATA(0); /* Assume success */
@@ -698,8 +701,6 @@ static void chassis_capabilities(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
 IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
 IPMI_ADD_RSP_DATA(ibs->parent.slave_addr);
- out:
-return;
 }
 
 static void chassis_status(IPMIBmcSim *ibs,
@@ -711,8 +712,6 @@ static void chassis_status(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA(0);
 IPMI_ADD_RSP_DATA(0);
 IPMI_ADD_RSP_DATA(0);
- out:
-return;
 }
 
 static void chassis_control(IPMIBmcSim *ibs,
@@ -746,10 +745,8 @@ static void chassis_control(IPMIBmcSim *ibs,
 break;
 default:
 rsp[2] = IPMI_CC_INVALID_DATA_FIELD;
-goto out;
+return;
 }
- out:
-return;
 }
 
 static void get_device_id(IPMIBmcSim *ibs,
@@ -768,8 +765,6 @@ static void get_device_id(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA(ibs->mfg_id[2]);
 IPMI_ADD_RSP_DATA(ibs->product_id[0]);
 IPMI_ADD_RSP_DATA(ibs->product_id[1]);
- out:
-return;
 }
 
 static void set_global_enables(IPMIBmcSim *ibs, uint8_t val)
@@ -822,8 +817,6 @@ static void set_bmc_global_enables(IPMIBmcSim *ibs,
 {
 IPMI_CHECK_CMD_LEN(3);
 set_global_enables(ibs, cmd[2]);
- out:
-return;
 }
 
 static void get_bmc_global_enables(IPMIBmcSim *ibs,
@@ -832,8 +825,6 @@ static void get_bmc_global_enables(IPMIBmcSim *ibs,
unsigned int max_rsp_len)
 {
 IPMI_ADD_RSP_DATA(ibs->bmc_global_enables);
- out:
-return;
 }
 
 static void clr_msg_flags(IPMIBmcSim *ibs,
@@ -847,8 +838,6 @@ static void clr_msg_flags(IPMIBmcSim *ibs,
 IPMI_CHECK_CMD_LEN(3);
 ibs->msg_flags &= ~cmd[2];
 k->set_atn(s, attn_set(ibs), attn_irq_enabled(ibs));
- out:
-return;
 }
 
 static void get_msg_flags(IPMIBmcSim *ibs,
@@ -857,8 +846,6 @@ static void get_msg_flags(IPMIBmcSim *ibs,
   unsigned int max_rsp_len)
 {
 IPMI_ADD_RSP_DATA(ibs->msg_flags);
- out:
-return;
 }
 
 static void read_evt_msg_buf(IPMIBmcSim *ibs,
@@ -872,15 +859,13 @@ static vo

[Qemu-devel] [PATCH v2 5/9] ipmi: fix SDR length value

2016-01-21 Thread Cédric Le Goater

The IPMI BMC simulator populates the SDR table with a set of initial
SDRs. The length of each SDR is taken from the record itself (byte 4)
which does not include the size of the header. But, the full length
(header + data) is required by the sdr_add_entry() routine.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 18 +-
 include/hw/ipmi/ipmi.h |  1 +
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 31f990199154..803c7e5130c0 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -327,11 +327,11 @@ static int sdr_add_entry(IPMIBmcSim *ibs, const uint8_t 
*entry,
 struct ipmi_sdr_header *sdrh =
 (struct ipmi_sdr_header *) &ibs->sdr.sdr[ibs->sdr.next_free];
 
-if ((len < 5) || (len > 255)) {
+if ((len < IPMI_SDR_HEADER_SIZE) || (len > 255)) {
 return 1;
 }
 
-if (sdrh_entry->rec_length != len - 5) {
+if (ipmi_sdr_length(sdrh_entry) != len) {
 return 1;
 }
 
@@ -364,7 +364,7 @@ static int sdr_find_entry(IPMISdr *sdr, uint16_t recid,
 struct ipmi_sdr_header *sdrh =
 (struct ipmi_sdr_header *) &sdr->sdr[pos];
 uint16_t trec = ipmi_sdr_recid(sdrh);
-unsigned int nextpos = pos + sdrh->rec_length;
+unsigned int nextpos = pos + ipmi_sdr_length(sdrh);
 
 if (trec == recid) {
 if (nextrec) {
@@ -1179,7 +1179,7 @@ static void get_sdr(IPMIBmcSim *ibs,
 
 sdrh = (struct ipmi_sdr_header *) &ibs->sdr.sdr[pos];
 
-if (cmd[6] > sdrh->rec_length) {
+if (cmd[6] > ipmi_sdr_length(sdrh)) {
 rsp[2] = IPMI_CC_PARM_OUT_OF_RANGE;
 return;
 }
@@ -1188,7 +1188,7 @@ static void get_sdr(IPMIBmcSim *ibs,
 IPMI_ADD_RSP_DATA((nextrec >> 8) & 0xff);
 
 if (cmd[7] == 0xff) {
-cmd[7] = sdrh->rec_length - cmd[6];
+cmd[7] = ipmi_sdr_length(sdrh) - cmd[6];
 }
 
 if ((cmd[7] + *rsp_len) > max_rsp_len) {
@@ -1659,22 +1659,22 @@ static void ipmi_sim_init(Object *obj)
 for (i = 0;;) {
 struct ipmi_sdr_header *sdrh;
 int len;
-if ((i + 5) > sizeof(init_sdrs)) {
+if ((i + IPMI_SDR_HEADER_SIZE) > sizeof(init_sdrs)) {
 error_report("Problem with recid 0x%4.4x", i);
 return;
 }
 sdrh = (struct ipmi_sdr_header *) &init_sdrs[i];
-len = sdrh->rec_length;
+len = ipmi_sdr_length(sdrh);
 recid = ipmi_sdr_recid(sdrh);
 if (recid == 0x) {
 break;
 }
-if ((i + len + 5) > sizeof(init_sdrs)) {
+if ((i + len) > sizeof(init_sdrs)) {
 error_report("Problem with recid 0x%4.4x", i);
 return;
 }
 sdr_add_entry(ibs, init_sdrs + i, len, NULL);
-i += len + 5;
+i += len;
 }
 
 ipmi_init_sensors_from_sdrs(ibs);
diff --git a/include/hw/ipmi/ipmi.h b/include/hw/ipmi/ipmi.h
index 7e142e241dcb..74a2b5af9613 100644
--- a/include/hw/ipmi/ipmi.h
+++ b/include/hw/ipmi/ipmi.h
@@ -219,6 +219,7 @@ struct ipmi_sdr_header {
 #define IPMI_SDR_HEADER_SIZE sizeof(struct ipmi_sdr_header)
 
 #define ipmi_sdr_recid(sdr) ((sdr)->rec_id[0] | ((sdr)->rec_id[1] << 8))
+#define ipmi_sdr_length(sdr) ((sdr)->rec_length + IPMI_SDR_HEADER_SIZE)
 
 /*
  * 43.2 SDR Type 02h. Compact Sensor Record
-- 
2.1.4

Re: [Qemu-devel] [PULL 00/12] X86 queue, 2016-01-21

2016-01-21 Thread Peter Maydell

On 21 January 2016 at 15:09, Eduardo Habkost  wrote:
> The following changes since commit 3c9331c47f4118d5019b0af8eac704824d8d:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2016-01-21 13:09:47 +)
>
> are available in the git repository at:
>
>   git://github.com/ehabkost/qemu.git tags/x86-pull-request
>
> for you to fetch changes up to f74eefe0b98cd7e13825de8e8d9f32e22aed102c:
>
>   target-i386: Add PKU and and OSPKE support (2016-01-21 12:47:16 -0200)
>
> 
> X86 queue, 2016-01-21
>
> 

Applied, thanks.

-- PMM

[Qemu-devel] [PATCH v2 6/9] ipmi: add get and set SENSOR_TYPE commands

2016-01-21 Thread Cédric Le Goater

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
---
 hw/ipmi/ipmi_bmc_sim.c | 45 -
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 803c7e5130c0..7c0f2a1d9799 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -42,6 +42,8 @@
 #define IPMI_CMD_REARM_SENSOR_EVTS0x2a
 #define IPMI_CMD_GET_SENSOR_EVT_STATUS0x2b
 #define IPMI_CMD_GET_SENSOR_READING   0x2d
+#define IPMI_CMD_SET_SENSOR_TYPE  0x2e
+#define IPMI_CMD_GET_SENSOR_TYPE  0x2f
 
 /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
 
@@ -1527,6 +1529,45 @@ static void get_sensor_reading(IPMIBmcSim *ibs,
 }
 }
 
+static void set_sensor_type(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+
+
+IPMI_CHECK_CMD_LEN(5);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+return;
+}
+sens = ibs->sensors + cmd[2];
+sens->sensor_type = cmd[3];
+sens->evt_reading_type_code = cmd[4] & 0x7f;
+}
+
+static void get_sensor_type(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMISensor *sens;
+
+
+IPMI_CHECK_CMD_LEN(3);
+if ((cmd[2] > MAX_SENSORS) ||
+!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
+rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
+return;
+}
+sens = ibs->sensors + cmd[2];
+IPMI_ADD_RSP_DATA(sens->sensor_type);
+IPMI_ADD_RSP_DATA(sens->evt_reading_type_code);
+}
+
+
 static const IPMICmdHandler chassis_cmds[] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
 [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
@@ -1542,7 +1583,9 @@ static const IPMICmdHandler sensor_event_cmds[] = {
 [IPMI_CMD_GET_SENSOR_EVT_ENABLE] = get_sensor_evt_enable,
 [IPMI_CMD_REARM_SENSOR_EVTS] = rearm_sensor_evts,
 [IPMI_CMD_GET_SENSOR_EVT_STATUS] = get_sensor_evt_status,
-[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading
+[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading,
+[IPMI_CMD_SET_SENSOR_TYPE] = set_sensor_type,
+[IPMI_CMD_GET_SENSOR_TYPE] = get_sensor_type,
 };
 static const IPMINetfn sensor_event_netfn = {
 .cmd_nums = ARRAY_SIZE(sensor_event_cmds),
-- 
2.1.4

[Qemu-devel] [PATCH v2 1/9] ppc: add IPMI support

2016-01-21 Thread Cédric Le Goater

Open PowerNV systems use a BT device to communicate with the BMC.
Provide support for it.

Signed-off-by: Cédric Le Goater 
---
 default-configs/ppc64-softmmu.mak | 4 
 1 file changed, 4 insertions(+)

diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index bb71b23ee78a..d6871f43cbd5 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -7,6 +7,10 @@ CONFIG_VIRTIO_VGA=y
 CONFIG_ISA_MMIO=y
 CONFIG_ESCC=y
 CONFIG_M48T59=y
+CONFIG_IPMI=y
+CONFIG_IPMI_LOCAL=y
+CONFIG_IPMI_EXTERN=y
+CONFIG_ISA_IPMI_BT=y
 CONFIG_SERIAL=y
 CONFIG_PARALLEL=y
 CONFIG_I8254=y
-- 
2.1.4

[Qemu-devel] [PATCH v2 7/9] ipmi: add GET_SYS_RESTART_CAUSE chassis command

2016-01-21 Thread Cédric Le Goater

This is a simulator. Just return an unknown cause (0).

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
---
 hw/ipmi/ipmi_bmc_sim.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 7c0f2a1d9799..e882af3f1b40 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -34,6 +34,7 @@
 #define IPMI_CMD_GET_CHASSIS_CAPABILITIES 0x00
 #define IPMI_CMD_GET_CHASSIS_STATUS   0x01
 #define IPMI_CMD_CHASSIS_CONTROL  0x02
+#define IPMI_CMD_GET_SYS_RESTART_CAUSE0x09
 
 #define IPMI_NETFN_SENSOR_EVENT   0x04
 
@@ -197,6 +198,8 @@ struct IPMIBmcSim {
 uint8_t mfg_id[3];
 uint8_t product_id[2];
 
+uint8_t restart_cause;
+
 IPMISel sel;
 IPMISdr sdr;
 IPMISensor sensors[MAX_SENSORS];
@@ -756,6 +759,15 @@ static void chassis_control(IPMIBmcSim *ibs,
 }
 }
 
+static void chassis_get_sys_restart_cause(IPMIBmcSim *ibs,
+   uint8_t *cmd, unsigned int cmd_len,
+   uint8_t *rsp, unsigned int *rsp_len,
+   unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->restart_cause & 0xf); /* Restart Cause */
+IPMI_ADD_RSP_DATA(0);  /* Channel 0 */
+}
+
 static void get_device_id(IPMIBmcSim *ibs,
   uint8_t *cmd, unsigned int cmd_len,
   uint8_t *rsp, unsigned int *rsp_len,
@@ -1571,7 +1583,8 @@ static void get_sensor_type(IPMIBmcSim *ibs,
 static const IPMICmdHandler chassis_cmds[] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
 [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
-[IPMI_CMD_CHASSIS_CONTROL] = chassis_control
+[IPMI_CMD_CHASSIS_CONTROL] = chassis_control,
+[IPMI_CMD_GET_SYS_RESTART_CAUSE] = chassis_get_sys_restart_cause
 };
 static const IPMINetfn chassis_netfn = {
 .cmd_nums = ARRAY_SIZE(chassis_cmds),
@@ -1692,6 +1705,7 @@ static void ipmi_sim_init(Object *obj)
 ibs->bmc_global_enables = (1 << IPMI_BMC_EVENT_LOG_BIT);
 ibs->device_id = 0x20;
 ibs->ipmi_version = 0x02; /* IPMI 2.0 */
+ibs->restart_cause = 0;
 for (i = 0; i < 4; i++) {
 ibs->sel.last_addition[i] = 0xff;
 ibs->sel.last_clear[i] = 0xff;
-- 
2.1.4

[Qemu-devel] [PATCH v2 0/9] ipmi: a couple of enhancements to the BMC simulator

2016-01-21 Thread Cédric Le Goater

Hi,

Here are a few patches adding a couple of IPMI commands to the BMC
simulator. 

Changes since v1:
  - Added IPMI to ppc. We will need it for the future powernv platform. 
  - Added some initial cleanups 
  - Kept FRU, the API extensions to expose SDR and generate events
for later, may be as an extension of this patchset. (Work in progress)

Based on 3db34bf64ab4 and also available here  :

  https://github.com/legoater/qemu/commits/ipmi

Thanks,

Cédric Le Goater (9):
  ppc: add IPMI support
  ipmi: replace goto by a return statement
  ipmi: replace *_MAXCMD defines
  ipmi: introduce a struct ipmi_sdr_compact
  ipmi: fix SDR length value
  ipmi: add get and set SENSOR_TYPE commands
  ipmi: add GET_SYS_RESTART_CAUSE chassis command
  ipmi: add ACPI power and GUID commands
  ipmi: add SET_SENSOR_READING command (tentative try)

 default-configs/ppc64-softmmu.mak |   4 +
 hw/ipmi/ipmi_bmc_sim.c| 479 ++
 include/hw/ipmi/ipmi.h|  45 
 3 files changed, 385 insertions(+), 143 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH v2 3/9] ipmi: replace *_MAXCMD defines

2016-01-21 Thread Cédric Le Goater

ARRAY_SIZE() is simple to use and removes the need to pre-define
the size of the command arrays.

Signed-off-by: Cédric Le Goater 
---
 hw/ipmi/ipmi_bmc_sim.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index e42c7e86c344..fc596a548df7 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -30,14 +30,12 @@
 #include "qemu/error-report.h"
 
 #define IPMI_NETFN_CHASSIS0x00
-#define IPMI_NETFN_CHASSIS_MAXCMD 0x03
 
 #define IPMI_CMD_GET_CHASSIS_CAPABILITIES 0x00
 #define IPMI_CMD_GET_CHASSIS_STATUS   0x01
 #define IPMI_CMD_CHASSIS_CONTROL  0x02
 
 #define IPMI_NETFN_SENSOR_EVENT   0x04
-#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x2e
 
 #define IPMI_CMD_SET_SENSOR_EVT_ENABLE0x28
 #define IPMI_CMD_GET_SENSOR_EVT_ENABLE0x29
@@ -46,7 +44,6 @@
 #define IPMI_CMD_GET_SENSOR_READING   0x2d
 
 /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
-#define IPMI_NETFN_APP_MAXCMD 0x36
 
 #define IPMI_CMD_GET_DEVICE_ID0x01
 #define IPMI_CMD_COLD_RESET   0x02
@@ -63,7 +60,6 @@
 #define IPMI_CMD_READ_EVT_MSG_BUF 0x35
 
 #define IPMI_NETFN_STORAGE0x0a
-#define IPMI_NETFN_STORAGE_MAXCMD 0x4a
 
 #define IPMI_CMD_GET_SDR_REP_INFO 0x20
 #define IPMI_CMD_GET_SDR_REP_ALLOC_INFO   0x21
@@ -1518,18 +1514,17 @@ static void get_sensor_reading(IPMIBmcSim *ibs,
 }
 }
 
-static const IPMICmdHandler chassis_cmds[IPMI_NETFN_CHASSIS_MAXCMD] = {
+static const IPMICmdHandler chassis_cmds[] = {
 [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
 [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
 [IPMI_CMD_CHASSIS_CONTROL] = chassis_control
 };
 static const IPMINetfn chassis_netfn = {
-.cmd_nums = IPMI_NETFN_CHASSIS_MAXCMD,
+.cmd_nums = ARRAY_SIZE(chassis_cmds),
 .cmd_handlers = chassis_cmds
 };
 
-static const IPMICmdHandler
-sensor_event_cmds[IPMI_NETFN_SENSOR_EVENT_MAXCMD] = {
+static const IPMICmdHandler sensor_event_cmds[] = {
 [IPMI_CMD_SET_SENSOR_EVT_ENABLE] = set_sensor_evt_enable,
 [IPMI_CMD_GET_SENSOR_EVT_ENABLE] = get_sensor_evt_enable,
 [IPMI_CMD_REARM_SENSOR_EVTS] = rearm_sensor_evts,
@@ -1537,11 +1532,11 @@ sensor_event_cmds[IPMI_NETFN_SENSOR_EVENT_MAXCMD] = {
 [IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading
 };
 static const IPMINetfn sensor_event_netfn = {
-.cmd_nums = IPMI_NETFN_SENSOR_EVENT_MAXCMD,
+.cmd_nums = ARRAY_SIZE(sensor_event_cmds),
 .cmd_handlers = sensor_event_cmds
 };
 
-static const IPMICmdHandler app_cmds[IPMI_NETFN_APP_MAXCMD] = {
+static const IPMICmdHandler app_cmds[] = {
 [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
 [IPMI_CMD_COLD_RESET] = cold_reset,
 [IPMI_CMD_WARM_RESET] = warm_reset,
@@ -1557,11 +1552,11 @@ static const IPMICmdHandler 
app_cmds[IPMI_NETFN_APP_MAXCMD] = {
 [IPMI_CMD_GET_WATCHDOG_TIMER] = get_watchdog_timer,
 };
 static const IPMINetfn app_netfn = {
-.cmd_nums = IPMI_NETFN_APP_MAXCMD,
+.cmd_nums = ARRAY_SIZE(app_cmds),
 .cmd_handlers = app_cmds
 };
 
-static const IPMICmdHandler storage_cmds[IPMI_NETFN_STORAGE_MAXCMD] = {
+static const IPMICmdHandler storage_cmds[] = {
 [IPMI_CMD_GET_SDR_REP_INFO] = get_sdr_rep_info,
 [IPMI_CMD_RESERVE_SDR_REP] = reserve_sdr_rep,
 [IPMI_CMD_GET_SDR] = get_sdr,
@@ -1577,7 +1572,7 @@ static const IPMICmdHandler 
storage_cmds[IPMI_NETFN_STORAGE_MAXCMD] = {
 };
 
 static const IPMINetfn storage_netfn = {
-.cmd_nums = IPMI_NETFN_STORAGE_MAXCMD,
+.cmd_nums = ARRAY_SIZE(storage_cmds),
 .cmd_handlers = storage_cmds
 };
 
-- 
2.1.4

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-21 Thread Corey Minyard


On 01/17/2016 08:16 AM, Michael S. Tsirkin wrote:

On Sun, Jan 17, 2016 at 02:04:32PM +0200, Marcel Apfelbaum wrote:

On 01/05/2016 07:29 PM, Cédric Le Goater wrote:

Signed-off-by: Cédric Le Goater 
---
  hw/ipmi/ipmi_bmc_sim.c | 55 ++
  1 file changed, 55 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 60586a67104e..c3a06d0ac7e4 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include "sysemu/sysemu.h"
  #include "qemu/timer.h"
  #include "hw/ipmi/ipmi.h"
  #include "qemu/error-report.h"
@@ -54,6 +55,9 @@
  #define IPMI_CMD_GET_DEVICE_ID0x01
  #define IPMI_CMD_COLD_RESET   0x02
  #define IPMI_CMD_WARM_RESET   0x03
+#define IPMI_CMD_SET_POWER_STATE  0x06
+#define IPMI_CMD_GET_POWER_STATE  0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
  #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
  #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
  #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -215,6 +219,9 @@ struct IPMIBmcSim {

  uint8_t restart_cause;

+uint8_t power_state[2];
+uint8_t uuid[16];
+
  IPMISel sel;
  IPMISdr sdr;
  IPMIFru fru;
@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
  k->reset(s, false);
  }
  }
+static void set_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->power_state[0] = cmd[2];
+ibs->power_state[1] = cmd[3];
+ out:
+return;


Hi,

I am sorry for my late comment, but I find a little strange the use of
the "out" label here.
I understand this is because of its usage in IPMI_*  macros, but
I looked into every usage(I hope I didn't miss anything) and the code
simply returns.
Also the correlation between those macros is a little odd.

Thanks,
Marcel


Yes - these macros with goto out are confusing.

Please rewrite them to return bool, and put
goto out in the caller.



Marcel, do you want me to do this?

-corey




+}
+
+static void get_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->power_state[0]);
+IPMI_ADD_RSP_DATA(ibs->power_state[1]);
+ out:
+return;
+}
+
+static void get_device_guid(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+unsigned int i;
+
+for (i = 0; i < 16; i++) {
+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
+}
+ out:
+return;
+}

  static void set_bmc_global_enables(IPMIBmcSim *ibs,
 uint8_t *cmd, unsigned int cmd_len,
@@ -1781,6 +1824,9 @@ static const IPMICmdHandler 
app_cmds[IPMI_NETFN_APP_MAXCMD] = {
  [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
  [IPMI_CMD_COLD_RESET] = cold_reset,
  [IPMI_CMD_WARM_RESET] = warm_reset,
+[IPMI_CMD_SET_POWER_STATE] = set_power_state,
+[IPMI_CMD_GET_POWER_STATE] = get_power_state,
+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
  [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
  [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
  [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
@@ -1907,6 +1953,15 @@ static void ipmi_sim_init(Object *obj)
  i += len;
  }

+ibs->power_state[0] = 0;
+ibs->power_state[1] = 0;
+
+if (qemu_uuid_set) {
+memcpy(&ibs->uuid, qemu_uuid, 16);
+} else {
+memset(&ibs->uuid, 0, 16);
+}
+
  ipmi_init_sensors_from_sdrs(ibs);
  register_cmds(ibs);

Re: [Qemu-devel] [PATCH v9 01/37] qobject: Document more shortcomings in our number handling

2016-01-21 Thread Eric Blake

On 01/20/2016 11:21 PM, Markus Armbruster wrote:
>> One alternative is to always output a guaranteed unambiguous decimal
>> string (although not necessarily the shortest), by using %.17f, using
>>  DBL_DECIMAL_DIG.  (Note that DBL_DIG of 15 is NOT sufficient -
>> it is the lower limit that says that a decimal->float->decimal will not
>> change the decimal; but we want the converse where a
>> float->decimal->float will not change the float.  There are stretches of
>> numbers where the pigeonhole principle applies; you can think of it this
>> way: there is no way to map all possible 2^10 (1024) binary values
>> inside 2^3 (1000) decimal digits without at least 24 of them needing one
>> more decimal digit.  But by the same arguments, DBL_DECIMAL_DIG is an
>> upper limit and usually more than you need.)
>>
>> So, the question is whether we want to always output 17 digits, or
>> whether we want to do the poor-man's truncation scheme (easy to
>> understand, but not optimal use of the processor), or go all the way to
>> the algorithm of that paper (faster but lots harder to understand).  For
>> reference, here's the poor-man's algorithm in pseudocode:
> 
> I don't think we want to implement floating-point formatting ourselves.

Well, we already have _some_ level of floating-point support built in
for TCG emulation of floating point on various architectures.  I don't
know how much would be easily reusable for this case, though.

> 
>> if 0, inf, nan:
>> special case
>> else:
>> Obtain the DBL_DECIMAL_DIG string via sprintf %.17f
>> i = 17;
>> do {
>> truncate the original string to i-1 decimal characters
>> parse that with strtod()
>> if the bit pattern differs:
>> break;
>> } while (--i);
>> assert(i)
>> use i digits of the string
> 
> That's a lot of strtod()...  May not be noticable if we write the result
> to a slowish sink.  Binary search could save a few.
> 
> Naive idea: chop off trailing '0'?

That, and rounding trailing '9'.  Any other digits in positions 1-16 are
significant (other digits in position 17 might not matter, though), so I
guess a full 17 iterations is probably not strictly necessary.  Our
current code DOES chop trailing '0', but starting from the flawed
premise that 6 digits was enough precision.

> 
>> As a separate patch, of course, but I have a pending patch that provides
>> a single place where we could drop in such an improvement:
>> https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03932.html
> 
> Definitely separate.

Okay, all thoughts for a later day :)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PULL 09/11] Add Error **errp for xen_pt_setup_vga()

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

To catch the error message. Also modify the caller

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.c  |7 +--
 hw/xen/xen_pt.h  |3 ++-
 hw/xen/xen_pt_graphics.c |   11 ++-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 53b5bca..07bfcec 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -808,8 +808,11 @@ static int xen_pt_initfn(PCIDevice *d)
 return -1;
 }
 
-if (xen_pt_setup_vga(s, &s->real_device) < 0) {
-XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");
+xen_pt_setup_vga(s, &s->real_device, &err);
+if (err) {
+error_append_hint(&err, "Setup VGA BIOS of passthrough"
+" GFX failed");
+error_report_err(err);
 xen_host_pci_device_put(&s->real_device);
 return -1;
 }
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 3749711..26f74f8 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -330,5 +330,6 @@ static inline bool is_igd_vga_passthrough(XenHostPCIDevice 
*dev)
 }
 int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
 int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev);
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev);
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp);
 #endif /* !XEN_PT_H */
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index df6069b..e7a7c7e 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -161,7 +161,8 @@ struct pci_data {
 uint16_t reserved;
 } __attribute__((packed));
 
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp)
 {
 unsigned char *bios = NULL;
 struct rom_header *rom;
@@ -172,13 +173,14 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, 
XenHostPCIDevice *dev)
 struct pci_data *pd = NULL;
 
 if (!is_igd_vga_passthrough(dev)) {
-return -1;
+error_setg(errp, "Need to enable igd-passthrough");
+return;
 }
 
 bios = get_vgabios(s, &bios_size, dev);
 if (!bios) {
-XEN_PT_ERR(&s->dev, "VGA: Can't getting VBIOS!\n");
-return -1;
+error_setg(errp, "VGA: Can't get VBIOS");
+return;
 }
 
 /* Currently we fixed this address as a primary. */
@@ -203,7 +205,6 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, 
XenHostPCIDevice *dev)
 
 /* Currently we fixed this address as a primary for legacy BIOS. */
 cpu_physical_memory_rw(0xc, bios, bios_size, 1);
-return 0;
 }
 
 uint32_t igd_read_opregion(XenPCIPassthroughState *s)
-- 
1.7.10.4

[Qemu-devel] [PULL 02/11] xenfb.c: avoid expensive loops when prod <= out_cons

2016-01-21 Thread Stefano Stabellini

If the frontend sets out_cons to a value higher than out_prod, it will
cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
better checks at the beginning of the function.

Signed-off-by: Stefano Stabellini 
Reported-by: Ling Liu 
---
 hw/display/xenfb.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 4e2a27a..8eb3046 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -789,8 +789,9 @@ static void xenfb_handle_events(struct XenFB *xenfb)
 
 prod = page->out_prod;
 out_cons = page->out_cons;
-if (prod == out_cons)
-   return;
+if (prod - out_cons >= XENFB_OUT_RING_LEN) {
+return;
+}
 xen_rmb(); /* ensure we see ring contents up to prod */
 for (cons = out_cons; cons != prod; cons++) {
union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons);
-- 
1.7.10.4

[Qemu-devel] [PULL 03/11] xen-hvm: Clean up xen_hvm_init() error handling

2016-01-21 Thread Stefano Stabellini

From: Markus Armbruster 

xen_hvm_init() returns -1 without cleaning up on some errors (harmless
long as the caller exit()s on error), dies with hw_error() on others.
hw_error() isn't approprate here.  Clean up to exit() on all errors.

Signed-off-by: Markus Armbruster 
Reviewed-by: Stefano Stabellini 
Signed-off-by: Stefano Stabellini 
---
 hw/i386/pc_piix.c|5 ++---
 hw/i386/pc_q35.c |5 ++---
 include/hw/xen/xen.h |2 +-
 xen-hvm-stub.c   |3 +--
 xen-hvm.c|   61 ++
 5 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index df2b824..db0ae9c 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -121,9 +121,8 @@ static void pc_init1(MachineState *machine,
 pcms->below_4g_mem_size = machine->ram_size;
 }
 
-if (xen_enabled() && xen_hvm_init(pcms, &ram_memory) != 0) {
-fprintf(stderr, "xen hardware virtual machine initialisation 
failed\n");
-exit(1);
+if (xen_enabled()) {
+xen_hvm_init(pcms, &ram_memory);
 }
 
 pc_cpus_init(pcms);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 412b3cd..abbcbf9 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -111,9 +111,8 @@ static void pc_q35_init(MachineState *machine)
 pcms->below_4g_mem_size = machine->ram_size;
 }
 
-if (xen_enabled() && xen_hvm_init(pcms, &ram_memory) != 0) {
-fprintf(stderr, "xen hardware virtual machine initialisation 
failed\n");
-exit(1);
+if (xen_enabled()) {
+xen_hvm_init(pcms, &ram_memory);
 }
 
 pc_cpus_init(pcms);
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index e90931a..d07bc99 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -39,7 +39,7 @@ qemu_irq *xen_interrupt_controller_init(void);
 void xenstore_store_pv_console_info(int i, struct CharDriverState *chr);
 
 #if defined(NEED_CPU_H) && !defined(CONFIG_USER_ONLY)
-int xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory);
+void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory);
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
struct MemoryRegion *mr);
 void xen_modified_memory(ram_addr_t start, ram_addr_t length);
diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
index 6a39425..7ff0602 100644
--- a/xen-hvm-stub.c
+++ b/xen-hvm-stub.c
@@ -47,9 +47,8 @@ void xen_modified_memory(ram_addr_t start, ram_addr_t length)
 {
 }
 
-int xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
+void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 {
-return 0;
 }
 
 void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
diff --git a/xen-hvm.c b/xen-hvm.c
index 2a93390..cb7128c 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -17,6 +17,7 @@
 #include "qmp-commands.h"
 
 #include "sysemu/char.h"
+#include "qemu/error-report.h"
 #include "qemu/range.h"
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
@@ -1189,16 +1190,8 @@ static void xen_wakeup_notifier(Notifier *notifier, void 
*data)
 xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0);
 }
 
-/* return 0 means OK, or -1 means critical issue -- will exit(1) */
-int xen_hvm_init(PCMachineState *pcms,
- MemoryRegion **ram_memory)
+void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 {
-/*
- * FIXME Returns -1 without cleaning up on some errors (harmless
- * as long as the caller exit()s on error), dies with hw_error()
- * on others.  hw_error() isn't approprate here.  Should probably
- * simply exit() on all errors.
- */
 int i, rc;
 xen_pfn_t ioreq_pfn;
 xen_pfn_t bufioreq_pfn;
@@ -1210,19 +1203,19 @@ int xen_hvm_init(PCMachineState *pcms,
 state->xce_handle = xen_xc_evtchn_open(NULL, 0);
 if (state->xce_handle == XC_HANDLER_INITIAL_VALUE) {
 perror("xen: event channel open");
-return -1;
+goto err;
 }
 
 state->xenstore = xs_daemon_open();
 if (state->xenstore == NULL) {
 perror("xen: xenstore open");
-return -1;
+goto err;
 }
 
 rc = xen_create_ioreq_server(xen_xc, xen_domid, &state->ioservid);
 if (rc < 0) {
 perror("xen: ioreq server create");
-return -1;
+goto err;
 }
 
 state->exit.notify = xen_exit_notifier;
@@ -1238,8 +1231,9 @@ int xen_hvm_init(PCMachineState *pcms,
&ioreq_pfn, &bufioreq_pfn,
&bufioreq_evtchn);
 if (rc < 0) {
-hw_error("failed to get ioreq server info: error %d handle=" 
XC_INTERFACE_FMT,
- errno, xen_xc);
+error_report("failed to get ioreq server info: error %d handle=" 
XC_INTERFACE_FMT,
+ errno, xen_xc);
+goto err;
 }
 
 DPRINTF("shared page at pfn %lx\n", ioreq_pfn);
@@ -1249,8 +1243,9 @@ int xen_hvm_init(PCMachineState *pcms,
 state->sh

[Qemu-devel] [PULL 08/11] Add Error **errp for xen_host_pci_device_get()

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

To catch the error message. Also modify the caller

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Eric Blake 
---
 hw/xen/xen-host-pci-device.c |  102 +++---
 hw/xen/xen-host-pci-device.h |5 ++-
 hw/xen/xen_pt.c  |   13 +++---
 3 files changed, 68 insertions(+), 52 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 83da9c4..3827ca7 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -44,7 +44,7 @@ static void xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
 
 /* This size should be enough to read the first 7 lines of a resource file */
 #define XEN_HOST_PCI_RESOURCE_BUFFER_SIZE 400
-static int xen_host_pci_get_resource(XenHostPCIDevice *d)
+static void xen_host_pci_get_resource(XenHostPCIDevice *d, Error **errp)
 {
 int i, rc, fd;
 char path[PATH_MAX];
@@ -57,19 +57,18 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_file_open(errp, errno, path);
+return;
 }
 
 do {
-rc = read(fd, &buf, sizeof (buf) - 1);
+rc = read(fd, &buf, sizeof(buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
 buf[rc] = 0;
-rc = 0;
 
 s = buf;
 for (i = 0; i < PCI_NUM_REGIONS; i++) {
@@ -122,20 +121,19 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 d->rom.bus_flags = flags & IORESOURCE_BITS;
 }
 }
+
 if (i != PCI_NUM_REGIONS) {
-/* Invalid format or input to short */
-rc = -ENODEV;
+error_setg(errp, "Invalid format or input too short: %s", buf);
 }
 
 out:
 close(fd);
-return rc;
 }
 
 /* This size should be enough to read a long from a file */
 #define XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE 22
-static int xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
-  unsigned int *pvalue, int base)
+static void xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
+   unsigned int *pvalue, int base, Error 
**errp)
 {
 char path[PATH_MAX];
 char buf[XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE];
@@ -147,39 +145,45 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_file_open(errp, errno, path);
+return;
 }
+
 do {
-rc = read(fd, &buf, sizeof (buf) - 1);
+rc = read(fd, &buf, sizeof(buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
+
 buf[rc] = 0;
 rc = qemu_strtoul(buf, &endptr, base, &value);
 if (!rc) {
 assert(value <= UINT_MAX);
 *pvalue = value;
+} else {
+error_setg_errno(errp, -rc, "failed to parse value '%s'", buf);
 }
+
 out:
 close(fd);
-return rc;
 }
 
-static inline int xen_host_pci_get_hex_value(XenHostPCIDevice *d,
- const char *name,
- unsigned int *pvalue)
+static inline void xen_host_pci_get_hex_value(XenHostPCIDevice *d,
+  const char *name,
+  unsigned int *pvalue,
+  Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 16);
+xen_host_pci_get_value(d, name, pvalue, 16, errp);
 }
 
-static inline int xen_host_pci_get_dec_value(XenHostPCIDevice *d,
- const char *name,
- unsigned int *pvalue)
+static inline void xen_host_pci_get_dec_value(XenHostPCIDevice *d,
+  const char *name,
+  unsigned int *pvalue,
+  Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 10);
+xen_host_pci_get_value(d, name, pvalue, 10, errp);
 }
 
 static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
@@ -192,17 +196,16 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice 
*d)
 return !stat(path, &buf);
 }
 
-static int xen_host_pci_config_open(XenHostPCIDevice *d)
+static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
 {
 char path[PATH_MAX];
 
 xen_host_pci_sysfs_path(d, "config", path, sizeof(path));
 
 d->config_fd = op

[Qemu-devel] [PULL 0/11] xen-20160121

2016-01-21 Thread Stefano Stabellini

The following changes since commit 17c8a2197888bac8ec0256b16919b721c76c5e62:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-01-13' into 
staging (2016-01-14 13:07:38 +)

are available in the git repository at:


  git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20160121

for you to fetch changes up to 5a11d0f7549e24a10e178a9dc8ff5e698031d9a6:

  Xen PCI passthru: convert to realize() (2016-01-21 16:45:54 +)


Xen 2016/01/21


Cao jin (7):
  xen-pvdevice: convert to realize()
  Change xen_host_pci_sysfs_path() to return void
  Xen: use qemu_strtoul instead of strtol
  Add Error **errp for xen_host_pci_device_get()
  Add Error **errp for xen_pt_setup_vga()
  Add Error **errp for xen_pt_config_init()
  Xen PCI passthru: convert to realize()

Markus Armbruster (2):
  xen-hvm: Clean up xen_hvm_init() error handling
  xen-hvm: Clean up xen_ram_alloc() error handling

Stefano Stabellini (2):
  MAINTAINERS: update Xen files
  xenfb.c: avoid expensive loops when prod <= out_cons

 MAINTAINERS  |3 +
 exec.c   |8 ++-
 hw/display/xenfb.c   |5 +-
 hw/i386/pc_piix.c|5 +-
 hw/i386/pc_q35.c |5 +-
 hw/i386/xen/xen_pvdevice.c   |   12 ++--
 hw/xen/xen-host-pci-device.c |  149 --
 hw/xen/xen-host-pci-device.h |5 +-
 hw/xen/xen_pt.c  |   77 --
 hw/xen/xen_pt.h  |5 +-
 hw/xen/xen_pt_config_init.c  |   51 ---
 hw/xen/xen_pt_graphics.c |   11 ++--
 include/hw/xen/xen.h |4 +-
 xen-hvm-stub.c   |6 +-
 xen-hvm.c|   68 ++-
 15 files changed, 219 insertions(+), 195 deletions(-)

[Qemu-devel] [PULL 05/11] xen-pvdevice: convert to realize()

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
---
 hw/i386/xen/xen_pvdevice.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/i386/xen/xen_pvdevice.c b/hw/i386/xen/xen_pvdevice.c
index c218947..9abcf25 100644
--- a/hw/i386/xen/xen_pvdevice.c
+++ b/hw/i386/xen/xen_pvdevice.c
@@ -69,14 +69,16 @@ static const MemoryRegionOps xen_pv_mmio_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
-static int xen_pv_init(PCIDevice *pci_dev)
+static void xen_pv_realize(PCIDevice *pci_dev, Error **errp)
 {
 XenPVDevice *d = XEN_PV_DEVICE(pci_dev);
 uint8_t *pci_conf;
 
 /* device-id property must always be supplied */
-if (d->device_id == 0x)
-   return -1;
+if (d->device_id == 0x) {
+error_setg(errp, "Device ID invalid, it must always be supplied");
+return;
+}
 
 pci_conf = pci_dev->config;
 
@@ -97,8 +99,6 @@ static int xen_pv_init(PCIDevice *pci_dev)
 
 pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_MEM_PREFETCH,
  &d->mmio);
-
-return 0;
 }
 
 static Property xen_pv_props[] = {
@@ -114,7 +114,7 @@ static void xen_pv_class_init(ObjectClass *klass, void 
*data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->init = xen_pv_init;
+k->realize = xen_pv_realize;
 k->class_id = PCI_CLASS_SYSTEM_OTHER;
 dc->desc = "Xen PV Device";
 dc->props = xen_pv_props;
-- 
1.7.10.4

[Qemu-devel] [PULL 01/11] MAINTAINERS: update Xen files

2016-01-21 Thread Stefano Stabellini

Add the PV block backend, the Xen mapcache, and hw/i386/xen to the list
of Xen related files maintained by me.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Markus Armbruster 
---
 MAINTAINERS |3 +++
 1 file changed, 3 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d8b0f36..e3d4513 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -273,9 +273,12 @@ F: */xen*
 F: hw/char/xen_console.c
 F: hw/display/xenfb.c
 F: hw/net/xen_nic.c
+F: hw/block/xen_*
 F: hw/xen/
 F: hw/xenpv/
+F: hw/i386/xen/
 F: include/hw/xen/
+F: include/sysemu/xen-mapcache.h
 
 Hosts:
 --
-- 
1.7.10.4

[Qemu-devel] [PULL 10/11] Add Error **errp for xen_pt_config_init()

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

To catch the error message. Also modify the caller

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.c |8 ---
 hw/xen/xen_pt.h |2 +-
 hw/xen/xen_pt_config_init.c |   51 +++
 3 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 07bfcec..9eef3df 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -825,9 +825,11 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_pt_register_regions(s, &cmd);
 
 /* reinitialize each config register to be emulated */
-rc = xen_pt_config_init(s);
-if (rc) {
-XEN_PT_ERR(d, "PCI Config space initialisation failed.\n");
+xen_pt_config_init(s, &err);
+if (err) {
+error_append_hint(&err, "PCI Config space initialisation failed");
+error_report_err(err);
+rc = -1;
 goto err_out;
 }
 
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 26f74f8..c2f8e1f 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -230,7 +230,7 @@ struct XenPCIPassthroughState {
 bool listener_set;
 };
 
-int xen_pt_config_init(XenPCIPassthroughState *s);
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp);
 void xen_pt_config_delete(XenPCIPassthroughState *s);
 XenPTRegGroup *xen_pt_find_reg_grp(XenPCIPassthroughState *s, uint32_t 
address);
 XenPTReg *xen_pt_find_reg(XenPTRegGroup *reg_grp, uint32_t address);
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 185a698..81c6721 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1887,8 +1887,9 @@ static uint8_t find_cap_offset(XenPCIPassthroughState *s, 
uint8_t cap)
 return 0;
 }
 
-static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
-  XenPTRegGroup *reg_grp, XenPTRegInfo *reg)
+static void xen_pt_config_reg_init(XenPCIPassthroughState *s,
+   XenPTRegGroup *reg_grp, XenPTRegInfo *reg,
+   Error **errp)
 {
 XenPTReg *reg_entry;
 uint32_t data = 0;
@@ -1907,12 +1908,13 @@ static int 
xen_pt_config_reg_init(XenPCIPassthroughState *s,
reg_grp->base_offset + reg->offset, &data);
 if (rc < 0) {
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Init emulate register fail");
+return;
 }
 if (data == XEN_PT_INVALID_REG) {
 /* free unused BAR register entry */
 g_free(reg_entry);
-return 0;
+return;
 }
 /* Sync up the data to dev.config */
 offset = reg_grp->base_offset + reg->offset;
@@ -1930,7 +1932,8 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState 
*s,
 if (rc) {
 /* Serious issues when we cannot read the host values! */
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Cannot read host values");
+return;
 }
 /* Set bits in emu_mask are the ones we emulate. The dev.config shall
  * contain the emulated view of the guest - therefore we flip the mask
@@ -1955,10 +1958,10 @@ static int 
xen_pt_config_reg_init(XenPCIPassthroughState *s,
 val = data;
 
 if (val & ~size_mask) {
-XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04x expands past register 
size(%d)!\n",
-   offset, val, reg->size);
+error_setg(errp, "Offset 0x%04x:0x%04x expands past"
+" register size (%d)", offset, val, reg->size);
 g_free(reg_entry);
-return -ENXIO;
+return;
 }
 /* This could be just pci_set_long as we don't modify the bits
  * past reg->size, but in case this routine is run in parallel or the
@@ -1978,13 +1981,12 @@ static int 
xen_pt_config_reg_init(XenPCIPassthroughState *s,
 }
 /* list add register entry */
 QLIST_INSERT_HEAD(®_grp->reg_tbl_list, reg_entry, entries);
-
-return 0;
 }
 
-int xen_pt_config_init(XenPCIPassthroughState *s)
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp)
 {
 int i, rc;
+Error *err = NULL;
 
 QLIST_INIT(&s->reg_grps);
 
@@ -2027,11 +2029,12 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
   reg_grp_offset,
   ®_grp_entry->size);
 if (rc < 0) {
-XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, 
rc:%d\n",
-   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
+error_setg(&err, "Failed to initialize %d/%zu, type = 0x%x,"
+   " rc: %d", i, ARRAY_SIZE(xen_pt_emu_reg_grps),
xen_pt_emu_reg_grps[i].grp_type, rc);
+error_propagate(errp, err);

[Qemu-devel] [PULL 06/11] Change xen_host_pci_sysfs_path() to return void

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

And assert the snprintf() error, because user can do nothing in case of
snprintf() fail.

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Eric Blake 
---
 hw/xen/xen-host-pci-device.c |   35 +++
 1 file changed, 11 insertions(+), 24 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 7d8a023..9c342e7 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -31,19 +31,14 @@
 #define IORESOURCE_PREFETCH 0x1000  /* No side effects */
 #define IORESOURCE_MEM_64   0x0010
 
-static int xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
-   const char *name, char *buf, ssize_t size)
+static void xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
+const char *name, char *buf, ssize_t size)
 {
 int rc;
 
 rc = snprintf(buf, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
   d->domain, d->bus, d->dev, d->func, name);
-
-if (rc >= size || rc < 0) {
-/* The output is truncated, or some other error was encountered */
-return -ENODEV;
-}
-return 0;
+assert(rc >= 0 && rc < size);
 }
 
 
@@ -58,10 +53,8 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 char *endptr, *s;
 uint8_t type;
 
-rc = xen_host_pci_sysfs_path(d, "resource", path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, "resource", path, sizeof(path));
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
 XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
@@ -150,10 +143,8 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 unsigned long value;
 char *endptr;
 
-rc = xen_host_pci_sysfs_path(d, name, path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, name, path, sizeof(path));
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
 XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
@@ -200,21 +191,17 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice 
*d)
 char path[PATH_MAX];
 struct stat buf;
 
-if (xen_host_pci_sysfs_path(d, "physfn", path, sizeof (path))) {
-return false;
-}
+xen_host_pci_sysfs_path(d, "physfn", path, sizeof(path));
+
 return !stat(path, &buf);
 }
 
 static int xen_host_pci_config_open(XenHostPCIDevice *d)
 {
 char path[PATH_MAX];
-int rc;
 
-rc = xen_host_pci_sysfs_path(d, "config", path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, "config", path, sizeof(path));
+
 d->config_fd = open(path, O_RDWR);
 if (d->config_fd < 0) {
 return -errno;
-- 
1.7.10.4

[Qemu-devel] [PULL 04/11] xen-hvm: Clean up xen_ram_alloc() error handling

2016-01-21 Thread Stefano Stabellini

From: Markus Armbruster 

xen_ram_alloc() dies with hw_error() on error, even though its caller
ram_block_add() handles errors just fine.  Add an Error **errp
parameter and use it.

Leave case RUN_STATE_INMIGRATE alone, because that looks like some
kind of warning.

Signed-off-by: Markus Armbruster 
Reviewed-by: Stefano Stabellini 
Signed-off-by: Stefano Stabellini 
---
 exec.c   |8 +++-
 include/hw/xen/xen.h |2 +-
 xen-hvm-stub.c   |3 ++-
 xen-hvm.c|7 ---
 4 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 7f0ce42..c268c36 100644
--- a/exec.c
+++ b/exec.c
@@ -1474,6 +1474,7 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 RAMBlock *block;
 RAMBlock *last_block = NULL;
 ram_addr_t old_ram_size, new_ram_size;
+Error *err = NULL;
 
 old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
 
@@ -1483,7 +1484,12 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 if (!new_block->host) {
 if (xen_enabled()) {
 xen_ram_alloc(new_block->offset, new_block->max_length,
-  new_block->mr);
+  new_block->mr, &err);
+if (err) {
+error_propagate(errp, err);
+qemu_mutex_unlock_ramlist();
+return -1;
+}
 } else {
 new_block->host = phys_mem_alloc(new_block->max_length,
  &new_block->mr->align);
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index d07bc99..1b81b4b 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -41,7 +41,7 @@ void xenstore_store_pv_console_info(int i, struct 
CharDriverState *chr);
 #if defined(NEED_CPU_H) && !defined(CONFIG_USER_ONLY)
 void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory);
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
-   struct MemoryRegion *mr);
+   struct MemoryRegion *mr, Error **errp);
 void xen_modified_memory(ram_addr_t start, ram_addr_t length);
 #endif
 
diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
index 7ff0602..b958367 100644
--- a/xen-hvm-stub.c
+++ b/xen-hvm-stub.c
@@ -30,7 +30,8 @@ void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
 {
 }
 
-void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
+   Error **errp)
 {
 }
 
diff --git a/xen-hvm.c b/xen-hvm.c
index cb7128c..a9085a8 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -239,9 +239,9 @@ static void xen_ram_init(PCMachineState *pcms,
 }
 }
 
-void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
+   Error **errp)
 {
-/* FIXME caller ram_block_add() wants error_setg() on failure */
 unsigned long nr_pfn;
 xen_pfn_t *pfn_list;
 int i;
@@ -268,7 +268,8 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, 
MemoryRegion *mr)
 }
 
 if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, 
pfn_list)) {
-hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
+error_setg(errp, "xen: failed to populate ram at " RAM_ADDR_FMT,
+   ram_addr);
 }
 
 g_free(pfn_list);
-- 
1.7.10.4

[Qemu-devel] [PULL 07/11] Xen: use qemu_strtoul instead of strtol

2016-01-21 Thread Stefano Stabellini

From: Cao jin 

No need to roll our own (with slightly incorrect handling of errno),
when we can use the common version.

Change signed parsing to unsigned, because what it read are values in
PCI config space, which are non-negative.

Signed-off-by: Cao jin 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Eric Blake 
---
 hw/xen/xen-host-pci-device.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 9c342e7..83da9c4 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -141,7 +141,7 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 char buf[XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE];
 int fd, rc;
 unsigned long value;
-char *endptr;
+const char *endptr;
 
 xen_host_pci_sysfs_path(d, name, path, sizeof(path));
 
@@ -158,13 +158,9 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 }
 } while (rc < 0);
 buf[rc] = 0;
-value = strtol(buf, &endptr, base);
-if (endptr == buf || *endptr != '\n') {
-rc = -1;
-} else if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
-rc = -errno;
-} else {
-rc = 0;
+rc = qemu_strtoul(buf, &endptr, base, &value);
+if (!rc) {
+assert(value <= UINT_MAX);
 *pvalue = value;
 }
 out:
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH v9 2/3] quorum: implement bdrv_add_child() and bdrv_del_child()

2016-01-21 Thread Eric Blake

On 01/21/2016 06:05 AM, Alberto Garcia wrote:
> On Thu 21 Jan 2016 02:54:10 AM CET, Wen Congyang  wrote:
> 
 @@ -875,9 +878,9 @@ static int quorum_open(BlockDriverState *bs, QDict 
 *options, int flags,
  ret = -EINVAL;
  goto exit;
  }
 -if (s->num_children < 2) {
 +if (s->num_children < 1) {
  error_setg(&local_err,
 -   "Number of provided children must be greater than 1");
 +   "Number of provided children must be 1 or more");
  ret = -EINVAL;
  goto exit;
  }
>>>
>>> I have a question: if you have a Quorum with just one member and you
>>> add a new one, how do you know if it has the same data as the
>>> existing one?
>>>
>>> In general, what do you do to make sure that the data in a new Quorum
>>> child is consistent with that of the rest of the array?
>>
>> Quorum can have more than one child when it starts. But we don't do
>> the similar check. So I don't think we should do such check here.
> 
> Yes, but when you start a VM you can verify in advance that all members
> of the Quorum have the same data. If you do that on a running VM how can
> you know if the new disk is consistent with the others?

User error if it is not.  Just the same as it is user error if you
request a shallow drive-mirror but the destination is not the same
contents as the backing file.  I don't think qemu has to protect us from
user error in this case.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-21 Thread Marcel Apfelbaum


On 01/21/2016 06:41 PM, Cédric Le Goater wrote:

On 01/21/2016 05:37 PM, Corey Minyard wrote:

On 01/17/2016 08:16 AM, Michael S. Tsirkin wrote:

On Sun, Jan 17, 2016 at 02:04:32PM +0200, Marcel Apfelbaum wrote:

On 01/05/2016 07:29 PM, Cédric Le Goater wrote:

Signed-off-by: Cédric Le Goater 
---
   hw/ipmi/ipmi_bmc_sim.c | 55 
++
   1 file changed, 55 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 60586a67104e..c3a06d0ac7e4 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
   #include 
   #include 
   #include 
+#include "sysemu/sysemu.h"
   #include "qemu/timer.h"
   #include "hw/ipmi/ipmi.h"
   #include "qemu/error-report.h"
@@ -54,6 +55,9 @@
   #define IPMI_CMD_GET_DEVICE_ID0x01
   #define IPMI_CMD_COLD_RESET   0x02
   #define IPMI_CMD_WARM_RESET   0x03
+#define IPMI_CMD_SET_POWER_STATE  0x06
+#define IPMI_CMD_GET_POWER_STATE  0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
   #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
   #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
   #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -215,6 +219,9 @@ struct IPMIBmcSim {

   uint8_t restart_cause;

+uint8_t power_state[2];
+uint8_t uuid[16];
+
   IPMISel sel;
   IPMISdr sdr;
   IPMIFru fru;
@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
   k->reset(s, false);
   }
   }
+static void set_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->power_state[0] = cmd[2];
+ibs->power_state[1] = cmd[3];
+ out:
+return;


Hi,

I am sorry for my late comment, but I find a little strange the use of
the "out" label here.
I understand this is because of its usage in IPMI_*  macros, but
I looked into every usage(I hope I didn't miss anything) and the code
simply returns.
Also the correlation between those macros is a little odd.

Thanks,
Marcel


Yes - these macros with goto out are confusing.

Please rewrite them to return bool, and put
goto out in the caller.



Marcel, do you want me to do this?


I have the patch ready (and more). I will send this one to start with.


Thanks!
Marcel



C.

Re: [Qemu-devel] [PATCH] gtk: use qemu_chr_alloc() to allocate CharDriverState

2016-01-21 Thread Eric Blake

On 01/21/2016 04:56 AM, Daniel P. Berrange wrote:
> The gd_vc_handler() callback is using g_malloc0() to
> allocate the CharDriverState struct. As a result the
> logfd field is getting initialized to 0, instead of
> -1 when no logfile is requested.
> 
> The result is that when running
> 
>  $ qemu-system-i386 -nodefaults -chardev vc,id=mon0 -mon chardev=mon0
> 
> qemu duplicates all monitor output to stdout as well
> as the GTK window.
> 
> Not using qemu_chr_alloc() was already a bug, but harmless
> until this commit
> 
>   commit d0d7708ba29cbcc343364a46bff981e0ff88366f
>   Author: Daniel P. Berrange 
>   Date:   Mon Jan 11 12:44:41 2016 +
> 
> qemu-char: add logfile facility to all chardev backends
> 
> which exposed the problem as a behaviour regression
> 
> Reported-by: Hervé Poussineau 
> Signed-off-by: Daniel P. Berrange 
> ---
>  ui/gtk.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v6 0/6] Xen PCI passthru: Convert to realize()

2016-01-21 Thread Stefano Stabellini

On Thu, 21 Jan 2016, Eric Blake wrote:
> On 01/21/2016 08:41 AM, Stefano Stabellini wrote:
> > Hi Cao,
> > 
> > I appreciate the reminder, but it looks like Eric hasn't reviewed patch
> > 3/5. Am I wrong?
> 
> I've done it now. Thanks for your patience, and for letting me jump in
> with a late review (since I remember you were about to push an earlier
> version when I started looking at it).
 
Thank you for reviewing :-)

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-21 Thread Cédric Le Goater

On 01/21/2016 05:37 PM, Corey Minyard wrote:
> On 01/17/2016 08:16 AM, Michael S. Tsirkin wrote:
>> On Sun, Jan 17, 2016 at 02:04:32PM +0200, Marcel Apfelbaum wrote:
>>> On 01/05/2016 07:29 PM, Cédric Le Goater wrote:
 Signed-off-by: Cédric Le Goater 
 ---
   hw/ipmi/ipmi_bmc_sim.c | 55 
 ++
   1 file changed, 55 insertions(+)

 diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
 index 60586a67104e..c3a06d0ac7e4 100644
 --- a/hw/ipmi/ipmi_bmc_sim.c
 +++ b/hw/ipmi/ipmi_bmc_sim.c
 @@ -25,6 +25,7 @@
   #include 
   #include 
   #include 
 +#include "sysemu/sysemu.h"
   #include "qemu/timer.h"
   #include "hw/ipmi/ipmi.h"
   #include "qemu/error-report.h"
 @@ -54,6 +55,9 @@
   #define IPMI_CMD_GET_DEVICE_ID0x01
   #define IPMI_CMD_COLD_RESET   0x02
   #define IPMI_CMD_WARM_RESET   0x03
 +#define IPMI_CMD_SET_POWER_STATE  0x06
 +#define IPMI_CMD_GET_POWER_STATE  0x07
 +#define IPMI_CMD_GET_DEVICE_GUID  0x08
   #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
   #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
   #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
 @@ -215,6 +219,9 @@ struct IPMIBmcSim {

   uint8_t restart_cause;

 +uint8_t power_state[2];
 +uint8_t uuid[16];
 +
   IPMISel sel;
   IPMISdr sdr;
   IPMIFru fru;
 @@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
   k->reset(s, false);
   }
   }
 +static void set_power_state(IPMIBmcSim *ibs,
 +  uint8_t *cmd, unsigned int cmd_len,
 +  uint8_t *rsp, unsigned int *rsp_len,
 +  unsigned int max_rsp_len)
 +{
 +IPMI_CHECK_CMD_LEN(4);
 +ibs->power_state[0] = cmd[2];
 +ibs->power_state[1] = cmd[3];
 + out:
 +return;
>>>
>>> Hi,
>>>
>>> I am sorry for my late comment, but I find a little strange the use of
>>> the "out" label here.
>>> I understand this is because of its usage in IPMI_*  macros, but
>>> I looked into every usage(I hope I didn't miss anything) and the code
>>> simply returns.
>>> Also the correlation between those macros is a little odd.
>>>
>>> Thanks,
>>> Marcel
>>
>> Yes - these macros with goto out are confusing.
>>
>> Please rewrite them to return bool, and put
>> goto out in the caller.
>>
> 
> Marcel, do you want me to do this?

I have the patch ready (and more). I will send this one to start with.

C.

[Qemu-devel] [PATCH v4 14/14] nbd: enable use of TLS with nbd-server-start command

2016-01-21 Thread Daniel P. Berrange

This modifies the nbd-server-start QMP command so that it
is possible to request use of TLS. This is done by adding
a new optional parameter "tls-creds" which provides the ID
of a previously created QCryptoTLSCreds object instance.

TLS is only supported when using an IPv4/IPv6 socket listener.

Signed-off-by: Daniel P. Berrange 
---
 blockdev-nbd.c  | 122 ++--
 hmp.c   |   2 +-
 qapi/block.json |   4 +-
 qmp-commands.hx |   2 +-
 4 files changed, 105 insertions(+), 25 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 6af1684..e287f05 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -19,42 +19,126 @@
 #include "block/nbd.h"
 #include "io/channel-socket.h"
 
-static QIOChannelSocket *server_ioc;
-static int server_watch = -1;
+typedef struct NBDServerData {
+QIOChannelSocket *listen_ioc;
+int watch;
+QCryptoTLSCreds *tlscreds;
+} NBDServerData;
+
+static NBDServerData *nbd_server;
+
 
 static gboolean nbd_accept(QIOChannel *ioc, GIOCondition condition,
gpointer opaque)
 {
 QIOChannelSocket *cioc;
 
+if (!nbd_server) {
+return FALSE;
+}
+
 cioc = qio_channel_socket_accept(QIO_CHANNEL_SOCKET(ioc),
  NULL);
 if (!cioc) {
 return TRUE;
 }
 
-nbd_client_new(NULL, cioc, NULL, NULL, nbd_client_put);
+nbd_client_new(NULL, cioc,
+   nbd_server->tlscreds, NULL,
+   nbd_client_put);
 object_unref(OBJECT(cioc));
 return TRUE;
 }
 
-void qmp_nbd_server_start(SocketAddress *addr, Error **errp)
+
+static void nbd_server_free(NBDServerData *server)
 {
-if (server_ioc) {
-error_setg(errp, "NBD server already running");
+if (!server) {
 return;
 }
 
-server_ioc = qio_channel_socket_new();
-if (qio_channel_socket_listen_sync(server_ioc, addr, errp) < 0) {
+if (server->watch != -1) {
+g_source_remove(server->watch);
+}
+object_unref(OBJECT(server->listen_ioc));
+if (server->tlscreds) {
+object_unref(OBJECT(server->tlscreds));
+}
+
+g_free(server);
+}
+
+static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp)
+{
+Object *obj;
+QCryptoTLSCreds *creds;
+
+obj = object_resolve_path_component(
+object_get_objects_root(), id);
+if (!obj) {
+error_setg(errp, "No TLS credentials with id '%s'",
+   id);
+return NULL;
+}
+creds = (QCryptoTLSCreds *)
+object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS);
+if (!creds) {
+error_setg(errp, "Object with id '%s' is not TLS credentials",
+   id);
+return NULL;
+}
+
+if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_SERVER) {
+error_setg(errp,
+   "Expecting TLS credentials with a server endpoint");
+return NULL;
+}
+object_ref(obj);
+return creds;
+}
+
+
+void qmp_nbd_server_start(SocketAddress *addr,
+  bool has_tls_creds, const char *tls_creds,
+  Error **errp)
+{
+if (nbd_server) {
+error_setg(errp, "NBD server already running");
 return;
 }
 
-server_watch = qio_channel_add_watch(QIO_CHANNEL(server_ioc),
- G_IO_IN,
- nbd_accept,
- NULL,
- NULL);
+nbd_server = g_new0(NBDServerData, 1);
+nbd_server->watch = -1;
+nbd_server->listen_ioc = qio_channel_socket_new();
+if (qio_channel_socket_listen_sync(
+nbd_server->listen_ioc, addr, errp) < 0) {
+goto error;
+}
+
+if (has_tls_creds) {
+nbd_server->tlscreds = nbd_get_tls_creds(tls_creds, errp);
+if (!nbd_server->tlscreds) {
+goto error;
+}
+
+if (addr->type != SOCKET_ADDRESS_KIND_INET) {
+error_setg(errp, "TLS is only supported with IPv4/IPv6");
+goto error;
+}
+}
+
+nbd_server->watch = qio_channel_add_watch(
+QIO_CHANNEL(nbd_server->listen_ioc),
+G_IO_IN,
+nbd_accept,
+NULL,
+NULL);
+
+return;
+
+ error:
+nbd_server_free(nbd_server);
+nbd_server = NULL;
 }
 
 /*
@@ -89,7 +173,7 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 NBDExport *exp;
 NBDCloseNotifier *n;
 
-if (!server_ioc) {
+if (!nbd_server) {
 error_setg(errp, "NBD server not running");
 return;
 }
@@ -139,12 +223,6 @@ void qmp_nbd_server_stop(Error **errp)
 nbd_close_notifier(&cn->n, nbd_export_get_blockdev(cn->exp));
 }
 
-if (server_watch != -1) {
-g_source_remove(server_watch);
-server_watch = -1;
-}
-if (server_ioc) {
-object_unref(OBJECT(server_ioc));
-server_ioc =

1 2 3 >

1 - 100 of 295 matches

Mail list logo