date:20170612

Re: [Qemu-devel] [PATCH 0/2] Unit test+fix for problem with QEMU handling of multiple bind()s to the same port

2017-06-12 Thread Knut Omang

On Fri, 2017-06-09 at 14:39 -0500, Eric Blake wrote:
> On 06/09/2017 02:19 PM, Knut Omang wrote:
> > This series contains:
> > * a unit test that exposes a race condition which causes QEMU to fail
> >   to find a port even when there is plenty of available ports.
> > * a refactor of the qemu-sockets inet_listen_saddr() function
> >   to better handle this situation.
> > 
> > Thanks,
> > Knut
> > 
> > Knut Omang (2):
> >   Add test-listen - a stress test for QEMU socket listen
> >   socket: Handle race condition between binds to the same port
> > 
> 
> I'd reorder the series, to put the fix first and the test second, rather
> than the (crippled) test first.  

Are there good reasons not to have the test first? (as long as it does not 
break the build). IMHO the logical test driven approach 
is to have the test first to highlight/reproduce the issue.

> Someone that wants to prove that the
> test works can easily apply the patches out of order.

Yes, but in my view that's both less logical and less convenient?

Thanks,
Knut

Re: [Qemu-devel] [PATCH] timer/aspeed: fix timer enablement when a reload is not set

2017-06-12 Thread Cédric Le Goater

On 06/13/2017 06:37 AM, Andrew Jeffery wrote:
> On Fri, 2017-06-09 at 07:40 +0200, Cédric Le Goater wrote:
>> On 06/09/2017 04:26 AM, Andrew Jeffery wrote:
>>> On Tue, 2017-06-06 at 10:55 +0200, Cédric Le Goater wrote:
 When a timer is enabled before a reload value is set, the controller
 waits for a reload value to be set before starting decrementing. This
 fix tries to cover that case by changing the timer expiry only when
 a reload value is valid.

> Signed-off-by: Cédric Le Goater 

 ---
  hw/timer/aspeed_timer.c | 37 +
  1 file changed, 29 insertions(+), 8 deletions(-)

 diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
 index 9b70ee09b07f..50acbf530a3a 100644
 --- a/hw/timer/aspeed_timer.c
 +++ b/hw/timer/aspeed_timer.c
 @@ -130,15 +130,26 @@ static uint64_t calculate_next(struct AspeedTimer *t)
  next = seq[1];
  } else if (now < seq[2]) {
  next = seq[2];
 -} else {
 +} else if (t->reload) {
  reload_ns = muldiv64(t->reload, NANOSECONDS_PER_SECOND, rate);
  t->start = now - ((now - t->start) % reload_ns);
 +} else {
 +/* no reload value, return 0 */
 +break;
  }
  }
  
  return next;
  }
  
 +static void aspeed_timer_mod(AspeedTimer *t)
 +{
 +uint64_t next = calculate_next(t);
 +if (next) {
 +timer_mod(>timer, next);
 +}
 +}
 +
  static void aspeed_timer_expire(void *opaque)
  {
  AspeedTimer *t = opaque;
 @@ -164,7 +175,7 @@ static void aspeed_timer_expire(void *opaque)
  qemu_set_irq(t->irq, t->level);
  }
  
 -timer_mod(>timer, calculate_next(t));
 +aspeed_timer_mod(t);
  }
  
  static uint64_t aspeed_timer_get_value(AspeedTimer *t, int reg)
 @@ -227,10 +238,23 @@ static void 
 aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
 uint32_t value)
  {
  AspeedTimer *t;
 +uint32_t old_reload;
  
  trace_aspeed_timer_set_value(timer, reg, value);
  t = >timers[timer];
  switch (reg) {
 +case TIMER_REG_RELOAD:
 +old_reload = t->reload;
 +t->reload = value;
 +
 +/* If the reload value was not previously set, or zero, and
 + * the current value is valid, try to start the timer if it is
 + * enabled.
 + */
 +if (old_reload || !t->reload) {
 +break;
 +}
>>>
>>> Maybe I need more caffeine, but I initially struggled to reconcile the
>>> condition with the comment, as the condition checks the inverse in
>>> order to break while the comment discusses the non-breaking case. 
>>
>> I agree. The reload "value" is used in a hidden way to the activate the 
>> timer.
>>
>>> However, after trying for several minutes, I'm not sure there's an easy
>>> way to improve it.
>>
>> I tried a few things. May be, we could move the following code in 
>> its own routine and call it twice ? 
> 
> I don't think it's necessary. The comment serves as enough warning - it
> should at least make people think before modifying the code.

OK. Let it be.


Peter, 

The Moxa Art timer driver was recently merged into the Faraday 
FTTMR010 driver and the initial setup is slightly different, 
it enables the timer before setting the reload value, which 
breaks the current QEMU model.

Thanks,

C. 


> Cheers,
> 
> Andrew
> 
>>  
 +
  case TIMER_REG_STATUS:
  if (timer_enabled(t)) {
  uint64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 @@ -238,17 +262,14 @@ static void 
 aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
  uint32_t rate = calculate_rate(t);
  
  t->start += muldiv64(delta, NANOSECONDS_PER_SECOND, rate);
 -timer_mod(>timer, calculate_next(t));
 +aspeed_timer_mod(t);
  }
  break;
 -case TIMER_REG_RELOAD:
 -t->reload = value;
 -break;
  case TIMER_REG_MATCH_FIRST:
  case TIMER_REG_MATCH_SECOND:
  t->match[reg - 2] = value;
  if (timer_enabled(t)) {
 -timer_mod(>timer, calculate_next(t));
 +aspeed_timer_mod(t);
  }
  break;
  default:
 @@ -268,7 +289,7 @@ static void aspeed_timer_ctrl_enable(AspeedTimer *t, 
 bool enable)
  trace_aspeed_timer_ctrl_enable(t->id, enable);
  if (enable) {
  t->start = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 -timer_mod(>timer, calculate_next(t));
 +aspeed_timer_mod(t);
  } else {

Re: [Qemu-devel] [PULL 0/7] ui patch queue

2017-06-12 Thread Thomas Huth

On 12.06.2017 20:08, Peter Maydell wrote:
> On 12 June 2017 at 19:00, Peter Maydell  wrote:
>> This causes configure to barf warnings in my build logs on half
>> my build machines:
>>
>> WARNING: Support for gtk2 will be dropped in future releases.
>> WARNING: Please consider using gtk3 instead.
>>
>> I don't think it is yet possible to drop gtk2.
> 
> Oh, one of them was the warning about SDL1.2.

Adding the warning does not necessarily mean that we've really got to
drop support for these in the very near future, but it's a good start to
remind people that support for the old versions won't be around forever
;-) And AFAIK the old versions of these libraries are not maintained by
the upstream projects anymore, too, so at one point in time, we've
really got to deprecate support for these in QEMU, too.

So would it at least be feasible to install SDL 2 on that build
machines? (IMHO installing SDL 2 from sources is not too much of a pain,
e.g. it does not require lots of other additional libraries around)

 Thomas

Re: [Qemu-devel] [PATCH v6 1/2] spapr: Add a "no HPT" encoding to HTAB migration stream

2017-06-12 Thread Bharata B Rao

On Mon, Jun 12, 2017 at 05:10:44PM +0800, David Gibson wrote:
> On Mon, Jun 12, 2017 at 11:02:34AM +0530, Bharata B Rao wrote:
> > Add a "no HPT" encoding (using value -1) to the HTAB migration
> > stream (in the place of HPT size) when the guest doesn't allocate HPT.
> > This will help the target side to match target HPT with the source HPT
> > and thus enable successful migration.
> > 
> > Suggested-by: David Gibson 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/spapr.c | 29 +
> >  1 file changed, 25 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 8b541d9..c425499 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1558,13 +1558,19 @@ static int htab_save_setup(QEMUFile *f, void 
> > *opaque)
> >  sPAPRMachineState *spapr = opaque;
> >  
> >  /* "Iteration" header */
> > -qemu_put_be32(f, spapr->htab_shift);
> > +if (!spapr->htab_shift) {
> > +qemu_put_be32(f, -1);
> > +} else {
> > +qemu_put_be32(f, spapr->htab_shift);
> > +}
> >  
> >  if (spapr->htab) {
> >  spapr->htab_save_index = 0;
> >  spapr->htab_first_pass = true;
> >  } else {
> > -assert(kvm_enabled());
> > +if (spapr->htab_shift) {
> > +assert(kvm_enabled());
> > +}
> >  }
> >  
> >  
> > @@ -1710,7 +1716,12 @@ static int htab_save_iterate(QEMUFile *f, void 
> > *opaque)
> >  int rc = 0;
> >  
> >  /* Iteration header */
> > -qemu_put_be32(f, 0);
> > +if (!spapr->htab_shift) {
> > +qemu_put_be32(f, -1);
> > +return 0;
> > +} else {
> > +qemu_put_be32(f, 0);
> > +}
> >  
> >  if (!spapr->htab) {
> >  assert(kvm_enabled());
> > @@ -1744,7 +1755,12 @@ static int htab_save_complete(QEMUFile *f, void 
> > *opaque)
> >  int fd;
> >  
> >  /* Iteration header */
> > -qemu_put_be32(f, 0);
> > +if (!spapr->htab_shift) {
> > +qemu_put_be32(f, -1);
> > +return 0;
> > +} else {
> > +qemu_put_be32(f, 0);
> > +}
> 
> Do you actually need the modifications for _iterate and _complete?  I
> would have thought you just wouldn't need to send any more of the HPT
> stream at all after sending the -1 header.  

_setup, _iterate, _complete handler routines for HTAB always get interspersed
with similar routines for ram savevm handlers as per what I have seen.
And moreover these are called by the core migration code and hence we they get
called, we need these changes to ensure that we don't attempt to send HPT
stream.

> 
> We should also adjust the downtime estimation logic so we don't allow
> for transferring an HPT that isn't there.

I will have to check that out.

> 
> >  if (!spapr->htab) {
> >  int rc;
> > @@ -1788,6 +1804,11 @@ static int htab_load(QEMUFile *f, void *opaque, int 
> > version_id)
> >  
> >  section_hdr = qemu_get_be32(f);
> >  
> > +if (section_hdr == -1) {
> > +spapr_free_hpt(spapr);
> > +return 0;
> > +}
> 
> Strictly speaking we probably shouldn't just return here.  We should
> wait and see if there is more data in the stream.

Because of the way the source sends data (with HPT data getting
interspersed with ram data, I don't think we can say for sure if
we got or will get any HPT data following the no-HPT indication.

> Any actual content
> (i.e. section_hdr == 0 sections) would be an error at this point.
> However, a reset to an HPT guest would be represented by a new
> non-zero section header, then more data.  This isn't urgent, but it
> would be nice to fix at some point.

Sure.

Regards,
Bharata.

Re: [Qemu-devel] [PATCH v4 5/7] target-m68k: use floatx80 internally

2017-06-12 Thread Thomas Huth

On 12.06.2017 01:16, Laurent Vivier wrote:
> Coldfire uses float64, but 680x0 use floatx80.
> This patch introduces the use of floatx80 internally
> and enables 680x0 80bits FPU.
> 
> Signed-off-by: Laurent Vivier 
> ---
>  target/m68k/cpu.c|   9 +-
>  target/m68k/cpu.h|   6 +-
>  target/m68k/fpu_helper.c |  85 +++
>  target/m68k/helper.c |  12 +-
>  target/m68k/helper.h |  37 +--
>  target/m68k/qregs.def|   1 -
>  target/m68k/translate.c  | 568 
> +++
>  7 files changed, 464 insertions(+), 254 deletions(-)
> 
> diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
> index f068922..435456f 100644
> --- a/target/m68k/cpu.c
> +++ b/target/m68k/cpu.c
> @@ -49,6 +49,8 @@ static void m68k_cpu_reset(CPUState *s)
>  M68kCPU *cpu = M68K_CPU(s);
>  M68kCPUClass *mcc = M68K_CPU_GET_CLASS(cpu);
>  CPUM68KState *env = >env;
> +floatx80 nan = floatx80_default_nan(NULL);
> +int i;
>  
>  mcc->parent_reset(s);
>  
> @@ -57,7 +59,12 @@ static void m68k_cpu_reset(CPUState *s)
>  env->sr = 0x2700;
>  #endif
>  m68k_switch_sp(env);
> -/* ??? FP regs should be initialized to NaN.  */
> +for (i = 0; i < 8; i++) {
> +env->fregs[i].d = nan;
> +}
> +env->fpcr = 0;
> +env->fpsr = 0;
> +

Maybe move such non-related hunks to a separate patch? This patch here
is already big enough...

 Thomas

Re: [Qemu-devel] [PATCH] timer/aspeed: fix timer enablement when a reload is not set

2017-06-12 Thread Andrew Jeffery

On Fri, 2017-06-09 at 07:40 +0200, Cédric Le Goater wrote:
> On 06/09/2017 04:26 AM, Andrew Jeffery wrote:
> > On Tue, 2017-06-06 at 10:55 +0200, Cédric Le Goater wrote:
> > > When a timer is enabled before a reload value is set, the controller
> > > waits for a reload value to be set before starting decrementing. This
> > > fix tries to cover that case by changing the timer expiry only when
> > > a reload value is valid.
> > > 
> > > > Signed-off-by: Cédric Le Goater 
> > > 
> > > ---
> > >  hw/timer/aspeed_timer.c | 37 +
> > >  1 file changed, 29 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/hw/timer/aspeed_timer.c b/hw/timer/aspeed_timer.c
> > > index 9b70ee09b07f..50acbf530a3a 100644
> > > --- a/hw/timer/aspeed_timer.c
> > > +++ b/hw/timer/aspeed_timer.c
> > > @@ -130,15 +130,26 @@ static uint64_t calculate_next(struct AspeedTimer 
> > > *t)
> > >  next = seq[1];
> > >  } else if (now < seq[2]) {
> > >  next = seq[2];
> > > -} else {
> > > +} else if (t->reload) {
> > >  reload_ns = muldiv64(t->reload, NANOSECONDS_PER_SECOND, 
> > > rate);
> > >  t->start = now - ((now - t->start) % reload_ns);
> > > +} else {
> > > +/* no reload value, return 0 */
> > > +break;
> > >  }
> > >  }
> > >  
> > >  return next;
> > >  }
> > >  
> > > +static void aspeed_timer_mod(AspeedTimer *t)
> > > +{
> > > +uint64_t next = calculate_next(t);
> > > +if (next) {
> > > +timer_mod(>timer, next);
> > > +}
> > > +}
> > > +
> > >  static void aspeed_timer_expire(void *opaque)
> > >  {
> > >  AspeedTimer *t = opaque;
> > > @@ -164,7 +175,7 @@ static void aspeed_timer_expire(void *opaque)
> > >  qemu_set_irq(t->irq, t->level);
> > >  }
> > >  
> > > -timer_mod(>timer, calculate_next(t));
> > > +aspeed_timer_mod(t);
> > >  }
> > >  
> > >  static uint64_t aspeed_timer_get_value(AspeedTimer *t, int reg)
> > > @@ -227,10 +238,23 @@ static void 
> > > aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
> > > uint32_t value)
> > >  {
> > >  AspeedTimer *t;
> > > +uint32_t old_reload;
> > >  
> > >  trace_aspeed_timer_set_value(timer, reg, value);
> > >  t = >timers[timer];
> > >  switch (reg) {
> > > +case TIMER_REG_RELOAD:
> > > +old_reload = t->reload;
> > > +t->reload = value;
> > > +
> > > +/* If the reload value was not previously set, or zero, and
> > > + * the current value is valid, try to start the timer if it is
> > > + * enabled.
> > > + */
> > > +if (old_reload || !t->reload) {
> > > +break;
> > > +}
> > 
> > Maybe I need more caffeine, but I initially struggled to reconcile the
> > condition with the comment, as the condition checks the inverse in
> > order to break while the comment discusses the non-breaking case. 
> 
> I agree. The reload "value" is used in a hidden way to the activate the 
> timer.
> 
> > However, after trying for several minutes, I'm not sure there's an easy
> > way to improve it.
> 
> I tried a few things. May be, we could move the following code in 
> its own routine and call it twice ? 

I don't think it's necessary. The comment serves as enough warning - it
should at least make people think before modifying the code.

Cheers,

Andrew

>  
> > > +
> > >  case TIMER_REG_STATUS:
> > >  if (timer_enabled(t)) {
> > >  uint64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> > > @@ -238,17 +262,14 @@ static void 
> > > aspeed_timer_set_value(AspeedTimerCtrlState *s, int timer, int reg,
> > >  uint32_t rate = calculate_rate(t);
> > >  
> > >  t->start += muldiv64(delta, NANOSECONDS_PER_SECOND, rate);
> > > -timer_mod(>timer, calculate_next(t));
> > > +aspeed_timer_mod(t);
> > >  }
> > >  break;
> > > -case TIMER_REG_RELOAD:
> > > -t->reload = value;
> > > -break;
> > >  case TIMER_REG_MATCH_FIRST:
> > >  case TIMER_REG_MATCH_SECOND:
> > >  t->match[reg - 2] = value;
> > >  if (timer_enabled(t)) {
> > > -timer_mod(>timer, calculate_next(t));
> > > +aspeed_timer_mod(t);
> > >  }
> > >  break;
> > >  default:
> > > @@ -268,7 +289,7 @@ static void aspeed_timer_ctrl_enable(AspeedTimer *t, 
> > > bool enable)
> > >  trace_aspeed_timer_ctrl_enable(t->id, enable);
> > >  if (enable) {
> > >  t->start = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> > > -timer_mod(>timer, calculate_next(t));
> > > +aspeed_timer_mod(t);
> > >  } else {
> > >  timer_del(>timer);
> > >  }
> > 
> > Reviewed-by: Andrew Jeffery 
> 
> Thanks,
> 
> C.

signature.asc
Description: This is a digitally signed message part

Re: [Qemu-devel] [PATCH v6 1/4] kvm-all: Pass an error object to kvm_device_access

2017-06-12 Thread Peter Xu

On Fri, Jun 09, 2017 at 05:52:30PM +0200, Eric Auger wrote:
> In some circumstances, we don't want to abort if the
> kvm_device_access fails. This will be the case during ITS
> migration, in case the ITS table save/restore fails because
> the guest did not program the vITS correctly. So let's pass an
> error object to the function and return the ioctl value. New
> callers will be able to make a decision upon this returned
> value.
> 
> Existing callers pass _abort which will cause the
> function to abort on failure.
> 
> Signed-off-by: Eric Auger 

Reviewed-by: Peter Xu 

-- 
Peter Xu

[Qemu-devel] [PATCH] usb-host: support devices with sparse/non-sequential USB interfaces

2017-06-12 Thread Samuel Brian


Some USB devices have sparse interface numbering which is not able to be
passthroughed.
For example, the Sierra Wireless MC7455/MC7430:

  # lsusb  -D /dev/bus/usb/003/003 | egrep 
'1199|9071|bNumInterfaces|bInterfaceNumber'

  Device: ID 1199:9071 Sierra Wireless, Inc.
idVendor   0x1199 Sierra Wireless, Inc.
idProduct  0x9071
  bNumInterfaces  5
bInterfaceNumber0
bInterfaceNumber2
bInterfaceNumber3
bInterfaceNumber8
bInterfaceNumber   10

In this case, the interface numbers are 0, 2, 3, 8, 10 and not the
0, 1, 2, 3, 4 that QEMU tries to claim.

This change allows sparse USB interface numbering.
Instead of only claiming the interfaces in the range reported by the USB
device through bNumInterfaces, QEMU attempts to claim all possible
interfaces.

Signed-off-by: Samuel Brian 
---
 hw/usb/host-libusb.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index f9c8eafe06..2e3a752ef6 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1107,7 +1107,7 @@ static void usb_host_detach_kernel(USBHostDevice *s)
 if (rc != 0) {
 return;
 }
-for (i = 0; i < conf->bNumInterfaces; i++) {
+for (i = 0; i < USB_MAX_INTERFACES; i++) {
 rc = libusb_kernel_driver_active(s->dh, i);
 usb_host_libusb_error("libusb_kernel_driver_active", rc);
 if (rc != 1) {
@@ -1130,7 +1130,7 @@ static void usb_host_attach_kernel(USBHostDevice *s)
 if (rc != 0) {
 return;
 }
-for (i = 0; i < conf->bNumInterfaces; i++) {
+for (i = 0; i < USB_MAX_INTERFACES; i++) {
 if (!s->ifs[i].detached) {
 continue;
 }
@@ -1145,7 +1145,7 @@ static int usb_host_claim_interfaces(USBHostDevice 
*s, int configuration)

 {
 USBDevice *udev = USB_DEVICE(s);
 struct libusb_config_descriptor *conf;
-int rc, i;
+int rc, i, claimed;

 for (i = 0; i < USB_MAX_INTERFACES; i++) {
 udev->altsetting[i] = 0;
@@ -1164,14 +1164,19 @@ static int 
usb_host_claim_interfaces(USBHostDevice *s, int configuration)

 return USB_RET_STALL;
 }

-for (i = 0; i < conf->bNumInterfaces; i++) {
+claimed = 0;
+for (i = 0; i < USB_MAX_INTERFACES; i++) {
 trace_usb_host_claim_interface(s->bus_num, s->addr, 
configuration, i);

 rc = libusb_claim_interface(s->dh, i);
-usb_host_libusb_error("libusb_claim_interface", rc);
-if (rc != 0) {
-return USB_RET_STALL;
+if (rc == 0) {
+ s->ifs[i].claimed = true;
+ if (++claimed == conf->bNumInterfaces) {
+break;
+ }
 }
-s->ifs[i].claimed = true;
+}
+if (claimed != conf->bNumInterfaces) {
+return USB_RET_STALL;
 }

 udev->ninterfaces   = conf->bNumInterfaces;
@@ -1183,10 +1188,9 @@ static int 
usb_host_claim_interfaces(USBHostDevice *s, int configuration)


 static void usb_host_release_interfaces(USBHostDevice *s)
 {
-USBDevice *udev = USB_DEVICE(s);
 int i, rc;

-for (i = 0; i < udev->ninterfaces; i++) {
+for (i = 0; i < USB_MAX_INTERFACES; i++) {
 if (!s->ifs[i].claimed) {
 continue;
 }
--
2.11.0

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Jason Wang




On 2017年06月13日 11:55, Jason Wang wrote:



On 2017年06月13日 11:51, Wei Wang wrote:

On 06/13/2017 11:19 AM, Jason Wang wrote:



On 2017年06月13日 11:10, Wei Wang wrote:

On 06/13/2017 04:43 AM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:

Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.

OK, sorry for the push.

A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.


I responded to the comments. The main question is that I'm not sure
why we need the vhost backend to support VIRTIO_F_MAX_CHAIN_SIZE.
IMHO, that should be a feature proposed to solve the possible issue
caused by the QEMU implemented backend.


The issue is what if there's a mismatch of max #sgs between qemu and 
vhost?




When the vhost backend is used, QEMU is not involved in the data 
path. The vhost backend

directly gets what is offered by the guest from the vq.


FYI, qemu will try to fallback to userspace if there's something wrong 
with vhost-kernel (e.g the IOMMU support). This doesn't work for 
vhost-user actually, but it works for vhost-kernel.


Thanks


Why would there be a mismatch of
max #sgs between QEMU and vhost, and what is the QEMU side max #sgs 
used for? Thanks.


You need query the backend max #sgs in this case at least. no? If not 
how do you know the value is supported by the backend?


Thanks



Best,
Wei

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Jason Wang




On 2017年06月13日 11:51, Wei Wang wrote:

On 06/13/2017 11:19 AM, Jason Wang wrote:



On 2017年06月13日 11:10, Wei Wang wrote:

On 06/13/2017 04:43 AM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:

Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.

OK, sorry for the push.

A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.


I responded to the comments. The main question is that I'm not sure
why we need the vhost backend to support VIRTIO_F_MAX_CHAIN_SIZE.
IMHO, that should be a feature proposed to solve the possible issue
caused by the QEMU implemented backend.


The issue is what if there's a mismatch of max #sgs between qemu and 
vhost?




When the vhost backend is used, QEMU is not involved in the data path. 
The vhost backend
directly gets what is offered by the guest from the vq. Why would 
there be a mismatch of
max #sgs between QEMU and vhost, and what is the QEMU side max #sgs 
used for? Thanks.


You need query the backend max #sgs in this case at least. no? If not 
how do you know the value is supported by the backend?


Thanks



Best,
Wei

Re: [Qemu-devel] [PATCH v2 2/6] migration: let MigrationState be a qdev

2017-06-12 Thread Peter Xu

On Mon, Jun 12, 2017 at 01:13:49PM -0300, Eduardo Habkost wrote:
> On Mon, Jun 12, 2017 at 12:46:09PM +0800, Peter Xu wrote:
> > On Fri, Jun 09, 2017 at 03:39:24PM +0200, Markus Armbruster wrote:
> > > Peter Xu  writes:
> > > 
> > > > Let the old man "MigrationState" join the object family. Direct benefit
> > > > is that we can start to use all the property features derived from
> > > > current QDev, like: HW_COMPAT_* bits, command line setup for migration
> > > > parameters (so will never need to set them up each time using HMP/QMP,
> > > > this is really, really attractive for test writters), etc.
> > > >
> > > > I see no reason to disallow this happen yet. So let's start from this
> > > > one, to see whether it would be anything good.
> > > >
> > > > No functional change at all.
> > > >
> > > > Signed-off-by: Peter Xu 
> > > > ---
> > > >  include/migration/migration.h | 19 ++
> > > >  migration/migration.c | 61 
> > > > ---
> > > >  2 files changed, 59 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/include/migration/migration.h 
> > > > b/include/migration/migration.h
> > > > index 79b5484..bd0186c 100644
> > > > --- a/include/migration/migration.h
> > > > +++ b/include/migration/migration.h
> > > > @@ -21,6 +21,7 @@
> > > >  #include "qapi-types.h"
> > > >  #include "exec/cpu-common.h"
> > > >  #include "qemu/coroutine_int.h"
> > > > +#include "hw/qdev.h"
> > > >  
> > > >  #define QEMU_VM_FILE_MAGIC   0x5145564d
> > > >  #define QEMU_VM_FILE_VERSION_COMPAT  0x0002
> > > > @@ -49,6 +50,8 @@ enum mig_rp_message_type {
> > > >  MIG_RP_MSG_MAX
> > > >  };
> > > >  
> > > > +#define TYPE_MIGRATION "migration"
> > > > +
> > > >  /* State for the incoming migration */
> > > >  struct MigrationIncomingState {
> > > >  QEMUFile *from_src_file;
> > > > @@ -91,8 +94,24 @@ struct MigrationIncomingState {
> > > >  MigrationIncomingState *migration_incoming_get_current(void);
> > > >  void migration_incoming_state_destroy(void);
> > > >  
> > > > +#define MIGRATION_CLASS(klass) \
> > > > +OBJECT_CLASS_CHECK(MigrationClass, (klass), TYPE_MIGRATION)
> > > > +#define MIGRATION_OBJ(obj) \
> > > > +OBJECT_CHECK(MigrationState, (obj), TYPE_MIGRATION)
> > > > +#define MIGRATION_GET_CLASS(obj) \
> > > > +OBJECT_GET_CLASS(MigrationClass, (obj), TYPE_MIGRATION)
> > > > +
> > > > +typedef struct MigrationClass {
> > > > +/*< private >*/
> > > > +DeviceClass parent_class;
> > > > +} MigrationClass;
> > > > +
> > > >  struct MigrationState
> > > >  {
> > > > +/*< private >*/
> > > > +DeviceState parent_obj;
> > > > +
> > > > +/*< public >*/
> > > 
> > > Turning MigrationState into a QOM object so you can use QOM
> > > infrastructure makes sense.  But why turn it into a device?  It doesn't
> > > feel device-like to me.  Would ObjectClass and Object instead of
> > > DeviceClass and DeviceState do?
> > 
> > Yeah. That's the main reason for the series to be (was) RFC.
> > 
> > I was trying to use the HW_COMPAT_* bits and -global, and that's QDev
> > thing (IIUC you got that already :). I am just curious about why that
> > is not for QObject from the very beginning, then it'll be easier.
> > 
> > For now, IMHO QDev is okay as well for migration, as long as it's kept
> > internally inside QEMU. But sure at least I should turn user_creatable
> > to off. I'll investigate more to see how to make this a safer approach
> > in next post.
> 
> We could allow non-device QOM objects use the global property
> system optionally.
> 
> We could make qdev_prop_set_globals() work on any Object*, and
> let QOM classes call it on post_init if they want to be
> configurable by global properties too.
> 
> Note that this would break qdev_prop_check_globals(), because it
> expects globals to work only on TYPE_DEVICE.  We could address
> that by introducing a new interface type to indicate the type
> works with -global (something like
> INTERFACE_CONFIGURABLE_BY_GLOBAL_PROPERTIES, but shorter?).
> 
> Such a system would probably allow us to replace
> default_machine_opts, default_boot_order, default_display,
> default_ram_size (and probably many other compat fields) on
> MachineClass.  It could also be used to implement -machine and
> -accel options by simply translating them to global properties
> (like we already do for -cpu).
> 
> (This sounds like reinventing a QemuOpts-like system on top of
> global properties.  Maybe that's a bad thing, maybe that's a good
> thing.  I'm not sure.)

Thanks Eduardo for the reviews and suggestions. 

It'll obviously benefit us a lot if all the objects can use the global
properties and the *_COMPAT_* legacy magic, while the tricky point is
that, QObject will then depend on QDev?

Generally speaking this sounds good to me.  Before I do anything else,
I believe I should wait to see how other QOM developers think about
it.  Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Wei Wang


On 06/13/2017 11:19 AM, Jason Wang wrote:



On 2017年06月13日 11:10, Wei Wang wrote:

On 06/13/2017 04:43 AM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:

Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.

OK, sorry for the push.

A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.


I responded to the comments. The main question is that I'm not sure
why we need the vhost backend to support VIRTIO_F_MAX_CHAIN_SIZE.
IMHO, that should be a feature proposed to solve the possible issue
caused by the QEMU implemented backend.


The issue is what if there's a mismatch of max #sgs between qemu and 
vhost?




When the vhost backend is used, QEMU is not involved in the data path. 
The vhost backend
directly gets what is offered by the guest from the vq. Why would there 
be a mismatch of
max #sgs between QEMU and vhost, and what is the QEMU side max #sgs used 
for? Thanks.


Best,
Wei

Re: [Qemu-devel] [PATCH v2 3/6] migration: move global_state.optional out

2017-06-12 Thread Peter Xu

On Mon, Jun 12, 2017 at 04:18:06PM +0800, Peter Xu wrote:
> On Fri, Jun 09, 2017 at 10:40:10AM -0300, Eduardo Habkost wrote:
> > On Fri, Jun 09, 2017 at 11:48:59AM +0800, Peter Xu wrote:
> > > Put it into MigrationState then we can use the properties to specify
> > > whether to enable storing global state.
> > > 
> > > Removing global_state_set_optional() since now we can use HW_COMPAT_2_3
> > > for x86/power in general, and the register_compat_prop() for xen_init().
> > > 
> > > Signed-off-by: Peter Xu 
> > > ---
> > >  hw/i386/pc_piix.c |  1 -
> > >  hw/ppc/spapr.c|  1 -
> > >  hw/xen/xen-common.c   |  8 +++-
> > >  include/hw/compat.h   |  4 
> > >  include/migration/migration.h |  7 ++-
> > >  migration/migration.c | 24 
> > >  6 files changed, 33 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > index 2234bd0..c83cec5 100644
> > > --- a/hw/i386/pc_piix.c
> > > +++ b/hw/i386/pc_piix.c
> > > @@ -317,7 +317,6 @@ static void pc_compat_2_3(MachineState *machine)
> > >  if (kvm_enabled()) {
> > >  pcms->smm = ON_OFF_AUTO_OFF;
> > >  }
> > > -global_state_set_optional();
> > >  savevm_skip_configuration();
> > >  }
> > >  
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index ab3aab1..3e78bb9 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -3593,7 +3593,6 @@ static void 
> > > spapr_machine_2_3_instance_options(MachineState *machine)
> > >  {
> > >  spapr_machine_2_4_instance_options(machine);
> > >  savevm_skip_section_footers();
> > > -global_state_set_optional();
> > >  savevm_skip_configuration();
> > >  }
> > 
> > This is a good thing: makes the migration behavior of the
> > machine-types introspectable in compat_props.
> > 
> > I suggest moving this part (and all the rest except the new
> > register_compat_prop() call below) to a separate patch, because
> > it is an improvement on its own.
> 
> Sure.
> 
> > 
> > >  
> > > diff --git a/hw/xen/xen-common.c b/hw/xen/xen-common.c
> > > index 0bed577..8240d50 100644
> > > --- a/hw/xen/xen-common.c
> > > +++ b/hw/xen/xen-common.c
> > > @@ -138,7 +138,13 @@ static int xen_init(MachineState *ms)
> > >  }
> > >  qemu_add_vm_change_state_handler(xen_change_state_handler, NULL);
> > >  
> > > -global_state_set_optional();
> > > +/*
> > > + * TODO: make sure global MigrationState has not yet been created
> > > + * (otherwise the compat trick won't work). For now we are in
> > > + * configure_accelerator() so we are mostly good. Better to
> > > + * confirm that in the future.
> > > + */
> > 
> > So, this makes accelerator initialization very sensitive to
> > ordering.
> > 
> > > +register_compat_prop("migration", "store-global-state", "off");
> > 
> > It's already hard to track down machine-type stuff that is
> > initialized at machine->init() time but it's not introspectable,
> > let's not do the same mistake with accelerators.
> > 
> > Can't this be solved by a AcceleratorClass::global_props field,
> > so we are sure there's a single place in the code where globals
> > are registered?  (Currently, they are all registered by the few
> > lines around the machine_register_compat_props() call in main()).
> > 
> > I think I see other use cases for a new
> > AcceleratorClass::global_props field.  e.g.: replacing
> > kvm_default_props and tcg_default_props in target/i386/cpu.c.
> 
> Hmm, this sounds nice.  Then we can also get rid of the TODO above.
> 
> Thanks for your suggestion and review! I'll see whether I can writeup
> something as RFC for this.

After a second thought, I see these requirements are slightly
different.

For xen_init(), AccelClass::global_props may help since that's mainly
properties to be setup for TYPE_MIGRATION (let's assume it's okay that
MigrationState can be a QDev).

For kvm_default_props and tcg_default_props, AccelClass::global_props
may not help since they are setting up properties for TYPE_CPU, and
TYPE_CPU cannot really use global property features yet. :(

If finally we can have a new way (e.g., as you suggested in the other
thread, using a new INTERFACE to show that one QObject supports global
qdev properties) to allow TYPE_CPU to use global properties, then
these two requirements can merge. Otherwise IMHO we may still need to
keep the *_default_props in CPU codes.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Jason Wang




On 2017年06月13日 11:10, Wei Wang wrote:

On 06/13/2017 04:43 AM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:

Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.

OK, sorry for the push.

A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.


I responded to the comments. The main question is that I'm not sure
why we need the vhost backend to support VIRTIO_F_MAX_CHAIN_SIZE.
IMHO, that should be a feature proposed to solve the possible issue
caused by the QEMU implemented backend.


The issue is what if there's a mismatch of max #sgs between qemu and vhost?

Thanks



Best,
Wei

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Wei Wang


On 06/13/2017 04:43 AM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:

Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.

OK, sorry for the push.

A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.


I responded to the comments. The main question is that I'm not sure
why we need the vhost backend to support VIRTIO_F_MAX_CHAIN_SIZE.
IMHO, that should be a feature proposed to solve the possible issue
caused by the QEMU implemented backend.

Best,
Wei

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Wei Wang


On 06/13/2017 04:54 AM, Dave Hansen wrote:

On 06/12/2017 01:34 PM, Michael S. Tsirkin wrote:

On Mon, Jun 12, 2017 at 09:42:36AM -0700, Dave Hansen wrote:

On 06/12/2017 09:28 AM, Michael S. Tsirkin wrote:

The hypervisor is going to throw away the contents of these pages,
right?

It should be careful and only throw away contents that was there before
report_unused_page_block was invoked.  Hypervisor is responsible for not
corrupting guest memory.  But that's not something an mm patch should
worry about.

That makes sense.  I'm struggling to imagine how the hypervisor makes
use of this information, though.  Does it make the pages read-only
before this, and then it knows if there has not been a write *and* it
gets notified via this new mechanism that it can throw the page away?

Yes, and specifically, this is how it works for migration.  Normally you
start by migrating all of memory, then send updates incrementally if
pages have been modified.  This mechanism allows skipping some pages in
the 1st stage, if they get changed they will be migrated in the 2nd
stage.

OK, so the migration starts and marks everything read-only.  All the
pages now have read-only valuable data, or read-only worthless data in
the case that the page is in the free lists.  In order for a page to
become non-worthless, it has to have a write done to it, which the
hypervisor obviously knows about.

With this mechanism, the hypervisor knows it can discard pages which
have not had a write since they were known to have worthless contents.

Correct?

Right. By the way, ready-only is one of the dirty page logging
methods that a hypervisor uses to capture the pages that are
written by the VM.



That also seems like pretty good information to include in the
changelog.  Otherwise, folks are going to be left wondering what good
the mechanism is.  It's pretty non-trivial to figure out. :)

If necessary, I think it's better to keep the introduction at high-level:

Examples of using this API by a hypervisor:
To live migrate a VM from one physical machine to another,
the hypervisor usually transfers all the VM's memory content.
An optimization here is to skip the transfer of memory that are not
in use by the VM, because the content of the unused memory is
worthless.
This API is the used to report the unused pages to the hypervisor.
The pages that have been reported to the hypervisor as unused
pages may be used by the VM after the report. The hypervisor
has a good mechanism (i.e. dirty page logging) to capture
the change. Therefore, if the new used pages are written into some
data, the hypervisor will still transfer them to the destination machine.

What do you guys think?

Best,
Wei

Re: [Qemu-devel] [PATCH v6 1/6] Pass generic CPUState to gen_intermediate_code()

2017-06-12 Thread David Gibson

On Mon, Jun 12, 2017 at 05:53:55PM +0300, Lluís Vilanova wrote:
> Needed to implement a target-agnostic gen_intermediate_code() in the
> future.
> 
> Signed-off-by: Lluís Vilanova 
> Reviewed-by: David Gibson 
> Reviewed-by: Richard Henderson 

ppc parts

Acked-by: David Gibson 

> ---
>  include/exec/exec-all.h   |2 +-
>  target/alpha/translate.c  |   11 +--
>  target/arm/translate.c|   20 ++--
>  target/cris/translate.c   |   17 -
>  target/i386/translate.c   |   13 ++---
>  target/lm32/translate.c   |   22 +++---
>  target/m68k/translate.c   |   15 +++
>  target/microblaze/translate.c |   22 +++---
>  target/mips/translate.c   |   15 +++
>  target/moxie/translate.c  |   14 +++---
>  target/openrisc/translate.c   |   19 ++-
>  target/ppc/translate.c|   15 +++
>  target/s390x/translate.c  |   13 ++---
>  target/sh4/translate.c|   15 +++
>  target/sparc/translate.c  |   11 +--
>  target/tilegx/translate.c |7 +++
>  target/tricore/translate.c|9 -
>  target/unicore32/translate.c  |   17 -
>  target/xtensa/translate.c |   13 ++---
>  translate-all.c   |2 +-
>  20 files changed, 130 insertions(+), 142 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 87ae10bcc9..1ec7637170 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -43,7 +43,7 @@ typedef ram_addr_t tb_page_addr_t;
>  
>  #include "qemu/log.h"
>  
> -void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
> +void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb);
>  void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
>target_ulong *data);
>  
> diff --git a/target/alpha/translate.c b/target/alpha/translate.c
> index 7c45ae360c..9b60680454 100644
> --- a/target/alpha/translate.c
> +++ b/target/alpha/translate.c
> @@ -2900,10 +2900,9 @@ static ExitStatus translate_one(DisasContext *ctx, 
> uint32_t insn)
>  return ret;
>  }
>  
> -void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb)
> +void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb)
>  {
> -AlphaCPU *cpu = alpha_env_get_cpu(env);
> -CPUState *cs = CPU(cpu);
> +CPUAlphaState *env = cpu->env_ptr;
>  DisasContext ctx, *ctxp = 
>  target_ulong pc_start;
>  target_ulong pc_mask;
> @@ -2918,7 +2917,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  ctx.pc = pc_start;
>  ctx.mem_idx = cpu_mmu_index(env, false);
>  ctx.implver = env->implver;
> -ctx.singlestep_enabled = cs->singlestep_enabled;
> +ctx.singlestep_enabled = cpu->singlestep_enabled;
>  
>  #ifdef CONFIG_USER_ONLY
>  ctx.ir = cpu_std_ir;
> @@ -2961,7 +2960,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  tcg_gen_insn_start(ctx.pc);
>  num_insns++;
>  
> -if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
> +if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) {
>  ret = gen_excp(, EXCP_DEBUG, 0);
>  /* The address covered by the breakpoint must be included in
> [tb->pc, tb->pc + tb->size) in order to for it to be
> @@ -3030,7 +3029,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  && qemu_log_in_addr_range(pc_start)) {
>  qemu_log_lock();
>  qemu_log("IN: %s\n", lookup_symbol(pc_start));
> -log_target_disas(cs, pc_start, ctx.pc - pc_start, 1);
> +log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1);
>  qemu_log("\n");
>  qemu_log_unlock();
>  }
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 0862f9e4aa..96272a9888 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -11787,10 +11787,10 @@ static bool insn_crosses_page(CPUARMState *env, 
> DisasContext *s)
>  }
>  
>  /* generate intermediate code for basic block 'tb'.  */
> -void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
> +void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
>  {
> -ARMCPU *cpu = arm_env_get_cpu(env);
> -CPUState *cs = CPU(cpu);
> +CPUARMState *env = cpu->env_ptr;
> +ARMCPU *arm_cpu = arm_env_get_cpu(env);
>  DisasContext dc1, *dc = 
>  target_ulong pc_start;
>  target_ulong next_page_start;
> @@ -11804,7 +11804,7 @@ void gen_intermediate_code(CPUARMState *env, 
> TranslationBlock *tb)
>   * the A32/T32 complexity to do with conditional execution/IT blocks/etc.
>   */
>  if

Re: [Qemu-devel] [Qemu-block] NVME: is there any plan to support SGL data transfer?

2017-06-12 Thread Qu Wenruo




At 06/12/2017 10:05 PM, Keith Busch wrote:

On Fri, Jun 02, 2017 at 03:51:51PM +0200, Kevin Wolf wrote:

Am 02.06.2017 um 03:47 hat Qu Wenruo geschrieben:

When going through NVMe specification and hw/block/nvme.c,
I found that it seems that NVMe qemu implementation only support PRP
for sq entry.
And NvmeRwCmd doesn't even use union to define DPTR, but just prp1 and prp2.

Although I am just a newbie, but I'm quite interested in NVMe and
want to try to implement SGL support for qemu NVMe.

Is there anyone already doing such work? Or is there any plan on
implement such feature?


Keith, you can probably answer this?


Hi,

I personally have no plans to implement it, but would be happy to see it
added if someone wants to send the patch for review.


Glad to hear that.

I have already submitted preparation patch to update nvme.h for SGL 
structures and introduced needed error code according to NVMe 1.3 spec.


It would be quite nice if anyone is interested in that patchset.

Thanks,
Qu



Thanks,
Keith

[Qemu-devel] 答复: Re: 答复: Re: [PATCHv2 02/04] colo-compare: Process pactkets in the IOThread ofthe primary

2017-06-12 Thread wang.yong155

>> ＞＞ From: Wang Yong ＜wang.yong...@zte.com.cn＞

>>

>> ＞＞

>>

>> ＞＞ Process pactkets in the IOThread which arrived over the socket.

>>

>> ＞＞ we use qio_channel_set_aio_fd_handler to set the handlers on the

>>

>> ＞＞ IOThread AioContext.then the packets from the primary and the 

>> secondary

>>

>> ＞＞ are processed in the IOThread.

>>

>> ＞＞ Finally remove the colo-compare thread using the IOThread instead.

>>

>> ＞＞

>>

>> ＞＞ Signed-off-by: Wang Yong＜wang.yong...@zte.com.cn＞

>>

>> ＞＞ Signed-off-by: Wang Guang＜wang.guan...@zte.com.cn＞

>>

>> ＞＞ ---

>>

>> ＞＞   net/colo-compare.c | 133 

>> -

>>

>> ＞＞   net/colo.h |   1 +

>>

>> ＞＞   2 files changed, 91 insertions(+), 43 deletions(-)

>>

>> ＞＞

>>

>> ＞＞ diff --git a/net/colo-compare.c b/net/colo-compare.c

>>

>> ＞＞ index b0942a4..e3af791 100644

>>

>> ＞＞ --- a/net/colo-compare.c

>>

>> ＞＞ +++ b/net/colo-compare.c

>>

>> ＞＞ @@ -29,6 +29,7 @@

>>

>> ＞＞   #include "qemu/sockets.h"

>>

>> ＞＞   #include "qapi-visit.h"

>>

>> ＞＞   #include "net/colo.h"

>>

>> ＞＞ +#include "io/channel.h"

>>

>> ＞＞   #include "sysemu/iothread.h"

>>

>> ＞＞

>>

>> ＞＞   #define TYPE_COLO_COMPARE "colo-compare"

>>

>> ＞＞ @@ -82,11 +83,6 @@ typedef struct CompareState {

>>

>> ＞＞   GQueue conn_list

>>

>> ＞＞   /* hashtable to save connection */

>>

>> ＞＞   GHashTable *connection_track_table

>>

>> ＞＞ -/* compare thread, a thread for each NIC */

>>

>> ＞＞ -QemuThread thread

>>

>> ＞＞ -

>>

>> ＞＞ -GMainContext *worker_context

>>

>> ＞＞ -GMainLoop *compare_loop

>>

>> ＞＞

>>

>> ＞＞   /*compare iothread*/

>>

>> ＞＞   IOThread *iothread

>>

>> ＞＞ @@ -95,6 +91,14 @@ typedef struct CompareState {

>>

>> ＞＞   QEMUTimer *packet_check_timer

>>

>> ＞＞   } CompareState

>>

>> ＞＞

>>

>> ＞＞ +typedef struct {

>>

>> ＞＞ +Chardev parent

>>

>> ＞＞ +QIOChannel *ioc /*I/O channel */

>>

>>

>> ＞We probably don't want to manipulate char backend's internal io 

>> channel.

>>

>> ＞All need here is to access the frontend API (char-fe.c) I believe, and

>>

>> ＞hide the internal implementation.

>>

>> char-fd.c ?

>>




>Char-fe.c for sure which means frontend of chardev.




>> These API can only watch events in the qemu main thread, not in the 

>> IOThread.

>>

>> I had to use the qio_channel_socket_set_aio_fd_handler function to

>>

>> monitor the char event in the IOThread,so the io channel is used her

>>




>The point is not touching the internal structure of chardev like ioc, 

>instead extend its helper like e.g qemu_chr_fe_set_handlers() and let it 

>set aio handlers,




Currently character devices are tied to the GSource API. However,I'll try to 
submit a patch first.




Thanks




>> -＞qio_channel_socket_set_aio_fd_handler

>>

>>-＞aio_set_fd_handler

>>

>>

>> Thanks

>>

>>

>> ＞＞ +} CompareChardev

>>

>> ＞＞ +

>>

>> ＞＞ +#define COMPARE_CHARDEV(obj) \

>>

>> ＞＞ +OBJECT_CHECK(CompareChardev, (obj), TYPE_CHARDEV_SOCKET)

>>

>> ＞＞ +

>>

>> ＞＞   typedef struct CompareClass {

>>

>> ＞＞   ObjectClass parent_class

>>

>> ＞＞   } CompareClass

>>

>> ＞＞ @@ -107,6 +111,12 @@ enum {

>>

>> ＞＞   static int compare_chr_send(CharBackend *out,

>>

>> ＞＞   const uint8_t *buf,

>>

>> ＞＞   uint32_t size)

>>

>> ＞＞ +static void compare_chr_set_aio_fd_handlers(CharBackend *b,

>>

>> ＞＞ +AioContext *ctx,

>>

>> ＞＞ +IOCanReadHandler *fd_can_read,

>>

>> ＞＞ +IOReadHandler *fd_read,

>>

>> ＞＞ +IOEventHandler *fd_event,

>>

>> ＞＞ +void *opaque)

>>

>> ＞＞

>>

>> ＞＞   static gint seq_sorter(Packet *a, Packet *b, gpointer data)

>>

>> ＞＞   {

>>

>> ＞＞ @@ -534,6 +544,30 @@ err:

>>

>> ＞＞   return ret ＜ 0 ? ret : -EIO

>>

>> ＞＞   }

>>

>> ＞＞

>>

>> ＞＞ +static void compare_chr_read(void *opaque)

>>

>> ＞＞ +{

>>

>> ＞＞ +Chardev *chr = opaque

>>

>> ＞＞ +uint8_t buf[CHR_READ_BUF_LEN]

>>

>> ＞＞ +int len, size

>>

>> ＞＞ +int max_size

>>

>> ＞＞ +

>>

>> ＞＞ +max_size = qemu_chr_be_can_write(chr)

>>

>> ＞＞ +if (max_size ＜= 0) {

>>

>> ＞＞ +return

>>

>> ＞＞ +}

>>

>> ＞＞ +

>>

>> ＞＞ +len = sizeof(buf)

>>

>> ＞＞ +if (len ＞ max_size) {

>>

>> ＞＞ +len = max_size

>>

>> ＞＞ +}

>>

>> ＞＞ +size = CHARDEV_GET_CLASS(chr)-＞chr_sync_read(chr, (void *)buf, 

>> len)

>>

>> ＞＞ +if (size == 0) {

>>

>> ＞＞ +return

>>

>> ＞＞ +} else if (size ＞ 0) {

>>

>> ＞＞ +qemu_chr_be_write(chr, buf, size)

>>

>> ＞＞ +}

>>

>> ＞＞ +}

>>

>> ＞＞ +

>>

>> ＞＞   static int compare_chr_can_read(void *opaque)

>>

>> ＞＞   {

>>

>> ＞＞   return COMPARE_READ_LEN_MAX

>>

>> ＞＞ @@ -550,8 +584,8 @@ static void

Re: [Qemu-devel] [PATCHv2 1/4] fw_cfg: don't map the fw_cfg IO ports in fw_cfg_io_realize()

2017-06-12 Thread Laszlo Ersek

On 06/13/17 00:27, Laszlo Ersek wrote:

> In fw_cfg_init_io_dma(), you know the object type exactly -- you just
> created it with TYPE_FW_CFG_IO --, so after device realization, you can
> cast the type as narrowly as necessary, and refer to the fields by name.

I understand that in sun4u code like

you couldn't access those fields by name (such as "comb_iomem") immediately.

I can think of two ways out:

- Instead of accessing the field from the outside, for the board
mapping, pass the parent memory region in to the helper. (This is my
previous suggestion, and you didn't like it :) )

- or else, in patch v2 4/4, make the guts of FWCfgState / FWCfgIoState /
FWCfgMemState public too. Then board code can get at
"FWCfgIoState.comb_iomem" directly.

Thanks
Laszlo

Re: [Qemu-devel] [PATCHv2 4/4] fw_cfg: move QOM type defines into fw_cfg.h

2017-06-12 Thread Laszlo Ersek

On 06/12/17 23:21, Mark Cave-Ayland wrote:
> This allows the device to be instantiated externally for boards that
> wish to wire up the fw_cfg device themselves.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/nvram/fw_cfg.c |8 
>  include/hw/nvram/fw_cfg.h |8 
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 6c21e43..22a8404 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -40,14 +40,6 @@
>  #define FW_CFG_NAME "fw_cfg"
>  #define FW_CFG_PATH "/machine/" FW_CFG_NAME
>  
> -#define TYPE_FW_CFG "fw_cfg"
> -#define TYPE_FW_CFG_IO  "fw_cfg_io"
> -#define TYPE_FW_CFG_MEM "fw_cfg_mem"
> -
> -#define FW_CFG(obj) OBJECT_CHECK(FWCfgState,(obj), TYPE_FW_CFG)
> -#define FW_CFG_IO(obj)  OBJECT_CHECK(FWCfgIoState,  (obj), TYPE_FW_CFG_IO)
> -#define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM)
> -
>  /* FW_CFG_VERSION bits */
>  #define FW_CFG_VERSION  0x01
>  #define FW_CFG_VERSION_DMA  0x02
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index b980cba..e515698 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -4,6 +4,14 @@
>  #include "exec/hwaddr.h"
>  #include "hw/nvram/fw_cfg_keys.h"
>  
> +#define TYPE_FW_CFG "fw_cfg"
> +#define TYPE_FW_CFG_IO  "fw_cfg_io"
> +#define TYPE_FW_CFG_MEM "fw_cfg_mem"
> +
> +#define FW_CFG(obj) OBJECT_CHECK(FWCfgState,(obj), TYPE_FW_CFG)
> +#define FW_CFG_IO(obj)  OBJECT_CHECK(FWCfgIoState,  (obj), TYPE_FW_CFG_IO)
> +#define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM)
> +
>  typedef struct FWCfgFile {
>  uint32_t  size;/* file size */
>  uint16_t  select;  /* write this to 0x510 to read it */
> 

With the concern that I voiced under patch #3 (namely that board code
might be required to set up the machine-wide link to the sole fw-cfg
device manually, outside of realize):

Reviewed-by: Laszlo Ersek

Re: [Qemu-devel] [PATCHv2 3/4] fw_cfg: move qdev_init_nofail() from fw_cfg_init1() to callers

2017-06-12 Thread Laszlo Ersek

On 06/12/17 23:21, Mark Cave-Ayland wrote:
> When looking to instantiate a TYPE_FW_CFG_MEM or TYPE_FW_CFG_IO device to be
> able to wire it up differently, it is much more convenient for the caller to
> instantiate the device and have the fw_cfg default files already preloaded
> during realize.
> 
> Move fw_cfg_init1() to the end of both the fw_cfg_mem_realize() and
> fw_cfg_io_realize() functions so it no longer needs to be called manually
> when instantiating the device.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/nvram/fw_cfg.c |   10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index e1aa4fc..6c21e43 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -920,8 +920,6 @@ static void fw_cfg_init1(DeviceState *dev)
>  
>  object_property_add_child(OBJECT(machine), FW_CFG_NAME, OBJECT(s), NULL);
>  
> -qdev_init_nofail(dev);
> -
>  fw_cfg_add_bytes(s, FW_CFG_SIGNATURE, (char *)"QEMU", 4);
>  fw_cfg_add_bytes(s, FW_CFG_UUID, _uuid, 16);
>  fw_cfg_add_i16(s, FW_CFG_NOGRAPHIC, (uint16_t)!machine->enable_graphics);
> @@ -954,7 +952,7 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
> dma_iobase,
>  qdev_prop_set_bit(dev, "dma_enabled", false);
>  }
>  
> -fw_cfg_init1(dev);
> +qdev_init_nofail(dev);
>  
>  sbd = SYS_BUS_DEVICE(dev);
>  sysbus_add_io(sbd, iobase, sysbus_mmio_get_region(sbd, 0));
> @@ -991,7 +989,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
>  qdev_prop_set_bit(dev, "dma_enabled", false);
>  }
>  
> -fw_cfg_init1(dev);
> +qdev_init_nofail(dev);
>  
>  sbd = SYS_BUS_DEVICE(dev);
>  sysbus_mmio_map(sbd, 0, ctl_addr);
> @@ -1097,6 +1095,8 @@ static void fw_cfg_io_realize(DeviceState *dev, Error 
> **errp)
>sizeof(dma_addr_t));
>  sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
>  }
> +
> +fw_cfg_init1(dev);
>  }
>  
>  static void fw_cfg_io_class_init(ObjectClass *klass, void *data)
> @@ -1163,6 +1163,8 @@ static void fw_cfg_mem_realize(DeviceState *dev, Error 
> **errp)
>sizeof(dma_addr_t));
>  sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
>  }
> +
> +fw_cfg_init1(dev);
>  }
>  
>  static void fw_cfg_mem_class_init(ObjectClass *klass, void *data)
> 

This looks good to me generally, but I'm concerned about the part of
fw_cfg_init1() that precedes the original qdev_init_nofail() call, namely

assert(!object_resolve_path(FW_CFG_PATH, NULL));

object_property_add_child(OBJECT(machine), FW_CFG_NAME, OBJECT(s),
NULL);

The object_property_add_child() call creates a machine-global link to
the sole fw-cfg device, so that *other* code can find the fw-cfg device
by calling object_resolve_path(). (The same way that we assert fails
right before the creation, i.e. we don't try to create several fw_cfg
devices.)

I feel that this link creation does not belong in device realize
methods, but to board code. I feel that these two steps should be
factored out to a separate helper function, and then called from:

- fw_cfg_init_io_dma(), just before the new qdev_init_nofail() call site,
- fw_cfg_init_mem_wide(), just before the new qdev_init_nofail() call site,
- before any similar qdev_init_nofail() call sites, such as in
.

Again this is just a gut feeling, comments / opinions welcome.

Thanks,
Laszlo

Re: [Qemu-devel] [PATCHv2 2/4] fw_cfg: move setting of FW_CFG_VERSION_DMA bit to fw_cfg_init1()

2017-06-12 Thread Laszlo Ersek

On 06/12/17 23:21, Mark Cave-Ayland wrote:
> The setting of the FW_CFG_VERSION_DMA bit is the same across both the
> TYPE_FW_CFG_MEM and TYPE_FW_CFG_IO devices, so unify the logic in
> fw_cfg_init1().
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/nvram/fw_cfg.c |   16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index be5b04e..e1aa4fc 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -914,6 +914,7 @@ static void fw_cfg_init1(DeviceState *dev)
>  {
>  FWCfgState *s = FW_CFG(dev);
>  MachineState *machine = MACHINE(qdev_get_machine());
> +uint32_t version = FW_CFG_VERSION;
>  
>  assert(!object_resolve_path(FW_CFG_PATH, NULL));
>  
> @@ -928,6 +929,12 @@ static void fw_cfg_init1(DeviceState *dev)
>  fw_cfg_bootsplash(s);
>  fw_cfg_reboot(s);
>  
> +if (s->dma_enabled) {
> +version |= FW_CFG_VERSION_DMA;
> +}
> +
> +fw_cfg_add_i32(s, FW_CFG_ID, version);
> +
>  s->machine_ready.notify = fw_cfg_machine_ready;
>  qemu_add_machine_init_done_notifier(>machine_ready);
>  }
> @@ -938,7 +945,6 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
> dma_iobase,
>  DeviceState *dev;
>  SysBusDevice *sbd;
>  FWCfgState *s;
> -uint32_t version = FW_CFG_VERSION;
>  bool dma_requested = dma_iobase && dma_as;
>  
>  dev = qdev_create(NULL, TYPE_FW_CFG_IO);
> @@ -960,12 +966,8 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
> dma_iobase,
>  s->dma_as = dma_as;
>  s->dma_addr = 0;
>  sysbus_add_io(sbd, dma_iobase, sysbus_mmio_get_region(sbd, 1));
> -
> -version |= FW_CFG_VERSION_DMA;
>  }
>  
> -fw_cfg_add_i32(s, FW_CFG_ID, version);
> -
>  return s;
>  }
>  
> @@ -981,7 +983,6 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
>  DeviceState *dev;
>  SysBusDevice *sbd;
>  FWCfgState *s;
> -uint32_t version = FW_CFG_VERSION;
>  bool dma_requested = dma_addr && dma_as;
>  
>  dev = qdev_create(NULL, TYPE_FW_CFG_MEM);
> @@ -1002,11 +1003,8 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
>  s->dma_as = dma_as;
>  s->dma_addr = 0;
>  sysbus_mmio_map(sbd, 2, dma_addr);
> -version |= FW_CFG_VERSION_DMA;
>  }
>  
> -fw_cfg_add_i32(s, FW_CFG_ID, version);
> -
>  return s;
>  }
>  
> 

Reviewed-by: Laszlo Ersek

Re: [Qemu-devel] [PATCHv2 1/4] fw_cfg: don't map the fw_cfg IO ports in fw_cfg_io_realize()

2017-06-12 Thread Laszlo Ersek

On 06/12/17 23:21, Mark Cave-Ayland wrote:
> As indicated by Laszlo it is a QOM bug for the realize() method to actually
> map the device. Set up the IO regions with sysbus_init_mmio() and defer
> the mapping to the caller, as already done in fw_cfg_init_mem_wide().
>
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/nvram/fw_cfg.c |   10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 316fca9..be5b04e 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -936,6 +936,7 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
> dma_iobase,
>  AddressSpace *dma_as)
>  {
>  DeviceState *dev;
> +SysBusDevice *sbd;
>  FWCfgState *s;
>  uint32_t version = FW_CFG_VERSION;
>  bool dma_requested = dma_iobase && dma_as;
> @@ -948,12 +949,17 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, 
> uint32_t dma_iobase,
>  }
>
>  fw_cfg_init1(dev);
> +
> +sbd = SYS_BUS_DEVICE(dev);
> +sysbus_add_io(sbd, iobase, sysbus_mmio_get_region(sbd, 0));
> +
>  s = FW_CFG(dev);
>
>  if (s->dma_enabled) {
>  /* 64 bits for the address field */
>  s->dma_as = dma_as;
>  s->dma_addr = 0;
> +sysbus_add_io(sbd, dma_iobase, sysbus_mmio_get_region(sbd, 1));
>
>  version |= FW_CFG_VERSION_DMA;
>  }
> @@ -1085,13 +1091,13 @@ static void fw_cfg_io_realize(DeviceState *dev, Error 
> **errp)
>   * of the i/o region used is FW_CFG_CTL_SIZE */
>  memory_region_init_io(>comb_iomem, OBJECT(s), _cfg_comb_mem_ops,
>FW_CFG(s), "fwcfg", FW_CFG_CTL_SIZE);
> -sysbus_add_io(sbd, s->iobase, >comb_iomem);
> +sysbus_init_mmio(sbd, >comb_iomem);
>
>  if (FW_CFG(s)->dma_enabled) {
>  memory_region_init_io(_CFG(s)->dma_iomem, OBJECT(s),
>_cfg_dma_mem_ops, FW_CFG(s), "fwcfg.dma",
>sizeof(dma_addr_t));
> -sysbus_add_io(sbd, s->dma_iobase, _CFG(s)->dma_iomem);
> +sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
>  }
>  }
>
>

This does mostly what I had in mind, but why are the sysbus_init_mmio()
"replacements" necessary?

This is what sysbus_init_mmio() does:

assert(dev->num_mmio < QDEV_MAX_MMIO);
n = dev->num_mmio++;
dev->mmio[n].addr = -1;
dev->mmio[n].memory = memory;

But we don't have MMIO regions here, we have IO ports. This is all what
sysbus_add_io() does:

memory_region_add_subregion(get_system_io(), addr, mem);

It doesn't do anything related to SysBusDevice.mmio[] and mmio.num_mmio.

This patch, as written, changes "num_mmio" from zero to nonzero, and
that could have a bunch of unexpected consequences for
"hw/core/sysbus.c":
- sysbus_has_mmio() would return true
- sysbus_dev_print() produces different output
- sysbus_get_fw_dev_path() formats a different OpenFW device path
  fragment

Instead, can we just move the sysbus_add_io() function calls *without*
replacements in fw_cfg_io_realize()?

In fw_cfg_init_io_dma(), you know the object type exactly -- you just
created it with TYPE_FW_CFG_IO --, so after device realization, you can
cast the type as narrowly as necessary, and refer to the fields by name.
Something like (build-tested only):

> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 316fca9bc1da..a28ce1eacd63 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -96,7 +96,6 @@ struct FWCfgIoState {
>  /*< public >*/
>
>  MemoryRegion comb_iomem;
> -uint32_t iobase, dma_iobase;
>  };
>
>  struct FWCfgMemState {
> @@ -936,24 +935,29 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, 
> uint32_t dma_iobase,
>  AddressSpace *dma_as)
>  {
>  DeviceState *dev;
> +SysBusDevice *sbd;
>  FWCfgState *s;
> +FWCfgIoState *ios;
>  uint32_t version = FW_CFG_VERSION;
>  bool dma_requested = dma_iobase && dma_as;
>
>  dev = qdev_create(NULL, TYPE_FW_CFG_IO);
> -qdev_prop_set_uint32(dev, "iobase", iobase);
> -qdev_prop_set_uint32(dev, "dma_iobase", dma_iobase);
>  if (!dma_requested) {
>  qdev_prop_set_bit(dev, "dma_enabled", false);
>  }
>
>  fw_cfg_init1(dev);
> +sbd = SYS_BUS_DEVICE(dev);
>  s = FW_CFG(dev);
> +ios = FW_CFG_IO(dev);
> +
> +sysbus_add_io(sbd, iobase, >comb_iomem);
>
>  if (s->dma_enabled) {
>  /* 64 bits for the address field */
>  s->dma_as = dma_as;
>  s->dma_addr = 0;
> +sysbus_add_io(sbd, dma_iobase, >dma_iomem);
>
>  version |= FW_CFG_VERSION_DMA;
>  }
> @@ -1059,8 +1063,6 @@ static void fw_cfg_file_slots_allocate(FWCfgState *s, 
> Error **errp)
>  }
>
>  static Property fw_cfg_io_properties[] = {
> -DEFINE_PROP_UINT32("iobase", FWCfgIoState, iobase, -1),
> -DEFINE_PROP_UINT32("dma_iobase", FWCfgIoState, dma_iobase, -1),
>  DEFINE_PROP_BOOL("dma_enabled",

[Qemu-devel] [PATCHv2 3/4] fw_cfg: move qdev_init_nofail() from fw_cfg_init1() to callers

2017-06-12 Thread Mark Cave-Ayland

When looking to instantiate a TYPE_FW_CFG_MEM or TYPE_FW_CFG_IO device to be
able to wire it up differently, it is much more convenient for the caller to
instantiate the device and have the fw_cfg default files already preloaded
during realize.

Move fw_cfg_init1() to the end of both the fw_cfg_mem_realize() and
fw_cfg_io_realize() functions so it no longer needs to be called manually
when instantiating the device.

Signed-off-by: Mark Cave-Ayland 
---
 hw/nvram/fw_cfg.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index e1aa4fc..6c21e43 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -920,8 +920,6 @@ static void fw_cfg_init1(DeviceState *dev)
 
 object_property_add_child(OBJECT(machine), FW_CFG_NAME, OBJECT(s), NULL);
 
-qdev_init_nofail(dev);
-
 fw_cfg_add_bytes(s, FW_CFG_SIGNATURE, (char *)"QEMU", 4);
 fw_cfg_add_bytes(s, FW_CFG_UUID, _uuid, 16);
 fw_cfg_add_i16(s, FW_CFG_NOGRAPHIC, (uint16_t)!machine->enable_graphics);
@@ -954,7 +952,7 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
dma_iobase,
 qdev_prop_set_bit(dev, "dma_enabled", false);
 }
 
-fw_cfg_init1(dev);
+qdev_init_nofail(dev);
 
 sbd = SYS_BUS_DEVICE(dev);
 sysbus_add_io(sbd, iobase, sysbus_mmio_get_region(sbd, 0));
@@ -991,7 +989,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
 qdev_prop_set_bit(dev, "dma_enabled", false);
 }
 
-fw_cfg_init1(dev);
+qdev_init_nofail(dev);
 
 sbd = SYS_BUS_DEVICE(dev);
 sysbus_mmio_map(sbd, 0, ctl_addr);
@@ -1097,6 +1095,8 @@ static void fw_cfg_io_realize(DeviceState *dev, Error 
**errp)
   sizeof(dma_addr_t));
 sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
 }
+
+fw_cfg_init1(dev);
 }
 
 static void fw_cfg_io_class_init(ObjectClass *klass, void *data)
@@ -1163,6 +1163,8 @@ static void fw_cfg_mem_realize(DeviceState *dev, Error 
**errp)
   sizeof(dma_addr_t));
 sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
 }
+
+fw_cfg_init1(dev);
 }
 
 static void fw_cfg_mem_class_init(ObjectClass *klass, void *data)
-- 
1.7.10.4

[Qemu-devel] [PATCHv2 2/4] fw_cfg: move setting of FW_CFG_VERSION_DMA bit to fw_cfg_init1()

2017-06-12 Thread Mark Cave-Ayland

The setting of the FW_CFG_VERSION_DMA bit is the same across both the
TYPE_FW_CFG_MEM and TYPE_FW_CFG_IO devices, so unify the logic in
fw_cfg_init1().

Signed-off-by: Mark Cave-Ayland 
---
 hw/nvram/fw_cfg.c |   16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index be5b04e..e1aa4fc 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -914,6 +914,7 @@ static void fw_cfg_init1(DeviceState *dev)
 {
 FWCfgState *s = FW_CFG(dev);
 MachineState *machine = MACHINE(qdev_get_machine());
+uint32_t version = FW_CFG_VERSION;
 
 assert(!object_resolve_path(FW_CFG_PATH, NULL));
 
@@ -928,6 +929,12 @@ static void fw_cfg_init1(DeviceState *dev)
 fw_cfg_bootsplash(s);
 fw_cfg_reboot(s);
 
+if (s->dma_enabled) {
+version |= FW_CFG_VERSION_DMA;
+}
+
+fw_cfg_add_i32(s, FW_CFG_ID, version);
+
 s->machine_ready.notify = fw_cfg_machine_ready;
 qemu_add_machine_init_done_notifier(>machine_ready);
 }
@@ -938,7 +945,6 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
dma_iobase,
 DeviceState *dev;
 SysBusDevice *sbd;
 FWCfgState *s;
-uint32_t version = FW_CFG_VERSION;
 bool dma_requested = dma_iobase && dma_as;
 
 dev = qdev_create(NULL, TYPE_FW_CFG_IO);
@@ -960,12 +966,8 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
dma_iobase,
 s->dma_as = dma_as;
 s->dma_addr = 0;
 sysbus_add_io(sbd, dma_iobase, sysbus_mmio_get_region(sbd, 1));
-
-version |= FW_CFG_VERSION_DMA;
 }
 
-fw_cfg_add_i32(s, FW_CFG_ID, version);
-
 return s;
 }
 
@@ -981,7 +983,6 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
 DeviceState *dev;
 SysBusDevice *sbd;
 FWCfgState *s;
-uint32_t version = FW_CFG_VERSION;
 bool dma_requested = dma_addr && dma_as;
 
 dev = qdev_create(NULL, TYPE_FW_CFG_MEM);
@@ -1002,11 +1003,8 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
 s->dma_as = dma_as;
 s->dma_addr = 0;
 sysbus_mmio_map(sbd, 2, dma_addr);
-version |= FW_CFG_VERSION_DMA;
 }
 
-fw_cfg_add_i32(s, FW_CFG_ID, version);
-
 return s;
 }
 
-- 
1.7.10.4

[Qemu-devel] [PATCHv2 1/4] fw_cfg: don't map the fw_cfg IO ports in fw_cfg_io_realize()

2017-06-12 Thread Mark Cave-Ayland

As indicated by Laszlo it is a QOM bug for the realize() method to actually
map the device. Set up the IO regions with sysbus_init_mmio() and defer
the mapping to the caller, as already done in fw_cfg_init_mem_wide().

Signed-off-by: Mark Cave-Ayland 
---
 hw/nvram/fw_cfg.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 316fca9..be5b04e 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -936,6 +936,7 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
dma_iobase,
 AddressSpace *dma_as)
 {
 DeviceState *dev;
+SysBusDevice *sbd;
 FWCfgState *s;
 uint32_t version = FW_CFG_VERSION;
 bool dma_requested = dma_iobase && dma_as;
@@ -948,12 +949,17 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t 
dma_iobase,
 }
 
 fw_cfg_init1(dev);
+
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_add_io(sbd, iobase, sysbus_mmio_get_region(sbd, 0));
+
 s = FW_CFG(dev);
 
 if (s->dma_enabled) {
 /* 64 bits for the address field */
 s->dma_as = dma_as;
 s->dma_addr = 0;
+sysbus_add_io(sbd, dma_iobase, sysbus_mmio_get_region(sbd, 1));
 
 version |= FW_CFG_VERSION_DMA;
 }
@@ -1085,13 +1091,13 @@ static void fw_cfg_io_realize(DeviceState *dev, Error 
**errp)
  * of the i/o region used is FW_CFG_CTL_SIZE */
 memory_region_init_io(>comb_iomem, OBJECT(s), _cfg_comb_mem_ops,
   FW_CFG(s), "fwcfg", FW_CFG_CTL_SIZE);
-sysbus_add_io(sbd, s->iobase, >comb_iomem);
+sysbus_init_mmio(sbd, >comb_iomem);
 
 if (FW_CFG(s)->dma_enabled) {
 memory_region_init_io(_CFG(s)->dma_iomem, OBJECT(s),
   _cfg_dma_mem_ops, FW_CFG(s), "fwcfg.dma",
   sizeof(dma_addr_t));
-sysbus_add_io(sbd, s->dma_iobase, _CFG(s)->dma_iomem);
+sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
 }
 }
 
-- 
1.7.10.4

[Qemu-devel] [PATCHv2 0/4] fw_cfg: qdev-related tidy-ups

2017-06-12 Thread Mark Cave-Ayland

As part of some ongoing sun4u work, I need to be able to wire the fw_cfg
IO interface to a separate IO space by instantiating the qdev device instead
of calling fw_cfg_init_io(). This patchset brings FW_CFG_IO in line with
FW_CFG_MEM and tidies up the realize methods accordingly.

Signed-off-by: Mark Cave-Ayland 

v2:
- Fix the QOM bug in patch 1 as indicated by Laszlo
- Minimise code churn compared to v1


Mark Cave-Ayland (4):
  fw_cfg: don't map the fw_cfg IO ports in fw_cfg_io_realize()
  fw_cfg: move setting of FW_CFG_VERSION_DMA bit to fw_cfg_init1()
  fw_cfg: move qdev_init_nofail() from fw_cfg_init1() to callers
  fw_cfg: move QOM type defines into fw_cfg.h

 hw/nvram/fw_cfg.c |   44 +---
 include/hw/nvram/fw_cfg.h |8 
 2 files changed, 29 insertions(+), 23 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCHv2 4/4] fw_cfg: move QOM type defines into fw_cfg.h

2017-06-12 Thread Mark Cave-Ayland

This allows the device to be instantiated externally for boards that
wish to wire up the fw_cfg device themselves.

Signed-off-by: Mark Cave-Ayland 
---
 hw/nvram/fw_cfg.c |8 
 include/hw/nvram/fw_cfg.h |8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 6c21e43..22a8404 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -40,14 +40,6 @@
 #define FW_CFG_NAME "fw_cfg"
 #define FW_CFG_PATH "/machine/" FW_CFG_NAME
 
-#define TYPE_FW_CFG "fw_cfg"
-#define TYPE_FW_CFG_IO  "fw_cfg_io"
-#define TYPE_FW_CFG_MEM "fw_cfg_mem"
-
-#define FW_CFG(obj) OBJECT_CHECK(FWCfgState,(obj), TYPE_FW_CFG)
-#define FW_CFG_IO(obj)  OBJECT_CHECK(FWCfgIoState,  (obj), TYPE_FW_CFG_IO)
-#define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM)
-
 /* FW_CFG_VERSION bits */
 #define FW_CFG_VERSION  0x01
 #define FW_CFG_VERSION_DMA  0x02
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index b980cba..e515698 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -4,6 +4,14 @@
 #include "exec/hwaddr.h"
 #include "hw/nvram/fw_cfg_keys.h"
 
+#define TYPE_FW_CFG "fw_cfg"
+#define TYPE_FW_CFG_IO  "fw_cfg_io"
+#define TYPE_FW_CFG_MEM "fw_cfg_mem"
+
+#define FW_CFG(obj) OBJECT_CHECK(FWCfgState,(obj), TYPE_FW_CFG)
+#define FW_CFG_IO(obj)  OBJECT_CHECK(FWCfgIoState,  (obj), TYPE_FW_CFG_IO)
+#define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM)
+
 typedef struct FWCfgFile {
 uint32_t  size;/* file size */
 uint16_t  select;  /* write this to 0x510 to read it */
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH 2/6] fw_cfg: move qdev_init_nofail() out from fw_cfg_init1() into callers

2017-06-12 Thread Mark Cave-Ayland

On 12/06/17 20:50, Mark Cave-Ayland wrote:

> Now I understand this much better, let me try a v2 of this which fixes
> (1) and moves fw_cfg_init1() to a QOM method in the parent as suggested
> by Igor and see what you think.

Experimenting with this some more, if the aim is to try and minimise
code movement/reorganisation then I don't think the QOM method will give
that much benefit for the work involved.

I've just created a v2 with the aim of minimising code churn which I'll
send along shortly.

ATB,

Mark.

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Dave Hansen

On 06/12/2017 01:34 PM, Michael S. Tsirkin wrote:
> On Mon, Jun 12, 2017 at 09:42:36AM -0700, Dave Hansen wrote:
>> On 06/12/2017 09:28 AM, Michael S. Tsirkin wrote:
>>>
 The hypervisor is going to throw away the contents of these pages,
 right?
>>> It should be careful and only throw away contents that was there before
>>> report_unused_page_block was invoked.  Hypervisor is responsible for not
>>> corrupting guest memory.  But that's not something an mm patch should
>>> worry about.
>>
>> That makes sense.  I'm struggling to imagine how the hypervisor makes
>> use of this information, though.  Does it make the pages read-only
>> before this, and then it knows if there has not been a write *and* it
>> gets notified via this new mechanism that it can throw the page away?
> 
> Yes, and specifically, this is how it works for migration.  Normally you
> start by migrating all of memory, then send updates incrementally if
> pages have been modified.  This mechanism allows skipping some pages in
> the 1st stage, if they get changed they will be migrated in the 2nd
> stage.

OK, so the migration starts and marks everything read-only.  All the
pages now have read-only valuable data, or read-only worthless data in
the case that the page is in the free lists.  In order for a page to
become non-worthless, it has to have a write done to it, which the
hypervisor obviously knows about.

With this mechanism, the hypervisor knows it can discard pages which
have not had a write since they were known to have worthless contents.

Correct?

That also seems like pretty good information to include in the
changelog.  Otherwise, folks are going to be left wondering what good
the mechanism is.  It's pretty non-trivial to figure out. :)

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size

2017-06-12 Thread Michael S. Tsirkin

On Mon, Jun 12, 2017 at 05:30:46PM +0800, Wei Wang wrote:
> Ping for comments, thanks.

This was only posted a week ago, might be a bit too short for some
people.  A couple of weeks is more reasonable before you ping.  Also, I
sent a bunch of comments on Thu, 8 Jun 2017.  You should probably
address these.

-- 
MST

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Michael S. Tsirkin

On Mon, Jun 12, 2017 at 09:42:36AM -0700, Dave Hansen wrote:
> On 06/12/2017 09:28 AM, Michael S. Tsirkin wrote:
> > 
> >> The hypervisor is going to throw away the contents of these pages,
> >> right?
> > It should be careful and only throw away contents that was there before
> > report_unused_page_block was invoked.  Hypervisor is responsible for not
> > corrupting guest memory.  But that's not something an mm patch should
> > worry about.
> 
> That makes sense.  I'm struggling to imagine how the hypervisor makes
> use of this information, though.  Does it make the pages read-only
> before this, and then it knows if there has not been a write *and* it
> gets notified via this new mechanism that it can throw the page away?

Yes, and specifically, this is how it works for migration.  Normally you
start by migrating all of memory, then send updates incrementally if
pages have been modified.  This mechanism allows skipping some pages in
the 1st stage, if they get changed they will be migrated in the 2nd
stage.

-- 
MST

Re: [Qemu-devel] [RFC PATCH] target-s390x: Implement mvcos instruction

2017-06-12 Thread David Hildenbrand

On 12.06.2017 18:44, Thomas Huth wrote:
> On 01.03.2017 13:19, Miroslav Benes wrote:
>> On Wed, 1 Mar 2017, Thomas Huth wrote:
>>
>>> On 28.02.2017 14:17, Miroslav Benes wrote:
 Implement MVCOS instruction, which the Linux kernel uses in user access
 functions.

 Signed-off-by: Miroslav Benes 
 ---
 I tried to do my best to follow the specification but it is quite
 possible that I got something wrong because of my lack of
 understanding. Especially I am not sure about all those bit ops :/.

 Anyway, there is one piece missing. The actual use of keys and
 address-space-control during the move. I used fast_memmove, but
 it is not correct. Is there a helper which I could use? I looked at
 other instructions which should implement access control, but there were
 silently ignore it :).
>>>
>>> I'm not aware of a function that could deal with two address spaces
>>> already (but that does not mean that there is no such function already)
>>> ... still, I guess, you likely need to write your own memmove helper
>>> function that can deal with two different address spaces.
>>
>> Ok, I thought that was the case. I'll try to come up with something.
>>  
  target/s390x/helper.h  |  1 +
  target/s390x/insn-data.def |  2 ++
  target/s390x/mem_helper.c  | 80 
 ++
  target/s390x/translate.c   | 12 +++
  4 files changed, 95 insertions(+)

 diff --git a/target/s390x/helper.h b/target/s390x/helper.h
 index 9102071d0aa4..bc5dfccc3d7e 100644
 --- a/target/s390x/helper.h
 +++ b/target/s390x/helper.h
 @@ -104,6 +104,7 @@ DEF_HELPER_FLAGS_2(iske, TCG_CALL_NO_RWG_SE, i64, env, 
 i64)
  DEF_HELPER_FLAGS_3(sske, TCG_CALL_NO_RWG, void, env, i64, i64)
  DEF_HELPER_FLAGS_2(rrbe, TCG_CALL_NO_RWG, i32, env, i64)
  DEF_HELPER_3(csp, i32, env, i32, i64)
 +DEF_HELPER_5(mvcos, i32, env, i64, i64, i64, i64)
  DEF_HELPER_4(mvcs, i32, env, i64, i64, i64)
  DEF_HELPER_4(mvcp, i32, env, i64, i64, i64)
  DEF_HELPER_4(sigp, i32, env, i64, i32, i64)
 diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
 index 075ff597c3de..a1e6d735d090 100644
 --- a/target/s390x/insn-data.def
 +++ b/target/s390x/insn-data.def
 @@ -854,6 +854,8 @@
  /* LOAD USING REAL ADDRESS */
  C(0xb24b, LURA,RRE,   Z,   0, r2, new, r1_32, lura, 0)
  C(0xb905, LURAG,   RRE,   Z,   0, r2, r1, 0, lurag, 0)
 +/* MOVE WITH OPTIONAL SPECIFICATION */
 +C(0xc800, MVCOS,   SSF,   MVCOS, la1, a2, 0, 0, mvcos, 0)
  /* MOVE TO PRIMARY */
  C(0xda00, MVCP,SS_d,  Z,   la1, a2, 0, 0, mvcp, 0)
  /* MOVE TO SECONDARY */
 diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
 index 675aba2e44d4..ca8f7c49250c 100644
 --- a/target/s390x/mem_helper.c
 +++ b/target/s390x/mem_helper.c
 @@ -1089,6 +1089,86 @@ uint32_t HELPER(mvcp)(CPUS390XState *env, uint64_t 
 l, uint64_t a1, uint64_t a2)
  return cc;
  }
  
 +uint32_t HELPER(mvcos)(CPUS390XState *env, uint64_t r0, uint64_t dest,
 +   uint64_t src, uint64_t len)
 +{
 +int cc;
 +int key1, as1, abit1, kbit1;
 +int key2, as2, abit2, kbit2;
 +
 +HELPER_LOG("%s dest %" PRIx64 ", src %" PRIx64 ", len %" PRIx64 "\n",
 +   __func__, dest, src, len);
 +
 +/* check DAT */
 +if (!(env->psw.mask & PSW_MASK_DAT)) {
 +program_interrupt(env, PGM_SPECIAL_OP, 2);
>>>
>>> Length of the opcode is 6 bytes, not 2.
>>
>> True. Sorry, I don't know where 2 came from. It does not make sense.
> 
> As I recently had to learn it the hard way (while implementing the TEST
> BLOCK instruction), you should use ILEN_LATER_INC here instead of 2 (or
> 6), since the Special operation exception is suppressing, too, i.e. the
> program counter should be increased afterwards to the next instruction.
> 

This is no longer needed with
https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg02631.html

You can either use 6 or ILEN_AUTO here, suppression will be handled
internally.

> BTW, are you still working on a new version of this patch?
> 
>  Thomas
> 


-- 

Thanks,

David

Re: [Qemu-devel] [PATCH RFC] migration: kvmclock: save and load the PVCLOCK_TSC_UNSTABLE_BIT flag when migration

2017-06-12 Thread Radim Krčmář

2017-06-12 21:30+0800, Jay Zhou:
> Guest using kvmclock will be hanged when migrating from unstable
> tsc host to stable tsc host occasionally.
> Sometimes, the tsc timestamp saved at the source side will be
> backward when the guest stopped, and this value is transferred
> to the destination side. The guest at the destination side thought
> kvmclock is stable, so the protection mechanism against time
> going backwards is not used.
> When the first time vcpu0 enters the guest at the destination
> side to update the wall clock, the result of
> pvclock_clocksource_read will be backward occasionally,
> which results in the wall clock drift.
> 
> Signed-off-by: Jay Zhou 
> ---
> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
>  
>  if (running) {
>  struct kvm_clock_data data = {};
> +uint8_t flags_at_migration;
>  
>  /*
>   * If the host where s->clock was read did not support reliable
>   * KVM_GET_CLOCK, read kvmclock value from memory.
>   */
>  if (!s->clock_is_reliable) {

'clock_is_reliable = true' on all newer KVMs (v4.9+), so I don't see a
reason to add a feature that can't be used.

> -uint64_t pvclock_via_mem = kvmclock_current_nsec(s);
> +uint64_t pvclock_via_mem = kvmclock_current_nsec(s,
> +_at_migration);

kvmclock_current_nsec() was introduced to work around the problem with
backward time, so we should understand why it returns a time that is
backwards if we want to do something for old KVMs ...

Is pvclock_via_mem < s->clock?

Thanks.

>  /* We can't rely on the saved clock value, just discard it */
>  if (pvclock_via_mem) {
>  s->clock = pvclock_via_mem;
> +/* whether src kvmclock has PVCLOCK_TSC_STABLE_BIT flag */
> +if (!(flags_at_migration & PVCLOCK_TSC_STABLE_BIT)) {
> +data.flags |= MIGRATION_PVCLOCK_TSC_UNSTABLE_BIT;
> +}
>  }
>  }

Re: [Qemu-devel] [PATCH RFC] KVM: X86: save and load PVCLOCK_TSC_UNSTABLE_BIT when migration

2017-06-12 Thread Radim Krčmář

2017-06-12 21:23+0800, Jay Zhou:
> Guest using kvmclock will be hanged when migrating from unstable
> tsc host to stable tsc host occasionally.
> Sometimes, the tsc timestamp saved at the source side will be
> backward when the guest stopped, and this value is transferred
> to the destination side. The guest at the destination side thought
> kvmclock is stable, so the protection mechanism against time
> going backwards is not used.
> When the first time vcpu0 enters the guest at the destination
> side to update the wall clock, the result of
> pvclock_clocksource_read will be backward occasionally,
> which results in the wall clock drift.
> 
> Signed-off-by: Jay Zhou 
> ---

Hm, are you using KVM that has e3fd9a93a12a (4.9+)?

If you get a timestamp from KVM_GET_CLOCK() and pass that to
KVM_SET_CLOCK(), then kvmclock should not jump backwards anymore
(it could before 4.9, but only if the host had stable tsc).

A possible source of this bug in when QEMU recomputes the timestamp to
pass to KVM_SET_CLOCK() using TSC and kvmclock page.

Can you provide the values used for KVM_GET_CLOCK and KVM_SET_CLOCK when
the bug occurs?

Thanks.

Re: [Qemu-devel] [PATCH 2/6] fw_cfg: move qdev_init_nofail() out from fw_cfg_init1() into callers

2017-06-12 Thread Mark Cave-Ayland

On 12/06/17 20:15, Laszlo Ersek wrote:

> Adding Peter
> 
> On 06/12/17 13:54, Mark Cave-Ayland wrote:
>> On 12/06/17 12:43, Igor Mammedov wrote:
>>
>>> On Sat, 10 Jun 2017 13:30:17 +0100
>>> Mark Cave-Ayland  wrote:
>>>
>>> patch needs a commit message saying why it's needed.
>>> maybe add something similar to:
>>>
>>> qdev_init_nofail() essentially 'realizes' object,
>>> taking in account that fw_cfg_init1() will be moved to
>>> realize in follow up patches, move qdev_init_nofail() outside of
>>> fw_cfg_init1().
>>
>> Yes, I can alter the commit message to provide more explanation. For
>> some background the use case can be found in my follow-up sun4u patches
>> here (i.e. preparation for moving the ebus device behind a PCI bridge):
>>
>> https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg02658.html
>> https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg02656.html
> 
> I see.
> 
> So what you want to replace actually resides in fw_cfg_io_realize(). It
> is the following set of function calls:
> 
> sysbus_add_io(sbd, s->iobase, >comb_iomem);
> 
> sysbus_add_io(sbd, s->dma_iobase, _CFG(s)->dma_iomem);
> 
> which both translate to
> 
> memory_region_add_subregion(get_system_io(), addr, mem);
> 
> And you want to replace the first parameter here, namely get_system_io().
> 
> I think your use case uncovered an actual QOM bug in
> fw_cfg_io_realize(). Compare fw_cfg_mem_realize():
> 
> sysbus_init_mmio(sbd, >ctl_iomem);
> 
> sysbus_init_mmio(sbd, >data_iomem);
> 
> sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);
> 
> But these function calls do not *map* subregions!
> 
> Now, I *distinctly* recall that, when we were working on
> fw_cfg_mem_realize(), Paolo and Peter pointed out that, while MMIO and
> IO region *creation* were "realize business", *mapping* those same
> regions to address spaces (or higher level memory regions) was *board*
> business... Yes, please see the following messages:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02839.html
> http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02849.html
> 
> Ultimately, Paolo graciously fixed up this error in my then-v5 patch
> (thanks again Paolo!); see:
> 
> - http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02851.html
> - http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03254.html
> - commit 5712db6ae510 ("fw_cfg: hard separation between the MMIO and
>   I/O port mappings", 2014-12-22).
> 
> But, we all seem to have missed the exact same error in
> fw_cfg_io_realize() at that time.
> 
> So here's my suggestion:
> 
> (1) Fix the QOM bug -- my mess :) -- at the very first. Move the
> sysbus_add_io() calls from fw_cfg_io_realize() to fw_cfg_init_io_dma()
> (which is a board helper function, not a realize function), after
> fw_cfg_init1().
> 
> Notice that in fw_cfg_init_mem_wide(), we have
> 
> fw_cfg_init1(dev);
> 
> sbd = SYS_BUS_DEVICE(dev);
> sysbus_mmio_map(sbd, 0, ctl_addr);
> sysbus_mmio_map(sbd, 1, data_addr);
> 
> which is exactly what we should parallel in fw_cfg_init_io_dma().
> 
> (2) Once this is done, modify the fw_cfg_init_io_dma() function, so that
> it takes a new MemoryRegion* parameter called "parent_io". Update all
> current call sites to pass in NULL, and fw_cfg_init_io_dma() should do
> something like:
> 
>   io = parent_io ? parent_io : get_system_io();
>   memory_region_add_subregion(io, s->iobase, >comb_iomem);
>   if (s->dma_enabled) {
> memory_region_add_subregion(io, s->dma_iobase, s->dma_iomem);
>   }
> 
> 
> Basically, fix the QOM bug first, by moving the region mapping out of
> the realize function, to the board helper function. Then modify the
> board helper function so that board code can pass in the parent_io region.
> 
> This should be two small patches, and the rest should be possible to
> drop, I think.
> 
> Then, in your sun4u series, you can simply remove
> 
> fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> 
> and add
> 
> fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, 0, NULL,
> pci_address_space_io(ebus));
> 
> ... Upon re-reading your cover letter:
> 
>> As part of some ongoing sun4u work, I need to be able to wire the
>> fw_cfg IO interface to a separate IO space by instantiating the qdev
>> device instead of calling fw_cfg_init_io(). This patchset brings
>> FW_CFG_IO in line with FW_CFG_MEM and tidies up the realize methods
>> accordingly.
> 
> I agree that fw_cfg_init_io() should be synched with fw_cfg_init_mem(),
> in that the region *mapping* should be moved from the realize function
> to the board helper function.
> 
> Beyond that, I don't think that a whole lot of code movement /
> reorganization is necessary. Simply empower the current board helper
> function fw_cfg_init_io_dma() to take parent_io, and then use that from
> sun4u board code.

Thanks a lot for the clarification! I agree that (1) is definitely a
bug, however I'm not

Re: [Qemu-devel] [PATCH v4 8/8] tpm: Added support for TPM emulator

2017-06-12 Thread Stefan Berger


On 06/05/2017 03:25 AM, Valluri, Amarnath wrote:

On Wed, 2017-05-24 at 11:15 -0400, Stefan Berger wrote:

On 05/16/2017 03:58 AM, Amarnath Valluri wrote:

This change introduces a new TPM backend driver that can
communicate with
swtpm(software TPM emulator) using unix domain socket interface.

Swtpm uses two unix sockets, one for plain TPM commands and
responses, and one
for out-of-band control messages.

The swtpm and associated tools can be found here:
  https://github.com/stefanberger/swtpm

The swtpm's control channel protocol specification can be found
here:
  https://github.com/stefanberger/swtpm/wiki/Control-Channel-Spe
cification

Usage:
  # setup TPM state directory
  mkdir /tmp/mytpm
  chown -R tss:root /tmp/mytpm
  /usr/bin/swtpm_setup --tpm-state /tmp/mytpm --createek

  # Ask qemu to use TPM emulator with given tpm state directory
  qemu-system-x86_64 \
  [...] \
  -tpmdev
emulator,id=tpm0,tpmstatedir=/tmp/mytpm,logfile=/tmp/swtpm.log \
  -device tpm-tis,tpmdev=tpm0 \
  [...]

Signed-off-by: Amarnath Valluri 

Since you are not supporting migration in this patch, you probably
have
to add a migrate_add_blocker() call somewhere along the lines of this
here:

https://github.com/stefanberger/qemu-tpm/commit/27d332dc3b2c6bfd0fcd3
8e69f5c899651f3a5d8#diff-3a0192eef5d20837af490c32bf396f4eR641


I need to dig a bit and femiliarize myself about 'migration' support, I
guess i can send it as a separate patch ?


Fine by me.



- Amarnath

Re: [Qemu-devel] [PATCH 2/6] fw_cfg: move qdev_init_nofail() out from fw_cfg_init1() into callers

2017-06-12 Thread Laszlo Ersek

Adding Peter

On 06/12/17 13:54, Mark Cave-Ayland wrote:
> On 12/06/17 12:43, Igor Mammedov wrote:
> 
>> On Sat, 10 Jun 2017 13:30:17 +0100
>> Mark Cave-Ayland  wrote:
>>
>> patch needs a commit message saying why it's needed.
>> maybe add something similar to:
>>
>> qdev_init_nofail() essentially 'realizes' object,
>> taking in account that fw_cfg_init1() will be moved to
>> realize in follow up patches, move qdev_init_nofail() outside of
>> fw_cfg_init1().
> 
> Yes, I can alter the commit message to provide more explanation. For
> some background the use case can be found in my follow-up sun4u patches
> here (i.e. preparation for moving the ebus device behind a PCI bridge):
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg02658.html
> https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg02656.html

I see.

So what you want to replace actually resides in fw_cfg_io_realize(). It
is the following set of function calls:

sysbus_add_io(sbd, s->iobase, >comb_iomem);

sysbus_add_io(sbd, s->dma_iobase, _CFG(s)->dma_iomem);

which both translate to

memory_region_add_subregion(get_system_io(), addr, mem);

And you want to replace the first parameter here, namely get_system_io().

I think your use case uncovered an actual QOM bug in
fw_cfg_io_realize(). Compare fw_cfg_mem_realize():

sysbus_init_mmio(sbd, >ctl_iomem);

sysbus_init_mmio(sbd, >data_iomem);

sysbus_init_mmio(sbd, _CFG(s)->dma_iomem);

But these function calls do not *map* subregions!

Now, I *distinctly* recall that, when we were working on
fw_cfg_mem_realize(), Paolo and Peter pointed out that, while MMIO and
IO region *creation* were "realize business", *mapping* those same
regions to address spaces (or higher level memory regions) was *board*
business... Yes, please see the following messages:

http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02839.html
http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02849.html

Ultimately, Paolo graciously fixed up this error in my then-v5 patch
(thanks again Paolo!); see:

- http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg02851.html
- http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03254.html
- commit 5712db6ae510 ("fw_cfg: hard separation between the MMIO and
  I/O port mappings", 2014-12-22).

But, we all seem to have missed the exact same error in
fw_cfg_io_realize() at that time.

So here's my suggestion:

(1) Fix the QOM bug -- my mess :) -- at the very first. Move the
sysbus_add_io() calls from fw_cfg_io_realize() to fw_cfg_init_io_dma()
(which is a board helper function, not a realize function), after
fw_cfg_init1().

Notice that in fw_cfg_init_mem_wide(), we have

fw_cfg_init1(dev);

sbd = SYS_BUS_DEVICE(dev);
sysbus_mmio_map(sbd, 0, ctl_addr);
sysbus_mmio_map(sbd, 1, data_addr);

which is exactly what we should parallel in fw_cfg_init_io_dma().

(2) Once this is done, modify the fw_cfg_init_io_dma() function, so that
it takes a new MemoryRegion* parameter called "parent_io". Update all
current call sites to pass in NULL, and fw_cfg_init_io_dma() should do
something like:

  io = parent_io ? parent_io : get_system_io();
  memory_region_add_subregion(io, s->iobase, >comb_iomem);
  if (s->dma_enabled) {
memory_region_add_subregion(io, s->dma_iobase, s->dma_iomem);
  }

Basically, fix the QOM bug first, by moving the region mapping out of
the realize function, to the board helper function. Then modify the
board helper function so that board code can pass in the parent_io region.

This should be two small patches, and the rest should be possible to
drop, I think.

Then, in your sun4u series, you can simply remove

fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);

and add

fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, 0, NULL,
pci_address_space_io(ebus));

... Upon re-reading your cover letter:

> As part of some ongoing sun4u work, I need to be able to wire the
> fw_cfg IO interface to a separate IO space by instantiating the qdev
> device instead of calling fw_cfg_init_io(). This patchset brings
> FW_CFG_IO in line with FW_CFG_MEM and tidies up the realize methods
> accordingly.

I agree that fw_cfg_init_io() should be synched with fw_cfg_init_mem(),
in that the region *mapping* should be moved from the realize function
to the board helper function.

Beyond that, I don't think that a whole lot of code movement /
reorganization is necessary. Simply empower the current board helper
function fw_cfg_init_io_dma() to take parent_io, and then use that from
sun4u board code.

Thanks
Laszlo

Re: [Qemu-devel] [PATCH v6 1/4] kvm-all: Pass an error object to kvm_device_access

2017-06-12 Thread Auger Eric

Hi Juan,

On 12/06/2017 08:23, Juan Quintela wrote:
> Eric Auger  wrote:
>> In some circumstances, we don't want to abort if the
>> kvm_device_access fails. This will be the case during ITS
>> migration, in case the ITS table save/restore fails because
>> the guest did not program the vITS correctly. So let's pass an
>> error object to the function and return the ioctl value. New
>> callers will be able to make a decision upon this returned
>> value.
>>
>> Existing callers pass _abort which will cause the
>> function to abort on failure.
>>
>> Signed-off-by: Eric Auger 
> 
> Reviewed-by: Juan Quintela 

Thank you for the review!

Eric
>

Re: [Qemu-devel] [PATCH v4 4/7] target-m68k: move fmove CR to a function

2017-06-12 Thread Philippe Mathieu-Daudé


Hi Laurent,

On 06/11/2017 08:16 PM, Laurent Vivier wrote:

Move code of fmove to/from control register to a function

Signed-off-by: Laurent Vivier 
---
 target/m68k/translate.c | 66 ++---
 1 file changed, 41 insertions(+), 25 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 049d837..45733ce 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -4099,6 +4099,45 @@ DISAS_INSN(trap)
 gen_exception(s, s->pc - 2, EXCP_TRAP0 + (insn & 0xf));
 }

+static void gen_op_fmove_fcr(CPUM68KState *env, DisasContext *s,
+ uint32_t insn, uint32_t ext)
+{
+int mask = (ext >> 10) & 7;
+int is_write = (ext >> 13) & 1;
+TCGv val;
+
+if (is_write) {
+switch (mask) {
+case 1: /* FPIAR */
+case 2: /* FPSR */
+default:
+cpu_abort(NULL, "Unimplemented: fmove from control %d", mask);
+goto undef;
+case 4: /* FPCR */


It seems easier to move the 'if (is_write) {' check here


+val = tcg_const_i32(0);
+DEST_EA(env, insn, OS_LONG, val, NULL);
+tcg_temp_free(val);


then '}'


+break;
+}
+return;
+}
+switch (mask) {
+case 1: /* FPIAR */
+case 2: /* FPSR */
+default:
+cpu_abort(NULL, "Unimplemented: fmove to control %d",
+  mask);
+break;
+case 4: /* FPCR */
+/* Not implemented.  Ignore writes.  */
+break;
+}
+return;
+undef:
+s->pc -= 2;
+disas_undef_fpu(env, s, insn);
+}
+
 /* ??? FP exceptions are not implemented.  Most exceptions are deferred until
immediately before the next FP instruction is executed.  */
 DISAS_INSN(fpu)
@@ -4177,32 +4216,9 @@ DISAS_INSN(fpu)
 tcg_temp_free_i32(tmp32);
 return;
 case 4: /* fmove to control register.  */
-switch ((ext >> 10) & 7) {
-case 4: /* FPCR */
-/* Not implemented.  Ignore writes.  */
-break;
-case 1: /* FPIAR */
-case 2: /* FPSR */
-default:
-cpu_abort(NULL, "Unimplemented: fmove to control %d",
-  (ext >> 10) & 7);
-}
-break;
 case 5: /* fmove from control register.  */
-switch ((ext >> 10) & 7) {
-case 4: /* FPCR */
-/* Not implemented.  Always return zero.  */
-tmp32 = tcg_const_i32(0);
-break;
-case 1: /* FPIAR */
-case 2: /* FPSR */
-default:
-cpu_abort(NULL, "Unimplemented: fmove from control %d",
-  (ext >> 10) & 7);
-goto undef;
-}
-DEST_EA(env, insn, OS_LONG, tmp32, NULL);
-break;
+gen_op_fmove_fcr(env, s, insn, ext);
+return;
 case 6: /* fmovem */
 case 7:
 {

Re: [Qemu-devel] [PATCH v2 0/3] char: fix chardev aliases regression

2017-06-12 Thread Marc-André Lureau

Hi Paolo

On Mon, Jun 12, 2017 at 8:06 PM Paolo Bonzini  wrote:

>
>
> On 08/06/2017 13:59, Marc-André Lureau wrote:
> > Hi,
> >
> > The patch "char: move CharBackend handling in char-fe unit" broke
> > chardev aliases. Here is a small series to fix it, and add a simple
> > unit test to check the alias keep working.
> >
> > v2:
> > - move HAVE_CHARDEV_* in osdep.h (Markus Armbruster)
> > - add r-b tags from Eric for the patches that didn't change
> >
> > Marc-André Lureau (3):
> >   char: fix alias devices regression
> >   chardev: don't use alias names in parse_compat()
> >   test-char: start a /char/serial test
> >
> >  include/chardev/char-parallel.h |  5 -
> >  include/chardev/char-serial.h   |  8 
> >  include/qemu/osdep.h| 13 +
> >  chardev/char.c  |  4 ++--
> >  tests/test-char.c   | 30 ++
> >  5 files changed, 45 insertions(+), 15 deletions(-)
> >
>
> Queued, thanks.
>

I also sent a pull request with those patches "[PULL 0/3] Char patches" a
few days ago. I should take the habit to notice it on the PATCH series.

Thanks
-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH 1/6] fw_cfg: move initialisation of FWCfgState into instance_init

2017-06-12 Thread Mark Cave-Ayland

On 12/06/17 19:27, Laszlo Ersek wrote:

>> Based upon this what do you think the best solution would be?
> 
> 
> if you apply patch #2 without patch #1, then the above SIGSEGV will hit
> on all fw_cfg using targets / machine types, not just
> qemu-system-sparc64. The reason is that, after patch #2 (without patch
> #1 applied) you have
> 
>   fw_cfg_init1()
> ...
>   fw_cfg_add_bytes_read_callback()
> s->entries[arch][key].data
>   qdev_init_nofail()
> fw_cfg_file_slots_allocate()
> 
> IOW, the s->entries array is now dynamically allocated (after the
> introduction of the x-file-slots property):
> 
> FWCfgEntry *entries[2];
> 
> and it is allocated in fw_cfg_file_slots_allocate(), called from
> realize. But with patch #2 applied in isolation, you perform the first
> dereference (from fw_cfg_init1(), indirectly) before allocation, that's
> why it crashes.
> 
> Ultimately we need the following order:
> 
> (1) set properties (from device defaults, from device compat settings,
> and from the command line; and also directly from code, such as in
> fw_cfg_init_io_dma())
> 
> (2) call qdev_init_nofail() -- this will consume the properties from
> (1), including the x-file-slots property, for allocating "s->entries" in
> fw_cfg_file_slots_allocate()
> 
> (3) call fw_cfg_add_*() -- now that the device has been realized and is
> usable.
> 
> This means that
> 
> - patch #1 should be dropped,
> 
> - and in patch #2, you have to push the rest of fw_cfg_init1() --
> everything that is after the original qdev_init_nofail() call -- to the
> end of the realize functions.
> 
> 
> Alternatively, in patch #2, you have to split fw_cfg_init1() into two
> parts, fw_cfg_init1() and fw_cfg_init2(). fw_cfg_init1() would keep
> everything that's *before* the original qdev_init_nofail() call, and
> fw_cfg_init2() would get everything *after*. Then, in
> fw_cfg_init_io_dma() and fw_cfg_init_mem_wide(), you'd do
> 
> fw_cfg_init1(dev);
> qdev_init_nofail(dev);
> fw_cfg_init2(dev);
> 
> In short, you cannot reorder steps (2) and (3) against each other -- you
> cannot add fw_cfg entries before you realize the device -- and the crash
> happens because that's exactly what patch #2 does at the moment.
> 
> And, patch #1 is not the right fix (you can allocate s->entries only in
> realize, because the size depends on a property, which you can only
> access in realize). The right fix is to drop patch #1 and to split
> fw_cfg_init1() like described above.
> 
> ... Hmmm, wait a second, while the above approach would fix patch #2, in
> fact I think patch #2 doesn't have the right *goal*.
> 
> I'll comment under patch #2.

Yes that now makes sense - I think I managed to confuse myself when
writing the patch since I grepped for the type and the calls to the
fw_cfg_init_*() functions but managed to miss the x-file-slots property
as pointed out by Igor :(

Reading what you said above, I'd be inclined to move the default values
from fw_cfg_init1() into a QOM method in the parent fw_cfg type as
suggested by Igor, and then call the method right at the end of the
realise function in order to load them.

However it also sounds like you have some better thoughts for patch #2
so I'll wait for your reply to that first.


ATB,

Mark.

Re: [Qemu-devel] [PATCH v4 4/7] target-m68k: move fmove CR to a function

2017-06-12 Thread Richard Henderson


On 06/12/2017 10:56 AM, Laurent Vivier wrote:

Le 12/06/2017 à 18:13, Richard Henderson a écrit :

On 06/11/2017 04:16 PM, Laurent Vivier wrote:

Move code of fmove to/from control register to a function

Signed-off-by: Laurent Vivier 
---
   target/m68k/translate.c | 66
++---
   1 file changed, 41 insertions(+), 25 deletions(-)


In that this is 100% code movement,

Reviewed-by: Richard Henderson 



+cpu_abort(NULL, "Unimplemented: fmove from control %d",
mask);
+goto undef;


But cpu_abort doesn't return, and will exit qemu.
This should be qemu_log_mask(LOG_UNIMP, ...).


Do you want I update the patch to fix that?


Yes please.


r~

Re: [Qemu-devel] [PATCH 1/6] fw_cfg: move initialisation of FWCfgState into instance_init

2017-06-12 Thread Laszlo Ersek

Hi Mark,

On 06/12/17 13:45, Mark Cave-Ayland wrote:
> On 12/06/17 12:20, Igor Mammedov wrote:
> 
>> On Sat, 10 Jun 2017 13:30:16 +0100
>> Mark Cave-Ayland  wrote:
>>
>>> Signed-off-by: Mark Cave-Ayland 
>>> ---
>>>  hw/nvram/fw_cfg.c |   14 ++
>>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>>> index 316fca9..144e0c6 100644
>>> --- a/hw/nvram/fw_cfg.c
>>> +++ b/hw/nvram/fw_cfg.c
>>> @@ -1017,6 +1017,15 @@ FWCfgState *fw_cfg_find(void)
>>>  return FW_CFG(object_resolve_path(FW_CFG_PATH, NULL));
>>>  }
>>>  
>>> +static void fw_cfg_init(Object *obj)
>>> +{
>>> +FWCfgState *s = FW_CFG(obj);
>>> +
>>> +s->entries[0] = g_new0(FWCfgEntry, fw_cfg_max_entry(s));
>>> +s->entries[1] = g_new0(FWCfgEntry, fw_cfg_max_entry(s));
>>> +s->entry_order = g_new0(int, fw_cfg_max_entry(s));
>> it doesn't seem right,
>>
>> consider,
>>  object_new(fwcfg)
>>1: fw_cfg_init -> g_new0(FWCfgEntry, fw_cfg_max_entry(s) -> 
>> FW_CFG_FILE_SLOTS_DFLT);
>>2: set property x-file-slots
>>3: realize -> fw_cfg_file_slots_allocate()
>>
>>> @@ -1052,10 +1062,6 @@ static void fw_cfg_file_slots_allocate(FWCfgState 
>>> *s, Error **errp)
>>> file_slots_max);
>>>  return;
>>>  }
>>> -
>>> -s->entries[0] = g_new0(FWCfgEntry, fw_cfg_max_entry(s));
>>> -s->entries[1] = g_new0(FWCfgEntry, fw_cfg_max_entry(s));
>>> -s->entry_order = g_new0(int, fw_cfg_max_entry(s));
>> opps, s->entries doesn't account for new values of x-file-slots
>>
>> So question is why this patch is needed at all (it needs some reasoning in 
>> commit message)?
>>
>> So for now NACK and I'd suggest to drop this patch.
> 
> Right I missed the x-file-slots property when I was looking at this,
> mainly because all of the existing callers went through
> fw_cfg_init_mem() and fw_cfg_init_io(). I do see now though that the
> property is referenced in compat.h.
> 
> The reason for moving the initialisation is that if you apply patch 2 on
> its own then you get hit by the following assert on qemu-system-sparc64
> due to uninitialised data:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0062f81d in fw_cfg_add_bytes_read_callback (s=0x1e85410,
> key=0, callback=0, callback_opaque=0x0, data=0x9f6c28, len=4,
> read_only=true) at hw/nvram/fw_cfg.c:633
> 633 assert(s->entries[arch][key].data == NULL); /* avoid key
> conflict */
> (gdb) bt
> #0  0x0062f81d in fw_cfg_add_bytes_read_callback (s=0x1e85410,
> key=0, callback=0, callback_opaque=0x0, data=0x9f6c28, len=4,
> read_only=true) at hw/nvram/fw_cfg.c:633
> #1  0x0062fac8 in fw_cfg_add_bytes (s=0x1e85410, key=0,
> data=0x9f6c28, len=4) at hw/nvram/fw_cfg.c:664
> #2  0x006307a8 in fw_cfg_init1 (dev=0x1e85410) at
> hw/nvram/fw_cfg.c:922
> #3  0x006308fe in fw_cfg_init_io_dma (iobase=1296, dma_iobase=0,
> dma_as=0x0) at hw/nvram/fw_cfg.c:948
> #4  0x006309bc in fw_cfg_init_io (iobase=1296) at
> hw/nvram/fw_cfg.c:968
> #5  0x004f31e9 in sun4uv_init (address_space_mem=0x132df20,
> machine=0x132c8c0, hwdef=0x8bf400) at
> /home/build/src/qemu/git/qemu/hw/sparc64/sun4u.c:515
> #6  0x004f3403 in sun4u_init (machine=0x132c8c0) at
> /home/build/src/qemu/git/qemu/hw/sparc64/sun4u.c:565
> #7  0x005c0d26 in machine_run_board_init (machine=0x132c8c0) at
> hw/core/machine.c:760
> 
> 
> #8  0x00532e56 in main (argc=2, argv=0x7fffeaa8,
> envp=0x7fffeac0) at vl.c:4594
> 
> Based upon this what do you think the best solution would be?


if you apply patch #2 without patch #1, then the above SIGSEGV will hit
on all fw_cfg using targets / machine types, not just
qemu-system-sparc64. The reason is that, after patch #2 (without patch
#1 applied) you have

  fw_cfg_init1()
...
  fw_cfg_add_bytes_read_callback()
s->entries[arch][key].data
  qdev_init_nofail()
fw_cfg_file_slots_allocate()

IOW, the s->entries array is now dynamically allocated (after the
introduction of the x-file-slots property):

FWCfgEntry *entries[2];

and it is allocated in fw_cfg_file_slots_allocate(), called from
realize. But with patch #2 applied in isolation, you perform the first
dereference (from fw_cfg_init1(), indirectly) before allocation, that's
why it crashes.

Ultimately we need the following order:

(1) set properties (from device defaults, from device compat settings,
and from the command line; and also directly from code, such as in
fw_cfg_init_io_dma())

(2) call qdev_init_nofail() -- this will consume the properties from
(1), including the x-file-slots property, for allocating "s->entries" in
fw_cfg_file_slots_allocate()

(3) call fw_cfg_add_*() -- now that the device has been realized and is
usable.

This means that

- patch #1 should be dropped,

- and in patch #2, you have to push the rest of fw_cfg_init1() --
everything that is

Re: [Qemu-devel] [PULL v3 00/23] Docker and block patches

2017-06-12 Thread Peter Maydell

On 8 June 2017 at 12:56, Fam Zheng  wrote:
> The following changes since commit 64175afc695c0672876fbbfc31b299c86d562cb4:
>
>   arm_gicv3: Fix ICC_BPR1 reset value when EL3 not implemented (2017-06-07 
> 17:21:44 +0100)
>
> are available in the git repository at:
>
>   git://github.com/famz/qemu.git tags/docker-and-block-pull-request
>
> for you to fetch changes up to 383226d7f90e83fd2b4ea5fbedf67bd9d3173221:
>
>   block: make accounting thread-safe (2017-06-08 19:09:23 +0800)
>
> 
>
> v3: Update Paolo's series to fix make check on OSX.
>
> 

This still fails on OSX:

  GTESTER tests/test-blockjob
qemu: qemu_mutex_lock: Invalid argument

The backtrace is different from last time:

* thread #1: tid = 0x7f066a, 0x7fffc611ad42
libsystem_kernel.dylib`__pthread_kill + 10, queue =
'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x7fffc611ad42 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x7fffc6208457 libsystem_pthread.dylib`pthread_kill + 90
frame #2: 0x7fffc6080420 libsystem_c.dylib`abort + 129
frame #3: 0x0001000c1265
test-blockjob`error_exit(err=, msg=) + 53 at
qemu-thread-posix.c:36
frame #4: 0x0001000c1327
test-blockjob`qemu_mutex_lock(mutex=) + 151 at
qemu-thread-posix.c:63
frame #5: 0x00010005eef7
test-blockjob`bdrv_do_release_matching_dirty_bitmap [inlined]
bdrv_dirty_bitmaps_lock(bs=0x000101806000) + 39 at
dirty-bitmap.c:58
frame #6: 0x00010005eee7
test-blockjob`bdrv_do_release_matching_dirty_bitmap(bs=0x000101806000,
bitmap=0x, only_named=true) + 23 at dirty-bitmap.c:349
frame #7: 0x0001962d test-blockjob`bdrv_delete [inlined]
bdrv_close(bs=0x000101806000) + 32 at block.c:3042
frame #8: 0x0001960d
test-blockjob`bdrv_delete(bs=0x000101806000) + 61 at block.c:3229
frame #9: 0x000100043da2
test-blockjob`blk_remove_bs(blk=0x000100d0a7d0) + 98 at
block-backend.c:607
frame #10: 0x00011318 test-blockjob`test_job_ids [inlined]
destroy_blk(blk=0x000100d0a7d0) + 29 at test-blockjob.c:83
frame #11: 0x000112fb test-blockjob`test_job_ids + 411 at
test-blockjob.c:140
frame #12: 0x00010070d91d
libglib-2.0.0.dylib`g_test_run_suite_internal + 626
frame #13: 0x00010070dae1
libglib-2.0.0.dylib`g_test_run_suite_internal + 1078
frame #14: 0x00010070d198 libglib-2.0.0.dylib`g_test_run_suite + 266
frame #15: 0x00011148
test-blockjob`main(argc=, argv=) + 88 at
test-blockjob.c:152
frame #16: 0x7fffc5fec235 libdyld.dylib`start + 1
frame #17: 0x7fffc5fec235 libdyld.dylib`start + 1

thanks
-- PMM

Re: [Qemu-devel] [PULL 0/7] ui patch queue

2017-06-12 Thread Peter Maydell

On 12 June 2017 at 19:00, Peter Maydell  wrote:
> This causes configure to barf warnings in my build logs on half
> my build machines:
>
> WARNING: Support for gtk2 will be dropped in future releases.
> WARNING: Please consider using gtk3 instead.
>
> I don't think it is yet possible to drop gtk2.

Oh, one of them was the warning about SDL1.2.

thanks
-- PMM

Re: [Qemu-devel] [PULL 0/7] ui patch queue

2017-06-12 Thread Peter Maydell

On 8 June 2017 at 08:07, Gerd Hoffmann  wrote:
>   Hi,
>
> Here is the ui patch queue.  Switches the default
> to gtk3 and sdl2 and fixes some bugs.
>
> please pull,
>   Gerd
>
> The following changes since commit a0d4aac7467dd02e5657b79e867f067330266a24:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170605' into 
> staging (2017-06-05 18:03:43 +0100)
>
> are available in the git repository at:
>
>   git://git.kraxel.org/qemu tags/pull-ui-20170608-1
>
> for you to fetch changes up to 468949958fdfa4347a66e3cf5c8240a428532602:
>
>   spice: don't enter opengl mode in case another UI provides opengl support 
> (2017-06-08 08:13:46 +0200)
>
> 
> ui: prefer gtk3 and sdl2, various fixes.
>
> 

This causes configure to barf warnings in my build logs on half
my build machines:

WARNING: Support for gtk2 will be dropped in future releases.
WARNING: Please consider using gtk3 instead.

I don't think it is yet possible to drop gtk2.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v4 4/7] target-m68k: move fmove CR to a function

2017-06-12 Thread Laurent Vivier

Le 12/06/2017 à 18:13, Richard Henderson a écrit :
> On 06/11/2017 04:16 PM, Laurent Vivier wrote:
>> Move code of fmove to/from control register to a function
>>
>> Signed-off-by: Laurent Vivier 
>> ---
>>   target/m68k/translate.c | 66
>> ++---
>>   1 file changed, 41 insertions(+), 25 deletions(-)
> 
> In that this is 100% code movement,
> 
> Reviewed-by: Richard Henderson 
> 
> 
>> +cpu_abort(NULL, "Unimplemented: fmove from control %d",
>> mask);
>> +goto undef;
> 
> But cpu_abort doesn't return, and will exit qemu.
> This should be qemu_log_mask(LOG_UNIMP, ...).

Do you want I update the patch to fix that?

Thanks,
Laurent

Re: [Qemu-devel] [RFC PATCH] target-s390x: Implement mvcos instruction

2017-06-12 Thread Thomas Huth

On 01.03.2017 13:19, Miroslav Benes wrote:
> On Wed, 1 Mar 2017, Thomas Huth wrote:
> 
>> On 28.02.2017 14:17, Miroslav Benes wrote:
>>> Implement MVCOS instruction, which the Linux kernel uses in user access
>>> functions.
>>>
>>> Signed-off-by: Miroslav Benes 
>>> ---
>>> I tried to do my best to follow the specification but it is quite
>>> possible that I got something wrong because of my lack of
>>> understanding. Especially I am not sure about all those bit ops :/.
>>>
>>> Anyway, there is one piece missing. The actual use of keys and
>>> address-space-control during the move. I used fast_memmove, but
>>> it is not correct. Is there a helper which I could use? I looked at
>>> other instructions which should implement access control, but there were
>>> silently ignore it :).
>>
>> I'm not aware of a function that could deal with two address spaces
>> already (but that does not mean that there is no such function already)
>> ... still, I guess, you likely need to write your own memmove helper
>> function that can deal with two different address spaces.
> 
> Ok, I thought that was the case. I'll try to come up with something.
>  
>>>  target/s390x/helper.h  |  1 +
>>>  target/s390x/insn-data.def |  2 ++
>>>  target/s390x/mem_helper.c  | 80 
>>> ++
>>>  target/s390x/translate.c   | 12 +++
>>>  4 files changed, 95 insertions(+)
>>>
>>> diff --git a/target/s390x/helper.h b/target/s390x/helper.h
>>> index 9102071d0aa4..bc5dfccc3d7e 100644
>>> --- a/target/s390x/helper.h
>>> +++ b/target/s390x/helper.h
>>> @@ -104,6 +104,7 @@ DEF_HELPER_FLAGS_2(iske, TCG_CALL_NO_RWG_SE, i64, env, 
>>> i64)
>>>  DEF_HELPER_FLAGS_3(sske, TCG_CALL_NO_RWG, void, env, i64, i64)
>>>  DEF_HELPER_FLAGS_2(rrbe, TCG_CALL_NO_RWG, i32, env, i64)
>>>  DEF_HELPER_3(csp, i32, env, i32, i64)
>>> +DEF_HELPER_5(mvcos, i32, env, i64, i64, i64, i64)
>>>  DEF_HELPER_4(mvcs, i32, env, i64, i64, i64)
>>>  DEF_HELPER_4(mvcp, i32, env, i64, i64, i64)
>>>  DEF_HELPER_4(sigp, i32, env, i64, i32, i64)
>>> diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
>>> index 075ff597c3de..a1e6d735d090 100644
>>> --- a/target/s390x/insn-data.def
>>> +++ b/target/s390x/insn-data.def
>>> @@ -854,6 +854,8 @@
>>>  /* LOAD USING REAL ADDRESS */
>>>  C(0xb24b, LURA,RRE,   Z,   0, r2, new, r1_32, lura, 0)
>>>  C(0xb905, LURAG,   RRE,   Z,   0, r2, r1, 0, lurag, 0)
>>> +/* MOVE WITH OPTIONAL SPECIFICATION */
>>> +C(0xc800, MVCOS,   SSF,   MVCOS, la1, a2, 0, 0, mvcos, 0)
>>>  /* MOVE TO PRIMARY */
>>>  C(0xda00, MVCP,SS_d,  Z,   la1, a2, 0, 0, mvcp, 0)
>>>  /* MOVE TO SECONDARY */
>>> diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
>>> index 675aba2e44d4..ca8f7c49250c 100644
>>> --- a/target/s390x/mem_helper.c
>>> +++ b/target/s390x/mem_helper.c
>>> @@ -1089,6 +1089,86 @@ uint32_t HELPER(mvcp)(CPUS390XState *env, uint64_t 
>>> l, uint64_t a1, uint64_t a2)
>>>  return cc;
>>>  }
>>>  
>>> +uint32_t HELPER(mvcos)(CPUS390XState *env, uint64_t r0, uint64_t dest,
>>> +   uint64_t src, uint64_t len)
>>> +{
>>> +int cc;
>>> +int key1, as1, abit1, kbit1;
>>> +int key2, as2, abit2, kbit2;
>>> +
>>> +HELPER_LOG("%s dest %" PRIx64 ", src %" PRIx64 ", len %" PRIx64 "\n",
>>> +   __func__, dest, src, len);
>>> +
>>> +/* check DAT */
>>> +if (!(env->psw.mask & PSW_MASK_DAT)) {
>>> +program_interrupt(env, PGM_SPECIAL_OP, 2);
>>
>> Length of the opcode is 6 bytes, not 2.
> 
> True. Sorry, I don't know where 2 came from. It does not make sense.

As I recently had to learn it the hard way (while implementing the TEST
BLOCK instruction), you should use ILEN_LATER_INC here instead of 2 (or
6), since the Special operation exception is suppressing, too, i.e. the
program counter should be increased afterwards to the next instruction.

BTW, are you still working on a new version of this patch?

 Thomas

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Dave Hansen

On 06/12/2017 09:28 AM, Michael S. Tsirkin wrote:
> 
>> The hypervisor is going to throw away the contents of these pages,
>> right?
> It should be careful and only throw away contents that was there before
> report_unused_page_block was invoked.  Hypervisor is responsible for not
> corrupting guest memory.  But that's not something an mm patch should
> worry about.

That makes sense.  I'm struggling to imagine how the hypervisor makes
use of this information, though.  Does it make the pages read-only
before this, and then it knows if there has not been a write *and* it
gets notified via this new mechanism that it can throw the page away?

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Michael S. Tsirkin

On Mon, Jun 12, 2017 at 07:10:12AM -0700, Dave Hansen wrote:
> Please stop cc'ing me on things also sent to closed mailing lists
> (virtio-...@lists.oasis-open.org).  I'm happy to review things on open
> lists, but I'm not fond of the closed lists bouncing things at me.
> 
> On 06/09/2017 03:41 AM, Wei Wang wrote:
> > Add a function to find a page block on the free list specified by the
> > caller. Pages from the page block may be used immediately after the
> > function returns. The caller is responsible for detecting or preventing
> > the use of such pages.
> 
> This description doesn't tell me very much about what's going on here.
> Neither does the comment.
> 
> "Pages from the page block may be used immediately after the
>  function returns".
> 
> Used by who?  Does the "may" here mean that it is OK, or is it a warning
> that the contents will be thrown away immediately?

I agree here. Don't tell callers what they should do, say what does the
function does. "offer" also confuses. Here's a better comment

--->
mm: support reporting free page blocks

This adds support for reporting blocks of pages on the free list
specified by the caller.

As pages can leave the free list during this call or immediately
afterwards, they are not guaranteed to be free after the function
returns. The only guarantee this makes is that the page was on the free
list at some point in time after the function has been invoked.

Therefore, it is not safe for caller to use any pages on the returned
block or to discard data that is put there after the function returns.
However, it is safe for caller to discard data that was in one of these
pages before the function was invoked.

---

And repeat the last part in a code comment:

 * Note: it is not safe for caller to use any pages on the returned
 * block or to discard data that is put there after the function returns.
 * However, it is safe for caller to discard data that was in one of these
 * pages before the function was invoked.

> The hypervisor is going to throw away the contents of these pages,
> right?

It should be careful and only throw away contents that was there before
report_unused_page_block was invoked.  Hypervisor is responsible for not
corrupting guest memory.  But that's not something an mm patch should
worry about.

>  As soon as the spinlock is released, someone can allocate a
> page, and put good data in it.  What keeps the hypervisor from throwing
> away good data?

API should require this explicitly. Hopefully above answers this question.

-- 
MST

Re: [Qemu-devel] [PATCH v2 2/6] migration: let MigrationState be a qdev

2017-06-12 Thread Eduardo Habkost

On Mon, Jun 12, 2017 at 12:46:09PM +0800, Peter Xu wrote:
> On Fri, Jun 09, 2017 at 03:39:24PM +0200, Markus Armbruster wrote:
> > Peter Xu  writes:
> > 
> > > Let the old man "MigrationState" join the object family. Direct benefit
> > > is that we can start to use all the property features derived from
> > > current QDev, like: HW_COMPAT_* bits, command line setup for migration
> > > parameters (so will never need to set them up each time using HMP/QMP,
> > > this is really, really attractive for test writters), etc.
> > >
> > > I see no reason to disallow this happen yet. So let's start from this
> > > one, to see whether it would be anything good.
> > >
> > > No functional change at all.
> > >
> > > Signed-off-by: Peter Xu 
> > > ---
> > >  include/migration/migration.h | 19 ++
> > >  migration/migration.c | 61 
> > > ---
> > >  2 files changed, 59 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > > index 79b5484..bd0186c 100644
> > > --- a/include/migration/migration.h
> > > +++ b/include/migration/migration.h
> > > @@ -21,6 +21,7 @@
> > >  #include "qapi-types.h"
> > >  #include "exec/cpu-common.h"
> > >  #include "qemu/coroutine_int.h"
> > > +#include "hw/qdev.h"
> > >  
> > >  #define QEMU_VM_FILE_MAGIC   0x5145564d
> > >  #define QEMU_VM_FILE_VERSION_COMPAT  0x0002
> > > @@ -49,6 +50,8 @@ enum mig_rp_message_type {
> > >  MIG_RP_MSG_MAX
> > >  };
> > >  
> > > +#define TYPE_MIGRATION "migration"
> > > +
> > >  /* State for the incoming migration */
> > >  struct MigrationIncomingState {
> > >  QEMUFile *from_src_file;
> > > @@ -91,8 +94,24 @@ struct MigrationIncomingState {
> > >  MigrationIncomingState *migration_incoming_get_current(void);
> > >  void migration_incoming_state_destroy(void);
> > >  
> > > +#define MIGRATION_CLASS(klass) \
> > > +OBJECT_CLASS_CHECK(MigrationClass, (klass), TYPE_MIGRATION)
> > > +#define MIGRATION_OBJ(obj) \
> > > +OBJECT_CHECK(MigrationState, (obj), TYPE_MIGRATION)
> > > +#define MIGRATION_GET_CLASS(obj) \
> > > +OBJECT_GET_CLASS(MigrationClass, (obj), TYPE_MIGRATION)
> > > +
> > > +typedef struct MigrationClass {
> > > +/*< private >*/
> > > +DeviceClass parent_class;
> > > +} MigrationClass;
> > > +
> > >  struct MigrationState
> > >  {
> > > +/*< private >*/
> > > +DeviceState parent_obj;
> > > +
> > > +/*< public >*/
> > 
> > Turning MigrationState into a QOM object so you can use QOM
> > infrastructure makes sense.  But why turn it into a device?  It doesn't
> > feel device-like to me.  Would ObjectClass and Object instead of
> > DeviceClass and DeviceState do?
> 
> Yeah. That's the main reason for the series to be (was) RFC.
> 
> I was trying to use the HW_COMPAT_* bits and -global, and that's QDev
> thing (IIUC you got that already :). I am just curious about why that
> is not for QObject from the very beginning, then it'll be easier.
> 
> For now, IMHO QDev is okay as well for migration, as long as it's kept
> internally inside QEMU. But sure at least I should turn user_creatable
> to off. I'll investigate more to see how to make this a safer approach
> in next post.

We could allow non-device QOM objects use the global property
system optionally.

We could make qdev_prop_set_globals() work on any Object*, and
let QOM classes call it on post_init if they want to be
configurable by global properties too.

Note that this would break qdev_prop_check_globals(), because it
expects globals to work only on TYPE_DEVICE.  We could address
that by introducing a new interface type to indicate the type
works with -global (something like
INTERFACE_CONFIGURABLE_BY_GLOBAL_PROPERTIES, but shorter?).

Such a system would probably allow us to replace
default_machine_opts, default_boot_order, default_display,
default_ram_size (and probably many other compat fields) on
MachineClass.  It could also be used to implement -machine and
-accel options by simply translating them to global properties
(like we already do for -cpu).

(This sounds like reinventing a QemuOpts-like system on top of
global properties.  Maybe that's a bad thing, maybe that's a good
thing.  I'm not sure.)

> [...]

-- 
Eduardo

Re: [Qemu-devel] [PATCH v4 4/7] target-m68k: move fmove CR to a function

2017-06-12 Thread Richard Henderson


On 06/11/2017 04:16 PM, Laurent Vivier wrote:

Move code of fmove to/from control register to a function

Signed-off-by: Laurent Vivier 
---
  target/m68k/translate.c | 66 ++---
  1 file changed, 41 insertions(+), 25 deletions(-)


In that this is 100% code movement,

Reviewed-by: Richard Henderson 



+cpu_abort(NULL, "Unimplemented: fmove from control %d", mask);
+goto undef;


But cpu_abort doesn't return, and will exit qemu.
This should be qemu_log_mask(LOG_UNIMP, ...).


r~

Re: [Qemu-devel] allocation zone extensions for the firmware linker/loader

2017-06-12 Thread Paolo Bonzini



On 08/06/2017 19:44, Michael S. Tsirkin wrote:
> On Tue, Jun 06, 2017 at 08:10:17PM +0200, Laszlo Ersek wrote:
>> On 06/05/17 18:02, Michael S. Tsirkin wrote:
>>> On Sat, Jun 03, 2017 at 09:36:23AM +0200, Laszlo Ersek wrote:
 On 06/02/17 17:45, Laszlo Ersek wrote:

> The patches can cause linker/loader breakage when old firmware is booted
> on new QEMU. However, that's no problem (it's nothing new), the next
> release of QEMU should bundle the new firmware binaries as always.

 Dave made a good point (which I should have realized myself, really!),
 namely if you launch old fw on old qemu, then migrate the guest to a new
 qemu and then reboot the guest on the target host, within the migrated
 VM, things will break.

 So that makes this approach dead in the water.

 Possible mitigations I could think of:
 - Make it machine type dependent. Complicated (we don't usually bind
 ACPI generation to machine types) and wouldn't help existing devices.
 - Let the firmware negotiate these extensions. Very complicated (new
 fw-cfg files are needed for negotiation) and wouldn't help existing 
 devices.
>>>
>>> This last option *shouldn't* be complicated. If it is something's wrong.
>>>
>>> Maybe we made a mistake when we added etc/smi/*features*.
>>>
>>> It's not too late to move these to etc/*features* for new
>>> machine types if we want to and if you can do the firmware
>>> work. Then you'd just take out a bit and be done with it.
>>>
>>> I don't insist on doing the ACPI thing now but I do think
>>> infrastructure for negotiating extensions should be there.
>>
>> Different drivers in the firmware would need to negotiate different
>> questions / features with QEMU independently of each other. The "thing"
>> in OVMF that negotiates (and uses) the SMI broadcast is very-very
>> different and separate from the "thing" in OVMF that handles the ACPI
>> linker/loader script.
> 
> They both could call a common library.
> 
> Also, we don't need separate fw cfg files - we could
> reserve ranges of bits in a single file.
> E.g. bits 0-31 - smi, 32-63 - tseg, etc.

TSEG size is an integer, not a boolean...

Paolo

Re: [Qemu-devel] [PATCH] MAINTAINERS: seccomp: change email contact for Eduardo Otubo

2017-06-12 Thread Thomas Huth

On 12.06.2017 14:18, Eduardo Otubo wrote:
> Signed-off-by: Eduardo Otubo 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 120788d..0d065a0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1490,7 +1490,7 @@ F: tests/vmstate-static-checker-data/
>  F: docs/migration.txt
>  
>  Seccomp
> -M: Eduardo Otubo 
> +M: Eduardo Otubo 
>  S: Supported
>  F: qemu-seccomp.c
>  F: include/sysemu/seccomp.h

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH v2 0/3] char: fix chardev aliases regression

2017-06-12 Thread Paolo Bonzini



On 08/06/2017 13:59, Marc-André Lureau wrote:
> Hi,
> 
> The patch "char: move CharBackend handling in char-fe unit" broke
> chardev aliases. Here is a small series to fix it, and add a simple
> unit test to check the alias keep working.
> 
> v2:
> - move HAVE_CHARDEV_* in osdep.h (Markus Armbruster)
> - add r-b tags from Eric for the patches that didn't change
> 
> Marc-André Lureau (3):
>   char: fix alias devices regression
>   chardev: don't use alias names in parse_compat()
>   test-char: start a /char/serial test
> 
>  include/chardev/char-parallel.h |  5 -
>  include/chardev/char-serial.h   |  8 
>  include/qemu/osdep.h| 13 +
>  chardev/char.c  |  4 ++--
>  tests/test-char.c   | 30 ++
>  5 files changed, 45 insertions(+), 15 deletions(-)
> 

Queued, thanks.

Paolo

[Qemu-devel] [PATCH 1/2] docs: create interop/ subdirectory

2017-06-12 Thread Paolo Bonzini

This is for the future interoperability & management guide.  It includes
the QAPI docs, including the automatically generated ones, other socket
protocols (vhost-user, VNC), and the qcow2 file format.

Signed-off-by: Paolo Bonzini 
---
 .gitignore | 16 ++---
 Makefile   | 69 +-
 configure  |  2 +-
 docs/{specs => interop}/parallels.txt  |  0
 docs/{specs => interop}/qcow2.txt  |  0
 docs/{specs => interop}/qed_spec.txt   |  0
 docs/{ => interop}/qemu-ga-ref.texi|  0
 docs/{ => interop}/qemu-qmp-ref.texi   |  0
 docs/{ => interop}/qmp-intro.txt   |  0
 docs/{ => interop}/qmp-spec.txt|  0
 docs/{specs => interop}/vhost-user.txt |  0
 .../{ => interop}/vnc-ledstate-Pseudo-encoding.txt |  0
 rules.mak  |  2 +-
 13 files changed, 50 insertions(+), 39 deletions(-)
 rename docs/{specs => interop}/parallels.txt (100%)
 rename docs/{specs => interop}/qcow2.txt (100%)
 rename docs/{specs => interop}/qed_spec.txt (100%)
 rename docs/{ => interop}/qemu-ga-ref.texi (100%)
 rename docs/{ => interop}/qemu-qmp-ref.texi (100%)
 rename docs/{ => interop}/qmp-intro.txt (100%)
 rename docs/{ => interop}/qmp-spec.txt (100%)
 rename docs/{specs => interop}/vhost-user.txt (100%)
 rename docs/{ => interop}/vnc-ledstate-Pseudo-encoding.txt (100%)

diff --git a/.gitignore b/.gitignore
index 55a001e3b8..11bb1b12da 100644
--- a/.gitignore
+++ b/.gitignore
@@ -99,14 +99,14 @@
 /pc-bios/optionrom/kvmvapic.img
 /pc-bios/s390-ccw/s390-ccw.elf
 /pc-bios/s390-ccw/s390-ccw.img
-/docs/qemu-ga-qapi.texi
-/docs/qemu-ga-ref.html
-/docs/qemu-ga-ref.info*
-/docs/qemu-ga-ref.txt
-/docs/qemu-qmp-qapi.texi
-/docs/qemu-qmp-ref.html
-/docs/qemu-qmp-ref.info*
-/docs/qemu-qmp-ref.txt
+/docs/interop/qemu-ga-qapi.texi
+/docs/interop/qemu-ga-ref.html
+/docs/interop/qemu-ga-ref.info*
+/docs/interop/qemu-ga-ref.txt
+/docs/interop/qemu-qmp-qapi.texi
+/docs/interop/qemu-qmp-ref.html
+/docs/interop/qemu-qmp-ref.info*
+/docs/interop/qemu-qmp-ref.txt
 /docs/version.texi
 *.tps
 .stgit-*
diff --git a/Makefile b/Makefile
index 32d4441697..c27389af28 100644
--- a/Makefile
+++ b/Makefile
@@ -207,8 +207,8 @@ HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
 
 ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
-DOCS+=docs/qemu-qmp-ref.html docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.7
-DOCS+=docs/qemu-ga-ref.html docs/qemu-ga-ref.txt docs/qemu-ga-ref.7
+DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt 
docs/interop/qemu-qmp-ref.7
+DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt 
docs/interop/qemu-ga-ref.7
 ifdef CONFIG_VIRTFS
 DOCS+=fsdev/virtfs-proxy-helper.1
 endif
@@ -519,11 +519,12 @@ distclean: clean
rm -f qemu-doc.vr qemu-doc.txt
rm -f config.log
rm -f linux-headers/asm
-   rm -f docs/qemu-ga-qapi.texi docs/qemu-qmp-qapi.texi docs/version.texi
-   rm -f docs/qemu-qmp-ref.7 docs/qemu-ga-ref.7
-   rm -f docs/qemu-qmp-ref.txt docs/qemu-ga-ref.txt
-   rm -f docs/qemu-qmp-ref.pdf docs/qemu-ga-ref.pdf
-   rm -f docs/qemu-qmp-ref.html docs/qemu-ga-ref.html
+   rm -f docs/version.texi
+   rm -f docs/interop/qemu-ga-qapi.texi docs/interop/qemu-qmp-qapi.texi
+   rm -f docs/interop/qemu-qmp-ref.7 docs/interop/qemu-ga-ref.7
+   rm -f docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
+   rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
+   rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
 done
@@ -562,13 +563,13 @@ install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.txt "$(DESTDIR)$(qemu_docdir)"
-   $(INSTALL_DATA) docs/qemu-qmp-ref.html "$(DESTDIR)$(qemu_docdir)"
-   $(INSTALL_DATA) docs/qemu-qmp-ref.txt "$(DESTDIR)$(qemu_docdir)"
+   $(INSTALL_DATA) docs/interop/qemu-qmp-ref.html 
"$(DESTDIR)$(qemu_docdir)"
+   $(INSTALL_DATA) docs/interop/qemu-qmp-ref.txt "$(DESTDIR)$(qemu_docdir)"
 ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man7"
-   $(INSTALL_DATA) docs/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
+   $(INSTALL_DATA) docs/interop/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
 ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
@@ -576,9 +577,9 @@ ifneq ($(TOOLS),)
 endif
 ifneq (,$(findstring qemu-ga,$(TOOLS)))
$(INSTALL_DATA) qemu-ga.8

[Qemu-devel] [PATCH 2/2] qemu-doc: include version number

2017-06-12 Thread Paolo Bonzini

Note that TEXI2POD doesn't support @set/@clear, so pass the value
manually on the command line.

This patch partially undoes commit bd7f974796 ("qapi: Clean up build of
generated documentation", 2017-03-16).

Signed-off-by: Paolo Bonzini 
---
 Makefile  | 17 +
 qemu-doc.texi |  5 +++--
 2 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index c27389af28..b017e3abf8 100644
--- a/Makefile
+++ b/Makefile
@@ -669,33 +669,26 @@ ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
 MAKEINFO=makeinfo
 MAKEINFOINCLUDES= -I docs -I $( $@,"GEN","$@")
 
-%.html: %.texi
+%.html: %.texi docs/version.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers 
\
--html $< -o $@,"GEN","$@")
 
-%.info: %.texi
+%.info: %.texi docs/version.texi
$(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
 
-%.txt: %.texi
+%.txt: %.texi docs/version.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers 
\
--plaintext $< -o $@,"GEN","$@")
 
-%.pdf: %.texi
+%.pdf: %.texi docs/version.texi
$(call quiet-command,texi2pdf $(TEXI2PDFFLAGS) $< -o $@,"GEN","$@")
 
-docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.info \
-docs/interop/qemu-ga-ref.txt docs/interop/qemu-ga-ref.pdf \
-docs/interop/qemu-ga-ref.7.pod: docs/version.texi
-docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.info \
-docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.pdf \
-docs/interop/qemu-qmp-ref.pod: docs/version.texi
-
 qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > 
$@,"GEN","$@")
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 965ba5929e..21079fd675 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1,11 +1,12 @@
 \input texinfo @c -*- texinfo -*-
 @c %**start of header
 @setfilename qemu-doc.info
+@include version.texi
 
 @documentlanguage en
 @documentencoding UTF-8
 
-@settitle QEMU Emulator User Documentation
+@settitle QEMU version @value{VERSION} User Documentation
 @exampleindent 0
 @paragraphindent 0
 @c %**end of header
@@ -19,7 +20,7 @@
 @iftex
 @titlepage
 @sp 7
-@center @titlefont{QEMU Emulator}
+@center @titlefont{QEMU version @value{VERSION}}
 @sp 1
 @center @titlefont{User Documentation}
 @sp 3
-- 
2.13.0

[Qemu-devel] [PATCH v2 0/2] docs: create docs/interop/, include version number in qemu-doc

2017-06-12 Thread Paolo Bonzini

Submitted together because of Makefile conflicts.

Paolo Bonzini (2):
  docs: create interop/ subdirectory
  qemu-doc: include version number

 .gitignore | 16 ++---
 Makefile   | 72 --
 configure  |  2 +-
 docs/{specs => interop}/parallels.txt  |  0
 docs/{specs => interop}/qcow2.txt  |  0
 docs/{specs => interop}/qed_spec.txt   |  0
 docs/{ => interop}/qemu-ga-ref.texi|  0
 docs/{ => interop}/qemu-qmp-ref.texi   |  0
 docs/{ => interop}/qmp-intro.txt   |  0
 docs/{ => interop}/qmp-spec.txt|  0
 docs/{specs => interop}/vhost-user.txt |  0
 .../{ => interop}/vnc-ledstate-Pseudo-encoding.txt |  0
 qemu-doc.texi  |  5 +-
 rules.mak  |  2 +-
 14 files changed, 51 insertions(+), 46 deletions(-)
 rename docs/{specs => interop}/parallels.txt (100%)
 rename docs/{specs => interop}/qcow2.txt (100%)
 rename docs/{specs => interop}/qed_spec.txt (100%)
 rename docs/{ => interop}/qemu-ga-ref.texi (100%)
 rename docs/{ => interop}/qemu-qmp-ref.texi (100%)
 rename docs/{ => interop}/qmp-intro.txt (100%)
 rename docs/{ => interop}/qmp-spec.txt (100%)
 rename docs/{specs => interop}/vhost-user.txt (100%)
 rename docs/{ => interop}/vnc-ledstate-Pseudo-encoding.txt (100%)

-- 
2.13.0

[Qemu-devel] [PATCH 1/2] block/rbd: enable filename option and parsing

2017-06-12 Thread Jeff Cody

When enabling option parsing and blockdev-add for rbd, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones 
Signed-off-by: Jeff Cody 
---
 block/rbd.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/block/rbd.c b/block/rbd.c
index e551639..cbc0b85 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -340,6 +340,10 @@ static QemuOptsList runtime_opts = {
 .type = QEMU_OPT_STRING,
 .help = "Legacy rados key/value option parameters",
 },
+{
+.name = "filename",
+.type = QEMU_OPT_STRING,
+},
 { /* end of list */ }
 },
 };
@@ -541,12 +545,24 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 {
 BDRVRBDState *s = bs->opaque;
 const char *pool, *snap, *conf, *user, *image_name, *keypairs;
-const char *secretid;
+const char *secretid, *filename;
 QemuOpts *opts;
 Error *local_err = NULL;
 char *mon_host = NULL;
 int r;
 
+/* If we are given a filename, parse the filename, with precedence given to
+ * filename encoded options */
+filename = qdict_get_try_str(options, "filename");
+if (filename) {
+qemu_rbd_parse_filename(filename, options, _err);
+if (local_err) {
+r = -EINVAL;
+error_propagate(errp, local_err);
+goto exit;
+}
+}
+
 opts = qemu_opts_create(_opts, NULL, 0, _abort);
 qemu_opts_absorb_qdict(opts, options, _err);
 if (local_err) {
@@ -665,6 +681,7 @@ failed_shutdown:
 failed_opts:
 qemu_opts_del(opts);
 g_free(mon_host);
+exit:
 return r;
 }
 
-- 
2.9.3

[Qemu-devel] [PATCH 2/2] block/iscsi: enable filename option and parsing

2017-06-12 Thread Jeff Cody

When enabling option parsing and blockdev-add for iscsi, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones 
Signed-off-by: Jeff Cody 
---
 block/iscsi.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 5daa201..ba50e76 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1732,6 +1732,10 @@ static QemuOptsList runtime_opts = {
 .name = "timeout",
 .type = QEMU_OPT_NUMBER,
 },
+{
+.name = "filename",
+.type = QEMU_OPT_STRING,
+},
 { /* end of list */ }
 },
 };
@@ -1747,12 +1751,24 @@ static int iscsi_open(BlockDriverState *bs, QDict 
*options, int flags,
 char *initiator_name = NULL;
 QemuOpts *opts;
 Error *local_err = NULL;
-const char *transport_name, *portal, *target;
+const char *transport_name, *portal, *target, *filename;
 #if LIBISCSI_API_VERSION >= (20160603)
 enum iscsi_transport_type transport;
 #endif
 int i, ret = 0, timeout = 0, lun;
 
+/* If we are given a filename, parse the filename, with precedence given to
+ * filename encoded options */
+filename = qdict_get_try_str(options, "filename");
+if (filename) {
+iscsi_parse_filename(filename, options, _err);
+if (local_err) {
+ret = -EINVAL;
+error_propagate(errp, local_err);
+goto exit;
+}
+}
+
 opts = qemu_opts_create(_opts, NULL, 0, _abort);
 qemu_opts_absorb_qdict(opts, options, _err);
 if (local_err) {
@@ -1967,6 +1983,7 @@ out:
 }
 memset(iscsilun, 0, sizeof(IscsiLun));
 }
+exit:
 return ret;
 }
 
-- 
2.9.3

[Qemu-devel] [PATCH 0/2] Parse 'filename' option for RBD/iSCSI

2017-06-12 Thread Jeff Cody

We need to be able to parse the 'filename' option for rbd and iscsi, because
there may exist images in the wild that have json backing files, that specify
the filename argument.

Jeff Cody (2):
  block/rbd: enable filename option and parsing
  block/iscsi: enable filename option and parsing

 block/iscsi.c | 19 ++-
 block/rbd.c   | 19 ++-
 2 files changed, 36 insertions(+), 2 deletions(-)

-- 
2.9.3

[Qemu-devel] [PATCH v6 6/6] target: [tcg, arm] Port to generic translation framework

2017-06-12 Thread Lluís Vilanova

Signed-off-by: Lluís Vilanova 
---
 target/arm/translate-a64.c |  346 ++---
 target/arm/translate.c |  720 ++--
 target/arm/translate.h |   46 ++-
 3 files changed, 560 insertions(+), 552 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 860e279658..60264ce1c2 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -302,17 +302,17 @@ static void gen_exception(int excp, uint32_t syndrome, 
uint32_t target_el)
 
 static void gen_exception_internal_insn(DisasContext *s, int offset, int excp)
 {
-gen_a64_set_pc_im(s->pc - offset);
+gen_a64_set_pc_im(s->base.pc_next - offset);
 gen_exception_internal(excp);
-s->is_jmp = DISAS_EXC;
+s->base.jmp_type = DJ_EXC;
 }
 
 static void gen_exception_insn(DisasContext *s, int offset, int excp,
uint32_t syndrome, uint32_t target_el)
 {
-gen_a64_set_pc_im(s->pc - offset);
+gen_a64_set_pc_im(s->base.pc_next - offset);
 gen_exception(excp, syndrome, target_el);
-s->is_jmp = DISAS_EXC;
+s->base.jmp_type = DJ_EXC;
 }
 
 static void gen_ss_advance(DisasContext *s)
@@ -340,7 +340,7 @@ static void gen_step_complete_exception(DisasContext *s)
 gen_ss_advance(s);
 gen_exception(EXCP_UDEF, syn_swstep(s->ss_same_el, 1, s->is_ldex),
   default_exception_el(s));
-s->is_jmp = DISAS_EXC;
+s->base.jmp_type = DJ_EXC;
 }
 
 static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest)
@@ -348,13 +348,14 @@ static inline bool use_goto_tb(DisasContext *s, int n, 
uint64_t dest)
 /* No direct tb linking with singlestep (either QEMU's or the ARM
  * debug architecture kind) or deterministic io
  */
-if (s->singlestep_enabled || s->ss_active || (s->tb->cflags & CF_LAST_IO)) 
{
+if (s->base.singlestep_enabled || s->ss_active ||
+(s->base.tb->cflags & CF_LAST_IO)) {
 return false;
 }
 
 #ifndef CONFIG_USER_ONLY
 /* Only link tbs from inside the same guest page */
-if ((s->tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) {
+if ((s->base.tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) {
 return false;
 }
 #endif
@@ -366,21 +367,21 @@ static inline void gen_goto_tb(DisasContext *s, int n, 
uint64_t dest)
 {
 TranslationBlock *tb;
 
-tb = s->tb;
+tb = s->base.tb;
 if (use_goto_tb(s, n, dest)) {
 tcg_gen_goto_tb(n);
 gen_a64_set_pc_im(dest);
 tcg_gen_exit_tb((intptr_t)tb + n);
-s->is_jmp = DISAS_TB_JUMP;
+s->base.jmp_type = DJ_TB_JUMP;
 } else {
 gen_a64_set_pc_im(dest);
 if (s->ss_active) {
 gen_step_complete_exception(s);
-} else if (s->singlestep_enabled) {
+} else if (s->base.singlestep_enabled) {
 gen_exception_internal(EXCP_DEBUG);
 } else {
 tcg_gen_lookup_and_goto_ptr(cpu_pc);
-s->is_jmp = DISAS_TB_JUMP;
+s->base.jmp_type = DJ_TB_JUMP;
 }
 }
 }
@@ -397,11 +398,11 @@ static void unallocated_encoding(DisasContext *s)
 qemu_log_mask(LOG_UNIMP, \
   "%s:%d: unsupported instruction encoding 0x%08x "  \
   "at pc=%016" PRIx64 "\n",  \
-  __FILE__, __LINE__, insn, s->pc - 4);  \
+  __FILE__, __LINE__, insn, s->base.pc_next - 4);\
 unallocated_encoding(s); \
 } while (0);
 
-static void init_tmp_a64_array(DisasContext *s)
+void init_tmp_a64_array(DisasContext *s)
 {
 #ifdef CONFIG_DEBUG_TCG
 int i;
@@ -1216,11 +1217,11 @@ static inline AArch64DecodeFn *lookup_disas_fn(const 
AArch64DecodeTable *table,
  */
 static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 {
-uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4;
+uint64_t addr = s->base.pc_next + sextract32(insn, 0, 26) * 4 - 4;
 
 if (insn & (1U << 31)) {
 /* C5.6.26 BL Branch with link */
-tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
 }
 
 /* C5.6.20 B Branch / C5.6.26 BL Branch with link */
@@ -1243,7 +1244,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t 
insn)
 sf = extract32(insn, 31, 1);
 op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
 rt = extract32(insn, 0, 5);
-addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
+addr = s->base.pc_next + sextract32(insn, 5, 19) * 4 - 4;
 
 tcg_cmp = read_cpu_reg(s, rt, sf);
 label_match = gen_new_label();
@@ -1251,7 +1252,7 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t 
insn)
 tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
 tcg_cmp, 0, label_match);
 
-gen_goto_tb(s, 0, s->pc);
+gen_goto_tb(s,

[Qemu-devel] [PATCH v6 1/6] Pass generic CPUState to gen_intermediate_code()

2017-06-12 Thread Lluís Vilanova

Needed to implement a target-agnostic gen_intermediate_code() in the
future.

Signed-off-by: Lluís Vilanova 
Reviewed-by: David Gibson 
Reviewed-by: Richard Henderson 
---
 include/exec/exec-all.h   |2 +-
 target/alpha/translate.c  |   11 +--
 target/arm/translate.c|   20 ++--
 target/cris/translate.c   |   17 -
 target/i386/translate.c   |   13 ++---
 target/lm32/translate.c   |   22 +++---
 target/m68k/translate.c   |   15 +++
 target/microblaze/translate.c |   22 +++---
 target/mips/translate.c   |   15 +++
 target/moxie/translate.c  |   14 +++---
 target/openrisc/translate.c   |   19 ++-
 target/ppc/translate.c|   15 +++
 target/s390x/translate.c  |   13 ++---
 target/sh4/translate.c|   15 +++
 target/sparc/translate.c  |   11 +--
 target/tilegx/translate.c |7 +++
 target/tricore/translate.c|9 -
 target/unicore32/translate.c  |   17 -
 target/xtensa/translate.c |   13 ++---
 translate-all.c   |2 +-
 20 files changed, 130 insertions(+), 142 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 87ae10bcc9..1ec7637170 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -43,7 +43,7 @@ typedef ram_addr_t tb_page_addr_t;
 
 #include "qemu/log.h"
 
-void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
+void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb);
 void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
   target_ulong *data);
 
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 7c45ae360c..9b60680454 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -2900,10 +2900,9 @@ static ExitStatus translate_one(DisasContext *ctx, 
uint32_t insn)
 return ret;
 }
 
-void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb)
+void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb)
 {
-AlphaCPU *cpu = alpha_env_get_cpu(env);
-CPUState *cs = CPU(cpu);
+CPUAlphaState *env = cpu->env_ptr;
 DisasContext ctx, *ctxp = 
 target_ulong pc_start;
 target_ulong pc_mask;
@@ -2918,7 +2917,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
TranslationBlock *tb)
 ctx.pc = pc_start;
 ctx.mem_idx = cpu_mmu_index(env, false);
 ctx.implver = env->implver;
-ctx.singlestep_enabled = cs->singlestep_enabled;
+ctx.singlestep_enabled = cpu->singlestep_enabled;
 
 #ifdef CONFIG_USER_ONLY
 ctx.ir = cpu_std_ir;
@@ -2961,7 +2960,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
TranslationBlock *tb)
 tcg_gen_insn_start(ctx.pc);
 num_insns++;
 
-if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
+if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) {
 ret = gen_excp(, EXCP_DEBUG, 0);
 /* The address covered by the breakpoint must be included in
[tb->pc, tb->pc + tb->size) in order to for it to be
@@ -3030,7 +3029,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
TranslationBlock *tb)
 && qemu_log_in_addr_range(pc_start)) {
 qemu_log_lock();
 qemu_log("IN: %s\n", lookup_symbol(pc_start));
-log_target_disas(cs, pc_start, ctx.pc - pc_start, 1);
+log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1);
 qemu_log("\n");
 qemu_log_unlock();
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0862f9e4aa..96272a9888 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -11787,10 +11787,10 @@ static bool insn_crosses_page(CPUARMState *env, 
DisasContext *s)
 }
 
 /* generate intermediate code for basic block 'tb'.  */
-void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
 {
-ARMCPU *cpu = arm_env_get_cpu(env);
-CPUState *cs = CPU(cpu);
+CPUARMState *env = cpu->env_ptr;
+ARMCPU *arm_cpu = arm_env_get_cpu(env);
 DisasContext dc1, *dc = 
 target_ulong pc_start;
 target_ulong next_page_start;
@@ -11804,7 +11804,7 @@ void gen_intermediate_code(CPUARMState *env, 
TranslationBlock *tb)
  * the A32/T32 complexity to do with conditional execution/IT blocks/etc.
  */
 if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) {
-gen_intermediate_code_a64(cpu, tb);
+gen_intermediate_code_a64(arm_cpu, tb);
 return;
 }
 
@@ -11814,7 +11814,7 @@ void gen_intermediate_code(CPUARMState *env, 
TranslationBlock *tb)
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
-dc->singlestep_enabled = cs->singlestep_enabled;
+

[Qemu-devel] [PATCH v6 3/6] target: [tcg] Add generic translation framework

2017-06-12 Thread Lluís Vilanova

Signed-off-by: Lluís Vilanova 
---
 include/exec/gen-icount.h |2 
 include/exec/translate-all_template.h |   73 
 include/qom/cpu.h |   22 
 translate-all_template.h  |  204 +
 4 files changed, 300 insertions(+), 1 deletion(-)
 create mode 100644 include/exec/translate-all_template.h
 create mode 100644 translate-all_template.h

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 62d462e494..547c979629 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -44,7 +44,7 @@ static inline void gen_tb_start(TranslationBlock *tb)
 tcg_temp_free_i32(count);
 }
 
-static void gen_tb_end(TranslationBlock *tb, int num_insns)
+static inline void gen_tb_end(TranslationBlock *tb, int num_insns)
 {
 if (tb->cflags & CF_USE_ICOUNT) {
 /* Update the num_insn immediate parameter now that we know
diff --git a/include/exec/translate-all_template.h 
b/include/exec/translate-all_template.h
new file mode 100644
index 00..51c8a43b80
--- /dev/null
+++ b/include/exec/translate-all_template.h
@@ -0,0 +1,73 @@
+/*
+ * Generic intermediate code generation.
+ *
+ * Copyright (C) 2016-2017 Lluís Vilanova 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef EXEC__TRANSLATE_ALL_TEMPLATE_H
+#define EXEC__TRANSLATE_ALL_TEMPLATE_H
+
+/*
+ * Include this header from a target-specific file, and add a
+ *
+ * DisasContextBase base;
+ *
+ * member in your target-specific DisasContext.
+ */
+
+
+#include "exec/exec-all.h"
+
+
+/**
+ * BreakpointHitType:
+ * @BH_MISS: No hit
+ * @BH_HIT_INSN: Hit, but continue translating instruction
+ * @BH_HIT_TB: Hit, stop translating TB
+ *
+ * How to react to a breakpoint hit.
+ */
+typedef enum BreakpointHitType {
+BH_MISS,
+BH_HIT_INSN,
+BH_HIT_TB,
+} BreakpointHitType;
+
+/**
+ * DisasJumpType:
+ * @DJ_NEXT: Next instruction in program order
+ * @DJ_TOO_MANY: Too many instructions executed
+ * @DJ_TARGET: Start of target-specific conditions
+ *
+ * What instruction to disassemble next.
+ */
+typedef enum DisasJumpType {
+DJ_NEXT,
+DJ_TOO_MANY,
+DJ_TARGET,
+} DisasJumpType;
+
+/**
+ * DisasContextBase:
+ * @tb: Translation block for this disassembly.
+ * @pc_first: Address of first guest instruction in this TB.
+ * @pc_next: Address of next guest instruction in this TB (current during
+ *   disassembly).
+ * @num_insns: Number of translated instructions (including current).
+ * @singlestep_enabled: "Hardware" single stepping enabled.
+ *
+ * Architecture-agnostic disassembly context.
+ */
+typedef struct DisasContextBase {
+TranslationBlock *tb;
+target_ulong pc_first;
+target_ulong pc_next;
+DisasJumpType jmp_type;
+unsigned int num_insns;
+bool singlestep_enabled;
+} DisasContextBase;
+
+#endif  /* EXEC__TRANSLATE_ALL_TEMPLATE_H */
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 89ddb686fb..d46e8df756 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -982,6 +982,28 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, 
vaddr pc, int mask)
 return false;
 }
 
+/* Get first breakpoint matching a PC */
+static inline CPUBreakpoint *cpu_breakpoint_get(CPUState *cpu, vaddr pc,
+CPUBreakpoint *bp)
+{
+if (likely(bp == NULL)) {
+if (unlikely(!QTAILQ_EMPTY(>breakpoints))) {
+QTAILQ_FOREACH(bp, >breakpoints, entry) {
+if (bp->pc == pc) {
+return bp;
+}
+}
+}
+} else {
+QTAILQ_FOREACH_CONTINUE(bp, entry) {
+if (bp->pc == pc) {
+return bp;
+}
+}
+}
+return NULL;
+}
+
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
   int flags, CPUWatchpoint **watchpoint);
 int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
diff --git a/translate-all_template.h b/translate-all_template.h
new file mode 100644
index 00..3e3f0816f7
--- /dev/null
+++ b/translate-all_template.h
@@ -0,0 +1,204 @@
+/*
+ * Generic intermediate code generation.
+ *
+ * Copyright (C) 2016-2017 Lluís Vilanova 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef TRANSLATE_ALL_TEMPLATE_H
+#define TRANSLATE_ALL_TEMPLATE_H
+
+/*
+ * Include this header from a target-specific file, which must define the
+ * target-specific functions declared below.
+ *
+ * These must be paired with instructions in "exec/translate-all_template.h".
+ */
+
+
+#include "cpu.h"
+#include "qemu/error-report.h"
+
+
+static void gen_intermediate_code_target_init_disas_context(
+DisasContext *dc, CPUArchState *env);
+
+static void

[Qemu-devel] [PATCH v6 4/6] target: [tcg] Redefine DISAS_* onto the generic translation framework (DJ_*)

2017-06-12 Thread Lluís Vilanova

Temporarily redefine DISAS_* values based on DJ_TARGET. They should
disappear as targets get ported to the generic framework.

Signed-off-by: Lluís Vilanova 
---
 include/exec/exec-all.h  |   11 +++
 target/arm/translate.h   |   19 ++-
 target/cris/translate.c  |3 ++-
 target/m68k/translate.c  |3 ++-
 target/s390x/translate.c |3 ++-
 target/unicore32/translate.c |3 ++-
 6 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 1ec7637170..6ad31a8b6f 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -36,10 +36,13 @@ typedef ram_addr_t tb_page_addr_t;
 #endif
 
 /* is_jmp field values */
-#define DISAS_NEXT0 /* next instruction can be analyzed */
-#define DISAS_JUMP1 /* only pc was modified dynamically */
-#define DISAS_UPDATE  2 /* cpu state was modified dynamically */
-#define DISAS_TB_JUMP 3 /* only pc was modified statically */
+/* TODO: delete after all targets are transitioned to generic translation */
+#include "exec/translate-all_template.h"
+#define DISAS_NEXTDJ_NEXT   /* next instruction can be analyzed */
+#define DISAS_JUMP(DJ_TARGET + 0)   /* only pc was modified dynamically */
+#define DISAS_UPDATE  (DJ_TARGET + 1)   /* cpu state was modified dynamically 
*/
+#define DISAS_TB_JUMP (DJ_TARGET + 2)   /* only pc was modified statically */
+#define DISAS_TARGET  (DJ_TARGET + 3)   /* base for target-specific values */
 
 #include "qemu/log.h"
 
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 15d383d9af..e42fdbe61c 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -120,29 +120,30 @@ static void disas_set_insn_syndrome(DisasContext *s, 
uint32_t syn)
 }
 
 /* target-specific extra values for is_jmp */
+/* TODO: rename as DJ_* when transitioning this target to generic translation 
*/
 /* These instructions trap after executing, so the A32/T32 decoder must
  * defer them until after the conditional execution state has been updated.
  * WFI also needs special handling when single-stepping.
  */
-#define DISAS_WFI 4
-#define DISAS_SWI 5
+#define DISAS_WFI (DISAS_TARGET + 0)
+#define DISAS_SWI (DISAS_TARGET + 1)
 /* For instructions which unconditionally cause an exception we can skip
  * emitting unreachable code at the end of the TB in the A64 decoder
  */
-#define DISAS_EXC 6
+#define DISAS_EXC (DISAS_TARGET + 2)
 /* WFE */
-#define DISAS_WFE 7
-#define DISAS_HVC 8
-#define DISAS_SMC 9
-#define DISAS_YIELD 10
+#define DISAS_WFE (DISAS_TARGET + 3)
+#define DISAS_HVC (DISAS_TARGET + 4)
+#define DISAS_SMC (DISAS_TARGET + 5)
+#define DISAS_YIELD (DISAS_TARGET + 6)
 /* M profile branch which might be an exception return (and so needs
  * custom end-of-TB code)
  */
-#define DISAS_BX_EXCRET 11
+#define DISAS_BX_EXCRET (DISAS_TARGET + 7)
 /* For instructions which want an immediate exit to the main loop,
  * as opposed to attempting to use lookup_and_goto_ptr.
  */
-#define DISAS_EXIT 12
+#define DISAS_EXIT (DISAS_TARGET + 8)
 
 #ifdef TARGET_AARCH64
 void a64_translate_init(void);
diff --git a/target/cris/translate.c b/target/cris/translate.c
index 35931e7061..85916196fc 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -50,7 +50,8 @@
 #define BUG() (gen_BUG(dc, __FILE__, __LINE__))
 #define BUG_ON(x) ({if (x) BUG();})
 
-#define DISAS_SWI 5
+/* TODO: rename as DJ_* when transitioning this target to generic translation 
*/
+#define DISAS_SWI (DISAS_TARGET + 0)
 
 /* Used by the decoder.  */
 #define EXTRACT_FIELD(src, start, end) \
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 0a3372818c..3a0cb6227f 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -190,7 +190,8 @@ static void do_writebacks(DisasContext *s)
 }
 }
 
-#define DISAS_JUMP_NEXT 4
+/* TODO: rename as DJ_* when transitioning this target to generic translation 
*/
+#define DISAS_JUMP_NEXT (DISAS_TARGET + 0)
 
 #if defined(CONFIG_USER_ONLY)
 #define IS_USER(s) 1
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 2a17b3d7aa..98365bd29c 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -76,7 +76,8 @@ typedef struct {
 } u;
 } DisasCompare;
 
-#define DISAS_EXCP 4
+/* TODO: rename as DJ_* when transitioning this target to generic translation 
*/
+#define DISAS_EXCP (DISAS_TARGET + 0)
 
 #ifdef DEBUG_INLINE_BRANCHES
 static uint64_t inline_branch_hit[CC_OP_MAX];
diff --git a/target/unicore32/translate.c b/target/unicore32/translate.c
index 494ed58c10..374b6bb079 100644
--- a/target/unicore32/translate.c
+++ b/target/unicore32/translate.c
@@ -45,9 +45,10 @@ typedef struct DisasContext {
 #define IS_USER(s)  1
 #endif
 
+/* TODO: rename as DJ_* when transitioning this target to generic translation 
*/
 /* These instructions trap after executing, so defer them until after the
conditional executions state has been

[Qemu-devel] [PATCH v6 2/6] queue: Add macro for incremental traversal

2017-06-12 Thread Lluís Vilanova

Adds macro QTAILQ_FOREACH_CONTINUE to support incremental list
traversal.

Signed-off-by: Lluís Vilanova 
---
 include/qemu/queue.h |   12 
 1 file changed, 12 insertions(+)

diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 35292c3155..eb2bf9cb1c 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -415,6 +415,18 @@ struct {   
 \
 (var);  \
 (var) = ((var)->field.tqe_next))
 
+/**
+ * QTAILQ_FOREACH_CONTINUE:
+ * @var: Variable to resume iteration from.
+ * @field: Field in @var holding a QTAILQ_ENTRY for this queue.
+ *
+ * Resumes iteration on a queue from the element in @var.
+ */
+#define QTAILQ_FOREACH_CONTINUE(var, field) \
+for ((var) = ((var)->field.tqe_next);   \
+(var);  \
+(var) = ((var)->field.tqe_next))
+
 #define QTAILQ_FOREACH_SAFE(var, head, field, next_var) \
 for ((var) = ((head)->tqh_first);   \
 (var) && ((next_var) = ((var)->field.tqe_next), 1); \

[Qemu-devel] [RFC PATCH v6 0/6] translate: [tcg] Generic translation framework

2017-06-12 Thread Lluís Vilanova

This series proposes a generic (target-agnostic) instruction translation
framework.

It basically provides a generic main loop for instruction disassembly, which
calls target-specific functions when necessary. This generalization makes
inserting new code in the main loop easier, and helps in keeping all targets in
synch as to the contents of it.

This series also paves the way towards adding events to trace guest code
execution (BBLs and instructions).

I've ported i386/x86-64 and arm/aarch64 as an example to see how it fits in the
current organization, but will port the rest when this series gets merged.

Signed-off-by: Lluís Vilanova 
---

Changes in v6
=

* Rebase on upstream master (64175afc69).
* Reorder fields in DisasContextBase to minimize padding [Richard Henderson].


Changes in v5
=

* Remove stray uses of "restrict" keyword.


Changes in v4
=

* Document new macro QTAILQ_FOREACH_CONTINUE [Peter Maydell].
* Fix coding style errors reported by checkpatch.
* Remove use of "restrict" in added functions; it makes older gcc versions barf
  about compilation errors.


Changes in v3
=

* Rebase on 0737f32daf.


Changes in v2
=

* Port ARM and AARCH64 targets.
* Fold single-stepping checks into "max_insns" [Richard Henderson].
* Move instruction start marks to target code [Richard Henderson].
* Add target hook for TB start.
* Check for TCG temporary leaks.
* Move instruction disassembly into a target hook.
* Make breakpoint_hit() return an enum to accomodate target's needs (ARM).


Lluís Vilanova (6):
  Pass generic CPUState to gen_intermediate_code()
  queue: Add macro for incremental traversal
  target: [tcg] Add generic translation framework
  target: [tcg] Redefine DISAS_* onto the generic translation framework 
(DJ_*)
  target: [tcg,i386] Port to generic translation framework
  target: [tcg,arm] Port to generic translation framework


 include/exec/exec-all.h   |   13 -
 include/exec/gen-icount.h |2 
 include/exec/translate-all_template.h |   73 +++
 include/qemu/queue.h  |   12 +
 include/qom/cpu.h |   22 +
 target/alpha/translate.c  |   11 -
 target/arm/translate-a64.c|  346 
 target/arm/translate.c|  720 +
 target/arm/translate.h|   45 +-
 target/cris/translate.c   |   20 -
 target/i386/translate.c   |  305 ++
 target/lm32/translate.c   |   22 +
 target/m68k/translate.c   |   18 -
 target/microblaze/translate.c |   22 +
 target/mips/translate.c   |   15 -
 target/moxie/translate.c  |   14 -
 target/openrisc/translate.c   |   19 -
 target/ppc/translate.c|   15 -
 target/s390x/translate.c  |   16 -
 target/sh4/translate.c|   15 -
 target/sparc/translate.c  |   11 -
 target/tilegx/translate.c |7 
 target/tricore/translate.c|9 
 target/unicore32/translate.c  |   20 -
 target/xtensa/translate.c |   13 -
 translate-all.c   |2 
 translate-all_template.h  |  204 +
 27 files changed, 1141 insertions(+), 850 deletions(-)
 create mode 100644 include/exec/translate-all_template.h
 create mode 100644 translate-all_template.h


To: qemu-devel@nongnu.org
Cc: Paolo Bonzini 
Cc: Peter Crosthwaite 
Cc: Richard Henderson 
Cc: Alex Bennée

Re: [Qemu-devel] [PATCH v2 1/2] spapr: disable hotplugging without OS

2017-06-12 Thread David Gibson

On Thu, Jun 08, 2017 at 07:27:42PM +0200, Laurent Vivier wrote:
> If the OS is not started, QEMU sends an event to the OS
> that is lost and cannot be recovered. An unplug is not
> able to restore QEMU in a coherent state.
> So, while the OS is not started, disable CPU and memory hotplug.
> We guess the OS is started if the CAS has been negotiated.
> 
> Signed-off-by: Laurent Vivier 

It seems a pain to introduce a whole new (migrated) variable just to
check this.  Could we instead tweak the allocation of spapr->ov5_cas,
so it is NULL until CAS is completed?

> ---
>  hw/ppc/spapr.c | 51 
> +++---
>  hw/ppc/spapr_hcall.c   |  1 +
>  include/hw/ppc/spapr.h |  2 ++
>  3 files changed, 51 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 91b4057..4c979d5 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1308,6 +1308,7 @@ static void ppc_spapr_reset(void)
>  {
>  MachineState *machine = MACHINE(qdev_get_machine());
>  sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
>  PowerPCCPU *first_ppc_cpu;
>  uint32_t rtas_limit;
>  hwaddr rtas_addr, fdt_addr;
> @@ -1373,6 +1374,7 @@ static void ppc_spapr_reset(void)
>  first_ppc_cpu->env.nip = SPAPR_ENTRY_POINT;
>  
>  spapr->cas_reboot = false;
> +spapr->cas_completed = smc->cas_completed_default;

I also dislike the cas_completed_default thing.  If
cas_completed_default != false, it means that we will have
cas_completed == true when we have not, in fact, completed CAS.  I see
why you're doing this for migration compat, but that seems like a
recipe for confusion in the future.

>  }
>  
>  static void spapr_create_nvram(sPAPRMachineState *spapr)
> @@ -1528,6 +1530,27 @@ static const VMStateDescription 
> vmstate_spapr_patb_entry = {
>  },
>  };
>  
> +static bool spapr_cas_completed_needed(void *opaque)
> +{
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(opaque);
> +
> +/* we need to migrate cas_completed only if it is
> + * not set by default
> + */
> +return !smc->cas_completed_default;
> +}
> +
> +static const VMStateDescription vmstate_spapr_cas_completed = {
> +.name = "spapr_cas_completed",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = spapr_cas_completed_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_BOOL(cas_completed, sPAPRMachineState),
> +VMSTATE_END_OF_LIST()
> +},
> +};
> +
>  static const VMStateDescription vmstate_spapr = {
>  .name = "spapr",
>  .version_id = 3,
> @@ -1546,6 +1569,7 @@ static const VMStateDescription vmstate_spapr = {
>  .subsections = (const VMStateDescription*[]) {
>  _spapr_ov5_cas,
>  _spapr_patb_entry,
> +_spapr_cas_completed,
>  NULL
>  }
>  };
> @@ -2599,6 +2623,7 @@ out:
>  static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState 
> *dev,
>Error **errp)
>  {
> +sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
>  PCDIMMDevice *dimm = PC_DIMM(dev);
>  PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>  MemoryRegion *mr = ddc->get_memory_region(dimm);
> @@ -2617,6 +2642,14 @@ static void spapr_memory_pre_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
> "Use 'memory-backend-file' with correct mem-path.");
>  return;
>  }
> +if (dev->hotplugged) {
> +if (!runstate_check(RUN_STATE_PRELAUNCH) &&
> +!runstate_check(RUN_STATE_INMIGRATE) &&
> +!ms->cas_completed) {
> +error_setg(errp, "Memory hotplug not supported without OS");
> +return;
> +}
> +}
>  }
>  
>  struct sPAPRDIMMState {
> @@ -2915,6 +2948,7 @@ static void spapr_core_pre_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  Error **errp)
>  {
>  MachineState *machine = MACHINE(OBJECT(hotplug_dev));
> +sPAPRMachineState *ms = SPAPR_MACHINE(machine);
>  MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
>  Error *local_err = NULL;
>  CPUCore *cc = CPU_CORE(dev);
> @@ -2923,9 +2957,18 @@ static void spapr_core_pre_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  CPUArchId *core_slot;
>  int index;
>  
> -if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
> -error_setg(_err, "CPU hotplug not supported for this machine");
> -goto out;
> +if (dev->hotplugged) {
> +if (!mc->has_hotpluggable_cpus) {
> +error_setg(_err,
> +   "CPU hotplug not supported for this machine");
> +goto out;
> +}
> +if (!runstate_check(RUN_STATE_PRELAUNCH) &&
> +!runstate_check(RUN_STATE_INMIGRATE) &&
> +!ms->cas_completed) {
> +error_setg(_err, "CPU hotplug not supported

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 0/2] spapr: disable hotplugging without OS

2017-06-12 Thread David Gibson

On Thu, Jun 08, 2017 at 03:35:58PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 06/08/2017 02:27 PM, Laurent Vivier wrote:
> > If the OS is not started, QEMU sends an event to the OS
> > that is lost and cannot be recovered. An unplug is not
> > able to restore QEMU in a coherent state.
> > So, while the OS is not started, disable CPU and memory hotplug.
> > We guess the OS is started if the CAS has been negotiated.
> > 
> > This series also revert previous fix which was not really fixing
> > the hotplug problem when the OS is not running.
> > 
> > v2:
> > - fix indent
> > - remove useless local_err
> > - allow hotplug if the VM is not started (pre-launch or incoming state)
> > - remove vector 6, instead just mark the end of CAS negotiation
> > 
> > Laurent Vivier (2):
> >spapr: disable hotplugging without OS
> >Revert "spapr: fix memory hot-unplugging"
> > 
> >   hw/ppc/spapr.c | 51 
> > +++---
> >   hw/ppc/spapr_drc.c | 20 +++---
> >   hw/ppc/spapr_hcall.c   |  1 +
> >   include/hw/ppc/spapr.h |  2 ++
> >   include/hw/ppc/spapr_drc.h |  1 -
> >   5 files changed, 54 insertions(+), 21 deletions(-)
> > 
> Tested-by: Daniel Barboza 
> 
> This is curious. I was having a little look into hotplug/unplug and DRC
> states yesterday
> while taking a look at the latest David's cleanup. I was trying to find out
> a way to tune
> spapr_drc_needed() in a way that we don't accidentally migrate more DRCs
> than necessary
> and at the same time handle that scenario of libvirt migration after hotplug
> (libvirt hotplugs
> the device on target instead of adding it in the command line).
> 
> I would like to migrate the DRCs if and only if:
> 
> 1 - a device was hotplugged
> 2 - a device is undergoing hotplug/unplug
> 
> (2) is easy and is already taken care of. But (1) is tricky because, before
> this patch, if a migration
> occurs *before* CAS we don't need to migrate the DRC - the object will go
> through the state
> changes after the migration. With this series this scenario is not going to
> happen, then
> we can happily add a
> 
> if (dev->hotplugged) return true;
> 
> in spapr_drc_needed and fix the libvirt migration scenario again.

As mentioned in another thread, I'm looking at explicitly resetting
all DRC states during the CAS process.  I think that will simplify
these cases.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PULL 0/1] Tracing patches

2017-06-12 Thread Peter Maydell

On 7 June 2017 at 19:55, Stefan Hajnoczi  wrote:
> The following changes since commit 0db1851becbefe3e50cfc03776fb1f75817376af:
>
>   Merge remote-tracking branch 
> 'remotes/vivier/tags/m68k-for-2.10-pull-request' into staging (2017-06-07 
> 11:56:00 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/tracing-pull-request
>
> for you to fetch changes up to 249e9f792c4c6e52058570e83b550ec8310f621e:
>
>   simpletrace: Improve the error message if event is not declared (2017-06-07 
> 14:34:19 +0100)
>
> 
>
> 
>
> Jose Ricardo Ziviani (1):
>   simpletrace: Improve the error message if event is not declared
>
>  scripts/simpletrace.py | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] qemu-nbd: Ignore SIGPIPE

2017-06-12 Thread Stefan Hajnoczi

On Sun, Jun 11, 2017 at 02:37:14PM +0200, Max Reitz wrote:
> qemu proper has done so for 13 years
> (8a7ddc38a60648257dc0645ab4a05b33d6040063), qemu-img and qemu-io have
> done so for four years (526eda14a68d5b3596be715505289b541288ef2a).
> Ignoring this signal is especially important in qemu-nbd because
> otherwise a client can easily take down the qemu-nbd server by dropping
> the connection when the server wants to send something, for example:
> 
> $ qemu-nbd -x foo -f raw -t null-co:// &
> [1] 12726
> $ qemu-io -c quit nbd://localhost/bar
> can't open device nbd://localhost/bar: No export with name 'bar' available
> [1]  + 12726 broken pipe  qemu-nbd -x foo -f raw -t null-co://
> 
> In this case, the client sends an NBD_OPT_ABORT and closes the
> connection (because it is not required to wait for a reply), but the
> server replies with an NBD_REP_ACK (because it is required to reply).
> 
> Signed-off-by: Max Reitz 
> ---
> I tried to find some other reproducer instead of using a qemu client
> (e.g. nping -c 1 --tcp-connect localhost -p 10809, which gives the same
> pattern of PSH-ACK,  only results in the write being successful and the next read failing;
> interestingly, sendmsg()'s man page states that EPIPE is only returned
> when the local end is closed. However, I have not found the NBD server
> to close that local end anywhere.
> 
> In any case, ignoring SIGPIPE is the right thing to do. We get an EPIPE
> anyway, and this is fully sufficient to let us know that the connection
> is dead.
> ---
>  qemu-nbd.c | 4 
>  1 file changed, 4 insertions(+)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v3 5/5] spapr: fix migration of ICPState objects from/to older QEMU

2017-06-12 Thread David Gibson

On Thu, Jun 08, 2017 at 11:54:10AM +0200, Greg Kurz wrote:
> On Thu, 8 Jun 2017 14:08:57 +1000
> David Gibson  wrote:
> 
> > On Wed, Jun 07, 2017 at 07:17:26PM +0200, Greg Kurz wrote:
> > > Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
> > > sPAPRCPUCore") moved ICPState objects from the machine to CPU cores.
> > > This is an improvement since we no longer allocate ICPState objects
> > > that will never be used. But it has the side-effect of breaking
> > > migration of older machine types from older QEMU versions.
> > > 
> > > This patch allows spapr to register dummy "icp/server" entries to vmstate.
> > > These entries use a dedicated VMStateDescription that can swallow and
> > > discard state of an incoming migration stream, and that don't send 
> > > anything
> > > on outgoing migration.
> > > 
> > > As for real ICPState objects, the instance_id is the cpu_index of the
> > > corresponding vCPU, which happens to be equal to the generated instance_id
> > > of older machine types.
> > > 
> > > The machine can unregister/register these entries when CPUs are 
> > > dynamically
> > > plugged/unplugged.
> > > 
> > > This is only available for pseries-2.9 and older machines, thanks to a
> > > compat property.
> > > 
> > > Signed-off-by: Greg Kurz 
> > > ---
> > > v3: - new logic entirely implemented in hw/ppc/spapr.c
> > > ---
> > >  hw/ppc/spapr.c |   88 
> > > +++-
> > >  include/hw/ppc/spapr.h |2 +
> > >  2 files changed, 88 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 9b7ae28939a8..c15b604978f0 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -124,9 +124,52 @@ error:
> > >  return NULL;
> > >  }
> > >  
> > > +static bool pre_2_10_vmstate_dummy_icp_needed(void *opaque)
> > > +{
> > > +return false;
> > > +}  
> > 
> > Uh.. the needed function always returns false, how can that work?
> > 
> 
> The needed function is used for outgoing migration only:
> 
> bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque)
> {
> if (vmsd->needed && !vmsd->needed(opaque)) {
> /* optional section not needed */
> return false;
> }
> return true;
> }
> 
> The idea is that all ICPState objects that were created but not associated
> to a vCPU by pre-2.10 machine types don't need to be migrated at all, as
> their state hasn't changed.
> 
> We don't even create these unneeded ICPState objects here, but simply
> register their ids to vmstate.
> 
> > > +
> > > +static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> > > +.name = "icp/server",
> > > +.version_id = 1,
> > > +.minimum_version_id = 1,
> > > +.needed = pre_2_10_vmstate_dummy_icp_needed,
> 
> Outgoing migration:
> - machine in older QEMU have unused ICPState objects (default state)
> - machine in QEMU 2.10 doesn't even have extra ICPState objects
> 
> => don't send anything
> 
> > > +.fields = (VMStateField[]) {
> > > +VMSTATE_UNUSED(4), /* uint32_t xirr */
> > > +VMSTATE_UNUSED(1), /* uint8_t pending_priority */
> > > +VMSTATE_UNUSED(1), /* uint8_t mfrr */
> 
> Incoming migration from older QEMU: we don't have the extra ICPState objects.
> 
> => accept the state and discard it
> 
> > > +VMSTATE_END_OF_LIST()
> > > +},
> > > +};
> > > +
> > > +static void pre_2_10_vmstate_register_dummy_icp(sPAPRMachineState 
> > > *spapr, int i)
> > > +{
> > > +bool *flag = >pre_2_10_ignore_icp[i];
> > > +
> > > +g_assert(!*flag);  
> > 
> > Apart from this assert(), you never seem to test the values in the
> > pre_2_10_ignore_icp() array, so it seems a bit pointless.
> > 
> 
> There's the opposite check in pre_2_10_vmstate_unregister_dummy_icp().
> But I agree it isn't really useful... but more because of paranoia :)

I'm all for paranoid assert()s if they can be made using data readily
to hand.  Adding a data structure just for the purpose of making an
assert() later, not so much.

> > > +vmstate_register(NULL, i, _2_10_vmstate_dummy_icp, flag);
> > > +*flag = true;
> > > +}
> > > +
> > > +static void pre_2_10_vmstate_unregister_dummy_icp(sPAPRMachineState 
> > > *spapr,
> > > +  int i)
> > > +{
> > > +bool *flag = >pre_2_10_ignore_icp[i];
> > > +
> > > +g_assert(*flag);
> > > +vmstate_unregister(NULL, _2_10_vmstate_dummy_icp, flag);
> > > +*flag = false;
> > > +}
> > > +
> > > +static inline int xics_nr_servers(void)  
> > 
> > Maybe a different name to emphasise that this is only used for the
> > backwards compat logic.
> > 
> 
> It is also used to compute the "ibm,interrupt-server-ranges" DT prop.
> 
> /* /interrupt controller */
> spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);

Ah, good point.  Maybe rename to "max server number" or something,
since "nr_servers" isn't really

Re: [Qemu-devel] [PATCH v3] block: change variable names in BlockDriverState

2017-06-12 Thread Stefan Hajnoczi

On Fri, Jun 09, 2017 at 01:18:08PM +0300, Manos Pitsidianakis wrote:
> Change the 'int count' parameter in *pwrite_zeros, *pdiscard related
> functions (and some others) to 'int bytes', as they both refer to bytes.
> This helps with code legibility.
> 
> Signed-off-by: Manos Pitsidianakis 
> ---

Please include a changelog beneath '---' in future patches.  For
example:

v3:
 * Drop ide change which operated on sectors, not bytes [Stefan]

>  block/blkdebug.c   | 36 +++
>  block/blkreplay.c  |  8 +++
>  block/block-backend.c  | 22 +--
>  block/file-posix.c | 34 +++---
>  block/io.c | 48 
> +-
>  block/iscsi.c  | 20 +-
>  block/mirror.c |  8 +++
>  block/nbd-client.c |  8 +++
>  block/nbd-client.h |  4 ++--
>  block/qcow2.c  | 28 
>  block/qed.c|  8 +++
>  block/raw-format.c |  8 +++
>  block/rbd.c|  4 ++--
>  block/sheepdog.c   |  6 +++---
>  include/block/block.h  |  8 +++
>  include/block/block_int.h  |  6 +++---
>  include/sysemu/block-backend.h | 20 +-
>  qemu-io-cmds.c | 46 
>  18 files changed, 161 insertions(+), 161 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v4 5/6] xics: directly register ICPState objects to vmstate

2017-06-12 Thread David Gibson

On Thu, Jun 08, 2017 at 03:43:18PM +0200, Greg Kurz wrote:
> The ICPState objects are currently registered to vmstate as qdev objects.
> Their instance ids are hence computed automatically in the migration code,
> and thus depends on the order the CPU cores were plugged.
> 
> If the destination had its CPU cores plugged in a different order than the
> source, then ICPState objects will have different instance_ids and load
> the wrong state.
> 
> Since CPU objects have a reliable cpu_index which is already used as
> instance_id in vmstate, let's use it for ICPState as well.
> 
> Signed-off-by: Greg Kurz 

This is certainly an improvement.  You answered my query on the
previous version as to why this doesn't break migration, but that
information should go into the commit message.

So, ideally, we would use the XICS "server number" as the migration
key.  That's an architected part of the XICs state, since those values
are entered explicitly into the ICS.  We have a way to go from server
number to ICP at the moment, but not the reverse, but we can fix that.

Unfortunately I think those won't always match existing automatically
generated IDs, which makes things harder.

> ---
>  hw/intc/xics.c |5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 7ccfb53c55a0..faa5c631f655 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -344,10 +344,14 @@ static void icp_realize(DeviceState *dev, Error **errp)
>  }
>  
>  qemu_register_reset(icp_reset, dev);
> +vmstate_register(NULL, icp->cs->cpu_index, _icp_server, icp);
>  }
>  
>  static void icp_unrealize(DeviceState *dev, Error **errp)
>  {
> +ICPState *icp = ICP(dev);
> +
> +vmstate_unregister(NULL, _icp_server, icp);
>  qemu_unregister_reset(icp_reset, dev);
>  }
>  
> @@ -355,7 +359,6 @@ static void icp_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  
> -dc->vmsd = _icp_server;
>  dc->realize = icp_realize;
>  dc->unrealize = icp_unrealize;
>  }
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v4 6/6] spapr: fix migration of ICPState objects from/to older QEMU

2017-06-12 Thread David Gibson

On Thu, Jun 08, 2017 at 03:43:27PM +0200, Greg Kurz wrote:
> Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
> sPAPRCPUCore") moved ICPState objects from the machine to CPU cores.
> This is an improvement since we no longer allocate ICPState objects
> that will never be used. But it has the side-effect of breaking
> migration of older machine types from older QEMU versions.
> 
> This patch allows spapr to register dummy "icp/server" entries to vmstate.
> These entries use a dedicated VMStateDescription that can swallow and
> discard state of an incoming migration stream, and that don't send anything
> on outgoing migration.
> 
> As for real ICPState objects, the instance_id is the cpu_index of the
> corresponding vCPU, which happens to be equal to the generated instance_id
> of older machine types.
> 
> The machine can unregister/register these entries when CPUs are dynamically
> plugged/unplugged.
> 
> This is only available for pseries-2.9 and older machines, thanks to a
> compat property.
> 
> Signed-off-by: Greg Kurz 
> ---
> v4: - dropped paranoid g_assert()s
> ---
>  hw/ppc/spapr.c |   86 
> +++-
>  include/hw/ppc/spapr.h |2 +
>  2 files changed, 86 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b2951d7618d6..1379986c0c7b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -125,9 +125,50 @@ error:
>  return NULL;
>  }
>  
> +static bool pre_2_10_vmstate_dummy_icp_needed(void *opaque)
> +{
> +return false;

Comment here with why always false makes sense here would be useful
for the future.

> +}
> +
> +static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> +.name = "icp/server",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = pre_2_10_vmstate_dummy_icp_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_UNUSED(4), /* uint32_t xirr */
> +VMSTATE_UNUSED(1), /* uint8_t pending_priority */
> +VMSTATE_UNUSED(1), /* uint8_t mfrr */
> +VMSTATE_END_OF_LIST()
> +},
> +};
> +
> +static void pre_2_10_vmstate_register_dummy_icp(sPAPRMachineState *spapr, 
> int i)
> +{
> +bool *flag = >pre_2_10_ignore_icp[i];

AFAICT you now set, but never check the flags in the ignore_icp array,
so you should be able to get rid of it.

> +vmstate_register(NULL, i, _2_10_vmstate_dummy_icp, flag);
> +*flag = true;
> +}
> +
> +static void pre_2_10_vmstate_unregister_dummy_icp(sPAPRMachineState *spapr,
> +  int i)
> +{
> +bool *flag = >pre_2_10_ignore_icp[i];
> +
> +vmstate_unregister(NULL, _2_10_vmstate_dummy_icp, flag);
> +*flag = false;
> +}
> +
> +static inline int xics_nr_servers(void)
> +{
> +return DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);
> +}
> +
>  static void xics_system_init(MachineState *machine, int nr_irqs, Error 
> **errp)
>  {
>  sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
>  
>  if (kvm_enabled()) {
>  if (machine_kernel_irqchip_allowed(machine) &&
> @@ -149,6 +190,15 @@ static void xics_system_init(MachineState *machine, int 
> nr_irqs, Error **errp)
>  return;
>  }
>  }
> +
> +if (smc->pre_2_10_has_unused_icps) {
> +int i;
> +
> +spapr->pre_2_10_ignore_icp = g_malloc(xics_nr_servers());
> +for (i = 0; i < xics_nr_servers(); i++) {
> +pre_2_10_vmstate_register_dummy_icp(spapr, i);

A comment around here explaining that the dummy entries get
deregistered when real ICPs are registered would also be helpful.

> +}
> +}
>  }
>  
>  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
> @@ -977,7 +1027,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>  void *fdt;
>  sPAPRPHBState *phb;
>  char *buf;
> -int smt = kvmppc_smt_threads();
>  
>  fdt = g_malloc0(FDT_MAX_SIZE);
>  _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> @@ -1017,7 +1066,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>  _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>  /* /interrupt controller */
> -spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, 
> PHANDLE_XICP);
> +spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);
>  
>  ret = spapr_populate_memory(spapr, fdt);
>  if (ret < 0) {
> @@ -2803,9 +2852,24 @@ static void spapr_core_unplug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>Error **errp)
>  {
>  MachineState *ms = MACHINE(qdev_get_machine());
> +sPAPRMachineState *spapr = SPAPR_MACHINE(ms);
>  CPUCore *cc = CPU_CORE(dev);
>  CPUArchId *core_slot = spapr_find_cpu_slot(ms, cc->core_id, NULL);
>  
> +if (spapr->pre_2_10_ignore_icp) {
> +sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> +

[Qemu-devel] [PATCH v4 3/4] net/socket: Convert error report message to Error

2017-06-12 Thread Mao Zhongyi

Currently, net_socket_mcast_create(), net_socket_fd_init_dgram() and
net_socket_fd_init() use the function such as fprintf(), perror() to
report an error message.

Now, convert these functions to Error.

Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Cc: berra...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 net/socket.c | 57 ++---
 1 file changed, 34 insertions(+), 23 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 66016c8..9825422 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -209,7 +209,9 @@ static void net_socket_send_dgram(void *opaque)
 }
 }
 
-static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, struct 
in_addr *localaddr)
+static int net_socket_mcast_create(struct sockaddr_in *mcastaddr,
+   struct in_addr *localaddr,
+   Error **errp)
 {
 struct ip_mreq imr;
 int fd;
@@ -221,8 +223,8 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 #endif
 
 if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) {
-fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) "
-"does not contain a multicast address\n",
+error_setg(errp, "qemu: error: specified mcastaddr %s (0x%08x) "
+"does not contain a multicast address",
 inet_ntoa(mcastaddr->sin_addr),
 (int)ntohl(mcastaddr->sin_addr.s_addr));
 return -1;
@@ -230,7 +232,8 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 }
 fd = qemu_socket(PF_INET, SOCK_DGRAM, 0);
 if (fd < 0) {
-perror("socket(PF_INET, SOCK_DGRAM)");
+error_setg_errno(errp, errno, "Create socket(PF_INET, SOCK_DGRAM)"
+ " failed");
 return -1;
 }
 
@@ -242,13 +245,15 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 val = 1;
 ret = qemu_setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, , sizeof(val));
 if (ret < 0) {
-perror("setsockopt(SOL_SOCKET, SO_REUSEADDR)");
+error_setg_errno(errp, errno, "Set the socket fd=%d attribute"
+ " SO_REUSEADDR failed", fd);
 goto fail;
 }
 
 ret = bind(fd, (struct sockaddr *)mcastaddr, sizeof(*mcastaddr));
 if (ret < 0) {
-perror("bind");
+error_setg_errno(errp, errno, "Bind ip=%s to fd=%d failed",
+ inet_ntoa(mcastaddr->sin_addr), fd);
 goto fail;
 }
 
@@ -263,7 +268,8 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 ret = qemu_setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
   , sizeof(struct ip_mreq));
 if (ret < 0) {
-perror("setsockopt(IP_ADD_MEMBERSHIP)");
+error_setg_errno(errp, errno, "Add socket fd=%d to multicast group %s"
+" failed", fd, inet_ntoa(imr.imr_multiaddr));
 goto fail;
 }
 
@@ -272,7 +278,8 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 ret = qemu_setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP,
   , sizeof(loop));
 if (ret < 0) {
-perror("setsockopt(SOL_IP, IP_MULTICAST_LOOP)");
+error_setg_errno(errp, errno, "Force multicast message to loopback"
+ " failed");
 goto fail;
 }
 
@@ -281,7 +288,8 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr, struct in_addr
 ret = qemu_setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,
   localaddr, sizeof(*localaddr));
 if (ret < 0) {
-perror("setsockopt(IP_MULTICAST_IF)");
+error_setg_errno(errp, errno, "Set the default network send "
+ "interfaec failed");
 goto fail;
 }
 }
@@ -320,7 +328,8 @@ static NetClientInfo net_dgram_socket_info = {
 static NetSocketState *net_socket_fd_init_dgram(NetClientState *peer,
 const char *model,
 const char *name,
-int fd, int is_connected)
+int fd, int is_connected,
+Error **errp)
 {
 struct sockaddr_in saddr;
 int newfd;
@@ -342,7 +351,7 @@ static NetSocketState 
*net_socket_fd_init_dgram(NetClientState *peer,
 goto err;
 }
 /* clone dgram socket */
-newfd = net_socket_mcast_create(, NULL);
+newfd = net_socket_mcast_create(, NULL, errp);
 if (newfd < 0) {
 /* error already reported by net_socket_mcast_create() */
 goto err;
@@ -352,8 +361,8 @@ static NetSocketState

[Qemu-devel] [PATCH v4 2/4] net/net: Convert parse_host_port() to Error

2017-06-12 Thread Mao Zhongyi

Cc: berra...@redhat.com
Cc: kra...@redhat.com
Cc: pbonz...@redhat.com
Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 include/qemu/sockets.h |  2 +-
 net/net.c  | 21 -
 net/socket.c   | 37 ++---
 3 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h
index 5c326db..7abffc4 100644
--- a/include/qemu/sockets.h
+++ b/include/qemu/sockets.h
@@ -53,7 +53,7 @@ void socket_listen_cleanup(int fd, Error **errp);
 int socket_dgram(SocketAddress *remote, SocketAddress *local, Error **errp);
 
 /* Old, ipv4 only bits.  Don't use for new code. */
-int parse_host_port(struct sockaddr_in *saddr, const char *str);
+int parse_host_port(struct sockaddr_in *saddr, const char *str, Error **errp);
 int socket_init(void);
 
 /**
diff --git a/net/net.c b/net/net.c
index 6235aab..e55869a 100644
--- a/net/net.c
+++ b/net/net.c
@@ -100,7 +100,7 @@ static int get_str_sep(char *buf, int buf_size, const char 
**pp, int sep)
 return 0;
 }
 
-int parse_host_port(struct sockaddr_in *saddr, const char *str)
+int parse_host_port(struct sockaddr_in *saddr, const char *str, Error **errp)
 {
 char buf[512];
 struct hostent *he;
@@ -108,24 +108,35 @@ int parse_host_port(struct sockaddr_in *saddr, const char 
*str)
 int port;
 
 p = str;
-if (get_str_sep(buf, sizeof(buf), , ':') < 0)
+if (get_str_sep(buf, sizeof(buf), , ':') < 0) {
+error_setg(errp, "The address should contain ':', for example: "
+   "=230.0.0.1:1234");
 return -1;
+}
 saddr->sin_family = AF_INET;
 if (buf[0] == '\0') {
 saddr->sin_addr.s_addr = 0;
 } else {
 if (qemu_isdigit(buf[0])) {
-if (!inet_aton(buf, >sin_addr))
+if (!inet_aton(buf, >sin_addr)) {
+error_setg(errp, "Host address '%s' is not a valid "
+   "IPv4 address", buf);
 return -1;
+}
 } else {
-if ((he = gethostbyname(buf)) == NULL)
+he = gethostbyname(buf);
+if (he == NULL) {
+error_setg(errp, "Specified hostname is error.");
 return - 1;
+}
 saddr->sin_addr = *(struct in_addr *)he->h_addr;
 }
 }
 port = strtol(p, (char **), 0);
-if (r == p)
+if (r == p) {
+error_setg(errp, "The port number is illegal");
 return -1;
+}
 saddr->sin_port = htons(port);
 return 0;
 }
diff --git a/net/socket.c b/net/socket.c
index 53765bd..66016c8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -485,15 +485,17 @@ static void net_socket_accept(void *opaque)
 static int net_socket_listen_init(NetClientState *peer,
   const char *model,
   const char *name,
-  const char *host_str)
+  const char *host_str,
+  Error **errp)
 {
 NetClientState *nc;
 NetSocketState *s;
 struct sockaddr_in saddr;
 int fd, ret;
 
-if (parse_host_port(, host_str) < 0)
+if (parse_host_port(, host_str, errp) < 0) {
 return -1;
+}
 
 fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
 if (fd < 0) {
@@ -531,14 +533,16 @@ static int net_socket_listen_init(NetClientState *peer,
 static int net_socket_connect_init(NetClientState *peer,
const char *model,
const char *name,
-   const char *host_str)
+   const char *host_str,
+   Error **errp)
 {
 NetSocketState *s;
 int fd, connected, ret;
 struct sockaddr_in saddr;
 
-if (parse_host_port(, host_str) < 0)
+if (parse_host_port(, host_str, errp) < 0) {
 return -1;
+}
 
 fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
 if (fd < 0) {
@@ -580,14 +584,15 @@ static int net_socket_mcast_init(NetClientState *peer,
  const char *model,
  const char *name,
  const char *host_str,
- const char *localaddr_str)
+ const char *localaddr_str,
+ Error **errp)
 {
 NetSocketState *s;
 int fd;
 struct sockaddr_in saddr;
 struct in_addr localaddr, *param_localaddr;
 
-if (parse_host_port(, host_str) < 0)
+if (parse_host_port(, host_str, errp) < 0)
 return -1;
 
 if (localaddr_str != NULL) {
@@ -619,17 +624,18 @@ static int net_socket_udp_init(NetClientState *peer,
  const char *model,
  const char *name,
  const

[Qemu-devel] [PATCH v4 4/4] net/socket: Improve -net socket error reporting

2017-06-12 Thread Mao Zhongyi

When -net socket fails, it first reports a specific error, then
a generic one, like this:

$ qemu-system-x86_64 -net socket,
qemu-system-x86_64: -net socket: exactly one of fd=, listen=, connect=, 
mcast= or udp= is required
qemu-system-x86_64: -net socket: Device 'socket' could not be initialized

Convert net_init_socket() to Error to get rid of the superfluous second
error message. After the patch, the effect like this:

$ qemu-system-x86_64 -net socket,
qemu-system-x86_64: -net socket: exactly one of fd=, listen=, connect=, 
mcast= or udp= is required

At the same time, add many explicit error handling message when it fails.

Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 net/socket.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 9825422..a8f2d7f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -510,7 +510,8 @@ static int net_socket_listen_init(NetClientState *peer,
 
 fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
 if (fd < 0) {
-perror("socket");
+error_setg_errno(errp, errno, "Create socket(PF_INET,"
+ " SOCK_STREAM) failed");
 return -1;
 }
 qemu_set_nonblock(fd);
@@ -519,13 +520,14 @@ static int net_socket_listen_init(NetClientState *peer,
 
 ret = bind(fd, (struct sockaddr *), sizeof(saddr));
 if (ret < 0) {
-perror("bind");
+error_setg_errno(errp, errno, "Bind ip=%s to fd=%d failed",
+ inet_ntoa(saddr.sin_addr), fd);
 closesocket(fd);
 return -1;
 }
 ret = listen(fd, 0);
 if (ret < 0) {
-perror("listen");
+error_setg_errno(errp, errno, "Listen socket fd=%d failed", fd);
 closesocket(fd);
 return -1;
 }
@@ -557,7 +559,8 @@ static int net_socket_connect_init(NetClientState *peer,
 
 fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
 if (fd < 0) {
-perror("socket");
+error_setg_errno(errp, errno, "Create socket(PF_INET,"
+ " SOCK_STREAM) failed");
 return -1;
 }
 qemu_set_nonblock(fd);
@@ -573,7 +576,7 @@ static int net_socket_connect_init(NetClientState *peer,
errno == EINVAL) {
 break;
 } else {
-perror("connect");
+error_setg_errno(errp, errno, "Connection failed");
 closesocket(fd);
 return -1;
 }
@@ -607,8 +610,11 @@ static int net_socket_mcast_init(NetClientState *peer,
 return -1;
 
 if (localaddr_str != NULL) {
-if (inet_aton(localaddr_str, ) == 0)
+if (inet_aton(localaddr_str, ) == 0) {
+error_setg(errp, "localaddr '%s' is not a valid IPv4 address",
+   localaddr_str);
 return -1;
+}
 param_localaddr = 
 } else {
 param_localaddr = NULL;
@@ -652,18 +658,22 @@ static int net_socket_udp_init(NetClientState *peer,
 
 fd = qemu_socket(PF_INET, SOCK_DGRAM, 0);
 if (fd < 0) {
-perror("socket(PF_INET, SOCK_DGRAM)");
+error_setg_errno(errp, errno, "Create socket(PF_INET,"
+ " SOCK_DGRAM) failed");
 return -1;
 }
 
 ret = socket_set_fast_reuse(fd);
 if (ret < 0) {
+error_setg_errno(errp, errno, "Set the socket fd=%d attribute"
+   " SO_REUSEADDR failed", fd);
 closesocket(fd);
 return -1;
 }
 ret = bind(fd, (struct sockaddr *), sizeof(laddr));
 if (ret < 0) {
-perror("bind");
+error_setg_errno(errp, errno, "Bind ip=%s to fd=%d failed",
+ inet_ntoa(laddr.sin_addr), fd);
 closesocket(fd);
 return -1;
 }
@@ -685,8 +695,6 @@ static int net_socket_udp_init(NetClientState *peer,
 int net_init_socket(const Netdev *netdev, const char *name,
 NetClientState *peer, Error **errp)
 {
-/* FIXME error_setg(errp, ...) on failure */
-Error *err = NULL;
 const NetdevSocketOptions *sock;
 
 assert(netdev->type == NET_CLIENT_DRIVER_SOCKET);
@@ -694,22 +702,21 @@ int net_init_socket(const Netdev *netdev, const char 
*name,
 
 if (sock->has_fd + sock->has_listen + sock->has_connect + sock->has_mcast +
 sock->has_udp != 1) {
-error_report("exactly one of fd=, listen=, connect=, mcast= or udp="
+error_setg(errp, "exactly one of fd=, listen=, connect=, mcast= or 
udp="
  " is required");
 return -1;
 }
 
 if (sock->has_localaddr && !sock->has_mcast && !sock->has_udp) {
-error_report("localaddr= is only valid with mcast= or udp=");
+error_setg(errp, "localaddr= is only valid with mcast= or udp=");
 return -1;
 }
 
 if (sock->has_fd) {
 int fd;
 
-fd = monitor_fd_param(cur_mon,

Re: [Qemu-devel] Fwd: [BUG] Failed to compile using gcc7.1

2017-06-12 Thread Max Reitz

On 2017-06-12 05:19, Philippe Mathieu-Daudé wrote:
> Hi Tsung-en,
> 
> On 06/11/2017 04:08 PM, Tsung-en Hsiao wrote:
>> Hi all,
>> I encountered the same problem on gcc 7.1.1 and found Qu's mail in
>> this list from google search.
>>
>> Temporarily fix it by specifying the string length in snprintf
>> directive. Hope this is helpful to other people encountered the same
>> problem.
> 
> Thank your for sharing this.
> 
>>
>> @@ -1,9 +1,7 @@
>> ---
>> --- a/block/blkdebug.c
>> - "blkdebug:%s:%s", s->config_file ?: "",
>> --- a/block/blkverify.c
>> - "blkverify:%s:%s",
>> --- a/hw/usb/bus.c
>> -snprintf(downstream->path, sizeof(downstream->path), "%s.%d",
>> -snprintf(downstream->path, sizeof(downstream->path), "%d",
>> portnr);
>> -- 
>> +++ b/block/blkdebug.c
>> + "blkdebug:%.2037s:%.2037s", s->config_file ?: "",
> 
> It is a rather funny way to silent this warning :) Truncating the
> filename until it fits.
> 
> However I don't think it is the correct way since there is indeed an
> overflow of bs->exact_filename.
> 
> Apparently exact_filename from "block/block_int.h" is defined to hold a
> pathname:
> char exact_filename[PATH_MAX];
> 
> but is used for more than that (for example in blkdebug.c it might use
> until 10+2*PATH_MAX chars).

In any case, truncating the filenames will do just as much as truncating
the result: You'll get an unusable filename.

> I suppose it started as a buffer to hold a pathname then more block
> drivers were added and this buffer ended used differently.
> 
> If it is a multi-purpose buffer one safer option might be to declare it
> as a GString* and use g_string_printf().

What it is supposed to be now is just an information string we can print
to the user, because strings are nicer than JSON objects. There are some
commands that take a filename for identifying a block node, but I dream
we can get rid of them in 3.0...

The right solution is to remove it altogether and have a
"char *bdrv_filename(BlockDriverState *bs)" function (which generates
the filename every time it's called). I've been working on this for some
years now, actually, but it was never pressing enough to get it finished
(so I never had enough time).

What we can do in the meantime is to not generate a plain filename if it
won't fit into bs->exact_filename.

(The easiest way to do this probably would be to truncate
bs->exact_filename back to an empty string if snprintf() returns a value
greater than or equal to the length of bs->exact_filename.)

What to do about hw/usb/bus.c I don't know (I guess the best solution
would be to ignore the warning, but I don't suppose that is going to work).

Max

> 
> I CC'ed the block folks to have their feedback.
> 
> Regards,
> 
> Phil.
> 
>> +++ b/block/blkverify.c
>> + "blkverify:%.2038s:%.2038s",
>> +++ b/hw/usb/bus.c
>> +snprintf(downstream->path, sizeof(downstream->path), "%.12s.%d",
>> +snprintf(downstream->path, sizeof(downstream->path), "%.12d",
>> portnr);
>>
>> Tsung-en Hsiao
>>
>>> Qu Wenruo Wrote:
>>>
>>> Hi all,
>>>
>>> After upgrading gcc from 6.3.1 to 7.1.1, qemu can't be compiled with
>>> gcc.
>>>
>>> The error is:
>>>
>>> --
>>>  CC  block/blkdebug.o
>>> block/blkdebug.c: In function 'blkdebug_refresh_filename':
>>>
>>> block/blkdebug.c:693:31: error: '%s' directive output may be
>>> truncated writing up to 4095 bytes into a region of size 4086
>>> [-Werror=format-truncation=]
>>>
>>>  "blkdebug:%s:%s", s->config_file ?: "",
>>>   ^~
>>> In file included from /usr/include/stdio.h:939:0,
>>> from /home/adam/qemu/include/qemu/osdep.h:68,
>>> from block/blkdebug.c:25:
>>>
>>> /usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk'
>>> output 11 or more bytes (assuming 4106) into a destination of size 4096
>>>
>>>   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
>>>  ^~~~
>>>__bos (__s), __fmt, __va_arg_pack ());
>>>~
>>> cc1: all warnings being treated as errors
>>> make: *** [/home/adam/qemu/rules.mak:69: block/blkdebug.o] Error 1
>>> --
>>>
>>> It seems that gcc 7 is introducing more restrict check for printf.
>>>
>>> If using clang, although there are some extra warning, it can at
>>> least pass the compile.
>>>
>>> Thanks,
>>> Qu
>>




signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v4 0/4] Improve error reporting

2017-06-12 Thread Mao Zhongyi

Daniel's patch(commit 6701e551, Revert "Change net/socket.c to use socket_*() 
functions" again) has been in upstream. Continue this patchset.

In v2, this series convert the non-blocking connect mechanism to QIOchannel 
by replace the socket_connect(), and some errors also are converted to Error. 

The v2 related discussion was listed on:
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg04886.html

v4: 
* PATCH 01 is redoing previous patch 1, replace the fprintf() with 
error_report()
 in the 'default' case of net_socket_fd_init() [Markus 
Armbruster]

v3:
* PATCH 01 is suggested by Markus and Daniel that removes the dubious 'default' 
case
   in the net_socket_fd_init(). Jason agreed.
* PATCH 02 is redoing previous patch 4.
* PATCH 04 is redoing previous patch 2, improves sort of error messages. 

v2:
* PATCH 02 reworking of patch 2 following Markus's suggestion that convert 
error_report()
   in the function called by net_socket_*_init() to Error. Also add 
many error 
   handling information.
* PATCH 03 net_socket_mcast_create(), net_socket_fd_init_dgram() and 
net_socket_fd_init() 
   use the function such as fprintf, perror to report an error message. 
Convert it 
   to Error.
* PATCH 04 parse_host_port() may fail without reporting an error. Now, fix it 
to set an
   error when it fails.


Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Cc: berra...@redhat.com
Cc: kra...@redhat.com
Cc: pbonz...@redhat.com

Mao Zhongyi (4):
  net/socket: Drop the odd 'default' case and comment
  net/net: Convert parse_host_port() to Error
  net/socket: Convert error report message to Error
  net/socket: Improve -net socket error reporting

 include/qemu/sockets.h |   2 +-
 net/net.c  |  21 ++--
 net/socket.c   | 133 +
 3 files changed, 96 insertions(+), 60 deletions(-)

-- 
2.9.3

[Qemu-devel] [PATCH v4 1/4] net/socket: Drop the odd 'default' case and comment

2017-06-12 Thread Mao Zhongyi

In the net_socket_fd_init(), the 'default' case and comment is odd.
If @fd really was a pty, getsockopt() would fail with ENOTSOCK. If
@fd was a socket, but neither SOCK_DGRAM nor SOCK_STREAM. It should
not be treated as if it was SOCK_STREAM.

If there is a genuine reason to support something like SOCK_RAW, it
should be explicitly handled.

So, drop the 'default' case since it is broken already.

Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Cc: berra...@redhat.com
Cc: arm...@redhat.com
Suggested-by: Markus Armbruster 
Suggested-by: Daniel P. Berrange 
Signed-off-by: Mao Zhongyi 
---
 net/socket.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index dcae1ae..53765bd 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -449,9 +449,9 @@ static NetSocketState *net_socket_fd_init(NetClientState 
*peer,
 case SOCK_STREAM:
 return net_socket_fd_init_stream(peer, model, name, fd, is_connected);
 default:
-/* who knows ... this could be a eg. a pty, do warn and continue as 
stream */
-fprintf(stderr, "qemu: warning: socket type=%d for fd=%d is not 
SOCK_DGRAM or SOCK_STREAM\n", so_type, fd);
-return net_socket_fd_init_stream(peer, model, name, fd, is_connected);
+error_report("qemu: error: socket type=%d for fd=%d is not"
+" SOCK_DGRAM or SOCK_STREAM", so_type, fd);
+closesocket(fd);
 }
 return NULL;
 }
-- 
2.9.3

Re: [Qemu-devel] [PATCH v11 4/6] mm: function to offer a page block on the free list

2017-06-12 Thread Dave Hansen

Please stop cc'ing me on things also sent to closed mailing lists
(virtio-...@lists.oasis-open.org).  I'm happy to review things on open
lists, but I'm not fond of the closed lists bouncing things at me.

On 06/09/2017 03:41 AM, Wei Wang wrote:
> Add a function to find a page block on the free list specified by the
> caller. Pages from the page block may be used immediately after the
> function returns. The caller is responsible for detecting or preventing
> the use of such pages.

This description doesn't tell me very much about what's going on here.
Neither does the comment.

"Pages from the page block may be used immediately after the
 function returns".

Used by who?  Does the "may" here mean that it is OK, or is it a warning
that the contents will be thrown away immediately?

The hypervisor is going to throw away the contents of these pages,
right?  As soon as the spinlock is released, someone can allocate a
page, and put good data in it.  What keeps the hypervisor from throwing
away good data?

Re: [Qemu-devel] [PATCH v11 6/6] virtio-balloon: VIRTIO_BALLOON_F_CMD_VQ

2017-06-12 Thread Dave Hansen

On 06/09/2017 03:41 AM, Wei Wang wrote:
> + for_each_populated_zone(zone) {
> + for (order = MAX_ORDER - 1; order > 0; order--) {
> + for (migratetype = 0; migratetype < MIGRATE_TYPES;
> +  migratetype++) {
> + do {
> + ret = report_unused_page_block(zone,
> + order, migratetype, );
> + if (!ret) {
> + pfn = (u64)page_to_pfn(page);
> + add_one_chunk(vb, vq,
> + PAGE_CHNUK_UNUSED_PAGE,
> + pfn << VIRTIO_BALLOON_PFN_SHIFT,
> + (u64)(1 << order) *
> + VIRTIO_BALLOON_PAGES_PER_PAGE);
> + }
> + } while (!ret);
> + }
> + }
> + }

This is pretty unreadable.Please add some indentation.  If you go
over 80 cols, then you might need to break this up into a separate
function.  But, either way, it can't be left like this.

Re: [Qemu-devel] [Proposal] I/O throttling through new -object interface

2017-06-12 Thread Stefan Hajnoczi

On Thu, Jun 08, 2017 at 09:21:13PM +0300, Manos Pitsidianakis wrote:
> Users can hotplug ThrottleGroups with object-add in QMP/HMP, then
> attach drives to a throttle group (ie adding a filter node in the drive path
> that points to that throttle group) with an 'attach-throttle' and detach
> with 'detach-throttle', each taking the drive and throttle group name as
> arguments.

BDS nodes are added/removed with blockdev-add/blockdev-del.  Why are new
attach-throttle/detach-throttle commands necessary?

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] NVME: is there any plan to support SGL data transfer?

2017-06-12 Thread Keith Busch

On Fri, Jun 02, 2017 at 03:51:51PM +0200, Kevin Wolf wrote:
> Am 02.06.2017 um 03:47 hat Qu Wenruo geschrieben:
> > When going through NVMe specification and hw/block/nvme.c,
> > I found that it seems that NVMe qemu implementation only support PRP
> > for sq entry.
> > And NvmeRwCmd doesn't even use union to define DPTR, but just prp1 and prp2.
> > 
> > Although I am just a newbie, but I'm quite interested in NVMe and
> > want to try to implement SGL support for qemu NVMe.
> > 
> > Is there anyone already doing such work? Or is there any plan on
> > implement such feature?
> 
> Keith, you can probably answer this?

Hi,

I personally have no plans to implement it, but would be happy to see it
added if someone wants to send the patch for review.

Thanks,
Keith

[Qemu-devel] [PATCH v5 3/9] pci: Fix the return value checking

2017-06-12 Thread Mao Zhongyi

pci_add_capability returns a strictly positive value on success,
correct asserts.

Cc: dmi...@daynix.com
Cc: jasow...@redhat.com
Cc: kra...@redhat.com
Cc: alex.william...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/net/e1000e.c   | 2 +-
 hw/net/eepro100.c | 2 +-
 hw/usb/hcd-xhci.c | 2 +-
 hw/vfio/pci.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 6e23493..8259d67 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -374,7 +374,7 @@ e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, 
uint16_t pmc)
 {
 int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
 
-if (ret >= 0) {
+if (ret > 0) {
 pci_set_word(pdev->config + offset + PCI_PM_PMC,
  PCI_PM_CAP_VER_1_1 |
  pmc);
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 4bf71f2..da36816 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -571,7 +571,7 @@ static void e100_pci_reset(EEPRO100State * s)
 int cfg_offset = 0xdc;
 int r = pci_add_capability(>dev, PCI_CAP_ID_PM,
cfg_offset, PCI_PM_SIZEOF);
-assert(r >= 0);
+assert(r > 0);
 pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
 #if 0 /* TODO: replace dummy code for power management emulation. */
 /* TODO: Power Management Control / Status. */
diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index a0c7960..ab42f86 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -3417,7 +3417,7 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error 
**errp)
 if (pci_bus_is_express(dev->bus) ||
 xhci_get_flag(xhci, XHCI_FLAG_FORCE_PCIE_ENDCAP)) {
 ret = pcie_endpoint_cap_init(dev, 0xa0);
-assert(ret >= 0);
+assert(ret > 0);
 }
 
 if (xhci->msix != ON_OFF_AUTO_OFF) {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 32aca77..5881968 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1744,7 +1744,7 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int 
pos, uint8_t size,
 }
 
 pos = pci_add_capability(>pdev, PCI_CAP_ID_EXP, pos, size);
-if (pos >= 0) {
+if (pos > 0) {
 vdev->pdev.exp.exp_cap = pos;
 }
 
-- 
2.9.3

[Qemu-devel] [PATCH v5 9/9] i386/kvm/pci-assign: Use errp directly rather than local_err

2017-06-12 Thread Mao Zhongyi

In assigned_device_pci_cap_init(), first, error messages are filled
to a local_err variable, then through error_propagate() pass to
the parameter of errp. It leads to cumbersome code. In order to
avoid the extra local_err and error_propagate(), drop it and use
errp instead.

Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com
Cc: m...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/i386/kvm/pci-assign.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index b7fdb47..9f2615c 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1232,7 +1232,6 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 AssignedDevice *dev = PCI_ASSIGN(pci_dev);
 PCIRegion *pci_region = dev->real_device.regions;
 int ret, pos;
-Error *local_err = NULL;
 
 /* Clear initial capabilities pointer and status copied from hw */
 pci_set_byte(pci_dev->config + PCI_CAPABILITY_LIST, 0);
@@ -1251,9 +1250,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSI;
 /* Only 32-bit/no-mask currently supported */
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_MSI, pos, 10,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 pci_dev->msi_cap = pos;
@@ -1283,9 +1281,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 dev->dev.cap_present |= QEMU_PCI_CAP_MSIX;
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSIX;
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_MSIX, pos, 12,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 pci_dev->msix_cap = pos;
@@ -1313,9 +1310,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 uint16_t pmc;
 
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_PM, pos, PCI_PM_SIZEOF,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 
@@ -1381,9 +1377,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 }
 
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_EXP, pos, size,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 
@@ -1457,9 +1452,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 
 /* Only expose the minimum, 8 byte capability */
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_PCIX, pos, 8,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 
@@ -1485,9 +1479,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 if (pos) {
 /* Direct R/W passthrough */
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_VPD, pos, 8,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 
@@ -1503,9 +1496,8 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 uint8_t len = pci_get_byte(pci_dev->config + pos + PCI_CAP_FLAGS);
 /* Direct R/W passthrough */
 ret = pci_add_capability(pci_dev, PCI_CAP_ID_VNDR, pos, len,
-  _err);
+  errp);
 if (ret < 0) {
-error_propagate(errp, local_err);
 return ret;
 }
 
-- 
2.9.3

Re: [Qemu-devel] [Proposal] I/O throttling through new -object interface

2017-06-12 Thread Stefan Hajnoczi

On Thu, Jun 08, 2017 at 09:21:13PM +0300, Manos Pitsidianakis wrote:
> The old interface will be retained as much as possible.

I think "retained" means "reused" here but in case it doesn't:

The old interface must remain 100% functional.  Existing software like
libvirt must continue to work without any changes.  QEMU releases are
backwards compatible except for a long-term deprecation schedule.


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH v5 5/9] pci: Replace pci_add_capability2() with pci_add_capability()

2017-06-12 Thread Mao Zhongyi

After the patch 'Make errp the last parameter of pci_add_capability()',
pci_add_capability() and pci_add_capability2() now do exactly the same.
So drop the wrapper pci_add_capability() of pci_add_capability2(), then
replace the pci_add_capability2() with pci_add_capability() everywhere.

Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com
Cc: m...@redhat.com
Cc: dmi...@daynix.com
Cc: jasow...@redhat.com
Cc: mar...@redhat.com
Cc: alex.william...@redhat.com
Cc: arm...@redhat.com
Suggested-by: Eduardo Habkost 
Signed-off-by: Mao Zhongyi 
---
 hw/i386/kvm/pci-assign.c | 14 +++---
 hw/ide/ich.c |  2 +-
 hw/pci/msi.c |  2 +-
 hw/pci/msix.c|  2 +-
 hw/pci/pci.c | 20 ++--
 hw/vfio/pci.c|  6 +++---
 include/hw/pci/pci.h |  3 ---
 7 files changed, 15 insertions(+), 34 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 87dcbdd..3d60455 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1254,7 +1254,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 dev->dev.cap_present |= QEMU_PCI_CAP_MSI;
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSI;
 /* Only 32-bit/no-mask currently supported */
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSI, pos, 10,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_MSI, pos, 10,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1288,7 +1288,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 }
 dev->dev.cap_present |= QEMU_PCI_CAP_MSIX;
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSIX;
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSIX, pos, 12,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_MSIX, pos, 12,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1318,7 +1318,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 if (pos) {
 uint16_t pmc;
 
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_PM, pos, PCI_PM_SIZEOF,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_PM, pos, PCI_PM_SIZEOF,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1386,7 +1386,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 return -EINVAL;
 }
 
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_EXP, pos, size,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_EXP, pos, size,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1462,7 +1462,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 uint32_t status;
 
 /* Only expose the minimum, 8 byte capability */
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_PCIX, pos, 8,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_PCIX, pos, 8,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1490,7 +1490,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 pos = pci_find_cap_offset(pci_dev, PCI_CAP_ID_VPD, 0);
 if (pos) {
 /* Direct R/W passthrough */
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_VPD, pos, 8,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_VPD, pos, 8,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
@@ -1508,7 +1508,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 pos += PCI_CAP_LIST_NEXT) {
 uint8_t len = pci_get_byte(pci_dev->config + pos + PCI_CAP_FLAGS);
 /* Direct R/W passthrough */
-ret = pci_add_capability2(pci_dev, PCI_CAP_ID_VNDR, pos, len,
+ret = pci_add_capability(pci_dev, PCI_CAP_ID_VNDR, pos, len,
   _err);
 if (ret < 0) {
 error_propagate(errp, local_err);
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 4599169..989fca5 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -130,7 +130,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
**errp)
 pci_register_bar(dev, ICH9_MEM_BAR, PCI_BASE_ADDRESS_SPACE_MEMORY,
  >ahci.mem);
 
-sata_cap_offset = pci_add_capability2(dev, PCI_CAP_ID_SATA,
+sata_cap_offset = pci_add_capability(dev, PCI_CAP_ID_SATA,
   ICH9_SATA_CAP_OFFSET, SATA_CAP_SIZE,
   errp);
 if (sata_cap_offset < 0) {
diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index a87b227..5e05ce5 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -216,7 +216,7 @@ int

[Qemu-devel] [PATCH v5 4/9] pci: Make errp the last parameter of pci_add_capability()

2017-06-12 Thread Mao Zhongyi

Add Error argument for pci_add_capability() to leverage the errp
to pass info on errors. This way is helpful for its callers to
make a better error handling when moving to 'realize'.

Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com
Cc: m...@redhat.com
Cc: jasow...@redhat.com
Cc: mar...@redhat.com
Cc: alex.william...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/i386/amd_iommu.c   | 24 +---
 hw/net/e1000e.c   | 30 ++
 hw/net/eepro100.c | 18 ++
 hw/pci-bridge/i82801b11.c |  1 +
 hw/pci/pci.c  | 10 --
 hw/pci/pci_bridge.c   |  7 ++-
 hw/pci/pcie.c | 10 --
 hw/pci/shpc.c |  5 -
 hw/pci/slotid_cap.c   |  7 ++-
 hw/vfio/pci.c |  9 ++---
 hw/virtio/virtio-pci.c| 12 
 include/hw/pci/pci.h  |  3 ++-
 12 files changed, 94 insertions(+), 42 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 7b6d4ea..d93ffc2 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1158,13 +1158,23 @@ static void amdvi_realize(DeviceState *dev, Error **err)
 x86_iommu->type = TYPE_AMD;
 qdev_set_parent_bus(DEVICE(>pci), >qbus);
 object_property_set_bool(OBJECT(>pci), true, "realized", err);
-s->capab_offset = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
- AMDVI_CAPAB_SIZE);
-assert(s->capab_offset > 0);
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0, 
AMDVI_CAPAB_REG_SIZE);
-assert(ret > 0);
-ret = pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0, 
AMDVI_CAPAB_REG_SIZE);
-assert(ret > 0);
+ret = pci_add_capability(>pci.dev, AMDVI_CAPAB_ID_SEC, 0,
+ AMDVI_CAPAB_SIZE, err);
+if (ret < 0) {
+return;
+}
+s->capab_offset = ret;
+
+ret = pci_add_capability(>pci.dev, PCI_CAP_ID_MSI, 0,
+ AMDVI_CAPAB_REG_SIZE, err);
+if (ret < 0) {
+return;
+}
+ret = pci_add_capability(>pci.dev, PCI_CAP_ID_HT, 0,
+ AMDVI_CAPAB_REG_SIZE, err);
+if (ret < 0) {
+return;
+}
 
 /* set up MMIO */
 memory_region_init_io(>mmio, OBJECT(s), _mem_ops, s, "amdvi-mmio",
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 8259d67..d1b1a97 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -47,6 +47,7 @@
 #include "e1000e_core.h"
 
 #include "trace.h"
+#include "qapi/error.h"
 
 #define TYPE_E1000E "e1000e"
 #define E1000E(obj) OBJECT_CHECK(E1000EState, (obj), TYPE_E1000E)
@@ -372,21 +373,26 @@ e1000e_gen_dsn(uint8_t *mac)
 static int
 e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
-int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
+Error *local_err = NULL;
+int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
+ PCI_PM_SIZEOF, _err);
 
-if (ret > 0) {
-pci_set_word(pdev->config + offset + PCI_PM_PMC,
- PCI_PM_CAP_VER_1_1 |
- pmc);
+if (local_err) {
+error_report_err(local_err);
+return ret;
+}
 
-pci_set_word(pdev->wmask + offset + PCI_PM_CTRL,
- PCI_PM_CTRL_STATE_MASK |
- PCI_PM_CTRL_PME_ENABLE |
- PCI_PM_CTRL_DATA_SEL_MASK);
+pci_set_word(pdev->config + offset + PCI_PM_PMC,
+ PCI_PM_CAP_VER_1_1 |
+ pmc);
 
-pci_set_word(pdev->w1cmask + offset + PCI_PM_CTRL,
- PCI_PM_CTRL_PME_STATUS);
-}
+pci_set_word(pdev->wmask + offset + PCI_PM_CTRL,
+ PCI_PM_CTRL_STATE_MASK |
+ PCI_PM_CTRL_PME_ENABLE |
+ PCI_PM_CTRL_DATA_SEL_MASK);
+
+pci_set_word(pdev->w1cmask + offset + PCI_PM_CTRL,
+ PCI_PM_CTRL_PME_STATUS);
 
 return ret;
 }
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index da36816..5a4774a 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -48,6 +48,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/dma.h"
 #include "qemu/bitops.h"
+#include "qapi/error.h"
 
 /* QEMU sends frames smaller than 60 bytes to ethernet nics.
  * Such frames are rejected by real nics and their emulations.
@@ -494,7 +495,7 @@ static void eepro100_fcp_interrupt(EEPRO100State * s)
 }
 #endif
 
-static void e100_pci_reset(EEPRO100State * s)
+static void e100_pci_reset(EEPRO100State *s, Error **errp)
 {
 E100PCIDeviceInfo *info = eepro100_get_class(s);
 uint32_t device = s->device;
@@ -570,8 +571,12 @@ static void e100_pci_reset(EEPRO100State * s)
 /* Power Management Capabilities */
 int cfg_offset = 0xdc;
 int r = pci_add_capability(>dev, PCI_CAP_ID_PM,
-   cfg_offset, PCI_PM_SIZEOF);
-assert(r > 0);

[Qemu-devel] [PATCH v5 6/9] pci: Convert to realize

2017-06-12 Thread Mao Zhongyi

The pci-birdge device i82801b11 and io3130_upstream/downstream
still implements the old PCIDeviceClass .init() through *_init()
instead of the new .realize(). All devices need to be converted
to .realize(). So convert it and rename it to *_realize().

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/pci-bridge/i82801b11.c  | 11 +--
 hw/pci-bridge/pcie_root_port.c | 15 ++-
 hw/pci-bridge/xio3130_downstream.c | 20 +---
 hw/pci-bridge/xio3130_upstream.c   | 20 +---
 hw/pci/pci_bridge.c|  7 +++
 hw/pci/pcie.c  | 11 ++-
 include/hw/pci/pci_bridge.h|  3 ++-
 include/hw/pci/pcie.h  |  3 ++-
 8 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index 2c065c3..2c1b747 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -59,24 +59,23 @@ typedef struct I82801b11Bridge {
 /*< public >*/
 } I82801b11Bridge;
 
-static int i82801b11_bridge_initfn(PCIDevice *d)
+static void i82801b11_bridge_realize(PCIDevice *d, Error **errp)
 {
 int rc;
 
 pci_bridge_initfn(d, TYPE_PCI_BUS);
 
 rc = pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
-   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID);
+   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID,
+   errp);
 if (rc < 0) {
 goto err_bridge;
 }
 pci_config_set_prog_interface(d->config, PCI_CLASS_BRIDGE_PCI_INF_SUB);
-return 0;
+return;
 
 err_bridge:
 pci_bridge_exitfn(d);
-
-return rc;
 }
 
 static const VMStateDescription i82801b11_bridge_dev_vmstate = {
@@ -96,7 +95,7 @@ static void i82801b11_bridge_class_init(ObjectClass *klass, 
void *data)
 k->vendor_id = PCI_VENDOR_ID_INTEL;
 k->device_id = PCI_DEVICE_ID_INTEL_82801BA_11;
 k->revision = ICH9_D2P_A2_REVISION;
-k->init = i82801b11_bridge_initfn;
+k->realize = i82801b11_bridge_realize;
 k->config_write = pci_bridge_write_config;
 dc->vmsd = _bridge_dev_vmstate;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index cf36318..00f0b1f 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -59,29 +59,27 @@ static void rp_realize(PCIDevice *d, Error **errp)
 PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
 PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
 int rc;
-Error *local_err = NULL;
 
 pci_config_set_interrupt_pin(d->config, 1);
 pci_bridge_initfn(d, TYPE_PCIE_BUS);
 pcie_port_init_reg(d);
 
-rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id, rpc->ssid);
+rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
+   rpc->ssid, errp);
 if (rc < 0) {
-error_setg(errp, "Can't init SSV ID, error %d", rc);
 goto err_bridge;
 }
 
 if (rpc->interrupts_init) {
-rc = rpc->interrupts_init(d, _err);
+rc = rpc->interrupts_init(d, errp);
 if (rc < 0) {
-error_propagate(errp, local_err);
 goto err_bridge;
 }
 }
 
-rc = pcie_cap_init(d, rpc->exp_offset, PCI_EXP_TYPE_ROOT_PORT, p->port);
+rc = pcie_cap_init(d, rpc->exp_offset, PCI_EXP_TYPE_ROOT_PORT,
+   p->port, errp);
 if (rc < 0) {
-error_setg(errp, "Can't add Root Port capability, error %d", rc);
 goto err_int;
 }
 
@@ -98,9 +96,8 @@ static void rp_realize(PCIDevice *d, Error **errp)
 }
 
 rc = pcie_aer_init(d, PCI_ERR_VER, rpc->aer_offset,
-   PCI_ERR_SIZEOF, _err);
+   PCI_ERR_SIZEOF, errp);
 if (rc < 0) {
-error_propagate(errp, local_err);
 goto err;
 }
 pcie_aer_root_init(d);
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index cfe8a36..e706f36 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -56,33 +56,33 @@ static void xio3130_downstream_reset(DeviceState *qdev)
 pci_bridge_reset(qdev);
 }
 
-static int xio3130_downstream_initfn(PCIDevice *d)
+static void xio3130_downstream_realize(PCIDevice *d, Error **errp)
 {
 PCIEPort *p = PCIE_PORT(d);
 PCIESlot *s = PCIE_SLOT(d);
 int rc;
-Error *err = NULL;
 
 pci_bridge_initfn(d, TYPE_PCIE_BUS);
 pcie_port_init_reg(d);
 
 rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
   XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
-  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, );
+  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
+  errp);
 if (rc < 0) {
 assert(rc == -ENOTSUP);
-

[Qemu-devel] [PATCH v5 0/9] Convert to realize and cleanup

2017-06-12 Thread Mao Zhongyi

This series mainly implements the conversions of pci-bridge devices
i82801b11, io3130_upstream/downstream and so on to realize(). Naturally
part of error messages need to be converted to Error, then propagate
to its callers via the argument errp, bonus clean related minor flaw
up. In short, the former patches are prerequisites for latter ones.

v5:
* patch5: replace pci_add_capability2() with pci_add_capability(), because
  it's confusing to have a function named pci_add_capability2() if
  pci_add_capability() doesn't exist anymore. [Eduardo Habkost]
* patch8: a new patch that fix the return type of verify_irqchip_kernel().
* patch9: a new patch that use the errp directly instead of the local_err to
  propagate the error messages.

v4:
* patch4: changed from patch 5 in v3. use a elegant way to check
  the error, like

  function(...); 
  if (function succeeded) {
 /* non-error code path here */
 foo = bar;
  }

  or

  function(...);
  if (function succeeded) {
  /* non-error code path here */
  foo = bar;
  } else {
  /* error path here */
  return ret;
  }

  for readability, instead of this:

  function(...)
  if (function failed) {
  return ...;  /* or: "goto out" */
  }

  /* non-error code path here */
  foo = bar; [Eduardo Habkost]

  meanwhile, split previous patch4 out. [Michael S. Tsirkin]
* patch5: a new patch that replace pci_add_capability() with
  pci_add_capability2(). [Eduardo Habkost]

v3:
* patch2: explain the specified means of the return value, also
  improve the commit message. [Marcel Apfelbaum]
* patch3: simplify the subject and commit message, fix another
  wrong assert. [Marcel Apfelbaum]
* patch4: adjust the subject.
* patch5: fix a wrong optimization for errp. [Eduardo Habkost]
* patch7: a new patch that converts shpc_init() to Error in order
  to propagate the error better.   
v2:
* patch1: subject and commit message was rewrited by markus.
* patch2: comment was added to pci_add_capability2().
* patch3: a new patch that fix the wrong return value judgment condition.
* patch4: a new patch that fix code style problems.
* patch5: add an errp argument for pci_add_capability to pass
  error for its callers.
* patch6: convert part of pci-bridge device to realize.

v1:
* patch1: fix unreasonable return value check

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Cc: dmi...@daynix.com
Cc: jasow...@redhat.com
Cc: kra...@redhat.com
Cc: alex.william...@redhat.com
Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com

Mao Zhongyi (9):
  pci: Clean up error checking in pci_add_capability()
  pci: Add comment for pci_add_capability2()
  pci: Fix the return value checking
  pci: Make errp the last parameter of pci_add_capability()
  pci: Replace pci_add_capability2() with pci_add_capability()
  pci: Convert to realize
  pci: Convert shpc_init() to Error
  i386/kvm/pci-assign: Fix return type of verify_irqchip_kernel()
  i386/kvm/pci-assign: Use errp directly rather than local_err

 hw/i386/amd_iommu.c| 24 -
 hw/i386/kvm/pci-assign.c   | 54 ++
 hw/ide/ich.c   |  2 +-
 hw/net/e1000e.c| 30 -
 hw/net/eepro100.c  | 18 ++---
 hw/pci-bridge/i82801b11.c  | 12 -
 hw/pci-bridge/pci_bridge_dev.c | 21 ++-
 hw/pci-bridge/pcie_root_port.c | 15 +--
 hw/pci-bridge/xio3130_downstream.c | 20 +++---
 hw/pci-bridge/xio3130_upstream.c   | 20 +++---
 hw/pci/msi.c   |  2 +-
 hw/pci/msix.c  |  2 +-
 hw/pci/pci.c   | 24 +++--
 hw/pci/pci_bridge.c|  8 --
 hw/pci/pcie.c  | 15 ---
 hw/pci/shpc.c  | 10 ---
 hw/pci/slotid_cap.c| 12 ++---
 hw/usb/hcd-xhci.c  |  2 +-
 hw/vfio/pci.c  | 15 ++-
 hw/virtio/virtio-pci.c | 12 ++---
 include/hw/pci/pci.h   |  2 --
 include/hw/pci/pci_bridge.h|  3 ++-
 include/hw/pci/pcie.h  |  3 ++-
 include/hw/pci/shpc.h  |  3 ++-
 include/hw/pci/slotid_cap.h|  3 ++-
 25 files changed, 171 insertions(+), 161 deletions(-)

-- 
2.9.3

[Qemu-devel] [PATCH v5 7/9] pci: Convert shpc_init() to Error

2017-06-12 Thread Mao Zhongyi

In order to propagate error message better, convert shpc_init() to
Error also convert the pci_bridge_dev_initfn() to realize.

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/pci-bridge/pci_bridge_dev.c | 21 -
 hw/pci/shpc.c  | 11 +--
 hw/pci/slotid_cap.c| 11 +--
 include/hw/pci/shpc.h  |  3 ++-
 include/hw/pci/slotid_cap.h|  3 ++-
 5 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 5dbd933..30c4186 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -49,12 +49,11 @@ struct PCIBridgeDev {
 };
 typedef struct PCIBridgeDev PCIBridgeDev;
 
-static int pci_bridge_dev_initfn(PCIDevice *dev)
+static void pci_bridge_dev_realize(PCIDevice *dev, Error **errp)
 {
 PCIBridge *br = PCI_BRIDGE(dev);
 PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev);
 int err;
-Error *local_err = NULL;
 
 pci_bridge_initfn(dev, TYPE_PCI_BUS);
 
@@ -62,7 +61,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 dev->config[PCI_INTERRUPT_PIN] = 0x1;
 memory_region_init(_dev->bar, OBJECT(dev), "shpc-bar",
shpc_bar_size(dev));
-err = shpc_init(dev, >sec_bus, _dev->bar, 0);
+err = shpc_init(dev, >sec_bus, _dev->bar, 0, errp);
 if (err) {
 goto shpc_error;
 }
@@ -71,7 +70,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 bridge_dev->msi = ON_OFF_AUTO_OFF;
 }
 
-err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0);
+err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
 if (err) {
 goto slotid_error;
 }
@@ -79,20 +78,18 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 if (bridge_dev->msi != ON_OFF_AUTO_OFF) {
 /* it means SHPC exists, because MSI is needed by SHPC */
 
-err = msi_init(dev, 0, 1, true, true, _err);
+err = msi_init(dev, 0, 1, true, true, errp);
 /* Any error other than -ENOTSUP(board's MSI support is broken)
  * is a programming error */
 assert(!err || err == -ENOTSUP);
 if (err && bridge_dev->msi == ON_OFF_AUTO_ON) {
 /* Can't satisfy user's explicit msi=on request, fail */
-error_append_hint(_err, "You have to use msi=auto (default) "
+error_append_hint(errp, "You have to use msi=auto (default) "
 "or msi=off with this machine type.\n");
-error_report_err(local_err);
 goto msi_error;
 }
-assert(!local_err || bridge_dev->msi == ON_OFF_AUTO_AUTO);
+assert(bridge_dev->msi == ON_OFF_AUTO_AUTO);
 /* With msi=auto, we fall back to MSI off silently */
-error_free(local_err);
 }
 
 if (shpc_present(dev)) {
@@ -101,7 +98,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
  PCI_BASE_ADDRESS_MEM_TYPE_64, _dev->bar);
 }
-return 0;
+return;
 
 msi_error:
 slotid_cap_cleanup(dev);
@@ -111,8 +108,6 @@ slotid_error:
 }
 shpc_error:
 pci_bridge_exitfn(dev);
-
-return err;
 }
 
 static void pci_bridge_dev_exitfn(PCIDevice *dev)
@@ -216,7 +211,7 @@ static void pci_bridge_dev_class_init(ObjectClass *klass, 
void *data)
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
 
-k->init = pci_bridge_dev_initfn;
+k->realize = pci_bridge_dev_realize;
 k->exit = pci_bridge_dev_exitfn;
 k->config_write = pci_bridge_dev_write_config;
 k->vendor_id = PCI_VENDOR_ID_REDHAT;
diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c
index d72d5e4..69fc14b 100644
--- a/hw/pci/shpc.c
+++ b/hw/pci/shpc.c
@@ -446,16 +446,14 @@ static void shpc_cap_update_dword(PCIDevice *d)
 }
 
 /* Add SHPC capability to the config space for the device. */
-static int shpc_cap_add_config(PCIDevice *d)
+static int shpc_cap_add_config(PCIDevice *d, Error **errp)
 {
 uint8_t *config;
 int config_offset;
-Error *local_err = NULL;
 config_offset = pci_add_capability(d, PCI_CAP_ID_SHPC,
0, SHPC_CAP_LENGTH,
-   _err);
+   errp);
 if (config_offset < 0) {
-error_report_err(local_err);
 return config_offset;
 }
 config = d->config + config_offset;
@@ -584,13 +582,14 @@ void shpc_device_hot_unplug_request_cb(HotplugHandler 
*hotplug_dev,
 }
 
 /* Initialize the SHPC structure in bridge's BAR. */
-int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar, unsigned 
offset)
+int shpc_init(PCIDevice *d, PCIBus *sec_bus, MemoryRegion *bar,
+  unsigned offset, Error **errp)
 {
 int i, ret;
 int nslots = SHPC_MAX_SLOTS; /*

Re: [Qemu-devel] [RFC 0/2] Parse 'filename' option for RBD/iSCSI

2017-06-12 Thread Richard W.M. Jones

On Mon, Jun 12, 2017 at 12:05:12AM -0400, Jeff Cody wrote:
> We need to be able to parse the 'filename' option for both rbd and iscsi,
> because there may exist images in the wild that have json backing files,
> that specify the filename argument.
> 
> Marking the series as RFC at least partially for the precedence given to
> arguments; as written, these patches will give preference to the 'filename'
> parameter over passed options.

I tested this as follows:

(1) Prepare a test image.  By using the ‘qemu-img rebase -u’ option we
can set the formerly correct / now incorrect backing option to an
arbitrary json string of our choosing:

$ qemu-img create test.img -f qcow2 10M
$ qemu-img rebase -u -b 
'json:{"file.driver":"rbd","file.filename":"rbd:rbd/rbd.img:mon_host=127.0.0.0"}'
 test.img 

(2) With upstream qemu:

$ qemu-system-x86_64 -hda test.img 
qemu-system-x86_64: -hda test.img: Could not open backing file: Parameters 
'pool' and 'image' are required

(3) With upstream qemu + these two patches:

$ ~/d/qemu/x86_64-softmmu/qemu-system-x86_64 -hda test.img
[hangs and eventually reports a timeout because I don't really
have a Ceph server]

So I would say that these patches, superficially at least, do seem to
work.  Although I wasn't able to test them against a real Ceph server.

  Tested-by: Richard W.M. Jones 

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

[Qemu-devel] [PATCH v5 1/9] pci: Clean up error checking in pci_add_capability()

2017-06-12 Thread Mao Zhongyi

On success, pci_add_capability2() returns a positive value. On
failure, it sets an error and return a negative value.

pci_add_capability() laboriously checks this behavior. No other
caller does. Drop the checks from pci_add_capability().

Cc: m...@redhat.com
Cc: mar...@redhat.com
Signed-off-by: Mao Zhongyi 
Reviewed-by: Marcel Apfelbaum 
---
 hw/pci/pci.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 98ccc27..53566b8 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2270,12 +2270,8 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
 Error *local_err = NULL;
 
 ret = pci_add_capability2(pdev, cap_id, offset, size, _err);
-if (local_err) {
-assert(ret < 0);
+if (ret < 0) {
 error_report_err(local_err);
-} else {
-/* success implies a positive offset in config space */
-assert(ret > 0);
 }
 return ret;
 }
-- 
2.9.3

[Qemu-devel] [PATCH v5 8/9] i386/kvm/pci-assign: Fix return type of verify_irqchip_kernel()

2017-06-12 Thread Mao Zhongyi

When the function no success value to transmit, it usually make the
function return void. It has turned out not to be a success, because
it means that the extra local_err variable and error_propagate() will
be needed. It leads to cumbersome code, therefore, transmit success/
failure in the return value is worth. So fix the return type to avoid
it.

Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com
Cc: m...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/i386/kvm/pci-assign.c | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 3d60455..b7fdb47 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -824,12 +824,13 @@ static void assign_device(AssignedDevice *dev, Error 
**errp)
 }
 }
 
-static void verify_irqchip_in_kernel(Error **errp)
+static int verify_irqchip_in_kernel(Error **errp)
 {
 if (kvm_irqchip_in_kernel()) {
-return;
+return 0;
 }
 error_setg(errp, "pci-assign requires KVM with in-kernel irqchip enabled");
+return -1;
 }
 
 static int assign_intx(AssignedDevice *dev, Error **errp)
@@ -838,7 +839,6 @@ static int assign_intx(AssignedDevice *dev, Error **errp)
 PCIINTxRoute intx_route;
 bool intx_host_msi;
 int r;
-Error *local_err = NULL;
 
 /* Interrupt PIN 0 means don't use INTx */
 if (assigned_dev_pci_read_byte(>dev, PCI_INTERRUPT_PIN) == 0) {
@@ -846,9 +846,7 @@ static int assign_intx(AssignedDevice *dev, Error **errp)
 return 0;
 }
 
-verify_irqchip_in_kernel(_err);
-if (local_err) {
-error_propagate(errp, local_err);
+if (verify_irqchip_in_kernel(errp) < 0) {
 return -ENOTSUP;
 }
 
@@ -1246,9 +1244,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
  * MSI capability is the 1st capability in capability config */
 pos = pci_find_cap_offset(pci_dev, PCI_CAP_ID_MSI, 0);
 if (pos != 0 && kvm_check_extension(kvm_state, KVM_CAP_ASSIGN_DEV_IRQ)) {
-verify_irqchip_in_kernel(_err);
-if (local_err) {
-error_propagate(errp, local_err);
+if (verify_irqchip_in_kernel(errp) < 0) {
 return -ENOTSUP;
 }
 dev->dev.cap_present |= QEMU_PCI_CAP_MSI;
@@ -1281,9 +1277,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 uint32_t msix_table_entry;
 uint16_t msix_max;
 
-verify_irqchip_in_kernel(_err);
-if (local_err) {
-error_propagate(errp, local_err);
+if (verify_irqchip_in_kernel(errp) < 0) {
 return -ENOTSUP;
 }
 dev->dev.cap_present |= QEMU_PCI_CAP_MSIX;
-- 
2.9.3

[Qemu-devel] [PATCH v5 2/9] pci: Add comment for pci_add_capability2()

2017-06-12 Thread Mao Zhongyi

Comments for pci_add_capability2() to explain the return
value. This may help to make a correct return value check
for its callers.

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Suggested-by: Markus Armbruster 
Signed-off-by: Mao Zhongyi 
---
 hw/pci/pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 53566b8..b73bfea 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2276,6 +2276,12 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
 return ret;
 }
 
+/*
+ * On success, pci_add_capability2() returns a positive value
+ * that the offset of the pci capability.
+ * On failure, it sets an error and returns a negative error
+ * code.
+ */
 int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id,
uint8_t offset, uint8_t size,
Error **errp)
-- 
2.9.3

Re: [Qemu-devel] [PULL for-2.9 0/5] Block patches

2017-06-12 Thread Peter Maydell

On 7 June 2017 at 19:07, Stefan Hajnoczi  wrote:
> The following changes since commit 0db1851becbefe3e50cfc03776fb1f75817376af:
>
>   Merge remote-tracking branch 
> 'remotes/vivier/tags/m68k-for-2.10-pull-request' into staging (2017-06-07 
> 11:56:00 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to 11cde1c81093a33c46c7a4039bf750bb61551087:
>
>   configure: split c and cxx extra flags (2017-06-07 15:29:46 +0100)
>
> 
>
> 
>
> Bruno Dominguez (1):
>   configure: split c and cxx extra flags
>
> Philippe Mathieu-Daudé (2):
>   oslib: strip trailing '\n' from error_setg() string argument
>   coccinelle: fix typo in comment
>
> Roman Pen (1):
>   coroutine-lock: do not touch coroutine after another one has been
> entered
>
> Stefan Hajnoczi (1):
>   .gdbinit: load QEMU sub-commands when gdb starts
>
>  configure| 75 
> ++--
>  disas/libvixl/Makefile.objs  |  4 +-
>  util/oslib-posix.c   |  2 +-
>  util/qemu-coroutine-lock.c   | 19 +++-
>  util/qemu-coroutine.c|  5 +++
>  .gdbinit |  8 
>  rules.mak|  3 --
>  scripts/coccinelle/return_directly.cocci |  2 +-
>  8 files changed, 77 insertions(+), 41 deletions(-)
>  create mode 100644 .gdbinit

Applied, thanks.

-- PMM

1 2 >

1 - 100 of 169 matches

Mail list logo