date:20160106

Re: [Qemu-devel] [PATCH v2 05/10] exec: [ŧcg] Use multiple physical TB caches

2016-01-06 Thread Stefan Hajnoczi

On Tue, Nov 24, 2015 at 06:09:14PM +0100, Lluís Vilanova wrote:

Commit message uses 'LATIN SMALL LETTER T WITH STROKE' (U+0167) instead
of regular 't' character :).


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] hw/arm/virt: Initialize NICs configured in PCI bus

2016-01-06 Thread Ashok Kumar

On Wed, Jan 06, 2016 at 06:10:15PM +, Peter Maydell wrote:
Hi,

> On 6 January 2016 at 14:47, Ashok Kumar  wrote:
> > virtio model is used for default case.
> >
> > Signed-off-by: Ashok Kumar 
> 
> Could you explain why you think this needs to be done?
Yes, for -net. Honestly, I didn't know this is obsolete and the
only syntax I knew for networking. 

> Virtio networking works OK for me...
I tried the new syntax now and it is working fine.

> 
> I guess from the patch that this is adding support for
> the legacy '-net' way of configuring networking, but do
> we need that if we never supported it in the first place?
> (If virt is the only PCI machine which doesn't support
> -net syntax that would probably be a strong argument for
> supporting it.)
Fine with me. But there are some documentation for e.g [1] which says "-net" is
still supported.

[1] http://www.linux-kvm.org/page/Networking#Compatibility

Thanks,
Ashok
> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [Qemu-block] Minutes from the "Stuttgart block Gipfele"

2016-01-06 Thread Stefan Hajnoczi

On Mon, Jan 04, 2016 at 03:28:36PM +0800, Fam Zheng wrote:
> On Mon, 01/04 13:16, Stefan Hajnoczi wrote:
> > On Wed, Dec 23, 2015 at 06:15:20PM +0800, Fam Zheng wrote:
> > > On Fri, 12/18 14:15, Markus Armbruster wrote:
> > > In that theory, all other block job types, mirror/stream/commit, fit into 
> > > a
> > > "pull" model, which follows a specified dirty bitmap and copies data from 
> > > a
> > > specified src BDS. In this pull model,
> > > 
> > > mirror (device=d0 target=d1) becomes a pull fileter:
> > > 
> > > BB[d0]BB[d1]
> > >| |
> > > throttle[pull,src=d0]
> > >| |
> > >detect-zero   detect-zero
> > >| |
> > >   copy-on-read  copy-on-read
> > >| |
> > >   BDS   BDS
> > > 
> > > Note: the pull reuses most of the block/mirror.c code except the
> > > s->dirty_bitmap will be initialized depending on the block job type. In 
> > > the
> > > case of mirror, it is trivially the same as now.
> > 
> > I don't understand the pull filter.  Is there also a mirror block job
> > coroutine?
> > 
> > Does anything perform I/O to BB[d1]?
> 
> Yes, the filter will have a mirror block job coroutine, and it writes to the
> BDS behind BB[d1]. This is conceptually different from the "block jobs have
> their own BBs" design.

Are any of the pull filter's callbacks (.bdrv_co_readv(),
.bdrv_co_writev()) being invoked?

I still don't understand your idea because it seems like the coroutine
is doing all the work and the filter serves no purpose.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] about I/O

2016-01-06 Thread Stefan Hajnoczi

On Tue, Dec 29, 2015 at 03:36:07PM +0800, 浩樊啊 wrote:
> qemu use paio or libaio finish I/O reqeusts,does that mean that when a sync 
> request comes from a vm becomes async request in the host?

Yes, although storage controllers (virtio-blk, SCSI, SATA, and even IDE
in DMA mode) are asynchronous at the hardware interface level.  That
means the guest OS submits an I/O request to the device and the guest
CPU continues running until the I/O completion interrupt occurs.

The exceptions are old hardware interfaces like some of the SD Card
controllers or maybe floppy where a single hardware register read/write
actually involves synchronous I/O (the guest cannot execute CPU
instructions while the I/O request is happening).

Stefan

signature.asc
Description: PGP signature

[Qemu-devel] Fwd: [vfio-users] [PATCH v2 1/3] input: add qemu_input_qcode_to_linux + qemu_input_linux_to_qcode

2016-01-06 Thread sL1pKn07 SpinFlo

On Mi, 2016-01-06 at 13:42 +0100, sL1pKn07 SpinFlo wrote:
> Yes, is like disconect the cable of USB.

Ok.  That is not so easy.  Well, the first part (switch away) is easy,
it is just a matter of catching read errors, so the device going away
(unplug) doesn't screw up things.  But when it comes to reconnect after
switching back it'll quickly becomes quite messy ...

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] Fix wrong format specifiers in trace events

2016-01-06 Thread Stefan Hajnoczi

On Tue, Dec 29, 2015 at 02:53:34PM +0100, Stefan Weil wrote:
> * Add missing % before PRIx64
> * Remove unneeded "" at end of format string in modified lines
> 
> This fixes a regression introduced in commit
> c8ee0a445a6a85635e962c0346bc7b1259c1a3f5.
> 
> Signed-off-by: Stefan Weil 
> ---
> 
> The patch should also be used for v2.5.1.
> 
> Happy New Year!

Thanks!

I've already merged Mark Cave-Ayland's patch.  Sorry for the duplicated
work, I was away and didn't send a pull request.

https://github.com/stefanha/qemu/commit/c6daed8654bb1a4d085d9229e5a7dd5363794180

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] Fwd: [vfio-users] [PATCH v2 1/3] input: add qemu_input_qcode_to_linux + qemu_input_linux_to_qcode

2016-01-06 Thread sL1pKn07 SpinFlo

Yes, is like disconect the cable of USB.
http://sl1pkn07.wtf/imagenes/kvm01.jpg
http://sl1pkn07.wtf/imagenes/kvm04.jpg

Greetings.

El 06/01/2016 08:53, "Gerd Hoffmann"  escribió:
>
> On Di, 2016-01-05 at 15:44 +0100, sL1pKn07 SpinFlo wrote:
> > 2016-01-05 9:06 GMT+01:00 Jonathan Scruggs :
> > > I notice no bugs as of yet.
> >
> > Hi
> > I found one (if can call that) bug
> >
> > I use a physical USB switch for share the K/M with other PC
> > when switch to other pc, lost the signal.
>
> How does the switch work?  Disconnect the usb devices (i.e. they are
> gone from lsusb output)?
>
> cheers,
>   Gerd
>

Re: [Qemu-devel] [PATCH] virtio-blk: Allow startup of empty cdroms

2016-01-06 Thread Stefan Hajnoczi

On Wed, Jan 06, 2016 at 11:35:43AM +0100, Michal Privoznik wrote:
> If you have an empty IDE cdrom we will start just fine:
> 
> -drive if=none,id=drive-ide0-0-0,readonly=on
> -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
> 
> However, that's not the case with virtio disk:
> 
> -drive if=none,media=cdrom,id=drive-virtio-disk1,readonly=on
> -device 
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x2,drive=drive-virtio-disk1,id=virtio-disk1
> 
> One will get the following error:
> 
> qemu-system-x86_64: -device 
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x2,drive=drive-virtio-disk1,id=virtio-disk1:
>  Device needs media, but drive is empty
> 
> The error comes from virtio_blk_device_realize() where we check
> if virtio block device has a media inserted. This should,
> however, be not required for cdroms.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  hw/block/virtio-blk.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

virtio-blk doesn't do CD-ROM emulation but it supports read-only block
devices.  The difference is that read-only block devices don't offer the
features that CD-ROMs support like media status and tray eject.

Also, virtio-blk does not support media change.  If we allow it to start
with an empty disk, is there an expectation that media can be changed
later?

In order to support empty media that can later be inserted virtio-blk.c
needs to implement BlockDevOps->change_media_cb() to notify the guest.
You'd need to check guest whether existing guest drivers handle a virtio
configuration interrupt properly (i.e. by invalidating cached pages from
the block devices).

Is this patch supposed to make virtio-blk support CD-ROMs?  In that case
a lot more work is necessary and it's probably better to use virtio-scsi
instead.

Is this patch purely supposed to make virtio-blk happy with empty
drives?  If so, then I'm not sure why that is a good idea since media
change isn't supported.

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 0/7] Raspberry Pi 2 support

2016-01-06 Thread Andrew Baumann

> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
> Sent: Wednesday, 6 January 2016 18:47
> On Wed, Jan 6, 2016 at 4:09 PM, Piotr Król  wrote:
> > On Wed, Jan 06, 2016 at 06:27:54PM +, Andrew Baumann wrote:
> >> This is most likely an unimplemented setend instruction in the
> >> userspace memcmp. You need to mount the image, and comment out
> the
> >> contents of (or just remove) /etc/ld.so.preload, then it should boot
> >> further.
> >
> > Yes, I figured this out.
> >
> 
> Do you need BE8 or BE32?

After quickly reading up on this, I'm pretty sure we only care about BE8. This 
is for the setend instructions in memcmp, e.g.:
https://github.com/raspberrypi/linux/commit/5cbcb76ee23807632fcd5db19ebcc2d913011f17

> Some BE8 support can be found here:
> 
> https://github.com/Xilinx/qemu/commits/mainline/big_endian

Thanks for the pointer. It would be nice to get this in for Pi (particularly 
for newer pi1 kernels, as they have started using the optimised memcmp in the 
kernel) but it's not on my critical path. Would a Tested-By help?

Andrew

Re: [Qemu-devel] [PATCH v2] trace-events: fix broken format strings

2016-01-06 Thread Stefan Hajnoczi

On Tue, Jan 05, 2016 at 08:50:45AM -0700, Eric Blake wrote:
> On 01/05/2016 08:37 AM, Andrew Jones wrote:
> > Fixes compiling with --enable-trace-backends
> > 
> > Signed-off-by: Andrew Jones 
> > ---
> > v2: also remove trailing null strings [Laurent]
> > 
> > 
> >  trace-events | 10 +-
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> Duplicate of:
> https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03667.html

I have merged Mark Cave-Ayland's patch and am sending a pull request
today.

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] 答复: What's the advantages of POSTCOPY over CPU-THROTTLE?

2016-01-06 Thread Zhangbo (Oscar)

Thank you David and Jason!

BTW, I noticed that Vmware did the same work alike us, but the situation is a 
little different:
they proposed postcopy(in the name of QuickResume) in vSphere4.1, but they 
substituted it with SDPS(similar to CPU-THROTTLE) from vSphere5, do you know 
the reason behind this?
Reference: 
https://qianr.wordpress.com/2013/10/14/vmware-vm-live-migration-vmotion/

It's told that they've already introduced a shared storage to avoid losing the 
guest when the network connection is lost. So what's their concern of disposing 
QuickResume? 
Are there any other prices we need to pay to have postcopy?

-邮件原件-
发件人: Jason J. Herne [mailto:jjhe...@linux.vnet.ibm.com] 
发送时间: 2016年1月7日 3:43
收件人: Dr. David Alan Gilbert; Zhangbo (Oscar)
抄送: zhouyimin Zhou(Yimin); Zhanghailiang; Yanqiangjun; Huangpeng (Peter); 
qemu-devel@nongnu.org; Herongguang (Stephen); Linqiangmin; Huangzhichao; 
Wangyufei (James)
主题: Re: [Qemu-devel] What's the advantages of POSTCOPY over CPU-THROTTLE?

On 01/06/2016 04:57 AM, Dr. David Alan Gilbert wrote:
> * Zhangbo (Oscar) (oscar.zhan...@huawei.com) wrote:
>> Hi all:
>> 　　Postcopy is suitable for migrating guests which have large page change 
>> rates. It
>>  1 makes the guest run at the destination ASAP.
>>  2 makes the downtime of the guest small enough.
>>  If we don't take the 1st advantage into account, then, its benefit 
>> seems similar with CPU-THROTTLE: both of them make the guest's downtime 
>> small during migration.
>>
>>  CPU-THROTTLE would make the guest's dirtypage rate *smaller than the 
>> network bandwidth*, in order to make the to_send_page_number in each 
>> iteration convergent and achieve the small-enough downtime during the last 
>> iteration.
>>  If we adopt POST-COPY here, the guest's dirtypage rate would *become 
>> equal to the bandwidth*, because we have to fetch its memory from the source 
>> side, via the network.
>>  Both of them would introduce performance degradations of the guest, 
>> which may in turn cause downtime larger.
>>
>>  So, here comes the question: If we just compare POSTCOPY with 
>> CPU-THROTTLE for their advantages in decreasing downtime, POSTCOPY seems has 
>> no pos over CPU-THROTTLE, is that right?
>>
>>  Meanwhile, Are there any other benifits of POSTCOPY besides the 2 
>> mentioned above?
>
> It's a good question and they do both try and help solve the same problem.
> One problem with cpu-throttle is whether you can throttle the CPU 
> enough to get the dirty-rate below the rate of the network, and the 
> answer to that is very workload dependent.  On a large, many-core VM, 
> even a little bit of CPU can dirty a lot of memory.  Postcopy is 
> guaranteed to finish migration, irrespective of the workload.
>
> Postcopy is pretty fine-grained, in that only threads that are 
> accessing pages that are still on the source are blocked, since it 
> allows the use of async page faults, that means it's even finer 
> grained than the vCPU level, so many threads come back up to full 
> performance pretty quickly even if there are a few pages left.
>

Good answer Dave. FWIW, I completely agree. Using cpu throttling can help the 
situation depending on workload. Postcopy will *always* work. 
One possible side effect of Postcopy is loss of the guest if the network 
connection dies during the postcopy phase of migration. This should be a very 
rare occurrence however. So both methods have their uses.

--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)

Re: [Qemu-devel] [PATCH v3 2/4] Add Error **errp for xen_pt_setup_vga()

2016-01-06 Thread Cao jin




On 01/06/2016 11:53 PM, Eric Blake wrote:

On 01/05/2016 07:39 PM, Cao jin wrote:

[...]


Please no '!' in error messages.  We aren't shouting at the user.


Interesting, I didn`t think of this kind of questions before. will fix




  XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");


Do we still need the XEN_PT_ERR() alongside setting the local error?



this is fixed in "4/4 convert to realize" patch, as I said in last mail, 
I try to make every patch as small as possible



  xen_host_pci_device_put(&s->real_device);
  return -1;


This leaks local_err.



[...]
--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH v3 1/4] Add Error **errp for xen_host_pci_device_get()

2016-01-06 Thread Cao jin




On 01/06/2016 11:51 PM, Eric Blake wrote:

On 01/05/2016 07:39 PM, Cao jin wrote:

To catch the error msg. Also modify the caller

Signed-off-by: Cao jin 
---
  hw/xen/xen-host-pci-device.c | 106 +--
  hw/xen/xen-host-pci-device.h |   5 +-
  hw/xen/xen_pt.c  |  12 +++--
  3 files changed, 71 insertions(+), 52 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 7d8a023..952cae0 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -49,7 +49,7 @@ static int xen_host_pci_sysfs_path(const XenHostPCIDevice *d,

  /* This size should be enough to read the first 7 lines of a resource file */
  #define XEN_HOST_PCI_RESOURCE_BUFFER_SIZE 400
-static int xen_host_pci_get_resource(XenHostPCIDevice *d)
+static void xen_host_pci_get_resource(XenHostPCIDevice *d, Error **errp)
  {
  int i, rc, fd;
  char path[PATH_MAX];
@@ -60,23 +60,24 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)

  rc = xen_host_pci_sysfs_path(d, "resource", path, sizeof (path));
  if (rc) {
-return rc;
+error_setg_errno(errp, errno, "snprintf err");


Are you sure that errno is relevant?  And "snprintf err" doesn't seem to
be the correct message, as there is no snprintf in the line above.



snprintf() is called in xen_host_pci_sysfs_path() above, and is the only 
possible error source, so I guess the errno is relevant?


Or, replace the error_setg_errno() to assert(0)? because if snprintf 
goes wrong, user seems can do nothing.



+return;
  }
+
  fd = open(path, O_RDONLY);
  if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_errno(errp, errno, "open %s err", path);


Please use error_setg_file_open() for reporting open() failures.


@@ -129,20 +130,20 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
  d->rom.bus_flags = flags & IORESOURCE_BITS;
  }
  }
+
  if (i != PCI_NUM_REGIONS) {
  /* Invalid format or input to short */
-rc = -ENODEV;
+error_setg(errp, "Invalid format or input to short: %s", buf);


s/to/too/ (and might as well fix the same typo in the comment while at it)



@@ -152,47 +153,52 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,

  rc = xen_host_pci_sysfs_path(d, name, path, sizeof (path));
  if (rc) {
-return rc;
+error_setg_errno(errp, errno, "snprintf err");
+return;
  }
+
  fd = open(path, O_RDONLY);
  if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_errno(errp, errno, "open %s err", path);


Same comments as above.


+return;
  }
+
  do {
  rc = read(fd, &buf, sizeof (buf) - 1);
  if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
  goto out;
  }
  } while (rc < 0);
+
  buf[rc] = 0;
  value = strtol(buf, &endptr, base);
  if (endptr == buf || *endptr != '\n') {
-rc = -1;
+error_setg(errp, "format invalid: %s", buf);
  } else if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
-rc = -errno;
+error_setg_errno(errp, errno, "strtol err");


This is pre-existing invalid use of strtol (the value of errno is not
guaranteed to be ERANGE on overflow; and the only correct way to safely
check errno after strtol() is to first prime it to 0 prior to calling
strtol).  Better would be to use qemu_strtol() (preferably as a separate
patch), so that you don't even have to worry about using strtol()
incorrectly.



Ok, will split it into a separate patch


+static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
  {
  char path[PATH_MAX];
  int rc;

  rc = xen_host_pci_sysfs_path(d, "config", path, sizeof (path));


May want to delete the space before ( while touching the code in this
vicinity.


  if (rc) {
-return rc;
+error_setg_errno(errp, errno, "snprintf err");


Another suspect message.



+void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
+uint8_t bus, uint8_t dev, uint8_t func,
+Error **errp)
  {
  unsigned int v;
-int rc = 0;
+Error *local_err = NULL;


These days, naming the local variable 'err' is more common than 'local_err'.



agree. I guess maybe "local_err" is a better mnemonic for newbie. and 
when I am gradually familiar with error report, I also prefer "err". 
Actually I considered to change this name, I just don`t want to bother 
the reviewer to review it again, especially when the patch got a 
Review-by and new version just changes some names. Will fix it.



@@ -774,11 +775,12 @@ static int xen_pt_initfn(PCIDevice *d)

Re: [Qemu-devel] [PATCH v2 0/7] Raspberry Pi 2 support

2016-01-06 Thread Peter Crosthwaite

On Wed, Jan 6, 2016 at 4:09 PM, Piotr Król  wrote:
> On Wed, Jan 06, 2016 at 06:27:54PM +, Andrew Baumann wrote:
>> Hi Piotr,
>>
>> Sorry, I just noticed this email. Please keep me in the To or CC, as I am 
>> not good at keeping up with all the traffic on qemu-devel.
>
> Hi Andrew,
> this was my fault instead reply-to-all I send reply to mailing list. I
> realized after sending.
>
>>
>> > From: qemu-devel-bounces+andrew.baumann=microsoft@nongnu.org
>> > [mailto:qemu-devel-
>> > bounces+andrew.baumann=microsoft@nongnu.org] On Behalf Of Piotr
>> > Król
>> > Sent: Wednesday, 30 December 2015 17:14
>>
>> > First, I tried your code from raspi branch (ar7-raspi doesn't compile [1]).
>> > Using recent Raspbian 2015-11-21-raspbian-jessie (same results I saw for
>> > 2015-09-24). I'm getting kernel panic:
>> >
>> > [6.892677] random: systemd urandom read with 7 bits of entropy 
>> > available
>> > [6.908292] Kernel panic - not syncing: Attempted to kill init!
>> > exitcode=0x0004
>>
>> This is most likely an unimplemented setend instruction in the
>> userspace memcmp. You need to mount the image, and comment out the
>> contents of (or just remove) /etc/ld.so.preload, then it should boot
>> further.
>
> Yes, I figured this out.
>

Do you need BE8 or BE32?

Some BE8 support can be found here:

https://github.com/Xilinx/qemu/commits/mainline/big_endian

Paolo did have BE32 in the original work but I simplified it out of my
own BE work, as I didn't have any tests for it.
 (I was focussed on AArch64 BE at the time).

Watch out for this bug:

https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05762.html

Which is probably still at large in that Xilinx branch.

Regards,
Peter

>>
>> BTW, I mentioned this briefly in the notes on Linux at the end of this page:
>> https://github.com/0xabu/qemu/wiki
>>
>> > Third test I tried is to apply patches from this series on top of master
>> > (38a762fec63f), what cause hang on:
>> >
>> > [   15.094558 ] mmc0: Timeout waiting for hardware interrupt
>>
>> Hmm, I have seen this error on the raspi machine, but I don't think I
>> saw it on raspi2 before. Does it also occur on the older (09-24)
>> image? I've seen this on raspi, but it only occurs when using the DMA
>> controller, and in the patch series presently on list, there is no DMA
>> controller included, so it can't be that.
>>
>
> I tried v2 with qemu 38a762fec63f using:
>
> $ qemu-system-arm -M raspi2 -kernel raspbian-boot/kernel7.img -sd 
> 2015-11-21-raspbian-jessie.img -append "rw earlyprintk loglevel=8 
> console=ttyAMA0,115200 dwc_otg.lpm_enable=0 root=/dev/mmcblk0p2" -dtb 
> raspbian-boot/bcm2709-rpi-2-b.dtb -usbdevice mouse -usbdevice keyboard 
> -serial stdio
>
> WARNING: Image format was not specified for '2015-11-21-raspbian-jessie.img' 
> and probing guessed raw.
>  Automatically detecting the format is dangerous for raw images, 
> write operations on block 0 will be restricted.
>  Specify the 'raw' format explicitly to remove the restrictions.
> qemu-system-arm: -usbdevice mouse: Error: no usb bus to attach usbdevice 
> mouse, please try -machine usb=on and check that the machine model supports 
> USB
> qemu-system-arm: -usbdevice mouse: could not add USB device 'mouse'
>
> Removing USB:
> qemu-system-arm -M raspi2 -kernel 2015-11-21-raspbian-boot/kernel7.img -sd 
> 2015-11-21-raspbian-jessie.img -append "rw earlyprintk loglevel=8 
> console=ttyAMA0,115200 dwc_otg.lpm_enable=0 root=/dev/mmcblk0p2" -dtb 
> 2015-11-21-raspbian-boot/bcm2709-rpi-2-b.dtb -serial stdio
>
> [5.071960] uart-pl011 3f201000.uart: no DMA platform data
> [   15.083445] mmc0: Timeout waiting for hardware interrupt.
> 
>
> Same stuff with 2015-09-24.
>
> Trying your GitHub raspi branch everything works fine after commenting in
> /etc/ld.so.preload.
>
>> > I do not follow this mailing list, so maybe I missing some pieces.
>> > If I need some other patches to test this series please let me know.
>>
>
>> There are other patches (on the list) with changes to sd and sdhci
>> emulation needed to boot Windows, but for Linux I believe this is all
>> you need.
>>
>> > My questions:
>> >
>> > Does 2015-09-24-raspbian-jessie.vhd extracted Linux partition from raspbian
>> > image or simply img converted to vhd format ? Is there any problem with
>> > using img ?
>>
>> No, img is fine. I converted it to a VHD container, because it's
>> easier to work with on Windows (and avoids qemu's warning about raw
>> images).
>>
>> > Would you mind to issue tracking on GitHub and pull requests or you want
>> > whole
>> > communication and possible fixes go through QEMU mailing list ? I think 
>> > that
>> > it
>> > maybe useful for code that is not ready for upstreaming. I saw you have
>> > issue
>> > tracking disabled.
>>
>> I'm in a bind here. My original goal was simply to contribute the
>> raspi2 + Windows guest support. Then I discovered that the raspi work
>> itself was not yet in mainline qemu, and so I've been workin

Re: [Qemu-devel] [PATCH v6 1/6] qdev: get_child_bus(): Use QOM lookup if available

2016-01-06 Thread Peter Crosthwaite

On Wed, Jan 6, 2016 at 4:18 PM, Alistair Francis
 wrote:
> On Sat, Dec 19, 2015 at 9:43 PM, Peter Crosthwaite
>  wrote:
>> qbus_realize() adds busses as a QOM child of the device in addition to
>> adding it to the qdev bus list. Change get_child_bus() to use the QOM
>> child if it is available. This takes priority over the bus-list, but
>> the child object is checked for type correctness.
>
> This doesn't cause problems for anything else? Is there anything else
> with the same name or something where this causes conflicts?
>

Not that I know of, I'm just being defensive as it is core code. I
suspect that that ignored failure path on the child setter is
incorrect but I can't prove it beyond all doubt.

Regards,
Peter

> Thanks,
>
> Alistair
>
>>
>> This prepares support for aliasing of buses. The use case is SoCs,
>> where a SoC container needs to present buses to the board level, but
>> the buses are implemented by controller IP we already model as self
>> contained qbus-containing devices.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>> Currently qbus_realize() ignores errors from object_property_add_child,
>> so it is hard to guarantee that the QOM linkage is reliable. If it were
>> the case that that object_property_add_child was supposed to be error
>> asserting, we could remove the old bus-list strcmp iterator altogether.
>>
>>  hw/core/qdev.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> index b3ad467..c96c464 100644
>> --- a/hw/core/qdev.c
>> +++ b/hw/core/qdev.c
>> @@ -581,6 +581,12 @@ void qdev_pass_gpios(DeviceState *dev, DeviceState 
>> *container,
>>  BusState *qdev_get_child_bus(DeviceState *dev, const char *name)
>>  {
>>  BusState *bus;
>> +Object *child = object_resolve_path_component(OBJECT(dev), name);
>> +
>> +bus = (BusState *)object_dynamic_cast(child, TYPE_BUS);
>> +if (bus) {
>> +return bus;
>> +}
>>
>>  QLIST_FOREACH(bus, &dev->child_bus, sibling) {
>>  if (strcmp(name, bus->name) == 0) {
>> --
>> 1.9.1
>>
>>

Re: [Qemu-devel] [PATCH v2 4/7] device_tree: qemu_fdt_getprop converted to use the error API

2016-01-06 Thread Peter Crosthwaite

On Wed, Jan 6, 2016 at 7:13 AM, Eric Auger  wrote:
> Current qemu_fdt_getprop exits if the property is not found. It is
> sometimes needed to read an optional property, in which case we do
> not wish to exit but simply returns a null value.
>
> This patch converts qemu_fdt_getprop to accept an Error **, and existing
> users are converted to pass &error_fatal. This preserves the existing
> behaviour. Then to use the API with your optional semantic a null
> parameter can be conveyed.
>
> Signed-off-by: Eric Auger 
>
> ---
>
> v1 -> v2:
> - add a doc comment in the header file
>
> RFC -> v1:
> - get rid of qemu_fdt_getprop_optional and implement Peter's suggestion
>   that consists in using the error API
>
> Signed-off-by: Eric Auger 
> ---
>  device_tree.c| 11 ++-
>  include/sysemu/device_tree.h | 15 ++-
>  2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/device_tree.c b/device_tree.c
> index 8441e01..6ecc9da 100644
> --- a/device_tree.c
> +++ b/device_tree.c
> @@ -321,18 +321,18 @@ int qemu_fdt_setprop_string(void *fdt, const char 
> *node_path,
>  }
>
>  const void *qemu_fdt_getprop(void *fdt, const char *node_path,
> - const char *property, int *lenp)
> + const char *property, int *lenp, Error **errp)
>  {
>  int len;
>  const void *r;
> +
>  if (!lenp) {
>  lenp = &len;
>  }
>  r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
>  if (!r) {
> -error_report("%s: Couldn't get %s/%s: %s", __func__,
> - node_path, property, fdt_strerror(*lenp));
> -exit(1);
> +error_setg(errp, "%s: Couldn't get %s/%s: %s", __func__,
> +  node_path, property, fdt_strerror(*lenp));
>  }
>  return r;
>  }
> @@ -341,7 +341,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
> *node_path,
> const char *property)
>  {
>  int len;
> -const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
> +const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len,
> + &error_fatal);

The _cell variant is now inconsistent with the regular _getprop. Can
we convert them both together and fix the clients to error_fatal?
(there are only 4 of them).

This would become the standard error_propagate pattern.

>  if (len != 4) {
>  error_report("%s: %s/%s not 4 bytes long (not a cell?)",
>   __func__, node_path, property);

And this would also convert to error_setg.

Regards,
Peter

> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
> index 269cb1c..4d7cbb9 100644
> --- a/include/sysemu/device_tree.h
> +++ b/include/sysemu/device_tree.h
> @@ -45,8 +45,21 @@ int qemu_fdt_setprop_string(void *fdt, const char 
> *node_path,
>  int qemu_fdt_setprop_phandle(void *fdt, const char *node_path,
>   const char *property,
>   const char *target_node_path);
> +/**
> + * qemu_fdt_getprop: retrieve the value of a given property
> + * @fdt: pointer to the device tree blob
> + * @node_path: node path
> + * @property: name of the property to find
> + * @lenp: fdt error if any or length of the property on success
> + * @errp: handle to an error object
> + *
> + * returns a pointer to the property on success and NULL on failure
> + * in case errp is set to &error_fatal, the function auto-asserts
> + * on error (legacy behavior)
> + */
>  const void *qemu_fdt_getprop(void *fdt, const char *node_path,
> - const char *property, int *lenp);
> + const char *property, int *lenp,
> + Error **errp);
>  uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
> const char *property);
>  uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
> --
> 1.9.1
>

Re: [Qemu-devel] [PATCH v6 1/6] qdev: get_child_bus(): Use QOM lookup if available

2016-01-06 Thread Alistair Francis

On Sat, Dec 19, 2015 at 9:43 PM, Peter Crosthwaite
 wrote:
> qbus_realize() adds busses as a QOM child of the device in addition to
> adding it to the qdev bus list. Change get_child_bus() to use the QOM
> child if it is available. This takes priority over the bus-list, but
> the child object is checked for type correctness.

This doesn't cause problems for anything else? Is there anything else
with the same name or something where this causes conflicts?

Thanks,

Alistair

>
> This prepares support for aliasing of buses. The use case is SoCs,
> where a SoC container needs to present buses to the board level, but
> the buses are implemented by controller IP we already model as self
> contained qbus-containing devices.
>
> Signed-off-by: Peter Crosthwaite 
> ---
> Currently qbus_realize() ignores errors from object_property_add_child,
> so it is hard to guarantee that the QOM linkage is reliable. If it were
> the case that that object_property_add_child was supposed to be error
> asserting, we could remove the old bus-list strcmp iterator altogether.
>
>  hw/core/qdev.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index b3ad467..c96c464 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -581,6 +581,12 @@ void qdev_pass_gpios(DeviceState *dev, DeviceState 
> *container,
>  BusState *qdev_get_child_bus(DeviceState *dev, const char *name)
>  {
>  BusState *bus;
> +Object *child = object_resolve_path_component(OBJECT(dev), name);
> +
> +bus = (BusState *)object_dynamic_cast(child, TYPE_BUS);
> +if (bus) {
> +return bus;
> +}
>
>  QLIST_FOREACH(bus, &dev->child_bus, sibling) {
>  if (strcmp(name, bus->name) == 0) {
> --
> 1.9.1
>
>

Re: [Qemu-devel] [PATCH v2 0/7] Raspberry Pi 2 support

2016-01-06 Thread Andrew Baumann

> From: Piotr Król [mailto:piotr.k...@3mdeb.com]
> Sent: Wednesday, 6 January 2016 16:10
[...]
> I tried v2 with qemu 38a762fec63f using:
> 
> $ qemu-system-arm -M raspi2 -kernel raspbian-boot/kernel7.img -sd 2015-
> 11-21-raspbian-jessie.img -append "rw earlyprintk loglevel=8
> console=ttyAMA0,115200 dwc_otg.lpm_enable=0 root=/dev/mmcblk0p2" -
> dtb raspbian-boot/bcm2709-rpi-2-b.dtb -usbdevice mouse -usbdevice
> keyboard -serial stdio

Try dropping the -dtb parameter. I think the DTB file tells Linux to use the 
non-existent DMA controller. If that doesn't work, check the command I included 
in the initial email of the patch series ("patch" 0/7) -- I tested that at 
least once :)

> > BTW, the biggest roadblock in my plan above is the USB host controller
> > emulation (by Gregory), which is obviously incomplete/hacky, but still
> > works well enough for Linux keyboard/mouse that it's useful to people.
> > If someone wants to pick that up, I would be very happy :)
> 
> What you mean by 'pick that up' ? Is Gregory's USB code available
> somewhere and have to be cleaned and upstreamed ? Please point me to
> this pice and I would check what I can do.

It's here:
https://github.com/0xabu/qemu/blob/raspi/hw/usb/bcm2835_usb.c

I have not attempted to understand the code, or the controller it emulates but 
the first comment says a lot:
/* This is wrong at so many levels, but well, I'm releasing it anyway */

Cheers,
Andrew

Re: [Qemu-devel] [PATCH v2 0/7] Raspberry Pi 2 support

2016-01-06 Thread Piotr Król

On Wed, Jan 06, 2016 at 06:27:54PM +, Andrew Baumann wrote:
> Hi Piotr,
> 
> Sorry, I just noticed this email. Please keep me in the To or CC, as I am not 
> good at keeping up with all the traffic on qemu-devel.

Hi Andrew,
this was my fault instead reply-to-all I send reply to mailing list. I
realized after sending.

> 
> > From: qemu-devel-bounces+andrew.baumann=microsoft@nongnu.org
> > [mailto:qemu-devel-
> > bounces+andrew.baumann=microsoft@nongnu.org] On Behalf Of Piotr
> > Król
> > Sent: Wednesday, 30 December 2015 17:14
> 
> > First, I tried your code from raspi branch (ar7-raspi doesn't compile [1]).
> > Using recent Raspbian 2015-11-21-raspbian-jessie (same results I saw for
> > 2015-09-24). I'm getting kernel panic:
> > 
> > [6.892677] random: systemd urandom read with 7 bits of entropy available
> > [6.908292] Kernel panic - not syncing: Attempted to kill init!
> > exitcode=0x0004
> 
> This is most likely an unimplemented setend instruction in the
> userspace memcmp. You need to mount the image, and comment out the
> contents of (or just remove) /etc/ld.so.preload, then it should boot
> further.

Yes, I figured this out.

> 
> BTW, I mentioned this briefly in the notes on Linux at the end of this page:
> https://github.com/0xabu/qemu/wiki
> 
> > Third test I tried is to apply patches from this series on top of master
> > (38a762fec63f), what cause hang on:
> > 
> > [   15.094558 ] mmc0: Timeout waiting for hardware interrupt
> 
> Hmm, I have seen this error on the raspi machine, but I don't think I
> saw it on raspi2 before. Does it also occur on the older (09-24)
> image? I've seen this on raspi, but it only occurs when using the DMA
> controller, and in the patch series presently on list, there is no DMA
> controller included, so it can't be that.
> 

I tried v2 with qemu 38a762fec63f using:

$ qemu-system-arm -M raspi2 -kernel raspbian-boot/kernel7.img -sd 
2015-11-21-raspbian-jessie.img -append "rw earlyprintk loglevel=8 
console=ttyAMA0,115200 dwc_otg.lpm_enable=0 root=/dev/mmcblk0p2" -dtb 
raspbian-boot/bcm2709-rpi-2-b.dtb -usbdevice mouse -usbdevice keyboard -serial 
stdio

WARNING: Image format was not specified for '2015-11-21-raspbian-jessie.img' 
and probing guessed raw.
 Automatically detecting the format is dangerous for raw images, write 
operations on block 0 will be restricted.
 Specify the 'raw' format explicitly to remove the restrictions.
qemu-system-arm: -usbdevice mouse: Error: no usb bus to attach usbdevice mouse, 
please try -machine usb=on and check that the machine model supports USB
qemu-system-arm: -usbdevice mouse: could not add USB device 'mouse'
 
Removing USB:
qemu-system-arm -M raspi2 -kernel 2015-11-21-raspbian-boot/kernel7.img -sd 
2015-11-21-raspbian-jessie.img -append "rw earlyprintk loglevel=8 
console=ttyAMA0,115200 dwc_otg.lpm_enable=0 root=/dev/mmcblk0p2" -dtb 
2015-11-21-raspbian-boot/bcm2709-rpi-2-b.dtb -serial stdio

[5.071960] uart-pl011 3f201000.uart: no DMA platform data
[   15.083445] mmc0: Timeout waiting for hardware interrupt.


Same stuff with 2015-09-24.

Trying your GitHub raspi branch everything works fine after commenting in
/etc/ld.so.preload.

> > I do not follow this mailing list, so maybe I missing some pieces.
> > If I need some other patches to test this series please let me know.
> 

> There are other patches (on the list) with changes to sd and sdhci
> emulation needed to boot Windows, but for Linux I believe this is all
> you need.
> 
> > My questions:
> > 
> > Does 2015-09-24-raspbian-jessie.vhd extracted Linux partition from raspbian
> > image or simply img converted to vhd format ? Is there any problem with
> > using img ?
> 
> No, img is fine. I converted it to a VHD container, because it's
> easier to work with on Windows (and avoids qemu's warning about raw
> images).
> 
> > Would you mind to issue tracking on GitHub and pull requests or you want
> > whole
> > communication and possible fixes go through QEMU mailing list ? I think that
> > it
> > maybe useful for code that is not ready for upstreaming. I saw you have
> > issue
> > tracking disabled.
> 
> I'm in a bind here. My original goal was simply to contribute the
> raspi2 + Windows guest support. Then I discovered that the raspi work
> itself was not yet in mainline qemu, and so I've been working to clean
> that up and get it merged along with the raspi2 work. However, I don't
> have the time/resources to be a long-term maintainer of everything
> Pi-related. I'm happy to enable issue tracking on github, but don't
> want to give a false impression as to the level of support it's likely
> to receive :) On pull requests: my goal is to get everything in my
> raspi branch merged upstream (which is why I recently dropped some
> stub device emulations that weren't needed); assuming that happens I'd
> prefer if subsequent changes went through the qemu patch process,
> since ultimately that's how they'll get into mainline and be
> main

[Qemu-devel] [PATCH v8 23.5/35] qmp: Tighten output visitor rules, part 2

2016-01-06 Thread Eric Blake

Nothing was using the return value of qmp_output_pop().  Also,
adding a parameter will let us diagnose any programming bugs
due to mismatched push(struct)/pop(list) or push(list)/pop(struct).

Signed-off-by: Eric Blake 

---
v9: new patch; probably worth squashing into 23/35
---
 qapi/qmp-output-visitor.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index f2c39c5..8a297a6 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -1,7 +1,7 @@
 /*
  * Core Definitions for QAPI/QMP Command Registry
  *
- * Copyright (C) 2015 Red Hat, Inc.
+ * Copyright (C) 2015-2016 Red Hat, Inc.
  * Copyright IBM, Corp. 2011
  *
  * Authors:
@@ -57,17 +57,15 @@ static void qmp_output_push_obj(QmpOutputVisitor *qov, 
QObject *value)
 QTAILQ_INSERT_HEAD(&qov->stack, e, node);
 }

-/* Grab and remove the most recent QObject from the stack */
-static QObject *qmp_output_pop(QmpOutputVisitor *qov)
+/* Remove the most recent QObject with given type from the stack */
+static void qmp_output_pop(QmpOutputVisitor *qov, QType type)
 {
 QStackEntry *e = QTAILQ_FIRST(&qov->stack);
-QObject *value;

 assert(e);
 QTAILQ_REMOVE(&qov->stack, e, node);
-value = e->value;
+assert(qobject_type(e->value) == type);
 g_free(e);
-return value;
 }

 /* Grab the most recent QObject from the stack, if any */
@@ -120,7 +118,7 @@ static void qmp_output_start_struct(Visitor *v, const char 
*name, void **obj,
 static void qmp_output_end_struct(Visitor *v, Error **errp)
 {
 QmpOutputVisitor *qov = to_qov(v);
-qmp_output_pop(qov);
+qmp_output_pop(qov, QTYPE_QDICT);
 }

 static void qmp_output_start_list(Visitor *v, const char *name, Error **errp)
@@ -151,7 +149,7 @@ static GenericList *qmp_output_next_list(Visitor *v, 
GenericList **listp,
 static void qmp_output_end_list(Visitor *v)
 {
 QmpOutputVisitor *qov = to_qov(v);
-qmp_output_pop(qov);
+qmp_output_pop(qov, QTYPE_QLIST);
 }

 static void qmp_output_type_int64(Visitor *v, const char *name, int64_t *obj,
-- 
2.4.3

Re: [Qemu-devel] [PATCH v8 23/35] qmp: Tighten output visitor rules

2016-01-06 Thread Eric Blake

On 01/05/2016 07:05 AM, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
>> Add a new qmp_output_visitor_reset(), which must be called before
>> reusing an exising QmpOutputVisitor on a new root object.  Tighten
>> assertions to require that qmp_output_get_qobject() can only be
>> called after pairing a visit_end_* for every visit_start_* (rather
>> than allowing it to return a partially built object), that it must
>> not be called unless at least one visit_type_* or visit_start/
>> visit_end pair has occurred since creation/reset (the accidental
>> return of NULL fixed by commit ab8bf1d7 would have been much
>> easier to diagnose), and that it may only be called once per visit.
>>
>> To keep the semantics of test_visitor_out_empty, we now have to
>> explicitly request a top-level visit of a NULL object, by
>> implementing the just-added visitor type_null() callback.
>>
>> Signed-off-by: Eric Blake 
>>

>> -void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
>> +void qmp_output_visitor_reset(QmpOutputVisitor *v)
>>  {
>>  QStackEntry *e, *tmp;
>>
>> @@ -231,6 +235,12 @@ void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
>>  }
>>
>>  qobject_decref(v->root);
>> +v->root = NULL;
>> +}
>> +
>> +void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
> 
> It would make sense to call it _free() imho..

Pre-existing name, so a cleanup should be in a separate patch.


>> @@ -459,6 +460,7 @@ static void test_visitor_out_empty(TestOutputVisitorData 
>> *data,
>>  {
>>  QObject *arg;
>>
>> +visit_type_null(data->ov, NULL, &error_abort);
> 
> I guess this is going to be used outside of just this artificial case.
> Otherwise, I would have suggested to get rid of it.

Used in 24/35 (in fact, it was discussion on that topic just before 2.5
that prompted me to even add visit_type_null()).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v8 22.5/35] qmp: Support explicit null on input visit

2016-01-06 Thread Eric Blake

Implement the new type_null() callback for the qmp input visitor.
While we don't yet have a use for this in qapi (the generator
will need some tweaks first), one usage is already envisioned:
when changing blockdev parameters, it would be nice to have a
difference between leaving a tuning parameter unchanged (omit
that parameter from the struct) and to explicitly reset the
parameter to its default without having to know what the default
value is (specify the parameter with an explicit null value,
which will require us to allow a qapi alternate that chooses
between the normal values and an explicit null).

At any rate, we can test this without the use of generated qapi
by manually using visit_start_struct()/visit_end_struct().

Signed-off-by: Eric Blake 

---
v9: new patch
---
 include/qapi/visitor-impl.h|  2 +-
 qapi/qmp-input-visitor.c   | 14 ++
 tests/test-qmp-input-visitor.c | 26 ++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 8705136..95408a5 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -76,7 +76,7 @@ struct Visitor
 void (*type_any)(Visitor *v, const char *name, QObject **obj,
  Error **errp);
 /* Must be provided to visit explicit null values (right now, only the
- * dealloc visitor supports this).  */
+ * dealloc and qmp-input visitors support this).  */
 void (*type_null)(Visitor *v, const char *name, Error **errp);

 /* May be NULL; most useful for input visitors. */
diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 597652c..ad23bec 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -315,6 +315,19 @@ static void qmp_input_type_any(Visitor *v, const char 
*name, QObject **obj,
 *obj = qobj;
 }

+static void qmp_input_type_null(Visitor *v, const char *name, Error **errp)
+{
+QmpInputVisitor *qiv = to_qiv(v);
+QObject *qobj = qmp_input_get_object(qiv, name, true);
+
+if (qobject_type(qobj) == QTYPE_QNULL) {
+return;
+}
+
+error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
+   "null");
+}
+
 static void qmp_input_optional(Visitor *v, const char *name, bool *present)
 {
 QmpInputVisitor *qiv = to_qiv(v);
@@ -358,6 +371,7 @@ QmpInputVisitor *qmp_input_visitor_new(QObject *obj)
 v->visitor.type_str = qmp_input_type_str;
 v->visitor.type_number = qmp_input_type_number;
 v->visitor.type_any = qmp_input_type_any;
+v->visitor.type_null = qmp_input_type_null;
 v->visitor.optional = qmp_input_optional;
 v->visitor.get_next_type = qmp_input_get_next_type;

diff --git a/tests/test-qmp-input-visitor.c b/tests/test-qmp-input-visitor.c
index f6bd408..6489e4a 100644
--- a/tests/test-qmp-input-visitor.c
+++ b/tests/test-qmp-input-visitor.c
@@ -278,6 +278,30 @@ static void test_visitor_in_any(TestInputVisitorData *data,
 qobject_decref(res);
 }

+static void test_visitor_in_null(TestInputVisitorData *data,
+ const void *unused)
+{
+Visitor *v;
+QObject *null;
+
+v = visitor_input_test_init(data, "null");
+visit_type_null(v, NULL, &error_abort);
+
+v = visitor_input_test_init(data, "{ 'a': null }");
+visit_start_struct(v, NULL, NULL, 0, &error_abort);
+visit_type_null(v, "a", &error_abort);
+visit_end_struct(v, &error_abort);
+
+/* Check that qnull reference counting is sane:
+ * 1 for global use, 1 for our qnull() use, and 1 still owned by 'v'
+ * until it is torn down */
+null = qnull();
+g_assert(null->refcnt == 3);
+visitor_input_teardown(data, NULL);
+g_assert(null->refcnt == 2);
+qobject_decref(null);
+}
+
 static void test_visitor_in_union_flat(TestInputVisitorData *data,
const void *unused)
 {
@@ -792,6 +816,8 @@ int main(int argc, char **argv)
&in_visitor_data, test_visitor_in_list);
 input_visitor_test_add("/visitor/input/any",
&in_visitor_data, test_visitor_in_any);
+input_visitor_test_add("/visitor/input/null",
+   &in_visitor_data, test_visitor_in_null);
 input_visitor_test_add("/visitor/input/union-flat",
&in_visitor_data, test_visitor_in_union_flat);
 input_visitor_test_add("/visitor/input/alternate",
-- 
2.4.3

Re: [Qemu-devel] [PULL 49/57] Round up RAMBlock sizes to host page sizes

2016-01-06 Thread Paolo Bonzini



On 30/12/2015 01:26, Peter Crosthwaite wrote:
> On Tue, Nov 10, 2015 at 6:25 AM, Juan Quintela  wrote:
>> From: "Dr. David Alan Gilbert" 
>>
>> RAMBlocks that are not a multiple of host pages in length
>> cause problems for postcopy (I've seen an ACPI table on aarch64
>> be 5k in length - i.e. 5x target-page), so round RAMBlock sizes
>> up to a host-page.
>>
>> This potentially breaks migration compatibility due to changes
>> in RAMBlock sizes; however:
>>1) x86 and s390 I think always have host=target page size
>>2) When I've tried on Power the block sizes already seem aligned.
>>3) I don't think there's anything else that maintains per-version
>>   machine-types for compatibility.
>>
> 
> Is there any reason this shouldn't be converted to
> REAL_HOST_PAGE_ALIGN given these restrictions?
> 
> I'm thinking about multi-arch, where HOST_PAGE_ALIGN is inaccessible
> from exec.c as it is target-arch specific. My previous workaround was
> to define the target page size for multi-arch so that TARGET_PAGE_SIZE
> was usable but not sure that should go viral to these other defs.

host_page_size and host_page_mask are really just MAX(host_page_size,
TARGET_PAGE_SIZE) and MIN(qemu_host_page_mask, TARGET_PAGE_MASK).  If
you convert them to macros this way, the hack you use for multi-arch
TARGET_PAGE_SIZE will work transparently for HOST_PAGE_{SIZE,MASK}.

Paolo

Re: [Qemu-devel] [PATCH 2/4] macio: add dma_active to VMStateDescription

2016-01-06 Thread Mark Cave-Ayland

On 06/01/16 20:57, John Snow wrote:

> On 01/06/2016 03:37 PM, Mark Cave-Ayland wrote:
>> Make sure that we include the value of dma_active in the migration stream.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/ide/macio.c |3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
>> index 560c071..695d4d2 100644
>> --- a/hw/ide/macio.c
>> +++ b/hw/ide/macio.c
>> @@ -518,11 +518,12 @@ static const MemoryRegionOps pmac_ide_ops = {
>>  
>>  static const VMStateDescription vmstate_pmac = {
>>  .name = "ide",
>> -.version_id = 3,
>> +.version_id = 4,
>>  .minimum_version_id = 0,
>>  .fields = (VMStateField[]) {
>>  VMSTATE_IDE_BUS(bus, MACIOIDEState),
>>  VMSTATE_IDE_DRIVES(bus.ifs, MACIOIDEState),
>> +VMSTATE_BOOL(dma_active, MACIOIDEState),
>>  VMSTATE_END_OF_LIST()
>>  }
>>  };
>>
> 
> Did you wind up ever observing this value to be non-zero when it was
> written to the migration stream?
> 
> I really did think that we should be able to assume this was always
> false due to how migration will drain all outstanding AIO, but maybe I
> am mistaken.

I think this can happen because Darwin/MacOS sets the DBDMA processor
running first *before* the IDE request is issued, compared to pretty
much every other OS which issues the IDE request *first* which then in
turn invokes the DMA engine (which is the general assumption in the QEMU
IDE/DMA APIs).

So there could be a window where the DBDMA is programmed and active but
migration takes place before the corresponding IDE request has been
issued (which is exactly the situation that this flag handles).

ATB,

Mark.

Re: [Qemu-devel] [PATCH 1/4] macio: use the existing IDEDMA aiocb to hold the active DMA aiocb

2016-01-06 Thread John Snow



On 01/06/2016 03:37 PM, Mark Cave-Ayland wrote:
> Currently the aiocb is held within MACIOIDEState, however the IDE core code
> assumes that the current actvie DMA aiocb is held in aiocb in a few places,
> e.g. ide_bus_reset() and ide_reset().
> 
> Switch over to using IDEDMA aiocb to store the aiocb for the current active
> DMA request so that bus resets and restarts are handled correctly. As a
> consequence we can now use ide_set_inactive() rather than handling its
> functionality ourselves.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/ide/macio.c |   20 
>  hw/ppc/mac.h   |1 -
>  2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 3ee962f..560c071 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -119,8 +119,8 @@ static void pmac_dma_read(BlockBackend *blk,
>  MACIO_DPRINTF("--- Block read transfer - sector_num: %" PRIx64 "  "
>"nsector: %x\n", (offset >> 9), (bytes >> 9));
>  
> -m->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov, (bytes >> 9),
> - cb, io);
> +s->bus->dma->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov,
> + (bytes >> 9), cb, io);
>  }
>  
>  static void pmac_dma_write(BlockBackend *blk,
> @@ -204,8 +204,8 @@ static void pmac_dma_write(BlockBackend *blk,
>  MACIO_DPRINTF("--- Block write transfer - sector_num: %" PRIx64 "  "
>"nsector: %x\n", (offset >> 9), (bytes >> 9));
>  
> -m->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov, (bytes >> 9),
> -  cb, io);
> +s->bus->dma->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov,
> + (bytes >> 9), cb, io);
>  }
>  
>  static void pmac_dma_trim(BlockBackend *blk,
> @@ -231,8 +231,8 @@ static void pmac_dma_trim(BlockBackend *blk,
>  s->io_buffer_index += io->len;
>  io->len = 0;
>  
> -m->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov, (bytes >> 9),
> -  cb, io);
> +s->bus->dma->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov,
> + (bytes >> 9), cb, io);
>  }
>  
>  static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> @@ -291,6 +291,8 @@ done:
>  } else {
>  block_acct_done(blk_get_stats(s->blk), &s->acct);
>  }
> +
> +ide_set_inactive(s, false);
>  io->dma_end(opaque);
>  
>  return;
> @@ -307,7 +309,6 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
>  
>  if (ret < 0) {
>  MACIO_DPRINTF("DMA error: %d\n", ret);
> -m->aiocb = NULL;
>  ide_dma_error(s);
>  goto done;
>  }
> @@ -358,6 +359,8 @@ done:
>  block_acct_done(blk_get_stats(s->blk), &s->acct);
>  }
>  }
> +
> +ide_set_inactive(s, false);
>  io->dma_end(opaque);
>  }
>  
> @@ -395,8 +398,9 @@ static void pmac_ide_transfer(DBDMA_io *io)
>  static void pmac_ide_flush(DBDMA_io *io)
>  {
>  MACIOIDEState *m = io->opaque;
> +IDEState *s = idebus_active_if(&m->bus);
>  
> -if (m->aiocb) {
> +if (s->bus->dma->aiocb) {
>  blk_drain_all();
>  }
>  }
> diff --git a/hw/ppc/mac.h b/hw/ppc/mac.h
> index e375ed2..ecf7792 100644
> --- a/hw/ppc/mac.h
> +++ b/hw/ppc/mac.h
> @@ -134,7 +134,6 @@ typedef struct MACIOIDEState {
>  
>  MemoryRegion mem;
>  IDEBus bus;
> -BlockAIOCB *aiocb;
>  IDEDMA dma;
>  void *dbdma;
>  bool dma_active;
> 

Reviewed-by: John Snow

Re: [Qemu-devel] [PATCH 2/4] macio: add dma_active to VMStateDescription

2016-01-06 Thread John Snow



On 01/06/2016 03:37 PM, Mark Cave-Ayland wrote:
> Make sure that we include the value of dma_active in the migration stream.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/ide/macio.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 560c071..695d4d2 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -518,11 +518,12 @@ static const MemoryRegionOps pmac_ide_ops = {
>  
>  static const VMStateDescription vmstate_pmac = {
>  .name = "ide",
> -.version_id = 3,
> +.version_id = 4,
>  .minimum_version_id = 0,
>  .fields = (VMStateField[]) {
>  VMSTATE_IDE_BUS(bus, MACIOIDEState),
>  VMSTATE_IDE_DRIVES(bus.ifs, MACIOIDEState),
> +VMSTATE_BOOL(dma_active, MACIOIDEState),
>  VMSTATE_END_OF_LIST()
>  }
>  };
> 

Did you wind up ever observing this value to be non-zero when it was
written to the migration stream?

I really did think that we should be able to assume this was always
false due to how migration will drain all outstanding AIO, but maybe I
am mistaken.

--js

Re: [Qemu-devel] [PATCH v8 2/4] hw/ptimer: Perform tick and counter wrap around if timer already expired

2016-01-06 Thread Dmitry Osipenko


06.01.2016 16:59, Peter Crosthwaite пишет:


+
+if (expired) {
+/* Wrap around periodic counter.  */
+counter = s->delta = s->limit - counter % s->limit;



Why do you update the delta here?



Because we would want to schedule next tick based on current wrapped around
counter value and not some arbitrary delta.



So looking at ptimer_reload(), the new schedule is done relative to
the VM clock of the when the tick was expected to hit, not the current
time. But this new delta is going to be relative to the now time and
then used to update the next tick which will happen relative to
next_event. Unless you stop or scale the timer, I don't think you need
to do delta manipulation?



Yes, I missed that next_event would be set earlier (like you described) in case 
of expired timer. Thanks for the note, will fix it.



Also can you just get ptimer_reload to do the modulo math for you? If the
timer is !oneshot and expired, then you call ptimer_reload anyway,
which will update next_event. When the expired test returns false
you can just reliably use the original logic involving now and next.



Yes, that's what I changed in V9. Have you received it?

https://lists.nongnu.org/archive/html/qemu-devel/2016-01/msg00272.html



Just had a look.

V9 still has the modulo I think?:

+if (expired && (counter != 0)) {
+/* Wrap around periodic counter.  */
+counter = s->delta = s->limit - counter % s->limit;
+}



Modulo is there, I just meant that V9 changed to call ptimer_reload() on counter 
== 0. As noted above, ptimer_reload would adjust next_event, so s->delta
shouldn't be set to the wrapped around counter. However it should be set to the 
limit, since delta might been altered by ptimer_set_count.


--
Dmitry

Re: [Qemu-devel] [PATCH 07/13] tests: Add test code for meta bitmap

2016-01-06 Thread John Snow



On 01/04/2016 05:27 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  tests/test-hbitmap.c | 113 
> +++
>  1 file changed, 113 insertions(+)
> 
> diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
> index abcea0c..19136e7 100644
> --- a/tests/test-hbitmap.c
> +++ b/tests/test-hbitmap.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include "qemu/hbitmap.h"
> +#include "block/block.h"
>  
>  #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
>  
> @@ -23,6 +24,7 @@
>  
>  typedef struct TestHBitmapData {
>  HBitmap   *hb;
> +HBitmap   *meta;
>  unsigned long *bits;
>  size_t size;
>  size_t old_size;
> @@ -94,6 +96,14 @@ static void hbitmap_test_init(TestHBitmapData *data,
>  }
>  }
>  
> +static void hbitmap_test_init_meta(TestHBitmapData *data,
> +   uint64_t size, int granularity,
> +   int meta_chunk)
> +{
> +hbitmap_test_init(data, size, granularity);
> +data->meta = hbitmap_create_meta(data->hb, meta_chunk);
> +}
> +
>  static inline size_t hbitmap_test_array_size(size_t bits)
>  {
>  size_t n = (bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
> @@ -637,6 +647,103 @@ static void 
> test_hbitmap_truncate_shrink_large(TestHBitmapData *data,
>  hbitmap_test_truncate(data, size, -diff, 0);
>  }
>  
> +static void hbitmap_check_meta(TestHBitmapData *data,
> +   int64_t start, int count)
> +{
> +int64_t i;
> +
> +for (i = 0; i < data->size; i++) {
> +if (i >= start && i < start + count) {
> +g_assert(hbitmap_get(data->meta, i));
> +} else {
> +g_assert(!hbitmap_get(data->meta, i));
> +}
> +}
> +}
> +
> +static void hbitmap_test_meta(TestHBitmapData *data,
> +  int64_t start, int count,
> +  int64_t check_start, int check_count)
> +{
> +hbitmap_reset_all(data->hb);
> +hbitmap_reset_all(data->meta);
> +
> +/* Test "unset" -> "unset" will not update meta. */
> +hbitmap_reset(data->hb, start, count);
> +hbitmap_check_meta(data, 0, 0);
> +
> +/* Test "unset" -> "set" will update meta */
> +hbitmap_set(data->hb, start, count);
> +hbitmap_check_meta(data, check_start, check_count);
> +
> +/* Test "set" -> "set" will not update meta */
> +hbitmap_reset_all(data->meta);
> +hbitmap_set(data->hb, start, count);
> +hbitmap_check_meta(data, 0, 0);
> +
> +/* Test "set" -> "unset" will update meta */
> +hbitmap_reset_all(data->meta);
> +hbitmap_reset(data->hb, start, count);
> +hbitmap_check_meta(data, check_start, check_count);
> +}
> +
> +static void hbitmap_test_meta_do(TestHBitmapData *data, int chunk_size)
> +{
> +uint64_t size = chunk_size * 100;
> +hbitmap_test_init_meta(data, size, 0, chunk_size);
> +
> +hbitmap_test_meta(data, 0, 1, 0, chunk_size);
> +hbitmap_test_meta(data, 0, chunk_size, 0, chunk_size);
> +hbitmap_test_meta(data, chunk_size - 1, 1, 0, chunk_size);
> +hbitmap_test_meta(data, chunk_size - 1, 2, 0, chunk_size * 2);
> +hbitmap_test_meta(data, chunk_size - 1, chunk_size + 1, 0, chunk_size * 
> 2);
> +hbitmap_test_meta(data, chunk_size - 1, chunk_size + 2, 0, chunk_size * 
> 3);
> +hbitmap_test_meta(data, 7 * chunk_size - 1, chunk_size + 2,
> +  6 * chunk_size, chunk_size * 3);
> +hbitmap_test_meta(data, size - 1, 1, size - chunk_size, chunk_size);
> +hbitmap_test_meta(data, 0, size, 0, size);
> +}
> +
> +static void test_hbitmap_meta_byte(TestHBitmapData *data, const void *unused)
> +{
> +hbitmap_test_meta_do(data, BITS_PER_BYTE);
> +}
> +
> +static void test_hbitmap_meta_word(TestHBitmapData *data, const void *unused)
> +{
> +hbitmap_test_meta_do(data, BITS_PER_LONG);
> +}
> +
> +static void test_hbitmap_meta_sector(TestHBitmapData *data, const void 
> *unused)
> +{
> +hbitmap_test_meta_do(data, BDRV_SECTOR_SIZE * BITS_PER_BYTE);
> +}
> +
> +/**
> + * Create an HBitmap and test set/unset.
> + */
> +static void test_hbitmap_meta_one(TestHBitmapData *data, const void *unused)
> +{
> +int i;
> +int64_t offsets[] = {
> +0, 1, L1 - 1, L1, L1 + 1, L2 - 1, L2, L2 + 1, L3 - 1, L3, L3 + 1
> +};
> +
> +hbitmap_test_init_meta(data, L3 * 2, 0, 1);
> +for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> +hbitmap_test_meta(data, offsets[i], 1, offsets[i], 1);
> +hbitmap_test_meta(data, offsets[i], L1, offsets[i], L1);
> +hbitmap_test_meta(data, offsets[i], L2, offsets[i], L2);
> +}
> +}
> +
> +static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
> +{
> +hbitmap_test_init_meta(data, 0, 0, 1);
> +
> +hbitmap_check_meta(data, 0, 0);
> +}
> +
>  static void hbitmap_test_add(const char *testpath,
> void (*test_func)(TestHB

[Qemu-devel] [PATCH 4/4] cuda: add missing fields to VMStateDescription

2016-01-06 Thread Mark Cave-Ayland

Include some fields missed from the previous VMState conversion to the
migration stream, as well as the new SR_INT delay timer.

Signed-off-by: Mark Cave-Ayland 
---
 hw/misc/macio/cuda.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 9db4c64..3556852 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -704,15 +704,17 @@ static const VMStateDescription vmstate_cuda_timer = {
 
 static const VMStateDescription vmstate_cuda = {
 .name = "cuda",
-.version_id = 2,
-.minimum_version_id = 2,
+.version_id = 3,
+.minimum_version_id = 3,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(a, CUDAState),
 VMSTATE_UINT8(b, CUDAState),
+VMSTATE_UINT8(last_b, CUDAState),
 VMSTATE_UINT8(dira, CUDAState),
 VMSTATE_UINT8(dirb, CUDAState),
 VMSTATE_UINT8(sr, CUDAState),
 VMSTATE_UINT8(acr, CUDAState),
+VMSTATE_UINT8(last_acr, CUDAState),
 VMSTATE_UINT8(pcr, CUDAState),
 VMSTATE_UINT8(ifr, CUDAState),
 VMSTATE_UINT8(ier, CUDAState),
@@ -727,6 +729,7 @@ static const VMStateDescription vmstate_cuda = {
 VMSTATE_STRUCT_ARRAY(timers, CUDAState, 2, 1,
  vmstate_cuda_timer, CUDATimer),
 VMSTATE_TIMER_PTR(adb_poll_timer, CUDAState),
+VMSTATE_TIMER_PTR(sr_delay_timer, CUDAState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.10.4

[Qemu-devel] [PATCH 1/4] macio: use the existing IDEDMA aiocb to hold the active DMA aiocb

2016-01-06 Thread Mark Cave-Ayland

Currently the aiocb is held within MACIOIDEState, however the IDE core code
assumes that the current actvie DMA aiocb is held in aiocb in a few places,
e.g. ide_bus_reset() and ide_reset().

Switch over to using IDEDMA aiocb to store the aiocb for the current active
DMA request so that bus resets and restarts are handled correctly. As a
consequence we can now use ide_set_inactive() rather than handling its
functionality ourselves.

Signed-off-by: Mark Cave-Ayland 
---
 hw/ide/macio.c |   20 
 hw/ppc/mac.h   |1 -
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 3ee962f..560c071 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -119,8 +119,8 @@ static void pmac_dma_read(BlockBackend *blk,
 MACIO_DPRINTF("--- Block read transfer - sector_num: %" PRIx64 "  "
   "nsector: %x\n", (offset >> 9), (bytes >> 9));
 
-m->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov, (bytes >> 9),
- cb, io);
+s->bus->dma->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_dma_write(BlockBackend *blk,
@@ -204,8 +204,8 @@ static void pmac_dma_write(BlockBackend *blk,
 MACIO_DPRINTF("--- Block write transfer - sector_num: %" PRIx64 "  "
   "nsector: %x\n", (offset >> 9), (bytes >> 9));
 
-m->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov, (bytes >> 9),
-  cb, io);
+s->bus->dma->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_dma_trim(BlockBackend *blk,
@@ -231,8 +231,8 @@ static void pmac_dma_trim(BlockBackend *blk,
 s->io_buffer_index += io->len;
 io->len = 0;
 
-m->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov, (bytes >> 9),
-  cb, io);
+s->bus->dma->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
@@ -291,6 +291,8 @@ done:
 } else {
 block_acct_done(blk_get_stats(s->blk), &s->acct);
 }
+
+ide_set_inactive(s, false);
 io->dma_end(opaque);
 
 return;
@@ -307,7 +309,6 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
 
 if (ret < 0) {
 MACIO_DPRINTF("DMA error: %d\n", ret);
-m->aiocb = NULL;
 ide_dma_error(s);
 goto done;
 }
@@ -358,6 +359,8 @@ done:
 block_acct_done(blk_get_stats(s->blk), &s->acct);
 }
 }
+
+ide_set_inactive(s, false);
 io->dma_end(opaque);
 }
 
@@ -395,8 +398,9 @@ static void pmac_ide_transfer(DBDMA_io *io)
 static void pmac_ide_flush(DBDMA_io *io)
 {
 MACIOIDEState *m = io->opaque;
+IDEState *s = idebus_active_if(&m->bus);
 
-if (m->aiocb) {
+if (s->bus->dma->aiocb) {
 blk_drain_all();
 }
 }
diff --git a/hw/ppc/mac.h b/hw/ppc/mac.h
index e375ed2..ecf7792 100644
--- a/hw/ppc/mac.h
+++ b/hw/ppc/mac.h
@@ -134,7 +134,6 @@ typedef struct MACIOIDEState {
 
 MemoryRegion mem;
 IDEBus bus;
-BlockAIOCB *aiocb;
 IDEDMA dma;
 void *dbdma;
 bool dma_active;
-- 
1.7.10.4

[Qemu-devel] [PATCH 2/4] macio: add dma_active to VMStateDescription

2016-01-06 Thread Mark Cave-Ayland

Make sure that we include the value of dma_active in the migration stream.

Signed-off-by: Mark Cave-Ayland 
---
 hw/ide/macio.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 560c071..695d4d2 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -518,11 +518,12 @@ static const MemoryRegionOps pmac_ide_ops = {
 
 static const VMStateDescription vmstate_pmac = {
 .name = "ide",
-.version_id = 3,
+.version_id = 4,
 .minimum_version_id = 0,
 .fields = (VMStateField[]) {
 VMSTATE_IDE_BUS(bus, MACIOIDEState),
 VMSTATE_IDE_DRIVES(bus.ifs, MACIOIDEState),
+VMSTATE_BOOL(dma_active, MACIOIDEState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.10.4

[Qemu-devel] [PATCH 3/4] mac_dbdma: add DBDMA controller state to VMStateDescription

2016-01-06 Thread Mark Cave-Ayland

Make sure that we include the DBDMA controller state in the migration
stream.

Signed-off-by: Mark Cave-Ayland 
---
 hw/misc/macio/mac_dbdma.c |   40 
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index 5ee8f02..161f49e 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -712,20 +712,52 @@ static const MemoryRegionOps dbdma_ops = {
 },
 };
 
-static const VMStateDescription vmstate_dbdma_channel = {
-.name = "dbdma_channel",
+static const VMStateDescription vmstate_dbdma_io = {
+.name = "dbdma_io",
+.version_id = 0,
+.minimum_version_id = 0,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(addr, struct DBDMA_io),
+VMSTATE_INT32(len, struct DBDMA_io),
+VMSTATE_INT32(is_last, struct DBDMA_io),
+VMSTATE_INT32(is_dma_out, struct DBDMA_io),
+VMSTATE_BOOL(processing, struct DBDMA_io),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_dbdma_cmd = {
+.name = "dbdma_cmd",
 .version_id = 0,
 .minimum_version_id = 0,
 .fields = (VMStateField[]) {
+VMSTATE_UINT16(req_count, dbdma_cmd),
+VMSTATE_UINT16(command, dbdma_cmd),
+VMSTATE_UINT32(phy_addr, dbdma_cmd),
+VMSTATE_UINT32(cmd_dep, dbdma_cmd),
+VMSTATE_UINT16(res_count, dbdma_cmd),
+VMSTATE_UINT16(xfer_status, dbdma_cmd),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_dbdma_channel = {
+.name = "dbdma_channel",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
 VMSTATE_UINT32_ARRAY(regs, struct DBDMA_channel, DBDMA_REGS),
+VMSTATE_STRUCT(io, struct DBDMA_channel, 0, vmstate_dbdma_io, 
DBDMA_io),
+VMSTATE_STRUCT(current, struct DBDMA_channel, 0, vmstate_dbdma_cmd,
+   dbdma_cmd),
 VMSTATE_END_OF_LIST()
 }
 };
 
 static const VMStateDescription vmstate_dbdma = {
 .name = "dbdma",
-.version_id = 2,
-.minimum_version_id = 2,
+.version_id = 3,
+.minimum_version_id = 3,
 .fields = (VMStateField[]) {
 VMSTATE_STRUCT_ARRAY(channels, DBDMAState, DBDMA_CHANNELS, 1,
  vmstate_dbdma_channel, DBDMA_channel),
-- 
1.7.10.4

[Qemu-devel] [PATCH 0/4] ppc: loadvm/savevm fixups for macio/DBDMA

2016-01-06 Thread Mark Cave-Ayland

This patchset fixes up macio/DBDMA to allow migration to succeed for Mac
machines while IDE/DMA requests are in-flight, and in conjunction with the
migration fixup patchset, enables me to successfully migrate live VMs
under TCG.

Migration was tested by running through a complete Darwin install whilst
issuing savevm/loadvm command pairs every minute or so in the monitor
with no visible ill-effects, and resulted in a bootable disk image.

Signed-off-by: Mark Cave-Ayland 

Mark Cave-Ayland (4):
  macio: use the existing IDEDMA aiocb to hold the active DMA aiocb
  macio: add dma_active to VMStateDescription
  mac_dbdma: add DBDMA controller state to VMStateDescription
  cuda: add missing fields to VMStateDescription

 hw/ide/macio.c|   23 ++-
 hw/misc/macio/cuda.c  |7 +--
 hw/misc/macio/mac_dbdma.c |   40 
 hw/ppc/mac.h  |1 -
 4 files changed, 55 insertions(+), 16 deletions(-)

-- 
1.7.10.4

Re: [Qemu-devel] [PATCH v3 4/7] bcm2835_peripherals: add rollup device for bcm2835 peripherals

2016-01-06 Thread Alistair Francis

On Wed, Jan 6, 2016 at 5:32 AM, Peter Crosthwaite
 wrote:
> On Tue, Jan 5, 2016 at 10:07 PM, Andrew Baumann
>  wrote:
>>> From: Alistair Francis [mailto:alistai...@gmail.com]
>>> Sent: Tuesday, 5 January 2016 18:14
>>> On Thu, Dec 31, 2015 at 4:31 PM, Andrew Baumann
>>>  wrote:
>>> > This device maintains all the non-CPU peripherals on bcm2835 (Pi1)
>>> > which are also present on bcm2836 (Pi2). It also implements the
>>> > private address spaces used for DMA and mailboxes.
>> [...]
>>> > +obj = object_property_get_link(OBJECT(dev), "ram", &err);
>>> > +if (obj == NULL) {
>>> > +error_setg(errp, "%s: required ram link not found: %s",
>>> > +   __func__, error_get_pretty(err));
>>> > +return;
>>> > +}
>>>
>>> I only had a quick read of this patch, but this RAM linking looks fine
>>> to me. Out of curiosity is there a reason you use
>>> object_property_get_link() instead of object_property_add_link() in
>>> the init?
>>
>
> The const link system removes the need for the object to have storage
> for the link pointer in state. This means you don't need the state
> field or add_link(), but the only way to get the pointer for your own
> use is to get_link() on yourself. This is slightly simpler but has the
> disadvantage that you cannot unlink and then relink something else (I
> think?).

Ok, that makes sense.

Thanks,

Alistair

>
> I don't have an opinion over which way is more correct so both are
> fine for me but if the QOM people have a preferred style we should
> probably make the two patches consistent.
>
> Regards,
> Peter
>
>> I'm not sure I understand your question... it wouldn't work the other way. I 
>> allocate the ram and add the link using object_property_add_const_link() in 
>> hw/arm/raspi.c. This file needs to consume the ram to setup alias mappings, 
>> so it is using get_link(). (Note there's also level of indirection; raspi 
>> creates bcm2836, which does nothing but get the link set by its parent and 
>> add it to its bcm2835_peripherals child.)
>>
>> I suppose I could do it the other way around (allocate and set link in 
>> bcm2835_peripherals, based on a size passed from the board), but it seemed 
>> more logical to treat the RAM as created/owned of the board rather than the 
>> SoC.
>>
>> Cheers,
>> Andrew

[Qemu-devel] [PATCH v2 1/2] ohci: delay first SOF interrupt

2016-01-06 Thread Laurent Vivier

On overcommitted CPU, kernel can be so slow that an interrupt can
be triggered by the device whereas the driver is not ready to receive
it. This drives us into an infinite loop.

This does not happen on real hardware because real hardware never send
interrupt immediately after the controller has been moved to OPERATION state.

This patch tries to delay the first SOF interrupt to let driver exits from
the critical section (which is not protected against interrupts...)

Some details:

- ohci_irq(): the OHCI interrupt handler, acknowledges the SOF IRQ
  only if the state of the driver (rh_state) is OHCI_STATE_RUNNING.
  So if this interrupt happens and the driver is not in this state,
  the function is called again and again, moving the system to a
  CPU starvation.

- ohci_rh_resume(): the driver re-enables operation with OHCI_USB_OPER.
  In QEMU this start the SOF timer and QEMU starts to send IRQs. As
  the driver is not in OHCI_STATE_RUNNING and not protected against IRQ,
  the ohci_irq() can be called and the driver never moved to
  OHCI_STATE_RUNNING.

Suggested-by: Gerd Hoffmann 
Signed-off-by: Laurent Vivier 
---
 hw/usb/hcd-ohci.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 7d65818..a44fab2 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1231,11 +1231,16 @@ static int ohci_service_ed_list(OHCIState *ohci, 
uint32_t head, int completion)
 return active;
 }
 
-/* Generate a SOF event, and set a timer for EOF */
-static void ohci_sof(OHCIState *ohci)
+/* set a timer for EOF */
+static void ohci_eof_timer(OHCIState *ohci)
 {
 ohci->sof_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 timer_mod(ohci->eof_timer, ohci->sof_time + usb_frame_time);
+}
+/* Set a timer for EOF and generate a SOF event */
+static void ohci_sof(OHCIState *ohci)
+{
+ohci_eof_timer(ohci);
 ohci_set_interrupt(ohci, OHCI_INTR_SF);
 }
 
@@ -1343,7 +1348,12 @@ static int ohci_bus_start(OHCIState *ohci)
 
 trace_usb_ohci_start(ohci->name);
 
-ohci_sof(ohci);
+/* Delay the first SOF event by one frame time as
+ * linux driver is not ready to receive it and
+ * can meet some race conditions
+ */
+
+ohci_eof_timer(ohci);
 
 return 1;
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH v2 2/2] ohci: clear pending SOF on suspend

2016-01-06 Thread Laurent Vivier

On overcommitted CPU, kernel can be so slow that an interrupt can
be triggered by the device whereas the driver is not ready to receive
it. This drives us into an infinite loop.

On suspend, if a SOF interrupt is raised between the stop of the
device processing and the change of the device internal state to
OHCI_USB_SUSPEND (QEMU stops SOF timer on this state change), this
interrupt is never acknowledged.

This patch clears pending SOF interrupt on OHCI_USB_SUSPEND setting.

Some details:

- ohci_irq(): the OHCI interrupt handler, acknowledges the SOF IRQ
  only if the state of the driver (rh_state) is OHCI_STATE_RUNNING.
  So if this interrupt happens and the driver is not in this state,
  the function is called again and again, moving the system to a
  CPU starvation.

- ohci_rh_suspend(): the function stop the operation and acknowledge
  pending interrupts (but doesn't disable it). Later in the function,
  the device is moved to OHCI_SUSPEND_STATE, and the driver to
  OHCI_RH_SUSPENDED. If between the moment when the interrupt is
  acknowledged and the moment when the device is suspended a new
  interrupt is raised, it will be never acknowledged because the
  driver is now not in OHCI_RH_RUNNING state.

Signed-off-by: Laurent Vivier 
---
 hw/usb/hcd-ohci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index a44fab2..b99e44f 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1446,6 +1446,9 @@ static void ohci_set_ctl(OHCIState *ohci, uint32_t val)
 break;
 case OHCI_USB_SUSPEND:
 ohci_bus_stop(ohci);
+/* clear pending SF otherwise linux driver loops in ohci_irq() */
+ohci->intr_status &= ~OHCI_INTR_SF;
+ohci_intr_update(ohci);
 break;
 case OHCI_USB_RESUME:
 trace_usb_ohci_resume(ohci->name);
-- 
1.8.3.1

[Qemu-devel] [PATCH v2 0/2] ohci: try to mimic real hardware command latency

2016-01-06 Thread Laurent Vivier

OHCI linux driver has some critical sections not protected against
device interrupts. Because of real hardware latency, it is generally
not a problem as interrupts cannot be triggered fast enough to happen
during these critical sections.

But theoretically, it can happen. And with QEMU used on an overcommitted
CPU, the vCPU becomes slow enough and it happens.

This series fixes a kernel crash on boot (CPU stuck) when the OHCI driver
tries to resume or suspend the device.

v2: Address comments from Thomas
update code comments
split ohci_sof() to add ohci_eof_timer()

Laurent Vivier (2):
  ohci: delay first SOF interrupt
  ohci: clear pending SOF on suspend

 hw/usb/hcd-ohci.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

-- 
1.8.3.1

Re: [Qemu-devel] What's the advantages of POSTCOPY over CPU-THROTTLE?

2016-01-06 Thread Jason J. Herne


On 01/06/2016 04:57 AM, Dr. David Alan Gilbert wrote:

* Zhangbo (Oscar) (oscar.zhan...@huawei.com) wrote:

Hi all:
　　Postcopy is suitable for migrating guests which have large page change rates. 
It
 1 makes the guest run at the destination ASAP.
 2 makes the downtime of the guest small enough.
 If we don't take the 1st advantage into account, then, its benefit seems 
similar with CPU-THROTTLE: both of them make the guest's downtime small during 
migration.

 CPU-THROTTLE would make the guest's dirtypage rate *smaller than the 
network bandwidth*, in order to make the to_send_page_number in each iteration 
convergent and achieve the small-enough downtime during the last iteration.
 If we adopt POST-COPY here, the guest's dirtypage rate would *become equal 
to the bandwidth*, because we have to fetch its memory from the source side, 
via the network.
 Both of them would introduce performance degradations of the guest, which 
may in turn cause downtime larger.

 So, here comes the question: If we just compare POSTCOPY with CPU-THROTTLE 
for their advantages in decreasing downtime, POSTCOPY seems has no pos over 
CPU-THROTTLE, is that right?

 Meanwhile, Are there any other benifits of POSTCOPY besides the 2 
mentioned above?


It's a good question and they do both try and help solve the same problem.
One problem with cpu-throttle is whether you can throttle the CPU enough to
get the dirty-rate below the rate of the network, and the answer to that is
very workload dependent.  On a large, many-core VM, even a little bit of CPU
can dirty a lot of memory.  Postcopy is guaranteed to finish migration,
irrespective of the workload.

Postcopy is pretty fine-grained, in that only threads that are accessing
pages that are still on the source are blocked, since it allows the use
of async page faults, that means it's even finer grained than the vCPU level,
so many threads come back up to full performance pretty quickly
even if there are a few pages left.



Good answer Dave. FWIW, I completely agree. Using cpu throttling can 
help the situation depending on workload. Postcopy will *always* work. 
One possible side effect of Postcopy is loss of the guest if the network 
connection dies during the postcopy phase of migration. This should be a 
very rare occurrence however. So both methods have their uses.


--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)

Re: [Qemu-devel] [PATCH v2] vfio/common: Check iova with limit not with size

2016-01-06 Thread Alex Williamson

On Tue, 2016-01-05 at 17:03 +0100, Pierre Morel wrote:
> In vfio_listener_region_add(), the code makes sure
> that the offset in the section is lower than the size
> of the section.
> But the calculation uses size of the region instead of
> the region's limit (size - 1).

We're really just trying to validate that the region is not zero sized
and hasn't overflowed the addresses space.

> This leads to Int128 overflow when the region has
> been initialized to UINT64_MAX because in this case
> memory_region_init() transform the size from UINT64_MAX
> to int128_2_64().
> 
> Let's really use the limit by sustracting one to the size
> and take care to use the limit for functions using limit
> and size to call functions which need size.
> 
> Signed-off-by: Pierre Morel 
> ---
>  hw/vfio/common.c |   15 ++-
>  1 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6797208..fe4962a 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -342,18 +342,23 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>  
>  iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>  llend = int128_make64(section->offset_within_address_space);
> -llend = int128_add(llend, section->size);
> +
> +if (int128_ge(llend, int128_2_64())) {

We've just set llend using int128_make64, so this is guaranteed false.

> +llend = int128_add(llend, int128_sub(section->size, int128_one()));
> +} else {
> +llend = int128_add(llend, section->size);
> +}

So the above changed nothing.

>  llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
>  
> -if (int128_ge(int128_make64(iova), llend)) {
> +if (int128_gt(int128_make64(iova), llend)) {

And this allows zero sized regions through.

>  return;
>  }
>  end = int128_get64(llend);
>  
> -if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
> +if ((iova < container->min_iova) || (end > container->max_iova)) {
>  error_report("vfio: IOMMU container %p can't map guest IOVA region"
>   " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> - container, iova, end - 1);
> + container, iova, end);

This looks wrong too, max_iova is set to the last valid iova, for
instance if the iommu only supported a 4k address space, max_iova would
be 0xfff.  A mapping of size 4k at offset 0 should work, but this
change would cause it to fail.

>  ret = -EFAULT;
>  goto fail;
>  }
> @@ -363,7 +368,7 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>  if (memory_region_is_iommu(section->mr)) {
>  VFIOGuestIOMMU *giommu;
>  
> -trace_vfio_listener_region_add_iommu(iova, end - 1);
> +trace_vfio_listener_region_add_iommu(iova, end);
>  /*
>   * FIXME: We should do some checking to see if the
>   * capabilities of the host VFIO IOMMU are adequate to model

I think maybe you want to set end using:

end = int128_get64(int128_sub(llend, int128_one()));

Then removing the -1 in other places becomes correct, BUT we need to
add 1 where we're passing the size, end - iova - > end - iova + 1.
 Thanks,

Alex

Re: [Qemu-devel] [PATCH] virtio-blk: Allow startup of empty cdroms

2016-01-06 Thread P J P

+-- On Wed, 6 Jan 2016, Michal Privoznik wrote --+
|  }
| -if (!blk_is_inserted(conf->conf.blk)) {
| +
| +dinfo = blk_legacy_dinfo(conf->conf.blk);
| +if (!((dinfo && dinfo->media_cd) ||
| +  blk_is_inserted(conf->conf.blk))) {
|  error_setg(errp, "Device needs media, but drive is empty");
|  return;
|  }

  The if expression seems little confusing; Maybe

 ->  if (dinfo && !dinfo->media_cd && !blk_is_inserted(conf->conf.blk))

--
 - P J P
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

Re: [Qemu-devel] [PATCH v2 0/7] Raspberry Pi 2 support

2016-01-06 Thread Andrew Baumann

Hi Piotr,

Sorry, I just noticed this email. Please keep me in the To or CC, as I am not 
good at keeping up with all the traffic on qemu-devel.

> From: qemu-devel-bounces+andrew.baumann=microsoft@nongnu.org
> [mailto:qemu-devel-
> bounces+andrew.baumann=microsoft@nongnu.org] On Behalf Of Piotr
> Król
> Sent: Wednesday, 30 December 2015 17:14

> First, I tried your code from raspi branch (ar7-raspi doesn't compile [1]).
> Using recent Raspbian 2015-11-21-raspbian-jessie (same results I saw for
> 2015-09-24). I'm getting kernel panic:
> 
> [6.892677] random: systemd urandom read with 7 bits of entropy available
> [6.908292] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0004

This is most likely an unimplemented setend instruction in the userspace 
memcmp. You need to mount the image, and comment out the contents of (or just 
remove) /etc/ld.so.preload, then it should boot further.

BTW, I mentioned this briefly in the notes on Linux at the end of this page:
https://github.com/0xabu/qemu/wiki

> Third test I tried is to apply patches from this series on top of master
> (38a762fec63f), what cause hang on:
> 
> [   15.094558 ] mmc0: Timeout waiting for hardware interrupt

Hmm, I have seen this error on the raspi machine, but I don't think I saw it on 
raspi2 before. Does it also occur on the older (09-24) image? I've seen this on 
raspi, but it only occurs when using the DMA controller, and in the patch 
series presently on list, there is no DMA controller included, so it can't be 
that.

> I do not follow this mailing list, so maybe I missing some pieces.
> If I need some other patches to test this series please let me know.

There are other patches (on the list) with changes to sd and sdhci emulation 
needed to boot Windows, but for Linux I believe this is all you need.

> My questions:
> 
> Does 2015-09-24-raspbian-jessie.vhd extracted Linux partition from raspbian
> image or simply img converted to vhd format ? Is there any problem with
> using img ?

No, img is fine. I converted it to a VHD container, because it's easier to work 
with on Windows (and avoids qemu's warning about raw images).

> Would you mind to issue tracking on GitHub and pull requests or you want
> whole
> communication and possible fixes go through QEMU mailing list ? I think that
> it
> maybe useful for code that is not ready for upstreaming. I saw you have
> issue
> tracking disabled.

I'm in a bind here. My original goal was simply to contribute the raspi2 + 
Windows guest support. Then I discovered that the raspi work itself was not yet 
in mainline qemu, and so I've been working to clean that up and get it merged 
along with the raspi2 work. However, I don't have the time/resources to be a 
long-term maintainer of everything Pi-related. I'm happy to enable issue 
tracking on github, but don't want to give a false impression as to the level 
of support it's likely to receive :) On pull requests: my goal is to get 
everything in my raspi branch merged upstream (which is why I recently dropped 
some stub device emulations that weren't needed); assuming that happens I'd 
prefer if subsequent changes went through the qemu patch process, since 
ultimately that's how they'll get into mainline and be maintained in the longer 
term. (I'm happy to be CCed on patches and provide feedback.)

BTW, the biggest roadblock in my plan above is the USB host controller 
emulation (by Gregory), which is obviously incomplete/hacky, but still works 
well enough for Linux keyboard/mouse that it's useful to people. If someone 
wants to pick that up, I would be very happy :)

Cheers,
Andrew

Re: [Qemu-devel] [PATCH] hw/dma/xilinx_axidma: remove dead code

2016-01-06 Thread Alistair Francis

On Wed, Jan 6, 2016 at 7:54 AM, Eric Blake  wrote:
> On 01/06/2016 05:53 AM, Andrew Jones wrote:
>> stream_desc_show() (and DEBUG_ENET) appear to be unused, as the
>> function isn't compilable (there are broken PRI format strings).
>>
>> Signed-off-by: Andrew Jones 
>> ---
>>  hw/dma/xilinx_axidma.c | 10 --
>>  1 file changed, 10 deletions(-)
>
> Reviewed-by: Eric Blake 

Reviewed-by: Alistair Francis 

Thanks,

Alistair

>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

[Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-06 Thread Mark Cave-Ayland

During local testing with TCG, intermittent errors were found when trying to
migrate Darwin OS images.

The underlying cause was that Darwin resets the decrementer value to fairly
small values on each interrupt. cpu_ppc_set_tb_clk() sets the default value
of the decrementer to 0x during initialisation which typically
corresponds to several seconds. Hence when restoring the image, the guest
would effectively "lose" decrementer interrupts during this time causing
confusion in the guest.

NOTE: there does seem to be some overlap here with the vmstate_ppc_timebase
code, however it doesn't seem to handle multiple CPUs which is why I've gone
for an independent implementation.

Signed-off-by: Mark Cave-Ayland 
---
 target-ppc/machine.c |   49 +
 1 file changed, 49 insertions(+)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index cb56423..5ee6269 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -499,6 +499,54 @@ static const VMStateDescription vmstate_tlbmas = {
 }
 };
 
+static bool decr_needed(void *opaque)
+{
+return true;
+}
+
+static int get_decr_state(QEMUFile *f, void *opaque, size_t size)
+{
+PowerPCCPU *cpu = opaque;
+CPUPPCState *env = &cpu->env;
+
+cpu_ppc_store_decr(env, qemu_get_be32(f));
+
+return 0;
+}
+
+static void put_decr_state(QEMUFile *f, void *opaque, size_t size)
+{
+PowerPCCPU *cpu = opaque;
+CPUPPCState *env = &cpu->env;
+
+qemu_put_be32(f, cpu_ppc_load_decr(env));
+}
+
+static const VMStateInfo vmstate_info_decr = {
+.name = "decr_state",
+.get = get_decr_state,
+.put = put_decr_state
+};
+
+static const VMStateDescription vmstate_decr = {
+.name = "cpu/decr",
+.version_id = 0,
+.minimum_version_id = 0,
+.needed = decr_needed,
+.fields = (VMStateField[]) {
+{
+.name = "cpu/decr",
+.version_id   = 0,
+.field_exists = NULL,
+.size = 0,
+.info = &vmstate_info_decr,
+.flags= VMS_SINGLE,
+.offset   = 0,
+},
+VMSTATE_END_OF_LIST()
+}
+};
+
 const VMStateDescription vmstate_ppc_cpu = {
 .name = "cpu",
 .version_id = 5,
@@ -555,6 +603,7 @@ const VMStateDescription vmstate_ppc_cpu = {
 &vmstate_tlb6xx,
 &vmstate_tlbemb,
 &vmstate_tlbmas,
+&vmstate_decr,
 NULL
 }
 };
-- 
1.7.10.4

[Qemu-devel] [PATCH 3/4] target-ppc: add CPU access_type into the migration stream

2016-01-06 Thread Mark Cave-Ayland

This is referenced in cpu_ppc_handle_mmu_fault() and so should be included
in the migration stream.

Signed-off-by: Mark Cave-Ayland 
---
 target-ppc/machine.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 322ce84..cb56423 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -530,7 +530,7 @@ const VMStateDescription vmstate_ppc_cpu = {
 
 /* Internal state */
 VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
-/* FIXME: access_type? */
+VMSTATE_INT32(env.access_type, PowerPCCPU),
 
 /* Interrupt state */
 VMSTATE_UINT32(env.pending_interrupts, PowerPCCPU),
-- 
1.7.10.4

[Qemu-devel] [PATCH 0/4] target-ppc: migration fixups (TCG related)

2016-01-06 Thread Mark Cave-Ayland

This patchset came out of my work to fix migration for Mac machines under
QEMU running TCG.

Patch 1 was posted to the list a few months ago, but is now part this larger
patchset instead.

Patches 2 and 3 were discovered through a combination of dumping out CPU
structures pre- and post- migration and general code review.

Patch 4 solves the problem that caused random errors when migrating Darwin
images, but seems to duplicate some work that has already been started for
migrating timebase information (see vmstate_ppc_timebase).

I don't have access to any KVM PPC hardware so this has all been tested running
TCG and constantly running savevm/loadvm cycles during a complete Darwin
installation, which in combination with a subsequent macio/DBDMA patchset is
enough to produce a working, bootable image.

Signed-off-by: Mark Cave-Ayland 

Mark Cave-Ayland (4):
  target-ppc: add CPU IRQ state to PPC VMStateDescription
  target-ppc: use cpu_write_xer() helper in cpu_post_load
  target-ppc: add CPU access_type into the migration stream
  target-ppc: ensure we include the decrementer value during migration

 target-ppc/machine.c |   57 --
 1 file changed, 55 insertions(+), 2 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCH 1/4] target-ppc: add CPU IRQ state to PPC VMStateDescription

2016-01-06 Thread Mark Cave-Ayland

Commit a90db15 "target-ppc: Convert ppc cpu savevm to VMStateDescription"
appears to drop the internal CPU IRQ state from the migration stream. Whilst
testing migration on g3beige/mac99 machines, test images would randomly fail to
resume unless a key was pressed on the VGA console.

Further investigation suggests that internal CPU IRQ state isn't being
preserved and so interrupts asserted at the time of migration are lost. Adding
the pending_interrupts and irq_input_state fields back into the migration
stream appears to fix the problem here during local tests.

Signed-off-by: Mark Cave-Ayland 
---
 target-ppc/machine.c |4 
 1 file changed, 4 insertions(+)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index f4ac761..98fc63a 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -532,6 +532,10 @@ const VMStateDescription vmstate_ppc_cpu = {
 VMSTATE_UINTTL(env.hflags_nmsr, PowerPCCPU),
 /* FIXME: access_type? */
 
+/* Interrupt state */
+VMSTATE_UINT32(env.pending_interrupts, PowerPCCPU),
+VMSTATE_UINT32(env.irq_input_state, PowerPCCPU),
+
 /* Sanity checking */
 VMSTATE_UINTTL_EQUAL(env.msr_mask, PowerPCCPU),
 VMSTATE_UINT64_EQUAL(env.insns_flags, PowerPCCPU),
-- 
1.7.10.4

[Qemu-devel] [PATCH 2/4] target-ppc: use cpu_write_xer() helper in cpu_post_load

2016-01-06 Thread Mark Cave-Ayland

Otherwise some internal xer variables fail to get set post-migration.

Signed-off-by: Mark Cave-Ayland 
---
 target-ppc/machine.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 98fc63a..322ce84 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -168,7 +168,7 @@ static int cpu_post_load(void *opaque, int version_id)
 env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
 env->lr = env->spr[SPR_LR];
 env->ctr = env->spr[SPR_CTR];
-env->xer = env->spr[SPR_XER];
+cpu_write_xer(env, env->spr[SPR_XER]);
 #if defined(TARGET_PPC64)
 env->cfar = env->spr[SPR_CFAR];
 #endif
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH 2/8] ipmi: add get and set SENSOR_TYPE commands

2016-01-06 Thread Cédric Le Goater

On 01/06/2016 10:55 AM, Greg Kurz wrote:
> On Tue,  5 Jan 2016 18:29:56 +0100
> Cédric Le Goater  wrote:
> 
>> Signed-off-by: Cédric Le Goater 
>> ---
> 
> Acked-by: Greg Kurz 
> 
> Just some minor comments on the form below.
> 
> 
>>  hw/ipmi/ipmi_bmc_sim.c | 51 
>> --
>>  1 file changed, 49 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
>> index 559e1398d669..061db8437479 100644
>> --- a/hw/ipmi/ipmi_bmc_sim.c
>> +++ b/hw/ipmi/ipmi_bmc_sim.c
>> @@ -37,13 +37,15 @@
>>  #define IPMI_CMD_CHASSIS_CONTROL  0x02
>>
>>  #define IPMI_NETFN_SENSOR_EVENT   0x04
>> -#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x2e
>> +#define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x30
>>
> 
> Maybe IPMI_NETFN_SENSOR_EVENT_MAXCMD should be defined...

These are maximums for each IPMI family of commands (netfn). See
the arrays at the end of the file :

static const IPMICmdHandler *_cmds[*_MAXCMD] 

You'll get the idea.
 
>>  #define IPMI_CMD_SET_SENSOR_EVT_ENABLE0x28
>>  #define IPMI_CMD_GET_SENSOR_EVT_ENABLE0x29
>>  #define IPMI_CMD_REARM_SENSOR_EVTS0x2a
>>  #define IPMI_CMD_GET_SENSOR_EVT_STATUS0x2b
>>  #define IPMI_CMD_GET_SENSOR_READING   0x2d
>> +#define IPMI_CMD_SET_SENSOR_TYPE  0x2e
>> +#define IPMI_CMD_GET_SENSOR_TYPE  0x2f
>>
> 
> ... here ?
> 
>>  /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
>>  #define IPMI_NETFN_APP_MAXCMD 0x36
>> @@ -1576,6 +1578,49 @@ static void get_sensor_reading(IPMIBmcSim *ibs,
>>  return;
>>  }
>>
>> +static void set_sensor_type(IPMIBmcSim *ibs,
>> +   uint8_t *cmd, unsigned int cmd_len,
>> +   uint8_t *rsp, unsigned int *rsp_len,
>> +   unsigned int max_rsp_len)
>> +{
>> +IPMISensor *sens;
>> +
>> +
>> +IPMI_CHECK_CMD_LEN(5);
>> +if ((cmd[2] > MAX_SENSORS) ||
> 
> Parenthesis not needed here since > has precedence over ||
> 
>> +!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
> 
> Indentation ?

OK. I will fix that.

>> +rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
>> +goto out;
>> +}
>> +sens = ibs->sensors + cmd[2];
>> +sens->sensor_type = cmd[3];
>> +sens->evt_reading_type_code = cmd[4] & 0x7f;
> 
> So evt_reading_type_code is 7bit ? Maybe worth to
> introduce a IPMI_SENSOR_TYPE_MASK define.

Yes. there are a few of these in the code.

>> +
>> + out:
>> +return;
>> +}
>> +
>> +static void get_sensor_type(IPMIBmcSim *ibs,
>> +   uint8_t *cmd, unsigned int cmd_len,
>> +   uint8_t *rsp, unsigned int *rsp_len,
>> +   unsigned int max_rsp_len)
>> +{
>> +IPMISensor *sens;
>> +
>> +
>> +IPMI_CHECK_CMD_LEN(3);
>> +if ((cmd[2] > MAX_SENSORS) ||
>> +!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
> 
> Parenthesis and indentation ?

yep. copy & paste :)

Thanks,

C.

>> +rsp[2] = IPMI_CC_REQ_ENTRY_NOT_PRESENT;
>> +goto out;
>> +}
>> +sens = ibs->sensors + cmd[2];
>> +IPMI_ADD_RSP_DATA(sens->sensor_type);
>> +IPMI_ADD_RSP_DATA(sens->evt_reading_type_code);
>> + out:
>> +return;
>> +}
>> +
>>  static const IPMICmdHandler chassis_cmds[IPMI_NETFN_CHASSIS_MAXCMD] = {
>>  [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
>>  [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
>> @@ -1592,7 +1637,9 @@ sensor_event_cmds[IPMI_NETFN_SENSOR_EVENT_MAXCMD] = {
>>  [IPMI_CMD_GET_SENSOR_EVT_ENABLE] = get_sensor_evt_enable,
>>  [IPMI_CMD_REARM_SENSOR_EVTS] = rearm_sensor_evts,
>>  [IPMI_CMD_GET_SENSOR_EVT_STATUS] = get_sensor_evt_status,
>> -[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading
>> +[IPMI_CMD_GET_SENSOR_READING] = get_sensor_reading,
>> +[IPMI_CMD_SET_SENSOR_TYPE] = set_sensor_type,
>> +[IPMI_CMD_GET_SENSOR_TYPE] = get_sensor_type,
>>  };
>>  static const IPMINetfn sensor_event_netfn = {
>>  .cmd_nums = IPMI_NETFN_SENSOR_EVENT_MAXCMD,
>

Re: [Qemu-devel] [PATCH] hw/arm/virt: Initialize NICs configured in PCI bus

2016-01-06 Thread Peter Maydell

On 6 January 2016 at 14:47, Ashok Kumar  wrote:
> virtio model is used for default case.
>
> Signed-off-by: Ashok Kumar 

Could you explain why you think this needs to be done?
Virtio networking works OK for me...

I guess from the patch that this is adding support for
the legacy '-net' way of configuring networking, but do
we need that if we never supported it in the first place?
(If virt is the only PCI machine which doesn't support
-net syntax that would probably be a strong argument for
supporting it.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH v4 0/5] qmp: Add blockdev-mirror

2016-01-06 Thread Max Reitz

On 24.12.2015 05:45, Fam Zheng wrote:
> v4: 02: Add Max's rev-by.
> 04: buf_size -> buf-size.
> Add Markus' Ack-by.
> 05: 'node1' -> qmp_target.
> Fix double quotes.
> Add Max's Rev-by.
> 
> v3: Rebase to master.
> 
> v2: 01: Move bdrv_op_block_all down. [Max]
> 02, 04: Add Max's rev-by.
> 03: Check has_mode and fix "return;". [Max]
> 05: Check target->blk.
> Drop superfluous whitespace. [Max]
> 06: Drop superfluous whitespace hunk and add Max's rev-by. [Max]
> 
> This is the counterpart of blockdev-backup. The biggest value of this command
> is to allow full flexibility on target image open options, via blockdev-add.
> For example this could help solve the target provisioning issue in:
> 
> http://lists.gnu.org/archive/html/qemu-devel/2015-06/msg02139.html
> 
> Fam Zheng (5):
>   block: Rename BLOCK_OP_TYPE_MIRROR to BLOCK_OP_TYPE_MIRROR_SOURCE
>   block: Extract blockdev part of qmp_drive_mirror
>   block: Add check on mirror target
>   qmp: Add blockdev-mirror command
>   iotests: Add test cases for blockdev-mirror
> 
>  blockdev.c  | 179 
> ++--
>  hw/block/dataplane/virtio-blk.c |   2 +-
>  include/block/block.h   |   3 +-
>  qapi/block-core.json|  48 +++
>  qmp-commands.hx |  50 ++-
>  tests/qemu-iotests/041  | 100 --
>  tests/qemu-iotests/041.out  |   4 +-
>  7 files changed, 316 insertions(+), 70 deletions(-)

Thanks, applied to my block tree:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction translation

2016-01-06 Thread Andrew Baumann

Hi,

> From: qemu-devel-bounces+andrew.baumann=microsoft@nongnu.org
> [mailto:qemu-devel-
> bounces+andrew.baumann=microsoft@nongnu.org] On Behalf Of
> Alvise Rigo
> Sent: Monday, 14 December 2015 00:41
> 
> This is the sixth iteration of the patch series which applies to the
> upstream branch of QEMU (v2.5.0-rc3).
> 
> Changes versus previous versions are at the bottom of this cover letter.
> 
> The code is also available at following repository:
> https://git.virtualopensystems.com/dev/qemu-mt.git
> branch:
> slowpath-for-atomic-v6-no-mttcg
> 
> This patch series provides an infrastructure for atomic instruction
> implementation in QEMU, thus offering a 'legacy' solution for
> translating guest atomic instructions. Moreover, it can be considered as
> a first step toward a multi-thread TCG.
> 
> The underlying idea is to provide new TCG helpers (sort of softmmu
> helpers) that guarantee atomicity to some memory accesses or in general
> a way to define memory transactions.
> 
> More specifically, the new softmmu helpers behave as LoadLink and
> StoreConditional instructions, and are called from TCG code by means of
> target specific helpers. This work includes the implementation for all
> the ARM atomic instructions, see target-arm/op_helper.c.

As a heads up, we just added support for alignment checks in LDREX:
https://github.com/qemu/qemu/commit/30901475b91ef1f46304404ab4bfe89097f61b96

Hopefully it is an easy change to ensure that the same check happens for the 
relevant loads when CONFIG_TCG_USE_LDST_EXCL is enabled?

Thanks,
Andrew

Re: [Qemu-devel] [PATCH v9 1/2] mirror: Rewrite mirror_iteration

2016-01-06 Thread Max Reitz

On 05.01.2016 09:46, Fam Zheng wrote:
> The "pnum < nb_sectors" condition in deciding whether to actually copy
> data is unnecessarily strict, and the qiov initialization is
> unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.
> 
> Rewrite mirror_iteration to fix both flaws.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/mirror.c | 347 
> +++--
>  trace-events   |   1 -
>  2 files changed, 216 insertions(+), 132 deletions(-)

Side note: This breaks the output of iotest 109, probably due to
different alignment of the mirroring operations (doesn't look serious,
though).

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Qemu-block] [PATCH] send readcapacity10 when readcapacity16 failed

2016-01-06 Thread John Snow



On 01/05/2016 02:57 PM, ronnie sahlberg wrote:
> MMC devices:
> READ CAPACITY 10 support is mandatory.
> No support for READ CAPACITY 16
> 
> SBC devices:
> READ CAPACITY 10 is mandatory
> READ CAPACITY 16 support is only required when you have thin
> provisioning or protection information (or if the device is >2^32 blocks)
> Almost all, but apparently not all, SBC devices support both.
> 
> 
> For SBC devices you probably want to start with RC16 and only fallback
> to RC10 if you get INVALID_OPCODE.
> You start with RC16 since this is the way to discover if you have thin
> provisioning or special protection information.
> 
> For MMC devices you could try the "try RC16 first and fallback to RC10"
> but as probably almost no MMC devices support RC16 it makes little sense
> to do so.
> 
> 

Ronnie: Thanks for the explanation!

Zhu: In light of this, can the patch be reworked slightly to explicitly
check *why* READCAPACITY16 failed and only attempt the READCAPACITY10 as
a fallback if it receives INVALID_OPCODE?

If it fails for any other reason it's probably best to report the error
and let QEMU decide what to do about it.

I suppose caching a flag that lets us know to go straight to
READ_CAPACITY10 is not worthwhile because this command is not likely to
be issued very often.

Thanks,
--js

> 
> On Tue, Jan 5, 2016 at 11:42 AM, John Snow  > wrote:
> 
> 
> 
> On 12/28/2015 10:32 PM, Zhu Lingshan wrote:
> > When play with Dell MD3000 target, for sure it
> > is a TYPE_DISK, but readcapacity16 would fail.
> > Then we find that readcapacity10 succeeded. It
> > looks like the target just support readcapacity10
> > even through it is a TYPE_DISK or have some
> > TYPE_ROM characteristics.
> >
> > This patch can give a chance to send
> > readcapacity16 when readcapacity10 failed.
> > This patch is not harmful to original pathes
> >
> > Signed-off-by: Zhu Lingshan mailto:ls...@suse.com>>
> > ---
> >  block/iscsi.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/iscsi.c b/block/iscsi.c
> > index bd1f1bf..c8d167f 100644
> > --- a/block/iscsi.c
> > +++ b/block/iscsi.c
> > @@ -1243,8 +1243,9 @@ static void iscsi_readcapacity_sync(IscsiLun
> *iscsilun, Error **errp)
> >  iscsilun->lbprz = !!rc16->lbprz;
> >  iscsilun->use_16_for_rw = (rc16->returned_lba
> > 0x);
> >  }
> > +break;
> >  }
> > -break;
> > +//fall through to try readcapacity10 instead
> >  case TYPE_ROM:
> >  task = iscsi_readcapacity10_sync(iscsilun->iscsi,
> iscsilun->lun, 0, 0);
> >  if (task != NULL && task->status == SCSI_STATUS_GOOD) {
> >
> 
> For the uninitiated, why does readcapacity16 fail?
> 
> My gut feeling is that this is a hack, because:
> 
> - Either readcapacity16 should work, or
> - We shouldn't be choosing 10/16 based on the target type to begin with
> 
> but I don't know much about iSCSI, so maybe You, Paolo or Peter could
> fill me in.
> 
> --js
> 
>

Re: [Qemu-devel] [PATCH v9 1/2] mirror: Rewrite mirror_iteration

2016-01-06 Thread Max Reitz

On 05.01.2016 09:46, Fam Zheng wrote:
> The "pnum < nb_sectors" condition in deciding whether to actually copy
> data is unnecessarily strict, and the qiov initialization is
> unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.
> 
> Rewrite mirror_iteration to fix both flaws.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/mirror.c | 347 
> +++--
>  trace-events   |   1 -
>  2 files changed, 216 insertions(+), 132 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index f201f2b..e3e9fad 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -46,7 +46,6 @@ typedef struct MirrorBlockJob {
>  BlockdevOnError on_source_error, on_target_error;
>  bool synced;
>  bool should_complete;
> -int64_t sector_num;
>  int64_t granularity;
>  size_t buf_size;
>  int64_t bdev_length;
> @@ -63,6 +62,8 @@ typedef struct MirrorBlockJob {
>  int ret;
>  bool unmap;
>  bool waiting_for_io;
> +int target_cluster_sectors;
> +int max_iov;
>  } MirrorBlockJob;
>  
>  typedef struct MirrorOp {
> @@ -158,115 +159,93 @@ static void mirror_read_complete(void *opaque, int ret)
>  mirror_write_complete, op);
>  }
>  
> -static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
> +/* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
> + * return the offset of the adjusted tail sector against original. */
> +static int mirror_cow_align(MirrorBlockJob *s,
> +int64_t *sector_num,
> +int *nb_sectors)
> +{
> +bool head_need_cow, tail_need_cow;
> +int diff = 0;
> +int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
> +
> +head_need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
> +tail_need_cow = !test_bit((*sector_num + *nb_sectors - 1) / 
> chunk_sectors,
> +  s->cow_bitmap);
> +if (head_need_cow || tail_need_cow) {
> +int64_t align_sector_num;
> +int align_nb_sectors;
> +bdrv_round_to_clusters(s->target, *sector_num, *nb_sectors,
> +   &align_sector_num, &align_nb_sectors);
> +if (tail_need_cow) {
> +diff = align_sector_num + align_nb_sectors
> +   - (*sector_num + *nb_sectors);
> +assert(diff >= 0);
> +*nb_sectors += diff;
> +}
> +if (head_need_cow) {
> +int d = *sector_num - align_sector_num;
> +assert(d >= 0);
> +*sector_num = align_sector_num;
> +*nb_sectors += d;
> +}
> +}
> +
> +/* If the resulting chunks are more than max_iov, we have to shrink it
> + * under the alignment restriction. */
> +if (*nb_sectors > chunk_sectors * s->max_iov) {
> +int shrink = *nb_sectors - chunk_sectors * s->max_iov;
> +if (tail_need_cow) {
> +/* In this case, tail must be aligned already, so we just make 
> sure
> + * the shrink is also aligned. */
> +shrink -= shrink % s->target_cluster_sectors;
> +}
> +assert(shrink);
> +diff -= shrink;
> +*nb_sectors -= shrink;
> +}

Hm, looking at this closer... If we get here with tail_need_cow not
being set, we may end up with an unaligned tail, which then may need COW
(because it points to somewhere else than before).

On the other hand, if we get here with tail_need_cow being set, shrink
will be decreased so that it will only remove an aligned number of
sectors from *nb_sectors; however, because shrink is increased, that
means that *nb_sectors may then still be too large. Also, because of the
shrink, the tail may in fact not need COW any more.

Should we do this check before we test whether we need COW and do the
correction in a way that ensures that the cluster adjustment can never
increase *nb_sectors beyond chunk_sectors * s->max_iov?

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v8 19/35] qmp: Fix reference-counting of qnull on empty output visit

2016-01-06 Thread Eric Blake

On 01/05/2016 07:05 AM, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
>> Commit 6c2f9a15 ensured that we would not return NULL when the
>> caller used an output visitor but had nothing to visit. But
>> in doing so, it added a FIXME about a reference count leak
>> that could abort qemu in the (unlikely) case of SIZE_MAX such
>> visits (more plausible on 32-bit).  (Although that commit
>> suggested we might fix it in time for 2.5, we ran out of time;
>> fortunately, it is unlikely enough to bite that it was not
>> worth worrying about during the 2.5 release.)
>>
>> This fixes things by documenting the internal contracts, and
>> explaining why the internal function can return NULL and only
>> the public facing interface needs to worry about qnull(),
>> thus avoiding over-referencing the qnull_ global object.
>>
>> It does not, however, fix the stupidity of the stack mixing
>> up two separate pieces of information; add a FIXME to explain
>> that issue.
>>
>> Signed-off-by: Eric Blake 
>> Cc: qemu-sta...@nongnu.org
>>

>> +++ b/qapi/qmp-output-visitor.c
>> @@ -29,6 +29,15 @@ typedef QTAILQ_HEAD(QStack, QStackEntry) QStack;
>>  struct QmpOutputVisitor
>>  {
>>  Visitor visitor;
>> +/* FIXME: we are abusing stack to hold two separate pieces of
>> + * information: the current root object in slot 0, and the stack
>> + * of N objects still being built in slots 1 through N (for N+1
>> + * slots in use).  Worse, our behavior is inconsistent:
>> + * qmp_output_add_obj() visiting two top-level scalars in a row
>> + * discards the first in favor of the second, but visiting two
>> + * top-level objects in a row tries to append the second object
>> + * into the first (since the first object was placed in the stack
>> + * in both slot 0 and 1, but only popped from slot 1).  */
> 
> I skipped checking thoroughly this comment, since it's a bit
> off-topic, although it looks ok.
> 
> Later, oh well, it's fixed in next commit. Imho it's not strictly
> necessary in this commit.

I added the comment based on Markus' request that I document how the
stack is used; but yes, it does feel like a bit of churn since it
changes in the next commit.

If there's a reason to respin, I might change it to:

Visitor visitor;
/* Stack holds two pieces of information: the current root object in
 * slot 0, then a stack of N objects still being built in slots 1
 * through N (for N+1 slots in use).
 * FIXME: The root object should be stored separately from the
 * stack, particularly since qmp_output_add_obj() behaves
 * differently when visiting two top-level scalars in a row than
 * it does for two objects (the second object is appended to the
 * first, since the first is placed in both slots 0 and 1 but only
 * popped from slot 1).   */

> 
> Reviewed-by: Marc-André Lureau 
> 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC v6 08/14] target-arm: Add atomic_clear helper for CLREX insn

2016-01-06 Thread alvise rigo

On Wed, Jan 6, 2016 at 6:13 PM, Alex Bennée  wrote:
>
> Alvise Rigo  writes:
>
>> Add a simple helper function to emulate the CLREX instruction.
>
> And now I see ;-)
>
> I suspect this should be merged with the other helpers as a generic helper.

Agreed.

>
>>
>> Suggested-by: Jani Kokkonen 
>> Suggested-by: Claudio Fontana 
>> Signed-off-by: Alvise Rigo 
>> ---
>>  target-arm/helper.h| 2 ++
>>  target-arm/op_helper.c | 6 ++
>>  target-arm/translate.c | 1 +
>>  3 files changed, 9 insertions(+)
>>
>> diff --git a/target-arm/helper.h b/target-arm/helper.h
>> index c2a85c7..37cec49 100644
>> --- a/target-arm/helper.h
>> +++ b/target-arm/helper.h
>> @@ -532,6 +532,8 @@ DEF_HELPER_2(dc_zva, void, env, i64)
>>  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>>  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>>
>> +DEF_HELPER_1(atomic_clear, void, env)
>> +
>>  #ifdef TARGET_AARCH64
>>  #include "helper-a64.h"
>>  #endif
>> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
>> index 6cd54c8..5a67557 100644
>> --- a/target-arm/op_helper.c
>> +++ b/target-arm/op_helper.c
>> @@ -50,6 +50,12 @@ static int exception_target_el(CPUARMState *env)
>>  return target_el;
>>  }
>>
>> +void HELPER(atomic_clear)(CPUARMState *env)
>> +{
>> +ENV_GET_CPU(env)->excl_protected_range.begin = -1;
>
> Is there not a defined reset value EXCLUSIVE_RESET_ADDR we should use here?

Yes, I will move the EXCLUSIVE_RESET_ADDR definition somewhere else in
order to include it in this file.

>
>> +ENV_GET_CPU(env)->ll_sc_context = false;
>> +}
>> +
>>  uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, uint32_t def,
>>uint32_t rn, uint32_t maxindex)
>>  {
>> diff --git a/target-arm/translate.c b/target-arm/translate.c
>> index e88d8a3..e0362e0 100644
>> --- a/target-arm/translate.c
>> +++ b/target-arm/translate.c
>> @@ -7514,6 +7514,7 @@ static void gen_load_exclusive(DisasContext *s, int 
>> rt, int rt2,
>>  static void gen_clrex(DisasContext *s)
>>  {
>>  #ifdef CONFIG_TCG_USE_LDST_EXCL
>> +gen_helper_atomic_clear(cpu_env);
>>  #else
>>  tcg_gen_movi_i64(cpu_exclusive_addr, -1);
>>  #endif
>
>
> --
> Alex Bennée

[Qemu-devel] ANNOUNCE: libguestfs 1.32 released

2016-01-06 Thread Richard W.M. Jones

I'm pleased to announce libguestfs 1.32, a library and set of tools
for accessing and modifying virtual machine disk images.

This release took 6 months of work by many people - see release notes
below.

You can get libguestfs 1.32 here:

Main website: http://libguestfs.org/
  Source: http://libguestfs.org/download/1.32-stable/
  Fedora 23+: http://koji.fedoraproject.org/koji/packageinfo?packageID=8391
  It will appear as an update for F23 in about a week.
Debian/experimental coming soon, see:
  https://packages.debian.org/libguestfs0

Release notes (from http://libguestfs.org/guestfs-release-notes.1.html):

   New features
   New tools

   The new virt-v2v-copy-to-local(1) tool is an ancillary tool for
   virt-v2v(1) allowing you to convert source guests that virt-v2v is
   unable to access directly.

   New features in existing tools

   Virt-customize knows how to write a random seed to CirrOS (Pino
   Toscano).

   On Fedora, virt-customize runs dnf(8) with the --best flag, ensuring it
   always updates to the latest available packages.

   Virt-builder now provides 32 bit Fedora templates.

   Virt-builder and virt-customize --install option now works on 32 bit
   Fedora guests.  Previously it would try to install 64 bit packages (Jan
   Sedlák).

   Virt-builder can now fetch cloud images using Simple Streams v1.0
   metadata (Pino Toscano).

   Virt-builder can now fetch openSUSE cloud images out of the box (Cédric
   Bosdonnat).

   Virt-customize will now use stronger (SHA-512) encrypted passwords by
   default on openSUSE ≥ 11 (Pino Toscano).

   Virt-builder will now correctly handle output filenames containing
   colon characters (":"), and will create temporary files in the
   libguestfs cache directory instead of defaulting to /tmp (Pino
   Toscano).

   Virt-resize has a new --unknown-filesystems option to control what to
   do when asked to resize a filesystem that libguestfs doesn't know how
   to resize.

   Virt-v2v now has an --in-place flag/mode, allowing in-place conversion
   of guests (Roman Kagan).

   Virt-v2v has a --compressed option for creating compressed qcow2 output
   files.

   Virt-v2v can now correctly get the VMware datacenter path (dcPath) from
   libvirt, instead of having to calculate it using an algorithm that
   occasionally got the wrong answer (Matthias Bolte, Tingting Zheng).

   Virt-v2v now processes RAM sizes correctly for 64 bit guests when
   running on a 32 bit host.

   Language bindings

   In Perl and Python programs, the "get_program_name" API now returns the
   true program name, instead of the incorrect string "perl" or "python".

   The Python bindings can now be compiled against a different version of
   libguestfs, allowing the pip module to be built against any version of
   libguestfs (instead of requiring the pip module and libguestfs to have
   exactly the same version).

   The quality of the Ruby rdoc (documentation) has been improved (Pino
   Toscano).

   Perl scripts no longer hard-code the location of perl in the shebang
   line, but use env(1) to locate it instead (Pino Toscano).

   In OCaml programs, the guestfs handle was incorrectly made into a
   global root, meaning it could never be garbage collected.  If you
   didn't call the "close" function explicitly, the handle would not be
   closed until the whole program exited.  This has now been fixed so
   handles will be garbage collected in the usual way.  This changes the
   API of the OCaml function "Guestfs.event_callback".  Note that non-C
   language bindings are not covered by the libguestfs API/ABI guarantee,
   although we try hard not to change them, but in this case it was
   essential in order to fix this very serious bug.

   Inspection

   Alpine Linux and the APK package manager, ALT Linux, Frugalware, and
   PLD Linux are now recognized (Pino Toscano).

   If it exists, /etc/os-release will be preferred for inspecting Linux
   guests (Pino Toscano).

   The correct kernel version is returned for Windows guests ≥ 10.

   Documentation

   The large guestfs(3) man page has been split into several separate man
   pages: guestfs-hacking(1) guestfs-internals(1) guestfs-security(1).  In
   the source tree, a new docs directory contains this documentation.

   Architectures and platforms

   Libguestfs now supports ARM 64 bit platforms with vGICv3.

   Security
   See also guestfs-security(1).

   "CVE-2015-5745"
   https://bugzilla.redhat.com/1251157

   This is not a vulnerability in libguestfs, but because we always
   give a virtio-serial port to each guest (since that is how guest-
   host communication happens), an escalation from the appl

Re: [Qemu-devel] [RFC v6 08/14] target-arm: Add atomic_clear helper for CLREX insn

2016-01-06 Thread Alex Bennée


Alvise Rigo  writes:

> Add a simple helper function to emulate the CLREX instruction.

And now I see ;-)

I suspect this should be merged with the other helpers as a generic helper.

>
> Suggested-by: Jani Kokkonen 
> Suggested-by: Claudio Fontana 
> Signed-off-by: Alvise Rigo 
> ---
>  target-arm/helper.h| 2 ++
>  target-arm/op_helper.c | 6 ++
>  target-arm/translate.c | 1 +
>  3 files changed, 9 insertions(+)
>
> diff --git a/target-arm/helper.h b/target-arm/helper.h
> index c2a85c7..37cec49 100644
> --- a/target-arm/helper.h
> +++ b/target-arm/helper.h
> @@ -532,6 +532,8 @@ DEF_HELPER_2(dc_zva, void, env, i64)
>  DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>  DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
>
> +DEF_HELPER_1(atomic_clear, void, env)
> +
>  #ifdef TARGET_AARCH64
>  #include "helper-a64.h"
>  #endif
> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
> index 6cd54c8..5a67557 100644
> --- a/target-arm/op_helper.c
> +++ b/target-arm/op_helper.c
> @@ -50,6 +50,12 @@ static int exception_target_el(CPUARMState *env)
>  return target_el;
>  }
>
> +void HELPER(atomic_clear)(CPUARMState *env)
> +{
> +ENV_GET_CPU(env)->excl_protected_range.begin = -1;

Is there not a defined reset value EXCLUSIVE_RESET_ADDR we should use here?

> +ENV_GET_CPU(env)->ll_sc_context = false;
> +}
> +
>  uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, uint32_t def,
>uint32_t rn, uint32_t maxindex)
>  {
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index e88d8a3..e0362e0 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -7514,6 +7514,7 @@ static void gen_load_exclusive(DisasContext *s, int rt, 
> int rt2,
>  static void gen_clrex(DisasContext *s)
>  {
>  #ifdef CONFIG_TCG_USE_LDST_EXCL
> +gen_helper_atomic_clear(cpu_env);
>  #else
>  tcg_gen_movi_i64(cpu_exclusive_addr, -1);
>  #endif


--
Alex Bennée

Re: [Qemu-devel] [RFC v6 07/14] target-arm: translate: Use ld/st excl for atomic insns

2016-01-06 Thread Alex Bennée


Alvise Rigo  writes:

> Use the new LL/SC runtime helpers to handle the ARM atomic
> instructions in softmmu_llsc_template.h.
>
> In general, the helper generator
> gen_helper_{ldlink,stcond}_aa32_i{8,16,32,64}() calls the function
> helper_{le,be}_{ldlink,stcond}{ub,uw,ulq}_mmu() implemented in
> softmmu_llsc_template.h.
>
> Suggested-by: Jani Kokkonen 
> Suggested-by: Claudio Fontana 
> Signed-off-by: Alvise Rigo 
> ---
>  target-arm/translate.c | 101 
> +++--
>  1 file changed, 97 insertions(+), 4 deletions(-)
>
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 5d22879..e88d8a3 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -64,8 +64,10 @@ TCGv_ptr cpu_env;
>  static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
>  static TCGv_i32 cpu_R[16];
>  TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
> +#ifndef CONFIG_TCG_USE_LDST_EXCL
>  TCGv_i64 cpu_exclusive_addr;
>  TCGv_i64 cpu_exclusive_val;
> +#endif
>  #ifdef CONFIG_USER_ONLY
>  TCGv_i64 cpu_exclusive_test;
>  TCGv_i32 cpu_exclusive_info;
> @@ -98,10 +100,12 @@ void arm_translate_init(void)
>  cpu_VF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, VF), 
> "VF");
>  cpu_ZF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, ZF), 
> "ZF");
>
> +#ifndef CONFIG_TCG_USE_LDST_EXCL
>  cpu_exclusive_addr = tcg_global_mem_new_i64(TCG_AREG0,
>  offsetof(CPUARMState, exclusive_addr), "exclusive_addr");
>  cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0,
>  offsetof(CPUARMState, exclusive_val), "exclusive_val");
> +#endif
>  #ifdef CONFIG_USER_ONLY
>  cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0,
>  offsetof(CPUARMState, exclusive_test), "exclusive_test");
> @@ -7414,15 +7418,59 @@ static void gen_logicq_cc(TCGv_i32 lo, TCGv_i32 hi)
>  tcg_gen_or_i32(cpu_ZF, lo, hi);
>  }
>
> -/* Load/Store exclusive instructions are implemented by remembering
> +/* If the softmmu is enabled, the translation of Load/Store exclusive
> + * instructions will rely on the gen_helper_{ldlink,stcond} helpers,
> + * offloading most of the work to the softmmu_llsc_template.h functions.
> +
> +   Otherwise, these instructions are implemented by remembering
> the value/address loaded, and seeing if these are the same
> when the store is performed. This should be sufficient to implement
> the architecturally mandated semantics, and avoids having to monitor
> regular stores.
>
> -   In system emulation mode only one CPU will be running at once, so
> -   this sequence is effectively atomic.  In user emulation mode we
> -   throw an exception and handle the atomic operation elsewhere.  */
> +   In user emulation mode we throw an exception and handle the atomic
> +   operation elsewhere.  */
> +#ifdef CONFIG_TCG_USE_LDST_EXCL
> +static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
> +   TCGv_i32 addr, int size)
> + {
> +TCGv_i32 tmp = tcg_temp_new_i32();
> +TCGv_i32 mem_idx = tcg_temp_new_i32();
> +
> +tcg_gen_movi_i32(mem_idx, get_mem_index(s));
> +
> +if (size != 3) {
> +switch (size) {
> +case 0:
> +gen_helper_ldlink_aa32_i8(tmp, cpu_env, addr, mem_idx);
> +break;
> +case 1:
> +gen_helper_ldlink_aa32_i16(tmp, cpu_env, addr, mem_idx);
> +break;
> +case 2:
> +gen_helper_ldlink_aa32_i32(tmp, cpu_env, addr, mem_idx);
> +break;
> +default:
> +abort();
> +}
> +
> +store_reg(s, rt, tmp);
> +} else {
> +TCGv_i64 tmp64 = tcg_temp_new_i64();
> +TCGv_i32 tmph = tcg_temp_new_i32();
> +
> +gen_helper_ldlink_aa32_i64(tmp64, cpu_env, addr, mem_idx);
> +tcg_gen_extr_i64_i32(tmp, tmph, tmp64);
> +
> +store_reg(s, rt, tmp);
> +store_reg(s, rt2, tmph);
> +
> +tcg_temp_free_i64(tmp64);
> +}
> +
> +tcg_temp_free_i32(mem_idx);
> +}
> +#else
>  static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
> TCGv_i32 addr, int size)
>  {
> @@ -7461,10 +7509,14 @@ static void gen_load_exclusive(DisasContext *s, int 
> rt, int rt2,
>  store_reg(s, rt, tmp);
>  tcg_gen_extu_i32_i64(cpu_exclusive_addr, addr);
>  }
> +#endif
>
>  static void gen_clrex(DisasContext *s)
>  {
> +#ifdef CONFIG_TCG_USE_LDST_EXCL

I don't think it would be correct to ignore clrex in softmmu mode.
Assuming the code path had used it we may well be creating slow-path
transitions for no reason.

> +#else
>  tcg_gen_movi_i64(cpu_exclusive_addr, -1);
> +#endif
>  }
>
>  #ifdef CONFIG_USER_ONLY
> @@ -7476,6 +7528,47 @@ static void gen_store_exclusive(DisasContext *s, int 
> rd, int rt, int rt2,
>   size | (rd << 4) | (rt << 8) | (rt2 << 12));
>  gen_exception_internal_insn(s, 4, EXCP_STREX);
>  }
> +#elif defined CONFIG_TCG_USE_LDST_EXCL
> +static void gen_stor

Re: [Qemu-devel] [PATCH v5 12/15] block: Remove the "bs->file" test in bdrv_co_get_block_status

2016-01-06 Thread Max Reitz

On 05.01.2016 09:11, Fam Zheng wrote:
> Now that all drivers return the right "file" pointer, we can remove this
> check.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 492c291..1ca4e61 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1550,7 +1550,7 @@ static int64_t coroutine_fn 
> bdrv_co_get_block_status(BlockDriverState *bs,
>  }
>  }
>  
> -if (bs->file && *file && *file != bs &&
> +if (*file && *file != bs &&
>  (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
>  (ret & BDRV_BLOCK_OFFSET_VALID)) {
>  BlockDriverState *file2;

Fine by itself, but I think patch 1 needs a change, so this should be

-if (bs->file &&
+if (*file && *file != bs &&

instead; and the first parameter of the subsequent
bdrv_co_get_block_status() call then needs to be changed to *file here
(instead of in patch 1).

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v5 11/15] vmdk: Return extent's file in bdrv_get_block_status

2016-01-06 Thread Max Reitz

On 05.01.2016 09:11, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [v15 12/15] vfio: add bus in reset flag

2016-01-06 Thread Alex Williamson

On Wed, 2016-01-06 at 10:13 +0800, Chen Fan wrote:
> On 01/06/2016 03:58 AM, Alex Williamson wrote:
> > On Tue, 2016-01-05 at 09:20 +0800, Cao jin wrote:
> > > From: Chen Fan 
> > > 
> > > mark the host bus be in reset. avoid multiple devices trigger the
> > > host bus reset many times.
> > > 
> > > Signed-off-by: Chen Fan 
> > > ---
> > >   hw/vfio/pci.c | 6 ++
> > >   include/hw/vfio/vfio-common.h | 1 +
> > >   2 files changed, 7 insertions(+)
> > > 
> > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > index ee88db3..aa0d945 100644
> > > --- a/hw/vfio/pci.c
> > > +++ b/hw/vfio/pci.c
> > > @@ -2249,6 +2249,11 @@ static int
> > > vfio_pci_hot_reset(VFIOPCIDevice
> > > *vdev, bool single)
> > >   
> > >   trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ?
> > > "one" :
> > > "multi");
> > >   
> > > +if (vdev->vbasedev.bus_in_reset) {
> > > +vdev->vbasedev.bus_in_reset = false;
> > > +return 0;
> > > +}
> > > +
> > >   vfio_pci_pre_reset(vdev);
> > >   vdev->vbasedev.needs_reset = false;
> > >   
> > > @@ -2312,6 +2317,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice
> > > *vdev, bool single)
> > >   }
> > >   vfio_pci_pre_reset(tmp);
> > >   tmp->vbasedev.needs_reset = false;
> > > +tmp->vbasedev.bus_in_reset = true;
> > >   multi = true;
> > >   break;
> > >   }
> > > diff --git a/include/hw/vfio/vfio-common.h
> > > b/include/hw/vfio/vfio-
> > > common.h
> > > index f037f3c..44b19d7 100644
> > > --- a/include/hw/vfio/vfio-common.h
> > > +++ b/include/hw/vfio/vfio-common.h
> > > @@ -95,6 +95,7 @@ typedef struct VFIODevice {
> > >   bool reset_works;
> > >   bool needs_reset;
> > >   bool no_mmap;
> > > +bool bus_in_reset;
> > >   VFIODeviceOps *ops;
> > >   unsigned int num_irqs;
> > >   unsigned int num_regions;
> > I imagine this should be a VFIOPCIDevice field, it has no use in
> > the
> > common code.  The name is also a bit confusing; when I suggested a
> > bus_in_reset flag, I was referring to a property on the bus itself
> > that
> > the existing device_reset could query to switch modes rather than
> > add a
> > separate callback as you've done in this series.  This works, but
> > it's
> > perhaps more intrusive than I was thinking.  It will need to get
> > approval by qdev folks.
> maybe I don't get your point. I just think add a bus_in_reset flag in
> bus
> has no much sense. for instance, if assigning device A and B from
> different host bus into a same virtual bus. assume all check passed.
> then if device A aer occurs. we should reset virtual bus to recover
> the device A, we also need to reset the device B and do device B host
> bus reset. but here the bus_in_reset just denote the device B not
> need
> to do host bus reset, it's incorrect. right?

Let's take an example of the state of this flag on the device to see
why the name doesn't make sense to me.  We have a dual port card with
devices A and B, both on the same host bus.  We start a hot reset on A
and we have the following states:

A.bus_in_reset = false
B.bus_in_reset = false

Well, that's not accurate.  As we complete the hot reset we tag device
B as already being reset:

A.bus_in_reset = false
B.bus_in_reset = true

That's not accurate either, they're both in the same bus hierarchy,
they should not be different.  Later hot reset is called on B and we're
back to:

A.bus_in_reset = false
B.bus_in_reset = false

So I agree with your algorithm, but the variable is not tracking the
state of the bus being in reset, it's tracking whether to skip the next
reset call.

The separate hot (bus) reset in DeviceClass seems unnecessary too, this
is where I think we could work entirely within the PCI code w/o new
qbus/qdev interfaces.  Imagine pci_bridge_write_config()
calls qbus_walk_children() directly instead of calling
qbus_reset_all().  The pre_busfn() could set a flag on the PCIBus to
indicate the bus is in reset.  qdev_reset_one() could be used as the
post_devfn() and the post_busfn() would call qdev_reset_one() followed
by a clear of the flag.  The modification to vfio is then simply that
if we're resetting an AER device and the bus is in reset, we know we
can do a hot reset.  It simply allows us to test which reset type is
occurring within the existing reset callback rather than adding a new
one.

Going back to my idea of a sequence ID, if we had:

bool PCIBus.bus_in_reset
uint PCIBus.bus_reset_seqid

The pre_busfn would do:

PCIBus.bus_in_reset = true
PCIBus.bus_reset_seqid++

Then we could add:

uint VFIOPCIDevice.last_bus_reset_seqid

vfio_pci_reset() would test (PCIBus.bus_in_reset && VFIOPCIDevice.AER)
to know whether to do a hot reset.  vfio_pci_hot_reset() would skip
devices for which (VFIOPCIDevice.last_bus_reset_seqid ==
PCIBus.bus_reset_seqid) and for each device reset would set
VFIOPCIDevice.last_bus_reset_seqid = PCIBus.bus_reset_seqid

Re: [Qemu-devel] [PATCH v5 01/15] block: Add "file" output parameter to block status query functions

2016-01-06 Thread Max Reitz

On 05.01.2016 09:11, Fam Zheng wrote:
> The added parameter can be used to return the BDS pointer which the
> valid offset is referring to. Its value should be ignored unless
> BDRV_BLOCK_OFFSET_VALID in ret is set.
> 
> Until block drivers fill in the right value, let's clear it explicitly
> right before calling .bdrv_get_block_status.
> 
> The "bs->file" condition in bdrv_co_get_block_status is kept now to keep 
> iotest
> case 102 passing, and will be fixed once all drivers return the right file
> pointer.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/io.c| 42 --
>  block/iscsi.c |  6 --
>  block/mirror.c|  3 ++-
>  block/parallels.c |  2 +-
>  block/qcow.c  |  2 +-
>  block/qcow2.c |  2 +-
>  block/qed.c   |  3 ++-
>  block/raw-posix.c |  3 ++-
>  block/raw_bsd.c   |  3 ++-
>  block/sheepdog.c  |  2 +-
>  block/vdi.c   |  2 +-
>  block/vmdk.c  |  2 +-
>  block/vpc.c   |  2 +-
>  block/vvfat.c |  2 +-
>  include/block/block.h |  9 ++---
>  include/block/block_int.h |  3 ++-
>  qemu-img.c|  7 +--
>  17 files changed, 61 insertions(+), 34 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 63e3678..492c291 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -663,6 +663,7 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t 
> sector_num,
>  int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
>  {
>  int64_t target_sectors, ret, nb_sectors, sector_num = 0;
> +BlockDriverState *file;
>  int n;
>  
>  target_sectors = bdrv_nb_sectors(bs);
> @@ -675,7 +676,7 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags 
> flags)
>  if (nb_sectors <= 0) {
>  return 0;
>  }
> -ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
> +ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, &file);
>  if (ret < 0) {
>  error_report("error getting block status at sector %" PRId64 ": 
> %s",
>   sector_num, strerror(-ret));
> @@ -1464,6 +1465,7 @@ int bdrv_flush_all(void)
>  typedef struct BdrvCoGetBlockStatusData {
>  BlockDriverState *bs;
>  BlockDriverState *base;
> +BlockDriverState **file;
>  int64_t sector_num;
>  int nb_sectors;
>  int *pnum;
> @@ -1488,7 +1490,8 @@ typedef struct BdrvCoGetBlockStatusData {
>   */
>  static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>   int64_t sector_num,
> - int nb_sectors, int 
> *pnum)
> + int nb_sectors, int 
> *pnum,
> + BlockDriverState **file)
>  {
>  int64_t total_sectors;
>  int64_t n;
> @@ -1518,16 +1521,19 @@ static int64_t coroutine_fn 
> bdrv_co_get_block_status(BlockDriverState *bs,
>  return ret;
>  }
>  
> -ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, 
> pnum);
> +*file = NULL;
> +ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum,
> +file);
>  if (ret < 0) {
>  *pnum = 0;
> +*file = NULL;
>  return ret;
>  }
>  
>  if (ret & BDRV_BLOCK_RAW) {
>  assert(ret & BDRV_BLOCK_OFFSET_VALID);
>  return bdrv_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
> - *pnum, pnum);
> + *pnum, pnum, file);
>  }
>  
>  if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
> @@ -1544,13 +1550,14 @@ static int64_t coroutine_fn 
> bdrv_co_get_block_status(BlockDriverState *bs,
>  }
>  }
>  
> -if (bs->file &&
> +if (bs->file && *file && *file != bs &&

In order to keep iotest 102 working, this line should not be changed at
all in this patch. Because *file is NULL and not changed by any of the
block drivers, the condition will always be false now (breaking iotest 102).

>  (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
>  (ret & BDRV_BLOCK_OFFSET_VALID)) {
> +BlockDriverState *file2;
>  int file_pnum;
>  
> -ret2 = bdrv_co_get_block_status(bs->file->bs, ret >> 
> BDRV_SECTOR_BITS,
> -*pnum, &file_pnum);
> +ret2 = bdrv_co_get_block_status(*file, ret >> BDRV_SECTOR_BITS,

And instead of *file, this should remain bs->file->bs.

> +*pnum, &file_pnum, &file2);
>  if (ret2 >= 0) {
>  /* Ignore errors.  This is just providing extra information, it
>   * is useful but not necessary.

[...]

> diff --git a/include/block/block.h b/include/block/b

[Qemu-devel] [PATCH v2] xenfb.c: avoid expensive loops when prod <= out_cons

2016-01-06 Thread Stefano Stabellini

If the frontend sets out_cons to a value higher than out_prod, it will
cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
better checks at the beginning of the function.

Signed-off-by: Stefano Stabellini 
Reported-by: Ling Liu 

---

diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 4e2a27a..594baff 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -789,8 +789,9 @@ static void xenfb_handle_events(struct XenFB *xenfb)
 
 prod = page->out_prod;
 out_cons = page->out_cons;
-if (prod == out_cons)
-   return;
+if (prod - out_cons >= XENFB_OUT_RING_LEL) {
+return;
+}
 xen_rmb(); /* ensure we see ring contents up to prod */
 for (cons = out_cons; cons != prod; cons++) {
union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons);

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-06 Thread Max Reitz

On 06.01.2016 17:28, Eric Blake wrote:
> On 01/06/2016 09:20 AM, Max Reitz wrote:
> 
>>> If I take a snapshot while the guest sees a 1G disk, then resize the
>>> disk to 2G, then roll back to the point in time of the snapshot, I'd
>>> expect the disk to roll back to 1G in size.  Anything else is likely to
>>> confuse the guest.  And that's what current resize support already does
>>> (it only resizes the active image, not the snapshots).
>>
>> No, the current resize operation just refuses to resize the image if it
>> has any snapshots. Snapshots currently do not store the size of the
>> image when they were created.
> 
> Huh? I thought that we specifically added bytes 48-55 per snapshot entry
> in the qcow2v3 description specifically so that internal snapshots DO
> record the size of the image when the snapshot was created.

Oh, you're right! Well, then that was probably the intention, yes.
However, resizing an image with snapshots will still fail.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-06 Thread Eric Blake

On 01/06/2016 09:20 AM, Max Reitz wrote:

>> If I take a snapshot while the guest sees a 1G disk, then resize the
>> disk to 2G, then roll back to the point in time of the snapshot, I'd
>> expect the disk to roll back to 1G in size.  Anything else is likely to
>> confuse the guest.  And that's what current resize support already does
>> (it only resizes the active image, not the snapshots).
> 
> No, the current resize operation just refuses to resize the image if it
> has any snapshots. Snapshots currently do not store the size of the
> image when they were created.

Huh? I thought that we specifically added bytes 48-55 per snapshot entry
in the qcow2v3 description specifically so that internal snapshots DO
record the size of the image when the snapshot was created.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC] util: Fix QEMU_LD_PREFIX endless loop

2016-01-06 Thread 陳威伯

For further explaination of endless loop happen in util/path.c
For example, set the QEMU_LD_PREFIX="/home/apple/i386"
sudo debootstrap --arch=i386 trusty /home/apple/i386
http://archive.ubuntu.com/ubuntu/
magic is a normal dynamic-linkd 32-bit elf.
qemu-i386 -L /home/apple/i386 /home/apple/magic
This command will produce the endless loop.
Because /home/apple/i386/dev/fd will link to /proc/self/fd/
We could use strace to watch the progress
strace qemu-i386 -L /home/apple/i386 /home/apple/magic
openat(AT_FDCWD,
"/home/apple/i386/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/3/dev/fd/5/15/usr/share/doc/libp11-kit0/examples",
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 48
If one of the QEMU_LD_PREFIX directory entry is symbolic
link, is_dir_maybe in the add_entry will return true.

# define is_dir_maybe(type) \
((type) == DT_DIR || (type) == DT_UNKNOWN || (type) == DT_LNK)
Then add the path name with add_dir_maybe function.
And then add the path name again and again.
Produce the infinite loop in the directory searching.
So the solution is not to deal with the symbolic link in the is_dir_maybe
macro
Modify the code as following:
# define is_dir_maybe(type) \
((type) == DT_DIR || (type) == DT_UNKNOWN)

On Wed, Jan 6, 2016 at 11:21 PM, Wei-Bo, Chen  wrote:

> Detail bug report in the following url:
> https://bugs.launchpad.net/qemu/+bug/1245703
>
> Remove is_dir_maybe macro condition DT_LNK in util/path.c
>
> Signed-off-by: Wei-Bo, Chen 
> ---
>  util/path.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/util/path.c b/util/path.c
> index 4e4877e..b99e436 100644
> --- a/util/path.c
> +++ b/util/path.c
> @@ -58,7 +58,7 @@ static struct pathelem *new_entry(const char *root,
>  #if defined(DT_DIR) && defined(DT_UNKNOWN) && defined(DT_LNK)
>  # define dirent_type(dirent) ((dirent)->d_type)
>  # define is_dir_maybe(type) \
> -((type) == DT_DIR || (type) == DT_UNKNOWN || (type) == DT_LNK)
> +((type) == DT_DIR || (type) == DT_UNKNOWN)
>  #else
>  # define dirent_type(dirent) (1)
>  # define is_dir_maybe(type)  (type)
> --
> 2.5.0
>
>

Re: [Qemu-devel] [Xen-devel] [PATCH v3 07/11] igd: revamp host config read

2016-01-06 Thread Stefano Stabellini

On Wed, 6 Jan 2016, Gerd Hoffmann wrote:
> > > +for (i = 0; i < len; i++) {
> > > +rc = pread(config_fd, guest->config + list[i].offset,
> > > +   list[i].len, list[i].offset);
> > > +if (rc != list[i].len) {
> > 
> > pread is allowed to return early, returning the number of bytes read.
> > 
> 
> This is a sysfs file though, not a socket or pipe where a partial read
> makes sense and will actually happen.  If we can't read something
> that'll be because the kernel denies access.
> 
> So IMHO it should be fine to treat anything which doesn't give us the
> amount of bytes we asked for as an error condition.

True, still theoretically, it's possible for pread to return early. Who
knows what glibc and linux are going to do in the future.

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-06 Thread Max Reitz

On 06.01.2016 16:04, Eric Blake wrote:
> On 01/05/2016 07:50 PM, lihuiba wrote:
>> At 2016-01-05 21:55:56, "Eric Blake"  wrote:
>>> On 01/05/2016 05:10 AM, lihuiba wrote:
>>>
>> In our production environment, we need to extend a qcow2 image with
>> snapshots in it.
>>>
> The thing is that one would need to update all the inactive L1 tables. I
> don't think it should be too difficult, it's just that apparently so far
> nobody ever had the need for this feature.
>>>
>>> Is resizing a snapshot really what you want?  Ideally, a snapshot tracks
>>> the data from a point in time, including the metadata of the size being
>>> tracked at that time.  Extending the snapshots then reverting to that
>>> snapshot means your guest would see a larger disk on revert than it did
>>> at the time the snapshot was created, which guests might not handle very
>>> well.
>> I want to make resizing (extending only) and snapshot independent to each 
>> other,otherwise going back and forth in snapshots may cause the disk 
>> shrinked and extended. That would introduce some technical trouble, and 
>> possibly confuse user as well.
> 
> If I take a snapshot while the guest sees a 1G disk, then resize the
> disk to 2G, then roll back to the point in time of the snapshot, I'd
> expect the disk to roll back to 1G in size.  Anything else is likely to
> confuse the guest.  And that's what current resize support already does
> (it only resizes the active image, not the snapshots).

No, the current resize operation just refuses to resize the image if it
has any snapshots. Snapshots currently do not store the size of the
image when they were created.

Max



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [Bug 1357226] Re: qemu: uncaught target signal 11 (Segmentation fault) - core dumped

2016-01-06 Thread Scott Moser

This may or may not be relevant here, but the mysterious "uncaught
target signal 11" error was fixed for maas images (lp:maas-images) build
process by increasing the memory to the VMs that were doing the build.
We had been doing the cross/qemu-static building in ~512M vms and that
was resulting in somewhat transient failures during 'apt-get update'.
Upping the memory of the vm to 2G made those go away.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1357226

Title:
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped

Status in QEMU:
  New

Bug description:
  steps to reproduce:
  pbuilder-dist utopic armhf create
  pbuilder-dist utopic armhf login
  apt-get install imagemagick
  convert foo.xpm foo.png
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault

  (doesn't matter if images are actually there or not)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1357226/+subscriptions

Re: [Qemu-devel] [PATCH v3 4/4] Xen PCI passthru: convert to realize()

2016-01-06 Thread Eric Blake

On 01/05/2016 07:39 PM, Cao jin wrote:
> Signed-off-by: Cao jin 
> Reviewed-by: Stefano Stabellini 
> ---
>  hw/xen/xen_pt.c | 53 -
>  1 file changed, 28 insertions(+), 25 deletions(-)
> 

> @@ -801,19 +801,19 @@ static int xen_pt_initfn(PCIDevice *d)
>  if ((s->real_device.domain == 0) && (s->real_device.bus == 0) &&
>  (s->real_device.dev == 2) && (s->real_device.func == 0)) {
>  if (!is_igd_vga_passthrough(&s->real_device)) {
> -XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying"
> -   " to passthrough IGD GFX.\n");
> +error_setg(errp, "Need to enable igd-passthru if you're trying"
> +" to passthrough IGD GFX.");

No trailing '.' in error_setg() messages.


> @@ -827,27 +827,26 @@ static int xen_pt_initfn(PCIDevice *d)
>  xen_pt_config_init(s, &local_err);
>  if (local_err) {
>  error_append_hint(&local_err, "PCI Config space initialisation 
> failed");
> -rc = -1;
> +error_propagate(errp, local_err);
>  goto err_out;
>  }

Looks like this fixes a memory leak in an earlier patch; maybe you need
to shuffle hunks around?

>  
>  /* Bind interrupt */
>  rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
>  if (rc) {
> -XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
> +error_setg_errno(errp, errno, "Failed to read PCI_INTERRUPT_PIN!");

No trailing '!'


> @@ -891,14 +890,14 @@ out:
>  
>  rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
>  if (rc) {
> -XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
> +error_setg_errno(errp, errno, "Failed to read PCI_COMMAND!");

and again


> @@ -911,12 +910,16 @@ out:
> "Real physical device %02x:%02x.%d registered 
> successfully!\n",
> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function);
>  
> -return 0;
> +return;
>  
>  err_out:
> +for (i = 0; i < PCI_ROM_SLOT; i++) {
> +object_unparent(OBJECT(&s->bar[i]));
> +}
> +object_unparent(OBJECT(&s->rom));
> +
>  xen_pt_destroy(d);
>  assert(rc);
> -return rc;

Is the assertion still needed?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 0/4] Convert to realize()

2016-01-06 Thread Stefano Stabellini

On Wed, 6 Jan 2016, Eric Blake wrote:
> On 01/06/2016 04:08 AM, Stefano Stabellini wrote:
> > On Wed, 6 Jan 2016, Cao jin wrote:
> >> v3 changelog:
> >> 1. use following style when we want to check the returned error
> >>
> >>  Error *err = NULL;
> >>  foo(arg, &err);
> >>  if (err) {
> >>  handle the error...
> >>  error_propagate(errp, err);
> >>  }
> >>
> >> Cao jin (4):
> >>   Add Error **errp for xen_host_pci_device_get()
> >>   Add Error **errp for xen_pt_setup_vga()
> >>   Add Error **errp for xen_pt_config_init()
> >>   Xen PCI passthru: convert to realize()
> >>
> >>  hw/xen/xen-host-pci-device.c | 106 
> >> +--
> >>  hw/xen/xen-host-pci-device.h |   5 +-
> >>  hw/xen/xen_pt.c  |  73 -
> >>  hw/xen/xen_pt.h  |   5 +-
> >>  hw/xen/xen_pt_config_init.c  |  51 +++--
> >>  hw/xen/xen_pt_graphics.c |  11 +++--
> >>  6 files changed, 141 insertions(+), 110 deletions(-)
> > 
> > Thanks Cao, I applied the whole series to my next branch.
> 
> I found some issues while reviewing; maybe you want to wait for a v4.

Sure, thanks for reviewing.

Re: [Qemu-devel] [PATCH v3 3/4] Add Error **errp for xen_pt_config_init()

2016-01-06 Thread Eric Blake

On 01/05/2016 07:39 PM, Cao jin wrote:
> To catch the error msg. Also modify the caller
> 
> Signed-off-by: Cao jin 
> ---
>  hw/xen/xen_pt.c |  7 ---
>  hw/xen/xen_pt.h |  2 +-
>  hw/xen/xen_pt_config_init.c | 51 
> -
>  3 files changed, 32 insertions(+), 28 deletions(-)
> 

> +++ b/hw/xen/xen_pt_config_init.c
> @@ -1899,8 +1899,9 @@ static uint8_t find_cap_offset(XenPCIPassthroughState 
> *s, uint8_t cap)
>  return 0;
>  }
>  
> -static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
> -  XenPTRegGroup *reg_grp, XenPTRegInfo *reg)
> +static void xen_pt_config_reg_init(XenPCIPassthroughState *s,
> +  XenPTRegGroup *reg_grp, XenPTRegInfo *reg,
> +  Error **errp)

Indentation is now off.


> @@ -1967,10 +1970,10 @@ static int 
> xen_pt_config_reg_init(XenPCIPassthroughState *s,
>  val = data;
>  
>  if (val & ~size_mask) {
> -XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04x expands past register 
> size(%d)!\n",
> -   offset, val, reg->size);
> +error_setg(errp, "Offset 0x%04x:0x%04x expands past"
> +" register size(%d)!", offset, val, reg->size);

Drop the trailing !.  Also, while touching this, it's better to have a
space before ( in English.


> +void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp)
>  {
>  int i, rc;
> +Error *local_err = NULL;

Same comments as earlier in the series about using the shorter 'err'
instead of 'local_err'.

>  
>  QLIST_INIT(&s->reg_grps);
>  
> @@ -2039,11 +2041,12 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
>reg_grp_offset,
>®_grp_entry->size);
>  if (rc < 0) {
> -XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, 
> rc:%d\n",
> -   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
> +error_setg(&local_err, "Failed to initialize %d/%ld, 
> type=0x%x,"
> +   " rc:%d", i, ARRAY_SIZE(xen_pt_emu_reg_grps),

This maps ARRAY_SIZE() (which is size_t) to %ld, which can fail to
compile on 32-bit platforms (where size_t is not necessarily long).  Fix
it to %zd while touching it.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] hw/dma/xilinx_axidma: remove dead code

2016-01-06 Thread Eric Blake

On 01/06/2016 05:53 AM, Andrew Jones wrote:
> stream_desc_show() (and DEBUG_ENET) appear to be unused, as the
> function isn't compilable (there are broken PRI format strings).
> 
> Signed-off-by: Andrew Jones 
> ---
>  hw/dma/xilinx_axidma.c | 10 --
>  1 file changed, 10 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 2/4] Add Error **errp for xen_pt_setup_vga()

2016-01-06 Thread Eric Blake

On 01/05/2016 07:39 PM, Cao jin wrote:
> To catch the error msg. Also modify the caller
> 
> Signed-off-by: Cao jin 
> Reviewed-by: Stefano Stabellini 
> ---
>  hw/xen/xen_pt.c  |  5 -
>  hw/xen/xen_pt.h  |  3 ++-
>  hw/xen/xen_pt_graphics.c | 11 ++-
>  3 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 1bd4109..fbce55c 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -807,7 +807,10 @@ static int xen_pt_initfn(PCIDevice *d)
>  return -1;
>  }
>  
> -if (xen_pt_setup_vga(s, &s->real_device) < 0) {
> +xen_pt_setup_vga(s, &s->real_device, &local_err);
> +if (local_err) {
> +error_append_hint(&local_err, "Setup VGA BIOS of passthrough"
> +" GFX failed!");

Please no '!' in error messages.  We aren't shouting at the user.

>  XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");

Do we still need the XEN_PT_ERR() alongside setting the local error?

>  xen_host_pci_device_put(&s->real_device);
>  return -1;

This leaks local_err.


> @@ -172,13 +173,14 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, 
> XenHostPCIDevice *dev)
>  struct pci_data *pd = NULL;
>  
>  if (!is_igd_vga_passthrough(dev)) {
> -return -1;
> +error_setg(errp, "Need to enable igd-passthrough");
> +return;
>  }
>  
>  bios = get_vgabios(s, &bios_size, dev);
>  if (!bios) {
> -XEN_PT_ERR(&s->dev, "VGA: Can't getting VBIOS!\n");
> -return -1;
> +error_setg(errp, "VGA: Can't getting VBIOS!");
> +return;

Please drop the trailing '!' while touching this

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 10/11] igd: handle igd-passthrough-isa-bridge setup in realize()

2016-01-06 Thread Gerd Hoffmann

On Mi, 2016-01-06 at 15:29 +, Stefano Stabellini wrote:
> On Tue, 5 Jan 2016, Gerd Hoffmann wrote:
> > That way a simple '-device igd-passthrough-isa-bridge,addr=1f' will
> > do the setup.
> 
> Is this going to change the QEMU command line arguments to use it?

See patch 11 ;)

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v3 1/4] Add Error **errp for xen_host_pci_device_get()

2016-01-06 Thread Eric Blake

On 01/05/2016 07:39 PM, Cao jin wrote:
> To catch the error msg. Also modify the caller
> 
> Signed-off-by: Cao jin 
> ---
>  hw/xen/xen-host-pci-device.c | 106 
> +--
>  hw/xen/xen-host-pci-device.h |   5 +-
>  hw/xen/xen_pt.c  |  12 +++--
>  3 files changed, 71 insertions(+), 52 deletions(-)
> 
> diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
> index 7d8a023..952cae0 100644
> --- a/hw/xen/xen-host-pci-device.c
> +++ b/hw/xen/xen-host-pci-device.c
> @@ -49,7 +49,7 @@ static int xen_host_pci_sysfs_path(const XenHostPCIDevice 
> *d,
>  
>  /* This size should be enough to read the first 7 lines of a resource file */
>  #define XEN_HOST_PCI_RESOURCE_BUFFER_SIZE 400
> -static int xen_host_pci_get_resource(XenHostPCIDevice *d)
> +static void xen_host_pci_get_resource(XenHostPCIDevice *d, Error **errp)
>  {
>  int i, rc, fd;
>  char path[PATH_MAX];
> @@ -60,23 +60,24 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
>  
>  rc = xen_host_pci_sysfs_path(d, "resource", path, sizeof (path));
>  if (rc) {
> -return rc;
> +error_setg_errno(errp, errno, "snprintf err");

Are you sure that errno is relevant?  And "snprintf err" doesn't seem to
be the correct message, as there is no snprintf in the line above.

> +return;
>  }
> +
>  fd = open(path, O_RDONLY);
>  if (fd == -1) {
> -XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, 
> strerror(errno));
> -return -errno;
> +error_setg_errno(errp, errno, "open %s err", path);

Please use error_setg_file_open() for reporting open() failures.

> @@ -129,20 +130,20 @@ static int xen_host_pci_get_resource(XenHostPCIDevice 
> *d)
>  d->rom.bus_flags = flags & IORESOURCE_BITS;
>  }
>  }
> +
>  if (i != PCI_NUM_REGIONS) {
>  /* Invalid format or input to short */
> -rc = -ENODEV;
> +error_setg(errp, "Invalid format or input to short: %s", buf);

s/to/too/ (and might as well fix the same typo in the comment while at it)


> @@ -152,47 +153,52 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
> const char *name,
>  
>  rc = xen_host_pci_sysfs_path(d, name, path, sizeof (path));
>  if (rc) {
> -return rc;
> +error_setg_errno(errp, errno, "snprintf err");
> +return;
>  }
> +
>  fd = open(path, O_RDONLY);
>  if (fd == -1) {
> -XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, 
> strerror(errno));
> -return -errno;
> +error_setg_errno(errp, errno, "open %s err", path);

Same comments as above.

> +return;
>  }
> +
>  do {
>  rc = read(fd, &buf, sizeof (buf) - 1);
>  if (rc < 0 && errno != EINTR) {
> -rc = -errno;
> +error_setg_errno(errp, errno, "read err");
>  goto out;
>  }
>  } while (rc < 0);
> +
>  buf[rc] = 0;
>  value = strtol(buf, &endptr, base);
>  if (endptr == buf || *endptr != '\n') {
> -rc = -1;
> +error_setg(errp, "format invalid: %s", buf);
>  } else if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
> -rc = -errno;
> +error_setg_errno(errp, errno, "strtol err");

This is pre-existing invalid use of strtol (the value of errno is not
guaranteed to be ERANGE on overflow; and the only correct way to safely
check errno after strtol() is to first prime it to 0 prior to calling
strtol).  Better would be to use qemu_strtol() (preferably as a separate
patch), so that you don't even have to worry about using strtol()
incorrectly.

> +static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
>  {
>  char path[PATH_MAX];
>  int rc;
>  
>  rc = xen_host_pci_sysfs_path(d, "config", path, sizeof (path));

May want to delete the space before ( while touching the code in this
vicinity.

>  if (rc) {
> -return rc;
> +error_setg_errno(errp, errno, "snprintf err");

Another suspect message.


> +void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
> +uint8_t bus, uint8_t dev, uint8_t func,
> +Error **errp)
>  {
>  unsigned int v;
> -int rc = 0;
> +Error *local_err = NULL;

These days, naming the local variable 'err' is more common than 'local_err'.

> @@ -774,11 +775,12 @@ static int xen_pt_initfn(PCIDevice *d)
> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
> s->dev.devfn);
>  
> -rc = xen_host_pci_device_get(&s->real_device,
> - s->hostaddr.domain, s->hostaddr.bus,
> - s->hostaddr.slot, s->hostaddr.function);
> -if (rc) {
> -XEN_PT_ERR(d, "Failed to \"open\" the real pci device. rc: %i\n", 
> rc);
> +xen_host_pci_device_get(&s->real_device,
> +s->hostaddr.domain, s->hostaddr

Re: [Qemu-devel] [PATCH v3 07/11] igd: revamp host config read

2016-01-06 Thread Gerd Hoffmann

> > +for (i = 0; i < len; i++) {
> > +rc = pread(config_fd, guest->config + list[i].offset,
> > +   list[i].len, list[i].offset);
> > +if (rc != list[i].len) {
> 
> pread is allowed to return early, returning the number of bytes read.
> 

This is a sysfs file though, not a socket or pipe where a partial read
makes sense and will actually happen.  If we can't read something
that'll be because the kernel denies access.

So IMHO it should be fine to treat anything which doesn't give us the
amount of bytes we asked for as an error condition.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v3 0/4] Convert to realize()

2016-01-06 Thread Eric Blake

On 01/06/2016 04:08 AM, Stefano Stabellini wrote:
> On Wed, 6 Jan 2016, Cao jin wrote:
>> v3 changelog:
>> 1. use following style when we want to check the returned error
>>
>>  Error *err = NULL;
>>  foo(arg, &err);
>>  if (err) {
>>  handle the error...
>>  error_propagate(errp, err);
>>  }
>>
>> Cao jin (4):
>>   Add Error **errp for xen_host_pci_device_get()
>>   Add Error **errp for xen_pt_setup_vga()
>>   Add Error **errp for xen_pt_config_init()
>>   Xen PCI passthru: convert to realize()
>>
>>  hw/xen/xen-host-pci-device.c | 106 
>> +--
>>  hw/xen/xen-host-pci-device.h |   5 +-
>>  hw/xen/xen_pt.c  |  73 -
>>  hw/xen/xen_pt.h  |   5 +-
>>  hw/xen/xen_pt_config_init.c  |  51 +++--
>>  hw/xen/xen_pt_graphics.c |  11 +++--
>>  6 files changed, 141 insertions(+), 110 deletions(-)
> 
> Thanks Cao, I applied the whole series to my next branch.

I found some issues while reviewing; maybe you want to wait for a v4.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 3/6] nvdimm acpi: introduce patched dsm memory

2016-01-06 Thread Xiao Guangrong




On 01/06/2016 11:23 PM, Igor Mammedov wrote:

On Tue,  5 Jan 2016 02:52:05 +0800
Xiao Guangrong  wrote:


The dsm memory is used to save the input parameters and store
the dsm result which is filled by QEMU.

The address of dsm memory is decided by bios and patched into
int64 object returned by "MEMA" method

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/aml-build.c | 12 
  hw/acpi/nvdimm.c| 24 ++--
  include/hw/acpi/aml-build.h |  1 +
  3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 78e1290..83eadb3 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -394,6 +394,18 @@ Aml *aml_int(const uint64_t val)
  }

  /*
+ * ACPI 1.0b: 16.2.3 Data Objects Encoding:
+ * encode: QWordConst
+ */
+Aml *aml_int64(const uint64_t val)
+{
+Aml *var = aml_alloc();
+build_append_byte(var->buf, 0x0E); /* QWordPrefix */
+build_append_int_noprefix(var->buf, val, 8);
+return var;
+}
+
+/*
   * helper to construct NameString, which returns Aml object
   * for using with aml_append or other aml_* terms
   */
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index bc7cd8f..a72104c 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -28,6 +28,7 @@

  #include "hw/acpi/acpi.h"
  #include "hw/acpi/aml-build.h"
+#include "hw/acpi/bios-linker-loader.h"
  #include "hw/nvram/fw_cfg.h"
  #include "hw/mem/nvdimm.h"

@@ -402,7 +403,8 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, 
MemoryRegion *io,
  state->dsm_mem->len);
  }

-#define NVDIMM_COMMON_DSM  "NCAL"
+#define NVDIMM_GET_DSM_MEM  "MEMA"
+#define NVDIMM_COMMON_DSM   "NCAL"

  static void nvdimm_build_common_dsm(Aml *dev)
  {
@@ -468,7 +470,8 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray 
*table_offsets,
GArray *table_data, GArray *linker,
uint8_t revision)
  {
-Aml *ssdt, *sb_scope, *dev;
+Aml *ssdt, *sb_scope, *dev, *method;
+int offset;

  acpi_add_table(table_offsets, table_data);

@@ -499,9 +502,26 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray 
*table_offsets,

  aml_append(sb_scope, dev);

+/*
+ * leave it at the end of ssdt so that we can conveniently get the
+ * offset of int64 object returned by the function which will be
+ * patched with the real address of the dsm memory by BIOS.
+ */
+method = aml_method(NVDIMM_GET_DSM_MEM, 0, AML_NOTSERIALIZED);
+aml_append(method, aml_return(aml_int64(0x0)));

there is no need in dedicated aml_int64(), you can use aml_int(0x64) 
trick


We can not do this due to the trick in  bios_linker_loader_add_pointer() which 
will
issue a COMMAND_ADD_POINTER to BIOS, however, this request does:
/*
 * COMMAND_ADD_POINTER - patch the table (originating from
 * @dest_file) at @pointer.offset, by adding a pointer to the table
 * originating from @src_file. 1,2,4 or 8 byte unsigned
 * addition is used depending on @pointer.size.
 */

that means the new-offset = old-offset + the address of the new table allocated 
by BIOS.

So we expect 0 offset here.




+aml_append(sb_scope, method);
  aml_append(ssdt, sb_scope);
  /* copy AML table into ACPI tables blob and patch header there */
  g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
+
+offset = table_data->len - 8;
+
+bios_linker_loader_alloc(linker, NVDIMM_DSM_MEM_FILE, TARGET_PAGE_SIZE,
+ false /* high memory */);
+bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
+   NVDIMM_DSM_MEM_FILE, table_data,
+   table_data->data + offset,
+   sizeof(uint64_t));

this offset magic will break badly as soon as someone add something
to the end of SSDT.



Yes, it is, so don't do that, :) and this is why we made the comment here:
 +/*
 + * leave it at the end of ssdt so that we can conveniently get the
 + * offset of int64 object returned by the function which will be
 + * patched with the real address of the dsm memory by BIOS.
 + */

Re: [Qemu-devel] [PATCH v3 05/11] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize

2016-01-06 Thread Gerd Hoffmann

> >  
> > +static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
> >  static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
> >  {
> > +Error *err = NULL;
> >  uint32_t val = 0;
> >  int rc, i, num;
> >  int pos, len;
> 
> Can't we get the parent PCIDeviceClass realize function from pci_dev? So
> that we don't have to introduce i440fx_realize?

I don't think so ...

> >  
> > +i440fx_realize = k->realize;
> >  k->realize = igd_pt_i440fx_realize;

... because we are overriding it right here.

cheers,
  Gerd

Re: [Qemu-devel] could i using qemu-img covert && rebase -u to do qcow2 rollback?

2016-01-06 Thread Max Reitz

On 05.01.2016 04:52, Huan Zhang wrote:
> Hi Max,
> "rollback" means revert user data to snap1 state, and for some reason,
> we want to
> keep snap2 and  'rollbacked' qocw2 file in a single backing file chain.
> After rollback, looks like:
> snap0.qcow2 -> snap1.qcow2 ->snap2.qcow2 -> rollbacked-snap1.qcow2
>  
> In my immature opinion,
> 'qemu-img convert -O qcow2 snap1.qcow2 rollback.qcow2' get snap1 sate data,
> 'qemu-img rebase -u -b snap2.qcow2 rollback.qcow2' just changes
> rollback.qcow2 backing file to snap2.qcow2,
> will NOT change the data from user perspective(data reading from backing
> file (snap2.qcow2 e.g.) which not in rollback.qcow2 is meaningless to user].
> Is that right?

OK. The problem with this is that qemu-img convert will not write
unallocated clusters. For instance, assume we have the following
configuration:

# An empty snap0.qcow2
$ qemu-img create -f qcow2 snap0.qcow2 64M

# snap1.qcow2 contains some 64k data block at offset 0
$ qemu-img create -f qcow2 -b snap0.qcow2 -F qcow2 snap1.qcow2
$ qemu-io -c 'write -P 1 0 64k' snap1.qcow2

# snap2.qcow2 contains some other 64k data block at offset 64k
$ qemu-img create -f qcow2 -b snap1.qcow2 -F qcow2 snap2.qcow2
$ qemu-io -c 'write -P 2 64k 64k' snap2.qcow2

# Now you want to rollback to snap1
$ qemu-img convert -O qcow2 snap1.qcow2 rollback.qcow2
$ qemu-img rebase -u -b snap2.qcow2 rollback.qcow2

# Now let's compare snap1.qcow2 and rollback.qcow2
$ qemu-img compare snap1.qcow2 rollback.qcow2
Content mismatch at offset 65536!

So what went wrong? qemu-img convert does not write unallocated sectors
to the output; therefore, the block starting from offset 64k is
unallocated in rollback.qcow2 (just as it is in snap1.qcow2), however,
in rollback.qcow2, this will not return 0, but whatever is in
snap2.qcow2 (which snap1.qcow2 does not have as a backing file).
Therefore, when read from snap1.qcow2, that range is 0; but from
rollback.qcow2, it returns 2s (what we wrote to snap2.qcow2).

How to fix it? Drop the -u for rebase. However, qemu-img will not let us
do this because without -u it wants the image to have a backing file
already.

So let's make rollback.qcow2's backing file snap0.qcow2 (the backing
file snap1.qcow2 has):

$ qemu-img convert -O qcow2 \
-o backing_file=snap0.qcow2,backing_fmt=qcow2 \
snap1.qcow2 rollback.qcow2

(So in the general case, the backing_file option should be set to
whatever backing file snap1.qcow2 has.)

Alternatively, since you will actually be doing a rebase after this, you
can also simply do:

$ cp snap1.qcow2 rollback.qcow2

Whatever you choose (the second may actually be the better choice), the
rebase can be done using:

$ qemu-img rebase -b snap2.qcow2 -F qcow2 rollback.qcow2

Now let's check:

$ qemu-img compare snap1.qcow2 rollback.qcow2
Images are identical.

That looks better!

So, conclusion, the following will probably generally do:

$ cp snap1.qcow2 rollback.qcow2
$ qemu-img rebase -b snap2.qcow2 -F qcow2 rollback.qcow2

Where snap1.qcow2 is the state you want to roll back to, and snap2.qcow2
is the last image you want to be in the backing chain under rollback.qcow2.

Hope that helps (and that I'm actually correct),

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 11/11] igd: move igd-passthrough-isa-bridge creation to machine init

2016-01-06 Thread Stefano Stabellini

On Tue, 5 Jan 2016, Gerd Hoffmann wrote:
> This patch moves igd-passthrough-isa-bridge creation out of the xen
> passthrough code into machine init.  It is triggered by the
> igd-passthru=on machine option.  Advantages:
> 
>  * This works for on both xen and kvm.
>  * It is activated for the pc machine type only, q35 has a real
>isa bridge on 1f.0 and must be handled differently.  The q35
>plan is https://lkml.org/lkml/2015/11/26/183 (should land in
>the next merge window, i.e. linux 4.5).
>  * If we don't need it any more some day (intel is busy removing
>chipset dependencies from the guest driver) we have a single
>machine switch to just turn off all igd passthru chipset
>tweaks.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/i386/pc_piix.c |  6 ++
>  hw/xen/xen_pt.c   | 14 --
>  2 files changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index f36222e..2afbbd3 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -281,6 +281,12 @@ static void pc_init1(MachineState *machine,
>  if (pcmc->pci_enabled) {
>  pc_pci_device_init(pci_bus);
>  }
> +
> +#ifdef CONFIG_LINUX
> +if (machine->igd_gfx_passthru) {
> +igd_passthrough_isa_bridge_create(pci_bus);
> +}
> +#endif

One thing I don't like about this is that it is going to skip the checks
done in xen_pt_initfn. For example it is going to create the isa bridge,
even if there is going to be an error loading the vga bios or if the
device specified is not even an Intel graphic card.


>  }
>  
>  /* Looking for a pc_compat_2_4() function? It doesn't exist.
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 18a7f72..5f626c9 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -685,17 +685,6 @@ static const MemoryListener xen_pt_io_listener = {
>  .priority = 10,
>  };
>  
> -static void
> -xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
> -  XenHostPCIDevice *dev)
> -{
> -uint16_t gpu_dev_id;
> -PCIDevice *d = &s->dev;
> -
> -gpu_dev_id = dev->device_id;
> -igd_passthrough_isa_bridge_create(d->bus);
> -}
> -
>  /* destroy. */
>  static void xen_pt_destroy(PCIDevice *d) {
>  
> @@ -810,9 +799,6 @@ static int xen_pt_initfn(PCIDevice *d)
>  xen_host_pci_device_put(&s->real_device);
>  return -1;
>  }
> -
> -/* Register ISA bridge for passthrough GFX. */
> -xen_igd_passthrough_isa_bridge_create(s, &s->real_device);
>  }
>  
>  /* Handle real device's MMIO/PIO BARs */
> -- 
> 1.8.3.1
>

Re: [Qemu-devel] [PATCH v3 10/11] igd: handle igd-passthrough-isa-bridge setup in realize()

2016-01-06 Thread Stefano Stabellini

On Tue, 5 Jan 2016, Gerd Hoffmann wrote:
> That way a simple '-device igd-passthrough-isa-bridge,addr=1f' will
> do the setup.

Is this going to change the QEMU command line arguments to use it?


> Also instead of looking up reasonable PCI IDs based on the graphic
> device id simply copy over the ids from the host, thereby reusing the
> infrastructure we have in place for the igd host bridges.  Less code,
> and should be more robust as we don't have to maintain the id table
> to keep things going.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/pci-host/igd.c| 115 
> +--
>  hw/xen/xen_pt.c  |   2 +-
>  include/hw/i386/pc.h |   2 +-
>  3 files changed, 30 insertions(+), 89 deletions(-)
> 
> diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
> index 96b679d..8f32c39 100644
> --- a/hw/pci-host/igd.c
> +++ b/hw/pci-host/igd.c
> @@ -123,111 +123,52 @@ static const TypeInfo igd_passthrough_q35_info = {
>  .class_init= igd_passthrough_q35_class_init,
>  };
>  
> -typedef struct {
> -uint16_t gpu_device_id;
> -uint16_t pch_device_id;
> -uint8_t pch_revision_id;
> -} IGDDeviceIDInfo;
> -
> -/* In real world different GPU should have different PCH. But actually
> - * the different PCH DIDs likely map to different PCH SKUs. We do the
> - * same thing for the GPU. For PCH, the different SKUs are going to be
> - * all the same silicon design and implementation, just different
> - * features turn on and off with fuses. The SW interfaces should be
> - * consistent across all SKUs in a given family (eg LPT). But just same
> - * features may not be supported.
> - *
> - * Most of these different PCH features probably don't matter to the
> - * Gfx driver, but obviously any difference in display port connections
> - * will so it should be fine with any PCH in case of passthrough.
> - *
> - * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
> - * scenarios, 0x9cc3 for BDW(Broadwell).
> - */
> -static const IGDDeviceIDInfo igd_combo_id_infos[] = {
> -/* HSW Classic */
> -{0x0402, 0x8c4e, 0x04}, /* HSWGT1D, HSWD_w7 */
> -{0x0406, 0x8c4e, 0x04}, /* HSWGT1M, HSWM_w7 */
> -{0x0412, 0x8c4e, 0x04}, /* HSWGT2D, HSWD_w7 */
> -{0x0416, 0x8c4e, 0x04}, /* HSWGT2M, HSWM_w7 */
> -{0x041E, 0x8c4e, 0x04}, /* HSWGT15D, HSWD_w7 */
> -/* HSW ULT */
> -{0x0A06, 0x8c4e, 0x04}, /* HSWGT1UT, HSWM_w7 */
> -{0x0A16, 0x8c4e, 0x04}, /* HSWGT2UT, HSWM_w7 */
> -{0x0A26, 0x8c4e, 0x06}, /* HSWGT3UT, HSWM_w7 */
> -{0x0A2E, 0x8c4e, 0x04}, /* HSWGT3UT28W, HSWM_w7 */
> -{0x0A1E, 0x8c4e, 0x04}, /* HSWGT2UX, HSWM_w7 */
> -{0x0A0E, 0x8c4e, 0x04}, /* HSWGT1ULX, HSWM_w7 */
> -/* HSW CRW */
> -{0x0D26, 0x8c4e, 0x04}, /* HSWGT3CW, HSWM_w7 */
> -{0x0D22, 0x8c4e, 0x04}, /* HSWGT3CWDT, HSWD_w7 */
> -/* HSW Server */
> -{0x041A, 0x8c4e, 0x04}, /* HSWSVGT2, HSWD_w7 */
> -/* HSW SRVR */
> -{0x040A, 0x8c4e, 0x04}, /* HSWSVGT1, HSWD_w7 */
> -/* BSW */
> -{0x1606, 0x9cc3, 0x03}, /* BDWULTGT1, BDWM_w7 */
> -{0x1616, 0x9cc3, 0x03}, /* BDWULTGT2, BDWM_w7 */
> -{0x1626, 0x9cc3, 0x03}, /* BDWULTGT3, BDWM_w7 */
> -{0x160E, 0x9cc3, 0x03}, /* BDWULXGT1, BDWM_w7 */
> -{0x161E, 0x9cc3, 0x03}, /* BDWULXGT2, BDWM_w7 */
> -{0x1602, 0x9cc3, 0x03}, /* BDWHALOGT1, BDWM_w7 */
> -{0x1612, 0x9cc3, 0x03}, /* BDWHALOGT2, BDWM_w7 */
> -{0x1622, 0x9cc3, 0x03}, /* BDWHALOGT3, BDWM_w7 */
> -{0x162B, 0x9cc3, 0x03}, /* BDWHALO28W, BDWM_w7 */
> -{0x162A, 0x9cc3, 0x03}, /* BDWGT3WRKS, BDWM_w7 */
> -{0x162D, 0x9cc3, 0x03}, /* BDWGT3SRVR, BDWM_w7 */
> +static const IGDHostInfo igd_isa_bridge_infos[] = {
> +{PCI_VENDOR_ID,   2},
> +{PCI_DEVICE_ID,   2},
> +{PCI_REVISION_ID, 2},
> +{PCI_SUBSYSTEM_VENDOR_ID, 2},
> +{PCI_SUBSYSTEM_ID,2},
>  };
>  
> +static void igd_pt_isa_bridge_realize(PCIDevice *pci_dev, Error **errp)
> +{
> +Error *err = NULL;
> +
> +if (pci_dev->devfn != PCI_DEVFN(0x1f, 0)) {
> +error_setg(errp, "igd isa bridge must have address 1f.0");
> +return;
> +}
> +
> +host_pci_config_copy(pci_dev, ":00:1f.0",
> + igd_isa_bridge_infos,
> + ARRAY_SIZE(igd_isa_bridge_infos),
> + &err);
> +if (err != NULL) {
> +error_propagate(errp, err);
> +return;
> +}
> +}
> +
>  static void isa_bridge_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>  
>  dc->desc= "ISA bridge faked to support IGD PT";
> -k->vendor_id= PCI_VENDOR_ID_INTEL;
> +k->realize  = igd_pt_isa_bridge_realize;
>  k->class_id = PCI_CLASS_BRIDGE_ISA;
>  };
>  
>  static TypeInfo igd_passthrough_isa_bridge_info = {
>  .name  = "igd-passthrough-isa-bridge",
>  .parent= TYPE_PCI_DEVICE,
> -

Re: [Qemu-devel] [PATCH 3/6] nvdimm acpi: introduce patched dsm memory

2016-01-06 Thread Igor Mammedov

On Tue,  5 Jan 2016 02:52:05 +0800
Xiao Guangrong  wrote:

> The dsm memory is used to save the input parameters and store
> the dsm result which is filled by QEMU.
> 
> The address of dsm memory is decided by bios and patched into
> int64 object returned by "MEMA" method
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/aml-build.c | 12 
>  hw/acpi/nvdimm.c| 24 ++--
>  include/hw/acpi/aml-build.h |  1 +
>  3 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 78e1290..83eadb3 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -394,6 +394,18 @@ Aml *aml_int(const uint64_t val)
>  }
>  
>  /*
> + * ACPI 1.0b: 16.2.3 Data Objects Encoding:
> + * encode: QWordConst
> + */
> +Aml *aml_int64(const uint64_t val)
> +{
> +Aml *var = aml_alloc();
> +build_append_byte(var->buf, 0x0E); /* QWordPrefix */
> +build_append_int_noprefix(var->buf, val, 8);
> +return var;
> +}
> +
> +/*
>   * helper to construct NameString, which returns Aml object
>   * for using with aml_append or other aml_* terms
>   */
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index bc7cd8f..a72104c 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -28,6 +28,7 @@
>  
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/aml-build.h"
> +#include "hw/acpi/bios-linker-loader.h"
>  #include "hw/nvram/fw_cfg.h"
>  #include "hw/mem/nvdimm.h"
>  
> @@ -402,7 +403,8 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, 
> MemoryRegion *io,
>  state->dsm_mem->len);
>  }
>  
> -#define NVDIMM_COMMON_DSM  "NCAL"
> +#define NVDIMM_GET_DSM_MEM  "MEMA"
> +#define NVDIMM_COMMON_DSM   "NCAL"
>  
>  static void nvdimm_build_common_dsm(Aml *dev)
>  {
> @@ -468,7 +470,8 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray 
> *table_offsets,
>GArray *table_data, GArray *linker,
>uint8_t revision)
>  {
> -Aml *ssdt, *sb_scope, *dev;
> +Aml *ssdt, *sb_scope, *dev, *method;
> +int offset;
>  
>  acpi_add_table(table_offsets, table_data);
>  
> @@ -499,9 +502,26 @@ static void nvdimm_build_ssdt(GSList *device_list, 
> GArray *table_offsets,
>  
>  aml_append(sb_scope, dev);
>  
> +/*
> + * leave it at the end of ssdt so that we can conveniently get the
> + * offset of int64 object returned by the function which will be
> + * patched with the real address of the dsm memory by BIOS.
> + */
> +method = aml_method(NVDIMM_GET_DSM_MEM, 0, AML_NOTSERIALIZED);
> +aml_append(method, aml_return(aml_int64(0x0)));
there is no need in dedicated aml_int64(), you can use aml_int(0x64) 
trick

> +aml_append(sb_scope, method);
>  aml_append(ssdt, sb_scope);
>  /* copy AML table into ACPI tables blob and patch header there */
>  g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
> +
> +offset = table_data->len - 8;
> +
> +bios_linker_loader_alloc(linker, NVDIMM_DSM_MEM_FILE, TARGET_PAGE_SIZE,
> + false /* high memory */);
> +bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
> +   NVDIMM_DSM_MEM_FILE, table_data,
> +   table_data->data + offset,
> +   sizeof(uint64_t));
this offset magic will break badly as soon as someone add something
to the end of SSDT.


>  build_header(linker, table_data,
>  (void *)(table_data->data + table_data->len - ssdt->buf->len),
>  "SSDT", ssdt->buf->len, revision, "NVDIMM");
> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> index ef44d02..b4726a4 100644
> --- a/include/hw/acpi/aml-build.h
> +++ b/include/hw/acpi/aml-build.h
> @@ -246,6 +246,7 @@ Aml *aml_name(const char *name_format, ...) 
> GCC_FMT_ATTR(1, 2);
>  Aml *aml_name_decl(const char *name, Aml *val);
>  Aml *aml_return(Aml *val);
>  Aml *aml_int(const uint64_t val);
> +Aml *aml_int64(const uint64_t val);
>  Aml *aml_arg(int pos);
>  Aml *aml_to_integer(Aml *arg);
>  Aml *aml_to_hexstring(Aml *src, Aml *dst);

[Qemu-devel] [RFC] util: Fix QEMU_LD_PREFIX endless loop

2016-01-06 Thread Wei-Bo, Chen

Detail bug report in the following url:
https://bugs.launchpad.net/qemu/+bug/1245703

Remove is_dir_maybe macro condition DT_LNK in util/path.c

Signed-off-by: Wei-Bo, Chen 
---
 util/path.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/path.c b/util/path.c
index 4e4877e..b99e436 100644
--- a/util/path.c
+++ b/util/path.c
@@ -58,7 +58,7 @@ static struct pathelem *new_entry(const char *root,
 #if defined(DT_DIR) && defined(DT_UNKNOWN) && defined(DT_LNK)
 # define dirent_type(dirent) ((dirent)->d_type)
 # define is_dir_maybe(type) \
-((type) == DT_DIR || (type) == DT_UNKNOWN || (type) == DT_LNK)
+((type) == DT_DIR || (type) == DT_UNKNOWN)
 #else
 # define dirent_type(dirent) (1)
 # define is_dir_maybe(type)  (type)
-- 
2.5.0

[Qemu-devel] [PATCH v2 5/7] hw/arm/sysbus-fdt: helpers for clock node generation

2016-01-06 Thread Eric Auger

Some passthrough'ed devices depend on clock nodes. Those need to be
generated in the guest device tree. This patch introduces some helpers
to build a clock node from information retrieved in the host device tree.

- inherit_properties copies properties from a host device tree node to
  a guest device tree node
- fdt_build_clock_node builds a guest clock node and checks the host
  fellow clock is a fixed one.

fdt_build_clock_node will become static as soon as it gets used. A
dummy pre-declaration is needed for compilation of this patch.

Signed-off-by: Eric Auger 

---

v1 -> v2:
- inherit properties now outputs an error message in case
  qemu_fdt_getprop fails for an existing optional property
- no hardcoded fixed buffer length
- fdt_build_clock_node becomes void and auto-asserts on error
- use boolean values when defining the clock properties

RFC -> v1:
- use the new proto of qemu_fdt_getprop
- remove newline in error_report
- fix some style issues
---
 hw/arm/sysbus-fdt.c | 120 
 1 file changed, 120 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 9d28797..a1cf57b 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -21,6 +21,7 @@
  *
  */
 
+#include 
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -56,6 +57,125 @@ typedef struct NodeCreationPair {
 int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* helpers */
+
+typedef struct HostProperty {
+const char *name;
+bool optional;
+} HostProperty;
+
+/**
+ * inherit_properties
+ *
+ * copies properties listed in an array from host device tree to
+ * guest device tree. If a non optional property is not found, the
+ * function self-asserts. An optional property is ignored if not found
+ * in the host device tree.
+ * @props: array of HostProperty to copy
+ * @nb_props: number of properties in the array
+ * @host_dt: host device tree blob
+ * @guest_dt: guest device tree blob
+ * @node_path: host dt node path where the property is supposed to be
+  found
+ * @nodename: guest node name the properties should be added to
+ */
+static void inherit_properties(HostProperty *props, int nb_props,
+   void *host_fdt, void *guest_fdt,
+   char *node_path, char *nodename)
+{
+int i, prop_len;
+const void *r;
+Error *err = NULL;
+
+for (i = 0; i < nb_props; i++) {
+r = qemu_fdt_getprop(host_fdt, node_path,
+ props[i].name,
+ &prop_len,
+ props[i].optional ? &err : &error_fatal);
+if (r) {
+qemu_fdt_setprop(guest_fdt, nodename,
+ props[i].name, r, prop_len);
+} else {
+if (prop_len != -FDT_ERR_NOTFOUND) {
+/* optional property not returned although property exists */
+error_report_err(err);
+} else {
+error_free(err);
+}
+}
+}
+}
+
+/* clock properties whose values are copied/pasted from host */
+static HostProperty clock_inherited_properties[] = {
+{"compatible", false},
+{"#clock-cells", false},
+{"clock-frequency", true},
+{"clock-output-names", true},
+};
+
+/**
+ * fdt_build_clock_node
+ *
+ * Build a guest clock node, used as a dependency from a passthrough'ed
+ * device. Most information are retrieved from the host clock node.
+ * Also check the host clock is a fixed one.
+ *
+ * @host_fdt: host device tree blob from which info are retrieved
+ * @guest_fdt: guest device tree blob where the clock node is added
+ * @host_phandle: phandle of the clock in host device tree
+ * @guest_phandle: phandle to assign to the guest node
+ */
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle);
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle)
+{
+char *node_path = NULL;
+char *nodename;
+const void *r;
+int ret, node_offset, prop_len, path_len = 16;
+
+node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
+if (node_offset <= 0) {
+error_setg(&error_fatal,
+   "not able to locate clock handle %d in host device tree",
+   host_phandle);
+}
+node_path = g_malloc(path_len);
+while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
+== -FDT_ERR_NOSPACE) {
+path_len += 16;
+node_path = g_realloc(node_path, path_len);
+}
+if (ret < 0) {
+error_setg(&error_fatal,
+   "not able to retrieve node path for clock handle %d",
+   host_phandle);
+}
+
+r = qemu_fdt_getprop(host

Re: [Qemu-devel] [RFC v6 04/14] softmmu: Add helpers for a new slowpath

2016-01-06 Thread Alex Bennée


Alvise Rigo  writes:

> The new helpers rely on the legacy ones to perform the actual read/write.
>
> The LoadLink helper (helper_ldlink_name) prepares the way for the
> following SC operation. It sets the linked address and the size of the
> access.

nit: extra line or continue paragraph

> These helper also update the TLB entry of the page involved in the
> LL/SC for those vCPUs that have the bit set (dirty), so that the
> following accesses made by all the vCPUs will follow the slow path.
>
> The StoreConditional helper (helper_stcond_name) returns 1 if the
> store has to fail due to a concurrent access to the same page by
> another vCPU. A 'concurrent access' can be a store made by *any* vCPU
> (although, some implementations allow stores made by the CPU that issued
> the LoadLink).
>
> Suggested-by: Jani Kokkonen 
> Suggested-by: Claudio Fontana 
> Signed-off-by: Alvise Rigo 
> ---
>  cputlb.c|   3 ++
>  softmmu_llsc_template.h | 134 
> 
>  softmmu_template.h  |  12 +
>  tcg/tcg.h   |  31 +++
>  4 files changed, 180 insertions(+)
>  create mode 100644 softmmu_llsc_template.h
>
> diff --git a/cputlb.c b/cputlb.c
> index 7ee0c89..70b6404 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -509,6 +509,8 @@ static inline void lookup_and_reset_cpus_ll_addr(hwaddr 
> addr, hwaddr size)
>
>  #define MMUSUFFIX _mmu
>
> +/* Generates LoadLink/StoreConditional helpers in softmmu_template.h */
> +#define GEN_EXCLUSIVE_HELPERS
>  #define SHIFT 0
>  #include "softmmu_template.h"
>
> @@ -521,6 +523,7 @@ static inline void lookup_and_reset_cpus_ll_addr(hwaddr 
> addr, hwaddr size)
>  #define SHIFT 3
>  #include "softmmu_template.h"
>  #undef MMUSUFFIX
> +#undef GEN_EXCLUSIVE_HELPERS
>
>  #define MMUSUFFIX _cmmu
>  #undef GETPC_ADJ
> diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
> new file mode 100644
> index 000..586bb2e
> --- /dev/null
> +++ b/softmmu_llsc_template.h
> @@ -0,0 +1,134 @@
> +/*
> + *  Software MMU support (esclusive load/store operations)
> + *
> + * Generate helpers used by TCG for qemu_ldlink/stcond ops.
> + *
> + * Included from softmmu_template.h only.
> + *
> + * Copyright (c) 2015 Virtual Open Systems
> + *
> + * Authors:
> + *  Alvise Rigo 
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + */
> +
> +/* This template does not generate together the le and be version, but only 
> one
> + * of the two depending on whether BIGENDIAN_EXCLUSIVE_HELPERS has been set.
> + * The same nomenclature as softmmu_template.h is used for the exclusive
> + * helpers.  */
> +
> +#ifdef BIGENDIAN_EXCLUSIVE_HELPERS
> +
> +#define helper_ldlink_name  glue(glue(helper_be_ldlink, USUFFIX), MMUSUFFIX)
> +#define helper_stcond_name  glue(glue(helper_be_stcond, SUFFIX), MMUSUFFIX)
> +#define helper_ld glue(glue(helper_be_ld, USUFFIX), MMUSUFFIX)
> +#define helper_st glue(glue(helper_be_st, SUFFIX), MMUSUFFIX)
> +
> +#else /* LE helpers + 8bit helpers (generated only once for both LE end BE) 
> */
> +
> +#if DATA_SIZE > 1
> +#define helper_ldlink_name  glue(glue(helper_le_ldlink, USUFFIX), MMUSUFFIX)
> +#define helper_stcond_name  glue(glue(helper_le_stcond, SUFFIX), MMUSUFFIX)
> +#define helper_ld glue(glue(helper_le_ld, USUFFIX), MMUSUFFIX)
> +#define helper_st glue(glue(helper_le_st, SUFFIX), MMUSUFFIX)
> +#else /* DATA_SIZE <= 1 */
> +#define helper_ldlink_name  glue(glue(helper_ret_ldlink, USUFFIX), MMUSUFFIX)
> +#define helper_stcond_name  glue(glue(helper_ret_stcond, SUFFIX), MMUSUFFIX)
> +#define helper_ld glue(glue(helper_ret_ld, USUFFIX), MMUSUFFIX)
> +#define helper_st glue(glue(helper_ret_st, SUFFIX), MMUSUFFIX)
> +#endif
> +
> +#endif
> +
> +WORD_TYPE helper_ldlink_name(CPUArchState *env, target_ulong addr,
> +TCGMemOpIdx oi, uintptr_t retaddr)
> +{
> +WORD_TYPE ret;
> +int index;
> +CPUState *cpu, *this = ENV_GET_CPU(env);
> +CPUClass *cc = CPU_GET_CLASS(this);
> +hwaddr hw_addr;
> +unsigned mmu_idx = get_mmuidx(oi);
> +
> +/* Use the proper load helper from cpu_ldst.h */
> +ret = helper_ld(env, addr, mmu_idx, retaddr);
> +
> +index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> +
> +/* hw_addr = hwaddr of the page (i.e. section->mr->ram_addr + xlat)
> + * plus the offs

[Qemu-devel] [PATCH v2 7/7] hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

2016-01-06 Thread Eric Auger

qemu_fdt_setprop self-exists in case of error hence no need to check
the returned value.

Signed-off-by: Eric Auger 
---
 hw/arm/sysbus-fdt.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 66fa766..68d7e53 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -241,12 +241,8 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 reg_attr[2 * i + 1] = cpu_to_be32(
 memory_region_size(&vdev->regions[i]->mem));
 }
-ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
-   vbasedev->num_regions * 2 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set reg property of node %s", nodename);
-goto fail_reg;
-}
+qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+ vbasedev->num_regions * 2 * sizeof(uint32_t));
 
 irq_attr = g_new(uint32_t, vbasedev->num_irqs * 3);
 for (i = 0; i < vbasedev->num_irqs; i++) {
@@ -256,14 +252,9 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 irq_attr[3 * i + 1] = cpu_to_be32(irq_number);
 irq_attr[3 * i + 2] = cpu_to_be32(GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 }
-ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+qemu_fdt_setprop(fdt, nodename, "interrupts",
  irq_attr, vbasedev->num_irqs * 3 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set interrupts property of node %s",
- nodename);
-}
 g_free(irq_attr);
-fail_reg:
 g_free(reg_attr);
 g_free(nodename);
 return ret;
-- 
1.9.1

[Qemu-devel] [PATCH v2 6/7] hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation

2016-01-06 Thread Eric Auger

This patch allows the instantiation of the vfio-amd-xgbe device
from the QEMU command line (-device vfio-amd-xgbe,host="").

The guest is exposed with a device tree node that combines the description
of both XGBE and PHY (representation supported from 4.2 onwards kernel):
Documentation/devicetree/bindings/net/amd-xgbe.txt.

There are 5 register regions, 6 interrupts including 4 optional
edge-sensitive per-channel interrupts.

Some property values are inherited from host device tree. Host device tree
must feature a combined XGBE/PHY representation (>= 4.2 host kernel).

2 clock nodes (dma and ptp) also are created. It is checked those clocks
are fixed on host side.

AMD XGBE node creation function has a dependency on vfio Linux header and
more generally node creation function for VFIO platform devices only make
sense with CONFIG_LINUX so let's protect this code with #ifdef CONFIG_LINUX.

Signed-off-by: Eric Auger 

---
v1 -> v2:
- add CONFIG_LINUX protection
- improves robustness in sysfs_to_dt_name
- output messages to end-user on misc failures and self-exits:
  error management becomes more violent than before assuming if
  the end-user wants passthrough we must exit if the device cannot
  be instantiated
- fix misc style issues
- remove qemu_fdt_setprop returned value check since it self-asserts

RFC -> v1:
- use qemu_fdt_getprop with Error **
- free substrings in sysfs_to_dt_name
- add some comments related to endianess in add_amd_xgbe_fdt_node
- reword commit message (dtc binary dependency has disappeared)
- check the host device has 5 regions meaning this is a combined
  XGBE/PHY device
---
 hw/arm/sysbus-fdt.c | 187 ++--
 1 file changed, 181 insertions(+), 6 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index a1cf57b..66fa766 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -22,6 +22,10 @@
  */
 
 #include 
+#include "qemu-common.h"
+#ifdef CONFIG_LINUX
+#include 
+#endif
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -29,6 +33,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-platform.h"
 #include "hw/vfio/vfio-calxeda-xgmac.h"
+#include "hw/vfio/vfio-amd-xgbe.h"
 #include "hw/arm/fdt.h"
 
 /*
@@ -64,6 +69,8 @@ typedef struct HostProperty {
 bool optional;
 } HostProperty;
 
+#ifdef CONFIG_LINUX
+
 /**
  * inherit_properties
  *
@@ -126,12 +133,9 @@ static HostProperty clock_inherited_properties[] = {
  * @host_phandle: phandle of the clock in host device tree
  * @guest_phandle: phandle to assign to the guest node
  */
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle);
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle)
+static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+uint32_t host_phandle,
+uint32_t guest_phandle)
 {
 char *node_path = NULL;
 char *nodename;
@@ -176,6 +180,28 @@ void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
 g_free(node_path);
 }
 
+/**
+ * sysfs_to_dt_name: convert the name found in sysfs into the node name
+ * for instance e090.xgmac is converted into xgmac@e090
+ * @sysfs_name: directory name in sysfs
+ *
+ * returns the device tree name upon success or NULL in case the sysfs name
+ * does not match the expected format
+ */
+static char *sysfs_to_dt_name(const char *sysfs_name)
+{
+gchar **substrings =  g_strsplit(sysfs_name, ".", 2);
+char *dt_name = NULL;
+
+if (!substrings || !substrings[1] || !substrings[0]) {
+goto out;
+}
+dt_name = g_strdup_printf("%s@%s", substrings[1], substrings[0]);
+out:
+g_strfreev(substrings);
+return dt_name;
+}
+
 /* Device Specific Code */
 
 /**
@@ -243,9 +269,158 @@ fail_reg:
 return ret;
 }
 
+
+/* AMD xgbe properties whose values are copied/pasted from host */
+static HostProperty amd_xgbe_inherited_properties[] = {
+{"compatible", false},
+{"dma-coherent", true},
+{"amd,per-channel-interrupt", true},
+{"phy-mode", false},
+{"mac-address", true},
+{"amd,speed-set", false},
+{"amd,serdes-blwc", true},
+{"amd,serdes-cdr-rate", true},
+{"amd,serdes-pq-skew", true},
+{"amd,serdes-tx-amp", true},
+{"amd,serdes-dfe-tap-config", true},
+{"amd,serdes-dfe-tap-enable", true},
+{"clock-names", false},
+};
+
+/**
+ * add_amd_xgbe_fdt_node
+ *
+ * Generates the combined xgbe/phy node following kernel >=4.2
+ * binding documentation:
+ * Documentation/devicetree/bindings/net/amd-xgbe.txt:
+ * Also 2 clock nodes are created (dma and ptp)
+ */
+static int add_amd_xgbe_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformBusFDTData *data = opaque;
+PlatformBusDevice *pbus = data->pbus;

[Qemu-devel] [PATCH v2 4/7] device_tree: qemu_fdt_getprop converted to use the error API

2016-01-06 Thread Eric Auger

Current qemu_fdt_getprop exits if the property is not found. It is
sometimes needed to read an optional property, in which case we do
not wish to exit but simply returns a null value.

This patch converts qemu_fdt_getprop to accept an Error **, and existing
users are converted to pass &error_fatal. This preserves the existing
behaviour. Then to use the API with your optional semantic a null
parameter can be conveyed.

Signed-off-by: Eric Auger 

---

v1 -> v2:
- add a doc comment in the header file

RFC -> v1:
- get rid of qemu_fdt_getprop_optional and implement Peter's suggestion
  that consists in using the error API

Signed-off-by: Eric Auger 
---
 device_tree.c| 11 ++-
 include/sysemu/device_tree.h | 15 ++-
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 8441e01..6ecc9da 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -321,18 +321,18 @@ int qemu_fdt_setprop_string(void *fdt, const char 
*node_path,
 }
 
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp)
+ const char *property, int *lenp, Error **errp)
 {
 int len;
 const void *r;
+
 if (!lenp) {
 lenp = &len;
 }
 r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
 if (!r) {
-error_report("%s: Couldn't get %s/%s: %s", __func__,
- node_path, property, fdt_strerror(*lenp));
-exit(1);
+error_setg(errp, "%s: Couldn't get %s/%s: %s", __func__,
+  node_path, property, fdt_strerror(*lenp));
 }
 return r;
 }
@@ -341,7 +341,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
*node_path,
const char *property)
 {
 int len;
-const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len);
+const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, &len,
+ &error_fatal);
 if (len != 4) {
 error_report("%s: %s/%s not 4 bytes long (not a cell?)",
  __func__, node_path, property);
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 269cb1c..4d7cbb9 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -45,8 +45,21 @@ int qemu_fdt_setprop_string(void *fdt, const char *node_path,
 int qemu_fdt_setprop_phandle(void *fdt, const char *node_path,
  const char *property,
  const char *target_node_path);
+/**
+ * qemu_fdt_getprop: retrieve the value of a given property
+ * @fdt: pointer to the device tree blob
+ * @node_path: node path
+ * @property: name of the property to find
+ * @lenp: fdt error if any or length of the property on success
+ * @errp: handle to an error object
+ *
+ * returns a pointer to the property on success and NULL on failure
+ * in case errp is set to &error_fatal, the function auto-asserts
+ * on error (legacy behavior)
+ */
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp);
+ const char *property, int *lenp,
+ Error **errp);
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
const char *property);
 uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
-- 
1.9.1

[Qemu-devel] [PATCH v2 2/7] device_tree: introduce load_device_tree_from_sysfs

2016-01-06 Thread Eric Auger

This function returns the host device tree blob from sysfs
(/proc/device-tree). It uses a recursive function inspired
from dtc read_fstree.

Signed-off-by: Eric Auger 

---
v1 -> v2:
- do not implement/expose read_fstree and load_device_tree_from_sysfs
  if CONFIG_LINUX is not defined (lstat is not implemeted in mingw)
- correct indentation in read_fstree
- use /proc/device-tree symlink instead of /sys/firmware/devicetree/base
  path (kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-ofw)
- use g_file_get_contents in read_fstree
- introduce SYSFS_DT_BASEDIR macro and use strlen
- exit on error in load_device_tree_from_sysfs
- user error_setg

RFC -> v1:
- remove runtime dependency on dtc binary and introduce read_fstree
---
 device_tree.c| 100 +++
 include/sysemu/device_tree.h |   3 ++
 2 files changed, 103 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index a9f5f8e..b262c2d 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -17,6 +17,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_LINUX
+#include 
+#endif
 
 #include "qemu-common.h"
 #include "qemu/error-report.h"
@@ -117,6 +120,103 @@ fail:
 return NULL;
 }
 
+#ifdef CONFIG_LINUX
+
+#define SYSFS_DT_BASEDIR "/proc/device-tree"
+
+/**
+ * read_fstree: this function is inspired from dtc read_fstree
+ * @fdt: preallocated fdt blob buffer, to be populated
+ * @dirname: directory to scan under SYSFS_DT_BASEDIR
+ * the search is recursive and the tree is searched down to the
+ * leafs (property files).
+ *
+ * the function self-asserts in case of error
+ */
+static void read_fstree(void *fdt, const char *dirname)
+{
+DIR *d;
+struct dirent *de;
+struct stat st;
+const char *root_dir = SYSFS_DT_BASEDIR;
+char *parent_node;
+
+if (strstr(dirname, root_dir) != dirname) {
+error_report("%s: %s must be searched within %s",
+ __func__, dirname, root_dir);
+exit(1);
+}
+parent_node = (char *)&dirname[strlen(SYSFS_DT_BASEDIR)];
+
+d = opendir(dirname);
+if (!d) {
+error_setg(&error_fatal, "%s cannot open %s", __func__, dirname);
+}
+
+while ((de = readdir(d)) != NULL) {
+char *tmpnam;
+
+if (!g_strcmp0(de->d_name, ".")
+|| !g_strcmp0(de->d_name, "..")) {
+continue;
+}
+
+tmpnam = g_strjoin("/", dirname, de->d_name, NULL);
+
+if (lstat(tmpnam, &st) < 0) {
+error_setg(&error_fatal, "%s cannot lstat %s", __func__, tmpnam);
+}
+
+if (S_ISREG(st.st_mode)) {
+gchar *val;
+gsize len;
+
+if (!g_file_get_contents(tmpnam, &val, &len, NULL)) {
+error_setg(&error_fatal, "%s not able to extract info from %s",
+   __func__, tmpnam);
+}
+
+if (strlen(parent_node) > 0) {
+qemu_fdt_setprop(fdt, parent_node,
+ de->d_name, val, len);
+} else {
+qemu_fdt_setprop(fdt, "/", de->d_name, val, len);
+}
+g_free(val);
+} else if (S_ISDIR(st.st_mode)) {
+char *node_name;
+
+node_name = g_strdup_printf("%s/%s",
+parent_node, de->d_name);
+qemu_fdt_add_subnode(fdt, node_name);
+g_free(node_name);
+read_fstree(fdt, tmpnam);
+}
+
+g_free(tmpnam);
+}
+
+closedir(d);
+}
+
+/* load_device_tree_from_sysfs: extract the dt blob from host sysfs */
+void *load_device_tree_from_sysfs(void)
+{
+void *host_fdt;
+int host_fdt_size;
+
+host_fdt = create_device_tree(&host_fdt_size);
+read_fstree(host_fdt, SYSFS_DT_BASEDIR);
+if (fdt_check_header(host_fdt)) {
+error_setg(&error_fatal,
+   "%s host device tree extracted into memory is invalid",
+   __func__);
+}
+return host_fdt;
+}
+
+#endif /* CONFIG_LINUX */
+
 static int findnode_nofail(void *fdt, const char *node_path)
 {
 int offset;
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 359e143..fdf25a4 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -16,6 +16,9 @@
 
 void *create_device_tree(int *sizep);
 void *load_device_tree(const char *filename_path, int *sizep);
+#ifdef CONFIG_LINUX
+void *load_device_tree_from_sysfs(void);
+#endif
 
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);
-- 
1.9.1

[Qemu-devel] [PATCH v2 3/7] device_tree: introduce qemu_fdt_node_path

2016-01-06 Thread Eric Auger

This new helper routine returns the node path of a device
referred to by its node name and compat string.

Signed-off-by: Eric Auger 

---

v1 -> v2:
- move doc comment in header file
- do not use a fixed size buffer
- break on errors in while loop
- use strcmp instead of strncmp

RFC -> v1:
- improve error handling according to Alex' comments
---
 device_tree.c| 37 +
 include/sysemu/device_tree.h | 14 ++
 2 files changed, 51 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index b262c2d..8441e01 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -231,6 +231,43 @@ static int findnode_nofail(void *fdt, const char 
*node_path)
 return offset;
 }
 
+int qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+   char **node_path)
+{
+int offset, len, ret;
+const char *iter_name;
+unsigned int path_len = 16;
+char *path;
+
+*node_path = NULL;
+offset = fdt_node_offset_by_compatible(fdt, -1, compat);
+
+while (offset >= 0) {
+iter_name = fdt_get_name(fdt, offset, &len);
+if (!iter_name) {
+offset = len;
+break;
+}
+if (!strcmp(iter_name, name)) {
+goto found;
+}
+offset = fdt_node_offset_by_compatible(fdt, offset, compat);
+}
+return offset;
+
+found:
+path = g_malloc(path_len);
+while ((ret = fdt_get_path(fdt, offset, path, path_len))
+== -FDT_ERR_NOSPACE) {
+path_len += 16;
+path = g_realloc(path, path_len);
+}
+if (!ret) {
+*node_path = path;
+}
+return ret;
+}
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size)
 {
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index fdf25a4..269cb1c 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -20,6 +20,20 @@ void *load_device_tree(const char *filename_path, int 
*sizep);
 void *load_device_tree_from_sysfs(void);
 #endif
 
+/**
+ * qemu_fdt_node_path: return the node path of a device, given its
+ * node name and its compat string
+ * @fdt: pointer to the dt blob
+ * @name: device node name
+ * @compat: compatibility string of the device
+ * @node_path: returned node path
+ *
+ * upon success, the path is output at node_path address
+ * returns 0 on success, < 0 on failure
+ */
+int qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+   char **node_path);
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);
 int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
-- 
1.9.1

[Qemu-devel] [PATCH v2 0/7] AMD XGBE KVM platform passthrough

2016-01-06 Thread Eric Auger

This series allows to set up AMD XGBE passthrough. This was tested on AMD
Seattle.

The first upstreamed device supporting KVM platform passthrough was the
Calxeda Midway XGMAC. Compared to this latter, the XGBE XGMAC exposes a
much more complex device tree node.

- First There are 2 device tree node formats:
one where XGBE and PHY are described in separate nodes and another one
that combines both description in a single node (only supported by 4.2
onwards kernels). Only the combined description is supported for passthrough,
meaning the host must be >= 4.2 and must feature a device tree with a combined
description. The guest will also be exposed with a combined description,
meaning only >= 4.2 guest are supported. It is not planned to support
separate node representation since assignment of the PHY is less
straigtforward.

- the XGMAC/PHY node depends on 2 clock nodes (DMA and PTP).
The code checks those clocks are fixed to make sure they cannot be
switched off at some point after the native driver gets unbound.

- there are many property values to populate on guest side. Most of them
cannot be hardcoded. That series implements host device tree blob extraction
from the host /proc/device-tree (inspired from dtc implementation)
and retrieve host property values to populate guest dtb.

- the case where the host uses ACPI is not yet covered since there is
  no usable ACPI description for this HW yet.

The patches can be found at
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-xgbe-v2

Previous version can be found at
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-xgbe-v1

HISTORY:
v1 -> v2:
- take into account Peter's comments:
  - add CONFIG_LINUX protection
  - improve error handling and messages
  - no fixed size buffer anymore
  - fix read_fstree error handling
- use /proc/device-tree symlink instead of /sys/firmware/devicetree/base
- added hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check
- see individual commits for full details

RFC -> v1:
- no dependency anymore on dtc binary: load_device_tree_from_sysfs is
  self-contained and implements something similar to dtc read_fstree.
- take into account Alex' comments
- remove qemu_fdt_getprop_optional and use error API instead

Best Regards

Eric


Eric Auger (7):
  hw/vfio/platform: amd-xgbe device
  device_tree: introduce load_device_tree_from_sysfs
  device_tree: introduce qemu_fdt_node_path
  device_tree: qemu_fdt_getprop converted to use the error API
  hw/arm/sysbus-fdt: helpers for clock node generation
  hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation
  hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

 device_tree.c   | 148 ++-
 hw/arm/sysbus-fdt.c | 306 ++--
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/amd-xgbe.c  |  55 
 include/hw/vfio/vfio-amd-xgbe.h |  51 +++
 include/sysemu/device_tree.h|  32 -
 6 files changed, 577 insertions(+), 16 deletions(-)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h

-- 
1.9.1

[Qemu-devel] [PATCH v2 1/7] hw/vfio/platform: amd-xgbe device

2016-01-06 Thread Eric Auger

This patch introduces the amd-xgbe VFIO platform device. It
allows the guest to do passthrough on a device exposing an
"amd,xgbe-seattle-v1a" compat string.

Signed-off-by: Eric Auger 
Reviewed-by: Alex Benné

---
RFC -> v1:
- add Alex' R-b
---
 hw/vfio/Makefile.objs   |  1 +
 hw/vfio/amd-xgbe.c  | 55 +
 include/hw/vfio/vfio-amd-xgbe.h | 51 ++
 3 files changed, 107 insertions(+)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index d324863..ceddbb8 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -3,4 +3,5 @@ obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o pci-quirks.o
 obj-$(CONFIG_SOFTMMU) += platform.o
 obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
+obj-$(CONFIG_SOFTMMU) += amd-xgbe.o
 endif
diff --git a/hw/vfio/amd-xgbe.c b/hw/vfio/amd-xgbe.c
new file mode 100644
index 000..53451eb
--- /dev/null
+++ b/hw/vfio/amd-xgbe.c
@@ -0,0 +1,55 @@
+/*
+ * AMD XGBE VFIO device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-amd-xgbe.h"
+
+static void amd_xgbe_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOAmdXgbeDeviceClass *k = VFIO_AMD_XGBE_DEVICE_GET_CLASS(dev);
+
+vdev->compat = g_strdup("amd,xgbe-seattle-v1a");
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_amd_xgbe_vmstate = {
+.name = TYPE_VFIO_AMD_XGBE,
+.unmigratable = 1,
+};
+
+static void vfio_amd_xgbe_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOAmdXgbeDeviceClass *vcxc =
+VFIO_AMD_XGBE_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = amd_xgbe_realize;
+dc->desc = "VFIO AMD XGBE";
+dc->vmsd = &vfio_platform_amd_xgbe_vmstate;
+}
+
+static const TypeInfo vfio_amd_xgbe_dev_info = {
+.name = TYPE_VFIO_AMD_XGBE,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOAmdXgbeDevice),
+.class_init = vfio_amd_xgbe_class_init,
+.class_size = sizeof(VFIOAmdXgbeDeviceClass),
+};
+
+static void register_amd_xgbe_dev_type(void)
+{
+type_register_static(&vfio_amd_xgbe_dev_info);
+}
+
+type_init(register_amd_xgbe_dev_type)
diff --git a/include/hw/vfio/vfio-amd-xgbe.h b/include/hw/vfio/vfio-amd-xgbe.h
new file mode 100644
index 000..9fff65e
--- /dev/null
+++ b/include/hw/vfio/vfio-amd-xgbe.h
@@ -0,0 +1,51 @@
+/*
+ * VFIO AMD XGBE device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_AMD_XGBE_H
+#define HW_VFIO_VFIO_AMD_XGBE_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_AMD_XGBE "vfio-amd-xgbe"
+
+/**
+ * This device exposes:
+ * - 5 MMIO regions: MAC, PCS, SerDes Rx/Tx regs,
+ SerDes Integration Registers 1/2 & 2/2
+ * - 2 level sensitive IRQs and optional DMA channel IRQs
+ */
+struct VFIOAmdXgbeDevice {
+VFIOPlatformDevice vdev;
+};
+
+typedef struct VFIOAmdXgbeDevice VFIOAmdXgbeDevice;
+
+struct VFIOAmdXgbeDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+};
+
+typedef struct VFIOAmdXgbeDeviceClass VFIOAmdXgbeDeviceClass;
+
+#define VFIO_AMD_XGBE_DEVICE(obj) \
+ OBJECT_CHECK(VFIOAmdXgbeDevice, (obj), TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOAmdXgbeDeviceClass, (klass), \
+TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOAmdXgbeDeviceClass, (obj), \
+  TYPE_VFIO_AMD_XGBE)
+
+#endif
-- 
1.9.1

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-06 Thread Eric Blake

On 01/05/2016 07:50 PM, lihuiba wrote:
> At 2016-01-05 21:55:56, "Eric Blake"  wrote:
>> On 01/05/2016 05:10 AM, lihuiba wrote:
>>
> In our production environment, we need to extend a qcow2 image with
> snapshots in it.
>>
 The thing is that one would need to update all the inactive L1 tables. I
 don't think it should be too difficult, it's just that apparently so far
 nobody ever had the need for this feature.
>>
>> Is resizing a snapshot really what you want?  Ideally, a snapshot tracks
>> the data from a point in time, including the metadata of the size being
>> tracked at that time.  Extending the snapshots then reverting to that
>> snapshot means your guest would see a larger disk on revert than it did
>> at the time the snapshot was created, which guests might not handle very
>> well.
> I want to make resizing (extending only) and snapshot independent to each 
> other,otherwise going back and forth in snapshots may cause the disk shrinked 
> and extended. That would introduce some technical trouble, and possibly 
> confuse user as well.

If I take a snapshot while the guest sees a 1G disk, then resize the
disk to 2G, then roll back to the point in time of the snapshot, I'd
expect the disk to roll back to 1G in size.  Anything else is likely to
confuse the guest.  And that's what current resize support already does
(it only resizes the active image, not the snapshots).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 07/11] igd: revamp host config read

2016-01-06 Thread Stefano Stabellini

On Tue, 5 Jan 2016, Gerd Hoffmann wrote:
> Move all work to the host_pci_config_copy helper function,
> which we can easily reuse when adding q35 support.
> Open sysfs file only once for all values.  Use pread.
> Proper error handling.  Fix bugs:
> 
>  * Don't throw away results (like old host_pci_config_read
>did because val was passed by value not reference).
>  * Update config space directly (writing via
>pci_default_write_config only works for registers
>whitelisted in wmask).
> 
> Hmm, this code can hardly ever worked before,
> /me wonders what test coverage it had.
> 
> With this patch in place igd-passthru=on actually
> works, although it still requires root priviledges
> because linux refuses to allow non-root users access
> pci config space above offset 0x50.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/pci-host/igd.c | 65 
> +++
>  1 file changed, 27 insertions(+), 38 deletions(-)
> 
> diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
> index 0784128..ec48875 100644
> --- a/hw/pci-host/igd.c
> +++ b/hw/pci-host/igd.c
> @@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = {
>  {0xa8, 4},  /* SNB: base of GTT stolen memory */
>  };
>  
> -static int host_pci_config_read(int pos, int len, uint32_t val)
> +static void host_pci_config_copy(PCIDevice *guest, const char *host,
> + const IGDHostInfo *list, int len, Error 
> **errp)
>  {
> -char path[PATH_MAX];
> -int config_fd;
> -ssize_t size = sizeof(path);
> -/* Access real host bridge. */
> -int rc = snprintf(path, size, 
> "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
> -  0, 0, 0, 0, "config");
> -int ret = 0;
> +char *path;
> +int config_fd, rc, i;
>  
> -if (rc >= size || rc < 0) {
> -return -ENODEV;
> -}
> -
> -config_fd = open(path, O_RDWR);
> +path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host);
> +config_fd = open(path, O_RDONLY);
>  if (config_fd < 0) {
> -return -ENODEV;
> +error_setg_file_open(errp, errno, path);
> +goto out_free;
>  }
>  
> -if (lseek(config_fd, pos, SEEK_SET) != pos) {
> -ret = -errno;
> -goto out;
> +for (i = 0; i < len; i++) {
> +rc = pread(config_fd, guest->config + list[i].offset,
> +   list[i].len, list[i].offset);
> +if (rc != list[i].len) {

pread is allowed to return early, returning the number of bytes read.



> +error_setg_errno(errp, errno, "read %s, offset 0x%x",
> + path, list[i].offset);
> +goto out_close;
> +}
>  }
> -do {
> -rc = read(config_fd, (uint8_t *)&val, len);
> -} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
> -if (rc != len) {
> -ret = -errno;
> -}
> -out:
> +
> +out_close:
>  close(config_fd);
> -return ret;
> +out_free:
> +g_free(path);
>  }
>  
>  static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
>  static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
>  {
>  Error *err = NULL;
> -uint32_t val = 0;
> -int rc, i, num;
> -int pos, len;
>  
>  i440fx_realize(pci_dev, &err);
>  if (err != NULL) {
> @@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, 
> Error **errp)
>  return;
>  }
>  
> -num = ARRAY_SIZE(igd_host_bridge_infos);
> -for (i = 0; i < num; i++) {
> -pos = igd_host_bridge_infos[i].offset;
> -len = igd_host_bridge_infos[i].len;
> -rc = host_pci_config_read(pos, len, val);
> -if (rc) {
> -error_setg(errp, "failed to read host config");
> -return;
> -}
> -pci_default_write_config(pci_dev, pos, val, len);
> +host_pci_config_copy(pci_dev, ":00:00.0",
> + igd_host_bridge_infos,
> + ARRAY_SIZE(igd_host_bridge_infos),
> + &err);
> +if (err != NULL) {
> +error_propagate(errp, err);
> +return;
>  }
>  }
>  
> -- 
> 1.8.3.1
>

[Qemu-devel] [PATCH] i2c-tiny-usb: add new usb to i2c bridge

2016-01-06 Thread Tim Sander

Version 4 with improvements suggested by Gerd Hoffmann:

Signed-off-by: Tim Sander 

i2c-tiny-usb is a small usb to i2c bridge:
 http://www.harbaum.org/till/i2c_tiny_usb/index.shtml

It is pretty simple and has no usb endpoints just a control.
Reasons for adding this device:
* Linux device driver available
* adding an additional i2c bus via command line e.g.
  -device usb-i2c-tiny,id=i2c-0 -device tmp105,bus=i2c,address=0x50
---
 default-configs/usb.mak |   1 +
 hw/usb/Makefile.objs|   1 +
 hw/usb/dev-i2c-tiny.c   | 320 
 trace-events|  11 ++
 4 files changed, 333 insertions(+)
 create mode 100644 hw/usb/dev-i2c-tiny.c

diff --git a/default-configs/usb.mak b/default-configs/usb.mak
index f4b8568..01d2c9f 100644
--- a/default-configs/usb.mak
+++ b/default-configs/usb.mak
@@ -8,3 +8,4 @@ CONFIG_USB_AUDIO=y
 CONFIG_USB_SERIAL=y
 CONFIG_USB_NETWORK=y
 CONFIG_USB_BLUETOOTH=y
+CONFIG_USB_I2C_TINY=y
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index 8f00fbd..3a4c337 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -20,6 +20,7 @@ common-obj-$(CONFIG_USB_AUDIO)+= dev-audio.o
 common-obj-$(CONFIG_USB_SERIAL)   += dev-serial.o
 common-obj-$(CONFIG_USB_NETWORK)  += dev-network.o
 common-obj-$(CONFIG_USB_BLUETOOTH)+= dev-bluetooth.o
+common-obj-$(CONFIG_USB_I2C_TINY) += dev-i2c-tiny.o
 
 ifeq ($(CONFIG_USB_SMARTCARD),y)
 common-obj-y  += dev-smartcard-reader.o
diff --git a/hw/usb/dev-i2c-tiny.c b/hw/usb/dev-i2c-tiny.c
new file mode 100644
index 000..c28d7a5
--- /dev/null
+++ b/hw/usb/dev-i2c-tiny.c
@@ -0,0 +1,320 @@
+/*
+ * I2C tiny usb device emulation
+ *
+ * i2c-tiny-usb is a small usb to i2c bridge:
+ *
+ * http://www.harbaum.org/till/i2c_tiny_usb/index.shtml
+ *
+ * The simulated device is pretty simple and has no usb endpoints.
+ * There is a Linux device driver available named i2c-tiny-usb.
+ *
+ * Below is an example how to use this device from command line:
+ *  -device usb-i2c-tiny,id=i2c-0 -device tmp105,bus=i2c,address=0x50
+ *
+ * Copyright (c) 2015 Tim Sander 
+ *
+ * Loosly based on usb dev-serial.c:
+ * Copyright (c) 2006 CodeSourcery.
+ * Copyright (c) 2008 Samuel Thibault 
+ * Written by Paul Brook, reused for FTDI by Samuel Thibault
+ *
+ * This code is licensed under the LGPL.
+ *
+ */
+
+#include "trace.h"
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+#include "hw/usb.h"
+#include "hw/usb/desc.h"
+#include "hw/i2c/i2c.h"
+#include "hw/i2c/smbus.h"
+#include "sysemu/char.h"
+#include "endian.h"
+
+/* commands from USB, must e.g. match command ids in kernel driver */
+#define CMD_ECHO   0
+#define CMD_GET_FUNC   1
+#define CMD_SET_DELAY  2
+#define CMD_GET_STATUS 3
+
+/* To determine what functionality is present */
+#define I2C_FUNC_I2C0x0001
+#define I2C_FUNC_10BIT_ADDR 0x0002
+#define I2C_FUNC_PROTOCOL_MANGLING  0x0004
+#define I2C_FUNC_SMBUS_HWPEC_CALC   0x0008 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_READ_WORD_DATA_PEC   0x0800 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_WRITE_WORD_DATA_PEC  0x1000 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_PROC_CALL_PEC0x2000 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_BLOCK_PROC_CALL_PEC  0x4000 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_BLOCK_PROC_CALL  0x8000 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_QUICK0x0001
+#define I2C_FUNC_SMBUS_READ_BYTE0x0002
+#define I2C_FUNC_SMBUS_WRITE_BYTE   0x0004
+#define I2C_FUNC_SMBUS_READ_BYTE_DATA   0x0008
+#define I2C_FUNC_SMBUS_WRITE_BYTE_DATA  0x0010
+#define I2C_FUNC_SMBUS_READ_WORD_DATA   0x0020
+#define I2C_FUNC_SMBUS_WRITE_WORD_DATA  0x0040
+#define I2C_FUNC_SMBUS_PROC_CALL0x0080
+#define I2C_FUNC_SMBUS_READ_BLOCK_DATA  0x0100
+#define I2C_FUNC_SMBUS_WRITE_BLOCK_DATA 0x0200
+#define I2C_FUNC_SMBUS_READ_I2C_BLOCK   0x0400 /*I2C-like blk-xfr 
*/
+#define I2C_FUNC_SMBUS_WRITE_I2C_BLOCK  0x0800 /*1-byte reg. 
addr.*/
+#define I2C_FUNC_SMBUS_READ_I2C_BLOCK_2 0x1000 /*I2C-like 
blk-xfer*/
+#define I2C_FUNC_SMBUS_WRITE_I2C_BLOCK_20x2000 /* w/ 2-byte 
regadr*/
+#define I2C_FUNC_SMBUS_READ_BLOCK_DATA_PEC  0x4000 /* SMBus 2.0 */
+#define I2C_FUNC_SMBUS_WRITE_BLOCK_DATA_PEC 0x8000 /* SMBus 2.0 */
+
+#define I2C_FUNC_SMBUS_BYTE (I2C_FUNC_SMBUS_READ_BYTE | \
+I2C_FUNC_SMBUS_WRITE_BYTE)
+#define I2C_FUNC_SMBUS_BYTE_DATA (I2C_FUNC_SMBUS_READ_BYTE_DATA | \
+I2C_FUNC_SMBUS_WRITE_BYTE_DATA)
+#define I2C_FUNC_SMBUS_WORD_DATA (I2C_FUNC_SMBUS_READ_WORD_DATA | \
+I2C_FUNC_SMBUS_WRITE_WORD_DATA)
+#define I2C_FUNC_SMBUS_BLOCK_DATA (I2C_FUNC_SMBUS_READ_BLOCK_DATA | \
+I2C_FUNC_SMBUS_WRITE_BLOCK_DATA)
+#define I2C_FUNC_SMBUS_I2C_BLOCK (

Re: [Qemu-devel] [PATCH 3/8] ipmi: add GET_SYS_RESTART_CAUSE chassis command

2016-01-06 Thread Greg Kurz

On Tue,  5 Jan 2016 18:29:57 +0100
Cédric Le Goater  wrote:

> This is a simulator. Just return an unknown cause (0).
> 
> Signed-off-by: Cédric Le Goater 
> ---

Acked-by: Greg Kurz 

>  hw/ipmi/ipmi_bmc_sim.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
> index 061db8437479..5db94491b130 100644
> --- a/hw/ipmi/ipmi_bmc_sim.c
> +++ b/hw/ipmi/ipmi_bmc_sim.c
> @@ -30,11 +30,12 @@
>  #include "qemu/error-report.h"
> 
>  #define IPMI_NETFN_CHASSIS0x00
> -#define IPMI_NETFN_CHASSIS_MAXCMD 0x03
> +#define IPMI_NETFN_CHASSIS_MAXCMD 0x0a
> 

No big deal again but I guess the define would better fit...

>  #define IPMI_CMD_GET_CHASSIS_CAPABILITIES 0x00
>  #define IPMI_CMD_GET_CHASSIS_STATUS   0x01
>  #define IPMI_CMD_CHASSIS_CONTROL  0x02
> +#define IPMI_CMD_GET_SYS_RESTART_CAUSE0x09
> 

... here.

>  #define IPMI_NETFN_SENSOR_EVENT   0x04
>  #define IPMI_NETFN_SENSOR_EVENT_MAXCMD0x30
> @@ -201,6 +202,8 @@ struct IPMIBmcSim {
>  uint8_t mfg_id[3];
>  uint8_t product_id[2];
> 
> +uint8_t restart_cause;
> +
>  IPMISel sel;
>  IPMISdr sdr;
>  IPMISensor sensors[MAX_SENSORS];
> @@ -754,6 +757,17 @@ static void chassis_control(IPMIBmcSim *ibs,
>  return;
>  }
> 
> +static void chassis_get_sys_restart_cause(IPMIBmcSim *ibs,
> +   uint8_t *cmd, unsigned int cmd_len,
> +   uint8_t *rsp, unsigned int *rsp_len,
> +   unsigned int max_rsp_len)
> +{
> +IPMI_ADD_RSP_DATA(ibs->restart_cause & 0xf); /* Restart Cause */

Define a mask ?

> +IPMI_ADD_RSP_DATA(0);  /* Channel 0 */
> + out:
> +return;
> +}
> +
>  static void get_device_id(IPMIBmcSim *ibs,
>uint8_t *cmd, unsigned int cmd_len,
>uint8_t *rsp, unsigned int *rsp_len,
> @@ -1624,7 +1638,8 @@ static void get_sensor_type(IPMIBmcSim *ibs,
>  static const IPMICmdHandler chassis_cmds[IPMI_NETFN_CHASSIS_MAXCMD] = {
>  [IPMI_CMD_GET_CHASSIS_CAPABILITIES] = chassis_capabilities,
>  [IPMI_CMD_GET_CHASSIS_STATUS] = chassis_status,
> -[IPMI_CMD_CHASSIS_CONTROL] = chassis_control
> +[IPMI_CMD_CHASSIS_CONTROL] = chassis_control,
> +[IPMI_CMD_GET_SYS_RESTART_CAUSE] = chassis_get_sys_restart_cause
>  };
>  static const IPMINetfn chassis_netfn = {
>  .cmd_nums = IPMI_NETFN_CHASSIS_MAXCMD,
> @@ -1746,6 +1761,7 @@ static void ipmi_sim_init(Object *obj)
>  ibs->bmc_global_enables = (1 << IPMI_BMC_EVENT_LOG_BIT);
>  ibs->device_id = 0x20;
>  ibs->ipmi_version = 0x02; /* IPMI 2.0 */
> +ibs->restart_cause = 0;
>  for (i = 0; i < 4; i++) {
>  ibs->sel.last_addition[i] = 0xff;
>  ibs->sel.last_clear[i] = 0xff;

[Qemu-devel] [PATCH] hw/arm/virt: Initialize NICs configured in PCI bus

2016-01-06 Thread Ashok Kumar

virtio model is used for default case.

Signed-off-by: Ashok Kumar 
---
 hw/arm/virt.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index acc1fcb..fd52b76 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -808,6 +808,7 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq 
*pic,
 DeviceState *dev;
 char *nodename;
 int i;
+PCIHostState *pci;
 
 dev = qdev_create(NULL, TYPE_GPEX_HOST);
 qdev_init_nofail(dev);
@@ -847,6 +848,19 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq 
*pic,
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
 }
 
+pci = PCI_HOST_BRIDGE(dev);
+if (pci->bus) {
+for (i = 0; i < nb_nics; i++) {
+NICInfo *nd = &nd_table[i];
+
+if (!nd->model) {
+nd->model = g_strdup("virtio");
+}
+
+pci_nic_init_nofail(nd, pci->bus, nd->model, NULL);
+}
+}
+
 nodename = g_strdup_printf("/pcie@%" PRIx64, base);
 qemu_fdt_add_subnode(vbi->fdt, nodename);
 qemu_fdt_setprop_string(vbi->fdt, nodename,
-- 
2.1.0

1 2 >

1 - 100 of 159 matches

Mail list logo