Re: [Qemu-devel] [PATCH RFC v4 15/20] intel_iommu: provide its own replay() callback

2017-01-21 Thread Jason Wang



On 2017年01月20日 21:08, Peter Xu wrote:

The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Signed-off-by: Peter Xu 
---
  hw/i386/intel_iommu.c | 216 --
  hw/i386/trace-events  |   7 ++
  include/exec/memory.h |   2 +
  3 files changed, 220 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6f5f68a..f9c5142 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -598,6 +598,22 @@ static inline uint32_t 
vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
  return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
  }
  
+static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)

+{
+uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
+return 1ULL << MIN(ce_agaw, VTD_MGAW);
+}
+
+/* Return true if IOVA passes range check, otherwise false. */
+static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
+{
+/*
+ * Check if @iova is above 2^X-1, where X is the minimum of MGAW
+ * in CAP_REG and AW in context-entry.
+ */
+return !(iova & ~(vtd_iova_limit(ce) - 1));
+}
+
  static const uint64_t vtd_paging_entry_rsvd_field[] = {
  [0] = ~0ULL,
  /* For not large page */
@@ -633,13 +649,9 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
  uint32_t level = vtd_get_level_from_context_entry(ce);
  uint32_t offset;
  uint64_t slpte;
-uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
  uint64_t access_right_check;
  
-/* Check if @iova is above 2^X-1, where X is the minimum of MGAW

- * in CAP_REG and AW in context-entry.
- */
-if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
+if (!vtd_iova_range_check(iova, ce)) {
  trace_vtd_err("IOVA exceeds limits");
  return -VTD_FR_ADDR_BEYOND_MGAW;
  }
@@ -681,6 +693,168 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t 
iova, bool is_write,
  }
  }
  
+typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);

+
+/**
+ * vtd_page_walk_level - walk over specific level for IOVA range
+ *
+ * @addr: base GPA addr to start the walk
+ * @start: IOVA range start address
+ * @end: IOVA range end address (start <= addr < end)
+ * @hook_fn: hook func to be called when detected page
+ * @private: private data to be passed into hook func
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @skipped: accumulated skipped ranges


What's the usage for this parameter? Looks like it was never used in 
this series.



+ * @notify_unmap: whether we should notify invalid entries
+ */
+static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
+   uint64_t end, vtd_page_walk_hook hook_fn,
+   void *private, uint32_t level,
+   bool read, bool write, uint64_t *skipped,
+   bool notify_unmap)
+{
+bool read_cur, write_cur, entry_valid;
+uint32_t offset;
+uint64_t slpte;
+uint64_t subpage_size, subpage_mask;
+IOMMUTLBEntry entry;
+uint64_t iova = start;
+uint64_t iova_next;
+uint64_t skipped_local = 0;
+int ret = 0;
+
+trace_vtd_page_walk_level(addr, level, start, end);
+
+subpage_size = 1ULL << vtd_slpt_level_shift(level);
+subpage_mask = vtd_slpt_level_page_mask(level);
+
+while (iova < end) {
+iova_next = (iova & subpage_mask) + subpage_size;
+
+offset = vtd_iova_level_offset(iova, level);
+slpte = vtd_get_slpte(addr, offset);
+
+/*
+ * When one of the following case happens, we assume the whole
+ * range is invalid:
+ *
+ * 1. read block failed


Don't get the meaning (and don't see any code relate to this comment).


+ * 3. reserved area non-zero
+ * 2. both read & write flag are not set


Should be 1,2,3? And the above comment is conflict with the code at 
least when notify_unmap is true.



+ */
+
+if (slpte == (uint64_t)-1) {


If this is true, vtd_slpte_nonzero_rsvd(slpte) should be true too I think?


+trace_vtd_page_walk_skip_read(iova, iova_next);
+skipped_local++;
+goto next;
+}

[Qemu-devel] [PATCH] virtio-gpu: fix memory leak in set scanout

2017-01-21 Thread Li Qiang
From: Li Qiang 

In virtio_gpu_set_scanout function, when creating the 'rect'
its refcount is set to 2, by pixman_image_create_bits and
qemu_create_displaysurface_pixman function. This can lead 
a memory leak issues. This patch avoid this issue.

Signed-off-by: Li Qiang 
---
 hw/display/virtio-gpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 7a15c61..4aecea3 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -608,6 +608,7 @@ static void virtio_gpu_set_scanout(VirtIOGPU *g,
 cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
 return;
 }
+pixman_image_unref(rect);
 dpy_gfx_replace_surface(g->scanout[ss.scanout_id].con, scanout->ds);
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH] Change name of live migration thread

2017-01-21 Thread Pankaj Gupta
Change the name of live migration thread from 'migration' 
to 'qemu_vm_migration' to identify it clearly. 'migration'
is a generic word and kernel also has  tasks for process
migration with the name 'migration/cpu#'.

Signed-off-by: Pankaj Gupta 
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index f498ab8..6ebd606 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1969,7 +1969,7 @@ void migrate_fd_connect(MigrationState *s)
 }
 
 migrate_compress_threads_create();
-qemu_thread_create(>thread, "migration", migration_thread, s,
+qemu_thread_create(>thread, "qemu_vm_migration", migration_thread, s,
QEMU_THREAD_JOINABLE);
 s->migration_thread_running = true;
 }
-- 
2.7.4




Re: [Qemu-devel] [PATCH 01/18] tcg: add support for 128bit vector type

2017-01-21 Thread Richard Henderson

On 01/19/2017 08:54 AM, Kirill Batuzov wrote:


Wrappers issue emulation code instead of operation if it is not supported by 
host.

tcg_gen_add_i32x4 looks like this:

if (TCG_TARGET_HAS_add_i32x4) {
tcg_gen_op3_v128(INDEX_op_add_i32x4, args[0], args[1], args[2]);
} else {
for (i = 0; i < 4; i++) {
tcg_gen_ld_i32(...);
tcg_gen_ld_i32(...);
tcg_gen_add_i32(...);
tcg_gen_st_i32(...);
}
}


To me that begs the question of why you wouldn't issue 4 adds on 4 i32 
registers instead.



r~



Re: [Qemu-devel] [PATCH RFC v3 02/14] intel_iommu: simplify irq region translation

2017-01-21 Thread Peter Xu
On Sun, Jan 22, 2017 at 04:42:13AM +, Tian, Kevin wrote:
> > From: Peter Xu [mailto:pet...@redhat.com]
> > Sent: Friday, January 20, 2017 6:04 PM
> > 
> > On Fri, Jan 20, 2017 at 09:52:01AM +, Tian, Kevin wrote:
> > 
> > [...]
> > 
> > > btw what about guest setups a valid mapping at 0xFEEx_ in
> > > its remapping structure, which is then programmed to virtual
> > > device as DMA destination? Then when emulating that virtual DMA,
> > > vtd_do_iommu_translate should simply return (maybe throw out
> > > a warning for diagnostic purpose) instead of assert here.
> > >
> > > VT-d spec defines as below:
> > >
> > >   Software must ensure the second-level paging-structure entries
> > >   are programmed not to remap input addresses to the interrupt
> > >   address range. Hardware behavior is undefined for memory
> > >   requests remapped to the interrupt address range.
> > 
> > Thanks for this reference. That's something I was curious about.
> > 
> > >
> > > I don't think "hardware behavior is undefined" is equal to "assert
> > > thus kill VM"...
> > 
> > I don't think it will kill the VM. After we have the MSI region, it
> > should just use that IR region for everything (read/write/translate).
> > So iiuc when anyone setups IOVA mapping within range 0xfeex, then
> > a DMA will trigger an interrupt (rather than memory moves), but in
> > most cases the interrupt will be illegal since either the data is
> > invalid (e.g., non-zero reserved bits, or SID verification failure),
> > further it should trigger a vIOMMU fault (though IR fault reporting is
> > still incomplete, that's my next thing to do after this series).
> > 
> 
> Yes, you're right here. Sorry for bothering with my wrong understanding. :-)

No problem at all.

Looking forward to any of your further comments on v4. :-)

-- peterx



Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest

2017-01-21 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Saturday, January 21, 2017 2:21 AM
> 
> On Fri, 20 Jan 2017 06:57:22 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson
> > > Sent: Thursday, January 19, 2017 6:32 AM
> > >
> > > On Sat, 31 Dec 2016 17:13:07 +0800
> > > Cao jin  wrote:
> > >
> > > > From: Chen Fan 
> > > >
> > > > When physical device has uncorrectable error hanppened, the vfio_pci
> > > > driver will signal the uncorrectable error status register value to
> > > > corresponding QEMU's vfio-pci device via the eventfd registered by this
> > > > device, then, the vfio-pci's error eventfd handler will be invoked in
> > > > event loop.
> > > >
> > > > Construct and pass the aer message to root port, root port will trigger 
> > > > an
> > > > interrupt to signal guest, then, the guest driver will do the recovery.
> > > >
> > > > Note: Now only support non-fatal error's recovery, fatal error will
> > > > still result in vm stop.
> > >
> > > Please update the entire commit log, don't just add a note that this
> > > now only covers non-fatal errors.
> > >
> >
> > One thing relate to vIOMMU. There is still a TODO task about forwarding
> > IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to
> > walk guest remapping structure to emulate virtual IOMMU fault. Likely
> > it also requires eventfd mechanism.
> >
> > Wondering whether making sense to reuse same eventfd for both AER
> > and vIOMMU or using separate eventfd is also fine? Even go with the
> > former option, I don't expect substantial change to this series. Major
> > change is on interface definition - extensible to multiple types of
> > fault/error conditions instead of assuming AER only.
> >
> > Thoughts?
> 
> We can't really convey any information through an eventfd, it's just a
> signal, so I don't think we can use the same eventfd for both types of
> errors.  Already here we're discussing the idea of using separate
> eventfds for fatal vs non-fatal AERs.  IOMMU error processing seems
> like yet another eventfd and likely some region or ioctl mechanism for
> retrieving the error details since the IOMMU hardware is not directly
> accessible.  Furthermore, such an event might logically be connected to
> the vfio container rather than the device, so it might not even use the
> same file descriptor.  Thanks,
> 

Clear enough. Thanks,
Kevin



Re: [Qemu-devel] [PATCH RFC v3 02/14] intel_iommu: simplify irq region translation

2017-01-21 Thread Tian, Kevin
> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Friday, January 20, 2017 6:04 PM
> 
> On Fri, Jan 20, 2017 at 09:52:01AM +, Tian, Kevin wrote:
> 
> [...]
> 
> > btw what about guest setups a valid mapping at 0xFEEx_ in
> > its remapping structure, which is then programmed to virtual
> > device as DMA destination? Then when emulating that virtual DMA,
> > vtd_do_iommu_translate should simply return (maybe throw out
> > a warning for diagnostic purpose) instead of assert here.
> >
> > VT-d spec defines as below:
> >
> > Software must ensure the second-level paging-structure entries
> > are programmed not to remap input addresses to the interrupt
> > address range. Hardware behavior is undefined for memory
> > requests remapped to the interrupt address range.
> 
> Thanks for this reference. That's something I was curious about.
> 
> >
> > I don't think "hardware behavior is undefined" is equal to "assert
> > thus kill VM"...
> 
> I don't think it will kill the VM. After we have the MSI region, it
> should just use that IR region for everything (read/write/translate).
> So iiuc when anyone setups IOVA mapping within range 0xfeex, then
> a DMA will trigger an interrupt (rather than memory moves), but in
> most cases the interrupt will be illegal since either the data is
> invalid (e.g., non-zero reserved bits, or SID verification failure),
> further it should trigger a vIOMMU fault (though IR fault reporting is
> still incomplete, that's my next thing to do after this series).
> 

Yes, you're right here. Sorry for bothering with my wrong understanding. :-)

Thanks
Kevin


Re: [Qemu-devel] Nested PCI passthrough

2017-01-21 Thread Peter Xu
On Sat, Jan 21, 2017 at 04:16:25AM +0100, fassl wrote:
> Hello,
> 
> i am trying to pass through a graphic card to a guest within a guest.
> So far i can see a text console within the target vm which says
> "radeon: ring 0 test failed", so it times out in radeon_vce_ring_test
> function. I am using qemu 2.8.50 at revision
> 0f6bcf68a99efdc531b209551f2b760b0bdcc554.
> 
> The relevant lowermost host arguments are:
> -machine pc-q35-2.8,accel=kvm,kernel-irqchip=split
> -device intel-iommu,intremap=on,eim=on
> 
> If i dont set the x-vga flag for the passed through device in the
> lowermost host the VM within the VM does not reset(?) the device during
> shutdown and the screen freezes. If i do, the screen goes black and no
> signal is going to the screen anymore and i can restart the target VM
> without the whole machine freezing. (one has to set
> CONFIG_VFIO_PCI_VGA=y in the kernel of the first VM, or call
> pci_register_vga to get this)
> 
> Also with irqchip=split the first vm cannot shutdown gracefully and
> crashes during shutdown.
> 
> I also can see some IRTE vector and trigger mode inconsistencies, can
> they cause this?

This should not be related. Even you don't passthrough devices, they
should be possibly there as well as long as you are running Linux
inside guest (of course, you should have enabled IOMMU_DEBUG).

-- peterx



Re: [Qemu-devel] [PATCH RFC 0/3] vfio: allow to notify unmap for very big region

2017-01-21 Thread Peter Xu
On Fri, Jan 20, 2017 at 10:14:01AM -0700, Alex Williamson wrote:
> On Fri, 20 Jan 2017 20:27:18 +0800
> Peter Xu  wrote:
> 
> > On Fri, Jan 20, 2017 at 11:43:28AM +0800, Peter Xu wrote:
> > 
> > [...]
> > 
> > > > What I don't want to see is for this API bug to leak out into the rest
> > > > of the QEMU code such that intel_iommu code, or iommu code in general
> > > > subtly avoids it by artificially using a smaller range.  VT-d hardware
> > > > has an actual physical address space of either 2^39 or 2^48 bits, so if
> > > > you want to make the iommu address space match the device we're trying
> > > > to emulate, that's perfectly fine.  AIUI, AMD-Vi does actually have a
> > > > 64-bit address space on the IOMMU, so to handle that case I'd expect
> > > > the simplest solution would be to track the and mapped iova high water
> > > > mark per container in vfio and truncate unmaps to that high water end
> > > > address.  Realistically we're probably not going to see iovas at the end
> > > > of the 64-bit address space, but we can come up with some other
> > > > workaround in the vfio code or update the kernel API if we do.  Thanks, 
> > > >  
> > > 
> > > Agree that high watermark can be a good solution for VT-d. I'll use
> > > that instead of 2^63-1.  
> > 
> > Okay when I replied I didn't notice this "watermark" may need more
> > than several (even tens of) LOCs. :(
> > 
> > Considering that I see no further usage of this watermark, I'm
> > thinking whether it's okay I directly use (1ULL << VTD_MGAW) here as
> > the watermark - it's simple, efficient and secure imho.
> 
> Avoiding the issue based on the virtual iommu hardware properties is a
> fine solution, my intention was only to discourage introduction of
> artificial limitations in the surrounding code to avoid this vfio
> issue.  Thanks,

Yes.

I have posted a new version of the vfio series. Looking forward to
your further comment (or ack, if with luck :) on v4.

Thanks,

-- peterx



[Qemu-devel] [PATCH RFC v4.1 04/20] intel_iommu: add "caching-mode" option

2017-01-21 Thread Peter Xu
From: Aviv Ben-David 

This capability asks the guest to invalidate cache before each map operation.
We can use this invalidation to trap map operations in the hypervisor.

Signed-off-by: Aviv Ben-David 
[peterx: using "caching-mode" instead of "cache-mode" to align with spec]
[peterx: re-write the subject to make it short and clear]
Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 5 +
 hw/i386/intel_iommu_internal.h | 1 +
 include/hw/i386/intel_iommu.h  | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ec62239..e58f1de 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2107,6 +2107,7 @@ static Property vtd_properties[] = {
 DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
 ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
+DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode, FALSE),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2488,6 +2489,10 @@ static void vtd_init(IntelIOMMUState *s)
 s->ecap |= VTD_ECAP_DT;
 }
 
+if (s->caching_mode) {
+s->cap |= VTD_CAP_CM;
+}
+
 vtd_reset_context_cache(s);
 vtd_reset_iotlb(s);
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 356f188..4104121 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -202,6 +202,7 @@
 #define VTD_CAP_MAMV(VTD_MAMV << 48)
 #define VTD_CAP_PSI (1ULL << 39)
 #define VTD_CAP_SLLPS   ((1ULL << 34) | (1ULL << 35))
+#define VTD_CAP_CM  (1ULL << 7)
 
 /* Supported Adjusted Guest Address Widths */
 #define VTD_CAP_SAGAW_SHIFT 8
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 405c9d1..fe645aa 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -257,6 +257,8 @@ struct IntelIOMMUState {
 uint8_t womask[DMAR_REG_SIZE];  /* WO (write only - read returns 0) */
 uint32_t version;
 
+bool caching_mode;  /* RO - is cap CM enabled? */
+
 dma_addr_t root;/* Current root table pointer */
 bool root_extended; /* Type of root table (extended or not) */
 bool dmar_enabled;  /* Set if DMA remapping is enabled */
-- 
2.7.4




Re: [Qemu-devel] [PATCH RFC v3 01/14] IOMMU: add option to enable VTD_CAP_CM to vIOMMU capility exposoed to guest

2017-01-21 Thread Peter Xu
On Fri, Jan 20, 2017 at 09:42:25AM -0600, Eric Blake wrote:
> On 01/12/2017 09:06 PM, Peter Xu wrote:
> > From: Aviv Ben-David 
> 
> Long subject line, please try to keep it around 60 or so characters
> (look at 'git shortlog -30' for comparison).  Also, fix the typos:
> s/capility exposoed/capability exposed/

Will fix this and repost this single patch as v4.1 based on v4 series.

Thanks!

> 
> > 
> > This capability asks the guest to invalidate cache before each map 
> > operation.
> > We can use this invalidation to trap map operations in the hypervisor.
> > 
> > Signed-off-by: Aviv Ben-David 
> > Signed-off-by: Peter Xu 
> > ---
> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 

-- peterx



Re: [Qemu-devel] [PATCH V2] qemu-img: optimize is_allocated_sectors_min

2017-01-21 Thread Max Reitz
On 19.01.2017 17:35, Peter Lieven wrote:
> the current implementation always splits requests if a buffer
> begins or ends with zeroes independent of the length of the
> zero area. Change this to really only split off zero areas
> that have at least a length of 'min' bytes.
> 
> Signed-off-by: Peter Lieven 
> ---
>  qemu-img.c | 44 ++--
>  1 file changed, 14 insertions(+), 30 deletions(-)

That is because the idea is that you should delay the write operation as
much as possible. Say you have some empty space at the beginning of the
buffer and then some used space. Of course you should then only start
writing at the point where you actually have to write something. The
same goes for the end of the buffer. Why write zeroes when you can just
not do it?

On the other hand, it makes sense to not split a single write operation
into two just because there is some empty space in the middle. This is
because just writing that empty space may take less time than issuing
two operations, especially if that empty space is e.g. in the middle of
a qcow2 cluster anyway.

This is why empty space in the middle is treated differently than empty
space at the beginning or the end of the buffer.

Do you have a reason for changing this other than because it simplifies
the code? Which it does, admittedly, but I think the code does have a
reason for being how it is.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 0/2] block: Fix iotests 059

2017-01-21 Thread Max Reitz
On 19.01.2017 14:07, Fam Zheng wrote:
> The 059.out went out of sync, bring it back in line.
> 
> Fam Zheng (2):
>   qapi: Tweak error message of bdrv_query_image_info
>   iotests: Fix reference output for 059
> 
>  block/qapi.c   | 4 ++--
>  tests/qemu-iotests/059.out | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)

Thanks, applied to my block tree:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/2] qapi: Tweak error message of bdrv_query_image_info

2017-01-21 Thread Max Reitz
On 19.01.2017 14:07, Fam Zheng wrote:
> @bs doesn't always have a device name, such as when it comes from
> "qemu-img info". Report file name instead.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/qapi.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Yes, I had a discussion with Kevin whether we should add some function
like bdrv_error_setg() which automatically prepends some description™ of
a BDS (e.g. node name if user-set, or device name if meaningful, or
filename if not JSON). Spoiler: We probably should.

This is fine for now, though, because this function specifically deals
with images.

Max

> diff --git a/block/qapi.c b/block/qapi.c
> index a62e862..6329735 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -237,8 +237,8 @@ void bdrv_query_image_info(BlockDriverState *bs,
>  
>  size = bdrv_getlength(bs);
>  if (size < 0) {
> -error_setg_errno(errp, -size, "Can't get size of device '%s'",
> - bdrv_get_device_name(bs));
> +error_setg_errno(errp, -size, "Can't get image size '%s'",
> + bs->exact_filename);
>  goto out;
>  }
>  
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 15/15] block: remove all encryption handling APIs

2017-01-21 Thread Max Reitz
On 03.01.2017 19:28, Daniel P. Berrange wrote:
> Now that all encryption keys must be provided upfront via
> the QCryptoSecret API and associated block driver properties
> there is no need for any explicit encryption handling APIs
> in the block layer. Encryption can be handled transparently
> within the block driver. We only retain an API for querying
> whether an image is encrypted or not, since that is a
> potentially useful piece of metadata to report to the user.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block.c   | 77 
> +--
>  block/crypto.c|  1 -
>  block/qapi.c  |  2 +-
>  block/qcow.c  |  1 -
>  block/qcow2.c |  1 -
>  blockdev.c| 37 ++-
>  include/block/block.h |  3 --
>  include/block/block_int.h |  1 -
>  include/qapi/error.h  |  1 -
>  qapi/common.json  |  5 +--
>  10 files changed, 5 insertions(+), 124 deletions(-)

It would probably make sense to replace the description of
BlockDeviceInfo's @encryption_key_missing in qapi/block-core.json by
"Deprecated; always false".

[...]

> diff --git a/blockdev.c b/blockdev.c
> index 245e1e1..dfeba0c 100644
> --- a/blockdev.c
> +++ b/blockdev.c

[...]

> @@ -2244,24 +2240,8 @@ void qmp_block_passwd(bool has_device, const char 
> *device,
>bool has_node_name, const char *node_name,
>const char *password, Error **errp)
>  {
> -Error *local_err = NULL;
> -BlockDriverState *bs;
> -AioContext *aio_context;
> -
> -bs = bdrv_lookup_bs(has_device ? device : NULL,
> -has_node_name ? node_name : NULL,
> -_err);
> -if (local_err) {
> -error_propagate(errp, local_err);
> -return;
> -}
> -
> -aio_context = bdrv_get_aio_context(bs);
> -aio_context_acquire(aio_context);
> -
> -bdrv_add_key(bs, password, errp);
> -
> -aio_context_release(aio_context);
> +error_setg_errno(errp, -ENOSYS,
> + "Setting block passwords directly is no longer 
> supported");

A plain error_setg() without _errno should be sufficient.

>  }
>  

I'll leave it up to you whether you want to follow the suggestions I've
given, so:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 14/15] block: rip out all traces of password prompting

2017-01-21 Thread Max Reitz
On 03.01.2017 19:28, Daniel P. Berrange wrote:
> Now that qcow & qcow2 are wired up to get encryption keys
> via the QCryptoSecret object, nothing is relying on the
> interactive prompting for passwords. All the code related
> to password prompting can thus be ripped out.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  hmp.c | 31 -
>  include/monitor/monitor.h |  7 -
>  include/qemu/osdep.h  |  2 --
>  monitor.c | 68 
> ---
>  qapi-schema.json  | 10 +--
>  qemu-img.c| 31 -
>  qemu-io.c | 20 --
>  qmp.c | 12 +
>  util/oslib-posix.c| 66 -
>  util/oslib-win32.c| 24 -
>  10 files changed, 2 insertions(+), 269 deletions(-)

Nice. :-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 13/15] iotests: enable tests 134 and 158 to work with qcow (v1)

2017-01-21 Thread Max Reitz
On 03.01.2017 19:27, Daniel P. Berrange wrote:
> The 138 and 158 iotests exercise the legacy qcow2 aes encryption
> code path. With a few simple tweaks they can exercise the same
> feature in qcow (v1).
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  tests/qemu-iotests/134 | 10 +-
>  tests/qemu-iotests/158 |  2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/qemu-iotests/134 b/tests/qemu-iotests/134
> index dd080a2..23b7834 100755
> --- a/tests/qemu-iotests/134
> +++ b/tests/qemu-iotests/134
> @@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>  . ./common.rc
>  . ./common.filter
>  
> -_supported_fmt qcow2
> +_supported_fmt qcow qcow2
>  _supported_proto generic
>  _supported_os Linux
>  
> @@ -55,19 +55,19 @@ QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
>  
>  echo
>  echo "== reading whole image =="
> -$QEMU_IO --object $SECRET -c "read 0 $size" --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir
> +$QEMU_IO --object $SECRET -c "read 0 $size" --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir | _filter_imgfmt
>  
>  echo
>  echo "== rewriting whole image =="
> -$QEMU_IO --object $SECRET -c "write -P 0xa 0 $size" --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir
> +$QEMU_IO --object $SECRET -c "write -P 0xa 0 $size" --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir | _filter_imgfmt
>  
>  echo
>  echo "== verify pattern =="
> -$QEMU_IO --object $SECRET -c "read -P 0xa 0 $size"  --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir
> +$QEMU_IO --object $SECRET -c "read -P 0xa 0 $size"  --image-opts $IMGSPEC | 
> _filter_qemu_io | _filter_testdir | _filter_imgfmt
>  
>  echo
>  echo "== verify pattern failure with wrong password =="
> -$QEMU_IO --object $SECRETALT -c "read -P 0xa 0 $size" --image-opts $IMGSPEC 
> | _filter_qemu_io | _filter_testdir
> +$QEMU_IO --object $SECRETALT -c "read -P 0xa 0 $size" --image-opts $IMGSPEC 
> | _filter_qemu_io | _filter_testdir | _filter_imgfmt

What is this necessary for? The qemu-io output does not contain the
image format, as far as I can see.

Under the premise of "can't hurt", though:

Reviewed-by: Max Reitz 

>  
>  
>  # success, all done
> diff --git a/tests/qemu-iotests/158 b/tests/qemu-iotests/158
> index 7a1eb5c..2b53d9f 100755
> --- a/tests/qemu-iotests/158
> +++ b/tests/qemu-iotests/158
> @@ -37,7 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>  . ./common.rc
>  . ./common.filter
>  
> -_supported_fmt qcow2
> +_supported_fmt qcow qcow2
>  _supported_proto generic
>  _supported_os Linux
>  
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 11/15] qcow2: convert QCow2 to use QCryptoBlock for encryption

2017-01-21 Thread Max Reitz
On 03.01.2017 19:27, Daniel P. Berrange wrote:
> This converts the qcow2 driver to make use of the QCryptoBlock
> APIs for encrypting image content, using the legacyy QCow2 AES
> scheme.
> 
> With this change it is now required to use the QCryptoSecret
> object for providing passwords, instead of the current block
> password APIs / interactive prompting.
> 
>   $QEMU \
> -object secret,id=sec0,filename=/home/berrange/encrypted.pw \
> -drive file=/home/berrange/encrypted.qcow2,aes-key-secret=sec0
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/qcow2-cluster.c  |  47 +--
>  block/qcow2.c  | 190 
> +
>  block/qcow2.h  |   5 +-
>  qapi/block-core.json   |   7 +-
>  tests/qemu-iotests/049 |   2 +-
>  tests/qemu-iotests/049.out |   4 +-
>  tests/qemu-iotests/082.out |  27 +++
>  tests/qemu-iotests/087 |  28 ++-
>  tests/qemu-iotests/087.out |   6 +-
>  tests/qemu-iotests/134 |  18 +++--
>  tests/qemu-iotests/134.out |  10 +--
>  tests/qemu-iotests/158 |  19 +++--
>  tests/qemu-iotests/158.out |  14 +---
>  13 files changed, 219 insertions(+), 158 deletions(-)

[...]

> diff --git a/tests/qemu-iotests/134 b/tests/qemu-iotests/134
> index af618b8..c2458d8 100755
> --- a/tests/qemu-iotests/134
> +++ b/tests/qemu-iotests/134
> @@ -43,23 +43,31 @@ _supported_os Linux
>  
>  
>  size=128M
> -IMGOPTS="encryption=on" _make_test_img $size
> +
> +SECRET="secret,id=sec0,data=astrochicken"
> +SECRETALT="secret,id=sec0,data=platypus"
> +
> +_make_test_img --object $SECRET -o "encryption=on,qcow-key-secret=sec0" $size
> +
> +IMGSPEC="driver=$IMGFMT,file.filename=$TEST_IMG,qcow-key-secret=sec0"
> +
> +QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT

While I agree that it makes sense to have this variable, we
unfortunately do not have it. Yet. ;-)

It should be defined somewhere and it should probably actually contain
all non-format options (such as the cache mode).

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 12/15] qcow2: add support for LUKS encryption format

2017-01-21 Thread Max Reitz
On 03.01.2017 19:27, Daniel P. Berrange wrote:
> This adds support for using LUKS as an encryption format
> with the qcow2 file. The use of the existing 'encryption=on'
> parameter is replaced by a new parameter 'encryption-format'
> which takes the values 'aes' or 'luks'. e.g.
> 
>   # qemu-img create --object secret,data=123456,id=sec0 \
>-f qcow2 -o encryption-format=luks,luks-key-secret=sec0 \
>test.qcow2 10G
> 
> results in the creation of an image using the LUKS format.
> Use of the legacy 'encryption=on' parameter still results
> in creation of the old qcow2 AES format, and is equivalent
> to the new 'encryption-format=aes'. e.g. the following are
> equivalent:
> 
>   # qemu-img create --object secret,data=123456,id=sec0 \
>-f qcow2 -o encryption=on,aes-key-secret=sec0 \
>test.qcow2 10G
> 
>  # qemu-img create --object secret,data=123456,id=sec0 \
>-f qcow2 -o encryption-format=aes,aes-key-secret=sec0 \
>test.qcow2 10G
> 
> With the LUKS format it is necessary to store the LUKS
> partition header and key material in the QCow2 file. This
> data can be many MB in size, so cannot go into the QCow2
> header region directly. Thus the spec defines a FDE
> (Full Disk Encryption) header extension that specifies
> the offset of a set of clusters to hold the FDE headers,
> as well as the length of that region. The LUKS header is
> thus stored in these extra allocated clusters before the
> main image payload.
> 
> Aside from all the cryptographic differences implied by
> use of the LUKS format, there is one further key difference
> between the use of legacy AES and LUKS encryption in qcow2.
> For LUKS, the initialiazation vectors are generated using
> the host physical sector as the input, rather than the
> guest virtual sector. This guarantees unique initialization
> vectors for all sectors when qcow2 internal snapshots are
> used, thus giving stronger protection against watermarking
> attacks.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/qcow2-cluster.c  |   4 +-
>  block/qcow2-refcount.c |  10 ++
>  block/qcow2.c  | 236 
> ++---
>  block/qcow2.h  |   9 ++
>  docs/specs/qcow2.txt   |  99 +++

I'd personally rather have specification changes separate from the
implementation.

>  qapi/block-core.json   |   6 +-
>  tests/qemu-iotests/049 |   2 +-
>  tests/qemu-iotests/082.out | 216 +
>  tests/qemu-iotests/087 |  65 -
>  tests/qemu-iotests/087.out |  22 -
>  tests/qemu-iotests/134 |   4 +-
>  tests/qemu-iotests/158 |   8 +-
>  tests/qemu-iotests/174 |  76 +++
>  tests/qemu-iotests/174.out |  19 
>  tests/qemu-iotests/175 |  85 
>  tests/qemu-iotests/175.out |  26 +
>  tests/qemu-iotests/group   |   2 +
>  17 files changed, 840 insertions(+), 49 deletions(-)
>  create mode 100755 tests/qemu-iotests/174
>  create mode 100644 tests/qemu-iotests/174.out
>  create mode 100755 tests/qemu-iotests/175
>  create mode 100644 tests/qemu-iotests/175.out

Also, in order to help reviewing by making this patch less scary (840
insertions are kind of... Urgh.), it would make sense to add these two
tests in one or two separate patches.

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index a2103dc..866b122 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -383,7 +383,9 @@ static int coroutine_fn do_perform_cow(BlockDriverState 
> *bs,
>  
>  if (bs->encrypted) {
>  Error *err = NULL;
> -int64_t sector = (src_cluster_offset + offset_in_cluster)
> +int64_t sector = (s->crypt_physical_offset ?
> +  (cluster_offset + offset_in_cluster) :
> +  (src_cluster_offset + offset_in_cluster))
>   >> BDRV_SECTOR_BITS;
>  assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
>  assert((bytes & ~BDRV_SECTOR_MASK) == 0);
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index cbfb3fe..afa4636 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c

[...]

> @@ -80,6 +81,73 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, 
> const char *filename)

[...]

> +static ssize_t qcow2_crypto_hdr_init_func(QCryptoBlock *block, size_t 
> headerlen,
> +  Error **errp, void *opaque)
> +{
> +BlockDriverState *bs = opaque;
> +BDRVQcow2State *s = bs->opaque;
> +int64_t ret;
> +
> +ret = qcow2_alloc_clusters(bs, headerlen);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret,
> + "Cannot allocate cluster for LUKS header size %zu",
> + headerlen);
> +return -1;
> +}
> +
> +s->crypto_header.length = headerlen;
> +s->crypto_header.offset = ret;

The 

Re: [Qemu-devel] [PATCH] usb: Set category and description of the MTP device

2017-01-21 Thread Michael Tokarev
Applied to -trivial, thanks!

/mjt



Re: [Qemu-devel] [PATCH v5 1/2] gdbstub.c: fix GDB connection segfault caused by empty machines

2017-01-21 Thread Michael Tokarev
Applied both to -trivial, thank you!

/mjt



Re: [Qemu-devel] [PATCH] scsi-disk: add 'fall through' comment to switch VERIFY cases

2017-01-21 Thread Michael Tokarev
Applied to -trivial, thanks!

/mjt



Re: [Qemu-devel] [PATCH] Drop duplicate display option documentation

2017-01-21 Thread Michael Tokarev
15.01.2017 23:30, Samuel Thibault wrote:
> The curses and none possibilities are already documented on a separate line,
> so documenting it on the sdl line was both unneeded and confusing.

Applied to -trivial, thanks!

/mjt



[Qemu-devel] [PATCH] 9pfs: fix offset error in v9fs_xattr_read()

2017-01-21 Thread Greg Kurz
The current code tries to copy `read_count' bytes starting at offset
`offset' from a `read_count`-sized iovec. This causes v9fs_pack() to
fail with ENOBUFS.

Since the PDU iovec is already partially filled with `offset' bytes,
let's skip them when creating `qiov_full' and have v9fs_pack() to
copy the whole of it. Moreover, this is consistent with the other
places where v9fs_init_qiov_from_pdu() is called.

This fixes commit "bcb8998fac16 9pfs: call v9fs_init_qiov_from_pdu
before v9fs_pack".

Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index fa58877570f6..f0eef1a3ef53 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1685,8 +1685,8 @@ static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, 
V9fsFidState *fidp,
 }
 offset += err;
 
-v9fs_init_qiov_from_pdu(_full, pdu, 0, read_count, false);
-err = v9fs_pack(qiov_full.iov, qiov_full.niov, offset,
+v9fs_init_qiov_from_pdu(_full, pdu, offset, read_count, false);
+err = v9fs_pack(qiov_full.iov, qiov_full.niov, 0,
 ((char *)fidp->fs.xattr.value) + off,
 read_count);
 qemu_iovec_destroy(_full);




[Qemu-devel] [Bug 1658120] Re: building with gcc-aarch64-linux-gnu

2017-01-21 Thread Bilal Amarni
Thanks for your replies,

I've managed to compile it using a chroot as suggested by Peter. I just
grabbed a pre-built rootfs from here : https://wiki.debian.org/Arm64Port
#Pre-built_Rootfses, then installed qemu-user-static with apt-get and
run the build from the chroot.

Somehow "apt-get build-dep -a arm64 qemu" didn't work, I had tried to do
"dpkg --add-architecture arm64 && apt-get update" before running this
command but it couldn't find the needed packages, not sure why.

In any case thanks for your help :)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1658120

Title:
  building with gcc-aarch64-linux-gnu

Status in QEMU:
  Invalid

Bug description:
  Hi, while trying to build qemu v2.8.0 with gcc-aarch64-linux-gnu
  cross-compiler I'm getting the following :

  
  In file included from /usr/include/x86_64-linux-gnu/sys/syscall.h:31:0,
   from /root/qemu/util/compatfd.c:21:
  /root/qemu/util/compatfd.c: In function 'qemu_signalfd':
  /root/qemu/util/compatfd.c:103:19: error: '__NR_signalfd' undeclared (first 
use in this function)
   ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
 ^
  /root/qemu/util/compatfd.c:103:19: note: each undeclared identifier is 
reported only once for each function it appears in
  /root/qemu/rules.mak:59: recipe for target 'util/compatfd.o' failed
  make: *** [util/compatfd.o] Error 1

  
  I had configured it with :

  ../configure --target-list=x86_64-linux-user --static --cpu=aarch64

  And I'm on :

  Linux ubuntu-512mb-fra1-01 4.4.0-59-generic #80-Ubuntu SMP Fri Jan 6
  17:47:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1658120/+subscriptions



[Qemu-devel] [Bug 1658120] Re: building with gcc-aarch64-linux-gnu

2017-01-21 Thread Bilal Amarni
this was an issue in my setup

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1658120

Title:
  building with gcc-aarch64-linux-gnu

Status in QEMU:
  Invalid

Bug description:
  Hi, while trying to build qemu v2.8.0 with gcc-aarch64-linux-gnu
  cross-compiler I'm getting the following :

  
  In file included from /usr/include/x86_64-linux-gnu/sys/syscall.h:31:0,
   from /root/qemu/util/compatfd.c:21:
  /root/qemu/util/compatfd.c: In function 'qemu_signalfd':
  /root/qemu/util/compatfd.c:103:19: error: '__NR_signalfd' undeclared (first 
use in this function)
   ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
 ^
  /root/qemu/util/compatfd.c:103:19: note: each undeclared identifier is 
reported only once for each function it appears in
  /root/qemu/rules.mak:59: recipe for target 'util/compatfd.o' failed
  make: *** [util/compatfd.o] Error 1

  
  I had configured it with :

  ../configure --target-list=x86_64-linux-user --static --cpu=aarch64

  And I'm on :

  Linux ubuntu-512mb-fra1-01 4.4.0-59-generic #80-Ubuntu SMP Fri Jan 6
  17:47:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1658120/+subscriptions



[Qemu-devel] [RFC v4] RBD: Add support readv,writev for rbd

2017-01-21 Thread jazeltq
From: tianqing 

Rbd can do readv and writev directly, so wo do not need to transform
iov to buf or vice versa any more.

Signed-off-by: tianqing 
---
 block/rbd.c | 47 +--
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index a57b3e3..0cbd9e3 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -73,7 +73,12 @@ typedef struct RBDAIOCB {
 BlockAIOCB common;
 int64_t ret;
 QEMUIOVector *qiov;
+/* Note:
+ * The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h.
+ */
+#ifndef LIBRBD_SUPPORTS_IOVEC
 char *bounce;
+#endif
 RBDAIOCmd cmd;
 int error;
 struct BDRVRBDState *s;
@@ -83,7 +88,9 @@ typedef struct RADOSCB {
 RBDAIOCB *acb;
 struct BDRVRBDState *s;
 int64_t size;
+#ifndef LIBRBD_SUPPORTS_IOVEC
 char *buf;
+#endif
 int64_t ret;
 } RADOSCB;
 
@@ -426,11 +433,21 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 }
 } else {
 if (r < 0) {
+#ifndef LIBRBD_SUPPORTS_IOVEC
 memset(rcb->buf, 0, rcb->size);
+#else
+iov_memset(acb->qiov->iov, acb->qiov->niov, 0, 0, acb->qiov->size);
+#endif
 acb->ret = r;
 acb->error = 1;
 } else if (r < rcb->size) {
+#ifndef LIBRBD_SUPPORTS_IOVEC
 memset(rcb->buf + r, 0, rcb->size - r);
+#else
+iov_memset(acb->qiov->iov, acb->qiov->niov,
+   r, 0, acb->qiov->size - r);
+#endif
+
 if (!acb->error) {
 acb->ret = rcb->size;
 }
@@ -441,10 +458,12 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 
 g_free(rcb);
 
+#ifndef LIBRBD_SUPPORTS_IOVEC
 if (acb->cmd == RBD_AIO_READ) {
 qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
 }
 qemu_vfree(acb->bounce);
+#endif
 acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
 
 qemu_aio_unref(acb);
@@ -655,8 +674,10 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 RBDAIOCB *acb;
 RADOSCB *rcb = NULL;
 rbd_completion_t c;
-char *buf;
 int r;
+#ifndef LIBRBD_SUPPORTS_IOVEC
+char *buf = NULL;
+#endif
 
 BDRVRBDState *s = bs->opaque;
 
@@ -664,6 +685,8 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 acb->cmd = cmd;
 acb->qiov = qiov;
 assert(!qiov || qiov->size == size);
+#ifndef LIBRBD_SUPPORTS_IOVEC
+
 if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
 acb->bounce = NULL;
 } else {
@@ -672,19 +695,21 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 goto failed;
 }
 }
-acb->ret = 0;
-acb->error = 0;
-acb->s = s;
-
 if (cmd == RBD_AIO_WRITE) {
 qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
 }
-
 buf = acb->bounce;
+#endif
+acb->ret = 0;
+acb->error = 0;
+acb->s = s;
 
 rcb = g_new(RADOSCB, 1);
+
 rcb->acb = acb;
+#ifndef LIBRBD_SUPPORTS_IOVEC
 rcb->buf = buf;
+#endif
 rcb->s = acb->s;
 rcb->size = size;
 r = rbd_aio_create_completion(rcb, (rbd_callback_t) rbd_finish_aiocb, );
@@ -694,10 +719,18 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 
 switch (cmd) {
 case RBD_AIO_WRITE:
+#ifndef LIBRBD_SUPPORTS_IOVEC
 r = rbd_aio_write(s->image, off, size, buf, c);
+#else
+r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
+#endif
 break;
 case RBD_AIO_READ:
+#ifndef LIBRBD_SUPPORTS_IOVEC
 r = rbd_aio_read(s->image, off, size, buf, c);
+#else
+r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
+#endif
 break;
 case RBD_AIO_DISCARD:
 r = rbd_aio_discard_wrapper(s->image, off, size, c);
@@ -719,7 +752,9 @@ failed_completion:
 rbd_aio_release(c);
 failed:
 g_free(rcb);
+#ifndef LIBRBD_SUPPORTS_IOVEC
 qemu_vfree(acb->bounce);
+#endif
 qemu_aio_unref(acb);
 return NULL;
 }
-- 
2.10.2




Re: [Qemu-devel] [Nbd] [PATCH] Further tidy-up on block status

2017-01-21 Thread Wouter Verhelst
On Fri, Jan 20, 2017 at 01:35:10PM -0600, Eric Blake wrote:
> On 01/20/2017 12:00 PM, Alex Bligh wrote:
> > 
> >> On 20 Jan 2017, at 17:04, Vladimir Sementsov-Ogievskiy 
> >>  wrote:
> >>
> >> About extents: is 32bit length enough? We will have to send 4096 for empty 
> >> 16tb disk..
> > 
> > The nbd protocol uniformly uses 32 bit lengths (for better or for worse). 
> > This is baked into the specification heavily.
> > 
> > I'm not sure sending 4,096 items for an empty 16TB disk is any great 
> > hardship to be honest.
> 
> If it truly matters, an idea that has already been floated on the list
> is to add a new NBD_CMD_FLAG_SCALE (or some other spelling) that states
> that a particular command is sending lengths scaled by a particular
> factor (by the block size? obviously it would have to be better
> specified).  Under this idea, the scaling factor would allow you to
> report larger extents for fewer back-and-forth operations, but it gets
> tricky if the scaling is allowed to get larger than the minimum
> granularity between extent changes.

Right, but people objected to it on grounds of it being too complicated
(which I think was a fair point).

> The other idea that has been floated is a way to report whether the
> entire export is known to be all-zero content at the time the client
> connects, which would trade 4096+ queries (you'd actually have to do
> more than 4096 queries, since length is < 4G, not <= 4G) with a single
> piece of information at the time the client connects.

That's pretty special-case, and I objected to it on those grounds :-)

However, I don't think there's anything necessarily wrong in changing
the size of the length field in the reply header of the
NBD_REPLY_TYPE_BLOCK_STATUS packet. That way, combined with the fact
that we'd already stated that a server may send more information than
was requested, a client could ask for metadata on an export, and if the
extent is the whole size of the export, the server could say so in a
single reply for all export sizes.

I suppose it'd be slightly odd to have the request use 32-bit lengths
and the response be expressed in 64-bit ones, but I suppose that's the
price we pay to remain a backwards compatible and not too complicated a
protocol.

> Either way, discussion on such enhancements are probably worth a new
> thread.

Sure.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
   people in the world who think they really understand all of its rules,
   and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12



Re: [Qemu-devel] [RFC v3] RBD: Add support readv,writev for rbd

2017-01-21 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC v3] RBD: Add support readv,writev for rbd
Message-id: 20170121105949.15466-1-jaze...@gmail.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
f826583 RBD: Add support readv,writev for rbd

=== OUTPUT BEGIN ===
Checking PATCH 1/1: RBD: Add support readv,writev for rbd...
ERROR: trailing whitespace
#81: FILE: block/rbd.c:679:
+char * buf = NULL; $

ERROR: "foo * bar" should be "foo *bar"
#81: FILE: block/rbd.c:679:
+char * buf = NULL; 

ERROR: trailing whitespace
#91: FILE: block/rbd.c:689:
+$

total: 3 errors, 0 warnings, 126 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [RFC v3] RBD: Add support readv,writev for rbd

2017-01-21 Thread jazeltq
From: tianqing 

Rbd can do readv and writev directly, so wo do not need to transform
iov to buf or vice versa any more.

Signed-off-by: tianqing 
---
 block/rbd.c | 47 +--
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index a57b3e3..4add4dd 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -73,7 +73,12 @@ typedef struct RBDAIOCB {
 BlockAIOCB common;
 int64_t ret;
 QEMUIOVector *qiov;
+/* Note:
+ * The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h.
+ */
+#ifndef LIBRBD_SUPPORTS_IOVEC
 char *bounce;
+#endif
 RBDAIOCmd cmd;
 int error;
 struct BDRVRBDState *s;
@@ -83,7 +88,9 @@ typedef struct RADOSCB {
 RBDAIOCB *acb;
 struct BDRVRBDState *s;
 int64_t size;
+#ifndef LIBRBD_SUPPORTS_IOVEC
 char *buf;
+#endif
 int64_t ret;
 } RADOSCB;
 
@@ -426,11 +433,21 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 }
 } else {
 if (r < 0) {
+#ifndef LIBRBD_SUPPORTS_IOVEC
 memset(rcb->buf, 0, rcb->size);
+#else
+iov_memset(acb->qiov->iov, acb->qiov->niov, 0, 0, acb->qiov->size);
+#endif
 acb->ret = r;
 acb->error = 1;
 } else if (r < rcb->size) {
+#ifndef LIBRBD_SUPPORTS_IOVEC
 memset(rcb->buf + r, 0, rcb->size - r);
+#else
+iov_memset(acb->qiov->iov, acb->qiov->niov,
+   r, 0, acb->qiov->size - r);
+#endif
+
 if (!acb->error) {
 acb->ret = rcb->size;
 }
@@ -441,10 +458,12 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 
 g_free(rcb);
 
+#ifndef LIBRBD_SUPPORTS_IOVEC
 if (acb->cmd == RBD_AIO_READ) {
 qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
 }
 qemu_vfree(acb->bounce);
+#endif
 acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
 
 qemu_aio_unref(acb);
@@ -655,8 +674,10 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 RBDAIOCB *acb;
 RADOSCB *rcb = NULL;
 rbd_completion_t c;
-char *buf;
 int r;
+#ifndef LIBRBD_SUPPORTS_IOVEC
+char * buf = NULL; 
+#endif
 
 BDRVRBDState *s = bs->opaque;
 
@@ -664,6 +685,8 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 acb->cmd = cmd;
 acb->qiov = qiov;
 assert(!qiov || qiov->size == size);
+#ifndef LIBRBD_SUPPORTS_IOVEC
+
 if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
 acb->bounce = NULL;
 } else {
@@ -672,19 +695,21 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 goto failed;
 }
 }
-acb->ret = 0;
-acb->error = 0;
-acb->s = s;
-
 if (cmd == RBD_AIO_WRITE) {
 qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
 }
-
 buf = acb->bounce;
+#endif
+acb->ret = 0;
+acb->error = 0;
+acb->s = s;
 
 rcb = g_new(RADOSCB, 1);
+
 rcb->acb = acb;
+#ifndef LIBRBD_SUPPORTS_IOVEC
 rcb->buf = buf;
+#endif
 rcb->s = acb->s;
 rcb->size = size;
 r = rbd_aio_create_completion(rcb, (rbd_callback_t) rbd_finish_aiocb, );
@@ -694,10 +719,18 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
 
 switch (cmd) {
 case RBD_AIO_WRITE:
+#ifndef LIBRBD_SUPPORTS_IOVEC
 r = rbd_aio_write(s->image, off, size, buf, c);
+#else
+r = rbd_aio_writev(s->image, qiov->iov, qiov->niov, off, c);
+#endif
 break;
 case RBD_AIO_READ:
+#ifndef LIBRBD_SUPPORTS_IOVEC
 r = rbd_aio_read(s->image, off, size, buf, c);
+#else
+r = rbd_aio_readv(s->image, qiov->iov, qiov->niov, off, c);
+#endif
 break;
 case RBD_AIO_DISCARD:
 r = rbd_aio_discard_wrapper(s->image, off, size, c);
@@ -719,7 +752,9 @@ failed_completion:
 rbd_aio_release(c);
 failed:
 g_free(rcb);
+#ifndef LIBRBD_SUPPORTS_IOVEC
 qemu_vfree(acb->bounce);
+#endif
 qemu_aio_unref(acb);
 return NULL;
 }
-- 
1.8.3.1




[Qemu-devel] [Bug 601946] Re: [Feature request] qemu-img multi-threaded compressed image conversion

2017-01-21 Thread Коренберг Марк
@~quentin.casasnovas please report this as new feature request, instead
of adding comment to this one.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/601946

Title:
  [Feature request] qemu-img multi-threaded compressed image conversion

Status in QEMU:
  New

Bug description:
  Feature request:
  qemu-img multi-threaded compressed image conversion

  Suppose I want to convert raw image to compressed qcow2. Multi-
  threaded conversion will be much faster, because bottleneck is
  compressing data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/601946/+subscriptions



[Qemu-devel] [Bug 601946] Re: [Feature request] qemu-img multi-threaded compressed image conversion

2017-01-21 Thread Quentin Casasnovas
That was also my feeling, so nice to get a confirmation!

Another related thing would be to allow qemu-nbd to write compressed
blocks its backing image - today if you use a qcow2 with compression,
any block which is written to gets uncompressed in the resulting image
and you need to recompress the image offline with qemu-img.

Would you have any pointers/documentation on how best to implement this
so both qemu-img and qemu-nbd can use multithreaded compressed writes ?
I'm totally new to qemu block subsystem.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/601946

Title:
  [Feature request] qemu-img multi-threaded compressed image conversion

Status in QEMU:
  New

Bug description:
  Feature request:
  qemu-img multi-threaded compressed image conversion

  Suppose I want to convert raw image to compressed qcow2. Multi-
  threaded conversion will be much faster, because bottleneck is
  compressing data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/601946/+subscriptions



Re: [Qemu-devel] [RFC PATCH 0/3] Preparing the build system for libtcg

2017-01-21 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH 0/3] Preparing the build system for libtcg
Message-id: 20170121084600.5860-1-ale+q...@clearmind.me
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
4be8b33 Introduce libtcg infrastructure
1264cba *-user targets object files decoupling
5130643 Factor out {linux,bsd}-user/qemu.h

=== OUTPUT BEGIN ===
Checking PATCH 1/3: Factor out {linux,bsd}-user/qemu.h...
ERROR: line over 90 characters
#439: FILE: qemu-user-common.h:15:
+(type == VERIFY_READ) ? PAGE_READ : (PAGE_READ | 
PAGE_WRITE)) == 0;

ERROR: code indent should never use tabs
#481: FILE: qemu-user-common.h:57:
+#define put_user(x, gaddr, target_type)^I^I^I^I^I\$

ERROR: code indent should never use tabs
#482: FILE: qemu-user-common.h:58:
+({^I^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#483: FILE: qemu-user-common.h:59:
+abi_ulong __gaddr = (gaddr);^I^I^I^I^I\$

ERROR: code indent should never use tabs
#484: FILE: qemu-user-common.h:60:
+target_type *__hptr;^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#485: FILE: qemu-user-common.h:61:
+abi_long __ret = 0;^I^I^I^I^I^I^I\$

ERROR: do not use assignment in if condition
#486: FILE: qemu-user-common.h:62:
+if ((__hptr = lock_user(VERIFY_WRITE, __gaddr, sizeof(target_type), 0))) { 
\

ERROR: braces {} are necessary for all arms of this statement
#486: FILE: qemu-user-common.h:62:
+if ((__hptr = lock_user(VERIFY_WRITE, __gaddr, sizeof(target_type), 0))) { 
\
[...]
+} else \
[...]

ERROR: code indent should never use tabs
#487: FILE: qemu-user-common.h:63:
+__put_user((x), __hptr);^I^I^I^I\$

ERROR: code indent should never use tabs
#488: FILE: qemu-user-common.h:64:
+unlock_user(__hptr, __gaddr, sizeof(target_type));^I^I\$

ERROR: code indent should never use tabs
#489: FILE: qemu-user-common.h:65:
+} else^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#490: FILE: qemu-user-common.h:66:
+__ret = -TARGET_EFAULT;^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#491: FILE: qemu-user-common.h:67:
+__ret;^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#494: FILE: qemu-user-common.h:70:
+#define get_user(x, gaddr, target_type)^I^I^I^I^I\$

ERROR: code indent should never use tabs
#495: FILE: qemu-user-common.h:71:
+({^I^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#496: FILE: qemu-user-common.h:72:
+abi_ulong __gaddr = (gaddr);^I^I^I^I^I\$

ERROR: code indent should never use tabs
#497: FILE: qemu-user-common.h:73:
+target_type *__hptr;^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#498: FILE: qemu-user-common.h:74:
+abi_long __ret = 0;^I^I^I^I^I^I^I\$

ERROR: do not use assignment in if condition
#499: FILE: qemu-user-common.h:75:
+if ((__hptr = lock_user(VERIFY_READ, __gaddr, sizeof(target_type), 1))) { \

ERROR: code indent should never use tabs
#500: FILE: qemu-user-common.h:76:
+__get_user((x), __hptr);^I^I^I^I\$

ERROR: code indent should never use tabs
#501: FILE: qemu-user-common.h:77:
+unlock_user(__hptr, __gaddr, 0);^I^I^I^I\$

ERROR: code indent should never use tabs
#502: FILE: qemu-user-common.h:78:
+} else {^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#503: FILE: qemu-user-common.h:79:
+/* avoid warning */^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#504: FILE: qemu-user-common.h:80:
+(x) = 0;^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#505: FILE: qemu-user-common.h:81:
+__ret = -TARGET_EFAULT;^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#506: FILE: qemu-user-common.h:82:
+}^I^I^I^I^I^I^I^I^I\$

ERROR: code indent should never use tabs
#507: FILE: qemu-user-common.h:83:
+__ret;^I^I^I^I^I^I^I^I\$

WARNING: line over 80 characters
#547: FILE: qemu-user-common.h:123:
+static inline void *lock_user(int type, abi_ulong guest_addr, long len, int 
copy)

ERROR: braces {} are necessary for all arms of this statement
#549: FILE: qemu-user-common.h:125:
+if (!access_ok(type, guest_addr, len))
[...]

ERROR: braces {} are necessary for all arms of this statement
#555: FILE: qemu-user-common.h:131:
+if (copy)
[...]
+else
[...]

ERROR: braces {} are 

[Qemu-devel] [RFC PATCH 2/3] *-user targets object files decoupling

2017-01-21 Thread Alessandro Di Federico
Move some functions around in target/$arch/*.c to reduce the coupling
among the various compilation unit. In particular, we want to make the
various translate.c as much standalone as possible, so that they will be
easily employed by libtcg.
---
 exec.c   |   2 +
 target/arm/cpu.c | 118 +++
 target/arm/cpu.h |  82 ++
 target/arm/helper.c  |  73 +-
 target/arm/translate-a64.c   |  54 
 target/arm/translate.c   |  97 ---
 target/arm/translate.h   |   7 ---
 target/m68k/op_helper.c  |  23 +
 target/m68k/translate.c  |  23 -
 target/mips/op_helper.c  |   8 ---
 target/mips/translate.c  |   8 +++
 target/ppc/gdbstub.c |  20 
 target/ppc/translate_init.c  |  20 
 target/unicore32/helper.c|  68 +
 target/unicore32/translate.c |  68 -
 15 files changed, 333 insertions(+), 338 deletions(-)

diff --git a/exec.c b/exec.c
index 47835c1dc1..66f8187281 100644
--- a/exec.c
+++ b/exec.c
@@ -672,9 +672,11 @@ void cpu_exec_unrealizefn(CPUState *cpu)
 if (cc->vmsd != NULL) {
 vmstate_unregister(NULL, cc->vmsd, cpu);
 }
+#ifndef CONFIG_USER_ONLY
 if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
 vmstate_unregister(NULL, _cpu_common, cpu);
 }
+#endif
 }
 
 void cpu_exec_initfn(CPUState *cpu)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index f5cb30af6c..f0af5d5d7d 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -238,6 +238,124 @@ static void arm_cpu_reset(CPUState *s)
 hw_watchpoint_update_all(cpu);
 }
 
+static void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
+   fprintf_function cpu_fprintf, int flags)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = >env;
+uint32_t psr = pstate_read(env);
+int i;
+int el = arm_current_el(env);
+const char *ns_status;
+
+cpu_fprintf(f, "PC=%016"PRIx64"  SP=%016"PRIx64"\n",
+env->pc, env->xregs[31]);
+for (i = 0; i < 31; i++) {
+cpu_fprintf(f, "X%02d=%016"PRIx64, i, env->xregs[i]);
+if ((i % 4) == 3) {
+cpu_fprintf(f, "\n");
+} else {
+cpu_fprintf(f, " ");
+}
+}
+
+if (arm_feature(env, ARM_FEATURE_EL3) && el != 3) {
+ns_status = env->cp15.scr_el3 & SCR_NS ? "NS " : "S ";
+} else {
+ns_status = "";
+}
+
+cpu_fprintf(f, "\nPSTATE=%08x %c%c%c%c %sEL%d%c\n",
+psr,
+psr & PSTATE_N ? 'N' : '-',
+psr & PSTATE_Z ? 'Z' : '-',
+psr & PSTATE_C ? 'C' : '-',
+psr & PSTATE_V ? 'V' : '-',
+ns_status,
+el,
+psr & PSTATE_SP ? 'h' : 't');
+
+if (flags & CPU_DUMP_FPU) {
+int numvfpregs = 32;
+for (i = 0; i < numvfpregs; i += 2) {
+uint64_t vlo = float64_val(env->vfp.regs[i * 2]);
+uint64_t vhi = float64_val(env->vfp.regs[(i * 2) + 1]);
+cpu_fprintf(f, "q%02d=%016" PRIx64 ":%016" PRIx64 " ",
+i, vhi, vlo);
+vlo = float64_val(env->vfp.regs[(i + 1) * 2]);
+vhi = float64_val(env->vfp.regs[((i + 1) * 2) + 1]);
+cpu_fprintf(f, "q%02d=%016" PRIx64 ":%016" PRIx64 "\n",
+i + 1, vhi, vlo);
+}
+cpu_fprintf(f, "FPCR: %08x  FPSR: %08x\n",
+vfp_get_fpcr(env), vfp_get_fpsr(env));
+}
+}
+
+static const char *cpu_mode_names[16] = {
+  "usr", "fiq", "irq", "svc", "???", "???", "mon", "abt",
+  "???", "???", "hyp", "und", "???", "???", "???", "sys"
+};
+
+void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
+int flags)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = >env;
+int i;
+uint32_t psr;
+const char *ns_status;
+
+if (is_a64(env)) {
+aarch64_cpu_dump_state(cs, f, cpu_fprintf, flags);
+return;
+}
+
+for(i=0;i<16;i++) {
+cpu_fprintf(f, "R%02d=%08x", i, env->regs[i]);
+if ((i % 4) == 3)
+cpu_fprintf(f, "\n");
+else
+cpu_fprintf(f, " ");
+}
+psr = cpsr_read(env);
+
+if (arm_feature(env, ARM_FEATURE_EL3) &&
+(psr & CPSR_M) != ARM_CPU_MODE_MON) {
+ns_status = env->cp15.scr_el3 & SCR_NS ? "NS " : "S ";
+} else {
+ns_status = "";
+}
+
+cpu_fprintf(f, "PSR=%08x %c%c%c%c %c %s%s%d\n",
+psr,
+psr & (1 << 31) ? 'N' : '-',
+psr & (1 << 30) ? 'Z' : '-',
+psr & (1 << 29) ? 'C' : '-',
+psr & (1 << 28) ? 'V' : '-',
+psr & CPSR_T ? 'T' : 'A',
+ns_status,
+cpu_mode_names[psr & 0xf], (psr & 0x10) ? 32 : 26);

[Qemu-devel] [RFC PATCH 1/3] Factor out {linux,bsd}-user/qemu.h

2017-01-21 Thread Alessandro Di Federico
A quite large part of {linux,bsd}-user/qemu.h is shared, this patch
introduces qemu-user-common.h which factors it out. This shared part is
also the bare minimum required to build a linux-user-like target, and,
in particular, it will be useful for the libtcg targets.
---
 bsd-user/qemu.h| 193 +
 linux-user/qemu.h  | 176 +---
 qemu-user-common.h | 180 +
 3 files changed, 182 insertions(+), 367 deletions(-)
 create mode 100644 qemu-user-common.h

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 2b2b9184e0..b51319caf0 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -17,10 +17,8 @@
 #ifndef QEMU_H
 #define QEMU_H
 
-
-#include "cpu.h"
+#include "qemu-user-common.h"
 #include "exec/exec-all.h"
-#include "exec/cpu_ldst.h"
 
 #undef DEBUG_REMAP
 #ifdef DEBUG_REMAP
@@ -217,195 +215,6 @@ void mmap_fork_end(int child);
 /* main.c */
 extern unsigned long x86_stack_size;
 
-/* user access */
-
-#define VERIFY_READ 0
-#define VERIFY_WRITE 1 /* implies read access */
-
-static inline int access_ok(int type, abi_ulong addr, abi_ulong size)
-{
-return page_check_range((target_ulong)addr, size,
-(type == VERIFY_READ) ? PAGE_READ : (PAGE_READ | 
PAGE_WRITE)) == 0;
-}
-
-/* NOTE __get_user and __put_user use host pointers and don't check access. */
-/* These are usually used to access struct data members once the
- * struct has been locked - usually with lock_user_struct().
- */
-#define __put_user(x, hptr)\
-({\
-int size = sizeof(*hptr);\
-switch(size) {\
-case 1:\
-*(uint8_t *)(hptr) = (uint8_t)(typeof(*hptr))(x);\
-break;\
-case 2:\
-*(uint16_t *)(hptr) = tswap16((typeof(*hptr))(x));\
-break;\
-case 4:\
-*(uint32_t *)(hptr) = tswap32((typeof(*hptr))(x));\
-break;\
-case 8:\
-*(uint64_t *)(hptr) = tswap64((typeof(*hptr))(x));\
-break;\
-default:\
-abort();\
-}\
-0;\
-})
-
-#define __get_user(x, hptr) \
-({\
-int size = sizeof(*hptr);\
-switch(size) {\
-case 1:\
-x = (typeof(*hptr))*(uint8_t *)(hptr);\
-break;\
-case 2:\
-x = (typeof(*hptr))tswap16(*(uint16_t *)(hptr));\
-break;\
-case 4:\
-x = (typeof(*hptr))tswap32(*(uint32_t *)(hptr));\
-break;\
-case 8:\
-x = (typeof(*hptr))tswap64(*(uint64_t *)(hptr));\
-break;\
-default:\
-/* avoid warning */\
-x = 0;\
-abort();\
-}\
-0;\
-})
-
-/* put_user()/get_user() take a guest address and check access */
-/* These are usually used to access an atomic data type, such as an int,
- * that has been passed by address.  These internally perform locking
- * and unlocking on the data type.
- */
-#define put_user(x, gaddr, target_type) \
-({  \
-abi_ulong __gaddr = (gaddr);\
-target_type *__hptr;\
-abi_long __ret; \
-if ((__hptr = lock_user(VERIFY_WRITE, __gaddr, sizeof(target_type), 0))) { 
\
-__ret = __put_user((x), __hptr);\
-unlock_user(__hptr, __gaddr, sizeof(target_type));  \
-} else  \
-__ret = -TARGET_EFAULT; \
-__ret;  \
-})
-
-#define get_user(x, gaddr, target_type) \
-({  \
-abi_ulong __gaddr = (gaddr);\
-target_type *__hptr;\
-abi_long __ret; \
-if ((__hptr = lock_user(VERIFY_READ, __gaddr, sizeof(target_type), 1))) { \
-__ret = __get_user((x), __hptr);\
-unlock_user(__hptr, __gaddr, 0);\
-} else {\
-/* avoid warning */ \
-(x) = 0;\
-__ret = -TARGET_EFAULT; \
-}   \
-__ret;  \
-})
-
-#define put_user_ual(x, gaddr) put_user((x), (gaddr), abi_ulong)
-#define put_user_sal(x, gaddr) put_user((x), (gaddr), abi_long)
-#define put_user_u64(x, gaddr) put_user((x), (gaddr), uint64_t)
-#define 

[Qemu-devel] [RFC PATCH 3/3] Introduce libtcg infrastructure

2017-01-21 Thread Alessandro Di Federico
[This is a draft patch is for review purposes only]

* Extend the build system to build libtcg-$arch.so dynamic libraries.
* Introduce --enable-libtcg and --disable-libtcg the *-libtcg target,
  similar to *-linux-user and *-bsd-user, since it enables
  CONFIG_USER_ONLY, but uses only the TCG frontends (in particular the
  various /target/$arch/translate.c).
* If there's at least a *-libtcg, compile everything as position
  independent code.
* In case we're building libtcg, install the output binary in the
  $PREFIX/lib directory instead of $PREFIX/bin.
* Reduce the number of object files linked into libtcg-$arch.so to the
  minimum.
* Install the tcg.h header to use libtcg.
* Create libtcg/, with a draft function referencing the functions that
  libtcg would use.
---
 Makefile|  7 +
 Makefile.target | 40 +---
 configure   | 20 
 crypto/Makefile.objs|  2 +-
 default-configs/aarch64-libtcg.mak  |  0
 default-configs/alpha-libtcg.mak|  0
 default-configs/arm-libtcg.mak  |  0
 default-configs/armeb-libtcg.mak|  0
 default-configs/cris-libtcg.mak |  0
 default-configs/i386-libtcg.mak |  0
 default-configs/m68k-libtcg.mak |  0
 default-configs/microblaze-libtcg.mak   |  0
 default-configs/microblazeel-libtcg.mak |  0
 default-configs/mips-libtcg.mak |  0
 default-configs/mips64-libtcg.mak   |  0
 default-configs/mips64el-libtcg.mak |  0
 default-configs/mipsel-libtcg.mak   |  0
 default-configs/mipsn32-libtcg.mak  |  0
 default-configs/mipsn32el-libtcg.mak|  0
 default-configs/or32-libtcg.mak |  0
 default-configs/ppc-libtcg.mak  |  1 +
 default-configs/ppc64-libtcg.mak|  1 +
 default-configs/ppc64abi32-libtcg.mak   |  1 +
 default-configs/ppc64le-libtcg.mak  |  1 +
 default-configs/s390x-libtcg.mak|  0
 default-configs/sh4-libtcg.mak  |  0
 default-configs/sh4eb-libtcg.mak|  0
 default-configs/sparc-libtcg.mak|  0
 default-configs/sparc32plus-libtcg.mak  |  0
 default-configs/sparc64-libtcg.mak  |  0
 default-configs/unicore32-libtcg.mak|  0
 default-configs/x86_64-libtcg.mak   |  0
 hw/core/Makefile.objs   |  5 ++-
 include/exec/helper-gen.h   | 12 +++
 include/exec/helper-head.h  |  8 +
 include/exec/helper-tcg.h   | 12 +++
 libtcg/Makefile.objs|  1 +
 libtcg/qemu.h   |  6 
 libtcg/tcg.c| 55 +
 libtcg/tcg.h|  0
 target/alpha/Makefile.objs  |  9 --
 target/arm/Makefile.objs| 22 +
 target/cris/Makefile.objs   | 10 --
 target/i386/Makefile.objs   | 14 ++---
 target/lm32/Makefile.objs   | 12 +--
 target/m68k/Makefile.objs   |  7 -
 target/microblaze/Makefile.objs | 10 --
 target/mips/Makefile.objs   | 11 +--
 target/moxie/Makefile.objs  |  8 -
 target/openrisc/Makefile.objs   | 11 +--
 target/ppc/Makefile.objs| 18 +--
 target/ppc/translate.c  |  2 ++
 target/ppc/translate_init.c |  8 +
 target/s390x/Makefile.objs  | 16 +++---
 target/sh4/Makefile.objs|  8 -
 target/sparc/Makefile.objs  | 15 ++---
 target/tilegx/Makefile.objs |  7 -
 target/tricore/Makefile.objs|  7 -
 target/unicore32/Makefile.objs  |  8 +++--
 target/xtensa/Makefile.objs | 14 ++---
 trace/Makefile.objs |  2 +-
 translate-all.c |  4 ++-
 62 files changed, 322 insertions(+), 73 deletions(-)
 create mode 100644 default-configs/aarch64-libtcg.mak
 create mode 100644 default-configs/alpha-libtcg.mak
 create mode 100644 default-configs/arm-libtcg.mak
 create mode 100644 default-configs/armeb-libtcg.mak
 create mode 100644 default-configs/cris-libtcg.mak
 create mode 100644 default-configs/i386-libtcg.mak
 create mode 100644 default-configs/m68k-libtcg.mak
 create mode 100644 default-configs/microblaze-libtcg.mak
 create mode 100644 default-configs/microblazeel-libtcg.mak
 create mode 100644 default-configs/mips-libtcg.mak
 create mode 100644 default-configs/mips64-libtcg.mak
 create mode 100644 default-configs/mips64el-libtcg.mak
 create mode 100644 default-configs/mipsel-libtcg.mak
 create mode 100644 default-configs/mipsn32-libtcg.mak
 create mode 100644 default-configs/mipsn32el-libtcg.mak
 create mode 100644 default-configs/or32-libtcg.mak
 create mode 100644 default-configs/ppc-libtcg.mak
 create mode 100644 default-configs/ppc64-libtcg.mak
 create mode 100644 default-configs/ppc64abi32-libtcg.mak

[Qemu-devel] [RFC PATCH 0/3] Preparing the build system for libtcg

2017-01-21 Thread Alessandro Di Federico
This series of patches introduce a set of changes, mainly to the QEMU
build system, to open the way to implementing "libtcg", i.e., using
QEMU's tiny code generator frontends as a library.

For the initial proposal, please see the related discussion:

http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg04847.html

The idea is to build a PIC version of QEMU embedding only what's
required to use the TCG frontend (typically
target/$arch/translate.c). To achieve this, some functions have been
moved out of translate.c to reduce the coupling with the other object
files.

All the configurations available on Linux and FreeBSD still build fine
(without warnings).

The last commit also introduces the libtcg/ directory and a series of
*-libtcg targets (similar to the *-{bsd,linux}-user) with the bare
minimum skeleton for the actual implementation and some dummy references
to some functions that in my previous implementation of libtcg were
required, to make sure that all the required object files are linked in.

I've verified that all the libtcg-$arch.so libraries have no pending
symbols.

This is the complete list of new targets available:

aarch64-libtcg alpha-libtcg arm-libtcg armeb-libtcg cris-libtcg i386-libtcg 
m68k-libtcg microblaze-libtcg microblazeel-libtcg mips-libtcg mips64-libtcg 
mips64el-libtcg mipsel-libtcg mipsn32-libtcg mipsn32el-libtcg or32-libtcg 
ppc-libtcg ppc64-libtcg ppc64abi32-libtcg ppc64le-libtcg s390x-libtcg 
sh4-libtcg sh4eb-libtcg sparc-libtcg sparc32plus-libtcg sparc64-libtcg 
unicore32-libtcg x86_64-libtcg

Bonus: to reduce the coupling among object files I've create a quick &
dirty Python script to create a Graphviz graph of the various
dependencies among the object files. Simply feed it with a set of object
files, change the name of the "entry" object file (currently anything
containing "libtcg") and it will print the .dot file:

python graph.py *.o > graph.dot
dot -Tsvg graph.dot > graph.svg

You can find the script here:

https://clearmind.me/downloads/graph.py

Comments are very welcome.

Alessandro Di Federico (3):
  Factor out {linux,bsd}-user/qemu.h
  *-user targets object files decoupling
  Introduce libtcg infrastructure

 Makefile|   7 ++
 Makefile.target |  40 ++-
 bsd-user/qemu.h | 193 +---
 configure   |  20 
 crypto/Makefile.objs|   2 +-
 default-configs/aarch64-libtcg.mak  |   0
 default-configs/alpha-libtcg.mak|   0
 default-configs/arm-libtcg.mak  |   0
 default-configs/armeb-libtcg.mak|   0
 default-configs/cris-libtcg.mak |   0
 default-configs/i386-libtcg.mak |   0
 default-configs/m68k-libtcg.mak |   0
 default-configs/microblaze-libtcg.mak   |   0
 default-configs/microblazeel-libtcg.mak |   0
 default-configs/mips-libtcg.mak |   0
 default-configs/mips64-libtcg.mak   |   0
 default-configs/mips64el-libtcg.mak |   0
 default-configs/mipsel-libtcg.mak   |   0
 default-configs/mipsn32-libtcg.mak  |   0
 default-configs/mipsn32el-libtcg.mak|   0
 default-configs/or32-libtcg.mak |   0
 default-configs/ppc-libtcg.mak  |   1 +
 default-configs/ppc64-libtcg.mak|   1 +
 default-configs/ppc64abi32-libtcg.mak   |   1 +
 default-configs/ppc64le-libtcg.mak  |   1 +
 default-configs/s390x-libtcg.mak|   0
 default-configs/sh4-libtcg.mak  |   0
 default-configs/sh4eb-libtcg.mak|   0
 default-configs/sparc-libtcg.mak|   0
 default-configs/sparc32plus-libtcg.mak  |   0
 default-configs/sparc64-libtcg.mak  |   0
 default-configs/unicore32-libtcg.mak|   0
 default-configs/x86_64-libtcg.mak   |   0
 exec.c  |   2 +
 hw/core/Makefile.objs   |   5 +-
 include/exec/helper-gen.h   |  12 +-
 include/exec/helper-head.h  |   8 ++
 include/exec/helper-tcg.h   |  12 +-
 libtcg/Makefile.objs|   1 +
 libtcg/qemu.h   |   6 +
 libtcg/tcg.c|  55 +
 libtcg/tcg.h|   0
 linux-user/qemu.h   | 176 +
 qemu-user-common.h  | 180 +
 target/alpha/Makefile.objs  |   9 +-
 target/arm/Makefile.objs|  22 +++-
 target/arm/cpu.c| 118 +++
 target/arm/cpu.h|  82 ++
 target/arm/helper.c |  73 +---
 target/arm/translate-a64.c  |  54 -
 target/arm/translate.c  |  97 ++--
 target/arm/translate.h  |   7 --
 target/cris/Makefile.objs   |  10 +-
 target/i386/Makefile.objs   |  14 ++-
 

Re: [Qemu-devel] Nested PCI passthrough

2017-01-21 Thread Alex Williamson
On Sat, 21 Jan 2017 04:16:25 +0100
fassl  wrote:

> Hello,
> 
> i am trying to pass through a graphic card to a guest within a guest.
> So far i can see a text console within the target vm which says
> "radeon: ring 0 test failed", so it times out in radeon_vce_ring_test
> function. I am using qemu 2.8.50 at revision
> 0f6bcf68a99efdc531b209551f2b760b0bdcc554.
> 
> The relevant lowermost host arguments are:
> -machine pc-q35-2.8,accel=kvm,kernel-irqchip=split
> -device intel-iommu,intremap=on,eim=on
> 
> If i dont set the x-vga flag for the passed through device in the
> lowermost host the VM within the VM does not reset(?) the device during
> shutdown and the screen freezes. If i do, the screen goes black and no
> signal is going to the screen anymore and i can restart the target VM
> without the whole machine freezing. (one has to set
> CONFIG_VFIO_PCI_VGA=y in the kernel of the first VM, or call
> pci_register_vga to get this)
> 
> Also with irqchip=split the first vm cannot shutdown gracefully and
> crashes during shutdown.
> 
> I also can see some IRTE vector and trigger mode inconsistencies, can
> they cause this?
> 
> Or are there some quirks to be done to get this going?
> 
> If anybody could lead me into the right direction or if there is work
> going on in this field i would be more than glad to help testing.
> 
> Thank you in advance

This shouldn't even be possible right now, vfio-pci should generate an
abort when used with intel-iommu.  There is work in progress to make
some form of vfio-pci work with intel-iommu, but devices that don't
support function level reset (such as graphics cards) are going to have
a more difficult time than most being supported in this configuration.
What is your use case that you're even attempting this (or consider it
a sane thing to do)?  Thanks,

Alex



[Qemu-devel] Nested PCI passthrough

2017-01-21 Thread fassl
Hello,

i am trying to pass through a graphic card to a guest within a guest.
So far i can see a text console within the target vm which says
"radeon: ring 0 test failed", so it times out in radeon_vce_ring_test
function. I am using qemu 2.8.50 at revision
0f6bcf68a99efdc531b209551f2b760b0bdcc554.

The relevant lowermost host arguments are:
-machine pc-q35-2.8,accel=kvm,kernel-irqchip=split
-device intel-iommu,intremap=on,eim=on

If i dont set the x-vga flag for the passed through device in the
lowermost host the VM within the VM does not reset(?) the device during
shutdown and the screen freezes. If i do, the screen goes black and no
signal is going to the screen anymore and i can restart the target VM
without the whole machine freezing. (one has to set
CONFIG_VFIO_PCI_VGA=y in the kernel of the first VM, or call
pci_register_vga to get this)

Also with irqchip=split the first vm cannot shutdown gracefully and
crashes during shutdown.

I also can see some IRTE vector and trigger mode inconsistencies, can
they cause this?

Or are there some quirks to be done to get this going?

If anybody could lead me into the right direction or if there is work
going on in this field i would be more than glad to help testing.

Thank you in advance