[Qemu-devel] Some patch about mips, gen_HILO bug fix.

2012-12-10 Thread Elta Era
Hi all, I have made three patches for mips:

1: Fix my email address in dsp_helper.c
2: Fix repl_ph: the value should be sign-extended to target_long
3: Fix gen_HILO: there is a bug when the dsp arch is in use. In that case the
acc index can be 0-3, and mipsdsp has already been added. mipsdsp simply takes
the acc index from the opcode; on other archs this may cause an error.


0001-Fix-my-email-address.patch
Description: Binary data


0002-Make-repl_ph-to-sign-extended-to-target_long.patch
Description: Binary data


0003-Fix-gen_HILO-to-make-it-adapt-each-arch-which-use-ac.patch
Description: Binary data


Re: [Qemu-devel] [PATCH/RFC] block: Ensure that block size constraints are considered

2012-12-10 Thread Kevin Wolf
On 07.12.2012 21:26, Heinz Graalfs wrote:
> Hello Kevin,
> 
> I'm resending my answer as of Nov 23rd.
> 
> Is this still on your queue?

No, it wasn't. I guess I was waiting for a new version of the patch.

>>>  }
>>>  
>>>  void *qemu_blockalign(BlockDriverState *bs, size_t size)
>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> index f2f0404..baebf1d 100644
>>> --- a/block/raw-posix.c
>>> +++ b/block/raw-posix.c
>>> @@ -700,6 +700,12 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
>>>      acb->aio_nbytes = nb_sectors * 512;
>>>      acb->aio_offset = sector_num * 512;
>>>  
>>> +    /* O_DIRECT also requires an aligned length */
>>> +    if (bs->open_flags & BDRV_O_NOCACHE) {
>>> +        acb->aio_nbytes += acb->bs->buffer_alignment - 1;
>>> +        acb->aio_nbytes &= ~(acb->bs->buffer_alignment - 1);
>>> +    }
>>
>> Modifying aio_nbytes, but not the iov looks wrong to me. This may work
>> in the handle_aiocb_rw_linear() code path, but not with actual vectored I/O.
> 
> Current coding ensures that read IO buffers always seem to be aligned
> correctly. Whereas read length values are not always appropriate for an
> O_DIRECT scenario.
> 
> For a 2048 formatted disk I verified that
> 
> 1. non vectored IO - the length needs to be adapted several times,
>    which is accomplished now by the patch.
> 
> 2. vectored IO - the qiov's total length is always a multiple of the
>    logical block size
>    (which is also verified in virtio_blk_handle_read())
>    The particular iov length fields are already correctly setup as a
>    multiple of the logical block size when processed in
>    virtio_blk_handle_request().

I must admit that I don't quite understand this. As far as I know,
virtio-blk doesn't distinguish between requests with niov = 1
and real vectored requests. So how can the length of the latter always
be right, whereas the length of the former may be wrong?

The other point is that requests may not even be coming from virtio-blk.
They could be made by other device emulations or they could come from a
block job. (They also could be the result of a merge in the block layer,
though if the original requests were aligned, the result will stay aligned)

Kevin
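
To make the objection above concrete, here is a minimal standalone sketch
(not QEMU code) of the rounding the patch applies to aio_nbytes, set next to
the sum of the iovec lengths that the patch leaves untouched. The helper
names round_up_pow2(), iov_total() and rounded_len_matches_iov() are
hypothetical, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Round len up to a multiple of align; align is assumed to be a power of two. */
static size_t round_up_pow2(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Sum of the lengths of all elements of an I/O vector. */
static size_t iov_total(const struct iovec *iov, int niov)
{
    size_t total = 0;

    for (int i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/*
 * Returns 1 if the rounded-up byte count still equals what the vector
 * describes, 0 if rounding made the request longer than its buffers.
 */
static int rounded_len_matches_iov(const struct iovec *iov, int niov,
                                   size_t nbytes, size_t align)
{
    return round_up_pow2(nbytes, align) == iov_total(iov, niov);
}
```

With a 2048-byte alignment, a 1024-byte request described by two 512-byte
elements rounds up to 2048 while the vector still covers only 1024 bytes,
which is the kind of mismatch between aio_nbytes and the iov raised above.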



Re: [Qemu-devel] [PATCH 4/8] s390: Add channel I/O instructions.

2012-12-10 Thread Alexander Graf


On 07.12.2012, at 13:50, Cornelia Huck  wrote:

> Provide handlers for (most) channel I/O instructions.
> 
> Signed-off-by: Cornelia Huck 
> ---
> target-s390x/cpu.h|  87 +++
> target-s390x/ioinst.c | 694 +-
> target-s390x/ioinst.h |  16 ++
> trace-events  |   6 +
> 4 files changed, 796 insertions(+), 7 deletions(-)
> 
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 73bfc20..bc119f8 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -127,6 +127,8 @@ typedef struct CPUS390XState {
> QEMUTimer *tod_timer;
> 
> QEMUTimer *cpu_timer;
> +
> +void *chsc_page;
> } CPUS390XState;
> 
> #include "cpu-qom.h"
> @@ -363,6 +365,91 @@ static inline unsigned 
> s390_del_running_cpu(CPUS390XState *env)
> void cpu_lock(void);
> void cpu_unlock(void);
> 
> +typedef struct SubchDev SubchDev;
> +typedef struct SCHIB SCHIB;
> +typedef struct ORB ORB;
> +
> +static inline SubchDev *css_find_subch(uint8_t m, uint8_t cssid, uint8_t 
> ssid,
> +   uint16_t schid)
> +{
> +return NULL;
> +}
> +static inline bool css_subch_visible(SubchDev *sch)
> +{
> +return false;
> +}
> +static inline void css_conditional_io_interrupt(SubchDev *sch)
> +{
> +}
> +static inline int css_do_stsch(SubchDev *sch, uint64_t addr)
> +{
> +return -ENODEV;
> +}
> +static inline bool css_schid_final(uint8_t cssid, uint8_t ssid, uint16_t 
> schid)
> +{
> +return true;
> +}
> +static inline int css_do_msch(SubchDev *sch, SCHIB *schib)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_xsch(SubchDev *sch)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_csch(SubchDev *sch)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_hsch(SubchDev *sch)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_ssch(SubchDev *sch, ORB *orb)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_tsch(SubchDev *sch, uint64_t addr)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_stcrw(uint64_t addr)
> +{
> +return 1;
> +}
> +static inline int css_do_tpi(uint64_t addr, int lowcore)
> +{
> +return 0;
> +}
> +static inline int css_collect_chp_desc(int m, uint8_t cssid, uint8_t f_chpid,
> +   int rfmt, uint8_t l_chpid, void *buf)
> +{
> +return 0;
> +}
> +static inline void css_do_schm(uint8_t mbk, int update, int dct, uint64_t 
> mbo)
> +{
> +}
> +static inline int css_enable_mss(void)
> +{
> +return -EINVAL;
> +}
> +static inline int css_enable_mcsse(void)
> +{
> +return -EINVAL;
> +}
> +static inline int css_do_rsch(SubchDev *sch)
> +{
> +return -ENODEV;
> +}
> +static inline int css_do_rchp(uint8_t cssid, uint8_t chpid)
> +{
> +return -ENODEV;
> +}
> +static inline bool css_present(uint8_t cssid)
> +{
> +return false;
> +}
> +
> static inline void cpu_set_tls(CPUS390XState *env, target_ulong newtls)
> {
> env->aregs[0] = newtls >> 32;
> diff --git a/target-s390x/ioinst.c b/target-s390x/ioinst.c
> index 8577b2c..60ce985 100644
> --- a/target-s390x/ioinst.c
> +++ b/target-s390x/ioinst.c
> @@ -15,14 +15,19 @@
> 
> #include "cpu.h"
> #include "ioinst.h"
> +#include "trace.h"
> 
> -#ifdef DEBUG_IOINST
> -#define dprintf(fmt, ...) \
> -do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> -#else
> -#define dprintf(fmt, ...) \
> -do { } while (0)
> -#endif
> +/* Special handling for the prefix page. */
> +static void *s390_get_address(CPUS390XState *env, ram_addr_t guest_addr)
> +{
> +if (guest_addr < 8192) {
> +guest_addr += env->psa;
> +} else if ((env->psa <= guest_addr) && (guest_addr < env->psa + 8192)) {
> +guest_addr -= env->psa;
> +}
> +
> +return qemu_get_ram_ptr(guest_addr);

Do we actually need this?

> +}
> 
> int ioinst_disassemble_sch_ident(uint32_t value, int *m, int *cssid, int 
> *ssid,
>  int *schid)
> @@ -44,3 +49,678 @@ int ioinst_disassemble_sch_ident(uint32_t value, int *m, 
> int *cssid, int *ssid,
> *schid = value & IOINST_SCHID_NR;
> return 0;
> }
> +
> +int ioinst_handle_xsch(CPUS390XState *env, uint64_t reg1)
> +{
> +int cssid, ssid, schid, m;
> +SubchDev *sch;
> +int ret = -ENODEV;
> +int cc;
> +
> +if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid)) {
> +program_interrupt(env, PGM_OPERAND, 2);
> +return -EIO;
> +}
> +trace_ioinst_sch_id("xsch", cssid, ssid, schid);
> +sch = css_find_subch(m, cssid, ssid, schid);
> +if (sch && css_subch_visible(sch)) {
> +ret = css_do_xsch(sch);
> +}
> +switch (ret) {
> +case -ENODEV:
> +cc = 3;
> +break;
> +case -EBUSY:
> +cc = 2;
> +break;
> +case 0:
> +cc = 0;
> +break;
> +default:
> +cc = 1;
> +break;
> +}
> +
> +return cc;
> +}
> +
> +int ioinst_handle_csch(CPUS390XState *env, uint64_t re

[Qemu-devel] [PATCH] qemu-img: find the highest offset in use during check

2012-12-10 Thread Federico Simoncelli
This patch adds the support for reporting the highest offset in use by
an image. This is particularly useful after a conversion (or a rebase)
where the destination is a block device in order to find the actual
amount of space in use.

Signed-off-by: Federico Simoncelli 
---
 block.h|1 +
 block/qcow2-refcount.c |   10 --
 qemu-img.c |4 
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/block.h b/block.h
index 722c620..de42e8c 100644
--- a/block.h
+++ b/block.h
@@ -213,6 +213,7 @@ typedef struct BdrvCheckResult {
 int check_errors;
 int corruptions_fixed;
 int leaks_fixed;
+int64_t highest_offset;
 BlockFragInfo bfi;
 } BdrvCheckResult;
 
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 96224d1..017439d 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1116,7 +1116,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
   BdrvCheckMode fix)
 {
 BDRVQcowState *s = bs->opaque;
-int64_t size, i;
+int64_t size, i, highest_cluster;
 int nb_clusters, refcount1, refcount2;
 QCowSnapshot *sn;
 uint16_t *refcount_table;
@@ -1154,7 +1154,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 s->refcount_table_offset,
 s->refcount_table_size * sizeof(uint64_t));
 
-for(i = 0; i < s->refcount_table_size; i++) {
+for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
 uint64_t offset, cluster;
 offset = s->refcount_table[i];
 cluster = offset >> s->cluster_bits;
@@ -1197,6 +1197,11 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 }
 
 refcount2 = refcount_table[i];
+
+if (refcount1 > 0 || refcount2 > 0) {
+highest_cluster = i;
+}
+
 if (refcount1 != refcount2) {
 
 /* Check if we're allowed to fix the mismatch */
@@ -1231,6 +1236,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 }
 }
 
+res->highest_offset = (highest_cluster + 1) * s->cluster_size;
 ret = 0;
 
 fail:
diff --git a/qemu-img.c b/qemu-img.c
index e29e01b..3a8090b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -470,6 +470,10 @@ static int img_check(int argc, char **argv)
 result.bfi.fragmented_clusters * 100.0 / result.bfi.allocated_clusters);
 }
 
+if (result.highest_offset > 0) {
+printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
+}
+
 bdrv_delete(bs);
 
 if (ret < 0 || result.check_errors) {
-- 
1.7.1
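
As a rough standalone illustration of what the patch computes (not the qcow2
code itself), the sketch below scans an in-memory refcount array and derives
the end offset of the last cluster in use. The function names and the plain
uint16_t array are stand-ins for the qcow2 state. Two details are assumptions
of this sketch rather than of the patch: the scan starts from -1 so that an
all-zero table yields 0 (the patch starts from 0), and the value is formatted
with PRId64, which is the portable conversion for int64_t.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return the end offset of the highest cluster with a non-zero refcount,
 * or 0 if no cluster is in use.
 */
static int64_t highest_offset_in_use(const uint16_t *refcounts,
                                     int64_t nb_clusters,
                                     int64_t cluster_size)
{
    int64_t highest_cluster = -1;

    for (int64_t i = 0; i < nb_clusters; i++) {
        if (refcounts[i] > 0) {
            highest_cluster = i;
        }
    }
    return (highest_cluster + 1) * cluster_size;
}

/* Format the value into buf; returns what snprintf() returns. */
static int format_highest_offset(char *buf, size_t len, int64_t offset)
{
    return snprintf(buf, len, "Highest offset in use: %" PRId64 "\n", offset);
}
```

For a table whose last used entry is cluster 3 and a 64 KiB cluster size, the
result is 4 * 65536 bytes, i.e. the point up to which the destination block
device is actually occupied.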




Re: [Qemu-devel] [PATCH 4/8] s390: Add channel I/O instructions.

2012-12-10 Thread Cornelia Huck
On Mon, 10 Dec 2012 10:00:16 +0100
Alexander Graf  wrote:

> 
> 
> On 07.12.2012, at 13:50, Cornelia Huck  wrote:
> 
> > Provide handlers for (most) channel I/O instructions.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > target-s390x/cpu.h|  87 +++
> > target-s390x/ioinst.c | 694 
> > +-
> > target-s390x/ioinst.h |  16 ++
> > trace-events  |   6 +
> > 4 files changed, 796 insertions(+), 7 deletions(-)
> > 
> > diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> > index 73bfc20..bc119f8 100644
> > --- a/target-s390x/cpu.h
> > +++ b/target-s390x/cpu.h
> > @@ -127,6 +127,8 @@ typedef struct CPUS390XState {
> > QEMUTimer *tod_timer;
> > 
> > QEMUTimer *cpu_timer;
> > +
> > +void *chsc_page;
> > } CPUS390XState;
> > 
> > #include "cpu-qom.h"
> > @@ -363,6 +365,91 @@ static inline unsigned 
> > s390_del_running_cpu(CPUS390XState *env)
> > void cpu_lock(void);
> > void cpu_unlock(void);
> > 
> > +typedef struct SubchDev SubchDev;
> > +typedef struct SCHIB SCHIB;
> > +typedef struct ORB ORB;
> > +
> > +static inline SubchDev *css_find_subch(uint8_t m, uint8_t cssid, uint8_t 
> > ssid,
> > +   uint16_t schid)
> > +{
> > +return NULL;
> > +}
> > +static inline bool css_subch_visible(SubchDev *sch)
> > +{
> > +return false;
> > +}
> > +static inline void css_conditional_io_interrupt(SubchDev *sch)
> > +{
> > +}
> > +static inline int css_do_stsch(SubchDev *sch, uint64_t addr)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline bool css_schid_final(uint8_t cssid, uint8_t ssid, uint16_t 
> > schid)
> > +{
> > +return true;
> > +}
> > +static inline int css_do_msch(SubchDev *sch, SCHIB *schib)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_xsch(SubchDev *sch)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_csch(SubchDev *sch)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_hsch(SubchDev *sch)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_ssch(SubchDev *sch, ORB *orb)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_tsch(SubchDev *sch, uint64_t addr)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_stcrw(uint64_t addr)
> > +{
> > +return 1;
> > +}
> > +static inline int css_do_tpi(uint64_t addr, int lowcore)
> > +{
> > +return 0;
> > +}
> > +static inline int css_collect_chp_desc(int m, uint8_t cssid, uint8_t 
> > f_chpid,
> > +   int rfmt, uint8_t l_chpid, void 
> > *buf)
> > +{
> > +return 0;
> > +}
> > +static inline void css_do_schm(uint8_t mbk, int update, int dct, uint64_t 
> > mbo)
> > +{
> > +}
> > +static inline int css_enable_mss(void)
> > +{
> > +return -EINVAL;
> > +}
> > +static inline int css_enable_mcsse(void)
> > +{
> > +return -EINVAL;
> > +}
> > +static inline int css_do_rsch(SubchDev *sch)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline int css_do_rchp(uint8_t cssid, uint8_t chpid)
> > +{
> > +return -ENODEV;
> > +}
> > +static inline bool css_present(uint8_t cssid)
> > +{
> > +return false;
> > +}
> > +
> > static inline void cpu_set_tls(CPUS390XState *env, target_ulong newtls)
> > {
> > env->aregs[0] = newtls >> 32;
> > diff --git a/target-s390x/ioinst.c b/target-s390x/ioinst.c
> > index 8577b2c..60ce985 100644
> > --- a/target-s390x/ioinst.c
> > +++ b/target-s390x/ioinst.c
> > @@ -15,14 +15,19 @@
> > 
> > #include "cpu.h"
> > #include "ioinst.h"
> > +#include "trace.h"
> > 
> > -#ifdef DEBUG_IOINST
> > -#define dprintf(fmt, ...) \
> > -do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> > -#else
> > -#define dprintf(fmt, ...) \
> > -do { } while (0)
> > -#endif
> > +/* Special handling for the prefix page. */
> > +static void *s390_get_address(CPUS390XState *env, ram_addr_t guest_addr)
> > +{
> > +if (guest_addr < 8192) {
> > +guest_addr += env->psa;
> > +} else if ((env->psa <= guest_addr) && (guest_addr < env->psa + 8192)) 
> > {
> > +guest_addr -= env->psa;
> > +}
> > +
> > +return qemu_get_ram_ptr(guest_addr);
> 
> Do we actually need this?

Yes. I've seen failures for I/O instructions using the lowcore (which
the Linux kernel likes to do).
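
For readers unfamiliar with prefixing, the logic of the quoted
s390_get_address() can be restated as a standalone function. This is only a
sketch: apply_prefix() and PREFIX_AREA_SIZE are hypothetical names, and the
8192-byte size is taken from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

#define PREFIX_AREA_SIZE 8192u  /* size of the swapped area, as in the patch */

/*
 * Swap the low 8K with the 8K block starting at the prefix address (psa);
 * every other address is returned unchanged.
 */
static uint64_t apply_prefix(uint64_t addr, uint64_t psa)
{
    if (addr < PREFIX_AREA_SIZE) {
        return addr + psa;
    }
    if (addr >= psa && addr < psa + PREFIX_AREA_SIZE) {
        return addr - psa;
    }
    return addr;
}
```

An access to the lowcore at address 0x10 with a prefix of 0x20000 therefore
ends up at 0x20010, which is why operands placed in the lowcore need this
extra step before the RAM pointer is looked up.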

> 
> > +}
> > 
> > int ioinst_disassemble_sch_ident(uint32_t value, int *m, int *cssid, int 
> > *ssid,
> >  int *schid)
> > @@ -44,3 +49,678 @@ int ioinst_disassemble_sch_ident(uint32_t value, int 
> > *m, int *cssid, int *ssid,
> > *schid = value & IOINST_SCHID_NR;
> > return 0;
> > }
> > +
> > +int ioinst_handle_xsch(CPUS390XState *env, uint64_t reg1)
> > +{
> > +int cssid, ssid, schid, m;
> > +SubchDev *sch;
> > +int ret = -ENODEV;
> > +int cc;
> > +
> > +if (ioinst_disassemble_sch_ident(reg1, &m, &cssid, &ssid, &schid)) {
> > +program_interrupt(env, PGM_OPERAND, 2);
> > +return -EIO;
> > +}
> >

Re: [Qemu-devel] Some patch about mips, gen_HILO bug fix.

2012-12-10 Thread Wei-Ren Chen
Hi,

> 1: Fix my email address in dsp_helper.c
> 2: Fix repl_ph: the value should be sign-extended to target_long
> 3: Fix gen_HILO: there is a bug when the dsp arch is in use. In that case the
> acc index can be 0-3, and mipsdsp has already been added. mipsdsp simply takes
> the acc index from the opcode; on other archs this may cause an error.

  Why not use `git send-email`? See more details at
  http://wiki.qemu.org/Contribute/SubmitAPatch

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] removing on-demand msix vector allocation

2012-12-10 Thread Michael S. Tsirkin
On Fri, Dec 07, 2012 at 08:37:22AM +0100, Jan Kiszka wrote:
> On 2012-12-06 08:59, Michael S. Tsirkin wrote:
> > I've been looking at handling of msix masking
> > in qemu. It looks like all of virtio,vfio and
> > device assignment implemented their own
> > similar but slightly different thing.
> > So I am inclined to move this handling to common
> > code in msix.c, adding irqfd support right there.
> > 
> > While doing this rework, one of the more painful
> > bits of code to change is the code that dynamically
> > allocates msix table entries as we inject msi.
> > If this actually triggers it's going to be
> > painfully slow as route changes are rcu
> > write side in kernel.
> > Since recent kernels support direct injection,
> > do we care anymore? I think if you run out of
> > vectors, it's better to simply disable irqchip
> > than try to limp along changing routes all the time.
> 
> But how would the logic without dynamic allocation look like? Always
> configure a route in the PCI layer if an MSI/MSI-X entry is enabled?
> That would also affect emulated devices that don't use irqfd, thus you
> would waste routing entries.

Yes.
So we can fail during initialization and ask the user to
disable irqchip: at the moment, at least in my testing,
dynamic swap out of MSI entries performs very badly
anyway.

> OTOH, if don't set up such routes, you
> cannot support MSI/-X on older x86 kernels without direct vector injection.
> 
> Jan
> 
> 





Re: [Qemu-devel] [PATCH 6/8] s390: Wire up channel I/O in kvm.

2012-12-10 Thread Alexander Graf


On 07.12.2012, at 13:50, Cornelia Huck  wrote:

> Trigger the code for our virtual css in case of instruction
> intercepts for I/O instructions.
> 
> Handle the tsch exit for the subchannel-related part of tsch.
> 
> Signed-off-by: Cornelia Huck 
> ---
> target-s390x/cpu.h |  11 +++
> target-s390x/kvm.c | 246 ++---
> 2 files changed, 244 insertions(+), 13 deletions(-)
> 
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index eb24c6b..1d21ca8 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -1178,6 +1178,13 @@ uint32_t set_cc_nz_f64(float64 v);
> /* misc_helper.c */
> void program_interrupt(CPUS390XState *env, uint32_t code, int ilc);
> 
> +#ifdef CONFIG_KVM
> +int kvm_s390_io_interrupt(CPUS390XState *env, uint16_t subchannel_id,
> +  uint16_t subchannel_nr, uint32_t io_int_parm,
> +  uint32_t io_int_word);
> +int kvm_s390_crw_mchk(CPUS390XState *env);
> +void kvm_s390_enable_css_support(CPUS390XState *env);
> +#else
> static inline int kvm_s390_io_interrupt(CPUS390XState *env,
> uint16_t subchannel_id,
> uint16_t subchannel_nr,
> @@ -1190,6 +1197,10 @@ static inline int kvm_s390_crw_mchk(CPUS390XState *env)
> {
> return -EOPNOTSUPP;
> }
> +static inline void kvm_s390_enable_css_support(CPUS390XState *env)
> +{
> +}
> +#endif
> 
> static inline void s390_io_interrupt(CPUS390XState *env,
>  uint16_t subchannel_id,
> diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> index 94de764..c9a4a7d 100644
> --- a/target-s390x/kvm.c
> +++ b/target-s390x/kvm.c
> @@ -30,6 +30,7 @@
> #include "kvm.h"
> #include "cpu.h"
> #include "device_tree.h"
> +#include "ioinst.h"
> 
> /* #define DEBUG_KVM */
> 
> @@ -43,9 +44,27 @@
> 
> #define IPA0_DIAG   0x8300
> #define IPA0_SIGP   0xae00
> -#define IPA0_PRIV   0xb200
> +#define IPA0_B2 0xb200
> +#define IPA0_B9 0xb900
> +#define IPA0_EB 0xeb00
> 
> #define PRIV_SCLP_CALL  0x20
> +#define PRIV_CSCH   0x30
> +#define PRIV_HSCH   0x31
> +#define PRIV_MSCH   0x32
> +#define PRIV_SSCH   0x33
> +#define PRIV_STSCH  0x34
> +#define PRIV_TSCH   0x35
> +#define PRIV_TPI0x36
> +#define PRIV_SAL0x37
> +#define PRIV_RSCH   0x38
> +#define PRIV_STCRW  0x39
> +#define PRIV_STCPS  0x3a
> +#define PRIV_RCHP   0x3b
> +#define PRIV_SCHM   0x3c
> +#define PRIV_CHSC   0x5f
> +#define PRIV_SIGA   0x74
> +#define PRIV_XSCH   0x76
> #define DIAG_KVM_HYPERCALL  0x500
> #define DIAG_KVM_BREAKPOINT 0x501
> 
> @@ -350,10 +369,120 @@ static int kvm_sclp_service_call(CPUS390XState *env, 
> struct kvm_run *run,
> return 0;
> }
> 
> -static int handle_priv(CPUS390XState *env, struct kvm_run *run, uint8_t ipa1)
> +static int kvm_handle_css_inst(CPUS390XState *env, struct kvm_run *run,
> +   uint8_t ipa0, uint8_t ipa1, uint8_t ipb)
> +{
> +int r = 0;
> +int no_cc = 0;
> +
> +if (ipa0 != 0xb2) {
> +/* Not handled for now. */
> +return -1;
> +}
> +cpu_synchronize_state(env);
> +switch (ipa1) {
> +case PRIV_XSCH:
> +r = ioinst_handle_xsch(env, env->regs[1]);
> +break;
> +case PRIV_CSCH:
> +r = ioinst_handle_csch(env, env->regs[1]);
> +break;
> +case PRIV_HSCH:
> +r = ioinst_handle_hsch(env, env->regs[1]);
> +break;
> +case PRIV_MSCH:
> +r = ioinst_handle_msch(env, env->regs[1], run->s390_sieic.ipb);
> +break;
> +case PRIV_SSCH:
> +r = ioinst_handle_ssch(env, env->regs[1], run->s390_sieic.ipb);
> +break;
> +case PRIV_STCRW:
> +r = ioinst_handle_stcrw(env, run->s390_sieic.ipb);
> +break;
> +case PRIV_STSCH:
> +r = ioinst_handle_stsch(env, env->regs[1], run->s390_sieic.ipb);
> +break;
> +case PRIV_TSCH:
> +/* We should only get tsch via KVM_EXIT_S390_TSCH. */
> +fprintf(stderr, "Spurious tsch intercept\n");
> +break;
> +case PRIV_CHSC:
> +r = ioinst_handle_chsc(env, run->s390_sieic.ipb);
> +break;
> +case PRIV_TPI:
> +/* This should have been handled by kvm already. */
> +fprintf(stderr, "Spurious tpi intercept\n");
> +break;
> +case PRIV_SCHM:
> +no_cc = 1;
> +r = ioinst_handle_schm(env, env->regs[1], env->regs[2],
> +   run->s390_sieic.ipb);
> +b

Re: [Qemu-devel] removing on-demand msix vector allocation

2012-12-10 Thread Jan Kiszka
On 2012-12-10 10:36, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2012 at 08:37:22AM +0100, Jan Kiszka wrote:
>> On 2012-12-06 08:59, Michael S. Tsirkin wrote:
>>> I've been looking at handling of msix masking
>>> in qemu. It looks like all of virtio,vfio and
>>> device assignment implemented their own
>>> similar but slightly different thing.
>>> So I am inclined to move this handling to common
>>> code in msix.c, adding irqfd support right there.
>>>
>>> While doing this rework, one of the more painful
>>> bits of code to change is the code that dynamically
>>> allocates msix table entries as we inject msi.
>>> If this actually triggers it's going to be
>>> painfully slow as route changes are rcu
>>> write side in kernel.
>>> Since recent kernels support direct injection,
>>> do we care anymore? I think if you run out of
>>> vectors, it's better to simply disable irqchip
>>> than try to limp along changing routes all the time.
>>
>> But how would the logic without dynamic allocation look like? Always
>> configure a route in the PCI layer if an MSI/MSI-X entry is enabled?
>> That would also affect emulated devices that don't use irqfd, thus you
>> would waste routing entries.
> 
> Yes. 
> So we can fail during initialization and ask user to
> disable irqchip: at the moment, at least in my testing,
> dynamic swap out of MSI entries performs very badly
> anyway.

That would be a poor approach as it regresses needlessly even on the
latest kernels. We only allocate/flush dynamically on older kernels
without direct MSI injection.

What we need is a flag, set e.g. on msi[x]_init, to give the core a hint
if it should allocate static routes for irqfd or if it will be able to
use direct injection later on. Then we can simply do static allocation
unconditionally on kernels without direct injection.

Jan
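
One possible shape for the hint proposed above, as a sketch only: none of the
names below exist in QEMU, and the two booleans stand in for "the device
passed an irqfd hint at msi[x]_init time" and "the host kernel supports
direct MSI injection".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy values for illustration. */
typedef enum {
    MSI_ROUTE_NONE,     /* rely on direct injection, no routing entry */
    MSI_ROUTE_STATIC    /* allocate a routing entry once, up front */
} MsiRoutePolicy;

static MsiRoutePolicy msi_route_policy(bool uses_irqfd,
                                       bool kernel_has_direct_injection)
{
    if (!kernel_has_direct_injection) {
        /* Older kernels: static allocation unconditionally. */
        return MSI_ROUTE_STATIC;
    }
    /* Newer kernels: only irqfd users need a route. */
    return uses_irqfd ? MSI_ROUTE_STATIC : MSI_ROUTE_NONE;
}
```

The point of deciding once at init time is that no route has to be allocated
or flushed on the injection path, so emulated devices without irqfd do not
consume routing entries on kernels that can inject directly.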






Re: [Qemu-devel] removing on-demand msix vector allocation

2012-12-10 Thread Michael S. Tsirkin
On Mon, Dec 10, 2012 at 10:39:39AM +0100, Jan Kiszka wrote:
> On 2012-12-10 10:36, Michael S. Tsirkin wrote:
> > On Fri, Dec 07, 2012 at 08:37:22AM +0100, Jan Kiszka wrote:
> >> On 2012-12-06 08:59, Michael S. Tsirkin wrote:
> >>> I've been looking at handling of msix masking
> >>> in qemu. It looks like all of virtio,vfio and
> >>> device assignment implemented their own
> >>> similar but slightly different thing.
> >>> So I am inclined to move this handling to common
> >>> code in msix.c, adding irqfd support right there.
> >>>
> >>> While doing this rework, one of the more painful
> >>> bits of code to change is the code that dynamically
> >>> allocates msix table entries as we inject msi.
> >>> If this actually triggers it's going to be
> >>> painfully slow as route changes are rcu
> >>> write side in kernel.
> >>> Since recent kernels support direct injection,
> >>> do we care anymore? I think if you run out of
> >>> vectors, it's better to simply disable irqchip
> >>> than try to limp along changing routes all the time.
> >>
> >> But how would the logic without dynamic allocation look like? Always
> >> configure a route in the PCI layer if an MSI/MSI-X entry is enabled?
> >> That would also affect emulated devices that don't use irqfd, thus you
> >> would waste routing entries.
> > 
> > Yes. 
> > So we can fail during initialization and ask user to
> > disable irqchip: at the moment, at least in my testing,
> > dynamic swap out of MSI entries performs very badly
> > anyway.
> 
> That would be a poor approach as it regresses needlessly even over
> latest kernels.
> We only allocate/flush dynamically over older kernels
> without direct MSI injections.
> 
> What we need is a flag, set e.g. on msi[x]_init, to give the core a hint
> if it should allocate static routes for irqfd or if it will be able to
> use direct injection later on. Then we can simply do static allocation
> unconditionally on kernels without direct injection.
> 
> Jan
> 
> 

Makes sense.

-- 
MST



Re: [Qemu-devel] [RFC PATCH v4 0/8] s390: channel I/O support in qemu.

2012-12-10 Thread Alexander Graf

On 07.12.2012, at 13:50, Cornelia Huck wrote:

> Hi,
> 
> just a quick dump of my qemu patch series for channel I/O.
> 
> I've managed to chop the virtual css patch into some smaller
> chunks (patches 2-6), which are hopefully easier to review.
> 
> The virtio-ccw patch is still based upon the current virtio
> infrastructure; I'll try to rebase it upon the virtio refactoring
> once that has most of the infrastructure in place.

I wouldn't want to hold this feature back only for some virtio refactoring. 
Let's try to get it into shape for now and worry about the refactor work when 
that's closer to merging.


Alex

> 
> Cornelia Huck (8):
>  Update linux headers.
>  s390: Channel I/O basic defintions.
>  s390: I/O interrupt and machine check injection.
>  s390: Add channel I/O instructions.
>  s390: Virtual channel subsystem support.
>  s390: Wire up channel I/O in kvm.
>  s390-virtio: Factor out some initialization code.
>  s390: Add new channel I/O based virtio transport.
> 
> hw/s390-virtio.c |  281 +---
> hw/s390x/Makefile.objs   |2 +
> hw/s390x/css.c   | 1195 ++
> hw/s390x/css.h   |   92 +++
> hw/s390x/virtio-ccw.c|  909 ++
> hw/s390x/virtio-ccw.h|   81 +++
> linux-headers/asm-generic/kvm_para.h |4 +
> linux-headers/asm-powerpc/kvm.h  |   59 ++
> linux-headers/asm-powerpc/kvm_para.h |7 +-
> linux-headers/linux/kvm.h|   36 +-
> target-s390x/Makefile.objs   |2 +-
> target-s390x/cpu.h   |  230 +++
> target-s390x/helper.c|  145 +
> target-s390x/ioinst.c|  726 +
> target-s390x/ioinst.h|  223 +++
> target-s390x/kvm.c   |  246 ++-
> trace-events |   18 +
> 17 files changed, 4161 insertions(+), 95 deletions(-)
> create mode 100644 hw/s390x/css.c
> create mode 100644 hw/s390x/css.h
> create mode 100644 hw/s390x/virtio-ccw.c
> create mode 100644 hw/s390x/virtio-ccw.h
> create mode 100644 linux-headers/asm-generic/kvm_para.h
> create mode 100644 target-s390x/ioinst.c
> create mode 100644 target-s390x/ioinst.h
> 
> -- 
> 1.7.12.4
> 




Re: [Qemu-devel] [PATCH 2/8] s390: Channel I/O basic defintions.

2012-12-10 Thread Alexander Graf

On 07.12.2012, at 13:50, Cornelia Huck wrote:

> Basic channel I/O structures and helper function.
> 
> Signed-off-by: Cornelia Huck 
> ---
> target-s390x/Makefile.objs |   2 +-
> target-s390x/ioinst.c  |  46 ++
> target-s390x/ioinst.h  | 207 +
> 3 files changed, 254 insertions(+), 1 deletion(-)
> create mode 100644 target-s390x/ioinst.c
> create mode 100644 target-s390x/ioinst.h
> 
> diff --git a/target-s390x/Makefile.objs b/target-s390x/Makefile.objs
> index e728abf..3afb0b7 100644
> --- a/target-s390x/Makefile.objs
> +++ b/target-s390x/Makefile.objs
> @@ -1,4 +1,4 @@
> obj-y += translate.o helper.o cpu.o interrupt.o
> obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
> -obj-$(CONFIG_SOFTMMU) += machine.o
> +obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/target-s390x/ioinst.c b/target-s390x/ioinst.c
> new file mode 100644
> index 000..8577b2c
> --- /dev/null
> +++ b/target-s390x/ioinst.c
> @@ -0,0 +1,46 @@
> +/*
> + * I/O instructions for S/390
> + *
> + * Copyright 2012 IBM Corp.
> + * Author(s): Cornelia Huck 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> + * your option) any later version. See the COPYING file in the top-level
> + * directory.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "cpu.h"
> +#include "ioinst.h"
> +
> +#ifdef DEBUG_IOINST
> +#define dprintf(fmt, ...) \
> +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define dprintf(fmt, ...) \
> +do { } while (0)
> +#endif
> +
> +int ioinst_disassemble_sch_ident(uint32_t value, int *m, int *cssid, int 
> *ssid,
> + int *schid)
> +{
> +if (!(value & IOINST_SCHID_ONE)) {
> +return -EINVAL;
> +}
> +if (!(value & IOINST_SCHID_M)) {
> +if (value & IOINST_SCHID_CSSID) {
> +return -EINVAL;
> +}
> +*cssid = 0;
> +*m = 0;
> +} else {
> +*cssid = (value & IOINST_SCHID_CSSID) >> 24;

(value & IOINST_SCHID_CSSID_MASK) >> IOINST_SCHID_CSSID_SHIFT


Alex
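
A sketch of the mask/shift pairing suggested above. The shift of 24 comes from
the quoted patch; the mask value is an assumption (an 8-bit field in the top
byte), and schid_get_cssid() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: cssid occupies the top 8 bits of the 32-bit word. */
#define IOINST_SCHID_CSSID_SHIFT 24
#define IOINST_SCHID_CSSID_MASK  (0xffu << IOINST_SCHID_CSSID_SHIFT)

/* Extract the cssid field using the named mask and shift. */
static inline int schid_get_cssid(uint32_t value)
{
    return (int)((value & IOINST_SCHID_CSSID_MASK) >> IOINST_SCHID_CSSID_SHIFT);
}
```

Keeping the mask and the shift next to each other avoids a bare 24 in the
code and makes it harder for the two to drift apart if the field definition
changes.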

> +*m = 1;
> +}
> +*ssid = (value & IOINST_SCHID_SSID) >> 17;
> +*schid = value & IOINST_SCHID_NR;
> +return 0;
> +}
> diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h
> new file mode 100644
> index 000..b5b4a03
> --- /dev/null
> +++ b/target-s390x/ioinst.h
> @@ -0,0 +1,207 @@
> +/*
> + * S/390 channel I/O instructions
> + *
> + * Copyright 2012 IBM Corp.
> + * Author(s): Cornelia Huck 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> + * your option) any later version. See the COPYING file in the top-level
> + * directory.
> +*/
> +
> +#ifndef IOINST_S390X_H
> +#define IOINST_S390X_H
> +/*
> + * Channel I/O related definitions, as defined in the Principles
> + * Of Operation (and taken from the Linux implementation).
> + */
> +
> +/* subchannel status word (command mode only) */
> +typedef struct SCSW {
> +uint16_t flags;
> +uint16_t ctrl;
> +uint32_t cpa;
> +uint8_t dstat;
> +uint8_t cstat;
> +uint16_t count;
> +} QEMU_PACKED SCSW;
> +
> +#define SCSW_FLAGS_MASK_KEY 0xf000
> +#define SCSW_FLAGS_MASK_SCTL 0x0800
> +#define SCSW_FLAGS_MASK_ESWF 0x0400
> +#define SCSW_FLAGS_MASK_CC 0x0300
> +#define SCSW_FLAGS_MASK_FMT 0x0080
> +#define SCSW_FLAGS_MASK_PFCH 0x0040
> +#define SCSW_FLAGS_MASK_ISIC 0x0020
> +#define SCSW_FLAGS_MASK_ALCC 0x0010
> +#define SCSW_FLAGS_MASK_SSI 0x0008
> +#define SCSW_FLAGS_MASK_ZCC 0x0004
> +#define SCSW_FLAGS_MASK_ECTL 0x0002
> +#define SCSW_FLAGS_MASK_PNO 0x0001
> +
> +#define SCSW_CTRL_MASK_FCTL 0x7000
> +#define SCSW_CTRL_MASK_ACTL 0x0fe0
> +#define SCSW_CTRL_MASK_STCTL 0x001f
> +
> +#define SCSW_FCTL_CLEAR_FUNC 0x1000
> +#define SCSW_FCTL_HALT_FUNC 0x2000
> +#define SCSW_FCTL_START_FUNC 0x4000
> +
> +#define SCSW_ACTL_SUSP 0x0020
> +#define SCSW_ACTL_DEVICE_ACTIVE 0x0040
> +#define SCSW_ACTL_SUBCH_ACTIVE 0x0080
> +#define SCSW_ACTL_CLEAR_PEND 0x0100
> +#define SCSW_ACTL_HALT_PEND  0x0200
> +#define SCSW_ACTL_START_PEND 0x0400
> +#define SCSW_ACTL_RESUME_PEND 0x0800
> +
> +#define SCSW_STCTL_STATUS_PEND 0x0001
> +#define SCSW_STCTL_SECONDARY 0x0002
> +#define SCSW_STCTL_PRIMARY 0x0004
> +#define SCSW_STCTL_INTERMEDIATE 0x0008
> +#define SCSW_STCTL_ALERT 0x0010
> +
> +#define SCSW_DSTAT_ATTENTION 0x80
> +#define SCSW_DSTAT_STAT_MOD  0x40
> +#define SCSW_DSTAT_CU_END0x20
> +#define SCSW_DSTAT_BUSY  0x10
> +#define SCSW_DSTAT_CHANNEL_END   0x08
> +#define SCSW_DSTAT_DEVICE_END0x04
> +#define SCSW_DSTAT_UNIT_CHECK0x02
> +#define SCSW_DSTAT_UNIT_EXCEP0x01
> +
> +#define SCSW_CSTAT_PCI   0x80
> +#define SCSW_CSTAT_INCORR_LEN0x40
> +#define SCSW_CSTAT_PROG_CHECK0x20
> +#define SCSW_CSTAT_PROT_CHECK0x10
> +#define SCSW_CSTAT_DATA_CHECK0x08
> +#define SCSW_CSTAT_CHN_CTRL_CHK  0x04
> +#define SCSW_CSTAT_INTF_CTR

Re: [Qemu-devel] [PATCH 3/8] s390: I/O interrupt and machine check injection.

2012-12-10 Thread Alexander Graf

On 07.12.2012, at 13:50, Cornelia Huck wrote:

> I/O interrupts are queued per isc. Only crw pending machine checks
> are supported.
> 
> Signed-off-by: Cornelia Huck 
> ---
> target-s390x/cpu.h|  67 +++
> target-s390x/helper.c | 145 ++
> 2 files changed, 212 insertions(+)
> 
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 0f9a1f7..73bfc20 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -47,6 +47,11 @@
> #define MMU_USER_IDX 1
> 
> #define MAX_EXT_QUEUE 16
> +#define MAX_IO_QUEUE 16
> +#define MAX_MCHK_QUEUE 16
> +
> +#define PSW_MCHK_MASK 0x0004
> +#define PSW_IO_MASK 0x0200
> 
> typedef struct PSW {
> uint64_t mask;
> @@ -59,6 +64,17 @@ typedef struct ExtQueue {
> uint32_t param64;
> } ExtQueue;
> 
> +typedef struct IOQueue {
> +uint16_t id;
> +uint16_t nr;
> +uint32_t parm;
> +uint32_t word;
> +} IOQueue;
> +
> +typedef struct MchkQueue {
> +uint16_t type;
> +} MchkQueue;
> +
> typedef struct CPUS390XState {
> uint64_t regs[16];/* GP registers */
> 
> @@ -88,8 +104,16 @@ typedef struct CPUS390XState {
> 
> int pending_int;
> ExtQueue ext_queue[MAX_EXT_QUEUE];
> +IOQueue io_queue[MAX_IO_QUEUE][8];
> +MchkQueue mchk_queue[MAX_MCHK_QUEUE];
> 
> int ext_index;
> +int io_index[8];
> +int mchk_index;
> +
> +uint64_t ckc;
> +uint64_t cputm;
> +uint32_t todpr;
> 
> CPU_COMMON
> 
> @@ -364,12 +388,16 @@ static inline void cpu_set_tls(CPUS390XState *env, 
> target_ulong newtls)
> #define EXCP_EXT 1 /* external interrupt */
> #define EXCP_SVC 2 /* supervisor call (syscall) */
> #define EXCP_PGM 3 /* program interruption */
> +#define EXCP_IO  7 /* I/O interrupt */
> +#define EXCP_MCHK 8 /* machine check */
> 
> #endif /* CONFIG_USER_ONLY */
> 
> #define INTERRUPT_EXT(1 << 0)
> #define INTERRUPT_TOD(1 << 1)
> #define INTERRUPT_CPUTIMER   (1 << 2)
> +#define INTERRUPT_IO (1 << 3)
> +#define INTERRUPT_MCHK   (1 << 4)
> 
> /* Program Status Word.  */
> #define S390_PSWM_REGNUM 0
> @@ -977,6 +1005,45 @@ static inline void cpu_inject_ext(CPUS390XState *env, 
> uint32_t code, uint32_t pa
> cpu_interrupt(env, CPU_INTERRUPT_HARD);
> }
> 
> +static inline void cpu_inject_io(CPUS390XState *env, uint16_t subchannel_id,
> + uint16_t subchannel_number,
> + uint32_t io_int_parm, uint32_t io_int_word)
> +{
> +int isc = ffs(io_int_word << 2) - 1;
> +
> +if (env->io_index[isc] == MAX_IO_QUEUE - 1) {
> +/* ugh - can't queue anymore. Let's drop. */
> +return;
> +}
> +
> +env->io_index[isc]++;
> +assert(env->io_index[isc] < MAX_IO_QUEUE);
> +
> +env->io_queue[env->io_index[isc]][isc].id = subchannel_id;
> +env->io_queue[env->io_index[isc]][isc].nr = subchannel_number;
> +env->io_queue[env->io_index[isc]][isc].parm = io_int_parm;
> +env->io_queue[env->io_index[isc]][isc].word = io_int_word;
> +
> +env->pending_int |= INTERRUPT_IO;
> +cpu_interrupt(env, CPU_INTERRUPT_HARD);
> +}
> +
> +static inline void cpu_inject_crw_mchk(CPUS390XState *env)
> +{
> +if (env->mchk_index == MAX_MCHK_QUEUE - 1) {
> +/* ugh - can't queue anymore. Let's drop. */
> +return;
> +}
> +
> +env->mchk_index++;
> +assert(env->mchk_index < MAX_MCHK_QUEUE);
> +
> +env->mchk_queue[env->mchk_index].type = 1;
> +
> +env->pending_int |= INTERRUPT_MCHK;
> +cpu_interrupt(env, CPU_INTERRUPT_HARD);
> +}
> +
> static inline bool cpu_has_work(CPUState *cpu)
> {
> CPUS390XState *env = &S390_CPU(cpu)->env;
> diff --git a/target-s390x/helper.c b/target-s390x/helper.c
> index b7b812a..4ff148d 100644
> --- a/target-s390x/helper.c
> +++ b/target-s390x/helper.c
> @@ -574,12 +574,144 @@ static void do_ext_interrupt(CPUS390XState *env)
> load_psw(env, mask, addr);
> }
> 
> +static void do_io_interrupt(CPUS390XState *env)
> +{
> +uint64_t mask, addr;
> +LowCore *lowcore;
> +hwaddr len = TARGET_PAGE_SIZE;
> +IOQueue *q;
> +uint8_t isc;
> +int disable = 1;
> +int found = 0;
> +
> +if (!(env->psw.mask & PSW_MASK_IO)) {
> +cpu_abort(env, "I/O int w/o I/O mask\n");
> +}
> +
> +for (isc = 0; isc < 8; isc++) {
> +if (env->io_index[isc] < 0) {
> +continue;
> +}
> +if (env->io_index[isc] > MAX_IO_QUEUE) {
> +cpu_abort(env, "I/O queue overrun for isc %d: %d\n",
> +  isc, env->io_index[isc]);
> +}
> +
> +q = &env->io_queue[env->io_index[isc]][isc];
> +if (!(env->cregs[6] & q->word)) {
> +disable = 0;
> +continue;
> +}
> +found = 1;
> +lowcore = cpu_physical_memory_map(env->psa, &len, 1);

This one is missing a check whether len >= sizeof(*lowcore) :).
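
A minimal sketch of such a check, in case it helps. Everything named demo_* or Demo* below is hypothetical and only stands in for LowCore and cpu_physical_memory_map(); the 64-byte RAM array and the 32-byte request are made-up sizes. The point is that the mapper may hand back a shorter length than requested, so the caller compares it against the structure size before writing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the guest lowcore; the real LowCore is larger. */
typedef struct DemoLowCore {
    uint16_t subchannel_id;
    uint16_t subchannel_nr;
    uint32_t io_int_parm;
    uint32_t io_int_word;
} DemoLowCore;

/* Hypothetical guest memory, deliberately tiny so short mappings are easy to hit. */
static uint8_t demo_guest_ram[64];

/*
 * Hypothetical mapper: like a real mapping call, it may map fewer bytes
 * than requested and reports the mapped length back through *len.
 */
static void *demo_map(size_t addr, size_t *len)
{
    assert(len != NULL);
    if (addr >= sizeof(demo_guest_ram)) {
        *len = 0;
        return NULL;
    }
    size_t avail = sizeof(demo_guest_ram) - addr;
    if (*len > avail) {
        *len = avail;
    }
    return &demo_guest_ram[addr];
}

/* Returns 0 on success, -1 if the mapping is too short for the structure. */
static int demo_store_io_code(size_t psa, uint16_t id, uint16_t nr)
{
    size_t len = 32; /* requested length, an arbitrary demo value */
    void *p = demo_map(psa, &len);

    /* The check asked for above: refuse to write past the mapped window. */
    if (p == NULL || len < sizeof(DemoLowCore)) {
        return -1;
    }

    DemoLowCore lc = { .subchannel_id = id, .subchannel_nr = nr };
    memcpy(p, &lc, sizeof(lc)); /* memcpy avoids unaligned struct access */
    return 0;
}
```

With these demo sizes, demo_store_io_code(52, 1, 2) still fits (12 bytes remain) while demo_store_io_code(53, 1, 2) is rejected.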

> +
> +lowcore->subchannel_id = cpu_to

Re: [Qemu-devel] [PATCH 2/8] s390: Channel I/O basic definitions.

2012-12-10 Thread Cornelia Huck
On Mon, 10 Dec 2012 09:07:57 +0100
Alexander Graf  wrote:

> 
> On 07.12.2012, at 13:50, Cornelia Huck wrote:
> 
> > Basic channel I/O structures and helper function.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > target-s390x/Makefile.objs |   2 +-
> > target-s390x/ioinst.c  |  46 ++
> > target-s390x/ioinst.h  | 207 
> > +
> > 3 files changed, 254 insertions(+), 1 deletion(-)
> > create mode 100644 target-s390x/ioinst.c
> > create mode 100644 target-s390x/ioinst.h
> > 
> > diff --git a/target-s390x/Makefile.objs b/target-s390x/Makefile.objs
> > index e728abf..3afb0b7 100644
> > --- a/target-s390x/Makefile.objs
> > +++ b/target-s390x/Makefile.objs
> > @@ -1,4 +1,4 @@
> > obj-y += translate.o helper.o cpu.o interrupt.o
> > obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
> > -obj-$(CONFIG_SOFTMMU) += machine.o
> > +obj-$(CONFIG_SOFTMMU) += machine.o ioinst.o
> > obj-$(CONFIG_KVM) += kvm.o
> > diff --git a/target-s390x/ioinst.c b/target-s390x/ioinst.c
> > new file mode 100644
> > index 000..8577b2c
> > --- /dev/null
> > +++ b/target-s390x/ioinst.c
> > @@ -0,0 +1,46 @@
> > +/*
> > + * I/O instructions for S/390
> > + *
> > + * Copyright 2012 IBM Corp.
> > + * Author(s): Cornelia Huck 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> > + * your option) any later version. See the COPYING file in the top-level
> > + * directory.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "cpu.h"
> > +#include "ioinst.h"
> > +
> > +#ifdef DEBUG_IOINST
> > +#define dprintf(fmt, ...) \
> > +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> > +#else
> > +#define dprintf(fmt, ...) \
> > +do { } while (0)
> > +#endif
> > +
> > +int ioinst_disassemble_sch_ident(uint32_t value, int *m, int *cssid, int 
> > *ssid,
> > + int *schid)
> > +{
> > +if (!(value & IOINST_SCHID_ONE)) {
> > +return -EINVAL;
> > +}
> > +if (!(value & IOINST_SCHID_M)) {
> > +if (value & IOINST_SCHID_CSSID) {
> > +return -EINVAL;
> > +}
> > +*cssid = 0;
> > +*m = 0;
> > +} else {
> > +*cssid = (value & IOINST_SCHID_CSSID) >> 24;
> 
> (value & IOINST_SCHID_CSSID_MASK) >> IOINST_SCHID_CSSID_SHIFT

I think that actually decreases readability.
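
For comparison, here is a small sketch of the mask-plus-shift style under discussion. The DEMO_* mask values are assumptions (the thread does not show the ioinst.h definitions); only the shift amounts 24 and 17 come from the patch. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field layout; only the shifts (24, 17) are taken from the patch. */
#define DEMO_SCHID_CSSID_MASK  0xff000000u
#define DEMO_SCHID_CSSID_SHIFT 24
#define DEMO_SCHID_SSID_MASK   0x00060000u
#define DEMO_SCHID_SSID_SHIFT  17
#define DEMO_SCHID_NR_MASK     0x0000ffffu

typedef struct DemoSchid {
    int cssid;
    int ssid;
    int schid;
} DemoSchid;

/* Extract each field with a named mask and a named shift. */
static void demo_decode_schid(uint32_t value, DemoSchid *out)
{
    assert(out != NULL);
    out->cssid = (int)((value & DEMO_SCHID_CSSID_MASK) >> DEMO_SCHID_CSSID_SHIFT);
    out->ssid = (int)((value & DEMO_SCHID_SSID_MASK) >> DEMO_SCHID_SSID_SHIFT);
    out->schid = (int)(value & DEMO_SCHID_NR_MASK);
}
```

Whether the named shift reads better than a literal next to the mask is exactly the judgment call here; the sketch only shows what the alternative would look like.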

> 
> 
> Alex
> 
> > +*m = 1;
> > +}
> > +*ssid = (value & IOINST_SCHID_SSID) >> 17;
> > +*schid = value & IOINST_SCHID_NR;
> > +return 0;
> > +}
> > diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h
> > new file mode 100644
> > index 000..b5b4a03
> > --- /dev/null
> > +++ b/target-s390x/ioinst.h
> > @@ -0,0 +1,207 @@
> > +/*
> > + * S/390 channel I/O instructions
> > + *
> > + * Copyright 2012 IBM Corp.
> > + * Author(s): Cornelia Huck 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or (at
> > + * your option) any later version. See the COPYING file in the top-level
> > + * directory.
> > +*/
> > +
> > +#ifndef IOINST_S390X_H
> > +#define IOINST_S390X_H
> > +/*
> > + * Channel I/O related definitions, as defined in the Principles
> > + * Of Operation (and taken from the Linux implementation).
> > + */
> > +
> > +/* subchannel status word (command mode only) */
> > +typedef struct SCSW {
> > +uint16_t flags;
> > +uint16_t ctrl;
> > +uint32_t cpa;
> > +uint8_t dstat;
> > +uint8_t cstat;
> > +uint16_t count;
> > +} QEMU_PACKED SCSW;
> > +
> > +#define SCSW_FLAGS_MASK_KEY 0xf000
> > +#define SCSW_FLAGS_MASK_SCTL 0x0800
> > +#define SCSW_FLAGS_MASK_ESWF 0x0400
> > +#define SCSW_FLAGS_MASK_CC 0x0300
> > +#define SCSW_FLAGS_MASK_FMT 0x0080
> > +#define SCSW_FLAGS_MASK_PFCH 0x0040
> > +#define SCSW_FLAGS_MASK_ISIC 0x0020
> > +#define SCSW_FLAGS_MASK_ALCC 0x0010
> > +#define SCSW_FLAGS_MASK_SSI 0x0008
> > +#define SCSW_FLAGS_MASK_ZCC 0x0004
> > +#define SCSW_FLAGS_MASK_ECTL 0x0002
> > +#define SCSW_FLAGS_MASK_PNO 0x0001
> > +
> > +#define SCSW_CTRL_MASK_FCTL 0x7000
> > +#define SCSW_CTRL_MASK_ACTL 0x0fe0
> > +#define SCSW_CTRL_MASK_STCTL 0x001f
> > +
> > +#define SCSW_FCTL_CLEAR_FUNC 0x1000
> > +#define SCSW_FCTL_HALT_FUNC 0x2000
> > +#define SCSW_FCTL_START_FUNC 0x4000
> > +
> > +#define SCSW_ACTL_SUSP 0x0020
> > +#define SCSW_ACTL_DEVICE_ACTIVE 0x0040
> > +#define SCSW_ACTL_SUBCH_ACTIVE 0x0080
> > +#define SCSW_ACTL_CLEAR_PEND 0x0100
> > +#define SCSW_ACTL_HALT_PEND  0x0200
> > +#define SCSW_ACTL_START_PEND 0x0400
> > +#define SCSW_ACTL_RESUME_PEND 0x0800
> > +
> > +#define SCSW_STCTL_STATUS_PEND 0x0001
> > +#define SCSW_STCTL_SECONDARY 0x0002
> > +#define SCSW_STCTL_PRIMARY 0x0004
> > +#define SCSW_STCTL_INTERMEDIATE 0x0008
> > +#define SCSW_STCTL_ALERT 0x0010
> > +
> > +#define SCSW_DSTAT_ATTENTION 0x80
> > +#define SCSW_DSTAT_STAT_MOD  0x40
> > +#define SCSW_DSTAT_CU_END0x20
> > +#define SCSW_DSTAT_BUSY  0x10
> > +#define SCSW_DST

Re: [Qemu-devel] [PATCH 0/2] tests: avoid aio_flush() in test cases

2012-12-10 Thread Kevin Wolf
On 04.12.2012 17:59, Paolo Bonzini wrote:
> On 04/12/2012 16:12, Stefan Hajnoczi wrote:
>> There is a patch to drop aio_flush().  Most callers shouldn't use that
>> interface.  It turns out that the aio and thread pool test cases *do* need
>> low-level flush functionality so they can test the aio code.
>>
>> Convert test-aio.c and test-thread-pool.c to use replacements for
>> qemu_aio_flush() and aio_flush().
>>
>> Stefan Hajnoczi (2):
>>   tests: use aio_poll() instead of aio_flush() in test-aio.c
>>   tests: avoid qemu_aio_flush() in test-thread-pool.c
>>
>>  tests/test-aio.c | 31 +++
>>  tests/test-thread-pool.c | 20 ++--
>>  2 files changed, 29 insertions(+), 22 deletions(-)
>>
> 
> Looks good.

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH 3/8] s390: I/O interrupt and machine check injection.

2012-12-10 Thread Cornelia Huck
On Mon, 10 Dec 2012 09:20:57 +0100
Alexander Graf  wrote:

> 
> On 07.12.2012, at 13:50, Cornelia Huck wrote:
> 
> > I/O interrupts are queued per isc. Only crw pending machine checks
> > are supported.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > target-s390x/cpu.h|  67 +++
> > target-s390x/helper.c | 145 
> > ++
> > 2 files changed, 212 insertions(+)
> > 
> > diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> > index 0f9a1f7..73bfc20 100644
> > --- a/target-s390x/cpu.h
> > +++ b/target-s390x/cpu.h
> > @@ -47,6 +47,11 @@
> > #define MMU_USER_IDX 1
> > 
> > #define MAX_EXT_QUEUE 16
> > +#define MAX_IO_QUEUE 16
> > +#define MAX_MCHK_QUEUE 16
> > +
> > +#define PSW_MCHK_MASK 0x0004
> > +#define PSW_IO_MASK 0x0200
> > 
> > typedef struct PSW {
> > uint64_t mask;
> > @@ -59,6 +64,17 @@ typedef struct ExtQueue {
> > uint32_t param64;
> > } ExtQueue;
> > 
> > +typedef struct IOQueue {
> > +uint16_t id;
> > +uint16_t nr;
> > +uint32_t parm;
> > +uint32_t word;
> > +} IOQueue;
> > +
> > +typedef struct MchkQueue {
> > +uint16_t type;
> > +} MchkQueue;
> > +
> > typedef struct CPUS390XState {
> > uint64_t regs[16];  /* GP registers */
> > 
> > @@ -88,8 +104,16 @@ typedef struct CPUS390XState {
> > 
> > int pending_int;
> > ExtQueue ext_queue[MAX_EXT_QUEUE];
> > +IOQueue io_queue[MAX_IO_QUEUE][8];
> > +MchkQueue mchk_queue[MAX_MCHK_QUEUE];
> > 
> > int ext_index;
> > +int io_index[8];
> > +int mchk_index;
> > +
> > +uint64_t ckc;
> > +uint64_t cputm;
> > +uint32_t todpr;
> > 
> > CPU_COMMON
> > 
> > @@ -364,12 +388,16 @@ static inline void cpu_set_tls(CPUS390XState *env, 
> > target_ulong newtls)
> > #define EXCP_EXT 1 /* external interrupt */
> > #define EXCP_SVC 2 /* supervisor call (syscall) */
> > #define EXCP_PGM 3 /* program interruption */
> > +#define EXCP_IO  7 /* I/O interrupt */
> > +#define EXCP_MCHK 8 /* machine check */
> > 
> > #endif /* CONFIG_USER_ONLY */
> > 
> > #define INTERRUPT_EXT(1 << 0)
> > #define INTERRUPT_TOD(1 << 1)
> > #define INTERRUPT_CPUTIMER   (1 << 2)
> > +#define INTERRUPT_IO (1 << 3)
> > +#define INTERRUPT_MCHK   (1 << 4)
> > 
> > /* Program Status Word.  */
> > #define S390_PSWM_REGNUM 0
> > @@ -977,6 +1005,45 @@ static inline void cpu_inject_ext(CPUS390XState *env, 
> > uint32_t code, uint32_t pa
> > cpu_interrupt(env, CPU_INTERRUPT_HARD);
> > }
> > 
> > +static inline void cpu_inject_io(CPUS390XState *env, uint16_t 
> > subchannel_id,
> > + uint16_t subchannel_number,
> > + uint32_t io_int_parm, uint32_t 
> > io_int_word)
> > +{
> > +int isc = ffs(io_int_word << 2) - 1;
> > +
> > +if (env->io_index[isc] == MAX_IO_QUEUE - 1) {
> > +/* ugh - can't queue anymore. Let's drop. */
> > +return;
> > +}
> > +
> > +env->io_index[isc]++;
> > +assert(env->io_index[isc] < MAX_IO_QUEUE);
> > +
> > +env->io_queue[env->io_index[isc]][isc].id = subchannel_id;
> > +env->io_queue[env->io_index[isc]][isc].nr = subchannel_number;
> > +env->io_queue[env->io_index[isc]][isc].parm = io_int_parm;
> > +env->io_queue[env->io_index[isc]][isc].word = io_int_word;
> > +
> > +env->pending_int |= INTERRUPT_IO;
> > +cpu_interrupt(env, CPU_INTERRUPT_HARD);
> > +}
> > +
> > +static inline void cpu_inject_crw_mchk(CPUS390XState *env)
> > +{
> > +if (env->mchk_index == MAX_MCHK_QUEUE - 1) {
> > +/* ugh - can't queue anymore. Let's drop. */
> > +return;
> > +}
> > +
> > +env->mchk_index++;
> > +assert(env->mchk_index < MAX_MCHK_QUEUE);
> > +
> > +env->mchk_queue[env->mchk_index].type = 1;
> > +
> > +env->pending_int |= INTERRUPT_MCHK;
> > +cpu_interrupt(env, CPU_INTERRUPT_HARD);
> > +}
> > +
> > static inline bool cpu_has_work(CPUState *cpu)
> > {
> > CPUS390XState *env = &S390_CPU(cpu)->env;
> > diff --git a/target-s390x/helper.c b/target-s390x/helper.c
> > index b7b812a..4ff148d 100644
> > --- a/target-s390x/helper.c
> > +++ b/target-s390x/helper.c
> > @@ -574,12 +574,144 @@ static void do_ext_interrupt(CPUS390XState *env)
> > load_psw(env, mask, addr);
> > }
> > 
> > +static void do_io_interrupt(CPUS390XState *env)
> > +{
> > +uint64_t mask, addr;
> > +LowCore *lowcore;
> > +hwaddr len = TARGET_PAGE_SIZE;
> > +IOQueue *q;
> > +uint8_t isc;
> > +int disable = 1;
> > +int found = 0;
> > +
> > +if (!(env->psw.mask & PSW_MASK_IO)) {
> > +cpu_abort(env, "I/O int w/o I/O mask\n");
> > +}
> > +
> > +for (isc = 0; isc < 8; isc++) {
> > +if (env->io_index[isc] < 0) {
> > +continue;
> > +}
> > +if (env->io_index[isc] > MAX_IO_QUEUE) {
> > +cpu_abort(env, "I/O queue overrun for isc %d: %d\n",
> > +  isc, env->io_index[isc]

Re: [Qemu-devel] [PATCH 6/8] s390: Wire up channel I/O in kvm.

2012-12-10 Thread Cornelia Huck
On Mon, 10 Dec 2012 10:40:15 +0100
Alexander Graf  wrote:

> 
> 
> On 07.12.2012, at 13:50, Cornelia Huck  wrote:
> 
> > Trigger the code for our virtual css in case of instruction
> > intercepts for I/O instructions.
> > 
> > Handle the tsch exit for the subchannel-related part of tsch.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > target-s390x/cpu.h |  11 +++
> > target-s390x/kvm.c | 246 
> > ++---
> > 2 files changed, 244 insertions(+), 13 deletions(-)
> > 
> > diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> > index eb24c6b..1d21ca8 100644
> > --- a/target-s390x/cpu.h
> > +++ b/target-s390x/cpu.h
> > @@ -1178,6 +1178,13 @@ uint32_t set_cc_nz_f64(float64 v);
> > /* misc_helper.c */
> > void program_interrupt(CPUS390XState *env, uint32_t code, int ilc);
> > 
> > +#ifdef CONFIG_KVM
> > +int kvm_s390_io_interrupt(CPUS390XState *env, uint16_t subchannel_id,
> > +  uint16_t subchannel_nr, uint32_t io_int_parm,
> > +  uint32_t io_int_word);
> > +int kvm_s390_crw_mchk(CPUS390XState *env);
> > +void kvm_s390_enable_css_support(CPUS390XState *env);
> > +#else
> > static inline int kvm_s390_io_interrupt(CPUS390XState *env,
> > uint16_t subchannel_id,
> > uint16_t subchannel_nr,
> > @@ -1190,6 +1197,10 @@ static inline int kvm_s390_crw_mchk(CPUS390XState 
> > *env)
> > {
> > return -EOPNOTSUPP;
> > }
> > +static inline void kvm_s390_enable_css_support(CPUS390XState *env)
> > +{
> > +}
> > +#endif
> > 
> > static inline void s390_io_interrupt(CPUS390XState *env,
> >  uint16_t subchannel_id,
> > diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
> > index 94de764..c9a4a7d 100644
> > --- a/target-s390x/kvm.c
> > +++ b/target-s390x/kvm.c
> > @@ -30,6 +30,7 @@
> > #include "kvm.h"
> > #include "cpu.h"
> > #include "device_tree.h"
> > +#include "ioinst.h"
> > 
> > /* #define DEBUG_KVM */
> > 
> > @@ -43,9 +44,27 @@
> > 
> > #define IPA0_DIAG   0x8300
> > #define IPA0_SIGP   0xae00
> > -#define IPA0_PRIV   0xb200
> > +#define IPA0_B2 0xb200
> > +#define IPA0_B9 0xb900
> > +#define IPA0_EB 0xeb00
> > 
> > #define PRIV_SCLP_CALL  0x20
> > +#define PRIV_CSCH   0x30
> > +#define PRIV_HSCH   0x31
> > +#define PRIV_MSCH   0x32
> > +#define PRIV_SSCH   0x33
> > +#define PRIV_STSCH  0x34
> > +#define PRIV_TSCH   0x35
> > +#define PRIV_TPI0x36
> > +#define PRIV_SAL0x37
> > +#define PRIV_RSCH   0x38
> > +#define PRIV_STCRW  0x39
> > +#define PRIV_STCPS  0x3a
> > +#define PRIV_RCHP   0x3b
> > +#define PRIV_SCHM   0x3c
> > +#define PRIV_CHSC   0x5f
> > +#define PRIV_SIGA   0x74
> > +#define PRIV_XSCH   0x76
> > #define DIAG_KVM_HYPERCALL  0x500
> > #define DIAG_KVM_BREAKPOINT 0x501
> > 
> > @@ -350,10 +369,120 @@ static int kvm_sclp_service_call(CPUS390XState *env, 
> > struct kvm_run *run,
> > return 0;
> > }
> > 
> > -static int handle_priv(CPUS390XState *env, struct kvm_run *run, uint8_t 
> > ipa1)
> > +static int kvm_handle_css_inst(CPUS390XState *env, struct kvm_run *run,
> > +   uint8_t ipa0, uint8_t ipa1, uint8_t ipb)
> > +{
> > +int r = 0;
> > +int no_cc = 0;
> > +
> > +if (ipa0 != 0xb2) {
> > +/* Not handled for now. */
> > +return -1;
> > +}
> > +cpu_synchronize_state(env);
> > +switch (ipa1) {
> > +case PRIV_XSCH:
> > +r = ioinst_handle_xsch(env, env->regs[1]);
> > +break;
> > +case PRIV_CSCH:
> > +r = ioinst_handle_csch(env, env->regs[1]);
> > +break;
> > +case PRIV_HSCH:
> > +r = ioinst_handle_hsch(env, env->regs[1]);
> > +break;
> > +case PRIV_MSCH:
> > +r = ioinst_handle_msch(env, env->regs[1], run->s390_sieic.ipb);
> > +break;
> > +case PRIV_SSCH:
> > +r = ioinst_handle_ssch(env, env->regs[1], run->s390_sieic.ipb);
> > +break;
> > +case PRIV_STCRW:
> > +r = ioinst_handle_stcrw(env, run->s390_sieic.ipb);
> > +break;
> > +case PRIV_STSCH:
> > +r = ioinst_handle_stsch(env, env->regs[1], run->s390_sieic.ipb);
> > +break;
> > +case PRIV_TSCH:
> > +/* We should only get tsch via KVM_EXIT_S390_TSCH. */
> > +fprintf(stderr, "Spurious tsch intercept\n");
> > +break;
> > +case PRIV_CHSC:
> > +r = ioinst_handle_chsc(env, run->s390_sieic.ipb);
> > +break;
> >

Re: [Qemu-devel] [RFC PATCH v4 0/8] s390: channel I/O support in qemu.

2012-12-10 Thread Cornelia Huck
On Mon, 10 Dec 2012 09:02:51 +0100
Alexander Graf  wrote:

> 
> On 07.12.2012, at 13:50, Cornelia Huck wrote:
> 
> > Hi,
> > 
> > just a quick dump of my qemu patch series for channel I/O.
> > 
> > I've managed to chop the virtual css patch into some smaller
> > chunks (patches 2-6), which are hopefully easier to review.
> > 
> > The virtio-ccw patch is still based upon the current virtio
> > infrastructure; I'll try to rebase it upon the virtio refactoring
> > once that has most of the infrastructure in place.
> 
> I wouldn't want to hold this feature back only for some virtio refactoring. 
> Let's try to get it into shape for now and worry about the refactor work when 
> that's closer to merging.

The virtio-ccw stuff is already in a state where I welcome feedback -
if you keep in mind that the modelling will look a bit different after
the virtio refactoring. (And the other patches could be merged when
ready without waiting for the refactoring.)
> 
> 
> Alex
> 
> > 
> > Cornelia Huck (8):
> >  Update linux headers.
> > >  s390: Channel I/O basic definitions.
> >  s390: I/O interrupt and machine check injection.
> >  s390: Add channel I/O instructions.
> >  s390: Virtual channel subsystem support.
> >  s390: Wire up channel I/O in kvm.
> >  s390-virtio: Factor out some initialization code.
> >  s390: Add new channel I/O based virtio transport.
> > 
> > hw/s390-virtio.c |  281 +---
> > hw/s390x/Makefile.objs   |2 +
> > hw/s390x/css.c   | 1195 
> > ++
> > hw/s390x/css.h   |   92 +++
> > hw/s390x/virtio-ccw.c|  909 ++
> > hw/s390x/virtio-ccw.h|   81 +++
> > linux-headers/asm-generic/kvm_para.h |4 +
> > linux-headers/asm-powerpc/kvm.h  |   59 ++
> > linux-headers/asm-powerpc/kvm_para.h |7 +-
> > linux-headers/linux/kvm.h|   36 +-
> > target-s390x/Makefile.objs   |2 +-
> > target-s390x/cpu.h   |  230 +++
> > target-s390x/helper.c|  145 +
> > target-s390x/ioinst.c|  726 +
> > target-s390x/ioinst.h|  223 +++
> > target-s390x/kvm.c   |  246 ++-
> > trace-events |   18 +
> > 17 files changed, 4161 insertions(+), 95 deletions(-)
> > create mode 100644 hw/s390x/css.c
> > create mode 100644 hw/s390x/css.h
> > create mode 100644 hw/s390x/virtio-ccw.c
> > create mode 100644 hw/s390x/virtio-ccw.h
> > create mode 100644 linux-headers/asm-generic/kvm_para.h
> > create mode 100644 target-s390x/ioinst.c
> > create mode 100644 target-s390x/ioinst.h
> > 
> > -- 
> > 1.7.12.4
> > 
> 




[Qemu-devel] [PATCH] qxl+vnc: register a vm state change handler for dummy spice_server

2012-12-10 Thread Uri Lublin
When qxl + vnc are used, a dummy spice_server is initialized.
The spice_server has to be told when the VM runstate changes,
which is what this patch does.

Without it, from qxl_send_events(), the following error message is shown:
  qxl_send_events: spice-server bug: guest stopped, ignoring
---
 ui/spice-core.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index 261c6f2..6a6ab12 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -732,6 +732,7 @@ int qemu_spice_add_interface(SpiceBaseInstance *sin)
  */
 spice_server = spice_server_new();
 spice_server_init(spice_server, &core_interface);
+qemu_add_vm_change_state_handler(vm_change_state_handler, &spice_server);
 }
 
 return spice_server_add_interface(spice_server, sin);
-- 
1.7.1




[Qemu-devel] [PATCH] qxl: save qemu_create_displaysurface_from result

2012-12-10 Thread Gerd Hoffmann
https://bugzilla.redhat.com/show_bug.cgi?id=885644

Reported-by: Markus Armbruster 
Signed-off-by: Gerd Hoffmann 
---
 hw/qxl-render.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/qxl-render.c b/hw/qxl-render.c
index 98ecb21..88e63f8 100644
--- a/hw/qxl-render.c
+++ b/hw/qxl-render.c
@@ -113,11 +113,12 @@ static void qxl_render_update_area_unlocked(PCIQXLDevice 
*qxl)
qxl->guest_primary.bits_pp);
 if (qxl->guest_primary.qxl_stride > 0) {
 qemu_free_displaysurface(vga->ds);
-qemu_create_displaysurface_from(qxl->guest_primary.surface.width,
-qxl->guest_primary.surface.height,
-qxl->guest_primary.bits_pp,
-qxl->guest_primary.abs_stride,
-qxl->guest_primary.data);
+vga->ds->surface = qemu_create_displaysurface_from
+(qxl->guest_primary.surface.width,
+ qxl->guest_primary.surface.height,
+ qxl->guest_primary.bits_pp,
+ qxl->guest_primary.abs_stride,
+ qxl->guest_primary.data);
 } else {
 qemu_resize_displaysurface(vga->ds,
 qxl->guest_primary.surface.width,
-- 
1.7.1




Re: [Qemu-devel] [RFC 3/3] docs: document virtio-balloon stats

2012-12-10 Thread Luiz Capitulino
On Sat, 8 Dec 2012 06:26:05 +
Dietmar Maurer  wrote:

> >  2. wait for the (kernel-based) auto-ballooning feature, which is in the
> > works
> 
> Do you have more information on that? Any links?

No, no links. What's being worked on right now is a notification mechanism for
the kernel to inform user-space about memory pressure:

 http://marc.info/?l=linux-mm&m=135513372205134&w=2

Then we can add that to qemu for auto-inflate and change the virtio-balloon
driver to do auto-deflate.



[Qemu-devel] [PATCH] Fix error code checking for SetFilePointer() call

2012-12-10 Thread Fabien Chouteau
An error has occurred if the return value is INVALID_SET_FILE_POINTER
and GetLastError() doesn't return NO_ERROR.

Signed-off-by: Fabien Chouteau 
---
 block/raw-win32.c |   17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/raw-win32.c b/block/raw-win32.c
index 0c05c58..ce207a3 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -303,13 +303,24 @@ static int raw_truncate(BlockDriverState *bs, int64_t 
offset)
 {
 BDRVRawState *s = bs->opaque;
 LONG low, high;
+DWORD dwPtrLow;
 
 low = offset;
 high = offset >> 32;
-if (!SetFilePointer(s->hfile, low, &high, FILE_BEGIN))
-   return -EIO;
-if (!SetEndOfFile(s->hfile))
+
+/*
+ * An error has occurred if the return value is INVALID_SET_FILE_POINTER
+ * and GetLastError doesn't return NO_ERROR.
+ */
+dwPtrLow = SetFilePointer(s->hfile, low, &high, FILE_BEGIN);
+if (dwPtrLow == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
+fprintf(stderr, "SetFilePointer error: %d\n", GetLastError());
+return -EIO;
+}
+if (SetEndOfFile(s->hfile) == 0) {
+fprintf(stderr, "SetEndOfFile error: %d\n", GetLastError());
 return -EIO;
+}
 return 0;
 }
 
-- 
1.7.9.5
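
The difference between the old and the new check in the patch above can be sketched portably. This is an illustration only: DemoSeekResult and the demo_* functions are hypothetical and merely model the values that SetFilePointer() and GetLastError() would return; no Windows API is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID_SET_FILE_POINTER 0xFFFFFFFFu /* models (DWORD)-1 */
#define DEMO_NO_ERROR 0u

/* Hypothetical record of one seek attempt. */
typedef struct DemoSeekResult {
    uint32_t low;        /* low dword that SetFilePointer() would return */
    uint32_t last_error; /* value that GetLastError() would return */
} DemoSeekResult;

/* Old check: `if (!SetFilePointer(...))`, i.e. a return of 0 means failure. */
static int demo_old_check_failed(const DemoSeekResult *r)
{
    assert(r != NULL);
    return r->low == 0;
}

/*
 * New check: 0xFFFFFFFF can be a valid low dword when a high part is
 * passed, so it only counts as failure if the last error is set as well.
 */
static int demo_new_check_failed(const DemoSeekResult *r)
{
    assert(r != NULL);
    return r->low == DEMO_INVALID_SET_FILE_POINTER &&
           r->last_error != DEMO_NO_ERROR;
}
```

A successful seek to offset 0 returns 0, which the old check misreports as an error, while a genuine failure returns 0xFFFFFFFF, which the old check lets through.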




Re: [Qemu-devel] [RFC 3/3] docs: document virtio-balloon stats

2012-12-10 Thread Dietmar Maurer
> > >  2. wait for the (kernel-based) auto-ballooning feature, which is in the
> > > works
> >
> > Do you have more information on that? Any links?
> 
> No, no links. What's being worked on right now is a notification mechanism for
> the kernel to inform user-space about memory pressure:
> 
>  http://marc.info/?l=linux-mm&m=135513372205134&w=2
> 
> Then we can add that to qemu for auto-inflate and change the virtio-balloon
> driver to do auto-deflate.

Many thanks for that information.




Re: [Qemu-devel] [PATCH] qemu-img: find the highest offset in use during check

2012-12-10 Thread Kevin Wolf
On 10.12.2012 10:13, Federico Simoncelli wrote:
> This patch adds the support for reporting the highest offset in use by
> an image. This is particularly useful after a conversion (or a rebase)
> where the destination is a block device in order to find the actual
> amount of space in use.
> 
> Signed-off-by: Federico Simoncelli 
> ---
>  block.h|1 +
>  block/qcow2-refcount.c |   10 --
>  qemu-img.c |4 
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/block.h b/block.h
> index 722c620..de42e8c 100644
> --- a/block.h
> +++ b/block.h
> @@ -213,6 +213,7 @@ typedef struct BdrvCheckResult {
>  int check_errors;
>  int corruptions_fixed;
>  int leaks_fixed;
> +int64_t highest_offset;
>  BlockFragInfo bfi;
>  } BdrvCheckResult;
>  
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 96224d1..017439d 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1116,7 +1116,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
> BdrvCheckResult *res,
>BdrvCheckMode fix)
>  {
>  BDRVQcowState *s = bs->opaque;
> -int64_t size, i;
> +int64_t size, i, highest_cluster;
>  int nb_clusters, refcount1, refcount2;
>  QCowSnapshot *sn;
>  uint16_t *refcount_table;
> @@ -1154,7 +1154,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  s->refcount_table_offset,
>  s->refcount_table_size * sizeof(uint64_t));
>  
> -for(i = 0; i < s->refcount_table_size; i++) {
> +for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
>  uint64_t offset, cluster;
>  offset = s->refcount_table[i];
>  cluster = offset >> s->cluster_bits;
> @@ -1197,6 +1197,11 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  }
>  
>  refcount2 = refcount_table[i];
> +
> +if (refcount1 > 0 || refcount2 > 0) {
> +highest_cluster = i;
> +}
> +
>  if (refcount1 != refcount2) {
>  
>  /* Check if we're allowed to fix the mismatch */
> @@ -1231,6 +1236,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  }
>  }
>  
> +res->highest_offset = (highest_cluster + 1) * s->cluster_size;
>  ret = 0;
>  
>  fail:
> diff --git a/qemu-img.c b/qemu-img.c
> index e29e01b..3a8090b 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -470,6 +470,10 @@ static int img_check(int argc, char **argv)
>  result.bfi.fragmented_clusters * 100.0 / 
> result.bfi.allocated_clusters);
>  }
>  
> +if (result.highest_offset > 0) {
> +printf("Highest offset in use: %lu\n", result.highest_offset);

highest_offset is not an unsigned long, but an int64_t. Please use PRId64
instead of %lu.
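
Something along these lines, as a standalone sketch. The helper name demo_format_highest_offset is made up for illustration; in qemu-img.c it would simply be the printf() call with the changed format string.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the message into buf. PRId64 expands to the right conversion
 * specifier for int64_t on every host, unlike a hard-coded %lu.
 */
static int demo_format_highest_offset(char *buf, size_t size,
                                      int64_t highest_offset)
{
    assert(buf != NULL);
    return snprintf(buf, size, "Highest offset in use: %" PRId64 "\n",
                    highest_offset);
}
```

This matters on hosts where long is 32 bits wide (32-bit Linux, and Windows in general), where %lu would read the wrong amount of data for an int64_t argument.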

I think we also need to change qemu-iotests so that it filters out these
lines, or many test cases would fail now. Or maybe we should think about
introducing a -v switch that enables the messages about fragmentation
and the highest offset.

Kevin



Re: [Qemu-devel] [PATCH] NVMe: Initial commit to add an NVM Express device

2012-12-10 Thread Kevin Wolf
Hi Keith,

On 08.12.2012 20:20, Keith Busch wrote:
>> IIUC from the website above, NVMe is to be used with SSDs?  It would be
>> good to add to the commit message how to actually use the device
>> command-line-wise beyond the obvious -device nvme: I did not spot on
>> brief sight where you expose a bus to add drives (nor a special IF_*
>> interface type to assign to a drive), so others might wonder as well.
> 
> Actually the nvme device _is_ the SSD. The emulated controller here
> creates files to use for its backing storage so you don't add
> additional drives, if that makes sense.

I think the device would be much more useful if you could make it use
the qemu block layer instead of implementing your own functions for only
raw images and only with a given magic file name.

Kevin



Re: [Qemu-devel] [PATCH qom-cpu 1/4] target-i386: Inline -cpu host check into cpu_x86_register()

2012-12-10 Thread Eduardo Habkost
On Sun, Dec 09, 2012 at 08:45:50PM +0100, Andreas Färber wrote:
> Simplifies the upcoming cleanup of cpu_x86_find_by_name().

...by making cpu_x86_register() more complicated, and having CPU model
name lookup spread across different parts of the code.

The CPU model lookup is a bit complex because of the "host" exception,
but at least the complexity was hidden inside cpu_x86_find_by_name()
(making it very easy to replace that logic by a CPU subclass lookup,
later).

> 
> Signed-off-by: Andreas Färber 
> ---
>  target-i386/cpu.c |   12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 7be3ad8..a46faa2 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -1217,9 +1217,7 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
> const char *name)
>  break;
>  }
>  }
> -if (kvm_enabled() && name && strcmp(name, "host") == 0) {
> -kvm_cpu_fill_host(x86_cpu_def);
> -} else if (!def) {
> +if (!def) {
>  return -1;
>  } else {
>  memcpy(x86_cpu_def, def, sizeof(*def));
> @@ -1505,8 +1503,12 @@ int cpu_x86_register(X86CPU *cpu, const char 
> *cpu_model)
>  name = model_pieces[0];
>  features = model_pieces[1];
>  
> -if (cpu_x86_find_by_name(def, name) < 0) {
> -goto error;
> +if (kvm_enabled() && strcmp(name, "host") == 0) {
> +kvm_cpu_fill_host(def);
> +} else {
> +if (cpu_x86_find_by_name(def, name) < 0) {
> +goto error;
> +}
>  }
>  
>  if (cpu_x86_parse_featurestr(def, features) < 0) {
> -- 
> 1.7.10.4
> 

-- 
Eduardo



[Qemu-devel] [PATCH 1/2] pixman: require 0.16.4 as minimum version

2012-12-10 Thread Gerd Hoffmann
Lower the bar a bit.  0.16.4 is known-good, and is shipped by Debian.
Fixes build failures on the Debian-based buildbot slaves.

Signed-off-by: Gerd Hoffmann 
---
 configure |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index e5aedef..a4e62c4 100755
--- a/configure
+++ b/configure
@@ -2127,7 +2127,7 @@ fi
 # pixman support probe
 
 if test "$pixman" = ""; then
-  if $pkg_config --atleast-version=0.18.4 pixman-1 > /dev/null 2>&1; then
+  if $pkg_config --atleast-version=0.16.4 pixman-1 > /dev/null 2>&1; then
 pixman="system"
   else
 pixman="internal"
@@ -2138,7 +2138,7 @@ if test "$pixman" = "system"; then
   pixman_libs=`$pkg_config --libs pixman-1 2>/dev/null`
 else
   if test ! -d ${source_path}/pixman/pixman; then
-echo "ERROR: pixman not present (or older than 0.18.4). Your options:"
+echo "ERROR: pixman not present (or older than 0.16.4). Your options:"
 echo "  (1) Prefered: Install the pixman devel package (any recent"
 echo "  distro should have packages as Xorg needs pixman too)."
 echo "  (2) Fetch the pixman submodule, using:"
-- 
1.7.1




[Qemu-devel] [PATCH 2/2] pixman: update internal copy to pixman-0.28.2

2012-12-10 Thread Gerd Hoffmann
Some w64 fixes by Stefan Weil found their way into 0.28.2,
so update the internal copy to that version to improve
Windows support.

Signed-off-by: Gerd Hoffmann 
---
 pixman |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/pixman b/pixman
index 97336fa..a5e5179 16
--- a/pixman
+++ b/pixman
@@ -1 +1 @@
-Subproject commit 97336fad32acf802003855cd8bd6477fa49a12e3
+Subproject commit a5e5179b5624c99c812e9bf6e7b907e355a811e8
-- 
1.7.1




[Qemu-devel] [PATCH 0/2] pixman: two little fixes.

2012-12-10 Thread Gerd Hoffmann
  Hi,

Two little pixman fixes.  The first fixes a bunch of build failures.

please apply,
  Gerd

Gerd Hoffmann (2):
  pixman: require 0.16.4 as minimum version
  pixman: update internal copy to pixman-0.28.2

 configure |4 ++--
 pixman|2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)




Re: [Qemu-devel] [Qemu-ppc] [0/2] pseries: Rework PCI code for handling multiple PHBs

2012-12-10 Thread Michael S. Tsirkin
On Wed, Nov 28, 2012 at 01:50:53PM +1100, David Gibson wrote:
> On Tue, Nov 27, 2012 at 02:36:57PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Nov 27, 2012 at 05:07:31PM +1100, David Gibson wrote:
> > > Hi Michael, Alex,
> > > 
> > > This patch represents a compromise I hope will be acceptable after the
> > > long thread discussing handling of multiple PCI host bridges on the
> > > pseries machine.  Patch 1/2 is just a preliminary enforcing uniqueness
> > > of LIOBNs in the IOMMU code.
> > > 
> > > Patch 2/2 is the meat.  It allows either explicit configuration of all
> > > the properties, or the user can just set an abstract index which will
> > > generate sensible and probably-unique values for all the rest.
> > > 
> > > With these patches I was able to do something like:
> > >   qemu-system-ppc64 -M pseries -m 1024 -nographic \
> > >   fc17-root.qcow2 -net none -device nec-usb-xhci -device \
> > >   spapr-pci-host-bridge,index=1 -device e1000,netdev=fred \
> > >   -netdev user,id=fred
> > > 
> > > I was able to see both the PCI domains in the guest, and use the NIC
> > > on the secondary domain.
> > > 
> > > There are still some gotchas with multiple domains though.  The domain
> > > value in PCIHostBus is still always initialized to 0, and there are
> > > other places in the PCI core where handling of multiple domains is
> > > essentially stubbed out.
> > > 
> > > Michael, any thoughts on what to do about that?  I could fix up the
> > > PCI code so that domain is actually set and used.  But I think the
> > > whole notion of domain numbers is kind of bogus on the qemu side:
> > > since PCI domains are completely independent from each other, it's
> > > only platform convention which determines what the domain numbers are.
> > > On platforms that don't have a strong convention, the guest will
> > > number them itself and we have no way of knowing that.  So it seems to
> > > me that the PCI code should instead of domain numbers just use the
> > > device ID, or the bus name or some qemu side symbolic name.  For
> > > platforms that do have a numbering convention those names can be
> > > derived from the domain numbers, but it also works for platforms that
> > > don't.
> > 
> > I agree with this last statement: using bus numbers and domain
> > numbers is not a good idea. We mostly need to support domain 0
> > to mean "default" for backwards compatibility.
> 
> Ok.  Where are domain numbers actually specified by the qemu user?

pci_parse_devaddr

> -- 
> David Gibson                    | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
>                                 | _way_ _around_!
> http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/2] pseries: Don't allow TCE (iommu) tables to be registered with duplicate LIOBNs

2012-12-10 Thread Michael S. Tsirkin
On Tue, Nov 27, 2012 at 05:07:32PM +1100, David Gibson wrote:
> The PAPR specification requires that every bus or device mediated by the
> IOMMU have a unique Logical IO Bus Number (LIOBN).  This patch adds a check
> to enforce this, which will help catch errors in configuration earlier.
> 
> Signed-off-by: David Gibson 

Acked-by: Michael S. Tsirkin 

> ---
>  hw/spapr_iommu.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/spapr_iommu.c b/hw/spapr_iommu.c
> index 02d78cc..3011b25 100644
> --- a/hw/spapr_iommu.c
> +++ b/hw/spapr_iommu.c
> @@ -120,6 +120,12 @@ DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size)
>  {
>  sPAPRTCETable *tcet;
>  
> +if (spapr_tce_find_by_liobn(liobn)) {
> +fprintf(stderr, "Attempted to create TCE table with duplicate"
> +" LIOBN 0x%x\n", liobn);
> +return NULL;
> +}
> +
>  if (!window_size) {
>  return NULL;
>  }
> -- 
> 1.7.10.4



Re: [Qemu-devel] [PATCH 2/2] pseries: Properly handle allocation of multiple PCI host bridges

2012-12-10 Thread Michael S. Tsirkin
On Tue, Nov 27, 2012 at 05:07:33PM +1100, David Gibson wrote:
> From: Alexey Kardashevskiy 
> 
> Multiple - even many - PCI host bridges (i.e. PCI domains) are very
> common on real PAPR compliant hardware.  For reasons related to the
> PAPR specified IOMMU interfaces, PCI device assignment with VFIO will
> generally require at least two (virtual) PHBs and possibly more
> depending on which devices are assigned.
> 
> At the moment the qemu PAPR PCI code will not deal with this well,
> leaving several crucial parameters of PHBs other than the default one
> uninitialized.  This patch reworks the code to allow this.
> 
> Every PHB needs a unique BUID (Bus Unit Identifier, the id used for
> the PAPR PCI related interfaces) and a unique LIOBN (Logical IO Bus
> Number, the id used for the PAPR IOMMU related interfaces).  In
> addition they need windows in CPU real address space to access PCI
> memory space, PCI IO space and MSIs.  Properties are added to the PCI
> host bridge qdevice to allow configuration of all these.
> 
> To simplify configuration of multiple PHBs for common cases, a
> convenience "index" property is also added.  This can be set instead
> of the low-level properties, and will generate suitable values for the
> other parameters, different for each index value.
> 
> Signed-off-by: David Gibson 

Acked-by: Michael S. Tsirkin 

> ---
>  hw/spapr.c |   13 +---
>  hw/spapr_pci.c |   97 +---
>  hw/spapr_pci.h |   21 
>  3 files changed, 87 insertions(+), 44 deletions(-)
> 
> diff --git a/hw/spapr.c b/hw/spapr.c
> index d23aa9d..1609f73 100644
> --- a/hw/spapr.c
> +++ b/hw/spapr.c
> @@ -76,12 +76,6 @@
>  #define MAX_CPUS256
>  #define XICS_IRQS   1024
>  
> -#define SPAPR_PCI_BUID  0x8002001ULL
> -#define SPAPR_PCI_MEM_WIN_ADDR  (0x100ULL + 0xA000)
> -#define SPAPR_PCI_MEM_WIN_SIZE  0x2000
> -#define SPAPR_PCI_IO_WIN_ADDR   (0x100ULL + 0x8000)
> -#define SPAPR_PCI_MSI_WIN_ADDR  (0x100ULL + 0x9000)
> -
>  #define PHANDLE_XICP0x
>  
>  #define HTAB_SIZE(spapr)(1ULL << ((spapr)->htab_shift))
> @@ -854,12 +848,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>  /* Set up PCI */
>  spapr_pci_rtas_init();
>  
> -spapr_create_phb(spapr, "pci", SPAPR_PCI_BUID,
> - SPAPR_PCI_MEM_WIN_ADDR,
> - SPAPR_PCI_MEM_WIN_SIZE,
> - SPAPR_PCI_IO_WIN_ADDR,
> - SPAPR_PCI_MSI_WIN_ADDR);
> -phb = PCI_HOST_BRIDGE(QLIST_FIRST(&spapr->phbs));
> +phb = spapr_create_phb(spapr, 0);
>  
>  for (i = 0; i < nb_nics; i++) {
>  NICInfo *nd = &nd_table[i];
> diff --git a/hw/spapr_pci.c b/hw/spapr_pci.c
> index 3c5b855..dc2ffaf 100644
> --- a/hw/spapr_pci.c
> +++ b/hw/spapr_pci.c
> @@ -435,7 +435,7 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
>   */
>  sPAPRPHBState *phb = opaque;
>  
> -trace_spapr_pci_lsi_set(phb->busname, irq_num, phb->lsi_table[irq_num].irq);
> +trace_spapr_pci_lsi_set(phb->dtbusname, irq_num, phb->lsi_table[irq_num].irq);
>  qemu_set_irq(spapr_phb_lsi_qirq(phb, irq_num), level);
>  }
>  
> @@ -522,6 +522,58 @@ static int spapr_phb_init(SysBusDevice *s)
>  int i;
>  PCIBus *bus;
>  
> +if (sphb->index != -1) {
> +hwaddr windows_base;
> +
> +if ((sphb->buid != -1) || (sphb->dma_liobn != -1)
> +|| (sphb->mem_win_addr != -1)
> +|| (sphb->io_win_addr != -1)
> +|| (sphb->msi_win_addr != -1)) {
> +fprintf(stderr, "Either \"index\" or other parameters must"
> +" be specified for PAPR PHB, not both\n");
> +return -1;
> +}
> +
> +sphb->buid = SPAPR_PCI_BASE_BUID + sphb->index;
> +sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN + sphb->index;
> +
> +windows_base = SPAPR_PCI_WINDOW_BASE
> ++ sphb->index * SPAPR_PCI_WINDOW_SPACING;
> +sphb->mem_win_addr = windows_base + SPAPR_PCI_MMIO_WIN_OFF;
> +sphb->io_win_addr = windows_base + SPAPR_PCI_IO_WIN_OFF;
> +sphb->msi_win_addr = windows_base + SPAPR_PCI_MSI_WIN_OFF;
> +}
> +
> +if (sphb->buid == -1) {
> +fprintf(stderr, "BUID not specified for PHB\n");
> +return -1;
> +}
> +
> +if (sphb->dma_liobn == -1) {
> +fprintf(stderr, "LIOBN not specified for PHB\n");
> +return -1;
> +}
> +
> +if (sphb->mem_win_addr == -1) {
> +fprintf(stderr, "Memory window address not specified for PHB\n");
> +return -1;
> +}
> +
> +if (sphb->io_win_addr == -1) {
> +fprintf(stderr, "IO window address not specified for PHB\n");
> +return -1;
> +}
> +
> +if (sphb->msi_win_addr == -1) {
> +fprintf(stderr, "MSI window address not specified for PHB\n");
> +return -1;
> +

Re: [Qemu-devel] [PATCH v5 06/11] dataplane: add Linux AIO request queue

2012-12-10 Thread Stefan Hajnoczi
On Fri, Dec 07, 2012 at 03:21:22PM +0100, Kevin Wolf wrote:
> On 05.12.2012 21:47, Stefan Hajnoczi wrote:
> > The IOQueue has a pool of iocb structs and a function to add new
> > read/write requests.  Multiple requests can be added before calling the
> > submit function to actually tell the host kernel to begin I/O.  This
> > allows callers to batch requests and submit them in one go.
> > 
> > The actual I/O is performed using Linux AIO.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  hw/dataplane/Makefile.objs |   2 +-
> >  hw/dataplane/ioq.c | 118 +
> >  hw/dataplane/ioq.h |  57 ++
> >  3 files changed, 176 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/dataplane/ioq.c
> >  create mode 100644 hw/dataplane/ioq.h
> > 
> > diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
> > index e26bd7d..abd408f 100644
> > --- a/hw/dataplane/Makefile.objs
> > +++ b/hw/dataplane/Makefile.objs
> > @@ -1,3 +1,3 @@
> >  ifeq ($(CONFIG_VIRTIO), y)
> > > -common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o
> > > +common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o ioq.o
> >  endif
> > diff --git a/hw/dataplane/ioq.c b/hw/dataplane/ioq.c
> > new file mode 100644
> > index 000..7adeb5d
> > --- /dev/null
> > +++ b/hw/dataplane/ioq.c
> > @@ -0,0 +1,118 @@
> > +/*
> > + * Linux AIO request queue
> > + *
> > + * Copyright 2012 IBM, Corp.
> > + * Copyright 2012 Red Hat, Inc. and/or its affiliates
> > + *
> > + * Authors:
> > + *   Stefan Hajnoczi 
> > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "hw/dataplane/ioq.h"
> > +
> > +void ioq_init(IOQueue *ioq, int fd, unsigned int max_reqs)
> > +{
> > +int rc;
> > +
> > +ioq->fd = fd;
> > +ioq->max_reqs = max_reqs;
> > +
> > +memset(&ioq->io_ctx, 0, sizeof ioq->io_ctx);
> > +rc = io_setup(max_reqs, &ioq->io_ctx);
> > +if (rc != 0) {
> > +fprintf(stderr, "ioq io_setup failed %d\n", rc);
> > +exit(1);
> > +}
> > +
> > +rc = event_notifier_init(&ioq->io_notifier, 0);
> > +if (rc != 0) {
> > +fprintf(stderr, "ioq io event notifier creation failed %d\n", rc);
> > +exit(1);
> > +}
> > +
> > +ioq->freelist = g_malloc0(sizeof ioq->freelist[0] * max_reqs);
> > +ioq->freelist_idx = 0;
> > +
> > +ioq->queue = g_malloc0(sizeof ioq->queue[0] * max_reqs);
> > +ioq->queue_idx = 0;
> > +}
> > +
> > +void ioq_cleanup(IOQueue *ioq)
> > +{
> > +g_free(ioq->freelist);
> > +g_free(ioq->queue);
> > +
> > +event_notifier_cleanup(&ioq->io_notifier);
> > +io_destroy(ioq->io_ctx);
> > +}
> > +
> > +EventNotifier *ioq_get_notifier(IOQueue *ioq)
> > +{
> > +return &ioq->io_notifier;
> > +}
> > +
> > +struct iocb *ioq_get_iocb(IOQueue *ioq)
> > +{
> > +if (unlikely(ioq->freelist_idx == 0)) {
> > +fprintf(stderr, "ioq underflow\n");
> > +exit(1);
> > +}
> 
> Can this happen? If no, it should be an assertion. If yes, the error
> handling code is wrong, we can't just exit qemu. It's already not nice
> to do it in setup functions, but during runtime I think it's not acceptable.
> 
> > +struct iocb *iocb = ioq->freelist[--ioq->freelist_idx];
> > +ioq->queue[ioq->queue_idx++] = iocb;
> > +return iocb;
> > +}
> > +
> > +void ioq_put_iocb(IOQueue *ioq, struct iocb *iocb)
> > +{
> > +if (unlikely(ioq->freelist_idx == ioq->max_reqs)) {
> > +fprintf(stderr, "ioq overflow\n");
> > +exit(1);
> > +}
> 
> Same here.

The ioq is sized so that the guest cannot submit more than max_reqs
requests, because the vring size bounds how many can be in flight.
Therefore neither case can happen and I have changed both checks to
asserts.
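
Roughly, the invariant looks like this. A minimal sketch, not the ioq.c code:
ReqSketch stands in for struct iocb so the example does not need libaio, the
submission queue is left out, and all *_sketch names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct iocb so the sketch does not depend on libaio */
typedef struct {
    int id;
} ReqSketch;

typedef struct {
    ReqSketch *pool;            /* backing storage for all requests */
    ReqSketch **freelist;       /* stack of currently unused requests */
    unsigned int freelist_idx;  /* number of entries on the freelist */
    unsigned int max_reqs;      /* sized from the vring, never exceeded */
} IOQSketch;

void ioq_sketch_init(IOQSketch *ioq, unsigned int max_reqs)
{
    unsigned int i;

    ioq->max_reqs = max_reqs;
    ioq->pool = calloc(max_reqs, sizeof(ioq->pool[0]));
    ioq->freelist = calloc(max_reqs, sizeof(ioq->freelist[0]));
    if (!ioq->pool || !ioq->freelist) {
        abort();
    }
    for (i = 0; i < max_reqs; i++) {
        ioq->pool[i].id = (int)i;
        ioq->freelist[i] = &ioq->pool[i];
    }
    ioq->freelist_idx = max_reqs;
}

void ioq_sketch_cleanup(IOQSketch *ioq)
{
    free(ioq->freelist);
    free(ioq->pool);
}

/* Take a request; underflow would be a programming error, not a
 * runtime condition, hence the assertion. */
ReqSketch *ioq_sketch_get(IOQSketch *ioq)
{
    assert(ioq->freelist_idx > 0);
    return ioq->freelist[--ioq->freelist_idx];
}

/* Return a request; overflow is likewise impossible by construction */
void ioq_sketch_put(IOQSketch *ioq, ReqSketch *req)
{
    assert(ioq->freelist_idx < ioq->max_reqs);
    ioq->freelist[ioq->freelist_idx++] = req;
}
```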

Stefan



Re: [Qemu-devel] [PATCH v5 10/11] dataplane: add virtio-blk data plane code

2012-12-10 Thread Stefan Hajnoczi
On Fri, Dec 07, 2012 at 07:04:39PM +0100, Kevin Wolf wrote:
> On 05.12.2012 21:47, Stefan Hajnoczi wrote:
> > virtio-blk-data-plane is a subset implementation of virtio-blk.  It only
> > handles read, write, and flush requests.  It does this using a dedicated
> > thread that executes an epoll(2)-based event loop and processes I/O
> > using Linux AIO.
> > 
> > This approach performs very well but can be used for raw image files
> > only.  The number of IOPS achieved has been reported to be several times
> > higher than the existing virtio-blk implementation.
> > 
> > Eventually it should be possible to unify virtio-blk-data-plane with the
> > main body of QEMU code once the block layer and hardware emulation are
> > able to run outside the global mutex.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> 
> > +static int process_request(IOQueue *ioq, struct iovec iov[],
> > +   unsigned int out_num, unsigned int in_num,
> > +   unsigned int head)
> > +{
> > +VirtIOBlockDataPlane *s = container_of(ioq, VirtIOBlockDataPlane, ioqueue);
> > +struct iovec *in_iov = &iov[out_num];
> > +struct virtio_blk_outhdr outhdr;
> > +QEMUIOVector *inhdr;
> > +size_t in_size;
> > +
> > +/* Copy in outhdr */
> > +if (unlikely(iov_to_buf(iov, out_num, 0, &outhdr,
> > +sizeof(outhdr)) != sizeof(outhdr))) {
> > +error_report("virtio-blk request outhdr too short");
> > +return -EFAULT;
> > +}
> > +iov_discard(&iov, &out_num, sizeof(outhdr));
> > +
> > +/* Grab inhdr for later */
> > +in_size = iov_size(in_iov, in_num);
> > +if (in_size < sizeof(struct virtio_blk_inhdr)) {
> > +error_report("virtio_blk request inhdr too short");
> > +return -EFAULT;
> > +}
> > +inhdr = g_slice_new(QEMUIOVector);
> > +qemu_iovec_init(inhdr, 1);
> > +qemu_iovec_concat_iov(inhdr, in_iov, in_num,
> > +in_size - sizeof(struct virtio_blk_inhdr),
> > +sizeof(struct virtio_blk_inhdr));
> > +iov_discard(&in_iov, &in_num, -sizeof(struct virtio_blk_inhdr));
> > +
> > +/* TODO Linux sets the barrier bit even when not advertised! */
> > +outhdr.type &= ~VIRTIO_BLK_T_BARRIER;
> > +
> > +struct iocb *iocb;
> > +switch (outhdr.type & (VIRTIO_BLK_T_OUT | VIRTIO_BLK_T_SCSI_CMD |
> > +   VIRTIO_BLK_T_FLUSH)) {
> > +case VIRTIO_BLK_T_IN:
> > +iocb = ioq_rdwr(ioq, true, in_iov, in_num, outhdr.sector * 512);
> > +break;
> > +
> > +case VIRTIO_BLK_T_OUT:
> > +iocb = ioq_rdwr(ioq, false, iov, out_num, outhdr.sector * 512);
> > +break;
> > +
> > +case VIRTIO_BLK_T_SCSI_CMD:
> > +/* TODO support SCSI commands */
> > +fail_request_early(s, head, inhdr, VIRTIO_BLK_S_UNSUPP);
> > +return 0;
> > +
> > +case VIRTIO_BLK_T_FLUSH:
> > +/* TODO fdsync not supported by Linux AIO, do it synchronously here! */
> > +fdatasync(s->fd);
> 
> We shouldn't ignore errors here.

Fixed, thanks.
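
As a sketch of the kind of change this implies (not the actual v6 code): the
result of fdatasync() is turned into a negative errno and then into a request
status. flush_fd_sketch(), flush_status_sketch() and the SK_S_* values are
hypothetical stand-ins for the real helpers and the VIRTIO_BLK_S_* constants.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Stand-ins for the virtio-blk status codes reported to the guest */
enum {
    SK_S_OK = 0,
    SK_S_IOERR = 1,
};

/* Flush file data to stable storage.
 * Returns 0 on success or a negative errno value on failure. */
int flush_fd_sketch(int fd)
{
    if (fdatasync(fd) != 0) {
        return -errno;
    }
    return 0;
}

/* Map the flush result onto the status byte completed to the guest */
int flush_status_sketch(int ret)
{
    assert(ret <= 0);   /* callers pass 0 or a negative errno */
    return ret == 0 ? SK_S_OK : SK_S_IOERR;
}
```

The point is only that a failed flush must reach the guest as an I/O error
instead of being completed as success.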

Stefan



Re: [Qemu-devel] [PATCHv6] rbd block driver fix race between aio completition and aio cancel

2012-12-10 Thread Kevin Wolf
On 30.11.2012 14:50, Stefan Hajnoczi wrote:
> On Fri, Nov 30, 2012 at 9:55 AM, Stefan Priebe  wrote:
>> This one fixes a race, which qemu also had in the iscsi block driver,
>> between cancellation and I/O completion.
>>
>> qemu_rbd_aio_cancel was not synchronously waiting for the end of
>> the command.
>>
>> To achieve this it introduces a new status flag which uses
>> -EINPROGRESS.
>>
>> Changes since PATCHv5:
>> - qemu_aio_release has to be done in qemu_rbd_aio_cancel if I/O
>>   was cancelled
>>
>> Changes since PATCHv4:
>> - removed unnecessary qemu_vfree of acb->bounce as BH will always
>>   run
>>
>> Changes since PATCHv3:
>> - removed unnecessary if condition in rbd_start_aio as we
>>   haven't started I/O yet
>> - moved acb->status = 0 to rbd_aio_bh_cb so qemu_aio_wait always
>>   waits until BH was executed
>>
>> Changes since PATCHv2:
>> - fixed missing braces
>> - added vfree for bounce
>>
>> Signed-off-by: Stefan Priebe 
>>
>> ---
>>  block/rbd.c |   20 
>>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> Reviewed-by: Stefan Hajnoczi 

Thanks, applied to the block branch.

For future patches, please put a "---" line between the real commit
message (including the SoB, of course) and the changelog so that git am
automatically removes the changelog.

Kevin



[Qemu-devel] [PATCH v6 00/12] virtio: virtio-blk data plane

2012-12-10 Thread Stefan Hajnoczi
This series adds the -device virtio-blk-pci,x-data-plane=on property that
enables a high performance I/O codepath.  A dedicated thread is used to process
virtio-blk requests outside the global mutex and without going through the QEMU
block layer.

Khoa Huynh reported an increase from 140,000 IOPS to 600,000
IOPS for a single VM using virtio-blk-data-plane in July:

  http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580

The virtio-blk-data-plane approach was originally presented at Linux Plumbers
Conference 2010.  The following slides contain a brief overview:

  http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf

The basic approach is:
1. Each virtio-blk device has a thread dedicated to handling ioeventfd
   signalling when the guest kicks the virtqueue.
2. Requests are processed without going through the QEMU block layer using
   Linux AIO directly.
3. Completion interrupts are injected via irqfd from the dedicated thread.
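
Step 1 relies on eventfd semantics: any number of kicks that arrive before the
thread wakes up are coalesced into a single readable counter. A minimal
sketch of just that behaviour, assuming a Linux host; the guest kick is
simulated with a plain write() (with ioeventfd the kernel signals the eventfd
itself), and kick_sketch() / drain_kicks_sketch() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Simulated guest kick: add 1 to the eventfd counter */
void kick_sketch(int efd)
{
    uint64_t one = 1;
    ssize_t len = write(efd, &one, sizeof(one));

    assert(len == (ssize_t)sizeof(one));
    (void)len;
}

/* Data plane side: read and reset the counter.
 * Returns the number of coalesced kicks, or 0 if none were pending
 * (the eventfd is expected to be non-blocking). */
uint64_t drain_kicks_sketch(int efd)
{
    uint64_t count = 0;

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

One wakeup can therefore serve several kicks, after which the thread
processes everything it finds in the vring.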

To try it out:

  qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=... \
       -device virtio-blk-pci,drive=drive0,scsi=off,config-wce=off,x-data-plane=on

Limitations:
 * Only format=raw is supported
 * Live migration is not supported
 * Block jobs, hot unplug, and other operations fail with -EBUSY
 * I/O throttling limits are ignored
 * Only Linux hosts are supported due to Linux AIO usage

The code has reached a stage where I feel it is ready to merge.  Users have
been playing with it for some time and want the significant performance boost.

We are refactoring QEMU to get rid of the global mutex.  I believe that
virtio-blk-data-plane can eventually become the default mode of operation.

Instead of waiting for global mutex removal efforts to finish, I want to use
virtio-blk-data-plane as an example device for AioContext and threaded hw
dispatch refactoring.  This means:

1. When the block layer can bind to an AioContext and execute I/O outside the
   global mutex, virtio-blk-data-plane can use this (and gain image format
   support).

2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
   again and perhaps run a pool of iothreads instead of dedicated data plane
   threads.

But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
it can be merged as an experimental feature.

v6:
 * Move hw/Makefile.objs dataplane/ inclusion from Patch 4 to Patch 3 [Kevin]
 * Split discard() with front/back and switch ssize_t to size_t [Michael]
 * Disable WCE config feature [Michael]
 * Assert on ioq underflow/overflow, it can never happen [Kevin]
 * Propagate fdatasync() errors [Kevin]
 * Remember to init/destroy hostmem mutex
 * Declare VirtIOBlkConf->data_plane in the right patch so building works

v5:
 * Omit memory regions with dirty logging enabled from hostmem [Michael]
 * Add doc comment about quiescing requests across memory hot unplug [Michael]
 * Clarify which Linux vhost version the vring code originates from [Michael]
 * Break up indirect vring buffer into 1 hostmem_lookup() per descriptor [Michael]
 * Barriers in hw/dataplane/vring.c to force fields to be loaded [Michael]
 * split vring_set_notification() into enable/disable [Paolo]
 * barriers in vring.c instead of virtio-blk.c [Michael]
 * move setup code from hw/virtio-blk.c into hw/dataplane/virtio-blk.c [Michael]

 * Note I did not get rid of the mutex+condvar approach to draining requests.
   I've had good feedback on the performance of the patch series so I'm not
   worried about eliminating the lock (it's very rarely contended).  Hope
   Michael and Paolo are okay with this approach.

v4:
 * Add qemu_iovec_concat_iov() [Paolo]
 * Use QEMUIOVector to copy out virtio_blk_inhdr [Michael, Paolo]

v3:
 * Don't assume iovec layout [Michael]
 * Better naming for hostmem.c MemoryListener callbacks [Don]
 * More vring quarantining if commands are bogus instead of exiting [Blue]

v2:
 * Use MemoryListener for thread-safe memory mapping [Paolo, Anthony, and
   everyone else pointed this out ;-)]
 * Quarantine invalid vring instead of exiting [Blue]
 * Replace __u16 kernel types with uint16_t [Blue]

Changes from the RFC v9:
 * Add x-data-plane=on|off option and coexist with regular virtio-blk code
 * Create thread from BH so it inherits iothread cpusets
 * Drain requests on vm_stop() so stopped guest does not access image file
 * Add migration blocker
 * Add bdrv_in_use() to prevent block jobs and other operations that can
   interfere
 * Drop IOQueue request merging for simplicity
 * Drop ioctl interrupt injection and always use irqfd for simplicity
 * Major cleanup to split up source files
 * Rebase from qemu-kvm.git onto qemu.git
 * Address Michael Tsirkin's review comments

Stefan Hajnoczi (12):
  raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
  configure: add CONFIG_VIRTIO_BLK_DATA_PLANE
  dataplane: add host memory mapping code
  dataplane: add virtqueue vring code
  dataplane: add event loo

[Qemu-devel] [PATCH v6 05/12] dataplane: add event loop

2012-12-10 Thread Stefan Hajnoczi
Outside the safety of the global mutex we need to poll on file
descriptors.  I found epoll(2) is a convenient way to do that, although
other options could replace this module in the future (such as an
AioContext-based loop or glib's GMainLoop).

One important feature of this small event loop implementation is that
the loop can be terminated in a thread-safe way.  This allows QEMU to
stop the data plane thread cleanly.
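
The termination mechanism, reduced to its core, can be sketched like this. It
is a standalone illustration and not the code in this patch: raw eventfds are
used in place of EventNotifier, errors are returned instead of calling
exit(), and every *_sketch / *Sketch name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct HandlerSketch HandlerSketch;
typedef bool CallbackSketch(HandlerSketch *handler);
struct HandlerSketch {
    int fd;                    /* eventfd watched by the loop */
    CallbackSketch *callback;  /* return false to stop the loop */
};

typedef struct {
    int epoll_fd;
    HandlerSketch stop_handler;
} LoopSketch;

/* Callback for the stop eventfd: ends loop_run_sketch() */
static bool on_stop_sketch(HandlerSketch *handler)
{
    (void)handler;
    return false;
}

/* Register an eventfd and its callback; returns 0 or -1 */
int loop_add_sketch(LoopSketch *loop, HandlerSketch *handler, int fd,
                    CallbackSketch *callback)
{
    struct epoll_event event = { .events = EPOLLIN, .data.ptr = handler };

    handler->fd = fd;
    handler->callback = callback;
    return epoll_ctl(loop->epoll_fd, EPOLL_CTL_ADD, fd, &event);
}

int loop_init_sketch(LoopSketch *loop)
{
    int stop_fd = eventfd(0, EFD_CLOEXEC);

    loop->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (stop_fd >= 0 && loop->epoll_fd >= 0 &&
        loop_add_sketch(loop, &loop->stop_handler, stop_fd,
                        on_stop_sketch) == 0) {
        return 0;
    }
    if (stop_fd >= 0) {
        close(stop_fd);
    }
    if (loop->epoll_fd >= 0) {
        close(loop->epoll_fd);
    }
    return -1;
}

/* Dispatch one event at a time until a callback returns false.
 * Returns the number of events dispatched. */
unsigned int loop_run_sketch(LoopSketch *loop)
{
    unsigned int dispatched = 0;

    for (;;) {
        struct epoll_event event;
        HandlerSketch *handler;
        uint64_t count;
        ssize_t len;
        int nevents = epoll_wait(loop->epoll_fd, &event, 1, -1);

        if (nevents < 0 && errno == EINTR) {
            continue;
        }
        assert(nevents == 1);
        handler = event.data.ptr;
        len = read(handler->fd, &count, sizeof(count)); /* clear eventfd */
        assert(len == (ssize_t)sizeof(count));
        (void)len;
        dispatched++;
        if (!handler->callback(handler)) {
            break;
        }
    }
    return dispatched;
}

/* Safe to call from another thread: it only writes to the eventfd */
void loop_stop_sketch(LoopSketch *loop)
{
    uint64_t one = 1;
    ssize_t len = write(loop->stop_handler.fd, &one, sizeof(one));

    (void)len;
}

void loop_cleanup_sketch(LoopSketch *loop)
{
    close(loop->stop_handler.fd);
    close(loop->epoll_fd);
}
```

Because stopping is just another event, the loop thread never needs to be
interrupted or to share state with the thread that asks it to stop.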

Signed-off-by: Stefan Hajnoczi 
---
 hw/dataplane/Makefile.objs |   2 +-
 hw/dataplane/event-poll.c  | 109 +
 hw/dataplane/event-poll.h  |  40 +
 3 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 hw/dataplane/event-poll.c
 create mode 100644 hw/dataplane/event-poll.h

diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
index 34e6d57..e26bd7d 100644
--- a/hw/dataplane/Makefile.objs
+++ b/hw/dataplane/Makefile.objs
@@ -1,3 +1,3 @@
 ifeq ($(CONFIG_VIRTIO), y)
-common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o
+common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o
 endif
diff --git a/hw/dataplane/event-poll.c b/hw/dataplane/event-poll.c
new file mode 100644
index 000..4a53d48
--- /dev/null
+++ b/hw/dataplane/event-poll.c
@@ -0,0 +1,109 @@
+/*
+ * Event loop with file descriptor polling
+ *
+ * Copyright 2012 IBM, Corp.
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+#include "hw/dataplane/event-poll.h"
+
+/* Add an event notifier and its callback for polling */
+void event_poll_add(EventPoll *poll, EventHandler *handler,
+EventNotifier *notifier, EventCallback *callback)
+{
+struct epoll_event event = {
+.events = EPOLLIN,
+.data.ptr = handler,
+};
+handler->notifier = notifier;
+handler->callback = callback;
+if (epoll_ctl(poll->epoll_fd, EPOLL_CTL_ADD,
+  event_notifier_get_fd(notifier), &event) != 0) {
+fprintf(stderr, "failed to add event handler to epoll: %m\n");
+exit(1);
+}
+}
+
+/* Event callback for stopping the event_poll_run() loop */
+static bool handle_stop(EventHandler *handler)
+{
+return false; /* stop event loop */
+}
+
+void event_poll_init(EventPoll *poll)
+{
+/* Create epoll file descriptor */
+poll->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
+if (poll->epoll_fd < 0) {
+fprintf(stderr, "epoll_create1 failed: %m\n");
+exit(1);
+}
+
+/* Set up stop notifier */
+if (event_notifier_init(&poll->stop_notifier, 0) < 0) {
+fprintf(stderr, "failed to init stop notifier\n");
+exit(1);
+}
+event_poll_add(poll, &poll->stop_handler,
+   &poll->stop_notifier, handle_stop);
+}
+
+void event_poll_cleanup(EventPoll *poll)
+{
+event_notifier_cleanup(&poll->stop_notifier);
+close(poll->epoll_fd);
+poll->epoll_fd = -1;
+}
+
+/* Block until the next event and invoke its callback
+ *
+ * Signals must be masked, EINTR should never happen.  This is true for QEMU
+ * threads.
+ */
+static bool event_poll(EventPoll *poll)
+{
+EventHandler *handler;
+struct epoll_event event;
+int nevents;
+
+/* Wait for the next event.  Only do one event per call to keep the
+ * function simple, this could be changed later. */
+nevents = epoll_wait(poll->epoll_fd, &event, 1, -1);
+if (unlikely(nevents != 1)) {
+fprintf(stderr, "epoll_wait failed: %m\n");
+exit(1); /* should never happen */
+}
+
+/* Find out which event handler has become active */
+handler = event.data.ptr;
+
+/* Clear the eventfd */
+event_notifier_test_and_clear(handler->notifier);
+
+/* Handle the event */
+return handler->callback(handler);
+}
+
+void event_poll_run(EventPoll *poll)
+{
+while (event_poll(poll)) {
+/* do nothing */
+}
+}
+
+/* Stop the event_poll_run() loop
+ *
+ * This function can be used from another thread.
+ */
+void event_poll_stop(EventPoll *poll)
+{
+event_notifier_set(&poll->stop_notifier);
+}
diff --git a/hw/dataplane/event-poll.h b/hw/dataplane/event-poll.h
new file mode 100644
index 000..5e1771f
--- /dev/null
+++ b/hw/dataplane/event-poll.h
@@ -0,0 +1,40 @@
+/*
+ * Event loop with file descriptor polling
+ *
+ * Copyright 2012 IBM, Corp.
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef EVENT_POLL_H
+#define EVENT_POLL_H
+
+#include "event_notifier.h"
+
+typedef struct EventHandler EventHandler;
+typedef bool EventCallback(EventHandler *handler);
+struct EventHandler {
+EventNotifier *notifier;/* ev

[Qemu-devel] [PATCH v6 12/12] virtio-blk: add x-data-plane=on|off performance feature

2012-12-10 Thread Stefan Hajnoczi
The virtio-blk-data-plane feature is easy to integrate into
hw/virtio-blk.c.  The data plane can be started and stopped similarly to
vhost-net.

Users can take advantage of the virtio-blk-data-plane feature using the
new -device virtio-blk-pci,x-data-plane=on property.

The x-data-plane name was chosen because at this stage the feature is
experimental and likely to see changes in the future.

If the VM configuration does not support virtio-blk-data-plane an error
message is printed.  Although we could fall back to regular virtio-blk,
I prefer the explicit approach since it prompts the user to fix their
configuration if they want the performance benefit of
virtio-blk-data-plane.

Limitations:
 * Only format=raw is supported
 * Live migration is not supported
 * Block jobs, hot unplug, and other operations fail with -EBUSY
 * I/O throttling limits are ignored
 * Only Linux hosts are supported due to Linux AIO usage

Signed-off-by: Stefan Hajnoczi 
---
 hw/virtio-blk.c | 28 +++-
 hw/virtio-pci.c |  3 +++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index fabf387..4e7ef64 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -17,6 +17,7 @@
 #include "hw/block-common.h"
 #include "blockdev.h"
 #include "virtio-blk.h"
+#include "hw/dataplane/virtio-blk.h"
 #include "scsi-defs.h"
 #ifdef __linux__
 # include 
@@ -33,6 +34,7 @@ typedef struct VirtIOBlock
 VirtIOBlkConf *blk;
 unsigned short sector_mask;
 DeviceState *qdev;
+VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 .num_writes = 0,
 };
 
+/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
+ * dataplane here instead of waiting for .set_status().
+ */
+if (s->dataplane) {
+virtio_blk_data_plane_start(s->dataplane);
+return;
+}
+
 while ((req = virtio_blk_get_request(s))) {
 virtio_blk_handle_request(req, &mrb);
 }
@@ -446,8 +456,13 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running,
 {
 VirtIOBlock *s = opaque;
 
-if (!running)
+if (!running) {
+/* qemu_drain_all() doesn't know about data plane, quiesce here */
+if (s->dataplane) {
+virtio_blk_data_plane_drain(s->dataplane);
+}
 return;
+}
 
 if (!s->bh) {
 s->bh = qemu_bh_new(virtio_blk_dma_restart_bh, s);
@@ -541,6 +556,10 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 VirtIOBlock *s = to_virtio_blk(vdev);
 uint32_t features;
 
+if (s->dataplane && !(status & VIRTIO_CONFIG_S_DRIVER)) {
+virtio_blk_data_plane_stop(s->dataplane);
+}
+
 if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
 return;
 }
@@ -638,6 +657,10 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 s->sector_mask = (s->conf->logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
 s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
+if (!virtio_blk_data_plane_create(&s->vdev, blk, &s->dataplane)) {
+virtio_cleanup(&s->vdev);
+return NULL;
+}
 
 qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
 s->qdev = dev;
@@ -655,6 +678,9 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 void virtio_blk_exit(VirtIODevice *vdev)
 {
 VirtIOBlock *s = to_virtio_blk(vdev);
+
+virtio_blk_data_plane_destroy(s->dataplane);
+s->dataplane = NULL;
 unregister_savevm(s->qdev, "virtio-blk", s);
 blockdev_mark_auto_del(s->bs);
 virtio_cleanup(vdev);
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 7684ac9..c60b89a 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -896,6 +896,9 @@ static Property virtio_blk_properties[] = {
 DEFINE_PROP_BIT("scsi", VirtIOPCIProxy, blk.scsi, 0, true),
 #endif
 DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
+#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
+DEFINE_PROP_BIT("x-data-plane", VirtIOPCIProxy, blk.data_plane, 0, false),
+#endif
 DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
 DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
 DEFINE_PROP_END_OF_LIST(),
-- 
1.8.0.1




Re: [Qemu-devel] [PATCH v6 00/12] virtio: virtio-blk data plane

2012-12-10 Thread Stefan Hajnoczi
On Mon, Dec 10, 2012 at 2:09 PM, Stefan Hajnoczi wrote:
> v6:

Note that v6 is based on git://repo.or.cz/qemu/kevin.git block.

Stefan



[Qemu-devel] [PATCH v6 01/12] raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane

2012-12-10 Thread Stefan Hajnoczi
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the
file descriptor of a raw image file with Linux AIO enabled.  This
interface is really a layering violation that can be resolved once the
block layer is able to run outside the global mutex - at that point
virtio-blk-data-plane will switch from custom Linux AIO code to using
the block layer.

Signed-off-by: Stefan Hajnoczi 
---
 block.h   |  9 +
 block/raw-posix.c | 34 ++
 2 files changed, 43 insertions(+)
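
A minimal sketch of the compile-time fallback pattern that block.h uses
for raw_get_aio_fd().  The macro CONFIG_EXAMPLE_AIO and the names
ExampleState, example_get_aio_fd() and example_open_fast_path() are
hypothetical and do not exist in QEMU.  The point is only that callers
compile unchanged whether or not the feature is built in, and handle
-ENOTSUP at run time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for BlockDriverState; not the QEMU type. */
typedef struct ExampleState {
    int fd;
    int use_aio;
} ExampleState;

#ifdef CONFIG_EXAMPLE_AIO
/* Feature built in: hand out the descriptor only when AIO is in use. */
static inline int example_get_aio_fd(ExampleState *s)
{
    return s->use_aio ? s->fd : -ENOTSUP;
}
#else
/* Feature compiled out: same signature, always "not supported". */
static inline int example_get_aio_fd(ExampleState *s)
{
    (void)s;
    return -ENOTSUP;
}
#endif

/* The caller needs no #ifdef; it only checks for a negative errno value. */
static int example_open_fast_path(ExampleState *s)
{
    int fd;

    assert(s != NULL);
    fd = example_get_aio_fd(s);
    if (fd < 0) {
        fprintf(stderr, "fast path unavailable: %d\n", fd);
    }
    return fd;
}
```

Built without -DCONFIG_EXAMPLE_AIO, example_open_fast_path() reports
-ENOTSUP for every state, which is what a caller such as
virtio-blk-data-plane would have to treat as "feature not usable here".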

diff --git a/block.h b/block.h
index 24bea09..7f84414 100644
--- a/block.h
+++ b/block.h
@@ -365,6 +365,15 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+#ifdef CONFIG_LINUX_AIO
+int raw_get_aio_fd(BlockDriverState *bs);
+#else
+static inline int raw_get_aio_fd(BlockDriverState *bs)
+{
+return -ENOTSUP;
+}
+#endif
+
 enum BlockAcctType {
 BDRV_ACCT_READ,
 BDRV_ACCT_WRITE,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index abfedbe..634824b 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1777,6 +1777,40 @@ static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
+#ifdef CONFIG_LINUX_AIO
+/**
+ * Return the file descriptor for Linux AIO
+ *
+ * This function is a layering violation and should be removed when it becomes
+ * possible to call the block layer outside the global mutex.  It allows the
+ * caller to hijack the file descriptor so I/O can be performed outside the
+ * block layer.
+ */
+int raw_get_aio_fd(BlockDriverState *bs)
+{
+BDRVRawState *s;
+
+if (!bs->drv) {
+return -ENOMEDIUM;
+}
+
+if (bs->drv == bdrv_find_format("raw")) {
+bs = bs->file;
+}
+
+/* raw-posix has several protocols so just check for raw_aio_readv */
+if (bs->drv->bdrv_aio_readv != raw_aio_readv) {
+return -ENOTSUP;
+}
+
+s = bs->opaque;
+if (!s->use_aio) {
+return -ENOTSUP;
+}
+return s->fd;
+}
+#endif /* CONFIG_LINUX_AIO */
+
 static void bdrv_file_init(void)
 {
 /*
-- 
1.8.0.1




[Qemu-devel] [PATCH v6 02/12] configure: add CONFIG_VIRTIO_BLK_DATA_PLANE

2012-12-10 Thread Stefan Hajnoczi
The virtio-blk-data-plane feature only works with Linux AIO.  Therefore
add a ./configure option and necessary checks to implement this
dependency.

Signed-off-by: Stefan Hajnoczi 
---
 configure | 21 +
 1 file changed, 21 insertions(+)

diff --git a/configure b/configure
index e5aedef..6999072 100755
--- a/configure
+++ b/configure
@@ -223,6 +223,7 @@ libiscsi=""
 coroutine=""
 seccomp=""
 glusterfs=""
+virtio_blk_data_plane=""
 
 # parse CC options first
 for opt do
@@ -880,6 +881,10 @@ for opt do
   ;;
   --enable-glusterfs) glusterfs="yes"
   ;;
+  --disable-virtio-blk-data-plane) virtio_blk_data_plane="no"
+  ;;
+  --enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
   esac
@@ -2257,6 +2262,17 @@ EOF
 fi
 
 ##
+# adjust virtio-blk-data-plane based on linux-aio
+
+if test "$virtio_blk_data_plane" = "yes" -a \
+   "$linux_aio" != "yes" ; then
+  echo "Error: virtio-blk-data-plane requires Linux AIO, please try --enable-linux-aio"
+  exit 1
+elif test -z "$virtio_blk_data_plane" ; then
+  virtio_blk_data_plane=$linux_aio
+fi
+
+##
 # attr probe
 
 if test "$attr" != "no" ; then
@@ -3259,6 +3275,7 @@ echo "build guest agent $guest_agent"
 echo "seccomp support   $seccomp"
 echo "coroutine backend $coroutine_backend"
 echo "GlusterFS support $glusterfs"
+echo "virtio-blk-data-plane $virtio_blk_data_plane"
 
 if test "$sdl_too_old" = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -3605,6 +3622,10 @@ if test "$glusterfs" = "yes" ; then
   echo "CONFIG_GLUSTERFS=y" >> $config_host_mak
 fi
 
+if test "$virtio_blk_data_plane" = "yes" ; then
+  echo "CONFIG_VIRTIO_BLK_DATA_PLANE=y" >> $config_host_mak
+fi
+
 # USB host support
 case "$usb" in
 linux)
-- 
1.8.0.1




[Qemu-devel] [PATCH v6 04/12] dataplane: add virtqueue vring code

2012-12-10 Thread Stefan Hajnoczi
The virtio-blk-data-plane cannot access memory using the usual QEMU
functions since it executes outside the global mutex and the memory APIs
are not thread-safe at this time.

This patch introduces a virtqueue module based on the kernel's vhost
vring code.  The trick is that we map guest memory ahead of time and
access it cheaply outside the global mutex.

Once the hardware emulation code can execute outside the global mutex it
will be possible to drop this code.

Signed-off-by: Stefan Hajnoczi 
---
 hw/dataplane/Makefile.objs |   2 +-
 hw/dataplane/vring.c   | 361 +
 hw/dataplane/vring.h   |  63 
 trace-events   |   3 +
 4 files changed, 428 insertions(+), 1 deletion(-)
 create mode 100644 hw/dataplane/vring.c
 create mode 100644 hw/dataplane/vring.h
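
vring_should_notify() in the patch ends by calling vring_need_event().
A minimal sketch of that check, using the same arithmetic as the helper
of that name in the Linux virtio ring header; the example_ names are
hypothetical and only serve to keep the sketch standalone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * event_idx: index at which the other side asked to be notified
 * new_idx:   used index after this batch of completions
 * old_idx:   used index at the previous notification check
 *
 * All subtraction is modulo 2^16, so index wraparound is handled.
 * The result is true when event_idx lies in [old_idx, new_idx).
 */
static bool example_need_event(uint16_t event_idx, uint16_t new_idx,
                               uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) <
           (uint16_t)(new_idx - old_idx);
}

/* A few cases, including one that crosses the 16-bit wrap. */
static void example_need_event_demo(void)
{
    assert(example_need_event(5, 6, 5));          /* just passed index 5 */
    assert(!example_need_event(10, 6, 5));        /* index 10 not reached */
    assert(example_need_event(65535, 1, 65535));  /* passed across wrap */
}
```

The cast back to uint16_t matters: without it the subtraction is done in
int and the wraparound case gives the wrong answer.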

diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
index 8c8dea1..34e6d57 100644
--- a/hw/dataplane/Makefile.objs
+++ b/hw/dataplane/Makefile.objs
@@ -1,3 +1,3 @@
 ifeq ($(CONFIG_VIRTIO), y)
-common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o
+common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o
 endif
diff --git a/hw/dataplane/vring.c b/hw/dataplane/vring.c
new file mode 100644
index 000..8321c70
--- /dev/null
+++ b/hw/dataplane/vring.c
@@ -0,0 +1,361 @@
+/* Copyright 2012 Red Hat, Inc.
+ * Copyright IBM, Corp. 2012
+ *
+ * Based on Linux 2.6.39 vhost code:
+ * Copyright (C) 2009 Red Hat, Inc.
+ * Copyright (C) 2006 Rusty Russell IBM Corporation
+ *
+ * Author: Michael S. Tsirkin 
+ * Stefan Hajnoczi 
+ *
+ * Inspiration, some code, and most witty comments come from
+ * Documentation/virtual/lguest/lguest.c, by Rusty Russell
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include "trace.h"
+#include "hw/dataplane/vring.h"
+
+/* Map the guest's vring to host memory */
+bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
+{
+hwaddr vring_addr = virtio_queue_get_ring_addr(vdev, n);
+hwaddr vring_size = virtio_queue_get_ring_size(vdev, n);
+void *vring_ptr;
+
+vring->broken = false;
+
+hostmem_init(&vring->hostmem);
+vring_ptr = hostmem_lookup(&vring->hostmem, vring_addr, vring_size, true);
+if (!vring_ptr) {
+error_report("Failed to map vring "
+ "addr %#" HWADDR_PRIx " size %" HWADDR_PRIu,
+ vring_addr, vring_size);
+vring->broken = true;
+return false;
+}
+
+vring_init(&vring->vr, virtio_queue_get_num(vdev, n), vring_ptr, 4096);
+
+vring->last_avail_idx = 0;
+vring->last_used_idx = 0;
+vring->signalled_used = 0;
+vring->signalled_used_valid = false;
+
+trace_vring_setup(virtio_queue_get_ring_addr(vdev, n),
+  vring->vr.desc, vring->vr.avail, vring->vr.used);
+return true;
+}
+
+void vring_teardown(Vring *vring)
+{
+hostmem_finalize(&vring->hostmem);
+}
+
+/* Disable guest->host notifies */
+void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
+{
+if (!(vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX))) {
+vring->vr.used->flags |= VRING_USED_F_NO_NOTIFY;
+}
+}
+
+/* Enable guest->host notifies
+ *
+ * Return true if the vring is empty, false if there are more requests.
+ */
+bool vring_enable_notification(VirtIODevice *vdev, Vring *vring)
+{
+if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
+vring_avail_event(&vring->vr) = vring->vr.avail->idx;
+} else {
+vring->vr.used->flags &= ~VRING_USED_F_NO_NOTIFY;
+}
+smp_mb(); /* ensure update is seen before reading avail_idx */
+return !vring_more_avail(vring);
+}
+
+/* This is stolen from linux/drivers/vhost/vhost.c:vhost_notify() */
+bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
+{
+uint16_t old, new;
+bool v;
+/* Flush out used index updates. This is paired
+ * with the barrier that the Guest executes when enabling
+ * interrupts. */
+smp_mb();
+
+if ((vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) &&
+unlikely(vring->vr.avail->idx == vring->last_avail_idx)) {
+return true;
+}
+
+if (!(vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX))) {
+return !(vring->vr.avail->flags & VRING_AVAIL_F_NO_INTERRUPT);
+}
+old = vring->signalled_used;
+v = vring->signalled_used_valid;
+new = vring->signalled_used = vring->last_used_idx;
+vring->signalled_used_valid = true;
+
+if (unlikely(!v)) {
+return true;
+}
+
+return vring_need_event(vring_used_event(&vring->vr), new, old);
+}
+
+/* This is stolen from linux/drivers/vhost/vhost.c. */
+static int get_indirect(Vring *vring,
+struct iovec iov[], struct iovec *iov_end,
+unsigned int *out_num, unsigned int *in_num,
+struct vring_desc *indirect)
+{
+struct vring_desc desc;
+unsigned int i = 0, count, found = 0;
+

[Qemu-devel] [PATCH v6 07/12] iov: add iov_discard_front/back() to remove data

2012-12-10 Thread Stefan Hajnoczi
The iov_discard_front/back() functions remove data from the front or
back of the vector.  This is useful when peeling off header/footer
structs.

Signed-off-by: Stefan Hajnoczi 
---
 iov.c | 51 +++
 iov.h | 13 +
 2 files changed, 64 insertions(+)
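
A minimal usage sketch for peeling a header off the front of a vector.
example_discard_front() repeats the loop of iov_discard_front() from the
patch, with a char * cast so the pointer arithmetic is standard C.  The
4-byte header and the buffer contents are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Same algorithm as iov_discard_front() in the patch. */
static size_t example_discard_front(struct iovec **iov,
                                    unsigned int *iov_cnt, size_t bytes)
{
    size_t total = 0;
    struct iovec *cur;

    for (cur = *iov; *iov_cnt > 0; cur++) {
        if (cur->iov_len > bytes) {
            /* Partial element: advance its base and stop. */
            cur->iov_base = (char *)cur->iov_base + bytes;
            cur->iov_len -= bytes;
            total += bytes;
            break;
        }
        /* Whole element consumed: drop it from the vector. */
        bytes -= cur->iov_len;
        total += cur->iov_len;
        *iov_cnt -= 1;
    }
    *iov = cur;
    return total;
}

/* Strip a hypothetical 4-byte header that straddles two elements. */
static size_t example_peel_header(char *out, size_t out_len)
{
    char first[] = "HD";          /* first half of the header */
    char second[] = "R!payload";  /* rest of the header, then payload */
    struct iovec vec[2] = {
        { .iov_base = first,  .iov_len = 2 },
        { .iov_base = second, .iov_len = 9 },
    };
    struct iovec *iov = vec;
    unsigned int cnt = 2;
    size_t discarded = example_discard_front(&iov, &cnt, 4);

    assert(discarded == 4 && cnt == 1 && iov == &vec[1]);
    assert(iov->iov_len <= out_len);
    memcpy(out, iov->iov_base, iov->iov_len);
    return iov->iov_len;
}
```

Note that the function modifies the iovec elements in place, so a caller
that later frees the buffers has to remember the original base pointers,
as the test cases in this series do.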

diff --git a/iov.c b/iov.c
index a81eedc..d3b19e3 100644
--- a/iov.c
+++ b/iov.c
@@ -354,3 +354,54 @@ size_t qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
 {
 return iov_memset(qiov->iov, qiov->niov, offset, fillc, bytes);
 }
+
+size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt,
+ size_t bytes)
+{
+size_t total = 0;
+struct iovec *cur;
+
+for (cur = *iov; *iov_cnt > 0; cur++) {
+if (cur->iov_len > bytes) {
+cur->iov_base += bytes;
+cur->iov_len -= bytes;
+total += bytes;
+break;
+}
+
+bytes -= cur->iov_len;
+total += cur->iov_len;
+*iov_cnt -= 1;
+}
+
+*iov = cur;
+return total;
+}
+
+size_t iov_discard_back(struct iovec *iov, unsigned int *iov_cnt,
+size_t bytes)
+{
+size_t total = 0;
+struct iovec *cur;
+
+if (*iov_cnt == 0) {
+return 0;
+}
+
+cur = iov + (*iov_cnt - 1);
+
+while (*iov_cnt > 0) {
+if (cur->iov_len > bytes) {
+cur->iov_len -= bytes;
+total += bytes;
+break;
+}
+
+bytes -= cur->iov_len;
+total += cur->iov_len;
+cur--;
+*iov_cnt -= 1;
+}
+
+return total;
+}
diff --git a/iov.h b/iov.h
index 34c8ec9..237e34c 100644
--- a/iov.h
+++ b/iov.h
@@ -95,3 +95,16 @@ void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
 unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt,
  const struct iovec *iov, unsigned int iov_cnt,
  size_t offset, size_t bytes);
+
+/*
+ * Remove a given number of bytes from the front or back of a vector.
+ * This may update iov and/or iov_cnt to exclude iovec elements that are
+ * no longer required.
+ *
+ * The number of bytes actually discarded is returned.  This number may be
+ * smaller than requested if the vector is too small.
+ */
+size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt,
+ size_t bytes);
+size_t iov_discard_back(struct iovec *iov, unsigned int *iov_cnt,
+size_t bytes);
-- 
1.8.0.1




[Qemu-devel] [PATCH v6 08/12] test-iov: add iov_discard_front/back() testcases

2012-12-10 Thread Stefan Hajnoczi
Signed-off-by: Stefan Hajnoczi 
---
 tests/test-iov.c | 150 +++
 1 file changed, 150 insertions(+)

diff --git a/tests/test-iov.c b/tests/test-iov.c
index cbe7a89..720d95c 100644
--- a/tests/test-iov.c
+++ b/tests/test-iov.c
@@ -250,11 +250,161 @@ static void test_io(void)
 #endif
 }
 
+static void test_discard_front(void)
+{
+struct iovec *iov;
+struct iovec *iov_tmp;
+unsigned int iov_cnt;
+unsigned int iov_cnt_tmp;
+void *old_base;
+size_t size;
+size_t ret;
+
+/* Discard zero bytes */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, 0);
+g_assert(ret == 0);
+g_assert(iov_tmp == iov);
+g_assert(iov_cnt_tmp == iov_cnt);
+iov_free(iov, iov_cnt);
+
+/* Discard more bytes than vector size */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+size = iov_size(iov, iov_cnt);
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, size + 1);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == 0);
+iov_free(iov, iov_cnt);
+
+/* Discard entire vector */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+size = iov_size(iov, iov_cnt);
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == 0);
+iov_free(iov, iov_cnt);
+
+/* Discard within first element */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+old_base = iov->iov_base;
+size = g_test_rand_int_range(1, iov->iov_len);
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_tmp == iov);
+g_assert(iov_cnt_tmp == iov_cnt);
+g_assert(iov_tmp->iov_base == old_base + size);
+iov_tmp->iov_base = old_base; /* undo before g_free() */
+iov_free(iov, iov_cnt);
+
+/* Discard entire first element */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, iov->iov_len);
+g_assert(ret == iov->iov_len);
+g_assert(iov_tmp == iov + 1);
+g_assert(iov_cnt_tmp == iov_cnt - 1);
+iov_free(iov, iov_cnt);
+
+/* Discard within second element */
+iov_random(&iov, &iov_cnt);
+iov_tmp = iov;
+iov_cnt_tmp = iov_cnt;
+old_base = iov[1].iov_base;
+size = iov->iov_len + g_test_rand_int_range(1, iov[1].iov_len);
+ret = iov_discard_front(&iov_tmp, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_tmp == iov + 1);
+g_assert(iov_cnt_tmp == iov_cnt - 1);
+g_assert(iov_tmp->iov_base == old_base + (size - iov->iov_len));
+iov_tmp->iov_base = old_base; /* undo before g_free() */
+iov_free(iov, iov_cnt);
+}
+
+static void test_discard_back(void)
+{
+struct iovec *iov;
+unsigned int iov_cnt;
+unsigned int iov_cnt_tmp;
+void *old_base;
+size_t size;
+size_t ret;
+
+/* Discard zero bytes */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+ret = iov_discard_back(iov, &iov_cnt_tmp, 0);
+g_assert(ret == 0);
+g_assert(iov_cnt_tmp == iov_cnt);
+iov_free(iov, iov_cnt);
+
+/* Discard more bytes than vector size */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+size = iov_size(iov, iov_cnt);
+ret = iov_discard_back(iov, &iov_cnt_tmp, size + 1);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == 0);
+iov_free(iov, iov_cnt);
+
+/* Discard entire vector */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+size = iov_size(iov, iov_cnt);
+ret = iov_discard_back(iov, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == 0);
+iov_free(iov, iov_cnt);
+
+/* Discard within last element */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+old_base = iov[iov_cnt - 1].iov_base;
+size = g_test_rand_int_range(1, iov[iov_cnt - 1].iov_len);
+ret = iov_discard_back(iov, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == iov_cnt);
+g_assert(iov[iov_cnt - 1].iov_base == old_base);
+iov_free(iov, iov_cnt);
+
+/* Discard entire last element */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+old_base = iov[iov_cnt - 1].iov_base;
+size = iov[iov_cnt - 1].iov_len;
+ret = iov_discard_back(iov, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == iov_cnt - 1);
+iov_free(iov, iov_cnt);
+
+/* Discard within second-to-last element */
+iov_random(&iov, &iov_cnt);
+iov_cnt_tmp = iov_cnt;
+old_base = iov[iov_cnt - 2].iov_base;
+size = iov[iov_cnt - 1].iov_len +
+   g_test_rand_int_range(1, iov[iov_cnt - 2].iov_len);
+ret = iov_discard_back(iov, &iov_cnt_tmp, size);
+g_assert(ret == size);
+g_assert(iov_cnt_tmp == iov_cnt - 1);
+g_assert(iov[i

[Qemu-devel] [PATCH v6 11/12] dataplane: add virtio-blk data plane code

2012-12-10 Thread Stefan Hajnoczi
virtio-blk-data-plane is a subset implementation of virtio-blk.  It only
handles read, write, and flush requests.  It does this using a dedicated
thread that executes an epoll(2)-based event loop and processes I/O
using Linux AIO.

This approach performs very well but can be used for raw image files
only.  The number of IOPS achieved has been reported to be several times
higher than the existing virtio-blk implementation.

Eventually it should be possible to unify virtio-blk-data-plane with the
main body of QEMU code once the block layer and hardware emulation is
able to run outside the global mutex.

Signed-off-by: Stefan Hajnoczi 
---
 hw/dataplane/Makefile.objs |   2 +-
 hw/dataplane/virtio-blk.c  | 472 +
 hw/dataplane/virtio-blk.h  |  43 +
 hw/virtio-blk.h|   1 +
 trace-events   |   6 +
 5 files changed, 523 insertions(+), 1 deletion(-)
 create mode 100644 hw/dataplane/virtio-blk.c
 create mode 100644 hw/dataplane/virtio-blk.h
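
complete_request() in the patch decrements num_reqs under a mutex and
broadcasts a condition variable when the count reaches zero.  A minimal
sketch of that in-flight counter using plain pthreads follows.  All
example_ names are hypothetical, and the waiting side (example_drain())
is an assumption about how a drain operation would pair with the
broadcast; it is not copied from the patch.

```c
#include <assert.h>
#include <pthread.h>

typedef struct ExampleInflight {
    unsigned int num_reqs;     /* requests submitted but not completed */
    pthread_mutex_t lock;      /* protects num_reqs */
    pthread_cond_t no_reqs;    /* signalled when num_reqs drops to 0 */
} ExampleInflight;

static void example_inflight_init(ExampleInflight *s)
{
    s->num_reqs = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->no_reqs, NULL);
}

/* Called when a request is handed to the I/O backend. */
static void example_request_start(ExampleInflight *s)
{
    pthread_mutex_lock(&s->lock);
    s->num_reqs++;
    pthread_mutex_unlock(&s->lock);
}

/* Called from the completion path; wakes waiters on the last request. */
static void example_request_complete(ExampleInflight *s)
{
    pthread_mutex_lock(&s->lock);
    assert(s->num_reqs > 0);
    if (--s->num_reqs == 0) {
        pthread_cond_broadcast(&s->no_reqs);
    }
    pthread_mutex_unlock(&s->lock);
}

/* Block until every in-flight request has completed. */
static void example_drain(ExampleInflight *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->num_reqs > 0) {
        pthread_cond_wait(&s->no_reqs, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}
```

The while loop around pthread_cond_wait() is required because condition
variables may wake spuriously; the predicate has to be rechecked with
the mutex held.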

diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
index abd408f..682aa9e 100644
--- a/hw/dataplane/Makefile.objs
+++ b/hw/dataplane/Makefile.objs
@@ -1,3 +1,3 @@
 ifeq ($(CONFIG_VIRTIO), y)
-common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o ioq.o
+common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o ioq.o virtio-blk.o
 endif
diff --git a/hw/dataplane/virtio-blk.c b/hw/dataplane/virtio-blk.c
new file mode 100644
index 000..1355e04
--- /dev/null
+++ b/hw/dataplane/virtio-blk.c
@@ -0,0 +1,472 @@
+/*
+ * Dedicated thread for virtio-blk I/O processing
+ *
+ * Copyright 2012 IBM, Corp.
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "trace.h"
+#include "iov.h"
+#include "event-poll.h"
+#include "qemu-thread.h"
+#include "vring.h"
+#include "ioq.h"
+#include "migration.h"
+#include "hw/virtio-blk.h"
+#include "hw/dataplane/virtio-blk.h"
+
+enum {
+SEG_MAX = 126,  /* maximum number of I/O segments */
+VRING_MAX = SEG_MAX + 2,/* maximum number of vring descriptors */
+REQ_MAX = VRING_MAX,/* maximum number of requests in the vring,
+ * is VRING_MAX / 2 with traditional and
+ * VRING_MAX with indirect descriptors */
+};
+
+typedef struct {
+struct iocb iocb;   /* Linux AIO control block */
+QEMUIOVector *inhdr;/* iovecs for virtio_blk_inhdr */
+unsigned int head;  /* vring descriptor index */
+} VirtIOBlockRequest;
+
+struct VirtIOBlockDataPlane {
+bool started;
+QEMUBH *start_bh;
+QemuThread thread;
+
+BlockDriverState *bs;
+int fd; /* image file descriptor */
+
+VirtIODevice *vdev;
+Vring vring;/* virtqueue vring */
+EventNotifier *guest_notifier;  /* irq */
+
+EventPoll event_poll;   /* event poller */
+EventHandler io_handler;/* Linux AIO completion handler */
+EventHandler notify_handler;/* virtqueue notify handler */
+
+IOQueue ioqueue;/* Linux AIO queue (should really be per
+   dataplane thread) */
+VirtIOBlockRequest requests[REQ_MAX]; /* pool of requests, managed by the
+ queue */
+
+unsigned int num_reqs;
+QemuMutex num_reqs_lock;
+QemuCond no_reqs_cond;
+
+Error *migration_blocker;
+};
+
+/* Raise an interrupt to signal guest, if necessary */
+static void notify_guest(VirtIOBlockDataPlane *s)
+{
+if (!vring_should_notify(s->vdev, &s->vring)) {
+return;
+}
+
+event_notifier_set(s->guest_notifier);
+}
+
+static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque)
+{
+VirtIOBlockDataPlane *s = opaque;
+VirtIOBlockRequest *req = container_of(iocb, VirtIOBlockRequest, iocb);
+struct virtio_blk_inhdr hdr;
+int len;
+
+if (likely(ret >= 0)) {
+hdr.status = VIRTIO_BLK_S_OK;
+len = ret;
+} else {
+hdr.status = VIRTIO_BLK_S_IOERR;
+len = 0;
+}
+
+trace_virtio_blk_data_plane_complete_request(s, req->head, ret);
+
+qemu_iovec_from_buf(req->inhdr, 0, &hdr, sizeof(hdr));
+qemu_iovec_destroy(req->inhdr);
+g_slice_free(QEMUIOVector, req->inhdr);
+
+/* According to the virtio specification len should be the number of bytes
+ * written to, but for virtio-blk it seems to be the number of bytes
+ * transferred plus the status bytes.
+ */
+vring_push(&s->vring, req->head, len + sizeof(hdr));
+
+qemu_mutex_lock(&s->num_reqs_lock);
+if (--s->num_reqs == 0) {
+qemu_cond_broadcast(&s->no_reqs_cond);
+}
+qemu_mutex_unl

Re: [Qemu-devel] [PATCH v2] exec.c: Use tb1->phys_hash_next directly in tb_remove

2012-12-10 Thread Wei-Ren Chen
  [CC'ed qemu-trivial]

  ping?

On Wed, Nov 21, 2012 at 07:52:48AM +0800, 陳韋任 (Wei-Ren Chen) wrote:
>   When tb_remove was first committed at fd6ce8f6, there were three different
> calls that passed different names to offsetof. In the current codebase, the
> other two calls have been replaced with tb_page_remove, so there is no need
> for a general tb_remove. Drop the third parameter and use
> tb1->phys_hash_next directly.
> 
> Signed-off-by: Chen Wei-Ren 
> ---
> 
> v2: Address Peter's comment.
> - move the comment that belonged to tb_remove above tb_phys_invalidate.
> - remove unnecessary bracket.
> 
>  exec.c |   12 +---
>  1 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 8435de0..6343838 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -862,18 +862,16 @@ static void tb_page_check(void)
>  
>  #endif
>  
> -/* invalidate one TB */
> -static inline void tb_remove(TranslationBlock **ptb, TranslationBlock *tb,
> - int next_offset)
> +static inline void tb_hash_remove(TranslationBlock **ptb, TranslationBlock *tb)
>  {
>  TranslationBlock *tb1;
>  for(;;) {
>  tb1 = *ptb;
>  if (tb1 == tb) {
> -*ptb = *(TranslationBlock **)((char *)tb1 + next_offset);
> +*ptb = tb1->phys_hash_next;
>  break;
>  }
> -ptb = (TranslationBlock **)((char *)tb1 + next_offset);
> +ptb = &tb1->phys_hash_next;
>  }
>  }
>  
> @@ -929,6 +927,7 @@ static inline void tb_reset_jump(TranslationBlock *tb, int n)
>  tb_set_jmp_target(tb, n, (uintptr_t)(tb->tc_ptr + tb->tb_next_offset[n]));
>  }
>  
> +/* invalidate one TB */
>  void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>  {
>  CPUArchState *env;
> @@ -940,8 +939,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>  /* remove the TB from the hash list */
>  phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>  h = tb_phys_hash_func(phys_pc);
> -tb_remove(&tb_phys_hash[h], tb,
> -  offsetof(TranslationBlock, phys_hash_next));
> +tb_hash_remove(&tb_phys_hash[h], tb);
>  
>  /* remove the TB from the page list */
>  if (tb->page_addr[0] != page_addr) {
> -- 
> 1.7.3.4
> 
> 
> -- 
> Wei-Ren Chen (陳韋任)
> Computer Systems Lab, Institute of Information Science,
> Academia Sinica, Taiwan (R.O.C.)
> Tel:886-2-2788-3799 #1667
> Homepage: http://people.cs.nctu.edu.tw/~chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



[Qemu-devel] [PATCH v6 09/12] iov: add qemu_iovec_concat_iov()

2012-12-10 Thread Stefan Hajnoczi
The qemu_iovec_concat() function copies a subset of a QEMUIOVector.  The
new qemu_iovec_concat_iov() function does the same for a iov/cnt pair.

It is easy to define qemu_iovec_concat() in terms of
qemu_iovec_concat_iov().  The existing code is mostly unchanged, except
for the assertion src->size >= soffset, which cannot be efficiently
checked upfront on a iov/cnt pair.  Instead we assert upon hitting the
end of src with an unsatisfied soffset.

Signed-off-by: Stefan Hajnoczi 
---
 iov.c | 39 +++
 qemu-common.h |  3 +++
 2 files changed, 30 insertions(+), 12 deletions(-)
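
A minimal sketch of the offset/length walk described above, written
against plain struct iovec so it can be compiled on its own.
example_slice_iov() and its fixed-capacity destination array are
hypothetical stand-ins for qemu_iovec_concat_iov() and the growable
QEMUIOVector; only base/length pairs are copied, never the data.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Append the byte range [soffset, soffset + sbytes) of src to dst.
 * Returns the number of dst elements written.
 */
static unsigned int example_slice_iov(struct iovec *dst,
                                      unsigned int dst_cap,
                                      const struct iovec *src,
                                      unsigned int src_cnt,
                                      size_t soffset, size_t sbytes)
{
    unsigned int i, n = 0;
    size_t done = 0;

    for (i = 0; done < sbytes && i < src_cnt; i++) {
        if (soffset < src[i].iov_len) {
            size_t avail = src[i].iov_len - soffset;
            size_t left = sbytes - done;
            size_t len = avail < left ? avail : left;

            assert(n < dst_cap);
            dst[n].iov_base = (char *)src[i].iov_base + soffset;
            dst[n].iov_len = len;
            n++;
            done += len;
            soffset = 0;    /* only the first copied element is offset */
        } else {
            soffset -= src[i].iov_len;  /* skip this element entirely */
        }
    }
    assert(soffset == 0);   /* offset beyond end of src */
    return n;
}
```

As in the patch, the bounds check on soffset can only fire after the
walk, because the total size of a bare iov/cnt pair is not known up
front.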

diff --git a/iov.c b/iov.c
index d3b19e3..0feab8e 100644
--- a/iov.c
+++ b/iov.c
@@ -289,34 +289,49 @@ void qemu_iovec_add(QEMUIOVector *qiov, void *base, size_t len)
 }
 
 /*
- * Concatenates (partial) iovecs from src to the end of dst.
+ * Concatenates (partial) iovecs from src_iov to the end of dst.
  * It starts copying after skipping `soffset' bytes at the
  * beginning of src and adds individual vectors from src to
  * dst copies up to `sbytes' bytes total, or up to the end
- * of src if it comes first.  This way, it is okay to specify
+ * of src_iov if it comes first.  This way, it is okay to specify
  * very large value for `sbytes' to indicate "up to the end
  * of src".
  * Only vector pointers are processed, not the actual data buffers.
  */
-void qemu_iovec_concat(QEMUIOVector *dst,
-   QEMUIOVector *src, size_t soffset, size_t sbytes)
+void qemu_iovec_concat_iov(QEMUIOVector *dst,
+   struct iovec *src_iov, unsigned int src_cnt,
+   size_t soffset, size_t sbytes)
 {
 int i;
 size_t done;
-struct iovec *siov = src->iov;
 assert(dst->nalloc != -1);
-assert(src->size >= soffset);
-for (i = 0, done = 0; done < sbytes && i < src->niov; i++) {
-if (soffset < siov[i].iov_len) {
-size_t len = MIN(siov[i].iov_len - soffset, sbytes - done);
-qemu_iovec_add(dst, siov[i].iov_base + soffset, len);
+for (i = 0, done = 0; done < sbytes && i < src_cnt; i++) {
+if (soffset < src_iov[i].iov_len) {
+size_t len = MIN(src_iov[i].iov_len - soffset, sbytes - done);
+qemu_iovec_add(dst, src_iov[i].iov_base + soffset, len);
 done += len;
 soffset = 0;
 } else {
-soffset -= siov[i].iov_len;
+soffset -= src_iov[i].iov_len;
 }
 }
-/* return done; */
+assert(soffset == 0); /* offset beyond end of src */
+}
+
+/*
+ * Concatenates (partial) iovecs from src to the end of dst.
+ * It starts copying after skipping `soffset' bytes at the
+ * beginning of src and adds individual vectors from src to
+ * dst copies up to `sbytes' bytes total, or up to the end
+ * of src if it comes first.  This way, it is okay to specify
+ * very large value for `sbytes' to indicate "up to the end
+ * of src".
+ * Only vector pointers are processed, not the actual data buffers.
+ */
+void qemu_iovec_concat(QEMUIOVector *dst,
+   QEMUIOVector *src, size_t soffset, size_t sbytes)
+{
+qemu_iovec_concat_iov(dst, src->iov, src->niov, soffset, sbytes);
 }
 
 void qemu_iovec_destroy(QEMUIOVector *qiov)
diff --git a/qemu-common.h b/qemu-common.h
index cef264c..4cc63e1 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -379,6 +379,9 @@ void qemu_iovec_init_external(QEMUIOVector *qiov, struct iovec *iov, int niov);
 void qemu_iovec_add(QEMUIOVector *qiov, void *base, size_t len);
 void qemu_iovec_concat(QEMUIOVector *dst,
QEMUIOVector *src, size_t soffset, size_t sbytes);
+void qemu_iovec_concat_iov(QEMUIOVector *dst,
+   struct iovec *src_iov, unsigned int src_cnt,
+   size_t soffset, size_t sbytes);
 void qemu_iovec_destroy(QEMUIOVector *qiov);
 void qemu_iovec_reset(QEMUIOVector *qiov);
 size_t qemu_iovec_to_buf(QEMUIOVector *qiov, size_t offset,
-- 
1.8.0.1




[Qemu-devel] [PATCH v6 10/12] virtio-blk: restore VirtIOBlkConf->config_wce flag

2012-12-10 Thread Stefan Hajnoczi
Two slightly different versions of a patch to conditionally set
VIRTIO_BLK_F_CONFIG_WCE through the "config-wce" qdev property have been
applied (ea776abca and eec7f96c2).  David Gibson noticed that the
"config-wce" property is broken as a result and fixed it recently.

The fix sets the host_features VIRTIO_BLK_F_CONFIG_WCE bit from a qdev
property.  Unfortunately, the virtio device then has no chance to test
for the presence of the feature bit during virtio_blk_init().

Therefore, reinstate the VirtIOBlkConf->config_wce flag.  Drop the
duplicate qdev property to set the host_features bit.  The
VirtIOBlkConf->config_wce flag will be used by virtio-blk-data-plane in
a later patch.

Signed-off-by: Stefan Hajnoczi 
---
 hw/virtio-blk.c | 3 +++
 hw/virtio-blk.h | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)
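
A minimal sketch of how a config flag turns into a feature bit, and why
the feature constants have to be shifted before use: they are bit
numbers, not masks.  The EXAMPLE_F_* values and function names are
illustrative only; the real definitions live in hw/virtio-blk.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit numbers, not taken from the QEMU headers. */
enum {
    EXAMPLE_F_WCE = 9,
    EXAMPLE_F_CONFIG_WCE = 11,
};

/* Test a feature by bit number. */
static bool example_has_feature(uint32_t features, unsigned int bit)
{
    assert(bit < 32);
    return (features & (UINT32_C(1) << bit)) != 0;
}

/* Build the feature word from per-device configuration flags. */
static uint32_t example_get_features(bool config_wce, bool write_cache)
{
    uint32_t features = 0;

    if (config_wce) {
        features |= UINT32_C(1) << EXAMPLE_F_CONFIG_WCE;
    }
    if (write_cache) {
        features |= UINT32_C(1) << EXAMPLE_F_WCE;
    }
    return features;
}
```

Masking with the bare bit number instead of the shifted value compiles
without warnings but tests unrelated low bits, so the check silently
gives the wrong result.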

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index e25cc96..fabf387 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -524,6 +524,9 @@ static uint32_t virtio_blk_get_features(VirtIODevice *vdev, uint32_t features)
 features |= (1 << VIRTIO_BLK_F_BLK_SIZE);
 features |= (1 << VIRTIO_BLK_F_SCSI);
 
+if (s->blk->config_wce) {
+features |= (1 << VIRTIO_BLK_F_CONFIG_WCE);
+}
 if (bdrv_enable_write_cache(s->bs))
 features |= (1 << VIRTIO_BLK_F_WCE);
 
diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h
index 651a000..454f445 100644
--- a/hw/virtio-blk.h
+++ b/hw/virtio-blk.h
@@ -104,10 +104,10 @@ struct VirtIOBlkConf
 BlockConf conf;
 char *serial;
 uint32_t scsi;
+uint32_t config_wce;
 };
 
 #define DEFINE_VIRTIO_BLK_FEATURES(_state, _field) \
-DEFINE_VIRTIO_COMMON_FEATURES(_state, _field), \
-DEFINE_PROP_BIT("config-wce", _state, _field, VIRTIO_BLK_F_CONFIG_WCE, true)
+DEFINE_VIRTIO_COMMON_FEATURES(_state, _field)
 
 #endif
-- 
1.8.0.1




[Qemu-devel] [PATCH v6 03/12] dataplane: add host memory mapping code

2012-12-10 Thread Stefan Hajnoczi
The data plane thread needs to map guest physical addresses to host
pointers.  Normally this is done with cpu_physical_memory_map() but the
function assumes the global mutex is held.  The data plane thread does
not touch the global mutex and therefore needs a thread-safe memory
mapping mechanism.

Hostmem registers a MemoryListener similar to how vhost collects and
pushes memory region information into the kernel.  There is a
fine-grained lock on the regions list which is held during lookup and
when installing a new regions list.

When the physical memory map changes the MemoryListener callbacks are
invoked.  They build up a new list of memory regions which is finally
installed when the list has been completed.

Signed-off-by: Stefan Hajnoczi 
---
 hw/Makefile.objs   |   2 +-
 hw/dataplane/Makefile.objs |   3 +
 hw/dataplane/hostmem.c | 176 +
 hw/dataplane/hostmem.h |  57 +++
 4 files changed, 237 insertions(+), 1 deletion(-)
 create mode 100644 hw/dataplane/Makefile.objs
 create mode 100644 hw/dataplane/hostmem.c
 create mode 100644 hw/dataplane/hostmem.h
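
A minimal sketch of the lookup described above: a sorted array of
non-overlapping regions searched with bsearch(3), using a comparator
that treats the key as a point and each element as a range.  The
ExampleRegion type and example_ functions are hypothetical, and the
per-list mutex that the patch takes around the search is left out to
keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ExampleRegion {
    uint64_t guest_addr;   /* start of the range in guest physical space */
    uint64_t size;         /* length of the range in bytes */
    char *host_addr;       /* where the range is mapped in this process */
    bool readonly;
} ExampleRegion;

/* Compare a guest physical address against one region's range. */
static int example_region_cmp(const void *key_, const void *region_)
{
    uint64_t phys = *(const uint64_t *)key_;
    const ExampleRegion *r = region_;

    if (phys < r->guest_addr) {
        return -1;
    } else if (phys >= r->guest_addr + r->size) {
        return 1;
    }
    return 0;
}

/* regions must be sorted by guest_addr and must not overlap. */
static void *example_lookup(const ExampleRegion *regions, size_t n,
                            uint64_t phys, uint64_t len, bool is_write)
{
    const ExampleRegion *r;
    uint64_t offset;

    assert(regions != NULL || n == 0);
    r = bsearch(&phys, regions, n, sizeof(regions[0]), example_region_cmp);
    if (!r || (is_write && r->readonly)) {
        return NULL;
    }
    offset = phys - r->guest_addr;
    if (offset + len > r->size) {
        return NULL;    /* access would run past the end of the region */
    }
    return r->host_addr + offset;
}
```

An access that spans two adjacent regions is rejected rather than
stitched together, which matches the single-region check in
hostmem_lookup().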

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d581d8d..cec84bc 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y = usb/ ide/
+common-obj-y = usb/ ide/ dataplane/
 common-obj-y += loader.o
 common-obj-$(CONFIG_VIRTIO) += virtio-console.o
 common-obj-$(CONFIG_VIRTIO) += virtio-rng.o
diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
new file mode 100644
index 000..8c8dea1
--- /dev/null
+++ b/hw/dataplane/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_VIRTIO), y)
+common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o
+endif
diff --git a/hw/dataplane/hostmem.c b/hw/dataplane/hostmem.c
new file mode 100644
index 000..d5e1070
--- /dev/null
+++ b/hw/dataplane/hostmem.c
@@ -0,0 +1,176 @@
+/*
+ * Thread-safe guest to host memory mapping
+ *
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "exec-memory.h"
+#include "hostmem.h"
+
+static int hostmem_lookup_cmp(const void *phys_, const void *region_)
+{
+hwaddr phys = *(const hwaddr *)phys_;
+const HostmemRegion *region = region_;
+
+if (phys < region->guest_addr) {
+return -1;
+} else if (phys >= region->guest_addr + region->size) {
+return 1;
+} else {
+return 0;
+}
+}
+
+/**
+ * Map guest physical address to host pointer
+ */
+void *hostmem_lookup(Hostmem *hostmem, hwaddr phys, hwaddr len, bool is_write)
+{
+HostmemRegion *region;
+void *host_addr = NULL;
+hwaddr offset_within_region;
+
+qemu_mutex_lock(&hostmem->current_regions_lock);
+region = bsearch(&phys, hostmem->current_regions,
+ hostmem->num_current_regions,
+ sizeof(hostmem->current_regions[0]),
+ hostmem_lookup_cmp);
+if (!region) {
+goto out;
+}
+if (is_write && region->readonly) {
+goto out;
+}
+offset_within_region = phys - region->guest_addr;
+if (offset_within_region + len <= region->size) {
+host_addr = region->host_addr + offset_within_region;
+}
+out:
+qemu_mutex_unlock(&hostmem->current_regions_lock);
+
+return host_addr;
+}
+
+/**
+ * Install new regions list
+ */
+static void hostmem_listener_commit(MemoryListener *listener)
+{
+Hostmem *hostmem = container_of(listener, Hostmem, listener);
+
+qemu_mutex_lock(&hostmem->current_regions_lock);
+g_free(hostmem->current_regions);
+hostmem->current_regions = hostmem->new_regions;
+hostmem->num_current_regions = hostmem->num_new_regions;
+qemu_mutex_unlock(&hostmem->current_regions_lock);
+
+/* Reset new regions list */
+hostmem->new_regions = NULL;
+hostmem->num_new_regions = 0;
+}
+
+/**
+ * Add a MemoryRegionSection to the new regions list
+ */
+static void hostmem_append_new_region(Hostmem *hostmem,
+  MemoryRegionSection *section)
+{
+void *ram_ptr = memory_region_get_ram_ptr(section->mr);
+size_t num = hostmem->num_new_regions;
+size_t new_size = (num + 1) * sizeof(hostmem->new_regions[0]);
+
+hostmem->new_regions = g_realloc(hostmem->new_regions, new_size);
+hostmem->new_regions[num] = (HostmemRegion){
+.host_addr = ram_ptr + section->offset_within_region,
+.guest_addr = section->offset_within_address_space,
+.size = section->size,
+.readonly = section->readonly,
+};
+hostmem->num_new_regions++;
+}
+
+static void hostmem_listener_append_region(MemoryListener *listener,
+   MemoryRegionSection *section)
+{
+Hostmem *hostmem = container_of(listener, Hostmem, listener);
+
+/* Ignore non-RAM regio

[Qemu-devel] KVM call agenda for 2012-12-11

2012-12-10 Thread Juan Quintela

Hi

Please send in any agenda topics you are interested in.

Later, Juan.



[Qemu-devel] [PATCH v6 06/12] dataplane: add Linux AIO request queue

2012-12-10 Thread Stefan Hajnoczi
The IOQueue has a pool of iocb structs and a function to add new
read/write requests.  Multiple requests can be added before calling the
submit function to actually tell the host kernel to begin I/O.  This
allows callers to batch requests and submit them in one go.

The actual I/O is performed using Linux AIO.
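
The batching pattern described above can be sketched without libaio. This is a toy model, not the code in this patch: ToyReq stands in for struct iocb, and all toy_* names are hypothetical. It only shows how a freelist and a pending queue let callers add several requests and hand them over in one submit call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct iocb; the real queue stores libaio iocbs. */
typedef struct {
    long long offset;
    int is_read;
} ToyReq;

typedef struct {
    unsigned int max_reqs;      /* capacity of freelist and queue */
    ToyReq **freelist;          /* request slots that are not in use */
    unsigned int freelist_idx;
    ToyReq **queue;             /* requests added but not yet submitted */
    unsigned int queue_idx;
} ToyQueue;

/* Allocate both arrays and place every slot of the pool on the freelist. */
static void toy_init(ToyQueue *q, ToyReq *pool, unsigned int max_reqs)
{
    unsigned int i;

    q->max_reqs = max_reqs;
    q->freelist = calloc(max_reqs, sizeof(q->freelist[0]));
    q->queue = calloc(max_reqs, sizeof(q->queue[0]));
    assert(q->freelist && q->queue);
    q->freelist_idx = 0;
    q->queue_idx = 0;
    for (i = 0; i < max_reqs; i++) {
        q->freelist[q->freelist_idx++] = &pool[i];
    }
}

/* Take a free slot, fill it in, and append it to the pending batch. */
static ToyReq *toy_add(ToyQueue *q, int is_read, long long offset)
{
    ToyReq *req;

    assert(q->freelist_idx != 0);
    req = q->freelist[--q->freelist_idx];
    req->is_read = is_read;
    req->offset = offset;
    q->queue[q->queue_idx++] = req;
    return req;
}

/* Hand over the whole batch; returns how many requests were in it. */
static unsigned int toy_submit(ToyQueue *q)
{
    unsigned int n = q->queue_idx;

    q->queue_idx = 0; /* reset, the batch now belongs to the "kernel" */
    return n;
}

/* Completion path: return the slot to the freelist. */
static void toy_complete(ToyQueue *q, ToyReq *req)
{
    assert(q->freelist_idx != q->max_reqs);
    q->freelist[q->freelist_idx++] = req;
}

static void toy_cleanup(ToyQueue *q)
{
    free(q->freelist);
    free(q->queue);
}
```

A caller would toy_add() several requests, call toy_submit() once, and toy_complete() each request as its completion arrives.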

Signed-off-by: Stefan Hajnoczi 
---
 hw/dataplane/Makefile.objs |   2 +-
 hw/dataplane/ioq.c | 117 +
 hw/dataplane/ioq.h |  57 ++
 3 files changed, 175 insertions(+), 1 deletion(-)
 create mode 100644 hw/dataplane/ioq.c
 create mode 100644 hw/dataplane/ioq.h

diff --git a/hw/dataplane/Makefile.objs b/hw/dataplane/Makefile.objs
index e26bd7d..abd408f 100644
--- a/hw/dataplane/Makefile.objs
+++ b/hw/dataplane/Makefile.objs
@@ -1,3 +1,3 @@
 ifeq ($(CONFIG_VIRTIO), y)
-common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o
+common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += hostmem.o vring.o event-poll.o ioq.o
 endif
diff --git a/hw/dataplane/ioq.c b/hw/dataplane/ioq.c
new file mode 100644
index 0000000..0c9f5c4
--- /dev/null
+++ b/hw/dataplane/ioq.c
@@ -0,0 +1,117 @@
+/*
+ * Linux AIO request queue
+ *
+ * Copyright 2012 IBM, Corp.
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/dataplane/ioq.h"
+
+void ioq_init(IOQueue *ioq, int fd, unsigned int max_reqs)
+{
+int rc;
+
+ioq->fd = fd;
+ioq->max_reqs = max_reqs;
+
+memset(&ioq->io_ctx, 0, sizeof ioq->io_ctx);
+rc = io_setup(max_reqs, &ioq->io_ctx);
+if (rc != 0) {
+fprintf(stderr, "ioq io_setup failed %d\n", rc);
+exit(1);
+}
+
+rc = event_notifier_init(&ioq->io_notifier, 0);
+if (rc != 0) {
+fprintf(stderr, "ioq io event notifier creation failed %d\n", rc);
+exit(1);
+}
+
+ioq->freelist = g_malloc0(sizeof ioq->freelist[0] * max_reqs);
+ioq->freelist_idx = 0;
+
+ioq->queue = g_malloc0(sizeof ioq->queue[0] * max_reqs);
+ioq->queue_idx = 0;
+}
+
+void ioq_cleanup(IOQueue *ioq)
+{
+g_free(ioq->freelist);
+g_free(ioq->queue);
+
+event_notifier_cleanup(&ioq->io_notifier);
+io_destroy(ioq->io_ctx);
+}
+
+EventNotifier *ioq_get_notifier(IOQueue *ioq)
+{
+return &ioq->io_notifier;
+}
+
+struct iocb *ioq_get_iocb(IOQueue *ioq)
+{
+/* Underflow cannot happen since ioq is sized for max_reqs */
+assert(ioq->freelist_idx != 0);
+
+struct iocb *iocb = ioq->freelist[--ioq->freelist_idx];
+ioq->queue[ioq->queue_idx++] = iocb;
+return iocb;
+}
+
+void ioq_put_iocb(IOQueue *ioq, struct iocb *iocb)
+{
+/* Overflow cannot happen since ioq is sized for max_reqs */
+assert(ioq->freelist_idx != ioq->max_reqs);
+
+ioq->freelist[ioq->freelist_idx++] = iocb;
+}
+
+struct iocb *ioq_rdwr(IOQueue *ioq, bool read, struct iovec *iov,
+  unsigned int count, long long offset)
+{
+struct iocb *iocb = ioq_get_iocb(ioq);
+
+if (read) {
+io_prep_preadv(iocb, ioq->fd, iov, count, offset);
+} else {
+io_prep_pwritev(iocb, ioq->fd, iov, count, offset);
+}
+io_set_eventfd(iocb, event_notifier_get_fd(&ioq->io_notifier));
+return iocb;
+}
+
+int ioq_submit(IOQueue *ioq)
+{
+int rc = io_submit(ioq->io_ctx, ioq->queue_idx, ioq->queue);
+ioq->queue_idx = 0; /* reset */
+return rc;
+}
+
+int ioq_run_completion(IOQueue *ioq, IOQueueCompletion *completion,
+   void *opaque)
+{
+struct io_event events[ioq->max_reqs];
+int nevents, i;
+
+do {
+nevents = io_getevents(ioq->io_ctx, 0, ioq->max_reqs, events, NULL);
+} while (nevents < 0 && errno == EINTR);
+if (nevents < 0) {
+return nevents;
+}
+
+for (i = 0; i < nevents; i++) {
+ssize_t ret = ((uint64_t)events[i].res2 << 32) | events[i].res;
+
+completion(events[i].obj, ret, opaque);
+ioq_put_iocb(ioq, events[i].obj);
+}
+return nevents;
+}
diff --git a/hw/dataplane/ioq.h b/hw/dataplane/ioq.h
new file mode 100644
index 0000000..890db22
--- /dev/null
+++ b/hw/dataplane/ioq.h
@@ -0,0 +1,57 @@
+/*
+ * Linux AIO request queue
+ *
+ * Copyright 2012 IBM, Corp.
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *   Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef IOQ_H
+#define IOQ_H
+
+#include <libaio.h>
+#include "event_notifier.h"
+
+typedef struct {
+int fd; /* file descriptor */
+unsigned int max_reqs;  /* max length of freelist and queue */
+
+io_context_t io_ctx;/* Linux AIO context */
+EventNotifier io_notifier;  /* Linux AIO eventfd */
+
+/* Requests

Re: [Qemu-devel] [PATCH] NVMe: Initial commit to add an NVM Express device

2012-12-10 Thread Stefan Hajnoczi
On Mon, Dec 10, 2012 at 1:36 PM, Kevin Wolf  wrote:
> Hi Keith,
>
> Am 08.12.2012 20:20, schrieb Keith Busch:
>>> IIUC from the website above, NVMe is to be used with SSDs?  It would be
>>> good to add to the commit message how to actually use the device
>>> command-line-wise beyond the obvious -device nvme: I did not spot on
>>> brief sight where you expose a bus to add drives (nor a special IF_*
>>> interface type to assign to a drive), so others might wonder as well.
>>
>> Actually the nvme device _is_ the SSD. The emulated controller here
>> creates files to use for its backing storage so you don't add
>> additional drives, if that makes sense.
>
> I think the device would be much more useful if you could make it use
> the qemu block layer instead of implementing your own functions for only
> raw images and only with a given magic file name.

Quick pointers to get started on Kevin's suggestion:

bdrv_aio_readv(), bdrv_aio_writev(), bdrv_aio_flush(), and
bdrv_aio_discard() provide the block device operations that emulated
storage controllers use.

Take a look at hw/virtio-blk.c to see how to take a -device
nvme,drive= (internally this is your BlockDriverState*).

Stefan



Re: [Qemu-devel] [PATCH] virtio: verify that all outstanding buffers are flushed (was Re: vmstate conversion for virtio?)

2012-12-10 Thread Anthony Liguori
Rusty Russell  writes:

> "Michael S. Tsirkin"  writes:
>
> No, because I don't understand it.  Is it true for the case of
> virtio_blk, which has outstanding requests?
>
>>> Currently we dump a massive structure; it's inelegant at the very
>>> least.

Inelegant is a kind word..

There's a couple things to consider though which is why this code hasn't
changed so far.

1) We're writing native endian values to the wire.  This is seriously
   broken.  Just imagine trying to migrate from qemu-system-i386 on a
   big endian box to a little endian box.

2) Fixing (1) either means (a) breaking migration across the board
   gracefully or (b) breaking migration on [big|little] endian hosts in
   an extremely ungraceful way.

3) We send a ton of crap over the wire that is unnecessary, but we need
   to maintain it.

I wrote up a patch series to try to improve the situation that I'll send
out.  I haven't gotten around to testing it with an older version of
QEMU yet.

I went for 2.b and chose to break big endian hosts.
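
To make point (1) concrete, here is a minimal sketch of the difference between copying a value in native layout and writing it in an explicit byte order. The helper names are hypothetical and are not QEMU functions; the real routines write to a QEMUFile rather than a plain buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Native layout: the byte order on the wire depends on the sending host. */
static void put_native32(uint8_t *buf, uint32_t v)
{
    memcpy(buf, &v, sizeof(v));
}

/* Explicit little endian: the same bytes are produced on every host. */
static void put_le32(uint8_t *buf, uint32_t v)
{
    buf[0] = v & 0xff;
    buf[1] = (v >> 8) & 0xff;
    buf[2] = (v >> 16) & 0xff;
    buf[3] = (v >> 24) & 0xff;
}

/* Decode a little endian value regardless of host byte order. */
static uint32_t get_le32(const uint8_t *buf)
{
    return (uint32_t)buf[0] |
           ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}

/* Returns 1 when this host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

The two encodings only agree on a little endian host, which is why a stream written with the native copy cannot be read reliably on a host of the other byte order.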

>>> 
>>> Cheers,
>>> Rusty.
>>
>> Hmm not sure what you refer to. I see this per ring:
>>
>> qemu_put_be32(f, vdev->vq[i].vring.num);
>> qemu_put_be64(f, vdev->vq[i].pa);
>> qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
>>
>> Looks like there's no way around saving these fields.

Correct.

Regards,

Anthony Liguori

>
> Not what I'm referring to.  See here:
>
> virtio.h defines a 48k structure:
>
> #define VIRTQUEUE_MAX_SIZE 1024
>
> typedef struct VirtQueueElement
> {
> unsigned int index;
> unsigned int out_num;
> unsigned int in_num;
> hwaddr in_addr[VIRTQUEUE_MAX_SIZE];
> hwaddr out_addr[VIRTQUEUE_MAX_SIZE];
> struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
> struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
> } VirtQueueElement;
>
> virtio-blk.c uses it in its request struct:
>
> typedef struct VirtIOBlockReq
> {
> VirtIOBlock *dev;
> VirtQueueElement elem;
> struct virtio_blk_inhdr *in;
> struct virtio_blk_outhdr *out;
> struct virtio_scsi_inhdr *scsi;
> QEMUIOVector qiov;
> struct VirtIOBlockReq *next;
> BlockAcctCookie acct;
> } VirtIOBlockReq;
>
> ... and saves it in virtio_blk_save:
>
> static void virtio_blk_save(QEMUFile *f, void *opaque)
> {
> VirtIOBlock *s = opaque;
> VirtIOBlockReq *req = s->rq;
>
> virtio_save(&s->vdev, f);
> 
> while (req) {
> qemu_put_sbyte(f, 1);
> qemu_put_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
> req = req->next;
> }
> qemu_put_sbyte(f, 0);
> }
>
> Cheers,
> Rusty.



Re: [Qemu-devel] [PATCH_v3] add target-openrisc floating point exception

2012-12-10 Thread Wei-Ren Chen
Hi Feng,

  One question,

> @@ -25,5 +25,5 @@  void HELPER(exception)(CPUOpenRISCState *env,
> uint32_t excp)
>  {
>  OpenRISCCPU *cpu = OPENRISC_CPU(ENV_GET_CPU(env));
>  
> -raise_exception(cpu, excp);
> +do_raise_exception(cpu, excp, 0);
>  }

  Shouldn't the above helper function be called from the code cache?
The 3rd argument of do_raise_exception indicates whether the exception
comes from the code cache or not, iiuc. Is it correct to put zero here?

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] [Bug 1087974] [NEW] [regression] vnc tight png produces garbled output

2012-12-10 Thread Stefan Hajnoczi
On Sat, Dec 8, 2012 at 12:46 PM, Tim Hardeck  wrote:
> Public bug reported:
>
> VNC Tight PNG compression worked fine two or three months ago but doesn't
> anymore. Now when Tight PNG is used, parts of the desktop are shown but they
> are scrambled together.
> I have always tested this feature against QEMU git with noVNC by only
> allowing Tight PNG compression.

Hi Tim,
If you have a few minutes please use git-bisect(1) to identify the
commit that causes the regression.

The rough steps are:
1. Verify that qemu.git/master is broken and find an older commit
where it works.
2. Use git-bisect(1) to binary search the commit history between these
two points - it will leave you with the commit that caused the
regression.

Here are some quick links to get you started:
http://git-scm.com/book/en/Git-Tools-Debugging-with-Git#Binary-Search
http://blog.evan.pro/getting-started-with-git-bisect-in-60-seconds

Stefan



[Qemu-devel] [RFC 0/4] virtio: stabilize migration format

2012-12-10 Thread Anthony Liguori
This series replaces:

qemu_put_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));

With code that properly saves out each element of the structure using
a well defined endian format.  Migration is broken today from big endian to
little endian hosts.

There's no way to fix this problem without bumping the migration version
number and that's exactly what we do here.  By bumping the migration version
number, we do break new->old migration but that's unavoidable right now.

In order to support old->new, we assume that all incoming data is in little
endian.  The final patch adds a check to the load routines to fail old->new
on big endian hosts where this may not have been true.

I've not tested this thoroughly enough to apply yet but wanted to share.

---
 hw/virtio-blk.c|   22 +---
 hw/virtio-serial-bus.c |   19 +++
 hw/virtio.c|  123 +
 hw/virtio.h|4 +
 qemu-file.h|7 ++
 savevm.c   |   45 +
 6 files changed, 201 insertions(+), 19 deletions(-)




[Qemu-devel] [PATCH 2/4] virtio: add wrapper for saving/restoring virtqueue elements

2012-12-10 Thread Anthony Liguori
Putting raw structures on the wire is bad news.  Add a wrapper and use it.

Note that in virtio-serial-bus, we were mapping both the in and out vectors as
writable.  This is a bug that is fixed by this change.  I checked the revision
history, it has been there since the code was first added and does not appear
to be intentional.

Signed-off-by: Anthony Liguori 
---
 hw/virtio-blk.c|  9 ++---
 hw/virtio-serial-bus.c | 10 ++
 hw/virtio.c| 13 +
 hw/virtio.h|  4 
 4 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index e25cc96..7ab174f 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -555,7 +555,7 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
 
 while (req) {
 qemu_put_sbyte(f, 1);
-qemu_put_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
+virtio_put_virt_queue_element(f, &req->elem);
 req = req->next;
 }
 qemu_put_sbyte(f, 0);
@@ -576,14 +576,9 @@ static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
 
 while (qemu_get_sbyte(f)) {
 VirtIOBlockReq *req = virtio_blk_alloc_request(s);
-qemu_get_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
+virtio_get_virt_queue_element(f, &req->elem);
 req->next = s->rq;
 s->rq = req;
-
-virtqueue_map_sg(req->elem.in_sg, req->elem.in_addr,
-req->elem.in_num, 1);
-virtqueue_map_sg(req->elem.out_sg, req->elem.out_addr,
-req->elem.out_num, 0);
 }
 
 return 0;
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 155da58..aa1ded0 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -629,8 +629,7 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
 qemu_put_be32s(f, &port->iov_idx);
 qemu_put_be64s(f, &port->iov_offset);
 
-qemu_put_buffer(f, (unsigned char *)&port->elem,
-sizeof(port->elem));
+virtio_put_virt_queue_element(f, &port->elem);
 }
 }
 }
@@ -731,12 +730,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
 qemu_get_be32s(f, &port->iov_idx);
 qemu_get_be64s(f, &port->iov_offset);
 
-qemu_get_buffer(f, (unsigned char *)&port->elem,
-sizeof(port->elem));
-virtqueue_map_sg(port->elem.in_sg, port->elem.in_addr,
- port->elem.in_num, 1);
-virtqueue_map_sg(port->elem.out_sg, port->elem.out_addr,
- port->elem.out_num, 1);
+virtio_get_virt_queue_element(f, &port->elem);
 
 /*
  *  Port was throttled on source machine.  Let's
diff --git a/hw/virtio.c b/hw/virtio.c
index f40a8c5..8eb8f69 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -875,6 +875,19 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
 return 0;
 }
 
+void virtio_put_virt_queue_element(QEMUFile *f, const VirtQueueElement *elem)
+{
+qemu_put_buffer(f, (unsigned char*)elem, sizeof(*elem));
+}
+
+void virtio_get_virt_queue_element(QEMUFile *f, VirtQueueElement *elem)
+{
+qemu_get_buffer(f, (unsigned char *)elem, sizeof(*elem));
+
+virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
+virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0);
+}
+
 void virtio_cleanup(VirtIODevice *vdev)
 {
 qemu_del_vm_change_state_handler(vdev->vmstate);
diff --git a/hw/virtio.h b/hw/virtio.h
index 7c17f7b..4af8239 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -159,6 +159,10 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f);
 
 int virtio_load(VirtIODevice *vdev, QEMUFile *f);
 
+void virtio_put_virt_queue_element(QEMUFile *f, const VirtQueueElement *elem);
+
+void virtio_get_virt_queue_element(QEMUFile *f, VirtQueueElement *elem);
+
 void virtio_cleanup(VirtIODevice *vdev);
 
 void virtio_notify_config(VirtIODevice *vdev);
-- 
1.8.0




[Qemu-devel] [PATCH 3/4] virtio: modify savevm to have a stable wire format

2012-12-10 Thread Anthony Liguori
We were memcpy()'ing a structure to the wire :-/  Since savevm really
only works on x86 today, let's just declare that this element is sent
over the wire as a little endian value in order to fix the byte order.

Unfortunately, we also send raw pointers and size_t which are going
to be different values on a 32-bit vs. 64-bit QEMU so we need to also
deal with that case.

A lot of values that should have been previously ignored are now sent
as 0 and ignored on the receive side too.
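
As a quick check on the layout this patch writes, the per-element wire size can be worked out by hand: four 32-bit fields (index, out_num, in_num, padding), two arrays of 64-bit addresses, and two arrays of iovec slots that each take two host longs. The sketch below does that arithmetic; elem_wire_size() is a hypothetical helper, not part of the patch, and it assumes the field order shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

#define VIRTQUEUE_MAX_SIZE 1024

/*
 * long_bytes is 4 or 8, mirroring HOST_LONG_BITS / 8 in the patch.
 * Returns the number of bytes written for one VirtQueueElement.
 */
static size_t elem_wire_size(size_t long_bytes)
{
    /* index, out_num, in_num and the padding word, le32 each */
    size_t header = 4 * 4;
    /* in_addr[] and out_addr[], le64 per entry */
    size_t addrs = 2 * VIRTQUEUE_MAX_SIZE * 8;
    /* in_sg[] and out_sg[]: a zeroed iov_base slot plus iov_len per entry */
    size_t sgs = 2 * VIRTQUEUE_MAX_SIZE * 2 * long_bytes;

    return header + addrs + sgs;
}
```

With 8-byte longs this gives 49168 bytes, which matches the roughly 48k structure mentioned earlier in the thread; with 4-byte longs it gives 32784 bytes.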

Signed-off-by: Anthony Liguori 
---
 hw/virtio.c | 114 ++--
 1 file changed, 112 insertions(+), 2 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index 8eb8f69..0108d62 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -875,14 +875,124 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
 return 0;
 }
 
+/* We used to memcpy the structure to the wire so that's the reason for all of
+ * this ugliness */
+
+#if HOST_LONG_BITS == 32
+static uint32_t virtio_get_long(QEMUFile *f)
+{
+return qemu_get_le32(f);
+}
+
+static void virtio_put_long(QEMUFile *f, uint32_t value)
+{
+qemu_put_le32(f, value);
+}
+#elif HOST_LONG_BITS == 64
+static uint64_t virtio_get_long(QEMUFile *f)
+{
+return qemu_get_le64(f);
+}
+
+static void virtio_put_long(QEMUFile *f, uint64_t value)
+{
+qemu_put_le64(f, value);
+}
+#else
+#error Invalid HOST_LONG_BITS
+#endif
+
 void virtio_put_virt_queue_element(QEMUFile *f, const VirtQueueElement *elem)
 {
-qemu_put_buffer(f, (unsigned char*)elem, sizeof(*elem));
+int i;
+
+qemu_put_le32(f, elem->index);
+qemu_put_le32(f, elem->out_num);
+qemu_put_le32(f, elem->in_num);
+
+qemu_put_le32(f, 0); /* padding for structure */
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+if (i < elem->in_num) {
+qemu_put_le64(f, elem->in_addr[i]);
+} else {
+qemu_put_le64(f, 0);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+if (i < elem->out_num) {
+qemu_put_le64(f, elem->out_addr[i]);
+} else {
+qemu_put_le64(f, 0);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+virtio_put_long(f, 0);
+if (i < elem->in_num) {
+virtio_put_long(f, elem->in_sg[i].iov_len);
+} else {
+virtio_put_long(f, 0);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+virtio_put_long(f, 0);
+if (i < elem->out_num) {
+virtio_put_long(f, elem->out_sg[i].iov_len);
+} else {
+virtio_put_long(f, 0);
+}
+}
 }
 
 void virtio_get_virt_queue_element(QEMUFile *f, VirtQueueElement *elem)
 {
-qemu_get_buffer(f, (unsigned char *)elem, sizeof(*elem));
+int i;
+
+elem->index = qemu_get_le32(f);
+elem->out_num = qemu_get_le32(f);
+elem->in_num = qemu_get_le32(f);
+
+assert(elem->out_num <= VIRTQUEUE_MAX_SIZE &&
+   elem->in_num <= VIRTQUEUE_MAX_SIZE);
+
+qemu_get_le32(f); /* padding for structure */
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+if (i < elem->in_num) {
+elem->in_addr[i] = qemu_get_le64(f);
+} else {
+qemu_get_le64(f);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+if (i < elem->out_num) {
+elem->out_addr[i] = qemu_get_le64(f);
+} else {
+qemu_get_le64(f);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+virtio_get_long(f);
+if (i < elem->in_num) {
+elem->in_sg[i].iov_len = virtio_get_long(f);
+} else {
+virtio_get_long(f);
+}
+}
+
+for (i = 0; i < VIRTQUEUE_MAX_SIZE; i++) {
+virtio_get_long(f);
+if (i < elem->out_num) {
+elem->out_sg[i].iov_len = virtio_get_long(f);
+} else {
+virtio_get_long(f);
+}
+}
 
 virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
 virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0);
-- 
1.8.0




[Qemu-devel] [PATCH 1/4] savevm: introduce little endian variants of savevm routines

2012-12-10 Thread Anthony Liguori
Signed-off-by: Anthony Liguori 
---
 qemu-file.h |  7 +++
 savevm.c| 45 +
 2 files changed, 52 insertions(+)

diff --git a/qemu-file.h b/qemu-file.h
index d64bdbb..ac5286c 100644
--- a/qemu-file.h
+++ b/qemu-file.h
@@ -94,6 +94,9 @@ static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 void qemu_put_be16(QEMUFile *f, unsigned int v);
 void qemu_put_be32(QEMUFile *f, unsigned int v);
 void qemu_put_be64(QEMUFile *f, uint64_t v);
+void qemu_put_le16(QEMUFile *f, unsigned int v);
+void qemu_put_le32(QEMUFile *f, unsigned int v);
+void qemu_put_le64(QEMUFile *f, uint64_t v);
 int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size);
 int qemu_get_byte(QEMUFile *f);
 
@@ -108,6 +111,10 @@ unsigned int qemu_get_be16(QEMUFile *f);
 unsigned int qemu_get_be32(QEMUFile *f);
 uint64_t qemu_get_be64(QEMUFile *f);
 
+unsigned int qemu_get_le16(QEMUFile *f);
+unsigned int qemu_get_le32(QEMUFile *f);
+uint64_t qemu_get_le64(QEMUFile *f);
+
 int qemu_file_rate_limit(QEMUFile *f);
 int64_t qemu_file_set_rate_limit(QEMUFile *f, int64_t new_rate);
 int64_t qemu_file_get_rate_limit(QEMUFile *f);
diff --git a/savevm.c b/savevm.c
index 5d04d59..f6ae0ba 100644
--- a/savevm.c
+++ b/savevm.c
@@ -749,6 +749,26 @@ void qemu_put_be64(QEMUFile *f, uint64_t v)
 qemu_put_be32(f, v);
 }
 
+void qemu_put_le16(QEMUFile *f, unsigned int v)
+{
+qemu_put_byte(f, v);
+qemu_put_byte(f, v >> 8);
+}
+
+void qemu_put_le32(QEMUFile *f, unsigned int v)
+{
+qemu_put_byte(f, v);
+qemu_put_byte(f, v >> 8);
+qemu_put_byte(f, v >> 16);
+qemu_put_byte(f, v >> 24);
+}
+
+void qemu_put_le64(QEMUFile *f, uint64_t v)
+{
+qemu_put_le32(f, v);
+qemu_put_le32(f, v >> 32);
+}
+
 unsigned int qemu_get_be16(QEMUFile *f)
 {
 unsigned int v;
@@ -775,6 +795,31 @@ uint64_t qemu_get_be64(QEMUFile *f)
 return v;
 }
 
+unsigned int qemu_get_le16(QEMUFile *f)
+{
+unsigned int v;
+v = qemu_get_byte(f);
+v |= qemu_get_byte(f) << 8;
+return v;
+}
+
+unsigned int qemu_get_le32(QEMUFile *f)
+{
+unsigned int v;
+v = qemu_get_byte(f);
+v |= qemu_get_byte(f) << 8;
+v |= qemu_get_byte(f) << 16;
+v |= qemu_get_byte(f) << 24;
+return v;
+}
+
+uint64_t qemu_get_le64(QEMUFile *f)
+{
+uint64_t v;
+v = qemu_get_le32(f);
+v |= (uint64_t)qemu_get_le32(f) << 32;
+return v;
+}
 
 /* timer */
 
-- 
1.8.0




[Qemu-devel] [PATCH 4/4] virtio: bump migration version number

2012-12-10 Thread Anthony Liguori
And disable migration on big endian hosts from older versions where
endianness of the device state was ambiguous on the wire.
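
The load-time policy can be summarised in a small sketch. check_load_version() is a hypothetical helper written for illustration; the patch itself open-codes these checks under HOST_WORDS_BIGENDIAN, and the version numbers passed in below are the virtio-blk ones from this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * oldest_supported: lowest stream version the load routine understands.
 * first_stable:     first version with a defined byte order on the wire.
 * Returns 0 if the incoming stream may be loaded, -EINVAL otherwise.
 */
static int check_load_version(int version_id, int oldest_supported,
                              int first_stable, bool host_big_endian)
{
    /* Older streams were native endian, so a big endian host cannot
     * know how to interpret them. */
    if (host_big_endian && version_id < first_stable) {
        return -EINVAL;
    }
    if (version_id < oldest_supported) {
        return -EINVAL;
    }
    return 0;
}
```

For virtio-blk this would be called with oldest_supported = 2 and first_stable = 3, so a version 2 stream is still accepted on little endian hosts and rejected on big endian ones.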

Signed-off-by: Anthony Liguori 
---
 hw/virtio-blk.c| 13 +++--
 hw/virtio-serial-bus.c |  9 +++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 7ab174f..837a709 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -566,8 +566,17 @@ static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
 VirtIOBlock *s = opaque;
 int ret;
 
-if (version_id != 2)
+#ifdef HOST_WORDS_BIGENDIAN
+/* Because of the use of native endianness, we can't reliably handle
+ * migration below this version on big endian hosts. */
+if (version_id < 3) {
 return -EINVAL;
+}
+#endif
+
+if (version_id < 2) {
+return -EINVAL;
+}
 
 ret = virtio_load(&s->vdev, f);
 if (ret) {
@@ -633,7 +642,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
 
 qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
 s->qdev = dev;
-register_savevm(dev, "virtio-blk", virtio_blk_id++, 2,
+register_savevm(dev, "virtio-blk", virtio_blk_id++, 3,
 virtio_blk_save, virtio_blk_load, s);
 bdrv_set_dev_ops(s->bs, &virtio_block_ops, s);
 bdrv_set_buffer_alignment(s->bs, s->conf->logical_block_size);
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index aa1ded0..6aa3b85 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -665,9 +665,14 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
 unsigned int i;
 int ret;
 
-if (version_id > 3) {
+#ifdef HOST_WORDS_BIGENDIAN
+/* Because of the use of native endianness, we can't reliably handle
+ * migration below this version on big endian hosts. */
+
+if (version_id < 4) {
 return -EINVAL;
 }
+#endif
 
 /* The virtio device */
 ret = virtio_load(&s->vdev, f);
@@ -987,7 +992,7 @@ VirtIODevice *virtio_serial_init(DeviceState *dev, virtio_serial_conf *conf)
  * Register for the savevm section with the virtio-console name
  * to preserve backward compat
  */
-register_savevm(dev, "virtio-console", -1, 3, virtio_serial_save,
+register_savevm(dev, "virtio-console", -1, 4, virtio_serial_save,
 virtio_serial_load, vser);
 
 vser->post_load.timer = qemu_new_timer_ns(vm_clock,
-- 
1.8.0




Re: [Qemu-devel] [PATCH 1/4] savevm: introduce little endian variants of savevm routines

2012-12-10 Thread Peter Maydell
On 10 December 2012 14:29, Anthony Liguori  wrote:
> Signed-off-by: Anthony Liguori 
> ---
>  qemu-file.h |  7 +++
>  savevm.c| 45 +
>  2 files changed, 52 insertions(+)
>
> diff --git a/qemu-file.h b/qemu-file.h
> index d64bdbb..ac5286c 100644
> --- a/qemu-file.h
> +++ b/qemu-file.h
> @@ -94,6 +94,9 @@ static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>  void qemu_put_be16(QEMUFile *f, unsigned int v);
>  void qemu_put_be32(QEMUFile *f, unsigned int v);
>  void qemu_put_be64(QEMUFile *f, uint64_t v);
> +void qemu_put_le16(QEMUFile *f, unsigned int v);
> +void qemu_put_le32(QEMUFile *f, unsigned int v);
> +void qemu_put_le64(QEMUFile *f, uint64_t v);

Do we want to add a comment here somewhere that says the le versions
are for backcompat with a specific thing and the be ones are the
ones to use in new code (well, new code not using vmstate)?

-- PMM



Re: [Qemu-devel] [PATCH 1/4] savevm: introduce little endian variants of savevm routines

2012-12-10 Thread Anthony Liguori
Peter Maydell  writes:

> On 10 December 2012 14:29, Anthony Liguori  wrote:
>> Signed-off-by: Anthony Liguori 
>> ---
>>  qemu-file.h |  7 +++
>>  savevm.c| 45 +
>>  2 files changed, 52 insertions(+)
>>
>> diff --git a/qemu-file.h b/qemu-file.h
>> index d64bdbb..ac5286c 100644
>> --- a/qemu-file.h
>> +++ b/qemu-file.h
>> @@ -94,6 +94,9 @@ static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>>  void qemu_put_be16(QEMUFile *f, unsigned int v);
>>  void qemu_put_be32(QEMUFile *f, unsigned int v);
>>  void qemu_put_be64(QEMUFile *f, uint64_t v);
>> +void qemu_put_le16(QEMUFile *f, unsigned int v);
>> +void qemu_put_le32(QEMUFile *f, unsigned int v);
>> +void qemu_put_le64(QEMUFile *f, uint64_t v);
>
> Do we want to add a comment here somewhere that says the le versions
> are for backcompat with a specific thing and the be ones are the
> ones to use in new code (well, new code not using vmstate)?

Yeah, that's a good idea.  I've unfortunately found a couple more cases
of this (writing native endian to the wire).

Regards,

Anthony Liguori

>
> -- PMM




[Qemu-devel] [PATCH] target-mips: Fix for helpers for EXTR_* instructions

2012-12-10 Thread Petar Jovanovic
From: Petar Jovanovic 

The change removes some unnecessary and incorrect code for EXTR_S.H.
Further, it corrects the mask for shift value in the EXTR_ instructions. It also
extends the existing tests so they trigger the issues corrected with the change.
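
The effect of the mask change can be shown with a small sketch. extract_word() is a hypothetical helper that ignores rounding, saturation and the DSPControl overflow flag, and it uses an unsigned accumulator for simplicity, whereas the real helpers operate on a signed HI/LO pair. The shift amounts used by the new tests go up to 0x1F, which a 0x0F mask silently truncates.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift the 64-bit accumulator right by (shift & mask) and return the
 * low 32 bits.  mask is 0x0F for the old behaviour, 0x1F for the fix.
 */
static uint32_t extract_word(uint64_t acc, uint32_t shift, uint32_t mask)
{
    return (uint32_t)(acc >> (shift & mask));
}
```

For an accumulator of 0x0000000180000000 and a shift of 31, the 0x1F mask yields 0x3, while the 0x0F mask reduces the shift to 15 and yields 0x30000.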

Signed-off-by: Petar Jovanovic 
---
 target-mips/dsp_helper.c   |   45 +++
 tests/tcg/mips/mips32-dsp/extr_r_w.c   |   23 
 tests/tcg/mips/mips32-dsp/extr_rs_w.c  |   23 
 tests/tcg/mips/mips32-dsp/extr_s_h.c   |   23 
 tests/tcg/mips/mips32-dsp/extr_w.c |   23 
 tests/tcg/mips/mips32-dsp/extrv_r_w.c  |   25 +
 tests/tcg/mips/mips32-dsp/extrv_rs_w.c |   25 +
 tests/tcg/mips/mips32-dsp/extrv_s_h.c  |   17 
 tests/tcg/mips/mips32-dsp/extrv_w.c|   26 ++
 9 files changed, 195 insertions(+), 35 deletions(-)

diff --git a/target-mips/dsp_helper.c b/target-mips/dsp_helper.c
index 14daf91..a54809b 100644
--- a/target-mips/dsp_helper.c
+++ b/target-mips/dsp_helper.c
@@ -484,35 +484,6 @@ static inline uint8_t mipsdsp_rrshift1_sub_u8(uint8_t a, uint8_t b)
 return (temp >> 1) & 0x00FF;
 }
 
-static inline int64_t mipsdsp_rashift_short_acc(int32_t ac,
-int32_t shift,
-CPUMIPSState *env)
-{
-int32_t sign, temp31;
-int64_t temp, acc;
-
-sign = (env->active_tc.HI[ac] >> 31) & 0x01;
-acc = ((int64_t)env->active_tc.HI[ac] << 32) |
-  ((int64_t)env->active_tc.LO[ac] & 0xFFFFFFFF);
-if (shift == 0) {
-temp = acc;
-} else {
-if (sign == 0) {
-temp = (((int64_t)0x01 << (32 - shift + 1)) - 1) & (acc >> shift);
-} else {
-temp = ((((int64_t)0x01 << (shift + 1)) - 1) << (32 - shift)) |
-   (acc >> shift);
-}
-}
-
-temp31 = (temp >> 31) & 0x01;
-if (sign != temp31) {
-set_DSPControl_overflow_flag(1, 23, env);
-}
-
-return temp;
-}
-
 /*  128 bits long. p[0] is LO, p[1] is HI. */
 static inline void mipsdsp_rndrashift_short_acc(int64_t *p,
 int32_t ac,
@@ -3407,7 +3378,7 @@ target_ulong helper_extr_w(target_ulong ac, target_ulong shift,
 int32_t tempI;
 int64_t tempDL[2];
 
-shift = shift & 0x0F;
+shift = shift & 0x1F;
 
 mipsdsp_rndrashift_short_acc(tempDL, ac, shift, env);
 if ((tempDL[1] != 0 || (tempDL[0] & MIPSDSP_LHI) != 0) &&
@@ -3435,7 +3406,7 @@ target_ulong helper_extr_r_w(target_ulong ac, target_ulong shift,
 {
 int64_t tempDL[2];
 
-shift = shift & 0x0F;
+shift = shift & 0x1F;
 
 mipsdsp_rndrashift_short_acc(tempDL, ac, shift, env);
 if ((tempDL[1] != 0 || (tempDL[0] & MIPSDSP_LHI) != 0) &&
@@ -3462,7 +3433,7 @@ target_ulong helper_extr_rs_w(target_ulong ac, target_ulong shift,
 int32_t tempI, temp64;
 int64_t tempDL[2];
 
-shift = shift & 0x0F;
+shift = shift & 0x1F;
 
 mipsdsp_rndrashift_short_acc(tempDL, ac, shift, env);
 if ((tempDL[1] != 0 || (tempDL[0] & MIPSDSP_LHI) != 0) &&
@@ -3645,11 +3616,15 @@ target_ulong helper_dextr_rs_l(target_ulong ac, target_ulong shift,
 target_ulong helper_extr_s_h(target_ulong ac, target_ulong shift,
  CPUMIPSState *env)
 {
-int64_t temp;
+int64_t temp, acc;
+
+shift = shift & 0x1F;
+
+acc = ((int64_t)env->active_tc.HI[ac] << 32) |
+  ((int64_t)env->active_tc.LO[ac] & 0xFFFFFFFF);
 
-shift = shift & 0x0F;
+temp = acc >> shift;
 
-temp = mipsdsp_rashift_short_acc(ac, shift, env);
 if (temp > (int64_t)0x7FFF) {
 temp = 0x7FFF;
 set_DSPControl_overflow_flag(1, 23, env);
diff --git a/tests/tcg/mips/mips32-dsp/extr_r_w.c b/tests/tcg/mips/mips32-dsp/extr_r_w.c
index 0beeefd..02e0224 100644
--- a/tests/tcg/mips/mips32-dsp/extr_r_w.c
+++ b/tests/tcg/mips/mips32-dsp/extr_r_w.c
@@ -44,5 +44,28 @@ int main()
 assert(dsp == 0);
 assert(result == rt);
 
+/* Clear dspcontrol */
+dsp = 0;
+__asm
+("wrdsp %0\n\t"
+ :
+ : "r"(dsp)
+);
+
+ach = 0x3fff;
+acl = 0x2bcdef01;
+result = 0x7ffe;
+__asm
+("mthi %2, $ac1\n\t"
+ "mtlo %3, $ac1\n\t"
+ "extr_r.w %0, $ac1, 0x1F\n\t"
+ "rddsp %1\n\t"
+ : "=r"(rt), "=r"(dsp)
+ : "r"(ach), "r"(acl)
+);
+dsp = (dsp >> 23) & 0x01;
+assert(dsp == 0);
+assert(result == rt);
+
 return 0;
 }
diff --git a/tests/tcg/mips/mips32-dsp/extr_rs_w.c b/tests/tcg/mips/mips32-dsp/extr_rs_w.c
index 24c748d..c3a22ee 100644
--- a/tests/tcg/mips/mips32-dsp/extr_rs_w.c
+++ b/tests/tcg/mips/mips32-dsp/extr_rs_w.c
@@ -44,5 +44,28 @@ int main()
 assert(dsp == 0);
 assert(result == rt);
 
+/* Clear dspcontrol */
+dsp = 0;
+__asm
+("wrdsp %0\n\t"
+ :
+   

Re: [Qemu-devel] [PATCH V17 1/6] docs: document for add-cow file format

2012-12-10 Thread Kevin Wolf
Am 06.12.2012 07:51, schrieb Dong Xu Wang:
> Document for add-cow format, the usage and spec of add-cow are introduced.
> 
> Signed-off-by: Dong Xu Wang 
> ---
>  docs/specs/add-cow.txt |  154 
> 
>  1 files changed, 154 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..24e9a11
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,154 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with a raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::

Double colon.

> +(ubuntu.img is a disk image which has an installed OS.)
> +1)  Create a raw image with the same size of ubuntu.img
> +qemu-img create -f raw test.raw 8G
> +2)  Create an add-cow image which will store dirty bitmap
> +qemu-img create -f add-cow test.add-cow \
> +-o backing_file=ubuntu.img,image_file=test.raw
> +3)  Run qemu with add-cow image
> +qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of 
> test.add-cow
> +will be calculated from the size of test.raw.
> +
> +image_fmt can be omitted, in that case image_fmt should be set as "raw".

By "should be set as" you mean "is assumed to be"?

> +backing_fmt can also be omitted, add-cow should do a probe operation and 
> determine

This line takes more than 80 characters. More follow, I won't comment on
each.

> +what the backing file's format is.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---+---+
> + | Header|   COW bitmap  |
> + +---+---+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(HEADER_SIZE is defined in 44-47 bytes.)
> +Byte0  -  3:magic
> +add-cow magic string ("ACOW").
> +
> +4  -  7:version
> +Version number (only valid value is 1 now).
> +
> +8  - 11:backing file name offset
> +Offset in the add-cow file at which the backing file
> +name is stored (NB: The string is not 
> NUL-terminated).
> +If backing file name does NOT exist, this field will 
> be
> +0. Must be between 80 and [HEADER_SIZE - 2](a file 
> name
> +must be at least 1 byte).
> +
> +12 - 15:backing file name size
> +Length of the backing file name in bytes. It will be > 0
> +if the backing file name offset is 0. If backing file
> +name offset is non-zero, then it must be non-zero. 
> Must
> +be less than [HEADER_SIZE - 80] to fit in the 
> reserved
> +part of the header. Backing file name offset + size
> +must be no more than HEADER_SIZE.
> +
> +16 - 19:image file name offset
> +Offset in the add-cow file at which the image file 
> name
> +is stored (NB: The string is not NUL-terminated). It
> +must be between 80 and [HEADER_SIZE - 2]. Image file
> +name size + offset must be no more than HEADER_SIZE.
> +
> +20 - 23:image file name size
> +Length of the image file name in bytes.
> +Must be less than [HEADER_SIZE - 80] to fit in the 
> reserved
> +part of the header.
> +
> +24 - 27:cluster bits
> +Number of bits that are used for addressing an offset
> +within a cluster (1 << cluster_bits is the cluster 
> size).
> +Must not be less than 9 (i.e. 512 byte clusters).
> +
> +Note: qemu as of today has an implementation limit 
> of 2 MB
> +as the maximum cluster size and won't be able to 
> open images
> +with larger cluster sizes.
> +
> +28 - 35:features
> +Bitmask of features. If a feature bit is set but not 
> recognized,
> +the add-cow file should be dropped. They are not 
> used in v1.

Does v1 mean header.version = 1? I think this is wrong, we will want to
add incompatible feature flags 
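
For readers following the header layout quoted above, here is a rough sketch of decoding the fields listed up to byte 35, assuming the little-endian byte order and the offsets given in the draft. The struct and function names are hypothetical, the fields after byte 35 are omitted because the quoted text does not show them here, and feature-bit handling is left out since it is still under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACOW_MIN_CLUSTER_BITS 9   /* 512-byte clusters, per the draft */
#define ACOW_MAX_CLUSTER_BITS 21  /* 2 MB qemu limit noted in the draft */

/* Hypothetical in-memory view of header bytes 4..35. */
struct acow_header_prefix {
    uint32_t version;
    uint32_t backing_name_offset;
    uint32_t backing_name_size;
    uint32_t image_name_offset;
    uint32_t image_name_size;
    uint32_t cluster_bits;
    uint64_t features;
};

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t le64(const uint8_t *p)
{
    return (uint64_t)le32(p) | ((uint64_t)le32(p + 4) << 32);
}

/* Returns 0 on success, -1 if the first 36 bytes are not a plausible header. */
static int acow_parse_prefix(const uint8_t *buf, size_t len,
                             struct acow_header_prefix *h)
{
    assert(buf != NULL && h != NULL);
    if (len < 36 || memcmp(buf, "ACOW", 4) != 0) {
        return -1;
    }
    h->version             = le32(buf + 4);
    h->backing_name_offset = le32(buf + 8);
    h->backing_name_size   = le32(buf + 12);
    h->image_name_offset   = le32(buf + 16);
    h->image_name_size     = le32(buf + 20);
    h->cluster_bits        = le32(buf + 24);
    h->features            = le64(buf + 28);

    if (h->version != 1) {
        return -1;
    }
    if (h->cluster_bits < ACOW_MIN_CLUSTER_BITS ||
        h->cluster_bits > ACOW_MAX_CLUSTER_BITS) {
        return -1;
    }
    return 0;
}
```

Decoding byte by byte avoids any dependence on host endianness or struct padding, which is why the sketch does not cast the buffer to a packed struct.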

Re: [Qemu-devel] [PATCH] tcg-i386: Improve cmov detection

2012-12-10 Thread Wei-Ren Chen
On Mon, Nov 26, 2012 at 08:23:10AM -0800, Richard Henderson wrote:
> On 11/24/2012 10:12 AM, Peter Maydell wrote:
> > MacOS gcc objects to this:
> > In file included from /Users/pm215/src/qemu/tcg/tcg.c:174:
> > /Users/pm215/src/qemu/tcg/i386/tcg-target.c:105:19: warning: cpuid.h:
> > No such file or directory
> > 
> > (though for some reason not as a fatal error).
> 
> Bizarre.
> 
> Out of curiosity, does llvm ship a cpuid.h?  Or am I going to be
> better off not relying on that header at all?

  I don't think LLVM ships cpuid.h.

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



[Qemu-devel] [Bug 1087974] Re: [regression] vnc tight png produces garbled output

2012-12-10 Thread Tim Hardeck
47683d669f993308c2b84bed4ce64aafb5d7ced4 is the first bad commit
commit 47683d669f993308c2b84bed4ce64aafb5d7ced4
Author: Gerd Hoffmann 
Date:   Thu Oct 11 12:04:33 2012 +0200

pixman/vnc: remove rgb_prepare_row* functions

Let pixman do it instead.

Signed-off-by: Gerd Hoffmann 

:04 04 653d58e66bf3a2d8240b2f9176979c44ccd720e1
6b6e367a8522cb58b42ad8f204387a354d3b3d00 M  ui


Just reverting this particular commit isn't enough, though, but it is connected.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1087974

Title:
  [regression] vnc tight png produces garbled output

Status in QEMU:
  New

Bug description:
  VNC Tight PNG compression worked fine two or three months ago but no longer
  does. Now, when Tight PNG is used, parts of the desktop are shown but they
  are scrambled together.
  I have always tested this feature against QEMU git with noVNC, allowing
  only Tight PNG compression.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1087974/+subscriptions



Re: [Qemu-devel] [PATCH 1/2] qapi: add visitor for parsing int[KMGT] input string

2012-12-10 Thread Igor Mammedov
On Fri, 07 Dec 2012 19:57:35 +0100
Andreas Färber  wrote:

> Am 06.12.2012 22:12, schrieb Igor Mammedov:
> > Caller of visit_type_unit_suffixed_int() will have to specify
> > value of 'K' suffix via unit argument.
> > For Kbytes it's 1024, for Khz it's 1000.
> > 
> > Signed-off-by: Igor Mammedov 
> > ---
> >  v2:
> >   - convert type_freq to type_unit_suffixed_int.
> >   - provide qapi_dealloc_type_unit_suffixed_int() impl.
> > ---
> >  qapi/qapi-dealloc-visitor.c |  7 +++
> >  qapi/qapi-visit-core.c  | 13 +
> >  qapi/qapi-visit-core.h  |  8 
> >  qapi/string-input-visitor.c | 22 ++
> >  4 files changed, 50 insertions(+)
> > 
> > diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> > index 75214e7..57e662c 100644
> > --- a/qapi/qapi-dealloc-visitor.c
> > +++ b/qapi/qapi-dealloc-visitor.c
> > @@ -143,6 +143,12 @@ static void qapi_dealloc_type_enum(Visitor *v, int
> > *obj, const char *strings[], {
> >  }
> >  
> > +static void qapi_dealloc_type_unit_suffixed_int(Visitor *v, int64_t *obj,
> > +const char *name,
> > +const int unit, Error
> > **errp) +{
> > +}
> > +
> >  Visitor *qapi_dealloc_get_visitor(QapiDeallocVisitor *v)
> >  {
> >  return &v->visitor;
> > @@ -170,6 +176,7 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
> >  v->visitor.type_str = qapi_dealloc_type_str;
> >  v->visitor.type_number = qapi_dealloc_type_number;
> >  v->visitor.type_size = qapi_dealloc_type_size;
> > +v->visitor.type_unit_suffixed_int =
> > qapi_dealloc_type_unit_suffixed_int; 
> >  QTAILQ_INIT(&v->stack);
> >  
> > diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> > index 7a82b63..dcbc1a9 100644
> > --- a/qapi/qapi-visit-core.c
> > +++ b/qapi/qapi-visit-core.c
> > @@ -311,3 +311,16 @@ void input_type_enum(Visitor *v, int *obj, const
> > char *strings[], g_free(enum_str);
> >  *obj = value;
> >  }
> > +
> > +void visit_type_unit_suffixed_int(Visitor *v, int64_t *obj, const char
> > *name,
> > +  const int unit, Error **errp)
> > +{
> > +if (!error_is_set(errp)) {
> 
> if (error_is_set(errp)) {
Thanks, I'll fix it.

> > +return;
> > +}
> > +if (v->type_unit_suffixed_int) {
> > +v->type_unit_suffixed_int(v, obj, name, unit, errp);
> > +} else {
> > +visit_type_int64(v, obj, name, errp);
> > +}
> > +}
> > diff --git a/qapi/qapi-visit-core.h b/qapi/qapi-visit-core.h
> > index 60aceda..04e690a 100644
> > --- a/qapi/qapi-visit-core.h
> > +++ b/qapi/qapi-visit-core.h
> > @@ -62,6 +62,12 @@ struct Visitor
> >  void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error
> > **errp); /* visit_type_size() falls back to (*type_uint64)() if type_size
> > is unset */ void (*type_size)(Visitor *v, uint64_t *obj, const char
> > *name, Error **errp);
> > +/*
> > + * visit_unit_suffixed_int() falls back to (*type_int64)()
> > + * if type_unit_suffixed_int is unset
> > +*/
> 
> Indentation is one off.
ditto

> 
> > +void (*type_unit_suffixed_int)(Visitor *v, int64_t *obj, const char
> > *name,
> > +   const int unit, Error **errp);
> 
> Are we expecting differently suffixed ints? Otherwise we could
> optionally shorten to type_suffixed_int (but that probably still doesn't
> fit within one comment line ;)).
Not with current implementation. I'll shorten it as you've suggested.

> 
> >  };
> >  
> >  void visit_start_handle(Visitor *v, void **obj, const char *kind,
> > @@ -91,5 +97,7 @@ void visit_type_size(Visitor *v, uint64_t *obj, const
> > char *name, Error **errp); void visit_type_bool(Visitor *v, bool *obj,
> > const char *name, Error **errp); void visit_type_str(Visitor *v, char
> > **obj, const char *name, Error **errp); void visit_type_number(Visitor
> > *v, double *obj, const char *name, Error **errp); +void
> > visit_type_unit_suffixed_int(Visitor *v, int64_t *obj, const char *name,
> > +  const int unit, Error **errp);
> >  
> >  #endif
> > diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
> > index 497eb9a..d2bd154 100644
> > --- a/qapi/string-input-visitor.c
> > +++ b/qapi/string-input-visitor.c
> > @@ -110,6 +110,27 @@ static void parse_start_optional(Visitor *v, bool
> > *present, *present = true;
> >  }
> >  
> > +static void parse_type_unit_suffixed_int(Visitor *v, int64_t *obj,
> > + const char *name, const int
> > unit,
> > + Error **errp)
> > +{
> > +StringInputVisitor *siv = DO_UPCAST(StringInputVisitor, visitor, v);
> > +char *endp = (char *) siv->string;
> > +long long val = 0;
> > +
> > +if (siv->string) {
> > +val = strtosz_suffix_unit(siv->string, &endp,
> > + STRTOSZ_DEFSUFFIX_B, unit);
>
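
The idea discussed in this thread, where the caller supplies the value of one 'K' step (1000 for frequencies, 1024 for byte sizes), can be sketched independently of QEMU as follows. This is a hypothetical stand-alone parser, not strtosz_suffix_unit() and not the visitor code from the patch; it accepts only non-negative decimal integers with an optional K, M, G or T suffix.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse "<int>[KMGT]", multiplying by `unit` once per
 * suffix step. Returns 0 on success, -1 on malformed input or overflow. */
static int parse_unit_suffixed_int(const char *str, int64_t unit, int64_t *out)
{
    char *end;
    long long val;
    int steps;

    assert(str != NULL && out != NULL && unit > 0);

    errno = 0;
    val = strtoll(str, &end, 10);
    if (end == str || errno == ERANGE || val < 0) {
        return -1;
    }

    switch (*end) {
    case '\0': steps = 0; break;
    case 'K':  steps = 1; break;
    case 'M':  steps = 2; break;
    case 'G':  steps = 3; break;
    case 'T':  steps = 4; break;
    default:   return -1;
    }
    if (*end != '\0' && end[1] != '\0') {
        return -1;              /* trailing characters after the suffix */
    }

    while (steps-- > 0) {
        if (val > INT64_MAX / unit) {
            return -1;          /* the next multiplication would overflow */
        }
        val *= unit;
    }
    *out = val;
    return 0;
}
```

With this shape, a frequency property would call the helper with unit 1000 and a size property with unit 1024, so the same parsing code serves both conventions.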

Re: [Qemu-devel] [Xen-devel] [PATCH 0/2] QEMU/xen: simplify cpu_ioreq_pio and cpu_ioreq_move

2012-12-10 Thread Stefano Stabellini
On Fri, 7 Dec 2012, Ian Jackson wrote:
> Stefano Stabellini writes ("[Xen-devel] [PATCH 0/2] QEMU/xen: simplify 
> cpu_ioreq_pio and cpu_ioreq_move"):
> > after reviewing the patch "fix multiply issue for int and uint types"
> > with Ian Jackson, we realized that cpu_ioreq_pio and cpu_ioreq_move are
> > in much need for a simplification as well as removal of a possible
> > integer overflow.
> > 
> > This patch series tries to accomplish both switching to two new helper
> > functions and using a more obvious arithmetic. Doing so it should also
> > fix the original problem that Dongxiao was experiencing. The C language
> > can be a nasty backstabber when signed and unsigned integers are
> > involved.
> 
> I think the attached patch is better as it removes some formulaic
> code.  I don't think I have a guest which can repro the bug so I have
> only compile tested it.
> 
> Dongxiao, would you care to take a look ?
> 
> PS: I'm pretty sure the original overflows aren't security problems.
> 
> Thanks,
> Ian.
> 
> commit d19731e4e452e3415a5c03771d0406efc803baa9
> Author: Ian Jackson 
> Date:   Fri Dec 7 16:02:04 2012 +
> 
> cpu_ioreq_pio, cpu_ioreq_move: introduce read_phys_req_item, 
> write_phys_req_item
> 
> The current code compares i (int) with req->count (uint32_t) in a for
> loop, risking an infinite loop if req->count is >INT_MAX.  It also
> does the multiplication of req->size in a too-small type, leading to
> integer overflows.
> 
> Turn read_physical and write_physical into two different helper
> functions, read_phys_req_item and write_phys_req_item, that take care
> of adding or subtracting offset depending on sign.
> 
> This removes the formulaic multiplication to a single place where the
> integer overflows can be dealt with by casting to wide-enough unsigned
> types.
> 
> Reported-By: Dongxiao Xu 
> Signed-off-by: Stefano Stabellini 
> Signed-off-by: Ian Jackson 
> 
> diff --git a/i386-dm/helper2.c b/i386-dm/helper2.c
> index c6d049c..9b8552c 100644
> --- a/i386-dm/helper2.c
> +++ b/i386-dm/helper2.c
> @@ -339,21 +339,40 @@ static void do_outp(CPUState *env, unsigned long addr,
>  }
>  }
>  
> -static inline void read_physical(uint64_t addr, unsigned long size, void 
> *val)
> +/*
> + * Helper functions which read/write an object from/to physical guest
> + * memory, as part of the implementation of an ioreq.
> + *
> + * Equivalent to
> + *   cpu_physical_memory_rw(addr + (req->df ? -1 : +1) * req->size * i,
> + *  val, req->size, 0/1)
> + * except without the integer overflow problems.
> + */
> +static void rw_phys_req_item(target_phys_addr_t addr,
> + ioreq_t *req, uint32_t i, void *val, int rw)
>  {
> -return cpu_physical_memory_rw((target_phys_addr_t)addr, val, size, 0);
> +/* Do everything unsigned so overflow just results in a truncated result
> + * and accesses to undesired parts of guest memory, which is up
> + * to the guest */
> +target_phys_addr_t offset = (target_phys_addr_t)req->size * i;
> +if (req->df) addr -= offset;
> +else addr -= offset;

This can't be right, can it?

The search/replace changes below look correct.

For the sake of consistency, could you please send a patch against
upstream QEMU to qemu-devel? The corresponding code is in xen-all.c
(cpu_ioreq_pio and cpu_ioreq_move).
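
For reference, the address arithmetic described in the commit message (add the offset when req->df is clear, subtract it when set, with the multiplication done in a wide unsigned type) might look like the sketch below. The typedef and function name are hypothetical stand-ins, not the code from the patch or from xen-all.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;   /* stand-in for target_phys_addr_t */

/* Hypothetical helper: address of the i-th item of an ioreq.
 * The multiplication is done in 64 bits so size * i cannot wrap at 32 bits;
 * unsigned wraparound of the final address is well defined. */
static phys_addr_t req_item_addr(phys_addr_t base, uint32_t size,
                                 uint32_t i, int df)
{
    phys_addr_t offset;

    assert(size > 0);
    offset = (phys_addr_t)size * i;
    return df ? base - offset : base + offset;
}
```

Keeping the direction test in one helper means the read and write wrappers cannot disagree about the sign of the offset.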



> +cpu_physical_memory_rw(addr, val, req->size, rw);
>  }
> -
> -static inline void write_physical(uint64_t addr, unsigned long size, void 
> *val)
> +static inline void read_phys_req_item(target_phys_addr_t addr,
> +  ioreq_t *req, uint32_t i, void *val)
>  {
> -return cpu_physical_memory_rw((target_phys_addr_t)addr, val, size, 1);
> +rw_phys_req_item(addr, req, i, val, 0);
> +}
> +static inline void write_phys_req_item(target_phys_addr_t addr,
> +   ioreq_t *req, uint32_t i, void *val)
> +{
> +rw_phys_req_item(addr, req, i, val, 1);
>  }
>  
>  static void cpu_ioreq_pio(CPUState *env, ioreq_t *req)
>  {
> -int i, sign;
> -
> -sign = req->df ? -1 : 1;
> +uint32_t i;
>  
>  if (req->dir == IOREQ_READ) {
>  if (!req->data_is_ptr) {
> @@ -363,9 +382,7 @@ static void cpu_ioreq_pio(CPUState *env, ioreq_t *req)
>  
>  for (i = 0; i < req->count; i++) {
>  tmp = do_inp(env, req->addr, req->size);
> -write_physical((target_phys_addr_t) req->data
> -  + (sign * i * req->size),
> -  req->size, &tmp);
> +write_phys_req_item((target_phys_addr_t) req->data, req, i, 
> &tmp);
>  }
>  }
>  } else if (req->dir == IOREQ_WRITE) {
> @@ -375,9 +392,7 @@ static void cpu_ioreq_pio(CPUState *env, ioreq_t *req)
>  for (i = 0; i < req->count; i++) {
>  unsigned long tmp = 0;
>  
> -read_physical((tar

Re: [Qemu-devel] Some patch about mips, gen_HILO bug fix.

2012-12-10 Thread Jovanovic, Petar
> 0002-Make-repl_ph-to-sign-extended-to-target_long.patch
> 0003-Fix-gen_HILO-to-make-it-adapt-each-arch-which-use-ac.patch

Can you send examples/tests for the issues that you fix?
It makes it easier to review if you provide a simple example of a failing test.


Petar




Re: [Qemu-devel] [PATCH 2/2] target-i386: use visit_type_unit_suffixed_int() to parse tsc_freq property value

2012-12-10 Thread Igor Mammedov
On Fri, 7 Dec 2012 18:09:06 -0200
Eduardo Habkost  wrote:

> On Fri, Dec 07, 2012 at 08:00:09PM +0100, Andreas Färber wrote:
> > Am 06.12.2012 22:12, schrieb Igor Mammedov:
> > > Signed-off-by: Igor Mammedov 
> > > ---
> > >   v2:
> > >- replace visit_type_freq() with visit_type_unit_suffixed_int()
> > >  in x86_cpuid_set_tsc_freq()
> > > ---
> > >  target-i386/cpu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > index c6c2ca0..b7f0aba 100644
> > > --- a/target-i386/cpu.c
> > > +++ b/target-i386/cpu.c
> > > @@ -1195,7 +1195,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj,
> > > Visitor *v, void *opaque, const int64_t max = INT64_MAX;
> > >  int64_t value;
> > >  
> > > -visit_type_int(v, &value, name, errp);
> > > +visit_type_unit_suffixed_int(v, &value, name, 1000, errp);
> > >  if (error_is_set(errp)) {
> > >  return;
> > >  }
> > 
> > This trivial usage is fine obviously. But since this series set out to
> > make things more generic I am missing at least one use case for 1024.
> > Does nothing like that exist in qdev-properties.c or so already?
> 
> cutils.c has:
> 
> int64_t strtosz_suffix(const char *nptr, char **end, const char
> default_suffix) {
> return strtosz_suffix_unit(nptr, end, default_suffix, 1024);
> }
> 
> $ git grep -w strtosz_suffix
> [...]
> qapi/opts-visitor.c:val = strtosz_suffix(opt->str ? opt->str : "",
> &endptr, qemu-img.c:sval = strtosz_suffix(argv[optind++], &end,
> STRTOSZ_DEFSUFFIX_B); qemu-img.c:sval = strtosz_suffix(optarg,
> &end, STRTOSZ_DEFSUFFIX_B);
> 
> The opts-visitor.c match, in turn, is inside opts_type_size(), that's the
> ->type_size method of OptsVisitor. There are many 'size' elements inside
> qapi-schema.json.
> 
> I don't see any code using visit_type_size() directly, but I see two users
> of type 'size' on qapi-schema.json: NetdevTapOptions and NetdevDumpOptions.
> 
> I didn't know that we already had a visitor method using the suffixed-int
> parsing code. Should we change the visit_type_size() code to use
> the new generic ->type_suffixed_int method and kill ->type_size?

If there is no strong opposition to doing this incrementally,
I'd prefer that these patches go in first.

Users of visit_type_size() could then be converted later to use
type_suffixed_int(), or perhaps a new type_suffixed_uint(), so that a size
could be represented as uint64_t instead of int64_t as it is now. That would
require rewriting strtosz_* and its callers a bit.






[Qemu-devel] [PATCH] target-mips: Use EXCP_SC rather than a magic number

2012-12-10 Thread Wei-Ren Chen
  From the discussion on the ML [1], the exception limit defined by
magic number 0x100 is actually EXCP_SC defined in cpu.h. Replace the
magic number with EXCP_SC. Remove "#if 1 .. #endif" as well.

[1] http://lists.gnu.org/archive/html/qemu-devel/2012-11/msg03080.html

Signed-off-by: Chen Wei-Ren 
---
 target-mips/op_helper.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index f45d494..98a445c 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -39,10 +39,10 @@ static inline void QEMU_NORETURN 
do_raise_exception_err(CPUMIPSState *env,
 uintptr_t pc)
 {
 TranslationBlock *tb;
-#if 1
-if (exception < 0x100)
+if (exception < EXCP_SC) {
 qemu_log("%s: %d %d\n", __func__, exception, error_code);
-#endif
+}
+
 env->exception_index = exception;
 env->error_code = error_code;
 
-- 
1.7.3.4



Re: [Qemu-devel] [PATCH] NVMe: Initial commit to add an NVM Express device

2012-12-10 Thread Busch, Keith
On Mon, Dec 10, 2012 at 7:11 AM, Stefan Hajnoczi  wrote:
> Quick pointers to get started on Kevin's suggestion:
> 
> bdrv_aio_readv(), bdrv_aio_writev(), bdrv_aio_flush(), and
> bdrv_aio_discard() provide the block device operations that emulated
> storage controllers use.
> 
> Take a look at hw/virtio-blk.c to see how to take a -device
> nvme,drive= (internally this is your BlockDriverState*).
> 
> Stefan

Thanks for all the feedback. I'll look into everything mentioned so far and
should be able to submit an updated patch set addressing the suggestions
within a week or two.



[Qemu-devel] What is the current state of the various usb implementations?

2012-12-10 Thread tom289332
Hello,

I sent this to qemu-discuss a week ago and got no responses, so I'm trying here.

Looking at `qemu-kvm -device ?`, there seem to be multiple usb implementations:
ich9-usb-uhci2
ich9-usb-uhci3
ich9-usb-uhci1
piix3-usb-uhci
piix4-usb-uhci
vt82c686b-usb-uhci
ich9-usb-ehci1
usb-ehci
nec-usb-xhci

Are they all fully implemented and stable? If not, what is the state of each?

Thanks,
Tom



[Qemu-devel] [RFC PATCH v7 1/8] qdev : add a maximum device allowed field for the bus.

2012-12-10 Thread fred . konrad
From: KONRAD Frederic 

Add a max_dev field to BusState to specify the maximum number of devices
allowed on the bus (has no effect if max_dev is 0).

Signed-off-by: KONRAD Frederic 
---
 hw/qdev-core.h|  2 ++
 hw/qdev-monitor.c | 11 +++
 2 files changed, 13 insertions(+)

diff --git a/hw/qdev-core.h b/hw/qdev-core.h
index fff7f0f..ee4becd 100644
--- a/hw/qdev-core.h
+++ b/hw/qdev-core.h
@@ -113,6 +113,8 @@ struct BusState {
 const char *name;
 int allow_hotplug;
 int max_index;
+/* maximum devices allowed on the bus, 0 : no limit. */
+int max_dev;
 QTAILQ_HEAD(ChildrenHead, BusChild) children;
 QLIST_ENTRY(BusState) sibling;
 };
diff --git a/hw/qdev-monitor.c b/hw/qdev-monitor.c
index a1b4d6a..7a9d275 100644
--- a/hw/qdev-monitor.c
+++ b/hw/qdev-monitor.c
@@ -292,6 +292,17 @@ static BusState *qbus_find_recursive(BusState *bus, const 
char *name,
 if (bus_typename && !object_dynamic_cast(OBJECT(bus), bus_typename)) {
 match = 0;
 }
+if ((bus->max_dev != 0) && (bus->max_dev <= bus->max_index)) {
+if (name != NULL) {
+/* bus was explicitly specified : return an error. */
+qerror_report(ERROR_CLASS_GENERIC_ERROR, "Bus '%s' is full",
+  bus->name);
+return NULL;
+} else {
+/* bus was not specified : try to find another one. */
+match = 0;
+}
+}
 if (match) {
 return bus;
 }
-- 
1.7.11.7




[Qemu-devel] [RFC PATCH v7 2/8] virtio-bus : Introduce virtio-bus

2012-12-10 Thread fred . konrad
From: KONRAD Frederic 

Introduce virtio-bus. Refactored transport device will create a bus which
extends virtio-bus.

Signed-off-by: KONRAD Frederic 
---
 hw/Makefile.objs |   1 +
 hw/virtio-bus.c  | 120 +++
 hw/virtio-bus.h  |  83 ++
 3 files changed, 204 insertions(+)
 create mode 100644 hw/virtio-bus.c
 create mode 100644 hw/virtio-bus.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d581d8d..6fa4de4 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -3,6 +3,7 @@ common-obj-y += loader.o
 common-obj-$(CONFIG_VIRTIO) += virtio-console.o
 common-obj-$(CONFIG_VIRTIO) += virtio-rng.o
 common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
+common-obj-$(CONFIG_VIRTIO) += virtio-bus.o
 common-obj-y += fw_cfg.o
 common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
 common-obj-$(CONFIG_PCI) += msix.o msi.o
diff --git a/hw/virtio-bus.c b/hw/virtio-bus.c
new file mode 100644
index 000..6afd2b6
--- /dev/null
+++ b/hw/virtio-bus.c
@@ -0,0 +1,120 @@
+/*
+ * VirtioBus
+ *
+ *  Copyright (C) 2012 : GreenSocs Ltd
+ *  http://www.greensocs.com/ , email: i...@greensocs.com
+ *
+ *  Developed by :
+ *  Frederic Konrad   
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "hw.h"
+#include "qemu-error.h"
+#include "qdev.h"
+#include "virtio-bus.h"
+#include "virtio.h"
+
+/* #define DEBUG_VIRTIO_BUS */
+
+#ifdef DEBUG_VIRTIO_BUS
+#define DPRINTF(fmt, ...) \
+do { printf("virtio_bus: " fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do { } while (0)
+#endif
+
+/* Plug the VirtIODevice */
+int virtio_bus_plug_device(VirtIODevice *vdev)
+{
+DeviceState *qdev = DEVICE(vdev);
+BusState *qbus = BUS(qdev_get_parent_bus(qdev));
+VirtioBusState *bus = VIRTIO_BUS(qbus);
+VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
+DPRINTF("%s : plug device.\n", qbus->name);
+
+bus->vdev = vdev;
+
+if (klass->device_plugged != NULL) {
+klass->device_plugged(qbus->parent);
+}
+
+/*
+ * The lines below will disappear when we drop VirtIOBindings.
+ */
+bus->bindings.notify = klass->notify;
+bus->bindings.save_config = klass->save_config;
+bus->bindings.save_queue = klass->save_queue;
+bus->bindings.load_config = klass->load_config;
+bus->bindings.load_queue = klass->load_queue;
+bus->bindings.load_done = klass->load_done;
+bus->bindings.get_features = klass->get_features;
+bus->bindings.query_guest_notifiers = klass->query_guest_notifiers;
+bus->bindings.set_guest_notifiers = klass->set_guest_notifiers;
+bus->bindings.set_host_notifier = klass->set_host_notifier;
+bus->bindings.vmstate_change = klass->vmstate_change;
+virtio_bind_device(bus->vdev, &(bus->bindings), qbus->parent);
+
+return 0;
+}
+
+/* Reset the virtio_bus */
+void virtio_bus_reset(VirtioBusState *bus)
+{
+DPRINTF("%s : reset device.\n", BUS(bus)->name);
+if (bus->vdev != NULL) {
+virtio_reset(bus->vdev);
+}
+}
+
+/* Destroy the VirtIODevice */
+void virtio_bus_destroy_device(VirtioBusState *bus)
+{
+DeviceState *qdev;
+BusState *qbus = BUS(bus);
+VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
+DPRINTF("%s : remove device.\n", qbus->name);
+
+if (bus->vdev != NULL) {
+if (klass->device_unplug != NULL) {
+klass->device_unplug(qbus->parent);
+}
+qdev = DEVICE(bus->vdev);
+qdev_free(qdev);
+bus->vdev = NULL;
+}
+}
+
+/* Return the virtio device id of the plugged device. */
+uint16_t get_virtio_device_id(VirtioBusState *bus)
+{
+return bus->vdev->device_id;
+}
+
+static const TypeInfo virtio_bus_info = {
+.name = TYPE_VIRTIO_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(VirtioBusState),
+.abstract = true,
+.class_size = sizeof(VirtioBusClass),
+};
+
+static void virtio_register_types(void)
+{
+type_register_static(&virtio_bus_info);
+}
+
+type_init(virtio_register_types)
diff --git a/hw/virtio-bus.h b/hw/virtio-bus.h
new file mode 100644
index 000..8bb303a
--- /dev/null
+++ b/hw/virtio-bus.h
@@ -0,0 +1,83 @@
+/*
+ * VirtioBus
+ *
+ *  Copyright (C) 2012 : GreenSocs Ltd
+ *  http://www.greensocs.com/ , email: i...@greensocs.com
+ *
+ *  Developed by :
+ *  Frederic Konrad   
+ *
+ * This program is

[Qemu-devel] [RFC PATCH v7 7/8] virtio-pci-blk : Switch to new API.

2012-12-10 Thread fred . konrad
From: KONRAD Frederic 

Here virtio-blk-pci is modified for the new API. The virtio-blk-pci device
extends virtio-pci. It creates and connects a virtio-blk during init.

Signed-off-by: KONRAD Frederic 
---
 hw/virtio-pci.c | 113 +++-
 hw/virtio-pci.h |  14 ++-
 2 files changed, 59 insertions(+), 68 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 8de26fd..776a5b4 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -734,26 +734,6 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice 
*vdev)
 proxy->host_features = vdev->get_features(vdev, proxy->host_features);
 }
 
-static int virtio_blk_init_pci(PCIDevice *pci_dev)
-{
-VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
-VirtIODevice *vdev;
-
-if (proxy->class_code != PCI_CLASS_STORAGE_SCSI &&
-proxy->class_code != PCI_CLASS_STORAGE_OTHER)
-proxy->class_code = PCI_CLASS_STORAGE_SCSI;
-
-vdev = virtio_blk_init(&pci_dev->qdev, &proxy->blk);
-if (!vdev) {
-return -1;
-}
-vdev->nvectors = proxy->nvectors;
-virtio_init_pci(proxy, vdev);
-/* make the actual value visible */
-proxy->nvectors = vdev->nvectors;
-return 0;
-}
-
 static void virtio_exit_pci(PCIDevice *pci_dev)
 {
 VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
@@ -762,15 +742,6 @@ static void virtio_exit_pci(PCIDevice *pci_dev)
 msix_uninit_exclusive_bar(pci_dev);
 }
 
-static void virtio_blk_exit_pci(PCIDevice *pci_dev)
-{
-VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
-
-virtio_pci_stop_ioeventfd(proxy);
-virtio_blk_exit(proxy->vdev);
-virtio_exit_pci(pci_dev);
-}
-
 static int virtio_serial_init_pci(PCIDevice *pci_dev)
 {
 VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
@@ -888,43 +859,6 @@ static void virtio_rng_exit_pci(PCIDevice *pci_dev)
 virtio_exit_pci(pci_dev);
 }
 
-static Property virtio_blk_properties[] = {
-DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
-DEFINE_BLOCK_PROPERTIES(VirtIOPCIProxy, blk.conf),
-DEFINE_BLOCK_CHS_PROPERTIES(VirtIOPCIProxy, blk.conf),
-DEFINE_PROP_STRING("serial", VirtIOPCIProxy, blk.serial),
-#ifdef __linux__
-DEFINE_PROP_BIT("scsi", VirtIOPCIProxy, blk.scsi, 0, true),
-#endif
-DEFINE_PROP_BIT("config-wce", VirtIOPCIProxy, blk.config_wce, 0, true),
-DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, 
VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
-DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
-DEFINE_VIRTIO_BLK_FEATURES(VirtIOPCIProxy, host_features),
-DEFINE_PROP_END_OF_LIST(),
-};
-
-static void virtio_blk_class_init(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
-
-k->init = virtio_blk_init_pci;
-k->exit = virtio_blk_exit_pci;
-k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
-k->device_id = PCI_DEVICE_ID_VIRTIO_BLOCK;
-k->revision = VIRTIO_PCI_ABI_VERSION;
-k->class_id = PCI_CLASS_STORAGE_SCSI;
-dc->reset = virtio_pci_reset;
-dc->props = virtio_blk_properties;
-}
-
-static TypeInfo virtio_blk_info = {
-.name  = "virtio-blk-pci",
-.parent= TYPE_PCI_DEVICE,
-.instance_size = sizeof(VirtIOPCIProxy),
-.class_init= virtio_blk_class_init,
-};
-
 static Property virtio_net_properties[] = {
 DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, 
VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, false),
 DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),
@@ -1243,6 +1177,51 @@ static const TypeInfo virtio_pci_info = {
 .class_size= sizeof(VirtioPCIClass),
 };
 
+/* virtio-blk-pci */
+
+static Property virtio_blk_pci_properties[] = {
+DEFINE_PROP_HEX32("class", VirtIOBlkPCI, parent_obj.class_code, 0),
+DEFINE_BLOCK_PROPERTIES(VirtIOBlkPCI, blk.conf),
+DEFINE_BLOCK_CHS_PROPERTIES(VirtIOBlkPCI, blk.conf),
+DEFINE_PROP_STRING("serial", VirtIOBlkPCI, blk.serial),
+#ifdef __linux__
+DEFINE_PROP_BIT("scsi", VirtIOBlkPCI, blk.scsi, 0, true),
+#endif
+DEFINE_PROP_BIT("config-wce", VirtIOBlkPCI, blk.config_wce, 0, true),
+DEFINE_PROP_BIT("ioeventfd", VirtIOBlkPCI, parent_obj.flags,
+VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
+DEFINE_PROP_UINT32("vectors", VirtIOBlkPCI, parent_obj.nvectors, 2),
+DEFINE_VIRTIO_BLK_FEATURES(VirtIOBlkPCI, parent_obj.host_features),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static int virtio_blk_pci_init(VirtIOPCIProxy *vpci_dev)
+{
+DeviceState *vdev;
+VirtIOBlkPCI *dev = VIRTIO_BLK_PCI(vpci_dev);
+vdev = qdev_create(BUS(vpci_dev->bus), "virtio-blk");
+virtio_blk_set_conf(vdev, &(dev->blk));
+if (qdev_init(vdev) < 0) {
+return -1;
+}
+return 0;
+}
+
+static void virtio_blk_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+Virt

[Qemu-devel] [PATCH] xen: fix trivial PCI passthrough MSI-X bug

2012-12-10 Thread Stefano Stabellini
We are currently passing entry->data as the address parameter. Pass
entry->addr instead.

Signed-off-by: Stefano Stabellini 
Tested-by: Sander Eikelenboom 
Xen-devel: http://marc.info/?l=xen-devel&m=135515462613715

diff --git a/hw/xen_pt_msi.c b/hw/xen_pt_msi.c
index 6807672..db757cd 100644
--- a/hw/xen_pt_msi.c
+++ b/hw/xen_pt_msi.c
@@ -321,7 +321,7 @@ static int xen_pt_msix_update_one(XenPCIPassthroughState 
*s, int entry_nr)
 
 pirq = entry->pirq;
 
-rc = msi_msix_setup(s, entry->data, entry->data, &pirq, true, entry_nr,
+rc = msi_msix_setup(s, entry->addr, entry->data, &pirq, true, entry_nr,
 entry->pirq == XEN_PT_UNASSIGNED_PIRQ);
 if (rc) {
 return rc;



Re: [Qemu-devel] [PATCH v7 00/10] i8254, i8259 and running Microport UNIX (ca 1987)

2012-12-10 Thread Anthony Liguori
Jan Kiszka  writes:

> On 2012-12-10 06:14, Matthew Ogilvie wrote:
>> On Sun, Nov 25, 2012 at 02:51:36PM -0700, Matthew Ogilvie wrote:
>>> This series makes a series of mostly-unrelated fixes to allow
>>> running an old Microport UNIX (ca 1987) guest under qemu.
>>>
>>> Changes since version 6:
>>>* Patches 1 through 6 haven't changed, other than resolving
>>>  a couple of simple conflicts.
>>>* Patch 7 "fixes" IRQ0 by just making it work like before,
>>>  rather than fixing it properly.  This avoids possible risk
>>>  to cross-version migration, etc.
>>>* Patches 8, 9, and 10 provide one possible gradual transition path
>>>  to properly fix the 8254 model with relatively little risk to
>>>  migration/etc.  The idea is that 8 and 9 could be applied
>>>  immediately in preparation for a future fix, and then the
>>>  actual fix (10) could be applied sometime in the future when
>>>  migrating to or from pre-patch-9 versions is no longer a concern.
>>> I am not actually aware of ANY guest that actually needs
>>>  an improved 8254 model, but this provides one way to improve
>>>  it if desired.
>>>
>> 
>> Ping?
>> 
>> What would it take to get some variation of this series
>> into 1.4?  The last feedback I've seen was against version 5, back
>> in September.
>> http://search.gmane.org/?query=ogilvie&group=gmane.comp.emulators.qemu
>
> I suppose it's primarily a question of time for some reviewer(s). Sorry,
> I wasn't able to look at it yet, maybe I will have a chance next week.

If you added a test case for the i8254 using the mc146818rtc qtest test
case as an example, you would very likely attract more reviewers.

It would also make it easier to ensure that the issues you're fixing
here don't regress in the future too.

Regards,

Anthony Liguori

>
>> 
>>> 
>>> Split up this series?
>>>
>>> I'm not sure what the next steps are to get these into qemu, other
>>> than waiting for 1.4 for at least the non-trivial parts?
>>>
>>> Patches 1 through 3 could be considered independent trivial patches.
>>> Would splitting them apart improve the chances of getting them into qemu?
>>>
>>> Patch 4 isn't quite trivial, but it is well isolated (other than
>>> small documentation conflicts against patch 3).  Should it be split
>>> off?  It hasn't changed since version 3, but nobody has really
>>> commented on it.
>>>
>>> Patches 5 through 10 are interrelated, and should remain related in
>>> a series.
>>>
>>> 
>>> Still needed:
>>>
>>>   * Corresponding KVM patches.  The best approach may depend
>>> on what option is selected for qemu above.
>>>  * Note that KVM uses a simplified model that doesn't try
>>>to emulate the trailing edge of the interrupt very well
>>>at all.  I'm not proposing to change this aspect of it.
>>>  * A patch analogous to 7 should be easy.
>>>  * Patches 8 through 10 are also fairly easy by themselves.
>>>But now we start having an explosion of combinations
>>>of versions of KVM and qemu and migration to/from, and it
>>>might be better to:
>>>  * Or more involved fixes would involve new ioctl()'s and
>>>command line arguments to select old or fixed 8254 models
>>>dynamically.  See below.
>> 
>> Any preferences?
>
> As Avi left, I'm putting Gleb and Marcelo on CC.
>
>> 
>>>
>>> 
>>> Alternative options for improving the i8254 model and migration:
>>>
>>> 1. Don't fix 8254 at all.  Just apply through patch 7 or 8, and don't try
>>>to make any additional fixes.  I don't know of any guests that need
>>>improvements, so this could be a viable option.
>> 
>> Or:
>> 1.1. Don't fix any 8259 lines either, except for the one line (IRQ2) that
>>  is giving me trouble.  (Recall that the original problem is the guest
>>  masking off IRQ14 in the 8259, and the resulting IRQ2 trailing edge
>>  isn't handled correctly in the master 8259, resulting in a
>>  spurious interrupt.)
>> 
>>>
>>> 2. Just fix it immediately, and don't worry about migration.  Squash
>>>the last few patches together.  A single missed periodic
>>>timer tick that only happens when migrating
>>>between versions of qemu is probably not a significant
>>>concern.  (Unless someone knows of an OS that actually runs
>>>the i8254 in single shot mode 4, where a missed interrupt
>>>could cause a hang or something?)
>>>
>>> 3. Use patches 8 and 9 now, and patch 10 sometime in the future.
>>>If it was just qemu, this would be attractive.  But when you
>>>also need to worry about a bunch of combinations of versions of
>>>qemu and KVM and migration, this is looking less attractive.
>>>
>>> 4. Support both old and fixed i8254 models, selectable at runtime
>>>with a command line option.  (Question: What should such an
>>>option look like?)  This may be the best way to actually
>>>change the 8254, but I'm not sure 

Re: [Qemu-devel] [PULL] Memory API ioport cleanups

2012-12-10 Thread Anthony Liguori
Andreas Färber  writes:

> Hello,
>
> As coordinated with Avi and Gerd, here's some ioport conversions to Memory API.
>
> Cc: Avi Kivity 
> Cc: Gerd Hoffmann 
> Cc: Julien Grall 
> Cc: Jason Baron 
>
>

Pulled. Thanks.

Regards,

Anthony Liguori

> The following changes since commit 16c6c80ac3a772b42a87b77dfdf0fdac7c607b0e:
>
>   Open up 1.4 development branch (2012-12-03 14:08:40 -0600)
>
> are available in the git repository at:
>
>   git://github.com/afaerber/qemu-cpu.git memory-ioport
>
> for you to fetch changes up to 582299336879504353e60c7937fbc70fea93f3da:
>
>   hw/dma.c: Replace register_ioport_* (2012-12-04 14:50:22 +0100)
>
> 
> Julien Grall (6):
>   isa: Add isa_address_space_io()
>   hw/apm.c: Replace register_ioport_*
>   hw/cirrus_vga.c: Replace register_ioport_*
>   serial: Replace register_ioport_*
>   hw/pc.c: Replace register_ioport_*
>   hw/dma.c: Replace register_ioport_*
>
>  hw/acpi_piix4.c   |2 +-
>  hw/apm.c  |   23 +---
>  hw/apm.h  |5 ++-
>  hw/cirrus_vga.c   |   48 ++--
>  hw/dma.c  |  106 
> +++--
>  hw/isa-bus.c  |9 +
>  hw/isa.h  |1 +
>  hw/lpc_ich9.c |2 +-
>  hw/mips_mipssim.c |3 +-
>  hw/pc.c   |   49 -
>  hw/serial.c   |4 +-
>  hw/serial.h   |2 +-
>  hw/vt82c686.c |2 +-
>  13 files changed, 178 insertions(+), 78 deletions(-)



Re: [Qemu-devel] [PULL] QOM CPUState patch queue 2012-12-06

2012-12-10 Thread Anthony Liguori
Andreas Färber  writes:

> Hello,
>
> This is my current QOM CPU patch queue. Please pull.
>
> Regards,
> Andreas

Pulled. Thanks.

Regards,

Anthony Liguori

>
> Cc: Eduardo Habkost 
> Cc: Igor Mammedov 
> Cc: Paolo Bonzini 
>
>
> The following changes since commit 19e6c50d2d843220efbdd3b2db21d83c122c364a:
>
>   target-mips: Fix incorrect shift for SHILO and SHILOV (2012-12-06 08:12:14 +0100)
>
> are available in the git repository at:
>
>   git://github.com/afaerber/qemu-cpu.git qom-cpu
>
> for you to fetch changes up to d7f57a46d07c0a72295a56704ab0fecefb2aaea8:
>
>   target-i386: Postpone cpuid_level update to realize time (2012-12-06 09:17:06 +0100)
>
> 
> Eduardo Habkost (11):
>   user: Move *-user/qemu-types.h to main directory
>   user: Rename qemu-types.h to qemu-user-types.h
>   ui/vnc-palette.c: Include headers it needs
>   qapi/qmp-registry.c: Include headers it needs
>   qga/channel-posix.c: Include headers it needs
>   qlist.h: Do not include qemu-common.h
>   Create qemu-types.h for struct typedefs
>   sysemu.h: Include qemu-types.h instead of qemu-common.h
>   qdev: qdev_create(): use error_report() instead of hw_error()
>   target-i386/cpu.c: Coding style fixes
>   target-i386: Separate feature string parsing from CPU model lookup
>
> Igor Mammedov (2):
>   target-i386: Use define for cpuid vendor string size
>   target-i386: Postpone cpuid_level update to realize time
>
>  bsd-user/qemu-types.h|   24 ---
>  bsd-user/qemu.h  |2 +-
>  cpu-all.h|2 +-
>  hw/qdev-core.h   |   11 +--
>  hw/qdev.c|8 ++-
>  linux-user/qemu.h|2 +-
>  qapi/qmp-registry.c  |2 +
>  qemu-common.h|   52 +-
>  qemu-types.h |   61 
>  linux-user/qemu-types.h => qemu-user-types.h |0
>  qga/channel-posix.c  |5 ++
>  qlist.h  |1 -
>  sysemu.h |2 +-
>  target-i386/cpu.c|  100 
> --
>  target-i386/cpu.h|2 +
>  ui/vnc-palette.c |2 +
>  16 files changed, 144 insertions(+), 132 deletions(-)
>  delete mode 100644 bsd-user/qemu-types.h
>  create mode 100644 qemu-types.h
>  rename linux-user/qemu-types.h => qemu-user-types.h (100%)



Re: [Qemu-devel] [PULL 00/13] Trivial patches for 2 to 7 November

2012-12-10 Thread Anthony Liguori
Stefan Hajnoczi  writes:

> These patches were mostly submitted during the 1.3 hard freeze.  I'll catch up
> with the new trivial patches next week.
>
> The following changes since commit 80625b97b52836b944a6438e8e3e9d992e6a00b6:
>
>   xilinx_uartlite: Accept input after rx FIFO pop (2012-12-05 09:20:36 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git trivial-patches
>

Pulled. Thanks.

Regards,

Anthony Liguori

> for you to fetch changes up to 654598c944aa31cdbea435bd468055af9c918d16:
>
>   pc_sysfw: Plug memory leak on pc_fw_add_pflash_drv() error path (2012-12-07 12:34:12 +0100)
>
> 
> Markus Armbruster (2):
>   Clean up pci_drive_hot_add()'s use of BlockInterfaceType
>   pc_sysfw: Plug memory leak on pc_fw_add_pflash_drv() error path
>
> Michal Privoznik (1):
>   qemu-options: Fix space at EOL
>
> Petar Jovanovic (2):
>   target-mips: Fix incorrect code and test for INSV
>   target-mips: Fix incorrect shift for SHILO and SHILOV
>
> Peter Crosthwaite (2):
>   sd: Send debug printfery to stderr not stdout
>   arm: a9mpcore: remove un-used ptimer_iomem field
>
> Peter Maydell (1):
>   configure: Remove stray debug output
>
> Richard Henderson (3):
>   target-alpha: Remove t0, t1 from CPUAlphaState
>   target-m68k: Remove t1 from CPUM68KState
>   target-sparc: Remove t0, t1 from CPUSPARCState
>
> Stefan Weil (4):
>   Fix spelling (prefered -> preferred)
>   Fix comments (adress -> address, layed -> laid, wierd -> weird)
>   s390x: Spelling fixes (endianess -> endianness, occured -> occurred)
>   Fix spelling in comments and documentation
>
>  configure  |  4 +---
>  hw/a9mpcore.c  |  1 -
>  hw/device-hotplug.c| 11 ---
>  hw/pc_sysfw.c  |  4 +++-
>  hw/pci-hotplug.c   |  7 +++
>  hw/s390x/sclp.h|  4 ++--
>  hw/s390x/sclpconsole.c |  4 ++--
>  hw/sd.c|  4 ++--
>  hw/usb.h   |  6 +++---
>  net/tap-win32.c|  7 ---
>  qemu-options.hx|  2 +-
>  slirp/ip_icmp.c|  2 +-
>  sysemu.h   |  3 +--
>  target-alpha/cpu.h |  7 ---
>  target-m68k/cpu.h  |  3 ---
>  target-mips/dsp_helper.c   | 19 ++-
>  target-sparc/cpu.h |  1 -
>  tcg/tcg.h  |  4 ++--
>  tests/qemu-iotests/iotests.py  |  2 +-
>  tests/tcg/mips/mips32-dsp/insv.c   |  2 +-
>  tests/tcg/mips/mips32-dsp/shilo.c  | 18 ++
>  tests/tcg/mips/mips32-dsp/shilov.c | 20 
>  uri.c  |  4 ++--
>  23 files changed, 81 insertions(+), 58 deletions(-)
>
> -- 
> 1.8.0.1




Re: [Qemu-devel] [PULL 00/18] acpi: switch to memory api

2012-12-10 Thread Anthony Liguori
Gerd Hoffmann  writes:

>   Hi,
>
> Same patches as posted last week.  No review comments, 1.4 tree
> open, so it should be ready to go in now.
>
> cheers,
>   Gerd
>
> The following changes since commit 16c6c80ac3a772b42a87b77dfdf0fdac7c607b0e:
>
>   Open up 1.4 development branch (2012-12-03 14:08:40 -0600)

Pulled. Thanks.

Regards,

Anthony Liguori

>
> are available in the git repository at:
>   git://git.kraxel.org/qemu acpi.1
>
> Gerd Hoffmann (18):
>   apci: switch piix4 to memory api
>   apci: switch ich9 to memory api
>   apci: switch vt82c686 to memory api
>   apci: switch timer to memory api
>   apci: switch cnt to memory api
>   apci: switch evt to memory api
>   acpi: cleanup piix4 memory region
>   acpi: cleanup vt82c686 memory region
>   apci: switch ich9 gpe to memory api
>   apci: switch ich9 smi to memory api
>   acpi: cleanup ich9 memory region
>   acpi: switch smbus to memory api
>   acpi: fix piix4 smbus mapping
>   apci: switch piix4 gpe to memory api
>   acpi: remove acpi_gpe_blk
>   apci: switch piix4 pci hotplug to memory api
>   q35: update lpc pci config space according to configured devices
>   acpi: drop debug port
>
>  hw/acpi.c   |  113 ++-
>  hw/acpi.h   |   18 +++--
>  hw/acpi_ich9.c  |  199 ++
>  hw/acpi_ich9.h  |3 +
>  hw/acpi_piix4.c |  183 +++
>  hw/ich9.h   |1 +
>  hw/lpc_ich9.c   |   29 
>  hw/pm_smbus.c   |   17 -
>  hw/pm_smbus.h   |3 +-
>  hw/smbus_ich9.c |   64 --
>  hw/vt82c686.c   |  102 +
>  11 files changed, 317 insertions(+), 415 deletions(-)



Re: [Qemu-devel] [Xen-devel] [PATCH 0/2] QEMU/xen: simplify cpu_ioreq_pio and cpu_ioreq_move

2012-12-10 Thread Ian Jackson
Stefano Stabellini writes ("Re: [Xen-devel] [PATCH 0/2] QEMU/xen: simplify cpu_ioreq_pio and cpu_ioreq_move"):
> On Fri, 7 Dec 2012, Ian Jackson wrote:
...
> > +if (req->df) addr -= offset;
> > +else addr -= offset;
> 
> This can't be right, can it?

Indeed not.  v2 has this fixed.

> The search/replace changes below look correct.

Thanks.

> For the sake of consistency, could you please send a patch against
> upstream QEMU to qemu-devel? The corresponding code is in xen-all.c
> (cpu_ioreq_pio and cpu_ioreq_move).

I will do that.

Thanks,
Ian.



Re: [Qemu-devel] [PULL] QOM CPUState patch queue 2012-12-06

2012-12-10 Thread Paolo Bonzini
On 10/12/2012 17:57, Anthony Liguori wrote:
> Andreas Färber  writes:
> 
>> Hello,
>>
>> This is my current QOM CPU patch queue. Please pull.
>>
>> Regards,
>> Andreas
> 
> Pulled. Thanks.

Thanks, I'll rebase the directory stuff as soon as possible; please help
review it in the meantime! :)

Paolo




[Qemu-devel] [PATCH v2] kvm: Detect number of available memory slots from the host kernel

2012-12-10 Thread Alex Williamson
The kernel already exposes an interface for this, x86 returns a proper
value and for the rest we can default to the de facto standard of 32.
The primary motivation for this is to support more PCI assigned
devices, both through pci-assign and vfio-pci.

Signed-off-by: Alex Williamson 
---
v2: Test (!num_slots) instead of (num_slots <= 0)

 kvm-all.c |   27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 8e9a8d8..74cf0d6 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -71,7 +71,8 @@ typedef struct kvm_dirty_log KVMDirtyLog;
 
 struct KVMState
 {
-KVMSlot slots[32];
+KVMSlot *slots;
+int num_slots;
 int fd;
 int vmfd;
 int coalesced_mmio;
@@ -120,7 +121,7 @@ static KVMSlot *kvm_alloc_slot(KVMState *s)
 {
 int i;
 
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+for (i = 0; i < s->num_slots; i++) {
 if (s->slots[i].memory_size == 0) {
 return &s->slots[i];
 }
@@ -136,7 +137,7 @@ static KVMSlot *kvm_lookup_matching_slot(KVMState *s,
 {
 int i;
 
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+for (i = 0; i < s->num_slots; i++) {
 KVMSlot *mem = &s->slots[i];
 
 if (start_addr == mem->start_addr &&
@@ -158,7 +159,7 @@ static KVMSlot *kvm_lookup_overlapping_slot(KVMState *s,
 KVMSlot *found = NULL;
 int i;
 
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+for (i = 0; i < s->num_slots; i++) {
 KVMSlot *mem = &s->slots[i];
 
 if (mem->memory_size == 0 ||
@@ -180,7 +181,7 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void *ram,
 {
 int i;
 
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+for (i = 0; i < s->num_slots; i++) {
 KVMSlot *mem = &s->slots[i];
 
 if (ram >= mem->ram && ram < mem->ram + mem->memory_size) {
@@ -340,7 +341,7 @@ static int kvm_set_migration_log(int enable)
 
 s->migration_log = enable;
 
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
+for (i = 0; i < s->num_slots; i++) {
 mem = &s->slots[i];
 
 if (!mem->memory_size) {
@@ -1268,9 +1269,6 @@ int kvm_init(void)
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 QTAILQ_INIT(&s->kvm_sw_breakpoints);
 #endif
-for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
-s->slots[i].slot = i;
-}
 s->vmfd = -1;
 s->fd = qemu_open("/dev/kvm", O_RDWR);
 if (s->fd == -1) {
@@ -1324,6 +1322,16 @@ int kvm_init(void)
 goto err;
 }
 
+s->num_slots = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
+if (!s->num_slots) {
+s->num_slots = 32;
+}
+s->slots = g_malloc0(s->num_slots * sizeof(*s->slots));
+
+for (i = 0; i < s->num_slots; i++) {
+s->slots[i].slot = i;
+}
+
 s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);
 
 s->broken_set_mem_region = 1;
@@ -1393,6 +1401,7 @@ err:
 if (s->fd != -1) {
 close(s->fd);
 }
+g_free(s->slots);
 g_free(s);
 
 return ret;
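
For readers skimming the diff, the pattern it introduces (ask the host how many slots it has, fall back to 32, then size the table dynamically) can be sketched in isolation. This is only a rough, standalone illustration and not the kvm-all.c code: SlotTable, slot_table_init() and the "reported" parameter are hypothetical stand-ins for KVMState and the value returned by kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS). The sketch also treats negative values as "not reported", which is a defensive assumption of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFAULT_NUM_SLOTS 32   /* the de facto default the patch falls back to */

/* Hypothetical, trimmed-down stand-in for KVMSlot. */
typedef struct Slot {
    int slot;             /* index of this slot in the table */
    size_t memory_size;   /* 0 means the slot is free */
} Slot;

/* Hypothetical stand-in for the slot-related part of KVMState. */
typedef struct SlotTable {
    Slot *slots;
    int num_slots;
} SlotTable;

/* 'reported' models what the host kernel said about its slot count;
 * 0 (or, in this sketch, anything negative) means "not reported". */
static int slot_table_init(SlotTable *t, int reported)
{
    int i;

    t->num_slots = reported > 0 ? reported : DEFAULT_NUM_SLOTS;
    t->slots = calloc((size_t)t->num_slots, sizeof(*t->slots));
    if (!t->slots) {
        t->num_slots = 0;
        return -1;
    }
    for (i = 0; i < t->num_slots; i++) {
        t->slots[i].slot = i;
    }
    return 0;
}

/* Same loop shape as kvm_alloc_slot(): bounded by num_slots, not ARRAY_SIZE. */
static Slot *slot_table_alloc(SlotTable *t)
{
    int i;

    for (i = 0; i < t->num_slots; i++) {
        if (t->slots[i].memory_size == 0) {
            return &t->slots[i];
        }
    }
    return NULL;
}

static void slot_table_free(SlotTable *t)
{
    free(t->slots);
    t->slots = NULL;
    t->num_slots = 0;
}
```

Calling slot_table_init(&t, 0) yields a 32-entry table, while a non-zero value from the host sizes the table accordingly; every loop over the slots then has to use num_slots, which is why the patch touches each ARRAY_SIZE(s->slots) user.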




Re: [Qemu-devel] [PATCH V17 4/6] rename qcow2-cache.c to block-cache.c

2012-12-10 Thread Kevin Wolf
On 06.12.2012 07:51, Dong Xu Wang wrote:
> We will re-use qcow2-cache as common block layer cache code,
> so rename it and make some changes: define a struct named
> BlockTableType, and pass BlockTableType and table size parameters to
> the block cache initialization function.
> 
> Signed-off-by: Dong Xu Wang 
> ---
>  block/Makefile.objs|3 +-
>  block/block-cache.c|  317 +++
>  block/block-cache.h|   76 +++
>  block/qcow2-cache.c|  323 
> 
>  block/qcow2-cluster.c  |   53 
>  block/qcow2-refcount.c |   66 ++-
>  block/qcow2.c  |   21 ++--
>  block/qcow2.h  |   24 +---
>  trace-events   |   13 +-
>  9 files changed, 481 insertions(+), 415 deletions(-)
>  create mode 100644 block/block-cache.c
>  create mode 100644 block/block-cache.h
>  delete mode 100644 block/qcow2-cache.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index 7f01510..d23c250 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -1,5 +1,6 @@
>  block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
> +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
> +block-obj-y += block-cache.o
>  block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>  block-obj-y += qed-check.o
>  block-obj-y += parallels.o blkdebug.o blkverify.o
> diff --git a/block/block-cache.c b/block/block-cache.c
> new file mode 100644
> index 000..bf5c57c
> --- /dev/null
> +++ b/block/block-cache.c
> @@ -0,0 +1,317 @@
> +/*
> + * QEMU Block Layer Cache
> + *
> + * Copyright IBM, Corp. 2012
> + *
> + * Authors:
> + *  Dong Xu Wang 
> + *
> + * This file is based on qcow2-cache.c, see its copyrights below:
> + *
> + * L2/refcount table cache for the QCOW2 format
> + *
> + * Copyright (c) 2010 Kevin Wolf 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "block_int.h"
> +#include "qemu-common.h"
> +#include "trace.h"
> +#include "block-cache.h"
> +
> +BlockCache *block_cache_create(BlockDriverState *bs, int num_tables,
> +   size_t cluster_size, BlockTableType type)

cluster_size should probably be called table_size. It just happens that
qcow2 tables are always one cluster, but it may be different for other
formats using this implementation in the future.

> +int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table)
> +{
> +int i;
> +
> +for (i = 0; i < c->size; i++) {
> +if (c->entries[i].table == *table) {
> +goto found;
> +}
> +}
> +return -ENOENT;
> +
> +found:
> +c->entries[i].ref--;
> +assert(c->entries[i].ref >= 0);
> +*table = NULL;
> +return 0;
> +}

Why did you swap the assert() and *table = NULL?

> diff --git a/block/block-cache.h b/block/block-cache.h
> new file mode 100644
> index 000..4efa06e
> --- /dev/null
> +++ b/block/block-cache.h
> @@ -0,0 +1,76 @@
> +/*
> + * QEMU Block Layer Cache
> + *
> + * Copyright IBM, Corp. 2012
> + *
> + * Authors:
> + *  Dong Xu Wang 
> + *
> + * This file is based on qcow2-cache.c, see its copyrights below:
> + *
> + * L2/refcount table cache for the QCOW2 format
> + *
> + * Copyright (c) 2010 Kevin Wolf 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> 

Re: [Qemu-devel] [PATCH V17 6/6] qemu-iotests: add add-cow iotests support.

2012-12-10 Thread Kevin Wolf
On 06.12.2012 07:51, Dong Xu Wang wrote:
> This patch will use qemu-iotests to test add-cow file format.
> 
> Signed-off-by: Dong Xu Wang 
> ---
>  tests/qemu-iotests/017   |2 +-
>  tests/qemu-iotests/020   |2 +-
>  tests/qemu-iotests/common|6 ++
>  tests/qemu-iotests/common.rc |   15 ++-
>  4 files changed, 22 insertions(+), 3 deletions(-)

I think at least tests 037 and 038 would work as well.

Kevin



Re: [Qemu-devel] [PATCH qom-cpu v2 0/5] target-alpha: CPU subclasses

2012-12-10 Thread Richard Henderson
On 12/08/2012 05:40 PM, Andreas Färber wrote:
> Andreas Färber (5):
>   target-alpha: Let cpu_alpha_init() return AlphaCPU
>   alpha: Pass AlphaCPU array to Typhoon
>   target-alpha: Avoid leaking the alarm timer over reset
>   target-alpha: Turn CPU definitions into subclasses
>   target-alpha: Add support for -cpu ?

Looks ok.

Acked-by: Richard Henderson 


r~



[Qemu-devel] [RFC PATCH v7 4/8] virtio-pci : Refactor virtio-pci device.

2012-12-10 Thread fred . konrad
From: KONRAD Frederic 

Create the virtio-pci device. This transport device will create a
virtio-pci-bus, so one VirtIODevice can be connected.

Signed-off-by: KONRAD Frederic 
---
 hw/virtio-pci.c | 127 
 hw/virtio-pci.h |  19 +
 2 files changed, 146 insertions(+)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 5ac8d0d..8de26fd 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -1119,6 +1119,130 @@ static TypeInfo virtio_scsi_info = {
 .class_init= virtio_scsi_class_init,
 };
 
+/*
+ * virtio-pci : This is the PCIDevice which has a virtio-pci-bus.
+ */
+
+/* This is called by virtio-bus just after the device is plugged. */
+static void virtio_pci_device_plugged(void *opaque)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(opaque);
+uint8_t *config;
+uint32_t size;
+
+/* Put the PCI IDs */
+switch (get_virtio_device_id(proxy->bus)) {
+
+case VIRTIO_ID_BLOCK:
+pci_config_set_device_id(proxy->pci_dev.config,
+ PCI_DEVICE_ID_VIRTIO_BLOCK);
+pci_config_set_class(proxy->pci_dev.config, PCI_CLASS_STORAGE_SCSI);
+break;
+default:
+error_report("unknown device id\n");
+break;
+
+}
+
+/* TODO: vdev should be accessed through virtio-bus functions. */
+proxy->vdev = proxy->bus->vdev;
+config = proxy->pci_dev.config;
+
+if (proxy->class_code) {
+pci_config_set_class(config, proxy->class_code);
+}
+pci_set_word(config + PCI_SUBSYSTEM_VENDOR_ID,
+ pci_get_word(config + PCI_VENDOR_ID));
+pci_set_word(config + PCI_SUBSYSTEM_ID, get_virtio_device_id(proxy->bus));
+config[PCI_INTERRUPT_PIN] = 1;
+
+if (proxy->bus->vdev->nvectors &&
+msix_init_exclusive_bar(&proxy->pci_dev, proxy->bus->vdev->nvectors,
+1)) {
+proxy->bus->vdev->nvectors = 0;
+}
+
+proxy->pci_dev.config_write = virtio_write_config;
+
+size = VIRTIO_PCI_REGION_SIZE(&proxy->pci_dev)
+ + proxy->bus->vdev->config_len;
+if (size & (size-1)) {
+size = 1 << qemu_fls(size);
+}
+
+memory_region_init_io(&proxy->bar, &virtio_pci_config_ops, proxy,
+  "virtio-pci", size);
+pci_register_bar(&proxy->pci_dev, 0, PCI_BASE_ADDRESS_SPACE_IO,
+ &proxy->bar);
+
+if (!kvm_has_many_ioeventfds()) {
+proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
+}
+
+proxy->host_features |= 0x1 << VIRTIO_F_NOTIFY_ON_EMPTY;
+proxy->host_features |= 0x1 << VIRTIO_F_BAD_FEATURE;
+proxy->host_features = proxy->bus->vdev->get_features(proxy->bus->vdev,
+  proxy->host_features);
+}
+
+/* This is called by virtio-bus just before the device is unplugged. */
+static void virtio_pci_device_unplug(void *opaque)
+{
+VirtIOPCIProxy *dev = VIRTIO_PCI(opaque);
+virtio_pci_stop_ioeventfd(dev);
+}
+
+static int virtio_pci_init(PCIDevice *pci_dev)
+{
+VirtIOPCIProxy *dev = VIRTIO_PCI(pci_dev);
+VirtioPCIClass *k = VIRTIO_PCI_GET_CLASS(pci_dev);
+dev->bus = virtio_pci_bus_new(dev);
+if (k->init != NULL) {
+return k->init(dev);
+}
+return 0;
+}
+
+static void virtio_pci_exit(PCIDevice *pci_dev)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
+VirtioBusState *bus = VIRTIO_BUS(proxy->bus);
+BusState *qbus = BUS(proxy->bus);
+virtio_bus_destroy_device(bus);
+qbus_free(qbus);
+}
+
+static void virtio_pci_rst(DeviceState *qdev)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(qdev);
+VirtioBusState *bus = VIRTIO_BUS(proxy->bus);
+virtio_pci_stop_ioeventfd(proxy);
+virtio_bus_reset(bus);
+msix_unuse_all_vectors(&proxy->pci_dev);
+proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
+}
+
+static void virtio_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = virtio_pci_init;
+k->exit = virtio_pci_exit;
+k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+k->revision = VIRTIO_PCI_ABI_VERSION;
+k->class_id = PCI_CLASS_OTHERS;
+dc->reset = virtio_pci_rst;
+}
+
+static const TypeInfo virtio_pci_info = {
+.name  = TYPE_VIRTIO_PCI,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(VirtIOPCIProxy),
+.class_init= virtio_pci_class_init,
+.class_size= sizeof(VirtioPCIClass),
+};
+
 /* virtio-pci-bus */
 
 VirtioBusState *virtio_pci_bus_new(VirtIOPCIProxy *dev)
@@ -1145,6 +1269,8 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, void *data)
 k->set_host_notifier = virtio_pci_set_host_notifier;
 k->set_guest_notifiers = virtio_pci_set_guest_notifiers;
 k->vmstate_change = virtio_pci_vmstate_change;
+k->device_plugged = virtio_pci_device_plugged;
+k->device_unplug = virtio_pci_device_unplug;
 }
 
 static const TypeInfo virtio_p

[Qemu-devel] [PATCH v2] tcg-i386: Perform cmov detection at runtime for 32-bit.

2012-12-10 Thread Richard Henderson
Cc: Aurelien Jarno 
Cc: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 configure | 19 +++
 tcg/i386/tcg-target.c | 31 ++-
 tcg/i386/tcg-target.h |  5 -
 3 files changed, 49 insertions(+), 6 deletions(-)

Changes v1->v2:
  * Configure-time detection for cpuid.h.
  * Don't do any checking for preprocessor defines that "imply" cmov.


r~



diff --git a/configure b/configure
index e5aedef..1231600 100755
--- a/configure
+++ b/configure
@@ -3071,6 +3071,21 @@ if compile_prog "" "" ; then
 has_environ=yes
 fi
 
+
+# check if cpuid.h is usable.
+
+cpuid_h=no
+cat > $TMPC << EOF
+#include <cpuid.h>
+int main(void) {
+  return 0;
+}
+EOF
+if compile_prog "" "" ; then
+cpuid_h=yes
+fi
+
+
 ##
 # End of CC checks
 # After here, no more $cc or $ld runs
@@ -3601,6 +3616,10 @@ if test "$has_environ" = "yes" ; then
   echo "CONFIG_HAS_ENVIRON=y" >> $config_host_mak
 fi
 
+if test "$cpuid_h" = "yes" ; then
+  echo "CONFIG_CPUID_H=y" >> $config_host_mak
+fi
+
 if test "$glusterfs" = "yes" ; then
   echo "CONFIG_GLUSTERFS=y" >> $config_host_mak
 fi
diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index 6f3ad3c..8d603a5 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -97,6 +97,18 @@ static const int tcg_target_call_oarg_regs[] = {
 # define TCG_REG_L1 TCG_REG_EDX
 #endif
 
+/* For 32-bit, we are going to attempt to determine at runtime whether cmov
+   is available.  However, the host compiler must supply <cpuid.h>, as we're
+   not going to go so far as our own inline assembly.  */
+#if TCG_TARGET_REG_BITS == 64
+# define have_cmov 1
+#elif defined(CONFIG_CPUID_H)
+#include <cpuid.h>
+static bool have_cmov;
+#else
+# define have_cmov 0
+#endif
+
 static uint8_t *tb_ret_addr;
 
 static void patch_reloc(uint8_t *code_ptr, int type,
@@ -943,7 +955,14 @@ static void tcg_out_movcond32(TCGContext *s, TCGCond cond, TCGArg dest,
   TCGArg v1)
 {
 tcg_out_cmp(s, c1, c2, const_c2, 0);
-tcg_out_modrm(s, OPC_CMOVCC | tcg_cond_to_jcc[cond], dest, v1);
+if (have_cmov) {
+tcg_out_modrm(s, OPC_CMOVCC | tcg_cond_to_jcc[cond], dest, v1);
+} else {
+int over = gen_new_label();
+tcg_out_jxx(s, tcg_cond_to_jcc[tcg_invert_cond(cond)], over, 1);
+tcg_out_mov(s, TCG_TYPE_I32, dest, v1);
+tcg_out_label(s, over, s->code_ptr);
+}
 }
 
 #if TCG_TARGET_REG_BITS == 64
@@ -2243,6 +2262,16 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
 static void tcg_target_init(TCGContext *s)
 {
+/* For 32-bit, 99% certainty that we're running on hardware that supports
+   cmov, but we still need to check.  In case cmov is not available, we'll
+   use a small forward branch.  */
+#ifndef have_cmov
+{
+unsigned a, b, c, d;
+have_cmov = (__get_cpuid(1, &a, &b, &c, &d) && (d & bit_CMOV));
+}
+#endif
+
 #if !defined(CONFIG_USER_ONLY)
 /* fail safe */
 if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry))
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index dbc6756..450078b 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -90,12 +90,7 @@ typedef enum {
 #define TCG_TARGET_HAS_nand_i32 0
 #define TCG_TARGET_HAS_nor_i32  0
 #define TCG_TARGET_HAS_deposit_i32  1
-#if defined(__x86_64__) || defined(__i686__)
-/* Use cmov only if the compiler is already doing so.  */
 #define TCG_TARGET_HAS_movcond_i32  1
-#else
-#define TCG_TARGET_HAS_movcond_i32  0
-#endif
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_div2_i64 1
-- 
1.7.11.7
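
As a standalone illustration of the runtime check in the patch above, the CPUID query can be tried outside of TCG. This is only a sketch under a few assumptions: it relies on the compiler-provided <cpuid.h> (GCC and Clang ship one with __get_cpuid() and bit_CMOV), it answers "no cmov" on non-x86 hosts, and host_has_cmov() and movcond_model() are hypothetical names that do not appear in the patch. movcond_model() only models the value a movcond produces on each of the two paths; it does not emit any code.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* compiler header providing __get_cpuid() and bit_CMOV */
#endif

/* Report whether the host CPU advertises CMOV (CPUID leaf 1, EDX).
 * On non-x86 hosts this sketch simply answers false. */
static bool host_has_cmov(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int a, b, c, d;

    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    return __get_cpuid(1, &a, &b, &c, &d) && (d & bit_CMOV) != 0;
#else
    return false;
#endif
}

/* Hypothetical model of the value movcond computes: dest becomes v1 when
 * the condition holds and is left unchanged otherwise.  The branchy path
 * mirrors the patch's fallback: jump over the move on the inverted
 * condition. */
static int movcond_model(bool have_cmov, bool cond, int dest, int v1)
{
    if (have_cmov) {
        return cond ? v1 : dest;   /* single conditional select */
    }
    if (!cond) {
        goto over;                 /* short forward branch */
    }
    dest = v1;
over:
    return dest;
}
```

Both paths must give the same result for every input, which is what makes the have_cmov switch safe to decide once in tcg_target_init().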




Re: [Qemu-devel] [PATCH 2/2] target-i386: use visit_type_unit_suffixed_int() to parse tsc_freq property value

2012-12-10 Thread mdroth
On Mon, Dec 10, 2012 at 05:13:45PM +0100, Igor Mammedov wrote:
> On Fri, 7 Dec 2012 18:09:06 -0200
> Eduardo Habkost  wrote:
> 
> > On Fri, Dec 07, 2012 at 08:00:09PM +0100, Andreas Färber wrote:
> > > Am 06.12.2012 22:12, schrieb Igor Mammedov:
> > > > Signed-off-by: Igor Mammedov 
> > > > ---
> > > >   v2:
> > > >- replace visit_type_freq() with visit_type_unit_suffixed_int()
> > > >  in x86_cpuid_set_tsc_freq()
> > > > ---
> > > >  target-i386/cpu.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > > > index c6c2ca0..b7f0aba 100644
> > > > --- a/target-i386/cpu.c
> > > > +++ b/target-i386/cpu.c
> > > > @@ -1195,7 +1195,7 @@ static void x86_cpuid_set_tsc_freq(Object *obj,
> > > > Visitor *v, void *opaque, const int64_t max = INT64_MAX;
> > > >  int64_t value;
> > > >  
> > > > -visit_type_int(v, &value, name, errp);
> > > > +visit_type_unit_suffixed_int(v, &value, name, 1000, errp);
> > > >  if (error_is_set(errp)) {
> > > >  return;
> > > >  }
> > > 
> > > This trivial usage is fine obviously. But since this series set out to
> > > make things more generic I am missing at least one use case for 1024.
> > > Does nothing like that exist in qdev-properties.c or so already?
> > 
> > cutils.c has:
> > 
> > > int64_t strtosz_suffix(const char *nptr, char **end, const char default_suffix)
> > > {
> > return strtosz_suffix_unit(nptr, end, default_suffix, 1024);
> > }
> > 
> > $ git grep -w strtosz_suffix
> > [...]
> > > qapi/opts-visitor.c:val = strtosz_suffix(opt->str ? opt->str : "", &endptr,
> > > qemu-img.c:sval = strtosz_suffix(argv[optind++], &end, STRTOSZ_DEFSUFFIX_B);
> > > qemu-img.c:sval = strtosz_suffix(optarg, &end, STRTOSZ_DEFSUFFIX_B);
> > 
> > The opts-visitor.c match, in turn, is inside opts_type_size(), that's the
> > ->type_size method of OptsVisitor. There are many 'size' elements inside
> > qapi-schema.json.
> > 
> > I don't see any code using visit_type_size() directly, but I see two users
> > of type 'size' on qapi-schema.json: NetdevTapOptions and NetdevDumpOptions.
> > 
> > I didn't know that we already had a visitor method using the suffixed-int
> > > parsing code. Should we change the visit_type_size() code to use
> > > the new generic ->type_suffixed_int method and kill ->type_size?
> 
> If there isn't strong opposition to doing it incrementally,
> I'd prefer that these patches go in first.
> 
> And then later fix users of visit_type_size() to use type_suffixed_int(), or
> maybe add a new type_suffixed_uint() so that size could be represented as
> uint64_t instead of int64_t as it is now. That would require rewriting
> strtosz_* and its callers a bit.

I think that seems reasonable. Between the 2 that should give us a nice
way to support [KMGx] suffixes for options in a backward-compatible way.

Shame that users don't have a well-established way to specify base 2 vs.
10, but so long as we document our choice for each option that should
cover most cases well enough.
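
To make the base 2 vs. 10 point concrete, here is a minimal sketch of suffix
parsing in which the caller supplies the value of 'K', in the spirit of the
proposed visit_type_unit_suffixed_int(). The name parse_suffixed_int, its
return convention, and the rejection of negative values are assumptions made
for illustration; this is not the code from the patch or from cutils.c.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not QEMU code: parse "<integer>[K|M|G|T]" where the
 * caller chooses the value of 'K' (1000 for Hz, 1024 for bytes).
 * Returns 0 and stores the scaled value in *out on success; returns -1 on
 * malformed input, negative numbers, trailing garbage, or overflow.
 */
int parse_suffixed_int(const char *s, int unit, int64_t *out)
{
    char *end;
    long long val;
    int64_t mul = 1;
    int steps;

    assert(unit == 1000 || unit == 1024);

    errno = 0;
    val = strtoll(s, &end, 10);
    if (end == s || errno == ERANGE || val < 0) {
        return -1;
    }

    /* Map the optional suffix to a power of 'unit'. */
    switch (toupper((unsigned char)*end)) {
    case 'T':  steps = 4; break;
    case 'G':  steps = 3; break;
    case 'M':  steps = 2; break;
    case 'K':  steps = 1; break;
    case '\0': steps = 0; break;
    default:   return -1;
    }
    if (steps > 0) {
        end++;                      /* consume the suffix character */
    }
    if (*end != '\0') {
        return -1;                  /* trailing garbage such as "2Kb" */
    }

    while (steps-- > 0) {
        mul *= unit;
    }
    if (val > INT64_MAX / mul) {
        return -1;                  /* scaled value would overflow */
    }
    *out = (int64_t)val * mul;
    return 0;
}
```

With this sketch, "2K" yields 2000 when unit is 1000 and 2048 when unit is
1024, which is exactly the ambiguity each option's documentation has to resolve.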



Re: [Qemu-devel] [PATCH 1/2] qapi: add visitor for parsing int[KMGT] input string

2012-12-10 Thread mdroth
On Mon, Dec 10, 2012 at 05:01:38PM +0100, Igor Mammedov wrote:
> On Fri, 07 Dec 2012 19:57:35 +0100
> Andreas Färber  wrote:
> 
> > Am 06.12.2012 22:12, schrieb Igor Mammedov:
> > > Caller of visit_type_unit_suffixed_int() will have to specify
> > > value of 'K' suffix via unit argument.
> > > For Kbytes it's 1024, for Khz it's 1000.
> > > 
> > > Signed-off-by: Igor Mammedov 
> > > ---
> > >  v2:
> > >   - convert type_freq to type_unit_suffixed_int.
> > >   - provide qapi_dealloc_type_unit_suffixed_int() impl.
> > > ---
> > >  qapi/qapi-dealloc-visitor.c |  7 +++
> > >  qapi/qapi-visit-core.c  | 13 +
> > >  qapi/qapi-visit-core.h  |  8 
> > >  qapi/string-input-visitor.c | 22 ++
> > >  4 files changed, 50 insertions(+)
> > > 
> > > diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> > > index 75214e7..57e662c 100644
> > > --- a/qapi/qapi-dealloc-visitor.c
> > > +++ b/qapi/qapi-dealloc-visitor.c
> > > @@ -143,6 +143,12 @@ static void qapi_dealloc_type_enum(Visitor *v, int
> > > *obj, const char *strings[], {
> > >  }
> > >  
> > > +static void qapi_dealloc_type_unit_suffixed_int(Visitor *v, int64_t *obj,
> > > +const char *name,
> > > +const int unit, Error
> > > **errp) +{
> > > +}
> > > +
> > >  Visitor *qapi_dealloc_get_visitor(QapiDeallocVisitor *v)
> > >  {
> > >  return &v->visitor;
> > > @@ -170,6 +176,7 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
> > >  v->visitor.type_str = qapi_dealloc_type_str;
> > >  v->visitor.type_number = qapi_dealloc_type_number;
> > >  v->visitor.type_size = qapi_dealloc_type_size;
> > > +v->visitor.type_unit_suffixed_int =
> > > qapi_dealloc_type_unit_suffixed_int; 
> > >  QTAILQ_INIT(&v->stack);
> > >  
> > > diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> > > index 7a82b63..dcbc1a9 100644
> > > --- a/qapi/qapi-visit-core.c
> > > +++ b/qapi/qapi-visit-core.c
> > > @@ -311,3 +311,16 @@ void input_type_enum(Visitor *v, int *obj, const
> > > char *strings[], g_free(enum_str);
> > >  *obj = value;
> > >  }
> > > +
> > > +void visit_type_unit_suffixed_int(Visitor *v, int64_t *obj, const char
> > > *name,
> > > +  const int unit, Error **errp)
> > > +{
> > > +if (!error_is_set(errp)) {
> > 
> > if (error_is_set(errp)) {
> Thanks, I'll fix it.
> 
> > > +return;
> > > +}
> > > +if (v->type_unit_suffixed_int) {
> > > +v->type_unit_suffixed_int(v, obj, name, unit, errp);
> > > +} else {
> > > +visit_type_int64(v, obj, name, errp);
> > > +}
> > > +}
> > > diff --git a/qapi/qapi-visit-core.h b/qapi/qapi-visit-core.h
> > > index 60aceda..04e690a 100644
> > > --- a/qapi/qapi-visit-core.h
> > > +++ b/qapi/qapi-visit-core.h
> > > @@ -62,6 +62,12 @@ struct Visitor
> > >  void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error
> > > **errp); /* visit_type_size() falls back to (*type_uint64)() if type_size
> > > is unset */ void (*type_size)(Visitor *v, uint64_t *obj, const char
> > > *name, Error **errp);
> > > +/*
> > > + * visit_unit_suffixed_int() falls back to (*type_int64)()
> > > + * if type_unit_suffixed_int is unset
> > > +*/
> > 
> > Indentation is one off.
> ditto
> 
> > 
> > > +void (*type_unit_suffixed_int)(Visitor *v, int64_t *obj, const char
> > > *name,
> > > +   const int unit, Error **errp);
> > 
> > Are we expecting differently suffixed ints? Otherwise we could
> > optionally shorten to type_suffixed_int (but that probably still doesn't
> > fit within one comment line ;)).
> Not with current implementation. I'll shorten it as you've suggested.
> 
> > 
> > >  };
> > >  
> > >  void visit_start_handle(Visitor *v, void **obj, const char *kind,
> > > @@ -91,5 +97,7 @@ void visit_type_size(Visitor *v, uint64_t *obj, const
> > > char *name, Error **errp); void visit_type_bool(Visitor *v, bool *obj,
> > > const char *name, Error **errp); void visit_type_str(Visitor *v, char
> > > **obj, const char *name, Error **errp); void visit_type_number(Visitor
> > > *v, double *obj, const char *name, Error **errp); +void
> > > visit_type_unit_suffixed_int(Visitor *v, int64_t *obj, const char *name,
> > > +  const int unit, Error **errp);
> > >  
> > >  #endif
> > > diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
> > > index 497eb9a..d2bd154 100644
> > > --- a/qapi/string-input-visitor.c
> > > +++ b/qapi/string-input-visitor.c
> > > @@ -110,6 +110,27 @@ static void parse_start_optional(Visitor *v, bool
> > > *present, *present = true;
> > >  }
> > >  
> > > +static void parse_type_unit_suffixed_int(Visitor *v, int64_t *obj,
> > > + const char *name, const int
> > > unit,
> > > + Error **errp)
> > 

[Qemu-devel] [PATCHv2 1/2] qemu-img: find the highest offset in use during check

2012-12-10 Thread Federico Simoncelli
This patch adds the support for reporting the highest offset in use by
an image. This is particularly useful after a conversion (or a rebase)
where the destination is a block device in order to find the actual
amount of space in use.
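
As a rough illustration of the idea (not the patch itself), the highest offset
can be derived from per-cluster refcounts: remember the last cluster with a
non-zero refcount and report the end of that cluster. The function name and
the flat refcount array below are assumptions of this sketch. Unlike the loop
in the patch, it returns 0 when no cluster is referenced at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check loop: given one refcount per cluster,
 * return the end offset (in bytes) of the last cluster that is referenced.
 * An image with no referenced cluster reports 0.
 */
int64_t highest_offset_in_use(const uint16_t *refcounts, size_t nb_clusters,
                              int64_t cluster_size)
{
    int64_t highest_cluster = -1;   /* -1 means "nothing in use yet" */
    size_t i;

    assert(cluster_size > 0);

    for (i = 0; i < nb_clusters; i++) {
        if (refcounts[i] > 0) {
            highest_cluster = (int64_t)i;
        }
    }

    /* The offset just past the last used cluster. */
    return (highest_cluster + 1) * cluster_size;
}
```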

Signed-off-by: Federico Simoncelli 
---
 block.h  |1 +
 block/qcow2-refcount.c   |   10 --
 qemu-img.c   |4 
 tests/qemu-iotests/026   |6 +++---
 tests/qemu-iotests/036   |3 ++-
 tests/qemu-iotests/039   |2 +-
 tests/qemu-iotests/common.rc |5 +++--
 7 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/block.h b/block.h
index 722c620..de42e8c 100644
--- a/block.h
+++ b/block.h
@@ -213,6 +213,7 @@ typedef struct BdrvCheckResult {
 int check_errors;
 int corruptions_fixed;
 int leaks_fixed;
+int64_t highest_offset;
 BlockFragInfo bfi;
 } BdrvCheckResult;
 
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 96224d1..017439d 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1116,7 +1116,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
   BdrvCheckMode fix)
 {
 BDRVQcowState *s = bs->opaque;
-int64_t size, i;
+int64_t size, i, highest_cluster;
 int nb_clusters, refcount1, refcount2;
 QCowSnapshot *sn;
 uint16_t *refcount_table;
@@ -1154,7 +1154,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
 s->refcount_table_offset,
 s->refcount_table_size * sizeof(uint64_t));
 
-for(i = 0; i < s->refcount_table_size; i++) {
+for(i = 0, highest_cluster = 0; i < s->refcount_table_size; i++) {
 uint64_t offset, cluster;
 offset = s->refcount_table[i];
 cluster = offset >> s->cluster_bits;
@@ -1197,6 +1197,11 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
 }
 
 refcount2 = refcount_table[i];
+
+if (refcount1 > 0 || refcount2 > 0) {
+highest_cluster = i;
+}
+
 if (refcount1 != refcount2) {
 
 /* Check if we're allowed to fix the mismatch */
@@ -1231,6 +1236,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
 }
 }
 
+res->highest_offset = (highest_cluster + 1) * s->cluster_size;
 ret = 0;
 
 fail:
diff --git a/qemu-img.c b/qemu-img.c
index e29e01b..45c1ec1 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -470,6 +470,10 @@ static int img_check(int argc, char **argv)
 result.bfi.fragmented_clusters * 100.0 / 
result.bfi.allocated_clusters);
 }
 
+if (result.highest_offset > 0) {
+printf("Highest offset in use: %" PRId64 "\n", result.highest_offset);
+}
+
 bdrv_delete(bs);
 
 if (ret < 0 || result.check_errors) {
diff --git a/tests/qemu-iotests/026 b/tests/qemu-iotests/026
index 1602ccd..107a3ff 100755
--- a/tests/qemu-iotests/026
+++ b/tests/qemu-iotests/026
@@ -102,7 +102,7 @@ if [ "$event" == "l2_load" ]; then
 $QEMU_IO -c "read $vmstate 0 128k " $BLKDBG_TEST_IMG | _filter_qemu_io
 fi
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
@@ -147,7 +147,7 @@ echo
 echo "Event: $event; errno: $errno; imm: $imm; once: $once; write $vmstate"
 $QEMU_IO -c "write $vmstate 0 64M" $BLKDBG_TEST_IMG | _filter_qemu_io
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
@@ -186,7 +186,7 @@ echo
 echo "Event: $event; errno: $errno; imm: $imm; once: $once"
 $QEMU_IO -c "write -b 0 64k" $BLKDBG_TEST_IMG | _filter_qemu_io
 
-$QEMU_IMG check $TEST_IMG 2>&1 | grep -v "refcount=1 reference=0"
+_check_test_img 2>&1 | grep -v "refcount=1 reference=0"
 
 done
 done
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 329533e..4dbfc57 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -59,7 +59,8 @@ _make_test_img 64M
 echo
 echo === Repair image ===
 echo
-$QEMU_IMG check -r all $TEST_IMG
+_check_test_img -r all
+
 ./qcow2.py $TEST_IMG dump-header
 
 # success, all done
diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index c5ae806..ae35175 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -86,7 +86,7 @@ $QEMU_IO -r -c "read -P 0x5a 0 512" $TEST_IMG | 
_filter_qemu_io
 echo
 echo "== Repairing the image file must succeed =="
 
-$QEMU_IMG check -r all $TEST_IMG
+_check_test_img -r all
 
 # The dirty bit must not be set
 ./qcow2.py $TEST_IMG dump-header | grep incompatible_features
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index aef5f52..22c0186 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -161,9 +161,10 @@ _cleanup_test_img()
 
 _check_test_img()
 {
-$QEMU_IMG check -f $IMGFMT $TEST_IMG 2>&1 | \
+$QEMU_IMG check "$@" -f $IMGFMT $TEST_IMG 2>&1 | \
   

[Qemu-devel] [RFC PATCH v7 0/8] Virtio refactoring.

2012-12-10 Thread fred . konrad
From: KONRAD Frederic 

You can clone that from here :
git.greensocs.com/home/greensocs/git/qemu_virtio.git virtio_refactoring_v7

These are the last two steps of the refactoring (only for the virtio-blk device):

* It modifies virtio-blk-pci to extend virtio-pci and to connect a
  virtio-blk during the initialisation.
* It modifies virtio-blk ( QOM cast ).

I think the last step breaks virtio-blk-s390; can somebody confirm that?

If it does break virtio-blk-s390, how should I modify it?

I think this is the right approach, and I would like to know what you think before
converting all the remaining virtio devices.

Any feedback?

This patch-set is :
* Introducing a virtio-bus which extends bus-state.
* Implementing a virtio-pci-bus which extends virtio-bus.
* Implementing a virtio-pci device which has a virtio-pci-bus.
* Implementing virtio-device which extends device-states.
* Implementing a virtio-blk which extends virtio-device.
* Modifying virtio-blk-pci to extend virtio-pci and connect a virtio-blk.

The first patch modifies qdev-monitor.c: it forces the function
qbus_find_recursive(..) to return a non-full bus, and to return an error if the
desired bus (given with the "bus=" option) is full. It adds a max_dev field to the
bus_state structure. If max_dev=0 there is no limit; otherwise at most max_dev
devices can be connected to the bus.
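
The max_dev rule described above can be sketched as follows. DemoBus and both
functions are hypothetical miniatures written for this illustration; they are
not the qdev structures or the code from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a bus with a device limit; not the qdev types. */
typedef struct DemoBus {
    int max_dev;   /* 0 means no limit */
    int num_dev;   /* devices currently plugged */
} DemoBus;

/* A bus is full only if it has a limit and that limit has been reached. */
bool demo_bus_is_full(const DemoBus *bus)
{
    assert(bus->max_dev >= 0 && bus->num_dev >= 0);
    return bus->max_dev != 0 && bus->num_dev >= bus->max_dev;
}

/* Plug one device; returns false (and changes nothing) if the bus is full. */
bool demo_bus_plug(DemoBus *bus)
{
    if (demo_bus_is_full(bus)) {
        return false;
    }
    bus->num_dev++;
    return true;
}
```

A bus search in this model would skip any bus for which demo_bus_is_full()
returns true, and report an error when the user explicitly asked for that bus.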

Changes v6 -> v7:
* virtio-bus : Added virtio-bus-reset.
* virtio-pci : Fixed virtio-pci-exit.
* virtio-pci : Added virtio-pci-rst.
* virtio-pci : Added VirtioPCIClass filled with an init function.
* virtio-blk : Added virtio_blk_set_conf.
* virtio-blk : QOM casts.
* virtio-blk-pci : Switched to the new API.

Changes v5 -> v6:
* Renamed virtio_common_init_ to virtio_init, modify virtio_common_init to
  allocate and call virtio_init. Drop the unused structure size parameters.
* Renamed init/exit callback in VirtioBusClass.
* Renamed virtio_blk_init to virtio_blk_common_init.
* Modified virtio_blk_init to call virtio_blk_common_init.

Changes v4 -> v5:
* use ERROR_CLASS_GENERIC_ERROR in place of creating a new error type for
  the maximum device limitation. ( Peter )
* Removed bus_in_use function. We assume that the virtio-bus is not in use
  when plugging in. ( Peter )
* Added virtio_bus_destroy_device().
* Implemented the exit function of virtio-pci.
* Implemented the init callback for virtio-pci ( must be modified, it still
  accesses vdev directly. ).
* Implemented the exit callback for virtio-pci.
* Started virtio-device refactoring.
* Started virtio-blk refactoring. 

Changes v3 -> v4:
* Added virtio-bus.o in Makefile.objs ( accidentally dropped from v3 ).
* *const* TypeInfo in virtio-bus.
* Introduced virtio-pci-bus.
* Reintroduced virtio-pci.
* Introduced virtio-device.
* Started virtio-blk refactoring.
* Added an error type in qerror.h for the "bus full" error.

Changes v2 -> v3:
* Added VirtioBusClass.
* Renamed VirtioBus -> VirtioBusState.
* Renamed qbus -> parent_obj.
* Plug the device only in a non-full bus.

Changes v1 -> v2:
* All the small fixes you suggested ( License, Debug printf, naming convention,
  ...)
* Added get_virtio_device_id(), and removed the pci_id* from the VirtioBus
  structure.
* Added virtio_bus_reset().
* Added cast macros VIRTIO_BUS.
* Added virtio_bus_plug_device.
* Replaced the old-style "bus->qbus" by BUS() macro.

Fred.

KONRAD Frederic (8):
  qdev : add a maximum device allowed field for the bus.
  virtio-bus : Introduce virtio-bus
  virtio-pci-bus : Introduce virtio-pci-bus.
  virtio-pci : Refactor virtio-pci device.
  virtio-device : Refactor virtio-device.
  virtio-blk : Add the virtio-blk device.
  virtio-pci-blk : Switch to new API.
  virtio-blk : QOM modifications.

 hw/Makefile.objs  |   1 +
 hw/qdev-core.h|   2 +
 hw/qdev-monitor.c |  11 +++
 hw/virtio-blk.c   | 140 ---
 hw/virtio-blk.h   |   8 ++
 hw/virtio-bus.c   | 120 +++
 hw/virtio-bus.h   |  83 
 hw/virtio-pci.c   | 277 +-
 hw/virtio-pci.h   |  52 +-
 hw/virtio.c   |  50 +++---
 hw/virtio.h   |  28 ++
 11 files changed, 657 insertions(+), 115 deletions(-)
 create mode 100644 hw/virtio-bus.c
 create mode 100644 hw/virtio-bus.h

-- 
1.7.11.7




Re: [Qemu-devel] Some patch about mips, gen_HILO bug fix.

2012-12-10 Thread Richard Henderson
On 12/10/2012 12:37 AM, Elta Era wrote:
>  tcg_gen_movi_tl(cpu_gpr[ret], \
> -(target_long)((int32_t)imm << 16 | \
> +(target_long)(int32_t)((int32_t)imm << 16 | \
>  (uint32_t)(uint16_t)imm));

This can alternately be fixed by removing an unnecessary cast:

-  (target_long)((int32_t)imm << 16 | (uint32_t)(uint16_t)imm)
+  (target_long)((int32_t)imm << 16 | (uint16_t)imm)



r~
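
Why dropping the cast helps can be shown with a small sketch of the integer
promotion rules involved. It assumes a 32-bit int and models target_long as
int64_t (a 64-bit target); the function names are made up for the example,
and the high halfword is built through an unsigned shift so that the sketch
does not left-shift a negative value.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

/* Stand-in for target_long on a 64-bit target (an assumption of this sketch). */
typedef int64_t demo_target_long;

/*
 * High halfword as a signed 32-bit value. The conversion back to int32_t is
 * implementation-defined for values above INT32_MAX; common compilers wrap.
 */
static int32_t high_half(int16_t imm)
{
    return (int32_t)((uint32_t)(uint16_t)imm << 16);
}

/* With the (uint32_t) cast: int32_t | uint32_t is unsigned, so widening zero-extends. */
demo_target_long repl_with_cast(int16_t imm)
{
    return (demo_target_long)(high_half(imm) | (uint32_t)(uint16_t)imm);
}

/* Without it: uint16_t promotes to int, the OR stays signed, so widening sign-extends. */
demo_target_long repl_without_cast(int16_t imm)
{
    return (demo_target_long)(high_half(imm) | (uint16_t)imm);
}
```

For an immediate whose bit 15 is set, the first variant leaves the upper 32
bits of the result clear, while the second fills them with ones.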



Re: [Qemu-devel] [PATCH 1/4] fix implicit declaration of syscall() in linux-user/mmap.c

2012-12-10 Thread Stefan Weil

Am 10.12.2012 07:59, schrieb John Spencer:

on glibc, this header is getting pulled in automatically via
another header, however on musl we need to include it explicitly.

linux-user/mmap.c:705:9: warning: implicit declaration of function 'syscall'
linux-user/mmap.c:705:9: warning: nested extern declaration of 'syscall'

Signed-off-by: John Spencer

---
  linux-user/mmap.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index b412e3f..171b449 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -25,6 +25,7 @@
  #include
  #include
  #include
+#include
  #include
  #include



According to the Linux man-page SYSCALL(2), syscall
is declared in unistd.h. On my Debian Linux with glibc,
this information is correct. Here is the result of grep:

/usr/include/unistd.h:extern long int syscall (long int __sysno, ...) 
__THROW;


unistd.h is included implicitly via qemu-common.h,
so if you don't get the declaration, there is a buggy
implementation of the header files in musl:

http://git.musl-libc.org/cgit/musl/plain/include/unistd.h
does not match the Linux documentation.

Please report this to the musl developers.

Regards
Stefan Weil
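
For reference, a minimal sketch of a raw syscall() call on a Linux host. It
includes both unistd.h and sys/syscall.h and defines _GNU_SOURCE, because
which header and which feature macro expose the declaration on a given C
library is the very point under discussion; raw_getpid is a name invented for
this example.

```c
#define _GNU_SOURCE        /* ask the C library to expose syscall() */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>        /* getpid(), and syscall() per SYSCALL(2) */

/* Fetch the process ID through the raw syscall interface. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);       /* getpid cannot fail */
    return pid;
}
```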




Re: [Qemu-devel] [PATCH 2/4] fix build error on ARM due to wrong glibc check

2012-12-10 Thread Stefan Weil

Am 10.12.2012 07:59, schrieb John Spencer:

the test for glibc < 2 "succeeds" wrongly for any non-glibc C library,
and breaks the build on musl libc.
we must first test if __GLIBC__ is defined at all, before using it
unconditionally.

Signed-off-by: John Spencer

---
  user-exec.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/user-exec.c b/user-exec.c
index ef9b172..145 100644
--- a/user-exec.c
+++ b/user-exec.c
@@ -442,7 +442,7 @@ int cpu_signal_handler(int host_signum, void *pinfo,
  unsigned long pc;
  int is_write;

-#if (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ <= 3))
+#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ <= 3))
  pc = uc->uc_mcontext.gregs[R15];
  #else
  pc = uc->uc_mcontext.arm_pc;
   


Reviewed-by: Stefan Weil 
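
The reason the unguarded test misfires is that an identifier that is not
defined as a macro evaluates to 0 inside #if. The sketch below shows this with
DEMO_LIBC_MAJOR, a made-up macro that is deliberately left undefined, playing
the role of __GLIBC__ on a non-glibc system.

```c
#include <assert.h>

/* DEMO_LIBC_MAJOR is deliberately left undefined, like __GLIBC__ on musl. */

/* Unguarded test: the undefined name counts as 0, and 0 < 2 is true. */
#if (DEMO_LIBC_MAJOR < 2)
#define UNGUARDED_TAKES_OLD_BRANCH 1
#else
#define UNGUARDED_TAKES_OLD_BRANCH 0
#endif

/* Guarded test: defined() is checked first, so the old branch is skipped. */
#if defined(DEMO_LIBC_MAJOR) && (DEMO_LIBC_MAJOR < 2)
#define GUARDED_TAKES_OLD_BRANCH 1
#else
#define GUARDED_TAKES_OLD_BRANCH 0
#endif

static_assert(UNGUARDED_TAKES_OLD_BRANCH == 1,
              "undefined macro is treated as 0 in #if");
static_assert(GUARDED_TAKES_OLD_BRANCH == 0,
              "defined() guard keeps the old branch out");
```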




[Qemu-devel] [PATCH 2/2] qemu-img: add json output option to the check command

2012-12-10 Thread Federico Simoncelli
The option --output=[human|json] makes qemu-img check output a human-readable
or JSON representation, at the user's choice.

Signed-off-by: Federico Simoncelli 
---
 qapi-schema.json |   38 +
 qemu-img-cmds.hx |4 +-
 qemu-img.c   |  246 +++---
 qemu-img.texi|5 +-
 4 files changed, 221 insertions(+), 72 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 5dfa052..8877285 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -245,6 +245,44 @@
'*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'] } 
}
 
 ##
+# @ImageCheck:
+#
+# Information about a QEMU image file check
+#
+# @filename: name of the image file checked
+#
+# @format: format of the image file checked
+#
+# @check-errors: number of unexpected errors occurred during check
+#
+# @highest-offset: #optional highest offset (in bytes) in use by the image
+#
+# @corruptions: #optional number of corruptions found during the check
+#
+# @leaks: #optional number of leaks found during the check
+#
+# @corruptions-fixed: #optional number of corruptions fixed during the check
+#
+# @leaks-fixed: #optional number of leaks fixed during the check
+#
+# @total-clusters: #optional total number of clusters
+#
+# @allocated-clusters: #optional total number of allocated clusters
+#
+# @fragmented-clusters: #optional total number of fragmented clusters
+#
+# Since: 1.4
+#
+##
+
+{ 'type': 'ImageCheck',
+  'data': {'filename': 'str', 'format': 'str', 'check-errors': 'int',
+   '*highest-offset': 'int', '*corruptions': 'int', '*leaks': 'int',
+   '*corruptions-fixed': 'int', '*leaks-fixed': 'int',
+   '*total-clusters': 'int', '*allocated-clusters': 'int',
+   '*fragmented-clusters': 'int' } }
+
+##
 # @StatusInfo:
 #
 # Information about VCPU run state
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index a181363..259fc14 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -10,9 +10,9 @@ STEXI
 ETEXI
 
 DEF("check", img_check,
-"check [-f fmt] [-r [leaks | all]] filename")
+"check [-f fmt] [--output=ofmt] [-r [leaks | all]] filename")
 STEXI
-@item check [-f @var{fmt}] [-r [leaks | all]] @var{filename}
+@item check [-f @var{fmt}] [--output=@var{ofmt}] [-r [leaks | all]] 
@var{filename}
 ETEXI
 
 DEF("create", img_create,
diff --git a/qemu-img.c b/qemu-img.c
index 45c1ec1..18ba5c2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -42,6 +42,16 @@ typedef struct img_cmd_t {
 int (*handler)(int argc, char **argv);
 } img_cmd_t;
 
+enum {
+OPTION_OUTPUT = 256,
+OPTION_BACKING_CHAIN = 257,
+};
+
+typedef enum OutputFormat {
+OFORMAT_JSON,
+OFORMAT_HUMAN,
+} OutputFormat;
+
 /* Default to cache=writeback as data integrity is not important for qemu-tcg. 
*/
 #define BDRV_O_FLAGS BDRV_O_CACHE_WB
 #define BDRV_DEFAULT_CACHE "writeback"
@@ -370,6 +380,122 @@ out:
 return 0;
 }
 
+static void dump_json_image_check(ImageCheck *check)
+{
+Error *errp = NULL;
+QString *str;
+QmpOutputVisitor *ov = qmp_output_visitor_new();
+QObject *obj;
+visit_type_ImageCheck(qmp_output_get_visitor(ov),
+  &check, NULL, &errp);
+obj = qmp_output_get_qobject(ov);
+str = qobject_to_json_pretty(obj);
+assert(str != NULL);
+printf("%s\n", qstring_get_str(str));
+qobject_decref(obj);
+qmp_output_visitor_cleanup(ov);
+QDECREF(str);
+}
+
+static void dump_human_image_check(ImageCheck *check)
+{
+if (check->corruptions_fixed || check->leaks_fixed) {
+ printf("The following inconsistencies were found and repaired:\n\n"
+"%" PRId64 " leaked clusters\n"
+"%" PRId64 " corruptions\n\n"
+"Double checking the fixed image now...\n",
+check->leaks_fixed,
+check->corruptions_fixed);
+}
+
+if (!(check->corruptions || check->leaks || check->check_errors)) {
+printf("No errors were found on the image.\n");
+} else {
+if (check->corruptions) {
+printf("\n%" PRId64 " errors were found on the image.\n"
+"Data may be corrupted, or further writes to the image "
+"may corrupt it.\n",
+check->corruptions);
+}
+
+if (check->leaks) {
+printf("\n%" PRId64 " leaked clusters were found on the image.\n"
+"This means waste of disk space, but no harm to data.\n",
+check->leaks);
+}
+
+if (check->check_errors) {
+printf("\n%" PRId64 " internal errors have occurred during the 
check.\n",
+check->check_errors);
+}
+}
+
+if (check->total_clusters != 0 && check->allocated_clusters != 0) {
+printf("%" PRId64 "/%" PRId64 "= %0.2f%% allocated, %0.2f%% 
fragmented\n",
+check->allocated_clusters, check->total_clusters,
+check->allocated_clusters * 100.0 / check->total_clusters,

Re: [Qemu-devel] [PATCH 4/4] linux-user/syscall.c: remove wrong forward decl of setgroups()

2012-12-10 Thread Stefan Weil

Am 10.12.2012 07:59, schrieb John Spencer:

this declaration is wrong:
the correct prototype on linux is:
int setgroups(size_t size, const gid_t *list);

since by default musl libc exposes this symbol in unistd.h
in addition to grp.h, the wrong declaration causes a build error.

the proper fix is to simply include the correct header.

Signed-off-by: John Spencer

---
  linux-user/syscall.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index fabbcd7..665316e 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -28,6 +28,7 @@
  #include
  #include
  #include
+#include
  #include
  #include
  #include
@@ -585,7 +586,6 @@ extern int personality(int);
  extern int flock(int, int);
  extern int setfsuid(int);
  extern int setfsgid(int);
-extern int setgroups(int, gid_t *);

  /* ARM EABI and MIPS expect 64bit types aligned even on pairs or registers */
  #ifdef TARGET_ARM
   



Reviewed-by: Stefan Weil 




Re: [Qemu-devel] [PATCH 3/4] fix implicit declaration of syscall() in linux-user/syscall.c

2012-12-10 Thread Stefan Weil

Am 10.12.2012 07:59, schrieb John Spencer:

Signed-off-by: John Spencer

---
  linux-user/syscall.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 31d5276..fabbcd7 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -39,6 +39,7 @@
  #include
  #include
  #include
+#include
  #include
  #include
  #ifdef __ia64__



This is a workaround for a missing declaration of function syscall
in musl's unistd.h. See my comment to patch 1 for more information.


