Re: [Qemu-devel] [PATCH v3] Sort the help info shown in monitor at runtime

2011-10-12 Thread Wenyi Gao
On Wed, 2011-10-12 at 11:32 +0800, Wayne Xia wrote:
 This patch would try sort the command list in monitor at runtime. As a result,
 command help and help info would show a more friendly sorted command list.
 For eg:
 (qemu)help
 acl_add
 acl_policy
 acl_remove
 acl_reset
 acl_show
 balloon
 block_passwd
 ...
 the command list is sorted.
 
 v3: using qsort function to sort the command list.
 
 Signed-off-by: Wayne Xia xiaw...@linux.vnet.ibm.com
 ---
  monitor.c |   30 ++
  1 files changed, 26 insertions(+), 4 deletions(-)
 
 diff --git a/monitor.c b/monitor.c
 index 31b212a..a172167 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -195,8 +195,8 @@ static inline int mon_print_count_get(const Monitor *mon) 
 { return 0; }
 
  static QLIST_HEAD(mon_list, Monitor) mon_list;
 
 -static const mon_cmd_t mon_cmds[];
 -static const mon_cmd_t info_cmds[];
 +static mon_cmd_t mon_cmds[];
 +static mon_cmd_t info_cmds[];
 
  static const mon_cmd_t qmp_cmds[];
  static const mon_cmd_t qmp_query_cmds[];
 @@ -2726,13 +2726,14 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
  return -1;
  }
 
 -static const mon_cmd_t mon_cmds[] = {
 +/* mon_cmds and info_cmds would be sorted at runtime */
 +static mon_cmd_t mon_cmds[] = {
  #include hmp-commands.h
  { NULL, NULL, },
  };
 
  /* Please update hmp-commands.hx when adding or changing commands */
 -static const mon_cmd_t info_cmds[] = {
 +static mon_cmd_t info_cmds[] = {
  {
  .name   = version,
  .args_type  = ,
 @@ -5068,6 +5069,25 @@ static void monitor_event(void *opaque, int event)
  }
  }
 
 +static int
 +compare_mon_cmd(const void *a, const void *b)
 +{
 +return strcmp(((const mon_cmd_t *)a)-name,
 +((const mon_cmd_t *)b)-name);
 +}
 +
 +static void sortcmdlist(void)
 +{
 +int array_num;
 +int elem_size = sizeof(mon_cmd_t);
 +
 +array_num = sizeof(mon_cmds)/elem_size-1;
 +qsort((void *)mon_cmds, array_num, elem_size, compare_mon_cmd);
 +
 +array_num = sizeof(info_cmds)/elem_size-1;
 +qsort((void *)info_cmds, array_num, elem_size, compare_mon_cmd);
 +}
 +
 
  /*
   * Local variables:
 @@ -5110,6 +5130,8 @@ void monitor_init(CharDriverState *chr, int flags)
  QLIST_INSERT_HEAD(mon_list, mon, entry);
  if (!default_mon || (flags  MONITOR_IS_DEFAULT))
  default_mon = mon;
 +
 +sortcmdlist();
  }
 
  static void bdrv_password_cb(Monitor *mon, const char *password, void 
 *opaque)

Tested-by: Wenyi Gao we...@linux.vnet.ibm.com
Work nice.



Wenyi Gao






Re: [Qemu-devel] [PATCH] kernel/kvm: fix improper nmi emulation

2011-10-12 Thread Kenji Kaneshige
(2011/10/10 19:26), Avi Kivity wrote:
 On 10/10/2011 08:06 AM, Lai Jiangshan wrote:
 From: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com

 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is maskied in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

 With this patch, KVM_NMI ioctl is handled as follows.

 - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
 request of triggering LINT1 on the processor. LINT1 is emulated in
 in-kernel irqchip.

 - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
 request of injecting NMI to the processor. This assumes LINT1 is
 already emulated in userland.
 
 Please add a KVM_NMI section to Documentation/virtual/kvm/api.txt.
 

 -static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 -{
 - kvm_inject_nmi(vcpu);
 -
 - return 0;
 -}
 -
 static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
 struct kvm_tpr_access_ctl *tac)
 {
 @@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 break;
 }
 case KVM_NMI: {
 - r = kvm_vcpu_ioctl_nmi(vcpu);
 - if (r)
 - goto out;
 + if (irqchip_in_kernel(vcpu-kvm))
 + kvm_apic_lint1_deliver(vcpu);
 + else
 + kvm_inject_nmi(vcpu);
 r = 0;
 break;
 }
 
 Why did you drop kvm_vcpu_ioctl_nmi()?
 
 Please add (and document) a KVM_CAP flag that lets userspace know the new 
 behaviour is supported.
 

Sorry for the delayed responding.

I don't understand why new KVM_CAP flag is needed.

I think the old behavior was clearly a bug, and new behavior is not a new
capability. Furthermore, the kvm patch and the qemu patch in this patchset
can be applied independently. If only the kvm patch is applied, NMI bug in
kernel irq is fixed and qemu NMI behavior is not changed. If the only the
qemu patch is applied, qemu NMI bug is fixed and the NMI behavior in kernel
irq is not changed.

Regards,
Kenji Kaneshige



Re: [Qemu-devel] [PATCH 1/1 V2] kernel/kvm: fix improper nmi emulation

2011-10-12 Thread Kenji Kaneshige
(2011/10/12 2:00), Lai Jiangshan wrote:
 From: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
 
 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is maskied in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
 
 With this patch, KVM_NMI ioctl is handled as follows.
 
 - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
request of triggering LINT1 on the processor. LINT1 is emulated in
in-kernel irqchip.
 
 - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
request of injecting NMI to the processor. This assumes LINT1 is
already emulated in userland.
 
 (laijs) Changed from v1:
 Add KVM_NMI API document
 Add KVM_CAP_USER_NMI
 
 Signed-off-by: Kenji Kaneshigekaneshige.ke...@jp.fujitsu.com
 Tested-by: Lai Jiangshanla...@cn.fujitsu.com
 ---
   Documentation/virtual/kvm/api.txt |   20 
   arch/x86/kvm/irq.h|1 +
   arch/x86/kvm/lapic.c  |7 +++
   arch/x86/kvm/x86.c|   12 
   include/linux/kvm.h   |3 +++
   5 files changed, 43 insertions(+), 0 deletions(-)
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index b0e4b9c..5c24cc3 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1430,6 +1430,26 @@ is supported; 2 if the processor requires all virtual 
 machines to have
   an RMA, or 1 if the processor can use an RMA but doesn't require it,
   because it supports the Virtual RMA (VRMA) facility.
 
 +4.64 KVM_NMI
 +
 +Capability: KVM_CAP_USER_NMI
 +Architectures: x86
 +Type: vcpu ioctl
 +Parameters: none
 +Returns: 0 on success, -1 on error
 +
 +This ioctl injects NMI to the vcpu.
 +
 +If with capability KVM_CAP_LAPIC_NMI, KVM_NMI ioctl is handled as follows:
 +
 + - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
 +   request of triggering LINT1 on the processor. LINT1 is emulated in
 +   in-kernel lapic irqchip.
 +
 + - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
 +   request of injecting NMI to the processor. This assumes LINT1 is
 +   already emulated in userland lapic.
 +
   5. The kvm_run structure
 
   Application code obtains a pointer to the kvm_run structure by
 diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
 index 53e2d08..0c96315 100644
 --- a/arch/x86/kvm/irq.h
 +++ b/arch/x86/kvm/irq.h
 @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
   void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
   void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
   void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
   void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
   void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
   void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index 57dcbd4..87fe36a 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
   kvm_apic_local_deliver(apic, APIC_LVT0);
   }
 
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
 +{
 + struct kvm_lapic *apic = vcpu-arch.apic;
 +
 + kvm_apic_local_deliver(apic, APIC_LVT1);
 +}
 +
   static struct kvm_timer_ops lapic_timer_ops = {
   .is_periodic = lapic_is_periodic,
   };
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 84a28ea..6862ef7 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -2729,12 +2729,24 @@ static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu 
 *vcpu,
   return 0;
   }
 
 +#ifdef KVM_CAP_LAPIC_NMI
 +static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 +{
 + if (irqchip_in_kernel(vcpu-kvm))
 + kvm_apic_lint1_deliver(vcpu);
 + else
 + kvm_inject_nmi(vcpu);
 +
 + return 0;
 +}
 +#else
   static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
   {
   kvm_inject_nmi(vcpu);
 
   return 0;
   }
 +#endif

I don't think we need to keep old kvm_vcpu_ioctl_nmi() behavior because
it's clearly a bug.

Regards,
Kenji Kaneshige



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin
On Wed, Oct 12, 2011 at 02:07:46PM +1100, David Gibson wrote:
 Um.. why?  PCI is defined by the spec to be LE, so I don't see that we
 need explicit endianness versions for PCI helpers.

LE in the spec only applies to structures defined by the spec,
that is pci configuration and msix tables in device memory.

-- 
MST



Re: [Qemu-devel] [PATCH] spice-input: migrate ledstate

2011-10-12 Thread Gerd Hoffmann

  Hi,


There is no ledstate in a PS/2 keyboard (or I'm reading too much into
the implementation in qemu).


There is.  It isn't explicitly stored into the state struct though 
because the ps/2 keyboard itself doesn't use it, it just calls 
kbd_put_ledstate() to inform others about it.


cheers,
  Gerd




[Qemu-devel] [PATCH 1/2] Add opt_set_bool function

2011-10-12 Thread M. Mohan Kumar
In addition to qemu_opt_set function, we need a function to set bool value
also.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
 qemu-option.c |   35 +++
 qemu-option.h |1 +
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index 105d760..d6bc908 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -636,6 +636,41 @@ int qemu_opt_set(QemuOpts *opts, const char *name, const 
char *value)
 return 0;
 }
 
+int qemu_opt_set_bool(QemuOpts *opts, const char *name, int val)
+{
+QemuOpt *opt;
+const QemuOptDesc *desc = opts-list-desc;
+int i;
+
+for (i = 0; desc[i].name != NULL; i++) {
+if (strcmp(desc[i].name, name) == 0) {
+break;
+}
+}
+if (desc[i].name == NULL) {
+if (i == 0) {
+/* empty list - allow any */;
+} else {
+qerror_report(QERR_INVALID_PARAMETER, name);
+return -1;
+}
+}
+
+opt = g_malloc0(sizeof(*opt));
+opt-name = g_strdup(name);
+opt-opts = opts;
+QTAILQ_INSERT_TAIL(opts-head, opt, next);
+if (desc[i].name != NULL) {
+opt-desc = desc+i;
+}
+opt-value.boolean = !!val;
+if (qemu_opt_parse(opt)  0) {
+qemu_opt_del(opt);
+return -1;
+}
+return 0;
+}
+
 int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
  int abort_on_failure)
 {
diff --git a/qemu-option.h b/qemu-option.h
index b515813..af4d36b 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -109,6 +109,7 @@ int qemu_opt_get_bool(QemuOpts *opts, const char *name, int 
defval);
 uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t 
defval);
 uint64_t qemu_opt_get_size(QemuOpts *opts, const char *name, uint64_t defval);
 int qemu_opt_set(QemuOpts *opts, const char *name, const char *value);
+int qemu_opt_set_bool(QemuOpts *opts, const char *name, int val);
 typedef int (*qemu_opt_loopfunc)(const char *name, const char *value, void 
*opaque);
 int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
  int abort_on_failure);
-- 
1.7.6




[Qemu-devel] [PATCH V4 2/2] hw/9pfs: Add readonly support for 9p export

2011-10-12 Thread M. Mohan Kumar
A new fsdev parameter readonly is introduced to control accessing 9p export.
readonly=on|off can be used to specify the access type. By default rw
access is given.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
Changes from previous version V3:
* Use opt_set_bool function to set readonly option
* Change the flag from MS_READONLY to 9p specific

Change from previous version V2:
* QEMU_OPT_BOOL is used for readdonly parameter

Changes from previous version:
* Use readonly option instead of access
* Change function return type to boolean where its needed

 fsdev/file-op-9p.h |3 +-
 fsdev/qemu-fsdev.c |   12 +-
 fsdev/qemu-fsdev.h |1 +
 hw/9pfs/virtio-9p-device.c |3 ++
 hw/9pfs/virtio-9p.c|   46 
 qemu-config.c  |7 ++
 vl.c   |2 +
 7 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 33fb07f..b75290d 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -58,7 +58,8 @@ typedef struct extended_ops {
 } extended_ops;
 
 /* FsContext flag values */
-#define PATHNAME_FSCONTEXT 0x1
+#define PATHNAME_FSCONTEXT  0x1
+#define P9_RDONLY_EXPORT0x2
 
 /* cache flags */
 #define V9FS_WRITETHROUGH_CACHE 0x1
diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index d08ba9c..f8a8227 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -29,13 +29,13 @@ static FsTypeTable FsTypes[] = {
 int qemu_fsdev_add(QemuOpts *opts)
 {
 struct FsTypeListEntry *fsle;
-int i;
+int i, flags = 0;
 const char *fsdev_id = qemu_opts_id(opts);
 const char *fstype = qemu_opt_get(opts, fstype);
 const char *path = qemu_opt_get(opts, path);
 const char *sec_model = qemu_opt_get(opts, security_model);
 const char *cache = qemu_opt_get(opts, cache);
-
+int rdonly = qemu_opt_get_bool(opts, readonly, 0);
 
 if (!fsdev_id) {
 fprintf(stderr, fsdev: No id specified\n);
@@ -76,6 +76,14 @@ int qemu_fsdev_add(QemuOpts *opts)
 fsle-fse.cache_flags = V9FS_WRITETHROUGH_CACHE;
 }
 }
+if (rdonly) {
+flags |= P9_RDONLY_EXPORT;
+} else {
+flags = ~P9_RDONLY_EXPORT;
+}
+
+fsle-fse.flags = flags;
+
 QTAILQ_INSERT_TAIL(fstype_entries, fsle, next);
 return 0;
 }
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index 0f67880..2938eaf 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -44,6 +44,7 @@ typedef struct FsTypeEntry {
 char *security_model;
 int cache_flags;
 FileOperations *ops;
+int flags;
 } FsTypeEntry;
 
 typedef struct FsTypeListEntry {
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 1846e36..336292c 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -125,6 +125,9 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
*conf)
 s-tag_len = len;
 s-ctx.uid = -1;
 s-ctx.flags = 0;
+if (fse-flags  P9_RDONLY_EXPORT) {
+s-ctx.flags |= P9_RDONLY_EXPORT;
+}
 
 s-ops = fse-ops;
 s-vdev.get_features = virtio_9p_get_features;
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 47ed2f1..9f15787 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1271,6 +1271,11 @@ static void v9fs_fix_path(V9fsPath *dst, V9fsPath *src, 
int len)
 dst-size++;
 }
 
+static inline bool is_ro_export(FsContext *fs_ctx)
+{
+return fs_ctx-flags  P9_RDONLY_EXPORT;
+}
+
 static void v9fs_version(void *opaque)
 {
 V9fsPDU *pdu = opaque;
@@ -1690,6 +1695,14 @@ static void v9fs_open(void *opaque)
 } else {
 flags = omode_to_uflags(mode);
 }
+if (is_ro_export(s-ctx)) {
+if (mode  O_WRONLY || mode  O_RDWR || mode  O_APPEND) {
+err = -EROFS;
+goto out;
+} else {
+flags |= O_NOATIME;
+}
+}
 err = v9fs_co_open(pdu, fidp, flags);
 if (err  0) {
 goto out;
@@ -3301,6 +3314,33 @@ static void v9fs_op_not_supp(void *opaque)
 complete_pdu(pdu-s, pdu, -EOPNOTSUPP);
 }
 
+static inline bool is_read_only_op(int id)
+{
+switch (id) {
+case P9_TREADDIR:
+case P9_TSTATFS:
+case P9_TGETATTR:
+case P9_TXATTRWALK:
+case P9_TLOCK:
+case P9_TGETLOCK:
+case P9_TREADLINK:
+case P9_TVERSION:
+case P9_TLOPEN:
+case P9_TATTACH:
+case P9_TSTAT:
+case P9_TWALK:
+case P9_TCLUNK:
+case P9_TFSYNC:
+case P9_TOPEN:
+case P9_TREAD:
+case P9_TAUTH:
+case P9_TFLUSH:
+return 1;
+default:
+return 0;
+}
+}
+
 static void submit_pdu(V9fsState *s, V9fsPDU *pdu)
 {
 Coroutine *co;
@@ -3312,6 +3352,12 @@ static void submit_pdu(V9fsState *s, V9fsPDU *pdu)
 } else {
 handler = pdu_co_handlers[pdu-id];
 }
+
+if (is_ro_export(s-ctx)  !is_read_only_op(pdu-id)) {
+complete_pdu(s, 

Re: [Qemu-devel] [PATCH] qed: fix use-after-free during l2 cache commit

2011-10-12 Thread Stefan Hajnoczi
On Tue, Oct 11, 2011 at 04:22:11PM +0200, Kevin Wolf wrote:
 Am 30.09.2011 17:49, schrieb Amit Shah:
  On (Fri) 30 Sep 2011 [16:23:30], Stefan Hajnoczi wrote:
  On Fri, Sep 30, 2011 at 12:27 PM, Amit Shah amit.s...@redhat.com wrote:
  On (Fri) 30 Sep 2011 [11:39:11], Stefan Hajnoczi wrote:
  QED's metadata caching strategy allows two parallel requests to race for
  metadata lookup.  The first one to complete will populate the metadata
  cache and the second one will drop the data it just read in favor of the
  cached data.
 
  There is a use-after-free in qed_read_l2_table_cb() and
  qed_commit_l2_update() where l2_table-offset was used after the
  l2_table may have been freed due to a metadata lookup race.  Fix this by
  keeping the l2_offset in a local variable and not reaching into the
  possibly freed l2_table.
 
  Reported-by: Amit Shah amit.s...@redhat.com
  Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
  ---
  Hi Amit,
  Thanks for reporting the assertion failure you saw at 
  http://fpaste.org/CDuv/.
  Does this patch fix the problem?
 
  Yes, this fixes it.
 
  Were you able to reliably reproduce the assertion failure before?
  
  Absolutely.
  
  I even reverted the patch and tried the same image; same segfault
  again.
  
  I wonder because this only happens when two metadata lookups race
  (which is rare enough on my setup that I've never seen this failure).
  It might be worth trying a few times.
  
  Get the F16 beta-rc LXE live iso, install guest.  It doesn't cleanly
  reboot, you have to kill the VM.  Next start of the VM produces this
  segfault.
  
  https://alt.fedoraproject.org/pub/alt/stage/16-Beta.RC2/Live/x86_64/Fedora-16-Beta-x86_64-Live-LXDE.iso
 
 Can we try to artificially produce it in a qemu-iotests case?

I will take a look.

Stefan



[Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread M. Mohan Kumar
Security model is needed only for 'local' fs driver.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
 fsdev/qemu-fsdev.c |6 +
 fsdev/qemu-fsdev.h |1 +
 hw/9pfs/virtio-9p-device.c |   47 ++-
 vl.c   |   20 +++--
 4 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index 36db127..d08ba9c 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -58,11 +58,6 @@ int qemu_fsdev_add(QemuOpts *opts)
 return -1;
 }
 
-if (!sec_model) {
-fprintf(stderr, fsdev: No security_model specified.\n);
-return -1;
-}
-
 if (!path) {
 fprintf(stderr, fsdev: No path specified.\n);
 return -1;
@@ -72,6 +67,7 @@ int qemu_fsdev_add(QemuOpts *opts)
 
 fsle-fse.fsdev_id = g_strdup(fsdev_id);
 fsle-fse.path = g_strdup(path);
+fsle-fse.fsdriver = g_strdup(fstype);
 fsle-fse.security_model = g_strdup(sec_model);
 fsle-fse.ops = FsTypes[i].ops;
 fsle-fse.cache_flags = 0;
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index 9c440f2..0f67880 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -40,6 +40,7 @@ typedef struct FsTypeTable {
 typedef struct FsTypeEntry {
 char *fsdev_id;
 char *path;
+char *fsdriver;
 char *security_model;
 int cache_flags;
 FileOperations *ops;
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index aac58ad..1846e36 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
*conf)
 exit(1);
 }
 
-if (!strcmp(fse-security_model, passthrough)) {
-/* Files on the Fileserver set to client user credentials */
-s-ctx.fs_sm = SM_PASSTHROUGH;
-s-ctx.xops = passthrough_xattr_ops;
-} else if (!strcmp(fse-security_model, mapped)) {
-/* Files on the fileserver are set to QEMU credentials.
- * Client user credentials are saved in extended attributes.
- */
-s-ctx.fs_sm = SM_MAPPED;
-s-ctx.xops = mapped_xattr_ops;
-} else if (!strcmp(fse-security_model, none)) {
-/*
- * Files on the fileserver are set to QEMU credentials.
- */
-s-ctx.fs_sm = SM_NONE;
-s-ctx.xops = none_xattr_ops;
-} else {
-fprintf(stderr, Default to security_model=none. You may want
- enable advanced security model using 
-security option:\n\t security_model=passthrough\n\t 
-security_model=mapped\n);
-s-ctx.fs_sm = SM_NONE;
-s-ctx.xops = none_xattr_ops;
+/* security models is needed only for local fs driver */
+if (!strcmp(fse-fsdriver, local)) {
+if (!strcmp(fse-security_model, passthrough)) {
+/* Files on the Fileserver set to client user credentials */
+s-ctx.fs_sm = SM_PASSTHROUGH;
+s-ctx.xops = passthrough_xattr_ops;
+} else if (!strcmp(fse-security_model, mapped)) {
+/* Files on the fileserver are set to QEMU credentials.
+* Client user credentials are saved in extended attributes.
+*/
+s-ctx.fs_sm = SM_MAPPED;
+s-ctx.xops = mapped_xattr_ops;
+} else if (!strcmp(fse-security_model, none)) {
+/*
+* Files on the fileserver are set to QEMU credentials.
+*/
+s-ctx.fs_sm = SM_NONE;
+s-ctx.xops = none_xattr_ops;
+} else {
+fprintf(stderr, Invalid security_model %s specified.\n
+Available security models are:\t 
+passthrough,mapped or none\n, fse-security_model);
+exit(1);
+}
 }
 
 s-ctx.cache_flags = fse-cache_flags;
diff --git a/vl.c b/vl.c
index 6760e39..a961fa3 100644
--- a/vl.c
+++ b/vl.c
@@ -2795,6 +2795,7 @@ int main(int argc, char **argv, char **envp)
 QemuOpts *fsdev;
 QemuOpts *device;
 const char *cache;
+const char *fsdriver;
 
 olist = qemu_find_opts(virtfs);
 if (!olist) {
@@ -2809,13 +2810,26 @@ int main(int argc, char **argv, char **envp)
 
 if (qemu_opt_get(opts, fstype) == NULL ||
 qemu_opt_get(opts, mount_tag) == NULL ||
-qemu_opt_get(opts, path) == NULL ||
-qemu_opt_get(opts, security_model) == NULL) {
+qemu_opt_get(opts, path) == NULL) {
 fprintf(stderr, Usage: -virtfs fstype,path=/share_path/,
-security_model=[mapped|passthrough|none],
+{security_model=[mapped|passthrough|none]},
 mount_tag=tag.\n);
 exit(1);
 }
+fsdriver = 

Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread Daniel P. Berrange
On Wed, Oct 12, 2011 at 01:24:16PM +0530, M. Mohan Kumar wrote:
 Security model is needed only for 'local' fs driver.
 
 Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
 ---
  fsdev/qemu-fsdev.c |6 +
  fsdev/qemu-fsdev.h |1 +
  hw/9pfs/virtio-9p-device.c |   47 ++-
  vl.c   |   20 +++--
  4 files changed, 43 insertions(+), 31 deletions(-)
 
 --- a/fsdev/qemu-fsdev.h
 +++ b/fsdev/qemu-fsdev.h
 @@ -40,6 +40,7 @@ typedef struct FsTypeTable {
  typedef struct FsTypeEntry {
  char *fsdev_id;
  char *path;
 +char *fsdriver;
  char *security_model;
  int cache_flags;
  FileOperations *ops;
 diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
 index aac58ad..1846e36 100644
 --- a/hw/9pfs/virtio-9p-device.c
 +++ b/hw/9pfs/virtio-9p-device.c
 @@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
 *conf)
  exit(1);
  }
  
 -if (!strcmp(fse-security_model, passthrough)) {
 -/* Files on the Fileserver set to client user credentials */
 -s-ctx.fs_sm = SM_PASSTHROUGH;
 -s-ctx.xops = passthrough_xattr_ops;
 -} else if (!strcmp(fse-security_model, mapped)) {
 -/* Files on the fileserver are set to QEMU credentials.
 - * Client user credentials are saved in extended attributes.
 - */
 -s-ctx.fs_sm = SM_MAPPED;
 -s-ctx.xops = mapped_xattr_ops;
 -} else if (!strcmp(fse-security_model, none)) {
 -/*
 - * Files on the fileserver are set to QEMU credentials.
 - */
 -s-ctx.fs_sm = SM_NONE;
 -s-ctx.xops = none_xattr_ops;
 -} else {
 -fprintf(stderr, Default to security_model=none. You may want
 - enable advanced security model using 
 -security option:\n\t security_model=passthrough\n\t 
 -security_model=mapped\n);
 -s-ctx.fs_sm = SM_NONE;
 -s-ctx.xops = none_xattr_ops;
 +/* security models is needed only for local fs driver */
 +if (!strcmp(fse-fsdriver, local)) {
 +if (!strcmp(fse-security_model, passthrough)) {
 +/* Files on the Fileserver set to client user credentials */
 +s-ctx.fs_sm = SM_PASSTHROUGH;
 +s-ctx.xops = passthrough_xattr_ops;
 +} else if (!strcmp(fse-security_model, mapped)) {
 +/* Files on the fileserver are set to QEMU credentials.
 +* Client user credentials are saved in extended attributes.
 +*/
 +s-ctx.fs_sm = SM_MAPPED;
 +s-ctx.xops = mapped_xattr_ops;
 +} else if (!strcmp(fse-security_model, none)) {
 +/*
 +* Files on the fileserver are set to QEMU credentials.
 +*/
 +s-ctx.fs_sm = SM_NONE;
 +s-ctx.xops = none_xattr_ops;
 +} else {
 +fprintf(stderr, Invalid security_model %s specified.\n
 +Available security models are:\t 
 +passthrough,mapped or none\n, fse-security_model);
 +exit(1);
 +}

Are you sure there aren't use cases where people would like to
choose between  passthrough  mapped, even when using the 'proxy'
or 'handle' security drivers.

Both of the security models seem pretty generally useful to me,
regardless of the driver type.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin
On Wed, Oct 12, 2011 at 02:11:37PM +1100, David Gibson wrote:
 On Sun, Oct 02, 2011 at 12:52:39PM +0200, Michael S. Tsirkin wrote:
  On Sun, Oct 02, 2011 at 12:29:08PM +0200, Avi Kivity wrote:
   On 10/02/2011 12:25 PM, Michael S. Tsirkin wrote:
   On Mon, Sep 05, 2011 at 02:34:56PM +1000, David Gibson wrote:
 This patch adds functions to pci.[ch] to perform PCI DMA operations.  
At
 present, these are just stubs which perform directly cpu physical 
memory
 accesses.
   
 Using these stubs, however, distinguishes PCI device DMA transactions 
from
 other accesses to physical memory, which will allow PCI IOMMU support 
to
 be added in one place, rather than updating every PCI driver at that 
time.
   
 That is, it allows us to update individual PCI drivers to support an 
IOMMU
 without having yet determined the details of how the IOMMU emulation 
will
 operate.  This will let us remove the most bitrot-sensitive part of an
 IOMMU patch in advance.
   
 Signed-off-by: David Gibsonda...@gibson.dropbear.id.au
   
   So something I just thought about:
   
   all wrappers now go through cpu_physical_memory_rw.
   This is a problem as e.g. virtio assumes that
   accesses such as stw are atomic. cpu_physical_memory_rw
   is a memcpy which makes no such guarantees.
   
   
   Let's change cpu_physical_memory_rw() to provide that guarantee for
   aligned two and four byte accesses.  Having separate paths just for
   that is not maintainable.
  
  Well, we also have stX_phys convert to target native endian-ness
  (nop for KVM but not necessarily for qemu).
 
 Yes.. as do the stX_pci_dma() helpers.  They assume LE, rather than
 having two variants, because PCI is an LE spec, and all normal PCI
 devices work in LE.

IMO, not really. PCI devices do DMA any way they like.  LE is
probably more common because both ARM and x86 processors are LE.

  If we need to model some perverse BE PCI device,
 it can reswap itself.

An explicit API for this would be cleaner.

-- 
MST



Re: [Qemu-devel] [PATCH 01/36] ds1225y: Use stdio instead of QEMUFile

2011-10-12 Thread Zhi Hui Li

On 10/11/2011 06:00 PM, Juan Quintela wrote:

QEMUFile * is only intended for migration nowadays.  Using it for
anything else just adds pain and a layer of buffers for no good
reason.

Signed-off-by: Juan Quintelaquint...@redhat.com
---
  hw/ds1225y.c |   28 
  1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/hw/ds1225y.c b/hw/ds1225y.c
index 9875c44..6852a61 100644
--- a/hw/ds1225y.c
+++ b/hw/ds1225y.c
@@ -29,7 +29,7 @@ typedef struct {
  DeviceState qdev;
  uint32_t chip_size;
  char *filename;
-QEMUFile *file;
+FILE *file;
  uint8_t *contents;
  } NvRamState;

@@ -70,9 +70,9 @@ static void nvram_writeb (void *opaque, target_phys_addr_t 
addr, uint32_t val)

  s-contents[addr] = val;
  if (s-file) {
-qemu_fseek(s-file, addr, SEEK_SET);
-qemu_put_byte(s-file, (int)val);
-qemu_fflush(s-file);
+fseek(s-file, addr, SEEK_SET);
+fputc(val, s-file);
+fflush(s-file);
  }
  }

@@ -108,15 +108,17 @@ static int nvram_post_load(void *opaque, int version_id)

  /* Close file, as filename may has changed in load/store process */
  if (s-file) {
-qemu_fclose(s-file);
+fclose(s-file);
  }

  /* Write back nvram contents */
-s-file = qemu_fopen(s-filename, wb);
+s-file = fopen(s-filename, wb);
  if (s-file) {
  /* Write back contents, as 'wb' mode cleaned the file */
-qemu_put_buffer(s-file, s-contents, s-chip_size);
-qemu_fflush(s-file);
+if (fwrite(s-contents, s-chip_size, 1, s-file) != 1) {
+printf(nvram_post_load: short write\n);
+}
+fflush(s-file);
  }

  return 0;
@@ -143,7 +145,7 @@ typedef struct {
  static int nvram_sysbus_initfn(SysBusDevice *dev)
  {
  NvRamState *s =FROM_SYSBUS(SysBusNvRamState, dev)-nvram;
-QEMUFile *file;
+FILE *file;
  int s_io;

  s-contents = g_malloc0(s-chip_size);
@@ -153,11 +155,13 @@ static int nvram_sysbus_initfn(SysBusDevice *dev)
  sysbus_init_mmio(dev, s-chip_size, s_io);

  /* Read current file */
-file = qemu_fopen(s-filename, rb);
+file = fopen(s-filename, rb);
  if (file) {
  /* Read nvram contents */
-qemu_get_buffer(file, s-contents, s-chip_size);
-qemu_fclose(file);
+if (fread(s-contents, s-chip_size, 1, file) != 1) {
+printf(nvram_sysbus_initfn: short read\n);
+}
+fclose(file);
  }
  nvram_post_load(s, 0);



Tested-by: Zhi Hui Lizhihu...@linux.vnet.ibm.com



Re: [Qemu-devel] [PATCH 3/6] block: switch bdrv_read()/bdrv_write() to coroutines

2011-10-12 Thread Stefan Hajnoczi
On Tue, Oct 11, 2011 at 7:44 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Thu, Oct 6, 2011 at 12:17 AM, Stefan Hajnoczi
 stefa...@linux.vnet.ibm.com wrote:
 @@ -1101,36 +1144,7 @@ static void set_dirty_bitmap(BlockDriverState *bs, 
 int64_t sector_num,
  int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors)
  {
 -    BlockDriver *drv = bs-drv;
 -
 -    if (!bs-drv)
 -        return -ENOMEDIUM;
 -
 -    if (bdrv_has_async_rw(drv)  qemu_in_coroutine()) {
 -        QEMUIOVector qiov;
 -        struct iovec iov = {
 -            .iov_base = (void *)buf,
 -            .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
 -        };
 -
 -        qemu_iovec_init_external(qiov, iov, 1);
 -        return bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
 -    }
 -
 -    if (bs-read_only)
 -        return -EACCES;
 -    if (bdrv_check_request(bs, sector_num, nb_sectors))
 -        return -EIO;
 -
 -    if (bs-dirty_bitmap) {
 -        set_dirty_bitmap(bs, sector_num, nb_sectors, 1);
 -    }
 -
 -    if (bs-wr_highest_sector  sector_num + nb_sectors - 1) {
 -        bs-wr_highest_sector = sector_num + nb_sectors - 1;
 -    }
 The above codes are removed, will it be safe?

If you are checking that removing bs-wr_highest_sector code is okay,
then yes, it is safe because bdrv_co_do_writev() does the dirty bitmap
and wr_highest_sector updates.  We haven't lost any code by unifying
request processing - bdrv_co_do_writev() must do everything that
bdrv_aio_writev() and bdrv_write() did.

Stefan



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Gerd Hoffmann

  Hi,


Yes.. as do the stX_pci_dma() helpers.  They assume LE, rather than
having two variants, because PCI is an LE spec, and all normal PCI
devices work in LE.


IMO, not really. PCI devices do DMA any way they like.  LE is
probably more common because both ARM and x86 processors are LE.


Also having _le_ in the function name makes explicitly clear that the 
functions read/write little endian values and byteswaps if needed, which 
makes the code more readable.  I'd suggest to add it even if there is no 
need for a _be_ companion as devices needing that are rare.


cheers,
  Gerd




Re: [Qemu-devel] [PATCH 3/6] block: switch bdrv_read()/bdrv_write() to coroutines

2011-10-12 Thread Zhi Yong Wu
On Wed, Oct 12, 2011 at 5:03 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Tue, Oct 11, 2011 at 7:44 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Thu, Oct 6, 2011 at 12:17 AM, Stefan Hajnoczi
 stefa...@linux.vnet.ibm.com wrote:
 @@ -1101,36 +1144,7 @@ static void set_dirty_bitmap(BlockDriverState *bs, 
 int64_t sector_num,
  int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors)
  {
 -    BlockDriver *drv = bs-drv;
 -
 -    if (!bs-drv)
 -        return -ENOMEDIUM;
 -
 -    if (bdrv_has_async_rw(drv)  qemu_in_coroutine()) {
 -        QEMUIOVector qiov;
 -        struct iovec iov = {
 -            .iov_base = (void *)buf,
 -            .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
 -        };
 -
 -        qemu_iovec_init_external(qiov, iov, 1);
 -        return bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
 -    }
 -
 -    if (bs-read_only)
 -        return -EACCES;
 -    if (bdrv_check_request(bs, sector_num, nb_sectors))
 -        return -EIO;
How about the above four lines of codes?
 -
 -    if (bs-dirty_bitmap) {
 -        set_dirty_bitmap(bs, sector_num, nb_sectors, 1);
 -    }
 -
 -    if (bs-wr_highest_sector  sector_num + nb_sectors - 1) {
 -        bs-wr_highest_sector = sector_num + nb_sectors - 1;
 -    }
 The above codes are removed, will it be safe?

 If you are checking that removing bs-wr_highest_sector code is okay,
 then yes, it is safe because bdrv_co_do_writev() does the dirty bitmap
 and wr_highest_sector updates.  We haven't lost any code by unifying
OK. got it. thanks.
 request processing - bdrv_co_do_writev() must do everything that
 bdrv_aio_writev() and bdrv_write() did.

 Stefan




-- 
Regards,

Zhi Yong Wu



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin
On Wed, Oct 12, 2011 at 02:09:26PM +1100, David Gibson wrote:
 On Mon, Oct 03, 2011 at 08:17:05AM -0500, Anthony Liguori wrote:
  On 10/02/2011 07:14 AM, Michael S. Tsirkin wrote:
  On Sun, Oct 02, 2011 at 02:01:10PM +0200, Avi Kivity wrote:
  Hmm, not entirely virtio specific, some devices use stX macros to do the
  conversion.  E.g. stw_be_phys and stl_le_phys are used in several
  places.
  
  These are fine - explicit endianness.
  
  Right. So changing these to e.g. stl_dma and assuming
  LE is default seems like a step backwards.
  
  We're generalizing too much.
  
  In general, the device model doesn't need atomic access functions.

Anthony, are you sure? PCI both provides atomic operations for devices (likely
uncommon). PCI express spec strongly recommends at least dword update
granularity for both reads and writes.
Some guests might depend on this.

  That's because device model RAM access is not coherent with CPU RAM
  access.
  Virtio is a very, very special case.  virtio requires coherent RAM
  access.

E.g., e1000 driver seems to allocate its rings in coherent memory.
Required? Your guess is as good as mine. It seems to work fine
ATM without these guarantees.

 Right, but it should only need that for the actual rings in the virtio
 core.  I was expecting that those would remain as direct physical
 memory accesses - precisely because virtio is special - rather than
 accesses through any kind of DMA interface.

At the moment, yes. Further, that was just an example I know about.
How about msi/msix? We don't want to
split these writes as that would confuse the APIC.

 -- 
 David Gibson  | I'll have my music baroque, and my code
 david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ 
 _other_
   | _way_ _around_!
 http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Is realview-pb-a8 fully supported ?

2011-10-12 Thread Peter Maydell
On 10 October 2011 14:48, Francis Moreau francis.m...@gmail.com wrote:
 On Mon, Oct 10, 2011 at 10:42 AM, Peter Maydell
 peter.mayd...@linaro.org wrote:
 On 10 October 2011 08:35, Francis Moreau francis.m...@gmail.com wrote:
 I noticed another point for the realview platofrm: if I boot with -M
 512, it works however if I set -M 256 then it doesn't.

 Perhaps your kernel is configured to load in the higher 256MB
 address range

 hmm which options do you have in mind ?

Hmm, I thought there was an option for this but I can't find it
in the config, so I must have been misremembering somehow.

 When I say it doesn't work, it means that nothing happen when
 starting qemu: no trace, it looks like it's running an infinite loop.

Not even Uncompressing the kernel ?

If you want to track down what's going on then you'll need to
connect an ARM gdb up to qemu and single step through the boot
process, I'm afraid.

 BTW I'm wondering which kernel source I should use to build kernels
 for such plateforms (realview, vexpress, versatile) ? I'm currently
 using the source from kernel.org (well similar since this server seems
 really dead). but I'm not sure if it's a good idea...

I think the mainline kernel sources should in theory work
(in particular if they work with 512MB then that's a good
sign...) but I'm not a kernel expert; mostly I use other peoples'
prebuilt ones.

-- PMM



Re: [Qemu-devel] qemu-0.15.1 stable call for patches

2011-10-12 Thread Brad

On 26/09/11 9:16 AM, Justin M. Forbes wrote:

With the current patch queue I would like to start getting qemu-0.15.1
stable into shape.  With this in mind, the plan is to tag the release on
Monday Oct 3rd.  If you have patches pending for stable, now would be the
time to send them. Please CC jmfor...@linuxtx.org if you can to ensure that
I see them.

Thanks,
Justin


Is there anywhere where we can see what's been pulled into the pending 
0.15.1 code base so far since you don't seem to really post to the list

about the stable branches?


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.




Re: [Qemu-devel] [PATCH 1/2] hw/9pfs: Add new virtfs option cache=writethrough to skip host page cache

2011-10-12 Thread Stefan Hajnoczi
On Mon, Oct 10, 2011 at 10:06 AM, Aneesh Kumar K.V
aneesh.ku...@linux.vnet.ibm.com wrote:
 diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
 index 5c8b5ed..441a37f 100644
 --- a/hw/9pfs/virtio-9p-handle.c
 +++ b/hw/9pfs/virtio-9p-handle.c
 @@ -202,6 +202,15 @@ static ssize_t handle_pwritev(FsContext *ctx, int fd, 
 const struct iovec *iov,
         return writev(fd, iov, iovcnt);

The sync_file_range(2) call below is dead-code since we'll return
immediately after writev(2) completes.  The writev(2) return value
needs to be saved temporarily and then returned after
sync_file_range(2).

     }
  #endif
 +    if (ctx-cache_flags  V9FS_WRITETHROUGH_CACHE) {

-drive cache=writethrough means something different from 9pfs
writethrough.  This is confusing so I wonder if there is a better
name like immediate write-out.

 +        /*
 +         * Initiate a writeback. This is not a data integrity sync.
 +         * We want to ensure that we don't leave dirty pages in the cache
 +         * after write when cache=writethrough is sepcified.
 +         */
 +        sync_file_range(fd, offset, 0,
 +                        SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE);
 +    }

I'm not sure whether SYNC_FILE_RANGE_WAIT_BEFORE is necessary.  As a
best-effort mechanism just SYNC_FILE_RANGE_WRITE does the job although
a client that rapidly rewrites may be able to leave dirty pages in the
host page cache.  SYNC_FILE_RANGE_WAIT_BEFORE ensures that dirty pages
get written out but it is no longer asynchronous because it blocks.

Stefan



Re: [Qemu-devel] [BUG] USB assertion triggers in usb_packet_complete()

2011-10-12 Thread Stefan Hajnoczi
On Tue, Oct 11, 2011 at 8:35 AM, Thomas Huth th...@linux.vnet.ibm.com wrote:
 Am Mon, 10 Oct 2011 15:03:41 +0200
 schrieb Thomas Huth th...@linux.vnet.ibm.com:

 I am currently facing a problem when running QEMU (up-to-date git
 version) with OHCI and a lot of virtual USB devices.
 The emulator dies with the following assertion:

 qemu-system-arm: hw/usb.c:337: usb_packet_complete:
 Assertion `p-owner != ((void *)0)' failed.

Hi Thomas,
I hit the same bug recently and Gerd has posted a patch which you can test:
http://patchwork.ozlabs.org/patch/118726/

Stefan



[Qemu-devel] PCI 64-bit BAR access with qemu

2011-10-12 Thread Francois WELLENREITER


Hi there,

I've read a few days ago that it was possible to emulate PCI device 
with 64-bit BARs and have a real 64-bit memory access.
Thus, I've created a virtual device named toto accessible through a 
64-bit BAR


___

static const MemoryRegionOps bxi_common_mmio_ops = {
.read = toto_mmio_read,
.write = toto_mmio_write,
.endianness = DEVICE_LITTLE_ENDIAN,
.impl = {
.min_access_size = 1,
.max_access_size = 8,
},
};

___

memory_region_init_io(d-mmio, toto_mmio_ops, d, toto-mmio,
0x1000);

pci_register_bar(d-dev, BAR_0,
 PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
d-mmio);



when I use in my driver pointers to unsigned char, unsigned short or 
unsigned int,

I can see that toto_mmio_read is called with increasing sizes.
But that does not work at all with unsigned long long.
In such a case, I always obtain two calls to toto_mmio_read with sizes 
limited to 4.
My understanding here is that a direct 64-bit memory access does not 
work (but is composed of 2 32-bit accesses).


Then, I've tried to understand why this happens and realized that 
cpu_physical_memory_rw was always called for this memory region (for in 
the old_mmio manner) :


___

(gdb) bt
#0  toto_mmio_read (opaque=0x2bec2c0, addr=0, size=4) at 
/home/workspace/qemu/hw/toto_hw.c:92
#1  0x005f6d37 in memory_region_read_accessor (opaque=0x2bec738, 
addr=0, value=0x7f9997ffec78, size=

4, shift=0, mask=4294967295) at /home/workspace/qemu/memory.c:239
#2  0x005f6ee0 in access_with_adjusted_size (addr=0, 
value=0x7f9997ffec78, size=4, access_size_min=1,
access_size_max=8, access=0x5f6cdf memory_region_read_accessor, 
opaque=0x2bec738)

at /home/workspace/qemu/memory.c:284
#3  0x005f8909 in memory_region_read_thunk_n (_mr=0x2bec738, 
addr=0, size=4)

at /home/workspace/qemu/memory.c:824
#4  0x005f8af1 in memory_region_read_thunk_l (mr=0x2bec738, addr=0)
at /home/workspace/qemu/memory.c:867
#5  0x005c8f69 in cpu_physical_memory_rw (addr=3892314112, 
buf=0x7f999c5d0028 \320\b, len=8,

is_write=0) at /home/workspace/qemu/exec.c:3965
#6  0x005ecfa1 in kvm_cpu_exec (env=0x238ee10) at 
/home/workspace/qemu/kvm-all.c:985
#7  0x005bf1fb in qemu_kvm_cpu_thread_fn (arg=0x238ee10) at 
/home/workspace/qemu/cpus.c:661

#8  0x0034dfc077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x0034df4e68ed in clone () from /lib64/libc.so.6



Did I miss anything ? Is it a real defect ?
Are there developments still planned to allow a direct 64-bit access ?

Thanks for any help,

 François




Re: [Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-12 Thread Stefan Hajnoczi
On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
 Am 07.10.2011 17:49, schrieb Stefan Hajnoczi:
  Image streaming copies data from the backing file into the image file.  It 
  is
  important to represent zero regions from the backing file efficiently during
  streaming, otherwise the image file grows to the full virtual disk size and
  loses sparseness.
  
  There are two ways to implement zero write detection, they are subtly 
  different:
  
  1. Allow image formats to provide efficient representations for zero 
  regions.
 QED does this with zero clusters and it has been discussed for qcow2v3.
  
  2. During streaming, check for zeroes and skip writing to the image file 
  when
 zeroes are detected.
  
  However, there are some disadvantages to #2 because it leaves unallocated 
  holes
  in the image file.  If image streaming is aborted before it completes then 
  it
  will be necessary to reread all unallocated clusters from the backing file 
  upon
  resuming image streaming.  Potentionally worse is that a backing file over a
  slow remote connection will have the zero regions fetched again and again if
  the guest accesses them.  #1 avoids these problems because the image file
  contains information on which regions are zeroes and do not need to be
  refetched.
  
  This patch series implements #1 with the existing QED zero cluster feature. 
   In
  the future we can add qcow2v3 zero clusters too.  We can also implement #2
  directly in the image streaming code as a fallback when the BlockDriver does
  not support zero detection #1 itself.  That way we get the best possible 
  zero
  write detection, depending on the image format.
  
  Here is a qemu-iotest to verify that zero write detection is working:
  http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37
  
  Stefan Hajnoczi (3):
block: add zero write detection interface
qed: add zero write detection support
qemu-io: add zero write detection option
  
   block.c |   16 +++
   block.h |2 +
   block/qed.c |   81 
  +--
   block_int.h |   13 +
   qemu-io.c   |   35 -
   5 files changed, 132 insertions(+), 15 deletions(-)
 
 It's good to have an option to detect zero writes and turn them into
 zero clusters, but it's something that introduces some overhead and
 probably won't be suitable as a default.

Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
at runtime.  By default it is off to save CPU cycles.

 I think what we really want to have for image streaming is an API that
 explicitly writes zeros and doesn't have to look at the whole buffer (or
 actually doesn't even get a buffer).

I didn't take this approach to avoid having block drivers handle the
zero buffers that need to be allocated when the region does not cover
entire clusters.  It can be done for sure but I'm not sure how to do it
nicely yet.

Stefan



Re: [Qemu-devel] PCI 64-bit BAR access with qemu

2011-10-12 Thread Max Filippov
    I've read a few days ago that it was possible to emulate PCI device with
 64-bit BARs and have a real 64-bit memory access.
 Thus, I've created a virtual device named toto accessible through a 64-bit
 BAR

You've probably confused an ability to locate BAR anywhere in 64-bit
address space (such BAR actually spans 2 consecutive PCI BAR registers
and has 100 in its 3 least significant bits) and an ability to access
BAR-mapped memory in 64 bit items.

You obviously want the latter but currently it is not implemented, see e.g.

static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
  target_ulong addr,
  void *retaddr)

definition in the softmmu_template.h.

And it's quite simple to fix it, you only need to change
io_{read,write} in the softmmu_template.h and extend
io_mem_{read,write} loops in exec.c to 4 elements, taking care that
io_mem_{read,write}[3] can pass uint64_t.

-- 
Thanks.
-- Max



Re: [Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-12 Thread Kevin Wolf
Am 12.10.2011 12:39, schrieb Stefan Hajnoczi:
 On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
 Am 07.10.2011 17:49, schrieb Stefan Hajnoczi:
 Image streaming copies data from the backing file into the image file.  It 
 is
 important to represent zero regions from the backing file efficiently during
 streaming, otherwise the image file grows to the full virtual disk size and
 loses sparseness.

 There are two ways to implement zero write detection, they are subtly 
 different:

 1. Allow image formats to provide efficient representations for zero 
 regions.
QED does this with zero clusters and it has been discussed for qcow2v3.

 2. During streaming, check for zeroes and skip writing to the image file 
 when
zeroes are detected.

 However, there are some disadvantages to #2 because it leaves unallocated 
 holes
 in the image file.  If image streaming is aborted before it completes then 
 it
 will be necessary to reread all unallocated clusters from the backing file 
 upon
 resuming image streaming.  Potentionally worse is that a backing file over a
 slow remote connection will have the zero regions fetched again and again if
 the guest accesses them.  #1 avoids these problems because the image file
 contains information on which regions are zeroes and do not need to be
 refetched.

 This patch series implements #1 with the existing QED zero cluster feature. 
  In
 the future we can add qcow2v3 zero clusters too.  We can also implement #2
 directly in the image streaming code as a fallback when the BlockDriver does
 not support zero detection #1 itself.  That way we get the best possible 
 zero
 write detection, depending on the image format.

 Here is a qemu-iotest to verify that zero write detection is working:
 http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37

 Stefan Hajnoczi (3):
   block: add zero write detection interface
   qed: add zero write detection support
   qemu-io: add zero write detection option

  block.c |   16 +++
  block.h |2 +
  block/qed.c |   81 
 +--
  block_int.h |   13 +
  qemu-io.c   |   35 -
  5 files changed, 132 insertions(+), 15 deletions(-)

 It's good to have an option to detect zero writes and turn them into
 zero clusters, but it's something that introduces some overhead and
 probably won't be suitable as a default.
 
 Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
 at runtime.  By default it is off to save CPU cycles.
 
 I think what we really want to have for image streaming is an API that
 explicitly writes zeros and doesn't have to look at the whole buffer (or
 actually doesn't even get a buffer).
 
 I didn't take this approach to avoid having block drivers handle the
 zero buffers that need to be allocated when the region does not cover
 entire clusters.  It can be done for sure but I'm not sure how to do it
 nicely yet.

If I understand your QED code right, in such cases it ignores that there
are some zeros that could be turned into a zero cluster. Considering
this and that you always fill a buffer just to be able to check it
(which is known to take considerable time from qemu-img convert
experience) - how could any solution that works consistently, but
requires an allocation in the block driver be less nice?

Kevin



[Qemu-devel] balloon driver on winxp guest start failed

2011-10-12 Thread hkran

Hi,

I used balloon driver for windows  virtio-win-0.1-15.iso (from 
http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/bin/)


following the install guard , I installed the balloon driver like this:

devcon.exe install d:\wxp\x86\balloon.inf 
PCI\VEN_1AF4DEV_1002SUBSYS_00051AF4REV_00
 then reboot guest Os, but the status of driver installed is always 
incorrect, that show me the driver start failed (code 10) in the device 
manager.


I typed the following cmds in the monitor command line:

(qemu) device_add virtio-balloon
(qemu) info balloon
balloon: actual=2048
(qemu) balloon 1024
(qemu) info balloon
balloon: actual=2048
(qemu) info balloon
balloon: actual=2048

And I also tried it by using qemu -balloon virtio param  when getting 
qemu up, the status is worse, the winxp guest froze at boot screen.


Am I using balloon driver in a correct way?





Re: [Qemu-devel] [BUG] USB assertion triggers in usb_packet_complete()

2011-10-12 Thread Thomas Huth
Am Wed, 12 Oct 2011 11:02:42 +0100
schrieb Stefan Hajnoczi stefa...@gmail.com:

 On Tue, Oct 11, 2011 at 8:35 AM, Thomas Huth th...@linux.vnet.ibm.com wrote:
  Am Mon, 10 Oct 2011 15:03:41 +0200
  schrieb Thomas Huth th...@linux.vnet.ibm.com:
 
  I am currently facing a problem when running QEMU (up-to-date git
  version) with OHCI and a lot of virtual USB devices.
  The emulator dies with the following assertion:
 
  qemu-system-arm: hw/usb.c:337: usb_packet_complete:
  Assertion `p-owner != ((void *)0)' failed.
 
 Hi Thomas,
 I hit the same bug recently and Gerd has posted a patch which you can test:
 http://patchwork.ozlabs.org/patch/118726/

Thanks for the hint, Stefan, you're right, that seems to be the same
bug. Your patch is working fine in my scenario, too.

However, Gerd's patch is not working for me, the assertion still
triggers. It seems like usb_packet_complete() is called for the leaf
node before it is called for the hub node, so the leaf node already set
p-owner = NULL.

 Thomas



[Qemu-devel] [PATCH] usb-hid: activate usb tablet / mouse after migration.

2011-10-12 Thread Gerd Hoffmann
qemu uses the ps/2 mouse by default.  The usb tablet (or mouse) is
activated as soon as qemu sees some guest activity on the device,
i.e. polling for HID events.  That used to work fine for both fresh
boot and migration.

Remote wakeup support changed the picture though: There will be no
polling after migration in case the guest suspended the usb bus,
waiting for wakeup events.  Result is that the ps/2 mouse stays
active.

Fix this by activating the usb tablet / mouse in post_load() in case
the guest enabled remote wakeup.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb-hid.c |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/hw/usb-hid.c b/hw/usb-hid.c
index 7c564b6..997f828 100644
--- a/hw/usb-hid.c
+++ b/hw/usb-hid.c
@@ -520,10 +520,21 @@ static int usb_keyboard_initfn(USBDevice *dev)
 return usb_hid_initfn(dev, HID_KEYBOARD);
 }
 
+static int usb_ptr_post_load(void *opaque, int version_id)
+{
+USBHIDState *s = opaque;
+
+if (s-dev.remote_wakeup) {
+hid_pointer_activate(s-hid);
+}
+return 0;
+}
+
 static const VMStateDescription vmstate_usb_ptr = {
 .name = usb-ptr,
 .version_id = 1,
 .minimum_version_id = 1,
+.post_load = usb_ptr_post_load,
 .fields = (VMStateField []) {
 VMSTATE_USB_DEVICE(dev, USBHIDState),
 VMSTATE_HID_POINTER_DEVICE(hid, USBHIDState),
-- 
1.7.1




Re: [Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-12 Thread Stefan Hajnoczi
On Wed, Oct 12, 2011 at 12:03 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 12.10.2011 12:39, schrieb Stefan Hajnoczi:
 On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
 Am 07.10.2011 17:49, schrieb Stefan Hajnoczi:
 Image streaming copies data from the backing file into the image file.  It 
 is
 important to represent zero regions from the backing file efficiently 
 during
 streaming, otherwise the image file grows to the full virtual disk size and
 loses sparseness.

 There are two ways to implement zero write detection, they are subtly 
 different:

 1. Allow image formats to provide efficient representations for zero 
 regions.
    QED does this with zero clusters and it has been discussed for 
 qcow2v3.

 2. During streaming, check for zeroes and skip writing to the image file 
 when
    zeroes are detected.

 However, there are some disadvantages to #2 because it leaves unallocated 
 holes
 in the image file.  If image streaming is aborted before it completes then 
 it
 will be necessary to reread all unallocated clusters from the backing file 
 upon
 resuming image streaming.  Potentionally worse is that a backing file over 
 a
 slow remote connection will have the zero regions fetched again and again 
 if
 the guest accesses them.  #1 avoids these problems because the image file
 contains information on which regions are zeroes and do not need to be
 refetched.

 This patch series implements #1 with the existing QED zero cluster 
 feature.  In
 the future we can add qcow2v3 zero clusters too.  We can also implement #2
 directly in the image streaming code as a fallback when the BlockDriver 
 does
 not support zero detection #1 itself.  That way we get the best possible 
 zero
 write detection, depending on the image format.

 Here is a qemu-iotest to verify that zero write detection is working:
 http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37

 Stefan Hajnoczi (3):
   block: add zero write detection interface
   qed: add zero write detection support
   qemu-io: add zero write detection option

  block.c     |   16 +++
  block.h     |    2 +
  block/qed.c |   81 
 +--
  block_int.h |   13 +
  qemu-io.c   |   35 -
  5 files changed, 132 insertions(+), 15 deletions(-)

 It's good to have an option to detect zero writes and turn them into
 zero clusters, but it's something that introduces some overhead and
 probably won't be suitable as a default.

 Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
 at runtime.  By default it is off to save CPU cycles.

 I think what we really want to have for image streaming is an API that
 explicitly writes zeros and doesn't have to look at the whole buffer (or
 actually doesn't even get a buffer).

 I didn't take this approach to avoid having block drivers handle the
 zero buffers that need to be allocated when the region does not cover
 entire clusters.  It can be done for sure but I'm not sure how to do it
 nicely yet.

 If I understand your QED code right, in such cases it ignores that there
 are some zeros that could be turned into a zero cluster. Considering
 this and that you always fill a buffer just to be able to check it
 (which is known to take considerable time from qemu-img convert
 experience) - how could any solution that works consistently, but
 requires an allocation in the block driver be less nice?

The fallback is easy when you already have a buffer - just do the write :).

My point is that this patch is the simplest approach.  Other
approaches can optimize better and the question is whether they are
worth doing.

Stefan



Re: [Qemu-devel] PCI 64-bit BAR access with qemu

2011-10-12 Thread Francois WELLENREITER


Hi Max,

thank you for your answer. Actually, I hadn't confused.
I already thought to your proposal but I found that it was a really ugly 
solution (essentially because of the uint32_t to uint64_t silent 
conversion).

Isn't there any other (current or future) development that may fix it ?

François

Le 12/10/2011 13:00, Max Filippov a écrit :

I've read a few days ago that it was possible to emulate PCI device with
64-bit BARs and have a real 64-bit memory access.
Thus, I've created a virtual device named toto accessible through a 64-bit
BAR

You've probably confused an ability to locate BAR anywhere in 64-bit
address space (such BAR actually spans 2 consecutive PCI BAR registers
and has 100 in its 3 least significant bits) and an ability to access
BAR-mapped memory in 64 bit items.

You obviously want the latter but currently it is not implemented, see e.g.

static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
   target_ulong addr,
   void *retaddr)

definition in the softmmu_template.h.

And it's quite simple to fix it, you only need to change
io_{read,write} in the softmmu_template.h and extend
io_mem_{read,write} loops in exec.c to 4 elements, taking care that
io_mem_{read,write}[3] can pass uint64_t.






Re: [Qemu-devel] [PATCH 3/6] block: switch bdrv_read()/bdrv_write() to coroutines

2011-10-12 Thread Kevin Wolf
Am 05.10.2011 18:17, schrieb Stefan Hajnoczi:
 The bdrv_read()/bdrv_write() functions call .bdrv_read()/.bdrv_write().
 They should go through bdrv_co_do_readv() and bdrv_co_do_writev()
 instead in order to unify request processing code across sync, aio, and
 coroutine interfaces.  This is also an important step towards removing
 BlockDriverState .bdrv_read()/.bdrv_write() in the future.
 
 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

This breaks drivers that only provide synchronous .bdrv_read/write.
Attempts to read or write anything results in endless recursion.

Kevin



Re: [Qemu-devel] [RFC PATCH v2] Specification for qcow2 version 3

2011-10-12 Thread Stefan Hajnoczi
On Tue, Jun 28, 2011 at 10:38 AM, Frediano Ziglio fredd...@gmail.com wrote:
 2011/6/27 Kevin Wolf kw...@redhat.com:
 This is the second draft for what I think could be added when we increase 
 qcow2's
 version number to 3. This includes points that have been made by several 
 people
 over the past few months. We're probably not going to implement this next 
 week,
 but I think it's important to get discussions started early, so here it is.

 Changes implemented in this RFC:

 - Added compatible/incompatible/auto-clear feature bits plus an optional
  feature name table to allow useful error messages even if an older version
  doesn't know some feature at all.

 - Added a dirty flag which tells that the refcount may not be accurate (QED
  mode). This means that we can save writes to the refcount table with
  cache=writethrough, but isn't really useful otherwise since Qcow2Cache.

 - Configurable refcount width. If you don't want to use internal snapshots,
  make refcounts one bit and save cache space and I/O.

 - Added subclusters. This separate the COW size (one subcluster, I'm thinking
  of 64k default size here) from the allocation size (one cluster, 2M). Less
  fragmentation, less metadata, but still reasonable COW granularity.

  This also allows to preallocate clusters, but none of their subclusters. You
  can have an image that is like raw + COW metadata, and you can also
  preallocate metadata for images with backing files.

 - Zero cluster flags. This allows discard even with a backing file that 
 doesn't
  contain zeros. It is also useful for copy-on-read/image streaming, as you'll
  want to keep sparseness without accessing the remote image for an 
 unallocated
  cluster all the time.

 - Fixed internal snapshot metadata to use 64 bit VM state size. You can't 
 save
  a snapshot of a VM with = 4 GB RAM today.

 Possible future additions:

 - Add per-L2-table dirty flag to L1?
 - Add per-refcount-block full flag to refcount table?

 Hi,
  thinking about image improvement I would add

 - GUID for image and backing file
 - relative path for backing file

 This would help finding images in a distributed environment or if file
 are moved, ie: gfs/nfs/ocfs mounted in different mount points, backing
 used a template in a different images directory and move this
 directory somewhere else. Also with GUID a possible higher level could
 manage a GUID - file image db.

 I was also think about a backing file length field to support
 resizing but probably can be implemented with zero cluster. Assume you
 have a image of 5gb, create a new image with first image as backing
 one, now resize second image from 5gb to 3gb then resize it again
 (after some works) to 10gb, part from 3gb to 5gb should not be read
 from backing file.

Interesting idea.  One could argue either way.  When image file size
!= backing file size you need to know what you are doing :).  I think
the case where the image is smaller than the backing file is rare and
zeroing vs exposing the backing file on resize isn't an obvious
choice.

 Also a bit in l2 offset to say there is no l2 table cause all
 clusters in l2 are contiguous so we avoid entirely l2. Obviously this
 require an optimization step to detect or create such condition.

There are several reserved L1 entry bits which could be used to mark
this mode.  This mode severely restricts qcow2 features though: how
would snapshots and COW work?  Perhaps by breaking the huge cluster
back into an L2 table with individual clusters?  Backing files also
cannot be used - unless we extend the sub-clusters approach and also
keep a large bitmap with allocated/unallocated/zero information.

A mode like this could be used for best performance on local storage,
where efficiently image transport (e.g. scp or http) is not required.
Actually I think this is reasonable, we could use qemu-img convert to
produce a compact qcow2 for export and use the L2-less qcow2 for
running the actual VM.

Kevin: what do you think about fleshing out this mode instead of sub-clusters?

 This mail sound quite strange to me, I thought qed would be the future
 of qcow2 but I must be really wrong.

What it's called doesn't matter but we need better metadata, and by
making qcow2v3 extensible we can now improvements without losing
support for existing image files.

Stefan



Re: [Qemu-devel] [PATCH 4/6] block: switch bdrv_aio_readv() to coroutines

2011-10-12 Thread Kevin Wolf
Am 05.10.2011 18:17, schrieb Stefan Hajnoczi:
 More sync, aio, and coroutine unification.  Make bdrv_aio_readv() go
 through coroutine request processing.
 
 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 ---
  block.c |   48 +++-
  1 files changed, 35 insertions(+), 13 deletions(-)
 
 diff --git a/block.c b/block.c
 index 90c29db..b83e911 100644
 --- a/block.c
 +++ b/block.c
 @@ -78,6 +78,15 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState 
 *bs,
  int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
  static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
  int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 +static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
 +   int64_t sector_num,
 +   QEMUIOVector *qiov,
 +   int nb_sectors,
 +   BlockDriverCompletionFunc *cb,
 +   void *opaque,
 +   bool is_write,
 +   CoroutineEntry *entry);
 +static void coroutine_fn bdrv_co_do_rw(void *opaque);
  
  static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
  QTAILQ_HEAD_INITIALIZER(bdrv_states);
 @@ -2346,17 +2355,10 @@ BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState 
 *bs, int64_t sector_num,
   QEMUIOVector *qiov, int nb_sectors,
   BlockDriverCompletionFunc *cb, void *opaque)
  {
 -BlockDriver *drv = bs-drv;
 -
  trace_bdrv_aio_readv(bs, sector_num, nb_sectors, opaque);
  
 -if (!drv)
 -return NULL;
 -if (bdrv_check_request(bs, sector_num, nb_sectors))
 -return NULL;
 -
 -return drv-bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
 -   cb, opaque);
 +return bdrv_co_aio_rw_vector(bs, sector_num, qiov, nb_sectors,
 + cb, opaque, false, bdrv_co_do_rw);
  }
  
  typedef struct BlockCompleteData {
 @@ -2803,6 +2805,7 @@ static void bdrv_co_rw_bh(void *opaque)
  qemu_aio_release(acb);
  }
  
 +/* Invoke .bdrv_co_readv/.bdrv_co_writev */
  static void coroutine_fn bdrv_co_rw(void *opaque)
  {
  BlockDriverAIOCBCoroutine *acb = opaque;
 @@ -2820,13 +2823,32 @@ static void coroutine_fn bdrv_co_rw(void *opaque)
  qemu_bh_schedule(acb-bh);
  }
  
 +/* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
 +static void coroutine_fn bdrv_co_do_rw(void *opaque)
 +{
 +BlockDriverAIOCBCoroutine *acb = opaque;
 +BlockDriverState *bs = acb-common.bs;
 +
 +if (!acb-is_write) {
 +acb-req.error = bdrv_co_do_readv(bs, acb-req.sector,
 +acb-req.nb_sectors, acb-req.qiov);
 +} else {
 +acb-req.error = bdrv_co_do_writev(bs, acb-req.sector,
 +acb-req.nb_sectors, acb-req.qiov);
 +}
 +
 +acb-bh = qemu_bh_new(bdrv_co_rw_bh, acb);
 +qemu_bh_schedule(acb-bh);
 +}

The difference between the existing bdrv_co_rw and the new bdrv_co_do_rw
is that the former directly calls drv-... whereas the latter does some
checks first.

I think you could just switch bdrv_co_rw to do the checks. If I'm not
mistaken, the other path is dead code anyway after this change.
Actually, it looks like this series leaves quite a bit of dead code
behind, but I need to apply all patches and check the result to be sure.

Kevin



Re: [Qemu-devel] [PATCH 0/6] block: do request processing in a coroutine

2011-10-12 Thread Kevin Wolf
Am 05.10.2011 18:17, schrieb Stefan Hajnoczi:
 Block layer features like dirty block tracing, I/O throttling, and live block
 copy are forced to duplicate code due to the three different interfaces:
 synchronous, asynchronous, and coroutines.
 
 Since there are bdrv_read(), bdrv_aio_readv(), and bdrv_co_readv() interfaces
 for read (and similar for write), per-request processing needs to be 
 duplicated
 for each of these execution contexts.  For example, dirty block tracking code
 is duplicated across these three interfaces.
 
 This patch series unifies request processing so that there is only one code
 path.  I see this as a prerequisite to the live block copy (image streaming)
 code I am working on, so I'm pushing it now.
 
 The short-term win from this series is that it becomes easy to add live block
 copy and other features.  We now have a single code path where the 
 perf-request
 processing is done.
 
 The longer-term win will be dropping the BlockDriver .bdrv_read(),
 .bdrv_write(), .bdrv_aio_readv(), and .bdrv_aio_writev() interfaces.  By doing
 that we can bring all BlockDrivers onto a common interface, namely
 .bdrv_co_readv() and .bdrv_co_writev().  It will also allow us to drop most of
 the sync and aio emulation code.
 
 A consequence of this patch series is that every I/O request goes through at
 least one coroutine.  There is no longer a direct .bdrv_read(), .bdrv_write(),
 .bdrv_aio_readv(), or .bdrv_aio_writev() call - we're trying to phase out 
 those
 interfaces.  I have not noticed performance degradation in correctness tests
 but we need to confirm that there has not been a performance regression.
 
 Stefan Hajnoczi (6):
   block: directly invoke .bdrv_aio_*() in bdrv_co_io_em()
   block: split out bdrv_co_do_readv() and bdrv_co_do_writev()
   block: switch bdrv_read()/bdrv_write() to coroutines
   block: switch bdrv_aio_readv() to coroutines
   block: mark blocks dirty on coroutine write completion
   block: switch bdrv_aio_writev() to coroutines
 
  block.c |  273 
 +++
  1 files changed, 134 insertions(+), 139 deletions(-)
 

Except for breaking synchronous drivers this looks good.

I have applied the first two patches to the block branch, so if you
don't need to change them, submitting patches 3-6 for v2 will be enough.

Kevin



Re: [Qemu-devel] [PATCH 1/2] hw/9pfs: Add new virtfs option cache=writethrough to skip host page cache

2011-10-12 Thread Aneesh Kumar K.V
On Wed, 12 Oct 2011 10:55:21 +0100, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Mon, Oct 10, 2011 at 10:06 AM, Aneesh Kumar K.V
 aneesh.ku...@linux.vnet.ibm.com wrote:
  diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
  index 5c8b5ed..441a37f 100644
  --- a/hw/9pfs/virtio-9p-handle.c
  +++ b/hw/9pfs/virtio-9p-handle.c
  @@ -202,6 +202,15 @@ static ssize_t handle_pwritev(FsContext *ctx, int fd, 
  const struct iovec *iov,
          return writev(fd, iov, iovcnt);
 
 The sync_file_range(2) call below is dead-code since we'll return
 immediately after writev(2) completes.  The writev(2) return value
 needs to be saved temporarily and then returned after
 sync_file_range(2).

Missed that. Will fix in the next update

 
      }
   #endif
  +    if (ctx-cache_flags  V9FS_WRITETHROUGH_CACHE) {
 
 -drive cache=writethrough means something different from 9pfs
 writethrough.  This is confusing so I wonder if there is a better
 name like immediate write-out.
 

cache=immediate-write-out ? 

  +        /*
  +         * Initiate a writeback. This is not a data integrity sync.
  +         * We want to ensure that we don't leave dirty pages in the cache
  +         * after write when cache=writethrough is sepcified.
  +         */
  +        sync_file_range(fd, offset, 0,
  +                        SYNC_FILE_RANGE_WAIT_BEFORE | 
  SYNC_FILE_RANGE_WRITE);
  +    }
 
 I'm not sure whether SYNC_FILE_RANGE_WAIT_BEFORE is necessary.  As a
 best-effort mechanism just SYNC_FILE_RANGE_WRITE does the job although
 a client that rapidly rewrites may be able to leave dirty pages in the
 host page cache

Shouldn't we prevent this ? That is the reason for me to use that
WAIT_BEFORE ?

.  SYNC_FILE_RANGE_WAIT_BEFORE ensures that dirty pages
 get written out but it is no longer asynchronous because it blocks.

-aneesh




Re: [Qemu-devel] [RFC PATCH v2] Specification for qcow2 version 3

2011-10-12 Thread Kevin Wolf
Am 12.10.2011 14:51, schrieb Stefan Hajnoczi:
 Also a bit in l2 offset to say there is no l2 table cause all
 clusters in l2 are contiguous so we avoid entirely l2. Obviously this
 require an optimization step to detect or create such condition.
 
 There are several reserved L1 entry bits which could be used to mark
 this mode.  This mode severely restricts qcow2 features though: how
 would snapshots and COW work?  Perhaps by breaking the huge cluster
 back into an L2 table with individual clusters?  Backing files also
 cannot be used - unless we extend the sub-clusters approach and also
 keep a large bitmap with allocated/unallocated/zero information.
 
 A mode like this could be used for best performance on local storage,
 where efficiently image transport (e.g. scp or http) is not required.
 Actually I think this is reasonable, we could use qemu-img convert to
 produce a compact qcow2 for export and use the L2-less qcow2 for
 running the actual VM.
 
 Kevin: what do you think about fleshing out this mode instead of sub-clusters?

I'm hesitant to something like this as it adds quite some complexity and
I'm not sure if there are practical use cases for it at all.

If you take the current cluster sizes, an L2 table contains 512 MB of
data, so you would lose any sparseness. You would probably already get
full allocation just by creating a file system on the image.

But even if you do have a use case where sparseness doesn't matter, the
effect is very much the same as allowing a 512 MB cluster size and not
changing any of the qcow2 internals.

(What would the use case be? Backing files or snapshots with a COW
granularity of 512 MB isn't going to fly. That leaves only something
like encryption.)

Kevin



Re: [Qemu-devel] [v7 Patch 1/5]Qemu: Enhance info block to display host cache setting

2011-10-12 Thread Kevin Wolf
Am 11.10.2011 05:10, schrieb Supriya Kannery:
 Enhance info block to display hostcache setting for each
 block device.
 
 Example:
 (qemu) info block
 ide0-hd0: removable=0 file=../sles11-32.raw ro=0 drv=raw encrypted=0
 
 Enhanced to display hostcache setting:
 (qemu) info block
 ide0-hd0: removable=0 hostcache=1 file=../sles11-32.raw ro=0 drv=raw 
 encrypted=0
 
 Signed-off-by: Supriya Kannery supri...@linux.vnet.ibm.com
 
 ---
  block.c |   20 
  qmp-commands.hx |2 ++
  2 files changed, 18 insertions(+), 4 deletions(-)
 
 Index: qemu/qmp-commands.hx
 ===
 --- qemu.orig/qmp-commands.hx
 +++ qemu/qmp-commands.hx
 @@ -1142,6 +1142,7 @@ Each json-object contain the following:
  - locked: true if the device is locked, false otherwise (json-bool)
  - tray-open: only present if removable, true if the device has a tray,
 and it is open (json-bool)
 +- hostcache: true if host pagecache enabled, false otherwise (json-bool)
  - inserted: only present if the device is inserted, it is a json-object
 containing the following:
   - file: device file name (json-string)
 @@ -1163,6 +1164,7 @@ Example:
   {
  device:ide0-hd0,
  locked:false,
 +hostcache:false,
  removable:false,
  inserted:{
 ro:false,
 Index: qemu/block.c
 ===
 --- qemu.orig/block.c
 +++ qemu/block.c
 @@ -1866,6 +1866,15 @@ static void bdrv_print_dict(QObject *obj
  monitor_printf(mon,  tray-open=%d,
 qdict_get_bool(bs_dict, tray-open));
  }
 +
 +if (qdict_haskey(bs_dict, open_flags)) {
 +int open_flags = qdict_get_int(bs_dict, open_flags);
 +if (open_flags  BDRV_O_NOCACHE)
 +monitor_printf(mon,  hostcache=0);
 +else
 +monitor_printf(mon,  hostcache=1);

Coding style requires braces.

 +}
 +
  if (qdict_haskey(bs_dict, inserted)) {
  QDict *qdict = qobject_to_qdict(qdict_get(bs_dict, inserted));
  
 @@ -1903,11 +1912,14 @@ void bdrv_info(Monitor *mon, QObject **r
  QDict *bs_dict;
  
  bs_obj = qobject_from_jsonf({ 'device': %s, 'type': 'unknown', 
 -'removable': %i, 'locked': %i },
 +'removable': %i, 'locked': %i, 
 +'hostcache': %i },
  bs-device_name,
  bdrv_dev_has_removable_media(bs),
 -bdrv_dev_is_medium_locked(bs));
 +bdrv_dev_is_medium_locked(bs),
 +!(bs-open_flags  BDRV_O_NOCACHE));
  bs_dict = qobject_to_qdict(bs_obj);
 +qdict_put(bs_dict, open_flags, qint_from_int(bs-open_flags));

No. This adds a open_flags field to the QMP structure that is
transferred to clients. This is wrong, open_flags is an internal thing
that should never be visible on an interface.

In bdrv_print_dict, access the hostcache field that you introduced, it
provides the same information.

Kevin



Re: [Qemu-devel] [PATCH 2/2] pseries: Add device tree properties for VMX/VSX and DFP under kvm

2011-10-12 Thread Alexander Graf

On 10/11/2011 06:31 AM, David Gibson wrote:

Sufficiently recent PAPR specifications define properties ibm,vmx
and ibm,dfp on the CPU node which advertise whether the VMX vector
extensions (or the later VSX version) and/or the Decimal Floating
Point operations from IBM's recent POWER CPUs are available.

Currently we do not put these in the guest device tree and the guest
kernel will consequently assume they are not available.  This is good,
because they are not supported under TCG.  VMX is similar enough to
Altivec that it might be trivial to support, but VSX and DFP would
both require significant work to support in TCG.

However, when running under kvm on a host which supports these
instructions, there's no reason not to let the guest use them.  This
patch, therefore, checks for the relevant support on the host CPU
and, if present, advertises them to the guest as well.

Signed-off-by: David Gibsonda...@gibson.dropbear.id.au
---
  hw/spapr.c   |   17 +
  target-ppc/kvm.c |   10 ++
  target-ppc/kvm_ppc.h |   12 
  3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 9a3a1ea..00b9c67 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -186,6 +186,8 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 0x, 0x};
  uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
  uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 
10;
+uint32_t vmx = kvm_enabled() ? kvmppc_get_vmx() : 0;
+uint32_t dfp = kvm_enabled() ? kvmppc_get_dfp() : 0;


This should only happen when the CPU type selected by -cpu also supports 
VMX or DFP. For now, this is moot as we can't run any non-host CPU in 
KVM for HV PPC. But looking forward, it certainly makes sense to split it.


This is not a nack, but please make sure to implement -cpu host in the 
near future and default to that for -M pseries. Then we can change this 
piece here to check for vcpu capabilities and AND them with the host caps.


The rest looks good.

Alex



  if ((index % smt) != 0) {
  continue;
@@ -233,6 +235,21 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 segs, sizeof(segs;
  }

+/* Advertise VMX/VSX (vector extensions) if available
+ *   0 / no property == no vector extensions
+ *   1   == VMX / Altivec available
+ *   2   == VSX available */
+if (vmx) {
+_FDT((fdt_property_cell(fdt, ibm,vmx, vmx)));
+}
+
+/* Advertise DFP (Decimal Floating Point) if available
+ *   0 / no property == no DFP
+ *   1   == DFP available */
+if (dfp) {
+_FDT((fdt_property_cell(fdt, ibm,dfp, dfp)));
+}
+
  _FDT((fdt_end_node(fdt)));
  }

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 6667b61..6a48eb4 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -694,6 +694,16 @@ uint64_t kvmppc_get_clockfreq(void)
  return kvmppc_read_int_cpu_dt(clock-frequency);
  }

+uint32_t kvmppc_get_vmx(void)
+{
+return kvmppc_read_int_cpu_dt(ibm,vmx);
+}
+
+uint32_t kvmppc_get_dfp(void)
+{
+return kvmppc_read_int_cpu_dt(ibm,dfp);
+}
+
  int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len)
  {
  uint32_t *hc = (uint32_t*)buf;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 9e8a7b5..fa131bf 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -15,6 +15,8 @@ void kvmppc_init(void);

  uint32_t kvmppc_get_tbfreq(void);
  uint64_t kvmppc_get_clockfreq(void);
+uint32_t kvmppc_get_vmx(void);
+uint32_t kvmppc_get_dfp(void);
  int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int buf_len);
  int kvmppc_set_interrupt(CPUState *env, int irq, int level);
  void kvmppc_set_papr(CPUState *env);
@@ -35,6 +37,16 @@ static inline uint64_t kvmppc_get_clockfreq(void)
  return 0;
  }

+static inline uint32_t kvmppc_get_vmx(void)
+{
+return 0;
+}
+
+static inline uint32_t kvmppc_get_dfp(void)
+{
+return 0;
+}
+
  static inline int kvmppc_get_hypercall(CPUState *env, uint8_t *buf, int 
buf_len)
  {
  return -1;





Re: [Qemu-devel] [PATCH V4 2/2] hw/9pfs: Add readonly support for 9p export

2011-10-12 Thread Aneesh Kumar K.V
On Wed, 12 Oct 2011 13:23:34 +0530, M. Mohan Kumar mo...@in.ibm.com wrote:
 A new fsdev parameter readonly is introduced to control accessing 9p export.
 readonly=on|off can be used to specify the access type. By default rw
 access is given.
 
 Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
 ---
 Changes from previous version V3:
 * Use opt_set_bool function to set readonly option
 * Change the flag from MS_READONLY to 9p specific
 
 Change from previous version V2:
 * QEMU_OPT_BOOL is used for readdonly parameter
 
 Changes from previous version:
 * Use readonly option instead of access
 * Change function return type to boolean where its needed
 
  fsdev/file-op-9p.h |3 +-
  fsdev/qemu-fsdev.c |   12 +-
  fsdev/qemu-fsdev.h |1 +
  hw/9pfs/virtio-9p-device.c |3 ++
  hw/9pfs/virtio-9p.c|   46 
 
  qemu-config.c  |7 ++
  vl.c   |2 +
  7 files changed, 71 insertions(+), 3 deletions(-)
 
 diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
 index 33fb07f..b75290d 100644
 --- a/fsdev/file-op-9p.h
 +++ b/fsdev/file-op-9p.h
 @@ -58,7 +58,8 @@ typedef struct extended_ops {
  } extended_ops;
 
  /* FsContext flag values */
 -#define PATHNAME_FSCONTEXT 0x1
 +#define PATHNAME_FSCONTEXT  0x1

why ?

 +#define P9_RDONLY_EXPORT0x2
 
  /* cache flags */
  #define V9FS_WRITETHROUGH_CACHE 0x1
 diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
 index d08ba9c..f8a8227 100644
 --- a/fsdev/qemu-fsdev.c
 +++ b/fsdev/qemu-fsdev.c
 @@ -29,13 +29,13 @@ static FsTypeTable FsTypes[] = {
  int qemu_fsdev_add(QemuOpts *opts)
  {
  struct FsTypeListEntry *fsle;
 -int i;
 +int i, flags = 0;
  const char *fsdev_id = qemu_opts_id(opts);
  const char *fstype = qemu_opt_get(opts, fstype);
  const char *path = qemu_opt_get(opts, path);
  const char *sec_model = qemu_opt_get(opts, security_model);
  const char *cache = qemu_opt_get(opts, cache);
 -
 +int rdonly = qemu_opt_get_bool(opts, readonly, 0);
 
  if (!fsdev_id) {
  fprintf(stderr, fsdev: No id specified\n);
 @@ -76,6 +76,14 @@ int qemu_fsdev_add(QemuOpts *opts)
  fsle-fse.cache_flags = V9FS_WRITETHROUGH_CACHE;
  }
  }
 +if (rdonly) {
 +flags |= P9_RDONLY_EXPORT;
 +} else {
 +flags = ~P9_RDONLY_EXPORT;
 +}
 +
 +fsle-fse.flags = flags;
 +
  QTAILQ_INSERT_TAIL(fstype_entries, fsle, next);
  return 0;
  }
 diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
 index 0f67880..2938eaf 100644
 --- a/fsdev/qemu-fsdev.h
 +++ b/fsdev/qemu-fsdev.h
 @@ -44,6 +44,7 @@ typedef struct FsTypeEntry {
  char *security_model;
  int cache_flags;
  FileOperations *ops;
 +int flags;

do we need extra flags ? Why not use cache_flags ? renaming that to
export_flags ?

  } FsTypeEntry;
 
  typedef struct FsTypeListEntry {
 diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
 index 1846e36..336292c 100644
 --- a/hw/9pfs/virtio-9p-device.c
 +++ b/hw/9pfs/virtio-9p-device.c
 @@ -125,6 +125,9 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
 *conf)
  s-tag_len = len;
  s-ctx.uid = -1;
  s-ctx.flags = 0;
 +if (fse-flags  P9_RDONLY_EXPORT) {
 +s-ctx.flags |= P9_RDONLY_EXPORT;
 +}
 
  s-ops = fse-ops;
  s-vdev.get_features = virtio_9p_get_features;
 diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
 index 47ed2f1..9f15787 100644
 --- a/hw/9pfs/virtio-9p.c
 +++ b/hw/9pfs/virtio-9p.c
 @@ -1271,6 +1271,11 @@ static void v9fs_fix_path(V9fsPath *dst, V9fsPath 
 *src, int len)
  dst-size++;
  }
 
 +static inline bool is_ro_export(FsContext *fs_ctx)
 +{
 +return fs_ctx-flags  P9_RDONLY_EXPORT;
 +}
 +
  static void v9fs_version(void *opaque)
  {
  V9fsPDU *pdu = opaque;
 @@ -1690,6 +1695,14 @@ static void v9fs_open(void *opaque)
  } else {
  flags = omode_to_uflags(mode);
  }
 +if (is_ro_export(s-ctx)) {
 +if (mode  O_WRONLY || mode  O_RDWR || mode  O_APPEND) {
 +err = -EROFS;
 +goto out;
 +} else {
 +flags |= O_NOATIME;
 +}
 +}

What about O_TRUNC ?

  err = v9fs_co_open(pdu, fidp, flags);
  if (err  0) {
  goto out;
 @@ -3301,6 +3314,33 @@ static void v9fs_op_not_supp(void *opaque)
  complete_pdu(pdu-s, pdu, -EOPNOTSUPP);
  }
 

-aneesh




Re: [Qemu-devel] [PATCH 1/2] ppc: Generalize the kvmppc_get_clockfreq() function

2011-10-12 Thread Alexander Graf

On 10/11/2011 06:31 AM, David Gibson wrote:

Currently the kvmppc_get_clockfreq() function reads the host's clock
frequency from /proc/device-tree, which is useful to past to the guest
in KVM setups.  However, there are some other host properties
advertised in the device tree which can also be relevant to the
guests.

This patch, therefore, replaces kvmppc_get_clockfreq() which can
retrieve any named, single integer property from the host device
tree's CPU node.

Signed-off-by: David Gibsonda...@gibson.dropbear.id.au


Thanks, applied both patches to ppc-next.


Alex




Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread Aneesh Kumar K.V
On Wed, 12 Oct 2011 13:24:16 +0530, M. Mohan Kumar mo...@in.ibm.com wrote:
 Security model is needed only for 'local' fs driver.

Can you also cleanup that fstype - fsdriver rename ? fsdriver seems
more appropriate.

 
 Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
 ---
  fsdev/qemu-fsdev.c |6 +
  fsdev/qemu-fsdev.h |1 +
  hw/9pfs/virtio-9p-device.c |   47 ++-
  vl.c   |   20 +++--
  4 files changed, 43 insertions(+), 31 deletions(-)
 
 diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
 index 36db127..d08ba9c 100644
 --- a/fsdev/qemu-fsdev.c
 +++ b/fsdev/qemu-fsdev.c
 @@ -58,11 +58,6 @@ int qemu_fsdev_add(QemuOpts *opts)
  return -1;
  }
 
 -if (!sec_model) {
 -fprintf(stderr, fsdev: No security_model specified.\n);
 -return -1;
 -}
 -
  if (!path) {
  fprintf(stderr, fsdev: No path specified.\n);
  return -1;
 @@ -72,6 +67,7 @@ int qemu_fsdev_add(QemuOpts *opts)
 
  fsle-fse.fsdev_id = g_strdup(fsdev_id);
  fsle-fse.path = g_strdup(path);
 +fsle-fse.fsdriver = g_strdup(fstype);

Why use it as a string ? Why can't this again be an export_flag. That
would help us to avoid that strdup 


  fsle-fse.security_model = g_strdup(sec_model);
  fsle-fse.ops = FsTypes[i].ops;
  fsle-fse.cache_flags = 0;
 diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
 index 9c440f2..0f67880 100644
 --- a/fsdev/qemu-fsdev.h
 +++ b/fsdev/qemu-fsdev.h
 @@ -40,6 +40,7 @@ typedef struct FsTypeTable {
  typedef struct FsTypeEntry {
  char *fsdev_id;
  char *path;
 +char *fsdriver;
  char *security_model;
  int cache_flags;
  FileOperations *ops;
 diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
 index aac58ad..1846e36 100644
 --- a/hw/9pfs/virtio-9p-device.c
 +++ b/hw/9pfs/virtio-9p-device.c
 @@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
 *conf)
  exit(1);
  }
 
 -if (!strcmp(fse-security_model, passthrough)) {
 -/* Files on the Fileserver set to client user credentials */
 -s-ctx.fs_sm = SM_PASSTHROUGH;
 -s-ctx.xops = passthrough_xattr_ops;
 -} else if (!strcmp(fse-security_model, mapped)) {
 -/* Files on the fileserver are set to QEMU credentials.
 - * Client user credentials are saved in extended attributes.
 - */
 -s-ctx.fs_sm = SM_MAPPED;
 -s-ctx.xops = mapped_xattr_ops;
 -} else if (!strcmp(fse-security_model, none)) {
 -/*
 - * Files on the fileserver are set to QEMU credentials.
 - */
 -s-ctx.fs_sm = SM_NONE;
 -s-ctx.xops = none_xattr_ops;
 -} else {
 -fprintf(stderr, Default to security_model=none. You may want
 - enable advanced security model using 
 -security option:\n\t security_model=passthrough\n\t 
 -security_model=mapped\n);
 -s-ctx.fs_sm = SM_NONE;
 -s-ctx.xops = none_xattr_ops;
 +/* security models is needed only for local fs driver */
 +if (!strcmp(fse-fsdriver, local)) {
 +if (!strcmp(fse-security_model, passthrough)) {
 +/* Files on the Fileserver set to client user credentials */
 +s-ctx.fs_sm = SM_PASSTHROUGH;
 +s-ctx.xops = passthrough_xattr_ops;
 +} else if (!strcmp(fse-security_model, mapped)) {
 +/* Files on the fileserver are set to QEMU credentials.
 +* Client user credentials are saved in extended attributes.
 +*/
 +s-ctx.fs_sm = SM_MAPPED;
 +s-ctx.xops = mapped_xattr_ops;
 +} else if (!strcmp(fse-security_model, none)) {
 +/*
 +* Files on the fileserver are set to QEMU credentials.
 +*/
 +s-ctx.fs_sm = SM_NONE;
 +s-ctx.xops = none_xattr_ops;
 +} else {
 +fprintf(stderr, Invalid security_model %s specified.\n
 +Available security models are:\t 
 +passthrough,mapped or none\n, fse-security_model);
 +exit(1);
 +}
  }
 
  s-ctx.cache_flags = fse-cache_flags;
 diff --git a/vl.c b/vl.c
 index 6760e39..a961fa3 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2795,6 +2795,7 @@ int main(int argc, char **argv, char **envp)
  QemuOpts *fsdev;
  QemuOpts *device;
  const char *cache;
 +const char *fsdriver;
 
  olist = qemu_find_opts(virtfs);
  if (!olist) {
 @@ -2809,13 +2810,26 @@ int main(int argc, char **argv, char **envp)
 
  if (qemu_opt_get(opts, fstype) == NULL ||
  qemu_opt_get(opts, mount_tag) == NULL ||
 -qemu_opt_get(opts, path) == NULL ||
 -qemu_opt_get(opts, security_model) == NULL) {
 +

[Qemu-devel] [PATCH 03/10] scsi-generic: check ioctl statuses when SG_IO succeeds

2011-10-12 Thread Paolo Bonzini
A succeeding ioctl does not imply that the SCSI command succeeded.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-generic.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 5130e9a..04549aa 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -311,7 +311,7 @@ static int get_blocksize(BlockDriverState *bdrv)
 io_header.timeout = 6000; /* XXX */
 
 ret = bdrv_ioctl(bdrv, SG_IO, io_header);
-if (ret  0)
+if (ret  0 || io_header.driver_status || io_header.host_status)
 return -1;
 
 return ldl_be_p(buf[4]);
@@ -341,7 +341,7 @@ static int get_stream_blocksize(BlockDriverState *bdrv)
 io_header.timeout = 6000; /* XXX */
 
 ret = bdrv_ioctl(bdrv, SG_IO, io_header);
-if (ret  0)
+if (ret  0 || io_header.driver_status || io_header.host_status)
 return -1;
 
 return ldl_be_p(buf[0])  0xff;
-- 
1.7.6





[Qemu-devel] [PATCH 00/10] scsi: add specialized block device passthrough

2011-10-12 Thread Paolo Bonzini
This series adds a new scsi-block device that is able to do SCSI
passthrough for block devices only, but at the same time does not suffer
the limitations of scsi-generic.  In particular it does not need a
bounce buffer that is as big as the request, and will be able to support
s/g lists.  This puts scsi-block and virtio-blk on feature parity.

To do this, scsi-generic is simplified to the point that its ReqOps can be
used by any SCSIDevice.  Then, a new scsi-disk variant is introduced that
uses regular AIO read/writes whenever possible, and falls back to SG_IO
for other SCSI commands.  Management can then choose between isolating
the guest from the specificities of the target device (scsi-hd/scsi-cd),
or making it aware of them (scsi-block).

In the future, both scsi-generic and scsi-block could be made to support
other types of passthrough, for example iSCSI passthrough.

Patches 1-4 are cleanups to scsi-generic, patches 5-6 are cleanups
to scsi-disk.  Then comes the actual implementation of the feature.

Paolo Bonzini (10):
  scsi-generic: drop SCSIGenericState
  scsi-generic: remove scsi_req_fixup
  scsi-generic: check ioctl statuses when SG_IO succeeds
  scsi-generic: look at host status
  scsi-disk: do not duplicate BlockDriverState member
  scsi-disk: small clean up to INQUIRY
  scsi: make reqops const
  scsi: export scsi_generic_reqops
  scsi: pass cdb to alloc_req
  scsi-disk: add scsi-block for device passthrough

 hw/scsi-bus.c |   12 ++--
 hw/scsi-disk.c|  260 +---
 hw/scsi-generic.c |  128 ---
 hw/scsi.h |   12 ++-
 4 files changed, 255 insertions(+), 157 deletions(-)

-- 
1.7.6




[Qemu-devel] [PATCH 06/10] scsi-disk: small clean up to INQUIRY

2011-10-12 Thread Paolo Bonzini
Set s-removable, s-qdev.blocksize and s-qdev.type in the callers
of scsi_initfn.

With this in place, s-qdev.type is allowed, and we can just reuse it
as the first byte in VPD data (just like we do in standard INQUIRY data).
Also set s-removable is set consistently and we can use it.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-disk.c |   46 +-
 1 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 5e3ef51..a4daa29 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -397,11 +397,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 return -1;
 }
 
-if (s-qdev.type == TYPE_ROM) {
-outbuf[buflen++] = 5;
-} else {
-outbuf[buflen++] = 0;
-}
+outbuf[buflen++] = s-qdev.type  0x1f;
 outbuf[buflen++] = page_code ; // this page
 outbuf[buflen++] = 0x00;
 
@@ -541,11 +537,10 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 memset(outbuf, 0, buflen);
 
 outbuf[0] = s-qdev.type  0x1f;
+outbuf[1] = s-removable ? 0x80 : 0;
 if (s-qdev.type == TYPE_ROM) {
-outbuf[1] = 0x80;
 memcpy(outbuf[16], QEMU CD-ROM , 16);
 } else {
-outbuf[1] = s-removable ? 0x80 : 0;
 memcpy(outbuf[16], QEMU HARDDISK   , 16);
 }
 memcpy(outbuf[8], QEMU, 8);
@@ -1511,7 +1506,7 @@ static void scsi_disk_unit_attention_reported(SCSIDevice 
*dev)
 }
 }
 
-static int scsi_initfn(SCSIDevice *dev, uint8_t scsi_type)
+static int scsi_initfn(SCSIDevice *dev)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
 DriveInfo *dinfo;
@@ -1521,7 +1516,7 @@ static int scsi_initfn(SCSIDevice *dev, uint8_t scsi_type)
 return -1;
 }
 
-if (scsi_type == TYPE_DISK  !bdrv_is_inserted(s-qdev.conf.bs)) {
+if (!s-removable  !bdrv_is_inserted(s-qdev.conf.bs)) {
 error_report(Device needs media, but drive is empty);
 return -1;
 }
@@ -1543,19 +1538,12 @@ static int scsi_initfn(SCSIDevice *dev, uint8_t 
scsi_type)
 return -1;
 }
 
-if (scsi_type == TYPE_ROM) {
+if (s-removable) {
 bdrv_set_dev_ops(s-qdev.conf.bs, scsi_cd_block_ops, s);
-s-qdev.blocksize = 2048;
-} else if (scsi_type == TYPE_DISK) {
-s-qdev.blocksize = s-qdev.conf.logical_block_size;
-} else {
-error_report(scsi-disk: Unhandled SCSI type %02x, scsi_type);
-return -1;
 }
 s-cluster_size = s-qdev.blocksize / 512;
 bdrv_set_buffer_alignment(s-qdev.conf.bs, s-qdev.blocksize);
 
-s-qdev.type = scsi_type;
 qemu_add_vm_change_state_handler(scsi_dma_restart_cb, s);
 bdrv_iostatus_enable(s-qdev.conf.bs);
 add_boot_device_path(s-qdev.conf.bootindex, dev-qdev, ,0);
@@ -1564,27 +1552,35 @@ static int scsi_initfn(SCSIDevice *dev, uint8_t 
scsi_type)
 
 static int scsi_hd_initfn(SCSIDevice *dev)
 {
-return scsi_initfn(dev, TYPE_DISK);
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
+s-qdev.blocksize = s-qdev.conf.logical_block_size;
+s-qdev.type = TYPE_DISK;
+return scsi_initfn(s-qdev);
 }
 
 static int scsi_cd_initfn(SCSIDevice *dev)
 {
-return scsi_initfn(dev, TYPE_ROM);
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
+s-qdev.blocksize = 2048;
+s-qdev.type = TYPE_ROM;
+s-removable = true;
+return scsi_initfn(s-qdev);
 }
 
 static int scsi_disk_initfn(SCSIDevice *dev)
 {
 DriveInfo *dinfo;
-uint8_t scsi_type;
 
 if (!dev-conf.bs) {
-scsi_type = TYPE_DISK;  /* will die in scsi_initfn() */
-} else {
-dinfo = drive_get_by_blockdev(dev-conf.bs);
-scsi_type = dinfo-media_cd ? TYPE_ROM : TYPE_DISK;
+return scsi_initfn(dev);  /* ... and die there */
 }
 
-return scsi_initfn(dev, scsi_type);
+dinfo = drive_get_by_blockdev(dev-conf.bs);
+if (dinfo-media_cd) {
+return scsi_cd_initfn(dev);
+} else {
+return scsi_hd_initfn(dev);
+}
 }
 
 static SCSIReqOps scsi_disk_reqops = {
-- 
1.7.6





Re: [Qemu-devel] [v7 Patch 4/5]Qemu: Add commandline -drive option 'hostcache'

2011-10-12 Thread Kevin Wolf
Am 11.10.2011 05:11, schrieb Supriya Kannery:
 qemu command option 'hostcache' added to -drive for block devices.
 While starting a VM from qemu commandline, this option can be used 
 for setting host cache usage for block data access. Simultaneous
 use of 'hostcache' and 'cache' options not allowed.
 
 Signed-off-by: Supriya Kannery supri...@linux.vnet.ibm.com

I'm not sure if introducing this alone makes sense. I think I would only
do it when we introduce more options that allow replacing all cache=...
options by other parameters.

 
 ---
  blockdev.c  |   13 +
  qemu-config.c   |4 
  qemu-options.hx |2 +-
  3 files changed, 18 insertions(+), 1 deletion(-)
 
 Index: qemu/blockdev.c
 ===
 --- qemu.orig/blockdev.c
 +++ qemu/blockdev.c
 @@ -237,6 +237,7 @@ DriveInfo *drive_init(QemuOpts *opts, in
  DriveInfo *dinfo;
  int snapshot = 0;
  int ret;
 +int hostcache = 0;
  
  translation = BIOS_ATA_TRANSLATION_AUTO;
  media = MEDIA_DISK;
 @@ -319,7 +320,19 @@ DriveInfo *drive_init(QemuOpts *opts, in
   }
  }
  
 +if ((hostcache = qemu_opt_get_bool(opts, hostcache, -1)) != -1) {
 +if (!hostcache) {
 +bdrv_flags |= BDRV_O_NOCACHE;
 +} else {
 +bdrv_flags = ~BDRV_O_NOCACHE;
 +}

bdrv_flags is initialised to 0, so the else branch is unnecessary.

Kevin



Re: [Qemu-devel] [PATCH 1/2] hw/9pfs: Add new virtfs option cache=writethrough to skip host page cache

2011-10-12 Thread Stefan Hajnoczi
On Wed, Oct 12, 2011 at 2:19 PM, Aneesh Kumar K.V
aneesh.ku...@linux.vnet.ibm.com wrote:
 On Wed, 12 Oct 2011 10:55:21 +0100, Stefan Hajnoczi stefa...@gmail.com 
 wrote:
 On Mon, Oct 10, 2011 at 10:06 AM, Aneesh Kumar K.V
 aneesh.ku...@linux.vnet.ibm.com wrote:
  diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
  index 5c8b5ed..441a37f 100644
  --- a/hw/9pfs/virtio-9p-handle.c
  +++ b/hw/9pfs/virtio-9p-handle.c
  @@ -202,6 +202,15 @@ static ssize_t handle_pwritev(FsContext *ctx, int fd, 
  const struct iovec *iov,
          return writev(fd, iov, iovcnt);

 The sync_file_range(2) call below is dead-code since we'll return
 immediately after writev(2) completes.  The writev(2) return value
 needs to be saved temporarily and then returned after
 sync_file_range(2).

 Missed that. Will fix in the next update


      }
   #endif
  +    if (ctx-cache_flags  V9FS_WRITETHROUGH_CACHE) {

 -drive cache=writethrough means something different from 9pfs
 writethrough.  This is confusing so I wonder if there is a better
 name like immediate write-out.


 cache=immediate-write-out ?

  +        /*
  +         * Initiate a writeback. This is not a data integrity sync.
  +         * We want to ensure that we don't leave dirty pages in the cache
  +         * after write when cache=writethrough is sepcified.
  +         */
  +        sync_file_range(fd, offset, 0,
  +                        SYNC_FILE_RANGE_WAIT_BEFORE | 
  SYNC_FILE_RANGE_WRITE);
  +    }

 I'm not sure whether SYNC_FILE_RANGE_WAIT_BEFORE is necessary.  As a
 best-effort mechanism just SYNC_FILE_RANGE_WRITE does the job although
 a client that rapidly rewrites may be able to leave dirty pages in the
 host page cache

 Shouldn't we prevent this ? That is the reason for me to use that
 WAIT_BEFORE ?

The flag will cause sync_file_range(2) to wait on in-flight I/O.  The
guest will notice slow I/O.  You should at least specify a range
instead of nbytes=0 in the arguments.

Stefan



Re: [Qemu-devel] [RFC PATCH v2] Specification for qcow2 version 3

2011-10-12 Thread Stefan Hajnoczi
On Wed, Oct 12, 2011 at 2:31 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 12.10.2011 14:51, schrieb Stefan Hajnoczi:
 Also a bit in l2 offset to say there is no l2 table cause all
 clusters in l2 are contiguous so we avoid entirely l2. Obviously this
 require an optimization step to detect or create such condition.

 There are several reserved L1 entry bits which could be used to mark
 this mode.  This mode severely restricts qcow2 features though: how
 would snapshots and COW work?  Perhaps by breaking the huge cluster
 back into an L2 table with individual clusters?  Backing files also
 cannot be used - unless we extend the sub-clusters approach and also
 keep a large bitmap with allocated/unallocated/zero information.

 A mode like this could be used for best performance on local storage,
 where efficiently image transport (e.g. scp or http) is not required.
 Actually I think this is reasonable, we could use qemu-img convert to
 produce a compact qcow2 for export and use the L2-less qcow2 for
 running the actual VM.

 Kevin: what do you think about fleshing out this mode instead of 
 sub-clusters?

 I'm hesitant to something like this as it adds quite some complexity and
 I'm not sure if there are practical use cases for it at all.

 If you take the current cluster sizes, an L2 table contains 512 MB of
 data, so you would lose any sparseness. You would probably already get
 full allocation just by creating a file system on the image.

 But even if you do have a use case where sparseness doesn't matter, the
 effect is very much the same as allowing a 512 MB cluster size and not
 changing any of the qcow2 internals.

I guess I'm thinking of the 512 MB cluster size situation, because
we'd definitely want a cow bitmap in order to keep backing files and
sparseness.

 (What would the use case be? Backing files or snapshots with a COW
 granularity of 512 MB isn't going to fly. That leaves only something
 like encryption.)

COW granularity needs to stay at 64-256 kb since those are reasonable
request sizes for COW.

Stefan



Re: [Qemu-devel] [PATCH V5] Add stdio char device on windows

2011-10-12 Thread Fabien Chouteau
On 06/10/2011 16:37, Fabien Chouteau wrote:
 Simple implementation of an stdio char device on Windows.
 

Any comments?

-- 
Fabien Chouteau



Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen

2011-10-12 Thread Kevin Wolf
Am 11.10.2011 05:11, schrieb Supriya Kannery:
 Struct BDRVReopenState introduced for handling reopen state of images. 
 This can be  extended by each of the block drivers to reopen respective
 image files. Implementation for raw-posix is done here.
 
 Signed-off-by: Supriya Kannery supri...@linux.vnet.ibm.com

Maybe it would make sense to split this into two patches, one for the
block.c infrastructure and another one for adding the callbacks in drivers.

 ---
  block.c   |   46 
  block/raw-posix.c |   62 
 ++
  block_int.h   |   15 +
  3 files changed, 114 insertions(+), 9 deletions(-)
 
 Index: qemu/block/raw-posix.c
 ===
 --- qemu.orig/block/raw-posix.c
 +++ qemu/block/raw-posix.c
 @@ -279,6 +279,66 @@ static int raw_open(BlockDriverState *bs
  return raw_open_common(bs, filename, flags, 0);
  }
  
 +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **rs,
 +  int flags)
 +{
 +BDRVReopenState *raw_rs = g_malloc0(sizeof(BDRVReopenState));
 +BDRVRawState *s = bs-opaque;
 +
 +raw_rs-bs = bs;
 +raw_rs-reopen_flags = flags;
 +raw_rs-reopen_fd = -1;
 +
 +/* If only O_DIRECT to be toggled, use fcntl */
 +if (!((bs-open_flags  ~BDRV_O_NOCACHE) ^
 +(flags  ~BDRV_O_NOCACHE))) {
 +raw_rs-reopen_fd = dup(s-fd);
 +if (raw_rs-reopen_fd = 0) {
 +return -1;

This leaks raw_rs.

 +}
 +}
 +
 +/* TBD: Handle O_DSYNC and other flags */
 +*rs = raw_rs;
 +return 0;
 +}
 +
 +static int raw_reopen_commit(BDRVReopenState *rs)

bdrv_reopen_commit must never fail. Any error that can happen must
already be handled in bdrv_reopen_prepare.

The commit function should really only do s-fd = rs-reopen_fd (besides
cleanup), everything else should already be prepared.

 +{
 +BlockDriverState *bs = rs-bs;
 +BDRVRawState *s = bs-opaque;
 +
 +if (!rs-reopen_fd) {
 +return -1;
 +}
 +
 +int ret = fcntl_setfl(rs-reopen_fd, rs-reopen_flags);

reopen_flags is BDRV_O_*, not O_*, so it needs to be translated.

 +if (ret  0) {
 +return ret;
 +}
 +
 +/* Set new flags, so replace old fd with new one */
 +close(s-fd);
 +s-fd = rs-reopen_fd;
 +s-open_flags = bs-open_flags = rs-reopen_flags;
 +g_free(rs);
 +rs = NULL;

Setting to NULL has no effect, rs isn't read afterwards.

 +
 +return 0;
 +}
 +
 +static int raw_reopen_abort(BDRVReopenState *rs)

This must not fail either, so it can be void, too.

 +{
 +if (rs-reopen_fd != -1) {
 +close(rs-reopen_fd);
 +rs-reopen_fd = -1;
 +rs-reopen_flags = 0;
 +}
 +g_free(rs);
 +rs = NULL;
 +return 0;
 +}
 +
  /* XXX: use host sector size if necessary with:
  #ifdef DIOCGSECTORSIZE
  {
 @@ -896,6 +956,9 @@ static BlockDriver bdrv_file = {
  .instance_size = sizeof(BDRVRawState),
  .bdrv_probe = NULL, /* no probe for protocols */
  .bdrv_file_open = raw_open,
 +.bdrv_reopen_prepare = raw_reopen_prepare,
 +.bdrv_reopen_commit = raw_reopen_commit,
 +.bdrv_reopen_abort = raw_reopen_abort,
  .bdrv_read = raw_read,
  .bdrv_write = raw_write,
  .bdrv_close = raw_close,
 @@ -1166,6 +1229,10 @@ static BlockDriver bdrv_host_device = {
  .instance_size  = sizeof(BDRVRawState),
  .bdrv_probe_device  = hdev_probe_device,
  .bdrv_file_open = hdev_open,
 +.bdrv_reopen_prepare
 += raw_reopen_prepare,
 +.bdrv_reopen_commit = raw_reopen_commit,
 +.bdrv_reopen_abort  = raw_reopen_abort,
  .bdrv_close = raw_close,
  .bdrv_create= hdev_create,
  .create_options = raw_create_options,
 Index: qemu/block.c
 ===
 --- qemu.orig/block.c
 +++ qemu/block.c
 @@ -706,6 +706,7 @@ int bdrv_reopen(BlockDriverState *bs, in
  {
  BlockDriver *drv = bs-drv;
  int ret = 0, open_flags;
 +BDRVReopenState *rs;
  
  /* Quiesce IO for the given block device */
  qemu_aio_flush();
 @@ -713,20 +714,48 @@ int bdrv_reopen(BlockDriverState *bs, in
  qerror_report(QERR_DATA_SYNC_FAILED, bs-device_name);
  return ret;
  }
 -open_flags = bs-open_flags;
 -bdrv_close(bs);
  
 -ret = bdrv_open(bs, bs-filename, bdrv_flags, drv);
 -if (ret  0) {
 -/* Reopen failed. Try to open with original flags */
 -qerror_report(QERR_REOPEN_FILE_FAILED, bs-filename);
 -ret = bdrv_open(bs, bs-filename, open_flags, drv);
 +/* Use driver specific reopen() if available */
 +if (drv-bdrv_reopen_prepare) {
 +ret = drv-bdrv_reopen_prepare(bs, rs, bdrv_flags);
  if (ret  0) {
 -/* Reopen failed with orig and modified flags */
 -abort();
 +   

Re: [Qemu-devel] [RFC PATCH v2] Specification for qcow2 version 3

2011-10-12 Thread Kevin Wolf
Am 12.10.2011 16:37, schrieb Stefan Hajnoczi:
 On Wed, Oct 12, 2011 at 2:31 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 12.10.2011 14:51, schrieb Stefan Hajnoczi:
 Also a bit in l2 offset to say there is no l2 table cause all
 clusters in l2 are contiguous so we avoid entirely l2. Obviously this
 require an optimization step to detect or create such condition.

 There are several reserved L1 entry bits which could be used to mark
 this mode.  This mode severely restricts qcow2 features though: how
 would snapshots and COW work?  Perhaps by breaking the huge cluster
 back into an L2 table with individual clusters?  Backing files also
 cannot be used - unless we extend the sub-clusters approach and also
 keep a large bitmap with allocated/unallocated/zero information.

 A mode like this could be used for best performance on local storage,
 where efficiently image transport (e.g. scp or http) is not required.
 Actually I think this is reasonable, we could use qemu-img convert to
 produce a compact qcow2 for export and use the L2-less qcow2 for
 running the actual VM.

 Kevin: what do you think about fleshing out this mode instead of 
 sub-clusters?

 I'm hesitant to something like this as it adds quite some complexity and
 I'm not sure if there are practical use cases for it at all.

 If you take the current cluster sizes, an L2 table contains 512 MB of
 data, so you would lose any sparseness. You would probably already get
 full allocation just by creating a file system on the image.

 But even if you do have a use case where sparseness doesn't matter, the
 effect is very much the same as allowing a 512 MB cluster size and not
 changing any of the qcow2 internals.
 
 I guess I'm thinking of the 512 MB cluster size situation, because
 we'd definitely want a cow bitmap in order to keep backing files and
 sparseness.
 
 (What would the use case be? Backing files or snapshots with a COW
 granularity of 512 MB isn't going to fly. That leaves only something
 like encryption.)
 
 COW granularity needs to stay at 64-256 kb since those are reasonable
 request sizes for COW.

But how do you do that without L2 tables? What you're describing
(different sizes for allocation and COW) is exactly what subclusters are
doing. I can't see how switching to 512 MB clusters and a single-level
table can make that work.

Kevin



[Qemu-devel] [PATCH 07/10] scsi: make reqops const

2011-10-12 Thread Paolo Bonzini
Also delete a stale occurrence of SCSIReqOps inside SCSIDeviceInfo.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-bus.c |   10 +-
 hw/scsi-disk.c|2 +-
 hw/scsi-generic.c |2 +-
 hw/scsi.h |7 +++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index bdd6e94..252e903 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -160,7 +160,7 @@ static int32_t scsi_invalid_command(SCSIRequest *req, 
uint8_t *buf)
 return 0;
 }
 
-struct SCSIReqOps reqops_invalid_opcode = {
+const struct SCSIReqOps reqops_invalid_opcode = {
 .size = sizeof(SCSIRequest),
 .send_command = scsi_invalid_command
 };
@@ -178,7 +178,7 @@ static int32_t scsi_unit_attention(SCSIRequest *req, 
uint8_t *buf)
 return 0;
 }
 
-struct SCSIReqOps reqops_unit_attention = {
+const struct SCSIReqOps reqops_unit_attention = {
 .size = sizeof(SCSIRequest),
 .send_command = scsi_unit_attention
 };
@@ -386,7 +386,7 @@ static uint8_t *scsi_target_get_buf(SCSIRequest *req)
 return r-buf;
 }
 
-struct SCSIReqOps reqops_target_command = {
+const struct SCSIReqOps reqops_target_command = {
 .size = sizeof(SCSITargetReq),
 .send_command = scsi_target_send_command,
 .read_data= scsi_target_read_data,
@@ -394,8 +394,8 @@ struct SCSIReqOps reqops_target_command = {
 };
 
 
-SCSIRequest *scsi_req_alloc(SCSIReqOps *reqops, SCSIDevice *d, uint32_t tag,
-uint32_t lun, void *hba_private)
+SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
+uint32_t tag, uint32_t lun, void *hba_private)
 {
 SCSIRequest *req;
 
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a4daa29..6c1d5a2 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1583,7 +1583,7 @@ static int scsi_disk_initfn(SCSIDevice *dev)
 }
 }
 
-static SCSIReqOps scsi_disk_reqops = {
+static const SCSIReqOps scsi_disk_reqops = {
 .size = sizeof(SCSIDiskReq),
 .free_req = scsi_free_request,
 .send_command = scsi_send_command,
diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 7b291ec..9c9f64a 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -432,7 +432,7 @@ static int scsi_generic_initfn(SCSIDevice *s)
 return 0;
 }
 
-static SCSIReqOps scsi_generic_req_ops = {
+static const SCSIReqOps scsi_generic_req_ops = {
 .size = sizeof(SCSIGenericReq),
 .free_req = scsi_free_request,
 .send_command = scsi_send_command,
diff --git a/hw/scsi.h b/hw/scsi.h
index c8649cf..832682e 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -41,7 +41,7 @@ struct SCSICommand {
 struct SCSIRequest {
 SCSIBus   *bus;
 SCSIDevice*dev;
-SCSIReqOps*ops;
+const SCSIReqOps  *ops;
 uint32_t  refcount;
 uint32_t  tag;
 uint32_t  lun;
@@ -95,7 +95,6 @@ struct SCSIDeviceInfo {
 SCSIRequest *(*alloc_req)(SCSIDevice *s, uint32_t tag, uint32_t lun,
   void *hba_private);
 void (*unit_attention_reported)(SCSIDevice *s);
-SCSIReqOps reqops;
 };
 
 struct SCSIBusInfo {
@@ -175,8 +174,8 @@ extern const struct SCSISense 
sense_code_DEVICE_INTERNAL_RESET;
 
 int scsi_sense_valid(SCSISense sense);
 
-SCSIRequest *scsi_req_alloc(SCSIReqOps *reqops, SCSIDevice *d, uint32_t tag,
-uint32_t lun, void *hba_private);
+SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
+uint32_t tag, uint32_t lun, void *hba_private);
 SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
   uint8_t *buf, void *hba_private);
 int32_t scsi_req_enqueue(SCSIRequest *req);
-- 
1.7.6





[Qemu-devel] [PATCH 05/10] scsi-disk: do not duplicate BlockDriverState member

2011-10-12 Thread Paolo Bonzini
Same as for scsi-generic, avoid duplication even if it causes longer
lines.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-disk.c |   94 +++
 1 files changed, 46 insertions(+), 48 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 860a3bf..5e3ef51 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -65,7 +65,6 @@ typedef struct SCSIDiskReq {
 struct SCSIDiskState
 {
 SCSIDevice qdev;
-BlockDriverState *bs;
 /* The qemu block layer uses a fixed 512 byte sector size.
This is the number of 512 byte blocks in a single scsi sector.  */
 int cluster_size;
@@ -119,7 +118,7 @@ static uint32_t scsi_init_iovec(SCSIDiskReq *r)
 
 if (!r-iov.iov_base) {
 r-buflen = SCSI_DMA_BUF_SIZE;
-r-iov.iov_base = qemu_blockalign(s-bs, r-buflen);
+r-iov.iov_base = qemu_blockalign(s-qdev.conf.bs, r-buflen);
 }
 r-iov.iov_len = MIN(r-sector_count * 512, r-buflen);
 qemu_iovec_init_external(r-qiov, r-iov, 1);
@@ -134,7 +133,7 @@ static void scsi_read_complete(void * opaque, int ret)
 
 if (r-req.aiocb != NULL) {
 r-req.aiocb = NULL;
-bdrv_acct_done(s-bs, r-acct);
+bdrv_acct_done(s-qdev.conf.bs, r-acct);
 }
 
 if (ret) {
@@ -160,7 +159,7 @@ static void scsi_flush_complete(void * opaque, int ret)
 
 if (r-req.aiocb != NULL) {
 r-req.aiocb = NULL;
-bdrv_acct_done(s-bs, r-acct);
+bdrv_acct_done(s-qdev.conf.bs, r-acct);
 }
 
 if (ret  0) {
@@ -210,8 +209,8 @@ static void scsi_read_data(SCSIRequest *req)
 /* Save a ref for scsi_read_complete, in case r is canceled.  */
 scsi_req_ref(r-req);
 n = scsi_init_iovec(r);
-bdrv_acct_start(s-bs, r-acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
-r-req.aiocb = bdrv_aio_readv(s-bs, r-sector, r-qiov, n,
+bdrv_acct_start(s-qdev.conf.bs, r-acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_READ);
+r-req.aiocb = bdrv_aio_readv(s-qdev.conf.bs, r-sector, r-qiov, n,
   scsi_read_complete, r);
 if (r-req.aiocb == NULL) {
 scsi_read_complete(r, -EIO);
@@ -222,10 +221,10 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int 
error, int type)
 {
 int is_read = (type == SCSI_REQ_STATUS_RETRY_READ);
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r-req.dev);
-BlockErrorAction action = bdrv_get_on_error(s-bs, is_read);
+BlockErrorAction action = bdrv_get_on_error(s-qdev.conf.bs, is_read);
 
 if (action == BLOCK_ERR_IGNORE) {
-bdrv_mon_event(s-bs, BDRV_ACTION_IGNORE, is_read);
+bdrv_mon_event(s-qdev.conf.bs, BDRV_ACTION_IGNORE, is_read);
 return 0;
 }
 
@@ -235,9 +234,9 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type)
 type = SCSI_REQ_STATUS_RETRY_TYPE_MASK;
 r-status |= SCSI_REQ_STATUS_RETRY | type;
 
-bdrv_mon_event(s-bs, BDRV_ACTION_STOP, is_read);
+bdrv_mon_event(s-qdev.conf.bs, BDRV_ACTION_STOP, is_read);
 vm_stop(RUN_STATE_IO_ERROR);
-bdrv_iostatus_set_err(s-bs, error);
+bdrv_iostatus_set_err(s-qdev.conf.bs, error);
 return 1;
 } else {
 switch (error) {
@@ -254,7 +253,7 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type)
 scsi_check_condition(r, SENSE_CODE(IO_ERROR));
 break;
 }
-bdrv_mon_event(s-bs, BDRV_ACTION_REPORT, is_read);
+bdrv_mon_event(s-qdev.conf.bs, BDRV_ACTION_REPORT, is_read);
 return 0;
 }
 }
@@ -267,7 +266,7 @@ static void scsi_write_complete(void * opaque, int ret)
 
 if (r-req.aiocb != NULL) {
 r-req.aiocb = NULL;
-bdrv_acct_done(s-bs, r-acct);
+bdrv_acct_done(s-qdev.conf.bs, r-acct);
 }
 
 if (ret) {
@@ -311,8 +310,8 @@ static void scsi_write_data(SCSIRequest *req)
 if (s-tray_open) {
 scsi_write_complete(r, -ENOMEDIUM);
 }
-bdrv_acct_start(s-bs, r-acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_WRITE);
-r-req.aiocb = bdrv_aio_writev(s-bs, r-sector, r-qiov, n,
+bdrv_acct_start(s-qdev.conf.bs, r-acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_WRITE);
+r-req.aiocb = bdrv_aio_writev(s-qdev.conf.bs, r-sector, r-qiov, n,
scsi_write_complete, r);
 if (r-req.aiocb == NULL) {
 scsi_write_complete(r, -ENOMEM);
@@ -450,7 +449,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 case 0x83: /* Device identification page, mandatory */
 {
 int max_len = 255 - 8;
-int id_len = strlen(bdrv_get_device_name(s-bs));
+int id_len = strlen(bdrv_get_device_name(s-qdev.conf.bs));
 
 if (id_len  max_len)
 id_len = max_len;
@@ -463,7 +462,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 outbuf[buflen++] = 0;   // reserved
 

[Qemu-devel] [PATCH 08/10] scsi: export scsi_generic_reqops

2011-10-12 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-generic.c |2 +-
 hw/scsi.h |3 +++
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 9c9f64a..eb066ba 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -432,7 +432,7 @@ static int scsi_generic_initfn(SCSIDevice *s)
 return 0;
 }
 
-static const SCSIReqOps scsi_generic_req_ops = {
+const SCSIReqOps scsi_generic_req_ops = {
 .size = sizeof(SCSIGenericReq),
 .free_req = scsi_free_request,
 .send_command = scsi_send_command,
diff --git a/hw/scsi.h b/hw/scsi.h
index 832682e..c23 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -196,4 +196,7 @@ void scsi_device_purge_requests(SCSIDevice *sdev, SCSISense 
sense);
 int scsi_device_get_sense(SCSIDevice *dev, uint8_t *buf, int len, bool fixed);
 SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int target, int lun);
 
+/* scsi-generic.c. */
+extern const SCSIReqOps scsi_generic_req_ops;
+
 #endif
-- 
1.7.6





[Qemu-devel] [PATCH 10/10] scsi-disk: add scsi-block for device passthrough

2011-10-12 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-disk.c |  116 
 1 files changed, 116 insertions(+), 0 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 835cc7f..c97ef52 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -39,6 +39,10 @@ do { fprintf(stderr, scsi-disk:  fmt , ## __VA_ARGS__); } 
while (0)
 #include blockdev.h
 #include block_int.h
 
+#ifdef __linux
+#include scsi/sg.h
+#endif
+
 #define SCSI_DMA_BUF_SIZE131072
 #define SCSI_MAX_INQUIRY_LEN 256
 
@@ -1603,6 +1607,103 @@ static SCSIRequest *scsi_new_request(SCSIDevice *d, 
uint32_t tag, uint32_t lun,
 return req;
 }
 
+#ifdef __linux__
+static int get_device_type(SCSIDiskState *s)
+{
+BlockDriverState *bdrv = s-qdev.conf.bs;
+uint8_t cmd[16];
+uint8_t buf[36];
+uint8_t sensebuf[8];
+sg_io_hdr_t io_header;
+int ret;
+
+memset(cmd, 0, sizeof(cmd));
+memset(buf, 0, sizeof(buf));
+cmd[0] = INQUIRY;
+cmd[4] = sizeof(buf);
+
+memset(io_header, 0, sizeof(io_header));
+io_header.interface_id = 'S';
+io_header.dxfer_direction = SG_DXFER_FROM_DEV;
+io_header.dxfer_len = sizeof(buf);
+io_header.dxferp = buf;
+io_header.cmdp = cmd;
+io_header.cmd_len = sizeof(cmd);
+io_header.mx_sb_len = sizeof(sensebuf);
+io_header.sbp = sensebuf;
+io_header.timeout = 6000; /* XXX */
+
+ret = bdrv_ioctl(bdrv, SG_IO, io_header);
+if (ret  0 || io_header.driver_status || io_header.host_status) {
+return -1;
+}
+s-qdev.blocksize = s-qdev.conf.logical_block_size;
+s-qdev.type = buf[0];
+s-removable = (buf[1]  0x80) != 0;
+return 0;
+}
+
+static int scsi_block_initfn(SCSIDevice *dev)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
+int sg_version;
+int rc;
+
+if (!s-qdev.conf.bs) {
+error_report(scsi-block: drive property not set);
+return -1;
+}
+
+/* check we are using a driver managing SG_IO (version 3 and after) */
+if (bdrv_ioctl(s-qdev.conf.bs, SG_GET_VERSION_NUM, sg_version)  0 ||
+sg_version  3) {
+error_report(scsi-block: scsi generic interface too old);
+return -1;
+}
+
+/* get device type from INQUIRY data */
+rc = get_device_type(s);
+if (rc  0) {
+error_report(scsi-block: INQUIRY failed);
+return -1;
+}
+
+return scsi_initfn(s-qdev);
+}
+
+static SCSIRequest *scsi_block_new_request(SCSIDevice *d, uint32_t tag,
+   uint32_t lun, uint8_t *buf,
+   void *hba_private)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
+
+switch (buf[0]) {
+case SERVICE_ACTION_IN_16:
+if ((buf[1]  31) != SAI_READ_CAPACITY_16) {
+break;
+}
+
+case READ_6:
+case READ_10:
+case READ_12:
+case READ_16:
+case READ_CAPACITY_10:
+case WRITE_6:
+case WRITE_10:
+case WRITE_12:
+case WRITE_16:
+case WRITE_VERIFY_10:
+case WRITE_VERIFY_12:
+case WRITE_VERIFY_16:
+return scsi_req_alloc(scsi_disk_reqops, s-qdev, tag, lun,
+  hba_private);
+}
+
+return scsi_req_alloc(scsi_generic_req_ops, s-qdev, tag, lun,
+  hba_private);
+}
+#endif
+
 #define DEFINE_SCSI_DISK_PROPERTIES()   \
 DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf),  \
 DEFINE_PROP_STRING(ver,  SCSIDiskState, version), \
@@ -1638,6 +1739,21 @@ static SCSIDeviceInfo scsi_disk_info[] = {
 DEFINE_SCSI_DISK_PROPERTIES(),
 DEFINE_PROP_END_OF_LIST(),
 },
+#ifdef __linux__
+},{
+.qdev.name= scsi-block,
+.qdev.fw_name = disk,
+.qdev.desc= SCSI block device passthrough,
+.qdev.size= sizeof(SCSIDiskState),
+.qdev.reset   = scsi_disk_reset,
+.init = scsi_block_initfn,
+.destroy  = scsi_destroy,
+.alloc_req= scsi_block_new_request,
+.qdev.props   = (Property[]) {
+DEFINE_SCSI_DISK_PROPERTIES(),
+DEFINE_PROP_END_OF_LIST(),
+},
+#endif
 },{
 .qdev.name= scsi-disk, /* legacy -device scsi-disk */
 .qdev.fw_name = disk,
-- 
1.7.6




[Qemu-devel] [PATCH 01/10] scsi-generic: drop SCSIGenericState

2011-10-12 Thread Paolo Bonzini
It is not needed, because s-bs is already stored in SCSIDevice, and
can be reached from the conf.bs member.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-generic.c |   85 +++--
 1 files changed, 37 insertions(+), 48 deletions(-)

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 2696fa1..1e5c41b 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -46,8 +46,6 @@ do { fprintf(stderr, scsi-generic:  fmt , ## __VA_ARGS__); 
} while (0)
 #define MAX_UINT ((unsigned int)-1)
 #endif
 
-typedef struct SCSIGenericState SCSIGenericState;
-
 typedef struct SCSIGenericReq {
 SCSIRequest req;
 uint8_t *buf;
@@ -56,12 +54,6 @@ typedef struct SCSIGenericReq {
 sg_io_hdr_t io_header;
 } SCSIGenericReq;
 
-struct SCSIGenericState
-{
-SCSIDevice qdev;
-BlockDriverState *bs;
-};
-
 static void scsi_free_request(SCSIRequest *req)
 {
 SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);
@@ -174,7 +166,7 @@ static void scsi_read_complete(void * opaque, int ret)
 static void scsi_read_data(SCSIRequest *req)
 {
 SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, r-req.dev);
+SCSIDevice *s = r-req.dev;
 int ret;
 
 DPRINTF(scsi_read_data 0x%x\n, req-tag);
@@ -183,7 +175,7 @@ static void scsi_read_data(SCSIRequest *req)
 return;
 }
 
-ret = execute_command(s-bs, r, SG_DXFER_FROM_DEV, scsi_read_complete);
+ret = execute_command(s-conf.bs, r, SG_DXFER_FROM_DEV, 
scsi_read_complete);
 if (ret  0) {
 scsi_command_complete(r, ret);
 return;
@@ -193,7 +185,7 @@ static void scsi_read_data(SCSIRequest *req)
 static void scsi_write_complete(void * opaque, int ret)
 {
 SCSIGenericReq *r = (SCSIGenericReq *)opaque;
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, r-req.dev);
+SCSIDevice *s = r-req.dev;
 
 DPRINTF(scsi_write_complete() ret = %d\n, ret);
 r-req.aiocb = NULL;
@@ -204,9 +196,9 @@ static void scsi_write_complete(void * opaque, int ret)
 }
 
 if (r-req.cmd.buf[0] == MODE_SELECT  r-req.cmd.buf[4] == 12 
-s-qdev.type == TYPE_TAPE) {
-s-qdev.blocksize = (r-buf[9]  16) | (r-buf[10]  8) | r-buf[11];
-DPRINTF(block size %d\n, s-qdev.blocksize);
+s-type == TYPE_TAPE) {
+s-blocksize = (r-buf[9]  16) | (r-buf[10]  8) | r-buf[11];
+DPRINTF(block size %d\n, s-blocksize);
 }
 
 scsi_command_complete(r, ret);
@@ -216,8 +208,8 @@ static void scsi_write_complete(void * opaque, int ret)
The transfer may complete asynchronously.  */
 static void scsi_write_data(SCSIRequest *req)
 {
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, req-dev);
 SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);
+SCSIDevice *s = r-req.dev;
 int ret;
 
 DPRINTF(scsi_write_data 0x%x\n, req-tag);
@@ -227,7 +219,7 @@ static void scsi_write_data(SCSIRequest *req)
 return;
 }
 
-ret = execute_command(s-bs, r, SG_DXFER_TO_DEV, scsi_write_complete);
+ret = execute_command(s-conf.bs, r, SG_DXFER_TO_DEV, scsi_write_complete);
 if (ret  0) {
 scsi_command_complete(r, ret);
 }
@@ -261,8 +253,8 @@ static void scsi_req_fixup(SCSIRequest *req)
 
 static int32_t scsi_send_command(SCSIRequest *req, uint8_t *cmd)
 {
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, req-dev);
 SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);
+SCSIDevice *s = r-req.dev;
 int ret;
 
 scsi_req_fixup(r-req);
@@ -285,7 +277,7 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t 
*cmd)
 g_free(r-buf);
 r-buflen = 0;
 r-buf = NULL;
-ret = execute_command(s-bs, r, SG_DXFER_NONE, scsi_command_complete);
+ret = execute_command(s-conf.bs, r, SG_DXFER_NONE, 
scsi_command_complete);
 if (ret  0) {
 scsi_command_complete(r, ret);
 return 0;
@@ -372,77 +364,74 @@ static int get_stream_blocksize(BlockDriverState *bdrv)
 
 static void scsi_generic_reset(DeviceState *dev)
 {
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev.qdev, dev);
+SCSIDevice *s = DO_UPCAST(SCSIDevice, qdev, dev);
 
-scsi_device_purge_requests(s-qdev, SENSE_CODE(RESET));
+scsi_device_purge_requests(s, SENSE_CODE(RESET));
 }
 
-static void scsi_destroy(SCSIDevice *d)
+static void scsi_destroy(SCSIDevice *s)
 {
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, d);
-
-scsi_device_purge_requests(s-qdev, SENSE_CODE(NO_SENSE));
-blockdev_mark_auto_del(s-qdev.conf.bs);
+scsi_device_purge_requests(s, SENSE_CODE(NO_SENSE));
+blockdev_mark_auto_del(s-conf.bs);
 }
 
-static int scsi_generic_initfn(SCSIDevice *dev)
+static int scsi_generic_initfn(SCSIDevice *s)
 {
-SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, dev);
 int sg_version;
 struct sg_scsi_id scsiid;
 
-if (!s-qdev.conf.bs) 

[Qemu-devel] [PATCH 09/10] scsi: pass cdb to alloc_req

2011-10-12 Thread Paolo Bonzini
This will let scsi-block choose between passthrough and emulation.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-bus.c |2 +-
 hw/scsi-disk.c|4 ++--
 hw/scsi-generic.c |2 +-
 hw/scsi.h |2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 252e903..72c0dd2 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -451,7 +451,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 req = scsi_req_alloc(reqops_target_command, d, tag, lun,
  hba_private);
 } else {
-req = d-info-alloc_req(d, tag, lun, hba_private);
+req = d-info-alloc_req(d, tag, lun, buf, hba_private);
 }
 }
 
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 6c1d5a2..835cc7f 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1593,8 +1593,8 @@ static const SCSIReqOps scsi_disk_reqops = {
 .get_buf  = scsi_get_buf,
 };
 
-static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag,
- uint32_t lun, void *hba_private)
+static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun,
+ uint8_t *buf, void *hba_private)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
 SCSIRequest *req;
diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index eb066ba..9091e13 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -443,7 +443,7 @@ const SCSIReqOps scsi_generic_req_ops = {
 };
 
 static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun,
- void *hba_private)
+ uint8_t *buf, void *hba_private)
 {
 SCSIRequest *req;
 
diff --git a/hw/scsi.h b/hw/scsi.h
index c23..aa816e6 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -93,7 +93,7 @@ struct SCSIDeviceInfo {
 scsi_qdev_initfn init;
 void (*destroy)(SCSIDevice *s);
 SCSIRequest *(*alloc_req)(SCSIDevice *s, uint32_t tag, uint32_t lun,
-  void *hba_private);
+  uint8_t *buf, void *hba_private);
 void (*unit_attention_reported)(SCSIDevice *s);
 };
 
-- 
1.7.6





Re: [Qemu-devel] [PATCH V4 2/2] hw/9pfs: Add readonly support for 9p export

2011-10-12 Thread M. Mohan Kumar
On Wednesday, October 12, 2011 07:38:25 PM Aneesh Kumar K.V wrote:
 On Wed, 12 Oct 2011 13:23:34 +0530, M. Mohan Kumar mo...@in.ibm.com 
wrote:
  A new fsdev parameter readonly is introduced to control accessing 9p
  export. readonly=on|off can be used to specify the access type. By
  default rw access is given.
  
  Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
  ---
  Changes from previous version V3:
  * Use opt_set_bool function to set readonly option
  * Change the flag from MS_READONLY to 9p specific
  
  Change from previous version V2:
  * QEMU_OPT_BOOL is used for readdonly parameter
  
  Changes from previous version:
  * Use readonly option instead of access
  * Change function return type to boolean where its needed
  
   fsdev/file-op-9p.h |3 +-
   fsdev/qemu-fsdev.c |   12 +-
   fsdev/qemu-fsdev.h |1 +
   hw/9pfs/virtio-9p-device.c |3 ++
   hw/9pfs/virtio-9p.c|   46
    qemu-config.c 
   |7 ++
   vl.c   |2 +
   7 files changed, 71 insertions(+), 3 deletions(-)
  
  diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
  index 33fb07f..b75290d 100644
  --- a/fsdev/file-op-9p.h
  +++ b/fsdev/file-op-9p.h
  @@ -58,7 +58,8 @@ typedef struct extended_ops {
  
   } extended_ops;
   
   /* FsContext flag values */
  
  -#define PATHNAME_FSCONTEXT 0x1
  +#define PATHNAME_FSCONTEXT  0x1
 
 why ?
It was part of alignment change for above line

 
  +#define P9_RDONLY_EXPORT0x2
  
   /* cache flags */
   #define V9FS_WRITETHROUGH_CACHE 0x1
  
  diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
  index d08ba9c..f8a8227 100644
  --- a/fsdev/qemu-fsdev.c
  +++ b/fsdev/qemu-fsdev.c
  @@ -29,13 +29,13 @@ static FsTypeTable FsTypes[] = {
  
   int qemu_fsdev_add(QemuOpts *opts)
   {
   
   struct FsTypeListEntry *fsle;
  
  -int i;
  +int i, flags = 0;
  
   const char *fsdev_id = qemu_opts_id(opts);
   const char *fstype = qemu_opt_get(opts, fstype);
   const char *path = qemu_opt_get(opts, path);
   const char *sec_model = qemu_opt_get(opts, security_model);
   const char *cache = qemu_opt_get(opts, cache);
  
  -
  +int rdonly = qemu_opt_get_bool(opts, readonly, 0);
  
   if (!fsdev_id) {
   
   fprintf(stderr, fsdev: No id specified\n);
  
  @@ -76,6 +76,14 @@ int qemu_fsdev_add(QemuOpts *opts)
  
   fsle-fse.cache_flags = V9FS_WRITETHROUGH_CACHE;
   
   }
   
   }
  
  +if (rdonly) {
  +flags |= P9_RDONLY_EXPORT;
  +} else {
  +flags = ~P9_RDONLY_EXPORT;
  +}
  +
  +fsle-fse.flags = flags;
  +
  
   QTAILQ_INSERT_TAIL(fstype_entries, fsle, next);
   return 0;
   
   }
  
  diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
  index 0f67880..2938eaf 100644
  --- a/fsdev/qemu-fsdev.h
  +++ b/fsdev/qemu-fsdev.h
  @@ -44,6 +44,7 @@ typedef struct FsTypeEntry {
  
   char *security_model;
   int cache_flags;
   FileOperations *ops;
  
  +int flags;
 
 do we need extra flags ? Why not use cache_flags ? renaming that to
 export_flags ?
Yes, we can use same variable.
 
   } FsTypeEntry;
   
   typedef struct FsTypeListEntry {
  
  diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
  index 1846e36..336292c 100644
  --- a/hw/9pfs/virtio-9p-device.c
  +++ b/hw/9pfs/virtio-9p-device.c
  @@ -125,6 +125,9 @@ VirtIODevice *virtio_9p_init(DeviceState *dev,
  V9fsConf *conf)
  
   s-tag_len = len;
   s-ctx.uid = -1;
   s-ctx.flags = 0;
  
  +if (fse-flags  P9_RDONLY_EXPORT) {
  +s-ctx.flags |= P9_RDONLY_EXPORT;
  +}
  
   s-ops = fse-ops;
   s-vdev.get_features = virtio_9p_get_features;
  
  diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
  index 47ed2f1..9f15787 100644
  --- a/hw/9pfs/virtio-9p.c
  +++ b/hw/9pfs/virtio-9p.c
  @@ -1271,6 +1271,11 @@ static void v9fs_fix_path(V9fsPath *dst, V9fsPath
  *src, int len)
  
   dst-size++;
   
   }
  
  +static inline bool is_ro_export(FsContext *fs_ctx)
  +{
  +return fs_ctx-flags  P9_RDONLY_EXPORT;
  +}
  +
  
   static void v9fs_version(void *opaque)
   {
   
   V9fsPDU *pdu = opaque;
  
  @@ -1690,6 +1695,14 @@ static void v9fs_open(void *opaque)
  
   } else {
   
   flags = omode_to_uflags(mode);
   
   }
  
  +if (is_ro_export(s-ctx)) {
  +if (mode  O_WRONLY || mode  O_RDWR || mode  O_APPEND) {
  +err = -EROFS;
  +goto out;
  +} else {
  +flags |= O_NOATIME;
  +}
  +}
 
 What about O_TRUNC ?

Thanks, I will include the check for O_TRUNC
 
   err = v9fs_co_open(pdu, fidp, flags);
   if (err  0) {
   
   goto out;
  
  @@ -3301,6 +3314,33 @@ static void v9fs_op_not_supp(void *opaque)
  
   complete_pdu(pdu-s, pdu, -EOPNOTSUPP);
   
   

Re: [Qemu-devel] [PATCH 1/2] hw/9pfs: Add new virtfs option cache=writethrough to skip host page cache

2011-10-12 Thread Aneesh Kumar K.V
On Wed, 12 Oct 2011 15:32:36 +0100, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Wed, Oct 12, 2011 at 2:19 PM, Aneesh Kumar K.V
 aneesh.ku...@linux.vnet.ibm.com wrote:
  On Wed, 12 Oct 2011 10:55:21 +0100, Stefan Hajnoczi stefa...@gmail.com 
  wrote:
  On Mon, Oct 10, 2011 at 10:06 AM, Aneesh Kumar K.V
  aneesh.ku...@linux.vnet.ibm.com wrote:
   diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
   index 5c8b5ed..441a37f 100644
   --- a/hw/9pfs/virtio-9p-handle.c
   +++ b/hw/9pfs/virtio-9p-handle.c
   @@ -202,6 +202,15 @@ static ssize_t handle_pwritev(FsContext *ctx, int 
   fd, const struct iovec *iov,
           return writev(fd, iov, iovcnt);
 
  The sync_file_range(2) call below is dead-code since we'll return
  immediately after writev(2) completes.  The writev(2) return value
  needs to be saved temporarily and then returned after
  sync_file_range(2).
 
  Missed that. Will fix in the next update
 
 
       }
    #endif
   +    if (ctx-cache_flags  V9FS_WRITETHROUGH_CACHE) {
 
  -drive cache=writethrough means something different from 9pfs
  writethrough.  This is confusing so I wonder if there is a better
  name like immediate write-out.
 
 
  cache=immediate-write-out ?
 
   +        /*
   +         * Initiate a writeback. This is not a data integrity sync.
   +         * We want to ensure that we don't leave dirty pages in the 
   cache
   +         * after write when cache=writethrough is sepcified.
   +         */
   +        sync_file_range(fd, offset, 0,
   +                        SYNC_FILE_RANGE_WAIT_BEFORE | 
   SYNC_FILE_RANGE_WRITE);
   +    }
 
  I'm not sure whether SYNC_FILE_RANGE_WAIT_BEFORE is necessary.  As a
  best-effort mechanism just SYNC_FILE_RANGE_WRITE does the job although
  a client that rapidly rewrites may be able to leave dirty pages in the
  host page cache
 
  Shouldn't we prevent this ? That is the reason for me to use that
  WAIT_BEFORE ?
 
 The flag will cause sync_file_range(2) to wait on in-flight I/O.  The
 guest will notice slow I/O.  You should at least specify a range
 instead of nbytes=0 in the arguments.

version 3 i have 

commit db3cb2ac561d837176f9e0d02075a898c57ad309
Author: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Date:   Wed Oct 12 19:11:23 2011 +0530

hw/9pfs: Add new virtfs option writeout=immediate skip host page cache

writeout=immediate implies the after pwritev we do a sync_file_range.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 8de8abf..af3ecbe 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -53,12 +53,16 @@ struct xattr_operations;
 /* FsContext flag values */
 #define PATHNAME_FSCONTEXT 0x1
 
+/* export flags */
+#define V9FS_IMMEDIATE_WRITEOUT 0x1
+
 typedef struct FsContext
 {
 int flags;
 char *fs_root;
 SecModel fs_sm;
 uid_t uid;
+int export_flags;
 struct xattr_operations **xops;
 /* fs driver specific data */
 void *private;
diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index 768819f..946bad7 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -34,6 +34,8 @@ int qemu_fsdev_add(QemuOpts *opts)
 const char *fstype = qemu_opt_get(opts, fstype);
 const char *path = qemu_opt_get(opts, path);
 const char *sec_model = qemu_opt_get(opts, security_model);
+const char *writeout = qemu_opt_get(opts, writeout);
+
 
 if (!fsdev_id) {
 fprintf(stderr, fsdev: No id specified\n);
@@ -72,10 +74,14 @@ int qemu_fsdev_add(QemuOpts *opts)
 fsle-fse.path = g_strdup(path);
 fsle-fse.security_model = g_strdup(sec_model);
 fsle-fse.ops = FsTypes[i].ops;
-
+fsle-fse.export_flags = 0;
+if (writeout) {
+if (!strcmp(writeout, immediate)) {
+fsle-fse.export_flags = V9FS_IMMEDIATE_WRITEOUT;
+}
+}
 QTAILQ_INSERT_TAIL(fstype_entries, fsle, next);
 return 0;
-
 }
 
 FsTypeEntry *get_fsdev_fsentry(char *id)
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index e04931a..3b2feb4 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -41,6 +41,7 @@ typedef struct FsTypeEntry {
 char *fsdev_id;
 char *path;
 char *security_model;
+int export_flags;
 FileOperations *ops;
 } FsTypeEntry;
 
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index e5b68da..403eed0 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -115,6 +115,7 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
*conf)
 exit(1);
 }
 
+s-ctx.export_flags = fse-export_flags;
 s-ctx.fs_root = g_strdup(fse-path);
 len = strlen(conf-tag);
 if (len  MAX_TAG_LEN) {
diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index 5c8b5ed..91d757a 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -192,16 +192,27 @@ static ssize_t handle_preadv(FsContext *ctx, int fd, 
const struct iovec *iov,
 static ssize_t 

[Qemu-devel] [PATCH 04/10] scsi-generic: look at host status

2011-10-12 Thread Paolo Bonzini
Pass down the host status so that failing transport can be detected
by the guest.  Similar treatment of host status could be done in
virtio-blk, too.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-generic.c |   20 
 1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 04549aa..7b291ec 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -39,8 +39,13 @@ do { fprintf(stderr, scsi-generic:  fmt , ## __VA_ARGS__); 
} while (0)
 
 #define SCSI_SENSE_BUF_SIZE 96
 
-#define SG_ERR_DRIVER_TIMEOUT 0x06
-#define SG_ERR_DRIVER_SENSE 0x08
+#define SG_ERR_DRIVER_TIMEOUT  0x06
+#define SG_ERR_DRIVER_SENSE0x08
+
+#define SG_ERR_DID_OK  0x00
+#define SG_ERR_DID_NO_CONNECT  0x01
+#define SG_ERR_DID_BUS_BUSY0x02
+#define SG_ERR_DID_TIME_OUT0x03
 
 #ifndef MAX_UINT
 #define MAX_UINT ((unsigned int)-1)
@@ -68,8 +73,9 @@ static void scsi_command_complete(void *opaque, int ret)
 SCSIGenericReq *r = (SCSIGenericReq *)opaque;
 
 r-req.aiocb = NULL;
-if (r-io_header.driver_status  SG_ERR_DRIVER_SENSE)
+if (r-io_header.driver_status  SG_ERR_DRIVER_SENSE) {
 r-req.sense_len = r-io_header.sb_len_wr;
+}
 
 if (ret != 0) {
 switch (ret) {
@@ -86,9 +92,15 @@ static void scsi_command_complete(void *opaque, int ret)
 break;
 }
 } else {
-if (r-io_header.driver_status  SG_ERR_DRIVER_TIMEOUT) {
+if (r-io_header.host_status == SG_ERR_DID_NO_CONNECT ||
+r-io_header.host_status == SG_ERR_DID_BUS_BUSY ||
+r-io_header.host_status == SG_ERR_DID_TIME_OUT ||
+(r-io_header.driver_status  SG_ERR_DRIVER_TIMEOUT)) {
 status = BUSY;
 BADF(Driver Timeout\n);
+} else if (r-io_header.host_status) {
+status = CHECK_CONDITION;
+scsi_req_build_sense(r-req, SENSE_CODE(I_T_NEXUS_LOSS));
 } else if (r-io_header.status) {
 status = r-io_header.status;
 } else if (r-io_header.driver_status  SG_ERR_DRIVER_SENSE) {
-- 
1.7.6





[Qemu-devel] [PATCH 02/10] scsi-generic: remove scsi_req_fixup

2011-10-12 Thread Paolo Bonzini
This is not needed anymore, since asynchronous ioctls were introduced
by commit 221f715 (new scsi-generic abstraction, use SG_IO, 2009-03-28).

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi-generic.c |   15 ---
 1 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 1e5c41b..5130e9a 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -233,19 +233,6 @@ static uint8_t *scsi_get_buf(SCSIRequest *req)
 return r-buf;
 }
 
-static void scsi_req_fixup(SCSIRequest *req)
-{
-switch(req-cmd.buf[0]) {
-case REWIND:
-case START_STOP:
-if (req-dev-type == TYPE_TAPE) {
-/* force IMMED, otherwise qemu waits end of command */
-req-cmd.buf[1] = 0x01;
-}
-break;
-}
-}
-
 /* Execute a scsi command.  Returns the length of the data expected by the
command.  This will be Positive for data transfers from the device
(eg. disk reads), negative for transfers to the device (eg. disk writes),
@@ -257,8 +244,6 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t 
*cmd)
 SCSIDevice *s = r-req.dev;
 int ret;
 
-scsi_req_fixup(r-req);
-
 DPRINTF(Command: lun=%d tag=0x%x len %zd data=0x%02x, lun, tag,
 r-req.cmd.xfer, cmd[0]);
 
-- 
1.7.6





Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread M. Mohan Kumar
-- 
Regards,
M. Mohan Kumar
On Wednesday, October 12, 2011 01:58:00 PM Daniel P. Berrange wrote:
 On Wed, Oct 12, 2011 at 01:24:16PM +0530, M. Mohan Kumar wrote:
  Security model is needed only for 'local' fs driver.
  
  Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
  ---
  
   fsdev/qemu-fsdev.c |6 +
   fsdev/qemu-fsdev.h |1 +
   hw/9pfs/virtio-9p-device.c |   47
   ++- vl.c  
   |   20 +++--
   4 files changed, 43 insertions(+), 31 deletions(-)
  
  --- a/fsdev/qemu-fsdev.h
  +++ b/fsdev/qemu-fsdev.h
  @@ -40,6 +40,7 @@ typedef struct FsTypeTable {
  
   typedef struct FsTypeEntry {
   
   char *fsdev_id;
   char *path;
  
  +char *fsdriver;
  
   char *security_model;
   int cache_flags;
   FileOperations *ops;
  
  diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
  index aac58ad..1846e36 100644
  --- a/hw/9pfs/virtio-9p-device.c
  +++ b/hw/9pfs/virtio-9p-device.c
  @@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev,
  V9fsConf *conf)
  
   exit(1);
   
   }
  
  -if (!strcmp(fse-security_model, passthrough)) {
  -/* Files on the Fileserver set to client user credentials */
  -s-ctx.fs_sm = SM_PASSTHROUGH;
  -s-ctx.xops = passthrough_xattr_ops;
  -} else if (!strcmp(fse-security_model, mapped)) {
  -/* Files on the fileserver are set to QEMU credentials.
  - * Client user credentials are saved in extended attributes.
  - */
  -s-ctx.fs_sm = SM_MAPPED;
  -s-ctx.xops = mapped_xattr_ops;
  -} else if (!strcmp(fse-security_model, none)) {
  -/*
  - * Files on the fileserver are set to QEMU credentials.
  - */
  -s-ctx.fs_sm = SM_NONE;
  -s-ctx.xops = none_xattr_ops;
  -} else {
  -fprintf(stderr, Default to security_model=none. You may want
  - enable advanced security model using 
  -security option:\n\t security_model=passthrough\n\t 
  -security_model=mapped\n);
  -s-ctx.fs_sm = SM_NONE;
  -s-ctx.xops = none_xattr_ops;
  +/* security models is needed only for local fs driver */
  +if (!strcmp(fse-fsdriver, local)) {
  +if (!strcmp(fse-security_model, passthrough)) {
  +/* Files on the Fileserver set to client user credentials */
  +s-ctx.fs_sm = SM_PASSTHROUGH;
  +s-ctx.xops = passthrough_xattr_ops;
  +} else if (!strcmp(fse-security_model, mapped)) {
  +/* Files on the fileserver are set to QEMU credentials.
  +* Client user credentials are saved in extended attributes.
  +*/
  +s-ctx.fs_sm = SM_MAPPED;
  +s-ctx.xops = mapped_xattr_ops;
  +} else if (!strcmp(fse-security_model, none)) {
  +/*
  +* Files on the fileserver are set to QEMU credentials.
  +*/
  +s-ctx.fs_sm = SM_NONE;
  +s-ctx.xops = none_xattr_ops;
  +} else {
  +fprintf(stderr, Invalid security_model %s specified.\n
  +Available security models are:\t 
  +passthrough,mapped or none\n,
  fse-security_model); +exit(1);
  +}
 
 Are you sure there aren't use cases where people would like to
 choose between  passthrough  mapped, even when using the 'proxy'
 or 'handle' security drivers.

Proxy FS driver is added to overcome the limit imposed by local + passthrough 
security model combination that needs qemu to be started by root user. Mapped 
and none secuiry model can be used by non root user also.

So Proxy FS driver does not need any security model(its pass-through only)
 
 Both of the security models seem pretty generally useful to me,
 regardless of the driver type.
 



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread David Gibson
On Wed, Oct 12, 2011 at 09:22:01AM +0200, Michael S. Tsirkin wrote:
 On Wed, Oct 12, 2011 at 02:07:46PM +1100, David Gibson wrote:
  Um.. why?  PCI is defined by the spec to be LE, so I don't see that we
  need explicit endianness versions for PCI helpers.
 
 LE in the spec only applies to structures defined by the spec,
 that is pci configuration and msix tables in device memory.

Well, true.  But when was the last time you saw a PCI device with BE
registers?  I think it's a rare enough case that the individual device
models can reswap themselves.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread David Gibson
On Thu, Oct 13, 2011 at 02:43:06AM +1100, David Gibson wrote:
 On Wed, Oct 12, 2011 at 09:22:01AM +0200, Michael S. Tsirkin wrote:
  On Wed, Oct 12, 2011 at 02:07:46PM +1100, David Gibson wrote:
   Um.. why?  PCI is defined by the spec to be LE, so I don't see that we
   need explicit endianness versions for PCI helpers.
  
  LE in the spec only applies to structures defined by the spec,
  that is pci configuration and msix tables in device memory.
 
 Well, true.  But when was the last time you saw a PCI device with BE
 registers?  I think it's a rare enough case that the individual device
 models can reswap themselves.

s/BE registers/BE DMA accessed structures/

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 16/23] i8259: PREP: Replace pic_intack_read with pic_read_irq

2011-10-12 Thread Andreas Färber
Hello Jan,

Since the last round of pulls, PReP magically boots again. :)

Am 07.10.2011 09:19, schrieb Jan Kiszka:
 There is nothing in the i8259 spec that justifies the special
 pic_intack_read. At least the Linux PREP kernels configure the PICs
 properly so that pic_read_irq returns identical values, and setting
 read_reg_select in PIC0 cannot be derived from any special i8259 mode.
 
 So switch ppc_prep to pic_read_irq and drop the now unused PIC code.

Seems to resolve an existing XXX in the code.

 CC: Andreas Färber andreas.faer...@web.de
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com

Linux is the only thing we can boot with -M prep that I'm aware of, via
-kernel. And the 40p series doesn't use this at all. I see no regression
on my Debian Etch, so I'm fine with the simplification.

Tested-by: Andreas Färber andreas.faer...@web.de

 diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c
 index d26049b..6427baa 100644
 --- a/hw/ppc_prep.c
 +++ b/hw/ppc_prep.c
 @@ -130,7 +130,7 @@ static inline uint32_t 
 _PPC_intack_read(target_phys_addr_t addr)
  uint32_t retval = 0;
  
  if ((addr  0xf) == 0)
 -retval = pic_intack_read(isa_pic);
 +retval = pic_read_irq(isa_pic);

Mind to add the braces while touching it?

Regards,
Andreas



Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread Daniel P. Berrange
On Wed, Oct 12, 2011 at 09:05:50PM +0530, M. Mohan Kumar wrote:
 -- 
 Regards,
 M. Mohan Kumar
 On Wednesday, October 12, 2011 01:58:00 PM Daniel P. Berrange wrote:
  On Wed, Oct 12, 2011 at 01:24:16PM +0530, M. Mohan Kumar wrote:
   Security model is needed only for 'local' fs driver.
   
   Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
   ---
   
fsdev/qemu-fsdev.c |6 +
fsdev/qemu-fsdev.h |1 +
hw/9pfs/virtio-9p-device.c |   47
++- vl.c  
|   20 +++--
4 files changed, 43 insertions(+), 31 deletions(-)
   
   --- a/fsdev/qemu-fsdev.h
   +++ b/fsdev/qemu-fsdev.h
   @@ -40,6 +40,7 @@ typedef struct FsTypeTable {
   
typedef struct FsTypeEntry {

char *fsdev_id;
char *path;
   
   +char *fsdriver;
   
char *security_model;
int cache_flags;
FileOperations *ops;
   
   diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
   index aac58ad..1846e36 100644
   --- a/hw/9pfs/virtio-9p-device.c
   +++ b/hw/9pfs/virtio-9p-device.c
   @@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev,
   V9fsConf *conf)
   
exit(1);

}
   
   -if (!strcmp(fse-security_model, passthrough)) {
   -/* Files on the Fileserver set to client user credentials */
   -s-ctx.fs_sm = SM_PASSTHROUGH;
   -s-ctx.xops = passthrough_xattr_ops;
   -} else if (!strcmp(fse-security_model, mapped)) {
   -/* Files on the fileserver are set to QEMU credentials.
   - * Client user credentials are saved in extended attributes.
   - */
   -s-ctx.fs_sm = SM_MAPPED;
   -s-ctx.xops = mapped_xattr_ops;
   -} else if (!strcmp(fse-security_model, none)) {
   -/*
   - * Files on the fileserver are set to QEMU credentials.
   - */
   -s-ctx.fs_sm = SM_NONE;
   -s-ctx.xops = none_xattr_ops;
   -} else {
   -fprintf(stderr, Default to security_model=none. You may want
   - enable advanced security model using 
   -security option:\n\t security_model=passthrough\n\t 
   -security_model=mapped\n);
   -s-ctx.fs_sm = SM_NONE;
   -s-ctx.xops = none_xattr_ops;
   +/* security models is needed only for local fs driver */
   +if (!strcmp(fse-fsdriver, local)) {
   +if (!strcmp(fse-security_model, passthrough)) {
   +/* Files on the Fileserver set to client user credentials */
   +s-ctx.fs_sm = SM_PASSTHROUGH;
   +s-ctx.xops = passthrough_xattr_ops;
   +} else if (!strcmp(fse-security_model, mapped)) {
   +/* Files on the fileserver are set to QEMU credentials.
   +* Client user credentials are saved in extended attributes.
   +*/
   +s-ctx.fs_sm = SM_MAPPED;
   +s-ctx.xops = mapped_xattr_ops;
   +} else if (!strcmp(fse-security_model, none)) {
   +/*
   +* Files on the fileserver are set to QEMU credentials.
   +*/
   +s-ctx.fs_sm = SM_NONE;
   +s-ctx.xops = none_xattr_ops;
   +} else {
   +fprintf(stderr, Invalid security_model %s specified.\n
   +Available security models are:\t 
   +passthrough,mapped or none\n,
   fse-security_model); +exit(1);
   +}
  
  Are you sure there aren't use cases where people would like to
  choose between  passthrough  mapped, even when using the 'proxy'
  or 'handle' security drivers.
 
 Proxy FS driver is added to overcome the limit imposed by local + passthrough 
 security model combination that needs qemu to be started by root user. Mapped 
 and none secuiry model can be used by non root user also.
 
 So Proxy FS driver does not need any security model(its pass-through only)

The Proxy FS driver does not need the security model, but if so desired
it would be possible to choose to implement the security models. It just
happens that the driver is hardcoded to only operate in 'passthrough'
mode.

I think that disabling the parsing of the 'security' parameter for non-local
drivers is dangerous, because an application might think that the 'mapped'
model was supported, but its parameter would get silently ignored. If the
requested value is not supported, then the application should always be
told about that.

So, IMHO, it would be better to have logic such as:

if (strcmp(security_mode, passthrough) == 0) {
 ...
} else if (strcmp(security_model, mapped) == 0) {
 if (strcmp(fsdriver, local) != 0) {
fprintf(stderr, security mode 'passthrough' is not supported by 
'%s'\n, fsdriver);
exit(1);
 }
 ...
} else if (strcmp(security_model, none) == 0) {
 if (strcmp(fsdriver, local) != 0) {
  

Re: [Qemu-devel] [PATCH v2 16/23] i8259: PREP: Replace pic_intack_read with pic_read_irq

2011-10-12 Thread Jan Kiszka
On 2011-10-12 18:02, Andreas Färber wrote:
 Hello Jan,
 
 Since the last round of pulls, PReP magically boots again. :)
 
 Am 07.10.2011 09:19, schrieb Jan Kiszka:
 There is nothing in the i8259 spec that justifies the special
 pic_intack_read. At least the Linux PREP kernels configure the PICs
 properly so that pic_read_irq returns identical values, and setting
 read_reg_select in PIC0 cannot be derived from any special i8259 mode.

 So switch ppc_prep to pic_read_irq and drop the now unused PIC code.
 
 Seems to resolve an existing XXX in the code.
 
 CC: Andreas Färber andreas.faer...@web.de
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 
 Linux is the only thing we can boot with -M prep that I'm aware of, via
 -kernel. And the 40p series doesn't use this at all. I see no regression
 on my Debian Etch, so I'm fine with the simplification.
 
 Tested-by: Andreas Färber andreas.faer...@web.de

Thanks!

 
 diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c
 index d26049b..6427baa 100644
 --- a/hw/ppc_prep.c
 +++ b/hw/ppc_prep.c
 @@ -130,7 +130,7 @@ static inline uint32_t 
 _PPC_intack_read(target_phys_addr_t addr)
  uint32_t retval = 0;
  
  if ((addr  0xf) == 0)
 -retval = pic_intack_read(isa_pic);
 +retval = pic_read_irq(isa_pic);
 
 Mind to add the braces while touching it?

Will do if I have to repost this series (I hope not...).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 1/2] Add opt_set_bool function

2011-10-12 Thread Andreas Färber

Am 12.10.2011 09:53, schrieb M. Mohan Kumar:

In addition to qemu_opt_set function, we need a function to set bool value
also.

Signed-off-by: M. Mohan Kumarmo...@in.ibm.com
---
  qemu-option.c |   35 +++
  qemu-option.h |1 +
  2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index 105d760..d6bc908 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -636,6 +636,41 @@ int qemu_opt_set(QemuOpts *opts, const char *name, const 
char *value)
  return 0;
  }

+int qemu_opt_set_bool(QemuOpts *opts, const char *name, int val)


Might it make sense to let qemu_opt_{get,set}_bool() use bool type?

Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746, AG Nürnberg



[Qemu-devel] [PATCH] configure: Detect when glibc implements makecontext() to always fail

2011-10-12 Thread Peter Maydell
Improve the configure test for presence of ucontext functions by
making linker warnings fatal; this allows us to detect when we are
linked with a glibc which implements makecontext() to always return
ENOSYS.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
Compiling on an Ubuntu Natty ARM host will hit this.
(Anybody think we should clean up our configure tests so we can
enable -Werror and -Wl,--fatal-warnings on all of them?)

 configure |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 9b4fe34..4d9d9e0 100755
--- a/configure
+++ b/configure
@@ -2549,9 +2549,12 @@ ucontext_coroutine=no
 if test $darwin != yes; then
   cat  $TMPC  EOF
 #include ucontext.h
-int main(void) { makecontext(0, 0, 0); }
+int main(void) { makecontext(0, 0, 0); return 0; }
 EOF
-  if compile_prog   ; then
+  # Note that we enable fatal linker warnings to catch the
+  # glibc makecontext is not implemented and will always fail
+  # linker warning.
+  if compile_prog -Wl,--fatal-warnings  ; then
   ucontext_coroutine=yes
   fi
 fi
-- 
1.7.4.1




[Qemu-devel] [PATCH v3] add add-cow file format

2011-10-12 Thread Dong Xu Wang
Add add-cow file format

Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
---
 Makefile.objs  |1 +
 block.c|2 +-
 block.h|1 +
 block/add-cow.c|  412 
 block_int.h|1 +
 docs/specs/add-cow.txt |   45 ++
 6 files changed, 461 insertions(+), 1 deletions(-)
 create mode 100644 block/add-cow.c
 create mode 100644 docs/specs/add-cow.txt

diff --git a/Makefile.objs b/Makefile.objs
index c849e51..624c04c 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -31,6 +31,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o
+block-nested-y += add-cow.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-nested-y += qed-check.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
diff --git a/block.c b/block.c
index e865fab..c25241d 100644
--- a/block.c
+++ b/block.c
@@ -106,7 +106,7 @@ int is_windows_drive(const char *filename)
 #endif
 
 /* check if the path starts with protocol: */
-static int path_has_protocol(const char *path)
+int path_has_protocol(const char *path)
 {
 #ifdef _WIN32
 if (is_windows_drive(path) ||
diff --git a/block.h b/block.h
index 16bfa0a..8b09f12 100644
--- a/block.h
+++ b/block.h
@@ -256,6 +256,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, 
QEMUSnapshotInfo *sn);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
 int path_is_absolute(const char *path);
+int path_has_protocol(const char *path);
 void path_combine(char *dest, int dest_size,
   const char *base_path,
   const char *filename);
diff --git a/block/add-cow.c b/block/add-cow.c
new file mode 100644
index 000..d2538a2
--- /dev/null
+++ b/block/add-cow.c
@@ -0,0 +1,412 @@
+#include qemu-common.h
+#include block_int.h
+#include module.h
+
+#define ADD_COW_MAGIC  (((uint64_t)'A'  56) | ((uint64_t)'D'  48) | \
+((uint64_t)'D'  40) | ((uint64_t)'_'  32) | \
+((uint64_t)'C'  24) | ((uint64_t)'O'  16) | \
+((uint64_t)'W'  8) | 0xFF)
+#define ADD_COW_VERSION 1
+
+typedef struct AddCowHeader {
+uint64_t magic;
+uint32_t version;
+char backing_file[1024];
+char image_file[1024];
+uint64_t size;
+} QEMU_PACKED AddCowHeader;
+
+typedef struct BDRVAddCowState {
+char image_file[1024];
+BlockDriverState *image_hd;
+uint8_t *bitmap;
+uint64_t bitmap_size;
+} BDRVAddCowState;
+
+static int add_cow_probe(const uint8_t *buf, int buf_size, const char 
*filename)
+{
+const AddCowHeader *header = (const void *)buf;
+
+if (be64_to_cpu(header-magic) == ADD_COW_MAGIC 
+be32_to_cpu(header-version) == ADD_COW_VERSION) {
+return 100;
+} else {
+return 0;
+}
+}
+
+static int add_cow_open(BlockDriverState *bs, int flags)
+{
+AddCowHeader header;
+int64_t size;
+char image_filename[1024];
+int image_flags;
+BlockDriver *image_drv = NULL;
+int ret;
+BDRVAddCowState *state = (BDRVAddCowState *)(bs-opaque);
+
+ret = bdrv_pread(bs-file, 0, header, sizeof(header));
+if (ret != sizeof(header)) {
+goto fail;
+}
+
+if (be64_to_cpu(header.magic) != ADD_COW_MAGIC ||
+be32_to_cpu(header.version) != ADD_COW_VERSION) {
+ret = -1;
+goto fail;
+}
+
+size = be64_to_cpu(header.size);
+bs-total_sectors = size / BDRV_SECTOR_SIZE;
+
+QEMU_BUILD_BUG_ON(sizeof(state-image_file) != sizeof(header.image_file));
+pstrcpy(bs-backing_file, sizeof(bs-backing_file),
+header.backing_file);
+pstrcpy(state-image_file, sizeof(state-image_file),
+header.image_file);
+
+state-bitmap_size = ((bs-total_sectors + 7)  3);
+state-bitmap = g_malloc0(state-bitmap_size);
+
+ret = bdrv_pread(bs-file, sizeof(header), state-bitmap,
+state-bitmap_size);
+if (ret != state-bitmap_size) {
+goto fail;
+}
+   /* If there is a image_file, must be together with backing_file */
+if (state-image_file[0] != '\0') {
+state-image_hd = bdrv_new();
+/* Relative to image or working dir, need discussion */
+if (path_has_protocol(state-image_file)) {
+pstrcpy(image_filename, sizeof(image_filename),
+state-image_file);
+} else {
+path_combine(image_filename, sizeof(image_filename),
+ bs-filename, state-image_file);
+}
+
+image_drv = bdrv_find_format(raw);
+image_flags =
+ (flags  (~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING))) | BDRV_O_RDWR;
+state-image_hd-keep_read_only = 0;
+
+ret = bdrv_open(state-image_hd, image_filename, image_flags,
+ 

Re: [Qemu-devel] qemu-0.15.1 stable call for patches

2011-10-12 Thread Justin M. Forbes
On Wed, 2011-10-12 at 05:41 -0400, Brad wrote:
 On 26/09/11 9:16 AM, Justin M. Forbes wrote:
  With the current patch queue I would like to start getting qemu-0.15.1
  stable into shape.  With this in mind, the plan is to tag the release on
  Monday Oct 3rd.  If you have patches pending for stable, now would be the
  time to send them. Please CC jmfor...@linuxtx.org if you can to ensure that
  I see them.
 
  Thanks,
  Justin
 
 Is there anywhere where we can see what's been pulled into the pending 
 0.15.1 code base so far since you don't seem to really post to the list
 about the stable branches?
 

I typically reply when things get applied, though you can always check
the git tree at:

git://git.qemu.org/qemu-stable-0.15.git

Justin




[Qemu-devel] [PATCH] ps2: migrate ledstate

2011-10-12 Thread Christophe Fergeau
Make the ps2 device track its ledstate so that we can migrate it.
Otherwise it gets lost across migration, and spice-server gets
confused about the actual keyboard state and sends bogus
caps/scroll/num key events.

Signed-off-by: Christophe Fergeau cferg...@redhat.com
---
 hw/ps2.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/ps2.c b/hw/ps2.c
index 24228c1..f19ea18 100644
--- a/hw/ps2.c
+++ b/hw/ps2.c
@@ -92,6 +92,7 @@ typedef struct {
not the keyboard controller.  */
 int translate;
 int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */
+int ledstate;
 } PS2KbdState;
 
 typedef struct {
@@ -195,11 +196,17 @@ uint32_t ps2_read_data(void *opaque)
 return val;
 }
 
+static void ps2_set_ledstate(PS2KbdState *s, int ledstate)
+{
+s-ledstate = ledstate;
+kbd_put_ledstate(ledstate);
+}
+
 static void ps2_reset_keyboard(PS2KbdState *s)
 {
 s-scan_enabled = 1;
 s-scancode_set = 2;
-kbd_put_ledstate(0);
+ps2_set_ledstate(s, 0);
 }
 
 void ps2_write_keyboard(void *opaque, int val)
@@ -274,7 +281,7 @@ void ps2_write_keyboard(void *opaque, int val)
 s-common.write_cmd = -1;
 break;
 case KBD_CMD_SET_LEDS:
-kbd_put_ledstate(val);
+ps2_set_ledstate(s, val);
 ps2_queue(s-common, KBD_REPLY_ACK);
 s-common.write_cmd = -1;
 break;
@@ -563,6 +570,7 @@ static int ps2_kbd_post_load(void* opaque, int version_id)
 
 if (version_id == 2)
 s-scancode_set=2;
+kbd_put_ledstate(s-ledstate);
 return 0;
 }
 
@@ -577,6 +585,7 @@ static const VMStateDescription vmstate_ps2_keyboard = {
 VMSTATE_INT32(scan_enabled, PS2KbdState),
 VMSTATE_INT32(translate, PS2KbdState),
 VMSTATE_INT32_V(scancode_set, PS2KbdState,3),
+VMSTATE_INT32(ledstate, PS2KbdState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.6.4




Re: [Qemu-devel] [PATCH] ps2: migrate ledstate

2011-10-12 Thread Anthony Liguori

On 10/12/2011 11:35 AM, Christophe Fergeau wrote:

Make the ps2 device track its ledstate so that we can migrate it.
Otherwise it gets lost across migration, and spice-server gets
confused about the actual keyboard state and sends bogus
caps/scroll/num key events.

Signed-off-by: Christophe Fergeaucferg...@redhat.com
---
  hw/ps2.c |   13 +++--
  1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/ps2.c b/hw/ps2.c
index 24228c1..f19ea18 100644
--- a/hw/ps2.c
+++ b/hw/ps2.c
@@ -92,6 +92,7 @@ typedef struct {
 not the keyboard controller.  */
  int translate;
  int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */
+int ledstate;
  } PS2KbdState;

  typedef struct {
@@ -195,11 +196,17 @@ uint32_t ps2_read_data(void *opaque)
  return val;
  }

+static void ps2_set_ledstate(PS2KbdState *s, int ledstate)
+{
+s-ledstate = ledstate;
+kbd_put_ledstate(ledstate);
+}
+
  static void ps2_reset_keyboard(PS2KbdState *s)
  {
  s-scan_enabled = 1;
  s-scancode_set = 2;
-kbd_put_ledstate(0);
+ps2_set_ledstate(s, 0);
  }

  void ps2_write_keyboard(void *opaque, int val)
@@ -274,7 +281,7 @@ void ps2_write_keyboard(void *opaque, int val)
  s-common.write_cmd = -1;
  break;
  case KBD_CMD_SET_LEDS:
-kbd_put_ledstate(val);
+ps2_set_ledstate(s, val);
  ps2_queue(s-common, KBD_REPLY_ACK);
  s-common.write_cmd = -1;
  break;
@@ -563,6 +570,7 @@ static int ps2_kbd_post_load(void* opaque, int version_id)

  if (version_id == 2)
  s-scancode_set=2;
+kbd_put_ledstate(s-ledstate);
  return 0;
  }

@@ -577,6 +585,7 @@ static const VMStateDescription vmstate_ps2_keyboard = {
  VMSTATE_INT32(scan_enabled, PS2KbdState),
  VMSTATE_INT32(translate, PS2KbdState),
  VMSTATE_INT32_V(scancode_set, PS2KbdState,3),
+VMSTATE_INT32(ledstate, PS2KbdState),


If you're adding a field, you need to bump the version.

Regards,

Anthony Liguori


  VMSTATE_END_OF_LIST()
  }
  };





Re: [Qemu-devel] [PATCH] ps2: migrate ledstate

2011-10-12 Thread Christophe Fergeau
Hey,

On Wed, Oct 12, 2011 at 06:35:28PM +0200, Christophe Fergeau wrote:
 Make the ps2 device track its ledstate so that we can migrate it.
 Otherwise it gets lost across migration, and spice-server gets
 confused about the actual keyboard state and sends bogus
 caps/scroll/num key events.

This patch adds a ledstate to the PS2 keyboard driver as an alternative
to the original patch. I forgot to mention in the changelog entry that
this fixes https://bugzilla.redhat.com/show_bug.cgi?id=729294
One thing that could be improved after this patch is that the ledstate
is now present both in spice-input and hw/ps2.c, but I couldn't find
an easy way to get the ps2 ledstate from spice-input so I left things
this way for now.

Christophe


pgpCVtzFk8rCm.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread Aneesh Kumar K.V
On Wed, 12 Oct 2011 09:28:00 +0100, Daniel P. Berrange berra...@redhat.com 
wrote:
 On Wed, Oct 12, 2011 at 01:24:16PM +0530, M. Mohan Kumar wrote:
  Security model is needed only for 'local' fs driver.
  
  Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
  ---
   fsdev/qemu-fsdev.c |6 +
   fsdev/qemu-fsdev.h |1 +
   hw/9pfs/virtio-9p-device.c |   47 
  ++-
   vl.c   |   20 +++--
   4 files changed, 43 insertions(+), 31 deletions(-)
  



  * Files on the fileserver are set to QEMU credentials.
  +*/
  +s-ctx.fs_sm = SM_NONE;
  +s-ctx.xops = none_xattr_ops;
  +} else {
  +fprintf(stderr, Invalid security_model %s specified.\n
  +Available security models are:\t 
  +passthrough,mapped or none\n, fse-security_model);
  +exit(1);
  +}
 
 Are you sure there aren't use cases where people would like to
 choose between  passthrough  mapped, even when using the 'proxy'
 or 'handle' security drivers.

Currently handle fs driver requires CAP_DAC_READ_SEARCH and if qemu is
not going to run with specific capabilities this implies root
privileges. So handle fs driver doesn't do the mapping required by
different security model. Proxy fs driver is enabling us to run file
system operations as root. So even for that we don't need mapped
security model. Even if we want to store file attributes in xattr with
proxy fs driver, that will go as a proxy's argument not as -fsdev 
argument. Proxy also don't require export path name. But that is another
patch.

 
 Both of the security models seem pretty generally useful to me,
 regardless of the driver type.
 

-aneesh




[Qemu-devel] [ANNOUNCE] QEMU 0.15.1.tar.gz is available

2011-10-12 Thread Justin M. Forbes
The QEMU team is pleased to announce the availability of the 0.15.1
stable release.

Download instructions are available at http://wiki.qemu.org/Download

A detailed change log is available at
http://wiki.qemu.org/Changelog/0.15

On behalf of the QEMU team, I'd like to thank everyone who contributed
to make this release happen!

Regards,

Justin M. Forbes





Re: [Qemu-devel] [PATCH RFC V1 01/11] Introduce HostPCIDevice to access a pci device on the host.

2011-10-12 Thread Anthony PERARD
On Tue, Oct 4, 2011 at 19:21, Jan Kiszka jan.kis...@web.de wrote:
 This wasn't run through checkpatch.pl, I bet.

 On 2011-10-04 16:51, Anthony PERARD wrote:
 Signed-off-by: Anthony PERARD anthony.per...@citrix.com
 ---
  hw/host-pci-device.c |  192 
 ++
  hw/host-pci-device.h |   36 +
  2 files changed, 228 insertions(+), 0 deletions(-)
  create mode 100644 hw/host-pci-device.c
  create mode 100644 hw/host-pci-device.h

 diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c
 new file mode 100644
 index 000..b3f2899
 --- /dev/null
 +++ b/hw/host-pci-device.c
 @@ -0,0 +1,192 @@
 +#include qemu-common.h
 +#include host-pci-device.h
 +
 +static int path_to(const HostPCIDevice *d,
 +                   const char *name, char *buf, ssize_t size)
 +{
 +    return snprintf(buf, size, /sys/bus/pci/devices/%04x:%02x:%02x.%x/%s,
 +                    d-domain, d-bus, d-dev, d-func, name);
 +}
 +
 +static int get_resource(HostPCIDevice *d)
 +{
 +    int i;
 +    FILE *f;
 +    char path[PATH_MAX];
 +    unsigned long long start, end, flags, size;
 +
 +    path_to(d, resource, path, sizeof (path));
 +    f = fopen(path, r);
 +    if (!f) {
 +        fprintf(stderr, Error: Can't open %s: %s\n, path, 
 strerror(errno));
 +        return -1;
 +    }
 +
 +    for (i = 0; i  PCI_NUM_REGIONS; i++) {
 +        if (fscanf(f, %llx %llx %llx, start, end, flags) != 3) {
 +            fprintf(stderr, Error: Syntax error in %s\n, path);
 +            break;
 +        }
 +        if (start) {
 +            size = end - start + 1;
 +        } else {
 +            size = 0;
 +        }
 +
 +        flags = 0xf;

 No magic numbers please.

 It also looks a bit strange to me: It's the resource type encoded in the
 second byte? Aren't you interested in it?

Actually, we are interessted to have the BAR with the different flags
(IO/MEM, prefetch, size) like the value in the config space. Because
the base_address value will be given to the VM (by the function
pt_bar_reg_read). But I can later merge the values (start | (flags 
~pci_base_address_io/mem_mask)), and have the right value to give
back.

So here, I will keep seperate the base address, and the flags.

 +
 +        if (i  PCI_ROM_SLOT) {
 +            d-base_addr[i] = start | flags;
 +            d-size[i] = size;
 +        } else {
 +            d-rom_base_addr = start | flags;
 +            d-rom_size = size;
 +        }
 +    }
 +
 +    fclose(f);
 +    return 0;
 +}
 +
 +static unsigned long get_value(HostPCIDevice *d, const char *name)
 +{
 +    char path[PATH_MAX];
 +    FILE *f;
 +    unsigned long value;
 +
 +    path_to(d, name, path, sizeof (path));
 +    f = fopen(path, r);
 +    if (!f) {
 +        fprintf(stderr, Error: Can't open %s: %s\n, path, 
 strerror(errno));
 +        return -1;
 +    }
 +    if (fscanf(f, %lx\n, value) != 1) {
 +        fprintf(stderr, Error: Syntax error in %s\n, path);
 +        value = -1;
 +    }
 +    fclose(f);
 +    return value;
 +}
 +
 +static int pci_dev_is_virtfn(HostPCIDevice *d)
 +{
 +    int rc;
 +    char path[PATH_MAX];
 +    struct stat buf;
 +
 +    path_to(d, physfn, path, sizeof (path));
 +    rc = !stat(path, buf);
 +
 +    return rc;
 +}
 +
 +static int host_pci_config_fd(HostPCIDevice *d)

 [ We will also need the reverse: pass in open file descriptors that
 HostPCIDevice should use. Can be added later. ]

 +{
 +    char path[PATH_MAX];
 +
 +    if (d-config_fd  0) {
 +        path_to(d, config, path, sizeof (path));
 +        d-config_fd = open(path, O_RDWR);
 +        if (d-config_fd  0) {
 +            fprintf(stderr, HostPCIDevice: Can not open '%s': %s\n,
 +                    path, strerror(errno));
 +        }
 +    }
 +    return d-config_fd;
 +}
 +static int host_pci_config_read(HostPCIDevice *d, int pos, void *buf, int 
 len)
 +{
 +    int fd = host_pci_config_fd(d);
 +    int res = 0;
 +
 +    res = pread(fd, buf, len, pos);
 +    if (res  0) {
 +        fprintf(stderr, host_pci_config: read failed: %s (fd: %i)\n,
 +                strerror(errno), fd);
 +        return -1;
 +    }
 +    return res;
 +}
 +static int host_pci_config_write(HostPCIDevice *d,
 +                                 int pos, const void *buf, int len)
 +{
 +    int fd = host_pci_config_fd(d);
 +    int res = 0;
 +
 +    res = pwrite(fd, buf, len, pos);
 +    if (res  0) {
 +        fprintf(stderr, host_pci_config: write failed: %s\n,
 +                strerror(errno));
 +        return -1;
 +    }
 +    return res;
 +}
 +
 +uint8_t host_pci_read_byte(HostPCIDevice *d, int pos)
 +{
 +  uint8_t buf;
 +  host_pci_config_read(d, pos, buf, 1);
 +  return buf;
 +}
 +uint16_t host_pci_read_word(HostPCIDevice *d, int pos)
 +{
 +  uint16_t buf;
 +  host_pci_config_read(d, pos, buf, 2);
 +  return le16_to_cpu(buf);
 +}
 +uint32_t host_pci_read_long(HostPCIDevice *d, int pos)
 +{
 +  uint32_t buf;
 +  host_pci_config_read(d, pos, buf, 4);
 +  return le32_to_cpu(buf);
 +}
 +int 

[Qemu-devel] [PATCH 0/6] trace: Add support for trace events grouping

2011-10-12 Thread Mark Wu
This series add support for trace events grouping. The state of a given group
of trace events can be queried or changed in bulk by the following monitor
commands:

* info trace-groups
  View available trace event groups and their state.  State 1 means enabled,
  state 0 means disabled.

* trace-group NAME on|off
  Enable/disable a given trace event group.

A group of trace events can also be enabled in early running stage through
adding its group name prefixed with group: to trace events list file
which is passed to -trace events.

Mark Wu (6):
  trace: Make tracetool generate a group list
  trace: Add HMP monitor commands for trace events group
  trace: Add trace events group implementation in the backend simple
  trace: Add trace events group implementation in the backend stderr
  trace: Enable -trace events argument to control initial state of
groups
  trace: Update doc for trace events group

 docs/tracing.txt  |   29 ++--
 hmp-commands.hx   |   14 
 monitor.c |   22 
 scripts/tracetool |   94 +++-
 trace-events  |   88 +
 trace/control.c   |   17 +
 trace/control.h   |9 +
 trace/default.c   |   15 
 trace/simple.c|   30 +
 trace/simple.h|7 
 trace/stderr.c|   32 ++
 trace/stderr.h|7 
 12 files changed, 359 insertions(+), 5 deletions(-)




[Qemu-devel] [PATCH 2/6] trace: Add HMP monitor commands for trace events group

2011-10-12 Thread Mark Wu
Add monitor commands 'trace-group NAME on|off' and 'info trace-groups'
to set and query the state of a given group of trace events.

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 hmp-commands.hx |   14 ++
 monitor.c   |   22 ++
 trace/control.h |9 +
 trace/default.c |   15 +++
 4 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 9e1cca8..b415616 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -194,6 +194,20 @@ STEXI
 changes status of a trace event
 ETEXI
 
+{
+.name   = trace-group,
+.args_type  = name:s,option:b,
+.params = name on|off,
+.help   = changes status of a specific trace event,
+.mhandler.cmd = do_trace_event_group_set_state,
+},
+
+STEXI
+@item trace-group
+@findex trace-group
+changes status of a group of trace events
+ETEXI
+
 #if defined(CONFIG_SIMPLE_TRACE)
 {
 .name   = trace-file,
diff --git a/monitor.c b/monitor.c
index 88d8228..0b8ca09 100644
--- a/monitor.c
+++ b/monitor.c
@@ -605,6 +605,17 @@ static void do_trace_event_set_state(Monitor *mon, const 
QDict *qdict)
 }
 }
 
+static void do_trace_event_group_set_state(Monitor *mon, const QDict *qdict)
+{
+const char *gp_name = qdict_get_str(qdict, name);
+bool new_state = qdict_get_bool(qdict, option);
+int ret = trace_event_group_set_state(gp_name, new_state);
+
+if (!ret) {
+monitor_printf(mon, unknown group name \%s\\n, gp_name);
+}
+}
+
 #ifdef CONFIG_SIMPLE_TRACE
 static void do_trace_file(Monitor *mon, const QDict *qdict)
 {
@@ -1010,6 +1021,10 @@ static void do_trace_print_events(Monitor *mon)
 trace_print_events((FILE *)mon, monitor_fprintf);
 }
 
+static void do_trace_print_groups(Monitor *mon)
+{
+trace_print_groups((FILE *)mon, monitor_fprintf);
+}
 /**
  * do_quit(): Quit QEMU execution
  */
@@ -3170,6 +3185,13 @@ static const mon_cmd_t info_cmds[] = {
 .mhandler.info = do_trace_print_events,
 },
 {
+.name   = trace-groups,
+.args_type  = ,
+.params = ,
+.help   = show available trace-groups  their state,
+.mhandler.info = do_trace_print_groups,
+},
+{
 .name   = NULL,
 },
 };
diff --git a/trace/control.h b/trace/control.h
index 2acaa42..97ecce7 100644
--- a/trace/control.h
+++ b/trace/control.h
@@ -15,12 +15,21 @@
 
 /** Print the state of all events. */
 void trace_print_events(FILE *stream, fprintf_function stream_printf);
+
+/** Print the state of all groups. */
+void trace_print_groups(FILE *stream, fprintf_function stream_printf);
+
 /** Set the state of an event.
  *
  * @return Whether the state changed.
  */
 bool trace_event_set_state(const char *name, bool state);
 
+/** Set the state of a group of  events.
+ *
+ * @return Whether the state changed.
+ */
+bool trace_event_group_set_state(const char *name, bool state);
 
 /** Initialize the tracing backend.
  *
diff --git a/trace/default.c b/trace/default.c
index c9b27a2..c7e70c7 100644
--- a/trace/default.c
+++ b/trace/default.c
@@ -18,6 +18,14 @@ void trace_print_events(FILE *stream, fprintf_function 
stream_printf)
   operation not supported with the current backend\n);
 }
 
+void trace_print_groups(FILE *stream, fprintf_function stream_printf)
+{
+fprintf(stderr, warning: 
+cannot print the trace groups with the current backend\n);
+stream_printf(stream, error: 
+  operation not supported with the current backend\n);
+}
+
 bool trace_event_set_state(const char *name, bool state)
 {
 fprintf(stderr, warning: 
@@ -25,6 +33,13 @@ bool trace_event_set_state(const char *name, bool state)
 return false;
 }
 
+bool trace_event_group_set_state(const char *gp_name, bool state)
+{
+fprintf(stderr, warning: 
+cannot set the state of a trace group with the current 
backend\n);
+return false;
+}
+
 bool trace_backend_init(const char *events, const char *file)
 {
 if (events) {
-- 
1.7.1




[Qemu-devel] [PATCH 1/6] trace: Make tracetool generate a group list

2011-10-12 Thread Mark Wu
Each trace events group starts with a line containing group_start:
GroupName and end with a line containing group_end. The range of
a trace events group is determined by the tracetool script when it
processes the trace-events file.

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 scripts/tracetool |   94 +++-
 trace-events  |   88 +
 2 files changed, 180 insertions(+), 2 deletions(-)

diff --git a/scripts/tracetool b/scripts/tracetool
index 4c9951d..3b4ca41 100755
--- a/scripts/tracetool
+++ b/scripts/tracetool
@@ -166,6 +166,82 @@ linetoc_end_nop()
 return
 }
 
+linetoh_begin_group()
+{
+group_num=0
+}
+
+linetoh_end_group()
+{
+cat EOF
+#define NR_TRACE_EVENT_GROUPS $group_num
+extern TraceEventGroup trace_group_list[NR_TRACE_EVENT_GROUPS];
+EOF
+}
+
+linetoc_begin_group()
+{
+cat EOF trace-groups
+
+TraceEventGroup trace_group_list[] = {
+
+EOF
+group_num=0
+}
+
+linetoc_end_group()
+{
+cat EOF trace-groups
+};
+EOF
+cat  trace-groups
+rm -f trace-groups
+}
+
+linetoc_group()
+{
+if echo $str|grep -q group_start; then
+gp_name=${1##*group_start:}
+   if ! test -z $gp_name; then
+start=$((${backend}_event_num))
+   fi
+elif echo $str|grep -q group_end; then
+   stop=$((${backend}_event_num - 1))
+   cat EOF trace-groups
+{.gp_name = $gp_name, .state = 0, .start = $start, .end = $stop},
+
+EOF
+   group_num=$((group_num + 1))
+fi
+}
+
+linetoc_group_simple()
+{
+linetoc_group $1
+}
+
+linetoc_group_stderr()
+{
+linetoc_group $1
+}
+
+linetoh_group()
+{
+if echo $str|grep -q group_end; then
+group_num=$((group_num + 1))
+fi
+}
+
+linetoh_group_simple()
+{
+linetoh_group
+}
+
+linetoh_group_stderr()
+{
+linetoh_group
+}
+
 linetoh_begin_simple()
 {
 cat EOF
@@ -173,8 +249,10 @@ linetoh_begin_simple()
 EOF
 
 simple_event_num=0
+linetoh_begin_group
 }
 
+
 cast_args_to_uint64_t()
 {
 local arg
@@ -206,12 +284,14 @@ EOF
 simple_event_num=$((simple_event_num + 1))
 }
 
+
 linetoh_end_simple()
 {
 cat EOF
 #define NR_TRACE_EVENTS $simple_event_num
 extern TraceEvent trace_list[NR_TRACE_EVENTS];
 EOF
+linetoh_end_group
 }
 
 linetoc_begin_simple()
@@ -222,7 +302,7 @@ linetoc_begin_simple()
 TraceEvent trace_list[] = {
 EOF
 simple_event_num=0
-
+linetoc_begin_group
 }
 
 linetoc_simple()
@@ -240,6 +320,7 @@ linetoc_end_simple()
 cat EOF
 };
 EOF
+linetoc_end_group
 }
 
 #STDERR
@@ -285,6 +366,7 @@ linetoh_end_stderr()
 cat EOF
 #define NR_TRACE_EVENTS $stderr_event_num
 EOF
+linetoh_end_group
 }
 
 linetoc_begin_stderr()
@@ -295,6 +377,7 @@ linetoc_begin_stderr()
 TraceEvent trace_list[] = {
 EOF
 stderr_event_num=0
+linetoc_begin_group
 }
 
 linetoc_stderr()
@@ -312,6 +395,7 @@ linetoc_end_stderr()
 cat EOF
 };
 EOF
+linetoc_end_group
 }
 #END OF STDERR
 
@@ -523,12 +607,18 @@ convert()
 begin=lineto$1_begin_$backend
 process_line=lineto$1_$backend
 end=lineto$1_end_$backend
+group=lineto$1_group_$backend
 
 $begin
 
 while read -r str; do
 # Skip comments and empty lines
-test -z ${str%%#*}  continue
+if test -z ${str%%#*}; then
+   if [ $backend == simple -o $backend == stderr ]  [ ! 
`echo $str|grep -q ^# group_` ]; then
+   $group $str
+   fi
+   continue
+   fi
 
 echo
 # Process the line.  The nop backend handles disabled lines.
diff --git a/trace-events b/trace-events
index a31d9aa..cbd4b58 100644
--- a/trace-events
+++ b/trace-events
@@ -26,6 +26,7 @@
 # The format-string should be a sprintf()-compatible format string.
 
 # qemu-malloc.c
+# group_start:qemu-memory
 g_malloc(size_t size, void *ptr) size %zu ptr %p
 g_realloc(void *ptr, size_t size, void *newptr) ptr %p size %zu newptr %p
 g_free(void *ptr) ptr %p
@@ -34,8 +35,10 @@ g_free(void *ptr) ptr %p
 qemu_memalign(size_t alignment, size_t size, void *ptr) alignment %zu size 
%zu ptr %p
 qemu_vmalloc(size_t size, void *ptr) size %zu ptr %p
 qemu_vfree(void *ptr) ptr %p
+# group_end
 
 # hw/virtio.c
+# group_start:virtio
 virtqueue_fill(void *vq, const void *elem, unsigned int len, unsigned int idx) 
vq %p elem %p len %u idx %u
 virtqueue_flush(void *vq, unsigned int count) vq %p count %u
 virtqueue_pop(void *vq, void *elem, unsigned int in_num, unsigned int out_num) 
vq %p elem %p in_num %u out_num %u
@@ -43,6 +46,7 @@ virtio_queue_notify(void *vdev, int n, void *vq) vdev %p n 
%d vq %p
 virtio_irq(void *vq) vq %p
 virtio_notify(void *vdev, void *vq) vdev %p vq %p
 virtio_set_status(void *vdev, uint8_t val) vdev %p val %u
+# group_end
 
 # hw/virtio-serial-bus.c
 virtio_serial_send_control_event(unsigned int port, uint16_t event, uint16_t 
value) port %u, event %u, value %u
@@ -51,11 +55,14 @@ virtio_serial_handle_control_message(uint16_t event, 
uint16_t value) 

[Qemu-devel] [PATCH 3/6] trace: Add trace events group implementation in the backend simple

2011-10-12 Thread Mark Wu

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 trace/simple.c |   30 ++
 trace/simple.h |7 +++
 2 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/trace/simple.c b/trace/simple.c
index b639dda..7aa4c0b 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -321,6 +321,16 @@ void trace_print_events(FILE *stream, fprintf_function 
stream_printf)
 }
 }
 
+void trace_print_groups(FILE *stream, fprintf_function stream_printf)
+{
+unsigned int i;
+
+for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
+stream_printf(stream, %s [GROUP ID %u] : state %u\n,
+  trace_group_list[i].gp_name, i,
+ trace_group_list[i].state);
+}
+}
 bool trace_event_set_state(const char *name, bool state)
 {
 unsigned int i;
@@ -334,6 +344,26 @@ bool trace_event_set_state(const char *name, bool state)
 return false;
 }
 
+bool trace_event_group_set_state(const char *gp_name, bool state)
+{
+unsigned int i;
+unsigned int j;
+TraceEventGroup *group;
+
+for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
+   group = trace_group_list[i]; 
+if (!strcmp(group-gp_name, gp_name)) {
+group-state = state;
+
+   for (j = group-start; j = group-end; j++) {
+trace_list[j].state = state;
+}
+return true;
+}
+}
+return false;
+}
+
 /* Helper function to create a thread with signals blocked.  Use glib's
  * portable threads since QEMU abstractions cannot be used due to reentrancy in
  * the tracer.  Also note the signal masking on POSIX hosts so that the thread
diff --git a/trace/simple.h b/trace/simple.h
index 466e75b..cf119f3 100644
--- a/trace/simple.h
+++ b/trace/simple.h
@@ -22,6 +22,13 @@ typedef struct {
 bool state;
 } TraceEvent;
 
+typedef struct {
+const char *gp_name;
+bool state;
+int start;
+int end;
+} TraceEventGroup;
+
 void trace0(TraceEventID event);
 void trace1(TraceEventID event, uint64_t x1);
 void trace2(TraceEventID event, uint64_t x1, uint64_t x2);
-- 
1.7.1




[Qemu-devel] [PATCH 5/6] trace: Enable -trace events argument to control initial state of groups

2011-10-12 Thread Mark Wu
A group of trace events can be enabled in early running stage through
adding its group name prefixed with group: to trace events list file
which is passed to -trace events.

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 trace/control.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/trace/control.c b/trace/control.c
index 4c5527d..5043c83 100644
--- a/trace/control.c
+++ b/trace/control.c
@@ -23,10 +23,27 @@ void trace_backend_init_events(const char *fname)
 exit(1);
 }
 char line_buf[1024];
+char *group;
+
 while (fgets(line_buf, sizeof(line_buf), fp)) {
 size_t len = strlen(line_buf);
 if (len  1) {  /* skip empty lines */
 line_buf[len - 1] = '\0';
+   group = strstr(line_buf, group:);
+   if (group != NULL) {
+   group += strlen(group:);
+if (group == NULL) {
+   fprintf(stderr, error: empty group name\n);
+   exit(1);
+}
+if (!trace_event_group_set_state(group, true)) {
+fprintf(stderr, error: trace event group '%s'
+does not exist\n, group);
+exit(1);
+}
+continue;
+}
+
 if (!trace_event_set_state(line_buf, true)) {
 fprintf(stderr,
 error: trace event '%s' does not exist\n, line_buf);
-- 
1.7.1




[Qemu-devel] [PATCH 6/6] trace: Update doc for trace events group

2011-10-12 Thread Mark Wu

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 docs/tracing.txt |   29 ++---
 1 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/docs/tracing.txt b/docs/tracing.txt
index 95ca16c..2bd4824 100644
--- a/docs/tracing.txt
+++ b/docs/tracing.txt
@@ -16,6 +16,7 @@ for debugging, profiling, and observing execution.
 
echo bdrv_aio_readv/tmp/events
echo bdrv_aio_writev  /tmp/events
+   echo group:virtio  /tmp/events
 
 3. Run the virtual machine to produce a trace file:
 
@@ -53,6 +54,16 @@ source code like this:
 return ptr;
 }
 
+== Trace events group ==
+
+Trace events group is used to represent a set of trace events added for the 
same
+component or feature. Each trace events group starts with a line containing
+group_start:GroupName and end with a line containing group_end. The range 
of
+a trace events group is determined by the tracetool script when it processes
+the trace-events file. You can change the state of a group of trace events at
+one time through trace events group.
+
+
 === Declaring trace events ===
 
 The tracetool script produces the trace.h header file which is included by
@@ -106,8 +117,8 @@ respectively.  This ensures portability between 32- and 
64-bit platforms.
 
 == Generic interface and monitor commands ==
 
-You can programmatically query and control the dynamic state of trace events
-through a backend-agnostic interface:
+You can programmatically query and control the dynamic state of trace events or
+groups through a backend-agnostic interface:
 
 * trace_print_events
 
@@ -122,6 +133,11 @@ through a backend-agnostic interface:
 [...]
 trace_event_set_state(virtio_irq, false); /* disable */
 
+* trace_print_groups
+
+* trace_event_group_set_state
+  Enables or disables a given group of trace events at runtime inside QEMU.
+
 Note that some of the backends do not provide an implementation for this
 interface, in which case QEMU will just print a warning.
 
@@ -131,12 +147,19 @@ This functionality is also provided through monitor 
commands:
   View available trace events and their state.  State 1 means enabled, state 0
   means disabled.
 
+* info trace-groups
+  View available trace event groups and their state.  State 1 means enabled, 
state 0
+  means disabled.
+
 * trace-event NAME on|off
   Enable/disable a given trace event.
 
+* trace-group NAME on|off
+  Enable/disable a given trace event group.
+
 The -trace events=file command line argument can be used to enable the
 events listed in file from the very beginning of the program. This file must
-contain one event name per line.
+contain one event name or one group name, which is prefixed by group:.
 
 == Trace backends ==
 
-- 
1.7.1




[Qemu-devel] [PATCHv2] ps2: migrate ledstate

2011-10-12 Thread Christophe Fergeau
Make the ps2 device track its ledstate so that we can migrate it.
Otherwise it gets lost across migration, and spice-server gets
confused about the actual keyboard state and sends bogus
caps/scroll/num key events. This fixes RH bug #729294

Signed-off-by: Christophe Fergeau cferg...@redhat.com
---
 hw/ps2.c |   15 ---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/ps2.c b/hw/ps2.c
index 24228c1..3a88681 100644
--- a/hw/ps2.c
+++ b/hw/ps2.c
@@ -92,6 +92,7 @@ typedef struct {
not the keyboard controller.  */
 int translate;
 int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */
+int ledstate;
 } PS2KbdState;
 
 typedef struct {
@@ -195,11 +196,17 @@ uint32_t ps2_read_data(void *opaque)
 return val;
 }
 
+static void ps2_set_ledstate(PS2KbdState *s, int ledstate)
+{
+s-ledstate = ledstate;
+kbd_put_ledstate(ledstate);
+}
+
 static void ps2_reset_keyboard(PS2KbdState *s)
 {
 s-scan_enabled = 1;
 s-scancode_set = 2;
-kbd_put_ledstate(0);
+ps2_set_ledstate(s, 0);
 }
 
 void ps2_write_keyboard(void *opaque, int val)
@@ -274,7 +281,7 @@ void ps2_write_keyboard(void *opaque, int val)
 s-common.write_cmd = -1;
 break;
 case KBD_CMD_SET_LEDS:
-kbd_put_ledstate(val);
+ps2_set_ledstate(s, val);
 ps2_queue(s-common, KBD_REPLY_ACK);
 s-common.write_cmd = -1;
 break;
@@ -544,7 +551,7 @@ static void ps2_mouse_reset(void *opaque)
 
 static const VMStateDescription vmstate_ps2_common = {
 .name = PS2 Common State,
-.version_id = 3,
+.version_id = 4,
 .minimum_version_id = 2,
 .minimum_version_id_old = 2,
 .fields  = (VMStateField []) {
@@ -563,6 +570,7 @@ static int ps2_kbd_post_load(void* opaque, int version_id)
 
 if (version_id == 2)
 s-scancode_set=2;
+kbd_put_ledstate(s-ledstate);
 return 0;
 }
 
@@ -577,6 +585,7 @@ static const VMStateDescription vmstate_ps2_keyboard = {
 VMSTATE_INT32(scan_enabled, PS2KbdState),
 VMSTATE_INT32(translate, PS2KbdState),
 VMSTATE_INT32_V(scancode_set, PS2KbdState,3),
+VMSTATE_INT32(ledstate, PS2KbdState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.6.4




[Qemu-devel] [PATCH 4/6] trace: Add trace events group implementation in the backend stderr

2011-10-12 Thread Mark Wu

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
 trace/stderr.c |   32 
 trace/stderr.h |7 +++
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/trace/stderr.c b/trace/stderr.c
index 7107c4a..c409840 100644
--- a/trace/stderr.c
+++ b/trace/stderr.c
@@ -12,6 +12,17 @@ void trace_print_events(FILE *stream, fprintf_function 
stream_printf)
 }
 }
 
+void trace_print_groups(FILE *stream, fprintf_function stream_printf)
+{
+unsigned int i;
+
+for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
+stream_printf(stream, %s [GROUP ID %u] : state %u\n,
+  trace_group_list[i].gp_name, i,
+  trace_group_list[i].state);
+}
+}
+
 bool trace_event_set_state(const char *name, bool state)
 {
 unsigned int i;
@@ -25,6 +36,27 @@ bool trace_event_set_state(const char *name, bool state)
 return false;
 }
 
+bool trace_event_group_set_state(const char *gp_name, bool state)
+{
+unsigned int i;
+unsigned int j;
+TraceEventGroup *group;
+
+for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
+
+   group = trace_group_list[i];
+if (!strcmp(group-gp_name, gp_name)) {
+group-state = state;
+
+for (j = group-start; j = group-end; j++) {
+trace_list[j].state = state;
+}
+return true;
+}
+}
+return false;
+}
+
 bool trace_backend_init(const char *events, const char *file)
 {
 if (file) {
diff --git a/trace/stderr.h b/trace/stderr.h
index d575b61..45499f6 100644
--- a/trace/stderr.h
+++ b/trace/stderr.h
@@ -8,4 +8,11 @@ typedef struct {
 bool state;
 } TraceEvent;
 
+typedef struct {
+const char *gp_name;
+bool state;
+int start;
+int end;
+} TraceEventGroup;
+
 #endif /* ! TRACE_STDERR_H */
-- 
1.7.1




[Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Stefan Weil

Hello Avi,

commit 36b58628 increased the alignment for some large memory
blocks (typically the system RAM) to 2 MiB (QEMU_VMALLOC_ALIGN)
on x86_64 Linux hosts.

As far as I know, this was only required for KVM.

There is a bad side effect of this increase: the Valgrind tool
only supports an alignment of up to 1 MiB. It aborts execution
with current QEMU for any target (even non-KVM targets).

It might be possible to modify Valgrind (as far as I know this
is already discussed), and of course I can also patch my local
QEMU. Nevertheless, I think the alignment should be reduced
again when there is no KVM support or KVM is disabled.
Maybe the large alignment has other unwanted side effects.

The code is in oslib-posix.c (target independent) and needs
something like kvm_enabled() (currently a macro). What would
you suggest? Maybe you can provide a patch.

Kind regards,
Stefan




Re: [Qemu-devel] Is realview-pb-a8 fully supported ?

2011-10-12 Thread Francis Moreau
On Wed, Oct 12, 2011 at 11:23 AM, Peter Maydell
peter.mayd...@linaro.org wrote:
 On 10 October 2011 14:48, Francis Moreau francis.m...@gmail.com wrote:
 On Mon, Oct 10, 2011 at 10:42 AM, Peter Maydell
 peter.mayd...@linaro.org wrote:
 On 10 October 2011 08:35, Francis Moreau francis.m...@gmail.com wrote:
 I noticed another point for the realview platofrm: if I boot with -M
 512, it works however if I set -M 256 then it doesn't.

 Perhaps your kernel is configured to load in the higher 256MB
 address range

 hmm which options do you have in mind ?

 Hmm, I thought there was an option for this but I can't find it
 in the config, so I must have been misremembering somehow.

 When I say it doesn't work, it means that nothing happen when
 starting qemu: no trace, it looks like it's running an infinite loop.

 Not even Uncompressing the kernel ?

yes uncompressing the kernel works. Below is what I'm seeing exactly:

Uncompressing Linux... done, booting the kernel.


 If you want to track down what's going on then you'll need to
 connect an ARM gdb up to qemu and single step through the boot
 process, I'm afraid.


Ok but that means that I understand the code I'm going to step through :-/

 BTW I'm wondering which kernel source I should use to build kernels
 for such plateforms (realview, vexpress, versatile) ? I'm currently
 using the source from kernel.org (well similar since this server seems
 really dead). but I'm not sure if it's a good idea...

 I think the mainline kernel sources should in theory work
 (in particular if they work with 512MB then that's a good
 sign...) but I'm not a kernel expert; mostly I use other peoples'
 prebuilt ones.

ok, but wouldn't you recommend to use the kernel from Linaro ? ;)

Thanks
-- 
Francis



Re: [Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Alexander Graf

On 12.10.2011, at 20:05, Stefan Weil wrote:

 Hello Avi,
 
 commit 36b58628 increased the alignment for some large memory
 blocks (typically the system RAM) to 2 MiB (QEMU_VMALLOC_ALIGN)
 on x86_64 Linux hosts.
 
 As far as I know, this was only required for KVM.
 
 There is a bad side effect of this increase: the Valgrind tool
 only supports an alignment of up to 1 MiB. It aborts execution
 with current QEMU for any target (even non-KVM targets).
 
 It might be possible to modify Valgrind (as far as I know this
 is already discussed), and of course I can also patch my local
 QEMU. Nevertheless, I think the alignment should be reduced
 again when there is no KVM support or KVM is disabled.
 Maybe the large alignment has other unwanted side effects.

Actually, I'd much rather prefer to keep the differences between KVM and 
non-KVM low here. THP can potentially also work on TCG, so the alignment isn't 
completely moot here. Though it's certainly a lot less useful, as code isn't 
directly executed from there and we the rest of the overhead is a lot higher 
either way (especially the softmmu one).

Either way, why does valgrind break when we enforce big alignment? That really 
sounds more like a valgrind bug than anything else. Memalign is there for 
exactly that reason, no?


Alex




Re: [Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Stefan Weil

Am 12.10.2011 22:02, schrieb Alexander Graf:
Actually, I'd much rather prefer to keep the differences between KVM 
and non-KVM low here. THP can potentially also work on TCG, so the 
alignment isn't completely moot here. Though it's certainly a lot less 
useful, as code isn't directly executed from there and we the rest of 
the overhead is a lot higher either way (especially the softmmu one).

Either way, why does valgrind break when we enforce big alignment? That really 
sounds more like a valgrind bug than anything else. Memalign is there for 
exactly that reason, no?


Alex


Actually, there is even a difference between KVM (x86_64) and KVM (non 
x86_64)

in the current code: only x86_64 hosts use the 2 MiB alignment.

Valgrind breaks because it has an assertion which limits the alignment.
This limitation was already discussed in 2008 and still exists in latest
Ubuntu and other distributions (and also in latest Valgrind SVN trunk).

Therefore I don't expect that it will be fixed soon.

See these bug reports, for example:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=489297
http://bugs.kde.org/show_bug.cgi?id=203877

Cheers,
Stefan




Re: [Qemu-devel] Is realview-pb-a8 fully supported ?

2011-10-12 Thread Peter Maydell
On 12 October 2011 20:39, Francis Moreau francis.m...@gmail.com wrote:
 On Wed, Oct 12, 2011 at 11:23 AM, Peter Maydell
 peter.mayd...@linaro.org wrote:
 I think the mainline kernel sources should in theory work
 (in particular if they work with 512MB then that's a good
 sign...) but I'm not a kernel expert; mostly I use other peoples'
 prebuilt ones.

 ok, but wouldn't you recommend to use the kernel from Linaro ? ;)

Obviously those work too (and in practice they're the ones
I test most often with). Linaro kernels will certainly have
been tested with Versatile Express, but probably not with the
other older ARM devboards. But if you don't need to be on the
bleeding edge for anything then I don't think it makes much
difference whether you use the Linaro kernel or mainline for
an established platform like the ARM devboards.
(more info on the aims of the linaro kernel tree here:
https://wiki.linaro.org/WorkingGroups/Kernel/KernelTree
)

-- PMM



Re: [Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Alexander Graf

On 12.10.2011, at 22:41, Stefan Weil wrote:

 Am 12.10.2011 22:02, schrieb Alexander Graf:
 Actually, I'd much rather prefer to keep the differences between KVM and 
 non-KVM low here. THP can potentially also work on TCG, so the alignment 
 isn't completely moot here. Though it's certainly a lot less useful, as code 
 isn't directly executed from there and we the rest of the overhead is a lot 
 higher either way (especially the softmmu one).
 Either way, why does valgrind break when we enforce big alignment? That 
 really sounds more like a valgrind bug than anything else. Memalign is there 
 for exactly that reason, no?
 
 
 Alex
 
 Actually, there is even a difference between KVM (x86_64) and KVM (non x86_64)
 in the current code: only x86_64 hosts use the 2 MiB alignment.

Right. It might make sense to find a reasonable alignment for all archs and 
just set it to that. I vote for 16MB :).

 Valgrind breaks because it has an assertion which limits the alignment.
 This limitation was already discussed in 2008 and still exists in latest
 Ubuntu and other distributions (and also in latest Valgrind SVN trunk).
 
 Therefore I don't expect that it will be fixed soon.
 
 See these bug reports, for example:
 
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=489297
 http://bugs.kde.org/show_bug.cgi?id=203877

Well, yes, my point is that it's a bug in valgrind that should be fixed. I 
don't think we should special-case QEMU because of bugs in debugging software :)


Alex




Re: [Qemu-devel] [PATCH 3/6] trace: Add trace events group implementation in the backend simple

2011-10-12 Thread Ryan Harper
* Mark Wu wu...@linux.vnet.ibm.com [2011-10-12 12:26]:
 
 Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
 ---
  trace/simple.c |   30 ++
  trace/simple.h |7 +++
  2 files changed, 37 insertions(+), 0 deletions(-)
 
 diff --git a/trace/simple.c b/trace/simple.c
 index b639dda..7aa4c0b 100644
 --- a/trace/simple.c
 +++ b/trace/simple.c
 @@ -321,6 +321,16 @@ void trace_print_events(FILE *stream, fprintf_function 
 stream_printf)
  }
  }
 
 +void trace_print_groups(FILE *stream, fprintf_function stream_printf)
 +{
 +unsigned int i;
 +
 +for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
 +stream_printf(stream, %s [GROUP ID %u] : state %u\n,
 +  trace_group_list[i].gp_name, i,
 +   trace_group_list[i].state);

You've got a mix of space and tabs in this file.  Remove the tabs and
adjust the spacing to CODING_STYLE rules.

 +}
 +}
  bool trace_event_set_state(const char *name, bool state)
  {
  unsigned int i;
 @@ -334,6 +344,26 @@ bool trace_event_set_state(const char *name, bool state)
  return false;
  }
 
 +bool trace_event_group_set_state(const char *gp_name, bool state)
 +{
 +unsigned int i;
 +unsigned int j;
 +TraceEventGroup *group;
 +
 +for (i = 0; i  NR_TRACE_EVENT_GROUPS; i++) {
 + group = trace_group_list[i]; 

here

 +if (!strcmp(group-gp_name, gp_name)) {
 +group-state = state;
 +
 + for (j = group-start; j = group-end; j++) {

here

 +trace_list[j].state = state;
 +}
 +return true;
 +}
 +}
 +return false;
 +}
 +
  /* Helper function to create a thread with signals blocked.  Use glib's
   * portable threads since QEMU abstractions cannot be used due to reentrancy 
 in
   * the tracer.  Also note the signal masking on POSIX hosts so that the 
 thread
 diff --git a/trace/simple.h b/trace/simple.h
 index 466e75b..cf119f3 100644
 --- a/trace/simple.h
 +++ b/trace/simple.h
 @@ -22,6 +22,13 @@ typedef struct {
  bool state;
  } TraceEvent;
 
 +typedef struct {
 +const char *gp_name;
 +bool state;
 +int start;
 +int end;
 +} TraceEventGroup;
 +
  void trace0(TraceEventID event);
  void trace1(TraceEventID event, uint64_t x1);
  void trace2(TraceEventID event, uint64_t x1, uint64_t x2);
 -- 
 1.7.1
 

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com




[Qemu-devel] Buggy SDL Zoom

2011-10-12 Thread Stefan Weil

Hi,

the SDL zoom feature which is implemented in sdl_zoom_template.h
(and the SDL_rotozoom version which it is based on) accesses memory
beyond the allocated limits.

This can be easily reproduced using Valgrind and some Linux desktop
which resizes QEMU's window to fill the whole screen (I did run the tests
on an Ubuntu netbook).

Another effect can be observed by repeatedly increasing the zoom factor
with the Alt-Ctrl-+: the image grows up to a certain value and then
collapses again.

It looks like other programs using SDL_rotozoom also discovered
out-of-bound problems, and in newer versions, the SDL_rotozoom
code was totally rewritten.

For security reasons, I suggest disabling the zoom feature until
either the current code is replaced by a (tested) newer version
of SDL_rotozoom or fixed.

Cheers,
Stefan Weil




Re: [Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Stefan Weil

Am 12.10.2011 22:47, schrieb Alexander Graf:
Well, yes, my point is that it's a bug in valgrind that should be 
fixed. I don't think we should special-case QEMU because of bugs in 
debugging software :)


Alex


Yes, the valgrind bug should be fixed. I don't know why it isn't,
but that's not the point:

Valgrind is very valuable for finding certain kinds of bugs in QEMU
which we want to find and fix (I hope we agree on this).
Therefore developers should be able to use it.

Today, they cannot use Valgrind with QEMU out-of-the box,
because they get an assertion. Some developers will stop here.
Others will ask Google, look in Valgrind's code and spend some
time to find and fix the problem before they start using
Valgrind to find QEMU bugs.

They could have spent their time better.

I can try to make QEMU more useable with Valgrind by changing
the QEMU code (which was Valgrind compatible up to Avi's change).

I cannot change the Valgrind code, and even if I could, it would
take a lot of time until all Linux distributions would include a
fixed Valgrind :-(

If all existing gdb versions did not work with QEMU,
but there were a simple QEMU change which made them work,
what would you do?

Cheers,
Stefan





Re: [Qemu-devel] 2 MiB alignment in qemu_vmalloc()

2011-10-12 Thread Alexander Graf

On 12.10.2011, at 23:19, Stefan Weil wrote:

 Am 12.10.2011 22:47, schrieb Alexander Graf:
 Well, yes, my point is that it's a bug in valgrind that should be fixed. I 
 don't think we should special-case QEMU because of bugs in debugging 
 software :)
 
 Alex
 
 Yes, the valgrind bug should be fixed. I don't know why it isn't,
 but that's not the point:
 
 Valgrind is very valuable for finding certain kinds of bugs in QEMU
 which we want to find and fix (I hope we agree on this).
 Therefore developers should be able to use it.
 
 Today, they cannot use Valgrind with QEMU out-of-the box,
 because they get an assertion. Some developers will stop here.
 Others will ask Google, look in Valgrind's code and spend some
 time to find and fix the problem before they start using
 Valgrind to find QEMU bugs.
 
 They could have spent their time better.
 
 I can try to make QEMU more useable with Valgrind by changing
 the QEMU code (which was Valgrind compatible up to Avi's change).
 
 I cannot change the Valgrind code, and even if I could, it would
 take a lot of time until all Linux distributions would include a
 fixed Valgrind :-(
 
 If all existing gdb versions did not work with QEMU,
 but there were a simple QEMU change which made them work,
 what would you do?

I would add a command line option to modify the alignment in runtime, 
defaulting it to something sane (16MB), so you can work around bugs in vagrind 
with a simple parameter :)


Alex




[Qemu-devel] [PATCH V4] Add AACI audio playback support to the ARM Versatile/PB platform

2011-10-12 Thread Mathieu Sonet

This driver emulates the ARM AACI interface (PL041) connected to a LM4549 codec.
It enables audio playback for the Versatile/PB platform.

Limitations:
- Supports only a playback on one channel (Versatile/Vexpress)
- Supports only one TX FIFO in compact-mode or non-compact mode.
- Supports playback of 12, 16, 18 and 20 bits samples.
- Record is not supported.
- The PL041 is hardwired to a LM4549 codec.

Versatile/PB test build:
linux-2.6.38.5
buildroot-2010.11
alsa-lib-1.0.22
alsa-utils-1.0.22
mpg123-0.66

Qemu host: Ubuntu 10.04 in Vmware/OS X

Playback tested successfully with speaker-test/aplay/mpg123.

Signed-off-by: Mathieu Sonet cont...@elasticsheep.com
---
v3-v4

* Fix debug message compile error
* Remove Versatile/PB reference in pl041.c
* Use 20-bit samples in the LM4549 API
* Add support for non-compact mode
* Now support 12, 16, 18 and 20 bit sample size
* A FIFO reset is triggered when the AACIFE bit is reset

 Makefile.target  |1 +
 hw/lm4549.c  |  332 
 hw/lm4549.h  |   44 
 hw/pl041.c   |  644 ++
 hw/pl041.h   |  135 
 hw/pl041.hx  |   81 +++
 hw/versatilepb.c |8 +
 7 files changed, 1245 insertions(+), 0 deletions(-)
 create mode 100644 hw/lm4549.c
 create mode 100644 hw/lm4549.h
 create mode 100644 hw/pl041.c
 create mode 100644 hw/pl041.h
 create mode 100644 hw/pl041.hx

diff --git a/Makefile.target b/Makefile.target
index 42adfec..00173cd 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -355,6 +355,7 @@ obj-arm-y += syborg_virtio.o
 obj-arm-y += vexpress.o
 obj-arm-y += strongarm.o
 obj-arm-y += collie.o
+obj-arm-y += pl041.o lm4549.o

 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/lm4549.c b/hw/lm4549.c
new file mode 100644
index 000..883ef60
--- /dev/null
+++ b/hw/lm4549.c
@@ -0,0 +1,332 @@
+/*
+ * LM4549 Audio Codec Interface
+ *
+ * Copyright (c) 2011
+ * Written by Mathieu Sonet - www.elasticsheep.com
+ *
+ * This code is licenced under the GPL.
+ *
+ * *
+ *
+ * This driver emulates the LM4549 codec.
+ *
+ * It supports only one playback voice and no record voice.
+ */
+
+#include hw.h
+#include audio/audio.h
+#include lm4549.h
+
+#if 0
+#define LM4549_DEBUG  1
+#endif
+
+#if 0
+#define LM4549_DUMP_DAC_INPUT 1
+#endif
+
+#ifdef LM4549_DEBUG
+#define DPRINTF(fmt, ...) \
+do { printf(lm4549:  fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#if defined(LM4549_DUMP_DAC_INPUT)
+#include stdio.h
+static FILE *fp_dac_input;
+#endif
+
+/* LM4549 register list */
+enum {
+LM4549_Reset= 0x00,
+LM4549_Master_Volume= 0x02,
+LM4549_Line_Out_Volume  = 0x04,
+LM4549_Master_Volume_Mono   = 0x06,
+LM4549_PC_Beep_Volume   = 0x0A,
+LM4549_Phone_Volume = 0x0C,
+LM4549_Mic_Volume   = 0x0E,
+LM4549_Line_In_Volume   = 0x10,
+LM4549_CD_Volume= 0x12,
+LM4549_Video_Volume = 0x14,
+LM4549_Aux_Volume   = 0x16,
+LM4549_PCM_Out_Volume   = 0x18,
+LM4549_Record_Select= 0x1A,
+LM4549_Record_Gain  = 0x1C,
+LM4549_General_Purpose  = 0x20,
+LM4549_3D_Control   = 0x22,
+LM4549_Powerdown_Ctrl_Stat  = 0x26,
+LM4549_Ext_Audio_ID = 0x28,
+LM4549_Ext_Audio_Stat_Ctrl  = 0x2A,
+LM4549_PCM_Front_DAC_Rate   = 0x2C,
+LM4549_PCM_ADC_Rate = 0x32,
+LM4549_Vendor_ID1   = 0x7C,
+LM4549_Vendor_ID2   = 0x7E
+};
+
+static void lm4549_reset(lm4549_state *s)
+{
+uint16_t *regfile = s-regfile;
+
+regfile[LM4549_Reset]   = 0x0d50;
+regfile[LM4549_Master_Volume]   = 0x8008;
+regfile[LM4549_Line_Out_Volume] = 0x8000;
+regfile[LM4549_Master_Volume_Mono]  = 0x8000;
+regfile[LM4549_PC_Beep_Volume]  = 0x;
+regfile[LM4549_Phone_Volume]= 0x8008;
+regfile[LM4549_Mic_Volume]  = 0x8008;
+regfile[LM4549_Line_In_Volume]  = 0x8808;
+regfile[LM4549_CD_Volume]   = 0x8808;
+regfile[LM4549_Video_Volume]= 0x8808;
+regfile[LM4549_Aux_Volume]  = 0x8808;
+regfile[LM4549_PCM_Out_Volume]  = 0x8808;
+regfile[LM4549_Record_Select]   = 0x;
+regfile[LM4549_Record_Gain] = 0x8000;
+regfile[LM4549_General_Purpose] = 0x;
+regfile[LM4549_3D_Control]  = 0x0101;
+regfile[LM4549_Powerdown_Ctrl_Stat] = 0x000f;
+regfile[LM4549_Ext_Audio_ID]= 0x0001;
+regfile[LM4549_Ext_Audio_Stat_Ctrl] = 0x;
+regfile[LM4549_PCM_Front_DAC_Rate]  = 0xbb80;
+regfile[LM4549_PCM_ADC_Rate]= 0xbb80;
+

Re: [Qemu-devel] [PATCH] usb-hid: activate usb tablet / mouse after migration.

2011-10-12 Thread TeLeMan
On Wed, Oct 12, 2011 at 19:30, Gerd Hoffmann kra...@redhat.com wrote:
 qemu uses the ps/2 mouse by default.  The usb tablet (or mouse) is
 activated as soon as qemu sees some guest activity on the device,
 i.e. polling for HID events.  That used to work fine for both fresh
 boot and migration.
It does not fix usb tablet/mouse when starting vm directly from snapshot.

 Remote wakeup support changed the picture though: There will be no
 polling after migration in case the guest suspended the usb bus,
 waiting for wakeup events.  Result is that the ps/2 mouse stays
 active.

 Fix this by activating the usb tablet / mouse in post_load() in case
 the guest enabled remote wakeup.

 Signed-off-by: Gerd Hoffmann kra...@redhat.com
 ---
  hw/usb-hid.c |   11 +++
  1 files changed, 11 insertions(+), 0 deletions(-)

 diff --git a/hw/usb-hid.c b/hw/usb-hid.c
 index 7c564b6..997f828 100644
 --- a/hw/usb-hid.c
 +++ b/hw/usb-hid.c
 @@ -520,10 +520,21 @@ static int usb_keyboard_initfn(USBDevice *dev)
     return usb_hid_initfn(dev, HID_KEYBOARD);
  }

 +static int usb_ptr_post_load(void *opaque, int version_id)
 +{
 +    USBHIDState *s = opaque;
 +
 +    if (s-dev.remote_wakeup) {
 +        hid_pointer_activate(s-hid);
 +    }
 +    return 0;
 +}
 +
  static const VMStateDescription vmstate_usb_ptr = {
     .name = usb-ptr,
     .version_id = 1,
     .minimum_version_id = 1,
 +    .post_load = usb_ptr_post_load,
     .fields = (VMStateField []) {
         VMSTATE_USB_DEVICE(dev, USBHIDState),
         VMSTATE_HID_POINTER_DEVICE(hid, USBHIDState),
 --
 1.7.1






Re: [Qemu-devel] [PATCH] qemu-char: Fix use of free() instead of g_free()

2011-10-12 Thread Dong Xu Wang
于 10/07/2011 01:38 PM, Stefan Weil 写道:
 cppcheck reported these errors:
 
 qemu-char.c:1667: error: Mismatching allocation and deallocation: s
 qemu-char.c:1668: error: Mismatching allocation and deallocation: chr
 qemu-char.c:1769: error: Mismatching allocation and deallocation: s
 qemu-char.c:1770: error: Mismatching allocation and deallocation: chr
 
 Signed-off-by: Stefan Weils...@weilnetz.de
 ---
   qemu-char.c |8 
   1 files changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/qemu-char.c b/qemu-char.c
 index 09d2309..e1b2b87 100644
 --- a/qemu-char.c
 +++ b/qemu-char.c
 @@ -1664,8 +1664,8 @@ static int qemu_chr_open_win(QemuOpts *opts, 
 CharDriverState **_chr)
   chr-chr_close = win_chr_close;
 
   if (win_chr_init(chr, filename)  0) {
 -free(s);
 -free(chr);
 +g_free(s);
 +g_free(chr);
   return -EIO;
   }
   qemu_chr_generic_open(chr);
 @@ -1766,8 +1766,8 @@ static int qemu_chr_open_win_pipe(QemuOpts *opts, 
 CharDriverState **_chr)
   chr-chr_close = win_chr_close;
 
   if (win_chr_pipe_init(chr, filename)  0) {
 -free(s);
 -free(chr);
 +g_free(s);
 +g_free(chr);
   return -EIO;
   }
   qemu_chr_generic_open(chr);
Tested-by: Dongxu Wang wdon...@linux.vnet.ibm.com



Re: [Qemu-devel] balloon driver on winxp guest start failed

2011-10-12 Thread hkran

On 10/12/2011 07:09 PM, hkran wrote:

Hi,

I used balloon driver for windows  virtio-win-0.1-15.iso (from 
http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/bin/)


following the install guard , I installed the balloon driver like this:

devcon.exe install d:\wxp\x86\balloon.inf 
PCI\VEN_1AF4DEV_1002SUBSYS_00051AF4REV_00
 then reboot guest Os, but the status of driver installed is always 
incorrect, that show me the driver start failed (code 10) in the 
device manager.


I typed the following cmds in the monitor command line:

(qemu) device_add virtio-balloon
(qemu) info balloon
balloon: actual=2048
(qemu) balloon 1024
(qemu) info balloon
balloon: actual=2048
(qemu) info balloon
balloon: actual=2048

And I also tried it by using qemu -balloon virtio param  when 
getting qemu up, the status is worse, the winxp guest froze at boot 
screen.


Am I using balloon driver in a correct way?



For the boot failure case, I take more looks into it.  I open the trace 
output and see the following when boot failed

Balloon driver, built on Oct 13 2011 10:46:59
^M-- DriverEntry
^Mfile z:\source\kvm-guest-drivers-windows\balloon\sys\driver.c line 151
^M-- BalloonDeviceAdd
^M-- BalloonDeviceAdd
^M-- BalloonEvtDevicePrepareHardware
^M- Port   Resource [C0A0-C0C0]
^M-- BalloonEvtDevicePrepareHardware
^M-- BalloonEvtDeviceD0Entry
^M-- BalloonInit
^M-- VIRTIO_BALLOON_F_STATS_VQ
^M-- BalloonInit
^M-- BalloonInterruptEnable
^M-- BalloonInterruptEnable

here, the system is blocked.

I compare it with the logfile in the normal case that I hot-plugin the 
balloon device, and then find the system blocked before calling at 
BalloonInterruptDpc.


Is it meaning that we open the interrupt of balloon device too soon when 
booting the system?