date:20111012

Re: [Qemu-devel] [PATCH v3] Sort the help info shown in monitor at runtime

2011-10-12 Thread Wenyi Gao

On Wed, 2011-10-12 at 11:32 +0800, Wayne Xia wrote:
 This patch sorts the command list in the monitor at runtime. As a result,
 the help and help info commands show a friendlier, sorted command list.
 For example:
 (qemu)help
 acl_add
 acl_policy
 acl_remove
 acl_reset
 acl_show
 balloon
 block_passwd
 ...
 the command list is sorted.
 
 v3: using qsort function to sort the command list.
 
 Signed-off-by: Wayne Xia xiaw...@linux.vnet.ibm.com
 ---
  monitor.c |   30 ++
  1 files changed, 26 insertions(+), 4 deletions(-)
 
 diff --git a/monitor.c b/monitor.c
 index 31b212a..a172167 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -195,8 +195,8 @@ static inline int mon_print_count_get(const Monitor *mon) 
 { return 0; }
 
  static QLIST_HEAD(mon_list, Monitor) mon_list;
 
 -static const mon_cmd_t mon_cmds[];
 -static const mon_cmd_t info_cmds[];
 +static mon_cmd_t mon_cmds[];
 +static mon_cmd_t info_cmds[];
 
  static const mon_cmd_t qmp_cmds[];
  static const mon_cmd_t qmp_query_cmds[];
 @@ -2726,13 +2726,14 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
  return -1;
  }
 
 -static const mon_cmd_t mon_cmds[] = {
 +/* mon_cmds and info_cmds would be sorted at runtime */
 +static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
  { NULL, NULL, },
  };
 
  /* Please update hmp-commands.hx when adding or changing commands */
 -static const mon_cmd_t info_cmds[] = {
 +static mon_cmd_t info_cmds[] = {
  {
 .name   = "version",
 .args_type  = "",
 @@ -5068,6 +5069,25 @@ static void monitor_event(void *opaque, int event)
  }
  }
 
 +static int
 +compare_mon_cmd(const void *a, const void *b)
 +{
 +return strcmp(((const mon_cmd_t *)a)->name,
 +((const mon_cmd_t *)b)->name);
 +}
 +
 +static void sortcmdlist(void)
 +{
 +int array_num;
 +int elem_size = sizeof(mon_cmd_t);
 +
 +array_num = sizeof(mon_cmds)/elem_size-1;
 +qsort((void *)mon_cmds, array_num, elem_size, compare_mon_cmd);
 +
 +array_num = sizeof(info_cmds)/elem_size-1;
 +qsort((void *)info_cmds, array_num, elem_size, compare_mon_cmd);
 +}
 +
 
  /*
   * Local variables:
 @@ -5110,6 +5130,8 @@ void monitor_init(CharDriverState *chr, int flags)
 QLIST_INSERT_HEAD(&mon_list, mon, entry);
 if (!default_mon || (flags & MONITOR_IS_DEFAULT))
  default_mon = mon;
 +
 +sortcmdlist();
  }
 
  static void bdrv_password_cb(Monitor *mon, const char *password, void 
 *opaque)

Tested-by: Wenyi Gao we...@linux.vnet.ibm.com
Works nicely.



Wenyi Gao
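
For readers following the thread, the technique in the patch can be tried in isolation. The following is a minimal sketch, not the QEMU code: toy_cmd, demo_cmds and sort_cmds are hypothetical stand-ins for mon_cmd_t, mon_cmds and sortcmdlist. It sorts a sentinel-terminated table with qsort() and leaves the trailing NULL entry in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for mon_cmd_t; only the name field matters here. */
typedef struct {
    const char *name;
} toy_cmd;

/* Table terminated by a NULL sentinel, like the tables in the patch. */
static toy_cmd demo_cmds[] = {
    { "balloon" },
    { "acl_show" },
    { "acl_add" },
    { NULL },
};

/* qsort comparator: order entries by command name. */
static int compare_cmd(const void *a, const void *b)
{
    return strcmp(((const toy_cmd *)a)->name, ((const toy_cmd *)b)->name);
}

/* Sort every entry except the trailing sentinel. */
static void sort_cmds(toy_cmd *cmds, size_t n_with_sentinel)
{
    assert(n_with_sentinel >= 1);
    qsort(cmds, n_with_sentinel - 1, sizeof(cmds[0]), compare_cmd);
}
```

The sentinel has to be excluded from the element count, otherwise the comparator would pass a NULL name to strcmp(). Because the patch calls sortcmdlist() from monitor_init(), the sort presumably runs once per monitor; re-sorting an already sorted table is harmless.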

Re: [Qemu-devel] [PATCH] kernel/kvm: fix improper nmi emulation

2011-10-12 Thread Kenji Kaneshige

(2011/10/10 19:26), Avi Kivity wrote:
 On 10/10/2011 08:06 AM, Lai Jiangshan wrote:
 From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is masked in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

 With this patch, KVM_NMI ioctl is handled as follows.

 - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
 request of triggering LINT1 on the processor. LINT1 is emulated in
 in-kernel irqchip.

 - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
 request of injecting NMI to the processor. This assumes LINT1 is
 already emulated in userland.
 
 Please add a KVM_NMI section to Documentation/virtual/kvm/api.txt.
 

 -static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 -{
 - kvm_inject_nmi(vcpu);
 -
 - return 0;
 -}
 -
 static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
 struct kvm_tpr_access_ctl *tac)
 {
 @@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 break;
 }
 case KVM_NMI: {
 - r = kvm_vcpu_ioctl_nmi(vcpu);
 - if (r)
 - goto out;
 + if (irqchip_in_kernel(vcpu->kvm))
 + kvm_apic_lint1_deliver(vcpu);
 + else
 + kvm_inject_nmi(vcpu);
 r = 0;
 break;
 }
 
 Why did you drop kvm_vcpu_ioctl_nmi()?
 
 Please add (and document) a KVM_CAP flag that lets userspace know the new 
 behaviour is supported.
 

Sorry for the delayed response.

I don't understand why a new KVM_CAP flag is needed.

I think the old behavior was clearly a bug, and the new behavior is not a new
capability. Furthermore, the kvm patch and the qemu patch in this patchset
can be applied independently. If only the kvm patch is applied, the NMI bug in
the in-kernel irqchip is fixed and qemu's NMI behavior is unchanged. If only the
qemu patch is applied, the qemu NMI bug is fixed and the NMI behavior of the
in-kernel irqchip is unchanged.

Regards,
Kenji Kaneshige
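
The dispatch being discussed can be modeled outside the kernel. The sketch below is a toy under stated assumptions: toy_vcpu, toy_lint1_deliver and toy_ioctl_nmi are hypothetical names, and only the LVT mask bit (bit 16) and the delivery-mode field (bits 10:8, where 100b means NMI) are modeled. It shows why routing the NMI button through LINT1 lets a masked LVT entry suppress the NMI, while the userland-irqchip path injects directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LVT_MASKED     (1u << 16)  /* LVT mask bit */
#define LVT_MODE_MASK  (7u << 8)   /* delivery mode field, bits 10:8 */
#define LVT_MODE_NMI   (4u << 8)   /* delivery mode 100b = NMI */

/* Hypothetical toy vCPU: just enough state to show the dispatch. */
struct toy_vcpu {
    bool irqchip_in_kernel;  /* is the local APIC emulated in the kernel? */
    uint32_t lvt_lint1;      /* value of the LVT LINT1 register */
    int nmi_pending;         /* NMIs queued for the guest */
};

/* Deliver LINT1 through the LVT: a masked entry drops the event. */
static void toy_lint1_deliver(struct toy_vcpu *v)
{
    if (v->lvt_lint1 & LVT_MASKED) {
        return;
    }
    if ((v->lvt_lint1 & LVT_MODE_MASK) == LVT_MODE_NMI) {
        v->nmi_pending++;
    }
}

/* Model of the KVM_NMI handling described in the patch. */
static int toy_ioctl_nmi(struct toy_vcpu *v)
{
    assert(v);
    if (v->irqchip_in_kernel) {
        toy_lint1_deliver(v);  /* the in-kernel APIC decides */
    } else {
        v->nmi_pending++;      /* userland has already evaluated LINT1 */
    }
    return 0;
}
```

In this model a vCPU whose LINT1 entry is masked receives nothing, which matches the kdump expectation quoted above that NMI is masked on CPUs other than CPU0.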

Re: [Qemu-devel] [PATCH 1/1 V2] kernel/kvm: fix improper nmi emulation

2011-10-12 Thread Kenji Kaneshige

(2011/10/12 2:00), Lai Jiangshan wrote:
 From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 
 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is masked in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
 
 With this patch, KVM_NMI ioctl is handled as follows.
 
 - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
request of triggering LINT1 on the processor. LINT1 is emulated in
in-kernel irqchip.
 
 - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
request of injecting NMI to the processor. This assumes LINT1 is
already emulated in userland.
 
 (laijs) Changed from v1:
 Add KVM_NMI API document
 Add KVM_CAP_USER_NMI
 
 Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 Tested-by: Lai Jiangshan la...@cn.fujitsu.com
 ---
   Documentation/virtual/kvm/api.txt |   20 
   arch/x86/kvm/irq.h|1 +
   arch/x86/kvm/lapic.c  |7 +++
   arch/x86/kvm/x86.c|   12 
   include/linux/kvm.h   |3 +++
   5 files changed, 43 insertions(+), 0 deletions(-)
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index b0e4b9c..5c24cc3 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1430,6 +1430,26 @@ is supported; 2 if the processor requires all virtual 
 machines to have
   an RMA, or 1 if the processor can use an RMA but doesn't require it,
   because it supports the Virtual RMA (VRMA) facility.
 
 +4.64 KVM_NMI
 +
 +Capability: KVM_CAP_USER_NMI
 +Architectures: x86
 +Type: vcpu ioctl
 +Parameters: none
 +Returns: 0 on success, -1 on error
 +
 +This ioctl injects NMI to the vcpu.
 +
 +If with capability KVM_CAP_LAPIC_NMI, KVM_NMI ioctl is handled as follows:
 +
 + - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
 +   request of triggering LINT1 on the processor. LINT1 is emulated in
 +   in-kernel lapic irqchip.
 +
 + - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
 +   request of injecting NMI to the processor. This assumes LINT1 is
 +   already emulated in userland lapic.
 +
   5. The kvm_run structure
 
   Application code obtains a pointer to the kvm_run structure by
 diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
 index 53e2d08..0c96315 100644
 --- a/arch/x86/kvm/irq.h
 +++ b/arch/x86/kvm/irq.h
 @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
   void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
   void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
   void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
   void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
   void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
   void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index 57dcbd4..87fe36a 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
   kvm_apic_local_deliver(apic, APIC_LVT0);
   }
 
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
 +{
 + struct kvm_lapic *apic = vcpu->arch.apic;
 +
 + kvm_apic_local_deliver(apic, APIC_LVT1);
 +}
 +
   static struct kvm_timer_ops lapic_timer_ops = {
   .is_periodic = lapic_is_periodic,
   };
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 84a28ea..6862ef7 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -2729,12 +2729,24 @@ static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu 
 *vcpu,
   return 0;
   }
 
 +#ifdef KVM_CAP_LAPIC_NMI
 +static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 +{
 + if (irqchip_in_kernel(vcpu->kvm))
 + kvm_apic_lint1_deliver(vcpu);
 + else
 + kvm_inject_nmi(vcpu);
 +
 + return 0;
 +}
 +#else
   static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
   {
   kvm_inject_nmi(vcpu);
 
   return 0;
   }
 +#endif

I don't think we need to keep the old kvm_vcpu_ioctl_nmi() behavior, because
it's clearly a bug.

Regards,
Kenji Kaneshige

Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin

On Wed, Oct 12, 2011 at 02:07:46PM +1100, David Gibson wrote:
 Um.. why?  PCI is defined by the spec to be LE, so I don't see that we
 need explicit endianness versions for PCI helpers.

LE in the spec only applies to structures defined by the spec,
that is pci configuration and msix tables in device memory.

-- 
MST

Re: [Qemu-devel] [PATCH] spice-input: migrate ledstate

2011-10-12 Thread Gerd Hoffmann


  Hi,


There is no ledstate in a PS/2 keyboard (or I'm reading too much into
the implementation in qemu).


There is.  It isn't explicitly stored into the state struct though 
because the ps/2 keyboard itself doesn't use it, it just calls 
kbd_put_ledstate() to inform others about it.


cheers,
  Gerd

[Qemu-devel] [PATCH 1/2] Add opt_set_bool function

2011-10-12 Thread M. Mohan Kumar

In addition to the qemu_opt_set function, we also need a function to set a
bool value.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
 qemu-option.c |   35 +++
 qemu-option.h |1 +
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index 105d760..d6bc908 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -636,6 +636,41 @@ int qemu_opt_set(QemuOpts *opts, const char *name, const 
char *value)
 return 0;
 }
 
+int qemu_opt_set_bool(QemuOpts *opts, const char *name, int val)
+{
+QemuOpt *opt;
+const QemuOptDesc *desc = opts->list->desc;
+int i;
+
+for (i = 0; desc[i].name != NULL; i++) {
+if (strcmp(desc[i].name, name) == 0) {
+break;
+}
+}
+if (desc[i].name == NULL) {
+if (i == 0) {
+/* empty list - allow any */;
+} else {
+qerror_report(QERR_INVALID_PARAMETER, name);
+return -1;
+}
+}
+
+opt = g_malloc0(sizeof(*opt));
+opt->name = g_strdup(name);
+opt->opts = opts;
+QTAILQ_INSERT_TAIL(&opts->head, opt, next);
+if (desc[i].name != NULL) {
+opt->desc = desc+i;
+}
+opt->value.boolean = !!val;
+if (qemu_opt_parse(opt) < 0) {
+qemu_opt_del(opt);
+return -1;
+}
+return 0;
+}
+
 int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
  int abort_on_failure)
 {
diff --git a/qemu-option.h b/qemu-option.h
index b515813..af4d36b 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -109,6 +109,7 @@ int qemu_opt_get_bool(QemuOpts *opts, const char *name, int 
defval);
 uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t 
defval);
 uint64_t qemu_opt_get_size(QemuOpts *opts, const char *name, uint64_t defval);
 int qemu_opt_set(QemuOpts *opts, const char *name, const char *value);
+int qemu_opt_set_bool(QemuOpts *opts, const char *name, int val);
 typedef int (*qemu_opt_loopfunc)(const char *name, const char *value, void 
*opaque);
 int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
  int abort_on_failure);
-- 
1.7.6
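
The name-validation step at the top of qemu_opt_set_bool() can be read as a small rule: a name is accepted if it appears in the descriptor table, or if the table is empty. The following is a minimal sketch of that rule only, with hypothetical names (toy_desc, toy_name_allowed); it is not the QemuOpts API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for QemuOptDesc: a NULL name ends the table. */
typedef struct {
    const char *name;
} toy_desc;

/* Accept a name that is listed, or any name when the table is empty. */
static bool toy_name_allowed(const toy_desc *desc, const char *name)
{
    size_t i;

    assert(desc != NULL && name != NULL);
    for (i = 0; desc[i].name != NULL; i++) {
        if (strcmp(desc[i].name, name) == 0) {
            return true;
        }
    }
    /* Reached the sentinel: only an empty table (i == 0) allows any name. */
    return i == 0;
}
```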

[Qemu-devel] [PATCH V4 2/2] hw/9pfs: Add readonly support for 9p export

2011-10-12 Thread M. Mohan Kumar

A new fsdev parameter, readonly, is introduced to control access to a 9p export.
readonly=on|off specifies the access type. By default, rw
access is given.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
Changes from previous version V3:
* Use opt_set_bool function to set readonly option
* Change the flag from MS_READONLY to 9p specific

Change from previous version V2:
* QEMU_OPT_BOOL is used for the readonly parameter

Changes from previous version:
* Use readonly option instead of access
* Change function return type to boolean where it's needed

 fsdev/file-op-9p.h |3 +-
 fsdev/qemu-fsdev.c |   12 +-
 fsdev/qemu-fsdev.h |1 +
 hw/9pfs/virtio-9p-device.c |3 ++
 hw/9pfs/virtio-9p.c|   46 
 qemu-config.c  |7 ++
 vl.c   |2 +
 7 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 33fb07f..b75290d 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -58,7 +58,8 @@ typedef struct extended_ops {
 } extended_ops;
 
 /* FsContext flag values */
-#define PATHNAME_FSCONTEXT 0x1
+#define PATHNAME_FSCONTEXT  0x1
+#define P9_RDONLY_EXPORT0x2
 
 /* cache flags */
 #define V9FS_WRITETHROUGH_CACHE 0x1
diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index d08ba9c..f8a8227 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -29,13 +29,13 @@ static FsTypeTable FsTypes[] = {
 int qemu_fsdev_add(QemuOpts *opts)
 {
 struct FsTypeListEntry *fsle;
-int i;
+int i, flags = 0;
 const char *fsdev_id = qemu_opts_id(opts);
 const char *fstype = qemu_opt_get(opts, "fstype");
 const char *path = qemu_opt_get(opts, "path");
 const char *sec_model = qemu_opt_get(opts, "security_model");
 const char *cache = qemu_opt_get(opts, "cache");
-
+int rdonly = qemu_opt_get_bool(opts, "readonly", 0);
 
 if (!fsdev_id) {
 fprintf(stderr, "fsdev: No id specified\n");
@@ -76,6 +76,14 @@ int qemu_fsdev_add(QemuOpts *opts)
 fsle->fse.cache_flags = V9FS_WRITETHROUGH_CACHE;
 }
 }
+if (rdonly) {
+flags |= P9_RDONLY_EXPORT;
+} else {
+flags &= ~P9_RDONLY_EXPORT;
+}
+
+fsle->fse.flags = flags;
+
 QTAILQ_INSERT_TAIL(&fstype_entries, fsle, next);
 return 0;
 }
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index 0f67880..2938eaf 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -44,6 +44,7 @@ typedef struct FsTypeEntry {
 char *security_model;
 int cache_flags;
 FileOperations *ops;
+int flags;
 } FsTypeEntry;
 
 typedef struct FsTypeListEntry {
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 1846e36..336292c 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -125,6 +125,9 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
*conf)
 s->tag_len = len;
 s->ctx.uid = -1;
 s->ctx.flags = 0;
+if (fse->flags & P9_RDONLY_EXPORT) {
+s->ctx.flags |= P9_RDONLY_EXPORT;
+}
 
 s->ops = fse->ops;
 s->vdev.get_features = virtio_9p_get_features;
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index 47ed2f1..9f15787 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1271,6 +1271,11 @@ static void v9fs_fix_path(V9fsPath *dst, V9fsPath *src, 
int len)
 dst->size++;
 }
 
+static inline bool is_ro_export(FsContext *fs_ctx)
+{
+return fs_ctx->flags & P9_RDONLY_EXPORT;
+}
+
 static void v9fs_version(void *opaque)
 {
 V9fsPDU *pdu = opaque;
@@ -1690,6 +1695,14 @@ static void v9fs_open(void *opaque)
 } else {
 flags = omode_to_uflags(mode);
 }
+if (is_ro_export(&s->ctx)) {
+if (mode & O_WRONLY || mode & O_RDWR || mode & O_APPEND) {
+err = -EROFS;
+goto out;
+} else {
+flags |= O_NOATIME;
+}
+}
 err = v9fs_co_open(pdu, fidp, flags);
 if (err < 0) {
 goto out;
@@ -3301,6 +3314,33 @@ static void v9fs_op_not_supp(void *opaque)
 complete_pdu(pdu->s, pdu, -EOPNOTSUPP);
 }
 
+static inline bool is_read_only_op(int id)
+{
+switch (id) {
+case P9_TREADDIR:
+case P9_TSTATFS:
+case P9_TGETATTR:
+case P9_TXATTRWALK:
+case P9_TLOCK:
+case P9_TGETLOCK:
+case P9_TREADLINK:
+case P9_TVERSION:
+case P9_TLOPEN:
+case P9_TATTACH:
+case P9_TSTAT:
+case P9_TWALK:
+case P9_TCLUNK:
+case P9_TFSYNC:
+case P9_TOPEN:
+case P9_TREAD:
+case P9_TAUTH:
+case P9_TFLUSH:
+return 1;
+default:
+return 0;
+}
+}
+
 static void submit_pdu(V9fsState *s, V9fsPDU *pdu)
 {
 Coroutine *co;
@@ -3312,6 +3352,12 @@ static void submit_pdu(V9fsState *s, V9fsPDU *pdu)
 } else {
 handler = pdu_co_handlers[pdu->id];
 }
+
+if (is_ro_export(&s->ctx) && !is_read_only_op(pdu->id)) {
+complete_pdu(s,
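
The patch text is cut off at this point in the archive. The gating idea it introduces can still be sketched on its own: requests on a read-only export pass only if their type is on an allow list, and opens pass only if they ask for read access. Everything below is a hypothetical model (toy_op, toy_check_request, toy_check_open); the real 9p message IDs are not reproduced, and returning -EROFS for rejected requests is an assumption carried over from the v9fs_open hunk. The open check compares the access mode against O_ACCMODE, since O_RDONLY is 0 and cannot be detected with a bit test.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

#define TOY_RDONLY_EXPORT 0x2  /* same value as P9_RDONLY_EXPORT in the patch */

/* Hypothetical request types standing in for the 9p message IDs. */
enum toy_op { TOY_OP_READ, TOY_OP_WALK, TOY_OP_WRITE, TOY_OP_MKDIR };

static bool toy_is_ro_export(int ctx_flags)
{
    return (ctx_flags & TOY_RDONLY_EXPORT) != 0;
}

/* Allow list of request types that never modify the export. */
static bool toy_is_read_only_op(enum toy_op op)
{
    switch (op) {
    case TOY_OP_READ:
    case TOY_OP_WALK:
        return true;
    default:
        return false;
    }
}

/* Gate applied before dispatch: 0 to proceed, -EROFS to reject. */
static int toy_check_request(int ctx_flags, enum toy_op op)
{
    if (toy_is_ro_export(ctx_flags) && !toy_is_read_only_op(op)) {
        return -EROFS;
    }
    return 0;
}

/* Gate applied at open time: only plain read access is allowed. */
static int toy_check_open(int ctx_flags, int open_flags)
{
    if (!toy_is_ro_export(ctx_flags)) {
        return 0;
    }
    if ((open_flags & O_ACCMODE) != O_RDONLY || (open_flags & O_APPEND)) {
        return -EROFS;
    }
    return 0;
}
```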

Re: [Qemu-devel] [PATCH] qed: fix use-after-free during l2 cache commit

2011-10-12 Thread Stefan Hajnoczi

On Tue, Oct 11, 2011 at 04:22:11PM +0200, Kevin Wolf wrote:
Am 30.09.2011 17:49, schrieb Amit Shah:
On (Fri) 30 Sep 2011 [16:23:30], Stefan Hajnoczi wrote:
On Fri, Sep 30, 2011 at 12:27 PM, Amit Shah amit.s...@redhat.com wrote:
On (Fri) 30 Sep 2011 [11:39:11], Stefan Hajnoczi wrote:
QED's metadata caching strategy allows two parallel requests to race for
metadata lookup. The first one to complete will populate the metadata
cache and the second one will drop the data it just read in favor of the
cached data.

There is a use-after-free in qed_read_l2_table_cb() and
qed_commit_l2_update() where l2_table->offset was used after the
l2_table may have been freed due to a metadata lookup race. Fix this by
keeping the l2_offset in a local variable and not reaching into the
possibly freed l2_table.

Reported-by: Amit Shah amit.s...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
Hi Amit,
Thanks for reporting the assertion failure you saw at
http://fpaste.org/CDuv/.
Does this patch fix the problem?

Yes, this fixes it.

Were you able to reliably reproduce the assertion failure before?

Absolutely.

I even reverted the patch and tried the same image; same segfault
again.

I wonder because this only happens when two metadata lookups race
(which is rare enough on my setup that I've never seen this failure).
It might be worth trying a few times.

Get the F16 beta-rc LXDE live iso, install guest. It doesn't cleanly
reboot, you have to kill the VM. Next start of the VM produces this
segfault.

https://alt.fedoraproject.org/pub/alt/stage/16-Beta.RC2/Live/x86_64/Fedora-16-Beta-x86_64-Live-LXDE.iso

Can we try to artificially produce it in a qemu-iotests case?

I will take a look.

Stefan
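
The fix pattern described in the commit message is general: copy the field you still need into a local variable before calling anything that may free the object. The sketch below is a toy, not the QED code; toy_table, toy_cache, toy_cache_insert and toy_commit are hypothetical, and the one-slot cache ignores eviction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t offset;
} toy_table;

/* Hypothetical one-slot metadata cache. */
typedef struct {
    toy_table *entry;
} toy_cache;

/*
 * Insert a table. If a table for the same offset is already cached, the
 * newcomer lost the race: it is freed and the cached table is returned.
 */
static toy_table *toy_cache_insert(toy_cache *c, toy_table *t)
{
    if (c->entry && c->entry->offset == t->offset) {
        free(t);
        return c->entry;
    }
    c->entry = t;
    return t;
}

static uint64_t toy_commit(toy_cache *c, toy_table *t)
{
    uint64_t offset = t->offset;  /* saved before t may be freed */
    toy_table *cached = toy_cache_insert(c, t);

    /* Reading t->offset here would be a use-after-free if t lost the race. */
    assert(cached->offset == offset);
    return offset;
}
```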

[Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread M. Mohan Kumar

Security model is needed only for 'local' fs driver.

Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
---
 fsdev/qemu-fsdev.c |6 +
 fsdev/qemu-fsdev.h |1 +
 hw/9pfs/virtio-9p-device.c |   47 ++-
 vl.c   |   20 +++--
 4 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index 36db127..d08ba9c 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -58,11 +58,6 @@ int qemu_fsdev_add(QemuOpts *opts)
 return -1;
 }
 
-if (!sec_model) {
-fprintf(stderr, "fsdev: No security_model specified.\n");
-return -1;
-}
-
 if (!path) {
 fprintf(stderr, "fsdev: No path specified.\n");
 return -1;
@@ -72,6 +67,7 @@ int qemu_fsdev_add(QemuOpts *opts)
 
 fsle->fse.fsdev_id = g_strdup(fsdev_id);
 fsle->fse.path = g_strdup(path);
+fsle->fse.fsdriver = g_strdup(fstype);
 fsle->fse.security_model = g_strdup(sec_model);
 fsle->fse.ops = FsTypes[i].ops;
 fsle->fse.cache_flags = 0;
diff --git a/fsdev/qemu-fsdev.h b/fsdev/qemu-fsdev.h
index 9c440f2..0f67880 100644
--- a/fsdev/qemu-fsdev.h
+++ b/fsdev/qemu-fsdev.h
@@ -40,6 +40,7 @@ typedef struct FsTypeTable {
 typedef struct FsTypeEntry {
 char *fsdev_id;
 char *path;
+char *fsdriver;
 char *security_model;
 int cache_flags;
 FileOperations *ops;
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index aac58ad..1846e36 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
*conf)
 exit(1);
 }
 
-if (!strcmp(fse->security_model, "passthrough")) {
-/* Files on the Fileserver set to client user credentials */
-s->ctx.fs_sm = SM_PASSTHROUGH;
-s->ctx.xops = passthrough_xattr_ops;
-} else if (!strcmp(fse->security_model, "mapped")) {
-/* Files on the fileserver are set to QEMU credentials.
- * Client user credentials are saved in extended attributes.
- */
-s->ctx.fs_sm = SM_MAPPED;
-s->ctx.xops = mapped_xattr_ops;
-} else if (!strcmp(fse->security_model, "none")) {
-/*
- * Files on the fileserver are set to QEMU credentials.
- */
-s->ctx.fs_sm = SM_NONE;
-s->ctx.xops = none_xattr_ops;
-} else {
-fprintf(stderr, "Default to security_model=none. You may want"
-" enable advanced security model using "
-"security option:\n\t security_model=passthrough\n\t "
-"security_model=mapped\n");
-s->ctx.fs_sm = SM_NONE;
-s->ctx.xops = none_xattr_ops;
+/* security models is needed only for local fs driver */
+if (!strcmp(fse->fsdriver, "local")) {
+if (!strcmp(fse->security_model, "passthrough")) {
+/* Files on the Fileserver set to client user credentials */
+s->ctx.fs_sm = SM_PASSTHROUGH;
+s->ctx.xops = passthrough_xattr_ops;
+} else if (!strcmp(fse->security_model, "mapped")) {
+/* Files on the fileserver are set to QEMU credentials.
+* Client user credentials are saved in extended attributes.
+*/
+s->ctx.fs_sm = SM_MAPPED;
+s->ctx.xops = mapped_xattr_ops;
+} else if (!strcmp(fse->security_model, "none")) {
+/*
+* Files on the fileserver are set to QEMU credentials.
+*/
+s->ctx.fs_sm = SM_NONE;
+s->ctx.xops = none_xattr_ops;
+} else {
+fprintf(stderr, "Invalid security_model %s specified.\n"
+"Available security models are:\t "
+"passthrough,mapped or none\n", fse->security_model);
+exit(1);
+}
 }
 
 s->ctx.cache_flags = fse->cache_flags;
diff --git a/vl.c b/vl.c
index 6760e39..a961fa3 100644
--- a/vl.c
+++ b/vl.c
@@ -2795,6 +2795,7 @@ int main(int argc, char **argv, char **envp)
 QemuOpts *fsdev;
 QemuOpts *device;
 const char *cache;
+const char *fsdriver;
 
 olist = qemu_find_opts("virtfs");
 if (!olist) {
@@ -2809,13 +2810,26 @@ int main(int argc, char **argv, char **envp)
 
 if (qemu_opt_get(opts, "fstype") == NULL ||
 qemu_opt_get(opts, "mount_tag") == NULL ||
-qemu_opt_get(opts, "path") == NULL ||
-qemu_opt_get(opts, "security_model") == NULL) {
+qemu_opt_get(opts, "path") == NULL) {
 fprintf(stderr, "Usage: -virtfs fstype,path=/share_path/,"
-"security_model=[mapped|passthrough|none],"
+"{security_model=[mapped|passthrough|none]},"
 "mount_tag=tag.\n");
 exit(1);
 }
+fsdriver =

Re: [Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing

2011-10-12 Thread Daniel P. Berrange

On Wed, Oct 12, 2011 at 01:24:16PM +0530, M. Mohan Kumar wrote:
 Security model is needed only for 'local' fs driver.
 
 Signed-off-by: M. Mohan Kumar mo...@in.ibm.com
 ---
  fsdev/qemu-fsdev.c |6 +
  fsdev/qemu-fsdev.h |1 +
  hw/9pfs/virtio-9p-device.c |   47 ++-
  vl.c   |   20 +++--
  4 files changed, 43 insertions(+), 31 deletions(-)
 
 --- a/fsdev/qemu-fsdev.h
 +++ b/fsdev/qemu-fsdev.h
 @@ -40,6 +40,7 @@ typedef struct FsTypeTable {
  typedef struct FsTypeEntry {
  char *fsdev_id;
  char *path;
 +char *fsdriver;
  char *security_model;
  int cache_flags;
  FileOperations *ops;
 diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
 index aac58ad..1846e36 100644
 --- a/hw/9pfs/virtio-9p-device.c
 +++ b/hw/9pfs/virtio-9p-device.c
 @@ -83,29 +83,30 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf 
 *conf)
  exit(1);
  }
  
 -if (!strcmp(fse->security_model, "passthrough")) {
 -/* Files on the Fileserver set to client user credentials */
 -s->ctx.fs_sm = SM_PASSTHROUGH;
 -s->ctx.xops = passthrough_xattr_ops;
 -} else if (!strcmp(fse->security_model, "mapped")) {
 -/* Files on the fileserver are set to QEMU credentials.
 - * Client user credentials are saved in extended attributes.
 - */
 -s->ctx.fs_sm = SM_MAPPED;
 -s->ctx.xops = mapped_xattr_ops;
 -} else if (!strcmp(fse->security_model, "none")) {
 -/*
 - * Files on the fileserver are set to QEMU credentials.
 - */
 -s->ctx.fs_sm = SM_NONE;
 -s->ctx.xops = none_xattr_ops;
 -} else {
 -fprintf(stderr, "Default to security_model=none. You may want"
 -" enable advanced security model using "
 -"security option:\n\t security_model=passthrough\n\t "
 -"security_model=mapped\n");
 -s->ctx.fs_sm = SM_NONE;
 -s->ctx.xops = none_xattr_ops;
 +/* security models is needed only for local fs driver */
 +if (!strcmp(fse->fsdriver, "local")) {
 +if (!strcmp(fse->security_model, "passthrough")) {
 +/* Files on the Fileserver set to client user credentials */
 +s->ctx.fs_sm = SM_PASSTHROUGH;
 +s->ctx.xops = passthrough_xattr_ops;
 +} else if (!strcmp(fse->security_model, "mapped")) {
 +/* Files on the fileserver are set to QEMU credentials.
 +* Client user credentials are saved in extended attributes.
 +*/
 +s->ctx.fs_sm = SM_MAPPED;
 +s->ctx.xops = mapped_xattr_ops;
 +} else if (!strcmp(fse->security_model, "none")) {
 +/*
 +* Files on the fileserver are set to QEMU credentials.
 +*/
 +s->ctx.fs_sm = SM_NONE;
 +s->ctx.xops = none_xattr_ops;
 +} else {
 +fprintf(stderr, "Invalid security_model %s specified.\n"
 +"Available security models are:\t "
 +"passthrough,mapped or none\n", fse->security_model);
 +exit(1);
 +}

Are you sure there aren't use cases where people would like to
choose between passthrough & mapped, even when using the 'proxy'
or 'handle' security drivers?

Both of the security models seem pretty generally useful to me,
regardless of the driver type.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
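
The behavior under review can be summarized as a small parsing function. This is a sketch with hypothetical names (toy_sm, toy_parse_security_model), not the virtio-9p code: the model string is only examined for the 'local' driver, and an unknown model is an error instead of silently falling back to none. Treating a missing model as an error for 'local' is an assumption, since the vl.c part of the patch is truncated in the archive.

```c
#include <assert.h>
#include <string.h>

enum toy_sm { TOY_SM_UNUSED, TOY_SM_NONE, TOY_SM_MAPPED, TOY_SM_PASSTHROUGH };

/* Returns 0 on success and stores the model in *out; -1 on an invalid model. */
static int toy_parse_security_model(const char *fsdriver, const char *model,
                                    enum toy_sm *out)
{
    assert(fsdriver != NULL && out != NULL);

    /* Per the patch, only the 'local' driver consults the security model. */
    if (strcmp(fsdriver, "local") != 0) {
        *out = TOY_SM_UNUSED;
        return 0;
    }
    if (model == NULL) {
        return -1;  /* assumption: 'local' requires an explicit model */
    }
    if (strcmp(model, "passthrough") == 0) {
        *out = TOY_SM_PASSTHROUGH;
    } else if (strcmp(model, "mapped") == 0) {
        *out = TOY_SM_MAPPED;
    } else if (strcmp(model, "none") == 0) {
        *out = TOY_SM_NONE;
    } else {
        return -1;
    }
    return 0;
}
```

Honoring the model for other drivers, as the reply above suggests, would roughly amount to dropping the first check in this sketch.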

Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin

On Wed, Oct 12, 2011 at 02:11:37PM +1100, David Gibson wrote:
 On Sun, Oct 02, 2011 at 12:52:39PM +0200, Michael S. Tsirkin wrote:
  On Sun, Oct 02, 2011 at 12:29:08PM +0200, Avi Kivity wrote:
   On 10/02/2011 12:25 PM, Michael S. Tsirkin wrote:
   On Mon, Sep 05, 2011 at 02:34:56PM +1000, David Gibson wrote:
 This patch adds functions to pci.[ch] to perform PCI DMA operations.  At
 present, these are just stubs which perform directly cpu physical memory
 accesses.
   
 Using these stubs, however, distinguishes PCI device DMA transactions from
 other accesses to physical memory, which will allow PCI IOMMU support to
 be added in one place, rather than updating every PCI driver at that time.
   
 That is, it allows us to update individual PCI drivers to support an IOMMU
 without having yet determined the details of how the IOMMU emulation will
 operate.  This will let us remove the most bitrot-sensitive part of an
 IOMMU patch in advance.
   
 Signed-off-by: David Gibson da...@gibson.dropbear.id.au
   
   So something I just thought about:
   
   all wrappers now go through cpu_physical_memory_rw.
   This is a problem as e.g. virtio assumes that
   accesses such as stw are atomic. cpu_physical_memory_rw
   is a memcpy which makes no such guarantees.
   
   
   Let's change cpu_physical_memory_rw() to provide that guarantee for
   aligned two and four byte accesses.  Having separate paths just for
   that is not maintainable.
  
  Well, we also have stX_phys convert to target native endian-ness
  (nop for KVM but not necessarily for qemu).
 
 Yes.. as do the stX_pci_dma() helpers.  They assume LE, rather than
 having two variants, because PCI is an LE spec, and all normal PCI
 devices work in LE.

IMO, not really. PCI devices do DMA any way they like.  LE is
probably more common because both ARM and x86 processors are LE.

  If we need to model some perverse BE PCI device,
 it can reswap itself.

An explicit API for this would be cleaner.

-- 
MST

Re: [Qemu-devel] [PATCH 01/36] ds1225y: Use stdio instead of QEMUFile

2011-10-12 Thread Zhi Hui Li


On 10/11/2011 06:00 PM, Juan Quintela wrote:

QEMUFile * is only intended for migration nowadays.  Using it for
anything else just adds pain and a layer of buffers for no good
reason.

Signed-off-by: Juan Quintela quint...@redhat.com
---
  hw/ds1225y.c |   28 
  1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/hw/ds1225y.c b/hw/ds1225y.c
index 9875c44..6852a61 100644
--- a/hw/ds1225y.c
+++ b/hw/ds1225y.c
@@ -29,7 +29,7 @@ typedef struct {
  DeviceState qdev;
  uint32_t chip_size;
  char *filename;
-QEMUFile *file;
+FILE *file;
  uint8_t *contents;
  } NvRamState;

@@ -70,9 +70,9 @@ static void nvram_writeb (void *opaque, target_phys_addr_t 
addr, uint32_t val)

  s->contents[addr] = val;
  if (s->file) {
-qemu_fseek(s->file, addr, SEEK_SET);
-qemu_put_byte(s->file, (int)val);
-qemu_fflush(s->file);
+fseek(s->file, addr, SEEK_SET);
+fputc(val, s->file);
+fflush(s->file);
  }
  }

@@ -108,15 +108,17 @@ static int nvram_post_load(void *opaque, int version_id)

  /* Close file, as filename may has changed in load/store process */
  if (s->file) {
-qemu_fclose(s->file);
+fclose(s->file);
  }

  /* Write back nvram contents */
-s->file = qemu_fopen(s->filename, "wb");
+s->file = fopen(s->filename, "wb");
  if (s->file) {
  /* Write back contents, as 'wb' mode cleaned the file */
-qemu_put_buffer(s->file, s->contents, s->chip_size);
-qemu_fflush(s->file);
+if (fwrite(s->contents, s->chip_size, 1, s->file) != 1) {
+printf("nvram_post_load: short write\n");
+}
+fflush(s->file);
  }

  return 0;
@@ -143,7 +145,7 @@ typedef struct {
  static int nvram_sysbus_initfn(SysBusDevice *dev)
  {
  NvRamState *s = &FROM_SYSBUS(SysBusNvRamState, dev)->nvram;
-QEMUFile *file;
+FILE *file;
  int s_io;

  s->contents = g_malloc0(s->chip_size);
@@ -153,11 +155,13 @@ static int nvram_sysbus_initfn(SysBusDevice *dev)
  sysbus_init_mmio(dev, s->chip_size, s_io);

  /* Read current file */
-file = qemu_fopen(s->filename, "rb");
+file = fopen(s->filename, "rb");
  if (file) {
  /* Read nvram contents */
-qemu_get_buffer(file, s->contents, s->chip_size);
-qemu_fclose(file);
+if (fread(s->contents, s->chip_size, 1, file) != 1) {
+printf("nvram_sysbus_initfn: short read\n");
+}
+fclose(file);
  }
  nvram_post_load(s, 0);



Tested-by: Zhi Hui Li zhihu...@linux.vnet.ibm.com
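
The stdio calls the patch switches to can be exercised on their own. The sketch below uses hypothetical helper names (toy_nvram_write_byte, toy_nvram_load) and shows the two patterns from the diff: a one-byte write-through with fseek/fputc/fflush, and a whole-image read where fread() with a count of 1 reports a short read by returning something other than 1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write one byte at addr and flush it. Returns 0 on success, -1 on failure. */
static int toy_nvram_write_byte(FILE *f, long addr, uint8_t val)
{
    if (fseek(f, addr, SEEK_SET) != 0) {
        return -1;
    }
    if (fputc(val, f) == EOF) {
        return -1;
    }
    return fflush(f) == 0 ? 0 : -1;
}

/*
 * Read the whole image from the start of the file. With a count of 1,
 * fread() returns 1 only if all size bytes were read.
 */
static int toy_nvram_load(FILE *f, uint8_t *buf, size_t size)
{
    assert(size > 0);
    if (fseek(f, 0, SEEK_SET) != 0) {
        return -1;
    }
    return fread(buf, size, 1, f) == 1 ? 0 : -1;
}
```

Unlike the patch, which only prints a message on a short read or write, this sketch returns an error so the caller can decide what to do.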

Re: [Qemu-devel] [PATCH 3/6] block: switch bdrv_read()/bdrv_write() to coroutines

2011-10-12 Thread Stefan Hajnoczi

On Tue, Oct 11, 2011 at 7:44 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Thu, Oct 6, 2011 at 12:17 AM, Stefan Hajnoczi
 stefa...@linux.vnet.ibm.com wrote:
 @@ -1101,36 +1144,7 @@ static void set_dirty_bitmap(BlockDriverState *bs, int64_t sector_num,
  int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                 const uint8_t *buf, int nb_sectors)
  {
 -    BlockDriver *drv = bs->drv;
 -
 -    if (!bs->drv)
 -        return -ENOMEDIUM;
 -
 -    if (bdrv_has_async_rw(drv) && qemu_in_coroutine()) {
 -        QEMUIOVector qiov;
 -        struct iovec iov = {
 -            .iov_base = (void *)buf,
 -            .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
 -        };
 -
 -        qemu_iovec_init_external(&qiov, &iov, 1);
 -        return bdrv_co_writev(bs, sector_num, nb_sectors, &qiov);
 -    }
 -
 -    if (bs->read_only)
 -        return -EACCES;
 -    if (bdrv_check_request(bs, sector_num, nb_sectors))
 -        return -EIO;
 -
 -    if (bs->dirty_bitmap) {
 -        set_dirty_bitmap(bs, sector_num, nb_sectors, 1);
 -    }
 -
 -    if (bs->wr_highest_sector < sector_num + nb_sectors - 1) {
 -        bs->wr_highest_sector = sector_num + nb_sectors - 1;
 -    }
 The above code is removed; is that safe?

If you are checking that removing the bs->wr_highest_sector code is okay,
then yes, it is safe because bdrv_co_do_writev() does the dirty bitmap
and wr_highest_sector updates.  We haven't lost any code by unifying
request processing - bdrv_co_do_writev() must do everything that
bdrv_aio_writev() and bdrv_write() did.

Stefan
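
The point about unified request processing can be illustrated with a toy model. The names below (toy_bs, toy_co_do_writev, toy_write, toy_aio_writev) are hypothetical, the coroutine machinery is left out entirely, and a counter stands in for the dirty bitmap. The sketch only shows the structure: the checks and bookkeeping live in one common function, and every entry point forwards to it. ENOMEDIUM is assumed to be available, as it is on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical block device state. */
typedef struct {
    bool has_medium;
    bool read_only;
    int64_t total_sectors;
    int64_t wr_highest_sector;
    int dirty_writes;  /* stands in for the dirty bitmap */
} toy_bs;

/* Common write path: every check and all bookkeeping happen here. */
static int toy_co_do_writev(toy_bs *bs, int64_t sector_num, int nb_sectors)
{
    assert(bs);
    if (!bs->has_medium) {
        return -ENOMEDIUM;
    }
    if (bs->read_only) {
        return -EACCES;
    }
    if (sector_num < 0 || nb_sectors < 0 ||
        sector_num + nb_sectors > bs->total_sectors) {
        return -EIO;
    }
    bs->dirty_writes++;
    if (bs->wr_highest_sector < sector_num + nb_sectors - 1) {
        bs->wr_highest_sector = sector_num + nb_sectors - 1;
    }
    return 0;
}

/* The synchronous entry point is now a thin wrapper. */
static int toy_write(toy_bs *bs, int64_t sector_num, int nb_sectors)
{
    return toy_co_do_writev(bs, sector_num, nb_sectors);
}

/* So is the asynchronous one; neither duplicates the checks. */
static int toy_aio_writev(toy_bs *bs, int64_t sector_num, int nb_sectors)
{
    return toy_co_do_writev(bs, sector_num, nb_sectors);
}
```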

Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Gerd Hoffmann


  Hi,


Yes.. as do the stX_pci_dma() helpers.  They assume LE, rather than
having two variants, because PCI is an LE spec, and all normal PCI
devices work in LE.


IMO, not really. PCI devices do DMA any way they like.  LE is
probably more common because both ARM and x86 processors are LE.


Also having _le_ in the function name makes it explicitly clear that the
functions read/write little-endian values and byteswap if needed, which
makes the code more readable.  I'd suggest adding it even if there is no
need for a _be_ companion, as devices needing that are rare.
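
As an illustration of what an explicit _le_ name promises, here is a minimal sketch of byte-order-explicit accessors. The names sketch_stl_le() and sketch_ldl_le() are hypothetical and operate on a plain byte buffer; they are not the stX_pci_dma() helpers from the patch series. The byte layout is fixed by the function, so the result is the same on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, whatever the host is. */
static void sketch_stl_le(uint8_t *dst, uint32_t val)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(val & 0xff);          /* least significant byte first */
    dst[1] = (uint8_t)((val >> 8) & 0xff);
    dst[2] = (uint8_t)((val >> 16) & 0xff);
    dst[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Load a 32-bit little-endian value, whatever the host is. */
static uint32_t sketch_ldl_le(const uint8_t *src)
{
    assert(src != NULL);
    return (uint32_t)src[0] |
           ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) |
           ((uint32_t)src[3] << 24);
}
```

A reader of a call site such as sketch_stl_le(buf, v) can tell the wire format without looking up the helper, which is the readability argument made above.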


cheers,
  Gerd

Re: [Qemu-devel] [PATCH 3/6] block: switch bdrv_read()/bdrv_write() to coroutines

2011-10-12 Thread Zhi Yong Wu

On Wed, Oct 12, 2011 at 5:03 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Tue, Oct 11, 2011 at 7:44 AM, Zhi Yong Wu zwu.ker...@gmail.com wrote:
 On Thu, Oct 6, 2011 at 12:17 AM, Stefan Hajnoczi
 stefa...@linux.vnet.ibm.com wrote:
 @@ -1101,36 +1144,7 @@ static void set_dirty_bitmap(BlockDriverState *bs, int64_t sector_num,
  int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                 const uint8_t *buf, int nb_sectors)
  {
 -    BlockDriver *drv = bs->drv;
 -
 -    if (!bs->drv)
 -        return -ENOMEDIUM;
 -
 -    if (bdrv_has_async_rw(drv) && qemu_in_coroutine()) {
 -        QEMUIOVector qiov;
 -        struct iovec iov = {
 -            .iov_base = (void *)buf,
 -            .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
 -        };
 -
 -        qemu_iovec_init_external(&qiov, &iov, 1);
 -        return bdrv_co_writev(bs, sector_num, nb_sectors, &qiov);
 -    }
 -
 -    if (bs->read_only)
 -        return -EACCES;
 -    if (bdrv_check_request(bs, sector_num, nb_sectors))
 -        return -EIO;
How about the above four lines of code?
 -
 -    if (bs->dirty_bitmap) {
 -        set_dirty_bitmap(bs, sector_num, nb_sectors, 1);
 -    }
 -
 -    if (bs->wr_highest_sector < sector_num + nb_sectors - 1) {
 -        bs->wr_highest_sector = sector_num + nb_sectors - 1;
 -    }
 The above code is removed; will that be safe?

 If you are checking that removing bs->wr_highest_sector code is okay,
 then yes, it is safe because bdrv_co_do_writev() does the dirty bitmap
 and wr_highest_sector updates.  We haven't lost any code by unifying
OK. got it. thanks.
 request processing - bdrv_co_do_writev() must do everything that
 bdrv_aio_writev() and bdrv_write() did.

 Stefan




-- 
Regards,

Zhi Yong Wu

Re: [Qemu-devel] [PATCH 1/9] Add stub functions for PCI device models to do PCI DMA

2011-10-12 Thread Michael S. Tsirkin

On Wed, Oct 12, 2011 at 02:09:26PM +1100, David Gibson wrote:
 On Mon, Oct 03, 2011 at 08:17:05AM -0500, Anthony Liguori wrote:
  On 10/02/2011 07:14 AM, Michael S. Tsirkin wrote:
  On Sun, Oct 02, 2011 at 02:01:10PM +0200, Avi Kivity wrote:
  Hmm, not entirely virtio specific, some devices use stX macros to do the
  conversion.  E.g. stw_be_phys and stl_le_phys are used in several
  places.
  
  These are fine - explicit endianness.
  
  Right. So changing these to e.g. stl_dma and assuming
  LE is default seems like a step backwards.
  
  We're generalizing too much.
  
  In general, the device model doesn't need atomic access functions.

Anthony, are you sure? PCI provides atomic operations for devices (likely
uncommon), and the PCI Express spec strongly recommends at least dword update
granularity for both reads and writes.
Some guests might depend on this.

  That's because device model RAM access is not coherent with CPU RAM
  access.
  Virtio is a very, very special case.  virtio requires coherent RAM
  access.

E.g., e1000 driver seems to allocate its rings in coherent memory.
Required? Your guess is as good as mine. It seems to work fine
ATM without these guarantees.

 Right, but it should only need that for the actual rings in the virtio
 core.  I was expecting that those would remain as direct physical
 memory accesses - precisely because virtio is special - rather than
 accesses through any kind of DMA interface.

At the moment, yes. Further, that was just an example I know about.
How about msi/msix? We don't want to
split these writes as that would confuse the APIC.

 -- 
 David Gibson                   | I'll have my music baroque, and my code
 david AT gibson.dropbear.id.au | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
 http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] Is realview-pb-a8 fully supported ?

2011-10-12 Thread Peter Maydell

On 10 October 2011 14:48, Francis Moreau francis.m...@gmail.com wrote:
 On Mon, Oct 10, 2011 at 10:42 AM, Peter Maydell
 peter.mayd...@linaro.org wrote:
 On 10 October 2011 08:35, Francis Moreau francis.m...@gmail.com wrote:
 I noticed another point for the realview platform: if I boot with -m
 512, it works; however, if I set -m 256 then it doesn't.

 Perhaps your kernel is configured to load in the higher 256MB
 address range

 hmm which options do you have in mind ?

Hmm, I thought there was an option for this but I can't find it
in the config, so I must have been misremembering somehow.

 When I say it doesn't work, it means that nothing happens when
 starting qemu: no trace, it looks like it's running an infinite loop.

Not even "Uncompressing the kernel"?

If you want to track down what's going on then you'll need to
connect an ARM gdb up to qemu and single step through the boot
process, I'm afraid.

 BTW I'm wondering which kernel source I should use to build kernels
 for such platforms (realview, vexpress, versatile)? I'm currently
 using the source from kernel.org (well, similar, since this server seems
 really dead), but I'm not sure if it's a good idea...

I think the mainline kernel sources should in theory work
(in particular if they work with 512MB then that's a good
sign...) but I'm not a kernel expert; mostly I use other people's
prebuilt ones.

-- PMM

Re: [Qemu-devel] qemu-0.15.1 stable call for patches

2011-10-12 Thread Brad


On 26/09/11 9:16 AM, Justin M. Forbes wrote:

With the current patch queue I would like to start getting qemu-0.15.1
stable into shape.  With this in mind, the plan is to tag the release on
Monday Oct 3rd.  If you have patches pending for stable, now would be the
time to send them. Please CC jmfor...@linuxtx.org if you can to ensure that
I see them.

Thanks,
Justin


Is there anywhere we can see what's been pulled into the pending
0.15.1 code base so far, since you don't seem to really post to the list
about the stable branches?



Re: [Qemu-devel] [PATCH 1/2] hw/9pfs: Add new virtfs option cache=writethrough to skip host page cache

2011-10-12 Thread Stefan Hajnoczi

On Mon, Oct 10, 2011 at 10:06 AM, Aneesh Kumar K.V
aneesh.ku...@linux.vnet.ibm.com wrote:
 diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
 index 5c8b5ed..441a37f 100644
 --- a/hw/9pfs/virtio-9p-handle.c
 +++ b/hw/9pfs/virtio-9p-handle.c
 @@ -202,6 +202,15 @@ static ssize_t handle_pwritev(FsContext *ctx, int fd, const struct iovec *iov,
         return writev(fd, iov, iovcnt);

The sync_file_range(2) call below is dead code since we return
immediately after writev(2) completes.  The writev(2) return value
needs to be saved temporarily and then returned after
sync_file_range(2).

     }
  #endif
 +    if (ctx->cache_flags & V9FS_WRITETHROUGH_CACHE) {

-drive cache=writethrough means something different from 9pfs
writethrough.  This is confusing, so I wonder if there is a better
name, like "immediate write-out".

 +        /*
 +         * Initiate a writeback. This is not a data integrity sync.
 +         * We want to ensure that we don't leave dirty pages in the cache
 +         * after write when cache=writethrough is specified.
 +         */
 +        sync_file_range(fd, offset, 0,
 +                        SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE);
 +    }

I'm not sure whether SYNC_FILE_RANGE_WAIT_BEFORE is necessary.  As a
best-effort mechanism just SYNC_FILE_RANGE_WRITE does the job although
a client that rapidly rewrites may be able to leave dirty pages in the
host page cache.  SYNC_FILE_RANGE_WAIT_BEFORE ensures that dirty pages
get written out but it is no longer asynchronous because it blocks.

Stefan

Re: [Qemu-devel] [BUG] USB assertion triggers in usb_packet_complete()

2011-10-12 Thread Stefan Hajnoczi

On Tue, Oct 11, 2011 at 8:35 AM, Thomas Huth th...@linux.vnet.ibm.com wrote:
 Am Mon, 10 Oct 2011 15:03:41 +0200
 schrieb Thomas Huth th...@linux.vnet.ibm.com:

 I am currently facing a problem when running QEMU (up-to-date git
 version) with OHCI and a lot of virtual USB devices.
 The emulator dies with the following assertion:

 qemu-system-arm: hw/usb.c:337: usb_packet_complete:
 Assertion `p->owner != ((void *)0)' failed.

Hi Thomas,
I hit the same bug recently and Gerd has posted a patch which you can test:
http://patchwork.ozlabs.org/patch/118726/

Stefan

[Qemu-devel] PCI 64-bit BAR access with qemu

2011-10-12 Thread Francois WELLENREITER



Hi there,

A few days ago I read that it is possible to emulate a PCI device
with 64-bit BARs and have real 64-bit memory access.
Thus, I've created a virtual device named "toto", accessible through a
64-bit BAR:


___

static const MemoryRegionOps toto_mmio_ops = {
    .read = toto_mmio_read,
    .write = toto_mmio_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
    .impl = {
        .min_access_size = 1,
        .max_access_size = 8,
    },
};

___

memory_region_init_io(&d->mmio, &toto_mmio_ops, d, "toto-mmio",
                      0x1000);

pci_register_bar(&d->dev, BAR_0,
                 PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
                 &d->mmio);



When my driver uses pointers to unsigned char, unsigned short or
unsigned int, I can see that toto_mmio_read is called with increasing sizes.
But that does not work at all with unsigned long long.
In that case, I always get two calls to toto_mmio_read with sizes
limited to 4.
My understanding here is that a direct 64-bit memory access does not 
work (but is composed of 2 32-bit accesses).
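
The behaviour described here, one 8-byte guest access arriving at the device as two 4-byte reads, can be modelled with a short sketch. Everything in it is an assumption made for illustration: SketchDev, sketch_dev_read() and sketch_read_split() are hypothetical names, and this is a simplified model of splitting an access into smaller little-endian pieces, not QEMU's dispatch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: 8 bytes of register space plus call bookkeeping. */
typedef struct {
    uint8_t regs[8];
    unsigned calls;      /* how many times the read callback ran */
    unsigned last_size;  /* size seen by the most recent callback */
} SketchDev;

typedef uint64_t (*sketch_read_fn)(void *opaque, uint64_t addr, unsigned size);

/* Device read callback: returns `size` bytes at `addr`, little endian. */
static uint64_t sketch_dev_read(void *opaque, uint64_t addr, unsigned size)
{
    SketchDev *d = opaque;
    uint64_t v = 0;
    unsigned i;

    assert(d != NULL && addr + size <= sizeof(d->regs));
    d->calls++;
    d->last_size = size;
    for (i = 0; i < size; i++) {
        v |= (uint64_t)d->regs[addr + i] << (8 * i);
    }
    return v;
}

/* Read `size` bytes in pieces of at most `max_piece` bytes and merge them. */
static uint64_t sketch_read_split(sketch_read_fn read, void *opaque,
                                  uint64_t addr, unsigned size,
                                  unsigned max_piece)
{
    unsigned piece = size < max_piece ? size : max_piece;
    uint64_t value = 0;
    unsigned done;

    assert(read != NULL && piece > 0 && size <= 8 && size % piece == 0);

    for (done = 0; done < size; done += piece) {
        uint64_t part = read(opaque, addr + done, piece);
        if (piece < 8) {
            part &= (UINT64_C(1) << (piece * 8)) - 1;  /* keep only this piece */
        }
        value |= part << (done * 8);   /* later pieces hold higher-order bytes */
    }
    return value;
}
```

With max_piece set to 4 the callback runs twice with size 4, matching the observation above; with max_piece set to 8 it runs once with size 8.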


Then I tried to understand why this happens and realized that
cpu_physical_memory_rw was always called for this memory region (as in
the old_mmio manner):


___

(gdb) bt
#0  toto_mmio_read (opaque=0x2bec2c0, addr=0, size=4) at /home/workspace/qemu/hw/toto_hw.c:92
#1  0x005f6d37 in memory_region_read_accessor (opaque=0x2bec738, addr=0, value=0x7f9997ffec78, size=4, shift=0, mask=4294967295) at /home/workspace/qemu/memory.c:239
#2  0x005f6ee0 in access_with_adjusted_size (addr=0, value=0x7f9997ffec78, size=4, access_size_min=1, access_size_max=8, access=0x5f6cdf <memory_region_read_accessor>, opaque=0x2bec738) at /home/workspace/qemu/memory.c:284
#3  0x005f8909 in memory_region_read_thunk_n (_mr=0x2bec738, addr=0, size=4) at /home/workspace/qemu/memory.c:824
#4  0x005f8af1 in memory_region_read_thunk_l (mr=0x2bec738, addr=0) at /home/workspace/qemu/memory.c:867
#5  0x005c8f69 in cpu_physical_memory_rw (addr=3892314112, buf=0x7f999c5d0028 "\320\b", len=8, is_write=0) at /home/workspace/qemu/exec.c:3965
#6  0x005ecfa1 in kvm_cpu_exec (env=0x238ee10) at /home/workspace/qemu/kvm-all.c:985
#7  0x005bf1fb in qemu_kvm_cpu_thread_fn (arg=0x238ee10) at /home/workspace/qemu/cpus.c:661
#8  0x0034dfc077e1 in start_thread () from /lib64/libpthread.so.0
#9  0x0034df4e68ed in clone () from /lib64/libc.so.6



Did I miss anything? Is it a real defect?
Is any work still planned to allow direct 64-bit access?

Thanks for any help,

 François

Re: [Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-12 Thread Stefan Hajnoczi

On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
Am 07.10.2011 17:49, schrieb Stefan Hajnoczi:
Image streaming copies data from the backing file into the image file. It is
important to represent zero regions from the backing file efficiently during
streaming, otherwise the image file grows to the full virtual disk size and
loses sparseness.

There are two ways to implement zero write detection, they are subtly
different:

1. Allow image formats to provide efficient representations for zero
regions. QED does this with zero clusters and it has been discussed for
qcow2v3.

2. During streaming, check for zeroes and skip writing to the image file
when zeroes are detected.

However, there are some disadvantages to #2 because it leaves unallocated
holes in the image file. If image streaming is aborted before it completes
then it will be necessary to reread all unallocated clusters from the
backing file upon resuming image streaming. Potentially worse is that a
backing file over a slow remote connection will have the zero regions
fetched again and again if the guest accesses them. #1 avoids these problems
because the image file contains information on which regions are zeroes and
do not need to be refetched.

This patch series implements #1 with the existing QED zero cluster feature.
In the future we can add qcow2v3 zero clusters too. We can also implement #2
directly in the image streaming code as a fallback when the BlockDriver does
not support zero detection #1 itself. That way we get the best possible
zero write detection, depending on the image format.

Here is a qemu-iotest to verify that zero write detection is working:
http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37

Stefan Hajnoczi (3):
block: add zero write detection interface
qed: add zero write detection support
qemu-io: add zero write detection option

It's good to have an option to detect zero writes and turn them into
zero clusters, but it's something that introduces some overhead and
probably won't be suitable as a default.

Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
at runtime. By default it is off to save CPU cycles.
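
The CPU cost mentioned here comes from having to scan each write buffer. A minimal sketch of such a scan, assuming nothing about how the series actually implements it (sketch_buffer_is_zero() is a hypothetical name and a deliberately naive byte loop):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if all `len` bytes of `buf` are zero, otherwise 0. */
static int sketch_buffer_is_zero(const uint8_t *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return 0;   /* stop at the first non-zero byte */
        }
    }
    return 1;
}
```

The worst case is an all-zero buffer, which has to be read in full before the answer is known; that is the overhead that makes an off-by-default toggle reasonable.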

I think what we really want to have for image streaming is an API that
explicitly writes zeros and doesn't have to look at the whole buffer (or
actually doesn't even get a buffer).

I didn't take this approach to avoid having block drivers handle the
zero buffers that need to be allocated when the region does not cover
entire clusters. It can be done for sure but I'm not sure how to do it
nicely yet.
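
The partial-cluster problem can be made concrete with a bit of arithmetic. The sketch below is an illustration under assumptions, not code from QEMU: SketchZeroSplit and sketch_split_zero_range() are hypothetical names. It splits a zero region into an unaligned head, a run of whole clusters, and an unaligned tail; only the head and tail would need real zero buffers, while the whole clusters could be marked as zero clusters.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting a byte range at cluster boundaries (all in bytes). */
typedef struct {
    uint64_t head_len;    /* unaligned bytes before the first whole cluster */
    uint64_t full_start;  /* offset of the first whole cluster (if any) */
    uint64_t full_len;    /* total length of the whole clusters covered */
    uint64_t tail_len;    /* unaligned bytes after the last whole cluster */
} SketchZeroSplit;

static SketchZeroSplit sketch_split_zero_range(uint64_t offset, uint64_t len,
                                               uint64_t cluster_size)
{
    SketchZeroSplit s = {0, 0, 0, 0};
    uint64_t end, first_full, last_full;

    assert(cluster_size > 0);

    end = offset + len;
    /* Round the start up and the end down to cluster boundaries. */
    first_full = (offset + cluster_size - 1) / cluster_size * cluster_size;
    last_full = end / cluster_size * cluster_size;

    if (first_full >= last_full) {
        /* No whole cluster is covered: the entire range needs a zero buffer. */
        s.head_len = len;
        return s;
    }

    s.head_len = first_full - offset;
    s.full_start = first_full;
    s.full_len = last_full - first_full;
    s.tail_len = end - last_full;
    return s;
}
```

For example, with 4096-byte clusters a region starting at byte 1000 with length 10000 yields a 3096-byte head, one whole cluster at offset 4096, and a 2808-byte tail.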

Stefan

Re: [Qemu-devel] PCI 64-bit BAR access with qemu

2011-10-12 Thread Max Filippov

    I've read a few days ago that it was possible to emulate PCI device with
 64-bit BARs and have a real 64-bit memory access.
 Thus, I've created a virtual device named toto accessible through a 64-bit
 BAR

You've probably confused the ability to locate a BAR anywhere in the 64-bit
address space (such a BAR actually spans 2 consecutive PCI BAR registers
and has binary 100 in its 3 least significant bits) with the ability to
access BAR-mapped memory in 64-bit items.
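
The first meaning, a BAR that can be placed anywhere in the 64-bit address space, can be shown by decoding the register pair. This is a sketch with hypothetical helper names (sketch_bar_is_64bit(), sketch_bar64_base()); the bit layout follows the description above: bit 0 clear for memory space, bits 2:1 equal to binary 10 for a 64-bit BAR, and the next register holding the upper 32 address bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 clear means the BAR maps memory space rather than I/O space. */
static int sketch_bar_is_memory(uint32_t bar_lo)
{
    return (bar_lo & 0x1u) == 0;
}

/* Low three bits equal to binary 100: memory BAR of the 64-bit type. */
static int sketch_bar_is_64bit(uint32_t bar_lo)
{
    return (bar_lo & 0x7u) == 0x4u;
}

/* Combine the two consecutive BAR registers into one 64-bit base address. */
static uint64_t sketch_bar64_base(uint32_t bar_lo, uint32_t bar_hi)
{
    assert(sketch_bar_is_64bit(bar_lo));
    /* The low 4 bits of a memory BAR are flag bits, not address bits. */
    return ((uint64_t)bar_hi << 32) | (bar_lo & ~(uint32_t)0xf);
}
```

None of this says anything about the width of individual accesses to the mapped region, which is the second and separate ability.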

You obviously want the latter but currently it is not implemented, see e.g.

static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
  target_ulong addr,
  void *retaddr)

definition in the softmmu_template.h.

And it's quite simple to fix it, you only need to change
io_{read,write} in the softmmu_template.h and extend
io_mem_{read,write} loops in exec.c to 4 elements, taking care that
io_mem_{read,write}[3] can pass uint64_t.

-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH 0/3] block: zero write detection

2011-10-12 Thread Kevin Wolf

Am 12.10.2011 12:39, schrieb Stefan Hajnoczi:
On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
Am 07.10.2011 17:49, schrieb Stefan Hajnoczi:
Image streaming copies data from the backing file into the image file. It is
important to represent zero regions from the backing file efficiently during
streaming, otherwise the image file grows to the full virtual disk size and
loses sparseness.