date:20130912

Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet

2013-09-12 Thread Paolo Bonzini

Il 12/09/2013 05:25, Liu Ping Fan ha scritto:
 v5:
   use stand compat property to fix hpet intcap on pc-q35, while on pc-piix, 
 hard code intcap as IRQ2
 
 v4:
   use stand compat property to fix hpet intcap
 
 v3:
   change hpet interrupt capablity on board's demand
 
 Liu Ping Fan (5):
   hpet: inverse polarity when pin above ISA_NUM_IRQS
   hpet: entitle more irq pins for hpet
   PC: use qdev_xx to create hpet instead of sysbus_create_xx
   PC: differentiate hpet's interrupt capability on piix and q35
   PC-1.6: add compatibility for hpet intcap on pc-q35-1.6
 
  hw/i386/pc.c | 17 ++---
  hw/i386/pc_piix.c|  3 ++-
  hw/i386/pc_q35.c |  2 +-
  hw/timer/hpet.c  | 24 
  include/hw/i386/pc.h |  7 ++-
  5 files changed, 43 insertions(+), 10 deletions(-)
 

Reviewed-by: Paolo Bonzini pbonz...@redhat.com

Nice series, patches are split well.  Thank you very much!

Paolo

Re: [Qemu-devel] [PATCH v6 2/8] rule.mak: allow per object cflags and libs

2013-09-12 Thread Paolo Bonzini

Il 12/09/2013 04:52, Fam Zheng ha scritto:
define unnest-dir
$(foreach var,$(nested-vars),$(call push-var,$(var),$1/))
$(eval obj-parent-$1 := $(obj))
$(eval obj := $(if $(obj),$(obj)/$1,$1))
$(eval include $(SRC_PATH)/$1/Makefile.objs)
   +$(foreach v,$(nested-vars),$(call fix-obj-vars,$v,$(if 
   $(obj),$(obj)/)))
$(eval obj := $(obj-parent-$1))
$(eval obj-parent-$1 := )
$(foreach var,$(nested-vars),$(call pop-var,$(var),$1/))
   
  
  I'm not sure this will work for targets in the toplevel directory when
  obj-base is not empty.  This can be fixed later though, as part of a
  general revamping of obj-base.  Please add a FIXME comment.
  
 I'm not sure about the problem, can you give an example, so I can be specific
 in the comment?

Can you try using vl.o-cflags instead of a per-target QEMU_CFLAGS?  I
think it won't work, because the toplevel Makefile.objs is included
directly and not through unnest-dir.

Paolo

[Qemu-devel] [PATCH] pc: add 1.7 machine types for piix,q35

2013-09-12 Thread Michael S. Tsirkin

piix 1.7 is the default.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/i386/pc_piix.c | 19 +--
 hw/i386/pc_q35.c  | 17 -
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 66551b4..0ade373 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -274,6 +274,11 @@ static void pc_compat_1_2(QEMUMachineInitArgs *args)
 disable_kvm_pv_eoi();
 }
 
+static void pc_init_pci_1_7(QEMUMachineInitArgs *args)
+{
+pc_init_pci(args);
+}
+
 static void pc_init_pci_1_6(QEMUMachineInitArgs *args)
 {
 pc_compat_1_6(args);
@@ -344,14 +349,23 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args)
 .desc = Standard PC (i440FX + PIIX, 1996), \
 .hot_add_cpu = pc_hot_add_cpu
 
-#define PC_I440FX_1_6_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
+#define PC_I440FX_1_7_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
+
+static QEMUMachine pc_i440fx_machine_v1_7 = {
+PC_I440FX_1_7_MACHINE_OPTIONS,
+.name = pc-i440fx-1.7,
+.alias = pc,
+.init = pc_init_pci_1_7,
+.is_default = 1,
+};
+
+#define PC_I440FX_1_6_MACHINE_OPTIONS PC_I440FX_1_7_MACHINE_OPTIONS
 
 static QEMUMachine pc_i440fx_machine_v1_6 = {
 PC_I440FX_1_6_MACHINE_OPTIONS,
 .name = pc-i440fx-1.6,
 .alias = pc,
 .init = pc_init_pci_1_6,
-.is_default = 1,
 };
 
 static QEMUMachine pc_i440fx_machine_v1_5 = {
@@ -740,6 +754,7 @@ static QEMUMachine xenfv_machine = {
 
 static void pc_machine_init(void)
 {
+qemu_register_machine(pc_i440fx_machine_v1_7);
 qemu_register_machine(pc_i440fx_machine_v1_6);
 qemu_register_machine(pc_i440fx_machine_v1_5);
 qemu_register_machine(pc_i440fx_machine_v1_4);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 54c2b4c..0abd9b1 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -238,6 +238,11 @@ static void pc_compat_1_4(QEMUMachineInitArgs *args)
 x86_cpu_compat_set_features(Westmere, FEAT_1_ECX, 0, 
CPUID_EXT_PCLMULQDQ);
 }
 
+static void pc_q35_init_1_7(QEMUMachineInitArgs *args)
+{
+pc_q35_init(args);
+}
+
 static void pc_q35_init_1_6(QEMUMachineInitArgs *args)
 {
 pc_compat_1_6(args);
@@ -261,7 +266,16 @@ static void pc_q35_init_1_4(QEMUMachineInitArgs *args)
 .desc = Standard PC (Q35 + ICH9, 2009), \
 .hot_add_cpu = pc_hot_add_cpu
 
-#define PC_Q35_1_6_MACHINE_OPTIONS PC_Q35_MACHINE_OPTIONS
+#define PC_Q35_1_7_MACHINE_OPTIONS PC_Q35_MACHINE_OPTIONS
+
+static QEMUMachine pc_q35_machine_v1_7 = {
+PC_Q35_1_7_MACHINE_OPTIONS,
+.name = pc-q35-1.7,
+.alias = q35,
+.init = pc_q35_init_1_7,
+};
+
+#define PC_Q35_1_6_MACHINE_OPTIONS PC_Q35_1_7_MACHINE_OPTIONS
 
 static QEMUMachine pc_q35_machine_v1_6 = {
 PC_Q35_1_6_MACHINE_OPTIONS,
@@ -296,6 +310,7 @@ static QEMUMachine pc_q35_machine_v1_4 = {
 
 static void pc_q35_machine_init(void)
 {
+qemu_register_machine(pc_q35_machine_v1_7);
 qemu_register_machine(pc_q35_machine_v1_6);
 qemu_register_machine(pc_q35_machine_v1_5);
 qemu_register_machine(pc_q35_machine_v1_4);
-- 
MST

[Qemu-devel] [PATCH v7 0/8] Shared Library Module Support

2013-09-12 Thread Fam Zheng

This series implements feature of shared object building as described in:

http://wiki.qemu.org/Features/Modules

The main idea behind modules is to isolate dependencies on third party
libraries from qemu executables, such as libglusterfs or librbd, so that the
end users can install core qemu package with fewer dependencies.  And only for
those who want to use particular modules, need they install qemu-foo
sub-package, which in turn requires libbar and libbiz packages.

It's implemented in three steps:

1. The first patches fix current build system to correctly handle nested
   variables and object specific options:

[01/08] ui/Makefile.objs: delete unnecessary cocoa.o dependency
[02/08] make.rule: fix $(obj) to a real relative path
[03/08] rule.mak: allow per object cflags and libs

2. The Makefile changes adds necessary options and rules to build DSO objects:

[04/08] build-sys: introduce common-obj-m and block-obj-m for DSO

3. The next patch adds framework to load modules from installed directory:

[05/08] module: implement module loading function

A few more changes are following to complete it:

[06/08] Makefile: install modules with make install
[07/08] .gitignore: ignore module related files (dll, so, mo)

In the end of series, the block drivers are converted:

[08/08] block: convert block drivers linked with libs to modules

v7: While the discussion on module path and loading restriction goes on, more
input and revisions are apparently necessary to achieve a proper design.
However I'd like to send this revision for building system fixes spotted from
reviewers, to make it easier for early testing:

[01] Add Peter's patch for ui/Makefile.objs. (Peter)
[02] Move nested vars prefixing to immediately after unnest-vars-1 (Paolo)
 Fix order of libcacard/Makefile and unnest-vars in Makefile. (Daniel)
[03] Fix link command if no libtool. (Peter)
[04] Squash in --enable-modules. (Paolo)
 Fix a missing comma in no dso building.
[05] Don't load modules in *-user. (Peter)
 #ifdef out module loading code if no --enable-modules. (Daniel)

v6: Dropping RFC.

[01] Move addprefix to unnest-vars.
[01] Drop unnest-vars in tests/Makefile.
[03] Move all: modules to Makefile.
[03] Add empty modules: in rules.mak, for clarity.
[03] .mo is no longer storing object list, just an empty file now. In
expand-objs, objects are extracted with $foo.mo-objs). another reason for
this is for makefile.target, objects in .mo file would be prefixed with
../, so generated two times in one make all, which always outdates the
link target.
[04] Use CONFIG_MODDIR instead of CONFIG_PREFIX for module searching.
[06] Added make install.
[07] New.
[08] Drop qed changes, added iscsi, ssh, rbd and gluster as modules.

v5: Keep foo.mo-objs idea for module objects.
Unnest block-obj-m and common-obj-m in Makefile.target.
Move add-modules to unnest-vars to be reused in Makefile.target.
Use /dev/null to replace realpath for expand-objs.

v4: Added --enable-modules in the end of series.
Make nested-vars and obj-base as arguemnts to unnest-vars.
Take Paolo's idea in comments for v2 and switch back module objects syntax
to:
$(obj)/foo.mo : $(addprefix $(obj)/, bar.o biz.o qux.o)

because this needs less duplication among Makefiles.

Fam Zheng (7):
  make.rule: fix $(obj) to a real relative path
  rule.mak: allow per object cflags and libs
  build-sys: introduce common-obj-m and block-obj-m for DSO
  module: implement module loading function
  Makefile: install modules with make install
  .gitignore: ignore module related files (dll, so, mo)
  block: convert block drivers linked with libs to modules

Peter Maydell (1):
  ui/Makefile.objs: delete unnecessary cocoa.o dependency

 .gitignore|  3 ++
 Makefile  | 28 ++-
 Makefile.objs | 18 ++--
 Makefile.target   | 21 +++---
 block.c   |  1 +
 block/Makefile.objs   | 11 +++-
 configure | 74 -
 include/qemu/module.h |  9 ++
 rules.mak | 77 ++-
 scripts/create_config | 10 +++
 ui/Makefile.objs  |  2 --
 util/module.c | 55 
 vl.c  |  2 ++
 13 files changed, 249 insertions(+), 62 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH v7 2/8] make.rule: fix $(obj) to a real relative path

2013-09-12 Thread Fam Zheng

Makefile.target includes rule.mak and unnested common-obj-y, then prefix
them with '../', this will ignore object specific QEMU_CFLAGS in subdir
Makefile.objs:

$(obj)/curl.o: QEMU_CFLAGS += $(CURL_CFLAGS)

Because $(obj) here is './block', instead of '../block'. This doesn't
hurt compiling because we basically build all .o from top Makefile,
before entering Makefile.target, but it will affact arriving per-object
libs support.

The starting point of $(obj) is passed in as argument of unnest-vars, as
well as nested variables, so that different Makefiles can pass in a
right value.

Signed-off-by: Fam Zheng f...@redhat.com
---
 Makefile| 14 ++
 Makefile.objs   | 16 +---
 Makefile.target | 17 +
 configure   |  1 +
 rules.mak   | 14 +-
 5 files changed, 38 insertions(+), 24 deletions(-)

diff --git a/Makefile b/Makefile
index 806946e..a23b95c 100644
--- a/Makefile
+++ b/Makefile
@@ -115,6 +115,16 @@ defconfig:
 
 ifneq ($(wildcard config-host.mak),)
 include $(SRC_PATH)/Makefile.objs
+endif
+
+dummy := $(call unnest-vars,, \
+stub-obj-y \
+util-obj-y \
+qga-obj-y \
+block-obj-y \
+common-obj-y)
+
+ifneq ($(wildcard config-host.mak),)
 include $(SRC_PATH)/tests/Makefile
 endif
 ifeq ($(CONFIG_SMARTCARD_NSS),y)
@@ -123,6 +133,10 @@ endif
 
 all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
 
+vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
+
+vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
+
 config-host.h: config-host.h-timestamp
 config-host.h-timestamp: config-host.mak
 qemu-options.def: $(SRC_PATH)/qemu-options.hx
diff --git a/Makefile.objs b/Makefile.objs
index f46a4cd..4f7a364 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -41,7 +41,7 @@ libcacard-y += libcacard/vcardt.o
 # single QEMU executable should support all CPUs and machines.
 
 ifeq ($(CONFIG_SOFTMMU),y)
-common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o block/
+common-obj-y = blockdev.o blockdev-nbd.o block/
 common-obj-y += net/
 common-obj-y += readline.o
 common-obj-y += qdev-monitor.o device-hotplug.o
@@ -109,17 +109,3 @@ version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
 # FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
 # by libqemuutil.a.  These should be moved to a separate .json schema.
 qga-obj-y = qga/ qapi-types.o qapi-visit.o
-
-vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
-
-vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
-
-QEMU_CFLAGS+=$(GLIB_CFLAGS)
-
-nested-vars += \
-   stub-obj-y \
-   util-obj-y \
-   qga-obj-y \
-   block-obj-y \
-   common-obj-y
-dummy := $(call unnest-vars)
diff --git a/Makefile.target b/Makefile.target
index 9a49852..87906ea 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -143,13 +143,22 @@ endif # CONFIG_SOFTMMU
 # Workaround for http://gcc.gnu.org/PR55489, see configure.
 %/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
 
-nested-vars += obj-y
+dummy := $(call unnest-vars,,obj-y)
 
-# This resolves all nested paths, so it must come last
+# we are making another call to unnest-vars with different vars, protect obj-y,
+# it can be overriden in subdir Makefile.objs
+obj-y-save := $(obj-y)
+
+block-obj-y :=
+common-obj-y :=
 include $(SRC_PATH)/Makefile.objs
+dummy := $(call unnest-vars,..,block-obj-y common-obj-y)
+
+# Now restore obj-y
+obj-y := $(obj-y-save)
+
+all-obj-y = $(obj-y) $(common-obj-y) $(block-obj-y)
 
-all-obj-y = $(obj-y)
-all-obj-y += $(addprefix ../, $(common-obj-y))
 
 ifndef CONFIG_HAIKU
 LIBS+=-lm
diff --git a/configure b/configure
index e989609..cc3cd4d 100755
--- a/configure
+++ b/configure
@@ -2251,6 +2251,7 @@ fi
 if $pkg_config --atleast-version=$glib_req_ver gthread-2.0; then
 glib_cflags=`$pkg_config --cflags gthread-2.0`
 glib_libs=`$pkg_config --libs gthread-2.0`
+CFLAGS=$glib_cflags $CFLAGS
 LIBS=$glib_libs $LIBS
 libs_qga=$glib_libs $libs_qga
 else
diff --git a/rules.mak b/rules.mak
index 4499745..0c5125d 100644
--- a/rules.mak
+++ b/rules.mak
@@ -103,9 +103,6 @@ clean: clean-timestamp
 
 # magic to descend into other directories
 
-obj := .
-old-nested-dirs :=
-
 define push-var
 $(eval save-$2-$1 = $(value $1))
 $(eval $1 :=)
@@ -119,9 +116,11 @@ endef
 
 define unnest-dir
 $(foreach var,$(nested-vars),$(call push-var,$(var),$1/))
-$(eval obj := $(obj)/$1)
+$(eval obj-parent-$1 := $(obj))
+$(eval obj := $(if $(obj),$(obj)/$1,$1))
 $(eval include $(SRC_PATH)/$1/Makefile.objs)
-$(eval obj := $(patsubst %/$1,%,$(obj)))
+$(eval obj := $(obj-parent-$1))
+$(eval obj-parent-$1 := )
 $(foreach var,$(nested-vars),$(call pop-var,$(var),$1/))
 endef
 
@@ -136,7 +135,12 @@ $(if $(nested-dirs),
 endef
 
 define unnest-vars
+$(eval obj := $1)
+$(eval nested-vars := $2)
+$(eval old-nested-dirs := )
 $(call unnest-vars-1)
+$(if $1,$(foreach v,$(nested-vars),$(eval \
+   $v := $(addprefix $1/,$($v)
 $(foreach var,$(nested-vars),$(eval $(var) := $(filter-out %/, $($(var)
 $(shell mkdir -p $(sort

[Qemu-devel] [PATCH v7 3/8] rule.mak: allow per object cflags and libs

2013-09-12 Thread Fam Zheng

Adds extract-libs in LINK to expand any per object libs, the syntax to define
such a libs options is like:

foo.o-libs := $(CURL_LIBS)

in block/Makefile.objs.

Similarly,

foo.o-cflags := $(FOO_CFLAGS)

is also supported.

foo.o must be listed a nested var (e.g. common-obj-y) to make the
option variables effective.

Signed-off-by: Fam Zheng f...@redhat.com
---
 rules.mak | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/rules.mak b/rules.mak
index 0c5125d..355c275 100644
--- a/rules.mak
+++ b/rules.mak
@@ -17,15 +17,17 @@ QEMU_DGFLAGS += -MMD -MP -MT $@ -MF $(*D)/$(*F).d
 # Same as -I$(SRC_PATH) -I., but for the nested source/object directories
 QEMU_INCLUDES += -I$(D) -I$(@D)
 
+extract-libs = $(strip $(foreach o,$1,$($o-libs)))
+
 %.o: %.c
-   $(call quiet-command,$(CC) $(QEMU_INCLUDES) $(QEMU_CFLAGS) 
$(QEMU_DGFLAGS) $(CFLAGS) -c -o $@ $,  CC$(TARGET_DIR)$@)
+   $(call quiet-command,$(CC) $(QEMU_INCLUDES) $(QEMU_CFLAGS) 
$(QEMU_DGFLAGS) $(CFLAGS) $($@-cflags) -c -o $@ $,  CC$(TARGET_DIR)$@)
 %.o: %.rc
$(call quiet-command,$(WINDRES) -I. -o $@ $,  RC$(TARGET_DIR)$@)
 
 ifeq ($(LIBTOOL),)
 LINK = $(call quiet-command,$(CC) $(QEMU_CFLAGS) $(CFLAGS) $(LDFLAGS) -o $@ \
$(sort $(filter %.o, $1)) $(filter-out %.o, $1) $(version-obj-y) \
-   $(LIBS),  LINK  $(TARGET_DIR)$@)
+   $(call extract-libs,$^) $(LIBS),  LINK  $(TARGET_DIR)$@)
 else
 LIBTOOL += $(if $(V),,--quiet)
 %.lo: %.c
@@ -41,7 +43,7 @@ LINK = $(call quiet-command,\
$(sort $(filter %.o, $1)) $(filter-out %.o, $1) \
$(if $(filter %.lo %.la,$^),$(version-lobj-y),$(version-obj-y)) \
$(if $(filter %.lo %.la,$^),$(LIBTOOLFLAGS)) \
-   $(LIBS),$(if $(filter %.lo %.la,$^),lt LINK ,   LINK  
)$(TARGET_DIR)$@)
+   $(call extract-libs,$^) $(LIBS),$(if $(filter %.lo %.la,$^),lt LINK , 
  LINK  )$(TARGET_DIR)$@)
 endif
 
 %.asm: %.S
@@ -114,11 +116,22 @@ $(eval $1 = $(value save-$2-$1) $$(subdir-$2-$1))
 $(eval save-$2-$1 :=)
 endef
 
+define fix-obj-vars
+$(foreach v,$($1), \
+   $(if $($v-cflags), \
+   $(eval $2$v-cflags := $($v-cflags)) \
+   $(eval $v-cflags := )) \
+   $(if $($v-libs), \
+   $(eval $2$v-libs := $($v-libs)) \
+   $(eval $v-libs := )))
+endef
+
 define unnest-dir
 $(foreach var,$(nested-vars),$(call push-var,$(var),$1/))
 $(eval obj-parent-$1 := $(obj))
 $(eval obj := $(if $(obj),$(obj)/$1,$1))
 $(eval include $(SRC_PATH)/$1/Makefile.objs)
+$(foreach v,$(nested-vars),$(call fix-obj-vars,$v,$(if $(obj),$(obj)/)))
 $(eval obj := $(obj-parent-$1))
 $(eval obj-parent-$1 := )
 $(foreach var,$(nested-vars),$(call pop-var,$(var),$1/))
-- 
1.8.3.1

[Qemu-devel] [PATCH v7 1/8] ui/Makefile.objs: delete unnecessary cocoa.o dependency

2013-09-12 Thread Fam Zheng

From: Peter Maydell peter.mayd...@linaro.org

Delete an unnecessary dependency for cocoa.o; we already have
a general rule that tells Make that we can build a .o file
from a .m source using an ObjC compiler, so this specific
rule is unnecessary. Further, it is using the dubious construct
$(SRC_PATH)/$(obj) to get at the source directory, which will
break when $(obj) is redefined as part of the preparation for
per-object library support.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Fam Zheng f...@redhat.com
---
 ui/Makefile.objs | 2 --
 1 file changed, 2 deletions(-)

diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index 6ddc0de..f33be47 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -17,6 +17,4 @@ common-obj-$(CONFIG_GTK) += gtk.o x_keymap.o
 
 $(obj)/sdl.o $(obj)/sdl_zoom.o: QEMU_CFLAGS += $(SDL_CFLAGS) 
 
-$(obj)/cocoa.o: $(SRC_PATH)/$(obj)/cocoa.m
-
 $(obj)/gtk.o: QEMU_CFLAGS += $(GTK_CFLAGS) $(VTE_CFLAGS)
-- 
1.8.3.1

[Qemu-devel] [PATCH v7 7/8] .gitignore: ignore module related files (dll, so, mo)

2013-09-12 Thread Fam Zheng

Signed-off-by: Fam Zheng f...@redhat.com
---
 .gitignore | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.gitignore b/.gitignore
index d2c5c2f..4d0ac09 100644
--- a/.gitignore
+++ b/.gitignore
@@ -63,6 +63,9 @@ fsdev/virtfs-proxy-helper.pod
 *.cp
 *.dvi
 *.exe
+*.dll
+*.so
+*.mo
 *.fn
 *.ky
 *.log
-- 
1.8.3.1

[Qemu-devel] [PATCH v7 6/8] Makefile: install modules with make install

2013-09-12 Thread Fam Zheng

Install all the subdirs for modules under configure option moddir.

Signed-off-by: Fam Zheng f...@redhat.com
---
 Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Makefile b/Makefile
index ef76967..00c2a52 100644
--- a/Makefile
+++ b/Makefile
@@ -360,6 +360,12 @@ install-datadir install-localstatedir
 ifneq ($(TOOLS),)
$(INSTALL_PROG) $(STRIP_OPT) $(TOOLS) $(DESTDIR)$(bindir)
 endif
+ifneq ($(CONFIG_MODULES),)
+   for s in $(patsubst %.mo,%(DSOSUF),$(modules-m)); do \
+   $(INSTALL_DIR) $(DESTDIR)$(moddir)/$$(dirname $$s); \
+   $(INSTALL_PROG) $(STRIP_OPT) $$s 
$(DESTDIR)$(moddir)/$$(dirname $$s); \
+   done
+endif
 ifneq ($(HELPERS-y),)
$(INSTALL_DIR) $(DESTDIR)$(libexecdir)
$(INSTALL_PROG) $(STRIP_OPT) $(HELPERS-y) $(DESTDIR)$(libexecdir)
-- 
1.8.3.1

[Qemu-devel] [PATCH v7 4/8] build-sys: introduce common-obj-m and block-obj-m for DSO

2013-09-12 Thread Fam Zheng

Add necessary rules and flags for shared object generation.
$(common-obj-m) will include $(block-obj-m), like $(common-obj-y) does
for $(block-obj-y). The new rules introduced here are:

0) For all %.so compiling:

QEMU_CFLAGS += -fPIC

1) %.o in $(common-obj-m) is compiled to %.o, then linked to %.so.

2) %.mo in $(common-obj-m) is the placeholder for %.so for pattern
matching in Makefile. It's linked to -shared with all its dependencies
(multiple *.o) as input. Which means the list of depended objects must
be specified in each sub-Makefile.objs:

foo.mo-objs := bar.o baz.o qux.o

in the same style with foo.o-cflags and foo.o-libs. The objects here
will be prefixed with $(obj)/ if it's a subdirectory Makefile.objs.

Also introduce --enable-modules in configure, the option will enable
support of shared object build. Otherwise objects are static linked to
executables.

Signed-off-by: Fam Zheng f...@redhat.com
---
 Makefile| 10 --
 Makefile.objs   |  2 ++
 Makefile.target |  6 +-
 configure   | 14 ++
 rules.mak   | 50 ++
 5 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index a23b95c..ef76967 100644
--- a/Makefile
+++ b/Makefile
@@ -122,7 +122,9 @@ dummy := $(call unnest-vars,, \
 util-obj-y \
 qga-obj-y \
 block-obj-y \
-common-obj-y)
+block-obj-m \
+common-obj-y \
+common-obj-m)
 
 ifneq ($(wildcard config-host.mak),)
 include $(SRC_PATH)/tests/Makefile
@@ -131,7 +133,7 @@ ifeq ($(CONFIG_SMARTCARD_NSS),y)
 include $(SRC_PATH)/libcacard/Makefile
 endif
 
-all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
+all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
 
 vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
 
@@ -249,6 +251,10 @@ clean:
rm -f qemu-options.def
find . -name '*.[oda]' -type f -exec rm -f {} +
find . -name '*.l[oa]' -type f -exec rm -f {} +
+   find . -name '*.so' -type f -exec rm -f {} +
+   find . -name '*.mo' -type f -exec rm -f {} +
+   find . -name '*.dll' -type f -exec rm -f {} +
+
rm -f $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -Rf .libs
rm -f qemu-img-cmds.h
diff --git a/Makefile.objs b/Makefile.objs
index 4f7a364..023166b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -19,6 +19,8 @@ block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o 
qemu-coroutine-io.o
 block-obj-y += qemu-coroutine-sleep.o
 block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
 
+block-obj-m = block/
+
 ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
 # only pull in the actual virtio-9p device if we also enabled virtio.
diff --git a/Makefile.target b/Makefile.target
index 87906ea..7fb9e4d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -152,7 +152,11 @@ obj-y-save := $(obj-y)
 block-obj-y :=
 common-obj-y :=
 include $(SRC_PATH)/Makefile.objs
-dummy := $(call unnest-vars,..,block-obj-y common-obj-y)
+dummy := $(call unnest-vars,.., \
+   block-obj-y \
+   block-obj-m \
+   common-obj-y \
+   common-obj-m)
 
 # Now restore obj-y
 obj-y := $(obj-y-save)
diff --git a/configure b/configure
index cc3cd4d..5bc7256 100755
--- a/configure
+++ b/configure
@@ -190,6 +190,9 @@ mingw32=no
 gcov=no
 gcov_tool=gcov
 EXESUF=
+DSOSUF=.so
+LDFLAGS_SHARED=-shared
+modules=no
 prefix=/usr/local
 mandir=\${prefix}/share/man
 datadir=\${prefix}/share
@@ -485,6 +488,7 @@ OpenBSD)
 Darwin)
   bsd=yes
   darwin=yes
+  LDFLAGS_SHARED=-bundle
   if [ $cpu = x86_64 ] ; then
 QEMU_CFLAGS=-arch x86_64 $QEMU_CFLAGS
 LDFLAGS=-arch x86_64 $LDFLAGS
@@ -584,6 +588,7 @@ fi
 
 if test $mingw32 = yes ; then
   EXESUF=.exe
+  DSOSUF=.dll
   QEMU_CFLAGS=-DWIN32_LEAN_AND_MEAN -DWINVER=0x501 $QEMU_CFLAGS
   # enable C99/POSIX format strings (needs mingw32-runtime 3.15 or later)
   QEMU_CFLAGS=-D__USE_MINGW_ANSI_STDIO=1 $QEMU_CFLAGS
@@ -646,6 +651,8 @@ for opt do
   ;;
   --disable-debug-info)
   ;;
+  --enable-modules) modules=yes
+  ;;
   --cpu=*)
   ;;
   --target-list=*) target_list=$optarg
@@ -1048,6 +1055,7 @@ echo   --libdir=PATHinstall libraries in 
PATH
 echo   --sysconfdir=PATHinstall config in PATH$confsuffix
 echo   --localstatedir=PATH install local state in PATH (set at runtime 
on win32)
 echo   --with-confsuffix=SUFFIX suffix for QEMU data inside datadir and 
sysconfdir [$confsuffix]
+echo   --enable-modules enable modules support
 echo   --enable-debug-tcg   enable TCG debugging
 echo   --disable-debug-tcg  disable TCG debugging (default)
 echo   --enable-debug-info   enable debugging information (default)
@@ -3572,6 +3580,7 @@ echo python$python
 if test $slirp = yes ; then
 echo smbd  $smbd
 fi
+echo module support

[Qemu-devel] [PATCH v7 5/8] module: implement module loading function

2013-09-12 Thread Fam Zheng

Added three types of modules:

typedef enum {
MODULE_LOAD_BLOCK = 0,
MODULE_LOAD_UI,
MODULE_LOAD_NET,
MODULE_LOAD_MAX,
} module_load_type;

and their loading function:

void module_load(module_load_type).

which loads all .so files in a subdir under ${PREFIX}/qemu/, e.g.
/usr/lib/qemu/block. Modules of each type should be loaded before
respective subsystem initialization code.

Requires gmodule-2.0 from glib.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block.c   |  1 +
 configure | 28 +-
 include/qemu/module.h |  9 +
 scripts/create_config | 10 ++
 util/module.c | 55 +++
 vl.c  |  2 ++
 6 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 26639e8..16ceaaf 100644
--- a/block.c
+++ b/block.c
@@ -4008,6 +4008,7 @@ BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
 
 void bdrv_init(void)
 {
+module_load(MODULE_LOAD_BLOCK);
 module_call_init(MODULE_INIT_BLOCK);
 }
 
diff --git a/configure b/configure
index 5bc7256..275b1a0 100755
--- a/configure
+++ b/configure
@@ -199,6 +199,7 @@ datadir=\${prefix}/share
 qemu_docdir=\${prefix}/share/doc/qemu
 bindir=\${prefix}/bin
 libdir=\${prefix}/lib
+moddir=\${prefix}/lib/qemu
 libexecdir=\${prefix}/libexec
 includedir=\${prefix}/include
 sysconfdir=\${prefix}/etc
@@ -676,6 +677,8 @@ for opt do
   ;;
   --libdir=*) libdir=$optarg
   ;;
+  --moddir=*) moddir=$optarg
+  ;;
   --libexecdir=*) libexecdir=$optarg
   ;;
   --includedir=*) includedir=$optarg
@@ -1052,6 +1055,7 @@ echo   --datadir=PATH   install firmware in 
PATH$confsuffix
 echo   --docdir=PATHinstall documentation in PATH$confsuffix
 echo   --bindir=PATHinstall binaries in PATH
 echo   --libdir=PATHinstall libraries in PATH
+echo   --moddir=PATHinstall modules in PATH
 echo   --sysconfdir=PATHinstall config in PATH$confsuffix
 echo   --localstatedir=PATH install local state in PATH (set at runtime 
on win32)
 echo   --with-confsuffix=SUFFIX suffix for QEMU data inside datadir and 
sysconfdir [$confsuffix]
@@ -2256,15 +2260,19 @@ if test $mingw32 = yes; then
 else
 glib_req_ver=2.12
 fi
-if $pkg_config --atleast-version=$glib_req_ver gthread-2.0; then
-glib_cflags=`$pkg_config --cflags gthread-2.0`
-glib_libs=`$pkg_config --libs gthread-2.0`
-CFLAGS=$glib_cflags $CFLAGS
-LIBS=$glib_libs $LIBS
-libs_qga=$glib_libs $libs_qga
-else
-error_exit glib-$glib_req_ver required to compile QEMU
-fi
+
+for i in gthread-2.0 gmodule-2.0; do
+if $pkg_config --atleast-version=$glib_req_ver $i; then
+glib_cflags=`$pkg_config --cflags $i`
+glib_libs=`$pkg_config --libs $i`
+CFLAGS=$glib_cflags $CFLAGS
+LIBS=$glib_libs $LIBS
+libs_qga=$glib_libs $libs_qga
+else
+error_exit glib-$glib_req_ver required to compile QEMU
+fi
+done
+
 
 ##
 # pixman support probe
@@ -3557,6 +3565,7 @@ echo Install prefix$prefix
 echo BIOS directory`eval echo $qemu_datadir`
 echo binary directory  `eval echo $bindir`
 echo library directory `eval echo $libdir`
+echo module directory  `eval echo $moddir`
 echo libexec directory `eval echo $libexecdir`
 echo include directory `eval echo $includedir`
 echo config directory  `eval echo $sysconfdir`
@@ -3680,6 +3689,7 @@ echo all:  $config_host_mak
 echo prefix=$prefix  $config_host_mak
 echo bindir=$bindir  $config_host_mak
 echo libdir=$libdir  $config_host_mak
+echo moddir=$moddir  $config_host_mak
 echo libexecdir=$libexecdir  $config_host_mak
 echo includedir=$includedir  $config_host_mak
 echo mandir=$mandir  $config_host_mak
diff --git a/include/qemu/module.h b/include/qemu/module.h
index c4ccd57..f00bc25 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -37,4 +37,13 @@ void register_module_init(void (*fn)(void), module_init_type 
type);
 
 void module_call_init(module_init_type type);
 
+typedef enum {
+MODULE_LOAD_BLOCK = 0,
+MODULE_LOAD_UI,
+MODULE_LOAD_NET,
+MODULE_LOAD_MAX,
+} module_load_type;
+
+void module_load(module_load_type type);
+
 #endif
diff --git a/scripts/create_config b/scripts/create_config
index b1adbf5..50391c2 100755
--- a/scripts/create_config
+++ b/scripts/create_config
@@ -26,6 +26,13 @@ case $line in
 # save for the next definitions
 prefix=${line#*=}
 ;;
+ moddir=*)
+eval moddir=\${line#*=}\
+echo #define CONFIG_MODDIR \$moddir\
+;;
+ CONFIG_MODULES=*)
+echo #define CONFIG_MODULES \${line#*=}\
+;;
  CONFIG_AUDIO_DRIVERS=*)
 drivers=${line#*=}
 echo #define CONFIG_AUDIO_DRIVERS \\
@@ -104,6 +111,9 @@ case $line in
 value=${line#*=}
 echo #define $name $value
 ;;
+ DSOSUF=*)
+echo #define HOST_DSOSUF \${line#*=}\
+;;
 esac
 
 done # read
diff --git

[Qemu-devel] [PATCH v7 8/8] block: convert block drivers linked with libs to modules

2013-09-12 Thread Fam Zheng

The converted block drivers are:

curl
iscsi
rbd
ssh
glusterfs

no longer adds flags and libs for them to global variables, instead
create config-host.mak variables like FOO_CFLAGS and FOO_LIBS, which is
used as per object cflags and libs.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/Makefile.objs | 11 ++-
 configure   | 33 +++--
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 3bb85b5..f98d379 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -23,4 +23,13 @@ common-obj-y += commit.o
 common-obj-y += mirror.o
 common-obj-y += backup.o
 
-$(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)
+iscsi.o-cflags := $(LIBISCSI_CFLAGS)
+iscsi.o-libs   := $(LIBISCSI_LIBS)
+curl.o-cflags  := $(CURL_CFLAGS)
+curl.o-libs:= $(CURL_LIBS)
+rbd.o-cflags   := $(RBD_CFLAGS)
+rbd.o-libs := $(RBD_LIBS)
+gluster.o-cflags   := $(GLUSTERFS_CFLAGS)
+gluster.o-libs := $(GLUSTERFS_LIBS)
+ssh.o-cflags   := $(LIBSSH2_CFLAGS)
+ssh.o-libs := $(LIBSSH2_LIBS)
diff --git a/configure b/configure
index 275b1a0..2f02b3d 100755
--- a/configure
+++ b/configure
@@ -2221,8 +2221,6 @@ EOF
   curl_libs=`$curlconfig --libs 2/dev/null`
   if compile_prog $curl_cflags $curl_libs ; then
 curl=yes
-libs_tools=$curl_libs $libs_tools
-libs_softmmu=$curl_libs $libs_softmmu
   else
 if test $curl = yes ; then
   feature_not_found curl
@@ -2382,8 +2380,6 @@ EOF
   rbd_libs=-lrbd -lrados
   if compile_prog  $rbd_libs ; then
 rbd=yes
-libs_tools=$rbd_libs $libs_tools
-libs_softmmu=$rbd_libs $libs_softmmu
   else
 if test $rbd = yes ; then
   feature_not_found rados block device
@@ -2400,9 +2396,6 @@ if test $libssh2 != no ; then
 libssh2_cflags=`$pkg_config libssh2 --cflags`
 libssh2_libs=`$pkg_config libssh2 --libs`
 libssh2=yes
-libs_tools=$libssh2_libs $libs_tools
-libs_softmmu=$libssh2_libs $libs_softmmu
-QEMU_CFLAGS=$QEMU_CFLAGS $libssh2_cflags
   else
 if test $libssh2 = yes ; then
   error_exit libssh2 = $min_libssh2_version required for 
--enable-libssh2
@@ -2618,9 +2611,6 @@ if test $glusterfs != no ; then
 glusterfs=yes
 glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
 glusterfs_libs=`$pkg_config --libs glusterfs-api`
-CFLAGS=$CFLAGS $glusterfs_cflags
-libs_tools=$glusterfs_libs $libs_tools
-libs_softmmu=$glusterfs_libs $libs_softmmu
 if $pkg_config --atleast-version=5 glusterfs-api; then
   glusterfs_discard=yes
 fi
@@ -2988,11 +2978,9 @@ EOF
 libiscsi=yes
 libiscsi_cflags=$($pkg_config --cflags libiscsi)
 libiscsi_libs=$($pkg_config --libs libiscsi)
-CFLAGS=$CFLAGS $libiscsi_cflags
-LIBS=$LIBS $libiscsi_libs
   elif compile_prog  -liscsi ; then
 libiscsi=yes
-LIBS=$LIBS -liscsi
+libiscsi_libs=-liscsi
   else
 if test $libiscsi = yes ; then
   feature_not_found libiscsi
@@ -3907,8 +3895,9 @@ if test $bswap_h = yes ; then
   echo CONFIG_MACHINE_BSWAP_H=y  $config_host_mak
 fi
 if test $curl = yes ; then
-  echo CONFIG_CURL=y  $config_host_mak
+  echo CONFIG_CURL=m  $config_host_mak
   echo CURL_CFLAGS=$curl_cflags  $config_host_mak
+  echo CURL_LIBS=$curl_libs  $config_host_mak
 fi
 if test $brlapi = yes ; then
   echo CONFIG_BRLAPI=y  $config_host_mak
@@ -3997,7 +3986,9 @@ if test $glx = yes ; then
 fi
 
 if test $libiscsi = yes ; then
-  echo CONFIG_LIBISCSI=y  $config_host_mak
+  echo CONFIG_LIBISCSI=m  $config_host_mak
+  echo LIBISCSI_CFLAGS=$libiscsi_cflags  $config_host_mak
+  echo LIBISCSI_LIBS=$libiscsi_libs  $config_host_mak
 fi
 
 if test $seccomp = yes; then
@@ -4018,7 +4009,9 @@ if test $qom_cast_debug = yes ; then
   echo CONFIG_QOM_CAST_DEBUG=y  $config_host_mak
 fi
 if test $rbd = yes ; then
-  echo CONFIG_RBD=y  $config_host_mak
+  echo CONFIG_RBD=m  $config_host_mak
+  echo RBD_CFLAGS=$rbd_cflags  $config_host_mak
+  echo RBD_LIBS=$rbd_libs  $config_host_mak
 fi
 
 echo CONFIG_COROUTINE_BACKEND=$coroutine  $config_host_mak
@@ -4056,7 +4049,9 @@ if test $getauxval = yes ; then
 fi
 
 if test $glusterfs = yes ; then
-  echo CONFIG_GLUSTERFS=y  $config_host_mak
+  echo CONFIG_GLUSTERFS=m  $config_host_mak
+  echo GLUSTERFS_CFLAGS=$glusterfs_cflags  $config_host_mak
+  echo GLUSTERFS_LIBS=$glusterfs_libs  $config_host_mak
 fi
 
 if test $glusterfs_discard = yes ; then
@@ -4064,7 +4059,9 @@ if test $glusterfs_discard = yes ; then
 fi
 
 if test $libssh2 = yes ; then
-  echo CONFIG_LIBSSH2=y  $config_host_mak
+  echo CONFIG_LIBSSH2=m  $config_host_mak
+  echo LIBSSH2_CFLAGS=$libssh2_cflags  $config_host_mak
+  echo LIBSSH2_LIBS=$libssh2_libs  $config_host_mak
 fi
 
 if test $virtio_blk_data_plane = yes ; then
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 3/3 resend v2] arch_init: right return for ram_save_iterate

2013-09-12 Thread Lei Li


On 09/11/2013 07:27 PM, Paolo Bonzini wrote:

Il 11/09/2013 13:06, Juan Quintela ha scritto:

And I think that the right solution is make qemu_get_rate_limit() to
return -1 in case of error (or the error, I don't care).

You might do both things, it would avoid the useless g_usleep you
pointed out below.  But Lei's patch is good, because an error could
happen exactly during the qemu_put_be64 that writes RAM_SAVE_FLAG_EOS.

Caller checks also.  This is the reason I wanted qemu_file_* callers to
return an error.  It has some advantages and some disadvantages.  We
don't agree on which ones are bigger O:-)

I think the disadvantages are bigger.  It litters the code with error
handling, hides where things actually happen, and doesn't even simplify
QEMUFile itself.  Checking only at the toplevel is simpler, all we need
to do is ensure that we get there every now and then (and that's what
qemu_file_rate_limit does).


savevm.c: qemu_savevm_state_iterate()

 if (qemu_file_rate_limit(f)) {
 return 0;
 }

check is incorrect again, we should return an error if there is one
error.

Nothing cares if qemu_savevm_state_iterate returns 0 or negative, so
changing qemu_savevm_state_iterate to only return 0/1 would make sense too.

In this case, 0 means:
   please, call us again
when what we mean is:
   don't care about calling us again, there is an error.  Handle the error.

Or alternatively, 0 means:

we haven't finished the work

when what we mean is:

we haven't finished the work (BTW, please check if there is an error)


Notice that qemu_save_iterate() already returns errors in other code
paths

Yes that's also unnecessary.


If we change th ereturn value for qemu_file_rate_limit() the change that
cames with this patch is not needed, that was my point.

This is what an earlier patch from Lei did.  I told him (or her?) to


It's her. :)


leave qemu_file_rate_limit aside since the idea behind QEMUFile is to
only handle the error at the top.


Yes, I changed the return value for qemu_file_rate_limit() to negative for
that if there has been an error at the beginning. After the discussion with
Paolo, I think he's suggestion based on the idea that only handle error at
the top behind QEMUFile makes sense to me, so that's where this patch come
from.



Paolo




--
Lei

Re: [Qemu-devel] [PATCH v6 2/8] rule.mak: allow per object cflags and libs

2013-09-12 Thread Fam Zheng

On Thu, 09/12 08:34, Paolo Bonzini wrote:
 Il 12/09/2013 04:52, Fam Zheng ha scritto:
 define unnest-dir
 $(foreach var,$(nested-vars),$(call push-var,$(var),$1/))
 $(eval obj-parent-$1 := $(obj))
 $(eval obj := $(if $(obj),$(obj)/$1,$1))
 $(eval include $(SRC_PATH)/$1/Makefile.objs)
+$(foreach v,$(nested-vars),$(call fix-obj-vars,$v,$(if 
$(obj),$(obj)/)))
 $(eval obj := $(obj-parent-$1))
 $(eval obj-parent-$1 := )
 $(foreach var,$(nested-vars),$(call pop-var,$(var),$1/))

   
   I'm not sure this will work for targets in the toplevel directory when
   obj-base is not empty.  This can be fixed later though, as part of a
   general revamping of obj-base.  Please add a FIXME comment.
   
  I'm not sure about the problem, can you give an example, so I can be 
  specific
  in the comment?
 
 Can you try using vl.o-cflags instead of a per-target QEMU_CFLAGS?  I
 think it won't work, because the toplevel Makefile.objs is included
 directly and not through unnest-dir.
 
I think this case works. Only in subdir %-cflags relies on unnest-vars to
prefix them, toplevel objects don't need this:

In toplevel Makefile.objs,

vl.o-cflags = -DTEST_CFLAGS_FOR_VL_O

Then,

rm vl.o ; make vl.o V=1 | grep TEST_CFLAGS_FOR_VL_O
cc hidden options  -DTEST_CFLAGS_FOR_VL_O -c -o vl.o vl.c

Fam

Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-12 Thread Orit Wasserman

On 09/11/2013 04:54 AM, junqing.w...@cs2c.com.cn wrote:
 Hi,
 
The first is that if the VM failure happen in the middle on the live 
migration  the backup VM state will be inconsistent which means you can't 
failover to it.
 
 Yes, I have concerned about this problem. That is why we need a prefetch 
 buffer.
 

You are right I missed that.

Solving it is not simple as you need some transaction mechanism that will 
change the backup VM state only when the transaction completes (the live 
migration completes). Kemari has something like that. 
 
 The backup VM state will be loaded only when the one whole migration data is 
 prefetched. Otherwise, VM state will not be loaded. So the backup VM is 
 ensured to have a consistent state like a checkpoint.
 However, how close this checkpoint to the point of the VM failure depends on 
 the workload and bandwidth.
 

At the moment in your implementation the prefetch buffer can be very large 
(several copies of guest memory size) 
are you planning to address this issue?

The second is that sadly live migration doesn't always converge this means  
that the backup VM won't have a consist state to failover to. You need to 
detect such a case and throttle down the guest to force convergence.
 
 Yes, that's a problem. AFAK, qemu already have an auto convergence feature.

How about activating it when you do fault tolerance automatically?

 From another perspective,  if many migrations could not converge, maybe the 
 workload is high and the bandwidth is low,  and it is not recommended to use 
 FT in general.
 

I agree but we need some way to notify the user of such problem.

Regards,
Orit

Re: [Qemu-devel] [PATCH 3/3 resend v2] arch_init: right return for ram_save_iterate

2013-09-12 Thread Orit Wasserman

On 09/11/2013 02:27 PM, Paolo Bonzini wrote:
 Il 11/09/2013 13:06, Juan Quintela ha scritto:
 And I think that the right solution is make qemu_get_rate_limit() to
 return -1 in case of error (or the error, I don't care).

 You might do both things, it would avoid the useless g_usleep you
 pointed out below.  But Lei's patch is good, because an error could
 happen exactly during the qemu_put_be64 that writes RAM_SAVE_FLAG_EOS.

 Caller checks also.  This is the reason I wanted qemu_file_* callers to
 return an error.  It has some advantages and some disadvantages.  We
 don't agree on which ones are bigger O:-)
 
 I think the disadvantages are bigger.  It litters the code with error
 handling, hides where things actually happen, and doesn't even simplify
 QEMUFile itself.  Checking only at the toplevel is simpler, all we need
 to do is ensure that we get there every now and then (and that's what
 qemu_file_rate_limit does).
 

I also prefer the error checking at the top level.

Orit

 savevm.c: qemu_savevm_state_iterate()

 if (qemu_file_rate_limit(f)) {
 return 0;
 }

 check is incorrect again, we should return an error if there is one
 error.

 Nothing cares if qemu_savevm_state_iterate returns 0 or negative, so
 changing qemu_savevm_state_iterate to only return 0/1 would make sense too.

 In this case, 0 means:
   please, call us again
 when what we mean is:
   don't care about calling us again, there is an error.  Handle the error.
 
 Or alternatively, 0 means:
 
we haven't finished the work
 
 when what we mean is:
 
we haven't finished the work (BTW, please check if there is an error)
 
 Notice that qemu_save_iterate() already returns errors in other code
 paths
 
 Yes that's also unnecessary.
 
 If we change th ereturn value for qemu_file_rate_limit() the change that
 cames with this patch is not needed, that was my point.
 
 This is what an earlier patch from Lei did.  I told him (or her?) to
 leave qemu_file_rate_limit aside since the idea behind QEMUFile is to
 only handle the error at the top.
 
 Paolo

Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet

2013-09-12 Thread liu ping fan

On Thu, Sep 12, 2013 at 2:29 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 12/09/2013 05:25, Liu Ping Fan ha scritto:
 v5:
   use stand compat property to fix hpet intcap on pc-q35, while on pc-piix, 
 hard code intcap as IRQ2

 v4:
   use stand compat property to fix hpet intcap

 v3:
   change hpet interrupt capablity on board's demand

 Liu Ping Fan (5):
   hpet: inverse polarity when pin above ISA_NUM_IRQS
   hpet: entitle more irq pins for hpet
   PC: use qdev_xx to create hpet instead of sysbus_create_xx
   PC: differentiate hpet's interrupt capability on piix and q35
   PC-1.6: add compatibility for hpet intcap on pc-q35-1.6

  hw/i386/pc.c | 17 ++---
  hw/i386/pc_piix.c|  3 ++-
  hw/i386/pc_q35.c |  2 +-
  hw/timer/hpet.c  | 24 
  include/hw/i386/pc.h |  7 ++-
  5 files changed, 43 insertions(+), 10 deletions(-)


 Reviewed-by: Paolo Bonzini pbonz...@redhat.com

 Nice series, patches are split well.  Thank you very much!

Thanks for your guide :)  and learn much through it.

Regards,
Pingfan
 Paolo

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Paolo Bonzini

Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
 Qemu is expected to quit if the same boot index value is used by two devices.
 However, hot-plugging a device with a bootindex value already used should
 fail with a friendly message rather than quitting a running VM.

I think the problem is right where QEMU exits, i.e. in
add_boot_device_path.  This function should return an error instead, via
an Error ** argument.

Callers, typically a device's init or realize function, will either
print the error before returning an error code (e.g. -EBUSY for init) or
propagate the error up (for realize).

Returning/propagating failure will still cause QEMU to exit when the
duplicate bootindexes are found on the command line.

Paolo

 Signed-off-by: Marcel Apfelbaum marce...@redhat.com
 ---
  qdev-monitor.c | 33 +
  1 file changed, 33 insertions(+)
 
 diff --git a/qdev-monitor.c b/qdev-monitor.c
 index 410cdcb..654d086 100644
 --- a/qdev-monitor.c
 +++ b/qdev-monitor.c
 @@ -24,6 +24,7 @@
  #include qmp-commands.h
  #include sysemu/arch_init.h
  #include qemu/config-file.h
 +#include sysemu/sysemu.h
  
  /*
   * Aliases were a bad idea from the start.  Let's keep them
 @@ -442,6 +443,31 @@ static BusState *qbus_find(const char *path)
  }
  }
  
 +#define OBJ_PROP_BOOTINDEX bootindex
 +
 +static bool bootindex_colision(Object *obj, QemuOpts *opts)
 +{
 +int32_t bootindex;
 +
 +if (!object_property_find(obj, OBJ_PROP_BOOTINDEX, NULL)) {
 +return false;
 +}
 +
 +/* avoid parsing by setting the property and then getting it typed */
 +object_property_parse(obj, qemu_opt_get(opts, OBJ_PROP_BOOTINDEX),
 +  OBJ_PROP_BOOTINDEX, NULL);
 +bootindex = (int32_t)object_property_get_int(obj, OBJ_PROP_BOOTINDEX,
 + NULL);
 +
 +if (bootindex = 0) {
 +if (get_boot_device(bootindex)) {
 +return true;
 +}
 +}
 +
 +return false;
 +}
 +
  DeviceState *qdev_device_add(QemuOpts *opts)
  {
  ObjectClass *obj;
 @@ -502,6 +528,13 @@ DeviceState *qdev_device_add(QemuOpts *opts)
  /* create device, set properties */
  qdev = DEVICE(object_new(driver));
  
 +if (qdev_hotplug  bootindex_colision(OBJECT(qdev), opts)) {
 +qerror_report(QERR_INVALID_PARAMETER_VALUE,
 +  bootindex, an unused boot index value);
 +qdev_free(qdev);
 +return NULL;
 +}
 +
  if (bus) {
  qdev_set_parent_bus(qdev, bus);
  }

Re: [Qemu-devel] Disabling IRQ error

2013-09-12 Thread Xie Xianshan


Hi Max,
Thanks for your patience and help.
I`ve tried to do what you said, but the problem doesn`t go away.
And actually i cannot add a new register to the fpga device, because the 
fpga device i`m emulating already exists in the real world.
So i cannot change anything about hardware properties and linux driver 
for this device.


By the way, how did you finally fix your problem?

Thanks,
 Simen

于 2013/09/11 17:29, Max Filippov 写道:

On Wed, Sep 11, 2013 at 12:12 PM, Xie Xianshanxi...@cn.fujitsu.com  wrote:

   I want to add a new device fpga for e500, and trigger an interrupt IRQ3
while the register BB_INTR_REG which belongs to device fpga is wrote by
the device driver of fpga.
   For e500, IRQ3 is an external interrupt irq.
   According the debug log, the disabling error is encoutered during writing
BB_INTR_REG register.
   - write BB_INTR_REG register
   - qemu_irq_raise() is called.
   - after serval minutes,
 the error message about disabling irq is displayed.
   - continue the next execution without error(with poll?)


So your device raises IRQ, but it doesn't lower it. Real devices
usually don't do that, they either generate a short pulse on the
IRQ line (in case of edge-triggered IRQ) or raise IRQ line on
some event and then lower it on a command from its driver
(level-triggered IRQ).

You can do the following to make your device behave that way:
- make your fpga device capable of lowering its IRQ, e.g. by adding
   another register:


static void fpga_write(FPGAState *s, unsigned int offset, uint32_t value,
unsigned size) {
 switch(offset) {
 case BB_INTR_REG:
 qemu_irq_raise(s-irq);
 break;

case BB_INTC_REG:
qemu_irq_lower(s-irq);
break;

 }
}


- provide an interrupt service routine in the linux driver for your fpga
   device that would check whether the interrupt was caused by its
   device, and if so lower the device's IRQ.

Thanks.
-- Max





--
Best Regards
Xie Xianshan
--
Xie Xianshan
Development Dept.I
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
No. 6 Wenzhu Road, Nanjing, 210012, China
PHONE: +86+25-86630566-9555
FUJITSU INTERNAL: 7998-9555
MAIL: xi...@cn.fujitsu.com
--
This communication is for use by the intended recipient(s) only and may
contain information that is privileged, confidential and exempt from
disclosure under applicable law. If you are not an intended recipient of
this communication, you are hereby notified that any dissemination,
distribution or copying hereof is strictly prohibited.  If you have
received this communication in error, please notify me by reply e-mail,
permanently delete this communication from your system, and destroy any
hard copies you may have printed

Re: [Qemu-devel] Disabling IRQ error

2013-09-12 Thread Max Filippov

On Thu, Sep 12, 2013 at 11:49 AM, Xie Xianshan xi...@cn.fujitsu.com wrote:
 Hi Max,
 Thanks for your patience and help.
 I`ve tried to do what you said, but the problem doesn`t go away.
 And actually i cannot add a new register to the fpga device, because the
 fpga device i`m emulating already exists in the real world.
 So i cannot change anything about hardware properties and linux driver for
 this device.

Then I don't get its IRQ logic. Does it mean an IRQ to be edge-triggered?
Its model should use qemu_irq_pulse then instead of qemu_irq_raise and
its IRQ line should be connected to edge-sensing input of interrupt controller.
Input that you have used is also used to sense PCI IRQ and is level-sensing.

 By the way, how did you finally fix your problem?

I didn't have any. (:

-- 
Thanks.
-- Max

Re: [Qemu-devel] [PATCH v3 00/29] tcg-aarch64 improvements

2013-09-12 Thread Claudio Fontana

On 10.09.2013 10:45, Peter Maydell wrote:
 On 10 September 2013 09:27, Claudio Fontana claudio.font...@huawei.com 
 wrote:
 On another side, I end up having to manually revert some parts
 of these which you put as prerequisites, during bisection when
 landing after them, which is a huge time drain when tracking
 regressions introduced in the later part of the series.
 
 I don't understand this; can you explain? If these early
 refactoring patches have bugs then we should just identify
 them and fix them. If they don't have bugs why would you
 need to manually revert parts of them?
 

To revert the next patches which do introduce bugs.

I could not see bugs in the refactoring patches, but there is stuff to fix 
regardless of bugs.

Re: [Qemu-devel] [RFC] aio: add aio_context_acquire() and aio_context_release()

2013-09-12 Thread Stefan Hajnoczi

On Tue, Sep 10, 2013 at 02:42:10PM -0500, Michael Roth wrote:
 Quoting Stefan Hajnoczi (2013-08-29 02:43:02)
  On Thu, Aug 29, 2013 at 09:09:45AM +0800, Wenchao Xia wrote:
   于 2013-8-28 16:49, Stefan Hajnoczi 写道:
   On Wed, Aug 28, 2013 at 11:25:33AM +0800, Wenchao Xia wrote:
   +void aio_context_release(AioContext *ctx)
   +{
   +qemu_mutex_lock(ctx-acquire_lock);
   +assert(ctx-owner  qemu_thread_is_self(ctx-owner));
   +ctx-owner = NULL;
   +qemu_cond_signal(ctx-acquire_cond);
   +qemu_mutex_unlock(ctx-acquire_lock);
   +}
  if main thread have call bdrv_aio_readv(cb *bdrv_cb), now it
   is possible bdrv_cb will be executed in another thread which
   aio_context_acquire() it. I think there are some ways to solve,
   but leave a comments here now to tip better?
   
   Callbacks, BHs, and timers are executed in the thread that calls
   aio_poll().  This is safe since other threads cannot run aio_poll() or
   submit new block I/O requests at the same time.
   
   In other words: code should only care which AioContext it runs under,
   not which thread ID it runs under.  (I think we talked about this on IRC
   a few weeks ago.)
   
   Are there any situations you are worried about?
   
   Stefan
   
 Yes, we have discussed it before and think it may be safe for block
   driver caller. Still, here I mean to add some in-code comment to tip how
   to use it safely.
   
   for example:
   
   static int glob_test = 0;
   
   int aio_cb(void *opaque)
   {
   glob_test++;
   }
   
   Thread A:
   bdrv_aio_read(bs, aio_cb...);
   .
   glob_test++;
   
   
   Normally glob_test have no race condition since they supposed
   to work in one thread, but it need to be considered when
   aio_context_acquire() is involved. How about:
   /* Note that callback can run in different thread which acquired the
   AioContext and do a poll() call. */
  
  I will add a comment to aio_context_acquire() to explain that callbacks,
  timers, and BHs may run in another thread.
  
  Normally this is not a problem since the callbacks access BDS or
  AioContext, which are both protected by acquire/release.  But people
  should be aware of this.
 
 Do you imagine we'll ever have a case where one main loop thread attempts
 to register events (aio_set_fd_handler etc.) with another thread? Currently
 virtio-blk (iothread) registers events with the dataplane's AioContext, but
 doesn't start up the dataplane thread until afterward, so there's no
 contention there. But looking at a scenario where the dataplane threads
 are created/started independently of virtio-blk (as in the QContext RFC for
 example), that may not be the case (there we start up the main loop thread
 during machine init time, then attach events to it via dataplane start
 routine).
 
 So, in that case, I take it we'd do something like this, in, say,
 virtio_blk_data_plane_start?
 
 aio_context_acquire(ctx)
 aio_set_fd_handler(ctx, ...)
 aio_context_release(ctx)
 
 Or maybe push the acquire/release down into aio_set_fd_handler, and allow
 for support for recursive acquisition?

Or use a BH to schedule initialization code into the AioContext.

 As far as locking constraints (for situations like virtio-net dataplane, or
 threaded virtio-blk, where we may have lots of shared state between threads,
 potentially protected by a 'block'/'network' lock or something along that
 line), I assume we need to ensure that any locks that may be acquired after
 acquisition of an AioContext (via, say, event callbacks), must be released
 prior to calling aio_context_acquire() from another thread to avoid an
 ABBA deadlock/'lock'-order reversal?

We should avoid long-held locks where possible.  For example, protect
the relevant subsystem structures (e.g. bdrv_states or net_clients)
during access (adding/remove a device) but don't have a block/net
subsystem lock.

This way it's much easier to avoid lock ordering problems since
fine-grained locks also don't nest as much.

 I ask because I'm looking at how to handle this same scenario for
 qemu_set_fd_handler, set_fd_handler(ctx, ...). My approach is a little
 different:
 
 https://github.com/mdroth/qemu/commit/9a749a2a1ae93ac1b7d6a1216edaf0ca96ff1edb#L1R110
 
 but i think the requirements end up being similar for how
 users need to structure their code and handle locking, just wanted to
 double-check though because it seems like a potential pain to have to
 figure out what locks you need to drop prior to do event registration.
 (one alternative to this requirement is making the event updates
 asynchronous, and pushing the logic to deal with stable
 callbacks/events into the event callbacks and their opaque data).
 
 For anything outside of event registration, I assume a 'light-weight'
 AioContext loop could be converted to driving a more general GMainContext
 loop via the corresponding g_main_context_acquire(ctx)/release(ctx)?

If there are any doubts about lock ordering, I think the simplest
solution

Re: [Qemu-devel] [PATCH v3 00/29] tcg-aarch64 improvements

2013-09-12 Thread Claudio Fontana

On 10.09.2013 15:16, Richard Henderson wrote:
 On 09/10/2013 01:27 AM, Claudio Fontana wrote:
 There are two aspects.

 On one side, although some changes do not break anything, I see some 
 problems in them.
 
 Then let us discuss them, sooner rather than later.
 
 Putting them as a prerequisite for the rest forces us to agreeing on
 everything before moving forward, instead of being able to agree on separate
 chunks (meat first, rest later). In my view, this makes the process longer.
 
 If we have no common ground on how the port should look, then we simply cannot
 move forward full stop.
 
 Having put together a foundation of AArch64Insn and tcg_fmt_*, that I believe
 to be clean and easy to understand, I simply refuse on aesthetic grounds to

on aesthetic grounds?

 rewrite later patches to instead use the magic number and open-coded insn
 format used throughout the port today.  That way leads to a much greater 
 chance
 of error in my opinion.
 

I just asked you to reorder the way you do things, so that I had less work to 
do when dissecting problems in the actual functional changes.
If it's really impossible for you to do that, I guess we can move forward 
anyway, it just creates more work here before we can have a chunk we agree on.

I will put additional comments on the parts that I would like to see improved.

Thanks,

Claudio

Re: [Qemu-devel] [PATCH v3 0/3] qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe

2013-09-12 Thread Stefan Hajnoczi

On Thu, Sep 12, 2013 at 11:10:01AM +0800, liu ping fan wrote:
 Do you think this series is ready to be merged?
 I have some code to run hpet on a dedicated thread, and in theory it
 will rely on this.

Yes, it is ready.  To ease the merge I have rebased it and sent a new
revision.

Stefan

Re: [Qemu-devel] [PATCH v3 0/3] qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe

2013-09-12 Thread liu ping fan

On Thu, Sep 12, 2013 at 4:17 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Thu, Sep 12, 2013 at 11:10:01AM +0800, liu ping fan wrote:
 Do you think this series is ready to be merged?
 I have some code to run hpet on a dedicated thread, and in theory it
 will rely on this.

 Yes, it is ready.  To ease the merge I have rebased it and sent a new
 revision.

Great, thanks!

Regards,
Pingfan

[Qemu-devel] MSI-X doesn't work when running Windows as guest

2013-09-12 Thread Gal Hammer


Hi,

I've notice that the virtio-serial Windows' driver doesn't use MSI-X 
vectors when running using upstream qemu or 
qemu-kvm-1.2.2-13.fc18.x86_64. The same VM works with MSI-X when using 
qemu-kvm-0.12.1.2-2.355.el6.x86_64.


From what I saw, Windows is trying to enable MSI-X by writing a 2 bytes 
value to device's PCI-config address 66h.


So when everything works well the flow goes like this:

pci_default_write_config value: 8000 len: 2
pci_default_write_config value: 1 len: 2
msix_enabled 0 (67)
pci_default_write_config value: e107 len: 2
pci_default_write_config value: 1 len: 2
msix_enabled 0 (67)
pci_default_write_config value: 8001 len: 2
msix_enabled 1 (67)

But on upstream it goes:

pci_default_write_config addr: 66 value: 8000 size: 2
pci_default_write_config addr: 66 value: 1 size: 2
msix_enabled 0 (67)
pci_default_write_config addr: 66 value: e307 size: 2 (NOTE: Value is 
diffrent!).

pci_default_write_config addr: 66 value: 1 size: 2
msix_enabled 0 (67)

(NOTE: Missing the write of 8001).

My qemu's command line:

--- snip ---

/usr/bin/qemu-kvm -m 1G -smp 2 -enable-kvm -usb -device usb-tablet \
-device 
ide-drive,drive=drive-virtio0-0-0,id=virtio0-0-0,bootindex=1 \
-drive 
file=win7_32_viorng.qcow2,if=none,id=drive-virtio0-0-0,format=qcow2,werror=stop,rerror=stop,cache=none 
\

-monitor stdio \
-vga qxl -spice id=on,disable-ticketing,port=5903 \
-device virtio-serial-pci,id=virtio-serial0,vectors=2 \
-chardev spicevmc,id=spicechannel0,name=vdagent

--- snip ---

Thanks,

Gal.

Re: [Qemu-devel] [PATCH v2] e1000: NetClientInfo.receive_iov implemented

2013-09-12 Thread Stefan Hajnoczi

On Wed, Sep 11, 2013 at 12:57:58PM +0200, Vincenzo Maffione wrote:
 Thanks for the help!
 
 Actually I've found out that the variable copied I use in this patch can
 be removed, we can simply increment the variable ba instead (ba +=
 iov_copy).
 
 I have the patch v3 to do that ready. Do you think it is worth sending it?

Yep, that would be good.  I'll replace the patch with v3.

Stefan

Re: [Qemu-devel] [PATCH v3 01/29] tcg-aarch64: Set ext based on TCG_OPF_64BIT

2013-09-12 Thread Claudio Fontana

On 02.09.2013 19:54, Richard Henderson wrote:
 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
  tcg/aarch64/tcg-target.c | 28 +++-
  1 file changed, 7 insertions(+), 21 deletions(-)
 
 diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
 index 55ff700..5b067fe 100644
 --- a/tcg/aarch64/tcg-target.c
 +++ b/tcg/aarch64/tcg-target.c
 @@ -1105,9 +1105,9 @@ static inline void tcg_out_load_pair(TCGContext *s, 
 TCGReg addr,
  static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 const TCGArg *args, const int *const_args)
  {
 -/* ext will be set in the switch below, which will fall through to the
 -   common code. It triggers the use of extended regs where appropriate. 
 */
 -int ext = 0;
 +/* 99% of the time, we can signal the use of extension registers
 +   by looking to see if the opcode handles 64-bit data.  */
 +bool ext = (tcg_op_defs[opc].flags  TCG_OPF_64BIT) != 0;
  
  switch (opc) {
  case INDEX_op_exit_tb:
 @@ -1163,7 +1163,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_mov_i64:
 -ext = 1; /* fall through */
  case INDEX_op_mov_i32:
  tcg_out_movr(s, ext, args[0], args[1]);
  break;
 @@ -1176,43 +1175,36 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_add_i64:
 -ext = 1; /* fall through */
  case INDEX_op_add_i32:
  tcg_out_arith(s, ARITH_ADD, ext, args[0], args[1], args[2], 0);
  break;
  
  case INDEX_op_sub_i64:
 -ext = 1; /* fall through */
  case INDEX_op_sub_i32:
  tcg_out_arith(s, ARITH_SUB, ext, args[0], args[1], args[2], 0);
  break;
  
  case INDEX_op_and_i64:
 -ext = 1; /* fall through */
  case INDEX_op_and_i32:
  tcg_out_arith(s, ARITH_AND, ext, args[0], args[1], args[2], 0);
  break;
  
  case INDEX_op_or_i64:
 -ext = 1; /* fall through */
  case INDEX_op_or_i32:
  tcg_out_arith(s, ARITH_OR, ext, args[0], args[1], args[2], 0);
  break;
  
  case INDEX_op_xor_i64:
 -ext = 1; /* fall through */
  case INDEX_op_xor_i32:
  tcg_out_arith(s, ARITH_XOR, ext, args[0], args[1], args[2], 0);
  break;
  
  case INDEX_op_mul_i64:
 -ext = 1; /* fall through */
  case INDEX_op_mul_i32:
  tcg_out_mul(s, ext, args[0], args[1], args[2]);
  break;
  
  case INDEX_op_shl_i64:
 -ext = 1; /* fall through */
  case INDEX_op_shl_i32:
  if (const_args[2]) {/* LSL / UBFM Wd, Wn, (32 - m) */
  tcg_out_shl(s, ext, args[0], args[1], args[2]);
 @@ -1222,7 +1214,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_shr_i64:
 -ext = 1; /* fall through */
  case INDEX_op_shr_i32:
  if (const_args[2]) {/* LSR / UBFM Wd, Wn, m, 31 */
  tcg_out_shr(s, ext, args[0], args[1], args[2]);
 @@ -1232,7 +1223,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_sar_i64:
 -ext = 1; /* fall through */
  case INDEX_op_sar_i32:
  if (const_args[2]) {/* ASR / SBFM Wd, Wn, m, 31 */
  tcg_out_sar(s, ext, args[0], args[1], args[2]);
 @@ -1242,7 +1232,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_rotr_i64:
 -ext = 1; /* fall through */
  case INDEX_op_rotr_i32:
  if (const_args[2]) {/* ROR / EXTR Wd, Wm, Wm, m */
  tcg_out_rotr(s, ext, args[0], args[1], args[2]);
 @@ -1252,7 +1241,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_rotl_i64:
 -ext = 1; /* fall through */
  case INDEX_op_rotl_i32: /* same as rotate right by (32 - m) */
  if (const_args[2]) {/* ROR / EXTR Wd, Wm, Wm, 32 - m */
  tcg_out_rotl(s, ext, args[0], args[1], args[2]);
 @@ -1265,14 +1253,12 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  break;
  
  case INDEX_op_brcond_i64:
 -ext = 1; /* fall through */
  case INDEX_op_brcond_i32: /* CMP 0, 1, cond(2), label 3 */
  tcg_out_cmp(s, ext, args[0], args[1], 0);
  tcg_out_goto_label_cond(s, args[2], args[3]);
  break;
  
  case INDEX_op_setcond_i64:
 -ext = 1; /* fall through */
  case INDEX_op_setcond_i32:
  tcg_out_cmp(s, ext, args[1], args[2], 0);
  tcg_out_cset(s, 0, args[0], args[3]);

There's not point to change to 'bool' if you pass '0': either we keep as int, 
and we pass 0,
or we change to bool, and we pass 'false'.
There are instances of this also in successive patches, I point out only this 
one, but it should be checked in the whole series.

 @@ -1315,9 +1301,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,

Re: [Qemu-devel] [PATCH v3 02/29] tcg-aarch64: Change all ext variables to bool

2013-09-12 Thread Claudio Fontana

On 02.09.2013 19:54, Richard Henderson wrote:
 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
  tcg/aarch64/tcg-target.c | 44 ++--
  1 file changed, 22 insertions(+), 22 deletions(-)
 
 diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
 index 5b067fe..bde4c72 100644
 --- a/tcg/aarch64/tcg-target.c
 +++ b/tcg/aarch64/tcg-target.c
 @@ -326,7 +326,7 @@ static inline void tcg_out_ldst_12(TCGContext *s,
| op_type  20 | scaled_uimm  10 | rn  5 | rd);
  }
  
 -static inline void tcg_out_movr(TCGContext *s, int ext, TCGReg rd, TCGReg 
 src)
 +static inline void tcg_out_movr(TCGContext *s, bool ext, TCGReg rd, TCGReg 
 src)
  {
  /* register to register move using MOV (shifted register with no shift) 
 */
  /* using MOV 0x2a0003e0 | (shift).. */
 @@ -407,7 +407,7 @@ static inline void tcg_out_ldst(TCGContext *s, enum 
 aarch64_ldst_op_data data,
  }
  
  /* mov alias implemented with add immediate, useful to move to/from SP */
 -static inline void tcg_out_movr_sp(TCGContext *s, int ext, TCGReg rd, TCGReg 
 rn)
 +static inline void tcg_out_movr_sp(TCGContext *s, bool ext, TCGReg rd, 
 TCGReg rn)
  {
  /* using ADD 0x1100 | (ext) | rn  5 | rd */
  unsigned int base = ext ? 0x9100 : 0x1100;
 @@ -437,7 +437,7 @@ static inline void tcg_out_st(TCGContext *s, TCGType 
 type, TCGReg arg,
  }
  
  static inline void tcg_out_arith(TCGContext *s, enum aarch64_arith_opc opc,
 - int ext, TCGReg rd, TCGReg rn, TCGReg rm,
 + bool ext, TCGReg rd, TCGReg rn, TCGReg rm,
   int shift_imm)
  {
  /* Using shifted register arithmetic operations */
 @@ -453,7 +453,7 @@ static inline void tcg_out_arith(TCGContext *s, enum 
 aarch64_arith_opc opc,
  tcg_out32(s, base | rm  16 | shift | rn  5 | rd);
  }
  
 -static inline void tcg_out_mul(TCGContext *s, int ext,
 +static inline void tcg_out_mul(TCGContext *s, bool ext,
 TCGReg rd, TCGReg rn, TCGReg rm)
  {
  /* Using MADD 0x1b00 with Ra = wzr alias MUL 0x1b007c00 */
 @@ -462,7 +462,7 @@ static inline void tcg_out_mul(TCGContext *s, int ext,
  }
  
  static inline void tcg_out_shiftrot_reg(TCGContext *s,
 -enum aarch64_srr_opc opc, int ext,
 +enum aarch64_srr_opc opc, bool ext,
  TCGReg rd, TCGReg rn, TCGReg rm)
  {
  /* using 2-source data processing instructions 0x1ac02000 */
 @@ -470,7 +470,7 @@ static inline void tcg_out_shiftrot_reg(TCGContext *s,
  tcg_out32(s, base | rm  16 | opc  8 | rn  5 | rd);
  }
  
 -static inline void tcg_out_ubfm(TCGContext *s, int ext, TCGReg rd, TCGReg rn,
 +static inline void tcg_out_ubfm(TCGContext *s, bool ext, TCGReg rd, TCGReg 
 rn,
  unsigned int a, unsigned int b)
  {
  /* Using UBFM 0x5300 Wd, Wn, a, b */
 @@ -478,7 +478,7 @@ static inline void tcg_out_ubfm(TCGContext *s, int ext, 
 TCGReg rd, TCGReg rn,
  tcg_out32(s, base | a  16 | b  10 | rn  5 | rd);
  }
  
 -static inline void tcg_out_sbfm(TCGContext *s, int ext, TCGReg rd, TCGReg rn,
 +static inline void tcg_out_sbfm(TCGContext *s, bool ext, TCGReg rd, TCGReg 
 rn,
  unsigned int a, unsigned int b)
  {
  /* Using SBFM 0x1300 Wd, Wn, a, b */
 @@ -486,7 +486,7 @@ static inline void tcg_out_sbfm(TCGContext *s, int ext, 
 TCGReg rd, TCGReg rn,
  tcg_out32(s, base | a  16 | b  10 | rn  5 | rd);
  }
  
 -static inline void tcg_out_extr(TCGContext *s, int ext, TCGReg rd,
 +static inline void tcg_out_extr(TCGContext *s, bool ext, TCGReg rd,
  TCGReg rn, TCGReg rm, unsigned int a)
  {
  /* Using EXTR 0x1380 Wd, Wn, Wm, a */
 @@ -494,7 +494,7 @@ static inline void tcg_out_extr(TCGContext *s, int ext, 
 TCGReg rd,
  tcg_out32(s, base | rm  16 | a  10 | rn  5 | rd);
  }
  
 -static inline void tcg_out_shl(TCGContext *s, int ext,
 +static inline void tcg_out_shl(TCGContext *s, bool ext,
 TCGReg rd, TCGReg rn, unsigned int m)
  {
  int bits, max;
 @@ -503,28 +503,28 @@ static inline void tcg_out_shl(TCGContext *s, int ext,
  tcg_out_ubfm(s, ext, rd, rn, bits - (m  max), max - (m  max));
  }
  
 -static inline void tcg_out_shr(TCGContext *s, int ext,
 +static inline void tcg_out_shr(TCGContext *s, bool ext,
 TCGReg rd, TCGReg rn, unsigned int m)
  {
  int max = ext ? 63 : 31;
  tcg_out_ubfm(s, ext, rd, rn, m  max, max);
  }
  
 -static inline void tcg_out_sar(TCGContext *s, int ext,
 +static inline void tcg_out_sar(TCGContext *s, bool ext,
 TCGReg rd, TCGReg rn, unsigned int m)
  {
  int max = ext ? 63 : 31;
  tcg_out_sbfm(s, ext, rd, rn, m  max, max);
  }
  
 -static inline void

Re: [Qemu-devel] [PATCH v3 03/29] tcg-aarch64: Don't handle mov/movi in tcg_out_op

2013-09-12 Thread Claudio Fontana

On 02.09.2013 19:54, Richard Henderson wrote:
 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
  tcg/aarch64/tcg-target.c | 20 +++-
  1 file changed, 7 insertions(+), 13 deletions(-)
 
 diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
 index bde4c72..79a447d 100644
 --- a/tcg/aarch64/tcg-target.c
 +++ b/tcg/aarch64/tcg-target.c
 @@ -1162,18 +1162,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
   args[0], args[1], args[2]);
  break;
  
 -case INDEX_op_mov_i64:
 -case INDEX_op_mov_i32:
 -tcg_out_movr(s, ext, args[0], args[1]);
 -break;
 -
 -case INDEX_op_movi_i64:
 -tcg_out_movi(s, TCG_TYPE_I64, args[0], args[1]);
 -break;
 -case INDEX_op_movi_i32:
 -tcg_out_movi(s, TCG_TYPE_I32, args[0], args[1]);
 -break;
 -
  case INDEX_op_add_i64:
  case INDEX_op_add_i32:
  tcg_out_arith(s, ARITH_ADD, ext, args[0], args[1], args[2], 0);
 @@ -1337,8 +1325,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
  tcg_out_movr(s, 0, args[0], args[1]);
  break;
  
 +case INDEX_op_mov_i64:
 +case INDEX_op_mov_i32:
 +case INDEX_op_movi_i64:
 +case INDEX_op_movi_i32:
 +/* Always implemented with tcg_out_mov/i, never with tcg_out_op.  */
  default:
 -tcg_abort(); /* opcode not implemented */
 +/* Opcode not implemented.  */
 +tcg_abort();
  }
  }
  

Ok



-- 
Claudio Fontana
Server OS Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

office: +49 89 158834 4135
mobile: +49 15253060158

Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance

2013-09-12 Thread junqing . wang

hi,


At the moment in your implementation the prefetch buffer can be very large 
(several copies of guest memory size)  are you planning to address this 
issue? I agree but we need some way to notify the user of such problem.

This issue has been handled (maybe not in the best way).  The prefetch buffer 
size could be increased up to 1.5 * vm memory size. When the migration data 
size is larger than it, the prefetching is stopped with a warning (pls refer to 
the code 4/4) and the loading is started. In this situation, 
broken-in-the-middle problem is inevitable.

The second is that sadly live migration doesn't always converge this means  
that the backup VM won't have a consist state to failover to. You need to 
detect such a case and throttle down the guest to force convergence.
 
 Yes, that's a problem. AFAK, qemu already have an auto convergence feature.
 How about activating it when you do fault tolerance automatically?
That is feasible. Any comments by others?

[Qemu-devel] [PATCH] qemu-iotests: Cleanup test image in test number 007

2013-09-12 Thread Bharata B Rao

qemu-iotests number 007 doesn't do test image cleanup. This will affect
those protocols that expect a clean state before every test. Hence
ensure that test image is cleaned up in this test.

Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
---
 tests/qemu-iotests/007 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/007 b/tests/qemu-iotests/007
index c454f2c..6fa7603 100755
--- a/tests/qemu-iotests/007
+++ b/tests/qemu-iotests/007
@@ -30,7 +30,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
-#  _cleanup_test_img
+   _cleanup_test_img
true
 }
 trap _cleanup; exit \$status 0 1 2 3 15
-- 
1.7.11.7

[Qemu-devel] [PATCH 1/1] chardev: fix pty_chr_timer

2013-09-12 Thread Gerd Hoffmann

pty_chr_timer first calls pty_chr_update_read_handler(), then clears
timer_tag (because it is a one-shot timer).   This is the wrong order
though.  pty_chr_update_read_handler might re-arm time timer, and the
new timer_tag gets overwitten in that case.

This leads to crashes when unplugging a pty chardev:  pty_chr_close
thinks no timer is running - timer isn't canceled - pty_chr_timer gets
called with stale CharDevState - BOOM.

This patch fixes the ordering.
Kill the pointless goto while being at it.

https://bugzilla.redhat.com/show_bug.cgi?id=994414

Cc: qemu-sta...@nongnu.org
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 qemu-char.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index 6259496..f7f5464 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -1026,15 +1026,11 @@ static gboolean pty_chr_timer(gpointer opaque)
 struct CharDriverState *chr = opaque;
 PtyCharDriver *s = chr-opaque;
 
-if (s-connected) {
-goto out;
-}
-
-/* Next poll ... */
-pty_chr_update_read_handler(chr);
-
-out:
 s-timer_tag = 0;
+if (!s-connected) {
+/* Next poll ... */
+pty_chr_update_read_handler(chr);
+}
 return FALSE;
 }
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH] scsi: prefer UUID to VM name for the initiator name

2013-09-12 Thread Stefan Hajnoczi

On Tue, Sep 10, 2013 at 06:40:04PM +0200, Paolo Bonzini wrote:
 The UUID is unique even across multiple hosts, thus it is
 better than a VM name even if it is less user-friendly.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  block/iscsi.c   | 23 ---
  include/sysemu/sysemu.h |  2 ++
  stubs/Makefile.objs |  1 +
  stubs/uuid.c| 12 
  4 files changed, 31 insertions(+), 7 deletions(-)
  create mode 100644 stubs/uuid.c

Will changing the initiator name break existing setups/ACLs?

Stefan

[Qemu-devel] [PULL 0/1] chardev: fix pty_chr_timer

2013-09-12 Thread Gerd Hoffmann

  Hi,

Bugfix for the pty chardev which might segfault on unplug.
Has been on the list for quite a while, wasn't picked up so far,
so I'll try a single-patch-pull-request now ...

please pull,
  Gerd

The following changes since commit 2d1fe1873a984d1c2c89ffa3d12949cafc718551:

  Merge remote-tracking branch 'pmaydell/tags/pull-target-arm-20130910' into 
staging (2013-09-11 14:46:52 -0500)

are available in the git repository at:


  git://git.kraxel.org/qemu chardev.7

for you to fetch changes up to b0d768c35e08d2057b63e8e77e7a513c447199fa:

  chardev: fix pty_chr_timer (2013-09-12 09:58:18 +0200)


Gerd Hoffmann (1):
  chardev: fix pty_chr_timer

 qemu-char.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Marcel Apfelbaum

On Thu, 2013-09-12 at 09:49 +0200, Paolo Bonzini wrote:
 Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
  Qemu is expected to quit if the same boot index value is used by two 
  devices.
  However, hot-plugging a device with a bootindex value already used should
  fail with a friendly message rather than quitting a running VM.
 
 I think the problem is right where QEMU exits, i.e. in
 add_boot_device_path.  This function should return an error instead, via
 an Error ** argument.
 
 Callers, typically a device's init or realize function, will either
 print the error before returning an error code (e.g. -EBUSY for init) or
 propagate the error up (for realize).
Thanks, I'll try this.
Marcel

 Returning/propagating failure will still cause QEMU to exit when the
 duplicate bootindexes are found on the command line.
 
 Paolo
 
  Signed-off-by: Marcel Apfelbaum marce...@redhat.com
  ---
   qdev-monitor.c | 33 +
   1 file changed, 33 insertions(+)
  
  diff --git a/qdev-monitor.c b/qdev-monitor.c
  index 410cdcb..654d086 100644
  --- a/qdev-monitor.c
  +++ b/qdev-monitor.c
  @@ -24,6 +24,7 @@
   #include qmp-commands.h
   #include sysemu/arch_init.h
   #include qemu/config-file.h
  +#include sysemu/sysemu.h
   
   /*
* Aliases were a bad idea from the start.  Let's keep them
  @@ -442,6 +443,31 @@ static BusState *qbus_find(const char *path)
   }
   }
   
  +#define OBJ_PROP_BOOTINDEX bootindex
  +
  +static bool bootindex_colision(Object *obj, QemuOpts *opts)
  +{
  +int32_t bootindex;
  +
  +if (!object_property_find(obj, OBJ_PROP_BOOTINDEX, NULL)) {
  +return false;
  +}
  +
  +/* avoid parsing by setting the property and then getting it typed */
  +object_property_parse(obj, qemu_opt_get(opts, OBJ_PROP_BOOTINDEX),
  +  OBJ_PROP_BOOTINDEX, NULL);
  +bootindex = (int32_t)object_property_get_int(obj, OBJ_PROP_BOOTINDEX,
  + NULL);
  +
  +if (bootindex = 0) {
  +if (get_boot_device(bootindex)) {
  +return true;
  +}
  +}
  +
  +return false;
  +}
  +
   DeviceState *qdev_device_add(QemuOpts *opts)
   {
   ObjectClass *obj;
  @@ -502,6 +528,13 @@ DeviceState *qdev_device_add(QemuOpts *opts)
   /* create device, set properties */
   qdev = DEVICE(object_new(driver));
   
  +if (qdev_hotplug  bootindex_colision(OBJECT(qdev), opts)) {
  +qerror_report(QERR_INVALID_PARAMETER_VALUE,
  +  bootindex, an unused boot index value);
  +qdev_free(qdev);
  +return NULL;
  +}
  +
   if (bus) {
   qdev_set_parent_bus(qdev, bus);
   }

Re: [Qemu-devel] MSI-X doesn't work when running Windows as guest

2013-09-12 Thread Michael S. Tsirkin

On Thu, Sep 12, 2013 at 11:23:46AM +0300, Gal Hammer wrote:
 Hi,
 
 I've notice that the virtio-serial Windows' driver doesn't use MSI-X
 vectors when running using upstream qemu or
 qemu-kvm-1.2.2-13.fc18.x86_64. The same VM works with MSI-X when
 using qemu-kvm-0.12.1.2-2.355.el6.x86_64.
 
 From what I saw, Windows is trying to enable MSI-X by writing a 2
 bytes value to device's PCI-config address 66h.
 
 So when everything works well the flow goes like this:
 
 pci_default_write_config value: 8000 len: 2
 pci_default_write_config value: 1 len: 2
 msix_enabled 0 (67)
 pci_default_write_config value: e107 len: 2
 pci_default_write_config value: 1 len: 2
 msix_enabled 0 (67)
 pci_default_write_config value: 8001 len: 2
 msix_enabled 1 (67)
 
 But on upstream it goes:
 
 pci_default_write_config addr: 66 value: 8000 size: 2
 pci_default_write_config addr: 66 value: 1 size: 2
 msix_enabled 0 (67)
 pci_default_write_config addr: 66 value: e307 size: 2 (NOTE: Value
 is diffrent!).
 pci_default_write_config addr: 66 value: 1 size: 2
 msix_enabled 0 (67)
 
 (NOTE: Missing the write of 8001).
 
 My qemu's command line:
 
 --- snip ---
 
 /usr/bin/qemu-kvm -m 1G -smp 2 -enable-kvm -usb -device usb-tablet \
 -device
 ide-drive,drive=drive-virtio0-0-0,id=virtio0-0-0,bootindex=1 \
 -drive 
 file=win7_32_viorng.qcow2,if=none,id=drive-virtio0-0-0,format=qcow2,werror=stop,rerror=stop,cache=none
 \
 -monitor stdio \
 -vga qxl -spice id=on,disable-ticketing,port=5903 \
 -device virtio-serial-pci,id=virtio-serial0,vectors=2 \
 -chardev spicevmc,id=spicechannel0,name=vdagent
 
 --- snip ---
 
 Thanks,
 
 Gal.


So it's a known change from qemu-kvm to qemu.
With qemu-kvm the default cpu was kvm64.
With qemu the default cpu is qemu64 even if you use -enable-kvm.

Not an issue for libvirt as that specifies -cpu,
but will be an issue for command-line users.

Maybe we should change the default for new machine types and when
-enable-kvm is specified?

Eduardo?
Anthony?

-- 
MST

Re: [Qemu-devel] [PATCH] coroutine: add ./configure --disable-coroutine-pool

2013-09-12 Thread Stefan Hajnoczi

On Wed, Sep 11, 2013 at 05:38:07PM +0200, Kevin Wolf wrote:
 Am 11.09.2013 um 17:32 hat Gabriel Kerneis geschrieben:
  On Wed, Sep 11, 2013 at 05:24:55PM +0200, Kevin Wolf wrote:
   In config-host.make we do get:
   
   CONFIG_COROUTINE_POOL=0
   
   But when config-host.h is generated from it, I assume it's only checked
   if the variable is defined, so we end up with:
   
   #define CONFIG_COROUTINE_POOL 1
  
  Did you clean your tree?
  
  $ rm -rf *
  $ ../../configure --disable-coroutine-pool
  $ make config-host.h
  $ grep COROUTINE_POOL config-host.h
  #define CONFIG_COROUTINE_POOL 0
 
 My bad, I checked after configure, but before running make. We must have
 a weird build system that this isn't created during configure. :-)

Yes, this confused by too once or twice while writing this patch :).

The relevant code in scripts/create_config is:

 CONFIG_*=*) # configuration
name=${line%=*}
value=${line#*=}
echo #define $name $value
;;

Stefan

[Qemu-devel] [PATCH v3] e1000: NetClientInfo.receive_iov implemented

2013-09-12 Thread Vincenzo Maffione

This patch implements the NetClientInfo.receive_iov method for the
e1000 device emulation. In this way a network backend that uses
qemu_sendv_packet() can deliver the fragmented packet without
requiring an additional copy in the frontend/backend network code
(nc_sendv_compat() function).

The existing method NetClientInfo.receive has been reimplemented
using the new method.

Signed-off-by: Vincenzo Maffione v.maffi...@gmail.com
---
 hw/net/e1000.c | 70 --
 1 file changed, 58 insertions(+), 12 deletions(-)

I propose this patch also because our research group (University of Pisa,
Department of Computer Engineering) is working on the e1000 device
(optimizations and paravirtual extensions) and we have patches to
support the VALE switch as a network backend (see
http://info.iet.unipi.it/~luigi/vale/).
The VALE backend uses qemu_sendv_packet() to send fragmented packets: For
this reason we think it could be interesting to better support these packets
with e1000.

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index d3f274c..151d25e 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -32,6 +32,7 @@
 #include hw/loader.h
 #include sysemu/sysemu.h
 #include sysemu/dma.h
+#include qemu/iov.h
 
 #include e1000_regs.h
 
@@ -64,6 +65,8 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL);
 /* this is the size past which hardware will drop packets when setting LPE=1 */
 #define MAXIMUM_ETHERNET_LPE_SIZE 16384
 
+#define MAXIMUM_ETHERNET_HDR_LEN (14+4)
+
 /*
  * HW models:
  *  E1000_DEV_ID_82540EM works with Windows and Linux
@@ -899,7 +902,7 @@ static uint64_t rx_desc_base(E1000State *s)
 }
 
 static ssize_t
-e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size)
+e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
 {
 E1000State *s = qemu_get_nic_opaque(nc);
 PCIDevice *d = PCI_DEVICE(s);
@@ -908,8 +911,12 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, 
size_t size)
 unsigned int n, rdt;
 uint32_t rdh_start;
 uint16_t vlan_special = 0;
-uint8_t vlan_status = 0, vlan_offset = 0;
+uint8_t vlan_status = 0;
 uint8_t min_buf[MIN_BUF_SIZE];
+struct iovec min_iov;
+uint8_t *filter_buf = iov-iov_base;
+size_t size = iov_size(iov, iovcnt);
+size_t iov_ofs = 0;
 size_t desc_offset;
 size_t desc_size;
 size_t total_size;
@@ -924,10 +931,16 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, 
size_t size)
 
 /* Pad to minimum Ethernet frame length */
 if (size  sizeof(min_buf)) {
-memcpy(min_buf, buf, size);
+iov_to_buf(iov, iovcnt, 0, min_buf, size);
 memset(min_buf[size], 0, sizeof(min_buf) - size);
-buf = min_buf;
-size = sizeof(min_buf);
+min_iov.iov_base = filter_buf = min_buf;
+min_iov.iov_len = size = sizeof(min_buf);
+iovcnt = 1;
+iov = min_iov;
+} else if (iov-iov_len  MAXIMUM_ETHERNET_HDR_LEN) {
+/* This is very unlikely, but may happen. */
+iov_to_buf(iov, iovcnt, 0, min_buf, MAXIMUM_ETHERNET_HDR_LEN);
+filter_buf = min_buf;
 }
 
 /* Discard oversized packets if !LPE and !SBP. */
@@ -938,14 +951,24 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, 
size_t size)
 return size;
 }
 
-if (!receive_filter(s, buf, size))
+if (!receive_filter(s, filter_buf, size)) {
 return size;
+}
 
-if (vlan_enabled(s)  is_vlan_packet(s, buf)) {
-vlan_special = cpu_to_le16(be16_to_cpup((uint16_t *)(buf + 14)));
-memmove((uint8_t *)buf + 4, buf, 12);
+if (vlan_enabled(s)  is_vlan_packet(s, filter_buf)) {
+vlan_special = cpu_to_le16(be16_to_cpup((uint16_t *)(filter_buf
++ 14)));
+iov_ofs = 4;
+if (filter_buf == iov-iov_base) {
+memmove(filter_buf + 4, filter_buf, 12);
+} else {
+iov_from_buf(iov, iovcnt, 4, filter_buf, 12);
+while (iov-iov_len = iov_ofs) {
+iov_ofs -= iov-iov_len;
+iov++;
+}
+}
 vlan_status = E1000_RXD_STAT_VP;
-vlan_offset = 4;
 size -= 4;
 }
 
@@ -967,12 +990,23 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, 
size_t size)
 desc.status |= (vlan_status | E1000_RXD_STAT_DD);
 if (desc.buffer_addr) {
 if (desc_offset  size) {
+size_t iov_copy;
+hwaddr ba = le64_to_cpu(desc.buffer_addr);
 size_t copy_size = size - desc_offset;
 if (copy_size  s-rxbuf_size) {
 copy_size = s-rxbuf_size;
 }
-pci_dma_write(d, le64_to_cpu(desc.buffer_addr),
-  buf + desc_offset + vlan_offset, copy_size);
+do {
+iov_copy = MIN(copy_size, iov-iov_len - iov_ofs);
+

Re: [Qemu-devel] Ping http://patchwork.ozlabs.org/patch/251410/

2013-09-12 Thread Peter Maydell

On 12 September 2013 00:08, Richard Henderson r...@twiddle.net wrote:
 There have been two patches posted for this uninitialized warning, outstanding
 since June 14.  I still encounter this daily...

Looks like material for -trivial to me, cc'ing.

-- PMM

Re: [Qemu-devel] [PATCH] scsi: prefer UUID to VM name for the initiator name

2013-09-12 Thread Paolo Bonzini

Il 12/09/2013 10:38, Stefan Hajnoczi ha scritto:
 On Tue, Sep 10, 2013 at 06:40:04PM +0200, Paolo Bonzini wrote:
 The UUID is unique even across multiple hosts, thus it is
 better than a VM name even if it is less user-friendly.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  block/iscsi.c   | 23 ---
  include/sysemu/sysemu.h |  2 ++
  stubs/Makefile.objs |  1 +
  stubs/uuid.c| 12 
  4 files changed, 31 insertions(+), 7 deletions(-)
  create mode 100644 stubs/uuid.c
 
 Will changing the initiator name break existing setups/ACLs?

The username for authentication is separate from the initiator name.

It could break persistent reservations, but there is a workaround
(specify the initiator name manually through -iscsi).  I'll make sure
to document this in the release changelog.

Does it look good?

Paolo

Re: [Qemu-devel] [PATCH 4/4] qemu-img: conditionally discard target on convert

2013-09-12 Thread Peter Lieven


On 18.07.2013 16:14, Paolo Bonzini wrote:

Il 18/07/2013 15:55, ronnie sahlberg ha scritto:

bdrv-write_zeroes will use writesame16 and set the unmap flag only if
BDRV_MAY_DISCARD == 1 and BDRV_O_UNMAP == 1 and lbprz == 1.

When you use WRITESAME16 you can ignore the lbprz flag.
Just send a WRITESAME16 command with one block of data that is all set to zero.
If the unmap flag is set and if unmapped blocks read back the same as
the block you provided (all zero)
then it will unmap those blocks, where possible.

True, so the unmap flag can be set iff BDRV_MAY_DISCARD == 1.  block.c
can take care of checking BDRV_O_UNMAP.

Would you add BDRV_MAY_DISCARD to BdrvRequestFlags?

This would require making BdrvRequestFlags public.

Before I start implementation I would still like to verify if we need this
flag. E.g. in which case would we do not want to optimize writing zeroes
via discard if the user has set BDRV_O_UNMAP?

One case I would think of is sanitizing data. But in this case shouldn't this
be a different function? In case of a container format its also not guaranteed 
that the
contents are erased from the underlying media. Looking at the commit message
for f08f2ddae078e8a7683f8b16da8e0cc3029c7b89 it sounds that bdrv_write_zeroes
was intended to provide an efficient way to zero contents, but only in the sense
that it is guaranteed that they read back as zero (compared to bdrv_discard).

Peter

Re: [Qemu-devel] Ping http://patchwork.ozlabs.org/patch/251410/

2013-09-12 Thread Stefan Hajnoczi

On Wed, Sep 11, 2013 at 04:08:01PM -0700, Richard Henderson wrote:
 There have been two patches posted for this uninitialized warning, outstanding
 since June 14.  I still encounter this daily...

I'm still happy with the patch.

Michael: Do you want to take this through the trivial patches tree?

Stefan

Re: [Qemu-devel] [PATCH 4/4] qemu-img: conditionally discard target on convert

2013-09-12 Thread Paolo Bonzini

Il 12/09/2013 10:52, Peter Lieven ha scritto:
 On 18.07.2013 16:14, Paolo Bonzini wrote:
 Il 18/07/2013 15:55, ronnie sahlberg ha scritto:
 bdrv-write_zeroes will use writesame16 and set the unmap flag only if
 BDRV_MAY_DISCARD == 1 and BDRV_O_UNMAP == 1 and lbprz == 1.
 When you use WRITESAME16 you can ignore the lbprz flag.
 Just send a WRITESAME16 command with one block of data that is all
 set to zero.
 If the unmap flag is set and if unmapped blocks read back the same as
 the block you provided (all zero)
 then it will unmap those blocks, where possible.
 True, so the unmap flag can be set iff BDRV_MAY_DISCARD == 1.  block.c
 can take care of checking BDRV_O_UNMAP.
 Would you add BDRV_MAY_DISCARD to BdrvRequestFlags?
 
 This would require making BdrvRequestFlags public.
 
 Before I start implementation I would still like to verify if we need this
 flag. E.g. in which case would we do not want to optimize writing zeroes
 via discard if the user has set BDRV_O_UNMAP?

For example if the bdrv_write_zeroes is coming from a WRITE SAME command
(sent by the guest to an emulated SCSI disk).  In this case, we should
not discard if the UNMAP bit is zero in the request.

The WRITE SAME implementation in scsi-disk.c is old and broken, but the
block layer should provide enough API to make it right.

Paolo

 One case I would think of is sanitizing data. But in this case shouldn't
 this
 be a different function? In case of a container format its also not
 guaranteed that the
 contents are erased from the underlying media. Looking at the commit
 message
 for f08f2ddae078e8a7683f8b16da8e0cc3029c7b89 it sounds that
 bdrv_write_zeroes
 was intended to provide an efficient way to zero contents, but only in
 the sense
 that it is guaranteed that they read back as zero (compared to
 bdrv_discard).
 
 Peter

Re: [Qemu-devel] [PATCH v3 00/29] tcg-aarch64 improvements

2013-09-12 Thread Peter Maydell

On 12 September 2013 09:03, Claudio Fontana claudio.font...@huawei.com wrote:
 On 10.09.2013 10:45, Peter Maydell wrote:
 On 10 September 2013 09:27, Claudio Fontana claudio.font...@huawei.com 
 wrote:
 On another side, I end up having to manually revert some parts
 of these which you put as prerequisites, during bisection when
 landing after them, which is a huge time drain when tracking
 regressions introduced in the later part of the series.

 I don't understand this; can you explain? If these early
 refactoring patches have bugs then we should just identify
 them and fix them. If they don't have bugs why would you
 need to manually revert parts of them?

 To revert the next patches which do introduce bugs.

Huh? The next patches would apply on top of the refactoring
patches, so you don't need to remove the refactoring to
revert the functional changes. (On the other hand if we
did things the way round you're suggesting with the
functional changes first then we would need to revert
or manually undo the refactoring parts in order to
revert the functional change patches.)

Personally I think that first refactor/clean up, then
add new features/improvements is a fairly standard order
to do things.

-- PMM

Re: [Qemu-devel] [PATCH v3 01/29] tcg-aarch64: Set ext based on TCG_OPF_64BIT

2013-09-12 Thread Peter Maydell

On 12 September 2013 09:25, Claudio Fontana claudio.font...@huawei.com wrote:
 On 02.09.2013 19:54, Richard Henderson wrote:

 -case INDEX_op_bswap64_i64:
 -ext = 1; /* fall through */
  case INDEX_op_bswap32_i64:
 +/* Despite the _i64, this is a 32-bit bswap.  */
 +ext = 0;
 +/* FALLTHRU */
 +case INDEX_op_bswap64_i64:

 we waste too much y space here, which gives context and is a scarse resource.
 What about

 case INDEX_op_bswap32_i64: /* Despite the _i64, this is a 32-bit bswap.  */
 ext = false; /* FALLTHRU */

Consensus in the rest of the code is for /* fall through */
rather than /* FALLTHRU */ -- there's only 28 of the
latter compared to 169 of the former.

-- PMM

Re: [Qemu-devel] [PATCH v3 01/29] tcg-aarch64: Set ext based on TCG_OPF_64BIT

2013-09-12 Thread Claudio Fontana

On 12.09.2013 10:58, Peter Maydell wrote:
 On 12 September 2013 09:25, Claudio Fontana claudio.font...@huawei.com 
 wrote:
 On 02.09.2013 19:54, Richard Henderson wrote:

 -case INDEX_op_bswap64_i64:
 -ext = 1; /* fall through */
  case INDEX_op_bswap32_i64:
 +/* Despite the _i64, this is a 32-bit bswap.  */
 +ext = 0;
 +/* FALLTHRU */
 +case INDEX_op_bswap64_i64:

 we waste too much y space here, which gives context and is a scarse resource.
 What about

 case INDEX_op_bswap32_i64: /* Despite the _i64, this is a 32-bit bswap.  */
 ext = false; /* FALLTHRU */
 
 Consensus in the rest of the code is for /* fall through */
 rather than /* FALLTHRU */ -- there's only 28 of the
 latter compared to 169 of the former.
 

I like /* fall through */ better as well.

[Qemu-devel] [PATCH v4 2/3] qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe

2013-09-12 Thread Stefan Hajnoczi

Introduce QEMUTimerList-active_timers_lock to protect the linked list
of active timers.  This allows qemu_timer_mod_ns() to be called from any
thread.

Note that vm_clock is not thread-safe and its use of
qemu_clock_has_timers() works fine today but is also not thread-safe.

The purpose of this patch is to eventually let device models set or
cancel timers from a vcpu thread without holding the global mutex.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 include/qemu/timer.h | 17 ++
 qemu-timer.c | 87 
 2 files changed, 85 insertions(+), 19 deletions(-)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index e4934dd..b58903b 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -115,6 +115,10 @@ static inline int64_t qemu_clock_get_us(QEMUClockType type)
  * Determines whether a clock's default timer list
  * has timers attached
  *
+ * Note that this function should not be used when other threads also access
+ * the timer list.  The return value may be outdated by the time it is acted
+ * upon.
+ *
  * Returns: true if the clock's default timer list
  * has timers attached
  */
@@ -271,6 +275,10 @@ void timerlist_free(QEMUTimerList *timer_list);
  *
  * Determine whether a timer list has active timers
  *
+ * Note that this function should not be used when other threads also access
+ * the timer list.  The return value may be outdated by the time it is acted
+ * upon.
+ *
  * Returns: true if the timer list has timers.
  */
 bool timerlist_has_timers(QEMUTimerList *timer_list);
@@ -512,6 +520,9 @@ void timer_free(QEMUTimer *ts);
  * @ts: the timer
  *
  * Delete a timer from the active list.
+ *
+ * This function is thread-safe but the timer and its timer list must not be
+ * freed while this function is running.
  */
 void timer_del(QEMUTimer *ts);
 
@@ -521,6 +532,9 @@ void timer_del(QEMUTimer *ts);
  * @expire_time: the expiry time in nanoseconds
  *
  * Modify a timer to expire at @expire_time
+ *
+ * This function is thread-safe but the timer and its timer list must not be
+ * freed while this function is running.
  */
 void timer_mod_ns(QEMUTimer *ts, int64_t expire_time);
 
@@ -531,6 +545,9 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time);
  *
  * Modify a timer to expiry at @expire_time, taking into
  * account the scale associated with the timer.
+ *
+ * This function is thread-safe but the timer and its timer list must not be
+ * freed while this function is running.
  */
 void timer_mod(QEMUTimer *ts, int64_t expire_timer);
 
diff --git a/qemu-timer.c b/qemu-timer.c
index ed3fcb2..e504747 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -66,6 +66,7 @@ QEMUClock qemu_clocks[QEMU_CLOCK_MAX];
 
 struct QEMUTimerList {
 QEMUClock *clock;
+QemuMutex active_timers_lock;
 QEMUTimer *active_timers;
 QLIST_ENTRY(QEMUTimerList) list;
 QEMUTimerListNotifyCB *notify_cb;
@@ -101,6 +102,7 @@ QEMUTimerList *timerlist_new(QEMUClockType type,
 timer_list-clock = clock;
 timer_list-notify_cb = cb;
 timer_list-notify_opaque = opaque;
+qemu_mutex_init(timer_list-active_timers_lock);
 QLIST_INSERT_HEAD(clock-timerlists, timer_list, list);
 return timer_list;
 }
@@ -111,6 +113,7 @@ void timerlist_free(QEMUTimerList *timer_list)
 if (timer_list-clock) {
 QLIST_REMOVE(timer_list, list);
 }
+qemu_mutex_destroy(timer_list-active_timers_lock);
 g_free(timer_list);
 }
 
@@ -163,9 +166,17 @@ bool qemu_clock_has_timers(QEMUClockType type)
 
 bool timerlist_expired(QEMUTimerList *timer_list)
 {
-return (timer_list-active_timers 
-timer_list-active_timers-expire_time 
-qemu_clock_get_ns(timer_list-clock-type));
+int64_t expire_time;
+
+qemu_mutex_lock(timer_list-active_timers_lock);
+if (!timer_list-active_timers) {
+qemu_mutex_unlock(timer_list-active_timers_lock);
+return false;
+}
+expire_time = timer_list-active_timers-expire_time;
+qemu_mutex_unlock(timer_list-active_timers_lock);
+
+return expire_time  qemu_clock_get_ns(timer_list-clock-type);
 }
 
 bool qemu_clock_expired(QEMUClockType type)
@@ -182,13 +193,25 @@ bool qemu_clock_expired(QEMUClockType type)
 int64_t timerlist_deadline_ns(QEMUTimerList *timer_list)
 {
 int64_t delta;
+int64_t expire_time;
 
-if (!timer_list-clock-enabled || !timer_list-active_timers) {
+if (!timer_list-clock-enabled) {
 return -1;
 }
 
-delta = timer_list-active_timers-expire_time -
-qemu_clock_get_ns(timer_list-clock-type);
+/* The active timers list may be modified before the caller uses our return
+ * value but -notify_cb() is called when the deadline changes.  Therefore
+ * the caller should notice the change and there is no race condition.
+ */
+qemu_mutex_lock(timer_list-active_timers_lock);
+if (!timer_list-active_timers) {
+

[Qemu-devel] [PATCH v4 1/3] qemu-timer: drop outdated signal safety comments

2013-09-12 Thread Stefan Hajnoczi

host_alarm_handler() is invoked from the signal processing thread
(currently the iothread).  Previously we did processing in a real signal
handler with signalfd and therefore needed signal-safe timer code.

Today host_alarm_handler() just marks the alarm timer as expired/pending
and notifies the main loop using qemu_notify_event().

Therefore these outdated comments about signal safety can be dropped.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 qemu-timer.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 95ff47f..ed3fcb2 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -301,8 +301,6 @@ void timer_del(QEMUTimer *ts)
 {
 QEMUTimer **pt, *t;
 
-/* NOTE: this code must be signal safe because
-   timer_expired() can be called from a signal. */
 pt = ts-timer_list-active_timers;
 for(;;) {
 t = *pt;
@@ -325,8 +323,6 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 timer_del(ts);
 
 /* add the timer in the sorted list */
-/* NOTE: this code must be signal safe because
-   timer_expired() can be called from a signal. */
 pt = ts-timer_list-active_timers;
 for(;;) {
 t = *pt;
-- 
1.8.3.1

[Qemu-devel] [PATCH v4 3/3] qemu-timer: do not take the lock in timer_pending

2013-09-12 Thread Stefan Hajnoczi

From: Paolo Bonzini pbonz...@redhat.com

We can deduce the result from expire_time, by making it always -1 if
the timer is not in the active_timers list.  We need to check against
negative times passed to timer_mod_ns; clamping them to zero is not
a problem because the only clock that has a zero value at VM startup
is QEMU_CLOCK_VIRTUAL, and it is monotonic so it cannot be non-zero.
QEMU_CLOCK_HOST, instead, is not monotonic but it cannot go to negative
values unless the host time is seriously screwed up and points to
the 1960s.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 qemu-timer.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index e504747..6b62e88 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -312,6 +312,7 @@ void timer_init(QEMUTimer *ts,
 ts-cb = cb;
 ts-opaque = opaque;
 ts-scale = scale;
+ts-expire_time = -1;
 }
 
 void timer_free(QEMUTimer *ts)
@@ -323,6 +324,7 @@ static void timer_del_locked(QEMUTimerList *timer_list, 
QEMUTimer *ts)
 {
 QEMUTimer **pt, *t;
 
+ts-expire_time = -1;
 pt = timer_list-active_timers;
 for(;;) {
 t = *pt;
@@ -365,7 +367,7 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 }
 pt = t-next;
 }
-ts-expire_time = expire_time;
+ts-expire_time = MAX(expire_time, 0);
 ts-next = *pt;
 *pt = ts;
 qemu_mutex_unlock(timer_list-active_timers_lock);
@@ -385,19 +387,7 @@ void timer_mod(QEMUTimer *ts, int64_t expire_time)
 
 bool timer_pending(QEMUTimer *ts)
 {
-QEMUTimerList *timer_list = ts-timer_list;
-QEMUTimer *t;
-bool found = false;
-
-qemu_mutex_lock(timer_list-active_timers_lock);
-for (t = timer_list-active_timers; t != NULL; t = t-next) {
-if (t == ts) {
-found = true;
-break;
-}
-}
-qemu_mutex_unlock(timer_list-active_timers_lock);
-return found;
+return ts-expire_time = 0;
 }
 
 bool timer_expired(QEMUTimer *timer_head, int64_t current_time)
@@ -429,6 +419,7 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 /* remove timer from the list before calling the callback */
 timer_list-active_timers = ts-next;
 ts-next = NULL;
+ts-expire_time = -1;
 cb = ts-cb;
 opaque = ts-opaque;
 qemu_mutex_unlock(timer_list-active_timers_lock);
-- 
1.8.3.1

[Qemu-devel] [PATCH v4 0/3] qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe

2013-09-12 Thread Stefan Hajnoczi

v4:
 * Rebased  retested for easy review and merge
 * No code changes

v3:
 * Squashed Paolo's fixes and added his patch to avoid locking in 
timer_pending()

v2:
 * Rebased onto qemu.git/master following the merge of Alex's AioContext timers

The purpose of these patches is to eventually allow device models to set and
cancel timers without holding the global mutex.  When the device model runs in
a vcpu thread and the iothread processes timers, the
QEMUTimerList-active_timers must be protected from concurrent access.

Patch 1 is a clean-up.

Patch 2 is the entire change needed to protect -active_timers.

Patch 3 makes timer_pending() run without a lock.

Paolo Bonzini (1):
  qemu-timer: do not take the lock in timer_pending

Stefan Hajnoczi (2):
  qemu-timer: drop outdated signal safety comments
  qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe

 include/qemu/timer.h | 17 ++
 qemu-timer.c | 92 
 2 files changed, 81 insertions(+), 28 deletions(-)

-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 4/4] qemu-img: conditionally discard target on convert

2013-09-12 Thread Peter Lieven


On 12.09.2013 10:55, Paolo Bonzini wrote:

Il 12/09/2013 10:52, Peter Lieven ha scritto:

On 18.07.2013 16:14, Paolo Bonzini wrote:

Il 18/07/2013 15:55, ronnie sahlberg ha scritto:

bdrv-write_zeroes will use writesame16 and set the unmap flag only if
BDRV_MAY_DISCARD == 1 and BDRV_O_UNMAP == 1 and lbprz == 1.

When you use WRITESAME16 you can ignore the lbprz flag.
Just send a WRITESAME16 command with one block of data that is all
set to zero.
If the unmap flag is set and if unmapped blocks read back the same as
the block you provided (all zero)
then it will unmap those blocks, where possible.

True, so the unmap flag can be set iff BDRV_MAY_DISCARD == 1.  block.c
can take care of checking BDRV_O_UNMAP.

Would you add BDRV_MAY_DISCARD to BdrvRequestFlags?

This would require making BdrvRequestFlags public.

Is the BdrvRequestFlags the right place to put BRDV_REQ_MAY_UNMAP?


Before I start implementation I would still like to verify if we need this
flag. E.g. in which case would we do not want to optimize writing zeroes
via discard if the user has set BDRV_O_UNMAP?

For example if the bdrv_write_zeroes is coming from a WRITE SAME command
(sent by the guest to an emulated SCSI disk).  In this case, we should
not discard if the UNMAP bit is zero in the request.

The WRITE SAME implementation in scsi-disk.c is old and broken, but the
block layer should provide enough API to make it right.

Its actually calling discard regardless if the the buffer contents are zero.
So the outcome of this call is undefined.

case WRITE_SAME_16:
nb_sectors = scsi_data_cdb_length(r-req.cmd.buf);
if (bdrv_is_read_only(s-qdev.conf.bs)) {
scsi_check_condition(r, SENSE_CODE(WRITE_PROTECTED));
return 0;
}
if (!check_lba_range(s, r-req.cmd.lba, nb_sectors)) {
goto illegal_lba;
}

/*
 * We only support WRITE SAME with the unmap bit set for now.
 */
if (!(req-cmd.buf[1]  0x8)) {
goto illegal_request;
}

/* The request is used as the AIO opaque value, so add a ref.  */
scsi_req_ref(r-req);
r-req.aiocb = bdrv_aio_discard(s-qdev.conf.bs,
r-req.cmd.lba * (s-qdev.blocksize / 
512),
nb_sectors * (s-qdev.blocksize / 512),
scsi_aio_complete, r);
return 0;


Peter

Re: [Qemu-devel] [PATCH v6 4/8] module: implement module loading function

2013-09-12 Thread Daniel P. Berrange

On Thu, Sep 12, 2013 at 09:36:43AM +0400, Michael Tokarev wrote:
 12.09.2013 07:02, Fam Zheng wrote.
 On Wed, 09/11 11:46, Richard Henderson wrote:
 On 09/11/2013 08:48 AM, Daniel P. Berrange wrote:
 We know the precise list of valid modules when building QEMU,
 so IMHO, this should just explicitly load each known module
 name, and *not* readdir. Also it should do something along the
 lines suggested their of poisoning exported symbols with a
 build hash to guarantee the modules loaded match the original
 binary and that the symbols change on every rebuild.
 
 We need not mangle the symbols, which could be complicated to
 actually implement, and irritating to work around within gdb.
 
 Agree with this, some id or hash check should be enough.
 
 A solution which I proposed at the very beginning -- to export
 a hashed init function from modules, and call it from the
 main executable.  Like, instead of, say, qemu_module_init(),
 call qemu_module_init_0xdeadbeaf(), where 0xdeadbeaf is a
 hash of some build-dependent value.  This should be enough
 to keep it going.
 
 Ofcourse, if a module lacks this function, it should not be
 loaded.

Yep, that would be a reasonable way todo this. THe current
patches use attribute(constructor) so QEMU doesn't actually
call any explicit init function after dlopen()ing. That could
easily be changed though. 

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

[Qemu-devel] [RFC PATCH 0/8] Remove stub mon-protocol-event for block

2013-09-12 Thread Wenchao Xia

This series will remove the usage of symbols of mon-protocol-event in
qemu-img, qemu-nbd and qemu-io, in short remove the connetion for block
layer.

Background:
  I am tring to decouple block layer code with other unnnessary components,
and in ./stub there many symbols that qemu-img linked as fake implemtion.
As a first step, I am decouple monitor with block layer code, this is the
first part of it.
  There are still other stub symbols for monitor, which will be solved later.
It seems error handlering is also link with those symbols, and will adjust
that.

Wenchao Xia (8):
  1 block: use type MonitorEvent directly
  2 block: do not include monitor.h in block.c
  3 qapi: move MonitorEvent define
  4 qapi: rename MonitorEvent to QEvent
  5 block: add a callback layer for common functions
  6 block: replace monitor_protocol_event() with callback
  7 block: do not include monitor.h
  7 stubs: remove mon-protocol-event.o in stub obj

 block.c|   22 ++
 block/qcow2-refcount.c |4 +++-
 blockjob.c |   10 --
 include/block/block.h  |   12 
 include/block/block_int.h  |3 +--
 include/monitor/monitor.h  |   40 ++--
 include/qapi/qmp/qevent.h  |   41 +
 include/qapi/qmp/types.h   |1 +
 monitor.c  |   12 ++--
 stubs/Makefile.objs|1 -
 stubs/mon-protocol-event.c |2 +-
 tests/Makefile |3 ++-
 ui/vnc.c   |2 +-
 vl.c   |4 
 14 files changed, 100 insertions(+), 57 deletions(-)
 create mode 100644 include/qapi/qmp/qevent.h

[Qemu-devel] [RFC PATCH 1/8] block: use type MonitorEvent directly

2013-09-12 Thread Wenchao Xia

block_int.h included monitor.h, so it knows the typedef.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 block.c   |2 +-
 include/block/block_int.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index a325efc..2e2774d 100644
--- a/block.c
+++ b/block.c
@@ -1711,7 +1711,7 @@ void bdrv_set_dev_ops(BlockDriverState *bs, const 
BlockDevOps *ops,
 }
 
 void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
-   enum MonitorEvent ev,
+   MonitorEvent ev,
BlockErrorAction action, bool is_read)
 {
 QObject *data;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c35198..5de45a1 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -315,7 +315,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
 int is_windows_drive(const char *filename);
 #endif
 void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
-   enum MonitorEvent ev,
+   MonitorEvent ev,
BlockErrorAction action, bool is_read);
 
 /**
-- 
1.7.1

[Qemu-devel] [RFC PATCH 3/8] qapi: move MonitorEvent define

2013-09-12 Thread Wenchao Xia

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 include/monitor/monitor.h |   38 +-
 include/qapi/qmp/qevent.h |   41 +
 include/qapi/qmp/types.h  |1 +
 3 files changed, 43 insertions(+), 37 deletions(-)
 create mode 100644 include/qapi/qmp/qevent.h

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 10fa0e3..686c0eb 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -4,6 +4,7 @@
 #include qemu-common.h
 #include qapi/qmp/qerror.h
 #include qapi/qmp/qdict.h
+#include qapi/qmp/qevent.h
 #include block/block.h
 #include monitor/readline.h
 
@@ -19,43 +20,6 @@ extern Monitor *default_mon;
 /* flags for monitor commands */
 #define MONITOR_CMD_ASYNC   0x0001
 
-/* QMP events */
-typedef enum MonitorEvent {
-QEVENT_SHUTDOWN,
-QEVENT_RESET,
-QEVENT_POWERDOWN,
-QEVENT_STOP,
-QEVENT_RESUME,
-QEVENT_VNC_CONNECTED,
-QEVENT_VNC_INITIALIZED,
-QEVENT_VNC_DISCONNECTED,
-QEVENT_BLOCK_IO_ERROR,
-QEVENT_RTC_CHANGE,
-QEVENT_WATCHDOG,
-QEVENT_SPICE_CONNECTED,
-QEVENT_SPICE_INITIALIZED,
-QEVENT_SPICE_DISCONNECTED,
-QEVENT_BLOCK_JOB_COMPLETED,
-QEVENT_BLOCK_JOB_CANCELLED,
-QEVENT_BLOCK_JOB_ERROR,
-QEVENT_BLOCK_JOB_READY,
-QEVENT_DEVICE_DELETED,
-QEVENT_DEVICE_TRAY_MOVED,
-QEVENT_NIC_RX_FILTER_CHANGED,
-QEVENT_SUSPEND,
-QEVENT_SUSPEND_DISK,
-QEVENT_WAKEUP,
-QEVENT_BALLOON_CHANGE,
-QEVENT_SPICE_MIGRATE_COMPLETED,
-QEVENT_GUEST_PANICKED,
-QEVENT_BLOCK_IMAGE_CORRUPTED,
-
-/* Add to 'monitor_event_names' array in monitor.c when
- * defining new events here */
-
-QEVENT_MAX,
-} MonitorEvent;
-
 int monitor_cur_is_qmp(void);
 
 void monitor_protocol_event(MonitorEvent event, QObject *data);
diff --git a/include/qapi/qmp/qevent.h b/include/qapi/qmp/qevent.h
new file mode 100644
index 000..aef71d9
--- /dev/null
+++ b/include/qapi/qmp/qevent.h
@@ -0,0 +1,41 @@
+#ifndef QEVENT_H
+#define QEVENT_H
+
+/* QMP events */
+typedef enum MonitorEvent {
+QEVENT_SHUTDOWN,
+QEVENT_RESET,
+QEVENT_POWERDOWN,
+QEVENT_STOP,
+QEVENT_RESUME,
+QEVENT_VNC_CONNECTED,
+QEVENT_VNC_INITIALIZED,
+QEVENT_VNC_DISCONNECTED,
+QEVENT_BLOCK_IO_ERROR,
+QEVENT_RTC_CHANGE,
+QEVENT_WATCHDOG,
+QEVENT_SPICE_CONNECTED,
+QEVENT_SPICE_INITIALIZED,
+QEVENT_SPICE_DISCONNECTED,
+QEVENT_BLOCK_JOB_COMPLETED,
+QEVENT_BLOCK_JOB_CANCELLED,
+QEVENT_BLOCK_JOB_ERROR,
+QEVENT_BLOCK_JOB_READY,
+QEVENT_DEVICE_DELETED,
+QEVENT_DEVICE_TRAY_MOVED,
+QEVENT_NIC_RX_FILTER_CHANGED,
+QEVENT_SUSPEND,
+QEVENT_SUSPEND_DISK,
+QEVENT_WAKEUP,
+QEVENT_BALLOON_CHANGE,
+QEVENT_SPICE_MIGRATE_COMPLETED,
+QEVENT_GUEST_PANICKED,
+QEVENT_BLOCK_IMAGE_CORRUPTED,
+
+/* Add to 'monitor_event_names' array in monitor.c when
+ * defining new events here */
+
+QEVENT_MAX,
+} MonitorEvent;
+
+#endif
diff --git a/include/qapi/qmp/types.h b/include/qapi/qmp/types.h
index 7782ec5..ba317bf 100644
--- a/include/qapi/qmp/types.h
+++ b/include/qapi/qmp/types.h
@@ -21,5 +21,6 @@
 #include qapi/qmp/qdict.h
 #include qapi/qmp/qlist.h
 #include qapi/qmp/qjson.h
+#include qapi/qmp/qevent.h
 
 #endif /* QEMU_OBJECTS_H */
-- 
1.7.1

[Qemu-devel] [RFC PATCH 4/8] qapi: rename MonitorEvent to QEvent

2013-09-12 Thread Wenchao Xia

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 block.c|2 +-
 include/block/block_int.h  |2 +-
 include/monitor/monitor.h  |2 +-
 include/qapi/qmp/qevent.h  |4 ++--
 monitor.c  |   12 ++--
 stubs/mon-protocol-event.c |2 +-
 ui/vnc.c   |2 +-
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index a532eaa..07385bf 100644
--- a/block.c
+++ b/block.c
@@ -1710,7 +1710,7 @@ void bdrv_set_dev_ops(BlockDriverState *bs, const 
BlockDevOps *ops,
 }
 
 void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
-   MonitorEvent ev,
+   QEvent ev,
BlockErrorAction action, bool is_read)
 {
 QObject *data;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5de45a1..db2cb49 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -315,7 +315,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
 int is_windows_drive(const char *filename);
 #endif
 void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
-   MonitorEvent ev,
+   QEvent ev,
BlockErrorAction action, bool is_read);
 
 /**
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 686c0eb..97fcee3 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -22,7 +22,7 @@ extern Monitor *default_mon;
 
 int monitor_cur_is_qmp(void);
 
-void monitor_protocol_event(MonitorEvent event, QObject *data);
+void monitor_protocol_event(QEvent event, QObject *data);
 void monitor_init(CharDriverState *chr, int flags);
 
 int monitor_suspend(Monitor *mon);
diff --git a/include/qapi/qmp/qevent.h b/include/qapi/qmp/qevent.h
index aef71d9..dda9511 100644
--- a/include/qapi/qmp/qevent.h
+++ b/include/qapi/qmp/qevent.h
@@ -2,7 +2,7 @@
 #define QEVENT_H
 
 /* QMP events */
-typedef enum MonitorEvent {
+typedef enum QEvent {
 QEVENT_SHUTDOWN,
 QEVENT_RESET,
 QEVENT_POWERDOWN,
@@ -36,6 +36,6 @@ typedef enum MonitorEvent {
  * defining new events here */
 
 QEVENT_MAX,
-} MonitorEvent;
+} QEvent;
 
 #endif
diff --git a/monitor.c b/monitor.c
index 74f3f1b..9377834 100644
--- a/monitor.c
+++ b/monitor.c
@@ -175,7 +175,7 @@ typedef struct MonitorControl {
  * instance.
  */
 typedef struct MonitorEventState {
-MonitorEvent event; /* Event being tracked */
+QEvent event;   /* Event being tracked */
 int64_t rate;   /* Period over which to throttle. 0 to disable */
 int64_t last;   /* Time at which event was last emitted */
 QEMUTimer *timer;   /* Timer for handling delayed events */
@@ -517,7 +517,7 @@ QemuMutex monitor_event_state_lock;
  * Emits the event to every monitor instance
  */
 static void
-monitor_protocol_event_emit(MonitorEvent event,
+monitor_protocol_event_emit(QEvent event,
 QObject *data)
 {
 Monitor *mon;
@@ -536,7 +536,7 @@ monitor_protocol_event_emit(MonitorEvent event,
  * applying any rate limiting if required.
  */
 static void
-monitor_protocol_event_queue(MonitorEvent event,
+monitor_protocol_event_queue(QEvent event,
  QObject *data)
 {
 MonitorEventState *evstate;
@@ -614,7 +614,7 @@ static void monitor_protocol_event_handler(void *opaque)
  * milliseconds
  */
 static void
-monitor_protocol_event_throttle(MonitorEvent event,
+monitor_protocol_event_throttle(QEvent event,
 int64_t rate)
 {
 MonitorEventState *evstate;
@@ -650,7 +650,7 @@ static void monitor_protocol_event_init(void)
  *
  * Event-specific data can be emitted through the (optional) 'data' parameter.
  */
-void monitor_protocol_event(MonitorEvent event, QObject *data)
+void monitor_protocol_event(QEvent event, QObject *data)
 {
 QDict *qmp;
 const char *event_name;
@@ -1067,7 +1067,7 @@ CommandInfoList *qmp_query_commands(Error **errp)
 EventInfoList *qmp_query_events(Error **errp)
 {
 EventInfoList *info, *ev_list = NULL;
-MonitorEvent e;
+QEvent e;
 
 for (e = 0 ; e  QEVENT_MAX ; e++) {
 const char *event_name = monitor_event_names[e];
diff --git a/stubs/mon-protocol-event.c b/stubs/mon-protocol-event.c
index 0946e94..e769729 100644
--- a/stubs/mon-protocol-event.c
+++ b/stubs/mon-protocol-event.c
@@ -1,6 +1,6 @@
 #include qemu-common.h
 #include monitor/monitor.h
 
-void monitor_protocol_event(MonitorEvent event, QObject *data)
+void monitor_protocol_event(QEvent event, QObject *data)
 {
 }
diff --git a/ui/vnc.c b/ui/vnc.c
index 5601cc3..47fda54 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -275,7 +275,7 @@ static void vnc_client_cache_addr(VncState *client)
 client-info = QOBJECT(qdict);
 }
 
-static void vnc_qmp_event(VncState *vs, MonitorEvent event)
+static void vnc_qmp_event(VncState *vs, QEvent event)
 {
 QDict

[Qemu-devel] [RFC PATCH 8/8] stubs: remove mon-protocol-event.o in stub obj

2013-09-12 Thread Wenchao Xia

Now block layer do not use this symbol now, so qemg-img, qemu-io
and qemu-nbd do not link with this file any more. The test program
tests/test-qdev-global-props still need it, so add this obj in
rule of test/Makefile, and keeps the c file now.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 stubs/Makefile.objs |1 -
 tests/Makefile  |3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index f306cba..6fe1e3f 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -15,7 +15,6 @@ stub-obj-y += migr-blocker.o
 stub-obj-y += mon-is-qmp.o
 stub-obj-y += mon-printf.o
 stub-obj-y += mon-print-filename.o
-stub-obj-y += mon-protocol-event.o
 stub-obj-y += mon-set-error.o
 stub-obj-y += pci-drive-hot-add.o
 stub-obj-y += reset.o
diff --git a/tests/Makefile b/tests/Makefile
index c13fefc..7d3ee3b 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -134,7 +134,8 @@ tests/test-qdev-global-props$(EXESUF): 
tests/test-qdev-global-props.o \
hw/core/irq.o \
qom/object.o qom/container.o qom/qom-qobject.o \
$(test-qapi-obj-y) \
-   libqemuutil.a libqemustub.a
+   libqemuutil.a libqemustub.a \
+stubs/mon-protocol-event.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json 
$(SRC_PATH)/scripts/qapi-types.py
-- 
1.7.1

[Qemu-devel] [RFC PATCH 7/8] block: do not include monitor.h

2013-09-12 Thread Wenchao Xia

Block layer do not need it any more.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 include/block/block_int.h |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index db2cb49..ec92f16 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -31,7 +31,6 @@
 #include qemu/timer.h
 #include qapi-types.h
 #include qapi/qmp/qerror.h
-#include monitor/monitor.h
 #include qemu/hbitmap.h
 #include block/snapshot.h
 #include qemu/main-loop.h
-- 
1.7.1

[Qemu-devel] [RFC PATCH 2/8] block: do not include monitor.h in block.c

2013-09-12 Thread Wenchao Xia

block_int.h already included it.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 block.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/block.c b/block.c
index 2e2774d..a532eaa 100644
--- a/block.c
+++ b/block.c
@@ -24,7 +24,6 @@
 #include config-host.h
 #include qemu-common.h
 #include trace.h
-#include monitor/monitor.h
 #include block/block_int.h
 #include block/blockjob.h
 #include qemu/module.h
-- 
1.7.1

[Qemu-devel] [RFC PATCH 5/8] block: add a callback layer for common functions

2013-09-12 Thread Wenchao Xia

This structure can hold some call back functions, such as
event emit, error printf. By using call back, block layer
can be decoupled with other components.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 block.c   |7 +++
 include/block/block.h |   11 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 07385bf..576b86e 100644
--- a/block.c
+++ b/block.c
@@ -55,6 +55,13 @@ typedef enum {
 BDRV_REQ_ZERO_WRITE   = 0x2,
 } BdrvRequestFlags;
 
+BDRVCommonHooks bdrv_common_hooks;
+
+void bdrv_set_common_hooks(BDRVCommonHooks *hooks)
+{
+bdrv_common_hooks = *hooks;
+}
+
 static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
diff --git a/include/block/block.h b/include/block/block.h
index 728ec1a..7913f48 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -119,6 +119,17 @@ typedef struct BDRVReopenState {
 void *opaque;
 } BDRVReopenState;
 
+/*
+ * Now all block layer use same hooks, If needed it can be changed as per
+ * bds.
+ */
+typedef struct BDRVCommonHooks {
+void (*hooks)(void *);
+} BDRVCommonHooks;
+
+extern BDRVCommonHooks bdrv_common_hooks;
+
+void bdrv_set_common_hooks(BDRVCommonHooks *hooks);
 
 void bdrv_iostatus_enable(BlockDriverState *bs);
 void bdrv_iostatus_reset(BlockDriverState *bs);
-- 
1.7.1

[Qemu-devel] [RFC PATCH 6/8] block: replace monitor_protocol_event() with callback

2013-09-12 Thread Wenchao Xia

For code inside block layer, it is replaced with a call back
function. vl.c will tell block layer how to emit the event.

Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 block.c|   12 ++--
 block/qcow2-refcount.c |4 +++-
 blockjob.c |   10 --
 include/block/block.h  |3 ++-
 vl.c   |4 
 5 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 576b86e..5695064 100644
--- a/block.c
+++ b/block.c
@@ -1723,6 +1723,10 @@ void bdrv_emit_qmp_error_event(const BlockDriverState 
*bdrv,
 QObject *data;
 const char *action_str;
 
+if (!bdrv_common_hooks.event_emit) {
+return;
+}
+
 switch (action) {
 case BDRV_ACTION_REPORT:
 action_str = report;
@@ -1741,7 +1745,7 @@ void bdrv_emit_qmp_error_event(const BlockDriverState 
*bdrv,
   bdrv-device_name,
   action_str,
   is_read ? read : write);
-monitor_protocol_event(ev, data);
+bdrv_common_hooks.event_emit(ev, data);
 
 qobject_decref(data);
 }
@@ -1750,9 +1754,13 @@ static void bdrv_emit_qmp_eject_event(BlockDriverState 
*bs, bool ejected)
 {
 QObject *data;
 
+if (!bdrv_common_hooks.event_emit) {
+return;
+}
+
 data = qobject_from_jsonf({ 'device': %s, 'tray-open': %i },
   bdrv_get_device_name(bs), ejected);
-monitor_protocol_event(QEVENT_DEVICE_TRAY_MOVED, data);
+bdrv_common_hooks.event_emit(QEVENT_DEVICE_TRAY_MOVED, data);
 
 qobject_decref(data);
 }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ba129de..829083f 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1790,7 +1790,9 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, 
int chk, int64_t offset,
 data = qobject_from_jsonf({ 'device': %s, 'msg': %s, 'offset': %
 PRId64 , 'size': % PRId64  }, bs-device_name, message,
 offset, size);
-monitor_protocol_event(QEVENT_BLOCK_IMAGE_CORRUPTED, data);
+if (bdrv_common_hooks.event_emit) {
+bdrv_common_hooks.event_emit(QEVENT_BLOCK_IMAGE_CORRUPTED, data);
+}
 g_free(message);
 qobject_decref(data);
 
diff --git a/blockjob.c b/blockjob.c
index e7d49b7..7ff356c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -245,8 +245,14 @@ QObject *qobject_from_block_job(BlockJob *job)
 
 void block_job_ready(BlockJob *job)
 {
-QObject *data = qobject_from_block_job(job);
-monitor_protocol_event(QEVENT_BLOCK_JOB_READY, data);
+QObject *data;
+
+if (!bdrv_common_hooks.event_emit) {
+return;
+}
+
+data = qobject_from_block_job(job);
+bdrv_common_hooks.event_emit(QEVENT_BLOCK_JOB_READY, data);
 qobject_decref(data);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 7913f48..fadbf5b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -7,6 +7,7 @@
 #include block/coroutine.h
 #include qapi/qmp/qobject.h
 #include qapi-types.h
+#include qapi/qmp/qevent.h
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
@@ -124,7 +125,7 @@ typedef struct BDRVReopenState {
  * bds.
  */
 typedef struct BDRVCommonHooks {
-void (*hooks)(void *);
+void (*event_emit)(QEvent event, QObject *data);
 } BDRVCommonHooks;
 
 extern BDRVCommonHooks bdrv_common_hooks;
diff --git a/vl.c b/vl.c
index 4e709d5..a277598 100644
--- a/vl.c
+++ b/vl.c
@@ -2852,6 +2852,7 @@ int main(int argc, char **argv, char **envp)
 };
 const char *trace_events = NULL;
 const char *trace_file = NULL;
+BDRVCommonHooks bdrv_hooks;
 
 atexit(qemu_run_exit_notifiers);
 error_set_progname(argv[0]);
@@ -2915,6 +2916,9 @@ int main(int argc, char **argv, char **envp)
 nb_nics = 0;
 
 bdrv_init_with_whitelist();
+memset(bdrv_hooks, 0, sizeof(bdrv_hooks));
+bdrv_hooks.event_emit = monitor_protocol_event;
+bdrv_set_common_hooks(bdrv_hooks);
 
 autostart= 1;
 
-- 
1.7.1

Re: [Qemu-devel] [RFC PATCH 0/8] Remove stub mon-protocol-event for block

2013-09-12 Thread Paolo Bonzini

Il 12/09/2013 11:15, Wenchao Xia ha scritto:
 This series will remove the usage of symbols of mon-protocol-event in
 qemu-img, qemu-nbd and qemu-io, in short remove the connetion for block
 layer.
 
 Background:
   I am tring to decouple block layer code with other unnnessary components,
 and in ./stub there many symbols that qemu-img linked as fake implemtion.
 As a first step, I am decouple monitor with block layer code, this is the
 first part of it.
   There are still other stub symbols for monitor, which will be solved later.
 It seems error handlering is also link with those symbols, and will adjust
 that.
 
 Wenchao Xia (8):
   1 block: use type MonitorEvent directly
   2 block: do not include monitor.h in block.c
   3 qapi: move MonitorEvent define
   4 qapi: rename MonitorEvent to QEvent
   5 block: add a callback layer for common functions
   6 block: replace monitor_protocol_event() with callback
   7 block: do not include monitor.h
   7 stubs: remove mon-protocol-event.o in stub obj
 
  block.c|   22 ++
  block/qcow2-refcount.c |4 +++-
  blockjob.c |   10 --
  include/block/block.h  |   12 
  include/block/block_int.h  |3 +--
  include/monitor/monitor.h  |   40 ++--
  include/qapi/qmp/qevent.h  |   41 +
  include/qapi/qmp/types.h   |1 +
  monitor.c  |   12 ++--
  stubs/Makefile.objs|1 -
  stubs/mon-protocol-event.c |2 +-
  tests/Makefile |3 ++-
  ui/vnc.c   |2 +-
  vl.c   |4 
  14 files changed, 100 insertions(+), 57 deletions(-)
  create mode 100644 include/qapi/qmp/qevent.h
 

Patches 1-4 look good.  I'm not sure of the advantage of the last four,
however.  The ugly part of monitor_protocol_event is not really the
stub, but the dependency on QObject.

So, in my opinion a more interesting approach would be to describe
events using QAPI types.  Generating the events would require a small
amount of code to build QObjects manually, because the event syntax
doesn't match exactly a QAPI union, but that is only a technical detail.

Paolo

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Markus Armbruster

Paolo Bonzini pbonz...@redhat.com writes:

 Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
 Qemu is expected to quit if the same boot index value is used by two devices.
 However, hot-plugging a device with a bootindex value already used should
 fail with a friendly message rather than quitting a running VM.

 I think the problem is right where QEMU exits, i.e. in
 add_boot_device_path.  This function should return an error instead, via
 an Error ** argument.

Agree.

 Callers, typically a device's init or realize function, will either
 print the error before returning an error code (e.g. -EBUSY for init) or
 propagate the error up (for realize).

 Returning/propagating failure will still cause QEMU to exit when the
 duplicate bootindexes are found on the command line.

I have an unfinished patch in my tree that does exactly that.  It's
unfinished, because cleanup on error paths needs work.  Current state
appended with FIXMEs and all.  Beware, the FIXMEs may not be correct and
are almost certainly not complete.


diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index c5a6c21..f8b2b27 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2147,8 +2147,16 @@ static void isabus_fdc_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
-add_boot_device_path(isa-bootindexA, dev, /floppy@0);
-add_boot_device_path(isa-bootindexB, dev, /floppy@1);
+add_boot_device_path_err(isa-bootindexA, dev, /floppy@0, err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+add_boot_device_path_err(isa-bootindexB, dev, /floppy@1, err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
 }
 
 static void sysbus_fdc_initfn(Object *obj)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e2f55cc..de7ca31 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -678,6 +678,10 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
 return -1;
 }
 
+if (add_boot_device_path(s-conf-bootindex, qdev, /disk@0,0)  0) {
+return -1;
+}
+
 virtio_init(vdev, virtio-blk, VIRTIO_ID_BLOCK,
 sizeof(struct virtio_blk_config));
 
@@ -691,6 +695,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
 if (!virtio_blk_data_plane_create(vdev, blk, s-dataplane)) {
 virtio_cleanup(vdev);
+/* FIXME rm_boot_device_path() */
 return -1;
 }
 s-migration_state_notifier.notify = virtio_blk_migration_state_changed;
@@ -705,7 +710,6 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
 
 bdrv_iostatus_enable(s-bs);
 
-add_boot_device_path(s-conf-bootindex, qdev, /disk@0,0);
 return 0;
 }
 
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 7b3d3ee..c9568f5 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -687,12 +687,16 @@ int rom_add_file(const char *file, const char *fw_dir,
 snprintf(devpath, sizeof(devpath), /rom@ TARGET_FMT_plx, addr);
 }
 
-add_boot_device_path(bootindex, NULL, devpath);
+if (add_boot_device_path(bootindex, NULL, devpath)  0) {
+goto err_closed;
+/* FIXME undo rom_insert()? */
+}
 return 0;
 
 err:
 if (fd != -1)
 close(fd);
+err_closed:
 g_free(rom-data);
 g_free(rom-path);
 g_free(rom-name);
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 011764f..d532b21 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1813,9 +1813,7 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
 
 assigned_dev_load_option_rom(dev);
 
-add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL);
-
-return 0;
+return add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL);
 
 assigned_out:
 deassign_device(dev);
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 18c4b7e..557be55 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -166,10 +166,16 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
kind)
 return -1;
 }
 
+if (add_boot_device_path(dev-conf.bootindex, dev-qdev,
+ dev-unit ? /disk@1 : /disk@0)  0) {
+return -1;
+}
+
 if (ide_init_drive(s, dev-conf.bs, kind,
dev-version, dev-serial, dev-model, dev-wwn,
dev-conf.cyls, dev-conf.heads, dev-conf.secs,
dev-chs_trans)  0) {
+/* FIXME rm_boot_device_path() */
 return -1;
 }
 
@@ -180,9 +186,6 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
 dev-serial = g_strdup(s-drive_serial_str);
 }
 
-add_boot_device_path(dev-conf.bootindex, dev-qdev,
- dev-unit ? /disk@1 : /disk@0);
-
 return 0;
 }
 
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index a1c08fb..9c064ce 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -3185,7 +3185,10 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
-

Re: [Qemu-devel] vfio for platform devices - 9/5/2012 - minutes

2013-09-12 Thread Bhushan Bharat-R65777

 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Wednesday, September 11, 2013 10:45 PM
 To: Yoder Stuart-B08248
 Cc: Wood Scott-B07421; Sethi Varun-B16395; Bhushan Bharat-R65777; 'Peter
 Maydell'; 'Santosh Shukla'; 'Alexander Graf'; 'Antonios Motakis'; 'Christoffer
 Dall'; 'kim.phill...@linaro.org'; kvm...@lists.cs.columbia.edu; kvm-
 p...@vger.kernel.org; qemu-devel@nongnu.org
 Subject: Re: vfio for platform devices - 9/5/2012 - minutes

 On Wed, 2013-09-11 at 16:42 +, Yoder Stuart-B08248 wrote:

   -Original Message-
   From: Yoder Stuart-B08248
   Sent: Thursday, September 05, 2013 12:51 PM
   To: Wood Scott-B07421; Sethi Varun-B16395; Bhushan Bharat-R65777;
   'Peter Maydell'; 'Santosh Shukla'; 'Alex Williamson'; 'Alexander
   Graf'; 'Antonios Motakis'; 'Christoffer Dall'; 'kim.phill...@linaro.org'
   Cc: kvm...@lists.cs.columbia.edu; 'kvm-...@vger.kernel.org'; 'qemu-
   de...@nongnu.org'
   Subject: vfio for platform devices - 9/5/2012 - minutes

   We had a call with those interested and/or working on vfio for
   platform devices.

   Participants: Scott Wood, Varun Sethi, Bharat Bhushan, Peter Maydell,
 Santosh Shukla, Alex Williamson, Alexander Graf,
 Antonios Motakis, Christoffer Dall, Kim Phillips,
 Stuart Yoder

   Several aspects to vfio for platform devices:

   1. IOMMU groups

-iommu driver needs to register a bus notifier for the platform bus
 and create groups for relevant platform devices  -Antonios is
   looking at this for several ARM IOMMUs  -PAMU (Freescale) driver
   already does this

   2. unbinding device from host

PCI:
  echo :06:0d.0 
   /sys/bus/pci/devices/:06:0d.0/driver/unbind
Platform:
  echo ffe101300.dma 
   /sys/bus/platform/devices/ffe101300.dma/driver/unbind

-don't believe there are issues or work to do here

   3. binding device to vfio-platform driver

PCI:
  echo 1102 0002  /sys/bus/pci/drivers/vfio-pci/new_id

-this is probably the least understood issue-- platform drivers
 register themselves with the bus for a specific name
 string.  That is matched with device tree compatible strings
 later to bind a device to a driver  -we want is to have the
   vfio-platform driver to dynamically bind
 to a variety of platform devices previously unknown to
 vfio-platform
-ideally unbinding and binding could be an atomic operation  -Alex
   W pointed out that x86 could leverage this work so
 keep that in mind in what we design  -Kim Phillips (Linaro) will
   start working on this

  One thing we didn't discuss needs to be considered (probably by Kim
  who is looking at the 'binding device' issue) is around returning a
  passthru device back to the host.

  After a platform device has been bound to vfio and is in use by user
  space or a virtual machine, we also need to be able to unwind all that
  and return the device back to the host in a sane state.

  What happens when user space exits and the vfio file descriptors are
  closed?

 For reference, expectations of how vfio-pci handles these situations:

 For vfio-pci, when the reference count on the device fd drops to zero we call 
 a
 device disable function that includes disabling the bus master bit in config
 space stop ongoing DMA.

There is no bus mastering for platform devices and device disabling is not a 
standard like PCI. Do you think that we might need to do some device specific 
handling. For example for DMA controller, we need to atleast disable the DMA 
controller and mask its interrupts (may be ensure that there is no 
pending/running dma transaction, release irqs etc). As we are not yet having 
any linkage to respective kernel driver then I am not sure how we will do that 
specific handling?

  What if the device is still active and doing bus
  mastering?   (e.g. a VM crashed causing a QEMU
  exit)

 If the VM crashes the vfio fds get released resulting in the above opportunity
 for the vfio device driver to quiesce the device.

I think the quiesing of devices with we device specific, so the generic 
vfio=platform driver may not be able to handle that, right?

  How can the vfio-platform layer in the host kernel get a specific
  device in a sane state?

 It's not easy on pci either.  We save config space prior to exposing the 
 device
 and restore config space later, but it's not complete.  We mostly rely on 
 device
 (function) resets, to put things in a sane state, but those aren't always
 supported.

All platform devices also may not have reset capability (and if any then not 
generic way for all devices).

  I just introduced patches for v3.12 that enable a PCI bus reset
 interface, but it's mostly useful for userspace, since on PCI it's often the
 case that a bus contains multiple devices which don't necessarily align to 
 iommu
 group boundaries.

  When a plaform device is 'unbound'

[Qemu-devel] [PATCH v7 1/3] device_tree.c: Terminate the empty reservemap in create_device_tree()

2013-09-12 Thread Peter Maydell

Device trees created with create_device_tree() may not have any
entries in their reservemap, because the FDT API requires that the
reservemap is completed before any FDT nodes are added, and
create_device_tree() itself creates a node.  However we were not
calling fdt_finish_reservemap(), which meant that there was no
terminator in the reservemap list and whatever happened to be at the
start of the FDT data section would end up being interpreted as
reservemap entries.  Avoid this by calling fdt_finish_reservemap()
to add the terminator.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Acked-by: Alexander Graf ag...@suse.de
---
 device_tree.c |4 
 1 file changed, 4 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index ffec99a..391da8c 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -41,6 +41,10 @@ void *create_device_tree(int *sizep)
 if (ret  0) {
 goto fail;
 }
+ret = fdt_finish_reservemap(fdt);
+if (ret  0) {
+goto fail;
+}
 ret = fdt_begin_node(fdt, );
 if (ret  0) {
 goto fail;
-- 
1.7.9.5

[Qemu-devel] [PATCH v7 2/3] hw/arm/boot: Allow boards to provide an fdt blob

2013-09-12 Thread Peter Maydell

From: John Rigby john.ri...@linaro.org

If no fdt is provided on command line and the new field
get_dtb in struct arm_boot_info is set then call it to
get a device tree blob.

Signed-off-by: John Rigby john.ri...@linaro.org
[PMM: minor tweaks and cleanup]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/arm/boot.c|   32 
 include/hw/arm/arm.h |7 +++
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 1e313af..967397b 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -228,23 +228,31 @@ static void set_kernel_args_old(const struct 
arm_boot_info *info)
 static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
 {
 void *fdt = NULL;
-char *filename;
 int size, rc;
 uint32_t acells, scells;
 
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, binfo-dtb_filename);
-if (!filename) {
-fprintf(stderr, Couldn't open dtb file %s\n, binfo-dtb_filename);
-goto fail;
-}
+if (binfo-dtb_filename) {
+char *filename;
+filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, binfo-dtb_filename);
+if (!filename) {
+fprintf(stderr, Couldn't open dtb file %s\n, 
binfo-dtb_filename);
+goto fail;
+}
 
-fdt = load_device_tree(filename, size);
-if (!fdt) {
-fprintf(stderr, Couldn't open dtb file %s\n, filename);
+fdt = load_device_tree(filename, size);
+if (!fdt) {
+fprintf(stderr, Couldn't open dtb file %s\n, filename);
+g_free(filename);
+goto fail;
+}
 g_free(filename);
-goto fail;
+} else if (binfo-get_dtb) {
+fdt = binfo-get_dtb(binfo, size);
+if (!fdt) {
+fprintf(stderr, Board was unable to create a dtb blob\n);
+goto fail;
+}
 }
-g_free(filename);
 
 acells = qemu_devtree_getprop_cell(fdt, /, #address-cells);
 scells = qemu_devtree_getprop_cell(fdt, /, #size-cells);
@@ -436,7 +444,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 /* for device tree boot, we pass the DTB directly in r2. Otherwise
  * we point to the kernel args.
  */
-if (info-dtb_filename) {
+if (info-dtb_filename || info-get_dtb) {
 /* Place the DTB after the initrd in memory. Note that some
  * kernels will trash anything in the 4K page the initrd
  * ends in, so make sure the DTB isn't caught up in that.
diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
index ecbbba8..cbbf4ca 100644
--- a/include/hw/arm/arm.h
+++ b/include/hw/arm/arm.h
@@ -50,6 +50,13 @@ struct arm_boot_info {
  const struct arm_boot_info *info);
 void (*secondary_cpu_reset_hook)(ARMCPU *cpu,
  const struct arm_boot_info *info);
+/* if a board is able to create a dtb without a dtb file then it
+ * sets get_dtb. This will only be used if no dtb file is provided
+ * by the user. On success, sets *size to the length of the created
+ * dtb, and returns a pointer to it. (The caller must free this memory
+ * with g_free() when it has finished with it.) On failure, returns NULL.
+ */
+void *(*get_dtb)(const struct arm_boot_info *info, int *size);
 /* if a board needs to be able to modify a device tree provided by
  * the user it should implement this hook.
  */
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Marcel Apfelbaum

On Thu, 2013-09-12 at 11:43 +0200, Markus Armbruster wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
  Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
  Qemu is expected to quit if the same boot index value is used by two 
  devices.
  However, hot-plugging a device with a bootindex value already used should
  fail with a friendly message rather than quitting a running VM.
 
  I think the problem is right where QEMU exits, i.e. in
  add_boot_device_path.  This function should return an error instead, via
  an Error ** argument.
 
 Agree.
 
  Callers, typically a device's init or realize function, will either
  print the error before returning an error code (e.g. -EBUSY for init) or
  propagate the error up (for realize).
 
  Returning/propagating failure will still cause QEMU to exit when the
  duplicate bootindexes are found on the command line.
 
 I have an unfinished patch in my tree that does exactly that.  It's
 unfinished, because cleanup on error paths needs work.  Current state
 appended with FIXMEs and all.  Beware, the FIXMEs may not be correct and
 are almost certainly not complete.
Thanks Markus,
Should I use it as my starting point and finish it or you intend to?
Marcel

 
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index c5a6c21..f8b2b27 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -2147,8 +2147,16 @@ static void isabus_fdc_realize(DeviceState *dev, Error 
 **errp)
  return;
  }
  
 -add_boot_device_path(isa-bootindexA, dev, /floppy@0);
 -add_boot_device_path(isa-bootindexB, dev, /floppy@1);
 +add_boot_device_path_err(isa-bootindexA, dev, /floppy@0, err);
 +if (err) {
 +error_propagate(errp, err);
 +return;
 +}
 +add_boot_device_path_err(isa-bootindexB, dev, /floppy@1, err);
 +if (err) {
 +error_propagate(errp, err);
 +return;
 +}
  }
  
  static void sysbus_fdc_initfn(Object *obj)
 diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
 index e2f55cc..de7ca31 100644
 --- a/hw/block/virtio-blk.c
 +++ b/hw/block/virtio-blk.c
 @@ -678,6 +678,10 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
  return -1;
  }
  
 +if (add_boot_device_path(s-conf-bootindex, qdev, /disk@0,0)  0) {
 +return -1;
 +}
 +
  virtio_init(vdev, virtio-blk, VIRTIO_ID_BLOCK,
  sizeof(struct virtio_blk_config));
  
 @@ -691,6 +695,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
  #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
  if (!virtio_blk_data_plane_create(vdev, blk, s-dataplane)) {
  virtio_cleanup(vdev);
 +/* FIXME rm_boot_device_path() */
  return -1;
  }
  s-migration_state_notifier.notify = virtio_blk_migration_state_changed;
 @@ -705,7 +710,6 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
  
  bdrv_iostatus_enable(s-bs);
  
 -add_boot_device_path(s-conf-bootindex, qdev, /disk@0,0);
  return 0;
  }
  
 diff --git a/hw/core/loader.c b/hw/core/loader.c
 index 7b3d3ee..c9568f5 100644
 --- a/hw/core/loader.c
 +++ b/hw/core/loader.c
 @@ -687,12 +687,16 @@ int rom_add_file(const char *file, const char *fw_dir,
  snprintf(devpath, sizeof(devpath), /rom@ TARGET_FMT_plx, addr);
  }
  
 -add_boot_device_path(bootindex, NULL, devpath);
 +if (add_boot_device_path(bootindex, NULL, devpath)  0) {
 +goto err_closed;
 +/* FIXME undo rom_insert()? */
 +}
  return 0;
  
  err:
  if (fd != -1)
  close(fd);
 +err_closed:
  g_free(rom-data);
  g_free(rom-path);
  g_free(rom-name);
 diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
 index 011764f..d532b21 100644
 --- a/hw/i386/kvm/pci-assign.c
 +++ b/hw/i386/kvm/pci-assign.c
 @@ -1813,9 +1813,7 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
  
  assigned_dev_load_option_rom(dev);
  
 -add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL);
 -
 -return 0;
 +return add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL);
  
  assigned_out:
  deassign_device(dev);
 diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
 index 18c4b7e..557be55 100644
 --- a/hw/ide/qdev.c
 +++ b/hw/ide/qdev.c
 @@ -166,10 +166,16 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
 kind)
  return -1;
  }
  
 +if (add_boot_device_path(dev-conf.bootindex, dev-qdev,
 + dev-unit ? /disk@1 : /disk@0)  0) {
 +return -1;
 +}
 +
  if (ide_init_drive(s, dev-conf.bs, kind,
 dev-version, dev-serial, dev-model, dev-wwn,
 dev-conf.cyls, dev-conf.heads, dev-conf.secs,
 dev-chs_trans)  0) {
 +/* FIXME rm_boot_device_path() */
  return -1;
  }
  
 @@ -180,9 +186,6 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
 kind)
  dev-serial = g_strdup(s-drive_serial_str);
  }
  
 -add_boot_device_path(dev-conf.bootindex, dev-qdev,
 -

[Qemu-devel] [PATCH v7 0/3] hw/arm: Add 'virt' platform

2013-09-12 Thread Peter Maydell

This patch series adds a 'virt' platform which uses the
kernel's mach-virt (fully device-tree driven) support
to create a simple minimalist platform intended for
use for KVM VM guests.

The major change here is that I've added a PL011 UART.

Sample command line:

 qemu-system-arm -machine type=virt -display none \
  -kernel zImage \
  -append 'root=/dev/vda rw console=ttyAMA0 rootwait'
  -cpu cortex-a15 \
  -device virtio-blk-device,drive=foo \
  -drive if=none,file=arm-wheezy.img,id=foo \
  -m 2048 -serial stdio

Note that there is no earlyprintk via the PL011 because
there's no defined device tree binding for hey, here
is your earlyprintk UART.


*** NOTE *** to get the PL011 to work you'll need to
tweak the kernel a bit:

diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c
index b184e57..2b6aceb 100644
--- a/arch/arm/mach-virt/virt.c
+++ b/arch/arm/mach-virt/virt.c
@@ -21,11 +21,13 @@
 #include linux/of_irq.h
 #include linux/of_platform.h
 #include linux/smp.h
+#include linux/clk-provider.h
 
 #include asm/mach/arch.h
 
 static void __init virt_init(void)
 {
+   of_clk_init(NULL);
of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
 }
 

Otherwise the kernel doesn't ever add the clock to its
list, and then it refuses to probe for the PL011.
(I'm told this isn't really the right fix, though, and
ideally the call should be done in some generic location
rather than in every machine's init function.)

The alternative would be for the kernel to be fixed to
follow its own device tree binding documentation and not
require clocks/clock-names properties on the pl011 node.


Changes from John Rigby's v3-my v4:
 * renamed user-facing machine to just virt
 * removed the A9 support (it can't work since the A9 has no
   generic timers)
 * added virtio-mmio transports instead of random set of 'soc' devices
 * instead of updating io_base as we step through adding devices,
   define a memory map with an array (similar to vexpress)
 * folded in some minor fixes from John's aarch64-support patch
 * rather than explicitly doing endian-swapping on FDT cells,
   use fdt APIs that let us just pass in host-endian values
   and let the fdt layer take care of the swapping
 * miscellaneous minor code cleanups and style fixes
Changes v4-v5:
 * removed outdated TODO remarks from commit messages
Changes v5-v6:
 * adjusted the memory map as per Anup's review comments
   (actually made the changes this time!)
Changes v6-v7:
 * added a PL011 UART, at Alex's suggestion (and the accompanying
   fake clock dtb node that this requires)
 * added an irqmap[] in parallel with the memmap[] so that our
   assignment of devices to irq lines is neatly in one place
 * the removal of arm_pic allows us to get rid of an irritating
   array sized to the number of CPUs
 * included the terminate dtb reservemap patch since it's a
   dependency to get the kernel to boot

John Rigby (1):
  hw/arm/boot: Allow boards to provide an fdt blob

Peter Maydell (2):
  device_tree.c: Terminate the empty reservemap in create_device_tree()
  hw/arm: Add 'virt' platform

 device_tree.c|4 +
 hw/arm/Makefile.objs |2 +-
 hw/arm/boot.c|   32 ++--
 hw/arm/virt.c|  419 ++
 include/hw/arm/arm.h |7 +
 5 files changed, 451 insertions(+), 13 deletions(-)
 create mode 100644 hw/arm/virt.c

-- 
1.7.9.5

[Qemu-devel] [PATCH v7 3/3] hw/arm: Add 'virt' platform

2013-09-12 Thread Peter Maydell

Add 'virt' platform support corresponding to arch/arm/mach-virt
in the Linux kernel tree. This has no platform-specific code but
can use any device whose kernel driver is is able to work purely
from a device tree node. We use this to instantiate a minimal
set of devices: a GIC and some virtio-mmio transports.

Signed-off-by: John Rigby john.ri...@linaro.org
[PMM:
 Significantly overhauled:
 * renamed user-facing machine to just virt
 * removed the A9 support (it can't work since the A9 has no
   generic timers)
 * added virtio-mmio transports instead of random set of 'soc' devices
   (though we retain a pl011 UART)
 * instead of updating io_base as we step through adding devices,
   define a memory map with an array (similar to vexpress)
 * similarly, define irqmap with an array
 * folded in some minor fixes from John's aarch64-support patch
 * rather than explicitly doing endian-swapping on FDT cells,
   use fdt APIs that let us just pass in host-endian values
   and let the fdt layer take care of the swapping
 * miscellaneous minor code cleanups and style fixes
]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/arm/Makefile.objs |2 +-
 hw/arm/virt.c|  419 ++
 2 files changed, 420 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/virt.c

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 3671b42..78b5614 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -1,7 +1,7 @@
 obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o
 obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
-obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o
+obj-y += tosa.o versatilepb.o vexpress.o virt.o xilinx_zynq.o z2.o
 
 obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
 obj-y += omap1.o omap2.o strongarm.o
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
new file mode 100644
index 000..448a0e5
--- /dev/null
+++ b/hw/arm/virt.c
@@ -0,0 +1,419 @@
+/*
+ * ARM mach-virt emulation
+ *
+ * Copyright (c) 2013 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see http://www.gnu.org/licenses/.
+ *
+ * Emulate a virtual board which works by passing Linux all the information
+ * it needs about what devices are present via the device tree.
+ * There are some restrictions about what we can do here:
+ *  + we can only present devices whose Linux drivers will work based
+ *purely on the device tree with no platform data at all
+ *  + we want to present a very stripped-down minimalist platform,
+ *both because this reduces the security attack surface from the guest
+ *and also because it reduces our exposure to being broken when
+ *the kernel updates its device tree bindings and requires further
+ *information in a device binding that we aren't providing.
+ * This is essentially the same approach kvmtool uses.
+ */
+
+#include hw/sysbus.h
+#include hw/arm/arm.h
+#include hw/arm/primecell.h
+#include hw/devices.h
+#include net/net.h
+#include sysemu/device_tree.h
+#include sysemu/sysemu.h
+#include sysemu/kvm.h
+#include hw/boards.h
+#include exec/address-spaces.h
+#include qemu/bitops.h
+#include qemu/error-report.h
+
+#define NUM_VIRTIO_TRANSPORTS 32
+
+/* Number of external interrupt lines to configure the GIC with */
+#define NUM_IRQS 128
+
+#define GIC_FDT_IRQ_TYPE_SPI 0
+#define GIC_FDT_IRQ_TYPE_PPI 1
+
+#define GIC_FDT_IRQ_FLAGS_EDGE_LO_HI 1
+#define GIC_FDT_IRQ_FLAGS_EDGE_HI_LO 2
+#define GIC_FDT_IRQ_FLAGS_LEVEL_HI 4
+#define GIC_FDT_IRQ_FLAGS_LEVEL_LO 8
+
+#define GIC_FDT_IRQ_PPI_CPU_START 8
+#define GIC_FDT_IRQ_PPI_CPU_WIDTH 8
+
+enum {
+VIRT_FLASH,
+VIRT_MEM,
+VIRT_CPUPERIPHS,
+VIRT_GIC_DIST,
+VIRT_GIC_CPU,
+VIRT_UART,
+VIRT_MMIO,
+};
+
+typedef struct MemMapEntry {
+hwaddr base;
+hwaddr size;
+} MemMapEntry;
+
+typedef struct VirtBoardInfo {
+struct arm_boot_info bootinfo;
+const char *cpu_model;
+const char *cpu_compatible;
+const char *qdevname;
+const char *gic_compatible;
+const MemMapEntry *memmap;
+const int *irqmap;
+int smp_cpus;
+void *fdt;
+int fdt_size;
+uint32_t clock_phandle;
+} VirtBoardInfo;
+
+/* Addresses and sizes of our components.
+ * We leave the first 64K free for possible use later for
+ * flash (for running boot code such as UEFI); following
+ * that is I/O, and then

Re: [Qemu-devel] Disabling IRQ error

2013-09-12 Thread Xie Xianshan


Dear Max,
Does it mean an IRQ to be edge-triggered?
   No, it is a level-sensitive and active-high interrupt.
This is why i tried to use qemu_irq_raise() to trigger IRQ.


Thanks,
Simen



Hi Max,
 Thanks for your patience and help.
 I`ve tried to do what you said, but the problem doesn`t go away.
And actually i cannot add a new register to the fpga device, because the
fpga device i`m emulating already exists in the real world.
So i cannot change anything about hardware properties and linux driver for
this device.


Then I don't get its IRQ logic. Does it mean an IRQ to be edge-triggered?
Its model should use qemu_irq_pulse then instead of qemu_irq_raise and
its IRQ line should be connected to edge-sensing input of interrupt controller.
Input that you have used is also used to sense PCI IRQ and is level-sensing.


 By the way, how did you finally fix your problem?


I didn't have any. (:

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Markus Armbruster

Marcel Apfelbaum marce...@redhat.com writes:

 On Thu, 2013-09-12 at 11:43 +0200, Markus Armbruster wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
  Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
  Qemu is expected to quit if the same boot index value is used by
  two devices.
  However, hot-plugging a device with a bootindex value already used should
  fail with a friendly message rather than quitting a running VM.
 
  I think the problem is right where QEMU exits, i.e. in
  add_boot_device_path.  This function should return an error instead, via
  an Error ** argument.
 
 Agree.
 
  Callers, typically a device's init or realize function, will either
  print the error before returning an error code (e.g. -EBUSY for init) or
  propagate the error up (for realize).
 
  Returning/propagating failure will still cause QEMU to exit when the
  duplicate bootindexes are found on the command line.
 
 I have an unfinished patch in my tree that does exactly that.  It's
 unfinished, because cleanup on error paths needs work.  Current state
 appended with FIXMEs and all.  Beware, the FIXMEs may not be correct and
 are almost certainly not complete.
 Thanks Markus,
 Should I use it as my starting point and finish it or you intend to?

If you have cycles to spare, you're quite welcome to take this patch and
run with it!

You may have noticed that my patch moves the code to add the boot device
path in a few cases.  I did this in the hope of simplifying error paths.
Do not hesitate to undo such moves where they turn out not to simplify
anything.

Here's an issue that exists before my patch: DeviceClass method
unrealize() should clean up everything done by realize().  In
particular, unrealize() needs to remove any entry added to fw_boot_order
by realize() via add_boot_device_path().  Code to do that doesn't exist,
yet.  Hot unplug is technically broken for all devices with bootindex.
Impact unknown.  Should probably be fixed in a separate patch.

Re: [Qemu-devel] Disabling IRQ error

2013-09-12 Thread Max Filippov

On Thu, Sep 12, 2013 at 2:51 PM, Xie Xianshan xi...@cn.fujitsu.com wrote:
 Dear Max,

 Does it mean an IRQ to be edge-triggered?
No, it is a level-sensitive and active-high interrupt.
 This is why i tried to use qemu_irq_raise() to trigger IRQ.

Ok, back to your original question:

 I`m getting the nobody cared disabling IRQ error, when i raised external 
 interrupts IRQ3 to the Openpic in QEMU.

Your linux driver should return anything except IRQ_NONE from its ISR
when it detects IRQ from your device. As I understand once you raise IRQ
you don't lower it, so the driver must always return anything except
IRQ_NONE. If it doesn't do so you will see the error message above.

-- 
Thanks.
-- Max

[Qemu-devel] [PULL 00/11] SCSI patches for 2013-09-11

2013-09-12 Thread Paolo Bonzini

Anthony,

The following changes since commit 2d1fe1873a984d1c2c89ffa3d12949cafc718551:

  Merge remote-tracking branch 'pmaydell/tags/pull-target-arm-20130910' into 
staging (2013-09-11 14:46:52 -0500)

are available in the git repository at:


  git://github.com/bonzini/qemu.git scsi-next

for you to fetch changes up to f4ff3b7ba1bcb77d5b5cdbd6e695df793761170b:

  spapr-vscsi: Report error on unsupported MAD requests (2013-09-12 13:15:54 
+0200)


Alexey Kardashevskiy (2):
  spapr-vscsi: add task management
  spapr-vscsi: Report error on unsupported MAD requests

Markus Armbruster (2):
  virtio-scsi: Make type virtio-scsi-common abstract
  scsi: Fix scsi_bus_legacy_add_drive() scsi-generic with serial

Nikunj A. Dadhania (1):
  spapr-vscsi: Adding VSCSI capabilities

Paolo Bonzini (1):
  scsi: prefer UUID to VM name for the initiator name

Peter Lieven (3):
  iscsi: add logical block provisioning information to iscsilun
  iscsi: add .bdrv_get_block_status
  iscsi: split discard requests in multiple parts

Peter Maydell (2):
  hw/scsi/lsi53c895a: Use sextract32 for sign-extension
  hw/scsi/lsi53c895a: Use deposit32 rather than handcoded shift/mask

 block/iscsi.c   | 392 +---
 hw/scsi/lsi53c895a.c|  19 +--
 hw/scsi/scsi-bus.c  |   2 +-
 hw/scsi/spapr_vscsi.c   | 195 
 hw/scsi/srp.h   |   7 +
 hw/scsi/virtio-scsi.c   |   1 +
 include/sysemu/sysemu.h |   2 +
 stubs/Makefile.objs |   1 +
 stubs/uuid.c|  12 ++
 9 files changed, 496 insertions(+), 135 deletions(-)
 create mode 100644 stubs/uuid.c
-- 
1.8.3.1

[Qemu-devel] [PULL 05/11] hw/scsi/lsi53c895a: Use sextract32 for sign-extension

2013-09-12 Thread Paolo Bonzini

From: Peter Maydell peter.mayd...@linaro.org

Use sextract32() for doing sign-extension rather than rolling
our own implementation.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/lsi53c895a.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 0c36842..a26b20b 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -998,12 +998,6 @@ bad:
 s-msg_action = 0;
 }
 
-/* Sign extend a 24-bit value.  */
-static inline int32_t sxt24(int32_t n)
-{
-return (n  8)  8;
-}
-
 #define LSI_BUF_SIZE 4096
 static void lsi_memcpy(LSIState *s, uint32_t dest, uint32_t src, int count)
 {
@@ -1083,7 +1077,7 @@ again:
 /* Table indirect addressing.  */
 
 /* 32-bit Table indirect */
-offset = sxt24(addr);
+offset = sextract32(addr, 0, 24);
 pci_dma_read(pci_dev, s-dsa + offset, buf, 8);
 /* byte count is stored in bits 0:23 only */
 s-dbc = cpu_to_le32(buf[0])  0xff;
@@ -1183,13 +1177,13 @@ again:
 uint32_t id;
 
 if (insn  (1  25)) {
-id = read_dword(s, s-dsa + sxt24(insn));
+id = read_dword(s, s-dsa + sextract32(insn, 0, 24));
 } else {
 id = insn;
 }
 id = (id  16)  0xf;
 if (insn  (1  26)) {
-addr = s-dsp + sxt24(addr);
+addr = s-dsp + sextract32(addr, 0, 24);
 }
 s-dnad = addr;
 switch (opcode) {
@@ -1385,7 +1379,7 @@ again:
 if (cond == jmp) {
 if (insn  (1  23)) {
 /* Relative address.  */
-addr = s-dsp + sxt24(addr);
+addr = s-dsp + sextract32(addr, 0, 24);
 }
 switch ((insn  27)  7) {
 case 0: /* Jump */
@@ -1438,7 +1432,7 @@ again:
 int i;
 
 if (insn  (1  28)) {
-addr = s-dsa + sxt24(addr);
+addr = s-dsa + sextract32(addr, 0, 24);
 }
 n = (insn  7);
 reg = (insn  16)  0xff;
-- 
1.8.3.1

[Qemu-devel] [PULL 06/11] hw/scsi/lsi53c895a: Use deposit32 rather than handcoded shift/mask

2013-09-12 Thread Paolo Bonzini

From: Peter Maydell peter.mayd...@linaro.org

Use deposit32() rather than handcoded shifts/masks to update the
scratch registers. This is cleaner and incidentally avoids a clang
sanitizer complaint (runtime error: left shift of 255 by 24 places
cannot be represented in type 'int').

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/lsi53c895a.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index a26b20b..5affc82 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1870,8 +1870,7 @@ static void lsi_reg_writeb(LSIState *s, int offset, 
uint8_t val)
 int shift;
 n = (offset - 0x58)  2;
 shift = (offset  3) * 8;
-s-scratch[n] = ~(0xff  shift);
-s-scratch[n] |= (val  0xff)  shift;
+s-scratch[n] = deposit32(s-scratch[n], shift, 8, val);
 } else {
 BADF(Unhandled writeb 0x%x = 0x%x\n, offset, val);
 }
-- 
1.8.3.1

[Qemu-devel] [PULL 02/11] spapr-vscsi: add task management

2013-09-12 Thread Paolo Bonzini

From: Alexey Kardashevskiy a...@ozlabs.ru

At the moment the guest kernel issues two types of task management
requests to the hypervisor - task about and lun reset. This adds
handling for these tasks. As spapr-vscsi starts calling scsi_req_cancel(),
free_request callback was implemented.

As virtio-vscsi, spapr-vscsi does not handle CLEAR_ACA either as CDB
control byte does not seem to be used at all so NACA bit is not
set to the guest so the guest has no good reason to call CLEAR_ACA task.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
[Fix choice of UCSOLCNT vs. SCSOLCNT. - Paolo]
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/spapr_vscsi.c | 119 ++
 hw/scsi/srp.h |   7 +++
 2 files changed, 99 insertions(+), 27 deletions(-)

diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index b2fcd4b..c31a870 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -117,6 +117,20 @@ static struct vscsi_req *vscsi_get_req(VSCSIState *s)
 return NULL;
 }
 
+static struct vscsi_req *vscsi_find_req(VSCSIState *s, uint64_t srp_tag)
+{
+vscsi_req *req;
+int i;
+
+for (i = 0; i  VSCSI_REQ_LIMIT; i++) {
+req = s-reqs[i];
+if (req-iu.srp.cmd.tag == srp_tag) {
+return req;
+}
+}
+return NULL;
+}
+
 static void vscsi_put_req(vscsi_req *req)
 {
 if (req-sreq != NULL) {
@@ -755,40 +769,91 @@ static int vscsi_queue_cmd(VSCSIState *s, vscsi_req *req)
 static int vscsi_process_tsk_mgmt(VSCSIState *s, vscsi_req *req)
 {
 union viosrp_iu *iu = req-iu;
-int fn;
+vscsi_req *tmpreq;
+int i, lun = 0, resp = SRP_TSK_MGMT_COMPLETE;
+SCSIDevice *d;
+uint64_t tag = iu-srp.rsp.tag;
+uint8_t sol_not = iu-srp.cmd.sol_not;
 
 fprintf(stderr, vscsi_process_tsk_mgmt %02x\n,
 iu-srp.tsk_mgmt.tsk_mgmt_func);
 
-switch (iu-srp.tsk_mgmt.tsk_mgmt_func) {
-#if 0 /* We really don't deal with these for now */
-case SRP_TSK_ABORT_TASK:
-fn = ABORT_TASK;
-break;
-case SRP_TSK_ABORT_TASK_SET:
-fn = ABORT_TASK_SET;
-break;
-case SRP_TSK_CLEAR_TASK_SET:
-fn = CLEAR_TASK_SET;
-break;
-case SRP_TSK_LUN_RESET:
-fn = LOGICAL_UNIT_RESET;
-break;
-case SRP_TSK_CLEAR_ACA:
-fn = CLEAR_ACA;
-break;
-#endif
-default:
-fn = 0;
+d = vscsi_device_find(s-bus, be64_to_cpu(req-iu.srp.tsk_mgmt.lun), 
lun);
+if (!d) {
+resp = SRP_TSK_MGMT_FIELDS_INVALID;
+} else {
+switch (iu-srp.tsk_mgmt.tsk_mgmt_func) {
+case SRP_TSK_ABORT_TASK:
+if (d-lun != lun) {
+resp = SRP_TSK_MGMT_FIELDS_INVALID;
+break;
+}
+
+tmpreq = vscsi_find_req(s, req-iu.srp.tsk_mgmt.task_tag);
+if (tmpreq  tmpreq-sreq) {
+assert(tmpreq-sreq-hba_private);
+scsi_req_cancel(tmpreq-sreq);
+}
+break;
+
+case SRP_TSK_LUN_RESET:
+if (d-lun != lun) {
+resp = SRP_TSK_MGMT_FIELDS_INVALID;
+break;
+}
+
+qdev_reset_all(d-qdev);
+break;
+
+case SRP_TSK_ABORT_TASK_SET:
+case SRP_TSK_CLEAR_TASK_SET:
+if (d-lun != lun) {
+resp = SRP_TSK_MGMT_FIELDS_INVALID;
+break;
+}
+
+for (i = 0; i  VSCSI_REQ_LIMIT; i++) {
+tmpreq = s-reqs[i];
+if (tmpreq-iu.srp.cmd.lun != req-iu.srp.tsk_mgmt.lun) {
+continue;
+}
+if (!tmpreq-active || !tmpreq-sreq) {
+continue;
+}
+assert(tmpreq-sreq-hba_private);
+scsi_req_cancel(tmpreq-sreq);
+}
+break;
+
+case SRP_TSK_CLEAR_ACA:
+resp = SRP_TSK_MGMT_NOT_SUPPORTED;
+break;
+
+default:
+resp = SRP_TSK_MGMT_FIELDS_INVALID;
+break;
+}
 }
-if (fn) {
-/* XXX Send/Handle target task management */
-;
+
+/* Compose the response here as  */
+memset(iu, 0, sizeof(struct srp_rsp) + 4);
+iu-srp.rsp.opcode = SRP_RSP;
+iu-srp.rsp.req_lim_delta = cpu_to_be32(1);
+iu-srp.rsp.tag = tag;
+iu-srp.rsp.flags |= SRP_RSP_FLAG_RSPVALID;
+iu-srp.rsp.resp_data_len = cpu_to_be32(4);
+if (resp) {
+iu-srp.rsp.sol_not = (sol_not  0x04)  2;
 } else {
-vscsi_makeup_sense(s, req, ILLEGAL_REQUEST, 0x20, 0);
-vscsi_send_rsp(s, req, CHECK_CONDITION, 0, 0);
+iu-srp.rsp.sol_not = (sol_not  0x02)  1;
 }
-return !fn;
+
+iu-srp.rsp.status = GOOD;
+iu-srp.rsp.data[3] = resp;
+
+vscsi_send_iu(s, req, sizeof(iu-srp.rsp) + 4, VIOSRP_SRP_FORMAT);
+
+return 1;
 }
 
 static int vscsi_handle_srp_req(VSCSIState *s, vscsi_req *req)
diff --git

[Qemu-devel] [PULL 07/11] iscsi: add logical block provisioning information to iscsilun

2013-09-12 Thread Paolo Bonzini

From: Peter Lieven p...@kamp.de

Signed-off-by: Peter Lieven p...@kamp.de
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block/iscsi.c | 77 +++
 1 file changed, 77 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 60b3967..bfd659a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -52,6 +52,10 @@ typedef struct IscsiLun {
 uint64_t num_blocks;
 int events;
 QEMUTimer *nop_timer;
+uint8_t lbpme;
+uint8_t lbprz;
+struct scsi_inquiry_logical_block_provisioning lbp;
+struct scsi_inquiry_block_limits bl;
 } IscsiLun;
 
 typedef struct IscsiAIOCB {
@@ -999,6 +1003,8 @@ static int iscsi_readcapacity_sync(IscsiLun *iscsilun)
 } else {
 iscsilun-block_size = rc16-block_length;
 iscsilun-num_blocks = rc16-returned_lba + 1;
+iscsilun-lbpme = rc16-lbpme;
+iscsilun-lbprz = rc16-lbprz;
 }
 }
 break;
@@ -1051,6 +1057,37 @@ static QemuOptsList runtime_opts = {
 },
 };
 
+static struct scsi_task *iscsi_do_inquiry(struct iscsi_context *iscsi,
+  int lun, int evpd, int pc) {
+int full_size;
+struct scsi_task *task = NULL;
+task = iscsi_inquiry_sync(iscsi, lun, evpd, pc, 64);
+if (task == NULL || task-status != SCSI_STATUS_GOOD) {
+goto fail;
+}
+full_size = scsi_datain_getfullsize(task);
+if (full_size  task-datain.size) {
+scsi_free_scsi_task(task);
+
+/* we need more data for the full list */
+task = iscsi_inquiry_sync(iscsi, lun, evpd, pc, full_size);
+if (task == NULL || task-status != SCSI_STATUS_GOOD) {
+goto fail;
+}
+}
+
+return task;
+
+fail:
+error_report(iSCSI: Inquiry command failed : %s,
+ iscsi_get_error(iscsi));
+if (task) {
+scsi_free_scsi_task(task);
+return NULL;
+}
+return NULL;
+}
+
 /*
  * We support iscsi url's on the form
  * iscsi://[username%password@]host[:port]/targetname/lun
@@ -1180,6 +1217,46 @@ static int iscsi_open(BlockDriverState *bs, QDict 
*options, int flags)
 bs-sg = 1;
 }
 
+if (iscsilun-lbpme) {
+struct scsi_inquiry_logical_block_provisioning *inq_lbp;
+task = iscsi_do_inquiry(iscsilun-iscsi, iscsilun-lun, 1,
+
SCSI_INQUIRY_PAGECODE_LOGICAL_BLOCK_PROVISIONING);
+if (task == NULL) {
+ret = -EINVAL;
+goto out;
+}
+inq_lbp = scsi_datain_unmarshall(task);
+if (inq_lbp == NULL) {
+error_report(iSCSI: failed to unmarshall inquiry datain blob);
+ret = -EINVAL;
+goto out;
+}
+memcpy(iscsilun-lbp, inq_lbp,
+   sizeof(struct scsi_inquiry_logical_block_provisioning));
+scsi_free_scsi_task(task);
+task = NULL;
+}
+
+if (iscsilun-lbp.lbpu || iscsilun-lbp.lbpws) {
+struct scsi_inquiry_block_limits *inq_bl;
+task = iscsi_do_inquiry(iscsilun-iscsi, iscsilun-lun, 1,
+SCSI_INQUIRY_PAGECODE_BLOCK_LIMITS);
+if (task == NULL) {
+ret = -EINVAL;
+goto out;
+}
+inq_bl = scsi_datain_unmarshall(task);
+if (inq_bl == NULL) {
+error_report(iSCSI: failed to unmarshall inquiry datain blob);
+ret = -EINVAL;
+goto out;
+}
+memcpy(iscsilun-bl, inq_bl,
+   sizeof(struct scsi_inquiry_block_limits));
+scsi_free_scsi_task(task);
+task = NULL;
+}
+
 #if defined(LIBISCSI_FEATURE_NOP_COUNTER)
 /* Set up a timer for sending out iSCSI NOPs */
 iscsilun-nop_timer = timer_new_ms(QEMU_CLOCK_REALTIME, 
iscsi_nop_timed_event, iscsilun);
-- 
1.8.3.1

[Qemu-devel] [PULL 03/11] virtio-scsi: Make type virtio-scsi-common abstract

2013-09-12 Thread Paolo Bonzini

From: Markus Armbruster arm...@redhat.com

It's the abstract base of virtio-scsi-device and vhost-scsi.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/virtio-scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 3bd690d..26d95a1 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -693,6 +693,7 @@ static const TypeInfo virtio_scsi_common_info = {
 .name = TYPE_VIRTIO_SCSI_COMMON,
 .parent = TYPE_VIRTIO_DEVICE,
 .instance_size = sizeof(VirtIOSCSICommon),
+.abstract = true,
 .class_init = virtio_scsi_common_class_init,
 };
 
-- 
1.8.3.1

[Qemu-devel] [PULL 04/11] scsi: Fix scsi_bus_legacy_add_drive() scsi-generic with serial

2013-09-12 Thread Paolo Bonzini

From: Markus Armbruster arm...@redhat.com

scsi_bus_legacy_add_drive() creates either a scsi-disk or a
scsi-generic device.  It sets property serial to argument serial
unless null.  Crashes with scsi-generic, because it doesn't have such
the property.

Only usb_msd_initfn_storage() passes non-null serial.  Reproducer:

$ qemu-system-x86_64 -nodefaults -display none -S -usb \
-drive if=none,file=/dev/sg1,id=usb-drv0 \
-device usb-storage,id=usb-msd0,drive=usb-drv0,serial=123
qemu-system-x86_64: -device 
usb-storage,id=usb-msd0,drive=usb-drv0,serial=123: Property '.serial' not found
Aborted (core dumped)

Fix by handling exactly like removable: set the property only when
it exists.

Cc: qemu-sta...@nongnu.org
Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/scsi-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 5cd6137..4d36841 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -224,7 +224,7 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, 
BlockDriverState *bdrv,
 if (object_property_find(OBJECT(dev), removable, NULL)) {
 qdev_prop_set_bit(dev, removable, removable);
 }
-if (serial) {
+if (serial  object_property_find(OBJECT(dev), serial, NULL)) {
 qdev_prop_set_string(dev, serial, serial);
 }
 if (qdev_prop_set_drive(dev, drive, bdrv)  0) {
-- 
1.8.3.1

[Qemu-devel] [PULL 11/11] spapr-vscsi: Report error on unsupported MAD requests

2013-09-12 Thread Paolo Bonzini

From: Alexey Kardashevskiy a...@ozlabs.ru

The existing driver just dropped unsupported requests. This adds error
responses to those unhandled requests.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/spapr_vscsi.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index f206452..2a26042 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -977,29 +977,43 @@ static int vscsi_send_capabilities(VSCSIState *s, 
vscsi_req *req)
 static int vscsi_handle_mad_req(VSCSIState *s, vscsi_req *req)
 {
 union mad_iu *mad = req-iu.mad;
+bool request_handled = false;
+uint64_t retlen = 0;
 
 switch (be32_to_cpu(mad-empty_iu.common.type)) {
 case VIOSRP_EMPTY_IU_TYPE:
 fprintf(stderr, Unsupported EMPTY MAD IU\n);
+retlen = sizeof(mad-empty_iu);
 break;
 case VIOSRP_ERROR_LOG_TYPE:
 fprintf(stderr, Unsupported ERROR LOG MAD IU\n);
-mad-error_log.common.status = cpu_to_be16(1);
-vscsi_send_iu(s, req, sizeof(mad-error_log), VIOSRP_MAD_FORMAT);
+retlen = sizeof(mad-error_log);
 break;
 case VIOSRP_ADAPTER_INFO_TYPE:
 vscsi_send_adapter_info(s, req);
+request_handled = true;
 break;
 case VIOSRP_HOST_CONFIG_TYPE:
-mad-host_config.common.status = cpu_to_be16(1);
-vscsi_send_iu(s, req, sizeof(mad-host_config), VIOSRP_MAD_FORMAT);
+retlen = sizeof(mad-host_config);
 break;
 case VIOSRP_CAPABILITIES_TYPE:
 vscsi_send_capabilities(s, req);
+request_handled = true;
 break;
 default:
 fprintf(stderr, VSCSI: Unknown MAD type %02x\n,
 be32_to_cpu(mad-empty_iu.common.type));
+/*
+ * PAPR+ says that The length field is set to the length
+ * of the data structure(s) used in the command.
+ * As we did not recognize the request type, put zero there.
+ */
+retlen = 0;
+}
+
+if (!request_handled) {
+mad-empty_iu.common.status = cpu_to_be16(VIOSRP_MAD_NOT_SUPPORTED);
+vscsi_send_iu(s, req, retlen, VIOSRP_MAD_FORMAT);
 }
 
 return 1;
-- 
1.8.3.1

[Qemu-devel] [PULL 08/11] iscsi: add .bdrv_get_block_status

2013-09-12 Thread Paolo Bonzini

From: Peter Lieven p...@kamp.de

this patch adds a coroutine for .bdrv_co_block_status as well as
a generic framework that can be used to build coroutines in block/iscsi.

Signed-off-by: Peter Lieven p...@kamp.de
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block/iscsi.c | 136 ++
 1 file changed, 136 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index bfd659a..c377d21 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -58,6 +58,15 @@ typedef struct IscsiLun {
 struct scsi_inquiry_block_limits bl;
 } IscsiLun;
 
+typedef struct IscsiTask {
+int status;
+int complete;
+int retries;
+int do_retry;
+struct scsi_task *task;
+Coroutine *co;
+} IscsiTask;
+
 typedef struct IscsiAIOCB {
 BlockDriverAIOCB common;
 QEMUIOVector *qiov;
@@ -111,6 +120,41 @@ iscsi_schedule_bh(IscsiAIOCB *acb)
 qemu_bh_schedule(acb-bh);
 }
 
+static void
+iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
+void *command_data, void *opaque)
+{
+struct IscsiTask *iTask = opaque;
+struct scsi_task *task = command_data;
+
+iTask-complete = 1;
+iTask-status = status;
+iTask-do_retry = 0;
+iTask-task = task;
+
+if (iTask-retries--  0  status == SCSI_STATUS_CHECK_CONDITION
+ task-sense.key == SCSI_SENSE_UNIT_ATTENTION) {
+iTask-do_retry = 1;
+goto out;
+}
+
+if (status != SCSI_STATUS_GOOD) {
+error_report(iSCSI: Failure. %s, iscsi_get_error(iscsi));
+}
+
+out:
+if (iTask-co) {
+qemu_coroutine_enter(iTask-co, NULL);
+}
+}
+
+static void iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask 
*iTask)
+{
+*iTask = (struct IscsiTask) {
+.co = qemu_coroutine_self(),
+.retries= ISCSI_CMD_RETRIES,
+};
+}
 
 static void
 iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void 
*command_data,
@@ -848,6 +892,96 @@ iscsi_getlength(BlockDriverState *bs)
 return len;
 }
 
+static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,
+  int64_t sector_num,
+  int nb_sectors, int *pnum)
+{
+IscsiLun *iscsilun = bs-opaque;
+struct scsi_get_lba_status *lbas = NULL;
+struct scsi_lba_status_descriptor *lbasd = NULL;
+struct IscsiTask iTask;
+int64_t ret;
+
+iscsi_co_init_iscsitask(iscsilun, iTask);
+
+if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+ret = -EINVAL;
+goto out;
+}
+
+/* default to all sectors allocated */
+ret = BDRV_BLOCK_DATA;
+ret |= (sector_num  BDRV_SECTOR_BITS) | BDRV_BLOCK_OFFSET_VALID;
+*pnum = nb_sectors;
+
+/* LUN does not support logical block provisioning */
+if (iscsilun-lbpme == 0) {
+goto out;
+}
+
+retry:
+if (iscsi_get_lba_status_task(iscsilun-iscsi, iscsilun-lun,
+  sector_qemu2lun(sector_num, iscsilun),
+  8 + 16, iscsi_co_generic_cb,
+  iTask) == NULL) {
+ret = -EIO;
+goto out;
+}
+
+while (!iTask.complete) {
+iscsi_set_events(iscsilun);
+qemu_coroutine_yield();
+}
+
+if (iTask.do_retry) {
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+iTask.task = NULL;
+}
+goto retry;
+}
+
+if (iTask.status != SCSI_STATUS_GOOD) {
+/* in case the get_lba_status_callout fails (i.e.
+ * because the device is busy or the cmd is not
+ * supported) we pretend all blocks are allocated
+ * for backwards compatiblity */
+goto out;
+}
+
+lbas = scsi_datain_unmarshall(iTask.task);
+if (lbas == NULL) {
+ret = -EIO;
+goto out;
+}
+
+lbasd = lbas-descriptors[0];
+
+if (sector_qemu2lun(sector_num, iscsilun) != lbasd-lba) {
+ret = -EIO;
+goto out;
+}
+
+*pnum = sector_lun2qemu(lbasd-num_blocks, iscsilun);
+if (*pnum  nb_sectors) {
+*pnum = nb_sectors;
+}
+
+if (lbasd-provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED ||
+lbasd-provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) {
+ret = ~BDRV_BLOCK_DATA;
+if (iscsilun-lbprz) {
+ret |= BDRV_BLOCK_ZERO;
+}
+}
+
+out:
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+}
+return ret;
+}
+
 static int parse_chap(struct iscsi_context *iscsi, const char *target)
 {
 QemuOptsList *list;
@@ -1398,6 +1532,8 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_getlength  = iscsi_getlength,
 .bdrv_truncate   = iscsi_truncate,
 
+.bdrv_co_get_block_status = iscsi_co_get_block_status,
+
 .bdrv_aio_readv  = iscsi_aio_readv,
 .bdrv_aio_writev = iscsi_aio_writev,
 .bdrv_aio_flush  = iscsi_aio_flush,

[Qemu-devel] [PULL 09/11] iscsi: split discard requests in multiple parts

2013-09-12 Thread Paolo Bonzini

From: Peter Lieven p...@kamp.de

Replace .bdrv_aio_discard with .bdrv_co_discard so that discard
requests can be split in multiple parts, each for a small amount
of sectors.

This is useful because we expose a generic API with no limit
on the amount of sectors that can be unmapped in one request.

Signed-off-by: Peter Lieven p...@kamp.de
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block/iscsi.c | 156 +++---
 1 file changed, 73 insertions(+), 83 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index c377d21..68f99d3 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -87,6 +87,7 @@ typedef struct IscsiAIOCB {
 #define NOP_INTERVAL 5000
 #define MAX_NOP_FAILURES 3
 #define ISCSI_CMD_RETRIES 5
+#define ISCSI_MAX_UNMAP 131072
 
 static void
 iscsi_bh_cb(void *p)
@@ -618,88 +619,6 @@ iscsi_aio_flush(BlockDriverState *bs,
 return acb-common;
 }
 
-static int iscsi_aio_discard_acb(IscsiAIOCB *acb);
-
-static void
-iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
- void *command_data, void *opaque)
-{
-IscsiAIOCB *acb = opaque;
-
-if (acb-canceled != 0) {
-return;
-}
-
-acb-status = 0;
-if (status != 0) {
-if (status == SCSI_STATUS_CHECK_CONDITION
- acb-task-sense.key == SCSI_SENSE_UNIT_ATTENTION
- acb-retries--  0) {
-scsi_free_scsi_task(acb-task);
-acb-task = NULL;
-if (iscsi_aio_discard_acb(acb) == 0) {
-iscsi_set_events(acb-iscsilun);
-return;
-}
-}
-error_report(Failed to unmap data on iSCSI lun. %s,
- iscsi_get_error(iscsi));
-acb-status = -EIO;
-}
-
-iscsi_schedule_bh(acb);
-}
-
-static int iscsi_aio_discard_acb(IscsiAIOCB *acb) {
-struct iscsi_context *iscsi = acb-iscsilun-iscsi;
-struct unmap_list list[1];
-
-acb-canceled   = 0;
-acb-bh = NULL;
-acb-status = -EINPROGRESS;
-acb-buf= NULL;
-
-list[0].lba = sector_qemu2lun(acb-sector_num, acb-iscsilun);
-list[0].num = acb-nb_sectors * BDRV_SECTOR_SIZE / 
acb-iscsilun-block_size;
-
-acb-task = iscsi_unmap_task(iscsi, acb-iscsilun-lun,
- 0, 0, list[0], 1,
- iscsi_unmap_cb,
- acb);
-if (acb-task == NULL) {
-error_report(iSCSI: Failed to send unmap command. %s,
- iscsi_get_error(iscsi));
-return -1;
-}
-
-return 0;
-}
-
-static BlockDriverAIOCB *
-iscsi_aio_discard(BlockDriverState *bs,
-  int64_t sector_num, int nb_sectors,
-  BlockDriverCompletionFunc *cb, void *opaque)
-{
-IscsiLun *iscsilun = bs-opaque;
-IscsiAIOCB *acb;
-
-acb = qemu_aio_get(iscsi_aiocb_info, bs, cb, opaque);
-
-acb-iscsilun= iscsilun;
-acb-nb_sectors  = nb_sectors;
-acb-sector_num  = sector_num;
-acb-retries = ISCSI_CMD_RETRIES;
-
-if (iscsi_aio_discard_acb(acb) != 0) {
-qemu_aio_release(acb);
-return NULL;
-}
-
-iscsi_set_events(iscsilun);
-
-return acb-common;
-}
-
 #ifdef __linux__
 static void
 iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
@@ -982,6 +901,77 @@ out:
 return ret;
 }
 
+static int
+coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
+   int nb_sectors)
+{
+IscsiLun *iscsilun = bs-opaque;
+struct IscsiTask iTask;
+struct unmap_list list;
+uint32_t nb_blocks;
+uint32_t max_unmap;
+
+if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+return -EINVAL;
+}
+
+if (!iscsilun-lbp.lbpu) {
+/* UNMAP is not supported by the target */
+return 0;
+}
+
+list.lba = sector_qemu2lun(sector_num, iscsilun);
+nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+
+max_unmap = iscsilun-bl.max_unmap;
+if (max_unmap == 0x) {
+max_unmap = ISCSI_MAX_UNMAP;
+}
+
+while (nb_blocks  0) {
+iscsi_co_init_iscsitask(iscsilun, iTask);
+list.num = nb_blocks;
+if (list.num  max_unmap) {
+list.num = max_unmap;
+}
+retry:
+if (iscsi_unmap_task(iscsilun-iscsi, iscsilun-lun, 0, 0, list, 1,
+ iscsi_co_generic_cb, iTask) == NULL) {
+return -EIO;
+}
+
+while (!iTask.complete) {
+iscsi_set_events(iscsilun);
+qemu_coroutine_yield();
+}
+
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+iTask.task = NULL;
+}
+
+if (iTask.do_retry) {
+goto retry;
+}
+
+if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
+/* the target might fail with a check condition if it
+   is not happy with the alignment of the UNMAP request
+

[Qemu-devel] [PULL 10/11] spapr-vscsi: Adding VSCSI capabilities

2013-09-12 Thread Paolo Bonzini

From: Nikunj A. Dadhania nik...@linux.vnet.ibm.com

This implements capabilities exchange between vscsi host and client.  As
at the moment no capability is supported, put zero flags everywhere and
return.

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Nikunj A. Dadhania nik...@linux.vnet.ibm.com
---
 hw/scsi/spapr_vscsi.c | 54 +++
 1 file changed, 54 insertions(+)

diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index c31a870..f206452 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -923,6 +923,57 @@ static int vscsi_send_adapter_info(VSCSIState *s, 
vscsi_req *req)
 return vscsi_send_iu(s, req, sizeof(*sinfo), VIOSRP_MAD_FORMAT);
 }
 
+static int vscsi_send_capabilities(VSCSIState *s, vscsi_req *req)
+{
+struct viosrp_capabilities *vcap;
+struct capabilities cap = { };
+uint16_t len, req_len;
+uint64_t buffer;
+int rc;
+
+vcap = req-iu.mad.capabilities;
+req_len = len = be16_to_cpu(vcap-common.length);
+buffer = be64_to_cpu(vcap-buffer);
+if (len  sizeof(cap)) {
+fprintf(stderr, vscsi_send_capabilities: capabilities size mismatch 
!\n);
+
+/*
+ * Just read and populate the structure that is known.
+ * Zero rest of the structure.
+ */
+len = sizeof(cap);
+}
+rc = spapr_vio_dma_read(s-vdev, buffer, cap, len);
+if (rc)  {
+fprintf(stderr, vscsi_send_capabilities: DMA read failure !\n);
+}
+
+/*
+ * Current implementation does not suppport any migration or
+ * reservation capabilities. Construct the response telling the
+ * guest not to use them.
+ */
+cap.flags = 0;
+cap.migration.ecl = 0;
+cap.reserve.type = 0;
+cap.migration.common.server_support = 0;
+cap.reserve.common.server_support = 0;
+
+rc = spapr_vio_dma_write(s-vdev, buffer, cap, len);
+if (rc)  {
+fprintf(stderr, vscsi_send_capabilities: DMA write failure !\n);
+}
+if (req_len  len) {
+/*
+ * Being paranoid and lets not worry about the error code
+ * here. Actual write of the cap is done above.
+ */
+spapr_vio_dma_set(s-vdev, (buffer + len), 0, (req_len - len));
+}
+vcap-common.status = rc ? cpu_to_be32(1) : 0;
+return vscsi_send_iu(s, req, sizeof(*vcap), VIOSRP_MAD_FORMAT);
+}
+
 static int vscsi_handle_mad_req(VSCSIState *s, vscsi_req *req)
 {
 union mad_iu *mad = req-iu.mad;
@@ -943,6 +994,9 @@ static int vscsi_handle_mad_req(VSCSIState *s, vscsi_req 
*req)
 mad-host_config.common.status = cpu_to_be16(1);
 vscsi_send_iu(s, req, sizeof(mad-host_config), VIOSRP_MAD_FORMAT);
 break;
+case VIOSRP_CAPABILITIES_TYPE:
+vscsi_send_capabilities(s, req);
+break;
 default:
 fprintf(stderr, VSCSI: Unknown MAD type %02x\n,
 be32_to_cpu(mad-empty_iu.common.type));
-- 
1.8.3.1

[Qemu-devel] [PULL 01/11] scsi: prefer UUID to VM name for the initiator name

2013-09-12 Thread Paolo Bonzini

The UUID is unique even across multiple hosts, thus it is
better than a VM name even if it is less user-friendly.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block/iscsi.c   | 23 ---
 include/sysemu/sysemu.h |  2 ++
 stubs/Makefile.objs |  1 +
 stubs/uuid.c| 12 
 4 files changed, 31 insertions(+), 7 deletions(-)
 create mode 100644 stubs/uuid.c

diff --git a/block/iscsi.c b/block/iscsi.c
index 813abd8..60b3967 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -33,6 +33,8 @@
 #include trace.h
 #include block/scsi.h
 #include qemu/iov.h
+#include sysemu/sysemu.h
+#include qmp-commands.h
 
 #include iscsi/iscsi.h
 #include iscsi/scsi-lowlevel.h
@@ -922,8 +924,9 @@ static char *parse_initiator_name(const char *target)
 {
 QemuOptsList *list;
 QemuOpts *opts;
-const char *name = NULL;
-const char *iscsi_name = qemu_get_vm_name();
+const char *name;
+char *iscsi_name;
+UuidInfo *uuid_info;
 
 list = qemu_find_opts(iscsi);
 if (list) {
@@ -933,16 +936,22 @@ static char *parse_initiator_name(const char *target)
 }
 if (opts) {
 name = qemu_opt_get(opts, initiator-name);
+if (name) {
+return g_strdup(name);
+}
 }
 }
 
-if (name) {
-return g_strdup(name);
+uuid_info = qmp_query_uuid(NULL);
+if (strcmp(uuid_info-UUID, UUID_NONE) == 0) {
+name = qemu_get_vm_name();
 } else {
-return g_strdup_printf(iqn.2008-11.org.linux-kvm%s%s,
-   iscsi_name ? : : ,
-   iscsi_name ? iscsi_name : );
+name = uuid_info-UUID;
 }
+iscsi_name = g_strdup_printf(iqn.2008-11.org.linux-kvm%s%s,
+ name ? : : , name ? name : );
+qapi_free_UuidInfo(uuid_info);
+return iscsi_name;
 }
 
 #if defined(LIBISCSI_FEATURE_NOP_COUNTER)
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index b1aa059..e2c6f58 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -17,7 +17,9 @@ extern const char *bios_name;
 extern const char *qemu_name;
 extern uint8_t qemu_uuid[];
 int qemu_uuid_parse(const char *str, uint8_t *uuid);
+
 #define UUID_FMT 
%02hhx%02hhx%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx
+#define UUID_NONE ----
 
 bool runstate_check(RunState state);
 void runstate_set(RunState new_state);
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index f306cba..df92fe5 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -22,6 +22,7 @@ stub-obj-y += reset.o
 stub-obj-y += set-fd-handler.o
 stub-obj-y += slirp.o
 stub-obj-y += sysbus.o
+stub-obj-y += uuid.o
 stub-obj-y += vm-stop.o
 stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
diff --git a/stubs/uuid.c b/stubs/uuid.c
new file mode 100644
index 000..ffc0ed4
--- /dev/null
+++ b/stubs/uuid.c
@@ -0,0 +1,12 @@
+#include qemu-common.h
+#include sysemu/sysemu.h
+#include qmp-commands.h
+
+UuidInfo *qmp_query_uuid(Error **errp)
+{
+UuidInfo *info = g_malloc0(sizeof(*info));
+
+info-UUID = g_strdup(UUID_NONE);
+return info;
+}
+
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v3 0/4] timers thread-safe stuff

2013-09-12 Thread Paolo Bonzini

Il 27/08/2013 05:20, Liu Ping Fan ha scritto:
 Saw the Alex's patches has been merged, rebase mine onto his.
 
 v3:
   1. rename seqlock_read_check as seqlock_read_retry
   2. Document timerlist were protected by BQL, and discard private lock 
 around qemu_event_wait(tl-ev).
 
 v2:
   1. fix comment in commit and code
   2. fix race issue for qemu_clock_enable(foo,disable)
 
 
 Liu Ping Fan (2):
   timer: protect timers_state's clock with seqlock
   timer: make qemu_clock_enable sync between disable and timer's cb
 
 Paolo Bonzini (2):
   seqlock: introduce read-write seqlock
   qemu-thread: add QemuEvent
 
  cpus.c  |  36 +++---
  include/qemu/seqlock.h  |  72 +++
  include/qemu/thread-posix.h |   8 +++
  include/qemu/thread-win32.h |   4 ++
  include/qemu/thread.h   |   7 +++
  include/qemu/timer.h|   4 ++
  qemu-timer.c|  20 +++-
  util/qemu-thread-posix.c| 116 
 
  util/qemu-thread-win32.c|  26 ++
  9 files changed, 286 insertions(+), 7 deletions(-)
  create mode 100644 include/qemu/seqlock.h
 

Stefan, could you pick up these four patches as well?

Paolo

Re: [Qemu-devel] [PATCH] qdev-monitor: Avoid exiting when hot-plugging two devices with the same bootindex value

2013-09-12 Thread Marcel Apfelbaum

On Thu, 2013-09-12 at 13:04 +0200, Markus Armbruster wrote:
 Marcel Apfelbaum marce...@redhat.com writes:
 
  On Thu, 2013-09-12 at 11:43 +0200, Markus Armbruster wrote:
  Paolo Bonzini pbonz...@redhat.com writes:
  
   Il 11/09/2013 20:26, Marcel Apfelbaum ha scritto:
   Qemu is expected to quit if the same boot index value is used by
   two devices.
   However, hot-plugging a device with a bootindex value already used 
   should
   fail with a friendly message rather than quitting a running VM.
  
   I think the problem is right where QEMU exits, i.e. in
   add_boot_device_path.  This function should return an error instead, via
   an Error ** argument.
  
  Agree.
  
   Callers, typically a device's init or realize function, will either
   print the error before returning an error code (e.g. -EBUSY for init) or
   propagate the error up (for realize).
  
   Returning/propagating failure will still cause QEMU to exit when the
   duplicate bootindexes are found on the command line.
  
  I have an unfinished patch in my tree that does exactly that.  It's
  unfinished, because cleanup on error paths needs work.  Current state
  appended with FIXMEs and all.  Beware, the FIXMEs may not be correct and
  are almost certainly not complete.
  Thanks Markus,
  Should I use it as my starting point and finish it or you intend to?
 
 If you have cycles to spare, you're quite welcome to take this patch and
 run with it!
I'll try to finish it, thanks!
 
 You may have noticed that my patch moves the code to add the boot device
 path in a few cases.  I did this in the hope of simplifying error paths.
 Do not hesitate to undo such moves where they turn out not to simplify
 anything.
 
 Here's an issue that exists before my patch: DeviceClass method
 unrealize() should clean up everything done by realize().  In
 particular, unrealize() needs to remove any entry added to fw_boot_order
 by realize() via add_boot_device_path().  Code to do that doesn't exist,
 yet.  Hot unplug is technically broken for all devices with bootindex.
 Impact unknown.  Should probably be fixed in a separate patch.
The outcome is that after hot-unplugging a device with bootindex=x,
the device still remains in the fw_boot_order and no device can 
be hot-plugged with the same bootindex 'x'. Furthermore, because
of the current issue Qemu quits on an operation that should succeed! 
I plan to send a patch to fix this too.

Thanks,
Marcel

Re: [Qemu-devel] [PATCH] qemu-img: fix invalid JSON

2013-09-12 Thread Kevin Wolf

Am 11.09.2013 um 18:58 hat Eric Blake geschrieben:
 On 09/11/2013 10:47 AM, Paolo Bonzini wrote:
  Single quotes for JSON are a QMP-ism, use real JSON in
  qemu-img output.
  
  Reported-by: Kevin Wolf kw...@redhat.com
  Signed-off-by: Paolo Bonzini pbonz...@redhat.com
  ---
   qemu-img.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
 Reviewed-by: Eric Blake ebl...@redhat.com

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH] qemu-iotests: Cleanup test image in test number 007

2013-09-12 Thread Kevin Wolf

Am 12.09.2013 um 10:37 hat Bharata B Rao geschrieben:
 qemu-iotests number 007 doesn't do test image cleanup. This will affect
 those protocols that expect a clean state before every test. Hence
 ensure that test image is cleaned up in this test.
 
 Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH v6 4/8] module: implement module loading function

2013-09-12 Thread Eric Blake

On 09/11/2013 11:36 PM, Michael Tokarev wrote:
 
 A solution which I proposed at the very beginning -- to export
 a hashed init function from modules, and call it from the
 main executable.  Like, instead of, say, qemu_module_init(),
 call qemu_module_init_0xdeadbeaf(), where 0xdeadbeaf is a
 hash of some build-dependent value.  This should be enough
 to keep it going.

And of course, since we store sources in git, you already have such a
hash value at your disposal:
 $CC -DBUILD_HASH=$(git rev-parse HEAD) ...
coupled with
 glue(qemu_module_init_, BUILD_HASH)
where the only trick is to figure out how to bake in a hash when
building from a released tarball rather than git.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC PATCH 0/8] Remove stub mon-protocol-event for block

2013-09-12 Thread Kevin Wolf

Am 12.09.2013 um 11:31 hat Paolo Bonzini geschrieben:
 Il 12/09/2013 11:15, Wenchao Xia ha scritto:
  This series will remove the usage of symbols of mon-protocol-event in
  qemu-img, qemu-nbd and qemu-io, in short remove the connetion for block
  layer.
  
  Background:
I am tring to decouple block layer code with other unnnessary components,
  and in ./stub there many symbols that qemu-img linked as fake implemtion.
  As a first step, I am decouple monitor with block layer code, this is the
  first part of it.
There are still other stub symbols for monitor, which will be solved 
  later.
  It seems error handlering is also link with those symbols, and will adjust
  that.

 Patches 1-4 look good.  I'm not sure of the advantage of the last four,
 however.  The ugly part of monitor_protocol_event is not really the
 stub, but the dependency on QObject.
 
 So, in my opinion a more interesting approach would be to describe
 events using QAPI types.  Generating the events would require a small
 amount of code to build QObjects manually, because the event syntax
 doesn't match exactly a QAPI union, but that is only a technical detail.

You mean the QAPI schema cannot describe events? Why do you think so,
and wouldn't this be a reason to extend the schema language?

The QMP spec describes events like this:

{ event: json-string, data: json-object,
  timestamp: { seconds: json-number, microseconds: json-number } }

Looks like it could be described as:

{ 'type': 'EventTypeA',
  'data': {
  'data': {
  # event-specific data
  } } }
{ 'type': 'EventBase',
  'data': { 'event': 'str'. 'timestamp': {
  'seconds': 'int', 'microseconds': 'int' } }
{ 'union': 'Event',
  'base:' EventBase',
  'discriminator': 'event',
  'data': {
  'EVENT_NAME_A': 'EventTypeA'
  } }

Kevin

[Qemu-devel] [Bug 1224444] [NEW] virtio-serial loses writes when used over virtio-mmio

2013-09-12 Thread Richard Jones

Public bug reported:

virtio-serial appears to lose writes, but only when used on top of
virtio-mmio.  The scenario is this:

/home/rjones/d/qemu/arm-softmmu/qemu-system-arm \
-global virtio-blk-device.scsi=off \
-nodefconfig \
-nodefaults \
-nographic \
-M vexpress-a15 \
-machine accel=kvm:tcg \
-m 500 \
-no-reboot \
-kernel /home/rjones/d/libguestfs/tmp/.guestfs-1001/kernel.27944 \
-dtb /home/rjones/d/libguestfs/tmp/.guestfs-1001/dtb.27944 \
-initrd /home/rjones/d/libguestfs/tmp/.guestfs-1001/initrd.27944 \
-device virtio-scsi-device,id=scsi \
-drive 
file=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/scratch.1,cache=unsafe,format=raw,id=hd0,if=none
 \
-device scsi-hd,drive=hd0 \
-drive 
file=/home/rjones/d/libguestfs/tmp/.guestfs-1001/root.27944,snapshot=on,id=appliance,cache=unsafe,if=none
 \
-device scsi-hd,drive=appliance \
-device virtio-serial-device \
-serial stdio \
-chardev 
socket,path=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/guestfsd.sock,id=channel0
 \
-device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
-append 'panic=1 mem=500M console=ttyAMA0 udevtimeout=600 no_timer_check 
acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 
guestfs_verbose=1 TERM=xterm-256color'

After the guest starts up, a daemon writes 4 bytes to a virtio-serial
socket.  The host side reads these 4 bytes correctly and writes a 64
byte message.  The guest never sees this message.

I enabled virtio-mmio debugging, and this is what is printed (## = my
comment):

## guest opens the socket:
trying to open virtio-serial channel 
'/dev/virtio-ports/org.libguestfs.channel.0'
virtio_mmio: virtio_mmio_write offset 0x50 value 0x3
opened the socket, sock = 3
udevadm settle
## guest writes 4 bytes to the socket:
virtio_mmio: virtio_mmio_write offset 0x50 value 0x5
virtio_mmio: virtio_mmio setting IRQ 1
virtio_mmio: virtio_mmio_read offset 0x60
virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
virtio_mmio: virtio_mmio setting IRQ 0
sent magic GUESTFS_LAUNCH_FLAG
## host reads 4 bytes successfully:
main_loop libguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
libguestfs: [14605ms] appliance is up
Guest launched OK.
## host writes 64 bytes to socket:
libguestfs: writing the data to the socket (size = 64)
waiting for next request
libguestfs: data written OK
## hangs here forever with guest in read() call never receiving any data

I am using qemu from git today (2d1fe1873a984).

Some notes:

- It's not 100% reproducible.  Sometimes everything works fine, although it 
fails often (eg  2/3rds of the time).
- KVM is being used.
- We've long used virtio-serial on x86 and have never seen anything like this.

** Affects: qemu
 Importance: Undecided
 Status: New

** Description changed:

  virtio-serial appears to lose writes, but only when used on top of
  virtio-mmio.  The scenario is this:
  
  /home/rjones/d/qemu/arm-softmmu/qemu-system-arm \
- -global virtio-blk-device.scsi=off \
- -nodefconfig \
- -nodefaults \
- -nographic \
- -M vexpress-a15 \
- -machine accel=kvm:tcg \
- -m 500 \
- -no-reboot \
- -kernel /home/rjones/d/libguestfs/tmp/.guestfs-1001/kernel.27944 \
- -dtb /home/rjones/d/libguestfs/tmp/.guestfs-1001/dtb.27944 \
- -initrd /home/rjones/d/libguestfs/tmp/.guestfs-1001/initrd.27944 \
- -device virtio-scsi-device,id=scsi \
- -drive 
file=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/scratch.1,cache=unsafe,format=raw,id=hd0,if=none
 \
- -device scsi-hd,drive=hd0 \
- -drive 
file=/home/rjones/d/libguestfs/tmp/.guestfs-1001/root.27944,snapshot=on,id=appliance,cache=unsafe,if=none
 \
- -device scsi-hd,drive=appliance \
- -device virtio-serial-device \
- -serial stdio \
- -chardev 
socket,path=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/guestfsd.sock,id=channel0
 \
- -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
- -append 'panic=1 mem=500M console=ttyAMA0 udevtimeout=600 no_timer_check 
acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 
guestfs_verbose=1 TERM=xterm-256color'
+ -global virtio-blk-device.scsi=off \
+ -nodefconfig \
+ -nodefaults \
+ -nographic \
+ -M vexpress-a15 \
+ -machine accel=kvm:tcg \
+ -m 500 \
+ -no-reboot \
+ -kernel /home/rjones/d/libguestfs/tmp/.guestfs-1001/kernel.27944 \
+ -dtb /home/rjones/d/libguestfs/tmp/.guestfs-1001/dtb.27944 \
+ -initrd /home/rjones/d/libguestfs/tmp/.guestfs-1001/initrd.27944 \
+ -device virtio-scsi-device,id=scsi \
+ -drive 
file=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/scratch.1,cache=unsafe,format=raw,id=hd0,if=none
 \
+ -device scsi-hd,drive=hd0 \
+ -drive 
file=/home/rjones/d/libguestfs/tmp/.guestfs-1001/root.27944,snapshot=on,id=appliance,cache=unsafe,if=none
 \
+ -device scsi-hd,drive=appliance \
+ -device

Re: [Qemu-devel] [PATCH] pc: add 1.7 machine types for piix,q35

2013-09-12 Thread Andreas Färber

Am 12.09.2013 08:24, schrieb Michael S. Tsirkin:
 piix 1.7 is the default.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  hw/i386/pc_piix.c | 19 +--
  hw/i386/pc_q35.c  | 17 -
  2 files changed, 33 insertions(+), 3 deletions(-)

Looks like you forget to rebase? Stefan's net-next tree was merged last
night, so there's already the two _v1_7 machines registered at least.
The pc_init_ function and the 1_6 - 1_7 #define might still be
applicable though.

Andreas

 
 diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
 index 66551b4..0ade373 100644
 --- a/hw/i386/pc_piix.c
 +++ b/hw/i386/pc_piix.c
 @@ -274,6 +274,11 @@ static void pc_compat_1_2(QEMUMachineInitArgs *args)
  disable_kvm_pv_eoi();
  }
  
 +static void pc_init_pci_1_7(QEMUMachineInitArgs *args)
 +{
 +pc_init_pci(args);
 +}
 +
  static void pc_init_pci_1_6(QEMUMachineInitArgs *args)
  {
  pc_compat_1_6(args);
 @@ -344,14 +349,23 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args)
  .desc = Standard PC (i440FX + PIIX, 1996), \
  .hot_add_cpu = pc_hot_add_cpu
  
 -#define PC_I440FX_1_6_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
 +#define PC_I440FX_1_7_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
 +
 +static QEMUMachine pc_i440fx_machine_v1_7 = {
 +PC_I440FX_1_7_MACHINE_OPTIONS,
 +.name = pc-i440fx-1.7,
 +.alias = pc,
 +.init = pc_init_pci_1_7,
 +.is_default = 1,
 +};
 +
 +#define PC_I440FX_1_6_MACHINE_OPTIONS PC_I440FX_1_7_MACHINE_OPTIONS
  
  static QEMUMachine pc_i440fx_machine_v1_6 = {
  PC_I440FX_1_6_MACHINE_OPTIONS,
  .name = pc-i440fx-1.6,
  .alias = pc,
  .init = pc_init_pci_1_6,
 -.is_default = 1,
  };
  
  static QEMUMachine pc_i440fx_machine_v1_5 = {
 @@ -740,6 +754,7 @@ static QEMUMachine xenfv_machine = {
  
  static void pc_machine_init(void)
  {
 +qemu_register_machine(pc_i440fx_machine_v1_7);
  qemu_register_machine(pc_i440fx_machine_v1_6);
  qemu_register_machine(pc_i440fx_machine_v1_5);
  qemu_register_machine(pc_i440fx_machine_v1_4);
 diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
 index 54c2b4c..0abd9b1 100644
 --- a/hw/i386/pc_q35.c
 +++ b/hw/i386/pc_q35.c
 @@ -238,6 +238,11 @@ static void pc_compat_1_4(QEMUMachineInitArgs *args)
  x86_cpu_compat_set_features(Westmere, FEAT_1_ECX, 0, 
 CPUID_EXT_PCLMULQDQ);
  }
  
 +static void pc_q35_init_1_7(QEMUMachineInitArgs *args)
 +{
 +pc_q35_init(args);
 +}
 +
  static void pc_q35_init_1_6(QEMUMachineInitArgs *args)
  {
  pc_compat_1_6(args);
 @@ -261,7 +266,16 @@ static void pc_q35_init_1_4(QEMUMachineInitArgs *args)
  .desc = Standard PC (Q35 + ICH9, 2009), \
  .hot_add_cpu = pc_hot_add_cpu
  
 -#define PC_Q35_1_6_MACHINE_OPTIONS PC_Q35_MACHINE_OPTIONS
 +#define PC_Q35_1_7_MACHINE_OPTIONS PC_Q35_MACHINE_OPTIONS
 +
 +static QEMUMachine pc_q35_machine_v1_7 = {
 +PC_Q35_1_7_MACHINE_OPTIONS,
 +.name = pc-q35-1.7,
 +.alias = q35,
 +.init = pc_q35_init_1_7,
 +};
 +
 +#define PC_Q35_1_6_MACHINE_OPTIONS PC_Q35_1_7_MACHINE_OPTIONS
  
  static QEMUMachine pc_q35_machine_v1_6 = {
  PC_Q35_1_6_MACHINE_OPTIONS,
 @@ -296,6 +310,7 @@ static QEMUMachine pc_q35_machine_v1_4 = {
  
  static void pc_q35_machine_init(void)
  {
 +qemu_register_machine(pc_q35_machine_v1_7);
  qemu_register_machine(pc_q35_machine_v1_6);
  qemu_register_machine(pc_q35_machine_v1_5);
  qemu_register_machine(pc_q35_machine_v1_4);
 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [RFC PATCH 0/8] Remove stub mon-protocol-event for block

2013-09-12 Thread Markus Armbruster

[Note cc: Luiz]

Paolo Bonzini pbonz...@redhat.com writes:

 Il 12/09/2013 11:15, Wenchao Xia ha scritto:
 This series will remove the usage of symbols of mon-protocol-event in
 qemu-img, qemu-nbd and qemu-io, in short remove the connetion for block
 layer.
 
 Background:
   I am tring to decouple block layer code with other unnnessary components,
 and in ./stub there many symbols that qemu-img linked as fake implemtion.
 As a first step, I am decouple monitor with block layer code, this is the
 first part of it.
   There are still other stub symbols for monitor, which will be solved later.
 It seems error handlering is also link with those symbols, and will adjust
 that.
 
 Wenchao Xia (8):
   1 block: use type MonitorEvent directly
   2 block: do not include monitor.h in block.c
   3 qapi: move MonitorEvent define
   4 qapi: rename MonitorEvent to QEvent
   5 block: add a callback layer for common functions
   6 block: replace monitor_protocol_event() with callback
   7 block: do not include monitor.h
   7 stubs: remove mon-protocol-event.o in stub obj
 
  block.c|   22 ++
  block/qcow2-refcount.c |4 +++-
  blockjob.c |   10 --
  include/block/block.h  |   12 
  include/block/block_int.h  |3 +--
  include/monitor/monitor.h  |   40 ++--
  include/qapi/qmp/qevent.h  |   41 +
  include/qapi/qmp/types.h   |1 +
  monitor.c  |   12 ++--
  stubs/Makefile.objs|1 -
  stubs/mon-protocol-event.c |2 +-
  tests/Makefile |3 ++-
  ui/vnc.c   |2 +-
  vl.c   |4 
  14 files changed, 100 insertions(+), 57 deletions(-)
  create mode 100644 include/qapi/qmp/qevent.h
 

 Patches 1-4 look good.  I'm not sure of the advantage of the last four,
 however.  The ugly part of monitor_protocol_event is not really the
 stub, but the dependency on QObject.

 So, in my opinion a more interesting approach would be to describe
 events using QAPI types.  Generating the events would require a small
 amount of code to build QObjects manually, because the event syntax
 doesn't match exactly a QAPI union, but that is only a technical detail.

 Paolo

See QMP TODO list, item Add events support to the QAPI
http://wiki.qemu.org/QMP#TODO
https://lists.nongnu.org/archive/html/qemu-devel/2013-09/msg01816.html

Re: [Qemu-devel] [PATCH v6 4/8] module: implement module loading function

2013-09-12 Thread Daniel P. Berrange

On Thu, Sep 12, 2013 at 05:59:30AM -0600, Eric Blake wrote:
 On 09/11/2013 11:36 PM, Michael Tokarev wrote:
  
  A solution which I proposed at the very beginning -- to export
  a hashed init function from modules, and call it from the
  main executable.  Like, instead of, say, qemu_module_init(),
  call qemu_module_init_0xdeadbeaf(), where 0xdeadbeaf is a
  hash of some build-dependent value.  This should be enough
  to keep it going.
 
 And of course, since we store sources in git, you already have such a
 hash value at your disposal:
  $CC -DBUILD_HASH=$(git rev-parse HEAD) ...
 coupled with
  glue(qemu_module_init_, BUILD_HASH)
 where the only trick is to figure out how to bake in a hash when
 building from a released tarball rather than git.

IMHO we want this to change any time you do './configure', so I would
not tie this to git hash - we don't want all users of particular
release tar.gz to have the same hash regardless of configure options.
We want a situation where any time a distro builds an RPM or equiv,
a new hash is used. So to me generating it in 'configure' seems like
a reasonable place.


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

[Qemu-devel] [PATCH] block: Assert validity of BdrvActionOps

2013-09-12 Thread Max Reitz

In qmp_transaction, assert that the BdrvActionOps to be used is actually
valid.

This assertion failing is very improbable, however, it might happen, if
a new TransactionActionKind is introduced out of order and the
actions[] array is not updated.

Signed-off-by: Max Reitz mre...@redhat.com
---
 blockdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 07dac05..14a0bb1 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1102,6 +1102,8 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 assert(dev_info-kind  ARRAY_SIZE(actions));
 
 ops = actions[dev_info-kind];
+assert(ops-instance_size  0);
+
 state = g_malloc0(ops-instance_size);
 state-ops = ops;
 state-action = dev_info;
-- 
1.8.3.1

[Qemu-devel] [Bug 1224444] Re: virtio-serial loses writes when used over virtio-mmio

2013-09-12 Thread Richard Jones

** Description changed:

  virtio-serial appears to lose writes, but only when used on top of
  virtio-mmio.  The scenario is this:
  
  /home/rjones/d/qemu/arm-softmmu/qemu-system-arm \
  -global virtio-blk-device.scsi=off \
  -nodefconfig \
  -nodefaults \
  -nographic \
  -M vexpress-a15 \
  -machine accel=kvm:tcg \
  -m 500 \
  -no-reboot \
  -kernel /home/rjones/d/libguestfs/tmp/.guestfs-1001/kernel.27944 \
  -dtb /home/rjones/d/libguestfs/tmp/.guestfs-1001/dtb.27944 \
  -initrd /home/rjones/d/libguestfs/tmp/.guestfs-1001/initrd.27944 \
  -device virtio-scsi-device,id=scsi \
  -drive 
file=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/scratch.1,cache=unsafe,format=raw,id=hd0,if=none
 \
  -device scsi-hd,drive=hd0 \
  -drive 
file=/home/rjones/d/libguestfs/tmp/.guestfs-1001/root.27944,snapshot=on,id=appliance,cache=unsafe,if=none
 \
  -device scsi-hd,drive=appliance \
  -device virtio-serial-device \
  -serial stdio \
  -chardev 
socket,path=/home/rjones/d/libguestfs/tmp/libguestfsLa9dE2/guestfsd.sock,id=channel0
 \
  -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
  -append 'panic=1 mem=500M console=ttyAMA0 udevtimeout=600 no_timer_check 
acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 
guestfs_verbose=1 TERM=xterm-256color'
  
  After the guest starts up, a daemon writes 4 bytes to a virtio-serial
  socket.  The host side reads these 4 bytes correctly and writes a 64
  byte message.  The guest never sees this message.
  
  I enabled virtio-mmio debugging, and this is what is printed (## = my
  comment):
  
  ## guest opens the socket:
  trying to open virtio-serial channel 
'/dev/virtio-ports/org.libguestfs.channel.0'
  virtio_mmio: virtio_mmio_write offset 0x50 value 0x3
  opened the socket, sock = 3
  udevadm settle
  ## guest writes 4 bytes to the socket:
  virtio_mmio: virtio_mmio_write offset 0x50 value 0x5
  virtio_mmio: virtio_mmio setting IRQ 1
  virtio_mmio: virtio_mmio_read offset 0x60
  virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
  virtio_mmio: virtio_mmio setting IRQ 0
  sent magic GUESTFS_LAUNCH_FLAG
  ## host reads 4 bytes successfully:
  main_loop libguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
  libguestfs: [14605ms] appliance is up
  Guest launched OK.
  ## host writes 64 bytes to socket:
  libguestfs: writing the data to the socket (size = 64)
  waiting for next request
  libguestfs: data written OK
  ## hangs here forever with guest in read() call never receiving any data
  
  I am using qemu from git today (2d1fe1873a984).
  
  Some notes:
  
  - It's not 100% reproducible.  Sometimes everything works fine, although it 
fails often (eg  2/3rds of the time).
  - KVM is being used.
  - We've long used virtio-serial on x86 and have never seen anything like this.
+ 
+ This is what the output looks like when it doesn't fail:
+ 
+ trying to open virtio-serial channel 
'/dev/virtio-ports/org.libguestfs.channel.0
+ '
+ virtio_mmio: virtio_mmio_write offset 0x50 value 0x3
+ opened the socket, sock = 3
+ udevadm settle
+ virtio_mmio: virtio_mmio_write offset 0x50 value 0x5
+ virtio_mmio: virtio_mmio setting IRQ 1
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
+ virtio_mmio: virtio_mmio setting IRQ 0
+ sent magic GUESTlibguestfs: recv_from_daemon: received GUESTFS_LAUNCH_FLAG
+ libguestfs: [14710ms] appliance is up
+ Guest launched OK.
+ libguestfs: writing the data to the socket (size = 64)
+ FS_LAUNCH_FLAG
+ main_loop waiting for next request
+ libguestfs: data written OK
+ virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
+ virtio_mmio: virtio_mmio setting IRQ 1
+ virtio_mmio: virtio_mmio setting IRQ 1
+ virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio setting IRQ 1
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
+ virtio_mmio: virtio_mmio setting IRQ 0
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x0
+ virtio_mmio: virtio_mmio setting IRQ 0
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
+ [... more virtio-mmio lines omitted ...]
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x0
+ virtio_mmio: virtio_mmio setting IRQ 1
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
+ virtio_mmio: virtio_mmio setting IRQ 0
+ guestfsd: main_loop: new request, len 0x3c
+ virtio_mmio: virtio_mmio_write offset 0x50 value 0x4
+ : 20 00 f5 f5 00 00 00 04 00 00 00 d2 00 00 00 00 | 
...|virtio_mmio: virtio_mmio_write offset 0x50 value 0x2
+ virtio_mmio: virtio_mmio setting IRQ 1
+ 
+ virtio_mmio: virtio_mmio_read offset 0x60
+ virtio_mmio: virtio_mmio_write offset 0x64 value 0x1
+ virtio_mmio: virtio_mmio setting IRQ 0
+ 0010: 00 12 34 00 00 00 00 00 00 00 00

Re: [Qemu-devel] [PATCH v3] e1000: NetClientInfo.receive_iov implemented

2013-09-12 Thread Stefan Hajnoczi

On Thu, Sep 12, 2013 at 10:47:37AM +0200, Vincenzo Maffione wrote:
 This patch implements the NetClientInfo.receive_iov method for the
 e1000 device emulation. In this way a network backend that uses
 qemu_sendv_packet() can deliver the fragmented packet without
 requiring an additional copy in the frontend/backend network code
 (nc_sendv_compat() function).
 
 The existing method NetClientInfo.receive has been reimplemented
 using the new method.
 
 Signed-off-by: Vincenzo Maffione v.maffi...@gmail.com
 ---
  hw/net/e1000.c | 70 
 --
  1 file changed, 58 insertions(+), 12 deletions(-)
 
 I propose this patch also because our research group (University of Pisa,
 Department of Computer Engineering) is working on the e1000 device
 (optimizations and paravirtual extensions) and we have patches to
 support the VALE switch as a network backend (see
 http://info.iet.unipi.it/~luigi/vale/).
 The VALE backend uses qemu_sendv_packet() to send fragmented packets: For
 this reason we think it could be interesting to better support these packets
 with e1000.

Thanks, applied to my net-next tree:
https://github.com/stefanha/qemu/commits/net-next

Stefan

Re: [Qemu-devel] [PATCH v3 01/29] tcg-aarch64: Set ext based on TCG_OPF_64BIT

2013-09-12 Thread Richard Henderson

On 09/12/2013 01:58 AM, Peter Maydell wrote:
 On 12 September 2013 09:25, Claudio Fontana claudio.font...@huawei.com 
 wrote:
 On 02.09.2013 19:54, Richard Henderson wrote:

 -case INDEX_op_bswap64_i64:
 -ext = 1; /* fall through */
  case INDEX_op_bswap32_i64:
 +/* Despite the _i64, this is a 32-bit bswap.  */
 +ext = 0;
 +/* FALLTHRU */
 +case INDEX_op_bswap64_i64:

 we waste too much y space here, which gives context and is a scarse resource.
 What about

 case INDEX_op_bswap32_i64: /* Despite the _i64, this is a 32-bit bswap.  */
 ext = false; /* FALLTHRU */
 
 Consensus in the rest of the code is for /* fall through */
 rather than /* FALLTHRU */ -- there's only 28 of the
 latter compared to 169 of the former.

Those 28 may well be all mine too.  The fingers still remember
that one must use FALLTHRU for lint.   ;-)


r~

Re: [Qemu-devel] [PATCH v3 02/29] tcg-aarch64: Change all ext variables to bool

2013-09-12 Thread Richard Henderson

On 09/12/2013 01:29 AM, Claudio Fontana wrote:
 I see the problem related to the previous patch.
 What about continuing to use int in the previous patch,
 and replace it with bool in this one? The previous patch would only target 
 the way ext is set,
 and this one would really contain all bool changes.

For the next edition, I'm planning to swap the order of the first two
patches, and also quit using bool.

I think TCGType is a bit more descriptive here, and also uses the
same 0/1 values that we want.  I added a QEMU_BUILD_BUG_ON to assert
that the enum values don't change.



r~

1 2 >

1 - 100 of 197 matches

Mail list logo