date:20130106

[Qemu-devel] [PATCH V2 07/10] snapshot: qmp use new internal API for external snapshot transaction

2013-01-06 Thread Wenchao Xia

  This patch switch to internal common API to take group external
snapshots from qmp_transaction interface. qmp layer simply does
a translation from user input.

Signed-off-by: Wenchao Xia 
---
 blockdev.c |  215 
 1 files changed, 87 insertions(+), 128 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 668506d..299039f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1173,157 +1173,116 @@ void qmp_blockdev_snapshot_sync(const char *device, 
const char *snapshot_file,
errp);
 }
 
-
-/* New and old BlockDriverState structs for group snapshots */
-typedef struct BlkTransactionStates {
-BlockDriverState *old_bs;
-BlockDriverState *new_bs;
-QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
-} BlkTransactionStates;
-
-/*
- * 'Atomic' group snapshots.  The snapshots are taken as a set, and if any fail
- *  then we do not pivot any of the devices in the group, and abandon the
- *  snapshots
- */
-void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
+/* translation from qmp commands */
+static int fill_blk_trs_ext_create_sync(BlockdevSnapshot *create_sync,
+BlkTransStatesSync *st_sync,
+Error **errp)
 {
-int ret = 0;
-BlockdevActionList *dev_entry = dev_list;
-BlkTransactionStates *states, *next;
-Error *local_err = NULL;
-
-QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionStates) snap_bdrv_states;
-QSIMPLEQ_INIT(&snap_bdrv_states);
+const char *format = "qcow2";
+enum NewImageMode mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+BlockDriverState *bs;
+const char *device = create_sync->device;
+const char *name = create_sync->snapshot_file;
 
-/* drain all i/o before any snapshots */
-bdrv_drain_all();
+if (create_sync->has_mode) {
+mode = create_sync->mode;
+}
+if (create_sync->has_format) {
+format = create_sync->format;
+}
 
-/* We don't do anything in this loop that commits us to the snapshot */
-while (NULL != dev_entry) {
-BlockdevAction *dev_info = NULL;
-BlockDriver *proto_drv;
-BlockDriver *drv;
-int flags;
-enum NewImageMode mode;
-const char *new_image_file;
-const char *device;
-const char *format = "qcow2";
+/* find the target bs */
+bs = bdrv_find(device);
+if (!bs) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return -1;
+}
 
-dev_info = dev_entry->value;
-dev_entry = dev_entry->next;
+switch (mode) {
+case NEW_IMAGE_MODE_EXISTING:
+st_sync->use_existing = true;
+break;
+case NEW_IMAGE_MODE_ABSOLUTE_PATHS:
+st_sync->use_existing = false;
+break;
+default:
+error_setg(errp, "Device %s requested invalid snapshot"
+ " mode %d.", device, mode);
+return -1;
+}
 
-states = g_malloc0(sizeof(BlkTransactionStates));
-QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, states, entry);
+st_sync->external.new_image_file = name;
+st_sync->external.format = format;
+st_sync->external.old_bs = bs;
 
-switch (dev_info->kind) {
-case BLOCKDEV_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
-device = dev_info->blockdev_snapshot_sync->device;
-if (!dev_info->blockdev_snapshot_sync->has_mode) {
-dev_info->blockdev_snapshot_sync->mode = 
NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-}
-new_image_file = dev_info->blockdev_snapshot_sync->snapshot_file;
-if (dev_info->blockdev_snapshot_sync->has_format) {
-format = dev_info->blockdev_snapshot_sync->format;
-}
-mode = dev_info->blockdev_snapshot_sync->mode;
-break;
-default:
-abort();
-}
+return 0;
+}
 
-drv = bdrv_find_format(format);
-if (!drv) {
-error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-goto delete_and_fail;
-}
+static int fill_blk_trs(BlockdevAction *dev_info,
+BlkTransStates *states,
+SNTime *time,
+const char *time_str,
+Error **errp)
+{
+int ret = 0;
 
-states->old_bs = bdrv_find(device);
-if (!states->old_bs) {
-error_set(errp, QERR_DEVICE_NOT_FOUND, device);
-goto delete_and_fail;
-}
+switch (dev_info->kind) {
+case BLOCKDEV_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
+states->st_sync.type = BLK_SNAPSHOT_EXTERNAL;
+states->st_sync.op = BLK_SN_SYNC_CREATE;
+ret = fill_blk_trs_ext_create_sync(dev_info->blockdev_snapshot_sync,
+   &states->st_sync,
+   errp);
+break;
+default:
+abort();
+}
 
-if (!bdrv_i

[Qemu-devel] [PATCH V2 06/10] snapshot: implemention of internal common API to take snapshots

2013-01-06 Thread Wenchao Xia

  This patch is the implemention of transaction based snapshot
internal API. Internal snapshot for specified block device
is added, which can be used as part of functionality in one way
to live full back up vm seperating internal block snapshot and
vmstate(vmstate goes to another file, not implemented in this
patch).
  Every request's handler need to have three function:
execution, updating, canceling. Also another check function
could be provided to check if request is valid before execition.
  internal snapshot implemention was based on the code from
diet...@proxmox.com.

v2:
  Remove space in comments.
  Use qemu_timeval and localtime_r on both Linux and Windows.
  Use true instread of TRUE for bool flag.
  Factor out error dealing code in submit_transaction.

Signed-off-by: Wenchao Xia 
Signed-off-by: Dietmar Maurer 
---
 blockdev.c |  484 
 1 files changed, 484 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index d724e2d..668506d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -660,6 +660,490 @@ void do_commit(Monitor *mon, const QDict *qdict)
 }
 }
 
+
+/* Snapshot functions.
+ * Following are implemention around core struct BlkTransStates.
+ * to start, invalidate, cancel the action.
+ */
+
+/* Block internal snapshot(qcow2, sheepdog image...) support.
+ * Checking and execution was splitted to enable a check for every device
+ * before execution in group case.
+ */
+
+SNTime get_sn_time(void)
+{
+SNTime time;
+qemu_gettimeofday(&time.tv);
+time.vm_clock_nsec = qemu_get_clock_ns(vm_clock);
+return time;
+}
+
+void generate_sn_name_from_time(SNTime *time, char *time_str, int size)
+{
+time_t t = time->tv.tv_sec;
+struct tm tm;
+localtime_r(&t, &tm);
+strftime(time_str, size, "vm-%Y%m%d%H%M%S", &tm);
+}
+
+static int blk_sn_check_create_internal_sync(BlkTransStates *states,
+ Error **errp)
+{
+BlkTransStatesSync *st_sync = &states->st_sync;
+BlockDriverState *bs = st_sync->internal.bs;
+const char *device = bdrv_get_device_name(bs);
+QEMUSnapshotInfo *old_sn = &st_sync->internal.old_sn;
+const char *name = st_sync->internal.sn_name;
+bool use_existing = st_sync->use_existing;
+int ret;
+
+if (!bdrv_is_inserted(bs)) {
+error_set_check(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
+return -1;
+}
+
+if (bdrv_is_read_only(bs)) {
+error_set_check(errp, QERR_DEVICE_IS_READ_ONLY, device);
+return -1;
+}
+
+if (!bdrv_can_snapshot(bs)) {
+error_set_check(errp, QERR_NOT_SUPPORTED);
+return -1;
+}
+
+ret = bdrv_snapshot_find(bs, old_sn, name);
+if (ret >= 0) {
+st_sync->internal.old_sn_exist = true;
+} else {
+if (use_existing) {
+/* caller require use existing one */
+error_set_check(errp, ERROR_CLASS_GENERIC_ERROR,
+"snapshot '%s' not exists on '%s' but caller "
+"want to use it.",
+name, device);
+return -1;
+}
+}
+
+st_sync->internal.step = BLK_SNAPSHOT_INT_START;
+return 0;
+}
+
+static int blk_sn_create_internal_sync(BlkTransStates *states,
+   Error **errp)
+{
+BlkTransStatesSync *st_sync = &states->st_sync;
+BlockDriverState *bs = st_sync->internal.bs;
+const char *name = st_sync->internal.sn_name;
+QEMUSnapshotInfo *sn = &st_sync->internal.sn;
+int ret = 0;
+const char *device = bdrv_get_device_name(bs);
+QEMUSnapshotInfo *old_sn = &st_sync->internal.old_sn;
+
+SNTime *time = &st_sync->internal.time;
+if (time->vm_clock_nsec == 0) {
+/* not preset, it need to be set */
+error_setg_check(errp,
+ "Invalid timestamp was set on for '%s' on '%s'\n",
+ name, device);
+return -1;
+}
+
+sn->date_sec = time->tv.tv_sec;
+sn->date_nsec = time->tv.tv_usec * 1000;
+sn->vm_clock_nsec = time->vm_clock_nsec;
+sn->vm_state_size = st_sync->internal.vm_state_size;
+
+if (st_sync->internal.old_sn_exist) {
+ret = bdrv_snapshot_delete(bs, old_sn->id_str);
+if (ret < 0) {
+error_setg_check(errp,
+   "Error in deleting existing snapshot %s on '%s'\n",
+   name, device);
+return -1;
+}
+pstrcpy(sn->name, sizeof(sn->name), old_sn->name);
+pstrcpy(sn->id_str, sizeof(sn->id_str), old_sn->id_str);
+} else {
+pstrcpy(sn->name, sizeof(sn->name), name);
+}
+
+ret = bdrv_snapshot_create(bs, sn);
+if (ret < 0) {
+error_set_check(errp, ERROR_CLASS_GENERIC_ERROR,
+  "Error while creating snapshot on '%s'\n", device);
+return -1;
+}
+
+st_sync->internal.step = BLK_SNAPSHOT_INT_CREATED;
+return

[Qemu-devel] [PATCH V2 05/10] snapshot: design of internal common API to take snapshots

2013-01-06 Thread Wenchao Xia

  This patch added API to take snapshots in unified style for
both internal or external type. The core structure is based
on transaction, for that there is a qmp interface need to support
, qmp_transaction, so all operations are packed as requests.
  In this way a sperate internal layer for snapshot is splitted
out from qmp layer, and now qmp can just translate the user request
and fill in internal API. Internal API use params defined inside
qemu, so other component inside qemu can use it without considering
the qmp's parameter format.

v2:
  Renamed BlkTransaction* to BlkTrans* to make name shorter.
  Use present tense in comments.
  Deleted async snapshot related member.
  Use qemu_timeval in SNTime.

Signed-off-by: Wenchao Xia 
---
 include/sysemu/blockdev.h |  119 +
 1 files changed, 119 insertions(+), 0 deletions(-)

diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index 1fe5332..4d61485 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -66,4 +66,123 @@ void qmp_change_blockdev(const char *device, const char 
*filename,
  bool has_format, const char *format, Error **errp);
 void do_commit(Monitor *mon, const QDict *qdict);
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
+
+
+/* snapshot transaction API.
+ * Split out a layer around core struct BlkTransStates, so other
+ * component in qemu can fill the request and simply use the API to submit,
+ * QMP may just use part of the API's function, no need to expose all internal
+ * function to user.
+ */
+
+/* sync snapshot */
+
+typedef enum BlkTransOperationSync {
+BLK_SN_SYNC_CREATE = 0,
+BLK_SN_SYNC_DELETE,
+} BlkTransOperationSync;
+
+/* internal snapshot */
+
+typedef struct SNTime {
+qemu_timeval tv;
+uint64_t vm_clock_nsec; /* VM clock relative to boot */
+} SNTime;
+
+typedef enum BlkSnapshotIntStep {
+BLK_SNAPSHOT_INT_START = 0,
+BLK_SNAPSHOT_INT_CREATED,
+BLK_SNAPSHOT_INT_CANCELED,
+} BlkSnapshotIntStep;
+
+typedef struct BlkSnapshotInternal {
+/* caller input */
+const char *sn_name; /* must be set in create/delete. */
+BlockDriverState *bs; /* must be set in create/delete */
+SNTime time; /* must be set in create. */
+uint64_t vm_state_size; /* optional, default is 0, only valid in create. */
+/* following are used internal */
+QEMUSnapshotInfo sn;
+QEMUSnapshotInfo old_sn;
+bool old_sn_exist;
+BlkSnapshotIntStep step;
+} BlkSnapshotInternal;
+
+/* external snapshot */
+
+typedef enum BlkSnapshotExtStep {
+BLK_SNAPSHOT_EXT_START = 0,
+BLK_SNAPSHOT_EXT_CREATED,
+BLK_SNAPSHOT_EXT_INVALIDATED,
+BLK_SNAPSHOT_EXT_CANCELED,
+} BlkSnapshotExtStep;
+
+typedef struct BlkSnapshotExternal {
+/* caller input */
+const char *new_image_file; /* must be set in create/delete. */
+BlockDriverState *old_bs; /* must be set in create/delete. */
+const char *format; /* must be set in create. */
+/* following are used internal */
+BlockDriverState *new_bs;
+BlockDriver *format_drv;
+BlkSnapshotExtStep step;
+} BlkSnapshotExternal;
+
+typedef enum BlkSnapshotType {
+BLK_SNAPSHOT_INTERNAL = 0,
+BLK_SNAPSHOT_EXTERNAL,
+BLK_SNAPSHOT_NOSUPPORT,
+} BlkSnapshotType;
+
+/* for simple sync type params are all put here ignoring the difference of
+   different operation type as create/delete. */
+typedef struct BlkTransStatesSync {
+/* caller input */
+BlkSnapshotType type;
+union {
+BlkSnapshotInternal internal;
+BlkSnapshotExternal external;
+};
+bool use_existing;
+BlkTransOperationSync op;
+} BlkTransStatesSync;
+
+/* Core structure for group snapshots, fill in it and then call the API. */
+typedef struct BlkTransStates BlkTransStates;
+
+struct BlkTransStates {
+/* caller input */
+BlkTransStatesSync st_sync;
+/* following are used internal */
+Error *err;
+int (*blk_trans_do)(BlkTransStates *states, Error **errp);
+int (*blk_trans_invalid)(BlkTransStates *states, Error **errp);
+int (*blk_trans_cancel)(BlkTransStates *states, Error **errp);
+QSIMPLEQ_ENTRY(BlkTransStates) entry;
+};
+
+typedef QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransStates) \
+  BlkTransStatesList;
+
+/* API */
+BlkTransStates *blk_trans_st_new(void);
+void blk_trans_st_delete(BlkTransStates **p_st);
+BlkTransStatesList *blk_trans_st_list_new(void);
+void blk_trans_st_list_delete(BlkTransStatesList **p_list);
+
+/* add a request to list, request would be checked to see if it is valid,
+   return -1 when met error and states would not be queued. */
+int add_transaction(BlkTransStatesList *list,
+BlkTransStates *states,
+Error **errp);
+
+/* 'Atomic' submit the request. In snapshot creation case, if any
+ * fail then we do not pivot any of the devices in the group, and abandon the
+ * snapshots
+ */
+int submit_transac

[Qemu-devel] [PATCH V2 02/10] block: add function deappend()

2013-01-06 Thread Wenchao Xia

  This function should revert the append operation.

Signed-off-by: Wenchao Xia 
---
 block.c   |   21 +
 include/block/block.h |1 +
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 09208c2..48ddf64 100644
--- a/block.c
+++ b/block.c
@@ -1376,6 +1376,27 @@ void bdrv_append(BlockDriverState *bs_new, 
BlockDriverState *bs_top)
 bs_new->drv ? bs_new->drv->format_name : "");
 }
 
+/*
+ * Remove bs contents which is at the top of an image chain while
+ * the chain is live. After bdrv_append(bs_new, bs_top) called,
+ * bs_top keeps as the top of the chain but bs_new become the 2nd
+ * top one, so call bdrv_deappend(bs_new, bs_top) to cancel the change.
+ * As a result, bs_top still keeps the top position but bs_new is
+ * is discarded.
+ *
+ * This will modify the BlockDriverState fields, and swap contents
+ * between bs_old and bs_top. Both bs_old and bs_top are modified.
+ *
+ * bs_old and bs_top should be the pairs which have been used in
+ * bdrv_append().
+ *
+ * This function does not delete any image files.
+ */
+void bdrv_deappend(BlockDriverState *bs_old, BlockDriverState *bs_top)
+{
+bdrv_swap(bs_old, bs_top);
+}
+
 void bdrv_delete(BlockDriverState *bs)
 {
 assert(!bs->dev);
diff --git a/include/block/block.h b/include/block/block.h
index a49fd71..ae6a5ae 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -130,6 +130,7 @@ BlockDriverState *bdrv_new(const char *device_name);
 void bdrv_make_anon(BlockDriverState *bs);
 void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old);
 void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
+void bdrv_deappend(BlockDriverState *bs_old, BlockDriverState *bs_top);
 void bdrv_delete(BlockDriverState *bs);
 int bdrv_parse_cache_flags(const char *mode, int *flags);
 int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
-- 
1.7.1

[Qemu-devel] [PATCH V2 01/10] block: export function bdrv_find_snapshot()

2013-01-06 Thread Wenchao Xia

  This function is moved from savevm.c to block.c, and exported.

Signed-off-by: Wenchao Xia 
---
 block.c   |   23 +++
 include/block/block.h |2 ++
 savevm.c  |   22 --
 3 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/block.c b/block.c
index 4e28c55..09208c2 100644
--- a/block.c
+++ b/block.c
@@ -3210,6 +3210,29 @@ int bdrv_snapshot_load_tmp(BlockDriverState *bs,
 return -ENOTSUP;
 }
 
+int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
+   const char *name)
+{
+QEMUSnapshotInfo *sn_tab, *sn;
+int nb_sns, i, ret;
+
+ret = -ENOENT;
+nb_sns = bdrv_snapshot_list(bs, &sn_tab);
+if (nb_sns < 0) {
+return ret;
+}
+for (i = 0; i < nb_sns; i++) {
+sn = &sn_tab[i];
+if (!strcmp(sn->id_str, name) || !strcmp(sn->name, name)) {
+*sn_info = *sn;
+ret = 0;
+break;
+}
+}
+g_free(sn_tab);
+return ret;
+}
+
 /* backing_file can either be relative, or absolute, or a protocol.  If it is
  * relative, it must be relative to the chain.  So, passing in bs->filename
  * from a BDS as backing_file should not be done, as that may be relative to
diff --git a/include/block/block.h b/include/block/block.h
index 0719339..a49fd71 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -330,6 +330,8 @@ int bdrv_snapshot_list(BlockDriverState *bs,
 int bdrv_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_name);
 char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
+int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
+   const char *name);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
 int path_is_absolute(const char *path);
diff --git a/savevm.c b/savevm.c
index 529d60e..f16a920 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2036,28 +2036,6 @@ out:
 return ret;
 }
 
-static int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
-  const char *name)
-{
-QEMUSnapshotInfo *sn_tab, *sn;
-int nb_sns, i, ret;
-
-ret = -ENOENT;
-nb_sns = bdrv_snapshot_list(bs, &sn_tab);
-if (nb_sns < 0)
-return ret;
-for(i = 0; i < nb_sns; i++) {
-sn = &sn_tab[i];
-if (!strcmp(sn->id_str, name) || !strcmp(sn->name, name)) {
-*sn_info = *sn;
-ret = 0;
-break;
-}
-}
-g_free(sn_tab);
-return ret;
-}
-
 /*
  * Deletes snapshots of a given name in all opened images.
  */
-- 
1.7.1

[Qemu-devel] [PATCH V2 03/10] error: add function error_set_check()

2013-01-06 Thread Wenchao Xia

  This function will return instead of abort when *errp is not NULL.
Also macro error_setg_check is added.

Signed-off-by: Wenchao Xia 
---
 error.c  |   19 +++
 include/qapi/error.h |5 +
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/error.c b/error.c
index 519f6b6..e4ec67c 100644
--- a/error.c
+++ b/error.c
@@ -43,6 +43,25 @@ void error_set(Error **errp, ErrorClass err_class, const 
char *fmt, ...)
 *errp = err;
 }
 
+void error_set_check(Error **errp, ErrorClass err_class, const char *fmt, ...)
+{
+Error *err;
+va_list ap;
+
+if ((errp == NULL) || (*errp != NULL)) {
+return;
+}
+
+err = g_malloc0(sizeof(*err));
+
+va_start(ap, fmt);
+err->msg = g_strdup_vprintf(fmt, ap);
+va_end(ap);
+err->err_class = err_class;
+
+*errp = err;
+}
+
 void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,
  const char *fmt, ...)
 {
diff --git a/include/qapi/error.h b/include/qapi/error.h
index 5cd2f0c..04b87a9 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -29,6 +29,9 @@ typedef struct Error Error;
  */
 void error_set(Error **err, ErrorClass err_class, const char *fmt, ...) 
GCC_FMT_ATTR(3, 4);
 
+void error_set_check(Error **errp, ErrorClass err_class, const char *fmt, ...)
+GCC_FMT_ATTR(3, 4);
+
 /**
  * Set an indirect pointer to an error given a ErrorClass value and a
  * printf-style human message, followed by a strerror() string if
@@ -41,6 +44,8 @@ void error_set_errno(Error **err, int os_error, ErrorClass 
err_class, const char
  */
 #define error_setg(err, fmt, ...) \
 error_set(err, ERROR_CLASS_GENERIC_ERROR, fmt, ## __VA_ARGS__)
+#define error_setg_check(err, fmt, ...) \
+error_set_check(err, ERROR_CLASS_GENERIC_ERROR, fmt, ## __VA_ARGS__)
 #define error_setg_errno(err, os_error, fmt, ...) \
 error_set_errno(err, os_error, ERROR_CLASS_GENERIC_ERROR, fmt, ## 
__VA_ARGS__)
 
-- 
1.7.1

[Qemu-devel] [PATCH V2 00/10] snapshot: take block snapshots in unified way

2013-01-06 Thread Wenchao Xia

  These patch added a seperated layer to take internal or external snapshots
in a unified way, the granularity is block device, so other functions can
just combine the request and submit, such as group snapshot, savevm.

Total goal are:
  Live back up vm in external or internal image, which need three functions:
1 live snapshot block device internal/external.
2 live save vmstate internal/external.
3 combination of the function unit.

  This serial provide part one.

v2:
  1) seperate patches for bdrv_deappend() and bdrv_snapshot_find().
  2) use error set function which will keep first error instead of replacing a
existing error.
  3) use qemu_timespec for snapshot timestamp, added a lock in localtime_r() on
windows and use it for timestamp.
  4) remove space in comments and use present tense.
  5) remove member async in BlkTransactionStates and async related structure,
which is not supported now.
  6) remove snapshot delete function.
  7) remove snapshot info function.
  8) redesign qmp interface, new interface is added instead of using parameter
to distinguish internal case in old external interface.
  9) renamed parameter in hmp interface, and better tips for it.
  10) patches are splitted to more smaller ones.
  11) code adjust according to comments.

Wenchao Xia (10):
  block: export function bdrv_find_snapshot()
  block: add function deappend()
  error: add function error_set_check()
  oslib-win32: add lock for time functions
  snapshot: design of internal common API to take snapshots
  snapshot: implemention of internal common API to take snapshots
  snapshot: qmp use new internal API for external snapshot transaction
  snapshot: qmp add internal snapshot transaction interface
  snapshot: qmp add blockdev-snapshot-internal-sync interface
  snapshot: hmp add internal snapshot support for block device

 block.c   |   44 +++
 blockdev.c|  753 +
 error.c   |   19 ++
 hmp-commands.hx   |   28 +-
 hmp.c |   25 +-
 include/block/block.h |3 +
 include/qapi/error.h  |5 +
 include/sysemu/blockdev.h |  119 +++
 oslib-win32.c |   12 +-
 qapi-schema.json  |   50 +++-
 qmp-commands.hx   |   52 +++-
 savevm.c  |   22 --
 12 files changed, 954 insertions(+), 178 deletions(-)

Re: [Qemu-devel] QEMU CDROM tray-open and locked logic

2013-01-06 Thread Peter Lieven

Hi Paolo,

Am 04.01.2013 um 19:42 schrieb Paolo Bonzini :

> Il 04/01/2013 11:26, Peter Lieven ha scritto:
>> Hi,
>> 
>> i have observed the following with qemu-kvm-1.2.0 which I think is not right:
>> 
>> a) if the CDROM is locked and I sent a eject command I get the error that the
>> CDROM is locked, but its ejected nevertheless.
> 
> That's because the CD-ROM sends an "eject requested" event, and udev
> does the unlock & eject itself.

ah ok, this explains it.

> 
>> b) if I eject the CDROM in the OS I see tray-open=1 and locked=1. In the
>> tray-open=1 state the CDROM can`t be locked, can it?
> 
> A "real" CD-ROM drive could have the tray open and not responding to the
> button.  Table 330 of MMC6 matches "Operation Lock, current prevent
> state Locked, No media present" to "No Error, media insertion is not
> permitted".  I cannot check right now what happens later and whether
> QEMU's behavior makes sense.

okay, so this could be also correct.

last question. if the OS opens the tray, the cdrom is still inserted in the
case that the iso is still inserted. this behavior is also ok, but its a little
confusing. this leads to a lot of cases that have to be checked regarding
cdroms in qemu.

thank you,
peter

[Qemu-devel] [PATCH V10 1/4] add def_print_str and use it in qemu_opts_print.

2013-01-06 Thread Dong Xu Wang

qemu_opts_print has no user now, so can re-write the function safely.

qemu_opts_print will be used while using "qemu-img create", it will
produce the same output as previous code.

The behavior of this function has changed:

1. Print every possible option, whether a value has been set or not.
2. Option descriptors may provide a default value.
3. Print to stdout instead of stderr.

Previously the behavior was to print every option that has been set.
Options that have not been set would be skipped.

Signed-off-by: Dong Xu Wang 
---
v7->v8:
1) print "elements => accept any params" while opts_accepts_any() ==
true.
2) since def_print_str is the default value if an option isn't set,
so rename it to def_value_str.

 include/qemu/option.h |1 +
 qemu-option.c |   31 ---
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index ba197cd..394170a 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -96,6 +96,7 @@ typedef struct QemuOptDesc {
 const char *name;
 enum QemuOptType type;
 const char *help;
+const char *def_value_str;
 } QemuOptDesc;
 
 struct QemuOptsList {
diff --git a/qemu-option.c b/qemu-option.c
index f532b76..6f19fd3 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -862,15 +862,32 @@ void qemu_opts_del(QemuOpts *opts)
 
 int qemu_opts_print(QemuOpts *opts, void *dummy)
 {
-QemuOpt *opt;
+QemuOptDesc *desc = opts->list->desc;
 
-fprintf(stderr, "%s: %s:", opts->list->name,
-opts->id ? opts->id : "");
-QTAILQ_FOREACH(opt, &opts->head, next) {
-fprintf(stderr, " %s=\"%s\"", opt->name, opt->str);
+if (desc[0].name == NULL) {
+printf("no elements => accept any params");
+return 0;
 }
-fprintf(stderr, "\n");
-return 0;
+for (; desc && desc->name; desc++) {
+const char *value = desc->def_value_str;
+QemuOpt *opt;
+
+opt = qemu_opt_find(opts, desc->name);
+if (opt) {
+value = opt->str;
+}
+
+if (!value) {
+continue;
+}
+
+if (desc->type == QEMU_OPT_STRING) {
+printf("%s='%s' ", desc->name, value);
+} else {
+printf("%s=%s ", desc->name, value);
+}
+}
+ return 0;
 }
 
 static int opts_do_parse(QemuOpts *opts, const char *params,
-- 
1.7.1

[Qemu-devel] [PATCH V10 2/4] Create four opts list related functions

2013-01-06 Thread Dong Xu Wang

This patch will create 4 functions, count_opts_list, append_opts_list,
free_opts_list and print_opts_list, they will used in following commits.

Signed-off-by: Dong Xu Wang 
---
v6->v7):
1) Fix typo.

v5->v6):
1) allocate enough space in append_opts_list function.

 include/qemu/option.h |4 ++
 qemu-option.c |   90 +
 2 files changed, 94 insertions(+), 0 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index 394170a..f784c2e 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -156,4 +156,8 @@ int qemu_opts_print(QemuOpts *opts, void *dummy);
 int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void 
*opaque,
   int abort_on_failure);
 
+QemuOptsList *append_opts_list(QemuOptsList *dest,
+   QemuOptsList *list);
+void free_opts_list(QemuOptsList *list);
+void print_opts_list(QemuOptsList *list);
 #endif
diff --git a/qemu-option.c b/qemu-option.c
index 6f19fd3..227daa9 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -1149,3 +1149,93 @@ int qemu_opts_foreach(QemuOptsList *list, 
qemu_opts_loopfunc func, void *opaque,
 loc_pop(&loc);
 return rc;
 }
+
+static size_t count_opts_list(QemuOptsList *list)
+{
+size_t i = 0;
+
+while (list && list->desc[i].name) {
+i++;
+}
+
+return i;
+}
+
+/* Create a new QemuOptsList and make its desc to the merge of first and 
second.
+ * It will allocate space for one new QemuOptsList plus enough space for
+ * QemuOptDesc in first and second QemuOptsList. First argument's QemuOptDesc
+ * members take precedence over second's.
+ */
+QemuOptsList *append_opts_list(QemuOptsList *first,
+   QemuOptsList *second)
+{
+size_t num_first_options, num_second_options;
+QemuOptsList *dest = NULL;
+int i = 0;
+int index = 0;
+
+num_first_options = count_opts_list(first);
+num_second_options = count_opts_list(second);
+if (num_first_options + num_second_options == 0) {
+return NULL;
+}
+
+dest = g_malloc0(sizeof(QemuOptsList)
++ (num_first_options + num_second_options + 1) * sizeof(QemuOptDesc));
+
+dest->name = "append_opts_list";
+dest->implied_opt_name = NULL;
+dest->merge_lists = false;
+QTAILQ_INIT(&dest->head);
+while (first && (first->desc[i].name)) {
+if (!find_desc_by_name(dest->desc, first->desc[i].name)) {
+dest->desc[index].name = g_strdup(first->desc[i].name);
+dest->desc[index].help = g_strdup(first->desc[i].help);
+dest->desc[index].type = first->desc[i].type;
+dest->desc[index].def_value_str =
+g_strdup(first->desc[i].def_value_str);
+++index;
+   }
+i++;
+}
+i = 0;
+while (second && (second->desc[i].name)) {
+if (!find_desc_by_name(dest->desc, second->desc[i].name)) {
+dest->desc[index].name = g_strdup(first->desc[i].name);
+dest->desc[index].help = g_strdup(first->desc[i].help);
+dest->desc[index].type = second->desc[i].type;
+dest->desc[index].def_value_str =
+g_strdup(second->desc[i].def_value_str);
+++index;
+}
+i++;
+}
+dest->desc[index].name = NULL;
+return dest;
+}
+
+void free_opts_list(QemuOptsList *list)
+{
+int i = 0;
+
+while (list && list->desc[i].name) {
+g_free((char *)list->desc[i].name);
+g_free((char *)list->desc[i].help);
+g_free((char *)list->desc[i].def_value_str);
+i++;
+}
+
+g_free(list);
+}
+
+void print_opts_list(QemuOptsList *list)
+{
+int i = 0;
+printf("Supported options:\n");
+while (list && list->desc[i].name) {
+printf("%-16s %s\n", list->desc[i].name,
+list->desc[i].help ?
+list->desc[i].help : "No description available");
+i++;
+}
+}
-- 
1.7.1

[Qemu-devel] [PATCH V10 4/4] remove QEMUOptionParameter related functions and struct

2013-01-06 Thread Dong Xu Wang

Signed-off-by: Dong Xu Wang 
---
 include/qemu/option.h |   32 --
 qemu-option.c |  285 -
 2 files changed, 0 insertions(+), 317 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index f784c2e..094663b 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -38,17 +38,6 @@ enum QEMUOptionParType {
 OPT_STRING,
 };
 
-typedef struct QEMUOptionParameter {
-const char *name;
-enum QEMUOptionParType type;
-union {
-uint64_t n;
-char* s;
-} value;
-const char *help;
-} QEMUOptionParameter;
-
-
 const char *get_opt_name(char *buf, int buf_size, const char *p, char delim);
 const char *get_opt_value(char *buf, int buf_size, const char *p);
 int get_next_param_value(char *buf, int buf_size,
@@ -58,27 +47,6 @@ int get_param_value(char *buf, int buf_size,
 int check_params(char *buf, int buf_size,
  const char * const *params, const char *str);
 
-
-/*
- * The following functions take a parameter list as input. This is a pointer to
- * the first element of a QEMUOptionParameter array which is terminated by an
- * entry with entry->name == NULL.
- */
-
-QEMUOptionParameter *get_option_parameter(QEMUOptionParameter *list,
-const char *name);
-int set_option_parameter(QEMUOptionParameter *list, const char *name,
-const char *value);
-int set_option_parameter_int(QEMUOptionParameter *list, const char *name,
-uint64_t value);
-QEMUOptionParameter *append_option_parameters(QEMUOptionParameter *dest,
-QEMUOptionParameter *list);
-QEMUOptionParameter *parse_option_parameters(const char *param,
-QEMUOptionParameter *list, QEMUOptionParameter *dest);
-void free_option_parameters(QEMUOptionParameter *list);
-void print_option_parameters(QEMUOptionParameter *list);
-void print_option_help(QEMUOptionParameter *list);
-
 /* -- */
 
 typedef struct QemuOpt QemuOpt;
diff --git a/qemu-option.c b/qemu-option.c
index 227daa9..fec0250 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -153,22 +153,6 @@ int check_params(char *buf, int buf_size,
 return 0;
 }
 
-/*
- * Searches an option list for an option with the given name
- */
-QEMUOptionParameter *get_option_parameter(QEMUOptionParameter *list,
-const char *name)
-{
-while (list && list->name) {
-if (!strcmp(list->name, name)) {
-return list;
-}
-list++;
-}
-
-return NULL;
-}
-
 static void parse_option_bool(const char *name, const char *value, bool *ret,
   Error **errp)
 {
@@ -240,275 +224,6 @@ static void parse_option_size(const char *name, const 
char *value,
 }
 }
 
-/*
- * Sets the value of a parameter in a given option list. The parsing of the
- * value depends on the type of option:
- *
- * OPT_FLAG (uses value.n):
- *  If no value is given, the flag is set to 1.
- *  Otherwise the value must be "on" (set to 1) or "off" (set to 0)
- *
- * OPT_STRING (uses value.s):
- *  value is strdup()ed and assigned as option value
- *
- * OPT_SIZE (uses value.n):
- *  The value is converted to an integer. Suffixes for kilobytes etc. are
- *  allowed (powers of 1024).
- *
- * Returns 0 on succes, -1 in error cases
- */
-int set_option_parameter(QEMUOptionParameter *list, const char *name,
-const char *value)
-{
-bool flag;
-Error *local_err = NULL;
-
-// Find a matching parameter
-list = get_option_parameter(list, name);
-if (list == NULL) {
-fprintf(stderr, "Unknown option '%s'\n", name);
-return -1;
-}
-
-// Process parameter
-switch (list->type) {
-case OPT_FLAG:
-parse_option_bool(name, value, &flag, &local_err);
-if (!error_is_set(&local_err)) {
-list->value.n = flag;
-}
-break;
-
-case OPT_STRING:
-if (value != NULL) {
-list->value.s = g_strdup(value);
-} else {
-fprintf(stderr, "Option '%s' needs a parameter\n", name);
-return -1;
-}
-break;
-
-case OPT_SIZE:
-parse_option_size(name, value, &list->value.n, &local_err);
-break;
-
-default:
-fprintf(stderr, "Bug: Option '%s' has an unknown type\n", name);
-return -1;
-}
-
-if (error_is_set(&local_err)) {
-qerror_report_err(local_err);
-error_free(local_err);
-return -1;
-}
-
-return 0;
-}
-
-/*
- * Sets the given parameter to an integer instead of a string.
- * This function cannot be used to set string options.
- *
- * Returns 0 on success, -1 in error cases
- */
-int set_option_parameter_int(QEMUOptionParameter *list, const char *name,
-uint64_t value)
-{
-// Find a matching parameter
-list = get_option_parameter(list, name);
-if (list == NULL) {
-fprintf(stderr, "Unknown option '%s'\n", name);
-return -1;
-}
-
-

[Qemu-devel] [PATCH V10 0/4] replace QEMUOptionParameter with QemuOpts parser

2013-01-06 Thread Dong Xu Wang

Patch 1 add def_value and use it in qemu_opts_print.

Patch 2 Create functions to pair with QEMUOptionParameter parser.

Patch 3 Use QemuOpts parser in Block.

Patch 4 Remove QEMUOptionParameter parser related code.

V9->V10)
1) Fix compile error on patch 3.

v8->v9)
1) add qemu_ prefix to gluster_create_opts.
2) fix bug: bdrv_gluster_unix and bdrv_gluster_rdma should also be
   converted.

v7->v8)
1) print "elements => accept any params" while opts_accepts_any() ==
true.
2) since def_print_str is the default value if an option isn't set,
so rename it to def_value_str.
3) rebase to upstream source tree.
4) add gluster.c, raw-win32.c, and rbd.c.

v6->v7:
1) Fix typo: enouth->enough.
2) use osdep.h:stringify(), not redefining new macro.
3) preserve TODO comment.
4) fix typo: BLOCK_OPT_ENCRYPT->BLOCK_OPT_STATIC.
5) initialize disk_type even when opts is NULL.

v5->v6:
1) allocate enough space in append_opts_list function.
2) judge if opts == NULL in block layer create functions.
3) use bdrv_create_file(filename, NULL) in qcow_create funtion.
4) made more readable while using qemu_opt_get_number funtion.

v4->v5:
1) Rewrite qemu_opts_create_nofail function based on Peter Maydell's comments.
2) Use g_strdup_printf in qemu_opt_set_number.
3) Rewrite qemu_opts_print.
4) .bdrv_create_options returns pointer directly. Fix a bug about "encryption".
5) Check qemu_opt_get_number in raw-posix.c.

v3->v4:
1) Rebased to the newest source tree.
2) Remove redundant "#include "block-cache.h"
3) Other small changes.

v2->v3:
1) rewrite qemu_opt_set_bool and qemu_opt_set_number according Paolo's coments.
2) split patches to make review easier.

v1->v2:
1) add Luiz's patches.
2) create qemu_opt_set_number() and qemu_opts_create_nofail() functions.
3) add QemuOptsList map to drivers.
4) use original opts parser, not creating new ones.
5) fix other bugs.


Dong Xu Wang (4):
  add def_print_str and use it in qemu_opts_print.
  Create four opts list related functions
  Use QemuOpts support in block layer
  remove QEMUOptionParameter related functions and struct

 block.c   |   91 +--
 block/cow.c   |   46 +++---
 block/gluster.c   |   37 ++--
 block/qcow.c  |   60 
 block/qcow2.c |  171 ++--
 block/qed.c   |   86 +-
 block/raw-posix.c |   65 
 block/raw-win32.c |   30 ++--
 block/raw.c   |   30 ++--
 block/rbd.c   |   62 
 block/sheepdog.c  |   75 -
 block/vdi.c   |   69 
 block/vmdk.c  |   74 
 block/vpc.c   |   67 
 block/vvfat.c |   11 +-
 include/block/block.h |4 +-
 include/block/block_int.h |6 +-
 include/qemu/option.h |   37 +
 qemu-img.c|   61 
 qemu-option.c |  406 +
 20 files changed, 640 insertions(+), 848 deletions(-)

[Qemu-devel] [RESEND PATCH] pci-assign: Enable MSIX on device to match guest

2013-01-06 Thread Alex Williamson

When a guest enables MSIX on a device we evaluate the MSIX vector
table, typically find no unmasked vectors and don't switch the device
to MSIX mode.  This generally works fine and the device will be
switched once the guest enables and therefore unmasks a vector.
Unfortunately some drivers enable MSIX, then use interfaces to send
commands between VF & PF or PF & firmware that act based on the host
state of the device.  These therefore may break when MSIX is managed
lazily.  This change re-enables the previous test used to enable MSIX
(see qemu-kvm a6b402c9), which basically guesses whether a vector
will be used based on the data field of the vector table.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Alex Williamson 
Acked-by: Michael S. Tsirkin 
---

Michael has now ack'd this patch as the correct initial first step,
so I'm resending with that included.  I'm actually not sure what the
expected upstream path is for this file now that it's part of qemu.
There's no entry for hw/kvm/* in MAINTAINERS nor anything specifically
for this file.  Is kvm still upstream for this, through the uq branch
or is it qemu for anything not specifically part of a kvm interface?
Anthony, Gleb, Marcelo, Michael, feel free to add this to your tree,
any path is fine by me.  Thanks,

Alex

 hw/kvm/pci-assign.c |   17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c
index 8ee9428..896cfe8 100644
--- a/hw/kvm/pci-assign.c
+++ b/hw/kvm/pci-assign.c
@@ -1031,6 +1031,19 @@ static bool assigned_dev_msix_masked(MSIXTableEntry 
*entry)
 return (entry->ctrl & cpu_to_le32(0x1)) != 0;
 }
 
+/*
+ * When MSI-X is first enabled the vector table typically has all the
+ * vectors masked, so we can't use that as the obvious test to figure out
+ * how many vectors to initially enable.  Instead we look at the data field
+ * because this is what worked for pci-assign for a long time.  This makes
+ * sure the physical MSI-X state tracks the guest's view, which is important
+ * for some VF/PF and PF/fw communication channels.
+ */
+static bool assigned_dev_msix_skipped(MSIXTableEntry *entry)
+{
+return !entry->data;
+}
+
 static int assigned_dev_update_msix_mmio(PCIDevice *pci_dev)
 {
 AssignedDevice *adev = DO_UPCAST(AssignedDevice, dev, pci_dev);
@@ -1041,7 +1054,7 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
*pci_dev)
 
 /* Get the usable entry number for allocating */
 for (i = 0; i < adev->msix_max; i++, entry++) {
-if (assigned_dev_msix_masked(entry)) {
+if (assigned_dev_msix_skipped(entry)) {
 continue;
 }
 entries_nr++;
@@ -1070,7 +1083,7 @@ static int assigned_dev_update_msix_mmio(PCIDevice 
*pci_dev)
 for (i = 0; i < adev->msix_max; i++, entry++) {
 adev->msi_virq[i] = -1;
 
-if (assigned_dev_msix_masked(entry)) {
+if (assigned_dev_msix_skipped(entry)) {
 continue;
 }

Re: [Qemu-devel] [PATCH 28/32] softmmu: move include files to include/sysemu/

2013-01-06 Thread Amos Kong

On Thu, Dec 06, 2012 at 02:07:06PM +0100, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini 
> ---
>  arch_init.c|6 +++---
>  audio/audio.c  |2 +-
>  audio/winwaveaudio.c   |2 +-
>  balloon.c  |4 ++--
>  block-migration.c  |2 +-

...

>  hw/z2.c|4 ++--
>  hw/zynq_slcr.c |2 +-
>  include/exec/exec-memory.h |2 +-
>  include/exec/softmmu_template.h|2 +-
>  arch_init.h => include/sysemu/arch_init.h  |0

Failed to compile latest qemu(8e4a424b305e29dc0e454f52df3b35577f342975) on 
Fedora17
$ rpm -q gcc
gcc-4.7.2-2.fc17.x86_64

It's fine in RHEL.
# rpm -q gcc
gcc-4.4.6-4.el6.x86_64

Include issue of gcc ?

# ./configure --target-list=x86_64-softmmu --enable-debug
# make
  ...
  CCx86_64-softmmu/arch_init.o
In file included from /home/devel/qemu/include/sysemu/arch_init.h:4:0,
 from /home/devel/qemu/arch_init.c:36:
./qmp-commands.h:3:1: error: expected identifier or ‘(’ before ‘{’ token
./qmp-commands.h:10:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:20:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:30:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:40:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:50:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:60:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:70:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:80:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:90:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:100:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:110:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:120:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:130:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:140:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:154:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:164:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:174:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:184:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:194:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:204:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:214:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:224:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:233:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:244:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:254:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:264:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:274:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:284:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:294:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:304:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:314:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:323:2: error: expected identifier or ‘(’ before ‘,’ token
./qmp-commands.h:333:2: error: expected identifier or ‘(’ before ‘,’ token
/home/devel/qemu/arch_init.c:104:28: error: ‘QEMU_ARCH_I386’ undeclared here 
(not in a function)
make[1]: *** [arch_init.o] Error 1
make: *** [subdir-x86_64-softmmu] Error 2

Amos

[Qemu-devel] [Bug 998435] Re: qemu-kvm-spice doesn't support spice/qxl installs

2013-01-06 Thread Tristan Schmelcher

Possibly related to bug 958549.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/998435

Title:
  qemu-kvm-spice doesn't support spice/qxl  installs

Status in QEMU:
  Confirmed
Status in “qemu-kvm-spice” package in Ubuntu:
  Confirmed

Bug description:
  Been setup as follows :-

  boris@boris-P5Q-E:~$ dpkg -l | grep spice
  ii  gir1.2-spice-client-glib-2.0   0.9-0ubuntu1 GObject for 
communicating with Spice servers (GObject-Introspection)
  ii  gir1.2-spice-client-gtk-2.00.9-0ubuntu1   
 GTK2 widget for SPICE clients (GObject-Introspection)
  ii  gir1.2-spice-client-gtk-3.00.9-0ubuntu1   
 GTK3 widget for SPICE clients (GObject-Introspection)
  ii  libspice-client-glib-2.0-1 0.9-0ubuntu1   
 GObject for communicating with Spice servers (runtime library)
  ii  libspice-client-glib-2.0-dev   0.9-0ubuntu1   
 GObject for communicating with Spice servers (development files)
  ii  libspice-client-gtk-2.0-1  0.9-0ubuntu1   
 GTK2 widget for SPICE clients (runtime library)
  ii  libspice-client-gtk-2.0-dev0.9-0ubuntu1   
 GTK2 widget for SPICE clients (development files)
  ii  libspice-client-gtk-3.0-1  0.9-0ubuntu1   
 GTK3 widget for SPICE clients (runtime library)
  ii  libspice-client-gtk-3.0-dev0.9-0ubuntu1   
 GTK3 widget for SPICE clients (development files)
  ii  libspice-protocol-dev  0.10.1-1   
 SPICE protocol headers
  ii  libspice-server-dev0.10.0-1   
 Header files and development documentation for spice-server
  ii  libspice-server1   0.10.0-1   
 Implements the server side of the SPICE protocol
  ii  python-spice-client-gtk0.9-0ubuntu1   
 GTK2 widget for SPICE clients (Python binding)
  ii  qemu-kvm-spice 1.0.50-2012.03-0ubuntu2
 Full virtualization on amd64 hardware

  Spice/QXL install  doesn't work on Ubuntu 12.04 .  View also 
https://bugs.launchpad.net/ubuntu/+source/seabios/+bug/823494
  It doesn't look like duplicate of bug mentioned above.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/998435/+subscriptions

Re: [Qemu-devel] [PATCH v2 0/7] I2C libqos and tmp105 qtest support

2013-01-06 Thread Andreas Färber

Am 14.12.2012 12:34, schrieb Andreas Färber:
> Hello,
> 
> Here's I2C support for the libqos framework and an OMAP driver.
> The design was inspired by QEMU's i2c_bus and Linux' i2c_adapter.
> Seems like low hanging fruit (while we're still waiting for PCI support :)),
> hopefully motivating more people to contribute qtests for new devices or
> bug fixes.

Ping! See below...

> I found a bug in omap_i2c related to the SBD bit not getting cleared after
> a single-byte read; the patch has been dropped as we were not able to verify
> it against a TRM or hardware. The driver was therefore changed to ignore the
> SBD bit, like the Linux driver does.
> 
> I've split up the proposed tmp105.h header further to facilitate reuse of
> the register enum in qtest.
> 
> The test case itself initially fails; in lack of a v2 patch by Alex I'm
> appending my own fix proposal that makes the test pass.
> 
> I'm also appending a QOM'ish cleanup of the file, replacing the current API
> with a QOM property.
> To do a qom-set for testing a non-zero temperature reading, we'll need either
> a canonical path for the TMP105 or Jason's qtest_qmp_resp() to iterate over
> qom-list /machine/unassigned output looking for type "tmp105" with qom-get.
[...]
> v1 -> v2:
> * Factor out libqos API for I2C and an OMAP driver
> * Avoid casts by using uint8_t* in API, suggested by Blue
> * Ignore SBD bit in omap_i2c driver
> * Drop omap_i2c patch clearing SBD
> * Incorporate Alex' tmp105.h patch
> * Split out tmp105_regs.h for qtest, inspired by rtc-test
> * Append replacement for Alex' fix
> * Append QOM cleanup and turn setter function into QOM property
[...]
> Alex Horn (1):
>   tmp105: Create API for TMP105 temperature sensor

This was applied by Anthony meanwhile.

> Andreas Färber (6):
>   libqtest: Prepare I2C libqos
>   tmp105: Split out I2C message constants from header
>   tests: Add tmp105 qtest test case
>   tmp105: Fix I2C protocol bug
>   tmp105: QOM'ify
>   tmp105: Add temperature QOM property

These need to be rebased due to the header reorganization and gcov.
Anyone any feedback, especially on the libqos part? Anthony? Stefan?

Thanks,
Andreas

>  hw/i2c.h|3 -
>  hw/tmp105.c |  101 ++-
>  hw/tmp105.h |   47 +++
>  hw/tmp105_regs.h|   50 
>  tests/Makefile  |3 +
>  tests/libi2c-omap.c |  166 
> +++
>  tests/libi2c.c  |   22 +++
>  tests/libi2c.h  |   30 ++
>  tests/tmp105-test.c |   76 +++
>  9 Dateien geändert, 453 Zeilen hinzugefügt(+), 45 Zeilen entfernt(-)
>  create mode 100644 hw/tmp105.h
>  create mode 100644 hw/tmp105_regs.h
>  create mode 100644 tests/libi2c-omap.c
>  create mode 100644 tests/libi2c.c
>  create mode 100644 tests/libi2c.h
>  create mode 100644 tests/tmp105-test.c

[Qemu-devel] [Bug 998435] Re: qemu-kvm-spice doesn't support spice/qxl installs

2013-01-06 Thread Tristan Schmelcher

Same here. I was trying to install Win7 64-bit on 12.04 with qemu-kvm-
spice and neither qxl nor vga graphics would work--both showed a black
screen. Configuring the display as VNC instead of SPICE made no
difference. But choosing cirrus graphics made it work.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/998435

Title:
  qemu-kvm-spice doesn't support spice/qxl  installs

Status in QEMU:
  Confirmed
Status in “qemu-kvm-spice” package in Ubuntu:
  Confirmed

Bug description:
  Been setup as follows :-

  boris@boris-P5Q-E:~$ dpkg -l | grep spice
  ii  gir1.2-spice-client-glib-2.0   0.9-0ubuntu1 GObject for 
communicating with Spice servers (GObject-Introspection)
  ii  gir1.2-spice-client-gtk-2.00.9-0ubuntu1   
 GTK2 widget for SPICE clients (GObject-Introspection)
  ii  gir1.2-spice-client-gtk-3.00.9-0ubuntu1   
 GTK3 widget for SPICE clients (GObject-Introspection)
  ii  libspice-client-glib-2.0-1 0.9-0ubuntu1   
 GObject for communicating with Spice servers (runtime library)
  ii  libspice-client-glib-2.0-dev   0.9-0ubuntu1   
 GObject for communicating with Spice servers (development files)
  ii  libspice-client-gtk-2.0-1  0.9-0ubuntu1   
 GTK2 widget for SPICE clients (runtime library)
  ii  libspice-client-gtk-2.0-dev0.9-0ubuntu1   
 GTK2 widget for SPICE clients (development files)
  ii  libspice-client-gtk-3.0-1  0.9-0ubuntu1   
 GTK3 widget for SPICE clients (runtime library)
  ii  libspice-client-gtk-3.0-dev0.9-0ubuntu1   
 GTK3 widget for SPICE clients (development files)
  ii  libspice-protocol-dev  0.10.1-1   
 SPICE protocol headers
  ii  libspice-server-dev0.10.0-1   
 Header files and development documentation for spice-server
  ii  libspice-server1   0.10.0-1   
 Implements the server side of the SPICE protocol
  ii  python-spice-client-gtk0.9-0ubuntu1   
 GTK2 widget for SPICE clients (Python binding)
  ii  qemu-kvm-spice 1.0.50-2012.03-0ubuntu2
 Full virtualization on amd64 hardware

  Spice/QXL install  doesn't work on Ubuntu 12.04 .  View also 
https://bugs.launchpad.net/ubuntu/+source/seabios/+bug/823494
  It doesn't look like duplicate of bug mentioned above.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/998435/+subscriptions

Re: [Qemu-devel] [PATCH] virtio-pci: replace byte swap hack

2013-01-06 Thread Blue Swirl

On Sun, Jan 6, 2013 at 6:25 PM, Andreas Färber  wrote:
> Am 06.01.2013 14:17, schrieb Alexander Graf:
>>
>> On 30.12.2012, at 13:55, Blue Swirl wrote:
>>
>>> Remove byte swaps by declaring the config space
>>> as native endian.
>>
>> This is wrong. Virtio-pci config space is split into 2 regions. One with 
>> native endianness, the other one with little endian.
>
> Can that MemoryRegion be split in two?

Yes, but unfortunately the offset for the second region depends on if
MSIX is enabled or not. PCI layer manages these bits without the
device seeing any changes.

This could be handled by introducing a callback at PCI layer to inform
interested devices about changes to MSIX setup, or even generalized:
inform devices about changes within any set of bits specified by the
device.

>
> Andreas
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] unbounded qemu NetQue's ?

2013-01-06 Thread Luigi Rizzo

Hi,
while testing the tx path in qemu without a network backend connected,
i noticed that qemu_net_queue_send() builds up an unbounded
queue, because QTAILQ* have no notion of a queue length.

As a result, memory usage of a qemu instance can grow to extremely
large values.

I wonder if it makes sense to implement a hard limit on size of
NetQue's. The patch below is only a partial implementation
but probably sufficient for these purposes.

cheers
luigi

diff -urp qemu-1.3.0-orig/net/queue.c qemu-1.3.0/net/queue.c
--- qemu-1.3.0-orig/net/queue.c 2012-12-03 20:37:05.0 +0100
+++ qemu-1.3.0/net/queue.c  2013-01-06 19:38:12.0 +0100
@@ -92,6 +92,9 @@ static void qemu_net_queue_append(NetQue
 {
 NetPacket *packet;
 
+if (queue->packets.tqh_count > 1)
+   return; // XXX drop
+
 packet = g_malloc(sizeof(NetPacket) + size);
 packet->sender = sender;
 packet->flags = flags;
diff -urp qemu-1.3.0-orig/qemu-queue.h qemu-1.3.0/qemu-queue.h
--- qemu-1.3.0-orig/qemu-queue.h2012-12-03 20:37:05.0 +0100
+++ qemu-1.3.0/qemu-queue.h 2013-01-06 19:34:01.0 +0100
@@ -320,11 +320,12 @@ struct {
 struct name {   \
 qual type *tqh_first;   /* first element */ \
 qual type *qual *tqh_last;  /* addr of last next element */ \
+   int32_t tqh_count;  \
 }
 #define QTAILQ_HEAD(name, type)  Q_TAILQ_HEAD(name, struct type,)
 
 #define QTAILQ_HEAD_INITIALIZER(head)   \
-{ NULL, &(head).tqh_first }
+{ NULL, &(head).tqh_first, 0 }
 
 #define Q_TAILQ_ENTRY(type, qual)   \
 struct {\
@@ -339,6 +340,7 @@ struct {
 #define QTAILQ_INIT(head) do {  \
 (head)->tqh_first = NULL;   \
 (head)->tqh_last = &(head)->tqh_first;  \
+(head)->tqh_count = 0; \
 } while (/*CONSTCOND*/0)
 
 #define QTAILQ_INSERT_HEAD(head, elm, field) do {   \
@@ -348,6 +350,7 @@ struct {
 else\
 (head)->tqh_last = &(elm)->field.tqe_next;  \
 (head)->tqh_first = (elm);  \
+(head)->tqh_count++;   \
 (elm)->field.tqe_prev = &(head)->tqh_first; \
 } while (/*CONSTCOND*/0)
 
@@ -356,6 +359,7 @@ struct {
 (elm)->field.tqe_prev = (head)->tqh_last;   \
 *(head)->tqh_last = (elm);  \
 (head)->tqh_last = &(elm)->field.tqe_next;  \
+(head)->tqh_count++;   \
 } while (/*CONSTCOND*/0)
 
 #define QTAILQ_INSERT_AFTER(head, listelm, elm, field) do { \
@@ -381,6 +385,7 @@ struct {
 (elm)->field.tqe_prev;  \
 else\
 (head)->tqh_last = (elm)->field.tqe_prev;   \
+(head)->tqh_count--;   \
 *(elm)->field.tqe_prev = (elm)->field.tqe_next; \
 } while (/*CONSTCOND*/0)

[Qemu-devel] [PULL] target-s390x rewrite

2013-01-06 Thread Richard Henderson

Now that all the prerequisites are in, I've rebased vs master.
Which has included a few minor include path tweaks due to the
source base reorg.


r~


The following changes since commit 549db5c32bb025501e2eeb23d2e5cc669061eb71:

  hw/i386: Fix broken build for non POSIX hosts (2013-01-05 10:14:05 +)

are available in the git repository at:

  git://repo.or.cz/qemu/rth.git s390-reorg

for you to fetch changes up to 2b35e93fbfc52c2be6cf85e5a54a3707cdabd914:

  target-s390: Claim maintainership (2013-01-05 12:18:46 -0800)


Richard Henderson (149):
  target-s390: Disassemble more z10 and z196 opcodes
  target-s390: Fix disassembly of cpsdr
  target-s390: Fix gdbstub
  target-s390: Add missing temp_free in gen_op_calc_cc
  target-s390: Use TCG registers for FPR
  target-s390: Register helpers
  target-s390: Fix SACF exit
  target-s390: Fix BCR
  target-s390: Tidy unconditional BRCL
  target-s390: Fix PSW_MASK handling
  target-s390: Add format based disassassmbly infrastructure
  target-s390: Split out disas_jcc
  target-s390: Reorg exception handling
  target-s390: Convert ADD HALFWORD
  target-s390: Implement SUBTRACT HALFWORD
  target-s390: Implement ADD LOGICAL WITH SIGNED IMMEDIATE
  target-s390: Convert MULTIPLY HALFWORD, SINGLE
  target-s390: Convert 32-bit MULTIPLY, MULTIPLY LOGICAL
  target-s390: Convert 64-bit MULTIPLY LOGICAL
  target-s390: Convert AND, OR, XOR
  target-s390: Convert COMPARE, COMPARE LOGICAL
  target-s390: Convert LOAD, LOAD LOGICAL
  target-s390: Convert LOAD ADDRESS
  target-s390: Convert LOAD (LOGICAL) BYTE, CHARACTER, HALFWORD
  target-s390: Convert LOAD AND TEST
  target-s390: Convert LOAD LOGICAL IMMEDIATE
  target-s390: Convert LOAD COMPLIMENT, POSITIVE, NEGATIVE
  target-s390: Convert AND, OR, XOR, INSERT IMMEDIATE
  target-s390: Convert STORE
  target-s390: Convert ADD LOGICAL CARRY and SUBTRACT LOGICAL BORROW
  target-s390: Convert BRANCH AND SAVE
  target-s390: Convert BRANCH ON CONDITION
  target-s390: Convert BRANCH ON COUNT
  target-s390: Convert DIVIDE
  target-s390: Send signals for divide
  target-s390: Convert TEST UNDER MASK
  target-s390: Convert SET ADDRESSING MODE
  target-s390: Convert SUPERVISOR CALL
  target-s390: Convert MOVE LONG
  target-s390: Convert FP LOAD
  target-s390: Convert INSERT CHARACTER
  target-s390: Cleanup cc computation helpers
  target-s390: Convert INSERT CHARACTERS UNDER MASK
  target-s390: Convert EXECUTE
  target-s390: Convert FP STORE
  target-s390: Convert CONVERT TO DECIMAL
  target-s390: Convert SET SYSTEM MASK
  target-s390: Convert LOAD PSW
  target-s390: Convert DIAGNOSE
  target-s390: Convert SHIFT, ROTATE SINGLE
  target-s390: Convert SHIFT DOUBLE
  target-s390: Convert LOAD, STORE MULTIPLE
  target-s390: Convert MOVE
  target-s390: Convert NI, XI, OI
  target-s390: Convert STNSM, STOSM
  target-s390: Convert LAM, STAM
  target-s390: Convert CLCLE, MVCLE
  target-s390: Convert MVC
  target-s390: Convert NC, XC, OC, TR, UNPK
  target-s390: Convert CLC
  target-s390: Convert MVCP, MVCS
  target-s390: Convert LRA
  target-s390: Convert SIGP
  target-s390: Convert EFPC, STFPC
  target-s390: Convert LCTL, STCTL
  target-s390: Convert COMPARE AND SWAP
  target-s390: Convert CLM
  target-s390: Convert STCM
  target-s390: Convert TPROT
  target-s390: Convert LOAD CONTROL, part 2
  target-s390: Convert LOAD REVERSED
  target-s390: Convert STORE REVERSED
  target-s390: Convert LLGT
  target-s390: Convert FP ADD, COMPARE, LOAD TEST/ROUND/LENGTHENED
  target-s390: Convert FP SUBTRACT
  target-s390: Convert FP DIVIDE
  target-s390: Convert FP MULTIPLY
  target-s390: Convert MULTIPLY AND ADD, SUBTRACT
  target-s390: Convert TEST DATA CLASS
  target-s390: Convert FP LOAD COMPLIMENT, NEGATIVE, POSITIVE
  target-s390: Convert FP SQUARE ROOT
  target-s390: Convert LOAD ZERO
  target-s390: Convert CONVERT TO FIXED
  target-s390: Convert CONVERT FROM FIXED
  target-s390: Convert FLOGR
  target-s390: Convert LFPC, SFPC
  target-s390: Convert IPM
  target-s390: Convert CKSM
  target-s390: Convert EAR, SAR
  target-s390: Convert MVPG
  target-s390: Convert CLST, MVST
  target-s390: Convert SRST
  target-s390: Convert STIDP
  target-s390: Convert SCK
  target-s390: Convert STCK
  target-s390: Convert SCKC, STCKC
  target-s390: Convert SPT, STPT
  target-s390: Convert SPKA
  target-s390: Convert PTLB
  target-s390: Convert SPX, STPX
  target-s390: Convert STAP
  target-s390: Convert IPTE
  target-s390: Convert ISKE
  target-s390: Convert SSKE
  target-s390: Convert RRBE

[Qemu-devel] [PATCH v2 2/2] qdev: Prepare "realized" property

2013-01-06 Thread Andreas Färber

Based on earlier patches by Paolo and me, introduce the QOM realizefn at
device level only, as requested by Anthony.

For now this just wraps the qdev initfn, which it deprecates.

Signed-off-by: Paolo Bonzini 
Signed-off-by: Andreas Färber 
Cc: Anthony Liguori 
---
 hw/qdev-core.h |4 +++
 hw/qdev.c  |   96 ++--
 2 Dateien geändert, 76 Zeilen hinzugefügt(+), 24 Zeilen entfernt(-)

diff --git a/hw/qdev-core.h b/hw/qdev-core.h
index 5836e35..e50c866 100644
--- a/hw/qdev-core.h
+++ b/hw/qdev-core.h
@@ -20,6 +20,8 @@ enum {
 typedef int (*qdev_initfn)(DeviceState *dev);
 typedef int (*qdev_event)(DeviceState *dev);
 typedef void (*qdev_resetfn)(DeviceState *dev);
+typedef void (*DeviceRealize)(DeviceState *dev, Error **err);
+typedef void (*DeviceUnrealize)(DeviceState *dev, Error **err);
 
 struct VMStateDescription;
 
@@ -38,6 +40,8 @@ typedef struct DeviceClass {
 const struct VMStateDescription *vmsd;
 
 /* Private to qdev / bus.  */
+DeviceRealize realize;
+DeviceUnrealize unrealize;
 qdev_initfn init;
 qdev_event unplug;
 qdev_event exit;
diff --git a/hw/qdev.c b/hw/qdev.c
index b214c8a..f45ab14 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -148,37 +148,30 @@ DeviceState *qdev_try_create(BusState *bus, const char 
*type)
Return 0 on success.  */
 int qdev_init(DeviceState *dev)
 {
-DeviceClass *dc = DEVICE_GET_CLASS(dev);
-int rc;
+Error *local_err = NULL;
 
 assert(!dev->realized);
 
-rc = dc->init(dev);
-if (rc < 0) {
+object_property_set_bool(OBJECT(dev), true, "realized", &local_err);
+if (local_err != NULL) {
+error_free(local_err);
 qdev_free(dev);
-return rc;
+return -1;
 }
+return 0;
+}
 
-if (!OBJECT(dev)->parent) {
-static int unattached_count = 0;
-gchar *name = g_strdup_printf("device[%d]", unattached_count++);
-
-object_property_add_child(container_get(qdev_get_machine(),
-"/unattached"),
-  name, OBJECT(dev), NULL);
-g_free(name);
-}
+static void device_realize(DeviceState *dev, Error **err)
+{
+DeviceClass *dc = DEVICE_GET_CLASS(dev);
 
-if (qdev_get_vmsd(dev)) {
-vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), dev,
-   dev->instance_id_alias,
-   dev->alias_required_for_version);
-}
-dev->realized = true;
-if (dev->hotplugged) {
-device_reset(dev);
+if (dc->init) {
+int rc = dc->init(dev);
+if (rc < 0) {
+error_setg(err, "Device initialization failed.");
+return;
+}
 }
-return 0;
 }
 
 void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
@@ -636,6 +629,55 @@ void qdev_property_add_static(DeviceState *dev, Property 
*prop,
 assert_no_error(local_err);
 }
 
+static bool device_get_realized(Object *obj, Error **err)
+{
+DeviceState *dev = DEVICE(obj);
+return dev->realized;
+}
+
+static void device_set_realized(Object *obj, bool value, Error **err)
+{
+DeviceState *dev = DEVICE(obj);
+DeviceClass *dc = DEVICE_GET_CLASS(dev);
+Error *local_err = NULL;
+
+if (value && !dev->realized) {
+if (dc->realize) {
+dc->realize(dev, &local_err);
+}
+
+if (!obj->parent && local_err == NULL) {
+static int unattached_count;
+gchar *name = g_strdup_printf("device[%d]", unattached_count++);
+
+object_property_add_child(container_get(qdev_get_machine(),
+"/unattached"),
+  name, obj, &local_err);
+g_free(name);
+}
+
+if (qdev_get_vmsd(dev) && local_err == NULL) {
+vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), dev,
+   dev->instance_id_alias,
+   dev->alias_required_for_version);
+}
+if (dev->hotplugged && local_err == NULL) {
+device_reset(dev);
+}
+} else if (!value && dev->realized) {
+if (dc->unrealize) {
+dc->unrealize(dev, &local_err);
+}
+}
+
+if (local_err != NULL) {
+error_propagate(err, local_err);
+return;
+}
+
+dev->realized = value;
+}
+
 static void device_initfn(Object *obj)
 {
 DeviceState *dev = DEVICE(obj);
@@ -650,6 +692,9 @@ static void device_initfn(Object *obj)
 dev->instance_id_alias = -1;
 dev->realized = false;
 
+object_property_add_bool(obj, "realized",
+ device_get_realized, device_set_realized, NULL);
+
 class = object_get_class(OBJECT(dev));
 do {
 for (prop = DEVICE_CLASS(class)->props; prop && prop->name; prop++) {
@@ -707,7 +752,10 @@ static

[Qemu-devel] [PATCH v2 1/2] qdev: Fold state enum into bool realized

2013-01-06 Thread Andreas Färber

Whether the device was initialized or not is QOM-level information and
currently unused. Drop it from qdev. This leaves the boolean state of
whether or not DeviceClass::init was called or not, a.k.a. "realized".

Suggested-by: Anthony Liguori 
Signed-off-by: Andreas Färber 
---
 hw/qdev-addr.c  |2 +-
 hw/qdev-core.h  |7 +--
 hw/qdev-properties-system.c |4 ++--
 hw/qdev-properties.c|   24 
 hw/qdev.c   |   12 ++--
 5 Dateien geändert, 22 Zeilen hinzugefügt(+), 27 Zeilen entfernt(-)

diff --git a/hw/qdev-addr.c b/hw/qdev-addr.c
index 3bfe101..b4388f6 100644
--- a/hw/qdev-addr.c
+++ b/hw/qdev-addr.c
@@ -40,7 +40,7 @@ static void set_taddr(Object *obj, Visitor *v, void *opaque,
 Error *local_err = NULL;
 int64_t value;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
diff --git a/hw/qdev-core.h b/hw/qdev-core.h
index fdf14ec..5836e35 100644
--- a/hw/qdev-core.h
+++ b/hw/qdev-core.h
@@ -8,11 +8,6 @@
 #include "hw/irq.h"
 #include "qapi/error.h"
 
-enum DevState {
-DEV_STATE_CREATED = 1,
-DEV_STATE_INITIALIZED,
-};
-
 enum {
 DEV_NVECTORS_UNSPECIFIED = -1,
 };
@@ -55,7 +50,7 @@ struct DeviceState {
 Object parent_obj;
 
 const char *id;
-enum DevState state;
+bool realized;
 QemuOpts *opts;
 int hotplugged;
 BusState *parent_bus;
diff --git a/hw/qdev-properties-system.c b/hw/qdev-properties-system.c
index c73c713..ce0f793 100644
--- a/hw/qdev-properties-system.c
+++ b/hw/qdev-properties-system.c
@@ -42,7 +42,7 @@ static void set_pointer(Object *obj, Visitor *v, Property 
*prop,
 char *str;
 int ret;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -254,7 +254,7 @@ static void set_vlan(Object *obj, Visitor *v, void *opaque,
 int32_t id;
 NetClientState *hubport;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index f724357..a8a31f5 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -32,7 +32,7 @@ static void set_enum(Object *obj, Visitor *v, void *opaque,
 Property *prop = opaque;
 int *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -85,7 +85,7 @@ static void set_bit(Object *obj, Visitor *v, void *opaque,
 Error *local_err = NULL;
 bool value;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -125,7 +125,7 @@ static void set_uint8(Object *obj, Visitor *v, void *opaque,
 Property *prop = opaque;
 uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -192,7 +192,7 @@ static void set_uint16(Object *obj, Visitor *v, void 
*opaque,
 Property *prop = opaque;
 uint16_t *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -225,7 +225,7 @@ static void set_uint32(Object *obj, Visitor *v, void 
*opaque,
 Property *prop = opaque;
 uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -250,7 +250,7 @@ static void set_int32(Object *obj, Visitor *v, void *opaque,
 Property *prop = opaque;
 int32_t *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -323,7 +323,7 @@ static void set_uint64(Object *obj, Visitor *v, void 
*opaque,
 Property *prop = opaque;
 uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -413,7 +413,7 @@ static void set_string(Object *obj, Visitor *v, void 
*opaque,
 Error *local_err = NULL;
 char *str;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -477,7 +477,7 @@ static void set_mac(Object *obj, Visitor *v, void *opaque,
 int i, pos;
 char *str, *p;
 
-if (dev->state != DEV_STATE_CREATED) {
+if (dev->realized) {
 error_set(errp, QERR_PERMISSION_DENIED);
 return;
 }
@@ -569,7 +569,7 @@

[Qemu-devel] [PATCH v2 0/2] QOM realize, device-only

2013-01-06 Thread Andreas Färber

Hello Anthony,

Here's a rebased version of device-only QOM realization, otherwise unchanged.
This is based on master now and drops all general-purpose qdev cleanups that
potentially conflict with Paolo's reference counting change proposal.
It will trivially conflict with my qdev unparenting fix, pending to be added
to the qom-cpu queue (touches k->unparent in class_init).

Same as v1, this is not the big throw of a recursive, central realization model.

The immediate purpose of this series is to allow CPUs, which partially already
have realizefns prepared, to adjust to the new DeviceState signature (being a
device on qom-cpu branch) and hook them up to DeviceClass::realize rather than
exposing as a function through cpu.h. Please review and apply to qemu.git.

As discussed in the last cover letter, doing realization at DeviceState level
does not allow for trivial [un]realize_children as proposed by Paolo for Object.
An idea how to implement this as a follow-up would be to do a deep search for
child<> properties dynamic_cast'able to DeviceState. Or to simply try setting
a "realized" property and it that fails check for children (but how to differ
between "realized" that failed and "realized" that doesn't exist then?).

Available from:
https://github.com/afaerber/qemu-cpu/commits/realize-qdev.v2
git://github.com/afaerber/qemu-cpu.git realize-qdev.v2

Regards,
Andreas

Cc: Anthony Liguori 
Cc: Paolo Bonzini 
Cc: Eduardo Habkost 
Cc: Igor Mammedov 
Cc: Peter Maydell 

Andreas Färber (2):
  qdev: Fold state enum into bool realized
  qdev: Prepare "realized" property

 hw/qdev-addr.c  |2 +-
 hw/qdev-core.h  |   11 ++---
 hw/qdev-properties-system.c |4 +-
 hw/qdev-properties.c|   24 +-
 hw/qdev.c   |  106 +++
 5 Dateien geändert, 97 Zeilen hinzugefügt(+), 50 Zeilen entfernt(-)

-- 
1.7.10.4

Re: [Qemu-devel] [PATCH] virtio-pci: replace byte swap hack

2013-01-06 Thread Alexander Graf



Am 06.01.2013 um 19:26 schrieb Blue Swirl :

> On Sun, Jan 6, 2013 at 1:17 PM, Alexander Graf  wrote:
>> 
>> On 30.12.2012, at 13:55, Blue Swirl wrote:
>> 
>>> Remove byte swaps by declaring the config space
>>> as native endian.
>> 
>> This is wrong. Virtio-pci config space is split into 2 regions. One with 
>> native endianness, the other one with little endian.
> 
> True. I must say that this is poor architectural design in multiple
> ways. The endianness problem is aggravated by device being also
> instantiated by other devices, so we can't (at least easily) pass an
> endianness flag from above, or make two devices that are selected by
> the board. Another problem is that the offset changes dynamically
> depending on if MSIX is enabled or not. The design is probably also
> fixed by the guest driver assumptions and can't ever be improved.
> Paravirtual devices should have cleaner design than what for example
> original PC devices have, not more insane. :-(

I guess now you understand my wording in the comment that your patch removes a 
bit better :).


Alex

[Qemu-devel] [PATCH ppc-next v2 1/2] target-ppc: Slim conversion of model definitions to QOM subclasses

2013-01-06 Thread Andreas Färber

Since the model list is highly macrofied, keep ppc_def_t for now and
save a pointer to it in PowerPCCPUClass. This results in a flat list of
subclasses including aliases, to be refined later.

Move cpu_ppc_init() to translate_init.c and drop helper.c.
Long-term the idea is to turn translate_init.c into a standalone cpu.c.

Inline cpu_ppc_usable() into type registration.

Split cpu_ppc_register() in two by code movement into the initfn and
by turning the remaining part into a realizefn.
Move qemu_init_vcpu() call into the new realizefn and adapt
create_ppc_opcodes() to return an Error.

Change ppc_find_by_pvr() -> ppc_cpu_class_by_pvr().
Change ppc_find_by_name() -> ppc_cpu_class_by_name().

Turn -cpu host into its own subclass. This requires to move the
kvm_enabled() check in ppc_cpu_class_by_name() to avoid the class being
found via the normal name lookup in the !kvm_enabled() case.
Turn kvmppc_host_cpu_def() into the class_init and add an initfn that
asserts KVM is in fact enabled.

Implement -cpu ? and the QMP equivalent in terms of subclasses.
This newly exposes -cpu host to the user, ordered last for -cpu ?.

Signed-off-by: Andreas Färber 
---
 target-ppc/Makefile.objs|3 +-
 target-ppc/cpu-qom.h|5 +
 target-ppc/cpu.h|4 -
 target-ppc/helper.c |   50 ---
 target-ppc/kvm.c|   37 -
 target-ppc/kvm_ppc.h|8 +-
 target-ppc/translate_init.c |  345 +--
 7 Dateien geändert, 276 Zeilen hinzugefügt(+), 176 Zeilen entfernt(-)
 delete mode 100644 target-ppc/helper.c

diff --git a/target-ppc/Makefile.objs b/target-ppc/Makefile.objs
index 237a0ed..15d2a3c 100644
--- a/target-ppc/Makefile.objs
+++ b/target-ppc/Makefile.objs
@@ -1,7 +1,6 @@
-obj-y += translate.o helper.o
+obj-y += translate.o
 obj-$(CONFIG_SOFTMMU) += machine.o
 obj-$(CONFIG_KVM) += kvm.o kvm_ppc.o
-obj-y += helper.o
 obj-y += excp_helper.o
 obj-y += fpu_helper.o
 obj-y += int_helper.o
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index fb6b5a4..b338f8f 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -50,6 +50,9 @@ typedef struct PowerPCCPUClass {
 /*< public >*/
 
 void (*parent_reset)(CPUState *cpu);
+
+/* TODO inline fields here */
+ppc_def_t *info;
 } PowerPCCPUClass;
 
 /**
@@ -73,5 +76,7 @@ static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env)
 
 #define ENV_GET_CPU(e) CPU(ppc_env_get_cpu(e))
 
+PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr);
+
 
 #endif
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index e88ebe0..443f90c 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1156,10 +1156,6 @@ void ppc_store_msr (CPUPPCState *env, target_ulong 
value);
 
 void ppc_cpu_list (FILE *f, fprintf_function cpu_fprintf);
 
-const ppc_def_t *ppc_find_by_pvr(uint32_t pvr);
-const ppc_def_t *cpu_ppc_find_by_name (const char *name);
-int cpu_ppc_register_internal (CPUPPCState *env, const ppc_def_t *def);
-
 /* Time-base and decrementer management */
 #ifndef NO_CPU_IO_DEFS
 uint64_t cpu_ppc_load_tbl (CPUPPCState *env);
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
deleted file mode 100644
index 103855a..000
--- a/target-ppc/helper.c
+++ /dev/null
@@ -1,50 +0,0 @@
-/*
- *  PowerPC emulation helpers for QEMU.
- *
- *  Copyright (c) 2003-2007 Jocelyn Mayer
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see .
- */
-
-#include "cpu.h"
-#include "helper_regs.h"
-#include "sysemu/kvm.h"
-#include "kvm_ppc.h"
-#include "sysemu/cpus.h"
-
-PowerPCCPU *cpu_ppc_init(const char *cpu_model)
-{
-PowerPCCPU *cpu;
-CPUPPCState *env;
-const ppc_def_t *def;
-
-def = cpu_ppc_find_by_name(cpu_model);
-if (!def) {
-return NULL;
-}
-
-cpu = POWERPC_CPU(object_new(TYPE_POWERPC_CPU));
-env = &cpu->env;
-
-if (tcg_enabled()) {
-ppc_translate_init();
-}
-
-env->cpu_model_str = cpu_model;
-cpu_ppc_register_internal(env, def);
-
-qemu_init_vcpu(env);
-
-return cpu;
-}
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 436ca47..a589575 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1210,18 +1210,29 @@ static void alter_insns(uint64_t *word, uint64_t flags, 
bool on)
 }
 }
 
-const ppc_def_t *kvmppc_host_cpu_def(void)
+static void kvmppc_host_cpu_initfn(Object *obj)
 {
+assert(kvm_enabled());
+

[Qemu-devel] [PATCH ppc-next v2 2/2] target-ppc: Error out for -cpu host on unknown PVR

2013-01-06 Thread Andreas Färber

Previously we silently exited, with subclasses we got an opcode warning.
Instead, explicitly tell the user what's wrong.

An indication for this is -cpu ? showing "host" with an all-zero PVR.

Signed-off-by: Andreas Färber 
---
 target-ppc/kvm.c |8 
 1 Datei geändert, 8 Zeilen hinzugefügt(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index a589575..b6b5a6a 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1212,7 +1212,15 @@ static void alter_insns(uint64_t *word, uint64_t flags, 
bool on)
 
 static void kvmppc_host_cpu_initfn(Object *obj)
 {
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(obj);
+
 assert(kvm_enabled());
+
+if (pcc->info->pvr != mfpvr()) {
+fprintf(stderr, "Your host CPU is unsupported.\n"
+"Please choose a supported model instead, see -cpu ?.\n");
+exit(1);
+}
 }
 
 static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
-- 
1.7.10.4

[Qemu-devel] [PATCH ppc-next v2 0/2] PowerPCCPU subclasses

2013-01-06 Thread Andreas Färber

Hi Alex,

The CPUListState generalization has been merged now. Here's a rebased version
of PowerPCCPU subclasses, intended for v1.4.

Available from:
git://github.com/afaerber/qemu-cpu.git qom-cpu-ppc-classes.v2
https://github.com/afaerber/qemu-cpu/commits/qom-cpu-ppc-classes.v2

Thanks,
Andreas

v1 -> v2:
* Fixed type name for TYPE_POWERPC_CPU subclasses.
* Improved error message for -cpu host, suggested by Alex.

>From original subclasses series:
* Redone from scratch
* Adopted name--cpu type name scheme

Cc: Alexander Graf 
Cc: qemu-ppc 
Cc: David Gibson 

Andreas Färber (2):
  target-ppc: Slim conversion of model definitions to QOM subclasses
  target-ppc: Error out for -cpu host on unknown PVR

 target-ppc/Makefile.objs|3 +-
 target-ppc/cpu-qom.h|5 +
 target-ppc/cpu.h|4 -
 target-ppc/helper.c |   50 ---
 target-ppc/kvm.c|   45 +-
 target-ppc/kvm_ppc.h|8 +-
 target-ppc/translate_init.c |  345 +--
 7 Dateien geändert, 284 Zeilen hinzugefügt(+), 176 Zeilen entfernt(-)
 delete mode 100644 target-ppc/helper.c

-- 
1.7.10.4

Re: [Qemu-devel] [PATCH] virtio-pci: replace byte swap hack

2013-01-06 Thread Blue Swirl

On Sun, Jan 6, 2013 at 1:17 PM, Alexander Graf  wrote:
>
> On 30.12.2012, at 13:55, Blue Swirl wrote:
>
>> Remove byte swaps by declaring the config space
>> as native endian.
>
> This is wrong. Virtio-pci config space is split into 2 regions. One with 
> native endianness, the other one with little endian.

True. I must say that this is poor architectural design in multiple
ways. The endianness problem is aggravated by device being also
instantiated by other devices, so we can't (at least easily) pass an
endianness flag from above, or make two devices that are selected by
the board. Another problem is that the offset changes dynamically
depending on if MSIX is enabled or not. The design is probably also
fixed by the guest driver assumptions and can't ever be improved.
Paravirtual devices should have cleaner design than what for example
original PC devices have, not more insane. :-(

Anyway, the real fix is that the memory region should contain two
subregions, native and LE. The offset of the second region should be
adjusted if MSIX bit is toggled. This may need some kind of
(nontrivial) callback system.

The original design could be improved to avoid the checks and calls to
virtio_is_big_endian() from the config handler by introducing two sets
of config functions (LE+LE, BE+LE) and selecting the correct one at
init time (based on result of virtio_is_big_endian() which does not
change). This would not help the real fix.

Both need more thought and hopefully there are even better options, so
I'll revert the commit.

>
> Alex
>
>>
>> Signed-off-by: Blue Swirl 
>> ---
>> exec.c  |   18 --
>> hw/virtio-pci.c |   17 +
>> 2 files changed, 1 insertions(+), 34 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index a6923ad..140eb56 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2587,24 +2587,6 @@ int cpu_memory_rw_debug(CPUArchState *env, 
>> target_ulong addr,
>> }
>> #endif
>>
>> -#if !defined(CONFIG_USER_ONLY)
>> -
>> -/*
>> - * A helper function for the _utterly broken_ virtio device model to find 
>> out if
>> - * it's running on a big endian machine. Don't do this at home kids!
>> - */
>> -bool virtio_is_big_endian(void);
>> -bool virtio_is_big_endian(void)
>> -{
>> -#if defined(TARGET_WORDS_BIGENDIAN)
>> -return true;
>> -#else
>> -return false;
>> -#endif
>> -}
>> -
>> -#endif
>> -
>> #ifndef CONFIG_USER_ONLY
>> bool cpu_physical_memory_is_io(hwaddr phys_addr)
>> {
>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>> index d2d2454..df78a3b 100644
>> --- a/hw/virtio-pci.c
>> +++ b/hw/virtio-pci.c
>> @@ -92,9 +92,6 @@
>>  */
>> #define wmb() do { } while (0)
>>
>> -/* HACK for virtio to determine if it's running a big endian guest */
>> -bool virtio_is_big_endian(void);
>> -
>> /* virtio device */
>>
>> static void virtio_pci_notify(void *opaque, uint16_t vector)
>> @@ -390,15 +387,9 @@ static uint64_t virtio_pci_config_read(void *opaque, 
>> hwaddr addr,
>> break;
>> case 2:
>> val = virtio_config_readw(proxy->vdev, addr);
>> -if (virtio_is_big_endian()) {
>> -val = bswap16(val);
>> -}
>> break;
>> case 4:
>> val = virtio_config_readl(proxy->vdev, addr);
>> -if (virtio_is_big_endian()) {
>> -val = bswap32(val);
>> -}
>> break;
>> }
>> return val;
>> @@ -423,15 +414,9 @@ static void virtio_pci_config_write(void *opaque, 
>> hwaddr addr,
>> virtio_config_writeb(proxy->vdev, addr, val);
>> break;
>> case 2:
>> -if (virtio_is_big_endian()) {
>> -val = bswap16(val);
>> -}
>> virtio_config_writew(proxy->vdev, addr, val);
>> break;
>> case 4:
>> -if (virtio_is_big_endian()) {
>> -val = bswap32(val);
>> -}
>> virtio_config_writel(proxy->vdev, addr, val);
>> break;
>> }
>> @@ -444,7 +429,7 @@ static const MemoryRegionOps virtio_pci_config_ops = {
>> .min_access_size = 1,
>> .max_access_size = 4,
>> },
>> -.endianness = DEVICE_LITTLE_ENDIAN,
>> +.endianness = DEVICE_NATIVE_ENDIAN,
>> };
>>
>> static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
>> --
>> 1.7.2.5
>>
>>
>

Re: [Qemu-devel] [PATCH] virtio-pci: replace byte swap hack

2013-01-06 Thread Andreas Färber

Am 06.01.2013 14:17, schrieb Alexander Graf:
> 
> On 30.12.2012, at 13:55, Blue Swirl wrote:
> 
>> Remove byte swaps by declaring the config space
>> as native endian.
> 
> This is wrong. Virtio-pci config space is split into 2 regions. One with 
> native endianness, the other one with little endian.

Can that MemoryRegion be split in two?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 3/4] hw: Add test device for unittests execution

2013-01-06 Thread Andreas Färber

Am 04.01.2013 09:16, schrieb Gerd Hoffmann:
> From: Lucas Meneghel Rodrigues 
> 
> Add a test device which supports the kvmctl ioports,
> so one can run the KVM unittest suite.
> 
> Intended Usage:
> 
> qemu-system-x86_64 -nographic \
> -device pc-testdev \
> -device isa-debug-exit,iobase=0xf4,iosize=0x04 \
> -kernel /path/to/kvm/unittests/msr.flat
> 
> Where msr.flat is one of the KVM unittests, present on a
> separate repo,
> 
> git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
> 
> [ kraxel: more memory api + qom fixes ]
> 
> CC: Paolo Bonzini 
> Signed-off-by: Alexander Graf 
> Signed-off-by: Marcelo Tosatti 
> Signed-off-by: Lucas Meneghel Rodrigues 
> Signed-off-by: Gerd Hoffmann 
[...]
> diff --git a/hw/pc-testdev.c b/hw/pc-testdev.c
> new file mode 100644
> index 000..620c86c
> --- /dev/null
> +++ b/hw/pc-testdev.c
[...]
> +typedef struct PCTestdev {
> +ISADevice parent_obj;
> +
> +MemoryRegion ioport;
> +MemoryRegion flush;
> +MemoryRegion irq;
> +MemoryRegion iomem;
> +uint32_t ioport_data;
> +char iomem_buf[IOMEM_LEN];
> +} PCTestdev;
> +
> +#define TYPE_TESTDEV "pc-testdev"
> +#define TESTDEV(obj) \
> + OBJECT_CHECK(struct PCTestdev, (obj), TYPE_TESTDEV)

You define a typedef above but ignore it here and everywhere below. I'm
surprised that Anthony didn't complain - struct is an implementation
detail of today's QOM classes in lack of C++/etc. language support, and
even qdev used the typedefs.

Lucas/Gerd, can one of you please clean this up in a follow-up?

I only noticed this because I had to touch some of these lines rebasing
my ISA QOM realize queue - maybe some Perl wizard can add a
checkpatch.pl rule to catch this?

Thanks,
Andreas

> +
> +static void test_irq_line(void *opaque, hwaddr addr, uint64_t data,
> +  unsigned len)
> +{
> +struct PCTestdev *dev = opaque;
> +struct ISADevice *isa = ISA_DEVICE(dev);
> +
> +qemu_set_irq(isa_get_irq(isa, addr), !!data);
> +}
> +
> +static const MemoryRegionOps test_irq_ops = {
> +.write = test_irq_line,
> +.valid.min_access_size = 1,
> +.valid.max_access_size = 1,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static void test_ioport_write(void *opaque, hwaddr addr, uint64_t data,
> +  unsigned len)
> +{
> +struct PCTestdev *dev = opaque;
> +dev->ioport_data = data;
> +}
> +
> +static uint64_t test_ioport_read(void *opaque, hwaddr addr, unsigned len)
> +{
> +struct PCTestdev *dev = opaque;
> +return dev->ioport_data;
> +}
> +
> +static const MemoryRegionOps test_ioport_ops = {
> +.read = test_ioport_read,
> +.write = test_ioport_write,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static void test_flush_page(void *opaque, hwaddr addr, uint64_t data,
> +unsigned len)
> +{
> +hwaddr page = 4096;
> +void *a = cpu_physical_memory_map(data & ~0xffful, &page, 0);
> +
> +/* We might not be able to get the full page, only mprotect what we 
> actually
> +   have mapped */
> +mprotect(a, page, PROT_NONE);
> +mprotect(a, page, PROT_READ|PROT_WRITE);
> +cpu_physical_memory_unmap(a, page, 0, 0);
> +}
> +
> +static const MemoryRegionOps test_flush_ops = {
> +.write = test_flush_page,
> +.valid.min_access_size = 4,
> +.valid.max_access_size = 4,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static uint64_t test_iomem_read(void *opaque, hwaddr addr, unsigned len)
> +{
> +struct PCTestdev *dev = opaque;
> +uint64_t ret = 0;
> +memcpy(&ret, &dev->iomem_buf[addr], len);
> +ret = le64_to_cpu(ret);
> +
> +return ret;
> +}
> +
> +static void test_iomem_write(void *opaque, hwaddr addr, uint64_t val,
> + unsigned len)
> +{
> +struct PCTestdev *dev = opaque;
> +val = cpu_to_le64(val);
> +memcpy(&dev->iomem_buf[addr], &val, len);
> +dev->iomem_buf[addr] = val;
> +}
> +
> +static const MemoryRegionOps test_iomem_ops = {
> +.read = test_iomem_read,
> +.write = test_iomem_write,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static int init_test_device(ISADevice *isa)
> +{
> +struct PCTestdev *dev = TESTDEV(isa);
> +MemoryRegion *mem = isa_address_space(isa);
> +MemoryRegion *io = isa_address_space_io(isa);
> +
> +memory_region_init_io(&dev->ioport, &test_ioport_ops, dev,
> +  "pc-testdev-ioport", 4);
> +memory_region_init_io(&dev->flush, &test_flush_ops, dev,
> +  "pc-testdev-flush-page", 4);
> +memory_region_init_io(&dev->irq, &test_irq_ops, dev,
> +  "pc-testdev-irq-line", 24);
> +memory_region_init_io(&dev->iomem, &test_iomem_ops, dev,
> +  "pc-testdev-iomem", IOMEM_LEN);
> +
> +memory_region_add_subregion(io,  0xe0,   &dev->ioport);
> +memory_region_add_subregion(io,  0xe4,   &dev->flush);
> +memory_region

Re: [Qemu-devel] [PATCH 1/4] switch debugcon to memory api

2013-01-06 Thread Andreas Färber

Am 04.01.2013 09:16, schrieb Gerd Hoffmann:
> Also some QOM glue while being at it.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/debugcon.c |   31 ---
>  1 files changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/debugcon.c b/hw/debugcon.c
> index 14f83f1..e8a855e 100644
> --- a/hw/debugcon.c
> +++ b/hw/debugcon.c
> @@ -29,20 +29,27 @@
>  #include "isa.h"
>  #include "pc.h"
>  
> +#define TYPE_ISA_DEBUGCON_DEVICE "isa-debugcon"
> +#define ISA_DEBUGCON_DEVICE(obj) \
> + OBJECT_CHECK(ISADebugconState, (obj), TYPE_ISA_DEBUGCON_DEVICE)

Note that my previous QOM'ification RFC used the more verbose
[TYPE_]ISA_DEBUG_CONSOLE to avoid the "debugcon" abbreviation. Care to
cleanup?

Otherwise looks good, thanks, and would've got a Reviewed-by had you not
rushed to send a PULL.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [FYI] Testing new patch management scripts...

2013-01-06 Thread Andreas Färber

Am 05.01.2013 04:05, schrieb Anthony Liguori:
> Andreas Färber  writes:
>> Am 02.01.2013 19:07, schrieb Anthony Liguori:
>>>   2) Patches that no longer apply via git am --3way will receive a
>>>  notification.
>>
>> Isn't Nacked-by too strong of a notification for a mere technical need
>> for a rebase (compared to a rejection of its concept/approach)?
> 
> Yeah, I need some sort of tag though for tracking purposes..  Maybe
> Rejected-by...

Sorry, forgot to mention: I wanted to propose Bounced-by.

>> Also, since quite a few of such messages have been appearing on the list
>> already, might it make sense to shorten the message and rather link to
>> some qemu.org Wiki page to spare subscribers' mailboxes?
> 
> I'm focusing at the moment on getting the code in shape to release.  We
> can tweak the text bits over time.  Patches will only send one mail
> per-thread so it's unlikely that it will ever cause a space problem even
> in the stingiest of corporate mailbox quotas :-)

We shouldn't forgot non-corporate mailboxes since QEMU is a community
project. :-) I was rather thinking about disk space long-term though and
these messages mostly reaching core community members so far that don't
need detailed explaining.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH] vfio-pci: Make host MSI-X enable track guest

2013-01-06 Thread Alex Williamson

On Sun, 2013-01-06 at 10:50 +0200, Michael S. Tsirkin wrote:
> On Fri, Dec 21, 2012 at 08:38:07AM -0700, Alex Williamson wrote:
> > On Fri, 2012-12-21 at 14:21 +0200, Michael S. Tsirkin wrote:
> > > On Thu, Dec 20, 2012 at 03:12:46PM -0700, Alex Williamson wrote:
> > > > On Thu, 2012-12-20 at 18:36 +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Dec 20, 2012 at 09:06:41AM -0700, Alex Williamson wrote:
> > > > > > Guests typically enable MSI-X with all of the vectors in the MSI-X
> > > > > > vector table masked.  Only when the vector is enabled does the 
> > > > > > vector
> > > > > > get unmasked, resulting in a vector_use callback.  These two points,
> > > > > > enable and unmask, correspond to pci_enable_msix() and request_irq()
> > > > > > for Linux guests.  Some drivers rely on VF/PF or PF/fw communication
> > > > > > channels that expect the physical state of the device to match the
> > > > > > guest visible state of the device.  They don't appreciate lazily
> > > > > > enabling MSI-X on the physical device.
> > > > > > 
> > > > > > To solve this, enable MSI-X with a single vector when the MSI-X
> > > > > > capability is enabled and immediate disable the vector.  This leaves
> > > > > > the physical device in exactly the same state between host and 
> > > > > > guest.
> > > > > > Furthermore, the brief gap where we enable vector 0, it fires into
> > > > > > userspace, not KVM, so the guest doesn't get spurious interrupts.
> > > > > > Ideally we could call VFIO_DEVICE_SET_IRQS with the right parameters
> > > > > > to enable MSI-X with zero vectors, but this will currently return an
> > > > > > error as the Linux MSI-X interfaces do not allow it.
> > > > > > 
> > > > > > Cc: qemu-sta...@nongnu.org
> > > > > > Signed-off-by: Alex Williamson 
> > > > > 
> > > > > Do you need an interface for this?  Can you do low-level pci config
> > > > > access instead?  I imagine you would then enable MSIX and mask all
> > > > > vectors at the same time.
> > > > > 
> > > > > No?
> > > > 
> > > > I really don't like the idea of enabling MSI-X directly through config
> > > > space.  We're just asking for ownership conflicts doing that.  In fact,
> > > > vfio prevents MSI/X from being enabled through config space since those
> > > > are controlled through ioctl.
> > > 
> > > For vfio the natural thing to do would be to add interfaces
> > > to do this in a controlled manner.
> > 
> > The ioctl is the controlled manner.  The ioctl will actually accept
> > enabling zero vectors, but you'll get an -ERANGE generated from
> > pci_enable_msix, so I think the interface there is fine.
> 
> See comment below at the patch.
> 
> > > >  It also prevents access to the MSI-X
> > > > vector table since userspace has no business reading or modifying it.
> > > > Thanks,
> > > > 
> > > > Alex
> > > 
> > > I'm not sure what the point of this is. If a device can do DMA writes
> > > there is no way to distinguish between them and MSI on the bus.
> > > So this seems to buy us no additional security.
> > 
> > IOMMUs supporting interrupt remapping can.
> > Do you propose we have one
> > interface when interrupt remapping is present and another when it's not?
> > I can't think of anything useful that userspace can do with direct
> > access to MSIX setup, including this MSI-X enable stub.
> 
> I am simply pointing out that interrupts are not virtualized
> well at the moment. The guest has access to the device
> and that knows the real contents of the MSIX table.
> Using device as a side channel, guest and device can detect
> the difference between real and observed MSIX table content.
> It might not matter on most devices but it would be more
> robust if we made guest and device states match as closely
> as possible.

Actually, the end result of the operation below puts the physical device
in exactly the same state as the guest view of the device.  I agree that
an improvement to the kernel API would allow us to avoid the brief
transition through a single vector enabled and avoid teardown and
re-setup any time a new vector is enabled.

> > Do you have any actual objections to the patch below?  I agree that
> > kernel interfaces can be improved and I'd prefer the ability to enable
> > MSI-X with zero vectors and incrementally add and remove vectors, but
> > creating side channels so userspace can poke around here is not an
> > acceptable alternative imho.  We need a solution for users hitting this
> > now and I'm pretty happy with how this solution works.
> 
> It's ugly but for an existing vfio I don't see a better solution, no.
> 
> >  I only wish we could make legacy assignment behave as cleanly, but
> >  I'm not willing to poke MSI-X enable bits myself to make it happen.
> 
> With existing kernels I don't think there's a better option.
> 
> >  Thanks,
> > 
> > Alex
> > > > > > ---
> > > > > >  hw/vfio_pci.c |   31 +++
> > > > > >  1 file changed, 27 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > VFIO makes th

Re: [Qemu-devel] [PATCH] pci-assign: Enable MSIX on device to match guest

2013-01-06 Thread Alex Williamson

On Sun, 2013-01-06 at 15:23 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 02, 2013 at 08:49:42AM -0700, Alex Williamson wrote:
> > On Fri, 2012-12-21 at 08:46 -0700, Alex Williamson wrote:
> > > On Fri, 2012-12-21 at 14:17 +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Dec 20, 2012 at 03:15:38PM -0700, Alex Williamson wrote:
> > > > > On Thu, 2012-12-20 at 18:38 +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Dec 20, 2012 at 09:05:50AM -0700, Alex Williamson wrote:
> > > > > > > When a guest enables MSIX on a device we evaluate the MSIX vector
> > > > > > > table, typically find no unmasked vectors and don't switch the 
> > > > > > > device
> > > > > > > to MSIX mode.  This generally works fine and the device will be
> > > > > > > switched once the guest enables and therefore unmasks a vector.
> > > > > > > Unfortunately some drivers enable MSIX, then use interfaces to 
> > > > > > > send
> > > > > > > commands between VF & PF or PF & firmware that act based on the 
> > > > > > > host
> > > > > > > state of the device.  These therefore break when MSIX is managed
> > > > > > > lazily.  This change re-enables the previous test used to enable 
> > > > > > > MSIX
> > > > > > > (see qemu-kvm a6b402c9), which basically guesses whether a vector
> > > > > > > will be used based on the data field of the vector table.
> > > > > > > 
> > > > > > > Cc: qemu-sta...@nongnu.org
> > > > > > > Signed-off-by: Alex Williamson 
> > > > > > 
> > > > > > Same question: can't we enable and mask MSIX through config sysfs?
> > > > > > In this case it can be done in userspace ...
> > > > > 
> > > > > In this case userspace could do this, but I think it's still 
> > > > > incredibly
> > > > > dangerous.  Kernel space drivers can also directly enable MSI-X on a
> > > > > device, but you might get shot for writing one that did.
> > > > 
> > > > What would be the reason for the kernel driver to do this?
> > > 
> > > Maybe they don't know how many vectors to use until they enable MSI-X
> > > and query some firmware interface.  It's a hypothetical situation, I'm
> > > just trying to illustrate that if a kernel driver did want to do this,
> > > they'd have to develop interfaces to allow it, not just manually poke
> > > their MSI-X enable bit.
> > > 
> > > > >  We should
> > > > > follow the rules, play be the existing kernel interfaces, and work to
> > > > > eventually improve those interfaces.  Thanks,
> > > > > 
> > > > > Alex
> > > > 
> > > > I'm not against adding an interface for this long term but we have
> > > > existing kernels to support too.  IMHO it would be nicer than
> > > > the data hack which relies on non-documented guest behaviour
> > > > that might change without warning in the future.
> > > 
> > > We've unwittingly used the data hack for years and only ripped it out
> > > because it was undocumented.  The patch below adds documentation for it,
> > > so at least we have a more clear understanding of why it was there if we
> > > want to try to rip it out again.  This fully supports existing kernels
> > > and as I mention below, we might be able to do better with limiting how
> > > many vectors we enabled, but I think this is the right initial fix and
> > > right fix for stable and we can continue to experiment from here.
> > 
> > Happy new year.  I'd like to close on this as we do currently have a
> > regression for devices that cannot handle MSI-X being lazily enabled.
> > The option here is to document and revert to the old style
> > initialization behavior where we look at the data field of the vector to
> > get a hint whether the guest intends to make use of the vector.  This
> > gives us the same behavior as we had previously, but still allows
> > vectors to be added, so we maintain the current FreeBSD support.  This
> > much needs to go to stable.
> 
> By the way, could you clarify what exactly happens with FreeBSD?

FreeBSD enables MSIX in config space without setting anything in the
MSIX table address/data fields.  Thus previously, when we only looked at
the data field and did not add vectors dynamically, FreeBSD wouldn't get
any vectors setup.  This change therefore does not effect FreeBSD since
it will still lazily enable vectors.  This would be addressed by a
follow-on patch since we don't know of any drivers that care about the
lazy setup on FreeBSD.

> > For the development tree, I think we can do better.  Using the data
> > field is not 100% reliable in giving us the number of vectors the guest
> > actually intends to use.  Instead we'd like to enable MSI-X with no
> > vectors and add vectors as the guest unmasks them.  The host Linux MSI
> > API currently doesn't allow this, so I think the next best thing is to
> > enable MSI-X with a single vector in the case where MSI-X is enabled but
> > no vectors are unmasked.  This conserves vectors on the host though we
> > do potentially allow spurious interrupts through the enabled vector
> > (though we previously enabled multiple vectors using the above da

Re: [Qemu-devel] [PATCH KVM v2 0/4] fix KVM i8259 IRQ trailing-edge logic

2013-01-06 Thread Gleb Natapov

On Wed, Dec 26, 2012 at 10:39:52PM -0700, Matthew Ogilvie wrote:
> Changes since version 1 (from Sep 9):
>* Split off patch 1; this is the critical prerequisite to
>  make the i8254 work with the fixed i8259.
>* Add patch 2, to make additional changes to the i8254
>  to make it consistent with the spec and with proposed changes
>  to qemu's i8254 model.
> 
> Background:
> 
> According to the spec, the i8259 will cancel an interrupt if
> the line goes low before it starts servicing it, even when
> it is considered edge-triggered.  This is a problem
> with an old Microport UNIX guest (ca 1987), where the
> guest masks off IRQ14 in the slave i8259, but the host's
> master i8259 model will still try to deliver an interrupt even after
> IRQ2 has been brought low, resulting in a spurious interrupt (IRQ15).
> 
> Before the i8259 can be fixed, the i8254 model needs to be fixed
> so it doesn't depend on the broken i8259.
> 
> Alternative: This could be narrowly fixed for IRQ2 only with no
> risk at all (and no need to touch the i8254), but if possible it
> seems reasonable to fix it for all lines at the same time.
> 
> But there may be some risk:
> 
> First, I think I've convinced myself that between the i8254 and i8259,
> the only substantial migration breakage should be when a
> whole series of conditions are met, and the combination
> should be rare enough not to worry about it.  See writeup
> in my qemu patch series (version 8; Dec 16).
> 
> Second, there is also the possibility that other devices are relying
> on the broken model.  I'm especially concerned with various users
> of a function called
> 
> qemu_irq_pulse()
> 
> in the qemu source tree, which immediately lowers IRQ line after
> raising it.  If any of those calls are routed straight into
> the i8259 (as opposed to an APIC or some other chip), then it
> probably won't behave as desired.
> 
> I'll probably dig into qemu_irq_pulse() callers more carefully
> eventually, but there are lot of them, and any high-level incite
> anyone can provide into them would be helpful.
> 
$ git grep qemu_irq_pulse | wc -l
34

Files are:
hw/bonito.c
hw/dma.c
hw/grlib_apbuart.c
hw/grlib_gptimer.c
hw/hpet.c
hw/irq.h
hw/milkymist-ac97.c
hw/milkymist-minimac2.c
hw/milkymist-pfpu.c
hw/milkymist-softusb.c
hw/milkymist-sysctl.c
hw/milkymist-tmu2.c
hw/omap1.c
hw/omap_gptimer.c
hw/onenand.c
hw/spapr_events.c
hw/spapr_llan.c
hw/spapr_pci.c
hw/spapr_vio.c
hw/spapr_vty.c
hw/stellaris.c
hw/xilinx_ethlite.c

Looks like only two of those are relevant to PC platform hw/dma.c and
hw/hpet.c. In hw/dma.c it is used for internal qemu communication, not
real device. In hw/hpet.c from a quick glance it looks like _pulse is
only used when HPET in not in legacy emulation mode which means that
pulse should be directed to APIC.

> See the Dec 16 patch series I sent to the qemu mailing list for
> more information.
> http://search.gmane.org/?query=ogilvie&group=gmane.comp.emulators.qemu
> 
> Matthew Ogilvie (4):
>   KVM: fix i8254 IRQ0 to be normally high
>   KVM: additional i8254 output fixes
>   KVM: fix i8259 interrupt high to low transition logic
>   KVM: i8259: refactor pic_set_irq level logic
> 
>  arch/x86/kvm/i8254.c | 50 +++---
>  arch/x86/kvm/i8259.c | 36 ++--
>  2 files changed, 53 insertions(+), 33 deletions(-)
> 
> -- 
> 1.7.10.2.484.gcd07cc5

--
Gleb.

Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix do_kvm_cpu_synchronize_state data integrity issue

2013-01-06 Thread Bhushan Bharat-R65777



> -Original Message-
> From: Jason J. Herne [mailto:jjhe...@linux.vnet.ibm.com]
> Sent: Friday, January 04, 2013 11:10 PM
> To: Alexander Graf
> Cc: Christian Borntraeger; Anthony Liguori; Marcelo Tosatti; qemu-
> de...@nongnu.org qemu-devel; Bhushan Bharat-R65777
> Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
> do_kvm_cpu_synchronize_state data integrity issue
> 
> On 01/04/2013 11:27 AM, Alexander Graf wrote:
> >
> > On 04.01.2013, at 16:25, Jason J. Herne wrote:
> >
> >> If I've followed the conversation correctly this is what needs to be done:
> >>
> >> 1. Remove the level parameters from kvm_arch_get_registers and
> kvm_arch_put_registers.
> >>
> >> 2. Add a new bitmap parameter to kvm_arch_get_registers and
> kvm_arch_put_registers.
> >
> > I would combine these into "replace levels with bitmap".
> >
> >> 3. Define a bit that correlates to our current notion of "all runtime
> registers".  This bit, and all bits in this bitmap, would be architecture
> specific.
> >
> > Why would that bit be architecture specific? "All runtime registers" ==
> "registers that gdb can access" IIRC. The implementation on what exactly that
> means obviously is architecture specific, but the bit itself would not be, as
> the gdbstub wants to be able to synchronize in arch independent code.
> >
> >> 4. Remove the cpustate->kvm_sync_dirty field.  Replace it with a bitmap 
> >> that
> tracks which bits are dirty and need to be synced back to KVM-land.
> >>
> >> 5. As we do today, we'll assume registers are dirty and turn on their
> corresponding bit in this new bitmap whenever we "get" the registers from KVM.
> >
> > Yes. Changing these semantics is nothing for today :).
> >
> >> 6. Add other bits as needed on a case by case basis.
> >>
> >> Does this seem to match what was discussed, and what we want to do?
> >
> > It's probably the best way forward, keeping everyone happy.
> >
> > Please coordinate with Bharat on who actually wants to sit down to implement
> this. Or if you're quick you might be able to beat him to it regardless thanks
> to time zones :).
> >
> 
> Hi Bharat,
> 
> How would you like to handle these changes?  I can do them, or you could if 
> you
> prefer. Please let me know.

Hi Jason,

I will be happy to see the patch from you and because of some other things if 
you think that it will take time then let me know, I will do the changes. This 
framework will be used by my watchdog patches and only thing I want is my 
watchdog changes get pushed in QEMU.

Thanks
-Bharat

Re: [Qemu-devel] [PATCH qom-cpu 11/11] target-i386: check/enforce: Check all feature words

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:12PM -0200, Eduardo Habkost wrote:
> This adds the following feature words to the list of flags to be checked
> by kvm_check_features_against_host():
> 
>  - cpuid_7_0_ebx_features
>  - ext4_features
>  - kvm_features
>  - svm_features
> 
> This will ensure the "enforce" flag works as it should: it won't allow
> QEMU to be started unless every flag that was requested by the user or
> defined in the CPU model is supported by the host.
> 
> This patch may cause existing configurations where "enforce" wasn't
> preventing QEMU from being started to abort QEMU. But that's exactly the
> point of this patch: if a flag was not supported by the host and QEMU
> wasn't aborting, it was a bug in the "enforce" code.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Cc: Gleb Natapov 
> Cc: Marcelo Tosatti 
> Cc: k...@vger.kernel.org
> Cc: libvir-l...@redhat.com
> Cc: Jiri Denemark 
> 
> CCing libvirt people, as this is directly related to the planned usage
> of the "enforce" flag by libvirt.
> 
> The libvirt team probably has a problem in their hands: libvirt should
> use "enforce" to make sure all requested flags are making their way into
> the guest (so the resulting CPU is always the same, on any host), but
> users may have existing working configurations where a flag is not
> supported by the guest and the user really doesn't care about it. Those
> configurations will necessarily break when libvirt starts using
> "enforce".
> 
> One example where it may cause trouble for common setups: pc-1.3 wants
> the kvm_pv_eoi flag enabled by default (so "enforce" will make sure it
> is enabled), but the user may have an existing VM running on a host
> without pv_eoi support. That setup is unsafe today because
> live-migration between different host kernel versions may enable/disable
> pv_eoi silently (that's why we need the "enforce" flag to be used by
> libvirt), but the user probably would like to be able to live-migrate
> that VM anyway (and have libvirt to "just do the right thing").
> 
> One possible solution to libvirt is to use "enforce" only on newer
> machine-types, so existing machines with older machine-types will keep
> the unsafe host-dependent-ABI behavior, but at least would keep
> live-migration working in case the user is careful.
> 
> I really don't know what the libvirt team prefers, but that's the
> situation today. The longer we take to make "enforce" strict as it
> should and make libvirt finally use it, more users will have VMs with
> migration-unsafe unpredictable guest ABIs.
> 
> Changes v2:
>  - Coding style fix
> ---
>  target-i386/cpu.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 876b0f6..52727ad 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -955,8 +955,9 @@ static int unavailable_host_feature(struct 
> model_features_t *f, uint32_t mask)
>  return 0;
>  }
>  
> -/* best effort attempt to inform user requested cpu flags aren't making
> - * their way to the guest.
> +/* Check if all requested cpu flags are making their way to the guest
> + *
> + * Returns 0 if all flags are supported by the host, non-zero otherwise.
>   *
>   * This function may be called only if KVM is enabled.
>   */
> @@ -973,7 +974,15 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  {&guest_def->ext2_features, &host_def.ext2_features,
>  ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
> -ext3_feature_name, 0x8001, R_ECX}
> +ext3_feature_name, 0x8001, R_ECX},
> +{&guest_def->ext4_features, &host_def.ext4_features,
> +NULL, 0xC001, R_EDX},
Since there is not name array for ext4_features they cannot be added or
removed on the command line hence no need to check them, no?

> +{&guest_def->cpuid_7_0_ebx_features, 
> &host_def.cpuid_7_0_ebx_features,
> +cpuid_7_0_ebx_feature_name, 7, R_EBX},
> +{&guest_def->svm_features, &host_def.svm_features,
> +svm_feature_name, 0x800A, R_EDX},
> +{&guest_def->kvm_features, &host_def.kvm_features,
> +kvm_feature_name, KVM_CPUID_FEATURES, R_EAX},
>  };
>  
>  assert(kvm_enabled());
> -- 
> 1.7.11.7

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 10/11] target-i386: Call kvm_check_features_against_host() only if CONFIG_KVM is set

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:11PM -0200, Eduardo Habkost wrote:
> This will be necessary once kvm_check_features_against_host() starts
> using KVM-specific definitions (so it won't compile anymore if
> CONFIG_KVM is not set).
> 
> Signed-off-by: Eduardo Habkost 
> ---
>  target-i386/cpu.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 1c3c7e1..876b0f6 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -936,6 +936,7 @@ static void kvm_cpu_fill_host(x86_def_t *x86_cpu_def)
>  #endif /* CONFIG_KVM */
>  }
>  
> +#ifdef CONFIG_KVM
>  static int unavailable_host_feature(struct model_features_t *f, uint32_t 
> mask)
>  {
>  int i;
> @@ -987,6 +988,7 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  }
>  return rv;
>  }
> +#endif
>  
>  static void x86_cpuid_version_get_family(Object *obj, Visitor *v, void 
> *opaque,
>   const char *name, Error **errp)
> @@ -1410,10 +1412,12 @@ static int cpu_x86_parse_featurestr(x86_def_t 
> *x86_cpu_def, char *features)
>  x86_cpu_def->kvm_features &= ~minus_kvm_features;
>  x86_cpu_def->svm_features &= ~minus_svm_features;
>  x86_cpu_def->cpuid_7_0_ebx_features &= ~minus_7_0_ebx_features;
> +#ifdef CONFIG_KVM
>  if (check_cpuid && kvm_enabled()) {
>  if (kvm_check_features_against_host(x86_cpu_def) && enforce_cpuid)
>  goto error;
>  }
> +#endif
Provide kvm_check_features_against_host() stub if !CONFIG_KVM and drop
ifdef here.

>  return 0;
>  
>  error:
> -- 
> 1.7.11.7
> 

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 09/11] target-i386: check/enforce: Eliminate check_feat field

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:10PM -0200, Eduardo Habkost wrote:
> Now that all entries have check_feat=~0 on
> kvm_check_features_against_host(), we can eliminate check_feat entirely
> and make the code check all bits.
> 
> This patch shouldn't introduce any behavior change, as check_feat is set
> to ~0 on all entries.
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
> Changes v2:
>  - Coding style fix
> ---
>  target-i386/cpu.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index a9dd959..1c3c7e1 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -148,7 +148,6 @@ const char *get_register_name_32(unsigned int reg)
>  typedef struct model_features_t {
>  uint32_t *guest_feat;
>  uint32_t *host_feat;
> -uint32_t check_feat;
>  const char **flag_names;
>  uint32_t cpuid;
>  int reg;
> @@ -956,8 +955,7 @@ static int unavailable_host_feature(struct 
> model_features_t *f, uint32_t mask)
>  }
>  
>  /* best effort attempt to inform user requested cpu flags aren't making
> - * their way to the guest.  Note: ft[].check_feat ideally should be
> - * specified via a guest_def field to suppress report of extraneous flags.
> + * their way to the guest.
>   *
>   * This function may be called only if KVM is enabled.
>   */
> @@ -968,13 +966,13 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  int rv, i;
>  struct model_features_t ft[] = {
>  {&guest_def->features, &host_def.features,
> -~0, feature_name, 0x0001, R_EDX},
> +feature_name, 0x0001, R_EDX},
>  {&guest_def->ext_features, &host_def.ext_features,
> -~0, ext_feature_name, 0x0001, R_ECX},
> +ext_feature_name, 0x0001, R_ECX},
>  {&guest_def->ext2_features, &host_def.ext2_features,
> -~0, ext2_feature_name, 0x8001, R_EDX},
> +ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
> -~0, ext3_feature_name, 0x8001, R_ECX}
> +ext3_feature_name, 0x8001, R_ECX}
>  };
>  
>  assert(kvm_enabled());
> @@ -982,7 +980,7 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  kvm_cpu_fill_host(&host_def);
>  for (rv = 0, i = 0; i < ARRAY_SIZE(ft); ++i)
>  for (mask = 1; mask; mask <<= 1)
> -if (ft[i].check_feat & mask && *ft[i].guest_feat & mask &&
> +if (*ft[i].guest_feat & mask &&
>  !(*ft[i].host_feat & mask)) {
>  unavailable_host_feature(&ft[i], mask);
>  rv = 1;
> -- 
> 1.7.11.7
> 

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 08/11] target-i386: check/enforce: Check SVM flag support as well

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:09PM -0200, Eduardo Habkost wrote:
> When nested SVM is supported, the kernel returns the SVM flag on
> GET_SUPPORTED_CPUID[1], so we can check the SVM flag safely on
> kvm_check_features_against_host().
> 
> I don't know why the original code ignored the SVM flag. Maybe it was
> because kvm_cpu_fill_host() used the CPUID instruction directly instead
> of GET_SUPPORTED_CPUID
> 
> [1] Older kernels (before v2.6.37) returned the SVM flag even if nested
> SVM was _not_ supported. So the only cases where this patch should
> change behavior is when SVM is being requested by the user or the
> CPU model, but not supported by the host. And on these cases we
> really want QEMU to abort if the "enforce" option is set.
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
> Cc: Joerg Roedel 
> Cc: k...@vger.kernel.org
> Cc: libvir-l...@redhat.com
> Cc: Jiri Denemark 
> 
> I'm CCing libvirt people in case having SVM enabled by default may cause
> trouble when libvirt starts using the "enforce" flag. I don't know if
> libvirt expects most of the QEMU CPU models to have nested SVM enabled.
> 
> Changes v2:
>  - Coding style fix
> ---
>  target-i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index ce64b98..a9dd959 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -974,7 +974,7 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  {&guest_def->ext2_features, &host_def.ext2_features,
>  ~0, ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
> -~CPUID_EXT3_SVM, ext3_feature_name, 0x8001, R_ECX}
> +~0, ext3_feature_name, 0x8001, R_ECX}
>  };
>  
>  assert(kvm_enabled());
> -- 
> 1.7.11.7
> 

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 07/11] target-i386: check/enforce: Check all CPUID.80000001H.EDX bits

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:08PM -0200, Eduardo Habkost wrote:
> I have no idea why PPRO_FEATURES was being ignored on the check of the
> CPUID.8001H.EDX bits. I believe it was a mistake, and it was
> supposed to be ~(PPRO_FEATURES & CPUID_EXT2_AMD_ALIASES) or just
> ~CPUID_EXT2_AMD_ALIASES, because some time ago kvm_cpu_fill_host() used
> the CPUID instruction directly (instead of
> kvm_arch_get_supported_cpuid()).
> 
> But now kvm_cpu_fill_host() use kvm_arch_get_supported_cpuid(), and
> kvm_arch_get_supported_cpuid() returns all supported bits for
> CPUID.8001H.EDX, even the AMD aliases (that are explicitly copied
> from CPUID.01H.EDX), so we can make the code check/enforce all the
> CPUID.8001H.EDX bits.
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
>  target-i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 42c4c99..ce64b98 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -972,7 +972,7 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  {&guest_def->ext_features, &host_def.ext_features,
>  ~0, ext_feature_name, 0x0001, R_ECX},
>  {&guest_def->ext2_features, &host_def.ext2_features,
> -~PPRO_FEATURES, ext2_feature_name, 0x8001, R_EDX},
> +~0, ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
>  ~CPUID_EXT3_SVM, ext3_feature_name, 0x8001, R_ECX}
>  };
> -- 
> 1.7.11.7
> 

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 06/11] target-i386: check/enforce: Do not ignore "hypervisor" flag

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:07PM -0200, Eduardo Habkost wrote:
> We don't need any hack to ignore CPUID_EXT_HYPERVISOR anymore, because
> kvm_arch_get_supported_cpuid() now set CPUID_EXT_HYPERVISOR properly.
> So, this shouldn't introduce any behavior change, but it makes the code
> simpler.
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
> My goal is to eliminate the check_feat field completely, as
> kvm_arch_get_supported_cpuid() should now really return all the bits we
> can set on all CPUID leaves.
> ---
>  target-i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index c3e5db8..42c4c99 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -970,7 +970,7 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  {&guest_def->features, &host_def.features,
>  ~0, feature_name, 0x0001, R_EDX},
>  {&guest_def->ext_features, &host_def.ext_features,
> -~CPUID_EXT_HYPERVISOR, ext_feature_name, 0x0001, R_ECX},
> +~0, ext_feature_name, 0x0001, R_ECX},
>  {&guest_def->ext2_features, &host_def.ext2_features,
>  ~PPRO_FEATURES, ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
> -- 
> 1.7.11.7
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 05/11] target-i386: check/enforce: Fix CPUID leaf numbers on error messages

2013-01-06 Thread Gleb Natapov

On Sun, Jan 06, 2013 at 04:12:54PM +0200, Gleb Natapov wrote:
> On Fri, Jan 04, 2013 at 08:01:06PM -0200, Eduardo Habkost wrote:
> > The -cpu check/enforce warnings are printing incorrect information about the
> > missing flags. There are no feature flags on CPUID leaves 0 and 0x8000, 
> > but
> > there were references to 0 and 0x8000 in the table at
> > kvm_check_features_against_host().
> > 
> > This changes the model_features_t struct to contain the register number as
> > well, so the error messages print the correct CPUID leaf+register 
> > information,
> > instead of wrong CPUID leaf numbers.
> > 
> > This also changes the format of the error messages, so they follow the
> > "CPUID... [bit ]" convention used on Intel
> > documentation. Example output:
> > 
> > $ qemu-system-x86_64 -machine pc-1.0,accel=kvm -cpu 
> > Opteron_G4,+ia64,enforce
> > warning: host doesn't support requested feature: CPUID.01H:EDX.ia64 
> > [bit 30]
> > warning: host doesn't support requested feature: CPUID.01H:ECX.xsave 
> > [bit 26]
> > warning: host doesn't support requested feature: CPUID.01H:ECX.avx [bit 
> > 28]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.abm [bit 5]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.sse4a [bit 6]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.misalignsse [bit 7]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.3dnowprefetch [bit 8]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.xop [bit 11]
> > warning: host doesn't support requested feature: 
> > CPUID.8001H:ECX.fma4 [bit 16]
> > Unable to find x86 CPU definition
> > $
> > 
> > Signed-off-by: Eduardo Habkost 
> Reviewed-by: Gleb Natapov 
> But see the question below.
Never mind. I found the answer in the following patches :)

> 
> > ---
> > Cc: Gleb Natapov 
> > Cc: Marcelo Tosatti 
> > Cc: k...@vger.kernel.org
> > 
> > Changes v2:
> >  - Coding style fixes
> >  - Add assert() for invalid register numbers on
> >unavailable_host_feature()
> > ---
> >  target-i386/cpu.c | 42 +-
> >  target-i386/cpu.h |  3 +++
> >  2 files changed, 36 insertions(+), 9 deletions(-)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index e916ae0..c3e5db8 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -124,6 +124,25 @@ static const char *cpuid_7_0_ebx_feature_name[] = {
> >  NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
> >  };
> >  
> > +const char *get_register_name_32(unsigned int reg)
> > +{
> > +static const char *reg_names[CPU_NB_REGS32] = {
> > +[R_EAX] = "EAX",
> > +[R_ECX] = "ECX",
> > +[R_EDX] = "EDX",
> > +[R_EBX] = "EBX",
> > +[R_ESP] = "ESP",
> > +[R_EBP] = "EBP",
> > +[R_ESI] = "ESI",
> > +[R_EDI] = "EDI",
> > +};
> > +
> > +if (reg > CPU_NB_REGS32) {
> > +return NULL;
> > +}
> > +return reg_names[reg];
> > +}
> > +
> >  /* collects per-function cpuid data
> >   */
> >  typedef struct model_features_t {
> > @@ -132,7 +151,8 @@ typedef struct model_features_t {
> >  uint32_t check_feat;
> >  const char **flag_names;
> >  uint32_t cpuid;
> > -} model_features_t;
> > +int reg;
> > +} model_features_t;
> >  
> >  int check_cpuid = 0;
> >  int enforce_cpuid = 0;
> > @@ -923,10 +943,13 @@ static int unavailable_host_feature(struct 
> > model_features_t *f, uint32_t mask)
> >  
> >  for (i = 0; i < 32; ++i)
> >  if (1 << i & mask) {
> > -fprintf(stderr, "warning: host cpuid %04x_%04x lacks requested"
> > -" flag '%s' [0x%08x]\n",
> > -f->cpuid >> 16, f->cpuid & 0x,
> > -f->flag_names[i] ? f->flag_names[i] : "[reserved]", mask);
> > +const char *reg = get_register_name_32(f->reg);
> > +assert(reg);
> > +fprintf(stderr, "warning: host doesn't support requested 
> > feature: "
> > +"CPUID.%02XH:%s%s%s [bit %d]\n",
> > +f->cpuid, reg,
> > +f->flag_names[i] ? "." : "",
> > +f->flag_names[i] ? f->flag_names[i] : "", i);
> >  break;
> >  }
> >  return 0;
> > @@ -945,13 +968,14 @@ static int kvm_check_features_against_host(x86_def_t 
> > *guest_def)
> >  int rv, i;
> >  struct model_features_t ft[] = {
> >  {&guest_def->features, &host_def.features,
> > -~0, feature_name, 0x},
> > +~0, feature_name, 0x0001, R_EDX},
> >  {&guest_def->ext_features, &host_def.ext_features,
> > -~CPUID_EXT_HYPERVISOR, ext_feature_name, 0x0001},
> > +~CPUID_EXT_HYPERVISOR, ext_feature_name, 0x0001, R_ECX},
> >  {&guest_def->ext2_features, &host_def.ext2_features,
> > -~P

Re: [Qemu-devel] [PATCH qom-cpu 05/11] target-i386: check/enforce: Fix CPUID leaf numbers on error messages

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:06PM -0200, Eduardo Habkost wrote:
> The -cpu check/enforce warnings are printing incorrect information about the
> missing flags. There are no feature flags on CPUID leaves 0 and 0x8000, 
> but
> there were references to 0 and 0x8000 in the table at
> kvm_check_features_against_host().
> 
> This changes the model_features_t struct to contain the register number as
> well, so the error messages print the correct CPUID leaf+register information,
> instead of wrong CPUID leaf numbers.
> 
> This also changes the format of the error messages, so they follow the
> "CPUID... [bit ]" convention used on Intel
> documentation. Example output:
> 
> $ qemu-system-x86_64 -machine pc-1.0,accel=kvm -cpu 
> Opteron_G4,+ia64,enforce
> warning: host doesn't support requested feature: CPUID.01H:EDX.ia64 [bit 
> 30]
> warning: host doesn't support requested feature: CPUID.01H:ECX.xsave [bit 
> 26]
> warning: host doesn't support requested feature: CPUID.01H:ECX.avx [bit 
> 28]
> warning: host doesn't support requested feature: CPUID.8001H:ECX.abm 
> [bit 5]
> warning: host doesn't support requested feature: 
> CPUID.8001H:ECX.sse4a [bit 6]
> warning: host doesn't support requested feature: 
> CPUID.8001H:ECX.misalignsse [bit 7]
> warning: host doesn't support requested feature: 
> CPUID.8001H:ECX.3dnowprefetch [bit 8]
> warning: host doesn't support requested feature: CPUID.8001H:ECX.xop 
> [bit 11]
> warning: host doesn't support requested feature: CPUID.8001H:ECX.fma4 
> [bit 16]
> Unable to find x86 CPU definition
> $
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 
But see the question below.

> ---
> Cc: Gleb Natapov 
> Cc: Marcelo Tosatti 
> Cc: k...@vger.kernel.org
> 
> Changes v2:
>  - Coding style fixes
>  - Add assert() for invalid register numbers on
>unavailable_host_feature()
> ---
>  target-i386/cpu.c | 42 +-
>  target-i386/cpu.h |  3 +++
>  2 files changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index e916ae0..c3e5db8 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -124,6 +124,25 @@ static const char *cpuid_7_0_ebx_feature_name[] = {
>  NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
>  };
>  
> +const char *get_register_name_32(unsigned int reg)
> +{
> +static const char *reg_names[CPU_NB_REGS32] = {
> +[R_EAX] = "EAX",
> +[R_ECX] = "ECX",
> +[R_EDX] = "EDX",
> +[R_EBX] = "EBX",
> +[R_ESP] = "ESP",
> +[R_EBP] = "EBP",
> +[R_ESI] = "ESI",
> +[R_EDI] = "EDI",
> +};
> +
> +if (reg > CPU_NB_REGS32) {
> +return NULL;
> +}
> +return reg_names[reg];
> +}
> +
>  /* collects per-function cpuid data
>   */
>  typedef struct model_features_t {
> @@ -132,7 +151,8 @@ typedef struct model_features_t {
>  uint32_t check_feat;
>  const char **flag_names;
>  uint32_t cpuid;
> -} model_features_t;
> +int reg;
> +} model_features_t;
>  
>  int check_cpuid = 0;
>  int enforce_cpuid = 0;
> @@ -923,10 +943,13 @@ static int unavailable_host_feature(struct 
> model_features_t *f, uint32_t mask)
>  
>  for (i = 0; i < 32; ++i)
>  if (1 << i & mask) {
> -fprintf(stderr, "warning: host cpuid %04x_%04x lacks requested"
> -" flag '%s' [0x%08x]\n",
> -f->cpuid >> 16, f->cpuid & 0x,
> -f->flag_names[i] ? f->flag_names[i] : "[reserved]", mask);
> +const char *reg = get_register_name_32(f->reg);
> +assert(reg);
> +fprintf(stderr, "warning: host doesn't support requested 
> feature: "
> +"CPUID.%02XH:%s%s%s [bit %d]\n",
> +f->cpuid, reg,
> +f->flag_names[i] ? "." : "",
> +f->flag_names[i] ? f->flag_names[i] : "", i);
>  break;
>  }
>  return 0;
> @@ -945,13 +968,14 @@ static int kvm_check_features_against_host(x86_def_t 
> *guest_def)
>  int rv, i;
>  struct model_features_t ft[] = {
>  {&guest_def->features, &host_def.features,
> -~0, feature_name, 0x},
> +~0, feature_name, 0x0001, R_EDX},
>  {&guest_def->ext_features, &host_def.ext_features,
> -~CPUID_EXT_HYPERVISOR, ext_feature_name, 0x0001},
> +~CPUID_EXT_HYPERVISOR, ext_feature_name, 0x0001, R_ECX},
>  {&guest_def->ext2_features, &host_def.ext2_features,
> -~PPRO_FEATURES, ext2_feature_name, 0x8000},
> +~PPRO_FEATURES, ext2_feature_name, 0x8001, R_EDX},
>  {&guest_def->ext3_features, &host_def.ext3_features,
> -~CPUID_EXT3_SVM, ext3_feature_name, 0x8001}};
> +~CPUID_EXT3_SVM, ext3_feature_name, 0x8001, R_ECX}
Why do we exclude PPRO_FEATURES/CPUID_EXT3_SVM from been checked?

>

Re: [Qemu-devel] [PATCH qom-cpu 04/11] target-i386: kvm: Enable all supported KVM features for -cpu host

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:05PM -0200, Eduardo Habkost wrote:
> When using -cpu host, we don't need to use the kvm_default_features
> variable, as the user is explicitly asking QEMU to enable all feature
> supported by the host.
> 
> This changes the kvm_cpu_fill_host() code to use GET_SUPPORTED_CPUID to
> initialize the kvm_features field, so we get all host KVM features
> enabled.
> 
> This will also allow use to properly check/enforce KVM features inside
> kvm_check_features_against_host() later. For example, we will be able to
> make this:
> 
>   $ qemu-system-x86_64 -cpu ...,+kvm_pv_eoi,enforce
> 
> refuse to start if kvm_pv_eoi is not supported by the host (after we fix
> kvm_check_features_against_host() to check KVM flags as well).
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
> Changes v2:
>  - Coding style (indentation) fix
> 
> Cc: Gleb Natapov 
> Cc: Michael S. Tsirkin 
> Cc: Marcelo Tosatti 
> Cc: k...@vger.kernel.org
> ---
>  target-i386/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index c49a97c..e916ae0 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -911,6 +911,8 @@ static void kvm_cpu_fill_host(x86_def_t *x86_cpu_def)
>  /* Other KVM-specific feature fields: */
>  x86_cpu_def->svm_features =
>  kvm_arch_get_supported_cpuid(s, 0x800A, 0, R_EDX);
> +x86_cpu_def->kvm_features =
> +kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
>  
>  #endif /* CONFIG_KVM */
>  }
> -- 
> 1.7.11.7

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 03/11] target-i386: kvm: -cpu host: Use GET_SUPPORTED_CPUID for SVM features

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:04PM -0200, Eduardo Habkost wrote:
> The existing -cpu host code simply set every bit inside svm_features
> (initializing it to -1), and that makes it impossible to make the
> enforce/check options work properly when the user asks for SVM features
> explicitly in the command-line.
> 
> So, instead of initializing svm_features to -1, use GET_SUPPORTED_CPUID
> to fill only the bits that are supported by the host (just like we do
> for all other CPUID feature words inside kvm_cpu_fill_host()).
> 
> This will keep the existing behavior (as filter_features_for_kvm()
> already uses GET_SUPPORTED_CPUID to filter svm_features), but will allow
> us to properly check for KVM features inside
> kvm_check_features_against_host() later.
> 
> For example, we will be able to make this:
> 
>   $ qemu-system-x86_64 -cpu ...,+pfthreshold,enforce
> 
> refuse to start if the SVM "pfthreshold" feature is not supported by the
> host (after we fix kvm_check_features_against_host() to check SVM flags
> as well).
> 
> Signed-off-by: Eduardo Habkost 
Reviewed-by: Gleb Natapov 

> ---
> Changes v2:
>  - Coding style (indentation) fix
> 
> Cc: Gleb Natapov 
> Cc: Marcelo Tosatti 
> Cc: Joerg Roedel 
> Cc: k...@vger.kernel.org
> ---
>  target-i386/cpu.c | 11 ---
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index c83a566..c49a97c 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -908,13 +908,10 @@ static void kvm_cpu_fill_host(x86_def_t *x86_cpu_def)
>  }
>  }
>  
> -/*
> - * Every SVM feature requires emulation support in KVM - so we can't just
> - * read the host features here. KVM might even support SVM features not
> - * available on the host hardware. Just set all bits and mask out the
> - * unsupported ones later.
> - */
> -x86_cpu_def->svm_features = -1;
> +/* Other KVM-specific feature fields: */
> +x86_cpu_def->svm_features =
> +kvm_arch_get_supported_cpuid(s, 0x800A, 0, R_EDX);
> +
>  #endif /* CONFIG_KVM */
>  }
>  
> -- 
> 1.7.11.7

--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 02/11] target-i386: Disable kvm_mmu_op by default on pc-1.4

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:03PM -0200, Eduardo Habkost wrote:
> The kvm_mmu_op feature was removed from the kernel since v3.3 (released
> in March 2012), it was marked for removal since January 2011 and it's
> slower than shadow or hardware assisted paging (see kernel commit
> fb92045843). It doesn't make sense to keep it enabled by default.
> 
Actually it was effectively removed Oct 1 2009 by a68a6a7282373. After 3
and a half years of not having it I think we can safely drop it without
trying to preserve it in older machine types.

> Also, keeping it enabled by default would cause unnecessary hassle when
> libvirt start using the "enforce" option.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Cc: k...@vger.kernel.org
> Cc: Michael S. Tsirkin 
> Cc: Gleb Natapov 
> Cc: Marcelo Tosatti 
> Cc: libvir-l...@redhat.com
> Cc: Jiri Denemark 
> 
> I was planning to reverse the logic of the compat init functions and
> make pc_init_pci_1_3() enable kvm_mmu_op and then call pc_init_pci_1_4()
> instead. But that would require changing pc_init_pci_no_kvmclock() and
> pc_init_isa() as well. So to keep the changes simple, I am keeping the
> pattern used when pc_init_pci_1_3() was introduced, making
> pc_init_pci_1_4() disable kvm_mmu_op and then call pc_init_pci_1_3().
> 
> Changes v2:
>  - Coding style fix
>  - Removed redundant comments above machine init functions
> ---
>  hw/pc_piix.c  | 9 -
>  target-i386/cpu.c | 9 +
>  target-i386/cpu.h | 1 +
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 99747a7..a32af6a 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -217,6 +217,7 @@ static void pc_init1(MemoryRegion *system_memory,
>  }
>  }
>  
> +/* machine init function for pc-0.14 - pc-1.2 */
>  static void pc_init_pci(QEMUMachineInitArgs *args)
>  {
>  ram_addr_t ram_size = args->ram_size;
> @@ -238,6 +239,12 @@ static void pc_init_pci_1_3(QEMUMachineInitArgs *args)
>  pc_init_pci(args);
>  }
>  
> +static void pc_init_pci_1_4(QEMUMachineInitArgs *args)
> +{
> +disable_kvm_mmu_op();
> +pc_init_pci_1_3(args);
> +}
> +
>  static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args)
>  {
>  ram_addr_t ram_size = args->ram_size;
> @@ -285,7 +292,7 @@ static QEMUMachine pc_machine_v1_4 = {
>  .name = "pc-1.4",
>  .alias = "pc",
>  .desc = "Standard PC",
> -.init = pc_init_pci_1_3,
> +.init = pc_init_pci_1_4,
>  .max_cpus = 255,
>  .is_default = 1,
>  };
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index e6435da..c83a566 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -158,6 +158,15 @@ void enable_kvm_pv_eoi(void)
>  #endif
>  }
>  
> +void disable_kvm_mmu_op(void)
> +{
> +#ifdef CONFIG_KVM
No need for ifdef here too.

> +if (kvm_enabled()) {
> +kvm_default_features &= ~(1UL << KVM_FEATURE_MMU_OP);
clear_bit()

> +}
> +#endif
> +}
> +
>  void host_cpuid(uint32_t function, uint32_t count,
>  uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
>  {
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 1283537..27c8d0c 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -1219,5 +1219,6 @@ void do_smm_enter(CPUX86State *env1);
>  void cpu_report_tpr_access(CPUX86State *env, TPRAccess access);
>  
>  void enable_kvm_pv_eoi(void);
> +void disable_kvm_mmu_op(void);
>  
>  #endif /* CPU_I386_H */
> -- 
> 1.7.11.7

--
Gleb.

Re: [Qemu-devel] [PATCH] pci-assign: Enable MSIX on device to match guest

2013-01-06 Thread Michael S. Tsirkin

On Wed, Jan 02, 2013 at 08:49:42AM -0700, Alex Williamson wrote:
> On Fri, 2012-12-21 at 08:46 -0700, Alex Williamson wrote:
> > On Fri, 2012-12-21 at 14:17 +0200, Michael S. Tsirkin wrote:
> > > On Thu, Dec 20, 2012 at 03:15:38PM -0700, Alex Williamson wrote:
> > > > On Thu, 2012-12-20 at 18:38 +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Dec 20, 2012 at 09:05:50AM -0700, Alex Williamson wrote:
> > > > > > When a guest enables MSIX on a device we evaluate the MSIX vector
> > > > > > table, typically find no unmasked vectors and don't switch the 
> > > > > > device
> > > > > > to MSIX mode.  This generally works fine and the device will be
> > > > > > switched once the guest enables and therefore unmasks a vector.
> > > > > > Unfortunately some drivers enable MSIX, then use interfaces to send
> > > > > > commands between VF & PF or PF & firmware that act based on the host
> > > > > > state of the device.  These therefore break when MSIX is managed
> > > > > > lazily.  This change re-enables the previous test used to enable 
> > > > > > MSIX
> > > > > > (see qemu-kvm a6b402c9), which basically guesses whether a vector
> > > > > > will be used based on the data field of the vector table.
> > > > > > 
> > > > > > Cc: qemu-sta...@nongnu.org
> > > > > > Signed-off-by: Alex Williamson 
> > > > > 
> > > > > Same question: can't we enable and mask MSIX through config sysfs?
> > > > > In this case it can be done in userspace ...
> > > > 
> > > > In this case userspace could do this, but I think it's still incredibly
> > > > dangerous.  Kernel space drivers can also directly enable MSI-X on a
> > > > device, but you might get shot for writing one that did.
> > > 
> > > What would be the reason for the kernel driver to do this?
> > 
> > Maybe they don't know how many vectors to use until they enable MSI-X
> > and query some firmware interface.  It's a hypothetical situation, I'm
> > just trying to illustrate that if a kernel driver did want to do this,
> > they'd have to develop interfaces to allow it, not just manually poke
> > their MSI-X enable bit.
> > 
> > > >  We should
> > > > follow the rules, play be the existing kernel interfaces, and work to
> > > > eventually improve those interfaces.  Thanks,
> > > > 
> > > > Alex
> > > 
> > > I'm not against adding an interface for this long term but we have
> > > existing kernels to support too.  IMHO it would be nicer than
> > > the data hack which relies on non-documented guest behaviour
> > > that might change without warning in the future.
> > 
> > We've unwittingly used the data hack for years and only ripped it out
> > because it was undocumented.  The patch below adds documentation for it,
> > so at least we have a more clear understanding of why it was there if we
> > want to try to rip it out again.  This fully supports existing kernels
> > and as I mention below, we might be able to do better with limiting how
> > many vectors we enabled, but I think this is the right initial fix and
> > right fix for stable and we can continue to experiment from here.
> 
> Happy new year.  I'd like to close on this as we do currently have a
> regression for devices that cannot handle MSI-X being lazily enabled.
> The option here is to document and revert to the old style
> initialization behavior where we look at the data field of the vector to
> get a hint whether the guest intends to make use of the vector.  This
> gives us the same behavior as we had previously, but still allows
> vectors to be added, so we maintain the current FreeBSD support.  This
> much needs to go to stable.

By the way, could you clarify what exactly happens with FreeBSD?


> For the development tree, I think we can do better.  Using the data
> field is not 100% reliable in giving us the number of vectors the guest
> actually intends to use.  Instead we'd like to enable MSI-X with no
> vectors and add vectors as the guest unmasks them.  The host Linux MSI
> API currently doesn't allow this, so I think the next best thing is to
> enable MSI-X with a single vector in the case where MSI-X is enabled but
> no vectors are unmasked.  This conserves vectors on the host though we
> do potentially allow spurious interrupts through the enabled vector
> (though we previously enabled multiple vectors using the above data
> method without problems).

Hmm I think this can lose interrupts if they are sent before
there is a handler. The data hack only sends interrupts when
there's a handler so it's safe.

> The alternative that you're proposing to this longer term solution is to
> manually mask all vectors in the physical MSI-X vector table from
> userspace

This masking is not necessary I think: all vectors are masked
after reset IIRC.

> then manually enable MSI-X on the physical device (through
> pci-sysfs resource and config access respectively).  This puts the
> physical device is a state that better matches the guest view of the
> devices, but I'm doubtful that the risk is worth the reward.  This adds
>

Re: [Qemu-devel] [PATCHv2] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Alexander Graf


On 06.01.2013, at 14:09, Gleb Natapov wrote:

> Big real mode is fully emulated by KVM now, so if control is passed to
> the loaded kernel while one of the segment registers is in big real
> mode all the real mode part of the Linux start up is emulated. This
> slows boot process down. Fix that by resetting ES limit to 0x before
> jumping into the kernel.
> 
> The patch also removes unused code segment definition from GDT and
> changes
> ES register to be 16bit in protected mode since CS stays 16bit too and
> it is CS segment that determines effective operands and addresses
> length.
> 
> Signed-off-by: Gleb Natapov 

Reviewed-by: Alexander Graf 


Alex

> ---
> v1->v2
>  - add patch for binary linuxboot.bin
> 
> diff --git a/pc-bios/linuxboot.bin b/pc-bios/linuxboot.bin
> index 
> e7c36694f997c3c34f7f4af3c2923bd2ef6094e7..435cac4ebff3fcd83cab4bf74de11f7071ab5aa6
>  100644
> GIT binary patch
> delta 72
> zcmZqRXyBNj#oWTwIZ^izW6s78X^grY3=9l?2hw%`DF%L}13f4D4!pei7sLaB|Nnu+
> PBpBVlxtghlk#QmbUy&EH
> 
> delta 68
> zcmZqRXyBNj#azSGI8pZyW6H)4X^g52K)|nbpyx#2ftL^ef_NbC|38qJbsj=bI={J@
> NsfKa#1||;1e*hia7ytkO
> 
> diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
> index 748c831..afe39a5 100644
> --- a/pc-bios/optionrom/linuxboot.S
> +++ b/pc-bios/optionrom/linuxboot.S
> @@ -101,18 +101,20 @@ copy_kernel:
>   mov $1, %eax
>   mov %eax, %cr0
> 
> - /* So we can set ES to a 32-bit segment */
> + /* So we can enlarge ES segment limit */
>   mov $0x10, %eax
>   mov %eax, %es
> 
> - /* We're now running in 16-bit CS, but 32-bit ES! */
> -
>   /* Load kernel and initrd */
>   read_fw_blob_addr32(FW_CFG_KERNEL)
>   read_fw_blob_addr32(FW_CFG_INITRD)
>   read_fw_blob_addr32(FW_CFG_CMDLINE)
>   read_fw_blob_addr32(FW_CFG_SETUP)
> 
> + /* Do not leave ES in big real mode  */
> + mov $0x08, %eax
> + mov %eax, %es
> +
>   /* And now jump into Linux! */
>   mov $0, %eax
>   mov %eax, %cr0
> @@ -130,10 +132,10 @@ gdt:
>   /* 0x00 */
> .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 
> - /* 0x08: code segment (base=0, limit=0xf, type=32bit code 
> exec/read, DPL=0, 4k) */
> -.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00
> + /* 0x08: data segment (base=0, limit=0x, type=16bit data 
> read/write, DPL=0, 4k) */
> +.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x00, 0x00
> 
> - /* 0x10: data segment (base=0, limit=0xf, type=32bit data 
> read/write, DPL=0, 4k) */
> -.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00
> + /* 0x10: data segment (base=0, limit=0xf, type=16bit data 
> read/write, DPL=0, 4k) */
> +.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x8f, 0x00
> 
> BOOT_ROM_END
> --
>   Gleb.

Re: [Qemu-devel] [PATCH] virtio-pci: replace byte swap hack

2013-01-06 Thread Alexander Graf


On 30.12.2012, at 13:55, Blue Swirl wrote:

> Remove byte swaps by declaring the config space
> as native endian.

This is wrong. Virtio-pci config space is split into 2 regions. One with native 
endianness, the other one with little endian.


Alex

> 
> Signed-off-by: Blue Swirl 
> ---
> exec.c  |   18 --
> hw/virtio-pci.c |   17 +
> 2 files changed, 1 insertions(+), 34 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index a6923ad..140eb56 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2587,24 +2587,6 @@ int cpu_memory_rw_debug(CPUArchState *env, 
> target_ulong addr,
> }
> #endif
> 
> -#if !defined(CONFIG_USER_ONLY)
> -
> -/*
> - * A helper function for the _utterly broken_ virtio device model to find 
> out if
> - * it's running on a big endian machine. Don't do this at home kids!
> - */
> -bool virtio_is_big_endian(void);
> -bool virtio_is_big_endian(void)
> -{
> -#if defined(TARGET_WORDS_BIGENDIAN)
> -return true;
> -#else
> -return false;
> -#endif
> -}
> -
> -#endif
> -
> #ifndef CONFIG_USER_ONLY
> bool cpu_physical_memory_is_io(hwaddr phys_addr)
> {
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index d2d2454..df78a3b 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -92,9 +92,6 @@
>  */
> #define wmb() do { } while (0)
> 
> -/* HACK for virtio to determine if it's running a big endian guest */
> -bool virtio_is_big_endian(void);
> -
> /* virtio device */
> 
> static void virtio_pci_notify(void *opaque, uint16_t vector)
> @@ -390,15 +387,9 @@ static uint64_t virtio_pci_config_read(void *opaque, 
> hwaddr addr,
> break;
> case 2:
> val = virtio_config_readw(proxy->vdev, addr);
> -if (virtio_is_big_endian()) {
> -val = bswap16(val);
> -}
> break;
> case 4:
> val = virtio_config_readl(proxy->vdev, addr);
> -if (virtio_is_big_endian()) {
> -val = bswap32(val);
> -}
> break;
> }
> return val;
> @@ -423,15 +414,9 @@ static void virtio_pci_config_write(void *opaque, hwaddr 
> addr,
> virtio_config_writeb(proxy->vdev, addr, val);
> break;
> case 2:
> -if (virtio_is_big_endian()) {
> -val = bswap16(val);
> -}
> virtio_config_writew(proxy->vdev, addr, val);
> break;
> case 4:
> -if (virtio_is_big_endian()) {
> -val = bswap32(val);
> -}
> virtio_config_writel(proxy->vdev, addr, val);
> break;
> }
> @@ -444,7 +429,7 @@ static const MemoryRegionOps virtio_pci_config_ops = {
> .min_access_size = 1,
> .max_access_size = 4,
> },
> -.endianness = DEVICE_LITTLE_ENDIAN,
> +.endianness = DEVICE_NATIVE_ENDIAN,
> };
> 
> static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
> -- 
> 1.7.2.5
> 
>

[Qemu-devel] [PATCHv2] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Gleb Natapov

Big real mode is fully emulated by KVM now, so if control is passed to
the loaded kernel while one of the segment registers is in big real
mode all the real mode part of the Linux start up is emulated. This
slows boot process down. Fix that by resetting ES limit to 0x before
jumping into the kernel.

The patch also removes unused code segment definition from GDT and
changes
ES register to be 16bit in protected mode since CS stays 16bit too and
it is CS segment that determines effective operands and addresses
length.

Signed-off-by: Gleb Natapov 
---
 v1->v2
  - add patch for binary linuxboot.bin

diff --git a/pc-bios/linuxboot.bin b/pc-bios/linuxboot.bin
index 
e7c36694f997c3c34f7f4af3c2923bd2ef6094e7..435cac4ebff3fcd83cab4bf74de11f7071ab5aa6
 100644
GIT binary patch
delta 72
zcmZqRXyBNj#oWTwIZ^izW6s78X^grY3=9l?2hw%`DF%L}13f4D4!pei7sLaB|Nnu+
PBpBVlxtghlk#QmbUy&EH

delta 68
zcmZqRXyBNj#azSGI8pZyW6H)4X^g52K)|nbpyx#2ftL^ef_NbC|38qJbsj=bI={J@
NsfKa#1||;1e*hia7ytkO

diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index 748c831..afe39a5 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -101,18 +101,20 @@ copy_kernel:
mov $1, %eax
mov %eax, %cr0
 
-   /* So we can set ES to a 32-bit segment */
+   /* So we can enlarge ES segment limit */
mov $0x10, %eax
mov %eax, %es
 
-   /* We're now running in 16-bit CS, but 32-bit ES! */
-
/* Load kernel and initrd */
read_fw_blob_addr32(FW_CFG_KERNEL)
read_fw_blob_addr32(FW_CFG_INITRD)
read_fw_blob_addr32(FW_CFG_CMDLINE)
read_fw_blob_addr32(FW_CFG_SETUP)
 
+   /* Do not leave ES in big real mode  */
+   mov $0x08, %eax
+   mov %eax, %es
+
/* And now jump into Linux! */
mov $0, %eax
mov %eax, %cr0
@@ -130,10 +132,10 @@ gdt:
/* 0x00 */
 .byte  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 
-   /* 0x08: code segment (base=0, limit=0xf, type=32bit code 
exec/read, DPL=0, 4k) */
-.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00
+   /* 0x08: data segment (base=0, limit=0x, type=16bit data 
read/write, DPL=0, 4k) */
+.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x00, 0x00
 
-   /* 0x10: data segment (base=0, limit=0xf, type=32bit data 
read/write, DPL=0, 4k) */
-.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00
+   /* 0x10: data segment (base=0, limit=0xf, type=16bit data 
read/write, DPL=0, 4k) */
+.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x8f, 0x00
 
 BOOT_ROM_END
--
Gleb.

Re: [Qemu-devel] [PATCH] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Alexander Graf


On 06.01.2013, at 13:56, Gleb Natapov wrote:

> On Sun, Jan 06, 2013 at 01:47:17PM +0100, Alexander Graf wrote:
>> 
>> On 06.01.2013, at 13:36, Gleb Natapov wrote:
>> 
>>> Big real mode is fully emulated by KVM now, so if control is passed to
>>> the loaded kernel while one of the segment registers is in big real
>>> mode all the real mode part of the Linux start up is emulated. This
>>> slows boot process down. Fix that by resetting ES limit to 0x before
>>> jumping into the kernel.
>>> 
>>> The patch also removes unused code segment definition from GDT and changes
>>> ES register to be 16bit in protected mode since CS stays 16bit too and
>>> it is CS segment that determines effective operands and addresses length.
>>> 
>>> Signed-off-by: Gleb Natapov 
>> 
>> I assume you tested that things still work, so the changes look ok to me.
>> 
> Loaded kernel with kvm and tcg using -kernel/-initrd.

Good. No objections from my side then.

>> However, this patch should also include a binary patch to 
>> pc-bios/linuxboot.bin, since linuxboot.S doesn't get compiled into the .bin 
>> form on every machine.
>> 
>> 
>> Alex
> OK, something like this?:

Looks like a binary patch, yes :).


Alex

Re: [Qemu-devel] [PATCH] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Gleb Natapov

On Sun, Jan 06, 2013 at 01:47:17PM +0100, Alexander Graf wrote:
> 
> On 06.01.2013, at 13:36, Gleb Natapov wrote:
> 
> > Big real mode is fully emulated by KVM now, so if control is passed to
> > the loaded kernel while one of the segment registers is in big real
> > mode all the real mode part of the Linux start up is emulated. This
> > slows boot process down. Fix that by resetting ES limit to 0x before
> > jumping into the kernel.
> > 
> > The patch also removes unused code segment definition from GDT and changes
> > ES register to be 16bit in protected mode since CS stays 16bit too and
> > it is CS segment that determines effective operands and addresses length.
> > 
> > Signed-off-by: Gleb Natapov 
> 
> I assume you tested that things still work, so the changes look ok to me.
> 
Loaded kernel with kvm and tcg using -kernel/-initrd.

> However, this patch should also include a binary patch to 
> pc-bios/linuxboot.bin, since linuxboot.S doesn't get compiled into the .bin 
> form on every machine.
> 
> 
> Alex
OK, something like this?:


diff --git a/pc-bios/linuxboot.bin b/pc-bios/linuxboot.bin
index 
e7c36694f997c3c34f7f4af3c2923bd2ef6094e7..435cac4ebff3fcd83cab4bf74de11f7071ab5aa6
 100644
GIT binary patch
delta 72
zcmZqRXyBNj#oWTwIZ^izW6s78X^grY3=9l?2hw%`DF%L}13f4D4!pei7sLaB|Nnu+
PBpBVlxtghlk#QmbUy&EH

delta 68
zcmZqRXyBNj#azSGI8pZyW6H)4X^g52K)|nbpyx#2ftL^ef_NbC|38qJbsj=bI={J@
NsfKa#1||;1e*hia7ytkO



> 
> > diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
> > index 748c831..afe39a5 100644
> > --- a/pc-bios/optionrom/linuxboot.S
> > +++ b/pc-bios/optionrom/linuxboot.S
> > @@ -101,18 +101,20 @@ copy_kernel:
> > mov $1, %eax
> > mov %eax, %cr0
> > 
> > -   /* So we can set ES to a 32-bit segment */
> > +   /* So we can enlarge ES segment limit */
> > mov $0x10, %eax
> > mov %eax, %es
> > 
> > -   /* We're now running in 16-bit CS, but 32-bit ES! */
> > -
> > /* Load kernel and initrd */
> > read_fw_blob_addr32(FW_CFG_KERNEL)
> > read_fw_blob_addr32(FW_CFG_INITRD)
> > read_fw_blob_addr32(FW_CFG_CMDLINE)
> > read_fw_blob_addr32(FW_CFG_SETUP)
> > 
> > +   /* Do not leave ES in big real mode  */
> > +   mov $0x08, %eax
> > +   mov %eax, %es
> > +
> > /* And now jump into Linux! */
> > mov $0, %eax
> > mov %eax, %cr0
> > @@ -130,10 +132,10 @@ gdt:
> > /* 0x00 */
> > .byte   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> > 
> > -   /* 0x08: code segment (base=0, limit=0xf, type=32bit code 
> > exec/read, DPL=0, 4k) */
> > -.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00
> > +   /* 0x08: data segment (base=0, limit=0x, type=16bit data 
> > read/write, DPL=0, 4k) */
> > +.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x00, 0x00
> > 
> > -   /* 0x10: data segment (base=0, limit=0xf, type=32bit data 
> > read/write, DPL=0, 4k) */
> > -.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00
> > +   /* 0x10: data segment (base=0, limit=0xf, type=16bit data 
> > read/write, DPL=0, 4k) */
> > +.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x8f, 0x00
> > 
> > BOOT_ROM_END
> > --
> > Gleb.

--
Gleb.

Re: [Qemu-devel] [PATCH] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Alexander Graf


On 06.01.2013, at 13:36, Gleb Natapov wrote:

> Big real mode is fully emulated by KVM now, so if control is passed to
> the loaded kernel while one of the segment registers is in big real
> mode all the real mode part of the Linux start up is emulated. This
> slows boot process down. Fix that by resetting ES limit to 0x before
> jumping into the kernel.
> 
> The patch also removes unused code segment definition from GDT and changes
> ES register to be 16bit in protected mode since CS stays 16bit too and
> it is CS segment that determines effective operands and addresses length.
> 
> Signed-off-by: Gleb Natapov 

I assume you tested that things still work, so the changes look ok to me.

However, this patch should also include a binary patch to 
pc-bios/linuxboot.bin, since linuxboot.S doesn't get compiled into the .bin 
form on every machine.


Alex

> diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
> index 748c831..afe39a5 100644
> --- a/pc-bios/optionrom/linuxboot.S
> +++ b/pc-bios/optionrom/linuxboot.S
> @@ -101,18 +101,20 @@ copy_kernel:
>   mov $1, %eax
>   mov %eax, %cr0
> 
> - /* So we can set ES to a 32-bit segment */
> + /* So we can enlarge ES segment limit */
>   mov $0x10, %eax
>   mov %eax, %es
> 
> - /* We're now running in 16-bit CS, but 32-bit ES! */
> -
>   /* Load kernel and initrd */
>   read_fw_blob_addr32(FW_CFG_KERNEL)
>   read_fw_blob_addr32(FW_CFG_INITRD)
>   read_fw_blob_addr32(FW_CFG_CMDLINE)
>   read_fw_blob_addr32(FW_CFG_SETUP)
> 
> + /* Do not leave ES in big real mode  */
> + mov $0x08, %eax
> + mov %eax, %es
> +
>   /* And now jump into Linux! */
>   mov $0, %eax
>   mov %eax, %cr0
> @@ -130,10 +132,10 @@ gdt:
>   /* 0x00 */
> .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 
> - /* 0x08: code segment (base=0, limit=0xf, type=32bit code 
> exec/read, DPL=0, 4k) */
> -.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00
> + /* 0x08: data segment (base=0, limit=0x, type=16bit data 
> read/write, DPL=0, 4k) */
> +.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x00, 0x00
> 
> - /* 0x10: data segment (base=0, limit=0xf, type=32bit data 
> read/write, DPL=0, 4k) */
> -.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00
> + /* 0x10: data segment (base=0, limit=0xf, type=16bit data 
> read/write, DPL=0, 4k) */
> +.byte0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x8f, 0x00
> 
> BOOT_ROM_END
> --
>   Gleb.

[Qemu-devel] [PATCH] linuxboot optionrom: do not jump into loaded kernel in a big real mode

2013-01-06 Thread Gleb Natapov

Big real mode is fully emulated by KVM now, so if control is passed to
the loaded kernel while one of the segment registers is in big real
mode all the real mode part of the Linux start up is emulated. This
slows boot process down. Fix that by resetting ES limit to 0x before
jumping into the kernel.

The patch also removes unused code segment definition from GDT and changes
ES register to be 16bit in protected mode since CS stays 16bit too and
it is CS segment that determines effective operands and addresses length.

Signed-off-by: Gleb Natapov 
diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index 748c831..afe39a5 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -101,18 +101,20 @@ copy_kernel:
mov $1, %eax
mov %eax, %cr0
 
-   /* So we can set ES to a 32-bit segment */
+   /* So we can enlarge ES segment limit */
mov $0x10, %eax
mov %eax, %es
 
-   /* We're now running in 16-bit CS, but 32-bit ES! */
-
/* Load kernel and initrd */
read_fw_blob_addr32(FW_CFG_KERNEL)
read_fw_blob_addr32(FW_CFG_INITRD)
read_fw_blob_addr32(FW_CFG_CMDLINE)
read_fw_blob_addr32(FW_CFG_SETUP)
 
+   /* Do not leave ES in big real mode  */
+   mov $0x08, %eax
+   mov %eax, %es
+
/* And now jump into Linux! */
mov $0, %eax
mov %eax, %cr0
@@ -130,10 +132,10 @@ gdt:
/* 0x00 */
 .byte  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
 
-   /* 0x08: code segment (base=0, limit=0xf, type=32bit code 
exec/read, DPL=0, 4k) */
-.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00
+   /* 0x08: data segment (base=0, limit=0x, type=16bit data 
read/write, DPL=0, 4k) */
+.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x00, 0x00
 
-   /* 0x10: data segment (base=0, limit=0xf, type=32bit data 
read/write, DPL=0, 4k) */
-.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00
+   /* 0x10: data segment (base=0, limit=0xf, type=16bit data 
read/write, DPL=0, 4k) */
+.byte  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0x8f, 0x00
 
 BOOT_ROM_END
--
Gleb.

Re: [Qemu-devel] [PATCH qom-cpu 01/11] target-i386: Don't set any KVM flag by default if KVM is disabled

2013-01-06 Thread Gleb Natapov

On Fri, Jan 04, 2013 at 08:01:02PM -0200, Eduardo Habkost wrote:
> This is a cleanup that tries to solve two small issues:
> 
>  - We don't need a separate kvm_pv_eoi_features variable just to keep a
>constant calculated at compile-time, and this style would require
>adding a separate variable (that's declared twice because of the
>CONFIG_KVM ifdef) for each feature that's going to be enabled/disable
>by machine-type compat code.
>  - The pc-1.3 code is setting the kvm_pv_eoi flag on cpuid_kvm_features
>even when KVM is disabled at runtime. This small incosistency in
>the cpuid_kvm_features field isn't a problem today because
>cpuid_kvm_features is ignored by the TCG code, but it may cause
>unexpected problems later when refactoring the CPUID handling code.
> 
> This patch eliminates the kvm_pv_eoi_features variable and simply uses
> CONFIG_KVM and kvm_enabled() inside the enable_kvm_pv_eoi() compat
> function, so it enables kvm_pv_eoi only if KVM is enabled. I believe
> this makes the behavior of enable_kvm_pv_eoi() clearer and easier to
> understand.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Cc: k...@vger.kernel.org
> Cc: Michael S. Tsirkin 
> Cc: Gleb Natapov 
> Cc: Marcelo Tosatti 
> 
> Changes v2:
>  - Coding style fix
> ---
>  target-i386/cpu.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 82685dc..e6435da 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -145,15 +145,17 @@ static uint32_t kvm_default_features = (1 << 
> KVM_FEATURE_CLOCKSOURCE) |
>  (1 << KVM_FEATURE_ASYNC_PF) |
>  (1 << KVM_FEATURE_STEAL_TIME) |
>  (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT);
> -static const uint32_t kvm_pv_eoi_features = (0x1 << KVM_FEATURE_PV_EOI);
>  #else
>  static uint32_t kvm_default_features = 0;
> -static const uint32_t kvm_pv_eoi_features = 0;
>  #endif
>  
>  void enable_kvm_pv_eoi(void)
>  {
> -kvm_default_features |= kvm_pv_eoi_features;
> +#ifdef CONFIG_KVM
You do not need ifdef here.

> +if (kvm_enabled()) {
> +kvm_default_features |= (1UL << KVM_FEATURE_PV_EOI);
> +}
> +#endif
>  }
>  
>  void host_cpuid(uint32_t function, uint32_t count,
> -- 
> 1.7.11.7

--
Gleb.

Re: [Qemu-devel] QEMU build fails with Clang?

2013-01-06 Thread Blue Swirl

On Sun, Jan 6, 2013 at 9:40 AM, Brad Smith  wrote:
> On Sat, Jan 05, 2013 at 04:51:01PM +, Blue Swirl wrote:
>> On Sat, Jan 5, 2013 at 1:48 AM, Brad Smith  wrote:
>> > Supposedly QEMU is able to build with Clang and yet trying to do so
>> > I am seeing the build fail as follows..
>> >
>> > Comments?
>>
>> Clang does not support this kind of assembly code yet. The attached
>> patch avoids this by using 'as' and 'cpp' for .S files, please try. It
>> could still fail if the 'as' does not come from binutils.
>
> Could this please use $CC -E instead unless overridden? It just
> makes it easier if I only have to override CC and not CPP as well
> if using LLVM/Clang or even say another copy of GCC and have it
> use the appropriate C preprocessor.

I've just posted to the list an updated version using $cc -E.

>
> BTW, there are some interesting warnings spit out while compiling
> with Clang and most of them look to be legit issues with the code.

I'm using this to get a warning free build:
CFLAGS=-Wno-unused-value -Wno-initializer-overrides
-Wno-constant-conversion -Wno-unneeded-internal-declaration

Not all warnings are very interesting to fix, for example initializer
override warnings comes from somewhat useful shorthand in defining
tables and the only warning from -Wconstant-conversion in ARM code
looks like a bug in Clang.

But there could be additional Clang warning flags besides the defaults
that could be useful to enable, using -Weverything (or what was it)
produced some interesting warnings.

>
> QEMU builds and works fine with Clang with this patch applied.
>
>> >
>> >
>> > gmake[1]: Entering directory `/home/brad/qemu/pc-bios/optionrom'
>> > clang -I. -I/home/brad/qemu -I/home/brad/qemu/include 
>> > -I/home/brad/qemu/libcacard -Wall -Wstrict-prototypes -Werror 
>> > -fomit-frame-pointer -fno-builtin -I/home/brad/qemu   -fno-stack-protector 
>> > -MMD -MP -MT multiboot.o -MF ./multiboot.d -Wall -Wstrict-prototypes 
>> > -Werror -fomit-frame-pointer -fno-builtin -I/home/brad/qemu   
>> > -fno-stack-protector -c -o multiboot.o multiboot.S
>> > multiboot.S:31:1: error: unexpected directive .code16
>> > .code16; .text; .global _start; _start:; .short 0xaa55; .byte (_end - 
>> > _start) / 512; lret; .org 0x18; .short 0; .short _pnph; _pnph: .ascii 
>> > "$PnP"; .byte 0x01; .byte ( _pnph_len / 16 ); .short 0x; .byte 0x00; 
>> > .byte 0x00; .long 0x; .short _manufacturer; .short _product; .long 
>> > 0x; .short 0x; .short 0x; .short _bev; .short 0x; 
>> > .short 0x; .equ _pnph_len, . - _pnph; _bev:; movw %cs, %ax; movw %ax, 
>> > %ds;
>> > ^
>> > multiboot.S:31:8: error: .code16 not supported yet
>> > .code16; .text; .global _start; _start:; .short 0xaa55; .byte (_end - 
>> > _start) / 512; lret; .org 0x18; .short 0; .short _pnph; _pnph: .ascii 
>> > "$PnP"; .byte 0x01; .byte ( _pnph_len / 16 ); .short 0x; .byte 0x00; 
>> > .byte 0x00; .long 0x; .short _manufacturer; .short _product; .long 
>> > 0x; .short 0x; .short 0x; .short _bev; .short 0x; 
>> > .short 0x; .equ _pnph_len, . - _pnph; _bev:; movw %cs, %ax; movw %ax, 
>> > %ds;
>> >^
>> > multiboot.S:71:135: error: unknown directive
>> >  read_fw 0x0a; mov %eax, %edi; read_fw 0x0b; mov %eax, %ecx; mov $0x12, 
>> > %ax; mov $0x510, %edx; outw %ax, (%dx); mov $0x511, %dx; cld; .dc.b 
>> > 0xf3,0x6c
>> >
>> >^
>> > multiboot.S:102:2: error: unknown directive
>> >  .dc.b 0x26,0x67,0x66,0x89,0x4f,0xfc
>> >  ^
>> > multiboot.S:130:17: error: unexpected token in argument list
>> >  data32 lgdt %gs:6
>> > ^
>> > multiboot.S:138:15: error: unknown token in expression
>> >  data32 ljmp *%gs:0
>> >   ^
>> > multiboot.S:152:135: error: unknown directive
>> >  read_fw 0x07; mov %eax, %edi; read_fw 0x08; mov %eax, %ecx; mov $0x11, 
>> > %ax; mov $0x510, %edx; outw %ax, (%dx); mov $0x511, %dx; cld; .dc.b 
>> > 0xf3,0x6c
>> >
>> >^
>> > gmake[1]: *** [multiboot.o] Error 1
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>

[Qemu-devel] [PATCH] optionrom: build with discrete CPP and AS steps

2013-01-06 Thread Blue Swirl

Build option ROM .S files with separate preprocessor and
assembler steps because the C compiler could be unsuitable.

Signed-off-by: Blue Swirl 
---
 configure |7 ++-
 rules.mak |7 +--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index fe18ed2..8169a00 100755
--- a/configure
+++ b/configure
@@ -264,6 +264,8 @@ else
 fi
 
 ar="${AR-${cross_prefix}ar}"
+as="${AS-${cross_prefix}as}"
+cpp="${CPP-$cc -E}"
 objcopy="${OBJCOPY-${cross_prefix}objcopy}"
 ld="${LD-${cross_prefix}ld}"
 libtool="${LIBTOOL-${cross_prefix}libtool}"
@@ -3726,6 +3728,8 @@ echo "CC_I386=$cc_i386" >> $config_host_mak
 echo "HOST_CC=$host_cc" >> $config_host_mak
 echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
+echo "AS=$as" >> $config_host_mak
+echo "CPP=$cpp" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
 echo "LD=$ld" >> $config_host_mak
 echo "WINDRES=$windres" >> $config_host_mak
@@ -4277,9 +4281,10 @@ for rom in seabios vgabios ; do
 config_mak=roms/$rom/config.mak
 echo "# Automatically generated by configure - do not modify" > $config_mak
 echo "SRC_PATH=$source_path/roms/$rom" >> $config_mak
+echo "AS=$as" >> $config_mak
 echo "CC=$cc" >> $config_mak
 echo "BCC=bcc" >> $config_mak
-echo "CPP=${cross_prefix}cpp" >> $config_mak
+echo "CPP=$cpp" >> $config_mak
 echo "OBJCOPY=objcopy" >> $config_mak
 echo "IASL=iasl" >> $config_mak
 echo "LD=$ld" >> $config_mak
diff --git a/rules.mak b/rules.mak
index 8448b94..fe0c881 100644
--- a/rules.mak
+++ b/rules.mak
@@ -28,8 +28,11 @@ else
$(call quiet-command,$(LIBTOOL) --mode=compile --quiet --tag=CC $(CC) 
$(QEMU_INCLUDES) $(QEMU_CFLAGS) $(QEMU_DGFLAGS) $(CFLAGS) -c -o $@ $<,"  lt CC 
$@")
 endif
 
-%.o: %.S
-   $(call quiet-command,$(CC) $(QEMU_INCLUDES) $(QEMU_CFLAGS) 
$(QEMU_DGFLAGS) $(CFLAGS) -c -o $@ $<,"  AS$(TARGET_DIR)$@")
+%.asm: %.S
+   $(call quiet-command,$(CPP) $(QEMU_INCLUDES) $(QEMU_CFLAGS) 
$(QEMU_DGFLAGS) $(CFLAGS) -o $@ $<,"  CPP   $(TARGET_DIR)$@")
+
+%.o: %.asm
+   $(call quiet-command,$(AS) $(ASFLAGS) -o $@ $<,"  AS
$(TARGET_DIR)$@")
 
 %.o: %.m
$(call quiet-command,$(OBJCC) $(QEMU_INCLUDES) $(QEMU_CFLAGS) 
$(QEMU_DGFLAGS) $(CFLAGS) -c -o $@ $<,"  OBJC  $(TARGET_DIR)$@")
-- 
1.7.2.5

[Qemu-devel] [PATCH 3/3] qga: add guest-set-time command

2013-01-06 Thread Lei Li

Signed-off-by: Lei Li 
---
 qga/commands-posix.c |   57 ++
 qga/qapi-schema.json |   32 
 2 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 190199d..7fff49a 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -121,6 +121,63 @@ struct HostTimeInfo *qmp_guest_get_time(Error **errp)
 return host_time;
 }
 
+void qmp_guest_set_time(bool has_seconds, int64_t seconds,
+bool has_microseconds, int64_t microseconds,
+bool has_utc_offset, int64_t utc_offset, Error **errp)
+{
+int ret;
+int status;
+pid_t pid, rpid;
+struct timeval tv;
+HostTimeInfo *host_time;
+
+if ((!has_seconds) && (!has_microseconds) && (!has_utc_offset)) {
+host_time = get_host_time();
+if (!host_time) {
+error_set(errp, QERR_QGA_COMMAND_FAILED, "Failed to set guest 
time");
+return;
+}
+tv.tv_sec = host_time->seconds;
+tv.tv_usec = host_time->microseconds;
+} else if (has_seconds && has_microseconds && has_utc_offset) {
+tv.tv_sec = (time_t) seconds + utc_offset;
+tv.tv_usec = (time_t) microseconds;
+} else {
+error_set(errp, QERR_INVALID_PARAMETER, "parameter missing");
+return;
+}
+
+ret = settimeofday(&tv, NULL);
+if (ret < 0) {
+error_set(errp, QERR_QGA_COMMAND_FAILED, strerror(errno));
+return;
+}
+
+/* Set the Hardware Clock to the current System Time. */
+pid = fork();
+if (pid == 0) {
+setsid();
+reopen_fd_to_null(0);
+reopen_fd_to_null(1);
+reopen_fd_to_null(2);
+
+execle("/sbin/hwclock", "hwclock", "-w", NULL, environ);
+_exit(EXIT_FAILURE);
+} else if (pid < 0) {
+goto exit_err;
+}
+
+do {
+rpid = waitpid(pid, &status, 0);
+} while (rpid == -1 && errno == EINTR);
+if (rpid == pid && WIFEXITED(status) && !WEXITSTATUS(status)) {
+return;
+}
+
+exit_err:
+error_set(errp, QERR_UNDEFINED_ERROR);
+}
+
 typedef struct GuestFileHandle {
 uint64_t id;
 FILE *fh;
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 4a8b93c..4649b55 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -117,6 +117,38 @@
   'returns': 'HostTimeInfo' }
 
 ##
+# @guest-set-time:
+#
+# Set guest time. If none arguments were given, will set
+# host time to guest.
+#
+# Right now, when a guest is paused or migrated to a file
+# then loaded from that file, the guest OS has no idea that
+# there was a big gap in the time. Depending on how long
+# the gap was, NTP might not be able to resynchronize the
+# guest.
+#
+# This command tries to set guest time based on the information
+# from host or an absolute value given by management app, and
+# set the Hardware Clock to the current System Time. This
+# will make it easier for a guest to resynchronize without
+# waiting for NTP.
+#
+# @seconds: #optional "seconds" time.
+#
+# @microseconds: #optional "microseconds" time.
+#
+# @utc-offset: #optional utc offset.
+#
+# Returns: Nothing on success.
+#
+# Since: 1.4
+##
+{ 'command': 'guest-set-time',
+  'data': { '*seconds': 'int', '*microseconds': 'int',
+'*utc-offset': 'int' } }
+
+##
 # @GuestAgentCommandInfo:
 #
 # Information about guest agent commands.
-- 
1.7.7.6

[Qemu-devel] [PATCH 2/3] qga: add guest-get-time command

2013-01-06 Thread Lei Li

Signed-off-by: Lei Li 
---
 qga/commands-posix.c |   12 
 qga/qapi-schema.json |   17 +
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 26b0fa0..190199d 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -109,6 +109,18 @@ static HostTimeInfo *get_host_time(void)
 return host_time;
 }
 
+struct HostTimeInfo *qmp_guest_get_time(Error **errp)
+{
+HostTimeInfo *host_time = get_host_time();
+
+if (!host_time) {
+error_set(errp, QERR_QGA_COMMAND_FAILED, "Failed to get host time");
+return NULL;
+}
+
+return host_time;
+}
+
 typedef struct GuestFileHandle {
 uint64_t id;
 FILE *fh;
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 7793aff..4a8b93c 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -100,6 +100,23 @@
  'utc-offset': 'int' } }
 
 ##
+# @guest-get-time:
+#
+# Get the information about host time in UTC and the
+# UTC offset.
+#
+# This command tries to get the host time which is
+# presumably correct, since need to be able to resynchronize
+# clock to host in guest.
+#
+# Returns: @HostTimeInfo on success.
+#
+# Since 1.4
+##
+{ 'command': 'guest-get-time',
+  'returns': 'HostTimeInfo' }
+
+##
 # @GuestAgentCommandInfo:
 #
 # Information about guest agent commands.
-- 
1.7.7.6

[Qemu-devel] [PATCH 1/3] qga: add support to get host time

2013-01-06 Thread Lei Li

Signed-off-by: Lei Li 
---
 qga/commands-posix.c |   18 ++
 qga/qapi-schema.json |   17 +
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index a657201..26b0fa0 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -91,6 +91,24 @@ exit_err:
 error_set(err, QERR_UNDEFINED_ERROR);
 }
 
+static HostTimeInfo *get_host_time(void)
+{
+int err;
+qemu_timeval tq;
+HostTimeInfo *host_time;
+
+err = qemu_gettimeofday(&tq);
+if (err < 0) {
+return NULL;
+}
+
+host_time = g_malloc0(sizeof(HostTimeInfo));
+host_time->seconds = tq.tv_sec;
+host_time->microseconds = tq.tv_usec;
+
+return host_time;
+}
+
 typedef struct GuestFileHandle {
 uint64_t id;
 FILE *fh;
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index ed0eb69..7793aff 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -83,6 +83,23 @@
 { 'command': 'guest-ping' }
 
 ##
+# @HostTimeInfo
+#
+# Information about host time.
+#
+# @seconds: "seconds" time from the host.
+#
+# @microseconds: "microseconds" time from the host.
+#
+# @utc-offset: information about utc offset.
+#
+# Since: 1.4
+##
+{ 'type': 'HostTimeInfo',
+  'data': { 'seconds': 'int', 'microseconds': 'int',
+ 'utc-offset': 'int' } }
+
+##
 # @GuestAgentCommandInfo:
 #
 # Information about guest agent commands.
-- 
1.7.7.6

[Qemu-devel] [RFC PATCH 0/3] Time resync support by qemu-ga

2013-01-06 Thread Lei Li

This patch series attempts to add time resync support
to qemu-ga by introducing qemu-ga commands guest-get-time
and guest-set-time.

Right now, when a guest is paused or migrated to a file
then loaded from that file, the guest OS has no idea that
there was a big gap in the time. Depending on how long the
gap was, NTP might not be able to resynchronize the guest.
So adding new guest-agent command that is called any time
a guest is resumed  and which tells the guest to update its
own wall clock time based on the information from the host
will make it easier for a guest to resynchronize without
waiting for NTP.

The previous RFC send for discussion and suggestion as link
here:

http://article.gmane.org/gmane.comp.emulators.qemu/186126

The interface for these commands like:

{ 'command': 'guest-get-time', 'returns': 'HostTimeInfo' }

{ 'command': 'guest-set-time',
  'data': { '*seconds': 'int', '*microseconds': 'int',
'*utc-offset': 'int' } }

TODO:
This is a RFC version with POSIX-specific command implemented.
I just test on Linux guest, will add win32-specific command to
support Windows guest later. Since I want to make sure if this
seems like the way we should be headed.

Your comments and suggestions are very welcome!

Lei Li (3):
  qga: add support to get host time
  qga: add guest-get-time command
  qga: add guest-set-time command

Re: [Qemu-devel] QEMU build fails with Clang?

2013-01-06 Thread Brad Smith

On Sat, Jan 05, 2013 at 04:51:01PM +, Blue Swirl wrote:
> On Sat, Jan 5, 2013 at 1:48 AM, Brad Smith  wrote:
> > Supposedly QEMU is able to build with Clang and yet trying to do so
> > I am seeing the build fail as follows..
> >
> > Comments?
> 
> Clang does not support this kind of assembly code yet. The attached
> patch avoids this by using 'as' and 'cpp' for .S files, please try. It
> could still fail if the 'as' does not come from binutils.

Could this please use $CC -E instead unless overridden? It just
makes it easier if I only have to override CC and not CPP as well
if using LLVM/Clang or even say another copy of GCC and have it
use the appropriate C preprocessor.

BTW, there are some interesting warnings spit out while compiling
with Clang and most of them look to be legit issues with the code.

QEMU builds and works fine with Clang with this patch applied.

> >
> >
> > gmake[1]: Entering directory `/home/brad/qemu/pc-bios/optionrom'
> > clang -I. -I/home/brad/qemu -I/home/brad/qemu/include 
> > -I/home/brad/qemu/libcacard -Wall -Wstrict-prototypes -Werror 
> > -fomit-frame-pointer -fno-builtin -I/home/brad/qemu   -fno-stack-protector 
> > -MMD -MP -MT multiboot.o -MF ./multiboot.d -Wall -Wstrict-prototypes 
> > -Werror -fomit-frame-pointer -fno-builtin -I/home/brad/qemu   
> > -fno-stack-protector -c -o multiboot.o multiboot.S
> > multiboot.S:31:1: error: unexpected directive .code16
> > .code16; .text; .global _start; _start:; .short 0xaa55; .byte (_end - 
> > _start) / 512; lret; .org 0x18; .short 0; .short _pnph; _pnph: .ascii 
> > "$PnP"; .byte 0x01; .byte ( _pnph_len / 16 ); .short 0x; .byte 0x00; 
> > .byte 0x00; .long 0x; .short _manufacturer; .short _product; .long 
> > 0x; .short 0x; .short 0x; .short _bev; .short 0x; 
> > .short 0x; .equ _pnph_len, . - _pnph; _bev:; movw %cs, %ax; movw %ax, 
> > %ds;
> > ^
> > multiboot.S:31:8: error: .code16 not supported yet
> > .code16; .text; .global _start; _start:; .short 0xaa55; .byte (_end - 
> > _start) / 512; lret; .org 0x18; .short 0; .short _pnph; _pnph: .ascii 
> > "$PnP"; .byte 0x01; .byte ( _pnph_len / 16 ); .short 0x; .byte 0x00; 
> > .byte 0x00; .long 0x; .short _manufacturer; .short _product; .long 
> > 0x; .short 0x; .short 0x; .short _bev; .short 0x; 
> > .short 0x; .equ _pnph_len, . - _pnph; _bev:; movw %cs, %ax; movw %ax, 
> > %ds;
> >^
> > multiboot.S:71:135: error: unknown directive
> >  read_fw 0x0a; mov %eax, %edi; read_fw 0x0b; mov %eax, %ecx; mov $0x12, 
> > %ax; mov $0x510, %edx; outw %ax, (%dx); mov $0x511, %dx; cld; .dc.b 
> > 0xf3,0x6c
> > 
> >   ^
> > multiboot.S:102:2: error: unknown directive
> >  .dc.b 0x26,0x67,0x66,0x89,0x4f,0xfc
> >  ^
> > multiboot.S:130:17: error: unexpected token in argument list
> >  data32 lgdt %gs:6
> > ^
> > multiboot.S:138:15: error: unknown token in expression
> >  data32 ljmp *%gs:0
> >   ^
> > multiboot.S:152:135: error: unknown directive
> >  read_fw 0x07; mov %eax, %edi; read_fw 0x08; mov %eax, %ecx; mov $0x11, 
> > %ax; mov $0x510, %edx; outw %ax, (%dx); mov $0x511, %dx; cld; .dc.b 
> > 0xf3,0x6c
> > 
> >   ^
> > gmake[1]: *** [multiboot.o] Error 1

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Re: [Qemu-devel] [PATCH] vfio-pci: Make host MSI-X enable track guest

2013-01-06 Thread Michael S. Tsirkin

On Fri, Dec 21, 2012 at 08:38:07AM -0700, Alex Williamson wrote:
> On Fri, 2012-12-21 at 14:21 +0200, Michael S. Tsirkin wrote:
> > On Thu, Dec 20, 2012 at 03:12:46PM -0700, Alex Williamson wrote:
> > > On Thu, 2012-12-20 at 18:36 +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Dec 20, 2012 at 09:06:41AM -0700, Alex Williamson wrote:
> > > > > Guests typically enable MSI-X with all of the vectors in the MSI-X
> > > > > vector table masked.  Only when the vector is enabled does the vector
> > > > > get unmasked, resulting in a vector_use callback.  These two points,
> > > > > enable and unmask, correspond to pci_enable_msix() and request_irq()
> > > > > for Linux guests.  Some drivers rely on VF/PF or PF/fw communication
> > > > > channels that expect the physical state of the device to match the
> > > > > guest visible state of the device.  They don't appreciate lazily
> > > > > enabling MSI-X on the physical device.
> > > > > 
> > > > > To solve this, enable MSI-X with a single vector when the MSI-X
> > > > > capability is enabled and immediate disable the vector.  This leaves
> > > > > the physical device in exactly the same state between host and guest.
> > > > > Furthermore, the brief gap where we enable vector 0, it fires into
> > > > > userspace, not KVM, so the guest doesn't get spurious interrupts.
> > > > > Ideally we could call VFIO_DEVICE_SET_IRQS with the right parameters
> > > > > to enable MSI-X with zero vectors, but this will currently return an
> > > > > error as the Linux MSI-X interfaces do not allow it.
> > > > > 
> > > > > Cc: qemu-sta...@nongnu.org
> > > > > Signed-off-by: Alex Williamson 
> > > > 
> > > > Do you need an interface for this?  Can you do low-level pci config
> > > > access instead?  I imagine you would then enable MSIX and mask all
> > > > vectors at the same time.
> > > > 
> > > > No?
> > > 
> > > I really don't like the idea of enabling MSI-X directly through config
> > > space.  We're just asking for ownership conflicts doing that.  In fact,
> > > vfio prevents MSI/X from being enabled through config space since those
> > > are controlled through ioctl.
> > 
> > For vfio the natural thing to do would be to add interfaces
> > to do this in a controlled manner.
> 
> The ioctl is the controlled manner.  The ioctl will actually accept
> enabling zero vectors, but you'll get an -ERANGE generated from
> pci_enable_msix, so I think the interface there is fine.

See comment below at the patch.

> > >  It also prevents access to the MSI-X
> > > vector table since userspace has no business reading or modifying it.
> > > Thanks,
> > > 
> > > Alex
> > 
> > I'm not sure what the point of this is. If a device can do DMA writes
> > there is no way to distinguish between them and MSI on the bus.
> > So this seems to buy us no additional security.
> 
> IOMMUs supporting interrupt remapping can.
> Do you propose we have one
> interface when interrupt remapping is present and another when it's not?
> I can't think of anything useful that userspace can do with direct
> access to MSIX setup, including this MSI-X enable stub.

I am simply pointing out that interrupts are not virtualized
well at the moment. The guest has access to the device
and that knows the real contents of the MSIX table.
Using device as a side channel, guest and device can detect
the difference between real and observed MSIX table content.
It might not matter on most devices but it would be more
robust if we made guest and device states match as closely
as possible.



> Do you have any actual objections to the patch below?  I agree that
> kernel interfaces can be improved and I'd prefer the ability to enable
> MSI-X with zero vectors and incrementally add and remove vectors, but
> creating side channels so userspace can poke around here is not an
> acceptable alternative imho.  We need a solution for users hitting this
> now and I'm pretty happy with how this solution works.

It's ugly but for an existing vfio I don't see a better solution, no.

>  I only wish we could make legacy assignment behave as cleanly, but
>  I'm not willing to poke MSI-X enable bits myself to make it happen.

With existing kernels I don't think there's a better option.

>  Thanks,
> 
> Alex
> > > > > ---
> > > > >  hw/vfio_pci.c |   31 +++
> > > > >  1 file changed, 27 insertions(+), 4 deletions(-)
> > > > > 
> > > > > VFIO makes this a bit cleaner, so I think this is both the stable and
> > > > > final fix here.
> > > > > 
> > > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > > index 7c27834..5178ccc 100644
> > > > > --- a/hw/vfio_pci.c
> > > > > +++ b/hw/vfio_pci.c
> > > > > @@ -561,8 +561,9 @@ static int vfio_enable_vectors(VFIODevice *vdev, 
> > > > > bool msix)
> > > > >  return ret;
> > > > >  }
> > > > >  
> > > > > -static int vfio_msix_vector_use(PCIDevice *pdev,
> > > > > -unsigned int nr, MSIMessage msg)
> > > > > +static int vfio_msix_vector_do_use(PCIDevice

64 matches

Mail list logo