date:20130628

[Qemu-devel] [Bug 1195882] Re: Make fails on Centos - can't find autoreconf

2013-06-28 Thread Eric Blake

> As  far as I know,  autoreconf does not exist in the red hat/fedora/centos 
> world.
> 
> Can't find a yum, rpm. or other distro for it on the above os.

You aren't looking very  hard; on my RHEL 6.4 box (which has the same
packages as what you can install in CentOS), I see:

$ rpm -qf `which autoreconf`
autoconf-2.63-5.1.el6.noarch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1195882

Title:
  Make fails on Centos - can't find autoreconf

Status in QEMU:
  New

Bug description:
[root@H002 qemu-1.4.2]# make
   
  \  GEN   i386-softmmu/config-devices.mak  
 
GEN   x86_64-softmmu/config-devices.mak 
 
GEN   alpha-softmmu/config-devices.mak  
 
GEN   arm-softmmu/config-devices.mak
 
GEN   cris-softmmu/config-devices.mak   
 
GEN   lm32-softmmu/config-devices.mak   

  (   )

  GEN   unicore32-linux-user/config-devices.mak
GEN   s390x-linux-user/config-devices.mak
GEN   config-all-devices.mak
GEN   config-host.h
  (cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
  /bin/sh: autoreconf: command not found
  make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

  *

  Exact same error for 1.5.1 build 
  So, qemu not supported on anything but Ubuntu?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1195882/+subscriptions

[Qemu-devel] [PATCH V5 5/7] monitor: support sub commands in auto completion

2013-06-28 Thread Wenchao Xia

This patch allow auto completion work normal for sub command case,
"info block [DEVICE]" can auto complete now, by re-enter the completion
function. Also, original "info" is treated as a special case, now it is
treated as a sub command group, global variable info_cmds is not used
in any more.

"help" command is still treated as a special case, since it is not a sub
command group but want to auto complete command in root command table.

parse_cmdline() takes another parameter now to tell next valid
parameter's position in original cmdline.

Signed-off-by: Wenchao Xia 
---
 monitor.c |   62 +---
 1 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/monitor.c b/monitor.c
index bc62fc7..2f5b91d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -801,9 +801,24 @@ static int get_str(char *buf, int buf_size, const char 
**pp)
 
 #define MAX_ARGS 16
 
-/* NOTE: this parser is an approximate form of the real command parser */
+/*
+ * Parse the command line to get valid args.
+ * @cmdline: command line to be parsed.
+ * @pnb_args: location to store the number of args, must NOT be NULL.
+ * @args: location to store the args, which should be freed by caller, must
+ *NOT be NULL.
+ * @args_cmdline: location to store char position in @cmdline correspond to
+ *@args include '\0', can be NULL.
+ *
+ * For example, if @cmdline = " ab" and @args_cmdline != NULL, result will be:
+ * pnb_args = 1, args[0] = "ab", args_cmdline[0] = "ab",
+ * args_cmdline[1] = "\0".
+ *
+ * NOTE: this parser is an approximate form of the real command parser. Number
+ *   of args have a limit of MAX_ARGS.
+ */
 static void parse_cmdline(const char *cmdline,
-  int *pnb_args, char **args)
+ int *pnb_args, char **args, const char **args_cmdline)
 {
 const char *p;
 int nb_args, ret;
@@ -811,14 +826,14 @@ static void parse_cmdline(const char *cmdline,
 
 p = cmdline;
 nb_args = 0;
-for (;;) {
+while (nb_args < MAX_ARGS) {
 while (qemu_isspace(*p)) {
 p++;
 }
-if (*p == '\0') {
-break;
+if (args_cmdline) {
+args_cmdline[nb_args] = p;
 }
-if (nb_args >= MAX_ARGS) {
+if (*p == '\0') {
 break;
 }
 ret = get_str(buf, sizeof(buf), &p);
@@ -888,7 +903,7 @@ static void help_cmd(Monitor *mon, const char *name)
 return;
 }
 
-parse_cmdline(name, &nb_args, args);
+parse_cmdline(name, &nb_args, args, NULL);
 if (nb_args >= MAX_ARGS) {
 goto cleanup;
 }
@@ -4179,17 +4194,18 @@ static const char *next_arg_type(const char *typestr)
 return (p != NULL ? ++p : typestr);
 }
 
-static void monitor_find_completion(Monitor *mon,
-const char *cmdline)
+static void monitor_find_completion_by_table(Monitor *mon,
+ const mon_cmd_t *cmd_table,
+ const char *cmdline)
 {
-const char *cmdname;
+const char *cmdname, *args_cmdline[MAX_ARGS];
 char *args[MAX_ARGS];
 int nb_args, i, len;
 const char *ptype, *str;
 const mon_cmd_t *cmd;
 MonitorBlockComplete mbs;
 
-parse_cmdline(cmdline, &nb_args, args);
+parse_cmdline(cmdline, &nb_args, args, args_cmdline);
 #ifdef DEBUG_COMPLETION
 for (i = 0; i < nb_args; i++) {
 monitor_printf(mon, "arg%d = '%s'\n", i, args[i]);
@@ -4212,12 +4228,12 @@ static void monitor_find_completion(Monitor *mon,
 else
 cmdname = args[0];
 readline_set_completion_index(mon->rs, strlen(cmdname));
-for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
+for (cmd = cmd_table; cmd->name != NULL; cmd++) {
 cmd_completion(mon, cmdname, cmd->name);
 }
 } else {
 /* find the command */
-for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
+for (cmd = cmd_table; cmd->name != NULL; cmd++) {
 if (compare_cmd(args[0], cmd->name)) {
 break;
 }
@@ -4226,6 +4242,12 @@ static void monitor_find_completion(Monitor *mon,
 goto cleanup;
 }
 
+if (cmd->sub_table) {
+monitor_find_completion_by_table(mon, cmd->sub_table,
+ args_cmdline[1]);
+goto cleanup;
+}
+
 ptype = next_arg_type(cmd->args_type);
 for(i = 0; i < nb_args - 2; i++) {
 if (*ptype != '\0') {
@@ -4252,13 +4274,7 @@ static void monitor_find_completion(Monitor *mon,
 bdrv_iterate(block_completion_it, &mbs);
 break;
 case 's':
-/* XXX: more generic ? */
-if (!strcmp(cmd->name, "info")) {
-readline_set_completion_index(mon->rs, strlen(str));
-for(cmd = info_cmd

[Qemu-devel] [PATCH V5 6/7] monitor: improve "help" in auto completion for sub command

2013-06-28 Thread Wenchao Xia

Now special case "help *" in auto completion can work with sub commands,
such as "help info a*".

Signed-off-by: Wenchao Xia 
---
 monitor.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index 2f5b91d..3ef18ee 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4283,10 +4283,8 @@ static void monitor_find_completion_by_table(Monitor 
*mon,
 cmd_completion(mon, str, QKeyCode_lookup[i]);
 }
 } else if (!strcmp(cmd->name, "help|?")) {
-readline_set_completion_index(mon->rs, strlen(str));
-for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
-cmd_completion(mon, str, cmd->name);
-}
+monitor_find_completion_by_table(mon, cmd_table,
+ args_cmdline[1]);
 }
 break;
 default:
-- 
1.7.1

[Qemu-devel] [PATCH V5 7/7] monitor: improve "help" to allow show details of single command in sub group

2013-06-28 Thread Wenchao Xia

A new parameter type 'S' is introduced to allow user input any string.
"help info block" do not tip extra parameter error now.

Signed-off-by: Wenchao Xia 
---
 hmp-commands.hx |2 +-
 monitor.c   |   30 +-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 915b0d1..8cf5f29 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -11,7 +11,7 @@ ETEXI
 
 {
 .name   = "help|?",
-.args_type  = "name:s?",
+.args_type  = "name:S?",
 .params = "[cmd]",
 .help   = "show the help",
 .mhandler.cmd = do_help_cmd,
diff --git a/monitor.c b/monitor.c
index 3ef18ee..ebdc2a3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -83,6 +83,7 @@
  * 'F'  filename
  * 'B'  block device name
  * 's'  string (accept optional quote)
+ * 'S'  supported types, can be any string (accept optional quote)
  * 'O'  option string of the form NAME=VALUE,...
  *  parsed according to QemuOptsList given by its name
  *  Example: 'device:O' uses qemu_device_opts.
@@ -4011,6 +4012,31 @@ static const mon_cmd_t *monitor_parse_command(Monitor 
*mon,
 }
 }
 break;
+case 'S':
+{
+/* package all remaining string */
+int len;
+
+while (qemu_isspace(*p)) {
+p++;
+}
+if (*typestr == '?') {
+typestr++;
+if (*p == '\0') {
+/* no remaining string: NULL argument */
+break;
+}
+}
+len = strlen(p);
+if (len <= 0) {
+monitor_printf(mon, "%s: string expected\n",
+   cmdname);
+break;
+}
+qdict_put(qdict, key, qstring_from_str(p));
+p += len;
+}
+break;
 default:
 bad_type:
 monitor_printf(mon, "%s: unknown type '%c'\n", cmdname, c);
@@ -4282,7 +4308,9 @@ static void monitor_find_completion_by_table(Monitor *mon,
 for (i = 0; i < Q_KEY_CODE_MAX; i++) {
 cmd_completion(mon, str, QKeyCode_lookup[i]);
 }
-} else if (!strcmp(cmd->name, "help|?")) {
+}
+case 'S':
+if (!strcmp(cmd->name, "help|?")) {
 monitor_find_completion_by_table(mon, cmd_table,
  args_cmdline[1]);
 }
-- 
1.7.1

[Qemu-devel] [PATCH V5 3/7] monitor: code move for parse_cmdline()

2013-06-28 Thread Wenchao Xia

get_str() is called by parse_cmdline() so it is moved also. Some
code style error reported by check script, is also fixed.

Signed-off-by: Wenchao Xia 
---
 monitor.c |  191 +++--
 1 files changed, 98 insertions(+), 93 deletions(-)

diff --git a/monitor.c b/monitor.c
index 68e2e80..03a017d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -733,6 +733,104 @@ static int compare_cmd(const char *name, const char *list)
 return 0;
 }
 
+static int get_str(char *buf, int buf_size, const char **pp)
+{
+const char *p;
+char *q;
+int c;
+
+q = buf;
+p = *pp;
+while (qemu_isspace(*p)) {
+p++;
+}
+if (*p == '\0') {
+fail:
+*q = '\0';
+*pp = p;
+return -1;
+}
+if (*p == '\"') {
+p++;
+while (*p != '\0' && *p != '\"') {
+if (*p == '\\') {
+p++;
+c = *p++;
+switch (c) {
+case 'n':
+c = '\n';
+break;
+case 'r':
+c = '\r';
+break;
+case '\\':
+case '\'':
+case '\"':
+break;
+default:
+qemu_printf("unsupported escape code: '\\%c'\n", c);
+goto fail;
+}
+if ((q - buf) < buf_size - 1) {
+*q++ = c;
+}
+} else {
+if ((q - buf) < buf_size - 1) {
+*q++ = *p;
+}
+p++;
+}
+}
+if (*p != '\"') {
+qemu_printf("unterminated string\n");
+goto fail;
+}
+p++;
+} else {
+while (*p != '\0' && !qemu_isspace(*p)) {
+if ((q - buf) < buf_size - 1) {
+*q++ = *p;
+}
+p++;
+}
+}
+*q = '\0';
+*pp = p;
+return 0;
+}
+
+#define MAX_ARGS 16
+
+/* NOTE: this parser is an approximate form of the real command parser */
+static void parse_cmdline(const char *cmdline,
+  int *pnb_args, char **args)
+{
+const char *p;
+int nb_args, ret;
+char buf[1024];
+
+p = cmdline;
+nb_args = 0;
+for (;;) {
+while (qemu_isspace(*p)) {
+p++;
+}
+if (*p == '\0') {
+break;
+}
+if (nb_args >= MAX_ARGS) {
+break;
+}
+ret = get_str(buf, sizeof(buf), &p);
+args[nb_args] = g_strdup(buf);
+nb_args++;
+if (ret < 0) {
+break;
+}
+}
+*pnb_args = nb_args;
+}
+
 static void help_cmd_dump(Monitor *mon, const mon_cmd_t *cmds,
   const char *prefix, const char *name)
 {
@@ -3411,71 +3509,6 @@ static int get_double(Monitor *mon, double *pval, const 
char **pp)
 return 0;
 }
 
-static int get_str(char *buf, int buf_size, const char **pp)
-{
-const char *p;
-char *q;
-int c;
-
-q = buf;
-p = *pp;
-while (qemu_isspace(*p))
-p++;
-if (*p == '\0') {
-fail:
-*q = '\0';
-*pp = p;
-return -1;
-}
-if (*p == '\"') {
-p++;
-while (*p != '\0' && *p != '\"') {
-if (*p == '\\') {
-p++;
-c = *p++;
-switch(c) {
-case 'n':
-c = '\n';
-break;
-case 'r':
-c = '\r';
-break;
-case '\\':
-case '\'':
-case '\"':
-break;
-default:
-qemu_printf("unsupported escape code: '\\%c'\n", c);
-goto fail;
-}
-if ((q - buf) < buf_size - 1) {
-*q++ = c;
-}
-} else {
-if ((q - buf) < buf_size - 1) {
-*q++ = *p;
-}
-p++;
-}
-}
-if (*p != '\"') {
-qemu_printf("unterminated string\n");
-goto fail;
-}
-p++;
-} else {
-while (*p != '\0' && !qemu_isspace(*p)) {
-if ((q - buf) < buf_size - 1) {
-*q++ = *p;
-}
-p++;
-}
-}
-*q = '\0';
-*pp = p;
-return 0;
-}
-
 /*
  * Store the command-name in cmdname, and return a pointer to
  * the remaining of the command string.
@@ -3532,8 +3565,6 @@ static char *key_get_info(const char *type, char **key)
 static int default_fmt_format = 'x';
 static int default_fmt_size = 4;
 
-#define MAX_ARGS 16
-
 static int is_valid_option(const char *c, const char *typestr)
 {
 char option[3];
@@ -4099,32 +4130,6 @@ static void block_completion_it(void *opaque, 
BlockDriverState *b

[Qemu-devel] [PATCH V5 2/7] monitor: avoid direct use of global variable *mon_cmds

2013-06-28 Thread Wenchao Xia

New member *cmd_table is added in structure Monitor to avoid direct usage of
*mon_cmds. Now monitor have an associated command table, when global variable
*info_cmds is also discarded, structure Monitor would gain full control about
how to deal with user input.

Signed-off-by: Wenchao Xia 
Reviewed-by: Eric Blake 
---
 monitor.c |   14 +-
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/monitor.c b/monitor.c
index 29e3a95..68e2e80 100644
--- a/monitor.c
+++ b/monitor.c
@@ -194,6 +194,7 @@ struct Monitor {
 CPUState *mon_cpu;
 BlockDriverCompletionFunc *password_completion_cb;
 void *password_opaque;
+mon_cmd_t *cmd_table;
 QError *error;
 QLIST_HEAD(,mon_fd_t) fds;
 QLIST_ENTRY(Monitor) entry;
@@ -749,7 +750,7 @@ static void help_cmd(Monitor *mon, const char *name)
 if (name && !strcmp(name, "info")) {
 help_cmd_dump(mon, info_cmds, "info ", NULL);
 } else {
-help_cmd_dump(mon, mon_cmds, "", name);
+help_cmd_dump(mon, mon->cmd_table, "", name);
 if (name && !strcmp(name, "log")) {
 const QEMULogItem *item;
 monitor_printf(mon, "Log items (comma separated):\n");
@@ -3974,7 +3975,7 @@ static void handle_user_command(Monitor *mon, const char 
*cmdline)
 
 qdict = qdict_new();
 
-cmd = monitor_parse_command(mon, cmdline, 0, mon_cmds, qdict);
+cmd = monitor_parse_command(mon, cmdline, 0, mon->cmd_table, qdict);
 if (!cmd)
 goto out;
 
@@ -4163,12 +4164,12 @@ static void monitor_find_completion(Monitor *mon,
 else
 cmdname = args[0];
 readline_set_completion_index(mon->rs, strlen(cmdname));
-for(cmd = mon_cmds; cmd->name != NULL; cmd++) {
+for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
 cmd_completion(mon, cmdname, cmd->name);
 }
 } else {
 /* find the command */
-for (cmd = mon_cmds; cmd->name != NULL; cmd++) {
+for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
 if (compare_cmd(args[0], cmd->name)) {
 break;
 }
@@ -4219,7 +4220,7 @@ static void monitor_find_completion(Monitor *mon,
 }
 } else if (!strcmp(cmd->name, "help|?")) {
 readline_set_completion_index(mon->rs, strlen(str));
-for (cmd = mon_cmds; cmd->name != NULL; cmd++) {
+for (cmd = mon->cmd_table; cmd->name != NULL; cmd++) {
 cmd_completion(mon, str, cmd->name);
 }
 }
@@ -4782,6 +4783,9 @@ void monitor_init(CharDriverState *chr, int flags)
 if (!default_mon || (flags & MONITOR_IS_DEFAULT))
 default_mon = mon;
 
+/* Always use *mon_cmds as root command table now. */
+mon->cmd_table = mon_cmds;
+
 sortcmdlist();
 }
 
-- 
1.7.1

[Qemu-devel] [PATCH V5 1/7] monitor: avoid direct use of global *cur_mon in completion functions

2013-06-28 Thread Wenchao Xia

Parameter *mon is added to replace *cur_mon, and readline_completion()
pass rs->mon as value, which should be initialized in readline_init()
called by monitor_init(). In short, structure ReadLineState controls
where the action would be taken now.

Signed-off-by: Wenchao Xia 
Reviewed-by: Eric Blake 
---
 include/monitor/readline.h |3 +-
 monitor.c  |   55 ++-
 readline.c |5 +--
 3 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/include/monitor/readline.h b/include/monitor/readline.h
index fc9806e..0faf6e1 100644
--- a/include/monitor/readline.h
+++ b/include/monitor/readline.h
@@ -8,7 +8,8 @@
 #define READLINE_MAX_COMPLETIONS 256
 
 typedef void ReadLineFunc(Monitor *mon, const char *str, void *opaque);
-typedef void ReadLineCompletionFunc(const char *cmdline);
+typedef void ReadLineCompletionFunc(Monitor *mon,
+const char *cmdline);
 
 typedef struct ReadLineState {
 char cmd_buf[READLINE_CMD_BUF_SIZE + 1];
diff --git a/monitor.c b/monitor.c
index 9be515c..29e3a95 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3998,7 +3998,7 @@ out:
 QDECREF(qdict);
 }
 
-static void cmd_completion(const char *name, const char *list)
+static void cmd_completion(Monitor *mon, const char *name, const char *list)
 {
 const char *p, *pstart;
 char cmd[128];
@@ -4016,7 +4016,7 @@ static void cmd_completion(const char *name, const char 
*list)
 memcpy(cmd, pstart, len);
 cmd[len] = '\0';
 if (name[0] == '\0' || !strncmp(name, cmd, strlen(name))) {
-readline_add_completion(cur_mon->rs, cmd);
+readline_add_completion(mon->rs, cmd);
 }
 if (*p == '\0')
 break;
@@ -4024,7 +4024,7 @@ static void cmd_completion(const char *name, const char 
*list)
 }
 }
 
-static void file_completion(const char *input)
+static void file_completion(Monitor *mon, const char *input)
 {
 DIR *ffs;
 struct dirent *d;
@@ -4047,7 +4047,7 @@ static void file_completion(const char *input)
 pstrcpy(file_prefix, sizeof(file_prefix), p + 1);
 }
 #ifdef DEBUG_COMPLETION
-monitor_printf(cur_mon, "input='%s' path='%s' prefix='%s'\n",
+monitor_printf(mon, "input='%s' path='%s' prefix='%s'\n",
input, path, file_prefix);
 #endif
 ffs = opendir(path);
@@ -4074,20 +4074,27 @@ static void file_completion(const char *input)
 if (stat(file, &sb) == 0 && S_ISDIR(sb.st_mode)) {
 pstrcat(file, sizeof(file), "/");
 }
-readline_add_completion(cur_mon->rs, file);
+readline_add_completion(mon->rs, file);
 }
 }
 closedir(ffs);
 }
 
+typedef struct MonitorBlockComplete {
+Monitor *mon;
+const char *input;
+} MonitorBlockComplete;
+
 static void block_completion_it(void *opaque, BlockDriverState *bs)
 {
 const char *name = bdrv_get_device_name(bs);
-const char *input = opaque;
+MonitorBlockComplete *mbc = opaque;
+Monitor *mon = mbc->mon;
+const char *input = mbc->input;
 
 if (input[0] == '\0' ||
 !strncmp(name, (char *)input, strlen(input))) {
-readline_add_completion(cur_mon->rs, name);
+readline_add_completion(mon->rs, name);
 }
 }
 
@@ -4123,18 +4130,20 @@ static const char *next_arg_type(const char *typestr)
 return (p != NULL ? ++p : typestr);
 }
 
-static void monitor_find_completion(const char *cmdline)
+static void monitor_find_completion(Monitor *mon,
+const char *cmdline)
 {
 const char *cmdname;
 char *args[MAX_ARGS];
 int nb_args, i, len;
 const char *ptype, *str;
 const mon_cmd_t *cmd;
+MonitorBlockComplete mbs;
 
 parse_cmdline(cmdline, &nb_args, args);
 #ifdef DEBUG_COMPLETION
-for(i = 0; i < nb_args; i++) {
-monitor_printf(cur_mon, "arg%d = '%s'\n", i, (char *)args[i]);
+for (i = 0; i < nb_args; i++) {
+monitor_printf(mon, "arg%d = '%s'\n", i, args[i]);
 }
 #endif
 
@@ -4153,9 +4162,9 @@ static void monitor_find_completion(const char *cmdline)
 cmdname = "";
 else
 cmdname = args[0];
-readline_set_completion_index(cur_mon->rs, strlen(cmdname));
+readline_set_completion_index(mon->rs, strlen(cmdname));
 for(cmd = mon_cmds; cmd->name != NULL; cmd++) {
-cmd_completion(cmdname, cmd->name);
+cmd_completion(mon, cmdname, cmd->name);
 }
 } else {
 /* find the command */
@@ -4183,33 +4192,35 @@ static void monitor_find_completion(const char *cmdline)
 switch(*ptype) {
 case 'F':
 /* file completion */
-readline_set_completion_index(cur_mon->rs, strlen(str));
-file_completion(str);
+readline_set_completion_index(mon->rs, strlen(str));
+file_completion(mon, str);
 break;
 case 'B':

[Qemu-devel] [PATCH V5 4/7] monitor: avoid direct use of global *info_cmds in help functions

2013-06-28 Thread Wenchao Xia

In help functions info_cmds is treated as sub command group now, not as
a special case any more. Still help can't show message for single command
under "info", since command parser reject additional parameter, which
can be improved by change "help" item parameter define later. "log" is
still treated as special help case. compare_cmd() is used instead of strcmp()
in command searching.

Signed-off-by: Wenchao Xia 
---
 monitor.c |   63 +++-
 1 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/monitor.c b/monitor.c
index 03a017d..bc62fc7 100644
--- a/monitor.c
+++ b/monitor.c
@@ -831,33 +831,76 @@ static void parse_cmdline(const char *cmdline,
 *pnb_args = nb_args;
 }
 
+static void help_cmd_dump_one(Monitor *mon,
+  const mon_cmd_t *cmd,
+  char **prefix_args,
+  int prefix_args_nb)
+{
+int i;
+
+for (i = 0; i < prefix_args_nb; i++) {
+monitor_printf(mon, "%s ", prefix_args[i]);
+}
+monitor_printf(mon, "%s %s -- %s\n", cmd->name, cmd->params, cmd->help);
+}
+
 static void help_cmd_dump(Monitor *mon, const mon_cmd_t *cmds,
-  const char *prefix, const char *name)
+  char **args, int nb_args, int arg_index)
 {
 const mon_cmd_t *cmd;
 
-for(cmd = cmds; cmd->name != NULL; cmd++) {
-if (!name || !strcmp(name, cmd->name))
-monitor_printf(mon, "%s%s %s -- %s\n", prefix, cmd->name,
-   cmd->params, cmd->help);
+/* Dump all */
+if (arg_index >= nb_args) {
+for (cmd = cmds; cmd->name != NULL; cmd++) {
+help_cmd_dump_one(mon, cmd, args, arg_index);
+}
+return;
+}
+
+/* Find one entry to dump */
+for (cmd = cmds; cmd->name != NULL; cmd++) {
+if (compare_cmd(args[arg_index], cmd->name)) {
+if (cmd->sub_table) {
+help_cmd_dump(mon, cmd->sub_table,
+  args, nb_args, arg_index + 1);
+} else {
+help_cmd_dump_one(mon, cmd, args, arg_index);
+}
+break;
+}
 }
 }
 
 static void help_cmd(Monitor *mon, const char *name)
 {
-if (name && !strcmp(name, "info")) {
-help_cmd_dump(mon, info_cmds, "info ", NULL);
-} else {
-help_cmd_dump(mon, mon->cmd_table, "", name);
-if (name && !strcmp(name, "log")) {
+char *args[MAX_ARGS];
+int nb_args = 0, i;
+
+if (name) {
+/* special case for log */
+if (!strcmp(name, "log")) {
 const QEMULogItem *item;
 monitor_printf(mon, "Log items (comma separated):\n");
 monitor_printf(mon, "%-10s %s\n", "none", "remove all logs");
 for (item = qemu_log_items; item->mask != 0; item++) {
 monitor_printf(mon, "%-10s %s\n", item->name, item->help);
 }
+return;
+}
+
+parse_cmdline(name, &nb_args, args);
+if (nb_args >= MAX_ARGS) {
+goto cleanup;
 }
 }
+
+help_cmd_dump(mon, mon->cmd_table, args, nb_args, 0);
+
+cleanup:
+nb_args = nb_args < MAX_ARGS ? nb_args : MAX_ARGS;
+for (i = 0; i < nb_args; i++) {
+g_free(args[i]);
+}
 }
 
 static void do_help_cmd(Monitor *mon, const QDict *qdict)
-- 
1.7.1

[Qemu-devel] [PATCH V5 0/7] monitor: support sub command group in auto completion and help

2013-06-28 Thread Wenchao Xia

Global variable *mon_cmds and *info_cmds are not directly used any more,
*cur_mon is not used in completion related functions. It is possible to create
a monitor with different command table now, but that requirement do not exist
yet, so not changed it to save trouble. Log command is still a special case
now, may be it can be converted as sub group later.

Patch 1-3 make sure the functions can be re-entered safely.

V2:
  General:
  To discard *info_comds more graceful, help related function is modified to 
support
sub command too.
  Patch 6/7 are added to improve help related functions.
  Patch 5: not directly return to make sure args are freed.

  Address Luiz's comments:
  Split patch into small series.
  struct mon_cmd_t was not moved into header file, instead mon_cmnd_t 
*cmd_table is
added as a member in struct Monitor.
  5/7: drop original code comments for "info" in monitor_find_completion().

v3:
  5/7: add parameter **args_cmdline in parse_cmdline() to tell next valid
parameter's position. This fix the issue in case command length in input is not
equal to its name's length such as "help|?", and the case input start with
space such as "  s". 
  7/7: better commit message.

v4:
  Address Eric's comments:
  1/7, 2/7, 4/7: better commit title and message.
  1/7 remove useless (char *) in old code, add space around "for ()" in old 
code.
  3/7: separate code moving patch before usage.
  4/7: add space around "for ()" in old code, add min(nb_args, MAX_ARGS) in free
to make code stronger.

v5:
  4/7: use "a < b ? a : b" instead of macro min.

Wenchao Xia (7):
  1 monitor: avoid direct use of global *cur_mon in completion functions
  2 monitor: avoid direct use of global variable *mon_cmds
  3 monitor: code move for parse_cmdline()
  4 monitor: avoid direct use of global *info_cmds in help functions
  5 monitor: support sub commands in auto completion
  6 monitor: improve "help" in auto completion for sub command
  7 monitor: improve "help" to allow show details of single command in sub group

 hmp-commands.hx|2 +-
 include/monitor/readline.h |3 +-
 monitor.c  |  387 
 readline.c |5 +-
 4 files changed, 254 insertions(+), 143 deletions(-)

Re: [Qemu-devel] [PATCH V4 0/7] monitor: support sub command group in auto completion and help

2013-06-28 Thread Wenchao Xia

于 2013-6-28 11:39, Wenchao Xia 写道:
> Global variable *mon_cmds and *info_cmds are not directly used any more,
> *cur_mon is not used in completion related functions. It is possible to create
> a monitor with different command table now, but that requirement do not exist
> yet, so not changed it to save trouble. Log command is still a special case
> now, may be it can be converted as sub group later.
> 
> Patch 1-3 make sure the functions can be re-entered safely.
> 
> V2:
>General:
>To discard *info_comds more graceful, help related function is modified to 
> support
> sub command too.
>Patch 6/7 are added to improve help related functions.
>Patch 5: not directly return to make sure args are freed.
> 
>Address Luiz's comments:
>Split patch into small series.
>struct mon_cmd_t was not moved into header file, instead mon_cmnd_t 
> *cmd_table is
> added as a member in struct Monitor.
>5/7: drop original code comments for "info" in monitor_find_completion().
> 
> v3:
>5/7: add parameter **args_cmdline in parse_cmdline() to tell next valid
> parameter's position. This fix the issue in case command length in input is 
> not
> equal to its name's length such as "help|?", and the case input start with
> space such as "  s".
>7/7: better commit message.
> 
> v4:
>Address Eric's comments:
>1/7, 2/7, 4/7: better commit title and message.
>1/7 remove useless (char *) in old code, add space around "for ()" in old 
> code.
>3/7: separate code moving patch before usage.
>4/7: add space around "for ()" in old code, add min(nb_args, MAX_ARGS) in 
> free
> to make code stronger.
> 
> Wenchao Xia (7):
>1 monitor: avoid direct use of global *cur_mon in completion functions
>2 monitor: avoid direct use of global variable *mon_cmds
>3 monitor: code move for parse_cmdline()
>4 monitor: avoid direct use of global *info_cmds in help functions
>5 monitor: support sub commands in auto completion
>6 monitor: improve "help" in auto completion for sub command
>7 monitor: improve "help" to allow show details of single command in sub 
> group
> 
>   hmp-commands.hx|2 +-
>   include/monitor/readline.h |3 +-
>   monitor.c  |  387 
> 
>   readline.c |5 +-
>   4 files changed, 254 insertions(+), 143 deletions(-)
> 
> 
> 
  This version have a build issue in 4/7 caused by macro 'min' can't be
recognized. I am a bit too rush on this version for catching a train.
I'll respin to fix that, sorry for interrupt.

-- 
Best Regards

Wenchao Xia

Re: [Qemu-devel] [PATCH v4 00/11] Fix versatile_pci

2013-06-28 Thread Rob Landley

On 04/08/2013 03:16:18 PM, Peter Maydell wrote:

On 8 April 2013 18:37, Rob Landley  wrote:
> On 04/06/2013 10:44:25 AM, Peter Maydell wrote:
>>
>> This patch series fixes a number of serious bugs in our emulation  
of
>> the PCI controller found on VersatilePB and the early Realview  
boards:

>>  * our interrupt mapping was totally wrong
>
>
> Yes. Yes it was. However, what you were doing matched the kernel  
for many

> years.

> The kernel guys have screwed this up three consecutive times, as  
described

> here:
>
>   http://landley.net/hg/aboriginal/rev/7bf850767bb8
>
> Because as far as I can tell, nobody has any test hardware for this  
anymore.

There is some but it's pretty rare.

Now that the next kernel's about to come out, I'm trying to get my arm  
versatile image to work under qemu 1.5.0. The old kernel doesn't work,  
and the current vanilla kernel doesn't work. This change broke it.

I'm testing current linux-git both with and without this patch:

--- linux/arch/arm/mach-versatile/pci.c 2013-04-28 19:36:01.0  
-0500
+++ linux.bak/arch/arm/mach-versatile/pci.c 2013-04-29  
19:09:44.857097553 -0500

@@ -333,7 +333,7 @@
 *  26 1 IRQ_SIC_PCI2
 *  27 1 IRQ_SIC_PCI3
 */
-   irq = IRQ_SIC_PCI0 + ((slot - 24 + pin - 1) & 3);
+   irq = 59; //IRQ_SIC_PCI0 + ((slot - 24 + pin - 1) & 3);

return irq;
 }

With the patch, qemu 1.2.0 works, but qemu 1.5 versatile does not work  
with or without the patch.

Here's a test image to demonstrate the problem: it works fine under  
qemu 1.2.0, but doesn't under 1.5.0:

  http://landley.net/aboriginal/bin/system-image-armv5l.tar.bz2

Extract it and ./run-emulator.sh. You get a shell prompt from 1.2.0,  
from 1.5.0 it never gets a scsi interrupt, and after a timeout goes  
into an abort/reset loop.

>> This version of the patchset avoids breaking legacy Linux guests
>> by automatically detecting those broken kernels and switching back
>> to the old mapping.

As testing with the above image confirms: it does not.

>> This works by looking at the values the kernel
>> writes to the PCI_INTERRUPT_LINE register in the config space,  
which

>> is effectively the interrupt number the kernel expects the device
>> to be using. If this doesn't match reality we use the broken  
mapping.

>> (Thanks to Michael S. Tsirkin for this suggestion.)

Define "reality".

The kernel changed what irqs it was expecting _three_times_ last year,  
and as far as I can tell none of them were what they were _trying_ to  
do.

Here's my blog entry and the source control comments where I diagnosed  
this stuff to show that the kernel guys have no flipping CLUE how an  
arm versatile board actually works:

1) http://landley.net/notes-2013.html#15-03-2013
2) http://landley.net/hg/aboriginal/rev/1588
3) http://landley.net/hg/aboriginal/rev/1589

Here's the text from #2:

=
The arm guys have now screwed up the ARM versatile board's IRQ routing  
three consecutive times. I'm impressed.

The ARM versatile scsi IRQ is position 27 on the IRQ controller. That's  
what
QEMU implemented, that's what worked for years. In commit 1bc39ac5dab2  
the
kernel guys screwed up their math (among other things doing -24 and  
then &3

on the result.. can we say NOP?) so it was now trying to use IRQ 28 once
the unnecessary "swizzle" function got done with it. (I started  
reverting
that 6 months ago in aboriginal changeset 1534.) Then in commit  
f5565295892e
they incremented the IRQ controller start by 32... and didn't adjust  
map_irq()

so it was still requesting 28, now before the start of the IRQ
controller's range. Then in commit e3e92a7be693 they noticed it was
broken, and added 64 to it. (So now it's requesting 64+28=92, when it
_should_ be requesting 32+27=59. Their own description of what changed  
does

not support what the patch did, and yet...)
=

All that was about the SCSI IRQ. Entry #3 above was about the  
_ethernet_ IRQ, which they screwed up in a different way (they did an  
&3 in a way that wrapped)about how they did the math for the ethernet  
irq wrong in a _different_ way than they did the scsi irq math wrong.

The line that's actually calculating the IRQ is:

irq = 27 + ((slot - 24 + pin - 1) & 3);

Since 24 is divisible by 3, subtracting 24 and then & 3 on the result  
is a NOP so this math can't POSSIBLY do what they think it's doing.

The kernel code in this area is CRAP, has changed multiple times in  
semi-random ways, has obviously NEVER been tested on real hardware, and  
if you've implemented what the actual versatile documentation says it's  
clearly not what the kernel is doing.

> My concern here was that you'd going to change qemu so it doesn't  
run the
> old images, and require a very new qemu to run the new images, so  
there will

> be an incompatible flag day.

No, it doesn't break old images, as the paragraph you quote clearly
states. Yes, if the kernel guys fix the kernel yo

[Qemu-devel] [PATCH] migration: add timeout option for tcp migraion send/receive socket

2013-06-28 Thread Zhanghaoyu (A)

When network disconnection occurs during live migration, the migration thread 
will be stuck in the function sendmsg(), as the migration socket is in 
~O_NONBLOCK mode now.

Signed-off-by: Zeng Junliang 
---
 include/migration/migration.h |4 
 migration-tcp.c   |   23 ++-
 2 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index f0640e0..1a56248 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -23,6 +23,8 @@
 #include "qapi-types.h"
 #include "exec/cpu-common.h"
 
+#define QEMU_MIGRATE_SOCKET_OP_TIMEOUT 60
+
 struct MigrationParams {
 bool blk;
 bool shared;
@@ -109,6 +111,8 @@ uint64_t xbzrle_mig_pages_transferred(void);
 uint64_t xbzrle_mig_pages_overflow(void);
 uint64_t xbzrle_mig_pages_cache_miss(void);
 
+int tcp_migration_set_socket_timeout(int fd, int optname, int timeout_in_sec);
+
 /**
  * @migrate_add_blocker - prevent migration from proceeding
  *
diff --git a/migration-tcp.c b/migration-tcp.c
index b20ee58..391db0a 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -33,7 +33,7 @@ static void tcp_wait_for_connect(int fd, void *opaque)
 {
 MigrationState *s = opaque;
 
-if (fd < 0) {
+if (tcp_migration_set_socket_timeout(fd, SO_SNDTIMEO, 
QEMU_MIGRATE_SOCKET_OP_TIMEOUT) < 0) {
 DPRINTF("migrate connect error\n");
 s->file = NULL;
 migrate_fd_error(s);
@@ -76,6 +76,10 @@ static void tcp_accept_incoming_migration(void *opaque)
 goto out;
 }
 
+if (tcp_migration_set_socket_timeout(c, SO_RCVTIMEO, 
QEMU_MIGRATE_SOCKET_OP_TIMEOUT) < 0) {
+fprintf(stderr, "set tcp migration socket receive timeout error\n");
+goto out;
+}
 process_incoming_migration(f);
 return;
 
@@ -95,3 +99,20 @@ void tcp_start_incoming_migration(const char *host_port, 
Error **errp)
 qemu_set_fd_handler2(s, NULL, tcp_accept_incoming_migration, NULL,
  (void *)(intptr_t)s);
 }
+
+int tcp_migration_set_socket_timeout(int fd, int optname, int timeout_in_sec)
+{
+struct timeval timeout;
+int ret = 0;
+
+if (fd < 0 || timeout_in_sec < 0 ||
+(optname != SO_RCVTIMEO && optname != SO_SNDTIMEO))
+return -1;
+
+timeout.tv_sec = timeout_in_sec;
+timeout.tv_usec = 0;
+
+ret = qemu_setsockopt(fd, SOL_SOCKET, optname, &timeout, sizeof(timeout));
+
+return ret;
+}
\ No newline at end of file
-- 
1.7.3.1.msysgit.0

Re: [Qemu-devel] [Bug 1195882] Re: Make fails on Centos - can't find autoreconf

2013-06-28 Thread Todd Ross

As  far as I know,  autoreconf does not exist in the red
hat/fedora/centos world.

Can't find a yum, rpm. or other distro for it on the above os.

Am I missing something obvious hrte?

Thanks


Sent from a mobile device.

 Original message 
From: Iggy  
Date:  
To: tr...@seraband.com 
Subject: [Bug 1195882] Re: Make fails on Centos - can't find autoreconf 
 
autoreconf is part of the autoconf package. Do you have that installed?

-- 
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1195882

Title:
  Make fails on Centos - can't find autoreconf

Status in QEMU:
  New

Bug description:
    [root@H002 qemu-1.4.2]# make
   
  \  GEN   i386-softmmu/config-devices.mak  
 
    GEN   x86_64-softmmu/config-devices.mak 
 
    GEN   alpha-softmmu/config-devices.mak  
 
    GEN   arm-softmmu/config-devices.mak
 
    GEN   cris-softmmu/config-devices.mak   
 
    GEN   lm32-softmmu/config-devices.mak   

  (   )

  GEN   unicore32-linux-user/config-devices.mak
    GEN   s390x-linux-user/config-devices.mak
    GEN   config-all-devices.mak
    GEN   config-host.h
  (cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
  /bin/sh: autoreconf: command not found
  make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

  *

  Exact same error for 1.5.1 build 
  So, qemu not supported on anything but Ubuntu?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1195882/+subscriptions

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1195882

Title:
  Make fails on Centos - can't find autoreconf

Status in QEMU:
  New

Bug description:
[root@H002 qemu-1.4.2]# make
   
  \  GEN   i386-softmmu/config-devices.mak  
 
GEN   x86_64-softmmu/config-devices.mak 
 
GEN   alpha-softmmu/config-devices.mak  
 
GEN   arm-softmmu/config-devices.mak
 
GEN   cris-softmmu/config-devices.mak   
 
GEN   lm32-softmmu/config-devices.mak   

  (   )

  GEN   unicore32-linux-user/config-devices.mak
GEN   s390x-linux-user/config-devices.mak
GEN   config-all-devices.mak
GEN   config-host.h
  (cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
  /bin/sh: autoreconf: command not found
  make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

  *

  Exact same error for 1.5.1 build 
  So, qemu not supported on anything but Ubuntu?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1195882/+subscriptions

Re: [Qemu-devel] [PATCH 05/30] exec: do not use qemu/tls.h

2013-06-28 Thread Ed Maste

On 28 June 2013 14:26, Paolo Bonzini  wrote:
>
> +/* This is thread-local depending on __linux__ because:

Is the comment perhaps unchanged from an earlier revision that used a
different test?  It seems odd to me to reference __linux__ here.

> + *  - the only -user mode supporting multiple VCPU threads is linux-user
> + *  - TCG system mode is single-threaded regarding VCPUs
> + *  - KVM system mode is multi-threaded but limited to Linux
> + */
> +#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined 
> CONFIG_USE_NPTL)

Also, in discussion on the FreeBSD bsd-user patch set the suggestion
was made that we do away with a flag, and just have thread support
always enabled.  Would you suggest this test then become KVM ||
(USER_ONLY && (USE_NPTL || __FreeBSD__))?

Re: [Qemu-devel] [PATCH v2 0/8] rdma: core logic w/ unpin example

2013-06-28 Thread Michael R. Hines


FYI: This version also passes under the 'virt-test' framework
in addition to my very aggressive looped regression tests,
in case anyone was concerned about additional testing.

I've also submitted a patch to virt-test to include rdma support.

As soon as this patch applies, I'll do the same on the libvirt mailing list.

- Michael

On 06/28/2013 03:59 PM, mrhi...@linux.vnet.ibm.com wrote:

From: "Michael R. Hines" 

This version seems ready to go, if there are no fundamental problems.

Changes since v1:
- Complete endianness handling of all protocol messages
- Splitout unpin patch
- ./configure fixes
- Fix documentation

Michael R. Hines (8):
   rdma: update documentation to reflect new unpin support
   rdma: introduce ram_handle_compressed()
   rdma: core logic
   rdma: unpin support
   rdma: send pc.ram
   rdma: allow state transitions between other states besides ACTIVE
   rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state
 transition
   rdma: account for the time spent in MIG_STATE_SETUP through QMP

  Makefile.objs |1 +
  arch_init.c   |   62 +-
  configure |   40 +
  docs/rdma.txt |   51 +-
  hmp.c |4 +
  include/migration/migration.h |7 +
  migration-rdma.c  | 3042 +
  migration.c   |   48 +-
  qapi-schema.json  |9 +-
  9 files changed, 3219 insertions(+), 45 deletions(-)
  create mode 100644 migration-rdma.c

[Qemu-devel] [Bug 1195882] Re: Make fails on Centos - can't find autoreconf

2013-06-28 Thread Iggy

autoreconf is part of the autoconf package. Do you have that installed?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1195882

Title:
  Make fails on Centos - can't find autoreconf

Status in QEMU:
  New

Bug description:
[root@H002 qemu-1.4.2]# make
   
  \  GEN   i386-softmmu/config-devices.mak  
 
GEN   x86_64-softmmu/config-devices.mak 
 
GEN   alpha-softmmu/config-devices.mak  
 
GEN   arm-softmmu/config-devices.mak
 
GEN   cris-softmmu/config-devices.mak   
 
GEN   lm32-softmmu/config-devices.mak   

  (   )

  GEN   unicore32-linux-user/config-devices.mak
GEN   s390x-linux-user/config-devices.mak
GEN   config-all-devices.mak
GEN   config-host.h
  (cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
  /bin/sh: autoreconf: command not found
  make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

  *

  Exact same error for 1.5.1 build 
  So, qemu not supported on anything but Ubuntu?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1195882/+subscriptions

Re: [Qemu-devel] Openbios upgrade broke sparc32 linux.

2013-06-28 Thread Mark Cave-Ayland


On 28/06/13 03:08, Rob Landley wrote:


Commit 467b34689d27 upgraded the openbios image, and ever since my linux
system images hang about the time they try to initialize interrupts.

http://landley.net/aboriginal/bin/system-image-sparc.tar.bz2

Extract that and "./run-emulator.sh" in the tarball. Using qemu 1.2.0
for example works fine, you get a shell prompt. Using 1.5.0 hangs.

Rob


Hi Rob,

Thanks for the bug report. I did a quick bisect on OpenBIOS and it 
points to the following commit:


commit 167aafd70f64e74a77787ca5bf9f4dc750b27fc3
Author: blueswirl 
Date:   Sun Feb 3 16:50:11 2013 +

SPARC32: microSPARC-II identification

For the microSPARC-II = Fujitsu MB86904 = Sun STP1012PGA,
PSR.IMPL=0 and PSR.VERS=4.

This CPU model is used as default by QEMU when emulating
a SparcStation-4 or SparcStation-5.

Signed-off-by: Olivier DANET 
Signed-off-by: Blue Swirl 


The commit itself is very simple and looks like this: 
http://git.qemu.org/?p=openbios.git;a=commitdiff;h=0fe772df8717ef75d91eae8ef221e9966ce2fd7f.


My guess would be that Linux is trying to do some slightly different 
initialisation based upon identifying the CPU, but I'm not too familiar 
with the kernel code myself. Blue/Olivier - can either of you comment on 
this?



ATB,

Mark.

[Qemu-devel] [Bug 1195882] [NEW] Make fails on Centos - can't find autoreconf

2013-06-28 Thread Todd Ross

Public bug reported:

  [root@H002 qemu-1.4.2]# make  
 
\  GEN   i386-softmmu/config-devices.mak
   
  GEN   x86_64-softmmu/config-devices.mak   
   
  GEN   alpha-softmmu/config-devices.mak
   
  GEN   arm-softmmu/config-devices.mak  
   
  GEN   cris-softmmu/config-devices.mak 
   
  GEN   lm32-softmmu/config-devices.mak   

(   )

GEN   unicore32-linux-user/config-devices.mak
  GEN   s390x-linux-user/config-devices.mak
  GEN   config-all-devices.mak
  GEN   config-host.h
(cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
/bin/sh: autoreconf: command not found
make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

*

Exact same error for 1.5.1 build 
So, qemu not supported on anything but Ubuntu?

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1195882

Title:
  Make fails on Centos - can't find autoreconf

Status in QEMU:
  New

Bug description:
[root@H002 qemu-1.4.2]# make
   
  \  GEN   i386-softmmu/config-devices.mak  
 
GEN   x86_64-softmmu/config-devices.mak 
 
GEN   alpha-softmmu/config-devices.mak  
 
GEN   arm-softmmu/config-devices.mak
 
GEN   cris-softmmu/config-devices.mak   
 
GEN   lm32-softmmu/config-devices.mak   

  (   )

  GEN   unicore32-linux-user/config-devices.mak
GEN   s390x-linux-user/config-devices.mak
GEN   config-all-devices.mak
GEN   config-host.h
  (cd /opt/qemu/qemu-1.4.2/pixman; autoreconf -v --install)
  /bin/sh: autoreconf: command not found
  make: *** [/opt/qemu/qemu-1.4.2/pixman/configure] Error 127

  *

  Exact same error for 1.5.1 build 
  So, qemu not supported on anything but Ubuntu?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1195882/+subscriptions

[Qemu-devel] [PULL] hmp: Make "info block" output more readable

2013-06-28 Thread Luiz Capitulino

From: Kevin Wolf 

HMP is meant for humans and you should notice it.

This changes the output format to use a bit more space to display the
information more readable and leaves out irrelevant information (e.g.
mention only that an image is encrypted, but not when it's not; display
I/O limits only if throttling is in effect; ...)

Before:

(qemu) info block
ide0-hd0: removable=0 io-status=ok file=/tmp/overlay.qcow2
backing_file=/tmp/backing.img backing_file_depth=1 ro=0 drv=qcow2
encrypted=1 bps=0 bps_rd=0 bps_wr=0 iops=1024 iops_rd=0 iops_wr=0
ide1-cd0: removable=1 locked=0 tray-open=0 io-status=ok
file=/home/kwolf/images/iso/Fedora-18-x86_64-Live-Desktop.iso ro=1
drv=raw encrypted=0 bps=0 bps_rd=0 bps_wr=0 iops=0 iops_rd=0 iops_wr=0
floppy0: removable=1 locked=0 tray-open=0 [not inserted]
sd0: removable=1 locked=0 tray-open=0 [not inserted]

After:

(qemu) info block
ide0-hd0: /tmp/overlay.qcow2 (qcow2, encrypted)
Backing file: /tmp/backing.img (chain depth: 1)
I/O limits:   bps=0 bps_rd=0 bps_wr=0 iops=1024 iops_rd=0 iops_wr=0

ide1-cd0: /home/kwolf/images/iso/Fedora-18-x86_64-Live-Desktop.iso (raw, 
read-only)
Removable device: not locked, tray closed

floppy0: [not inserted]
Removable device: not locked, tray closed

sd0: [not inserted]
Removable device: not locked, tray closed

Signed-off-by: Kevin Wolf 
Reviewed-by: Fam Zheng 
Acked-by: Anthony Liguori 
Signed-off-by: Luiz Capitulino 
---
 hmp.c | 94 +++
 1 file changed, 55 insertions(+), 39 deletions(-)

diff --git a/hmp.c b/hmp.c
index 148a3fb..2daed43 100644
--- a/hmp.c
+++ b/hmp.c
@@ -291,62 +291,78 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
 if (device && strcmp(device, info->value->device)) {
 continue;
 }
-monitor_printf(mon, "%s: removable=%d",
-   info->value->device, info->value->removable);
 
-if (info->value->removable) {
-monitor_printf(mon, " locked=%d", info->value->locked);
-monitor_printf(mon, " tray-open=%d", info->value->tray_open);
+if (info != block_list) {
+monitor_printf(mon, "\n");
+}
+
+monitor_printf(mon, "%s", info->value->device);
+if (info->value->has_inserted) {
+monitor_printf(mon, ": %s (%s%s%s)\n",
+   info->value->inserted->file,
+   info->value->inserted->drv,
+   info->value->inserted->ro ? ", read-only" : "",
+   info->value->inserted->encrypted ? ", encrypted" : 
"");
+} else {
+monitor_printf(mon, ": [not inserted]\n");
 }
 
-if (info->value->has_io_status) {
-monitor_printf(mon, " io-status=%s",
+if (info->value->has_io_status && info->value->io_status != 
BLOCK_DEVICE_IO_STATUS_OK) {
+monitor_printf(mon, "I/O status:   %s\n",
BlockDeviceIoStatus_lookup[info->value->io_status]);
 }
 
-if (info->value->has_inserted) {
-monitor_printf(mon, " file=");
-monitor_print_filename(mon, info->value->inserted->file);
-
-if (info->value->inserted->has_backing_file) {
-monitor_printf(mon, " backing_file=");
-monitor_print_filename(mon, 
info->value->inserted->backing_file);
-monitor_printf(mon, " backing_file_depth=%" PRId64,
-info->value->inserted->backing_file_depth);
-}
-monitor_printf(mon, " ro=%d drv=%s encrypted=%d",
-   info->value->inserted->ro,
-   info->value->inserted->drv,
-   info->value->inserted->encrypted);
+if (info->value->removable) {
+monitor_printf(mon, "Removable device: %slocked, tray %s\n",
+   info->value->locked ? "" : "not ",
+   info->value->tray_open ? "open" : "closed");
+}
 
-monitor_printf(mon, " bps=%" PRId64 " bps_rd=%" PRId64
-" bps_wr=%" PRId64 " iops=%" PRId64
-" iops_rd=%" PRId64 " iops_wr=%" PRId64,
+
+if (!info->value->has_inserted) {
+continue;
+}
+
+if (info->value->inserted->has_backing_file) {
+monitor_printf(mon,
+   "Backing file: %s "
+   "(chain depth: %" PRId64 ")\n",
+   info->value->inserted->backing_file,
+   info->value->inserted->backing_file_depth);
+}
+
+if (info->value->inserted->bps
+|| info->value->inserted->bps_rd
+|| info->value->inserted->bps_wr
+|| info->value->inserted->iops
+

[Qemu-devel] [PULL] QMP queue

2013-06-28 Thread Luiz Capitulino

The following changes since commit 36125631e79d53ffb9365740f43f386e2171d116:

  Merge remote-tracking branch 'kwolf/for-anthony' into staging (2013-06-28 
10:37:34 -0500)

are available in the git repository at:


  git://repo.or.cz/qemu/qmp-unstable.git queue/qmp

for you to fetch changes up to fbe2e26c15af35e4d157874dc80f6a19eebaa83b:

  hmp: Make "info block" output more readable (2013-06-28 16:14:39 -0400)


Kevin Wolf (1):
  hmp: Make "info block" output more readable

 hmp.c | 94 +++
 1 file changed, 55 insertions(+), 39 deletions(-)

-- 
1.8.1.4

Re: [Qemu-devel] [PATCH 05/30] exec: do not use qemu/tls.h

2013-06-28 Thread Anthony Liguori

Paolo Bonzini  writes:

> The next patch will change qemu/tls.h to support more platforms, but at
> some performance cost.  Declare cpu_single_env directly instead of using
> the tls.h abstractions.
>
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori

> ---
>  exec.c | 10 --
>  include/exec/cpu-all.h | 14 +++---
>  include/qemu/tls.h | 52 
> --
>  3 files changed, 19 insertions(+), 57 deletions(-)
>  delete mode 100644 include/qemu/tls.h
>
> diff --git a/exec.c b/exec.c
> index d28403b..a788981 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -70,9 +70,15 @@ static MemoryRegion io_mem_unassigned;
>  #endif
>  
>  CPUArchState *first_cpu;
> +
>  /* current CPU in the current thread. It is only valid inside
> -   cpu_exec() */
> -DEFINE_TLS(CPUArchState *,cpu_single_env);
> + * cpu_exec().  See comment in include/exec/cpu-all.h.  */
> +#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined 
> CONFIG_USE_NPTL)
> +__thread CPUArchState *cpu_single_env;
> +#else
> +CPUArchState *cpu_single_env;
> +#endif
> +
>  /* 0 = Do not count executed instructions.
> 1 = Precise instruction counting.
> 2 = Adaptive rate instruction counting.  */
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index e9c3717..2202ba3 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -20,7 +20,6 @@
>  #define CPU_ALL_H
>  
>  #include "qemu-common.h"
> -#include "qemu/tls.h"
>  #include "exec/cpu-common.h"
>  #include "qemu/thread.h"
>  
> @@ -368,8 +367,17 @@ void cpu_dump_statistics(CPUArchState *env, FILE *f, 
> fprintf_function cpu_fprint
>  void QEMU_NORETURN cpu_abort(CPUArchState *env, const char *fmt, ...)
>  GCC_FMT_ATTR(2, 3);
>  extern CPUArchState *first_cpu;
> -DECLARE_TLS(CPUArchState *,cpu_single_env);
> -#define cpu_single_env tls_var(cpu_single_env)
> +
> +/* This is thread-local depending on __linux__ because:
> + *  - the only -user mode supporting multiple VCPU threads is linux-user
> + *  - TCG system mode is single-threaded regarding VCPUs
> + *  - KVM system mode is multi-threaded but limited to Linux
> + */
> +#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined 
> CONFIG_USE_NPTL)
> +extern __thread CPUArchState *cpu_single_env;
> +#else
> +extern CPUArchState *cpu_single_env;
> +#endif
>  
>  /* Flags for use in ENV->INTERRUPT_PENDING.
>  
> diff --git a/include/qemu/tls.h b/include/qemu/tls.h
> deleted file mode 100644
> index b92ea9d..000
> --- a/include/qemu/tls.h
> +++ /dev/null
> @@ -1,52 +0,0 @@
> -/*
> - * Abstraction layer for defining and using TLS variables
> - *
> - * Copyright (c) 2011 Red Hat, Inc
> - * Copyright (c) 2011 Linaro Limited
> - *
> - * Authors:
> - *  Paolo Bonzini 
> - *  Peter Maydell 
> - *
> - * This program is free software; you can redistribute it and/or
> - * modify it under the terms of the GNU General Public License as
> - * published by the Free Software Foundation; either version 2 of
> - * the License, or (at your option) any later version.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License along
> - * with this program; if not, see .
> - */
> -
> -#ifndef QEMU_TLS_H
> -#define QEMU_TLS_H
> -
> -/* Per-thread variables. Note that we only have implementations
> - * which are really thread-local on Linux; the dummy implementations
> - * define plain global variables.
> - *
> - * This means that for the moment use should be restricted to
> - * per-VCPU variables, which are OK because:
> - *  - the only -user mode supporting multiple VCPU threads is linux-user
> - *  - TCG system mode is single-threaded regarding VCPUs
> - *  - KVM system mode is multi-threaded but limited to Linux
> - *
> - * TODO: proper implementations via Win32 .tls sections and
> - * POSIX pthread_getspecific.
> - */
> -#ifdef __linux__
> -#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
> -#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
> -#define tls_var(x)   tls__##x
> -#else
> -/* Dummy implementations which define plain global variables */
> -#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
> -#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
> -#define tls_var(x)   tls__##x
> -#endif
> -
> -#endif
> -- 
> 1.8.1.4

Re: [Qemu-devel] [PATCH 04/30] add a header file for atomic operations

2013-06-28 Thread Anthony Liguori

Paolo Bonzini  writes:

> We're already using them in several places, but __sync builtins are just
> too ugly to type, and do not provide seqcst load/store operations.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  docs/atomics.txt | 345 
> +++
>  hw/display/qxl.c |   3 +-
>  hw/virtio/vhost.c|   9 +-
>  include/qemu/atomic.h| 190 +-
>  migration.c  |   3 +-
>  tests/test-thread-pool.c |   8 +-
>  6 files changed, 514 insertions(+), 44 deletions(-)
>  create mode 100644 docs/atomics.txt
>
> diff --git a/docs/atomics.txt b/docs/atomics.txt
> new file mode 100644
> index 000..e2ce04b
> --- /dev/null
> +++ b/docs/atomics.txt

Really nice write-up!

> diff --git a/hw/display/qxl.c b/hw/display/qxl.c
> index 3862d7a..f24cb4e 100644
> --- a/hw/display/qxl.c
> +++ b/hw/display/qxl.c
> @@ -23,6 +23,7 @@
>  #include "qemu-common.h"
>  #include "qemu/timer.h"
>  #include "qemu/queue.h"
> +#include "qemu/atomic.h"
>  #include "monitor/monitor.h"
>  #include "sysemu/sysemu.h"
>  #include "trace.h"
> @@ -1726,7 +1727,7 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t 
> events)
>  trace_qxl_send_events_vm_stopped(d->id, events);
>  return;
>  }
> -old_pending = __sync_fetch_and_or(&d->ram->int_pending, le_events);
> +old_pending = atomic_or(&d->ram->int_pending, le_events);
>  if ((old_pending & le_events) == le_events) {
>  return;
>  }
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 96ab625..8f6ab13 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include "hw/virtio/vhost.h"
>  #include "hw/hw.h"
> +#include "qemu/atomic.h"
>  #include "qemu/range.h"
>  #include 
>  #include "exec/address-spaces.h"
> @@ -47,11 +48,9 @@ static void vhost_dev_sync_region(struct vhost_dev *dev,
>  addr += VHOST_LOG_CHUNK;
>  continue;
>  }
> -/* Data must be read atomically. We don't really
> - * need the barrier semantics of __sync
> - * builtins, but it's easier to use them than
> - * roll our own. */
> -log = __sync_fetch_and_and(from, 0);
> +/* Data must be read atomically. We don't really need barrier 
> semantics
> + * but it's easier to use atomic_* than roll our own. */
> +log = atomic_xchg(from, 0);
>  while ((bit = sizeof(log) > sizeof(int) ?
>  ffsll(log) : ffs(log))) {
>  hwaddr page_addr;
> diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
> index 10becb6..04d64d0 100644
> --- a/include/qemu/atomic.h
> +++ b/include/qemu/atomic.h
> @@ -1,68 +1,194 @@
> -#ifndef __QEMU_BARRIER_H
> -#define __QEMU_BARRIER_H 1
> +/*
> + * Simple interface for atomic operations.
> + *
> + * Copyright (C) 2013 Red Hat, Inc.
> + *
> + * Author: Paolo Bonzini 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
>  
> -/* Compiler barrier */
> -#define barrier()   asm volatile("" ::: "memory")
> +#ifndef __QEMU_ATOMIC_H
> +#define __QEMU_ATOMIC_H 1
>  
> -#if defined(__i386__)
> +#include "qemu/compiler.h"
>  
> -#include "qemu/compiler.h"/* QEMU_GNUC_PREREQ */
> +/* For C11 atomic ops */
>  
> -/*
> - * Because of the strongly ordered x86 storage model, wmb() and rmb() are 
> nops
> - * on x86(well, a compiler barrier only).  Well, at least as long as
> - * qemu doesn't do accesses to write-combining memory or non-temporal
> - * load/stores from C code.
> - */
> -#define smp_wmb()   barrier()
> -#define smp_rmb()   barrier()
> +/* Compiler barrier */
> +#define barrier()   ({ asm volatile("" ::: "memory"); (void)0; })
> +
> +#ifndef __ATOMIC_RELAXED
>  
>  /*
> - * We use GCC builtin if it's available, as that can use
> - * mfence on 32 bit as well, e.g. if built with -march=pentium-m.
> - * However, on i386, there seem to be known bugs as recently as 4.3.
> - * */
> -#if QEMU_GNUC_PREREQ(4, 4)
> -#define smp_mb() __sync_synchronize()
> + * We use GCC builtin if it's available, as that can use mfence on
> + * 32-bit as well, e.g. if built with -march=pentium-m. However, on
> + * i386 the spec is buggy, and the implementation followed it until
> + * 4.3 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36793).
> + */
> +#if defined(__i386__) || defined(__x86_64__)
> +#if !QEMU_GNUC_PREREQ(4, 4)
> +#if defined __x86_64__
> +#define smp_mb()({ asm volatile("mfence" ::: "memory"); (void)0; })
>  #else
> -#define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
> +#define smp_mb()({ asm volatile("lock; addl $0,0(%%esp) " ::: "memory"); 
> (void)0; })
> +#endif
> +#endif
>  #endif
>  
> -#elif defined(__x86_64__)
>  
> +#ifdef __alpha__
> +#define smp_read_barrier_depends()   asm volatile("mb":::"memory")
> +#endif
> +
> +#if defined(__i386__) || defined(__x86_64__) || define

Re: [Qemu-devel] [PATCH 1/2] Refine and export infinite loop checking in collect_image_info_list()

2013-06-28 Thread Eric Blake

On 06/27/2013 01:38 AM, Xu Wang wrote:
> From: Xu Wang 
> 
> Signed-off-by: Xu Wang 
> ---
>  qemu-img.c | 110 
> +
>  1 file changed, 89 insertions(+), 21 deletions(-)
> 

> +/**
> + * Check backing file chain if there is a loop in it and build list of
> + * ImageInfo if needed.
> + *
> + * @filename: topmost image filename

absolute? relative?

> + * @fmt: topmost image format (may be NULL to autodetect)
> + * @head: head of ImageInfo list. If not need please set head to null.
> + * @chain: true  - enumerate entire backing file chain
> + * false - only topmost image file
> + * @backing_file: if this value is set, filename will insert into hash
> + *table directly. open and seek backing file start from it.
> + *
> + * If return true, stands for a backing file loop exists or some error
> + * happend. If return false, everything is ok.

s/happend/happened/

> + */
> +static bool backing_file_loop_check(const char *filename, const char *fmt,
> + bool chain, const char *backing_file) {

Indentation is off.

> +GHashTable *filenames;
> +BlockDriverState *bs;
> +ImageInfo *info;
> +Error *err = NULL;
> +
> +filenames = g_hash_table_new_full(g_str_hash, str_equal_func, NULL, 
> NULL);
> +
> +/* If backing file exists, filename will insert into hash table and seek
> + * the whole backing file chain from @backing_file.
> + */
> +if (backing_file) {
> +g_hash_table_insert(filenames, (gpointer)filename, NULL);

Does this have any false positives (perhaps mishandling due to relative
names) or false negatives (perhaps hard links allow different spellings
of the same file to create a loop, although the difference in names
won't indicate the problem)?  I'd really like to see you add a testcase
before this patch gets committed, although I agree that a patch along
these lines is worthwhile.  For example, make sure the following chain
is not rejected:

/dir1/base.img <- /dir1/wrap.img(relative backing 'base.img') <-
/dir2/base.img (absolute backing '/dir1/base.img') <-
/dir2/wrap.img(relative backing 'base.img')

whether opened in /dir2/ via relative name 'wrap.img' or absolute name
'/dir2/wrap.img'.  Likewise, make sure you can detect this loop:

create directory 'dir'
create './dir/b.img'
create './b.img' with relative backing 'dir/b.img'
remove ./dir/b.img and dir
ln -s . dir
now 'b.img' refers to itself as backing file, even though the names
./b.img and ./dir/b.img are not equal by strcmp.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v5 14/14] Implement dimm-info

2013-06-28 Thread Eric Blake

On 06/26/2013 03:13 AM, Hu Tao wrote:
> From: Vasilis Liaskovitis 
> 
> "query-dimm-info" and "info dimm" will give current state of all dimms in the
> system e.g.
> 

> +++ b/qapi-schema.json
> @@ -3621,3 +3621,30 @@
>  { 'type': 'MemoryInfo',
>'data': { 'total': 'int' } }
>  { 'command': 'query-memory', 'returns': 'MemoryInfo' }
> +
> +##
> +# @DimmInfo:
> +#
> +# Information about status of dimm device
> +#
> +# @dimm: the name of the dimm
> +#
> +# @state: 'true' means the dimm device is plugged in, 'false' means
> +# means the dimm device is plugged out.

s/plugged out/unplugged/

> +#
> +# Since: 1.6
> +#
> +##
> +{ 'type': 'DimmInfo',
> +  'data': {'dimm': 'str', 'state': 'bool'} }
> +
> +##
> +# @query-dimm-info:
> +#
> +# Returns the state of dimm devices
> +#
> +# Returns: list of @DimmInfo
> +#
> +# Since: 1.6
> +##
> +{ 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }

Looks good from an interface perspective; but missing an example in
qmp-commands.hx.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v5 12/14] Implement "info memory" and "query-memory"

2013-06-28 Thread Eric Blake

On 06/26/2013 03:13 AM, Hu Tao wrote:
> From: Vasilis Liaskovitis 
> 
> Returns total physical memory available to guest in bytes, including 
> hotplugged
> memory. Note that the number reported here may be different from what the 
> guest
> sees e.g. if the guest has not logically onlined hotplugged memory.
> 
> This functionality is provided independently of a balloon device, since a
> guest can be using ACPI memory hotplug without using a balloon device.
> 
> Signed-off-by: Vasilis Liaskovitis 
> Signed-off-by: Hu Tao 
> ---

> +++ b/qapi-schema.json
> @@ -3608,3 +3608,16 @@
>  '*cpuid-input-ecx': 'int',
>  'cpuid-register': 'X86CPURegister32',
>  'features': 'int' } }
> +
> +##
> +# @query-memory:
> +#
> +# Returns total memory in bytes, including hotplugged dimms
> +#
> +# Returns: int

Not true - it returns a struct of useful memory statistics (of which,
your initial implementation of the MemoryInfo has only one statistic).
This struct is free to grow larger in the future, if we determine other
useful things to return.

> +#
> +# Since: 1.6
> +##
> +{ 'type': 'MemoryInfo',
> +  'data': { 'total': 'int' } }
> +{ 'command': 'query-memory', 'returns': 'MemoryInfo' }

You've created two items (the 'MemoryInfo' type and the 'query-memory'
command), so you should have two pieces of documentation (yes, I know
there are pre-existing cases of undocumented types that are declared
next to the sole command that uses them, but we should avoid that where
possible).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 1/8] rdma: update documentation to reflect new unpin support

2013-06-28 Thread Michael R. Hines


On 06/28/2013 04:14 PM, Eric Blake wrote:

On 06/28/2013 01:59 PM, mrhi...@linux.vnet.ibm.com wrote:

From: "Michael R. Hines" 

As requested, the protocol now includes memory unpinning support.
This has been implemented in a non-optimized manner, in such a way
that one could devise an LRU or other workload-specific information
on top of the basic mechanism to influence the way unpinning happens
during runtime.

The feature is not yet user-facing, and is thus can only be enable

s/enable/enabled/


at compile-time.

Signed-off-by: Michael R. Hines 
---
  docs/rdma.txt |   51 ++-
  1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/docs/rdma.txt b/docs/rdma.txt
index 45a4b1d..f3083fd 100644
--- a/docs/rdma.txt
+++ b/docs/rdma.txt
@@ -35,7 +35,7 @@ memory tracked during each live migration iteration round 
cannot keep pace
  with the rate of dirty memory produced by the workload.
  
  RDMA currently comes in two flavors: both Ethernet based (RoCE, or RDMA

-over Convered Ethernet) as well as Infiniband-based. This implementation of
+over Converged Ethernet) as well as Infiniband-based. This implementation of
  migration using RDMA is capable of using both technologies because of
  the use of the OpenFabrics OFED software stack that abstracts out the
  programming model irrespective of the underlying hardware.
@@ -188,9 +188,9 @@ header portion and a data portion (but together are 
transmitted
  as a single SEND message).
  
  Header:

-* Length  (of the data portion, uint32, network byte order)
-* Type(what command to perform, uint32, network byte order)
-* Repeat  (Number of commands in data portion, same type only)
+* Length   (of the data portion, uint32, network byte order)
+* Type (what command to perform, uint32, network byte 
order)
+* Repeat   (Number of commands in data portion, same type only)

Perhaps worth splitting into two patches, trivial typo/format fixes vs.
new content?  But I won't insist, as anyone backporting rdma to an older
branch will pick up all related rdma patches, rather than stopping at
just the initial implementation.


I don't mind resending - it's a quick "git am" followed by "git commit 
--amend".



+ 8. Register request(dynamic chunk registration)
+ 9. Register result ('rkey' to be used by sender)
+10. Register finished  (registration for current iteration 
finished)
+11. Unregister request (unpin previously registered memory)

Alignment looks off :)



At any rate, touching that up is trivial enough that I don't mind if you
add: Reviewed-by: Eric Blake 


Thanks, Eric.

Re: [Qemu-devel] [PATCH v2 1/8] rdma: update documentation to reflect new unpin support

2013-06-28 Thread Eric Blake

On 06/28/2013 01:59 PM, mrhi...@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" 
> 
> As requested, the protocol now includes memory unpinning support.
> This has been implemented in a non-optimized manner, in such a way
> that one could devise an LRU or other workload-specific information
> on top of the basic mechanism to influence the way unpinning happens
> during runtime.
> 
> The feature is not yet user-facing, and is thus can only be enable

s/enable/enabled/

> at compile-time.
> 
> Signed-off-by: Michael R. Hines 
> ---
>  docs/rdma.txt |   51 ++-
>  1 file changed, 30 insertions(+), 21 deletions(-)
> 
> diff --git a/docs/rdma.txt b/docs/rdma.txt
> index 45a4b1d..f3083fd 100644
> --- a/docs/rdma.txt
> +++ b/docs/rdma.txt
> @@ -35,7 +35,7 @@ memory tracked during each live migration iteration round 
> cannot keep pace
>  with the rate of dirty memory produced by the workload.
>  
>  RDMA currently comes in two flavors: both Ethernet based (RoCE, or RDMA
> -over Convered Ethernet) as well as Infiniband-based. This implementation of
> +over Converged Ethernet) as well as Infiniband-based. This implementation of
>  migration using RDMA is capable of using both technologies because of
>  the use of the OpenFabrics OFED software stack that abstracts out the
>  programming model irrespective of the underlying hardware.
> @@ -188,9 +188,9 @@ header portion and a data portion (but together are 
> transmitted
>  as a single SEND message).
>  
>  Header:
> -* Length  (of the data portion, uint32, network byte order)
> -* Type(what command to perform, uint32, network byte order)
> -* Repeat  (Number of commands in data portion, same type only)
> +* Length   (of the data portion, uint32, network byte order)
> +* Type (what command to perform, uint32, network byte 
> order)
> +* Repeat   (Number of commands in data portion, same type 
> only)

Perhaps worth splitting into two patches, trivial typo/format fixes vs.
new content?  But I won't insist, as anyone backporting rdma to an older
branch will pick up all related rdma patches, rather than stopping at
just the initial implementation.

> + 8. Register request(dynamic chunk registration)
> + 9. Register result ('rkey' to be used by sender)
> +10. Register finished  (registration for current iteration 
> finished)
> +11. Unregister request (unpin previously registered memory)

Alignment looks off :)

At any rate, touching that up is trivial enough that I don't mind if you
add: Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 01/30] memory: access FlatView from a local variable

2013-06-28 Thread Anthony Liguori

Paolo Bonzini  writes:

> We will soon require accesses to as->current_map to be placed under
> a lock (with reference counting so as to keep the critical section
> small).  To simplify this change, always fetch as->current_map into
> a local variable and access it through that variable.
>
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori

> ---
>  memory.c | 31 +--
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/memory.c b/memory.c
> index 688c817..1f44cd1 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -578,13 +578,15 @@ static void 
> address_space_add_del_ioeventfds(AddressSpace *as,
>  
>  static void address_space_update_ioeventfds(AddressSpace *as)
>  {
> +FlatView *view;
>  FlatRange *fr;
>  unsigned ioeventfd_nb = 0;
>  MemoryRegionIoeventfd *ioeventfds = NULL;
>  AddrRange tmp;
>  unsigned i;
>  
> -FOR_EACH_FLAT_RANGE(fr, as->current_map) {
> +view = as->current_map;
> +FOR_EACH_FLAT_RANGE(fr, view) {
>  for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
>  tmp = addrrange_shift(fr->mr->ioeventfds[i].addr,
>int128_sub(fr->addr.start,
> @@ -1142,7 +1144,8 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr)
>  FlatRange *fr;
>  
>  QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
> -FOR_EACH_FLAT_RANGE(fr, as->current_map) {
> +FlatView *view = as->current_map;
> +FOR_EACH_FLAT_RANGE(fr, view) {
>  if (fr->mr == mr) {
>  MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
>  }
> @@ -1192,12 +1195,14 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
>  
>  static void memory_region_update_coalesced_range_as(MemoryRegion *mr, 
> AddressSpace *as)
>  {
> +FlatView *view;
>  FlatRange *fr;
>  CoalescedMemoryRange *cmr;
>  AddrRange tmp;
>  MemoryRegionSection section;
>  
> -FOR_EACH_FLAT_RANGE(fr, as->current_map) {
> +view = as->current_map;
> +FOR_EACH_FLAT_RANGE(fr, view) {
>  if (fr->mr == mr) {
>  section = (MemoryRegionSection) {
>  .address_space = as,
> @@ -1488,9 +1493,9 @@ static int cmp_flatrange_addr(const void *addr_, const 
> void *fr_)
>  return 0;
>  }
>  
> -static FlatRange *address_space_lookup(AddressSpace *as, AddrRange addr)
> +static FlatRange *flatview_lookup(FlatView *view, AddrRange addr)
>  {
> -return bsearch(&addr, as->current_map->ranges, as->current_map->nr,
> +return bsearch(&addr, view->ranges, view->nr,
> sizeof(FlatRange), cmp_flatrange_addr);
>  }
>  
> @@ -1501,6 +1506,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
>  MemoryRegion *root;
>  AddressSpace *as;
>  AddrRange range;
> +FlatView *view;
>  FlatRange *fr;
>  
>  addr += mr->addr;
> @@ -1511,13 +1517,14 @@ MemoryRegionSection memory_region_find(MemoryRegion 
> *mr,
>  
>  as = memory_region_to_address_space(root);
>  range = addrrange_make(int128_make64(addr), int128_make64(size));
> -fr = address_space_lookup(as, range);
> +
> +view = as->current_map;
> +fr = flatview_lookup(view, range);
>  if (!fr) {
>  return ret;
>  }
>  
> -while (fr > as->current_map->ranges
> -   && addrrange_intersects(fr[-1].addr, range)) {
> +while (fr > view->ranges && addrrange_intersects(fr[-1].addr, range)) {
>  --fr;
>  }
>  
> @@ -1537,9 +1544,11 @@ MemoryRegionSection memory_region_find(MemoryRegion 
> *mr,
>  
>  void address_space_sync_dirty_bitmap(AddressSpace *as)
>  {
> +FlatView *view;
>  FlatRange *fr;
>  
> -FOR_EACH_FLAT_RANGE(fr, as->current_map) {
> +view = as->current_map;
> +FOR_EACH_FLAT_RANGE(fr, view) {
>  MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
>  }
>  }
> @@ -1559,6 +1568,7 @@ void memory_global_dirty_log_stop(void)
>  static void listener_add_address_space(MemoryListener *listener,
> AddressSpace *as)
>  {
> +FlatView *view;
>  FlatRange *fr;
>  
>  if (listener->address_space_filter
> @@ -1572,7 +1582,8 @@ static void listener_add_address_space(MemoryListener 
> *listener,
>  }
>  }
>  
> -FOR_EACH_FLAT_RANGE(fr, as->current_map) {
> +view = as->current_map;
> +FOR_EACH_FLAT_RANGE(fr, view) {
>  MemoryRegionSection section = {
>  .mr = fr->mr,
>  .address_space = as,
> -- 
> 1.8.1.4

Re: [Qemu-devel] [PATCH 03/30] memory: add reference counting to FlatView

2013-06-28 Thread Anthony Liguori

Paolo Bonzini  writes:

> With this change, a FlatView can be used even after a concurrent
> update has replaced it.  Because we do not have RCU, we use a
> mutex to protect the small critical sections that read/write the
> as->current_map pointer.  Accesses to the FlatView can be done
> outside the mutex.
>
> If a MemoryRegion will be used after the FlatView is unref-ed (or after
> a MemoryListener callback is returned), a reference has to be added to
> that MemoryRegion.  For example, memory_region_find adds a reference to
> the MemoryRegion that it returns.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  memory.c | 79 
> 
>  1 file changed, 69 insertions(+), 10 deletions(-)
>
> diff --git a/memory.c b/memory.c
> index 319894e..bb92e17 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -29,12 +29,26 @@ static unsigned memory_region_transaction_depth;
>  static bool memory_region_update_pending;
>  static bool global_dirty_log = false;
>  
> +/* flat_view_mutex is taken around reading as->current_map; the critical
> + * section is extremely short, so I'm using a single mutex for every AS.
> + * We could also RCU for the read-side.
> + *
> + * The BQL is taken around transaction commits, hence both locks are taken
> + * while writing to as->current_map (with the BQL taken outside).
> + */
> +static QemuMutex flat_view_mutex;
> +
>  static QTAILQ_HEAD(memory_listeners, MemoryListener) memory_listeners
>  = QTAILQ_HEAD_INITIALIZER(memory_listeners);
>  
>  static QTAILQ_HEAD(, AddressSpace) address_spaces
>  = QTAILQ_HEAD_INITIALIZER(address_spaces);
>  
> +static void memory_init(void)
> +{
> +qemu_mutex_init(&flat_view_mutex);
> +}
> +
>  typedef struct AddrRange AddrRange;
>  
>  /*
> @@ -225,6 +239,7 @@ struct FlatRange {
>   * order.
>   */
>  struct FlatView {
> +unsigned ref;
>  FlatRange *ranges;
>  unsigned nr;
>  unsigned nr_allocated;
> @@ -246,6 +261,7 @@ static bool flatrange_equal(FlatRange *a, FlatRange *b)
>  
>  static void flatview_init(FlatView *view)
>  {
> +view->ref = 1;
>  view->ranges = NULL;
>  view->nr = 0;
>  view->nr_allocated = 0;
> @@ -279,6 +295,18 @@ static void flatview_destroy(FlatView *view)
>  g_free(view);
>  }
>  
> +static void flatview_ref(FlatView *view)
> +{
> +__sync_fetch_and_add(&view->ref, 1);
> +}
> +
> +static void flatview_unref(FlatView *view)
> +{
> +if (__sync_fetch_and_sub(&view->ref, 1) == 1) {
> +flatview_destroy(view);
> +}
> +}
> +
>  static bool can_merge(FlatRange *r1, FlatRange *r2)
>  {
>  return int128_eq(addrrange_end(r1->addr), r2->addr.start)
> @@ -578,6 +606,17 @@ static void 
> address_space_add_del_ioeventfds(AddressSpace *as,
>  }
>  }
>  
> +static FlatView *address_space_get_flatview(AddressSpace *as)
> +{
> +FlatView *view;
> +
> +qemu_mutex_lock(&flat_view_mutex);
> +view = as->current_map;
> +flatview_ref(view);
> +qemu_mutex_unlock(&flat_view_mutex);
> +return view;
> +}
> +
>  static void address_space_update_ioeventfds(AddressSpace *as)
>  {
>  FlatView *view;
> @@ -587,7 +626,7 @@ static void address_space_update_ioeventfds(AddressSpace 
> *as)
>  AddrRange tmp;
>  unsigned i;
>  
> -view = as->current_map;
> +view = address_space_get_flatview(as);
>  FOR_EACH_FLAT_RANGE(fr, view) {
>  for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
>  tmp = addrrange_shift(fr->mr->ioeventfds[i].addr,
> @@ -609,6 +648,7 @@ static void address_space_update_ioeventfds(AddressSpace 
> *as)
>  g_free(as->ioeventfds);
>  as->ioeventfds = ioeventfds;
>  as->ioeventfd_nb = ioeventfd_nb;
> +flatview_unref(view);
>  }
>  
>  static void address_space_update_topology_pass(AddressSpace *as,
> @@ -676,14 +716,25 @@ static void 
> address_space_update_topology_pass(AddressSpace *as,
>  
>  static void address_space_update_topology(AddressSpace *as)
>  {
> -FlatView *old_view = as->current_map;
> +FlatView *old_view = address_space_get_flatview(as);
>  FlatView *new_view = generate_memory_topology(as->root);
>  
>  address_space_update_topology_pass(as, old_view, new_view, false);
>  address_space_update_topology_pass(as, old_view, new_view, true);
>  
> +qemu_mutex_lock(&flat_view_mutex);
> +flatview_unref(as->current_map);
>  as->current_map = new_view;
> -flatview_destroy(old_view);
> +qemu_mutex_unlock(&flat_view_mutex);
> +
> +/* Note that all the old MemoryRegions are still alive up to this
> + * point.  This relieves most MemoryListeners from the need to
> + * ref/unref the MemoryRegions they get---unless they use them
> + * outside the iothread mutex, in which case precise reference
> + * counting is necessary.
> + */
> +flatview_unref(old_view);
> +
>  address_space_update_ioeventfds(as);
>  }
>  
> @@ -1146,12 +1197,13 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr

[Qemu-devel] [PATCH v2 1/8] rdma: update documentation to reflect new unpin support

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

As requested, the protocol now includes memory unpinning support.
This has been implemented in a non-optimized manner, in such a way
that one could devise an LRU or other workload-specific information
on top of the basic mechanism to influence the way unpinning happens
during runtime.

The feature is not yet user-facing, and is thus can only be enable
at compile-time.

Signed-off-by: Michael R. Hines 
---
 docs/rdma.txt |   51 ++-
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/docs/rdma.txt b/docs/rdma.txt
index 45a4b1d..f3083fd 100644
--- a/docs/rdma.txt
+++ b/docs/rdma.txt
@@ -35,7 +35,7 @@ memory tracked during each live migration iteration round 
cannot keep pace
 with the rate of dirty memory produced by the workload.
 
 RDMA currently comes in two flavors: both Ethernet based (RoCE, or RDMA
-over Convered Ethernet) as well as Infiniband-based. This implementation of
+over Converged Ethernet) as well as Infiniband-based. This implementation of
 migration using RDMA is capable of using both technologies because of
 the use of the OpenFabrics OFED software stack that abstracts out the
 programming model irrespective of the underlying hardware.
@@ -188,9 +188,9 @@ header portion and a data portion (but together are 
transmitted
 as a single SEND message).
 
 Header:
-* Length  (of the data portion, uint32, network byte order)
-* Type(what command to perform, uint32, network byte order)
-* Repeat  (Number of commands in data portion, same type only)
+* Length   (of the data portion, uint32, network byte order)
+* Type (what command to perform, uint32, network byte 
order)
+* Repeat   (Number of commands in data portion, same type only)
 
 The 'Repeat' field is here to support future multiple page registrations
 in a single message without any need to change the protocol itself
@@ -202,17 +202,19 @@ The maximum number of repeats is hard-coded to 4096. This 
is a conservative
 limit based on the maximum size of a SEND message along with emperical
 observations on the maximum future benefit of simultaneous page registrations.
 
-The 'type' field has 10 different command values:
-1. Unused
-2. Error  (sent to the source during bad things)
-3. Ready  (control-channel is available)
-4. QEMU File  (for sending non-live device state)
-5. RAM Blocks request (used right after connection setup)
-6. RAM Blocks result  (used right after connection setup)
-7. Compress page  (zap zero page and skip registration)
-8. Register request   (dynamic chunk registration)
-9. Register result('rkey' to be used by sender)
-10. Register finished  (registration for current iteration finished)
+The 'type' field has 12 different command values:
+ 1. Unused
+ 2. Error   (sent to the source during bad things)
+ 3. Ready   (control-channel is available)
+ 4. QEMU File   (for sending non-live device state)
+ 5. RAM Blocks request  (used right after connection setup)
+ 6. RAM Blocks result   (used right after connection setup)
+ 7. Compress page   (zap zero page and skip registration)
+ 8. Register request(dynamic chunk registration)
+ 9. Register result ('rkey' to be used by sender)
+10. Register finished  (registration for current iteration 
finished)
+11. Unregister request (unpin previously registered memory)
+12. Unregister finished(confirmation that unpin completed)
 
 A single control message, as hinted above, can contain within the data
 portion an array of many commands of the same type. If there is more than
@@ -243,7 +245,7 @@ qemu_rdma_exchange_send(header, data, optional response 
header & data):
from the receiver to tell us that the receiver
is *ready* for us to transmit some new bytes.
 2. Optionally: if we are expecting a response from the command
-   (that we have no yet transmitted), let's post an RQ
+   (that we have not yet transmitted), let's post an RQ
work request to receive that data a few moments later.
 3. When the READY arrives, librdmacm will
unblock us and we immediately post a RQ work request
@@ -293,8 +295,10 @@ librdmacm provides the user with a 'private data' area to 
be exchanged
 at connection-setup time before any infiniband traffic is generated.
 
 Header:
-* Version (protocol version validated before send/recv occurs), uint32, 
network byte order
-* Flags   (bitwise OR of each capability), uint32, network byte order
+* Version (protocol version validated before send/recv occurs), 
+   uint32, network byte order
+* Flags   (bitwise OR of each capability), 
+   uint32, netw

[Qemu-devel] [PATCH v2 4/8] rdma: unpin support

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

As requested, the protocol now includes memory unpinning support.
This has been implemented in a non-optimized manner, in such a way
that one could devise an LRU or other workload-specific information
on top of the basic mechanism to influence the way unpinning happens
during runtime.

The feature is not yet user-facing, and is thus can only be enable
at compile-time.

Signed-off-by: Michael R. Hines 
---
 migration-rdma.c |  143 ++
 1 file changed, 143 insertions(+)

diff --git a/migration-rdma.c b/migration-rdma.c
index 0bd5e23..6218d48 100644
--- a/migration-rdma.c
+++ b/migration-rdma.c
@@ -944,6 +944,132 @@ const char *print_wrid(int wrid)
 return wrid_desc[wrid];
 }
 
+/*
+ * RDMA requires memory registration (mlock/pinning), but this is not good for
+ * overcommitment.
+ *
+ * In preparation for the future where LRU information or workload-specific
+ * writable writable working set memory access behavior is available to QEMU
+ * it would be nice to have in place the ability to UN-register/UN-pin
+ * particular memory regions from the RDMA hardware when it is determine that
+ * those regions of memory will likely not be accessed again in the near 
future.
+ *
+ * While we do not yet have such information right now, the following
+ * compile-time option allows us to perform a non-optimized version of this
+ * behavior.
+ *
+ * By uncommenting this option, you will cause *all* RDMA transfers to be
+ * unregistered immediately after the transfer completes on both sides of the
+ * connection. This has no effect in 'rdma-pin-all' mode, only regular mode.
+ *
+ * This will have a terrible impact on migration performance, so until future
+ * workload information or LRU information is available, do not attempt to use
+ * this feature except for basic testing.
+ */
+//#define RDMA_UNREGISTRATION_EXAMPLE
+
+/*
+ * Perform a non-optimized memory unregistration after every transfer
+ * for demonsration purposes, only if pin-all is not requested.
+ *
+ * Potential optimizations:
+ * 1. Start a new thread to run this function continuously
+- for bit clearing
+- and for receipt of unregister messages
+ * 2. Use an LRU.
+ * 3. Use workload hints.
+ */
+#ifdef RDMA_UNREGISTRATION_EXAMPLE
+static int qemu_rdma_unregister_waiting(RDMAContext *rdma)
+{
+while (rdma->unregistrations[rdma->unregister_current]) {
+int ret;
+uint64_t wr_id = rdma->unregistrations[rdma->unregister_current];
+uint64_t chunk =
+(wr_id & RDMA_WRID_CHUNK_MASK) >> RDMA_WRID_CHUNK_SHIFT;
+uint64_t index =
+(wr_id & RDMA_WRID_BLOCK_MASK) >> RDMA_WRID_BLOCK_SHIFT;
+RDMALocalBlock *block =
+&(rdma->local_ram_blocks.block[index]);
+RDMARegister reg = { .current_index = index };
+RDMAControlHeader resp = { .type = RDMA_CONTROL_UNREGISTER_FINISHED,
+ };
+RDMAControlHeader head = { .len = sizeof(RDMARegister),
+   .type = RDMA_CONTROL_UNREGISTER_REQUEST,
+   .repeat = 1,
+ };
+
+DDPRINTF("Processing unregister for chunk: %" PRIu64 " at position 
%d\n",
+chunk, rdma->unregister_current);
+
+rdma->unregistrations[rdma->unregister_current] = 0;
+rdma->unregister_current++;
+
+if (rdma->unregister_current == RDMA_SIGNALED_SEND_MAX) {
+rdma->unregister_current = 0;
+}
+
+DDPRINTF("Sending unregister for chunk: %" PRIu64 "\n", chunk);
+
+clear_bit(chunk, block->unregister_bitmap);
+
+if (test_bit(chunk, block->transit_bitmap)) {
+DDPRINTF("Cannot unregister inflight chunk: %" PRIu64 "\n", chunk);
+continue;
+}
+
+ret = ibv_dereg_mr(block->pmr[chunk]);
+block->pmr[chunk] = NULL;
+block->remote_keys[chunk] = 0;
+
+if (ret != 0) {
+perror("unregistration chunk failed");
+return -ret;
+}
+rdma->total_registrations--;
+
+reg.key.chunk = chunk;
+register_to_network(®);
+ret = qemu_rdma_exchange_send(rdma, &head, (uint8_t *) ®,
+&resp, NULL, NULL);
+if (ret < 0) {
+return ret;
+}
+
+DDPRINTF("Unregister for chunk: %" PRIu64 " complete.\n", chunk);
+}
+
+return 0;
+}
+
+/*
+ * Set bit for unregistration in the next iteration.
+ * We cannot transmit right here, but will unpin later.
+ */
+static void qemu_rdma_signal_unregister(RDMAContext *rdma, uint64_t index,
+uint64_t chunk, uint64_t wr_id)
+{
+if (rdma->unregistrations[rdma->unregister_next] != 0) {
+fprintf(stderr, "rdma migration: queue is full!\n");
+} else {
+RDMALocalBlock *block = &(rdma->local_ram_blocks.block[index]);
+
+

Re: [Qemu-devel] [PATCH 02/30] memory: use a new FlatView pointer on every topology update

2013-06-28 Thread Anthony Liguori

Paolo Bonzini  writes:

> This is the first step towards converting as->current_map to
> RCU-style updates, where the FlatView updates run concurrently
> with uses of an old FlatView.
>
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori

> ---
>  memory.c | 34 ++
>  1 file changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/memory.c b/memory.c
> index 1f44cd1..319894e 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -276,6 +276,7 @@ static void flatview_destroy(FlatView *view)
>  memory_region_unref(view->ranges[i].mr);
>  }
>  g_free(view->ranges);
> +g_free(view);
>  }
>  
>  static bool can_merge(FlatRange *r1, FlatRange *r2)
> @@ -512,17 +513,18 @@ static void render_memory_region(FlatView *view,
>  }
>  
>  /* Render a memory topology into a list of disjoint absolute ranges. */
> -static FlatView generate_memory_topology(MemoryRegion *mr)
> +static FlatView *generate_memory_topology(MemoryRegion *mr)
>  {
> -FlatView view;
> +FlatView *view;
>  
> -flatview_init(&view);
> +view = g_new(FlatView, 1);
> +flatview_init(view);
>  
>  if (mr) {
> -render_memory_region(&view, mr, int128_zero(),
> +render_memory_region(view, mr, int128_zero(),
>   addrrange_make(int128_zero(), int128_2_64()), 
> false);
>  }
> -flatview_simplify(&view);
> +flatview_simplify(view);
>  
>  return view;
>  }
> @@ -610,8 +612,8 @@ static void address_space_update_ioeventfds(AddressSpace 
> *as)
>  }
>  
>  static void address_space_update_topology_pass(AddressSpace *as,
> -   FlatView old_view,
> -   FlatView new_view,
> +   const FlatView *old_view,
> +   const FlatView *new_view,
> bool adding)
>  {
>  unsigned iold, inew;
> @@ -621,14 +623,14 @@ static void 
> address_space_update_topology_pass(AddressSpace *as,
>   * Kill ranges in the old map, and instantiate ranges in the new map.
>   */
>  iold = inew = 0;
> -while (iold < old_view.nr || inew < new_view.nr) {
> -if (iold < old_view.nr) {
> -frold = &old_view.ranges[iold];
> +while (iold < old_view->nr || inew < new_view->nr) {
> +if (iold < old_view->nr) {
> +frold = &old_view->ranges[iold];
>  } else {
>  frold = NULL;
>  }
> -if (inew < new_view.nr) {
> -frnew = &new_view.ranges[inew];
> +if (inew < new_view->nr) {
> +frnew = &new_view->ranges[inew];
>  } else {
>  frnew = NULL;
>  }
> @@ -674,14 +676,14 @@ static void 
> address_space_update_topology_pass(AddressSpace *as,
>  
>  static void address_space_update_topology(AddressSpace *as)
>  {
> -FlatView old_view = *as->current_map;
> -FlatView new_view = generate_memory_topology(as->root);
> +FlatView *old_view = as->current_map;
> +FlatView *new_view = generate_memory_topology(as->root);
>  
>  address_space_update_topology_pass(as, old_view, new_view, false);
>  address_space_update_topology_pass(as, old_view, new_view, true);
>  
> -*as->current_map = new_view;
> -flatview_destroy(&old_view);
> +as->current_map = new_view;
> +flatview_destroy(old_view);
>  address_space_update_ioeventfds(as);
>  }
>  
> -- 
> 1.8.1.4

[Qemu-devel] [PATCH v2 7/8] rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state transition

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

As described in the previous patch, until now, the MIG_STATE_SETUP
state was not really a 'formal' state. It has been used as a 'zero' state
(what we're calling 'NONE' here) and QEMU has been unconditionally transitioning
into this state when the QMP migration command was called. Instead we want to
introduce MIG_STATE_NONE, which is our starting state in the state machine, and
then immediately transition into the MIG_STATE_SETUP state when the QMP migrate
command is issued.

In order to do this, we must delay the transition into MIG_STATE_ACTIVE until
later in the migration_thread(). This is done to be able to timestamp the 
amount of
time spent in the SETUP state for proper accounting to the user during
an RDMA migration.

Furthermore, the management software, until now, has never been aware of the
existence of the SETUP state whatsoever. This must change, because, timing of 
this
state implies that the state actually exists.

These two patches cannot be separated because the 'query_migrate' QMP
switch statement needs to know how to handle this new state transition.

Reviewed-by: Juan Quintela 
Tested-by: Michael R. Hines 
Signed-off-by: Michael R. Hines 
---
 migration.c |   21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/migration.c b/migration.c
index cbfdb2c..a199a67 100644
--- a/migration.c
+++ b/migration.c
@@ -36,7 +36,8 @@
 #endif
 
 enum {
-MIG_STATE_ERROR,
+MIG_STATE_ERROR = -1,
+MIG_STATE_NONE,
 MIG_STATE_SETUP,
 MIG_STATE_CANCELLED,
 MIG_STATE_ACTIVE,
@@ -63,7 +64,7 @@ static NotifierList migration_state_notifiers =
 MigrationState *migrate_get_current(void)
 {
 static MigrationState current_migration = {
-.state = MIG_STATE_SETUP,
+.state = MIG_STATE_NONE,
 .bandwidth_limit = MAX_THROTTLE,
 .xbzrle_cache_size = DEFAULT_MIGRATE_CACHE_SIZE,
 .mbps = -1,
@@ -184,9 +185,13 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 MigrationState *s = migrate_get_current();
 
 switch (s->state) {
-case MIG_STATE_SETUP:
+case MIG_STATE_NONE:
 /* no migration has happened ever */
 break;
+case MIG_STATE_SETUP:
+info->has_status = true;
+info->status = g_strdup("setup");
+break;
 case MIG_STATE_ACTIVE:
 info->has_status = true;
 info->status = g_strdup("active");
@@ -257,7 +262,7 @@ void 
qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
 MigrationState *s = migrate_get_current();
 MigrationCapabilityStatusList *cap;
 
-if (s->state == MIG_STATE_ACTIVE) {
+if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
 error_set(errp, QERR_MIGRATION_ACTIVE);
 return;
 }
@@ -393,7 +398,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 params.blk = blk;
 params.shared = inc;
 
-if (s->state == MIG_STATE_ACTIVE) {
+if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
 error_set(errp, QERR_MIGRATION_ACTIVE);
 return;
 }
@@ -525,6 +530,8 @@ static void *migration_thread(void *opaque)
 DPRINTF("beginning savevm\n");
 qemu_savevm_state_begin(s->file, &s->params);
 
+migrate_set_state(s, MIG_STATE_SETUP, MIG_STATE_ACTIVE);
+
 while (s->state == MIG_STATE_ACTIVE) {
 int64_t current_time;
 uint64_t pending_size;
@@ -604,8 +611,8 @@ static void *migration_thread(void *opaque)
 
 void migrate_fd_connect(MigrationState *s)
 {
-s->state = MIG_STATE_ACTIVE;
-trace_migrate_set_state(MIG_STATE_ACTIVE);
+s->state = MIG_STATE_SETUP;
+trace_migrate_set_state(MIG_STATE_SETUP);
 
 /* This is a best 1st approximation. ns to ms */
 s->expected_downtime = max_downtime/100;
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 8/8] rdma: account for the time spent in MIG_STATE_SETUP through QMP

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

Using the previous patches, we're now able to timestamp the SETUP
state. Once we have this time, let the user know about it in the
schema.

Reviewed-by: Juan Quintela 
Reviewed-by: Eric Blake 
Signed-off-by: Michael R. Hines 
---
 hmp.c |4 
 include/migration/migration.h |1 +
 migration.c   |9 +
 qapi-schema.json  |9 -
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index 148a3fb..5f52f17 100644
--- a/hmp.c
+++ b/hmp.c
@@ -164,6 +164,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "downtime: %" PRIu64 " milliseconds\n",
info->downtime);
 }
+if (info->has_setup_time) {
+monitor_printf(mon, "setup: %" PRIu64 " milliseconds\n",
+   info->setup_time);
+}
 }
 
 if (info->has_ram) {
diff --git a/include/migration/migration.h b/include/migration/migration.h
index b5e413a..71dbe54 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -49,6 +49,7 @@ struct MigrationState
 int64_t dirty_bytes_rate;
 bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
 int64_t xbzrle_cache_size;
+int64_t setup_time;
 };
 
 void process_incoming_migration(QEMUFile *f);
diff --git a/migration.c b/migration.c
index a199a67..892302a 100644
--- a/migration.c
+++ b/migration.c
@@ -191,6 +191,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 case MIG_STATE_SETUP:
 info->has_status = true;
 info->status = g_strdup("setup");
+info->has_total_time = false;
 break;
 case MIG_STATE_ACTIVE:
 info->has_status = true;
@@ -200,6 +201,8 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 - s->total_time;
 info->has_expected_downtime = true;
 info->expected_downtime = s->expected_downtime;
+info->has_setup_time = true;
+info->setup_time = s->setup_time;
 
 info->has_ram = true;
 info->ram = g_malloc0(sizeof(*info->ram));
@@ -231,6 +234,8 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 info->total_time = s->total_time;
 info->has_downtime = true;
 info->downtime = s->downtime;
+info->has_setup_time = true;
+info->setup_time = s->setup_time;
 
 info->has_ram = true;
 info->ram = g_malloc0(sizeof(*info->ram));
@@ -522,6 +527,7 @@ static void *migration_thread(void *opaque)
 {
 MigrationState *s = opaque;
 int64_t initial_time = qemu_get_clock_ms(rt_clock);
+int64_t setup_start = qemu_get_clock_ms(host_clock);
 int64_t initial_bytes = 0;
 int64_t max_size = 0;
 int64_t start_time = initial_time;
@@ -530,8 +536,11 @@ static void *migration_thread(void *opaque)
 DPRINTF("beginning savevm\n");
 qemu_savevm_state_begin(s->file, &s->params);
 
+s->setup_time = qemu_get_clock_ms(host_clock) - setup_start;
 migrate_set_state(s, MIG_STATE_SETUP, MIG_STATE_ACTIVE);
 
+DPRINTF("setup complete\n");
+
 while (s->state == MIG_STATE_ACTIVE) {
 int64_t current_time;
 uint64_t pending_size;
diff --git a/qapi-schema.json b/qapi-schema.json
index 6590307..7ab4d1a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -578,6 +578,12 @@
 #expected downtime in milliseconds for the guest in last walk
 #of the dirty bitmap. (since 1.3)
 #
+# @setup-time: #optional amount of setup time in milliseconds _before_ the 
+#iterations begin but _after_ the QMP command is issued. This is 
designed 
+#to provide an accounting of any activities (such as RDMA pinning) 
which
+#may be expensive, but do not actually occur during the iterative
+#migration rounds themselves. (since 1.6)
+#
 # Since: 0.14.0
 ##
 { 'type': 'MigrationInfo',
@@ -586,7 +592,8 @@
'*xbzrle-cache': 'XBZRLECacheStats',
'*total-time': 'int',
'*expected-downtime': 'int',
-   '*downtime': 'int'} }
+   '*downtime': 'int',
+   '*setup-time': 'int'} }
 
 ##
 # @query-migrate
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 2/8] rdma: introduce ram_handle_compressed()

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

This gives RDMA shared access to madvise() on the destination side
when an entire chunk is found to be zero.

Reviewed-by: Juan Quintela 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Chegu Vinod 
Tested-by: Chegu Vinod 
Tested-by: Michael R. Hines 
Signed-off-by: Michael R. Hines 
---
 arch_init.c   |   29 +++--
 include/migration/migration.h |2 ++
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index ea9ddad..82657e4 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -777,6 +777,24 @@ static inline void *host_from_stream_offset(QEMUFile *f,
 return NULL;
 }
 
+/*
+ * If a page (or a whole RDMA chunk) has been
+ * determined to be zero, then zap it.
+ */
+void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
+{
+if (ch != 0 || !is_zero_page(host)) {
+memset(host, ch, size);
+#ifndef _WIN32
+if (ch == 0 &&
+(!kvm_enabled() || kvm_has_sync_mmu()) &&
+getpagesize() <= TARGET_PAGE_SIZE) {
+qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
+}
+#endif
+}
+}
+
 static int ram_load(QEMUFile *f, void *opaque, int version_id)
 {
 ram_addr_t addr;
@@ -847,16 +865,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 }
 
 ch = qemu_get_byte(f);
-if (ch != 0 || !is_zero_page(host)) {
-memset(host, ch, TARGET_PAGE_SIZE);
-#ifndef _WIN32
-if (ch == 0 &&
-(!kvm_enabled() || kvm_has_sync_mmu()) &&
-getpagesize() <= TARGET_PAGE_SIZE) {
-qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
-}
-#endif
-}
+ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
 } else if (flags & RAM_SAVE_FLAG_PAGE) {
 void *host;
 
diff --git a/include/migration/migration.h b/include/migration/migration.h
index f0640e0..9d3cc85 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -109,6 +109,8 @@ uint64_t xbzrle_mig_pages_transferred(void);
 uint64_t xbzrle_mig_pages_overflow(void);
 uint64_t xbzrle_mig_pages_cache_miss(void);
 
+void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
+
 /**
  * @migrate_add_blocker - prevent migration from proceeding
  *
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 0/8] rdma: core logic w/ unpin example

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

This version seems ready to go, if there are no fundamental problems.

Changes since v1:
- Complete endianness handling of all protocol messages
- Splitout unpin patch
- ./configure fixes
- Fix documentation

Michael R. Hines (8):
  rdma: update documentation to reflect new unpin support
  rdma: introduce ram_handle_compressed()
  rdma: core logic
  rdma: unpin support
  rdma: send pc.ram
  rdma: allow state transitions between other states besides ACTIVE
  rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state
transition
  rdma: account for the time spent in MIG_STATE_SETUP through QMP

 Makefile.objs |1 +
 arch_init.c   |   62 +-
 configure |   40 +
 docs/rdma.txt |   51 +-
 hmp.c |4 +
 include/migration/migration.h |7 +
 migration-rdma.c  | 3042 +
 migration.c   |   48 +-
 qapi-schema.json  |9 +-
 9 files changed, 3219 insertions(+), 45 deletions(-)
 create mode 100644 migration-rdma.c

-- 
1.7.10.4

[Qemu-devel] [PATCH v2 6/8] rdma: allow state transitions between other states besides ACTIVE

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

This patch is in preparation for the next ones: Until now the MIG_STATE_SETUP
state was not really a 'formal' state. It has been used as a 'zero' state
and QEMU has been unconditionally transitioning into this state when
the QMP migrate command was called. In preparation for timing this state,
we have to make this state a a 'real' state which actually gets transitioned
from later in the migration_thread() from SETUP => ACTIVE, rather than just
automatically dropping into this state at the beginninig of the migration.

This means that the state transition function (migration_finish_set_state())
needs to be capable of transitioning from valid states _other_ than just
MIG_STATE_ACTIVE.

The function is in fact already capable of doing that, but was not allowing the
old state to be a parameter specified as an input.

This patch fixes that and only makes the transition if the current state
matches the old state that the caller intended to transition from.

Reviewed-by: Juan Quintela 
Tested-by: Michael R. Hines 
Signed-off-by: Michael R. Hines 
---
 migration.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/migration.c b/migration.c
index 62c6b85..cbfdb2c 100644
--- a/migration.c
+++ b/migration.c
@@ -295,9 +295,9 @@ static void migrate_fd_cleanup(void *opaque)
 notifier_list_notify(&migration_state_notifiers, s);
 }
 
-static void migrate_finish_set_state(MigrationState *s, int new_state)
+static void migrate_set_state(MigrationState *s, int old_state, int new_state)
 {
-if (__sync_val_compare_and_swap(&s->state, MIG_STATE_ACTIVE,
+if (__sync_val_compare_and_swap(&s->state, old_state,
 new_state) == new_state) {
 trace_migrate_set_state(new_state);
 }
@@ -316,7 +316,7 @@ static void migrate_fd_cancel(MigrationState *s)
 {
 DPRINTF("cancelling migration\n");
 
-migrate_finish_set_state(s, MIG_STATE_CANCELLED);
+migrate_set_state(s, s->state, MIG_STATE_CANCELLED);
 }
 
 void add_migration_state_change_notifier(Notifier *notify)
@@ -546,14 +546,14 @@ static void *migration_thread(void *opaque)
 qemu_savevm_state_complete(s->file);
 qemu_mutex_unlock_iothread();
 if (!qemu_file_get_error(s->file)) {
-migrate_finish_set_state(s, MIG_STATE_COMPLETED);
+migrate_set_state(s, MIG_STATE_ACTIVE, 
MIG_STATE_COMPLETED);
 break;
 }
 }
 }
 
 if (qemu_file_get_error(s->file)) {
-migrate_finish_set_state(s, MIG_STATE_ERROR);
+migrate_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_ERROR);
 break;
 }
 current_time = qemu_get_clock_ms(rt_clock);
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 5/8] rdma: send pc.ram

2013-06-28 Thread mrhines

From: "Michael R. Hines" 

This takes advantages of the previous patches:

1. use the new QEMUFileOps hook 'save_page'

2. call out to the right accessor methods to invoke
   the iteration hooks defined in QEMUFileOps

Reviewed-by: Paolo Bonzini 
Reviewed-by: Chegu Vinod 
Tested-by: Chegu Vinod 
Tested-by: Michael R. Hines 
Signed-off-by: Michael R. Hines 
---
 arch_init.c |   33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 82657e4..b07bc52 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -115,6 +115,7 @@ const uint32_t arch_type = QEMU_ARCH;
 #define RAM_SAVE_FLAG_EOS  0x10
 #define RAM_SAVE_FLAG_CONTINUE 0x20
 #define RAM_SAVE_FLAG_XBZRLE   0x40
+/* 0x80 is reserved in migration.h start with 0x100 next */
 
 
 static struct defconfig_file {
@@ -447,6 +448,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 ram_bulk_stage = false;
 }
 } else {
+int ret;
 uint8_t *p;
 int cont = (block == last_sent_block) ?
 RAM_SAVE_FLAG_CONTINUE : 0;
@@ -455,7 +457,18 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 
 /* In doubt sent page as normal */
 bytes_sent = -1;
-if (is_zero_page(p)) {
+ret = ram_control_save_page(f, block->offset,
+   offset, TARGET_PAGE_SIZE, &bytes_sent);
+
+if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+if (ret != RAM_SAVE_CONTROL_DELAYED) {
+if (bytes_sent > 0) {
+acct_info.norm_pages++;
+} else if (bytes_sent == 0) {
+acct_info.dup_pages++;
+}
+}
+} else if (is_zero_page(p)) {
 acct_info.dup_pages++;
 bytes_sent = save_block_hdr(f, block, offset, cont,
 RAM_SAVE_FLAG_COMPRESS);
@@ -605,6 +618,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 qemu_mutex_unlock_ramlist();
+
+ram_control_before_iterate(f, RAM_CONTROL_SETUP);
+ram_control_after_iterate(f, RAM_CONTROL_SETUP);
+
 qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
 return 0;
@@ -623,6 +640,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 reset_ram_globals();
 }
 
+ram_control_before_iterate(f, RAM_CONTROL_ROUND);
+
 t0 = qemu_get_clock_ns(rt_clock);
 i = 0;
 while ((ret = qemu_file_rate_limit(f)) == 0) {
@@ -653,6 +672,12 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
 qemu_mutex_unlock_ramlist();
 
+/*
+ * Must occur before EOS (or any QEMUFile operation)
+ * because of RDMA protocol.
+ */
+ram_control_after_iterate(f, RAM_CONTROL_ROUND);
+
 if (ret < 0) {
 bytes_transferred += total_sent;
 return ret;
@@ -670,6 +695,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 qemu_mutex_lock_ramlist();
 migration_bitmap_sync();
 
+ram_control_before_iterate(f, RAM_CONTROL_FINISH);
+
 /* try transferring iterative blocks of memory */
 
 /* flush all remaining blocks regardless of rate limiting */
@@ -683,6 +710,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 }
 bytes_transferred += bytes_sent;
 }
+
+ram_control_after_iterate(f, RAM_CONTROL_FINISH);
 migration_end();
 
 qemu_mutex_unlock_ramlist();
@@ -885,6 +914,8 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 ret = -EINVAL;
 goto done;
 }
+} else if (flags & RAM_SAVE_FLAG_HOOK) {
+ram_control_load_hook(f, flags);
 }
 error = qemu_file_get_error(f);
 if (error) {
-- 
1.7.10.4

[Qemu-devel] [PATCH 13/30] qemu-thread: report RCU quiescent states

2013-06-28 Thread Paolo Bonzini

Most threads will use mutexes and other sleeping synchronization primitives
(condition variables, semaphores, events) periodically.  For these threads,
the synchronization primitives are natural places to report a quiescent
state (possibly an extended one).

Signed-off-by: Paolo Bonzini 
---
 docs/rcu.txt | 33 -
 util/qemu-thread-posix.c | 30 ++
 util/qemu-thread-win32.c | 16 +++-
 util/rcu.c   |  3 ---
 4 files changed, 73 insertions(+), 9 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index a3510b9..6c4a852 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -168,6 +168,35 @@ of "quiescent states", i.e. points where no RCU read-side 
critical
 section can be active.  All threads created with qemu_thread_create
 participate in the RCU mechanism and need to annotate such points.
 
+Luckily, in most cases no manual annotation is needed, because waiting
+on condition variables (qemu_cond_wait), semaphores (qemu_sem_wait,
+qemu_sem_timedwait) or events (qemu_event_wait) implicitly marks the thread
+as quiescent for the whole duration of the wait.  (There is an exception
+for semaphore waits with a zero timeout).
+
+Manual annotation is still needed in the following cases:
+
+- threads that spend their sleeping time in the kernel, for example
+  in a call to select(), poll(), sigwait() or WaitForMultipleObjects().
+  The QEMU I/O thread is an example of this case.  When running under
+  KVM, VCPUs are also in a quiescent state while running the guest.
+
+- threads that perform a lot of I/O.  In QEMU, the workers used for
+  aio=thread are an example of this case (see aio_worker in block/raw-*).
+
+- threads that run continuously until they exit.  The migration thread
+  is an example of this case.
+
+Regarding the second case, note that the workers run in the QEMU thread
+pool.  The thread pool uses semaphores for synchronization, hence it does
+report quiescent states periodically.  However, in some cases (e.g. NFS
+mounted with the "hard" option) the workers can take an arbitrarily long
+amount of time.  When this happens, synchronize_rcu() will not exit and
+call_rcu() callbacks will be delayed arbitrarily.  It is therefore a
+good idea to mark I/O system calls as quiescence points in the worker
+functions.
+
+
 Marking quiescent states is done with the following three APIs:
 
  void rcu_quiescent_state(void);
@@ -229,7 +258,9 @@ DIFFERENCES WITH LINUX
   type of the callback's argument to be the type of the first argument.
   call_rcu1 is the same as Linux's call_rcu.
 
-- Quiescent points must be marked explicitly in the thread.
+- Quiescent points must be marked explicitly unless the thread uses
+  condvars/semaphores/events for synchronization.  Note that mutexes
+  do not report quiescent points (see the first item above).
 
 
 RCU PATTERNS
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 2df3382..21190be 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -119,7 +119,9 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 {
 int err;
 
+rcu_thread_offline();
 err = pthread_cond_wait(&cond->cond, &mutex->lock);
+rcu_thread_online();
 if (err)
 error_exit(err, __func__);
 }
@@ -212,6 +214,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 int rc;
 struct timespec ts;
 
+if (ms) {
+rcu_thread_offline();
+}
+
 #if defined(__APPLE__) || defined(__NetBSD__)
 compute_abs_deadline(&ts, ms);
 pthread_mutex_lock(&sem->lock);
@@ -227,7 +233,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 }
 }
 pthread_mutex_unlock(&sem->lock);
-return (rc == ETIMEDOUT ? -1 : 0);
+if (rc == ETIMEDOUT) {
+rc == -1;
+}
+
 #else
 if (ms <= 0) {
 /* This is cheaper than sem_timedwait.  */
@@ -235,7 +244,7 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 rc = sem_trywait(&sem->sem);
 } while (rc == -1 && errno == EINTR);
 if (rc == -1 && errno == EAGAIN) {
-return -1;
+goto out;
 }
 } else {
 compute_abs_deadline(&ts, ms);
@@ -243,18 +252,25 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 rc = sem_timedwait(&sem->sem, &ts);
 } while (rc == -1 && errno == EINTR);
 if (rc == -1 && errno == ETIMEDOUT) {
-return -1;
+goto out;
 }
 }
 if (rc < 0) {
 error_exit(errno, __func__);
 }
-return 0;
 #endif
+
+out:
+if (ms) {
+rcu_thread_online();
+}
+return rc;
 }
 
 void qemu_sem_wait(QemuSemaphore *sem)
 {
+rcu_thread_offline();
+
 #if defined(__APPLE__) || defined(__NetBSD__)
 pthread_mutex_lock(&sem->lock);
 --sem->count;
@@ -272,6 +288,8 @@ void qemu_sem_wait(QemuSemaphore *sem)
 error_exit(errno, __func__);
 }
 #endif
+
+rcu_thread_online();
 }
 
 #ifdef __l

Re: [Qemu-devel] [PATCH 30/30] exec: put address space dispatch under RCU critical section

2013-06-28 Thread Jan Kiszka

On 2013-06-28 20:26, Paolo Bonzini wrote:
> With this change, address space dispatch can be moved outside the
> BQL.  The actual I/O would still have to happen within the lock.
> 
> The next step would be to introduce a function that can only
> be called from outside the BQL, address_space_rw_unlocked.  The
> function would do something like
> 
> mr = address_space_translate(...)
> if (!mr->unlocked) {
> locked = true;
> qemu_mutex_lock_iothread();
> }
> ...dispatch...
> if (locked) {
> qemu_mutex_unlock_iothread();
> }
> 
> (Note that subpages are already ready to be changed to unlocked
> access).
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  cputlb.c |  4 +++-
>  exec.c   | 30 ++
>  hw/ppc/spapr_iommu.c | 10 +-
>  3 files changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/cputlb.c b/cputlb.c
> index 82875b1..97bfe70 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -267,8 +267,9 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
>  tlb_add_large_page(env, vaddr, size);
>  }
>  
> +rcu_read_lock();
>  sz = size;
> -d = address_space_memory.dispatch;
> +d = rcu_dereference(&address_space_memory.dispatch);
>  section = address_space_translate_for_iotlb(d, paddr, &xlat, &sz);
>  assert(sz >= TARGET_PAGE_SIZE);
>  
> @@ -321,6 +322,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
>  } else {
>  te->addr_write = -1;
>  }
> +rcu_read_unlock();
>  }
>  
>  /* NOTE: this function can trigger an exception */
> diff --git a/exec.c b/exec.c
> index e564014..8c6f925 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -97,6 +97,8 @@ struct PhysPageEntry {
>  typedef PhysPageEntry Node[L2_SIZE];
>  
>  struct AddressSpaceDispatch {
> +struct rcu_head rcu;
> +
>  /* This is a multi-level map on the physical address space.
>   * The bottom level has pointers to MemoryRegionSections.
>   */
> @@ -120,6 +122,8 @@ typedef struct subpage_t {
>  #define PHYS_SECTION_WATCH 3
>  
>  typedef struct PhysPageMap {
> +struct rcu_head rcu;
> +
>  unsigned sections_nb;
>  unsigned sections_nb_alloc;
>  unsigned nodes_nb;
> @@ -236,6 +240,7 @@ bool memory_region_is_unassigned(MemoryRegion *mr)
>  && mr != &io_mem_watch;
>  }
>  
> +/* Called from RCU critical section */
>  static MemoryRegionSection *address_space_lookup_region(AddressSpaceDispatch 
> *d,
>  hwaddr addr,
>  bool resolve_subpage)
> @@ -252,6 +257,7 @@ static MemoryRegionSection 
> *address_space_lookup_region(AddressSpaceDispatch *d,
>  return section;
>  }
>  
> +/* Called from RCU critical section */
>  static MemoryRegionSection *
>  address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, 
> hwaddr *xlat,
>   hwaddr *plen, bool resolve_subpage)
> @@ -280,8 +286,10 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
> hwaddr addr,
>  MemoryRegion *mr;
>  hwaddr len = *plen;
>  
> +rcu_read_lock();
>  for (;;) {
> -section = address_space_translate_internal(as->dispatch, addr, 
> &addr, plen, true);
> +AddressSpaceDispatch *d = rcu_dereference(&as->dispatch);
> +section = address_space_translate_internal(d, addr, &addr, plen, 
> true);
>  mr = section->mr;
>  
>  if (!mr->iommu_ops) {
> @@ -303,9 +311,11 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
> hwaddr addr,
>  *plen = len;
>  *xlat = addr;
>  memory_region_ref(mr);
> +rcu_read_unlock();
>  return mr;
>  }

At this point, do we still have unowned memory regions? If so, we must
not return them here if the caller is not holding the BQL (which is
supposed to protect their registration/modification/deregistration).

I played with a version today that returns NULL in this case. Then the
caller of address_space_translate can take the BQL and retry the
translation.

Jan




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] block: add the optional file entry to query-block

2013-06-28 Thread Eric Blake

On 06/28/2013 08:32 AM, Federico Simoncelli wrote:
> This patch adds the optional file entry to the query-block output.
> The value is a json-object representing the information about the
> underlying file or device (when present).
> 
> Signed-off-by: Federico Simoncelli 
> ---
>  block/qapi.c   |9 -
>  qapi-schema.json   |4 +++-
>  qmp-commands.hx|8 
>  tests/qemu-iotests/043.out |   15 +++
>  4 files changed, 34 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake 

> +++ b/block/qapi.c
> @@ -119,7 +119,7 @@ void bdrv_query_image_info(BlockDriverState *bs,
>  
>  info->filename= g_strdup(bs->filename);
>  info->format  = g_strdup(bdrv_get_format_name(bs));
> -info->virtual_size= total_sectors * 512;
> +info->virtual_size= bdrv_getlength(bs);

This change seems independently useful, but I'm not sure if it's worth
splitting out into a separate patch.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH 02/30] memory: use a new FlatView pointer on every topology update

2013-06-28 Thread Paolo Bonzini

This is the first step towards converting as->current_map to
RCU-style updates, where the FlatView updates run concurrently
with uses of an old FlatView.

Signed-off-by: Paolo Bonzini 
---
 memory.c | 34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/memory.c b/memory.c
index 1f44cd1..319894e 100644
--- a/memory.c
+++ b/memory.c
@@ -276,6 +276,7 @@ static void flatview_destroy(FlatView *view)
 memory_region_unref(view->ranges[i].mr);
 }
 g_free(view->ranges);
+g_free(view);
 }
 
 static bool can_merge(FlatRange *r1, FlatRange *r2)
@@ -512,17 +513,18 @@ static void render_memory_region(FlatView *view,
 }
 
 /* Render a memory topology into a list of disjoint absolute ranges. */
-static FlatView generate_memory_topology(MemoryRegion *mr)
+static FlatView *generate_memory_topology(MemoryRegion *mr)
 {
-FlatView view;
+FlatView *view;
 
-flatview_init(&view);
+view = g_new(FlatView, 1);
+flatview_init(view);
 
 if (mr) {
-render_memory_region(&view, mr, int128_zero(),
+render_memory_region(view, mr, int128_zero(),
  addrrange_make(int128_zero(), int128_2_64()), 
false);
 }
-flatview_simplify(&view);
+flatview_simplify(view);
 
 return view;
 }
@@ -610,8 +612,8 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 }
 
 static void address_space_update_topology_pass(AddressSpace *as,
-   FlatView old_view,
-   FlatView new_view,
+   const FlatView *old_view,
+   const FlatView *new_view,
bool adding)
 {
 unsigned iold, inew;
@@ -621,14 +623,14 @@ static void 
address_space_update_topology_pass(AddressSpace *as,
  * Kill ranges in the old map, and instantiate ranges in the new map.
  */
 iold = inew = 0;
-while (iold < old_view.nr || inew < new_view.nr) {
-if (iold < old_view.nr) {
-frold = &old_view.ranges[iold];
+while (iold < old_view->nr || inew < new_view->nr) {
+if (iold < old_view->nr) {
+frold = &old_view->ranges[iold];
 } else {
 frold = NULL;
 }
-if (inew < new_view.nr) {
-frnew = &new_view.ranges[inew];
+if (inew < new_view->nr) {
+frnew = &new_view->ranges[inew];
 } else {
 frnew = NULL;
 }
@@ -674,14 +676,14 @@ static void 
address_space_update_topology_pass(AddressSpace *as,
 
 static void address_space_update_topology(AddressSpace *as)
 {
-FlatView old_view = *as->current_map;
-FlatView new_view = generate_memory_topology(as->root);
+FlatView *old_view = as->current_map;
+FlatView *new_view = generate_memory_topology(as->root);
 
 address_space_update_topology_pass(as, old_view, new_view, false);
 address_space_update_topology_pass(as, old_view, new_view, true);
 
-*as->current_map = new_view;
-flatview_destroy(&old_view);
+as->current_map = new_view;
+flatview_destroy(old_view);
 address_space_update_ioeventfds(as);
 }
 
-- 
1.8.1.4

Re: [Qemu-devel] [PATCH v4] Add timestamp to error message

2013-06-28 Thread Seiji Aguchi

> >>
> >> I wonder if TIME_H is maybe a bit too nondescript and could conflict
> >> with other guards.

OK.
I will use QEMU_TIME_H.
Apologies for my curt reply.

Seiji

Re: [Qemu-devel] [PATCH] q35 chipset: Extend support of SMBUS(module pm_smbus.c) HST_STS register v2.

2013-06-28 Thread Anthony Liguori

Maksim Ratnikov  writes:

>  From b4c324b42b488aca76aae06c8fa23a45acd91fcf Mon Sep 17 00:00:00 2001
> From: MRatnikov 
> Date: Fri, 28 Jun 2013 02:57:51 +0400
> Subject: [PATCH]  Extend support of SMBUS(module pm_smbus.c) HST_STS 
> register v2

Patch is corrupted.  Please send with git-send-email.

Regards,

Anthony Liguori



>
> Signed-off-by: MRatnikov 
> ---
> Previous realization doesn't consider flags in the status register.
> Add DS and INTR bits of HST_STS register set after transaction execution.
> Update bits resetting in HST_STS register. Update error processing:
> if DEV_ERR bit set transaction isn't execution.
>
>   hw/i2c/pm_smbus.c |   25 +++--
>   1 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/hw/i2c/pm_smbus.c b/hw/i2c/pm_smbus.c
> index 0b5bb89..35f154e 100644
> --- a/hw/i2c/pm_smbus.c
> +++ b/hw/i2c/pm_smbus.c
> @@ -32,6 +32,18 @@
>   #define SMBHSTDAT1  0x06
>   #define SMBBLKDAT   0x07
>
> +#define STS_HOST_BUSY   (1)
> +#define STS_INTR(1<<1)
> +#define STS_DEV_ERR (1<<2)
> +#define STS_BUS_ERR (1<<3)
> +#define STS_FAILED  (1<<4)
> +#define STS_SMBALERT(1<<5)
> +#define STS_INUSE_STS   (1<<6)
> +#define STS_BYTE_DONE   (1<<7)
> +/* Signs of successfully transaction end :
> +*  ByteDoneStatus = 1 (STS_BYTE_DONE) and INTR = 1 (STS_INTR )
> +*/
> +
>   //#define DEBUG
>
>   #ifdef DEBUG
> @@ -50,9 +62,14 @@ static void smb_transaction(PMSMBus *s)
>   i2c_bus *bus = s->smbus;
>
>   SMBUS_DPRINTF("SMBus trans addr=0x%02x prot=0x%02x\n", addr, prot);
> +/* Transaction isn't exec if STS_DEV_ERR bit set */
> +if ((s->smb_stat & STS_DEV_ERR) != 0)  {
> +goto error;
> +}
>   switch(prot) {
>   case 0x0:
>   smbus_quick_command(bus, addr, read);
> +s->smb_stat |= STS_BYTE_DONE | STS_INTR;
>   break;
>   case 0x1:
>   if (read) {
> @@ -60,6 +77,7 @@ static void smb_transaction(PMSMBus *s)
>   } else {
>   smbus_send_byte(bus, addr, cmd);
>   }
> +s->smb_stat |= STS_BYTE_DONE | STS_INTR;
>   break;
>   case 0x2:
>   if (read) {
> @@ -67,6 +85,7 @@ static void smb_transaction(PMSMBus *s)
>   } else {
>   smbus_write_byte(bus, addr, cmd, s->smb_data0);
>   }
> +s->smb_stat |= STS_BYTE_DONE | STS_INTR;
>   break;
>   case 0x3:
>   if (read) {
> @@ -77,6 +96,7 @@ static void smb_transaction(PMSMBus *s)
>   } else {
>   smbus_write_word(bus, addr, cmd, (s->smb_data1 << 8) | 
> s->smb_data0);
>   }
> +s->smb_stat |= STS_BYTE_DONE | STS_INTR;
>   break;
>   case 0x5:
>   if (read) {
> @@ -84,6 +104,7 @@ static void smb_transaction(PMSMBus *s)
>   } else {
>   smbus_write_block(bus, addr, cmd, s->smb_data, s->smb_data0);
>   }
> +s->smb_stat |= STS_BYTE_DONE | STS_INTR;
>   break;
>   default:
>   goto error;
> @@ -91,7 +112,7 @@ static void smb_transaction(PMSMBus *s)
>   return;
>
> error:
> -s->smb_stat |= 0x04;
> +s->smb_stat |= STS_DEV_ERR;
>   }
>
>   static void smb_ioport_writeb(void *opaque, hwaddr addr, uint64_t val,
> @@ -102,7 +123,7 @@ static void smb_ioport_writeb(void *opaque, hwaddr 
> addr, uint64_t val,
>   SMBUS_DPRINTF("SMB writeb port=0x%04x val=0x%02x\n", addr, val);
>   switch(addr) {
>   case SMBHSTSTS:
> -s->smb_stat = 0;
> +s->smb_stat = (~(val & 0xff)) & s->smb_stat;
>   s->smb_index = 0;
>   break;
>   case SMBHSTCNT:
> -- 
> 1.7.3.4

Re: [Qemu-devel] [PATCH v4] Add timestamp to error message

2013-06-28 Thread Laszlo Ersek

On 06/28/13 20:54, Seiji Aguchi wrote:
> 
>>> diff --git a/include/qemu/time.h b/include/qemu/time.h
>>> new file mode 100644
>>> index 000..f70739b
>>> --- /dev/null
>>> +++ b/include/qemu/time.h
>>> @@ -0,0 +1,11 @@
>>> +#ifndef TIME_H
>>> +#define TIME_H
>>
>> I wonder if TIME_H is maybe a bit too nondescript and could conflict
>> with other guards.
>>
>> The guards currently used in "include/qemu/*.h" files are inconsistent
>> -- some use a QEMU_ prefix, some a __QEMU_ one, and others use none.
>>
>> Don't respin just because of this, but maybe it's something to consider.
> 
> This should be discussed in other thread...

Ah, right, apologies for the ambiguity. I just meant that either
QEMU_TIME_H or __QEMU_TIME_H would have a lower chance to clash with
another guard than plain TIME_H -- "some" prefix would be nice, but I
couldn't point out a specific one, because the current practice varies.
I didn't mean to suggest that you should unify the prefixes across all
guards!

Anyway, even my QEMU_TIME_H / __QEMU_TIME_H suggestion is tangential;
feel free to ignore it.

Thanks
Laszlo

Re: [Qemu-devel] [PATCH v3] Add timestamp to error message

2013-06-28 Thread Seiji Aguchi

Thank you for the review.

> > +#include "qemu-common.h"
> > +
> > +/*  "1970-01-01T00:00:00.99Z" + '\0' */
> > +#define TIMESTAMP_LEN 28
> 
> Self-documenting constants are nicer:
> 
> #define TIMESTAMP_LEN (sizeof("1970-01-01T00:00:00.99Z")+1)

I will fix it.

> 
> extern void qemu_get_timestamp_str(char *timestr, size_t len)
> 
> where len is the length of timestr, and where the comments document that
> len should be at least TIMESTAMP_LEN to avoid truncation (or even
> assert() if it is not)?

I will fix as above.

> 
> > +++ b/qemu-options.hx
> > @@ -3102,3 +3102,15 @@ HXCOMM This is the last statement. Insert new 
> > options before this line!
> >  STEXI
> >  @end table
> >  ETEXI
> > +
> > +DEF("msg", HAS_ARG, QEMU_OPTION_msg,
> > +"-msg [timestamp=on|off]\n"
> > +"output message with timestamp (default: off)\n",
> > +QEMU_ARCH_ALL)
> 
> Did you test that the existing query-command-line-options QMP command
> will list this option (just making sure that libvirt will be able to
> know when to use this option).

I will test it in the next patch.

Seiji

> 
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [PATCH v4] Add timestamp to error message

2013-06-28 Thread Seiji Aguchi


> > diff --git a/include/qemu/time.h b/include/qemu/time.h
> > new file mode 100644
> > index 000..f70739b
> > --- /dev/null
> > +++ b/include/qemu/time.h
> > @@ -0,0 +1,11 @@
> > +#ifndef TIME_H
> > +#define TIME_H
> 
> I wonder if TIME_H is maybe a bit too nondescript and could conflict
> with other guards.
> 
> The guards currently used in "include/qemu/*.h" files are inconsistent
> -- some use a QEMU_ prefix, some a __QEMU_ one, and others use none.
> 
> Don't respin just because of this, but maybe it's something to consider.

This should be discussed in other thread...

> 
> 
> > +
> > +#include "qemu-common.h"
> > +
> > +/*  "1970-01-01T00:00:00.99Z" + '\0' */
> > +#define TIMESTAMP_LEN 28
> > +extern void qemu_get_timestamp_str(char (*timestr)[]);

I will change to "extern void qemu_get_timestamp_str(char *timestr, size_t len)"
as Eric pointed out.

> 
> 
> > +extern bool enable_timestamp_msg;
> > +
> > +#endif /* !TIME_H */
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index ca6fdf6..7890921 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -3102,3 +3102,15 @@ HXCOMM This is the last statement. Insert new 
> > options before this line!
> >  STEXI
> >  @end table
> >  ETEXI
> > +
> > +DEF("msg", HAS_ARG, QEMU_OPTION_msg,
> > +"-msg [timestamp=on|off]\n"
> > +"output message with timestamp (default: off)\n",
> > +QEMU_ARCH_ALL)
> > +STEXI
> > +@item -msg timestamp=on|off
> > +@findex - msg
> 
> A space has snuck in between "-" and "msg". Perhaps this is the only
> note that would warrant a respin (or rather a maintainer fixup at commit).

I will fix it.

> 
> 
> > +Output message with timestamp.
> 
> As a non-native speaker, I propose rewording this as "prepend a
> timestamp to each log message" (in the prior occurrence as well) if you
> decided to repost.

Will fix as well.

> >  void error_report(const char *fmt, ...)
> >  {
> >  va_list ap;
> > +char timestr[TIMESTAMP_LEN];
> > +
> > +if (enable_timestamp_msg) {
> > +qemu_get_timestamp_str(×tr);
> > +error_printf("%s ", timestr);
> > +}
> >
> >  error_print_loc();
> >  va_start(ap, fmt);
> 
> Does this print the timestamp to all kinds of monitors too? On stderr,
> the timestamp is great. When printed to an "interactive monitor", it
> could also help post-mortem debugging. But would it not confuse libvirt
> eg.? (Apologies if this has been discussed before.)

I will try to add timestamp to monitor_printf().
(In the long term, I would like to add it to all fprintf() in qemu.)

> 
> 
> > diff --git a/util/qemu-time.c b/util/qemu-time.c
> > new file mode 100644
> > index 000..37f7b9e
> > --- /dev/null
> > +++ b/util/qemu-time.c
> > @@ -0,0 +1,24 @@
> > +/*
> > + * Time handling
> > + *
> > + * Copyright (C) 2013 Hitachi Data Systems Corp.
> > + *
> > + * Authors:
> > + *  Seiji Aguchi 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#include "qemu/time.h"
> > +
> > +void qemu_get_timestamp_str(char (*timestr)[])
> > +{
> > +GTimeVal tv;
> > +gchar *tmp_str = NULL;
> > +
> > +g_get_current_time(&tv);
> 
> Hm. There's also g_get_monotonic_time(), but (a) that's less portable
> (available since 2.28), (b) this is simply good enough IMO, in practice. OK.
> 
> 
> > +tmp_str = g_time_val_to_iso8601(&tv);
> > +g_strlcpy((gchar *)*timestr, tmp_str, TIMESTAMP_LEN);
> > +g_free(tmp_str);
> > +return;
> 
> You're not returning a value so the last statement is superfluous.

I will remove the unnecessary "return".

> > +
> > +static void configure_msg(QemuOpts *opts)
> > +{
> > +enable_timestamp_msg = qemu_opt_get_bool(opts, "timestamp", true);
> > +}
> 
> I think the default value doesn't match the docs, which say "deafult:
> off". As far as I recall (from Tomoki's "-realtime [mlock=on|off]"
> patch), statements about defaults in the option docs don't describe how
> qemu works by default (ie. when you omit the option altogether), they
> describe what happens if you specify the option but omit its arguments
> (ie. with "-msg" only.)
> 
> In that case, the docs should state something like:
> 
> DEF("msg", HAS_ARG, QEMU_OPTION_msg,
> "-msg [timestamp=on|off]\n"
> " change the format of error messages\n"
> " timestamp=on|off enables leading timestamps (default: on)\n",
> QEMU_ARCH_ALL)

Currently, this patch add timestamp to just error_report().
But, we may need it for both error and normal messages.
So, I will change the sentence "change the format of error messages"
to "change the format of messages".

I appreciate your review.

Seiji

[Qemu-devel] [PATCH 30/30] exec: put address space dispatch under RCU critical section

2013-06-28 Thread Paolo Bonzini

With this change, address space dispatch can be moved outside the
BQL.  The actual I/O would still have to happen within the lock.

The next step would be to introduce a function that can only
be called from outside the BQL, address_space_rw_unlocked.  The
function would do something like

mr = address_space_translate(...)
if (!mr->unlocked) {
locked = true;
qemu_mutex_lock_iothread();
}
...dispatch...
if (locked) {
qemu_mutex_unlock_iothread();
}

(Note that subpages are already ready to be changed to unlocked
access).

Signed-off-by: Paolo Bonzini 
---
 cputlb.c |  4 +++-
 exec.c   | 30 ++
 hw/ppc/spapr_iommu.c | 10 +-
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 82875b1..97bfe70 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -267,8 +267,9 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 tlb_add_large_page(env, vaddr, size);
 }
 
+rcu_read_lock();
 sz = size;
-d = address_space_memory.dispatch;
+d = rcu_dereference(&address_space_memory.dispatch);
 section = address_space_translate_for_iotlb(d, paddr, &xlat, &sz);
 assert(sz >= TARGET_PAGE_SIZE);
 
@@ -321,6 +322,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 } else {
 te->addr_write = -1;
 }
+rcu_read_unlock();
 }
 
 /* NOTE: this function can trigger an exception */
diff --git a/exec.c b/exec.c
index e564014..8c6f925 100644
--- a/exec.c
+++ b/exec.c
@@ -97,6 +97,8 @@ struct PhysPageEntry {
 typedef PhysPageEntry Node[L2_SIZE];
 
 struct AddressSpaceDispatch {
+struct rcu_head rcu;
+
 /* This is a multi-level map on the physical address space.
  * The bottom level has pointers to MemoryRegionSections.
  */
@@ -120,6 +122,8 @@ typedef struct subpage_t {
 #define PHYS_SECTION_WATCH 3
 
 typedef struct PhysPageMap {
+struct rcu_head rcu;
+
 unsigned sections_nb;
 unsigned sections_nb_alloc;
 unsigned nodes_nb;
@@ -236,6 +240,7 @@ bool memory_region_is_unassigned(MemoryRegion *mr)
 && mr != &io_mem_watch;
 }
 
+/* Called from RCU critical section */
 static MemoryRegionSection *address_space_lookup_region(AddressSpaceDispatch 
*d,
 hwaddr addr,
 bool resolve_subpage)
@@ -252,6 +257,7 @@ static MemoryRegionSection 
*address_space_lookup_region(AddressSpaceDispatch *d,
 return section;
 }
 
+/* Called from RCU critical section */
 static MemoryRegionSection *
 address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr 
*xlat,
  hwaddr *plen, bool resolve_subpage)
@@ -280,8 +286,10 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 MemoryRegion *mr;
 hwaddr len = *plen;
 
+rcu_read_lock();
 for (;;) {
-section = address_space_translate_internal(as->dispatch, addr, &addr, 
plen, true);
+AddressSpaceDispatch *d = rcu_dereference(&as->dispatch);
+section = address_space_translate_internal(d, addr, &addr, plen, true);
 mr = section->mr;
 
 if (!mr->iommu_ops) {
@@ -303,9 +311,11 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 *plen = len;
 *xlat = addr;
 memory_region_ref(mr);
+rcu_read_unlock();
 return mr;
 }
 
+/* Called from RCU critical section */
 MemoryRegionSection *
 address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr, hwaddr 
*xlat,
   hwaddr *plen)
@@ -727,6 +737,7 @@ static int cpu_physical_memory_set_dirty_tracking(int 
enable)
 return ret;
 }
 
+/* Called from RCU critical section */
 hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d,
CPUArchState *env,
MemoryRegionSection *section,
@@ -1706,6 +1717,11 @@ static uint16_t dummy_section(MemoryRegion *mr)
 
 MemoryRegion *iotlb_to_region(hwaddr index)
 {
+/* This assumes that address_space_memory.dispatch->sections
+ * does not change between address_space_translate_for_iotlb and
+ * here.  This is currently true because TCG runs within the
+ * BQL.
+ */
 return address_space_memory.dispatch->sections[index & 
~TARGET_PAGE_MASK].mr;
 }
 
@@ -1762,11 +1778,12 @@ static void core_begin(MemoryListener *listener)
 }
 
 /* This listener's commit run after the other AddressSpaceDispatch listeners'.
- * All AddressSpaceDispatch instances have switched to the next map.
+ * All AddressSpaceDispatch instances have switched to the next map, but
+ * there may be concurrent readers.
  */
 static void core_commit(MemoryListener *listener)
 {
-phys_sections_free(prev_map);
+call_rcu(prev_map, phys_sections_free, rcu);
 }
 
 static void tcg_commit(MemoryListener *listener)
@@ -1816,13 +1833

Re: [Qemu-devel] [PATCH v4 07/10] qemu-ga: Add Windows VSS requester to quiesce applications and filesystems

2013-06-28 Thread Laszlo Ersek

On 06/28/13 20:01, Laszlo Ersek wrote:

> (b) error_setg_win32() should imitate error_setg_errno():
> 
> #define error_setg_win32(err, win32, fmt, ...) \
> error_set_errno(err, win32, ERROR_CLASS_GENERIC_ERROR, fmt, ## 
> __VA_ARGS__)

Typo of course, that's error_set_win32() in the replacement text. Sorry!

Laszlo

[Qemu-devel] [PATCH 27/30] exec: change some APIs to take AddressSpaceDispatch

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 exec.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 7f87e16..528c4d7 100644
--- a/exec.c
+++ b/exec.c
@@ -236,11 +236,10 @@ bool memory_region_is_unassigned(MemoryRegion *mr)
 && mr != &io_mem_watch;
 }
 
-static MemoryRegionSection *address_space_lookup_region(AddressSpace *as,
+static MemoryRegionSection *address_space_lookup_region(AddressSpaceDispatch 
*d,
 hwaddr addr,
 bool resolve_subpage)
 {
-AddressSpaceDispatch *d = as->dispatch;
 MemoryRegionSection *section;
 subpage_t *subpage;
 
@@ -254,13 +253,13 @@ static MemoryRegionSection 
*address_space_lookup_region(AddressSpace *as,
 }
 
 static MemoryRegionSection *
-address_space_translate_internal(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr 
*xlat,
  hwaddr *plen, bool resolve_subpage)
 {
 MemoryRegionSection *section;
 Int128 diff;
 
-section = address_space_lookup_region(as, addr, resolve_subpage);
+section = address_space_lookup_region(d, addr, resolve_subpage);
 /* Compute offset within MemoryRegionSection */
 addr -= section->offset_within_address_space;
 
@@ -282,7 +281,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 hwaddr len = *plen;
 
 for (;;) {
-section = address_space_translate_internal(as, addr, &addr, plen, 
true);
+section = address_space_translate_internal(as->dispatch, addr, &addr, 
plen, true);
 mr = section->mr;
 
 if (!mr->iommu_ops) {
@@ -311,7 +310,7 @@ address_space_translate_for_iotlb(AddressSpace *as, hwaddr 
addr, hwaddr *xlat,
   hwaddr *plen)
 {
 MemoryRegionSection *section;
-section = address_space_translate_internal(as, addr, xlat, plen, false);
+section = address_space_translate_internal(as->dispatch, addr, xlat, plen, 
false);
 
 assert(!section->mr->iommu_ops);
 return section;
-- 
1.8.1.4

[Qemu-devel] [PATCH 29/30] exec: add a reference to the region returned by address_space_translate

2013-06-28 Thread Paolo Bonzini

Once address_space_translate will only be protected by RCU, the returned
MemoryRegion might disappear as soon as the RCU read-side critical section
ends.  Avoid this by adding a reference to the region, and dropping it
in the caller of address_space_translate.

This generalizes what was done for address_space_map to other callers of
address_space_translate.  It is necessary to call address_space_translate
out of the BQL.

Signed-off-by: Paolo Bonzini 
---
 exec.c| 19 +++
 include/exec/memory.h |  3 ++-
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index 3e1a576..e564014 100644
--- a/exec.c
+++ b/exec.c
@@ -302,6 +302,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 
 *plen = len;
 *xlat = addr;
+memory_region_ref(mr);
 return mr;
 }
 
@@ -1990,6 +1991,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, 
uint8_t *buf,
 memcpy(buf, ptr, l);
 }
 }
+memory_region_unref(mr);
 len -= l;
 buf += l;
 addr += l;
@@ -2146,7 +2148,7 @@ void *address_space_map(AddressSpace *as,
 bounce.addr = addr;
 bounce.len = l;
 
-memory_region_ref(mr);
+/* Keep the reference to mr until address_space_unmap.  */
 bounce.mr = mr;
 if (!is_write) {
 address_space_read(as, addr, bounce.buffer, l);
@@ -2169,12 +2171,13 @@ void *address_space_map(AddressSpace *as,
 
 l = len;
 this_mr = address_space_translate(as, addr, &xlat, &l, is_write);
+memory_region_unref(this_mr);
 if (this_mr != mr || xlat != base + done) {
 break;
 }
 }
 
-memory_region_ref(mr);
+/* Keep the reference to mr until address_space_unmap.  */
 *plen = done;
 return qemu_ram_ptr_length(raddr + base, plen);
 }
@@ -2272,6 +2275,7 @@ static inline uint32_t ldl_phys_internal(hwaddr addr,
 break;
 }
 }
+memory_region_unref(mr);
 return val;
 }
 
@@ -2331,6 +2335,7 @@ static inline uint64_t ldq_phys_internal(hwaddr addr,
 break;
 }
 }
+memory_region_unref(mr);
 return val;
 }
 
@@ -2398,6 +2403,7 @@ static inline uint32_t lduw_phys_internal(hwaddr addr,
 break;
 }
 }
+memory_region_unref(mr);
 return val;
 }
 
@@ -2445,6 +2451,7 @@ void stl_phys_notdirty(hwaddr addr, uint32_t val)
 }
 }
 }
+memory_region_unref(mr);
 }
 
 /* warning: addr must be aligned */
@@ -2486,6 +2493,7 @@ static inline void stl_phys_internal(hwaddr addr, 
uint32_t val,
 }
 invalidate_and_set_dirty(addr1, 4);
 }
+memory_region_unref(mr);
 }
 
 void stl_phys(hwaddr addr, uint32_t val)
@@ -2549,6 +2557,7 @@ static inline void stw_phys_internal(hwaddr addr, 
uint32_t val,
 }
 invalidate_and_set_dirty(addr1, 2);
 }
+memory_region_unref(mr);
 }
 
 void stw_phys(hwaddr addr, uint32_t val)
@@ -2638,11 +2647,13 @@ bool cpu_physical_memory_is_io(hwaddr phys_addr)
 {
 MemoryRegion*mr;
 hwaddr l = 1;
+bool res;
 
 mr = address_space_translate(&address_space_memory,
  phys_addr, &phys_addr, &l, false);
 
-return !(memory_region_is_ram(mr) ||
- memory_region_is_romd(mr));
+res = !(memory_region_is_ram(mr) || memory_region_is_romd(mr));
+memory_region_unref(mr);
+return res;
 }
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index b21a460..66226b1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -986,7 +986,8 @@ bool address_space_write(AddressSpace *as, hwaddr addr,
 bool address_space_read(AddressSpace *as, hwaddr addr, uint8_t *buf, int len);
 
 /* address_space_translate: translate an address range into an address space
- * into a MemoryRegion and an address range into that section
+ * into a MemoryRegion and an address range into that section.  Add a reference
+ * to that region.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
-- 
1.8.1.4

[Qemu-devel] [PATCH 28/30] exec: change iotlb APIs to take AddressSpaceDispatch

2013-06-28 Thread Paolo Bonzini

This makes it possible to start following RCU rules, which require
not dereferencing as->dispatch more than once.  It is not covering
the whole of TCG, since the TLB data structures are not RCU-friendly,
but it is enough for exec.c.

Signed-off-by: Paolo Bonzini 
---
 cputlb.c  | 7 ---
 exec.c| 9 +
 include/exec/cputlb.h | 9 ++---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 51381ae..82875b1 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -253,6 +253,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
   hwaddr paddr, int prot,
   int mmu_idx, target_ulong size)
 {
+AddressSpaceDispatch *d;
 MemoryRegionSection *section;
 unsigned int index;
 target_ulong address;
@@ -267,8 +268,8 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 }
 
 sz = size;
-section = address_space_translate_for_iotlb(&address_space_memory, paddr,
-&xlat, &sz);
+d = address_space_memory.dispatch;
+section = address_space_translate_for_iotlb(d, paddr, &xlat, &sz);
 assert(sz >= TARGET_PAGE_SIZE);
 
 #if defined(DEBUG_TLB)
@@ -288,7 +289,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 }
 
 code_address = address;
-iotlb = memory_region_section_get_iotlb(env, section, vaddr, paddr, xlat,
+iotlb = memory_region_section_get_iotlb(d, env, section, vaddr, paddr, 
xlat,
 prot, &address);
 
 index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
diff --git a/exec.c b/exec.c
index 528c4d7..3e1a576 100644
--- a/exec.c
+++ b/exec.c
@@ -306,11 +306,11 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 }
 
 MemoryRegionSection *
-address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr, hwaddr 
*xlat,
   hwaddr *plen)
 {
 MemoryRegionSection *section;
-section = address_space_translate_internal(as->dispatch, addr, xlat, plen, 
false);
+section = address_space_translate_internal(d, addr, xlat, plen, false);
 
 assert(!section->mr->iommu_ops);
 return section;
@@ -726,7 +726,8 @@ static int cpu_physical_memory_set_dirty_tracking(int 
enable)
 return ret;
 }
 
-hwaddr memory_region_section_get_iotlb(CPUArchState *env,
+hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d,
+   CPUArchState *env,
MemoryRegionSection *section,
target_ulong vaddr,
hwaddr paddr, hwaddr xlat,
@@ -746,7 +747,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 iotlb |= PHYS_SECTION_ROM;
 }
 } else {
-iotlb = section - address_space_memory.dispatch->sections;
+iotlb = section - d->sections;
 iotlb += xlat;
 }
 
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index e21cb60..968b6a4 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -31,12 +31,15 @@ void tlb_set_dirty(CPUArchState *env, target_ulong vaddr);
 extern int tlb_flush_count;
 
 /* exec.c */
+typedef struct AddressSpaceDispatch AddressSpaceDispatch;
+
 void tb_flush_jmp_cache(CPUArchState *env, target_ulong addr);
 
 MemoryRegionSection *
-address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
-  hwaddr *plen);
-hwaddr memory_region_section_get_iotlb(CPUArchState *env,
+address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr,
+  hwaddr *xlat, hwaddr *plen);
+hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d,
+   CPUArchState *env,
MemoryRegionSection *section,
target_ulong vaddr,
hwaddr paddr, hwaddr xlat,
-- 
1.8.1.4

[Qemu-devel] [PATCH 24/30] exec: separate current radix tree from the one being built

2013-06-28 Thread Paolo Bonzini

This same treatment previously done to phys_node_map and phys_sections
is now applied to the dispatch field of AddressSpace.  Topology updates
use as->next_dispatch while accesses use as->dispatch.

Signed-off-by: Paolo Bonzini 
---
 exec.c| 23 ---
 include/exec/memory.h |  1 +
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/exec.c b/exec.c
index dffdf23..0d852ee 100644
--- a/exec.c
+++ b/exec.c
@@ -855,7 +855,7 @@ static void register_multipage(AddressSpaceDispatch *d,
 static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
 {
 AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
-AddressSpaceDispatch *d = as->dispatch;
+AddressSpaceDispatch *d = as->next_dispatch;
 MemoryRegionSection now = *section, remain = *section;
 Int128 page_size = int128_make64(TARGET_PAGE_SIZE);
 
@@ -1718,9 +1718,21 @@ static void io_mem_init(void)
 static void mem_begin(MemoryListener *listener)
 {
 AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
+AddressSpaceDispatch *d = g_new(AddressSpaceDispatch, 1);
+
+d->phys_map  = (PhysPageEntry) { .ptr = PHYS_MAP_NODE_NIL, .is_leaf = 0 };
+d->as = as;
+as->next_dispatch = d;
+}
+
+static void mem_commit(MemoryListener *listener)
+{
+AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
 AddressSpaceDispatch *d = as->dispatch;
 
-d->phys_map.ptr = PHYS_MAP_NODE_NIL;
+/* cur_map will soon be switched to next_map, too.  */
+as->dispatch = as->next_dispatch;
+g_free(d);
 }
 
 static void core_begin(MemoryListener *listener)
@@ -1784,13 +1796,10 @@ static MemoryListener tcg_memory_listener = {
 
 void address_space_init_dispatch(AddressSpace *as)
 {
-AddressSpaceDispatch *d = g_new(AddressSpaceDispatch, 1);
-
-d->phys_map  = (PhysPageEntry) { .ptr = PHYS_MAP_NODE_NIL, .is_leaf = 0 };
-d->as = as;
-as->dispatch = d;
+as->dispatch = NULL;
 as->dispatch_listener = (MemoryListener) {
 .begin = mem_begin,
+.commit = mem_commit,
 .region_add = mem_add,
 .region_nop = mem_add,
 .priority = 0,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1cd1f50..b21a460 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -212,6 +212,7 @@ struct AddressSpace {
 int ioeventfd_nb;
 struct MemoryRegionIoeventfd *ioeventfds;
 struct AddressSpaceDispatch *dispatch;
+struct AddressSpaceDispatch *next_dispatch;
 MemoryListener dispatch_listener;
 
 QTAILQ_ENTRY(AddressSpace) address_spaces_link;
-- 
1.8.1.4

[Qemu-devel] [PATCH 26/30] exec: remove cur_map

2013-06-28 Thread Paolo Bonzini

cur_map is not used anymore; instead, each AddressSpaceDispatch
has its own nodes/sections pair.  The priorities of the
MemoryListeners, and in the future RCU, guarantee that the
nodes/sections are not freed while they are still in use.

(In fact, next_map itself is not needed except to free the data on the
next update).

To avoid incorrect use, replace cur_map with a temporary copy that
is only valid while the topology is being updated.  If you use it,
the name prev_map makes it clear that you're doing something weird.

Signed-off-by: Paolo Bonzini 
---
 exec.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index 581f0c4..7f87e16 100644
--- a/exec.c
+++ b/exec.c
@@ -128,7 +128,7 @@ typedef struct PhysPageMap {
 MemoryRegionSection *sections;
 } PhysPageMap;
 
-static PhysPageMap cur_map;
+static PhysPageMap *prev_map;
 static PhysPageMap next_map;
 
 #define PHYS_MAP_NODE_NIL (((uint16_t)~0) >> 1)
@@ -804,7 +804,7 @@ static void phys_section_destroy(MemoryRegion *mr)
 }
 }
 
-static void phys_sections_clear(PhysPageMap *map)
+static void phys_sections_free(PhysPageMap *map)
 {
 while (map->sections_nb > 0) {
 MemoryRegionSection *section = &map->sections[--map->sections_nb];
@@ -812,6 +812,7 @@ static void phys_sections_clear(PhysPageMap *map)
 }
 g_free(map->sections);
 g_free(map->nodes);
+g_free(map);
 }
 
 static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection 
*section)
@@ -1745,6 +1746,9 @@ static void core_begin(MemoryListener *listener)
 {
 uint16_t n;
 
+prev_map = g_new(PhysPageMap, 1);
+*prev_map = next_map;
+
 memset(&next_map, 0, sizeof(next_map));
 n = dummy_section(&io_mem_unassigned);
 assert(n == PHYS_SECTION_UNASSIGNED);
@@ -1761,9 +1765,7 @@ static void core_begin(MemoryListener *listener)
  */
 static void core_commit(MemoryListener *listener)
 {
-PhysPageMap info = cur_map;
-cur_map = next_map;
-phys_sections_clear(&info);
+phys_sections_free(prev_map);
 }
 
 static void tcg_commit(MemoryListener *listener)
-- 
1.8.1.4

[Qemu-devel] [PATCH 25/30] exec: put memory map in AddressSpaceDispatch

2013-06-28 Thread Paolo Bonzini

This lets us get a consistent (phys_map, nodes, sections) using
RCU.  After this patch, cur_map is not used anymore except for freeing
it at the end of the topology update.

Signed-off-by: Paolo Bonzini 
---
 exec.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/exec.c b/exec.c
index 0d852ee..581f0c4 100644
--- a/exec.c
+++ b/exec.c
@@ -94,11 +94,15 @@ struct PhysPageEntry {
 uint16_t ptr : 15;
 };
 
+typedef PhysPageEntry Node[L2_SIZE];
+
 struct AddressSpaceDispatch {
 /* This is a multi-level map on the physical address space.
  * The bottom level has pointers to MemoryRegionSections.
  */
 PhysPageEntry phys_map;
+Node *nodes;
+MemoryRegionSection *sections;
 AddressSpace *as;
 };
 
@@ -115,8 +119,6 @@ typedef struct subpage_t {
 #define PHYS_SECTION_ROM 2
 #define PHYS_SECTION_WATCH 3
 
-typedef PhysPageEntry Node[L2_SIZE];
-
 typedef struct PhysPageMap {
 unsigned sections_nb;
 unsigned sections_nb_alloc;
@@ -238,14 +240,15 @@ static MemoryRegionSection 
*address_space_lookup_region(AddressSpace *as,
 hwaddr addr,
 bool resolve_subpage)
 {
+AddressSpaceDispatch *d = as->dispatch;
 MemoryRegionSection *section;
 subpage_t *subpage;
 
-section = phys_page_find(as->dispatch->phys_map, addr >> TARGET_PAGE_BITS,
- cur_map.nodes, cur_map.sections);
+section = phys_page_find(d->phys_map, addr >> TARGET_PAGE_BITS,
+ d->nodes, d->sections);
 if (resolve_subpage && section->mr->subpage) {
 subpage = container_of(section->mr, subpage_t, iomem);
-section = &cur_map.sections[subpage->sub_section[SUBPAGE_IDX(addr)]];
+section = &d->sections[subpage->sub_section[SUBPAGE_IDX(addr)]];
 }
 return section;
 }
@@ -744,7 +747,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 iotlb |= PHYS_SECTION_ROM;
 }
 } else {
-iotlb = section - cur_map.sections;
+iotlb = section - address_space_memory.dispatch->sections;
 iotlb += xlat;
 }
 
@@ -1701,7 +1704,7 @@ static uint16_t dummy_section(MemoryRegion *mr)
 
 MemoryRegion *iotlb_to_region(hwaddr index)
 {
-return cur_map.sections[index & ~TARGET_PAGE_MASK].mr;
+return address_space_memory.dispatch->sections[index & 
~TARGET_PAGE_MASK].mr;
 }
 
 static void io_mem_init(void)
@@ -1728,11 +1731,14 @@ static void mem_begin(MemoryListener *listener)
 static void mem_commit(MemoryListener *listener)
 {
 AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
-AddressSpaceDispatch *d = as->dispatch;
+AddressSpaceDispatch *cur = as->dispatch;
+AddressSpaceDispatch *next = as->next_dispatch;
 
-/* cur_map will soon be switched to next_map, too.  */
-as->dispatch = as->next_dispatch;
-g_free(d);
+next->nodes = next_map.nodes;
+next->sections = next_map.sections;
+
+as->dispatch = next;
+g_free(cur);
 }
 
 static void core_begin(MemoryListener *listener)
-- 
1.8.1.4

[Qemu-devel] [PATCH 23/30] exec: move listener from AddressSpaceDispatch to AddressSpace

2013-06-28 Thread Paolo Bonzini

This will help having two copies of AddressSpaceDispatch during the
recreation of the radix tree (one being built, and one that is complete
and accessed under RCU).  We do not want to have to unregister and
re-register the listener.

Signed-off-by: Paolo Bonzini 
---
 exec.c| 17 +
 include/exec/memory.h |  2 ++
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/exec.c b/exec.c
index f138e56..dffdf23 100644
--- a/exec.c
+++ b/exec.c
@@ -99,7 +99,6 @@ struct AddressSpaceDispatch {
  * The bottom level has pointers to MemoryRegionSections.
  */
 PhysPageEntry phys_map;
-MemoryListener listener;
 AddressSpace *as;
 };
 
@@ -855,7 +854,8 @@ static void register_multipage(AddressSpaceDispatch *d,
 
 static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
 {
-AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, 
listener);
+AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
+AddressSpaceDispatch *d = as->dispatch;
 MemoryRegionSection now = *section, remain = *section;
 Int128 page_size = int128_make64(TARGET_PAGE_SIZE);
 
@@ -1717,7 +1717,8 @@ static void io_mem_init(void)
 
 static void mem_begin(MemoryListener *listener)
 {
-AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, 
listener);
+AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
+AddressSpaceDispatch *d = as->dispatch;
 
 d->phys_map.ptr = PHYS_MAP_NODE_NIL;
 }
@@ -1786,22 +1787,22 @@ void address_space_init_dispatch(AddressSpace *as)
 AddressSpaceDispatch *d = g_new(AddressSpaceDispatch, 1);
 
 d->phys_map  = (PhysPageEntry) { .ptr = PHYS_MAP_NODE_NIL, .is_leaf = 0 };
-d->listener = (MemoryListener) {
+d->as = as;
+as->dispatch = d;
+as->dispatch_listener = (MemoryListener) {
 .begin = mem_begin,
 .region_add = mem_add,
 .region_nop = mem_add,
 .priority = 0,
 };
-d->as = as;
-as->dispatch = d;
-memory_listener_register(&d->listener, as);
+memory_listener_register(&as->dispatch_listener, as);
 }
 
 void address_space_destroy_dispatch(AddressSpace *as)
 {
 AddressSpaceDispatch *d = as->dispatch;
 
-memory_listener_unregister(&d->listener);
+memory_listener_unregister(&as->dispatch_listener);
 g_free(d);
 as->dispatch = NULL;
 }
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 913ac45..1cd1f50 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -212,6 +212,8 @@ struct AddressSpace {
 int ioeventfd_nb;
 struct MemoryRegionIoeventfd *ioeventfds;
 struct AddressSpaceDispatch *dispatch;
+MemoryListener dispatch_listener;
+
 QTAILQ_ENTRY(AddressSpace) address_spaces_link;
 };
 
-- 
1.8.1.4

[Qemu-devel] [PATCH 21/30] exec: separate current memory map from the one being built

2013-06-28 Thread Paolo Bonzini

Currently, phys_node_map and phys_sections are shared by all
of the AddressSpaceDispatch.  When updating mem topology, all
AddressSpaceDispatch will rebuild dispatch tables sequentially
on them.  In order to prepare for RCU access, leave the old
memory map alive while the next one is being accessed.

When rebuilding, the new dispatch tables will build and lookup
next_map; after all dispatch tables are rebuilt, we can switch
to next_* and free the previous table.

Based on a patch from Liu Ping Fan.

Signed-off-by: Liu Ping Fan 
Signed-off-by: Paolo Bonzini 
---
 exec.c | 103 -
 1 file changed, 63 insertions(+), 40 deletions(-)

diff --git a/exec.c b/exec.c
index 7ad513c..f138e56 100644
--- a/exec.c
+++ b/exec.c
@@ -111,16 +111,24 @@ typedef struct subpage_t {
 uint16_t sub_section[TARGET_PAGE_SIZE];
 } subpage_t;
 
-static MemoryRegionSection *phys_sections;
-static unsigned phys_sections_nb, phys_sections_nb_alloc;
 #define PHYS_SECTION_UNASSIGNED 0
 #define PHYS_SECTION_NOTDIRTY 1
 #define PHYS_SECTION_ROM 2
 #define PHYS_SECTION_WATCH 3
 
-/* Simple allocator for PhysPageEntry nodes */
-static PhysPageEntry (*phys_map_nodes)[L2_SIZE];
-static unsigned phys_map_nodes_nb, phys_map_nodes_nb_alloc;
+typedef PhysPageEntry Node[L2_SIZE];
+
+typedef struct PhysPageMap {
+unsigned sections_nb;
+unsigned sections_nb_alloc;
+unsigned nodes_nb;
+unsigned nodes_nb_alloc;
+Node *nodes;
+MemoryRegionSection *sections;
+} PhysPageMap;
+
+static PhysPageMap cur_map;
+static PhysPageMap next_map;
 
 #define PHYS_MAP_NODE_NIL (((uint16_t)~0) >> 1)
 
@@ -135,13 +143,13 @@ static MemoryRegion io_mem_watch;
 
 static void phys_map_node_reserve(unsigned nodes)
 {
-if (phys_map_nodes_nb + nodes > phys_map_nodes_nb_alloc) {
-typedef PhysPageEntry Node[L2_SIZE];
-phys_map_nodes_nb_alloc = MAX(phys_map_nodes_nb_alloc * 2, 16);
-phys_map_nodes_nb_alloc = MAX(phys_map_nodes_nb_alloc,
-  phys_map_nodes_nb + nodes);
-phys_map_nodes = g_renew(Node, phys_map_nodes,
- phys_map_nodes_nb_alloc);
+if (next_map.nodes_nb + nodes > next_map.nodes_nb_alloc) {
+next_map.nodes_nb_alloc = MAX(next_map.nodes_nb_alloc * 2,
+16);
+next_map.nodes_nb_alloc = MAX(next_map.nodes_nb_alloc,
+  next_map.nodes_nb + nodes);
+next_map.nodes = g_renew(Node, next_map.nodes,
+ next_map.nodes_nb_alloc);
 }
 }
 
@@ -150,12 +158,12 @@ static uint16_t phys_map_node_alloc(void)
 unsigned i;
 uint16_t ret;
 
-ret = phys_map_nodes_nb++;
+ret = next_map.nodes_nb++;
 assert(ret != PHYS_MAP_NODE_NIL);
-assert(ret != phys_map_nodes_nb_alloc);
+assert(ret != next_map.nodes_nb_alloc);
 for (i = 0; i < L2_SIZE; ++i) {
-phys_map_nodes[ret][i].is_leaf = 0;
-phys_map_nodes[ret][i].ptr = PHYS_MAP_NODE_NIL;
+next_map.nodes[ret][i].is_leaf = 0;
+next_map.nodes[ret][i].ptr = PHYS_MAP_NODE_NIL;
 }
 return ret;
 }
@@ -170,7 +178,7 @@ static void phys_page_set_level(PhysPageEntry *lp, hwaddr 
*index,
 
 if (!lp->is_leaf && lp->ptr == PHYS_MAP_NODE_NIL) {
 lp->ptr = phys_map_node_alloc();
-p = phys_map_nodes[lp->ptr];
+p = next_map.nodes[lp->ptr];
 if (level == 0) {
 for (i = 0; i < L2_SIZE; i++) {
 p[i].is_leaf = 1;
@@ -178,7 +186,7 @@ static void phys_page_set_level(PhysPageEntry *lp, hwaddr 
*index,
 }
 }
 } else {
-p = phys_map_nodes[lp->ptr];
+p = next_map.nodes[lp->ptr];
 }
 lp = &p[(*index >> (level * L2_BITS)) & (L2_SIZE - 1)];
 
@@ -205,20 +213,20 @@ static void phys_page_set(AddressSpaceDispatch *d,
 phys_page_set_level(&d->phys_map, &index, &nb, leaf, P_L2_LEVELS - 1);
 }
 
-static MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d, hwaddr 
index)
+static MemoryRegionSection *phys_page_find(PhysPageEntry lp, hwaddr index,
+   Node *nodes, MemoryRegionSection 
*sections)
 {
-PhysPageEntry lp = d->phys_map;
 PhysPageEntry *p;
 int i;
 
 for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
 if (lp.ptr == PHYS_MAP_NODE_NIL) {
-return &phys_sections[PHYS_SECTION_UNASSIGNED];
+return §ions[PHYS_SECTION_UNASSIGNED];
 }
-p = phys_map_nodes[lp.ptr];
+p = nodes[lp.ptr];
 lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
 }
-return &phys_sections[lp.ptr];
+return §ions[lp.ptr];
 }
 
 bool memory_region_is_unassigned(MemoryRegion *mr)
@@ -234,10 +242,11 @@ static MemoryRegionSection 
*address_space_lookup_region(AddressSpace *as,
 MemoryRegionSection *section;
 subpage_t *subpage;
 
-section = phys_page_find(as->dis

[Qemu-devel] [PATCH 18/30] memory: protect current_map by RCU

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h |  5 +
 memory.c  | 50 +-
 2 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9c33bba..aa7a922 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -24,6 +24,7 @@
 #include "qemu/queue.h"
 #include "qemu/int128.h"
 #include "qemu/notify.h"
+#include "qemu/rcu.h"
 
 #define MAX_PHYS_ADDR_SPACE_BITS 62
 #define MAX_PHYS_ADDR(((hwaddr)1 << MAX_PHYS_ADDR_SPACE_BITS) - 1)
@@ -168,9 +169,13 @@ struct MemoryRegion {
  */
 struct AddressSpace {
 /* All fields are private. */
+struct rcu_head rcu;
 char *name;
 MemoryRegion *root;
+
+/* Accessed via RCU.  */
 struct FlatView *current_map;
+
 int ioeventfd_nb;
 struct MemoryRegionIoeventfd *ioeventfds;
 struct AddressSpaceDispatch *dispatch;
diff --git a/memory.c b/memory.c
index bb92e17..7a4fe37 100644
--- a/memory.c
+++ b/memory.c
@@ -29,26 +29,12 @@ static unsigned memory_region_transaction_depth;
 static bool memory_region_update_pending;
 static bool global_dirty_log = false;
 
-/* flat_view_mutex is taken around reading as->current_map; the critical
- * section is extremely short, so I'm using a single mutex for every AS.
- * We could also RCU for the read-side.
- *
- * The BQL is taken around transaction commits, hence both locks are taken
- * while writing to as->current_map (with the BQL taken outside).
- */
-static QemuMutex flat_view_mutex;
-
 static QTAILQ_HEAD(memory_listeners, MemoryListener) memory_listeners
 = QTAILQ_HEAD_INITIALIZER(memory_listeners);
 
 static QTAILQ_HEAD(, AddressSpace) address_spaces
 = QTAILQ_HEAD_INITIALIZER(address_spaces);
 
-static void memory_init(void)
-{
-qemu_mutex_init(&flat_view_mutex);
-}
-
 typedef struct AddrRange AddrRange;
 
 /*
@@ -239,6 +225,7 @@ struct FlatRange {
  * order.
  */
 struct FlatView {
+struct rcu_head rcu;
 unsigned ref;
 FlatRange *ranges;
 unsigned nr;
@@ -610,10 +597,10 @@ static FlatView *address_space_get_flatview(AddressSpace 
*as)
 {
 FlatView *view;
 
-qemu_mutex_lock(&flat_view_mutex);
-view = as->current_map;
+rcu_read_lock();
+view = rcu_dereference(&as->current_map);
 flatview_ref(view);
-qemu_mutex_unlock(&flat_view_mutex);
+rcu_read_unlock();
 return view;
 }
 
@@ -722,10 +709,9 @@ static void address_space_update_topology(AddressSpace *as)
 address_space_update_topology_pass(as, old_view, new_view, false);
 address_space_update_topology_pass(as, old_view, new_view, true);
 
-qemu_mutex_lock(&flat_view_mutex);
-flatview_unref(as->current_map);
-as->current_map = new_view;
-qemu_mutex_unlock(&flat_view_mutex);
+/* Writes are protected by the BQL.  */
+rcu_assign_pointer(as->current_map, new_view);
+call_rcu(old_view, flatview_unref, rcu);
 
 /* Note that all the old MemoryRegions are still alive up to this
  * point.  This relieves most MemoryListeners from the need to
@@ -1687,10 +1673,6 @@ void memory_listener_unregister(MemoryListener *listener)
 
 void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name)
 {
-if (QTAILQ_EMPTY(&address_spaces)) {
-memory_init();
-}
-
 memory_region_transaction_begin();
 as->root = root;
 as->current_map = g_new(FlatView, 1);
@@ -1704,6 +1686,14 @@ void address_space_init(AddressSpace *as, MemoryRegion 
*root, const char *name)
 memory_region_transaction_commit();
 }
 
+static void do_address_space_destroy(AddressSpace *as)
+{
+address_space_destroy_dispatch(as);
+flatview_unref(as->current_map);
+g_free(as->name);
+g_free(as->ioeventfds);
+}
+
 void address_space_destroy(AddressSpace *as)
 {
 /* Flush out anything from MemoryListeners listening in on this */
@@ -1711,10 +1701,12 @@ void address_space_destroy(AddressSpace *as)
 as->root = NULL;
 memory_region_transaction_commit();
 QTAILQ_REMOVE(&address_spaces, as, address_spaces_link);
-address_space_destroy_dispatch(as);
-flatview_unref(as->current_map);
-g_free(as->name);
-g_free(as->ioeventfds);
+
+/* At this point, as->dispatch and as->current_map are dummy
+ * entries that the guest should never use.  Wait for the old
+ * values to expire before freeing the data.
+ */
+call_rcu(as, do_address_space_destroy, rcu);
 }
 
 bool io_mem_read(MemoryRegion *mr, hwaddr addr, uint64_t *pval, unsigned size)
-- 
1.8.1.4

[Qemu-devel] [PATCH 22/30] memory: move MemoryListener declaration earlier

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h | 66 +--
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index aa7a922..913ac45 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -164,6 +164,39 @@ struct MemoryRegion {
 NotifierList iommu_notify;
 };
 
+typedef struct MemoryListener MemoryListener;
+
+/**
+ * MemoryListener: callbacks structure for updates to the physical memory map
+ *
+ * Allows a component to adjust to changes in the guest-visible memory map.
+ * Use with memory_listener_register() and memory_listener_unregister().
+ */
+struct MemoryListener {
+void (*begin)(MemoryListener *listener);
+void (*commit)(MemoryListener *listener);
+void (*region_add)(MemoryListener *listener, MemoryRegionSection *section);
+void (*region_del)(MemoryListener *listener, MemoryRegionSection *section);
+void (*region_nop)(MemoryListener *listener, MemoryRegionSection *section);
+void (*log_start)(MemoryListener *listener, MemoryRegionSection *section);
+void (*log_stop)(MemoryListener *listener, MemoryRegionSection *section);
+void (*log_sync)(MemoryListener *listener, MemoryRegionSection *section);
+void (*log_global_start)(MemoryListener *listener);
+void (*log_global_stop)(MemoryListener *listener);
+void (*eventfd_add)(MemoryListener *listener, MemoryRegionSection *section,
+bool match_data, uint64_t data, EventNotifier *e);
+void (*eventfd_del)(MemoryListener *listener, MemoryRegionSection *section,
+bool match_data, uint64_t data, EventNotifier *e);
+void (*coalesced_mmio_add)(MemoryListener *listener, MemoryRegionSection 
*section,
+   hwaddr addr, hwaddr len);
+void (*coalesced_mmio_del)(MemoryListener *listener, MemoryRegionSection 
*section,
+   hwaddr addr, hwaddr len);
+/* Lower = earlier (during add), later (during del) */
+unsigned priority;
+AddressSpace *address_space_filter;
+QTAILQ_ENTRY(MemoryListener) link;
+};
+
 /**
  * AddressSpace: describes a mapping of addresses to #MemoryRegion objects
  */
@@ -202,39 +235,6 @@ struct MemoryRegionSection {
 bool readonly;
 };
 
-typedef struct MemoryListener MemoryListener;
-
-/**
- * MemoryListener: callbacks structure for updates to the physical memory map
- *
- * Allows a component to adjust to changes in the guest-visible memory map.
- * Use with memory_listener_register() and memory_listener_unregister().
- */
-struct MemoryListener {
-void (*begin)(MemoryListener *listener);
-void (*commit)(MemoryListener *listener);
-void (*region_add)(MemoryListener *listener, MemoryRegionSection *section);
-void (*region_del)(MemoryListener *listener, MemoryRegionSection *section);
-void (*region_nop)(MemoryListener *listener, MemoryRegionSection *section);
-void (*log_start)(MemoryListener *listener, MemoryRegionSection *section);
-void (*log_stop)(MemoryListener *listener, MemoryRegionSection *section);
-void (*log_sync)(MemoryListener *listener, MemoryRegionSection *section);
-void (*log_global_start)(MemoryListener *listener);
-void (*log_global_stop)(MemoryListener *listener);
-void (*eventfd_add)(MemoryListener *listener, MemoryRegionSection *section,
-bool match_data, uint64_t data, EventNotifier *e);
-void (*eventfd_del)(MemoryListener *listener, MemoryRegionSection *section,
-bool match_data, uint64_t data, EventNotifier *e);
-void (*coalesced_mmio_add)(MemoryListener *listener, MemoryRegionSection 
*section,
-   hwaddr addr, hwaddr len);
-void (*coalesced_mmio_del)(MemoryListener *listener, MemoryRegionSection 
*section,
-   hwaddr addr, hwaddr len);
-/* Lower = earlier (during add), later (during del) */
-unsigned priority;
-AddressSpace *address_space_filter;
-QTAILQ_ENTRY(MemoryListener) link;
-};
-
 /**
  * memory_region_init: Initialize a memory region
  *
-- 
1.8.1.4

[Qemu-devel] [PATCH 16/30] block: report RCU quiescent states

2013-06-28 Thread Paolo Bonzini

The aio workers may spend a long time executing I/O operations;
mark that time as an extended quiescent state.

Signed-off-by: Paolo Bonzini 
---
 block/raw-posix.c | 3 +++
 block/raw-win32.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index c0ccf27..637fe8a 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -26,6 +26,7 @@
 #include "qemu/log.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qemu/rcu.h"
 #include "trace.h"
 #include "block/thread-pool.h"
 #include "qemu/iov.h"
@@ -735,6 +736,7 @@ static int aio_worker(void *arg)
 RawPosixAIOData *aiocb = arg;
 ssize_t ret = 0;
 
+rcu_thread_offline();
 switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
 case QEMU_AIO_READ:
 ret = handle_aiocb_rw(aiocb);
@@ -774,6 +776,7 @@ static int aio_worker(void *arg)
 }
 
 g_slice_free(RawPosixAIOData, aiocb);
+rcu_thread_online();
 return ret;
 }
 
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 7c03b6d..fc573b7 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -25,6 +25,7 @@
 #include "qemu/timer.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qemu/rcu.h"
 #include "raw-aio.h"
 #include "trace.h"
 #include "block/thread-pool.h"
@@ -99,6 +100,7 @@ static int aio_worker(void *arg)
 ssize_t ret = 0;
 size_t count;
 
+rcu_thread_offline();
 switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
 case QEMU_AIO_READ:
 count = handle_aiocb_rw(aiocb);
@@ -136,6 +138,7 @@ static int aio_worker(void *arg)
 }
 
 g_slice_free(RawWin32AIOData, aiocb);
+rcu_thread_online();
 return ret;
 }
 
-- 
1.8.1.4

[Qemu-devel] [PATCH 15/30] cpus: report RCU quiescent states

2013-06-28 Thread Paolo Bonzini

CPU threads have extended quiescent states while relinquishing control
to the accelerator (except TCG).

Signed-off-by: Paolo Bonzini 
---
 cpus.c| 3 +++
 kvm-all.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/cpus.c b/cpus.c
index c8bc8ad..4b5cb8d 100644
--- a/cpus.c
+++ b/cpus.c
@@ -37,6 +37,7 @@
 #include "sysemu/qtest.h"
 #include "qemu/main-loop.h"
 #include "qemu/bitmap.h"
+#include "qemu/rcu.h"
 
 #ifndef _WIN32
 #include "qemu/compatfd.h"
@@ -793,6 +794,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
 while (1) {
 cpu_single_env = NULL;
 qemu_mutex_unlock_iothread();
+rcu_thread_offline();
 do {
 int sig;
 r = sigwait(&waitset, &sig);
@@ -801,6 +803,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
 perror("sigwait");
 exit(1);
 }
+rcu_thread_online();
 qemu_mutex_lock_iothread();
 cpu_single_env = env;
 qemu_wait_io_event_common(cpu);
diff --git a/kvm-all.c b/kvm-all.c
index 91aa7ff..547abe3 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -33,6 +33,7 @@
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
 #include "qemu/event_notifier.h"
+#include "qemu/rcu.h"
 #include "trace.h"
 
 /* This check must be after config-host.h is included */
@@ -1644,7 +1645,9 @@ int kvm_cpu_exec(CPUArchState *env)
 }
 qemu_mutex_unlock_iothread();
 
+rcu_thread_offline();
 run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
+rcu_thread_online();
 
 qemu_mutex_lock_iothread();
 kvm_arch_post_run(cpu, run);
-- 
1.8.1.4

[Qemu-devel] [PATCH 20/30] exec: change well-known physical sections to macros

2013-06-28 Thread Paolo Bonzini

From: Liu Ping Fan 

Sections like phys_section_unassigned always have fixed address
in phys_sections.  Declared as macro, so we can use them
when having more than one phys_sections array.

Signed-off-by: Liu Ping Fan 
Signed-off-by: Liu Ping Fan 
Signed-off-by: Paolo Bonzini 
---
 exec.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/exec.c b/exec.c
index a788981..7ad513c 100644
--- a/exec.c
+++ b/exec.c
@@ -113,10 +113,10 @@ typedef struct subpage_t {
 
 static MemoryRegionSection *phys_sections;
 static unsigned phys_sections_nb, phys_sections_nb_alloc;
-static uint16_t phys_section_unassigned;
-static uint16_t phys_section_notdirty;
-static uint16_t phys_section_rom;
-static uint16_t phys_section_watch;
+#define PHYS_SECTION_UNASSIGNED 0
+#define PHYS_SECTION_NOTDIRTY 1
+#define PHYS_SECTION_ROM 2
+#define PHYS_SECTION_WATCH 3
 
 /* Simple allocator for PhysPageEntry nodes */
 static PhysPageEntry (*phys_map_nodes)[L2_SIZE];
@@ -174,7 +174,7 @@ static void phys_page_set_level(PhysPageEntry *lp, hwaddr 
*index,
 if (level == 0) {
 for (i = 0; i < L2_SIZE; i++) {
 p[i].is_leaf = 1;
-p[i].ptr = phys_section_unassigned;
+p[i].ptr = PHYS_SECTION_UNASSIGNED;
 }
 }
 } else {
@@ -213,7 +213,7 @@ static MemoryRegionSection 
*phys_page_find(AddressSpaceDispatch *d, hwaddr index
 
 for (i = P_L2_LEVELS - 1; i >= 0 && !lp.is_leaf; i--) {
 if (lp.ptr == PHYS_MAP_NODE_NIL) {
-return &phys_sections[phys_section_unassigned];
+return &phys_sections[PHYS_SECTION_UNASSIGNED];
 }
 p = phys_map_nodes[lp.ptr];
 lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
@@ -731,9 +731,9 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 iotlb = (memory_region_get_ram_addr(section->mr) & TARGET_PAGE_MASK)
 + xlat;
 if (!section->readonly) {
-iotlb |= phys_section_notdirty;
+iotlb |= PHYS_SECTION_NOTDIRTY;
 } else {
-iotlb |= phys_section_rom;
+iotlb |= PHYS_SECTION_ROM;
 }
 } else {
 iotlb = section - phys_sections;
@@ -746,7 +746,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 if (vaddr == (wp->vaddr & TARGET_PAGE_MASK)) {
 /* Avoid trapping reads of pages with a write breakpoint. */
 if ((prot & PAGE_WRITE) || (wp->flags & BP_MEM_READ)) {
-iotlb = phys_section_watch + paddr;
+iotlb = PHYS_SECTION_WATCH + paddr;
 *address |= TLB_MMIO;
 break;
 }
@@ -1670,7 +1670,7 @@ static subpage_t *subpage_init(AddressSpace *as, hwaddr 
base)
 printf("%s: %p base " TARGET_FMT_plx " len %08x %d\n", __func__,
mmio, base, TARGET_PAGE_SIZE, subpage_memory);
 #endif
-subpage_register(mmio, 0, TARGET_PAGE_SIZE-1, phys_section_unassigned);
+subpage_register(mmio, 0, TARGET_PAGE_SIZE-1, PHYS_SECTION_UNASSIGNED);
 
 return mmio;
 }
@@ -1712,11 +1712,17 @@ static void mem_begin(MemoryListener *listener)
 
 static void core_begin(MemoryListener *listener)
 {
+uint16_t n;
+
 phys_sections_clear();
-phys_section_unassigned = dummy_section(&io_mem_unassigned);
-phys_section_notdirty = dummy_section(&io_mem_notdirty);
-phys_section_rom = dummy_section(&io_mem_rom);
-phys_section_watch = dummy_section(&io_mem_watch);
+n = dummy_section(&io_mem_unassigned);
+assert(n == PHYS_SECTION_UNASSIGNED);
+n = dummy_section(&io_mem_notdirty);
+assert(n == PHYS_SECTION_NOTDIRTY);
+n = dummy_section(&io_mem_rom);
+assert(n == PHYS_SECTION_ROM);
+n = dummy_section(&io_mem_watch);
+assert(n == PHYS_SECTION_WATCH);
 }
 
 static void tcg_commit(MemoryListener *listener)
-- 
1.8.1.4

[Qemu-devel] [PATCH 11/30] rcu: add rcutorture

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 tests/Makefile |   5 +-
 tests/rcutorture.c | 439 +
 2 files changed, 443 insertions(+), 1 deletion(-)
 create mode 100644 tests/rcutorture.c

diff --git a/tests/Makefile b/tests/Makefile
index 0400b39..6a81b94 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -50,6 +50,8 @@ gcov-files-test-tls-y =
 check-unit-y += tests/test-int128$(EXESUF)
 # all code tested by test-int128 is inside int128.h
 gcov-files-test-int128-y =
+check-unit-y += tests/rcutorture$(EXESUF)
+gcov-files-rcutorture-y = util/rcu.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -82,7 +84,7 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o 
tests/check-qdict.o \
tests/test-qmp-input-visitor.o tests/test-qmp-input-strict.o \
tests/test-qmp-commands.o tests/test-visitor-serialization.o \
tests/test-x86-cpuid.o tests/test-mul64.o tests/test-int128.o \
-   tests/test-tls.o
+   tests/test-tls.o tests/rcutorture.o
 
 test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o
 
@@ -107,6 +109,7 @@ tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o xbzrle.o 
page_cache.o libqemuuti
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
 tests/test-int128$(EXESUF): tests/test-int128.o
 tests/test-tls$(EXESUF): tests/test-tls.o libqemuutil.a
+tests/rcutorture$(EXESUF): tests/rcutorture.o libqemuutil.a
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/rcutorture.c b/tests/rcutorture.c
new file mode 100644
index 000..02ab9ed
--- /dev/null
+++ b/tests/rcutorture.c
@@ -0,0 +1,439 @@
+/*
+ * rcutorture.c: simple user-level performance/stress test of RCU.
+ *
+ * Usage:
+ * ./rcu  rperf [  ]
+ * Run a read-side performance test with the specified
+ * number of readers for  seconds.
+ * ./rcu  uperf [  ]
+ * Run an update-side performance test with the specified
+ * number of updaters and specified duration.
+ * ./rcu  perf [  ]
+ * Run a combined read/update performance test with the specified
+ * number of readers and one updater and specified duration.
+ *
+ * The above tests produce output as follows:
+ *
+ * n_reads: 46008000  n_updates: 146026  nreaders: 2  nupdaters: 1 duration: 1
+ * ns/read: 43.4707  ns/update: 6848.1
+ *
+ * The first line lists the total number of RCU reads and updates executed
+ * during the test, the number of reader threads, the number of updater
+ * threads, and the duration of the test in seconds.  The second line
+ * lists the average duration of each type of operation in nanoseconds,
+ * or "nan" if the corresponding type of operation was not performed.
+ *
+ * ./rcu  stress [  ]
+ * Run a stress test with the specified number of readers and
+ * one updater.
+ *
+ * This test produces output as follows:
+ *
+ * n_reads: 114633217  n_updates: 3903415  n_mberror: 0
+ * rcu_stress_count: 114618391 14826 0 0 0 0 0 0 0 0 0
+ *
+ * The first line lists the number of RCU read and update operations
+ * executed, followed by the number of memory-ordering violations
+ * (which will be zero in a correct RCU implementation).  The second
+ * line lists the number of readers observing progressively more stale
+ * data.  A correct RCU implementation will have all but the first two
+ * numbers non-zero.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (c) 2008 Paul E. McKenney, IBM Corporation.
+ */
+
+/*
+ * Test variables.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "qemu/atomic.h"
+#include "qemu/rcu.h"
+#include "qemu/compiler.h"
+#include "qemu/thread.h"
+
+long long n_reads = 0LL;
+long n_updates = 0L;
+int nthreadsrunning;
+
+char argsbuf[64];
+
+#define GOFLAG_INIT 0
+#define GOFLAG_RUN  1
+#define GOFLAG_STOP 2
+
+static volatile int goflag = GOFLAG_INIT;
+
+#define RCU_READ_RUN 1000
+
+#define NR_THREADS 100
+static QemuThread threads[NR_THREADS];
+static struct rcu_reader_data *data[NR_THREADS];
+static int n_threads;
+
+static void create_thread(void *(*func)(void *))
+{
+if (n_threads >= NR_THREADS) {
+fprintf(stderr, "Thread limit of %d exceeded!\n", NR_THREADS);
+ex

[Qemu-devel] [PATCH 12/30] rcu: allow nested calls to rcu_thread_offline/rcu_thread_online

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 docs/rcu.txt   |  5 +
 include/qemu/rcu.h | 21 +++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index ca1c099..a3510b9 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -187,6 +187,11 @@ Marking quiescent states is done with the following three 
APIs:
 thread.
 
 
+rcu_thread_offline() and rcu_thread_online() can be nested.  The end of
+the extended quiescent state will coincide with the outermost call to
+rcu_thread_online().
+
+
 The following APIs can be used to use RCU in a thread that is not
 created with qemu_thread_create():
 
diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index 1ed70bb..eff6d1e 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -78,6 +78,9 @@ struct rcu_reader_data {
 unsigned long ctr;
 bool waiting;
 
+/* Data used by reader only */
+unsigned offline;
+
 /* Data used for registry, protected by rcu_gp_lock */
 QLIST_ENTRY(rcu_reader_data) node;
 };
@@ -100,9 +103,14 @@ static inline void rcu_read_unlock(void)
 static inline void rcu_quiescent_state(void)
 {
 struct rcu_reader_data *p_rcu_reader = tls_get_rcu_reader();
+unsigned ctr;
+
+if (p_rcu_reader->offline > 0) {
+return;
+}
 
 /* Reuses smp_rmb() in the last rcu_read_unlock().  */
-unsigned ctr = atomic_read(&rcu_gp_ctr);
+ctr = atomic_read(&rcu_gp_ctr);
 atomic_xchg(&p_rcu_reader->ctr, ctr);
 if (atomic_read(&p_rcu_reader->waiting)) {
 atomic_set(&p_rcu_reader->waiting, false);
@@ -114,6 +122,10 @@ static inline void rcu_thread_offline(void)
 {
 struct rcu_reader_data *p_rcu_reader = tls_get_rcu_reader();
 
+if (p_rcu_reader->offline++ > 0) {
+return;
+}
+
 atomic_xchg(&p_rcu_reader->ctr, 0);
 if (atomic_read(&p_rcu_reader->waiting)) {
 atomic_set(&p_rcu_reader->waiting, false);
@@ -123,7 +135,12 @@ static inline void rcu_thread_offline(void)
 
 static inline void rcu_thread_online(void)
 {
-rcu_quiescent_state();
+struct rcu_reader_data *p_rcu_reader = tls_get_rcu_reader();
+
+assert(p_rcu_reader->offline != 0);
+if (--p_rcu_reader->offline == 0) {
+rcu_quiescent_state();
+}
 }
 
 extern void synchronize_rcu(void);
-- 
1.8.1.4

[Qemu-devel] [PATCH 19/30] memory: avoid ref/unref in memory_region_find

2013-06-28 Thread Paolo Bonzini

Do the entire lookup under RCU.

Signed-off-by: Paolo Bonzini 
---
 memory.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/memory.c b/memory.c
index 7a4fe37..4fc23a2 100644
--- a/memory.c
+++ b/memory.c
@@ -1559,7 +1559,8 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 as = memory_region_to_address_space(root);
 range = addrrange_make(int128_make64(addr), int128_make64(size));
 
-view = address_space_get_flatview(as);
+rcu_read_lock();
+view = rcu_dereference(&as->current_map);
 fr = flatview_lookup(view, range);
 if (!fr) {
 return ret;
@@ -1580,7 +1581,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 ret.readonly = fr->readonly;
 memory_region_ref(ret.mr);
 
-flatview_unref(view);
+rcu_read_unlock();
 return ret;
 }
 
-- 
1.8.1.4

[Qemu-devel] [PATCH 17/30] migration: report RCU quiescent states

2013-06-28 Thread Paolo Bonzini

The migration thread polls s->state periodically, it does not
use a mutex or condition variable, so it has to report quiescent
states manually.

Signed-off-by: Paolo Bonzini 
---
 migration.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration.c b/migration.c
index 83f5691..28010dc 100644
--- a/migration.c
+++ b/migration.c
@@ -22,6 +22,7 @@
 #include "qemu/sockets.h"
 #include "migration/block.h"
 #include "qemu/thread.h"
+#include "qemu/rcu.h"
 #include "qmp-commands.h"
 #include "trace.h"
 
@@ -508,6 +509,7 @@ static void *migration_thread(void *opaque)
 int64_t current_time;
 uint64_t pending_size;
 
+rcu_quiescent_state();
 if (!qemu_file_rate_limit(s->file)) {
 DPRINTF("iterate\n");
 pending_size = qemu_savevm_state_pending(s->file, max_size);
-- 
1.8.1.4

[Qemu-devel] [PATCH 10/30] rcu: add call_rcu

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 docs/rcu.txt   | 108 +--
 include/qemu/rcu.h |  22 ++
 util/rcu.c | 120 +
 3 files changed, 246 insertions(+), 4 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index 118a28a..ca1c099 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -82,7 +82,50 @@ The core RCU API is small:
 Note that it would be valid for another update to come while
 synchronize_rcu is running.  Because of this, it is better that
 the updater releases any locks it may hold before calling
-synchronize_rcu.
+synchronize_rcu.  If this is not possible (for example, because
+the updater is protected by the BQL), you can use call_rcu.
+
+ void call_rcu1(struct rcu_head * head,
+void (*func)(struct rcu_head *head));
+
+This function invokes func(head) after all pre-existing RCU
+read-side critical sections on all threads have completed.  This
+marks the end of the removal phase, with func taking care
+asynchronously of the reclamation phase.
+
+The foo struct needs to have an rcu_head structure added,
+perhaps as follows:
+
+struct foo {
+struct rcu_head rcu;
+int a;
+char b;
+long c;
+};
+
+so that the reclaimer function can fetch the struct foo address
+and free it:
+
+call_rcu1(foo_reclaim, &foo.rcu);
+
+void foo_reclaim(struct rcu_head *rp)
+{
+struct foo *fp = container_of(rp, struct foo, rcu);
+g_free(fp);
+}
+
+For the common case where the rcu_head member is the first of the
+struct, you can use the following macro.
+
+ void call_rcu(T *p,
+   void (*func)(T *p),
+   field-name);
+
+call_rcu1 is typically used through this macro, in the common case
+where the "struct rcu_head" is the first field in the struct.  In
+the above case, one could have written simply:
+
+call_rcu(foo_reclaim, g_free, rcu);
 
  typeof(*p) rcu_dereference(p);
  typeof(p) rcu_assign_pointer(p, typeof(p) v);
@@ -176,6 +219,11 @@ DIFFERENCES WITH LINUX
 - rcu_dereference takes a _pointer_ to the variable being accessed.
   Wrong usage will be detected by the compiler.
 
+- call_rcu is a macro that has an extra argument (the name of the first
+  field in the struct, which must be a struct rcu_head), and expects the
+  type of the callback's argument to be the type of the first argument.
+  call_rcu1 is the same as Linux's call_rcu.
+
 - Quiescent points must be marked explicitly in the thread.
 
 
@@ -231,7 +279,47 @@ The write side looks simply like this (with appropriate 
locking):
 synchronize_rcu();
 free(old);
 
-Note that the same idiom would be possible with reader/writer
+If the processing cannot be done purely within the critical section, it
+is possible to combine this idiom with a "real" reference count:
+
+rcu_read_lock();
+p = rcu_dereference(&foo);
+foo_ref(p);
+rcu_read_unlock();
+/* do something with p. */
+foo_unref(p);
+
+The write side can be like this:
+
+qemu_mutex_lock(&foo_mutex);
+old = foo;
+rcu_assign_pointer(foo, new);
+qemu_mutex_unlock(&foo_mutex);
+synchronize_rcu();
+foo_unref(old);
+
+or with call_rcu:
+
+qemu_mutex_lock(&foo_mutex);
+old = foo;
+rcu_assign_pointer(foo, new);
+qemu_mutex_unlock(&foo_mutex);
+call_rcu(foo_unref, old, rcu);
+
+In both cases, the write side only performs removal.  Reclamation
+happens when the last reference to a "foo" object is dropped.
+Using synchronize_rcu() is undesirably expensive, because the
+last reference may be dropped on the read side.  Hence you can
+use call_rcu() instead:
+
+ foo_unref(struct foo *p) {
+if (atomic_dec(&p->refcount) == 0) {
+call_rcu(foo_destroy, p, rcu);
+}
+}
+
+
+Note that the same idioms would be possible with reader/writer
 locks:
 
 read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock);
@@ -241,13 +329,25 @@ locks:
 write_mutex_unlock(&foo_rwlock);
 free(p);
 
+--
+
+read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock);
+p = foo;old = foo;
+foo_ref(p); foo = new;
+read_unlock(&foo_rwlock);   write_mutex_unlock(&foo_rwlock);
+/* do something with p. */  foo_unref(old);
+foo_unref(p);
+
+foo_unref could use a mechanism such as bottom halves to move deallocation
+out of hot paths.
+
 
 RCU resizable arrays
 
 
 Resizable arrays can be used with RCU.  The expensi

[Qemu-devel] [PATCH 04/30] add a header file for atomic operations

2013-06-28 Thread Paolo Bonzini

We're already using them in several places, but __sync builtins are just
too ugly to type, and do not provide seqcst load/store operations.

Signed-off-by: Paolo Bonzini 
---
 docs/atomics.txt | 345 +++
 hw/display/qxl.c |   3 +-
 hw/virtio/vhost.c|   9 +-
 include/qemu/atomic.h| 190 +-
 migration.c  |   3 +-
 tests/test-thread-pool.c |   8 +-
 6 files changed, 514 insertions(+), 44 deletions(-)
 create mode 100644 docs/atomics.txt

diff --git a/docs/atomics.txt b/docs/atomics.txt
new file mode 100644
index 000..e2ce04b
--- /dev/null
+++ b/docs/atomics.txt
@@ -0,0 +1,345 @@
+CPUs perform independent memory operations effectively in random order.
+but this can be a problem for CPU-CPU interaction (including interactions
+between QEMU and the guest).  Multi-threaded programs use various tools
+to instruct the compiler and the CPU to restrict the order to something
+that is consistent with the expectations of the programmer.
+
+The most basic tool is locking.  Mutexes, condition variables and
+semaphores are used in QEMU, and should be the default approach to
+synchronization.  Anything else is considerably harder, but it's
+also justified more often than one would like.  The two tools that
+are provided by qemu/atomic.h are memory barriers and atomic operations.
+
+Macros defined by qemu/atomic.h fall in three camps:
+
+- compiler barriers: barrier();
+
+- weak atomic access and manual memory barriers: atomic_read(),
+  atomic_set(), smp_rmb(), smp_wmb(), smp_mb(), smp_read_barrier_depends();
+
+- sequentially consistent atomic access: everything else.
+
+
+COMPILER MEMORY BARRIER
+===
+
+barrier() prevents the compiler from moving the memory accesses either
+side of it to the other side.  The compiler barrier has no direct effect
+on the CPU, which may then reorder things however it wishes.
+
+barrier() is mostly used within qemu/atomic.h itself.  On some
+architectures, CPU guarantees are strong enough that blocking compiler
+optimizations already ensures the correct order of execution.  In this
+case, qemu/atomic.h will reduce stronger memory barriers to simple
+compiler barriers.
+
+Still, barrier() can be useful when writing code that can be interrupted
+by signal handlers.
+
+
+SEQUENTIALLY CONSISTENT ATOMIC ACCESS
+=
+
+Most of the operations in the qemu/atomic.h header ensure *sequential
+consistency*, where "the result of any execution is the same as if the
+operations of all the processors were executed in some sequential order,
+and the operations of each individual processor appear in this sequence
+in the order specified by its program".
+
+qemu/atomic.h provides the following set of atomic read-modify-write
+operations:
+
+typeof(*ptr) atomic_inc(ptr)
+typeof(*ptr) atomic_dec(ptr)
+typeof(*ptr) atomic_add(ptr, val)
+typeof(*ptr) atomic_sub(ptr, val)
+typeof(*ptr) atomic_and(ptr, val)
+typeof(*ptr) atomic_or(ptr, val)
+typeof(*ptr) atomic_xchg(ptr, val
+typeof(*ptr) atomic_cmpxchg(ptr, old, new)
+
+all of which return the old value of *ptr.  These operations are
+polymorphic; they operate on any type that is as wide as an int.
+
+Sequentially consistent loads and stores can be done using:
+
+atomic_add(ptr, 0) for loads
+atomic_xchg(ptr, val) for stores
+
+However, they are quite expensive on some platforms, notably POWER and
+ARM.  Therefore, qemu/atomic.h provides two primitives with slightly
+weaker constraints:
+
+typeof(*ptr) atomic_mb_read(ptr)
+void atomic_mb_set(ptr, val)
+
+The semantics of these primitives map to Java volatile variables,
+and are strongly related to memory barriers as used in the Linux
+kernel (see below).
+
+As long as you use atomic_mb_read and atomic_mb_set, accesses cannot
+be reordered with each other, and it is also not possible to reorder
+"normal" accesses around them.
+
+However, and this is the important difference between
+atomic_mb_read/atomic_mb_set and sequential consistency, it is important
+for both threads to access the same volatile variable.  It is not the
+case that everything visible to thread A when it writes volatile field f
+becomes visible to thread B after it reads volatile field g. The store
+and load have to "match" (i.e., be performed on the same volatile
+field) to achieve the right semantics.
+
+
+These operations operate on any type that is as wide as an int or smaller.
+
+
+WEAK ATOMIC ACCESS AND MANUAL MEMORY BARRIERS
+=
+
+Compared to sequentially consistent atomic access, programming with
+weaker consistency models can be considerably more complicated.
+In general, if the algorithm you are writing includes both writes
+and reads on the same side, it is generally simpler to use sequentially
+consistent primitives.
+
+When using this model, variables are accessed with atom

[Qemu-devel] [PATCH 06/30] qemu-thread: add TLS wrappers

2013-06-28 Thread Paolo Bonzini

Fast TLS is not available on some platforms, but it is always nice to
use it.  This wrapper implementation falls back to pthread_get/setspecific
on POSIX systems that lack __thread, but uses the dynamic linker's TLS
support on Linux and Windows.

The user shall call alloc_foo() in every thread that needs to access the
variable---exactly once and before any access.  foo is the name of the
variable as passed to DECLARE_TLS and DEFINE_TLS.  Then, get_foo() will
return the address of the variable.  It is guaranteed to remain the same
across the lifetime of a thread, so you can cache it.

Signed-off-by: Paolo Bonzini 
---
 configure|  21 +++
 include/qemu/tls.h   | 139 +++
 tests/Makefile   |   7 ++-
 tests/test-tls.c |  87 +
 util/qemu-thread-win32.c |  17 ++
 5 files changed, 270 insertions(+), 1 deletion(-)
 create mode 100644 include/qemu/tls.h
 create mode 100644 tests/test-tls.c

diff --git a/configure b/configure
index 0e0adde..8ccdf2e 100755
--- a/configure
+++ b/configure
@@ -284,6 +284,7 @@ fi
 ar="${AR-${cross_prefix}ar}"
 as="${AS-${cross_prefix}as}"
 cpp="${CPP-$cc -E}"
+nm="${NM-${cross_prefix}nm}"
 objcopy="${OBJCOPY-${cross_prefix}objcopy}"
 ld="${LD-${cross_prefix}ld}"
 libtool="${LIBTOOL-${cross_prefix}libtool}"
@@ -3163,6 +3164,22 @@ if test "$trace_backend" = "dtrace"; then
 fi
 
 ##
+# check for TLS runtime
+
+# Some versions of mingw include the "magic" definitions that make
+# TLS work, some don't.  Check for it.
+
+if test "$mingw32" = yes; then
+  cat > $TMPC << EOF
+int main(void) {}
+EOF
+  compile_prog "" ""
+  if $nm $TMPE | grep _tls_used > /dev/null 2>&1; then
+mingw32_tls_runtime=yes
+  fi
+fi
+
+##
 # check and set a backend for coroutine
 
 # We prefer ucontext, but it's not always possible. The fallback
@@ -3621,6 +3638,9 @@ if test "$mingw32" = "yes" ; then
   version_micro=0
   echo 
"CONFIG_FILEVERSION=$version_major,$version_minor,$version_subminor,$version_micro"
 >> $config_host_mak
   echo 
"CONFIG_PRODUCTVERSION=$version_major,$version_minor,$version_subminor,$version_micro"
 >> $config_host_mak
+  if test "$mingw32_tls_runtime" = yes; then
+echo "CONFIG_MINGW32_TLS_RUNTIME=y" >> $config_host_mak
+  fi
 else
   echo "CONFIG_POSIX=y" >> $config_host_mak
 fi
@@ -4043,6 +4063,7 @@ echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
 echo "AS=$as" >> $config_host_mak
 echo "CPP=$cpp" >> $config_host_mak
+echo "NM=$nm" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
 echo "LD=$ld" >> $config_host_mak
 echo "WINDRES=$windres" >> $config_host_mak
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
new file mode 100644
index 000..6a0f976
--- /dev/null
+++ b/include/qemu/tls.h
@@ -0,0 +1,139 @@
+/*
+ * Abstraction layer for defining and using TLS variables
+ *
+ * Copyright (c) 2011, 2013 Red Hat, Inc
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Paolo Bonzini 
+ *  Peter Maydell 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+
+#ifndef QEMU_TLS_H
+#define QEMU_TLS_H
+
+#if defined(__linux__) || defined(__FreeBSD__)
+#define DECLARE_TLS(type, x) \
+extern __thread typeof(type) x;  \
+ \
+static inline typeof(type) *tls_get_##x(void)\
+{\
+return &x;   \
+}\
+ \
+static inline typeof(type) *tls_alloc_##x(void)  \
+{\
+return &x;   \
+}\
+ \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)  \
+__thread typeof(type) x
+
+#elif defined CONFIG_POSIX
+typedef struct QEMUTLSValue {
+pthread_key_t k;
+pthread_once_t o;
+} QEMUTLSValue;
+
+#define DECLARE_TLS(type, x) \
+extern QEMUTLSValue x;   \
+extern void init_##x(void);  \
+ \
+sta

[Qemu-devel] [PATCH 03/30] memory: add reference counting to FlatView

2013-06-28 Thread Paolo Bonzini

With this change, a FlatView can be used even after a concurrent
update has replaced it.  Because we do not have RCU, we use a
mutex to protect the small critical sections that read/write the
as->current_map pointer.  Accesses to the FlatView can be done
outside the mutex.

If a MemoryRegion will be used after the FlatView is unref-ed (or after
a MemoryListener callback is returned), a reference has to be added to
that MemoryRegion.  For example, memory_region_find adds a reference to
the MemoryRegion that it returns.

Signed-off-by: Paolo Bonzini 
---
 memory.c | 79 
 1 file changed, 69 insertions(+), 10 deletions(-)

diff --git a/memory.c b/memory.c
index 319894e..bb92e17 100644
--- a/memory.c
+++ b/memory.c
@@ -29,12 +29,26 @@ static unsigned memory_region_transaction_depth;
 static bool memory_region_update_pending;
 static bool global_dirty_log = false;
 
+/* flat_view_mutex is taken around reading as->current_map; the critical
+ * section is extremely short, so I'm using a single mutex for every AS.
+ * We could also RCU for the read-side.
+ *
+ * The BQL is taken around transaction commits, hence both locks are taken
+ * while writing to as->current_map (with the BQL taken outside).
+ */
+static QemuMutex flat_view_mutex;
+
 static QTAILQ_HEAD(memory_listeners, MemoryListener) memory_listeners
 = QTAILQ_HEAD_INITIALIZER(memory_listeners);
 
 static QTAILQ_HEAD(, AddressSpace) address_spaces
 = QTAILQ_HEAD_INITIALIZER(address_spaces);
 
+static void memory_init(void)
+{
+qemu_mutex_init(&flat_view_mutex);
+}
+
 typedef struct AddrRange AddrRange;
 
 /*
@@ -225,6 +239,7 @@ struct FlatRange {
  * order.
  */
 struct FlatView {
+unsigned ref;
 FlatRange *ranges;
 unsigned nr;
 unsigned nr_allocated;
@@ -246,6 +261,7 @@ static bool flatrange_equal(FlatRange *a, FlatRange *b)
 
 static void flatview_init(FlatView *view)
 {
+view->ref = 1;
 view->ranges = NULL;
 view->nr = 0;
 view->nr_allocated = 0;
@@ -279,6 +295,18 @@ static void flatview_destroy(FlatView *view)
 g_free(view);
 }
 
+static void flatview_ref(FlatView *view)
+{
+__sync_fetch_and_add(&view->ref, 1);
+}
+
+static void flatview_unref(FlatView *view)
+{
+if (__sync_fetch_and_sub(&view->ref, 1) == 1) {
+flatview_destroy(view);
+}
+}
+
 static bool can_merge(FlatRange *r1, FlatRange *r2)
 {
 return int128_eq(addrrange_end(r1->addr), r2->addr.start)
@@ -578,6 +606,17 @@ static void address_space_add_del_ioeventfds(AddressSpace 
*as,
 }
 }
 
+static FlatView *address_space_get_flatview(AddressSpace *as)
+{
+FlatView *view;
+
+qemu_mutex_lock(&flat_view_mutex);
+view = as->current_map;
+flatview_ref(view);
+qemu_mutex_unlock(&flat_view_mutex);
+return view;
+}
+
 static void address_space_update_ioeventfds(AddressSpace *as)
 {
 FlatView *view;
@@ -587,7 +626,7 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 AddrRange tmp;
 unsigned i;
 
-view = as->current_map;
+view = address_space_get_flatview(as);
 FOR_EACH_FLAT_RANGE(fr, view) {
 for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
 tmp = addrrange_shift(fr->mr->ioeventfds[i].addr,
@@ -609,6 +648,7 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 g_free(as->ioeventfds);
 as->ioeventfds = ioeventfds;
 as->ioeventfd_nb = ioeventfd_nb;
+flatview_unref(view);
 }
 
 static void address_space_update_topology_pass(AddressSpace *as,
@@ -676,14 +716,25 @@ static void 
address_space_update_topology_pass(AddressSpace *as,
 
 static void address_space_update_topology(AddressSpace *as)
 {
-FlatView *old_view = as->current_map;
+FlatView *old_view = address_space_get_flatview(as);
 FlatView *new_view = generate_memory_topology(as->root);
 
 address_space_update_topology_pass(as, old_view, new_view, false);
 address_space_update_topology_pass(as, old_view, new_view, true);
 
+qemu_mutex_lock(&flat_view_mutex);
+flatview_unref(as->current_map);
 as->current_map = new_view;
-flatview_destroy(old_view);
+qemu_mutex_unlock(&flat_view_mutex);
+
+/* Note that all the old MemoryRegions are still alive up to this
+ * point.  This relieves most MemoryListeners from the need to
+ * ref/unref the MemoryRegions they get---unless they use them
+ * outside the iothread mutex, in which case precise reference
+ * counting is necessary.
+ */
+flatview_unref(old_view);
+
 address_space_update_ioeventfds(as);
 }
 
@@ -1146,12 +1197,13 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr)
 FlatRange *fr;
 
 QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
-FlatView *view = as->current_map;
+FlatView *view = address_space_get_flatview(as);
 FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forwar

[Qemu-devel] [PATCH 00/30] Memory API changes for 1.6: RCU-protected address space dispatch

2013-06-28 Thread Paolo Bonzini

These are the next part in the unlocked address space dispatch work.
At the end of this series, address space dispatch runs under an RCU
critical section (RCU is hopefully well documented in patch 8).
It becomes then possible to add unlocked variants of address_space_rw.

Compared to Liu Ping Fan's approach posted last year, the path I followed
is more long-winded but requires much less changes in device models that
choose to adopt lockless MMIO dispatching.  This is nice, because it
produces an API that is easier to use.

In addition, I decided to sidestep complicated locking discussions by
using RCU.  RCU can bulk-protect the read side of all data structures
while remaining extremely light-weight.  The write side needs its own
lock, but in many cases it can simply be the BQL (very useful for things
that perform updates in callbacks!).

Patches 1-3 add reference counting to FlatView, which is independent
from the other patches but a necessary prerequisite to protect
as->current_map with RCU.

Patches 4-17 implement RCU and update existing threads to be RCU-friendly
(again, more information is in docs/rcu.txt) whenever necessary.
A lot of the code here comes from the URCU library.

Patches 18-30 do the rest of the work, using fresh copies of the dispatch
data structures on every update.  Several patches in this series are
based on Liu Ping Fan's work, which I made more fine-grained.

The diffstat looks daunting; however, note that ~50% of the new lines
are documentation and test cases.  Available from branch rcu of my
github repository.

I don't plan to send a pull request for this series anytime soon,
but reviews and testing are welcome.

Paolo

Liu Ping Fan (1):
  exec: change well-known physical sections to macros

Paolo Bonzini (29):
  memory: access FlatView from a local variable
  memory: use a new FlatView pointer on every topology update
  memory: add reference counting to FlatView
  add a header file for atomic operations
  exec: do not use qemu/tls.h
  qemu-thread: add TLS wrappers
  qemu-thread: add QemuEvent
  rcu: add rcu library
  qemu-thread: register threads with RCU
  rcu: add call_rcu
  rcu: add rcutorture
  rcu: allow nested calls to rcu_thread_offline/rcu_thread_online
  qemu-thread: report RCU quiescent states
  event loop: report RCU quiescent states
  cpus: report RCU quiescent states
  block: report RCU quiescent states
  migration: report RCU quiescent states
  memory: protect current_map by RCU
  memory: avoid ref/unref in memory_region_find
  exec: separate current memory map from the one being built
  memory: move MemoryListener declaration earlier
  exec: move listener from AddressSpaceDispatch to AddressSpace
  exec: separate current radix tree from the one being built
  exec: put memory map in AddressSpaceDispatch
  exec: remove cur_map
  exec: change some APIs to take AddressSpaceDispatch
  exec: change iotlb APIs to take AddressSpaceDispatch
  exec: add a reference to the region returned by
address_space_translate
  exec: put address space dispatch under RCU critical section

 aio-posix.c |   8 +
 aio-win32.c |  11 +-
 block/raw-posix.c   |   3 +
 block/raw-win32.c   |   3 +
 configure   |  21 +++
 cpus.c  |   3 +
 cputlb.c|   9 +-
 docs/atomics.txt| 345 ++
 docs/rcu.txt| 440 
 exec.c  | 244 
 hw/9pfs/virtio-9p-synth.c   |   1 +
 hw/display/qxl.c|   3 +-
 hw/ppc/spapr_iommu.c|  10 +-
 hw/virtio/vhost.c   |   9 +-
 include/exec/cpu-all.h  |  14 +-
 include/exec/cputlb.h   |   9 +-
 include/exec/memory.h   |  77 
 include/qemu/atomic.h   | 190 +++
 include/qemu/queue.h|  13 ++
 include/qemu/rcu-pointer.h  | 110 +++
 include/qemu/rcu.h  | 180 ++
 include/qemu/thread-posix.h |   8 +
 include/qemu/thread-win32.h |   4 +
 include/qemu/thread.h   |  10 +-
 include/qemu/tls.h  | 125 +++--
 kvm-all.c   |   3 +
 libcacard/Makefile  |   3 +-
 main-loop.c |   6 +
 memory.c| 127 +
 migration.c |   5 +-
 tests/Makefile  |  10 +-
 tests/rcutorture.c  | 439 +++
 tests/test-thread-pool.c|   8 +-
 tests/test-tls.c|  87 +
 util/Makefile.objs  |   1 +
 util/qemu-thread-posix.c| 174 +-
 util/qemu-thread-win32.c|  61 +-
 util/rcu.c  | 312 +++
 38 files changed, 2857 insertions(+), 229 deletions(-)
 create mode 100644 docs/atomics.txt
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu-pointer.h
 create mode 100644 include/qemu/rcu.h
 create mode 100644 t

[Qemu-devel] [PATCH 08/30] rcu: add rcu library

2013-06-28 Thread Paolo Bonzini

This includes a (mangled) copy of the urcu-qsbr code from liburcu.
The main changes are: 1) removing dependencies on many other header files
in liburcu; 2) removing for simplicity the tentative busy waiting in
synchronize_rcu, which has limited performance effects; 3) replacing
futexes in synchronize_rcu with QemuEvents for Win32 portability.
The API is the same as liburcu, so it should be possible in the future
to require liburcu on POSIX systems for example and use our copy only
on Windows.

Among the various versions available I chose urcu-qsbr, which has the
fastest rcu_read_{lock,unlock} but requires the program to manually
annotate quiescent points or intervals.  QEMU threads usually have easily
identified quiescent periods, so this should not be a problem.

Signed-off-by: Paolo Bonzini 
---
 docs/rcu.txt   | 303 +
 hw/9pfs/virtio-9p-synth.c  |   1 +
 include/qemu/queue.h   |  13 ++
 include/qemu/rcu-pointer.h | 110 
 include/qemu/rcu.h | 141 +
 include/qemu/thread.h  |   3 -
 libcacard/Makefile |   3 +-
 util/Makefile.objs |   1 +
 util/rcu.c | 195 +
 9 files changed, 766 insertions(+), 4 deletions(-)
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu-pointer.h
 create mode 100644 include/qemu/rcu.h
 create mode 100644 util/rcu.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
new file mode 100644
index 000..4869ec7
--- /dev/null
+++ b/docs/rcu.txt
@@ -0,0 +1,303 @@
+Using RCU (Read-Copy-Update) for synchronization
+
+
+Read-copy update (RCU) is a synchronization mechanism that is used to
+protect read-mostly data structures.  RCU is very efficient and scalable
+on the read side (it is wait-free), and thus can make the read paths
+extremely fast.
+
+RCU supports concurrency between a single writer and multiple readers,
+thus it is not used alone.  Typically, the write-side will use a lock to
+serialize multiple updates, but other approaches are possible (e.g.,
+restricting updates to a single task).  In QEMU, when a lock is used,
+this will often be the "iothread mutex", also known as the "big QEMU
+lock" (BQL).  Also, restricting updates to a single task is done in
+QEMU using the "bottom half" API.
+
+RCU is fundamentally a "wait-to-finish" mechanism.  The read side marks
+sections of code with "critical sections", and the update side will wait
+for the execution of all *currently running* critical sections before
+proceeding, or before asynchronously executing a callback.
+
+The key point here is that only the currently running critical sections
+are waited for; critical sections that are started _after_ the beginning
+of the wait do not extend the wait, despite running concurrently with
+the updater.  This is the reason why RCU is more scalable than,
+for example, reader-writer locks.  It is so much more scalable that
+the system will have a single instance of the RCU mechanism; a single
+mechanism can be used for an arbitrary number of "things", without
+having to worry about things such as contention or deadlocks.
+
+How is this possible?  The basic idea is to split updates in two phases,
+"removal" and "reclamation".  During removal, we ensure that subsequent
+readers will not be able to get a reference to the old data.  After
+removal has completed, a critical section will not be able to access
+the old data.  Therefore, critical sections that begin after removal
+do not matter; as soon as all previous critical sections have finished,
+there cannot be any readers who hold references to the data structure,
+which may not be safely reclaimed (e.g., freed or unref'ed).
+
+Here is a picutre:
+
+thread 1  thread 2  thread 3
+------
+enter RCU crit.sec.
+   |finish removal phase
+   |begin wait
+   |  |enter RCU crit.sec.
+exit RCU crit.sec |   |
+complete wait |
+begin reclamation phase   |
+   exit RCU crit.sec.
+
+
+Note how thread 3 is still executing its critical section when thread 2
+starts reclaiming data.  This is possible, because the old version of the
+data structure was not accessible at the time thread 3 began executing
+that critical section.
+
+
+RCU API
+===
+
+The core RCU API is small:
+
+ void rcu_read_lock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+entering an RCU read-side critical section.
+
+ void rcu_read_unlock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+exiting an RCU

[Qemu-devel] [PATCH 14/30] event loop: report RCU quiescent states

2013-06-28 Thread Paolo Bonzini

Threads that run event loops also have places that can sleep for an extended
time.  Place an extended quiescent state there.

Signed-off-by: Paolo Bonzini 
---
 aio-posix.c |  8 
 aio-win32.c | 11 ++-
 main-loop.c |  6 ++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/aio-posix.c b/aio-posix.c
index b68eccd..a57b276 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -17,6 +17,7 @@
 #include "block/block.h"
 #include "qemu/queue.h"
 #include "qemu/sockets.h"
+#include "qemu/rcu.h"
 
 struct AioHandler
 {
@@ -175,6 +176,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 int ret;
 bool busy, progress;
 
+rcu_quiescent_state();
 progress = false;
 
 /*
@@ -232,9 +234,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
 }
 
 /* wait until next event */
+if (blocking) {
+rcu_thread_offline();
+}
 ret = g_poll((GPollFD *)ctx->pollfds->data,
  ctx->pollfds->len,
  blocking ? -1 : 0);
+if (blocking) {
+rcu_thread_online();
+}
 
 /* if we have any readable fds, dispatch event */
 if (ret > 0) {
diff --git a/aio-win32.c b/aio-win32.c
index 38723bf..9652afb 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -19,6 +19,7 @@
 #include "block/block.h"
 #include "qemu/queue.h"
 #include "qemu/sockets.h"
+#include "qemu/rcu.h"
 
 struct AioHandler {
 EventNotifier *e;
@@ -99,6 +100,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 bool busy, progress;
 int count;
 
+rcu_quiescent_state();
 progress = false;
 
 /*
@@ -175,7 +177,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /* wait until next event */
 while (count > 0) {
 int timeout = blocking ? INFINITE : 0;
-int ret = WaitForMultipleObjects(count, events, FALSE, timeout);
+
+if (timeout) {
+rcu_thread_offline();
+}
+ret = WaitForMultipleObjects(count, events, FALSE, timeout);
+if (timeout) {
+rcu_thread_online();
+}
 
 /* if we have any signaled events, dispatch event */
 if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
diff --git a/main-loop.c b/main-loop.c
index a44fff6..166357f 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -28,6 +28,7 @@
 #include "slirp/libslirp.h"
 #include "qemu/main-loop.h"
 #include "block/aio.h"
+#include "qemu/rcu.h"
 
 #ifndef _WIN32
 
@@ -220,13 +221,16 @@ static int os_host_main_loop_wait(uint32_t timeout)
 if (timeout > 0) {
 spin_counter = 0;
 qemu_mutex_unlock_iothread();
+rcu_thread_offline();
 } else {
 spin_counter++;
+rcu_quiescent_state();
 }
 
 ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
 
 if (timeout > 0) {
+rcu_thread_online();
 qemu_mutex_lock_iothread();
 }
 
@@ -424,7 +428,9 @@ static int os_host_main_loop_wait(uint32_t timeout)
 }
 
 qemu_mutex_unlock_iothread();
+rcu_thread_offline();
 g_poll_ret = g_poll(poll_fds, n_poll_fds + w->num, poll_timeout);
+rcu_thread_online();
 qemu_mutex_lock_iothread();
 if (g_poll_ret > 0) {
 for (i = 0; i < w->num; i++) {
-- 
1.8.1.4

[Qemu-devel] [PATCH 09/30] qemu-thread: register threads with RCU

2013-06-28 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 docs/rcu.txt | 13 +++--
 util/qemu-thread-posix.c | 28 +++-
 util/qemu-thread-win32.c |  2 ++
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index 4869ec7..118a28a 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -122,8 +122,8 @@ on many POSIX systems other than Linux and Solaris).
 
 For this reason, QEMU's RCU implementation resorts to manual annotation
 of "quiescent states", i.e. points where no RCU read-side critical
-section can be active.  All threads that participate in the RCU mechanism
-need to annotate such points.
+section can be active.  All threads created with qemu_thread_create
+participate in the RCU mechanism and need to annotate such points.
 
 Marking quiescent states is done with the following three APIs:
 
@@ -144,8 +144,8 @@ Marking quiescent states is done with the following three 
APIs:
 thread.
 
 
-Furthermore, threads that participate in the RCU mechanism must communicate
-this fact using the following APIs:
+The following APIs can be used to use RCU in a thread that is not
+created with qemu_thread_create():
 
  void rcu_register_thread(void);
 
@@ -160,8 +160,9 @@ this fact using the following APIs:
 either manually or by using the QemuCond/QemuSemaphore/QemuEvent
 APIs.
 
-Note that these APIs are relatively heavyweight, and should _not_ be
-nested.
+Note that these APIs are relatively heavyweight, should _not_ be
+nested, and should not be called in threads that are created with
+qemu_thread_create().
 
 
 DIFFERENCES WITH LINUX
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 8178f9b..2df3382 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -26,6 +26,7 @@
 #endif
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
+#include "qemu/rcu.h"
 
 static void error_exit(int err, const char *msg)
 {
@@ -384,6 +385,26 @@ void qemu_event_wait(QemuEvent *ev)
 }
 
 
+typedef struct QemuThreadData {
+/* Passed to win32_start_routine.  */
+void *(*start_routine)(void *);
+void *arg;
+} QemuThreadData;
+
+static void *thread_start_routine(void *arg)
+{
+QemuThreadData *data = (QemuThreadData *) arg;
+void *(*start_routine)(void *) = data->start_routine;
+void *thread_arg = data->arg;
+void *ret;
+
+rcu_register_thread();
+g_free(data);
+ret = start_routine(thread_arg);
+rcu_unregister_thread();
+return ret;
+}
+
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
void *arg, int mode)
@@ -391,6 +412,11 @@ void qemu_thread_create(QemuThread *thread,
 sigset_t set, oldset;
 int err;
 pthread_attr_t attr;
+QemuThreadData *data;
+
+data = g_malloc(sizeof(*data));
+data->start_routine = start_routine;
+data->arg = arg;
 
 err = pthread_attr_init(&attr);
 if (err) {
@@ -406,7 +432,7 @@ void qemu_thread_create(QemuThread *thread,
 /* Leave signal handling to the iothread.  */
 sigfillset(&set);
 pthread_sigmask(SIG_SETMASK, &set, &oldset);
-err = pthread_create(&thread->thread, &attr, start_routine, arg);
+err = pthread_create(&thread->thread, &attr, thread_start_routine, data);
 if (err)
 error_exit(err, __func__);
 
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index de49f1e..18978be 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -295,6 +295,7 @@ static unsigned __stdcall win32_start_routine(void *arg)
 data = NULL;
 }
 qemu_thread_data = data;
+rcu_register_thread();
 qemu_thread_exit(start_routine(thread_arg));
 abort();
 }
@@ -310,6 +311,7 @@ void qemu_thread_exit(void *arg)
 data->exited = true;
 LeaveCriticalSection(&data->cs);
 }
+rcu_unregister_thread();
 _endthreadex(0);
 }
 
-- 
1.8.1.4

[Qemu-devel] [PATCH 01/30] memory: access FlatView from a local variable

2013-06-28 Thread Paolo Bonzini

We will soon require accesses to as->current_map to be placed under
a lock (with reference counting so as to keep the critical section
small).  To simplify this change, always fetch as->current_map into
a local variable and access it through that variable.

Signed-off-by: Paolo Bonzini 
---
 memory.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/memory.c b/memory.c
index 688c817..1f44cd1 100644
--- a/memory.c
+++ b/memory.c
@@ -578,13 +578,15 @@ static void address_space_add_del_ioeventfds(AddressSpace 
*as,
 
 static void address_space_update_ioeventfds(AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 unsigned ioeventfd_nb = 0;
 MemoryRegionIoeventfd *ioeventfds = NULL;
 AddrRange tmp;
 unsigned i;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
 tmp = addrrange_shift(fr->mr->ioeventfds[i].addr,
   int128_sub(fr->addr.start,
@@ -1142,7 +1144,8 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr)
 FlatRange *fr;
 
 QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+FlatView *view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
 }
@@ -1192,12 +1195,14 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
 
 static void memory_region_update_coalesced_range_as(MemoryRegion *mr, 
AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 CoalescedMemoryRange *cmr;
 AddrRange tmp;
 MemoryRegionSection section;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 section = (MemoryRegionSection) {
 .address_space = as,
@@ -1488,9 +1493,9 @@ static int cmp_flatrange_addr(const void *addr_, const 
void *fr_)
 return 0;
 }
 
-static FlatRange *address_space_lookup(AddressSpace *as, AddrRange addr)
+static FlatRange *flatview_lookup(FlatView *view, AddrRange addr)
 {
-return bsearch(&addr, as->current_map->ranges, as->current_map->nr,
+return bsearch(&addr, view->ranges, view->nr,
sizeof(FlatRange), cmp_flatrange_addr);
 }
 
@@ -1501,6 +1506,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 MemoryRegion *root;
 AddressSpace *as;
 AddrRange range;
+FlatView *view;
 FlatRange *fr;
 
 addr += mr->addr;
@@ -1511,13 +1517,14 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 
 as = memory_region_to_address_space(root);
 range = addrrange_make(int128_make64(addr), int128_make64(size));
-fr = address_space_lookup(as, range);
+
+view = as->current_map;
+fr = flatview_lookup(view, range);
 if (!fr) {
 return ret;
 }
 
-while (fr > as->current_map->ranges
-   && addrrange_intersects(fr[-1].addr, range)) {
+while (fr > view->ranges && addrrange_intersects(fr[-1].addr, range)) {
 --fr;
 }
 
@@ -1537,9 +1544,11 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 
 void address_space_sync_dirty_bitmap(AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
 }
 }
@@ -1559,6 +1568,7 @@ void memory_global_dirty_log_stop(void)
 static void listener_add_address_space(MemoryListener *listener,
AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 
 if (listener->address_space_filter
@@ -1572,7 +1582,8 @@ static void listener_add_address_space(MemoryListener 
*listener,
 }
 }
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 MemoryRegionSection section = {
 .mr = fr->mr,
 .address_space = as,
-- 
1.8.1.4

[Qemu-devel] [PATCH 05/30] exec: do not use qemu/tls.h

2013-06-28 Thread Paolo Bonzini

The next patch will change qemu/tls.h to support more platforms, but at
some performance cost.  Declare cpu_single_env directly instead of using
the tls.h abstractions.

Signed-off-by: Paolo Bonzini 
---
 exec.c | 10 --
 include/exec/cpu-all.h | 14 +++---
 include/qemu/tls.h | 52 --
 3 files changed, 19 insertions(+), 57 deletions(-)
 delete mode 100644 include/qemu/tls.h

diff --git a/exec.c b/exec.c
index d28403b..a788981 100644
--- a/exec.c
+++ b/exec.c
@@ -70,9 +70,15 @@ static MemoryRegion io_mem_unassigned;
 #endif
 
 CPUArchState *first_cpu;
+
 /* current CPU in the current thread. It is only valid inside
-   cpu_exec() */
-DEFINE_TLS(CPUArchState *,cpu_single_env);
+ * cpu_exec().  See comment in include/exec/cpu-all.h.  */
+#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined CONFIG_USE_NPTL)
+__thread CPUArchState *cpu_single_env;
+#else
+CPUArchState *cpu_single_env;
+#endif
+
 /* 0 = Do not count executed instructions.
1 = Precise instruction counting.
2 = Adaptive rate instruction counting.  */
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index e9c3717..2202ba3 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -20,7 +20,6 @@
 #define CPU_ALL_H
 
 #include "qemu-common.h"
-#include "qemu/tls.h"
 #include "exec/cpu-common.h"
 #include "qemu/thread.h"
 
@@ -368,8 +367,17 @@ void cpu_dump_statistics(CPUArchState *env, FILE *f, 
fprintf_function cpu_fprint
 void QEMU_NORETURN cpu_abort(CPUArchState *env, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 extern CPUArchState *first_cpu;
-DECLARE_TLS(CPUArchState *,cpu_single_env);
-#define cpu_single_env tls_var(cpu_single_env)
+
+/* This is thread-local depending on __linux__ because:
+ *  - the only -user mode supporting multiple VCPU threads is linux-user
+ *  - TCG system mode is single-threaded regarding VCPUs
+ *  - KVM system mode is multi-threaded but limited to Linux
+ */
+#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined CONFIG_USE_NPTL)
+extern __thread CPUArchState *cpu_single_env;
+#else
+extern CPUArchState *cpu_single_env;
+#endif
 
 /* Flags for use in ENV->INTERRUPT_PENDING.
 
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
deleted file mode 100644
index b92ea9d..000
--- a/include/qemu/tls.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- * Abstraction layer for defining and using TLS variables
- *
- * Copyright (c) 2011 Red Hat, Inc
- * Copyright (c) 2011 Linaro Limited
- *
- * Authors:
- *  Paolo Bonzini 
- *  Peter Maydell 
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; either version 2 of
- * the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, see .
- */
-
-#ifndef QEMU_TLS_H
-#define QEMU_TLS_H
-
-/* Per-thread variables. Note that we only have implementations
- * which are really thread-local on Linux; the dummy implementations
- * define plain global variables.
- *
- * This means that for the moment use should be restricted to
- * per-VCPU variables, which are OK because:
- *  - the only -user mode supporting multiple VCPU threads is linux-user
- *  - TCG system mode is single-threaded regarding VCPUs
- *  - KVM system mode is multi-threaded but limited to Linux
- *
- * TODO: proper implementations via Win32 .tls sections and
- * POSIX pthread_getspecific.
- */
-#ifdef __linux__
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
-#define tls_var(x)   tls__##x
-#else
-/* Dummy implementations which define plain global variables */
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
-#define tls_var(x)   tls__##x
-#endif
-
-#endif
-- 
1.8.1.4

[Qemu-devel] [PATCH 07/30] qemu-thread: add QemuEvent

2013-06-28 Thread Paolo Bonzini

This emulates Win32 manual-reset events using futexes or conditional
variables.  Typical ways to use them are with multi-producer,
single-consumer data structures, to test for a complex condition whose
elements come from different threads:

for (;;) {
qemu_event_reset(ev);
... test complex condition ...
if (condition is true) {
break;
}
qemu_event_wait(ev);
}

Or more efficiently (but with some duplication):

... evaluate condition ...
while (!condition) {
qemu_event_reset(ev);
... evaluate condition ...
if (!condition) {
qemu_event_wait(ev);
... evaluate condition ...
}
}

QemuEvent provides a very fast userspace path in the common case when
no other thread is waiting, or the event is not changing state.  It
is used to report RCU quiescent states to the thread calling
synchronize_rcu (the latter being the single consumer), and to report
call_rcu invocations to the thread that receives them.

Signed-off-by: Paolo Bonzini 
---
 include/qemu/thread-posix.h |   8 +++
 include/qemu/thread-win32.h |   4 ++
 include/qemu/thread.h   |   7 +++
 util/qemu-thread-posix.c| 116 
 util/qemu-thread-win32.c|  26 ++
 5 files changed, 161 insertions(+)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index 0f30dcc..916b2a7 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -21,6 +21,14 @@ struct QemuSemaphore {
 #endif
 };
 
+struct QemuEvent {
+#ifndef __linux__
+pthread_mutex_t lock;
+pthread_cond_t cond;
+#endif
+unsigned value;
+};
+
 struct QemuThread {
 pthread_t thread;
 };
diff --git a/include/qemu/thread-win32.h b/include/qemu/thread-win32.h
index 13adb95..3d58081 100644
--- a/include/qemu/thread-win32.h
+++ b/include/qemu/thread-win32.h
@@ -17,6 +17,10 @@ struct QemuSemaphore {
 HANDLE sema;
 };
 
+struct QemuEvent {
+HANDLE event;
+};
+
 typedef struct QemuThreadData QemuThreadData;
 struct QemuThread {
 QemuThreadData *data;
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index c02404b..3e32c65 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -7,6 +7,7 @@
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuCond QemuCond;
 typedef struct QemuSemaphore QemuSemaphore;
+typedef struct QemuEvent QemuEvent;
 typedef struct QemuThread QemuThread;
 
 #ifdef _WIN32
@@ -45,6 +46,12 @@ void qemu_sem_wait(QemuSemaphore *sem);
 int qemu_sem_timedwait(QemuSemaphore *sem, int ms);
 void qemu_sem_destroy(QemuSemaphore *sem);
 
+void qemu_event_init(QemuEvent *ev, bool init);
+void qemu_event_set(QemuEvent *ev);
+void qemu_event_reset(QemuEvent *ev);
+void qemu_event_wait(QemuEvent *ev);
+void qemu_event_destroy(QemuEvent *ev);
+
 void qemu_thread_create(QemuThread *thread,
 void *(*start_routine)(void *),
 void *arg, int mode);
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 4489abf..8178f9b 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -20,7 +20,12 @@
 #include 
 #include 
 #include 
+#ifdef __linux__
+#include 
+#include 
+#endif
 #include "qemu/thread.h"
+#include "qemu/atomic.h"
 
 static void error_exit(int err, const char *msg)
 {
@@ -268,6 +273,117 @@ void qemu_sem_wait(QemuSemaphore *sem)
 #endif
 }
 
+#ifdef __linux__
+#define futex(...)  syscall(__NR_futex, __VA_ARGS__)
+
+static inline void futex_wake(QemuEvent *ev, int n)
+{
+futex(ev, FUTEX_WAKE, n, NULL, NULL, 0);
+}
+
+static inline void futex_wait(QemuEvent *ev, unsigned val)
+{
+futex(ev, FUTEX_WAIT, (int) val, NULL, NULL, 0);
+}
+#else
+static inline void futex_wake(QemuEvent *ev, int n)
+{
+if (n == 1) {
+pthread_cond_signal(&ev->cond);
+} else {
+pthread_cond_broadcast(&ev->cond);
+}
+}
+
+static inline void futex_wait(QemuEvent *ev, unsigned val)
+{
+pthread_mutex_lock(&ev->lock);
+if (ev->value == val) {
+pthread_cond_wait(&ev->cond, &ev->lock);
+}
+pthread_mutex_unlock(&ev->lock);
+}
+#endif
+
+/* Valid transitions:
+ * - free->set, when setting the event
+ * - busy->set, when setting the event, followed by futex_wake
+ * - set->free, when resetting the event
+ * - free->busy, when waiting
+ *
+ * set->busy does not happen (it can be observed from the outside but
+ * it really is set->free->busy).
+ *
+ * busy->free provably cannot happen; to enforce it, the set->free transition
+ * is done with an OR, which becomes a no-op if the event has concurrently
+ * transitioned to free or busy.
+ */
+
+#define EV_SET 0
+#define EV_FREE1
+#define EV_BUSY   -1
+
+void qemu_event_init(QemuEvent *ev, bool init)
+{
+#ifndef __linux__
+pthread_mutex_init(&ev->lock, NULL);
+pthread_cond_init(&ev->cond, NULL);
+#endif
+
+ev->value = (init ? EV_SET : EV_FREE);
+}
+
+void qemu_

[Qemu-devel] [PULL 13/14] qemu-socket: don't leak opts on error

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 util/qemu-sockets.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 126cbb6..095716e 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -963,7 +963,7 @@ int socket_dgram(SocketAddress *remote, SocketAddress 
*local, Error **errp)
 
 default:
 error_setg(errp, "socket type unsupported for datagram");
-return -1;
+fd = -1;
 }
 qemu_opts_del(opts);
 return fd;
-- 
1.7.10.4

[Qemu-devel] [PULL 09/14] qemu-char: use ChardevBackendKind in CharDriver

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 include/sysemu/char.h |2 +-
 qemu-char.c   |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/sysemu/char.h b/include/sysemu/char.h
index 066c216..e65e4a4 100644
--- a/include/sysemu/char.h
+++ b/include/sysemu/char.h
@@ -281,7 +281,7 @@ CharDriverState *qemu_chr_find(const char *name);
 QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename);
 
 void register_char_driver(const char *name, CharDriverState *(*open)(QemuOpts 
*));
-void register_char_driver_qapi(const char *name, int kind,
+void register_char_driver_qapi(const char *name, ChardevBackendKind kind,
 void (*parse)(QemuOpts *opts, ChardevBackend *backend, Error **errp));
 
 /* add an eventfd to the qemu devices that are polled */
diff --git a/qemu-char.c b/qemu-char.c
index 0cda56c..90754e6 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3120,7 +3120,7 @@ typedef struct CharDriver {
 /* old, pre qapi */
 CharDriverState *(*open)(QemuOpts *opts);
 /* new, qapi-based */
-int kind;
+ChardevBackendKind kind;
 void (*parse)(QemuOpts *opts, ChardevBackend *backend, Error **errp);
 } CharDriver;
 
@@ -3137,7 +3137,7 @@ void register_char_driver(const char *name, 
CharDriverState *(*open)(QemuOpts *)
 backends = g_slist_append(backends, s);
 }
 
-void register_char_driver_qapi(const char *name, int kind,
+void register_char_driver_qapi(const char *name, ChardevBackendKind kind,
 void (*parse)(QemuOpts *opts, ChardevBackend *backend, Error **errp))
 {
 CharDriver *s;
-- 
1.7.10.4

[Qemu-devel] [PULL 06/14] qemu-char: print notification to stderr

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index e3b3224..371f630 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2666,8 +2666,8 @@ static CharDriverState *qemu_chr_open_socket_fd(int fd, 
bool do_nodelay,
 }
 
 if (is_listen && is_waitconnect) {
-printf("QEMU waiting for connection on: %s\n",
-   chr->filename);
+fprintf(stderr, "QEMU waiting for connection on: %s\n",
+chr->filename);
 tcp_chr_accept(s->listen_chan, G_IO_IN, chr);
 qemu_set_nonblock(s->listen_fd);
 }
-- 
1.7.10.4

[Qemu-devel] [PULL 08/14] qemu-char: don't leak opts on error

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-char.c b/qemu-char.c
index 371f630..0cda56c 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3178,7 +3178,7 @@ CharDriverState *qemu_chr_new_from_opts(QemuOpts *opts,
 if (i == NULL) {
 error_setg(errp, "chardev: backend \"%s\" not found",
qemu_opt_get(opts, "backend"));
-return NULL;
+goto err;
 }
 
 if (!cd->open) {
-- 
1.7.10.4

[Qemu-devel] [PULL 11/14] qemu-char: add -chardev mux support

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Allow to explicitly create mux chardevs on the command line,
like you can using QMP.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/qemu-char.c b/qemu-char.c
index 392de29..c097ca1 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3115,6 +3115,19 @@ static void qemu_chr_parse_memory(QemuOpts *opts, 
ChardevBackend *backend,
 }
 }
 
+static void qemu_chr_parse_mux(QemuOpts *opts, ChardevBackend *backend,
+   Error **errp)
+{
+const char *chardev = qemu_opt_get(opts, "chardev");
+
+if (chardev == NULL) {
+error_setg(errp, "chardev: mux: no chardev given");
+return;
+}
+backend->mux = g_new0(ChardevMux, 1);
+backend->mux->chardev = g_strdup(chardev);
+}
+
 typedef struct CharDriver {
 const char *name;
 /* old, pre qapi */
@@ -3481,6 +3494,9 @@ QemuOptsList qemu_chardev_opts = {
 },{
 .name = "size",
 .type = QEMU_OPT_SIZE,
+},{
+.name = "chardev",
+.type = QEMU_OPT_STRING,
 },
 { /* end of list */ }
 },
@@ -3771,6 +3787,8 @@ static void register_types(void)
 register_char_driver_qapi("console", CHARDEV_BACKEND_KIND_CONSOLE, NULL);
 register_char_driver_qapi("pipe", CHARDEV_BACKEND_KIND_PIPE,
   qemu_chr_parse_pipe);
+register_char_driver_qapi("mux", CHARDEV_BACKEND_KIND_MUX,
+  qemu_chr_parse_mux);
 }
 
 type_init(register_types);
-- 
1.7.10.4

[Qemu-devel] [PULL 10/14] qemu-char: minor mux chardev fixes

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

mux failure path has a memory leak.  creating a mux chardev can't
fail though, so just assert() that instead of fixing an error path
which never ever runs anyway ...

Also fix bid being leaked while being at it.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index 90754e6..392de29 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3186,7 +3186,7 @@ CharDriverState *qemu_chr_new_from_opts(QemuOpts *opts,
 ChardevBackend *backend = g_new0(ChardevBackend, 1);
 ChardevReturn *ret = NULL;
 const char *id = qemu_opts_id(opts);
-const char *bid = NULL;
+char *bid = NULL;
 
 if (qemu_opt_get_bool(opts, "mux", 0)) {
 bid = g_strdup_printf("%s-base", id);
@@ -3213,9 +3213,7 @@ CharDriverState *qemu_chr_new_from_opts(QemuOpts *opts,
 backend->kind = CHARDEV_BACKEND_KIND_MUX;
 backend->mux->chardev = g_strdup(bid);
 ret = qmp_chardev_add(id, backend, errp);
-if (error_is_set(errp)) {
-goto qapi_out;
-}
+assert(!error_is_set(errp));
 }
 
 chr = qemu_chr_find(id);
@@ -3223,6 +3221,7 @@ CharDriverState *qemu_chr_new_from_opts(QemuOpts *opts,
 qapi_out:
 qapi_free_ChardevBackend(backend);
 qapi_free_ChardevReturn(ret);
+g_free(bid);
 return chr;
 }
 
-- 
1.7.10.4

[Qemu-devel] [PULL 14/14] doc: we use seabios, not bochs bios

2013-06-28 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
---
 qemu-doc.texi |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index 8022890..185dd47 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -214,7 +214,7 @@ PCI UHCI USB controller and a virtual USB hub.
 
 SMP is supported with up to 255 CPUs.
 
-QEMU uses the PC BIOS from the Bochs project and the Plex86/Bochs LGPL
+QEMU uses the PC BIOS from the Seabios project and the Plex86/Bochs LGPL
 VGA BIOS.
 
 QEMU uses YM3812 emulation by Tatsuyuki Satoh.
-- 
1.7.10.4

[Qemu-devel] [PULL 07/14] qemu-char: fix documentation for telnet+wait socket flags

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qapi-schema.json |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 6590307..5c32528 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3288,10 +3288,11 @@
 # @addr: socket address to listen on (server=true)
 #or connect to (server=false)
 # @server: #optional create server socket (default: true)
-# @wait: #optional wait for connect (not used for server
-#sockets, default: false)
+# @wait: #optional wait for incoming connection on server
+#sockets (default: false).
 # @nodelay: #optional set TCP_NODELAY socket option (default: false)
-# @telnet: #optional enable telnet protocol (default: false)
+# @telnet: #optional enable telnet protocol on server
+#  sockets (default: false)
 #
 # Since: 1.4
 ##
-- 
1.7.10.4

[Qemu-devel] [PULL 12/14] qemu-char: report udp backend errors

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/qemu-char.c b/qemu-char.c
index c097ca1..5a8f9c0 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2255,6 +2255,8 @@ static CharDriverState *qemu_chr_open_udp(QemuOpts *opts)
 
 fd = inet_dgram_opts(opts, &local_err);
 if (fd < 0) {
+qerror_report_err(local_err);
+error_free(local_err);
 return NULL;
 }
 return qemu_chr_open_udp_fd(fd);
-- 
1.7.10.4

[Qemu-devel] [PULL 00/14] Trivial patches for 2013-06-28

2013-06-28 Thread Michael Tokarev

Hello.  There's a next week trivial-patches pull request.
Most of the changes this time come from Gerd, with his
single series of bugfixes for sockets and char code, plus
one trivial patch from me.

Please pull.

Thanks,

/mjt

The following changes since commit 36125631e79d53ffb9365740f43f386e2171d116:

  Merge remote-tracking branch 'kwolf/for-anthony' into staging (2013-06-28 
10:37:34 -0500)

are available in the git repository at:

  git://git.corpit.ru/qemu.git trivial-patches

for you to fetch changes up to a8ad4159ed7d8a442a9c049a6fd0d47aa330c2aa:

  doc: we use seabios, not bochs bios (2013-06-28 22:10:34 +0400)


Gerd Hoffmann (13):
  qemu-socket: zero-initialize SocketAddress
  qemu-socket: drop pointless allocation
  qemu-socket: catch monitor_get_fd failures
  qemu-char: check optional fields using has_*
  qemu-char: use more specific error_setg_* variants
  qemu-char: print notification to stderr
  qemu-char: fix documentation for telnet+wait socket flags
  qemu-char: don't leak opts on error
  qemu-char: use ChardevBackendKind in CharDriver
  qemu-char: minor mux chardev fixes
  qemu-char: add -chardev mux support
  qemu-char: report udp backend errors
  qemu-socket: don't leak opts on error

Michael Tokarev (1):
  doc: we use seabios, not bochs bios

 include/sysemu/char.h |2 +-
 qapi-schema.json  |7 ---
 qemu-char.c   |   45 -
 qemu-doc.texi |2 +-
 util/qemu-sockets.c   |9 -
 5 files changed, 42 insertions(+), 23 deletions(-)

[Qemu-devel] [PULL 05/14] qemu-char: use more specific error_setg_* variants

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index a538217..e3b3224 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2604,7 +2604,7 @@ static CharDriverState *qemu_chr_open_socket_fd(int fd, 
bool do_nodelay,
 
 memset(&ss, 0, ss_len);
 if (getsockname(fd, (struct sockaddr *) &ss, &ss_len) != 0) {
-error_setg(errp, "getsockname: %s", strerror(errno));
+error_setg_errno(errp, errno, "getsockname");
 return NULL;
 }
 
@@ -3529,7 +3529,7 @@ static int qmp_chardev_open_file_source(char *src, int 
flags,
 
 TFR(fd = qemu_open(src, flags, 0666));
 if (fd == -1) {
-error_setg(errp, "open %s: %s", src, strerror(errno));
+error_setg_file_open(errp, errno, src);
 }
 return fd;
 }
-- 
1.7.10.4

[Qemu-devel] [PULL 01/14] qemu-socket: zero-initialize SocketAddress

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 util/qemu-sockets.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 96eca2a..86fb09c 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -848,9 +848,9 @@ int unix_nonblocking_connect(const char *path,
 
 SocketAddress *socket_parse(const char *str, Error **errp)
 {
-SocketAddress *addr = NULL;
+SocketAddress *addr;
 
-addr = g_new(SocketAddress, 1);
+addr = g_new0(SocketAddress, 1);
 if (strstart(str, "unix:", NULL)) {
 if (str[5] == '\0') {
 error_setg(errp, "invalid Unix socket address");
-- 
1.7.10.4

[Qemu-devel] [PULL 04/14] qemu-char: check optional fields using has_*

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index a030e6b..a538217 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3493,7 +3493,7 @@ static CharDriverState *qmp_chardev_open_file(ChardevFile 
*file, Error **errp)
 {
 HANDLE out;
 
-if (file->in) {
+if (file->has_in) {
 error_setg(errp, "input file not supported");
 return NULL;
 }
@@ -3544,7 +3544,7 @@ static CharDriverState *qmp_chardev_open_file(ChardevFile 
*file, Error **errp)
 return NULL;
 }
 
-if (file->in) {
+if (file->has_in) {
 flags = O_RDONLY;
 in = qmp_chardev_open_file_source(file->in, flags, errp);
 if (error_is_set(errp)) {
-- 
1.7.10.4

[Qemu-devel] [PULL 02/14] qemu-socket: drop pointless allocation

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 util/qemu-sockets.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 86fb09c..35023a8 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -871,7 +871,6 @@ SocketAddress *socket_parse(const char *str, Error **errp)
 }
 } else {
 addr->kind = SOCKET_ADDRESS_KIND_INET;
-addr->inet = g_new(InetSocketAddress, 1);
 addr->inet = inet_parse(str, errp);
 if (addr->inet == NULL) {
 goto fail;
-- 
1.7.10.4

[Qemu-devel] [PULL 03/14] qemu-socket: catch monitor_get_fd failures

2013-06-28 Thread Michael Tokarev

From: Gerd Hoffmann 

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 util/qemu-sockets.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 35023a8..126cbb6 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -903,7 +903,7 @@ int socket_connect(SocketAddress *addr, Error **errp,
 
 case SOCKET_ADDRESS_KIND_FD:
 fd = monitor_get_fd(cur_mon, addr->fd->str, errp);
-if (callback) {
+if (fd >= 0 && callback) {
 qemu_set_nonblock(fd);
 callback(fd, opaque);
 }
-- 
1.7.10.4

Re: [Qemu-devel] [PATCH v4 07/10] qemu-ga: Add Windows VSS requester to quiesce applications and filesystems

2013-06-28 Thread Laszlo Ersek

On 06/06/13 17:06, Tomoki Sekiyama wrote:

> diff --git a/qga/vss-win32-requester.h b/qga/vss-win32-requester.h
> new file mode 100644
> index 000..f180f56
> --- /dev/null
> +++ b/qga/vss-win32-requester.h
> @@ -0,0 +1,31 @@
> +/*
> + * QEMU Guest Agent VSS Requester declarations
> + *
> + * Copyright Hitachi Data Systems Corp. 2013
> + *
> + * Authors:
> + *  Tomoki Sekiyama   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef VSS_WIN32_REQUESTER_H
> +#define VSS_WIN32_REQUESTER_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +HRESULT vss_init(void);

Can you include the mingw header that defines the HRESULT type? As far
as I know we like to make headers standalone.

(OTOH I vaguely recall a patch where the order between a mingw header
and a (generated?) trace header could cause a build error... I think it
would be worth trying still.)


> +void vss_deinit(void);
> +int vss_initialized(void);

Since this is qemu-ga / qemu code, I think a "bool" return type would be
more usual. (The current prototype is correct too of course.)


> +
> +void qga_vss_fsfreeze_freeze(int *nr_volume, struct Error **err);
> +void qga_vss_fsfreeze_thaw(int *nr_volume, struct Error **err);

Can you drop the "struct" word in these prototypes?


> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
>
>

> diff --git a/qga/vss-win32-requester.cpp b/qga/vss-win32-requester.cpp
> new file mode 100644
> index 000..7784926
> --- /dev/null
> +++ b/qga/vss-win32-requester.cpp
> @@ -0,0 +1,419 @@
> +/*
> + * QEMU Guest Agent win32 VSS Requester implementations
> + *
> + * Copyright Hitachi Data Systems Corp. 2013
> + *
> + * Authors:
> + *  Tomoki Sekiyama   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include 
> +#include 

Can you remove this #include and replace all assert() occurrences with
g_assert()?


> +extern "C" {
> +#include "guest-agent-core.h"
> +}
> +#include "vss-win32-requester.h"
> +#include "vss-win32-provider.h"
> +#include "vss-win32.h"
> +#include "inc/win2003/vswriter.h"
> +#include "inc/win2003/vsbackup.h"
> +
> +/* Functions in VSSAPI.DLL */
> +typedef HRESULT(STDAPICALLTYPE * t_CreateVssBackupComponents)(
> +OUT IVssBackupComponents**);
> +typedef void(APIENTRY * t_VssFreeSnapshotProperties)(IN VSS_SNAPSHOT_PROP*);
> +
> +static t_CreateVssBackupComponents _CreateVssBackupComponents;
> +static t_VssFreeSnapshotProperties _VssFreeSnapshotProperties;

I apologize in advance for splitting hairs, but :)

  17.4.3.1.2 Global names [lib.global.names]; p1

  Certain sets of names and function signatures are always reserved to
  the implementation:

  - Each name that contains a double underscore (__) or begins with an
underscore followed by an uppercase letter (2.11) is reserved to the
implementation for any use.

Unless there's a pressing reason, could you drop the leading
underscores?


> +static IVssBackupComponents *pVssbc;
> +static IVssAsync *pAsyncSnapshot;
> +static HMODULE hLib;
> +static HANDLE hEvent = INVALID_HANDLE_VALUE, hEvent2 = INVALID_HANDLE_VALUE;
> +static int cFrozenVols;

Can you decorate each of these static variables with a short comment? It
does get clear(er) what they are used for by reading the code, but the
comments would save others time.

Also I'd recommend grouping them differently:

- first group: long term objects related to VSSAPI.DLL and released
  *only* in vss_deinit(): hLib, _CreateVssBackupComponents,
  _VssFreeSnapshotProperties;

- second group: objects that make sense only in preparation for, during,
  and right after a freeze: pVssbc, pAsyncSnapshot, hEvent, hEvent2,
  cFrozenVols. (You could even introduce a struct for these, but that's
  just an idea.)


> +
> +GCC_FMT_ATTR(1, 2)
> +static void errmsg(const char *fmt, ...)
> +{
> +va_list ap;
> +va_start(ap, fmt);
> +char *msg = g_strdup_vprintf(fmt, ap);
> +va_end(ap);
> +MessageBox(NULL, msg, "Error in QEMU guest agent", MB_OK | 
> MB_ICONWARNING);
> +g_free(msg);
> +}

(Still splitting hairs, bear with me please:) can you rename this
function to errmsg_dialog() or something similar, for consistency with
06/10?


> +
> +static void error_set_win32(Error **errp, DWORD err,
> +ErrorClass eclass, const char *text)
> +{
> +char *msg = NULL, *nul = strchr(text, '(');
> +int len = nul ? nul - text : -1;
> +
> +/* print error message in native encoding */
> +FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
> +  FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
> +  NULL, err, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
> +  (char *)&msg, 0, NULL);
> +printf("%.*s. (Error: %lx) %s\n", len, text, err, msg);
> +LocalFree(msg);

I don't think you should p

Re: [Qemu-devel] [PULL v2 00/21] pci,kvm,misc enhancements

2013-06-28 Thread Anthony Liguori

Markus Armbruster  writes:

> "Michael S. Tsirkin"  writes:
>
>>   pvpanic: fix fwcfg for big endian hosts
>
> Umm, 1+10+9 is 20, but the pull is for 21 patches; what's going on here?

Funny :-)

Note that the series is missing 2/21.

The branch has 20 commits, I suspect Michael did git-format-patch to a
directory, deleted one of the files, and never bothered changing the N
of M.

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 0/2] Provide sextract32() and sextract64()

2013-06-28 Thread Richard Henderson

On 06/28/2013 04:40 AM, Peter Maydell wrote:
> Peter Maydell (2):
>   bitops: Provide sextract32() and sextract64()
>   tests: Add test-bitops.c with some sextract tests
> 
>  include/qemu/bitops.h |   50 +
>  tests/Makefile|2 ++
>  tests/test-bitops.c   |   75 
> +
>  3 files changed, 127 insertions(+)
>  create mode 100644 tests/test-bitops.c

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH] target-mips: fix mipsdsp_trunc16_sat16_round

2013-06-28 Thread Richard Henderson

On 06/27/2013 02:48 PM, Petar Jovanovic wrote:
>> This doesn't look right either, as it doesn't properly check for overflow of
>> negative values.
> 
> What overflow of negative values?
> Can you please list the values for which the result would not be correct?

Hmm, I suppose since we're always rounding to +INF, we can't
overflow in the negative direction.  The patch could use some
commentary along those lines...

r~

Re: [Qemu-devel] [PATCH v3 2/2] net: introduce command to query rx-filter information

2013-06-28 Thread Markus Armbruster

Eric Blake  writes:

> On 06/26/2013 08:15 AM, Markus Armbruster wrote:
>> Luiz Capitulino  writes:
>> 
>>> On Wed, 26 Jun 2013 14:00:30 +0200
>>> Markus Armbruster  wrote:
>>>
 Meh, I got confused and reviewed an out-of-date version.  Hope it's not
 entirely noise.
>>>
>>> I don't think it is. But this series got applied to Michael's tree, so
>>> if you want your comments addressed before applying to master (I think
>>> we do) then it's better to state it clearly.
>> 
>> Michael, please give Amos a chance to reply to my review.
>
> Looks like the pull request is already live but had issues; meanwhile, I
> also found an issue when reviewing the pull request:
> https://lists.gnu.org/archive/html/qemu-devel/2013-06/msg05127.html

Missed it, thanks!

Re: [Qemu-devel] [PATCH v2] linux-user: Move cpu_clone_regs() and cpu_set_tls() into linux-user

2013-06-28 Thread Richard Henderson

On 06/28/2013 06:22 AM, Peter Maydell wrote:
> The functions cpu_clone_regs() and cpu_set_tls() are not purely CPU
> related -- they are specific to the TLS ABI for a a particular OS.
> Move them into the linux-user/ tree where they belong.
> 
> target-lm32 had entirely unused implementations, since it has no
> linux-user target; just drop them.
> 
> Signed-off-by: Peter Maydell 
> ---
> Changes v1->v2: move the openrisc and i386 functions (accidentally
> omitted from v1), drop the unused lm32 functions.

Acked-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v2] cpu: Drop unnecessary dynamic casts in *_env_get_cpu()

2013-06-28 Thread Richard Henderson

On 06/28/2013 06:23 AM, Andreas Färber wrote:
> A transition from CPUFooState to FooCPU can be considered safe,
> just like FooCPU::env access in the opposite direction.
> The only benefit of the FOO_CPU() casts would be protection against
> bogus CPUFooState pointers, but then surrounding code would likely
> break, too.
> 
> This should slightly improve interrupt etc. performance.
> 
> Reported-by: Anthony Liguori 
> Signed-off-by: Andreas Färber 

Acked-by: Richard Henderson 


r~

Re: [Qemu-devel] [PULL v2 00/21] pci,kvm,misc enhancements

2013-06-28 Thread Markus Armbruster

Eric Blake  writes:

> On 06/25/2013 03:48 PM, Anthony Liguori wrote:
>> "Michael S. Tsirkin"  writes:
>> 
>>> From: Michael S. Tsirkin 
>>>
>>> The following changes since commit 90a2541b763b31d2b551b07e24aae3de5266d31b:
>>>
>>>   target-i386: fix over 80 chars warnings (2013-06-15 17:50:38 +)
>> 
>> Just as an aside, you should rebase and test before sending pull
>> requests.  This conflicts with current tip.  Normally, wouldn't be a
>> problem but FYI for the future.
>> 
>> There's a trivial conflict with ppc-softmmu.mak that I can resolve but
>> also a conflict with Stefano's pull request that I cannot resolve
>> without rebasing :-/  So please rebase against latest tip and resubmit.
>
> That, and there are review issues with 21/21 that should be resolved
> before it gets queued.
> https://lists.gnu.org/archive/html/qemu-devel/2013-06/msg04741.html

Yes.

1 2 3 >

1 - 100 of 291 matches

Mail list logo