date:20170907

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Dong Jia Shi

* Cornelia Huck  [2017-09-06 15:18:21 +0200]:

> On Tue,  5 Sep 2017 13:16:45 +0200
> Halil Pasic  wrote:
> 
> > Add a fake device meant for testing the correctness of our css emulation.
> > 
> > What we currently have is writing a Fibonacci sequence of uint32_t to the
> > device via ccw write. The write is going to fail if it ain't a Fibonacci
> > and indicate a device exception in scsw together with the proper residual
> > count.
> > 
> > Of course lot's of invalid inputs (besides basic data processing) can be
> > tested with that as well.
> > 
> > Usage:
> > 1) fire up a qemu with something like -device ccw-tester,devno=fe.0.0001
> >on the command line
> > 2) exercise the device from the guest
> > 
> > Signed-off-by: Halil Pasic 
> > ---
> > 
> > It may not make sense to merge this work in the current form, as it is
> > solely for test purposes.
> > ---
> >  hw/s390x/Makefile.objs |   1 +
> >  hw/s390x/ccw-tester.c  | 179 
> > +
> >  2 files changed, 180 insertions(+)
> >  create mode 100644 hw/s390x/ccw-tester.c
> 
> The main problem here is that you want to exercise a middle layer (the
> css code) and need to write boilerplate code on both host and guest
> side in order to be able to do so.
> 
> In general, a device that accepts arbitrary channel programs looks
> useful for testing purposes. I would split out processing of expected
> responses out, though, so that it can be more easily reused for
> different use cases.
> 
> (I dimly recall other test devices...)
> 
> For the guest tester: Can that be done via the qtest infrastructure
> somehow?
> 

I'm thinking of a method these days:
Could passing through an fully emulated ccw device (e.g. 3270), or a
virtio ccw device, in the level 1 kvm guest to a level 2 guest be a test
method for this?

All of the CCWs will be translated to IDAL CCWs by vfio-ccw in the level
1 guest (which is the level 2 kvm host) and issued to the level 1 kvm
host. So, those IDALs will eventually be handled by the emulated device,
or the virtio ccw device, on the level 1 kvm host...

Some days ago, one of my colleague tried the emulated 3270 passing
through. She stucked with the problem that the level 1 kvm host handling
a senseid IDAL ccw as a Direct ccw.

Maybe I could try to pass through a virtio ccw device. I don't think of
any obvious problem that would lead to fail. Any comment?

-- 
Dong Jia Shi

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Peter Xu

On Wed, Sep 06, 2017 at 04:14:37PM +0100, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
> > On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
> > > The root problem is that, monitor commands are all handled in main
> > > loop thread now, no matter how many monitors we specify. And, if main
> > > loop thread hangs due to some reason, all monitors will be stuck.
> > 
> > I see a larger issue with postcopy: existing QEMU code assumes that
> > guest memory access is instantaneous.
> > 
> > Postcopy breaks this assumption and introduces blocking points that can
> > now take unbounded time.
> > 
> > This problem isn't specific to the monitor.  It can also happen to other
> > components in QEMU like the gdbstub.
> > 
> > Do we need an asynchronous memory API?  Synchronous memory access should
> > only be allowed in vcpu threads.
> 
> It would probably be useful for gdbstub where the overhead of async
> doesn't matter;  but doing that for all IO emulation is hard.

Agreed.

IIUC one problem is that we should have code that cached the HVA for
specific GPA, then when it wants to write to that GPA, it directly
writes to corresponding HVA.  No memory API is used.  I am not sure
whether it's possible to convert all these usages into memory APIs
(even if it supports async operations).

-- 
Peter Xu

[Qemu-devel] [Bug 1715573] [NEW] Android-x86_64 guest - "Could not disable RealTimeClock events (20160831/evxfevnt-267)"; UI sluggish, ACPI doesn't work with QEMU 2.10.0

2017-09-07 Thread Marcelo "Marc" Ranolfi

Public bug reported:

I'm running a custom-built Android-x86_64 guest in an Arch Linux host
with the 4.12.10 kernel.

Following the latest Arch Linux upgrade to QEMU 2.10.0-1, upon booting
the virtual machine, I get the error mentioned in the title. Afterwards,
the virtual machine becomes slower and slower. The ACPI functions (the
libvirt's Shutdown button, for example) don't work.

When I downgrade to QEMU 2.9.0-3 everything works fine once again.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1715573

Title:
  Android-x86_64 guest - "Could not disable RealTimeClock events
  (20160831/evxfevnt-267)"; UI sluggish, ACPI doesn't work with QEMU
  2.10.0

Status in QEMU:
  New

Bug description:
  I'm running a custom-built Android-x86_64 guest in an Arch Linux host
  with the 4.12.10 kernel.

  Following the latest Arch Linux upgrade to QEMU 2.10.0-1, upon booting
  the virtual machine, I get the error mentioned in the title.
  Afterwards, the virtual machine becomes slower and slower. The ACPI
  functions (the libvirt's Shutdown button, for example) don't work.

  When I downgrade to QEMU 2.9.0-3 everything works fine once again.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1715573/+subscriptions

Re: [Qemu-devel] [PATCH v4 00/20] instrument: Add basic event instrumentation

2017-09-07 Thread Markus Armbruster

Lluís Vilanova  writes:

> This series adds an API to add instrumentation events.
>
> It also provides additional APIs for:
> * Controlling tracing events
> * Peek/poke guest memory
>
> There's still missing APIs for (can be added in later series?):
> * Provide something like tracing's per-vCPU trace states (i.e., so that each
>   vCPU can have different instrumentation code). It's still not clear to me if
>   we should extend the per-vCPU bitmap with instrumentation events, or 
> otherwise
>   somehow reuse the bits in tracing events (since they're currently limited).
> * Peek/poke guest registers
>
> The instrumentation code is dynamically loaded as a library into QEMU either
> when it starts or later using its remote control interfaces.
>
> Signed-off-by: Lluís Vilanova 

Doesn't apply to current master, and appears to have whitespace errors.
Please rebase and use scripts/checkpatch.pl to tidy up.

Re: [Qemu-devel] [PATCH v4 2/3] watchdog.h: Drop local redefinition of actions enum

2017-09-07 Thread Michal Privoznik

On 09/06/2017 07:25 PM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> On 09/06/2017 06:24 AM, Michal Privoznik wrote:
>>> We already have enum that enumerates all the action that a
>>
>> s/action/actions/
>>
>>> watchdog can take when hitting its timeout: WatchdogAction.
>>> Use that instead of inventing our own.
>>>
>>> Signed-off-by: Michal Privoznik 
>>> ---
>>
>>> @@ -77,27 +77,16 @@ int select_watchdog(const char *p)
>>>  
>>>  int select_watchdog_action(const char *p)
>>>  {
>>> -if (strcasecmp(p, "reset") == 0)
>>> -watchdog_action = WDT_RESET;
>>
>> The old code was case-insensitive,
>>
>>> +action = qapi_enum_parse(&WatchdogAction_lookup, p, -1, NULL);
>>
>> the new code is not.  Do we care?  (I don't, but we could be breaking
>> someone's control flow).  Should qapi_enum_parse be taught to be
>> case-insensitive?  Or perhaps we answer related questions first: Do we
>> have any QAPI enums that have values differing only in case? Do we
>> prevent such QAPI definitions, to give us the potential of making the
>> parsing insensitive?
> 
> Case-sensitive everywhere is fine.  Case-insensitive everywhere also
> fine, just not my personal preference.  What's not fine is "guess
> whether this part of the interface is case-sensitive or not".
> 
> QMP is case-sensitive.  Let's keep it that way.
> 
> The -watchdog-action option has a case-insensitive argument.  The
> obvious way to remain misfeature-^Wbackwards compatible is converting
> the argument to lower case before handing it off to qapi_enum_parse.  I
> doubt it matters, but just doing it is less work than debating how far
> exactly we want to bend over backwards.
> 
> g_ascii_strdown() should do.  It only converts ASCII characters, but
> anything else is going to fail in qapi_enum_parse() anyway.
> 

On the other hand, the documentation enumerates the accepted values in
lowercase. So one can argue that upper- or mixed-case is just a misuse
of a bug in the code. But getting the code in is more important to me so
I'll do the strdown() conversion and sent yet another version.

Michal

Re: [Qemu-devel] [PATCH 2/9] s390x: fix invalid use of cc 1 for SSCH

2017-09-07 Thread Dong Jia Shi

* Cornelia Huck  [2017-09-06 13:25:38 +0200]:

> On Wed, 6 Sep 2017 16:27:20 +0800
> Dong Jia Shi  wrote:
> 
> > * Halil Pasic  [2017-09-05 19:20:43 +0200]:
> > 
> > > 
> > > 
> > > On 09/05/2017 05:46 PM, Cornelia Huck wrote:  
> > > > On Tue, 5 Sep 2017 17:24:19 +0200
> > > > Halil Pasic  wrote:
> > > >   
> > > >> My problem with a program check (indicated by SCSW word 2 bit 10) is
> > > >> that, in my reading of the architecture, the semantic behind it is: The
> > > >> channel subsystem (not the cu or device) has detected, that the 
> > > >> the channel program (previously submitted as an ORB) is erroneous. 
> > > >> Which
> > > >> programs are erroneous is specified by the architecture. What we have
> > > >> here does not qualify.
> > > >>
> > > >> My idea was to rather blame the virtual hardware (device) and put no 
> > > >> blame
> > > >> on the program nor he channel subsystem. This could be done using 
> > > >> device
> > > >> status (unit check with command reject, maybe unit exception) or 
> > > >> interface
> > > >> check. My train of thought was, the problem is not consistent across a
> > > >> device type, so it has to be device specific.  
> > > > 
> > > > Unit exception might be a better way to express what is happening here.
> > > > At least, it moves us away from cc 1 and not towards cc 3 :)
> > > >   
> > > 
> > > I will do a follow up patch pursuing device exception.
> > >   
> > > >>
> > > >> Of course blaming the device could mislead the person encountering the
> > > >> problem, and make him believe it's an non-virtual hardware problem.
> > > >>
> > > >> About the misleading, I think the best we can do is log out a message
> > > >> indicating what really happened.  
> > > > 
> > > > Just document it in the code? If it doesn't happen with Linux as a
> > > > guest, it is highly unlikely to be seen in the wild.
> > > >   
> > > 
> > > 
> > > Well we have two problems here:
> > > 1) Unit exception can be already defined by the device type for the
> > > command (reference: 
> > > http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.6.10?DT=19920904110920).
> > > I think this one is what you mean. And I agree that's best handled
> > > with comment in code.  
> > Using unit check, with bit 3 byte 0 of the sense data set to 1, to
> > indicate an 'Equipment check', sounds a bit more proper than unit
> > exception.
> 
> I don't agree: Equipment check sounds a lot more dire (and seems to
> imply a malfunction). I like unit exception better.
Got the point. Fair enough!

-- 
Dong Jia Shi

[Qemu-devel] [PATCH v5 0/3] watchdog.h: Drop local redefinition of actions enum

2017-09-07 Thread Michal Privoznik

diff to v4:
- in 3/3 compare passed action string case insensitively to the qapi enum

Michal Privoznik (3):
  qapi: Rename WatchdogExpirationAction enum
  watchdog.h: Drop local redefinition of actions enum
  watchdog: Allow setting action on the fly

 hw/watchdog/watchdog.c| 65 ---
 hw/watchdog/wdt_diag288.c |  6 ++---
 include/sysemu/watchdog.h | 12 ++---
 monitor.c |  4 +--
 qapi-schema.json  |  9 +++
 qapi/run-state.json   |  6 ++---
 6 files changed, 52 insertions(+), 50 deletions(-)

-- 
2.13.5

[Qemu-devel] [PATCH v5 1/3] qapi: Rename WatchdogExpirationAction enum

2017-09-07 Thread Michal Privoznik

The new name is WatchdogAction which is shorter,

Signed-off-by: Michal Privoznik 
---
 hw/watchdog/watchdog.c | 14 +++---
 monitor.c  |  4 ++--
 qapi/run-state.json|  6 +++---
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/watchdog/watchdog.c b/hw/watchdog/watchdog.c
index 0c5c9cde1c..358d79804d 100644
--- a/hw/watchdog/watchdog.c
+++ b/hw/watchdog/watchdog.c
@@ -109,17 +109,17 @@ void watchdog_perform_action(void)
 {
 switch (watchdog_action) {
 case WDT_RESET: /* same as 'system_reset' in monitor */
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_RESET, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_RESET, &error_abort);
 qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
 break;
 
 case WDT_SHUTDOWN:  /* same as 'system_powerdown' in monitor */
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_SHUTDOWN, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_SHUTDOWN, &error_abort);
 qemu_system_powerdown_request();
 break;
 
 case WDT_POWEROFF:  /* same as 'quit' command in monitor */
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_POWEROFF, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_POWEROFF, &error_abort);
 exit(0);
 
 case WDT_PAUSE: /* same as 'stop' command in monitor */
@@ -127,21 +127,21 @@ void watchdog_perform_action(void)
  * you would get a deadlock.  Bypass the problem.
  */
 qemu_system_vmstop_request_prepare();
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_PAUSE, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_PAUSE, &error_abort);
 qemu_system_vmstop_request(RUN_STATE_WATCHDOG);
 break;
 
 case WDT_DEBUG:
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_DEBUG, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_DEBUG, &error_abort);
 fprintf(stderr, "watchdog: timer fired\n");
 break;
 
 case WDT_NONE:
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_NONE, 
&error_abort);
+qapi_event_send_watchdog(WATCHDOG_ACTION_NONE, &error_abort);
 break;
 
 case WDT_NMI:
-qapi_event_send_watchdog(WATCHDOG_EXPIRATION_ACTION_INJECT_NMI,
+qapi_event_send_watchdog(WATCHDOG_ACTION_INJECT_NMI,
  &error_abort);
 nmi_monitor_handle(0, NULL);
 break;
diff --git a/monitor.c b/monitor.c
index 9239f7adde..e6a6675c15 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3537,8 +3537,8 @@ void watchdog_action_completion(ReadLineState *rs, int 
nb_args, const char *str)
 return;
 }
 readline_set_completion_index(rs, strlen(str));
-for (i = 0; i < WATCHDOG_EXPIRATION_ACTION__MAX; i++) {
-add_completion_option(rs, str, WatchdogExpirationAction_str(i));
+for (i = 0; i < WATCHDOG_ACTION__MAX; i++) {
+add_completion_option(rs, str, WatchdogAction_str(i));
 }
 }
 
diff --git a/qapi/run-state.json b/qapi/run-state.json
index d36ff49834..bca46a8785 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json
@@ -253,10 +253,10 @@
 #
 ##
 { 'event': 'WATCHDOG',
-  'data': { 'action': 'WatchdogExpirationAction' } }
+  'data': { 'action': 'WatchdogAction' } }
 
 ##
-# @WatchdogExpirationAction:
+# @WatchdogAction:
 #
 # An enumeration of the actions taken when the watchdog device's timer is
 # expired
@@ -279,7 +279,7 @@
 #
 # Since: 2.1
 ##
-{ 'enum': 'WatchdogExpirationAction',
+{ 'enum': 'WatchdogAction',
   'data': [ 'reset', 'shutdown', 'poweroff', 'pause', 'debug', 'none',
 'inject-nmi' ] }
 
-- 
2.13.5

[Qemu-devel] [PATCH v5 2/3] watchdog.h: Drop local redefinition of actions enum

2017-09-07 Thread Michal Privoznik

We already have enum that enumerates all the actions that a
watchdog can take when hitting its timeout: WatchdogAction.
Use that instead of inventing our own.

Signed-off-by: Michal Privoznik 
---
 hw/watchdog/watchdog.c| 45 -
 hw/watchdog/wdt_diag288.c |  6 +++---
 include/sysemu/watchdog.h | 12 ++--
 3 files changed, 25 insertions(+), 38 deletions(-)

diff --git a/hw/watchdog/watchdog.c b/hw/watchdog/watchdog.c
index 358d79804d..0d3eeed187 100644
--- a/hw/watchdog/watchdog.c
+++ b/hw/watchdog/watchdog.c
@@ -30,7 +30,7 @@
 #include "hw/nmi.h"
 #include "qemu/help_option.h"
 
-static int watchdog_action = WDT_RESET;
+static WatchdogAction watchdog_action = WATCHDOG_ACTION_RESET;
 static QLIST_HEAD(watchdog_list, WatchdogTimerModel) watchdog_list;
 
 void watchdog_add_model(WatchdogTimerModel *model)
@@ -77,27 +77,19 @@ int select_watchdog(const char *p)
 
 int select_watchdog_action(const char *p)
 {
-if (strcasecmp(p, "reset") == 0)
-watchdog_action = WDT_RESET;
-else if (strcasecmp(p, "shutdown") == 0)
-watchdog_action = WDT_SHUTDOWN;
-else if (strcasecmp(p, "poweroff") == 0)
-watchdog_action = WDT_POWEROFF;
-else if (strcasecmp(p, "pause") == 0)
-watchdog_action = WDT_PAUSE;
-else if (strcasecmp(p, "debug") == 0)
-watchdog_action = WDT_DEBUG;
-else if (strcasecmp(p, "none") == 0)
-watchdog_action = WDT_NONE;
-else if (strcasecmp(p, "inject-nmi") == 0)
-watchdog_action = WDT_NMI;
-else
-return -1;
+int action;
+char *qapi_value;
 
+qapi_value = g_ascii_strdown(p, -1);
+action = qapi_enum_parse(&WatchdogAction_lookup, qapi_value, -1, NULL);
+g_free(qapi_value);
+if (action < 0)
+return -1;
+watchdog_action = action;
 return 0;
 }
 
-int get_watchdog_action(void)
+WatchdogAction get_watchdog_action(void)
 {
 return watchdog_action;
 }
@@ -108,21 +100,21 @@ int get_watchdog_action(void)
 void watchdog_perform_action(void)
 {
 switch (watchdog_action) {
-case WDT_RESET: /* same as 'system_reset' in monitor */
+case WATCHDOG_ACTION_RESET: /* same as 'system_reset' in monitor */
 qapi_event_send_watchdog(WATCHDOG_ACTION_RESET, &error_abort);
 qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
 break;
 
-case WDT_SHUTDOWN:  /* same as 'system_powerdown' in monitor */
+case WATCHDOG_ACTION_SHUTDOWN:  /* same as 'system_powerdown' in monitor */
 qapi_event_send_watchdog(WATCHDOG_ACTION_SHUTDOWN, &error_abort);
 qemu_system_powerdown_request();
 break;
 
-case WDT_POWEROFF:  /* same as 'quit' command in monitor */
+case WATCHDOG_ACTION_POWEROFF:  /* same as 'quit' command in monitor */
 qapi_event_send_watchdog(WATCHDOG_ACTION_POWEROFF, &error_abort);
 exit(0);
 
-case WDT_PAUSE: /* same as 'stop' command in monitor */
+case WATCHDOG_ACTION_PAUSE: /* same as 'stop' command in monitor */
 /* In a timer callback, when vm_stop calls qemu_clock_enable
  * you would get a deadlock.  Bypass the problem.
  */
@@ -131,19 +123,22 @@ void watchdog_perform_action(void)
 qemu_system_vmstop_request(RUN_STATE_WATCHDOG);
 break;
 
-case WDT_DEBUG:
+case WATCHDOG_ACTION_DEBUG:
 qapi_event_send_watchdog(WATCHDOG_ACTION_DEBUG, &error_abort);
 fprintf(stderr, "watchdog: timer fired\n");
 break;
 
-case WDT_NONE:
+case WATCHDOG_ACTION_NONE:
 qapi_event_send_watchdog(WATCHDOG_ACTION_NONE, &error_abort);
 break;
 
-case WDT_NMI:
+case WATCHDOG_ACTION_INJECT_NMI:
 qapi_event_send_watchdog(WATCHDOG_ACTION_INJECT_NMI,
  &error_abort);
 nmi_monitor_handle(0, NULL);
 break;
+
+default:
+assert(0);
 }
 }
diff --git a/hw/watchdog/wdt_diag288.c b/hw/watchdog/wdt_diag288.c
index 47f289216a..1475743527 100644
--- a/hw/watchdog/wdt_diag288.c
+++ b/hw/watchdog/wdt_diag288.c
@@ -57,9 +57,9 @@ static void diag288_timer_expired(void *dev)
  * the BQL; reset before triggering the action to avoid races with
  * diag288 instructions. */
 switch (get_watchdog_action()) {
-case WDT_DEBUG:
-case WDT_NONE:
-case WDT_PAUSE:
+case WATCHDOG_ACTION_DEBUG:
+case WATCHDOG_ACTION_NONE:
+case WATCHDOG_ACTION_PAUSE:
 break;
 default:
 wdt_diag288_reset(dev);
diff --git a/include/sysemu/watchdog.h b/include/sysemu/watchdog.h
index 72a4da07a6..677ace3945 100644
--- a/include/sysemu/watchdog.h
+++ b/include/sysemu/watchdog.h
@@ -23,15 +23,7 @@
 #define QEMU_WATCHDOG_H
 
 #include "qemu/queue.h"
-
-/* Possible values for action parameter. */
-#define WDT_RESET1  /* Hard reset. */
-#define WDT_SHUTDOWN 2  /* Shutdown. */
-#define WDT_POWEROFF 3  /* Quit. */
-#define WDT_PAUS

[Qemu-devel] [PATCH v5 3/3] watchdog: Allow setting action on the fly

2017-09-07 Thread Michal Privoznik

Currently, the only time that users can set watchdog action is at
the start as all we expose is this -watchdog-action command line
argument. This is suboptimal when users want to plug the device
later via monitor. Alternatively, they might want to change the
action for already existing device on the fly.

Inspired by: https://bugzilla.redhat.com/show_bug.cgi?id=1447169

Signed-off-by: Michal Privoznik 
---
 hw/watchdog/watchdog.c | 8 +++-
 qapi-schema.json   | 9 +
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/hw/watchdog/watchdog.c b/hw/watchdog/watchdog.c
index 0d3eeed187..670114ecfe 100644
--- a/hw/watchdog/watchdog.c
+++ b/hw/watchdog/watchdog.c
@@ -29,6 +29,7 @@
 #include "qapi-event.h"
 #include "hw/nmi.h"
 #include "qemu/help_option.h"
+#include "qmp-commands.h"
 
 static WatchdogAction watchdog_action = WATCHDOG_ACTION_RESET;
 static QLIST_HEAD(watchdog_list, WatchdogTimerModel) watchdog_list;
@@ -85,7 +86,7 @@ int select_watchdog_action(const char *p)
 g_free(qapi_value);
 if (action < 0)
 return -1;
-watchdog_action = action;
+qmp_watchdog_set_action(action, &error_abort);
 return 0;
 }
 
@@ -142,3 +143,8 @@ void watchdog_perform_action(void)
 assert(0);
 }
 }
+
+void qmp_watchdog_set_action(WatchdogAction action, Error **errp)
+{
+watchdog_action = action;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index f3af2cb851..f5db401838 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3143,3 +3143,12 @@
 # Since 2.9
 ##
 { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
+
+##
+# @watchdog-set-action:
+#
+# Set watchdog action
+#
+# Since 2.11
+##
+{ 'command': 'watchdog-set-action', 'data' : {'action': 'WatchdogAction'} }
-- 
2.13.5

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Cornelia Huck

On Thu, 7 Sep 2017 15:31:09 +0800
Dong Jia Shi  wrote:

> * Cornelia Huck  [2017-09-06 15:18:21 +0200]:
> 
> > On Tue,  5 Sep 2017 13:16:45 +0200
> > Halil Pasic  wrote:
> >   
> > > Add a fake device meant for testing the correctness of our css emulation.
> > > 
> > > What we currently have is writing a Fibonacci sequence of uint32_t to the
> > > device via ccw write. The write is going to fail if it ain't a Fibonacci
> > > and indicate a device exception in scsw together with the proper residual
> > > count.
> > > 
> > > Of course lot's of invalid inputs (besides basic data processing) can be
> > > tested with that as well.
> > > 
> > > Usage:
> > > 1) fire up a qemu with something like -device ccw-tester,devno=fe.0.0001
> > >on the command line
> > > 2) exercise the device from the guest
> > > 
> > > Signed-off-by: Halil Pasic 
> > > ---
> > > 
> > > It may not make sense to merge this work in the current form, as it is
> > > solely for test purposes.
> > > ---
> > >  hw/s390x/Makefile.objs |   1 +
> > >  hw/s390x/ccw-tester.c  | 179 
> > > +
> > >  2 files changed, 180 insertions(+)
> > >  create mode 100644 hw/s390x/ccw-tester.c  
> > 
> > The main problem here is that you want to exercise a middle layer (the
> > css code) and need to write boilerplate code on both host and guest
> > side in order to be able to do so.
> > 
> > In general, a device that accepts arbitrary channel programs looks
> > useful for testing purposes. I would split out processing of expected
> > responses out, though, so that it can be more easily reused for
> > different use cases.
> > 
> > (I dimly recall other test devices...)
> > 
> > For the guest tester: Can that be done via the qtest infrastructure
> > somehow?
> >   
> 
> I'm thinking of a method these days:
> Could passing through an fully emulated ccw device (e.g. 3270), or a
> virtio ccw device, in the level 1 kvm guest to a level 2 guest be a test
> method for this?
> 
> All of the CCWs will be translated to IDAL CCWs by vfio-ccw in the level
> 1 guest (which is the level 2 kvm host) and issued to the level 1 kvm
> host. So, those IDALs will eventually be handled by the emulated device,
> or the virtio ccw device, on the level 1 kvm host...
> 
> Some days ago, one of my colleague tried the emulated 3270 passing
> through. She stucked with the problem that the level 1 kvm host handling
> a senseid IDAL ccw as a Direct ccw.
> 
> Maybe I could try to pass through a virtio ccw device. I don't think of
> any obvious problem that would lead to fail. Any comment?
> 

That actually looks like a good thing to try! Cool idea.

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Cornelia Huck

On Wed, 6 Sep 2017 18:16:29 +0200
Halil Pasic  wrote:

> On 09/06/2017 05:20 PM, Cornelia Huck wrote:
> > On Wed, 6 Sep 2017 16:24:13 +0200
> > Halil Pasic  wrote:
> >   
> >> On 09/06/2017 03:18 PM, Cornelia Huck wrote:  
> >>> On Tue,  5 Sep 2017 13:16:45 +0200
> >>> Halil Pasic  wrote:
> >>> 
>  Add a fake device meant for testing the correctness of our css emulation.
> 
>  What we currently have is writing a Fibonacci sequence of uint32_t to the
>  device via ccw write. The write is going to fail if it ain't a Fibonacci
>  and indicate a device exception in scsw together with the proper residual
>  count.
> 
>  Of course lot's of invalid inputs (besides basic data processing) can be
>  tested with that as well.
> 
>  Usage:
>  1) fire up a qemu with something like -device ccw-tester,devno=fe.0.0001
> on the command line
>  2) exercise the device from the guest
> 
>  Signed-off-by: Halil Pasic 
>  ---
> 
>  It may not make sense to merge this work in the current form, as it is
>  solely for test purposes.
>  ---
>   hw/s390x/Makefile.objs |   1 +
>   hw/s390x/ccw-tester.c  | 179 
>  +
>   2 files changed, 180 insertions(+)
>   create mode 100644 hw/s390x/ccw-tester.c
> >>>
> >>> The main problem here is that you want to exercise a middle layer (the
> >>> css code) and need to write boilerplate code on both host and guest
> >>> side in order to be able to do so.
> >>> 
> >>
> >> Nod.
> >>  
> >>> In general, a device that accepts arbitrary channel programs looks
> >>> useful for testing purposes. I would split out processing of expected
> >>> responses out, though, so that it can be more easily reused for
> >>> different use cases.
> >>> 
> >>
> >> I'm not sure what do you mean here. Could you clarify please?  
> > 
> > Maybe doing things like logging received ccws and responses etc. and
> > and then comparing against an expected outcome. You should be able to
> > write test cases for different sets of things to be tested. I know this
> > is awfully vague :)  
> 
> Yes it does sound awfully vague :). In my case different tests are
> actually done in the kernel module implementing a driver for this
> device. I compare the expected outcome and the actual outcome there.
> This device is just a back-end lending itself to experimentation.

I think we want both guest-side and host-side checking. I think we
should aim for a guest-side checker that does not require a kernel
module (a simple guest makes it easier to run as an automated test.)

> >>> For the guest tester: Can that be done via the qtest infrastructure
> >>> somehow?
> >>> 
> >>
> >> Well, for now I have the out-of-tree Linux kernel module provided in the
> >> cover letter of the series (you did not comment on that one yet).  
> > 
> > Yes, I saw that, but have not yet looked at it. If there is a way to do
> > it without too many extra parts, I'd prefer that.
> >   
> 
> Well, I think the module is almost as economical with extra parts as it
> can get: it uses the kernel infrastructure and does not do a whole lot
> of things on top.

And that's also the problem: You drag the whole kernel infrastructure
along with it.

> 
> I think it's worth a look.

It certainly is, but you know how it is with resources...

> I hope it will also give answers to some
> of the implicit questions I see above. Yes, tests done in the driver
> are currently very few. Both with and without indirect data access
> we first let a device consume a Fibonacci sequence and verify that
> the IO has succeeded. Then we deliberately change an element in the
> sequence so it ain't the next Fibonacci number. And check for unit
> exception and proper residual count. With that we are sure that
> the data was properly consumed up until the given element. For IDA
> we the shorten the sequence so it does not contain the 'broken'
> element, and expect completion again. As you see this is a sufficient
> test for the good path for single CCW programs.
> 
> Extending to testing chaining (different concern), or responses
> to broken channel programs (e.g. ORB flags, bad addresses, tic
> rules) should be quite straightforward.

Yes, that all sounds very nice. We just disagree about the means :)

> 
> >>
> >> I think for building trust regarding my IDA implementation it should be
> >> able to do the job. Don't you agree?  
> > 
> > There's nothing wrong with your test. But we really should try to move
> > to some kind of unified testing that is hopefully self-contained so
> > that interested people can just play with it within the context of qemu.
> >   
> >>
> >> Just a couple of hours ago Janosch (cc-ing Janosch) came to my office,
> >> and told be that he is working on CCW-tests for zonk (a minimal kernel
> >> for testing -- internal tool AFAIR).
> >>
> >> By qtest you mean libqtest/libqos? I'm not familiar with that and

Re: [Qemu-devel] [PATCH RFC] accel: default to an actually available accelerator

2017-09-07 Thread Kevin Wolf

Am 06.09.2017 um 13:29 hat Cornelia Huck geschrieben:
> On Wed,  6 Sep 2017 11:49:27 +0200
> Cornelia Huck  wrote:
> 
> > configure_accelerator() falls back to tcg if no accelerator has
> > been specified. Formerly, we could be sure that tcg is always
> > available; however, with --disable-tcg, this is not longer true,
> > and you are not able to start qemu without explicitly specifying
> > another accelerator on those builds.
> > 
> > Instead, choose an accelerator in the order tcg->kvm->xen->hax.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> > 
> > RFC mainly because this breaks iotest 186 in a different way on a
> > tcg-less x86_64 build: Before, it fails with "-machine accel=tcg: No
> > accelerator found"; afterwards, there seems to be a difference in
> > output due to different autogenerated devices. Not sure how to handle
> > that.
> > 
> > cc:ing some hopefully interested folks (-ENOMAINTAINER again).
> > 
> > ---
> >  accel/accel.c  | 22 --
> >  arch_init.c| 17 +
> >  include/sysemu/arch_init.h |  2 ++
> >  qemu-options.hx|  6 --
> >  4 files changed, 43 insertions(+), 4 deletions(-)
> > 
> > diff --git a/accel/accel.c b/accel/accel.c
> > index 8ae40e1e13..26a3f32627 100644
> > --- a/accel/accel.c
> > +++ b/accel/accel.c
> > @@ -68,6 +68,25 @@ static int accel_init_machine(AccelClass *acc, 
> > MachineState *ms)
> >  return ret;
> >  }
> >  
> > +static const char *default_accelerator(void)
> > +{
> > +if (tcg_available()) {
> > +return "tcg";
> > +}
> > +if (kvm_available()) {
> > +return "kvm";
> > +}
> > +if (xen_available()) {
> > +return "xen";
> > +}
> > +if (hax_available()) {
> > +return "hax";
> > +}
> > +/* configure makes sure we have at least one accelerator */
> > +g_assert(false);
> > +return "";
> > +}
> > +
> >  void configure_accelerator(MachineState *ms)
> >  {
> >  const char *accel, *p;
> > @@ -79,8 +98,7 @@ void configure_accelerator(MachineState *ms)
> >  
> >  accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
> >  if (accel == NULL) {
> > -/* Use the default "accelerator", tcg */
> > -accel = "tcg";
> > +accel = default_accelerator();
> 
> It actually may be easier to just switch the default to
> "tcg:kvm:xen:hax". Haven't tested that, though.

This would have been my first thought and looks a bit simpler, so if
it works, I'd go for it.

But the real reason why I'm replying: Should we add changing the default
to "kvm:tcg" to the list of planned 3.0 changes? I am part of the group
that intentionally uses TCG occasionally, but I think the majority of
users wants to use KVM (or whatever the fastest option is on their
system) whenever it is available.

Kevin

[Qemu-devel] [PATCH 1/2] docker: Update ubuntu image

2017-09-07 Thread Fam Zheng

Base on the newer ubuntu-lts (16.06) and include more packages for
better build coverage.

Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/ubuntu.docker | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tests/docker/dockerfiles/ubuntu.docker 
b/tests/docker/dockerfiles/ubuntu.docker
index a360a050a2..d73ce02246 100644
--- a/tests/docker/dockerfiles/ubuntu.docker
+++ b/tests/docker/dockerfiles/ubuntu.docker
@@ -1,12 +1,17 @@
-FROM ubuntu:14.04
+FROM ubuntu:16.04
 RUN echo "deb http://archive.ubuntu.com/ubuntu/ trusty universe multiverse" >> 
\
 /etc/apt/sources.list
 RUN apt-get update
 ENV PACKAGES flex bison \
-libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev \
+libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev 
libncursesw5-dev \
 libseccomp-dev libgnutls-dev libssh2-1-dev  libspice-server-dev \
 libspice-protocol-dev libnss3-dev libfdt-dev \
-libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev \
+libgtk-3-dev libvte-2.91-dev libsdl1.2-dev libpng12-dev libpixman-1-dev \
+libvdeplug-dev liblzo2-dev libsnappy-dev libbz2-dev libxen-dev 
librdmacm-dev libibverbs-dev \
+libsasl2-dev libjpeg-turbo8-dev xfslibs-dev libcap-ng-dev libbrlapi-dev 
libcurl4-gnutls-dev \
+libbluetooth-dev librbd-dev libaio-dev glusterfs-common libnuma-dev 
libepoxy-dev libdrm-dev libgbm-dev \
+libjemalloc-dev libcacard-dev libusbredirhost-dev libnfs-dev libcap-dev 
libattr1-dev \
+texinfo \
 git make ccache python-yaml gcc clang sparse
 RUN apt-get -y install $PACKAGES
 RUN dpkg -l $PACKAGES | sort > /packages.txt
-- 
2.13.5

[Qemu-devel] [PATCH 2/2] docker: Enable features explicitly in test-full

2017-09-07 Thread Fam Zheng

Also avoid "set -e".

Signed-off-by: Fam Zheng 
---
 tests/docker/test-full | 80 ++
 1 file changed, 75 insertions(+), 5 deletions(-)

diff --git a/tests/docker/test-full b/tests/docker/test-full
index 05f0d491d1..bd095ad91b 100755
--- a/tests/docker/test-full
+++ b/tests/docker/test-full
@@ -1,8 +1,8 @@
-#!/bin/bash -e
+#!/bin/bash
 #
-# Compile all the targets.
+# Compile all the targets with as many features enabled as possible
 #
-# Copyright (c) 2016 Red Hat Inc.
+# Copyright 2016, 2017 Red Hat Inc.
 #
 # Authors:
 #  Fam Zheng 
@@ -15,5 +15,75 @@
 
 cd "$BUILD_DIR"
 
-build_qemu
-make check $MAKEFLAGS
+build_qemu \
+--enable-attr \
+--enable-bluez \
+--enable-brlapi \
+--enable-bsd-user \
+--enable-bzip2 \
+--enable-cap-ng \
+--enable-coroutine-pool \
+--enable-crypto-afalg \
+--enable-curl \
+--enable-curses \
+--enable-debug \
+--enable-debug-info \
+--enable-debug-tcg \
+--enable-docs \
+--enable-fdt \
+--enable-gcrypt \
+--enable-glusterfs \
+--enable-gnutls \
+--enable-gprof \
+--enable-gtk \
+--enable-guest-agent \
+--enable-jemalloc \
+--enable-kvm \
+--enable-libiscsi \
+--enable-libnfs \
+--enable-libssh2 \
+--enable-libusb \
+--enable-linux-aio \
+--enable-linux-user \
+--enable-live-block-migration \
+--enable-lzo \
+--enable-modules \
+--enable-numa \
+--enable-opengl \
+--enable-pie \
+--enable-profiler \
+--enable-qom-cast-debug \
+--enable-rbd \
+--enable-rdma \
+--enable-replication \
+--enable-sdl \
+--enable-seccomp \
+--enable-smartcard \
+--enable-snappy \
+--enable-spice \
+--enable-stack-protector \
+--enable-system \
+--enable-tcg \
+--enable-tcg-interpreter \
+--enable-tools \
+--enable-tpm \
+--enable-trace-backend=ftrace \
+--enable-usb-redir \
+--enable-user \
+--enable-vde \
+--enable-vhost-net \
+--enable-vhost-scsi \
+--enable-vhost-user \
+--enable-vhost-vsock \
+--enable-virtfs \
+--enable-vnc \
+--enable-vnc-jpeg \
+--enable-vnc-png \
+--enable-vnc-sasl \
+--enable-vte \
+--enable-werror \
+--enable-xen \
+--enable-xen-pci-passthrough \
+--enable-xen-pv-domain-build \
+--enable-xfsctl \
+&& make check $MAKEFLAGS
-- 
2.13.5

[Qemu-devel] [PATCH 0/2] docker: Update ubuntu and test-full for more coverage

2017-09-07 Thread Fam Zheng



Fam Zheng (2):
  docker: Update ubuntu image
  docker: Enable features explicitly in test-full

 tests/docker/dockerfiles/ubuntu.docker | 11 +++--
 tests/docker/test-full | 80 +++---
 2 files changed, 83 insertions(+), 8 deletions(-)

-- 
2.13.5

Re: [Qemu-devel] [PATCH RFC] accel: default to an actually available accelerator

2017-09-07 Thread Thomas Huth

On 07.09.2017 10:11, Kevin Wolf wrote:
[...]
> But the real reason why I'm replying: Should we add changing the default
> to "kvm:tcg" to the list of planned 3.0 changes? I am part of the group
> that intentionally uses TCG occasionally, but I think the majority of
> users wants to use KVM (or whatever the fastest option is on their
> system) whenever it is available.

If you consider how often people are getting this wrong (they want to
use KVM but end up with TCG in the first try), I think that's a good idea.

Maybe we should start a Wiki page where we collect ideas for QEMU 3.0?

 Thomas

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Peter Xu

On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > This does imply that you need a separate monitor I/O processing, from the
> > > command execution thread, but I see no need for all commands to suddenly
> > > become async. Just allowing interleaved replies is sufficient from the
> > > POV of the protocol definition. This interleaving is easy to handle from
> > > the client POV - just requires a unique 'serial' in the request by the
> > > client, that is copied into the reply by QEMU.
> > 
> > OK, so for that we can just take Marc-André's syntax and call it 'id':
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > 
> > then it's upto the caller to ensure those id's are unique.
> 
> Libvirt has in fact generated a unique 'id' for every monitor command
> since day 1 of supporting QMP.
> 
> > I do worry about two things:
> >   a) With this the caller doesn't really know which commands could be
> >   in parallel - for example if we've got a recovery command that's
> >   executed by this non-locking thread that's OK, we expect that
> >   to be doable in parallel.  If in the future though we do
> >   what you initially suggested and have a bunch of commands get
> >   routed to the migration thread (say) then those would suddenly
> >   operate in parallel with other commands that we're previously
> >   synchronous.
> 
> We could still have an opt-in for async commands. eg default to executing
> all commands in the main thread, unless the client issues an explicit
> "make it async" command, to switch to allowing the migration thread to
> process it async.
> 
>  { "execute": "qmp_allow_async",
>"data": { "commands": [
>"migrate_cancel",
>] } }
> 
> 
>  { "return": { "commands": [
>"migrate_cancel",
>] } }
> 
> The server response contains the subset of commands from the request
> for which async is supported.
> 
> That gives good negotiation ability going forward as we incrementally
> support async on more commands.

I think this goes back to the discussion on which design we'd like to
choose.  IMHO the whole async idea plus the per-command-id is indeed
cleaner and nicer, and I believe that can benefit not only libvirt,
but also other QMP users.  The problem is, I have no idea how long
it'll take to let us have such a feature - I believe that will include
QEMU and Libvirt to both support that.  And it'll be a pity if the
postcopy recovery cannot work only because we cannot guarantee a
stable monitor.

I'm curious whether there are other requirements (besides postcopy
recovery) that would want an always-alive monitor to run some
lock-free commands?  If there is, I'd be more inclined to first
provide a work-around solution like "-qmp-lockfree", and we can
provide a better solution afterwards until when the whole async QMP
work ready.

> 
> >   b) I still worry how the various IO channels will behave on another
> >   thread.  But that's more a general feeling rather than anything
> >   specific.
> 
> The only complexity will be around making sure the Chardev code uses
> the right GMainContext for any watches on the underlying QIOChannel,
> so that we poll() from the custom thread instead of the main thread.
> IOW, as long as all I/O is done from the single thread everything
> should work fine.
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

-- 
Peter Xu

Re: [Qemu-devel] [PULL 0/5] Vga 20170901 patches

2017-09-07 Thread Gerd Hoffmann

  Hi,

I'v been reviewing your patches and took some notes but didn't
finished, 
> since there is something bothering me.
> I see they haven't been reviewed yet, can you hold this PR and wait
> the 
> weekend?

Ping.  Care to share your concerns?  Weekend is long over ...

cheers,
  Gerd

Re: [Qemu-devel] [PATCH RFC] accel: default to an actually available accelerator

2017-09-07 Thread Cornelia Huck

On Thu, 7 Sep 2017 10:14:27 +0200
Thomas Huth  wrote:

> On 07.09.2017 10:11, Kevin Wolf wrote:
> [...]
> > But the real reason why I'm replying: Should we add changing the default
> > to "kvm:tcg" to the list of planned 3.0 changes? I am part of the group
> > that intentionally uses TCG occasionally, but I think the majority of
> > users wants to use KVM (or whatever the fastest option is on their
> > system) whenever it is available.  
> 
> If you consider how often people are getting this wrong (they want to
> use KVM but end up with TCG in the first try), I think that's a good idea.

Agreed.

I'm wondering what that means for our tests, though. Some of them work
slightly different under tcg or kvm (cf. iotest 186, as referenced in
the original mail), and sometimes you'll probably explicitly want to
exercise tcg. That does not speak against the change, but we probably
need to look at what we want in more detail.

> Maybe we should start a Wiki page where we collect ideas for QEMU 3.0?

Also a good idea.

[Do we already have any idea about the timeframe for 3.0?]

Re: [Qemu-devel] [Qemu-trivial] [PATCH] filter-mirror: segfault when specifying non existent device

2017-09-07 Thread Eduardo Otubo

On Tue, Aug 22, 2017 at 09:19:20AM +0800, Zhang Chen wrote:
> 
> 
> On 08/21/2017 11:50 PM, Eduardo Otubo wrote:
> > When using filter-mirror like the example below where the interface
> > 'ndev0' does not exist on the host, QEMU crashes into segmentation
> > fault.
> > 
> >   $ qemu-system-x86_64 -S -machine pc -netdev user,id=ndev0 -object 
> > filter-mirror,id=test-object,netdev=ndev0
> > 
> > This happens because the function filter_mirror_setup() does not checks
> > if the device actually exists and still keep on processing calling
> > qemu_chr_find(). This patch fixes this issue.
> > 
> > Signed-off-by: Eduardo Otubo 
> 
> Looks good for me.
> 
> Reviewed-by: Zhang Chen

Ping.

> 
> Thanks
> Zhang Chen
> 
> > ---
> >   net/filter-mirror.c | 14 +++---
> >   1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/filter-mirror.c b/net/filter-mirror.c
> > index 90e2c92337..e18a4b16a0 100644
> > --- a/net/filter-mirror.c
> > +++ b/net/filter-mirror.c
> > @@ -213,14 +213,22 @@ static void filter_mirror_setup(NetFilterState *nf, 
> > Error **errp)
> >   MirrorState *s = FILTER_MIRROR(nf);
> >   Chardev *chr;
> > +if (s->outdev == NULL) {
> > +goto err;
> > +}
> > +
> >   chr = qemu_chr_find(s->outdev);
> > +
> >   if (chr == NULL) {
> > -error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
> > -  "Device '%s' not found", s->outdev);
> > -return;
> > +goto err;
> >   }
> >   qemu_chr_fe_init(&s->chr_out, chr, errp);
> > +
> > +err:
> > +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, "Device '%s' not found",
> > +  nf->netdev_id);
> > +return;
> >   }
> >   static void redirector_rs_finalize(SocketReadState *rs)
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
> 

-- 
Eduardo Otubo
Senior Software Engineer @ RedHat

[Qemu-devel] [PATCH 1/9] buildsys: Move gtk/vte cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 3 +--
 ui/Makefile.objs | 4 
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index b37cd54bda..e27a4bd683 100755
--- a/configure
+++ b/configure
@@ -2437,7 +2437,6 @@ if test "$gtk" != "no"; then
 gtk_cflags="$gtk_cflags $x11_cflags"
 gtk_libs="$gtk_libs $x11_libs"
 fi
-libs_softmmu="$gtk_libs $libs_softmmu"
 gtk="yes"
 elif test "$gtk" = "yes"; then
 feature_not_found "gtk" "Install gtk3-devel"
@@ -2678,7 +2677,6 @@ if test "$vte" != "no"; then
 vte_cflags=$($pkg_config --cflags $vtepackage)
 vte_libs=$($pkg_config --libs $vtepackage)
 vteversion=$($pkg_config --modversion $vtepackage)
-libs_softmmu="$vte_libs $libs_softmmu"
 vte="yes"
 elif test "$vte" = "yes"; then
 if test "$gtkabi" = "3.0"; then
@@ -5750,6 +5748,7 @@ fi
 if test "$vte" = "yes" ; then
   echo "CONFIG_VTE=y" >> $config_host_mak
   echo "VTE_CFLAGS=$vte_cflags" >> $config_host_mak
+  echo "VTE_LIBS=$vte_libs" >> $config_host_mak
 fi
 if test "$virglrenderer" = "yes" ; then
   echo "CONFIG_VIRGL=y" >> $config_host_mak
diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index 3369451285..146a8ce062 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -45,6 +45,10 @@ gtk.o-cflags := $(GTK_CFLAGS) $(VTE_CFLAGS)
 gtk-egl.o-cflags := $(GTK_CFLAGS) $(VTE_CFLAGS)
 gtk-gl-area.o-cflags := $(GTK_CFLAGS) $(VTE_CFLAGS)
 
+gtk.o-libs := $(GTK_LIBS) $(VTE_LIBS)
+gtk-egl.o-libs := $(GTK_LIBS) $(VTE_LIBS)
+gtk-gl-area.o-libs := $(GTK_LIBS) $(VTE_LIBS)
+
 gtk-egl.o-libs += $(OPENGL_LIBS)
 shader.o-libs += $(OPENGL_LIBS)
 console-gl.o-libs += $(OPENGL_LIBS)
-- 
2.13.5

[Qemu-devel] [PATCH 0/9] buildsys: Move ui/usb library cflags/libs to per object

2017-09-07 Thread Fam Zheng

Hi Gerd,

This is part two of the QEMU_CFLAGS and libs_softmmu cleanup series, including
ui/, audio/ and hw/usb/. Please review.

Fam Zheng (9):
  buildsys: Move gtk/vte cflags/libs to per object
  buildsys: Move sdl cflags/libs to per object
  buildsys: Move vnc cflags/libs to per object
  buildsys: Move audio libs to per object
  buildsys: Move curese cflags/libs to per object
  buildsys: Move opengl cflags to per object
  buildsys: Move libcacard cflags/libs to per object
  buildsys: Move libusb cflags/libs to per object
  buildsys: Move usb redir cflags/libs to per object

 audio/Makefile.objs  |  6 ++
 configure| 54 ++--
 hw/usb/Makefile.objs | 13 +++--
 ui/Makefile.objs | 37 ++-
 4 files changed, 72 insertions(+), 38 deletions(-)

-- 
2.13.5

[Qemu-devel] [PATCH 8/9] buildsys: Move libusb cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 4 ++--
 hw/usb/Makefile.objs | 5 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 85fef3d16f..bee1302155 100755
--- a/configure
+++ b/configure
@@ -4252,8 +4252,6 @@ if test "$libusb" != "no" ; then
 libusb="yes"
 libusb_cflags=$($pkg_config --cflags libusb-1.0)
 libusb_libs=$($pkg_config --libs libusb-1.0)
-QEMU_CFLAGS="$QEMU_CFLAGS $libusb_cflags"
-libs_softmmu="$libs_softmmu $libusb_libs"
 else
 if test "$libusb" = "yes"; then
 feature_not_found "libusb" "Install libusb devel >= 1.0.13"
@@ -5835,6 +5833,8 @@ fi
 
 if test "$libusb" = "yes" ; then
   echo "CONFIG_USB_LIBUSB=y" >> $config_host_mak
+  echo "LIBUSB_CFLAGS=$libusb_cflags" >> $config_host_mak
+  echo "LIBUSB_LIBS=$libusb_libs" >> $config_host_mak
 fi
 
 if test "$usb_redir" = "yes" ; then
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index 795ff25a5e..9ce9eadb47 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -42,6 +42,11 @@ common-obj-$(CONFIG_USB_REDIR) += redirect.o quirks.o
 # usb pass-through
 common-obj-y += $(patsubst %,host-%.o,$(HOST_USB))
 
+host-libusb.o-cflags := $(LIBUSB_CFLAGS)
+host-libusb.o-libs := $(LIBUSB_LIBS)
+
 ifeq ($(CONFIG_USB_LIBUSB),y)
 common-obj-$(CONFIG_XEN) += xen-usb.o
+xen-usb.o-cflags := $(LIBUSB_CFLAGS)
+xen-usb.o-libs := $(LIBUSB_LIBS)
 endif
-- 
2.13.5

[Qemu-devel] [PATCH 2/9] buildsys: Move sdl cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 audio/Makefile.objs | 1 +
 configure   | 2 +-
 ui/Makefile.objs| 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/audio/Makefile.objs b/audio/Makefile.objs
index 481d1aa30e..c20695668f 100644
--- a/audio/Makefile.objs
+++ b/audio/Makefile.objs
@@ -11,3 +11,4 @@ common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
 common-obj-y += wavcapture.o
 
 sdlaudio.o-cflags := $(SDL_CFLAGS)
+sdlaudio.o-libs := $(SDL_LIBS)
diff --git a/configure b/configure
index e27a4bd683..812a1ce615 100755
--- a/configure
+++ b/configure
@@ -2794,7 +2794,6 @@ EOF
 sdl_cflags="$sdl_cflags $x11_cflags"
 sdl_libs="$sdl_libs $x11_libs"
   fi
-  libs_softmmu="$sdl_libs $libs_softmmu"
 fi
 
 ##
@@ -5593,6 +5592,7 @@ if test "$sdl" = "yes" ; then
   echo "CONFIG_SDL=y" >> $config_host_mak
   echo "CONFIG_SDLABI=$sdlabi" >> $config_host_mak
   echo "SDL_CFLAGS=$sdl_cflags" >> $config_host_mak
+  echo "SDL_LIBS=$sdl_libs" >> $config_host_mak
 fi
 if test "$cocoa" = "yes" ; then
   echo "CONFIG_COCOA=y" >> $config_host_mak
diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index 146a8ce062..b3e29e21f0 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -27,6 +27,7 @@ sdl.mo-objs += sdl2-gl.o
 endif
 endif
 sdl.mo-cflags := $(SDL_CFLAGS)
+sdl.mo-libs := $(SDL_LIBS)
 
 ifeq ($(CONFIG_OPENGL),y)
 common-obj-y += shader.o
-- 
2.13.5

[Qemu-devel] [PATCH 3/9] buildsys: Move vnc cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 14 --
 ui/Makefile.objs | 21 -
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/configure b/configure
index 812a1ce615..9b0eb91a9b 100755
--- a/configure
+++ b/configure
@@ -2834,8 +2834,8 @@ EOF
   vnc_sasl_libs="-lsasl2"
   if compile_prog "$vnc_sasl_cflags" "$vnc_sasl_libs" ; then
 vnc_sasl=yes
-libs_softmmu="$vnc_sasl_libs $libs_softmmu"
-QEMU_CFLAGS="$QEMU_CFLAGS $vnc_sasl_cflags"
+vnc_libs="$vnc_libs $vnc_sasl_libs"
+vnc_cflags="$vnc_cflags $vnc_sasl_cflags"
   else
 if test "$vnc_sasl" = "yes" ; then
   feature_not_found "vnc-sasl" "Install Cyrus SASL devel"
@@ -2856,8 +2856,8 @@ EOF
 vnc_jpeg_libs="-ljpeg"
   if compile_prog "$vnc_jpeg_cflags" "$vnc_jpeg_libs" ; then
 vnc_jpeg=yes
-libs_softmmu="$vnc_jpeg_libs $libs_softmmu"
-QEMU_CFLAGS="$QEMU_CFLAGS $vnc_jpeg_cflags"
+vnc_libs="$vnc_libs $vnc_jpeg_libs"
+vnc_cflags="$vnc_cflags $vnc_jpeg_cflags"
   else
 if test "$vnc_jpeg" = "yes" ; then
   feature_not_found "vnc-jpeg" "Install libjpeg-turbo devel"
@@ -2888,8 +2888,8 @@ EOF
   fi
   if compile_prog "$vnc_png_cflags" "$vnc_png_libs" ; then
 vnc_png=yes
-libs_softmmu="$vnc_png_libs $libs_softmmu"
-QEMU_CFLAGS="$QEMU_CFLAGS $vnc_png_cflags"
+vnc_libs="$vnc_libs $vnc_png_libs"
+vnc_cflags="$vnc_cflags $vnc_png_cflags"
   else
 if test "$vnc_png" = "yes" ; then
   feature_not_found "vnc-png" "Install libpng devel"
@@ -5558,6 +5558,8 @@ echo "CONFIG_BDRV_RW_WHITELIST=$block_drv_rw_whitelist" 
>> $config_host_mak
 echo "CONFIG_BDRV_RO_WHITELIST=$block_drv_ro_whitelist" >> $config_host_mak
 if test "$vnc" = "yes" ; then
   echo "CONFIG_VNC=y" >> $config_host_mak
+  echo "VNC_CFLAGS=$vnc_cflags" >> $config_host_mak
+  echo "VNC_LIBS=$vnc_libs" >> $config_host_mak
 fi
 if test "$vnc_sasl" = "yes" ; then
   echo "CONFIG_VNC_SASL=y" >> $config_host_mak
diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index b3e29e21f0..a0b3c15a28 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -1,11 +1,14 @@
-vnc-obj-y += vnc.o
-vnc-obj-y += vnc-enc-zlib.o vnc-enc-hextile.o
-vnc-obj-y += vnc-enc-tight.o vnc-palette.o
-vnc-obj-y += vnc-enc-zrle.o
-vnc-obj-y += vnc-auth-vencrypt.o
-vnc-obj-$(CONFIG_VNC_SASL) += vnc-auth-sasl.o
-vnc-obj-y += vnc-ws.o
-vnc-obj-y += vnc-jobs.o
+vnc.mo-objs += vnc.o
+vnc.mo-objs += vnc-enc-zlib.o vnc-enc-hextile.o
+vnc.mo-objs += vnc-enc-tight.o vnc-palette.o
+vnc.mo-objs += vnc-enc-zrle.o
+vnc.mo-objs += vnc-auth-vencrypt.o
+vnc.mo-objs += $(if $(CONFIG_VNC_SASL), vnc-auth-sasl.o)
+vnc.mo-objs += vnc-ws.o
+vnc.mo-objs += vnc-jobs.o
+
+vnc.mo-cflags := $(VNC_CFLAGS)
+vnc.mo-libs := $(VNC_LIBS)
 
 common-obj-y += keymaps.o console.o cursor.o qemu-pixman.o
 common-obj-y += input.o input-keymap.o input-legacy.o
@@ -14,7 +17,7 @@ common-obj-$(CONFIG_SPICE) += spice-core.o spice-input.o 
spice-display.o
 common-obj-$(CONFIG_SDL) += sdl.mo x_keymap.o
 common-obj-$(CONFIG_COCOA) += cocoa.o
 common-obj-$(CONFIG_CURSES) += curses.o
-common-obj-$(CONFIG_VNC) += $(vnc-obj-y)
+common-obj-$(CONFIG_VNC) += vnc.mo
 common-obj-$(CONFIG_GTK) += gtk.o x_keymap.o
 
 ifeq ($(CONFIG_SDLABI),1.2)
-- 
2.13.5

[Qemu-devel] [PATCH 4/9] buildsys: Move audio libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 audio/Makefile.objs |  5 +
 configure   | 15 ++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/audio/Makefile.objs b/audio/Makefile.objs
index c20695668f..8a5ede6e2b 100644
--- a/audio/Makefile.objs
+++ b/audio/Makefile.objs
@@ -12,3 +12,8 @@ common-obj-y += wavcapture.o
 
 sdlaudio.o-cflags := $(SDL_CFLAGS)
 sdlaudio.o-libs := $(SDL_LIBS)
+alsaaudio.o-libs := $(ALSA_LIBS)
+paaudio.o-libs := $(PULSE_LIBS)
+coreaudio.o-libs := $(COREAUDIO_LIBS)
+dsoundaudio.o-libs := $(DSOUND_LIBS)
+ossaudio.o-libs := $(OSS_LIBS)
diff --git a/configure b/configure
index 9b0eb91a9b..71fa9717ff 100755
--- a/configure
+++ b/configure
@@ -3040,13 +3040,13 @@ for drv in $audio_drv_list; do
 alsa)
 audio_drv_probe $drv alsa/asoundlib.h -lasound \
 "return snd_pcm_close((snd_pcm_t *)0);"
-libs_softmmu="-lasound $libs_softmmu"
+alsa_libs="-lasound"
 ;;
 
 pa)
 audio_drv_probe $drv pulse/pulseaudio.h "-lpulse" \
 "pa_context_set_source_output_volume(NULL, 0, NULL, NULL, NULL); 
return 0;"
-libs_softmmu="-lpulse $libs_softmmu"
+pulse_libs="-lpulse"
 audio_pt_int="yes"
 ;;
 
@@ -3057,16 +3057,16 @@ for drv in $audio_drv_list; do
 ;;
 
 coreaudio)
-  libs_softmmu="-framework CoreAudio $libs_softmmu"
+  coreaudio_libs="-framework CoreAudio"
 ;;
 
 dsound)
-  libs_softmmu="-lole32 -ldxguid $libs_softmmu"
+  dsound_libs="-lole32 -ldxguid"
   audio_win_int="yes"
 ;;
 
 oss)
-  libs_softmmu="$oss_lib $libs_softmmu"
+  oss_libs="$oss_lib"
 ;;
 
 wav)
@@ -5548,6 +5548,11 @@ for drv in $audio_drv_list; do
 def=CONFIG_$(echo $drv | LC_ALL=C tr '[a-z]' '[A-Z]')
 echo "$def=y" >> $config_host_mak
 done
+echo "ALSA_LIBS=$alsa_libs" >> $config_host_mak
+echo "PULSE_LIBS=$pulse_libs" >> $config_host_mak
+echo "COREAUDIO_LIBS=$coreaudio_libs" >> $config_host_mak
+echo "DSOUND_LIBS=$dsound_libs" >> $config_host_mak
+echo "OSS_LIBS=$oss_libs" >> $config_host_mak
 if test "$audio_pt_int" = "yes" ; then
   echo "CONFIG_AUDIO_PT_INT=y" >> $config_host_mak
 fi
-- 
2.13.5

[Qemu-devel] [PATCH 6/9] buildsys: Move opengl cflags to per object

2017-09-07 Thread Fam Zheng

Opengl libs are already moved to per object, do so for the cflags as
well.

Signed-off-by: Fam Zheng 
---
 configure| 2 +-
 ui/Makefile.objs | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index b08cb4ce02..b1320b1994 100755
--- a/configure
+++ b/configure
@@ -3623,7 +3623,6 @@ if test "$opengl" != "no" ; then
 if test "$gtk" = "yes" && $pkg_config --exists "$gtkpackage >= 3.16"; then
 gtk_gl="yes"
 fi
-QEMU_CFLAGS="$QEMU_CFLAGS $opengl_cflags"
   else
 if test "$opengl" = "yes" ; then
   feature_not_found "opengl" "Please install opengl (mesa) devel pkgs: 
$opengl_pkgs"
@@ -5844,6 +5843,7 @@ fi
 
 if test "$opengl" = "yes" ; then
   echo "CONFIG_OPENGL=y" >> $config_host_mak
+  echo "OPENGL_CFLAGS=$opengl_cflags" >> $config_host_mak
   echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak
   if test "$opengl_dmabuf" = "yes" ; then
 echo "CONFIG_OPENGL_DMABUF=y" >> $config_host_mak
diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index e3403b698b..95943d9c33 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -53,10 +53,10 @@ gtk.o-libs := $(GTK_LIBS) $(VTE_LIBS)
 gtk-egl.o-libs := $(GTK_LIBS) $(VTE_LIBS)
 gtk-gl-area.o-libs := $(GTK_LIBS) $(VTE_LIBS)
 
-gtk-egl.o-libs += $(OPENGL_LIBS)
-shader.o-libs += $(OPENGL_LIBS)
-console-gl.o-libs += $(OPENGL_LIBS)
-egl-helpers.o-libs += $(OPENGL_LIBS)
+$(foreach x, gtk-egl shader console egl-helpers, \
+   $(eval $x.o-libs += $(OPENGL_LIBS)) \
+   $(eval $x.o-cflags += $(OPENGL_CFLAGS)) \
+)
 
 curses.o-cflags := $(CURSES_CFLAGS)
 curses.o-libs := $(CURSES_LIBS)
-- 
2.13.5

[Qemu-devel] [PATCH 5/9] buildsys: Move curese cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 6 --
 ui/Makefile.objs | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 71fa9717ff..b08cb4ce02 100755
--- a/configure
+++ b/configure
@@ -3137,8 +3137,8 @@ EOF
   unset IFS
   if compile_prog "$curses_inc" "$curses_lib" ; then
 curses_found=yes
-QEMU_CFLAGS="$curses_inc $QEMU_CFLAGS"
-libs_softmmu="$curses_lib $libs_softmmu"
+curses_cflags="$curses_inc $curses_cflags"
+curses_libs="$curses_lib $curses_libs"
 break
   fi
 done
@@ -5606,6 +5606,8 @@ if test "$cocoa" = "yes" ; then
 fi
 if test "$curses" = "yes" ; then
   echo "CONFIG_CURSES=y" >> $config_host_mak
+  echo "CURSES_CFLAGS=$curses_cflags" >> $config_host_mak
+  echo "CURSES_LIBS=$curses_libs" >> $config_host_mak
 fi
 if test "$pipe2" = "yes" ; then
   echo "CONFIG_PIPE2=y" >> $config_host_mak
diff --git a/ui/Makefile.objs b/ui/Makefile.objs
index a0b3c15a28..e3403b698b 100644
--- a/ui/Makefile.objs
+++ b/ui/Makefile.objs
@@ -57,3 +57,6 @@ gtk-egl.o-libs += $(OPENGL_LIBS)
 shader.o-libs += $(OPENGL_LIBS)
 console-gl.o-libs += $(OPENGL_LIBS)
 egl-helpers.o-libs += $(OPENGL_LIBS)
+
+curses.o-cflags := $(CURSES_CFLAGS)
+curses.o-libs := $(CURSES_LIBS)
-- 
2.13.5

[Qemu-devel] [PATCH 7/9] buildsys: Move libcacard cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 4 ++--
 hw/usb/Makefile.objs | 6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index b1320b1994..85fef3d16f 100755
--- a/configure
+++ b/configure
@@ -4237,8 +4237,6 @@ if test "$smartcard" != "no"; then
 if $pkg_config libcacard; then
 libcacard_cflags=$($pkg_config --cflags libcacard)
 libcacard_libs=$($pkg_config --libs libcacard)
-QEMU_CFLAGS="$QEMU_CFLAGS $libcacard_cflags"
-libs_softmmu="$libs_softmmu $libcacard_libs"
 smartcard="yes"
 else
 if test "$smartcard" = "yes"; then
@@ -5831,6 +5829,8 @@ fi
 
 if test "$smartcard" = "yes" ; then
   echo "CONFIG_SMARTCARD=y" >> $config_host_mak
+  echo "SMARTCARD_CFLAGS=$libcacard_cflags" >> $config_host_mak
+  echo "SMARTCARD_LIBS=$libcacard_libs" >> $config_host_mak
 fi
 
 if test "$libusb" = "yes" ; then
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index 97f1c4561a..795ff25a5e 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -26,8 +26,10 @@ common-obj-$(CONFIG_USB_BLUETOOTH)+= dev-bluetooth.o
 
 ifeq ($(CONFIG_USB_SMARTCARD),y)
 common-obj-y  += dev-smartcard-reader.o
-common-obj-$(CONFIG_SMARTCARD)+= ccid-card-passthru.o
-common-obj-$(CONFIG_SMARTCARD)+= ccid-card-emulated.o
+common-obj-$(CONFIG_SMARTCARD)+= smartcard.mo
+smartcard.mo-objs := ccid-card-passthru.o ccid-card-emulated.o
+smartcard.mo-cflags := $(SMARTCARD_CFLAGS)
+smartcard.mo-libs := $(SMARTCARD_LIBS)
 endif
 
 ifeq ($(CONFIG_POSIX),y)
-- 
2.13.5

[Qemu-devel] [PATCH 0/2] buildsys: Move vde cflags/libs to per object

2017-09-07 Thread Fam Zheng

Hi Jason,

this is the net/ part of the conversion of library cflags/libs from global
QEMU_CFLAGS and libs_softmmu to per object variables. Please review.

Fam Zheng (2):
  vl: Don't include vde header
  buildsys: Move vde libs to per object

 configure | 3 +--
 net/Makefile.objs | 2 ++
 vl.c  | 4 
 3 files changed, 3 insertions(+), 6 deletions(-)

-- 
2.13.5

[Qemu-devel] [PATCH 9/9] buildsys: Move usb redir cflags/libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure| 4 ++--
 hw/usb/Makefile.objs | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index bee1302155..763a65f156 100755
--- a/configure
+++ b/configure
@@ -4266,8 +4266,6 @@ if test "$usb_redir" != "no" ; then
 usb_redir="yes"
 usb_redir_cflags=$($pkg_config --cflags libusbredirparser-0.5)
 usb_redir_libs=$($pkg_config --libs libusbredirparser-0.5)
-QEMU_CFLAGS="$QEMU_CFLAGS $usb_redir_cflags"
-libs_softmmu="$libs_softmmu $usb_redir_libs"
 else
 if test "$usb_redir" = "yes"; then
 feature_not_found "usb-redir" "Install usbredir devel"
@@ -5839,6 +5837,8 @@ fi
 
 if test "$usb_redir" = "yes" ; then
   echo "CONFIG_USB_REDIR=y" >> $config_host_mak
+  echo "USB_REDIR_CFLAGS=$usb_redir_cflags" >> $config_host_mak
+  echo "USB_REDIR_LIBS=$usb_redir_libs" >> $config_host_mak
 fi
 
 if test "$opengl" = "yes" ; then
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index 9ce9eadb47..92eb199697 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -38,6 +38,8 @@ endif
 
 # usb redirection
 common-obj-$(CONFIG_USB_REDIR) += redirect.o quirks.o
+redirect.o-cflags = $(USB_REDIR_CFLAGS)
+redirect.o-libs = $(USB_REDIR_LIBS)
 
 # usb pass-through
 common-obj-y += $(patsubst %,host-%.o,$(HOST_USB))
-- 
2.13.5

Re: [Qemu-devel] [Qemu-trivial] [PATCH] dma/i82374: avoid double creation of i82374 device

2017-09-07 Thread Eduardo Otubo

On Fri, Sep 01, 2017 at 12:44:30PM -0300, Eduardo Habkost wrote:
> On Fri, Sep 01, 2017 at 05:34:34PM +0200, Markus Armbruster wrote:
> > Eduardo Habkost  writes:
> > 
> > > i82374 is compiled in only on ppc and sh4, so I'm CCing the
> > > maintainers for those architectures.
> > >
> > > On Fri, Sep 01, 2017 at 01:03:32PM +0200, Eduardo Otubo wrote:
> > >> When used with the following command line:
> > >> 
> > >>  ./ppc64-softmmu/qemu-system-ppc64 -S -machine 40p,accel=tcg -device 
> > >> i82374
> > >> 
> > >> QEMU with machine type 40p already creates the device i82374. If
> > >> specified in the command line, it will try to create it again, hence
> > >> generating the error.

Well pointed, forgot to describe the actual error. I may inlcude for
the next version of the patch. For for correctness sake, here it is:

 ./ppc64-softmmu/qemu-system-ppc64 -S -machine 40p,accel=tcg -device i82374
 qemu-system-ppc64: hw/isa/isa-bus.c:110: isa_bus_dma: Assertion `!bus->dma[0] 
&& !bus->dma[1]' failed.
 Aborted (core dumped)

> > >
> > > Which error?
> > >
> > >
> > >>   One way to avoid this problem is to set
> > >> user_creatable=false.
> > >> 
> > >> Signed-off-by: Eduardo Otubo 
> > >
> > > The patch does more than just avoiding double creation: it
> > > prevents usage of "-device i82374" completely.
> > >
> > > Maybe nobody needs it to work with -device today (would the
> > > device even work?) and it is OK to set user_creatable=false until
> > > we fix the crash.  But we need to be sure of that.
> > >
> > >> ---
> > >>  hw/dma/i82374.c | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >> 
> > >> diff --git a/hw/dma/i82374.c b/hw/dma/i82374.c
> > >> index 6c0f975df0..5275d822e0 100644
> > >> --- a/hw/dma/i82374.c
> > >> +++ b/hw/dma/i82374.c
> > >> @@ -139,6 +139,7 @@ static void i82374_class_init(ObjectClass *klass, 
> > >> void *data)
> > >>  dc->realize = i82374_realize;
> > >>  dc->vmsd = &vmstate_i82374;
> > >>  dc->props = i82374_properties;
> > >> +dc->user_creatable = false;
> > >
> > > A "Reason:" comment explaining why user_creatable=false is
> > > mandatory.  See the comment above user_creatable declaration in
> > > qdev-core.h for reference.
> > >
> > > I suggest the following:
> > >
> > > /*
> > >  * Reason: i82374_realize() crashes (assertion failure inside 
> > > isa_bus_dma()
> > >  * if the device is instantiated twice.
> > >  */

I agree with the comment above. If there's nothing left to fix/add
I'll just send a v2 for this shortly.

> > 
> > We need to find out *why* it crashes.  Once we know, we can likely write
> > a better comment.
> 
> It crashes because isa_bus_dma() isn't supposed to be called
> twice for the same bus.
> 
> Making isa_bus_dma()/DMA_init()/i82374_realize() return an error
> instead of asserting would be even better than setting
> user_creatable=false.
> 
> -- 
> Eduardo
> 

-- 
Eduardo Otubo
Senior Software Engineer @ RedHat

[Qemu-devel] [PATCH 2/2] buildsys: Move vde libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure | 3 +--
 net/Makefile.objs | 2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index fb7e34a901..bc09487668 100755
--- a/configure
+++ b/configure
@@ -2963,8 +2963,6 @@ int main(void)
 EOF
   if compile_prog "" "$vde_libs" ; then
 vde=yes
-libs_softmmu="$vde_libs $libs_softmmu"
-libs_tools="$vde_libs $libs_tools"
   else
 if test "$vde" = "yes" ; then
   feature_not_found "vde" "Install vde (Virtual Distributed Ethernet) 
devel"
@@ -5545,6 +5543,7 @@ if test "$slirp" = "yes" ; then
 fi
 if test "$vde" = "yes" ; then
   echo "CONFIG_VDE=y" >> $config_host_mak
+  echo "VDE_LIBS=$vde_libs" >> $config_host_mak
 fi
 if test "$netmap" = "yes" ; then
   echo "CONFIG_NETMAP=y" >> $config_host_mak
diff --git a/net/Makefile.objs b/net/Makefile.objs
index 67ba5e26fb..64adf32f40 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -21,3 +21,5 @@ tap-obj-$(CONFIG_SOLARIS) = tap-solaris.o
 tap-obj-y ?= tap-stub.o
 common-obj-$(CONFIG_POSIX) += tap.o $(tap-obj-y)
 common-obj-$(CONFIG_WIN32) += tap-win32.o
+
+vde.o-libs = $(VDE_LIBS)
-- 
2.13.5

[Qemu-devel] [PATCH 1/2] vl: Don't include vde header

2017-09-07 Thread Fam Zheng

Nothing in vl.c uses anything from the vde package, do remove the
unnecessary include.

Signed-off-by: Fam Zheng 
---
 vl.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/vl.c b/vl.c
index e75757f977..293cd91ec9 100644
--- a/vl.c
+++ b/vl.c
@@ -31,10 +31,6 @@
 #include "sysemu/seccomp.h"
 #endif
 
-#if defined(CONFIG_VDE)
-#include 
-#endif
-
 #ifdef CONFIG_SDL
 #if defined(__APPLE__) || defined(main)
 #include 
-- 
2.13.5

Re: [Qemu-devel] [Qemu-trivial] [PATCH] dma/i82374: avoid double creation of i82374 device

2017-09-07 Thread Eduardo Otubo

On Sat, Sep 02, 2017 at 11:15:20AM +0200, Aurelien Jarno wrote:
> On 2017-09-01 11:30, Eduardo Habkost wrote:
> > i82374 is compiled in only on ppc and sh4, so I'm CCing the
> > maintainers for those architectures.
> 
> The i82374 device is not useful nor usable on SH4. It has just been
> added in commit 85d3846a39 to be able to run the tests.
> 
> Aurelien
> 

Any word from the ppc guys?

-- 
Eduardo Otubo
Senior Software Engineer @ RedHat

[Qemu-devel] [PATCH] buildsys: Move rdma libs to per object

2017-09-07 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 configure   | 2 +-
 migration/Makefile.objs | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index fb7e34a901..470b8a085b 100755
--- a/configure
+++ b/configure
@@ -2818,7 +2818,6 @@ EOF
   rdma_libs="-lrdmacm -libverbs"
   if compile_prog "" "$rdma_libs" ; then
 rdma="yes"
-libs_softmmu="$libs_softmmu $rdma_libs"
   else
 if test "$rdma" = "yes" ; then
 error_exit \
@@ -6035,6 +6034,7 @@ echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
 
 if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
+  echo "RDMA_LIBS=$rdma_libs" >> $config_host_mak
 fi
 
 if test "$have_rtnetlink" = "yes" ; then
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 1c7770da46..99e038024d 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -11,3 +11,4 @@ common-obj-$(CONFIG_RDMA) += rdma.o
 
 common-obj-$(CONFIG_LIVE_BLOCK_MIGRATION) += block.o
 
+rdma.o-libs := $(RDMA_LIBS)
-- 
2.13.5

Re: [Qemu-devel] [PATCH v4 07/20] instrument: [qapi] Add library loader

2017-09-07 Thread Markus Armbruster

I missed prior versions of this series.  Please cc: qapi-schema
maintainers on all non-trivial schema patches.
scripts/get_maintainer.pl points to them for this patch.

Marc-André, semantic conflict with your QAPI conditionals series.  Just
a heads-up, there's nothing for you to do about it right now.

Lluís Vilanova  writes:

> Signed-off-by: Lluís Vilanova 
> ---
>  instrument/Makefile.objs |1 
>  instrument/load.h|4 ++
>  instrument/qmp.c |   88 ++
>  qapi-schema.json |3 +
>  qapi/instrument.json |   96 
> ++
>  5 files changed, 192 insertions(+)
>  create mode 100644 instrument/qmp.c
>  create mode 100644 qapi/instrument.json

Missing: update of MAINTAINERS for qapi/instrument.json.

> diff --git a/instrument/Makefile.objs b/instrument/Makefile.objs
> index 5ea5c77245..13a8f60431 100644
> --- a/instrument/Makefile.objs
> +++ b/instrument/Makefile.objs
> @@ -2,3 +2,4 @@
>  
>  target-obj-y += cmdline.o
>  target-obj-$(CONFIG_INSTRUMENT) += load.o
> +target-obj-y += qmp.o
> diff --git a/instrument/load.h b/instrument/load.h
> index 2ddb2c6c19..f8a02e6849 100644
> --- a/instrument/load.h
> +++ b/instrument/load.h
> @@ -25,6 +25,8 @@
>   * @INSTR_LOAD_DLERROR: Error with libdl (see dlerror).
>   *
>   * Error codes for instr_load().
> + *
> + * NOTE: Keep in sync with QAPI's #InstrLoadCode.
>   */
>  typedef enum {
>  INSTR_LOAD_OK,
> @@ -40,6 +42,8 @@ typedef enum {
>   * @INSTR_UNLOAD_DLERROR: Error with libdl (see dlerror).
>   *
>   * Error codes for instr_unload().
> + *
> + * NOTE: Keep in sync with QAPI's #InstrUnloadCode.
>   */
>  typedef enum {
>  INSTR_UNLOAD_OK,

Any particular reason why you can't simply use the QAPI-generated enum
types?

> diff --git a/instrument/qmp.c b/instrument/qmp.c
> new file mode 100644
> index 00..c36960c12f
> --- /dev/null
> +++ b/instrument/qmp.c
> @@ -0,0 +1,88 @@
> +/*
> + * QMP interface for instrumentation control commands.
> + *
> + * Copyright (C) 2012-2017 Lluís Vilanova 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qmp-commands.h"
> +
> +#include 

System headers go right after "qemu/osdep.h".

> +
> +#include "instrument/load.h"
> +
> +
> +

Fewer blank lines would do.

> +InstrLoadResult *qmp_instr_load(const char * path,
> +bool have_args, strList * args,

checkpatch ERROR: "foo * bar" should be "foo *bar"

Please feed it all your patches, and carefully consider which of its
complaints you should address.

> +Error **errp)
> +{
> +InstrLoadResult *res = g_malloc0(sizeof(*res));
> +
> +#if defined(CONFIG_INSTRUMENT)
> +int argc = 0;
> +const char **argv = NULL;
> +
> +strList *entry = have_args ? args : NULL;
> +while (entry != NULL) {
> +argv = realloc(argv, sizeof(*argv) * (argc + 1));
> +argv[argc] = entry->value;
> +argc++;
> +entry = entry->next;
> +}
> +
> +InstrLoadError code = instr_load(path, argc, argv, &res->handle);
> +switch (code) {
> +case INSTR_LOAD_OK:
> +res->code = INSTR_LOAD_CODE_OK;
> +res->has_handle = true;
> +break;
> +case INSTR_LOAD_TOO_MANY:
> +res->code = INSTR_LOAD_CODE_TOO_MANY;
> +break;
> +case INSTR_LOAD_ERROR:
> +res->code = INSTR_LOAD_CODE_ERROR;
> +break;
> +case INSTR_LOAD_DLERROR:
> +res->has_msg = true;
> +res->msg = dlerror();
> +res->code = INSTR_LOAD_CODE_DLERROR;
> +break;
> +}
> +#else
> +res->code = INSTR_LOAD_CODE_UNAVAILABLE;
> +#endif
> +
> +return res;
> +}
> +
> +InstrUnloadResult *qmp_instr_unload(int64_t handle, Error **errp)
> +{
> +InstrUnloadResult *res = g_malloc0(sizeof(*res));
> +
> +#if defined(CONFIG_INSTRUMENT)
> +InstrUnloadError code = instr_unload(handle);
> +switch (code) {
> +case INSTR_UNLOAD_OK:
> +res->code = INSTR_UNLOAD_CODE_OK;
> +break;
> +case INSTR_UNLOAD_INVALID:
> +res->code = INSTR_UNLOAD_CODE_INVALID;
> +break;
> +case INSTR_UNLOAD_DLERROR:
> +res->has_msg = true;
> +res->msg = dlerror();
> +break;
> +res->code = INSTR_UNLOAD_CODE_DLERROR;
> +}
> +#else
> +res->code = INSTR_UNLOAD_CODE_UNAVAILABLE;
> +#endif
> +
> +return res;
> +}

You're inventing your own "this feature isn't compiled in" protocol.  We
already got too many of them.  Let's stick to one of the existing ones,
namely the one that's got some visibility in introspection.  Bonus:
turns the semantic conflict with Marc-André's work into a textual
conflict.  Here's how:

* Add your commands to qmp_unregister_commands_hack() in monitor.c,
  r

Re: [Qemu-devel] [PATCH RFC] accel: default to an actually available accelerator

2017-09-07 Thread Kevin Wolf

Am 07.09.2017 um 10:25 hat Cornelia Huck geschrieben:
> On Thu, 7 Sep 2017 10:14:27 +0200
> Thomas Huth  wrote:
> 
> > On 07.09.2017 10:11, Kevin Wolf wrote:
> > [...]
> > > But the real reason why I'm replying: Should we add changing the default
> > > to "kvm:tcg" to the list of planned 3.0 changes? I am part of the group
> > > that intentionally uses TCG occasionally, but I think the majority of
> > > users wants to use KVM (or whatever the fastest option is on their
> > > system) whenever it is available.  
> > 
> > If you consider how often people are getting this wrong (they want to
> > use KVM but end up with TCG in the first try), I think that's a good idea.
> 
> Agreed.
> 
> I'm wondering what that means for our tests, though. Some of them work
> slightly different under tcg or kvm (cf. iotest 186, as referenced in
> the original mail), and sometimes you'll probably explicitly want to
> exercise tcg. That does not speak against the change, but we probably
> need to look at what we want in more detail.

This is a bug in test 186, and probably 172, too. Normally, we use the
options from ./common:

export QEMU_OPTIONS="-nodefaults -machine accel=qtest"

However, these two test cases overwrite QEMU_OPTIONS and neglect to add
a '-machine accel=qtest' option manually.

Kevin

[Qemu-devel] [PATCH] buildsys: Move brlapi libs to per object

2017-09-07 Thread Fam Zheng

baum.o already receives the sdl cflags in its per object variable, do
the same for brlapi libs to avoid cluttering libs_softmmu.

Signed-off-by: Fam Zheng 
---
 chardev/Makefile.objs | 1 +
 configure | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/chardev/Makefile.objs b/chardev/Makefile.objs
index 52a8127606..d68e1347f9 100644
--- a/chardev/Makefile.objs
+++ b/chardev/Makefile.objs
@@ -20,5 +20,6 @@ chardev-obj-$(CONFIG_WIN32) += char-win-stdio.o
 common-obj-y += msmouse.o wctablet.o testdev.o
 common-obj-$(CONFIG_BRLAPI) += baum.o
 baum.o-cflags := $(SDL_CFLAGS)
+baum.o-libs := $(BRLAPI_LIBS)
 
 common-obj-$(CONFIG_SPICE) += spice.o
diff --git a/configure b/configure
index fb7e34a901..d718718cfa 100755
--- a/configure
+++ b/configure
@@ -3106,7 +3106,6 @@ int main( void ) { return brlapi__openConnection (NULL, 
NULL, NULL); }
 EOF
   if compile_prog "" "$brlapi_libs" ; then
 brlapi=yes
-libs_softmmu="$brlapi_libs $libs_softmmu"
   else
 if test "$brlapi" = "yes" ; then
   feature_not_found "brlapi" "Install brlapi devel"
@@ -5693,6 +5692,7 @@ if test "$curl" = "yes" ; then
 fi
 if test "$brlapi" = "yes" ; then
   echo "CONFIG_BRLAPI=y" >> $config_host_mak
+  echo "BRLAPI_LIBS=$brlapi_libs" >> $config_host_mak
 fi
 if test "$bluez" = "yes" ; then
   echo "CONFIG_BLUEZ=y" >> $config_host_mak
-- 
2.13.5

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Stefan Hajnoczi

On Thu, Sep 7, 2017 at 9:13 AM, Peter Xu  wrote:
> On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
>> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
>> > * Daniel P. Berrange (berra...@redhat.com) wrote:
>> > > This does imply that you need a separate monitor I/O processing, from the
>> > > command execution thread, but I see no need for all commands to suddenly
>> > > become async. Just allowing interleaved replies is sufficient from the
>> > > POV of the protocol definition. This interleaving is easy to handle from
>> > > the client POV - just requires a unique 'serial' in the request by the
>> > > client, that is copied into the reply by QEMU.
>> >
>> > OK, so for that we can just take Marc-André's syntax and call it 'id':
>> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
>> >
>> > then it's upto the caller to ensure those id's are unique.
>>
>> Libvirt has in fact generated a unique 'id' for every monitor command
>> since day 1 of supporting QMP.
>>
>> > I do worry about two things:
>> >   a) With this the caller doesn't really know which commands could be
>> >   in parallel - for example if we've got a recovery command that's
>> >   executed by this non-locking thread that's OK, we expect that
>> >   to be doable in parallel.  If in the future though we do
>> >   what you initially suggested and have a bunch of commands get
>> >   routed to the migration thread (say) then those would suddenly
>> >   operate in parallel with other commands that we're previously
>> >   synchronous.
>>
>> We could still have an opt-in for async commands. eg default to executing
>> all commands in the main thread, unless the client issues an explicit
>> "make it async" command, to switch to allowing the migration thread to
>> process it async.
>>
>>  { "execute": "qmp_allow_async",
>>"data": { "commands": [
>>"migrate_cancel",
>>] } }
>>
>>
>>  { "return": { "commands": [
>>"migrate_cancel",
>>] } }
>>
>> The server response contains the subset of commands from the request
>> for which async is supported.
>>
>> That gives good negotiation ability going forward as we incrementally
>> support async on more commands.
>
> I think this goes back to the discussion on which design we'd like to
> choose.  IMHO the whole async idea plus the per-command-id is indeed
> cleaner and nicer, and I believe that can benefit not only libvirt,
> but also other QMP users.  The problem is, I have no idea how long
> it'll take to let us have such a feature - I believe that will include
> QEMU and Libvirt to both support that.  And it'll be a pity if the
> postcopy recovery cannot work only because we cannot guarantee a
> stable monitor.

Please don't rush in a hack, they often introduce new bugs that we
have to support long-term when they are part of the QMP API.

In your original email you mentioned "info cpus".  Have you considered
modifying this command so it does not sync the CPU?  I'm not sure
callers really need to sync the CPU, typically they just want to know
the vcpu numbers, thread IDs, and current state (halted, running,
etc).

The next step after that would be to audit other monitor commands for
unnecessary vcpu synchronization.

Stefan

Re: [Qemu-devel] [PATCH v4 08/20] instrument: [hmp] Add library loader

2017-09-07 Thread Markus Armbruster

Lluís Vilanova  writes:

> Signed-off-by: Lluís Vilanova 
> ---
>  hmp-commands.hx |   28 ++
>  monitor.c   |   60 
> +++
>  2 files changed, 88 insertions(+)
>
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 1941e19932..703d7262f5 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1858,6 +1858,34 @@ ETEXI
>  .sub_table  = info_cmds,
>  },
>  
> +{
> +.name   = "instr-load",
> +.args_type  = "path:F,args:s?",
> +.params = "path [arg]",
> +.help   = "load an instrumentation library",
> +.cmd= hmp_instr_load,
> +},
> +
> +STEXI
> +@item instr-load @var{path} [args=value[,...]]
> +@findex instr-load
> +Load an instrumentation library.
> +ETEXI
> +
> +{
> +.name   = "instr-unload",
> +.args_type  = "handle:i",
> +.params = "handle",
> +.help   = "unload an instrumentation library",
> +.cmd= hmp_instr_unload,
> +},
> +
> +STEXI
> +@item instr-unload
> +@findex instr-unload
> +Unload an instrumentation library.
> +ETEXI
> +
>  STEXI
>  @end table
>  ETEXI

Want #ifdef CONFIG_INSTRUMENT, see my review of the previous patch.

See also my remark there on returning handles vs. passing in IDs.

> diff --git a/monitor.c b/monitor.c
> index e0f880107f..8a7684f860 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -2319,6 +2319,66 @@ int monitor_fd_param(Monitor *mon, const char *fdname, 
> Error **errp)
>  return fd;
>  }
>  
> +static void hmp_instr_load(Monitor *mon, const QDict *qdict)
> +{
> +const char *path = qdict_get_str(qdict, "path");
> +const char *str = qdict_get_try_str(qdict, "args");
> +strList args;

Blank line between last declaration and first statement, please.

> +args.value = (str == NULL) ? NULL : (char *)str;

What's wrong with

   args.value = (char *)str;

?

> +args.next = NULL;
> +InstrLoadResult *res = qmp_instr_load(path, args.value != NULL,
> +  args.value != NULL ? &args : NULL,
> +  NULL);
> +switch (res->code) {
> +case INSTR_LOAD_CODE_OK:
> +monitor_printf(mon, "Handle: %"PRId64"\n", res->handle);
> +monitor_printf(mon, "OK\n");
> +break;
> +case INSTR_LOAD_CODE_TOO_MANY:
> +monitor_printf(mon, "Too many instrumentation libraries already 
> loaded\n");
> +break;
> +case INSTR_LOAD_CODE_ERROR:
> +monitor_printf(mon, "Instrumentation library returned a non-zero 
> value during initialization");
> +break;
> +case INSTR_LOAD_CODE_DLERROR:
> +monitor_printf(mon, "Error loading library: %s\n", res->msg);
> +break;
> +case INSTR_LOAD_CODE_UNAVAILABLE:
> +monitor_printf(mon, "Service not available\n");
> +break;
> +default:
> +fprintf(stderr, "Unknown instrumentation load code: %d", res->code);
> +exit(1);

Impossible conditions should be assertion failures, but it's a moot point
because...

> +break;
> +}
> +qapi_free_InstrLoadResult(res);
> +}

With qmp_instr_load() fixed to set an error on failure, this becomes
something like

   InstrLoadResult *res = qmp_instr_load(path, args.value != NULL,
 args.value != NULL ? &args : NULL,
 &err);
   if (err) {
   error_report_err(err);
   } else {
   monitor_printf(mon, "Handle: %"PRId64"\n", res->handle);
   monitor_printf(mon, "OK\n");
   }
   qapi_free_InstrLoadResult(res);

> +
> +static void hmp_instr_unload(Monitor *mon, const QDict *qdict)
> +{
> +int64_t handle = qdict_get_int(qdict, "handle");
> +InstrUnloadResult *res = qmp_instr_unload(handle, NULL);
> +switch (res->code) {
> +case INSTR_UNLOAD_CODE_OK:
> +monitor_printf(mon, "OK\n");
> +break;
> +case INSTR_UNLOAD_CODE_INVALID:
> +monitor_printf(mon, "Invalid handle\n");
> +break;
> +case INSTR_UNLOAD_CODE_DLERROR:
> +monitor_printf(mon, "Error unloading library: %s\n", res->msg);
> +break;
> +case INSTR_UNLOAD_CODE_UNAVAILABLE:
> +monitor_printf(mon, "Service not available\n");
> +break;
> +default:
> +fprintf(stderr, "Unknown instrumentation unload code: %d", 
> res->code);
> +exit(1);
> +break;
> +}
> +qapi_free_InstrUnloadResult(res);
> +}
> +
>  /* Please update hmp-commands.hx when adding or changing commands */
>  static mon_cmd_t info_cmds[] = {
>  #include "hmp-commands-info.h"

Likewise.

[Qemu-devel] [PATCH] qemu-iotests: Add missing -machine accel=qtest

2017-09-07 Thread Kevin Wolf

A basic set of qemu options is initialised in ./common:

export QEMU_OPTIONS="-nodefaults -machine accel=qtest"

However, two test cases (172 and 186) overwrite QEMU_OPTIONS and neglect
to manually set '-machine accel=qtest'. Add the missing option for 172.
186 probably only copied the code from 172, it doesn't actually need to
overwrite QEMU_OPTIONS, so remove that in 186.

Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/172 | 2 +-
 tests/qemu-iotests/186 | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/172 b/tests/qemu-iotests/172
index 826d6fecd3..02c5f79bab 100755
--- a/tests/qemu-iotests/172
+++ b/tests/qemu-iotests/172
@@ -56,7 +56,7 @@ function do_run_qemu()
 done
 fi
 echo quit
-) | $QEMU -nographic -monitor stdio -serial none "$@"
+) | $QEMU -machine accel=qtest -nographic -monitor stdio -serial none "$@"
 echo
 }
 
diff --git a/tests/qemu-iotests/186 b/tests/qemu-iotests/186
index 2b9f618f90..44cc01ed87 100755
--- a/tests/qemu-iotests/186
+++ b/tests/qemu-iotests/186
@@ -56,15 +56,15 @@ function do_run_qemu()
 done
 fi
 echo quit
-) | $QEMU -S -nodefaults -display none -device virtio-scsi-pci -monitor 
stdio "$@" 2>&1
+) | $QEMU -S -display none -device virtio-scsi-pci -monitor stdio "$@" 2>&1
 echo
 }
 
 function check_info_block()
 {
 echo "info block" |
-QEMU_OPTIONS="" do_run_qemu "$@" | _filter_win32 | _filter_hmp |
-_filter_qemu | _filter_generated_node_ids
+do_run_qemu "$@" | _filter_win32 | _filter_hmp | _filter_qemu |
+_filter_generated_node_ids
 }
 
 
-- 
2.13.5

[Qemu-devel] [PATCH] buildsys: Move seccomp cflags/libs to per object

2017-09-07 Thread Fam Zheng

Like many other libraries, libseccomp cflags and libs should only apply
to the building of necessary objects. Do so in the usual way with the
help of per object variables.

Signed-off-by: Fam Zheng 
---
 Makefile.objs | 2 ++
 configure | 6 --
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 24a4ea08b8..d9cf7ad791 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -70,6 +70,8 @@ common-obj-y += backends/
 common-obj-y += chardev/
 
 common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
+qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
+qemu-seccomp.o-libs := $(SECCOMP_LIBS)
 
 common-obj-$(CONFIG_FDT) += device_tree.o
 
diff --git a/configure b/configure
index fb7e34a901..fb81a0189b 100755
--- a/configure
+++ b/configure
@@ -2052,8 +2052,8 @@ if test "$seccomp" != "no" ; then
 
 if test "$libseccomp_minver" != "" &&
$pkg_config --atleast-version=$libseccomp_minver libseccomp ; then
-libs_softmmu="$libs_softmmu $($pkg_config --libs libseccomp)"
-QEMU_CFLAGS="$QEMU_CFLAGS $($pkg_config --cflags libseccomp)"
+seccomp_cflags="$($pkg_config --cflags libseccomp)"
+seccomp_libs="$($pkg_config --libs libseccomp)"
 seccomp="yes"
 else
 if test "$seccomp" = "yes" ; then
@@ -5875,6 +5875,8 @@ fi
 
 if test "$seccomp" = "yes"; then
   echo "CONFIG_SECCOMP=y" >> $config_host_mak
+  echo "SECCOMP_CFLAGS=$seccomp_cflags" >> $config_host_mak
+  echo "SECCOMP_LIBS=$seccomp_libs" >> $config_host_mak
 fi
 
 # XXX: suppress that
-- 
2.13.5

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Daniel P. Berrange

On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > This does imply that you need a separate monitor I/O processing, from 
> > > > the
> > > > command execution thread, but I see no need for all commands to suddenly
> > > > become async. Just allowing interleaved replies is sufficient from the
> > > > POV of the protocol definition. This interleaving is easy to handle from
> > > > the client POV - just requires a unique 'serial' in the request by the
> > > > client, that is copied into the reply by QEMU.
> > > 
> > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > 
> > > then it's upto the caller to ensure those id's are unique.
> > 
> > Libvirt has in fact generated a unique 'id' for every monitor command
> > since day 1 of supporting QMP.
> > 
> > > I do worry about two things:
> > >   a) With this the caller doesn't really know which commands could be
> > >   in parallel - for example if we've got a recovery command that's
> > >   executed by this non-locking thread that's OK, we expect that
> > >   to be doable in parallel.  If in the future though we do
> > >   what you initially suggested and have a bunch of commands get
> > >   routed to the migration thread (say) then those would suddenly
> > >   operate in parallel with other commands that we're previously
> > >   synchronous.
> > 
> > We could still have an opt-in for async commands. eg default to executing
> > all commands in the main thread, unless the client issues an explicit
> > "make it async" command, to switch to allowing the migration thread to
> > process it async.
> > 
> >  { "execute": "qmp_allow_async",
> >"data": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > 
> >  { "return": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > The server response contains the subset of commands from the request
> > for which async is supported.
> > 
> > That gives good negotiation ability going forward as we incrementally
> > support async on more commands.
> 
> I think this goes back to the discussion on which design we'd like to
> choose.  IMHO the whole async idea plus the per-command-id is indeed
> cleaner and nicer, and I believe that can benefit not only libvirt,
> but also other QMP users.  The problem is, I have no idea how long
> it'll take to let us have such a feature - I believe that will include
> QEMU and Libvirt to both support that.  And it'll be a pity if the
> postcopy recovery cannot work only because we cannot guarantee a
> stable monitor.

This is not a blocker for having postcopy recovery feature merged.
It merely means that in a situation where the mainloop is blocked,
then we can't recover, in other situations we'll be able to recover
fine. Sure it would be nice to fix that problem too, but I don't
see it as a block.

I don't think the hacks proposed are a good tradeoff, compared to
fixing the fundamental problem with the monitor impl in QEMU. We
have discussed this monitor problem for years pretty much since
day 1 of QMP being designed, but it never gets serious attention.
IMHO it is well overdue to change that and focus attention on the
root problem and not just punt it down the road yet again by adding
short term hacks.

Adding an extra monitor channel, even as a short term hack, is
*not* short term from libvirt's POV - we'll have to carry that
code for many years into the future, even after QEMU provides
a real fix. So even if QEMU provides such a short term hack, I
would none the less be strongly against libvirt using it.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH

2017-09-07 Thread Dong Jia Shi

* Halil Pasic  [2017-09-06 16:43:42 +0200]:

> 
> 
> On 09/06/2017 04:20 PM, Cornelia Huck wrote:
> > On Wed, 6 Sep 2017 14:25:13 +0200
> > Halil Pasic  wrote:
> > 
> >> We have basically two possibilities/options which ask for different
> >> handling:
> >> 1) EFAULT is due to a bug in the vfio-ccw implementation
> >> (can be QEMU or kernel).
> >> 2) EFAULT is due to buggy channel program.
> >>
> >> Option 2) is basically to be handled with a channel-program check and
> >> setting primary secondary and alert status. For reference see PoP page
> >> 15-59 ("Designation of Storage Area").  An exception may be an invalid
> >> channel program address in the ORB. There the channel-program check ain't
> >> explicitly stated (although) I would expect one. It may be implied by the
> >> things on page 15-59 though.
> >>
> >> Option 1) is however very similar to other we have figured out that the
> >> implementation is broken situations and should be handled consequently.
> >> The current state of the discussion is with a unit exception.
> >>
> >> Does that make sense?
> > 
> > I think the situation is slightly different here, though. For the orb
> > flags, we reject something out of hand because we have not implemented
> > it, and for that, unit exception sounds like a good fit. Processing
> > errors, however, are more similar to errors in the hardware, and as
> > such can probably be reported via something like equipment check.
> > 
> 
> Noted. Let's see what Dong Jia has to say, before we continuing a
> discussion on something (option 1) what may be irrelevant anyway.
> 
> >>
> >> Now, Dong Jia, I need your help to figure out do we have option 1 or
> >> option 2 here? After quick look at the kernel code, it appears to me that
> >> I've seen both option 1 and option 2 (I'm afraid) -- but my assessment
> >> was really very superficial.
There are three cases (all in the kernel) that generate a -EFAULT ret
code:
a. vfio_ccw_mdev_write: copy_from_user() fails.
  This is option 1.

b. ccwchain_fetch_tic
  It's mostly likely that the vfio-ccw driver processed the ccw chains
  wrongly. (Actually I can not think of any other reason.)
  This is option 1.

c. ccwchain_fetch_idal
  When we find that an IDAW contents an invalid address
  This is option 2.

> >>
> >> I would expect option 2 to be handled differently (kernel provides the
> >> SCSW) though.
> >>
> >> Regards,
> >> Halil
> >>
> > 
> > 

-- 
Dong Jia Shi

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Stefan Hajnoczi

On Wed, Sep 6, 2017 at 4:14 PM, Dr. David Alan Gilbert
 wrote:
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
>> On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
>> > The root problem is that, monitor commands are all handled in main
>> > loop thread now, no matter how many monitors we specify. And, if main
>> > loop thread hangs due to some reason, all monitors will be stuck.
>>
>> I see a larger issue with postcopy: existing QEMU code assumes that
>> guest memory access is instantaneous.
>>
>> Postcopy breaks this assumption and introduces blocking points that can
>> now take unbounded time.
>>
>> This problem isn't specific to the monitor.  It can also happen to other
>> components in QEMU like the gdbstub.
>>
>> Do we need an asynchronous memory API?  Synchronous memory access should
>> only be allowed in vcpu threads.
>
> It would probably be useful for gdbstub where the overhead of async
> doesn't matter;  but doing that for all IO emulation is hard.

Why is it hard?

Memory access can be synchronous in the vcpu thread.  That eliminates
a lot of code straight away.

Anything using dma-helpers.c is already async.  They just don't know
that the memory access part is being made async too :).

The remaining cases are virtio and some other devices.

If you are worried about performance, the first rule is that async
memory access is only needed on the destination side when post-copy is
active.  Maybe use setjmp to return from the signal handler and queue
a callback for when the page has been loaded.

Stefan

Re: [Qemu-devel] [PATCH v3] vhost-user: disable the broken subprocess tests

2017-09-07 Thread Peter Maydell

On 5 September 2017 at 19:06, Philippe Mathieu-Daudé  wrote:
> tests/vhost-user-test keeps failing on build-system since Aug 15:
>
>   ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process 
> (/i386/vhost-user/flags-mismatch/subprocess [4836]) failed unexpectedly
> ...
>   ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process 
> (/x86_64/vhost-user/connect-fail/subprocess [58910]) failed unexpectedly
>
> Suggested-by: Peter Maydell 
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v3: only disable tests using glib subprocess,
> migrate/multiqueue/guest-mem are still tested.
>
> v2: remove unuseful warning

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v4 2/3] watchdog.h: Drop local redefinition of actions enum

2017-09-07 Thread Markus Armbruster

Michal Privoznik  writes:

> On 09/06/2017 07:25 PM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> On 09/06/2017 06:24 AM, Michal Privoznik wrote:
 We already have enum that enumerates all the action that a
>>>
>>> s/action/actions/
>>>
 watchdog can take when hitting its timeout: WatchdogAction.
 Use that instead of inventing our own.

 Signed-off-by: Michal Privoznik 
 ---
>>>
 @@ -77,27 +77,16 @@ int select_watchdog(const char *p)
  
  int select_watchdog_action(const char *p)
  {
 -if (strcasecmp(p, "reset") == 0)
 -watchdog_action = WDT_RESET;
>>>
>>> The old code was case-insensitive,
>>>
 +action = qapi_enum_parse(&WatchdogAction_lookup, p, -1, NULL);
>>>
>>> the new code is not.  Do we care?  (I don't, but we could be breaking
>>> someone's control flow).  Should qapi_enum_parse be taught to be
>>> case-insensitive?  Or perhaps we answer related questions first: Do we
>>> have any QAPI enums that have values differing only in case? Do we
>>> prevent such QAPI definitions, to give us the potential of making the
>>> parsing insensitive?
>> 
>> Case-sensitive everywhere is fine.  Case-insensitive everywhere also
>> fine, just not my personal preference.  What's not fine is "guess
>> whether this part of the interface is case-sensitive or not".
>> 
>> QMP is case-sensitive.  Let's keep it that way.
>> 
>> The -watchdog-action option has a case-insensitive argument.  The
>> obvious way to remain misfeature-^Wbackwards compatible is converting
>> the argument to lower case before handing it off to qapi_enum_parse.  I
>> doubt it matters, but just doing it is less work than debating how far
>> exactly we want to bend over backwards.
>> 
>> g_ascii_strdown() should do.  It only converts ASCII characters, but
>> anything else is going to fail in qapi_enum_parse() anyway.
>> 
>
> On the other hand, the documentation enumerates the accepted values in
> lowercase. So one can argue that upper- or mixed-case is just a misuse
> of a bug in the code.

I quite agree, but...

>   But getting the code in is more important to me so
> I'll do the strdown() conversion and sent yet another version.

... like you, I have to pick my battles.  A respin adding the stupid
case conversion seems safer for both of us than risking a debate on how
far we need to go for backward compatibility.

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Janosch Frank

[...]
> The main problem here is that you want to exercise a middle layer (the
> css code) and need to write boilerplate code on both host and guest
> side in order to be able to do so.
> 

 Nod.
  
> In general, a device that accepts arbitrary channel programs looks
> useful for testing purposes. I would split out processing of expected
> responses out, though, so that it can be more easily reused for
> different use cases.
> 

 I'm not sure what do you mean here. Could you clarify please?  
>>>
>>> Maybe doing things like logging received ccws and responses etc. and
>>> and then comparing against an expected outcome. You should be able to
>>> write test cases for different sets of things to be tested. I know this
>>> is awfully vague :)  
>>
>> Yes it does sound awfully vague :). In my case different tests are
>> actually done in the kernel module implementing a driver for this
>> device. I compare the expected outcome and the actual outcome there.
>> This device is just a back-end lending itself to experimentation.
> 
> I think we want both guest-side and host-side checking. I think we
> should aim for a guest-side checker that does not require a kernel
> module (a simple guest makes it easier to run as an automated test.)

Sure, that would be great.

The thing is that I want to test the subchannel and not the device and
therefore I really do not want to have to issue control commands to a
device in order to get data. Having a device that reads and writes
without a lot of overhead (like this) is therefore my main target.

Where this device lives doesn't concern me and as I'm new to this
wonderful IO system take my statements with some salt. :)


> For the guest tester: Can that be done via the qtest infrastructure
> somehow?
> 

 Well, for now I have the out-of-tree Linux kernel module provided in the
 cover letter of the series (you did not comment on that one yet).  
>>>
>>> Yes, I saw that, but have not yet looked at it. If there is a way to do
>>> it without too many extra parts, I'd prefer that.
>>>   
>>
>> Well, I think the module is almost as economical with extra parts as it
>> can get: it uses the kernel infrastructure and does not do a whole lot
>> of things on top.
> 
> And that's also the problem: You drag the whole kernel infrastructure
> along with i
> 
>>
>> I think it's worth a look.
> 
> It certainly is, but you know how it is with resources...

Yes it is and we certainly want to be integrated in as much external
testing as possible. It seems like a few people began to run into the
same direction but chose different approaches. My zonk test intentions
are mainly to get a understanding how this all works without having to
use the Linux devices but getting my hands dirty with the instructions
and structures.

I see your arguments and we'll look into it and discuss consolidating
our efforts.




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v1 1/2] io: send proper HTTP response for websocket errors

2017-09-07 Thread Daniel P. Berrange

On Wed, Sep 06, 2017 at 03:06:03PM -0300, Philippe Mathieu-Daudé wrote:
> Hi Daniel,
> 
> On 09/06/2017 07:40 AM, Daniel P. Berrange wrote:
> > When any error occurs while processing the websockets handshake,
> > QEMU just terminates the connection abruptly. This is in violation
> > of the HTTP specs and does not help the client understand what they
> > did wrong. This is particularly bad when the client gives the wrong
> > path, as a "404 Not Found" would be very helpful.
> > 
> > Refactor the handshake code so that it always sends a response to
> > the client unless there was an I/O error.
> > 
> > Fixes bug: #1715186
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >   io/channel-websock.c | 179 
> > ++-
> >   1 file changed, 134 insertions(+), 45 deletions(-)
> > 
> > diff --git a/io/channel-websock.c b/io/channel-websock.c
> > index 5a3badbec2..b9cc5a1371 100644
> > --- a/io/channel-websock.c
> > +++ b/io/channel-websock.c
> > @@ -44,13 +44,39 @@
> >   #define QIO_CHANNEL_WEBSOCK_CONNECTION_UPGRADE "Upgrade"
> >   #define QIO_CHANNEL_WEBSOCK_UPGRADE_WEBSOCKET "websocket"
> > -#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RESPONSE  \
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON \
> > +"Server: QEMU VNC\r\n"   \
> > +"Date: %s\r\n"
> 
> and
> "Sec-WebSocket-Version: " \
>   QIO_CHANNEL_WEBSOCK_SUPPORTED_VERSION "\r\n" \

This header is only supposed to be set in error responses,
not in the 101 response.

> or what about:
> 
> #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON(conn)   \
>   "Server: QEMU VNC\r\n" \
>   "Date: %s\r\n" \
>   "Connection: " conn "\r\n" \
>   "Sec-WebSocket-Version: "  \
> QIO_CHANNEL_WEBSOCK_SUPPORTED_VERSION "\r\n"

I'm not a fan of parameterizing this - I think it is clearer
to see the full connection header inline below.

> > +
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_OK\
> >   "HTTP/1.1 101 Switching Protocols\r\n"  \
> > +QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON\
> >   "Upgrade: websocket\r\n"\
> >   "Connection: Upgrade\r\n"   \
> >   "Sec-WebSocket-Accept: %s\r\n"  \
> >   "Sec-WebSocket-Protocol: binary\r\n"\
> >   "\r\n"
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_NOT_FOUND \
> > +"HTTP/1.1 404 Not Found\r\n"\
> > +QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON\
> > +"Connection: close\r\n" \
> > +"\r\n"
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_BAD_REQUEST \
> > +"HTTP/1.1 400 Bad Request\r\n"\
> > +QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON  \
> > +"Connection: close\r\n"   \
> > +"Sec-WebSocket-Version: 13\r\n"   \
> 
> drop
> 
> > +"\r\n"
> 
> or with previous macro:
> 
> #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_BAD_REQUEST \
> "HTTP/1.1 400 Bad Request\r\n"\
> QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON("close") \
> "\r\n"
> 
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_SERVER_ERR \
> > +"HTTP/1.1 500 Internal Server Error\r\n" \
> > +QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON \
> > +"Connection: close\r\n"  \
> > +"\r\n"
> > +#define QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_TOO_LARGE  \
> > +"HTTP/1.1 403 Request Entity Too Large\r\n"  \
> > +QIO_CHANNEL_WEBSOCK_HANDSHAKE_RES_COMMON \
> > +"Connection: close\r\n"  \
> > +"\r\n"
> >   #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_DELIM "\r\n"
> >   #define QIO_CHANNEL_WEBSOCK_HANDSHAKE_END "\r\n\r\n"
> >   #define QIO_CHANNEL_WEBSOCK_SUPPORTED_VERSION "13"


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Peter Xu (pet...@redhat.com) wrote:
> On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > This does imply that you need a separate monitor I/O processing, from 
> > > > the
> > > > command execution thread, but I see no need for all commands to suddenly
> > > > become async. Just allowing interleaved replies is sufficient from the
> > > > POV of the protocol definition. This interleaving is easy to handle from
> > > > the client POV - just requires a unique 'serial' in the request by the
> > > > client, that is copied into the reply by QEMU.
> > > 
> > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > 
> > > then it's upto the caller to ensure those id's are unique.
> > 
> > Libvirt has in fact generated a unique 'id' for every monitor command
> > since day 1 of supporting QMP.
> > 
> > > I do worry about two things:
> > >   a) With this the caller doesn't really know which commands could be
> > >   in parallel - for example if we've got a recovery command that's
> > >   executed by this non-locking thread that's OK, we expect that
> > >   to be doable in parallel.  If in the future though we do
> > >   what you initially suggested and have a bunch of commands get
> > >   routed to the migration thread (say) then those would suddenly
> > >   operate in parallel with other commands that we're previously
> > >   synchronous.
> > 
> > We could still have an opt-in for async commands. eg default to executing
> > all commands in the main thread, unless the client issues an explicit
> > "make it async" command, to switch to allowing the migration thread to
> > process it async.
> > 
> >  { "execute": "qmp_allow_async",
> >"data": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > 
> >  { "return": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > The server response contains the subset of commands from the request
> > for which async is supported.
> > 
> > That gives good negotiation ability going forward as we incrementally
> > support async on more commands.
> 
> I think this goes back to the discussion on which design we'd like to
> choose.  IMHO the whole async idea plus the per-command-id is indeed
> cleaner and nicer, and I believe that can benefit not only libvirt,
> but also other QMP users.  The problem is, I have no idea how long
> it'll take to let us have such a feature - I believe that will include
> QEMU and Libvirt to both support that.  And it'll be a pity if the
> postcopy recovery cannot work only because we cannot guarantee a
> stable monitor.

libvirt will need changes for postcopy recovery however we do it;
so we need to do it the way they want.

I think Dan's suggestion isn't as hard as it initially sounded;  a first
thing to try would be taking all the monitor IO into another thread
and feeding all commands to the main thread for execution - that sounds
like the hard part.
(I'm not sure how multiple monitors interact for this).

Dave

> I'm curious whether there are other requirements (besides postcopy
> recovery) that would want an always-alive monitor to run some
> lock-free commands?  If there is, I'd be more inclined to first
> provide a work-around solution like "-qmp-lockfree", and we can
> provide a better solution afterwards until when the whole async QMP
> work ready.
> 
> > 
> > >   b) I still worry how the various IO channels will behave on another
> > >   thread.  But that's more a general feeling rather than anything
> > >   specific.
> > 
> > The only complexity will be around making sure the Chardev code uses
> > the right GMainContext for any watches on the underlying QIOChannel,
> > so that we poll() from the custom thread instead of the main thread.
> > IOW, as long as all I/O is done from the single thread everything
> > should work fine.
> > 
> > Regards,
> > Daniel
> > -- 
> > |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange 
> > :|
> > |: https://libvirt.org -o-https://fstop138.berrange.com 
> > :|
> > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange 
> > :|
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Stefan Hajnoczi (stefa...@gmail.com) wrote:
> On Thu, Sep 7, 2017 at 9:13 AM, Peter Xu  wrote:
> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> >> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> >> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> >> > > This does imply that you need a separate monitor I/O processing, from 
> >> > > the
> >> > > command execution thread, but I see no need for all commands to 
> >> > > suddenly
> >> > > become async. Just allowing interleaved replies is sufficient from the
> >> > > POV of the protocol definition. This interleaving is easy to handle 
> >> > > from
> >> > > the client POV - just requires a unique 'serial' in the request by the
> >> > > client, that is copied into the reply by QEMU.
> >> >
> >> > OK, so for that we can just take Marc-André's syntax and call it 'id':
> >> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> >> >
> >> > then it's upto the caller to ensure those id's are unique.
> >>
> >> Libvirt has in fact generated a unique 'id' for every monitor command
> >> since day 1 of supporting QMP.
> >>
> >> > I do worry about two things:
> >> >   a) With this the caller doesn't really know which commands could be
> >> >   in parallel - for example if we've got a recovery command that's
> >> >   executed by this non-locking thread that's OK, we expect that
> >> >   to be doable in parallel.  If in the future though we do
> >> >   what you initially suggested and have a bunch of commands get
> >> >   routed to the migration thread (say) then those would suddenly
> >> >   operate in parallel with other commands that we're previously
> >> >   synchronous.
> >>
> >> We could still have an opt-in for async commands. eg default to executing
> >> all commands in the main thread, unless the client issues an explicit
> >> "make it async" command, to switch to allowing the migration thread to
> >> process it async.
> >>
> >>  { "execute": "qmp_allow_async",
> >>"data": { "commands": [
> >>"migrate_cancel",
> >>] } }
> >>
> >>
> >>  { "return": { "commands": [
> >>"migrate_cancel",
> >>] } }
> >>
> >> The server response contains the subset of commands from the request
> >> for which async is supported.
> >>
> >> That gives good negotiation ability going forward as we incrementally
> >> support async on more commands.
> >
> > I think this goes back to the discussion on which design we'd like to
> > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > cleaner and nicer, and I believe that can benefit not only libvirt,
> > but also other QMP users.  The problem is, I have no idea how long
> > it'll take to let us have such a feature - I believe that will include
> > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > postcopy recovery cannot work only because we cannot guarantee a
> > stable monitor.
> 
> Please don't rush in a hack, they often introduce new bugs that we
> have to support long-term when they are part of the QMP API.
> 
> In your original email you mentioned "info cpus".  Have you considered
> modifying this command so it does not sync the CPU?  I'm not sure
> callers really need to sync the CPU, typically they just want to know
> the vcpu numbers, thread IDs, and current state (halted, running,
> etc).

But it has the pc as well, so that's actual state.

Dave

> The next step after that would be to audit other monitor commands for
> unnecessary vcpu synchronization.
> 
> Stefan
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > > This does imply that you need a separate monitor I/O processing, from 
> > > > > the
> > > > > command execution thread, but I see no need for all commands to 
> > > > > suddenly
> > > > > become async. Just allowing interleaved replies is sufficient from the
> > > > > POV of the protocol definition. This interleaving is easy to handle 
> > > > > from
> > > > > the client POV - just requires a unique 'serial' in the request by the
> > > > > client, that is copied into the reply by QEMU.
> > > > 
> > > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > > 
> > > > then it's upto the caller to ensure those id's are unique.
> > > 
> > > Libvirt has in fact generated a unique 'id' for every monitor command
> > > since day 1 of supporting QMP.
> > > 
> > > > I do worry about two things:
> > > >   a) With this the caller doesn't really know which commands could be
> > > >   in parallel - for example if we've got a recovery command that's
> > > >   executed by this non-locking thread that's OK, we expect that
> > > >   to be doable in parallel.  If in the future though we do
> > > >   what you initially suggested and have a bunch of commands get
> > > >   routed to the migration thread (say) then those would suddenly
> > > >   operate in parallel with other commands that we're previously
> > > >   synchronous.
> > > 
> > > We could still have an opt-in for async commands. eg default to executing
> > > all commands in the main thread, unless the client issues an explicit
> > > "make it async" command, to switch to allowing the migration thread to
> > > process it async.
> > > 
> > >  { "execute": "qmp_allow_async",
> > >"data": { "commands": [
> > >"migrate_cancel",
> > >] } }
> > > 
> > > 
> > >  { "return": { "commands": [
> > >"migrate_cancel",
> > >] } }
> > > 
> > > The server response contains the subset of commands from the request
> > > for which async is supported.
> > > 
> > > That gives good negotiation ability going forward as we incrementally
> > > support async on more commands.
> > 
> > I think this goes back to the discussion on which design we'd like to
> > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > cleaner and nicer, and I believe that can benefit not only libvirt,
> > but also other QMP users.  The problem is, I have no idea how long
> > it'll take to let us have such a feature - I believe that will include
> > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > postcopy recovery cannot work only because we cannot guarantee a
> > stable monitor.
> 
> This is not a blocker for having postcopy recovery feature merged.
> It merely means that in a situation where the mainloop is blocked,
> then we can't recover, in other situations we'll be able to recover
> fine. Sure it would be nice to fix that problem too, but I don't
> see it as a block.

It's probably OK to merge the recovery code before the monitor code;
but I don't think it's something you'd want to tell users about -
a 'postcopy recovery that only works rarely' isn't much use.

Dave

> I don't think the hacks proposed are a good tradeoff, compared to
> fixing the fundamental problem with the monitor impl in QEMU. We
> have discussed this monitor problem for years pretty much since
> day 1 of QMP being designed, but it never gets serious attention.
> IMHO it is well overdue to change that and focus attention on the
> root problem and not just punt it down the road yet again by adding
> short term hacks.
> 
> Adding an extra monitor channel, even as a short term hack, is
> *not* short term from libvirt's POV - we'll have to carry that
> code for many years into the future, even after QEMU provides
> a real fix. So even if QEMU provides such a short term hack, I
> would none the less be strongly against libvirt using it.
> 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [RFC PATCH qemu 4/4] memory: Add flat views to HMP "info mtree"

2017-09-07 Thread Alexey Kardashevskiy

This adds a new switch to "info mtree" to print dispatch tree internals.

Signed-off-by: Alexey Kardashevskiy 
---

Example:

aik@fstn1-p1:~$ echo "info mtree -f -d" | nc localhost 3
QEMU 2.9.94 monitor - type 'help' for more information
(qemu) info mtree -f -d
FlatView #0
 AS "memory"
  -7fff (prio 0, ram): ppc_spapr.ram
 AS "cpu-memory"
  -7fff (prio 0, ram): ppc_spapr.ram
 AS "cpu-memory"
  -7fff (prio 0, ram): ppc_spapr.ram
  Dispatch
Physical sections
  #0 @ root="(noname)" [unassigned]
  #1 @ root="(noname)" [not dirty]
  #2 @ root="(noname)" [ROM]
  #3 @ root="(noname)" [watch]
  #4 @ root="ppc_spapr.ram"
Nodes (9 bits per level, 6 levels) ptr=[3] skip=4
  [0]
  0   skip=3  ptr=[3]
  1..511  skip=1  ptr=NIL
  [1]
  0   skip=2  ptr=[3]
  1..511  skip=1  ptr=NIL
  [2]
  0   skip=1  ptr=[3]
  1..511  skip=1  ptr=NIL
  [3]
  0..1skip=0  ptr=#4
  2..511  skip=1  ptr=NIL

FlatView #1
 AS "I/O"
  - (prio 0, i/o): io
  Dispatch
Physical sections
  #0 @ root="(noname)" [unassigned]
  #1 @ root="(noname)" [not dirty]
  #2 @ root="(noname)" [ROM]
  #3 @ root="(noname)" [watch]
  #4 @ root="io"
Nodes (9 bits per level, 6 levels) ptr=[5] skip=6
  [0]
  0   skip=5  ptr=[5]
  1..511  skip=1  ptr=NIL
  [1]
  0   skip=4  ptr=[5]
  1..511  skip=1  ptr=NIL
  [2]
  0   skip=3  ptr=[5]
  1..511  skip=1  ptr=NIL
  [3]
  0   skip=2  ptr=[5]
  1..511  skip=1  ptr=NIL
  [4]
  0   skip=1  ptr=[5]
  1..511  skip=1  ptr=NIL
  [5]
  0..15   skip=0  ptr=#4
 16..511  skip=0  ptr=#0

FlatView #2
 AS "pci@8002000"
  -3fff (prio 0, i/o): tce-iommu-8000
  0400-0400 (prio 0, i/o): msi
  Dispatch
Physical sections
  #0 @ root="(noname)" [unassigned]
  #1 @ root="(noname)" [not dirty]
  #2 @ root="(noname)" [ROM]
  #3 @ root="(noname)" [watch]
  #4 @ root="tce-iommu-8000"
  #5 @0400 root="msi"
Nodes (9 bits per level, 6 levels) ptr=[2] skip=3
  [0]
  0   skip=2  ptr=[2]
  1..511  skip=1  ptr=NIL
  [1]
  0   skip=1  ptr=[2]
  1..511  skip=1  ptr=NIL
  [2]
  0   skip=0  ptr=#4
  1..7skip=1  ptr=NIL
  8   skip=3  ptr=[6]
  9..511  skip=1  ptr=NIL
  [3]
  0   skip=0  ptr=#4
  1..511  skip=1  ptr=NIL
  [4]
  0   skip=2  ptr=[6]
  1..511  skip=1  ptr=NIL
  [5]
  0   skip=1  ptr=[6]
  1..511  skip=1  ptr=NIL
  [6]
  0..15   skip=0  ptr=#5
 16..511  skip=0  ptr=#0

(qemu)
---
 include/exec/memory.h |  5 +++-
 exec.c| 66 +++
 memory.c  | 20 
 monitor.c |  3 ++-
 hmp-commands-info.hx  |  7 +++---
 5 files changed, 92 insertions(+), 9 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 41ab165302..2a50bbe79f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1527,7 +1527,10 @@ void memory_global_dirty_log_start(void);
  */
 void memory_global_dirty_log_stop(void);
 
-void mtree_info(fprintf_function mon_printf, void *f, bool flatview);
+void mtree_info(fprintf_function mon_printf, void *f, bool flatview,
+bool dispatch_tree);
+void mtree_print_dispatch(fprintf_function mon, void *f,
+  struct AddressSpaceDispatch *d);
 
 /**
  * memory_region_request_mmio_ptr: request a pointer to an mmio
diff --git a/exec.c b/exec.c
index 51243f57f4..8c565e9102 100644
--- a/exec.c
+++ b/exec.c
@@ -3605,3 +3605,69 @@ void page_size_init(void)
 }
 qemu_host_page_mask = -(intptr_t)qemu_host_page_size;
 }
+
+static void mtree_print_phys_entries(fprintf_function mon, void *f,
+ int start, int end, int skip, int ptr)
+{
+if (start == end - 1) {
+mon(f, "\t%3d  ", start);
+} else {
+mon(f, "\t%3d..%-3d ", start, end - 1);
+}
+mon(f, " skip=%d ", skip);
+if (ptr == PHYS_MAP_NODE_NIL) {
+mon(f, " ptr=NIL");
+} else if (!skip) {
+mon(f, " ptr=#%d", ptr);
+} else {
+mon(f, " ptr=[%d]", ptr);
+}
+mon(f, "\n");
+}
+
+void mtree_print_dispatch(fprintf_function mon, void *f,
+  AddressSpaceDispatch *d)
+{
+int i

[Qemu-devel] [RFC PATCH qemu 1/4] memory: Postpone flatview and dispatch tree building till all devices are added

2017-09-07 Thread Alexey Kardashevskiy

Most devices use at least one address space and every time a new address
space is added, flat views and dispatch trees are rebuild for all address
spaces. This is not a problem for a relatively small amount of devices but
even 50 virtio-pci devices use more than 8GB of RAM.

What happens that on every flatview/dispatch rebuild, new arrays are
allocated and old ones release but the release is done via RCU so until
an entire machine is build, they are not released.

This wraps devices creation into memory_region_transaction_begin/commit
to massively reduce amount of flat view/dispatch tree (re)allocations.

Signed-off-by: Alexey Kardashevskiy 
---
 vl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/vl.c b/vl.c
index 8e247cc2a2..3c39cc8b3a 100644
--- a/vl.c
+++ b/vl.c
@@ -4655,12 +4655,16 @@ int main(int argc, char **argv, char **envp)
 igd_gfx_passthru();
 
 /* init generic devices */
+memory_region_transaction_begin();
+
 rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
 if (qemu_opts_foreach(qemu_find_opts("device"),
   device_init_func, NULL, NULL)) {
 exit(1);
 }
 
+memory_region_transaction_commit();
+
 cpu_synchronize_all_post_init();
 
 rom_reset_order_override();
-- 
2.11.0

[Qemu-devel] [RFC PATCH qemu 3/4] memory: Share flat views and dispatch trees between address spaces

2017-09-07 Thread Alexey Kardashevskiy

This allows sharing flat views between address spaces when the same root
memory region is used when creating a new address space.

This adds a global list of flat views and a list of attached address
spaces per a flat view. Each address space references a flat view.

This hard codes the dispatch tree building instead of doing so via
a memory listener.

Signed-off-by: Alexey Kardashevskiy 
---

This was suggested by Paolo in
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg05011.html

I am not putting "Suggested-by" as I want to make sure that this is doing
what was actually suggested.
---
 include/exec/memory-internal.h |   6 +-
 include/exec/memory.h  |   9 +-
 exec.c |  58 ++
 hw/pci/pci.c   |   3 +-
 memory.c   | 253 +++--
 5 files changed, 187 insertions(+), 142 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index fb467acdba..8516e0b48f 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -22,9 +22,11 @@
 #ifndef CONFIG_USER_ONLY
 typedef struct AddressSpaceDispatch AddressSpaceDispatch;
 
-void address_space_init_dispatch(AddressSpace *as);
 void address_space_unregister(AddressSpace *as);
-void address_space_destroy_dispatch(AddressSpace *as);
+void address_space_dispatch_free(AddressSpaceDispatch *d);
+AddressSpaceDispatch *mem_begin(void);
+void mem_commit(AddressSpaceDispatch *d);
+void mem_add(AddressSpaceDispatch *d, MemoryRegionSection *section);
 
 extern const MemoryRegionOps unassigned_mem_ops;
 
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 83e82e90d5..41ab165302 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -27,6 +27,7 @@
 #include "qemu/rcu.h"
 #include "hw/qdev-core.h"
 
+typedef struct AddressSpaceDispatch AddressSpaceDispatch;
 #define RAM_ADDR_INVALID (~(ram_addr_t)0)
 
 #define MAX_PHYS_ADDR_SPACE_BITS 62
@@ -312,6 +313,7 @@ struct MemoryListener {
 };
 
 AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as);
+MemoryRegion *address_space_root(AddressSpace *as);
 
 /**
  * AddressSpace: describes a mapping of addresses to #MemoryRegion objects
@@ -320,20 +322,17 @@ struct AddressSpace {
 /* All fields are private. */
 struct rcu_head rcu;
 char *name;
-MemoryRegion *root;
-int ref_count;
-bool malloced;
 
 /* Accessed via RCU.  */
 struct FlatView *current_map;
 
 int ioeventfd_nb;
 struct MemoryRegionIoeventfd *ioeventfds;
-struct AddressSpaceDispatch *dispatch;
-struct AddressSpaceDispatch *next_dispatch;
+
 MemoryListener dispatch_listener;
 QTAILQ_HEAD(memory_listeners_as, MemoryListener) listeners;
 QTAILQ_ENTRY(AddressSpace) address_spaces_link;
+QTAILQ_ENTRY(AddressSpace) flat_view_link;
 };
 
 /**
diff --git a/exec.c b/exec.c
index 66f01f5fce..51243f57f4 100644
--- a/exec.c
+++ b/exec.c
@@ -188,15 +188,12 @@ typedef struct PhysPageMap {
 } PhysPageMap;
 
 struct AddressSpaceDispatch {
-struct rcu_head rcu;
-
 MemoryRegionSection *mru_section;
 /* This is a multi-level map on the physical address space.
  * The bottom level has pointers to MemoryRegionSections.
  */
 PhysPageEntry phys_map;
 PhysPageMap map;
-AddressSpace *as;
 };
 
 typedef struct AddressSpaceDispatch AddressSpaceDispatch;
@@ -240,11 +237,6 @@ struct DirtyBitmapSnapshot {
 unsigned long dirty[];
 };
 
-AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as)
-{
-return atomic_rcu_read(&as->dispatch);
-}
-
 #endif
 
 #if !defined(CONFIG_USER_ONLY)
@@ -1354,10 +1346,8 @@ static void register_multipage(AddressSpaceDispatch *d,
 phys_page_set(d, start_addr >> TARGET_PAGE_BITS, num_pages, section_index);
 }
 
-static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
+void mem_add(AddressSpaceDispatch *d, MemoryRegionSection *section)
 {
-AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
-AddressSpaceDispatch *d = as->next_dispatch;
 MemoryRegionSection now = *section, remain = *section;
 Int128 page_size = int128_make64(TARGET_PAGE_SIZE);
 
@@ -2680,9 +2670,8 @@ static void io_mem_init(void)
   NULL, UINT64_MAX);
 }
 
-static void mem_begin(MemoryListener *listener)
+AddressSpaceDispatch *mem_begin(void)
 {
-AddressSpace *as = container_of(listener, AddressSpace, dispatch_listener);
 AddressSpaceDispatch *d = g_new0(AddressSpaceDispatch, 1);
 uint16_t n;
 
@@ -2696,27 +2685,19 @@ static void mem_begin(MemoryListener *listener)
 assert(n == PHYS_SECTION_WATCH);
 
 d->phys_map  = (PhysPageEntry) { .ptr = PHYS_MAP_NODE_NIL, .skip = 1 };
-as->next_dispatch = d;
+
+return d;
 }
 
-static void address_space_dispatch_free(AddressSpaceDispatch *d)
+void address_space_dispatch_free(AddressSpaceDispatch *d)
 {
 phys_sections_free(&d->map);
 g_free(d)

[Qemu-devel] [RFC PATCH qemu 0/4] memory: Reduce memory use

2017-09-07 Thread Alexey Kardashevskiy

This was inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1481593

What happens ithere is that every virtio block device creates 2 address
spaces - for modern config space (called "virtio-pci-cfg-as") and
for busmaster (common pci thing, called after the device name,
in my case "virtio-blk-pci").

Each address_space_init() updates topology for every address space.
Every topology update (address_space_update_topology()) creates a new
dispatch tree - AddressSpaceDispatch with nodes (1KB) and
sections (48KB) and destroys the old one.

However the dispatch destructor is postponed via RCU which does not
get a chance to execute until the machine is initialized but before
we get there, memory is not returned to the pool, and this is a lot
of memory which grows n^2.

These patches are trying to address the memory use and boot time
issues but tbh only the first one provides visible outcome.

There are still things to polish and double check the use of RCU,
I'd like to get any feedback before proceeding - is this going
the right way or way too ugly?


This is based on sha1
1ab5eb4efb Peter Maydell "Update version for v2.10.0 release".

Please comment. Thanks.



Alexey Kardashevskiy (4):
  memory: Postpone flatview and dispatch tree building till all devices
are added
  memory: Prepare for shared flat views
  memory: Share flat views and dispatch trees between address spaces
  memory: Add flat views to HMP "info mtree"

 include/exec/memory-internal.h |   6 +-
 include/exec/memory.h  |  93 +
 exec.c | 242 +++--
 hw/alpha/typhoon.c |   2 +-
 hw/dma/rc4030.c|   4 +-
 hw/i386/amd_iommu.c|   2 +-
 hw/i386/intel_iommu.c  |   9 +-
 hw/intc/openpic_kvm.c  |   2 +-
 hw/pci-host/apb.c  |   2 +-
 hw/pci/pci.c   |   3 +-
 hw/ppc/spapr_iommu.c   |   4 +-
 hw/s390x/s390-pci-bus.c|   2 +-
 hw/vfio/common.c   |   6 +-
 hw/virtio/vhost.c  |   6 +-
 memory.c   | 299 +++--
 monitor.c  |   3 +-
 vl.c   |   4 +
 hmp-commands-info.hx   |   7 +-
 18 files changed, 448 insertions(+), 248 deletions(-)

-- 
2.11.0

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Daniel P. Berrange

On Thu, Sep 07, 2017 at 10:19:47AM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> > > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > > > This does imply that you need a separate monitor I/O processing, 
> > > > > > from the
> > > > > > command execution thread, but I see no need for all commands to 
> > > > > > suddenly
> > > > > > become async. Just allowing interleaved replies is sufficient from 
> > > > > > the
> > > > > > POV of the protocol definition. This interleaving is easy to handle 
> > > > > > from
> > > > > > the client POV - just requires a unique 'serial' in the request by 
> > > > > > the
> > > > > > client, that is copied into the reply by QEMU.
> > > > > 
> > > > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > > > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > > > 
> > > > > then it's upto the caller to ensure those id's are unique.
> > > > 
> > > > Libvirt has in fact generated a unique 'id' for every monitor command
> > > > since day 1 of supporting QMP.
> > > > 
> > > > > I do worry about two things:
> > > > >   a) With this the caller doesn't really know which commands could be
> > > > >   in parallel - for example if we've got a recovery command that's
> > > > >   executed by this non-locking thread that's OK, we expect that
> > > > >   to be doable in parallel.  If in the future though we do
> > > > >   what you initially suggested and have a bunch of commands get
> > > > >   routed to the migration thread (say) then those would suddenly
> > > > >   operate in parallel with other commands that we're previously
> > > > >   synchronous.
> > > > 
> > > > We could still have an opt-in for async commands. eg default to 
> > > > executing
> > > > all commands in the main thread, unless the client issues an explicit
> > > > "make it async" command, to switch to allowing the migration thread to
> > > > process it async.
> > > > 
> > > >  { "execute": "qmp_allow_async",
> > > >"data": { "commands": [
> > > >"migrate_cancel",
> > > >] } }
> > > > 
> > > > 
> > > >  { "return": { "commands": [
> > > >"migrate_cancel",
> > > >] } }
> > > > 
> > > > The server response contains the subset of commands from the request
> > > > for which async is supported.
> > > > 
> > > > That gives good negotiation ability going forward as we incrementally
> > > > support async on more commands.
> > > 
> > > I think this goes back to the discussion on which design we'd like to
> > > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > > cleaner and nicer, and I believe that can benefit not only libvirt,
> > > but also other QMP users.  The problem is, I have no idea how long
> > > it'll take to let us have such a feature - I believe that will include
> > > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > > postcopy recovery cannot work only because we cannot guarantee a
> > > stable monitor.
> > 
> > This is not a blocker for having postcopy recovery feature merged.
> > It merely means that in a situation where the mainloop is blocked,
> > then we can't recover, in other situations we'll be able to recover
> > fine. Sure it would be nice to fix that problem too, but I don't
> > see it as a block.
> 
> It's probably OK to merge the recovery code before the monitor code;
> but I don't think it's something you'd want to tell users about -
> a 'postcopy recovery that only works rarely' isn't much use.

I dunno. Compared to today where there's zero post-copy recovery,
I think even an incremental improvement is useful. Its a choice
between "your VM is dead" and "you've a 50/50 chance of life".


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [Bug 1714331] Re: Virtual machines not working anymore on 2.10

2017-09-07 Thread Phil Dennis-Jordan

Hi Stefan,

What version of OVMF are you using? There were a bunch of bugs related to
FADT in OVMF which were fixed back in March. Specifically, commits
072060a, 78807f6, and 198a46d. If your version of OVMF doesn't include
these fixes, that's a likely source for your problems.

Hope that helps.

Phil


On 6 September 2017 at 16:34, Stefan Hajnoczi  wrote:

> On Tue, Sep 05, 2017 at 06:42:06PM -, Chris Unsworth wrote:
> > I think my issue looks like the same.  Sometimes I just get spinning
> > dots, and sometimes there is the message about doing an automatic repair
> > below the spinning dots before it stops and uses 100% cpu.  I just did a
> > git bisect:
>
> Perfect, thanks Chris!
>
> I have CCed Phil and Paolo regarding the commit you identified.
>
> >
> > # git bisect log
> > # bad: [1ab5eb4efb91a3d4569b0df6e824cc08ab4bd8ec] Update version for
> v2.10.0 release
> > # good: [6c02258e143700314ebf268dae47eb23db17d1cf] Update version for
> v2.9.0 release
> > git bisect start 'v2.10.0' 'v2.9.0'
> > # bad: [269c20b2bbd2aa8531e0cdc741fb166f290d7a2b] tests/qdict: check
> more get_try_int() cases
> > git bisect bad 269c20b2bbd2aa8531e0cdc741fb166f290d7a2b
> > # bad: [eba0161990af8509608332450ee7e338273cf5df] Merge remote-tracking
> branch 'rth/tags/pull-s390-20170512' into staging
> > git bisect bad eba0161990af8509608332450ee7e338273cf5df
> > # good: [9ea5ada76f34a0ef048b131c3a166d8564199bdb] audio: Use
> ARRAY_SIZE from qemu/osdep.h
> > git bisect good 9ea5ada76f34a0ef048b131c3a166d8564199bdb
> > # bad: [1effe6ad5eac1b2e50a077695ac801d172891d6a] Merge remote-tracking
> branch 'danpb/tags/pull-qcrypto-2017-05-09-1' into staging
> > git bisect bad 1effe6ad5eac1b2e50a077695ac801d172891d6a
> > # good: [f03f9f0c10dcfadee5811d43240f0a6af230f1ce] Merge
> remote-tracking branch 'cohuck/tags/s390x-3270-20170504' into staging
> > git bisect good f03f9f0c10dcfadee5811d43240f0a6af230f1ce
> > # good: [6c02258e143700314ebf268dae47eb23db17d1cf]
> qobject-input-visitor: Document full_name_nth()
> > git bisect good 6c02258e143700314ebf268dae47eb23db17d1cf
> > # bad: [95615ce5a1be1a5dd3597d8cb6ba83f0010e] vhost-scsi: create a
> vhost-scsi-common abstraction
> > git bisect bad 95615ce5a1be1a5dd3597d8cb6ba83f0010e
> > # bad: [31f5a726b59bda5580e2f9413867893501dd7d93] trace: add qemu mutex
> lock and unlock trace events
> > git bisect bad 31f5a726b59bda5580e2f9413867893501dd7d93
> > # bad: [49e00a18708e27c815828d9440d5c9300d19547c] use _Static_assert in
> QEMU_BUILD_BUG_ON
> > git bisect bad 49e00a18708e27c815828d9440d5c9300d19547c
> > # bad: [6103451aeb749e92bf7d730429985189c6921c32] hw/i386: Build-time
> assertion on pc/q35 reset register being identical.
> > git bisect bad 6103451aeb749e92bf7d730429985189c6921c32
> > # bad: [77af8a2b95b79699de650965d5228772743efe84] hw/i386: Use Rev3
> FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.
> > git bisect bad 77af8a2b95b79699de650965d5228772743efe84
> > # first bad commit: [77af8a2b95b79699de650965d5228772743efe84] hw/i386:
> Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.
> >
> >
> > 77af8a2b95b79699de650965d5228772743efe84 is the first bad commit
> > commit 77af8a2b95b79699de650965d5228772743efe84
> > Author: Phil Dennis-Jordan 
> > Date:   Wed Mar 15 19:20:26 2017 +1300
> >
> > hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest
> > OS support.
> >
> > This updates the FADT generated for x86/64 machine types from
> > Revision 1 to 3. (Based on ACPI standard 2.0 instead of 1.0) The
> > intention is to expose the reset register information to guest operating
> > systems which require it, specifically OS X/macOS. Revision 1 FADTs do
> > not contain the fields relating to the reset register.
> >
> > The new layout and contents remains backwards-compatible with
> > operating systems which only support ACPI 1.0, as the existing fields
> > are not modified by this change, as the 64-bit and 32-bit variants are
> > allowed to co-exist according to the ACPI 2.0 standard. No regressions
> > became apparent in tests with a range of Windows (XP-10) and Linux
> > versions.
> >
> > The BIOS tables test suite's FADT checksum test has also been
> > updated to reflect the new FADT layout and content.
> >
> > Signed-off-by: Phil Dennis-Jordan 
> > Message-Id: <1489558827-28971-2-git-send-email-p...@philjordan.eu>
> > Signed-off-by: Paolo Bonzini 
> >
> > :04 04 40063761c0b86f87e798e03ea48eff9ea0753425
> 6d2a94150cf1eafb16f0ccf6325281415fef64a6 M  hw
> > :04 04 fe3f1480a91b76fea238c765f0725e715932d96d
> 68f9368d8d78fd3267f609b603f97e8a74bdf528 M  include
> > :04 04 895e961b0a160100aa95b2f557cfe6b87a7d9bff
> 8ed08cef10fddee7814e38ad62be11371592a75a M  tests
> >
> > --
> > You received this bug notification because you are a member of qemu-
> > devel-ml, which is subscribed to QEMU.
> > https://bugs.launchpad.net/bugs/1714331
> >
> > Title:
> >   Virtual machines not working anym

[Qemu-devel] [RFC PATCH qemu 2/4] memory: Prepare for shared flat views

2017-09-07 Thread Alexey Kardashevskiy

We are going to share flat views and dispatch trees between address
spaces. This moves bits around but no change in behaviour is expected
here. The following patch will implement sharing.

This switches from AddressSpace to AddressSpaceDispatch as in many
places this is what is used actually. Also, since multiple address
spaces will be sharing a flat view, MemoryRegionSection cannot store
the address space pointer, this changes it to AddressSpaceDispatch
as well. As a not very obvious result, IOMMUTLBEntry also switched
from AddressSpace to AddressSpaceDispatch to make
address_space_get_iotlb_entry() work.

Signed-off-by: Alexey Kardashevskiy 
---
 include/exec/memory.h   |  79 +++---
 exec.c  | 128 +++-
 hw/alpha/typhoon.c  |   2 +-
 hw/dma/rc4030.c |   4 +-
 hw/i386/amd_iommu.c |   2 +-
 hw/i386/intel_iommu.c   |   9 ++--
 hw/intc/openpic_kvm.c   |   2 +-
 hw/pci-host/apb.c   |   2 +-
 hw/ppc/spapr_iommu.c|   4 +-
 hw/s390x/s390-pci-bus.c |   2 +-
 hw/vfio/common.c|   6 +--
 hw/virtio/vhost.c   |   6 +--
 memory.c|  26 ++
 13 files changed, 170 insertions(+), 102 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 400dd4491b..83e82e90d5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -48,6 +48,7 @@
 
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionMmio MemoryRegionMmio;
+typedef struct AddressSpaceDispatch AddressSpaceDispatch;
 
 struct MemoryRegionMmio {
 CPUReadMemoryFunc *read[3];
@@ -67,7 +68,7 @@ typedef enum {
 #define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
 
 struct IOMMUTLBEntry {
-AddressSpace*target_as;
+AddressSpaceDispatch *target_dispatch;
 hwaddr   iova;
 hwaddr   translated_addr;
 hwaddr   addr_mask;  /* 0xfff = 4k translation */
@@ -310,6 +311,8 @@ struct MemoryListener {
 QTAILQ_ENTRY(MemoryListener) link_as;
 };
 
+AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as);
+
 /**
  * AddressSpace: describes a mapping of addresses to #MemoryRegion objects
  */
@@ -346,7 +349,7 @@ struct AddressSpace {
  */
 struct MemoryRegionSection {
 MemoryRegion *mr;
-AddressSpace *address_space;
+AddressSpaceDispatch *dispatch;
 hwaddr offset_within_region;
 Int128 size;
 hwaddr offset_within_address_space;
@@ -1635,10 +1638,19 @@ void address_space_destroy(AddressSpace *as);
  * @buf: buffer with the data transferred
  * @is_write: indicates the transfer direction
  */
-MemTxResult address_space_rw(AddressSpace *as, hwaddr addr,
+MemTxResult address_space_dispatch_rw(AddressSpaceDispatch *d, hwaddr addr,
  MemTxAttrs attrs, uint8_t *buf,
  int len, bool is_write);
 
+static inline MemTxResult address_space_rw(AddressSpace *as, hwaddr addr,
+   MemTxAttrs attrs, uint8_t *buf,
+   int len, bool is_write)
+{
+return address_space_dispatch_rw(address_space_to_dispatch(as),
+ addr, attrs, buf, len, is_write);
+}
+
+
 /**
  * address_space_write: write to address space.
  *
@@ -1651,10 +1663,18 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr 
addr,
  * @attrs: memory transaction attributes
  * @buf: buffer with the data transferred
  */
-MemTxResult address_space_write(AddressSpace *as, hwaddr addr,
+MemTxResult address_space_dispatch_write(AddressSpaceDispatch *d, hwaddr addr,
 MemTxAttrs attrs,
 const uint8_t *buf, int len);
 
+static inline MemTxResult address_space_write(AddressSpace *as, hwaddr addr,
+MemTxAttrs attrs,
+const uint8_t *buf, int len)
+{
+return address_space_dispatch_write(address_space_to_dispatch(as),
+addr, attrs, buf, len);
+}
+
 /* address_space_ld*: load from an address space
  * address_space_st*: store to an address space
  *
@@ -1840,8 +1860,8 @@ void stq_be_phys_cached(MemoryRegionCache *cache, hwaddr 
addr, uint64_t val);
 /* address_space_get_iotlb_entry: translate an address into an IOTLB
  * entry. Should be called from an RCU critical section.
  */
-IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
-bool is_write);
+IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpaceDispatch *d,
+hwaddr addr, bool is_write);
 
 /* address_space_translate: translate an address range into an address space
  * into a MemoryRegion and an address range into that section.  Should be
@@ -1855,9 +1875,17 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace 
*as, hwaddr addr,
  * @len: point

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Daniel P. Berrange

On Thu, Sep 07, 2017 at 10:15:09AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > > This does imply that you need a separate monitor I/O processing, from 
> > > > > the
> > > > > command execution thread, but I see no need for all commands to 
> > > > > suddenly
> > > > > become async. Just allowing interleaved replies is sufficient from the
> > > > > POV of the protocol definition. This interleaving is easy to handle 
> > > > > from
> > > > > the client POV - just requires a unique 'serial' in the request by the
> > > > > client, that is copied into the reply by QEMU.
> > > > 
> > > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > > 
> > > > then it's upto the caller to ensure those id's are unique.
> > > 
> > > Libvirt has in fact generated a unique 'id' for every monitor command
> > > since day 1 of supporting QMP.
> > > 
> > > > I do worry about two things:
> > > >   a) With this the caller doesn't really know which commands could be
> > > >   in parallel - for example if we've got a recovery command that's
> > > >   executed by this non-locking thread that's OK, we expect that
> > > >   to be doable in parallel.  If in the future though we do
> > > >   what you initially suggested and have a bunch of commands get
> > > >   routed to the migration thread (say) then those would suddenly
> > > >   operate in parallel with other commands that we're previously
> > > >   synchronous.
> > > 
> > > We could still have an opt-in for async commands. eg default to executing
> > > all commands in the main thread, unless the client issues an explicit
> > > "make it async" command, to switch to allowing the migration thread to
> > > process it async.
> > > 
> > >  { "execute": "qmp_allow_async",
> > >"data": { "commands": [
> > >"migrate_cancel",
> > >] } }
> > > 
> > > 
> > >  { "return": { "commands": [
> > >"migrate_cancel",
> > >] } }
> > > 
> > > The server response contains the subset of commands from the request
> > > for which async is supported.
> > > 
> > > That gives good negotiation ability going forward as we incrementally
> > > support async on more commands.
> > 
> > I think this goes back to the discussion on which design we'd like to
> > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > cleaner and nicer, and I believe that can benefit not only libvirt,
> > but also other QMP users.  The problem is, I have no idea how long
> > it'll take to let us have such a feature - I believe that will include
> > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > postcopy recovery cannot work only because we cannot guarantee a
> > stable monitor.
> 
> libvirt will need changes for postcopy recovery however we do it;
> so we need to do it the way they want.
> 
> I think Dan's suggestion isn't as hard as it initially sounded;  a first
> thing to try would be taking all the monitor IO into another thread
> and feeding all commands to the main thread for execution - that sounds
> like the hard part.
> (I'm not sure how multiple monitors interact for this).

Multiple monitors is probably not as hard as it sounds. No matter how
many monitors you have configured today, they'll all serviced by the
main loop so commands from each monitor are strictly serialized.

So if you moved I/O processing to a separate thread, and had a single
queue of commands to be executed by the main thread, you would still
have the exact same serialized processing across multiple monitors.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[Qemu-devel] [PATCH] hw/core/qdev: Do not allow hot-plugging without hotplug controller

2017-09-07 Thread Thomas Huth

qdev_unplug() bails out with an assertion if the user tries to device_del
a hot-plugged device that does not have a hotplug controller. Unfortunately,
our devices are all marked with hotpluggable = true by default (see the
device_class_init() function in qdev.c), so it currently can happen that
the user runs into this situation and QEMU gets terminated unexpectedly:

$ qemu-system-aarch64 -M virt -nographic -nodefaults -monitor stdio -S
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) device_add aux-to-i2c-bridge,id=x
(qemu) device_del x
**
ERROR:qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

Hotplugging devices without a hotplug controller does not make much sense,
so we should disallow this during the device_add process already!

Suggested-by: Paolo Bonzini 
Signed-off-by: Thomas Huth 
---
 hw/core/qdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 606ab53..d9ccce6 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -908,6 +908,11 @@ static void device_set_realized(Object *obj, bool value, 
Error **errp)
 if (local_err != NULL) {
 goto fail;
 }
+} else if (dev->hotplugged) {
+/* Hot-plugged device without hotplug controller? No way! */
+error_setg(&local_err, QERR_DEVICE_NO_HOTPLUG,
+   object_get_typename(obj));
+goto fail;
 }
 
 if (dc->realize) {
-- 
1.8.3.1

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Thu, Sep 07, 2017 at 10:19:47AM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> > > > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > > > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert 
> > > > > wrote:
> > > > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > > > > This does imply that you need a separate monitor I/O processing, 
> > > > > > > from the
> > > > > > > command execution thread, but I see no need for all commands to 
> > > > > > > suddenly
> > > > > > > become async. Just allowing interleaved replies is sufficient 
> > > > > > > from the
> > > > > > > POV of the protocol definition. This interleaving is easy to 
> > > > > > > handle from
> > > > > > > the client POV - just requires a unique 'serial' in the request 
> > > > > > > by the
> > > > > > > client, that is copied into the reply by QEMU.
> > > > > > 
> > > > > > OK, so for that we can just take Marc-André's syntax and call it 
> > > > > > 'id':
> > > > > >   
> > > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > > > > 
> > > > > > then it's upto the caller to ensure those id's are unique.
> > > > > 
> > > > > Libvirt has in fact generated a unique 'id' for every monitor command
> > > > > since day 1 of supporting QMP.
> > > > > 
> > > > > > I do worry about two things:
> > > > > >   a) With this the caller doesn't really know which commands could 
> > > > > > be
> > > > > >   in parallel - for example if we've got a recovery command that's
> > > > > >   executed by this non-locking thread that's OK, we expect that
> > > > > >   to be doable in parallel.  If in the future though we do
> > > > > >   what you initially suggested and have a bunch of commands get
> > > > > >   routed to the migration thread (say) then those would suddenly
> > > > > >   operate in parallel with other commands that we're previously
> > > > > >   synchronous.
> > > > > 
> > > > > We could still have an opt-in for async commands. eg default to 
> > > > > executing
> > > > > all commands in the main thread, unless the client issues an explicit
> > > > > "make it async" command, to switch to allowing the migration thread to
> > > > > process it async.
> > > > > 
> > > > >  { "execute": "qmp_allow_async",
> > > > >"data": { "commands": [
> > > > >"migrate_cancel",
> > > > >] } }
> > > > > 
> > > > > 
> > > > >  { "return": { "commands": [
> > > > >"migrate_cancel",
> > > > >] } }
> > > > > 
> > > > > The server response contains the subset of commands from the request
> > > > > for which async is supported.
> > > > > 
> > > > > That gives good negotiation ability going forward as we incrementally
> > > > > support async on more commands.
> > > > 
> > > > I think this goes back to the discussion on which design we'd like to
> > > > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > > > cleaner and nicer, and I believe that can benefit not only libvirt,
> > > > but also other QMP users.  The problem is, I have no idea how long
> > > > it'll take to let us have such a feature - I believe that will include
> > > > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > > > postcopy recovery cannot work only because we cannot guarantee a
> > > > stable monitor.
> > > 
> > > This is not a blocker for having postcopy recovery feature merged.
> > > It merely means that in a situation where the mainloop is blocked,
> > > then we can't recover, in other situations we'll be able to recover
> > > fine. Sure it would be nice to fix that problem too, but I don't
> > > see it as a block.
> > 
> > It's probably OK to merge the recovery code before the monitor code;
> > but I don't think it's something you'd want to tell users about -
> > a 'postcopy recovery that only works rarely' isn't much use.
> 
> I dunno. Compared to today where there's zero post-copy recovery,
> I think even an incremental improvement is useful. Its a choice
> between "your VM is dead" and "you've a 50/50 chance of life".

There's a chunk of people who wont use postcopy because they regard
it as dangerous; they need something that works in most cases before
they'll use it.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] tcg: move atomic_template.h to accel/tcg/

2017-09-07 Thread Thomas Huth

On 13.07.2017 13:35, Paolo Bonzini wrote:
> On 12/07/2017 07:52, Thomas Huth wrote:
>> On 11.07.2017 20:55, Philippe Mathieu-Daudé wrote:
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>> same as Thomas previous patch. this file had no entry in MAINTAINERS.
>>>
>>>  atomic_template.h => accel/tcg/atomic_template.h | 0
>>>  1 file changed, 0 insertions(+), 0 deletions(-)
>>>  rename atomic_template.h => accel/tcg/atomic_template.h (100%)
>>>
>>> diff --git a/atomic_template.h b/accel/tcg/atomic_template.h
>>> similarity index 100%
>>> rename from atomic_template.h
>>> rename to accel/tcg/atomic_template.h
>>
>> It's also used by a file in tcg/, but I've checked, and yes, the code
>> still compiles fine if the header gets moved to accel/tcg/.
>>
>> Tested-by: Thomas Huth 
>>
> 
> That part of tcg/tcg-runtime.c probably should be moved to user-exec.c,
> and user-exec.c should in turn be in accel/tcg.
> 
> Since this is just code movement we can do it after soft freeze.
> Philippe, can you send v2?

Ping?

Looks like the atomic_template.h is still in the main directory ...

 Thomas

Re: [Qemu-devel] [PATCHv4 2/6] seccomp: add obsolete argument to command line

2017-09-07 Thread Eduardo Otubo

On Fri, Sep 01, 2017 at 12:05:41PM +0100, Daniel P. Berrange wrote:
> On Fri, Sep 01, 2017 at 12:58:14PM +0200, Eduardo Otubo wrote:
> > This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
> > option. It allows Qemu to run safely on old system that still relies on
> > old system calls.
> > 
> > Signed-off-by: Eduardo Otubo 
> > ---
> >  include/sysemu/seccomp.h |  3 ++-
> >  qemu-options.hx  | 12 ++--
> >  qemu-seccomp.c   | 23 ++-
> >  vl.c | 22 +-
> >  4 files changed, 55 insertions(+), 5 deletions(-)
> > 
> 
> > @@ -72,6 +85,14 @@ int seccomp_start(void)
> >  
> >  for (i = 0; i < ARRAY_SIZE(blacklist); i++) {
> >  switch (blacklist[i].set) {
> > +case QEMU_SECCOMP_SET_OBSOLETE:
> > +if (!(seccomp_opts & QEMU_SECCOMP_SET_OBSOLETE)) {
> > +goto add_syscall;
> > +} else {
> > +continue;
> > +}
> > +
> > +break;
> 
> THis can be simplified:
> 
> if ((seccomp_opts & QEMU_SECCOMP_SET_OBSOLETE)) {
> continue;
> }
> 
> break;
> 
> thus avoiding need to 'goto'
> 
> Likewise for all following patches

Do you think there's anything else to fix on this series? if nothing
else emerges, I'll send the v5 tomorrow (also with the style fixes).

-- 
Eduardo Otubo
Senior Software Engineer @ RedHat

Re: [Qemu-devel] [RFC PATCH qemu 1/4] memory: Postpone flatview and dispatch tree building till all devices are added

2017-09-07 Thread Peter Maydell

On 7 September 2017 at 10:20, Alexey Kardashevskiy  wrote:
> Most devices use at least one address space and every time a new address
> space is added, flat views and dispatch trees are rebuild for all address
> spaces. This is not a problem for a relatively small amount of devices but
> even 50 virtio-pci devices use more than 8GB of RAM.
>
> What happens that on every flatview/dispatch rebuild, new arrays are
> allocated and old ones release but the release is done via RCU so until
> an entire machine is build, they are not released.
>
> This wraps devices creation into memory_region_transaction_begin/commit
> to massively reduce amount of flat view/dispatch tree (re)allocations.
>
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  vl.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/vl.c b/vl.c
> index 8e247cc2a2..3c39cc8b3a 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4655,12 +4655,16 @@ int main(int argc, char **argv, char **envp)
>  igd_gfx_passthru();
>
>  /* init generic devices */
> +memory_region_transaction_begin();
> +
>  rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
>  if (qemu_opts_foreach(qemu_find_opts("device"),
>device_init_func, NULL, NULL)) {
>  exit(1);
>  }
>
> +memory_region_transaction_commit();

What happens if something in a device realize function tries
to do a read from an AddressSpace? I can't think of any examples
that need to do that and I suspect it's probably a bug if anybody
tries it, but I'm curious whether this change alters the failure
mode...

thanks
-- PMM

Re: [Qemu-devel] [PATCH 0/9] buildsys: Move ui/usb library cflags/libs to per object

2017-09-07 Thread Gerd Hoffmann

On Thu, 2017-09-07 at 16:29 +0800, Fam Zheng wrote:
> Hi Gerd,
> 
> This is part two of the QEMU_CFLAGS and libs_softmmu cleanup series,
> including
> ui/, audio/ and hw/usb/. Please review.

Looks good, survived a test build.

Reviewed-by: Gerd Hoffmann 

cheers,
  Gerd

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Stefan Hajnoczi (stefa...@gmail.com) wrote:
> On Wed, Sep 6, 2017 at 4:14 PM, Dr. David Alan Gilbert
>  wrote:
> > * Stefan Hajnoczi (stefa...@gmail.com) wrote:
> >> On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
> >> > The root problem is that, monitor commands are all handled in main
> >> > loop thread now, no matter how many monitors we specify. And, if main
> >> > loop thread hangs due to some reason, all monitors will be stuck.
> >>
> >> I see a larger issue with postcopy: existing QEMU code assumes that
> >> guest memory access is instantaneous.
> >>
> >> Postcopy breaks this assumption and introduces blocking points that can
> >> now take unbounded time.
> >>
> >> This problem isn't specific to the monitor.  It can also happen to other
> >> components in QEMU like the gdbstub.
> >>
> >> Do we need an asynchronous memory API?  Synchronous memory access should
> >> only be allowed in vcpu threads.
> >
> > It would probably be useful for gdbstub where the overhead of async
> > doesn't matter;  but doing that for all IO emulation is hard.
> 
> Why is it hard?
> 
> Memory access can be synchronous in the vcpu thread.  That eliminates
> a lot of code straight away.
> 
> Anything using dma-helpers.c is already async.  They just don't know
> that the memory access part is being made async too :).

Can you point me to some info on that ?

> The remaining cases are virtio and some other devices.
> 
> If you are worried about performance, the first rule is that async
> memory access is only needed on the destination side when post-copy is
> active.  Maybe use setjmp to return from the signal handler and queue
> a callback for when the page has been loaded.

I'm not sure it's worth trying to be too clever at avoiding this;
I see the fact that we're doing IO with the bql held as a more
fundamental problem.

Dave

> Stefan
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] buildsys: Move rdma libs to per object

2017-09-07 Thread Dr. David Alan Gilbert

* Fam Zheng (f...@redhat.com) wrote:
> Signed-off-by: Fam Zheng 

OK, I've not actually got a preference as to whether it's
per-object or not - I don't really see any advantage.

But since it doesn't look like it'll break it;

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  configure   | 2 +-
>  migration/Makefile.objs | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index fb7e34a901..470b8a085b 100755
> --- a/configure
> +++ b/configure
> @@ -2818,7 +2818,6 @@ EOF
>rdma_libs="-lrdmacm -libverbs"
>if compile_prog "" "$rdma_libs" ; then
>  rdma="yes"
> -libs_softmmu="$libs_softmmu $rdma_libs"
>else
>  if test "$rdma" = "yes" ; then
>  error_exit \
> @@ -6035,6 +6034,7 @@ echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
>  
>  if test "$rdma" = "yes" ; then
>echo "CONFIG_RDMA=y" >> $config_host_mak
> +  echo "RDMA_LIBS=$rdma_libs" >> $config_host_mak
>  fi
>  
>  if test "$have_rtnetlink" = "yes" ; then
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 1c7770da46..99e038024d 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -11,3 +11,4 @@ common-obj-$(CONFIG_RDMA) += rdma.o
>  
>  common-obj-$(CONFIG_LIVE_BLOCK_MIGRATION) += block.o
>  
> +rdma.o-libs := $(RDMA_LIBS)
> -- 
> 2.13.5
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] Hello everyone, ask a few vm hot plug cpu and ram problem

2017-09-07 Thread Paul Schlacter

Like CPU and ram have a maximum value. If I want to produce this product,
provided to the cloud host users.  The maximum value is set to how much
more appropriate.


If the settings are too small, users want to dynamically add CPU or memory,
and can't meet the needs of users.


If the setting is too large, such as max memory set to 40G, current memory
is set to 2G, the virtual machine memory only 1G more, much less than 2G.
If this value is set too large,  Are there any other effects?


Ask your advice.

Re: [Qemu-devel] [PULL 0/2] Fix tests with new gcc

2017-09-07 Thread Peter Maydell

On 6 September 2017 at 12:44, Juan Quintela  wrote:
> Hi
>
> TNhis has already been reviewed on list.
>
> Please, apply.
>
> The following changes since commit b07d1c2f5607489d4d4a6a65ce36a3e896ac065e:
>
>   Revert "kvm: use DIV_ROUND_UP" (2017-09-05 18:55:40 +0100)
>
> are available in the git repository at:
>
>   git://github.com/juanquintela/qemu.git tags/tests/20170906
>
> for you to fetch changes up to c18aaeb69072b74cb0a66e6638fcb14837c7cd3a:
>
>   tests: Make vmgenid test compile (2017-09-05 22:34:40 +0200)
>
> 
> tests/next for 20170906
>
> 
> Juan Quintela (2):
>   tests: Use real size for iov tests
>   tests: Make vmgenid test compile

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] buildsys: Move rdma libs to per object

2017-09-07 Thread Peter Xu

On Thu, Sep 07, 2017 at 04:42:30PM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 

Reviewed-by: Peter Xu 

> ---
>  configure   | 2 +-
>  migration/Makefile.objs | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index fb7e34a901..470b8a085b 100755
> --- a/configure
> +++ b/configure
> @@ -2818,7 +2818,6 @@ EOF
>rdma_libs="-lrdmacm -libverbs"
>if compile_prog "" "$rdma_libs" ; then
>  rdma="yes"
> -libs_softmmu="$libs_softmmu $rdma_libs"
>else
>  if test "$rdma" = "yes" ; then
>  error_exit \
> @@ -6035,6 +6034,7 @@ echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
>  
>  if test "$rdma" = "yes" ; then
>echo "CONFIG_RDMA=y" >> $config_host_mak
> +  echo "RDMA_LIBS=$rdma_libs" >> $config_host_mak
>  fi
>  
>  if test "$have_rtnetlink" = "yes" ; then
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 1c7770da46..99e038024d 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -11,3 +11,4 @@ common-obj-$(CONFIG_RDMA) += rdma.o
>  
>  common-obj-$(CONFIG_LIVE_BLOCK_MIGRATION) += block.o
>  
> +rdma.o-libs := $(RDMA_LIBS)
> -- 
> 2.13.5
> 

-- 
Peter Xu

[Qemu-devel] [Bug 1714331] Re: Virtual machines not working anymore on 2.10

2017-09-07 Thread Mary Sherman

I'm using the Ubuntu artful version (0~20161202.7bbe0b3e-1), so I'll
need to build a recent OVMF package to verify the solution.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1714331

Title:
  Virtual machines not working anymore on 2.10

Status in QEMU:
  New

Bug description:
  Using 2.10, my virtual machine(s) don't work anymore. This happens
  100% of the times.

  -

  I use QEMU compiling it from source, on Ubuntu 16.04 amd64. This is
  the configure command:

  configure --target-list=x86_64-softmmu --enable-debug --enable-gtk
  --enable-spice --audio-drv-list=pa

  I have one virtual disk, with a Windows 10 64-bit, which I launch in
  two different ways; both work perfectly on 2.9 (and used to do on 2.8,
  but I haven't used it for a long time).

  This is the first way:

  qemu-system-x86_64
-drive if=pflash,format=raw,readonly,file=/path/to/OVMF_CODE.fd
-drive if=pflash,format=raw,file=/tmp/OVMF_VARS.fd.tmp
-enable-kvm
-machine q35,accel=kvm,mem-merge=off
-cpu 
host,kvm=off,hv_vendor_id=vgaptrocks,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
-smp 4,cores=4,sockets=1,threads=1
-m 4096
-display gtk
-vga qxl
-rtc base=localtime
-serial none
-parallel none
-usb
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device virtio-scsi-pci,id=scsi
-drive 
file=/path/to/image-diff.img,id=hdd1,format=qcow2,if=none,cache=writeback
-device scsi-hd,drive=hdd1
-net nic,model=virtio
-net user

  On QEMU 2.10, I get the `Recovery - Your PC/Device needs to be
  repaired` windows screen; on 2.9, it boots regularly.

  This is the second way:

  qemu-system-x86_64
-drive if=pflash,format=raw,readonly,file=/path/to/OVMF_CODE.fd
-drive if=pflash,format=raw,file=/tmp/OVMF_VARS.fd.tmp
-enable-kvm
-machine q35,accel=kvm,mem-merge=off
-cpu 
host,kvm=off,hv_vendor_id=vgaptrocks,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
-smp 4,cores=4,sockets=1,threads=1
-m 10240
-vga none
-rtc base=localtime
-serial none
-parallel none
-usb
-device vfio-pci,host=01:00.0,multifunction=on
-device vfio-pci,host=01:00.1
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device usb-host,vendorid=0x,productid=0x
-device virtio-scsi-pci,id=scsi
-drive 
file=/path/to/image-diff.img,id=hdd1,format=qcow2,if=none,cache=writeback
-device scsi-hd,drive=hdd1
-net nic,model=virtio
-net user

  On QEMU 2.10, I get the debug window on the linux monitor, and blank screen 
on VFIO one (no BIOS screen at all); after 10/20 seconds, QEMU crashes without 
any message.
  On 2.9, this works perfectly.

  -

  I am able to perform a git bisect, if that helps, but if this is the
  case, I'd need this issue to be reviewed, since bisecting is going to
  take me a lot of time.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1714331/+subscriptions

Re: [Qemu-devel] [RFC PATCH qemu 0/4] memory: Reduce memory use

2017-09-07 Thread Dr. David Alan Gilbert

* Alexey Kardashevskiy (a...@ozlabs.ru) wrote:
> This was inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1481593
> 
> What happens ithere is that every virtio block device creates 2 address
> spaces - for modern config space (called "virtio-pci-cfg-as") and
> for busmaster (common pci thing, called after the device name,
> in my case "virtio-blk-pci").
> 
> Each address_space_init() updates topology for every address space.
> Every topology update (address_space_update_topology()) creates a new
> dispatch tree - AddressSpaceDispatch with nodes (1KB) and
> sections (48KB) and destroys the old one.
> 
> However the dispatch destructor is postponed via RCU which does not
> get a chance to execute until the machine is initialized but before
> we get there, memory is not returned to the pool, and this is a lot
> of memory which grows n^2.
> 
> These patches are trying to address the memory use and boot time
> issues but tbh only the first one provides visible outcome.

Do you have a feel for how much memory is saved?

Dave

> There are still things to polish and double check the use of RCU,
> I'd like to get any feedback before proceeding - is this going
> the right way or way too ugly?
> 
> 
> This is based on sha1
> 1ab5eb4efb Peter Maydell "Update version for v2.10.0 release".
> 
> Please comment. Thanks.
> 
> 
> 
> Alexey Kardashevskiy (4):
>   memory: Postpone flatview and dispatch tree building till all devices
> are added
>   memory: Prepare for shared flat views
>   memory: Share flat views and dispatch trees between address spaces
>   memory: Add flat views to HMP "info mtree"
> 
>  include/exec/memory-internal.h |   6 +-
>  include/exec/memory.h  |  93 +
>  exec.c | 242 +++--
>  hw/alpha/typhoon.c |   2 +-
>  hw/dma/rc4030.c|   4 +-
>  hw/i386/amd_iommu.c|   2 +-
>  hw/i386/intel_iommu.c  |   9 +-
>  hw/intc/openpic_kvm.c  |   2 +-
>  hw/pci-host/apb.c  |   2 +-
>  hw/pci/pci.c   |   3 +-
>  hw/ppc/spapr_iommu.c   |   4 +-
>  hw/s390x/s390-pci-bus.c|   2 +-
>  hw/vfio/common.c   |   6 +-
>  hw/virtio/vhost.c  |   6 +-
>  memory.c   | 299 
> +++--
>  monitor.c  |   3 +-
>  vl.c   |   4 +
>  hmp-commands-info.hx   |   7 +-
>  18 files changed, 448 insertions(+), 248 deletions(-)
> 
> -- 
> 2.11.0
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] qemu-iotests: Add missing -machine accel=qtest

2017-09-07 Thread Cornelia Huck

On Thu,  7 Sep 2017 10:53:10 +0200
Kevin Wolf  wrote:

> A basic set of qemu options is initialised in ./common:
> 
> export QEMU_OPTIONS="-nodefaults -machine accel=qtest"
> 
> However, two test cases (172 and 186) overwrite QEMU_OPTIONS and neglect
> to manually set '-machine accel=qtest'. Add the missing option for 172.
> 186 probably only copied the code from 172, it doesn't actually need to
> overwrite QEMU_OPTIONS, so remove that in 186.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/172 | 2 +-
>  tests/qemu-iotests/186 | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)

I tested this on top of my change switching the default machine to
"tcg:kvm:xen:hax", and this now passes on x86_64 builds both with and
without tcg.

Tested-by: Cornelia Huck 
Reviewed-by: Cornelia Huck

Re: [Qemu-devel] [PATCHv4 2/6] seccomp: add obsolete argument to command line

2017-09-07 Thread Daniel P. Berrange

On Fri, Sep 01, 2017 at 12:58:14PM +0200, Eduardo Otubo wrote:
> This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
> option. It allows Qemu to run safely on old system that still relies on
> old system calls.
> 
> Signed-off-by: Eduardo Otubo 
> ---
>  include/sysemu/seccomp.h |  3 ++-
>  qemu-options.hx  | 12 ++--
>  qemu-seccomp.c   | 23 ++-
>  vl.c | 22 +-
>  4 files changed, 55 insertions(+), 5 deletions(-)
> 

> @@ -1032,7 +1036,23 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, 
> Error **errp)
>  {
>  if (qemu_opt_get_bool(opts, "enable", false)) {
>  #ifdef CONFIG_SECCOMP
> -if (seccomp_start() < 0) {
> +uint32_t seccomp_opts = 0x0;
> +const char *value = NULL;
> +
> +value = qemu_opt_get(opts, "obsolete");
> +if (value) {
> +if (strcmp(value, "allow") == 0) {

I would have a slight preference for g_str_equal(value, "allow")

> +seccomp_opts |= QEMU_SECCOMP_SET_OBSOLETE;
> +} else if (strcmp(value, "deny")) {

and  !g_str_equal(value, "deny")

> +/* this is the default option, this if is here
> +  * to provide a little bit of consistency for
> +  * the command line */
> + } else {
> + error_report("invalid argument for obsolete");
> + }

There seem to be tabs for indent here too

> +}
> +
> +if (seccomp_start(seccomp_opts) < 0) {
>  error_report("failed to install seccomp syscall filter "
>   "in the kernel");
>  return -1;
> -- 
> 2.13.5
> 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCHv4 3/6] seccomp: add elevateprivileges argument to command line

2017-09-07 Thread Daniel P. Berrange

On Fri, Sep 01, 2017 at 12:58:15PM +0200, Eduardo Otubo wrote:
> This patch introduces the new argument
> [,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows
> or denies Qemu process to elevate its privileges by blacklisting all
> set*uid|gid system calls. The 'children' option will let forks and
> execves run unprivileged.
> 
> Signed-off-by: Eduardo Otubo 
> ---
>  include/sysemu/seccomp.h |  1 +
>  qemu-options.hx  | 12 +---
>  qemu-seccomp.c   | 29 ++---
>  vl.c | 27 +++
>  4 files changed, 55 insertions(+), 14 deletions(-)


> diff --git a/vl.c b/vl.c
> index ca267f9918..1d44b05772 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -29,6 +29,7 @@
>  
>  #ifdef CONFIG_SECCOMP
>  #include "sysemu/seccomp.h"
> +#include "sys/prctl.h"
>  #endif
>  
>  #if defined(CONFIG_VDE)
> @@ -275,6 +276,10 @@ static QemuOptsList qemu_sandbox_opts = {
>  .name = "obsolete",
>  .type = QEMU_OPT_STRING,
>  },
> +{
> +.name = "elevateprivileges",
> +.type = QEMU_OPT_STRING,
> +},
>  { /* end of list */ }
>  },
>  };
> @@ -1052,6 +1057,28 @@ static int parse_sandbox(void *opaque, QemuOpts *opts, 
> Error **errp)
>   }
>  }
>  
> +value = qemu_opt_get(opts, "elevateprivileges");
> +if (value) {
> +if (strcmp(value, "deny") == 0) {
> +seccomp_opts |= QEMU_SECCOMP_SET_PRIVILEGED;
> +} else if (strcmp(value, "children") == 0) {
> +seccomp_opts |= QEMU_SECCOMP_SET_PRIVILEGED;
> +
> +/* calling prctl directly because we're
> + * not sure if host has CAP_SYS_ADMIN set*/
> +if (prctl(PR_SET_NO_NEW_PRIVS, 1)) {
> +error_report("failed to set no_new_privs "
> + "aborting");
> +return -1;
> +}
> +} else if (strcmp(value, "allow") == 0) {
> +/* default value */

Again slight preference for g_str_equal() in all these checks.


> +} else {
> +error_report("invalid argument for elevateprivileges");
> +return -1;
> +}
> +}
> +
>  if (seccomp_start(seccomp_opts) < 0) {
>  error_report("failed to install seccomp syscall filter "
>   "in the kernel");

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCHv4 2/6] seccomp: add obsolete argument to command line

2017-09-07 Thread Daniel P. Berrange

On Thu, Sep 07, 2017 at 11:31:04AM +0200, Eduardo Otubo wrote:
> On Fri, Sep 01, 2017 at 12:05:41PM +0100, Daniel P. Berrange wrote:
> > On Fri, Sep 01, 2017 at 12:58:14PM +0200, Eduardo Otubo wrote:
> > > This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
> > > option. It allows Qemu to run safely on old system that still relies on
> > > old system calls.
> > > 
> > > Signed-off-by: Eduardo Otubo 
> > > ---
> > >  include/sysemu/seccomp.h |  3 ++-
> > >  qemu-options.hx  | 12 ++--
> > >  qemu-seccomp.c   | 23 ++-
> > >  vl.c | 22 +-
> > >  4 files changed, 55 insertions(+), 5 deletions(-)
> > > 
> > 
> > > @@ -72,6 +85,14 @@ int seccomp_start(void)
> > >  
> > >  for (i = 0; i < ARRAY_SIZE(blacklist); i++) {
> > >  switch (blacklist[i].set) {
> > > +case QEMU_SECCOMP_SET_OBSOLETE:
> > > +if (!(seccomp_opts & QEMU_SECCOMP_SET_OBSOLETE)) {
> > > +goto add_syscall;
> > > +} else {
> > > +continue;
> > > +}
> > > +
> > > +break;
> > 
> > THis can be simplified:
> > 
> > if ((seccomp_opts & QEMU_SECCOMP_SET_OBSOLETE)) {
> > continue;
> > }
> > 
> > break;
> > 
> > thus avoiding need to 'goto'
> > 
> > Likewise for all following patches
> 
> Do you think there's anything else to fix on this series? if nothing
> else emerges, I'll send the v5 tomorrow (also with the style fixes).

I just sent one more comment, but apart from the that & the style fixes
it looks good to me.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > This does imply that you need a separate monitor I/O processing, from the
> > > command execution thread, but I see no need for all commands to suddenly
> > > become async. Just allowing interleaved replies is sufficient from the
> > > POV of the protocol definition. This interleaving is easy to handle from
> > > the client POV - just requires a unique 'serial' in the request by the
> > > client, that is copied into the reply by QEMU.
> > 
> > OK, so for that we can just take Marc-André's syntax and call it 'id':
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > 
> > then it's upto the caller to ensure those id's are unique.
> 
> Libvirt has in fact generated a unique 'id' for every monitor command
> since day 1 of supporting QMP.
> 
> > I do worry about two things:
> >   a) With this the caller doesn't really know which commands could be
> >   in parallel - for example if we've got a recovery command that's
> >   executed by this non-locking thread that's OK, we expect that
> >   to be doable in parallel.  If in the future though we do
> >   what you initially suggested and have a bunch of commands get
> >   routed to the migration thread (say) then those would suddenly
> >   operate in parallel with other commands that we're previously
> >   synchronous.
> 
> We could still have an opt-in for async commands. eg default to executing
> all commands in the main thread, unless the client issues an explicit
> "make it async" command, to switch to allowing the migration thread to
> process it async.
> 
>  { "execute": "qmp_allow_async",
>"data": { "commands": [
>"migrate_cancel",
>] } }
> 
> 
>  { "return": { "commands": [
>"migrate_cancel",
>] } }
> 
> The server response contains the subset of commands from the request
> for which async is supported.
> 
> That gives good negotiation ability going forward as we incrementally
> support async on more commands.

Is that 'qmp_allow_async' a command purely to query whether a command
is async or is it a wrapper to cause that command to be executed async?

> >   b) I still worry how the various IO channels will behave on another
> >   thread.  But that's more a general feeling rather than anything
> >   specific.
> 
> The only complexity will be around making sure the Chardev code uses
> the right GMainContext for any watches on the underlying QIOChannel,
> so that we poll() from the custom thread instead of the main thread.
> IOW, as long as all I/O is done from the single thread everything
> should work fine.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v3 1/7] block: skip implicit nodes in snapshots, blockjobs

2017-09-07 Thread Kevin Wolf

Am 25.08.2017 um 15:23 hat Manos Pitsidianakis geschrieben:
> Implicit filter nodes added at the top of nodes can interfere with block
> jobs. This is not a problem when they are added by other jobs since
> adding another job will issue a QERR_DEVICE_IN_USE, but it can happen in
> the next commit which introduces an implicitly created throttle filter
> node below BlockBackend, which we want to be skipped during automatic
> operations on the graph since the user does not necessarily know about
> their existence.
> 
> All implicit filters will have either bs->file or bs->backing set. This
> is a needed assumption so we can know which direction we will skip down
> the graph.
> 
> Signed-off-by: Manos Pitsidianakis 
> ---
>  include/block/block_int.h | 17 +
>  block.c   | 10 ++
>  block/qapi.c  | 14 +-
>  blockdev.c| 34 ++
>  4 files changed, 66 insertions(+), 9 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7571c0aaaf..9e0f70e055 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -699,6 +699,21 @@ static inline BlockDriverState 
> *backing_bs(BlockDriverState *bs)
>  return bs->backing ? bs->backing->bs : NULL;
>  }
>  
> +static inline BlockDriverState *file_bs(BlockDriverState *bs)
> +{
> +return bs->file ? bs->file->bs : NULL;
> +}
> +
> +/* This is a convenience function to get either bs->file->bs or
> + * bs->backing->bs * */
> +static inline BlockDriverState *child_bs(BlockDriverState *bs)
> +{
> +BlockDriverState *backing = backing_bs(bs);
> +BlockDriverState *file = file_bs(bs);
> +assert(!(file && backing));
> +return backing ?: file;
> +}

I still think you should just get the only child and not restrict it to
bs->file or bs->backing.

child = QLIST_FIRST(&bs->children);
assert(!QLIST_NEXT(child, next));
return child->bs;

file_bs() isn't needed at all then. And Berto is probably
right that this is simple enough that it can just be inlined in
bdrv_get_first_explicit().

> diff --git a/blockdev.c b/blockdev.c
> index 23475abb72..fc7b65c3f0 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1300,6 +1300,10 @@ SnapshotInfo 
> *qmp_blockdev_snapshot_delete_internal_sync(const char *device,
>  if (!bs) {
>  return NULL;
>  }
> +
> +/* Skip implicit filter nodes */
> +bs = bdrv_get_first_explicit(bs);
> +
> [...]

Most, if not all, of the commands that you change accept both node names
and BlockBackend names, which function as aliases for their root node.
If a node name is given, the user claims that they know the graph
structure and it is supposed to be exact. We should not skip the
implicit filter node then. (For example, for creating an external
snapshot, creating the snapshot above the filter is meaningful. It
requires that the user knows about the filter node, but by using a node
name they tell us that they do.)

Please make sure that your qemu-iotests case covers all of the cases for
which you add a bdrv_get_first_explicit() call in this patch. So far, we
seem to be completely missing:

* Create external snapshot
* Create internal snapshot
* Delete internal snapshot
* blockdev-backup
* blockdev-mirror

All of them should cover the case where a BlockBackend name is given as
well as the cases with a node name.

Kevin

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Daniel P. Berrange

On Thu, Sep 07, 2017 at 11:04:02AM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > This does imply that you need a separate monitor I/O processing, from 
> > > > the
> > > > command execution thread, but I see no need for all commands to suddenly
> > > > become async. Just allowing interleaved replies is sufficient from the
> > > > POV of the protocol definition. This interleaving is easy to handle from
> > > > the client POV - just requires a unique 'serial' in the request by the
> > > > client, that is copied into the reply by QEMU.
> > > 
> > > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > 
> > > then it's upto the caller to ensure those id's are unique.
> > 
> > Libvirt has in fact generated a unique 'id' for every monitor command
> > since day 1 of supporting QMP.
> > 
> > > I do worry about two things:
> > >   a) With this the caller doesn't really know which commands could be
> > >   in parallel - for example if we've got a recovery command that's
> > >   executed by this non-locking thread that's OK, we expect that
> > >   to be doable in parallel.  If in the future though we do
> > >   what you initially suggested and have a bunch of commands get
> > >   routed to the migration thread (say) then those would suddenly
> > >   operate in parallel with other commands that we're previously
> > >   synchronous.
> > 
> > We could still have an opt-in for async commands. eg default to executing
> > all commands in the main thread, unless the client issues an explicit
> > "make it async" command, to switch to allowing the migration thread to
> > process it async.
> > 
> >  { "execute": "qmp_allow_async",
> >"data": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > 
> >  { "return": { "commands": [
> >"migrate_cancel",
> >] } }
> > 
> > The server response contains the subset of commands from the request
> > for which async is supported.
> > 
> > That gives good negotiation ability going forward as we incrementally
> > support async on more commands.
> 
> Is that 'qmp_allow_async' a command purely to query whether a command
> is async or is it a wrapper to cause that command to be executed async?

The former.

It merely used by the client to tell QEMU that it wants the command(s)
listed to have async processing enabled. QEMU reports back which commands
it has actually enabled async for.

IOW, before executing this, everything is still processed synchronously,
even if QEMU has support for async. This ensures back compat as we enable
support for async per command. After executing this command, then future
usage of 'migrate_cancel' would be run async.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC PATCH qemu 0/4] memory: Reduce memory use

2017-09-07 Thread David Gibson

On Thu, Sep 07, 2017 at 10:51:42AM +0100, Dr. David Alan Gilbert wrote:
> * Alexey Kardashevskiy (a...@ozlabs.ru) wrote:
> > This was inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1481593
> > 
> > What happens ithere is that every virtio block device creates 2 address
> > spaces - for modern config space (called "virtio-pci-cfg-as") and
> > for busmaster (common pci thing, called after the device name,
> > in my case "virtio-blk-pci").
> > 
> > Each address_space_init() updates topology for every address space.
> > Every topology update (address_space_update_topology()) creates a new
> > dispatch tree - AddressSpaceDispatch with nodes (1KB) and
> > sections (48KB) and destroys the old one.
> > 
> > However the dispatch destructor is postponed via RCU which does not
> > get a chance to execute until the machine is initialized but before
> > we get there, memory is not returned to the pool, and this is a lot
> > of memory which grows n^2.
> > 
> > These patches are trying to address the memory use and boot time
> > issues but tbh only the first one provides visible outcome.
> 
> Do you have a feel for how much memory is saved?

I think that's a bit hard to answer.

As noted above there's O(n^2) (or more) space complexity here - one
which shouldn't be required by the data we actually have to track.
That means the amount of "excess" memory depends on how many devices
there are.

I haven't yet looked at these patches in detail, to know if it truly
fixes that O(n^2) or just pares the constant.  If it does fix the
O(n^2) then the amount is going to vary from "probably not enough to
worry about" in normal use cases to hundreds of gigabytes in cases
with many devices.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Stefan Hajnoczi

On Thu, Sep 7, 2017 at 10:35 AM, Dr. David Alan Gilbert
 wrote:
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
>> On Wed, Sep 6, 2017 at 4:14 PM, Dr. David Alan Gilbert
>>  wrote:
>> > * Stefan Hajnoczi (stefa...@gmail.com) wrote:
>> >> On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
>> >> > The root problem is that, monitor commands are all handled in main
>> >> > loop thread now, no matter how many monitors we specify. And, if main
>> >> > loop thread hangs due to some reason, all monitors will be stuck.
>> >>
>> >> I see a larger issue with postcopy: existing QEMU code assumes that
>> >> guest memory access is instantaneous.
>> >>
>> >> Postcopy breaks this assumption and introduces blocking points that can
>> >> now take unbounded time.
>> >>
>> >> This problem isn't specific to the monitor.  It can also happen to other
>> >> components in QEMU like the gdbstub.
>> >>
>> >> Do we need an asynchronous memory API?  Synchronous memory access should
>> >> only be allowed in vcpu threads.
>> >
>> > It would probably be useful for gdbstub where the overhead of async
>> > doesn't matter;  but doing that for all IO emulation is hard.
>>
>> Why is it hard?
>>
>> Memory access can be synchronous in the vcpu thread.  That eliminates
>> a lot of code straight away.
>>
>> Anything using dma-helpers.c is already async.  They just don't know
>> that the memory access part is being made async too :).
>
> Can you point me to some info on that ?

IDE and SCSI use dma-helpers.c to perform I/O:
hw/ide/core.c:892:s->bus->dma->aiocb =
dma_blk_io(blk_get_aio_context(s->blk),
hw/ide/macio.c:189:s->bus->dma->aiocb =
dma_blk_io(blk_get_aio_context(s->blk), &s->sg,
hw/scsi/scsi-disk.c:348:r->req.aiocb =
dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),
hw/scsi/scsi-disk.c:551:r->req.aiocb =
dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),

They pass a scatter-gather list of guest RAM addresses to
dma-helpers.c.  They receive a callback when I/O has finished.

Try following the code path.  Request submission may be from a vcpu
thread or IOThread.  Completion occurs in the main loop or an
IOThread.

The main point is that this API is already asynchronous.  If any
changes are needed for async guest memory access (not sure, I haven't
checked), then at least the dma-helpers.c users do not need to be
modified.

>> The remaining cases are virtio and some other devices.
>>
>> If you are worried about performance, the first rule is that async
>> memory access is only needed on the destination side when post-copy is
>> active.  Maybe use setjmp to return from the signal handler and queue
>> a callback for when the page has been loaded.
>
> I'm not sure it's worth trying to be too clever at avoiding this;
> I see the fact that we're doing IO with the bql held as a more
> fundamental problem.

QEMU should be doing I/O syscalls in async fashion or threadpool
workers (no BQL) so the BQL is not an issue.  Anything else could
cause unbounded waits even without postcopy.

Can you explain what you are worried about?

Stefan

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH

2017-09-07 Thread Halil Pasic

On 09/07/2017 10:58 AM, Dong Jia Shi wrote:
> * Halil Pasic  [2017-09-06 16:43:42 +0200]:
> 
>>
>>
>> On 09/06/2017 04:20 PM, Cornelia Huck wrote:
>>> On Wed, 6 Sep 2017 14:25:13 +0200
>>> Halil Pasic  wrote:
>>>
 We have basically two possibilities/options which ask for different
 handling:
 1) EFAULT is due to a bug in the vfio-ccw implementation
 (can be QEMU or kernel).
 2) EFAULT is due to buggy channel program.

 Option 2) is basically to be handled with a channel-program check and
 setting primary secondary and alert status. For reference see PoP page
 15-59 ("Designation of Storage Area").  An exception may be an invalid
 channel program address in the ORB. There the channel-program check ain't
 explicitly stated (although) I would expect one. It may be implied by the
 things on page 15-59 though.

 Option 1) is however very similar to other we have figured out that the
 implementation is broken situations and should be handled consequently.
 The current state of the discussion is with a unit exception.

 Does that make sense?
>>>
>>> I think the situation is slightly different here, though. For the orb
>>> flags, we reject something out of hand because we have not implemented
>>> it, and for that, unit exception sounds like a good fit. Processing
>>> errors, however, are more similar to errors in the hardware, and as
>>> such can probably be reported via something like equipment check.
>>>
>>
>> Noted. Let's see what Dong Jia has to say, before we continuing a
>> discussion on something (option 1) what may be irrelevant anyway.
>>

 Now, Dong Jia, I need your help to figure out do we have option 1 or
 option 2 here? After quick look at the kernel code, it appears to me that
 I've seen both option 1 and option 2 (I'm afraid) -- but my assessment
 was really very superficial.
> There are three cases (all in the kernel) that generate a -EFAULT ret
> code:
> a. vfio_ccw_mdev_write: copy_from_user() fails.
>   This is option 1.
> 
> b. ccwchain_fetch_tic
>   It's mostly likely that the vfio-ccw driver processed the ccw chains
>   wrongly. (Actually I can not think of any other reason.)
>   This is option 1.
> 
> c. ccwchain_fetch_idal
>   When we find that an IDAW contents an invalid address
>   This is option 2.
> 

So my fear was justified. If we can't tell if its option 1 or 2, and
currently we can't, I would say we shall trust our infrastructure and
blame the channel program: that boils down to channel-program check.
That's my assessment with the kernel driver being as-is.

If we are willing to change the vfio-ccw kernel driver, then generating
a response to option 2 conditions in the kernel (basically an SCSW) is
IMHO better. For instance we can do SCSW.cpa and SCWS.count properly.
AFAIR we are not allowed to present conditions that logically did not
happen (yet) -- for instance if we have per-fetched a bad CCW address
but the given ccw did not become active yet.

If we handle option 2 via a different channel (SCWS instead of return
code) then we could happily do the proper handling for option 1 here.

So Dong Jia the call is again yours to make!

Regards,
Halil

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Stefan Hajnoczi

On Thu, Sep 7, 2017 at 10:18 AM, Dr. David Alan Gilbert
 wrote:
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
>> On Thu, Sep 7, 2017 at 9:13 AM, Peter Xu  wrote:
>> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
>> >> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
>> >> > * Daniel P. Berrange (berra...@redhat.com) wrote:
>> >> > > This does imply that you need a separate monitor I/O processing, from 
>> >> > > the
>> >> > > command execution thread, but I see no need for all commands to 
>> >> > > suddenly
>> >> > > become async. Just allowing interleaved replies is sufficient from the
>> >> > > POV of the protocol definition. This interleaving is easy to handle 
>> >> > > from
>> >> > > the client POV - just requires a unique 'serial' in the request by the
>> >> > > client, that is copied into the reply by QEMU.
>> >> >
>> >> > OK, so for that we can just take Marc-André's syntax and call it 'id':
>> >> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
>> >> >
>> >> > then it's upto the caller to ensure those id's are unique.
>> >>
>> >> Libvirt has in fact generated a unique 'id' for every monitor command
>> >> since day 1 of supporting QMP.
>> >>
>> >> > I do worry about two things:
>> >> >   a) With this the caller doesn't really know which commands could be
>> >> >   in parallel - for example if we've got a recovery command that's
>> >> >   executed by this non-locking thread that's OK, we expect that
>> >> >   to be doable in parallel.  If in the future though we do
>> >> >   what you initially suggested and have a bunch of commands get
>> >> >   routed to the migration thread (say) then those would suddenly
>> >> >   operate in parallel with other commands that we're previously
>> >> >   synchronous.
>> >>
>> >> We could still have an opt-in for async commands. eg default to executing
>> >> all commands in the main thread, unless the client issues an explicit
>> >> "make it async" command, to switch to allowing the migration thread to
>> >> process it async.
>> >>
>> >>  { "execute": "qmp_allow_async",
>> >>"data": { "commands": [
>> >>"migrate_cancel",
>> >>] } }
>> >>
>> >>
>> >>  { "return": { "commands": [
>> >>"migrate_cancel",
>> >>] } }
>> >>
>> >> The server response contains the subset of commands from the request
>> >> for which async is supported.
>> >>
>> >> That gives good negotiation ability going forward as we incrementally
>> >> support async on more commands.
>> >
>> > I think this goes back to the discussion on which design we'd like to
>> > choose.  IMHO the whole async idea plus the per-command-id is indeed
>> > cleaner and nicer, and I believe that can benefit not only libvirt,
>> > but also other QMP users.  The problem is, I have no idea how long
>> > it'll take to let us have such a feature - I believe that will include
>> > QEMU and Libvirt to both support that.  And it'll be a pity if the
>> > postcopy recovery cannot work only because we cannot guarantee a
>> > stable monitor.
>>
>> Please don't rush in a hack, they often introduce new bugs that we
>> have to support long-term when they are part of the QMP API.
>>
>> In your original email you mentioned "info cpus".  Have you considered
>> modifying this command so it does not sync the CPU?  I'm not sure
>> callers really need to sync the CPU, typically they just want to know
>> the vcpu numbers, thread IDs, and current state (halted, running,
>> etc).
>
> But it has the pc as well, so that's actual state.

In what circumstances is the pc useful?

If the client just wants the vcpu -> thread ID mapping, it doesn't
matter at all.

If the CPU is halted, then the PC is already accurate and
synchronization isn't a problem.

If the CPU is running, then an accurate PC is meaningless since it
will have changed the moment the monitor command completes.  We might
as well just keep a copy of the last PC when entering QEMU in a vcpu
thread.

So I think we can offer a perfectly useful PC value without syncing.

Stefan

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Halil Pasic



On 09/07/2017 10:08 AM, Cornelia Huck wrote:
> On Thu, 7 Sep 2017 15:31:09 +0800
> Dong Jia Shi  wrote:
> 
>> * Cornelia Huck  [2017-09-06 15:18:21 +0200]:
>>
>>> On Tue,  5 Sep 2017 13:16:45 +0200
>>> Halil Pasic  wrote:
>>>   
 Add a fake device meant for testing the correctness of our css emulation.

 What we currently have is writing a Fibonacci sequence of uint32_t to the
 device via ccw write. The write is going to fail if it ain't a Fibonacci
 and indicate a device exception in scsw together with the proper residual
 count.

 Of course lot's of invalid inputs (besides basic data processing) can be
 tested with that as well.

 Usage:
 1) fire up a qemu with something like -device ccw-tester,devno=fe.0.0001
on the command line
 2) exercise the device from the guest

 Signed-off-by: Halil Pasic 
 ---

 It may not make sense to merge this work in the current form, as it is
 solely for test purposes.
 ---
  hw/s390x/Makefile.objs |   1 +
  hw/s390x/ccw-tester.c  | 179 
 +
  2 files changed, 180 insertions(+)
  create mode 100644 hw/s390x/ccw-tester.c  
>>>
>>> The main problem here is that you want to exercise a middle layer (the
>>> css code) and need to write boilerplate code on both host and guest
>>> side in order to be able to do so.
>>>
>>> In general, a device that accepts arbitrary channel programs looks
>>> useful for testing purposes. I would split out processing of expected
>>> responses out, though, so that it can be more easily reused for
>>> different use cases.
>>>
>>> (I dimly recall other test devices...)
>>>
>>> For the guest tester: Can that be done via the qtest infrastructure
>>> somehow?
>>>   
>>
>> I'm thinking of a method these days:
>> Could passing through an fully emulated ccw device (e.g. 3270), or a
>> virtio ccw device, in the level 1 kvm guest to a level 2 guest be a test
>> method for this?
>>
>> All of the CCWs will be translated to IDAL CCWs by vfio-ccw in the level
>> 1 guest (which is the level 2 kvm host) and issued to the level 1 kvm
>> host. So, those IDALs will eventually be handled by the emulated device,
>> or the virtio ccw device, on the level 1 kvm host...
>>
>> Some days ago, one of my colleague tried the emulated 3270 passing
>> through. She stucked with the problem that the level 1 kvm host handling
>> a senseid IDAL ccw as a Direct ccw.
>>
>> Maybe I could try to pass through a virtio ccw device. I don't think of
>> any obvious problem that would lead to fail. Any comment?
>>
> 
> That actually looks like a good thing to try! Cool idea.
> 

I'm afraid that it would not work without some extra work.
AFAIR Connie we said that the 3270 does not use any IDA, so
I did not touch the 3270 emulation code in QEMU. To make
the scenario viable one should convert the 3270 emulation
to ccw data stream (unless the original implementation
already took care of IDA, which I doubt).

Halil

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Peter Xu

On Thu, Sep 07, 2017 at 10:18:16AM +0100, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
> > On Thu, Sep 7, 2017 at 9:13 AM, Peter Xu  wrote:
> > > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > >> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> > >> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > >> > > This does imply that you need a separate monitor I/O processing, 
> > >> > > from the
> > >> > > command execution thread, but I see no need for all commands to 
> > >> > > suddenly
> > >> > > become async. Just allowing interleaved replies is sufficient from 
> > >> > > the
> > >> > > POV of the protocol definition. This interleaving is easy to handle 
> > >> > > from
> > >> > > the client POV - just requires a unique 'serial' in the request by 
> > >> > > the
> > >> > > client, that is copied into the reply by QEMU.
> > >> >
> > >> > OK, so for that we can just take Marc-André's syntax and call it 'id':
> > >> >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > >> >
> > >> > then it's upto the caller to ensure those id's are unique.
> > >>
> > >> Libvirt has in fact generated a unique 'id' for every monitor command
> > >> since day 1 of supporting QMP.
> > >>
> > >> > I do worry about two things:
> > >> >   a) With this the caller doesn't really know which commands could be
> > >> >   in parallel - for example if we've got a recovery command that's
> > >> >   executed by this non-locking thread that's OK, we expect that
> > >> >   to be doable in parallel.  If in the future though we do
> > >> >   what you initially suggested and have a bunch of commands get
> > >> >   routed to the migration thread (say) then those would suddenly
> > >> >   operate in parallel with other commands that we're previously
> > >> >   synchronous.
> > >>
> > >> We could still have an opt-in for async commands. eg default to executing
> > >> all commands in the main thread, unless the client issues an explicit
> > >> "make it async" command, to switch to allowing the migration thread to
> > >> process it async.
> > >>
> > >>  { "execute": "qmp_allow_async",
> > >>"data": { "commands": [
> > >>"migrate_cancel",
> > >>] } }
> > >>
> > >>
> > >>  { "return": { "commands": [
> > >>"migrate_cancel",
> > >>] } }
> > >>
> > >> The server response contains the subset of commands from the request
> > >> for which async is supported.
> > >>
> > >> That gives good negotiation ability going forward as we incrementally
> > >> support async on more commands.
> > >
> > > I think this goes back to the discussion on which design we'd like to
> > > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > > cleaner and nicer, and I believe that can benefit not only libvirt,
> > > but also other QMP users.  The problem is, I have no idea how long
> > > it'll take to let us have such a feature - I believe that will include
> > > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > > postcopy recovery cannot work only because we cannot guarantee a
> > > stable monitor.
> > 
> > Please don't rush in a hack, they often introduce new bugs that we
> > have to support long-term when they are part of the QMP API.

Sorry, I wasn't meant to push anything.  I was trying to see what
would be the best way to go.

> > 
> > In your original email you mentioned "info cpus".  Have you considered
> > modifying this command so it does not sync the CPU?  I'm not sure
> > callers really need to sync the CPU, typically they just want to know
> > the vcpu numbers, thread IDs, and current state (halted, running,
> > etc).
> 
> But it has the pc as well, so that's actual state.

Yes.  Even if we don't need to sync pc regs for this single "info
cpus" command, I do feel slightly awkward if we don't allow things
like syncing CPU to happen in any command.  IMHO we just need to make
sure these commands may block.

> 
> Dave
> 
> > The next step after that would be to audit other monitor commands for
> > unnecessary vcpu synchronization.

It's really hard to do this for every single command, at least to me.

Comparing to this, I think now I more prefer what Dan has suggested in
the other reply to have an extra way to request async commands while
keep the rest of commands compatible (though obviously I misunderstood
the email when I was writting up previous reply...).

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH

2017-09-07 Thread Cornelia Huck

On Thu, 7 Sep 2017 16:58:31 +0800
Dong Jia Shi  wrote:

> * Halil Pasic  [2017-09-06 16:43:42 +0200]:
> 
> > 
> > 
> > On 09/06/2017 04:20 PM, Cornelia Huck wrote:  
> > > On Wed, 6 Sep 2017 14:25:13 +0200
> > > Halil Pasic  wrote:
> > >   
> > >> We have basically two possibilities/options which ask for different
> > >> handling:
> > >> 1) EFAULT is due to a bug in the vfio-ccw implementation
> > >> (can be QEMU or kernel).
> > >> 2) EFAULT is due to buggy channel program.
> > >>
> > >> Option 2) is basically to be handled with a channel-program check and
> > >> setting primary secondary and alert status. For reference see PoP page
> > >> 15-59 ("Designation of Storage Area").  An exception may be an invalid
> > >> channel program address in the ORB. There the channel-program check ain't
> > >> explicitly stated (although) I would expect one. It may be implied by the
> > >> things on page 15-59 though.
> > >>
> > >> Option 1) is however very similar to other we have figured out that the
> > >> implementation is broken situations and should be handled consequently.
> > >> The current state of the discussion is with a unit exception.
> > >>
> > >> Does that make sense?  
> > > 
> > > I think the situation is slightly different here, though. For the orb
> > > flags, we reject something out of hand because we have not implemented
> > > it, and for that, unit exception sounds like a good fit. Processing
> > > errors, however, are more similar to errors in the hardware, and as
> > > such can probably be reported via something like equipment check.
> > >   
> > 
> > Noted. Let's see what Dong Jia has to say, before we continuing a
> > discussion on something (option 1) what may be irrelevant anyway.
> >   
> > >>
> > >> Now, Dong Jia, I need your help to figure out do we have option 1 or
> > >> option 2 here? After quick look at the kernel code, it appears to me that
> > >> I've seen both option 1 and option 2 (I'm afraid) -- but my assessment
> > >> was really very superficial.  
> There are three cases (all in the kernel) that generate a -EFAULT ret
> code:
> a. vfio_ccw_mdev_write: copy_from_user() fails.
>   This is option 1.
> 
> b. ccwchain_fetch_tic
>   It's mostly likely that the vfio-ccw driver processed the ccw chains
>   wrongly. (Actually I can not think of any other reason.)

Me neither, I'd consider hitting this a bug in the implementation.

>   This is option 1.
> 
> c. ccwchain_fetch_idal
>   When we find that an IDAW contents an invalid address
>   This is option 2.
> 
> > >>
> > >> I would expect option 2 to be handled differently (kernel provides the
> > >> SCSW) though.

So we would do an equipment check for the first two ("equipment", i.e.
the software, is malfunctioning) and use a more appropriate way for the
malformed idaw?

Re: [Qemu-devel] [PULL 00/14] Block layer patches

2017-09-07 Thread Peter Maydell

On 6 September 2017 at 15:02, Kevin Wolf  wrote:
> The following changes since commit 98bfaac788be0ca63d7d010c8d4ba100ff1d8278:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-09-01-v3' 
> into staging (2017-09-04 13:28:09 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 83a8c775a8bf134eb18a719322939b74a818d750:
>
>   qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing 
> (2017-09-06 14:40:18 +0200)
>
> 
> Block layer patches
>

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH 5/5] s390x/ccs: add ccw-tester emulated device

2017-09-07 Thread Cornelia Huck

On Thu, 7 Sep 2017 12:21:50 +0200
Halil Pasic  wrote:

> On 09/07/2017 10:08 AM, Cornelia Huck wrote:
> > On Thu, 7 Sep 2017 15:31:09 +0800
> > Dong Jia Shi  wrote:

> >> I'm thinking of a method these days:
> >> Could passing through an fully emulated ccw device (e.g. 3270), or a
> >> virtio ccw device, in the level 1 kvm guest to a level 2 guest be a test
> >> method for this?
> >>
> >> All of the CCWs will be translated to IDAL CCWs by vfio-ccw in the level
> >> 1 guest (which is the level 2 kvm host) and issued to the level 1 kvm
> >> host. So, those IDALs will eventually be handled by the emulated device,
> >> or the virtio ccw device, on the level 1 kvm host...
> >>
> >> Some days ago, one of my colleague tried the emulated 3270 passing
> >> through. She stucked with the problem that the level 1 kvm host handling
> >> a senseid IDAL ccw as a Direct ccw.
> >>
> >> Maybe I could try to pass through a virtio ccw device. I don't think of
> >> any obvious problem that would lead to fail. Any comment?
> >>  
> > 
> > That actually looks like a good thing to try! Cool idea.
> >   
> 
> I'm afraid that it would not work without some extra work.
> AFAIR Connie we said that the 3270 does not use any IDA, so
> I did not touch the 3270 emulation code in QEMU. To make
> the scenario viable one should convert the 3270 emulation
> to ccw data stream (unless the original implementation
> already took care of IDA, which I doubt).

But the vfio-ccw code uses idals... no need to touch the 3270 emulation.

Re: [Qemu-devel] [RFC v2 06/32] postcopy: use UFFDIO_ZEROPAGE only when available

2017-09-07 Thread Dr. David Alan Gilbert

* Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> Hi
> 
> On Thu, Aug 24, 2017 at 9:27 PM, Dr. David Alan Gilbert (git)
>  wrote:
> > From: "Dr. David Alan Gilbert" 
> >
> > Use a flag on the RAMBlock to state whether it has the
> > UFFDIO_ZEROPAGE capability, use it when it's available.
> >
> > This allows the use of postcopy on tmpfs as well as hugepage
> > backed files.
> >
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  exec.c| 15 +++
> >  include/exec/cpu-common.h |  3 +++
> >  migration/postcopy-ram.c  | 14 +++---
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/exec.c b/exec.c
> > index 35b4cea2ed..80c3d1d121 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -103,6 +103,11 @@ static MemoryRegion io_mem_unassigned;
> >   */
> >  #define RAM_RESIZEABLE (1 << 2)
> >
> > +/* UFFDIO_ZEROPAGE is available on this RAMBlock to atomically
> > + * zero the page and wake waiting processes.
> > + * (Set during postcopy)
> > + */
> > +#define RAM_UF_ZEROPAGE (1 << 3)
> >  #endif
> >
> >  #ifdef TARGET_PAGE_BITS_VARY
> > @@ -1705,6 +1710,16 @@ bool qemu_ram_is_shared(RAMBlock *rb)
> >  return rb->flags & RAM_SHARED;
> >  }
> >
> > +bool qemu_ram_is_uf_zeroable(RAMBlock *rb)
> > +{
> > +return rb->flags & RAM_UF_ZEROPAGE;
> > +}
> > +
> > +void qemu_ram_set_uf_zeroable(RAMBlock *rb)
> > +{
> > +rb->flags |= RAM_UF_ZEROPAGE;
> > +}
> > +
> >  /* Called with iothread lock held.  */
> >  void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState 
> > *dev)
> >  {
> > diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> > index 0d861a6289..24d335f95d 100644
> > --- a/include/exec/cpu-common.h
> > +++ b/include/exec/cpu-common.h
> > @@ -73,6 +73,9 @@ void qemu_ram_set_idstr(RAMBlock *block, const char 
> > *name, DeviceState *dev);
> >  void qemu_ram_unset_idstr(RAMBlock *block);
> >  const char *qemu_ram_get_idstr(RAMBlock *rb);
> >  bool qemu_ram_is_shared(RAMBlock *rb);
> > +bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
> > +void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> > +
> >  size_t qemu_ram_pagesize(RAMBlock *block);
> >  size_t qemu_ram_pagesize_largest(void);
> >
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index 7a414ebad8..640b72d86d 100644
> > --- a/migration/postcopy-ram.c
> > +++ b/migration/postcopy-ram.c
> > @@ -408,6 +408,11 @@ static int ram_block_enable_notify(const char 
> > *block_name, void *host_addr,
> >  error_report("%s userfault: Region doesn't support COPY", 
> > __func__);
> >  return -1;
> >  }
> > +if (reg_struct.ioctls & ((__u64)1 << _UFFDIO_ZEROPAGE)) {
> > +RAMBlock *rb = qemu_ram_block_by_name(block_name);
> > +qemu_ram_set_uf_zeroable(rb);
> > +}
> > +
> >
> 
> extra empty line

Thanks, gone.

> >  return 0;
> >  }
> > @@ -617,11 +622,14 @@ int postcopy_place_page(MigrationIncomingState *mis, 
> > void *host, void *from,
> >  int postcopy_place_page_zero(MigrationIncomingState *mis, void *host,
> >   RAMBlock *rb)
> >  {
> > +size_t pagesize = qemu_ram_pagesize(rb);
> >  trace_postcopy_place_page_zero(host);
> >
> > -if (qemu_ram_pagesize(rb) == getpagesize()) {
> 
> Is this check drop intentionally?

Yes, it used to be the case that we knew that for hugepages we couldn't
do zeroing and that was the rule we used.   Now we're using the
capability flag returned from the uffdio registration on this RAMBlock
and it tells us if we can use the ZERO ioctl on this block.

Dave

> > -if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, NULL, 
> > getpagesize(),
> > -rb)) {
> > +/* Normal RAMBlocks can zero a page using UFFDIO_ZEROPAGE
> > + * but it's not available for everything (e.g. hugetlbpages)
> > + */
> > +if (qemu_ram_is_uf_zeroable(rb)) {
> > +if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, NULL, pagesize, 
> > rb)) {
> >  int e = errno;
> >  error_report("%s: %s zero host: %p",
> >   __func__, strerror(e), host);
> > --
> > 2.13.5
> >
> >
> 
> 
> 
> -- 
> Marc-André Lureau
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v4 00/20] instrument: Add basic event instrumentation

2017-09-07 Thread Markus Armbruster

Lluís Vilanova  writes:

> This series adds an API to add instrumentation events.
>
> It also provides additional APIs for:
> * Controlling tracing events
> * Peek/poke guest memory
>
> There's still missing APIs for (can be added in later series?):
> * Provide something like tracing's per-vCPU trace states (i.e., so that each
>   vCPU can have different instrumentation code). It's still not clear to me if
>   we should extend the per-vCPU bitmap with instrumentation events, or 
> otherwise
>   somehow reuse the bits in tracing events (since they're currently limited).
> * Peek/poke guest registers
>
> The instrumentation code is dynamically loaded as a library into QEMU either
> when it starts or later using its remote control interfaces.
>
> Signed-off-by: Lluís Vilanova 

Taking a step back.

This looks like a way to dynamically load arbitrary code.  What
interfaces can this code use?  Your cover letter should answer this.

As long as the answer is "everything the dynamic linker is willing to
resolve", this series heading nowhere.  We can talk about an interface
for plugins, but "anything goes" is not on the menu.

Re: [Qemu-devel] [PATCH 2/9] s390x: fix invalid use of cc 1 for SSCH

2017-09-07 Thread Halil Pasic

On 09/07/2017 10:02 AM, Dong Jia Shi wrote:
> * Cornelia Huck  [2017-09-06 13:25:38 +0200]:
> 
>> On Wed, 6 Sep 2017 16:27:20 +0800
>> Dong Jia Shi  wrote:
>>
>>> * Halil Pasic  [2017-09-05 19:20:43 +0200]:
>>>

 On 09/05/2017 05:46 PM, Cornelia Huck wrote:  
> On Tue, 5 Sep 2017 17:24:19 +0200
> Halil Pasic  wrote:
>   
>> My problem with a program check (indicated by SCSW word 2 bit 10) is
>> that, in my reading of the architecture, the semantic behind it is: The
>> channel subsystem (not the cu or device) has detected, that the 
>> the channel program (previously submitted as an ORB) is erroneous. Which
>> programs are erroneous is specified by the architecture. What we have
>> here does not qualify.
>>
>> My idea was to rather blame the virtual hardware (device) and put no 
>> blame
>> on the program nor he channel subsystem. This could be done using device
>> status (unit check with command reject, maybe unit exception) or 
>> interface
>> check. My train of thought was, the problem is not consistent across a
>> device type, so it has to be device specific.  
>
> Unit exception might be a better way to express what is happening here.
> At least, it moves us away from cc 1 and not towards cc 3 :)
>   

 I will do a follow up patch pursuing device exception.

>>
>> Of course blaming the device could mislead the person encountering the
>> problem, and make him believe it's an non-virtual hardware problem.
>>
>> About the misleading, I think the best we can do is log out a message
>> indicating what really happened.  
>
> Just document it in the code? If it doesn't happen with Linux as a
> guest, it is highly unlikely to be seen in the wild.
>   

 Well we have two problems here:
 1) Unit exception can be already defined by the device type for the
 command (reference: 
 http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.6.10?DT=19920904110920).
 I think this one is what you mean. And I agree that's best handled
 with comment in code.  
>>> Using unit check, with bit 3 byte 0 of the sense data set to 1, to
>>> indicate an 'Equipment check', sounds a bit more proper than unit
>>> exception.
>>
>> I don't agree: Equipment check sounds a lot more dire (and seems to
>> imply a malfunction). I like unit exception better.
> Got the point. Fair enough!
> 

I do see some benefit in doing unit check over unit exception. Just
kept quite to see the discussion unfold. As already said, unit exception
seems to be something reserved for the device type to define in a more
or less arbitrary but unambiguous way. I agreed to use this, because
I trust Connie's assessment about not really being used by the
devices in the wild (obviously nothing changed here).

If we consider the semantic of unit check with command reject, it's
a surprisingly good match: basically device detected a programming
error (which can not be detected by the channel-subsystem because it
is device (type) specific). For reference see:
http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.7.2.1?DT=19920904110920

IMHO that's almost exactly what we have here: the channel-program
is good from the perspective of the channel subsystem, but the device
can't deal with it. So we would not lie that the device is at fault
(was Connie's concern initially) but we would not lie about having
a generally invalid channel program (was my concern).

So how about an unit check with a command reject? (The only problem
I see is is on the device vs device type plane -- but that ain't better
for unit exception.)

Halil

Re: [Qemu-devel] [RFC v2 09/32] vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message

2017-09-07 Thread Dr. David Alan Gilbert

* Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> Hi
> 
> On Thu, Aug 24, 2017 at 12:27 PM, Dr. David Alan Gilbert (git)
>  wrote:
> > From: "Dr. David Alan Gilbert" 
> >
> > Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
> > message on an incoming advise.
> >
> > Later patches will fill in the behaviour/contents of the
> > message.
> >
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  contrib/libvhost-user/libvhost-user.c | 21 ---
> >  contrib/libvhost-user/libvhost-user.h |  6 -
> >  docs/interop/vhost-user.txt   |  9 +++
> >  hw/virtio/vhost-user.c| 48 
> > +++
> >  migration/postcopy-ram.h  |  1 +
> >  migration/savevm.c|  6 +
> >  6 files changed, 86 insertions(+), 5 deletions(-)
> >
> > diff --git a/contrib/libvhost-user/libvhost-user.c 
> > b/contrib/libvhost-user/libvhost-user.c
> > index 201b9846e9..8bbdf5fb40 100644
> > --- a/contrib/libvhost-user/libvhost-user.c
> > +++ b/contrib/libvhost-user/libvhost-user.c
> > @@ -42,9 +42,6 @@ vu_request_to_string(int req)
> >  REQ(VHOST_USER_NONE),
> >  REQ(VHOST_USER_GET_FEATURES),
> >  REQ(VHOST_USER_SET_FEATURES),
> > -REQ(VHOST_USER_NONE),
> > -REQ(VHOST_USER_GET_FEATURES),
> > -REQ(VHOST_USER_SET_FEATURES),
> 
> nice cleanup ;)
> 
> >  REQ(VHOST_USER_SET_OWNER),
> >  REQ(VHOST_USER_RESET_OWNER),
> >  REQ(VHOST_USER_SET_MEM_TABLE),
> > @@ -62,7 +59,10 @@ vu_request_to_string(int req)
> >  REQ(VHOST_USER_GET_QUEUE_NUM),
> >  REQ(VHOST_USER_SET_VRING_ENABLE),
> >  REQ(VHOST_USER_SEND_RARP),
> > -REQ(VHOST_USER_INPUT_GET_CONFIG),
> > +REQ(VHOST_USER_SET_SLAVE_REQ_FD),
> > +REQ(VHOST_USER_IOTLB_MSG),
> > +REQ(VHOST_USER_SET_VRING_ENDIAN),
> > +REQ(VHOST_USER_POSTCOPY_ADVISE),
> >  REQ(VHOST_USER_MAX),
> >  };
> >  #undef REQ
> > @@ -744,6 +744,17 @@ vu_set_vring_enable_exec(VuDev *dev, VhostUserMsg 
> > *vmsg)
> >  }
> >
> >  static bool
> > +vu_set_postcopy_advise(VuDev *dev, VhostUserMsg *vmsg)
> > +{
> > +/* TODO: Open ufd, pass it back in the request
> > + * TODO: Add addresses
> 
> Extra EOL space

Gone.

> > + */
> > +vmsg->payload.u64 = 0xcafe;
> > +vmsg->size = sizeof(vmsg->payload.u64);
> > +return true; /* = send a reply */
> > +}
> > +
> > +static bool
> >  vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> >  {
> >  int do_reply = 0;
> > @@ -808,6 +819,8 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
> >  return vu_set_vring_enable_exec(dev, vmsg);
> >  case VHOST_USER_NONE:
> >  break;
> > +case VHOST_USER_POSTCOPY_ADVISE:
> > +return vu_set_postcopy_advise(dev, vmsg);
> >  default:
> >  vmsg_close_fds(vmsg);
> >  vu_panic(dev, "Unhandled request: %d", vmsg->request);
> > diff --git a/contrib/libvhost-user/libvhost-user.h 
> > b/contrib/libvhost-user/libvhost-user.h
> > index 95d0d34a28..3987ce643d 100644
> > --- a/contrib/libvhost-user/libvhost-user.h
> > +++ b/contrib/libvhost-user/libvhost-user.h
> > @@ -62,7 +62,11 @@ typedef enum VhostUserRequest {
> >  VHOST_USER_GET_QUEUE_NUM = 17,
> >  VHOST_USER_SET_VRING_ENABLE = 18,
> >  VHOST_USER_SEND_RARP = 19,
> > -VHOST_USER_INPUT_GET_CONFIG = 20,
> > +VHOST_USER_NET_SET_MTU  = 20,
> > +VHOST_USER_SET_SLAVE_REQ_FD = 21,
> > +VHOST_USER_IOTLB_MSG= 22,
> > +VHOST_USER_SET_VRING_ENDIAN = 23,
> > +VHOST_USER_POSTCOPY_ADVISE  = 24,
> >  VHOST_USER_MAX
> >  } VhostUserRequest;
> >
> > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > index a279560eb0..dad2a1b343 100644
> > --- a/docs/interop/vhost-user.txt
> > +++ b/docs/interop/vhost-user.txt
> > @@ -606,6 +606,15 @@ Master message types
> >and expect this message once (per VQ) during device configuration
> >(ie. before the master starts the VQ).
> >
> > + * VHOST_USER_POSTCOPY_ADVISE
> > +  Id: 24
> > +  Master payload: N/A
> > +  Slave payload: userfault fd + u64
> > +
> > +  Master advises slave that a migration with postcopy enabled is 
> > underway,
> > +  the slave must open a userfaultfd for later use.
> > +  Note that at this stage the migration is still in precopy mode.
> > +
> >  Slave message types
> >  ---
> >
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index c51bbd1296..7063e4df61 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -67,6 +67,7 @@ typedef enum VhostUserRequest {
> >  VHOST_USER_SET_SLAVE_REQ_FD = 21,
> >  VHOST_USER_IOTLB_MSG = 22,
> >  VHOST_USER_SET_VRING_ENDIAN = 23,
> > +VHOST_USER_POSTCOPY_ADVISE  = 24,
> >  VHOST_USER_MAX
> >  } VhostUserRequest;
> >
> > @@ -724,6 +725,50 @@ out:
> >  return ret;
> >  }
> >
> > +/*
> > + * Called at the start of

Re: [Qemu-devel] [PATCH] target/arm: Add Jazelle feature

2017-09-07 Thread Peter Maydell

On 5 September 2017 at 22:12, Portia Stephens
 wrote:
> This adds the Jazelle execution state as a feature if ARM_FEATURE_V6 is
> set or if the processor is arm926 or arm1026. This fixes the issue that any
> BXJ instruction will result in an illegal_op. BXJ instructions will now
> check if the architecture supports ARM_FEATURE_JAZELLE.
>
> Signed-off-by: Portia Stephens 
> Reviewed-by: Alistair Francis 

Thanks, applied to target-arm.next. I tweaked the text of the comment
and the commit message a little to make it clearer that all the JAZELLE
feature bit does is indicate support for the "trivial Jazelle"
implementation, not the full thing.

thanks
-- PMM

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Markus Armbruster

"Dr. David Alan Gilbert"  writes:

> * Daniel P. Berrange (berra...@redhat.com) wrote:
>> On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
>> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
>> > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
>> > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
>> > > > > This does imply that you need a separate monitor I/O processing, 
>> > > > > from the
>> > > > > command execution thread, but I see no need for all commands to 
>> > > > > suddenly
>> > > > > become async. Just allowing interleaved replies is sufficient from 
>> > > > > the
>> > > > > POV of the protocol definition. This interleaving is easy to handle 
>> > > > > from
>> > > > > the client POV - just requires a unique 'serial' in the request by 
>> > > > > the
>> > > > > client, that is copied into the reply by QEMU.
>> > > > 
>> > > > OK, so for that we can just take Marc-André's syntax and call it 'id':
>> > > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
>> > > > 
>> > > > then it's upto the caller to ensure those id's are unique.
>> > > 
>> > > Libvirt has in fact generated a unique 'id' for every monitor command
>> > > since day 1 of supporting QMP.
>> > > 
>> > > > I do worry about two things:
>> > > >   a) With this the caller doesn't really know which commands could be
>> > > >   in parallel - for example if we've got a recovery command that's
>> > > >   executed by this non-locking thread that's OK, we expect that
>> > > >   to be doable in parallel.  If in the future though we do
>> > > >   what you initially suggested and have a bunch of commands get
>> > > >   routed to the migration thread (say) then those would suddenly
>> > > >   operate in parallel with other commands that we're previously
>> > > >   synchronous.
>> > > 
>> > > We could still have an opt-in for async commands. eg default to executing
>> > > all commands in the main thread, unless the client issues an explicit
>> > > "make it async" command, to switch to allowing the migration thread to
>> > > process it async.
>> > > 
>> > >  { "execute": "qmp_allow_async",
>> > >"data": { "commands": [
>> > >"migrate_cancel",
>> > >] } }
>> > > 
>> > > 
>> > >  { "return": { "commands": [
>> > >"migrate_cancel",
>> > >] } }
>> > > 
>> > > The server response contains the subset of commands from the request
>> > > for which async is supported.
>> > > 
>> > > That gives good negotiation ability going forward as we incrementally
>> > > support async on more commands.
>> > 
>> > I think this goes back to the discussion on which design we'd like to
>> > choose.  IMHO the whole async idea plus the per-command-id is indeed
>> > cleaner and nicer, and I believe that can benefit not only libvirt,
>> > but also other QMP users.  The problem is, I have no idea how long
>> > it'll take to let us have such a feature - I believe that will include
>> > QEMU and Libvirt to both support that.  And it'll be a pity if the
>> > postcopy recovery cannot work only because we cannot guarantee a
>> > stable monitor.
>> 
>> This is not a blocker for having postcopy recovery feature merged.
>> It merely means that in a situation where the mainloop is blocked,
>> then we can't recover, in other situations we'll be able to recover
>> fine. Sure it would be nice to fix that problem too, but I don't
>> see it as a block.
>
> It's probably OK to merge the recovery code before the monitor code;
> but I don't think it's something you'd want to tell users about -
> a 'postcopy recovery that only works rarely' isn't much use.

"Rarely"?  Are main loop hangs *that* common?

Can we quantify the problem to help gauge urgency?

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread

2017-09-07 Thread Dr. David Alan Gilbert

* Markus Armbruster (arm...@redhat.com) wrote:
> "Dr. David Alan Gilbert"  writes:
> 
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> >> On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> >> > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> >> > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
> >> > > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> >> > > > > This does imply that you need a separate monitor I/O processing, 
> >> > > > > from the
> >> > > > > command execution thread, but I see no need for all commands to 
> >> > > > > suddenly
> >> > > > > become async. Just allowing interleaved replies is sufficient from 
> >> > > > > the
> >> > > > > POV of the protocol definition. This interleaving is easy to 
> >> > > > > handle from
> >> > > > > the client POV - just requires a unique 'serial' in the request by 
> >> > > > > the
> >> > > > > client, that is copied into the reply by QEMU.
> >> > > > 
> >> > > > OK, so for that we can just take Marc-André's syntax and call it 
> >> > > > 'id':
> >> > > >   https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> >> > > > 
> >> > > > then it's upto the caller to ensure those id's are unique.
> >> > > 
> >> > > Libvirt has in fact generated a unique 'id' for every monitor command
> >> > > since day 1 of supporting QMP.
> >> > > 
> >> > > > I do worry about two things:
> >> > > >   a) With this the caller doesn't really know which commands could be
> >> > > >   in parallel - for example if we've got a recovery command that's
> >> > > >   executed by this non-locking thread that's OK, we expect that
> >> > > >   to be doable in parallel.  If in the future though we do
> >> > > >   what you initially suggested and have a bunch of commands get
> >> > > >   routed to the migration thread (say) then those would suddenly
> >> > > >   operate in parallel with other commands that we're previously
> >> > > >   synchronous.
> >> > > 
> >> > > We could still have an opt-in for async commands. eg default to 
> >> > > executing
> >> > > all commands in the main thread, unless the client issues an explicit
> >> > > "make it async" command, to switch to allowing the migration thread to
> >> > > process it async.
> >> > > 
> >> > >  { "execute": "qmp_allow_async",
> >> > >"data": { "commands": [
> >> > >"migrate_cancel",
> >> > >] } }
> >> > > 
> >> > > 
> >> > >  { "return": { "commands": [
> >> > >"migrate_cancel",
> >> > >] } }
> >> > > 
> >> > > The server response contains the subset of commands from the request
> >> > > for which async is supported.
> >> > > 
> >> > > That gives good negotiation ability going forward as we incrementally
> >> > > support async on more commands.
> >> > 
> >> > I think this goes back to the discussion on which design we'd like to
> >> > choose.  IMHO the whole async idea plus the per-command-id is indeed
> >> > cleaner and nicer, and I believe that can benefit not only libvirt,
> >> > but also other QMP users.  The problem is, I have no idea how long
> >> > it'll take to let us have such a feature - I believe that will include
> >> > QEMU and Libvirt to both support that.  And it'll be a pity if the
> >> > postcopy recovery cannot work only because we cannot guarantee a
> >> > stable monitor.
> >> 
> >> This is not a blocker for having postcopy recovery feature merged.
> >> It merely means that in a situation where the mainloop is blocked,
> >> then we can't recover, in other situations we'll be able to recover
> >> fine. Sure it would be nice to fix that problem too, but I don't
> >> see it as a block.
> >
> > It's probably OK to merge the recovery code before the monitor code;
> > but I don't think it's something you'd want to tell users about -
> > a 'postcopy recovery that only works rarely' isn't much use.
> 
> "Rarely"?  Are main loop hangs *that* common?
> 
> Can we quantify the problem to help gauge urgency?

Not really; it depends on workload and behaviour.  The people who worry
about postcopy recovery actually care about it working in almost every
case.
So I'm OK to add the recovery code, it's just not something we should
be shouting about to users until we have the monitor fixed.

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC v2 10/32] vhub: Support sending fds back to qemu

2017-09-07 Thread Dr. David Alan Gilbert

* Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> Hi
> 
> On Thu, Aug 24, 2017 at 12:27 PM, Dr. David Alan Gilbert (git)
>  wrote:
> > From: "Dr. David Alan Gilbert" 
> >
> > Allow replies with fds (for postcopy)
> >
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  contrib/libvhost-user/libvhost-user.c | 29 -
> >  1 file changed, 28 insertions(+), 1 deletion(-)
> >
> > diff --git a/contrib/libvhost-user/libvhost-user.c 
> > b/contrib/libvhost-user/libvhost-user.c
> > index 8bbdf5fb40..47884c0a15 100644
> > --- a/contrib/libvhost-user/libvhost-user.c
> > +++ b/contrib/libvhost-user/libvhost-user.c
> > @@ -213,6 +213,30 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
> > *vmsg)
> >  {
> >  int rc;
> >  uint8_t *p = (uint8_t *)vmsg;
> > +char control[CMSG_SPACE(VHOST_MEMORY_MAX_NREGIONS * sizeof(int))] = { 
> > };
> > +struct iovec iov = {
> > +.iov_base = (char *)vmsg,
> > +.iov_len = VHOST_USER_HDR_SIZE,
> > +};
> > +struct msghdr msg = {
> > +.msg_iov = &iov,
> > +.msg_iovlen = 1,
> > +.msg_control = control,
> > +};
> > +struct cmsghdr *cmsg;
> > +
> > +memset(control, 0, sizeof(control));
> > +if (vmsg->fds) {
> 
> This is going to be always true, right? Check vmsg->fd_num > 0 instead?

Ah yes, thanks.

> I would also add check or assert(vmsg->fd_num <= VHOST_MEMORY_MAX_NREGIONS)

-if (vmsg->fds) {
+assert(vmsg->fd_num <= VHOST_MEMORY_MAX_NREGIONS);
+if (vmsg->fd_num > 0) {

> 
> > +size_t fdsize = vmsg->fd_num * sizeof(int);
> > +msg.msg_controllen = CMSG_SPACE(fdsize);
> > +cmsg = CMSG_FIRSTHDR(&msg);
> > +cmsg->cmsg_len = CMSG_LEN(fdsize);
> > +cmsg->cmsg_level = SOL_SOCKET;
> > +cmsg->cmsg_type = SCM_RIGHTS;
> > +memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
> > +} else {
> > +msg.msg_controllen = 0;
> > +}
> >
> >  /* Set the version in the flags when sending the reply */
> >  vmsg->flags &= ~VHOST_USER_VERSION_MASK;
> > @@ -220,7 +244,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg 
> > *vmsg)
> >  vmsg->flags |= VHOST_USER_REPLY_MASK;
> >
> >  do {
> > -rc = write(conn_fd, p, VHOST_USER_HDR_SIZE);
> > +rc = sendmsg(conn_fd, &msg, 0);
> >  } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
> >
> >  do {
> > @@ -313,6 +337,7 @@ vu_get_features_exec(VuDev *dev, VhostUserMsg *vmsg)
> >  }
> >
> >  vmsg->size = sizeof(vmsg->payload.u64);
> > +vmsg->fd_num = 0;
> >
> >  DPRINT("Sending back to guest u64: 0x%016"PRIx64"\n", 
> > vmsg->payload.u64);
> >
> > @@ -454,6 +479,7 @@ vu_set_log_base_exec(VuDev *dev, VhostUserMsg *vmsg)
> >  dev->log_size = log_mmap_size;
> >
> >  vmsg->size = sizeof(vmsg->payload.u64);
> > +vmsg->fd_num = 0;
> >
> >  return true;
> >  }
> > @@ -698,6 +724,7 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg 
> > *vmsg)
> >
> >  vmsg->payload.u64 = features;
> >  vmsg->size = sizeof(vmsg->payload.u64);
> > +vmsg->fd_num = 0;
> >
> >  return true;
> >  }
> > --
> > 2.13.5
> >
> >
> 
> other than that
> Reviewed-by: Marc-André Lureau 

Thanks.

Dave

> 
> 
> -- 
> Marc-André Lureau
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH

2017-09-07 Thread Halil Pasic



On 09/07/2017 12:24 PM, Cornelia Huck wrote:
> On Thu, 7 Sep 2017 16:58:31 +0800
> Dong Jia Shi  wrote:
> 
>> * Halil Pasic  [2017-09-06 16:43:42 +0200]:
>>
>>>
>>>
>>> On 09/06/2017 04:20 PM, Cornelia Huck wrote:  
 On Wed, 6 Sep 2017 14:25:13 +0200
 Halil Pasic  wrote:
   
> We have basically two possibilities/options which ask for different
> handling:
> 1) EFAULT is due to a bug in the vfio-ccw implementation
> (can be QEMU or kernel).
> 2) EFAULT is due to buggy channel program.
>
> Option 2) is basically to be handled with a channel-program check and
> setting primary secondary and alert status. For reference see PoP page
> 15-59 ("Designation of Storage Area").  An exception may be an invalid
> channel program address in the ORB. There the channel-program check ain't
> explicitly stated (although) I would expect one. It may be implied by the
> things on page 15-59 though.
>
> Option 1) is however very similar to other we have figured out that the
> implementation is broken situations and should be handled consequently.
> The current state of the discussion is with a unit exception.
>
> Does that make sense?  

 I think the situation is slightly different here, though. For the orb
 flags, we reject something out of hand because we have not implemented
 it, and for that, unit exception sounds like a good fit. Processing
 errors, however, are more similar to errors in the hardware, and as
 such can probably be reported via something like equipment check.
   
>>>
>>> Noted. Let's see what Dong Jia has to say, before we continuing a
>>> discussion on something (option 1) what may be irrelevant anyway.
>>>   
>
> Now, Dong Jia, I need your help to figure out do we have option 1 or
> option 2 here? After quick look at the kernel code, it appears to me that
> I've seen both option 1 and option 2 (I'm afraid) -- but my assessment
> was really very superficial.  
>> There are three cases (all in the kernel) that generate a -EFAULT ret
>> code:
>> a. vfio_ccw_mdev_write: copy_from_user() fails.
>>   This is option 1.
>>
>> b. ccwchain_fetch_tic
>>   It's mostly likely that the vfio-ccw driver processed the ccw chains
>>   wrongly. (Actually I can not think of any other reason.)
> 
> Me neither, I'd consider hitting this a bug in the implementation.

Nod.

> 
>>   This is option 1.
>>
>> c. ccwchain_fetch_idal
>>   When we find that an IDAW contents an invalid address
>>   This is option 2.
>>
>
> I would expect option 2 to be handled differently (kernel provides the
> SCSW) though.
> 
> So we would do an equipment check for the first two ("equipment", i.e.
> the software, is malfunctioning) and use a more appropriate way for the
> malformed idaw?
> 

You have probably missed my previous email where I state something very
similar (MID  <2aa8cf98-c331-fe5a-0a7e-1a553c6c5...@linux.vnet.ibm.com>).
There I say: if we change the kernel code, yes, if we don't I prefer a
program check.


Halil

1 2 3 4 >

1 - 100 of 384 matches

Mail list logo