date:20190124

[Qemu-devel] [PATCH v2] ui/egl-helpers: Augment parameter list of egl_texture_blend() to convey scales of viewport

2019-01-24 Thread Chen Zhang via Qemu-devel

From 0be823a56682be57fe0370bb91e4062ec7b47be5 Mon Sep 17 00:00:00 2001
From: Chen Zhang 
Date: Fri, 25 Jan 2019 15:33:08 +0800
Subject: [PATCH v2] ui/egl-helpers: Augment parameter list of
 egl_texture_blend() to convey scales of viewport.

 This helps the gtk-egl display show scaled DMABuf cursor images when
 the GTK window is zoomed. A default scale of (1.0, 1.0) is used at
 call sites where no scaling is needed.

Signed-off-by: Chen Zhang 
---
 include/ui/egl-helpers.h | 2 +-
 ui/egl-headless.c        | 3 ++-
 ui/egl-helpers.c         | 9 +++++----
 ui/gtk-egl.c             | 3 ++-
 ui/spice-display.c       | 2 +-
 5 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/ui/egl-helpers.h b/include/ui/egl-helpers.h
index 3fc656a..b976cb8 100644
--- a/include/ui/egl-helpers.h
+++ b/include/ui/egl-helpers.h
@@ -27,7 +27,7 @@ void egl_fb_read(void *dst, egl_fb *src);
 
 void egl_texture_blit(QemuGLShader *gls, egl_fb *dst, egl_fb *src, bool flip);
 void egl_texture_blend(QemuGLShader *gls, egl_fb *dst, egl_fb *src, bool flip,
-                       int x, int y);
+                       int x, int y, double scale_x, double scale_y);
 
 #ifdef CONFIG_OPENGL_DMABUF
 
diff --git a/ui/egl-headless.c b/ui/egl-headless.c
index 519e7ba..e67b47a 100644
--- a/ui/egl-headless.c
+++ b/ui/egl-headless.c
@@ -142,7 +142,8 @@ static void egl_scanout_flush(DisplayChangeListener *dcl,
         egl_texture_blit(edpy->gls, &edpy->blit_fb, &edpy->guest_fb,
                          !edpy->y_0_top);
         egl_texture_blend(edpy->gls, &edpy->blit_fb, &edpy->cursor_fb,
-                          !edpy->y_0_top, edpy->pos_x, edpy->pos_y);
+                          !edpy->y_0_top, edpy->pos_x, edpy->pos_y,
+                          1.0, 1.0);
     } else {
         /* no cursor -> use simple framebuffer blit */
         egl_fb_blit(&edpy->blit_fb, &edpy->guest_fb, edpy->y_0_top);
diff --git a/ui/egl-helpers.c b/ui/egl-helpers.c
index 5e115b3..e90eef8 100644
--- a/ui/egl-helpers.c
+++ b/ui/egl-helpers.c
@@ -120,14 +120,15 @@ void egl_texture_blit(QemuGLShader *gls, egl_fb *dst, egl_fb *src, bool flip)
 }
 
 void egl_texture_blend(QemuGLShader *gls, egl_fb *dst, egl_fb *src, bool flip,
-                       int x, int y)
+                       int x, int y, double scale_x, double scale_y)
 {
     glBindFramebuffer(GL_FRAMEBUFFER_EXT, dst->framebuffer);
+    int w = scale_x * src->width;
+    int h = scale_y * src->height;
     if (flip) {
-        glViewport(x, y, src->width, src->height);
+        glViewport(x, y, w, h);
     } else {
-        glViewport(x, dst->height - src->height - y,
-                   src->width, src->height);
+        glViewport(x, dst->height - h - y, w, h);
     }
     glEnable(GL_TEXTURE_2D);
     glBindTexture(GL_TEXTURE_2D, src->texture);
diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index afd1714..42801b6 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -278,7 +278,8 @@ void gd_egl_scanout_flush(DisplayChangeListener *dcl,
                          vc->gfx.y0_top);
         egl_texture_blend(vc->gfx.gls, &vc->gfx.win_fb, &vc->gfx.cursor_fb,
                           vc->gfx.y0_top,
-                          vc->gfx.cursor_x, vc->gfx.cursor_y);
+                          vc->gfx.cursor_x, vc->gfx.cursor_y,
+                          vc->gfx.scale_x, vc->gfx.scale_y);
     } else {
         egl_fb_blit(&vc->gfx.win_fb, &vc->gfx.guest_fb, !vc->gfx.y0_top);
     }
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 52f8cb5..aea6f6e 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -1090,7 +1090,7 @@ static void qemu_spice_gl_update(DisplayChangeListener *dcl,
 egl_texture_blit(ssd->gls, &ssd->blit_fb, &ssd->guest_fb,
  !y_0_top);
 egl_texture_blend(ssd->gls, &ssd->blit_fb, &ssd->cursor_fb,
-  !y_0_top, x, y);
+  !y_0_top, x, y, 1.0, 1.0);
 glFlush();
 }
 
-- 
2.7.4
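
A minimal sketch of the viewport arithmetic from the egl-helpers.c hunk,
written as a standalone C function so it can be checked without a GL
context. ViewportRect and blend_viewport() are hypothetical names used
only for illustration; they are not part of QEMU. As in the patch, only
the width and height are scaled, not the x/y position.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the four arguments passed to glViewport(). */
typedef struct {
    int x, y, w, h;
} ViewportRect;

/*
 * Compute the viewport for a src_w x src_h cursor texture blended at
 * (x, y) into a destination that is dst_h pixels tall.
 */
static ViewportRect blend_viewport(int dst_h, int src_w, int src_h,
                                   bool flip, int x, int y,
                                   double scale_x, double scale_y)
{
    ViewportRect r;

    assert(scale_x > 0.0 && scale_y > 0.0);
    /* double -> int conversion truncates toward zero, as in the patch */
    r.w = (int)(scale_x * src_w);
    r.h = (int)(scale_y * src_h);
    r.x = x;
    /* Without flip the GL origin is bottom-left: measure from the bottom. */
    r.y = flip ? y : dst_h - r.h - y;
    return r;
}
```

With a 480-pixel-tall destination, a 32x32 cursor at (10, 20) and a
scale of (2.0, 2.0), the non-flipped case gives a 64x64 viewport whose
y coordinate is 480 - 64 - 20.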

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Markus Armbruster

Jason Wang  writes:

> On 2019/1/24 5:51 PM, Peter Xu wrote:
>> On Thu, Jan 24, 2019 at 09:11:15AM +0000, Dr. David Alan Gilbert wrote:
>>> * Jason Wang (jasow...@redhat.com) wrote:
>>>> On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:
>>>>> * Jason Wang (jasow...@redhat.com) wrote:
>>>>>> On 2019/1/22 2:56 AM, Peter Maydell wrote:
>>>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang wrote:
>>>>>>>> On 2019/1/15 12:33 AM, Zhang Chen wrote:
>>>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>>>> <dgilb...@redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>     * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>>>>>     > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>>>     > intermittently, typically when run on some other TCG architecture.
>>>>>>>>>     > In the instance I've just looked at, this was with s390x guest on
>>>>>>>>>     > x86-64 host, though I've also seen it on other host archs and
>>>>>>>>>     > perhaps with other guests.
>>>>>>>>>
>>>>>>>>>     Watch out to see if you really do see it for other guests;
>>>>>>>>>     it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>>>     uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>>>
>>>>>>>>>     > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>>>     > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>>>     > memory bitmaps like the migration test (which also hangs
>>>>>>>>>     > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>>>     > in the past) ?
>>>>>>>>>
>>>>>>>>>     I don't think it relies on the CPU at all.
>>>>>>>>> I have no idea about this currently, but Jason and I designed the
>>>>>>>>> test case.
>>>>>>>>> Add Jason: Have any comments about this ?
>>>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me the
>>>>>>>> test should be independent to any kinds of emulation. It should pass
>>>>>>>> when mainloop work.
>>>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>>>> indeed not specific to s390x guest (and so not specific to
>>>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>>>
>>>>>>> thanks
>>>>>>> -- PMM
>>>>>> Finally reproduced locally after hundreds (sometimes thousands) times of
>>>>>> running.
>>>>>>
>>>>>> Bisection points to OOB monitor[1].
>>>>>>
>>>>>> It looks to me after OOB is used unconditionally we lose a barrier to make
>>>>>> sure socket is connected before sending packets in test-filter-mirror.c. Is
>>>>>> there any other similar and simple thing that we could do to kick the
>>>>>> mainloop?
>>>>> Do you mean the:
>>>>>
>>>>>   /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>>   qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>>
>>>> Yes.
>>>>
>>>>
>>>>> why was that ever sufficient to know the socket was ready?
>>>>
>>>> It was suggested by Fam, I don't remember the details. Can we make sure all
>>>> pending events has been processed (UNIX socket was set to connected) after
>>>> query-status is returned with an non OOB monitor?
>>> I'm not sure - it doesn't sound like a 'query-status' should ensure
>>> anything else.
>>> How about something like a 'query-chardev' - can that tell you what you
>>> need and loop until it's ready?
>> Yeah it sounds hacky to use "query status" to make sure a specific
>> chardev is connected even before the OOB...
>
>
> Probably, but anyway it works before OOB.

I don't doubt it worked.  Relying on inappropriate assumptions always
works just fine right until the assumptions become invalid :)

[...]

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Markus Armbruster

Jason Wang  writes:

> On 2019/1/24 5:47 PM, Markus Armbruster wrote:
>> Please cc: me on QMP issues.
>
>
> Ok.
>
>
>>
>> Jason Wang  writes:
>>
>>> On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:
>>>> * Jason Wang (jasow...@redhat.com) wrote:
>>>>> On 2019/1/22 2:56 AM, Peter Maydell wrote:
>>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang wrote:
>>>>>>> On 2019/1/15 12:33 AM, Zhang Chen wrote:
>>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>>> <dgilb...@redhat.com> wrote:
>>>>>>>>
>>>>>>>>     * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>>>>     > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>>     > intermittently, typically when run on some other TCG architecture.
>>>>>>>>     > In the instance I've just looked at, this was with s390x guest on
>>>>>>>>     > x86-64 host, though I've also seen it on other host archs and
>>>>>>>>     > perhaps with other guests.
>>>>>>>>
>>>>>>>>     Watch out to see if you really do see it for other guests;
>>>>>>>>     it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>>     uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>>
>>>>>>>>     > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>>     > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>>     > memory bitmaps like the migration test (which also hangs
>>>>>>>>     > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>>     > in the past) ?
>>>>>>>>
>>>>>>>>     I don't think it relies on the CPU at all.
>>>>>>>> I have no idea about this currently, but Jason and I designed the
>>>>>>>> test case.
>>>>>>>> Add Jason: Have any comments about this ?
>>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me the
>>>>>>> test should be independent to any kinds of emulation. It should pass
>>>>>>> when mainloop work.
>>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>>> indeed not specific to s390x guest (and so not specific to
>>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>>
>>>>>> thanks
>>>>>> -- PMM
>>>>> Finally reproduced locally after hundreds (sometimes thousands) times of
>>>>> running.
>>>>>
>>>>> Bisection points to OOB monitor[1].
>>>>>
>>>>> It looks to me after OOB is used unconditionally we lose a barrier to make
>>>>> sure socket is connected before sending packets in test-filter-mirror.c. Is
>>>>> there any other similar and simple thing that we could do to kick the
>>>>> mainloop?
>>>> Do you mean the:
>>>>
>>>>   /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>   qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>
>>> Yes.
>>>
>>>
>>>> why was that ever sufficient to know the socket was ready?
>>>
>>> It was suggested by Fam, I don't remember the details. Can we make
>>> sure all pending events has been processed (UNIX socket was set to
>>> connected) after query-status is returned with an non OOB monitor?
>> I'm afraid I lack context.  Which socket are you talking about?  The
>> test has at least the QMP socket, the send_sock[], and recv_sock.  What
>> exactly are you trying to accomplish?
>
>
> I mean recv_sock. If mirror tries to send a packet to it before its
> is_connected is set to true, packet will be dropped.

So the *socket* is connected (in the TCP sense), but something else
(whatever owns is_connected) is not.  Can you point me to where
is_connected is set to true?

>> By the way, mkstemp(sock_path) followed by unix_connect(sock_path, NULL)
>> looks rather fishy.  Why create a temporary file only to create a Unix
>> domain socket right over it?
>
>
> I vaguely remember passing fd created by unix domain socket doesn't
> work when the test is introduced. So my understanding is the author
> needs a way to create a unique file name which will be used by Unix
> domain socket at that time.

We should really, really, really improve the test harness to run each
test program in its very own temporary directory.  Then tests can simply
create files with fixed names, and leave cleanup to the test harness.

>>   Why is ignoring errors a good idea?
>
>
> I don't get, which error is missed, it checks the return value of both
> mkstemp() and unix_connect().

Now I neglected to provide enough context for you :)

I read

recv_sock = unix_connect(sock_path, NULL);

and immediately went "why are errors ignored".  If I had read on (as I
should've), I would've seen they are not:

g_assert_cmpint(recv_sock, !=, -1);

Sorry for the noise.

I'd replace both lines by

recv_sock = unix_connect(sock_path, &error_abort);

Reports the actual error, which is an obvious improvement, with the
location pointing to the failing spot within unix_connect().  To find
where

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Jason Wang




On 2019/1/24 7:01 PM, Daniel P. Berrangé wrote:
> On Thu, Jan 24, 2019 at 10:30:23AM +0000, Daniel P. Berrangé wrote:
>> On Thu, Jan 24, 2019 at 10:11:55AM +0000, Daniel P. Berrangé wrote:
>>> On Wed, Jan 23, 2019 at 07:53:46PM +0000, Dr. David Alan Gilbert wrote:
>>>> Do you mean the:
>>>>
>>>>  /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>>  qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>>
>>>> why was that ever sufficient to know the socket was ready?
>>>
>>> This doesn't make any sense to me.
>>>
>>> There's the netdev socket, which has been passed in as a pre-opened socket
>>> FD, so that's guaranteed connected.
>>>
>>> There's the chardev server socket, to which we've just done a unix_connect()
>>> call to establish a connection. If unix_connect() has succeeded, then at least
>>> the socket is connected & ready for I/O from the test's side. This is a
>>> reliable stream socket, so even if the test sends data on the socket right away
>>> and QEMU isn't ready, it won't be lost. It'll be buffered and received by QEMU
>>> as soon as QEMU starts to monitor for incoming data on the socket.
>>>
>>> So I don't get what trying to wait for a "connected" state actually achieves.
>>> It feels like a mistaken attempt to paper over some other unknown flaw that
>>> just worked by some lucky side-effect.
>>
>> Immediately after writing that, I see what's happened.
>>
>> The  filter_redirector_receive_iov() method is triggered when QEMU reads
>> from the -netdev socket (which we passed in as an FD and immediately
>> write to).
>>
>> This method will discard all data, however, if the chr_out -chardev is
>> not in a connected state. So we do indeed have a race condition in this
>> test suite.
>>
>> In fact I'd say this filter-mirror object is racy by design even when
>> run in normal usage, if your chardev is a server mode with "nowait" set,
>> or is a client mode with "reconnect" set. It will simply discard data.
>>
>> We can fix the test suite by using FD passing for the -chardev
>> too, so we're guaranteed to be connected immediately.  It might be
>> possible to remove "nowait" flag, but I'm not sure if that will cause
>> problems with the qtest handshake as it might block QEMU at startup
>> preventing qtest handshake from being performed.
>>
>> If we care about the race in real QEMU execution, then we must either
>> document that "nowait" or "reconnect" should never be used with
>> filter-mirror, or perhaps can make use of "qemu_chr_wait_connected"
>> to synchronize startup of the filter-mirror object with the chardev
>> initialization. That could fix the test suite too
>
> Actually using qemu_chr_wait_connected would cause the test suite to
> hang, and it wouldn't fix data loss in the case where the chardev
> disconnected and then waited to connect again.
>
> I think the core problem here is that the netdev code assumes that the
> filters are always able to process packets. A proper solution would
> involve the filters having a "bool ready" state and callback to notify
> the netdev anytime this state changes.
>
> The filter-mirror should *not* report ready until the chardev has been
> opened.
>
> The netdevs should then not read packets off the wire unless all the
> registered filters are reporting that they are ready.

Netdev should know nothing about filters. And there will be still a race 
between iterating all filters and handling disconnection if we did this.

> If a filter then
> transitions to not-ready, the netdev should again stop reading packets
> off the wire & queue any that it might have had in flight, until the
> filter becomes ready again.

I agree to queue the packets in this case.

Thanks

> Without this kind of setup the filters are inherently racy in several
> of the possible -chardev  configurations.
>
> In that sense the flaky test has actually done us a favour showing that
> the code is broken. It is not in fact the test that is broken, and though
> we could workaround it in the test that doesn't fix the root cause problem.
>
> Regards,
> Daniel
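
The "ready state plus queueing" idea discussed in this message can be
sketched as below. This is only an illustration under stated
assumptions: SketchFilter, sketch_filter_send() and
sketch_filter_set_ready() are hypothetical names, packets are reduced to
integer ids, and the queue bound is arbitrary. It is not QEMU's
net/filter API and does not settle where the queue should live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_MAX 16   /* arbitrary bound for this sketch */

typedef struct {
    bool ready;                    /* e.g. the chardev is connected */
    int queued[SKETCH_QUEUE_MAX];  /* packet ids held back while not ready */
    size_t n_queued;
    int delivered;                 /* packets actually passed on */
} SketchFilter;

/*
 * Deliver a packet if the filter is ready, otherwise queue it instead of
 * dropping it. Returns false only when the queue is full, so the caller
 * can stop reading from the wire.
 */
static bool sketch_filter_send(SketchFilter *f, int packet_id)
{
    assert(f != NULL);
    if (f->ready) {
        f->delivered++;
        return true;
    }
    if (f->n_queued == SKETCH_QUEUE_MAX) {
        return false;
    }
    f->queued[f->n_queued++] = packet_id;
    return true;
}

/* State-change callback: flush everything queued once the filter is ready. */
static void sketch_filter_set_ready(SketchFilter *f, bool ready)
{
    assert(f != NULL);
    f->ready = ready;
    if (ready) {
        f->delivered += (int)f->n_queued;
        f->n_queued = 0;
    }
}
```

A packet sent before the ready transition stays queued and is delivered
when sketch_filter_set_ready(f, true) runs, which is the behaviour the
test was implicitly relying on.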

Re: [Qemu-devel] [PATCH v2] tests/vm: move images to $HOME/.cache/qemu-vm/images

2019-01-24 Thread Alex Bennée



Gerd Hoffmann  writes:

> It's easier to move around the images then, by replacing the
> subdirectory with a symlink.  Allows to share the images between
> multiple qemu checkouts for example.
>
> Signed-off-by: Gerd Hoffmann 

Queued to testing/next, thanks.
> ---
>
> Notes:
> v2: use $HOME/.cache/qemu-vm/images as location
>
>  tests/vm/Makefile.include | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> index a98fb3027f..a58383d263 100644
> --- a/tests/vm/Makefile.include
> +++ b/tests/vm/Makefile.include
> @@ -3,7 +3,8 @@
>  .PHONY: vm-build-all vm-clean-all
>
>  IMAGES := ubuntu.i386 freebsd netbsd openbsd centos
> -IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
> +IMAGES_DIR := $(HOME)/.cache/qemu-vm/images
> +IMAGE_FILES := $(patsubst %, $(IMAGES_DIR)/%.img, $(IMAGES))
>
>  .PRECIOUS: $(IMAGE_FILES)
>
> @@ -24,9 +25,10 @@ vm-build-all: $(addprefix vm-build-, $(IMAGES))
>  vm-clean-all:
>   rm -f $(IMAGE_FILES)
>
> -tests/vm/%.img: $(SRC_PATH)/tests/vm/% \
> - $(SRC_PATH)/tests/vm/basevm.py \
> - $(SRC_PATH)/tests/vm/Makefile.include
> +$(IMAGES_DIR)/%.img: $(SRC_PATH)/tests/vm/% \
> + $(SRC_PATH)/tests/vm/basevm.py \
> + $(SRC_PATH)/tests/vm/Makefile.include
> + @mkdir -p $(IMAGES_DIR)
>   $(call quiet-command, \
>   $< \
>   $(if $(V)$(DEBUG), --debug) \
> @@ -37,7 +39,7 @@ tests/vm/%.img: $(SRC_PATH)/tests/vm/% \
>
>
>  # Build in VM $(IMAGE)
> -vm-build-%: tests/vm/%.img
> +vm-build-%: $(IMAGES_DIR)/%.img
>   $(call quiet-command, \
>   $(SRC_PATH)/tests/vm/$* \
>   $(if $(V)$(DEBUG), --debug) \


--
Alex Bennée

Re: [Qemu-devel] [PATCH v3 15/50] audio: reduce glob_audio_state usage

2019-01-24 Thread Gerd Hoffmann

On Thu, Jan 24, 2019 at 09:12:58PM +0100, Zoltán Kővágó wrote:
> On 2019-01-24 12:19, Gerd Hoffmann wrote:
> >   Hi,
> > 
> >> So, I think with the first part the only open issue is whenever we go
> >> with the nested types (i.e. patch #1 as-is) or not.  Given that the
> >> one-element-structs added in that patch will get additional fields I
> >> think the nesting makes sense.
> > 
> > Spoke too soon: scripts/checkpatch.pl flags a bunch of codestyle issues.
> 
> Most of them are about the code style of the old audio subsystem, I
> fixed some of them but not everything.  IIRC last time it wasn't a
> problem, but it was in 2015.  Should I go over them again and fix all of
> them?

The first ones I saw were not old audio code style issues (which are
mostly whitespace-after-function-name) but newly introduced ones.
A few rules have been added since 2015.

Fixing the existing issues due to old audio code style (when
changing/moving code) is fine, but not required.  Newly added code
should follow usual qemu code style.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v5] log: Make glib logging go through QEMU

2019-01-24 Thread Markus Armbruster

"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> Christophe Fergeau  writes:
>> 
>> > This commit adds a qemu_init_logging() helper which calls
>> > g_log_set_default_handler() so that glib logs (g_log, g_warning, ...)
>> > are handled similarly to other QEMU logs. This means they will get a
>> > timestamp if timestamps are enabled, and they will go through the
>> > monitor if one is configured.
>> 
>> s/monitor/HMP monitor/
>> 
>> I see why one would like to extend the timestamp feature to GLib log
>> messages.  Routing them through the HMP monitor is perhaps debatable.
>> Cc: Dave in case he has an opinion.
>
> Yes, it's a little odd; what's wrong with stderr for this type of thing?
> My experience has been that things like spice errors are fairly
> asynchronous rather than directly triggered by commands, so maybe less
> suitable for interleaving in the monitor.

Fortunately, error_printf() & friends print to an HMP monitor only when
the current thread is running in HMP monitor context.

> While stderr and hmp output are normally the same, if someone has
> HMP wired to a script, I'd assume this is more likely to break it.

Programs consuming HMP output are brittle.  While we don't break them
just because we can, we do change HMP output without worrying about them
whenever we feel the change is an improvement for *human* readers.

Another possible concern is messages vanishing from stderr.

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Jason Wang

On 2019/1/24 6:30 PM, Daniel P. Berrangé wrote:
> On Thu, Jan 24, 2019 at 10:11:55AM +0000, Daniel P. Berrangé wrote:
>> On Wed, Jan 23, 2019 at 07:53:46PM +0000, Dr. David Alan Gilbert wrote:
>>> * Jason Wang (jasow...@redhat.com) wrote:
>>>> On 2019/1/22 2:56 AM, Peter Maydell wrote:
>>>>> On Thu, 17 Jan 2019 at 09:46, Jason Wang wrote:
>>>>>> On 2019/1/15 12:33 AM, Zhang Chen wrote:
>>>>>>> On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
>>>>>>> <dgilb...@redhat.com> wrote:
>>>>>>>
>>>>>>>     * Peter Maydell (peter.mayd...@linaro.org) wrote:
>>>>>>>     > Recently I've noticed that test-filter-mirror has been hanging
>>>>>>>     > intermittently, typically when run on some other TCG architecture.
>>>>>>>     > In the instance I've just looked at, this was with s390x guest on
>>>>>>>     > x86-64 host, though I've also seen it on other host archs and
>>>>>>>     > perhaps with other guests.
>>>>>>>
>>>>>>>     Watch out to see if you really do see it for other guests;
>>>>>>>     it carefully avoids using virtio-net to avoid vhost; but on s390x it
>>>>>>>     uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
>>>>>>>
>>>>>>>     > Below is a backtrace, though it seems to be pretty unhelpful.
>>>>>>>     > Anybody got any theories ? Does the mirror test rely on dirty
>>>>>>>     > memory bitmaps like the migration test (which also hangs
>>>>>>>     > occasionally with TCG due to some bug I'm sure we've investigated
>>>>>>>     > in the past) ?
>>>>>>>
>>>>>>>     I don't think it relies on the CPU at all.
>>>>>>> I have no idea about this currently, but Jason and I designed the
>>>>>>> test case.
>>>>>>> Add Jason: Have any comments about this ?
>>>>>>
>>>>>> I can't reproduce this locally with s390x-softmmu. It looks to me the
>>>>>> test should be independent to any kinds of emulation. It should pass
>>>>>> when mainloop work.
>>>>>
>>>>> I've just seen a hang with ppc64 guest on s390x host, so it is
>>>>> indeed not specific to s390x guest (and so not specific to
>>>>> virtio-net either, since the ppc64 guest setup uses e1000).
>>>>>
>>>>> thanks
>>>>> -- PMM
>>>>
>>>> Finally reproduced locally after hundreds (sometimes thousands) times of
>>>> running.
>>>>
>>>> Bisection points to OOB monitor[1].
>>>>
>>>> It looks to me after OOB is used unconditionally we lose a barrier to make
>>>> sure socket is connected before sending packets in test-filter-mirror.c. Is
>>>> there any other similar and simple thing that we could do to kick the
>>>> mainloop?
>>>
>>> Do you mean the:
>>>
>>>  /* send a qmp command to guarantee that 'connected' is setting to true. */
>>>  qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
>>>
>>> why was that ever sufficient to know the socket was ready?
>>
>> This doesn't make any sense to me.
>>
>> There's the netdev socket, which has been passed in as a pre-opened socket
>> FD, so that's guaranteed connected.
>>
>> There's the chardev server socket, to which we've just done a unix_connect()
>> call to establish a connection. If unix_connect() has succeeded, then at least
>> the socket is connected & ready for I/O from the test's side. This is a
>> reliable stream socket, so even if the test sends data on the socket right away
>> and QEMU isn't ready, it won't be lost. It'll be buffered and received by QEMU
>> as soon as QEMU starts to monitor for incoming data on the socket.
>>
>> So I don't get what trying to wait for a "connected" state actually achieves.
>> It feels like a mistaken attempt to paper over some other unknown flaw that
>> just worked by some lucky side-effect.
>
> Immediately after writing that, I see what's happened.
>
> The  filter_redirector_receive_iov() method is triggered when QEMU reads
> from the -netdev socket (which we passed in as an FD and immediately
> write to).
>
> This method will discard all data, however, if the chr_out -chardev is
> not in a connected state. So we do indeed have a race condition in this
> test suite.
>
> In fact I'd say this filter-mirror object is racy by design even when
> run in normal usage, if your chardev is a server mode with "nowait" set,
> or is a client mode with "reconnect" set. It will simply discard data.

Does this issue exist only in the case of mirror? It looks to me some 
other user of chardev has the same assumption. They neither wait for the 
socket to be connected nor process the CHR_EVENT_OPEN.

> We can fix the test suite by using FD passing for the -chardev
> too, so we're guaranteed to be connected immediately.

Good to know this, I remember when the case is introduced this doesn't 
work. Will post a fix shortly.

> It might be
> possible to remove "nowait" flag, but I'm not sure if that will cause
> problems with the qtest handshake as it might block QEMU at startup
> preventing qtest handshake from being performed.

Yes, nowait doesn't work qtest wait for qemu in this case.

> If we care about the race in real QEMU execution, then we must either
> document that "nowait" or "reconnect" should never be used with
> filter-mirror, or perhaps can make use of "qemu_chr_wait_connected"
> to synchronize startup of the filter-mirror object with the chardev
> initialization. That could fix the test suite too

From my point of view, the issue is tcp_chr_write() drop packet 
silently. If it

Re: [Qemu-devel] [PATCH RFC 8/9] tests: Add OpenBSD image

2019-01-24 Thread Thomas Huth

On 2019-01-25 01:48, Brad Smith wrote:
> On 1/24/2019 11:52 AM, Daniel P. Berrangé wrote:
> 
>> On Thu, Jan 24, 2019 at 05:10:19PM +0100, Philippe Mathieu-Daudé wrote:
>>> On 1/24/19 4:56 PM, Kamil Rytarowski wrote:
>>>> On 24.01.2019 16:52, Philippe Mathieu-Daudé wrote:
>>>>> On 8/16/17 9:21 AM, Fam Zheng wrote:
>>>>>> The image is prepared following instructions as in:
>>>>>>
>>>>>> https://wiki.qemu.org/Hosts/BSD
>>>>>>
>>>>>> Signed-off-by: Fam Zheng
>>>>>> ---
>>>>>>   tests/vm/openbsd | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>>>>>   1 file changed, 45 insertions(+)
>>>>>>   create mode 100755 tests/vm/openbsd
>>>>>>
>>>>>> diff --git a/tests/vm/openbsd b/tests/vm/openbsd
>>>>>> new file mode 100755
>>>>>> index 0000000000..d37ff83a59
>>>>>> --- /dev/null
>>>>>> +++ b/tests/vm/openbsd
>>>>>> @@ -0,0 +1,45 @@
>>>>>> +#!/usr/bin/env python
>>>>>> +#
>>>>>> +# OpenBSD VM image
>>>>>> +#
>>>>>> +# Copyright (C) 2017 Red Hat Inc.
>>>>>> +#
>>>>>> +# Authors:
>>>>>> +#  Fam Zheng
>>>>>> +#
>>>>>> +# This work is licensed under the terms of the GNU GPL, version
>>>>>> 2.  See
>>>>>> +# the COPYING file in the top-level directory.
>>>>>> +#
>>>>>> +
>>>>>> +import os
>>>>>> +import sys
>>>>>> +import logging
>>>>>> +import subprocess
>>>>>> +import tempfile
>>>>>> +import time
>>>>>> +import basevm
>>>>>> +
>>>>>> +class OpenBSDVM(basevm.BaseVM):
>>>>>> +    name = "openbsd"
>>>>>> +    BUILD_SCRIPT = """
>>>>>> +    set -e;
>>>>>> +    cd $(mktemp -d /var/tmp/qemu-test.XXXXXX);
>>>>>> +    tar -xf /dev/rsd1c;
>>>>>> +    ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4
>>>>>> --python=python2.7 {configure_opts};
>>>>>> +    gmake -j{jobs};
>>>>>> +    # XXX: "gmake check" seems to always hang or fail
>>>>>> +    #gmake check;
>>>>> OK, Now it makes more sense...
>>>>>
>>>>> After spending various hours trying to fix various issues on
>>>>> OpenBSD, I
>>>>> notice that we never ran tests on this OS.
>>>>> The only binary I can run is qemu-img, the rest seems useless.
>>>>> I'll summarize in a different thread.
>>>>>
>>>> Is this W^X related?
>>> Part of it could be but I'm not sure.
>>>
>>> The 6.1 VM provided by Fam has /usr/local mounted with wxallowed, I
>>> tried building/running there and nothing changed, mmap() still returns
>>> ENOTSUP:
>> ENOTSUP from mmap is certainly what you'd expect from the W^X  scenario
>>
>>    https://undeadly.org/cgi?action=article&sid=20160527203200
>>
>>   "W^X violations are no longer permitted by default.  A kernel log
>> message
>>    is generated, and mprotect/mmap return ENOTSUP.  If the sysctl(8) flag
>>    kern.wxabort is set then a SIGABRT occurs instead, for gdb use or
>> coredump
>>    creation."
> 
> Yes, this policy change was introduced with 6.0.
> 
> Our ports tree has an option which results in the QEMU binaries being
> linked with "-z wxneeded".

Then it's maybe high time to send such changes upstream now ;-)

 Thomas
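
A minimal probe for the W^X behaviour described above: it asks mmap()
for one anonymous page that is writable and executable at the same time
and reports the errno if that is refused. probe_wx_mapping() is a
hypothetical helper written for this illustration, not QEMU code. On a
system that enforces W^X for the running binary, the quoted OpenBSD
announcement says to expect ENOTSUP; elsewhere the call usually succeeds.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANON on glibc in strict ISO C modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Try to map len bytes that are both writable and executable.
 * Returns 0 on success, otherwise the errno value set by mmap().
 */
static int probe_wx_mapping(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
             MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return errno;   /* expected to be ENOTSUP under enforced W^X */
    }
    munmap(p, len);
    return 0;
}
```

Running such a probe from a filesystem mounted with wxallowed, with and
without "-z wxneeded" at link time, is one way to check which of the two
settings a given build is actually missing.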

Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-24 Thread Thomas Huth

On 2019-01-25 07:01, Thomas Huth wrote:
> On 2019-01-24 18:23, Stefano Garzarella wrote:
>> If the WRITE_ZEROES feature is enabled, we check this
>> command in the test_basic().
>>
>> Signed-off-by: Stefano Garzarella 
>> ---
>>  tests/virtio-blk-test.c | 63 +
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
>> index 04c608764b..8cabbcb85a 100644
>> --- a/tests/virtio-blk-test.c
>> +++ b/tests/virtio-blk-test.c
>> @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, 
>> QGuestAllocator *alloc,
>>  
>>  guest_free(alloc, req_addr);
>>  
>> +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
>> +struct virtio_blk_discard_write_zeroes *dwz_hdr;
>> +void *expected;
>> +
>> +/*
>> + * WRITE_ZEROES request on the same sector of previous test where
>> + * we wrote "TEST".
>> + */
>> +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
>> +req.data = g_malloc0(512);
> 
> Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> something similar here, to see whether zeroes or 0xaa is written?

Ah, never mind, I thought req.data would be a sector buffer here, but
looking at the lines below, it apparently is something different.

Why do you allocate 512 bytes here? I'd rather expect
g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
then you could also use a local "struct virtio_blk_discard_write_zeroes
dwz_hdr" variable instead of a pointer, and drop the g_malloc0() completely?

>> +dwz_hdr = (struct virtio_blk_discard_write_zeroes *)req.data;
>> +dwz_hdr->sector = 0;
>> +dwz_hdr->num_sectors = 1;
>> +dwz_hdr->flags = 0;
>> +
>> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
>> +
>> +g_free(req.data);
>> +
>> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
>> +qvirtqueue_add(vq, req_addr + 16, 512, false, true);
>> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
>> +
>> +qvirtqueue_kick(dev, vq, free_head);
>> +
>> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
>> +   QVIRTIO_BLK_TIMEOUT_US);
>> +status = readb(req_addr + 528);
>> +g_assert_cmpint(status, ==, 0);
>> +
>> +guest_free(alloc, req_addr);
>> +
>> +/* Read request to check if the sector contains all zeroes */
>> +req.type = VIRTIO_BLK_T_IN;
>> +req.ioprio = 1;
>> +req.sector = 0;
>> +req.data = g_malloc0(512);
>> +
>> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
>> +
>> +g_free(req.data);
>> +
>> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
>> +qvirtqueue_add(vq, req_addr + 16, 512, true, true);
>> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
>> +
>> +qvirtqueue_kick(dev, vq, free_head);
>> +
>> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
>> +   QVIRTIO_BLK_TIMEOUT_US);
>> +status = readb(req_addr + 528);
>> +g_assert_cmpint(status, ==, 0);
>> +
>> +data = g_malloc(512);
>> +expected = g_malloc0(512);
>> +memread(req_addr + 16, data, 512);
>> +g_assert_cmpmem(data, 512, expected, 512);
>> +g_free(expected);
>> +g_free(data);
>> +
>> +guest_free(alloc, req_addr);
>> +}
>> +
>>  if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
>>  /* Write and read with 2 descriptor layout */
>>  /* Write request */
>>
> 
>
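
The suggestion above (a local header variable instead of a 512-byte
heap buffer) can be sketched roughly as follows. The struct here is a
hypothetical stand-in that mirrors only the three fields set in the
patch; it is not the real virtio header definition, and the sketch
ignores the little-endian conversion a real virtio request would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields used in the quoted patch. */
struct sketch_discard_write_zeroes {
    uint64_t sector;
    uint32_t num_sectors;
    uint32_t flags;
};

/*
 * Build the request payload from a stack-allocated header and copy it
 * into a caller-provided buffer. Returns the number of bytes written.
 */
static size_t sketch_fill_write_zeroes(void *buf, size_t buf_len,
                                       uint64_t sector, uint32_t num_sectors)
{
    struct sketch_discard_write_zeroes hdr = {
        .sector = sector,
        .num_sectors = num_sectors,
        .flags = 0,
    };

    assert(buf_len >= sizeof(hdr));
    memcpy(buf, &hdr, sizeof(hdr));   /* no g_malloc0()/g_free() needed */
    return sizeof(hdr);
}
```

The payload is then sizeof(hdr) bytes rather than 512, so the descriptor
lengths and the status-byte offset in the test would have to be derived
from that size instead of being hard-coded.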

Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-24 Thread Thomas Huth

On 2019-01-25 07:01, Thomas Huth wrote:
> On 2019-01-24 18:23, Stefano Garzarella wrote:
>> If the WRITE_ZEROES feature is enabled, we check this
>> command in the test_basic().
>>
>> Signed-off-by: Stefano Garzarella 
>> ---
>>  tests/virtio-blk-test.c | 63 +
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
>> index 04c608764b..8cabbcb85a 100644
>> --- a/tests/virtio-blk-test.c
>> +++ b/tests/virtio-blk-test.c
>> @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, 
>> QGuestAllocator *alloc,
>>  
>>  guest_free(alloc, req_addr);
>>  
>> +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
>> +struct virtio_blk_discard_write_zeroes *dwz_hdr;
>> +void *expected;
>> +
>> +/*
>> + * WRITE_ZEROES request on the same sector of previous test where
>> + * we wrote "TEST".
>> + */
>> +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
>> +req.data = g_malloc0(512);
> 
> Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> something similar here, to see whether zeroes or 0xaa is written?

Ah, never mind, I thought req.data would be a sector buffer here, but
looking at the lines below, it apparently is something different.

Why do you allocate 512 bytes here? I'd rather expect
g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
then you could also use a local "struct virtio_blk_discard_write_zeroes
dwz_hdr" variable instead of a pointer, and drop the g_malloc0() completely?
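
A minimal sketch of the suggestion above, assuming the header layout given in the virtio specification (64-bit sector, 32-bit num_sectors, 32-bit flags). The struct dwz_hdr_sketch and the helper make_dwz_hdr() are hypothetical stand-ins, not the QEMU test code, and endianness conversion is ignored:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct virtio_blk_discard_write_zeroes. */
struct dwz_hdr_sketch {
    uint64_t sector;
    uint32_t num_sectors;
    uint32_t flags;
};

/* Build the header by value: no g_malloc0()/g_free() pair is needed. */
static struct dwz_hdr_sketch make_dwz_hdr(uint64_t sector,
                                          uint32_t num_sectors)
{
    struct dwz_hdr_sketch hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.sector = sector;
    hdr.num_sectors = num_sectors;
    hdr.flags = 0;
    return hdr;
}
```

With a local header like this, the payload would be sizeof(hdr) bytes (16 in this sketch) rather than 512, so the descriptor lengths and the status offset in the test would probably need to change to match.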

>> +dwz_hdr = (struct virtio_blk_discard_write_zeroes *)req.data;
>> +dwz_hdr->sector = 0;
>> +dwz_hdr->num_sectors = 1;
>> +dwz_hdr->flags = 0;
>> +
>> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
>> +
>> +g_free(req.data);
>> +
>> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
>> +qvirtqueue_add(vq, req_addr + 16, 512, false, true);
>> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
>> +
>> +qvirtqueue_kick(dev, vq, free_head);
>> +
>> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
>> +   QVIRTIO_BLK_TIMEOUT_US);
>> +status = readb(req_addr + 528);
>> +g_assert_cmpint(status, ==, 0);
>> +
>> +guest_free(alloc, req_addr);
>> +
>> +/* Read request to check if the sector contains all zeroes */
>> +req.type = VIRTIO_BLK_T_IN;
>> +req.ioprio = 1;
>> +req.sector = 0;
>> +req.data = g_malloc0(512);
>> +
>> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
>> +
>> +g_free(req.data);
>> +
>> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
>> +qvirtqueue_add(vq, req_addr + 16, 512, true, true);
>> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
>> +
>> +qvirtqueue_kick(dev, vq, free_head);
>> +
>> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
>> +   QVIRTIO_BLK_TIMEOUT_US);
>> +status = readb(req_addr + 528);
>> +g_assert_cmpint(status, ==, 0);
>> +
>> +data = g_malloc(512);
>> +expected = g_malloc0(512);
>> +memread(req_addr + 16, data, 512);
>> +g_assert_cmpmem(data, 512, expected, 512);
>> +g_free(expected);
>> +g_free(data);
>> +
>> +guest_free(alloc, req_addr);
>> +}
>> +
>>  if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
>>  /* Write and read with 2 descriptor layout */
>>  /* Write request */
>>
> 
>

[Qemu-devel] [PATCH 1/3] hw/i386/pc.c: remove unused function pc_acpi_init()

2019-01-24 Thread Wei Yang

Function pc_acpi_init() is no longer used anywhere.

Remove the definition and declaration.

Signed-off-by: Wei Yang 
---
 hw/i386/pc.c | 27 ---
 include/hw/i386/pc.h |  1 -
 2 files changed, 28 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5317e08f60..734d3268fa 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1280,33 +1280,6 @@ void pc_pci_as_mapping_init(Object *owner, MemoryRegion *system_memory,
 pci_address_space, -1);
 }
 
-void pc_acpi_init(const char *default_dsdt)
-{
-char *filename;
-
-if (acpi_tables != NULL) {
-/* manually set via -acpitable, leave it alone */
-return;
-}
-
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, default_dsdt);
-if (filename == NULL) {
-warn_report("failed to find %s", default_dsdt);
-} else {
-QemuOpts *opts = qemu_opts_create(qemu_find_opts("acpi"), NULL, 0,
-  &error_abort);
-Error *err = NULL;
-
-qemu_opt_set(opts, "file", filename, &error_abort);
-
-acpi_table_add_builtin(opts, &err);
-if (err) {
-warn_reportf_err(err, "failed to load %s: ", filename);
-}
-g_free(filename);
-}
-}
-
 void xen_load_linux(PCMachineState *pcms)
 {
 int i;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9d29c4b1df..541124ba6d 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -187,7 +187,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
 void pc_cpus_init(PCMachineState *pcms);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
-void pc_acpi_init(const char *default_dsdt);
 
 void pc_guest_info_init(PCMachineState *pcms);
 
-- 
2.19.1

[Qemu-devel] [PATCH 2/3] hw/acpi: remove unused function acpi_table_add_builtin()

2019-01-24 Thread Wei Yang

Function acpi_table_add_builtin() is no longer used anywhere.

Remove the definition and declaration.

Signed-off-by: Wei Yang 
---
 hw/acpi/core.c | 6 --
 include/hw/acpi/acpi.h | 1 -
 2 files changed, 7 deletions(-)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index d6f0709691..e9b1a85e54 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -305,12 +305,6 @@ out:
 
 static bool acpi_table_builtin = false;
 
-void acpi_table_add_builtin(const QemuOpts *opts, Error **errp)
-{
-acpi_table_builtin = true;
-acpi_table_add(opts, errp);
-}
-
 unsigned acpi_table_len(void *current)
 {
 struct acpi_table_header *hdr = current - sizeof(hdr->_length);
diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h
index c20ace0d0b..4a8bbaf1b5 100644
--- a/include/hw/acpi/acpi.h
+++ b/include/hw/acpi/acpi.h
@@ -190,7 +190,6 @@ uint8_t *acpi_table_first(void);
 uint8_t *acpi_table_next(uint8_t *current);
 unsigned acpi_table_len(void *current);
 void acpi_table_add(const QemuOpts *opts, Error **errp);
-void acpi_table_add_builtin(const QemuOpts *opts, Error **errp);
 
 typedef struct AcpiSlicOem AcpiSlicOem;
 struct AcpiSlicOem {
-- 
2.19.1

[Qemu-devel] [PATCH 3/3] hw/acpi: remove unnecessary variable acpi_table_builtin

2019-01-24 Thread Wei Yang

acpi_table_builtin is now always false, so there is no need to check it
any more.

This patch removes the variable and the check.

Signed-off-by: Wei Yang 
---
 hw/acpi/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index e9b1a85e54..f9c96535d1 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -303,8 +303,6 @@ out:
 error_propagate(errp, err);
 }
 
-static bool acpi_table_builtin = false;
-
 unsigned acpi_table_len(void *current)
 {
 struct acpi_table_header *hdr = current - sizeof(hdr->_length);
@@ -320,7 +318,7 @@ void *acpi_table_hdr(void *h)
 
 uint8_t *acpi_table_first(void)
 {
-if (acpi_table_builtin || !acpi_tables) {
+if (!acpi_tables) {
 return NULL;
 }
 return acpi_table_hdr(acpi_tables + ACPI_TABLE_PFX_SIZE);
-- 
2.19.1

Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-24 Thread Thomas Huth

On 2019-01-24 18:23, Stefano Garzarella wrote:
> If the WRITE_ZEROES feature is enabled, we check this
> command in the test_basic().
> 
> Signed-off-by: Stefano Garzarella 
> ---
>  tests/virtio-blk-test.c | 63 +
>  1 file changed, 63 insertions(+)
> 
> diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
> index 04c608764b..8cabbcb85a 100644
> --- a/tests/virtio-blk-test.c
> +++ b/tests/virtio-blk-test.c
> @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
>  
>  guest_free(alloc, req_addr);
>  
> +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
> +struct virtio_blk_discard_write_zeroes *dwz_hdr;
> +void *expected;
> +
> +/*
> + * WRITE_ZEROES request on the same sector of previous test where
> + * we wrote "TEST".
> + */
> +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
> +req.data = g_malloc0(512);

Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
something similar here, to see whether zeroes or 0xaa is written?

> +dwz_hdr = (struct virtio_blk_discard_write_zeroes *)req.data;
> +dwz_hdr->sector = 0;
> +dwz_hdr->num_sectors = 1;
> +dwz_hdr->flags = 0;
> +
> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
> +
> +g_free(req.data);
> +
> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
> +qvirtqueue_add(vq, req_addr + 16, 512, false, true);
> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
> +
> +qvirtqueue_kick(dev, vq, free_head);
> +
> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
> +   QVIRTIO_BLK_TIMEOUT_US);
> +status = readb(req_addr + 528);
> +g_assert_cmpint(status, ==, 0);
> +
> +guest_free(alloc, req_addr);
> +
> +/* Read request to check if the sector contains all zeroes */
> +req.type = VIRTIO_BLK_T_IN;
> +req.ioprio = 1;
> +req.sector = 0;
> +req.data = g_malloc0(512);
> +
> +req_addr = virtio_blk_request(alloc, dev, &req, 512);
> +
> +g_free(req.data);
> +
> +free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
> +qvirtqueue_add(vq, req_addr + 16, 512, true, true);
> +qvirtqueue_add(vq, req_addr + 528, 1, true, false);
> +
> +qvirtqueue_kick(dev, vq, free_head);
> +
> +qvirtio_wait_used_elem(dev, vq, free_head, NULL,
> +   QVIRTIO_BLK_TIMEOUT_US);
> +status = readb(req_addr + 528);
> +g_assert_cmpint(status, ==, 0);
> +
> +data = g_malloc(512);
> +expected = g_malloc0(512);
> +memread(req_addr + 16, data, 512);
> +g_assert_cmpmem(data, 512, expected, 512);
> +g_free(expected);
> +g_free(data);
> +
> +guest_free(alloc, req_addr);
> +}
> +
>  if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
>  /* Write and read with 2 descriptor layout */
>  /* Write request */
>

Re: [Qemu-devel] [PATCH v6 00/10] hw/m68k: add Apple Machintosh Quadra 800 machine

2019-01-24 Thread Thomas Huth

On 2019-01-24 18:37, Mark Cave-Ayland wrote:
> On 24/01/2019 17:15, Laurent Vivier wrote:
> 
>> On 24/01/2019 18:02, Thomas Huth wrote:
>>> On 2018-11-02 16:22, Mark Cave-Ayland wrote:
>>>> (MCA: here's the latest version of the q800 patchset. I hope that I've
>>>> addressed most of the comments, plus this will now boot into the Debian
>>>> installer correctly when applied to git master.
>>>
>>> Any update on this series? Why did it get stalled again?
>>>
>>
>> I was thinking about this today.
>>
>> Mark, perhaps you can send a rebased version of the series?
>>
>> I think we need reviews for "esp: add pseudo-DMA as used by Macintosh".

I just gave it a quick review (see my separate mail) ... just a few nits,
but all in all, it looks quite good to me.

> 1) Do we mind some of the more verbose comments that were taken from the Linux
> headers in some of the files? (I can also see that updates to the comment checking in
> checkpatch.pl now cause the series to fail with style issues, so these will need to
> be touched up regardless)

I personally would clean them up, but that's just a matter of taste. I
think that's nothing that should hold up this series. It can also be
done with a patch on top afterwards.

> 2) Do we need to add migration support for the ESP pseudo-DMA?

That would be cleaner, of course. But since the q800 likely can't be
migrated anyway, it's not that important. So if it is too much of a
hassle to add it, maybe simply add a comment next to the vmstate section
saying "/* TODO: Add migration support for the pdma data */" ?

 Thomas

Re: [Qemu-devel] [PATCH v6 05/10] esp: add pseudo-DMA as used by Macintosh

2019-01-24 Thread Thomas Huth

On 2018-11-02 16:22, Mark Cave-Ayland wrote:
> From: Laurent Vivier 

I'd suggest adding a patch description that contains the text that
Laurent provided as a reply to this patch in v5:

 8< --
There is no DMA in Quadra 800, so the CPU reads/writes the data from the
PDMA register (offset 0x100, ESP_PDMA in hw/m68k/q800.c) and copies them
to/from the memory.

There is a nice assembly loop in the kernel to do that, see
linux/drivers/scsi/mac_esp.c:MAC_ESP_PDMA_LOOP().

The start of the transfer is triggered by the DREQ interrupt (see linux
mac_esp_send_pdma_cmd()), the CPU polls on the IRQ flag to start the
transfer after a SCSI command has been sent (in Quadra 800 it goes
through the VIA2, the via2-irq line and the vIFR register)

The Macintosh hardware includes hardware handshaking to prevent the CPU
from reading invalid data or writing data faster than the peripheral
device can accept it.

This is the "blind mode", and from the doc:
"Approximate maximum SCSI transfer rates within a blocks are 1.4 MB per
second for blind transfers in the Macintosh II"

Some references can be found in:
  Apple Macintosh Family Hardware Reference, ISBN 0-201-19255-1
  Guide to the Macintosh Family Hardware, ISBN-0-201-52405-8
 >8 --

?

> Co-developed-by: Mark Cave-Ayland 
> Signed-off-by: Mark Cave-Ayland 
> Signed-off-by: Laurent Vivier 
> ---
>  hw/scsi/esp.c | 291 +-
>  include/hw/scsi/esp.h |   7 ++
>  2 files changed, 269 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
> index 630d923623..8e9e27e479 100644
> --- a/hw/scsi/esp.c
> +++ b/hw/scsi/esp.c
[...]
> @@ -356,8 +511,7 @@ static void handle_ti(ESPState *s)
>  s->dma_left = minlen;
>  s->rregs[ESP_RSTAT] &= ~STAT_TC;
>  esp_do_dma(s);
> -}
> -if (s->do_cmd) {
> +} else if (s->do_cmd) {

I'm not sure about this change... is it required? It could also change
the behavior of the other users of this device...?

>  trace_esp_handle_ti_cmd(s->cmdlen);
>  s->ti_size = 0;
>  s->cmdlen = 0;
> @@ -384,6 +538,7 @@ void esp_hard_reset(ESPState *s)
>  static void esp_soft_reset(ESPState *s)
>  {
>  qemu_irq_lower(s->irq);
> +qemu_irq_lower(s->irq_data);
>  esp_hard_reset(s);
>  }
>  
> @@ -619,6 +774,80 @@ static const MemoryRegionOps sysbus_esp_mem_ops = {
>  .valid.accepts = esp_mem_accepts,
>  };
>  
> +static void sysbus_esp_pdma_write(void *opaque, hwaddr addr,
> +  uint64_t val, unsigned int size)
> +{
> +SysBusESPState *sysbus = opaque;
> +ESPState *s = &sysbus->esp;
> +uint32_t dmalen;
> +
> +dmalen = s->rregs[ESP_TCLO];
> +dmalen |= s->rregs[ESP_TCMID] << 8;
> +dmalen |= s->rregs[ESP_TCHI] << 16;
> +if (dmalen == 0 || s->pdma_len == 0) {
> +return;
> +}
> +switch (size) {
> +case 1:
> +*s->pdma_cur++ = val;
> +s->pdma_len--;
> +dmalen--;
> +break;
> +case 2:
> +*s->pdma_cur++ = val >> 8;
> +*s->pdma_cur++ = val;

Is there any chance that we could end up here with pdma_len == 1 or
dmalen == 1 ? If yes, the following two lines will trigger a wrap-around
with likely very bad side effects in the future...

Maybe assert(s->pdma_len >= 2 && dmalen >= 2) at least?
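
To illustrate the concern, here is a small self-contained sketch (not the device code). It assumes the counters are unsigned 32-bit values, as dmalen is in the patch; pdma_consume() and unchecked_sub() are hypothetical helpers introduced only for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: subtract 'size' bytes from an unsigned counter only
 * if that many bytes remain. Returns false, leaving the counter untouched,
 * when the subtraction would wrap around.
 */
static bool pdma_consume(uint32_t *remaining, uint32_t size)
{
    if (*remaining < size) {
        return false;
    }
    *remaining -= size;
    return true;
}

/* Without the check, 1 - 2 on an unsigned counter wraps to UINT32_MAX. */
static uint32_t unchecked_sub(uint32_t remaining, uint32_t size)
{
    return remaining - size;
}
```

Whether the device should assert, ignore the access, or transfer only the one remaining byte is a design decision for the patch author; the sketch only shows why a 2-byte access with one byte left needs some kind of guard.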

> +s->pdma_len -= 2;
> +dmalen -= 2;
> +break;
> +}
> +s->rregs[ESP_TCLO] = dmalen & 0xff;
> +s->rregs[ESP_TCMID] = dmalen >> 8;
> +s->rregs[ESP_TCHI] = dmalen >> 16;
> +if (s->pdma_len == 0 && s->pdma_cb) {
> +esp_lower_drq(s);
> +s->pdma_cb(s);
> +s->pdma_cb = NULL;
> +}
> +}
> +
> +static uint64_t sysbus_esp_pdma_read(void *opaque, hwaddr addr,
> + unsigned int size)
> +{
> +SysBusESPState *sysbus = opaque;
> +ESPState *s = &sysbus->esp;
> +uint64_t val = 0;
> +
> +if (s->pdma_len == 0) {
> +return 0;
> +}
> +switch (size) {
> +case 1:
> +val = *s->pdma_cur++;
> +s->pdma_len--;
> +break;
> +case 2:

assert(s->pdma_len >= 2) ?

> +val = *s->pdma_cur++;
> +val = (val << 8) | *s->pdma_cur++;
> +s->pdma_len -= 2;
> +break;
> +}
> +
> +if (s->pdma_len == 0 && s->pdma_cb) {
> +esp_lower_drq(s);
> +s->pdma_cb(s);
> +s->pdma_cb = NULL;
> +}
> +return val;
> +}
> +
> +static const MemoryRegionOps sysbus_esp_pdma_ops = {
> +.read = sysbus_esp_pdma_read,
> +.write = sysbus_esp_pdma_write,
> +.endianness = DEVICE_NATIVE_ENDIAN,
> +.valid.min_access_size = 1,
> +.valid.max_access_size = 2,
> +};
> +
>  static const struct SCSIBusInfo esp_scsi_info = {
>  .tcq = false,
>  .max_target = ESP_MAX_DEVS,
> @@ -651,12 +880,16 @@ static void

Re: [Qemu-devel] [RFC PATCH v3 7/7] target/ppc: support single stepping with KVM HV

2019-01-24 Thread Alexey Kardashevskiy




On 19/01/2019 01:07, Fabiano Rosas wrote:
> The hardware singlestep mechanism in POWER works via a Trace Interrupt
> (0xd00) that happens after any instruction executes, whenever MSR_SE =
> 1 (PowerISA Section 6.5.15 - Trace Interrupt).
> 
> However, with kvm_hv, the Trace Interrupt happens inside the guest and
> KVM has no visibility of it. Therefore, when the gdbstub uses the
> KVM_SET_GUEST_DEBUG ioctl to enable singlestep, KVM simply ignores it.
> 
> This patch takes advantage of the Trace Interrupt to perform the step
> inside the guest, but uses a breakpoint at the Trace Interrupt handler
> to return control to KVM. The exit is treated by KVM as a regular
> breakpoint and it returns to the host (and QEMU eventually).
> 
> Before signalling GDB, QEMU sets the Next Instruction Pointer to the
> instruction following the one being stepped and restores the MSR,
> SRR0, SRR1 values from before the step, effectively skipping the
> interrupt handler execution and hiding the trace interrupt breakpoint
> from GDB.
> 
> This approach works with both of GDB's 'scheduler-locking' options
> (off, step).
> 
> Note:
> 
> - kvm_arch_set_singlestep happens after GDB asks for a single step,
>   while the vcpus are stopped.
> 
> - kvm_handle_singlestep executes after the step, during the handling
>   of the Emulation Assist Interrupt (breakpoint).


Good job! A few comments below.


> 
> Signed-off-by: Fabiano Rosas 
> ---
>  target/ppc/cpu.h |   5 ++
>  target/ppc/kvm.c | 180 +--
>  2 files changed, 178 insertions(+), 7 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 2185ef5e67..c7320c908e 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1157,6 +1157,11 @@ struct CPUPPCState {
>  uint32_t tm_vscr;
>  uint64_t tm_dscr;
>  uint64_t tm_tar;
> +
> +/* Used for software single step */
> +target_ulong sstep_msr;
> +target_ulong sstep_srr0;
> +target_ulong sstep_srr1;
>  };
>  
>  #define SET_FIT_PERIOD(a_, b_, c_, d_)  \
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index c27190d7fb..880597a4a6 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -1555,6 +1555,68 @@ void kvm_arch_remove_all_hw_breakpoints(void)
>  nb_hw_breakpoint = nb_hw_watchpoint = 0;
>  }
>  
> +void kvm_arch_set_singlestep(CPUState *cs, int enabled)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = &cpu->env;
> +target_ulong trace_handler_addr;
> +uint32_t insn;
> +
> +if (enabled) {

if (!enabled) {
return;
}

and reduce indent?


> +cpu_synchronize_state(cs);
> +
> +/*
> + * Save the registers that will be affected by the single step
> + * mechanism. These will be restored after the step at
> + * kvm_handle_singlestep.
> + */
> +env->sstep_msr = env->msr;
> +env->sstep_srr0 = env->spr[SPR_SRR0];
> +env->sstep_srr1 = env->spr[SPR_SRR1];
> +
> +cpu_memory_rw_debug(cs, env->nip, (uint8_t *)&insn, sizeof(insn), 0);
> +
> +/*
> + * rfid overwrites MSR with SRR1. Check if it has the SE bit
> + * already set, meaning the guest is doing a single step
> + * itself and set the SRR1_SE bit instead of MSR_SE to trigger
> + * our own single step.
> + */
> +if (extract32(insn, 26, 6) == 19 && extract32(insn, 1, 10) == 18) {

We could define "rfid" like XL(19,18):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/xmon/ppc-opc.c#n4388
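
A sketch of what such a definition could look like, modelled loosely on the OP()/XL() style used in ppc-opc.c. The names INSN_RFID, XL_MASK and insn_is_rfid() are assumptions made for this example, not existing QEMU identifiers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The primary opcode lives in the top 6 bits of the instruction word. */
#define OP(x)        ((((uint32_t)(x)) & 0x3f) << 26)
/* XL-form: primary opcode plus a 10-bit extended opcode starting at bit 1. */
#define XL(op, xop)  (OP(op) | ((((uint32_t)(xop)) & 0x3ff) << 1))
/* Mask covering both opcode fields of an XL-form instruction. */
#define XL_MASK      XL(0x3f, 0x3ff)

#define INSN_RFID    XL(19, 18)

/* True if 'insn' has primary opcode 19 and extended opcode 18. */
static bool insn_is_rfid(uint32_t insn)
{
    return (insn & XL_MASK) == INSN_RFID;
}
```

This tests the same two bit fields as the pair of extract32() calls in the patch, but gives the magic numbers a name.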


> +if ((env->spr[SPR_SRR1] >> MSR_SE) & 1) {
> +env->sstep_msr |= (1ULL << MSR_SE);
> +}
> +
> +env->spr[SPR_SRR1] |= (1ULL << MSR_SE);
> +} else {
> +/*
> + * MSR_SE = 1 will cause a Trace Interrupt in the guest
> + * after the next instruction executes.
> + */
> +env->msr |= (1ULL << MSR_SE);
> +}
> +
> +/*
> + * We set a breakpoint at the interrupt handler address so
> + * that the singlestep will be seen by KVM (this is treated by
> + * KVM like an ordinary breakpoint) and control is returned to
> + * QEMU.
> + */
> +trace_handler_addr = ppc_get_trace_int_handler_addr(cs);
> +
> +if (env->nip == trace_handler_addr) {
> +/*
> + * We are trying to step over the interrupt handler
> + * address itself; move the breakpoint to the next
> + * instruction.
> + */
> +trace_handler_addr += 4;
> +}
> +
> +kvm_insert_breakpoint(cs, trace_handler_addr, 4, GDB_BREAKPOINT_SW);
> +}
> +}
> +
>  void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
>  {
>  int n;
> @@ -1594,6 +1656,93 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
>  }
>  }
>  
> +/* Revert any

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Jason Wang

On 2019/1/24 5:47 PM, Markus Armbruster wrote:

Please cc: me on QMP issues.

Ok.

Jason Wang  writes:

On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:

On 2019/1/22 2:56 AM, Peter Maydell wrote:

On Thu, 17 Jan 2019 at 09:46, Jason Wang  wrote:

On 2019/1/15 12:33 AM, Zhang Chen wrote:

On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
<dgilb...@redhat.com> wrote:

   * Peter Maydell (peter.mayd...@linaro.org) wrote:
   > Recently I've noticed that test-filter-mirror has been hanging
   > intermittently, typically when run on some other TCG architecture.
   > In the instance I've just looked at, this was with s390x guest on
   > x86-64 host, though I've also seen it on other host archs and
   > perhaps with other guests.

   Watch out to see if you really do see it for other guests;
   it carefully avoids using virtio-net to avoid vhost; but on s390x it
   uses virtio-net-ccw - could that hit the vhost it was trying to avoid?

   > Below is a backtrace, though it seems to be pretty unhelpful.
   > Anybody got any theories ? Does the mirror test rely on dirty
   > memory bitmaps like the migration test (which also hangs
   > occasionally with TCG due to some bug I'm sure we've investigated
   > in the past) ?

   I don't think it relies on the CPU at all.
I have no idea about this currently, but Jason and I designed the
test case.
Add Jason: Have any comments about this ?

I can't reproduce this locally with s390x-softmmu. It looks to me the
test should be independent to any kinds of emulation. It should pass
when mainloop work.

I've just seen a hang with ppc64 guest on s390x host, so it is
indeed not specific to s390x guest (and so not specific to
virtio-net either, since the ppc64 guest setup uses e1000).

thanks
-- PMM

Finally reproduced locally after hundreds (sometimes thousands) times of
running.

Bisection points to OOB monitor[1].

It looks to me after OOB is used unconditionally we lose a barrier to make
sure socket is connected before sending packets in test-filter-mirror.c. Is
there any other similar and simple thing that we could do to kick the
mainloop?

Do you mean the:

  /* send a qmp command to guarantee that 'connected' is setting to true. */
  qmp_discard_response(qts, "{ 'execute' : 'query-status'}");

Yes.

why was that ever sufficient to know the socket was ready?

It was suggested by Fam, I don't remember the details. Can we make
sure all pending events has been processed (UNIX socket was set to
connected) after query-status is returned with an non OOB monitor?

I'm afraid I lack context.  Which socket are you talking about?  The
test has at least the QMP socket, the send_sock[], and recv_sock.  What
exactly are you trying to accomplish?

I mean recv_sock. If mirror tries to send a packet to it before its
is_connected is set to true, the packet will be dropped.

By the way, mkstemp(sock_path) followed by unix_connect(sock_path, NULL)
looks rather fishy.  Why create a temporary file only to create a Unix
domain socket right over it?

I vaguely remember that passing an fd created for a Unix domain socket
didn't work when the test was introduced. So my understanding is that the
author needed a way to create a unique file name to be used by the Unix
domain socket at that time.

  Why is ignoring errors a good idea?

I don't follow: which error is missed? It checks the return value of both
mkstemp() and unix_connect().

Thanks

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Jason Wang

On 2019/1/24 5:51 PM, Peter Xu wrote:

On Thu, Jan 24, 2019 at 09:11:15AM +, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:

On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:

On 2019/1/22 2:56 AM, Peter Maydell wrote:

On Thu, 17 Jan 2019 at 09:46, Jason Wang  wrote:

On 2019/1/15 12:33 AM, Zhang Chen wrote:

On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
<dgilb...@redhat.com> wrote:

   * Peter Maydell (peter.mayd...@linaro.org) wrote:
   > Recently I've noticed that test-filter-mirror has been hanging
   > intermittently, typically when run on some other TCG architecture.
   > In the instance I've just looked at, this was with s390x guest on
   > x86-64 host, though I've also seen it on other host archs and
   > perhaps with other guests.

   Watch out to see if you really do see it for other guests;
   it carefully avoids using virtio-net to avoid vhost; but on s390x it
   uses virtio-net-ccw - could that hit the vhost it was trying to avoid?

   > Below is a backtrace, though it seems to be pretty unhelpful.
   > Anybody got any theories ? Does the mirror test rely on dirty
   > memory bitmaps like the migration test (which also hangs
   > occasionally with TCG due to some bug I'm sure we've investigated
   > in the past) ?

   I don't think it relies on the CPU at all.
I have no idea about this currently, but Jason and I designed the
test case.
Add Jason: Have any comments about this ?

I can't reproduce this locally with s390x-softmmu. It looks to me the
test should be independent to any kinds of emulation. It should pass
when mainloop work.

I've just seen a hang with ppc64 guest on s390x host, so it is
indeed not specific to s390x guest (and so not specific to
virtio-net either, since the ppc64 guest setup uses e1000).

thanks
-- PMM

Finally reproduced locally after hundreds (sometimes thousands) times of
running.

Bisection points to OOB monitor[1].

It looks to me after OOB is used unconditionally we lose a barrier to make
sure socket is connected before sending packets in test-filter-mirror.c. Is
there any other similar and simple thing that we could do to kick the
mainloop?

Do you mean the:

  /* send a qmp command to guarantee that 'connected' is setting to true. */
  qmp_discard_response(qts, "{ 'execute' : 'query-status'}");

Yes.

why was that ever sufficient to know the socket was ready?

It was suggested by Fam, I don't remember the details. Can we make sure all
pending events has been processed (UNIX socket was set to connected) after
query-status is returned with an non OOB monitor?

I'm not sure - it doesn't sound like a 'query-status' should ensure
anything else.
How about something like a 'query-chardev' - can that tell you what you
need and loop until it's ready?

Yeah it sounds hacky to use "query status" to make sure a specific
chardev is connected even before the OOB...

Probably, but anyway it worked before OOB.

I saw that currently the chardev requires "nowait":

 qts = qtest_initf(
 "-netdev socket,id=qtest-bn0,fd=%d "
 "-device %s,netdev=qtest-bn0,id=qtest-e0 "
 "-chardev socket,id=mirror0,path=%s,server,nowait "
 "-object filter-mirror,id=qtest-f0,netdev=qtest-bn0,queue=tx,outdev=mirror0 "
 , send_sock[1], devstr, sock_path);

Could it work without "nowait"?  Would that make sure QEMU will wait
until connection established before going on?

That doesn't work for qtest, which will be waiting for QEMU as well.

Thanks

Regards,

Re: [Qemu-devel] test-filter-mirror hangs

2019-01-24 Thread Jason Wang

On 2019/1/24 5:11 PM, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:

On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:

On 2019/1/22 2:56 AM, Peter Maydell wrote:

On Thu, 17 Jan 2019 at 09:46, Jason Wang  wrote:

On 2019/1/15 12:33 AM, Zhang Chen wrote:

On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
<dgilb...@redhat.com> wrote:

   * Peter Maydell (peter.mayd...@linaro.org) wrote:
   > Recently I've noticed that test-filter-mirror has been hanging
   > intermittently, typically when run on some other TCG architecture.
   > In the instance I've just looked at, this was with s390x guest on
   > x86-64 host, though I've also seen it on other host archs and
   > perhaps with other guests.

   Watch out to see if you really do see it for other guests;
   it carefully avoids using virtio-net to avoid vhost; but on s390x it
   uses virtio-net-ccw - could that hit the vhost it was trying to avoid?

   > Below is a backtrace, though it seems to be pretty unhelpful.
   > Anybody got any theories ? Does the mirror test rely on dirty
   > memory bitmaps like the migration test (which also hangs
   > occasionally with TCG due to some bug I'm sure we've investigated
   > in the past) ?

   I don't think it relies on the CPU at all.
I have no idea about this currently, but Jason and I designed the
test case.
Add Jason: Have any comments about this ?

I can't reproduce this locally with s390x-softmmu. It looks to me the
test should be independent to any kinds of emulation. It should pass
when mainloop work.

I've just seen a hang with ppc64 guest on s390x host, so it is
indeed not specific to s390x guest (and so not specific to
virtio-net either, since the ppc64 guest setup uses e1000).

thanks
-- PMM

Finally reproduced locally after hundreds (sometimes thousands) times of
running.

Bisection points to OOB monitor[1].

It looks to me after OOB is used unconditionally we lose a barrier to make
sure socket is connected before sending packets in test-filter-mirror.c. Is
there any other similar and simple thing that we could do to kick the
mainloop?

Do you mean the:

  /* send a qmp command to guarantee that 'connected' is setting to true. */
  qmp_discard_response(qts, "{ 'execute' : 'query-status'}");

Yes.

why was that ever sufficient to know the socket was ready?

It was suggested by Fam, I don't remember the details. Can we make sure all
pending events has been processed (UNIX socket was set to connected) after
query-status is returned with an non OOB monitor?

I'm not sure - it doesn't sound like a 'query-status' should ensure
anything else.
How about something like a 'query-chardev' - can that tell you what you
need and loop until it's ready?

Dave

That may work.
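
The polling idea suggested above could be sketched roughly as follows. This is only a generic bounded retry loop: wait_until_ready() and the stand-in predicate ready_after_countdown() are hypothetical, and in the real test the predicate would have to issue 'query-chardev' over QMP and inspect the reply for the mirror0 chardev, which is not shown here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*ready_fn)(void *opaque);

/*
 * Call is_ready() up to max_tries times and stop as soon as it reports
 * success. Returns false if the condition never became true.
 */
static bool wait_until_ready(ready_fn is_ready, void *opaque, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (is_ready(opaque)) {
            return true;
        }
        /* A real test would sleep briefly here before asking again. */
    }
    return false;
}

/* Stand-in predicate for the sketch: reports ready after N failed polls. */
static bool ready_after_countdown(void *opaque)
{
    int *polls_left = opaque;

    if (*polls_left > 0) {
        (*polls_left)--;
        return false;
    }
    return true;
}
```

Bounding the number of tries means a chardev that never connects makes the test fail rather than hang, which is the symptom this thread started with.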

Thanks

Thanks

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Eduardo Habkost

On Thu, Jan 24, 2019 at 10:08:37PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 24, 2019 at 05:14:43PM -0200, Eduardo Habkost wrote:
> > On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > > > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > > > From: Zhang Yi 
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Zhang Yi 
> > > > [...]
> > > > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > > > +   The backend is a file supporting DAX, e.g., a file on an ext4 or
> > > > > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but the backend is
> > > > > > > > > +   not a file supporting DAX, mapping with this flag results in an EOPNOTSUPP
> > > > > > > > > +   error.
> > > > > > > > 
> > > > > > > > Won't this break existing configurations that work today on QEMU
> > > > > > > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > > > > > > It won't: the pmem option defaults to off. If whoever starts the VM
> > > > > > > doesn't know what the backend file is, the suggested (and default)
> > > > > > > setting is pmem=off. If they know that the backend file has DAX
> > > > > > > capability, setting pmem=on is suggested.
> > > > > > > 
> > > > > > > For the special case where /dev/dax is used as the backend, there is
> > > > > > > already a patch that adds MAP_SYNC flag mapping for device DAX mode;
> > > > > > > see https://lkml.org/lkml/2018/4/22/524 
> > > > > > > 
> > > > > > > So if people force pmem=on while mapping a regular file, it will
> > > > > > > result in an EOPNOTSUPP error.
> > > > > > 
> > > > > > This is where compatibility is being broken, isn't it?  People
> > > > > > currently using pmem=on on a regular file will start getting
> > > > > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > > > > booting.  Maybe this is OK, but we need to be able to explain why
> > > > > > it is OK.
> > > > > 
> > > > > I think it's OK since pmem explicitly means "persistent":
> > > > > 
> > > > > The @option{pmem} option specifies whether the backing file specified
> > > > > by @option{mem-path} is in host persistent memory that can be accessed
> > > > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > > > If @option{pmem} is set to 'on', QEMU will take necessary operations 
> > > > > to
> > > > > guarantee the persistence of its own writes to @option{mem-path}
> > > > > (e.g. in vNVDIMM label emulation and live migration).
> > > > 
> > > > If it's OK, let's at least explicitly document that we are
> > > > breaking compatibility in those cases.
> > > > 
> > > > 
> > > > > > > 
> > > > [...]
> > > > > I think generally MAP_SYNC is required.
> > > > > But for compatibility reasons we might need to support
> > > > > !MAP_SYNC on old kernels even though it's risky.
> > > > 
> > > > What about making MAP_SYNC optional only on older machine-types?
> > > 
> > > I don't think this makes sense. It's not a guest visible change,
> > > machine types are for that.
> > 
> > Losing data written to persistent memory is surely guest-visible
> > behavior.
> 
> I think we need not be purists here. Most people don't lose power and
> then it's fine and compatible. People who want more robustness need to
> use more modern kernels, that is all.

I don't think that's being purist.  I want to avoid hidden bugs
if we ignore that MAP_SYNC failed for any unexpected reason.  If
we need to ignore errors in some cases, let's at least limit that
to cases where we absolutely have to.

But I would also be happy with just a warning.
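
A minimal sketch of that warn-and-fall-back behavior, not the patch itself. It assumes a Linux host; the helper name map_backend() and the fallback flag values are illustrative assumptions. Only EOPNOTSUPP is tolerated, so any other mmap() failure is still reported to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Fallback values for older libc headers (assumption: generic Linux ABI). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper: map fd, asking for MAP_SYNC when pmem is set.
 * *synced reports whether the MAP_SYNC mapping was actually obtained.
 */
static void *map_backend(int fd, size_t len, bool pmem, bool *synced)
{
    assert(synced != NULL);
    *synced = false;

    if (pmem) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (p != MAP_FAILED) {
            *synced = true;
            return p;
        }
        if (errno != EOPNOTSUPP) {
            return MAP_FAILED;  /* unexpected failure: do not hide it */
        }
        fprintf(stderr, "warning: backend does not support MAP_SYNC; "
                        "persistence is not guaranteed\n");
    }

    /* Plain shared mapping, used for pmem=off and as the fallback. */
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

Whether the fallback branch should exist at all (instead of a hard error) is exactly the open question in this thread; the sketch only shows how narrow the ignored case can be kept.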

-- 
Eduardo

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Michael S. Tsirkin

On Thu, Jan 24, 2019 at 05:14:43PM -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > > From: Zhang Yi 
> > > > > > > > 
> > > > > > > > Signed-off-by: Zhang Yi 
> > > [...]
> > > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > > +   The backend is a file supporting DAX, e.g., a file on an 
> > > > > > > > ext4 or
> > > > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but 
> > > > > > > > the backend is
> > > > > > > > +   not a file supporting DAX, mapping with this flag results 
> > > > > > > > in an EOPNOTSUPP
> > > > > > > > +   error.
> > > > > > > 
> > > > > > > Won't this break existing configurations that work today on QEMU
> > > > > > > 3.1.0?  Why exactly is it OK to break compatibility here?
> > > > > > It won't. The pmem option defaults to off. If the people starting
> > > > > > the VM don't know what the backend file is, the suggested (and
> > > > > > default) setting is pmem=off. If they know that the backend file
> > > > > > has DAX capability, setting pmem=on is suggested.
> > > > > > 
> > > > > > For the special case where /dev/dax is used as the backend, we
> > > > > > already have a patch that adds MAP_SYNC flag mapping for device
> > > > > > DAX mode; see https://lkml.org/lkml/2018/4/22/524
> > > > > > 
> > > > > > So, if people force pmem=on while mapping a regular file, it will
> > > > > > result in an EOPNOTSUPP error.
> > > > > 
> > > > > This is where compatibility is being broken, isn't it?  People
> > > > > currently using pmem=on on a regular file will start getting
> > > > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > > > booting.  Maybe this is OK, but we need to be able to explain why
> > > > > it is OK.
> > > > 
> > > > I think it's OK since pmem explicitly means "persistent":
> > > > 
> > > > The @option{pmem} option specifies whether the backing file specified
> > > > by @option{mem-path} is in host persistent memory that can be accessed
> > > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > > If @option{pmem} is set to 'on', QEMU will take necessary operations to
> > > > guarantee the persistence of its own writes to @option{mem-path}
> > > > (e.g. in vNVDIMM label emulation and live migration).
> > > 
> > > If it's OK, let's at least explicitly document that we are
> > > breaking compatibility in those cases.
> > > 
> > > 
> > > > > > 
> > > [...]
> > > > I think generally MAP_SYNC is required.
> > > > But for compatibility reasons we might need to support
> > > > !MAP_SYNC on old kernels even though it's risky.
> > > 
> > > What about making MAP_SYNC optional only on older machine-types?
> > 
> > I don't think this makes sense. It's not a guest visible change,
> > machine types are for that.
> 
> Losing data written to persistent memory is surely guest-visible
> behavior.

I think we need not be purists here. Most people don't lose power and
then it's fine and compatible. People who want more robustness need to
use more modern kernels, that is all.

> -- 
> Eduardo

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Yi Zhang

On 2019-01-24 at 17:14:43 -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > > From: Zhang Yi 
> > > > > > > > 
> > > > > > > > Signed-off-by: Zhang Yi 
> > > [...]
> > > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > > +   The backend is a file supporting DAX, e.g., a file on an 
> > > > > > > > ext4 or
> > > > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but 
> > > > > > > > the backend is
> > > > > > > > +   not a file supporting DAX, mapping with this flag results 
> > > > > > > > in an EOPNOTSUPP
> > > > > > > > +   error.
> > > > > > > 
> > > > > > > Won't this break existing configurations that work today on QEMU
> > > > > > > 3.1.0?  Why exactly is it OK to break compatibility here?
> > > > > > It won't. The pmem option defaults to off. If the people starting
> > > > > > the VM don't know what the backend file is, the suggested (and
> > > > > > default) setting is pmem=off. If they know that the backend file
> > > > > > has DAX capability, setting pmem=on is suggested.
> > > > > > 
> > > > > > For the special case where /dev/dax is used as the backend, we
> > > > > > already have a patch that adds MAP_SYNC flag mapping for device
> > > > > > DAX mode; see https://lkml.org/lkml/2018/4/22/524
> > > > > > 
> > > > > > So, if people force pmem=on while mapping a regular file, it will
> > > > > > result in an EOPNOTSUPP error.
> > > > > 
> > > > > This is where compatibility is being broken, isn't it?  People
> > > > > currently using pmem=on on a regular file will start getting
> > > > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > > > booting.  Maybe this is OK, but we need to be able to explain why
> > > > > it is OK.
> > > > 
> > > > I think it's OK since pmem explicitly means "persistent":
> > > > 
> > > > The @option{pmem} option specifies whether the backing file specified
> > > > by @option{mem-path} is in host persistent memory that can be accessed
> > > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > > If @option{pmem} is set to 'on', QEMU will take necessary operations to
> > > > guarantee the persistence of its own writes to @option{mem-path}
> > > > (e.g. in vNVDIMM label emulation and live migration).
> > > 
> > > If it's OK, let's at least explicitly document that we are
> > > breaking compatibility in those cases.
Yes, I will add more explanation for those broken cases.
> > > 
> > > 
> > > > > > 
> > > [...]
> > > > I think generally MAP_SYNC is required.
> > > > But for compatibility reasons we might need to support
> > > > !MAP_SYNC on old kernels even though it's risky.
> > > 
> > > What about making MAP_SYNC optional only on older machine-types?
Isn't an older machine type compatible with a new kernel?
> > 
> > I don't think this makes sense. It's not a guest visible change,
> > machine types are for that.
> 
> Losing data written to persistent memory is surely guest-visible
> behavior.

The guest always sees it as persistent memory, but that is only a faked
"persistent" front-end. The only way to guarantee persistence is to set
pmem=on on the host back-end, and that is not a guest-visible option.

> 
> -- 
> Eduardo
>

Re: [Qemu-devel] [RFC PATCH v4 29/44] hw/pci/Makefile.objs: make pcie configurable

2019-01-24 Thread Yang Zhong

On Thu, Jan 24, 2019 at 09:43:44PM -0500, Michael S. Tsirkin wrote:
> On Fri, Jan 25, 2019 at 10:10:53AM +0800, Yang Zhong wrote:
> > On Wed, Jan 23, 2019 at 09:23:49AM -0500, Michael S. Tsirkin wrote:
> > > On Wed, Jan 23, 2019 at 02:56:03PM +0800, Yang Zhong wrote:
> > > > Split pcie from pci and make it configurable.
> > > > 
> > > > Signed-off-by: Yang Zhong 
> > > > Cc: Michael S. Tsirkin 
> > > > Reviewed-by: Thomas Huth 
> > > > ---
> > > >  default-configs/pci.mak | 1 +
> > > >  hw/pci/Kconfig  | 3 +++
> > > >  hw/pci/Makefile.objs| 5 +++--
> > > >  3 files changed, 7 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> > > > index f7b3690bbd..b17b456b1e 100644
> > > > --- a/default-configs/pci.mak
> > > > +++ b/default-configs/pci.mak
> > > > @@ -1,4 +1,5 @@
> > > >  CONFIG_PCI=y
> > > > +CONFIG_PCI_EXPRESS=y
> > > >  # For now, CONFIG_IDE_CORE requires ISA, so we enable it here
> > > >  CONFIG_ISA_BUS=y
> > > >  CONFIG_VIRTIO_PCI=y
> > > > diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
> > > > index d3d2205577..81533b9dc0 100644
> > > > --- a/hw/pci/Kconfig
> > > > +++ b/hw/pci/Kconfig
> > > > @@ -1,2 +1,5 @@
> > > >  config PCI
> > > >  bool
> > > > +
> > > > +config PCI_EXPRESS
> > > > +bool
> > > 
> > > Hmm this allows PCIE without PCI.
> > > Should PCI_EXPRESS select PCI?
> > > 
> > > It's selected itself so can't depend on PCI.
> > >
> >   Hello Michael,
> > 
> >   I did this in patch 30 as below:
> >   
> >   diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
> >   index 81533b9dc0..4ca2537980 100644
> >   --- a/hw/pci/Kconfig
> >   +++ b/hw/pci/Kconfig
> >   @@ -3,3 +3,4 @@ config PCI
> > 
> >   config PCI_EXPRESS
> >  bool
> > +select PCI
> > 
> >Regards,
> > 
> >Yang
> 
> Maybe squash this into patch 29 then.
> 
> > >
  Okay, I will do it in the next version. Thanks, Michael. Yang.
 
> > > > diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs
> > > > index 9f905e6344..d30eb32cbb 100644
> > > > --- a/hw/pci/Makefile.objs
> > > > +++ b/hw/pci/Makefile.objs
> > > > @@ -2,8 +2,9 @@ common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o
> > > >  common-obj-$(CONFIG_PCI) += msix.o msi.o
> > > >  common-obj-$(CONFIG_PCI) += shpc.o
> > > >  common-obj-$(CONFIG_PCI) += slotid_cap.o
> > > > -common-obj-$(CONFIG_PCI) += pci_host.o pcie_host.o
> > > > -common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o
> > > > +common-obj-$(CONFIG_PCI) += pci_host.o
> > > > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie.o pcie_aer.o
> > > > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie_port.o pcie_host.o
> > > >  
> > > >  common-obj-$(call lnot,$(CONFIG_PCI)) += pci-stub.o
> > > >  common-obj-$(CONFIG_ALL) += pci-stub.o
> > > > -- 
> > > > 2.17.1

Re: [Qemu-devel] [RFC PATCH v4 29/44] hw/pci/Makefile.objs: make pcie configurable

2019-01-24 Thread Michael S. Tsirkin

On Fri, Jan 25, 2019 at 10:10:53AM +0800, Yang Zhong wrote:
> On Wed, Jan 23, 2019 at 09:23:49AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Jan 23, 2019 at 02:56:03PM +0800, Yang Zhong wrote:
> > > Split pcie from pci and make it configurable.
> > > 
> > > Signed-off-by: Yang Zhong 
> > > Cc: Michael S. Tsirkin 
> > > Reviewed-by: Thomas Huth 
> > > ---
> > >  default-configs/pci.mak | 1 +
> > >  hw/pci/Kconfig  | 3 +++
> > >  hw/pci/Makefile.objs| 5 +++--
> > >  3 files changed, 7 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> > > index f7b3690bbd..b17b456b1e 100644
> > > --- a/default-configs/pci.mak
> > > +++ b/default-configs/pci.mak
> > > @@ -1,4 +1,5 @@
> > >  CONFIG_PCI=y
> > > +CONFIG_PCI_EXPRESS=y
> > >  # For now, CONFIG_IDE_CORE requires ISA, so we enable it here
> > >  CONFIG_ISA_BUS=y
> > >  CONFIG_VIRTIO_PCI=y
> > > diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
> > > index d3d2205577..81533b9dc0 100644
> > > --- a/hw/pci/Kconfig
> > > +++ b/hw/pci/Kconfig
> > > @@ -1,2 +1,5 @@
> > >  config PCI
> > >  bool
> > > +
> > > +config PCI_EXPRESS
> > > +bool
> > 
> > Hmm this allows PCIE without PCI.
> > Should PCI_EXPRESS select PCI?
> > 
> > It's selected itself so can't depend on PCI.
> >
>   Hello Michael,
> 
>   I did this in patch 30 as below:
>   
>   diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
>   index 81533b9dc0..4ca2537980 100644
>   --- a/hw/pci/Kconfig
>   +++ b/hw/pci/Kconfig
>   @@ -3,3 +3,4 @@ config PCI
> 
>   config PCI_EXPRESS
>  bool
> +select PCI
> 
>Regards,
> 
>Yang

Maybe squash this into patch 29 then.

> > 
> > > diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs
> > > index 9f905e6344..d30eb32cbb 100644
> > > --- a/hw/pci/Makefile.objs
> > > +++ b/hw/pci/Makefile.objs
> > > @@ -2,8 +2,9 @@ common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o
> > >  common-obj-$(CONFIG_PCI) += msix.o msi.o
> > >  common-obj-$(CONFIG_PCI) += shpc.o
> > >  common-obj-$(CONFIG_PCI) += slotid_cap.o
> > > -common-obj-$(CONFIG_PCI) += pci_host.o pcie_host.o
> > > -common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o
> > > +common-obj-$(CONFIG_PCI) += pci_host.o
> > > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie.o pcie_aer.o
> > > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie_port.o pcie_host.o
> > >  
> > >  common-obj-$(call lnot,$(CONFIG_PCI)) += pci-stub.o
> > >  common-obj-$(CONFIG_ALL) += pci-stub.o
> > > -- 
> > > 2.17.1

Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-24 Thread Eric Farman





On 01/24/2019 09:25 PM, Eric Farman wrote:



On 01/21/2019 06:03 AM, Cornelia Huck wrote:

Rework handling of multiple I/O requests to return -EAGAIN if
we are already processing an I/O request. Introduce a mutex
to disallow concurrent writes to the I/O region.

The expectation is that userspace simply retries the operation
if it gets -EAGAIN.
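
The retry expectation can be sketched from the userspace side as follows. This is a hedged illustration only: the helper name write_io_region() and the retry bound are assumptions, not part of the patch or of any existing userspace code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace helper: write a request to the I/O region and
 * retry while the driver reports EAGAIN, for at most max_tries attempts.
 * Returns the pwrite() result of the last attempt.
 */
static ssize_t write_io_region(int fd, const void *buf, size_t len,
                               off_t offset, int max_tries)
{
    ssize_t ret;

    assert(max_tries > 0);
    do {
        ret = pwrite(fd, buf, len, offset);
    } while (ret < 0 && errno == EAGAIN && --max_tries > 0);

    return ret;
}
```

Any error other than EAGAIN ends the loop immediately, so real failures are not retried.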

We currently don't allow multiple ssch requests at the same
time, as we don't have support for keeping channel programs
around for more than one request.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/vfio_ccw_drv.c |  1 +
  drivers/s390/cio/vfio_ccw_fsm.c |  8 +++-
  drivers/s390/cio/vfio_ccw_ops.c | 31 +++--
  drivers/s390/cio/vfio_ccw_private.h |  2 ++
  4 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c 
b/drivers/s390/cio/vfio_ccw_drv.c

index a10cec0e86eb..2ef189fe45ed 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -125,6 +125,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
  private->sch = sch;
  dev_set_drvdata(&sch->dev, private);
+    mutex_init(&private->io_mutex);
  spin_lock_irq(sch->lock);
  private->state = VFIO_CCW_STATE_NOT_OPER;
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c 
b/drivers/s390/cio/vfio_ccw_fsm.c

index cab17865aafe..f6ed934cc565 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private 
*private)

  sch = private->sch;
  spin_lock_irqsave(sch->lock, flags);
-    private->state = VFIO_CCW_STATE_BUSY;


[1]


  orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
@@ -42,6 +41,8 @@ static int fsm_io_helper(struct vfio_ccw_private 
*private)

   */
  sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
  ret = 0;
+    /* Don't allow another ssch for now */
+    private->state = VFIO_CCW_STATE_BUSY;


[1]


  break;
  case 1:    /* Status pending */
  case 2:    /* Busy */
@@ -99,7 +100,7 @@ static void fsm_io_error(struct vfio_ccw_private 
*private,

  static void fsm_io_busy(struct vfio_ccw_private *private,
  enum vfio_ccw_event event)
  {
-    private->io_region->ret_code = -EBUSY;
+    private->io_region->ret_code = -EAGAIN;
  }
  static void fsm_disabled_irq(struct vfio_ccw_private *private,
@@ -130,8 +131,6 @@ static void fsm_io_request(struct vfio_ccw_private 
*private,

  struct mdev_device *mdev = private->mdev;
  char *errstr = "request";
-    private->state = VFIO_CCW_STATE_BUSY;
-


[1]


  memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
  if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -176,7 +175,6 @@ static void fsm_io_request(struct vfio_ccw_private 
*private,

  }
  err_out:
-    private->state = VFIO_CCW_STATE_IDLE;


[1] I think these changes are cool.  We end up going into (and staying 
in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we 
bumble along.


But why can't these be separated out from this patch?  It does change 
the behavior of the state machine, and seem distinct from the addition 
of the mutex you otherwise add here?  At the very least, this behavior 
change should be documented in the commit since it's otherwise lost in 
the mutex/EAGAIN stuff.



  trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
 io_region->ret_code, errstr);
  }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c 
b/drivers/s390/cio/vfio_ccw_ops.c

index f673e106c041..3fa9fc570400 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct 
mdev_device *mdev,

  {
  struct vfio_ccw_private *private;
  struct ccw_io_region *region;
+    int ret;
  if (*ppos + count > sizeof(*region))
  return -EINVAL;
  private = dev_get_drvdata(mdev_parent_dev(mdev));
+    mutex_lock(&private->io_mutex);
  region = private->io_region;
  if (copy_to_user(buf, (void *)region + *ppos, count))
-    return -EFAULT;
-
-    return count;
+    ret = -EFAULT;
+    else
+    ret = count;
+    mutex_unlock(&private->io_mutex);
+    return ret;
  }
  static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
@@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct 
mdev_device *mdev,

  {
  struct vfio_ccw_private *private;
  struct ccw_io_region *region;
+    int ret;
  if (*ppos + count > sizeof(*region))
  return -EINVAL;
  private = dev_get_drvdata(mdev_parent_dev(mdev));
-    if (private->state != VFIO_CCW_STATE_IDLE)
+    if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+    private->state == VFIO_CCW_STATE_STANDBY)
  return -EACCES;
+    if (!mutex_trylock(&private->io_mutex))
+    return -EAGAIN;


Ah, I see Halil's difficulty here.

It is true there is a race condition today, and that this doesn't 
address it.  That's fine, add it

Re: [Qemu-devel] [RFC PATCH v4 29/44] hw/pci/Makefile.objs: make pcie configurable

2019-01-24 Thread Yang Zhong

On Wed, Jan 23, 2019 at 09:23:49AM -0500, Michael S. Tsirkin wrote:
> On Wed, Jan 23, 2019 at 02:56:03PM +0800, Yang Zhong wrote:
> > Split pcie from pci and make it configurable.
> > 
> > Signed-off-by: Yang Zhong 
> > Cc: Michael S. Tsirkin 
> > Reviewed-by: Thomas Huth 
> > ---
> >  default-configs/pci.mak | 1 +
> >  hw/pci/Kconfig  | 3 +++
> >  hw/pci/Makefile.objs| 5 +++--
> >  3 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> > index f7b3690bbd..b17b456b1e 100644
> > --- a/default-configs/pci.mak
> > +++ b/default-configs/pci.mak
> > @@ -1,4 +1,5 @@
> >  CONFIG_PCI=y
> > +CONFIG_PCI_EXPRESS=y
> >  # For now, CONFIG_IDE_CORE requires ISA, so we enable it here
> >  CONFIG_ISA_BUS=y
> >  CONFIG_VIRTIO_PCI=y
> > diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
> > index d3d2205577..81533b9dc0 100644
> > --- a/hw/pci/Kconfig
> > +++ b/hw/pci/Kconfig
> > @@ -1,2 +1,5 @@
> >  config PCI
> >  bool
> > +
> > +config PCI_EXPRESS
> > +bool
> 
> Hmm this allows PCIE without PCI.
> Should PCI_EXPRESS select PCI?
> 
> It's selected itself so can't depend on PCI.
>
  Hello Michael,

  I did this in patch 30 as below:
  
  diff --git a/hw/pci/Kconfig b/hw/pci/Kconfig
  index 81533b9dc0..4ca2537980 100644
  --- a/hw/pci/Kconfig
  +++ b/hw/pci/Kconfig
  @@ -3,3 +3,4 @@ config PCI

  config PCI_EXPRESS
 bool
+select PCI

   Regards,

   Yang

> 
> > diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs
> > index 9f905e6344..d30eb32cbb 100644
> > --- a/hw/pci/Makefile.objs
> > +++ b/hw/pci/Makefile.objs
> > @@ -2,8 +2,9 @@ common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o
> >  common-obj-$(CONFIG_PCI) += msix.o msi.o
> >  common-obj-$(CONFIG_PCI) += shpc.o
> >  common-obj-$(CONFIG_PCI) += slotid_cap.o
> > -common-obj-$(CONFIG_PCI) += pci_host.o pcie_host.o
> > -common-obj-$(CONFIG_PCI) += pcie.o pcie_aer.o pcie_port.o
> > +common-obj-$(CONFIG_PCI) += pci_host.o
> > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie.o pcie_aer.o
> > +common-obj-$(CONFIG_PCI_EXPRESS) += pcie_port.o pcie_host.o
> >  
> >  common-obj-$(call lnot,$(CONFIG_PCI)) += pci-stub.o
> >  common-obj-$(CONFIG_ALL) += pci-stub.o
> > -- 
> > 2.17.1

Re: [Qemu-devel] [PATCH] i386: extended the cpuid level when Intel PT is enabled

2019-01-24 Thread Kang, Luwei

> > Intel Processor Trace requires CPUID[0x14], but the cpuid level is 0xd
> > when creating a KVM guest with e.g. "-cpu qemu64,+intel-pt".
> >
> > Signed-off-by: Luwei Kang 
> > ---
> >  target/i386/cpu.c | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c index
> > 2f54125..da477b3 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5023,6 +5023,13 @@ static void x86_cpu_expand_features(X86CPU *cpu, 
> > Error **errp)
> >  x86_cpu_adjust_feat_level(cpu, FEAT_C000_0001_EDX);
> >  x86_cpu_adjust_feat_level(cpu, FEAT_SVM);
> >  x86_cpu_adjust_feat_level(cpu, FEAT_XSAVE);
> > +
> > +/* Intel Processor Trace requires CPUID[0x14] */
> > +if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
> > + kvm_enabled()) {
> > +x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> > +}
> 
> This will require a new machine-type compatibility flag to enable the new
> behavior, so we don't change CPUID data under the guest's feet during live
> migration.

Hi Eduardo,
Thanks for your reply. I have some questions about your comments.
The cpuid level comes from the specific machine-type (e.g. qemu64,
Skylake-Server) and is 0xd for all of them, but Intel PT requires 0x14, so I
extended the cpuid level.
I don't fully understand what "require a new machine-type compatibility
flag" means. Do I need to add a new flag to each machine-type?
I tried live migration with "-cpu qemu64,+intel-pt" and with "-cpu host", and
both passed. We don't change the cpuid data during live migration; we only
initialize the cpuid data when a new vcpu is created. Please correct me if
anything is wrong.

Thanks,
Luwei Kang

> 
> > +
> >  /* SVM requires CPUID[0x8000000A] */
> >  if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> >  x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel,
> > 0x8000000A);
> > --
> > 1.8.3.1
> >
> 
> --
> Eduardo

Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-24 Thread Eric Farman





On 01/21/2019 06:03 AM, Cornelia Huck wrote:

Rework handling of multiple I/O requests to return -EAGAIN if
we are already processing an I/O request. Introduce a mutex
to disallow concurrent writes to the I/O region.

The expectation is that userspace simply retries the operation
if it gets -EAGAIN.

We currently don't allow multiple ssch requests at the same
time, as we don't have support for keeping channel programs
around for more than one request.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/vfio_ccw_drv.c |  1 +
  drivers/s390/cio/vfio_ccw_fsm.c |  8 +++-
  drivers/s390/cio/vfio_ccw_ops.c | 31 +++--
  drivers/s390/cio/vfio_ccw_private.h |  2 ++
  4 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index a10cec0e86eb..2ef189fe45ed 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -125,6 +125,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
  
  	private->sch = sch;

dev_set_drvdata(&sch->dev, private);
+   mutex_init(&private->io_mutex);
  
  	spin_lock_irq(sch->lock);

private->state = VFIO_CCW_STATE_NOT_OPER;
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index cab17865aafe..f6ed934cc565 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -28,7 +28,6 @@ static int fsm_io_helper(struct vfio_ccw_private *private)
sch = private->sch;
  
  	spin_lock_irqsave(sch->lock, flags);

-   private->state = VFIO_CCW_STATE_BUSY;


[1]

  
  	orb = cp_get_orb(&private->cp, (u32)(addr_t)sch, sch->lpm);
  
@@ -42,6 +41,8 @@ static int fsm_io_helper(struct vfio_ccw_private *private)

 */
sch->schib.scsw.cmd.actl |= SCSW_ACTL_START_PEND;
ret = 0;
+   /* Don't allow another ssch for now */
+   private->state = VFIO_CCW_STATE_BUSY;


[1]


break;
case 1: /* Status pending */
case 2: /* Busy */
@@ -99,7 +100,7 @@ static void fsm_io_error(struct vfio_ccw_private *private,
  static void fsm_io_busy(struct vfio_ccw_private *private,
enum vfio_ccw_event event)
  {
-   private->io_region->ret_code = -EBUSY;
+   private->io_region->ret_code = -EAGAIN;
  }
  
  static void fsm_disabled_irq(struct vfio_ccw_private *private,

@@ -130,8 +131,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
struct mdev_device *mdev = private->mdev;
char *errstr = "request";
  
-	private->state = VFIO_CCW_STATE_BUSY;

-


[1]


memcpy(scsw, io_region->scsw_area, sizeof(*scsw));
  
  	if (scsw->cmd.fctl & SCSW_FCTL_START_FUNC) {

@@ -176,7 +175,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
}
  
  err_out:

-   private->state = VFIO_CCW_STATE_IDLE;


[1] I think these changes are cool.  We end up going into (and staying 
in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we 
bumble along.


But why can't these be separated out from this patch?  It does change 
the behavior of the state machine, and seem distinct from the addition 
of the mutex you otherwise add here?  At the very least, this behavior 
change should be documented in the commit since it's otherwise lost in 
the mutex/EAGAIN stuff.
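
To make the behavior change in [1] concrete, here is a simplified model, not the driver code. The enum, struct and start_io() are illustrative stand-ins, and the error codes for the non-zero condition codes are assumptions; the point is only that the state moves to BUSY when the start request returns cc=0 and is left alone otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's states (illustrative only). */
enum ccw_state { STATE_IDLE, STATE_BUSY };

struct ccw_dev {
    enum ccw_state state;
};

/*
 * Model of the reworked flow: cc is the condition code a start request
 * would return. The state moves to BUSY only for cc == 0; for any other
 * condition code it is left untouched.
 */
static int start_io(struct ccw_dev *dev, int cc)
{
    assert(dev != NULL);

    if (dev->state == STATE_BUSY) {
        return -EAGAIN;             /* a request is already in flight */
    }

    switch (cc) {
    case 0:
        dev->state = STATE_BUSY;    /* don't allow another ssch for now */
        return 0;
    case 1:                         /* status pending */
    case 2:                         /* busy */
        return -EBUSY;
    default:                        /* not operational */
        return -ENODEV;
    }
}
```

In this model a failed start never passes through BUSY, which is the "in/out" transition the old code performed.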



trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
   io_region->ret_code, errstr);
  }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index f673e106c041..3fa9fc570400 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device 
*mdev,
  {
struct vfio_ccw_private *private;
struct ccw_io_region *region;
+   int ret;
  
  	if (*ppos + count > sizeof(*region))

return -EINVAL;
  
  	private = dev_get_drvdata(mdev_parent_dev(mdev));

+   mutex_lock(&private->io_mutex);
region = private->io_region;
if (copy_to_user(buf, (void *)region + *ppos, count))
-   return -EFAULT;
-
-   return count;
+   ret = -EFAULT;
+   else
+   ret = count;
+   mutex_unlock(&private->io_mutex);
+   return ret;
  }
  
  static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,

@@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device 
*mdev,
  {
struct vfio_ccw_private *private;
struct ccw_io_region *region;
+   int ret;
  
  	if (*ppos + count > sizeof(*region))

return -EINVAL;
  
  	private = dev_get_drvdata(mdev_parent_dev(mdev));

-   if (private->state != VFIO_CCW_STATE_IDLE)
+   if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+   private->state == VFIO_CCW_STATE_STANDBY)
return -EACCES;
+   if (!mutex_trylock(&private->io_mutex))
+   return

Re: [Qemu-devel] [RFC PATCH v4 26/44] kconfig: introduce kconfig files

2019-01-24 Thread Yang Zhong

On Thu, Jan 24, 2019 at 03:06:01PM +0100, Thomas Huth wrote:
> On 2019-01-23 07:56, Yang Zhong wrote:
> > From: Paolo Bonzini 
> > 
> > The Kconfig files were generated mostly with this script:
> > 
> >   for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
> > set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
> > shift
> > if test $# = 1; then
> >   cat >> $(dirname $1)/Kconfig << EOF
> > config ${i#CONFIG_}
> > bool
> > 
> > EOF
> >   git add $(dirname $1)/Kconfig
> > else
> >   echo $i $*
> > fi
> >   done
> >   sed -i '$d' hw/*/Kconfig
> >   for i in hw/*; do
> > if test -d $i && ! test -f $i/Kconfig; then
> >   touch $i/Kconfig
> >   git add $i/Kconfig
> > fi
> >   done
> > 
> > Whenever a symbol is referenced from multiple subdirectories, the
> > script prints the list of directories that reference the symbol.
> > These symbols have to be added manually to the Kconfig files.
> > 
> > Kconfig.host and hw/Kconfig were created manually.
> > 
> > Signed-off-by: Paolo Bonzini 
> > Signed-off-by: Yang Zhong 
> > ---
> [...]
> > diff --git a/hw/cris/Kconfig b/hw/cris/Kconfig
> > new file mode 100644
> > index 00..c2c26e5150
> > --- /dev/null
> > +++ b/hw/cris/Kconfig
> > @@ -0,0 +1,2 @@
> > +config AXIS
> > +bool
> 
> Please also add here:
> 
> config ETRAXFS
> bool
>
  Yes, I will add this. Thanks, Yang.
> > diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> > new file mode 100644
> > index 00..1a3e8b0e02
> > --- /dev/null
> > +++ b/hw/riscv/Kconfig
> > @@ -0,0 +1,14 @@
> > +config HTIF
> > +bool
> > +
> > +config HART
> > +bool
> > +
> > +config SIFIVE
> > +bool
> > +
> > +config SPIKE
> > +bool
> > +
> > +config RISCV_VIRTIO
> > +bool
> 
> Please rename the RISCV_VIRTIO to RISCV_VIRT.
> 
> We also additionally need these two here:
> 
> config SIFIVE_E
> bool
> 
> config SIFIVE_U
> bool
> 
  Wow, this is my mistake; I omitted these. Thanks for the reminder!

  Regards,

  Yang
>  Thanks,
>   Thomas

Re: [Qemu-devel] [RFC PATCH v3 6/7] target/ppc: Refactor kvm_handle_debug

2019-01-24 Thread Alexey Kardashevskiy




On 19/01/2019 01:07, Fabiano Rosas wrote:
> There are four scenarios being handled in this function:
> 
> - single stepping
> - hardware breakpoints
> - software breakpoints
> - fallback (no debug supported)
> 
> A future patch will add code to handle specific single step and
> software breakpoints cases so let's split each scenario into its own
> function now to avoid hurting readability.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Alexey Kardashevskiy 

> ---
>  target/ppc/kvm.c | 86 
>  1 file changed, 50 insertions(+), 36 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 96a5895792..c27190d7fb 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -1621,52 +1621,66 @@ static int kvm_handle_hw_breakpoint(CPUState *cs,
>  return handle;
>  }
>  
> +static int kvm_handle_singlestep(void)
> +{
> +return 1;
> +}
> +
> +static int kvm_handle_sw_breakpoint(void)
> +{
> +return 1;
> +}
> +
>  static int kvm_handle_debug(PowerPCCPU *cpu, struct kvm_run *run)
>  {
>  CPUState *cs = CPU(cpu);
>  CPUPPCState *env = &cpu->env;
>  struct kvm_debug_exit_arch *arch_info = &run->debug.arch;
> -int handle = 0;
>  
>  if (cs->singlestep_enabled) {
> -handle = 1;
> -} else if (arch_info->status) {
> -handle = kvm_handle_hw_breakpoint(cs, arch_info);
> -} else if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
> -handle = 1;
> -} else {
> -/* QEMU is not able to handle debug exception, so inject
> - * program exception to guest;
> - * Yes program exception NOT debug exception !!
> - * When QEMU is using debug resources then debug exception must
> - * be always set. To achieve this we set MSR_DE and also set
> - * MSRP_DEP so guest cannot change MSR_DE.
> - * When emulating debug resource for guest we want guest
> - * to control MSR_DE (enable/disable debug interrupt on need).
> - * Supporting both configurations are NOT possible.
> - * So the result is that we cannot share debug resources
> - * between QEMU and Guest on BOOKE architecture.
> - * In the current design QEMU gets the priority over guest,
> - * this means that if QEMU is using debug resources then guest
> - * cannot use them;
> - * For software breakpoint QEMU uses a privileged instruction;
> - * So there cannot be any reason that we are here for guest
> - * set debug exception, only possibility is guest executed a
> - * privileged / illegal instruction and that's why we are
> - * injecting a program interrupt.
> - */
> +return kvm_handle_singlestep();
> +}
> +
> +if (arch_info->status) {
> +return kvm_handle_hw_breakpoint(cs, arch_info);
> +}
>  
> -cpu_synchronize_state(cs);
> -/* env->nip is PC, so increment this by 4 to use
> - * ppc_cpu_do_interrupt(), which set srr0 = env->nip - 4.
> - */
> -env->nip += 4;
> -cs->exception_index = POWERPC_EXCP_PROGRAM;
> -env->error_code = POWERPC_EXCP_INVAL;
> -ppc_cpu_do_interrupt(cs);
> +if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
> +return kvm_handle_sw_breakpoint();
>  }
>  
> -return handle;
> +/*
> + * QEMU is not able to handle debug exception, so inject
> + * program exception to guest;
> + * Yes program exception NOT debug exception !!
> + * When QEMU is using debug resources then debug exception must
> + * be always set. To achieve this we set MSR_DE and also set
> + * MSRP_DEP so guest cannot change MSR_DE.
> + * When emulating debug resource for guest we want guest
> + * to control MSR_DE (enable/disable debug interrupt on need).
> + * Supporting both configurations are NOT possible.
> + * So the result is that we cannot share debug resources
> + * between QEMU and Guest on BOOKE architecture.
> + * In the current design QEMU gets the priority over guest,
> + * this means that if QEMU is using debug resources then guest
> + * cannot use them;
> + * For software breakpoint QEMU uses a privileged instruction;
> + * So there cannot be any reason that we are here for guest
> + * set debug exception, only possibility is guest executed a
> + * privileged / illegal instruction and that's why we are
> + * injecting a program interrupt.
> + */
> +cpu_synchronize_state(cs);
> +/*
> + * env->nip is PC, so increment this by 4 to use
> + * ppc_cpu_do_interrupt(), which set srr0 = env->nip - 4.
> + */
> +env->nip += 4;
> +cs->exception_index = POWERPC_EXCP_PROGRAM;
> +env->error_code = POWERPC_EXCP_INVAL;
> +ppc_cpu_do_interrupt(cs);
> +
> +return 0;
>  }
>  
>  int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> 

-- 
Alexey

Re: [Qemu-devel] [RFC PATCH v3 3/7] kvm: support checking for single step capability

2019-01-24 Thread Alexey Kardashevskiy




On 19/01/2019 01:07, Fabiano Rosas wrote:
> For single stepping (via KVM) of a guest vcpu to work, KVM needs not
> only to support the SET_GUEST_DEBUG ioctl but to also recognize the
> KVM_GUESTDBG_SINGLESTEP bit in the control field of the
> kvm_guest_debug struct.
> 
> This patch adds support for querying the single step capability so
> that QEMU can decide what to do for the platforms that do not have
> such support.


Belongs to 4/7.


> 
> Signed-off-by: Fabiano Rosas 
> ---
>  accel/kvm/kvm-all.c  | 7 +++
>  include/sysemu/kvm.h | 1 +
>  2 files changed, 8 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 4e1de942ce..0dc7a32883 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2264,6 +2264,13 @@ bool kvm_arm_supports_user_irq(void)
>  return kvm_check_extension(kvm_state, KVM_CAP_ARM_USER_IRQ);
>  }
>  
> +/* Whether the KVM_SET_GUEST_DEBUG ioctl supports single stepping */
> +int kvm_has_guestdbg_singlestep(void)
> +{
> +/* return kvm_check_extension(kvm_state, KVM_CAP_GUEST_DEBUG_SSTEP); */
> +return 0;
> +}
> +
>  #ifdef KVM_CAP_SET_GUEST_DEBUG
>  struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *cpu,
>   target_ulong pc)
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index a6d1cd190f..ca2bbff053 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -214,6 +214,7 @@ int kvm_has_pit_state2(void);
>  int kvm_has_many_ioeventfds(void);
>  int kvm_has_gsi_routing(void);
>  int kvm_has_intx_set_mask(void);
> +int kvm_has_guestdbg_singlestep(void);
>  
>  int kvm_init_vcpu(CPUState *cpu);
>  int kvm_cpu_exec(CPUState *cpu);
> 

-- 
Alexey

Re: [Qemu-devel] [RFC PATCH v3 2/7] target/ppc: Add ppc_get_trace_int_handler_addr

2019-01-24 Thread Alexey Kardashevskiy




On 19/01/2019 01:07, Fabiano Rosas wrote:
> The upcoming single step functionality (KVM HV) needs to write to the
> Trace Interrupt handler's address for its mechanism to work. The
> address is calculated by applying an offset according to the value of
> the Alternate Interrupt Location (AIL) bits in the LPCR register.
> 
> Signed-off-by: Fabiano Rosas 
> ---
>  target/ppc/cpu.h |  1 +
>  target/ppc/excp_helper.c | 12 
>  2 files changed, 13 insertions(+)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 486abaf99b..2185ef5e67 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1256,6 +1256,7 @@ struct PPCVirtualHypervisorClass {
>  OBJECT_GET_CLASS(PPCVirtualHypervisorClass, (obj), \
>   TYPE_PPC_VIRTUAL_HYPERVISOR)
>  
> +target_ulong ppc_get_trace_int_handler_addr(CPUState *cs);
>  void ppc_cpu_do_interrupt(CPUState *cpu);
>  bool ppc_cpu_exec_interrupt(CPUState *cpu, int int_req);
>  void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 337a3ef8bb..5d13d05c3b 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -746,6 +746,18 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
> excp_model, int excp)
>  check_tlb_flush(env, false);
>  }
>  
> +target_ulong ppc_get_trace_int_handler_addr(CPUState *cs)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = &cpu->env;
> +int ail;
> +
> +ail = (env->spr[SPR_LPCR] & LPCR_AIL) >> LPCR_AIL_SHIFT;
> +return env->excp_vectors[POWERPC_EXCP_TRACE] |
> +ppc_excp_vector_offset(cs, ail);
> +}
> +
> +


Extra empty line.

The entire patch seems to belong to 7/7, it does not make sense on its
own as the helper is not called by anyone and all the files which it is
changing belong to target/ppc/.


>  void ppc_cpu_do_interrupt(CPUState *cs)
>  {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
> 

-- 
Alexey
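
As a rough illustration of the address calculation described in the commit message above, the following standalone sketch ORs an AIL-dependent offset into an exception vector. It is not QEMU code: the AIL encodings (2 and 3), the LPCR shift of 23, the 0xd00 trace vector and the name trace_handler_addr() are all assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed encodings of the two-bit AIL field; illustration only. */
enum { AIL_NONE = 0, AIL_0001_8000 = 2, AIL_C000_0000_0000_4000 = 3 };

#define LPCR_AIL_SHIFT    23                        /* assumed bit position */
#define LPCR_AIL          (3ull << LPCR_AIL_SHIFT)
#define EXCP_TRACE_VECTOR 0xd00ull                  /* assumed trace vector */

/* Map an AIL value to the offset that gets OR-ed into the vector. */
static uint64_t excp_vector_offset(int ail)
{
    uint64_t offset = 0;    /* initialised, so no path returns garbage */

    switch (ail) {
    case AIL_NONE:
        break;              /* no relocation requested */
    case AIL_0001_8000:
        offset = 0x18000;
        break;
    case AIL_C000_0000_0000_4000:
        offset = 0xc000000000004000ull;
        break;
    default:
        abort();            /* stand-in for cpu_abort() */
    }
    return offset;
}

/* Hypothetical helper: trace handler address for a given LPCR value. */
static uint64_t trace_handler_addr(uint64_t lpcr)
{
    int ail = (int)((lpcr & LPCR_AIL) >> LPCR_AIL_SHIFT);

    return EXCP_TRACE_VECTOR | excp_vector_offset(ail);
}
```

Unlike the quoted helper, the sketch treats an AIL value of 0 as "no offset" instead of aborting; that is a choice made for this example, since here the offset function is called unconditionally.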

Re: [Qemu-devel] [RFC PATCH v3 1/7] target/ppc: Move exception vector offset computation into a function

2019-01-24 Thread Alexey Kardashevskiy




On 19/01/2019 01:07, Fabiano Rosas wrote:
> Signed-off-by: Fabiano Rosas 
> ---
>  target/ppc/excp_helper.c | 31 ---
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 0ec7ae1ad4..337a3ef8bb 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -65,6 +65,25 @@ static inline void dump_syscall(CPUPPCState *env)
>ppc_dump_gpr(env, 6), env->nip);
>  }
>  
> +static uint64_t ppc_excp_vector_offset(CPUState *cs, int ail)
> +{
> +uint64_t offset;

nit: uninitialized variable. It should not matter though as cpu_abort()
has __attribute__ ((__noreturn__)) but still inaccurate imho.

Other than that

Reviewed-by: Alexey Kardashevskiy 



> +
> +switch (ail) {
> +case AIL_0001_8000:
> +offset = 0x18000;
> +break;
> +case AIL_C000_0000_0000_4000:
> +offset = 0xc000000000004000ull;
> +break;
> +default:
> +cpu_abort(cs, "Invalid AIL combination %d\n", ail);
> +break;
> +}
> +
> +return offset;
> +}
> +
>  /* Note that this function should be greatly optimized
>   * when called with a constant excp, from ppc_hw_interrupt
>   */
> @@ -685,17 +704,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
> excp_model, int excp)
>  /* Handle AIL */
>  if (ail) {
>  new_msr |= (1 << MSR_IR) | (1 << MSR_DR);
> -switch(ail) {
> -case AIL_0001_8000:
> -vector |= 0x18000;
> -break;
> -case AIL_C000_0000_0000_4000:
> -vector |= 0xc000000000004000ull;
> -break;
> -default:
> -cpu_abort(cs, "Invalid AIL combination %d\n", ail);
> -break;
> -}
> +vector |= ppc_excp_vector_offset(cs, ail);
>  }
>  
>  #if defined(TARGET_PPC64)
> 

-- 
Alexey

Re: [Qemu-devel] [PATCH] gdbstub: Fix i386/x86_64 machine description and add control registers

2019-01-24 Thread Doug Gale

Here is the sequence that led to this patch:

- Debugging my kernel, I wished I could use conditional breakpoints to put a
condition on CR3 to make breakpoints in user processes only break when they
are the active address space.
- Knew it was no problem to add a register to GDB stub
- Added CR3. While there, added CR0 and CR2 and CR4 and CR8 and EFER
- Attempted to use it. Dreaded g packet error.
- Setup to debug GDB. Step tdesc code. Find it failing and silently
dropping description.
- Changed register count in gdbstub to match what default machine
description would want. No more error. GDB definitely dropping to default.
- Learn more about GDB XML parser. Find it has a verbose parser log. set
debug_xml=1
- Watch in horror as parser gets messed up parsing nesting due to feature
including feature
- Wonder if any QEMU machine descriptions xi:include actually work or if
all architectures are lined up to coincidentally use the same register
numbers as the default.
- Run everything through XML validators that strictly enforce DTD and
everything, no problem, totally valid.
- Try several changes to make include work. Not working.
- Move the content of the includes into the top level file so no include.
Bingo, works.
- Discover that fsbase and gsbase now working because the entire machine
description is correctly implemented, makes the register window in
qtcreator suddenly work great
- Make sure 32 bit works too. Find a couple of issues. Fix.
- Submit


On Thu, Jan 24, 2019 at 4:59 PM Doug Gale  wrote:

> The machine description we send is being (silently) thrown on the floor by
> GDB and GDB silently uses the default machine description.
>
> With current QEMU, if you debug gdb, and set debug_xml=1 and continue,
> then attach to qemu gdbstub from the debugged gdb, you will see the xml
> parse fail completely, and gdb will fall back to the default machine
> description, silently, and changes to our xml (in qemu source code) have no
> effect. They might as well be empty.
>
> The point of fixing the machine description was that IDEs with GDB
> integration will break on QEMU. The default machine description has fs_base,
> which fails to be retrieved, which breaks the whole register window (in
> qt-creator at least, likely others). With my patch the register window
> works perfectly.
>
> I didn't delete anything, I removed the superfluous nesting of files by
> xi:include and moved the description into a single xml file. I added
> fs_base, gs_base, k_gs_base, cr0/2/3/4/8, efer.
>
> Removing the nesting of xml includes fixes it because the xml parse
> fails on them; I removed the unnecessary include indirections and placed
> the data inline.
>
> I tried lots of things to fix the nesting. After a while I asked, why
> bother? It doesn't need to go through that include level of indirection
> does it?
>
> This patch leads to another patch I want to submit later, which fixes the
> broken real mode and protected mode debugging on x86_64. I want to add the
> ability to turn off the "always 64 bit registers" hack that works around
> 'g' packet too large. You fix that by patching gdb (which then reacts to
> packet too large with realloc instead of freaking out), not by breaking real
> and protected mode debugging on x86_64 target by forcing always-64-bit. My
> intention is to default to current behavior and have a command line switch
> that enables changing register sizes (fixing real and protected mode
> debugging on x86_64).
>
> This patch includes a FORCE_64 define that will be replaced by a check for
> the option that indicates patched gdb and "register size change ok mode".
>
>
> On Thu, Jan 24, 2019 at 6:44 AM Peter Maydell 
> wrote:
>
>> On Thu, 24 Jan 2019 at 04:08, Doug Gale  wrote:
>> >
>> > Signed-off-by: Doug Gale 
>> > ---
>> >  configure   |   4 +-
>> >  gdb-xml/i386-32bit-core.xml |  65 ---
>> >  gdb-xml/i386-32bit-sse.xml  |  52 -
>> >  gdb-xml/i386-32bit.xml  | 184 ++-
>> >  gdb-xml/i386-64bit-core.xml |  73 -
>> >  gdb-xml/i386-64bit-sse.xml  |  60 ---
>> >  gdb-xml/i386-64bit.xml  | 210 +++-
>> >  target/i386/cpu.c   |   4 +-
>> >  target/i386/gdbstub.c   | 186 +++-
>> >  9 files changed, 573 insertions(+), 265 deletions(-)
>> >  delete mode 100644 gdb-xml/i386-32bit-core.xml
>> >  delete mode 100644 gdb-xml/i386-32bit-sse.xml
>> >  delete mode 100644 gdb-xml/i386-64bit-core.xml
>> >  delete mode 100644 gdb-xml/i386-64bit-sse.xml
>>
>> Could you provide a commit message that explains what's
>> wrong with the machine description we have (ie what bug
>> or bugs this change is fixing) and why deleting half
>> the xml files is the right way to fix it, please?
>>
>> Does the "add control registers" part need to be in
>> the same patch, or is it a separate feature which
>> could be in its own patch ?
>>
>> thanks
>> -- PMM
>>
>

Re: [Qemu-devel] [PATCH v1 5/8] RISC-V: Add priv_ver to DisasContext

2019-01-24 Thread Alistair Francis

On Thu, Jan 24, 2019 at 4:37 PM Palmer Dabbelt  wrote:
>
> On Tue, 15 Jan 2019 14:25:44 PST (-0800), alistai...@gmail.com wrote:
> > On Tue, Jan 15, 2019 at 2:24 PM Richard Henderson
> >  wrote:
> >>
> >> On 1/15/19 10:58 AM, Alistair Francis wrote:
> >> > -static void riscv_tr_init_disas_context(DisasContextBase *dcbase, 
> >> > CPUState *cs)
> >> > +static void riscv_tr_init_disas_context(DisasContextBase *dcbase, 
> >> > CPUState *cpu)
> >>
> >> Why change this?  I know there is variation in the naming, but my
> >> preferred default mapping is CPUState *cs, RISCVCPU *cpu.
> >
> > Good point, I have changed it back to cs.
>
> I don't see a v2, so I'm just going to go ahead and squash in

Yeah, I was waiting for more comments before sending a v2 on such a
trivial change.

>
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 8593c2170af4..b7176cbf98e1 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -2015,10 +2015,10 @@ static void decode_opc(DisasContext *ctx)
>  }
>  }
>
> -static void riscv_tr_init_disas_context(DisasContextBase *dcbase, 
> CPUState *cpu)
> +static void riscv_tr_init_disas_context(DisasContextBase *dcbase, 
> CPUState *cs)
>  {
>  DisasContext *ctx = container_of(dcbase, DisasContext, base);
> -CPURISCVState *env = cpu->env_ptr;
> +CPURISCVState *env = cs->env_ptr;
>
>  ctx->pc_succ_insn = ctx->base.pc_first;
>  ctx->mem_idx = ctx->base.tb->flags & TB_FLAGS_MMU_MASK;
>
> and add Richard's tag.

Sounds good.

Alistair

>
> >
> > Alistair
> >
> >>
> >> Otherwise,
> >> Reviewed-by: Richard Henderson

Re: [Qemu-devel] [PATCH RFC 8/9] tests: Add OpenBSD image

2019-01-24 Thread Brad Smith


On 1/24/2019 11:52 AM, Daniel P. Berrangé wrote:


On Thu, Jan 24, 2019 at 05:10:19PM +0100, Philippe Mathieu-Daudé wrote:

On 1/24/19 4:56 PM, Kamil Rytarowski wrote:

On 24.01.2019 16:52, Philippe Mathieu-Daudé wrote:

On 8/16/17 9:21 AM, Fam Zheng wrote:

The image is prepared following instructions as in:

https://wiki.qemu.org/Hosts/BSD

Signed-off-by: Fam Zheng 
---
  tests/vm/openbsd | 45 +
  1 file changed, 45 insertions(+)
  create mode 100755 tests/vm/openbsd

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
new file mode 100755
index 0000000000..d37ff83a59
--- /dev/null
+++ b/tests/vm/openbsd
@@ -0,0 +1,45 @@
+#!/usr/bin/env python
+#
+# OpenBSD VM image
+#
+# Copyright (C) 2017 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+#
+
+import os
+import sys
+import logging
+import subprocess
+import tempfile
+import time
+import basevm
+
+class OpenBSDVM(basevm.BaseVM):
+name = "openbsd"
+BUILD_SCRIPT = """
+set -e;
+cd $(mktemp -d /var/tmp/qemu-test.XXXXXX);
+tar -xf /dev/rsd1c;
+./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 
--python=python2.7 {configure_opts};
+gmake -j{jobs};
+# XXX: "gmake check" seems to always hang or fail
+#gmake check;

OK, Now it makes more sense...

After spending various hours trying to fix various issues on OpenBSD, I
notice that we never ran tests on this OS.
The only binary I can run is qemu-img, the rest seems useless.
I'll summarize in a different thread.


Is this W^X related?

Part of it could be but I'm not sure.

The 6.1 VM provided by Fam has /usr/local mounted with wxallowed, I
tried building/running there and nothing changed, mmap() still returns
ENOTSUP:

ENOTSUP from mmap is certainly what you'd expect from the W^X  scenario

   https://undeadly.org/cgi?action=article&sid=20160527203200

  "W^X violations are no longer permitted by default.  A kernel log message
   is generated, and mprotect/mmap return ENOTSUP.  If the sysctl(8) flag
   kern.wxabort is set then a SIGABRT occurs instead, for gdb use or coredump
   creation."


Yes, this policy change was introduced with 6.0.

Our ports tree has an option which results in the QEMU binaries being 
linked with "-z wxneeded".

Re: [Qemu-devel] [PATCH v1 5/8] RISC-V: Add priv_ver to DisasContext

2019-01-24 Thread Palmer Dabbelt


On Tue, 15 Jan 2019 14:25:44 PST (-0800), alistai...@gmail.com wrote:

On Tue, Jan 15, 2019 at 2:24 PM Richard Henderson
 wrote:


On 1/15/19 10:58 AM, Alistair Francis wrote:
> -static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState 
*cs)
> +static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState 
*cpu)

Why change this?  I know there is variation in the naming, but my
preferred default mapping is CPUState *cs, RISCVCPU *cpu.


Good point, I have changed it back to cs.


I don't see a v2, so I'm just going to go ahead and squash in

   diff --git a/target/riscv/translate.c b/target/riscv/translate.c
   index 8593c2170af4..b7176cbf98e1 100644
   --- a/target/riscv/translate.c
   +++ b/target/riscv/translate.c
   @@ -2015,10 +2015,10 @@ static void decode_opc(DisasContext *ctx)
}
}
   
   -static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)

   +static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState 
*cs)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
   -CPURISCVState *env = cpu->env_ptr;
   +CPURISCVState *env = cs->env_ptr;
   
ctx->pc_succ_insn = ctx->base.pc_first;

ctx->mem_idx = ctx->base.tb->flags & TB_FLAGS_MMU_MASK;

and add Richard's tag.



Alistair



Otherwise,
Reviewed-by: Richard Henderson

Re: [Qemu-devel] [PATCH v1 8/8] RISC-V: Add misa runtime write support

2019-01-24 Thread Palmer Dabbelt


On Mon, 14 Jan 2019 15:59:00 PST (-0800), Alistair Francis wrote:

From: Michael Clark 

This patch adds support for writing misa. misa is validated based
on rules in the ISA specification. 'E' is mutually exclusive with
all other extensions. 'D' depends on 'F' so 'D' bit is dropped
if 'F' is not present. A conservative approach to consistency is
taken by flushing the translation cache on misa writes. misa_mask
is added to the CPU struct to store the original set of extensions.

Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Alistair Francis 
Signed-off-by: Michael Clark 
Signed-off-by: Alistair Francis 
---
 target/riscv/cpu.c  |  2 +-
 target/riscv/cpu.h  |  4 ++-
 target/riscv/cpu_bits.h | 11 +
 target/riscv/csr.c  | 54 -
 4 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 28d7e5302f..cc3ddc0ae4 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -88,7 +88,7 @@ typedef struct RISCVCPUInfo {
 
 static void set_misa(CPURISCVState *env, target_ulong misa)

 {
-env->misa = misa;
+env->misa_mask = env->misa = misa;
 }
 
 static void set_versions(CPURISCVState *env, int user_ver, int priv_ver)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index a97435bd7b..5c2aebf132 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -86,7 +86,8 @@
so a cpu features bitfield is required, likewise for optional PMP support */
 enum {
 RISCV_FEATURE_MMU,
-RISCV_FEATURE_PMP
+RISCV_FEATURE_PMP,
+RISCV_FEATURE_MISA
 };
 
 #define USER_VERSION_2_02_0 0x00020200

@@ -118,6 +119,7 @@ struct CPURISCVState {
 target_ulong user_ver;
 target_ulong priv_ver;
 target_ulong misa;
+target_ulong misa_mask;
 
 uint32_t features;
 
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h

index 5439f4719e..7afcb2468d 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -311,10 +311,21 @@
 #define MSTATUS32_SD        0x80000000
 #define MSTATUS64_SD        0x8000000000000000ULL
 
+#define MISA32_MXL          0xC0000000
+#define MISA64_MXL          0xC000000000000000ULL
+
+#define MXL_RV32            1
+#define MXL_RV64            2
+#define MXL_RV128           3
+
 #if defined(TARGET_RISCV32)
 #define MSTATUS_SD MSTATUS32_SD
+#define MISA_MXL MISA32_MXL
+#define MXL_VAL MXL_RV32
 #elif defined(TARGET_RISCV64)
 #define MSTATUS_SD MSTATUS64_SD
+#define MISA_MXL MISA64_MXL
+#define MXL_VAL MXL_RV64
 #endif
 
 /* sstatus CSR bits */

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index e2bd374f09..e72fcf1265 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -332,6 +332,58 @@ static int read_misa(CPURISCVState *env, int csrno, 
target_ulong *val)
 return 0;
 }
 
+static int write_misa(CPURISCVState *env, int csrno, target_ulong val)

+{
+if (!riscv_feature(env, RISCV_FEATURE_MISA)) {
+/* drop write to misa */
+return 0;
+}
+
+/* 'I' or 'E' must be present */
+if (!(val & (RVI | RVE))) {
+/* It is not, drop write to misa */
+return 0;
+}
+
+/* 'E' excludes all other extensions */
+if (val & RVE) {
+/* when we support 'E' we can do "val = RVE;" however
+ * for now we just drop writes if 'E' is present.
+ */
+return 0;
+}
+
+/* Mask extensions that are not supported by this hart */
+val &= env->misa_mask;
+
+/* Mask extensions that are not supported by QEMU */
+val &= (RVI | RVE | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+
+/* 'D' depends on 'F', so clear 'D' if 'F' is not present */
+if ((val & RVD) && !(val & RVF)) {
+val &= ~RVD;
+}
+
+/* Suppress 'C' if next instruction is not aligned
+ * TODO: this should check next_pc
+ */
+if ((val & RVC) && (GETPC() & ~3) != 0) {
+val &= ~RVC;
+}
+
+/* misa.MXL writes are not supported by QEMU */
+val = (env->misa & MISA_MXL) | (val & ~MISA_MXL);
+
+/* flush translation cache */
+if (val != env->misa) {
+tb_flush(CPU(riscv_env_get_cpu(env)));
+}
+
+env->misa = val;
+
+return 0;
+}
+
 static int read_medeleg(CPURISCVState *env, int csrno, target_ulong *val)
 {
 *val = env->medeleg;
@@ -810,7 +862,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 
 /* Machine Trap Setup */

 [CSR_MSTATUS] = { any,  read_mstatus, write_mstatus },
-[CSR_MISA] ={ any,  read_misa   },
+[CSR_MISA] ={ any,  read_misa,write_misa},
 [CSR_MIDELEG] = { any,  read_mideleg, write_mideleg },
 [CSR_MEDELEG] = { any,  read_medeleg, write_medeleg },
 [CSR_MIE] = { any,  read_mie, write_mie },
--
2.19.1


Reviewed-by: Palmer Dabbelt
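
The validation rules from the commit message can be tried out in isolation with a sketch like the one below. It is a simplified, standalone model rather than the patch itself: misa_filter_write() and the SUPPORTED set are hypothetical, the MXL field and the instruction-alignment check for 'C' are left out, and a dropped write is modelled by returning the current value unchanged.

```c
#include <stdint.h>

/* misa extension bits: letter X maps to bit (X - 'A'). */
#define RV(x) ((uint64_t)1 << ((x) - 'A'))
#define RVA RV('A')
#define RVC RV('C')
#define RVD RV('D')
#define RVE RV('E')
#define RVF RV('F')
#define RVI RV('I')
#define RVM RV('M')
#define RVS RV('S')
#define RVU RV('U')

/* Extensions the example emulator implements (assumption for the sketch). */
#define SUPPORTED (RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU)

/*
 * Return the misa value that results from writing 'val' while the register
 * holds 'cur', on a hart whose original extension set is 'hart_mask'.
 */
static uint64_t misa_filter_write(uint64_t cur, uint64_t hart_mask,
                                  uint64_t val)
{
    if (!(val & (RVI | RVE))) {
        return cur;                 /* neither base ISA: drop the write */
    }
    if (val & RVE) {
        return cur;                 /* 'E' is not modelled: drop the write */
    }
    val &= hart_mask;               /* only what this hart started with */
    val &= SUPPORTED;               /* only what the emulator implements */
    if ((val & RVD) && !(val & RVF)) {
        val &= ~RVD;                /* 'D' depends on 'F' */
    }
    return val;
}
```

For example, writing RVI | RVD to a hart that supports both 'F' and 'D' yields plain RVI, because 'D' is cleared once 'F' is absent from the written value.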

Re: [Qemu-devel] [PATCH v1 0/8] Upstream RISC-V fork patches, part 3

2019-01-24 Thread Palmer Dabbelt


On Mon, 14 Jan 2019 15:57:41 PST (-0800), Alistair Francis wrote:


Alistair Francis (1):
  RISC-V: Add priv_ver to DisasContext

Michael Clark (5):
  RISC-V: Implement mstatus.TSR/TW/TVM
  RISC-V: Use riscv prefix consistently on cpu helpers
  RISC-V: Add misa to DisasContext
  RISC-V: Add misa.MAFD checks to translate
  RISC-V: Add misa runtime write support

Richard Henderson (2):
  RISC-V: Split out mstatus_fs from tb_flags
  RISC-V: Mark mstatus.fs dirty

 linux-user/riscv/signal.c |   4 +-
 target/riscv/cpu.c|   2 +-
 target/riscv/cpu.h|  31 ++--
 target/riscv/cpu_bits.h   |  11 ++
 target/riscv/cpu_helper.c |  10 +-
 target/riscv/csr.c|  91 +---
 target/riscv/fpu_helper.c |   6 +-
 target/riscv/op_helper.c  |  47 --
 target/riscv/translate.c  | 292 --
 9 files changed, 388 insertions(+), 106 deletions(-)


Thanks.  Assuming that squash is OK I'll include these in my next PR, otherwise 
just send me a v2 and I'll swap them out.

Re: [Qemu-devel] [Qemu-ppc] [PULL 3/5] ppc: e6500 registers SPR 604 twice

2019-01-24 Thread Alexey Kardashevskiy




On 24/01/2019 23:49, Laurent Vivier wrote:
> From: Jon Diekema 
> 
> When using the e6500 CPU, QEMU generates a fatal error after
> complaining about registering SPR 604 twice.
> 
> Building and testing with commit
> 9b2e891ec5ccdb4a7d583b77988848282606fdea shows the issue:
> 
> qemu-system-ppc64 --version
> QEMU emulator version 3.1.50 (v3.1.0-456-g9b2e891ec5-dirty)
> Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers
> 
> qemu-system-ppc64 -M none -cpu e6500
> Error: Trying to register SPR 604 (25c) twice !
> 
> Signed-off-by: Jon Diekema 
> Message-Id: 
> 
> [removed duplicated mail header in the commit message]
> Signed-off-by: Laurent Vivier 
> ---
>  target/ppc/translate_init.inc.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
> index ade06cc773..59e0b86762 100644
> --- a/target/ppc/translate_init.inc.c
> +++ b/target/ppc/translate_init.inc.c
> @@ -4947,14 +4947,6 @@ static void init_proc_e500(CPUPPCState *env, int 
> version)
>  }
>  
>  if (version == fsl_e6500) {
> -spr_register(env, SPR_BOOKE_SPRG8, "SPRG8",
> - SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> - 0x00000000);
> -spr_register(env, SPR_BOOKE_SPRG9, "SPRG9",
> - SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> - 0x00000000);


Out of curiosity - does every BookE board have these?


>  /* Thread identification */
>  spr_register(env, SPR_TIR, "TIR",
>   SPR_NOACCESS, SPR_NOACCESS,
> 

-- 
Alexey

Re: [Qemu-devel] building rst docs with sphinx

2019-01-24 Thread Philippe Mathieu-Daudé

Hi Peter,

On 1/24/19 7:56 PM, Peter Maydell wrote:
> I had another look this afternoon at building our rST docs
> with sphinx-build. In particular, we currently have some
> docs in rst format, but we're not building them into HTML
> or shipping them. (Predictably, this means a few errors and
> warnings have crept in...)

Are you talking about the files in docs/ such as docs/devel/testing.rst?

Or about documentation within the source files?
If so, is that the format you are talking about?

https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#the-c-domain

'''
The C domain (name c) is suited for documentation of C API.

.. c:function:: function prototype

Describes a C function. The signature should be given as in C, e.g.:

.. c:function:: PyObject* PyType_GenericAlloc(PyTypeObject *type,
Py_ssize_t nitems)
'''

> 
> I had a play about with adding some makefile runes, but
> I'm not sure entirely what I should be aiming for.
> 
> (1) configure: My thought is that we should just make
> sphinx-build a requirement for the existing --enable-docs
> switch (as texinfo and pod2man are currently). The
> disadvantage is that we won't support a "build the half
> of the docs you have the tools for and leave the others"
> setup. The advantage, which I think is significant, is that
> distros will naturally be directed to the missing build
> dependency (either they're building with --enable-docs
> and will get the configure message, or they aren't and
> then their build will fail later because of missing docs
> files when they try to put the built files into the package).
> 
> (2) What do we actually want to ship?
> That is, what do we want 'make install-doc' to copy into
> the installation directory?
> https://wiki.qemu.org/Features/Documentation
> has a good suggested breakdown of docs for where we
> eventually want to be. I think we probably don't want
> to install the "developer's guide" (docs/devel) on
> end-user systems. The others are presumably OK.
> Currently, we seem to only install manpages and a
> few other things in the 'install-doc' makefile target
> (we don't install a bunch of plain-text user-facing
> docs) so this would be a significant expansion.
> 
> (3) Indexes, table-of-contents pages, etc
> Are we aiming to ship these?
> I think that we probably want to have what from
> Sphinx's point of view are multiple separate documents,
> so that they each get their own ToC and index. This
> means we can for instance ship the ToC/index for
> the user docs but not have it contain index entries
> for developer docs.
> 
> Overall what I'm hoping for is to be able to get some
> basic structure/building commands into master so we
> have a framework and something we can iterate on to
> move forward.
> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH v3 1/1] riscv: Ensure the kernel start address is correctly cast

2019-01-24 Thread Philippe Mathieu-Daudé

On 1/24/19 6:37 PM, Alistair Francis wrote:
> Cast the kernel start address to the target bit length.
> 
> This ensures that we calculate the initrd offset to a valid address for
> the architecture.
> 
> Steps to reproduce the original problem (reported by Alex):
>   Build U-Boot for the virt machine for riscv32. Then run it with
> 
> $ qemu-system-riscv32 -M virt -kernel u-boot -nographic -initrd 
> 
>   You can find the initrd address with
> 
> U-Boot# fdt addr $fdtcontroladdr
> U-Boot# fdt ls /chosen
> 
>   Then take a peek at that address:
> 
> U-Boot# md.b 
> 
>   and you will see that there is nothing there without this patch. The
>   reason is that the binary was loaded to a negative address.
> 
> Signed-off-by: Alistair Francis 
> Suggested-by: Alexander Graf 
> Reported-by: Alexander Graf 
> ---
> v3:
>  - Add steps to reproduce

Thanks, this is useful to write an acceptance test.

Reviewed-by: Philippe Mathieu-Daudé 

> v2:
>  - Remove old comment
>  hw/riscv/sifive_e.c | 2 +-
>  hw/riscv/sifive_u.c | 2 +-
>  hw/riscv/spike.c| 2 +-
>  hw/riscv/virt.c | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> index 5d9d65ff29..e5d7fc548e 100644
> --- a/hw/riscv/sifive_e.c
> +++ b/hw/riscv/sifive_e.c
> @@ -74,7 +74,7 @@ static const struct MemmapEntry {
>  [SIFIVE_E_DTIM] = { 0x80000000, 0x4000 }
>  };
>  
> -static uint64_t load_kernel(const char *kernel_filename)
> +static target_ulong load_kernel(const char *kernel_filename)
>  {
>  uint64_t kernel_entry, kernel_high;
>  
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 3bd3b67507..3b3acec377 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -65,7 +65,7 @@ static const struct MemmapEntry {
>  
>  #define GEM_REVISION        0x10070109
>  
> -static uint64_t load_kernel(const char *kernel_filename)
> +static target_ulong load_kernel(const char *kernel_filename)
>  {
>  uint64_t kernel_entry, kernel_high;
>  
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index 268df04c3c..79cb4c1282 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -53,7 +53,7 @@ static const struct MemmapEntry {
>  [SPIKE_DRAM] = { 0x80000000,        0x0 },
>  };
>  
> -static uint64_t load_kernel(const char *kernel_filename)
> +static target_ulong load_kernel(const char *kernel_filename)
>  {
>  uint64_t kernel_entry, kernel_high;
>  
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index e7f0716fb6..648462b18c 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -62,7 +62,7 @@ static const struct MemmapEntry {
>  [VIRT_PCIE_ECAM] =   { 0x30000000,    0x10000000 },
>  };
>  
> -static uint64_t load_kernel(const char *kernel_filename)
> +static target_ulong load_kernel(const char *kernel_filename)
>  {
>  uint64_t kernel_entry, kernel_high;
>  
>
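
The effect of the cast can be reproduced outside QEMU with a few lines of C. The sketch below is only an illustration under stated assumptions: target_ulong32 stands in for target_ulong on a 32-bit target, and the sign-extended entry value and the 128 MiB offset are made-up inputs, not values taken from a real run.

```c
#include <stdint.h>

/* Hypothetical stand-in for target_ulong on a 32-bit target. */
typedef uint32_t target_ulong32;

/* Truncate a loader-reported 64-bit entry point to the target width. */
static target_ulong32 to_target_addr(uint64_t entry)
{
    return (target_ulong32)entry;
}

/* Place the initrd at a fixed offset above the kernel entry point. */
static uint64_t initrd_start(uint64_t kernel_entry, uint64_t offset)
{
    return kernel_entry + offset;
}
```

With an entry of 0xffffffff80000000 and an offset of 0x8000000, initrd_start() on the raw value gives 0xffffffff88000000, which a 32-bit guest cannot address, while going through to_target_addr() first gives 0x88000000.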

Re: [Qemu-devel] [PATCH 3/4] aspeed/smc: Add dummy data register

2019-01-24 Thread Alistair Francis

On Thu, Jan 24, 2019 at 6:06 AM Cédric Le Goater  wrote:
>
> The SMC controllers have a register containing the byte that will be
> used as dummy output. It can be modified by software.
>
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/ssi/aspeed_smc.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
> index 6045ca11b969..9f3b6f4b4501 100644
> --- a/hw/ssi/aspeed_smc.c
> +++ b/hw/ssi/aspeed_smc.c
> @@ -98,8 +98,8 @@
>  /* Misc Control Register #1 */
>  #define R_MISC_CTRL1  (0x50 / 4)
>
> -/* Misc Control Register #2 */
> -#define R_MISC_CTRL2  (0x54 / 4)
> +/* SPI dummy cycle data */
> +#define R_DUMMY_DATA  (0x54 / 4)
>
>  /* DMA Control/Status Register */
>  #define R_DMA_CTRL(0x80 / 4)
> @@ -529,7 +529,7 @@ static void aspeed_smc_flash_setup(AspeedSMCFlash *fl, 
> uint32_t addr)
>   */
>  if (aspeed_smc_flash_mode(fl) == CTRL_FREADMODE) {
>  for (i = 0; i < aspeed_smc_flash_dummies(fl); i++) {
> -ssi_transfer(fl->controller->spi, 0xFF);
> +ssi_transfer(fl->controller->spi, s->regs[R_DUMMY_DATA] & 0xff);
>  }
>  }
>  }
> @@ -664,6 +664,7 @@ static uint64_t aspeed_smc_read(void *opaque, hwaddr 
> addr, unsigned int size)
>  addr == s->r_timings ||
>  addr == s->r_ce_ctrl ||
>  addr == R_INTR_CTRL ||
> +addr == R_DUMMY_DATA ||
>  (addr >= R_SEG_ADDR0 && addr < R_SEG_ADDR0 + s->ctrl->max_slaves) ||
>  (addr >= s->r_ctrl0 && addr < s->r_ctrl0 + s->ctrl->max_slaves)) {
>  return s->regs[addr];
> @@ -697,6 +698,8 @@ static void aspeed_smc_write(void *opaque, hwaddr addr, 
> uint64_t data,
>  if (value != s->regs[R_SEG_ADDR0 + cs]) {
>  aspeed_smc_flash_set_segment(s, cs, value);
>  }
> +} else if (addr == R_DUMMY_DATA) {
> +s->regs[addr] = value & 0xff;
>  } else {
>  qemu_log_mask(LOG_UNIMP, "%s: not implemented: 0x%" HWADDR_PRIx "\n",
>__func__, addr);
> --
> 2.20.1
>
>

[Qemu-devel] [PATCH v2 1/5] roms: add the edk2 project as a git submodule

2019-01-24 Thread Laszlo Ersek

The roms/edk2 submodule can help with three goals:
- build the OVMF and ArmVirtQemu virtual UEFI firmware platforms (to be
  implemented later),
- build the EfiRom tool on the fly, which is used in roms/Makefile, for
  building the "efirom" target,
- build UEFI test applications (to be run in guests), for qtest support.

Edk2 commit 85588389222a3636baf0f9ed8227f2434af4c3f9 stands for the latest
"stable tag", namely "edk2-stable201811".

The edk2 repository tracks some binary files that should not be removed by
QEMU's top-level "make clean"; exempt the full pathnames from the "find"
command.

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
Reviewed-by: Gerd Hoffmann 
---

Notes:
v2:
- pick up R-b [Gerd]

 Makefile    | 6 +-
 .gitmodules | 3 +++
 roms/edk2   | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index de898eab6234..b0a70b2929ad 100644
--- a/Makefile
+++ b/Makefile
@@ -604,7 +604,11 @@ clean:
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h 
gen-op-arm.h
rm -f qemu-options.def
rm -f *.msi
-   find . \( -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name 
'*.[oda]' \) -type f -exec rm {} +
+   find . \( -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name 
'*.[oda]' \) -type f \
+   ! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-aarch64.a \
+   ! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-arm.a \
+   ! -path ./roms/edk2/BaseTools/Source/Python/UPT/Dll/sqlite3.dll 
\
+   -exec rm {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* 
*.pod *~ */*~
rm -f fsdev/*.pod scsi/*.pod
rm -f qemu-img-cmds.h
diff --git a/.gitmodules b/.gitmodules
index 6b91176098c8..ceafb0ee29a0 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -49,3 +49,6 @@
 [submodule "tests/fp/berkeley-softfloat-3"]
path = tests/fp/berkeley-softfloat-3
url = https://github.com/cota/berkeley-softfloat-3
+[submodule "roms/edk2"]
+   path = roms/edk2
+   url = https://github.com/tianocore/edk2.git
diff --git a/roms/edk2 b/roms/edk2
new file mode 160000
index 000000000000..85588389222a
--- /dev/null
+++ b/roms/edk2
@@ -0,0 +1 @@
+Subproject commit 85588389222a3636baf0f9ed8227f2434af4c3f9
-- 
2.19.1.3.g30247aa5d201

[Qemu-devel] [PATCH v2 5/5] tests/data: introduce "uefi-boot-images" with the "bios-tables-test" ISOs

2019-01-24 Thread Laszlo Ersek

Add UEFI-bootable qcow2-compressed ISO images built from:

  tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- refresh the images (should only differ in internal timestamps etc)

v1:
- Again, if needed, I'd be happy to be designated as Maintainer for the
  files being added in this patch; please let me know the right spot in
  "MAINTAINERS".

 tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2 | Bin 0 -> 11776 bytes
 tests/data/uefi-boot-images/bios-tables-test.arm.iso.qcow2     | Bin 0 -> 11776 bytes
 tests/data/uefi-boot-images/bios-tables-test.i386.iso.qcow2    | Bin 0 -> 12800 bytes
 tests/data/uefi-boot-images/bios-tables-test.x86_64.iso.qcow2  | Bin 0 -> 13312 bytes
 4 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2 b/tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2
new file mode 100644
index 0000000000000000000000000000000000000000..8d7f8e67e85aa2156adb3423322018f6a6ea890c
GIT binary patch
literal 11776
[base85-encoded image data omitted; the archived copy is garbled and truncated]

Re: [Qemu-devel] [PATCH v2 2/2] gen_pcie_root_port: Add ACS (Access Control Services) capability

2019-01-24 Thread Knut Omang

On Thu, 2019-01-24 at 10:33 -0700, Alex Williamson wrote:
> On Thu, 24 Jan 2019 11:12:53 +0100
> Knut Omang  wrote:
> 
> > Claim ACS support in the generic PCIe root port to allow
> > passthrough of individual functions of a device to different
> > guests (in a nested virt. setting) with VFIO.
> > Without this patch, all functions of a device, such as all VFs of
> > an SR-IOV device, will end up in the same IOMMU group.
> > A similar situation occurs on Windows with Hyper-V.
> > 
> > In the single function device case, it also has a small cosmetic
> > benefit in that the root port itself is not grouped with
> > the device. VFIO handles that situation in that binding rules
> > only apply to endpoints, so it does not limit passthrough in
> > those cases.
> > 
> > Signed-off-by: Knut Omang 
> > ---
> >  hw/pci-bridge/gen_pcie_root_port.c | 2 ++
> >  hw/pci-bridge/pcie_root_port.c | 4 
> >  include/hw/pci/pcie_port.h | 1 +
> >  3 files changed, 7 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/gen_pcie_root_port.c 
> > b/hw/pci-bridge/gen_pcie_root_port.c
> > index 9766edb..b5a5ecc 100644
> > --- a/hw/pci-bridge/gen_pcie_root_port.c
> > +++ b/hw/pci-bridge/gen_pcie_root_port.c
> > @@ -20,6 +20,7 @@
> >  OBJECT_CHECK(GenPCIERootPort, (obj), TYPE_GEN_PCIE_ROOT_PORT)
> >  
> >  #define GEN_PCIE_ROOT_PORT_AER_OFFSET   0x100
> > +#define GEN_PCIE_ROOT_PORT_ACS_OFFSET   0x148
> 
> So you prefer that everyone passing through here decode these to figure
> out that ACS_OFFSET is (AER_OFFSET + ERR_SIZEOF) since my comment on v1
> was ignored?

Sorry, not at all - I managed to overlook your comment - will fix it,
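
A minimal sketch of the derivation Alex is asking for, so that nobody has to decode 0x148 by hand. It assumes the AER capability size is 0x48, mirroring PCI_ERR_SIZEOF from QEMU's include/hw/pci/pcie_regs.h; that value is redefined here only so the fragment stands alone, and gen_rp_acs_offset() is a hypothetical helper that exists just to expose the arithmetic:

```c
#include <assert.h>

/*
 * Assumed size of the AER extended capability. In QEMU this would come
 * from include/hw/pci/pcie_regs.h rather than being redefined locally.
 */
#define PCI_ERR_SIZEOF 0x48

#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
/* Place ACS directly behind AER instead of hard-coding the sum. */
#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)

/* Compile-time check that the derived value matches the posted patch. */
static_assert(GEN_PCIE_ROOT_PORT_ACS_OFFSET == 0x148,
              "ACS capability should start right after AER");

/* Hypothetical helper: returns the derived offset for inspection. */
static int gen_rp_acs_offset(void)
{
    return GEN_PCIE_ROOT_PORT_ACS_OFFSET;
}
```

Written this way, the ACS offset follows automatically if the AER offset ever moves.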

> >  #define GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR   1
> >  
> >  typedef struct GenPCIERootPort {
> > @@ -149,6 +150,7 @@ static void gen_rp_dev_class_init(ObjectClass *klass, 
> > void *data)
> >  rpc->interrupts_init = gen_rp_interrupts_init;
> >  rpc->interrupts_uninit = gen_rp_interrupts_uninit;
> >  rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
> > +rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
> >  }
> >  
> >  static const TypeInfo gen_rp_dev_info = {
> > diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
> > index 34ad767..a0b4cf7 100644
> > --- a/hw/pci-bridge/pcie_root_port.c
> > +++ b/hw/pci-bridge/pcie_root_port.c
> > @@ -47,6 +47,7 @@ static void rp_reset(DeviceState *qdev)
> >  pcie_cap_deverr_reset(d);
> >  pcie_cap_slot_reset(d);
> >  pcie_cap_arifwd_reset(d);
> > +pcie_cap_acs_reset(d);
> 
> Only the generic root port initializes acs_offset to enable an ACS
> capability, but all members of the device class call the reset function
> which does no checking that an ACS capability exists.  We've just
> corrupted config space for the device.

Ouch! Not good at all, sorry!
Will look at it (after a good night's sleep this time..)

Thanks!
Knut
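
The problem Alex describes can be sketched as follows. With acs_cap left at 0, the unguarded reset writes to config + 0 + PCI_ACS_CTRL, which is offset 0x06 in the standard header (the Status register). This is only an illustration under stated assumptions: MockPCIDevice, mock_set_word() and mock_acs_reset() are hypothetical stand-ins, not QEMU's PCIDevice or pci_set_word(), and the guard shown is one possible fix rather than the one that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the ACS Control register within the ACS capability. */
#define PCI_ACS_CTRL 0x06

/* Hypothetical stand-in for the parts of PCIDevice this sketch needs. */
typedef struct MockPCIDevice {
    uint8_t config[4096];
    uint16_t acs_cap;   /* 0 means "no ACS capability was added" */
} MockPCIDevice;

/* Little-endian 16-bit store, standing in for pci_set_word(). */
static void mock_set_word(uint8_t *p, uint16_t value)
{
    p[0] = value & 0xff;
    p[1] = value >> 8;
}

/* Reset the ACS control bits only if the capability actually exists. */
static void mock_acs_reset(MockPCIDevice *dev)
{
    assert(dev != NULL);
    if (dev->acs_cap) {
        mock_set_word(dev->config + dev->acs_cap + PCI_ACS_CTRL, 0);
    }
}
```

Root ports that never set acs_offset then keep their config space untouched on reset.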

> >  pcie_aer_root_reset(d);
> >  pci_bridge_reset(qdev);
> >  pci_bridge_disable_base_limit(d);
> > @@ -106,6 +107,9 @@ static void rp_realize(PCIDevice *d, Error **errp)
> >  pcie_aer_root_init(d);
> >  rp_aer_vector_update(d);
> >  
> > +if (rpc->acs_offset) {
> > +pcie_acs_init(d, rpc->acs_offset);
> > +}
> >  return;
> >  
> >  err:
> > diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
> > index df242a0..09586f4 100644
> > --- a/include/hw/pci/pcie_port.h
> > +++ b/include/hw/pci/pcie_port.h
> > @@ -78,6 +78,7 @@ typedef struct PCIERootPortClass {
> >  int exp_offset;
> >  int aer_offset;
> >  int ssvid_offset;
> > +int acs_offset;/* If nonzero, optional ACS capability offset */
> >  int ssid;
> >  } PCIERootPortClass;
> >  
>

[Qemu-devel] [PATCH v2 2/5] roms: build the EfiRom utility from the roms/edk2 submodule

2019-01-24 Thread Laszlo Ersek

Building the EfiRom utility from "roms/edk2/BaseTools" should make
"roms/Makefile" more self-contained. Otherwise, we'd call the system-wide
EfiRom for building the combined iPXE option ROMs, but call the sibling
utilities from "roms/edk2/BaseTools" for building "roms/edk2" content.

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
Reviewed-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---

Notes:
v2:
- pick up R-b / T-b [Gerd, Phil]

 roms/Makefile | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/roms/Makefile b/roms/Makefile
index a6043eff37e9..78d5dd18c301 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -47,10 +47,7 @@ SEABIOS_EXTRAVERSION="-prebuilt.qemu.org"
 # We need that to combine multiple images (legacy bios,
 # efi ia32, efi x64) into a single rom binary.
 #
-# We try to find it in the path.  You can also pass the location on
-# the command line, i.e. "make EFIROM=/path/to/EfiRom efirom"
-#
-EFIROM ?= $(shell which EfiRom 2>/dev/null)
+EFIROM = edk2/BaseTools/Source/C/bin/EfiRom
 
 default:
@echo "nothing is build by default"
@@ -59,8 +56,7 @@ default:
@echo "  vgabios-- update vgabios binaries (seabios)"
@echo "  sgabios-- update sgabios binaries"
@echo "  pxerom -- update nic roms (bios only)"
-   @echo "  efirom -- update nic roms (bios+efi, this needs"
-   @echo "the EfiRom utility from edk2 / tianocore)"
+   @echo "  efirom -- update nic roms (bios+efi)"
@echo "  slof   -- update slof.bin"
@echo "  skiboot-- update skiboot.lid"
@echo "  u-boot.e500-- update u-boot.e500"
@@ -106,7 +102,7 @@ pxe-rom-%: build-pxe-roms
 
 efirom: $(patsubst %,efi-rom-%,$(pxerom_variants))
 
-efi-rom-%: build-pxe-roms build-efi-roms
+efi-rom-%: build-pxe-roms build-efi-roms $(EFIROM)
$(EFIROM) -f "0x$(VID)" -i "0x$(DID)" -l 0x02 \
-b ipxe/src/bin/$(VID)$(DID).rom \
-ec ipxe/src/bin-i386-efi/$(VID)$(DID).efidrv \
@@ -124,6 +120,8 @@ build-efi-roms: build-pxe-roms
$(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
$(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets))
 
+$(EFIROM):
+   $(MAKE) -C edk2/BaseTools
 
 slof:
$(MAKE) -C SLOF CROSS=$(powerpc64_cross_prefix) qemu
@@ -150,6 +148,7 @@ clean:
$(MAKE) -C sgabios clean
rm -f sgabios/.depend
$(MAKE) -C ipxe/src veryclean
+   $(MAKE) -C edk2/BaseTools clean
$(MAKE) -C SLOF clean
rm -rf u-boot/build.e500
$(MAKE) -C u-boot-sam460ex distclean
-- 
2.19.1.3.g30247aa5d201

[Qemu-devel] [PATCH v2 0/5] add the BiosTablesTest UEFI app, build it with the new roms/edk2 submodule

2019-01-24 Thread Laszlo Ersek

Previous version (v1):
http://mid.mail-archive.com/20190118223400.24311-1-lersek@redhat.com

Updates in v2 have been noted on each patch in the series.

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 

Thanks
Laszlo

Laszlo Ersek (5):
  roms: add the edk2 project as a git submodule
  roms: build the EfiRom utility from the roms/edk2 submodule
  tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app
  tests/uefi-test-tools: add build scripts
  tests/data: introduce "uefi-boot-images" with the "bios-tables-test"
ISOs

 .gitmodules                                                              |   3 +
 Makefile                                                                 |   6 +-
 roms/Makefile                                                            |  13 +-
 roms/edk2                                                                |   1 +
 tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2           | Bin 0 -> 11776 bytes
 tests/data/uefi-boot-images/bios-tables-test.arm.iso.qcow2               | Bin 0 -> 11776 bytes
 tests/data/uefi-boot-images/bios-tables-test.i386.iso.qcow2              | Bin 0 -> 12800 bytes
 tests/data/uefi-boot-images/bios-tables-test.x86_64.iso.qcow2            | Bin 0 -> 13312 bytes
 tests/uefi-test-tools/.gitignore                                         |   3 +
 tests/uefi-test-tools/LICENSE                                            |  25
 tests/uefi-test-tools/Makefile                                           |  97 +
 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c   | 130 ++
 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.inf |  41 ++
 tests/uefi-test-tools/UefiTestToolsPkg/Include/Guid/BiosTablesTest.h     |  67 +
 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec              |  27
 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dsc              |  69 ++
 tests/uefi-test-tools/build.sh                                           | 145
 17 files changed, 619 insertions(+), 8 deletions(-)
 create mode 160000 roms/edk2
 create mode 100644 tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2
 create mode 100644 tests/data/uefi-boot-images/bios-tables-test.arm.iso.qcow2
 create mode 100644 tests/data/uefi-boot-images/bios-tables-test.i386.iso.qcow2
 create mode 100644 tests/data/uefi-boot-images/bios-tables-test.x86_64.iso.qcow2
 create mode 100644 tests/uefi-test-tools/.gitignore
 create mode 100644 tests/uefi-test-tools/LICENSE
 create mode 100644 tests/uefi-test-tools/Makefile
 create mode 100644 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c
 create mode 100644 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.inf
 create mode 100644 tests/uefi-test-tools/UefiTestToolsPkg/Include/Guid/BiosTablesTest.h
 create mode 100644 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec
 create mode 100644 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dsc
 create mode 100755 tests/uefi-test-tools/build.sh

-- 
2.19.1.3.g30247aa5d201

[Qemu-devel] [PATCH v2 3/5] tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app

2019-01-24 Thread Laszlo Ersek

The "bios-tables-test" program in QEMU's test suite locates the RSD PTR
ACPI table in guest RAM, and (chasing pointers to other ACPI tables)
performs various sanity checks on the QEMU-generated and
firmware-installed tables.

Currently this set of test cases doesn't work with UEFI guests. The ACPI
spec defines distinct methods for OSPM to locate the RSD PTR on
traditional BIOS vs. UEFI platforms, and the UEFI method is more difficult
to implement from the hypervisor side with just raw guest memory access.

Add a UEFI application (to be booted in the UEFI guest) that populates a
small, MB-aligned structure in guest RAM. The structure begins with a
signature GUID. The hypervisor should loop over all MB-aligned pages in
guest RAM until one matches the signature GUID at offset 0, at which point
the hypervisor can fetch the RSDP address field(s) from the structure.

QEMU's test logic currently spins on a pre-determined guest address, until
that address assumes a magic value. The method described in this patch is
conceptually the same ("busy loop until match is found"), except there is
no hard-coded address. This plays a lot more nicely with UEFI guest
firmware (we'll be able to use the normal page allocation UEFI service).
Given the size of EFI_GUID (16 bytes -- 128 bits), mismatches should be
astronomically unlikely. In addition, given the typical guest RAM size for
such tests (128 MB), there are 128 locations to check in one iteration of
the "outer" loop, which shouldn't introduce an intolerable delay after the
guest stores the RSDP address(es), and then the GUID.
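
The inner scan described above can be sketched as follows. This is a sketch under assumptions, not the test's actual code: it works on a plain host buffer rather than on guest RAM accessed through qtest, and find_signature(), SCAN_STRIDE and SIGNATURE_SIZE are hypothetical names. A caller would repeat the scan (the "outer" loop) until it returns a non-negative offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCAN_STRIDE    ((size_t)1024 * 1024)  /* the structure is MB-aligned */
#define SIGNATURE_SIZE 16                     /* sizeof(EFI_GUID) */

/*
 * Check every MB-aligned location in [ram, ram + ram_size) for the
 * 16-byte signature at offset 0. Returns the offset of the first match,
 * or -1 if no location matches in this pass.
 */
static int64_t find_signature(const uint8_t *ram, size_t ram_size,
                              const uint8_t signature[SIGNATURE_SIZE])
{
    size_t offset;

    assert(ram != NULL && signature != NULL);
    for (offset = 0; offset + SIGNATURE_SIZE <= ram_size;
         offset += SCAN_STRIDE) {
        if (memcmp(ram + offset, signature, SIGNATURE_SIZE) == 0) {
            return (int64_t)offset;
        }
    }
    return -1;
}
```

For a 128 MB guest this performs at most 128 comparisons of 16 bytes per pass, which is where the "128 locations" figure comes from.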

The GUID that the hypervisor should search for is

  AB87A6B1-2034-BDA0-71BD-375007757785

Expressed as a byte array:

 {
   0xb1, 0xa6, 0x87, 0xab,
   0x34, 0x20,
   0xa0, 0xbd,
   0x71, 0xbd, 0x37, 0x50, 0x07, 0x75, 0x77, 0x85
 }

Note that in the patch, we define "gBiosTablesTestGuid" with all bits
inverted. This is a simple method to prevent the UEFI binary, which
incorporates "gBiosTablesTestGuid", from matching the actual GUID in guest
RAM.
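
To make the relationship between the textual GUID, the byte array, and the inverted constant concrete, here is a small sketch. The Guid type and the guid_to_bytes() / guid_invert() helpers are hypothetical illustrations, not edk2's EFI_GUID API; the layout assumed is the usual one where the first three fields are stored little-endian and the last eight bytes are stored as written.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the four-field GUID layout. */
typedef struct Guid {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} Guid;

/* Serialize to the in-memory form: data1..data3 little-endian. */
static void guid_to_bytes(const Guid *g, uint8_t out[16])
{
    out[0] = g->data1 & 0xff;
    out[1] = (g->data1 >> 8) & 0xff;
    out[2] = (g->data1 >> 16) & 0xff;
    out[3] = (g->data1 >> 24) & 0xff;
    out[4] = g->data2 & 0xff;
    out[5] = g->data2 >> 8;
    out[6] = g->data3 & 0xff;
    out[7] = g->data3 >> 8;
    memcpy(out + 8, g->data4, 8);
}

/* Flip every bit; applying this twice restores the original GUID. */
static Guid guid_invert(Guid g)
{
    int i;

    g.data1 = ~g.data1;
    g.data2 = (uint16_t)~g.data2;
    g.data3 = (uint16_t)~g.data3;
    for (i = 0; i < 8; i++) {
        g.data4[i] = (uint8_t)~g.data4[i];
    }
    return g;
}
```

Inverting AB87A6B1-2034-BDA0-... field by field gives 0x5478594e, 0xdfcb, 0x425f, {0x8e, 0x42, 0xc8, 0xaf, ...}, which is the form the package declaration carries. The searched-for byte pattern therefore only appears in guest RAM once the application inverts it back at runtime.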

The UEFI application is written against the edk2 framework, which was
introduced earlier as a git submodule. The next patch will provide build
scripts for maintainers.

The source code follows the edk2 coding style, and is licensed under the
2-clause BSDL (in case someone would like to include UefiTestToolsPkg
content in a different edk2 platform).

The "UefiTestToolsPkg.dsc" platform description file resolves the used
edk2 library classes to instances (= library implementations) such that
the UEFI binaries inherit no platform dependencies. They are expected to
run on any system that conforms to the UEFI-2.3.1 spec (which was released
in 2012). The arch-specific build options are carried over from edk2's
ArmVirtPkg and OvmfPkg platforms.

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- no change

v1:
- If that's necessary, I'd be glad to be designated as Maintainer or
  Reviewer in "MAINTAINERS" for "tests/uefi-test-tools/", I just
  couldn't figure out under what subsystem I should add the magic lines.
  "MAINTAINERS" needs a Table of Contents! :)

 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec              |  27
 tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dsc              |  69 +++
 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.inf |  41 ++
 tests/uefi-test-tools/UefiTestToolsPkg/Include/Guid/BiosTablesTest.h     |  67 ++
 tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c   | 130
 tests/uefi-test-tools/LICENSE                                            |  25
 6 files changed, 359 insertions(+)

diff --git a/tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec b/tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec
new file mode 100644
index 000000000000..ed3a2fe11084
--- /dev/null
+++ b/tests/uefi-test-tools/UefiTestToolsPkg/UefiTestToolsPkg.dec
@@ -0,0 +1,27 @@
+## @file
+# edk2 package declaration for the test helper UEFI applications that run in
+# guests.
+#
+# Copyright (C) 2019, Red Hat, Inc.
+#
+# This program and the accompanying materials are licensed and made available
+# under the terms and conditions of the BSD License that accompanies this
+# distribution. The full text of the license may be found at
+# .
+#
+# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS, WITHOUT
+# WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
+##
+
+[Defines]
+  DEC_SPECIFICATION = 1.27
+  PACKAGE_NAME  = UefiTestToolsPkg
+  PACKAGE_GUID  = 7b3f1794-0c85-4b27-a536-44dbf0b0669c
+  PACKAGE_VERSION   = 0.1
+
+[Includes]
+  Include
+
+[Guids]
+  gBiosTablesTestGuid = {0x5478594e, 0xdfcb, 0x425f, {0x8e, 0x42, 0xc8, 0xaf,

[Qemu-devel] [PATCH v2 4/5] tests/uefi-test-tools: add build scripts

2019-01-24 Thread Laszlo Ersek

Introduce the following build scripts under "tests/uefi-test-tools":

* "build.sh" builds a single module (a UEFI application) from
  UefiTestToolsPkg, for a single QEMU emulation target.

  "build.sh" relies on cross-compilers when the emulation target and the
  build host architecture don't match. The cross-compiler prefix is
  computed according to a fixed, Linux-specific pattern. No attempt is
  made to copy or reimplement the GNU Make magic from "qemu/roms/Makefile"
  for cross-compiler prefix determination. The reason is that the build
  host OSes that are officially supported by edk2, and those that are
  supported by QEMU, intersect only in Linux. (Note that the UNIXGCC
  toolchain is being removed from edk2,
  .)

* "Makefile" currently builds the "UefiTestToolsPkg/BiosTablesTest"
  application, for arm, aarch64, i386, and x86_64, with the help of
  "build.sh".

  "Makefile" turns each resultant UEFI executable into a UEFI-bootable,
  qcow2-compressed ISO image. The ISO images are output as
  "tests/data/uefi-boot-images/bios-tables-test..iso.qcow2".

  Each ISO image should be passed to QEMU as follows:

-drive id=boot-cd,if=none,readonly,format=qcow2,file=$ISO \
-device virtio-scsi-pci,id=scsi0 \
-device scsi-cd,drive=boot-cd,bus=scsi0.0,bootindex=0 \

  "Makefile" assumes that "mkdosfs", "mtools", and "genisoimage" are
  present.

Cc: "Michael S. Tsirkin" 
Cc: Ard Biesheuvel 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Philippe Mathieu-Daudé 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- add the .NOTPARALLEL target [Phil, help-make, edk2-devel]

 tests/uefi-test-tools/Makefile   |  97 +
 tests/uefi-test-tools/.gitignore |   3 +
 tests/uefi-test-tools/build.sh   | 145 
 3 files changed, 245 insertions(+)

diff --git a/tests/uefi-test-tools/Makefile b/tests/uefi-test-tools/Makefile
new file mode 100644
index 000000000000..61d263861e01
--- /dev/null
+++ b/tests/uefi-test-tools/Makefile
@@ -0,0 +1,97 @@
+# Makefile for the test helper UEFI applications that run in guests.
+#
+# Copyright (C) 2019, Red Hat, Inc.
+#
+# This program and the accompanying materials are licensed and made available
+# under the terms and conditions of the BSD License that accompanies this
+# distribution. The full text of the license may be found at
+# .
+#
+# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS, WITHOUT
+# WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
+
+edk2_dir  := ../../roms/edk2
+images_dir:= ../data/uefi-boot-images
+emulation_targets := arm aarch64 i386 x86_64
+uefi_binaries := bios-tables-test
+intermediate_suffixes := .efi .fat .iso.raw
+
+images: $(foreach binary,$(uefi_binaries), \
+   $(foreach target,$(emulation_targets), \
+   $(images_dir)/$(binary).$(target).iso.qcow2))
+
+# Preserve all intermediate targets if the build succeeds.
+# - Intermediate targets help with development & debugging.
+# - Preserving intermediate targets also keeps spurious changes out of the
+#   final build products, in case the user re-runs "make" without any changes
+#   to the UEFI source code. Normally, the intermediate files would have been
+#   removed by the last "make" invocation, hence the re-run would rebuild them
+#   from the unchanged UEFI sources. Unfortunately, the "mkdosfs" and
+#   "genisoimage" utilities embed timestamp-based information in their outputs,
+#   which causes git to report differences for the tracked qcow2 ISO images.
+.SECONDARY: $(foreach binary,$(uefi_binaries), \
+   $(foreach target,$(emulation_targets), \
+   $(foreach suffix,$(intermediate_suffixes), \
+   Build/$(binary).$(target)$(suffix))))
+
+# In the pattern rules below, the stem (%, $*) stands for
+# "$(binary).$(target)".
+
+# Convert the raw ISO image to a qcow2 one, enabling compression, and using a
+# small cluster size. This allows for small binary files under git control,
+# hence for small binary patches.
+$(images_dir)/%.iso.qcow2: Build/%.iso.raw
+   mkdir -p -- $(images_dir)
+   $${QTEST_QEMU_IMG:-qemu-img} convert -f raw -O qcow2 -c \
+   -o cluster_size=512 -- $< $@
+
+# Embed the "UEFI system partition" into an ISO9660 file system as an ElTorito
+# boot image.
+Build/%.iso.raw: Build/%.fat
+   genisoimage -input-charset ASCII -efi-boot $(notdir $<) -no-emul-boot \
+   -quiet -o $@ -- $<
+
+# Define chained macros in order to map QEMU system emulation targets to
+# *short* UEFI architecture identifiers. Periods are allowed in, and ultimately
+# stripped from, the argument.
+map_arm_to_uefi = $(subst arm,ARM,$(1))
+map_aarch64_to_uefi = $(subst aarch64,AA64,$(call map_arm_to_uefi,$(1)))
+map_i386_to_uefi= $(subst i386,IA32,$(call

Re: [Qemu-devel] [PATCH] gdbstub: Fix i386/x86_64 machine description and add control registers

2019-01-24 Thread Doug Gale

The machine description we send is being (silently) thrown on the floor by
GDB and GDB silently uses the default machine description.

With current QEMU, if you debug gdb, and set debug_xml=1 and continue, then
attach to qemu gdbstub from the debugged gdb, you will see the xml parse
fail completely, and gdb will fall back to the default machine description,
silently, and changes to our xml (in qemu source code) have no effect. They
might as well be empty.

The point of fixing the machine description is that IDEs with GDB integration
break on QEMU. The default machine description has fs_base, which
fails to be retrieved, which breaks the whole register window (in
qt-creator at least, likely others). With my patch the register window
works perfectly.

I didn't delete anything, I removed the superfluous nesting of files by
xi:include and moved the description into a single xml file. I added
fs_base, gs_base, k_gs_base, cr0/2/3/4/6, efer.

Removing the nesting into xml includes fixes it because the xml parse fails
on 
wrote:

> On Thu, 24 Jan 2019 at 04:08, Doug Gale  wrote:
> >
> > Signed-off-by: Doug Gale 
> > ---
> >  configure   |   4 +-
> >  gdb-xml/i386-32bit-core.xml |  65 ---
> >  gdb-xml/i386-32bit-sse.xml  |  52 -
> >  gdb-xml/i386-32bit.xml  | 184 ++-
> >  gdb-xml/i386-64bit-core.xml |  73 -
> >  gdb-xml/i386-64bit-sse.xml  |  60 ---
> >  gdb-xml/i386-64bit.xml  | 210 +++-
> >  target/i386/cpu.c   |   4 +-
> >  target/i386/gdbstub.c   | 186 +++-
> >  9 files changed, 573 insertions(+), 265 deletions(-)
> >  delete mode 100644 gdb-xml/i386-32bit-core.xml
> >  delete mode 100644 gdb-xml/i386-32bit-sse.xml
> >  delete mode 100644 gdb-xml/i386-64bit-core.xml
> >  delete mode 100644 gdb-xml/i386-64bit-sse.xml
>
> Could you provide a commit message that explains what's
> wrong with the machine description we have (ie what bug
> or bugs this change is fixing) and why deleting half
> the xml files is the right way to fix it, please?
>
> Does the "add control registers" part need to be in
> the same patch, or is it a separate feature which
> could be in its own patch ?
>
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH 2/2] aspeed/scu: Implement power off register

2019-01-24 Thread Joel Stanley

On Fri, 4 Jan 2019 at 03:26, Peter Maydell  wrote:
>
> On Tue, 11 Dec 2018 at 03:11, Joel Stanley  wrote:
> >
> > This register does not exist in hardware. It is here to allow the guest
> > code to cause Qemu to exit when required.
> >
> > The register address chosen is unused in the emulated machines
> > datasheets.

> I'm always a bit dubious about adding things to QEMU devices
> which don't exist in the real hardware we're emulating. If we
> do want to do that, I think we should clearly flag them up as
> being QEMU-specific with suitable comments and naming of
> the #define, etc.

Since writing this patch I was made aware of -no-reboot. That flag
solves the problem I had so we can drop these patches for now.

Cheers,

Joel

Re: [Qemu-devel] [PATCH v2 1/2] pcie: Add a simple PCIe ACS (Access Control Services) helper function

2019-01-24 Thread Knut Omang

On Thu, 2019-01-24 at 10:22 -0700, Alex Williamson wrote:
> On Thu, 24 Jan 2019 11:12:52 +0100
> Knut Omang  wrote:
> 
> > Add a helper function to add PCIe capability for Access Control Services 
> > (ACS)
> > ACS support in the associated root port is a prerequisite to be able to do
> > passthrough of individual functions of a device with VFIO
> > without Alex Williamson's pcie_acs_override kernel patch or similar
> > in the guest.
> > 
> > Signed-off-by: Knut Omang 
> > ---
> >  hw/pci/pcie.c  | 21 +
> >  include/hw/pci/pcie.h  |  6 ++
> >  include/hw/pci/pcie_regs.h |  4 
> >  3 files changed, 31 insertions(+)
> > 
> > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > index 230478f..5ab3d1d 100644
> > --- a/hw/pci/pcie.c
> > +++ b/hw/pci/pcie.c
> > @@ -742,6 +742,13 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
> >  PCI_EXP_DEVCTL2_ARI;
> >  }
> >  
> > +/* Access Control Services (ACS)
> > + */
> 
> Comment style
>
> > +void pcie_cap_acs_reset(PCIDevice *dev)
> > +{
> > +pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
> > +}
> > +
> >  /**
> >   * pci express extended capability list management functions
> >   * uint16_t ext_cap_id (16 bit)
> > @@ -906,3 +913,17 @@ void pcie_ats_init(PCIDevice *dev, uint16_t offset)
> >  
> >  pci_set_word(dev->wmask + dev->exp.ats_cap + PCI_ATS_CTRL, 0x800f);
> >  }
> > +
> > +/* ACS (Access Control Services) */
> > +void pcie_acs_init(PCIDevice *dev, uint16_t offset)
> > +{
> > +pcie_add_capability(dev, PCI_EXT_CAP_ID_ACS, PCI_ACS_VER,
> > +offset, PCI_ACS_SIZEOF);
> > +dev->exp.acs_cap = offset;
> > +pci_set_word(dev->config + offset + PCI_ACS_CAP,
> > + PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | 
> > PCI_ACS_UF);
> 
> > This is still only valid for downstream ports, yet it is neither
> > restricted nor commented to indicate that.  You could use an
> > object_dynamic_cast to trigger an assert should someone use it for an
> > invalid type of device, ex:
> 
> assert(object_dynamic_cast(OBJECT(dev), TYPE_PCIE_SLOT));

Sorry, I didn't realize what you meant with v1 - this evolved from 
just a fix in the implementation of ioh3420 to a fix in the generic code, 
which I now realize of course is also used for downstream ports...

> > +
> > +pci_set_word(dev->config + offset + PCI_ACS_CTRL, 0);
> 
> Suspect this is unnecessary given the reset callback.

ok

> > +pci_set_word(dev->wmask + offset + PCI_ACS_CTRL,
> > + PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | 
> > PCI_ACS_UF);
> > +}
> > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > index 5b82a0d..4c40711 100644
> > --- a/include/hw/pci/pcie.h
> > +++ b/include/hw/pci/pcie.h
> > @@ -79,6 +79,9 @@ struct PCIExpressDevice {
> >  
> >  /* Offset of ATS capability in config space */
> >  uint16_t ats_cap;
> > +
> > +/* ACS */
> > +uint16_t acs_cap;
> >  };
> >  
> >  #define COMPAT_PROP_PCP "power_controller_present"
> > @@ -116,6 +119,8 @@ void pcie_cap_flr_init(PCIDevice *dev);
> >  void pcie_cap_flr_write_config(PCIDevice *dev,
> > uint32_t addr, uint32_t val, int len);
> >  
> > +void pcie_cap_acs_reset(PCIDevice *dev);
> > +
> >  /* ARI forwarding capability and control */
> >  void pcie_cap_arifwd_init(PCIDevice *dev);
> >  void pcie_cap_arifwd_reset(PCIDevice *dev);
> > @@ -129,6 +134,7 @@ void pcie_add_capability(PCIDevice *dev,
> >  void pcie_sync_bridge_lnk(PCIDevice *dev);
> >  
> >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > +void pcie_acs_init(PCIDevice *dev, uint16_t offset);
> >  void pcie_dev_ser_num_init(PCIDevice *dev, uint16_t offset, uint64_t 
> > ser_num);
> >  void pcie_ats_init(PCIDevice *dev, uint16_t offset);
> >  
> > diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
> > index ad4e780..3fc9aca 100644
> > --- a/include/hw/pci/pcie_regs.h
> > +++ b/include/hw/pci/pcie_regs.h
> > @@ -175,4 +175,8 @@ typedef enum PCIExpLinkWidth {
> >   PCI_ERR_COR_INTERNAL | \
> >   PCI_ERR_COR_HL_OVERFLOW)
> >  
> > +/* ACS */
> > +#define PCI_ACS_VER 0x2
> 
> There's no such version, even the PCIe 5.0 drafts only define version 1.

Hmm - I have no idea how it ended up as 2 in the first place - my model device
is of course also v1 - will fix it.

Thanks!
Knut

> > +#define PCI_ACS_SIZEOF  8
> > +
> >  #endif /* QEMU_PCIE_REGS_H */
>
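
To illustrate the review points above (restrict the helper to downstream
ports, advertise the ACS bits in the capability register, make the same bits
guest-writable in the control register, and leave clearing the control
register to the reset callback), here is a minimal standalone sketch. It is
not QEMU code: ToyPCIDevice, the toy_* helpers and the is_downstream_port
field are hypothetical stand-ins, and the plain assert() only stands in for
the object_dynamic_cast() check suggested in the review. The register offsets
and bit values are the ones Linux's pci_regs.h uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* ACS register offsets and bits, as in Linux's pci_regs.h. */
#define PCI_ACS_CAP   0x04   /* ACS Capability Register */
#define PCI_ACS_CTRL  0x06   /* ACS Control Register */
#define PCI_ACS_SV    0x0001 /* Source Validation */
#define PCI_ACS_TB    0x0002 /* Translation Blocking */
#define PCI_ACS_RR    0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR    0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF    0x0010 /* Upstream Forwarding */

/* Hypothetical stand-in for a PCIe device model; NOT QEMU's PCIDevice. */
typedef struct ToyPCIDevice {
    bool is_downstream_port;  /* assumption: set by the port's own model */
    uint16_t acs_cap;         /* offset of the ACS capability */
    uint8_t config[4096];     /* extended config space contents */
    uint8_t wmask[4096];      /* bits the guest is allowed to write */
} ToyPCIDevice;

/* Config space is little-endian, so store and load byte by byte. */
static inline void toy_set_word(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static inline uint16_t toy_get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Advertise ACS and make the control bits writable; downstream ports only. */
void toy_acs_init(ToyPCIDevice *dev, uint16_t offset)
{
    const uint16_t bits = PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR |
                          PCI_ACS_CR | PCI_ACS_UF;

    assert(dev->is_downstream_port);
    dev->acs_cap = offset;
    toy_set_word(dev->config + offset + PCI_ACS_CAP, bits);
    toy_set_word(dev->wmask + offset + PCI_ACS_CTRL, bits);
    /* The control register itself is cleared by the reset callback below. */
}

/* Reset callback: all ACS controls go back to disabled. */
void toy_acs_reset(ToyPCIDevice *dev)
{
    toy_set_word(dev->config + dev->acs_cap + PCI_ACS_CTRL, 0);
}
```

In this model, calling toy_acs_init() on a device that is not flagged as a
downstream port trips the assert immediately, which is the behaviour the
review asks for.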

Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169

2019-01-24 Thread Dr. David Alan Gilbert

* Kevin Wolf (kw...@redhat.com) wrote:
> Am 24.01.2019 um 11:49 hat Dr. David Alan Gilbert geschrieben:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > Am 24.01.2019 um 10:29 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > 23.01.2019 18:48, Max Reitz wrote:
> > > > > Hi,
> > > > > 
> > > > > When running 169 in parallel (e.g. like so:
> > > > > 
> > > > > $ while TEST_DIR=/tmp/t0 ./check -T -qcow2 169; do; done
> > > > > $ while TEST_DIR=/tmp/t1 ./check -T -qcow2 169; do; done
> > > > > $ while TEST_DIR=/tmp/t2 ./check -T -qcow2 169; do; done
> > > > > $ while TEST_DIR=/tmp/t3 ./check -T -qcow2 169; do; done
> > > > > 
> > > > > in four different shells), I get aborts:
> > > > > 
> > > > > (Often I get segfaults, but that's because of
> > > > > http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg05579.html 
> > > > > --
> > > > > feel free to apply the attached patch to make them go away)
> > > > > 
> > > > > 
> > > > > WARNING:qemu:qemu received signal 6:
> > > > > build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> > > > > -chardev socket,id=mon,path=/tmp/t0/tmpbX30XU/qemua-25745-monitor.sock
> > > > > -mon chardev=mon,mode=control -display none -vga none -qtest
> > > > > unix:path=/tmp/t0/qemua-25745-qtest.sock -machine accel=qtest
> > > > > -nodefaults -machine accel=qtest -drive
> > > > > if=virtio,id=drive0,file=/tmp/t0/disk_a,format=qcow2,cache=writeback
> > > > > .E..
> > > > > ==
> > > > > ERROR:
> > > > > test_do_test_migration_resume_source_not_persistent__not_migbitmap
> > > > > (__main__.TestDirtyBitmapMigration)
> > > > > --
> > > > > Traceback (most recent call last):
> > > > >File "169", line 206, in 
> > > > >  setattr(klass, 'test_' + method + name, lambda self: mc(self))
> > > > >File "169", line 113, in do_test_migration_resume_source
> > > > >  self.check_bitmap(self.vm_a, sha256)
> > > > >File "169", line 72, in check_bitmap
> > > > >  node='drive0', name='bitmap0')
> > > > >File "tests/qemu-iotests/../../scripts/qemu.py", line 369, in qmp
> > > > >  return self._qmp.cmd(cmd, args=qmp_args)
> > > > >File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 191, in 
> > > > > cmd
> > > > >  return self.cmd_obj(qmp_cmd)
> > > > >File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 174, in 
> > > > > cmd_obj
> > > > >  resp = self.__json_read()
> > > > >File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 82, in
> > > > > __json_read
> > > > >  data = self.__sockfile.readline()
> > > > >File "/usr/lib64/python2.7/socket.py", line 451, in readline
> > > > >  data = self._sock.recv(self._rbufsize)
> > > > > error: [Errno 104] Connection reset by peer
> > > > > 
> > > > > --
> > > > > Ran 20 tests
> > > > > 
> > > > > FAILED (errors=1)
> > > > > 
> > > > > 
> > > > > Or:
> > > > > 
> > > > > WARNING:qemu:qemu received signal 6:
> > > > > build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> > > > > -chardev socket,id=mon,path=/tmp/t3/tmp0pllWD/qemua-3445-monitor.sock
> > > > > -mon chardev=mon,mode=control -display none -vga none -qtest
> > > > > unix:path=/tmp/t3/qemua-3445-qtest.sock -machine accel=qtest 
> > > > > -nodefaults
> > > > > -machine accel=qtest -drive
> > > > > if=virtio,id=drive0,file=/tmp/t3/disk_a,format=qcow2,cache=writeback
> > > > > WARNING:qemu:qemu received signal 6:
> > > > > build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> > > > > -chardev socket,id=mon,path=/tmp/t3/tmp0pllWD/qemua-3445-monitor.sock
> > > > > -mon chardev=mon,mode=control -display none -vga none -qtest
> > > > > unix:path=/tmp/t3/qemua-3445-qtest.sock -machine accel=qtest 
> > > > > -nodefaults
> > > > > -machine accel=qtest -drive
> > > > > if=virtio,id=drive0,file=/tmp/t3/disk_a,format=qcow2,cache=writeback
> > > > > 
> > > > > ...F
> > > > > ==
> > > > > FAIL: test_do_test_migration_resume_source_persistent__not_migbitmap
> > > > > (__main__.TestDirtyBitmapMigration)
> > > > > --
> > > > > Traceback (most recent call last):
> > > > >File "169", line 206, in 
> > > > >  setattr(klass, 'test_' + method + name, lambda self: mc(self))
> > > > >File "169", line 125, in do_test_migration_resume_source
> > > > >  self.assertEqual(log, '')
> > > > > AssertionError: "qemu-system-x86_64: invalid runstate transition:
> > > > > 'running' -> 'postmigrate'\n" != ''
> > > > > 
> > > > > --
> > > > > Ran 20 tests
> > > > > 
> > > > > FAILED (failures=1)
> > > > > 
> > > > > 
> > > > > The backtrace always goes like this:
> > > > > 
> >

Re: [Qemu-devel] 3.1: second invocation of migrate crashes qemu

2019-01-24 Thread Dr. David Alan Gilbert

* Kevin Wolf (kw...@redhat.com) wrote:
> Am 21.01.2019 um 17:05 hat Dr. David Alan Gilbert geschrieben:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > Am 18.01.2019 um 16:57 hat Dr. David Alan Gilbert geschrieben:
> > > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > > Am 14.01.2019 um 11:51 hat Dr. David Alan Gilbert geschrieben:
> > > > > > * Michael Tokarev (m...@tls.msk.ru) wrote:
> > > > > > > $ qemu-system-x86_64 -monitor stdio -hda foo.img
> > > > > > > QEMU 3.1.0 monitor - type 'help' for more information
> > > > > > > (qemu) stop
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > qemu-system-x86_64: /build/qemu/qemu-3.1/block.c:4647: 
> > > > > > > bdrv_inactivate_recurse: Assertion `!(bs->open_flags & 
> > > > > > > BDRV_O_INACTIVE)' failed.
> > > > > > > Aborted
> > > > > > 
> > > > > > And on head as well; it only happens if the 1st migrate is successful;
> > > > > > if it got cancelled the 2nd one works, so it's not too bad.
> > > > > > 
> > > > > > I suspect the problem here is all around locking/ownership - the 
> > > > > > block
> > > > > > devices get shutdown at the end of migration since the assumption is
> > > > > > that the other end has them open now and we had better release them.
> > > > > 
> > > > > Yes, only "cont" gets control back to the source VM.
> > > > > 
> > > > > I think we really should limit the possible monitor commands in the
> > > > > postmigrate status, and possibly provide a way to get back to the
> > > > > regular paused state (which means getting back control of the 
> > > > > resources)
> > > > > without resuming the VM first.
> > > > 
> > > > This error is a little interesting if you'd done something like:
> > > > 
> > > > 
> > > >  src:
> > > >  stop
> > > >  migrate
> > > > 
> > > >  dst:
> > > >  
> > > >  start a new qemu
> > > > 
> > > >  src:
> > > >  migrate
> > > > 
> > > > Now that used to work (safely) - note we've not started
> > > > a VM successfully anywhere else.
> > > > 
> > > > Now the source refuses to let that happen - with a rather
> > > > nasty abort.
> > > 
> > > Essentially it's another effect of the problem that migration has always
> > > lacked a proper model of ownership transfer. And it's still treating
> > > this as a block layer problem rather than making it a core concept of
> > > migration as it should.
> > > 
> > > We can stack another one-off fix on top, and get back control of the
> > > block devices automatically on a second 'migrate'. But it feels like a
> > > hack and not like VMs had a properly designed and respected state
> > > machine.
> > 
> > Hmm; I don't like to get back to this argument because I think
> > we've got a perfectly serviceable model that's implemented at higher
> > levels outside qemu, and the real problem is the block layer added
> > new assumptions about the semantics without checking they were really
> > true.
> > qemu only has the view from a single host; it takes the higher level
> > view from something like libvirt to have the view across multiple hosts
> > to understand who has the ownership when.
> 
> Obviously the upper layer is not handling this without the help of QEMU
> or we wouldn't have had bugs that images were accessed by two QEMU
> processes at the same time. We didn't change the assumptions, but we
> only started to actually check the preconditions that have always been
> necessary to perform live migration correctly.

In this case there is a behaviour that was perfectly legal before that
fails now; further the case is safe - the source hasn't accessed the
disks after the first migration and isn't trying to access it again
either.
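
As a rough illustration of the difference between aborting and failing
gently on that second migrate, here is a small standalone model. It is only
a sketch under assumptions: ToyNode, TOY_O_INACTIVE and the toy_* functions
are hypothetical and are not the block layer's API, and returning -EBUSY is
just one possible choice of error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag value, used only inside this model. */
#define TOY_O_INACTIVE 0x1

/* Hypothetical stand-in for a block node. */
typedef struct ToyNode {
    int open_flags;
} ToyNode;

/*
 * Give up ownership of the node. Returns 0 on success and -EBUSY if the
 * node was already inactive, instead of asserting on that condition.
 */
int toy_inactivate(ToyNode *bs)
{
    assert(bs != NULL);
    if (bs->open_flags & TOY_O_INACTIVE) {
        return -EBUSY;
    }
    bs->open_flags |= TOY_O_INACTIVE;
    return 0;
}

/* Take ownership back, e.g. what 'cont' does on the source. */
void toy_activate(ToyNode *bs)
{
    assert(bs != NULL);
    bs->open_flags &= ~TOY_O_INACTIVE;
}

/* A 'migrate' that passes the error up to the monitor rather than dying. */
int toy_migrate(ToyNode *bs)
{
    int ret = toy_inactivate(bs);
    if (ret < 0) {
        return ret; /* caller can print an error; the process keeps running */
    }
    /* ... the migration stream would be sent here ... */
    return 0;
}
```

With this shape a second toy_migrate() without an intervening
toy_activate() returns an error the monitor could report, and the process
keeps running. Whether the real code should fail like this, or reacquire the
devices automatically, is exactly the open design question in this thread.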

> But if you like to think the upper layer should handle all of this,

I don't really want the upper layer to handle all of this; but I don't
think we can handle it all either - we've not got the higher level
view of screwups that happen outside qemu.

> then
> it's on libvirt to handle the ownership transfer manually. If we really
> want, we can add explicit QMP commands to activate and inactivate block
> nodes. This can be done and requiring that the management layer does
> all of this would be a consistent interface, too.
> 
> I just don't like this design much for two reasons: The first is that
> you can't migrate a VM that has disks with a simple 'migrate' command
> any more. The second is that if you implement it consistently, this has
> an impact on compatibility. I think it's a design that could be
> considered if we were adding live migration as a new feature, but it's
> probably hard to switch to it now.
> 
> In any case, I do think we should finally make a decision how ownership
> of resources should work in the context of migration, and then implement
> that.

I think we're mostly OK, but what I'd like would be:
  a) I'd like things to fail gently rather than abort; so I'd either
 like the current functions to fail

Re: [Qemu-devel] [PATCH 2/4] aspeed/smc: define registers for all possible CS

2019-01-24 Thread Joel Stanley

On Fri, 25 Jan 2019 at 01:08, Cédric Le Goater  wrote:
>
> The model should expose one control register per possible CS. When
> testing the validity of the register number in the read operation,
> replace 's->num_cs' by 'ctrl->max_slaves' which represents the maximum
> number of flash devices a controller can handle.
>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley

Re: [Qemu-devel] [PATCH 1/4] aspeed/smc: fix default read value

2019-01-24 Thread Joel Stanley

On Fri, 25 Jan 2019 at 01:08, Cédric Le Goater  wrote:
>
> 0x should be returned for non implemented registers.
>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley

Re: [Qemu-devel] [PATCH v3 15/50] audio: reduce glob_audio_state usage

2019-01-24 Thread Zoltán Kővágó

On 2019-01-24 12:19, Gerd Hoffmann wrote:
>   Hi,
> 
>> So, I think with the first part the only open issue is whenever we go
>> with the nested types (i.e. patch #1 as-is) or not.  Given that the
>> one-element-structs added in that patch will get additional fields I
>> think the nesting makes sense.
> 
> Spoke too soon: scripts/checkpatch.pl flags a bunch of codestyle issues.
> 

Most of them are about the code style of the old audio subsystem, I
fixed some of them but not everything.  IIRC last time it wasn't a
problem, but it was in 2015.  Should I go over them again and fix all of
them?

Regards,
Zoltan

Re: [Qemu-devel] [PATCH 3/4] aspeed/smc: Add dummy data register

2019-01-24 Thread Joel Stanley

On Fri, 25 Jan 2019 at 01:05, Cédric Le Goater  wrote:
>
> The SMC controllers have a register containing the byte that will be
> used as dummy output. It can be modified by software.
>
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Joel Stanley

[Qemu-devel] [Bug 1813201] [NEW] QEMU TCG i386 / x86_64 system emulation crash when executing int instruction

2019-01-24 Thread Alberto Ortega

Public bug reported:

QEMU version:
-

qemu from git, master branch commit
d058a37a6e8daa8d71a6f2b613eb415b69363755

Release versions are also affected.

Summary:


QEMU i386 and x86_64 system emulation crash when executing the following
"int" instruction:

cd08  int 8

This generates a kernel NULL pointer dereference error in Linux, and a
BSOD error in Windows.

No special permissions are required to execute the instruction, any
unprivileged user can execute it.

This issue has been reproduced in QEMU running in TCG mode. KVM is not
affected.

Kernel panic log:

[  111.091138] BUG: unable to handle kernel NULL pointer dereference at 0014
[  111.092145] IP: [] doublefault_fn+0xd/0x130
[  111.092145] *pdpt =  *pde = f000ff53f000ff53 [  111.092145] 
[  111.092145] Oops:  [#1] SMP
[  111.092145] Modules linked in: kvm_amd bochs_drm ppdev ttm drm_kms_helper 
drm kvm irqbypass evdev pcspkr serio_raw sg parport_pc parport button ip_tables 
x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb xts lrw gf128mul 
ablk_helper cryptd aes_i586 mbcache sr_mod sd_mod cdrom ata_generic ata_piix 
libata psmouse e1000 scsi_mod i2c_piix4 floppy
[  111.092145] CPU: 0 PID: 409 Comm: int8.elf Not tainted 4.9.0-8-686-pae #1 
Debian 4.9.130-2
[  111.092145] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
[  111.092145] task: f6c88a80 task.stack: f6e52000
[  111.092145] EIP: 0060:[] EFLAGS: 4086 CPU: 0
[  111.092145] EIP is at doublefault_fn+0xd/0x130
[  111.092145] EAX:  EBX:  ECX:  EDX: 
[  111.092145] ESI:  EDI:  EBP: ce8f13fc ESP: ce8f13d4
[  111.092145]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  111.092145] CR0: 8005003b CR2: 0014 CR3: 0e8e1000 CR4: 06f0
[  111.092145] Stack:
[  111.092145]         

[  111.092145]         

[  111.092145]      fed0 ce474ad0  
00017d78
[  111.092145] Call Trace:
[  111.092145] Code: 86 fd ff eb a3 89 f6 8d bc 27 00 00 00 00 55 89 e5 3e 8d 
74 26 00 5d e9 e2 79 fd ff 66 90 55 89 e5 56 53 83 ec 20 3e 8d 74 26 00 <65> a1 
14 00 00 00 89 45 f4 31 c0 31 c0 c7 45 f0 00 00 00 00 66
[  111.092145] EIP: [] [  111.092145] doublefault_fn+0xd/0x130
[  111.092145]  SS:ESP 0068:ce8f13d4
[  111.092145] CR2: 0014
[  111.092145] ---[ end trace 8afa7884b76cafc1 ]---

Testcase:
-

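int main(void) {
    /* Software interrupt 8 (the double-fault vector), issued from user mode. */
    __asm__ volatile("int $0x8");
    return 0;
}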

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: tcg

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1813201

Title:
  QEMU TCG i386 / x86_64 system emulation crash when executing int
  instruction

Status in QEMU:
  New


Re: [Qemu-devel] [PATCH] json: Fix % handling when not interpolating

2019-01-24 Thread Eric Blake

On 1/24/19 12:29 PM, Markus Armbruster wrote:

>>>   - block.c: JSON pseudo-filenames starting with "json:"
>>>
>>> Reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=1668244#c3
>>>
>>>   - block/rbd.c: JSON key pairs
>>>
>>> Pseudo-filenames starting with "rbd:".
>>>
>>
>> Missed curl as being impacted. You'd have to do a v2 pull request to
>> mention it now...
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1668244
> 
> Isn't that an instance of 'JSON pseudo-filenames starting with "json:"'?

Indeed - and I even linked to the same BZ without realizing it.  Nothing
further to see here, I'll go back to hiding in the corner...

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-24 Thread Eric Farman





On 01/23/2019 08:34 AM, Cornelia Huck wrote:

On Wed, 23 Jan 2019 14:06:01 +0100
Halil Pasic  wrote:


On Wed, 23 Jan 2019 11:34:47 +0100
Cornelia Huck  wrote:



Yes, one can usually think of interfaces as contracts: both sides need
to keep their end for things to work as intended. Unfortunately the
vfio-ccw interface is not a very well specified one, and that makes
reasoning about the right order so much harder.


That's probably where our disconnect comes from.



I was under the impression that the right ordering is dictated by the
SCSW in userspace. E.g. if there is an FC bit set there, userspace ought
not to issue an SSCH request (write to the io_region). The kernel part,
however, may say 'userspace, read the actual SCSW' by signaling
the io_trigger eventfd. Userspace is supposed to read the IRB from the
region and update its SCSW.

Now if userspace reads a broken SCSW from the IRB, because of a race
(due to poorly written kernel part -- userspace not at fault), it is
going to make wrong assumptions about currently legal and illegal
operations (ordering).


My understanding of the interface was that writing to the I/O region
triggers a ssch (unless rejected with error) and that reading it just
gets whatever the kernel wrote there the last time it updated its
internal structures. The eventfd simply triggers to say "the region has
been updated with an IRB", not to say "userspace, read this".



Previously I described a scenario where IRB can break without userspace
being at fault (race between unsolicited interrupt -- can happen at any
time -- and a legit io request). I was under the impression we agreed on
this.


There is a bug in there (clearing the cp for non-final interrupts), and
it needs to be fixed. I'm not so sure if the unsolicited interrupt
thing is a bug (beyond that the internal state machine is confused).



This in turn could lead to userspace violating the contract, as perceived
by the kernel side.


Which contract? ;)

Also, I'm not sure if we'd rather get a deferred cc 1?


As I'm encountering dcc=1 quite regularly lately, it's a nice error. 
But we don't have a good way of recovering from it, and so my test tends 
to go down in a heap quite quickly.  This patch set will probably help; 
I should really get it applied and try it out.







At this point, I'm mostly confused... I'd prefer to simply fix things
as they come up so that we can finally move forward with the halt/clear
handling (and probably rework the state machine on top of that.)


+1 for fixing things as we go.  I hear the complaints about this code 
(and probably say them too), but remain convinced that a large rewrite 
is unnecessary.  Lots of opportunities for improvement, with lots of 
willing and motivated participants, means it can only get better!


   


I understand. I guess you will want to send a new version because of the
stuff that got lost in the rebase, or?


Yes, I'll send a new version; but I'll wait for more feedback for a bit.



I'll try to provide some now.  Still digging through the emails marked 
"todo" :)


 - Eric

Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-24 Thread Eric Farman





On 01/24/2019 05:19 AM, Cornelia Huck wrote:

On Thu, 24 Jan 2019 11:08:02 +0100
Pierre Morel  wrote:


On 23/01/2019 11:21, Cornelia Huck wrote:

On Tue, 22 Jan 2019 19:33:46 +0100
Halil Pasic  wrote:
   

On Mon, 21 Jan 2019 12:03:51 +0100
Cornelia Huck  wrote:
  

--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -28,6 +28,7 @@
* @mdev: pointer to the mediated device
* @nb: notifier for vfio events
* @io_region: MMIO region to input/output I/O arguments/results
+ * @io_mutex: protect against concurrent update of I/O structures


We could be a bit more specific about what this mutex guards.
Is it only io_region, or cp, irb and the new regions as well? ->state does
not seem to be covered, but should need some sort of synchronisation
too, or?


I'm not sure. IIRC Pierre had some ideas about locking in the fsm?
   


Yes I postponed this work to not collide with your patch series.

Do you think I should provide a new version of the FSM reworking series
based on the last comment I got?

I would take into account that the asynchronous commands will come with
your patch series and only provide the framework changes.


This was more an answer to Halil's concerns around state
synchronization. I would prefer to first get this series (or a
variation) into decent shape, and then address state machine handling
on top of that (when we know more about the transitions involved), just
to avoid confusion.

Does that sound reasonable?



It does to me.

whatever bug is passed around daycare.  I'm catching up on my "todo" 
emails now!>

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Eduardo Habkost

On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > From: Zhang Yi 
> > > > > > > 
> > > > > > > Signed-off-by: Zhang Yi 
> > [...]
> > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > +   The backend is a file supporting DAX, e.g., a file on an ext4 
> > > > > > > or
> > > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but 
> > > > > > > the backend is
> > > > > > > +   not a file supporting DAX, mapping with this flag results in 
> > > > > > > an EOPNOTSUPP
> > > > > > > +   error.
> > > > > > 
> > > > > > Won't this break existing configurations that work today on QEMU
> > > > > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > > > > It won't: the pmem option defaults to off. If the people starting the
> > > > > VM don't know what the backend file is, the suggested (and default)
> > > > > setting is pmem=off. If they know the backend file has DAX capability,
> > > > > the suggestion is to set pmem=on.
> > > > > 
> > > > > For the special case where we use /dev/dax as the backend, we already
> > > > > have a patch to add MAP_SYNC flag mapping from device dax mode;
> > > > > see https://lkml.org/lkml/2018/4/22/524
> > > > > 
> > > > > So, if people force pmem=on while mapping a regular file, it will result
> > > > > in an EOPNOTSUPP error.
> > > > 
> > > > This is where compatibility is being broken, isn't it?  People
> > > > currently using pmem=on on a regular file will start getting
> > > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > > booting.  Maybe this is OK, but we need to be able to explain why
> > > > it is OK.
> > > 
> > > I think it's OK since pmem explicitly means "persistent":
> > > 
> > > The @option{pmem} option specifies whether the backing file specified
> > > by @option{mem-path} is in host persistent memory that can be accessed
> > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > If @option{pmem} is set to 'on', QEMU will take necessary operations to
> > > guarantee the persistence of its own writes to @option{mem-path}
> > > (e.g. in vNVDIMM label emulation and live migration).
> > 
> > If it's OK, let's at least explicitly document that we are
> > breaking compatibility in those cases.
> > 
> > 
> > > > > 
> > [...]
> > > I think generally MAP_SYNC is required.
> > > But for compatibility reasons we might need to support
> > > !MAP_SYNC on old kernels even though it's risky.
> > 
> > What about making MAP_SYNC optional only on older machine-types?
> 
> I don't think this makes sense. It's not a guest visible change,
> machine types are for that.

Losing data written to persistent memory is surely guest-visible
behavior.

-- 
Eduardo
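
To make the compatibility question in this thread concrete, here is a
minimal sketch of the two policies being discussed: either MAP_SYNC is
mandatory (the mapping fails with EOPNOTSUPP on a non-DAX file), or the
caller tolerates a plain MAP_SHARED mapping as a fallback. This is not the
code from the patch series; toy_map_backend() and its require_sync parameter
are hypothetical. The fallback #defines carry the values Linux uses on x86,
for libc headers that predate them, and are an assumption on other
architectures. Treating EINVAL as "kernel too old for MAP_SHARED_VALIDATE"
is likewise an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers (Linux x86 values; see lead-in). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Map 'len' bytes of 'fd' read/write and shared.
 *
 * First try MAP_SHARED_VALIDATE | MAP_SYNC. If that is refused and the
 * caller did not insist on MAP_SYNC, retry with plain MAP_SHARED.
 * '*synced' reports whether the returned mapping has MAP_SYNC semantics.
 * Returns MAP_FAILED on error, with errno set by mmap().
 */
void *toy_map_backend(int fd, size_t len, bool require_sync, bool *synced)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        *synced = true;
        return p;
    }

    *synced = false;
    if (require_sync || (errno != EOPNOTSUPP && errno != EINVAL)) {
        return MAP_FAILED; /* strict policy, or an unrelated failure */
    }

    /* Compatibility policy: no persistence guarantee, but the VM starts. */
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

Passing require_sync = true corresponds to the behaviour the documentation
patch describes for pmem=on; require_sync = false corresponds to the "risky
but compatible" option raised for old kernels or older machine types.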

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Michael S. Tsirkin

On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > From: Zhang Yi 
> > > > > > 
> > > > > > Signed-off-by: Zhang Yi 
> [...]
> > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > +   The backend is a file supporting DAX, e.g., a file on an ext4 or
> > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but the 
> > > > > > backend is
> > > > > > +   not a file supporting DAX, mapping with this flag results in an 
> > > > > > EOPNOTSUPP
> > > > > > +   error.
> > > > > 
> > > > > Won't this break existing configurations that work today on QEMU
> > > > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > > > It won't: the pmem option defaults to off. If the people starting the
> > > > VM don't know what the backend file is, the suggested (and default)
> > > > setting is pmem=off. If they know the backend file has DAX capability,
> > > > the suggestion is to set pmem=on.
> > > > 
> > > > For the special case where we use /dev/dax as the backend, we already
> > > > have a patch to add MAP_SYNC flag mapping from device dax mode;
> > > > see https://lkml.org/lkml/2018/4/22/524
> > > > 
> > > > So, if people force pmem=on while mapping a regular file, it will result
> > > > in an EOPNOTSUPP error.
> > > 
> > > This is where compatibility is being broken, isn't it?  People
> > > currently using pmem=on on a regular file will start getting
> > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > booting.  Maybe this is OK, but we need to be able to explain why
> > > it is OK.
> > 
> > I think it's OK since pmem explicitly means "persistent":
> > 
> > The @option{pmem} option specifies whether the backing file specified
> > by @option{mem-path} is in host persistent memory that can be accessed
> > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > If @option{pmem} is set to 'on', QEMU will take necessary operations to
> > guarantee the persistence of its own writes to @option{mem-path}
> > (e.g. in vNVDIMM label emulation and live migration).
> 
> If it's OK, let's at least explicitly document that we are
> breaking compatibility in those cases.
> 
> 
> > > > 
> [...]
> > I think generally MAP_SYNC is required.
> > But for compatibility reasons we might need to support
> > !MAP_SYNC on old kernels even though it's risky.
> 
> What about making MAP_SYNC optional only on older machine-types?

I don't think this makes sense. It's not a guest visible change,
machine types are for that.

> -- 
> Eduardo

[Qemu-devel] building rst docs with sphinx

2019-01-24 Thread Peter Maydell

I had another look this afternoon at building our rST docs
with sphinx-build. In particular, we currently have some
docs in rst format, but we're not building them into HTML
or shipping them. (Predictably, this means a few errors and
warnings have crept in...)

I had a play about with adding some makefile runes, but
I'm not sure entirely what I should be aiming for.

(1) configure: My thought is that we should just make
sphinx-build a requirement for the existing --enable-docs
switch (as texinfo and pod2man are currently). The
disadvantage is that we won't support a "build the half
of the docs you have the tools for and leave the others"
setup. The advantage, which I think is significant, is that
distros will naturally be directed to the missing build
dependency (either they're building with --enable-docs
and will get the configure message, or they aren't and
then their build will fail later because of missing docs
files when they try to put the built files into the package).

(2) What do we actually want to ship?
That is, what do we want 'make install-doc' to copy into
the installation directory?
https://wiki.qemu.org/Features/Documentation
has a good suggested breakdown of docs for where we
eventually want to be. I think we probably don't want
to install the "developer's guide" (docs/devel) on
end-user systems. The others are presumably OK.
Currently, we seem to only install manpages and a
few other things in the 'install-doc' makefile target
(we don't install a bunch of plain-text user-facing
docs) so this would be a significant expansion.

(3) Indexes, table-of-contents pages, etc
Are we aiming to ship these?
I think that we probably want to have what from
Sphinx's point of view are multiple separate documents,
so that they each get their own ToC and index. This
means we can for instance ship the ToC/index for
the user docs but not have it contain index entries
for developer docs.

Overall what I'm hoping for is to be able to get some
basic structure/building commands into master so we
have a framework and something we can iterate on to
move forward.

thanks
-- PMM

Re: [Qemu-devel] [PATCH] i386: extended the cpuid level when Intel PT is enabled

2019-01-24 Thread Eduardo Habkost

Hi,

Thanks for the patch.  Comment below:

On Thu, Jan 24, 2019 at 08:54:43PM -0500, Luwei Kang wrote:
> Intel Processor Trace requires CPUID[0x14], but the cpuid level
> is 0xd when creating a KVM guest with e.g. "-cpu qemu64,+intel-pt".
> 
> Signed-off-by: Luwei Kang 
> ---
>  target/i386/cpu.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 2f54125..da477b3 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5023,6 +5023,13 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error 
> **errp)
>  x86_cpu_adjust_feat_level(cpu, FEAT_C000_0001_EDX);
>  x86_cpu_adjust_feat_level(cpu, FEAT_SVM);
>  x86_cpu_adjust_feat_level(cpu, FEAT_XSAVE);
> +
> +/* Intel Processor Trace requires CPUID[0x14] */
> +if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
> + kvm_enabled()) {
> +x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> +}

This will require a new machine-type compatibility flag to enable
the new behavior, so we don't change CPUID data under the guest
feet during live migration.
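
To illustrate the kind of gating meant here, a minimal sketch follows. It is not QEMU code: the SketchCPU type, the intel_pt_auto_level flag and both helper names are hypothetical stand-ins. The assumption is that old machine types would leave the flag off so their CPUID level stays unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the CPU state; not QEMU's X86CPU. */
typedef struct SketchCPU {
    bool intel_pt_enabled;     /* guest was given the intel-pt feature */
    bool intel_pt_auto_level;  /* assumed compat flag, off on old machine types */
    uint32_t cpuid_min_level;  /* minimum CPUID level the guest will see */
} SketchCPU;

/* Raise *min to value if it is currently lower; never lowers it. */
static void sketch_adjust_level(uint32_t *min, uint32_t value)
{
    if (*min < value) {
        *min = value;
    }
}

/* Only bump the level to 0x14 when the compat flag allows the new behavior. */
static void sketch_expand_features(SketchCPU *cpu)
{
    if (cpu->intel_pt_enabled && cpu->intel_pt_auto_level) {
        sketch_adjust_level(&cpu->cpuid_min_level, 0x14);
    }
}
```

With the flag off, a guest migrated from an older QEMU keeps seeing level 0xd; only new machine types would get 0x14.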

> +
>  /* SVM requires CPUID[0x800A] */
>  if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
>  x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x800A);
> -- 
> 1.8.3.1
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v5] log: Make glib logging go through QEMU

2019-01-24 Thread Dr. David Alan Gilbert

* Markus Armbruster (arm...@redhat.com) wrote:
> Christophe Fergeau  writes:
> 
> > This commit adds a qemu_init_logging() helper which calls
> > g_log_set_default_handler() so that glib logs (g_log, g_warning, ...)
> > are handled similarly to other QEMU logs. This means they will get a
> > timestamp if timestamps are enabled, and they will go through the
> > monitor if one is configured.
> 
> s/monitor/HMP monitor/
> 
> I see why one would like to extend the timestamp feature to GLib log
> messages.  Routing them through the HMP monitor is perhaps debatable.
> Cc: Dave in case he has an opinion.

Yes, it's a little odd; what's wrong with stderr for this type of thing?
My experience has been that things like spice errors are fairly
asynchronous rather than directly triggered by commands, so maybe less
suitable for interleaving in the monitor.

While stderr and hmp output are normally the same, if someone has
HMP wired to a script, I'd assume this is more likely to break it.

Dave

> > This commit also adds a call to qemu_init_logging() to the binaries
> > installed by QEMU.
> > glib debug messages are enabled through G_MESSAGES_DEBUG similarly to
> > glib default log handler.
> >
> > At the moment, this change will mostly impact SPICE logging if your
> > spice version is >= 0.14.1. With older spice versions, this is not going
> > to work as expected, but will not have any ill effect, so this call is
> > not conditional on the SPICE version.
> >
> > Signed-off-by: Christophe Fergeau 
> > Reviewed-by: Daniel P. Berrangé 
> > Reviewed-by: Stefan Hajnoczi 
> > ---
> > One more iteration of the patch as it hit CI failures
> > (https://patchew.org/QEMU/20181214105642.673-1-cferg...@redhat.com/ )
> > Only difference from v4 is the addition of #include "qemu/error-report.h"
> > in bsd-user and linux-user.
> >
> >
> >  bsd-user/main.c |  2 ++
> >  include/qemu/error-report.h |  2 ++
> >  linux-user/main.c   |  2 ++
> >  qemu-img.c  |  1 +
> >  qemu-io.c   |  1 +
> >  qemu-nbd.c  |  1 +
> >  scsi/qemu-pr-helper.c   |  1 +
> >  util/qemu-error.c   | 47 +
> >  vl.c|  1 +
> >  9 files changed, 58 insertions(+)
> >
> > diff --git a/bsd-user/main.c b/bsd-user/main.c
> > index 0d3156974c..0df5c853d3 100644
> > --- a/bsd-user/main.c
> > +++ b/bsd-user/main.c
> > @@ -24,6 +24,7 @@
> >  #include "qapi/error.h"
> >  #include "qemu.h"
> >  #include "qemu/config-file.h"
> > +#include "qemu/error-report.h"
> >  #include "qemu/path.h"
> >  #include "qemu/help_option.h"
> >  #include "cpu.h"
> > @@ -743,6 +744,7 @@ int main(int argc, char **argv)
> >  if (argc <= 1)
> >  usage();
> >  
> > +qemu_init_logging();
> >  module_call_init(MODULE_INIT_TRACE);
> >  qemu_init_cpu_list();
> >  module_call_init(MODULE_INIT_QOM);
> > diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> > index 0a8d9cc9ea..2852e9df2a 100644
> > --- a/include/qemu/error-report.h
> > +++ b/include/qemu/error-report.h
> > @@ -49,6 +49,8 @@ bool error_report_once_cond(bool *printed, const char 
> > *fmt, ...)
> >  bool warn_report_once_cond(bool *printed, const char *fmt, ...)
> >  GCC_FMT_ATTR(2, 3);
> >  
> > +void qemu_init_logging(void);
> > +
> >  /*
> >   * Similar to error_report(), except it prints the message just once.
> >   * Return true when it prints, false otherwise.
> > diff --git a/linux-user/main.c b/linux-user/main.c
> > index a0aba9cb1e..d9b3ffd1f4 100644
> > --- a/linux-user/main.c
> > +++ b/linux-user/main.c
> > @@ -27,6 +27,7 @@
> >  #include "qemu/path.h"
> >  #include "qemu/config-file.h"
> >  #include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> >  #include "qemu/help_option.h"
> >  #include "cpu.h"
> >  #include "exec/exec-all.h"
> > @@ -600,6 +601,7 @@ int main(int argc, char **argv, char **envp)
> >  int ret;
> >  int execfd;
> >  
> > +qemu_init_logging();
> >  module_call_init(MODULE_INIT_TRACE);
> >  qemu_init_cpu_list();
> >  module_call_init(MODULE_INIT_QOM);
> > diff --git a/qemu-img.c b/qemu-img.c
> > index ad04f59565..9214392565 100644
> > --- a/qemu-img.c
> > +++ b/qemu-img.c
> > @@ -4912,6 +4912,7 @@ int main(int argc, char **argv)
> >  signal(SIGPIPE, SIG_IGN);
> >  #endif
> >  
> > +qemu_init_logging();
> >  module_call_init(MODULE_INIT_TRACE);
> >  error_set_progname(argv[0]);
> >  qemu_init_exec_dir(argv[0]);
> > diff --git a/qemu-io.c b/qemu-io.c
> > index 6df7731af4..ad38d12e68 100644
> > --- a/qemu-io.c
> > +++ b/qemu-io.c
> > @@ -524,6 +524,7 @@ int main(int argc, char **argv)
> >  signal(SIGPIPE, SIG_IGN);
> >  #endif
> >  
> > +qemu_init_logging();
> >  module_call_init(MODULE_INIT_TRACE);
> >  progname = g_path_get_basename(argv[0]);
> >  qemu_init_exec_dir(argv[0]);
> > diff --git a/qemu-nbd.c b/qemu-nbd.c
> > index 51b55f2e06..274b22d445 100644
> > ---

Re: [Qemu-devel] [PATCH RFC 1/2] virtio-blk: add DISCARD and WRITE ZEROES features

2019-01-24 Thread Stefano Garzarella

On Thu, Jan 24, 2019 at 6:55 PM Dr. David Alan Gilbert
 wrote:
>
> * Stefano Garzarella (sgarz...@redhat.com) wrote:
> > This patch adds the support of DISCARD and WRITE ZEROES commands,
> > that have been introduced in the virtio-blk protocol to have
> > better performance when using SSD backend.
> >
> > Signed-off-by: Stefano Garzarella 
>
> Hi,
>   Do you need to make those features machine-type dependent
> so that a VM started on a nice new qemu with this feature doesn't
> get confused when live migrated back to an older version?
>

Hi Dave,
oh, thanks! I think the answer is absolutely yes :)
I'll fix it!

Another doubt I have now is whether I need to flush the
MultiReqBuffer before submitting DISCARD or WRITE ZEROES commands.

Thanks,
Stefano

Re: [Qemu-devel] [PATCH] configure: Don't add Xen's libs to LDFLAGS

2019-01-24 Thread Peter Maydell

On Thu, 24 Jan 2019 at 17:40, Eric Blake  wrote:
>
> On 1/24/19 2:45 AM, Markus Armbruster wrote:
>
> >> Signed-off-by: Michael Tokarev 
> >> Revieved-by: Michael Tokarev 
> >
> > Typo in Reviewed-by.
>
> Should we tighten checkpatch.pl to flag suspicious-looking 'xxx-by:'
> tags, to catch instances of typos?

Yes, I would vote for having it whitelist the half a dozen
expected ones and complain about the rest. I think we
kind of discussed this in the past...
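
A rough sketch of the whitelist idea, written in C for illustration only (checkpatch.pl itself is Perl, and this is not its code). The tag list and the function name are assumptions; the real set of accepted tags would need to be agreed on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed whitelist; the actual "half a dozen" tags are a project decision. */
static const char *const sketch_known_tags[] = {
    "Signed-off-by", "Reviewed-by", "Acked-by",
    "Tested-by", "Reported-by", "Suggested-by",
};

/*
 * Return true if the line is shaped like "<word>-by: ..." but the word
 * before the colon is not one of the whitelisted tags.
 */
static bool sketch_tag_is_suspicious(const char *line)
{
    const char *colon = strchr(line, ':');
    size_t len, i;

    if (!colon) {
        return false;
    }
    len = (size_t)(colon - line);
    /* Only single-word prefixes ending in "-by" are treated as tags. */
    if (len < 4 || strncmp(colon - 3, "-by", 3) != 0) {
        return false;
    }
    if (memchr(line, ' ', len)) {
        return false;
    }
    for (i = 0; i < sizeof(sketch_known_tags) / sizeof(sketch_known_tags[0]); i++) {
        if (strlen(sketch_known_tags[i]) == len &&
            strncmp(line, sketch_known_tags[i], len) == 0) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions the "Revieved-by:" line quoted above would be flagged, while ordinary prose containing a colon would not.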

thanks
-- PMM

Re: [Qemu-devel] [PATCH] json: Fix % handling when not interpolating

2019-01-24 Thread Markus Armbruster

Eric Blake  writes:

> On 1/24/19 3:35 AM, Markus Armbruster wrote:
>
>> To gauge the bug's impact, let's review non-interpolating users of this
>> parser, i.e. code passing NULL context to json_message_parser_init():
>> 
>> * tests/check-qjson.c, tests/test-qobject-input-visitor.c,
>>   tests/test-visitor-serialization.c
>> 
>>   Plenty of tests, but we still failed to cover the buggy case.
>> 
>> * monitor.c: QMP input
>> 
>> * qga/main.c: QGA input
>> 
>> * qobject_from_json():
>> 
>>   - qobject-input-visitor.c: JSON command line option arguments of
>> -display and -blockdev
>> 
>> Reproducer: -blockdev '{"%"}'
>> 
>>   - block.c: JSON pseudo-filenames starting with "json:"
>> 
>> Reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=1668244#c3
>> 
>>   - block/rbd.c: JSON key pairs
>> 
>> Pseudo-filenames starting with "rbd:".
>> 
>
> Missed curl as being impacted. You'd have to do a v2 pull request to
> mention it now...
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1668244

Isn't that an instance of 'JSON pseudo-filenames starting with "json:"'?

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Eduardo Habkost

On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > From: Zhang Yi 
> > > > > 
> > > > > Signed-off-by: Zhang Yi 
[...]
> > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > +   The backend is a file supporting DAX, e.g., a file on an ext4 or
> > > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but the 
> > > > > backend is
> > > > > +   not a file supporting DAX, mapping with this flag results in an 
> > > > > EOPNOTSUPP
> > > > > +   error.
> > > > 
> > > > Won't this break existing configurations that work today on QEMU
> > > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > > It won't: the pmem option defaults to off. If whoever starts the VM
> > > doesn't know what the backend file is, the suggested (and default)
> > > setting is pmem=off. If they know the backend file has DAX capability,
> > > the suggestion is to set pmem=on.
> > > 
> > > For the special case where /dev/dax is used as the backend, we already
> > > have a patch that adds MAP_SYNC flag mapping for device DAX mode;
> > > see https://lkml.org/lkml/2018/4/22/524
> > > 
> > > So, if people force pmem=on while mapping a regular file, it will result
> > > in an EOPNOTSUPP error.
> > 
> > This is where compatibility is being broken, isn't it?  People
> > currently using pmem=on on a regular file will start getting
> > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > booting.  Maybe this is OK, but we need to be able to explain why
> > it is OK.
> 
> I think it's OK since pmem explicitly means "persistent":
> 
> The @option{pmem} option specifies whether the backing file specified
> by @option{mem-path} is in host persistent memory that can be accessed
> using the SNIA NVM programming model (e.g. Intel NVDIMM).
> If @option{pmem} is set to 'on', QEMU will take necessary operations to
> guarantee the persistence of its own writes to @option{mem-path}
> (e.g. in vNVDIMM label emulation and live migration).

If it's OK, let's at least explicitly document that we are
breaking compatibility in those cases.


> > > 
[...]
> I think generally MAP_SYNC is required.
> But for compatibility reasons we might need to support
> !MAP_SYNC on old kernels even though it's risky.

What about making MAP_SYNC optional only on older machine-types?
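
For reference, the "required vs. optional MAP_SYNC" choice could look roughly like the sketch below. This is not the patch under review: the function name and the require_sync parameter are hypothetical, and the fallback flag values are the ones used on x86 and asm-generic Linux, provided only for libcs whose headers lack them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback definitions (x86/asm-generic values) for older libc headers. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Try to map fd with MAP_SYNC first. If that fails and require_sync is
 * false, fall back to a plain shared mapping. *synced reports which one
 * was obtained. Returns NULL on failure.
 */
static void *sketch_map_backend(int fd, size_t size, bool require_sync,
                                bool *synced)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);

    if (p != MAP_FAILED) {
        *synced = true;
        return p;
    }
    *synced = false;
    if (require_sync) {
        /* On a non-DAX file the kernel typically reports EOPNOTSUPP here. */
        return NULL;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

In this sketch the compatibility question reduces to who sets require_sync: always true would break pmem=on on regular files, while tying it to the machine type or kernel support would keep old setups working at the cost of weaker persistence guarantees.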

-- 
Eduardo

Re: [Qemu-devel] [PATCH] json: Fix % handling when not interpolating

2019-01-24 Thread Eric Blake

On 1/24/19 3:35 AM, Markus Armbruster wrote:

> To gauge the bug's impact, let's review non-interpolating users of this
> parser, i.e. code passing NULL context to json_message_parser_init():
> 
> * tests/check-qjson.c, tests/test-qobject-input-visitor.c,
>   tests/test-visitor-serialization.c
> 
>   Plenty of tests, but we still failed to cover the buggy case.
> 
> * monitor.c: QMP input
> 
> * qga/main.c: QGA input
> 
> * qobject_from_json():
> 
>   - qobject-input-visitor.c: JSON command line option arguments of
> -display and -blockdev
> 
> Reproducer: -blockdev '{"%"}'
> 
>   - block.c: JSON pseudo-filenames starting with "json:"
> 
> Reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=1668244#c3
> 
>   - block/rbd.c: JSON key pairs
> 
> Pseudo-filenames starting with "rbd:".
> 

Missed curl as being impacted. You'd have to do a v2 pull request to
mention it now...

https://bugzilla.redhat.com/show_bug.cgi?id=1668244

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] Consistency of iotests 093 and 136

2019-01-24 Thread Eric Blake

On 1/24/19 8:34 AM, Alberto Garcia wrote:
> On Thu 24 Jan 2019 11:11:06 AM CET, Alberto Garcia wrote:
>> On Wed 23 Jan 2019 06:00:49 PM CET, Max Reitz wrote:
>>> Hi,
>>>
>>> 093 and 136 seem really flaky to me.  I can reproduce that by running:
>>
>> That's interesting, I can make 093 fail quite easily now (I haven't
>> tested the other one yet), but I don't think this happened
>> earlier. I'll try to figure out what's going on.
> 
> I bisected this and it seems that 093 started to fail after this:
> 
> 8258292e monitor: Remove "x-oob", offer capability "oob" unconditionally
> 
> I'm not familiar with that option so I need to investigate.

We've got several tests failing after making x-oob unconditional; here's
another thread:

https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg05587.html

Could it be that the test was using some sort of QMP command as an
attempt to synchronize state, but the OOB handling is now making it not
a reliable sync point?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169

2019-01-24 Thread Eric Blake

On 1/24/19 4:15 AM, Kevin Wolf wrote:

>> But how do we fix QEMU so it doesn't crash? Maybe forbid some transitions
>> (FINISH_MIGRATE -> RUNNING),
>> or at least error out of qmp_cont if the runstate is FINISH_MIGRATE?
> 

> I wonder whether the QAPI schema should have a field 'run-states' for
> commands, and by default we would only include states where the VM has
> ownership of its resources (e.g. images are activated) and which are not
> temporary states that are automatically left, like finish-migrate.

We already have 'allow-oob' and 'allow-preconfig' flags on a per-command
basis; you're basically proposing that we extend this mechanism for
marking other attributes of commands,...

> 
> Then the default for commands is to be rejected in "unusual" runstates
> where we're not expecting user intervention, and we must explicitly
> allow them if they are okay, in fact.
> 
> Instead of listing every obscure runstate, maybe we should really use
> categories of runstates instead:
> 
> 1. Running
> 2. Paused, owns all resources (like disk images)
> 3. Paused, doesn't own some resources (source VM after migration
>completes, destination before migration completes)
> 4. Paused temporarily for internal reasons (e.g. finish-migrate,
>restore-vm, save-vm)
> 
> Most commands should be okay with 1 and 2, but possibly not 3, and
> almost never 4.

...then enforcing that commands are only executed according to the
attributes they have (where the default attributes match categories 1
and 2, and commands have to opt-in if they are safe to run in category 3
or 4 just like they have to opt-in for preconfig or oob usage).
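
As a rough illustration of the scheme discussed above (not existing QEMU or QAPI code; every name here is hypothetical), the four categories can be modeled as a bitmask, with each command carrying the set of categories it may run in:

```c
#include <stdbool.h>

/* The four runstate categories from the list above, as bit flags. */
typedef enum {
    SKETCH_RS_RUNNING         = 1 << 0, /* 1. running */
    SKETCH_RS_PAUSED_OWNER    = 1 << 1, /* 2. paused, owns all resources */
    SKETCH_RS_PAUSED_NO_OWNER = 1 << 2, /* 3. paused, some resources not owned */
    SKETCH_RS_PAUSED_INTERNAL = 1 << 3, /* 4. temporary internal pause */
} SketchRunStateCategory;

/* Default for commands that do not opt in to anything else. */
#define SKETCH_RS_DEFAULT (SKETCH_RS_RUNNING | SKETCH_RS_PAUSED_OWNER)

typedef struct {
    const char *name;
    unsigned allowed;   /* bitmask of SketchRunStateCategory values */
} SketchCommand;

/* A command is dispatched only if the current category is in its mask. */
static bool sketch_command_allowed(const SketchCommand *cmd,
                                   SketchRunStateCategory current)
{
    return (cmd->allowed & (unsigned)current) != 0;
}
```

Under this sketch a command like "cont" with the default mask would be rejected while the VM is in finish-migrate (category 4), and a harmless query command could opt in to all four categories.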


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [PATCH RFC 1/2] virtio-blk: add DISCARD and WRITE ZEROES features

2019-01-24 Thread Dr. David Alan Gilbert

* Stefano Garzarella (sgarz...@redhat.com) wrote:
> This patch adds the support of DISCARD and WRITE ZEROES commands,
> that have been introduced in the virtio-blk protocol to have
> better performance when using SSD backend.
> 
> Signed-off-by: Stefano Garzarella 

Hi,
  Do you need to make those features machine-type dependent
so that a VM started on a nice new qemu with this feature doesn't
get confused when live migrated back to an older version?

Dave

> ---
>  hw/block/virtio-blk.c | 79 +++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index f208c6ddb9..8850957751 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -145,6 +145,25 @@ out:
>  aio_context_release(blk_get_aio_context(s->conf.conf.blk));
>  }
>  
> +static void virtio_blk_discard_wzeroes_complete(void *opaque, int ret)
> +{
> +VirtIOBlockReq *req = opaque;
> +VirtIOBlock *s = req->dev;
> +
> +aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
> +if (ret) {
> +if (virtio_blk_handle_rw_error(req, -ret, 0)) {
> +goto out;
> +}
> +}
> +
> +virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
> +virtio_blk_free_request(req);
> +
> +out:
> +aio_context_release(blk_get_aio_context(s->conf.conf.blk));
> +}
> +
>  #ifdef __linux__
>  
>  typedef struct {
> @@ -584,6 +603,56 @@ static int virtio_blk_handle_request(VirtIOBlockReq 
> *req, MultiReqBuffer *mrb)
>  virtio_blk_free_request(req);
>  break;
>  }
> +/*
> + * VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES are defined with
> + * VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch 
> statement,
> + * so we must mask it for these requests, then we will check the type.
> + */
> +case VIRTIO_BLK_T_DISCARD & ~VIRTIO_BLK_T_OUT:
> +case VIRTIO_BLK_T_WRITE_ZEROES & ~VIRTIO_BLK_T_OUT:
> +{
> +struct virtio_blk_discard_write_zeroes dwz_hdr;
> +uint64_t sector;
> +int bytes;
> +
> +if (unlikely(iov_to_buf(out_iov, out_num, 0, &dwz_hdr,
> +sizeof(dwz_hdr)) != sizeof(dwz_hdr))) {
> +virtio_error(vdev, "virtio-blk discard/wzeroes header too 
> short");
> +return -1;
> +}
> +
> +sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &dwz_hdr.sector);
> +bytes = virtio_ldl_p(VIRTIO_DEVICE(req->dev),
> + &dwz_hdr.num_sectors) << BDRV_SECTOR_BITS;
> +
> +if (!virtio_blk_sect_range_ok(req->dev, sector, bytes)) {
> +virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
> +virtio_blk_free_request(req);
> +return 0;
> +}
> +
> +if ((type & ~(VIRTIO_BLK_T_BARRIER)) == VIRTIO_BLK_T_DISCARD) {
> +blk_aio_pdiscard(req->dev->blk, sector << BDRV_SECTOR_BITS, 
> bytes,
> + virtio_blk_discard_wzeroes_complete, req);
> +} else if ((type & ~(VIRTIO_BLK_T_BARRIER)) ==
> +   VIRTIO_BLK_T_WRITE_ZEROES) {
> +int flags = 0;
> +
> +if (virtio_ldl_p(VIRTIO_DEVICE(req->dev), &dwz_hdr.flags) &
> +VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP) {
> +flags |= BDRV_REQ_MAY_UNMAP;
> +}
> +
> +blk_aio_pwrite_zeroes(req->dev->blk, sector << BDRV_SECTOR_BITS,
> +  bytes, flags,
> +  virtio_blk_discard_wzeroes_complete, req);
> +} else { /* Unsupported if VIRTIO_BLK_T_OUT is not set */
> +virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
> +virtio_blk_free_request(req);
> +}
> +
> +break;
> +}
>  default:
>  virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
>  virtio_blk_free_request(req);
> @@ -763,6 +832,14 @@ static void virtio_blk_update_config(VirtIODevice *vdev, 
> uint8_t *config)
>  blkcfg.alignment_offset = 0;
>  blkcfg.wce = blk_enable_write_cache(s->blk);
>  virtio_stw_p(vdev, &blkcfg.num_queues, s->conf.num_queues);
> +virtio_stl_p(vdev, &blkcfg.max_discard_sectors, BDRV_REQUEST_MAX_SECTORS);
> +virtio_stl_p(vdev, &blkcfg.max_discard_seg, 1);
> +virtio_stl_p(vdev, &blkcfg.discard_sector_alignment,
> + blk_size >> BDRV_SECTOR_BITS);
> +virtio_stl_p(vdev, &blkcfg.max_write_zeroes_sectors,
> + BDRV_REQUEST_MAX_SECTORS);
> +virtio_stl_p(vdev, &blkcfg.max_write_zeroes_seg, 1);
> +blkcfg.write_zeroes_may_unmap = 1;
>  memcpy(config, &blkcfg, sizeof(struct virtio_blk_config));
>  }
>  
> @@ -787,6 +864,8 @@ static uint64_t virtio_blk_get_features(VirtIODevice 
> *vdev, uint64_t features,
>  virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
>  virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
>  virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
> +virtio_add_feature(&features, VIRTIO_BLK_F_DISCARD);
> +virtio_add_feature(&features, VIRTIO_BLK_F_WRITE_ZEROES);
>  if

Re: [Qemu-devel] [PATCH] QGA: Fix guest-get-fsinfo PCI address collection in Windows

2019-01-24 Thread Matt Hines

Patch v2 removed the device number and added a summary



From: Michael Roth
Sent: Thursday, January 24, 2019 9:56
To: mhi...@scalecomputing.com; qemu-devel@nongnu.org
Cc: Matt Hines
Subject: Re: [PATCH] QGA: Fix guest-get-fsinfo PCI address collection inWindows

Quoting mhi...@scalecomputing.com (2019-01-14 03:03:23)
> From: Matt Hines 
> 
> Signed-off-by: Matt Hines 
> ---
>  configure|   2 +-
>  qga/commands-win32.c | 295 
> +--
>  qga/qapi-schema.json |   3 +-
>  3 files changed, 197 insertions(+), 103 deletions(-)
> 
> diff --git a/configure b/configure
> index 5b1d83ea26..46f21c089f 100755
> --- a/configure
> +++ b/configure
> @@ -4694,7 +4694,7 @@ int main(void) {
>  EOF
>if compile_prog "" "" ; then
>  guest_agent_ntddscsi=yes
> -libs_qga="-lsetupapi $libs_qga"
> +libs_qga="-lsetupapi -lcfgmgr32 $libs_qga"
>fi
>  fi
> 
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 62e1b51dfe..8c8f3a2c65 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #endif
>  #include 
> @@ -491,56 +492,29 @@ static GuestDiskBusType find_bus_type(STORAGE_BUS_TYPE 
> bus)
>  return win2qemu[(int)bus];
>  }
> 
> -/* XXX: The following function is BROKEN!
> - *
> - * It does not work and probably has never worked. When we query for list of
> - * disks we get cryptic names like "\Device\001d" instead of
> - * "\PhysicalDriveX" or "\HarddiskX". Whether the names can be translated one
> - * way or the other for comparison is an open question.
> - *
> - * When we query volume names (the original version) we are able to match 
> those
> - * but then the property queries report error "Invalid function". (duh!)
> - */
> -
> -/*
> -DEFINE_GUID(GUID_DEVINTERFACE_VOLUME,
> -0x53f5630dL, 0xb6bf, 0x11d0, 0x94, 0xf2,
> -0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> -*/
>  DEFINE_GUID(GUID_DEVINTERFACE_DISK,
>  0x53f56307L, 0xb6bf, 0x11d0, 0x94, 0xf2,
>  0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> +DEFINE_GUID(GUID_DEVINTERFACE_STORAGEPORT,
> +0x2accfe60L, 0xc130, 0x11d2, 0xb0, 0x82,
> +0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> 
> -
> -static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
> +static GuestPCIAddress *get_pci_info(int number, Error **errp)
>  {
>  HDEVINFO dev_info;
>  SP_DEVINFO_DATA dev_info_data;
> -DWORD size = 0;
> +SP_DEVICE_INTERFACE_DATA dev_iface_data;
> +HANDLE dev_file;
>  int i;
> -char dev_name[MAX_PATH];
> -char *buffer = NULL;
>  GuestPCIAddress *pci = NULL;
> -char *name = NULL;
>  bool partial_pci = false;
> +
>  pci = g_malloc0(sizeof(*pci));
>  pci->domain = -1;
>  pci->slot = -1;
>  pci->function = -1;
>  pci->bus = -1;
> 
> -if (g_str_has_prefix(guid, ".\\") ||
> -g_str_has_prefix(guid, "?\\")) {
> -name = g_strdup(guid + 4);
> -} else {
> -name = g_strdup(guid);
> -}
> -
> -if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) {
> -error_setg_win32(errp, GetLastError(), "failed to get dos device 
> name");
> -goto out;
> -}
> -
>  dev_info = SetupDiGetClassDevs(&GUID_DEVINTERFACE_DISK, 0, 0,
> DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
>  if (dev_info == INVALID_HANDLE_VALUE) {
> @@ -550,90 +524,208 @@ static GuestPCIAddress *get_pci_info(char *guid, Error 
> **errp)
> 
>  g_debug("enumerating devices");
>  dev_info_data.cbSize = sizeof(SP_DEVINFO_DATA);
> +dev_iface_data.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
>  for (i = 0; SetupDiEnumDeviceInfo(dev_info, i, &dev_info_data); i++) {
> -DWORD addr, bus, slot, data, size2;
> -int func, dev;
> -while (!SetupDiGetDeviceRegistryProperty(dev_info, &dev_info_data,
> -SPDRP_PHYSICAL_DEVICE_OBJECT_NAME,
> -&data, (PBYTE)buffer, size,
> -&size2)) {
> -size = MAX(size, size2);
> -if (GetLastError() == ERROR_INSUFFICIENT_BUFFER) {
> -g_free(buffer);
> -/* Double the size to avoid problems on
> - * W2k MBCS systems per KB 888609.
> - * https://support.microsoft.com/en-us/kb/259695 */
> -buffer = g_malloc(size * 2);
> -} else {
> +PSP_DEVICE_INTERFACE_DETAIL_DATA pdev_iface_detail_data = NULL;
> +STORAGE_DEVICE_NUMBER sdn;
> +char *parent_dev_id = NULL;
> +HDEVINFO parent_dev_info;
> +SP_DEVINFO_DATA parent_dev_info_data;
> +DWORD j;
> +DWORD size = 0;
> +
> +g_debug("getting device path");
> +if (SetupDiEnumDeviceInterfaces(dev_info, &dev_info_data,
> +&GUID_DEVINTERFACE_DISK, 0,
>

Re: [Qemu-devel] [PATCH] QGA: Fix guest-get-fsinfo PCI address collection in Windows

2019-01-24 Thread Michael Roth

Quoting mhi...@scalecomputing.com (2019-01-14 03:03:23)
> From: Matt Hines 
> 
> Signed-off-by: Matt Hines 
> ---
>  configure|   2 +-
>  qga/commands-win32.c | 295 
> +--
>  qga/qapi-schema.json |   3 +-
>  3 files changed, 197 insertions(+), 103 deletions(-)
> 
> diff --git a/configure b/configure
> index 5b1d83ea26..46f21c089f 100755
> --- a/configure
> +++ b/configure
> @@ -4694,7 +4694,7 @@ int main(void) {
>  EOF
>if compile_prog "" "" ; then
>  guest_agent_ntddscsi=yes
> -libs_qga="-lsetupapi $libs_qga"
> +libs_qga="-lsetupapi -lcfgmgr32 $libs_qga"
>fi
>  fi
> 
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 62e1b51dfe..8c8f3a2c65 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #endif
>  #include 
> @@ -491,56 +492,29 @@ static GuestDiskBusType find_bus_type(STORAGE_BUS_TYPE 
> bus)
>  return win2qemu[(int)bus];
>  }
> 
> -/* XXX: The following function is BROKEN!
> - *
> - * It does not work and probably has never worked. When we query for list of
> - * disks we get cryptic names like "\Device\001d" instead of
> - * "\PhysicalDriveX" or "\HarddiskX". Whether the names can be translated one
> - * way or the other for comparison is an open question.
> - *
> - * When we query volume names (the original version) we are able to match 
> those
> - * but then the property queries report error "Invalid function". (duh!)
> - */
> -
> -/*
> -DEFINE_GUID(GUID_DEVINTERFACE_VOLUME,
> -0x53f5630dL, 0xb6bf, 0x11d0, 0x94, 0xf2,
> -0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> -*/
>  DEFINE_GUID(GUID_DEVINTERFACE_DISK,
>  0x53f56307L, 0xb6bf, 0x11d0, 0x94, 0xf2,
>  0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> +DEFINE_GUID(GUID_DEVINTERFACE_STORAGEPORT,
> +0x2accfe60L, 0xc130, 0x11d2, 0xb0, 0x82,
> +0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> 
> -
> -static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
> +static GuestPCIAddress *get_pci_info(int number, Error **errp)
>  {
>  HDEVINFO dev_info;
>  SP_DEVINFO_DATA dev_info_data;
> -DWORD size = 0;
> +SP_DEVICE_INTERFACE_DATA dev_iface_data;
> +HANDLE dev_file;
>  int i;
> -char dev_name[MAX_PATH];
> -char *buffer = NULL;
>  GuestPCIAddress *pci = NULL;
> -char *name = NULL;
>  bool partial_pci = false;
> +
>  pci = g_malloc0(sizeof(*pci));
>  pci->domain = -1;
>  pci->slot = -1;
>  pci->function = -1;
>  pci->bus = -1;
> 
> -if (g_str_has_prefix(guid, ".\\") ||
> -g_str_has_prefix(guid, "?\\")) {
> -name = g_strdup(guid + 4);
> -} else {
> -name = g_strdup(guid);
> -}
> -
> -if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) {
> -error_setg_win32(errp, GetLastError(), "failed to get dos device 
> name");
> -goto out;
> -}
> -
>  dev_info = SetupDiGetClassDevs(&GUID_DEVINTERFACE_DISK, 0, 0,
> DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
>  if (dev_info == INVALID_HANDLE_VALUE) {
> @@ -550,90 +524,208 @@ static GuestPCIAddress *get_pci_info(char *guid, Error 
> **errp)
> 
>  g_debug("enumerating devices");
>  dev_info_data.cbSize = sizeof(SP_DEVINFO_DATA);
> +dev_iface_data.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
>  for (i = 0; SetupDiEnumDeviceInfo(dev_info, i, &dev_info_data); i++) {
> -DWORD addr, bus, slot, data, size2;
> -int func, dev;
> -while (!SetupDiGetDeviceRegistryProperty(dev_info, &dev_info_data,
> -SPDRP_PHYSICAL_DEVICE_OBJECT_NAME,
> -&data, (PBYTE)buffer, size,
> -&size2)) {
> -size = MAX(size, size2);
> -if (GetLastError() == ERROR_INSUFFICIENT_BUFFER) {
> -g_free(buffer);
> -/* Double the size to avoid problems on
> - * W2k MBCS systems per KB 888609.
> - * https://support.microsoft.com/en-us/kb/259695 */
> -buffer = g_malloc(size * 2);
> -} else {
> +PSP_DEVICE_INTERFACE_DETAIL_DATA pdev_iface_detail_data = NULL;
> +STORAGE_DEVICE_NUMBER sdn;
> +char *parent_dev_id = NULL;
> +HDEVINFO parent_dev_info;
> +SP_DEVINFO_DATA parent_dev_info_data;
> +DWORD j;
> +DWORD size = 0;
> +
> +g_debug("getting device path");
> +if (SetupDiEnumDeviceInterfaces(dev_info, &dev_info_data,
> +&GUID_DEVINTERFACE_DISK, 0,
> +&dev_iface_data)) {
> +while (!SetupDiGetDeviceInterfaceDetail(dev_info, &dev_iface_data,
> +pdev_iface_detail_data,
> +

Re: [Qemu-devel] [PATCH] qga: check length of command-line & environment variables

2019-01-24 Thread P J P

+-- On Thu, 24 Jan 2019, Michael Roth wrote --+
| I would call a helper function like get_args_max() or whatever and have
| the posix implementation in qga/commands-posix.c and a stub'd version
| in qga/commands-win32.c. There's an article here that might be useful
| for figuring out how we would implement get_args_max() it for win32:
| 
|   https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553

Interesting, I'll follow-up on it.

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-24 Thread Michael S. Tsirkin

On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > From: Zhang Yi 
> > > > 
> > > > Signed-off-by: Zhang Yi 
> > > > ---
> > > >  docs/nvdimm.txt | 29 -
> > > >  qemu-options.hx |  4 
> > > >  2 files changed, 32 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
> > > > index 5f158a6..166c395 100644
> > > > --- a/docs/nvdimm.txt
> > > > +++ b/docs/nvdimm.txt
> > > > @@ -142,11 +142,38 @@ backend of vNVDIMM:
> > > >  Guest Data Persistence
> > > >  --
> > > >  
> > > > +vNVDIMM is designed and implemented to guarantee the guest data
> > > > +persistence on the backends in case of host crash or a power failures.
> > > > +However, there are still some requirements and limitations
> > > > +as explained below.
> > > > +
> > > >  Though QEMU supports multiple types of vNVDIMM backends on Linux,
> > > > -currently the only one that can guarantee the guest write persistence
> > > > +if MAP_SYNC is not supported by the host kernel and the backends,
> > > > +the only backend that can guarantee the guest write persistence
> > > >  is the device DAX on the real NVDIMM device (e.g., /dev/dax0.0), to
> > > >  which all guest access do not involve any host-side kernel cache.
> > > >  
> > > > +mmap(2) flag MAP_SYNC is added since Linux kernel 4.15. On such
> > > > +systems, QEMU can mmap(2) the dax backend files with MAP_SYNC, which
> > > > +ensures filesystem metadata consistency in case of a host crash or a power
> > > > +failure. Enabling MAP_SYNC in QEMU requires below conditions
> > > > +
> > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > +   The backend is a file supporting DAX, e.g., a file on an ext4 or
> > > > +   xfs file system mounted with '-o dax'. if your pmem=on ,but the backend is
> > > > +   not a file supporting DAX, mapping with this flag results in an EOPNOTSUPP
> > > > +   error.
> > > 
> > > Won't this break existing configurations that work today on QEMU
> > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > It won't: the pmem option defaults to off. If people starting a VM don't
> > know what the backend file is, it is suggested (and the default) to set
> > pmem=off; if people know the backend file has DAX capability, it is
> > suggested to set pmem=on. 
> > 
> > For the special case where we use /dev/dax as the backend, we already have a
> > patch to add MAP_SYNC flag mapping for device dax mode.
> > see https://lkml.org/lkml/2018/4/22/524 
> > 
> > So, if people force pmem=on while mapping a regular file, it will result
> > in an EOPNOTSUPP error. 
> 
> This is where compatibility is being broken, isn't it?  People
> currently using pmem=on on a regular file will start getting
> errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> booting.  Maybe this is OK, but we need to be able to explain why
> it is OK.

I think it's OK since pmem explicitly means "persistent":

The @option{pmem} option specifies whether the backing file specified
by @option{mem-path} is in host persistent memory that can be accessed
using the SNIA NVM programming model (e.g. Intel NVDIMM).
If @option{pmem} is set to 'on', QEMU will take necessary operations to
guarantee the persistence of its own writes to @option{mem-path}
(e.g. in vNVDIMM label emulation and live migration).
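
For reference, a minimal sketch of the two mapping policies under discussion, assuming a Linux host. map_backend() and its sync_required parameter are hypothetical names for illustration, not QEMU code, and the fallback macro values are the asm-generic ones, used only if the libc headers lack them.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallbacks (asm-generic values) for older libc headers. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper: try a MAP_SYNC mapping first.
 * sync_required != 0: a failed MAP_SYNC attempt is returned as MAP_FAILED.
 * sync_required == 0: fall back to a plain MAP_SHARED mapping.
 * *synced reports whether the returned mapping uses MAP_SYNC.
 */
static void *map_backend(int fd, size_t len, int sync_required, int *synced)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);

    if (p != MAP_FAILED) {
        *synced = 1;
        return p;
    }
    *synced = 0;
    if (sync_required) {
        return MAP_FAILED;      /* e.g. EOPNOTSUPP on a non-DAX file */
    }
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With sync_required set, a regular (non-DAX) file is rejected; without it, the same file is still mapped, only without the MAP_SYNC guarantee.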




> > 
> > see http://man7.org/linux/man-pages/man2/mmap.2.html 
> > > 
> > > > +
> > > > + - 'share' option of memory-backend-file is 'on':
> > > > +   MAP_SYNC flag available only with the MAP_SHARED_VALIDATE mapping type.
> > > 
> > > I don't understand what this paragraph means.
> > see http://man7.org/linux/man-pages/man2/mmap.2.html 
> > > 
> > > > +
> > > > + - 'MAP_SYNC' is supported on linux kernel.(default opened since Linux 4.15)
> > > > +
> > > 
> > > I don't understand why you are making the semantics of
> > > command-line options change depending on the host kernel.
> > The pmem=on option does not depend on the host kernel. MAP_SYNC will be
> > ignored if the kernel doesn't support it. "pmem=on" has another meaning,
> > see https://patchwork.kernel.org/patch/10459407/
> > > 
> > > > +Otherwise, We will ignore the MAP_SYNC flag.
> > > > +
> > > 
> > > See the questions I sent about supported use cases at
> > > .
> > > I still don't see those questions answered:
> > > 
> > > ] We have at least 3 different possible use cases we might need to
> > > ] support:
> > > ] 
> > > ] 1) pmem=on, MAP_SYNC not desired
> > > ] 2) pmem=on, MAP_SYNC desired but optional
> > > ] 3) pmem=on, MAP_SYNC required, not optional
> > > ] 
> > 
> > Sorry for my poor understanding, I don't know what these mean? 
> > pmem=on will force flag

Re: [Qemu-devel] Emulation of TCG OPAL self-encrypting drive

2019-01-24 Thread John Snow




On 1/24/19 5:24 AM, David Kozub wrote:
>>
> 
> libata seems to support SCSI / ATA Translation, including ATA PASS
> THROUGH (12) and ATA PASS THROUGH (16). Is this not sufficient? (The
> implementation can be seen in ata_scsi_pass_thru.)
> 

Oh, I missed this! Thanks for pointing it out. I'll take a look and see
if I can recommend where to start making incisions in QEMU.

>> ...but you could create an emulated SCSI disk and then pass those SCSI
>> commands to an ATA device -- achieving a *kind* of pass through, but I
>> don't know if that's helpful to your project. If so, I'd start looking
>> at the scsi disk sources instead of the ATA sources.
> 
> Perhaps. Or maybe forgetting about pass-through and really just
> implementing OPAL in QEMU.

Re: [Qemu-devel] [PATCH] configure: Don't add Xen's libs to LDFLAGS

2019-01-24 Thread Eric Blake

On 1/24/19 2:45 AM, Markus Armbruster wrote:

>> Signed-off-by: Michael Tokarev 
>> Revieved-by: Michael Tokarev 
> 
> Typo in Reviewed-by.

Should we tighten checkpatch.pl to flag suspicious-looking 'xxx-by:'
tags, to catch instances of typos?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [PATCH] qga: check length of command-line & environment variables

2019-01-24 Thread Michael Roth

Quoting P J P (2019-01-13 11:28:03)
> +-- On Fri, 11 Jan 2019, Daniel P. Berrangé wrote --+
> | qga/commands.c already includes qemu/osdep.h which includes unistd.h.
> |
> | The build problem patchew reported was from *mingw* builds where
> | sysconf does not exist.
> 
> I see; Not sure how to fix it. Maybe with conditional declaration?
> 
> #ifdef __MINGW[32|64]__
> extern long int sysconf (int __name);
> #endif

I would call a helper function like get_args_max() or whatever and have
the posix implementation in qga/commands-posix.c and a stub'd version
in qga/commands-win32.c. There's an article here that might be useful
for figuring out how we would implement get_args_max() for win32:

  https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553
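
Roughly, the split could look like the sketch below. It is collapsed into one function with an #ifdef only to keep the example self-contained; in the tree the two branches would live in qga/commands-posix.c and qga/commands-win32.c. The 4096 fallback (the _POSIX_ARG_MAX minimum) and the use of the 32767-character CreateProcess command-line limit for the stub are assumptions for illustration.

```c
#include <stddef.h>
#ifndef _WIN32
#include <unistd.h>
#endif

/* Hypothetical helper, named after the suggestion above. */
static size_t get_args_max(void)
{
#ifdef _WIN32
    /* Stub: CreateProcess limits the command line to 32767 characters. */
    return 32767;
#else
    long v = sysconf(_SC_ARG_MAX);

    /* sysconf() returns -1 if the limit is indeterminate; assume the
     * POSIX minimum (_POSIX_ARG_MAX, 4096) in that case. */
    return v > 0 ? (size_t)v : 4096;
#endif
}
```

The caller in qga/commands.c would then compare the combined length of the arguments and environment against get_args_max() without touching sysconf() directly.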

> 
> Thank you.
> --
> Prasad J Pandit / Red Hat Product Security Team
> 47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

Re: [Qemu-devel] [PATCH v6 00/10] hw/m68k: add Apple Machintosh Quadra 800 machine

2019-01-24 Thread Mark Cave-Ayland

On 24/01/2019 17:15, Laurent Vivier wrote:

> On 24/01/2019 18:02, Thomas Huth wrote:
>> On 2018-11-02 16:22, Mark Cave-Ayland wrote:
>>> (MCA: here's the latest version of the q800 patchset. I've hope that I've
>>> addressed most of the comments, plus this will now boot into the Debian
>>> installer correctly when applied to git master.
>>
>> Any update on this series? Why did it get stalled again?
>>
> 
> I was thinking about this today.
> 
> Mark, perhaps you can send a rebased version of the series?
> 
> I think we need reviews for "esp: add pseudo-DMA as used by Macintosh".

I've just done a quick rebase and push of my latest branch to
https://github.com/mcayland/qemu/tree/q800-dev-part1-mca and from what I can tell it
still boots using the test command line in both graphic and -nographic mode.

Unfortunately I have a lot less time this dev cycle than normal, so I'm going to hand
this back over to Laurent for now while I focus back on SPARC/PPC. From my notes the
outstanding things to look at are:

1) Do we mind some of the more verbose comments that were taken from the Linux
headers in some of the files? (I can also see that updates to the comment checking in
checkpatch.pl now cause the series to fail with style issues, so these will need to
be touched up regardless)

2) Do we need to add migration support for the ESP pseudo-DMA?

3) Is there a Linux test docker image available?

Other than these points I think the series is about good to go.

ATB,

Mark.

[Qemu-devel] [PATCH v3 1/1] riscv: Ensure the kernel start address is correctly cast

2019-01-24 Thread Alistair Francis

Cast the kernel start address to the target bit length.

This ensures that we calculate the initrd offset to a valid address for
the architecture.

Steps to reproduce the original problem (reported by Alex):
  Build U-Boot for the virt machine for riscv32. Then run it with

$ qemu-system-riscv32 -M virt -kernel u-boot -nographic -initrd 

  You can find the initrd address with

U-Boot# fdt addr $fdtcontroladdr
U-Boot# fdt ls /chosen

  Then take a peek at that address:

U-Boot# md.b 

  and you will see that there is nothing there without this patch. The
  reason is that the binary was loaded to a negative address.

Signed-off-by: Alistair Francis 
Suggested-by: Alexander Graf 
Reported-by: Alexander Graf 
---
v3:
 - Add steps to reproduce
v2:
 - Remove old comment
 hw/riscv/sifive_e.c | 2 +-
 hw/riscv/sifive_u.c | 2 +-
 hw/riscv/spike.c| 2 +-
 hw/riscv/virt.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index 5d9d65ff29..e5d7fc548e 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -74,7 +74,7 @@ static const struct MemmapEntry {
 [SIFIVE_E_DTIM] = { 0x8000, 0x4000 }
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 3bd3b67507..3b3acec377 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -65,7 +65,7 @@ static const struct MemmapEntry {
 
 #define GEM_REVISION0x10070109
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 268df04c3c..79cb4c1282 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -53,7 +53,7 @@ static const struct MemmapEntry {
 [SPIKE_DRAM] = { 0x8000,0x0 },
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index e7f0716fb6..648462b18c 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -62,7 +62,7 @@ static const struct MemmapEntry {
 [VIRT_PCIE_ECAM] =   { 0x3000,0x1000 },
 };
 
-static uint64_t load_kernel(const char *kernel_filename)
+static target_ulong load_kernel(const char *kernel_filename)
 {
 uint64_t kernel_entry, kernel_high;
 
-- 
2.19.1
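
As an illustration of the cast described in the commit message, here is a minimal sketch, assuming the ELF loader hands back a 32-bit entry point sign-extended to 64 bits. The type target_ulong32 and the function narrow_entry() are hypothetical stand-ins for target_ulong on riscv32 and for the changed return type, not QEMU code.

```c
#include <stdint.h>

/* Stand-in for target_ulong on a 32-bit target (assumption). */
typedef uint32_t target_ulong32;

/*
 * Returning the narrower type drops the sign-extended upper half, so
 * later offset arithmetic (e.g. placing the initrd) starts from an
 * address that is valid for the 32-bit architecture.
 */
static target_ulong32 narrow_entry(uint64_t kernel_entry)
{
    return (target_ulong32)kernel_entry;
}
```

For example, a sign-extended entry of 0xffffffff80000000 narrows to 0x80000000.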

Re: [Qemu-devel] [PATCH v2 1/1] riscv: Ensure the kernel start address is correctly cast

2019-01-24 Thread Alistair Francis

On Wed, Jan 23, 2019 at 6:00 PM Palmer Dabbelt  wrote:
>
> On Tue, 15 Jan 2019 13:09:28 PST (-0800), alistai...@gmail.com wrote:
> > On Mon, Jan 14, 2019 at 2:58 AM Philippe Mathieu-Daudé
> >  wrote:
> >>
> >> Hi Alistair,
> >>
> >> On 1/12/19 2:17 AM, Alistair Francis wrote:
> >> > Cast the kernel start address to the target bit length.
> >> >
> >> > This ensures that we calculate the initrd offset to a valid address for
> >> > the architecture.
> >>
> >> Can you add an example of the failure symptoms?
> >
> > I can.
>
> Should I be waiting for a v3?

You should be, I thought I sent it but I must have forgotten. I'll send it now.

Alistair

>
> >
> >>
> >> >
> >> > Signed-off-by: Alistair Francis 
> >> > Suggested-by: Alexander Graf 
> >> > Reported-by: Alexander Graf 
> >> > ---
> >> > v2:
> >> >  - Remove old comment
> >> >  hw/riscv/sifive_e.c | 2 +-
> >> >  hw/riscv/sifive_u.c | 2 +-
> >> >  hw/riscv/spike.c| 2 +-
> >> >  hw/riscv/virt.c | 2 +-
> >> >  4 files changed, 4 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> >> > index 5d9d65ff29..e5d7fc548e 100644
> >> > --- a/hw/riscv/sifive_e.c
> >> > +++ b/hw/riscv/sifive_e.c
> >> > @@ -74,7 +74,7 @@ static const struct MemmapEntry {
> >> >  [SIFIVE_E_DTIM] = { 0x8000, 0x4000 }
> >> >  };
> >> >
> >> > -static uint64_t load_kernel(const char *kernel_filename)
> >> > +static target_ulong load_kernel(const char *kernel_filename)
> >> >  {
> >> >  uint64_t kernel_entry, kernel_high;
> >>
> >> Shouldn't you update load_elf() and co now to take target_ulong
> >> arguments? This would fix this error generically for all archs.
> >
> > That is an option, but as load_elf() is called by every other machine
> > I don't want to break them. It's entirely possible that other machines
> > rely on this behaviour and changing it will break them.
> >
> > Alistair
> >
> >>
> >> >
> >> > diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> >> > index 3bd3b67507..3b3acec377 100644
> >> > --- a/hw/riscv/sifive_u.c
> >> > +++ b/hw/riscv/sifive_u.c
> >> > @@ -65,7 +65,7 @@ static const struct MemmapEntry {
> >> >
> >> >  #define GEM_REVISION0x10070109
> >> >
> >> > -static uint64_t load_kernel(const char *kernel_filename)
> >> > +static target_ulong load_kernel(const char *kernel_filename)
> >> >  {
> >> >  uint64_t kernel_entry, kernel_high;
> >> >
> >> > diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> >> > index 268df04c3c..79cb4c1282 100644
> >> > --- a/hw/riscv/spike.c
> >> > +++ b/hw/riscv/spike.c
> >> > @@ -53,7 +53,7 @@ static const struct MemmapEntry {
> >> >  [SPIKE_DRAM] = { 0x8000,0x0 },
> >> >  };
> >> >
> >> > -static uint64_t load_kernel(const char *kernel_filename)
> >> > +static target_ulong load_kernel(const char *kernel_filename)
> >> >  {
> >> >  uint64_t kernel_entry, kernel_high;
> >> >
> >> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> >> > index e7f0716fb6..648462b18c 100644
> >> > --- a/hw/riscv/virt.c
> >> > +++ b/hw/riscv/virt.c
> >> > @@ -62,7 +62,7 @@ static const struct MemmapEntry {
> >> >  [VIRT_PCIE_ECAM] =   { 0x3000,0x1000 },
> >> >  };
> >> >
> >> > -static uint64_t load_kernel(const char *kernel_filename)
> >> > +static target_ulong load_kernel(const char *kernel_filename)
> >> >  {
> >> >  uint64_t kernel_entry, kernel_high;
> >> >
> >> >

Re: [Qemu-devel] [PATCH v2 2/2] gen_pcie_root_port: Add ACS (Access Control Services) capability

2019-01-24 Thread Alex Williamson

On Thu, 24 Jan 2019 11:12:53 +0100
Knut Omang  wrote:

> Claim ACS support in the generic PCIe root port to allow
> passthrough of individual functions of a device to different
> guests (in a nested virt.setting) with VFIO.
> Without this patch, all functions of a device, such as all VFs of
> an SR/IOV device, will end up in the same IOMMU group.
> A similar situation occurs on Windows with Hyper-V.
> 
> In the single function device case, it also has a small cosmetic
> benefit in that the root port itself is not grouped with
> the device. VFIO handles that situation in that binding rules
> only apply to endpoints, so it does not limit passthrough in
> those cases.
> 
> Signed-off-by: Knut Omang 
> ---
>  hw/pci-bridge/gen_pcie_root_port.c | 2 ++
>  hw/pci-bridge/pcie_root_port.c | 4 
>  include/hw/pci/pcie_port.h | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
> index 9766edb..b5a5ecc 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -20,6 +20,7 @@
>  OBJECT_CHECK(GenPCIERootPort, (obj), TYPE_GEN_PCIE_ROOT_PORT)
>  
>  #define GEN_PCIE_ROOT_PORT_AER_OFFSET   0x100
> +#define GEN_PCIE_ROOT_PORT_ACS_OFFSET   0x148

So you prefer that everyone passing through here decode these to figure
out that ACS_OFFSET is (AER_OFFSET + ERR_SIZEOF) since my comment on v1
was ignored?
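
That is, something along these lines, as a sketch only. PCI_ERR_SIZEOF is assumed here to be 0x48, which is the value that makes the sum match the literal 0x148 in the patch.

```c
/* Assumed size of the AER extended capability. */
#define PCI_ERR_SIZEOF                  0x48

#define GEN_PCIE_ROOT_PORT_AER_OFFSET   0x100
/* ACS is placed directly after AER, so derive it instead of hard-coding. */
#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
```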

>  #define GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR   1
>  
>  typedef struct GenPCIERootPort {
> @@ -149,6 +150,7 @@ static void gen_rp_dev_class_init(ObjectClass *klass, void *data)
>  rpc->interrupts_init = gen_rp_interrupts_init;
>  rpc->interrupts_uninit = gen_rp_interrupts_uninit;
>  rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
> +rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
>  }
>  
>  static const TypeInfo gen_rp_dev_info = {
> diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
> index 34ad767..a0b4cf7 100644
> --- a/hw/pci-bridge/pcie_root_port.c
> +++ b/hw/pci-bridge/pcie_root_port.c
> @@ -47,6 +47,7 @@ static void rp_reset(DeviceState *qdev)
>  pcie_cap_deverr_reset(d);
>  pcie_cap_slot_reset(d);
>  pcie_cap_arifwd_reset(d);
> +pcie_cap_acs_reset(d);

Only the generic root port initializes acs_offset to enable an ACS
capability, but all members of the device class call the reset function
which does no checking that an ACS capability exists.  We've just
corrupted config space for the device.
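
A minimal sketch of the missing check, using a simplified stand-in for PCIDevice (struct sketch_dev and sketch_acs_reset() are hypothetical, not the QEMU types). With acs_cap left at 0, the unguarded write would land at config offset 0x06, which is the Status register.

```c
#include <stdint.h>

#define PCI_ACS_CTRL 0x06   /* ACS Control register, relative to the capability */

/* Simplified stand-in for PCIDevice; not the QEMU type. */
struct sketch_dev {
    uint8_t config[4096];
    uint16_t acs_cap;       /* 0 means no ACS capability was added */
};

static void sketch_acs_reset(struct sketch_dev *dev)
{
    if (!dev->acs_cap) {
        return;             /* nothing to reset; leave config space alone */
    }
    /* Clear the 16-bit ACS Control register. */
    dev->config[dev->acs_cap + PCI_ACS_CTRL] = 0;
    dev->config[dev->acs_cap + PCI_ACS_CTRL + 1] = 0;
}
```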

>  pcie_aer_root_reset(d);
>  pci_bridge_reset(qdev);
>  pci_bridge_disable_base_limit(d);
> @@ -106,6 +107,9 @@ static void rp_realize(PCIDevice *d, Error **errp)
>  pcie_aer_root_init(d);
>  rp_aer_vector_update(d);
>  
> +if (rpc->acs_offset) {
> +pcie_acs_init(d, rpc->acs_offset);
> +}
>  return;
>  
>  err:
> diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
> index df242a0..09586f4 100644
> --- a/include/hw/pci/pcie_port.h
> +++ b/include/hw/pci/pcie_port.h
> @@ -78,6 +78,7 @@ typedef struct PCIERootPortClass {
>  int exp_offset;
>  int aer_offset;
>  int ssvid_offset;
> +int acs_offset;/* If nonzero, optional ACS capability offset */
>  int ssid;
>  } PCIERootPortClass;
>

[Qemu-devel] [PATCH RFC 0/2] virtio-blk: add DISCARD and WRITE ZEROES features

2019-01-24 Thread Stefano Garzarella

This series adds the support of DISCARD and WRITE ZEROES commands
and extends the virtio-blk-test to test WRITE_ZEROES command when
the feature is enabled.

RFC because I'm not sure if the "case" conditions that I used in
virtio-blk.c is clean enough.

This series requires the new virtio headers from linux v5.0-rc1
already imported by Paolo:

Based-on: <20190104082731.24967-1-pbonz...@redhat.com>

Thanks,
Stefano

Stefano Garzarella (2):
  virtio-blk: add DISCARD and WRITE ZEROES features
  tests/virtio-blk: add test for WRITE_ZEROES command

 hw/block/virtio-blk.c   | 79 +
 tests/virtio-blk-test.c | 63 
 2 files changed, 142 insertions(+)

-- 
2.20.1

[Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-24 Thread Stefano Garzarella

If the WRITE_ZEROES feature is enabled, we check this
command in the test_basic().

Signed-off-by: Stefano Garzarella 
---
 tests/virtio-blk-test.c | 63 +
 1 file changed, 63 insertions(+)

diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 04c608764b..8cabbcb85a 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, QGuestAllocator *alloc,
 
 guest_free(alloc, req_addr);
 
+if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
+struct virtio_blk_discard_write_zeroes *dwz_hdr;
+void *expected;
+
+/*
+ * WRITE_ZEROES request on the same sector of previous test where
+ * we wrote "TEST".
+ */
+req.type = VIRTIO_BLK_T_WRITE_ZEROES;
+req.data = g_malloc0(512);
+dwz_hdr = (struct virtio_blk_discard_write_zeroes *)req.data;
+dwz_hdr->sector = 0;
+dwz_hdr->num_sectors = 1;
+dwz_hdr->flags = 0;
+
+req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+g_free(req.data);
+
+free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
+qvirtqueue_add(vq, req_addr + 16, 512, false, true);
+qvirtqueue_add(vq, req_addr + 528, 1, true, false);
+
+qvirtqueue_kick(dev, vq, free_head);
+
+qvirtio_wait_used_elem(dev, vq, free_head, NULL,
+   QVIRTIO_BLK_TIMEOUT_US);
+status = readb(req_addr + 528);
+g_assert_cmpint(status, ==, 0);
+
+guest_free(alloc, req_addr);
+
+/* Read request to check if the sector contains all zeroes */
+req.type = VIRTIO_BLK_T_IN;
+req.ioprio = 1;
+req.sector = 0;
+req.data = g_malloc0(512);
+
+req_addr = virtio_blk_request(alloc, dev, &req, 512);
+
+g_free(req.data);
+
+free_head = qvirtqueue_add(vq, req_addr, 16, false, true);
+qvirtqueue_add(vq, req_addr + 16, 512, true, true);
+qvirtqueue_add(vq, req_addr + 528, 1, true, false);
+
+qvirtqueue_kick(dev, vq, free_head);
+
+qvirtio_wait_used_elem(dev, vq, free_head, NULL,
+   QVIRTIO_BLK_TIMEOUT_US);
+status = readb(req_addr + 528);
+g_assert_cmpint(status, ==, 0);
+
+data = g_malloc(512);
+expected = g_malloc0(512);
+memread(req_addr + 16, data, 512);
+g_assert_cmpmem(data, 512, expected, 512);
+g_free(expected);
+g_free(data);
+
+guest_free(alloc, req_addr);
+}
+
 if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
 /* Write and read with 2 descriptor layout */
 /* Write request */
-- 
2.20.1
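
The offsets 16 and 528 used for the three descriptors in the test can be sketched as follows, assuming the request buffer is laid out as header, 512 bytes of data, then one status byte. struct sketch_outhdr mirrors the 16-byte request header (type, ioprio, sector); the names are hypothetical and not part of the test code.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-byte request header: type, ioprio, sector. */
struct sketch_outhdr {
    uint32_t type;
    uint32_t ioprio;
    uint64_t sector;
};

enum { SKETCH_DATA_SIZE = 512 };

/* Offset of the data descriptor relative to req_addr. */
static size_t sketch_data_offset(void)
{
    return sizeof(struct sketch_outhdr);
}

/* Offset of the one-byte status descriptor relative to req_addr. */
static size_t sketch_status_offset(void)
{
    return sizeof(struct sketch_outhdr) + SKETCH_DATA_SIZE;
}
```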

Re: [Qemu-devel] [PATCH v2 1/4] qga-win: prevent crash when executing guest-file-read with large count

2019-01-24 Thread Michael Roth

Quoting Basil Salman (2019-01-13 04:05:28)
> BZ: #1594054
> guest-file-read command is currently implelmented to read from a

*implemented

> file handle count number of bytes. when executed with a very large count number
> qemu-ga crashes.
> after some digging turns out that qemu-ga crashes after trying to allocate
> a buffer large enough to save the data read in it, the buffer was allocated using
> g_malloc0 which is not fail safe, and results a crash in case of failure.
> g_malloc0 was replaced with g_try_malloc0() which returns NULL on failure,
> A check was added for that case in order to prevent qemu-ga from crashing
> and to send a response to the qemu-ga client accordingly.
> 
> Signed-off-by: Basil Salman 
> ---
>  qga/commands-win32.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 62e1b51dfe..4260faa573 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -345,7 +345,13 @@ GuestFileRead *qmp_guest_file_read(int64_t handle, bool has_count,
>  }
> 
>  fh = gfh->fh;
> -buf = g_malloc0(count+1);
> +buf = g_try_malloc0(count + 1);
> +if (!buf) {
> +error_setg(errp,
> +   "failed to allocate sufficient memory "
> +   "to complete the requested service");
> +return read_data;

return NULL might be a little clearer since that's what we do in the
preceding checks

> +}
>  is_ok = ReadFile(fh, buf, count, &read_count, NULL);
>  if (!is_ok) {
>  error_setg_win32(errp, GetLastError(), "failed to read file");
> -- 
> 2.17.2
>
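
For illustration, the pattern the patch is after can be sketched in plain C, with calloc() standing in for g_try_malloc0() (both return NULL on failure instead of aborting). sketch_read_buf_alloc() and its errmsg out-parameter are hypothetical; the overflow check on count + 1 is an addition of this sketch, not something the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed buffer of count + 1 bytes, reporting failure to the
 * caller rather than aborting. On failure, *errmsg is set and NULL is
 * returned; on success, *errmsg is NULL.
 */
static char *sketch_read_buf_alloc(size_t count, const char **errmsg)
{
    char *buf;

    if (count == SIZE_MAX) {        /* count + 1 would wrap to 0 */
        *errmsg = "requested count is too large";
        return NULL;
    }
    buf = calloc(1, count + 1);
    if (!buf) {
        *errmsg = "failed to allocate sufficient memory "
                  "to complete the requested service";
        return NULL;
    }
    *errmsg = NULL;
    return buf;
}
```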

[Qemu-devel] [PATCH RFC 1/2] virtio-blk: add DISCARD and WRITE ZEROES features

2019-01-24 Thread Stefano Garzarella

This patch adds the support of DISCARD and WRITE ZEROES commands,
that have been introduced in the virtio-blk protocol to have
better performance when using SSD backend.

Signed-off-by: Stefano Garzarella 
---
 hw/block/virtio-blk.c | 79 +++
 1 file changed, 79 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f208c6ddb9..8850957751 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -145,6 +145,25 @@ out:
 aio_context_release(blk_get_aio_context(s->conf.conf.blk));
 }
 
+static void virtio_blk_discard_wzeroes_complete(void *opaque, int ret)
+{
+VirtIOBlockReq *req = opaque;
+VirtIOBlock *s = req->dev;
+
+aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
+if (ret) {
+if (virtio_blk_handle_rw_error(req, -ret, 0)) {
+goto out;
+}
+}
+
+virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
+virtio_blk_free_request(req);
+
+out:
+aio_context_release(blk_get_aio_context(s->conf.conf.blk));
+}
+
 #ifdef __linux__
 
 typedef struct {
@@ -584,6 +603,56 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 virtio_blk_free_request(req);
 break;
 }
+/*
+ * VIRTIO_BLK_T_DISCARD and VIRTIO_BLK_T_WRITE_ZEROES are defined with
+ * VIRTIO_BLK_T_OUT flag set. We masked this flag in the switch statement,
+ * so we must mask it for these requests, then we will check the type.
+ */
+case VIRTIO_BLK_T_DISCARD & ~VIRTIO_BLK_T_OUT:
+case VIRTIO_BLK_T_WRITE_ZEROES & ~VIRTIO_BLK_T_OUT:
+{
+struct virtio_blk_discard_write_zeroes dwz_hdr;
+uint64_t sector;
+int bytes;
+
+if (unlikely(iov_to_buf(out_iov, out_num, 0, &dwz_hdr,
+sizeof(dwz_hdr)) != sizeof(dwz_hdr))) {
+virtio_error(vdev, "virtio-blk discard/wzeroes header too short");
+return -1;
+}
+
+sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &dwz_hdr.sector);
+bytes = virtio_ldl_p(VIRTIO_DEVICE(req->dev),
+ &dwz_hdr.num_sectors) << BDRV_SECTOR_BITS;
+
+if (!virtio_blk_sect_range_ok(req->dev, sector, bytes)) {
+virtio_blk_req_complete(req, VIRTIO_BLK_S_IOERR);
+virtio_blk_free_request(req);
+return 0;
+}
+
+if ((type & ~(VIRTIO_BLK_T_BARRIER)) == VIRTIO_BLK_T_DISCARD) {
+blk_aio_pdiscard(req->dev->blk, sector << BDRV_SECTOR_BITS, bytes,
+ virtio_blk_discard_wzeroes_complete, req);
+} else if ((type & ~(VIRTIO_BLK_T_BARRIER)) ==
+   VIRTIO_BLK_T_WRITE_ZEROES) {
+int flags = 0;
+
+if (virtio_ldl_p(VIRTIO_DEVICE(req->dev), &dwz_hdr.flags) &
+VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP) {
+flags |= BDRV_REQ_MAY_UNMAP;
+}
+
+blk_aio_pwrite_zeroes(req->dev->blk, sector << BDRV_SECTOR_BITS,
+  bytes, flags,
+  virtio_blk_discard_wzeroes_complete, req);
+} else { /* Unsupported if VIRTIO_BLK_T_OUT is not set */
+virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
+virtio_blk_free_request(req);
+}
+
+break;
+}
 default:
 virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
 virtio_blk_free_request(req);
@@ -763,6 +832,14 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
 blkcfg.alignment_offset = 0;
 blkcfg.wce = blk_enable_write_cache(s->blk);
 virtio_stw_p(vdev, &blkcfg.num_queues, s->conf.num_queues);
+virtio_stl_p(vdev, &blkcfg.max_discard_sectors, BDRV_REQUEST_MAX_SECTORS);
+virtio_stl_p(vdev, &blkcfg.max_discard_seg, 1);
+virtio_stl_p(vdev, &blkcfg.discard_sector_alignment,
+ blk_size >> BDRV_SECTOR_BITS);
+virtio_stl_p(vdev, &blkcfg.max_write_zeroes_sectors,
+ BDRV_REQUEST_MAX_SECTORS);
+virtio_stl_p(vdev, &blkcfg.max_write_zeroes_seg, 1);
+blkcfg.write_zeroes_may_unmap = 1;
 memcpy(config, &blkcfg, sizeof(struct virtio_blk_config));
 }
 
@@ -787,6 +864,8 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
 virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
 virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
 virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
+virtio_add_feature(&features, VIRTIO_BLK_F_DISCARD);
+virtio_add_feature(&features, VIRTIO_BLK_F_WRITE_ZEROES);
 if (virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
 if (s->conf.scsi) {
 error_setg(errp, "Please set scsi=off for virtio-blk devices in order to use virtio 1.0");
-- 
2.20.1

Re: [Qemu-devel] [PATCH v2 2/4] qga: fix send_response error handling

2019-01-24 Thread Michael Roth

Quoting Basil Salman (2019-01-13 04:05:29)
> Sometimes qemu-ga fails to send a response to client due to memory allocation
> issues due to a large response message, this can be experienced while trying
> to read large number of bytes using QMP command guest-file-read.

send_response has 2 areas that can fail:

1) When formatting the QDict *rsp from qmp_dispatch() into JSON via
qobject_to_json():

payload_qstr = qobject_to_json(QOBJECT(payload));
if (!payload_qstr) {
return -EINVAL;
}

But we can only reach that via qobject_to_json() calling qstring_new()
and that returning NULL. The qstring's initial size is independent of
the actual payload size. So I don't see how a large read would induce
this.

There is other code in qobject_to_json() -> to_json() that could maybe
hit an allocation failure once it starts converting the payload to JSON,
but AFAICT that would cause a crash of qemu/qemu-ga once g_realloc()
finally fails to grow the qstring in qstring_append(), not an error
return.

So I don't think it's a bad idea to generate an error response like
you're doing in your patch for future cases, but I don't see how it
is reachable in the current code (even without the fix in patch 1).

Do you have a particular reproducer for this specific failure? Are
you sure it wasn't just the entire guest agent process crashing?

2) The other error you can get from send_response() is if there's
a problem writing things out to the actual communication channel,
in which case sending another error response likely won't help.

> 
> Added a check to send an error response to qemu-ga client in such cases.
> 
> Signed-off-by: Basil Salman 
> ---
>  qga/main.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/qga/main.c b/qga/main.c
> index 87a0711c14..964275c40c 100644
> --- a/qga/main.c
> +++ b/qga/main.c
> @@ -561,6 +561,8 @@ static void process_command(GAState *s, QDict *req)
>  {
>  QDict *rsp;
>  int ret;
> +QDict *ersp;
> +Error *err = NULL;
> 
>  g_assert(req);
>  g_debug("processing command");
> @@ -569,9 +571,20 @@ static void process_command(GAState *s, QDict *req)
>  ret = send_response(s, rsp);
>  if (ret < 0) {
>  g_warning("error sending response: %s", strerror(-ret));
> +goto err;
>  }
>  qobject_unref(rsp);
>  }
> +return;
> +err:
> +error_setg(&err, "Insufficient system resources exist to "
> +  "complete the requested service");
> +ersp = qmp_error_response(err);
> +ret = send_response(s, ersp);
> +if (ret < 0) {
> +g_warning("error sending error response: %s", strerror(-ret));
> +}
> +qobject_unref(ersp);
>  }
> 
>  /* handle requests/control events coming in over the channel */
> -- 
> 2.17.2
>

Re: [Qemu-devel] [PATCH v2 1/2] pcie: Add a simple PCIe ACS (Access Control Services) helper function

2019-01-24 Thread Alex Williamson

On Thu, 24 Jan 2019 11:12:52 +0100
Knut Omang  wrote:

> Add a helper function to add PCIe capability for Access Control Services (ACS)
> ACS support in the associated root port is a prerequisite to be able to do
> passthrough of individual functions of a device with VFIO
> without Alex Williamson's pcie_acs_override kernel patch or similar
> in the guest.
> 
> Signed-off-by: Knut Omang 
> ---
>  hw/pci/pcie.c  | 21 +
>  include/hw/pci/pcie.h  |  6 ++
>  include/hw/pci/pcie_regs.h |  4 
>  3 files changed, 31 insertions(+)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index 230478f..5ab3d1d 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -742,6 +742,13 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
>  PCI_EXP_DEVCTL2_ARI;
>  }
>  
> +/* Access Control Services (ACS)
> + */

Comment style

> +void pcie_cap_acs_reset(PCIDevice *dev)
> +{
> +pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
> +}
> +
>  /**
>   * pci express extended capability list management functions
>   * uint16_t ext_cap_id (16 bit)
> @@ -906,3 +913,17 @@ void pcie_ats_init(PCIDevice *dev, uint16_t offset)
>  
>  pci_set_word(dev->wmask + dev->exp.ats_cap + PCI_ATS_CTRL, 0x800f);
>  }
> +
> +/* ACS (Access Control Services) */
> +void pcie_acs_init(PCIDevice *dev, uint16_t offset)
> +{
> +pcie_add_capability(dev, PCI_EXT_CAP_ID_ACS, PCI_ACS_VER,
> +offset, PCI_ACS_SIZEOF);
> +dev->exp.acs_cap = offset;
> +pci_set_word(dev->config + offset + PCI_ACS_CAP,
> + PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF);

This is still only valid for downstream ports yet neither restricted
nor commented to indicate that.  You could use an object_dynamic_cast
to trigger an assert should someone use it for an invalid type of
device, ex:

assert(object_dynamic_cast(OBJECT(dev), TYPE_PCIE_SLOT));

> +
> +pci_set_word(dev->config + offset + PCI_ACS_CTRL, 0);

Suspect this is unnecessary given the reset callback.

> +pci_set_word(dev->wmask + offset + PCI_ACS_CTRL,
> + PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF);
> +}
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index 5b82a0d..4c40711 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -79,6 +79,9 @@ struct PCIExpressDevice {
>  
>  /* Offset of ATS capability in config space */
>  uint16_t ats_cap;
> +
> +/* ACS */
> +uint16_t acs_cap;
>  };
>  
>  #define COMPAT_PROP_PCP "power_controller_present"
> @@ -116,6 +119,8 @@ void pcie_cap_flr_init(PCIDevice *dev);
>  void pcie_cap_flr_write_config(PCIDevice *dev,
> uint32_t addr, uint32_t val, int len);
>  
> +void pcie_cap_acs_reset(PCIDevice *dev);
> +
>  /* ARI forwarding capability and control */
>  void pcie_cap_arifwd_init(PCIDevice *dev);
>  void pcie_cap_arifwd_reset(PCIDevice *dev);
> @@ -129,6 +134,7 @@ void pcie_add_capability(PCIDevice *dev,
>  void pcie_sync_bridge_lnk(PCIDevice *dev);
>  
>  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> +void pcie_acs_init(PCIDevice *dev, uint16_t offset);
>  void pcie_dev_ser_num_init(PCIDevice *dev, uint16_t offset, uint64_t 
> ser_num);
>  void pcie_ats_init(PCIDevice *dev, uint16_t offset);
>  
> diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
> index ad4e780..3fc9aca 100644
> --- a/include/hw/pci/pcie_regs.h
> +++ b/include/hw/pci/pcie_regs.h
> @@ -175,4 +175,8 @@ typedef enum PCIExpLinkWidth {
>   PCI_ERR_COR_INTERNAL | \
>   PCI_ERR_COR_HL_OVERFLOW)
>  
> +/* ACS */
> +#define PCI_ACS_VER 0x2

There's no such version; even the PCIe 5.0 drafts only define version 1.

> +#define PCI_ACS_SIZEOF  8
> +
>  #endif /* QEMU_PCIE_REGS_H */
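
To illustrate where the version number ends up, here is a minimal sketch of how a PCIe extended capability header is packed: capability ID in bits [15:0], version in bits [19:16], next-capability offset in bits [31:20]. The helper name ext_cap_header and the two macros are hypothetical and exist only for this example; they are not taken from the patch or from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants (names are assumptions for this sketch). */
#define EXT_CAP_ID_ACS   0x000d  /* ACS extended capability ID */
#define ACS_CAP_VERSION  0x1     /* the only version the review says is defined */

/*
 * Hypothetical helper: pack an extended capability header.
 *   bits [15:0]  capability ID
 *   bits [19:16] capability version
 *   bits [31:20] offset of the next extended capability
 */
static inline uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
    /* Next-capability offsets are DWORD aligned and 12 bits wide. */
    assert((next & 0x3) == 0 && next <= 0xfff);
    assert(ver <= 0xf);

    return (uint32_t)id | ((uint32_t)ver << 16) | ((uint32_t)next << 20);
}
```

With version 1 and no following capability, the header reads 0x0001000d; a guest driver that checks the version field would see 2 if PCI_ACS_VER stayed at 0x2.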

Re: [Qemu-devel] [PATCH v4 0/3] tests: Reorganize MIPS TCG directories and files

2019-01-24 Thread Philippe Mathieu-Daudé

On 1/24/19 4:19 PM, Aleksandar Markovic wrote:
> From: Aleksandar Markovic 
> 
> Reorganize MIPS TCG directories and files. 
[...]
>  496 files changed, 193 insertions(+), 13685 deletions(-)

O_o

Re: [Qemu-devel] [PATCH] hw: sd: set category of the sd memory card

2019-01-24 Thread Philippe Mathieu-Daudé

On 1/24/19 5:20 PM, kumar sourav wrote:
> Sets the category of the sd memory card as DEVICE_CATEGORY_STORAGE.
> Devices should be assigned to one of DEVICE_CATEGORY_.
> 
> Signed-off-by: kumar sourav 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/sd/sd.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/sd/sd.c b/hw/sd/sd.c
> index d4356e9b73..aaab15f386 100644
> --- a/hw/sd/sd.c
> +++ b/hw/sd/sd.c
> @@ -2121,6 +2121,7 @@ static void sd_class_init(ObjectClass *klass, void 
> *data)
>  dc->vmsd = &sd_vmstate;
>  dc->reset = sd_reset;
>  dc->bus_type = TYPE_SD_BUS;
> +set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
>  
>  sc->set_voltage = sd_set_voltage;
>  sc->get_dat_lines = sd_get_dat_lines;
>

[Qemu-devel] [PATCH] s390x: remove direct reference to mem_path global from s390x code

2019-01-24 Thread Igor Mammedov

I plan to deprecate the -mem-path option and replace it with memory-backend;
for that it's necessary to get rid of the mem_path global variable.
Do it for the s390x case, replacing it with an alternative way to enable
the 1MB hugepages capability.

Signed-off-by: Igor Mammedov 
---
PS:
Neither the original code nor the new one is probably entirely correct
when huge pages are enabled and initial RAM is mixed with memory
backends: a backend's page size might not match the initial RAM's,
so I'm not sure if enabling the 1MB cap is correct in this case on s390
(should it be the same for all RAM???).
With the new approach the 1MB cap is not enabled if the smallest page size
is not 1MB.
---
 target/s390x/kvm.c | 37 -
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 2ebf26a..22e868a 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -285,33 +285,28 @@ void kvm_s390_crypto_reset(void)
 }
 }
 
-static int kvm_s390_configure_mempath_backing(KVMState *s)
+static int kvm_s390_configure_hugepage_backing(KVMState *s)
 {
-size_t path_psize = qemu_mempath_getpagesize(mem_path);
+size_t psize = qemu_getrampagesize();
 
-if (path_psize == 4 * KiB) {
-return 0;
-}
-
-if (!hpage_1m_allowed()) {
-error_report("This QEMU machine does not support huge page "
- "mappings");
-return -EINVAL;
-}
+if (psize == 1 * MiB) {
+if (!hpage_1m_allowed()) {
+error_report("This QEMU machine does not support huge page "
+ "mappings");
+return -EINVAL;
+}
 
-if (path_psize != 1 * MiB) {
+if (kvm_vm_enable_cap(s, KVM_CAP_S390_HPAGE_1M, 0)) {
+error_report("Memory backing with 1M pages was specified, "
+ "but KVM does not support this memory backing");
+return -EINVAL;
+}
+cap_hpage_1m = 1;
+} else if (psize == 2 * GiB) {
 error_report("Memory backing with 2G pages was specified, "
  "but KVM does not support this memory backing");
 return -EINVAL;
 }
-
-if (kvm_vm_enable_cap(s, KVM_CAP_S390_HPAGE_1M, 0)) {
-error_report("Memory backing with 1M pages was specified, "
- "but KVM does not support this memory backing");
-return -EINVAL;
-}
-
-cap_hpage_1m = 1;
 return 0;
 }
 
@@ -319,7 +314,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 {
 MachineClass *mc = MACHINE_GET_CLASS(ms);
 
-if (mem_path && kvm_s390_configure_mempath_backing(s)) {
+if (kvm_s390_configure_hugepage_backing(s)) {
 return -EINVAL;
 }
 
-- 
2.7.4

Re: [Qemu-devel] [PATCH v6 00/10] hw/m68k: add Apple Macintosh Quadra 800 machine

2019-01-24 Thread Laurent Vivier

On 24/01/2019 18:02, Thomas Huth wrote:
> On 2018-11-02 16:22, Mark Cave-Ayland wrote:
>> (MCA: here's the latest version of the q800 patchset. I hope that I've
>> addressed most of the comments, plus this will now boot into the Debian
>> installer correctly when applied to git master.
> 
> Any update on this series? Why did it get stalled again?
> 

I was thinking about this today.

Mark, perhaps you can send a rebased version of the series?

I think we need reviews for "esp: add pseudo-DMA as used by Macintosh".

Thanks,
Laurent

[Qemu-devel] AArch64: some missed undefined instructions

2019-01-24 Thread Laurent Desnogues

Hello,

I did exhaustive comparisons against latest binutils and found the
following undefined instructions that QEMU fails to flag:

- in disas_b_exc_sys, before calling disas_system bits [23:22] should
be checked to be 0

- in disas_ldst_reg_imm9, PRFM is wrongly detected:  PRFM is for idx =
0, not for is_unpriv, the rest being undefined

- in disas_ldst_multiple_struct, if the instruction is not
post-indexed, then bits [20:16] should be checked to be 0

- in disas_ldst_single_struct, if the instruction is not post-indexed,
then bits [20:16] should be checked to be 0;  also bit [31] should be
0

- in disas_add_sub_ext_reg, bits [23:22] should be checked to be 0

- in disas_data_proc_1src, there's a missing default that would flag
undefined instructions

- in disas_fp_1src, disas_fp_2src, disas_fp_3src, and disas_fp_imm,
bits [31:29] should be checked to be 0

- in disas_fp_imm, bits [9:5] should be checked to be 0

- in disas_simd_indexed, SDOT and UDOT are not scalar instructions.

That's all I found.  I hope I didn't make any transcription error :-)

Thanks,

Laurent
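
Most of the items above come down to the same pattern: extract a bit range from the 32-bit encoding and reject the instruction when the range is non-zero. A minimal standalone sketch of that check follows; the helper names field and bits_are_zero are hypothetical and are not the functions used in the QEMU decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract 'len' bits of 'insn' starting at bit 'start'. */
static inline uint32_t field(uint32_t insn, unsigned start, unsigned len)
{
    assert(len > 0 && len < 32 && start + len <= 32);
    return (insn >> start) & ((1u << len) - 1);
}

/* True when bits [hi:lo] of the encoding are all zero. */
static inline bool bits_are_zero(uint32_t insn, unsigned hi, unsigned lo)
{
    assert(hi >= lo);
    return field(insn, lo, hi - lo + 1) == 0;
}
```

A decoder would call, for example, bits_are_zero(insn, 23, 22) before dispatching and treat the encoding as undefined when it returns false.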
