Re: [PATCH] console: make QMP screendump use coroutine

2020-02-19 Thread Markus Armbruster
Marc-André Lureau  writes:

> Thanks to the QMP coroutine support, the screendump handler can
> trigger a graphic_hw_update(), yield and let the main loop run until
> update is done. Then the handler is resumed, and the ppm_save() will
> write the screen image to disk in the coroutine context (thus
> non-blocking).
>
> For now, HMP doesn't have coroutine support, so it remains potentially
> outdated or glitched.
>
> Fixes:
> https://bugzilla.redhat.com/show_bug.cgi?id=1230527
>
> Based-on: <20200109183545.27452-2-kw...@redhat.com>
>
> Cc: Kevin Wolf 
> Signed-off-by: Marc-André Lureau 
> ---
>  qapi/ui.json|  3 ++-
>  ui/console.c| 35 +++
>  ui/trace-events |  2 +-
>  3 files changed, 30 insertions(+), 10 deletions(-)
>
> diff --git a/qapi/ui.json b/qapi/ui.json
> index e04525d8b4..d941202f34 100644
> --- a/qapi/ui.json
> +++ b/qapi/ui.json
> @@ -96,7 +96,8 @@
>  #
>  ##
>  { 'command': 'screendump',
> -  'data': {'filename': 'str', '*device': 'str', '*head': 'int'} }
> +  'data': {'filename': 'str', '*device': 'str', '*head': 'int'},
> +  'coroutine': true }
>  
>  ##
>  # == Spice
> diff --git a/ui/console.c b/ui/console.c
> index ac79d679f5..db184b473f 100644
> --- a/ui/console.c
> +++ b/ui/console.c
> @@ -167,6 +167,7 @@ struct QemuConsole {
>  QEMUFIFO out_fifo;
>  uint8_t out_fifo_buf[16];
>  QEMUTimer *kbd_timer;
> +Coroutine *screendump_co;
>  
>  QTAILQ_ENTRY(QemuConsole) next;
>  };
> @@ -194,7 +195,6 @@ static void dpy_refresh(DisplayState *s);
>  static DisplayState *get_alloc_displaystate(void);
>  static void text_console_update_cursor_timer(void);
>  static void text_console_update_cursor(void *opaque);
> -static bool ppm_save(int fd, DisplaySurface *ds, Error **errp);
>  
>  static void gui_update(void *opaque)
>  {
> @@ -263,6 +263,9 @@ static void gui_setup_refresh(DisplayState *ds)
>  
>  void graphic_hw_update_done(QemuConsole *con)
>  {
> +if (con && con->screendump_co) {

How can !con happen?

> +aio_co_wake(con->screendump_co);
> +}
>  }
>  
>  void graphic_hw_update(QemuConsole *con)
> @@ -310,16 +313,16 @@ void graphic_hw_invalidate(QemuConsole *con)
>  }
>  }
>  
> -static bool ppm_save(int fd, DisplaySurface *ds, Error **errp)
> +static bool ppm_save(int fd, pixman_image_t *image, Error **errp)
>  {
> -int width = pixman_image_get_width(ds->image);
> -int height = pixman_image_get_height(ds->image);
> +int width = pixman_image_get_width(image);
> +int height = pixman_image_get_height(image);
>  g_autoptr(Object) ioc = OBJECT(qio_channel_file_new_fd(fd));
>  g_autofree char *header = NULL;
>  g_autoptr(pixman_image_t) linebuf = NULL;
>  int y;
>  
> -trace_ppm_save(fd, ds);
> +trace_ppm_save(fd, image);
>  
>  header = g_strdup_printf("P6\n%d %d\n%d\n", width, height, 255);
>  if (qio_channel_write_all(QIO_CHANNEL(ioc),
> @@ -329,7 +332,7 @@ static bool ppm_save(int fd, DisplaySurface *ds, Error 
> **errp)
>  
>  linebuf = qemu_pixman_linebuf_create(PIXMAN_BE_r8g8b8, width);
>  for (y = 0; y < height; y++) {
> -qemu_pixman_linebuf_fill(linebuf, ds->image, width, 0, y);
> +qemu_pixman_linebuf_fill(linebuf, image, width, 0, y);
>  if (qio_channel_write_all(QIO_CHANNEL(ioc),
>(char *)pixman_image_get_data(linebuf),
>pixman_image_get_stride(linebuf), errp) < 
> 0) {

Looks like an unrelated optimization / simplification.  If I were the
maintainer, I'd ask for a separate patch.

> @@ -340,11 +343,18 @@ static bool ppm_save(int fd, DisplaySurface *ds, Error 
> **errp)
>  return true;
>  }
>  
> +static void graphic_hw_update_bh(void *con)
> +{
> +graphic_hw_update(con);
> +}
> +
> +/* may be called in coroutine context or not */

Hmm.

Even though the QMP core always calls in coroutine context, the comment
is correct: hmp_screendump() calls it outside coroutine context.
Because of that...

>  void qmp_screendump(const char *filename, bool has_device, const char 
> *device,
>  bool has_head, int64_t head, Error **errp)
>  {
>  QemuConsole *con;
>  DisplaySurface *surface;
> +g_autoptr(pixman_image_t) image = NULL;
>  int fd;
>  
>  if (has_device) {
> @@ -365,7 +375,15 @@ void qmp_screendump(const char *filename, bool 
> has_device, const char *device,
>  }
>  }
>  
> -graphic_hw_update(con);
> +if (qemu_in_coroutine()) {
> +assert(!con->screendump_co);

What if multiple QMP monitors simultaneously screendump?  Hmm, it works
because all execute one after another in the same coroutine
qmp_dispatcher_co.  Implicit mutual exclusion.

Executing them one after another is bad, because it lets an ill-behaved
QMP command starve *all* QMP monitors.  We do it only out of
(reasonable!) fear of implicit mutual exclusion requirements like the
one you add.

Let's not add more if we can help it.

Your scr

Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-19 Thread Cédric Le Goater
On 2/20/20 2:50 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 19/02/2020 18:18, Cédric Le Goater wrote:
>> On 2/19/20 7:44 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 19/02/2020 12:20, Alexey Kardashevskiy wrote:


 On 18/02/2020 23:59, Cédric Le Goater wrote:
> On 2/18/20 1:48 PM, Cédric Le Goater wrote:
>> On 2/18/20 10:40 AM, Cédric Le Goater wrote:
>>> On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:


 On 18/02/2020 20:05, Alexey Kardashevskiy wrote:
>
>
> On 18/02/2020 18:12, Cédric Le Goater wrote:
>> On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 17/02/2020 20:48, Cédric Le Goater wrote:
 On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
> The following changes since commit 
> 05943fb4ca41f626078014c0327781815c6584c5:
>
>   ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 
> +1100)
>
> are available in the Git repository at:
>
>   g...@github.com:aik/qemu.git tags/qemu-slof-20200217
>
> for you to fetch changes up to 
> ea9a03e5aa023c5391bab5259898475d0298aac2:
>
>   pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
>
> 
> Alexey Kardashevskiy (1):
>   pseries: Update SLOF firmware image
>
>  pc-bios/README   |   2 +-
>  pc-bios/slof.bin | Bin 931032 -> 968560 bytes
>  roms/SLOF|   2 +-
>  3 files changed, 2 insertions(+), 2 deletions(-)
>
>
> *** Note: this is not for master, this is for pseries
>

 Hello Alexey,

 QEMU fails to boot from disk. See below.
>>>
>>>
>>> It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
>>> could have broken something but I need more detail. Thanks,
>>
>> fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 
>
>
> No, not that either:


 but it might be because of power9 - I only tried power8, rsyncing the
 image to a p9 machine now...
>>>
>>> Here is the disk : 
>>>
>>> Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
>>> Disk model: QEMU HARDDISK   
>>> Units: sectors of 1 * 512 = 512 bytes
>>> Sector size (logical/physical): 512 bytes / 512 bytes
>>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Disklabel type: gpt
>>> Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D
>>>
>>> Device Start   End   Sectors Size Type
>>> /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
>>> /dev/sda2  16384 100679679 100663296  48G Linux filesystem
>>> /dev/sda3  100679680 104857566   4177887   2G Linux swap
>>>
>>>
>>> GPT ? 
>>
>> For the failure, I bisected up to :
>>
>> f12149908705 ("ext2: Read all 64bit of inode number")
>
> Here is a possible fix for it. I did some RPN on my hp28s in the past 
> but I am not forth fluent.


 you basically zeroed the top bits by shifting them too far right :)

 The proper fix I think is:

 -  32 lshift or
 +  20 lshift or

 I keep forgetting it is all in hex. Can you please give it a try? My
 128GB disk does not expose this problem somehow. Thanks,
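
The decimal-versus-hex slip is easy to reproduce outside Forth. The sketch
below is in C rather than SLOF's Forth, and the function names are
hypothetical. It only assumes that the code in question combines a high and a
low 32-bit half into one 64-bit value. With literals read as hex, "32 lshift"
asks for a shift by 0x32 (50) bits, while "20 lshift" is the intended shift by
0x20 (32) bits.

```c
#include <assert.h>
#include <stdint.h>

/* Intended behaviour: the high half lands in bits 63..32 ("20 lshift"). */
static uint64_t combine_correct(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 0x20) | lo;
}

/*
 * The slip: "32" read as hex is 0x32 == 50, so only the low 14 bits of
 * hi survive the shift and they end up in the wrong place.
 */
static uint64_t combine_buggy(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 0x32) | lo;
}
```

When the high half is zero both variants agree, which may be one reason such
a bug can go unnoticed on some disks.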
>>>
>>> Better try this one please:
>>>
>>> https://github.com/aik/SLOF/tree/ext4
>> Tested with the same image. Looks good. 
> 
> 
> Thanks for testing. But it is still bizarre behaviour, why do we end up
> there anyway...
> 
> 
>>> What I still do not understand is why GRUB is using ext2 from SLOF, it
>>> should parse ext4 itself :-/
>>
>> Here is the fs information.
>>
>>
>> Filesystem volume name:   
>> Last mounted on:  /
>> Filesystem UUID:  8d53f6b4-ffc2-4d8f-bd09-67ac97d7b0c5
>> Filesystem magic number:  0xEF53
>> Filesystem revision #:1 (dynamic)
>> Filesystem features:  has_journal ext_attr resize_inode dir_index 
>> filetype needs_recovery extent flex_bg sparse_super large_file huge_file 
>> uninit_bg dir_nlink extra_isize
> 
> 
> huh, this one does not have 64bit like mine, I blindly assumed that by
> 2020 everything would be using that. Well that explains the bug. And
> yours also has uninit_bg (the whole idea of this flag is not obvious but
> ok).
> 
> 
>> Filesystem flags: unsigned_directory_hash 
>> Default mount options:user_xattr acl
>> Filesystem state: clean
>> Errors behavior:  Continue
>> Filesystem OS type:   Linux
>> Inode count:  3127296
>> Block count:  12582912
>> Reserved bl

Re: [PATCH v2 7/7] block/block-copy: hide structure definitions

2020-02-19 Thread Vladimir Sementsov-Ogievskiy

17.02.2020 17:04, Max Reitz wrote:

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:

Hide structure definitions and add explicit API instead, to keep an
eye on the scope of the shared fields.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block-copy.h | 57 +++--
  block/backup-top.c |  6 ++--
  block/backup.c | 27 
  block/block-copy.c | 64 ++
  4 files changed, 86 insertions(+), 68 deletions(-)


[...]


diff --git a/block/backup.c b/block/backup.c
index cf62b1a38c..acab0d08da 100644
--- a/block/backup.c
+++ b/block/backup.c


[...]


@@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
  job->sync_bitmap = sync_bitmap;
  job->bitmap_mode = bitmap_mode;
  job->bcs = bcs;
+job->bcs_bitmap = block_copy_dirty_bitmap(bcs);


It seems a bit weird to me to store a pointer to the BCS-owned bitmap
here, because, well, it’s a BCS-owned object, and just calling
block_copy_dirty_bitmap() every time wouldn’t be prohibitively expensive.

I feel sufficiently bad about this to warrant not giving an R-b, but I
know I shouldn’t withhold an R-b over this, so:

Reviewed-by: Max Reitz 


Hmm, actually, I tend to agree with you. Why did I write it that way? I'll take
a look and maybe change it to call block_copy_dirty_bitmap() every time.

Thanks for reviewing!
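
For readers following along, the pattern under discussion looks roughly like
the sketch below: the structure stays opaque and callers go through an
accessor each time instead of caching a pointer to an object they do not own.
All names here (CopyState, copy_state_dirty_bitmap, ...) are hypothetical
stand-ins, not the block-copy API, and in a real tree the struct definitions
would live in the .c file only.

```c
#include <assert.h>
#include <stdlib.h>

/* What a public header would expose: incomplete types plus an explicit API. */
typedef struct Bitmap Bitmap;
typedef struct CopyState CopyState;

CopyState *copy_state_new(void);
void copy_state_free(CopyState *s);
Bitmap *copy_state_dirty_bitmap(CopyState *s);

/* What stays private to the implementation file. */
struct Bitmap {
    unsigned long bits;
};

struct CopyState {
    Bitmap *copy_bitmap;    /* owned by CopyState, never by its users */
};

CopyState *copy_state_new(void)
{
    CopyState *s = calloc(1, sizeof(*s));

    assert(s);
    s->copy_bitmap = calloc(1, sizeof(*s->copy_bitmap));
    assert(s->copy_bitmap);
    return s;
}

void copy_state_free(CopyState *s)
{
    if (s) {
        free(s->copy_bitmap);
        free(s);
    }
}

/* Cheap accessor: calling it on every use avoids a cached, possibly stale pointer. */
Bitmap *copy_state_dirty_bitmap(CopyState *s)
{
    return s->copy_bitmap;
}
```

The accessor is a single pointer load, which is the point Max makes about it
not being prohibitively expensive.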




  job->cluster_size = cluster_size;
  job->len = len;





--
Best regards,
Vladimir



Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock

2020-02-19 Thread Vladimir Sementsov-Ogievskiy

17.02.2020 16:38, Max Reitz wrote:

On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:

Currently, a block_copy operation locks the whole requested region. But
there is no reason to lock clusters which are already copied; doing so
disturbs other parallel block_copy requests for no reason.

Let's instead do the following:

Lock only the sub-region which we are going to operate on. Then, after
copying all dirty sub-regions, wait for intersecting block-copy
requests; if they failed, retry these newly dirty clusters.


Just a thought spoken aloud:

I would expect the number of intersecting CBW requests to be low in
general, so I don’t know how useful this change is in practice.  OTOH,
it makes block_copy call the existing implementation in a loop, which
seems just worse.

But then again, in the common case, block_copy_dirty_clusters() won’t
copy anything because it’s all been copied already, so there is no
change; and even if something is copied, the second call will just
re-check the dirty bitmap to see that the area’s clean (which will be
quick compared to the copy operation).  So there’s probably nothing to
worry about.
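
To make the loop shape discussed above concrete, here is a single-threaded
simulation in C. It is only a sketch under assumptions: there are no
coroutines, one simulated foreign request stands in for "intersecting
requests", and every name (copy_dirty_clusters, wait_one, copy_range,
other_cluster, ...) is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_CLUSTERS 8

static bool dirty[NB_CLUSTERS];

/*
 * One simulated foreign in-flight request covering a single cluster;
 * -1 means there is none. Waiting for it completes it, and if it
 * "fails" its cluster becomes dirty again.
 */
static int other_cluster = -1;
static bool other_fails;
static int copies;

/* Returns 1 if at least one dirty cluster was copied, 0 if none was found. */
static int copy_dirty_clusters(int start, int n)
{
    int found = 0;

    for (int i = start; i < start + n; i++) {
        if (dirty[i]) {
            dirty[i] = false;   /* "copy" the cluster */
            copies++;
            found = 1;
        }
    }
    return found;
}

/* Returns false if nothing intersects, otherwise waits for it and returns true. */
static bool wait_one(int start, int n)
{
    if (other_cluster < start || other_cluster >= start + n) {
        return false;
    }
    if (other_fails) {
        dirty[other_cluster] = true;
    }
    other_cluster = -1;
    return true;
}

/* Copy, then wait for intersecting requests, and retry until both are idle. */
static int copy_range(int start, int n)
{
    int ret;

    do {
        ret = copy_dirty_clusters(start, n);
        if (ret == 0) {
            ret = wait_one(start, n);
        }
    } while (ret > 0);

    return ret;
}
```

In the common case the first pass finds nothing dirty and nothing to wait
for, so the loop body runs once.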


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/block-copy.c | 116 +
  1 file changed, 95 insertions(+), 21 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 20068cd699..aca44b13fb 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -39,29 +39,62 @@ static BlockCopyInFlightReq 
*block_copy_find_inflight_req(BlockCopyState *s,
  return NULL;
  }
  
-static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,

-   int64_t offset,
-   int64_t bytes)
+/*
+ * If there are no intersecting requests return false. Otherwise, wait for the
+ * first found intersecting request to finish and return true.
+ */
+static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
+ int64_t end)


s/end/bytes/?

(And maybe s/start/offset/, too)


  {
-BlockCopyInFlightReq *req;
+BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
  
-while ((req = block_copy_find_inflight_req(s, offset, bytes))) {

-qemu_co_queue_wait(&req->wait_queue, NULL);
+if (!req) {
+return false;
  }
+
+qemu_co_queue_wait(&req->wait_queue, NULL);
+
+return true;
  }
  
+/* Called only on full-dirty region */

  static void block_copy_inflight_req_begin(BlockCopyState *s,
BlockCopyInFlightReq *req,
int64_t offset, int64_t bytes)
  {
+assert(!block_copy_find_inflight_req(s, offset, bytes));
+
+bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+
  req->offset = offset;
  req->bytes = bytes;
  qemu_co_queue_init(&req->wait_queue);
  QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
  }
  
-static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)

+static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
+BlockCopyInFlightReq *req, int64_t new_bytes)


It took me a while to understand that this operation drops the tail
of the request.  I think there should be a comment on this.

(I thought it would successively drop the head after each copy, and so I
was wondering why the code didn’t match that.)
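
A toy model of the tail-dropping behaviour, for illustration only. It
assumes a dirty map with one flag per cluster and measures offsets in
clusters instead of bytes. Req, set_dirty, req_begin and req_shrink are
hypothetical names, and the wait queue handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NB_CLUSTERS 16

typedef struct Req {
    int64_t offset;
    int64_t bytes;
} Req;

static bool dirty[NB_CLUSTERS];

static void set_dirty(int64_t offset, int64_t bytes, bool value)
{
    for (int64_t i = offset; i < offset + bytes; i++) {
        dirty[i] = value;
    }
}

/* Starting a request takes over the region: its dirty bits are cleared. */
static void req_begin(Req *req, int64_t offset, int64_t bytes)
{
    set_dirty(offset, bytes, false);
    req->offset = offset;
    req->bytes = bytes;
}

/*
 * Shrinking keeps the head [offset, offset + new_bytes) and gives the
 * tail back by marking it dirty again, so someone else can copy it.
 */
static void req_shrink(Req *req, int64_t new_bytes)
{
    if (new_bytes == req->bytes) {
        return;
    }
    assert(new_bytes > 0 && new_bytes < req->bytes);
    set_dirty(req->offset + new_bytes, req->bytes - new_bytes, true);
    req->bytes = new_bytes;
}
```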


  {
+if (new_bytes == req->bytes) {
+return;
+}
+
+assert(new_bytes > 0 && new_bytes < req->bytes);
+
+bdrv_set_dirty_bitmap(s->copy_bitmap,
+                          req->offset + new_bytes, req->bytes - new_bytes);
+
+req->bytes = new_bytes;
+qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
+ BlockCopyInFlightReq *req,
+ int ret)
+{
+if (ret < 0) {
+bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
+}
  QLIST_REMOVE(req, list);
  qemu_co_queue_restart_all(&req->wait_queue);
  }
@@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
  return ret;
  }
  
-int coroutine_fn block_copy(BlockCopyState *s,

-int64_t offset, uint64_t bytes,
-bool *error_is_read)
+/*
+ * block_copy_dirty_clusters
+ *
+ * Copy dirty clusters in @start/@bytes range.
+ * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
+ * clusters found and -errno on failure.
+ */
+static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
+  int64_t offset, int64_t 
bytes,
+  bool *error_is_read)
  {
  int ret = 0;
-Block

[Bug 1823790] Re: QEMU mishandling of SO_PEERSEC forces systemd into tight loop

2020-02-19 Thread Charlie Sharpsteen
Laurent's patch worked for me as well.

I grabbed the source for the Debian 10 qemu-user-static package,
qemu_3.1+dfsg-8+deb10u3, applied the patch and re-built the qemu-arm-
static binary. Copying the new binary into a Docker image based on
arm32v7/debian:10-slim allowed /sbin/init to bring up the container with
a responsive systemctl command.

Prior to the patch, systemd did not start any services inside the
container and systemctl would hang when executed directly.

Thanks!
-Charlie

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1823790

Title:
  QEMU mishandling of SO_PEERSEC forces systemd into tight loop

Status in QEMU:
  Confirmed

Bug description:
  While building Debian images for embedded ARM target systems I
  detected that QEMU seems to force newer systemd daemons into a tight
  loop.

  My setup is the following:

  Host machine: Ubuntu 18.04, amd64
  LXD container: Debian Buster, arm64, systemd 241
  QEMU: qemu-aarch64-static, 4.0.0-rc2 (custom build) and 3.1.0 (Debian 
1:3.1+dfsg-7)

  To easily reproduce the issue I have created the following repository:
  https://github.com/lueschem/edi-qemu

  The call where systemd gets looping is the following:
  2837 getsockopt(3,1,31,274891889456,274887218756,274888927920) = -1 errno=34 
(Numerical result out of range)

  Furthermore I also verified that the issue is not related to LXD.
  The same behavior can be reproduced using systemd-nspawn.

  This issue reported against systemd seems to be related:
  https://github.com/systemd/systemd/issues/11557

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1823790/+subscriptions



Re: [PATCH v2 1/5] vhost-user block device backend

2020-02-19 Thread Coiby Xu
Hi Kevin,

Thank you for reviewing my work in a rather detailed way.

> >  blockdev-vu.c  | 1008 
> >  include/block/vhost-user.h |   46 ++
> >  vl.c   |4 +
> >  3 files changed, 1058 insertions(+)
> >  create mode 100644 blockdev-vu.c
> >  create mode 100644 include/block/vhost-user.h

> This adds a single, relatively big source file. I see at least two
> parts: The generic vhost-user infrastructure with connection handling
> etc. and the implementation of the specific vhost-user-blk device.
> Separating these into two files is probably a good idea.

> I would also suggest to put the files in a new subdirectory
> block/export/ and call them vhost-user.c/vhost-user-blk.c. The new
> header file can be in the same directory as it shouldn't be used by
> anyone else.

I've split blockdev-vu.c in two separate files but in a different subdirectory
 - backends/vhost-user-blk-server.c
 - util/vhost-user-server.c

> > +static QTAILQ_HEAD(, VubDev) vub_devs = QTAILQ_HEAD_INITIALIZER(vub_devs);
> > +
> > +
> > +typedef struct VubReq {
> > +VuVirtqElement *elem;

> Maybe worth a comment that this was allocated with plain malloc(), so
> you must use free() rather than g_free() (which would be the default in
> QEMU)?

Although VuVirtqElement is created using malloc, VubReq is created using g_new0.


I missed several suggestions in v3 but in v4 all suggestions have been
applied. Thank you!
On Thu, Jan 16, 2020 at 9:56 PM Kevin Wolf  wrote:
>
> Hi,
>
> I'm only doing a quick first review pointing out the more obvious
> things while I familiarise myself with your code. I intend to review it
> in more detail later (either in a second pass for this series, or when
> you post v3).
>
> Am 14.01.2020 um 15:06 hat Coiby Xu geschrieben:
> > By making use of libvhost, multiple block device drives can be exported and 
> > each drive can serve multiple clients simultaneously. Since 
> > vhost-user-server needs a block drive to be created first, delay the 
> > creation of this object.
> >
> > Signed-off-by: Coiby Xu 
>
> Please wrap the commit message at 72 characters.
>
> >  blockdev-vu.c  | 1008 
> >  include/block/vhost-user.h |   46 ++
> >  vl.c   |4 +
> >  3 files changed, 1058 insertions(+)
> >  create mode 100644 blockdev-vu.c
> >  create mode 100644 include/block/vhost-user.h
>
> This adds a single, relatively big source file. I see at least two
> parts: The generic vhost-user infrastructure with connection handling
> etc. and the implementation of the specific vhost-user-blk device.
> Separating these into two files is probably a good idea.
>
> I would also suggest to put the files in a new subdirectory
> block/export/ and call them vhost-user.c/vhost-user-blk.c. The new
> header file can be in the same directory as it shouldn't be used by
> anyone else.
>
> > diff --git a/blockdev-vu.c b/blockdev-vu.c
> > new file mode 100644
> > index 00..45f0bb43a7
> > --- /dev/null
> > +++ b/blockdev-vu.c
> > @@ -0,0 +1,1008 @@
>
> The LICENSE file clarifies that files without a license header are
> GPLv2+, so it's not strictly a problem, but I think it is good style to
> include a license header that explicitly tells so.
>
> > +#include "qemu/osdep.h"
> > +#include "block/vhost-user.h"
> > +#include "qapi/error.h"
> > +#include "qapi/qapi-types-sockets.h"
> > +#include "qapi/qapi-commands-block.h"
> > +
> > +#include "sysemu/block-backend.h"
> > +#include "qemu/main-loop.h"
> > +
> > +#include "qemu/units.h"
> > +
> > +#include "block/block.h"
> > +
> > +#include "qom/object_interfaces.h"
> > +
> > +#include 
> > +
> > +#include "hw/qdev-properties.h"
>
> Does the order of includes and the empty lines between them signify
> anything? If not, I suggest just sorting them alphabetically (and maybe
> using empty lines between different subdirectories if you like this
> better than a single large block).
>
> According to CODING_STYLE.rst, system headers like  come
> before all QEMU headers (except qemu/osdep.h, which always must come
> first).
>
> > +enum {
> > +VHOST_USER_BLK_MAX_QUEUES = 8,
> > +};
> > +
> > +struct virtio_blk_inhdr {
> > +unsigned char status;
> > +};
> > +
> > +
> > +static QTAILQ_HEAD(, VubDev) vub_devs = QTAILQ_HEAD_INITIALIZER(vub_devs);
> > +
> > +
> > +typedef struct VubReq {
> > +VuVirtqElement *elem;
>
> Maybe worth a comment that this was allocated with plain malloc(), so
> you must use free() rather than g_free() (which would be the default in
> QEMU)?
>
> > +int64_t sector_num;
> > +size_t size;
> > +struct virtio_blk_inhdr *in;
> > +struct virtio_blk_outhdr out;
> > +VuClient *client;
> > +struct VuVirtq *vq;
> > +} VubReq;
>
> I'm not completely sure yet, but I think I would prefer VuBlock to Vub
> in the type names. Some may even prefer VhostUserBlock, but I can see
> that this would be quite lengthy.
>
> > +static void
> > +remove

Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-19 Thread Jason Wang



On 2020/2/19 3:55 PM, Laurent Vivier wrote:

On 19/02/2020 at 02:57, Aleksandar Markovic wrote:

2:54 AM Wed, 19.02.2020. Aleksandar Markovic
<aleksandar.m.m...@gmail.com> wrote:

2:06 AM Wed, 19.02.2020. Finn Thain wrote:

On Tue, 18 Feb 2020, Aleksandar Markovic wrote:


On Wednesday, January 29, 2020, Finn Thain
<fth...@telegraphics.com.au> wrote:


Hi All,

There are bugs in the emulated dp8393x device that can stop packet
reception in a Linux/m68k guest (q800 machine).

With a Linux/m68k v5.5 guest (q800), it's possible to remotely trigger
an Oops by sending ping floods.

With a Linux/mips guest (magnum machine), the driver fails to probe
the dp8393x device.

With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
fatal to the guest kernel.

Whilst debugging the device, I found that the receiver algorithm
differs from the one described in the National Semiconductor
datasheet.

This patch series resolves these bugs.

AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
---


Herve,

Do your Jazz tests pass with these changes?


AFAIK those tests did not expose the NetBSD panic that is caused by
mainline QEMU (mentioned above).

I have actually run the tests you requested (Hervé described them in an
earlier thread). There was no regression. Quite the reverse -- it's no
longer possible to remotely crash the NetBSD kernel.

Apparently my testing was also the first time that the jazzsonic driver
(from the Linux/mips Magnum port) was tested successfully with QEMU. It
doesn't work in mainline QEMU.


Well, I apologize if I missed all these facts. I just did not notice
them, at least not in this form. And, yes, some "Tested-by:" by Herve
would be desirable and nice.
Or, perhaps, even "Reviewed-by:".


It would be nice to have this merged before next release because q800
machine networking is not reliable without them.



I will send the pull request that contains this series before the end of 
this week.


Thanks




And thank you to Finn for all his hard work on this device emulation.

Laurent











Re: [PATCH v7 0/4] colo: Add support for continuous replication

2020-02-19 Thread Jason Wang



On 2020/2/20 9:38 AM, Zhang, Chen wrote:

Hi Jason,

I noticed this series has not been merged or queued. Did you run into a
problem with it?



Thanks

Zhang Chen



No, I've queued this.

Thanks





[PATCH qemu v7 2/5] spapr/spapr: Make vty_getchars public

2020-02-19 Thread Alexey Kardashevskiy
A serial device fetches the data from the chardev backend as soon as
input happens and stores it in its internal device-specific buffer; every
char device implements this again. Since there is no unified interface to
read such a buffer, we will have to read characters directly from
VIO_SPAPR_VTY_DEVICE. The OF client is going to need this.
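
As an illustration of the kind of device-internal buffer described above,
here is a minimal ring buffer sketch in C. It is not the spapr-vty code:
SerialBuf, BUF_SIZE, serial_receive and serial_getchars are hypothetical
names, and the buffer size is an arbitrary power of two so that the free
running indices can wrap safely.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_SIZE 16     /* hypothetical; must be a power of two */

typedef struct SerialBuf {
    uint32_t in;        /* total bytes ever stored */
    uint32_t out;       /* total bytes ever consumed */
    uint8_t buf[BUF_SIZE];
} SerialBuf;

/* Backend side: store incoming bytes, dropping whatever does not fit. */
static void serial_receive(SerialBuf *s, const uint8_t *data, int size)
{
    for (int i = 0; i < size; i++) {
        if (s->in - s->out == BUF_SIZE) {
            break;      /* buffer full */
        }
        s->buf[s->in++ % BUF_SIZE] = data[i];
    }
}

/* Consumer side: copy out at most max buffered bytes, return the count. */
static int serial_getchars(SerialBuf *s, uint8_t *out, int max)
{
    int n = 0;

    while (n < max && s->out != s->in) {
        out[n++] = s->buf[s->out++ % BUF_SIZE];
    }
    return n;
}
```

A caller that is not the device itself needs a function like
serial_getchars() to be visible outside the device's source file, which is
what making vty_getchars() public provides for spapr-vty.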

Signed-off-by: Alexey Kardashevskiy 
---
 include/hw/ppc/spapr_vio.h | 1 +
 hw/char/spapr_vty.c| 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index bed7df60e35c..77e9b73bdfe0 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -130,6 +130,7 @@ int spapr_vio_send_crq(SpaprVioDevice *dev, uint8_t *crq);
 
 SpaprVioDevice *vty_lookup(SpaprMachineState *spapr, target_ulong reg);
 void vty_putchars(SpaprVioDevice *sdev, uint8_t *buf, int len);
+int vty_getchars(SpaprVioDevice *sdev, uint8_t *buf, int max);
 void spapr_vty_create(SpaprVioBus *bus, Chardev *chardev);
 void spapr_vlan_create(SpaprVioBus *bus, NICInfo *nd);
 void spapr_vscsi_create(SpaprVioBus *bus);
diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c
index ecb94f5673ca..1c00da75b4f1 100644
--- a/hw/char/spapr_vty.c
+++ b/hw/char/spapr_vty.c
@@ -52,7 +52,7 @@ static void vty_receive(void *opaque, const uint8_t *buf, int 
size)
 }
 }
 
-static int vty_getchars(SpaprVioDevice *sdev, uint8_t *buf, int max)
+int vty_getchars(SpaprVioDevice *sdev, uint8_t *buf, int max)
 {
 SpaprVioVty *dev = VIO_SPAPR_VTY_DEVICE(sdev);
 int n = 0;
-- 
2.17.1




[PATCH qemu v7 3/5] spapr/cas: Separate CAS handling from rebuilding the FDT

2020-02-19 Thread Alexey Kardashevskiy
At the moment "ibm,client-architecture-support" ("CAS") is implemented
in SLOF and QEMU assists via the custom H_CAS hypercall which copies
an updated flattened device tree (FDT) blob to the SLOF memory which
it then uses to update its internal tree.

When we enable the OpenFirmware client interface in QEMU, we won't need
to copy the FDT to the guest as the client is expected to fetch
the device tree using the client interface.

This moves FDT rebuild out to a separate helper which is going to be
called from the "ibm,client-architecture-support" handler and leaves
writing FDT to the guest in the H_CAS handler.

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy 
---
 include/hw/ppc/spapr.h |  7 +
 hw/ppc/spapr.c |  1 -
 hw/ppc/spapr_hcall.c   | 67 ++
 3 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 09110961a589..7802acee0c85 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -102,6 +102,8 @@ typedef enum {
 #define SPAPR_CAP_FIXED_CCD 0x03
 #define SPAPR_CAP_FIXED_NA  0x10 /* Lets leave a bit of a gap... */
 
+#define FDT_MAX_SIZE0x10
+
 typedef struct SpaprCapabilities SpaprCapabilities;
 struct SpaprCapabilities {
 uint8_t caps[SPAPR_CAP_NUM];
@@ -558,6 +560,11 @@ void spapr_register_hypercall(target_ulong opcode, 
spapr_hcall_fn fn);
 target_ulong spapr_hypercall(PowerPCCPU *cpu, target_ulong opcode,
  target_ulong *args);
 
+target_ulong do_client_architecture_support(PowerPCCPU *cpu,
+SpaprMachineState *spapr,
+target_ulong addr,
+target_ulong fdt_bufsize);
+
 /* Virtual Processor Area structure constants */
 #define VPA_MIN_SIZE   640
 #define VPA_SIZE_OFFSET0x4
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 90b68e2f479e..62d6487c2568 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -96,7 +96,6 @@
  *
  * We load our kernel at 4M, leaving space for SLOF initial image
  */
-#define FDT_MAX_SIZE0x10
 #define RTAS_MAX_ADDR   0x8000 /* RTAS must stay below that */
 #define FW_MAX_SIZE 0x40
 #define FW_FILE_NAME"slof.bin"
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 6db3dbde9c92..35de3ab95f42 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1664,16 +1664,12 @@ static bool spapr_transient_dev_before_cas(void)
 return false;
 }
 
-static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
-  SpaprMachineState *spapr,
-  target_ulong opcode,
-  target_ulong *args)
+target_ulong do_client_architecture_support(PowerPCCPU *cpu,
+SpaprMachineState *spapr,
+target_ulong vec,
+target_ulong fdt_bufsize)
 {
-/* Working address in data buffer */
-target_ulong addr = ppc64_phys_to_real(args[0]);
-target_ulong fdt_buf = args[1];
-target_ulong fdt_bufsize = args[2];
-target_ulong ov_table;
+target_ulong ov_table; /* Working address in data buffer */
 uint32_t cas_pvr;
 SpaprOptionVector *ov1_guest, *ov5_guest, *ov5_cas_old;
 bool guest_radix;
@@ -1693,7 +1689,7 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 }
 }
 
-cas_pvr = cas_check_pvr(spapr, cpu, &addr, &raw_mode_supported, 
&local_err);
+cas_pvr = cas_check_pvr(spapr, cpu, &vec, &raw_mode_supported, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return H_HARDWARE;
@@ -1716,7 +1712,7 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 }
 
 /* For the future use: here @ov_table points to the first option vector */
-ov_table = addr;
+ov_table = vec;
 
 ov1_guest = spapr_ovec_parse_vector(ov_table, 1);
 if (!ov1_guest) {
@@ -1840,7 +1836,6 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 
 if (!spapr->cas_reboot) {
 void *fdt;
-SpaprDeviceTreeUpdateHeader hdr = { .version_id = 1 };
 
 /* If spapr_machine_reset() did not set up a HPT but one is necessary
  * (because the guest isn't going to use radix) then set it up here. */
@@ -1849,21 +1844,7 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 spapr_setup_hpt_and_vrma(spapr);
 }
 
-if (fdt_bufsize < sizeof(hdr)) {
-error_report("SLOF provided insufficient CAS buffer "
- TARGET_FMT_lu " (min: %zu)", fdt_bufsize, 
sizeof(hdr));
-exit(EXIT_FAIL

[PATCH qemu v7 4/5] spapr: Implement Open Firmware client interface

2020-02-19 Thread Alexey Kardashevskiy
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boottime firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it's become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This adds support for a console. For output any serial device can be used;
for stdin the support is limited to spapr-vty, as allowing input from
a serial device requires device-model specific code (output is simpler).

This implements a handful of CI methods just to get Linux and GRUB going;
Linux requires even less. In particular, this implements the device tree
fetching, reading from block device, read-write stdout/stdin and
ibm,client-architecture-support.

This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns "phandles" to device tree nodes to make
device tree traversing work.

When vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This implements "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to report available memory to the client.

This adds basic support for instances, which are managed by a hashmap
ihandle -> [phandle, DeviceState, CharBackend].
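
A minimal sketch of the ihandle -> instance mapping described above. This is
an illustration only: the patch uses a hashmap, while this sketch swaps in a
small fixed-size array to stay self-contained. The names instance_table,
instance_open, instance_lookup and instance_close are hypothetical and do not
come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 8   /* arbitrary limit for the sketch */

/* What one open instance remembers (mirrors [phandle, dev, backend]). */
struct instance {
    uint32_t phandle;   /* device tree node the instance was opened on */
    void *dev;          /* stands in for DeviceState * */
    void *backend;      /* stands in for CharBackend * */
    int in_use;
};

struct instance_table {
    struct instance slots[MAX_INSTANCES];
};

/* Returns a non-zero ihandle (slot index + 1), or 0 if the table is full. */
static uint32_t instance_open(struct instance_table *t, uint32_t phandle,
                              void *dev, void *backend)
{
    uint32_t i;

    for (i = 0; i < MAX_INSTANCES; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].phandle = phandle;
            t->slots[i].dev = dev;
            t->slots[i].backend = backend;
            t->slots[i].in_use = 1;
            return i + 1;
        }
    }
    return 0;
}

/* Validates the ihandle before use; returns NULL for unknown handles. */
static struct instance *instance_lookup(struct instance_table *t, uint32_t ih)
{
    if (ih == 0 || ih > MAX_INSTANCES || !t->slots[ih - 1].in_use) {
        return NULL;
    }
    return &t->slots[ih - 1];
}

static void instance_close(struct instance_table *t, uint32_t ih)
{
    struct instance *inst = instance_lookup(t, ih);

    if (inst) {
        inst->in_use = 0;
    }
}
```

Reserving 0 as "no instance" lets the client-visible handle double as an
error value, which is why the lookup rejects it explicitly.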

Before the guest starts, the used memory is:
0..4000 - the initial firmware
1..18 - stack
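
For illustration, here is a rough sketch of how a "claim" style allocator
with pre-claimed ranges could behave. It assumes the IEEE 1275 convention
that align == 0 means "claim exactly at virt", while a non-zero align lets
the allocator pick an aligned address. The names claim_state, claim_range and
CLAIM_FAIL, and the fixed-size table, are hypothetical and not taken from the
patch. Updating "/memory@0/available" is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 16
#define CLAIM_FAIL UINT64_MAX

/* One claimed [start, start + size) range. */
struct claim_range {
    uint64_t start;
    uint64_t size;
};

struct claim_state {
    struct claim_range used[MAX_CLAIMS];
    size_t nused;
    uint64_t mem_top;   /* end of the memory the client may claim */
};

/* A range is free if it fits below mem_top and overlaps no earlier claim. */
static int range_is_free(const struct claim_state *st, uint64_t start,
                         uint64_t size)
{
    size_t i;

    if (size == 0 || start > st->mem_top || size > st->mem_top - start) {
        return 0;
    }
    for (i = 0; i < st->nused; i++) {
        const struct claim_range *r = &st->used[i];

        if (start < r->start + r->size && r->start < start + size) {
            return 0;
        }
    }
    return 1;
}

/* Returns the claimed base address, or CLAIM_FAIL. */
static uint64_t claim(struct claim_state *st, uint64_t virt, uint64_t size,
                      uint64_t align)
{
    uint64_t addr = virt;

    if (st->nused == MAX_CLAIMS || size == 0) {
        return CLAIM_FAIL;
    }
    if (align == 0) {
        /* Fixed request: succeed only if exactly this range is free. */
        if (!range_is_free(st, addr, size)) {
            return CLAIM_FAIL;
        }
    } else {
        /* virt is ignored: take the first free aligned address. */
        addr = 0;
        while (!range_is_free(st, addr, size)) {
            if (align > st->mem_top - addr) {
                return CLAIM_FAIL;
            }
            addr += align;
        }
    }
    st->used[st->nused].start = addr;
    st->used[st->nused].size = size;
    st->nused++;
    return addr;
}
```

In such a scheme the firmware image and the stack would simply be the first
entries claimed with align == 0, so later aligned requests skip over them.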

This OF CI does not implement "interpret".

With this basic support, this can only boot into a kernel directly.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v7:
* now we have a small firmware which loads at 0 as SLOF and starts from
0x100 as SLOF
* no MBR/ELF/GRUB business in QEMU anymore
* blockdev is a separate patch
* networking is a separate patch

v6:
* borrowed a big chunk of commit log introduction from David
* fixed initial stack pointer (points to the highest address of stack)
* traces for "interpret" and others
* disabled  translate_kernel_address() hack so grub can load (work in
progress)
* added "milliseconds" for grub
* fixed "claim" allocator again
* moved FDT_MAX_SIZE to spapr.h as spapr_of_client.c wants it too for CAS
* moved the most code possible from spapr.c to spapr_of_client.c, such as
RTAS, prom entry and FDT build/finalize
* separated blobs
* GRUB now proceeds to its console prompt (there are still other issues)
* parse MBR/GPT to find PReP and load GRUB

v5:
* made instances keep device and chardev pointers
* removed VIO dependencies
* print error if RTAS memory is not claimed as it should have been
* pack FDT as "quiesce"

v4:
* fixed open
* validate ihandles in "call-method"

v3:
* fixed phandles allocation
* s/__be32/uint32_t/ as we do not normally have __be32 type in qemu
* fixed size of /chosen/stdout
* bunch of renames
* do not create rtas properties at all, let the client deal with it;
instead setprop allows changing these in the FDT
* no more packing FDT when bios=off - nobody needs it and getprop does not
work otherwise
* allow updating initramdisk device tree properties (for zImage)
* added instances
* fixed stdout on OF's "write"
* removed special handling for stdout in OF client, spapr-vty handles it
instead

v2:
* fixed claim()
* added "setprop"
* cleaner client interface and RTAS blobs management
* boots to petitboot and further to the target system
* more trace points
---
 hw/ppc/Makefile.objs |1 +
 pc-bios/vof/Makefile |   18 +
 include/hw/ppc/spapr.h   |   20 +-
 pc-bios/vof/vof.h|   48 ++
 hw/ppc/spapr.c   |   68 ++-
 hw/ppc/spapr_hcall.c |6 +-
 hw/ppc/spapr_of_client.c | 1221 ++
 pc-bios/vof/bootmem.c|   13 +
 pc-bios/vof/ci.c |  136 +
 pc-bios/vof/libc.c   |   91 +++
 pc-bios/vof/main.c   |   23 +
 hw/ppc/trace-events  |   21 +
 pc-bios/README   |2 +
 pc-bios/vof.bin  |  Bin 0 -> 4272 bytes
 pc-bios/vof/entry.S  

[PATCH qemu v7 0/5] spapr: Kill SLOF

2020-02-19 Thread Alexey Kardashevskiy
This is another attempt to implement minimalistic
Open Firmware Client Interface in QEMU.

With this thing, I can boot unmodified Ubuntu 18.04 and Fedora 30
directly from the disk without SLOF.

A useful discussion happened earlier:
https://lore.kernel.org/qemu-devel/f881c2e7-be92-9695-6e19-2dd88cbc6...@ozlabs.ru/

5/5 is kind of controversial though. This respin does not include
networking.

This is based on sha1
015fb0ead60d Chen Qun "hw/ppc/virtex_ml507:fix leak of fdevice tree blob".

Please comment. Thanks.



Alexey Kardashevskiy (5):
  ppc/spapr: Move GPRs setup to one place
  spapr/spapr: Make vty_getchars public
  spapr/cas: Separate CAS handling from rebuilding the FDT
  spapr: Implement Open Firmware client interface
  spapr/vof: Add basic support for MBR/GPT/GRUB

 hw/ppc/Makefile.objs|1 +
 pc-bios/vof/Makefile|   18 +
 include/hw/ppc/spapr.h  |   27 +-
 include/hw/ppc/spapr_cpu_core.h |4 +-
 include/hw/ppc/spapr_vio.h  |1 +
 pc-bios/vof/vof.h   |   63 ++
 hw/char/spapr_vty.c |2 +-
 hw/ppc/spapr.c  |   69 +-
 hw/ppc/spapr_cpu_core.c |6 +-
 hw/ppc/spapr_hcall.c|   73 +-
 hw/ppc/spapr_of_client.c| 1285 +++
 hw/ppc/spapr_rtas.c |2 +-
 pc-bios/vof/bootblock.c |  242 ++
 pc-bios/vof/bootmem.c   |   13 +
 pc-bios/vof/ci.c|  147 
 pc-bios/vof/elf32.c |  273 +++
 pc-bios/vof/libc.c  |   91 +++
 pc-bios/vof/main.c  |   24 +
 hw/ppc/trace-events |   25 +
 pc-bios/README  |2 +
 pc-bios/vof.bin |  Bin 0 -> 9180 bytes
 pc-bios/vof/entry.S |   58 ++
 pc-bios/vof/l.lds   |   48 ++
 23 files changed, 2429 insertions(+), 45 deletions(-)
 create mode 100644 pc-bios/vof/Makefile
 create mode 100644 pc-bios/vof/vof.h
 create mode 100644 hw/ppc/spapr_of_client.c
 create mode 100644 pc-bios/vof/bootblock.c
 create mode 100644 pc-bios/vof/bootmem.c
 create mode 100644 pc-bios/vof/ci.c
 create mode 100644 pc-bios/vof/elf32.c
 create mode 100644 pc-bios/vof/libc.c
 create mode 100644 pc-bios/vof/main.c
 create mode 100755 pc-bios/vof.bin
 create mode 100644 pc-bios/vof/entry.S
 create mode 100644 pc-bios/vof/l.lds

-- 
2.17.1




[PATCH qemu v7 1/5] ppc/spapr: Move GPRs setup to one place

2020-02-19 Thread Alexey Kardashevskiy
At the moment "pseries" starts in SLOF, which only expects the FDT blob
pointer in r3. As we are going to introduce Open Firmware support in
QEMU, we will be booting OF clients directly, and these expect a stack
pointer in r1. Linux also looks at r3/r4 for the initramdisk location
(vmlinux can find this from the device tree, but zImage from
distro kernels cannot).

This extends spapr_cpu_set_entry_state() to take more registers. This
should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v7:
* removed r5 as it points to prom entry which is now provided by
a new firmware in later patches
---
 include/hw/ppc/spapr_cpu_core.h | 4 +++-
 hw/ppc/spapr.c  | 2 +-
 hw/ppc/spapr_cpu_core.c | 6 +-
 hw/ppc/spapr_rtas.c | 2 +-
 4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 1c4cc6559c52..7aed8f555b4f 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -40,7 +40,9 @@ typedef struct SpaprCpuCoreClass {
 } SpaprCpuCoreClass;
 
 const char *spapr_get_cpu_core_type(const char *cpu_type);
-void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3);
+void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip,
+   target_ulong r1, target_ulong r3,
+   target_ulong r4);
 
 typedef struct SpaprCpuState {
 uint64_t vpa_addr;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 828e2cc1359a..90b68e2f479e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1691,7 +1691,7 @@ static void spapr_machine_reset(MachineState *machine)
 spapr->fdt_blob = fdt;
 
 /* Set up the entry state */
-spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, 0, fdt_addr, 0);
 first_ppc_cpu->env.gpr[5] = 0;
 
 spapr->cas_reboot = false;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index d09125d9afd4..590bd70e05cc 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -84,13 +84,17 @@ static void spapr_reset_vcpu(PowerPCCPU *cpu)
 spapr_irq_cpu_intc_reset(spapr, cpu);
 }
 
-void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong r3)
+void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip,
+   target_ulong r1, target_ulong r3,
+   target_ulong r4)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
 
 env->nip = nip;
+env->gpr[1] = r1;
 env->gpr[3] = r3;
+env->gpr[4] = r4;
 kvmppc_set_reg_ppc_online(cpu, 1);
 CPU(cpu)->halted = 0;
 /* Enable Power-saving mode Exit Cause exceptions */
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 656fdd221665..fe83b50c6629 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -190,7 +190,7 @@ static void rtas_start_cpu(PowerPCCPU *callcpu, SpaprMachineState *spapr,
  */
 newcpu->env.tb_env->tb_offset = callcpu->env.tb_env->tb_offset;
 
-spapr_cpu_set_entry_state(newcpu, start, r3);
+spapr_cpu_set_entry_state(newcpu, start, 0, r3, 0);
 
 qemu_cpu_kick(CPU(newcpu));
 
-- 
2.17.1




[PATCH qemu v7 5/5] spapr/vof: Add basic support for MBR/GPT/GRUB

2020-02-19 Thread Alexey Kardashevskiy
This hooks up disks to block backends so vof.bin can read MBR/GPT,
find a bootloader and run it. This bypasses the device drivers
and goes straight to the backend.

This implements basic support for a 32-bit big-endian bootloader;
tested on GRUB.
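
To illustrate the MBR part, here is a small sketch of locating a PReP boot
partition in a 512-byte MBR sector. It relies on the standard MBR layout
(signature 0x55 0xAA at offset 510, four 16-byte entries from offset 446,
little-endian LBA fields) and on partition type 0x41 for PReP. The function
name mbr_find_prep is hypothetical. The little-endian fields are read byte by
byte here, which is a different approach from the bswap macros the patch adds
for the big-endian firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_PART_TABLE  446    /* offset of the first partition entry */
#define MBR_PART_ENTRY  16     /* size of one partition entry */
#define MBR_NUM_PARTS   4
#define MBR_TYPE_PREP   0x41   /* PReP boot partition type */

/* Read a little-endian 32-bit value; independent of host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Scan the four primary entries of a 512-byte MBR sector.
 * Returns 1 and fills start_lba/num_sectors for the first PReP entry,
 * or 0 if the signature is missing or no PReP partition exists.
 */
static int mbr_find_prep(const uint8_t *sector, uint32_t *start_lba,
                         uint32_t *num_sectors)
{
    int i;

    if (sector[510] != 0x55 || sector[511] != 0xAA) {
        return 0;
    }
    for (i = 0; i < MBR_NUM_PARTS; i++) {
        const uint8_t *e = sector + MBR_PART_TABLE + i * MBR_PART_ENTRY;

        if (e[4] == MBR_TYPE_PREP) {
            *start_lba = read_le32(e + 8);     /* first sector (LBA) */
            *num_sectors = read_le32(e + 12);  /* length in sectors */
            return 1;
        }
    }
    return 0;
}
```

The bootloader image would then be read from start_lba * block size; GPT
parsing and ELF loading are outside the scope of this sketch.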

Signed-off-by: Alexey Kardashevskiy 
---
 pc-bios/vof/Makefile |   2 +-
 pc-bios/vof/vof.h|  15 +++
 hw/ppc/spapr_of_client.c |  64 +
 pc-bios/vof/bootblock.c  | 242 ++
 pc-bios/vof/ci.c |  11 ++
 pc-bios/vof/elf32.c  | 273 +++
 pc-bios/vof/main.c   |   1 +
 hw/ppc/trace-events  |   4 +
 pc-bios/vof.bin  | Bin 4272 -> 9180 bytes
 9 files changed, 611 insertions(+), 1 deletion(-)
 create mode 100644 pc-bios/vof/bootblock.c
 create mode 100644 pc-bios/vof/elf32.c

diff --git a/pc-bios/vof/Makefile b/pc-bios/vof/Makefile
index 49f7e240eeff..7e4227bb2cc6 100644
--- a/pc-bios/vof/Makefile
+++ b/pc-bios/vof/Makefile
@@ -8,7 +8,7 @@ build-all: vof.bin
 %.o: %.c
cc -m32 -mbig-endian -c -fno-stack-protector 
-Wno-builtin-declaration-mismatch -o $@ $<
 
-vof.elf: entry.o main.o libc.o ci.o bootmem.o
+vof.elf: entry.o main.o libc.o ci.o bootmem.o bootblock.o elf32.o
ld -nostdlib -e_start -Tl.lds -EB -o $@ $^
 
 %.bin: %.elf
diff --git a/pc-bios/vof/vof.h b/pc-bios/vof/vof.h
index 738b2539aa19..b16270b289b7 100644
--- a/pc-bios/vof/vof.h
+++ b/pc-bios/vof/vof.h
@@ -37,6 +37,8 @@ phandle ci_finddevice(const char *path);
 uint32_t ci_getprop(phandle ph, const char *propname, void *prop, int len);
 ihandle ci_open(const char *path);
 void ci_close(ihandle ih);
+uint32_t ci_block_size(ihandle ih);
+uint32_t ci_seek(ihandle ih, uint64_t offset);
 uint32_t ci_read(ihandle ih, void *buf, int len);
 uint32_t ci_write(ihandle ih, const void *buf, int len);
 void ci_stdout(const char *buf);
@@ -44,5 +46,18 @@ void ci_stdoutn(const char *buf, int len);
 void *ci_claim(void *virt, uint32_t size, uint32_t align);
 uint32_t ci_release(void *virt, uint32_t size);
 
+/* ELF */
+int elf_load_file(void *file_addr, uint32_t *entry,
+  int (*pre_load)(void*, long),
+  void (*post_load)(void*, long));
+
+/* booting from blockdev */
+void boot_block(void);
+
 /* booting from -kernel */
 void boot_from_memory(uint64_t initrd, uint64_t initrdsize);
+
+/* bswap */
+#define le16_to_cpu(x) __builtin_bswap16(x)
+#define le32_to_cpu(x) __builtin_bswap32(x)
+#define le64_to_cpu(x) __builtin_bswap64(x)
diff --git a/hw/ppc/spapr_of_client.c b/hw/ppc/spapr_of_client.c
index 4c476e138e60..a36b32487349 100644
--- a/hw/ppc/spapr_of_client.c
+++ b/hw/ppc/spapr_of_client.c
@@ -43,6 +43,9 @@ typedef struct {
 typedef struct {
 DeviceState *dev;
 CharBackend *cbe;
+BlockBackend *blk;
+uint64_t blk_pos;
+uint16_t blk_physical_block_size;
 char *params;
 char *path; /* the path used to open the instance */
 uint32_t phandle;
@@ -494,6 +497,8 @@ static uint32_t spapr_of_client_open(SpaprMachineState 
*spapr, const char *path)
 if (inst->dev) {
 const char *cdevstr = object_property_get_str(OBJECT(inst->dev),
   "chardev", NULL);
+const char *blkstr = object_property_get_str(OBJECT(inst->dev),
+ "drive", NULL);
 
 if (cdevstr) {
 Chardev *cdev = qemu_chr_find(cdevstr);
@@ -501,6 +506,13 @@ static uint32_t spapr_of_client_open(SpaprMachineState 
*spapr, const char *path)
 if (cdev) {
 inst->cbe = cdev->be;
 }
+} else if (blkstr) {
+BlockConf conf = { 0 };
+
+inst->blk = blk_by_name(blkstr);
+conf.blk = inst->blk;
+blkconf_blocksizes(&conf);
+inst->blk_physical_block_size = conf.physical_block_size;
 }
 }
 
@@ -602,6 +614,8 @@ static uint32_t of_client_write(SpaprMachineState *spapr, 
uint32_t ihandle,
 if (inst->cbe) {
 toprint = qemu_chr_fe_write_all(inst->cbe, (uint8_t *) tmp,
 toprint);
+} else if (inst->blk) {
+trace_spapr_of_client_blk_write(ihandle, len);
 }
 } else {
 /* We normally open stdout so this is fallback */
@@ -636,6 +650,17 @@ static uint32_t of_client_read(SpaprMachineState *spapr, 
uint32_t ihandle,
 SpaprVioDevice *sdev = VIO_SPAPR_DEVICE(inst->dev);
 
 ret = vty_getchars(sdev, buf, len); /* qemu_chr_fe_read_all? */
+} else if (inst->blk) {
+int rc = blk_pread(inst->blk, inst->blk_pos, buf, len);
+
+if (rc > 0) {
+ret = rc;
+}
+trace_spapr_of_client_blk_read(ihandle, inst->blk_pos, len,
+   ret);
+if (rc > 0) {
+

[PATCH] hw/char/pl011: Output characters using best-effort mode

2020-02-19 Thread Gavin Shan
Currently, PL011 is used by the ARM virt board by default. It's possible to
block the system from booting. With the parameters below on the command
line, the backend can run into endless attempts at transmitting packets,
which can't succeed because the send buffer is exhausted. The socket might
not be accepted on the server side. This is incorrect because a disconnected
serial port shouldn't stop the system from booting.

   -machine virt,gic-version=3 -cpu max -m 4096
   -monitor none -serial tcp:127.0.0.1:50900

The issue can be reproduced by starting a program which listens on TCP
port 50900 and then sleeps without accepting any incoming connections. A VM
is then started with the above parameters and a modified qemu where the
PL011 is flooded with 5000K of data after it's created. Eventually, the
flooding stalls after transmitting 2574K of data. This simulates the large
volume of output from EDK-II and demonstrates how it can block the system
from booting.

This fixes the issue by using the newly added API qemu_chr_fe_try_write_all(),
which provides another type of service (best-effort). It differs from
qemu_chr_fe_write_all() in that the data is dropped if the backend has
run into the so-called broken state, that is, after more than 50 failed
transmission attempts. The broken state is cleared as soon as a write
succeeds.
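
The retry policy described above can be sketched roughly as follows. This is
a standalone illustration, not the QEMU code: struct sink, write_best_effort
and the two demo backends are hypothetical names. Unlike the patch, it
returns the number of bytes written and does not sleep between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_RETRIES 50

typedef int (*write_fn)(void *opaque, const unsigned char *buf, int len);

struct sink {
    write_fn write;
    void *opaque;
    int retries;    /* persists across calls, like a "broken" marker */
};

/* Write all of buf, but give up once the sink has failed too often. */
static int write_best_effort(struct sink *s, const unsigned char *buf, int len)
{
    int offset = 0;

    while (offset < len) {
        int res = s->write(s->opaque, buf + offset, len - offset);

        if (res < 0 && errno == EAGAIN) {
            if (s->retries > MAX_RETRIES) {
                break;          /* broken state: drop the rest */
            }
            s->retries++;       /* real code would sleep briefly here */
            continue;
        }
        if (res <= 0) {
            break;              /* hard error or nothing written */
        }
        s->retries = 0;         /* any progress clears the broken state */
        offset += res;
    }
    return offset;
}

/* Demo backend that never accepts data and counts calls via opaque. */
static int stuck_backend(void *opaque, const unsigned char *buf, int len)
{
    (void)buf;
    (void)len;
    ++*(int *)opaque;
    errno = EAGAIN;
    return -1;
}

/* Demo backend that accepts everything. */
static int ok_backend(void *opaque, const unsigned char *buf, int len)
{
    (void)opaque;
    (void)buf;
    return len;
}
```

Because the counter is kept in the sink rather than on the stack, a sink
that is already broken costs only one failed attempt per later write, which
is what keeps a flooded guest from stalling.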

Signed-off-by: Gavin Shan 
---
 chardev/char-fe.c | 15 +--
 chardev/char.c| 20 ++--
 hw/char/pl011.c   |  5 +
 include/chardev/char-fe.h | 14 ++
 include/chardev/char.h|  6 --
 5 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index f3530a90e6..6558fcfb94 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -39,7 +39,7 @@ int qemu_chr_fe_write(CharBackend *be, const uint8_t *buf, 
int len)
 return 0;
 }
 
-return qemu_chr_write(s, buf, len, false);
+return qemu_chr_write(s, buf, len, false, false);
 }
 
 int qemu_chr_fe_write_all(CharBackend *be, const uint8_t *buf, int len)
@@ -50,7 +50,18 @@ int qemu_chr_fe_write_all(CharBackend *be, const uint8_t 
*buf, int len)
 return 0;
 }
 
-return qemu_chr_write(s, buf, len, true);
+return qemu_chr_write(s, buf, len, true, false);
+}
+
+int qemu_chr_fe_try_write_all(CharBackend *be, const uint8_t *buf, int len)
+{
+Chardev *s = be->chr;
+
+if (!s) {
+return 0;
+}
+
+return qemu_chr_write(s, buf, len, true, true);
 }
 
 int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
diff --git a/chardev/char.c b/chardev/char.c
index 87237568df..cd17fac123 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -106,9 +106,8 @@ static void qemu_chr_write_log(Chardev *s, const uint8_t 
*buf, size_t len)
 }
 }
 
-static int qemu_chr_write_buffer(Chardev *s,
- const uint8_t *buf, int len,
- int *offset, bool write_all)
+static int qemu_chr_write_buffer(Chardev *s, const uint8_t *buf, int len,
+ int *offset, bool write_all, bool best_effort)
 {
 ChardevClass *cc = CHARDEV_GET_CLASS(s);
 int res = 0;
@@ -119,7 +118,14 @@ static int qemu_chr_write_buffer(Chardev *s,
 retry:
 res = cc->chr_write(s, buf + *offset, len - *offset);
 if (res < 0 && errno == EAGAIN && write_all) {
+if (best_effort && s->retries > 50) {
+break;
+}
+
 g_usleep(100);
+if (best_effort) {
+s->retries++;
+}
 goto retry;
 }
 
@@ -127,6 +133,7 @@ static int qemu_chr_write_buffer(Chardev *s,
 break;
 }
 
+s->retries = 0;
 *offset += res;
 if (!write_all) {
 break;
@@ -140,7 +147,8 @@ static int qemu_chr_write_buffer(Chardev *s,
 return res;
 }
 
-int qemu_chr_write(Chardev *s, const uint8_t *buf, int len, bool write_all)
+int qemu_chr_write(Chardev *s, const uint8_t *buf, int len,
+   bool write_all, bool best_effort)
 {
 int offset = 0;
 int res;
@@ -148,11 +156,11 @@ int qemu_chr_write(Chardev *s, const uint8_t *buf, int 
len, bool write_all)
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
 replay_char_write_event_load(&res, &offset);
 assert(offset <= len);
-qemu_chr_write_buffer(s, buf, offset, &offset, true);
+qemu_chr_write_buffer(s, buf, offset, &offset, true, false);
 return res;
 }
 
-res = qemu_chr_write_buffer(s, buf, len, &offset, write_all);
+res = qemu_chr_write_buffer(s, buf, len, &offset, write_all, best_effort);
 
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
 replay_char_write_event_save(res, offset);
diff --git a/hw/char/pl011.c b/hw/char/pl011.c
index 13e784f9d9..348188f49e 100644
--- a/hw/char/pl011.c
+++ b/hw/char/pl011.c
@@ -179,11 +179,8 @@ s

Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

2020-02-19 Thread Howard Spoelstra
On Wed, Feb 19, 2020 at 8:28 PM BALATON Zoltan  wrote:

> On Wed, 19 Feb 2020, Howard Spoelstra wrote:
> > I tested with the current ppc-for-5.0 branch and with v1 of the hardfloat
> > patches applied on top of that. There is a noticeable speed improvement
> in
> > Linux and OSX hosts. Windows 10 host doesn't seem to be impressed at
> all. I
> > saw no obvious glitches so far. The fpu performance on OSX hosts seems
> very
> > slow. This was not always the case in the past, when it was on par with
> > Linux performance.
>
> Interesting, thanks for the measurements.
>
> > Below are my results.
> >
> > Best,
> > Howard
> >
> > Host Linux (Fedora 31):
> > Mac OS tests: 9.2 with MacBench 5.0
> > Baseline(100%): G3 300Mhz
> > 5.0 branch + hardfloat patches: cpu 193%, fpu 86%
> > 5.0 branch: cpu 188%, fpu 57%
>
> Here there's a difference in cpu value before and after patch which I
> can't explain (only changed FPU stuff so it should not change others) but
> also not seen in other measurements so this could be some external
> influence such as something else using CPU while running test? Unless this
> happens consistently I'd put it down to measurement error.
>

  Yes, I would put that cpu value down to some fluctuation in the test

>
> > Mac OSX tests: 10.5 with Skidmarks 4.0 test
> > Baseline(100%): G4 1.0Ghz.
> > 5.0 branch + hardfloat patches: Int:131 FP:11 Vec:15
> > 5.0 branch: Int:131 FP:9 Vec:11
> >
> > Host OSX Sierra:
> > Mac OS tests: 9.2 with MacBench 5.0
> > Baseline(100%): G3 300Mhz
> > 5.0 branch + hardfloat patches: cpu 199%, fpu 66%
> > 5.0 branch: cpu 199%, fpu 40%
> > Mac OSX tests: 10.5 with Skidmarks 4.0 test
> > Baseline(100%): G4 1.0Ghz.
> > 5.0 branch + hardfloat patches: Int:129 PF:11 Vec:14
>
> These values seem to match Linux measurement above so don't seem slower
> although MacOS9 seems to be slower (66 vs. 86) so either this depends on
> the ops used or something else.
>

 Yes, the baseline speed for the fpu in Mac OS 9.2 is relatively low.

>
> > 5.0 branch: Int:129 FP:8 Vec:9
> >
> > Host Windows 10:
> > Mac OS tests: 9.2 with MacBench 5.0
> > Baseline(100%): G3 300Mhz
> > 5.0 branch + hardfloat patches: cpu 180%, fpu 54%
>

 new run 5.0 branch + hardfloat patches: cpu 184%, fpu 54%

> 5.0 branch: cpu 199%, fpu 40%
>

 new run 5.0 branch: cpu 184%, fpu 56%

It seems I misreported (copy/paste without changing the values) the earlier
Windows-based results with Mac OS 9.2 guest. As said above (and this now
seems to confirm) Windows is not impressed at all and perhaps a bit slower
even.
Windows builds are particularly sensitive to any other activity on the
system. Moving the qemu window drops performance considerably. Perhaps due
to SDL not running in its own thread?

>
> Here there's again difference in cpu value but the other way so maybe if
> the cause is external CPU usage then this again may be an outlying
> measurement? You could retake these two to verify if you get same numbers
> again. The fpu value does seem to improve just not as much as the others
> and it's also lower to start with. I wonder why.
>


> > Mac OSX tests: 10.5 with Skidmarks 4.0 test
> > Baseline(100%): G4 1.0Ghz.
> > 5.0 branch + hardfloat patches: Int:130 FP:9 Vec:10
> > 5.0 branch: Int:130 FP:10 Vec:11
> >
> > All tests done on the same host with v1 of the hardfloat patches
> > Intel i7-4770K at 3.50Ghz. 32Gb memory
> > All guests set to 1024x768 and "thousands" of colors.
>
> Does it mean this host machine were rebooted into these OSes or these were
> run in a VM. In case using VM, were all three running in VM or one was on
> host (I'd guess OSX host with Linux and Windows VMs).
>
> > Linux and OSX (with brew) use default compilers.
> > Windows build cross-compiled from Fedora with x86_64-win64-mingw32
>
> I assume Linux and OSX were 64 bit builds, is Windows 32 bit or 64 bit
> exe?
>

No virtualisation. I run all on the same bare metal, so booted into these
three separately from separate SSDs. You might guess OSX Sierra is running
on less "official" hardware and you would be right. All qemu builds were 64
bit.

Best,
Howard


[no subject]

2020-02-19 Thread Wayne Li
Dear QEMU list members,

This will kind of be a repost but I'd like to post my question again
because I've gained some more knowledge that makes me feel that my question
would be easier to answer.  So we developed a custom-made QEMU VM that
emulates a custom machine that has an e5500 processor.  I'm running this VM
on a T4240-RDB board which has an e6500 processor and I'm trying to get the
VM running with KVM enabled.  The problem I'm having is the program counter
refuses to increment at all.  It just stays at the address 0xFFFC.  On
a run without KVM enabled, the VM will also start executing at this same
address but the program counter begins to increment immediately.  I know
this is a custom QEMU VM and maybe some of the startup stuff we do could be
causing problems, but what could possibly stop the program counter from
incrementing altogether?

Also, I do have another side question.  When running with KVM enabled, I
see the kernel-level ioctl call KVM_RUN running and then returning over and
over again (by the way before the VM kinda grinds to a halt I only see QEMU
make the KVM_RUN call twice, but the kernel-level ioctl function is being
called over and over again for some reason).  And each time the KVM_RUN
call returns, the return-from-interrupt takes the VM to the address
0xFFFC.  What is the KVM_RUN ioctl call used for?  Why is it being
called over and over again?  Maybe if I understood this better I'd be able
to figure out what's stopping my program counter from incrementing.

-Thanks, Wayne Li


[PATCH v10 20/22] fuzz: add virtio-net fuzz target

2020-02-19 Thread Alexander Bulekov
The virtio-net fuzz target feeds inputs to all three virtio-net
virtqueues, and uses forking to avoid leaking state between fuzz runs.
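
As a rough illustration of how a flat fuzzer input can be split into
per-virtqueue actions, here is a standalone sketch of the decoding step. It
mirrors the header-plus-payload layout used in the patch below (a 5-byte
header, then up to "length" payload bytes), but decode_actions, action_cb and
sum_payload are hypothetical names, and the callback stands in for the qtest
calls that place the buffer on a virtqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header preceding each payload in the fuzzer input. */
struct action {
    uint8_t queue;    /* reduced modulo 3: rx, tx or ctrl */
    uint8_t length;   /* payload bytes that follow the header */
    uint8_t write;
    uint8_t next;
    uint8_t rx;
};

typedef void (*action_cb)(const struct action *a, const uint8_t *payload,
                          void *opaque);

/* Walk the input, invoke cb per action, return the number of actions. */
static size_t decode_actions(const uint8_t *data, size_t size,
                             action_cb cb, void *opaque)
{
    struct action a;
    size_t n = 0;

    while (size >= sizeof(a)) {
        memcpy(&a, data, sizeof(a));
        data += sizeof(a);
        size -= sizeof(a);

        a.queue %= 3;
        if (a.length > size) {
            a.length = (uint8_t)size;   /* clamp to what is left */
        }
        if (cb) {
            cb(&a, data, opaque);
        }
        data += a.length;
        size -= a.length;
        n++;
    }
    return n;
}

/* Example callback: add up the payload bytes that were consumed. */
static void sum_payload(const struct action *a, const uint8_t *payload,
                        void *opaque)
{
    (void)payload;
    *(size_t *)opaque += a->length;
}
```

Clamping the length instead of rejecting the input keeps every byte string
decodable, which tends to suit coverage-guided fuzzers.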

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qtest/fuzz/Makefile.include  |   1 +
 tests/qtest/fuzz/virtio_net_fuzz.c | 198 +
 2 files changed, 199 insertions(+)
 create mode 100644 tests/qtest/fuzz/virtio_net_fuzz.c

diff --git a/tests/qtest/fuzz/Makefile.include 
b/tests/qtest/fuzz/Makefile.include
index 38b8cdd9f1..77385777ef 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -8,6 +8,7 @@ fuzz-obj-y += tests/qtest/fuzz/qos_fuzz.o
 
 # Targets
 fuzz-obj-y += tests/qtest/fuzz/i440fx_fuzz.o
+fuzz-obj-y += tests/qtest/fuzz/virtio_net_fuzz.o
 
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
 
diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c 
b/tests/qtest/fuzz/virtio_net_fuzz.c
new file mode 100644
index 00..d08a47e278
--- /dev/null
+++ b/tests/qtest/fuzz/virtio_net_fuzz.c
@@ -0,0 +1,198 @@
+/*
+ * virtio-net Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "standard-headers/linux/virtio_config.h"
+#include "tests/qtest/libqtest.h"
+#include "tests/qtest/libqos/virtio-net.h"
+#include "fuzz.h"
+#include "fork_fuzz.h"
+#include "qos_fuzz.h"
+
+
+#define QVIRTIO_NET_TIMEOUT_US (30 * 1000 * 1000)
+#define QVIRTIO_RX_VQ 0
+#define QVIRTIO_TX_VQ 1
+#define QVIRTIO_CTRL_VQ 2
+
+static int sockfds[2];
+static bool sockfds_initialized;
+
+static void virtio_net_fuzz_multi(QTestState *s,
+const unsigned char *Data, size_t Size, bool check_used)
+{
+typedef struct vq_action {
+uint8_t queue;
+uint8_t length;
+uint8_t write;
+uint8_t next;
+uint8_t rx;
+} vq_action;
+
+uint32_t free_head = 0;
+
+QGuestAllocator *t_alloc = fuzz_qos_alloc;
+
+QVirtioNet *net_if = fuzz_qos_obj;
+QVirtioDevice *dev = net_if->vdev;
+QVirtQueue *q;
+vq_action vqa;
+while (Size >= sizeof(vqa)) {
+memcpy(&vqa, Data, sizeof(vqa));
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+q = net_if->queues[vqa.queue % 3];
+
+vqa.length = vqa.length >= Size ? Size :  vqa.length;
+
+/*
+ * Only attempt to write incoming packets, when using the socket
+ * backend. Otherwise, always place the input on a virtqueue.
+ */
+if (vqa.rx && sockfds_initialized) {
+write(sockfds[0], Data, vqa.length);
+} else {
+vqa.rx = 0;
+uint64_t req_addr = guest_alloc(t_alloc, vqa.length);
+/*
+ * If checking used ring, ensure that the fuzzer doesn't trigger
+ * trivial assertion failure on zero-sized buffer
+ */
+qtest_memwrite(s, req_addr, Data, vqa.length);
+
+
+free_head = qvirtqueue_add(s, q, req_addr, vqa.length,
+vqa.write, vqa.next);
+qvirtqueue_add(s, q, req_addr, vqa.length, vqa.write , vqa.next);
+qvirtqueue_kick(s, dev, q, free_head);
+}
+
+/* Run the main loop */
+qtest_clock_step(s, 100);
+flush_events(s);
+
+/* Wait on used descriptors */
+if (check_used && !vqa.rx) {
+gint64 start_time = g_get_monotonic_time();
+/*
+ * normally, we could just use qvirtio_wait_used_elem, but since we
+ * must manually run the main-loop for all the bhs to run, we use
+ * this hack with flush_events(), to run the main_loop
+ */
+while (!vqa.rx && q != net_if->queues[QVIRTIO_RX_VQ]) {
+uint32_t got_desc_idx;
+/* Input led to a virtio_error */
+if (dev->bus->get_status(dev) & VIRTIO_CONFIG_S_NEEDS_RESET) {
+break;
+}
+if (dev->bus->get_queue_isr_status(dev, q) &&
+qvirtqueue_get_buf(s, q, &got_desc_idx, NULL)) {
+g_assert_cmpint(got_desc_idx, ==, free_head);
+break;
+}
+g_assert(g_get_monotonic_time() - start_time
+<= QVIRTIO_NET_TIMEOUT_US);
+
+/* Run the main loop */
+qtest_clock_step(s, 100);
+flush_events(s);
+}
+}
+Data += vqa.length;
+Size -= vqa.length;
+}
+}
+
+static void virtio_net_fork_fuzz(QTestState *s,
+const unsigned char *Data, size_t Size)
+{
+if (fork() == 0) {
+virtio_net_fuzz_multi(s, Data, Size, false);
+flush_events(s);
+_Exit(0);
+} else {
+wait(NULL);
+}
+}
+
+static void virtio_net_fork_fuzz_

[PATCH v10 17/22] fuzz: add target/fuzz makefile rules

2020-02-19 Thread Alexander Bulekov
Signed-off-by: Alexander Bulekov 
Reviewed-by: Darren Kenny 
Reviewed-by: Stefan Hajnoczi 
---
 Makefile| 15 ++-
 Makefile.target | 16 
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f0e1a2fc1d..36ca26f0f5 100644
--- a/Makefile
+++ b/Makefile
@@ -477,7 +477,7 @@ config-host.h-timestamp: config-host.mak
 qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > 
$@,"GEN","$@")
 
-TARGET_DIRS_RULES := $(foreach t, all clean install, $(addsuffix /$(t), 
$(TARGET_DIRS)))
+TARGET_DIRS_RULES := $(foreach t, all fuzz clean install, $(addsuffix /$(t), 
$(TARGET_DIRS)))
 
 SOFTMMU_ALL_RULES=$(filter %-softmmu/all, $(TARGET_DIRS_RULES))
 $(SOFTMMU_ALL_RULES): $(authz-obj-y)
@@ -490,6 +490,15 @@ ifdef DECOMPRESS_EDK2_BLOBS
 $(SOFTMMU_ALL_RULES): $(edk2-decompressed)
 endif
 
+SOFTMMU_FUZZ_RULES=$(filter %-softmmu/fuzz, $(TARGET_DIRS_RULES))
+$(SOFTMMU_FUZZ_RULES): $(authz-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(block-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(chardev-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(crypto-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(io-obj-y)
+$(SOFTMMU_FUZZ_RULES): config-all-devices.mak
+$(SOFTMMU_FUZZ_RULES): $(edk2-decompressed)
+
 .PHONY: $(TARGET_DIRS_RULES)
 # The $(TARGET_DIRS_RULES) are of the form SUBDIR/GOAL, so that
 # $(dir $@) yields the sub-directory, and $(notdir $@) yields the sub-goal
@@ -540,6 +549,9 @@ subdir-slirp: slirp/all
 $(filter %/all, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
$(qom-obj-y)
 
+$(filter %/fuzz, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
+   $(qom-obj-y) $(crypto-user-obj-$(CONFIG_USER_ONLY))
+
 ROM_DIRS = $(addprefix pc-bios/, $(ROMS))
 ROM_DIRS_RULES=$(foreach t, all clean, $(addsuffix /$(t), $(ROM_DIRS)))
 # Only keep -O and -g cflags
@@ -549,6 +561,7 @@ $(ROM_DIRS_RULES):
 
 .PHONY: recurse-all recurse-clean recurse-install
 recurse-all: $(addsuffix /all, $(TARGET_DIRS) $(ROM_DIRS))
+recurse-fuzz: $(addsuffix /fuzz, $(TARGET_DIRS) $(ROM_DIRS))
 recurse-clean: $(addsuffix /clean, $(TARGET_DIRS) $(ROM_DIRS))
 recurse-install: $(addsuffix /install, $(TARGET_DIRS))
 $(addsuffix /install, $(TARGET_DIRS)): all
diff --git a/Makefile.target b/Makefile.target
index 6f4dd72022..2d43dc586a 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -228,6 +228,22 @@ ifdef CONFIG_TRACE_SYSTEMTAP
rm -f *.stp
 endif
 
+ifdef CONFIG_FUZZ
+include $(SRC_PATH)/tests/qtest/fuzz/Makefile.include
+include $(SRC_PATH)/tests/qtest/Makefile.include
+
+fuzz: fuzz-vars
+fuzz-vars: QEMU_CFLAGS := $(FUZZ_CFLAGS) $(QEMU_CFLAGS)
+fuzz-vars: QEMU_LDFLAGS := $(FUZZ_LDFLAGS) $(QEMU_LDFLAGS)
+fuzz-vars: $(QEMU_PROG_FUZZ)
+dummy := $(call unnest-vars,, fuzz-obj-y)
+
+
+$(QEMU_PROG_FUZZ): config-devices.mak $(all-obj-y) $(COMMON_LDADDS) 
$(fuzz-obj-y)
+   $(call LINK, $(filter-out %.mak, $^))
+
+endif
+
 install: all
 ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
-- 
2.25.0




[PATCH v10 11/22] libqos: move useful qos-test funcs to qos_external

2020-02-19 Thread Alexander Bulekov
The moved functions are not specific to qos-test and might be useful
elsewhere. For example the virtual-device fuzzer makes use of them for
qos-assisted fuzz-targets.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Darren Kenny 
---
 tests/qtest/Makefile.include  |   1 +
 tests/qtest/libqos/qos_external.c | 168 ++
 tests/qtest/libqos/qos_external.h |  28 +
 tests/qtest/qos-test.c| 132 +--
 4 files changed, 198 insertions(+), 131 deletions(-)
 create mode 100644 tests/qtest/libqos/qos_external.c
 create mode 100644 tests/qtest/libqos/qos_external.h

diff --git a/tests/qtest/Makefile.include b/tests/qtest/Makefile.include
index 838618e6f9..e769c1ad70 100644
--- a/tests/qtest/Makefile.include
+++ b/tests/qtest/Makefile.include
@@ -172,6 +172,7 @@ libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) 
tests/qtest/libqos/u
 # qos devices:
 libqos-obj-y =  $(libqgraph-obj-y)
 libqos-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+libqos-obj-y += tests/qtest/libqos/qos_external.o
 libqos-obj-y += tests/qtest/libqos/e1000e.o
 libqos-obj-y += tests/qtest/libqos/i2c.o
 libqos-obj-y += tests/qtest/libqos/i2c-imx.o
diff --git a/tests/qtest/libqos/qos_external.c 
b/tests/qtest/libqos/qos_external.c
new file mode 100644
index 00..398556dde0
--- /dev/null
+++ b/tests/qtest/libqos/qos_external.c
@@ -0,0 +1,168 @@
+/*
+ * libqos driver framework
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qbool.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/module.h"
+#include "qapi/qmp/qlist.h"
+#include "libqos/malloc.h"
+#include "libqos/qgraph.h"
+#include "libqos/qgraph_internal.h"
+#include "libqos/qos_external.h"
+
+
+
+void apply_to_node(const char *name, bool is_machine, bool is_abstract)
+{
+char *machine_name = NULL;
+if (is_machine) {
+const char *arch = qtest_get_arch();
+machine_name = g_strconcat(arch, "/", name, NULL);
+name = machine_name;
+}
+qos_graph_node_set_availability(name, true);
+if (is_abstract) {
+qos_delete_cmd_line(name);
+}
+g_free(machine_name);
+}
+
+/**
+ * apply_to_qlist(): using QMP queries QEMU for a list of
+ * machines and devices available, and sets the respective node
+ * as true. If a node is found, also all its produced and contained
+ * child are marked available.
+ *
+ * See qos_graph_node_set_availability() for more info
+ */
+void apply_to_qlist(QList *list, bool is_machine)
+{
+const QListEntry *p;
+const char *name;
+bool abstract;
+QDict *minfo;
+QObject *qobj;
+QString *qstr;
+QBool *qbool;
+
+for (p = qlist_first(list); p; p = qlist_next(p)) {
+minfo = qobject_to(QDict, qlist_entry_obj(p));
+qobj = qdict_get(minfo, "name");
+qstr = qobject_to(QString, qobj);
+name = qstring_get_str(qstr);
+
+qobj = qdict_get(minfo, "abstract");
+if (qobj) {
+qbool = qobject_to(QBool, qobj);
+abstract = qbool_get_bool(qbool);
+} else {
+abstract = false;
+}
+
+apply_to_node(name, is_machine, abstract);
+qobj = qdict_get(minfo, "alias");
+if (qobj) {
+qstr = qobject_to(QString, qobj);
+name = qstring_get_str(qstr);
+apply_to_node(name, is_machine, abstract);
+}
+}
+}
+
+QGuestAllocator *get_machine_allocator(QOSGraphObject *obj)
+{
+return obj->get_driver(obj, "memory");
+}
+
+/**
+ * allocate_objects(): given an array of nodes @arg,
+ * walks the path invoking all constructors and
+ * passing the corresponding parameter in order to
+ * continue the objects allocation.
+ * Once the test is reached, return the object it consumes.
+ *
+ * Since the machine and QEDGE_CONSUMED_BY nodes allocate
+ * memory in the constructor, g_test_queue_destroy is used so
+ * that after execution they can be safely free'd.  (The test's
+ * ->before callback is also welcome to use g_test_queue_destroy).
+ *
+ * Note: as specified in walk_path() too, @arg is an array of
+ * char *, where arg[0] is a pointer to the command line
+ * string that will be used to properly start QEMU when exec

[PATCH v10 12/22] fuzz: add fuzzer skeleton

2020-02-19 Thread Alexander Bulekov
tests/qtest/fuzz/fuzz.c serves as the entry point for the virtual-device
fuzzer. Namely, libfuzzer invokes the LLVMFuzzerInitialize and
LLVMFuzzerTestOneInput functions, both of which are defined in this
file. This change adds a "FuzzTarget" struct, along with the
fuzz_add_target function, which should be used to define new fuzz
targets.
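
The registration pattern described above can be sketched in isolation as
follows. This is only an illustration under stated assumptions: DemoTarget,
demo_add_target and demo_get_target are hypothetical stand-ins, not the
FuzzTarget API from fuzz.h, and the sketch returns -1 on a duplicate name
where the patch calls abort().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the FuzzTarget struct. */
typedef struct DemoTarget {
    const char *name;
    const char *description;
} DemoTarget;

/* One list node per registered target; holds a private copy. */
typedef struct DemoNode {
    DemoTarget target;
    struct DemoNode *next;
} DemoNode;

static DemoNode *demo_targets;

/* Register a target. Returns 0 on success, -1 on duplicate name or OOM. */
int demo_add_target(const DemoTarget *t)
{
    DemoNode *n;

    assert(t != NULL && t->name != NULL);
    for (n = demo_targets; n != NULL; n = n->next) {
        if (strcmp(n->target.name, t->name) == 0) {
            return -1; /* name already in use */
        }
    }
    n = calloc(1, sizeof(*n));
    if (n == NULL) {
        return -1;
    }
    n->target = *t;          /* copy, so the caller's struct may go away */
    n->next = demo_targets;  /* insert at the head of the list */
    demo_targets = n;
    return 0;
}

/* Look a target up by name. Returns NULL if it was never registered. */
const DemoTarget *demo_get_target(const char *name)
{
    DemoNode *n;

    for (n = demo_targets; n != NULL; n = n->next) {
        if (strcmp(n->target.name, name) == 0) {
            return &n->target;
        }
    }
    return NULL;
}
```

Copying the struct into the list node means a target can be described by a
short-lived local at registration time, which is the same choice the patch
makes with *(target_state->target) = *target.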

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 MAINTAINERS   |   8 ++
 tests/qtest/fuzz/Makefile.include |   6 +
 tests/qtest/fuzz/fuzz.c   | 179 ++
 tests/qtest/fuzz/fuzz.h   |  95 
 4 files changed, 288 insertions(+)
 create mode 100644 tests/qtest/fuzz/Makefile.include
 create mode 100644 tests/qtest/fuzz/fuzz.c
 create mode 100644 tests/qtest/fuzz/fuzz.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a8e2a5f8c7..ee2300b44c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,14 @@ F: qtest.c
 F: accel/qtest.c
 F: tests/qtest/
 
+Device Fuzzing
+M: Alexander Bulekov 
+R: Paolo Bonzini 
+R: Bandan Das 
+R: Stefan Hajnoczi 
+S: Maintained
+F: tests/qtest/fuzz/
+
 Register API
 M: Alistair Francis 
 S: Maintained
diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
new file mode 100644
index 00..8632bb89f4
--- /dev/null
+++ b/tests/qtest/fuzz/Makefile.include
@@ -0,0 +1,6 @@
+QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
+
+fuzz-obj-y += tests/qtest/libqtest.o
+fuzz-obj-y += tests/qtest/fuzz/fuzz.o # Fuzzer skeleton
+
+FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
diff --git a/tests/qtest/fuzz/fuzz.c b/tests/qtest/fuzz/fuzz.c
new file mode 100644
index 00..0d78ac8d36
--- /dev/null
+++ b/tests/qtest/fuzz/fuzz.c
@@ -0,0 +1,179 @@
+/*
+ * fuzzing driver
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include 
+
+#include "sysemu/qtest.h"
+#include "sysemu/runstate.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+#include "tests/qtest/libqtest.h"
+#include "tests/qtest/libqos/qgraph.h"
+#include "fuzz.h"
+
+#define MAX_EVENT_LOOPS 10
+
+typedef struct FuzzTargetState {
+FuzzTarget *target;
+QSLIST_ENTRY(FuzzTargetState) target_list;
+} FuzzTargetState;
+
+typedef QSLIST_HEAD(, FuzzTargetState) FuzzTargetList;
+
+static const char *fuzz_arch = TARGET_NAME;
+
+static FuzzTargetList *fuzz_target_list;
+static FuzzTarget *fuzz_target;
+static QTestState *fuzz_qts;
+
+
+
+void flush_events(QTestState *s)
+{
+int i = MAX_EVENT_LOOPS;
+while (g_main_context_pending(NULL) && i-- > 0) {
+main_loop_wait(false);
+}
+}
+
+static QTestState *qtest_setup(void)
+{
+qtest_server_set_send_handler(&qtest_client_inproc_recv, &fuzz_qts);
+return qtest_inproc_init(&fuzz_qts, false, fuzz_arch,
+&qtest_server_inproc_recv);
+}
+
+void fuzz_add_target(const FuzzTarget *target)
+{
+FuzzTargetState *tmp;
+FuzzTargetState *target_state;
+if (!fuzz_target_list) {
+fuzz_target_list = g_new0(FuzzTargetList, 1);
+}
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (g_strcmp0(tmp->target->name, target->name) == 0) {
+fprintf(stderr, "Error: Fuzz target name %s already in use\n",
+target->name);
+abort();
+}
+}
+target_state = g_new0(FuzzTargetState, 1);
+target_state->target = g_new0(FuzzTarget, 1);
+*(target_state->target) = *target;
+QSLIST_INSERT_HEAD(fuzz_target_list, target_state, target_list);
+}
+
+
+
+static void usage(char *path)
+{
+printf("Usage: %s --fuzz-target=FUZZ_TARGET [LIBFUZZER ARGUMENTS]\n", path);
+printf("where FUZZ_TARGET is one of:\n");
+FuzzTargetState *tmp;
+if (!fuzz_target_list) {
+fprintf(stderr, "Fuzz target list not initialized\n");
+abort();
+}
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+printf(" * %s  : %s\n", tmp->target->name,
+tmp->target->description);
+}
+exit(0);
+}
+
+static FuzzTarget *fuzz_get_target(char* name)
+{
+FuzzTargetState *tmp;
+if (!fuzz_target_list) {
+fprintf(stderr, "Fuzz target list not initialized\n");
+abort();
+}
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (strcmp(tmp->target->name, name) == 0) {
+return tmp->target;
+}
+}
+return NULL;
+}
+
+
+/* Executed for each fuzzing-input */
+int LLVMFuzzerTestOneInput(const unsigned char *Data, size_t Size)
+{
+/*
+ * Do the pre-fuzz-initialization before the first fuzzing iteration,
+ * instead of before the actual fuzz loop. This is needed since libfuzzer
+ * may fork off additional workers, prior to the fuzzing loop, and if
+ * pre_fuzz() sets u

[PATCH v10 09/22] libqos: rename i2c_send and i2c_recv

2020-02-19 Thread Alexander Bulekov
The names i2c_send and i2c_recv collide with functions defined in
hw/i2c/core.c. This causes an error when linking against libqos and
softmmu simultaneously (for example when using qtest inproc). Rename the
libqos functions to avoid this.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
Acked-by: Thomas Huth 
---
 tests/qtest/libqos/i2c.c   | 10 +-
 tests/qtest/libqos/i2c.h   |  4 ++--
 tests/qtest/pca9552-test.c | 10 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qtest/libqos/i2c.c b/tests/qtest/libqos/i2c.c
index 156114e745..38f800dbab 100644
--- a/tests/qtest/libqos/i2c.c
+++ b/tests/qtest/libqos/i2c.c
@@ -10,12 +10,12 @@
 #include "libqos/i2c.h"
 #include "libqtest.h"
 
-void i2c_send(QI2CDevice *i2cdev, const uint8_t *buf, uint16_t len)
+void qi2c_send(QI2CDevice *i2cdev, const uint8_t *buf, uint16_t len)
 {
 i2cdev->bus->send(i2cdev->bus, i2cdev->addr, buf, len);
 }
 
-void i2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
+void qi2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
 {
 i2cdev->bus->recv(i2cdev->bus, i2cdev->addr, buf, len);
 }
@@ -23,8 +23,8 @@ void i2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
 void i2c_read_block(QI2CDevice *i2cdev, uint8_t reg,
 uint8_t *buf, uint16_t len)
 {
-i2c_send(i2cdev, &reg, 1);
-i2c_recv(i2cdev, buf, len);
+qi2c_send(i2cdev, &reg, 1);
+qi2c_recv(i2cdev, buf, len);
 }
 
 void i2c_write_block(QI2CDevice *i2cdev, uint8_t reg,
@@ -33,7 +33,7 @@ void i2c_write_block(QI2CDevice *i2cdev, uint8_t reg,
 uint8_t *cmd = g_malloc(len + 1);
 cmd[0] = reg;
 memcpy(&cmd[1], buf, len);
-i2c_send(i2cdev, cmd, len + 1);
+qi2c_send(i2cdev, cmd, len + 1);
 g_free(cmd);
 }
 
diff --git a/tests/qtest/libqos/i2c.h b/tests/qtest/libqos/i2c.h
index 945b65b34c..c65f087834 100644
--- a/tests/qtest/libqos/i2c.h
+++ b/tests/qtest/libqos/i2c.h
@@ -47,8 +47,8 @@ struct QI2CDevice {
 void *i2c_device_create(void *i2c_bus, QGuestAllocator *alloc, void *addr);
 void add_qi2c_address(QOSGraphEdgeOptions *opts, QI2CAddress *addr);
 
-void i2c_send(QI2CDevice *dev, const uint8_t *buf, uint16_t len);
-void i2c_recv(QI2CDevice *dev, uint8_t *buf, uint16_t len);
+void qi2c_send(QI2CDevice *dev, const uint8_t *buf, uint16_t len);
+void qi2c_recv(QI2CDevice *dev, uint8_t *buf, uint16_t len);
 
 void i2c_read_block(QI2CDevice *dev, uint8_t reg,
 uint8_t *buf, uint16_t len);
diff --git a/tests/qtest/pca9552-test.c b/tests/qtest/pca9552-test.c
index 4b800d3c3e..d80ed93cd3 100644
--- a/tests/qtest/pca9552-test.c
+++ b/tests/qtest/pca9552-test.c
@@ -32,22 +32,22 @@ static void receive_autoinc(void *obj, void *data, QGuestAllocator *alloc)
 
 pca9552_init(i2cdev);
 
-i2c_send(i2cdev, &reg, 1);
+qi2c_send(i2cdev, &reg, 1);
 
 /* PCA9552_LS0 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x54);
 
 /* PCA9552_LS1 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x55);
 
 /* PCA9552_LS2 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x55);
 
 /* PCA9552_LS3 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x54);
 }
 
-- 
2.25.0




[PATCH v10 22/22] fuzz: add documentation to docs/devel/

2020-02-19 Thread Alexander Bulekov
Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 docs/devel/fuzzing.txt | 116 +
 1 file changed, 116 insertions(+)
 create mode 100644 docs/devel/fuzzing.txt

diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt
new file mode 100644
index 00..324d2cd92b
--- /dev/null
+++ b/docs/devel/fuzzing.txt
@@ -0,0 +1,116 @@
+= Fuzzing =
+
+== Introduction ==
+
+This document describes the virtual-device fuzzing infrastructure in QEMU and
+how to use it to implement additional fuzzers.
+
+== Basics ==
+
+Fuzzing operates by passing inputs to an entry point/target function. The
+fuzzer tracks the code coverage triggered by the input. Based on these
+findings, the fuzzer mutates the input and repeats the fuzzing.
+
+To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer
+is an _in-process_ fuzzer. For the developer, this means that it is their
+responsibility to ensure that state is reset between fuzzing-runs.
+
+== Building the fuzzers ==
+
+NOTE: If possible, build a 32-bit binary. When forking, the 32-bit fuzzer is
+much faster, since the page-map has a smaller size. This is due to the fact that
+AddressSanitizer mmaps ~20TB of memory, as part of its detection. This results
+in a large page-map, and a much slower fork().
+
+To build the fuzzers, install a recent version of clang.
+Configure with (substitute the clang binaries with the version you installed):
+
+CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing
+
+Fuzz targets are built similarly to system/softmmu:
+
+make i386-softmmu/fuzz
+
+This builds ./i386-softmmu/qemu-fuzz-i386
+
+The first option to this command is: --fuzz-target=FUZZ_NAME
+To list all of the available fuzzers run qemu-fuzz-i386 with no arguments.
+
+eg:
+./i386-softmmu/qemu-fuzz-i386 --fuzz-target=virtio-net-fork-fuzz
+
+Internally, libfuzzer parses all arguments that do not begin with "--".
+Information about these is available by passing -help=1
+
+Now the only thing left to do is wait for the fuzzer to trigger potential
+crashes.
+
+== Adding a new fuzzer ==
+Coverage over virtual devices can be improved by adding additional fuzzers.
+Fuzzers are kept in tests/qtest/fuzz/ and should be added to
+tests/qtest/fuzz/Makefile.include
+
+Fuzzers can rely on both qtest and libqos to communicate with virtual devices.
+
+1. Create a new source file. For example ``tests/qtest/fuzz/foo-device-fuzz.c``.
+
+2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers
+for reference.
+
+3. Register the fuzzer in ``tests/qtest/fuzz/Makefile.include`` by appending the
+corresponding object to fuzz-obj-y
+
+Fuzzers can be more-or-less thought of as special qtest programs which can
+modify the qtest commands and/or qtest command arguments based on inputs
+provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the
+fuzzer loops over the byte-array interpreting it as a list of qtest commands,
+addresses, or values.
+
+= Implementation Details =
+
+== The Fuzzer's Lifecycle ==
+
+The fuzzer has two entrypoints that libfuzzer calls. libfuzzer provides its
+own main(), which performs some setup, and calls the entrypoints:
+
+LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
+necessary state
+
+LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
+resets the state at the end of each run.
+
+In more detail:
+
+LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
+dashes, so they are ignored by libfuzzer main()). Currently, the arguments
+select the fuzz target. Then, the qtest client is initialized. If the target
+requires qos, qgraph is set up and the QOM/LIBQOS modules are initialized.
+Then the QGraph is walked and the QEMU cmd_line is determined and saved.
+
+After this, vl.c:qemu_main is called to set up the guest. There are
+target-specific hooks that can be called before and after qemu_main, for
+additional setup (e.g. PCI setup, or VM snapshotting).
+
+LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz
+input. It is also responsible for manually calling the main loop/main_loop_wait
+to ensure that bottom halves are executed and that any cleanup required before
+the next input is performed.
+
+Since the same process is reused for many fuzzing runs, QEMU state needs to
+be reset at the end of each run. There are currently two implemented
+options for resetting state:
+1. Reboot the guest between runs.
+   Pros: Straightforward and fast for simple fuzz targets.
+   Cons: Depending on the device, does not reset all device state. If the
+   device requires some initialization prior to being ready for fuzzing
+   (common for QOS-based targets), this initialization needs to be done after
+   each reboot.
+   Example target: i440fx-qtest-reboot-fuzz
+2. Run each test case in a separate forked process and copy the coverage
+   information back to the pa

[PATCH v10 18/22] fuzz: add configure flag --enable-fuzzing

2020-02-19 Thread Alexander Bulekov
Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Darren Kenny 
---
 configure | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/configure b/configure
index 115dc38085..bd873177ad 100755
--- a/configure
+++ b/configure
@@ -505,6 +505,7 @@ debug_mutex="no"
 libpmem=""
 default_devices="yes"
 plugins="no"
+fuzzing="no"
 
 supported_cpu="no"
 supported_os="no"
@@ -635,6 +636,15 @@ int main(void) { return 0; }
 EOF
 }
 
+write_c_fuzzer_skeleton() {
+cat > $TMPC <<EOF
+#include 
+int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
+int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { return 0; }
+EOF
+}
+
 if check_define __linux__ ; then
   targetos="Linux"
 elif check_define _WIN32 ; then
@@ -1545,6 +1555,10 @@ for opt do
   ;;
   --disable-containers) use_containers="no"
   ;;
+  --enable-fuzzing) fuzzing=yes
+  ;;
+  --disable-fuzzing) fuzzing=no
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -6035,6 +6049,15 @@ EOF
   fi
 fi
 
+##
+# checks for fuzzer
+if test "$fuzzing" = "yes" ; then
+  write_c_fuzzer_skeleton
+  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=address,fuzzer" ""; then
+  have_fuzzer=yes
+  fi
+fi
+
 ##
 # check for libpmem
 
@@ -6621,6 +6644,7 @@ echo "libpmem support   $libpmem"
 echo "libudev   $libudev"
 echo "default devices   $default_devices"
 echo "plugin support$plugins"
+echo "fuzzing support   $fuzzing"
 
 if test "$supported_cpu" = "no"; then
 echo
@@ -7456,6 +7480,16 @@ fi
 if test "$sheepdog" = "yes" ; then
   echo "CONFIG_SHEEPDOG=y" >> $config_host_mak
 fi
+if test "$fuzzing" = "yes" ; then
+  if test "$have_fuzzer" = "yes"; then
+FUZZ_LDFLAGS=" -fsanitize=address,fuzzer"
+FUZZ_CFLAGS=" -fsanitize=address,fuzzer"
+CFLAGS=" -fsanitize=address,fuzzer-no-link"
+  else
+error_exit "Your compiler doesn't support -fsanitize=address,fuzzer"
+exit 1
+  fi
+fi
 
 if test "$plugins" = "yes" ; then
 echo "CONFIG_PLUGIN=y" >> $config_host_mak
@@ -7556,6 +7590,11 @@ if test "$libudev" != "no"; then
 echo "CONFIG_LIBUDEV=y" >> $config_host_mak
 echo "LIBUDEV_LIBS=$libudev_libs" >> $config_host_mak
 fi
+if test "$fuzzing" != "no"; then
+echo "CONFIG_FUZZ=y" >> $config_host_mak
+echo "FUZZ_CFLAGS=$FUZZ_CFLAGS" >> $config_host_mak
+echo "FUZZ_LDFLAGS=$FUZZ_LDFLAGS" >> $config_host_mak
+fi
 
 if test "$edk2_blobs" = "yes" ; then
   echo "DECOMPRESS_EDK2_BLOBS=y" >> $config_host_mak
-- 
2.25.0




[PATCH v10 08/22] qtest: add in-process incoming command handler

2020-02-19 Thread Alexander Bulekov
The handler allows a qtest client to send commands to the server by
directly calling a function, rather than using a file/CharBackend.
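
The buffering behaviour of such a handler can be sketched on its own as
follows. This is a simplified illustration, not the code in qtest.c: the
demo_* names and the fixed-size buffer are assumptions made here, and the
sketch strips the trailing newline before dispatch, which the real handler
does not necessarily do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 256

static char demo_buf[DEMO_BUF_SIZE];    /* bytes received so far */
static size_t demo_len;
char demo_last_cmd[DEMO_BUF_SIZE];      /* last complete command line */
int demo_commands_seen;

/* Hypothetical command processor: records the command it was given. */
static void demo_process(const char *cmd, size_t len)
{
    memcpy(demo_last_cmd, cmd, len);
    demo_last_cmd[len] = '\0';
    demo_commands_seen++;
}

/*
 * Called directly by the client for every chunk it "sends". Chunks are
 * accumulated until one ends in a newline; only then is the command
 * processed and the buffer reset.
 */
void demo_inproc_recv(const char *chunk)
{
    size_t n = strlen(chunk);

    assert(demo_len + n < DEMO_BUF_SIZE);
    memcpy(demo_buf + demo_len, chunk, n);
    demo_len += n;

    if (demo_len > 0 && demo_buf[demo_len - 1] == '\n') {
        demo_process(demo_buf, demo_len - 1); /* drop the newline */
        demo_len = 0;
    }
}
```

The length check before indexing the last byte matters: an empty chunk
arriving on an empty buffer would otherwise read one byte before the start
of the buffer.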

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 include/sysemu/qtest.h |  1 +
 qtest.c| 13 +
 2 files changed, 14 insertions(+)

diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index e2f1047fd7..eedd3664f0 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -28,5 +28,6 @@ void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **
 
 void qtest_server_set_send_handler(void (*send)(void *, const char *),
  void *opaque);
+void qtest_server_inproc_recv(void *opaque, const char *buf);
 
 #endif
diff --git a/qtest.c b/qtest.c
index 938c3746d6..ad6eb6a526 100644
--- a/qtest.c
+++ b/qtest.c
@@ -803,3 +803,16 @@ bool qtest_driver(void)
 {
 return qtest_chr.chr != NULL;
 }
+
+void qtest_server_inproc_recv(void *dummy, const char *buf)
+{
+static GString *gstr;
+if (!gstr) {
+gstr = g_string_new(NULL);
+}
+g_string_append(gstr, buf);
+if (gstr->str[gstr->len - 1] == '\n') {
+qtest_process_inbuf(NULL, gstr);
+g_string_truncate(gstr, 0);
+}
+}
-- 
2.25.0




[PATCH v10 21/22] fuzz: add virtio-scsi fuzz target

2020-02-19 Thread Alexander Bulekov
The virtio-scsi fuzz target sets up and fuzzes the available virtio-scsi
queues. After an element is placed on a queue, the fuzzer can select
whether to perform a kick, or continue adding elements.
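
How a flat fuzz input can be split into per-queue actions followed by
payload bytes is sketched below. It is a reduced illustration of the idea
only: DemoAction and demo_next_action are hypothetical names, and the
header has three fields rather than the five used by the vq_action struct
in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced action header; one byte per field, no padding. */
typedef struct DemoAction {
    uint8_t queue;   /* which queue to use, normalized modulo num_queues */
    uint8_t length;  /* payload bytes that follow this header */
    uint8_t kick;    /* 1: notify the device after adding the element */
} DemoAction;

/*
 * Consume one header plus its payload from the input. Returns a pointer
 * to the payload, or NULL when fewer than sizeof(DemoAction) bytes are
 * left. *data and *size are advanced past header and payload.
 */
const uint8_t *demo_next_action(const uint8_t **data, size_t *size,
                                unsigned num_queues, DemoAction *out)
{
    const uint8_t *payload;

    assert(num_queues > 0);
    if (*size < sizeof(*out)) {
        return NULL;
    }
    memcpy(out, *data, sizeof(*out));
    *data += sizeof(*out);
    *size -= sizeof(*out);

    /* Normalize the raw bytes so that every input is a valid action. */
    out->queue = (uint8_t)(out->queue % num_queues);
    if (out->length > *size) {
        out->length = (uint8_t)*size; /* cap at the remaining input */
    }
    out->kick &= 1;

    payload = *data;
    *data += out->length;
    *size -= out->length;
    return payload;
}
```

Normalizing rather than rejecting malformed headers keeps the fuzzer from
wasting mutations on inputs that would be discarded before reaching the
device.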

Signed-off-by: Alexander Bulekov 
---
 tests/qtest/fuzz/Makefile.include   |   1 +
 tests/qtest/fuzz/virtio_scsi_fuzz.c | 213 
 2 files changed, 214 insertions(+)
 create mode 100644 tests/qtest/fuzz/virtio_scsi_fuzz.c

diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
index 77385777ef..cde3e9636c 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -9,6 +9,7 @@ fuzz-obj-y += tests/qtest/fuzz/qos_fuzz.o
 # Targets
 fuzz-obj-y += tests/qtest/fuzz/i440fx_fuzz.o
 fuzz-obj-y += tests/qtest/fuzz/virtio_net_fuzz.o
+fuzz-obj-y += tests/qtest/fuzz/virtio_scsi_fuzz.o
 
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
 
diff --git a/tests/qtest/fuzz/virtio_scsi_fuzz.c b/tests/qtest/fuzz/virtio_scsi_fuzz.c
new file mode 100644
index 00..3b95247f12
--- /dev/null
+++ b/tests/qtest/fuzz/virtio_scsi_fuzz.c
@@ -0,0 +1,213 @@
+/*
+ * virtio-scsi Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "tests/qtest/libqtest.h"
+#include "libqos/virtio-scsi.h"
+#include "libqos/virtio.h"
+#include "libqos/virtio-pci.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "standard-headers/linux/virtio_pci.h"
+#include "standard-headers/linux/virtio_scsi.h"
+#include "fuzz.h"
+#include "fork_fuzz.h"
+#include "qos_fuzz.h"
+
+#define PCI_SLOT0x02
+#define PCI_FN  0x00
+#define QVIRTIO_SCSI_TIMEOUT_US (1 * 1000 * 1000)
+
+#define MAX_NUM_QUEUES 64
+
+/* Based on tests/virtio-scsi-test.c */
+typedef struct {
+int num_queues;
+QVirtQueue *vq[MAX_NUM_QUEUES + 2];
+} QVirtioSCSIQueues;
+
+static QVirtioSCSIQueues *qvirtio_scsi_init(QVirtioDevice *dev, uint64_t mask)
+{
+QVirtioSCSIQueues *vs;
+uint64_t feat;
+int i;
+
+vs = g_new0(QVirtioSCSIQueues, 1);
+
+feat = qvirtio_get_features(dev);
+if (mask) {
+feat &= ~QVIRTIO_F_BAD_FEATURE | mask;
+} else {
+feat &= ~(QVIRTIO_F_BAD_FEATURE | (1ull << VIRTIO_RING_F_EVENT_IDX));
+}
+qvirtio_set_features(dev, feat);
+
+vs->num_queues = qvirtio_config_readl(dev, 0);
+
+for (i = 0; i < vs->num_queues + 2; i++) {
+vs->vq[i] = qvirtqueue_setup(dev, fuzz_qos_alloc, i);
+}
+
+qvirtio_set_driver_ok(dev);
+
+return vs;
+}
+
+static void virtio_scsi_fuzz(QTestState *s, QVirtioSCSIQueues* queues,
+const unsigned char *Data, size_t Size)
+{
+/*
+ * Data is a sequence of random bytes. We split them up into "actions",
+ * followed by data:
+ * [vqa][][vqa][][vqa][] ...
+ * The length of the data is specified by the preceding vqa.length
+ */
+typedef struct vq_action {
+uint8_t queue;
+uint8_t length;
+uint8_t write;
+uint8_t next;
+uint8_t kick;
+} vq_action;
+
+/* Keep track of the free head for each queue we interact with */
+bool vq_touched[MAX_NUM_QUEUES + 2] = {0};
+uint32_t free_head[MAX_NUM_QUEUES + 2];
+
+QGuestAllocator *t_alloc = fuzz_qos_alloc;
+
+QVirtioSCSI *scsi = fuzz_qos_obj;
+QVirtioDevice *dev = scsi->vdev;
+QVirtQueue *q;
+vq_action vqa;
+while (Size >= sizeof(vqa)) {
+/* Copy the action, so we can normalize length, queue and flags */
+memcpy(&vqa, Data, sizeof(vqa));
+
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+vqa.queue = vqa.queue % queues->num_queues;
+/* Cap length at the number of remaining bytes in data */
+vqa.length = vqa.length >= Size ? Size : vqa.length;
+vqa.write = vqa.write & 1;
+vqa.next = vqa.next & 1;
+vqa.kick = vqa.kick & 1;
+
+
+q = queues->vq[vqa.queue];
+
+/* Copy the data into ram, and place it on the virtqueue */
+uint64_t req_addr = guest_alloc(t_alloc, vqa.length);
+qtest_memwrite(s, req_addr, Data, vqa.length);
+if (vq_touched[vqa.queue] == 0) {
+vq_touched[vqa.queue] = 1;
+free_head[vqa.queue] = qvirtqueue_add(s, q, req_addr, vqa.length,
+vqa.write, vqa.next);
+} else {
+qvirtqueue_add(s, q, req_addr, vqa.length, vqa.write , vqa.next);
+}
+
+if (vqa.kick) {
+qvirtqueue_kick(s, dev, q, free_head[vqa.queue]);
+free_head[vqa.queue] = 0;
+}
+Data += vqa.length;
+Size -= vqa.length;
+}
+/* In the end, kick each queue we interacted with */
+for (int i = 0; i < MAX_NUM_QUEUES + 2; i++) {
+if (vq_touched[i]) 

[PATCH v10 15/22] fuzz: support for fork-based fuzzing.

2020-02-19 Thread Alexander Bulekov
fork() is a simple way to ensure that state does not leak in between
fuzzing runs. Unfortunately, the fuzzer mutation engine relies on
bitmaps which contain coverage information for each fuzzing run, and
these bitmaps should be copied from the child to the parent (where the
mutation occurs). These bitmaps are created through compile-time
instrumentation and they are not shared with fork()-ed processes, by
default. To address this, we create a shared memory region, adjust its
size and map it _over_ the counter region. Furthermore, libfuzzer
doesn't generally expose the globals that specify the location of the
counters/coverage bitmap. As a workaround, we rely on a custom linker
script which forces all of the bitmaps we care about to be placed in a
contiguous region, which is easy to locate and mmap over.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qtest/fuzz/Makefile.include |  5 +++
 tests/qtest/fuzz/fork_fuzz.c  | 55 +++
 tests/qtest/fuzz/fork_fuzz.h  | 23 +
 tests/qtest/fuzz/fork_fuzz.ld | 37 +
 4 files changed, 120 insertions(+)
 create mode 100644 tests/qtest/fuzz/fork_fuzz.c
 create mode 100644 tests/qtest/fuzz/fork_fuzz.h
 create mode 100644 tests/qtest/fuzz/fork_fuzz.ld

diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
index 8632bb89f4..a90915d56d 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -2,5 +2,10 @@ QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
 
 fuzz-obj-y += tests/qtest/libqtest.o
 fuzz-obj-y += tests/qtest/fuzz/fuzz.o # Fuzzer skeleton
+fuzz-obj-y += tests/qtest/fuzz/fork_fuzz.o
 
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
+
+# Linker Script to force coverage-counters into known regions which we can mark
+# shared
+FUZZ_LDFLAGS += -Xlinker -T$(SRC_PATH)/tests/qtest/fuzz/fork_fuzz.ld
diff --git a/tests/qtest/fuzz/fork_fuzz.c b/tests/qtest/fuzz/fork_fuzz.c
new file mode 100644
index 00..2bd0851903
--- /dev/null
+++ b/tests/qtest/fuzz/fork_fuzz.c
@@ -0,0 +1,55 @@
+/*
+ * Fork-based fuzzing helpers
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "fork_fuzz.h"
+
+
+void counter_shm_init(void)
+{
+char *shm_path = g_strdup_printf("/qemu-fuzz-cntrs.%d", getpid());
+int fd = shm_open(shm_path, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
+g_free(shm_path);
+
+if (fd == -1) {
+perror("Error: ");
+exit(1);
+}
+if (ftruncate(fd, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START) == -1) {
+perror("Error: ");
+exit(1);
+}
+/* Copy what's in the counter region to the shm.. */
+void *rptr = mmap(NULL ,
+&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
+PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+memcpy(rptr,
+   &__FUZZ_COUNTERS_START,
+   &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
+
+munmap(rptr, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
+
+/* And map the shm over the counter region */
+rptr = mmap(&__FUZZ_COUNTERS_START,
+&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
+PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
+
+close(fd);
+
+if (rptr == MAP_FAILED) {
+perror("Error: ");
+exit(1);
+}
+}
+
+
diff --git a/tests/qtest/fuzz/fork_fuzz.h b/tests/qtest/fuzz/fork_fuzz.h
new file mode 100644
index 00..9ecb8b58ef
--- /dev/null
+++ b/tests/qtest/fuzz/fork_fuzz.h
@@ -0,0 +1,23 @@
+/*
+ * Fork-based fuzzing helpers
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef FORK_FUZZ_H
+#define FORK_FUZZ_H
+
+extern uint8_t __FUZZ_COUNTERS_START;
+extern uint8_t __FUZZ_COUNTERS_END;
+
+void counter_shm_init(void);
+
+#endif
+
diff --git a/tests/qtest/fuzz/fork_fuzz.ld b/tests/qtest/fuzz/fork_fuzz.ld
new file mode 100644
index 00..b23a59f194
--- /dev/null
+++ b/tests/qtest/fuzz/fork_fuzz.ld
@@ -0,0 +1,37 @@
+/* We adjust the linker script to place all of the stuff that needs to
+ * persist across fuzzing runs into a contiguous section of memory. Then, it is
+ * easy to re-map the counter-related memory as shared.
+*/
+
+SECTIONS
+{
+  .data.fuzz_start : ALIGN(4K)
+  {
+  __FUZZ_COUNTERS_START = .;
+  __start___sancov_cntrs = .;
+  *(_*sancov_cntrs);
+  __stop___sancov_cntrs = .;
+
+  /* Lowest stack counter */
+  *(__sancov_lowest_stack);
+  }
+  .data.fuzz_ordered :
+  {
+  /* Coverage counters. They're not necessary for fuzzing, but are useful
+   * for analyzing the fuzzing performance
+   

[PATCH v10 07/22] libqtest: make bufwrite rely on the TransportOps

2020-02-19 Thread Alexander Bulekov
When using qtest "in-process" communication, qtest_sendf directly calls
a function in the server (qtest.c). Previously, bufwrite used
socket_send, which bypasses the TransportOps enabling the call into
qtest.c. This change replaces the socket_send calls with ops->send,
maintaining the benefits of the direct socket_send call, while adding
support for in-process qtest calls.
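
The dispatch through a table of function pointers can be sketched as
follows. The names are hypothetical (DemoState, demo_bufwrite and so on)
and the "server" merely appends to a log; the point is only that every
write goes through the send hook, so swapping the hook reroutes all
traffic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoState DemoState;
typedef void (*DemoSendFn)(DemoState *s, const char *buf);
typedef void (*DemoExternalSendFn)(void *opaque, const char *buf);

struct DemoState {
    DemoSendFn send;                  /* used by all client-side writers */
    DemoExternalSendFn external_send; /* callee that takes a void pointer */
    char log[128];                    /* what the "server" has received */
};

/* Adapter: gives external_send the signature that the send hook expects. */
static void demo_send_wrapper(DemoState *s, const char *buf)
{
    s->external_send(s, buf);
}

/* Hypothetical in-process "server" entry point: appends to the log. */
void demo_server_recv(void *opaque, const char *buf)
{
    DemoState *s = opaque;

    assert(strlen(s->log) + strlen(buf) < sizeof(s->log));
    strcat(s->log, buf);
}

/* Set up a state whose writes are delivered by a direct function call. */
void demo_inproc_init(DemoState *s, DemoExternalSendFn send)
{
    memset(s, 0, sizeof(*s));
    s->external_send = send;
    s->send = demo_send_wrapper;
}

/* A writer that never touches a socket: it only uses the send hook. */
void demo_bufwrite(DemoState *s, const char *payload)
{
    s->send(s, "b64write ");
    s->send(s, payload);
    s->send(s, "\n");
}
```

A socket-backed state would install a different send function and leave
demo_bufwrite unchanged, which mirrors the motivation given above for
replacing the direct socket_send calls.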

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 tests/qtest/libqtest.c | 71 --
 tests/qtest/libqtest.h |  4 +++
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index e5056a1d0f..49075b55a1 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -37,10 +37,18 @@
 
 
 typedef void (*QTestSendFn)(QTestState *s, const char *buf);
+typedef void (*ExternalSendFn)(void *s, const char *buf);
 typedef GString* (*QTestRecvFn)(QTestState *);
 
 typedef struct QTestClientTransportOps {
 QTestSendFn send;  /* for sending qtest commands */
+
+/*
+ * use external_send to send qtest command strings through functions which
+ * do not accept a QTestState as the first parameter.
+ */
+ExternalSendFn  external_send;
+
 QTestRecvFn recv_line; /* for receiving qtest command responses */
 } QTestTransportOps;
 
@@ -1078,8 +1086,8 @@ void qtest_bufwrite(QTestState *s, uint64_t addr, const void *data, size_t size)
 
 bdata = g_base64_encode(data, size);
 qtest_sendf(s, "b64write 0x%" PRIx64 " 0x%zx ", addr, size);
-socket_send(s->fd, bdata, strlen(bdata));
-socket_send(s->fd, "\n", 1);
+s->ops.send(s, bdata);
+s->ops.send(s, "\n");
 qtest_rsp(s, 0);
 g_free(bdata);
 }
@@ -1367,3 +1375,62 @@ static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv)
 {
 s->ops.recv_line = recv;
 }
+/* A type-safe wrapper for s->send() */
+static void send_wrapper(QTestState *s, const char *buf)
+{
+s->ops.external_send(s, buf);
+}
+
+static GString *qtest_client_inproc_recv_line(QTestState *s)
+{
+GString *line;
+size_t offset;
+char *eol;
+
+eol = strchr(s->rx->str, '\n');
+offset = eol - s->rx->str;
+line = g_string_new_len(s->rx->str, offset);
+g_string_erase(s->rx, 0, offset + 1);
+return line;
+}
+
+QTestState *qtest_inproc_init(QTestState **s, bool log, const char* arch,
+void (*send)(void*, const char*))
+{
+QTestState *qts;
+qts = g_new0(QTestState, 1);
+*s = qts; /* Expose qts early on, since the query endianness relies on it */
+qts->wstatus = 0;
+for (int i = 0; i < MAX_IRQ; i++) {
+qts->irq_level[i] = false;
+}
+
+qtest_client_set_rx_handler(qts, qtest_client_inproc_recv_line);
+
+/* send() may not have a matching prototype, so use a type-safe wrapper */
+qts->ops.external_send = send;
+qtest_client_set_tx_handler(qts, send_wrapper);
+
+qts->big_endian = qtest_query_target_endianness(qts);
+
+/*
+ * Set a dummy path for QTEST_QEMU_BINARY. Doesn't need to exist, but this
+ * way, qtest_get_arch works for inproc qtest.
+ */
+gchar *bin_path = g_strconcat("/qemu-system-", arch, NULL);
+setenv("QTEST_QEMU_BINARY", bin_path, 0);
+g_free(bin_path);
+
+return qts;
+}
+
+void qtest_client_inproc_recv(void *opaque, const char *str)
+{
+QTestState *qts = *(QTestState **)opaque;
+
+if (!qts->rx) {
+qts->rx = g_string_new(NULL);
+}
+g_string_append(qts->rx, str);
+return;
+}
diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index c9e21e05b3..f5cf93c386 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -729,4 +729,8 @@ bool qtest_probe_child(QTestState *s);
  */
 void qtest_set_expected_status(QTestState *s, int status);
 
+QTestState *qtest_inproc_init(QTestState **s, bool log, const char* arch,
+void (*send)(void*, const char*));
+
+void qtest_client_inproc_recv(void *opaque, const char *str);
 #endif
-- 
2.25.0
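
The in-process receive path in the patch above can be sketched without GLib. This is only an illustration under stated assumptions: `RxBuf`, `rx_append` and `rx_pop_line` are hypothetical names, a fixed-size buffer stands in for the GString, and unlike the patch the sketch reports a missing newline instead of assuming one is always buffered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the GString rx buffer used by libqtest. */
typedef struct RxBuf {
    char data[256];
    size_t len;
} RxBuf;

/* Server side: append a response fragment. Returns 0, or -1 if it does not fit. */
static int rx_append(RxBuf *rx, const char *str)
{
    size_t n = strlen(str);

    if (rx->len + n >= sizeof(rx->data)) {
        return -1;
    }
    memcpy(rx->data + rx->len, str, n);
    rx->len += n;
    rx->data[rx->len] = '\0';
    return 0;
}

/*
 * Client side: copy the first buffered line (without its '\n') into out
 * and erase it from the buffer. Returns 0 on success, -1 if no complete
 * line is buffered yet or out is too small.
 */
static int rx_pop_line(RxBuf *rx, char *out, size_t out_size)
{
    char *eol = memchr(rx->data, '\n', rx->len);
    size_t offset;

    if (!eol) {
        return -1;
    }
    offset = (size_t)(eol - rx->data);
    if (offset >= out_size) {
        return -1;
    }
    memcpy(out, rx->data, offset);
    out[offset] = '\0';
    /* Drop the line and its newline, keeping any data that follows. */
    memmove(rx->data, eol + 1, rx->len - offset - 1);
    rx->len -= offset + 1;
    rx->data[rx->len] = '\0';
    return 0;
}
```

A response may arrive in several fragments, so a caller would keep appending until `rx_pop_line` succeeds.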




[PATCH v10 19/22] fuzz: add i440fx fuzz targets

2020-02-19 Thread Alexander Bulekov
These three targets simply fuzz reads and writes to a couple of I/O ports,
but they mostly serve as examples of different ways to write targets.
They demonstrate using qtest and qos for fuzzing, and resetting state by
rebooting, by forking, or not at all.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 tests/qtest/fuzz/Makefile.include |   3 +
 tests/qtest/fuzz/i440fx_fuzz.c| 193 ++
 2 files changed, 196 insertions(+)
 create mode 100644 tests/qtest/fuzz/i440fx_fuzz.c

diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
index e3bdd33ff4..38b8cdd9f1 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -6,6 +6,9 @@ fuzz-obj-y += tests/qtest/fuzz/fuzz.o # Fuzzer skeleton
 fuzz-obj-y += tests/qtest/fuzz/fork_fuzz.o
 fuzz-obj-y += tests/qtest/fuzz/qos_fuzz.o
 
+# Targets
+fuzz-obj-y += tests/qtest/fuzz/i440fx_fuzz.o
+
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
 
 # Linker Script to force coverage-counters into known regions which we can mark
diff --git a/tests/qtest/fuzz/i440fx_fuzz.c b/tests/qtest/fuzz/i440fx_fuzz.c
new file mode 100644
index 00..ab5f112584
--- /dev/null
+++ b/tests/qtest/fuzz/i440fx_fuzz.c
@@ -0,0 +1,193 @@
+/*
+ * I440FX Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "qemu/main-loop.h"
+#include "tests/qtest/libqtest.h"
+#include "tests/qtest/libqos/pci.h"
+#include "tests/qtest/libqos/pci-pc.h"
+#include "fuzz.h"
+#include "fuzz/qos_fuzz.h"
+#include "fuzz/fork_fuzz.h"
+
+
+#define I440FX_PCI_HOST_BRIDGE_CFG 0xcf8
+#define I440FX_PCI_HOST_BRIDGE_DATA 0xcfc
+
+/*
+ * the input to the fuzzing functions below is a buffer of random bytes. we
+ * want to convert these bytes into a sequence of qtest or qos calls. to do
+ * this we define some opcodes:
+ */
+enum action_id {
+WRITEB,
+WRITEW,
+WRITEL,
+READB,
+READW,
+READL,
+ACTION_MAX
+};
+
+static void i440fx_fuzz_qtest(QTestState *s,
+const unsigned char *Data, size_t Size) {
+/*
+ * loop over the Data, breaking it up into actions. each action has an
+ * opcode, address offset and value
+ */
+typedef struct QTestFuzzAction {
+uint8_t opcode;
+uint8_t addr;
+uint32_t value;
+} QTestFuzzAction;
+QTestFuzzAction a;
+
+while (Size >= sizeof(a)) {
+/* make a copy of the action so we can normalize the values in-place */
+memcpy(&a, Data, sizeof(a));
+/* select between two i440fx Port IO addresses */
+uint16_t addr = a.addr % 2 ? I440FX_PCI_HOST_BRIDGE_CFG :
+  I440FX_PCI_HOST_BRIDGE_DATA;
+switch (a.opcode % ACTION_MAX) {
+case WRITEB:
+qtest_outb(s, addr, (uint8_t)a.value);
+break;
+case WRITEW:
+qtest_outw(s, addr, (uint16_t)a.value);
+break;
+case WRITEL:
+qtest_outl(s, addr, (uint32_t)a.value);
+break;
+case READB:
+qtest_inb(s, addr);
+break;
+case READW:
+qtest_inw(s, addr);
+break;
+case READL:
+qtest_inl(s, addr);
+break;
+}
+/* Move to the next operation */
+Size -= sizeof(a);
+Data += sizeof(a);
+}
+flush_events(s);
+}
+
+static void i440fx_fuzz_qos(QTestState *s,
+const unsigned char *Data, size_t Size) {
+/*
+ * Same as i440fx_fuzz_qtest, but using QOS. devfn is incorporated into the
+ * value written over Port IO
+ */
+typedef struct QOSFuzzAction {
+uint8_t opcode;
+uint8_t offset;
+int devfn;
+uint32_t value;
+} QOSFuzzAction;
+
+static QPCIBus *bus;
+if (!bus) {
+bus = qpci_new_pc(s, fuzz_qos_alloc);
+}
+
+QOSFuzzAction a;
+while (Size >= sizeof(a)) {
+memcpy(&a, Data, sizeof(a));
+switch (a.opcode % ACTION_MAX) {
+case WRITEB:
+bus->config_writeb(bus, a.devfn, a.offset, (uint8_t)a.value);
+break;
+case WRITEW:
+bus->config_writew(bus, a.devfn, a.offset, (uint16_t)a.value);
+break;
+case WRITEL:
+bus->config_writel(bus, a.devfn, a.offset, (uint32_t)a.value);
+break;
+case READB:
+bus->config_readb(bus, a.devfn, a.offset);
+break;
+case READW:
+bus->config_readw(bus, a.devfn, a.offset);
+break;
+case READL:
+bus->config_readl(bus, a.devfn, a.offset);
+break;
+}
+Size -= sizeof(a);
+Data += sizeof(a);
+}
+
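
The way both targets above slice the fuzzer input into fixed-size actions can be shown in isolation. The sketch below is a simplified illustration, not part of the patch: `Action`, `decode_actions` and the `counts` array are hypothetical, and it only tallies opcodes instead of issuing qtest or qos calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum action_id { WRITEB, WRITEW, WRITEL, READB, READW, READL, ACTION_MAX };

/* Hypothetical record; its size (including padding) depends on the ABI. */
typedef struct Action {
    uint8_t opcode;
    uint8_t addr;
    uint32_t value;
} Action;

/*
 * Walk the input in sizeof(Action) steps, normalise each opcode with a
 * modulo and count how often each action would run. Trailing bytes that
 * do not fill a whole record are ignored. Returns the number of actions.
 */
static size_t decode_actions(const unsigned char *data, size_t size,
                             size_t counts[ACTION_MAX])
{
    Action a;
    size_t n = 0;

    memset(counts, 0, ACTION_MAX * sizeof(counts[0]));
    while (size >= sizeof(a)) {
        /* Copy out of the input so the alignment of data does not matter. */
        memcpy(&a, data, sizeof(a));
        counts[a.opcode % ACTION_MAX]++;
        size -= sizeof(a);
        data += sizeof(a);
        n++;
    }
    return n;
}
```

Because every opcode is reduced modulo ACTION_MAX, any byte value maps to a valid action, which keeps the fuzzer from wasting inputs on rejected opcodes.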

[PATCH v10 13/22] exec: keep ram block across fork when using qtest

2020-02-19 Thread Alexander Bulekov
RAM blocks were marked MADV_DONTFORK, which breaks fuzzing tests that
execute each test input in a forked process.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 exec.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index 67e520d18e..43f6659d12 100644
--- a/exec.c
+++ b/exec.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/tcg.h"
+#include "sysemu/qtest.h"
 #include "qemu/timer.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -2306,8 +2307,15 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
 if (new_block->host) {
 qemu_ram_setup_dump(new_block->host, new_block->max_length);
 qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_HUGEPAGE);
-/* MADV_DONTFORK is also needed by KVM in absence of synchronous MMU */
-qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_DONTFORK);
+/*
+ * MADV_DONTFORK is also needed by KVM in absence of synchronous MMU.
+ * Configure it unless the machine is a qtest server, in which case
+ * KVM is not used and it may be forked (e.g. for fuzzing purposes).
+ */
+if (!qtest_enabled()) {
+qemu_madvise(new_block->host, new_block->max_length,
+ QEMU_MADV_DONTFORK);
+}
 ram_block_notify_add(new_block->host, new_block->max_length);
 }
 }
-- 
2.25.0
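
The effect described in this patch can be reproduced outside QEMU. The following is a minimal Linux-only sketch, assuming glibc and a kernel that supports MADV_DONTFORK; `child_sees_page` is a hypothetical helper, not a QEMU function. It maps one page, optionally applies MADV_DONTFORK, forks, and reports whether the child could still read the page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child can read the parent's page, 0 if the child
 * fails when touching it, and -1 on setup errors.
 */
static int child_sees_page(int dontfork)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *page;
    int status = 0;
    int visible = -1;
    pid_t pid;
    void *map;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) {
        return -1;
    }
    if (dontfork && madvise(map, len, MADV_DONTFORK) != 0) {
        munmap(map, len);
        return -1;
    }
    page = map;
    page[0] = 0x5a;

    pid = fork();
    if (pid == 0) {
        /* Child: with MADV_DONTFORK the page is not mapped, so this read faults. */
        _exit(page[0] == 0x5a ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid) {
        visible = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    }
    munmap(map, len);
    return visible;
}
```

A fork-based fuzzer needs the first behaviour (the child sees guest RAM), which is why the patch skips the madvise call when qtest is enabled.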




[PATCH v10 06/22] libqtest: add a layer of abstraction to send/recv

2020-02-19 Thread Alexander Bulekov
This makes it simple to swap the transport functions for qtest commands
to and from the qtest client. For example, now it is possible to
directly pass qtest commands to a server handler that exists within the
same process, without the standard way of writing to a file descriptor.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 tests/qtest/libqtest.c | 48 ++
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 76c9f8eade..e5056a1d0f 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -35,6 +35,15 @@
 #define SOCKET_TIMEOUT 50
 #define SOCKET_MAX_FDS 16
 
+
+typedef void (*QTestSendFn)(QTestState *s, const char *buf);
+typedef GString* (*QTestRecvFn)(QTestState *);
+
+typedef struct QTestClientTransportOps {
+QTestSendFn send;  /* for sending qtest commands */
+QTestRecvFn recv_line; /* for receiving qtest command responses */
+} QTestTransportOps;
+
 struct QTestState
 {
 int fd;
@@ -45,6 +54,7 @@ struct QTestState
 bool big_endian;
 bool irq_level[MAX_IRQ];
 GString *rx;
+QTestTransportOps ops;
 };
 
 static GHookList abrt_hooks;
@@ -52,6 +62,14 @@ static struct sigaction sigact_old;
 
 static int qtest_query_target_endianness(QTestState *s);
 
+static void qtest_client_socket_send(QTestState*, const char *buf);
+static void socket_send(int fd, const char *buf, size_t size);
+
+static GString *qtest_client_socket_recv_line(QTestState *);
+
+static void qtest_client_set_tx_handler(QTestState *s, QTestSendFn send);
+static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv);
+
 static int init_socket(const char *socket_path)
 {
 struct sockaddr_un addr;
@@ -234,6 +252,9 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
 sock = init_socket(socket_path);
 qmpsock = init_socket(qmp_socket_path);
 
+qtest_client_set_rx_handler(s, qtest_client_socket_recv_line);
+qtest_client_set_tx_handler(s, qtest_client_socket_send);
+
 qtest_add_abrt_handler(kill_qemu_hook_func, s);
 
 command = g_strdup_printf("exec %s "
@@ -379,13 +400,9 @@ static void socket_send(int fd, const char *buf, size_t size)
 }
 }
 
-static void socket_sendf(int fd, const char *fmt, va_list ap)
+static void qtest_client_socket_send(QTestState *s, const char *buf)
 {
-gchar *str = g_strdup_vprintf(fmt, ap);
-size_t size = strlen(str);
-
-socket_send(fd, str, size);
-g_free(str);
+socket_send(s->fd, buf, strlen(buf));
 }
 
 static void GCC_FMT_ATTR(2, 3) qtest_sendf(QTestState *s, const char *fmt, ...)
@@ -393,8 +410,11 @@ static void GCC_FMT_ATTR(2, 3) qtest_sendf(QTestState *s, const char *fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-socket_sendf(s->fd, fmt, ap);
+gchar *str = g_strdup_vprintf(fmt, ap);
 va_end(ap);
+
+s->ops.send(s, str);
+g_free(str);
 }
 
 /* Sends a message and file descriptors to the socket.
@@ -431,7 +451,7 @@ static void socket_send_fds(int socket_fd, int *fds, size_t fds_num,
 g_assert_cmpint(ret, >, 0);
 }
 
-static GString *qtest_recv_line(QTestState *s)
+static GString *qtest_client_socket_recv_line(QTestState *s)
 {
 GString *line;
 size_t offset;
@@ -468,7 +488,7 @@ static gchar **qtest_rsp(QTestState *s, int expected_args)
 int i;
 
 redo:
-line = qtest_recv_line(s);
+line = s->ops.recv_line(s);
 words = g_strsplit(line->str, " ", 0);
 g_string_free(line, TRUE);
 
@@ -1337,3 +1357,13 @@ void qmp_assert_error_class(QDict *rsp, const char *class)
 
 qobject_unref(rsp);
 }
+
+static void qtest_client_set_tx_handler(QTestState *s,
+QTestSendFn send)
+{
+s->ops.send = send;
+}
+static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv)
+{
+s->ops.recv_line = recv;
+}
-- 
2.25.0
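
The abstraction introduced here is a small table of function pointers stored in the client state. A stand-alone sketch of the same pattern follows; `Client`, `loopback_send`, `loopback_recv_line` and `client_command` are hypothetical names chosen for illustration and do not exist in libqtest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct Client Client;
typedef void (*SendFn)(Client *c, const char *buf);
typedef const char *(*RecvFn)(Client *c);

/* Same shape as the ops struct in the patch: one send and one receive hook. */
typedef struct TransportOps {
    SendFn send;
    RecvFn recv_line;
} TransportOps;

struct Client {
    char last_sent[64]; /* what the loopback transport "transmitted" */
    TransportOps ops;
};

/* Hypothetical transport: remember the command instead of writing to an fd. */
static void loopback_send(Client *c, const char *buf)
{
    snprintf(c->last_sent, sizeof(c->last_sent), "%s", buf);
}

/* Hypothetical transport: answer "OK" once something has been sent. */
static const char *loopback_recv_line(Client *c)
{
    return c->last_sent[0] ? "OK" : "FAIL";
}

/* Generic client code only ever goes through the ops table. */
static const char *client_command(Client *c, const char *cmd)
{
    c->ops.send(c, cmd);
    return c->ops.recv_line(c);
}
```

Swapping the two pointers is all it takes to move from a socket transport to an in-process one; callers of `client_command` are unaffected.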




[PATCH v10 16/22] fuzz: add support for qos-assisted fuzz targets

2020-02-19 Thread Alexander Bulekov
Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qtest/fuzz/Makefile.include |   2 +
 tests/qtest/fuzz/qos_fuzz.c   | 234 ++
 tests/qtest/fuzz/qos_fuzz.h   |  33 +
 3 files changed, 269 insertions(+)
 create mode 100644 tests/qtest/fuzz/qos_fuzz.c
 create mode 100644 tests/qtest/fuzz/qos_fuzz.h

diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
index a90915d56d..e3bdd33ff4 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -1,8 +1,10 @@
 QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
 
 fuzz-obj-y += tests/qtest/libqtest.o
+fuzz-obj-y += $(libqos-obj-y)
 fuzz-obj-y += tests/qtest/fuzz/fuzz.o # Fuzzer skeleton
 fuzz-obj-y += tests/qtest/fuzz/fork_fuzz.o
+fuzz-obj-y += tests/qtest/fuzz/qos_fuzz.o
 
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
 
diff --git a/tests/qtest/fuzz/qos_fuzz.c b/tests/qtest/fuzz/qos_fuzz.c
new file mode 100644
index 00..bbb17470ff
--- /dev/null
+++ b/tests/qtest/fuzz/qos_fuzz.c
@@ -0,0 +1,234 @@
+/*
+ * QOS-assisted fuzzing helpers
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+
+#include "tests/qtest/libqtest.h"
+#include "tests/qtest/libqos/malloc.h"
+#include "tests/qtest/libqos/qgraph.h"
+#include "tests/qtest/libqos/qgraph_internal.h"
+#include "tests/qtest/libqos/qos_external.h"
+
+#include "fuzz.h"
+#include "qos_fuzz.h"
+
+#include "qapi/qapi-commands-machine.h"
+#include "qapi/qapi-commands-qom.h"
+#include "qapi/qmp/qlist.h"
+
+
+void *fuzz_qos_obj;
+QGuestAllocator *fuzz_qos_alloc;
+
+static const char *fuzz_target_name;
+static char **fuzz_path_vec;
+
+/*
+ * Replaced the qmp commands with direct qmp_marshal calls.
+ * Probably there is a better way to do this
+ */
+static void qos_set_machines_devices_available(void)
+{
+QDict *req = qdict_new();
+QObject *response;
+QDict *args = qdict_new();
+QList *lst;
+Error *err = NULL;
+
+qmp_marshal_query_machines(NULL, &response, &err);
+assert(!err);
+lst = qobject_to(QList, response);
+apply_to_qlist(lst, true);
+
+qobject_unref(response);
+
+
+qdict_put_str(req, "execute", "qom-list-types");
+qdict_put_str(args, "implements", "device");
+qdict_put_bool(args, "abstract", true);
+qdict_put_obj(req, "arguments", (QObject *) args);
+
+qmp_marshal_qom_list_types(args, &response, &err);
+assert(!err);
+lst = qobject_to(QList, response);
+apply_to_qlist(lst, false);
+qobject_unref(response);
+qobject_unref(req);
+}
+
+static char **current_path;
+
+void *qos_allocate_objects(QTestState *qts, QGuestAllocator **p_alloc)
+{
+return allocate_objects(qts, current_path + 1, p_alloc);
+}
+
+static const char *qos_build_main_args(void)
+{
+char **path = fuzz_path_vec;
+QOSGraphNode *test_node;
+GString *cmd_line = g_string_new(path[0]);
+void *test_arg;
+
+if (!path) {
+fprintf(stderr, "QOS Path not found\n");
+abort();
+}
+
+/* Before test */
+current_path = path;
+test_node = qos_graph_get_node(path[(g_strv_length(path) - 1)]);
+test_arg = test_node->u.test.arg;
+if (test_node->u.test.before) {
+test_arg = test_node->u.test.before(cmd_line, test_arg);
+}
+/* Prepend the arguments that we need */
+g_string_prepend(cmd_line,
+TARGET_NAME " -display none -machine accel=qtest -m 64 ");
+return cmd_line->str;
+}
+
+/*
+ * This function is largely a copy of qos-test.c:walk_path. Since walk_path
+ * is itself a callback, it's a little annoying to add another argument/layer of
+ * indirection
+ */
+static void walk_path(QOSGraphNode *orig_path, int len)
+{
+QOSGraphNode *path;
+QOSGraphEdge *edge;
+
+/* etype set to QEDGE_CONSUMED_BY so that machine can add to the command line */
+QOSEdgeType etype = QEDGE_CONSUMED_BY;
+
+/* twice QOS_PATH_MAX_ELEMENT_SIZE since each edge can have its arg */
+char **path_vec = g_new0(char *, (QOS_PATH_MAX_ELEMENT_SIZE * 2));
+int path_vec_size = 0;
+
+char *after_cmd, *before_cmd, *after_device;
+GString *

[PATCH v10 10/22] libqos: split qos-test and libqos makefile vars

2020-02-19 Thread Alexander Bulekov
Most qos-related objects were specified in the qos-test-obj-y variable.
qos-test-obj-y also included qos-test.o which defines a main().
This made it difficult to repurpose qos-test-obj-y to link anything
besides tests/qos-test against libqos. This change separates the
libqos-specific objects and the qos-test-specific objects into different
variables.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Darren Kenny 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
---
 tests/qtest/Makefile.include | 71 ++--
 1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/tests/qtest/Makefile.include b/tests/qtest/Makefile.include
index eb0f23b108..838618e6f9 100644
--- a/tests/qtest/Makefile.include
+++ b/tests/qtest/Makefile.include
@@ -157,52 +157,53 @@ check-qtest-s390x-y += migration-test
 # libqos / qgraph :
 libqgraph-obj-y = tests/qtest/libqos/qgraph.o
 
-libqos-obj-y = $(libqgraph-obj-y) tests/qtest/libqos/pci.o tests/qtest/libqos/fw_cfg.o
-libqos-obj-y += tests/qtest/libqos/malloc.o
-libqos-obj-y += tests/qtest/libqos/libqos.o
-libqos-spapr-obj-y = $(libqos-obj-y) tests/qtest/libqos/malloc-spapr.o
+libqos-core-obj-y = $(libqgraph-obj-y) tests/qtest/libqos/pci.o tests/qtest/libqos/fw_cfg.o
+libqos-core-obj-y += tests/qtest/libqos/malloc.o
+libqos-core-obj-y += tests/qtest/libqos/libqos.o
+libqos-spapr-obj-y = $(libqos-core-obj-y) tests/qtest/libqos/malloc-spapr.o
 libqos-spapr-obj-y += tests/qtest/libqos/libqos-spapr.o
 libqos-spapr-obj-y += tests/qtest/libqos/rtas.o
 libqos-spapr-obj-y += tests/qtest/libqos/pci-spapr.o
-libqos-pc-obj-y = $(libqos-obj-y) tests/qtest/libqos/pci-pc.o
+libqos-pc-obj-y = $(libqos-core-obj-y) tests/qtest/libqos/pci-pc.o
 libqos-pc-obj-y += tests/qtest/libqos/malloc-pc.o tests/qtest/libqos/libqos-pc.o
 libqos-pc-obj-y += tests/qtest/libqos/ahci.o
 libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) tests/qtest/libqos/usb.o
 
 # qos devices:
-qos-test-obj-y = tests/qtest/qos-test.o $(libqgraph-obj-y)
-qos-test-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
-qos-test-obj-y += tests/qtest/libqos/e1000e.o
-qos-test-obj-y += tests/qtest/libqos/i2c.o
-qos-test-obj-y += tests/qtest/libqos/i2c-imx.o
-qos-test-obj-y += tests/qtest/libqos/i2c-omap.o
-qos-test-obj-y += tests/qtest/libqos/sdhci.o
-qos-test-obj-y += tests/qtest/libqos/tpci200.o
-qos-test-obj-y += tests/qtest/libqos/virtio.o
-qos-test-obj-$(CONFIG_VIRTFS) += tests/qtest/libqos/virtio-9p.o
-qos-test-obj-y += tests/qtest/libqos/virtio-balloon.o
-qos-test-obj-y += tests/qtest/libqos/virtio-blk.o
-qos-test-obj-y += tests/qtest/libqos/virtio-mmio.o
-qos-test-obj-y += tests/qtest/libqos/virtio-net.o
-qos-test-obj-y += tests/qtest/libqos/virtio-pci.o
-qos-test-obj-y += tests/qtest/libqos/virtio-pci-modern.o
-qos-test-obj-y += tests/qtest/libqos/virtio-rng.o
-qos-test-obj-y += tests/qtest/libqos/virtio-scsi.o
-qos-test-obj-y += tests/qtest/libqos/virtio-serial.o
+libqos-obj-y =  $(libqgraph-obj-y)
+libqos-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+libqos-obj-y += tests/qtest/libqos/e1000e.o
+libqos-obj-y += tests/qtest/libqos/i2c.o
+libqos-obj-y += tests/qtest/libqos/i2c-imx.o
+libqos-obj-y += tests/qtest/libqos/i2c-omap.o
+libqos-obj-y += tests/qtest/libqos/sdhci.o
+libqos-obj-y += tests/qtest/libqos/tpci200.o
+libqos-obj-y += tests/qtest/libqos/virtio.o
+libqos-obj-$(CONFIG_VIRTFS) += tests/qtest/libqos/virtio-9p.o
+libqos-obj-y += tests/qtest/libqos/virtio-balloon.o
+libqos-obj-y += tests/qtest/libqos/virtio-blk.o
+libqos-obj-y += tests/qtest/libqos/virtio-mmio.o
+libqos-obj-y += tests/qtest/libqos/virtio-net.o
+libqos-obj-y += tests/qtest/libqos/virtio-pci.o
+libqos-obj-y += tests/qtest/libqos/virtio-pci-modern.o
+libqos-obj-y += tests/qtest/libqos/virtio-rng.o
+libqos-obj-y += tests/qtest/libqos/virtio-scsi.o
+libqos-obj-y += tests/qtest/libqos/virtio-serial.o
 
 # qos machines:
-qos-test-obj-y += tests/qtest/libqos/aarch64-xlnx-zcu102-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-imx25-pdk-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-n800-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-raspi2-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-sabrelite-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-smdkc210-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-virt-machine.o
-qos-test-obj-y += tests/qtest/libqos/arm-xilinx-zynq-a9-machine.o
-qos-test-obj-y += tests/qtest/libqos/ppc64_pseries-machine.o
-qos-test-obj-y += tests/qtest/libqos/x86_64_pc-machine.o
+libqos-obj-y += tests/qtest/libqos/aarch64-xlnx-zcu102-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-imx25-pdk-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-n800-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-raspi2-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-sabrelite-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-smdkc210-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-virt-machine.o
+libqos-obj-y += tests/qtest/libqos/arm-xilinx-zynq-a9-machine.o

[PATCH v10 05/22] qtest: add qtest_server_send abstraction

2020-02-19 Thread Alexander Bulekov
qtest_server_send is a function pointer specifying the handler used to
transmit data to the qtest client. In the standard configuration, this
calls the CharBackend handler, but other types of handlers are now
possible, e.g. direct function calls if the qtest client and server
exist within the same process (inproc).

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
Acked-by: Thomas Huth 
---
 include/sysemu/qtest.h |  3 +++
 qtest.c| 18 --
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index 5ed09c80b1..e2f1047fd7 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -26,4 +26,7 @@ bool qtest_driver(void);
 
 void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
 
+void qtest_server_set_send_handler(void (*send)(void *, const char *),
+ void *opaque);
+
 #endif
diff --git a/qtest.c b/qtest.c
index 12432f99cf..938c3746d6 100644
--- a/qtest.c
+++ b/qtest.c
@@ -42,6 +42,8 @@ static GString *inbuf;
 static int irq_levels[MAX_IRQ];
 static qemu_timeval start_time;
 static bool qtest_opened;
+static void (*qtest_server_send)(void*, const char*);
+static void *qtest_server_send_opaque;
 
 #define FMT_timeval "%ld.%06ld"
 
@@ -228,8 +230,10 @@ static void GCC_FMT_ATTR(1, 2) qtest_log_send(const char *fmt, ...)
 va_end(ap);
 }
 
-static void do_qtest_send(CharBackend *chr, const char *str, size_t len)
+static void qtest_server_char_be_send(void *opaque, const char *str)
 {
+size_t len = strlen(str);
+CharBackend* chr = (CharBackend *)opaque;
 qemu_chr_fe_write_all(chr, (uint8_t *)str, len);
 if (qtest_log_fp && qtest_opened) {
 fprintf(qtest_log_fp, "%s", str);
@@ -238,7 +242,7 @@ static void do_qtest_send(CharBackend *chr, const char *str, size_t len)
 
 static void qtest_send(CharBackend *chr, const char *str)
 {
-do_qtest_send(chr, str, strlen(str));
+qtest_server_send(qtest_server_send_opaque, str);
 }
 
 static void GCC_FMT_ATTR(2, 3) qtest_sendf(CharBackend *chr,
@@ -783,6 +787,16 @@ void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **
 qemu_chr_fe_set_echo(&qtest_chr, true);
 
 inbuf = g_string_new("");
+
+if (!qtest_server_send) {
+qtest_server_set_send_handler(qtest_server_char_be_send, &qtest_chr);
+}
+}
+
+void qtest_server_set_send_handler(void (*send)(void*, const char*), void *opaque)
+{
+qtest_server_send = send;
+qtest_server_send_opaque = opaque;
 }
 
 bool qtest_driver(void)
-- 
2.25.0
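
The server-side change boils down to a global callback plus an opaque pointer, with a default installed only when nothing was registered earlier. A minimal sketch of that pattern, assuming hypothetical names (`server_init`, `server_reply`, `default_send`, `inproc_send`) and trivial handlers in place of the CharBackend:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SINK_SIZE 32

typedef void (*ServerSendFn)(void *opaque, const char *str);

static ServerSendFn server_send;
static void *server_send_opaque;
static size_t default_bytes; /* state owned by the default handler */

static void server_set_send_handler(ServerSendFn send, void *opaque)
{
    server_send = send;
    server_send_opaque = opaque;
}

/* Hypothetical default backend: just count the bytes it was asked to send. */
static void default_send(void *opaque, const char *str)
{
    size_t *bytes = opaque;

    *bytes += strlen(str);
}

/* Hypothetical in-process backend: append to a SINK_SIZE-byte string buffer. */
static void inproc_send(void *opaque, const char *str)
{
    char *sink = opaque;

    strncat(sink, str, SINK_SIZE - 1 - strlen(sink));
}

/* As in the patch, install the default only if no handler was set before. */
static void server_init(void)
{
    if (!server_send) {
        server_set_send_handler(default_send, &default_bytes);
    }
}

static void server_reply(const char *str)
{
    server_send(server_send_opaque, str);
}
```

The opaque pointer lets each handler carry its own state, so the sender never needs to know whether it is talking to a character device or a buffer in the same process.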




[PATCH v10 14/22] main: keep rcu_atfork callback enabled for qtest

2020-02-19 Thread Alexander Bulekov
The qtest-based fuzzer makes use of forking to reset state between
tests. Keep the callback enabled, so the call_rcu thread gets created
within the child process.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Darren Kenny 
Acked-by: Stefan Hajnoczi 
---
 softmmu/vl.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/softmmu/vl.c b/softmmu/vl.c
index 46a48d09df..78f6530620 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -3813,7 +3813,17 @@ void qemu_init(int argc, char **argv, char **envp)
 set_memory_options(&ram_slots, &maxram_size, machine_class);
 
 os_daemonize();
-rcu_disable_atfork();
+
+/*
+ * If QTest is enabled, keep the rcu_atfork enabled, since system processes
+ * may be forked for testing purposes (e.g. fork-server based fuzzing). The
+ * fork should happen before a single cpu instruction is executed, to prevent
+ * deadlocks. See commit 73c6e40, rcu: "completely disable pthread_atfork
+ * callbacks as soon as possible"
+ */
+if (!qtest_enabled()) {
+rcu_disable_atfork();
+}
 
 if (pid_file && !qemu_write_pidfile(pid_file, &err)) {
 error_reportf_err(err, "cannot create PID file: ");
-- 
2.25.0




[PATCH v10 04/22] fuzz: add FUZZ_TARGET module type

2020-02-19 Thread Alexander Bulekov
Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 include/qemu/module.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/qemu/module.h b/include/qemu/module.h
index 65ba596e46..684753d808 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -46,6 +46,7 @@ typedef enum {
 MODULE_INIT_TRACE,
 MODULE_INIT_XEN_BACKEND,
 MODULE_INIT_LIBQOS,
+MODULE_INIT_FUZZ_TARGET,
 MODULE_INIT_MAX
 } module_init_type;
 
@@ -56,7 +57,8 @@ typedef enum {
 #define xen_backend_init(function) module_init(function, \
MODULE_INIT_XEN_BACKEND)
 #define libqos_init(function) module_init(function, MODULE_INIT_LIBQOS)
-
+#define fuzz_target_init(function) module_init(function, \
+   MODULE_INIT_FUZZ_TARGET)
 #define block_module_load_one(lib) module_load_one("block-", lib)
 #define ui_module_load_one(lib) module_load_one("ui-", lib)
 #define audio_module_load_one(lib) module_load_one("audio-", lib)
-- 
2.25.0
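
For context, a module type such as the one added above is typically consumed through constructor-based registration. The sketch below is a simplified stand-in, not QEMU's actual module.h: the fixed-size table, the reduced enum and `register_example_target` are assumptions, and `__attribute__((constructor))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    MODULE_INIT_LIBQOS,
    MODULE_INIT_FUZZ_TARGET,
    MODULE_INIT_MAX
} module_init_type;

#define MAX_MODULES 8

typedef struct ModuleEntry {
    void (*init)(void);
    module_init_type type;
} ModuleEntry;

static ModuleEntry modules[MAX_MODULES];
static size_t n_modules;

/* Simplified registry: remember the function until its type is initialised. */
static void register_module_init(void (*fn)(void), module_init_type type)
{
    if (n_modules < MAX_MODULES) {
        modules[n_modules].init = fn;
        modules[n_modules].type = type;
        n_modules++;
    }
}

/* The constructor runs before main(), so registration needs no central list. */
#define module_init(function, type)                                     \
    static void __attribute__((constructor)) do_init_##function(void)  \
    {                                                                   \
        register_module_init(function, type);                          \
    }

#define fuzz_target_init(function) \
    module_init(function, MODULE_INIT_FUZZ_TARGET)

/* Run every registered init function of the requested type. */
static void module_call_init(module_init_type type)
{
    for (size_t i = 0; i < n_modules; i++) {
        if (modules[i].type == type) {
            modules[i].init();
        }
    }
}

static int targets_registered;

/* Hypothetical fuzz target registration function. */
static void register_example_target(void)
{
    targets_registered++;
}
fuzz_target_init(register_example_target)
```

With this arrangement a new fuzz target only has to add one `fuzz_target_init(...)` line in its own source file; the fuzzer later calls `module_call_init(MODULE_INIT_FUZZ_TARGET)` to collect all of them.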




[PATCH v10 02/22] softmmu: split off vl.c:main() into main.c

2020-02-19 Thread Alexander Bulekov
A program might rely on functions implemented in vl.c, but implement its
own main(). By placing main into a separate source file, there are no
complaints about duplicate main()s when linking against vl.o. For
example, the virtual-device fuzzer uses a main() provided by libfuzzer,
and needs to perform some initialization before running the softmmu
initialization. Now, main simply calls three vl.c functions which
handle the guest initialization, main loop and cleanup.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
---
 MAINTAINERS |  1 +
 Makefile.target |  2 +-
 include/sysemu/sysemu.h |  4 
 softmmu/Makefile.objs   |  1 +
 softmmu/main.c  | 53 +
 softmmu/vl.c| 36 +++-
 6 files changed, 69 insertions(+), 28 deletions(-)
 create mode 100644 softmmu/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 98cbeaab97..a8e2a5f8c7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2024,6 +2024,7 @@ F: include/sysemu/runstate.h
 F: util/main-loop.c
 F: util/qemu-timer.c
 F: softmmu/vl.c
+F: softmmu/main.c
 F: qapi/run-state.json
 
 Human Monitor (HMP)
diff --git a/Makefile.target b/Makefile.target
index 06c36d1161..6f4dd72022 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -203,7 +203,7 @@ endif
 COMMON_LDADDS = ../libqemuutil.a
 
 # build either PROG or PROGW
-$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
+$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS) $(softmmu-main-y)
$(call LINK, $(filter-out %.mak, $^))
 ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@,"REZ","$(TARGET_DIR)$@")
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 6358a324a7..3e81a1a79c 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -116,6 +116,10 @@ QemuOpts *qemu_get_machine_opts(void);
 
 bool defaults_enabled(void);
 
+void qemu_init(int argc, char **argv, char **envp);
+void qemu_main_loop(void);
+void qemu_cleanup(void);
+
 extern QemuOptsList qemu_legacy_drive_opts;
 extern QemuOptsList qemu_common_drive_opts;
 extern QemuOptsList qemu_drive_opts;
diff --git a/softmmu/Makefile.objs b/softmmu/Makefile.objs
index d80a5ffe5a..dd15c24346 100644
--- a/softmmu/Makefile.objs
+++ b/softmmu/Makefile.objs
@@ -1,2 +1,3 @@
+softmmu-main-y = softmmu/main.o
 obj-y += vl.o
 vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
diff --git a/softmmu/main.c b/softmmu/main.c
new file mode 100644
index 00..7adc530c73
--- /dev/null
+++ b/softmmu/main.c
@@ -0,0 +1,53 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2020 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "sysemu/sysemu.h"
+
+#ifdef CONFIG_SDL
+#if defined(__APPLE__) || defined(main)
+#include 
+int main(int argc, char **argv)
+{
+return qemu_main(argc, argv, NULL);
+}
+#undef main
+#define main qemu_main
+#endif
+#endif /* CONFIG_SDL */
+
+#ifdef CONFIG_COCOA
+#undef main
+#define main qemu_main
+#endif /* CONFIG_COCOA */
+
+int main(int argc, char **argv, char **envp)
+{
+qemu_init(argc, argv, envp);
+qemu_main_loop();
+qemu_cleanup();
+
+return 0;
+}
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 7dcb0879c4..46a48d09df 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -36,25 +36,6 @@
 #include "sysemu/seccomp.h"
 #include "sysemu/tcg.h"
 
-#ifdef CONFIG_SDL
-#if defined(__APPLE__) || defined(main)
-#include 
-int qemu_main(int argc, char **argv, char **envp);
-int main(int argc, char **argv)
-{
-return qemu_main(argc, argv, NULL);
-}
-#undef main
-#define main qemu_main
-#endif
-#endif /* CONFIG_SDL */
-
-#ifdef CONFIG_COCOA
-#undef main
-#define main qemu_main
-#endif /* CONFIG_COCOA */
-
-
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "sysemu/accel.h"
@@ -1671,7 +1652,7 @@ static 

[PATCH v10 00/22] Add virtual device fuzzing support

2020-02-19 Thread Alexander Bulekov
Hello,

This series adds a framework for coverage-guided fuzzing of
virtual-devices. Fuzzing targets are based on qtest and can make use of
libqos. Fuzzing can help discover device bugs, such as
assertion-failures, timeouts, and overflows, triggerable from within
guests.

V10:
 * Update MAINTAINERS for vl.c, main.c and tests/qtest/fuzz
 * Fix changes to checkpatch
 * Fix typos in virtio-scsi fuzzer

V9:
 * Fix bug in the virtio-scsi fuzzer. Virtqueues were being kicked only
   if free_head != 0 (which it never was).
 * Move vl.c and main.c into a new directory: softmmu/
 * virtio-net-fuzz: refactor the loop over used descriptor.
 * Improve comments for i440fx and virtio-scsi fuzzers.

V8:
 * Small fixes to the virtio-net.
 * Keep rcu_atfork when not using qtest.

V7:
 * virtio-net: add virtio-net-check-used which waits for inputs on
 the tx/ctrl vq by watching the used vring.
 * virtio-net: add virtio-net-socket which uses the socket backend and can
 exercise the rx components of virtio-net.
 * virtio-net: add virtio-net-slirp which uses the user backend and exercises
 slirp. This may lead to real traffic emitted by qemu so it is best to
 run in an isolated network environment.
 * build should succeed after each commit

V5/V6:
 * added virtio-scsi fuzzer
 * add support for using fork-based fuzzers with multiple libfuzzer
   workers
 * misc fixes addressing V4 comments
 * cleanup in-process handlers/globals in libqtest.c
 * small fixes to fork-based fuzzing and support for multiple workers
 * changes to the virtio-net fuzzer to kick after each vq add

V4:
 * add/transfer license headers to new files
 * restructure the added QTestClientTransportOps struct
 * restructure the FuzzTarget struct and fuzzer skeleton
 * fork-based fuzzer now directly mmaps shm over the coverage bitmaps
 * fixes to i440 and virtio-net fuzz targets
 * undo the changes to qtest_memwrite
 * possible to build /fuzz and /all in the same build-dir
 * misc fixes to address V3 comments

V3:
 * rebased onto v4.1.0+
 * add the fuzzer as a new build-target type in the build-system
 * add indirection to qtest client/server communication functions
 * remove ramfile and snapshot-based fuzzing support
 * add i440fx fuzz-target as a reference for developers.
 * add linker-script to assist with fork-based fuzzer

V2:
 * split off changes to qos virtio-net and qtest server to other patches
 * move vl:main initialization into new func: qemu_init
 * moved useful functions from qos-test.c to a separate object
 * use struct of function pointers for add_fuzz_target(), instead of
   arguments
 * move ramfile to migration/qemu-file
 * rewrite fork-based fuzzer pending patch to libfuzzer
 * pass check-patch

Alexander Bulekov (22):
  softmmu: move vl.c to softmmu/
  softmmu: split off vl.c:main() into main.c
  module: check module wasn't already initialized
  fuzz: add FUZZ_TARGET module type
  qtest: add qtest_server_send abstraction
  libqtest: add a layer of abstraction to send/recv
  libqtest: make bufwrite rely on the TransportOps
  qtest: add in-process incoming command handler
  libqos: rename i2c_send and i2c_recv
  libqos: split qos-test and libqos makefile vars
  libqos: move useful qos-test funcs to qos_external
  fuzz: add fuzzer skeleton
  exec: keep ram block across fork when using qtest
  main: keep rcu_atfork callback enabled for qtest
  fuzz: support for fork-based fuzzing.
  fuzz: add support for qos-assisted fuzz targets
  fuzz: add target/fuzz makefile rules
  fuzz: add configure flag --enable-fuzzing
  fuzz: add i440fx fuzz targets
  fuzz: add virtio-net fuzz target
  fuzz: add virtio-scsi fuzz target
  fuzz: add documentation to docs/devel/

 MAINTAINERS                       |  11 +-
 Makefile                          |  15 +-
 Makefile.objs                     |   2 -
 Makefile.target                   |  19 ++-
 configure                         |  39 +
 docs/devel/fuzzing.txt            | 116 ++
 exec.c                            |  12 +-
 include/qemu/module.h             |   4 +-
 include/sysemu/qtest.h            |   4 +
 include/sysemu/sysemu.h           |   4 +
 qtest.c                           |  31 +++-
 scripts/checkpatch.pl             |   2 +-
 scripts/get_maintainer.pl         |   3 +-
 softmmu/Makefile.objs             |   3 +
 softmmu/main.c                    |  53 +++
 vl.c => softmmu/vl.c              |  48 +++---
 tests/qtest/Makefile.include      |  72 -
 tests/qtest/fuzz/Makefile.include |  18 +++
 tests/qtest/fuzz/fork_fuzz.c      |  55 +++
 tests/qtest/fuzz/fork_fuzz.h      |  23 +++
 tests/qtest/fuzz/fork_fuzz.ld     |  37 +
 tests/qtest/fuzz/fuzz.c           | 179 +
 tests/qtest/fuzz/fuzz.h           |  95 +++
 tests/qtest/fuzz/i440fx_fuzz.c    | 193 +++
 tests/qtest/fuzz/qos_fuzz.c       | 234
 tests/qtest/fuzz/qos_fuzz.h       |  33
 test

[PATCH v10 03/22] module: check module wasn't already initialized

2020-02-19 Thread Alexander Bulekov
The virtual-device fuzzer must initialize QOM, prior to running
vl:qemu_init, so that it can use the qos_graph to identify the arguments
required to initialize a guest for libqos-assisted fuzzing. This change
prevents errors when vl:qemu_init tries to (re)initialize the previously
initialized QOM module.

Signed-off-by: Alexander Bulekov 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Darren Kenny 
Reviewed-by: Philippe Mathieu-Daudé 
---
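
A minimal standalone sketch of the "run each init type at most once" guard
described above. The enum, the counters and demo_call_init() are
hypothetical stand-ins and not QEMU's module API; the counter increment
takes the place of walking the list of registered init callbacks.

```c
#include <stdbool.h>

/* Hypothetical init types; QEMU's real module_init_type enum differs. */
typedef enum {
    DEMO_INIT_QOM,
    DEMO_INIT_BLOCK,
    DEMO_INIT_MAX
} demo_init_type;

static bool demo_init_done[DEMO_INIT_MAX];
static int demo_init_runs[DEMO_INIT_MAX];   /* how often init really ran */

static void demo_call_init(demo_init_type type)
{
    if (demo_init_done[type]) {
        return;                 /* second and later calls are no-ops */
    }

    demo_init_runs[type]++;     /* stand-in for calling every e->init() */

    demo_init_done[type] = true;
}
```

With this shape, a fuzzer can initialize one module type early and a later
general initialization pass will not run the same callbacks again.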
 util/module.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/util/module.c b/util/module.c
index 8c5315a7a3..236a7bb52a 100644
--- a/util/module.c
+++ b/util/module.c
@@ -30,6 +30,7 @@ typedef struct ModuleEntry
 typedef QTAILQ_HEAD(, ModuleEntry) ModuleTypeList;
 
 static ModuleTypeList init_type_list[MODULE_INIT_MAX];
+static bool modules_init_done[MODULE_INIT_MAX];
 
 static ModuleTypeList dso_init_list;
 
@@ -91,11 +92,17 @@ void module_call_init(module_init_type type)
 ModuleTypeList *l;
 ModuleEntry *e;
 
+if (modules_init_done[type]) {
+return;
+}
+
 l = find_type(type);
 
 QTAILQ_FOREACH(e, l, node) {
 e->init();
 }
+
+modules_init_done[type] = true;
 }
 
 #ifdef CONFIG_MODULES
-- 
2.25.0




[PATCH v10 01/22] softmmu: move vl.c to softmmu/

2020-02-19 Thread Alexander Bulekov
Move vl.c to a separate directory, similar to linux-user/.
Update the checkpatch and get_maintainer scripts, since they relied on
/vl.c for top_of_tree checks.

Signed-off-by: Alexander Bulekov 
---
 MAINTAINERS               | 2 +-
 Makefile.objs             | 2 --
 Makefile.target           | 1 +
 scripts/checkpatch.pl     | 2 +-
 scripts/get_maintainer.pl | 3 ++-
 softmmu/Makefile.objs     | 2 ++
 vl.c => softmmu/vl.c      | 0
 7 files changed, 7 insertions(+), 5 deletions(-)
 create mode 100644 softmmu/Makefile.objs
 rename vl.c => softmmu/vl.c (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index c7717df720..98cbeaab97 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2023,7 +2023,7 @@ F: include/qemu/main-loop.h
 F: include/sysemu/runstate.h
 F: util/main-loop.c
 F: util/qemu-timer.c
-F: vl.c
+F: softmmu/vl.c
 F: qapi/run-state.json
 
 Human Monitor (HMP)
diff --git a/Makefile.objs b/Makefile.objs
index 26b9cff954..8a1cbe8000 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -58,8 +58,6 @@ common-obj-y += ui/
 common-obj-m += ui/
 
 common-obj-y += dma-helpers.o
-common-obj-y += vl.o
-vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
 common-obj-$(CONFIG_TPM) += tpm.o
 
 common-obj-y += backends/
diff --git a/Makefile.target b/Makefile.target
index 6e61f607b1..06c36d1161 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -160,6 +160,7 @@ obj-y += qapi/
 obj-y += memory.o
 obj-y += memory_mapping.o
 obj-y += migration/ram.o
+obj-y += softmmu/
 LIBS := $(libs_softmmu) $(LIBS)
 
 # Hardware support
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ce43a306f8..c85ad11de1 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -462,7 +462,7 @@ sub top_of_kernel_tree {
my @tree_check = (
"COPYING", "MAINTAINERS", "Makefile",
"README.rst", "docs", "VERSION",
-   "vl.c"
+   "linux-user", "softmmu"
);
 
foreach my $check (@tree_check) {
diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 27991eb1cf..271f5ff42a 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -795,7 +795,8 @@ sub top_of_tree {
 && (-f "${lk_path}Makefile")
 && (-d "${lk_path}docs")
 && (-f "${lk_path}VERSION")
-&& (-f "${lk_path}vl.c")) {
+&& (-d "${lk_path}linux-user/")
+&& (-d "${lk_path}softmmu/")) {
return 1;
 }
 return 0;
diff --git a/softmmu/Makefile.objs b/softmmu/Makefile.objs
new file mode 100644
index 00..d80a5ffe5a
--- /dev/null
+++ b/softmmu/Makefile.objs
@@ -0,0 +1,2 @@
+obj-y += vl.o
+vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
diff --git a/vl.c b/softmmu/vl.c
similarity index 100%
rename from vl.c
rename to softmmu/vl.c
-- 
2.25.0




Re: [PATCH 2/2] aspeed/smc: Fix User mode select/unselect scheme

2020-02-19 Thread Andrew Jeffery



On Thu, 6 Feb 2020, at 21:56, Cédric Le Goater wrote:
> The Aspeed SMC Controller can operate in different modes : Read, Fast
> Read, Write and User modes. When the User mode is configured, it
> selects automatically the SPI slave device until the CE_STOP_ACTIVE
> bit is set to 1. When any other modes are configured the device is
> unselected. The HW logic handles the chip select automatically when
> the flash is accessed through its AHB window.
> 
> When configuring the CEx Control Register, the User mode logic to
> select and unselect the slave is incorrect and data corruption can be
> seen on machines using two chips, witherspoon and romulus.
> 
> Rework the handler setting the CEx Control Register to fix this issue.
> 
> Fixes: 7c1c69bca43c ("ast2400: add SMC controllers (FMC and SPI)")
> Signed-off-by: Cédric Le Goater 

Champion!

Reviewed-by: Andrew Jeffery 



Re: [PATCH 1/2] aspeed/smc: Add some tracing

2020-02-19 Thread Andrew Jeffery



On Thu, 6 Feb 2020, at 21:56, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Andrew Jeffery 



Re: The issues about architecture of the COLO checkpoint

2020-02-19 Thread Daniel Cho
Hi Hailiang,

I have already patched the file to my branch, but there is a problem while
doing migration.
Here is the error message from SVM
"qemu-system-x86_64: /root/download/qemu-4.1.0/memory.c:1079:
memory_region_transaction_commit: Assertion `qemu_mutex_iothread_locked()'
failed."

Do you have this problem?

Best regards,
Daniel Cho

Daniel Cho wrote on Thu, Feb 20, 2020 at 11:49 AM:

> Hi Zhang,
>
> Thanks, I will configure on code for testing first.
> However, if you have free time, could you please send the patch file to
> us, Thanks.
>
> Best Regard,
> Daniel Cho
>
>
> Zhang, Chen wrote on Thu, Feb 20, 2020 at 11:07 AM:
>
>>
>> On 2/18/2020 5:22 PM, Daniel Cho wrote:
>>
>> Hi Hailiang,
>> Thanks for your help. If we have any problems we will contact you for
>> your favor.
>>
>>
>> Hi Zhang,
>>
>> " If colo-compare got a primary packet without related secondary packet
>> in a certain time , it will automatically trigger checkpoint.  "
>> As you said, the colo-compare will trigger checkpoint, but does it need
>> to limit checkpoint times?
>> There is a problem about doing many checkpoints while we use fio to
>> random write files. Then it will cause low throughput on PVM.
>> Is this situation is normal on COLO?
>>
>>
>> Hi Daniel,
>>
>> The checkpoint time is designed to be user adjustable based on user
>> environment(workload/network status/business conditions...).
>>
>> In net/colo-compare.c
>>
>> /* TODO: Should be configurable */
>> #define REGULAR_PACKET_CHECK_MS 3000
>>
>> If you need, I can send a patch for this issue. Make users can change the
>> value by QMP and qemu monitor commands.
>>
>> Thanks
>>
>> Zhang Chen
>>
>>
>>
>> Best regards,
>> Daniel Cho
>>
>> Zhang, Chen wrote on Mon, Feb 17, 2020 at 1:36 PM:
>>
>>>
>>> On 2/15/2020 11:35 AM, Daniel Cho wrote:
>>>
>>> Hi Dave,
>>>
>>> Yes, I agree with you, it does need a timeout.
>>>
>>>
>>> Hi Daniel and Dave,
>>>
>>> Current colo-compare already have the timeout mechanism.
>>>
>>> Named packet_check_timer,  It will scan primary packet queue to make
>>> sure all the primary packet not stay too long time.
>>>
>>> If colo-compare got a primary packet without related secondary packet in
>>> a certain time , it will automatic trigger checkpoint.
>>>
>>> https://github.com/qemu/qemu/blob/master/net/colo-compare.c#L847
>>>
>>>
>>> Thanks
>>>
>>> Zhang Chen
>>>
>>>
>>>
>>> Hi Hailiang,
>>>
>>> We base on qemu-4.1.0 for using COLO feature, in your patch, we found a
>>> lot of difference  between your version and ours.
>>> Could you give us a latest release version which is close your
>>> developing code?
>>>
>>> Thanks.
>>>
>>> Regards
>>> Daniel Cho
>>>
>>> Dr. David Alan Gilbert wrote on Thu, Feb 13, 2020 at 6:38 PM:
>>>
 * Daniel Cho (daniel...@qnap.com) wrote:
 > Hi Hailiang,
 >
 > 1.
 > OK, we will try the patch
 > “0001-COLO-Optimize-memory-back-up-process.patch”,
 > and thanks for your help.
 >
 > 2.
 > We understand the reason to compare PVM and SVM's packet.
 However, the
 > empty of SVM's packet queue might happened on setting COLO feature
 and SVM
 > broken.
 >
 > On situation 1 ( setting COLO feature ):
 > We could force do checkpoint after setting COLO feature finish,
 then it
 > will protect the state of PVM and SVM . As the Zhang Chen said.
 >
 > On situation 2 ( SVM broken ):
 > COLO will do failover for PVM, so it might not cause any wrong on
 PVM.
 >
 > However, those situations are our views, so there might be a big
 difference
 > between reality and our views.
 > If we have any wrong views and opinions, please let us know, and
 correct
 > us.

 It does need a timeout; the SVM being broken or being in a state where
 it never sends the corresponding packet (because of a state difference)
 can happen and COLO needs to timeout when the packet hasn't arrived
 after a while and trigger the checkpoint.

 Dave

 > Thanks.
 >
 > Best regards,
 > Daniel Cho
 >
 > Zhang, Chen wrote on Thu, Feb 13, 2020 at 10:17 AM:
 >
 > > Add cc Jason Wang, he is a network expert.
 > >
 > > In case some network things goes wrong.
 > >
 > >
 > >
 > > Thanks
 > >
 > > Zhang Chen
 > >
 > >
 > >
 > > *From:* Zhang, Chen
 > > *Sent:* Thursday, February 13, 2020 10:10 AM
 > > *To:* 'Zhanghailiang' ; Daniel Cho
 <
 > > daniel...@qnap.com>
 > > *Cc:* Dr. David Alan Gilbert ;
 qemu-devel@nongnu.org
 > > *Subject:* RE: The issues about architecture of the COLO checkpoint
 > >
 > >
 > >
 > > For the issue 2:
 > >
 > >
 > >
 > > COLO need use the network packets to confirm PVM and SVM in the
 same state,
 > >
 > > Generally speaking, we can’t send PVM packets without compared with
 SVM
 > > packets.
 > >
 > > But to prevent jamming, I think COLO can do force checkpoint and
 send the
 > 
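
The timeout behaviour discussed in this thread (scan the primary packet
queue and force a checkpoint when a packet has waited too long without a
matching secondary packet) can be sketched as below. This is only an
illustration under assumptions: DemoPacket and demo_needs_checkpoint() are
hypothetical and do not mirror the actual structures in
net/colo-compare.c, and the timeout is passed in as a parameter rather than
being a compile-time constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet record; the real colo-compare structures differ. */
typedef struct DemoPacket {
    int64_t arrival_ms;   /* when the primary packet was queued */
    bool matched;         /* true once a secondary packet was paired with it */
} DemoPacket;

/*
 * Return true if any unmatched primary packet has waited longer than
 * timeout_ms, meaning a checkpoint should be forced.
 */
static bool demo_needs_checkpoint(const DemoPacket *queue, size_t n,
                                  int64_t now_ms, int64_t timeout_ms)
{
    for (size_t i = 0; i < n; i++) {
        if (!queue[i].matched &&
            now_ms - queue[i].arrival_ms > timeout_ms) {
            return true;
        }
    }
    return false;
}
```

Taking the timeout as an argument is one way to make the interval
adjustable per workload, which is the direction the thread suggests.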

Re: The issues about architecture of the COLO checkpoint

2020-02-19 Thread Daniel Cho
Hi Zhang,

Thanks, I will configure on code for testing first.
However, if you have free time, could you please send the patch file to us,
Thanks.

Best Regard,
Daniel Cho


Zhang, Chen wrote on Thu, Feb 20, 2020 at 11:07 AM:

>
> On 2/18/2020 5:22 PM, Daniel Cho wrote:
>
> Hi Hailiang,
> Thanks for your help. If we have any problems we will contact you for your
> favor.
>
>
> Hi Zhang,
>
> " If colo-compare got a primary packet without related secondary packet in
> a certain time , it will automatically trigger checkpoint.  "
> As you said, the colo-compare will trigger checkpoint, but does it need to
> limit checkpoint times?
> There is a problem about doing many checkpoints while we use fio to random
> write files. Then it will cause low throughput on PVM.
> Is this situation is normal on COLO?
>
>
> Hi Daniel,
>
> The checkpoint time is designed to be user adjustable based on user
> environment(workload/network status/business conditions...).
>
> In net/colo-compare.c
>
> /* TODO: Should be configurable */
> #define REGULAR_PACKET_CHECK_MS 3000
>
> If you need, I can send a patch for this issue. Make users can change the
> value by QMP and qemu monitor commands.
>
> Thanks
>
> Zhang Chen
>
>
>
> Best regards,
> Daniel Cho
>
> Zhang, Chen wrote on Mon, Feb 17, 2020 at 1:36 PM:
>
>>
>> On 2/15/2020 11:35 AM, Daniel Cho wrote:
>>
>> Hi Dave,
>>
>> Yes, I agree with you, it does need a timeout.
>>
>>
>> Hi Daniel and Dave,
>>
>> Current colo-compare already have the timeout mechanism.
>>
>> Named packet_check_timer,  It will scan primary packet queue to make sure
>> all the primary packet not stay too long time.
>>
>> If colo-compare got a primary packet without related secondary packet in
>> a certain time , it will automatic trigger checkpoint.
>>
>> https://github.com/qemu/qemu/blob/master/net/colo-compare.c#L847
>>
>>
>> Thanks
>>
>> Zhang Chen
>>
>>
>>
>> Hi Hailiang,
>>
>> We base on qemu-4.1.0 for using COLO feature, in your patch, we found a
>> lot of difference  between your version and ours.
>> Could you give us a latest release version which is close your developing
>> code?
>>
>> Thanks.
>>
>> Regards
>> Daniel Cho
>>
>> Dr. David Alan Gilbert wrote on Thu, Feb 13, 2020 at 6:38 PM:
>>
>>> * Daniel Cho (daniel...@qnap.com) wrote:
>>> > Hi Hailiang,
>>> >
>>> > 1.
>>> > OK, we will try the patch
>>> > “0001-COLO-Optimize-memory-back-up-process.patch”,
>>> > and thanks for your help.
>>> >
>>> > 2.
>>> > We understand the reason to compare PVM and SVM's packet. However,
>>> the
>>> > empty of SVM's packet queue might happened on setting COLO feature and
>>> SVM
>>> > broken.
>>> >
>>> > On situation 1 ( setting COLO feature ):
>>> > We could force do checkpoint after setting COLO feature finish,
>>> then it
>>> > will protect the state of PVM and SVM . As the Zhang Chen said.
>>> >
>>> > On situation 2 ( SVM broken ):
>>> > COLO will do failover for PVM, so it might not cause any wrong on
>>> PVM.
>>> >
>>> > However, those situations are our views, so there might be a big
>>> difference
>>> > between reality and our views.
>>> > If we have any wrong views and opinions, please let us know, and
>>> correct
>>> > us.
>>>
>>> It does need a timeout; the SVM being broken or being in a state where
>>> it never sends the corresponding packet (because of a state difference)
>>> can happen and COLO needs to timeout when the packet hasn't arrived
>>> after a while and trigger the checkpoint.
>>>
>>> Dave
>>>
>>> > Thanks.
>>> >
>>> > Best regards,
>>> > Daniel Cho
>>> >
>>> > Zhang, Chen wrote on Thu, Feb 13, 2020 at 10:17 AM:
>>> >
>>> > > Add cc Jason Wang, he is a network expert.
>>> > >
>>> > > In case some network things goes wrong.
>>> > >
>>> > >
>>> > >
>>> > > Thanks
>>> > >
>>> > > Zhang Chen
>>> > >
>>> > >
>>> > >
>>> > > *From:* Zhang, Chen
>>> > > *Sent:* Thursday, February 13, 2020 10:10 AM
>>> > > *To:* 'Zhanghailiang' ; Daniel Cho <
>>> > > daniel...@qnap.com>
>>> > > *Cc:* Dr. David Alan Gilbert ;
>>> qemu-devel@nongnu.org
>>> > > *Subject:* RE: The issues about architecture of the COLO checkpoint
>>> > >
>>> > >
>>> > >
>>> > > For the issue 2:
>>> > >
>>> > >
>>> > >
>>> > > COLO need use the network packets to confirm PVM and SVM in the same
>>> state,
>>> > >
>>> > > Generally speaking, we can’t send PVM packets without compared with
>>> SVM
>>> > > packets.
>>> > >
>>> > > But to prevent jamming, I think COLO can do force checkpoint and
>>> send the
>>> > > PVM packets in this case.
>>> > >
>>> > >
>>> > >
>>> > > Thanks
>>> > >
>>> > > Zhang Chen
>>> > >
>>> > >
>>> > >
>>> > > *From:* Zhanghailiang 
>>> > > *Sent:* Thursday, February 13, 2020 9:45 AM
>>> > > *To:* Daniel Cho 
>>> > > *Cc:* Dr. David Alan Gilbert ;
>>> qemu-devel@nongnu.org;
>>> > > Zhang, Chen 
>>> > > *Subject:* RE: The issues about architecture of the COLO checkpoint
>>> > >
>>> > >
>>> > >
>>> > > Hi,
>>> > >
>>> > >
>>> > >
>>> > > 1.   After re-walked through the codes, yes, you are right,
>>> actually,
>>> > > after the first migration,

Re: [PATCH V4 0/5] Introduce Advanced Watch Dog module

2020-02-19 Thread Zhang, Chen



On 2/12/2020 10:56 AM, Jason Wang wrote:

On 2020/2/11 4:58 PM, Zhang, Chen wrote:

-Original Message-
From: Jason Wang
Sent: Monday, January 20, 2020 10:57 AM
To: Zhang, Chen; Paolo Bonzini
; Philippe Mathieu-Daudé;
qemu-dev
Cc: Zhang Chen
Subject: Re: [PATCH V4 0/5] Introduce Advanced Watch Dog module


On 2020/1/19 5:10 PM, Zhang, Chen wrote:

Hi~

Anyone have comments about this module?

Hi Chen:

I will take a look at this series.

Sorry for the slow reply due to CNY and extended leave.
OK, waiting your comments~ Thanks~


Two general questions:

- if it can detect more than network stall, it should not belong to /net

This module uses network connection status to detect all of these issues
(Host to Guest/Host to Host/Host to Admin...).
The target is more than the network, but everything is detected through the
network. So it looks like a tricky problem.


Ok.



- need to convince the libvirt folks about this proposal, since usually it's
the duty of the upper layer instead of qemu itself


Yes, it looks like an upper-layer responsibility, but in the cover letter I
have explained the reason why we need this in Qemu.
We try to make this module as simple as possible. This module gives
upper-layer software a new way to connect to and monitor Qemu.
And since all the COLO code is implemented on the Qemu side, many customers
want to use this FT solution without other dependencies;
it is very easy to integrate into a real product.

Thanks
Zhang Chen


I would like to hear from libvirt about such design.



Hi Jason,

OK. I add the libvirt mailing list in this thread.

The full mail discussion and patches:

https://lists.nongnu.org/archive/html/qemu-devel/2020-02/msg02611.html


By the way, I noticed Eric is a libvirt maintainer.

Hi Eric and Paolo, Can you give some comments about this series?


Thanks

Zhang Chen




Thanks





[PATCH v5 16/18] spapr: Don't clamp RMA to 16GiB on new machine types

2020-02-19 Thread David Gibson
In spapr_machine_init() we clamp the size of the RMA to 16GiB and the
comment saying why doesn't make a whole lot of sense.  In fact, this was
done because the real mode handling code elsewhere limited the RMA in TCG
mode to the maximum value configurable in LPCR[RMLS], 16GiB.

But,
 * Actually LPCR[RMLS] has been able to encode a 256GiB size for a very
   long time, we just didn't implement it properly in the softmmu
 * LPCR[RMLS] shouldn't really be relevant anyway, it only was because we
   used to abuse the RMOR based translation mode in order to handle the
   fact that we're not modelling the hypervisor parts of the cpu

We've now removed those limitations in the modelling so the 16GiB clamp no
longer serves a function.  However, we can't just remove the limit
universally: that would break migration to earlier qemu versions, where
the 16GiB RMLS limit still applies, no matter how bad the reasons for it
are.

So, we replace the 16GiB clamp, with a clamp to a limit defined in the
machine type class.  We set it to 16 GiB for machine types 4.2 and earlier,
but set it to 0 meaning unlimited for the new 5.0 machine type.

Signed-off-by: David Gibson 
---
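
To make the "limit lives in the machine type class, 0 means unlimited"
idea concrete, here is a small standalone sketch. DemoMachineClass, the two
example instances and demo_clamp_rma() are hypothetical names chosen for
illustration and are not the QEMU types touched by this patch.

```c
#include <stdint.h>

#define DEMO_GIB (1ULL << 30)

/* Hypothetical stand-in for per-machine-type class data. */
typedef struct DemoMachineClass {
    const char *name;
    uint64_t rma_limit;     /* 0 means "no limit" */
} DemoMachineClass;

/* Older machine type keeps the historical 16 GiB clamp ... */
static const DemoMachineClass demo_machine_old = { "demo-4.2", 16 * DEMO_GIB };
/* ... while the new one is unlimited. */
static const DemoMachineClass demo_machine_new = { "demo-5.0", 0 };

static uint64_t demo_clamp_rma(const DemoMachineClass *mc, uint64_t rma_size)
{
    if (mc->rma_limit && rma_size > mc->rma_limit) {
        return mc->rma_limit;
    }
    return rma_size;
}
```

Keeping the old value on older machine types is what preserves migration
compatibility: a guest started as an older type sees the same RMA on both
sides of the migration.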
 hw/ppc/spapr.c         | 13 -
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4dab489931..6e9f15f64d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2702,12 +2702,14 @@ static void spapr_machine_init(MachineState *machine)
 
 spapr->rma_size = node0_size;
 
-/* Actually we don't support unbounded RMA anymore since we added
- * proper emulation of HV mode. The max we can get is 16G which
- * also happens to be what we configure for PAPR mode so make sure
- * we don't do anything bigger than that
+/*
+ * Clamp the RMA size based on machine type.  This is for
+ * migration compatibility with older qemu versions, which limited
+ * the RMA size for complicated and mostly bad reasons.
  */
-spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);
+if (smc->rma_limit) {
+spapr->rma_size = MIN(spapr->rma_size, smc->rma_limit);
+}
 
 if (spapr->rma_size > node0_size) {
 error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
@@ -4600,6 +4602,7 @@ static void spapr_machine_4_2_class_options(MachineClass *mc)
 compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
 smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF;
+smc->rma_limit = 16 * GiB;
 mc->nvdimm_supported = false;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index fc49c1a710..8a44a1f488 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -126,6 +126,7 @@ struct SpaprMachineClass {
 bool pre_4_1_migration; /* don't migrate hpt-max-page-size */
 bool linux_pci_probe;
 bool smp_threads_vsmt; /* set VSMT to smp_threads by default */
+hwaddr rma_limit;  /* clamp the RMA to this size */
 
 void (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
   uint64_t *buid, hwaddr *pio, 
-- 
2.24.1




[PATCH v5 17/18] spapr: Clean up RMA size calculation

2020-02-19 Thread David Gibson
Move the calculation of the Real Mode Area (RMA) size into a helper
function.  While we're there clean it up and correct it in a few ways:
  * Add comments making it clearer where the various constraints come from
  * Remove a pointless check that the RMA fits within Node 0 (we've just
clamped it so that it does)

Signed-off-by: David Gibson 
---
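
The chain of constraints described above can be written as a standalone
function, shown here as a sketch under assumptions rather than the code in
the patch: every candidate limit is applied to one local value, and the
firmware minimum is checked last. The names demo_rma_size() and DEMO_MIN_RMA
are hypothetical, the 128 MiB minimum is an assumed illustrative value, and
errors are reported through a plain string pointer instead of QEMU's Error
API.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MIB (1ULL << 20)
#define DEMO_TIB (1ULL << 40)
#define DEMO_MIN_RMA (128 * DEMO_MIB)   /* assumed firmware minimum */

static uint64_t demo_min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Returns the RMA size in bytes, or 0 on failure with *err pointing to a
 * static message. machine_limit == 0 means "no machine-type limit".
 */
static uint64_t demo_rma_size(uint64_t ram_size, uint64_t node0_size,
                              uint64_t machine_limit, const char **err)
{
    uint64_t rma = ram_size;

    *err = NULL;

    /* The RMA has to fit in the first NUMA node. */
    rma = demo_min_u64(rma, node0_size);

    /* A single 1 TiB segment mapping bounds the RMA. */
    rma = demo_min_u64(rma, DEMO_TIB);

    /* Optional clamp for compatibility with older machine types. */
    if (machine_limit) {
        rma = demo_min_u64(rma, machine_limit);
    }

    if (rma < DEMO_MIN_RMA) {
        *err = "RMA is smaller than the firmware minimum";
        return 0;
    }

    return rma;
}
```

Because every step narrows the same local variable, a separate "does the
RMA fit in node 0" check afterwards would be redundant.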
 hw/ppc/spapr.c | 59 ++
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6e9f15f64d..f0354b699d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2648,6 +2648,40 @@ static PCIHostState *spapr_create_default_phb(void)
 return PCI_HOST_BRIDGE(dev);
 }
 
+static hwaddr spapr_rma_size(SpaprMachineState *spapr, Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+hwaddr rma_size = machine->ram_size;
+hwaddr node0_size = spapr_node0_size(machine);
+
+/* RMA has to fit in the first NUMA node */
+rma_size = MIN(rma_size, node0_size);
+
+/*
+ * VRMA access is via a special 1TiB SLB mapping, so the RMA can
+ * never exceed that
+ */
+rma_size = MIN(rma_size, TiB);
+
+/*
+ * Clamp the RMA size based on machine type.  This is for
+ * migration compatibility with older qemu versions, which limited
+ * the RMA size for complicated and mostly bad reasons.
+ */
+if (smc->rma_limit) {
+rma_size = MIN(rma_size, smc->rma_limit);
+}
+
+if (rma_size < (MIN_RMA_SLOF * MiB)) {
+error_setg(errp,
+"pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area)",
+   MIN_RMA_SLOF);
+return -1;
+}
+
+return rma_size;
+}
+
 /* pSeries LPAR / sPAPR hardware init */
 static void spapr_machine_init(MachineState *machine)
 {
@@ -2660,7 +2694,6 @@ static void spapr_machine_init(MachineState *machine)
 int i;
 MemoryRegion *sysmem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
-hwaddr node0_size = spapr_node0_size(machine);
 long load_limit, fw_size;
 char *filename;
 Error *resize_hpt_err = NULL;
@@ -2700,22 +2733,7 @@ static void spapr_machine_init(MachineState *machine)
 exit(1);
 }
 
-spapr->rma_size = node0_size;
-
-/*
- * Clamp the RMA size based on machine type.  This is for
- * migration compatibility with older qemu versions, which limited
- * the RMA size for complicated and mostly bad reasons.
- */
-if (smc->rma_limit) {
-spapr->rma_size = MIN(spapr->rma_size, smc->rma_limit);
-}
-
-if (spapr->rma_size > node0_size) {
-error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
- spapr->rma_size);
-exit(1);
-}
+spapr->rma_size = spapr_rma_size(spapr, &error_fatal);
 
 /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
 load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
@@ -2954,13 +2972,6 @@ static void spapr_machine_init(MachineState *machine)
 }
 }
 
-if (spapr->rma_size < MIN_RMA_SLOF) {
-error_report(
-"pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area memory)",
-MIN_RMA_SLOF / MiB);
-exit(1);
-}
-
 if (kernel_filename) {
 uint64_t lowaddr = 0;
 
-- 
2.24.1




[PATCH v5 03/18] target/ppc: Correct handling of real mode accesses with vhyp on hash MMU

2020-02-19 Thread David Gibson
On ppc we have the concept of virtual hypervisor ("vhyp") mode, where we
only model the non-hypervisor-privileged parts of the cpu.  Essentially we
model the hypervisor's behaviour from the point of view of a guest OS, but
we don't model the hypervisor's execution.

In particular, in this mode, qemu's notion of target physical address is
a guest physical address from the vcpu's point of view.  So accesses in
guest real mode don't require translation.  If we were modelling the
hypervisor mode, we'd need to translate the guest physical address into
a host physical address.

Currently, we handle this sloppily: we rely on setting up the virtual LPCR
and RMOR registers so that GPAs are simply HPAs plus an offset, which we
set to zero.  This is already conceptually dubious, since the LPCR and RMOR
registers don't exist in the non-hypervisor portion of the CPU.  It gets
worse with POWER9, where RMOR and LPCR[VPM0] no longer exist at all.

Clean this up by explicitly handling the vhyp case.  While we're there,
remove some unnecessary nesting of if statements that made the logic to
select the correct real mode behaviour a bit less clear than it could be.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
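
The decision order described above (vhyp first, then hypervisor real mode,
then VRMA, then the old RMO bounds check) can be sketched as a flat if/else
chain. This is a simplified illustration: DemoCpu and demo_real_mode() are
hypothetical, the hv_mode flag folds together the patch's
"msr_hv || !env->has_hv_mode" condition, and the VRMA case only reports
that the special SLB entry must be used instead of performing the lookup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced CPU state. */
typedef struct DemoCpu {
    bool vhyp;            /* virtual hypervisor: only guest side is modelled */
    bool hv_mode;         /* access is made in hypervisor real mode */
    bool vpm0;            /* stand-in for LPCR[VPM0] */
    uint64_t hrmor;
    uint64_t rmor;
    uint64_t rmls_limit;  /* size of the RMO area in bytes */
} DemoCpu;

typedef enum {
    DEMO_REAL_OK,         /* *raddr holds the final address */
    DEMO_REAL_USE_VRMA,   /* caller must translate via the VRMA SLB entry */
    DEMO_REAL_FAULT       /* access is outside the real mode area */
} DemoRealResult;

static DemoRealResult demo_real_mode(const DemoCpu *cpu, uint64_t eaddr,
                                     uint64_t *raddr)
{
    /* In real mode the top 4 effective address bits are ignored. */
    *raddr = eaddr & 0x0FFFFFFFFFFFFFFFULL;

    if (cpu->vhyp) {
        /* Nothing to do: EA == guest physical address. */
        return DEMO_REAL_OK;
    } else if (cpu->hv_mode) {
        /* Add HRMOR only if the top EA bit is clear. */
        if (!(eaddr >> 63)) {
            *raddr |= cpu->hrmor;
        }
        return DEMO_REAL_OK;
    } else if (cpu->vpm0) {
        return DEMO_REAL_USE_VRMA;
    } else {
        /* Old-style RMO mode: bounds check, then apply the offset. */
        if (*raddr >= cpu->rmls_limit) {
            return DEMO_REAL_FAULT;
        }
        *raddr |= cpu->rmor;
        return DEMO_REAL_OK;
    }
}
```

Putting the vhyp test first means the LPCR and RMOR values are never
consulted for a guest-only model, which is the point of the cleanup.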
 target/ppc/mmu-hash64.c | 60 -
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 3e0be4d55f..392f90e0ae 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -789,27 +789,30 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
  */
 raddr = eaddr & 0x0FFFFFFFFFFFFFFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if (msr_hv || !env->has_hv_mode) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+} else if (msr_hv || !env->has_hv_mode) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else {
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
-slb = &env->vrma_slb;
-if (slb->sps) {
-goto skip_slb_search;
-}
-/* Not much else to do here */
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
+slb = &env->vrma_slb;
+if (!slb->sps) {
+/* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
 return 1;
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-raddr |= env->spr[SPR_RMOR];
-} else {
-/* The access failed, generate the approriate interrupt */
+}
+
+goto skip_slb_search;
+} else {
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= env->rmls) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -821,6 +824,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 }
 return 1;
 }
+
+raddr |= env->spr[SPR_RMOR];
 }
 tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
  PAGE_READ | PAGE_WRITE | PAGE_EXEC, mmu_idx,
@@ -953,22 +958,27 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 /* In real mode the top 4 effective address bits are ignored */
 raddr = addr & 0x0FFFFFFFFFFFFFFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+return raddr;
+} else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-}
-
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
 return -1;
 }
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-return raddr | env->spr[SPR_RMOR];
 } else {
-return -1;
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= 

[PATCH v5 11/18] target/ppc: Streamline construction of VRMA SLB entry

2020-02-19 Thread David Gibson
When in VRMA mode (i.e. a guest thinks it has the MMU off, but the
hypervisor is still applying translation) we use a special SLB entry,
rather than looking up an SLBE by address as we do when guest translation
is on.

We build that special entry in ppc_hash64_update_vrma() along with some
logic for handling some non-VRMA cases.  Split the actual build of the
VRMA SLBE into a separate helper and streamline it a bit.

Signed-off-by: David Gibson 
---
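
As a standalone illustration of the split described above: one helper
derives the VSID from a page-size field in LPCR and searches a page-size
table for a matching encoding. The sketch makes several assumptions. All
DEMO_* constants are illustrative placeholders rather than values copied
from QEMU's headers, DemoPageSize and DemoSlb are hypothetical reductions
of the real structures, and unlike the patch the helper clears the entry
itself on failure.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative placeholders; the authoritative values are in QEMU's headers. */
#define DEMO_VRMASD_SHIFT 47
#define DEMO_VRMASD_MASK  (0x1FULL << DEMO_VRMASD_SHIFT)
#define DEMO_VSID_VRMA    0x0001FFFFFF000000ULL
#define DEMO_LLP_MASK     0x130ULL

typedef struct DemoPageSize {
    unsigned page_shift;    /* 0 terminates the table */
    uint64_t slb_enc;       /* encoding expected in the VSID L/LP bits */
} DemoPageSize;

typedef struct DemoSlb {
    uint64_t vsid;
    const DemoPageSize *sps;
} DemoSlb;

/* Returns 0 and fills *slb on success, -1 (entry cleared) otherwise. */
static int demo_build_vrma(uint64_t lpcr, const DemoPageSize *table,
                           size_t n, DemoSlb *slb)
{
    uint64_t vrmasd = (lpcr & DEMO_VRMASD_MASK) >> DEMO_VRMASD_SHIFT;
    uint64_t vsid = DEMO_VSID_VRMA | ((vrmasd << 4) & DEMO_LLP_MASK);

    for (size_t i = 0; i < n && table[i].page_shift; i++) {
        if ((vsid & DEMO_LLP_MASK) == table[i].slb_enc) {
            slb->vsid = vsid;
            slb->sps = &table[i];
            return 0;
        }
    }

    /* No page size matches the encoding: leave the entry invalid. */
    slb->vsid = 0;
    slb->sps = NULL;
    return -1;
}
```

A NULL sps pointer then doubles as the "invalid VRMA setup" marker that the
fault path can test.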
 target/ppc/mmu-hash64.c | 74 +++--
 1 file changed, 34 insertions(+), 40 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 203a41cca1..ac21c14f68 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -791,6 +791,35 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 }
 }
 
+static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
+{
+CPUPPCState *env = &cpu->env;
+target_ulong lpcr = env->spr[SPR_LPCR];
+uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
+target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & SLB_VSID_LLP_MASK);
+int i;
+
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const PPCHash64SegmentPageSizes *sps = &cpu->hash64_opts->sps[i];
+
+if (!sps->page_shift) {
+break;
+}
+
+if ((vsid & SLB_VSID_LLP_MASK) == sps->slb_enc) {
+slb->esid = SLB_ESID_V;
+slb->vsid = vsid;
+slb->sps = sps;
+return 0;
+}
+}
+
+error_report("Bad page size encoding in LPCR[VRMASD]; LPCR=0x"
+ TARGET_FMT_lx"\n", lpcr);
+
+return -1;
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1046,53 +1075,18 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
-const PPCHash64SegmentPageSizes *sps = NULL;
-target_ulong esid, vsid, lpcr;
 ppc_slb_t *slb = &env->vrma_slb;
-uint32_t vrmasd;
-int i;
-
-/* First clear it */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
 
 /* Is VRMA enabled ? */
 if (ppc_hash64_use_vrma(env)) {
-return;
-}
-
-/*
- * Make one up. Mostly ignore the ESID which will not be needed
- * for translation
- */
-lpcr = env->spr[SPR_LPCR];
-vsid = SLB_VSID_VRMA;
-vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
-vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-esid = SLB_ESID_V;
-
-for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
-const PPCHash64SegmentPageSizes *sps1 = &cpu->hash64_opts->sps[i];
-
-if (!sps1->page_shift) {
-break;
-}
-
-if ((vsid & SLB_VSID_LLP_MASK) == sps1->slb_enc) {
-sps = sps1;
-break;
+if (build_vrma_slbe(cpu, slb) == 0) {
+return;
 }
 }
 
-if (!sps) {
-error_report("Bad page size encoding esid 0x"TARGET_FMT_lx
- " vsid 0x"TARGET_FMT_lx, esid, vsid);
-return;
-}
-
-slb->vsid = vsid;
-slb->esid = esid;
-slb->sps = sps;
+/* Otherwise, clear it to indicate error */
+slb->esid = slb->vsid = 0;
+slb->sps = NULL;
 }
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
-- 
2.24.1




[PATCH v5 15/18] spapr: Don't attempt to clamp RMA to VRMA constraint

2020-02-19 Thread David Gibson
The Real Mode Area (RMA) is the part of memory which a guest can access
when in real (MMU off) mode.  Of course, for a guest under KVM, the MMU
isn't really turned off, it's just in a special translation mode - Virtual
Real Mode Area (VRMA) - which looks like real mode in guest mode.

The mechanics of how this works when using the hash MMU (HPT) put a
constraint on the size of the RMA, which depends on the size of the
HPT.  So, the latter part of spapr_setup_hpt_and_vrma() clamps the RMA
we advertise to the guest based on this VRMA limit.

There are several things wrong with this:
 1) spapr_setup_hpt_and_vrma() doesn't actually clamp, it takes the minimum
of Node 0 memory size and the VRMA limit.  That will *often* work the
same as clamping, but there can be other constraints on RMA size which
supersede Node 0 memory size.  We have real bugs caused by this
(currently worked around in the guest kernel)
 2) Some callers of spapr_setup_hpt_and_vrma() are in a situation where
we're past the point that we can actually advertise an RMA limit to the
guest
 3) But most fundamentally, the VRMA limit depends on host configuration
(page size) which shouldn't be visible to the guest, but this partially
exposes it.  This can cause problems with migration in certain edge
cases, although we will mostly get away with it.

In practice, this clamping is almost never applied anyway.  With 64KiB
pages and the normal rules for sizing of the HPT, the theoretical VRMA
limit will be 4x(guest memory size) and so never hit.  It will hit with
4KiB pages, where it will be (guest memory size)/4.  However all mainstream
distro kernels for POWER have used a 64KiB page size for at least 10 years.

So, simply replace this logic with a check that the RMA we've calculated
based only on guest visible configuration will fit within the host implied
VRMA limit.  This can break if running HPT guests on a host kernel with
4KiB page size.  As noted that's very rare.  There also exist several
possible workarounds:
  * Change the host kernel to use 64KiB pages
  * Use radix MMU (RPT) guests instead of HPT
  * Use 64KiB hugepages on the host to back guest memory
  * Increase the guest memory size so that the RMA hits one of the fixed
limits before the VRMA limit.  This is relatively easy on POWER8 which
has a 16GiB limit, harder on POWER9 which has a 1TiB limit.
  * Use a guest NUMA configuration which artificially constrains the RMA
within the VRMA limit (the RMA must always fit within Node 0).
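
To make the 4x and 1/4 figures quoted earlier concrete, here is a small
standalone sketch.  It reuses the shift arithmetic of kvmppc_vrma_limit()
and assumes (this is my assumption, not something stated in this patch) that
the "normal rules" size the HPT at 1/128 of a power-of-two guest RAM size.
The demo_* names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the HPT is sized at 1/128 of a power-of-two guest RAM size */
static unsigned demo_hpt_shift(unsigned ram_shift)
{
    return ram_shift - 7;
}

/* Same shift arithmetic as kvmppc_vrma_limit() in this series */
static uint64_t demo_vrma_limit(unsigned host_page_shift, unsigned hpt_shift)
{
    return UINT64_C(1) << (host_page_shift + hpt_shift - 7);
}

/* Check (rather than clamp): does the guest-visible RMA fit in the VRMA? */
static bool demo_rma_fits(uint64_t rma_size, unsigned host_page_shift,
                          unsigned hpt_shift)
{
    return rma_size <= demo_vrma_limit(host_page_shift, hpt_shift);
}
```

For a 4GiB guest (ram_shift 32) this gives a 16GiB limit with 64KiB host
pages (shift 16) and a 1GiB limit with 4KiB host pages (shift 12), so a 4GiB
RMA fits in the first case and is rejected in the second.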

Previously, on KVM, we also temporarily reduced the rma_size to 256M so
that we'd load the kernel and initrd safely, regardless of the VRMA
limit.  This was a) confusing, b) could significantly limit the size of
images we could load and c) introduced a behavioural difference between
KVM and TCG.  So we remove that as well.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr.c | 28 ++--
 hw/ppc/spapr_hcall.c   |  4 ++--
 include/hw/ppc/spapr.h |  3 +--
 3 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b68d80ba69..4dab489931 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1569,7 +1569,7 @@ void spapr_reallocate_hpt(SpaprMachineState *spapr, int shift,
 spapr_set_all_lpcrs(0, LPCR_HR | LPCR_UPRT);
 }
 
-void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
+void spapr_setup_hpt(SpaprMachineState *spapr)
 {
 int hpt_shift;
 
@@ -1585,10 +1585,16 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
 }
 spapr_reallocate_hpt(spapr, hpt_shift, &error_fatal);
 
-if (spapr->vrma_adjust) {
+if (kvm_enabled()) {
 hwaddr vrma_limit = kvmppc_vrma_limit(spapr->htab_shift);
 
-spapr->rma_size = MIN(spapr_node0_size(MACHINE(spapr)), vrma_limit);
+/* Check our RMA fits in the possible VRMA */
+if (vrma_limit < spapr->rma_size) {
+error_report("Unable to create %" HWADDR_PRIu
+ "MiB RMA (VRMA only allows %" HWADDR_PRIu "MiB)",
+ spapr->rma_size / MiB, vrma_limit / MiB);
+exit(EXIT_FAILURE);
+}
 }
 }
 
@@ -1628,7 +1634,7 @@ static void spapr_machine_reset(MachineState *machine)
 spapr->patb_entry = PATE1_GR;
 spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
 } else {
-spapr_setup_hpt_and_vrma(spapr);
+spapr_setup_hpt(spapr);
 }
 
 qemu_devices_reset();
@@ -2696,20 +2702,6 @@ static void spapr_machine_init(MachineState *machine)
 
 spapr->rma_size = node0_size;
 
-/* With KVM, we don't actually know whether KVM supports an
- * unbounded RMA (PR KVM) or is limited by the hash table size
- * (HV KVM using VRMA), so we always assume the latter
- *
- * In that case, we also limit the initial allocations for RTAS
- * etc... to 256M since we have no way to know what the VRMA size
-

[PATCH v5 12/18] target/ppc: Don't store VRMA SLBE persistently

2020-02-19 Thread David Gibson
Currently, we construct the SLBE used for VRMA translations when the LPCR
is written (which controls some bits in the SLBE), then use it later for
translations.

This is a bit complex and confusing - simplify it by simply constructing
the SLBE directly from the LPCR when we need it.

Signed-off-by: David Gibson 
---
 target/ppc/cpu.h|  3 ---
 target/ppc/mmu-hash64.c | 28 ++--
 2 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f9871b1233..5a55fb02bd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1044,9 +1044,6 @@ struct CPUPPCState {
 uint32_t flags;
 uint64_t insns_flags;
 uint64_t insns_flags2;
-#if defined(TARGET_PPC64)
-ppc_slb_t vrma_slb;
-#endif
 
 int error_code;
 uint32_t pending_interrupts;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index ac21c14f68..f8bf92aa2e 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -825,6 +825,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 unsigned apshift;
 hwaddr ptex;
@@ -863,8 +864,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 }
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 /* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
@@ -1012,6 +1013,7 @@ skip_slb_search:
 hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 {
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 hwaddr ptex, raddr;
 ppc_hash_pte64_t pte;
@@ -1033,8 +1035,8 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 return raddr | env->spr[SPR_HRMOR];
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 return -1;
 }
 } else {
@@ -1072,30 +1074,12 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-ppc_slb_t *slb = &env->vrma_slb;
-
-/* Is VRMA enabled ? */
-if (ppc_hash64_use_vrma(env)) {
-if (build_vrma_slbe(cpu, slb) == 0) {
-return;
-}
-}
-
-/* Otherwise, clear it to indicate error */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
-}
-
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_vrma(cpu);
 }
 
 void helper_store_lpcr(CPUPPCState *env, target_ulong val)
-- 
2.24.1




[PATCH v5 08/18] target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]

2020-02-19 Thread David Gibson
Currently we use a big switch statement in ppc_hash64_update_rmls() to work
out what the right RMA limit is based on the LPCR[RMLS] field.  There's no
formula for this - it's just an arbitrary mapping defined by the existing
CPU implementations - but we can make it a bit more readable by using a
lookup table rather than a switch.  In addition we can use the MiB/GiB
symbols to make it a bit clearer.

While there we add a bit of clarity and rationale to the comment about
what happens if the LPCR[RMLS] doesn't contain a valid value.
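
For readers who want to try the idea outside QEMU, a minimal standalone
sketch of the lookup-table approach follows.  The table values are the ones
in this patch as posted (a later patch in the series corrects the table), and
demo_rmls_limit, DEMO_MiB and DEMO_GiB are illustrative names rather than
QEMU identifiers.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MiB (UINT64_C(1) << 20)
#define DEMO_GiB (UINT64_C(1) << 30)

/* Map a raw RMLS field value to an RMA limit in bytes; 0 means invalid */
static uint64_t demo_rmls_limit(unsigned rmls)
{
    /* Designated initializers leave the unlisted slots zero */
    static const uint64_t rma_sizes[] = {
        [1] = 16 * DEMO_GiB,
        [2] = 1 * DEMO_GiB,
        [3] = 64 * DEMO_MiB,
        [4] = 256 * DEMO_MiB,
        [7] = 128 * DEMO_MiB,
        [8] = 32 * DEMO_MiB,
    };

    if (rmls < sizeof(rma_sizes) / sizeof(rma_sizes[0])) {
        return rma_sizes[rmls];
    }
    /* Out of range: report a zero-sized RMA, as the patch does */
    return 0;
}
```

Both the gaps in the table and out-of-range values yield 0, so the caller
needs only one "invalid" case to think about.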

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 71 -
 1 file changed, 35 insertions(+), 36 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 8acd1f78ae..4e6c1f722b 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -18,6 +18,7 @@
  * License along with this library; if not, see .
  */
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
@@ -757,6 +758,39 @@ static void ppc_hash64_set_c(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)
 stb_phys(CPU(cpu)->as, base + offset, (pte1 & 0xff) | 0x80);
 }
 
+static target_ulong rmls_limit(PowerPCCPU *cpu)
+{
+CPUPPCState *env = &cpu->env;
+/*
+ * This is the full 4 bits encoding of POWER8. Previous
+ * CPUs only support a subset of these but the filtering
+ * is done when writing LPCR
+ */
+const target_ulong rma_sizes[] = {
+[0] = 0,
+[1] = 16 * GiB,
+[2] = 1 * GiB,
+[3] = 64 * MiB,
+[4] = 256 * MiB,
+[5] = 0,
+[6] = 0,
+[7] = 128 * MiB,
+[8] = 32 * MiB,
+};
+target_ulong rmls = (env->spr[SPR_LPCR] & LPCR_RMLS) >> LPCR_RMLS_SHIFT;
+
+if (rmls < ARRAY_SIZE(rma_sizes)) {
+return rma_sizes[rmls];
+} else {
+/*
+ * Bad value, so the OS has shot itself in the foot.  Return a
+ * 0-sized RMA which we expect to trigger an immediate DSI or
+ * ISI
+ */
+return 0;
+}
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1006,41 +1040,6 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_rmls(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-uint64_t lpcr = env->spr[SPR_LPCR];
-
-/*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
- */
-switch ((lpcr & LPCR_RMLS) >> LPCR_RMLS_SHIFT) {
-case 0x8: /* 32MB */
-env->rmls = 0x200ull;
-break;
-case 0x3: /* 64MB */
-env->rmls = 0x400ull;
-break;
-case 0x7: /* 128MB */
-env->rmls = 0x800ull;
-break;
-case 0x4: /* 256MB */
-env->rmls = 0x1000ull;
-break;
-case 0x2: /* 1GB */
-env->rmls = 0x4000ull;
-break;
-case 0x1: /* 16GB */
-env->rmls = 0x4ull;
-break;
-default:
-/* What to do here ??? */
-env->rmls = 0;
-}
-}
-
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
@@ -1099,7 +1098,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_rmls(cpu);
+env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1




[PATCH v5 14/18] spapr,ppc: Simplify signature of kvmppc_rma_size()

2020-02-19 Thread David Gibson
This function calculates the maximum size of the RMA as implied by the
host's page size and the structure of the VRMA (there are a number of other
constraints on the RMA size which will supersede this one in many
circumstances).

The current interface takes the current RMA size estimate, and clamps it
to the VRMA derived size.  The only current caller passes in an arguably
wrong value (it will match the current RMA estimate in some but not all
cases).

We want to fix that, but for now just keep concerns separated by having the
KVM helper function just return the VRMA derived limit, and let the caller
combine it with other constraints.  We call the new function
kvmppc_vrma_limit() to more clearly indicate its limited responsibility.

The helper should only ever be called in the KVM enabled case, so replace
its !CONFIG_KVM stub with an assert() rather than a dummy value.
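
The change in interface shape can be sketched in isolation like this.  It is
only an illustration: the demo_* names are invented, and the host page shift
is passed as a parameter here, whereas the real helper queries it from KVM.

```c
#include <stdint.h>

static uint64_t demo_min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Old shape: the helper computes the limit and also clamps the caller's value */
static uint64_t demo_rma_size_old(uint64_t current_size, unsigned page_shift,
                                  unsigned hash_shift)
{
    return demo_min_u64(current_size,
                        UINT64_C(1) << (page_shift + hash_shift - 7));
}

/* New shape: the helper only reports the VRMA derived limit */
static uint64_t demo_vrma_limit(unsigned page_shift, unsigned hash_shift)
{
    return UINT64_C(1) << (page_shift + hash_shift - 7);
}
```

A caller that wants the old behaviour writes
demo_min_u64(node0_size, demo_vrma_limit(...)) itself, which makes it
visible at the call site which other constraint is being combined with the
VRMA limit.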

Signed-off-by: David Gibson 
Reviewed-by: Cedric Le Goater 
Reviewed-by: Greg Kurz 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr.c   | 5 +++--
 target/ppc/kvm.c | 5 ++---
 target/ppc/kvm_ppc.h | 7 +++
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 272a270b7a..b68d80ba69 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1586,8 +1586,9 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
 spapr_reallocate_hpt(spapr, hpt_shift, &error_fatal);
 
 if (spapr->vrma_adjust) {
-spapr->rma_size = kvmppc_rma_size(spapr_node0_size(MACHINE(spapr)),
-  spapr->htab_shift);
+hwaddr vrma_limit = kvmppc_vrma_limit(spapr->htab_shift);
+
+spapr->rma_size = MIN(spapr_node0_size(MACHINE(spapr)), vrma_limit);
 }
 }
 
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 7f44b1aa1a..597f72be1b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2113,7 +2113,7 @@ void kvmppc_error_append_smt_possible_hint(Error *const *errp)
 
 
 #ifdef TARGET_PPC64
-uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
+uint64_t kvmppc_vrma_limit(unsigned int hash_shift)
 {
 struct kvm_ppc_smmu_info info;
 long rampagesize, best_page_shift;
@@ -2140,8 +2140,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
 }
 }
 
-return MIN(current_size,
-   1ULL << (best_page_shift + hash_shift - 7));
+return 1ULL << (best_page_shift + hash_shift - 7);
 }
 #endif
 
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 9e4f2357cc..332fa0aa1c 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -47,7 +47,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
   int *pfd, bool need_vfio);
 int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
 int kvmppc_reset_htab(int shift_hint);
-uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
+uint64_t kvmppc_vrma_limit(unsigned int hash_shift);
 bool kvmppc_has_cap_spapr_vfio(void);
 #endif /* !CONFIG_USER_ONLY */
 bool kvmppc_has_cap_epr(void);
@@ -255,10 +255,9 @@ static inline int kvmppc_reset_htab(int shift_hint)
 return 0;
 }
 
-static inline uint64_t kvmppc_rma_size(uint64_t current_size,
-   unsigned int hash_shift)
+static inline uint64_t kvmppc_vrma_limit(unsigned int hash_shift)
 {
-return ram_size;
+g_assert_not_reached();
 }
 
 static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
-- 
2.24.1




[PATCH v5 06/18] target/ppc: Remove RMOR register from POWER9 & POWER10

2020-02-19 Thread David Gibson
Currently we create the Real Mode Offset Register (RMOR) on all Book3S cpus
from POWER7 onwards.  However the translation mode which the RMOR controls
is no longer supported in POWER9, and so the register has been removed from
the architecture.

Remove it from our model on POWER9 and POWER10.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/translate_init.inc.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index ab79975fec..925bc31ca5 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8015,12 +8015,16 @@ static void gen_spr_book3s_ids(CPUPPCState *env)
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x);
-spr_register_hv(env, SPR_RMOR, "RMOR",
+spr_register_hv(env, SPR_HRMOR, "HRMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x);
-spr_register_hv(env, SPR_HRMOR, "HRMOR",
+}
+
+static void gen_spr_rmor(CPUPPCState *env)
+{
+spr_register_hv(env, SPR_RMOR, "RMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
@@ -8535,6 +8539,7 @@ static void init_proc_POWER7(CPUPPCState *env)
 
 /* POWER7 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_book3s_purr(env);
 gen_spr_power5p_common(env);
@@ -8676,6 +8681,7 @@ static void init_proc_POWER8(CPUPPCState *env)
 
 /* POWER8 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_iamr(env);
 gen_spr_book3s_purr(env);
-- 
2.24.1




[PATCH v5 13/18] spapr: Don't use weird units for MIN_RMA_SLOF

2020-02-19 Thread David Gibson
MIN_RMA_SLOF records the minimum amount of RMA that the SLOF firmware
requires.  It lets us give a meaningful error if the RMA ends up too small,
rather than just letting SLOF crash.

It's currently stored as a number of megabytes, which is strange for global
constants.  Move that megabyte scaling into the definition of the constant,
as most other constants do.

Change from M to MiB in the associated message while we're at it.
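
A tiny standalone sketch of the convention, with made-up DEMO_* and demo_*
names: the constant is kept in bytes so it compares directly against sizes,
and is only divided down to MiB where a message needs it.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MiB          (UINT64_C(1) << 20)
#define DEMO_MIN_RMA_SLOF (128 * DEMO_MiB)   /* bytes, like other sizes */

/* Comparison needs no scaling at the point of use */
static bool demo_rma_big_enough(uint64_t rma_size)
{
    return rma_size >= DEMO_MIN_RMA_SLOF;
}

/* Scaling happens only where a human-readable figure is wanted */
static uint64_t demo_min_rma_mib(void)
{
    return DEMO_MIN_RMA_SLOF / DEMO_MiB;
}
```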

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 828e2cc135..272a270b7a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -103,7 +103,7 @@
 #define FW_OVERHEAD 0x280
 #define KERNEL_LOAD_ADDR FW_MAX_SIZE
 
-#define MIN_RMA_SLOF 128UL
+#define MIN_RMA_SLOF (128 * MiB)
 
 #define PHANDLE_INTC 0x
 
@@ -2959,10 +2959,10 @@ static void spapr_machine_init(MachineState *machine)
 }
 }
 
-if (spapr->rma_size < (MIN_RMA_SLOF * MiB)) {
+if (spapr->rma_size < MIN_RMA_SLOF) {
 error_report(
-"pSeries SLOF firmware requires >= %ldM guest RMA (Real Mode Area memory)",
-MIN_RMA_SLOF);
+"pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area memory)",
+MIN_RMA_SLOF / MiB);
 exit(1);
 }
 
-- 
2.24.1




[PATCH v5 02/18] ppc: Remove stub of PPC970 HID4 implementation

2020-02-19 Thread David Gibson
The PowerPC 970 CPU was a cut-down POWER4, which had hypervisor capability.
However, it can be (and often was) strapped into "Apple mode", where the
hypervisor capabilities were disabled (essentially putting it always in
hypervisor mode).

That's actually the only mode of the 970 we support in qemu, and we're
unlikely to change that any time soon.  However, we do have a partial
implementation of the 970's HID4 register which affects things only
relevant for hypervisor mode.

That stub is also really ugly, since it attempts to duplicate the effects
of HID4 by re-encoding it into the LPCR register used in newer CPUs, but
in a really confusing way.

Just get rid of it.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
---
 target/ppc/mmu-hash64.c | 29 +
 target/ppc/translate_init.inc.c | 20 
 2 files changed, 9 insertions(+), 40 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index da8966ccf5..3e0be4d55f 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1091,33 +1091,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 
 /* Filter out bits */
 switch (env->mmu_model) {
-case POWERPC_MMU_64B: /* 970 */
-if (val & 0x40) {
-lpcr |= LPCR_LPES0;
-}
-if (val & 0x8000ull) {
-lpcr |= LPCR_LPES1;
-}
-if (val & 0x20) {
-lpcr |= (0x4ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x4000ull) {
-lpcr |= (0x2ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x2000ull) {
-lpcr |= (0x1ull << LPCR_RMLS_SHIFT);
-}
-env->spr[SPR_RMOR] = ((lpcr >> 41) & 0xull) << 26;
-
-/*
- * XXX We could also write LPID from HID4 here
- * but since we don't tag any translation on it
- * it doesn't actually matter
- *
- * XXX For proper emulation of 970 we also need
- * to dig HRMOR out of HID5
- */
-break;
 case POWERPC_MMU_2_03: /* P5p */
 lpcr = val & (LPCR_RMLS | LPCR_ILE |
   LPCR_LPES0 | LPCR_LPES1 |
@@ -1154,7 +1127,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 }
 break;
 default:
-;
+g_assert_not_reached();
 }
 env->spr[SPR_LPCR] = lpcr;
 ppc_hash64_update_rmls(cpu);
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index a0d0eaabf2..ab79975fec 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -7895,25 +7895,21 @@ static void spr_write_lpcr(DisasContext *ctx, int sprn, int gprn)
 {
 gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
 }
-
-static void spr_write_970_hid4(DisasContext *ctx, int sprn, int gprn)
-{
-#if defined(TARGET_PPC64)
-spr_write_generic(ctx, sprn, gprn);
-gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
-#endif
-}
-
 #endif /* !defined(CONFIG_USER_ONLY) */
 
 static void gen_spr_970_lpar(CPUPPCState *env)
 {
 #if !defined(CONFIG_USER_ONLY)
-/* Logical partitionning */
-/* PPC970: HID4 is effectively the LPCR */
+/*
+ * PPC970: HID4 covers things later controlled by the LPCR and
+ * RMOR in later CPUs, but with a different encoding.  We only
+ * support the 970 in "Apple mode" which has all hypervisor
+ * facilities disabled by strapping, so we can basically just
+ * ignore it
+ */
 spr_register(env, SPR_970_HID4, "HID4",
  SPR_NOACCESS, SPR_NOACCESS,
- &spr_read_generic, &spr_write_970_hid4,
+ &spr_read_generic, &spr_write_generic,
  0x);
 #endif
 }
-- 
2.24.1




[PATCH v5 10/18] target/ppc: Only calculate RMLS derived RMA limit on demand

2020-02-19 Thread David Gibson
When the LPCR is written, we update the env->rmls field with the RMA limit
it implies.  Simplify things by just calculating the value directly from
the LPCR value when we need it.

It's possible this is a little slower, but it's unlikely to be significant,
since this is only for real mode accesses in a translation configuration
that's not used very often, and the whole thing is behind the qemu TLB
anyway.  Therefore, keeping the number of state variables down and not
having to worry about making sure it's always in sync seems the better
option.
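
The trade-off can be illustrated with a small standalone sketch.  Everything
here is hypothetical: the demo_* names, the field position, and in
particular the decoding (field value times 64MiB), which is invented to keep
the example short and is not the real RMLS encoding.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RMLS_SHIFT 26
#define DEMO_RMLS_MASK  (UINT64_C(0xf) << DEMO_RMLS_SHIFT)
#define DEMO_MiB        (UINT64_C(1) << 20)

/* The only stored state is the register itself; no cached limit */
typedef struct {
    uint64_t lpcr;
} demo_cpu_t;

/* Derive the limit from the register each time it is needed */
static uint64_t demo_rmls_limit(const demo_cpu_t *cpu)
{
    uint64_t field = (cpu->lpcr & DEMO_RMLS_MASK) >> DEMO_RMLS_SHIFT;

    return field * 64 * DEMO_MiB;   /* made-up decoding */
}

/* Writing the register needs no follow-up step to refresh derived state */
static void demo_store_lpcr(demo_cpu_t *cpu, uint64_t val)
{
    cpu->lpcr = val;
}

/* Real mode bounds check against the freshly derived limit */
static bool demo_real_addr_ok(const demo_cpu_t *cpu, uint64_t raddr)
{
    return raddr < demo_rmls_limit(cpu);
}
```

Because nothing is cached, a register write takes effect on the very next
bounds check and there is no second field that could fall out of sync.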

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu.h| 1 -
 target/ppc/mmu-hash64.c | 8 +---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 8077fdb068..f9871b1233 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1046,7 +1046,6 @@ struct CPUPPCState {
 uint64_t insns_flags2;
 #if defined(TARGET_PPC64)
 ppc_slb_t vrma_slb;
-target_ulong rmls;
 #endif
 
 int error_code;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 46690bc79b..203a41cca1 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -844,8 +844,10 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 
 goto skip_slb_search;
 } else {
+target_ulong limit = rmls_limit(cpu);
+
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -1007,8 +1009,9 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 return -1;
 }
 } else {
+target_ulong limit = rmls_limit(cpu);
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 return -1;
 }
 return raddr | env->spr[SPR_RMOR];
@@ -1098,7 +1101,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1




[PATCH v5 07/18] target/ppc: Use class fields to simplify LPCR masking

2020-02-19 Thread David Gibson
When we store the Logical Partitioning Control Register (LPCR) we have a
big switch statement to work out which are valid bits for the cpu model
we're emulating.

As well as being ugly, this isn't really conceptually correct, since it is
based on the mmu_model variable, whereas the LPCR isn't (only) about the
MMU, so mmu_model is basically just acting as a proxy for the cpu model.

Handle this in a simpler way, by adding a suitable lpcr_mask to the QOM
class.
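
Reduced to a standalone sketch, the pattern looks roughly like this.  The
demo_* types and the two mask values are invented for illustration and do
not correspond to any real CPU model or to QEMU's QOM types.

```c
#include <stdint.h>

/* Per-model constant data, filled in once when the class is set up */
typedef struct {
    const char *name;
    uint64_t lpcr_mask;   /* bits this model implements */
} demo_cpu_class_t;

typedef struct {
    const demo_cpu_class_t *cls;
    uint64_t lpcr;
} demo_cpu_t;

/* Hypothetical models with made-up masks */
static const demo_cpu_class_t demo_class_small = { "small", 0x00f };
static const demo_cpu_class_t demo_class_large = { "large", 0x0ff };

/* The store path no longer needs to know which model it is running on */
static void demo_store_lpcr(demo_cpu_t *cpu, uint64_t val)
{
    cpu->lpcr = val & cpu->cls->lpcr_mask;
}
```

Adding a new model then means supplying one more mask next to its other
class data, rather than adding another case to a switch in the store path.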

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu-qom.h|  1 +
 target/ppc/mmu-hash64.c | 36 ++---
 target/ppc/translate_init.inc.c | 27 +
 3 files changed, 26 insertions(+), 38 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index e499575dc8..15d6b54a7d 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -177,6 +177,7 @@ typedef struct PowerPCCPUClass {
 uint64_t insns_flags;
 uint64_t insns_flags2;
 uint64_t msr_mask;
+uint64_t lpcr_mask; /* Available bits in the LPCR */
 uint64_t lpcr_pm;   /* Power-saving mode Exit Cause Enable bits */
 powerpc_mmu_t   mmu_model;
 powerpc_excp_t  excp_model;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 71e08801cc..8acd1f78ae 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1095,42 +1095,10 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
-uint64_t lpcr = 0;
 
-/* Filter out bits */
-switch (env->mmu_model) {
-case POWERPC_MMU_2_03: /* P5p */
-lpcr = val & (LPCR_RMLS | LPCR_ILE |
-  LPCR_LPES0 | LPCR_LPES1 |
-  LPCR_RMI | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_06: /* P7 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
-  LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
-  LPCR_MER | LPCR_TC |
-  LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_07: /* P8 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV |
-  LPCR_DPFD | LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_AIL | LPCR_ONL | LPCR_P8_PECE0 | LPCR_P8_PECE1 |
-  LPCR_P8_PECE2 | LPCR_P8_PECE3 | LPCR_P8_PECE4 |
-  LPCR_MER | LPCR_TC | LPCR_LPES0 | LPCR_HDICE);
-break;
-case POWERPC_MMU_3_00: /* P9 */
-lpcr = val & (LPCR_VPM1 | LPCR_ISL | LPCR_KBV | LPCR_DPFD |
-  (LPCR_PECE_U_MASK & LPCR_HVEE) | LPCR_ILE | LPCR_AIL |
-  LPCR_UPRT | LPCR_EVIRT | LPCR_ONL | LPCR_HR | LPCR_LD |
-  (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
-  LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
-  LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-break;
-default:
-g_assert_not_reached();
-}
-env->spr[SPR_LPCR] = lpcr;
+env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
 ppc_hash64_update_rmls(cpu);
 ppc_hash64_update_vrma(cpu);
 }
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 925bc31ca5..5b7a5226e1 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8476,6 +8476,8 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
 (1ull << MSR_DR) |
 (1ull << MSR_PMM) |
 (1ull << MSR_RI);
+pcc->lpcr_mask = LPCR_RMLS | LPCR_ILE | LPCR_LPES0 | LPCR_LPES1 |
+LPCR_RMI | LPCR_HDICE;
 pcc->mmu_model = POWERPC_MMU_2_03;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8653,6 +8655,12 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 (1ull << MSR_PMM) |
 (1ull << MSR_RI) |
 (1ull << MSR_LE);
+pcc->lpcr_mask = LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
+LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
+LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
+LPCR_MER | LPCR_TC |
+LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE;
+pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8669,7 +8677,6 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->l1_dcache_size = 0x8000;
 pcc->l1_icache_size = 0x8000;
 pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
-pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 }
 
 static void init_proc_POWER8(CPUPPCState *env)
@@ -882

[PATCH v5 18/18] spapr: Fold spapr_node0_size() into its only caller

2020-02-19 Thread David Gibson
The Real Mode Area (RMA) needs to fit within the NUMA node owning memory
at address 0.  That's usually node 0, but can be a later one if there are
some nodes which have no memory (only CPUs).

This is currently handled by the spapr_node0_size() helper.  It has only
one caller, so there's not a lot of point splitting it out.  It's also
extremely easy to misread the code as clamping to the size of the smallest
node rather than the first node with any memory.

So, fold it into the caller, and add some commentary to make it a bit
clearer exactly what it's doing.
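
As a simplified standalone sketch of the intended logic (it leaves out the
other limits that spapr_rma_size() applies in between, and all demo_* names
are made up): take the first node that has any memory, clamp to it, then
round down to a power of 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest power of 2 that is <= v, or 0 if v is 0 */
static uint64_t demo_pow2floor(uint64_t v)
{
    uint64_t p = 1;

    if (v == 0) {
        return 0;
    }
    while (p <= v / 2) {
        p <<= 1;
    }
    return p;
}

static uint64_t demo_rma_size(uint64_t ram_size, const uint64_t *node_mem,
                              size_t num_nodes)
{
    uint64_t rma_size = ram_size;
    size_t i;

    /* Clamp to the first node with memory, not to the smallest node */
    for (i = 0; i < num_nodes; i++) {
        if (node_mem[i] != 0) {
            if (node_mem[i] < rma_size) {
                rma_size = node_mem[i];
            }
            break;
        }
    }
    return demo_pow2floor(rma_size);
}
```

With nodes of 0, 3GiB and 1GiB and 4GiB of RAM in total, this yields 2GiB
(3GiB rounded down), whereas the "smallest node" misreading would have
produced 1GiB.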

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f0354b699d..9ba645c9cb 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -296,20 +296,6 @@ static void spapr_populate_pa_features(SpaprMachineState *spapr,
 _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
 }
 
-static hwaddr spapr_node0_size(MachineState *machine)
-{
-if (machine->numa_state->num_nodes) {
-int i;
-for (i = 0; i < machine->numa_state->num_nodes; ++i) {
-if (machine->numa_state->nodes[i].node_mem) {
-return MIN(pow2floor(machine->numa_state->nodes[i].node_mem),
-   machine->ram_size);
-}
-}
-}
-return machine->ram_size;
-}
-
 static void add_str(GString *s, const gchar *s1)
 {
 g_string_append_len(s, s1, strlen(s1) + 1);
@@ -2652,10 +2638,24 @@ static hwaddr spapr_rma_size(SpaprMachineState *spapr, Error **errp)
 {
 MachineState *machine = MACHINE(spapr);
 hwaddr rma_size = machine->ram_size;
-hwaddr node0_size = spapr_node0_size(machine);
 
 /* RMA has to fit in the first NUMA node */
-rma_size = MIN(rma_size, node0_size);
+if (machine->numa_state->num_nodes) {
+/*
+ * It's possible for there to be some zero-memory nodes first
+ * in the list.  We need the RMA to fit inside the memory of
+ * the first node which actually has some memory.
+ */
+int i;
+
+for (i = 0; i < machine->numa_state->num_nodes; ++i) {
+if (machine->numa_state->nodes[i].node_mem != 0) {
+rma_size = MIN(rma_size,
+   machine->numa_state->nodes[i].node_mem);
+break;
+}
+}
+}
 
 /*
  * VRMA access is via a special 1TiB SLB mapping, so the RMA can
@@ -2672,6 +2672,11 @@ static hwaddr spapr_rma_size(SpaprMachineState *spapr, Error **errp)
 spapr->rma_size = MIN(spapr->rma_size, smc->rma_limit);
 }
 
+/*
+ * RMA size must be a power of 2
+ */
+rma_size = pow2floor(rma_size);
+
 if (rma_size < (MIN_RMA_SLOF * MiB)) {
 error_setg(errp,
 "pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area)",
-- 
2.24.1




[PATCH v5 00/18] target/ppc: Correct some errors with real mode handling

2020-02-19 Thread David Gibson
POWER "book S" (server class) cpus have a concept of "real mode" where
MMU translation is disabled... sort of.  In fact this can mean a bunch
of slightly different things when hypervisor mode and other
considerations are present.

We had some errors in edge cases here, so clean some things up and
correct them.

Some of those limitations caused problems with calculating the size of
the Real Mode Area of pseries guests, so continue on to clean up and
correct those calculations as well.

Changes since v4:
 * Some tiny cosmetic fixes to the original patches
 * Added a bunch of extra patches correcting RMA calculation
Changes since v3:
 * Fix style errors reported by checkpatch
Changes since v2:
 * Removed 32-bit hypervisor stubs more completely
 * Minor polish based on review comments
Changes since RFCv1:
 * Add a number of extra patches taking advantage of the initial
   cleanups

David Gibson (18):
  ppc: Remove stub support for 32-bit hypervisor mode
  ppc: Remove stub of PPC970 HID4 implementation
  target/ppc: Correct handling of real mode accesses with vhyp on hash
MMU
  target/ppc: Introduce ppc_hash64_use_vrma() helper
  spapr, ppc: Remove VPM0/RMLS hacks for POWER9
  target/ppc: Remove RMOR register from POWER9 & POWER10
  target/ppc: Use class fields to simplify LPCR masking
  target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]
  target/ppc: Correct RMLS table
  target/ppc: Only calculate RMLS derived RMA limit on demand
  target/ppc: Streamline construction of VRMA SLB entry
  target/ppc: Don't store VRMA SLBE persistently
  spapr: Don't use weird units for MIN_RMA_SLOF
  spapr,ppc: Simplify signature of kvmppc_rma_size()
  spapr: Don't attempt to clamp RMA to VRMA constraint
  spapr: Don't clamp RMA to 16GiB on new machine types
  spapr: Clean up RMA size calculation
  spapr: Fold spapr_node0_size() into its only caller

 hw/ppc/spapr.c  | 124 ++--
 hw/ppc/spapr_cpu_core.c |   6 +-
 hw/ppc/spapr_hcall.c|   4 +-
 include/hw/ppc/spapr.h  |   4 +-
 target/ppc/cpu-qom.h|   1 +
 target/ppc/cpu.h|  25 +--
 target/ppc/kvm.c|   5 +-
 target/ppc/kvm_ppc.h|   7 +-
 target/ppc/mmu-hash64.c | 327 
 target/ppc/translate_init.inc.c |  63 --
 10 files changed, 252 insertions(+), 314 deletions(-)

-- 
2.24.1




[PATCH v5 05/18] spapr, ppc: Remove VPM0/RMLS hacks for POWER9

2020-02-19 Thread David Gibson
For the "pseries" machine, we use "virtual hypervisor" mode where we
only model the CPU in non-hypervisor privileged mode.  This means that
we need guest physical addresses within the modelled cpu to be treated
as absolute physical addresses.

We used to do that by clearing LPCR[VPM0] and setting LPCR[RMLS] to a high
limit so that the old offset based translation for guest mode applied,
which does what we need.  However, POWER9 has removed support for that
translation mode, which meant we had some ugly hacks to keep it working.

We now explicitly handle this sort of translation for virtual hypervisor
mode, so the hacks aren't necessary.  We don't need to set VPM0 and RMLS
from the machine type code - they're now ignored in vhyp mode.  On the cpu
side we don't need to allow LPCR[RMLS] to be set on POWER9 in vhyp mode -
that was only there to allow the hack on the machine side.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 hw/ppc/spapr_cpu_core.c | 6 +-
 target/ppc/mmu-hash64.c | 8 
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index d09125d9af..ea5e11f1d9 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -58,14 +58,10 @@ static void spapr_reset_vcpu(PowerPCCPU *cpu)
  * we don't get spurious wakups before an RTAS start-cpu call.
  * For the same reason, set PSSCR_EC.
  */
-lpcr &= ~(LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
+lpcr &= ~(LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
 lpcr |= LPCR_LPES0 | LPCR_LPES1;
 env->spr[SPR_PSSCR] |= PSSCR_EC;
 
-/* Set RMLS to the max (ie, 16G) */
-lpcr &= ~LPCR_RMLS;
-lpcr |= 1ull << LPCR_RMLS_SHIFT;
-
 ppc_store_lpcr(cpu, lpcr);
 
 /* Set a full AMOR so guest can use the AMR as it sees fit */
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 0f9c0149e8..71e08801cc 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1126,14 +1126,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
   (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
   LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
   LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-/*
- * If we have a virtual hypervisor, we need to bring back RMLS. It
- * doesn't exist on an actual P9 but that's all we know how to
- * configure with softmmu at the moment
- */
-if (cpu->vhyp) {
-lpcr |= (val & LPCR_RMLS);
-}
 break;
 default:
 g_assert_not_reached();
-- 
2.24.1




Re: [PATCH v3 04/12] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-19 Thread David Gibson
On Wed, Feb 19, 2020 at 11:06:20AM -0300, Fabiano Rosas wrote:
> David Gibson  writes:
> 
> > When running guests under a hypervisor, the hypervisor obviously needs to
> > be protected from guest accesses even if those are in what the guest
> > considers real mode (translation off).  The POWER hardware provides two
> > ways of doing that: The old way has guest real mode accesses simply offset
> > and bounds checked into host addresses.  It works, but requires that a
> > significant chunk of the guest's memory - the RMA - be physically
> > contiguous in the host, which is pretty inconvenient.  The new way, known
> > as VRMA, has guest real mode accesses translated in roughly the normal way
> > but with some special parameters.
> >
> > In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
> > in POWER9 only VRMA mode is supported
> 
> ... when translation is off, right? Because I see in the 3.0 ISA that
> LPCR[VPM1] is still there.

Right.  This whole patch (and the whole series) is about when the
guest is in translation off mode.

> 
> > and LPCR[VPM0] no longer exists.  We
> > handle that difference in behaviour in ppc_hash64_set_isi().. but not in
> > other places that we blindly check LPCR[VPM0].
> >
> > Correct those instances with a new helper to tell if we should be in VRMA
> > mode.
> >
> > Signed-off-by: David Gibson 
> > Reviewed-by: Cédric Le Goater 
> > ---
> >  target/ppc/mmu-hash64.c | 41 +++--
> >  1 file changed, 19 insertions(+), 22 deletions(-)
> >
> > diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> > index 5fabd93c92..d878180df5 100644
> > --- a/target/ppc/mmu-hash64.c
> > +++ b/target/ppc/mmu-hash64.c
> > @@ -668,6 +668,19 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU 
> > *cpu,
> >  return 0;
> >  }
> >  
> > +static bool ppc_hash64_use_vrma(CPUPPCState *env)
> > +{
> > +switch (env->mmu_model) {
> > +case POWERPC_MMU_3_00:
> > +/* ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR
> > + * register no longer exist */
> > +return true;
> > +
> > +default:
> > +return !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> > +}
> > +}
> > +
> >  static void ppc_hash64_set_isi(CPUState *cs, uint64_t error_code)
> >  {
> >  CPUPPCState *env = &POWERPC_CPU(cs)->env;
> > @@ -676,15 +689,7 @@ static void ppc_hash64_set_isi(CPUState *cs, uint64_t 
> > error_code)
> >  if (msr_ir) {
> >  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
> >  } else {
> > -switch (env->mmu_model) {
> > -case POWERPC_MMU_3_00:
> > -/* Field deprecated in ISAv3.00 - interrupts always go to 
> > hyperv */
> > -vpm = true;
> > -break;
> > -default:
> > -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> > -break;
> > -}
> > +vpm = ppc_hash64_use_vrma(env);
> >  }
> >  if (vpm && !msr_hv) {
> >  cs->exception_index = POWERPC_EXCP_HISI;
> > @@ -702,15 +707,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, uint64_t 
> > dar, uint64_t dsisr)
> >  if (msr_dr) {
> >  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
> >  } else {
> > -switch (env->mmu_model) {
> > -case POWERPC_MMU_3_00:
> > -/* Field deprecated in ISAv3.00 - interrupts always go to 
> > hyperv */
> > -vpm = true;
> > -break;
> > -default:
> > -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> > -break;
> > -}
> > +vpm = ppc_hash64_use_vrma(env);
> >  }
> >  if (vpm && !msr_hv) {
> >  cs->exception_index = POWERPC_EXCP_HDSI;
> > @@ -799,7 +796,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
> > eaddr,
> >  if (!(eaddr >> 63)) {
> >  raddr |= env->spr[SPR_HRMOR];
> >  }
> > -} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
> > +} else if (ppc_hash64_use_vrma(env)) {
> >  /* Emulated VRMA mode */
> >  slb = &env->vrma_slb;
> >  if (!slb->sps) {
> > @@ -967,7 +964,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
> > target_ulong addr)
> >  } else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
> >  /* In HV mode, add HRMOR if top EA bit is clear */
> >  return raddr | env->spr[SPR_HRMOR];
> > -} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
> > +} else if (ppc_hash64_use_vrma(env)) {
> >  /* Emulated VRMA mode */
> >  slb = &env->vrma_slb;
> >  if (!slb->sps) {
> > @@ -1056,8 +1053,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
> >  slb->sps = NULL;
> >  
> >  /* Is VRMA enabled ? */
> > -lpcr = env->spr[SPR_LPCR];
> > -if (!(lpcr & LPCR_VPM0)) {
> > +if (ppc_hash64_use_vrma(env)) {
> 
> Shouldn't this be !ppc_hash64_use_vrma(env)?
> 
> And a comment about the original code: 

Re: [PATCH v3 11/12] target/ppc: Streamline construction of VRMA SLB entry

2020-02-19 Thread David Gibson
On Wed, Feb 19, 2020 at 11:34:22AM -0300, Fabiano Rosas wrote:
> David Gibson  writes:
> 
> 
> Hi, just a nitpick, feel free to ignore.
> 
> > When in VRMA mode (i.e. a guest thinks it has the MMU off, but the
> > hypervisor is still applying translation) we use a special SLB entry,
> > rather than looking up an SLBE by address as we do when guest translation
> > is on.
> >
> > We build that special entry in ppc_hash64_update_vrma() along with some
> > logic for handling some non-VRMA cases.  Split the actual build of the
> > VRMA SLBE into a separate helper and streamline it a bit.
> >
> > Signed-off-by: David Gibson 
> > ---
> >  target/ppc/mmu-hash64.c | 79 -
> >  1 file changed, 38 insertions(+), 41 deletions(-)
> >
> > diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> > index 170a78bd2e..06cfff9860 100644
> > --- a/target/ppc/mmu-hash64.c
> > +++ b/target/ppc/mmu-hash64.c
> > @@ -789,6 +789,39 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
> >  }
> >  }
> >  
> > +static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
> > +{
> > +CPUPPCState *env = &cpu->env;
> > +target_ulong lpcr = env->spr[SPR_LPCR];
> > +uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
> > +target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & 
> > SLB_VSID_LLP_MASK);
> > +int i;
> > +
> > +/*
> > + * Make one up. Mostly ignore the ESID which will not be needed
> > + * for translation
> > + */
> 
> I find this comment a bit vague. I suggest we either leave it behind or
> make it more precise. The ISA says:
> 
> "translation of effective addresses to virtual addresses use the SLBE
> values in Figure 18 instead of the entry in the SLB corresponding to the
> ESID"

Yeah, it wasn't very helpful in its initial location, and it's even
less helpful here.  I've dropped it.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH v5 04/18] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-19 Thread David Gibson
When running guests under a hypervisor, the hypervisor obviously needs to
be protected from guest accesses even if those are in what the guest
considers real mode (translation off).  The POWER hardware provides two
ways of doing that: The old way has guest real mode accesses simply offset
and bounds checked into host addresses.  It works, but requires that a
significant chunk of the guest's memory - the RMA - be physically
contiguous in the host, which is pretty inconvenient.  The new way, known
as VRMA, has guest real mode accesses translated in roughly the normal way
but with some special parameters.

In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
in POWER9 only VRMA mode is supported and LPCR[VPM0] no longer exists.  We
handle that difference in behaviour in ppc_hash64_set_isi().. but not in
other places that we blindly check LPCR[VPM0].

Correct those instances with a new helper to tell if we should be in VRMA
mode.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 43 -
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 392f90e0ae..0f9c0149e8 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -668,6 +668,21 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
 return 0;
 }
 
+static bool ppc_hash64_use_vrma(CPUPPCState *env)
+{
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/*
+ * ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR
+ * register no longer exist
+ */
+return true;
+
+default:
+return !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+}
+}
+
 static void ppc_hash64_set_isi(CPUState *cs, uint64_t error_code)
 {
 CPUPPCState *env = &POWERPC_CPU(cs)->env;
@@ -676,15 +691,7 @@ static void ppc_hash64_set_isi(CPUState *cs, uint64_t 
error_code)
 if (msr_ir) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HISI;
@@ -702,15 +709,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, uint64_t dar, 
uint64_t dsisr)
 if (msr_dr) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HDSI;
@@ -799,7 +798,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -967,7 +966,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 } else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
 /* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -1056,8 +1055,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 slb->sps = NULL;
 
 /* Is VRMA enabled ? */
-lpcr = env->spr[SPR_LPCR];
-if (!(lpcr & LPCR_VPM0)) {
+if (ppc_hash64_use_vrma(env)) {
 return;
 }
 
@@ -1065,6 +1063,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
  * Make one up. Mostly ignore the ESID which will not be needed
  * for translation
  */
+lpcr = env->spr[SPR_LPCR];
 vsid = SLB_VSID_VRMA;
 vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
 vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-- 
2.24.1




[PATCH v5 01/18] ppc: Remove stub support for 32-bit hypervisor mode

2020-02-19 Thread David Gibson
Commit a4f30719a8cd, way back in 2007, noted that "PowerPC hypervisor mode is not
fundamentally available only for PowerPC 64" and added a 32-bit version
of the MSR[HV] bit.

But nothing was ever really done with that; there is no meaningful support
for 32-bit hypervisor mode 13 years later.  Let's stop pretending and just
remove the stubs.

Signed-off-by: David Gibson 
Reviewed-by: Fabiano Rosas 
---
 target/ppc/cpu.h| 21 +++--
 target/ppc/translate_init.inc.c |  6 +++---
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index b283042515..8077fdb068 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -24,8 +24,6 @@
 #include "exec/cpu-defs.h"
 #include "cpu-qom.h"
 
-/* #define PPC_EMULATE_32BITS_HYPV */
-
 #define TCG_GUEST_DEFAULT_MO 0
 
 #define TARGET_PAGE_BITS_64K 16
@@ -300,13 +298,12 @@ typedef struct ppc_v3_pate_t {
 #define MSR_SF   63 /* Sixty-four-bit modehflags */
 #define MSR_TAG  62 /* Tag-active mode (POWERx ?)*/
 #define MSR_ISF  61 /* Sixty-four-bit interrupt mode on 630  */
-#define MSR_SHV  60 /* hypervisor state   hflags */
+#define MSR_HV   60 /* hypervisor state   hflags */
 #define MSR_TS0  34 /* Transactional state, 2 bits (Book3s)  */
 #define MSR_TS1  33
 #define MSR_TM   32 /* Transactional Memory Available (Book3s)   */
 #define MSR_CM   31 /* Computation mode for BookE hflags */
 #define MSR_ICM  30 /* Interrupt computation mode for BookE  */
-#define MSR_THV  29 /* hypervisor state for 32 bits PowerPC   hflags */
 #define MSR_GS   28 /* guest state for BookE */
 #define MSR_UCLE 26 /* User-mode cache lock enable for BookE */
 #define MSR_VR   25 /* altivec availablex hflags */
@@ -401,10 +398,13 @@ typedef struct ppc_v3_pate_t {
 
 #define msr_sf   ((env->msr >> MSR_SF)   & 1)
 #define msr_isf  ((env->msr >> MSR_ISF)  & 1)
-#define msr_shv  ((env->msr >> MSR_SHV)  & 1)
+#if defined(TARGET_PPC64)
+#define msr_hv   ((env->msr >> MSR_HV)   & 1)
+#else
+#define msr_hv   (0)
+#endif
 #define msr_cm   ((env->msr >> MSR_CM)   & 1)
 #define msr_icm  ((env->msr >> MSR_ICM)  & 1)
-#define msr_thv  ((env->msr >> MSR_THV)  & 1)
 #define msr_gs   ((env->msr >> MSR_GS)   & 1)
 #define msr_ucle ((env->msr >> MSR_UCLE) & 1)
 #define msr_vr   ((env->msr >> MSR_VR)   & 1)
@@ -449,16 +449,9 @@ typedef struct ppc_v3_pate_t {
 
 /* Hypervisor bit is more specific */
 #if defined(TARGET_PPC64)
-#define MSR_HVB (1ULL << MSR_SHV)
-#define msr_hv  msr_shv
-#else
-#if defined(PPC_EMULATE_32BITS_HYPV)
-#define MSR_HVB (1ULL << MSR_THV)
-#define msr_hv  msr_thv
+#define MSR_HVB (1ULL << MSR_HV)
 #else
 #define MSR_HVB (0ULL)
-#define msr_hv  (0)
-#endif
 #endif
 
 /* DSISR */
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 53995f62ea..a0d0eaabf2 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8804,7 +8804,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_PM_ISA206;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9017,7 +9017,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9228,7 +9228,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
-- 
2.24.1




[PATCH v5 09/18] target/ppc: Correct RMLS table

2020-02-19 Thread David Gibson
The table of RMA limits based on the LPCR[RMLS] field is slightly wrong.
We're missing the RMLS == 0 => 256 GiB RMA option, which is available on
POWER8, so add that.

The comment that goes with the table is much more wrong.  We *don't* filter
invalid RMLS values when writing the LPCR, and there's not really a
sensible way to do so.  Furthermore, while in theory the set of RMLS values
is implementation dependent, it seems in practice the same set has been
available since around POWER4+ up until POWER8, the last model which
supports RMLS at all.  So, correct that as well.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 4e6c1f722b..46690bc79b 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -762,12 +762,12 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
 /*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
+ * In theory the meanings of RMLS values are implementation
+ * dependent.  In practice, this seems to have been the set from
+ * POWER4+..POWER8, and RMLS is no longer supported in POWER9.
  */
 const target_ulong rma_sizes[] = {
-[0] = 0,
+[0] = 256 * GiB,
 [1] = 16 * GiB,
 [2] = 1 * GiB,
 [3] = 64 * MiB,
-- 
2.24.1
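
As an illustration of the table lookup this patch corrects, here is a
minimal standalone sketch.  It only carries the four RMLS encodings that
are visible in the hunk above; the remaining entries of the real table
are not shown in the patch context, so they are deliberately left out
rather than guessed.  rmls_limit_sketch() is a hypothetical name, it
takes the already-extracted RMLS field value instead of a full LPCR, and
returning 0 for an unlisted encoding is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/*
 * Map an RMLS field value to an RMA limit in bytes.
 * Only the encodings visible in the patch hunk are listed; any other
 * value yields 0 here (an assumption of this sketch).
 */
static uint64_t rmls_limit_sketch(unsigned rmls)
{
    static const uint64_t rma_sizes[] = {
        [0] = 256 * GiB,   /* the entry this patch adds */
        [1] = 16 * GiB,
        [2] = 1 * GiB,
        [3] = 64 * MiB,
    };

    if (rmls >= sizeof(rma_sizes) / sizeof(rma_sizes[0])) {
        return 0;
    }
    return rma_sizes[rmls];
}
```

Designated initializers keep the encoding-to-size mapping readable, and
the bounds check avoids indexing past the entries that are listed.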




Re: The issues about architecture of the COLO checkpoint

2020-02-19 Thread Zhang, Chen


On 2/18/2020 5:22 PM, Daniel Cho wrote:

Hi Hailiang,
Thanks for your help. If we run into any problems, we will contact you 
for help.



Hi Zhang,

" If colo-compare got a primary packet without related secondary 
packet in a certain time , it will automatically trigger checkpoint.  "
As you said, colo-compare will trigger a checkpoint, but does it 
need to limit the number of checkpoints?
When we use fio to do random writes to files, many checkpoints are 
taken, which causes low throughput on the PVM.

Is this situation normal for COLO?



Hi Daniel,

The checkpoint time is designed to be user-adjustable based on the user's 
environment (workload, network status, business conditions...).


In net/colo-compare.c

/* TODO: Should be configurable */
#define REGULAR_PACKET_CHECK_MS 3000

If you need it, I can send a patch for this issue that lets users change 
the value through QMP and QEMU monitor commands.
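
To make the timeout idea concrete, here is a minimal standalone sketch
of the kind of age check described in this thread.  It is not the code
in net/colo-compare.c: needs_checkpoint_sketch() and its parameters are
hypothetical, and it assumes the same REGULAR_PACKET_CHECK_MS value is
used as the age threshold.  The timeout is passed as a parameter to show
how a configurable value could be plugged in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REGULAR_PACKET_CHECK_MS 3000

/*
 * Return true if any queued primary packet has been waiting for its
 * secondary counterpart longer than timeout_ms.
 * creation_ms[] holds the arrival time of each pending primary packet.
 */
static bool needs_checkpoint_sketch(const int64_t *creation_ms,
                                    size_t count,
                                    int64_t now_ms,
                                    int64_t timeout_ms)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (now_ms - creation_ms[i] > timeout_ms) {
            return true;    /* packet is too old: force a checkpoint */
        }
    }
    return false;
}
```

A periodic timer would call such a check with the current time; a
smaller timeout_ms means more frequent checkpoints when the SVM falls
behind, which is the throughput trade-off raised above.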


Thanks

Zhang Chen




Best regards,
Daniel Cho

Zhang, Chen <chen.zh...@intel.com> wrote on Monday, 17 February 2020 
at 1:36 PM:



On 2/15/2020 11:35 AM, Daniel Cho wrote:

Hi Dave,

Yes, I agree with you, it does need a timeout.



Hi Daniel and Dave,

The current colo-compare already has a timeout mechanism, named
packet_check_timer.

It scans the primary packet queue to make sure that no primary packet
stays queued for too long.

If colo-compare gets a primary packet without a related secondary
packet within a certain time, it automatically triggers a checkpoint.

https://github.com/qemu/qemu/blob/master/net/colo-compare.c#L847


Thanks

Zhang Chen




Hi Hailiang,

We use the COLO feature based on qemu-4.1.0, and we found many
differences between the version in your patch and ours.
Could you point us to the latest release version that is close to your
development code?

Thanks.

Regards
Daniel Cho

Dr. David Alan Gilbert <dgilb...@redhat.com> wrote on Thursday, 13 February 2020 at 6:38 PM:

* Daniel Cho (daniel...@qnap.com )
wrote:
> Hi Hailiang,
>
> 1.
>     OK, we will try the patch
> “0001-COLO-Optimize-memory-back-up-process.patch”,
> and thanks for your help.
>
> 2.
>     We understand the reason to compare PVM and SVM's
packet. However, the
> empty of SVM's packet queue might happened on setting COLO
feature and SVM
> broken.
>
> On situation 1 ( setting COLO feature ):
>     We could force do checkpoint after setting COLO feature
finish, then it
> will protect the state of PVM and SVM . As the Zhang Chen said.
>
> On situation 2 ( SVM broken ):
>     COLO will do failover for PVM, so it might not cause
any wrong on PVM.
>
> However, those situations are our views, so there might be
a big difference
> between reality and our views.
> If we have any wrong views and opinions, please let us
know, and correct
> us.

It does need a timeout; the SVM being broken or being in a
state where
it never sends the corresponding packet (because of a state
difference)
can happen and COLO needs to timeout when the packet hasn't
arrived
after a while and trigger the checkpoint.

Dave

> Thanks.
>
> Best regards,
> Daniel Cho
>
> Zhang, Chen <chen.zh...@intel.com> wrote on Thursday, 13 February 2020
> at 10:17 AM:
>
> > Add cc Jason Wang, he is a network expert.
> >
> > In case some network things goes wrong.
> >
> >
> >
> > Thanks
> >
> > Zhang Chen
> >
> >
> >
> > *From:* Zhang, Chen
> > *Sent:* Thursday, February 13, 2020 10:10 AM
> > *To:* 'Zhanghailiang' mailto:zhang.zhanghaili...@huawei.com>>; Daniel Cho <
> > daniel...@qnap.com >
> > *Cc:* Dr. David Alan Gilbert mailto:dgilb...@redhat.com>>; qemu-devel@nongnu.org

> > *Subject:* RE: The issues about architecture of the COLO
checkpoint
> >
> >
> >
> > For the issue 2:
> >
> >
> >
> > COLO need use the network packets to confirm PVM and SVM
in the same state,
> >
> > Generally speaking, we can’t send PVM packets without
compared with SVM
> > packets.
> >
> > But to prevent jamming, I think COLO can do force
checkpoint and send the
> > PVM packets in this case.
> >
> >
> >
> > Thanks
> >
> > Zhang Chen
> >
> >
> >
> > *From:* Zhanghailiang mailto:zhang.zhanghaili...@huawei.com>>
> > *Sent:* Thursday, Februa

Re: [PATCH v3 04/12] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-19 Thread Paul Mackerras
On Wed, Feb 19, 2020 at 11:06:20AM -0300, Fabiano Rosas wrote:
> David Gibson  writes:
> 
> > When running guests under a hypervisor, the hypervisor obviously needs to
> > be protected from guest accesses even if those are in what the guest
> > considers real mode (translation off).  The POWER hardware provides two
> > ways of doing that: The old way has guest real mode accesses simply offset
> > and bounds checked into host addresses.  It works, but requires that a
> > significant chunk of the guest's memory - the RMA - be physically
> > contiguous in the host, which is pretty inconvenient.  The new way, known
> > as VRMA, has guest real mode accesses translated in roughly the normal way
> > but with some special parameters.
> >
> > In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
> > in POWER9 only VRMA mode is supported
> 
> ... when translation is off, right? Because I see in the 3.0 ISA that
> LPCR[VPM1] is still there.

VRMA stands for virtual real mode area, and the "real mode" part
implies that translation is off.  VRMA is not used when translation is
on because then the CPU is not in real mode.

LPCR[VPM1] is indeed still there, but it is a bit different to VPM0
(or what VPM0 used to do); VPM1 doesn't change how translation is
done, just what happens on a fault.

Paul.



Re: [PATCH v2] pcie_root_port: Add enable_hotplug option

2020-02-19 Thread Laine Stump

On 2/19/20 9:55 AM, Julia Suvorova wrote:

Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt
to manage it and restrict unplug for the whole machine. This is going to
prevent user-initiated unplug in guests (Windows mostly).
Hotplug is enabled by default.
Usage:
 -device pcie-root-port,enable-hotplug=false,...

If you want to disable hot-unplug on some downstream ports of one
switch, disable hot-unplug on PCIe Root Port connected to the upstream
port as well as on the selected downstream ports.

Discussion related:
 https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg00530.html

Signed-off-by: Julia Suvorova 
---
v1: https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg04868.html

v2:
 * change name of the option to 'enable-hotplug' [Laine]



Heh... I didn't actually expect you to do that just for me :-) 
(especially since I guess nobody else was bothered by "disable"). But 
now that you did, I look at it and realize that the "enable-" part is 
redundant, ie. just "hotplug=on|off|true|false" is plenty descriptive 
(since it's implied that it's being enabled).


But I've already created too much of a tempest over such a tiny detail, 
and kind of wish I'd just kept quiet instead...


I'll try to test this with libvirt in the next day or two.



 * change order of enabling capability bits [Igor]
 * enable HPS bit [Igor]
 * add option to xio3130_downstream [Ján]






[PATCH v9 2/3] Acceptance test: add "boot_linux" tests

2020-02-19 Thread Cleber Rosa
This acceptance test validates that a full-blown Linux guest can
successfully boot in QEMU.  In this specific case, the guest chosen is
Fedora version 31.

 * x86_64, pc-i440fx and pc-q35 machine types, with TCG and KVM as
   accelerators

 * aarch64 and virt machine type, with TCG and KVM as accelerators

 * ppc64 and pseries machine type with TCG as accelerator

 * s390x and s390-ccw-virtio machine type with TCG as accelerator

The Avocado vmimage utils library is used to download and cache the
Linux guest images, and from those images a snapshot image is created
and given to QEMU.  If a qemu-img binary is available in the build
directory, it's used to create the snapshot image, so that matching
qemu-system-* and qemu-img are used in the same test run.  If qemu-img
is not available in the build tree, the test looks for one installed
system-wide (in the $PATH).  If qemu-img is found in neither the build
dir nor the $PATH, the test is canceled.

The method for checking the successful boot is based on "cloudinit"
and its "phone home" feature.  The guest is given an ISO image with
the location of the phone home server, and the information to post
(the instance ID).  Once the correct information is received from the
guest, the test is considered to have PASSed.

This test is currently limited to user mode networking only, and
instructs the guest to connect to the "router" address that is hard
coded in QEMU.

To create the cloudinit ISO image that will be used to configure the
guest, the pycdlib library is also required and has been added as a
requirement to the virtual environment created by "check-venv".

The console output is read by a separate thread, by means of the
Avocado datadrainer utility module.

Signed-off-by: Cleber Rosa 
---
 .travis.yml|   2 +-
 tests/acceptance/boot_linux.py | 215 +
 tests/requirements.txt |   3 +-
 3 files changed, 218 insertions(+), 2 deletions(-)
 create mode 100644 tests/acceptance/boot_linux.py

diff --git a/.travis.yml b/.travis.yml
index 5887055951..0c54cdf40f 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -313,7 +313,7 @@ matrix:
 # Acceptance (Functional) tests
 - name: "GCC check-acceptance"
   env:
-- 
CONFIG="--target-list=aarch64-softmmu,alpha-softmmu,arm-softmmu,m68k-softmmu,microblaze-softmmu,mips-softmmu,mips64el-softmmu,nios2-softmmu,or1k-softmmu,ppc-softmmu,ppc64-softmmu,s390x-softmmu,sparc-softmmu,x86_64-softmmu,xtensa-softmmu"
+- CONFIG="--enable-tools 
--target-list=aarch64-softmmu,alpha-softmmu,arm-softmmu,m68k-softmmu,microblaze-softmmu,mips-softmmu,mips64el-softmmu,nios2-softmmu,or1k-softmmu,ppc-softmmu,ppc64-softmmu,s390x-softmmu,sparc-softmmu,x86_64-softmmu,xtensa-softmmu"
 - TEST_CMD="make check-acceptance"
   after_script:
 - python3 -c 'import json; r = 
json.load(open("tests/results/latest/results.json")); [print(t["logfile"]) for 
t in r["tests"] if t["status"] not in ("PASS", "SKIP")]' | xargs cat
diff --git a/tests/acceptance/boot_linux.py b/tests/acceptance/boot_linux.py
new file mode 100644
index 00..6787e79aea
--- /dev/null
+++ b/tests/acceptance/boot_linux.py
@@ -0,0 +1,215 @@
+# Functional test that boots a complete Linux system via a cloud image
+#
+# Copyright (c) 2018-2020 Red Hat, Inc.
+#
+# Author:
+#  Cleber Rosa 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+import os
+
+from avocado_qemu import Test, BUILD_DIR
+
+from qemu.accel import kvm_available
+from qemu.accel import tcg_available
+
+from avocado.utils import cloudinit
+from avocado.utils import network
+from avocado.utils import vmimage
+from avocado.utils import datadrainer
+
+ACCEL_NOT_AVAILABLE_FMT = "%s accelerator does not seem to be available"
+KVM_NOT_AVAILABLE = ACCEL_NOT_AVAILABLE_FMT % "KVM"
+TCG_NOT_AVAILABLE = ACCEL_NOT_AVAILABLE_FMT % "TCG"
+
+
+class BootLinux(Test):
+    """
+    Boots a Linux system, checking for a successful initialization
+    """
+
+    timeout = 900
+    chksum = None
+
+    def setUp(self):
+        super(BootLinux, self).setUp()
+        self.prepare_boot()
+        self.vm.add_args('-smp', '2')
+        self.vm.add_args('-m', '1024')
+        self.vm.add_args('-drive', 'file=%s' % self.boot.path)
+        self.prepare_cloudinit()
+
+    def prepare_boot(self):
+        self.log.info('Downloading/preparing boot image')
+        # Fedora 31 only provides ppc64le images
+        image_arch = self.arch
+        if image_arch == 'ppc64':
+            image_arch = 'ppc64le'
+        # If qemu-img has been built, use it, otherwise the system wide one
+        # will be used.  If none is available, the test will cancel.
+        qemu_img = os.path.join(BUILD_DIR, 'qemu-img')
+        if os.path.exists(qemu_img):
+            vmimage.QEMU_IMG = qemu_img
+        try:
+            self.boot = vmimage.get(
+                'fedora', arch=image_

[PATCH v9 0/3] Acceptance test: Add "boot_linux" acceptance test

2020-02-19 Thread Cleber Rosa
This acceptance test validates that a full-blown Linux guest can
successfully boot in QEMU.  In this specific case, the guest chosen is
Fedora version 31.  It covers the following architectures and
machine types:

 * x86_64, pc-i440fx and pc-q35 machine types, with TCG and KVM as
   accelerators

 * aarch64 and virt machine type, with TCG and KVM as accelerators

 * ppc64 and pseries machine type with TCG as accelerator

 * s390x and s390-ccw-virtio machine type with TCG as accelerator

This has been tested on x86_64, ppc64le and aarch64 hosts and has been
running reliably (in my experience) on Travis CI.

Git:
  - URI: https://github.com/clebergnu/qemu/tree/test_boot_linux_v9
  - Remote: https://github.com/clebergnu/qemu
  - Branch: test_boot_linux_v9

Travis CI:
  - Build: https://travis-ci.org/clebergnu/qemu/builds/652694503

Previous version:
  - v8: https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04095.html
  - v7: https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00220.html
  - v6: https://lists.gnu.org/archive/html/qemu-devel/2019-06/msg01202.html
  - v5: https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04652.html
  - v4: https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg02032.html
  - v3: https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg01677.html
  - v2: https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg04318.html
  - v1: http://lists.nongnu.org/archive/html/qemu-devel/2018-09/msg02530.html

Changes from v8:


* Renamed "BLD_DIR" to "BUILD_DIR", "SRC_DIR" to "SOURCE_DIR" and dropped
  "LNK_DIR" variables on tests/acceptance/avocado_qemu/__init__.py

* Changed memory allocation to 1024 MB, so that it puts less pressure on
  the host memory, and should be compatible with 32bit hosts (I've found
  no significant effects to the test times)

* Explicitly enabled TCG and skip tests if it's not available

* Added tags for when accel is TCG ("accel:tcg")

* Added additional tags for "pc" alias, that is, "pc-i440fx"

* Renamed tests to make the machine type and accelerator more explicit:
  - BootLinuxX8664.test_pc => BootLinuxX8664.test_pc_i440fx_tcg
  - BootLinuxX8664.test_pc_kvm => BootLinuxX8664.test_pc_i440fx_kvm
  - BootLinuxX8664.test_q35 => BootLinuxX8664.test_pc_q35_tcg
  - BootLinuxX8664.test_kvm_q35 => BootLinuxX8664.test_pc_q35_kvm
  - BootLinuxAarch64.test_virt => BootLinuxAarch64.test_virt_tcg
  - BootLinuxAarch64.test_kvm_virt => BootLinuxAarch64.test_virt_kvm
  - BootLinuxPPC64.test_pseries => BootLinuxPPC64.test_pseries_tcg
  - BootLinuxS390X.test_s390_ccw_virtio => BootLinuxS390X.test_s390_ccw_virtio_tcg

* Renamed target "get-vmimage" to "get-vm-images", and added a help
  entry under "check-help".

* Bumped pycdlib version to 1.9.0, which contains a fix for an endianness
  bug that was seen on s390x hosts.

Changes from v7:


This version drops a number of commits that had been already reviewed
and have been merged:

 * Dropped commit "Acceptance tests: use relative location for tests",
   already present in the latest master.

 * Dropped commit "Acceptance tests: use avocado tags for machine type",
   already present in the latest master.

 * Dropped commit: "Acceptance tests: introduce utility method for tags
   unique vals", already present in the latest master.

With regards to the handling of the build directory, and the usage of
a qemu-img binary from the build tree, the following changed:

 * Dropped commit "Acceptance tests: add the build directory to the
   system PATH", because the qemu-img binary to be used is now
   explicitly defined, instead of relying on the modification of the
   PATH environment variable.

 * Dropped commit "Acceptance tests: depend on qemu-img", replaced by
   explicitly setting the qemu-img binary to be used for snapshot
   generation.  Also, the newly added "--enable-tools" configure line
   on Travis CI makes sure that a matching qemu-img binary is
   available on CI.

 * Dropped commit "Acceptance tests: keep a stable reference to the
   QEMU build dir", replaced by a different approach that introduces
   variables tracking the build dir, source dir and link (from build
   to source) dir.

 * New commit "Acceptance tests: introduce BLD_DIR, SRC_DIR and
   LNK_DIR".

 * New commit "Acceptance tests: add make targets to download images",
   that downloads the cloud images, aka vmimages, before the test
   execution itself.

 * New commit "[TO BE REMOVED] Use Avocado master branch + vmimage fix"
   to facilitate the review/test of this version.
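
The explicit qemu-img selection described in the list above can be
sketched roughly as follows.  This is only an illustration:
pick_qemu_img() is a hypothetical helper name, and falling back to a
system-wide binary through shutil.which() is an assumption about how
"the system wide one" would be located.

```python
import os
import shutil


def pick_qemu_img(build_dir):
    """Prefer a qemu-img from the build tree, else one found on PATH.

    Hypothetical helper, not part of the series.  Returns None when no
    qemu-img can be found, in which case a test would be expected to
    cancel rather than fail.
    """
    candidate = os.path.join(build_dir, "qemu-img")
    if os.path.exists(candidate):
        return candidate
    # Fall back to whatever the system provides (may be None).
    return shutil.which("qemu-img")
```

A test would then assign the returned path to whatever knob the image
library exposes before requesting a snapshot.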

Additionally:

 * The check for the availability of kvm now makes use of the
   strengthened qemu.accel.kvm_available() and passes the QEMU binary
   as an argument to make sure KVM support is compiled into that
   binary.

 * The timeout was increased to 900 seconds.  This is just one extra
   step to avoid false negatives on very slow systems.  As a
   comparison, on Travis CI, on a x86_64 host, the slowest test takes
   around 250 seco

[PATCH v9 1/3] Acceptance tests: introduce BUILD_DIR and SOURCE_DIR

2020-02-19 Thread Cleber Rosa
Some tests may benefit from using resources from a build directory.
This introduces two variables that can help tests find resources in
those directories.

First, a BUILD_DIR is assumed to exist, given that the primary form of
running the acceptance tests is from a build directory (which may or
may not be the same as the source tree, that is, the SOURCE_DIR).

If the directory containing the acceptance tests happens to be a link
to a directory, it's assumed that it points to the source tree
(SOURCE_DIR), which is the behavior defined on the QEMU Makefiles.  If
the directory containing the acceptance tests is not a link, then an
in-tree build is assumed, and the BUILD_DIR and SOURCE_DIR have the
same value.
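
As a rough illustration of the rule described above, the lookup can be
written as a function of the path to avocado_qemu/__init__.py.  The
function name resolve_dirs() is hypothetical and only serves this
sketch; like the description, it assumes the link target is the
tests/acceptance directory of the source tree.

```python
import os


def resolve_dirs(init_file):
    """Return (build_dir, source_dir) for an avocado_qemu/__init__.py path.

    Hypothetical helper mirroring the commit message: the file is
    expected to live at <build>/tests/acceptance/avocado_qemu/__init__.py.
    """
    # <build>/tests/acceptance
    acceptance_dir = os.path.dirname(os.path.dirname(init_file))
    # Two more levels up: <build>
    build_dir = os.path.dirname(os.path.dirname(acceptance_dir))
    if os.path.islink(acceptance_dir):
        # Out-of-tree build: the link points at <source>/tests/acceptance
        target = os.readlink(acceptance_dir)
        source_dir = os.path.dirname(os.path.dirname(target))
    else:
        # In-tree build: both directories are the same
        source_dir = build_dir
    return build_dir, source_dir
```

Note that os.readlink() returns the link target exactly as stored, so a
relative link would yield a relative SOURCE_DIR in this sketch.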

Signed-off-by: Cleber Rosa 
---
 tests/acceptance/avocado_qemu/__init__.py | 25 +--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/tests/acceptance/avocado_qemu/__init__.py 
b/tests/acceptance/avocado_qemu/__init__.py
index d4358eb431..59e7b4f763 100644
--- a/tests/acceptance/avocado_qemu/__init__.py
+++ b/tests/acceptance/avocado_qemu/__init__.py
@@ -16,8 +16,21 @@ import tempfile
 
 import avocado
 
-SRC_ROOT_DIR = os.path.join(os.path.dirname(__file__), '..', '..', '..')
-sys.path.append(os.path.join(SRC_ROOT_DIR, 'python'))
+#: The QEMU build root directory.  It may also be the source directory
+#: if building from the source dir, but it's safer to use BUILD_DIR for
+#: that purpose.  Be aware that if this code is moved outside of a source
+#: and build tree, it will not be accurate.
+BUILD_DIR = os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__))))
+
+if os.path.islink(os.path.dirname(os.path.dirname(__file__))):
+    # The link to the acceptance tests dir in the source code directory
+    lnk = os.path.dirname(os.path.dirname(__file__))
+    #: The QEMU root source directory
+    SOURCE_DIR = os.path.dirname(os.path.dirname(os.readlink(lnk)))
+else:
+    SOURCE_DIR = BUILD_DIR
+
+sys.path.append(os.path.join(SOURCE_DIR, 'python'))
 
 from qemu.machine import QEMUMachine
 
@@ -49,10 +62,10 @@ def pick_default_qemu_bin(arch=None):
 if is_readable_executable_file(qemu_bin_relative_path):
 return qemu_bin_relative_path
 
-qemu_bin_from_src_dir_path = os.path.join(SRC_ROOT_DIR,
+qemu_bin_from_bld_dir_path = os.path.join(BUILD_DIR,
   qemu_bin_relative_path)
-if is_readable_executable_file(qemu_bin_from_src_dir_path):
-return qemu_bin_from_src_dir_path
+if is_readable_executable_file(qemu_bin_from_bld_dir_path):
+return qemu_bin_from_bld_dir_path
 
 
 def _console_interaction(test, success_message, failure_message,
@@ -153,7 +166,7 @@ class Test(avocado.Test):
 self.qemu_bin = self.params.get('qemu_bin',
 default=default_qemu_bin)
 if self.qemu_bin is None:
-self.cancel("No QEMU binary defined or found in the source tree")
+self.cancel("No QEMU binary defined or found in the build tree")
 
 def _new_vm(self, *args):
 vm = QEMUMachine(self.qemu_bin, sock_dir=tempfile.mkdtemp())
-- 
2.21.1




[PATCH v9 3/3] Acceptance tests: add make targets to download images

2020-02-19 Thread Cleber Rosa
The newly introduced "boot linux" tests make use of Linux images that
are larger than usual, and fall into what Avocado calls "vmimages",
and can be referred to by name, version and architecture.

The images can be downloaded automatically during the test. But, to
make for more reliable test results, this introduces a target that
will download the vmimages for the architectures that have been
configured and are available for the currently used distro (Fedora
31).
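
The selection of which images to download can be sketched in Python as
below.  This is not the Makefile logic itself, only a restatement of
it: the function name is hypothetical, and the input is assumed to be a
plain list of configured target architecture names.

```python
# Architectures for which a Fedora 31 cloud image is assumed to exist
FEDORA_31_ARCHES = ["x86_64", "aarch64", "ppc64le", "s390x"]


def fedora_31_downloads(targets):
    """Return the Fedora 31 image architectures to fetch for `targets`.

    Hypothetical helper: maps the ppc64 target to the ppc64le image
    (Fedora 31 only provides ppc64le), then keeps only architectures
    that have an image available, preserving the input order.
    """
    candidates = ["ppc64le" if t == "ppc64" else t for t in targets]
    return [arch for arch in candidates if arch in FEDORA_31_ARCHES]
```

With targets such as arm, ppc64 and x86_64 configured, only the ppc64le
and x86_64 images would be requested.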

Signed-off-by: Cleber Rosa 
---
 tests/Makefile.include | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 2f1cafed72..3fc6e4f2cc 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -20,6 +20,8 @@ check-help:
@echo " $(MAKE) check-venv   Creates a Python venv for tests"
@echo " $(MAKE) check-clean  Clean the tests and related data"
@echo
+   @echo " $(MAKE) get-vm-images  Downloads all images used by acceptance tests, according to configured targets (~350 MB each, 1.5 GB max)"
+   @echo
@echo
@echo "The variable SPEED can be set to control the gtester speed setting."
@echo "Default options are -k and (for $(MAKE) V=1) --verbose; they can be"
@@ -886,7 +888,20 @@ $(TESTS_RESULTS_DIR):
 
 check-venv: $(TESTS_VENV_DIR)
 
-check-acceptance: check-venv $(TESTS_RESULTS_DIR)
+FEDORA_31_ARCHES_CANDIDATES=$(patsubst ppc64,ppc64le,$(TARGETS))
+FEDORA_31_ARCHES := x86_64 aarch64 ppc64le s390x
+FEDORA_31_DOWNLOAD=$(filter $(FEDORA_31_ARCHES),$(FEDORA_31_ARCHES_CANDIDATES))
+
+# download one specific Fedora 31 image
+get-vm-image-fedora-31-%: $(check-venv)
+   $(call quiet-command, \
+ $(TESTS_VENV_DIR)/bin/python -m avocado vmimage get \
+ --distro=fedora --distro-version=31 --arch=$*)
+
+# download all vm images, according to defined targets
+get-vm-images: $(check-venv) $(patsubst %,get-vm-image-fedora-31-%, $(FEDORA_31_DOWNLOAD))
+
+check-acceptance: check-venv $(TESTS_RESULTS_DIR) get-vm-images
$(call quiet-command, \
 $(TESTS_VENV_DIR)/bin/python -m avocado \
 --show=$(AVOCADO_SHOW) run --job-results-dir=$(TESTS_RESULTS_DIR) \
@@ -897,7 +912,7 @@ check-acceptance: check-venv $(TESTS_RESULTS_DIR)
 
 # Consolidated targets
 
-.PHONY: check-block check-qapi-schema check-qtest check-unit check check-clean
+.PHONY: check-block check-qapi-schema check-qtest check-unit check check-clean get-vm-images
 check-qapi-schema: check-tests/qapi-schema/frontend check-tests/qapi-schema/doc-good.texi
 check-qtest: $(patsubst %,check-qtest-%, $(QTEST_TARGETS))
 ifeq ($(CONFIG_TOOLS),y)
-- 
2.21.1




Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-19 Thread Alexey Kardashevskiy



On 19/02/2020 18:18, Cédric Le Goater wrote:
> On 2/19/20 7:44 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 19/02/2020 12:20, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 18/02/2020 23:59, Cédric Le Goater wrote:
 On 2/18/20 1:48 PM, Cédric Le Goater wrote:
> On 2/18/20 10:40 AM, Cédric Le Goater wrote:
>> On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 18/02/2020 20:05, Alexey Kardashevskiy wrote:


 On 18/02/2020 18:12, Cédric Le Goater wrote:
> On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 17/02/2020 20:48, Cédric Le Goater wrote:
>>> On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
 The following changes since commit 
 05943fb4ca41f626078014c0327781815c6584c5:

   ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 
 +1100)

 are available in the Git repository at:

   g...@github.com:aik/qemu.git tags/qemu-slof-20200217

 for you to fetch changes up to 
 ea9a03e5aa023c5391bab5259898475d0298aac2:

   pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)

 
 Alexey Kardashevskiy (1):
   pseries: Update SLOF firmware image

  pc-bios/README   |   2 +-
  pc-bios/slof.bin | Bin 931032 -> 968560 bytes
  roms/SLOF|   2 +-
  3 files changed, 2 insertions(+), 2 deletions(-)


 *** Note: this is not for master, this is for pseries

>>>
>>> Hello Alexey,
>>>
>>> QEMU fails to boot from disk. See below.
>>
>>
>> It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
>> could have broken something but I need more detail. Thanks,
>
> fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 


 No, not that either:
>>>
>>>
>>> but it might be because of power9 - I only tried power8, rsyncing the
>>> image to a p9 machine now...
>>
>> Here is the disk : 
>>
>> Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
>> Disk model: QEMU HARDDISK   
>> Units: sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> Disklabel type: gpt
>> Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D
>>
>> Device Start   End   Sectors Size Type
>> /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
>> /dev/sda2  16384 100679679 100663296  48G Linux filesystem
>> /dev/sda3  100679680 104857566   4177887   2G Linux swap
>>
>>
>> GPT ? 
>
> For the failure, I bisected up to :
>
> f12149908705 ("ext2: Read all 64bit of inode number")

 Here is a possible fix for it. I did some RPN on my hp28s in the past 
 but I am not forth fluent.
>>>
>>>
>>> you basically zeroed the top bits by shifting them too far right :)
>>>
>>> The proper fix I think is:
>>>
>>> -  32 lshift or
>>> +  20 lshift or
>>>
>>> I keep forgetting it is all in hex. Can you please give it a try? My
>>> 128GB disk does not expose this problem somehow. Thanks,
>>
>> Better try this one please:
>>
>> https://github.com/aik/SLOF/tree/ext4
> Tested with the same image. Looks good. 


Thanks for testing. But it is still bizarre behaviour, why do we end up
there anyway...
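
For anyone else who trips over the numeric base in the shift fix quoted
above, here is a small Python sketch of the difference.  It assumes the
code in question combines two 32-bit halves into one 64-bit value; the
names are made up for the illustration and do not come from SLOF.

```python
MASK64 = (1 << 64) - 1

# While the numeric base is hex, the literals in the quoted fix mean:
HEX_32 = int("32", 16)  # "32 lshift" shifts by 50 bits
HEX_20 = int("20", 16)  # "20 lshift" shifts by 32 bits


def combine_halves(high, low, shift):
    """Join two 32-bit halves into a 64-bit cell (hypothetical helper)."""
    return ((high << shift) | low) & MASK64


# With a non-zero high half only the 32-bit shift gives the intended value.
good = combine_halves(0x1, 0x5, HEX_20)  # 0x0000000100000005
bad = combine_halves(0x1, 0x5, HEX_32)   # high half lands at bit 50
```

When the high half is zero both shifts give the same result, so in this
sketch the mistake only shows up for values with non-zero upper bits.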


>> What I still do not understand is why GRUB is using ext2 from SLOF, it
>> should parse ext4 itself :-/
> 
> Here is the fs information.
> 
> 
> Filesystem volume name:   
> Last mounted on:  /
> Filesystem UUID:  8d53f6b4-ffc2-4d8f-bd09-67ac97d7b0c5
> Filesystem magic number:  0xEF53
> Filesystem revision #:1 (dynamic)
> Filesystem features:  has_journal ext_attr resize_inode dir_index 
> filetype needs_recovery extent flex_bg sparse_super large_file huge_file 
> uninit_bg dir_nlink extra_isize


huh, this one does not have 64bit like mine, I blindly assumed that by
2020 everything would be using that. Well that explains the bug. And
yours also has uninit_bg (the whole idea of this flag is not obvious but
ok).


> Filesystem flags: unsigned_directory_hash 
> Default mount options:user_xattr acl
> Filesystem state: clean
> Errors behavior:  Continue
> Filesystem OS type:   Linux
> Inode count:  3127296
> Block count:  12582912
> Reserved block count: 552210
> Free blocks:  7907437
> Free inodes:  2863361
> First block:  0
> Block size:   4096
> Fragment size:4096


Mine here has:

Re: [PATCH v7 0/4] colo: Add support for continuous replication

2020-02-19 Thread Zhang, Chen

Hi Jason,

I noticed this series has not been merged or queued yet. Did you run into
a problem with it?



Thanks

Zhang Chen



Max Reitz ; qemu-block 
Subject: Re: [PATCH v7 0/4] colo: Add support for continuous
replication

On Fri, 25 Oct 2019 19:06:31 +0200
Lukas Straub  wrote:


Hello Everyone,
These Patches add support for continuous replication to colo. This
means that after the Primary fails and the Secondary did a failover,
the Secondary can then become Primary and resume replication to a
new Secondary.

Regards,
Lukas Straub

v7:
   - clarify meaning of ip's in documentation and note that active and hidden
     images just need to be created once
   - reverted removal of bdrv_is_root_node(top_bs) in replication and adjusted
     the top-id= parameter in documentation accordingly

v6:
   - documented the position= and insert= options
   - renamed replication test
   - clarified documentation by using different ip's for primary and
     secondary
   - added Reviewed-by tags

v5:
   - change syntax for the position= parameter
   - fix spelling mistake

v4:
   - fix checkpatch.pl warnings

v3:
   - add test for replication changes
   - check if the filter to be inserted before/behind belongs to the
     same interface
   - fix the error message for the position= parameter
   - rename term "after" -> "behind" and variable "insert_before" ->
     "insert_before_flag"

   - document the quorum node on the secondary side
   - simplify quorum parameters in documentation
   - remove trailing spaces in documentation
   - clarify the testing procedure in documentation

v2:
   - fix email formatting
   - fix checkpatch.pl warnings
   - fix patchew error
   - clearer commit messages


Lukas Straub (4):
block/replication.c: Ignore requests after failover
tests/test-replication.c: Add test for for secondary node continuing
  replication
net/filter.c: Add Options to insert filters anywhere in the filter
  list
colo: Update Documentation for continuous replication

   block/replication.c|  35 +-
   docs/COLO-FT.txt   | 224 +++-
   docs/block-replication.txt |  28 +++--
   include/net/filter.h   |   2 +
   net/filter.c   |  92 ++-
   qemu-options.hx|  31 -
   tests/test-replication.c   |  52 +
   7 files changed, 389 insertions(+), 75 deletions(-)


Hello Everyone,
So I guess this is ready for merging or will that have to wait until
the 4.2 release is done?

Due to the QEMU 4.2 release schedule, the master branch does not accept
feature changes after the soft feature freeze (Oct 29).

But I don't know whether a sub-maintainer (block or net) can queue this
series first and then merge it after the 4.2 release?

Thanks
Zhang Chen


Will try to queue this series.

Thank you Jason~

Thanks
Zhang Chen


Thanks



Regards,
Lukas Straub




Re: [PATCH v3 02/12] ppc: Remove stub of PPC970 HID4 implementation

2020-02-19 Thread David Gibson
On Wed, Feb 19, 2020 at 12:18:34PM +0100, BALATON Zoltan wrote:
> On Wed, 19 Feb 2020, David Gibson wrote:
> > The PowerPC 970 CPU was a cut-down POWER4, which had hypervisor capability.
> > However, it can be (and often was) strapped into "Apple mode", where the
> > hypervisor capabilities were disabled (essentially putting it always in
> > hypervisor mode).
> > 
> > That's actually the only mode of the 970 we support in qemu, and we're
> > unlikely to change that any time soon.  However, we do have a partial
> > implementation of the 970's HID4 register which affects things only
> > relevant for hypervisor mode.
> > 
> > That stub is also really ugly, since it attempts to duplicate the effects
> > of HID4 by re-encoding it into the LPCR register used in newer CPUs, but
> > in a really confusing way.
> > 
> > Just get rid of it.
> > 
> > Signed-off-by: David Gibson 
> > Reviewed-by: Cédric Le Goater 
> > Reviewed-by: Greg Kurz 
> > ---
> > target/ppc/mmu-hash64.c | 28 +---
> > target/ppc/translate_init.inc.c | 17 ++---
> > 2 files changed, 7 insertions(+), 38 deletions(-)
> > 
> > diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> > index da8966ccf5..a881876647 100644
> > --- a/target/ppc/mmu-hash64.c
> > +++ b/target/ppc/mmu-hash64.c
> > @@ -1091,33 +1091,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong 
> > val)
> > 
> > /* Filter out bits */
> > switch (env->mmu_model) {
> > -case POWERPC_MMU_64B: /* 970 */
> > -if (val & 0x40) {
> > -lpcr |= LPCR_LPES0;
> > -}
> > -if (val & 0x8000ull) {
> > -lpcr |= LPCR_LPES1;
> > -}
> > -if (val & 0x20) {
> > -lpcr |= (0x4ull << LPCR_RMLS_SHIFT);
> > -}
> > -if (val & 0x4000ull) {
> > -lpcr |= (0x2ull << LPCR_RMLS_SHIFT);
> > -}
> > -if (val & 0x2000ull) {
> > -lpcr |= (0x1ull << LPCR_RMLS_SHIFT);
> > -}
> > -env->spr[SPR_RMOR] = ((lpcr >> 41) & 0xull) << 26;
> > -
> > -/*
> > - * XXX We could also write LPID from HID4 here
> > - * but since we don't tag any translation on it
> > - * it doesn't actually matter
> > - *
> > - * XXX For proper emulation of 970 we also need
> > - * to dig HRMOR out of HID5
> > - */
> > -break;
> > case POWERPC_MMU_2_03: /* P5p */
> > lpcr = val & (LPCR_RMLS | LPCR_ILE |
> >   LPCR_LPES0 | LPCR_LPES1 |
> > @@ -1154,6 +1127,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
> > }
> > break;
> > default:
> > +g_assert_not_reached();
> > ;
> 
> Is this empty statement (lone semicolon) still needed now that you've added
> something to this case? Thought it was only there to be able to add a label
> to it so it could be removed now. (Does this count as a double ; that a
> recent patch was trying to fix?)

The ; is redundant, but given this whole chunk of code is removed
later in the series, I don't think it's worth messing with.

> 
> Regards,
> BALATON Zoltan
> 
> > }
> > env->spr[SPR_LPCR] = lpcr;
> > diff --git a/target/ppc/translate_init.inc.c 
> > b/target/ppc/translate_init.inc.c
> > index a0d0eaabf2..d7d4f012b8 100644
> > --- a/target/ppc/translate_init.inc.c
> > +++ b/target/ppc/translate_init.inc.c
> > @@ -7895,25 +7895,20 @@ static void spr_write_lpcr(DisasContext *ctx, int 
> > sprn, int gprn)
> > {
> > gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
> > }
> > -
> > -static void spr_write_970_hid4(DisasContext *ctx, int sprn, int gprn)
> > -{
> > -#if defined(TARGET_PPC64)
> > -spr_write_generic(ctx, sprn, gprn);
> > -gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
> > -#endif
> > -}
> > -
> > #endif /* !defined(CONFIG_USER_ONLY) */
> > 
> > static void gen_spr_970_lpar(CPUPPCState *env)
> > {
> > #if !defined(CONFIG_USER_ONLY)
> > /* Logical partitionning */
> > -/* PPC970: HID4 is effectively the LPCR */
> > +/* PPC970: HID4 covers things later controlled by the LPCR and
> > + * RMOR in later CPUs, but with a different encoding.  We only
> > + * support the 970 in "Apple mode" which has all hypervisor
> > + * facilities disabled by strapping, so we can basically just
> > + * ignore it */
> > spr_register(env, SPR_970_HID4, "HID4",
> >  SPR_NOACCESS, SPR_NOACCESS,
> > - &spr_read_generic, &spr_write_970_hid4,
> > + &spr_read_generic, &spr_write_generic,
> >  0x);
> > #endif
> > }
> > 


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v2 fixed 11/16] util/mmap-alloc: Prepare for resizable mmaps

2020-02-19 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:49PM +0100, David Hildenbrand wrote:
> @@ -178,13 +183,15 @@ void *qemu_ram_mmap(int fd,
>  size_t offset, total;
>  void *ptr, *guardptr;
>  
> +g_assert(QEMU_IS_ALIGNED(size, pagesize));

(NOTE: assertion is fine, but as I mentioned in previous patch, I
 think this pagesize could not be the real one that's going to be
 mapped...)

> +
>  /*
>   * Note: this always allocates at least one extra page of virtual address
>   * space, even if size is already aligned.
>   */
>  total = size + align;
>  
> -guardptr = mmap_reserve(total, fd);
> +guardptr = mmap_reserve(0, total, fd);

s/0/NULL/

Reviewed-by: Peter Xu 
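
To make the "one extra page" comment in the quoted hunk concrete, the
address arithmetic can be sketched as below.  This is a sketch under
assumptions: the hunk only shows total = size + align, so the way the
aligned start is picked inside the reservation is my reading rather
than something quoted here, and plan_mapping() is a made-up name.

```python
def plan_mapping(guard_addr, size, align, pagesize):
    """Describe where RAM and the guard page land in a reservation.

    Hypothetical helper.  `guard_addr` is the start of a reservation of
    `size + align` bytes; `align` is assumed to be a multiple of
    `pagesize`, and `guard_addr` to be page aligned.
    """
    # Same precondition as the g_assert() in the quoted hunk
    assert size % pagesize == 0
    total = size + align
    # Bytes to skip so that the RAM block starts on an `align` boundary
    offset = -guard_addr % align
    ram_start = guard_addr + offset
    return {
        "total": total,
        "offset": offset,
        "ram_start": ram_start,
        # One guard page directly after the RAM block
        "guard_page": ram_start + size,
        # Reserved bytes left over after RAM block plus guard page
        "trailing_unused": total - offset - size - pagesize,
    }
```

Under these assumptions there is always room for at least one guard
page, even when size is already a multiple of align.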

-- 
Peter Xu




Re: [PATCH v2 fixed 10/16] util/mmap-alloc: Factor out populating of memory to mmap_populate()

2020-02-19 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:48PM +0100, David Hildenbrand wrote:
> We want to populate memory within a reserved memory region. Let's factor
> that out.
> 
> Reviewed-by: Richard Henderson 
> Acked-by: Murilo Opsfelder Araujo 
> Cc: Igor Kotrasinski 
> Cc: "Michael S. Tsirkin" 
> Cc: Greg Kurz 
> Cc: Murilo Opsfelder Araujo 
> Cc: Eduardo Habkost 
> Cc: "Dr. David Alan Gilbert" 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 

The naming could be a bit misleading IMO, because we didn't populate
the memory and it's still serviced on demand.  However I don't have a
quick and better name of that either...

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 09/16] util/mmap-alloc: Factor out reserving of a memory region to mmap_reserve()

2020-02-19 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:47PM +0100, David Hildenbrand wrote:
> We want to reserve a memory region without actually populating memory.
> Let's factor that out.
> 
> Reviewed-by: Igor Kotrasinski 
> Acked-by: Murilo Opsfelder Araujo 
> Reviewed-by: Richard Henderson 
> Cc: "Michael S. Tsirkin" 
> Cc: Greg Kurz 
> Cc: Murilo Opsfelder Araujo 
> Cc: Eduardo Habkost 
> Cc: "Dr. David Alan Gilbert" 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 08/16] util/mmap-alloc: Factor out calculation of pagesize to mmap_pagesize()

2020-02-19 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:46PM +0100, David Hildenbrand wrote:
> Factor it out and add a comment.
> 
> Reviewed-by: Igor Kotrasinski 
> Acked-by: Murilo Opsfelder Araujo 
> Reviewed-by: Richard Henderson 
> Cc: "Michael S. Tsirkin" 
> Cc: Murilo Opsfelder Araujo 
> Cc: Greg Kurz 
> Cc: Eduardo Habkost 
> Cc: "Dr. David Alan Gilbert" 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 
> ---
>  util/mmap-alloc.c | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> index 27dcccd8ec..82f02a2cec 100644
> --- a/util/mmap-alloc.c
> +++ b/util/mmap-alloc.c
> @@ -82,17 +82,27 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
>  return qemu_real_host_page_size;
>  }
>  
> +static inline size_t mmap_pagesize(int fd)
> +{
> +#if defined(__powerpc64__) && defined(__linux__)
> +/* Mappings in the same segment must share the same page size */
> +return qemu_fd_getpagesize(fd);
> +#else
> +return qemu_real_host_page_size;
> +#endif
> +}

Pure question: This will return 4K even for huge pages on x86, is this
what we want?

This is of course not related to this specific patch which still
follows the old code, but I'm thinking whether it was intended or not
even in the old code (or is there anything to do with the
MAP_NORESERVE fix for ppc64 huge pages?).  Do you know the answer?

Thanks,

> +
>  void *qemu_ram_mmap(int fd,
>  size_t size,
>  size_t align,
>  bool shared,
>  bool is_pmem)
>  {
> +const size_t pagesize = mmap_pagesize(fd);
>  int flags;
>  int map_sync_flags = 0;
>  int guardfd;
>  size_t offset;
> -size_t pagesize;
>  size_t total;
>  void *guardptr;
>  void *ptr;
> @@ -113,7 +123,6 @@ void *qemu_ram_mmap(int fd,
>   * anonymous memory is OK.
>   */
>  flags = MAP_PRIVATE;
> -pagesize = qemu_fd_getpagesize(fd);
>  if (fd == -1 || pagesize == qemu_real_host_page_size) {
>  guardfd = -1;
>  flags |= MAP_ANONYMOUS;
> @@ -123,7 +132,6 @@ void *qemu_ram_mmap(int fd,
>  }
>  #else
>  guardfd = -1;
> -pagesize = qemu_real_host_page_size;
>  flags = MAP_PRIVATE | MAP_ANONYMOUS;
>  #endif
>  
> @@ -198,15 +206,10 @@ void *qemu_ram_mmap(int fd,
>  
>  void qemu_ram_munmap(int fd, void *ptr, size_t size)
>  {
> -size_t pagesize;
> +const size_t pagesize = mmap_pagesize(fd);
>  
>  if (ptr) {
>  /* Unmap both the RAM block and the guard page */
> -#if defined(__powerpc64__) && defined(__linux__)
> -pagesize = qemu_fd_getpagesize(fd);
> -#else
> -pagesize = qemu_real_host_page_size;
> -#endif
>  munmap(ptr, size + pagesize);
>  }
>  }
> -- 
> 2.24.1
> 
> 

-- 
Peter Xu




[PATCH v2 14/20] linux-user, x86_64: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall_64.tbl and syscallhdr.sh from linux/arch/x86/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h
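
For illustration only, the kind of transformation involved can be
sketched in Python as below.  The real work is done by the shell script
in this patch; the function name, the TARGET_NR_ prefix and the exact
output format used here are assumptions made for the sketch, not taken
from syscallhdr.sh.

```python
def generate_syscall_nr(table_lines, abis, prefix="TARGET_NR_"):
    """Turn syscall table rows into '#define' lines for selected ABIs.

    Hypothetical helper.  Each non-comment row is expected to hold
    whitespace-separated fields: number, abi, name and optionally an
    entry point, which is ignored here.
    """
    defines = []
    for line in table_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        number, abi, name = fields[0], fields[1], fields[2]
        if abi in abis:
            defines.append("#define %s%s %s" % (prefix, name, number))
    return defines
```

Called with the ABIs "common" and "64", which is what
TARGET_SYSTBL_ABI=common,64 in this patch selects, rows marked "x32"
would be left out.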

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove  dependencies to syscall_nr.h in source directory

 configure|   3 +-
 linux-user/Makefile.objs |   1 +
 linux-user/x86_64/Makefile.objs  |   5 +
 linux-user/x86_64/syscall_64.tbl | 402 +++
 linux-user/x86_64/syscall_nr.h   | 356 ---
 linux-user/x86_64/syscallhdr.sh  |  28 +++
 6 files changed, 438 insertions(+), 357 deletions(-)
 create mode 100644 linux-user/x86_64/Makefile.objs
 create mode 100644 linux-user/x86_64/syscall_64.tbl
 delete mode 100644 linux-user/x86_64/syscall_nr.h
 create mode 100644 linux-user/x86_64/syscallhdr.sh

diff --git a/configure b/configure
index c5d342356e8a..38fe8c91eff8 100755
--- a/configure
+++ b/configure
@@ -1858,7 +1858,7 @@ rm -f */config-devices.mak.d
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
 for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x sparc sparc64 \
-i386 ; do
+i386 x86_64 ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7690,6 +7690,7 @@ case "$target_name" in
   ;;
   x86_64)
 TARGET_BASE_ARCH=i386
+TARGET_SYSTBL_ABI=common,64
 mttcg="yes"
gdb_xml_files="i386-64bit.xml"
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 720d9773b813..1791bc48cd17 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -19,4 +19,5 @@ obj-$(TARGET_S390X) += s390x/
 obj-$(TARGET_SH4) += sh4/
 obj-$(TARGET_SPARC) += sparc/
 obj-$(TARGET_SPARC64) += $(TARGET_ABI_DIR)/
+obj-$(TARGET_X86_64) += x86_64/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/x86_64/Makefile.objs b/linux-user/x86_64/Makefile.objs
new file mode 100644
index ..2cef1d48becc
--- /dev/null
+++ b/linux-user/x86_64/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/x86_64/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/x86_64/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/x86_64/syscall_64.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/x86_64/syscall_64.tbl b/linux-user/x86_64/syscall_64.tbl
new file mode 100644
index ..c29976eca4a8
--- /dev/null
+++ b/linux-user/x86_64/syscall_64.tbl
@@ -0,0 +1,402 @@
+#
+# 64-bit system call numbers and entry vectors
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The __x64_sys_*() stubs are created on-the-fly for sys_*() system calls
+#
+# The abi is "common", "64" or "x32" for this file.
+#
+0   common  read            __x64_sys_read
+1   common  write           __x64_sys_write
+2   common  open            __x64_sys_open
+3   common  close           __x64_sys_close
+4   common  stat            __x64_sys_newstat
+5   common  fstat           __x64_sys_newfstat
+6   common  lstat           __x64_sys_newlstat
+7   common  poll            __x64_sys_poll
+8   common  lseek           __x64_sys_lseek
+9   common  mmap            __x64_sys_mmap
+10  common  mprotect        __x64_sys_mprotect
+11  common  munmap          __x64_sys_munmap
+12  common  brk             __x64_sys_brk
+13  64      rt_sigaction    __x64_sys_rt_sigaction
+14  common  rt_sigprocmask  __x64_sys_rt_sigprocmask
+15  64      rt_sigreturn    __x64_sys_rt_sigreturn/ptregs
+16  64      ioctl           __x64_sys_ioctl
+17  common  pread64         __x64_sys_pread64
+18  common  pwrite64        __x64_sys_pwrite64
+19  64      readv           __x64_sys_readv
+20  64      writev          __x64_sys_writev
+21  common  access          __x64_sys_access
+22  common  pipe            __x64_sys_pipe
+23  common  select          __x64_sys_select
+24  common  sched_yield     __x64_sys_sched_yield
+25  common  mremap          __x64_sys_mremap
+26  common  msync           __x64_sys_msync
+27  common  mincore         __x64_sys_mincore
+28  common  madvise         __x64_sys_madvise
+29  common  shmget          __x64_sys_shmget
+30  common  shmat           __x64_sys_shmat
+31  common  shmctl          __x64_sys_shmctl
+32  common  dup             __x64_sys_dup
+33  common  dup2            __x64_sys_dup2
+34  common  pause           __x64_sys_pause
+35  common  nanosleep       __x64_sys_nanosleep
+36  common  getitimer

[PATCH v2 19/20] linux-user,mips: move content of mips_syscall_args

2020-02-19 Thread Laurent Vivier
Move content of mips_syscall_args to mips-syscall-args-o32.c.inc to
ease automatic update. No functional change.

Signed-off-by: Laurent Vivier 
---
 linux-user/mips/cpu_loop.c | 440 +
 linux-user/mips/syscall-args-o32.c.inc | 438 
 2 files changed, 439 insertions(+), 439 deletions(-)
 create mode 100644 linux-user/mips/syscall-args-o32.c.inc

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 396367d81d8d..553e8ca7f576 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -26,447 +26,9 @@
 
 # ifdef TARGET_ABI_MIPSO32
 #  define MIPS_SYSCALL_NUMBER_UNUSED -1
-#  define MIPS_SYS(name, args) args,
 static const int8_t mips_syscall_args[] = {
-MIPS_SYS(sys_syscall, 8)/* 4000 */
-MIPS_SYS(sys_exit   , 1)
-MIPS_SYS(sys_fork   , 0)
-MIPS_SYS(sys_read   , 3)
-MIPS_SYS(sys_write  , 3)
-MIPS_SYS(sys_open   , 3)/* 4005 */
-MIPS_SYS(sys_close  , 1)
-MIPS_SYS(sys_waitpid, 3)
-MIPS_SYS(sys_creat  , 2)
-MIPS_SYS(sys_link   , 2)
-MIPS_SYS(sys_unlink , 1)/* 4010 */
-MIPS_SYS(sys_execve , 0)
-MIPS_SYS(sys_chdir  , 1)
-MIPS_SYS(sys_time   , 1)
-MIPS_SYS(sys_mknod  , 3)
-MIPS_SYS(sys_chmod  , 2)/* 4015 */
-MIPS_SYS(sys_lchown , 3)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_stat */
-MIPS_SYS(sys_lseek  , 3)
-MIPS_SYS(sys_getpid , 0)/* 4020 */
-MIPS_SYS(sys_mount  , 5)
-MIPS_SYS(sys_umount , 1)
-MIPS_SYS(sys_setuid , 1)
-MIPS_SYS(sys_getuid , 0)
-MIPS_SYS(sys_stime  , 1)/* 4025 */
-MIPS_SYS(sys_ptrace , 4)
-MIPS_SYS(sys_alarm  , 1)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_fstat */
-MIPS_SYS(sys_pause  , 0)
-MIPS_SYS(sys_utime  , 2)/* 4030 */
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_access , 2)
-MIPS_SYS(sys_nice   , 1)
-MIPS_SYS(sys_ni_syscall , 0)/* 4035 */
-MIPS_SYS(sys_sync   , 0)
-MIPS_SYS(sys_kill   , 2)
-MIPS_SYS(sys_rename , 2)
-MIPS_SYS(sys_mkdir  , 2)
-MIPS_SYS(sys_rmdir  , 1)/* 4040 */
-MIPS_SYS(sys_dup, 1)
-MIPS_SYS(sys_pipe   , 0)
-MIPS_SYS(sys_times  , 1)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_brk, 1)/* 4045 */
-MIPS_SYS(sys_setgid , 1)
-MIPS_SYS(sys_getgid , 0)
-MIPS_SYS(sys_ni_syscall , 0)/* was signal(2) */
-MIPS_SYS(sys_geteuid, 0)
-MIPS_SYS(sys_getegid, 0)/* 4050 */
-MIPS_SYS(sys_acct   , 0)
-MIPS_SYS(sys_umount2, 2)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ioctl  , 3)
-MIPS_SYS(sys_fcntl  , 3)/* 4055 */
-MIPS_SYS(sys_ni_syscall , 2)
-MIPS_SYS(sys_setpgid, 2)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_olduname   , 1)
-MIPS_SYS(sys_umask  , 1)/* 4060 */
-MIPS_SYS(sys_chroot , 1)
-MIPS_SYS(sys_ustat  , 2)
-MIPS_SYS(sys_dup2   , 2)
-MIPS_SYS(sys_getppid, 0)
-MIPS_SYS(sys_getpgrp, 0)/* 4065 */
-MIPS_SYS(sys_setsid , 0)
-MIPS_SYS(sys_sigaction  , 3)
-MIPS_SYS(sys_sgetmask   , 0)
-MIPS_SYS(sys_ssetmask   , 1)
-MIPS_SYS(sys_setreuid   , 2)/* 4070 */
-MIPS_SYS(sys_setregid   , 2)
-MIPS_SYS(sys_sigsuspend , 0)
-MIPS_SYS(sys_sigpending , 1)
-MIPS_SYS(sys_sethostname, 2)
-MIPS_SYS(sys_setrlimit  , 2)/* 4075 */
-MIPS_SYS(sys_getrlimit  , 2)
-MIPS_SYS(sys_getrusage  , 2)
-MIPS_SYS(sys_gettimeofday, 2)
-MIPS_SYS(sys_settimeofday, 2)
-MIPS_SYS(sys_getgroups  , 2)/* 4080 */
-MIPS_SYS(sys_setgroups  , 2)
-MIPS_SYS(sys_ni_syscall , 0)/* old_select */
-MIPS_SYS(sys_symlink, 2)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_lstat */
-MIPS_SYS(sys_readlink   , 3)/* 4085 */
-MIPS_SYS(sys_uselib , 1)
-MIPS_SYS(sys_swapon , 2)
-MIPS_SYS(sys_reboot , 3)
-MIPS_SYS(old_readdir, 3)
-MIPS_SYS(old_mmap   , 6)/* 4090 */
-MIPS_SYS(sys_munmap , 2)
-MIPS_SYS(sys_truncate   , 2)
-MIPS_SYS(sys_ftruncate  , 2)
-MIPS_SYS(sys_fchmod , 2)
-MIPS_SYS(sys_fchown , 3)/* 4095 */
-MIPS_SYS(sys_getpriority, 2)
-MIPS_SYS(sys_setpriority, 3)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_statfs , 2)
-MI

[PATCH v2 13/20] linux-user, i386: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall_32.tbl and syscallhdr.sh from linux/arch/x86/entry/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Disable arch_prctl in syscall_32.tbl because linux-user/syscall.c only
defines do_arch_prctl() with TARGET_ABI32, and TARGET_ABI32 is never
defined for TARGET_I386 (This needs to be fixed).

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure  |   3 +-
 linux-user/Makefile.objs   |   1 +
 linux-user/i386/Makefile.objs  |   5 +
 linux-user/i386/syscall_32.tbl | 442 +
 linux-user/i386/syscall_nr.h   | 387 -
 linux-user/i386/syscallhdr.sh  |  28 +++
 6 files changed, 478 insertions(+), 388 deletions(-)
 create mode 100644 linux-user/i386/Makefile.objs
 create mode 100644 linux-user/i386/syscall_32.tbl
 delete mode 100644 linux-user/i386/syscall_nr.h
 create mode 100644 linux-user/i386/syscallhdr.sh

diff --git a/configure b/configure
index 41a5513d23b5..c5d342356e8a 100755
--- a/configure
+++ b/configure
@@ -1858,7 +1858,7 @@ rm -f */config-devices.mak.d
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
 for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x sparc sparc64 \
-; do
+i386 ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7686,6 +7686,7 @@ case "$target_name" in
   i386)
 mttcg="yes"
gdb_xml_files="i386-32bit.xml"
+TARGET_SYSTBL_ABI=i386
   ;;
   x86_64)
 TARGET_BASE_ARCH=i386
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 36f20cad794c..720d9773b813 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -10,6 +10,7 @@ obj-$(TARGET_AARCH64) += arm/semihost.o
 obj-$(TARGET_ALPHA) += alpha/
 obj-$(TARGET_ARM) += arm/
 obj-$(TARGET_HPPA) += hppa/
+obj-$(TARGET_I386) += i386/
 obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
 obj-$(TARGET_PPC) += ppc/
diff --git a/linux-user/i386/Makefile.objs b/linux-user/i386/Makefile.objs
new file mode 100644
index 000000000000..c25cf17bfb64
--- /dev/null
+++ b/linux-user/i386/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/i386/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/i386/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/i386/syscall_32.tbl $(syshdr)
+	$(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/i386/syscall_32.tbl b/linux-user/i386/syscall_32.tbl
new file mode 100644
index 000000000000..a2728f45906e
--- /dev/null
+++ b/linux-user/i386/syscall_32.tbl
@@ -0,0 +1,442 @@
+#
+# 32-bit system call numbers and entry vectors
+#
+# The format is:
+# <number> <abi> <name> <entry point> <compat entry point>
+#
+# The __ia32_sys and __ia32_compat_sys stubs are created on-the-fly for
+# sys_*() system calls and compat_sys_*() compat system calls if
+# IA32_EMULATION is defined, and expect struct pt_regs *regs as their only
+# parameter.
+#
+# The abi is always "i386" for this file.
+#
+0	i386	restart_syscall	sys_restart_syscall	__ia32_sys_restart_syscall
+1	i386	exit	sys_exit	__ia32_sys_exit
+2	i386	fork	sys_fork	__ia32_sys_fork
+3	i386	read	sys_read	__ia32_sys_read
+4	i386	write	sys_write	__ia32_sys_write
+5	i386	open	sys_open	__ia32_compat_sys_open
+6	i386	close	sys_close	__ia32_sys_close
+7	i386	waitpid	sys_waitpid	__ia32_sys_waitpid
+8	i386	creat	sys_creat	__ia32_sys_creat
+9	i386	link	sys_link	__ia32_sys_link
+10	i386	unlink	sys_unlink	__ia32_sys_unlink
+11	i386	execve	sys_execve	__ia32_compat_sys_execve
+12	i386	chdir	sys_chdir	__ia32_sys_chdir
+13	i386	time	sys_time32	__ia32_sys_time32
+14	i386	mknod	sys_mknod	__ia32_sys_mknod
+15	i386	chmod	sys_chmod	__ia32_sys_chmod
+16	i386	lchown	sys_lchown16	__ia32_sys_lchown16
+17	i386	break
+18	i386	oldstat	sys_stat	__ia32_sys_stat
+19	i386	lseek	sys_lseek	__ia32_compat_sys_lseek
+

[PATCH v2 20/20] linux-user,mips: update syscall-args-o32.c.inc

2020-02-19 Thread Laurent Vivier
Add a script to update the file from the strace GitHub repository, and run it.

Signed-off-by: Laurent Vivier 
---
 linux-user/mips/syscall-args-o32.c.inc | 874 -
 scripts/update-mips-syscall-args.sh|  57 ++
 2 files changed, 493 insertions(+), 438 deletions(-)
 create mode 100755 scripts/update-mips-syscall-args.sh

diff --git a/linux-user/mips/syscall-args-o32.c.inc b/linux-user/mips/syscall-args-o32.c.inc
index f060b061441a..0ad35857b4e4 100644
--- a/linux-user/mips/syscall-args-o32.c.inc
+++ b/linux-user/mips/syscall-args-o32.c.inc
@@ -1,438 +1,436 @@
-#  define MIPS_SYS(name, args) args,
-MIPS_SYS(sys_syscall, 8)/* 4000 */
-MIPS_SYS(sys_exit   , 1)
-MIPS_SYS(sys_fork   , 0)
-MIPS_SYS(sys_read   , 3)
-MIPS_SYS(sys_write  , 3)
-MIPS_SYS(sys_open   , 3)/* 4005 */
-MIPS_SYS(sys_close  , 1)
-MIPS_SYS(sys_waitpid, 3)
-MIPS_SYS(sys_creat  , 2)
-MIPS_SYS(sys_link   , 2)
-MIPS_SYS(sys_unlink , 1)/* 4010 */
-MIPS_SYS(sys_execve , 0)
-MIPS_SYS(sys_chdir  , 1)
-MIPS_SYS(sys_time   , 1)
-MIPS_SYS(sys_mknod  , 3)
-MIPS_SYS(sys_chmod  , 2)/* 4015 */
-MIPS_SYS(sys_lchown , 3)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_stat */
-MIPS_SYS(sys_lseek  , 3)
-MIPS_SYS(sys_getpid , 0)/* 4020 */
-MIPS_SYS(sys_mount  , 5)
-MIPS_SYS(sys_umount , 1)
-MIPS_SYS(sys_setuid , 1)
-MIPS_SYS(sys_getuid , 0)
-MIPS_SYS(sys_stime  , 1)/* 4025 */
-MIPS_SYS(sys_ptrace , 4)
-MIPS_SYS(sys_alarm  , 1)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_fstat */
-MIPS_SYS(sys_pause  , 0)
-MIPS_SYS(sys_utime  , 2)/* 4030 */
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_access , 2)
-MIPS_SYS(sys_nice   , 1)
-MIPS_SYS(sys_ni_syscall , 0)/* 4035 */
-MIPS_SYS(sys_sync   , 0)
-MIPS_SYS(sys_kill   , 2)
-MIPS_SYS(sys_rename , 2)
-MIPS_SYS(sys_mkdir  , 2)
-MIPS_SYS(sys_rmdir  , 1)/* 4040 */
-MIPS_SYS(sys_dup, 1)
-MIPS_SYS(sys_pipe   , 0)
-MIPS_SYS(sys_times  , 1)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_brk, 1)/* 4045 */
-MIPS_SYS(sys_setgid , 1)
-MIPS_SYS(sys_getgid , 0)
-MIPS_SYS(sys_ni_syscall , 0)/* was signal(2) */
-MIPS_SYS(sys_geteuid, 0)
-MIPS_SYS(sys_getegid, 0)/* 4050 */
-MIPS_SYS(sys_acct   , 0)
-MIPS_SYS(sys_umount2, 2)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_ioctl  , 3)
-MIPS_SYS(sys_fcntl  , 3)/* 4055 */
-MIPS_SYS(sys_ni_syscall , 2)
-MIPS_SYS(sys_setpgid, 2)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_olduname   , 1)
-MIPS_SYS(sys_umask  , 1)/* 4060 */
-MIPS_SYS(sys_chroot , 1)
-MIPS_SYS(sys_ustat  , 2)
-MIPS_SYS(sys_dup2   , 2)
-MIPS_SYS(sys_getppid, 0)
-MIPS_SYS(sys_getpgrp, 0)/* 4065 */
-MIPS_SYS(sys_setsid , 0)
-MIPS_SYS(sys_sigaction  , 3)
-MIPS_SYS(sys_sgetmask   , 0)
-MIPS_SYS(sys_ssetmask   , 1)
-MIPS_SYS(sys_setreuid   , 2)/* 4070 */
-MIPS_SYS(sys_setregid   , 2)
-MIPS_SYS(sys_sigsuspend , 0)
-MIPS_SYS(sys_sigpending , 1)
-MIPS_SYS(sys_sethostname, 2)
-MIPS_SYS(sys_setrlimit  , 2)/* 4075 */
-MIPS_SYS(sys_getrlimit  , 2)
-MIPS_SYS(sys_getrusage  , 2)
-MIPS_SYS(sys_gettimeofday, 2)
-MIPS_SYS(sys_settimeofday, 2)
-MIPS_SYS(sys_getgroups  , 2)/* 4080 */
-MIPS_SYS(sys_setgroups  , 2)
-MIPS_SYS(sys_ni_syscall , 0)/* old_select */
-MIPS_SYS(sys_symlink, 2)
-MIPS_SYS(sys_ni_syscall , 0)/* was sys_lstat */
-MIPS_SYS(sys_readlink   , 3)/* 4085 */
-MIPS_SYS(sys_uselib , 1)
-MIPS_SYS(sys_swapon , 2)
-MIPS_SYS(sys_reboot , 3)
-MIPS_SYS(old_readdir, 3)
-MIPS_SYS(old_mmap   , 6)/* 4090 */
-MIPS_SYS(sys_munmap , 2)
-MIPS_SYS(sys_truncate   , 2)
-MIPS_SYS(sys_ftruncate  , 2)
-MIPS_SYS(sys_fchmod , 2)
-MIPS_SYS(sys_fchown , 3)/* 4095 */
-MIPS_SYS(sys_getpriority, 2)
-MIPS_SYS(sys_setpriority, 3)
-MIPS_SYS(sys_ni_syscall , 0)
-MIPS_SYS(sys_statfs , 2)
-MIPS_SYS(sys_fstatfs, 2)/* 4100 */
-MIPS_SYS(sys_ni_syscall , 0)/* was ioperm(2) */
-MIPS_SYS(sys_socketcall , 2)
-  

[PATCH v2 18/20] linux-user: update syscall.tbl from linux 0bf999f9c5e7

2020-02-19 Thread Laurent Vivier
Run scripts/update-syscalltbl.sh with linux commit 0bf999f9c5e7

Signed-off-by: Laurent Vivier 
---
 linux-user/arm/syscall.tbl| 2 ++
 linux-user/hppa/syscall.tbl   | 2 ++
 linux-user/i386/syscall_32.tbl| 2 ++
 linux-user/m68k/syscall.tbl   | 4 +++-
 linux-user/microblaze/syscall.tbl | 2 ++
 linux-user/mips/syscall_o32.tbl   | 2 ++
 linux-user/mips64/syscall_n32.tbl | 2 ++
 linux-user/mips64/syscall_n64.tbl | 2 ++
 linux-user/ppc/syscall.tbl| 2 ++
 linux-user/s390x/syscall.tbl  | 2 ++
 linux-user/sh4/syscall.tbl| 2 ++
 linux-user/sparc/syscall.tbl  | 2 ++
 linux-user/sparc64/syscall.tbl| 2 ++
 linux-user/x86_64/syscall_64.tbl  | 2 ++
 linux-user/xtensa/syscall.tbl | 2 ++
 15 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/linux-user/arm/syscall.tbl b/linux-user/arm/syscall.tbl
index 6da7dc4d79cc..4d1cf74a2caa 100644
--- a/linux-user/arm/syscall.tbl
+++ b/linux-user/arm/syscall.tbl
@@ -449,3 +449,5 @@
 433	common	fspick	sys_fspick
 434	common	pidfd_open	sys_pidfd_open
 435	common	clone3	sys_clone3
+437	common	openat2	sys_openat2
+438	common	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/hppa/syscall.tbl b/linux-user/hppa/syscall.tbl
index 285ff516150c..52a15f5cd130 100644
--- a/linux-user/hppa/syscall.tbl
+++ b/linux-user/hppa/syscall.tbl
@@ -433,3 +433,5 @@
 433	common	fspick	sys_fspick
 434	common	pidfd_open	sys_pidfd_open
 435	common	clone3	sys_clone3_wrapper
+437	common	openat2	sys_openat2
+438	common	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/i386/syscall_32.tbl b/linux-user/i386/syscall_32.tbl
index a2728f45906e..4fea592676c2 100644
--- a/linux-user/i386/syscall_32.tbl
+++ b/linux-user/i386/syscall_32.tbl
@@ -440,3 +440,5 @@
 433	i386	fspick	sys_fspick	__ia32_sys_fspick
 434	i386	pidfd_open	sys_pidfd_open	__ia32_sys_pidfd_open
 435	i386	clone3	sys_clone3	__ia32_sys_clone3
+437	i386	openat2	sys_openat2	__ia32_sys_openat2
+438	i386	pidfd_getfd	sys_pidfd_getfd	__ia32_sys_pidfd_getfd
diff --git a/linux-user/m68k/syscall.tbl b/linux-user/m68k/syscall.tbl
index a88a285a0e5f..f4f49fcb76d0 100644
--- a/linux-user/m68k/syscall.tbl
+++ b/linux-user/m68k/syscall.tbl
@@ -434,4 +434,6 @@
 432	common	fsmount	sys_fsmount
 433	common	fspick	sys_fspick
 434	common	pidfd_open	sys_pidfd_open
-# 435 reserved for clone3
+435	common	clone3	__sys_clone3
+437	common	openat2	sys_openat2
+438	common	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/microblaze/syscall.tbl b/linux-user/microblaze/syscall.tbl
index 09b0cd7dab0a..4c67b11f9c9e 100644
--- a/linux-user/microblaze/syscall.tbl
+++ b/linux-user/microblaze/syscall.tbl
@@ -441,3 +441,5 @@
 433	common	fspick	sys_fspick
 434	common	pidfd_open	sys_pidfd_open
 435	common	clone3	sys_clone3
+437	common	openat2	sys_openat2
+438	common	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/mips/syscall_o32.tbl b/linux-user/mips/syscall_o32.tbl
index 353539ea4140..ac586774c980 100644
--- a/linux-user/mips/syscall_o32.tbl
+++ b/linux-user/mips/syscall_o32.tbl
@@ -423,3 +423,5 @@
 433	o32	fspick	sys_fspick
 434	o32	pidfd_open	sys_pidfd_open
 435	o32	clone3	__sys_clone3
+437	o32	openat2	sys_openat2
+438	o32	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/mips64/syscall_n32.tbl b/linux-user/mips64/syscall_n32.tbl
index e7c5ab38e403..1f9e8ad636cc 100644
--- a/linux-user/mips64/syscall_n32.tbl
+++ b/linux-user/mips64/syscall_n32.tbl
@@ -374,3 +374,5 @@
 433	n32	fspick	sys_fspick
 434	n32	pidfd_open	sys_pidfd_open
 435	n32	clone3	__sys_clone3
+437	n32	openat2	sys_openat2
+438	n32	pidfd_getfd	sys_pidfd_getfd
diff --git a/linux-user/mips64/syscall_n64.tbl b/linux-user/mips64/syscall_n64.tbl
index 13cd66581f3b..c0b9d802dbf6 100644
--- a/linux-user/mips64/syscall_n64.tbl
+++ b/linux-user/mips64/syscall_n64.tbl
@@ -350,3 +350,5 @@
 433	n64	fspick	sys_fspick
 434	n64	pidfd_open	sys_pidfd_open
 435

[PATCH v2 17/20] linux-user, scripts: add a script to update syscall.tbl

2020-02-19 Thread Laurent Vivier
scripts/update-syscalltbl.sh has the list of syscall.tbl to update and
can copy them from the linux source directory

Signed-off-by: Laurent Vivier 
---
 MAINTAINERS  |  1 +
 scripts/update-syscalltbl.sh | 49 
 2 files changed, 50 insertions(+)
 create mode 100755 scripts/update-syscalltbl.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 1740a4fddc14..dac93f447544 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2422,6 +2422,7 @@ S: Maintained
 F: linux-user/
 F: default-configs/*-linux-user.mak
 F: scripts/qemu-binfmt-conf.sh
+F: scripts/update-syscalltbl.sh
 
 Tiny Code Generator (TCG)
 -------------------------
diff --git a/scripts/update-syscalltbl.sh b/scripts/update-syscalltbl.sh
new file mode 100755
index 000000000000..2d23e5680075
--- /dev/null
+++ b/scripts/update-syscalltbl.sh
@@ -0,0 +1,49 @@
+TBL_LIST="\
+arch/alpha/kernel/syscalls/syscall.tbl,linux-user/alpha/syscall.tbl \
+arch/arm/tools/syscall.tbl,linux-user/arm/syscall.tbl \
+arch/m68k/kernel/syscalls/syscall.tbl,linux-user/m68k/syscall.tbl \
+arch/microblaze/kernel/syscalls/syscall.tbl,linux-user/microblaze/syscall.tbl \
+arch/mips/kernel/syscalls/syscall_n32.tbl,linux-user/mips64/syscall_n32.tbl \
+arch/mips/kernel/syscalls/syscall_n64.tbl,linux-user/mips64/syscall_n64.tbl \
+arch/mips/kernel/syscalls/syscall_o32.tbl,linux-user/mips/syscall_o32.tbl \
+arch/parisc/kernel/syscalls/syscall.tbl,linux-user/hppa/syscall.tbl \
+arch/powerpc/kernel/syscalls/syscall.tbl,linux-user/ppc/syscall.tbl \
+arch/s390/kernel/syscalls/syscall.tbl,linux-user/s390x/syscall.tbl \
+arch/sh/kernel/syscalls/syscall.tbl,linux-user/sh4/syscall.tbl \
+arch/sparc/kernel/syscalls/syscall.tbl,linux-user/sparc64/syscall.tbl \
+arch/sparc/kernel/syscalls/syscall.tbl,linux-user/sparc/syscall.tbl \
+arch/x86/entry/syscalls/syscall_32.tbl,linux-user/i386/syscall_32.tbl \
+arch/x86/entry/syscalls/syscall_64.tbl,linux-user/x86_64/syscall_64.tbl \
+arch/xtensa/kernel/syscalls/syscall.tbl,linux-user/xtensa/syscall.tbl\
+"
+
+linux="$1"
+output="$2"
+
+if [ -z "$linux" ] || ! [ -d "$linux" ]; then
+cat << EOF
+usage: update-syscalltbl.sh LINUX_PATH [OUTPUT_PATH]
+
+LINUX_PATH  Linux kernel directory to obtain the syscall.tbl from
+OUTPUT_PATH output directory, usually the qemu source tree (default: $PWD)
+EOF
+exit 1
+fi
+
+if [ -z "$output" ]; then
+output="$PWD"
+fi
+
+for entry in $TBL_LIST; do
+OFS="$IFS"
+IFS=,
+set $entry
+src=$1
+dst=$2
+IFS="$OFS"
+if ! cp "$linux/$src" "$output/$dst" ; then
+echo "Cannot copy $linux/$src to $output/$dst" 1>&2
+exit 1
+fi
+done
+
-- 
2.24.1
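
The loop in scripts/update-syscalltbl.sh above splits each "source,destination" entry by temporarily setting IFS to a comma and calling set. The following is a minimal sketch of that idiom in isolation; the function name split_pair and the variable old_ifs are illustrative and not part of the patch. It uses "set --" so that an entry beginning with "-" would not be parsed as an option, which is a small assumed hardening compared with the plain "set $entry" in the patch.

```shell
#!/bin/sh
# Illustrative only: split one "source,destination" pair the way the loop
# in the patch does, by temporarily setting IFS to a comma.
# split_pair and old_ifs are hypothetical names, not taken from the patch.
split_pair() {
    entry="$1"
    old_ifs="$IFS"
    IFS=,
    set -- $entry        # unquoted on purpose so the shell splits on ","
    IFS="$old_ifs"       # restore the caller's field separator
    src="$1"
    dst="$2"
}

split_pair "arch/arm/tools/syscall.tbl,linux-user/arm/syscall.tbl"
echo "copy $src -> $dst"
```

Restoring IFS immediately after the split keeps the rest of the script, including the cp call, on the default word-splitting rules.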




[PATCH v2 16/20] linux-user, mips64: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall_n32.tbl, syscall_n64.tbl and syscallhdr.sh from
linux/arch/mips/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Move the offsets (6000 for n32 and 5000 for n64) from the file to
the Makefile.objs to be passed to syscallhdr.sh
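
To illustrate how an offset passed on the command line can be applied to table entries, here is a rough sketch. It is not the syscallhdr.sh added by this patch: the function name gen_syscall_nr, its two-argument interface and the exact "#define" layout are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper, not the actual syscallhdr.sh from this series.
# Reads "<number> <abi> <name> [entry point]" lines on stdin and prints
# one "#define TARGET_NR_<name> ..." line per entry matching the given ABI.
# The second argument is the offset (for example 6000 for n32); if it is
# empty, the bare syscall number is emitted.
gen_syscall_nr() {
    abi="$1"
    offset="$2"
    while read -r nr line_abi name _entry; do
        case "$nr" in
            ''|\#*) continue ;;   # skip blank lines and comments
        esac
        [ "$line_abi" = "$abi" ] || continue
        if [ -n "$offset" ]; then
            printf '#define TARGET_NR_%s (%s + %s)\n' "$name" "$offset" "$nr"
        else
            printf '#define TARGET_NR_%s %s\n' "$name" "$nr"
        fi
    done
}
```

For example, `printf '0 n32 read sys_read\n' | gen_syscall_nr n32 6000` prints `#define TARGET_NR_read (6000 + 0)`. Keeping the offset out of the table means the .tbl files can stay identical to the kernel's copies.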

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)
we don't need to split syscall_nr.h, as it is generated
according to TARGET_SYSTBL_ABI in TARGET_ABI_DIR,
and the values would be incorrect if derived from the file name.
We need to hardcode the ABI and the offset to generate
the correct content.

remove dependencies on syscall_nr.h in the source directory

 configure |   4 +-
 linux-user/Makefile.objs  |   1 +
 linux-user/mips64/Makefile.objs   |  12 +
 linux-user/mips64/syscall_n32.tbl | 376 
 linux-user/mips64/syscall_n64.tbl | 352 +++
 linux-user/mips64/syscall_nr.h| 725 --
 linux-user/mips64/syscallhdr.sh   |  33 ++
 7 files changed, 777 insertions(+), 726 deletions(-)
 create mode 100644 linux-user/mips64/Makefile.objs
 create mode 100644 linux-user/mips64/syscall_n32.tbl
 create mode 100644 linux-user/mips64/syscall_n64.tbl
 delete mode 100644 linux-user/mips64/syscall_nr.h
 create mode 100644 linux-user/mips64/syscallhdr.sh

diff --git a/configure b/configure
index 2da7504ddd51..3432a1117841 100755
--- a/configure
+++ b/configure
@@ -1858,7 +1858,7 @@ rm -f */config-devices.mak.d
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
 for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x sparc sparc64 \
-i386 x86_64 mips ; do
+i386 x86_64 mips mips64 ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7743,12 +7743,14 @@ case "$target_name" in
 TARGET_BASE_ARCH=mips
 echo "TARGET_ABI_MIPSN32=y" >> $config_target_mak
 echo "TARGET_ABI32=y" >> $config_target_mak
+TARGET_SYSTBL_ABI=n32
   ;;
   mips64|mips64el)
 mttcg="yes"
 TARGET_ARCH=mips64
 TARGET_BASE_ARCH=mips
 echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
+TARGET_SYSTBL_ABI=n64
   ;;
   moxie)
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 0a0715e9e192..1940910a7321 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -14,6 +14,7 @@ obj-$(TARGET_I386) += i386/
 obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
 obj-$(TARGET_MIPS) += mips/
+obj-$(TARGET_MIPS64) += mips64/
 obj-$(TARGET_PPC) += ppc/
 obj-$(TARGET_PPC64) += ppc/
 obj-$(TARGET_S390X) += s390x/
diff --git a/linux-user/mips64/Makefile.objs b/linux-user/mips64/Makefile.objs
new file mode 100644
index 000000000000..573448f9568a
--- /dev/null
+++ b/linux-user/mips64/Makefile.objs
@@ -0,0 +1,12 @@
+generated-files-y += linux-user/$(TARGET_ABI_DIR)/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)/syscallhdr.sh
+
+ifeq ($(TARGET_SYSTBL_ABI),n32)
+%/syscall_nr.h: $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)/syscall_n32.tbl $(syshdr)
+	$(call quiet-command, sh $(syshdr) $< $@ n32 "" 6000,"GEN","$@")
+endif
+ifeq ($(TARGET_SYSTBL_ABI),n64)
+%/syscall_nr.h: $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)/syscall_n64.tbl $(syshdr)
+	$(call quiet-command, sh $(syshdr) $< $@ n64 "" 5000,"GEN","$@")
+endif
diff --git a/linux-user/mips64/syscall_n32.tbl b/linux-user/mips64/syscall_n32.tbl
new file mode 100644
index 000000000000..e7c5ab38e403
--- /dev/null
+++ b/linux-user/mips64/syscall_n32.tbl
@@ -0,0 +1,376 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for mips
+#
+# The format is:
+# <number> <abi> <name> <entry point> <compat entry point>
+#
+# The <abi> is always "n32" for this file.
+#
+0	n32	read	sys_read
+1	n32	write	sys_write
+2	n32	open	sys_open
+3	n32	close	sys_close
+4	n32	stat	sys_newstat
+5	n32	fstat	sys_newfstat
+6	n32	lstat	sys_newlstat
+7	n32	poll	sys_poll
+8	n32	lseek	sys_lseek
+9	n32	mmap	sys_mips_mmap
+10	n32	mprotect	sys_mprotect
+11	n32	munmap	sys_munmap
+12	n32	brk	sys_brk
+13	n32	rt_sigaction	compat_sys_rt_sigaction
+14	n32	rt_sigprocmask	compat_sys_rt_sigprocmask
+15	n32	ioctl	compat_sys_ioctl
+16	n32	pread64	sys_pread64
+17	n32	pwrite64

[PATCH v2 15/20] linux-user, mips: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall_o32.tbl and syscallhdr.sh from
linux/arch/mips/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h
Move the offset (4000) from the file to the Makefile.objs to be passed
to syscallhdr.sh
Rename on the fly fadvise64 to fadvise64_64.
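
As a rough illustration of renaming an entry while the table is being processed, the sketch below rewrites only the name column. It is not the code from the syscallhdr.sh in this patch; the function name rename_fadvise and the sample input line in the usage note are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical filter, not the implementation used by this patch.
# Reads "<number> <abi> <name> [entry points]" lines on stdin and renames
# the syscall "fadvise64" to "fadvise64_64". The comparison is on the whole
# name column, so an entry already called fadvise64_64 is left untouched.
rename_fadvise() {
    while read -r nr abi name entry; do
        if [ "$name" = "fadvise64" ]; then
            name="fadvise64_64"
        fi
        # ${entry:+ $entry} avoids a trailing space when no entry point is given
        printf '%s %s %s%s\n' "$nr" "$abi" "$name" "${entry:+ $entry}"
    done
}
```

With a sample line such as `254 o32 fadvise64 sys_fadvise64_64` on stdin, the filter emits the same line with the third column changed to `fadvise64_64`. Matching the full column rather than a substring is what keeps other names containing "fadvise64" intact.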

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure   |   3 +-
 linux-user/Makefile.objs|   1 +
 linux-user/mips/Makefile.objs   |   5 +
 linux-user/mips/syscall_nr.h| 425 
 linux-user/mips/syscall_o32.tbl | 425 
 linux-user/mips/syscallhdr.sh   |  36 +++
 6 files changed, 469 insertions(+), 426 deletions(-)
 create mode 100644 linux-user/mips/Makefile.objs
 delete mode 100644 linux-user/mips/syscall_nr.h
 create mode 100644 linux-user/mips/syscall_o32.tbl
 create mode 100644 linux-user/mips/syscallhdr.sh

diff --git a/configure b/configure
index 38fe8c91eff8..2da7504ddd51 100755
--- a/configure
+++ b/configure
@@ -1858,7 +1858,7 @@ rm -f */config-devices.mak.d
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
 for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x sparc sparc64 \
-i386 x86_64 ; do
+i386 x86_64 mips ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7735,6 +7735,7 @@ case "$target_name" in
 mttcg="yes"
 TARGET_ARCH=mips
 echo "TARGET_ABI_MIPSO32=y" >> $config_target_mak
+TARGET_SYSTBL_ABI=o32
   ;;
   mipsn32|mipsn32el)
 mttcg="yes"
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 1791bc48cd17..0a0715e9e192 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -13,6 +13,7 @@ obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_I386) += i386/
 obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
+obj-$(TARGET_MIPS) += mips/
 obj-$(TARGET_PPC) += ppc/
 obj-$(TARGET_PPC64) += ppc/
 obj-$(TARGET_S390X) += s390x/
diff --git a/linux-user/mips/Makefile.objs b/linux-user/mips/Makefile.objs
new file mode 100644
index 000000000000..9be4de07d99a
--- /dev/null
+++ b/linux-user/mips/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/mips/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/mips/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/mips/syscall_o32.tbl $(syshdr)
+	$(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI) "" 4000,"GEN","$@")
diff --git a/linux-user/mips/syscall_nr.h b/linux-user/mips/syscall_nr.h
deleted file mode 100644
index 0be3af1c8455..000000000000
--- a/linux-user/mips/syscall_nr.h
+++ /dev/null
@@ -1,425 +0,0 @@
-/*
- * Linux o32 style syscalls are in the range from 4000 to 4999.
- */
-
-#ifndef LINUX_USER_MIPS_SYSCALL_NR_H
-#define LINUX_USER_MIPS_SYSCALL_NR_H
-
-#define TARGET_NR_Linux	4000
-#define TARGET_NR_syscall	(TARGET_NR_Linux +   0)
-#define TARGET_NR_exit	(TARGET_NR_Linux +   1)
-#define TARGET_NR_fork	(TARGET_NR_Linux +   2)
-#define TARGET_NR_read	(TARGET_NR_Linux +   3)
-#define TARGET_NR_write	(TARGET_NR_Linux +   4)
-#define TARGET_NR_open	(TARGET_NR_Linux +   5)
-#define TARGET_NR_close	(TARGET_NR_Linux +   6)
-#define TARGET_NR_waitpid	(TARGET_NR_Linux +   7)
-#define TARGET_NR_creat	(TARGET_NR_Linux +   8)
-#define TARGET_NR_link	(TARGET_NR_Linux +   9)
-#define TARGET_NR_unlink	(TARGET_NR_Linux +  10)
-#define TARGET_NR_execve	(TARGET_NR_Linux +  11)
-#define TARGET_NR_chdir	(TARGET_NR_Linux +  12)
-#define TARGET_NR_time	(TARGET_NR_Linux +  13)
-#define TARGET_NR_mknod	(TARGET_NR_Linux +  14)
-#define TARGET_NR_chmod	(TARGET_NR_Linux +  15)
-#define TARGET_NR_lchown	(TARGET_NR_Linux +  16)
-#define TARGET_NR_break	(TARGET_NR_Linux +  17)
-#define TARGET_NR_unused18	(TARGET_NR_Linux +  18)
-#define TARGET_NR_lseek	(TARGET_NR_Linux +  19)
-#define TARGET_NR_getpid	(TARGET_NR_Linux +  20)
-#define TARGET_NR_mount	(TARGET_NR_Linux +  21)
-#define TARGET_NR_umount	(TARGET_NR_Linux +  22)
-#define TARGET_NR_setuid	(TARGET_NR_Linux +  23)
-#define TARGET_NR_getuid	(TARGET_NR_Linux +  24)
-#define TARGET_NR_stime	(TARGET_NR_Linux +  25)
-#define TARGET_NR_ptrace	(TARGET_NR_Linux +  26)
-#define TARGET_NR_alarm	(TARGET_NR_Linux +  27)
-#define TARGET_NR_unuse

[PATCH v2 12/20] linux-user, sparc, sparc64: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/sparc/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure|   6 +-
 linux-user/Makefile.objs |   2 +
 linux-user/sparc/Makefile.objs   |   5 +
 linux-user/sparc/syscall.tbl | 483 +++
 linux-user/sparc/syscall_nr.h| 363 ---
 linux-user/sparc/syscallhdr.sh   |  32 ++
 linux-user/sparc64/Makefile.objs |   5 +
 linux-user/sparc64/syscall.tbl   | 483 +++
 linux-user/sparc64/syscall_nr.h  | 366 ---
 linux-user/sparc64/syscallhdr.sh |  32 ++
 10 files changed, 1047 insertions(+), 730 deletions(-)
 create mode 100644 linux-user/sparc/Makefile.objs
 create mode 100644 linux-user/sparc/syscall.tbl
 delete mode 100644 linux-user/sparc/syscall_nr.h
 create mode 100644 linux-user/sparc/syscallhdr.sh
 create mode 100644 linux-user/sparc64/Makefile.objs
 create mode 100644 linux-user/sparc64/syscall.tbl
 delete mode 100644 linux-user/sparc64/syscall_nr.h
 create mode 100644 linux-user/sparc64/syscallhdr.sh

diff --git a/configure b/configure
index a6a733e09e4d..41a5513d23b5 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,8 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x ; do
+for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x sparc sparc64 \
+; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7799,14 +7800,17 @@ case "$target_name" in
 bflt="yes"
   ;;
   sparc)
+TARGET_SYSTBL_ABI=common,32
   ;;
   sparc64)
 TARGET_BASE_ARCH=sparc
+TARGET_SYSTBL_ABI=common,64
   ;;
   sparc32plus)
 TARGET_ARCH=sparc64
 TARGET_BASE_ARCH=sparc
 TARGET_ABI_DIR=sparc
+TARGET_SYSTBL_ABI=common,32
 echo "TARGET_ABI32=y" >> $config_target_mak
   ;;
   s390x)
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index f4e666e74c91..36f20cad794c 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -16,4 +16,6 @@ obj-$(TARGET_PPC) += ppc/
 obj-$(TARGET_PPC64) += ppc/
 obj-$(TARGET_S390X) += s390x/
 obj-$(TARGET_SH4) += sh4/
+obj-$(TARGET_SPARC) += sparc/
+obj-$(TARGET_SPARC64) += $(TARGET_ABI_DIR)/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/sparc/Makefile.objs b/linux-user/sparc/Makefile.objs
new file mode 100644
index 000000000000..29d3f066cbab
--- /dev/null
+++ b/linux-user/sparc/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/sparc/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/sparc/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/sparc/syscall.tbl $(syshdr)
+	$(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/sparc/syscall.tbl b/linux-user/sparc/syscall.tbl
new file mode 100644
index 000000000000..8c8cc7537fb2
--- /dev/null
+++ b/linux-user/sparc/syscall.tbl
@@ -0,0 +1,483 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for sparc
+#
+# The format is:
+# <number> <abi> <name> <entry point> <compat entry point>
+#
+# The <abi> can be common, 64, or 32 for this file.
+#
+0	common	restart_syscall	sys_restart_syscall
+1	32	exit	sys_exit	sparc_exit
+1	64	exit	sparc_exit
+2	common	fork	sys_fork
+3	common	read	sys_read
+4	common	write	sys_write
+5	common	open	sys_open	compat_sys_open
+6	common	close	sys_close
+7	common	wait4	sys_wait4	compat_sys_wait4
+8	common	creat	sys_creat
+9	common	link	sys_link
+10	common	unlink	sys_unlink
+11	32	execv	sunos_execv
+11	64	execv	sys_nis_syscall
+12	common	chdir	sys_chdir
+13	32	chown	sys_chown16
+13	64	chown	sys_chown
+14	common	mknod	sys_mknod
+15	common	chmod	sys_chmod
+16	32	lchown	sys_lchown16
+16	64	lchown	sys_lchown
+17	common	brk	sys_brk
+18	common	perfctr	sys_nis_syscall
+19	common	lseek	sys_lseek	compat_sys_lseek
+20	common	getpid	sys_getpid
+21	common	capget

[PATCH v2 07/20] linux-user, microblaze: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from
linux/arch/microblaze/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure   |   3 +-
 linux-user/Makefile.objs|   1 +
 linux-user/microblaze/Makefile.objs |   5 +
 linux-user/microblaze/syscall.tbl   | 443 
 linux-user/microblaze/syscall_nr.h  | 442 ---
 linux-user/microblaze/syscallhdr.sh |  32 ++
 6 files changed, 483 insertions(+), 443 deletions(-)
 create mode 100644 linux-user/microblaze/Makefile.objs
 create mode 100644 linux-user/microblaze/syscall.tbl
 delete mode 100644 linux-user/microblaze/syscall_nr.h
 create mode 100644 linux-user/microblaze/syscallhdr.sh

diff --git a/configure b/configure
index 001534166271..4cc57aa62818 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa sh4 ; do
+for arch in alpha hppa m68k xtensa sh4 microblaze ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7723,6 +7723,7 @@ case "$target_name" in
   ;;
   microblaze|microblazeel)
 TARGET_ARCH=microblaze
+TARGET_SYSTBL_ABI=common
 bflt="yes"
 echo "TARGET_ABI32=y" >> $config_target_mak
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index d31f30d75851..5a26281e8867 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -12,5 +12,6 @@ obj-$(TARGET_AARCH64) += arm/semihost.o
 obj-$(TARGET_ALPHA) += alpha/
 obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_M68K) += m68k/
+obj-$(TARGET_MICROBLAZE) += microblaze/
 obj-$(TARGET_SH4) += sh4/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/microblaze/Makefile.objs b/linux-user/microblaze/Makefile.objs
new file mode 100644
index ..bb8b318dda7f
--- /dev/null
+++ b/linux-user/microblaze/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/microblaze/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/microblaze/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/microblaze/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/microblaze/syscall.tbl b/linux-user/microblaze/syscall.tbl
new file mode 100644
index ..09b0cd7dab0a
--- /dev/null
+++ b/linux-user/microblaze/syscall.tbl
@@ -0,0 +1,443 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for microblaze
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+0  common  restart_syscall sys_restart_syscall
+1  common  exit            sys_exit
+2  common  fork            sys_fork
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open
+6  common  close           sys_close
+7  common  waitpid         sys_waitpid
+8  common  creat           sys_creat
+9  common  link            sys_link
+10 common  unlink          sys_unlink
+11 common  execve          sys_execve
+12 common  chdir           sys_chdir
+13 common  time            sys_time32
+14 common  mknod           sys_mknod
+15 common  chmod           sys_chmod
+16 common  lchown          sys_lchown
+17 common  break           sys_ni_syscall
+18 common  oldstat         sys_ni_syscall
+19 common  lseek           sys_lseek
+20 common  getpid          sys_getpid
+21 common  mount           sys_mount
+22 common  umount          sys_oldumount
+23 common  setuid          sys_setuid
+24 common  getuid          sys_getuid
+25 common  stime           sys_stime32
+26 common  ptrace          sys_ptrace
+27 common  alarm           sys_alarm
+28 common  oldfstat        sys_ni_syscall
+29 common  pause           sys_pause
+30 common  utime           sys_utime32
+31 common  stty            sys_ni_syscall
+32 common  gtty            sys_ni_syscall
+33 common  access          sys_access
+34 com

[PATCH v2 09/20] linux-user, ppc: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/powerpc/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h
and to skip entries whose entry point is sys_ni_syscall.
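
As an illustration of the sys_ni_syscall handling described above, here is a
minimal sketch. It is not the syscallhdr.sh from this patch: the function name
gen_implemented_nr is hypothetical, ABI filtering is left out for brevity, and
the demo rows are modelled on the microblaze table quoted elsewhere in this
thread.

```shell
# Hypothetical sketch, not the script from the patch.
# Reads syscall.tbl rows (number, abi, name, entry point, optional compat
# entry point) on stdin and prints "#define TARGET_NR_<name> <number>" for
# every row whose entry point is not sys_ni_syscall.
gen_implemented_nr() {
    while read -r nr abi name entry compat; do
        # Skip blank lines and comment lines such as "# 31 was stty".
        case "$nr" in
            ''|'#'*) continue ;;
        esac
        # Unimplemented syscalls get no define at all.
        if [ "$entry" = "sys_ni_syscall" ]; then
            continue
        fi
        printf '#define TARGET_NR_%s %s\n' "$name" "$nr"
    done
}

# Demo: only lseek and getpid produce a define.
gen_implemented_nr <<'EOF'
17 common break   sys_ni_syscall
18 common oldstat sys_ni_syscall
19 common lseek   sys_lseek
20 common getpid  sys_getpid
EOF
```

Dropping the define, rather than emitting it, means code guarded by
"#ifdef TARGET_NR_<name>" is simply not compiled for such syscalls.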

Fix ppc/signal.c to define do_sigreturn() for TARGET_ABI32.

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)
we don't need to split syscall_nr.h, as it is generated
into TARGET_ABI_DIR according to TARGET_SYSTBL_ABI,
and the values were incorrect with respect to the file name
(the generated syscall32_nr.h and syscall64_nr.h were
identical within a TARGET_ABI_DIR)

remove dependencies on syscall_nr.h in the source directory

 configure|   6 +-
 linux-user/Makefile.objs |   2 +
 linux-user/ppc/Makefile.objs |   6 +
 linux-user/ppc/signal.c  |   2 +-
 linux-user/ppc/syscall.tbl   | 519 +++
 linux-user/ppc/syscall_nr.h  | 402 ---
 linux-user/ppc/syscallhdr.sh |  34 +++
 7 files changed, 567 insertions(+), 404 deletions(-)
 create mode 100644 linux-user/ppc/Makefile.objs
 create mode 100644 linux-user/ppc/syscall.tbl
 delete mode 100644 linux-user/ppc/syscall_nr.h
 create mode 100644 linux-user/ppc/syscallhdr.sh

diff --git a/configure b/configure
index 7c9ee47c04e9..b5abea89d300 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa sh4 microblaze arm ; do
+for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7756,10 +7756,12 @@ case "$target_name" in
   ;;
   ppc)
 gdb_xml_files="power-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
+TARGET_SYSTBL_ABI=common,nospu,32
   ;;
   ppc64)
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
+TARGET_SYSTBL_ABI=common,nospu,64
 mttcg=yes
 gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
@@ -7767,6 +7769,7 @@ case "$target_name" in
 TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
+TARGET_SYSTBL_ABI=common,nospu,64
 mttcg=yes
 gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
@@ -7774,6 +,7 @@ case "$target_name" in
 TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
+TARGET_SYSTBL_ABI=common,nospu,32
 echo "TARGET_ABI32=y" >> $config_target_mak
 gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index bc12e38291bc..8b00dad687b2 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -12,5 +12,7 @@ obj-$(TARGET_ARM) += arm/
 obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
+obj-$(TARGET_PPC) += ppc/
+obj-$(TARGET_PPC64) += ppc/
 obj-$(TARGET_SH4) += sh4/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/ppc/Makefile.objs b/linux-user/ppc/Makefile.objs
new file mode 100644
index ..be92e67eb160
--- /dev/null
+++ b/linux-user/ppc/Makefile.objs
@@ -0,0 +1,6 @@
+generated-files-y += linux-user/$(TARGET_ABI_DIR)/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)/syscallhdr.sh
+
+%/syscall_nr.h: $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
index 5b82af6cb623..0c4e7ba54caf 100644
--- a/linux-user/ppc/signal.c
+++ b/linux-user/ppc/signal.c
@@ -588,7 +588,7 @@ sigsegv:
 
 }
 
-#if !defined(TARGET_PPC64)
+#if !defined(TARGET_PPC64) || defined(TARGET_ABI32)
 long do_sigreturn(CPUPPCState *env)
 {
 struct target_sigcontext *sc = NULL;
diff --git a/linux-user/ppc/syscall.tbl b/linux-user/ppc/syscall.tbl
new file mode 100644
index ..43f736ed47f2
--- /dev/null
+++ b/linux-user/ppc/syscall.tbl
@@ -0,0 +1,519 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for powerpc
+#
+# The format is:
+# <number> <abi> <name> <entry point> <compat entry point>
+#
+# The <abi> can be common, spu, nospu, 64, or 32 for this file.
+#
+0  nospu   restart_syscall sys_restart_syscall
+1  nospu   exit            sys_exit
+2  nospu   fork            ppc_fork
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open            compat_sys_open
+6  common  close           sys_close
+7  common  waitpid  

[PATCH v2 06/20] linux-user, sh4: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/sh/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure|   3 +-
 linux-user/Makefile.objs |   1 +
 linux-user/sh4/Makefile.objs |   5 +
 linux-user/sh4/syscall.tbl   | 440 ++
 linux-user/sh4/syscall_nr.h  | 441 ---
 linux-user/sh4/syscallhdr.sh |  32 +++
 6 files changed, 480 insertions(+), 442 deletions(-)
 create mode 100644 linux-user/sh4/Makefile.objs
 create mode 100644 linux-user/sh4/syscall.tbl
 delete mode 100644 linux-user/sh4/syscall_nr.h
 create mode 100644 linux-user/sh4/syscallhdr.sh

diff --git a/configure b/configure
index deb112b06f36..001534166271 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa ; do
+for arch in alpha hppa m68k xtensa sh4 ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7789,6 +7789,7 @@ case "$target_name" in
   ;;
   sh4|sh4eb)
 TARGET_ARCH=sh4
+TARGET_SYSTBL_ABI=common
 bflt="yes"
   ;;
   sparc)
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 13b821baf752..d31f30d75851 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -12,4 +12,5 @@ obj-$(TARGET_AARCH64) += arm/semihost.o
 obj-$(TARGET_ALPHA) += alpha/
 obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_M68K) += m68k/
+obj-$(TARGET_SH4) += sh4/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/sh4/Makefile.objs b/linux-user/sh4/Makefile.objs
new file mode 100644
index ..83fc939570d5
--- /dev/null
+++ b/linux-user/sh4/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/sh4/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/sh4/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/sh4/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/sh4/syscall.tbl b/linux-user/sh4/syscall.tbl
new file mode 100644
index ..b5ed26c4c005
--- /dev/null
+++ b/linux-user/sh4/syscall.tbl
@@ -0,0 +1,440 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for sh
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+0  common  restart_syscall sys_restart_syscall
+1  common  exit            sys_exit
+2  common  fork            sys_fork
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open
+6  common  close           sys_close
+7  common  waitpid         sys_waitpid
+8  common  creat           sys_creat
+9  common  link            sys_link
+10 common  unlink          sys_unlink
+11 common  execve          sys_execve
+12 common  chdir           sys_chdir
+13 common  time            sys_time32
+14 common  mknod           sys_mknod
+15 common  chmod           sys_chmod
+16 common  lchown          sys_lchown16
+# 17 was break
+18 common  oldstat         sys_stat
+19 common  lseek           sys_lseek
+20 common  getpid          sys_getpid
+21 common  mount           sys_mount
+22 common  umount          sys_oldumount
+23 common  setuid          sys_setuid16
+24 common  getuid          sys_getuid16
+25 common  stime           sys_stime32
+26 common  ptrace          sys_ptrace
+27 common  alarm           sys_alarm
+28 common  oldfstat        sys_fstat
+29 common  pause           sys_pause
+30 common  utime           sys_utime32
+# 31 was stty
+# 32 was gtty
+33 common  access          sys_access
+34 common  nice            sys_nice
+# 35 was ftime
+36 common  sync            sys_sync
+37 common  kill            sys_kill
+38 common  rename          sys_rename
+39 common  mkdir           sys_mkdir
+40 common  rmdir           sys_rmdir
+41 common  dup sy

[PATCH v2 02/20] linux-user, alpha: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/alpha/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure  |   3 +-
 linux-user/Makefile.objs   |   2 +
 linux-user/alpha/Makefile.objs |   5 +
 linux-user/alpha/syscall.tbl   | 479 
 linux-user/alpha/syscall_nr.h  | 492 -
 linux-user/alpha/syscallhdr.sh |  32 +++
 6 files changed, 520 insertions(+), 493 deletions(-)
 create mode 100644 linux-user/alpha/Makefile.objs
 create mode 100644 linux-user/alpha/syscall.tbl
 delete mode 100644 linux-user/alpha/syscall_nr.h
 create mode 100644 linux-user/alpha/syscallhdr.sh

diff --git a/configure b/configure
index 795adf41195f..91a0b667a581 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in ; do
+for arch in alpha ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7693,6 +7693,7 @@ case "$target_name" in
   ;;
   alpha)
 mttcg="yes"
+TARGET_SYSTBL_ABI=common
   ;;
   arm|armeb)
 TARGET_ARCH=arm
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index d2f33beb5e52..a1afb4d21f9f 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -8,3 +8,5 @@ obj-$(TARGET_I386) += vm86.o
 obj-$(TARGET_ARM) += arm/nwfpe/
 obj-$(TARGET_ARM) += arm/semihost.o
 obj-$(TARGET_AARCH64) += arm/semihost.o
+
+obj-$(TARGET_ALPHA) += alpha/
diff --git a/linux-user/alpha/Makefile.objs b/linux-user/alpha/Makefile.objs
new file mode 100644
index ..d6397a70abb2
--- /dev/null
+++ b/linux-user/alpha/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/alpha/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/alpha/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/alpha/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/alpha/syscall.tbl b/linux-user/alpha/syscall.tbl
new file mode 100644
index ..36d42da7466a
--- /dev/null
+++ b/linux-user/alpha/syscall.tbl
@@ -0,0 +1,479 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for alpha
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+0  common  osf_syscall      alpha_syscall_zero
+1  common  exit             sys_exit
+2  common  fork             alpha_fork
+3  common  read             sys_read
+4  common  write            sys_write
+5  common  osf_old_open     sys_ni_syscall
+6  common  close            sys_close
+7  common  osf_wait4        sys_osf_wait4
+8  common  osf_old_creat    sys_ni_syscall
+9  common  link             sys_link
+10 common  unlink           sys_unlink
+11 common  osf_execve       sys_ni_syscall
+12 common  chdir            sys_chdir
+13 common  fchdir           sys_fchdir
+14 common  mknod            sys_mknod
+15 common  chmod            sys_chmod
+16 common  chown            sys_chown
+17 common  brk              sys_osf_brk
+18 common  osf_getfsstat    sys_ni_syscall
+19 common  lseek            sys_lseek
+20 common  getxpid          sys_getxpid
+21 common  osf_mount        sys_osf_mount
+22 common  umount2          sys_umount
+23 common  setuid           sys_setuid
+24 common  getxuid          sys_getxuid
+25 common  exec_with_loader sys_ni_syscall
+26 common  ptrace           sys_ptrace
+27 common  osf_nrecvmsg     sys_ni_syscall
+28 common  osf_nsendmsg     sys_ni_syscall
+29 common  osf_nrecvfrom    sys_ni_syscall
+30 common  osf_naccept      sys_ni_syscall
+31 common  osf_ngetpeername sys_ni_syscall
+32 common  osf_ngetsockname sys_ni_syscall
+33 common  access           sys_access
+34 common  osf_chflags      sys_ni_syscall
+35 common  osf_fchflags     sys_ni_syscall
+36 common  sync             sys_sync
+37 common  kill             sys_kill
+38 comm

[PATCH v2 04/20] linux-user, m68k: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/m68k/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure |   3 +-
 linux-user/Makefile.objs  |   1 +
 linux-user/m68k/Makefile.objs |   5 +
 linux-user/m68k/syscall.tbl   | 437 ++
 linux-user/m68k/syscall_nr.h  | 434 -
 linux-user/m68k/syscallhdr.sh |  32 +++
 6 files changed, 477 insertions(+), 435 deletions(-)
 create mode 100644 linux-user/m68k/Makefile.objs
 create mode 100644 linux-user/m68k/syscall.tbl
 delete mode 100644 linux-user/m68k/syscall_nr.h
 create mode 100644 linux-user/m68k/syscallhdr.sh

diff --git a/configure b/configure
index 24338dfa1bcf..7f1af37f552a 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa ; do
+for arch in alpha hppa m68k ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7719,6 +7719,7 @@ case "$target_name" in
   m68k)
 bflt="yes"
 gdb_xml_files="cf-core.xml cf-fp.xml m68k-fp.xml"
+TARGET_SYSTBL_ABI=common
   ;;
   microblaze|microblazeel)
 TARGET_ARCH=microblaze
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 9f8e001241d5..ac74b23683cf 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -11,3 +11,4 @@ obj-$(TARGET_AARCH64) += arm/semihost.o
 
 obj-$(TARGET_ALPHA) += alpha/
 obj-$(TARGET_HPPA) += hppa/
+obj-$(TARGET_M68K) += m68k/
diff --git a/linux-user/m68k/Makefile.objs b/linux-user/m68k/Makefile.objs
new file mode 100644
index ..961bd05c237f
--- /dev/null
+++ b/linux-user/m68k/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/m68k/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/m68k/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/m68k/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/m68k/syscall.tbl b/linux-user/m68k/syscall.tbl
new file mode 100644
index ..a88a285a0e5f
--- /dev/null
+++ b/linux-user/m68k/syscall.tbl
@@ -0,0 +1,437 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for m68k
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+0  common  restart_syscall sys_restart_syscall
+1  common  exit            sys_exit
+2  common  fork            __sys_fork
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open
+6  common  close           sys_close
+7  common  waitpid         sys_waitpid
+8  common  creat           sys_creat
+9  common  link            sys_link
+10 common  unlink          sys_unlink
+11 common  execve          sys_execve
+12 common  chdir           sys_chdir
+13 common  time            sys_time32
+14 common  mknod           sys_mknod
+15 common  chmod           sys_chmod
+16 common  chown           sys_chown16
+# 17 was break
+18 common  oldstat         sys_stat
+19 common  lseek           sys_lseek
+20 common  getpid          sys_getpid
+21 common  mount           sys_mount
+22 common  umount          sys_oldumount
+23 common  setuid          sys_setuid16
+24 common  getuid          sys_getuid16
+25 common  stime           sys_stime32
+26 common  ptrace          sys_ptrace
+27 common  alarm           sys_alarm
+28 common  oldfstat        sys_fstat
+29 common  pause           sys_pause
+30 common  utime           sys_utime32
+# 31 was stty
+# 32 was gtty
+33 common  access          sys_access
+34 common  nice            sys_nice
+# 35 was ftime
+36 common  sync            sys_sync
+37 common  kill            sys_kill
+38 common  rename          sys_rename
+39 common  mkdir           sys_mkdir
+40 common  rmdir           sys_rmdir
+41 common  dup 

[PATCH v2 00/20] linux-user: generate syscall_nr.sh

2020-02-19 Thread Laurent Vivier
This series copies the files syscall.tbl from linux v5.5 and generates
the file syscall_nr.h from them.
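
For illustration, the generation step can be sketched as follows. This is a
minimal sketch and not the syscallhdr.sh used in the series: the function name
gen_syscall_nr is hypothetical, and the demo rows are modelled on the hppa
table quoted elsewhere in this thread. It keeps the rows whose ABI column is
in a comma-separated list (the role TARGET_SYSTBL_ABI plays in configure) and
prints one TARGET_NR_ define per remaining row.

```shell
# Hypothetical sketch, not the script from the series.
# $1 is a comma-separated ABI list such as "common,64"; syscall.tbl rows
# (number, abi, name, entry point, optional compat entry point) come on stdin.
gen_syscall_nr() {
    # Turn "common,64" into the extended regex "(common|64)".
    abis=$(printf '(%s)' "$1" | tr ',' '|')
    # Comment lines and rows for other ABIs do not match and are dropped.
    grep -E "^[0-9]+[[:space:]]+${abis}[[:space:]]" | sort -n |
    while read -r nr abi name entry compat; do
        printf '#define TARGET_NR_%s %s\n' "$name" "$nr"
    done
}

# Demo: "time" has one row per ABI; only the 64 row is selected here.
gen_syscall_nr common,64 <<'EOF'
# system call numbers and entry vectors (demo subset)
12 common chdir sys_chdir
13 32     time  sys_time32
13 64     time  sys_time
EOF
```

With the ABI list common,64 the demo prints defines for chdir (12) and
time (13); passing common,32 instead would pick the other time row and
produce the same two defines.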

This is done for all the QEMU targets that have a syscall.tbl
in the linux source tree: mips, mips64, i386, x86_64, sparc, s390x,
ppc, arm, microblaze, sh4, xtensa, m68k, hppa and alpha.

tilegx and cris are deprecated in linux (tilegx has no maintainer in QEMU)

aarch64, nios2, openrisc and riscv have no syscall.tbl in linux.

There seems to be a bug in QEMU that forces us to manually disable arch_prctl
for the i386 target: do_arch_prctl() is only defined with TARGET_ABI32, but
TARGET_ABI32 is never defined with TARGET_I386 (nor TARGET_X86_64).

I have also removed all syscalls in s390x/syscall_nr.h defined for
!defined(TARGET_S390X).

I have added a script to copy all these files from linux, and the end of
the series updates them to their latest version as of today.

The last two patches handle the special case of mips O32, which needs
to know the number of arguments. These are taken from the strace sources.

v2:
fix a typo (double comma) in $(call quiet-command)
add a script to remove dependencies to syscall_nr.h in source directory

ppc, mips64:

we don't need to split syscall_nr.h, as it is generated
into TARGET_ABI_DIR according to TARGET_SYSTBL_ABI,
and the generated values were incorrect with respect to the file name.

arm:

manage TARGET_NR_arm_sync_file_range

Once the syscall_nr.h files are built in the build directory, the following
script compares them with the original ones. It must be run from the source
directory, and its first argument is the path to the build directory:

cat > check_syscall_nr.sh < /tmp/old
else
git show $REFERENCE:$syscall_nr | \
sed 's/[[:blank:]]\/\*[^*]*\*\///' | \
sed "s/TARGET_NR_Linux/$offset/" > /tmp/old
fi
diff -wu --color=always /tmp/old \
$BUILD/$target/$syscall_nr | less -R
}

for arch in $ARCHS ; do
syscall_nr_diff $arch $arch-linux-user
done

syscall_nr_diff ppc    ppc64-linux-user

syscall_nr_diff mips   mips-linux-user    4000
syscall_nr_diff mips64 mips64-linux-user  5000
syscall_nr_diff mips64 mipsn32-linux-user 6000
EOF

Laurent Vivier (20):
  linux-user: introduce parameters to generate syscall_nr.h
  linux-user,alpha: add syscall table generation support
  linux-user,hppa: add syscall table generation support
  linux-user,m68k: add syscall table generation support
  linux-user,xtensa: add syscall table generation support
  linux-user,sh4: add syscall table generation support
  linux-user,microblaze: add syscall table generation support
  linux-user,arm: add syscall table generation support
  linux-user,ppc: add syscall table generation support
  linux-user,s390x: remove syscall definitions for !TARGET_S390X
  linux-user,s390x: add syscall table generation support
  linux-user,sparc,sparc64: add syscall table generation support
  linux-user,i386: add syscall table generation support
  linux-user,x86_64: add syscall table generation support
  linux-user,mips: add syscall table generation support
  linux-user,mips64: add syscall table generation support
  linux-user,scripts: add a script to update syscall.tbl
  linux-user: update syscall.tbl from linux 0bf999f9c5e7
  linux-user,mips: move content of mips_syscall_args
  linux-user,mips: update syscall-args-o32.c.inc

 MAINTAINERS|   1 +
 Makefile.target|   3 +-
 configure  |  35 ++
 linux-user/Makefile.objs   |  19 +-
 linux-user/alpha/Makefile.objs |   5 +
 linux-user/alpha/syscall.tbl   | 479 
 linux-user/alpha/syscall_nr.h  | 492 -
 linux-user/alpha/syscallhdr.sh |  32 ++
 linux-user/arm/Makefile.objs   |   8 +
 linux-user/arm/syscall.tbl | 453 +++
 linux-user/arm/syscall_nr.h| 447 ---
 linux-user/arm/syscallhdr.sh   |  31 ++
 linux-user/hppa/Makefile.objs  |   5 +
 linux-user/hppa/syscall.tbl| 437 +++
 linux-user/hppa/syscall_nr.h   | 358 
 linux-user/hppa/syscallhdr.sh  |  32 ++
 linux-user/i386/Makefile.objs  |   5 +
 linux-user/i386/syscall_32.tbl | 444 +++
 linux-user/i386/syscall_nr.h   | 387 -
 linux-user/i386/syscallhdr.sh  |  28 +
 linux-user/m68k/Makefile.objs  |   5 +
 linux-user/m68k/syscall.tbl| 439 +++
 linux-user/m68k/syscall_nr.h   | 434 ---
 linux-user/m68k/syscallhdr.sh  |  32 ++
 linux-user/microblaze/Makefile.objs|   5 +
 linux-user/microblaze/syscall.tbl  | 445 +++
 linux-user/microblaze/syscall_nr.h | 442 ---
 linux-user/microblaze/syscallhdr.sh|  32 ++
 linux-user/mips/Makefile.objs  |   5 +
 linux-user/mips/cpu_lo

[PATCH v2 08/20] linux-user, arm: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/arm/tools/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Update syscall.c to handle TARGET_NR_arm_sync_file_range, as it has
replaced TARGET_NR_sync_file_range2

Move existing stuff from linux-user/Makefile.objs to
linux-user/arm/Makefile.objs

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)
manage TARGET_NR_arm_sync_file_range

remove dependencies on syscall_nr.h in the source directory

 configure|   3 +-
 linux-user/Makefile.objs |   3 +-
 linux-user/arm/Makefile.objs |   8 +
 linux-user/arm/syscall.tbl   | 451 +++
 linux-user/arm/syscall_nr.h  | 447 --
 linux-user/arm/syscallhdr.sh |  31 +++
 linux-user/syscall.c |   6 +
 7 files changed, 499 insertions(+), 450 deletions(-)
 create mode 100644 linux-user/arm/Makefile.objs
 create mode 100644 linux-user/arm/syscall.tbl
 delete mode 100644 linux-user/arm/syscall_nr.h
 create mode 100644 linux-user/arm/syscallhdr.sh

diff --git a/configure b/configure
index 4cc57aa62818..7c9ee47c04e9 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa sh4 microblaze ; do
+for arch in alpha hppa m68k xtensa sh4 microblaze arm ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7697,6 +7697,7 @@ case "$target_name" in
   ;;
   arm|armeb)
 TARGET_ARCH=arm
+TARGET_SYSTBL_ABI=common,oabi
 bflt="yes"
 mttcg="yes"
 gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 5a26281e8867..bc12e38291bc 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -5,11 +5,10 @@ obj-y = main.o syscall.o strace.o mmap.o signal.o \
 
 obj-$(TARGET_HAS_BFLT) += flatload.o
 obj-$(TARGET_I386) += vm86.o
-obj-$(TARGET_ARM) += arm/nwfpe/
-obj-$(TARGET_ARM) += arm/semihost.o
 obj-$(TARGET_AARCH64) += arm/semihost.o
 
 obj-$(TARGET_ALPHA) += alpha/
+obj-$(TARGET_ARM) += arm/
 obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
diff --git a/linux-user/arm/Makefile.objs b/linux-user/arm/Makefile.objs
new file mode 100644
index ..c7eb94dcba8e
--- /dev/null
+++ b/linux-user/arm/Makefile.objs
@@ -0,0 +1,8 @@
+obj-$(TARGET_ARM) += nwfpe/
+obj-$(TARGET_ARM) += semihost.o
+
+generated-files-y += linux-user/arm/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/arm/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/arm/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/arm/syscall.tbl b/linux-user/arm/syscall.tbl
new file mode 100644
index ..6da7dc4d79cc
--- /dev/null
+++ b/linux-user/arm/syscall.tbl
@@ -0,0 +1,451 @@
+#
+# Linux system call numbers and entry vectors
+#
+# The format is:
+# <num>  <abi>  <name>  [<entry point>  [<oabi compat entry point>]]
+#
+# Where abi is:
+#  common - for system calls shared between oabi and eabi (may have compat)
+#  oabi   - for oabi-only system calls (may have compat)
+#  eabi   - for eabi-only system calls
+#
+# For each syscall number, "common" is mutually exclusive with oabi and eabi
+#
+0  common  restart_syscall sys_restart_syscall
+1  common  exit            sys_exit
+2  common  fork            sys_fork
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open
+6  common  close           sys_close
+# 7 was sys_waitpid
+8  common  creat           sys_creat
+9  common  link            sys_link
+10 common  unlink          sys_unlink
+11 common  execve          sys_execve
+12 common  chdir           sys_chdir
+13 oabi    time            sys_time32
+14 common  mknod           sys_mknod
+15 common  chmod           sys_chmod
+16 common  lchown          sys_lchown16
+# 17 was sys_break
+# 18 was sys_stat
+19 common  lseek           sys_lseek
+20 common  getpid          sys_getpid
+21 common  mount           sys_mount
+22 oabi    umount          sys_oldumount
+23 common  setuid          sys_setuid16
+24 common  getuid          sys_getuid16
+25 oabi    stime           sys_stime32
+26 common  ptrace          sys_ptrace
+27 oabi    alarm           sys_alarm
+# 28 was sys_fstat
+29 common  pause           sys_pause
+30 oabi  

[PATCH v2 03/20] linux-user, hppa: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/parisc/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure |   3 +-
 linux-user/Makefile.objs  |   1 +
 linux-user/hppa/Makefile.objs |   5 +
 linux-user/hppa/syscall.tbl   | 435 ++
 linux-user/hppa/syscall_nr.h  | 358 
 linux-user/hppa/syscallhdr.sh |  32 +++
 6 files changed, 475 insertions(+), 359 deletions(-)
 create mode 100644 linux-user/hppa/Makefile.objs
 create mode 100644 linux-user/hppa/syscall.tbl
 delete mode 100644 linux-user/hppa/syscall_nr.h
 create mode 100644 linux-user/hppa/syscallhdr.sh

diff --git a/configure b/configure
index 91a0b667a581..24338dfa1bcf 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha ; do
+for arch in alpha hppa ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7712,6 +7712,7 @@ case "$target_name" in
   ;;
   hppa)
 mttcg="yes"
+TARGET_SYSTBL_ABI=common,64
   ;;
   lm32)
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index a1afb4d21f9f..9f8e001241d5 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -10,3 +10,4 @@ obj-$(TARGET_ARM) += arm/semihost.o
 obj-$(TARGET_AARCH64) += arm/semihost.o
 
 obj-$(TARGET_ALPHA) += alpha/
+obj-$(TARGET_HPPA) += hppa/
diff --git a/linux-user/hppa/Makefile.objs b/linux-user/hppa/Makefile.objs
new file mode 100644
index ..f8368be6f314
--- /dev/null
+++ b/linux-user/hppa/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/hppa/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/hppa/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/hppa/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/hppa/syscall.tbl b/linux-user/hppa/syscall.tbl
new file mode 100644
index ..285ff516150c
--- /dev/null
+++ b/linux-user/hppa/syscall.tbl
@@ -0,0 +1,435 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for parisc
+#
+# The format is:
+# <number> <abi> <name> <entry point> <compat entry point>
+#
+# The <abi> can be common, 64, or 32 for this file.
+#
+0  common  restart_syscall sys_restart_syscall
+1  common  exit            sys_exit
+2  common  fork            sys_fork_wrapper
+3  common  read            sys_read
+4  common  write           sys_write
+5  common  open            sys_open            compat_sys_open
+6  common  close           sys_close
+7  common  waitpid         sys_waitpid
+8  common  creat           sys_creat
+9  common  link            sys_link
+10 common  unlink          sys_unlink
+11 common  execve          sys_execve          compat_sys_execve
+12 common  chdir           sys_chdir
+13 32      time            sys_time32
+13 64      time            sys_time
+14 common  mknod           sys_mknod
+15 common  chmod           sys_chmod
+16 common  lchown          sys_lchown
+17 common  socket          sys_socket
+18 common  stat            sys_newstat         compat_sys_newstat
+19 common  lseek           sys_lseek           compat_sys_lseek
+20 common  getpid          sys_getpid
+21 common  mount           sys_mount           compat_sys_mount
+22 common  bind            sys_bind
+23 common  setuid          sys_setuid
+24 common  getuid          sys_getuid
+25 32      stime           sys_stime32
+25 64      stime           sys_stime
+26 common  ptrace          sys_ptrace          compat_sys_ptrace
+27 common  alarm           sys_alarm
+28 common  fstat           sys_newfstat        compat_sys_newfstat
+29 common  pause           sys_pause
+30 32      utime           sys_utime32
+30 64      utime           sys_utime
+31 common  connect         sys_connect
+32 common  listen          sys_listen
+33 common  access          sys_access
+34 common  nice            sys_nice
+35 common  accept          sys_accept
+36 common  sync            sys_sync
+37 common  killsys

[PATCH v2 11/20] linux-user, s390x: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl from linux/arch/s390/kernel/syscalls v5.5
Copy syscallhdr.sh from m68k.

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure  |   3 +-
 linux-user/Makefile.objs   |   1 +
 linux-user/s390x/Makefile.objs |   5 +
 linux-user/s390x/syscall.tbl   | 440 +
 linux-user/s390x/syscall_nr.h  | 331 -
 linux-user/s390x/syscallhdr.sh |  32 +++
 6 files changed, 480 insertions(+), 332 deletions(-)
 create mode 100644 linux-user/s390x/Makefile.objs
 create mode 100644 linux-user/s390x/syscall.tbl
 delete mode 100644 linux-user/s390x/syscall_nr.h
 create mode 100755 linux-user/s390x/syscallhdr.sh

diff --git a/configure b/configure
index b5abea89d300..a6a733e09e4d 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc ; do
+for arch in alpha hppa m68k xtensa sh4 microblaze arm ppc s390x ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7810,6 +7810,7 @@ case "$target_name" in
 echo "TARGET_ABI32=y" >> $config_target_mak
   ;;
   s390x)
+TARGET_SYSTBL_ABI=common,64
 mttcg=yes
 gdb_xml_files="s390x-core64.xml s390-acr.xml s390-fpr.xml s390-vx.xml s390-cr.xml s390-virt.xml s390-gs.xml"
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 8b00dad687b2..f4e666e74c91 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -14,5 +14,6 @@ obj-$(TARGET_M68K) += m68k/
 obj-$(TARGET_MICROBLAZE) += microblaze/
 obj-$(TARGET_PPC) += ppc/
 obj-$(TARGET_PPC64) += ppc/
+obj-$(TARGET_S390X) += s390x/
 obj-$(TARGET_SH4) += sh4/
 obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/s390x/Makefile.objs b/linux-user/s390x/Makefile.objs
new file mode 100644
index ..f30f1625ccff
--- /dev/null
+++ b/linux-user/s390x/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/s390x/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/s390x/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/s390x/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/s390x/syscall.tbl b/linux-user/s390x/syscall.tbl
new file mode 100644
index ..3054e9c035a3
--- /dev/null
+++ b/linux-user/s390x/syscall.tbl
@@ -0,0 +1,440 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# System call table for s390
+#
+# Format:
+#
+# <nr> <abi> <syscall> <entry-64bit> <entry-32bit>
+#
+# where <abi> can be common, 64, or 32
+
+1   common  exit             sys_exit             sys_exit
+2   common  fork             sys_fork             sys_fork
+3   common  read             sys_read             compat_sys_s390_read
+4   common  write            sys_write            compat_sys_s390_write
+5   common  open             sys_open             compat_sys_open
+6   common  close            sys_close            sys_close
+7   common  restart_syscall  sys_restart_syscall  sys_restart_syscall
+8   common  creat            sys_creat            sys_creat
+9   common  link             sys_link             sys_link
+10  common  unlink           sys_unlink           sys_unlink
+11  common  execve           sys_execve           compat_sys_execve
+12  common  chdir            sys_chdir            sys_chdir
+13  32      time             -                    sys_time32
+14  common  mknod            sys_mknod            sys_mknod
+15  common  chmod            sys_chmod            sys_chmod
+16  32      lchown           -                    sys_lchown16
+19  common  lseek            sys_lseek            compat_sys_lseek
+20  common  getpid           sys_getpid           sys_getpid
+21  common  mount            sys_mount            compat_sys_mount
+22  common  umount           sys_oldumount        sys_oldumount
+23  32      setuid           -                    sys_setuid16
+24  32      getuid           -                    sys_getuid16
+25  32      stime            -                    sys_stime32
+26  common  ptrace           sys_ptrace           compat_sys_ptrace
+27
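
For readers following the series: the table above mixes `common`, `64` and `32` rows, and the configure hunk sets TARGET_SYSTBL_ABI=common,64 for s390x. A minimal sketch of how such an ABI list could select rows and turn them into defines is shown below. It is not the syscallhdr.sh from this patch; the function name `gen_syscall_nr` and the sample rows are made up for illustration, and the `TARGET_NR_` prefix is an assumption about the generated header.

```shell
#!/bin/sh
# Hypothetical helper, NOT the syscallhdr.sh shipped in this series.
# Reads a syscall.tbl on stdin and prints one #define per row whose
# ABI column is in the comma-separated list passed as $1.
gen_syscall_nr() {
    # "common,64" becomes the ERE alternation "common|64"
    abis=$(printf '%s' "$1" | tr ',' '|')
    # Keep only rows that start with a number followed by a wanted ABI;
    # comment lines and rows for other ABIs are dropped here.
    grep -E "^[0-9]+[[:space:]]+(${abis})[[:space:]]" | sort -n |
    while read -r nr abi name entry compat; do
        # TARGET_NR_ prefix is assumed for the demonstration
        echo "#define TARGET_NR_${name} ${nr}"
    done
}

# Sample rows in the table layout: number, ABI, name, entry point(s)
gen_syscall_nr common,64 <<'EOF'
# comment line
12 common chdir sys_chdir sys_chdir
13 32 time - sys_time32
14 common mknod sys_mknod sys_mknod
EOF
```

With the ABI list `common,64` the `32`-only `time` row is skipped, so only `chdir` and `mknod` produce a define.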

[PATCH v2 05/20] linux-user, xtensa: add syscall table generation support

2020-02-19 Thread Laurent Vivier
Copy syscall.tbl and syscallhdr.sh from linux/arch/xtensa/kernel/syscalls v5.5
Update syscallhdr.sh to generate QEMU syscall_nr.h

Signed-off-by: Laurent Vivier 
---

Notes:
v2: fix a typo (double comma) in $(call quiet-command)

remove dependencies on syscall_nr.h in the source directory

 configure   |   3 +-
 linux-user/Makefile.objs|   1 +
 linux-user/xtensa/Makefile.objs |   5 +
 linux-user/xtensa/syscall.tbl   | 408 +++
 linux-user/xtensa/syscall_nr.h  | 469 
 linux-user/xtensa/syscallhdr.sh |  32 +++
 6 files changed, 448 insertions(+), 470 deletions(-)
 create mode 100644 linux-user/xtensa/Makefile.objs
 create mode 100644 linux-user/xtensa/syscall.tbl
 delete mode 100644 linux-user/xtensa/syscall_nr.h
 create mode 100644 linux-user/xtensa/syscallhdr.sh

diff --git a/configure b/configure
index 7f1af37f552a..deb112b06f36 100755
--- a/configure
+++ b/configure
@@ -1857,7 +1857,7 @@ rm -f */config-devices.mak.d
 
 # Remove syscall_nr.h to be sure they will be regenerated in the build
 # directory, not in the source directory
-for arch in alpha hppa m68k ; do
+for arch in alpha hppa m68k xtensa ; do
 # remove the file if it has been generated in the source directory
 rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
 # remove the dependency files
@@ -7814,6 +7814,7 @@ case "$target_name" in
   ;;
   xtensa|xtensaeb)
 TARGET_ARCH=xtensa
+TARGET_SYSTBL_ABI=common
 bflt="yes"
 mttcg="yes"
   ;;
diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index ac74b23683cf..13b821baf752 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -12,3 +12,4 @@ obj-$(TARGET_AARCH64) += arm/semihost.o
 obj-$(TARGET_ALPHA) += alpha/
 obj-$(TARGET_HPPA) += hppa/
 obj-$(TARGET_M68K) += m68k/
+obj-$(TARGET_XTENSA) += xtensa/
diff --git a/linux-user/xtensa/Makefile.objs b/linux-user/xtensa/Makefile.objs
new file mode 100644
index ..d4be1b745544
--- /dev/null
+++ b/linux-user/xtensa/Makefile.objs
@@ -0,0 +1,5 @@
+generated-files-y += linux-user/xtensa/syscall_nr.h
+
+syshdr := $(SRC_PATH)/linux-user/xtensa/syscallhdr.sh
+%/syscall_nr.h: $(SRC_PATH)/linux-user/xtensa/syscall.tbl $(syshdr)
+   $(call quiet-command, sh $(syshdr) $< $@ $(TARGET_SYSTBL_ABI),"GEN","$@")
diff --git a/linux-user/xtensa/syscall.tbl b/linux-user/xtensa/syscall.tbl
new file mode 100644
index ..25f4de729a6d
--- /dev/null
+++ b/linux-user/xtensa/syscall.tbl
@@ -0,0 +1,408 @@
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# system call numbers and entry vectors for xtensa
+#
+# The format is:
+# <number> <abi> <name> <entry point>
+#
+# The <abi> is always "common" for this file
+#
+0   common  spill         sys_ni_syscall
+1   common  xtensa        sys_ni_syscall
+2   common  available4    sys_ni_syscall
+3   common  available5    sys_ni_syscall
+4   common  available6    sys_ni_syscall
+5   common  available7    sys_ni_syscall
+6   common  available8    sys_ni_syscall
+7   common  available9    sys_ni_syscall
+# File Operations
+8   common  open          sys_open
+9   common  close         sys_close
+10  common  dup           sys_dup
+11  common  dup2          sys_dup2
+12  common  read          sys_read
+13  common  write         sys_write
+14  common  select        sys_select
+15  common  lseek         sys_lseek
+16  common  poll          sys_poll
+17  common  _llseek       sys_llseek
+18  common  epoll_wait    sys_epoll_wait
+19  common  epoll_ctl     sys_epoll_ctl
+20  common  epoll_create  sys_epoll_create
+21  common  creat         sys_creat
+22  common  truncate      sys_truncate
+23  common  ftruncate     sys_ftruncate
+24  common  readv         sys_readv
+25  common  writev        sys_writev
+26  common  fsync         sys_fsync
+27  common  fdatasync     sys_fdatasync
+28  common  truncate64    sys_truncate64
+29  common  ftruncate64   sys_ftruncate64
+30  common  pread64       sys_pread64
+31  common  pwrite64      sys_pwrite64
+32  common  link          sys_link
+33  common  rename        sys_rename
+34  common  symlink       sys_symlink
+35  common  readlink      sys_readlink
+36  common  mknod         sys_mknod
+3

[PATCH v2 01/20] linux-user: introduce parameters to generate syscall_nr.h

2020-02-19 Thread Laurent Vivier
This will be used when syscall.tbl is imported from the kernel.

Add a script to remove all the dependencies on syscall_nr.h
that point to the source directory rather than the build directory.
The list of architectures will be updated as the generated files are added.

Signed-off-by: Laurent Vivier 
---

Notes:
v2: add script to remove dependencies to syscall_nr.h in source directory

 Makefile.target |  3 ++-
 configure   | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/Makefile.target b/Makefile.target
index 6e61f607b14a..9babf2643e0b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -128,7 +128,8 @@ ifdef CONFIG_LINUX_USER
 
 QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
  -I$(SRC_PATH)/linux-user/host/$(ARCH) \
- -I$(SRC_PATH)/linux-user
+ -I$(SRC_PATH)/linux-user \
+ -Ilinux-user/$(TARGET_ABI_DIR)
 
 obj-y += linux-user/
 obj-y += gdbstub.o thunk.o
diff --git a/configure b/configure
index 6f5d85094965..795adf41195f 100755
--- a/configure
+++ b/configure
@@ -1855,6 +1855,17 @@ fi
 # Remove old dependency files to make sure that they get properly regenerated
 rm -f */config-devices.mak.d
 
+# Remove syscall_nr.h to be sure they will be regenerated in the build
+# directory, not in the source directory
+for arch in ; do
+# remove the file if it has been generated in the source directory
+rm -f "${source_path}/linux-user/${arch}/syscall_nr.h"
+# remove the dependency files
+find . -name "*.d" \
+   -exec grep -q "${source_path}/linux-user/${arch}/syscall_nr.h" {} \; \
+   -exec rm {} \;
+done
+
 if test -z "$python"
 then
 error_exit "Python not found. Use --python=/path/to/python"
@@ -7829,6 +7840,9 @@ echo "TARGET_ABI_DIR=$TARGET_ABI_DIR" >> $config_target_mak
 if [ "$HOST_VARIANT_DIR" != "" ]; then
 echo "HOST_VARIANT_DIR=$HOST_VARIANT_DIR" >> $config_target_mak
 fi
+if [ "$TARGET_SYSTBL_ABI" != "" ]; then
+echo "TARGET_SYSTBL_ABI=$TARGET_SYSTBL_ABI" >> $config_target_mak
+fi
 
 if supported_xen_target $target; then
 echo "CONFIG_XEN=y" >> $config_target_mak
-- 
2.24.1
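
The cleanup loop added to configure chains two `-exec` actions in `find`: the `rm` only runs for files where the preceding `grep -q` found a reference to the header in the source tree. A small sketch of that pattern against a throwaway directory follows; every path and file name in it (`demo`, `stale.d`, `fresh.d`) is invented for the demonstration and nothing here touches a real QEMU tree.

```shell
#!/bin/sh
# Sketch of the stale-dependency cleanup, run in a temporary directory.
# All paths and file names are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT          # remove the sandbox when the shell exits
src="$work/src"
build="$work/build"
mkdir -p "$src/linux-user/demo" "$build"

# One dependency file points at the header in the source tree, one does not.
echo "syscall.o: $src/linux-user/demo/syscall_nr.h" > "$build/stale.d"
echo "main.o: main.c" > "$build/fresh.d"

# Same shape as the configure hunk: grep -q acts as a filter, and the
# second -exec is only evaluated for files where grep succeeded.
find "$build" -name "*.d" \
    -exec grep -q "$src/linux-user/demo/syscall_nr.h" {} \; \
    -exec rm {} \;

ls "$build"
```

After the `find`, only `fresh.d` is left in the build directory, so the next build recomputes dependencies for the object that referenced the source-tree header.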



