from:"Rafael David Tinoco"

Re: [PATCH] configure: actually disable 'git_update' mode with --disable-git-update

2020-10-02 Thread Rafael David Tinoco





 Assuming you're just using git for conveniently applying local
 downstream patches, you don't need the git repo to exist once
 getting to the build stage. IOW just delete the .git dir after
 applying patches before running a build.


...then what do you do if the build fails and you want to
edit/update the patch before retrying? "Blow away your .git
tree every time you build and reconstitute it somehow later"
doesn't seem like a very friendly thing to require...


+1. This option is disconnected from the reality of sustaining engineering,
IMHO: tons of interactive rebases, adding and dropping patches,
re-orderings - so that previously existing patches allow the new ones (or
even existing ones) to become clean cherry-picks - in between patch sets
being worked on, bisections before continuing all this, etc.

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-31 Thread Rafael David Tinoco

I just pushed/uploaded a SRU for bionic from:

https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/qemu/+git/qemu/+merge/387269

Waiting for SRU on it.


** Changed in: qemu (Ubuntu Bionic)
 Assignee: Rafael David Tinoco (rafaeldtinoco) => (unassigned)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Fix Released
Status in kunpeng920 ubuntu-20.04 series:
  Fix Released
Status in kunpeng920 upstream-kernel series:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Eoan:
  Fix Released
Status in qemu source package in Focal:
  Fix Released

Bug description:
  
  SRU TEAM REVIEWER: This has already been SRUed for Focal, Eoan and Bionic.
  Unfortunately the Bionic SRU did not work and we had to revert the change.
  Since then we had another update and now I'm retrying the SRU.

  After discussing extensively with @paelzer (and with @dannf as a reviewer),
  Christian and I agreed that we should scope this SRU to Aarch64 only,
  AND I was much, much more conservative regarding what is being
  changed in the QEMU AIO code.

  The new code has been tested against the initial Test Case and against
  the new one for the Bionic regression. More information (about tests and
  discussion) can be found in the MR at
  ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic-refix

  BIONIC REGRESSION BUG:

  https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1885419

  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock, causing either
  QEMU or one of its tools to hang indefinitely.

  [Test Case]

  INITIAL

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely in approximately 30% of the runs on Aarch64.
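
  A minimal sketch of how the hang rate quoted above could be measured:
  run the same command repeatedly under a timeout and count the runs that
  never finish. The helper name count_hangs, the run count and the timeout
  are illustrative assumptions, not part of the original test case.

```python
import subprocess


def count_hangs(cmd, runs=10, timeout_s=60.0):
    """Run `cmd` `runs` times and return how many runs exceeded `timeout_s`.

    A run that exceeds the timeout is treated as a hang; subprocess.run()
    kills the child process before raising TimeoutExpired.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
                check=False,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs
```

  For example, count_hangs(["qemu-img", "convert", "-f", "qcow2", "-O",
  "qcow2", "./disk01.qcow2", "./output.qcow2"], runs=20, timeout_s=600.0)
  would report how many of 20 conversions timed out. The timeout has to be
  comfortably longer than a healthy conversion of the image in use.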

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS tasks,
  the QEMU AIO code is responsible for scheduling QEMU coroutines or
  event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were amd64 and aarch64, based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed to
  make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.
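
  The performance check suggested above could be sketched as follows:
  time several runs of the same conversion with the old and the new
  package and compare the summaries. The function names and the choice of
  median/min/max are assumptions made for illustration only.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Return the wall-clock duration in seconds of each of `runs` runs."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failed conversion raise instead of being timed.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to median, minimum and maximum."""
    return {
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
    }
```

  Running time_command() on the same cloud image before and after the
  update, on a high-core system, and comparing the two summarize() results
  gives a rough indication of whether the fix slows conversions down.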

  BIONIC REGRESSED ISSUE

  https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1885419

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-31 Thread Rafael David Tinoco

** Description changed:

+ 
+ SRU TEAM REVIEWER: This has already been SRUed for Focal, Eoan and Bionic.
+ Unfortunately the Bionic SRU did not work and we had to revert the change.
+ Since then we had another update and now I'm retrying the SRU.
+ 
+ After discussing extensively with @paelzer (and with @dannf as a reviewer),
+ Christian and I agreed that we should scope this SRU to Aarch64 only, AND
+ I was much, much more conservative regarding what is being changed
+ in the QEMU AIO code.
+ 
+ The new code has been tested against the initial Test Case and against
+ the new one for the Bionic regression. More information (about tests and
+ discussion) can be found in the MR at
+ ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic-refix
+ 
+ BIONIC REGRESSION BUG:
+ 
+ https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1885419
+ 
  [Impact]
  
  * QEMU locking primitives might face a race condition in QEMU Async I/O
  bottom halves scheduling. This leads to a deadlock, causing either QEMU
  or one of its tools to hang indefinitely.
  
  [Test Case]
+ 
+ INITIAL
  
  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Hangs indefinitely in approximately 30% of the runs on Aarch64.
  
  [Regression Potential]
  
  * This is a change to a core part of QEMU: the AIO scheduling. It works
  like a "kernel" scheduler: whereas the kernel schedules OS tasks, the QEMU
  AIO code is responsible for scheduling QEMU coroutines or event listener
  callbacks.
  
  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the issue.
  Tested platforms were amd64 and aarch64, based on his commit log.
  
  * Christian suggests that this fix stay a little longer in -proposed to
  make sure it won't cause any regressions.
  
  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.
+ 
+ BIONIC REGRESSED ISSUE
+ 
+ https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1885419
  
  [Other Info]
  
   * Original Description below:
  
  Command:
  
  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Hangs indefinitely approximately 30% of the runs.
  
  
  
  Workaround:
  
  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Run "qemu-img convert" with "a single coroutine" to avoid this issue.
  
  
  
  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...
  
  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()
  
  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start
  
  
  
  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2
  
  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]
  
  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]
  
  
  """
  
  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).
  
  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).
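
  The chain described above (a waiter parked in futex() with nobody left
  to wake it) belongs to the general class of missed wake-ups. The Python
  sketch below only illustrates that class with threading.Condition. It is
  not QEMU code, the report does not say which exact interleaving loses
  the wake-up, and all names are hypothetical.

```python
import threading


def wait_after_lost_notify(timeout_s=0.2):
    """Wake-up is sent before the waiter sleeps and is recorded nowhere."""
    cond = threading.Condition()
    with cond:
        cond.notify()  # nobody is waiting yet, so this wake-up is lost
    with cond:
        # Without the timeout this wait would block forever.
        return cond.wait(timeout=timeout_s)


def wait_with_recorded_state(timeout_s=0.2):
    """Same ordering, but the event is also recorded in shared state."""
    cond = threading.Condition()
    state = {"ready": False}
    with cond:
        state["ready"] = True  # record the event while holding the lock
        cond.notify()
    with cond:
        # wait_for() checks the predicate first, so it returns immediately.
        return cond.wait_for(lambda: state["ready"], timeout=timeout_s)
```

  The first function times out (returns False) because the notification
  came before the wait; the second returns True because the waiter checks
  recorded state rather than relying on the notification alone.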
  
  
  
  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:
  
  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2
  
  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.
  
  Once hung, attaching gdb gives the following backtrace:
  
  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at

[Bug 1886811] Re: systemd complains Failed to enqueue loopback interface start request: Operation not supported

2020-07-31 Thread Rafael David Tinoco

qemu (1:5.0-5ubuntu3) groovy; urgency=medium

has the merge with this fix:

- linux-user-add-netlink-RTM_SETLINK-command.patch (Closes: #964289)


** Changed in: qemu (Ubuntu)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1886811

Title:
  systemd complains Failed to enqueue loopback interface start request:
  Operation not supported

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu package in Debian:
  Fix Released

Bug description:
  This symptom seems similar to
  https://bugs.launchpad.net/qemu/+bug/1823790

  Host Linux: Debian 11 Bullseye (testing) on x86-64 architecture
  qemu version: latest git, commit hash
  eb2c66b10efd2b914b56b20ae90655914310c925,
  compiled with "./configure --static --disable-system"

  Downstream bug report at
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964289
  Bug report (closed) to systemd:
  https://github.com/systemd/systemd/issues/16359

  systemd in armhf and armel (both little endian 32-bit) containers fails to
  start with
  Failed to enqueue loopback interface start request: Operation not supported

  How to reproduce on Debian (and probably Ubuntu):
  mmdebstrap --components="main contrib non-free" --architectures=armhf --variant=important bullseye /var/lib/machines/armhf-bullseye
  systemd-nspawn -D /var/lib/machines/armhf-bullseye -b

  When "armhf" architecture is replaced with "mips" (32-bit big endian) or
  "ppc64" (64-bit big endian), the container starts up fine.

  The same symptom is also observed with "powerpc" (32-bit big endian)
  architecture.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1886811/+subscriptions

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-21 Thread Rafael David Tinoco

Status of older attempts to solve issues of the same nature:



Older (2018) merge request from @raharper:

https://github.com/koverstreet/bcache-tools/pull/1

addressing the fact that kernel uevents would not always emit 
CACHED_UUID parameters, causing udev to delete (whenever that happens) 
the /dev/bcache/{by-uuid,by-label} symlinks.

This last MR pointed to previous related bugs:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890446
https://bugs.launchpad.net/curtin/+bug/1728742

And to an upstream kernel patch:

https://lore.kernel.org/patchwork/patch/921298/

for

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729145

that wasn't accepted upstream.

Even though it was not accepted upstream, the SRU was attempted:

LP: #1729145

https://lists.ubuntu.com/archives/kernel-team/2017-December/088680.html
https://lists.ubuntu.com/archives/kernel-team/2017-December/088679.html

Both were NACKED.

Attempted again:

https://lists.ubuntu.com/archives/kernel-team/2017-December/088682.html
https://lists.ubuntu.com/archives/kernel-team/2017-December/088683.html

NACKED again.

And a v2 was sent:

https://lists.ubuntu.com/archives/kernel-team/2017-December/088751.html
https://lists.ubuntu.com/archives/kernel-team/2017-December/088750.html
https://lists.ubuntu.com/archives/kernel-team/2017-December/088749.html

and acked in January 2018 by Colin:

https://lists.ubuntu.com/archives/kernel-team/2018-January/089492.html

but not upstreamed.

BIONIC contains the fix:

commit ed9333e1b583
Author: Ryan Harper 
Date:   Mon Dec 11 12:12:01 2017

UBUNTU: SAUCE: (no-up) bcache: decouple emitting a cached_dev CHANGE uevent

BugLink: http://bugs.launchpad.net/bugs/1729145

- decouple emitting a cached_dev CHANGE uevent which includes dev.uuid
  and dev.label from bch_cached_dev_run() which only happens when a
  bcacheX device is bound to the actual backing block device (bcache0 -> vdb)

- update bch_cached_dev_run() to invoke bch_cached_dev_emit_change() as
  needed; no functional code path changes here

- Modify register_bcache to detect a re-registering of a bcache
  cached_dev, and in that case call bcache_cached_dev_emit_change() to

Signed-off-by: Ryan Harper 
Signed-off-by: Joseph Salisbury 
Acked-by: Colin Ian King 
Acked-by: Stefan Bader 
Signed-off-by: Khalid Elmously 
[ saf: fix incorrect indentation ]
Signed-off-by: Seth Forshee 

FOCAL contains the fix:

commit 67553dcd7905
Author: Ryan Harper 
Date:   Mon Dec 11 12:12:01 2017

UBUNTU: SAUCE: (no-up) bcache: decouple emitting a cached_dev CHANGE
uevent

GROOVY contains the fix:

commit 67553dcd7905
Author: Ryan Harper 
Date:   Mon Dec 11 12:12:01 2017

UBUNTU: SAUCE: (no-up) bcache: decouple emitting a cached_dev CHANGE
uevent



So neither the kernel patch nor the bcache-tools patch by @raharper 
(the bcache-export-cached one) was accepted upstream.



New Upstream summary from @raharper:

https://github.com/systemd/systemd/pull/16317#issuecomment-655647313

in the upstream merge request made by @rbalint.


** Bug watch added: Debian Bug tracker #890446
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890446


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-21 Thread Rafael David Tinoco

I've hidden last post as it was posted in the wrong bug.


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-20 Thread Rafael David Tinoco

Thanks @dannf! I spoke to Christian, and he and I agreed to confine this
change to ARM builds only (as an SRU for Bionic). Preparing it...


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-12 Thread Rafael David Tinoco

Work being done for the Bionic SRU:

BUG: https://bugs.launchpad.net/qemu/+bug/1805256
(fix for the bionic regression demonstrated at LP: #1885419)
PPA: https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1805256-bionic
MERGE: https://tinyurl.com/y8sucs6x

The merge proposal is currently under review, testing and discussion.


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-07-12 Thread Rafael David Tinoco

Started working on this again...

** Changed in: qemu (Ubuntu Bionic)
   Status: Triaged => In Progress

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Fix Released
Status in kunpeng920 ubuntu-20.04 series:
  Fix Released
Status in kunpeng920 upstream-kernel series:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Eoan:
  Fix Released
Status in qemu source package in Focal:
  Fix Released

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom-half scheduling. This leads to a deadlock that makes either
  QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS
  tasks, the QEMU AIO code is responsible for scheduling QEMU
  coroutines and event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch, and it solves the
  issue. According to his commit log, the tested platforms were amd64
  and aarch64.

  * Christian suggests that this fix stay a little longer in -proposed
  to make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.
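
  The hang rate and the conversion time mentioned above could be measured
  with a small harness like the one below. This is only a sketch: the
  function name measure_runs, the run count, and the timeout are
  assumptions chosen for illustration, and a run that exceeds the timeout
  is simply counted as a hang.

```python
import subprocess
import time


def measure_runs(cmd, runs=10, timeout_s=300.0):
    """Run cmd repeatedly.

    Returns (hangs, durations): the number of runs that exceeded
    timeout_s, and the wall-clock seconds of each run that completed.
    """
    hangs = 0
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        try:
            # subprocess.run() kills the child when the timeout expires.
            subprocess.run(
                cmd,
                check=True,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            hangs += 1  # assumed to be a hang, not just a slow run
            continue
        durations.append(time.monotonic() - start)
    return hangs, durations
```

  For the test case it could be called as measure_runs(["qemu-img",
  "convert", "-f", "qcow2", "-O", "qcow2", "./disk01.qcow2",
  "./output.qcow2"]). Comparing the durations before and after the fix
  gives a rough view of any performance regression.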

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so there is no task
  left to call qemu_futex_wake() to unblock thread #2 (in futex()), which
  would unblock thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes at
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0,

[Bug 1878134] Re: Assertion failures in ati_reg_read_offs/ati_reg_write_offs

2020-05-14 Thread Rafael David Tinoco

Hello Alexander,

I believe your fuzz test result was meant for the upstream project, so I
moved it.

o/

** Also affects: qemu
   Importance: Undecided
   Status: New

** No longer affects: qemu-kvm (Ubuntu)

https://bugs.launchpad.net/bugs/1878134

Title:
  Assertion failures in ati_reg_read_offs/ati_reg_write_offs

Status in QEMU:
  New

Bug description:
  Hello,
  While fuzzing, I found inputs that trigger assertion failures in
  ati_reg_read_offs/ati_reg_write_offs

  uint32_t extract32(uint32_t, int, int): Assertion `start >= 0 &&
  length > 0 && length <= 32 - start' failed

  #3  0x76866092 in __GI___assert_fail (assertion=0x56e760c0 "start >= 0 && length > 0 && length <= 32 - start", file=0x56e76120 "/home/alxndr/Development/qemu/include/qemu/bitops.h", line=0x12c, function=0x56e76180 <__PRETTY_FUNCTION__.extract32> "uint32_t extract32(uint32_t, int, int)") at assert.c:101
  #4  0x5653d8a7 in ati_mm_read (opaque=, addr=0x1a, size=) at /home/alxndr/Development/qemu/include/qemu/log-for-trace.h:29
  #5  0x5653c825 in ati_mm_read (opaque=, addr=0x4, size=) at /home/alxndr/Development/qemu/hw/display/ati.c:289
  #6  0x5601446e in memory_region_read_accessor (mr=0x6314dc20, addr=, value=, size=, shift=, mask=, attrs=...) at /home/alxndr/Development/qemu/memory.c:434
  #7  0x56001a70 in access_with_adjusted_size (addr=, value=, size=, access_size_min=, access_size_max=, access_fn=, mr=0x6314dc20, attrs=...) at /home/alxndr/Development/qemu/memory.c:544
  #8  0x56001a70 in memory_region_dispatch_read1 (mr=0x6314dc20, addr=0x4, pval=, size=0x4, attrs=...) at /home/alxndr/Development/qemu/memory.c:1396
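
  The precondition that fails in frame #3 can be modelled in a few lines.
  This is a Python sketch, not the QEMU source: the check is copied from
  the assertion text in the report, while the shift-and-mask body is an
  assumption about how such a bitfield helper is usually written.

```python
def extract32(value: int, start: int, length: int) -> int:
    """Return `length` bits of a 32-bit `value`, starting at bit `start`."""
    # Same precondition as the assertion quoted in the report.
    if not (start >= 0 and length > 0 and length <= 32 - start):
        raise AssertionError(
            "start >= 0 && length > 0 && length <= 32 - start"
        )
    # Shift the field down to bit 0, then mask off everything above it.
    return (value >> start) & (0xFFFFFFFF >> (32 - length))
```

  Any caller that derives start or length from a guest-controlled address
  and access size has to keep them inside this range, otherwise the
  assertion fires as in the backtrace.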

  
  I can reproduce it in qemu 5.0 using:
  cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -M pc-q35-5.0 -device ati-vga -nographic -qtest stdio -monitor none -serial none
  outl 0xcf8 0x80001018
  outl 0xcfc 0xe200
  outl 0xcf8 0x8000101c
  outl 0xcf8 0x80001004
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fa20
  write 0xe204 0x1 0x1a
  readq 0xe200
  EOF

  Similarly for ati_reg_write_offs:
  cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -M pc-q35-5.0 -device ati-vga -nographic -qtest stdio -monitor none -serial none
  outl 0xcf8 0x80001018
  outl 0xcfc 0xe200
  outl 0xcf8 0x8000101c
  outl 0xcf8 0x80001004
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fa20
  write 0xe200 0x8 0x6a006a00
  EOF

  I also attached the traces to this launchpad report, in case the
  formatting is broken:

  qemu-system-i386 -M pc-q35-5.0 -device ati-vga -nographic -qtest stdio
  -monitor none -serial none < attachment

  Please let me know if I can provide any further info.
  -Alex

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1878134/+subscriptions

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-06 Thread Rafael David Tinoco

FYI, from now on all the "merge" work will be done in the merge
requests linked to this bug (at the top). @paelzer will be
verifying those.

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Triaged
Status in kunpeng920 ubuntu-20.04 series:
  Triaged
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Disco:
  In Progress
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  In Progress


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-06 Thread Rafael David Tinoco

** Description changed:

+ [Impact]
+ 
+ * QEMU locking primitives might face a race condition in QEMU Async I/O
+ bottom-half scheduling. This leads to a deadlock that makes either QEMU
+ or one of its tools hang indefinitely.
+ 
+ [Test Case]
+ 
+ * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
+ 
+ Hangs indefinitely approximately 30% of the runs in Aarch64.
+ 
+ [Regression Potential]
+ 
+ * This is a change to a core part of QEMU: the AIO scheduling. It works
+ like a "kernel" scheduler: whereas the kernel schedules OS tasks, the
+ QEMU AIO code is responsible for scheduling QEMU coroutines and event
+ listener callbacks.
+ 
+ * There was a long discussion upstream about primitives and Aarch64.
+ After quite some time Paolo released this patch, and it solves the
+ issue. According to his commit log, the tested platforms were amd64 and
+ aarch64.
+ 
+ * Christian suggests that this fix stay a little longer in -proposed to
+ make sure it won't cause any regressions.
+ 
+ [Other Info]
+ 
+  * Original Description below:
+ 
+ 
  Command:
  
  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Hangs indefinitely approximately 30% of the runs.
  
  
  
  Workaround:
  
  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Run "qemu-img convert" with "a single coroutine" to avoid this issue.
  
  
  
  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...
  
  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()
  
  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start
  
  
  
  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2
  
  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]
  
  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]
  
  
  """
  
  All the tasks left are blocked in a system call, so there is no task
  left to call qemu_futex_wake() to unblock thread #2 (in futex()), which
  would unblock thread #1 (doing poll() in a pipe with thread #2).
  
  Those 7 threads exit before disk conversion is complete (sometimes at
  the beginning, sometimes at the end).
  
  
  
- [ Original Description ]
- 
  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:
  
  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2
  
  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.
  
  Once hung, attaching gdb gives the following backtrace:
  
  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
  __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns (fds=, nfds=,
  timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0xbbefbf80 in os_host_main_loop_wait (timeout=-1)
  at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=) at util/main-loop.c:497
  #5  0xbbe2aa30 in convert_do_copy (s=0xc123bb58) at qemu-img.c:1980
  #6  img_convert (argc=, argv=) at qemu-img.c:2456
  #7  0xbbe2333c in main (argc=7, argv=) at qemu-img.c:4975
  
  Reproduced w/ latest QEMU git (@ 53744e0a182)

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  New
Status in kunpeng920 ubuntu-18.04-hwe

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-06 Thread Rafael David Tinoco

** Changed in: qemu (Ubuntu)
 Assignee: Rafael David Tinoco (rafaeldtinoco) => (unassigned)

** Changed in: qemu
   Status: In Progress => Fix Released

** Changed in: qemu (Ubuntu Focal)
   Status: Incomplete => In Progress

** Changed in: qemu (Ubuntu Eoan)
   Status: Incomplete => In Progress

** Changed in: qemu (Ubuntu Disco)
   Status: Incomplete => In Progress

** Changed in: qemu (Ubuntu Bionic)
   Status: Incomplete => In Progress

** Changed in: qemu (Ubuntu)
   Status: Incomplete => In Progress

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  New
Status in kunpeng920 ubuntu-18.04-hwe series:
  New
Status in kunpeng920 ubuntu-19.10 series:
  New
Status in kunpeng920 ubuntu-20.04 series:
  New
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Disco:
  In Progress
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  In Progress


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-05 Thread Rafael David Tinoco

Hello Ike,

Please let me know whether you want me to pursue the needed SRUs for
this fix or whether you will do it.

I'll wait for the final feedback from tests with your PPA.

Cheers!

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Triaged
Status in kunpeng920 ubuntu-18.04 series:
  New
Status in kunpeng920 ubuntu-18.04-hwe series:
  New
Status in kunpeng920 ubuntu-19.10 series:
  New
Status in kunpeng920 ubuntu-20.04 series:
  New
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Incomplete
Status in qemu source package in Bionic:
  Incomplete
Status in qemu source package in Disco:
  Incomplete
Status in qemu source package in Eoan:
  Incomplete
Status in qemu source package in Focal:
  Incomplete


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-04-14 Thread Rafael David Tinoco

** Changed in: qemu (Ubuntu Eoan)
 Assignee: Rafael David Tinoco (rafaeldtinoco) => (unassigned)

** Changed in: qemu
 Assignee: Rafael David Tinoco (rafaeldtinoco) => (unassigned)

https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Incomplete
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Incomplete
Status in qemu source package in Bionic:
  Incomplete
Status in qemu source package in Disco:
  Incomplete
Status in qemu source package in Eoan:
  Incomplete
Status in qemu source package in Focal:
  Incomplete


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-12-17 Thread Rafael David Tinoco

Hello Fred,

Based on Dann's feedback on testing, I'm failing to see where your patch
fixes the "root" cause (despite being able to mitigate the issue by
changing the aio notification mechanism).

I think the root cause is best described in these two emails from the
thread:

https://lore.kernel.org/qemu-devel/20191009080220.GA2905@hc/

and

https://lore.kernel.org/qemu-devel/966c119d-aa76-2149-108f-867aebd77...@redhat.com/

So, by adding ctx->notify_for_convert, it is very likely you worked
around the issue by doing what Jan already said: moving both variables
(ctx->list_lock and, in the old case, ctx->notify_me, or, in your case,
ctx->notify_for_convert) out of the same cacheline, which makes the
issue "disappear" (as we would eventually do in a workaround patch).
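
To make the "same cacheline" point concrete, here is a minimal sketch of
how a workaround could keep the two fields apart. The struct names, the
field types and the 64-byte line size are all assumptions made for
illustration; this is not the real AioContext layout and not a proposed
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. The real value is CPU specific. */
#define CACHELINE_SIZE 64

/* Hypothetical layout where both fields are likely to share a line. */
struct ctx_shared {
    uint32_t list_lock;
    uint32_t notify_me;
};

/* Hypothetical layout that forces each field onto its own line. */
struct ctx_split {
    _Alignas(CACHELINE_SIZE) uint32_t list_lock;
    _Alignas(CACHELINE_SIZE) uint32_t notify_me;
};

/* Offsets are taken relative to a cacheline-aligned struct start. */
static bool same_cacheline(size_t off_a, size_t off_b)
{
    return off_a / CACHELINE_SIZE == off_b / CACHELINE_SIZE;
}

static void cacheline_selfcheck(void)
{
    assert(same_cacheline(offsetof(struct ctx_shared, list_lock),
                          offsetof(struct ctx_shared, notify_me)));
    assert(!same_cacheline(offsetof(struct ctx_split, list_lock),
                           offsetof(struct ctx_split, notify_me)));
}
```

Splitting the fields like this only changes timing. It does not add any
ordering guarantee, which is why it would be a workaround and not a fix.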

What about the aarch64 issue when both ctx->list_lock and
ctx->notify_for_convert are synchronized by the primitives QEMU uses
and sit in the same cache line?

Any "workaround" here would try to dodge the same cacheline situation,
but, for upstream, I suppose Paolo wants to have something else
regarding aarch64 ATOMIC_SEQ_CST.

as described in this part of the discussion:

https://lore.kernel.org/qemu-devel/96c26e21-5996-0c63-ce8b-99a1b5473...@redhat.com/

Unless I'm missing something. Am I?

Thank you!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Confirmed
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Disco:
  Confirmed
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  Confirmed


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-10-03 Thread Rafael David Tinoco

On 02/10/19 16:58, Torvald Riegel wrote:
> This example looks like Dekker synchronization (if I get the intent right).

It is the same pattern.  However, one of the two synchronized variables
is a counter rather than just a flag.

> Two possible implementations of this are either (1) with all memory
> accesses having seq-cst MO, or (2) with relaxed-MO accesses and seq-cst
> fences in between the store and load on both ends.  It's possible to mix
> both, but that gets trickier I think.  I'd prefer the one with just
> fences, just because it's easiest, conceptually.

Got it.

I'd also prefer the one with just fences, because we only really control
one side of the synchronization primitive (ctx_notify_me in my litmus
test) and I don't like the idea of forcing seq-cst MO on the other side
(bh_scheduled).  The performance issue that I mentioned is that x86
doesn't have relaxed fetch and add, so you'd have a redundant fence like
this:

    lock xaddl $2, mem1
    mfence
    ...
    movl mem1, %r8

(Gory QEMU details however allow us to use relaxed load and store here,
because there's only one writer).
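
For reference, option (2), relaxed accesses with a seq-cst fence between
the store and the load on both ends, could be sketched in plain C11 as
below. The function names are hypothetical and the two globals simply
mirror the litmus test. This is only an illustration of the pattern, not
necessarily the change that ends up in QEMU.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-ins for ctx->notify_me and bh->scheduled from the litmus test. */
static atomic_int ctx_notify_me;
static atomic_int bh_scheduled;

/* Notifier side: publish the work, fence, then check whether anyone
 * asked to be notified. Returns the observed ctx_notify_me. */
static int notifier_side(void)
{
    atomic_store_explicit(&bh_scheduled, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&ctx_notify_me, memory_order_relaxed);
}

/* Poller side: announce the intent to sleep, fence, then look for
 * pending work. Returns the observed bh_scheduled. */
static int poller_side(void)
{
    atomic_fetch_add_explicit(&ctx_notify_me, 2, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&bh_scheduled, memory_order_relaxed);
}

/* Single-threaded sanity check: it exercises the API only, it does not
 * prove anything about concurrent executions. */
static void dekker_selfcheck(void)
{
    int seen_work = poller_side();
    int seen_flag = notifier_side();
    assert(seen_work == 0);
    assert(seen_flag == 2);
}
```

With a fence on each side, the C11 model forbids the outcome where both
functions return 0 when they run concurrently, which is the lost wakeup
seen in the bug report.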

> It works if you use (1) or (2) consistently.  cppmem and the Batty et al.
> tech report should give you the gory details.
>
>> 1) understand why ATOMIC_SEQ_CST is not enough in this case.  QEMU code
>> seems to be making the same assumptions as Linux about the memory model,
>> and this is wrong because QEMU uses C11 atomics if available.
>> Fortunately, this kind of synchronization in QEMU is relatively rare and
>> only this particular bit seems affected.  If there is a fix which stays
>> within the C11 memory model, and does not pessimize code on x86, we can
>> use it[1] and document the pitfall.
>
> Using the fences between the store/load pairs in Dekker-like
> synchronization should do that, right?  It's also relatively easy to deal
> with.
>
>> 2) if there's no way to fix the bug, qemu/atomic.h needs to switch to
>> __sync_fetch_and_add and friends.  And again, in this case the
>> difference between the C11 and Linux/QEMU memory models must be documented.
>
> I'm surely not aware of all the constraints here, but I'd be surprised if the
> C11 memory model isn't good enough for portable synchronization code (with
> the exception of the consume MO minefield, perhaps).

This helps a lot already; I'll work on a documentation and code patch.
Thanks very much.

Paolo

>>   int main() {
>>     atomic_int ctx_notify_me = 0;
>>     atomic_int bh_scheduled = 0;
>>     {{{ {
>>           bh_scheduled.store(1, mo_release);
>>           atomic_thread_fence(mo_seq_cst);
>>           // must be zero since the bug report shows no notification
>>           ctx_notify_me.load(mo_relaxed).readsvalue(0);
>>         }
>>     ||| {
>>           ctx_notify_me.store(2, mo_seq_cst);
>>           r2=bh_scheduled.load(mo_relaxed);
>>         }
>>     }}};
>>     return 0;
>>   }



** Changed in: qemu (Ubuntu Disco)
   Importance: Undecided => Medium

** Changed in: qemu (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: qemu (Ubuntu Ff-series)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  New
Status in qemu source package in Disco:
  New
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in FF-Series:
  New


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-10-03 Thread Rafael David Tinoco

On Wed, 2019-10-02 at 15:20 +0200, Paolo Bonzini wrote:
> On 02/10/19 13:05, Jan Glauber wrote:
>> The arm64 code generated for the
>> atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
>> memory barriers. It is just plain ldaxr/stlxr.
>>
>> From my understanding this is not sufficient for SMP sync.
>>
>>>> If I read this comment correct:
>>>>
>>>>     void aio_notify(AioContext *ctx)
>>>>     {
>>>>         /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
>>>>          * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
>>>>          */
>>>>         smp_mb();
>>>>         if (ctx->notify_me) {
>>>>
>>>> it points out that the smp_mb() should be paired. But as
>>>> I said the used atomics don't generate any barriers at all.
>>>
>>> Awesome!  That would be a compiler bug though, as atomic_add and atomic_sub
>>> are defined as sequentially consistent:
>>>
>>> #define atomic_add(ptr, n) ((void) __atomic_fetch_add(ptr, n, 
>>> __ATOMIC_SEQ_CST))
>>> #define atomic_sub(ptr, n) ((void) __atomic_fetch_sub(ptr, n, 
>>> __ATOMIC_SEQ_CST))
>>
>> Compiler bug sounds kind of unlikely...
>
> Indeed the assembly produced by the compiler matches for example the
> mappings at https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html.  A
> small testcase is as follows:
>
>   int ctx_notify_me;
>   int bh_scheduled;
>
>   int x()
>   {
>       int one = 1;
>       int ret;
>       __atomic_store(&bh_scheduled, &one, __ATOMIC_RELEASE);    // x1
>       __atomic_thread_fence(__ATOMIC_SEQ_CST);                  // x2
>       __atomic_load(&ctx_notify_me, &ret, __ATOMIC_RELAXED);    // x3
>       return ret;
>   }
>
>   int y()
>   {
>       int ret;
>       __atomic_fetch_add(&ctx_notify_me, 2, __ATOMIC_SEQ_CST);  // y1
>       __atomic_load(&bh_scheduled, &ret, __ATOMIC_RELAXED);     // y2
>       return ret;
>   }
>
> Here y (which is aio_poll) wants to order the write to ctx->notify_me
> before reads of bh->scheduled.  However, the processor can speculate the
> load of bh->scheduled between the load-acquire and store-release of
> ctx->notify_me.  So you can have something like:
>
>      thread 0 (y)                          thread 1 (x)
>      -----------------------------------   ---------------------------------
>      y1: load-acq ctx->notify_me
>      y2: load-rlx bh->scheduled
>                                            x1: store-rel bh->scheduled <-- 1
>                                            x2: memory barrier
>                                            x3: load-rlx ctx->notify_me
>      y1: store-rel ctx->notify_me <-- 2
>
> Being very puzzled, I tried to put this into cppmem:
>
>   int main() {
>     atomic_int ctx_notify_me = 0;
>     atomic_int bh_scheduled = 0;
>     {{{ {
>           bh_scheduled.store(1, mo_release);
>           atomic_thread_fence(mo_seq_cst);
>           // must be zero since the bug report shows no notification
>           ctx_notify_me.load(mo_relaxed).readsvalue(0);
>         }
>     ||| {
>           ctx_notify_me.store(2, mo_seq_cst);
>           r2=bh_scheduled.load(mo_relaxed);
>         }
>     }}};
>     return 0;
>   }
>
> and much to my surprise, the tool said r2 *can* be 0.  Same if I put a
> CAS like
>
> cas_strong_explicit(ctx_notify_me.readsvalue(0), 0, 2,
> mo_seq_cst, mo_seq_cst);
>
> which resembles the code in the test case a bit more.

This example looks like Dekker synchronization (if I get the intent
right).

Two possible implementations of this are either (1) with all memory
accesses having seq-cst MO, or (2) with relaxed-MO accesses and seq-cst
fences in between the store and load on both ends.  It's possible to mix
both, but that gets trickier I think.  I'd prefer the one with just
fences, just because it's easiest, conceptually.

> I then found a discussion about using the C11 memory model in Linux
> (https://gcc.gnu.org/ml/gcc/2014-02/msg00058.html) which contains the
> following statement, which is a bit disheartening even though it is
> about a different test:
>
>My first gut feeling was that the assertion should never fire, but
>that was wrong because (as I seem to usually forget) the seq-cst
>total order is just a constraint but doesn't itself contribute
>to synchronizes-with -- but this is different for seq-cst fences.

It works if you use (1) or (2) consistently.  cppmem and the Batty et al.
tech report should give you the gory details.
My comment is just about seq-cst working differently on memory accesses vs.
fences (in the way it's specified in the memory model).

> and later in the thread:
>
>Use of C11 atomics to implement Linux kernel atomic operations
>requires knowledge of the underlying architecture and the compiler's
>implementation, as was noted earlier in this thread.
>
> Indeed if I add an atomic_thread_fence I get only one valid execution,
> where r2 must be 1.  This is similar to GCC's bug
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697, and we can fix it in
> QEMU by using

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-10-03 Thread Rafael David Tinoco

Documenting this here as bug# was dropped from the mail thread:

On 02/10/19 13:05, Jan Glauber wrote:
> The arm64 code generated for the
> atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
> memory barriers. It is just plain ldaxr/stlxr.
>
> From my understanding this is not sufficient for SMP sync.
>
>>> If I read this comment correct:
>>>
>>> void aio_notify(AioContext *ctx)
>>> {
>>> /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
>>>  * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
>>>  */
>>> smp_mb();
>>> if (ctx->notify_me) {
>>>
>>> it points out that the smp_mb() should be paired. But as
>>> I said the used atomics don't generate any barriers at all.
>>
>> Awesome!  That would be a compiler bug though, as atomic_add and atomic_sub
>> are defined as sequentially consistent:
>>
>> #define atomic_add(ptr, n) ((void) __atomic_fetch_add(ptr, n, 
>> __ATOMIC_SEQ_CST))
>> #define atomic_sub(ptr, n) ((void) __atomic_fetch_sub(ptr, n, 
>> __ATOMIC_SEQ_CST))
>
> Compiler bug sounds kind of unlikely...
Indeed the assembly produced by the compiler matches for example the
mappings at https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html.  A
small testcase is as follows:

  int ctx_notify_me;
  int bh_scheduled;

  int x()
  {
      int one = 1;
      int ret;
      __atomic_store(&bh_scheduled, &one, __ATOMIC_RELEASE);    // x1
      __atomic_thread_fence(__ATOMIC_SEQ_CST);                  // x2
      __atomic_load(&ctx_notify_me, &ret, __ATOMIC_RELAXED);    // x3
      return ret;
  }

  int y()
  {
      int ret;
      __atomic_fetch_add(&ctx_notify_me, 2, __ATOMIC_SEQ_CST);  // y1
      __atomic_load(&bh_scheduled, &ret, __ATOMIC_RELAXED);     // y2
      return ret;
  }

Here y (which is aio_poll) wants to order the write to ctx->notify_me
before reads of bh->scheduled.  However, the processor can speculate the
load of bh->scheduled between the load-acquire and store-release of
ctx->notify_me.  So you can have something like:

     thread 0 (y)                          thread 1 (x)
     -----------------------------------   ---------------------------------
     y1: load-acq ctx->notify_me
     y2: load-rlx bh->scheduled
                                           x1: store-rel bh->scheduled <-- 1
                                           x2: memory barrier
                                           x3: load-rlx ctx->notify_me
     y1: store-rel ctx->notify_me <-- 2

Being very puzzled, I tried to put this into cppmem:

  int main() {
    atomic_int ctx_notify_me = 0;
    atomic_int bh_scheduled = 0;
    {{{ {
          bh_scheduled.store(1, mo_release);
          atomic_thread_fence(mo_seq_cst);
          // must be zero since the bug report shows no notification
          ctx_notify_me.load(mo_relaxed).readsvalue(0);
        }
    ||| {
          ctx_notify_me.store(2, mo_seq_cst);
          r2=bh_scheduled.load(mo_relaxed);
        }
    }}};
    return 0;
  }

and much to my surprise, the tool said r2 *can* be 0.  Same if I put a
CAS like

cas_strong_explicit(ctx_notify_me.readsvalue(0), 0, 2,
mo_seq_cst, mo_seq_cst);

which resembles the code in the test case a bit more.

I then found a discussion about using the C11 memory model in Linux
(https://gcc.gnu.org/ml/gcc/2014-02/msg00058.html) which contains the
following statement, which is a bit disheartening even though it is
about a different test:

   My first gut feeling was that the assertion should never fire, but
   that was wrong because (as I seem to usually forget) the seq-cst
   total order is just a constraint but doesn't itself contribute
   to synchronizes-with -- but this is different for seq-cst fences.

and later in the thread:

   Use of C11 atomics to implement Linux kernel atomic operations
   requires knowledge of the underlying architecture and the compiler's
   implementation, as was noted earlier in this thread.

Indeed if I add an atomic_thread_fence I get only one valid execution,
where r2 must be 1.  This is similar to GCC's bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697, and we can fix it in
QEMU by using __sync_fetch_and_add; in fact cppmem also shows one valid
execution if the store is replaced with something like GCC's assembly
for __sync_fetch_and_add (or Linux's assembly for atomic_add_return):

cas_strong_explicit(ctx_notify_me.readsvalue(0), 0, 2,
mo_release, mo_release);
atomic_thread_fence(mo_seq_cst);
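
In GCC terms, the two variants being compared could look roughly like
the sketch below. The helper names are hypothetical; only the
__sync_fetch_and_add, __atomic_fetch_add and __atomic_thread_fence
builtins are real. The second helper mirrors the cppmem snippet above
(a release RMW followed by a seq-cst fence) and is not a claim about
what GCC actually emits for the first.

```c
#include <assert.h>

/* Legacy builtin, documented by GCC as a full barrier. */
static inline int fetch_add_sync(int *ptr, int n)
{
    return __sync_fetch_and_add(ptr, n);
}

/* C11-style equivalent of the cppmem snippet: a release read-modify-write
 * followed by an explicit seq-cst fence. */
static inline int fetch_add_fenced(int *ptr, int n)
{
    int old = __atomic_fetch_add(ptr, n, __ATOMIC_RELEASE);
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return old;
}

/* Single-threaded check that both helpers return the old value and
 * apply the increment; it says nothing about ordering. */
static void fetch_add_selfcheck(void)
{
    int notify_me = 0;
    assert(fetch_add_sync(&notify_me, 2) == 0);
    assert(fetch_add_fenced(&notify_me, 2) == 2);
    assert(notify_me == 4);
}
```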

So we should:

1) understand why ATOMIC_SEQ_CST is not enough in this case.  QEMU code
seems to be making the same assumptions as Linux about the memory model,
and this is wrong because QEMU uses C11 atomics if available.
Fortunately, this kind of synchronization in QEMU is relatively rare and
only this particular bit seems affected.  If there is a fix which stays
within the C11 memory model, and does not pessimize code on x86, we can
use it[1] and document the pitfall.

2) if there's no way

Re: Thoughts on VM fence infrastructure

2019-09-30 Thread Rafael David Tinoco



>>> There are times when the main loop can get blocked even though the CPU
>>> threads can be running and can in some configurations perform IO
>>> even without the main loop (I think!).
>> Ah, that's a very good point. Indeed, you can perform IO in those
>> cases especially when using vhost devices.
>>
>>> By setting a timer in the kernel that sends a signal to qemu, the kernel
>>> will send that signal however broken qemu is.
>> Got you now. That's probably better. Do you reckon a signal is
>> preferable over SIGEV_THREAD?
> Not sure; probably the safest is getting the kernel to SIGKILL it - but
> that's a complete nightmare to debug - your process just goes *pop*
> with no apparent reason why.
> I've not used SIGEV_THREAD - it looks promising though.

Sorry to "enter" the discussion, but, in "real" HW, it's not by accident
that a watchdog device timeout generates an NMI to the CPUs, causing the
kernel to handle the interrupt - and panic (or to take another action set
by specific watchdog drivers that re-implement the default ones).

Can't you simply "inject" an NMI into all guest vCPUs BEFORE you take any
action in QEMU itself? Just like the virtual watchdog device would do
from inside the guest (/dev/watchdog), but capable of being updated from
outside, as in this case of yours (if I understood correctly).

Possibly you would need a dedicated loop for this "watchdog
device" (AIO threads?) so that it does not compete with existing
coroutines/BH tasks and their jitter on your "realtime watchdog needs".

Regarding the remaining in-flight I/O for the guest's devices in question
(vhost/vhost-user etc.), it would be just like a real host where the "bus"
received commands, but the sender died right after...

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-09-11 Thread Rafael David Tinoco

** Also affects: qemu (Ubuntu Ff-series)
   Importance: Undecided
   Status: New

** Also affects: qemu (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: qemu (Ubuntu Eoan)
   Importance: Medium
 Assignee: Rafael David Tinoco (rafaeldtinoco)
   Status: In Progress

** Also affects: qemu (Ubuntu Disco)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  New
Status in qemu source package in Disco:
  New
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in FF-Series:
  New

Bug description:
  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  [ Original Description ]

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, nfds=187650274213760,
      timeout=<optimized out>, timeout@entry=0x0, sigmask=0xc123b950)
      at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>,
      __fds=<optimized out>) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>,
      timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0xbbefbf80 in os_host_main_loop_wait (timeout=-1)
      at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
  #5  0xbbe2aa30 in convert_do_copy (s=0xc123bb58) at qemu-img.c:1980
  #6  img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2456
  #7  0xbbe2333c in main (argc=7, argv=<optimized out>) at qemu-img.c:4975

  Reproduced w/ latest QEMU git (@ 53744e0a182)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions

Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues

2019-09-11 Thread Rafael David Tinoco

> Zhengui's theory that notify_me doesn't work properly on ARM is more
> promising, but he couldn't provide a clear explanation of why he thought
> notify_me is involved.  In particular, I would have expected notify_me to
> be wrong if the qemu_poll_ns call came from aio_ctx_dispatch, for example:
> 
> 
>     glib_pollfds_fill
>       g_main_context_prepare
>         aio_ctx_prepare
>           atomic_or(&ctx->notify_me, 1)
>     qemu_poll_ns
>     glib_pollfds_poll
>       g_main_context_check
>         aio_ctx_check
>           atomic_and(&ctx->notify_me, ~1)
>       g_main_context_dispatch
>         aio_ctx_dispatch
>           /* do something for event */
>             qemu_poll_ns
> 

Paolo,

I tried confining execution to a single NUMA domain (cpu & mem) and
still hit the issue. Then I added a mutex, "ctx->notify_me_lcktest",
to the context to protect "ctx->notify_me", as shown below, and it
seems to have either fixed or mitigated it.

Before the change I was able to cause the hang once every 3 or 4 runs.
I have now run qemu-img convert more than 30 times and couldn't
reproduce it again.

The next step is to play with the barriers and check why the existing
ones aren't enough for ordering access to ctx->notify_me ... or should I
try/do something else in your opinion?

This arch/machine (Huawei D06):

$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  1
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        4
Vendor ID:           0x48
Model:               0
Stepping:            0x0
CPU max MHz:         2000.
CPU min MHz:         200.
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            32768K
NUMA node0 CPU(s):   0-23
NUMA node1 CPU(s):   24-47
NUMA node2 CPU(s):   48-71
NUMA node3 CPU(s):   72-95
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm dcpop



diff --git a/include/block/aio.h b/include/block/aio.h
index 0ca25dfec6..0724086d91 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -84,6 +84,7 @@ struct AioContext {
      * dispatch phase, hence a simple counter is enough for them.
      */
     uint32_t notify_me;
+    QemuMutex notify_me_lcktest;

     /* A lock to protect between QEMUBH and AioHandler adders and deleter,
      * and to ensure that no callbacks are removed while we're walking and
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 51c41ed3c9..031d6e2997 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -529,7 +529,9 @@ static bool run_poll_handlers(AioContext *ctx, int64_t max_ns, int64_t *timeout)
     bool progress;
     int64_t start_time, elapsed_time;

+    qemu_mutex_lock(&ctx->notify_me_lcktest);
     assert(ctx->notify_me);
+    qemu_mutex_unlock(&ctx->notify_me_lcktest);
     assert(qemu_lockcnt_count(&ctx->list_lock) > 0);

     trace_run_poll_handlers_begin(ctx, max_ns, *timeout);
@@ -601,8 +603,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
      * so disable the optimization now.
      */
     if (blocking) {
+        qemu_mutex_lock(&ctx->notify_me_lcktest);
         assert(in_aio_context_home_thread(ctx));
         atomic_add(&ctx->notify_me, 2);
+        qemu_mutex_unlock(&ctx->notify_me_lcktest);
     }

     qemu_lockcnt_inc(&ctx->list_lock);
@@ -647,8 +651,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }

     if (blocking) {
+        qemu_mutex_lock(&ctx->notify_me_lcktest);
         atomic_sub(&ctx->notify_me, 2);
         aio_notify_accept(ctx);
+        qemu_mutex_unlock(&ctx->notify_me_lcktest);
     }

     /* Adjust polling time */
diff --git a/util/async.c b/util/async.c
index c10642a385..140e1e86f5 100644
--- a/util/async.c
+++ b/util/async.c
@@ -221,7 +221,9 @@ aio_ctx_prepare(GSource *source, gint*timeout)
 {
     AioContext *ctx = (AioContext *) source;

+    qemu_mutex_lock(&ctx->notify_me_lcktest);
     atomic_or(&ctx->notify_me, 1);
+    qemu_mutex_unlock(&ctx->notify_me_lcktest);

     /* We assume there is no timeout already supplied */
     *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
@@ -239,8 +241,10 @@ aio_ctx_check(GSource *source)
     AioContext *ctx = (AioContext *) source;
     QEMUBH *bh;

+    qemu_mutex_lock(&ctx->notify_me_lcktest);
     atomic_and(&ctx->notify_me, ~1);
     aio_notify_accept(ctx);
+    qemu_mutex_unlock(&ctx->notify_me_lcktest);

     for (bh = ctx->first_bh; bh; bh = bh->next) {
         if (bh->scheduled) {
@@ -346,11 +350,13 @@ void aio_notify(AioContext *ctx)
     /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
      * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
      */
-    smp_mb();
+    //smp_mb();
+    qemu_mutex_lock(&ctx->notify_me_lcktest);
     if (ctx->notify_me) {
         event_notifier_set(&ctx->notifier);
         atomic_mb_set(&ctx->notified, true);
     }
+    qemu_mutex_unlock(&ctx->notify_me_lcktest);
 }

 void aio_notify_accept(AioContext *ctx)
@@ -424,6 +430,8 @@ AioContext *aio_context_new(Error

Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues

2019-09-11 Thread Rafael David Tinoco



> Note that the RCU thread is expected to sit most of the time doing 
> nothing, so I don't think this matters.

Agreed.

> Zhengui's theory that notify_me doesn't work properly on ARM is more
> promising, but he couldn't provide a clear explanation of why he thought
> notify_me is involved.  In particular, I would have expected notify_me to
> be wrong if the qemu_poll_ns call came from aio_ctx_dispatch, for example:
> 
> 
>     glib_pollfds_fill
>       g_main_context_prepare
>         aio_ctx_prepare
>           atomic_or(&ctx->notify_me, 1)
>     qemu_poll_ns
>     glib_pollfds_poll
>       g_main_context_check
>         aio_ctx_check
>           atomic_and(&ctx->notify_me, ~1)
>       g_main_context_dispatch
>         aio_ctx_dispatch
>           /* do something for event */
>             qemu_poll_ns
> 

Yep, will focus there.

> 
> Can you place somewhere your util/async.o object file for me to look at it?

Sure!

https://send.firefox.com/download/45c26bbe1075eea1/#ZD_e_96imPG2QuDqaX-jhg

Note: this async.o has value as int, EV_BUSY as 3, aborts if any errno
in qemu_futex() and uses &ev->value as 1st argument to wake/wait (as in
https://pastebin.ubuntu.com/p/xk8D6H6kgM/).

> 
> You could change it to 3, but it has to have all the bits in EV_FREE
> (see atomic_or(&ev->value, EV_FREE) in qemu_event_reset).
> 
> You could also change it to -1u, but I don't see a particular need to do so.
> 

Yep, it was a dead end on my side.
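
For the record, the constraint on EV_BUSY quoted above can be written
down as a small check. This is only a sketch: the EV_FREE value of 1 is
my assumption about QEMU's definition, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; check QEMU's qemu-thread implementation for the real one. */
#define EV_FREE 1u

/* A candidate EV_BUSY must contain every bit of EV_FREE, so that the
 * atomic_or of EV_FREE done by qemu_event_reset() leaves a busy event
 * busy, and it must differ from EV_FREE itself. */
static bool busy_value_ok(unsigned busy)
{
    return busy != EV_FREE && (busy | EV_FREE) == busy;
}

static void busy_value_selfcheck(void)
{
    assert(busy_value_ok(3u));    /* the value I tried */
    assert(busy_value_ok(-1u));   /* the other value mentioned above */
    assert(!busy_value_ok(2u));   /* lacks the EV_FREE bit */
}
```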

>> - Should qemu_event_set() check return code from
>> qemu_futex_wake()->qemu_futex()->syscall() in order to know if ANY
>> waiter was ever woken up ? Maybe even loop until at least 1 is awaken ?
> 
> Why would it need to do so?
> 

No need, just realized after I saw no tasks waking that thread up. Like
you said, ctx->notify_me seems more promising, will give it a try.

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-09-11 Thread Rafael David Tinoco

** Description changed:

+ Command:
+ 
+ qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
+ 
+ Hangs indefinitely approximately 30% of the runs.
+ 
+ 
+ 
+ Workaround:
+ 
+ qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
+ 
+ Run "qemu-img convert" with "a single coroutine" to avoid this issue.
+ 
+ 
+ 
+ (gdb) thread 1
+ ...
+ (gdb) bt
+ #0 0xbf1ad81c in __GI_ppoll
+ #1 0xaabcf73c in ppoll
+ #2 qemu_poll_ns
+ #3 0xaabd0764 in os_host_main_loop_wait
+ #4 main_loop_wait
+ ...
+ 
+ (gdb) thread 2
+ ...
+ (gdb) bt
+ #0 syscall ()
+ #1 0xaabd41cc in qemu_futex_wait
+ #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
+ #3 0xaabed05c in call_rcu_thread
+ #4 0xaabd34c8 in qemu_thread_start
+ #5 0xbf25c880 in start_thread
+ #6 0xbf1b6b9c in thread_start ()
+ 
+ (gdb) thread 3
+ ...
+ (gdb) bt
+ #0 0xbf11aa20 in __GI___sigtimedwait
+ #1 0xbf2671b4 in __sigwait
+ #2 0xaabd1ddc in sigwait_compat
+ #3 0xaabd34c8 in qemu_thread_start
+ #4 0xbf25c880 in start_thread
+ #5 0xbf1b6b9c in thread_start
+ 
+ 
+ 
+ (gdb) run
+ Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
+ ./disk01.ext4.qcow2 ./output.qcow2
+ 
+ [New Thread 0xbec5ad90 (LWP 72839)]
+ [New Thread 0xbe459d90 (LWP 72840)]
+ [New Thread 0xbdb57d90 (LWP 72841)]
+ [New Thread 0xacac9d90 (LWP 72859)]
+ [New Thread 0xa7ffed90 (LWP 72860)]
+ [New Thread 0xa77fdd90 (LWP 72861)]
+ [New Thread 0xa6ffcd90 (LWP 72862)]
+ [New Thread 0xa67fbd90 (LWP 72863)]
+ [New Thread 0xa5ffad90 (LWP 72864)]
+ 
+ [Thread 0xa5ffad90 (LWP 72864) exited]
+ [Thread 0xa6ffcd90 (LWP 72862) exited]
+ [Thread 0xa77fdd90 (LWP 72861) exited]
+ [Thread 0xbdb57d90 (LWP 72841) exited]
+ [Thread 0xa67fbd90 (LWP 72863) exited]
+ [Thread 0xacac9d90 (LWP 72859) exited]
+ [Thread 0xa7ffed90 (LWP 72860) exited]
+ 
+ 
+ """
+ 
+ All the tasks left are blocked in a system call, so no task left to call
+ qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
+ thread #1 (doing poll() in a pipe with thread #2).
+ 
+ Those 7 threads exit before disk conversion is complete (sometimes in
+ the beginning, sometimes at the end).
+ 
+ 
+ 
+ [ Original Description ]
+ 
  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:
  
  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2
  
  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.
  
  Once hung, attaching gdb gives the following backtrace:
  
  (gdb) bt
- #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760, 
- timeout=, timeout@entry=0x0, sigmask=0xc123b950)
- at ../sysdeps/unix/sysv/linux/ppoll.c:39
- #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, 
- __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
- #2  qemu_poll_ns (fds=, nfds=, 
- timeout=timeout@entry=-1) at util/qemu-timer.c:322
+ #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760,
+ timeout=, timeout@entry=0x0, sigmask=0xc123b950)
+ at ../sysdeps/unix/sysv/linux/ppoll.c:39
+ #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
+ __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
+ #2  qemu_poll_ns (fds=, nfds=,
+ timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0xbbefbf80 in os_host_main_loop_wait (timeout=-1)
- at util/main-loop.c:233
+ at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=) at util/main-loop.c:497
  #5  0xbbe2aa30 in convert_do_copy (s=0xc123bb58) at 
qemu-img.c:1980
  #6  img_convert (argc=, argv=) at 
qemu-img.c:2456
  #7  0xbbe2333c in main (argc=7, argv=) at 
qemu-img.c:4975
  
  Reproduced w/ latest QEMU git (@ 53744e0a182)

** Description changed:

  Command:
  
- qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
+ qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Hangs indefinitely approximately 30% of the runs.
  
  
  
  Workaround:
  
  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2
  
  Run "qemu-img convert" with "a single coroutine" to avoid this issue.
  
  
  
  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...
  
  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start

Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues

2019-09-11 Thread Rafael David Tinoco

Quick update...

> value UINT_MAX (4294967295) seems WRONG for qemu_futex_wait():
> 
> - EV_BUSY, being -1, and passed as an argument to qemu_futex_wait(void *,
> unsigned), is converted to unsigned, turning the argument into UINT_MAX
> when that's not what is expected (unless I missed something).
> 
> *** If that is the case, unsure if you, Paolo, prefer declaring
> *(QemuEvent)->value as an integer or changing EV_BUSY to "2" would okay
> here ***
> 
> BUG: description:
> https://bugs.launchpad.net/qemu/+bug/1805256/comments/15

I realized this might be intentional, but, still, I tried:

https://pastebin.ubuntu.com/p/6rkkY6fJdm/

looking for anything that could have misbehaved on arm64 (especially
concerned about casting and type conversions between the functions).

> QUESTION:
> 
> - Should qemu_event_set() check return code from
> qemu_futex_wake()->qemu_futex()->syscall() in order to know if ANY
> waiter was ever woken up ? Maybe even loop until at least 1 is awaken ?

And I also tried:

-qemu_futex(f, FUTEX_WAKE, n, NULL, NULL, 0);
+while(qemu_futex(pval, FUTEX_WAKE, val, NULL, NULL, 0) == 0)
+continue;

and it made little difference (took way more time for me to reproduce
the issue though):

"""
(gdb) run
Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
./disk01.ext4.qcow2 ./output.qcow2

[New Thread 0xbec5ad90 (LWP 72839)]
[New Thread 0xbe459d90 (LWP 72840)]
[New Thread 0xbdb57d90 (LWP 72841)]
[New Thread 0xacac9d90 (LWP 72859)]
[New Thread 0xa7ffed90 (LWP 72860)]
[New Thread 0xa77fdd90 (LWP 72861)]
[New Thread 0xa6ffcd90 (LWP 72862)]
[New Thread 0xa67fbd90 (LWP 72863)]
[New Thread 0xa5ffad90 (LWP 72864)]

[Thread 0xa5ffad90 (LWP 72864) exited]
[Thread 0xa6ffcd90 (LWP 72862) exited]
[Thread 0xa77fdd90 (LWP 72861) exited]
[Thread 0xbdb57d90 (LWP 72841) exited]
[Thread 0xa67fbd90 (LWP 72863) exited]
[Thread 0xacac9d90 (LWP 72859) exited]
[Thread 0xa7ffed90 (LWP 72860) exited]


"""

All the tasks left are blocked in a system call, so no task left to call
qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
thread #1 (doing poll() in a pipe with thread #2).

Those 7 threads exit before disk conversion is complete (sometimes in
the beginning, sometimes at the end).

I'll try to check why those tasks exited.

Any thoughts ?

Tks

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-09-10 Thread Rafael David Tinoco

In comment #14, please disregard the second half of the issue, related
to:

   0xaabd4100 <+16>: cbz w1, 0xaabd4108 
   0xaabd4104 <+20>: ret
   0xaabd4108 <+24>: ldaxr w1, [x0]
   0xaabd410c <+28>: orr w1, w1, #0x1
=> 0xaabd4110 <+32>: stlxr w2, w1, [x0]
   0xaabd4114 <+36>: cbnz w2, 0xaabd4108

Duh! This is just the regular load/or/store logic for atomic_or() inside
qemu_event_reset().
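
For reference, the same operation written with C11 atomics. This is only a
sketch: set_bit0() is a hypothetical name, and whether a given compiler emits
an LDAXR/ORR/STLXR retry loop for it (as opposed to an LSE atomic instruction)
depends on the target flags, which I am assuming rather than checking here:

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically sets bit 0 of *v and returns the value seen before the
 * update, like an atomic OR with 1. */
static unsigned set_bit0(atomic_uint *v)
{
    unsigned old = atomic_fetch_or(v, 1u);
    /* Sanity check, only meaningful when no other thread clears the bit. */
    assert((atomic_load(v) & 1u) == 1u);
    return old;
}
```

The store-exclusive fails and the loop retries whenever another CPU touched
the location between the load and the store, so seeing the branch back to
LDAXR a few times is expected behaviour and not a bug by itself.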

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress

Bug description:
  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760, 
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, 
  __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns (fds=, nfds=, 
  timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0xbbefbf80 in os_host_main_loop_wait (timeout=-1)
  at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=) at util/main-loop.c:497
  #5  0xbbe2aa30 in convert_do_copy (s=0xc123bb58) at 
qemu-img.c:1980
  #6  img_convert (argc=, argv=) at 
qemu-img.c:2456
  #7  0xbbe2333c in main (argc=7, argv=) at 
qemu-img.c:4975

  Reproduced w/ latest QEMU git (@ 53744e0a182)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions

[Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues

2019-09-10 Thread Rafael David Tinoco

Paolo,

While debugging hangs on ARM64 while doing a simple:

qemu-img convert -f qcow2 -O qcow2 file.qcow2 output.qcow2

I might have found 2 issues which I'd like you to review, if possible.

ISSUE #1


I've caught the following stack trace after a hang in qemu-img convert:

(gdb) bt
#0 syscall ()
#1 0xaabd41cc in qemu_futex_wait
#2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
#3 0xaabed05c in call_rcu_thread
#4 0xaabd34c8 in qemu_thread_start
#5 0xbf25c880 in start_thread
#6 0xbf1b6b9c in thread_start ()

(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}

value UINT_MAX (4294967295) seems WRONG for qemu_futex_wait():

- EV_BUSY, being -1, and passed as an argument to qemu_futex_wait(void *,
unsigned), is converted to unsigned, turning the argument into UINT_MAX
when that's not what is expected (unless I missed something).

*** If that is the case, unsure if you, Paolo, prefer declaring
*(QemuEvent)->value as an integer or changing EV_BUSY to "2" would okay
here ***

BUG: description:
https://bugs.launchpad.net/qemu/+bug/1805256/comments/15


ISSUE #2


I found this when debugging lockups while in futex() in a specific ARM64
server - https://bugs.launchpad.net/qemu/+bug/1805256 - which I'm still
investigating.

After fixing the issue above, I'm still getting stuck in:

qemu_event_wait() -> qemu_futex_wait()

***
As if qemu_event_set() had run before qemu_futex_wait() ever started running
***

The other threads are waiting: thread #1 in poll() on a pipe coming from
this stuck thread, and thread #3 in sigwait():

(gdb) thread 1
...
(gdb) bt
#0  0xbf1ad81c in __GI_ppoll
#1  0xaabcf73c in ppoll
#2  qemu_poll_ns
#3  0xaabd0764 in os_host_main_loop_wait
#4  main_loop_wait
...

(gdb) thread 2
...
(gdb) bt
#0 syscall ()
#1 0xaabd41cc in qemu_futex_wait
#2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
#3 0xaabed05c in call_rcu_thread
#4 0xaabd34c8 in qemu_thread_start
#5 0xbf25c880 in start_thread
#6 0xbf1b6b9c in thread_start ()

(gdb) thread 3
...
(gdb) bt
#0  0xbf11aa20 in __GI___sigtimedwait
#1  0xbf2671b4 in __sigwait
#2  0xaabd1ddc in sigwait_compat
#3  0xaabd34c8 in qemu_thread_start
#4  0xbf25c880 in start_thread
#5  0xbf1b6b9c in thread_start

QUESTION:

- Should qemu_event_set() check return code from
qemu_futex_wake()->qemu_futex()->syscall() in order to know if ANY
waiter was ever woken up ? Maybe even loop until at least 1 is awaken ?
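
To make the question concrete, this is roughly what inspecting the wake
result looks like on Linux. It is a sketch only: futex_wake() here is a
hypothetical wrapper written for this mail, not QEMU's qemu_futex_wake():

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wakes up to `n` waiters blocked on `uaddr`. Returns how many were
 * woken, or -1 with errno set. Zero is not an error: it only means
 * nobody was sleeping on that address at that moment. */
static long futex_wake(unsigned *uaddr, int n)
{
    assert(uaddr != NULL);
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, n, NULL, NULL, 0);
}
```

Calling futex_wake(&word, INT_MAX) on a word nobody waits on returns 0. As
far as I understand the futex semantics, FUTEX_WAIT re-checks the word
against the expected value before sleeping, so a wake that arrives early
should not be lost as long as the word was changed first; looping on a zero
return would just spin.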

Tks in advance,

Rafael D. Tinoco

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-09-10 Thread Rafael David Tinoco

QEMU BUG: #1

Alright, one of the issues is (according to comment #14):

"""
Meaning that code is waiting for a futex inside kernel.

(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}

The QemuEvent "rcu_call_ready_event->value" is set to UINT_MAX and I
don't know why yet.

rcu_call_ready_event->value is only touched by:

qemu_event_init() -> bool init ? EV_SET : EV_FREE
qemu_event_reset() -> atomic_or(&ev->value, EV_FREE)
qemu_event_set() -> atomic_xchg(&ev->value, EV_SET)
qemu_event_wait() -> atomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY)
"""

Now I know why rcu_call_ready_event->value is set to UINT_MAX. That is
because in the following declaration:

struct QemuEvent {
#ifndef __linux__
    pthread_mutex_t lock;
    pthread_cond_t cond;
#endif
    unsigned value;
    bool initialized;
};

#define EV_SET   0
#define EV_FREE  1
#define EV_BUSY -1

"value" is declared as unsigned, but EV_BUSY sets it to -1, and,
according to the Two's Complement Operation
(https://en.wikipedia.org/wiki/Two%27s_complement), it will be UINT_MAX
(4294967295).
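
A tiny sketch of that conversion, for the record. ev_busy_stored() is a
hypothetical helper for illustration only; the EV_BUSY definition is the one
shown above:

```c
#include <assert.h>
#include <limits.h>

#define EV_BUSY -1

/* Returns what an `unsigned` field holds after EV_BUSY is stored in it. */
static unsigned ev_busy_stored(void)
{
    unsigned value = EV_BUSY;    /* -1 is converted modulo UINT_MAX + 1 */
    assert(value == UINT_MAX);
    return value;
}
```

The conversion is well defined in C, so comparing the stored value against
EV_BUSY converted to the same unsigned type still matches. With a 32-bit
unsigned that value prints as 4294967295 in gdb.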

So this is the "first bug" found AND it is definitely funny that this
hasn't been seen in other architectures at all... I can reproduce it at
will.

With that said, it seems that there is still another issue causing (less
frequently):

(gdb) thread 2
[Switching to thread 2 (Thread 0xbec5ad90 (LWP 17459))]
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
38  ../sysdeps/unix/sysv/linux/aarch64/syscall.S: No such file or directory.
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0xaabd41cc in qemu_futex_wait (val=, f=) at ./util/qemu-thread-posix.c:438
#2  qemu_event_wait (ev=ev@entry=0xaac86ce8 ) at 
./util/qemu-thread-posix.c:442
#3  0xaabed05c in call_rcu_thread (opaque=opaque@entry=0x0) at 
./util/rcu.c:261
#4  0xaabd34c8 in qemu_thread_start (args=) at 
./util/qemu-thread-posix.c:498
#5  0xbf25c880 in start_thread (arg=0xf5bf) at 
pthread_create.c:486
#6  0xbf1b6b9c in thread_start () at 
../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 2 to be stuck at "futex()" kernel syscall (like the FUTEX_WAKE
never happened and/or wasn't atomic for this arch/binary). Need to
investigate this also.

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-09-10 Thread Rafael David Tinoco

** Summary changed:

- qemu-img hangs on high core count ARM system
+ qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

** Changed in: qemu
   Status: Confirmed => In Progress

** Changed in: qemu
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-10 Thread Rafael David Tinoco

Alright, here is what is happening:

Whenever program is stuck, thread #2 backtrace is this:

(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0xaabd41b0 in qemu_futex_wait (val=, f=) at ./util/qemu-thread-posix.c:438
#2  qemu_event_wait (ev=ev@entry=0xaac87ce8 ) at 
./util/qemu-thread-posix.c:442
#3  0xaabee03c in call_rcu_thread (opaque=opaque@entry=0x0) at 
./util/rcu.c:261
#4  0xaabd34c8 in qemu_thread_start (args=) at 
./util/qemu-thread-posix.c:498
#5  0xbf26a880 in start_thread (arg=0xf5bf) at 
pthread_create.c:486
#6  0xbf1c4b9c in thread_start () at 
../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Meaning that code is waiting for a futex inside kernel.

(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}

The QemuEvent "rcu_call_ready_event->value" is set to UINT_MAX and I
don't know why yet.

rcu_call_ready_event->value is only touched by:

qemu_event_init() -> bool init ? EV_SET : EV_FREE
qemu_event_reset() -> atomic_or(&ev->value, EV_FREE)
qemu_event_set() -> atomic_xchg(&ev->value, EV_SET)
qemu_event_wait() -> atomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY)

And there should be no 0x7fff value for "ev->value".
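
To keep the transitions straight, here is a simplified single-word model of
the four operations listed above, written with C11 atomics. It is only a
sketch under my reading of that list: the model_* names are hypothetical, the
futex calls are left out, and the real QEMU functions do more than this:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define EV_SET   0u
#define EV_FREE  1u
#define EV_BUSY  ((unsigned)-1)

struct model_event { atomic_uint value; };

static void model_init(struct model_event *ev, bool set)
{
    atomic_init(&ev->value, set ? EV_SET : EV_FREE);
}

/* reset: SET -> FREE. FREE and BUSY are left unchanged by the OR. */
static void model_reset(struct model_event *ev)
{
    if (atomic_load(&ev->value) == EV_SET) {
        atomic_fetch_or(&ev->value, EV_FREE);
    }
}

/* set: any state -> SET. Returns true if a waiter had announced itself
 * (old state was BUSY) and would therefore need a futex wake. */
static bool model_set(struct model_event *ev)
{
    return atomic_exchange(&ev->value, EV_SET) == EV_BUSY;
}

/* wait, first half only: FREE -> BUSY. Returns true if the caller would
 * go on to block in the futex, false if the event was already set. */
static bool model_wait_would_block(struct model_event *ev)
{
    unsigned expected = EV_FREE;
    if (atomic_compare_exchange_strong(&ev->value, &expected, EV_BUSY)) {
        return true;
    }
    /* A failed strong CAS means the observed value was not EV_FREE. */
    assert(expected != EV_FREE);
    return expected == EV_BUSY;
}
```

In this model the only values the word can ever hold are 0, 1 and
(unsigned)-1, so seeing 4294967295 in gdb corresponds to the BUSY state: a
waiter has announced itself and is expected to be sleeping in the futex.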

qemu_event_init() is the one initializing the global:

static QemuEvent rcu_call_ready_event;

and it is called by "rcu_init_complete()" which is called by
"rcu_init()":

static void __attribute__((__constructor__)) rcu_init(void)

a constructor function.

So, "fixing" this issue by:

(gdb) print rcu_call_ready_event
$8 = {value = 4294967295, initialized = true}

(gdb) watch rcu_call_ready_event
Hardware watchpoint 1: rcu_call_ready_event

(gdb) set rcu_call_ready_event.initialized = 1

(gdb) set rcu_call_ready_event.value = 0

and note that I added a watchpoint to rcu_call_ready_event global:



Thread 1 "qemu-img" received signal SIGINT, Interrupt.
(gdb) thread 2
[Switching to thread 2 (Thread 0xbec61d90 (LWP 33625))]

(gdb) bt
#0  0xaabd4110 in qemu_event_reset (ev=ev@entry=0xaac87ce8 
)
#1  0xaabedff8 in call_rcu_thread (opaque=opaque@entry=0x0) at 
./util/rcu.c:255
#2  0xaabd34c8 in qemu_thread_start (args=) at 
./util/qemu-thread-posix.c:498
#3  0xbf26a880 in start_thread (arg=0xf5bf) at 
pthread_create.c:486
#4  0xbf1c4b9c in thread_start () at 
../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) print rcu_call_ready_event
$9 = {value = 0, initialized = true}

You can see I advanced in the qemu_event_{reset,set,wait} logic.

(gdb) disassemble /m 0xaabd4110
Dump of assembler code for function qemu_event_reset:
408 in ./util/qemu-thread-posix.c

409 in ./util/qemu-thread-posix.c

410 in ./util/qemu-thread-posix.c
411 in ./util/qemu-thread-posix.c
   0xaabd40f0 <+0>: ldrb w1, [x0, #4]
   0xaabd40f4 <+4>: cbz w1, 0xaabd411c

   0xaabd411c <+44>: stp x29, x30, [sp, #-16]!
   0xaabd4120 <+48>: adrp x3, 0xaac2
   0xaabd4124 <+52>: add x3, x3, #0x908
   0xaabd4128 <+56>: mov x29, sp
   0xaabd412c <+60>: adrp x1, 0xaac2
   0xaabd4130 <+64>: adrp x0, 0xaac2
   0xaabd4134 <+68>: add x3, x3, #0x290
   0xaabd4138 <+72>: add x1, x1, #0xc00
   0xaabd413c <+76>: add x0, x0, #0xd40
   0xaabd4140 <+80>: mov w2, #0x19b // #411
   0xaabd4144 <+84>: bl 0xaaaff190 <__assert_fail@plt>

412 in ./util/qemu-thread-posix.c
   0xaabd40f8 <+8>: ldr w1, [x0]

413 in ./util/qemu-thread-posix.c
   0xaabd40fc <+12>: dmb ishld

414 in ./util/qemu-thread-posix.c
   0xaabd4100 <+16>: cbz w1, 0xaabd4108

   0xaabd4104 <+20>: ret
   0xaabd4108 <+24>: ldaxr w1, [x0]
   0xaabd410c <+28>: orr w1, w1, #0x1
=> 0xaabd4110 <+32>: stlxr w2, w1, [x0]
   0xaabd4114 <+36>: cbnz w2, 0xaabd4108

   0xaabd4118 <+40>: ret

And I'm currently inside the LDAXR/STLXR logic. To make sure my program
counter is advancing, I added a breakpoint at 0xaabd4108: the CBNZ
instruction branches back to the LDAXR instruction until the
LDAXR<->STLXR pair succeeds (inside qemu_event_reset()).

(gdb) break *(0xaabd4108)
Breakpoint 2 at 0xaabd4108: file ./util/qemu-thread-posix.c, line 414.

which is basically this:

if (value == EV_SET) {                 /* EV_SET == 0 */
    atomic_or(&ev->value, EV_FREE);    /* EV_FREE == 1 */
}

and we can see this logic being called one time after another:

(gdb) c
Thread 2 "qemu-img" hit Breakpoint 3, 0xaabd4108 in qemu_event_reset (

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-09 Thread Rafael David Tinoco

Alright,

I'm still investigating this but wanted to share some findings... I
haven't got a kernel dump yet after the task is frozen, I have analyzed
only the userland part of it (although I have checked if code was
running inside kernel with perf cycles:u/cycles:k at some point).

The big picture is this: Whenever qemu-img hangs, we have 3 hung tasks
basically with these stacks:



THREAD #1
__GI_ppoll (../sysdeps/unix/sysv/linux/ppoll.c:39)
ppoll (/usr/include/aarch64-linux-gnu/bits/poll2.h:77)
qemu_poll_ns (./util/qemu-timer.c:322)
os_host_main_loop_wait (./util/main-loop.c:233)
main_loop_wait (./util/main-loop.c:497)
convert_do_copy (./qemu-img.c:1981)
img_convert (./qemu-img.c:2457)
main (./qemu-img.c:4976)

got stack traces:

./33293/stack                      ./33293/stack
[<0>] __switch_to+0xc0/0x218       [<0>] __switch_to+0xc0/0x218
[<0>] ptrace_stop+0x148/0x2b0      [<0>] do_sys_poll+0x508/0x5c0
[<0>] get_signal+0x5a4/0x730       [<0>] __arm64_sys_ppoll+0xc0/0x118
[<0>] do_notify_resume+0x158/0x358 [<0>] el0_svc_common+0xa0/0x168
[<0>] work_pending+0x8/0x10        [<0>] el0_svc_handler+0x38/0x78
                                   [<0>] el0_svc+0x8/0xc

root@d06-1:~$ perf record -F  -e cycles:u -p 33293 -- sleep 10
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 1.871 MB perf.data (48730 samples) ]

root@d06-1:~$ perf report --stdio
# Overhead  Command   Shared Object   Symbol
#     ..  ..
#
37.82%  qemu-img  libc-2.29.so[.] 0x000df710
21.81%  qemu-img  [unknown]   [k] 0x10099504
14.23%  qemu-img  [unknown]   [k] 0x10085dc0
 9.13%  qemu-img  [unknown]   [k] 0x1008fff8
 6.47%  qemu-img  libc-2.29.so[.] 0x000df708
 5.69%  qemu-img  qemu-img[.] qemu_event_reset
 2.57%  qemu-img  libc-2.29.so[.] 0x000df678
 0.63%  qemu-img  libc-2.29.so[.] 0x000df700
 0.49%  qemu-img  libc-2.29.so[.] __sigtimedwait
 0.42%  qemu-img  libpthread-2.29.so  [.] __libc_sigwait



THREAD #3
__GI___sigtimedwait (../sysdeps/unix/sysv/linux/sigtimedwait.c:29)
__sigwait (linux/sigwait.c:28)
qemu_thread_start (./util/qemu-thread-posix.c:498)
start_thread (pthread_create.c:486)
thread_start (linux/aarch64/clone.S:78)


./33303/stack                      ./33303/stack
[<0>] __switch_to+0xc0/0x218       [<0>] __switch_to+0xc0/0x218
[<0>] ptrace_stop+0x148/0x2b0      [<0>] do_sigtimedwait.isra.9+0x194/0x288
[<0>] get_signal+0x5a4/0x730       [<0>] __arm64_sys_rt_sigtimedwait+0xac/0x110
[<0>] do_notify_resume+0x158/0x358 [<0>] el0_svc_common+0xa0/0x168
[<0>] work_pending+0x8/0x10        [<0>] el0_svc_handler+0x38/0x78
                                   [<0>] el0_svc+0x8/0xc

root@d06-1:~$ perf record -F  -e cycles:u -p 33303 -- sleep 10
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 1.905 MB perf.data (49647 samples) ]

root@d06-1:~$ perf report --stdio
# Overhead  Command   Shared Object   Symbol
#     ..  ..
#
45.37%  qemu-img  libc-2.29.so[.] 0x000df710
23.52%  qemu-img  [unknown]   [k] 0x10099504
 9.08%  qemu-img  [unknown]   [k] 0x1008fff8
 8.89%  qemu-img  [unknown]   [k] 0x10085dc0
 5.56%  qemu-img  libc-2.29.so[.] 0x000df708
 3.66%  qemu-img  libc-2.29.so[.] 0x000df678
 1.01%  qemu-img  libc-2.29.so[.] __sigtimedwait
 0.80%  qemu-img  libc-2.29.so[.] 0x000df700
 0.64%  qemu-img  qemu-img[.] qemu_event_reset
 0.55%  qemu-img  libc-2.29.so[.] 0x000df718
 0.52%  qemu-img  libpthread-2.29.so  [.] __libc_sigwait



THREAD #2
syscall (linux/aarch64/syscall.S:38)
qemu_futex_wait (./util/qemu-thread-posix.c:438)
qemu_event_wait (./util/qemu-thread-posix.c:442)
call_rcu_thread (./util/rcu.c:261)
qemu_thread_start (./util/qemu-thread-posix.c:498)
start_thread (pthread_create.c:486)
thread_start (linux/aarch64/clone.S:78)

./33302/stack                      ./33302/stack
[<0>] __switch_to+0xc0/0x218       [<0>] __switch_to+0xc0/0x218
[<0>] ptrace_stop+0x148/0x2b0      [<0>] ptrace_stop+0x148/0x2b0
[<0>] get_signal+0x5a4/0x730       [<0>] get_signal+0x5a4/0x730
[<0>] do_notify_resume+0x1c4/0x358 [<0>] do_notify_resume+0x1c4/0x358
[<0>] work_pending+0x8/0x10        [<0>] work_pending+0x8/0x10



root@d06-1:~$ perf report --stdio
# Overhead  Command   Shared Object   Symbol
#

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-09 Thread Rafael David Tinoco

Alright, with a d06 aarch64 machine I was able to reproduce it after 8
attempts. I'll debug it today and provide feedback on my findings.

(gdb) bt full
#0  0xb0b2181c in __GI_ppoll (fds=0xce5ab770, nfds=4, 
timeout=, timeout@entry=0x0,
sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
_x3tmp = 0
_x0tmp = 187650583213936
_x0 = 187650583213936
_x3 = 0
_x4tmp = 8
_x1tmp = 4
_x1 = 4
_x4 = 8
_x2tmp = 
_x2 = 0
_x8 = 73
_sys_result = 
_sys_result = 
sc_cancel_oldtype = 0
sc_ret = 
tval = {tv_sec = 0, tv_nsec = 187650583137792}
#1  0xcd2a773c in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=)
at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
No locals.
#2  qemu_poll_ns (fds=, nfds=, 
timeout=timeout@entry=-1) at ./util/qemu-timer.c:322
No locals.
#3  0xcd2a8764 in os_host_main_loop_wait (timeout=-1) at 
./util/main-loop.c:233
context = 0xce599d90
ret = 
context = 
ret = 
#4  main_loop_wait (nonblocking=) at ./util/main-loop.c:497
ret = 
timeout = 4294967295
timeout_ns = 
#5  0xcd1df454 in convert_do_copy (s=0xf9b2b1d8) at 
./qemu-img.c:1981
ret = 
i = 
n = 
sector_num = 
ret = 
i = 
n = 
sector_num = 
#6  img_convert (argc=, argv=) at 
./qemu-img.c:2457
c = 
bs_i = 
flags = 16898
src_flags = 0
fmt = 0xf9b2bad1 "qcow2"
out_fmt = 
cache = 0xcd2cb1c8 "unsafe"
src_cache = 0xcd2ca9c0 "writeback"
out_baseimg = 
out_filename = 
out_baseimg_param = 
snapshot_name = 0x0
drv = 
proto_drv = 
bdi = {cluster_size = 65536, vm_state_offset = 32212254720, is_dirty = 
false, unallocated_blocks_are_zero = true,
  needs_compressed_writes = false}
out_bs = 
opts = 0xce5ab390
sn_opts = 0x0
create_opts = 0xce5ab0c0
open_opts = 
options = 0x0
local_err = 0x0
writethrough = false
src_writethrough = false
quiet = 
image_opts = false
skip_create = false
progress = 
tgt_image_opts = false
ret = 
force_share = false
explict_min_sparse = false
s = {src = 0xce577240, src_sectors = 0xce577300, src_num = 1, 
total_sectors = 62914560,allocated_sectors = 9572096, allocated_done = 6541440, 
sector_num = 8863744, wr_offs = 8859776, status = BLK_DATA, sector_next_status 
= 8863744, target = 0xce5bd2a0, has_zero_init = true,compressed = false, 
unallocated_blocks_are_zero = true, target_has_backing = false, 
target_backing_sectors = -1, wr_in_order = true, copy_range = false, min_sparse 
= 8, alignment = 8,cluster_sectors = 128, buf_sectors = 4096, num_coroutines = 
8, running_coroutines = 8, co = {0xce5ceda0,0xce5cef50, 0xce5cf100, 
0xce5cf2b0, 0xce5cf460, 0xce5cf610, 0xce5cf7c0,0xce5cf970, 
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, wait_sector_num = {-1, 8859904, 
8860928, 8863360,8861952, 8862976, 8862592, 8861440, 0, 0, 0, 0, 0, 0, 0, 0}, 
lock = {locked = 0, ctx = 0x0, from_push = {slh_first = 0x0}, to_pop = 
{slh_first = 0x0}, handoff = 0, sequence = 0, holder = 0x0}, ret = -115}
__PRETTY_FUNCTION__ = "img_convert"
#7  0xcd1d8400 in main (argc=7, argv=) at 
./qemu-img.c:4976
cmd = 0xcd34ad78 
cmdname = 
local_error = 0x0
trace_file = 0x0
c = 
long_options = {{name = 0xcd2cbbb0 "help", has_arg = 0, flag = 0x0, 
val = 104}, {
name = 0xcd2cbc78 "version", has_arg = 0, flag = 0x0, val = 
86}, {name = 0xcd2cbc80 "trace",
has_arg = 1, flag = 0x0, val = 84}, {name = 0x0, has_arg = 0, flag 
= 0x0, val = 0}}

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-06 Thread Rafael David Tinoco

Alright, I couldn't reproduce this yet. I'm running the same test case on a
24-core box while causing lots of context switches and CPU migrations in
parallel (trying to exhaust the logic).

Will let this run for some time to check.

Unfortunately this may be related to QEMU AIO BH locking/primitives and
cache coherency in the HW in question (which I got specs from:
https://en.wikichip.org/wiki/hisilicon/kunpeng/hi1616):

l1$ size    8 MiB
l1d$ size   4 MiB
l1i$ size   4 MiB
l2$ size    32 MiB
l3$ size    64 MiB

like for example when having 2 threads in different NUMA domains, or
some other situation.

I can't simulate the same since I have a SOC with:

Cortex-A53 MPCore 24cores,

L1 I/D=32KB/32KB
L2 =256KB
L3 =4MB

and I'm not even close to L1/L2/L3 cache numbers from D06 =o).

Just got a note that I'll be able to reproduce this in the real HW, will
get back soon with real gdb debugging.

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-06 Thread Rafael David Tinoco

Oh, never mind the virtual environment test: I just remembered we don't
have nested (2nd-level) KVM for aarch64 yet (at least on ARMv8
implementing the virtualization extensions). I'll try to reproduce in
the real environment only.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on high core count ARM system

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-06 Thread Rafael David Tinoco

Hello Liz,

I'll try to reproduce this issue in a real Cortex-A53 aarch64
environment (w/ 24 HW threads) AND in a virtual environment w/ lots of
vCPUs... but if it is a missing barrier, or a lack of atomicity and/or
ordering in a primitive, then I'm afraid the context switching between
vCPUs might not behave the same as on real CPUs (IPIs are sent and
handled differently, and the host kernel delays IPI delivery because of
its own callbacks, before scheduling, etc.), and I might need a qemu
dump from your environment.

Would that be feasible? Can you still reproduce this nowadays? This bug
has aged a little, so I'm not sure!

Could you provide me with a dump produced with the latest package
available for your Ubuntu version? That way I have the debug symbols to
work with.

Meanwhile, I'll keep trying to reproduce on my side.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on high core count ARM system

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions

[Qemu-devel] [Bug 1805256] Re: qemu-img hangs on high core count ARM system

2019-09-05 Thread Rafael David Tinoco

** Changed in: qemu (Ubuntu)
   Status: Confirmed => In Progress

** Changed in: qemu (Ubuntu)
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

** Changed in: qemu (Ubuntu)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on high core count ARM system

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1805256/+subscriptions

[Qemu-devel] [Bug 1834113] Re: QEMU touchpad input erratic after wakeup from sleep

2019-08-19 Thread Rafael David Tinoco

Avi,

Something I realized we are missing here (or maybe I missed it when
checking previous comments) is how your mouse is being set up for the
guest. Is it being emulated as PS/2 (the default), or is it being given
as a USB device (when the qemu cmd line has "-usb -device usb-tablet")?
Also, are you using the SPICE protocol (perhaps with the USB redirection
option)?

Are you able to tell which xserver-xorg-input-XX module is being used
inside the guest? You will probably find that information in the Xorg
log files (check whether you're using xf86-input-wacom,
xserver-xorg-input-evdev or some other).
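
One way to pull that information out of an Xorg log is sketched below.
It assumes the log contains lines of the form
"Using input driver '<driver>' for '<device>'"; that line format, the
helper name input_drivers and the log location are assumptions, so
adjust them to what the guest actually logs.

```python
import re

# Assumed Xorg log line: (II) Using input driver 'libinput' for 'Some Device'
_DRIVER = re.compile(r"Using input driver '([^']+)' for '([^']+)'")


def input_drivers(log_text: str) -> dict:
    """Map each input device name found in the log to its Xorg driver."""
    return {device: driver for driver, device in _DRIVER.findall(log_text)}
```

The text passed in would typically be the contents of the guest's
Xorg.0.log file.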

Another thing that comes to mind: are you using power-saving features?
I'm specifically concerned about the I2C bus. Using "powertop", you can
change the "Runtime PM for I2C Adapter" option under the Tunables tab
(turning power management off). I would like to know whether you can
reproduce the issue without power management enabled for I2C. You can
try disabling only I2C first and then, as a second attempt, disabling
all PM options.
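
To double-check what powertop changed, the runtime PM state can also be
read from sysfs. The sketch below assumes the I2C adapters show up as
/sys/bus/i2c/devices/i2c-N/device/power/control; that layout and the
helper name runtime_pm_states are assumptions. In the kernel's runtime
PM interface, "auto" means runtime power management is enabled and "on"
means the device is kept powered.

```python
from pathlib import Path


def runtime_pm_states(root="/sys/bus/i2c/devices"):
    """Map each I2C adapter under `root` to its runtime-PM control value.

    The i2c-*/device/power/control layout is an assumption about sysfs.
    """
    states = {}
    for control in sorted(Path(root).glob("i2c-*/device/power/control")):
        adapter = control.parents[2].name  # the i2c-N directory
        states[adapter] = control.read_text().strip()
    return states
```

If every adapter reports "on", runtime PM is off for I2C and the test
can be repeated in that state.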

From your host:

Device #1

[2.834320] input: WCOM488E:00 056A:488E Mouse as /devices/pci:00/:00:15.0/i2c_designware.0/i2c-1/i2c-WCOM488E:00/0018:056A:488E.0001/input/input12

[3.064686] input: Wacom HID 488E Finger as /devices/pci:00/:00:15.0/i2c_designware.0/i2c-1/i2c-WCOM488E:00/0018:056A:488E.0001/input/input17

Device #2

[2.834860] input: SYNA2393:00 06CB:7A13 Mouse as /devices/pci:00/:00:15.1/i2c_designware.1/i2c-6/i2c-SYNA2393:00/0018:06CB:7A13.0002/input/input13

[2.834929] input: SYNA2393:00 06CB:7A13 Touchpad as /devices/pci:00/:00:15.1/i2c_designware.1/i2c-6/i2c-SYNA2393:00/0018:06CB:7A13.0002/input/input14

Could you describe your input devices? How many mice, trackpads, pens,
etc. do you have connected to the host?

Thanks! And sorry for so many questions =).

--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1834113

Title:
QEMU touchpad input erratic after wakeup from sleep

Status in QEMU:
Incomplete
Status in libvirt package in Ubuntu:
Incomplete
Status in qemu package in Ubuntu:
Incomplete

Bug description:
Using Ubuntu host and guest. Normally the touchpad works great. Within
the last few days, suddenly, apparently after a wake from sleep, the
touchpad will behave erratically. For example, it will take two clicks
to select something, and when moving the cursor it will act as though
it is dragging even with the button not clicked.

A reboot fixes the issue temporarily.

ProblemType: Bug
DistroRelease: Ubuntu 19.04
Package: qemu 1:3.1+dfsg-2ubuntu3.1
Uname: Linux 5.1.14-050114-generic x86_64
ApportVersion: 2.20.10-0ubuntu27
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Mon Jun 24 20:55:44 2019
Dependencies:

EcryptfsInUse: Yes
InstallationDate: Installed on 2019-02-20 (124 days ago)
InstallationMedia: Ubuntu 18.04 "Bionic" - Build amd64 LIVE Binary
20180608-09:38
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 8087:0025 Intel Corp.
Bus 001 Device 003: ID 0c45:671d Microdia
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Inc. Precision 5530
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.1.14-050114-generic
root=UUID=18e8777c-1764-41e4-a19f-62476055de23 ro mem_sleep_default=deep
mem_sleep_default=deep acpi_rev_override=1 scsi_mod.use_blk_mq=1
nouveau.modeset=0 nouveau.runpm=0 nouveau.blacklist=1 acpi_backlight=none
acpi_osi=Linux acpi_osi=!
SourcePackage: qemu
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/26/2019
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.10.1
dmi.board.name: 0FP2W2
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 10
dmi.chassis.vendor: Dell Inc.
dmi.modalias:
dmi:bvnDellInc.:bvr1.10.1:bd04/26/2019:svnDellInc.:pnPrecision5530:pvr:rvnDellInc.:rn0FP2W2:rvrA00:cvnDellInc.:ct10:cvr:
dmi.product.family: Precision
dmi.product.name: Precision 5530
dmi.product.sku: 087D
dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1834113/+subscriptions

[Qemu-devel] [Bug 1830821] Re: Expose ARCH_CAP_MDS_NO in guest

2019-08-04 Thread Rafael David Tinoco

*** This bug is a duplicate of bug 1828495 ***
https://bugs.launchpad.net/bugs/1828495

The commit mentioned in:

https://bugs.launchpad.net/intel/+bug/1828495/comments/42

addresses exactly this bug.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1830821

Title:
  Expose ARCH_CAP_MDS_NO in guest

Status in intel:
  New
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Cosmic:
  Confirmed
Status in qemu source package in Disco:
  Confirmed

Bug description:
  Description:

  MDS_NO is bit 5 of ARCH_CAPABILITIES. Expose this bit to guest.

  Target Qemu: 4.1
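
  As a small illustration of the bit in question: given a raw
  ARCH_CAPABILITIES value, checking MDS_NO is a single mask test. The
  helper name has_mds_no is hypothetical; only "bit 5" comes from the
  description above.

```python
# MDS_NO is bit 5 of ARCH_CAPABILITIES, as stated in the bug description.
ARCH_CAP_MDS_NO = 1 << 5


def has_mds_no(arch_capabilities: int) -> bool:
    """Return True if the MDS_NO bit is set in an ARCH_CAPABILITIES value."""
    return bool(arch_capabilities & ARCH_CAP_MDS_NO)
```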

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1830821/+subscriptions

[Qemu-devel] [Bug 1830821] Re: Expose ARCH_CAP_MDS_NO in guest

2019-06-13 Thread Rafael David Tinoco

*** This bug is a duplicate of bug 1828495 ***
https://bugs.launchpad.net/bugs/1828495

I'm marking this bug as a duplicate of LP: #1828495 since both ask for
mitigation pass-through to i386 KVM guests and I'm preparing a fix for
both simultaneously.

** This bug has been marked a duplicate of bug 1828495
   [KVM][CLX] CPUID_7_0_EDX_ARCH_CAPABILITIES is not enabled in VM.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1830821

Title:
  Expose ARCH_CAP_MDS_NO in guest

Status in intel:
  New
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Cosmic:
  Confirmed
Status in qemu source package in Disco:
  Confirmed

Bug description:
  Description:

  MDS_NO is bit 5 of ARCH_CAPABILITIES. Expose this bit to guest.

  Target Qemu: 4.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1830821/+subscriptions

[Qemu-devel] [Bug 1830821] Re: Expose ARCH_CAP_MDS_NO in guest

2019-06-11 Thread Rafael David Tinoco

** Changed in: qemu (Ubuntu Disco)
   Status: Fix Released => Confirmed

** Changed in: qemu (Ubuntu Disco)
   Importance: Undecided => Wishlist

** Changed in: qemu (Ubuntu)
   Status: Fix Released => Confirmed

** Changed in: qemu (Ubuntu)
   Importance: Undecided => Wishlist

** Changed in: qemu (Ubuntu)
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1830821

Title:
  Expose ARCH_CAP_MDS_NO in guest

Status in intel:
  New
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Cosmic:
  Confirmed
Status in qemu source package in Disco:
  Confirmed

Bug description:
  Description:

  MDS_NO is bit 5 of ARCH_CAPABILITIES. Expose this bit to guest.

  Target Qemu: 4.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1830821/+subscriptions

[Qemu-devel] [Bug 1830821] Re: Expose ARCH_CAP_MDS_NO in guest

2019-06-11 Thread Rafael David Tinoco

This effort, if done, would be done together with:

https://bugs.launchpad.net/intel/+bug/1828495

Please read comments:

https://bugs.launchpad.net/intel/+bug/1828495/comments/8

and

https://bugs.launchpad.net/intel/+bug/1828495/comments/10

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1830821

Title:
  Expose ARCH_CAP_MDS_NO in guest

Status in intel:
  New
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Cosmic:
  Confirmed
Status in qemu source package in Disco:
  Fix Released

Bug description:
  Description:

  MDS_NO is bit 5 of ARCH_CAPABILITIES. Expose this bit to guest.

  Target Qemu: 4.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1830821/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2017-01-24 Thread Rafael David Tinoco

Thanks Christian! Will do!!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Released
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Live migration with the updated QEMU (from UCA) doesn't work with 3.13 kernels.
   * QEMU code wrongly checks whether it can create /tmp/memfd-XXX files.
   * Apparmor will block access to /tmp/ and QEMU will fail to migrate.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) hosts + UCA Mitaka + apparmor rules.
   * Try to live-migrate a guest from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine created on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism for creating shared
  memory for live migrations. On 3.13 kernels, which don't have the
  memfd_create() system call, this fallback mechanism tries to create
  files in the /tmp/ directory and fails, causing live migration not
  to work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is in:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd)
  {
      if (memfd_create)... ### only works with HWE kernels

      else ### 3.13 kernels, gets blocked by apparmor
          tmpdir = g_get_tmp_dir
          ...
          mfd = mkstemp(fname)
  }
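
  The fallback pattern outlined above can be illustrated with a short
  sketch. This is not QEMU's code: the function name alloc_shared_fd
  and the file prefix are made up for the example, and it only shows
  the "try memfd_create(), else create a temporary file" idea. The
  temporary-file branch is the step that a confining apparmor profile
  can deny.

```python
import os
import tempfile


def alloc_shared_fd(name: str, size: int) -> int:
    """Return a file descriptor backed by `size` bytes of shareable memory.

    Prefer memfd_create(); fall back to an unlinked temporary file when
    the system call is unavailable (for example on an old kernel).
    """
    fd = -1
    if hasattr(os, "memfd_create"):
        try:
            fd = os.memfd_create(name)
        except OSError:
            fd = -1  # e.g. ENOSYS when the kernel lacks the system call
    if fd < 0:
        # Fallback path: needs permission to create files in the tmp dir.
        fd, path = tempfile.mkstemp(prefix="memfd-")
        os.unlink(path)
    os.ftruncate(fd, size)
    return fd
```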

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: unable to execute QEMU command 'migrate': Migration disabled: failed to allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 audit(1471289780.407:76): apparmor="DENIED" operation="mknod" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 audit(1471289780.411:77): apparmor="DENIED" operation="open" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 audit(1471289780.411:78): apparmor="DENIED" operation="open" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
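
  For anyone sifting through many such audit records, the DENIED
  entries can be picked out mechanically. The following is a minimal
  sketch; the helper names parse_audit and denied_paths are
  hypothetical, and it assumes each record sits on a single line with
  key=value or key="value" fields as in the log above.

```python
import re

# Matches key=value and key="quoted value" fields in an audit record.
_PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit(line: str) -> dict:
    """Extract the key=value fields of one kernel audit line into a dict."""
    return {key: value.strip('"') for key, value in _PAIR.findall(line)}


def denied_paths(lines):
    """Return (operation, name) for every apparmor DENIED record."""
    denied = []
    for line in lines:
        fields = parse_audit(line)
        if fields.get("apparmor") == "DENIED":
            denied.append((fields.get("operation"), fields.get("name")))
    return denied
```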

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes, the live migration works as
  expected, so, clearly, apparmor is stepping into

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2017-01-23 Thread Rafael David Tinoco

As far as I'm concerned, we have had enough testing already: upstream
development/tests, Zesty and Yakkety. Christian, could you please move
Xenial forward for me? I have some end users waiting for this. Thank
you very much.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Released
Status in qemu source package in Zesty:
  Fix Released

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2017-01-11 Thread Rafael David Tinoco

Yakkety verification (with the 3.13 kernel from Trusty, since a kernel
older than 3.17, i.e. one without memfd_create(), is needed). This
verifies that the Ubuntu Cloud Archive repositories will be all right
with these new packages (from Xenial / Yakkety).

## CURRENT

inaddy@(ykvm01):~$ apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1:2.6.1+dfsg-0ubuntu5.1
  Candidate: 1:2.6.1+dfsg-0ubuntu5.1

ykvm01 (sender):

Jan 11 11:34:35 ykvm01 kernel: type=1400 audit(1484141675.962:53): apparmor="DENIED" operation="mknod" profile="libvirt-7cdcb6c0-f85e-4639-912b-c785bd5992d9" name="/tmp/memfd-bF8new" pid=1934 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=111 ouid=111

inaddy@(ykvm01):~$ sudo virsh migrate --live guest qemu+ssh://ykvm02/system
error: internal error: unable to execute QEMU command 'migrate': Migration disabled: failed to allocate shared memory

ykvm02 (receiver):

Jan 11 11:39:31 ykvm02 kernel: type=1400 audit(1484141971.526:53): apparmor="DENIED" operation="mknod" profile="libvirt-7cdcb6c0-f85e-4639-912b-c785bd5992d9" name="/tmp/memfd-JZ6L9T" pid=2177 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=111 ouid=111

Note: the check was being done in the wrong place AND in the wrong
situation, as I showed in this bug.


## PROPOSED

inaddy@(ykvm01):~$ apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1:2.6.1+dfsg-0ubuntu5.2
  Candidate: 1:2.6.1+dfsg-0ubuntu5.2

ykvm01 (sender):



ykvm02 (receiver):

inaddy@(ykvm02):~$ virsh list
 Id    Name     State
------------------------
 1     guest    running



It's all good.

verification-yakkety-done

** Tags removed: verification-needed
** Tags added: verification-done

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Committed
Status in qemu source package in Zesty:
  Fix Released

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2017-01-10 Thread Rafael David Tinoco

Xenial verification (with the 3.13 kernel from Trusty, since a kernel
older than 3.17, i.e. one without memfd_create(), is needed). This
verifies that the Ubuntu Cloud Archive repositories will be all right
with these new packages (from Xenial / Yakkety).

## CURRENT

inaddy@(xkvm01):~$ apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1:2.5+dfsg-5ubuntu10.6
  Candidate: 1:2.5+dfsg-5ubuntu10.6

xkvm01 (sender):

Jan 11 01:07:54 xkvm01 kernel: type=1400 audit(1484104074.014:13):
apparmor="DENIED" operation="mknod" profile="libvirt-7cdcb6c0-f85e-4639
-912b-c785bd5992d9" name="/tmp/memfd-Jh5UhR" pid=2535 comm="qemu-
system-x86" requested_mask="c" denied_mask="c" fsuid=112 ouid=112

$ sudo virsh migrate --live guest qemu+ssh://xkvm02/system
error: internal error: unable to execute QEMU command 'migrate': Migration 
disabled: failed to allocate shared memory

xkvm02 (receiver):

Jan 11 01:08:23 xkvm02 kernel: type=1400 audit(1484104103.888:53):
apparmor="DENIED" operation="mknod"
profile="libvirt-7cdcb6c0-f85e-4639-912b-c785bd5992d9"
name="/tmp/memfd-fc9rij" pid=2000 comm="qemu-system-x86"
requested_mask="c" denied_mask="c" fsuid=112 ouid=112
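
For anyone repeating this verification, denials like the two above can
be picked out of a kernel log mechanically. The following is only a
sketch: the function names are invented here, it expects each audit
record to be unwrapped onto a single line first, and matching on
"/memfd-" in the file name is an assumption based on the paths seen in
this bug.

```python
import re

# key=value pairs; the value is either double-quoted or a bare token.
_PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_audit_fields(line: str) -> dict:
    """Extract key=value fields from one (already unwrapped) audit line."""
    return {key: quoted or bare for key, quoted, bare in _PAIR.findall(line)}


def is_memfd_denial(line: str) -> bool:
    """True if the line is an AppArmor DENIED record for a memfd-* file."""
    fields = parse_audit_fields(line)
    return (fields.get("apparmor") == "DENIED"
            and "/memfd-" in fields.get("name", ""))


# The sender-side record from this comment, joined onto one line.
SAMPLE = (
    'Jan 11 01:07:54 xkvm01 kernel: type=1400 audit(1484104074.014:13): '
    'apparmor="DENIED" operation="mknod" '
    'profile="libvirt-7cdcb6c0-f85e-4639-912b-c785bd5992d9" '
    'name="/tmp/memfd-Jh5UhR" pid=2535 comm="qemu-system-x86" '
    'requested_mask="c" denied_mask="c" fsuid=112 ouid=112'
)
```

With the proposed package installed, no such record should show up for
the guest's profile during migration.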

OBS: The check was being done in the wrong place AND situation, like I
showed in this bug.

## PROPOSED


inaddy@(xkvm01):~$ apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1:2.5+dfsg-5ubuntu10.7
  Candidate: 1:2.5+dfsg-5ubuntu10.7

xkvm01 (sender):



xkvm02 (receiver):

inaddy@(xkvm02):~$ virsh list
 Id    Name                           State
----------------------------------------------------
 1     guest                          running



It's all good.

verification-xenial-done

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Committed
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live migrate from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }
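
  The two branches in the outline above can be illustrated with a small
  standalone sketch. This is not QEMU's code: it is a Python analogue
  under stated assumptions, the name alloc_shared_fd is invented, and
  it leaves out the sealing and mmap steps. It only shows the shape of
  "try memfd_create(), otherwise create and unlink a temporary file",
  and the second branch is the one an apparmor profile without /tmp
  access would deny.

```python
import os
import tempfile
from typing import Tuple


def alloc_shared_fd(name: str, size: int) -> Tuple[int, str]:
    """Return (fd, how), where how is "memfd" or "tmpfile"."""
    # os.memfd_create only exists on Linux builds of Python 3.8+.
    memfd_create = getattr(os, "memfd_create", None)
    fd = -1
    if memfd_create is not None:
        try:
            fd = memfd_create(name, os.MFD_CLOEXEC)
        except OSError:
            fd = -1  # e.g. the running kernel lacks the system call
    if fd >= 0:
        os.ftruncate(fd, size)
        return fd, "memfd"

    # Fallback: a named file in the temp directory, unlinked right away
    # so that only the open descriptor keeps it alive.
    fd, path = tempfile.mkstemp(prefix="memfd-")
    os.unlink(path)
    os.ftruncate(fd, size)
    return fd, "tmpfile"
```

  On a 3.13 kernel the first branch fails, so everything depends on
  whether the confined process may create files in the temp directory.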

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined"

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-12-08 Thread Rafael David Tinoco

@jamespage, @cpaelzer,

I'll verify this fix in a couple of days so it can be released.

Thank you!

Rafael

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Committed
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live migrate from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes,

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-12-08 Thread Rafael David Tinoco

Hello Antonio (@arcimboldo)

The fix only makes sense for newer QEMUs (>= Xenial, like the one from
Mitaka Ubuntu Cloud Archive).

OBS:

The "migration check" is done in the VHOST initialization functions,
when the devices are virtually attached to the virtual machine. If you
are using kernel 3.13 and have apparmor enabled, then all the running
instances have the "migration blocker" ON - because of this buggy
migration check - and won't be able to live migrate.

Unfortunately there is an "in-memory" linked list telling qemu that it
has a blocker (with the reason). This blocker was added during instance
startup and will be checked/used only when the instance is live-migrated.

Check this: http://pastebin.ubuntu.com/23517175/

If you started the instance in a host not running apparmor (or not
having libvirt profile loaded, for example) it won't block the creation
of /tmp/memfd-XXX files during instance initialization. That won't
trigger the "blocker flag" inside the running program and, if/when
needed, the live migration will be able to occur.

This means that, after installing the new package, if you're using
apparmor, yes, you would have to RESTART running instances that were
affected by this bug in order to live migrate them. Sorry for the bad
news!  Even if you remove the apparmor rules, the migration blocker is
already set.
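
To make the "blocker is set at startup, consulted at migration time"
behaviour concrete, here is a toy model. It is not QEMU code: the class
and method names are invented for illustration, and only the error
string is taken from this bug. The point it shows is that fixing the
environment after startup does not clear a blocker that was already
recorded, while a freshly started instance never gets one.

```python
from typing import Callable, List


class Instance:
    """Toy stand-in for a running guest process."""

    def __init__(self) -> None:
        # Filled in once, during startup; never re-evaluated afterwards.
        self.migration_blockers: List[str] = []

    def start(self, can_alloc_shared_memory: Callable[[], bool]) -> None:
        # The (buggy) check runs while devices are being set up.
        if not can_alloc_shared_memory():
            self.migration_blockers.append(
                "Migration disabled: failed to allocate shared memory"
            )

    def migrate(self) -> str:
        # Only here is the list consulted.
        if self.migration_blockers:
            raise RuntimeError(self.migration_blockers[0])
        return "migrated"
```

An instance started while the check was failing keeps raising on
migrate() until it is restarted, which mirrors the restart requirement
described above.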

Hacking your process virtual memory would jeopardize the contents of the
virtual memory (could be catastrophic, especially for a virtual machine).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Committed
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live migrate from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

** Patch added: "zesty_qemu_2.6.1+dfsg-0ubuntu7.debdiff"
   
https://bugs.launchpad.net/qemu/+bug/1626972/+attachment/4781485/+files/zesty_qemu_2.6.1+dfsg-0ubuntu7.debdiff

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live migrate from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes,

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

** Description changed:

- And, when libvirt starts using apparmor, and creating apparmor profiles
- for every virtual machine created in the compute nodes, mitaka qemu (2.5
- - and upstream also) uses a fallback mechanism for creating shared
- memory for live-migrations. This fall back mechanism, on kernels 3.13 -
- that don't have memfd_create() system-call, try to create files on /tmp/
+ [Impact]
+ 
+  * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
+  * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
+  * Apparmor will block access to /tmp/ and QEMU will fail migrating.
+ 
+ [Test Case]
+ 
+  * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
+  * Try to live-migration from one to another. 
+  * Apparmor will block creation of /tmp/memfd-XXX files.
+ 
+ [Regression Potential]
+ 
+  Pros:
+  * Exhaustively tested this.
+  * Worked with upstream on this fix. 
+  * I'm implementing new vhost log mechanism for upstream.
+  * One line change to a blocker that is already broken.
+ 
+  Cons:
+  * To break live migration in other circumstances. 
+ 
+ [Other Info]
+ 
+  * Christian Ehrhardt has been following this.
+ 
+ ORIGINAL DESCRIPTION:
+ 
+ When libvirt starts using apparmor, and creating apparmor profiles for
+ every virtual machine created in the compute nodes, mitaka qemu (2.5 -
+ and upstream also) uses a fallback mechanism for creating shared memory
+ for live-migrations. This fall back mechanism, on kernels 3.13 - that
+ don't have memfd_create() system-call, try to create files on /tmp/
  directory and fails.. causing live-migration not to work.
  
  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.
  
  From qemu 2.5, logic is on :
  
  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
- if (memfd_create)... ### only works with HWE kernels
+ if (memfd_create)... ### only works with HWE kernels
  
- else ### 3.13 kernels, gets blocked by apparmor
-tmpdir = g_get_tmp_dir
-...
-mfd = mkstemp(fname)
+ else ### 3.13 kernels, gets blocked by apparmor
+    tmpdir = g_get_tmp_dir
+    ...
+    mfd = mkstemp(fname)
  }
  
  And you can see the errors:
  
  From the host trying to send the virtual machine:
  
  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory
  
  From the host trying to receive the virtual machine:
  
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0
  
  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes, the live migration works as expected,
  so, clearly, apparmor is stepping into the live

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

Right now Zesty is behind Yakkety because of a security update. I'm not
sure whether you need me to attach a debdiff for Zesty as well. Let me
know.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live migrate from one to the other.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes, the live migration works as

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

Took some more time here because of LP: #1621269.

** Patch added: "yakkety_qemu_2.6.1+dfsg-0ubuntu5.2.debdiff"
   
https://bugs.launchpad.net/qemu/+bug/1626972/+attachment/4781464/+files/yakkety_qemu_2.6.1+dfsg-0ubuntu5.2.debdiff

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

Bug description:
  When libvirt uses apparmor and creates an apparmor profile for every
  virtual machine started on the compute nodes, mitaka qemu (2.5, and
  upstream as well) uses a fallback mechanism to create shared memory
  for live migrations. On 3.13 kernels, which lack the memfd_create()
  system call, this fallback mechanism tries to create files in the
  /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
 tmpdir = g_get_tmp_dir
 ...
 mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When libvirt runs without apparmor capabilities (and so does not
  confine virtual machines on compute nodes), live migration works as
  expected, so apparmor is clearly interfering with the live migration.
  I'm sure that virtual machines have to be confined and that this isn't
  the desired behaviour...
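
A minimal C sketch of the fallback described in the bug description, for
illustration only. It is not QEMU's qemu_memfd_alloc(): shared_mem_alloc()
and its used_fallback flag are hypothetical names, TMPDIR is an assumed
stand-in for g_get_tmp_dir(), and sealing is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): map `size` bytes of shareable memory,
 * store the backing descriptor in *fd and report in *used_fallback which path
 * produced it. Returns NULL on failure.
 */
static void *shared_mem_alloc(const char *name, size_t size, int *fd,
                              int *used_fallback)
{
    int mfd = -1;

    assert(name != NULL && fd != NULL && used_fallback != NULL);
    *used_fallback = 0;
#ifdef SYS_memfd_create
    /* Preferred path: anonymous memory, no filesystem path involved. */
    mfd = (int)syscall(SYS_memfd_create, name, 0u);
#endif
    if (mfd < 0) {
        /* Fallback: a temp file. Creating it is the operation that a
         * confined process is denied (the /tmp/memfd-XXXXXX mknod in the
         * audit log). TMPDIR is an assumed stand-in for g_get_tmp_dir(). */
        const char *tmpdir = getenv("TMPDIR");
        char path[4096];

        snprintf(path, sizeof(path), "%s/memfd-XXXXXX",
                 tmpdir ? tmpdir : "/tmp");
        mfd = mkstemp(path);
        if (mfd < 0)
            return NULL;
        unlink(path);
        *used_fallback = 1;
    }
    if (ftruncate(mfd, (off_t)size) < 0) {
        close(mfd);
        return NULL;
    }
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0);
    if (ptr == MAP_FAILED) {
        close(mfd);
        return NULL;
    }
    *fd = mfd;
    return ptr;
}
```

On a kernel with memfd_create() the flag stays 0 and no path is touched; on
a 3.13 kernel the flag would be 1, and under a per-VM apparmor profile the
mkstemp() call is the one expected to fail.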

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

Thanks Christian,

Then I'll finish this SRU first. I'll work on the vhost mmap log file
right after.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-22 Thread Rafael David Tinoco

** Patch added: "xenial_qemu_2.5+dfsg-5ubuntu10.7.debdiff"
   
https://bugs.launchpad.net/qemu/+bug/1626972/+attachment/4781425/+files/xenial_qemu_2.5+dfsg-5ubuntu10.7.debdiff

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-18 Thread Rafael David Tinoco

** Changed in: cloud-archive
   Status: New => In Progress

** Changed in: cloud-archive
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-18 Thread Rafael David Tinoco

For Ubuntu Xenial (Mitaka), Yakkety (Newton), Zesty: Commit 0d34fbabc1
fixes the issue for vhost-net kernel. Vhost-net kernel doesn't use a
shared log, so the memfd check is skipped and apparmor profiles won't
block the live migration. Customers using vhost-user might still hit
migration problems, but those are likely a small minority.

commit 0d34fbabc13891da41582b0823867dc5733fffef
Author: Rafael David Tinoco <rafael.tin...@canonical.com>
Date: Mon Oct 24 15:35:03 2016 +

vhost: migration blocker only if shared log is used

Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).

Signed-off-by: Rafael David Tinoco <rafael.tin...@canonical.com>
Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com>
Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 131f164..25bf67f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1122,7 +1122,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
         if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
-        } else if (!qemu_memfd_check()) {
+        } else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_check()) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: failed to allocate shared memory");
         }
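
The effect of the one-line change can be modelled in isolation. This is a
sketch, not the vhost.c code: struct backend_model and migration_blocker()
are hypothetical names, and the memfd_ok argument stands in for the result
of qemu_memfd_check().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two backend properties the check reads. */
struct backend_model {
    bool has_log_all;   /* backend advertises VHOST_F_LOG_ALL */
    bool log_is_shared; /* backend needs a log shared with another process */
};

/* Returns the blocker message, or NULL when migration stays allowed. */
static const char *migration_blocker(const struct backend_model *b,
                                     bool memfd_ok)
{
    assert(b != NULL);
    if (!b->has_log_all)
        return "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.";
    /* After the patch, a failing memfd check only matters for shared logs. */
    if (b->log_is_shared && !memfd_ok)
        return "Migration disabled: failed to allocate shared memory";
    return NULL;
}
```

With log_is_shared false (the vhost-net kernel case) a failing memfd check
no longer yields a blocker; with log_is_shared true (the vhost-user case)
it still does.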

I am finishing the "final" upstream fix, but it might not be suitable
for an SRU, since it adds features to qemu (and likely to libvirt) so
that the vhost log file can be passed in as an already opened file
descriptor. This will require changes in libvirt and nova-compute, but
it will finally allow the security driver to apply rules to the vhost
log file for shared logs (mostly for vhost-user drivers).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-18 Thread Rafael David Tinoco

** Changed in: qemu (Ubuntu Xenial)
   Status: New => In Progress

** Changed in: qemu (Ubuntu Yakkety)
   Status: New => In Progress

** Changed in: qemu (Ubuntu Xenial)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

** Changed in: qemu (Ubuntu Yakkety)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-18 Thread Rafael David Tinoco

** Also affects: qemu (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: qemu (Ubuntu)
   Status: New => In Progress

** Changed in: qemu (Ubuntu)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Xenial:
  In Progress
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1626972/+subscriptions

Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-11-08 Thread Rafael David Tinoco

Hello, 

> On Tue, Nov 8, 2016 at 4:49 PM Rafael David Tinoco 
> <rafael.tin...@canonical.com> wrote:
> Hello Michael, André,
> 
> Could you do a quick review before a final submission ?
> 
> http://paste.ubuntu.com/23446279/
> ...
> (André) > Could it be only a filename? This would simplify testing.
> (Michael) > When vhostlog is not specified, can we just use memfd as we did?
> 
> Michael said: 
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg08197.html
> I think that the best approach is to allow passing in the fd, not the file 
> path. If not passed, use memfd.

Missed this one.

> I do agree :)

Sounds good. I see that the new approach is to let the management library
create the files and just pass in the file descriptors; this way security
rules are applied to the library itself and not to the qemu processes.

> Do we really need to give a path? (pass fd with -add-fd/qmp add-fd)

I guess not. So, for shared logs:

- vhostlogfd has to be provided.
- if vhostlogfd is not provided, use memfd.
  (we don't want writes in /tmp; should I remove the fallback mechanism from
  the memfd logic?)
- if memfd fails, the log can't be shared/created and there is a migration
  blocker.
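
The order of preference in the list above could look roughly like this in
C. It is only a sketch of the proposal, not a patch: pick_log_fd() and enum
log_source are hypothetical names, and the memfd path calls the raw syscall
instead of QEMU's wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

enum log_source { LOG_FROM_PASSED_FD, LOG_FROM_MEMFD, LOG_UNAVAILABLE };

/*
 * Hypothetical helper: choose the descriptor backing a shared vhost log.
 * vhostlogfd < 0 means "not provided". Returns the fd to use, or -1 when
 * the caller should install a migration blocker.
 */
static int pick_log_fd(int vhostlogfd, enum log_source *src)
{
    assert(src != NULL);
    if (vhostlogfd >= 0) {
        /* The management layer opened the file, so its rules applied. */
        *src = LOG_FROM_PASSED_FD;
        return vhostlogfd;
    }
#ifdef SYS_memfd_create
    int fd = (int)syscall(SYS_memfd_create, "vhost-log", 0u);
    if (fd >= 0) {
        *src = LOG_FROM_MEMFD;
        return fd;
    }
#endif
    /* No /tmp fallback here: without memfd the log cannot be shared. */
    *src = LOG_UNAVAILABLE;
    return -1;
}
```

Leaving the temp-file fallback out of this path means a confined qemu
process never attempts a write under /tmp.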

André, Michael,

I'll work on that and send the patches soon. Meanwhile, could you push:

- "vhost: migration blocker only if shared log is used"

so I can backport it to Debian?

Thank you,
-Rafael Tinoco

Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-11-08 Thread Rafael David Tinoco

Hello Michael, André,

Could you do a quick review before a final submission ?

http://paste.ubuntu.com/23446279/

- I split the commits into 1) bugfix, 2) new util with test, 3) vhostlog

The unit test passes fds between 2 processes and asserts the contents
of the mmap buffer coming from the "vhostlog" util (mmap-file).
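
For readers unfamiliar with fd passing, the general technique such a test
relies on can be sketched as follows. This is not the unit test from the
paste: send_fd(), recv_fd(), fd_passing_roundtrip() and the "vhost-log"
marker are hypothetical, and a tmpfile() stands in for the file the util
would create.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

#define LOG_SIZE 4096

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Point msg at a one-byte payload and a zeroed control buffer. */
static void prep_msg(struct msghdr *msg, struct iovec *iov, union fd_ctrl *ctrl)
{
    memset(ctrl, 0, sizeof(*ctrl));
    memset(msg, 0, sizeof(*msg));
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctrl->buf;
    msg->msg_controllen = sizeof(ctrl->buf);
}

/* Send fd over a UNIX socket as SCM_RIGHTS ancillary data; 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    prep_msg(&msg, &iov, &ctrl);
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd sent by send_fd(); returns the new descriptor or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd = -1;

    prep_msg(&msg, &iov, &ctrl);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg && cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Parent maps a temp file, writes a marker and passes the fd; the child maps
 * the received fd and checks the marker. Returns 0 when the child agrees. */
static int fd_passing_roundtrip(void)
{
    int sv[2], status = 0, rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {
        close(sv[0]);
        int fd = recv_fd(sv[1]);
        char *map = fd < 0 ? MAP_FAILED
                           : mmap(NULL, LOG_SIZE, PROT_READ, MAP_SHARED, fd, 0);
        _exit(map != MAP_FAILED && strcmp(map, "vhost-log") == 0 ? 0 : 1);
    }
    close(sv[1]);
    FILE *tmp = pid > 0 ? tmpfile() : NULL;
    if (tmp && ftruncate(fileno(tmp), LOG_SIZE) == 0) {
        char *map = mmap(NULL, LOG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
                         fileno(tmp), 0);
        if (map != MAP_FAILED) {
            strcpy(map, "vhost-log");
            rc = send_fd(sv[0], fileno(tmp));
            munmap(map, LOG_SIZE);
        }
    }
    close(sv[0]); /* on failure the child sees EOF and exits non-zero */
    if (tmp)
        fclose(tmp);
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return rc == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

The temp file is created only after fork(), so the child can reach it only
through the descriptor received over the socket.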

Your final comment on the "vhostlog" was:

>> Argv examples:
>>
>> -netdev tap,id=net0,vhost=on
>> -netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
>> -netdev tap,id=net0,vhost=on,vhostlog=/tmp

(André) > Could it be only a filename? This would simplify testing.
(Michael) > When vhostlog is not specified, can we just use memfd as we did?

I'm going to change this to:

1 - if vhostlog is not provided shared log can't be used. Use memfd.

2 - for shared logs, vhostlog has to be provided as a "file" ?

Should I also keep allowing vhostlog to be a directory? (I know we are unlinking
the file, so it might not be needed, BUT a static file might cause a race
condition between different instances, and providing a directory - in which
random files are created - might be a better approach.)

Is there anything else ?

Thank you

Rafael Tinoco

On Mon, Oct 31, 2016 at 8:30 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> On Mon, Oct 31, 2016 at 08:35:33AM -0200, Rafael David Tinoco wrote:
>> On Sun, Oct 30, 2016 at 5:26 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> >
>> > On Sat, Oct 22, 2016 at 07:00:41AM +0000, Rafael David Tinoco wrote:
>> > > Commit 31190ed7 added a migration blocker in vhost_dev_init() to
>> > > check if memfd would succeed. It is better if this blocker first
>> > > checks if vhost backend requires shared log. This will avoid a
>> > > situation where a blocker is added inappropriately (e.g. shared
>> > > log allocation fails when vhost backend doesn't support it).
>> >
>> > Sounds like a bugfix but I'm not sure. Can this part be split
>> > out in a patch by itself?
>>
>> Already sent some days ago (and pointed by Marc today).
>>
>> > > Commit: 35f9b6e added a fallback mechanism for systems not supporting
>> > > memfd_create syscall (started being supported since 3.17).
>> > >
>> > > Backporting memfd_create might not be accepted for distros relying
>> > > on older kernels. Nowadays there is no way for security driver
>> > > to discover memfd filename to be created: /memfd-XX.
>> > >
>> > > Also, because vhost log file descriptors can be passed to other
>> > > processes, after discussion, we thought it is best to back mmap by
>> > > using files that can be placed into a specific directory: this commit
>> > > creates "vhostlog" argv parameter for such purpose. This will allow
>> > > security drivers to operate on those files appropriately.
>> > >
>> > > Argv examples:
>> > >
>> > > -netdev tap,id=net0,vhost=on
>> > > -netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
>> > > -netdev tap,id=net0,vhost=on,vhostlog=/tmp
>> > >
>> > > For vhost backends supporting shared logs, if vhostlog is non-existent,
>> > > or a directory, random files are going to be created in the specified
>> > > directory (or, for non-existent, in tmpdir). If vhostlog is specified,
>> > > the filepath is always used when allocating vhost log files.
>> >
>> > When vhostlog is not specified, can we just use memfd as we did?
>> >
>>
>> This was my approach on a "pastebin" example before this patch (in the
>> discussion thread we had). Problem goes back to when vhost log file
>> descriptor is shared with some vhost-user implementation - like the
>> interface allows to - and the security driver labelling issue. IMO,
>> yes, we could let vhostlog to specify a log file, and, if not
>> specified, assume memfd is ok to be used.
>>
>> Please let me know if you - and Marc - want me to keep using memfd.
>> I'll create the mmap-file tests and files in a different commit, like
>> Marc has asked for, and will propose the patch again by the end of
>> this week.
>
> I think that the best approach is to allow passing in the fd,
> not the file path. If not passed, use memfd.
>
> --
> MST

Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-31 Thread Rafael David Tinoco

On Sun, Oct 30, 2016 at 5:26 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>
> On Sat, Oct 22, 2016 at 07:00:41AM +0000, Rafael David Tinoco wrote:
> > Commit 31190ed7 added a migration blocker in vhost_dev_init() to
> > check if memfd would succeed. It is better if this blocker first
> > checks if vhost backend requires shared log. This will avoid a
> > situation where a blocker is added inappropriately (e.g. shared
> > log allocation fails when vhost backend doesn't support it).
>
> Sounds like a bugfix but I'm not sure. Can this part be split
> out in a patch by itself?

Already sent some days ago (and pointed out by Marc today).

> > Commit: 35f9b6e added a fallback mechanism for systems not supporting
> > memfd_create syscall (started being supported since 3.17).
> >
> > Backporting memfd_create might not be accepted for distros relying
> > on older kernels. Nowadays there is no way for security driver
> > to discover memfd filename to be created: /memfd-XX.
> >
> > Also, because vhost log file descriptors can be passed to other
> > processes, after discussion, we thought it is best to back mmap by
> > using files that can be placed into a specific directory: this commit
> > creates "vhostlog" argv parameter for such purpose. This will allow
> > security drivers to operate on those files appropriately.
> >
> > Argv examples:
> >
> > -netdev tap,id=net0,vhost=on
> > -netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
> > -netdev tap,id=net0,vhost=on,vhostlog=/tmp
> >
> > For vhost backends supporting shared logs, if vhostlog is non-existent,
> > or a directory, random files are going to be created in the specified
> > directory (or, for non-existent, in tmpdir). If vhostlog is specified,
> > the filepath is always used when allocating vhost log files.
>
> When vhostlog is not specified, can we just use memfd as we did?
>

This was my approach in a "pastebin" example before this patch (in the
discussion thread we had). The problem goes back to when the vhost log file
descriptor is shared with some vhost-user implementation - as the
interface allows - and the security driver labelling issue. IMO,
yes, we could let vhostlog specify a log file and, if it is not
specified, assume memfd is OK to use.

Please let me know if you - and Marc - want me to keep using memfd.
I'll create the mmap-file tests and files in a different commit, like
Marc has asked for, and will propose the patch again by the end of
this week.

[Qemu-devel] [PATCH] vhost: migration blocker only if shared log is used

2016-10-24 Thread Rafael David Tinoco

Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).

Signed-off-by: Rafael David Tinoco <rafael.tin...@canonical.com>
Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com>
---
 hw/virtio/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index bd051ab..742d0aa 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1122,7 +1122,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
         if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
-        } else if (!qemu_memfd_check()) {
+        } else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_check()) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: failed to allocate shared memory");
         }
-- 
2.9.3
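
The effect of the one-line change above can be sketched as a small standalone model. This is an illustration, not QEMU code: migration_blocker_reason and SKETCH_VHOST_F_LOG_ALL are hypothetical names, the two booleans stand in for vhost_dev_log_is_shared() and qemu_memfd_check(), and the bit number is assumed to match VHOST_F_LOG_ALL in <linux/vhost.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror VHOST_F_LOG_ALL from <linux/vhost.h>. */
#define SKETCH_VHOST_F_LOG_ALL 26

/*
 * Return the reason migration must be blocked, or NULL if it may proceed.
 * log_is_shared stands in for vhost_dev_log_is_shared(hdev),
 * memfd_ok for qemu_memfd_check().
 */
static const char *migration_blocker_reason(uint64_t features,
                                            bool log_is_shared,
                                            bool memfd_ok)
{
    if (!(features & (UINT64_C(1) << SKETCH_VHOST_F_LOG_ALL))) {
        return "vhost lacks VHOST_F_LOG_ALL feature";
    }
    /* After the patch, a memfd failure only matters for shared logs. */
    if (log_is_shared && !memfd_ok) {
        return "failed to allocate shared memory";
    }
    return NULL;
}
```

The case the patch fixes is the second one in the usage below: a backend that does not share its log is no longer blocked just because memfd is unavailable.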

[Qemu-devel] [Bug 1626972] Fwd: [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-22 Thread Rafael David Tinoco

> Begin forwarded message:
> 
> From: Rafael David Tinoco <rafael.tin...@canonical.com>
> Subject: Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using 
> argv paremeter
> Date: October 22, 2016 at 19:52:31 GMT-2
> To: Marc-André Lureau <marcandre.lur...@gmail.com>
> Cc: Rafael David Tinoco <rafael.tin...@canonical.com>, qemu-devel 
> <qemu-devel@nongnu.org>
> 
> Hello,
> 
>> On Oct 22, 2016, at 05:18, Marc-André Lureau <marcandre.lur...@gmail.com> 
>> wrote:
>> 
>> Hi
>> 
>> On Sat, Oct 22, 2016 at 10:01 AM Rafael David Tinoco 
>> <rafael.tin...@canonical.com> wrote:
>> Commit 31190ed7 added a migration blocker in vhost_dev_init() to
>> check if memfd would succeed. It is better if this blocker first
>> checks if vhost backend requires shared log. This will avoid a
>> situation where a blocker is added inappropriately (e.g. shared
>> log allocation fails when vhost backend doesn't support it).
>> 
>> Could you make this a separate patch?
> 
> Just did, in another e-mail, cc'ing you.
> 
>> Argv examples:
>> 
>>-netdev tap,id=net0,vhost=on
>>-netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
>>-netdev tap,id=net0,vhost=on,vhostlog=/tmp
>> 
>> Could it be only a filename? This would simplify testing.
> 
> It could. Should I keep the /tmp/ logic if no vhostlog arg is present?
> Or do you think it should fail if no arg is given? I'm worried about backward
> compatibility when backporting this to older qemu versions in stable
> releases (in my case, I'll backport this to ~3 different versions).
> 
>> For vhost backends supporting shared logs, if vhostlog is non-existent,
>> or a directory, random files are going to be created in the specified
>> directory (or, for non-existent, in tmpdir). If vhostlog is specified,
>> the filepath is always used when allocating vhost log files.
>> 
>> 
>> Regarding testing, you add utility code mmap-file, could you make this a
>> separate commit, with unit tests?
>> 
> 
> Sure, I'll work on it.
> 
>> thanks
> 
> Thank u!
> 
> -Rafael Tinoco

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

Bug description:
  When libvirt starts using apparmor and creates apparmor
  profiles for every virtual machine created on the compute nodes,
  mitaka qemu (2.5 - and upstream also) uses a fallback mechanism for
  creating shared memory for live migrations. On 3.13 kernels, which
  don't have the memfd_create() system call, this fallback mechanism
  tries to create files in the /tmp/ directory and fails, so live
  migration doesn't work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, logic is on :

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
                         int *fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
 tmpdir = g_get_tmp_dir
 ...
 mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1

[Qemu-devel] [Bug 1626972] Fwd: [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-22 Thread Rafael David Tinoco

> Begin forwarded message:
> 
> From: Marc-André Lureau <marcandre.lur...@gmail.com>
> Subject: Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using 
> argv paremeter
> Date: October 22, 2016 at 05:18:02 GMT-2
> To: Rafael David Tinoco <rafael.tin...@canonical.com>
> Cc: QEMU <qemu-devel@nongnu.org>
> 
> Hi
> 
> On Sat, Oct 22, 2016 at 10:01 AM Rafael David Tinoco 
> <rafael.tin...@canonical.com <mailto:rafael.tin...@canonical.com>> wrote:
> Commit 31190ed7 added a migration blocker in vhost_dev_init() to
> check if memfd would succeed. It is better if this blocker first
> checks if vhost backend requires shared log. This will avoid a
> situation where a blocker is added inappropriately (e.g. shared
> log allocation fails when vhost backend doesn't support it).
> 
> Could you make this a separate patch?
>  
> Commit: 35f9b6e added a fallback mechanism for systems not supporting
> memfd_create syscall (started being supported since 3.17).
> 
> Backporting memfd_create might not be accepted for distros relying
> on older kernels. Nowadays there is no way for security driver
> to discover memfd filename to be created: /memfd-XX.
> 
> Also, because vhost log file descriptors can be passed to other
> processes, after discussion, we thought it is best to back mmap by
> using files that can be placed into a specific directory: this commit
> creates "vhostlog" argv parameter for such purpose. This will allow
> security drivers to operate on those files appropriately.
> 
> Argv examples:
> 
> -netdev tap,id=net0,vhost=on
> -netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
> -netdev tap,id=net0,vhost=on,vhostlog=/tmp
> 
> Could it be only a filename? This would simplify testing.
>  
> 
> For vhost backends supporting shared logs, if vhostlog is non-existent,
> or a directory, random files are going to be created in the specified
> directory (or, for non-existent, in tmpdir). If vhostlog is specified,
> the filepath is always used when allocating vhost log files.
> 
> 
> Regarding testing, you add utility code mmap-file, could you make this a
> separate commit, with unit tests?
> 
> thanks
> 
> Signed-off-by: Rafael David Tinoco <rafael.tin...@canonical.com 
> <mailto:rafael.tin...@canonical.com>>
> ---
>  hw/net/vhost_net.c|   4 +-
>  hw/scsi/vhost-scsi.c  |   2 +-
>  hw/virtio/vhost-vsock.c   |   2 +-
>  hw/virtio/vhost.c |  41 +++--
>  include/hw/virtio/vhost.h |   4 +-
>  include/net/vhost_net.h   |   1 +
>  include/qemu/mmap-file.h  |  10 +++
>  net/tap.c |   6 ++
>  qapi-schema.json  |   3 +
>  qemu-options.hx   |   3 +-
>  util/Makefile.objs|   1 +
>  util/mmap-file.c  | 153 ++
>  12 files changed, 207 insertions(+), 23 deletions(-)
>  create mode 100644 include/qemu/mmap-file.h
>  create mode 100644 util/mmap-file.c
> 
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index f2d49ad..d650c92 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -171,8 +171,8 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
>  net->dev.vq_index = net->nc->queue_index * net->dev.nvqs;
>  }
> 
> -r = vhost_dev_init(&net->dev, options->opaque,
> -   options->backend_type, options->busyloop_timeout);
> +r = vhost_dev_init(&net->dev, options->opaque, options->backend_type,
> +   options->busyloop_timeout, options->vhostlog);
>  if (r < 0) {
>  goto fail;
>  }
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 5b26946..5dc3d30 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
>  s->dev.backend_features = 0;
> 
>  ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
> - VHOST_BACKEND_TYPE_KERNEL, 0);
> + VHOST_BACKEND_TYPE_KERNEL, 0, NULL);
>  if (ret < 0) {
>  error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
> strerror(-ret));
> diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
> index b481562..6cf6081 100644
> --- a/hw/virtio/vhost-vsock.c
> +++ b/hw/virtio/vhost-vsock.c
> @@ -342,7 +342,7 @@ static void vhost_vsock_device_realize(DeviceState *dev, Error **errp)
>  vsock->vhost_dev.nvqs = ARRAY_SIZE(vsock->vhost_vqs);
>  vsock->vhost_dev.vqs = vsock->vhost_vqs;
>  ret =

Re: [Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-22 Thread Rafael David Tinoco

Hello,

> On Oct 22, 2016, at 05:18, Marc-André Lureau <marcandre.lur...@gmail.com> 
> wrote:
> 
> Hi
> 
> On Sat, Oct 22, 2016 at 10:01 AM Rafael David Tinoco 
> <rafael.tin...@canonical.com> wrote:
> Commit 31190ed7 added a migration blocker in vhost_dev_init() to
> check if memfd would succeed. It is better if this blocker first
> checks if vhost backend requires shared log. This will avoid a
> situation where a blocker is added inappropriately (e.g. shared
> log allocation fails when vhost backend doesn't support it).
> 
> Could you make this a separate patch?

Just did, in another e-mail, cc'ing you.

> Argv examples:
> 
> -netdev tap,id=net0,vhost=on
> -netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
> -netdev tap,id=net0,vhost=on,vhostlog=/tmp
> 
> Could it be only a filename? This would simplify testing.

It could. Should I keep the /tmp/ logic if no vhostlog arg is present?
Or do you think it should fail if no arg is given? I'm worried about backward
compatibility when backporting this to older qemu versions in stable releases
(in my case, I'll backport this to ~3 different versions).

> For vhost backends supporting shared logs, if vhostlog is non-existent,
> or a directory, random files are going to be created in the specified
> directory (or, for non-existent, in tmpdir). If vhostlog is specified,
> the filepath is always used when allocating vhost log files.
> 
> 
> Regarding testing, you add utility code mmap-file, could you make this a
> separate commit, with unit tests?
> 

Sure, I'll work on it.

> thanks

Thank u!

-Rafael Tinoco

[Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-22 Thread Rafael David Tinoco

Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).
---
 hw/virtio/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index bd051ab..742d0aa 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1122,7 +1122,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
         if (!(hdev->features & (0x1ULL << VHOST_F_LOG_ALL))) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: vhost lacks VHOST_F_LOG_ALL feature.");
-        } else if (!qemu_memfd_check()) {
+        } else if (vhost_dev_log_is_shared(hdev) && !qemu_memfd_check()) {
             error_setg(&hdev->migration_blocker,
                        "Migration disabled: failed to allocate shared memory");
         }
-- 
2.9.3

[Qemu-devel] [PATCH] vhost: secure vhost shared log files using argv paremeter

2016-10-22 Thread Rafael David Tinoco

Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).

Commit: 35f9b6e added a fallback mechanism for systems not supporting
memfd_create syscall (started being supported since 3.17).

Backporting memfd_create might not be accepted for distros relying
on older kernels. Nowadays there is no way for security driver
to discover memfd filename to be created: /memfd-XX.

Also, because vhost log file descriptors can be passed to other
processes, after discussion, we thought it is best to back mmap by
using files that can be placed into a specific directory: this commit
creates "vhostlog" argv parameter for such purpose. This will allow
security drivers to operate on those files appropriately.

Argv examples:

-netdev tap,id=net0,vhost=on
-netdev tap,id=net0,vhost=on,vhostlog=/tmp/guest.log
-netdev tap,id=net0,vhost=on,vhostlog=/tmp

For vhost backends supporting shared logs, if vhostlog is non-existent,
or a directory, random files are going to be created in the specified
directory (or, for non-existent, in tmpdir). If vhostlog is specified,
the filepath is always used when allocating vhost log files.

Signed-off-by: Rafael David Tinoco <rafael.tin...@canonical.com>
---
 hw/net/vhost_net.c|   4 +-
 hw/scsi/vhost-scsi.c  |   2 +-
 hw/virtio/vhost-vsock.c   |   2 +-
 hw/virtio/vhost.c |  41 +++--
 include/hw/virtio/vhost.h |   4 +-
 include/net/vhost_net.h   |   1 +
 include/qemu/mmap-file.h  |  10 +++
 net/tap.c |   6 ++
 qapi-schema.json  |   3 +
 qemu-options.hx   |   3 +-
 util/Makefile.objs|   1 +
 util/mmap-file.c  | 153 ++
 12 files changed, 207 insertions(+), 23 deletions(-)
 create mode 100644 include/qemu/mmap-file.h
 create mode 100644 util/mmap-file.c

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index f2d49ad..d650c92 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -171,8 +171,8 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
 net->dev.vq_index = net->nc->queue_index * net->dev.nvqs;
 }
 
-r = vhost_dev_init(&net->dev, options->opaque,
-   options->backend_type, options->busyloop_timeout);
+r = vhost_dev_init(&net->dev, options->opaque, options->backend_type,
+   options->busyloop_timeout, options->vhostlog);
 if (r < 0) {
 goto fail;
 }
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 5b26946..5dc3d30 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error **errp)
 s->dev.backend_features = 0;
 
 ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
- VHOST_BACKEND_TYPE_KERNEL, 0);
+ VHOST_BACKEND_TYPE_KERNEL, 0, NULL);
 if (ret < 0) {
 error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
strerror(-ret));
diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index b481562..6cf6081 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -342,7 +342,7 @@ static void vhost_vsock_device_realize(DeviceState *dev, Error **errp)
 vsock->vhost_dev.nvqs = ARRAY_SIZE(vsock->vhost_vqs);
 vsock->vhost_dev.vqs = vsock->vhost_vqs;
 ret = vhost_dev_init(&vsock->vhost_dev, (void *)(uintptr_t)vhostfd,
- VHOST_BACKEND_TYPE_KERNEL, 0);
+ VHOST_BACKEND_TYPE_KERNEL, 0, NULL);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "vhost-vsock: vhost_dev_init failed");
 goto err_virtio;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index bd051ab..d874ebb 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -20,7 +20,7 @@
 #include "qemu/atomic.h"
 #include "qemu/range.h"
 #include "qemu/error-report.h"
-#include "qemu/memfd.h"
+#include "qemu/mmap-file.h"
 #include <linux/vhost.h>
 #include "exec/address-spaces.h"
 #include "hw/virtio/virtio-bus.h"
@@ -326,7 +326,7 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
 return log_size;
 }
 
-static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
+static struct vhost_log *vhost_log_alloc(char *path, uint64_t size, bool share)
 {
 struct vhost_log *log;
 uint64_t logsize = size * sizeof(*(log->log));
@@ -334,9 +334,7 @@ static struct vhost_log *vhost_log_alloc(uint64_t size, bool share)
 
 log = g_new0(struct vhost_log, 1);
 if (share) {
-log->log = qemu_memfd_alloc("vhos

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-20 Thread Rafael David Tinoco

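Also, if possible, I would like comments about a draft:

https://pastebin.canonical.com/168579/
(please disregard printfs and minor problems)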

OBS: I'm basically removing the fallback mechanism from memfd, creating a generic
qemu_mmap_XXX implementation, adding a vhostlog parameter to the tap cmdline AND
changing the decision on what to use: if vhostlog is present in the cmdline,
qemu_mmap_XXX on vhostlog is used. If it is a directory, a random file is
created inside it. If it is a file, the file is used. If no vhostlog is given
(the default while libvirt isn't changed), it first tries to use memfd (all newer
kernels) and, if that is not possible, it falls back to the qemu_mmap
mechanism, creating random files in the "tmp" directory.
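
The path handling described above could be sketched roughly as follows. This is not the draft from the pastebin: classify_vhostlog and the enum are hypothetical names, and treating a non-existent path as "a file to be created" is an assumption made here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical classification of the vhostlog= argument. */
enum vhostlog_mode {
    VHOSTLOG_UNSET,          /* no vhostlog=: try memfd, then tmp-dir files */
    VHOSTLOG_RANDOM_IN_DIR,  /* names a directory: random file inside it */
    VHOSTLOG_USE_FILE        /* anything else: use the path as the log file */
};

/* Decide how a vhostlog= value (possibly NULL) should be used. */
static enum vhostlog_mode classify_vhostlog(const char *path)
{
    struct stat st;

    if (path == NULL || path[0] == '\0') {
        return VHOSTLOG_UNSET;
    }
    if (stat(path, &st) == 0 && S_ISDIR(st.st_mode)) {
        return VHOSTLOG_RANDOM_IN_DIR;
    }
    /* Assumption: a non-existent path is treated as a file to create. */
    return VHOSTLOG_USE_FILE;
}
```

Keeping the classification separate from the allocation makes it easy to test the three argv cases without touching mmap at all.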

PS: Remember that this is because of the selinux/apparmor labelling of tmp files
(and because file descriptors can be passed to other processes, as we discussed
before).

If that is okay I'll provide a patch asap. Let me know if you prefer something 
else.

Thank you,
Rafael

> On Oct 04, 2016, at 12:29, Rafael David Tinoco <rafael.tin...@canonical.com> 
> wrote:
> 
> 
>> On Oct 04, 2016, at 10:50, Marc-André Lureau <marcandre.lur...@gmail.com> 
>> wrote:
>> 
>> What about having a single config parameter as a place to put all vhost logs 
>> for all drives for a single instance ? Remove the memfd implementation with 
>> all the memfd shared_memory option ? Replace it with a 
>> open+unlink+ftruncate+mmap approach only.
>> 
>> 
>> I fail to see your point, memfd is superior to open+unlink and has other 
>> advantages with sealing etc.
> 
> I was just summarising needs based on previous statement from Daniel:
> 
>> This makes me wonder about the memfd_create() code path too - we'll
>> again not want that external process to be granted access to arbitrary
>> FDs of QEMU's and I'm not sure of a way to get the memfd  FD to have
>> a specific label. So I think it is possible that when using libvirt
>> we'll want the ability to tell QEMU to *always* use an explicit file
>> in a path libvirt specifies, and never use memfd even if available.
>> 
>> Regards,
>> Daniel

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-20 Thread Rafael David Tinoco

The correct (and draft) one:
http://pastebin.ubuntu.com/23357210/

I'm passing the vhostlog parameter as "hdev->log_filename" so it can be accessed
from the net_init_tap()-> functions AND from the vhost_dev_start()-> functions. This
way I don't have to change function prototypes anymore.

> On Oct 21, 2016, at 01:03, Rafael David Tinoco <rafael.tin...@canonical.com> 
> wrote:
> 
> Also, if possible, I would like comments about a draft:
> 
> https://pastebin.canonical.com/168579/
> (please disregard printfs and minor problems)

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-04 Thread Rafael David Tinoco


> On Oct 04, 2016, at 10:50, Marc-André Lureau  
> wrote:
> 
> What about having a single config parameter as a place to put all vhost logs 
> for all drives for a single instance ? Remove the memfd implementation with 
> all the memfd shared_memory option ? Replace it with a 
> open+unlink+ftruncate+mmap approach only.
> 
> 
> I fail to see your point, memfd is superior to open+unlink and has other 
> advantages with sealing etc.

I was just summarising needs based on previous statement from Daniel:

> This makes me wonder about the memfd_create() code path too - we'll
> again not want that external process to be granted access to arbitrary
> FDs of QEMU's and I'm not sure of a way to get the memfd  FD to have
> a specific label. So I think it is possible that when using libvirt
> we'll want the ability to tell QEMU to *always* use an explicit file
> in a path libvirt specifies, and never use memfd even if available.
> 
> Regards,
> Daniel
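
For reference on the sealing advantage mentioned above, here is a small sketch of what sealing gives a memfd-backed log. It is not QEMU's implementation: sealed_memfd is a hypothetical helper, the fallback #defines are assumed to match the Linux UAPI values, and the raw syscall is used so the sketch does not depend on a libc wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values assumed from the Linux UAPI. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS 1033
#define F_GET_SEALS 1034
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW 0x0004
#endif

/*
 * Create a memfd of `size` bytes whose size can no longer change.
 * Returns the fd, or -1 if memfd_create or sealing is unavailable.
 */
static int sealed_memfd(const char *name, off_t size)
{
    int fd = -1;

    assert(name != NULL && size > 0);
#ifdef SYS_memfd_create
    fd = (int)syscall(SYS_memfd_create, name,
                      MFD_CLOEXEC | MFD_ALLOW_SEALING);
#endif
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, size) < 0 ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Once the seals are set, a process that receives the fd cannot resize the log underneath its peer, which a plain unlinked file does not guarantee.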

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-04 Thread Rafael David Tinoco

> On Oct 04, 2016, at 10:10, Marc-André Lureau  
> wrote:
> 
> > How will this path be used? Is it going to be global to qemu for various
> > use (kinda like $TMP), or per-device, or for memfd fallback only? Should
> > the path pre-exist? (I suppose, if not, qemu should clean it up when
> > leaving)
> 
> I'd expect it to be an option set against the vhost user backend, since
> that's the thing using this.
> 
> If other things have similar usage needs wrt memfd in future, they would
> also need similar path config option.

I was going for that approach. I could have something similar to:

-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:5c:10:f2,bus=pci.0,addr=0x3,vhostpath=/var/lib///

> The log may be shared if there are several vhost-user (stored in 
> vhost_log_shm global), so I think it makes more sense to have a global config 
> path for it, or you may end up duplicating that information per vhost backend 
> and having files in either of the specified paths.

But yes, vhost_log_shm indeed makes that approach tricky when the same log
file is shared between multiple vhost backends. Besides, tools like openstack would
put all the vhost log files in the same place in the end.

Having a global config path that must be specified, or else the vhost log isn't
created (like when it fails nowadays), seems to be the right approach.

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-04 Thread Rafael David Tinoco

True. 

What about having a single config parameter as a place to put all vhost logs
for all drives of a single instance? Remove the memfd implementation, together with
the memfd shared_memory option, and replace it with an open+unlink+ftruncate+mmap
approach only?

This way every device would get its own log file and vhost-user backends would
be able to get their file descriptors (and, of course, the security
drivers would be allowed to do their jobs).
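
A minimal sketch of the open+unlink+ftruncate+mmap sequence proposed above, assuming the directory is supplied by the management layer. map_unlinked_file and the "vhost-log-XXXXXX" template are hypothetical names chosen for illustration, not anything from QEMU.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `size` bytes backed by a freshly created, already unlinked file in
 * `dir`. On success returns the mapping and stores the fd in *fd_out;
 * on failure returns NULL and leaves *fd_out untouched.
 */
static void *map_unlinked_file(const char *dir, size_t size, int *fd_out)
{
    char path[4096];
    void *ptr;
    int fd, n;

    assert(dir != NULL && fd_out != NULL && size > 0);
    n = snprintf(path, sizeof(path), "%s/vhost-log-XXXXXX", dir);
    if (n < 0 || n >= (int)sizeof(path)) {
        return NULL;
    }
    fd = mkstemp(path);        /* open: unique name inside `dir` */
    if (fd < 0) {
        return NULL;
    }
    unlink(path);              /* unlink: the fd keeps the inode alive */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return ptr;
}
```

Because the file is created inside a directory chosen by the caller, a security driver can label or allow that directory instead of all of /tmp.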

>> On Oct 04, 2016, at 10:25, Daniel P. Berrange  wrote:
>> 
>> Hmm, is there a reason why it is shared? That seems to make an assumption
>> that all vhost-user backends would be managed by the same external process.
>> While that may be the common case today, it doesn't feel like a reasonable
>> assumption to make long term. IOW it feels wiser to have it set per-NIC
>> unless I'm missing something important that means it must be shared ?
>

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-04 Thread Rafael David Tinoco

Let me work on it. I'll get back soon. 

Tks Daniel.

> On Oct 04, 2016, at 05:36, Daniel P. Berrange <berra...@redhat.com> wrote:
> 
> On Mon, Oct 03, 2016 at 04:15:55PM -0300, Rafael David Tinoco wrote:
>> Yes, definitely. Check this:
> 
> [snip]
> 
> So in that case, I think we must add ability to specify an explicit path
> that apps can use *regardless* of whether memfd support exists or not.
> 
>>> On Oct 03, 2016, at 15:46, Rafael David Tinoco 
>>> <rafael.tin...@canonical.com> wrote:
>>> 
>>>> So you're saying that the file descriptor here is actually getting
>>>> passed to a different process for it to use ?

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-03 Thread Rafael David Tinoco

Yes, definitely. Check this:

/**
 * @qemu_chr_fe_set_msgfds:
 *
 * For backends capable of fd passing, set an array of fds to be passed with
 * the next send operation.
 * A subsequent call to this function before calling a write function will
 * result in overwriting the fd array with the new value without being send.
 * Upon writing the message the fd array is freed.
 *
 * Returns: -1 if fd passing isn't supported.
 */
int qemu_chr_fe_set_msgfds(CharDriverState *s, int *fds, int num);

So, at least for vhost_dev_log_resize, this "interface" is being implemented:

vhost_user_set_log_base -> VhostUserMsg = VHOST_USER_SET_LOG_BASE

vhost_user_write(with the VHOST_USER_SET_LOG_BASE message):

- configures the file descriptors(... , fds, fd_num)
  qemu_chr_fe_set_msgfds

- writes them down the char driver
  qemu_chr_fe_write_all
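
The set-then-write behaviour described in the comment above can be modelled with a toy structure. This is only a model of the documented semantics, not the char backend code: struct pending_fds, pending_set and pending_write are hypothetical names, and no real descriptors are sent.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the char backend's pending-fd state (hypothetical). */
struct pending_fds {
    int *fds;
    int num;
};

/* Queue fds for the next write; a second call overwrites the first. */
static int pending_set(struct pending_fds *p, const int *fds, int num)
{
    assert(p != NULL && num >= 0);
    free(p->fds);
    p->fds = NULL;
    p->num = 0;
    if (num > 0) {
        p->fds = malloc((size_t)num * sizeof(int));
        if (p->fds == NULL) {
            return -1;
        }
        memcpy(p->fds, fds, (size_t)num * sizeof(int));
        p->num = num;
    }
    return 0;
}

/* "Write" one message: report how many fds went with it, then free them. */
static int pending_write(struct pending_fds *p)
{
    int sent = p->num;

    free(p->fds);
    p->fds = NULL;
    p->num = 0;
    return sent;
}
```

In this model, calling pending_set twice before a write sends only the second array, and a write with nothing queued carries no fds, which matches the wording of the doc comment.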

> On Oct 03, 2016, at 15:46, Rafael David Tinoco <rafael.tin...@canonical.com> 
> wrote:
> 
>> So you're saying that the file descriptor here is actually getting
>> passed to a different process for it to use ?

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-03 Thread Rafael David Tinoco

Hello Daniel,

> On Oct 03, 2016, at 14:55, Daniel P. Berrange <berra...@redhat.com> wrote:
> 
>> Well, it unlinks the file but the references are still there while the
>> descriptor isn't closed by this process, or by the one that receives the
>> descriptor (that is why the "unlink" happens so early).
>> 
>> If you check vhost_dev_log_resize(), it gets *possible* new vhost log
>> (if a new size is given) and informs the vhost dev driver about the new
>> log base (vhost_ops->vhost_set_log_base).
>> 
>> For vhost_user, this means that the file descriptors for vhost logs are
>> likely going to be passed to vhost backend (fds[] in
>> vhost_user_set_log_base). This is just one example, not sure about
>> others.
>> 
>> Probably the best approach here, like what Marc-André said, is to create
>> some sort of TMPDIR, set by libvirt perhaps ?
> 
> So you're saying that the file descriptor here is actually getting
> passed to a different process for it to use ?
> 
> If so that means we definitely do not want this in TMPDIR. If we
> create a generic file in TMPDIR, then its going to have a generic
> security label. That means that the other process we're giving the
> FD to is going to have to be granted permission to access this FD
> and we certainly don't want to grant permission for it to access
> any of QEMU's other FDs. So for the SELinux integration, we'll
> need this FD to be in a specific directory, so that we can setup
> policy such that the file created gets given a specific SELinux
> label. We can then grant the other process access to only that
> particular file, and not anything else of QEMU's.
> 
> This makes me wonder about the memfd_create() code path too - we'll
> again not want that external process to be granted access to arbitrary
> FDs of QEMU's and I'm not sure of a way to get the memfd  FD to have
> a specific label. So I think it is possible that when using libvirt
> we'll want the ability to tell QEMU to *always* use an explicit file
> in a path libvirt specifies, and never use memfd even if available.

Check this execution path:

(vhost_vsock_device_realize)
  vhost_dev_init
  vhost_commit
  |- vhost_get_log_size
  |...
  |- vhost_dev_log_resize

(vhost_dev_log_resize):
  vhost_log_get -> here if the size is bigger, a new log is created
  dev->vhost_ops->vhost_set_log_base() -> kernel or user vhost driver
  vhost_log_put()

So,

* In case of the kernel mode, this is just a:

vhost in kernel mode = vhost_kernel_set_log_base
return vhost_kernel_call(dev, VHOST_SET_LOG_BASE, );

which makes an ioctl to dev->opaque file descriptor to set a new vhost
log base.

* But in the case of user mode:

vhost in user mode = vhost_user_set_log_base

which gets the log file descriptor (log->fd) and gives to
vhost_user_write. vhost_user_write will do a qemu_chr_fe_set_msgfds
passing the log file descriptors for the backend vhost driver
(CharDriverState).

If I'm reading this right.. if the backend driver is:

static int tcp_set_msgfds(CharDriverState *chr, int *fds, int num)

it would check for:

!qio_channel_has_feature(s->ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {

and configure s->write_msgfds. This would be sent in:

static int tcp_chr_write(CharDriverState *chr, const uint8_t *buf, int
len)

with "io_channel_send_full" + "qio_channel_writev_full" + io_writev from
QIOChannelClass.

https://www.berrange.com/posts/2016/08/16/

This, from your blog, probably confirms this behaviour:

"The migration code supports a number of different protocols besides
just “tcp:“. In particular it allows an “fd:” protocol to tell QEMU to
use a passed-in file descriptor, and an “exec:” protocol to tell QEMU to
launch an external command to tunnel the connection. It is desirable to
be able to use TLS with these protocols too, but when using TLS the
client QEMU needs to know the hostname of the target QEMU in order to
correctly validate the x509 certificate it receives. Thus, a second
“tls-hostname” parameter was added to allow QEMU to be informed of the
hostname to use for x509 certificate validation when using a non-tcp
migration protocol. This can be set on the source QEMU prior to starting
the migration using the “migrate_set_str_parameter” monitor command"

=)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

Bug description:
  And, when libvirt starts using apparmor, and creating apparmor
  profiles for every virtual machine created in the compute nodes,
  mitaka qemu (2.5 - and upstream also) uses a fallback mechanism for
  creating shared memory for live-migrations. This fall back mechanism,
  on kernels 3.13 - that don't have memfd_create() system-call, try to
  create files on /tmp/ directory and fails.. causing live-migration not
  to work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability

Re: [Qemu-devel] [PATCH] util: secure memfd_create fallback mechanism

2016-10-03 Thread Rafael David Tinoco

Hello Marc, 

> On Sep 27, 2016, at 08:13, Marc-André Lureau <mlur...@redhat.com> wrote:
> 
>>> On Tue, Sep 27, 2016 at 03:06:21AM +0000, Rafael David Tinoco wrote:
>>> We should not have QEMU creating unpredictable filenames in the
>>> first place - any filenames should be determined by libvirt
>>> explicitly.
>> 
>> Note that the filename, per se, is not as important as other files,
>> since qemu won't provide it for being accessed by external programs, and,
>> deletes the file, while keeping the descriptor, right after its creation
>> (due to its nature, that is probably why it was created in /tmp).
>> 
>> Having libvirt to define a filename that would not be used for recent
>> kernels (> 3.17) and would exist for a fraction of second doesn't seem
>> right to me.
>> 
> 
> There are other parts of qemu that rely on creating temporary files, and this 
> seems to lack a bit of uniformity. Would it make sense to define a place 
> where qemu could create those? Or setting TMPDIR should help too. Could 
> libvirt set a per-vm TMPDIR with appropriate security rules?

Best move I can see. Only problem is that if we do that, we would have to 
create a fallback mechanism for when TMPDIR is not set. It would go back to 
/tmp ? 
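
A minimal sketch of the lookup order being discussed: an explicitly configured
directory first, then TMPDIR, then /tmp. The function name pick_tmpdir and the
"configured" parameter are hypothetical; nothing like this exists in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the directory to use for temporary files:
 * 1. an explicitly configured path (hypothetical, e.g. set by libvirt),
 * 2. the TMPDIR environment variable,
 * 3. "/tmp" as the last resort. */
const char *pick_tmpdir(const char *configured)
{
    const char *env;

    if (configured != NULL && configured[0] != '\0') {
        return configured;
    }
    env = getenv("TMPDIR");
    if (env != NULL && env[0] != '\0') {
        return env;
    }
    return "/tmp";
}
```

With this ordering, a management application that sets neither value still
gets the current behaviour, which is exactly the case the security drivers
cannot confine per instance.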

In my particular case (for 1 vhost log file):

-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:5c:10:f2,bus=pci.0,addr=0x3

I could have something similar to:

-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:5c:10:f2,bus=pci.0,addr=0x3,vhostpath=/var/lib///

and put mkstemp() files (one per vhost device) in there. 

Even so, what to do when "vhostpath" is not provided ? 

I'm worried that, right now there are security drivers either blocking the live 
migration entirely or allowing all instances to be able to read 
/tmp/memfd-. 

Don't you think we could push the first patch until we come up with a better 
approach for the tmp (and default tmp) files & directories ? The patch is not 
worse than what was committed already. 

Tks

Rafael

Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism

2016-10-03 Thread Rafael David Tinoco

Sorry, I was only able to come back to this today.

> On Sep 27, 2016, at 09:18, Daniel Berrange <1626...@bugs.launchpad.net> wrote:
> 
>> There are numerous people relying on older kernels in openstack 
>> deployments - sometimes with specific drivers (ovswitch, dpdk, 
>> infiniband) holding kernel upgrades - but still in need of upgrading 
>> userland (e.g. newer releases). Having a fallback mechanism seems 
>> appropriate for those cases.
> 
> I'm not against some kind of fallback - just about the way it
> silently creates files in /tmp.
> 

That is why memfd_create is used here, I suppose: to allow anonymous 
backed pages to have a descriptor and to be sealed. When falling back from 
this mechanism, I don't see any way other than creating a temporary file. 
Of course one way would be something like:

http://paste.ubuntu.com/23270379/

But this is pretty much the same, just solving the "where to place the 
temporary file" (non configurable for this usage). 
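
For readers who have not looked at the code, the "memfd first, temporary file
otherwise" idea can be sketched as below. This is an illustration under
assumptions, not the QEMU implementation: anon_fd_create and anon_fd_demo are
hypothetical names, sealing is left out, the raw syscall is used so the sketch
does not depend on a libc wrapper, and the fallback path is hard-coded to /tmp.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns a descriptor backing `size` bytes of shareable memory, or -1.
 * Tries memfd_create() first and falls back to an unlinked temp file. */
int anon_fd_create(const char *name, size_t size)
{
    int fd = -1;

    assert(size > 0);
#ifdef SYS_memfd_create
    fd = (int)syscall(SYS_memfd_create, name, 0u);
#else
    (void)name;                         /* no memfd on this build host */
#endif
    if (fd < 0) {
        char path[] = "/tmp/memfd-XXXXXX";  /* mkstemp fills in the X's */

        fd = mkstemp(path);
        if (fd < 0) {
            return -1;
        }
        unlink(path);                   /* keep only the descriptor */
    }
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Maps the descriptor, writes through the mapping and reads the byte back
 * through the descriptor. Returns 0 on success, -1 on failure. */
int anon_fd_demo(void)
{
    const size_t size = 4096;
    char c = 0;
    char *map;
    int fd = anon_fd_create("vhost-log-demo", size);

    if (fd < 0) {
        return -1;
    }
    map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    map[0] = 'q';
    if (pread(fd, &c, 1, 0) != 1) {
        c = 0;
    }
    munmap(map, size);
    close(fd);
    return c == 'q' ? 0 : -1;
}
```

Either way the caller only ever sees a descriptor; the difference is that the
fallback briefly creates a name in a directory the security driver has to
allow.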

>> 
>> Note that the filename, per se, is not as important as other files, 
>> since qemu won't provide it for being accessed by external programs, and,
>> deletes the file, while keeping the descriptor, right after its creation
>> (due to its nature, that is probably why it was created in /tmp).
> 
> If it isn't shared with other processes, and is deleted immediately,
> why does the file need to be on disk at all ?

Well, it unlinks the file, but the references are still there as long as the 
descriptor isn't closed by this process, or by the one that receives the 
descriptor (that is why the "unlink" happens so early). 
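
A small sketch of that behaviour, assuming a writable /tmp: the name
disappears at unlink() time, but the data stays reachable through the open
descriptor until the last reference is closed. The function name and the
file name prefix are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a temp file, unlinks it right away, then writes and reads back
 * through the still-open descriptor. Returns 0 on success, -1 otherwise. */
int unlinked_file_roundtrip(void)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";   /* hypothetical prefix */
    const char msg[] = "still here";
    char buf[sizeof(msg)];
    struct stat st;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }
    /* The directory entry is gone: nobody can open the file by name. */
    if (stat(path, &st) == 0) {
        close(fd);
        return -1;
    }
    /* The open descriptor still refers to the file contents. */
    memset(buf, 0, sizeof(buf));
    if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg) ||
        lseek(fd, 0, SEEK_SET) != 0 ||
        read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        close(fd);
        return -1;
    }
    close(fd);                                  /* storage is released here */
    return strcmp(buf, msg) == 0 ? 0 : -1;
}
```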

If you check vhost_dev_log_resize(), it gets *possible* new vhost log (if a new 
size is given) and informs the vhost dev driver about the new log base 
(vhost_ops->vhost_set_log_base). 

For vhost_user, this means that the file descriptors for vhost logs are likely 
going to be passed to vhost backend (fds[] in vhost_user_set_log_base). This is 
just one example, not sure about others. 

Probably the best approach here, like what Marc-André said, is to create some 
sort of TMPDIR, set by libvirt perhaps ?

> 
> Regards,
> Daniel

Re: [Qemu-devel] [PATCH] util: secure memfd_create fallback mechanism

2016-09-27 Thread Rafael David Tinoco

Hello!

> On Sep 27, 2016, at 08:13, Marc-André Lureau <mlur...@redhat.com> wrote:
> 
>> Note that the filename, per se, is not as important as other files,
>> since qemu won't provide it for being accessed by external programs, and,
>> deletes the file, while keeping the descriptor, right after its creation
>> (due to its nature, that is probably why it was created in /tmp).
>> 
>> Having libvirt to define a filename that would not be used for recent
>> kernels (> 3.17) and would exist for a fraction of second doesn't seem
>> right to me.
>> 
> 
> There are other parts of qemu that rely on creating temporary files, and this 
> seems to lack a bit of uniformity. Would it make sense to define a place 
> where qemu could create those? Or setting TMPDIR should help too. Could 
> libvirt set a per-vm TMPDIR with appropriate security rules?

You have a point. With a per-VM TMPDIR we don't have to care about filenames in 
the future for the security driver, while still securing them on a per-instance 
basis. I'll come back to you! 

Thank you!

Re: [Qemu-devel] [PATCH] util: secure memfd_create fallback mechanism

2016-09-27 Thread Rafael David Tinoco

> On Sep 27, 2016, at 05:36, Daniel P. Berrange <berra...@redhat.com> wrote:
> 
> On Tue, Sep 27, 2016 at 03:06:21AM +0000, Rafael David Tinoco wrote:
>> Commit: 35f9b6ef3acc9d0546c395a566b04e63ca84e302 added a fallback
>> mechanism for systems not supporting memfd_create syscall (started
>> being supported since 3.17).
> 
> This is really dubious code in general and IMHO should just
> be reverted.

There are numerous people relying on older kernels in openstack 
deployments - sometimes with specific drivers (ovswitch, dpdk, 
infiniband) holding kernel upgrades - but still in need of upgrading 
userland (e.g. newer releases). Having a fallback mechanism seems 
appropriate for those cases.

> 
> We have a golden rule that any time QEMU needs to be able to
> create a file on disk, then the path should be explicitly
> provided as a command line argument so that mgmt apps can
> control the location used.
> 
>> Backporting memfd_create might not be accepted for distros relying
>> on older kernels. Nowadays there is no way for security driver
>> to discover memfd filename to be created: /memfd-XX.
>> 
>> It is more appropriate to include UUID and/or VM names in the
>> temporary filename, allowing security driver rules to be applied
>> while maintaining the required unpredictability with mkstemp.
> 
> We should not have QEMU creating unpredictable filenames in the
> first place - any filenames should be determined by libvirt
> explicitly.

Note that the filename, per se, is not as important as other files, 
since qemu won't provide it for being accessed by external programs, and,
deletes the file, while keeping the descriptor, right after its creation
(due to its nature, that is probably why it was created in /tmp).

Having libvirt to define a filename that would not be used for recent
kernels (> 3.17) and would exist for a fraction of second doesn't seem
right to me. 

> 
>> This change will allow libvirt to know exact memfd file to be created
>> for vhost log AND to create appropriate security rules to allow access
>> per instance (instead of a opened rule like /memfd-*).
> 
> Even with this change it is bad - we don't want driver backends
> creating arbitrary files in the shared /tmp directory.

On the other hand, if we are creating a tmp file, like I said, I see 
benefit in having unpredictability (mkstemp) while providing predictable
parts that allow the security driver to apply rules on a per-instance basis 
(/tmp/memfd-UUID*, /tmp/memfd-VMname*). 
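
As a standalone sketch of that naming scheme (predictable per-instance tag,
random mkstemp suffix), assuming plain libc instead of the glib and QMP
helpers the patch uses. create_tagged_tmpfile and tagged_tmpfile_demo are
hypothetical names; the UUID is the one from the example log in the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Builds "<dir>/memfd-<tag>-XXXXXX" (or "<dir>/memfd-XXXXXX" without a tag)
 * in `path` and lets mkstemp() replace the X's. Returns the fd or -1. */
int create_tagged_tmpfile(const char *dir, const char *tag,
                          char *path, size_t pathlen)
{
    int n;

    assert(dir != NULL && path != NULL);
    if (tag != NULL && tag[0] != '\0') {
        n = snprintf(path, pathlen, "%s/memfd-%s-XXXXXX", dir, tag);
    } else {
        n = snprintf(path, pathlen, "%s/memfd-XXXXXX", dir);
    }
    if (n < 0 || (size_t)n >= pathlen) {
        return -1;                      /* template did not fit */
    }
    return mkstemp(path);
}

/* Checks that the created name keeps the predictable prefix and gains a
 * six character random suffix. Returns 0 on success, -1 otherwise. */
int tagged_tmpfile_demo(void)
{
    const char *uuid = "0b96011f-0dc0-44a3-92c3-196de2efab6d";
    const char *prefix = "/tmp/memfd-0b96011f-0dc0-44a3-92c3-196de2efab6d-";
    char path[128];
    int ok;
    int fd = create_tagged_tmpfile("/tmp", uuid, path, sizeof(path));

    if (fd < 0) {
        return -1;
    }
    ok = strncmp(path, prefix, strlen(prefix)) == 0 &&
         strlen(path) == strlen(prefix) + 6;
    unlink(path);
    close(fd);
    return ok ? 0 : -1;
}
```

A security profile could then allow only names starting with the instance's
own prefix instead of every memfd-* file in the directory.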

Looking forward to a decision so I can backport correct behaviour
(with or without memfd file).  

Thank you!

Best Regards,
Rafael

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-09-26 Thread Rafael David Tinoco

I'll follow to see if patch was accepted upstream:

https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg06191.html
https://www.mail-archive.com/qemu-devel@nongnu.org/msg400892.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

Bug description:
  And, when libvirt starts using apparmor, and creating apparmor
  profiles for every virtual machine created in the compute nodes,
  mitaka qemu (2.5 - and upstream also) uses a fallback mechanism for
  creating shared memory for live-migrations. This fall back mechanism,
  on kernels 3.13 - that don't have memfd_create() system-call, try to
  create files on /tmp/ directory and fails.. causing live-migration not
  to work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, logic is on :

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd)
  {
      if (memfd_create)... ### only works with HWE kernels

      else ### 3.13 kernels, gets blocked by apparmor
          tmpdir = g_get_tmp_dir
          ...
          mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes), the live migration works as
  expected, so, clearly, apparmor is stepping into the live migration.
  I'm sure that virtual machines have to be confined and that this isn't
  the desired behaviour...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1626972/+subscriptions

[Qemu-devel] [PATCH] util: secure memfd_create fallback mechanism

2016-09-26 Thread Rafael David Tinoco

Commit: 35f9b6ef3acc9d0546c395a566b04e63ca84e302 added a fallback
mechanism for systems not supporting memfd_create syscall (started
being supported since 3.17).

Backporting memfd_create might not be accepted for distros relying
on older kernels. Nowadays there is no way for security driver
to discover memfd filename to be created: /memfd-XXXXXX.

It is more appropriate to include UUID and/or VM names in the
temporary filename, allowing security driver rules to be applied
while maintaining the required unpredictability with mkstemp.

This change will allow libvirt to know exact memfd file to be created
for vhost log AND to create appropriate security rules to allow access
per instance (instead of an open rule like /memfd-*).

Example of apparmor deny messages with this change:

Per VM UUID (preferred, generated automatically by libvirt):

kernel: [26632.154856] type=1400 audit(1474945148.633:78): apparmor=
"DENIED" operation="mknod" profile="libvirt-0b96011f-0dc0-44a3-92c3-
196de2efab6d" name="/tmp/memfd-0b96011f-0dc0-44a3-92c3-196de2efab6d-
qeHrBV" pid=75161 comm="qemu-system-x86" requested_mask="c" denied_
mask="c" fsuid=107 ouid=107

Per VM name (if no UUID is specified):

kernel: [26447.505653] type=1400 audit(1474944963.985:72): apparmor=
"DENIED" operation="mknod" profile="libvirt-----
" name="/tmp/memfd-instance-teste-osYpHh" pid=74648
comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107
ouid=107

Signed-off-by: Rafael David Tinoco <rafael.tin...@canonical.com>
---
 util/memfd.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/util/memfd.c b/util/memfd.c
index 4571d1a..4b715ac 100644
--- a/util/memfd.c
+++ b/util/memfd.c
@@ -30,6 +30,9 @@
 #include 
 
 #include "qemu/memfd.h"
+#include "qmp-commands.h"
+#include "qemu-common.h"
+#include "sysemu/sysemu.h"
 
 #ifdef CONFIG_MEMFD
 #include 
@@ -94,11 +97,32 @@ void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
             return NULL;
         }
     } else {
+        int ret = 0;
         const char *tmpdir = g_get_tmp_dir();
+        UuidInfo *uinfo;
+        NameInfo *ninfo;
         gchar *fname;
 
-        fname = g_strdup_printf("%s/memfd-XXXXXX", tmpdir);
+        uinfo = qmp_query_uuid(NULL);
+
+        ret = strcmp(uinfo->UUID, UUID_NONE);
+        if (ret == 0) {
+            ninfo = qmp_query_name(NULL);
+            if (ninfo->has_name) {
+                fname = g_strdup_printf("%s/memfd-%s-XXXXXX", tmpdir,
+                                        ninfo->name);
+            } else {
+                fname = g_strdup_printf("%s/memfd-XXXXXX", tmpdir);
+            }
+            qapi_free_NameInfo(ninfo);
+        } else {
+            fname = g_strdup_printf("%s/memfd-%s-XXXXXX", tmpdir,
+                                    uinfo->UUID);
+        }
+
         mfd = mkstemp(fname);
+
+        qapi_free_UuidInfo(uinfo);
         unlink(fname);
         g_free(fname);
 
-- 
2.9.3

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-09-23 Thread Rafael David Tinoco

Fixed it according to checkpatch.pl as stated in
http://wiki.qemu.org/Contribute/SubmitAPatch.

http://paste.ubuntu.com/23220104/

Will submit to mailing list after testing everything.

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-09-23 Thread Rafael David Tinoco

I came up with this patch for QEMU:

http://paste.ubuntu.com/23217056/

I'm finishing the libvirt patch so I can propose the QEMU change upstream
already sure that libvirt will benefit from it. Right after, I'll propose
the libvirt upstream patch (changing the virt-aa-helper logic).

And later:

Improved it a little bit: http://paste.ubuntu.com/23217333/

And fixed it:

http://paste.ubuntu.com/23219599/
(Probably the version to be suggested to upstream)

** Changed in: qemu
   Status: New => In Progress

** Changed in: qemu
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

Bug description:
  And, when libvirt starts using apparmor, and creating apparmor
  profiles for every virtual machine created in the compute nodes,
  mitaka qemu (2.5 - and upstream also) uses a fallback mechanism for
  creating shared memory for live-migrations. This fall back mechanism,
  on kernels 3.13 - that don't have memfd_create() system-call, try to
  create files on /tmp/ directory and fails.. causing live-migration not
  to work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, logic is on :

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
 tmpdir = g_get_tmp_dir
 ...
 mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When libvirt runs without the AppArmor capability (and so does not
  confine virtual machines on the compute nodes), live migration works
  as expected, so AppArmor is clearly what interferes with the live
  migration. I'm sure that virtual machines have to be confined and that
  this isn't the desired behaviour...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-09-23 Thread Rafael David Tinoco

Related: https://bugs.launchpad.net/nova/+bug/1613423

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

Bug description:
  When libvirt uses AppArmor and creates an AppArmor profile for every
  virtual machine on the compute nodes, Mitaka qemu (2.5, and upstream
  as well) uses a fallback mechanism to create shared memory for live
  migration. On 3.13 kernels, which don't have the memfd_create() system
  call, this fallback mechanism tries to create files in the /tmp/
  directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  In qemu 2.5, the logic is:

  void *qemu_memfd_alloc(const char *name, size_t size,
                         unsigned int seals, int *fd)
  {
      if (memfd_create)...   /* only works with HWE kernels */

      else                   /* 3.13 kernels: gets blocked by apparmor */
          tmpdir = g_get_tmp_dir
          ...
          mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When libvirt runs without the AppArmor capability (and so does not
  confine virtual machines on the compute nodes), live migration works
  as expected, so AppArmor is clearly what interferes with the live
  migration. I'm sure that virtual machines have to be confined and that
  this isn't the desired behaviour...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1626972/+subscriptions

[Qemu-devel] [Bug 1626972] [NEW] QEMU memfd_create fallback mechanism change for security drivers

2016-09-23 Thread Rafael David Tinoco

Public bug reported:

When libvirt uses AppArmor and creates an AppArmor profile for every
virtual machine on the compute nodes, Mitaka qemu (2.5, and upstream as
well) uses a fallback mechanism to create shared memory for live
migration. On 3.13 kernels, which don't have the memfd_create() system
call, this fallback mechanism tries to create files in the /tmp/
directory and fails, so live migration does not work.

Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
can't live migrate.

In qemu 2.5, the logic is:

void *qemu_memfd_alloc(const char *name, size_t size,
                       unsigned int seals, int *fd)
{
    if (memfd_create)...   /* only works with HWE kernels */

    else                   /* 3.13 kernels: gets blocked by apparmor */
        tmpdir = g_get_tmp_dir
        ...
        mfd = mkstemp(fname)
}
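
The two paths outlined above can be illustrated with a small standalone
C fragment. This is only a minimal sketch under stated assumptions, not
QEMU's implementation: alloc_shared_fd is a hypothetical name, the
TMPDIR lookup stands in for g_get_tmp_dir(), sealing is left out, and
the code is Linux-specific.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): return a file descriptor backed by
 * `size` bytes that can later be mapped as shared memory, or -1 on failure.
 */
int alloc_shared_fd(const char *name, size_t size)
{
    int fd = -1;

    (void)name; /* unused when the memfd_create syscall number is unknown */

#ifdef SYS_memfd_create
    /* Preferred path: anonymous memory file, no filesystem path involved.
     * On kernels without the syscall this fails with ENOSYS. */
    fd = (int)syscall(SYS_memfd_create, name, 0u);
#endif

    if (fd < 0) {
        /* Fallback path: temporary file in the temp directory. This is the
         * step a confining profile can deny (the mknod on /tmp/memfd-*). */
        const char *tmpdir = getenv("TMPDIR");
        char path[4096];
        int n = snprintf(path, sizeof(path), "%s/memfd-XXXXXX",
                         tmpdir ? tmpdir : "/tmp");

        if (n < 0 || (size_t)n >= sizeof(path)) {
            errno = ENAMETOOLONG;
            return -1;
        }
        fd = mkstemp(path);
        if (fd < 0) {
            return -1; /* e.g. EACCES when the profile forbids /tmp */
        }
        unlink(path); /* keep only the descriptor, not the name */
    }

    /* Give the descriptor its final size on either path. */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would typically mmap() the returned descriptor with MAP_SHARED.
The point of the sketch is that only the fallback branch touches /tmp,
which is why the failure shows up only on kernels without
memfd_create().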

And you can see the errors:

From the host trying to send the virtual machine:

2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

From the host trying to receive the virtual machine:

Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

When libvirt runs without the AppArmor capability (and so does not
confine virtual machines on the compute nodes), live migration works as
expected, so AppArmor is clearly what interferes with the live
migration. I'm sure that virtual machines have to be confined and that
this isn't the desired behaviour...
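
The denials in the kernel log above can be picked out mechanically by
reading the key="value" fields of each audit record. The following is a
rough sketch only: audit_field is a hypothetical helper written for this
illustration, and it assumes the wrapped log line has been joined back
into a single string with fields separated by spaces.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of key="value" from an audit record
 * into out. Returns 1 when the field is found and fits, 0 otherwise.
 */
int audit_field(const char *line, const char *key, char *out, size_t outlen)
{
    char pattern[64];
    int n = snprintf(pattern, sizeof(pattern), "%s=\"", key);

    if (n < 0 || (size_t)n >= sizeof(pattern) || outlen == 0) {
        return 0;
    }

    /* Find key=" at the start of a field, not as the tail of a longer key. */
    const char *p = line;
    while ((p = strstr(p, pattern)) != NULL) {
        if (p == line || p[-1] == ' ') {
            break;
        }
        p++;
    }
    if (p == NULL) {
        return 0;
    }

    const char *start = p + n;
    const char *end = strchr(start, '"');
    if (end == NULL) {
        return 0;
    }

    size_t len = (size_t)(end - start);
    if (len >= outlen) {
        return 0; /* value does not fit in the caller's buffer */
    }
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

For the mknod record quoted above, asking for "apparmor", "operation"
and "name" would yield DENIED, mknod and /tmp/memfd-tNpKSj, which is the
temporary file the fallback path tried to create.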

** Affects: qemu
 Importance: Undecided
 Assignee: Rafael David Tinoco (inaddy)
 Status: In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in QEMU:
  In Progress

