[PATCH v15 12/12] tests/bios-tables-test: add test cases for ACPI HMAT

2019-11-07 Thread Tao Xu
The ACPI HMAT table has been introduced; QEMU now builds HMAT tables for
Heterogeneous Memory when the boot option '-numa node' is used.

Add test cases on PC and Q35 machines with two NUMA nodes.
Because HMAT is only generated when the system enables NUMA, the
following tables need to be added for this test:
tests/data/acpi/pc/APIC.acpihmat
tests/data/acpi/pc/SRAT.acpihmat
tests/data/acpi/pc/HMAT.acpihmat
tests/data/acpi/pc/DSDT.acpihmat
tests/data/acpi/q35/APIC.acpihmat
tests/data/acpi/q35/SRAT.acpihmat
tests/data/acpi/q35/HMAT.acpihmat
tests/data/acpi/q35/DSDT.acpihmat

Reviewed-by: Igor Mammedov 
Reviewed-by: Daniel Black 
Reviewed-by: Jingqi Liu 
Suggested-by: Igor Mammedov 
Signed-off-by: Tao Xu 
---

Changes in v15:
- Make tests without breaking CI (Michael)

Changes in v13:
- Use decimal notation with appropriate suffix for cache size
---
 tests/bios-tables-test-allowed-diff.h |  8 +
 tests/bios-tables-test.c  | 44 +++
 tests/data/acpi/pc/APIC.acpihmat  |  0
 tests/data/acpi/pc/DSDT.acpihmat  |  0
 tests/data/acpi/pc/HMAT.acpihmat  |  0
 tests/data/acpi/pc/SRAT.acpihmat  |  0
 tests/data/acpi/q35/APIC.acpihmat |  0
 tests/data/acpi/q35/DSDT.acpihmat |  0
 tests/data/acpi/q35/HMAT.acpihmat |  0
 tests/data/acpi/q35/SRAT.acpihmat |  0
 10 files changed, 52 insertions(+)
 create mode 100644 tests/data/acpi/pc/APIC.acpihmat
 create mode 100644 tests/data/acpi/pc/DSDT.acpihmat
 create mode 100644 tests/data/acpi/pc/HMAT.acpihmat
 create mode 100644 tests/data/acpi/pc/SRAT.acpihmat
 create mode 100644 tests/data/acpi/q35/APIC.acpihmat
 create mode 100644 tests/data/acpi/q35/DSDT.acpihmat
 create mode 100644 tests/data/acpi/q35/HMAT.acpihmat
 create mode 100644 tests/data/acpi/q35/SRAT.acpihmat

diff --git a/tests/bios-tables-test-allowed-diff.h 
b/tests/bios-tables-test-allowed-diff.h
index dfb8523c8b..3c9e0c979b 100644
--- a/tests/bios-tables-test-allowed-diff.h
+++ b/tests/bios-tables-test-allowed-diff.h
@@ -1 +1,9 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/APIC.acpihmat",
+"tests/data/acpi/pc/SRAT.acpihmat",
+"tests/data/acpi/pc/HMAT.acpihmat",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/q35/APIC.acpihmat",
+"tests/data/acpi/q35/SRAT.acpihmat",
+"tests/data/acpi/q35/HMAT.acpihmat",
+"tests/data/acpi/q35/DSDT.acpihmat",
diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 0b33fb265f..96803c1f20 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -947,6 +947,48 @@ static void test_acpi_virt_tcg_numamem(void)
 
 }
 
+static void test_acpi_tcg_acpi_hmat(const char *machine)
+{
+test_data data;
+
+memset(&data, 0, sizeof(data));
+data.machine = machine;
+data.variant = ".acpihmat";
+test_acpi_one(" -machine hmat=on"
+  " -smp 2,sockets=2"
+  " -m 128M,slots=2,maxmem=1G"
+  " -object memory-backend-ram,size=64M,id=m0"
+  " -object memory-backend-ram,size=64M,id=m1"
+  " -numa node,nodeid=0,memdev=m0"
+  " -numa node,nodeid=1,memdev=m1,initiator=0"
+  " -numa cpu,node-id=0,socket-id=0"
+  " -numa cpu,node-id=0,socket-id=1"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=5ns"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=500M"
+  " -numa hmat-lb,initiator=0,target=1,hierarchy=memory,"
+  "data-type=access-latency,latency=10ns"
+  " -numa hmat-lb,initiator=0,target=1,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=100M"
+  " -numa hmat-cache,node-id=0,size=10K,level=1,assoc=direct,"
+  "policy=write-back,line=8"
+  " -numa hmat-cache,node-id=1,size=10K,level=1,assoc=direct,"
+  "policy=write-back,line=8",
+  &data);
+free_test_data(&data);
+}
+
+static void test_acpi_q35_tcg_acpi_hmat(void)
+{
+test_acpi_tcg_acpi_hmat(MACHINE_Q35);
+}
+
+static void test_acpi_piix4_tcg_acpi_hmat(void)
+{
+test_acpi_tcg_acpi_hmat(MACHINE_PC);
+}
+
 static void test_acpi_virt_tcg(void)
 {
 test_data data = {
@@ -991,6 +1033,8 @@ int main(int argc, char *argv[])
 qtest_add_func("acpi/q35/numamem", test_acpi_q35_tcg_numamem);
 qtest_add_func("acpi/piix4/dimmpxm", test_acpi_piix4_tcg_dimm_pxm);
 qtest_add_func("acpi/q35/dimmpxm", test_acpi_q35_tcg_dimm_pxm);
+qtest_add_func("acpi/piix4/acpihmat", test_acpi_piix4_tcg_acpi_hmat);
+qtest_add_func("acpi/q35/acpihmat", test_acpi_q35_tcg_acpi_hmat);
 } else if (strcmp(arch, "aarch64") == 0) {
 qtest_add_func("acpi/virt", test_acpi_virt_tcg);
qtest_add_func("acpi/virt/numamem", test_acpi_virt_tcg_numamem);

Re: [RFC PATCH 06/18] qemu-storage-daemon: Add --nbd-server option

2019-11-07 Thread Kevin Wolf
Am 06.11.2019 um 20:25 hat Eric Blake geschrieben:
> On 11/6/19 6:51 AM, Max Reitz wrote:
> > On 17.10.19 15:01, Kevin Wolf wrote:
> > > Add a --nbd-server option to qemu-storage-daemon to start the built-in
> > > NBD server right away. It maps the arguments for nbd-server-start to the
> > > command line.
> > 
> > Well, it doesn’t quite, because nbd-server-start takes a
> > SocketAddressLegacy, and this takes a SocketAddress.
> > 
> > On one hand I can understand why you would do it differently (especially
> > for command-line options), but on the other I find it a bit problematic
> > to have --nbd-server be slightly different from nbd-server-start when
> > both are intended to be the same.
> > 
> > My biggest problem though lies in the duplication in the QAPI schema.
> > If NbdServerOptions.addr were a SocketAddressLegacy, we could let
> > nbd-server-start’s options just be of type NbdServerOptions and thus get
> > rid of the duplication.
> 
> I would love to somehow deprecate the use of SocketAddressLegacy and get QMP
> nbd-server-start to accept SocketAddress instead.  Maybe it could be done by
> adding a new nbd-server-begin command in 5.0 with a saner wire layout, and
> deprecating nbd-server-start at that time; by the 5.2 release, we could then
> drop nbd-server-start.  But we're too late for 4.2.

As a replacement for nbd-server-add, I envisioned adding something like a
block-export-add, which would work the way that --export already does.
It would also come with query-block-exports and block-export-del, and it
wouldn't cover only NBD exports, but also vhost-user, FUSE, etc.

Now I'm wondering if the same would make sense for nbd-server-start.
Maybe an API change would even allow us to start multiple NBD servers
(e.g. listening on different IP addresses or using different tls-creds).

Kevin
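For illustration, a purely hypothetical shape such commands could take (none of these commands exist at this point; the names and fields are invented here):

```
{ "execute": "block-export-add",
  "arguments": { "type": "nbd",
                 "node-name": "disk0",
                 "name": "export0",
                 "writable": true } }
{ "execute": "block-export-del",
  "arguments": { "name": "export0" } }
```

The idea being that the same command would later accept "type": "vhost-user" or "type": "fuse" without needing a separate per-protocol API.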




[PULL 0/1] Usb 20191107 patches

2019-11-07 Thread Gerd Hoffmann
The following changes since commit 412fbef3d076c43e56451bacb28c4544858c66a3:

  Merge remote-tracking branch 
'remotes/philmd-gitlab/tags/fw_cfg-next-pull-request' into staging (2019-11-05 
20:17:11 +)

are available in the Git repository at:

  git://git.kraxel.org/qemu tags/usb-20191107-pull-request

for you to fetch changes up to 1dfe2b91dcb1633d0ba450a8139d53006e700a9b:

  usb-host: add option to allow all resets. (2019-11-06 13:26:04 +0100)


usb: fix for usb-host



Gerd Hoffmann (1):
  usb-host: add option to allow all resets.

 hw/usb/host-libusb.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

-- 
2.18.1




Re: [RFC v2 02/14] standard-headers: import arm_sdei.h

2019-11-07 Thread Cornelia Huck
On Thu, 7 Nov 2019 09:40:49 +0800
Guoheyi  wrote:

> On 2019/11/7 1:52, Cornelia Huck wrote:
> > On Tue, 5 Nov 2019 17:10:44 +0800
> > Heyi Guo  wrote:
> >  
> >> Import Linux header file include/uapi/linux/arm_sdei.h from kernel 
> >> v5.4-rc5.
> >>
> >> This is to prepare for qemu SDEI emulation.
> >>
> >> Signed-off-by: Heyi Guo 
> >> Cc: Peter Maydell 
> >> Cc: Dave Martin 
> >> Cc: Marc Zyngier 
> >> Cc: Mark Rutland 
> >> Cc: James Morse 
> >> Cc: "Michael S. Tsirkin" 
> >> Cc: Cornelia Huck 
> >> Cc: Paolo Bonzini 
> >> ---
> >>
> >> Notes:
> >>  v2:
> >>  - Import arm_sdei.h by running update-linux-headers.sh
> >>
> >>   include/standard-headers/linux/arm_sdei.h | 73 +++
> >>   1 file changed, 73 insertions(+)
> >>   create mode 100644 include/standard-headers/linux/arm_sdei.h  
> > Just a remark that I find it a bit odd that a header that looks
> > arm-specific is in the generic linux/ directory (already in the kernel,
> > I know.) Is this for sharing between arm and arm64, maybe?  
> I don't think arm platforms will use this header. In section 4.1 of SDEI 
> spec, it says " Both the client and dispatcher of SDEI must execute in 
> AArch64 state."
> So shall we move it to include/standard-headers/asm-arm64/?

Well, the kernel already put it into the generic directory... I'd just
leave it like that, then; moving it in the kernel is probably too much
churn.




Re: [RFC v2 11/14] linux-headers/kvm.h: add capability to forward hypercall

2019-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2019 at 09:44:36AM +0800, Guoheyi wrote:
> 
> 
> On 2019/11/7 1:55, Cornelia Huck wrote:
> > On Tue, 5 Nov 2019 17:10:53 +0800
> > Heyi Guo  wrote:
> > 
> > > To keep backward compatibility, we add new KVM capability
> > > "KVM_CAP_FORWARD_HYPERCALL" to probe whether KVM supports forwarding
> > > hypercall to userspace.
> > > 
> > > The capability should be enabled explicitly, for we don't want user
> > > space application to deal with unexpected hypercall exits. After
> > > enabling this cap, all HVC calls unhandled by kvm will be forwarded to
> > > user space.
> > > 
> > > Signed-off-by: Heyi Guo 
> > > Cc: Peter Maydell 
> > > Cc: "Michael S. Tsirkin" 
> > > Cc: Cornelia Huck 
> > > Cc: Paolo Bonzini 
> > > Cc: Dave Martin 
> > > Cc: Marc Zyngier 
> > > Cc: Mark Rutland 
> > > Cc: James Morse 
> > > ---
> > >   linux-headers/linux/kvm.h |  1 +
> > >   target/arm/sdei.c | 16 
> > >   target/arm/sdei.h |  2 ++
> > >   3 files changed, 19 insertions(+)
> > > 
> > > diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> > > index 3d9b18f7f8..36c9b3859f 100644
> > > --- a/linux-headers/linux/kvm.h
> > > +++ b/linux-headers/linux/kvm.h
> > > @@ -1000,6 +1000,7 @@ struct kvm_ppc_resize_hpt {
> > >   #define KVM_CAP_PMU_EVENT_FILTER 173
> > >   #define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174
> > >   #define KVM_CAP_HYPERV_DIRECT_TLBFLUSH 175
> > > +#define KVM_CAP_FORWARD_HYPERCALL 176
> > >   #ifdef KVM_CAP_IRQ_ROUTING
> > Is this cap upstream already? I would have thought your header sync
> > would have brought it in, then. (Saying this, that header sync looks
> > awfully small.)
> > 
> > If it is not upstream yet, please split off this hunk into a separate
> > patch -- it's a bit annoying, but makes life easier for merging.
> No, it is not upstream yet. The whole framework and interfaces between KVM
> and qemu are still under discussion. I'll keep in mind of this when moving
> forward to next steps...
> 
> Thanks,
> HG

It's best to add it in some other place meanwhile.
Then we can drop it when it's in an upstream header.


> > 
> > 
> > .
> > 
> 



Re: [PATCH v2 03/21] iotests: Add _filter_json_filename

2019-11-07 Thread Max Reitz
On 06.11.19 16:44, Maxim Levitsky wrote:
> On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/common.filter | 24 
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/common.filter 
>> b/tests/qemu-iotests/common.filter
>> index 9f418b4881..63bc6f6f26 100644
>> --- a/tests/qemu-iotests/common.filter
>> +++ b/tests/qemu-iotests/common.filter
>> @@ -227,5 +227,29 @@ _filter_qmp_empty_return()
>>  grep -v '{"return": {}}'
>>  }
>>  
>> +_filter_json_filename()
>> +{
>> +$PYTHON -c 'import sys
>> +result, *fnames = sys.stdin.read().split("json:{")
> 
> Very minor nitpick: maybe I would give 'fnames' a more generic name,
> since it is just the result of a split, and thus not really a list of filenames.
> Feel free to ignore that though.

Hm...  It is a list of filenames, namely of all nested json:{}
filenames.  I could call it fname_split, but I actually think fnames is
not too wrong.

In any case, thanks for reviewing again!

Max
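For illustration, here is what that split produces on a hypothetical nested json:{} filename (the reassembly logic itself lives in the patch):

```python
text = 'json:{"driver": "qcow2", "file": json:{"driver": "file"}}'
result, *fnames = text.split("json:{")

print(repr(result))  # nothing precedes the first json:{ filename here
print(fnames)        # the remainders of the outer and the nested filename
```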





Re: [RFC v2 02/14] standard-headers: import arm_sdei.h

2019-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2019 at 09:40:49AM +0800, Guoheyi wrote:
> 
> 
> On 2019/11/7 1:52, Cornelia Huck wrote:
> > On Tue, 5 Nov 2019 17:10:44 +0800
> > Heyi Guo  wrote:
> > 
> > > Import Linux header file include/uapi/linux/arm_sdei.h from kernel 
> > > v5.4-rc5.
> > > 
> > > This is to prepare for qemu SDEI emulation.
> > > 
> > > Signed-off-by: Heyi Guo 
> > > Cc: Peter Maydell 
> > > Cc: Dave Martin 
> > > Cc: Marc Zyngier 
> > > Cc: Mark Rutland 
> > > Cc: James Morse 
> > > Cc: "Michael S. Tsirkin" 
> > > Cc: Cornelia Huck 
> > > Cc: Paolo Bonzini 
> > > ---
> > > 
> > > Notes:
> > >  v2:
> > >  - Import arm_sdei.h by running update-linux-headers.sh
> > > 
> > >   include/standard-headers/linux/arm_sdei.h | 73 +++
> > >   1 file changed, 73 insertions(+)
> > >   create mode 100644 include/standard-headers/linux/arm_sdei.h
> > Just a remark that I find it a bit odd that a header that looks
> > arm-specific is in the generic linux/ directory (already in the kernel,
> > I know.) Is this for sharing between arm and arm64, maybe?
> I don't think arm platforms will use this header. In section 4.1 of SDEI
> spec, it says " Both the client and dispatcher of SDEI must execute in
> AArch64 state."
> So shall we move it to include/standard-headers/asm-arm64/?
> 
> Thanks,
> HG


Yea, that's because it's used by drivers/firmware/arm_sdei.c, also flat
in the top level hierarchy. It's been like this historically.
If you want to do a small kernel project and reorganize
drivers/firmware/ according to the architecture,
then arm_sdei.h can move too.

Until that happens upstream let's just mirror what kernel does.

> > 
> > 
> > .
> > 
> 



[PULL 1/1] usb-host: add option to allow all resets.

2019-11-07 Thread Gerd Hoffmann
Commit 65f14ab98da1 ("usb-host: skip reset for untouched devices")
filters out multiple usb device resets in a row.  While this improves
the situation for some usb devices, it doesn't work for others :-(

So go add a config option to make the behavior configurable.

Buglink: https://bugs.launchpad.net/bugs/1846451
Signed-off-by: Gerd Hoffmann 
Message-id: 20191015064426.19454-1-kra...@redhat.com
---
 hw/usb/host-libusb.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index 472cc26fc403..fcf48c019333 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -86,7 +86,9 @@ struct USBHostDevice {
 uint32_t options;
 uint32_t loglevel;
 bool needs_autoscan;
-bool allow_guest_reset;
+bool allow_one_guest_reset;
+bool allow_all_guest_resets;
+
 /* state */
 QTAILQ_ENTRY(USBHostDevice)  next;
 int  seen, errcount;
@@ -1462,10 +1464,10 @@ static void usb_host_handle_reset(USBDevice *udev)
 USBHostDevice *s = USB_HOST_DEVICE(udev);
 int rc;
 
-if (!s->allow_guest_reset) {
+if (!s->allow_one_guest_reset && !s->allow_all_guest_resets) {
 return;
 }
-if (udev->addr == 0) {
+if (!s->allow_all_guest_resets && udev->addr == 0) {
 return;
 }
 
@@ -1586,7 +1588,10 @@ static Property usb_host_dev_properties[] = {
 DEFINE_PROP_UINT32("productid", USBHostDevice, match.product_id, 0),
 DEFINE_PROP_UINT32("isobufs",  USBHostDevice, iso_urb_count,4),
 DEFINE_PROP_UINT32("isobsize", USBHostDevice, iso_urb_frames,   32),
-DEFINE_PROP_BOOL("guest-reset", USBHostDevice, allow_guest_reset, true),
+DEFINE_PROP_BOOL("guest-reset", USBHostDevice,
+ allow_one_guest_reset, true),
+DEFINE_PROP_BOOL("guest-resets-all", USBHostDevice,
+ allow_all_guest_resets, false),
 DEFINE_PROP_UINT32("loglevel",  USBHostDevice, loglevel,
LIBUSB_LOG_LEVEL_WARNING),
 DEFINE_PROP_BIT("pipeline",USBHostDevice, options,
-- 
2.18.1
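With both properties from this patch, a host device that permits every guest-initiated reset could be configured like this (the hostbus/hostaddr values are placeholders):

```
qemu-system-x86_64 ... \
    -device usb-host,hostbus=1,hostaddr=3,guest-reset=on,guest-resets-all=on
```

Leaving guest-resets-all off (the default) keeps the existing behavior: at most the one reset of an untouched device is passed through.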




Re: [PATCH v2 05/21] iotests: Replace IMGOPTS by _unsupported_imgopts

2019-11-07 Thread Max Reitz
On 06.11.19 16:45, Maxim Levitsky wrote:
> On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
>> Some tests require compat=1.1 and thus set IMGOPTS='compat=1.1'
>> globally.  That is not how it should be done; instead, they should
>> simply set _unsupported_imgopts to compat=0.10 (compat=1.1 is the
>> default anyway).
>>
>> This makes the tests heed user-specified $IMGOPTS.  Some do not work
>> with all image options, though, so we need to disable them accordingly.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/036 | 3 +--
>>  tests/qemu-iotests/060 | 4 ++--
>>  tests/qemu-iotests/062 | 3 ++-
>>  tests/qemu-iotests/066 | 3 ++-
>>  tests/qemu-iotests/068 | 3 ++-
>>  tests/qemu-iotests/098 | 4 ++--
>>  6 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
>> index 5f929ad3be..bbaf0ef45b 100755
>> --- a/tests/qemu-iotests/036
>> +++ b/tests/qemu-iotests/036
>> @@ -43,9 +43,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>  # This tests qcow2-specific low-level functionality
>>  _supported_fmt qcow2
>>  _supported_proto file
>> -
>>  # Only qcow2v3 and later supports feature bits
>> -IMGOPTS="compat=1.1"
>> +_unsupported_imgopts 'compat=0.10'
>>  
>>  echo
>>  echo === Image with unknown incompatible feature bit ===
>> diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
>> index b91d8321bb..9c2ef42522 100755
>> --- a/tests/qemu-iotests/060
>> +++ b/tests/qemu-iotests/060
>> @@ -48,6 +48,8 @@ _filter_io_error()
>>  _supported_fmt qcow2
>>  _supported_proto file
>>  _supported_os Linux
>> +# These tests only work for compat=1.1 images with refcount_bits=16
>> +_unsupported_imgopts 'compat=0.10' 'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
> Looks like the reason for that is that the test hardcodes (or rather
> guesses) the location of various qcow2 structures, something I have seen
> a few times already in the iotests.
> Not now, but sometime in the future it would be nice to extend qcow2.py (or
> something like that) to dump the location of all qcow2 structures so that
> the guesswork could be eliminated.

With the peek_file* functions we have now it’s actually simple to dump
that location ($(peek_file_be "$TEST_IMG" 48 8) for the refcount table
offset, for example).

But it wouldn’t help, because compat=0.10 or refcount_bits != 16 won’t
change those locations.  So the locations aren’t the reason why we need
to forbid those options here.

The reason we need refcount_bits=16 is that we’re going to directly
manipulate a refcount block.  To do so, we need to know the refcount
width, and I don’t think it’s worth trying to implement something generic.

We need compat=1.1 because compat=0.10 doesn’t have feature bits, so
there’s no “corrupt” bit there.

Max
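As an aside, the `refcount_bits=\([^1]\|.\([^6]\|$\)\)` pattern in the hunk above is a BRE meant to match any refcount_bits value other than 16; translated to Python regex syntax for illustration:

```python
import re

# Match refcount_bits= followed by either a first character that is
# not '1', or any character followed by something that is not '6'
# (or end of string) -- i.e. any value other than 16.
pat = re.compile(r"refcount_bits=([^1]|.([^6]|$))")

assert not pat.search("refcount_bits=16")  # 16 is the supported width
assert pat.search("refcount_bits=1")       # single digit != 16
assert pat.search("refcount_bits=64")      # first digit != 1
assert pat.search("refcount_bits=12")      # second digit != 6
```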





Re: [PATCH v15 00/12] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2019 at 03:44:59PM +0800, Tao Xu wrote:
> This series of patches will build Heterogeneous Memory Attribute Table (HMAT)
> according to the command line. The ACPI HMAT describes the memory attributes,
> such as memory side cache attributes and bandwidth and latency details,
> related to the Memory Proximity Domain.
> The software is expected to use HMAT information as hint for optimization.
> 
> In the linux kernel, the code in drivers/acpi/hmat/hmat.c parses and reports
> the platform's HMAT tables.


ok this looks good to me.
Given we are in freeze, please ping me after the release to merge this.

> The V14 patches link:
> https://patchwork.kernel.org/cover/11214887/
> 
> Changelog:
> v15:
> - Add a new patch to refactor do_strtosz() (Eduardo)
> - Make tests without breaking CI (Michael)
> v14:
> - Reuse the code of do_strtosz to build qemu_strtotime_ns
>   (Eduardo)
> - Squash patch v13 01/12 and 02/12 together (Daniel and Eduardo)
> - Drop time unit picosecond (Eric)
> - Use qemu ctz64 and clz64 instead of builtin function
> v13:
> - Modify some text description
> - Drop "initiator_valid" field in struct NodeInfo
> - Reuse Garray to store the raw bandwidth and bandwidth data
> - Calculate common base unit using range bitmap
> - Add a patch to calculate hmat latency and bandwidth entry list
> - Drop the total_levels option and use readable cache size
> - Remove the unnecessary head file
> - Use decimal notation with appropriate suffix for cache size
> v12:
> - Fix a bug that a memory-only node without initiator setting
>   doesn't report error. (reported by Danmei Wei)
> - Fix a bug that if HMAT is enabled and without hmat-lb setting,
>   QEMU will crash. (reported by Danmei Wei)
> v11:
> - Move numa option patches forward.
> - Add num_initiator in Numa_state to record the number of
> initiators.
> - Simplify struct HMAT_LB_Info, use uint64_t array to store data.
> - Drop hmat_get_base().
> - Calculate base in build_hmat_lb().
> v10:
> - Add qemu_strtotime_ps() to convert strings with time suffixes
> to numbers, and add some tests for it.
> - Add qapi buildin type time, and add some tests for it.
> - Add machine option properties "-machine hmat=on|off" for enabling
> or disabling HMAT in QEMU.
> 
> Liu Jingqi (5):
>   numa: Extend CLI to provide memory latency and bandwidth information
>   numa: Extend CLI to provide memory side cache information
>   hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
>   hmat acpi: Build System Locality Latency and Bandwidth Information
> Structure(s)
>   hmat acpi: Build Memory Side Cache Information Structure(s)
> 
> Tao Xu (7):
>   util/cutils: refactor do_strtosz() to support suffixes list
>   util/cutils: Add qemu_strtotime_ns()
>   qapi: Add builtin type time
>   tests: Add test for QAPI builtin type time
>   numa: Extend CLI to provide initiator information for numa nodes
>   numa: Calculate hmat latency and bandwidth entry list
>   tests/bios-tables-test: add test cases for ACPI HMAT
> 
>  hw/acpi/Kconfig   |   7 +-
>  hw/acpi/Makefile.objs |   1 +
>  hw/acpi/hmat.c| 263 
>  hw/acpi/hmat.h|  42 
>  hw/core/machine.c |  64 ++
>  hw/core/numa.c| 284 +-
>  hw/i386/acpi-build.c  |   5 +
>  include/qapi/visitor-impl.h   |   4 +
>  include/qapi/visitor.h|   8 +
>  include/qemu/cutils.h |   1 +
>  include/sysemu/numa.h | 104 ++
>  qapi/machine.json | 178 +++-
>  qapi/opts-visitor.c   |  22 ++
>  qapi/qapi-visit-core.c|  12 ++
>  qapi/qobject-input-visitor.c  |  18 ++
>  qapi/trace-events |   1 +
>  qemu-options.hx   |  96 -
>  scripts/qapi/schema.py|   1 +
>  tests/bios-tables-test-allowed-diff.h |   8 +
>  tests/bios-tables-test.c  |  44 
>  tests/data/acpi/pc/APIC.acpihmat  |   0
>  tests/data/acpi/pc/DSDT.acpihmat  |   0
>  tests/data/acpi/pc/HMAT.acpihmat  |   0
>  tests/data/acpi/pc/SRAT.acpihmat  |   0
>  tests/data/acpi/q35/APIC.acpihmat |   0
>  tests/data/acpi/q35/DSDT.acpihmat |   0
>  tests/data/acpi/q35/HMAT.acpihmat |   0
>  tests/data/acpi/q35/SRAT.acpihmat |   0
>  tests/test-cutils.c   | 204 ++
>  tests/test-keyval.c   | 122 +++
>  tests/test-qobject-input-visitor.c|  29 +++
>  util/cutils.c |  86 +---
>  32 files changed, 1562 insertions(+), 42 deletions(-)
>  create mode 100644 hw/acpi/hmat.c
>  create mode 100644 hw/acpi/hmat.h
>  create mode 100644 tests/data/acpi/pc/APIC.ac
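The changelog above mentions refactoring do_strtosz() to accept a suffix list; a toy Python sketch of that idea (not QEMU's implementation; binary suffixes assumed):

```python
# Map each recognized suffix to its multiplier; an empty suffix
# means the bare number is taken as-is.
SUFFIXES = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def strtosz(s, suffixes=SUFFIXES):
    # split the string into its leading digits and trailing suffix
    i = len(s)
    while i > 0 and not s[i - 1].isdigit():
        i -= 1
    num, suffix = s[:i], s[i:].upper()
    if not num or suffix not in suffixes:
        raise ValueError(f"cannot parse size: {s!r}")
    return int(num) * suffixes[suffix]

print(strtosz("10K"))   # 10240
print(strtosz("500M"))  # 524288000
```

Passing a different suffix table (e.g. "ns"/"us"/"ms" multipliers) is what would turn the same core loop into something like qemu_strtotime_ns().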

[qemu-web PATCH] Add device fuzzing blog post

2019-11-07 Thread Stefan Hajnoczi
This blog post covers the device fuzzing GSoC project that Alexander
Oleinik did in 2019.

Cc: Alexander Oleinik 
Signed-off-by: Stefan Hajnoczi 
---
 _posts/2019-11-07-device-fuzzing.md |  73 
 screenshots/fuzzing-intro.png   | Bin 0 -> 66276 bytes
 screenshots/fuzzing.png | Bin 0 -> 100281 bytes
 3 files changed, 73 insertions(+)
 create mode 100644 _posts/2019-11-07-device-fuzzing.md
 create mode 100644 screenshots/fuzzing-intro.png
 create mode 100644 screenshots/fuzzing.png

diff --git a/_posts/2019-11-07-device-fuzzing.md 
b/_posts/2019-11-07-device-fuzzing.md
new file mode 100644
index 000..2881068
--- /dev/null
+++ b/_posts/2019-11-07-device-fuzzing.md
@@ -0,0 +1,73 @@
+---
+layout: post
+title:  "Fuzzing QEMU Device Emulation"
+date:   2019-11-07 07:50:00 +0200
+categories: [fuzzing, gsoc, internships]
+---
+QEMU (https://www.qemu.org/) emulates a large number of network cards, disk
+controllers, and other devices needed to simulate a virtual computer system,
+called the "guest".
+
+The guest is untrusted and QEMU may even be used to run malicious
+software, so it is important that bugs in emulated devices do not
+allow the guest to compromise QEMU and escape the confines of the
+guest. For this reason a Google Summer of Code project was undertaken
+to develop fuzz tests for emulated devices.
+
+![QEMU device emulation attack surface](/screenshots/fuzzing-intro.png)
+
+Fuzzing is a testing technique that feeds random inputs to a program
+in order to trigger bugs. Random inputs can be generated quickly
+without relying on human guidance and this makes fuzzing an automated
+testing approach.
+
+## Device Fuzzing
+Emulated devices are exposed to the guest through a set of registers
+and also through data structures located in guest RAM that are
+accessed by the device in a process known as Direct Memory Access
+(DMA). Fuzzing emulated devices involves mapping random inputs to the
+device registers and DMA memory structures in order to explore code
+paths in QEMU's device emulation code.
+
+![Device fuzzing overview](/screenshots/fuzzing.png)
+
+Fuzz testing discovered an assertion failure in the virtio-net network
+card emulation code in QEMU that can be triggered by a guest. Fixing
+such bugs is usually easy once fuzz testing has generated a reproducer.
+
+Modern fuzz testing intelligently selects random inputs such that new
+code paths are explored and previously-tested code paths are not
+tested repeatedly. This is called coverage-guided fuzzing and
+involves an instrumented program executable so the fuzzer can detect
+the code paths that are taken for a given input. This was
+surprisingly effective at automatically exploring the input space of
+emulated devices in QEMU without requiring the fuzz test author to
+provide detailed knowledge of device internals.
+
+## How Fuzzing was Integrated into QEMU
+Device fuzzing in QEMU is driven by the open source libfuzzer library
+(https://llvm.org/docs/LibFuzzer.html). A special build of QEMU
+includes device emulation fuzz tests and launches without running a
+normal guest. Instead the fuzz test directly programs device
+registers and stores random data into DMA memory structures.
+
+The next step for the QEMU project will be to integrate fuzzing into
+Google's OSS-Fuzz (https://google.github.io/oss-fuzz/) continuous
+fuzzing service. This will ensure that fuzz tests are automatically
+run after new code is merged into QEMU and bugs are reported to the
+community.
+
+## Conclusion
+Fuzzing emulated devices has already revealed bugs in QEMU that would
+have been time-consuming to find through manual testing approaches.
+So far only a limited number of devices have been fuzz-tested and we
+hope to increase this number now that the foundations have been laid.
+The goal is to integrate these fuzz tests into OSS-Fuzz so that fuzz
+testing happens continuously.
+
+This project would not have been possible without Google's generous
+funding of Google Summer of Code. Alexander Oleinik developed the
+fuzzing code and was mentored by Bandan Das, Paolo Bonzini, and Stefan
+Hajnoczi.
+
+This article was contributed by Stefan Hajnoczi and Alexander Oleinik.
diff --git a/screenshots/fuzzing-intro.png b/screenshots/fuzzing-intro.png
new file mode 100644
index 000..e130027
Binary files /dev/null and b/screenshots/fuzzing-intro.png differ
diff --git a/screenshots/fuzzing.png b/screenshots/fuzzing.png
new file mode 100644
index 000..2f15ecb
Binary files /dev/null and b/screenshots/fuzzing.png differ
-- 
2.23.0
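The coverage-guided loop the post describes can be sketched in miniature (a toy model only, not libFuzzer): inputs that reach new "coverage" — here, new matched prefix lengths of a magic string — are kept in the corpus and mutated further, which finds the target far faster than blind random search.

```python
import random

random.seed(0)

TARGET = b"HMAT"  # toy "bug": triggered only when the full magic matches

def run(data):
    """Return the set of coverage points (matched prefix lengths) hit."""
    cov = set()
    for i in range(min(len(data), len(TARGET))):
        if data[i] != TARGET[i]:
            break
        cov.add(i + 1)
    return cov

def fuzz(max_iters=200_000):
    corpus = [b"\x00" * len(TARGET)]
    seen = set()
    for _ in range(max_iters):
        parent = random.choice(corpus)
        data = bytearray(parent)
        data[random.randrange(len(data))] = random.randrange(256)
        data = bytes(data)
        cov = run(data)
        if cov - seen:          # new coverage: keep this input for mutation
            seen |= cov
            corpus.append(data)
            if len(seen) == len(TARGET):
                return data     # reproducer for the toy "bug"
    return None

print(fuzz())  # almost surely b'HMAT' with this seed
```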




Re: [PATCH 5/6] migration/postcopy: enable random order target page arrival

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> On Wed, Nov 06, 2019 at 08:08:28PM +, Dr. David Alan Gilbert wrote:
> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
> >> After using number of target page received to track one host page, we
> >> could have the capability to handle random order target page arrival in
> >> one host page.
> >> 
> >> This is a preparation for enabling compress during postcopy.
> >> 
> >> Signed-off-by: Wei Yang 
> >> ---
> >>  migration/ram.c | 16 +++-
> >>  1 file changed, 3 insertions(+), 13 deletions(-)
> >> 
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index b5759793a9..da0596411c 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -4015,7 +4015,6 @@ static int ram_load_postcopy(QEMUFile *f)
> >>  MigrationIncomingState *mis = migration_incoming_get_current();
> >>  /* Temporary page that is later 'placed' */
> >>  void *postcopy_host_page = mis->postcopy_tmp_page;
> >> -void *last_host = NULL;
> >>  bool all_zero = false;
> >>  int target_pages = 0;
> >>  
> >> @@ -4062,24 +4061,15 @@ static int ram_load_postcopy(QEMUFile *f)
> >>   * that's moved into place later.
> >>   * The migration protocol uses,  possibly smaller, 
> >> target-pages
> >>   * however the source ensures it always sends all the 
> >> components
> >> - * of a host page in order.
> >> + * of a host page in one chunk.
> >>   */
> >>  page_buffer = postcopy_host_page +
> >>((uintptr_t)host & (block->page_size - 1));
> >>  /* If all TP are zero then we can optimise the place */
> >>  if (target_pages == 1) {
> >>  all_zero = true;
> >> -} else {
> >> -/* not the 1st TP within the HP */
> >> -if (host != (last_host + TARGET_PAGE_SIZE)) {
> >> -error_report("Non-sequential target page %p/%p",
> >> -  host, last_host);
> >> -ret = -EINVAL;
> >> -break;
> >> -}
> >
> >I think this is losing more protection than needed.
> >I think you can still protect against a page from a different host-page
> >arriving until we've placed the current host-page.
> >So something like:
> >
> >if (((uintptr_t)host & ~(block->page_size - 1)) !=
> >last_host)
> >
> 
> OK, looks reasonable.
> 
> >and then set last_host to the start of the host page.
> >
> 
> I think it is not necessary to update the last_host on each target page. We
> can just set it at the first target page.

Yes, that would be fine.

Dave

> >Then you'll check if that flush is really working.
> >
> >Dave
> >
> >>  }
> >>  
> >> -
> >>  /*
> >>   * If it's the last part of a host page then we place the host
> >>   * page
> >> @@ -4090,7 +4080,6 @@ static int ram_load_postcopy(QEMUFile *f)
> >>  }
> >>  place_source = postcopy_host_page;
> >>  }
> >> -last_host = host;
> >>  
> >>  switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
> >>  case RAM_SAVE_FLAG_ZERO:
> >> @@ -4143,7 +4132,8 @@ static int ram_load_postcopy(QEMUFile *f)
> >>  
> >>  if (!ret && place_needed) {
> >>  /* This gets called at the last target page in the host page 
> >> */
> >> -void *place_dest = host + TARGET_PAGE_SIZE - block->page_size;
> >> +void *place_dest = (void *)QEMU_ALIGN_DOWN((unsigned 
> >> long)host,
> >> +   block->page_size);
> >>  
> >>  if (all_zero) {
> >>  ret = postcopy_place_page_zero(mis, place_dest,
> >> -- 
> >> 2.17.1
> >> 
> >--
> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> -- 
> Wei Yang
> Help you, Help me
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
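The relaxed check Dave suggested in the thread above can be sketched outside of C (a toy model in Python; a 4 KiB target page and a 2 MiB huge host page are assumed):

```python
HOST_PAGE = 2 * 1024 * 1024  # e.g. a 2 MiB huge page backing the guest

def host_page_of(addr, host_page_size=HOST_PAGE):
    # mask off the offset within the host page, as in the suggested
    # ((uintptr_t)host & ~(block->page_size - 1)) != last_host check
    return addr & ~(host_page_size - 1)

hp0 = 0x4000_0000  # start of the host page currently being assembled

# target pages of the current host page may now arrive in any order ...
assert host_page_of(hp0 + 4096) == hp0
assert host_page_of(hp0 + 12 * 4096) == hp0
# ... but a page belonging to the next host page is still detectable
assert host_page_of(hp0 + HOST_PAGE) != hp0
print("ok")
```

Keeping only the host-page comparison preserves the cross-host-page protection while dropping the strict sequential-order requirement.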




Re: [PATCH 0/6] migration/postcopy: enable compress during postcopy

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> On Wed, Nov 06, 2019 at 08:11:44PM +, Dr. David Alan Gilbert wrote:
> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
> >> This patch set tries enable compress during postcopy.
> >> 
> >> postcopy requires placing a whole host page, while the migration thread
> >> migrates memory in target page size. This makes postcopy need to collect
> >> all target pages of one host page before placing them via userfaultfd.
> >> 
> >> To enable compress during postcopy, there are two problems to solve:
> >> 
> >> 1. Random order for target page arrival
> >> 2. Target pages in one host page arrive without interruption by target
> >>    pages from other host pages
> >> 
> >> The first one is handled by counting the number of target pages that have
> >> arrived instead of waiting for the last target page to arrive.
> >> 
> >> The second one is handled by:
> >> 
> >> 1. Flush compress thread for each host page
> >> 2. Wait for the decompress thread before placing the host page
> >> 
> >> With the combination of these two changes, compress is enabled during
> >> postcopy.
> >
> >What have you tested this with? 2MB huge pages I guess?
> >
> 
> I tried with this qemu option:
> 
> -object memory-backend-file,id=mem1,mem-path=/dev/hugepages/guest2,size=4G \
> -device pc-dimm,id=dimm1,memdev=mem1
> 
> /dev/hugepages/guest2 is a file under hugetlbfs
> 
>hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)

OK, yes that should be fine.
I suspect that on Power/ARM, where they have normal memory with 16/64k
pages, the cost of the flush will mean compression is more expensive in
postcopy mode; but it still makes it possible.

Dave

> >Dave
> >
> >> Wei Yang (6):
> >>   migration/postcopy: reduce memset when it is zero page and
> >> matches_target_page_size
> >>   migration/postcopy: wait for decompress thread in precopy
> >>   migration/postcopy: count target page number to decide the
> >> place_needed
> >>   migration/postcopy: set all_zero to true on the first target page
> >>   migration/postcopy: enable random order target page arrival
> >>   migration/postcopy: enable compress during postcopy
> >> 
> >>  migration/migration.c | 11 
> >>  migration/ram.c   | 65 ++-
> >>  2 files changed, 45 insertions(+), 31 deletions(-)
> >> 
> >> -- 
> >> 2.17.1
> >> 
> >--
> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> -- 
> Wei Yang
> Help you, Help me
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PULL 0/1] Ui 20191106 patches

2019-11-07 Thread Peter Maydell
On Wed, 6 Nov 2019 at 07:08, Gerd Hoffmann  wrote:
>
> The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:
>
>   Merge remote-tracking branch 
> 'remotes/palmer/tags/palmer-for-master-4.2-sf1' into staging (2019-11-02 
> 17:59:03 +)
>
> are available in the Git repository at:
>
>   git://git.kraxel.org/qemu tags/ui-20191106-pull-request
>
> for you to fetch changes up to 88b40c683fda6fa00639de01d4274e94bd4f1cdd:
>
>   qemu-options: Rework the help text of the '-display' option (2019-11-05 
> 12:10:42 +0100)
>
> 
> ui: rework -display help text
>
> 
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



Re: [PATCH v2 10/21] iotests: Replace IMGOPTS= by -o

2019-11-07 Thread Max Reitz
On 06.11.19 16:47, Maxim Levitsky wrote:
> On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
>> Tests should not overwrite all user-supplied image options, but only add
>> to it (which will effectively overwrite conflicting values).  Accomplish
>> this by passing options to _make_test_img via -o instead of $IMGOPTS.
>>
>> For some tests, there is no functional change because they already only
>> appended options to IMGOPTS.  For these, this patch is just a
>> simplification.
>>
>> For others, this is a change, so they now heed user-specified $IMGOPTS.
>> Some of those tests do not work with all image options, though, so we
>> need to disable them accordingly.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/031 |  9 ---
>>  tests/qemu-iotests/039 | 24 ++
>>  tests/qemu-iotests/059 | 18 ++---
>>  tests/qemu-iotests/060 |  6 ++---
>>  tests/qemu-iotests/061 | 57 ++
>>  tests/qemu-iotests/079 |  3 +--
>>  tests/qemu-iotests/106 |  2 +-
>>  tests/qemu-iotests/108 |  2 +-
>>  tests/qemu-iotests/112 | 32 
>>  tests/qemu-iotests/115 |  3 +--
>>  tests/qemu-iotests/121 |  6 ++---
>>  tests/qemu-iotests/125 |  2 +-
>>  tests/qemu-iotests/137 |  2 +-
>>  tests/qemu-iotests/138 |  3 +--
>>  tests/qemu-iotests/175 |  2 +-
>>  tests/qemu-iotests/190 |  2 +-
>>  tests/qemu-iotests/191 |  3 +--
>>  tests/qemu-iotests/220 |  4 ++-
>>  tests/qemu-iotests/243 |  6 +++--
>>  tests/qemu-iotests/244 | 10 +---
>>  tests/qemu-iotests/250 |  3 +--
>>  tests/qemu-iotests/265 |  2 +-
>>  22 files changed, 100 insertions(+), 101 deletions(-)

[...]

>> @@ -161,7 +161,7 @@ _cleanup_test_img
>>  
>>  echo
>>  echo "=== Testing 4TB monolithicFlat creation and IO ==="
>> -IMGOPTS="subformat=monolithicFlat" _make_test_img 4T
>> +_make_test_img -o "subformat=monolithicFlat" 4T
>>  _img_info
>>  $QEMU_IO -c "write -P 0xa 900G 512" "$TEST_IMG" | _filter_qemu_io
>>  $QEMU_IO -c "read -v 900G 1024" "$TEST_IMG" | _filter_qemu_io
>> @@ -170,7 +170,7 @@ _cleanup_test_img
>>  echo
>>  echo "=== Testing qemu-img map on extents ==="
>>  for fmt in monolithicSparse twoGbMaxExtentSparse; do
>> -IMGOPTS="subformat=$fmt" _make_test_img 31G
>> +_make_test_img -o "subformat=$fmt" 31G
>>  $QEMU_IO -c "write 65024 1k" "$TEST_IMG" | _filter_qemu_io
>>  $QEMU_IO -c "write 2147483136 1k" "$TEST_IMG" | _filter_qemu_io
>>  $QEMU_IO -c "write 5G 1k" "$TEST_IMG" | _filter_qemu_io
> 
> Looks good. Another test that pokes at guessed locations... :-)

Actually, no.  These are writes on the format, not the file itself.  The
monolithicSparse subformat will store everything in a single file,
whereas twoGbMaxExtentSparse will create one file per 2 GB of guest
disk.  So the locations are chosen according to that 2 GB limit ((1)
something somewhere in the first extent, (2) something that wraps around
the first 2 GB limit, so hits extents #0 and #1, and (3) something in
the middle of extent #2.)

(The following qemu-img map call then verifies that it lands in the
different files for twoGbMaxExtentSparse, and that monolithicSparse is
at least indeed sparsely allocated.)

Max



signature.asc
Description: OpenPGP digital signature


Re: QEMU HTML documentation now on qemu.org

2019-11-07 Thread Stefan Hajnoczi
On Wed, Nov 6, 2019 at 5:21 PM Stefan Hajnoczi  wrote:
> Hi,
> You can now access the latest QEMU HTML documentation built from
> qemu.git/master nightly at:
>
>   https://wiki.qemu.org/docs/qemu-doc.html
>   https://wiki.qemu.org/docs/qemu-qmp-ref.html
>   https://wiki.qemu.org/docs/qemu-ga-ref.html
>   ...as well as interop/ and specs/
>
> Feel free to link to the documentation from the QEMU website and/or
> wiki!
>
> The container image that builds the docs is here:
>
>   https://github.com/stefanha/qemu-docs
>
> It is hosted on QEMU's Rackspace cloud account.

I forgot to add Markus.

I hope this helps the QEMU documentation effort.  I currently do not
have plans to work on this further.  You are welcome to send pull
requests to the qemu-docs container image repo or just ask me and I'll
make changes.

Stefan



Re: [qemu-web PATCH] Add device fuzzing blog post

2019-11-07 Thread Thomas Huth
- Original Message -
> From: "Stefan Hajnoczi" 
> Sent: Thursday, November 7, 2019 10:11:36 AM
> 
> This blog post covers the device fuzzing GSoC project that Alexander
> Olenik did in 2019.
[...]
> +This article was contributed by Stefan Hajnoczi and Alexander Oleinik.

You could also use the "author:" field in the header for this.

> diff --git a/screenshots/fuzzing-intro.png b/screenshots/fuzzing-intro.png
> new file mode 100644
> index 000..e130027
> Binary files /dev/null and b/screenshots/fuzzing-intro.png differ
> diff --git a/screenshots/fuzzing.png b/screenshots/fuzzing.png
> new file mode 100644
> index 000..2f15ecb
> Binary files /dev/null and b/screenshots/fuzzing.png differ

Seems like the images are missing ... can you please attach them?

 Thanks,
  Thomas




[Bug 1851547] Re: qemu 4 crashes with this parameter attached -usb -device usb-host,hostbus=1,hostaddr=7 \

2019-11-07 Thread Dr. David Alan Gilbert
Hi Marietto,
  Can you attach an lsusb output from your host?
I'm curious what host bus 1, addr 7 and 8 are.

Dave

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1851547

Title:
  qemu 4 crashes with this parameter attached -usb -device usb-host,hostbus=1,hostaddr=7 \

Status in QEMU:
  New

Bug description:
  Hello.

  qemu / kvm does not start anymore after upgrading ubuntu from 19.04 to
  19.10 and qemu from 3 to 4, as you can see below. What can I do?

  root@ziomario-Z390-AORUS-PRO:/home/ziomario/Scrivania/OS-KVM# ./boot-OS-HSP2.sh

  > qemu-system-x86_64: /build/qemu-UryNDZ/qemu-4.0+dfsg/hw/usb/core.c:720: usb_ep_get: asserzione "dev != NULL" non riuscita.

  ./boot-OS-HSP2.sh: riga 40: 26312 Annullato (core dump creato)
  qemu-system-x86_64 -enable-kvm -m 16000 \
  -cpu Penryn,kvm=on,vendor=GenuineIntel,+invtsc,vmware-cpuid-freq=on,$MY_OPTIONS \
  -machine pc-q35-2.9 -smp 4,cores=2 -vga none \
  -device vfio-pci,host=01:00.0,bus=pcie.0,multifunction=on \
  -device vfio-pci,host=01:00.1,bus=pcie.0 \
  -device vfio-pci,host=01:00.2,bus=pcie.0 \
  -device vfio-pci,host=01:00.3,bus=pcie.0 \
  -usb -device usb-host,hostbus=1,hostaddr=7 \
  -drive if=pflash,format=raw,readonly,file=$OVMF/OVMF_CODE.fd \
  -drive if=pflash,format=raw,file=$OVMF/OVMF_VARS-1024x768.fd \
  -smbios type=2 -device ich9-ahci,id=sata \
  -drive id=Clover,if=none,snapshot=on,format=qcow2,file=./'Mo/CloverNG.qcow2' \
  -device ide-hd,bus=sata.2,drive=Clover \
  -device ide-hd,bus=sata.3,drive=InstallMedia \
  -drive id=InstallMedia,if=none,file=BaseSystemHS.img,format=raw \
  -drive id=BsdHDD,if=none,file=/dev/sdg,format=raw \
  -device ide-hd,bus=sata.4,drive=BsdHDD \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device e1000-82545em,netdev=net0,id=net0,mac=52:54:00:c9:18:27 \
  -monitor stdio

  It seems that this line is not good anymore (it worked with qemu 3.x)
  :

  -usb -device usb-host,hostbus=1,hostaddr=7 \

  when I removed it, it works. But I need that. With what can I change it?
  You can reproduce that by upgrading ubuntu 19.04 to 19.10 because in
  that way also qemu will be upgraded from 3 to 4. These are the
  packages that I'm using :

  root@ziomario-Z390-AORUS-PRO:/home/ziomario# qemu-system-x86_64 --version
  QEMU emulator version 4.0.0 (Debian 1:4.0+dfsg-0ubuntu9)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1851547/+subscriptions



Re: [PATCH v2 10/21] iotests: Replace IMGOPTS= by -o

2019-11-07 Thread Maxim Levitsky
On Thu, 2019-11-07 at 10:20 +0100, Max Reitz wrote:
> On 06.11.19 16:47, Maxim Levitsky wrote:
> > On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
> > > Tests should not overwrite all user-supplied image options, but only add
> > > to it (which will effectively overwrite conflicting values).  Accomplish
> > > this by passing options to _make_test_img via -o instead of $IMGOPTS.
> > > 
> > > For some tests, there is no functional change because they already only
> > > appended options to IMGOPTS.  For these, this patch is just a
> > > simplification.
> > > 
> > > For others, this is a change, so they now heed user-specified $IMGOPTS.
> > > Some of those tests do not work with all image options, though, so we
> > > need to disable them accordingly.
> > > 
> > > Signed-off-by: Max Reitz 
> > > ---
> > >  tests/qemu-iotests/031 |  9 ---
> > >  tests/qemu-iotests/039 | 24 ++
> > >  tests/qemu-iotests/059 | 18 ++---
> > >  tests/qemu-iotests/060 |  6 ++---
> > >  tests/qemu-iotests/061 | 57 ++
> > >  tests/qemu-iotests/079 |  3 +--
> > >  tests/qemu-iotests/106 |  2 +-
> > >  tests/qemu-iotests/108 |  2 +-
> > >  tests/qemu-iotests/112 | 32 
> > >  tests/qemu-iotests/115 |  3 +--
> > >  tests/qemu-iotests/121 |  6 ++---
> > >  tests/qemu-iotests/125 |  2 +-
> > >  tests/qemu-iotests/137 |  2 +-
> > >  tests/qemu-iotests/138 |  3 +--
> > >  tests/qemu-iotests/175 |  2 +-
> > >  tests/qemu-iotests/190 |  2 +-
> > >  tests/qemu-iotests/191 |  3 +--
> > >  tests/qemu-iotests/220 |  4 ++-
> > >  tests/qemu-iotests/243 |  6 +++--
> > >  tests/qemu-iotests/244 | 10 +---
> > >  tests/qemu-iotests/250 |  3 +--
> > >  tests/qemu-iotests/265 |  2 +-
> > >  22 files changed, 100 insertions(+), 101 deletions(-)
> 
> [...]
> 
> > > @@ -161,7 +161,7 @@ _cleanup_test_img
> > >  
> > >  echo
> > >  echo "=== Testing 4TB monolithicFlat creation and IO ==="
> > > -IMGOPTS="subformat=monolithicFlat" _make_test_img 4T
> > > +_make_test_img -o "subformat=monolithicFlat" 4T
> > >  _img_info
> > >  $QEMU_IO -c "write -P 0xa 900G 512" "$TEST_IMG" | _filter_qemu_io
> > >  $QEMU_IO -c "read -v 900G 1024" "$TEST_IMG" | _filter_qemu_io
> > > @@ -170,7 +170,7 @@ _cleanup_test_img
> > >  echo
> > >  echo "=== Testing qemu-img map on extents ==="
> > >  for fmt in monolithicSparse twoGbMaxExtentSparse; do
> > > -IMGOPTS="subformat=$fmt" _make_test_img 31G
> > > +_make_test_img -o "subformat=$fmt" 31G
> > >  $QEMU_IO -c "write 65024 1k" "$TEST_IMG" | _filter_qemu_io
> > >  $QEMU_IO -c "write 2147483136 1k" "$TEST_IMG" | _filter_qemu_io
> > >  $QEMU_IO -c "write 5G 1k" "$TEST_IMG" | _filter_qemu_io
> > 
> > Looks good. Another test that pokes at guessed locations... :-)
> 
> Actually, no.  These are writes on the format, not the file itself.  The
> monolithicSparse subformat will store everything in a single file,
> whereas twoGbMaxExtentSparse will create one file per 2 GB of guest
> disk.  So the locations are chosen according to that 2 GB limit ((1)
> something somewhere in the first extent, (2) something that wraps around
> the first 2 GB limit, so hits extents #0 and #1, and (3) something in
> the middle of extent #2.)
> 
> (The following qemu-img map call then verifies that it lands in the
> different files for twoGbMaxExtentSparse, and that monolithicSparse is
> at least indeed sparsely allocated.)
> 
> Max
Good to know, I missed this one.


Best regards,
Maxim Levitsky






Re: [PATCH v2 05/21] iotests: Replace IMGOPTS by _unsupported_imgopts

2019-11-07 Thread Maxim Levitsky
On Thu, 2019-11-07 at 10:08 +0100, Max Reitz wrote:
> On 06.11.19 16:45, Maxim Levitsky wrote:
> > On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
> > > Some tests require compat=1.1 and thus set IMGOPTS='compat=1.1'
> > > globally.  That is not how it should be done; instead, they should
> > > simply set _unsupported_imgopts to compat=0.10 (compat=1.1 is the
> > > default anyway).
> > > 
> > > This makes the tests heed user-specified $IMGOPTS.  Some do not work
> > > with all image options, though, so we need to disable them accordingly.
> > > 
> > > Signed-off-by: Max Reitz 
> > > ---
> > >  tests/qemu-iotests/036 | 3 +--
> > >  tests/qemu-iotests/060 | 4 ++--
> > >  tests/qemu-iotests/062 | 3 ++-
> > >  tests/qemu-iotests/066 | 3 ++-
> > >  tests/qemu-iotests/068 | 3 ++-
> > >  tests/qemu-iotests/098 | 4 ++--
> > >  6 files changed, 11 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
> > > index 5f929ad3be..bbaf0ef45b 100755
> > > --- a/tests/qemu-iotests/036
> > > +++ b/tests/qemu-iotests/036
> > > @@ -43,9 +43,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
> > >  # This tests qcow2-specific low-level functionality
> > >  _supported_fmt qcow2
> > >  _supported_proto file
> > > -
> > >  # Only qcow2v3 and later supports feature bits
> > > -IMGOPTS="compat=1.1"
> > > +_unsupported_imgopts 'compat=0.10'
> > >  
> > >  echo
> > >  echo === Image with unknown incompatible feature bit ===
> > > diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
> > > index b91d8321bb..9c2ef42522 100755
> > > --- a/tests/qemu-iotests/060
> > > +++ b/tests/qemu-iotests/060
> > > @@ -48,6 +48,8 @@ _filter_io_error()
> > >  _supported_fmt qcow2
> > >  _supported_proto file
> > >  _supported_os Linux
> > > +# These tests only work for compat=1.1 images with refcount_bits=16
> > > +_unsupported_imgopts 'compat=0.10' 'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
> > 
> > Looks like the reason for that is that the test hardcodes (or guesses,
> > that is) the location of various qcow2 structures, a thing I have seen a
> > few times already in the iotests.
> > Not now, but sometime in the future it would be nice to extend qcow2.py
> > (or something like that) to dump the location of all qcow2 structures so
> > that the guesswork could be eliminated.
> 
> With the peek_file* functions we have now it’s actually simple to dump
> that location ($(peek_file_be "$TEST_IMG" 48 8) for the refcount table
> offset, for example).
> 
> But it wouldn’t help, because compat=0.10 or refcount_bits != 16 won’t
> change those locations.  So the locations aren’t the reason why we need
> to forbid those options here.
> 
> The reason we need refcount_bits=16 is that we’re going to directly
> manipulate a refcount block.  To do so, we need to know the refcount
> width, and I don’t think it’s worth trying to implement something generic.
> 
> We need compat=1.1 because compat=0.10 doesn’t have feature bits, so
> there’s no “corrupt” bit there.
> 
> Max
> 
This makes sense! Sorry for the noise!

Best regards,
Maxim Levitsky






Re: [PATCH v2 03/21] iotests: Add _filter_json_filename

2019-11-07 Thread Maxim Levitsky
On Thu, 2019-11-07 at 09:59 +0100, Max Reitz wrote:
> On 06.11.19 16:44, Maxim Levitsky wrote:
> > On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
> > > Signed-off-by: Max Reitz 
> > > ---
> > >  tests/qemu-iotests/common.filter | 24 
> > >  1 file changed, 24 insertions(+)
> > > 
> > > diff --git a/tests/qemu-iotests/common.filter 
> > > b/tests/qemu-iotests/common.filter
> > > index 9f418b4881..63bc6f6f26 100644
> > > --- a/tests/qemu-iotests/common.filter
> > > +++ b/tests/qemu-iotests/common.filter
> > > @@ -227,5 +227,29 @@ _filter_qmp_empty_return()
> > >  grep -v '{"return": {}}'
> > >  }
> > >  
> > > +_filter_json_filename()
> > > +{
> > > +$PYTHON -c 'import sys
> > > +result, *fnames = sys.stdin.read().split("json:{")
> > 
> > Very minor nitpick, maybe I would give 'fnames' a more generic name,
> > since its is just result of a split, thus not really a list of filenames.
> > Feel free to ignore that though.
> 
> Hm...  It is a list of filenames, namely of all nested json:{}
> filenames.  I could call it fname_split, but I actually think fnames is
> not too wrong.

Makes sense, I guess leave it as is.

> 
> In any case, thanks for reviewing again!

No problem! Thanks to you too for making these tests more generic,
this is IMHO a very, very good thing, especially with all the qcow2
corruptions we have seen recently.


Best regards,
Maxim Levitsky


> 
> Max
> 





[Bug 1848556] Re: qemu-img check failing on remote image in Eoan

2019-11-07 Thread Launchpad Bug Tracker
This bug was fixed in the package qemu - 1:4.0+dfsg-0ubuntu10

---
qemu (1:4.0+dfsg-0ubuntu10) focal; urgency=medium

  * d/p/ubuntu/lp-1848556-curl-Handle-success-in-multi_check_completion.patch:
    fix a potential hang when qemu or qemu-img were accessing http-backed
    disks via libcurl (LP: #1848556)
  * d/p/u/lp-1848497-virtio-balloon-fix-QEMU-4.0-config-size-migration-in.patch:
fix migration issue from qemu <4.0 when using virtio-balloon (LP: #1848497)

 -- Christian Ehrhardt   Mon, 21 Oct 2019 14:51:45 +0200

** Changed in: qemu (Ubuntu Focal)
   Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1848556

Title:
  qemu-img check failing on remote image in Eoan

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Eoan:
  Triaged
Status in qemu source package in Focal:
  Fix Released

Bug description:
  Ubuntu SRU Template:

  [Impact]

   * There is fallout due to changes in libcurl that affect qemu and might 
 lead to a hang.

   * Fix by backporting the upstream fix

  [Test Case]

   * If you have network just run
 $ qemu-img check http://10.193.37.117/cloud/eoan-server-cloudimg-amd64.img

   * Without network, install apache2, and get a complex qemu file (like a 
 cloud image) onto the system. Then access the file via apache http but 
 not localhost (that would work)

  [Regression Potential]

   * The change is local to the libcurl usage of qemu, so that could be 
 affected. But then this is what has been found to not work here, so I'd 
 expect not too much trouble. But if so then in the curl usage (which 
 means disks on http)

  [Other Info]
   
   * n/a

  ---

  The "qemu-img check" function is failing on remote (HTTP-hosted)
  images, beginning with Ubuntu 19.10 (qemu-utils version
  1:4.0+dfsg-0ubuntu9). With previous versions, through Ubuntu 19.04/qemu-utils
  version 1:3.1+dfsg-2ubuntu3.5, the following worked:

  $ /usr/bin/qemu-img check  
http://10.193.37.117/cloud/eoan-server-cloudimg-amd64.img
  No errors were found on the image.
  19778/36032 = 54.89% allocated, 90.34% fragmented, 89.90% compressed clusters
  Image end offset: 514064384

  The 10.193.37.117 server holds an Apache server that hosts the cloud
  images on a LAN. Beginning with Ubuntu 19.10/qemu-utils
  1:4.0+dfsg-0ubuntu9, the same command never returns. (I've left it for up to an
  hour with no change.) I'm able to wget the image from the same server
  and installation on which qemu-img check fails. I've tried several
  .img files on the server, ranging from Bionic to Eoan, with the same
  results with all of them.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1848556/+subscriptions



Re: [PULL 0/1] Audio 20191106 patches

2019-11-07 Thread Peter Maydell
On Wed, 6 Nov 2019 at 08:27, Gerd Hoffmann  wrote:
>
> The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:
>
>   Merge remote-tracking branch 
> 'remotes/palmer/tags/palmer-for-master-4.2-sf1' into staging (2019-11-02 
> 17:59:03 +)
>
> are available in the Git repository at:
>
>   git://git.kraxel.org/qemu tags/audio-20191106-pull-request
>
> for you to fetch changes up to 14d4f01191354e9520c47c692007344c30ab358b:
>
>   audio: add -audiodev pa,in|out.latency= to documentation (2019-11-06 
> 08:08:10 +0100)
>
> 
> audio: documentation update
>
> 


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



Re: QEMU HTML documentation now on qemu.org

2019-11-07 Thread Daniel P . Berrangé
On Wed, Nov 06, 2019 at 05:19:28PM +0100, Stefan Hajnoczi wrote:
> Hi,
> You can now access the latest QEMU HTML documentation built from
> qemu.git/master nightly at:
> 
>   https://wiki.qemu.org/docs/qemu-doc.html
>   https://wiki.qemu.org/docs/qemu-qmp-ref.html
>   https://wiki.qemu.org/docs/qemu-ga-ref.html
>   ...as well as interop/ and specs/
> 
> Feel free to link to the documentation from the QEMU website and/or
> wiki!

What's the reason for putting on wiki.qemu.org URL ? It feels like
having it under www.qemu.org would be a more natural home, especially
if we can then make it pick up the jekyll theme around the pages. 

Ideally we should publish the docs under versioned URL when we
make a release. eg  /docs/latest/  for current GIT master
which I presume the above is tracking, and then a /docs/$VERSION/...
for each major release we cut.

That way users can get an accurate view of features in the QEMU
they are actually using.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
Hi,

related TianoCore BZ:

  https://bugzilla.tianocore.org/show_bug.cgi?id=1871

(I'm starting this thread separately because at least some of the topics
are specific to QEMU, and I didn't want to litter the BZ with a
discussion that may not be interesting to all participants CC'd on the
BZ. I am keeping people CC'd on this initial posting; please speak up if
you'd like to be dropped from the email thread.)

QEMU provides guests with the virtio-rng device, and the OVMF and
ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
device. But, that doesn't seem enough for all edk2 use cases.

Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
absence may affect some other use cases.


(1) For UEFI HTTPS boot, TLS would likely benefit from good quality
entropy. If the VM config includes virtio-rng (hence the guest firmware
has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.

However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
UEFI HTTPS boot be disabled completely (or prevented / rejected
somehow), blaming lack of good entropy? Or should TLS silently fall back
to "mixing some counters [such as TSC] together and applying a
deterministic cryptographic transformation"?

IOW, knowing that the TLS setup may not be based on good quality
entropy, should we allow related firmware services to "degrade silently"
(not functionally, but potentially in security), or should we deny the
services altogether?


(2) It looks like the SMM driver implementing the privileged part of the
UEFI variable runtime service could need access to good quality entropy,
while running in SMM; in the future.

This looks problematic on QEMU. Entropy is a valuable resource, and
whatever resource SMM drivers depend on, should not be possible for e.g.
a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
Therefore, it's not *only* the case that SMM drivers must not consume
EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
outside of SMM/SMRAM), but also that SMM drivers must not depend on the
same piece of *hardware* that feeds EFI_RNG_PROTOCOL.

Furthermore, assuming we dedicate a hardware entropy device specifically
to SMM drivers, such a device cannot be PCI(e). It would have to be a
platform device at a fixed location (IO port or MMIO) that is only
accessible to such guest code that executes in SMM. IOW, device access
would have to be restricted similarly to pflash. (In fact the variable
SMM driver will need, AIUI, the entropy for encrypting various variable
contents, which are then written into pflash.)

Alternatively, CPU instructions could exist that return entropy, and are
executable only inside SMM. It seems that e.g. RDRAND can be trapped in
guests ("A VMEXIT due to RDRAND will have exit reason 57 (decimal)").
Then KVM / QEMU could provide any particular implementation we wanted --
for example an exception could be injected unless RDRAND had been
executed from within SMM. Unfortunately, such an arbitrary restriction
(of RDRAND to SMM) would diverge from the Intel SDM, and would likely
break other (non-SMM) guest code.

Does a platform device that is dynamically detectable and usable in SMM
only seem like an acceptable design for QEMU?

Thanks,
Laszlo




Re: [RFC PATCH 18/18] qemu-storage-daemon: Add --monitor option

2019-11-07 Thread Kevin Wolf
Am 06.11.2019 um 15:32 hat Max Reitz geschrieben:
> On 17.10.19 15:02, Kevin Wolf wrote:
> > This adds and parses the --monitor option, so that a QMP monitor can be
> > used in the storage daemon. The monitor offers commands defined in the
> > QAPI schema at storage-daemon/qapi/qapi-schema.json.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  storage-daemon/qapi/qapi-schema.json | 15 
> >  qemu-storage-daemon.c| 34 
> >  Makefile | 30 
> >  Makefile.objs|  4 ++--
> >  monitor/Makefile.objs|  2 ++
> >  qapi/Makefile.objs   |  5 
> >  qom/Makefile.objs|  1 +
> >  scripts/qapi/gen.py  |  5 
> >  storage-daemon/Makefile.objs |  1 +
> >  storage-daemon/qapi/Makefile.objs|  1 +
> >  10 files changed, 96 insertions(+), 2 deletions(-)
> >  create mode 100644 storage-daemon/qapi/qapi-schema.json
> >  create mode 100644 storage-daemon/Makefile.objs
> >  create mode 100644 storage-daemon/qapi/Makefile.objs
> 
> [...]
> 
> > diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
> > index 3e04e299ed..03d256f0a4 100644
> > --- a/qapi/Makefile.objs
> > +++ b/qapi/Makefile.objs
> > @@ -30,3 +30,8 @@ obj-y += $(QAPI_TARGET_MODULES:%=qapi-events-%.o)
> >  obj-y += qapi-events.o
> >  obj-y += $(QAPI_TARGET_MODULES:%=qapi-commands-%.o)
> >  obj-y += qapi-commands.o
> > +
> > +QAPI_MODULES_STORAGE_DAEMON = block block-core char common crypto introspect
> > +QAPI_MODULES_STORAGE_DAEMON += job monitor qom sockets pragma transaction
> 
> I took a look into the rest, and I wonder whether query-iothreads from
> misc.json would be useful, too.

Possibly. It would be a separate patch, but I can add it.

The question is just where to move query-iothreads. Do we have a good
place, or do I need to separate misc.json and a new misc-sysemu.json?

> > diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
> > index 796c17c38a..c25634f673 100644
> > --- a/scripts/qapi/gen.py
> > +++ b/scripts/qapi/gen.py
> > @@ -44,6 +44,11 @@ class QAPIGen(object):
> >  return ''
> >  
> >  def write(self, output_dir):
> > +# Include paths starting with ../ are used to reuse modules of the main
> > +# schema in specialised schemas. Don't overwrite the files that are
> > +# already generated for the main schema.
> > +if self.fname.startswith('../'):
> > +return
> 
> Sorry, but I’m about to ask a clueless question: How do we ensure that
> the main schema is generated before something else could make sure of
> the specialized schemas?

"Make sure"?

I think the order of the generation doesn't matter because generating
the storage daemon files doesn't actually access the main ones.
Generated C files shouldn't be a problem either because if we link an
object file into a binary, we have a make dependency for it.

Maybe the only slightly trickier question is whether we have the
dependencies right so that qemu-storage-daemon.c is only built after the
header files of both the main schema and the specific one have been
generated. If I understand the Makefile correctly, generated-files-y
takes care of this, and this patch adds all new header files to it if I
didn't miss any.

Kevin


signature.asc
Description: PGP signature


Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Dr. David Alan Gilbert
* Laszlo Ersek (ler...@redhat.com) wrote:
> Hi,
> 
> related TianoCore BZ:
> 
>   https://bugzilla.tianocore.org/show_bug.cgi?id=1871
> 
> (I'm starting this thread separately because at least some of the topics
> are specific to QEMU, and I didn't want to litter the BZ with a
> discussion that may not be interesting to all participants CC'd on the
> BZ. I am keeping people CC'd on this initial posting; please speak up if
> you'd like to be dropped from the email thread.)
> 
> QEMU provides guests with the virtio-rng device, and the OVMF and
> ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
> device. But, that doesn't seem enough for all edk2 use cases.
> 
> Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
> absence may affect some other use cases.
> 
> 
> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
> entropy. If the VM config includes virtio-rng (hence the guest firmware
> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
> 
> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
> UEFI HTTPS boot be disabled completely (or prevented / rejected
> somehow), blaming lack of good entropy? Or should TLS silently fall back
> to "mixing some counters [such as TSC] together and applying a
> deterministic cryptographic transformation"?
> 
> IOW, knowing that the TLS setup may not be based on good quality
> entropy, should we allow related firmware services to "degrade silently"
> (not functionally, but potentially in security), or should we deny the
> services altogether?

I don't see a downside to insisting that if you want to use https then
you must provide an entropy source; they're easy enough to add using
virtio-rng if the CPU doesn't provide it.

> 
> (2) It looks like the SMM driver implementing the privileged part of the
> UEFI variable runtime service could need access to good quality entropy,
> while running in SMM; in the future.
> 
> This looks problematic on QEMU. Entropy is a valuable resource, and
> whatever resource SMM drivers depend on, should not be possible for e.g.
> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
> Therefore, it's not *only* the case that SMM drivers must not consume
> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
> 
> Furthermore, assuming we dedicate a hardware entropy device specifically
> to SMM drivers, such a device cannot be PCI(e). It would have to be a
> platform device at a fixed location (IO port or MMIO) that is only
> accessible to such guest code that executes in SMM. IOW, device access
> would have to be restricted similarly to pflash. (In fact the variable
> SMM driver will need, AIUI, the entropy for encrypting various variable
> contents, which are then written into pflash.)

Ewww.  I guess a virtio-rng instance wired to virtio-mmio could do that.
It's a bit grim though.

Dave

> Alternatively, CPU instructions could exist that return entropy, and are
> executable only inside SMM. It seems that e.g. RDRAND can be trapped in
> guests ("A VMEXIT due to RDRAND will have exit reason 57 (decimal)").
> Then KVM / QEMU could provide any particular implementation we wanted --
> for example an exception could be injected unless RDRAND had been
> executed from within SMM. Unfortunately, such an arbitrary restriction
> (of RDRAND to SMM) would diverge from the Intel SDM, and would likely
> break other (non-SMM) guest code.
> 
> Does a platform device that is dynamically detectable and usable in SMM
> only seem like an acceptable design for QEMU?
> 
> Thanks,
> Laszlo
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Ard Biesheuvel
Hi Laszlo,

Thanks for starting this thread.


On Thu, 7 Nov 2019 at 11:11, Laszlo Ersek  wrote:
>
> Hi,
>
> related TianoCore BZ:
>
>   https://bugzilla.tianocore.org/show_bug.cgi?id=1871
>
> (I'm starting this thread separately because at least some of the topics
> are specific to QEMU, and I didn't want to litter the BZ with a
> discussion that may not be interesting to all participants CC'd on the
> BZ. I am keeping people CC'd on this initial posting; please speak up if
> you'd like to be dropped from the email thread.)
>
> QEMU provides guests with the virtio-rng device, and the OVMF and
> ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
> device. But, that doesn't seem enough for all edk2 use cases.
>
> Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
> absence may affect some other use cases.
>
>
> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
> entropy. If the VM config includes virtio-rng (hence the guest firmware
> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
>
> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
> UEFI HTTPS boot be disabled completely (or prevented / rejected
> somehow), blaming lack of good entropy? Or should TLS silently fall back
> to "mixing some counters [such as TSC] together and applying a
> deterministic cryptographic transformation"?
>
> IOW, knowing that the TLS setup may not be based on good quality
> entropy, should we allow related firmware services to "degrade silently"
> (not functionally, but potentially in security), or should we deny the
> services altogether?
>

TLS uses a source of randomness to establish symmetric session keys
for encryption. So it really depends on the use case whether HTTPS is
used for authentication or for confidentiality, and it seems to me
that it would typically be the former. So disabling HTTPS boot in this
case seems counterproductive to me.

>
> (2) It looks like the SMM driver implementing the privileged part of the
> UEFI variable runtime service could need access to good quality entropy,
> while running in SMM; in the future.
>
> This looks problematic on QEMU. Entropy is a valuable resource, and
> whatever resource SMM drivers depend on, should not be possible for e.g.
> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
> Therefore, it's not *only* the case that SMM drivers must not consume
> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
>

The typical model is to seed a DRBG [deterministic pseudorandom
sequence generator] using a sufficient amount of high quality entropy.
Once you have done that, it is rather hard to exhaust a DRBG - it is a
mathematical construction that is designed to last for a long time (<=
2^48 invocations [not bytes] according to the NIST spec), after which
it does not degrade, although it may have generated so much output that
its internal state can be inferred if you have captured enough of it
(which is a rather theoretical issue, IMHO).

The problem is that using the output of a DRBG as a seed is
non-trivial - the spec describes ways to do this, but wiring
virtio-rng to a DRBG in the host and using its output to seed a DRBG
in the guest is slightly problematic.

So it seems to me that the correct way to model this is to make the
host's true entropy source a shared resource like any other.
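For readers unfamiliar with the construction, here is a minimal HMAC_DRBG
sketch (SHA-256), loosely following NIST SP 800-90A. It omits the reseed
counter, prediction resistance and personalization string, so it illustrates
the mechanism rather than being a compliant implementation:

```python
import hmac
import hashlib

class HmacDrbg:
    """Minimal HMAC_DRBG (SHA-256) sketch, loosely after NIST SP 800-90A.
    Illustration only: no reseed counter, no prediction resistance."""

    def __init__(self, entropy: bytes):
        # Instantiate: K = 0x00..00, V = 0x01..01, then mix in the seed.
        self.K = b"\x00" * 32
        self.V = b"\x01" * 32
        self._update(entropy)

    def _hmac(self, key: bytes, data: bytes) -> bytes:
        return hmac.new(key, data, hashlib.sha256).digest()

    def _update(self, data: bytes) -> None:
        # The Update function from the spec; the second round only
        # runs when provided_data is non-empty.
        self.K = self._hmac(self.K, self.V + b"\x00" + data)
        self.V = self._hmac(self.K, self.V)
        if data:
            self.K = self._hmac(self.K, self.V + b"\x01" + data)
            self.V = self._hmac(self.K, self.V)

    def generate(self, nbytes: int) -> bytes:
        out = b""
        while len(out) < nbytes:
            self.V = self._hmac(self.K, self.V)
            out += self.V
        self._update(b"")  # advance state after each generate call
        return out[:nbytes]
```

Seeded once with good entropy, the generator serves many requests
deterministically; the hard part described above is getting that one good
seed across the host/guest boundary.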

> Furthermore, assuming we dedicate a hardware entropy device specifically
> to SMM drivers, such a device cannot be PCI(e). It would have to be a
> platform device at a fixed location (IO port or MMIO) that is only
> accessible to such guest code that executes in SMM. IOW, device access
> would have to be restricted similarly to pflash. (In fact the variable
> SMM driver will need, AIUI, the entropy for encrypting various variable
> contents, which are then written into pflash.)
>
> Alternatively, CPU instructions could exist that return entropy, and are
> executable only inside SMM. It seems that e.g. RDRAND can be trapped in
> guests ("A VMEXIT due to RDRAND will have exit reason 57 (decimal)").
> Then KVM / QEMU could provide any particular implementation we wanted --
> for example an exception could be injected unless RDRAND had been
> executed from within SMM. Unfortunately, such an arbitrary restriction
> (of RDRAND to SMM) would diverge from the Intel SDM, and would likely
> break other (non-SMM) guest code.
>
> Does a platform device that is dynamically detectable and usable in SMM
> only seem like an acceptable design for QEMU?
>



Re: [RFC PATCH 00/18] Add qemu-storage-daemon

2019-11-07 Thread Daniel P . Berrangé
On Thu, Oct 17, 2019 at 03:01:46PM +0200, Kevin Wolf wrote:
> This series adds a new tool 'qemu-storage-daemon', which can be used to
> export and perform operations on block devices. There is some overlap
> between qemu-img/qemu-nbd and the new qemu-storage-daemon, but there are
> a few important differences:
> 
> * The qemu-storage-daemon has QMP support. The command set is obviously
>   restricted compared to the system emulator because there is no guest,
>   but all of the block operations are present.
> 
>   This means that it can access advanced options or operations that the
>   qemu-img command line doesn't expose. For example, blockdev-create is
>   a lot more powerful than 'qemu-img create', and qemu-storage-daemon
>   allows executing it without starting a guest.
> 
>   Compared to qemu-nbd it means that, for example, block jobs can now be
>   executed on the server side, and backing chains shared by multiple VMs
>   can be modified this way.
> 
> * The existing tools all have a separately invented one-off syntax for
>   the job at hand, which usually comes with restrictions compared to the
>   system emulator. qemu-storage-daemon shares the same syntax with the
>   system emulator for most options and prefers QAPI based interfaces
>   where possible (such as --blockdev), so it should be easy to make use
>   of in libvirt.
> 
> * While this series implements only NBD exports, the storage daemon is
>   intended to serve multiple protocols and its syntax reflects this. In
>   the past, we had proposals to add new one-off tools for exporting over
>   new protocols like FUSE or TCMU.
> 
>   With a generic storage daemon, additional export methods have a home
>   without adding a new tool for each of them.
> 
> I'm posting this as an RFC mainly for two reasons:
> 
> 1. The monitor integration, which could be argued to be a little hackish
>(some generated QAPI source files are built from a separate QAPI
>schema, but the per-module ones are taken from the system emulator)
>and Markus will want to have a closer look there. But from the IRC
>discussions we had, we seem to agree on the general approach here.
> 
> 2. I'm not completely sure if the command line syntax is the final
>version that we want to support long-term. Many options directly use
>QAPI visitors (--blockdev, --export, --nbd-server) and should be
>fine. However, others use QemuOpts (--chardev, --monitor, --object).
> 
>This is the same as in the system emulator, so we wouldn't be adding
>a new problem, but as there was talk about QAPIfying the command
>line, and I wouldn't want later syntax changes or adding lots of
>compatibility code to a new tool, I thought we should probably
>discuss whether QAPIfying from the start would be an option.

I think that following what the QEMU emulators currently do for
CLI args should be an explicit anti-goal, because we know that it is
a long standing source of pain.  Fixing it in the emulator binaries
is hard due to backward compatibility, but for this new binary we
have a clean slate.

This feels like a good opportunity to implement & demonstrate what
we think QEMU configuration ought to look like. Work done for this
in the qemu-storage-daemon may well help us understand how we'll
be able to move the QEMU emulators into a new scheme later.

My personal wish would be to have no use of QemuOpts at all.

Use GOptionContext *only* for parsing command line arguments
related to execution of the daemon - ie things like --help,
--version, --daemon, --pid-file.

The use a "--config /path/to/json/file" arg to point to the config
file for everything else using QAPI schema to describe it fully.

When loading the config file, things should be created in order
in which they are specified. ie don't try and group things,
otherwise we end up back with the horrific hacks for objects
where some are created early & some late.
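A sketch of what that ordered processing could look like; the handler names
and config shape here are purely illustrative, not a real
qemu-storage-daemon interface:

```python
import json

# Hypothetical per-type handlers; these names are illustrative and do
# not correspond to any real QEMU or qemu-storage-daemon API.
def create_blockdev(args):
    return ("blockdev", args["node-name"])

def create_export(args):
    return ("export", args["id"])

HANDLERS = {"blockdev": create_blockdev, "export": create_export}

def load_config(text):
    """Create objects strictly in the order the config file lists them,
    avoiding any early/late grouping of object creation."""
    created = []
    for item in json.loads(text):  # a JSON array preserves ordering
        kind = item.pop("type")
        created.append(HANDLERS[kind](item))
    return created
```

The point of the sketch is only the ordering guarantee: each entry is
instantiated as it is encountered, so later entries can refer to earlier
ones without special-case grouping.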



For an ambitious stretch goal, I think we should seriously consider
whether our use of chardevs is appropriate in all cases that exist,
and whether we can avoid the trap of over-using chardev in the new
storage daemon since it is a clean slate in terms of user facing
CLI config.

chardevs are designed for & reasonably well suited to attaching to
devices like serial ports, parallel ports, etc. You have a 1:1
remote:local peer relationship. The transport is a dumb byte
stream, nothing special needed on top & the user can cope with
any type of chardev.

Many cases where we've used chardevs as a backend in QEMU are a
poor fit. We just used chardevs as an "easy" way to configure a
UNIX or TCP socket from the CLI, and don't care about, nor work
with, any other chardev backends. As a result of this misuse
we've had to put in an increasing number of hacks in the chardev
code to deal with the fact that callers want to know about & use
socket semantics, e.g. FD passing, the chardev reconnection polling
code.

The monitor is a prime example of a bad fit - it would

Re: Looking for issues/features for my first contribution

2019-11-07 Thread Aleksandar Markovic
On Thursday, November 7, 2019, Rajath Shashidhara 
wrote:

> Hi all,
>
> I am a Computer Science graduate student at The University of Texas at
> Austin (UT, Austin). I am looking forward to contributing to qemu !
>
> This semester, I am taking a class in Virtualization (
> https://github.com/vijay03/cs378-f19) and contributing to a
> virtualization related open-source project is a significant part of the
> course.
> I would be interested in contributing a patchset to qemu - possibly a
> self-contained feature or a reasonably complex bug fix that can be
> completed in under a month's time. I did look at both the bugtracker and
> the QEMU Google Summer of Code 2019 page [https://wiki.qemu.org/Google_
> Summer_of_Code_2019] for ideas. However, I would be interested in hearing
> from the community and I would be delighted if somebody can suggest a
> suitable project !
>
>
Hello, Rajath!

Thank you for expressing interest in the QEMU open source project.

There is certainly a place for you and your contributions in QEMU, and you
are very welcome!

It looks to me that the following project would fit your description:

'Implement emulation of DS3231 real time clock in QEMU'

Datasheet:

https://datasheets.maximintegrated.com/en/ds/DS3231.pdf

The steps needed to complete it (in my opinion):

- collect datasheets of as many RTC chips already emulated in QEMU as
possible (there are around a dozen of them, see folder hw/rtc)

- do a comparative analysis of selected RTC implementations in QEMU

- get to know general QEMU device model

- design and implement DS3231 emulation
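To give a flavour of the emulation work: the DS3231, like most RTC chips,
exposes the current time as BCD-coded registers (offsets 0x00-0x06 per the
datasheet). A sketch of that encoding, with illustrative function names:

```python
def to_bcd(v: int) -> int:
    """Encode a two-digit decimal value as packed BCD."""
    return ((v // 10) << 4) | (v % 10)

def ds3231_time_regs(sec, minute, hour, wday, mday, month, year):
    """Encode a time into the DS3231 time-register layout (0x00-0x06):
    seconds, minutes, hours, day-of-week, date, month, year, all BCD.
    Sketch only: 24-hour mode assumed, century bit omitted."""
    return bytes([
        to_bcd(sec),
        to_bcd(minute),
        to_bcd(hour),
        to_bcd(wday),
        to_bcd(mday),
        to_bcd(month),
        to_bcd(year % 100),
    ])
```

A guest reading register 0x00 over I2C would then see, e.g., 0x59 for 59
seconds; the QEMU device model's job is essentially to serve these
registers from the host clock plus a guest-settable offset.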

I can give you (unfortunately constrained by tight time limits) some help
and guidance. But there are other people in the community too (more
knowledgeable in the area than me).

I salute your initiative!

Yours,
Aleksandar




> I am an advanced C programmer with both professional and academic
> background in systems design & implementation - especially OS & Networks.
> Given my background, I feel fairly confident that I can pick up the QEMU
> codebase quickly.
>
> Eagerly looking forward to hearing from the community !
>
> Thanks,
> Rajath Shashidhara
>
>
>


Re: [PATCH v3 11/16] hw/arm/raspi: Use -smp cores= option to restrict enabled cores

2019-11-07 Thread Bonnans, Laurent
On 10/20/19 1:47 AM, Philippe Mathieu-Daudé wrote:
> The abstract TYPE_BCM283X device provides an 'enabled-cpus' property
> to restrict the number of cores powered on at reset. This is because on
> real hardware the GPU is responsible for starting the cores and keeping
> them spinning until the Linux kernel is ready to use them.
> When using the -kernel parameter, QEMU does this by installing the
> 'raspi_smpboot' code when arm_boot_info::write_board_setup() is
> called. This is a special feature to help the Linux kernel, and can
> only be used with a Linux kernel.
>
> Even if loaded with the -kernel option, U-Boot is not Linux, thus
> is not recognized as such, and the raspi_smpboot code is not installed.
>
> Upon introduction of this machine in commit 1df7d1f9303, the -smp
> option allowed limiting the number of cores powered on at reset.
> Unfortunately later commit 72649619341 added a check which made this
> feature unusable:
>
>$ qemu-system-aarch64 -M raspi3 -smp 1
>qemu-system-aarch64: Invalid SMP CPUs 1. The min CPUs supported by machine 
> 'raspi3' is 4
>
> Fortunately, the -smp option allow various kind of CPU topology:
>
>-smp 
> [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,dies=dies][,sockets=sockets]
> set the number of CPUs to 'n' [default=1]
> maxcpus= maximum number of total cpus, including
> offline CPUs for hotplug, etc
> cores= number of CPU cores on one socket (for PC, it's on one die)
> threads= number of threads on one CPU core
> dies= number of CPU dies on one socket (for PC only)
> sockets= number of discrete sockets in the system
>
> Let's use the 'cores' argument to specify the number of cores powered
> on at reset to restore this feature, and allow U-Boot to boot.
>
> We can now run U-boot using:
>
>$ qemu-system-aarch64 -M raspi3 -smp 4,cores=1 ...
>
> Reported-by: Laurent Bonnans 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>   hw/arm/raspi.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
> index 569d85c11a..45d3f91f95 100644
> --- a/hw/arm/raspi.c
> +++ b/hw/arm/raspi.c
> @@ -190,8 +190,8 @@ static void raspi_init(MachineState *machine, int version)
>   /* Setup the SOC */
>   object_property_add_const_link(OBJECT(&s->soc), "ram", OBJECT(&s->ram),
>  &error_abort);
> -object_property_set_int(OBJECT(&s->soc), machine->smp.cpus, 
> "enabled-cpus",
> -&error_abort);
> +object_property_set_int(OBJECT(&s->soc), machine->smp.cores,
> +"enabled-cpus", &error_abort);
>   int board_rev = version == 3 ? 0xa02082 : 0xa21041;
>   object_property_set_int(OBJECT(&s->soc), board_rev, "board-rev",
>   &error_abort);

Hi Phil,

Thanks for the patch, I finally got the chance to make some tests (sorry 
about the delay).

Using the proposed -smp options indeed helps to run U-Boot ELFs
directly. However, the cores fail to start when switching to Linux
(tested on raspi2):

[ 0.071030] smp: Bringing up secondary CPUs ...
[ 1.157876] CPU1: failed to come online
[ 2.219899] CPU2: failed to come online
[ 3.285412] CPU3: failed to come online
[ 3.286137] smp: Brought up 1 node, 1 CPU
[ 3.286766] SMP: Total of 1 processors activated (125.00 BogoMIPS).
[ 3.287442] CPU: All CPU(s) started in SVC mode.

The behavior persists even without using the option on the command line.
The normal behavior is restored if I use "-smp 4,cores=4".

Greetings,

Laurent


Re: [RFC PATCH 18/18] qemu-storage-daemon: Add --monitor option

2019-11-07 Thread Max Reitz
On 07.11.19 11:12, Kevin Wolf wrote:
> Am 06.11.2019 um 15:32 hat Max Reitz geschrieben:
>> On 17.10.19 15:02, Kevin Wolf wrote:
>>> This adds and parses the --monitor option, so that a QMP monitor can be
>>> used in the storage daemon. The monitor offers commands defined in the
>>> QAPI schema at storage-daemon/qapi/qapi-schema.json.
>>>
>>> Signed-off-by: Kevin Wolf 
>>> ---
>>>  storage-daemon/qapi/qapi-schema.json | 15 
>>>  qemu-storage-daemon.c| 34 
>>>  Makefile | 30 
>>>  Makefile.objs|  4 ++--
>>>  monitor/Makefile.objs|  2 ++
>>>  qapi/Makefile.objs   |  5 
>>>  qom/Makefile.objs|  1 +
>>>  scripts/qapi/gen.py  |  5 
>>>  storage-daemon/Makefile.objs |  1 +
>>>  storage-daemon/qapi/Makefile.objs|  1 +
>>>  10 files changed, 96 insertions(+), 2 deletions(-)
>>>  create mode 100644 storage-daemon/qapi/qapi-schema.json
>>>  create mode 100644 storage-daemon/Makefile.objs
>>>  create mode 100644 storage-daemon/qapi/Makefile.objs
>>
>> [...]
>>
>>> diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs
>>> index 3e04e299ed..03d256f0a4 100644
>>> --- a/qapi/Makefile.objs
>>> +++ b/qapi/Makefile.objs
>>> @@ -30,3 +30,8 @@ obj-y += $(QAPI_TARGET_MODULES:%=qapi-events-%.o)
>>>  obj-y += qapi-events.o
>>>  obj-y += $(QAPI_TARGET_MODULES:%=qapi-commands-%.o)
>>>  obj-y += qapi-commands.o
>>> +
>>> +QAPI_MODULES_STORAGE_DAEMON = block block-core char common crypto 
>>> introspect
>>> +QAPI_MODULES_STORAGE_DAEMON += job monitor qom sockets pragma transaction
>>
>> I took a look into the rest, and I wonder whether query-iothreads from
>> misc.json would be useful, too.
> 
> Possibly. It would be a separate patch, but I can add it.
> 
> The question is just where to move query-iothreads. Do we have a good
> place, or do I need to separate misc.json and a new misc-sysemu.json?

I’d just put it in block.json because of the “io”... O:-)

>>> diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
>>> index 796c17c38a..c25634f673 100644
>>> --- a/scripts/qapi/gen.py
>>> +++ b/scripts/qapi/gen.py
>>> @@ -44,6 +44,11 @@ class QAPIGen(object):
>>>  return ''
>>>  
>>>  def write(self, output_dir):
>>> +# Include paths starting with ../ are used to reuse modules of the 
>>> main
>>> +# schema in specialised schemas. Don't overwrite the files that are
>>> +# already generated for the main schema.
>>> +if self.fname.startswith('../'):
>>> +return
>>
>> Sorry, but I’m about to ask a clueless question: How do we ensure that
>> the main schema is generated before something else could make sure of
>> the specialized schemas?
> 
> "Make sure"?

Oops, s/ sure/ use/.

> I think the order of the generation doesn't matter because generating
> the storage daemon files doesn't actually access the main ones.
> Generated C files shouldn't be a problem either because if we link an
> object file into a binary, we have a make dependency for it.

I was mostly wondering about the fact that make mustn’t try to compile
the “generated files” (which aren’t really generated here) before they
are actually generated when the main schema is processed.

Max

> Maybe the only a bit trickier question is whether we have the
> dependencies right so that qemu-storage-daemon.c is only built after the
> header files of both the main schema and the specific one have been
> generated. If I understand the Makefile correctly, generated-files-y
> takes care of this, and this patch adds all new header files to it if I
> didn't miss any.
> 
> Kevin
> 






[Bug 1848556] Re: qemu-img check failing on remote image in Eoan

2019-11-07 Thread Christian Ehrhardt 
Focal is complete: the MPs are reviewed, the SRU templates are ready, and pre-tests are done.
Uploading to E-unapproved for the SRU Teams consideration.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1848556

Title:
  qemu-img check failing on remote image in Eoan

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Eoan:
  Triaged
Status in qemu source package in Focal:
  Fix Released

Bug description:
  Ubuntu SRU Template:

  [Impact]

   * There is fallout due to changes in libcurl that affect qemu and might 
 lead to a hang.

   * Fix by backporting the upstream fix

  [Test Case]

   * If you have network access, just run
 $ qemu-img check http://10.193.37.117/cloud/eoan-server-cloudimg-amd64.img

   * Without network, install apache2, and get a complex qemu file (like a 
 cloud image) onto the system. Then access the file via apache http but 
 not localhost (that would work)

  [Regression Potential]

   * The change is local to the libcurl usage of qemu, so that could be 
 affected. But then this is what has been found to not work here, so I'd 
 expect not too much trouble. But if so then in the curl usage (which 
 means disks on http)

  [Other Info]
   
   * n/a

  ---

  The "qemu-img check" function is failing on remote (HTTP-hosted)
  images, beginning with Ubuntu 19.10 (qemu-utils version 1:4.0+dfsg-
  0ubuntu9). With previous versions, through Ubuntu 19.04/qemu-utils
  version 1:3.1+dfsg-2ubuntu3.5, the following worked:

  $ /usr/bin/qemu-img check  
http://10.193.37.117/cloud/eoan-server-cloudimg-amd64.img
  No errors were found on the image.
  19778/36032 = 54.89% allocated, 90.34% fragmented, 89.90% compressed clusters
  Image end offset: 514064384

  The 10.193.37.117 server holds an Apache server that hosts the cloud
  images on a LAN. Beginning with Ubuntu 19.10/qemu-utils 1:4.0+dfsg-
  0ubuntu9, the same command never returns. (I've left it for up to an
  hour with no change.) I'm able to wget the image from the same server
  and installation on which qemu-img check fails. I've tried several
  .img files on the server, ranging from Bionic to Eoan, with the same
  results with all of them.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1848556/+subscriptions



Re: [virtio-dev] Re: guest / host buffer sharing ...

2019-11-07 Thread Gerd Hoffmann
  Hi,

> > This is not about host memory, buffers are in guest ram, everything else
> > would make sharing those buffers between drivers inside the guest (as
> > dma-buf) quite difficult.
> 
> Given it's just guest memory, can the guest just have a virt queue on
> which it places pointers to the memory it wants to share as elements in
> the queue?

Well, good question.  I'm actually wondering what the best approach is
to handle long-living, large buffers in virtio ...

virtio-blk (and others) are using the approach you describe.  They put a
pointer to the io request header, followed by pointer(s) to the io
buffers directly into the virtqueue.  That works great with storage for
example.  The queue entries are tagged as being "in" or "out" (driver to
device or vice versa), so the virtio transport can set up dma mappings
accordingly or even transparently copy data if needed.

For long-living buffers where data can potentially flow both ways this
model doesn't fit very well though.  So what virtio-gpu does instead is
transferring the scatter list as virtio payload.  Does feel a bit
unclean as it doesn't really fit the virtio architecture.  It assumes
the host can directly access guest memory for example (which is usually
the case but explicitly not required by virtio).  It also requires
quirks in virtio-gpu to handle VIRTIO_F_IOMMU_PLATFORM properly, which
in theory should be handled fully transparently by the virtio-pci
transport.

We could instead have a "create-buffer" command which adds the buffer
pointers as elements to the virtqueue as you describe.  Then simply
continue using the buffer even after completing the "create-buffer"
command.  Which isn't exactly clean either.  It would likewise assume
direct access to guest memory, and it would likewise need quirks for
VIRTIO_F_IOMMU_PLATFORM as the virtio-pci transport tears down the dma
mappings for the virtqueue entries after command completion.
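To make the difference concrete, here is a sketch of a scatter list carried
as command payload, in the spirit of virtio-gpu's RESOURCE_ATTACH_BACKING
(which uses 16-byte {addr, length, padding} entries); the exact wire format
below is illustrative, not the real one:

```python
import struct

def pack_scatter_list(entries):
    """Pack (guest_phys_addr, length) pairs as little-endian command
    payload: a u32 entry count followed by one 16-byte record per entry
    (u64 addr, u32 length, 4 bytes padding). Illustrative format only;
    it assumes the host can dereference guest addresses directly, which
    is exactly the property Gerd notes virtio does not guarantee."""
    payload = struct.pack("<I", len(entries))
    for addr, length in entries:
        payload += struct.pack("<QI4x", addr, length)
    return payload
```

Because the addresses travel inside the payload rather than as virtqueue
descriptors, the transport never sees them, so it cannot set up or tear
down IOMMU mappings for the buffer; that is the VIRTIO_F_IOMMU_PLATFORM
quirk described above.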

Comments, suggestions, ideas?

cheers,
  Gerd




Re: [virtio-dev] Re: guest / host buffer sharing ...

2019-11-07 Thread Dr. David Alan Gilbert
* Gerd Hoffmann (kra...@redhat.com) wrote:
>   Hi,
> 
> > > This is not about host memory, buffers are in guest ram, everything else
> > > would make sharing those buffers between drivers inside the guest (as
> > > dma-buf) quite difficult.
> > 
> > Given it's just guest memory, can the guest just have a virt queue on
> > which it places pointers to the memory it wants to share as elements in
> > the queue?
> 
> Well, good question.  I'm actually wondering what the best approach is
> to handle long-living, large buffers in virtio ...
> 
> virtio-blk (and others) are using the approach you describe.  They put a
> pointer to the io request header, followed by pointer(s) to the io
> buffers directly into the virtqueue.  That works great with storage for
> example.  The queue entries are tagged as being "in" or "out" (driver to
> device or vice versa), so the virtio transport can set up dma mappings
> accordingly or even transparently copy data if needed.
> 
> For long-living buffers where data can potentially flow both ways this
> model doesn't fit very well though.  So what virtio-gpu does instead is
> transferring the scatter list as virtio payload.  Does feel a bit
> unclean as it doesn't really fit the virtio architecture.  It assumes
> the host can directly access guest memory for example (which is usually
> the case but explicitly not required by virtio).  It also requires
> quirks in virtio-gpu to handle VIRTIO_F_IOMMU_PLATFORM properly, which
> in theory should be handled fully transparently by the virtio-pci
> transport.
> 
> We could instead have a "create-buffer" command which adds the buffer
> pointers as elements to the virtqueue as you describe.  Then simply
> continue using the buffer even after completing the "create-buffer"
> command.  Which isn't exactly clean either.  It would likewise assume
> direct access to guest memory, and it would likewise need quirks for
> VIRTIO_F_IOMMU_PLATFORM as the virtio-pci transport tears down the dma
> mappings for the virtqueue entries after command completion.
> 
> Comments, suggestions, ideas?

What about not completing the command while the device is using the
memory?

Dave

> cheers,
>   Gerd
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: Looking for issues/features for my first contribution

2019-11-07 Thread Alex Bennée


Rajath Shashidhara  writes:

> Hi all,
>
> I am a Computer Science graduate student at The University of Texas at
> Austin (UT, Austin). I am looking forward to contributing to qemu !
>
> This semester, I am taking a class in Virtualization
> (https://github.com/vijay03/cs378-f19) and contributing to a
> virtualization related open-source project is a significant part of
> the course.
> I would be interested in contributing a patchset to qemu - possibly a
> self-contained feature or a reasonably complex bug fix that can be
> completed in under a month's time. I did look at both the bugtracker
> and the QEMU Google Summer of Code 2019 page
> [https://wiki.qemu.org/Google_Summer_of_Code_2019] for ideas. However,
> I would be interested in hearing from the community and I would be
> delighted if somebody can suggest a suitable project !

Ahh someone else looking at QEMU \o/

You might find some suggestions in the thread:

  Date: Sun, 3 Nov 2019 04:59:31 -0600
  Message-ID: 

  Subject: Feature Recommendations?

> I am an advanced C programmer with both professional and academic
> background in systems design & implementation - especially OS &
> Networks. Given my background, I feel fairly confident that I can
> pick up the QEMU codebase quickly.
>
> Eagerly looking forward to hearing from the community !

Expanding a device emulation could be doable. I'm not sure what the
current state of the Raspberry Pi emulations is, but I don't think they
are complete, as we keep having to fix bits as kernel drivers are
enabled. Philippe (cc'd) might be able to give some pointers.

--
Alex Bennée



Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
On 11/07/19 11:18, Dr. David Alan Gilbert wrote:
> * Laszlo Ersek (ler...@redhat.com) wrote:
>> Hi,
>>
>> related TianoCore BZ:
>>
>>   https://bugzilla.tianocore.org/show_bug.cgi?id=1871
>>
>> (I'm starting this thread separately because at least some of the topics
>> are specific to QEMU, and I didn't want to litter the BZ with a
>> discussion that may not be interesting to all participants CC'd on the
>> BZ. I am keeping people CC'd on this initial posting; please speak up if
>> you'd like to be dropped from the email thread.)
>>
>> QEMU provides guests with the virtio-rng device, and the OVMF and
>> ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
>> device. But, that doesn't seem enough for all edk2 use cases.
>>
>> Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
>> absence may affect some other use cases.
>>
>>
>> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
>> entropy. If the VM config includes virtio-rng (hence the guest firmware
>> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
>>
>> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
>> UEFI HTTPS boot be disabled completely (or prevented / rejected
>> somehow), blaming lack of good entropy? Or should TLS silently fall back
>> to "mixing some counters [such as TSC] together and applying a
>> deterministic cryptographic transformation"?
>>
>> IOW, knowing that the TLS setup may not be based on good quality
>> entropy, should we allow related firmware services to "degrade silently"
>> (not functionally, but potentially in security), or should we deny the
>> services altogether?
> 
> I don't see a downside to insisting that if you want to use https then
> you must provide an entropy source; they're easy enough to add using
> virtio-rng if the CPU doesn't provide it.

Possibly true; however it could be considered a usability regression by
end-users. ("UEFI HTTPS boot used to work, now it breaks with the same
VM config". Unless we can respond, "UEFI HTTPS boot's TLS init has never
been secure enough", they'll have a point. See also Ard's followup.)

> 
>>
>> (2) It looks like the SMM driver implementing the privileged part of the
>> UEFI variable runtime service could need access to good quality entropy,
>> while running in SMM; in the future.
>>
>> This looks problematic on QEMU. Entropy is a valuable resource, and
>> whatever resource SMM drivers depend on, should not be possible for e.g.
>> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
>> Therefore, it's not *only* the case that SMM drivers must not consume
>> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
>> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
>> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
>>
>> Furthermore, assuming we dedicate a hardware entropy device specifically
>> to SMM drivers, such a device cannot be PCI(e). It would have to be a
>> platform device at a fixed location (IO port or MMIO) that is only
>> accessible to such guest code that executes in SMM. IOW, device access
>> would have to be restricted similarly to pflash. (In fact the variable
>> SMM driver will need, AIUI, the entropy for encrypting various variable
>> contents, which are then written into pflash.)
> 
> Ewww.  I guess a virtio-rng instance wired to virtio-mmio could do that.
> It's a bit grim though.

*shudder* please let's keep virtio-mmio (or any remotely enumerable /
complex "bus" thingy) out of this :( I'm all for extensible hardware
interfaces, but cramming more and more infrastructure code into SMM
looks very questionable to me.

My main concern here is that most physical platform vendors will just
solder some physical entropy-gen chip onto their boards, and then
hard-code the MMIO base address of that as a build-time constant in
their firmware blobs. This obviously won't work for QEMU, where the hw
can change from boot to boot; so the generic edk2 solution (regardless
of the actual chip) needs to allow for that kind of dynamism. This is a
recurrent problem between QEMU and edk2, alas. The answer is of course
dynamic detection, but I *still* like to keep the enumeration logic to
the absolute minimum in SMM.

Thanks!
Laszlo

> 
> Dave
> 
>> Alternatively, CPU instructions could exist that return entropy, and are
>> executable only inside SMM. It seems that e.g. RDRAND can be trapped in
>> guests ("A VMEXIT due to RDRAND will have exit reason 57 (decimal)").
>> Then KVM / QEMU could provide any particular implementation we wanted --
>> for example an exception could be injected unless RDRAND had been
>> executed from within SMM. Unfortunately, such an arbitrary restriction
>> (of RDRAND to SMM) would diverge from the Intel SDM, and would likely
>> break other (non-SMM) guest code.
>>
>> Does a platform device that is dynamically detectable and usable in SMM
>> only seem like an acceptable design for QEMU?
>>
>> Thanks,
>> Laszlo

Re: [PATCH v2 20/21] iotests: Disable data_file where it cannot be used

2019-11-07 Thread Max Reitz
On 06.11.19 16:52, Maxim Levitsky wrote:
> On Tue, 2019-10-15 at 16:27 +0200, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/007 | 5 +++--
>>  tests/qemu-iotests/014 | 2 ++
>>  tests/qemu-iotests/015 | 5 +++--
>>  tests/qemu-iotests/026 | 5 -
>>  tests/qemu-iotests/029 | 5 +++--
>>  tests/qemu-iotests/031 | 6 +++---
>>  tests/qemu-iotests/036 | 5 +++--
>>  tests/qemu-iotests/039 | 3 +++
>>  tests/qemu-iotests/046 | 2 ++
>>  tests/qemu-iotests/048 | 2 ++
>>  tests/qemu-iotests/051 | 5 +++--
>>  tests/qemu-iotests/058 | 5 +++--
>>  tests/qemu-iotests/060 | 6 --
>>  tests/qemu-iotests/061 | 6 --
>>  tests/qemu-iotests/062 | 2 +-
>>  tests/qemu-iotests/066 | 2 +-
>>  tests/qemu-iotests/067 | 6 --
>>  tests/qemu-iotests/068 | 5 +++--
>>  tests/qemu-iotests/071 | 3 +++
>>  tests/qemu-iotests/073 | 2 ++
>>  tests/qemu-iotests/074 | 2 ++
>>  tests/qemu-iotests/080 | 5 +++--
>>  tests/qemu-iotests/090 | 2 ++
>>  tests/qemu-iotests/098 | 6 --
>>  tests/qemu-iotests/099 | 3 ++-
>>  tests/qemu-iotests/103 | 5 +++--
>>  tests/qemu-iotests/108 | 6 --
>>  tests/qemu-iotests/112 | 5 +++--
>>  tests/qemu-iotests/114 | 2 ++
>>  tests/qemu-iotests/121 | 3 +++
>>  tests/qemu-iotests/138 | 2 ++
>>  tests/qemu-iotests/156 | 2 ++
>>  tests/qemu-iotests/176 | 7 +--
>>  tests/qemu-iotests/191 | 2 ++
>>  tests/qemu-iotests/201 | 6 +++---
>>  tests/qemu-iotests/214 | 3 ++-
>>  tests/qemu-iotests/217 | 3 ++-
>>  tests/qemu-iotests/220 | 5 +++--
>>  tests/qemu-iotests/243 | 6 --
>>  tests/qemu-iotests/244 | 5 +++--
>>  tests/qemu-iotests/250 | 2 ++
>>  tests/qemu-iotests/267 | 5 +++--
>>  42 files changed, 117 insertions(+), 52 deletions(-)

[...]

>> diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
>> index c44fcf91bb..646ecd593f 100755
>> --- a/tests/qemu-iotests/031
>> +++ b/tests/qemu-iotests/031
>> @@ -40,9 +40,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>  # This tests qcow2-specific low-level functionality
>>  _supported_fmt qcow2
>>  _supported_proto file
>> -# We want to test compat=0.10, which does not support refcount widths
>> -# other than 16
>> -_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
>> +# We want to test compat=0.10, which does not support external data
>> +# files or refcount widths other than 16
>> +_unsupported_imgopts data_file 'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
> 
> This is maybe another reason to split this test for compat=0.10 and for 
> compat=1.1
> But still can be done later of course.

Hm, but I don’t really think this test is important for external data
files.  There is no I/O.

[...]

>> diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
>> index bbaf0ef45b..512598421c 100755
>> --- a/tests/qemu-iotests/036
>> +++ b/tests/qemu-iotests/036
>> @@ -43,8 +43,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>  # This tests qcow2-specific low-level functionality
>>  _supported_fmt qcow2
>>  _supported_proto file
>> -# Only qcow2v3 and later supports feature bits
>> -_unsupported_imgopts 'compat=0.10'
>> +# Only qcow2v3 and later supports feature bits;
>> +# qcow2.py does not support external data files
> 
> Minor nitpick, maybe tag this with TODO or so. No need to do now.

Hm, well, and the same applies here.  (Just not a very important test.)

[...]

>> diff --git a/tests/qemu-iotests/046 b/tests/qemu-iotests/046
>> index 4e03ead7b1..a066eec605 100755
>> --- a/tests/qemu-iotests/046
>> +++ b/tests/qemu-iotests/046
>> @@ -38,6 +38,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>  
>>  _supported_fmt qcow2
>>  _supported_proto file
>> +# data_file does not support compressed clusters
>> +_unsupported_imgopts data_file
> This is a very nice test, which doesn't seem to use compressed
> clusters that much. I think it should be split as well.
> No need to do this now of course, but maybe mark with TODO to
> avoid losing that info.

The other problem is that blkdebug doesn’t work so well with external
data files, so basically this whole test doesn’t work.

[...]

>> diff --git a/tests/qemu-iotests/048 b/tests/qemu-iotests/048
>> index a8feb76184..2af6b74b41 100755
>> --- a/tests/qemu-iotests/048
>> +++ b/tests/qemu-iotests/048
>> @@ -49,6 +49,8 @@ _compare()
>>  _supported_fmt raw qcow2 qed luks
>>  _supported_proto file
>>  _supported_os Linux
>> +# Using 'cp' is incompatible with external data files
>> +_unsupported_imgopts data_file
> You could compare the external files instead in theory *I think*.
> Another item on some TODO list I guess.

This is a test of qemu-img compare, not of the image format.  So it
doesn’t make much sense to me to compare the external files, and also it
should be completely sufficient to run this test only without external
data files.

[...]

>> diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
>> index 9cd1d60d45..0053bad46a 100755
>> --- a/tests/qemu-iotests/051
>> +++ b/tests/qemu-iotests/051
>> @@ -39,8 +39,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
>>  _sup

Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Dr. David Alan Gilbert
* Laszlo Ersek (ler...@redhat.com) wrote:
> On 11/07/19 11:18, Dr. David Alan Gilbert wrote:
> > * Laszlo Ersek (ler...@redhat.com) wrote:
> >> Hi,
> >>
> >> related TianoCore BZ:
> >>
> >>   https://bugzilla.tianocore.org/show_bug.cgi?id=1871
> >>
> >> (I'm starting this thread separately because at least some of the topics
> >> are specific to QEMU, and I didn't want to litter the BZ with a
> >> discussion that may not be interesting to all participants CC'd on the
> >> BZ. I am keeping people CC'd on this initial posting; please speak up if
> >> you'd like to be dropped from the email thread.)
> >>
> >> QEMU provides guests with the virtio-rng device, and the OVMF and
> >> ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
> >> device. But, that doesn't seem enough for all edk2 use cases.
> >>
> >> Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
> >> absence may affect some other use cases.
> >>
> >>
> >> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
> >> entropy. If the VM config includes virtio-rng (hence the guest firmware
> >> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
> >>
> >> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
> >> UEFI HTTPS boot be disabled completely (or prevented / rejected
> >> somehow), blaming lack of good entropy? Or should TLS silently fall back
> >> to "mixing some counters [such as TSC] together and applying a
> >> deterministic cryptographic transformation"?
> >>
> >> IOW, knowing that the TLS setup may not be based on good quality
> >> entropy, should we allow related firmware services to "degrade silently"
> >> (not functionally, but potentially in security), or should we deny the
> >> services altogether?
> > 
> > I don't see a downside to insisting that if you want to use https then
> > you must provide an entropy source; they're easy enough to add using
> > virtio-rng if the CPU doesn't provide it.
> 
> Possibly true; however it could be considered a usability regression by
> end-users. ("UEFI HTTPS boot used to work, now it breaks with the same
> VM config". Unless we can respond, "UEFI HTTPS boot's TLS init has never
> been secure enough", they'll have a point. See also Ard's followup.)

You could turn it into a scary warning for a few releases first.

> > 
> >>
> >> (2) It looks like the SMM driver implementing the privileged part of the
> >> UEFI variable runtime service could need access to good quality entropy,
> >> while running in SMM; in the future.
> >>
> >> This looks problematic on QEMU. Entropy is a valuable resource, and
> >> whatever resource SMM drivers depend on, should not be possible for e.g.
> >> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
> >> Therefore, it's not *only* the case that SMM drivers must not consume
> >> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
> >> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
> >> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
> >>
> >> Furthermore, assuming we dedicate a hardware entropy device specifically
> >> to SMM drivers, such a device cannot be PCI(e). It would have to be a
> >> platform device at a fixed location (IO port or MMIO) that is only
> >> accessible to such guest code that executes in SMM. IOW, device access
> >> would have to be restricted similarly to pflash. (In fact the variable
> >> SMM driver will need, AIUI, the entropy for encrypting various variable
> >> contents, which are then written into pflash.)
> > 
> > Ewww.  I guess a virtio-rng instance wired to virtio-mmio could do that.
> > It's a bit grim though.
> 
> *shudder* please let's keep virtio-mmio (or any remotely enumerable /
> complex "bus" thingy) out of this :( I'm all for extensible hardware
> interfaces, but cramming more and more infrastructure code into SMM
> looks very questionable to me.

The reason I suggested virtio-mmio was because it's not enumerable; it's
a fixed location; so you just check that the device you have there is
what you expect.
It means not inventing a new qemu device (although you would have to
make it addable on x86, and you would have to make it hideable in SMM).
(pci with preallocated addresses is similar).

> My main concern here is that most physical platform vendors will just
> solder some physical entropy-gen chip onto their boards, and then
> hard-code the MMIO base address of that as a build-time constant in
> their firmware blobs. This obviously won't work for QEMU, where the hw
> can change from boot to boot; so the generic edk2 solution (regardless
> of the actual chip) needs to allow for that kind of dynamism. This is a
> recurrent problem between QEMU and edk2, alas. The answer is of course
> dynamic detection, but I *still* like to keep the enumeration logic to
> the absolute minimum in SMM.

While the hw can change from boot to boot on qemu, there's no
requirement that as a bios you respect that;  ju

Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Paolo Bonzini
On 07/11/19 11:25, Ard Biesheuvel wrote:
>> This looks problematic on QEMU. Entropy is a valuable resource, and
>> whatever resource SMM drivers depend on, should not be possible for e.g.
>> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
>> Therefore, it's not *only* the case that SMM drivers must not consume
>> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
>> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
>> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
>>
> The typical model is to seed a DRBG [deterministic pseudorandom
> sequence generator] using a sufficient amount of high quality entropy.
> Once you have done that, it is rather hard to exhaust a DRBG - it is a
> mathematical construction that is designed to last for a long time (<=
> 2^48 invocations [not bytes] according to the NIST spec), after which
> it does not degrade although it may have generated so much output that
> its internal state may be inferred if you have captured enough of it
> (which is a rather theoretical issue IMHO)
> 
> The problem is that using the output of a DRBG as a seed is
> non-trivial - the spec describes ways to do this, but wiring
> virtio-rng to a DRBG in the host and using its output to seed a DRBG
> in the guest is slightly problematic.
> 
> So it seems to me that the correct way to model this is to make the
> host's true entropy source a shared resource like any other.
> 

Yes, I would make SMM use a cryptographic pseudo-random number generator 
and seed it from virtio-rng from DXE, way before the OS starts and can 
"attack" it.

Once you've gotten a seed, you can create a CSPRNG with a stream cipher 
such as ChaCha20, which is literally 30 lines of code.

Paolo

#include <stdint.h>

#define ROTL(a,b) (((a) << (b)) | ((a) >> (32 - (b))))
#define QR(a, b, c, d) (\
a += b,  d ^= a,  d = ROTL(d,16),   \
c += d,  b ^= c,  b = ROTL(b,12),   \
a += b,  d ^= a,  d = ROTL(d, 8),   \
c += d,  b ^= c,  b = ROTL(b, 7))
#define ROUNDS 20

// initial state:
// in[0] = 0x65787061
// in[1] = 0x6e642033
// in[2] = 0x322d6279
// in[3] = 0x7465206b
// in[4]..in[11] = key (seed)
// in[12]..in[13] = 64-bit block counter, starts at 0
// in[14]..in[15] = nonce, can probably use RDTSC?
static uint32_t in[16];

uint32_t chacha_rng(void)
{
    int i;
    static uint32_t x[16];
    static uint32_t p = 16;     /* force generating a block on first call */

    if (p < 16) {
        i = p++;
        return in[i] + x[i];
    }

    if (++in[12] == 0)
        ++in[13];

    for (i = 0; i < 16; ++i)
        x[i] = in[i];

    // 10 loops × 2 rounds/loop = 20 rounds
    for (i = 0; i < ROUNDS; i += 2) {
        // Odd round
        QR(x[0], x[4], x[ 8], x[12]); // column 0
        QR(x[1], x[5], x[ 9], x[13]); // column 1
        QR(x[2], x[6], x[10], x[14]); // column 2
        QR(x[3], x[7], x[11], x[15]); // column 3
        // Even round
        QR(x[0], x[5], x[10], x[15]); // diagonal 1 (main diagonal)
        QR(x[1], x[6], x[11], x[12]); // diagonal 2
        QR(x[2], x[7], x[ 8], x[13]); // diagonal 3
        QR(x[3], x[4], x[ 9], x[14]); // diagonal 4
    }
    p = 1;
    return in[0] + x[0];
}



Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Daniel P . Berrangé
On Thu, Nov 07, 2019 at 11:10:57AM +0100, Laszlo Ersek wrote:
> Hi,
> 
> related TianoCore BZ:
> 
>   https://bugzilla.tianocore.org/show_bug.cgi?id=1871
> 
> (I'm starting this thread separately because at least some of the topics
> are specific to QEMU, and I didn't want to litter the BZ with a
> discussion that may not be interesting to all participants CC'd on the
> BZ. I am keeping people CC'd on this initial posting; please speak up if
> you'd like to be dropped from the email thread.)
> 
> QEMU provides guests with the virtio-rng device, and the OVMF and
> ArmVirtQemu* edk2 platforms build EFI_RNG_PROTOCOL on top of that
> device. But, that doesn't seem enough for all edk2 use cases.
> 
> Also, virtio-rng (hence EFI_RNG_PROTOCOL too) is optional, and its
> absence may affect some other use cases.

The optional nature of virtio-rng is something that results in
the same problems for Linux.

If virtio-rng is absent, then Linux now has a general purpose fallback
via the CPU timing jitter entropy source:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb5530e4082446aac3a3d69780cd4dbfa4520013

Is it practical to provide a jitter entropy source for EDK2
too?

> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
> entropy. If the VM config includes virtio-rng (hence the guest firmware
> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
> 
> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
> UEFI HTTPS boot be disabled completely (or prevented / rejected
> somehow), blaming lack of good entropy? Or should TLS silently fall back
> to "mixing some counters [such as TSC] together and applying a
> deterministic cryptographic transformation"?
> 
> IOW, knowing that the TLS setup may not be based on good quality
> entropy, should we allow related firmware services to "degrade silently"
> (not functionally, but potentially in security), or should we deny the
> services altogether?

If we can guarantee we always present a fallback like jitterentropy
then the problem with TLS init goes away IIUC.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Daniel P . Berrangé
On Thu, Nov 07, 2019 at 12:37:11PM +0100, Paolo Bonzini wrote:
> On 07/11/19 11:25, Ard Biesheuvel wrote:
> >> This looks problematic on QEMU. Entropy is a valuable resource, and
> >> whatever resource SMM drivers depend on, should not be possible for e.g.
> >> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
> >> Therefore, it's not *only* the case that SMM drivers must not consume
> >> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
> >> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
> >> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
> >>
> > The typical model is to seed a DRBG [deterministic pseudorandom
> > sequence generator] using a sufficient amount of high quality entropy.
> > Once you have done that, it is rather hard to exhaust a DRBG - it is a
> > mathematical construction that is designed to last for a long time (<=
> > 2^48 invocations [not bytes] according to the NIST spec), after which
> > it does not degrade although it may have generated so much output that
> > its internal state may be inferred if you have captured enough of it
> > (which is a rather theoretical issue IMHO)
> > 
> > The problem is that using the output of a DRBG as a seed is
> > non-trivial - the spec describes ways to do this, but wiring
> > virtio-rng to a DRBG in the host and using its output to seed a DRBG
> > in the guest is slightly problematic.
> > 
> > So it seems to me that the correct way to model this is to make the
> > host's true entropy source a shared resource like any other.
> > 
> 
> Yes, I would make SMM use a cryptographic pseudo-random number generator 
> and seed it from virtio-rng from DXE, way before the OS starts and can 
> "attack" it.
> 
> Once you've gotten a seed, you can create a CSPRNG with a stream cipher 
> such as ChaCha20, which is literally 30 lines of code.

If all we need is a one-time seed then virtio-rng is possibly overkill as
that provides a continuous stream. Instead could QEMU read a few bytes
from the host's /dev/urandom and pass it to EDK via fw_cfg, which can
use it for the CSPRNG seed. EDK would have to erase the fw_cfg field
to prevent the seed value leaking to the guest OS, but other than that
its quite straightforward.
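(A sketch of the guest-side handling described above: copy the one-time seed out of the shared area, then scrub the source so the value cannot leak to the guest OS later. The fw_cfg plumbing itself is elided; `shared` merely stands in for the mapped item.)

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Consume a one-time seed from a shared buffer (standing in for the
 * fw_cfg item) and scrub the source afterwards.  The volatile pointer
 * keeps the compiler from optimizing the scrub away as a dead store. */
void consume_seed(uint8_t *shared, size_t len, uint8_t *seed_out)
{
    volatile uint8_t *p = shared;
    size_t i;

    memcpy(seed_out, shared, len);
    for (i = 0; i < len; i++)
        p[i] = 0;
}
```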

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PULL 0/1] Seabios 20191106 patches

2019-11-07 Thread Peter Maydell
On Wed, 6 Nov 2019 at 12:26, Gerd Hoffmann  wrote:
>
> The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:
>
>   Merge remote-tracking branch 
> 'remotes/palmer/tags/palmer-for-master-4.2-sf1' into staging (2019-11-02 
> 17:59:03 +)
>
> are available in the Git repository at:
>
>   git://git.kraxel.org/qemu tags/seabios-20191106-pull-request
>
> for you to fetch changes up to 58b16e57ded751e2e8be626124aad1d46a408a33:
>
>   seabios: update to pre-1.13 snapshot (2019-11-06 13:23:02 +0100)
>
> 
> seabios: update to pre-1.13 snapshot
>
> 
>
> Gerd Hoffmann (1):
>   seabios: update to pre-1.13 snapshot

Hi; this fails 'make check' on at least
aarch64, aarch32, FreeBSD, NetBSD, s390:

ERROR:/home/linux1/qemu/tests/boot-sector.c:161:boot_sector_test:
assertion failed (signature == SIGNATURE): (0x == 0xdead)
ERROR - Bail out!
ERROR:/home/linux1/qemu/tests/boot-sector.c:161:boot_sector_test:
assertion failed (signature == SIGNATURE): (0x == 0xdead)
PASS 9 bios-tables-test /x86_64/acpi/q35/bridge
Aborted (core dumped)
/home/linux1/qemu/tests/Makefile.include:916: recipe for target
'check-qtest-i386' failed
make: *** [check-qtest-i386] Error 1

the x86-64 bootsector tests seem to fail similarly.

thanks
-- PMM



Re: [RFC v2 11/14] linux-headers/kvm.h: add capability to forward hypercall

2019-11-07 Thread Guoheyi




On 2019/11/7 16:57, Michael S. Tsirkin wrote:

On Thu, Nov 07, 2019 at 09:44:36AM +0800, Guoheyi wrote:


On 2019/11/7 1:55, Cornelia Huck wrote:

On Tue, 5 Nov 2019 17:10:53 +0800
Heyi Guo  wrote:


To keep backward compatibility, we add new KVM capability
"KVM_CAP_FORWARD_HYPERCALL" to probe whether KVM supports forwarding
hypercall to userspace.

The capability should be enabled explicitly, for we don't want user
space application to deal with unexpected hypercall exits. After
enabling this cap, all HVC calls unhandled by kvm will be forwarded to
user space.

Signed-off-by: Heyi Guo 
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Cornelia Huck 
Cc: Paolo Bonzini 
Cc: Dave Martin 
Cc: Marc Zyngier 
Cc: Mark Rutland 
Cc: James Morse 
---
   linux-headers/linux/kvm.h |  1 +
   target/arm/sdei.c | 16 
   target/arm/sdei.h |  2 ++
   3 files changed, 19 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 3d9b18f7f8..36c9b3859f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1000,6 +1000,7 @@ struct kvm_ppc_resize_hpt {
   #define KVM_CAP_PMU_EVENT_FILTER 173
   #define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174
   #define KVM_CAP_HYPERV_DIRECT_TLBFLUSH 175
+#define KVM_CAP_FORWARD_HYPERCALL 176
   #ifdef KVM_CAP_IRQ_ROUTING

Is this cap upstream already? I would have thought your header sync
would have brought it in, then. (Saying this, that header sync looks
awfully small.)

If it is not upstream yet, please split off this hunk into a separate
patch -- it's a bit annoying, but makes life easier for merging.

No, it is not upstream yet. The whole framework and interfaces between KVM
and qemu are still under discussion. I'll keep this in mind when moving
forward to the next steps...

Thanks,
HG

It's best to add it in some other place meanwhile.
Do you mean to split this patch from the whole patch set and send it 
separately? Sorry I'm not clear about maintainers' work and may bring 
you some trouble...


Thanks,
HG


Then we can drop it when it's in an upstream header.





Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
On 11/07/19 11:25, Ard Biesheuvel wrote:
> On Thu, 7 Nov 2019 at 11:11, Laszlo Ersek  wrote:

>> (1) For UEFI HTTPS boot, TLS would likely benefit from good quality
>> entropy. If the VM config includes virtio-rng (hence the guest firmware
>> has EFI_RNG_PROTOCOL), then it should be used as a part of HTTPS boot.
>>
>> However, what if virtio-rng (hence EFI_RNG_PROTOCOL) are absent? Should
>> UEFI HTTPS boot be disabled completely (or prevented / rejected
>> somehow), blaming lack of good entropy? Or should TLS silently fall back
>> to "mixing some counters [such as TSC] together and applying a
>> deterministic cryptographic transformation"?
>>
>> IOW, knowing that the TLS setup may not be based on good quality
>> entropy, should we allow related firmware services to "degrade silently"
>> (not functionally, but potentially in security), or should we deny the
>> services altogether?
>>
> 
> TLS uses a source of randomness to establish symmetric session keys
> for encryption. So it really depends on the use case whether HTTPS is
> used for authentication or for confidentiality, and it seems to me
> that it would typically be the former. So disabling HTTPS boot in this
> case seems counterproductive to me.

OK. So this might be an argument for an RngLib instance that tries to
consume EFI_RNG_PROTOCOL, and if the protocol is absent, the lib
instance falls back to a TSC-seeded PRNG.

We'd have to make sure (or prove) that the protocol lookup in the lib
occurs *after* BDS made an attempt to connect the virtio-rng device(s).

>> (2) It looks like the SMM driver implementing the privileged part of the
>> UEFI variable runtime service could need access to good quality entropy,
>> while running in SMM; in the future.
>>
>> This looks problematic on QEMU. Entropy is a valuable resource, and
>> whatever resource SMM drivers depend on, should not be possible for e.g.
>> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
>> Therefore, it's not *only* the case that SMM drivers must not consume
>> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
>> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
>> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
>>
> 
> The typical model is to seed a DRBG [deterministic pseudorandom
> sequence generator] using a sufficient amount of high quality entropy.
> Once you have done that, it is rather hard to exhaust a DRBG - it is a
> mathematical construction that is designed to last for a long time (<=
> 2^48 invocations [not bytes] according to the NIST spec), after which
> it does not degrade although it may have generated so much output that
> its internal state may be inferred if you have captured enough of it
> (which is a rather theoretical issue IMHO)

Thanks! I think this helps.

Because then the guest SMM code could read the seed from (for example) a
well-known PCI BDF (such as 0/0/0), at either a fixed config space
offset, or from a vendor capability. This doesn't depend on PCI
enumeration, and it also cannot be interfered with by 3rd party UEFI
code or OS code, because the only such reads would occur in the subject
SMM drivers' entry point functions.

On the Q35 board in QEMU, we already use some config space registers
that are left unspecified in Intel datasheet 316966-002, Table 5-1 "DRAM
Controller Register Address Map (D0:F0)", for various paravirt purposes.
We haven't run out of such "free for the taking" config space registers
yet, and for this particular purpose we only need a single byte
register. (The first read would expose whether the feature were
supported, the other reads would provide bytes for the seed.)
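(A sketch of what the SMM-side consumer of such a register could look like. The register offset and the 0xFF "absent" convention are invented for this example, and the actual 0xCF8/0xCFC or MMCONFIG access is abstracted behind a callback so the logic can be exercised without hardware; `fake_stream` below is a test double, not a real device.)

```c
#include <stdint.h>

/* Hypothetical single-byte register offset in D0:F0 config space --
 * not a real QEMU interface, just an illustration of the scheme. */
#define SEED_REG 0xF0

/* Abstracts the config-space access mechanism (0xCF8/0xCFC, MMCONFIG). */
typedef uint8_t (*cfg_read8_fn)(uint8_t bus, uint8_t dev,
                                uint8_t func, uint8_t offset);

/* The first read probes for support (0xFF = feature absent); the
 * subsequent reads each return one seed byte. */
int read_smm_seed(cfg_read8_fn cfg_read8, uint8_t *seed, int len)
{
    int i;

    if (cfg_read8(0, 0, 0, SEED_REG) == 0xFF)
        return -1;
    for (i = 0; i < len; i++)
        seed[i] = cfg_read8(0, 0, 0, SEED_REG);
    return 0;
}

/* Test double: a deterministic byte stream standing in for the device. */
static uint8_t fake_stream(uint8_t bus, uint8_t dev,
                           uint8_t func, uint8_t offset)
{
    static uint8_t next;
    (void)bus; (void)dev; (void)func; (void)offset;
    return next++;
}
```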

> 
> The problem is that using the output of a DRBG as a seed is
> non-trivial - the spec describes ways to do this, but wiring
> virtio-rng to a DRBG in the host and using its output to seed a DRBG
> in the guest is slighly problematic.

Can we forward /dev/urandom from the host to the guest through an
interface like described above? (Single byte config space register.)

> So it seems to me that the correct way to model this is to make the
> host's true entropy source a shared resource like any other.

I don't know enough to agree or disagree. I guess this might require
additional permission management on the host side.

(NB my only purpose with this thread is to ensure that the internal edk2
interfaces, such as lib class APIs and possible SMM protocols, will
offer the dynamism that's necessary when running on QEMU.)

Thanks!
Laszlo




Re: [PATCH 1/2] migration/compress: compress QEMUFile is not writable

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> We open a file with empty_ops for compress QEMUFile, which means this is
> not writable.

That explanation sounds reasonable; but I'm confused by the history of
this;  the code was added by Liang Li in :

  b3be289 qemu-file: Fix qemu_put_compression_data flaw

  ( https://www.mail-archive.com/qemu-devel@nongnu.org/msg368974.html )

with almost exactly the opposite argument;  can we figure out why?

Dave

> Signed-off-by: Wei Yang 
> ---
>  migration/qemu-file.c | 16 +++-
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 26fb25ddc1..f3d99a96ec 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -744,11 +744,8 @@ static int qemu_compress_data(z_stream *stream, uint8_t 
> *dest, size_t dest_len,
>  /* Compress size bytes of data start at p and store the compressed
>   * data to the buffer of f.
>   *
> - * When f is not writable, return -1 if f has no space to save the
> - * compressed data.
> - * When f is wirtable and it has no space to save the compressed data,
> - * do fflush first, if f still has no space to save the compressed
> - * data, return -1.
> + * Since the file is a dummy file with empty_ops, return -1 if f has no space 
> to
> + * save the compressed data.
>   */
>  ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
>const uint8_t *p, size_t size)
> @@ -756,14 +753,7 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream 
> *stream,
>  ssize_t blen = IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
>  
>  if (blen < compressBound(size)) {
> -if (!qemu_file_is_writable(f)) {
> -return -1;
> -}
> -qemu_fflush(f);
> -blen = IO_BUF_SIZE - sizeof(int32_t);
> -if (blen < compressBound(size)) {
> -return -1;
> -}
> +return -1;
>  }
>  
>  blen = qemu_compress_data(stream, f->buf + f->buf_index + 
> sizeof(int32_t),
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
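(For context, the space check the patch simplifies hinges on zlib's compressBound(), which reports the worst-case deflate output size. The sketch below restates the invariant with an approximation of that bound; treat the exact formula as illustrative, not zlib's contract.)

```c
#include <stdbool.h>

/* Roughly what zlib's compressBound() computes: the source length plus
 * deflate's worst-case expansion overhead. */
static unsigned long worst_case_bound(unsigned long src_len)
{
    return src_len + (src_len >> 12) + (src_len >> 14)
                   + (src_len >> 25) + 13;
}

/* The invariant the patch enforces: never start compressing unless the
 * remaining buffer can hold the worst case, since a dummy QEMUFile
 * cannot be flushed to make room. */
bool can_compress_into(unsigned long buf_remaining, unsigned long src_len)
{
    return buf_remaining >= worst_case_bound(src_len);
}
```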




Re: [PATCH 0/6] migration/postcopy: enable compress during postcopy

2019-11-07 Thread Wei Yang
On Thu, Nov 07, 2019 at 09:15:44AM +, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> On Wed, Nov 06, 2019 at 08:11:44PM +, Dr. David Alan Gilbert wrote:
>> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> >> This patch set tries enable compress during postcopy.
>> >> 
>> >> postcopy requires to place a whole host page, while migration thread 
>> >> migrate
>> >> memory in target page size. This makes postcopy need to collect all target
>> >> pages in one host page before placing via userfaultfd.
>> >> 
>> >> To enable compress during postcopy, there are two problems to solve:
>> >> 
>> >> 1. Random order for target page arrival
>> >> 2. Target pages in one host page arrives without interrupt by target
>> >>page from other host page
>> >> 
>> >> The first one is handled by counting the number of target pages arrived
>> >> instead of the last target page arrived.
>> >> 
>> >> The second one is handled by:
>> >> 
>> >> 1. Flush compress thread for each host page
>> >> 2. Wait for decompress thread for before placing host page
>> >> 
>> >> With the combination of these two changes, compress is enabled during
>> >> postcopy.
>> >
>> >What have you tested this with? 2MB huge pages I guess?
>> >
>> 
>> I tried with this qemu option:
>> 
>>-object 
>> memory-backend-file,id=mem1,mem-path=/dev/hugepages/guest2,size=4G \
>>-device pc-dimm,id=dimm1,memdev=mem1
>> 
>> /dev/hugepages/guest2 is a file under hugetlbfs
>> 
>>hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
>
>OK, yes that should be fine.
>I suspect on Power/ARM where they have normal memory with 16/64k pages,
>the cost of the flush will mean compression is more expensive in
>postcopy mode; but still makes it possible.
>

I don't quite get your point about it being more expensive. Do you mean more
expensive on ARM/Power?

If the solution looks good to you, I would prepare v2.

>Dave
>
>> >Dave
>> >
>> >> Wei Yang (6):
>> >>   migration/postcopy: reduce memset when it is zero page and
>> >> matches_target_page_size
>> >>   migration/postcopy: wait for decompress thread in precopy
>> >>   migration/postcopy: count target page number to decide the
>> >> place_needed
>> >>   migration/postcopy: set all_zero to true on the first target page
>> >>   migration/postcopy: enable random order target page arrival
>> >>   migration/postcopy: enable compress during postcopy
>> >> 
>> >>  migration/migration.c | 11 
>> >>  migration/ram.c   | 65 ++-
>> >>  2 files changed, 45 insertions(+), 31 deletions(-)
>> >> 
>> >> -- 
>> >> 2.17.1
>> >> 
>> >--
>> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>> 
>> -- 
>> Wei Yang
>> Help you, Help me
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [RFC PATCH 00/18] Add qemu-storage-daemon

2019-11-07 Thread Kevin Wolf
Am 07.11.2019 um 11:33 hat Daniel P. Berrangé geschrieben:
> On Thu, Oct 17, 2019 at 03:01:46PM +0200, Kevin Wolf wrote:
> > 2. I'm not completely sure if the command line syntax is the final
> >version that we want to support long-term. Many options directly use
> >QAPI visitors (--blockdev, --export, --nbd-server) and should be
> >fine. However, others use QemuOpts (--chardev, --monitor, --object).
> > 
> >This is the same as in the system emulator, so we wouldn't be adding
> >a new problem, but as there was talk about QAPIfying the command
> >line, and I wouldn't want later syntax changes or adding lots of
> >compatibility code to a new tool, I thought we should probably
> >discuss whether QAPIfying from the start would be an option.
> 
> I think that following what the QEMU emulators currently do for
> CLI args should be an explicit anti-goal, because we know that it is
> a long standing source of pain.  Fixing it in the emulator binaries
> is hard due to backward compatibility, but for this new binary we
> have a clean slate.
> 
> This feels like a good opportunity to implement & demonstrate what
> we think QEMU configuration ought to look like. Work done for this
> in the qemu-storage-daemon may well help us understand how we'll
> be able to move the QEMU emulators into a new scheme later.

It might be, which is why I'm asking. Now that the storage daemon has
missed 4.2, we have a little more time to decide what the command line
should look like in detail.

However, I don't think this is something that should delay the storage
daemon until after 5.0.

> My personal wish would be to have no use of QemuOpts at all.
> 
> Use GOptionContext *only* for parsing command line arguments
> related to execution of the daemon - ie things like --help,
> --version, --daemon, --pid-file.

I really don't believe that the solution for having too much variety in
option parsing is adding in yet another type. GOptionContext is not
something I'm considering at the moment.

But it's a getopt() replacement, not something that could actually parse
the more complex options, so it's a separate question anyway. If we ever
want to use it, we can replace getopt() in all binaries at once.

> The use a "--config /path/to/json/file" arg to point to the config
> file for everything else using QAPI schema to describe it fully.

If this implies that the storage daemon can only do useful things if you
specify a config file, I disagree.

I agree that it's not really nice if you can't use a config file to
specify a lengthy configuration and that supporting one would be good.

But it is at least equally unfriendly to require a config file for
simple configurations where using command line arguments is easily
possible.

> When loading the config file, things should be created in order
> in which they are specified. ie don't try and group things,
> otherwise we end up back with the horrific hacks for objects
> where some are created early & some late.

Yes. This is how the storage daemon command line works, too.

I think Markus already had some patches for command line QAPIfication
that were incomplete at least for the system emulator. It might be
easier to make it feature complete for the storage daemon because it
supports much less problematic options. Maybe he can post a useful
subset (if it's too much work to clean up the full thing right now) and
we can work from there.

The one that I expect to be a bit tricky to be QAPIfied is --object.

> For an ambitious stretch goal, I think we should seriously consider
> whether our use of chardevs is appropriate in all cases that exist,
> and whether we can avoid the trap of over-using chardev in the new
> storage daemon since it is a clean slate in terms of user facing
> CLI config.
> 
> chardevs are designed for & reasonably well suited to attaching to
> devices like serial ports, parallel ports, etc. You have a 1:1
> remote:local peer relationship. The transport is a dumb byte
> stream, nothing special needed on top & the user can cope with
> any type of chardev.
> 
> Many cases where we've used chardevs as a backend in QEMU are a
> poor fit. We just used chardevs as an "easy" way to configure a
> UNIX or TCP socket from the CLI, and don't care about, nor work
> with, any other chardev backends. As a result of this misuse
> we've had to put in an increasing number of hacks in the chardev
> code to deal with the fact that callers want to know about & use
> socket semantics. eg FD passing, the chardev reconnection polling
> code.
> 
> The monitor is a prime example of a bad fit - it would be better
> suited by simply referencing a SocketAddress QAPI type, instead
> of having the chardev indirection. It would then directly use
> the QIOChannelSocket APIs and avoid the inconvenient chardev
> abstractions which are a source of complexity & instability for
> no net gain.  vhostuser is another prime example, responsible
> for much of the complexity & bugs recently added

Re: [PATCH 0/6] migration/postcopy: enable compress during postcopy

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> On Thu, Nov 07, 2019 at 09:15:44AM +, Dr. David Alan Gilbert wrote:
> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
> >> On Wed, Nov 06, 2019 at 08:11:44PM +, Dr. David Alan Gilbert wrote:
> >> >* Wei Yang (richardw.y...@linux.intel.com) wrote:
> >> >> This patch set tries enable compress during postcopy.
> >> >> 
> >> >> postcopy requires to place a whole host page, while migration thread 
> >> >> migrate
> >> >> memory in target page size. This makes postcopy need to collect all 
> >> >> target
> >> >> pages in one host page before placing via userfaultfd.
> >> >> 
> >> >> To enable compress during postcopy, there are two problems to solve:
> >> >> 
> >> >> 1. Random order for target page arrival
> >> >> 2. Target pages in one host page arrives without interrupt by target
> >> >>page from other host page
> >> >> 
> >> >> The first one is handled by counting the number of target pages arrived
> >> >> instead of the last target page arrived.
> >> >> 
> >> >> The second one is handled by:
> >> >> 
> >> >> 1. Flush compress thread for each host page
> >> >> 2. Wait for decompress thread for before placing host page
> >> >> 
> >> >> With the combination of these two changes, compress is enabled during
> >> >> postcopy.
> >> >
> >> >What have you tested this with? 2MB huge pages I guess?
> >> >
> >> 
> >> I tried with this qemu option:
> >> 
> >>-object 
> >> memory-backend-file,id=mem1,mem-path=/dev/hugepages/guest2,size=4G \
> >>-device pc-dimm,id=dimm1,memdev=mem1
> >> 
> >> /dev/hugepages/guest2 is a file under hugetlbfs
> >> 
> >>hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
> >
> >OK, yes that should be fine.
> >I suspect on Power/ARM where they have normal memory with 16/64k pages,
> >the cost of the flush will mean compression is more expensive in
> >postcopy mode; but still makes it possible.
> >
> 
> Not get your point clearly about more expensive. You mean more expensive on
> ARM/Power?

Yes;  you're doing a flush at the end of each host page;  on x86 without
hugepages you don't do anything, on arm/power you'll need to do a flush
at the end of each of their normal pages - so that's a bit more
expensive.

> If the solution looks good to you, I would prepare v2.

Yes; I think it is OK.

Dave

> >Dave
> >
> >> >Dave
> >> >
> >> >> Wei Yang (6):
> >> >>   migration/postcopy: reduce memset when it is zero page and
> >> >> matches_target_page_size
> >> >>   migration/postcopy: wait for decompress thread in precopy
> >> >>   migration/postcopy: count target page number to decide the
> >> >> place_needed
> >> >>   migration/postcopy: set all_zero to true on the first target page
> >> >>   migration/postcopy: enable random order target page arrival
> >> >>   migration/postcopy: enable compress during postcopy
> >> >> 
> >> >>  migration/migration.c | 11 
> >> >>  migration/ram.c   | 65 ++-
> >> >>  2 files changed, 45 insertions(+), 31 deletions(-)
> >> >> 
> >> >> -- 
> >> >> 2.17.1
> >> >> 
> >> >--
> >> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> >> 
> >> -- 
> >> Wei Yang
> >> Help you, Help me
> >--
> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> -- 
> Wei Yang
> Help you, Help me
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 1/2] migration/compress: compress QEMUFile is not writable

2019-11-07 Thread Wei Yang
On Thu, Nov 07, 2019 at 11:59:10AM +, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> We open a file with empty_ops for compress QEMUFile, which means this is
>> not writable.
>
>That explanation sounds reasonable; but I'm confused by the history of
>this;  the code was added by Liang Li in :
>
>  b3be289 qemu-file: Fix qemu_put_compression_data flaw
>
>  ( https://www.mail-archive.com/qemu-devel@nongnu.org/msg368974.html )
>
>with almost exactly the opposite argument;  can we figure out why?
>

Hmm... sounds interesting.

Took a look at the change log, which says:

Current qemu_put_compression_data can only work with no writable
QEMUFile, and can't work with the writable QEMUFile. But it does
not provide any measure to prevent users from using it with a
writable QEMUFile.

We should fix this flaw to make it works with writable QEMUFile.

However, I don't see any case where a writable QEMUFile is used here. Am I missing something?

>Dave
>

-- 
Wei Yang
Help you, Help me



Re: guest / host buffer sharing ...

2019-11-07 Thread Stefan Hajnoczi
On Wed, Nov 6, 2019 at 1:50 PM Gerd Hoffmann  wrote:
> > In the graphics buffer sharing use case, how does the other side
> > determine how to interpret this data?
>
> The idea is to have free form properties (name=value, with value being
> a string) for that kind of metadata.
>
> > Shouldn't there be a VIRTIO
> > device spec for the messaging so compatible implementations can be
> > written by others?
>
> Adding a list of common properties to the spec certainly makes sense,
> so everybody uses the same names.  Adding struct-ed properties for
> common use cases might be useful too.

Why not define VIRTIO devices for wayland and friends?

This new device exposes buffer sharing plus properties - effectively a
new device model nested inside VIRTIO.  The VIRTIO device model has
the necessary primitives to solve the buffer sharing problem so I'm
struggling to see the purpose of this new device.

Custom/niche applications that do not wish to standardize their device
type can maintain out-of-tree VIRTIO devices.  Both kernel and
userspace drivers can be written for the device and there is already
VIRTIO driver code that can be reused.  They have access to the full
VIRTIO device model, including feature negotiation and configuration
space.

Stefan



Re: [PATCH 2/2] migration/compress: disable compress if failed to setup

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> In current logic, if compress_threads_save_setup() returns -1 the whole
> migration would fail, while we could handle it gracefully by disable
> compress.

I think it's fine for migration to fail here; the user askd for
compression - if it doesn't work then it's OK to fail and they can
switch it off; since it fails right at the start there's nothing lost.

Dave

> Signed-off-by: Wei Yang 
> ---
>  migration/migration.c |  9 +
>  migration/migration.h |  1 +
>  migration/ram.c   | 15 ---
>  3 files changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 5f7e4d15e9..02b95f4223 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2093,6 +2093,15 @@ bool migrate_use_compression(void)
>  return s->enabled_capabilities[MIGRATION_CAPABILITY_COMPRESS];
>  }
>  
> +void migrate_disable_compression(void)
> +{
> +MigrationState *s;
> +
> +s = migrate_get_current();
> +
> +s->enabled_capabilities[MIGRATION_CAPABILITY_COMPRESS] = false;
> +}
> +
>  int migrate_compress_level(void)
>  {
>  MigrationState *s;
> diff --git a/migration/migration.h b/migration/migration.h
> index 4f2fe193dc..51368d3a6e 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -309,6 +309,7 @@ bool migrate_use_return_path(void);
>  uint64_t ram_get_total_transferred_pages(void);
>  
>  bool migrate_use_compression(void);
> +void migrate_disable_compression(void);
>  int migrate_compress_level(void);
>  int migrate_compress_threads(void);
>  int migrate_compress_wait_thread(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 96c9b16402..39279161a8 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -533,12 +533,12 @@ static void compress_threads_save_cleanup(void)
>  comp_param = NULL;
>  }
>  
> -static int compress_threads_save_setup(void)
> +static void compress_threads_save_setup(void)
>  {
>  int i, thread_count;
>  
>  if (!migrate_use_compression()) {
> -return 0;
> +return;
>  }
>  thread_count = migrate_compress_threads();
>  compress_threads = g_new0(QemuThread, thread_count);
> @@ -569,11 +569,14 @@ static int compress_threads_save_setup(void)
> do_data_compress, comp_param + i,
> QEMU_THREAD_JOINABLE);
>  }
> -return 0;
> +return;
>  
>  exit:
>  compress_threads_save_cleanup();
> -return -1;
> +migrate_disable_compression();
> +error_report("%s: failed to setup compress threads, compress disabled",
> + __func__);
> +return;
>  }
>  
>  /* Multiple fd's */
> @@ -3338,9 +3341,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>  RAMState **rsp = opaque;
>  RAMBlock *block;
>  
> -if (compress_threads_save_setup()) {
> -return -1;
> -}
> +compress_threads_save_setup();
>  
>  /* migration has already setup the bitmap, reuse it. */
>  if (!migration_in_colo_state()) {
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [RFC v2 11/14] linux-headers/kvm.h: add capability to forward hypercall

2019-11-07 Thread Cornelia Huck
On Thu, 7 Nov 2019 19:57:22 +0800
Guoheyi  wrote:

> On 2019/11/7 16:57, Michael S. Tsirkin wrote:
> > On Thu, Nov 07, 2019 at 09:44:36AM +0800, Guoheyi wrote:  
> >>
> >> On 2019/11/7 1:55, Cornelia Huck wrote:  
> >>> On Tue, 5 Nov 2019 17:10:53 +0800
> >>> Heyi Guo  wrote:
> >>>  
>  To keep backward compatibility, we add new KVM capability
>  "KVM_CAP_FORWARD_HYPERCALL" to probe whether KVM supports forwarding
>  hypercall to userspace.
> 
>  The capability should be enabled explicitly, for we don't want user
>  space application to deal with unexpected hypercall exits. After
>  enabling this cap, all HVC calls unhandled by kvm will be forwarded to
>  user space.
> 
>  Signed-off-by: Heyi Guo 
>  Cc: Peter Maydell 
>  Cc: "Michael S. Tsirkin" 
>  Cc: Cornelia Huck 
>  Cc: Paolo Bonzini 
>  Cc: Dave Martin 
>  Cc: Marc Zyngier 
>  Cc: Mark Rutland 
>  Cc: James Morse 
>  ---
> linux-headers/linux/kvm.h |  1 +
> target/arm/sdei.c | 16 
> target/arm/sdei.h |  2 ++
> 3 files changed, 19 insertions(+)
> 
>  diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
>  index 3d9b18f7f8..36c9b3859f 100644
>  --- a/linux-headers/linux/kvm.h
>  +++ b/linux-headers/linux/kvm.h
>  @@ -1000,6 +1000,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_PMU_EVENT_FILTER 173
> #define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174
> #define KVM_CAP_HYPERV_DIRECT_TLBFLUSH 175
>  +#define KVM_CAP_FORWARD_HYPERCALL 176
> #ifdef KVM_CAP_IRQ_ROUTING  
> >>> Is this cap upstream already? I would have thought your header sync
> >>> would have brought it in, then. (Saying this, that header sync looks
> >>> awfully small.)
> >>>
> >>> If it is not upstream yet, please split off this hunk into a separate
> >>> patch -- it's a bit annoying, but makes life easier for merging.  
> >> No, it is not upstream yet. The whole framework and interfaces between KVM
> >> and qemu are still under discussion. I'll keep in mind of this when moving
> >> forward to next steps...
> >>
> >> Thanks,
> >> HG  
> > It's best to add it in some other place meanwhile.  
> Do you mean to split this patch from the whole patch set and send it 
> separately? Sorry I'm not clear about maintainers' work and may bring 
> you some trouble...

My preferred approach:

- add a commit entitled "placeholder for headers update" that contains
  the not-yet-upstream changes in the header files you need
- base the rest of your work on that
...

...
- if kernel changes are upstream: replace the placeholder patch with a
  real update (may include separate patches, if you need an additional
  header); maintainer merges
- if kernel changes are not yet upstream: maintainer merges with
  placeholder to a feature branch, replaces with real update and merges
  once kernel patches hit upstream
(not every maintainer does the second approach; they may ask you
instead to resend with a proper headers update once the kernel changes
are upstream)




Re: [RFC 3/3] tests/vhost-user-fs-test: add vhost-user-fs test case

2019-11-07 Thread Dr. David Alan Gilbert
* Stefan Hajnoczi (stefa...@gmail.com) wrote:
> On Tue, Oct 29, 2019 at 12:36:05AM +, Dr. David Alan Gilbert wrote:
> > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > +static void after_test(void *arg G_GNUC_UNUSED)
> > > +{
> > > +unlink(socket_path);
> > > +
> > > +remove_dir_and_children(shared_dir);
> > 
> > This scares me. Especially since it's running as root.
> > Can we add a bunch of paranoid checks to make sure it doesn't
> > end up rm -rf / ?
> 
> Yes, we can resolve the path and check it is not "/".

I suggest checking for "/", ".", ".." and "";
if any of those get in, it's probably bad.

> > > +/* Open a file by nodeid using FUSE_OPEN */
> > > +static int32_t fuse_open(QVirtioFS *vfs, uint64_t nodeid, uint32_t flags,
> > > + uint64_t *fh)
> > > +{
> > > +struct fuse_in_header in_hdr = {
> > > +.opcode = guest32(FUSE_OPEN),
> > > +.unique = guest64(virtio_fs_get_unique(vfs)),
> > > +.nodeid = guest64(nodeid),
> > > +};
> > > +struct fuse_open_in in = {
> > > +.flags = guest32(flags),
> > > +};
> > > +struct iovec sg_in[] = {
> > > +{ .iov_base = &in_hdr, .iov_len = sizeof(in_hdr) },
> > > +{ .iov_base = &in, .iov_len = sizeof(in) },
> > > +};
> > > +struct fuse_out_header out_hdr;
> > > +struct fuse_open_out out;
> > > +struct iovec sg_out[] = {
> > > +{ .iov_base = &out_hdr, .iov_len = sizeof(out_hdr) },
> > > +{ .iov_base = &out, .iov_len = sizeof(out) },
> > > +};
> > 
> > I wonder if anything can be done to reduce the size of the iovec boiler
> > plate?
> 
> I'm not aware of a clean way to build the iovec array automatically but
> we could do this if you prefer it:
> 
>   #define IOVEC(elem) { .iov_base = &elem, .iov_len = sizeof(elem) }
> 
>   struct iovec sg_in[] = {
> IOVEC(in_hdr),
> IOVEC(in),
>   };
> 
> Do you find this nicer?

Only a little; probably not worth it.

Dave

> Stefan


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
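For reference, the IOVEC macro Stefan sketches does compile to exactly the designated-initializer form used in the test. A minimal standalone sketch follows; the `fake_*` structs are hypothetical stand-ins for the FUSE request structs, not the real `fuse.h` definitions:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* The macro from the discussion: build one iovec entry from any lvalue. */
#define IOVEC(elem) { .iov_base = &(elem), .iov_len = sizeof(elem) }

/* Hypothetical stand-ins for the FUSE request structs used by fuse_open(). */
struct fake_in_header { uint32_t opcode; uint64_t unique; };
struct fake_open_in   { uint32_t flags; };

/* Build the scatter-gather list the same way fuse_open() does. */
static size_t demo_sg_len(void)
{
    struct fake_in_header in_hdr = { 0 };
    struct fake_open_in in = { 0 };
    struct iovec sg_in[] = { IOVEC(in_hdr), IOVEC(in) };

    return sg_in[0].iov_len + sg_in[1].iov_len;
}
```

The extra parentheses around `elem` in the expansion make the macro safe for any lvalue expression, not just simple identifiers.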




Re: [PATCH v8 1/3] docs: improve qcow2 spec about extending image header

2019-11-07 Thread Vladimir Sementsov-Ogievskiy
06.11.2019 22:19, Eric Blake wrote:
> On 10/18/19 9:36 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>> Maybe:
>>>
>>> if software doesn't know how to interpret the field, it may be safely 
>>> ignored unless a corresponding incompatible feature flag bit is set; 
>>> however, the field should be preserved unchanged when rewriting the image 
>>> header.
>>>
 +
 +For all additional fields zero value equals to absence of field (absence 
 is
 +when field.offset + field.size > @header_length). This implies
 +that if software want's to set fields up to some field not aligned to 
 multiply
 +of 8 it must align header up by zeroes. And on the other hand, if software
 +need some optional field which is absent it should assume that it's value 
 is
 +zero.
>>>
>>> Maybe:
>>>
>>> Each optional field that does not have a corresponding incompatible feature 
>>> bit must support the value 0 that gives the same default behavior as when 
>>> the optional field is omitted.
>>
>> Hmmm. That doesn't work, as "corresponding" is something not actually 
>> defined. Consider our zstd extension.
>>
>> It has corresponding incompatible bit, therefore, this sentence doesn't 
>> apply to it. But still, if incompatible bit is unset we can have this field. 
>> And it's zero value must correspond
>> to the absence of the field.
>>
>> So, additional field may use incomaptible bit only for subset of its values.
>>
>> But, I see, that you want to allow 0 value to not match field-absence if 
>> incompatible bit is set?
> 
> Not necessarily.  Rather, if the value of an unknown field can be safely 
> ignored, then it should default to 0.  If it cannot be safely ignored, then 
> that field will not be set to a non-zero value without also setting an 
> incompatible feature flag, so that software that does not know how to 
> interpret the field will fail to load the image because it also fails to 
> recognize the associated new incompatible feature bit.
> 
> But I'd really like Kevin's opinion on how much wording is worth adding.
> 
>>
>> So, may be
>>
>> Additional fields has the following compatible behavior by default:
> 
> s/has/have/
> 
>>
>> 1. If software doesn't know how to interpret the field, it may be safely 
>> ignored, other than preserving the field unchanged when rewriting the image 
>> header.
>> 2. Zeroed additional field gives the same behavior as when this field is 
>> omitted.
> 
> Both good.
> 
>>
>> This default behavior may be altered with help of incompatible feature bits. 
>> So, if, for example, additional field has corresponding incompatible feature 
>> bit, and it is set, we are sure that software which opens the image knows 
>> how to interpret the field, so,
>> 1. The field definitely will not be ignored when corresponding incompatible 
>> bit is set.
>> 2. The field may define any meaning it wants for zero value for the case 
>> when corresponding incompatible bit is set.
> 
> Rather wordy but seems accurate.  Perhaps compress it to:
> 
> 3. Any additional field whose value must not be ignored for correct handling 
> of the file will be accompanied by a corresponding incompatible feature bit.
> 
> and maybe even reorder it to list the points as:
> 
> Additional fields have the following compatibility rules:
> 1. Any additional field whose value must not be ignored for correct handling 
> of the file will be accompanied by a corresponding incompatible feature bit.

I'd like to stress that the incompatible bit is needed for an incompatible
value, not for the field itself. (So a field may be accompanied by an
incompatible bit for some of its values but not for others.) So, what about:

1. If the value of the additional field must not be ignored for correct 
handling of the file, it will be accompanied by a corresponding incompatible 
feature bit.

> 2. If there are no unrecognized incompatible feature bits set, an additional 
> field may be safely ignored other than preserving its value when rewriting 
> the image header.

Sounds like we can ignore the field if we know its meaning and know its 
incompatible bit..

2. If there are no unrecognized incompatible feature bits set, an unknown 
additional field may be safely ignored other than preserving its value when 
rewriting the image header.

> 3. An explicit value of 0 will have the same behavior as when the field is 
> not present.

But it may be changed by incompatible bit..

3. An explicit value of 0 will have the same behavior as when the field is not 
present, if not altered by specific incompatible bit.

> 
> 
 +It's allowed for the header end to cut some field in the middle (in this 
 case
 +the field is considered as absent), but in this case the part of the field
 +which is covered by @header_length must be zeroed.
 +
 +    < ... No additional fields in the header currently ... >
>>>
>>> Do we even still need this if we require 8-byte alignment?  We'd never be 
>>> able to cut a single field in the middle
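The absence rule being debated (a field whose bytes lie even partly beyond @header_length reads as absent, and absent means zero) can be sketched as follows. This is an illustration of the rule, not qcow2 code; the field offset used in it is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/*
 * Sketch: read an optional 64-bit header field. If the field lies (even
 * partly) beyond header_length, it is absent, and an absent field reads
 * as zero -- the same default behavior as when the field is omitted.
 */
static uint64_t read_optional_u64(const uint8_t *header,
                                  uint32_t header_length,
                                  uint32_t field_offset)
{
    uint64_t value;

    /* Field absent, or cut in the middle by the header end: default to 0. */
    if (field_offset + sizeof(value) > header_length) {
        return 0;
    }
    memcpy(&value, header + field_offset, sizeof(value));
    return value;
}
```

Software that does not recognize the field follows the same rule, which is what makes "zero equals absence" a compatible default unless an incompatible feature bit says otherwise.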

[Patch v2 1/6] migration/postcopy: reduce memset when it is zero page and matches_target_page_size

2019-11-07 Thread Wei Yang
In this case (a zero page whose block page size matches TARGET_PAGE_SIZE),
the page_buffer content would not be used.

Skip the memset to save some time.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 99a98b2da4..7938a643d9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4091,7 +4091,13 @@ static int ram_load_postcopy(QEMUFile *f)
 switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
 case RAM_SAVE_FLAG_ZERO:
 ch = qemu_get_byte(f);
-memset(page_buffer, ch, TARGET_PAGE_SIZE);
+/*
+ * Can skip to set page_buffer when
+ * this is a zero page and (block->page_size == TARGET_PAGE_SIZE).
+ */
+if (ch || !matches_target_page_size) {
+memset(page_buffer, ch, TARGET_PAGE_SIZE);
+}
 if (ch) {
 all_zero = false;
 }
-- 
2.17.1




[Patch v2 0/6] migration/postcopy: enable compress during postcopy

2019-11-07 Thread Wei Yang
This patch set tries to enable compression during postcopy.

Postcopy requires placing a whole host page, while the migration thread
migrates memory in target-page-sized chunks. This means postcopy needs
to collect all the target pages of one host page before placing them via
userfaultfd.

To enable compression during postcopy, there are two problems to solve:

1. Target pages may arrive in random order
2. The target pages of one host page must arrive without being
   interleaved with target pages from other host pages

The first one is handled by counting the number of target pages that
have arrived instead of checking for the last target page.

The second one is handled by:

1. Flushing the compress threads for each host page
2. Waiting for the decompress threads before placing a host page

With the combination of these two changes, compression is enabled during
postcopy.
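The counting scheme above can be sketched standalone. The page sizes are example values (4K target pages inside a 2M host page); this is an illustration of the idea, not the ram.c code:

```c
#include <stdbool.h>

/* Hypothetical example sizes: 4K target pages inside a 2M host page. */
#define TARGET_PAGE_SIZE 4096
#define HOST_PAGE_SIZE   (2 * 1024 * 1024)

/*
 * Sketch of the counting scheme: instead of relying on the offset of the
 * last target page, count how many target pages of the current host page
 * have arrived; the host page is placed once the count reaches
 * host_page_size / target_page_size, regardless of arrival order.
 */
static bool page_received(int *target_pages)
{
    (*target_pages)++;
    if (*target_pages == HOST_PAGE_SIZE / TARGET_PAGE_SIZE) {
        *target_pages = 0;   /* reset for the next host page */
        return true;         /* place the assembled host page now */
    }
    return false;
}
```

Because only the count matters, the pages of one host page may arrive in any order, which is what makes compressed (and therefore reordered) transmission possible.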

---
v2:
 * use uintptr_t to calculate place_dest
 * check target pages belongs to the same host page

Wei Yang (6):
  migration/postcopy: reduce memset when it is zero page and
matches_target_page_size
  migration/postcopy: wait for decompress thread in precopy
  migration/postcopy: count target page number to decide the
place_needed
  migration/postcopy: set all_zero to true on the first target page
  migration/postcopy: enable random order target page arrival
  migration/postcopy: enable compress during postcopy

 migration/migration.c | 11 ---
 migration/ram.c   | 67 +--
 2 files changed, 52 insertions(+), 26 deletions(-)

-- 
2.17.1




[Patch v2 3/6] migration/postcopy: count target page number to decide the place_needed

2019-11-07 Thread Wei Yang
Postcopy requires placing a whole host page instead of a single target
page.

Currently, the code relies on the page offset to decide whether this is
the last target page. We can instead count the target pages during the
iteration: when the number of target pages equals
(host page size / target page size), this is the last target page in the
host page.

This is a preparation for non-ordered target page transmission.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f59e3fe197..5c05376d8d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4017,6 +4017,7 @@ static int ram_load_postcopy(QEMUFile *f)
 void *postcopy_host_page = mis->postcopy_tmp_page;
 void *last_host = NULL;
 bool all_zero = false;
+int target_pages = 0;
 
 while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
 ram_addr_t addr;
@@ -4051,6 +4052,7 @@ static int ram_load_postcopy(QEMUFile *f)
 ret = -EINVAL;
 break;
 }
+target_pages++;
 matches_target_page_size = block->page_size == TARGET_PAGE_SIZE;
 /*
  * Postcopy requires that we place whole host pages atomically;
@@ -4082,8 +4084,10 @@ static int ram_load_postcopy(QEMUFile *f)
  * If it's the last part of a host page then we place the host
  * page
  */
-place_needed = (((uintptr_t)host + TARGET_PAGE_SIZE) &
- (block->page_size - 1)) == 0;
+if (target_pages == (block->page_size / TARGET_PAGE_SIZE)) {
+place_needed = true;
+target_pages = 0;
+}
 place_source = postcopy_host_page;
 }
 last_host = host;
-- 
2.17.1




[Patch v2 4/6] migration/postcopy: set all_zero to true on the first target page

2019-11-07 Thread Wei Yang
For the first target page, all_zero is set to true for this round of
checking.

After target_pages was introduced, we can leverage this variable instead
of checking the address offset.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 5c05376d8d..b5759793a9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4067,7 +4067,7 @@ static int ram_load_postcopy(QEMUFile *f)
 page_buffer = postcopy_host_page +
   ((uintptr_t)host & (block->page_size - 1));
 /* If all TP are zero then we can optimise the place */
-if (!((uintptr_t)host & (block->page_size - 1))) {
+if (target_pages == 1) {
 all_zero = true;
 } else {
 /* not the 1st TP within the HP */
-- 
2.17.1




[Patch v2 5/6] migration/postcopy: enable random order target page arrival

2019-11-07 Thread Wei Yang
After using the number of target pages received to track progress within
one host page, we gain the ability to handle random-order target page
arrival within one host page.

This is a preparation for enabling compression during postcopy.

Signed-off-by: Wei Yang 

---
v2:
   * use uintptr_t to calculate place_dest
   * check target pages belongs to the same host page
---
 migration/ram.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index b5759793a9..666ad69284 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4015,7 +4015,7 @@ static int ram_load_postcopy(QEMUFile *f)
 MigrationIncomingState *mis = migration_incoming_get_current();
 /* Temporary page that is later 'placed' */
 void *postcopy_host_page = mis->postcopy_tmp_page;
-void *last_host = NULL;
+void *this_host = NULL;
 bool all_zero = false;
 int target_pages = 0;
 
@@ -4062,24 +4062,26 @@ static int ram_load_postcopy(QEMUFile *f)
  * that's moved into place later.
  * The migration protocol uses,  possibly smaller, target-pages
  * however the source ensures it always sends all the components
- * of a host page in order.
+ * of a host page in one chunk.
  */
 page_buffer = postcopy_host_page +
   ((uintptr_t)host & (block->page_size - 1));
 /* If all TP are zero then we can optimise the place */
 if (target_pages == 1) {
 all_zero = true;
+this_host = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
+block->page_size);
 } else {
 /* not the 1st TP within the HP */
-if (host != (last_host + TARGET_PAGE_SIZE)) {
-error_report("Non-sequential target page %p/%p",
-  host, last_host);
+if (QEMU_ALIGN_DOWN((uintptr_t)host, block->page_size) !=
+(uintptr_t)this_host) {
+error_report("Non-same host page %p/%p",
+  host, this_host);
 ret = -EINVAL;
 break;
 }
 }
 
-
 /*
  * If it's the last part of a host page then we place the host
  * page
@@ -4090,7 +4092,6 @@ static int ram_load_postcopy(QEMUFile *f)
 }
 place_source = postcopy_host_page;
 }
-last_host = host;
 
 switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
 case RAM_SAVE_FLAG_ZERO:
@@ -4143,7 +4144,8 @@ static int ram_load_postcopy(QEMUFile *f)
 
 if (!ret && place_needed) {
 /* This gets called at the last target page in the host page */
-void *place_dest = host + TARGET_PAGE_SIZE - block->page_size;
+void *place_dest = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
+   block->page_size);
 
 if (all_zero) {
 ret = postcopy_place_page_zero(mis, place_dest,
-- 
2.17.1




[Patch v2 2/6] migration/postcopy: wait for decompress thread in precopy

2019-11-07 Thread Wei Yang
Compression is not supported with postcopy, so it is safe to wait for
the decompress thread only in precopy.

This is a preparation for later patch.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7938a643d9..f59e3fe197 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4375,6 +4375,7 @@ static int ram_load_precopy(QEMUFile *f)
 }
 }
 
+ret |= wait_for_decompress_done();
 return ret;
 }
 
@@ -4406,8 +4407,6 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 } else {
 ret = ram_load_precopy(f);
 }
-
-ret |= wait_for_decompress_done();
 }
 trace_ram_load_complete(ret, seq_iter);
 
-- 
2.17.1




[Patch v2 6/6] migration/postcopy: enable compress during postcopy

2019-11-07 Thread Wei Yang
Postcopy requires placing a whole host page, while the migration thread
migrates memory in target-page-sized units. This means postcopy needs to
collect all target pages of one host page before placing them via
userfaultfd.

To enable compress during postcopy, there are two problems to solve:

1. Target pages may arrive in random order
2. Target pages of one host page must arrive without being interleaved
   with target pages from other host pages

The first one is handled by previous cleanup patch.

This patch handles the second one by:

1. Flushing the compress threads after each host page
2. Waiting for the decompress threads before placing a host page

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 11 ---
 migration/ram.c   | 28 +++-
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 354ad072fa..3c926a3ae3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1005,17 +1005,6 @@ static bool migrate_caps_check(bool *cap_list,
 #endif
 
 if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
-if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
-/* The decompression threads asynchronously write into RAM
- * rather than use the atomic copies needed to avoid
- * userfaulting.  It should be possible to fix the decompression
- * threads for compatibility in future.
- */
-error_setg(errp, "Postcopy is not currently compatible "
-   "with compression");
-return false;
-}
-
 /* This check is reasonably expensive, so only when it's being
  * set the first time, also it's only the destination that needs
  * special support.
diff --git a/migration/ram.c b/migration/ram.c
index 666ad69284..0d53786311 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3449,6 +3449,14 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
 rs->target_page_count += pages;
 
+/*
+ * During postcopy, it is necessary to make sure one whole host
+ * page is sent in one chunk.
+ */
+if (migrate_postcopy_ram()) {
+flush_compressed_data(rs);
+}
+
 /*
  * we want to check in the 1st loop, just in case it was the 1st
  * time and we had to sync the dirty bitmap.
@@ -4026,6 +4034,7 @@ static int ram_load_postcopy(QEMUFile *f)
 void *place_source = NULL;
 RAMBlock *block = NULL;
 uint8_t ch;
+int len;
 
 addr = qemu_get_be64(f);
 
@@ -4043,7 +4052,8 @@ static int ram_load_postcopy(QEMUFile *f)
 
 trace_ram_load_postcopy_loop((uint64_t)addr, flags);
 place_needed = false;
-if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE)) {
+if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
+ RAM_SAVE_FLAG_COMPRESS_PAGE)) {
 block = ram_block_from_stream(f, flags);
 
 host = host_from_ram_block_offset(block, addr);
@@ -4126,6 +4136,17 @@ static int ram_load_postcopy(QEMUFile *f)
  TARGET_PAGE_SIZE);
 }
 break;
+case RAM_SAVE_FLAG_COMPRESS_PAGE:
+all_zero = false;
+len = qemu_get_be32(f);
+if (len < 0 || len > compressBound(TARGET_PAGE_SIZE)) {
+error_report("Invalid compressed data length: %d", len);
+ret = -EINVAL;
+break;
+}
+decompress_data_with_multi_threads(f, page_buffer, len);
+break;
+
 case RAM_SAVE_FLAG_EOS:
 /* normal exit */
 multifd_recv_sync_main();
@@ -4137,6 +4158,11 @@ static int ram_load_postcopy(QEMUFile *f)
 break;
 }
 
+/* Got the whole host page, wait for decompress before placing. */
+if (place_needed) {
+ret |= wait_for_decompress_done();
+}
+
 /* Detect for any possible file errors */
 if (!ret && qemu_file_get_error(f)) {
 ret = qemu_file_get_error(f);
-- 
2.17.1




Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Paolo Bonzini
On 07/11/19 12:52, Daniel P. Berrangé wrote:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb5530e4082446aac3a3d69780cd4dbfa4520013
> 
> Is it practical to provide a jitter entropy source for EDK2
> too ?

The hard part is not collecting jitter (though the firmware might be too
deterministic for that), but rather turning it into a random number seed
(mixing data from various sources, crediting entropy, etc.).

Paolo



Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Paolo Bonzini
On 07/11/19 12:55, Daniel P. Berrangé wrote:
>> Yes, I would make SMM use a cryptographic pseudo-random number generator 
>> and seed it from virtio-rng from DXE, way before the OS starts and can 
>> "attack" it.
>>
>> Once you've gotten a seed, you can create a CSPRNG with a stream cipher 
>> such as ChaCha20, which is literally 30 lines of code.
> If all we need is a one-time seed then virtio-rng is possibly overkill as
> that provides a continuous stream. Instead could QEMU read a few bytes
> from the host's /dev/urandom and pass it to EDK via fw_cfg, which can
> use it for the CSPRNG seed. EDK would have to erase the fw_cfg field
> to prevent the seed value leaking to the guest OS, but other than that
> its quite straightforward.

That would need anyway a change to the emulated hardware.  If the guest
is able to use virtio-rng after the firmware exits (which is the case is
all the firmware needs is a one-time seed), then using virtio-rng is the
simplest alternative as it needs no change at all outside the firmware.

Paolo



Re: [PATCH v4 15/20] fuzz: add fuzzer skeleton

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:00PM +, Oleinik, Alexander wrote:
> diff --git a/tests/fuzz/fuzz.c b/tests/fuzz/fuzz.c
> new file mode 100644
> index 00..0e38f81c48
> --- /dev/null
> +++ b/tests/fuzz/fuzz.c
> @@ -0,0 +1,177 @@
> +/*
> + * fuzzing driver
> + *
> + * Copyright Red Hat Inc., 2019
> + *
> + * Authors:
> + *  Alexander Bulekov   

Bulekov instead of Oleinik?

> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include 
> +#include 

stdio.h and stdlib.h are already included by qemu/osdep.h.

> +/* Executed for each fuzzing-input */
> +int LLVMFuzzerTestOneInput(const unsigned char *Data, size_t Size)
> +{
> +if (fuzz_target->fuzz) {

Will this ever be NULL?

> +fuzz_target->fuzz(fuzz_qts, Data, Size);
> +}
> +return 0;
> +}
> +
> +/* Executed once, prior to fuzzing */
> +int LLVMFuzzerInitialize(int *argc, char ***argv, char ***envp)
> +{
> +
> +char *target_name;
> +
> +/* Initialize qgraph and modules */
> +qos_graph_init();
> +module_call_init(MODULE_INIT_FUZZ_TARGET);
> +module_call_init(MODULE_INIT_QOM);
> +module_call_init(MODULE_INIT_LIBQOS);
> +
> +if (*argc <= 1) {
> +usage(**argv);
> +}
> +
> +/* Identify the fuzz target */
> +target_name = (*argv)[1];
> +if (!strstr(target_name, "--fuzz-target=")) {
> +usage(**argv);
> +}
> +
> +target_name += strlen("--fuzz-target=");
> +
> +fuzz_target = fuzz_get_target(target_name);
> +if (!fuzz_target) {
> +usage(**argv);
> +}
> +
> +fuzz_qts = qtest_setup();
> +
> +if (!fuzz_target) {

This is dead code since fuzz_target was already checked above.  Please
remove this if statement.

> +fprintf(stderr, "Error: Fuzz fuzz_target name %s not found\n",
> +target_name);
> +usage(**argv);
> +}
> +
> +if (fuzz_target->pre_vm_init) {
> +fuzz_target->pre_vm_init();
> +}
> +
> +/* Run QEMU's softmmu main with the fuzz-target dependent arguments */
> +char *init_cmdline = fuzz_target->get_init_cmdline(fuzz_target);

Where is init_cmdline freed or should this be const char *?

> +wordexp_t result;
> +wordexp(init_cmdline, &result, 0);

What is the purpose of word expansion here?

> +
> +qemu_init(result.we_wordc, result.we_wordv, NULL);
> +
> +if (fuzz_target->pre_fuzz) {
> +fuzz_target->pre_fuzz(fuzz_qts);
> +}
> +
> +return 0;
> +}
> diff --git a/tests/fuzz/fuzz.h b/tests/fuzz/fuzz.h
> new file mode 100644
> index 00..b569b622d7
> --- /dev/null
> +++ b/tests/fuzz/fuzz.h
> @@ -0,0 +1,66 @@
> +/*
> + * fuzzing driver
> + *
> + * Copyright Red Hat Inc., 2019
> + *
> + * Authors:
> + *  Alexander Bulekov   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef FUZZER_H_
> +#define FUZZER_H_
> +
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qapi/error.h"
> +#include "exec/memory.h"
> +#include "tests/libqtest.h"
> +
> +

Some documentation would be nice:

/**
 * A libfuzzer fuzzing target
 *
 * The QEMU fuzzing binary is built with all available targets, each
 * with a unique @name that can be specified on the command-line to
 * select which target should run.
 *
 * A target must implement ->fuzz() to process a random input.  If QEMU
 * crashes in ->fuzz() then libfuzzer will record a failure.
 *
 * Fuzzing targets are registered with fuzz_add_target():
 *
 *   static const FuzzTarget fuzz_target = {
 *   .name = "my-device-fifo",
 *   .description = "Fuzz the FIFO buffer registers of my-device",
 *   ...
 *   };
 *
 *   static void register_fuzz_target(void)
 *   {
 *   fuzz_add_target(&fuzz_target);
 *   }
 *   fuzz_target_init(register_fuzz_target);
 */

> +typedef struct FuzzTarget {
> +const char *name; /* command-line option(FUZZ_TARGET) for the 
> target */
> +const char *description;  /* help text */
> +

If any of the function pointers can be NULL, please document this.

> +/* returns the arg-list that is passed to qemu/softmmu init() */
> +char* (*get_init_cmdline)(struct FuzzTarget *);

Does the caller need to call g_free() on the returned string?  Please
document this.

> +
> +/*
> + * will run once, prior to running qemu/softmmu init.
> + * eg: set up shared-memory for communication with the child-process
> + */
> +void(*pre_vm_init)(void);
> +
> +/*
> + * will run once, prior to to the fuzz-loop.

s/to to/to/

> + * eg: detect the memory map
> + */
> +void(*pre_fuzz)(QTestState *);

Please also mention that QEMU has been initialized at this point.

> +
> +/*
> + * accepts and executes an input from libfuzzer. this is repeatedly
> + * executed during the fuzzing loop

Re: [PATCH v4 16/20] fuzz: add support for fork-based fuzzing.

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:01PM +, Oleinik, Alexander wrote:
> diff --git a/tests/fuzz/fork_fuzz.c b/tests/fuzz/fork_fuzz.c
> new file mode 100644
> index 00..4c4d00b034
> --- /dev/null
> +++ b/tests/fuzz/fork_fuzz.c
> @@ -0,0 +1,51 @@
> +/*
> + * Fork-based fuzzing helpers
> + *
> + * Copyright Red Hat Inc., 2019
> + *
> + * Authors:
> + *  Alexander Bulekov   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "fork_fuzz.h"
> +
> +uintptr_t feature_shm;

Where is this variable used?

> +
> +void counter_shm_init(void)
> +{
> +int fd = shm_open("/qemu-fuzz-cntrs", O_CREAT | O_RDWR, S_IRUSR | 
> S_IWUSR);

It must be possible to run multiple fuzzer instances simultaneously on
one host.  Please use a unique shmem path for each parent process (e.g.
getpid() in the parent and getppid() in the child).

> +if (fd == -1) {
> +perror("Error: ");
> +exit(1);
> +}
> +if (ftruncate(fd, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START) == -1) {
> +perror("Error: ");
> +exit(1);
> +}
> +/* Copy what's in the counter region to the shm.. */
> +void *rptr = mmap(NULL ,
> +&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
> +PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> +memcpy(rptr,
> +   &__FUZZ_COUNTERS_START,
> +   &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
> +
> +munmap(rptr, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
> +
> +/* And map the shm over the counter region */
> +rptr = mmap(&__FUZZ_COUNTERS_START,
> +&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
> +PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);

fd can be closed here to prevent leaking it.


signature.asc
Description: PGP signature


Re: [PATCH v4 17/20] fuzz: add support for qos-assisted fuzz targets

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:02PM +, Oleinik, Alexander wrote:
> +static char *qos_build_main_args()

Please use func(void) in C.  In C () functions have unspecified and
unchecked arguments whereas in C++ () means (void).  We want the
compiler to complain if arguments are passed to this function, so it
needs to be (void).

> +{
> +char **path = fuzz_path_vec;
> +QOSGraphNode *test_node;
> +GString *cmd_line = g_string_new(path[0]);
> +void *test_arg;
> +
> +/* Before test */
> +current_path = path;
> +test_node = qos_graph_get_node(path[(g_strv_length(path) - 1)]);
> +test_arg = test_node->u.test.arg;
> +if (test_node->u.test.before) {
> +test_arg = test_node->u.test.before(cmd_line, test_arg);
> +}
> +/* Prepend the arguments that we need */
> +g_string_prepend(cmd_line,
> +"qemu-system-i386 -display none -machine accel=qtest -m 64 ");

Does i386 need to be hardcoded?  An earlier patch declared a fuzz_arch
or similar variable (from TARGET_NAME).


signature.asc
Description: PGP signature


Re: [PATCH v4 18/20] fuzz: add i440fx fuzz targets

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:03PM +, Oleinik, Alexander wrote:
> +static void i440fx_fuzz_qos_fork(QTestState *s,
> +const unsigned char *Data, size_t Size) {
> +if (fork() == 0) {
> +i440fx_fuzz_qos(s, Data, Size);
> +_Exit(0);
> +} else {
> +wait(NULL);
> +}
> +}
> +
> +static const char *i440fx_qtest_argv = "qemu_system_i386 -machine 
> accel=qtest"

Binaries are named qemu-system-TARGET.  I guess nothing looks at argv[0]
but it should use hyphens instead of underscores.

> +   "-m 0 -display none";
> +static char *i440fx_argv(FuzzTarget *t)
> +{
> +return (char *)i440fx_qtest_argv;

.get_init_cmdline() should probably return const char *.

Otherwise:

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
On 11/07/19 12:37, Paolo Bonzini wrote:
> On 07/11/19 11:25, Ard Biesheuvel wrote:
>>> This looks problematic on QEMU. Entropy is a valuable resource, and
>>> whatever resource SMM drivers depend on, should not be possible for e.g.
>>> a 3rd party UEFI driver (or even for the runtime OS) to exhaust.
>>> Therefore, it's not *only* the case that SMM drivers must not consume
>>> EFI_RNG_PROTOCOL (which exists at a less critical privilege level, i.e.
>>> outside of SMM/SMRAM), but also that SMM drivers must not depend on the
>>> same piece of *hardware* that feeds EFI_RNG_PROTOCOL.
>>>
>> The typical model is to seed a DRBG [deterministic pseudorandom
>> sequence generator] using a sufficient amount of high quality entropy.
>> Once you have done that, it is rather hard to exhaust a DRBG - it is a
>> mathematical construction that is designed to last for a long time (<=
>> 2^48 invocations [not bytes] according to the NIST spec), after which
>> it does not degrade although it may have generated so much output that
>> its internal state may be inferred if you have captured enough of it
>> (which is a rather theoretical issue IMHO)
>>
>> The problem is that using the output of a DRBG as a seed is
>> non-trivial - the spec describes ways to do this, but wiring
>> virtio-rng to a DRBG in the host and using its output to seed a DRBG
>> in the guest is slighly problematic.
>>
>> So it seems to me that the correct way to model this is to make the
>> host's true entropy source a shared resource like any other.
>>
> 
> Yes, I would make SMM use a cryptographic pseudo-random number generator 
> and seed it from virtio-rng from DXE,

The VirtioRngDxe driver is a UEFI driver that follows the UEFI driver
model. Meaning (in this context), it is connected to the virtio-rng
device in the BDS phase, by platform BDS code.

The variable SMM driver is necessary for producing the UEFI Variable and
Variable Write architectural protocols. BDS can only be entered when all
the architectural protocols have been installed.

Therefore we have a circular dependency with the above -- assuming we
intend to delay the *startup* of the variable SMM driver until after
EFI_RNG_PROTOCOL is available.

But, perhaps, could the variable SMM driver start up anyway, and consume
EFI_RNG_PROTOCOL just when it really needs the random seed? I doubt it:
other EFI_RNG_PROTOCOL instances could be produced by other (3rd party)
UEFI drivers. Or other (3rd party) modules could "attack" VirtioRngDxe.
A privileged (SMM) driver should not consume such sensitive data from a
non-privileged driver, unless the latter was built into the platform
firmware, and the consumption occurred before the End-of-Dxe event
(which is signaled by platform BDS, early in the BDS phase).

Put differently, the non-privileged driver that's the source of the
sensitive data would have to be a "platform DXE driver". The virtio
drivers are not such drivers however.

I can imagine a roundabout way to hack this around in OVMF's platform
BDS, quite horribly, but I'd like to consider other approaches first.

Thank you!
Laszlo

> way before the OS starts and can 
> "attack" it.
> 
> Once you've gotten a seed, you can create a CSPRNG with a stream cipher 
> such as ChaCha20, which is literally 30 lines of code.
> 
> Paolo
> 
> #define ROTL(a,b) (((a) << (b)) | ((a) >> (32 - (b))))
> #define QR(a, b, c, d) (  \
>   a += b,  d ^= a,  d = ROTL(d,16),   \
>   c += d,  b ^= c,  b = ROTL(b,12),   \
>   a += b,  d ^= a,  d = ROTL(d, 8),   \
>   c += d,  b ^= c,  b = ROTL(b, 7))
> #define ROUNDS 20
> 
> // initial state:
> // in[0] = 0x65787061
> // in[1] = 0x6e642033
> // in[2] = 0x322d6279
> // in[3] = 0x7465206b
> // in[4]..in[11] = key (seed)
> // in[12]..in[13] = 0
> // in[14]..in[15] = nonce, can probably use RDTSC?
> static uint32_t in[16];
> 
> uint32_t chacha_rng(void)
> {
>   int i;
>   static uint32_t x[16], p;
>   if (p < 16) {
>   uint32_t r = in[p] + x[p];
>   p++;
>   return r;
>   }
> 
>   if (++in[12] == 0)
>   ++in[13];
> 
>   for (i = 0; i < 16; ++i)
>   x[i] = in[i];
> 
>   // 10 loops × 2 rounds/loop = 20 rounds
>   for (i = 0; i < ROUNDS; i += 2) {
>   // Odd round
>   QR(x[0], x[4], x[ 8], x[12]); // column 0
>   QR(x[1], x[5], x[ 9], x[13]); // column 1
>   QR(x[2], x[6], x[10], x[14]); // column 2
>   QR(x[3], x[7], x[11], x[15]); // column 3
>   // Even round
>   QR(x[0], x[5], x[10], x[15]); // diagonal 1 (main diagonal)
>   QR(x[1], x[6], x[11], x[12]); // diagonal 2
>   QR(x[2], x[7], x[ 8], x[13]); // diagonal 3
>   QR(x[3], x[4], x[ 9], x[14]); // diagonal 4
>   }
>   p = 1;
>   return in[0] + x[0];
> }
> 




Re: [PATCH v14 03/11] tests: Add test for QAPI builtin type time

2019-11-07 Thread Eduardo Habkost
On Thu, Nov 07, 2019 at 02:24:52PM +0800, Tao Xu wrote:
> On 11/7/2019 4:53 AM, Eduardo Habkost wrote:
> > On Mon, Oct 28, 2019 at 03:52:12PM +0800, Tao Xu wrote:
> > > Add tests for time input such as zero, around limit of precision,
> > > signed upper limit, actual upper limit, beyond limits, time suffixes,
> > > and etc.
> > > 
> > > Signed-off-by: Tao Xu 
> > > ---
> > [...]
> > > +/* Close to signed upper limit 0x7c00 (53 msbs set) */
> > > +qdict = keyval_parse("time1=9223372036854774784," /* 
> > > 7c00 */
> > > + "time2=9223372036854775295", /* 
> > > 7dff */
> > > + NULL, &error_abort);
> > > +v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
> > > +qobject_unref(qdict);
> > > +visit_start_struct(v, NULL, NULL, 0, &error_abort);
> > > +visit_type_time(v, "time1", &time, &error_abort);
> > > +g_assert_cmphex(time, ==, 0x7c00);
> > > +visit_type_time(v, "time2", &time, &error_abort);
> > > +g_assert_cmphex(time, ==, 0x7c00);
> > 
> > I'm confused by this test case and the one below[1].  Are these
> > known bugs?  Shouldn't we document them as known bugs?
> 
> Because do_strtosz() or do_strtomul() actually parse with strtod(), so the
> precision is 53 bits, so in these cases, 7dff and
> fbff are rounded.

My questions remain: why isn't this being treated like a bug?

-- 
Eduardo




Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
On 11/07/19 13:50, Paolo Bonzini wrote:
> On 07/11/19 12:55, Daniel P. Berrangé wrote:
>>> Yes, I would make SMM use a cryptographic pseudo-random number generator 
>>> and seed it from virtio-rng from DXE, way before the OS starts and can 
>>> "attack" it.
>>>
>>> Once you've gotten a seed, you can create a CSPRNG with a stream cipher 
>>> such as ChaCha20, which is literally 30 lines of code.
>> If all we need is a one-time seed then virtio-rng is possibly overkill as
>> that provides a continuous stream. Instead could QEMU read a few bytes
>> from the host's /dev/urandom and pass it to EDK via fw_cfg, which can
>> use it for the CSPRNG seed. EDK would have to erase the fw_cfg field
>> to prevent the seed value leaking to the guest OS, but other than that
>> its quite straightforward.
> 
> That would need anyway a change to the emulated hardware.  If the guest
> is able to use virtio-rng after the firmware exits (which is the case is
> all the firmware needs is a one-time seed), then using virtio-rng is the
> simplest alternative as it needs no change at all outside the firmware.

This is a really good point!

I'll think more about using virtio-rng, hopefully without horribly
hacking OVMF's BDS code.

Thanks
Laszlo




Re: Looking for issues/features for my first contribution

2019-11-07 Thread Aleksandar Markovic
On Thu, Nov 7, 2019 at 11:37 AM Aleksandar Markovic
 wrote:
>
>
>
> On Thursday, November 7, 2019, Rajath Shashidhara  
> wrote:
>>
>> Hi all,
>>
>> I am a Computer Science graduate student at The University of Texas at 
>> Austin (UT, Austin). I am looking forward to contributing to qemu !
>>
>> This semester, I am taking a class in Virtualization 
>> (https://github.com/vijay03/cs378-f19) and contributing to a virtualization 
>> related open-source project is a significant part of the course.
>> I would be interested in contributing a patchset to qemu - possibly a 
>> self-contained feature or a reasonably complex bug fix that can be completed 
>> in under a month's time. I did look at both the bugtracker and the QEMU 
>> Google Summer of Code 2019 page 
>> [https://wiki.qemu.org/Google_Summer_of_Code_2019] for ideas. However, I 
>> would be interested in hearing from the community and I would be delighted 
>> if somebody can be suggest a suitable project !
>>
>
> Hello, Rajath!
>
> Thank you for expressing interest in QEMU open source project.
>
> There is certainly a place for you and your contributions in QEMU, and you 
> are very welcomed!
>
> It looks to me the following project would fit your description:
>
> 'Implement emulation of DS3231 real time clock in QEMU'
>
> Datasheet:
>
> https://datasheets.maximintegrated.com/en/ds/DS3231.pdf
>
> The steps needed to complete it (in my opinion):
>
> - collect datasheets of as many as possible RTC chips already emulated in 
> QEMU (there are around of dozen of them, see folder hw/rtc)
>

I did a quick Google search on datasheets of existing RTC
implementations, and the result is:

DS1338: https://datasheets.maximintegrated.com/en/ds/DS1338-DS1338Z.pdf
M41T80: https://www.st.com/resource/en/datasheet/m41t80.pdf
M48T59: http://www.elektronikjk.pl/elementy_czynne/IC/M48T59V.pdf
MC146818: https://www.nxp.com/docs/en/data-sheet/MC146818.pdf
PL031: 
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0224c/real_time_clock_pl031_r1p3_technical_reference_manual_DDI0224C.pdf
TWL92230: 
https://datasheet.octopart.com/TWL92230C-Texas-Instruments-datasheet-150321.pdf
Zynq RTC: 
https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
(chapter 7)

Aleksandar

> - do a comparative analysis of selected RTC implementations in QEMU
>
> - get to know general QEMU device model
>
> - design and implement DS3231 emulation
>
> I can give you (unfortunately constrained by tight time limits) some help and 
> guidance. But there are other people in community too (more knowledgable in 
> the area than me).
>
> I salute your initiative!
>
> Yours,
> Aleksandar
>
>
>
>>
>> I am an advanced C programmer with both professional and academic background 
>> in systems design & implementation - especially OS & Networks. Given my 
>> background, I feel fairly confident that I can pickup the QEMU codebase 
>> quickly.
>>
>> Eagerly looking forward to hearing from the community !
>>
>> Thanks,
>> Rajath Shashidhara
>>
>>



Re: The problems about COLO

2019-11-07 Thread Lukas Straub
On Thu, 7 Nov 2019 16:14:43 +0800
Daniel Cho  wrote:

> Hi  Lukas,
> Thanks for your reply.
>
> However, when we tested question 1 with the steps below, we noticed
> the secondary VM's image breaks while it reboots.
> Here is the error message.
> ---
> [1.280299] XFS (sda1): Mounting V5 Filesystem
> [1.428418] input: ImExPS/2 Generic Explorer Mouse as
> /devices/platform/i8042/serio1/input/input2
> [1.501320] XFS (sda1): Starting recovery (logdev: internal)
> [1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
> [1.505534] Switched to clocksource tsc
> [2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line
> 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xfc/0x130
> [xfs]
> [2.032743] CPU: 0 PID: 300 Comm: mount Not tainted
> 3.10.0-693.11.6.el7.x86_64 #1
> [2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [2.035882] Call Trace:
> [2.036494]  [] dump_stack+0x19/0x1b
> [2.037315]  [] xfs_error_report+0x3b/0x40 [xfs]
> [2.038150]  [] ? xfs_free_extent+0xfc/0x130 [xfs]
> [2.039046]  [] xfs_free_ag_extent+0x20a/0x780 [xfs]
> [2.039920]  [] xfs_free_extent+0xfc/0x130 [xfs]
> [2.040768]  [] xfs_trans_free_extent+0x26/0x60 [xfs]
> [2.041642]  [] xlog_recover_process_efi+0x17e/0x1c0
> [xfs]
> [2.042558]  []
> xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [2.043771]  [] xlog_recover_finish+0x21/0xb0 [xfs]
> [2.044650]  [] xfs_log_mount_finish+0x34/0x50 [xfs]
> [2.045518]  [] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [2.046341]  [] ? xfs_filestream_get_parent+0x80/0x80
> [xfs]
> [2.047260]  [] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [2.048116]  [] mount_bdev+0x1b0/0x1f0
> [2.048881]  [] ?
> xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [2.050105]  [] xfs_fs_mount+0x15/0x20 [xfs]
> [2.050906]  [] mount_fs+0x39/0x1b0
> [2.051963]  [] ? __alloc_percpu+0x15/0x20
> [2.059431]  [] vfs_kern_mount+0x67/0x110
> [2.060283]  [] do_mount+0x233/0xaf0
> [2.061081]  [] ? strndup_user+0x4b/0xa0
> [2.061844]  [] SyS_mount+0x96/0xf0
> [2.062619]  [] system_call_fastpath+0x16/0x1b
> [2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of
> file fs/xfs/xfs_trans.c.  Caller xlog_recover_process_efi+0x18e/0x1c0 [xfs]
> [2.065260] CPU: 0 PID: 300 Comm: mount Not tainted
> 3.10.0-693.11.6.el7.x86_64 #1
> [2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [2.068023] Call Trace:
> [2.068590]  [] dump_stack+0x19/0x1b
> [2.069403]  [] xfs_error_report+0x3b/0x40 [xfs]
> [2.070318]  [] ? xlog_recover_process_efi+0x18e/0x1c0
> [xfs]
> [2.071538]  [] xfs_trans_cancel+0xbd/0xe0 [xfs]
> [2.072429]  [] xlog_recover_process_efi+0x18e/0x1c0
> [xfs]
> [2.073339]  []
> xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [2.074561]  [] xlog_recover_finish+0x21/0xb0 [xfs]
> [2.075421]  [] xfs_log_mount_finish+0x34/0x50 [xfs]
> [2.076301]  [] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [2.077128]  [] ? xfs_filestream_get_parent+0x80/0x80
> [xfs]
> [2.078049]  [] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [2.078900]  [] mount_bdev+0x1b0/0x1f0
> [2.079667]  [] ?
> xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [2.080883]  [] xfs_fs_mount+0x15/0x20 [xfs]
> [2.081687]  [] mount_fs+0x39/0x1b0
> [2.082457]  [] ? __alloc_percpu+0x15/0x20
> [2.083258]  [] vfs_kern_mount+0x67/0x110
> [2.084057]  [] do_mount+0x233/0xaf0
> [2.084797]  [] ? strndup_user+0x4b/0xa0
> [2.085568]  [] SyS_mount+0x96/0xf0
> [2.086324]  [] system_call_fastpath+0x16/0x1b
> [2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 985
> of file fs/xfs/xfs_trans.c.  Return address = 0xc0195966
> [2.088795] XFS (sda1): Corruption of in-memory data detected.  Shutting
> down filesystem
> [2.090273] XFS (sda1): Please umount the filesystem and rectify the
> problem(s)
> [2.091519] XFS (sda1): Failed to recover EFIs
> [2.092299] XFS (sda1): log mount finish failed
> [FAILED] Failed to mount /sysroot.
> .
> .
> .
> Generating "/run/initramfs/rdsosreport.txt"
> [2.178103] blk_update_request: I/O error, dev fd0, sector 0
> [2.246106] blk_update_request: I/O error, dev fd0, sector 0
>   ---
>
> Here is the replicated steps:
> *1. Start primary VM with command, and do every thing you want on PVM*
> qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp
> stdio -vnc :5 \
>   -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
>   -netdev
> tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
>   -device rtl8139,id=e0,netdev=hn0 \
>   

Re: [PATCH v4 19/20] fuzz: add virtio-net fuzz target

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:03PM +, Oleinik, Alexander wrote:
> +static void virtio_net_fuzz_multi(QTestState *s,
> +const unsigned char *Data, size_t Size)
> +{
> +typedef struct vq_action {
> +uint8_t queue;
> +uint8_t length;
> +uint8_t write;
> +uint8_t next;
> +bool kick;
> +} vq_action;
> +
> +uint64_t req_addr[10];
> +int reqi = 0;
> +uint32_t free_head = 0;
> +
> +QGuestAllocator *t_alloc = fuzz_qos_alloc;
> +
> +QVirtioNet *net_if = fuzz_qos_obj;
> +QVirtioDevice *dev = net_if->vdev;
> +QVirtQueue *q;
> +vq_action vqa;
> +int iters = 0;
> +while (true) {
> +if (Size < sizeof(vqa)) {
> +break;
> +}

More concisely:

  while (Size >= sizeof(vqa)) {

> +memcpy(&vqa, Data, sizeof(vqa));
> +vqa = *((vq_action *)Data);

The assignment is redundant after the memcpy.

> +Data += sizeof(vqa);
> +Size -= sizeof(vqa);
> +
> +q = net_if->queues[vqa.queue % 3];
> +
> +vqa.length = vqa.length >= Size ? Size :  vqa.length;
> +
> +req_addr[reqi] = guest_alloc(t_alloc, vqa.length);
> +qtest_memwrite(s, req_addr[reqi], Data, vqa.length);
> +if (iters == 0) {

What is the difference between iters and reqi?  The values of these
variables are always identical so I think only one of them is needed.

> +free_head = qvirtqueue_add(s, q, req_addr[reqi], vqa.length,
> +vqa.write, vqa.next);
> +} else {
> +qvirtqueue_add(s, q,
> +req_addr[reqi], vqa.length, vqa.write , vqa.next);
> +}
> +iters++;
> +reqi++;
> +if (iters == 10) {
> +break;
> +}
> +Data += vqa.length;
> +Size -= vqa.length;
> +}
> +if (iters) {
> +qvirtqueue_kick(s, dev, q, free_head);
> +qtest_clock_step_next(s);

Here we could run the main loop until free_head appears in the used
ring.  That should allow the request processing code path to be explored
more fully, but it's okay to leave it like this for now.


signature.asc
Description: PGP signature


Re: [PATCH v4 20/20] fuzz: add documentation to docs/devel/

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:50:04PM +, Oleinik, Alexander wrote:
> +== Building the fuzzers ==
> +
> +NOTE: If possible, build a 32-bit binary. When forking, the 32-bit fuzzer is
> +much faster, since the page-map has a smaller size. This is due to the fact 
> that
> +AddressSanitizer mmaps ~20TB of memory, as part of its detection. This 
> results
> +in a large page-map, and a much slower fork(). O
> +
> +To build the fuzzers, install a recent version of clang:
> +Configure with (substitute the clang binaries with the version you 
> installed):
> +
> +CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing
> +
> +Fuzz targets are built similarly to system/softmmu:
> +
> +make i386-softmmu/fuzz
> +
> +This builds ./i386-softmmu/qemu-fuzz-i386

I'm surprised that "make i386-softmmu/fuzz" builds
i386-softmmu/qemu-fuzz-i386.  Should that be "make
i386-softmmu/qemu-fuzz-i386"?

> += Implmentation Details =

s/Implmentation/Implementation/

> +
> +== The Fuzzer's Lifecycle ==
> +
> +The fuzzer has two entrypoints that libfuzzer calls. libfuzzer provides it's
> +own main(), which performs some setup, and calls the entrypoints:
> +
> +LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
> +necessary state
> +
> +LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
> +resets the state at the end of each run.
> +
> +In more detail:
> +
> +LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
> +dashes, so they are ignored by libfuzzer main()). Currently, the arguments
> +select the fuzz target. Then, the qtest client is initialized. If the target
> +requires qos, qgraph is set up and the QOM/LIBQOS modules are initailized.

s/initailized/initialized/




Re: [PATCH v4 00/20] Add virtual device fuzzing support

2019-11-07 Thread Stefan Hajnoczi
On Wed, Oct 30, 2019 at 02:49:47PM +, Oleinik, Alexander wrote:
> This series adds a framework for coverage-guided fuzzing of
> virtual-devices. Fuzzing targets are based on qtest and can make use of
> the libqos abstractions.
> 
> V4:
>  * add/transfer license headers to new files
>  * restructure the added QTestClientTransportOps struct
>  * restructure the FuzzTarget struct and fuzzer skeleton
>  * fork-based fuzzer now directly mmaps shm over the coverage bitmaps
>  * fixes to i440 and virtio-net fuzz targets
>  * undo the changes to qtest_memwrite
>  * possible to build /fuzz and /all in the same build-dir
>  * misc fixes to address V3 comments

I have finished reviewing this series.  Please fold in my Reviewed-by
tags when you send the next series.  That way I'll know which patches to
skip :).

Thanks,
Stefan


signature.asc
Description: PGP signature


Re: [PATCH v4 19/20] fuzz: add virtio-net fuzz target

2019-11-07 Thread Jason Wang



On 2019/10/30 下午10:50, Oleinik, Alexander wrote:

From: Alexander Oleinik 

The virtio-net fuzz target feeds inputs to all three virtio-net
virtqueues, and uses forking to avoid leaking state between fuzz runs.

Signed-off-by: Alexander Oleinik 



Can this fuzz vhost-net or vhost-user (I only see socket backend)? If 
it's not too hard, it would be even more interesting.


Thanks



---
  tests/fuzz/Makefile.include  |   1 +
  tests/fuzz/virtio_net_fuzz.c | 123 +++
  2 files changed, 124 insertions(+)
  create mode 100644 tests/fuzz/virtio_net_fuzz.c

diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
index 37d6821bee..f1d9b46b1c 100644
--- a/tests/fuzz/Makefile.include
+++ b/tests/fuzz/Makefile.include
@@ -6,5 +6,6 @@ fuzz-obj-y += tests/fuzz/fork_fuzz.o
  fuzz-obj-y += tests/fuzz/qos_fuzz.o
  
  fuzz-obj-y += tests/fuzz/i440fx_fuzz.o

+fuzz-obj-y += tests/fuzz/virtio_net_fuzz.o
  
  FUZZ_LDFLAGS += -Xlinker -T$(SRC_PATH)/tests/fuzz/fork_fuzz.ld

diff --git a/tests/fuzz/virtio_net_fuzz.c b/tests/fuzz/virtio_net_fuzz.c
new file mode 100644
index 00..0543cfd32a
--- /dev/null
+++ b/tests/fuzz/virtio_net_fuzz.c
@@ -0,0 +1,123 @@
+/*
+ * virtio-net Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "fuzz.h"
+#include "fork_fuzz.h"
+#include "qos_fuzz.h"
+#include "tests/libqtest.h"
+#include "tests/libqos/virtio-net.h"
+
+
+static void virtio_net_fuzz_multi(QTestState *s,
+const unsigned char *Data, size_t Size)
+{
+typedef struct vq_action {
+uint8_t queue;
+uint8_t length;
+uint8_t write;
+uint8_t next;
+bool kick;
+} vq_action;
+
+uint64_t req_addr[10];
+int reqi = 0;
+uint32_t free_head = 0;
+
+QGuestAllocator *t_alloc = fuzz_qos_alloc;
+
+QVirtioNet *net_if = fuzz_qos_obj;
+QVirtioDevice *dev = net_if->vdev;
+QVirtQueue *q;
+vq_action vqa;
+int iters = 0;
+while (true) {
+if (Size < sizeof(vqa)) {
+break;
+}
+memcpy(&vqa, Data, sizeof(vqa));
+vqa = *((vq_action *)Data);
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+q = net_if->queues[vqa.queue % 3];
+
+vqa.length = vqa.length >= Size ? Size :  vqa.length;
+
+req_addr[reqi] = guest_alloc(t_alloc, vqa.length);
+qtest_memwrite(s, req_addr[reqi], Data, vqa.length);
+if (iters == 0) {
+free_head = qvirtqueue_add(s, q, req_addr[reqi], vqa.length,
+vqa.write, vqa.next);
+} else {
+qvirtqueue_add(s, q,
+req_addr[reqi], vqa.length, vqa.write , vqa.next);
+}
+iters++;
+reqi++;
+if (iters == 10) {
+break;
+}
+Data += vqa.length;
+Size -= vqa.length;
+}
+if (iters) {
+qvirtqueue_kick(s, dev, q, free_head);
+qtest_clock_step_next(s);
+for (int i = 0; i < reqi; i++) {
+guest_free(t_alloc, req_addr[i]);
+}
+}
+}
+
+static int *sv;
+
+static void *virtio_net_test_setup_socket(GString *cmd_line, void *arg)
+{
+if (!sv) {
+sv = g_new(int, 2);
+int ret = socketpair(PF_UNIX, SOCK_STREAM, 0, sv);
+fcntl(sv[0], F_SETFL, O_NONBLOCK);
+g_assert_cmpint(ret, !=, -1);
+}
+g_string_append_printf(cmd_line, " -netdev socket,fd=%d,id=hs0 ", sv[1]);
+return arg;
+}
+
+static void virtio_net_fork_fuzz(QTestState *s,
+const unsigned char *Data, size_t Size)
+{
+if (fork() == 0) {
+virtio_net_fuzz_multi(s, Data, Size);
+flush_events(s);
+_Exit(0);
+} else {
+wait(NULL);
+}
+}
+
+static void register_virtio_net_fuzz_targets(void)
+{
+fuzz_add_qos_target(&(FuzzTarget){
+.name = "virtio-net-fork-fuzz",
+.description = "Fuzz the virtio-net virtual queues, forking"
+"for each fuzz run",
+.pre_vm_init = &counter_shm_init,
+.pre_fuzz = &qos_init_path,
+.fuzz = virtio_net_fork_fuzz,},
+"virtio-net",
+&(QOSGraphTestOptions){.before = virtio_net_test_setup_socket}
+);
+}
+
+fuzz_target_init(register_virtio_net_fuzz_targets);





Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Laszlo Ersek
On 11/07/19 13:47, Paolo Bonzini wrote:
> On 07/11/19 12:52, Daniel P. Berrangé wrote:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb5530e4082446aac3a3d69780cd4dbfa4520013
>>
>> Is it practical to provide a jitter entropy source for EDK2
>> too ?
> 
> The hard part is not collecting jitter (though the firmware might be too
> deterministic for that), but rather turning it into a random number seed
> (mixing data from various sources, crediting entropy, etc.).

If there is *any* entropy source that (a) we can trust to be random
enough and (b) depends only on the CPU, then we shouldn't struggle with
virtio-rng (or similar devices) at all. RDRAND would be a no-brainer,
but the "community literature" suggests it should not be trusted in itself.

I've read the commit message linked above, and it appears too good to be
true.

The CPU Jitter RNG provides a source of good entropy by collecting
CPU executing time jitter. [...] The RNG does not have any
dependencies on any other service in the kernel. The RNG only needs
a high-resolution time stamp. [...]

http://www.chronox.de/jent.html

The CPU Jitter Random Number Generator provides a non-physical true
random number generator that works equally in kernel and user land.
The only prerequisite is the availability of a high-resolution timer
that is available in modern CPUs.

http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Today’s operating systems provide non-physical true random number
generators which are based on hardware events. With the advent of
virtualization and the ever growing need of more high-quality random
numbers, these random number generators reach their limits.
Additional sources of entropy must be opened up. This document
introduces an entropy source based on CPU execution time jitter. The
design and implementation of a non-physical true random number
generator, the CPU Jitter random number generator, its statistical
properties and the maintenance and behavior of entropy is discussed
in this document.

If this construct is legit, a core edk2 implementation (available to all
platforms, and on all edk2 arches) would be a huge win.

On the other hand, we're having this discussion because the premise of
TianoCore#1871 is that we shouldn't rely on just the CPU and a high
resolution timer... I simply cannot decide if this construct is
trustworthy. (With any solution that was based in the host's /dev/random
or /dev/urandom, the trustworthiness question would be side-stepped in
the firmware.)

Laszlo




Re: [RFC PATCH 06/18] qemu-storage-daemon: Add --nbd-server option

2019-11-07 Thread Eric Blake

On 11/7/19 2:33 AM, Kevin Wolf wrote:



As a replacement for nbd-server-add, I envisioned adding something like a
block-export-add, which would work the way that --export already does.
It would also come with query-block-exports and block-export-del, and it
wouldn't contain only NBD devices, but also vhost-user, FUSE, etc.
exports.

Now I'm wondering if the same would make sense for nbd-server-start.
Maybe an API change would even allow us to start multiple NBD servers
(e.g. listening on different IP addresses or using different tls-creds).


We want that (the ability to run multiple parallel NBD servers) anyway, 
to allow parallel incremental backups from different points in time to 
different clients.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: Looking for issues/features for my first contribution

2019-11-07 Thread Stefan Hajnoczi
On Wed, Nov 06, 2019 at 05:50:44PM -0600, Rajath Shashidhara wrote:
> Hi all,
> 
> I am a Computer Science graduate student at The University of Texas at
> Austin (UT, Austin). I am looking forward to contributing to qemu !
> 
> This semester, I am taking a class in Virtualization
> (https://github.com/vijay03/cs378-f19) and contributing to a virtualization
> related open-source project is a significant part of the course.
> I would be interested in contributing a patchset to qemu - possibly a
> self-contained feature or a reasonably complex bug fix that can be completed
> in under a month's time. I did look at both the bugtracker and the QEMU
> Google Summer of Code 2019 page
> [https://wiki.qemu.org/Google_Summer_of_Code_2019] for ideas. However, I
> would be interested in hearing from the community and I would be delighted
> if somebody can suggest a suitable project!
> 
> I am an advanced C programmer with both professional and academic background
> in systems design & implementation - especially OS & Networks. Given my
> background, I feel fairly confident that I can pickup the QEMU codebase
> quickly.

Please check with Dinah Baum whether the SeaBIOS MMConfig task is
already taken, maybe you'd like to work on it if the task is still
available:

  
https://lore.kernel.org/qemu-devel/20191105163952.GI166646@stefanha-x1.localdomain/

Stefan




Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Daniel P . Berrangé
On Thu, Nov 07, 2019 at 02:44:11PM +0100, Laszlo Ersek wrote:
> On 11/07/19 13:47, Paolo Bonzini wrote:
> > On 07/11/19 12:52, Daniel P. Berrangé wrote:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb5530e4082446aac3a3d69780cd4dbfa4520013
> >>
> >> Is it practical to provide a jitter entropy source for EDK2
> >> too ?
> > 
> > The hard part is not collecting jitter (though the firmware might be too
> > deterministic for that), but rather turning it into a random number seed
> > (mixing data from various sources, crediting entropy, etc.).
> 
> If there is *any* entropy source that (a) we can trust to be random
> enough and (b) depends only on the CPU, then we shouldn't struggle with
> virtio-rng (or similar devices) at all. RDRAND would be a no-brainer,
> but the "community literature" suggests it should not be trusted in itself.
> 
> I've read the commit message linked above, and it appears too good to be
> true.
> 
> The CPU Jitter RNG provides a source of good entropy by collecting
> CPU executing time jitter. [...] The RNG does not have any
> dependencies on any other service in the kernel. The RNG only needs
> a high-resolution time stamp. [...]
> 
> http://www.chronox.de/jent.html
> 
> The CPU Jitter Random Number Generator provides a non-physical true
> random number generator that works equally in kernel and user land.
> The only prerequisite is the availability of a high-resolution timer
> that is available in modern CPUs.
> 
> http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
> 
> Today’s operating systems provide non-physical true random number
> generators which are based on hardware events. With the advent of
> virtualization and the ever growing need of more high-quality random
> numbers, these random number generators reach their limits.
> Additional sources of entropy must be opened up. This document
> introduces an entropy source based on CPU execution time jitter. The
> design and implementation of a non-physical true random number
> generator, the CPU Jitter random number generator, its statistical
> properties and the maintenance and behavior of entropy is discussed
> in this document.
> 
> If this construct is legit, a core edk2 implementation (available to all
> platforms, and on all edk2 arches) would be a huge win.
> 
> On the other hand, we're having this discussion because the premise of
> TianoCore#1871 is that we shouldn't rely on just the CPU and a high
> resolution timer... I simply cannot decide if this construct is
> trustworthy. (With any solution that was based in the host's /dev/random
> or /dev/urandom, the trustworthiness question would be side-stepped in
> the firmware.)

I can't claim to have knowledge of whether the above text is accurate
or not; instead I just defer to the Linux maintainers' recommendations.
They've considered it acceptable to merge it into Linux, and to me that
says it should be acceptable to have in EDK2 too.

As a second data point, jitter entropy has been the recommended solution
in RHEL for VMs where there's no RDRAND or virtio-rng available. In RHEL-7
this was implemented in userspace in rng-tools rather than with the kernel
module I link above, but the approach is the same in both cases IIUC.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[Bug 1851547] Re: qemu 4 crashes with this parameter attached -usb -device usb-host, hostbus=1, hostaddr=7 \

2019-11-07 Thread Marietto
001:008 Compx 2.4G Receiver. The problem arises because I've detached one of
my USB disks and the numbering of the attached USB devices changed:
specifically, the Compx 2.4G receiver changed from hostaddr 7 to 8, and when
this happens qemu 4 does not cope as well as qemu 3 did.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1851547

Title:
  qemu 4 crashes with this parameter attached -usb -device usb-
  host,hostbus=1,hostaddr=7 \

Status in QEMU:
  New

Bug description:
  Hello.

  qemu / kvm does not start anymore after upgrading ubuntu from 19.04 to
  19.10 and qemu from 3 to 4, as you can see below. What can I do?

  root@ziomario-Z390-AORUS-PRO:/home/ziomario/Scrivania/OS-KVM# ./boot-
  OS-HSP2.sh

  > qemu-system-x86_64: /build/qemu-
  UryNDZ/qemu-4.0+dfsg/hw/usb/core.c:720: usb_ep_get: asserzione "dev !=
  NULL" non riuscita.

  ./boot-OS-HSP2.sh: riga 40: 26312 Annullato (core dump creato) qemu-
  system-x86_64 -enable-kvm -m 16000 -cpu
  Penryn,kvm=on,vendor=GenuineIntel,+invtsc,vmware-cpuid-
  freq=on,$MY_OPTIONS -machine pc-q35-2.9 -smp 4,cores=2 -vga none
  -device vfio-pci,host=01:00.0,bus=pcie.0,multifunction=on -device
  vfio-pci,host=01:00.1,bus=pcie.0 -device vfio-
  pci,host=01:00.2,bus=pcie.0 -device vfio-pci,host=01:00.3,bus=pcie.0
  -usb -device usb-host,hostbus=1,hostaddr=7 -drive
  if=pflash,format=raw,readonly,file=$OVMF/OVMF_CODE.fd -drive
  if=pflash,format=raw,file=$OVMF/OVMF_VARS-1024x768.fd -smbios type=2
  -device ich9-ahci,id=sata -drive
  id=Clover,if=none,snapshot=on,format=qcow2,file=./'Mo/CloverNG.qcow2'
  -device ide-hd,bus=sata.2,drive=Clover -device ide-
  hd,bus=sata.3,drive=InstallMedia -drive
  id=InstallMedia,if=none,file=BaseSystemHS.img,format=raw -drive
  id=BsdHDD,if=none,file=/dev/sdg,format=raw -device ide-
  hd,bus=sata.4,drive=BsdHDD -netdev
  tap,id=net0,ifname=tap0,script=no,downscript=no -device
  e1000-82545em,netdev=net0,id=net0,mac=52:54:00:c9:18:27 -monitor stdio

  It seems that this line is not good anymore (it worked with qemu 3.x)
  :

  -usb -device usb-host,hostbus=1,hostaddr=7 \

  when I removed it,it works. But I need that. With what can I change it
  ? You can reproduce that upgrading ubuntu 19.04 to 19.10 because in
  that way also qemu will be upgraded from 3 to 4. These are the
  packages that I'm using :

  root@ziomario-Z390-AORUS-PRO:/home/ziomario# qemu-system-x86_64 --version
  QEMU emulator version 4.0.0 (Debian 1:4.0+dfsg-0ubuntu9)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1851547/+subscriptions



How to clone CPUState in a new thread?

2019-11-07 Thread Michael Goffioul
[originally posted on qemu-discuss]

=== (initial)

Hi,

I'm working on a project that wants to replace houdini (ARM-to-x86
translation layer for Android from Intel) with a free open-source
implementation. I'm trying to leverage qemu user-mode to achieve that, but
it requires code changes to allow executing dynamically loaded functions
instead of running a single executable.

In a nutshell, using ideas from unicorn-engine, I've enhanced CPUARMState
with a stop address. Whenever this address is encountered in the
translator, it generates a YIELD exception, which then makes the cpu_loop
exit.

It works fine for simple cases, but I'm having trouble with the
multi-threading aspects. Threads created from the native/ARM side do seem
to work properly.
The problem is when a new Java thread (not created from native/ARM)
attempts to execute native code. The QEMU engine has been initialized in
the main thread, but new Java threads do not have access to the thread-local
variable thread_cpu.

I've tried (maybe naively) to recreate what the clone syscall is doing to
create a new CPUState/CPUArchState object, usable from the new thread, but
executing any ARM code quickly leads to a crash. I suppose I'm doing
something wrong, or missing something to properly initialize a new cpu. I'm
hoping that someone could help me solve this problem.

I've attached the current QEMU patch I'm using, most of the Android glue
layer is in linux-user/main.c. It contains a set of utility functions that
my Android native bridge implementation is using.

=== (follow-up)

Basically Houdini implements the native bridge interface, as defined here:
https://android.googlesource.com/platform/system/core/+/refs/tags/android-10.0.0_r11/libnativebridge/include/nativebridge/native_bridge.h#172
It allows running Android APK that contains ARM-compiled native/JNI code on
an Android-x86 OS. It does so by taking care of loading the ARM .so JNI
files and providing trampoline stubs to the Android runtime JVM. It does
not expose the host native .so to the emulated code, instead it provides a
set of ARM-compiled core libraries from Android: it is actually very
similar to running dynamically linked code in qemu-user with a chroot'ed
ARM environment. Actual interaction with the native host is happening
mostly/only through binder socket.

To initialize the qemu-user engine, I make it load a custom ARM .so/ELF
file that uses the Android linker (from the ARM pseudo chroot environment)
as interpreter. This allows me to delegate all dynamic linking aspects.

So far, the emulation is working fine and I'm able to run simple
ARM-compiled apps on Android-x86, even if the native code spawns new
threads. My current (hopefully last) problem is when a Java thread
(different from the one that initialized the qemu engine) is trying to run
native code. I need to set up a new CPUState/CPUArchState instance for this
Java thread.


qemu-android.diff.bz2
Description: application/bzip


Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Paolo Bonzini
On 07/11/19 14:27, Laszlo Ersek wrote:
> The VirtioRngDxe driver is a UEFI driver that follows the UEFI driver
> model. Meaning (in this context), it is connected to the virtio-rng
> device in the BDS phase, by platform BDS code.
> 
> Put differently, the non-privileged driver that's the source of the
> sensitive data would have to be a "platform DXE driver". The virtio
> drivers are not such drivers however.

Yes, it would have to be a platform DXE driver.  What is it that limits
virtio to the BDS phase?

Paolo



Re: [RFC v4 PATCH 49/49] multi-process: add configure and usage information

2019-11-07 Thread Stefan Hajnoczi
On Thu, Oct 24, 2019 at 05:09:30AM -0400, Jagannathan Raman wrote:
> From: Elena Ufimtseva 
> 
> Signed-off-by: Elena Ufimtseva 
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> ---
>  docs/qemu-multiprocess.txt | 86 
> ++
>  1 file changed, 86 insertions(+)
>  create mode 100644 docs/qemu-multiprocess.txt
> 
> diff --git a/docs/qemu-multiprocess.txt b/docs/qemu-multiprocess.txt
> new file mode 100644
> index 000..c29f4df
> --- /dev/null
> +++ b/docs/qemu-multiprocess.txt
> @@ -0,0 +1,86 @@
> +Multi-process QEMU
> +==
> +
> +This document describes how to configure and use multi-process qemu.
> +For the design document refer to docs/devel/qemu-multiprocess.
> +
> +1) Configuration
> +
> +
> +To enable support for multi-process add --enable-mpqemu
> +to the list of options for the "configure" script.
> +
> +
> +2) Usage
> +
> +
> +To start qemu with devices intended to run in a separate emulation
> +process without libvirtd support, the following should be used on QEMU
> +command line. As of now, we only support the emulation of lsi53c895a
> +in a separate process
> +
> +* Since parts of the RAM are shared between QEMU & remote process, a
> +  memory-backend-file is required to facilitate this, as follows:
> +
> +  -object memory-backend-file,id=mem,mem-path=/dev/shm/,size=4096M,share=on
> +
> +* The devices to be emulated in the separate process are defined as
> +  before with addition of "rid" suboption that serves as a remote group
> +  identificator.
> +
> +  -device ,rid="remote process id"
> +
> +  For exmaple, for non multi-process qemu:

s/exmaple/example/

> +-device lsi53c895a,id=scsi0 device
> +-device scsi-hd,drive=drive0,bus=scsi0.0,scsi-id=0
> +-drive id=drive0,file=data-disk.img
> +
> +  and for multi-process qemu and no libvirt
> +  support (i.e. QEMU forks child processes):
> +-device lsi53c895a,id=scsi0,rid=0
> +-device scsi-hd,drive=drive0,bus=scsi0.0,scsi-id=0,rid="0"
> +
> +* The command-line options for the remote process is added to the "command"

s/is added/are added/

> +  suboption of the newly added "-remote" option. 
> +
> +   -remote [socket],rid=,command="..."
> +
> +  The drives to be emulated by the remote process are specified as part of
> +  this command sub-option. The device to be used to connect to the monitor
> +  is also specified as part of this suboption.
> +
> +  For example, the following option adds a drive and monitor to the remote
> +  process:
> +  -remote rid=0,command="-drive id=drive0,,file=data-disk.img -monitor 
> unix:/home/qmp-sock,,server,,nowait"
> +
> +  Note: There's an issue with this "command" subtion which we are in the

s/subtion/sub-option/

> +  process of fixing. To work around this issue, it requires additional
> +  "comma" characters as illustrated above, and in the example below.
> +
> +* Example QEMU command-line to launch lsi53c895a in a remote process
> +
> +  #/bin/sh
> +  qemu-system-x86_64 \
> +  -name "OL7.4" \
> +  -machine q35,accel=kvm \
> +  -smp sockets=1,cores=1,threads=1 \
> +  -cpu host \
> +  -m 2048 \
> +  -object memory-backend-file,id=mem,mem-path=/dev/shm/,size=2G,share=on \
> +  -numa node,memdev=mem \
> +  -device virtio-scsi-pci,id=virtio_scsi_pci0 \
> +  -drive id=drive_image1,if=none,format=raw,file=/root/ol7.qcow2 \
> +  -device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0 \
> +  -boot d \
> +  -monitor stdio \
> +  -vnc :0 \
> +  -device lsi53c895a,id=lsi0,remote,rid=8,command="qemu-scsi-dev" \
> +  -device 
> scsi-hd,id=drive2,drive=drive_image2,bus=lsi0.0,scsi-id=0,remote,rid=8,command="qemu-scsi-dev"\
> +  -remote rid=8,command="-drive 
> id=drive_image2,,file=/root/remote-process-disk.img -monitor 
> unix:/home/qmp-sock,,server,,nowait"
> +
> +  We could connect to the monitor using the following command:
> +  socat /home/qmp-sock stdio
> +
> +  After hotplugging disks to the remote process, please execute the
> +  following command in the guest to refresh the list of storage devices:
> +  rescan_scsi_bus.sh -a

This documentation suggests that QEMU spawns the remote processes.  How
does this work with unprivileged QEMU?  Is there an additional step where
QEMU drops privileges after having spawned remote processes?

Remote processes require accesses to resources that the main QEMU
process does not need access to, so I'm wondering how this process model
ensures that each process has only the privileges it needs.

Stefan




Re: [PATCH v4 09/20] libqos: split qos-test and libqos makefile vars

2019-11-07 Thread Darren Kenny

On Wed, Oct 30, 2019 at 02:49:55PM +, Oleinik, Alexander wrote:

From: Alexander Oleinik 

Most qos-related objects were specified in the qos-test-obj-y variable.
qos-test-obj-y also included qos-test.o which defines a main().
This made it difficult to repurpose qos-test-obj-y to link anything
beside tests/qos-test against libqos. This change separates objects that
are libqos-specific and ones that are qos-test specific into different
variables.

Signed-off-by: Alexander Oleinik 


Reviewed-by: Darren Kenny 


---
tests/Makefile.include | 71 +-
1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 67853d10c3..1517c4817e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -699,52 +699,53 @@ tests/test-crypto-block$(EXESUF): 
tests/test-crypto-block.o $(test-crypto-obj-y)

libqgraph-obj-y = tests/libqos/qgraph.o

-libqos-obj-y = $(libqgraph-obj-y) tests/libqos/pci.o tests/libqos/fw_cfg.o
-libqos-obj-y += tests/libqos/malloc.o
-libqos-obj-y += tests/libqos/libqos.o
-libqos-spapr-obj-y = $(libqos-obj-y) tests/libqos/malloc-spapr.o
+libqos-core-obj-y = $(libqgraph-obj-y) tests/libqos/pci.o tests/libqos/fw_cfg.o
+libqos-core-obj-y += tests/libqos/malloc.o
+libqos-core-obj-y += tests/libqos/libqos.o
+libqos-spapr-obj-y = $(libqos-core-obj-y) tests/libqos/malloc-spapr.o
libqos-spapr-obj-y += tests/libqos/libqos-spapr.o
libqos-spapr-obj-y += tests/libqos/rtas.o
libqos-spapr-obj-y += tests/libqos/pci-spapr.o
-libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o
+libqos-pc-obj-y = $(libqos-core-obj-y) tests/libqos/pci-pc.o
libqos-pc-obj-y += tests/libqos/malloc-pc.o tests/libqos/libqos-pc.o
libqos-pc-obj-y += tests/libqos/ahci.o
libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) tests/libqos/usb.o

# Devices
-qos-test-obj-y = tests/qos-test.o $(libqgraph-obj-y)
-qos-test-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
-qos-test-obj-y += tests/libqos/e1000e.o
-qos-test-obj-y += tests/libqos/i2c.o
-qos-test-obj-y += tests/libqos/i2c-imx.o
-qos-test-obj-y += tests/libqos/i2c-omap.o
-qos-test-obj-y += tests/libqos/sdhci.o
-qos-test-obj-y += tests/libqos/tpci200.o
-qos-test-obj-y += tests/libqos/virtio.o
-qos-test-obj-$(CONFIG_VIRTFS) += tests/libqos/virtio-9p.o
-qos-test-obj-y += tests/libqos/virtio-balloon.o
-qos-test-obj-y += tests/libqos/virtio-blk.o
-qos-test-obj-y += tests/libqos/virtio-mmio.o
-qos-test-obj-y += tests/libqos/virtio-net.o
-qos-test-obj-y += tests/libqos/virtio-pci.o
-qos-test-obj-y += tests/libqos/virtio-pci-modern.o
-qos-test-obj-y += tests/libqos/virtio-rng.o
-qos-test-obj-y += tests/libqos/virtio-scsi.o
-qos-test-obj-y += tests/libqos/virtio-serial.o
+libqos-obj-y = $(libqgraph-obj-y)
+libqos-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+libqos-obj-y += tests/libqos/e1000e.o
+libqos-obj-y += tests/libqos/i2c.o
+libqos-obj-y += tests/libqos/i2c-imx.o
+libqos-obj-y += tests/libqos/i2c-omap.o
+libqos-obj-y += tests/libqos/sdhci.o
+libqos-obj-y += tests/libqos/tpci200.o
+libqos-obj-y += tests/libqos/virtio.o
+libqos-obj-$(CONFIG_VIRTFS) += tests/libqos/virtio-9p.o
+libqos-obj-y += tests/libqos/virtio-balloon.o
+libqos-obj-y += tests/libqos/virtio-blk.o
+libqos-obj-y += tests/libqos/virtio-mmio.o
+libqos-obj-y += tests/libqos/virtio-net.o
+libqos-obj-y += tests/libqos/virtio-pci.o
+libqos-obj-y += tests/libqos/virtio-pci-modern.o
+libqos-obj-y += tests/libqos/virtio-rng.o
+libqos-obj-y += tests/libqos/virtio-scsi.o
+libqos-obj-y += tests/libqos/virtio-serial.o

# Machines
-qos-test-obj-y += tests/libqos/aarch64-xlnx-zcu102-machine.o
-qos-test-obj-y += tests/libqos/arm-imx25-pdk-machine.o
-qos-test-obj-y += tests/libqos/arm-n800-machine.o
-qos-test-obj-y += tests/libqos/arm-raspi2-machine.o
-qos-test-obj-y += tests/libqos/arm-sabrelite-machine.o
-qos-test-obj-y += tests/libqos/arm-smdkc210-machine.o
-qos-test-obj-y += tests/libqos/arm-virt-machine.o
-qos-test-obj-y += tests/libqos/arm-xilinx-zynq-a9-machine.o
-qos-test-obj-y += tests/libqos/ppc64_pseries-machine.o
-qos-test-obj-y += tests/libqos/x86_64_pc-machine.o
+libqos-obj-y += tests/libqos/aarch64-xlnx-zcu102-machine.o
+libqos-obj-y += tests/libqos/arm-imx25-pdk-machine.o
+libqos-obj-y += tests/libqos/arm-n800-machine.o
+libqos-obj-y += tests/libqos/arm-raspi2-machine.o
+libqos-obj-y += tests/libqos/arm-sabrelite-machine.o
+libqos-obj-y += tests/libqos/arm-smdkc210-machine.o
+libqos-obj-y += tests/libqos/arm-virt-machine.o
+libqos-obj-y += tests/libqos/arm-xilinx-zynq-a9-machine.o
+libqos-obj-y += tests/libqos/ppc64_pseries-machine.o
+libqos-obj-y += tests/libqos/x86_64_pc-machine.o

# Tests
+qos-test-obj-y = tests/qos-test.o
qos-test-obj-y += tests/ac97-test.o
qos-test-obj-y += tests/ds1338-test.o
qos-test-obj-y += tests/e1000-test.o
@@ -776,7 +777,7 @@ check-unit-y += tests/test-qgraph$(EXESUF)
tests/test-qgraph$(EXESUF): tests/test-qgraph.o $(libqgraph-obj-y)

check-qtest-generic-y += tests/qos-test$(EXE

Re: privileged entropy sources in QEMU/KVM guests

2019-11-07 Thread Ard Biesheuvel
On Thu, 7 Nov 2019 at 14:44, Laszlo Ersek  wrote:
>
> On 11/07/19 13:47, Paolo Bonzini wrote:
> > On 07/11/19 12:52, Daniel P. Berrangé wrote:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bb5530e4082446aac3a3d69780cd4dbfa4520013
> >>
> >> Is it practical to provide a jitter entropy source for EDK2
> >> too ?
> >
> > The hard part is not collecting jitter (though the firmware might be too
> > deterministic for that), but rather turning it into a random number seed
> > (mixing data from various sources, crediting entropy, etc.).
>
> If there is *any* entropy source that (a) we can trust to be random
> enough and (b) depends only on the CPU, then we shouldn't struggle with
> virtio-rng (or similar devices) at all. RDRAND would be a no-brainer,
> but the "community literature" suggests it should not be trusted in itself.
>
> I've read the commit message linked above, and it appears too good to be
> true.
>
> The CPU Jitter RNG provides a source of good entropy by collecting
> CPU executing time jitter. [...] The RNG does not have any
> dependencies on any other service in the kernel. The RNG only needs
> a high-resolution time stamp. [...]
>
> http://www.chronox.de/jent.html
>
> The CPU Jitter Random Number Generator provides a non-physical true
> random number generator that works equally in kernel and user land.
> The only prerequisite is the availability of a high-resolution timer
> that is available in modern CPUs.
>
> http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
>
> Today’s operating systems provide non-physical true random number
> generators which are based on hardware events. With the advent of
> virtualization and the ever growing need of more high-quality random
> numbers, these random number generators reach their limits.
> Additional sources of entropy must be opened up. This document
> introduces an entropy source based on CPU execution time jitter. The
> design and implementation of a non-physical true random number
> generator, the CPU Jitter random number generator, its statistical
> properties and the maintenance and behavior of entropy is discussed
> in this document.
>
> If this construct is legit, a core edk2 implementation (available to all
> platforms, and on all edk2 arches) would be a huge win.
>

 "that works equally in kernel and user land"

Firmware running at boot time on a single core without any serious
scheduling or I/O going on is not going to produce any entropy worth
mentioning, I'm afraid. And if it does, it is going to delay the boot
substantially.

> On the other hand, we're having this discussion because the premise of
> TianoCore#1871 is that we shouldn't rely on just the CPU and a high
> resolution timer... I simply cannot decide if this construct is
> trustworthy. (With any solution that was based in the host's /dev/random
> or /dev/urandom, the trustworthiness question would be side-stepped in
> the firmware.)
>



Re: [PATCH v1 1/3] target/microblaze: Plug temp leaks for loads/stores

2019-11-07 Thread Richard Henderson
On 11/6/19 3:14 PM, Edgar E. Iglesias wrote:
> @@ -967,12 +967,14 @@ static void dec_load(DisasContext *dc)
> 10 -> 10
> 11 -> 00 */
>  TCGv low = tcg_temp_new();
> +TCGv t3 = tcg_const_tl(3);
>  
>  tcg_gen_andi_tl(low, addr, 3);
> -tcg_gen_sub_tl(low, tcg_const_tl(3), low);
> +tcg_gen_sub_tl(low, t3, low);
>  tcg_gen_andi_tl(addr, addr, ~3);
>  tcg_gen_or_tl(addr, addr, low);
>  tcg_temp_free(low);
> +tcg_temp_free(t3);
>  break;

While Luc correctly notes that tcg_gen_subfi_tl() may be used here, I will note
(1) there's a typo in the comment (not 2->2, but 2->1), and (2) that this whole
section can be done with

tcg_gen_xori_tl(addr, addr, 3);

Similarly in dec_store.

The other changes in this patch around gen_helper_memalign are ok.


r~



Re: [PATCH v1 2/3] target/microblaze: Plug temp leaks with delay slot setup

2019-11-07 Thread Richard Henderson
On 11/6/19 3:14 PM, Edgar E. Iglesias wrote:
> From: "Edgar E. Iglesias" 
> 
> Plug temp leaks with delay slot setup.
> 
> Signed-off-by: Edgar E. Iglesias 
> ---
>  target/microblaze/translate.c | 26 ++
>  1 file changed, 14 insertions(+), 12 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [PATCH v1 3/3] target/microblaze: Plug temp leak around eval_cond_jmp()

2019-11-07 Thread Richard Henderson
On 11/6/19 3:14 PM, Edgar E. Iglesias wrote:
> From: "Edgar E. Iglesias" 
> 
> Plug temp leak around eval_cond_jmp().
> 
> Signed-off-by: Edgar E. Iglesias 
> ---
>  target/microblaze/translate.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~



[PATCH 2/4] Add use of RCU for qemu_logfile.

2019-11-07 Thread Robert Foley
This allows the logfile to be changed while logging is active, and
also fixes a segfault that could occur while changing the logfile.

Any read access to the qemu_logfile handle is wrapped in
rcu_read_lock()/rcu_read_unlock(), and the handle itself is fetched
with atomic_rcu_read().
In many cases we also check that the logfile handle is valid before
using it, to handle the case where the file has been closed and the
handle set to NULL.

The cases where we write to the qemu_logfile will use atomic_rcu_set().
Writers will also use call_rcu() with a newly added qemu_logfile_free
function for freeing/closing when readers have finished.

Signed-off-by: Robert Foley 
---
 include/qemu/log.h | 47 
 util/log.c | 78 ++
 include/exec/log.h | 33 +---
 tcg/tcg.c  | 12 +--
 4 files changed, 138 insertions(+), 32 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index a91105b2ad..975de18e23 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -3,9 +3,17 @@
 
 /* A small part of this API is split into its own header */
 #include "qemu/log-for-trace.h"
+#include "qemu/rcu.h"
+
+struct QemuLogFile {
+struct rcu_head rcu;
+FILE *fd;
+};
+typedef struct QemuLogFile QemuLogFile;
 
 /* Private global variable, don't use */
-extern FILE *qemu_logfile;
+extern QemuLogFile *qemu_logfile;
+
 
 /* 
  * The new API:
@@ -25,7 +33,17 @@ static inline bool qemu_log_enabled(void)
  */
 static inline bool qemu_log_separate(void)
 {
-return qemu_logfile != NULL && qemu_logfile != stderr;
+QemuLogFile *logfile;
+
+if (qemu_log_enabled()) {
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+if (logfile && logfile->fd != stderr) {
+rcu_read_unlock();
+return true;
+}
+rcu_read_unlock();
+}
+return false;
 }
 
 #define CPU_LOG_TB_OUT_ASM (1 << 0)
@@ -55,12 +73,23 @@ static inline bool qemu_log_separate(void)
 
 static inline void qemu_log_lock(void)
 {
-qemu_flockfile(qemu_logfile);
+QemuLogFile *logfile;
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+if (logfile) {
+qemu_flockfile(logfile->fd);
+}
+rcu_read_unlock();
 }
 
 static inline void qemu_log_unlock(void)
 {
-qemu_funlockfile(qemu_logfile);
+QemuLogFile *logfile;
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+if (logfile) {
+qemu_funlockfile(logfile->fd);
+}
+rcu_read_unlock();
 }
 
 /* Logging functions: */
@@ -70,9 +99,14 @@ static inline void qemu_log_unlock(void)
 static inline void GCC_FMT_ATTR(1, 0)
 qemu_log_vprintf(const char *fmt, va_list va)
 {
-if (qemu_logfile) {
-vfprintf(qemu_logfile, fmt, va);
+QemuLogFile *logfile;
+
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+if (logfile) {
+vfprintf(logfile->fd, fmt, va);
 }
+rcu_read_unlock();
 }
 
 /* log only if a bit is set on the current loglevel mask:
@@ -129,5 +163,6 @@ void qemu_print_log_usage(FILE *f);
 void qemu_log_flush(void);
 /* Close the log file */
 void qemu_log_close(void);
+void qemu_logfile_free(QemuLogFile *logfile);
 
 #endif
diff --git a/util/log.c b/util/log.c
index dff2f98c8c..d409532b8f 100644
--- a/util/log.c
+++ b/util/log.c
@@ -29,7 +29,7 @@
 static char *logfilename;
 static bool qemu_logfile_initialized;
 static QemuMutex qemu_logfile_mutex;
-FILE *qemu_logfile;
+QemuLogFile *qemu_logfile;
 int qemu_loglevel;
 static int log_append = 0;
 static GArray *debug_regions;
@@ -38,10 +38,14 @@ static GArray *debug_regions;
 int qemu_log(const char *fmt, ...)
 {
 int ret = 0;
-if (qemu_logfile) {
+QemuLogFile *logfile;
+
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+if (logfile) {
 va_list ap;
 va_start(ap, fmt);
-ret = vfprintf(qemu_logfile, fmt, ap);
+ret = vfprintf(logfile->fd, fmt, ap);
 va_end(ap);
 
 /* Don't pass back error results.  */
@@ -49,6 +53,7 @@ int qemu_log(const char *fmt, ...)
 ret = 0;
 }
 }
+rcu_read_unlock();
 return ret;
 }
 
@@ -65,6 +70,8 @@ static bool log_uses_own_buffers;
 /* enable or disable low levels log */
 void qemu_set_log(int log_flags)
 {
+QemuLogFile *logfile;
+
 qemu_loglevel = log_flags;
 #ifdef CONFIG_TRACE_LOG
 qemu_loglevel |= LOG_TRACE;
@@ -77,43 +84,51 @@ void qemu_set_log(int log_flags)
 qemu_mutex_lock(&qemu_logfile_mutex);
 if (!qemu_logfile &&
 (is_daemonized() ? logfilename != NULL : qemu_loglevel)) {
+logfile = g_new0(QemuLogFile, 1);
 if (logfilename) {
-qemu_logfile = fopen(logfilename, log_append ? "a" : "w");
-if (!qemu_logfile) {
+logfile->fd = fopen(logfilename, log_append ? "a" : "w");
+if (!logfile->fd) {
+g_free(logfile);
 perror(logfilename);
 _exit(1);

[PATCH 3/4] qemu_log_lock/unlock now preserves the qemu_logfile handle.

2019-11-07 Thread Robert Foley
qemu_log_lock() now returns a handle and qemu_log_unlock() receives a
handle to unlock.  This allows the handle to change during logging
while ensuring that lock() and unlock() operate on the same file.

Signed-off-by: Robert Foley 
---
 include/qemu/log.h| 14 +++---
 accel/tcg/cpu-exec.c  |  4 ++--
 accel/tcg/translate-all.c |  4 ++--
 accel/tcg/translator.c|  4 ++--
 exec.c|  4 ++--
 hw/net/can/can_sja1000.c  |  4 ++--
 net/can/can_socketcan.c   |  5 ++---
 target/cris/translate.c   |  4 ++--
 target/i386/translate.c   |  5 +++--
 target/lm32/translate.c   |  4 ++--
 target/microblaze/translate.c |  4 ++--
 target/nios2/translate.c  |  4 ++--
 target/tilegx/translate.c |  7 ---
 target/unicore32/translate.c  |  4 ++--
 tcg/tcg.c | 16 
 15 files changed, 44 insertions(+), 43 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index 975de18e23..3d0f47a479 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -71,25 +71,25 @@ static inline bool qemu_log_separate(void)
  * qemu_loglevel is never set when qemu_logfile is unset.
  */
 
-static inline void qemu_log_lock(void)
+static inline FILE *qemu_log_lock(void)
 {
 QemuLogFile *logfile;
 rcu_read_lock();
 logfile = atomic_rcu_read(&qemu_logfile);
 if (logfile) {
 qemu_flockfile(logfile->fd);
+return logfile->fd;
 }
 rcu_read_unlock();
+return NULL;
 }
 
-static inline void qemu_log_unlock(void)
+static inline void qemu_log_unlock(FILE *fd)
 {
-QemuLogFile *logfile;
-logfile = atomic_rcu_read(&qemu_logfile);
-if (logfile) {
-qemu_funlockfile(logfile->fd);
+if (fd) {
+qemu_funlockfile(fd);
+rcu_read_unlock();
 }
-rcu_read_unlock();
 }
 
 /* Logging functions: */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index c01f59c743..62068d10c3 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -156,7 +156,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, 
TranslationBlock *itb)
 #if defined(DEBUG_DISAS)
 if (qemu_loglevel_mask(CPU_LOG_TB_CPU)
 && qemu_log_in_addr_range(itb->pc)) {
-qemu_log_lock();
+FILE *logfile = qemu_log_lock();
 int flags = 0;
 if (qemu_loglevel_mask(CPU_LOG_TB_FPU)) {
 flags |= CPU_DUMP_FPU;
@@ -165,7 +165,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, 
TranslationBlock *itb)
 flags |= CPU_DUMP_CCOP;
 #endif
 log_cpu_state(cpu, flags);
-qemu_log_unlock();
+qemu_log_unlock(logfile);
 }
 #endif /* DEBUG_DISAS */
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 9f48da9472..bb325a2bc4 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1804,7 +1804,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM) &&
 qemu_log_in_addr_range(tb->pc)) {
-qemu_log_lock();
+FILE *logfile = qemu_log_lock();
 qemu_log("OUT: [size=%d]\n", gen_code_size);
 if (tcg_ctx->data_gen_ptr) {
 size_t code_size = tcg_ctx->data_gen_ptr - tb->tc.ptr;
@@ -1829,7 +1829,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 }
 qemu_log("\n");
 qemu_log_flush();
-qemu_log_unlock();
+qemu_log_unlock(logfile);
 }
 #endif
 
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index f977682be7..603d17ff83 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -138,11 +138,11 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
 && qemu_log_in_addr_range(db->pc_first)) {
-qemu_log_lock();
+FILE *logfile = qemu_log_lock();
 qemu_log("\n");
 ops->disas_log(db, cpu);
 qemu_log("\n");
-qemu_log_unlock();
+qemu_log_unlock(logfile);
 }
 #endif
 }
diff --git a/exec.c b/exec.c
index ffdb518535..c994a00f10 100644
--- a/exec.c
+++ b/exec.c
@@ -1223,13 +1223,13 @@ void cpu_abort(CPUState *cpu, const char *fmt, ...)
 fprintf(stderr, "\n");
 cpu_dump_state(cpu, stderr, CPU_DUMP_FPU | CPU_DUMP_CCOP);
 if (qemu_log_separate()) {
-qemu_log_lock();
+FILE *logfile = qemu_log_lock();
 qemu_log("qemu: fatal: ");
 qemu_log_vprintf(fmt, ap2);
 qemu_log("\n");
 log_cpu_state(cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
 qemu_log_flush();
-qemu_log_unlock();
+qemu_log_unlock(logfile);
 qemu_log_close();
 }
 va_end(ap2);
diff --git a/hw/net/can/can_sja1000.c b/hw/net/can/can_sja1000.c
index 1f81341554..39c78faf9b 100644
--- a/hw/net/can/can_sja1000.c
+++ b/hw/net/can/can_sja1000.c
@@ -247,8 +247,8 @@ int can_sja_accept_filter(CanSJA1000State *s,
 static vo

[PATCH 4/4] Added tests for close and change of logfile.

2019-11-07 Thread Robert Foley
One test ensures that the logfile handle is still valid even if
the logfile is changed during logging.
The other test validates that the logfile handle remains valid under
the logfile lock even if the logfile is closed.

Signed-off-by: Robert Foley 
---
 tests/test-logging.c | 74 
 1 file changed, 74 insertions(+)

diff --git a/tests/test-logging.c b/tests/test-logging.c
index a12585f70a..a3190ff92c 100644
--- a/tests/test-logging.c
+++ b/tests/test-logging.c
@@ -108,6 +108,76 @@ static void test_parse_path(gconstpointer data)
 error_free_or_abort(&err);
 }
 
+static void test_logfile_write(gconstpointer data)
+{
+QemuLogFile *logfile;
+gchar const *dir = data;
+Error *err = NULL;
+gchar *file_path;
+gchar *file_path1;
+FILE *orig_fd;
+
+file_path = g_build_filename(dir, "qemu_test_log_write0.log", NULL);
+file_path1 = g_build_filename(dir, "qemu_test_log_write1.log", NULL);
+
+/*
+ * Test that even if an open file handle is changed,
+ * our handle remains valid due to RCU.
+ */
+qemu_set_log_filename(file_path, &err);
+g_assert(!err);
+rcu_read_lock();
+logfile = atomic_rcu_read(&qemu_logfile);
+g_assert(logfile && logfile->fd);
+orig_fd = logfile->fd;
+fprintf(logfile->fd, "%s 1st write to file\n", __func__);
+fflush(logfile->fd);
+
+/* Change the logfile and ensure that the handle is still valid. */
+qemu_set_log_filename(file_path1, &err);
+g_assert(!err);
+g_assert(logfile->fd == orig_fd);
+fprintf(logfile->fd, "%s 2nd write to file\n", __func__);
+fflush(logfile->fd);
+rcu_read_unlock();
+
+g_free(file_path);
+g_free(file_path1);
+}
+
+static void test_logfile_lock(gconstpointer data)
+{
+FILE *logfile;
+gchar const *dir = data;
+Error *err = NULL;
+gchar *file_path;
+
+file_path = g_build_filename(dir, "qemu_test_logfile_lock0.log", NULL);
+
+/*
+ * Test the use of the logfile lock, such
+ * that even if an open file handle is closed,
+ * our handle remains valid for use due to RCU.
+ */
+qemu_set_log_filename(file_path, &err);
+logfile = qemu_log_lock();
+g_assert(logfile);
+fprintf(logfile, "%s 1st write to file\n", __func__);
+fflush(logfile);
+
+/*
+ * Initiate a close file and make sure our handle remains
+ * valid since we still have the logfile lock.
+ */
+qemu_log_close();
+fprintf(logfile, "%s 2nd write to file\n", __func__);
+fflush(logfile);
+qemu_log_unlock(logfile);
+
+g_assert(!err);
+g_free(file_path);
+}
+
 /* Remove a directory and all its entries (non-recursive). */
 static void rmdir_full(gchar const *root)
 {
@@ -134,6 +204,10 @@ int main(int argc, char **argv)
 
 g_test_add_func("/logging/parse_range", test_parse_range);
 g_test_add_data_func("/logging/parse_path", tmp_path, test_parse_path);
+g_test_add_data_func("/logging/logfile_write_path",
+ tmp_path, test_logfile_write);
+g_test_add_data_func("/logging/logfile_lock_path",
+ tmp_path, test_logfile_lock);
 
 rc = g_test_run();
 
-- 
2.17.1




[PATCH 0/4] Make the qemu_logfile handle thread safe.

2019-11-07 Thread Robert Foley
This patch series makes the qemu_logfile handle thread safe.  The
logfile can now be changed while logging is active, and a segfault
that could occur when changing the logfile is also fixed.

RCU is used to handle swapping out the old qemu_logfile file
descriptor.

Robert Foley (4):
  Add a mutex to guarantee single writer to qemu_logfile handle.
  Add use of RCU for qemu_logfile.
  qemu_log_lock/unlock now preserves the qemu_logfile handle.
  Added tests for close and change of logfile.

 accel/tcg/cpu-exec.c  |  4 +-
 accel/tcg/translate-all.c |  4 +-
 accel/tcg/translator.c|  4 +-
 exec.c|  4 +-
 hw/net/can/can_sja1000.c  |  4 +-
 include/exec/log.h| 33 ++--
 include/qemu/log.h| 51 +++---
 net/can/can_socketcan.c   |  5 +-
 target/cris/translate.c   |  4 +-
 target/i386/translate.c   |  5 +-
 target/lm32/translate.c   |  4 +-
 target/microblaze/translate.c |  4 +-
 target/nios2/translate.c  |  4 +-
 target/tilegx/translate.c |  7 +--
 target/unicore32/translate.c  |  4 +-
 tcg/tcg.c | 28 ++
 tests/test-logging.c  | 74 ++
 util/log.c| 99 ---
 18 files changed, 273 insertions(+), 69 deletions(-)

-- 
2.17.1




[PATCH 1/4] Add a mutex to guarantee single writer to qemu_logfile handle.

2019-11-07 Thread Robert Foley
This is being added in preparation for using RCU with the logfile handle.
Also added qemu_logfile_init() for initializing the logfile mutex.

Signed-off-by: Robert Foley 
---
 util/log.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/util/log.c b/util/log.c
index 1ca13059ee..dff2f98c8c 100644
--- a/util/log.c
+++ b/util/log.c
@@ -24,8 +24,11 @@
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "trace/control.h"
+#include "qemu/thread.h"
 
 static char *logfilename;
+static bool qemu_logfile_initialized;
+static QemuMutex qemu_logfile_mutex;
 FILE *qemu_logfile;
 int qemu_loglevel;
 static int log_append = 0;
@@ -49,6 +52,14 @@ int qemu_log(const char *fmt, ...)
 return ret;
 }
 
+static void qemu_logfile_init(void)
+{
+if (!qemu_logfile_initialized) {
+qemu_mutex_init(&qemu_logfile_mutex);
+qemu_logfile_initialized = true;
+}
+}
+
 static bool log_uses_own_buffers;
 
 /* enable or disable low levels log */
@@ -58,6 +69,12 @@ void qemu_set_log(int log_flags)
 #ifdef CONFIG_TRACE_LOG
 qemu_loglevel |= LOG_TRACE;
 #endif
+
+/* Is there a better place to call this to init the logfile subsystem? */
+if (!qemu_logfile_initialized) {
+qemu_logfile_init();
+}
+qemu_mutex_lock(&qemu_logfile_mutex);
 if (!qemu_logfile &&
 (is_daemonized() ? logfilename != NULL : qemu_loglevel)) {
 if (logfilename) {
@@ -93,6 +110,7 @@ void qemu_set_log(int log_flags)
 log_append = 1;
 }
 }
+qemu_mutex_unlock(&qemu_logfile_mutex);
 if (qemu_logfile &&
 (is_daemonized() ? logfilename == NULL : !qemu_loglevel)) {
 qemu_log_close();
@@ -114,6 +132,11 @@ void qemu_set_log_filename(const char *filename, Error 
**errp)
 char *pidstr;
 g_free(logfilename);
 
+/* Is there a better place to call this to init the logfile subsystem? */
+if (!qemu_logfile_initialized) {
+qemu_logfile_init();
+}
+
 pidstr = strstr(filename, "%");
 if (pidstr) {
 /* We only accept one %d, no other format strings */
-- 
2.17.1




Re: [Patch v2 5/6] migration/postcopy: enable random order target page arrival

2019-11-07 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> After using number of target page received to track one host page, we
> could have the capability to handle random order target page arrival in
> one host page.
> 
> This is a preparation for enabling compress during postcopy.
> 
> Signed-off-by: Wei Yang 

Yep, that's better

Reviewed-by: Dr. David Alan Gilbert 


> ---
> v2:
>* use uintptr_t to calculate place_dest
>* check target pages belongs to the same host page
> ---
>  migration/ram.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index b5759793a9..666ad69284 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -4015,7 +4015,7 @@ static int ram_load_postcopy(QEMUFile *f)
>  MigrationIncomingState *mis = migration_incoming_get_current();
>  /* Temporary page that is later 'placed' */
>  void *postcopy_host_page = mis->postcopy_tmp_page;
> -void *last_host = NULL;
> +void *this_host = NULL;
>  bool all_zero = false;
>  int target_pages = 0;
>  
> @@ -4062,24 +4062,26 @@ static int ram_load_postcopy(QEMUFile *f)
>   * that's moved into place later.
>   * The migration protocol uses,  possibly smaller, target-pages
>   * however the source ensures it always sends all the components
> - * of a host page in order.
> + * of a host page in one chunk.
>   */
>  page_buffer = postcopy_host_page +
>((uintptr_t)host & (block->page_size - 1));
>  /* If all TP are zero then we can optimise the place */
>  if (target_pages == 1) {
>  all_zero = true;
> +this_host = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
> +block->page_size);
>  } else {
>  /* not the 1st TP within the HP */
> -if (host != (last_host + TARGET_PAGE_SIZE)) {
> -error_report("Non-sequential target page %p/%p",
> -  host, last_host);
> +if (QEMU_ALIGN_DOWN((uintptr_t)host, block->page_size) !=
> +(uintptr_t)this_host) {
> +error_report("Non-same host page %p/%p",
> +  host, this_host);
>  ret = -EINVAL;
>  break;
>  }
>  }
>  
> -
>  /*
>   * If it's the last part of a host page then we place the host
>   * page
> @@ -4090,7 +4092,6 @@ static int ram_load_postcopy(QEMUFile *f)
>  }
>  place_source = postcopy_host_page;
>  }
> -last_host = host;
>  
>  switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
>  case RAM_SAVE_FLAG_ZERO:
> @@ -4143,7 +4144,8 @@ static int ram_load_postcopy(QEMUFile *f)
>  
>  if (!ret && place_needed) {
>  /* This gets called at the last target page in the host page */
> -void *place_dest = host + TARGET_PAGE_SIZE - block->page_size;
> +void *place_dest = (void *)QEMU_ALIGN_DOWN((uintptr_t)host,
> +   block->page_size);
>  
>  if (all_zero) {
>  ret = postcopy_place_page_zero(mis, place_dest,
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [RFC v4 PATCH 49/49] multi-process: add configure and usage information

2019-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2019 at 03:02:20PM +0100, Stefan Hajnoczi wrote:
> This documentation suggests that QEMU spawns the remote processes.  How
> do this work with unprivileged QEMU?  Is there an additional step where
> QEMU drops privileges after having spawned remote processes?
> 
> Remote processes require accesses to resources that the main QEMU
> process does not need access to, so I'm wondering how this process model
> ensures that each process has only the privileges it needs.

I guess you have something like capabilities in mind?

When using something like SELinux, privileges are per-binary,
so the order of startup doesn't matter.

-- 
MST



Re: [PATCH v4 14/20] fuzz: Add target/fuzz makefile rules

2019-11-07 Thread Darren Kenny

On Wed, Oct 30, 2019 at 02:50:00PM +, Oleinik, Alexander wrote:

From: Alexander Oleinik 

Signed-off-by: Alexander Oleinik 
---
Makefile| 15 ++-
Makefile.objs   |  4 +++-
Makefile.target | 18 +-
tests/fuzz/Makefile.include |  4 
4 files changed, 38 insertions(+), 3 deletions(-)
create mode 100644 tests/fuzz/Makefile.include

diff --git a/Makefile b/Makefile
index d2b2ecd3c4..571f5562c9 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,7 @@ config-host.h-timestamp: config-host.mak
qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > 
$@,"GEN","$@")

-TARGET_DIRS_RULES := $(foreach t, all clean install, $(addsuffix /$(t), 
$(TARGET_DIRS)))
+TARGET_DIRS_RULES := $(foreach t, all fuzz clean install, $(addsuffix /$(t), 
$(TARGET_DIRS)))

SOFTMMU_ALL_RULES=$(filter %-softmmu/all, $(TARGET_DIRS_RULES))
$(SOFTMMU_ALL_RULES): $(authz-obj-y)
@@ -476,6 +476,15 @@ $(SOFTMMU_ALL_RULES): config-all-devices.mak
$(SOFTMMU_ALL_RULES): $(edk2-decompressed)
$(SOFTMMU_ALL_RULES): $(softmmu-main-y)

+SOFTMMU_FUZZ_RULES=$(filter %-softmmu/fuzz, $(TARGET_DIRS_RULES))
+$(SOFTMMU_FUZZ_RULES): $(authz-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(block-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(chardev-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(crypto-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(io-obj-y)
+$(SOFTMMU_FUZZ_RULES): config-all-devices.mak
+$(SOFTMMU_FUZZ_RULES): $(edk2-decompressed)
+
.PHONY: $(TARGET_DIRS_RULES)
# The $(TARGET_DIRS_RULES) are of the form SUBDIR/GOAL, so that
# $(dir $@) yields the sub-directory, and $(notdir $@) yields the sub-goal
@@ -526,6 +535,9 @@ subdir-slirp: slirp/all
$(filter %/all, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
$(qom-obj-y) $(crypto-user-obj-$(CONFIG_USER_ONLY))

+$(filter %/fuzz, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
+   $(qom-obj-y) $(crypto-user-obj-$(CONFIG_USER_ONLY))
+
ROM_DIRS = $(addprefix pc-bios/, $(ROMS))
ROM_DIRS_RULES=$(foreach t, all clean, $(addsuffix /$(t), $(ROM_DIRS)))
# Only keep -O and -g cflags
@@ -535,6 +547,7 @@ $(ROM_DIRS_RULES):

.PHONY: recurse-all recurse-clean recurse-install
recurse-all: $(addsuffix /all, $(TARGET_DIRS) $(ROM_DIRS))
+recurse-fuzz: $(addsuffix /fuzz, $(TARGET_DIRS) $(ROM_DIRS))
recurse-clean: $(addsuffix /clean, $(TARGET_DIRS) $(ROM_DIRS))
recurse-install: $(addsuffix /install, $(TARGET_DIRS))
$(addsuffix /install, $(TARGET_DIRS)): all
diff --git a/Makefile.objs b/Makefile.objs
index 9ff9b0c6f9..5478a554f6 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -86,10 +86,12 @@ common-obj-$(CONFIG_FDT) += device_tree.o
# qapi

common-obj-y += qapi/
+softmmu-obj-y = main.o

-softmmu-main-y = main.o
endif

+
+
###
# Target-independent parts used in system and user emulation
common-obj-y += cpus-common.o
diff --git a/Makefile.target b/Makefile.target
index ca3d14efe1..cddc8e4306 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -202,7 +202,7 @@ endif
COMMON_LDADDS = ../libqemuutil.a

# build either PROG or PROGW
-$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
+$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS) $(softmmu-obj-y)
$(call LINK, $(filter-out %.mak, $^))
ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o 
$@,"REZ","$(TARGET_DIR)$@")
@@ -227,6 +227,22 @@ ifdef CONFIG_TRACE_SYSTEMTAP
rm -f *.stp
endif

+ifdef CONFIG_FUZZ
+include $(SRC_PATH)/tests/fuzz/Makefile.include
+include $(SRC_PATH)/tests/Makefile.include
+
+fuzz: fuzz-vars
+fuzz-vars: QEMU_CFLAGS := $(FUZZ_CFLAGS) $(QEMU_CFLAGS)
+fuzz-vars: QEMU_LDFLAGS := $(FUZZ_LDFLAGS) $(QEMU_LDFLAGS)
+fuzz-vars: $(QEMU_PROG_FUZZ)
+dummy := $(call unnest-vars,, fuzz-obj-y)
+
+
+$(QEMU_PROG_FUZZ): config-devices.mak $(all-obj-y) $(COMMON_LDADDS) 
$(fuzz-obj-y)
+   $(call LINK, $(filter-out %.mak, $^))
+


It may be useful to still handle the fuzz target here and report
that fuzzing is disabled in this configuration. As it is, if I type
'make x86_64-softmmu/fuzz' I get the less useful output:

  make[1]: *** No rule to make target `fuzz'.  Stop.


+endif
+
install: all
ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
new file mode 100644
index 00..324e6c1433
--- /dev/null
+++ b/tests/fuzz/Makefile.include
@@ -0,0 +1,4 @@
+# QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
+fuzz-obj-y = $(libqos-obj-y)
+fuzz-obj-y += tests/libqtest.o
+


But otherwise, this seems cleaner in that it does not cause rebuilds
of objects when switching between target/all and target/fuzz,
assuming that is correct here.

So with that,

Reviewed-by: Darren Kenny 

Thanks,

Darren.



[PULL 0/3] Block patches for 4.2.0-rc0/4.1.1

2019-11-07 Thread Max Reitz
The following changes since commit d0f90e1423b4f412adc620eee93e8bfef8af4117:

  Merge remote-tracking branch 
'remotes/kraxel/tags/audio-20191106-pull-request' into staging (2019-11-07 
09:21:52 +)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2019-11-07

for you to fetch changes up to b7cd2c11f76d27930f53d3cf26d7b695c78d613b:

  iotests: Add test for 4G+ compressed qcow2 write (2019-11-07 14:37:46 +0100)


Block patches for 4.2.0-rc0/4.1.1:
- Fix writing to compressed qcow2 images > 4 GB
- Fix size sanity check for qcow2 bitmaps


Max Reitz (2):
  qcow2: Fix QCOW2_COMPRESSED_SECTOR_MASK
  iotests: Add test for 4G+ compressed qcow2 write

Tuguoyi (1):
  qcow2-bitmap: Fix uint64_t left-shift overflow

 block/qcow2-bitmap.c   | 14 +--
 block/qcow2.h  |  2 +-
 tests/qemu-iotests/272 | 79 ++
 tests/qemu-iotests/272.out | 10 +
 tests/qemu-iotests/group   |  1 +
 5 files changed, 102 insertions(+), 4 deletions(-)
 create mode 100755 tests/qemu-iotests/272
 create mode 100644 tests/qemu-iotests/272.out

-- 
2.23.0




Re: [PATCH v4 08/20] tests: provide test variables to other targets

2019-11-07 Thread Darren Kenny

On Wed, Oct 30, 2019 at 02:49:54PM +, Oleinik, Alexander wrote:

From: Alexander Oleinik 

Before, when tests/Makefile.include was included, its contents would be
ignored unless config-host.mak was present. Moving the ifneq responsible
for this guard allows a target to depend on both testing-related and host-related
objects. For example the virtual-device fuzzer relies on both
libqtest/libqos objects and softmmu objects.

Signed-off-by: Alexander Oleinik 


Reviewed-by: Darren Kenny 


---
tests/Makefile.include | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 34ec03391c..67853d10c3 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -27,7 +27,6 @@ check-help:
@echo "Default options are -k and (for $(MAKE) V=1) --verbose; they can 
be"
@echo "changed with variable GTESTER_OPTIONS."

-ifneq ($(wildcard config-host.mak),)
export SRC_PATH

# TODO don't duplicate $(SRC_PATH)/Makefile's qapi-py here
@@ -873,6 +872,8 @@ tests/test-qga$(EXESUF): tests/test-qga.o $(qtest-obj-y)

SPEED = quick

+ifneq ($(wildcard config-host.mak),)
+
# gtester tests, possibly with verbose output
# do_test_tap runs all tests, even if some of them fail, while do_test_human
# stops at the first failure unless -k is given on the command line
--
2.23.0





