Re: [Qemu-devel] [PATCH v1 14/14] scripts/qemu.py: allow arches use KVM for their 32bit cousins

2019-01-25 Thread Alex Bennée
Good point. I should fall through.

On Sat, 26 Jan 2019, 03:37 Eduardo Habkost  On Fri, Jan 25, 2019 at 02:00:17PM +, Alex Bennée wrote:
> > A lot of architectures can run their 32 bit cousins on KVM so the
> > kvm_available function needs to be a little less restricting when
> > deciding if KVM is available.
> >
> > Signed-off-by: Alex Bennée 
> > ---
> >  scripts/qemu.py | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/scripts/qemu.py b/scripts/qemu.py
> > index 0a5e02eb56..2934ee12c5 100644
> > --- a/scripts/qemu.py
> > +++ b/scripts/qemu.py
> > @@ -25,9 +25,18 @@ import tempfile
> >
> >  LOG = logging.getLogger(__name__)
> >
> > +# Mapping host architecture to any additional architectures it can
> > +# support which often includes its 32 bit cousin.
> > +ADDITIONAL_ARCHES = {
> > +"x86_64" : "i386",
> > +"aarch64" : "armhf"
> > +}
> >
> >  def kvm_available(target_arch=None):
> > -if target_arch and target_arch != os.uname()[4]:
> > +host_arch = os.uname()[4]
> > +if target_arch and target_arch != host_arch:
> > +if target_arch == ADDITIONAL_ARCHES[host_arch]:
>
> This will crash host_arch isn't "x86_64" or "aarch64".  I suggest
> ADDITIONAL_ARCHES.get(host_arch)
>
> > +return True
>
> I don't think we should skip the /dev/kvm check here.
>
>
> >  return False
> >  return os.access("/dev/kvm", os.R_OK | os.W_OK)
> >
> > --
> > 2.17.1
> >
>
> --
> Eduardo
>


Re: [Qemu-devel] [PATCH v2] qemu-nbd: Deprecate qemu-nbd --partition

2019-01-25 Thread Richard W.M. Jones
On Fri, Jan 25, 2019 at 05:48:37PM -0600, Eric Blake wrote:
> The existing qemu-nbd --partition code claims to handle logical
> partitions up to 8, since its introduction in 2008 (commit 7a5ca86).
> However, the implementation is bogus (actual MBR logical partitions
> form a sort of linked list, with one partition per extended table
> entry, rather than four logical partitions in a single extended
> table), making the code unlikely to work for anything beyond -P5 on
> actual guest images. What's more, the code does not support GPT
> partitions, which are becoming more popular, and maintaining device
> subsetting in both NBD and the raw device is unnecessary duplication
> of effort (even if it is not too difficult).
> 
> Note that obtaining the offsets of a partition (MBR or GPT) can be
> learned by using 'qemu-nbd -c /dev/nbd0 file.qcow2 && sfdisk --dump
> /dev/nbd0', but by the time you've done that, you might as well
> just mount /dev/nbd0p1 that the kernel creates for you instead of
> bothering with qemu exporting a subset.  Or, keeping to just
> user-space code, use nbdkit's partition filter, which has already
> known both GPT and primary MBR partitions for a while, and was
> just recently enhanced to support arbitrary logical MBR parititions.
> 
> Start the clock on the deprecation cycle, with examples of how
> to write device subsetting without using -P.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v2: actual nbdkit example [Rich], improved doc wording
> ---
>  qemu-deprecated.texi | 33 +
>  qemu-nbd.texi|  6 --
>  qemu-nbd.c   |  2 ++
>  3 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> index 219206a836f..d35e78c81ff 100644
> --- a/qemu-deprecated.texi
> +++ b/qemu-deprecated.texi
> @@ -175,3 +175,36 @@ The above, converted to the current supported format:
>  @subsubsection "irq": "" (since 3.0.0)
> 
>  The ``irq'' property is obsoleted.
> +
> +@section Related binaries
> +
> +@subsection qemu-nbd --partition (since 4.0.0)
> +
> +The ``qemu-nbd --partition $digit'' code (also spelled @option{-P})
> +can only handle MBR partitions, and has never correctly handled
> +logical partitions beyond partition 5.  If you know the offset and
> +length of the partition (perhaps by using @code{sfdisk} within the
> +guest), you can achieve the effect of exporting just that subset of
> +the disk by use of the @option{--image-opts} option with a raw
> +blockdev using the @code{offset} and @code{size} parameters layered on
> +top of any other existing blockdev. For example, if partition 1 is
> +100MiB long starting at 1MiB, the old command:
> +
> +@code{qemu-nbd -t -P 1 -f qcow2 file.qcow2}
> +
> +can be rewritten as:
> +
> +@code{qemu-nbd -t --image-opts 
> driver=raw,offset=1M,size=100M,file.driver=qcow2,file.backing.driver=file,file.backing.filename=file.qcow2}
> +
> +Alternatively, the @code{nbdkit} project provides a more powerful
> +partition filter on top of its nbd plugin, which can be used to select
> +an arbitrary MBR or GPT partition on top of any other full-image NBD
> +export.  Using this to rewrite the above example results in:
> +
> +@code{qemu-nbd -t -k /tmp/sock -f qcow2 file.qcow2 &}
> +@code{nbdkit -f --filter=partition nbd socket=/tmp/sock partition=1}
> +
> +Note that if you are exposing the export via /dev/nbd0, it is easier
> +to just export the entire image and then mount only /dev/nbd0p1 than
> +it is to reinvoke @command{qemu-nbd -c /dev/nbd0} limited to just a
> +subset of the image.
> diff --git a/qemu-nbd.texi b/qemu-nbd.texi
> index 386bece4680..d0c51828149 100644
> --- a/qemu-nbd.texi
> +++ b/qemu-nbd.texi
> @@ -56,8 +56,10 @@ auto-detecting.
>  @item -r, --read-only
>  Export the disk as read-only.
>  @item -P, --partition=@var{num}
> -Only expose MBR partition @var{num}.  Understands physical partitions
> -1-4 and logical partitions 5-8.
> +Deprecated: Only expose MBR partition @var{num}.  Understands physical
> +partitions 1-4 and logical partition 5. New code should instead use
> +@option{--image-opts} with the raw driver wrapping a subset of the
> +original image.
>  @item -B, --bitmap=@var{name}
>  If @var{filename} has a qcow2 persistent bitmap @var{name}, expose
>  that bitmap via the ``qemu:dirty-bitmap:@var{name}'' context
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 1f7b2a03f5d..00c07fd27ea 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -787,6 +787,8 @@ int main(int argc, char **argv)
>  flags &= ~BDRV_O_RDWR;
>  break;
>  case 'P':
> +warn_report("The '-P' option is deprecated; use --image-opts 
> with "
> +"a raw device wrapper for subset exports instead");
>  if (qemu_strtoi(optarg, NULL, 0, &partition) < 0 ||
>  partition < 1 || partition > 8) {
>  error_report("Invalid partition '%s'", optarg);
> -- 
> 2.20.1

Reviewed-by: Richard W.M. Jones 

Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock

2019-01-25 Thread Jon DeVree
On Mon, Jan 07, 2019 at 20:04:41 -0500, Pavel Tatashin wrote:
> I did exactly the same sequence on Kaby Lake CPU and could not
> reproduce it. What is your host CPU?
> 

I have some machines which display this bug and others that don't, so I
was able to figure out the difference between their configurations.

TL;DR the bug appears to be based on wther or not
/sys/devices/system/clocksource/clocksource0/current_clocksource is set
to TSC in the hypervisor

This is the log from a machine with the bug:

[0.00] Hypervisor detected: KVM
[0.00] kvm-clock: Using msrs 12 and 11
[1162908.013830] kvm-clock: cpu 0, msr 3e0fea001, primary cpu clock
[1162908.013830] clocksource: kvm-clock: mask: 0x max_cycles: 
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[1162908.013834] tsc: Detected 1899.888 MHz processor

This is the log from a machine without the bug:

[0.00] Hypervisor detected: KVM
[0.00] kvm-clock: Using msrs 4b564d01 and 4b564d00
[0.00] kvm-clock: cpu 0, msr 149fea001, primary cpu clock
[0.00] kvm-clock: using sched offset of 1558436482528906 cycles
[0.02] clocksource: kvm-clock: mask: 0x max_cycles: 
0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[0.04] tsc: Detected 2097.570 MHz processor

Note the additional line of output on the machine without the bug:

[0.00] kvm-clock: using sched offset of 1558436482528906 cycles

This is printed from kvm_sched_clock_init() in
arch/x86/kernel/kvmclock.c based on whether or not the clock is stable.
For the clock to be stable both KVM_FEATURE_CLOCKSOURCE_STABLE_BIT and
PVCLOCK_TSC_STABLE_BIT have to be set.  Both of these are controlled by
the hypervisor kernel.

* KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is always set by the hypervisor
  starting with Linux v2.6.35 - 371bcf646d17 ("KVM: x86: Tell the guest
  we'll warn it about tsc stability")
* PVCLOCK_TSC_STABLE_BIT is set starting in Linux v3.8 but only if the
  clocksource is the TSC - d828199e8444 ("KVM: x86: implement
  PVCLOCK_TSC_STABLE_BIT pvclock flag")

I changed the clocksource of a hypervisor that wasn't having issues from
TSC to HPET and when I started up a guest VM the bug suddenly appeared.
I shut down the guest, set the hypervisor's clocksource back to TSC,
started up the guest and the bug went away again.

You don't actually have to reboot the guest before the bug is visible
either, just letting the guest sit at the GRUB menu for a minute or two
before loading Linux is enough to make the bug plainly visible in the
printk timestamps.

I don't know enough to actually fix the bug, but hopefully this is
enough to allow everyone else to reproduce it and come up with a fix.

-- 
Jon
X(7): A program for managing terminal windows. See also screen(1) and tmux(1).



Re: [Qemu-devel] [PATCH v1 14/14] scripts/qemu.py: allow arches use KVM for their 32bit cousins

2019-01-25 Thread Eduardo Habkost
On Fri, Jan 25, 2019 at 02:00:17PM +, Alex Bennée wrote:
> A lot of architectures can run their 32 bit cousins on KVM so the
> kvm_available function needs to be a little less restricting when
> deciding if KVM is available.
> 
> Signed-off-by: Alex Bennée 
> ---
>  scripts/qemu.py | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/qemu.py b/scripts/qemu.py
> index 0a5e02eb56..2934ee12c5 100644
> --- a/scripts/qemu.py
> +++ b/scripts/qemu.py
> @@ -25,9 +25,18 @@ import tempfile
>  
>  LOG = logging.getLogger(__name__)
>  
> +# Mapping host architecture to any additional architectures it can
> +# support which often includes its 32 bit cousin.
> +ADDITIONAL_ARCHES = {
> +"x86_64" : "i386",
> +"aarch64" : "armhf"
> +}
>  
>  def kvm_available(target_arch=None):
> -if target_arch and target_arch != os.uname()[4]:
> +host_arch = os.uname()[4]
> +if target_arch and target_arch != host_arch:
> +if target_arch == ADDITIONAL_ARCHES[host_arch]:

This will crash host_arch isn't "x86_64" or "aarch64".  I suggest
ADDITIONAL_ARCHES.get(host_arch)

> +return True

I don't think we should skip the /dev/kvm check here.


>  return False
>  return os.access("/dev/kvm", os.R_OK | os.W_OK)
>  
> -- 
> 2.17.1
> 

-- 
Eduardo



Re: [Qemu-devel] [PATCH] i386: extended the cpuid level when Intel PT is enabled

2019-01-25 Thread Eduardo Habkost
On Fri, Jan 25, 2019 at 02:21:20AM +, Kang, Luwei wrote:
> > > Intel Processor Trace required CPUID[0x14] but the cpuid level is 0xd
> > > when create a kvm guest with e.g. "-cpu qemu64,+intel-pt".
> > >
> > > Signed-off-by: Luwei Kang 
> > > ---
> > >  target/i386/cpu.c | 7 +++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c index
> > > 2f54125..da477b3 100644
> > > --- a/target/i386/cpu.c
> > > +++ b/target/i386/cpu.c
> > > @@ -5023,6 +5023,13 @@ static void x86_cpu_expand_features(X86CPU *cpu, 
> > > Error **errp)
> > >  x86_cpu_adjust_feat_level(cpu, FEAT_C000_0001_EDX);
> > >  x86_cpu_adjust_feat_level(cpu, FEAT_SVM);
> > >  x86_cpu_adjust_feat_level(cpu, FEAT_XSAVE);
> > > +
> > > +/* Intel Processor Trace requires CPUID[0x14] */
> > > +if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
> > > + kvm_enabled()) {
> > > +x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
> > > +}
> > 
> > This will require a new machine-type compatibility flag to enable the new 
> > behavior, so we don't change CPUID data under the guest feet during live 
> > migration.
> 
> Hi Eduardo,
> Thanks for your reply. I have some question on your comments.
> The cpuid level come from specific machine-type (e.g. qemu64, 
> Skylake-Server) and they are all 0xd, but Intel PT required 0x14 so I extend 
> the cpuid level.
> I don't fully understand what is the "require a new machine-type 
> compatibility flag" mean, I need to add a new flag in each machine-type? 
> I try to do live migration with "-cpu qemu64,+intel-pt" and "-cpu host" 
> are all passed test. We didn't change the cpuid data during live migration 
> just initialize the cpuid data when create a new vcpu. Please correct me if 
> anything wrong.

CPUID data is not sent as part of the migration stream (it is
recreated on the migration destination), so if "-cpu qemu,+intel-pt"
results in different CPUID data, migration between QEMU 4.0 and
3.1 will make CPUID level change during live migration.

This is not a serious issue, but it might confuse software
running on the guest.

We can fix that doing this:

target/i386/cpu.c:

static Property x86_cpu_properties[] = {
...
DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level, 
true),

}
...
static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
{
...
/* Intel Processor Trace requires CPUID[0x14] */
if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) && 
cpu->intel_pt_auto_level) {
x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
}
...
}

hw/i386/pc.c:

GlobalProperty pc_compat_3_1[] = {
...
{ TYPE_X86_CPU, "x-intel-pt-auto-leevl", "off" },
};

> Thanks,
> Luwei Kang
> 
> > 
> > > +
> > >  /* SVM requires CPUID[0x800A] */
> > >  if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> > >  x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel,
> > > 0x800A);
> > > --
> > > 1.8.3.1
> > >
> > 
> > --
> > Eduardo

-- 
Eduardo



Re: [Qemu-devel] [PATCH 15/23] hw/arm/armsse: Add unimplemented-device stubs for MHUs

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The SSE-200 has two Message Handling Units (MHUs), which sit behind
> the APB PPC0. Wire up some unimplemented-device stubs for these,
> since we don't yet implement a real model of this device.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h |  3 +++
>  hw/arm/armsse.c | 41 +
>  2 files changed, 44 insertions(+)

Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH 14/23] iotkit-sysinfo: Make SYS_VERSION and SYS_CONFIG configurable

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The SYS_VERSION and SYS_CONFIG register values differ between the
> IoTKit and SSE-200. Make them configurable via QOM properties rather
> than hard-coded, and set them appropriately in the ARMSSE code that
> instantiates the IOTKIT_SYSINFO device.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/misc/iotkit-sysinfo.h |  6 
>  hw/arm/armsse.c  | 51 
>  hw/misc/iotkit-sysinfo.c | 15 --
>  3 files changed, 70 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH 13/23] hw/arm/armsse: Put each CPU in its own cluster object

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> Create a cluster object to hold each CPU in the SSE. They are
> logically distinct and may be configured differently (for instance
> one may not have an FPU where the other does).
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h |  2 ++
>  hw/arm/armsse.c | 31 ---
>  2 files changed, 30 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH] gdbstub: Fix i386/x86_64 machine description and add control registers

2019-01-25 Thread Doug Gale
On Fri, Jan 25, 2019 at 6:22 AM Peter Maydell 
wrote:

>
> Thanks for this explanation -- the patch makes a lot more sense with it.
> I'm confused though -- the XML we ship is basically what gdb itself
> ships and uses internally:
>
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/features/i386/i386.xml;h=beb1496d9773efcf0e0526dc540a5e206a2e21fc;hb=HEAD
>
> and that uses xi:include to pull in the other files.
> So it seems odd that gdb can't parse the XML it is using
> itself internally. Maybe QEMU is doing something else wrong
> somewhere?
>
>
I thought that too. I implemented gdbstub tracing a while ago and had it
enabled (-trace gdb*). The binary data it sends is exactly what I expect.
When I break into gdb, I checked the data and it looked correct. debug_xml
even prints out what it's going to parse. Looks fine. GDB calls into an XML
parser library, which I don't have built from source, so I didn't find
precisely where it fails, only the high level log showing unexpected
elements at nested feature.


Re: [Qemu-devel] [PATCH 12/23] hw/arm/armsse: Give each CPU its own view of memory

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> Give each CPU its own container memory region. This is necessary
> for two reasons:
>  * some devices are instantiated one per CPU and the CPU sees only
>its own device
>  * since a memory region can only be put into one container, we must
>give each armv7m object a different MemoryRegion as its 'memory'
>property, or a dual-CPU configuration will assert on realize when
>the second armv7m object tries to put the MR into a container when
>it is already in the first armv7m object's container
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h | 10 ++
>  hw/arm/armsse.c | 22 --
>  2 files changed, 30 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH 11/23] hw/arm/armsse: Support dual-CPU configuration

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The SSE-200 has two Cortex-M33 CPUs. These see the same view
> of memory, with the exception of the "private CPU region" which
> has per-CPU devices. Internal device interrupts for SSE-200
> devices are mostly wired up to both CPUs, with the exception of
> a few per-CPU devices. External GPIO inputs on the SSE-200
> device are provided for the second CPU's interrupts above 32,
> as is already the case for the first CPU.
> 
> Refactor the code to support creation of multiple CPUs.
> For the moment we leave all CPUs with the same view of
> memory: this will not work in the multiple-CPU case, but
> we will fix this in the following commit.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h |  21 +++-
>  hw/arm/armsse.c | 206 
>  2 files changed, 180 insertions(+), 47 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH 10/23] hw/arm/armsse: Make SRAM bank size configurable

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> For the IoTKit the SRAM bank size is always 32K (15 bits); for the
> SSE-200 this is a configurable parameter, which defaults to 32K but
> can be changed when it is built into a particular SoC. For instance
> the Musca-B1 board sets it to 128K (17 bits).
> 
> Make the bank size a QOM property. We follow the SSE-200 hardware in
> naming the parameter SRAM_ADDR_WIDTH, which specifies the number of
> address bits of a single SRAM bank.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h |  1 +
>  hw/arm/armsse.c | 18 --
>  2 files changed, 17 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH 09/23] hw/arm/armsse: Make number of SRAM banks parameterised

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The SSE-200 has four banks of SRAM, each with its own
> Memory Protection Controller, where the IoTKit has only one.
> Make the number of SRAM banks a field in ARMSSEInfo.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/armsse.h |  9 +++--
>  hw/arm/armsse.c | 78 ++---
>  2 files changed, 56 insertions(+), 31 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH 08/23] hw/misc/iotkit-secctl: Support 4 internal MPCs

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The SSE-200 has 4 banks of SRAM, each with its own internal
> Memory Protection Controller. The interrupt status for these
> extra MPCs appears in the same security controller SECMPCINTSTATUS
> register as the MPC for the IoTKit's single SRAM bank. Enhance the
> iotkit-secctl device to allow 4 MPCs. (If the particular IoTKit/SSE
> variant in use does not have all 4 MPCs then the unused inputs will
> simply result in the SECMPCINTSTATUS bits being zero as required.)
> 
> The hardcoded constant "1"s in armsse.c indicate the actual number
> of SRAM MPCs the IoTKit has, and will be replaced in the following
> commit.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/misc/iotkit-secctl.h | 6 +++---
>  hw/arm/armsse.c | 6 +++---
>  hw/misc/iotkit-secctl.c | 5 +++--
>  3 files changed, 9 insertions(+), 8 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH 07/23] hw/arm/iotkit: Rename files to hw/arm/armsse.[ch]

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> Rename the files that used to be iotkit.[ch] to
> armsse.[ch] to reflect the fact they new cover
> multiple Arm subsystems for embedded.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/arm/Makefile.objs  | 2 +-
>  include/hw/arm/{iotkit.h => armsse.h} | 4 ++--
>  hw/arm/{iotkit.c => armsse.c} | 2 +-
>  hw/arm/mps2-tz.c  | 2 +-
>  MAINTAINERS   | 4 ++--
>  default-configs/arm-softmmu.mak   | 2 +-
>  6 files changed, 8 insertions(+), 8 deletions(-)
>  rename include/hw/arm/{iotkit.h => armsse.h} (99%)
>  rename hw/arm/{iotkit.c => armsse.c} (99%)

Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH v5 00/35] target/riscv: Convert to decodetree

2019-01-25 Thread Palmer Dabbelt

On Tue, 22 Jan 2019 13:38:52 PST (-0800), richard.hender...@linaro.org wrote:

On 1/22/19 1:28 AM, Bastian Koppelmann wrote:

Hi,

this patchset converts the RISC-V decoder to decodetree in four major steps:

1) Convert 32-bit instructions to decodetree [Patch 1-16]:
Many of the gen_* functions are called by the decode functions for 16-bit
and 32-bit functions. If we move translation code from the gen_*
functions to the generated trans_* functions of decode-tree, we get a lot of
duplication. Therefore, we mostly generate calls to the old gen_* function
which are properly replaced after step 2).

Each of the trans_ functions are grouped into files corresponding to their
ISA extension, e.g. addi which is in RV32I is translated in the file
'trans_rvi.inc.c'.

2) Convert 16-bit instructions to decodetree [Patch 17-19]:
All 16 bit instructions have a direct mapping to a 32 bit instruction. Thus,
we convert the arguments in the 16 bit trans_ function to the arguments of
the corresponding 32 bit instruction and call the 32 bit trans_ function.

3) Remove old manual decoding in gen_* function [Patch 20-30]:
this move all manual translation code into the trans_* instructions of
decode tree, such that we can remove the old decode_* functions.

4) Simplify RVC by reusing as much as possible from the RVG decoder as suggested
   by Richard. [Patch 31-35]

full tree available at
https://github.com/bkoppelmann/qemu/tree/riscv-dt-v5

Cheers,
Bastian

v4 -> v5:
- fixed rebase error
- moved TARGET_LONG_BITS check of shift instructions before rd == 0 check
- removed extra sign extension of sraiw
- removed rs2 == 0 special cases in sraw/srlw


All looks good to me now.  Thanks for persevering.


Thanks for reviewing this.

Bastian: what sort of testing have you done here?  This is a pretty big change, 
and while it's been fairly extensively reviewed I'm just generally paranoid.




Re: [Qemu-devel] [PATCH 06/23] hw/arm/iotkit: Rename 'iotkit' local variables and functions

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> Rename various internal uses of 'iotkit' in hw/arm/iotkit.c to
> 'armsse', for consistency. The remaining occurences are:
>  * related to the devices TYPE_IOTKIT_SYSCTL, TYPE_IOTKIT_SYSINFO,
>etc, which this refactor is not touching
>  * references that apply specifically to the IoTKit (like
>the lack of a private CPU region)
>  * the vmstate, which keeps its old "iotkit" name for
>migration compatibility reasons
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/arm/iotkit.c | 68 -
>  1 file changed, 34 insertions(+), 34 deletions(-)

Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH 05/23] hw/arm/iotkit: Refactor into abstract base class and subclass

2019-01-25 Thread Richard Henderson
On 1/21/19 10:51 AM, Peter Maydell wrote:
> The Arm SSE-200 Subsystem for Embedded is a revised and
> extended version of the older IoTKit SoC. Prepare for
> adding a model of it by refactoring the IoTKit code into
> an abstract base class which contains the functionality,
> driven by a class data block specific to each subclass.
> (This is the same approach used by the existing bcm283x
> SoC family implementation.)
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/iotkit.h | 22 +-
>  hw/arm/iotkit.c | 34 +-
>  2 files changed, 46 insertions(+), 10 deletions(-)

Reviewed-by: Richard Henderson 


r~






[Qemu-devel] [PATCH v2] qemu-nbd: Deprecate qemu-nbd --partition

2019-01-25 Thread Eric Blake
The existing qemu-nbd --partition code claims to handle logical
partitions up to 8, since its introduction in 2008 (commit 7a5ca86).
However, the implementation is bogus (actual MBR logical partitions
form a sort of linked list, with one partition per extended table
entry, rather than four logical partitions in a single extended
table), making the code unlikely to work for anything beyond -P5 on
actual guest images. What's more, the code does not support GPT
partitions, which are becoming more popular, and maintaining device
subsetting in both NBD and the raw device is unnecessary duplication
of effort (even if it is not too difficult).

Note that obtaining the offsets of a partition (MBR or GPT) can be
learned by using 'qemu-nbd -c /dev/nbd0 file.qcow2 && sfdisk --dump
/dev/nbd0', but by the time you've done that, you might as well
just mount /dev/nbd0p1 that the kernel creates for you instead of
bothering with qemu exporting a subset.  Or, keeping to just
user-space code, use nbdkit's partition filter, which has already
known both GPT and primary MBR partitions for a while, and was
just recently enhanced to support arbitrary logical MBR parititions.

Start the clock on the deprecation cycle, with examples of how
to write device subsetting without using -P.

Signed-off-by: Eric Blake 

---
v2: actual nbdkit example [Rich], improved doc wording
---
 qemu-deprecated.texi | 33 +
 qemu-nbd.texi|  6 --
 qemu-nbd.c   |  2 ++
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 219206a836f..d35e78c81ff 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -175,3 +175,36 @@ The above, converted to the current supported format:
 @subsubsection "irq": "" (since 3.0.0)

 The ``irq'' property is obsoleted.
+
+@section Related binaries
+
+@subsection qemu-nbd --partition (since 4.0.0)
+
+The ``qemu-nbd --partition $digit'' code (also spelled @option{-P})
+can only handle MBR partitions, and has never correctly handled
+logical partitions beyond partition 5.  If you know the offset and
+length of the partition (perhaps by using @code{sfdisk} within the
+guest), you can achieve the effect of exporting just that subset of
+the disk by use of the @option{--image-opts} option with a raw
+blockdev using the @code{offset} and @code{size} parameters layered on
+top of any other existing blockdev. For example, if partition 1 is
+100MiB long starting at 1MiB, the old command:
+
+@code{qemu-nbd -t -P 1 -f qcow2 file.qcow2}
+
+can be rewritten as:
+
+@code{qemu-nbd -t --image-opts 
driver=raw,offset=1M,size=100M,file.driver=qcow2,file.backing.driver=file,file.backing.filename=file.qcow2}
+
+Alternatively, the @code{nbdkit} project provides a more powerful
+partition filter on top of its nbd plugin, which can be used to select
+an arbitrary MBR or GPT partition on top of any other full-image NBD
+export.  Using this to rewrite the above example results in:
+
+@code{qemu-nbd -t -k /tmp/sock -f qcow2 file.qcow2 &}
+@code{nbdkit -f --filter=partition nbd socket=/tmp/sock partition=1}
+
+Note that if you are exposing the export via /dev/nbd0, it is easier
+to just export the entire image and then mount only /dev/nbd0p1 than
+it is to reinvoke @command{qemu-nbd -c /dev/nbd0} limited to just a
+subset of the image.
diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 386bece4680..d0c51828149 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -56,8 +56,10 @@ auto-detecting.
 @item -r, --read-only
 Export the disk as read-only.
 @item -P, --partition=@var{num}
-Only expose MBR partition @var{num}.  Understands physical partitions
-1-4 and logical partitions 5-8.
+Deprecated: Only expose MBR partition @var{num}.  Understands physical
+partitions 1-4 and logical partition 5. New code should instead use
+@option{--image-opts} with the raw driver wrapping a subset of the
+original image.
 @item -B, --bitmap=@var{name}
 If @var{filename} has a qcow2 persistent bitmap @var{name}, expose
 that bitmap via the ``qemu:dirty-bitmap:@var{name}'' context
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 1f7b2a03f5d..00c07fd27ea 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -787,6 +787,8 @@ int main(int argc, char **argv)
 flags &= ~BDRV_O_RDWR;
 break;
 case 'P':
+warn_report("The '-P' option is deprecated; use --image-opts with "
+"a raw device wrapper for subset exports instead");
 if (qemu_strtoi(optarg, NULL, 0, &partition) < 0 ||
 partition < 1 || partition > 8) {
 error_report("Invalid partition '%s'", optarg);
-- 
2.20.1




Re: [Qemu-devel] [PATCH 04/23] hw/arm/iotkit: Rename IoTKit to ARMSSE

2019-01-25 Thread Richard Henderson
On 1/21/19 10:50 AM, Peter Maydell wrote:
> The Arm IoTKit was effectively the forerunner of a series of
> subsystems for embedded SoCs, named the SSE-050, SSE-100 and SSE-200:
> https://developer.arm.com/products/system-design/subsystems
> These are generally quite similar, though later iterations have
> extra devices that earlier ones do not.
> 
> We want to add a model of the SSE-200, which means refactoring the
> IoTKit code into an abstract base class and subclasses (using the
> same design that the bcm283x SoC and Aspeed SoC family
> implementations do). As a first step, rename the IoTKit struct and
> QOM macros to ARMSSE, which is what we're going to name the base
> class. We temporarily retain TYPE_IOTKIT to avoid changing the
> code that instantiates a TYPE_IOTKIT device here and then changing
> it back again when it is re-introduced as a subclass.
> 
> Signed-off-by: Peter Maydell 
> ---
>  include/hw/arm/iotkit.h | 22 ++-
>  hw/arm/iotkit.c | 59 +
>  hw/arm/mps2-tz.c|  2 +-
>  3 files changed, 47 insertions(+), 36 deletions(-)
Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH 3/3] target/arm: fix decoding of B{, L}RA{A, B}

2019-01-25 Thread Richard Henderson
On 1/25/19 1:49 PM, Rémi Denis-Courmont wrote:
> From: Remi Denis-Courmont 
> 
> A flawed test lead to the instructions always being treated as
> unallocated encodings.
> 
> Signed-off-by: Remi Denis-Courmont 
> ---
>  target/arm/translate-a64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH 2/3] target/arm: actually enable PAuth in user mode

2019-01-25 Thread Richard Henderson
On 1/25/19 1:49 PM, Rémi Denis-Courmont wrote:
> From: Remi Denis-Courmont 
> 
> This always enables IA, IB, DA and DB keys in user mode on the maximum
> CPU, in a manner that is consistent with the other CPUs. That is to say
> redefining the reset value of SCTLR_ELx registers.
> 
> Without this patch, the PAC* and AUT* instructions have no effects
> (except PACGA of course).
> 
> Signed-off-by: Remi Denis-Courmont 
> ---
>  target/arm/cpu64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index e9bc461c36..148c103ca4 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -413,8 +413,8 @@ static void aarch64_max_initfn(Object *obj)
>  (void *)&apdb_bit, &error_fatal);
>  
>  /* Enable all PAC keys by default.  */
> -cpu->env.cp15.sctlr_el[1] |= SCTLR_EnIA | SCTLR_EnIB;
> -cpu->env.cp15.sctlr_el[1] |= SCTLR_EnDA | SCTLR_EnDB;
> +cpu->reset_sctlr |= SCTLR_EnIA | SCTLR_EnIB;
> +cpu->reset_sctlr |= SCTLR_EnDA | SCTLR_EnDB;

I just sent another patch for this:
http://lists.nongnu.org/archive/html/qemu-devel/2019-01/msg06737.html

This way is valid as well, but would also need to adjust the property callbacks
to modify reset_sctlr as well.

Peter, do you have a preference?


r~



Re: [Qemu-devel] [PATCH 1/3] target/arm: fix AArch64 virtual address space size

2019-01-25 Thread Richard Henderson
On 1/25/19 1:49 PM, Rémi Denis-Courmont wrote:
> From: Remi Denis-Courmont 
> 
> Since QEMU does not support the ARMv8.2-LVA, Large Virtual Address,
> extension (yet), the VA address space is signed 48-bits. User mode can
> only handle the positive half of the address space, so that makes a
> limit of 47 bits.
> 
> (With LVA, it would be 52 and 51 bits respectively.)
> 
> The incorrectly large address space conflicts with PAuth instructions,
> which bits 48-54 and 56-63 for the pointer authentication code. This
> also conflicts with (as yet unsupported by QEMU) data tagging and with
> the ARMv8.5-MTE extension.
> 
> Signed-off-by: Remi Denis-Courmont 
> ---
>  target/arm/cpu.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index ff81db420d..2ccd04b8f7 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -2503,7 +2503,7 @@ bool write_cpustate_to_list(ARMCPU *cpu);
>  
>  #if defined(TARGET_AARCH64)
>  #  define TARGET_PHYS_ADDR_SPACE_BITS 48
> -#  define TARGET_VIRT_ADDR_SPACE_BITS 64
> +#  define TARGET_VIRT_ADDR_SPACE_BITS 47
>  #else
>  #  define TARGET_PHYS_ADDR_SPACE_BITS 40
>  #  define TARGET_VIRT_ADDR_SPACE_BITS 32
> 


This doesn't really conflict, as this doesn't affect much besides the sizing of
the PageDesc for page_find.

It's true adjusting this is worthwhile.  The current setting requires 7 levels
(6 * 10 bits + 1 * 4 bits), whereas a 48-bit address space only requires 5
levels (4 * 10 bits + 8 bits).  Even for LVA it will be (4 * 10 + 1 * 12 bits).

This will both save memory and reduce the time required for TranlationBlock
maintenance.

Your choice of 47 is wrong.  This value is also used for system mode, and the
kernel does use the negative half of the address space, so we need to have all
48 bits here.


r~



[Qemu-devel] [PATCH 0/6] target/arm: Complete ARMv8.3-PAuth linux-user

2019-01-25 Thread Richard Henderson
(1) Fix a bug I introduced at the last moment in the last
patch set -- enable pac keys during reset, not before.
(2) Add the HWCAP bits.
(3) Add the new prctl
(4) Add a smoke test so that (1) doesn't happen again.


r~


Richard Henderson (6):
  target/arm: Always enable pac keys for user-only
  aarch64-linux-user: Update HWCAP bits from linux 5.0-rc1
  aarch64-linux-user: Enable HWCAP bits for PAuth
  linux-user: Initialize aarch64 pac keys
  linux-user: Implement PR_PAC_RESET_KEYS
  tests/tcg/aarch64: Add pauth smoke tests

 linux-user/aarch64/target_syscall.h |  9 +
 linux-user/aarch64/cpu_loop.c   | 31 ++-
 linux-user/elfload.c| 10 +
 linux-user/syscall.c| 33 
 target/arm/cpu.c|  3 ++
 target/arm/cpu64.c  | 60 -
 tests/tcg/aarch64/pauth-1.c | 23 +++
 tests/tcg/aarch64/Makefile.target   |  7 +++-
 8 files changed, 113 insertions(+), 63 deletions(-)
 create mode 100644 tests/tcg/aarch64/pauth-1.c

-- 
2.17.2




[Qemu-devel] [PATCH 4/6] linux-user: Initialize aarch64 pac keys

2019-01-25 Thread Richard Henderson
Initialize the keys to a non-zero value on process start.

Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_syscall.h |  2 ++
 linux-user/aarch64/cpu_loop.c   | 31 +++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/linux-user/aarch64/target_syscall.h 
b/linux-user/aarch64/target_syscall.h
index 205265e619..937fd7989e 100644
--- a/linux-user/aarch64/target_syscall.h
+++ b/linux-user/aarch64/target_syscall.h
@@ -22,4 +22,6 @@ struct target_pt_regs {
 #define TARGET_PR_SVE_SET_VL  50
 #define TARGET_PR_SVE_GET_VL  51
 
+void arm_init_pauth_key(ARMPACKey *key);
+
 #endif /* AARCH64_TARGET_SYSCALL_H */
diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
index 65d815f030..d75fd9d3e2 100644
--- a/linux-user/aarch64/cpu_loop.c
+++ b/linux-user/aarch64/cpu_loop.c
@@ -147,10 +147,29 @@ void cpu_loop(CPUARMState *env)
 }
 }
 
+static uint64_t arm_rand64(void)
+{
+int shift = 64 - clz64(RAND_MAX);
+int i, n = 64 / shift + (64 % shift != 0);
+uint64_t ret = 0;
+
+for (i = 0; i < n; i++) {
+ret = (ret << shift) | rand();
+}
+return ret;
+}
+
+void arm_init_pauth_key(ARMPACKey *key)
+{
+key->lo = arm_rand64();
+key->hi = arm_rand64();
+}
+
 void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs)
 {
-CPUState *cpu = ENV_GET_CPU(env);
-TaskState *ts = cpu->opaque;
+ARMCPU *cpu = arm_env_get_cpu(env);
+CPUState *cs = CPU(cpu);
+TaskState *ts = cs->opaque;
 struct image_info *info = ts->info;
 int i;
 
@@ -172,6 +191,14 @@ void target_cpu_copy_regs(CPUArchState *env, struct 
target_pt_regs *regs)
 }
 #endif
 
+if (cpu_isar_feature(aa64_pauth, cpu)) {
+arm_init_pauth_key(&env->apia_key);
+arm_init_pauth_key(&env->apib_key);
+arm_init_pauth_key(&env->apda_key);
+arm_init_pauth_key(&env->apdb_key);
+arm_init_pauth_key(&env->apga_key);
+}
+
 ts->stack_base = info->start_stack;
 ts->heap_base = info->brk;
 /* This will be filled in on the first SYS_HEAPINFO call.  */
-- 
2.17.2




[Qemu-devel] [PATCH 1/6] target/arm: Always enable pac keys for user-only

2019-01-25 Thread Richard Henderson
Drop the pac properties.  This approach cannot work as written
because the properties are applied before arm_cpu_reset, which
zeros SCTLR_EL1 (amongst everything else).

We can re-introduce the properties if they turn out to be useful.
But since linux 5.0 enables all of the keys, they may not be.

Fixes: 1ae9cfbd470
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.c   |  3 +++
 target/arm/cpu64.c | 60 --
 2 files changed, 3 insertions(+), 60 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7e1f3dd637..1e79b97a9c 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -162,6 +162,9 @@ static void arm_cpu_reset(CPUState *s)
 env->pstate = PSTATE_MODE_EL0t;
 /* Userspace expects access to DC ZVA, CTL_EL0 and the cache ops */
 env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
+/* Enable all PAC keys.  */
+env->cp15.sctlr_el[1] |= (SCTLR_EnIA | SCTLR_EnIB |
+  SCTLR_EnDA | SCTLR_EnDB);
 /* Enable all PAC instructions */
 env->cp15.hcr_el2 |= HCR_API;
 env->cp15.scr_el3 |= SCR_API;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index e9bc461c36..303d0ef075 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -281,38 +281,6 @@ static void cpu_max_set_sve_vq(Object *obj, Visitor *v, 
const char *name,
 error_propagate(errp, err);
 }
 
-#ifdef CONFIG_USER_ONLY
-static void cpu_max_get_packey(Object *obj, Visitor *v, const char *name,
-   void *opaque, Error **errp)
-{
-ARMCPU *cpu = ARM_CPU(obj);
-const uint64_t *bit = opaque;
-bool enabled = (cpu->env.cp15.sctlr_el[1] & *bit) != 0;
-
-visit_type_bool(v, name, &enabled, errp);
-}
-
-static void cpu_max_set_packey(Object *obj, Visitor *v, const char *name,
-   void *opaque, Error **errp)
-{
-ARMCPU *cpu = ARM_CPU(obj);
-Error *err = NULL;
-const uint64_t *bit = opaque;
-bool enabled;
-
-visit_type_bool(v, name, &enabled, errp);
-
-if (!err) {
-if (enabled) {
-cpu->env.cp15.sctlr_el[1] |= *bit;
-} else {
-cpu->env.cp15.sctlr_el[1] &= ~*bit;
-}
-}
-error_propagate(errp, err);
-}
-#endif
-
 /* -cpu max: if KVM is enabled, like -cpu host (best possible with this host);
  * otherwise, a CPU with as many features enabled as our emulation supports.
  * The version of '-cpu max' for qemu-system-arm is defined in cpu.c;
@@ -388,34 +356,6 @@ static void aarch64_max_initfn(Object *obj)
  */
 cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache 
*/
 cpu->dcz_blocksize = 7; /*  512 bytes */
-
-/*
- * Note that Linux will enable enable all of the keys at once.
- * But doing it this way will allow experimentation beyond that.
- */
-{
-static const uint64_t apia_bit = SCTLR_EnIA;
-static const uint64_t apib_bit = SCTLR_EnIB;
-static const uint64_t apda_bit = SCTLR_EnDA;
-static const uint64_t apdb_bit = SCTLR_EnDB;
-
-object_property_add(obj, "apia", "bool", cpu_max_get_packey,
-cpu_max_set_packey, NULL,
-(void *)&apia_bit, &error_fatal);
-object_property_add(obj, "apib", "bool", cpu_max_get_packey,
-cpu_max_set_packey, NULL,
-(void *)&apib_bit, &error_fatal);
-object_property_add(obj, "apda", "bool", cpu_max_get_packey,
-cpu_max_set_packey, NULL,
-(void *)&apda_bit, &error_fatal);
-object_property_add(obj, "apdb", "bool", cpu_max_get_packey,
-cpu_max_set_packey, NULL,
-(void *)&apdb_bit, &error_fatal);
-
-/* Enable all PAC keys by default.  */
-cpu->env.cp15.sctlr_el[1] |= SCTLR_EnIA | SCTLR_EnIB;
-cpu->env.cp15.sctlr_el[1] |= SCTLR_EnDA | SCTLR_EnDB;
-}
 #endif
 
 cpu->sve_max_vq = ARM_MAX_VQ;
-- 
2.17.2




[Qemu-devel] [PATCH 3/6] aarch64-linux-user: Enable HWCAP bits for PAuth

2019-01-25 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 3c7a7c2836..775a36ccdd 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -600,6 +600,7 @@ static uint32_t get_elf_hwcap(void)
 GET_FEATURE_ID(aa64_dp, ARM_HWCAP_A64_ASIMDDP);
 GET_FEATURE_ID(aa64_fcma, ARM_HWCAP_A64_FCMA);
 GET_FEATURE_ID(aa64_sve, ARM_HWCAP_A64_SVE);
+GET_FEATURE_ID(aa64_pauth, ARM_HWCAP_A64_PACA | ARM_HWCAP_A64_PACG);
 
 #undef GET_FEATURE_ID
 
-- 
2.17.2




[Qemu-devel] [PATCH] target/arm: Fix validation of 32-bit address spaces for aa32

2019-01-25 Thread Richard Henderson
When tsz == 0, aarch32 selects the address space via exclusion,
and there are no "top_bits" remaining that require validation.

Fixes: ba97be9f4a4
Reported-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 92666e5208..e24689f767 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10447,7 +10447,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 uint64_t ttbr;
 hwaddr descaddr, indexmask, indexmask_grainsize;
 uint32_t tableattrs;
-target_ulong page_size, top_bits;
+target_ulong page_size;
 uint32_t attrs;
 int32_t stride;
 int addrsize, inputsize;
@@ -10487,12 +10487,19 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
  * We determined the region when collecting the parameters, but we
  * have not yet validated that the address is valid for the region.
  * Extract the top bits and verify that they all match select.
+ *
+ * For aa32, if inputsize == addrsize, then we have selected the
+ * region by exclusion in aa32_va_parameters and there is no more
+ * validation to do here.
  */
-top_bits = sextract64(address, inputsize, addrsize - inputsize);
-if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
-/* In the gap between the two regions, this is a Translation fault */
-fault_type = ARMFault_Translation;
-goto do_fault;
+if (inputsize < addrsize) {
+target_ulong top_bits = sextract64(address, inputsize,
+   addrsize - inputsize);
+if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
+/* The gap between the two regions is a Translation fault */
+fault_type = ARMFault_Translation;
+goto do_fault;
+}
 }
 
 if (param.using64k) {
-- 
2.17.1




[Qemu-devel] [PATCH 2/6] aarch64-linux-user: Update HWCAP bits from linux 5.0-rc1

2019-01-25 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 4cff9e1a31..3c7a7c2836 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -560,6 +560,15 @@ enum {
 ARM_HWCAP_A64_ASIMDDP   = 1 << 20,
 ARM_HWCAP_A64_SHA512= 1 << 21,
 ARM_HWCAP_A64_SVE   = 1 << 22,
+ARM_HWCAP_A64_ASIMDFHM  = 1 << 23,
+ARM_HWCAP_A64_DIT   = 1 << 24,
+ARM_HWCAP_A64_USCAT = 1 << 25,
+ARM_HWCAP_A64_ILRCPC= 1 << 26,
+ARM_HWCAP_A64_FLAGM = 1 << 27,
+ARM_HWCAP_A64_SSBS  = 1 << 28,
+ARM_HWCAP_A64_SB= 1 << 29,
+ARM_HWCAP_A64_PACA  = 1 << 30,
+ARM_HWCAP_A64_PACG  = 1UL << 31,
 };
 
 #define ELF_HWCAP get_elf_hwcap()
-- 
2.17.2




[Qemu-devel] [PATCH 5/6] linux-user: Implement PR_PAC_RESET_KEYS

2019-01-25 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_syscall.h |  7 ++
 linux-user/syscall.c| 33 +
 2 files changed, 40 insertions(+)

diff --git a/linux-user/aarch64/target_syscall.h 
b/linux-user/aarch64/target_syscall.h
index 937fd7989e..b595e5da82 100644
--- a/linux-user/aarch64/target_syscall.h
+++ b/linux-user/aarch64/target_syscall.h
@@ -22,6 +22,13 @@ struct target_pt_regs {
 #define TARGET_PR_SVE_SET_VL  50
 #define TARGET_PR_SVE_GET_VL  51
 
+#define TARGET_PR_PAC_RESET_KEYS 54
+# define TARGET_PR_PAC_APIAKEY   (1 << 0)
+# define TARGET_PR_PAC_APIBKEY   (1 << 1)
+# define TARGET_PR_PAC_APDAKEY   (1 << 2)
+# define TARGET_PR_PAC_APDBKEY   (1 << 3)
+# define TARGET_PR_PAC_APGAKEY   (1 << 4)
+
 void arm_init_pauth_key(ARMPACKey *key);
 
 #endif /* AARCH64_TARGET_SYSCALL_H */
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b5786d4fc1..3e2949aa2f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -9691,6 +9691,39 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
 }
 }
 return ret;
+case TARGET_PR_PAC_RESET_KEYS:
+{
+CPUARMState *env = cpu_env;
+ARMCPU *cpu = arm_env_get_cpu(env);
+
+if (cpu_isar_feature(aa64_pauth, cpu)) {
+int all = (TARGET_PR_PAC_APIAKEY | TARGET_PR_PAC_APIBKEY |
+   TARGET_PR_PAC_APDAKEY | TARGET_PR_PAC_APDBKEY |
+   TARGET_PR_PAC_APGAKEY);
+if (arg2 == 0) {
+arg2 = all;
+} else if (arg2 & ~all) {
+return -TARGET_EINVAL;
+}
+if (arg2 & TARGET_PR_PAC_APIAKEY) {
+arm_init_pauth_key(&env->apia_key);
+}
+if (arg2 & TARGET_PR_PAC_APIBKEY) {
+arm_init_pauth_key(&env->apib_key);
+}
+if (arg2 & TARGET_PR_PAC_APDAKEY) {
+arm_init_pauth_key(&env->apda_key);
+}
+if (arg2 & TARGET_PR_PAC_APDBKEY) {
+arm_init_pauth_key(&env->apdb_key);
+}
+if (arg2 & TARGET_PR_PAC_APGAKEY) {
+arm_init_pauth_key(&env->apga_key);
+}
+return 0;
+}
+}
+return -TARGET_EINVAL;
 #endif /* AARCH64 */
 case PR_GET_SECCOMP:
 case PR_SET_SECCOMP:
-- 
2.17.2




[Qemu-devel] [PATCH 6/6] tests/tcg/aarch64: Add pauth smoke tests

2019-01-25 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tests/tcg/aarch64/pauth-1.c   | 23 +++
 tests/tcg/aarch64/Makefile.target |  7 ++-
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/aarch64/pauth-1.c

diff --git a/tests/tcg/aarch64/pauth-1.c b/tests/tcg/aarch64/pauth-1.c
new file mode 100644
index 00..9bd8d28ede
--- /dev/null
+++ b/tests/tcg/aarch64/pauth-1.c
@@ -0,0 +1,23 @@
+#include 
+#include 
+
+asm(".arch armv8.4-a");
+
+#ifndef PR_PAC_RESET_KEYS
+#define PR_PAC_RESET_KEYS  54
+#define PR_PAC_APDAKEY (1 << 2)
+#endif
+
+int main()
+{
+int x;
+void *p0 = &x, *p1, *p2;
+
+asm volatile("pacdza %0" : "=r"(p1) : "0"(p0));
+prctl(PR_PAC_RESET_KEYS, PR_PAC_APDAKEY);
+asm volatile("pacdza %0" : "=r"(p2) : "0"(p0));
+
+assert(p1 != p0);
+assert(p1 != p2);
+return 0;
+}
diff --git a/tests/tcg/aarch64/Makefile.target 
b/tests/tcg/aarch64/Makefile.target
index 08c45b8470..e80d07276c 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -8,10 +8,15 @@ VPATH += $(AARCH64_SRC)
 # we don't build any of the ARM tests
 AARCH64_TESTS=$(filter-out $(ARM_TESTS), $(TESTS))
 AARCH64_TESTS+=fcvt
-TESTS:=$(AARCH64_TESTS)
 
 fcvt: LDFLAGS+=-lm
 
 run-fcvt: fcvt
$(call run-test,$<,$(QEMU) $<, "$< on $(TARGET_NAME)")
$(call diff-out,$<,$(AARCH64_SRC)/fcvt.ref)
+
+AARCH64_TESTS += pauth-1
+pauth-%: CFLAGS += -O -g
+run-pauth-%: QEMU += -cpu max
+
+TESTS:=$(AARCH64_TESTS)
-- 
2.17.2




Re: [Qemu-devel] [PATCH RFC 8/9] tests: Add OpenBSD image

2019-01-25 Thread Philippe Mathieu-Daudé
On 1/25/19 7:38 PM, Peter Maydell wrote:
> On Fri, 25 Jan 2019 at 18:36, Brad Smith  wrote:
>>
>> On 1/25/2019 1:24 AM, Thomas Huth wrote:
>>
>>> On 2019-01-25 01:48, Brad Smith wrote:
 Our ports tree has an option which results in the QEMU binaries being
 linked with "-z wxneeded".
>>> Then it's maybe high time to send such changes upstream now ;-)
>>
>>
>> I have not considered submitting such a patch as it's just a workaround
>> for an
>> issue within QEMU. Everything else that we had as local patches or local
>> build
>> fiddling to build things properly has been integrated in some manner.
> 
> We'll happily take a patch that fixes the underlying issue properly
> if you'd prefer...

I sent "configure: Disable W^X on OpenBSD" to help upstream developers
to test their patches on OpenBSD here:
https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06674.html

Regards,

Phil.



Re: [Qemu-devel] [PATCH] MAINTAINERS: Remove Michael Clark as a RISC-V Maintainer

2019-01-25 Thread Philippe Mathieu-Daudé
On 1/25/19 11:21 PM, Palmer Dabbelt wrote:
> Michael is no longer employed by SiFive and does not want to continue
> maintianing the RISC-V port.

"maintaining"

> 
> Signed-off-by: Palmer Dabbelt 
> ---
>  MAINTAINERS | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index af339b86db76..47cb3c14298e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -259,7 +259,6 @@ F: include/hw/ppc/
>  F: disas/ppc.c
>  
>  RISC-V
> -M: Michael Clark 
>  M: Palmer Dabbelt 
>  M: Alistair Francis 
>  M: Sagar Karandikar 
> 



Re: [Qemu-devel] [PATCH v3 2/4] qom/cpu: Add cluster_index to CPUState

2019-01-25 Thread Alistair

On 1/21/19 7:22 AM, Peter Maydell wrote:

For TCG we want to distinguish which cluster a CPU is in, and
we need to do it quickly. Cache the cluster index in the CPUState
struct, by having the cluster object set cpu->cluster_index for
each CPU child when it is realized.

This means that board/SoC code must add all CPUs to the cluster
before realizing the cluster object. Regrettably QOM provides no
way to prevent adding children to a realized object and no way for
the parent to be notified when a new child is added to it, so
we don't have any way to enforce/assert this constraint; all
we can do is document it in a comment. We can at least put in a
check that the cluster contains at least one CPU, which should
catch the typical cases of "realized cluster too early" or
"forgot to parent the CPUs into it".

The restriction on how many clusters can exist in the system
is imposed by TCG code which will be added in a subsequent commit,
but the check to enforce it in cluster.c fits better in this one.

Signed-off-by: Peter Maydell 


Reviewed-by: Alistair Francis 

Alistair


---
Changes v2->v3:
  * allow CPU objects to be indirect children of the cluster;
this is useful for ARMv7M, where the CPU object is a child
of the armv7m container and the board code that sets up
the cluster object only has the armv7m container object:
this is done by using object_child_foreach_recursive()
rather than an open-coded child iteration
  * add an assertion that the cluster has at least one CPU,
which catches the easiest-to-make errors when creating
and populating the cluster
---
  include/hw/cpu/cluster.h | 24 +
  include/qom/cpu.h|  7 ++
  hw/cpu/cluster.c | 46 
  qom/cpu.c|  1 +
  4 files changed, 78 insertions(+)

diff --git a/include/hw/cpu/cluster.h b/include/hw/cpu/cluster.h
index 73818232437..549c2d31d43 100644
--- a/include/hw/cpu/cluster.h
+++ b/include/hw/cpu/cluster.h
@@ -34,12 +34,36 @@
   * Arm big.LITTLE system) they should be in different clusters. If the CPUs do
   * not have the same view of memory (for example the main CPU and a management
   * controller processor) they should be in different clusters.
+ *
+ * A cluster is created by creating an object of TYPE_CPU_CLUSTER, and then
+ * adding the CPUs to it as QOM child objects (e.g. using the
+ * object_initialize_child() or object_property_add_child() functions).
+ * The CPUs may be either direct children of the cluster object, or indirect
+ * children (e.g. children of children of the cluster object).
+ *
+ * All CPUs must be added as children before the cluster is realized.
+ * (Regrettably QOM provides no way to prevent adding children to a realized
+ * object and no way for the parent to be notified when a new child is added
+ * to it, so this restriction is not checked for, but the system will not
+ * behave correctly if it is not adhered to. The cluster will assert that
+ * it contains at least one CPU, which should catch most inadvertent
+ * violations of this constraint.)
+ *
+ * A CPU which is not put into any cluster will be considered implicitly
+ * to be in a cluster with all the other "loose" CPUs, so all CPUs that are
+ * not assigned to clusters must be identical.
   */
  
  #define TYPE_CPU_CLUSTER "cpu-cluster"

  #define CPU_CLUSTER(obj) \
  OBJECT_CHECK(CPUClusterState, (obj), TYPE_CPU_CLUSTER)
  
+/*

+ * This limit is imposed by TCG, which puts the cluster ID into an
+ * 8 bit field (and uses all-1s for the default "not in any cluster").
+ */
+#define MAX_CLUSTERS 255
+
  /**
   * CPUClusterState:
   * @cluster_id: The cluster ID. This value is for internal use only and should
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 16bbed1ae09..4c2feb9c17b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -280,6 +280,11 @@ struct qemu_work_item;
  /**
   * CPUState:
   * @cpu_index: CPU index (informative).
+ * @cluster_index: Identifies which cluster this CPU is in.
+ *   For boards which don't define clusters or for "loose" CPUs not assigned
+ *   to a cluster this will be UNASSIGNED_CLUSTER_INDEX; otherwise it will
+ *   be the same as the cluster-id property of the CPU object's 
TYPE_CPU_CLUSTER
+ *   QOM parent.
   * @nr_cores: Number of cores within this CPU package.
   * @nr_threads: Number of threads within this CPU.
   * @running: #true if CPU is currently running (lockless).
@@ -405,6 +410,7 @@ struct CPUState {
  
  /* TODO Move common fields from CPUArchState here. */

  int cpu_index;
+int cluster_index;
  uint32_t halted;
  uint32_t can_do_io;
  int32_t exception_index;
@@ -,5 +1117,6 @@ extern const struct VMStateDescription vmstate_cpu_common;
  #endif /* NEED_CPU_H */
  
  #define UNASSIGNED_CPU_INDEX -1

+#define UNASSIGNED_CLUSTER_INDEX -1
  
  #endif

diff --git a/hw/cpu/cluster.c b/hw/cpu/cluster.c
index 9d50a235d5c..25f90702b16 100644
--- a/hw/cpu/cluster

Re: [Qemu-devel] [PATCH v3 3/4] accel/tcg: Add cluster number to TCG TB hash

2019-01-25 Thread Alistair

On 1/21/19 7:22 AM, Peter Maydell wrote:

Include the cluster number in the hash we use to look
up TBs. This is important because a TB that is valid
for one cluster at a given physical address and set
of CPU flags is not necessarily valid for another:
the two clusters may have different views of physical
memory, or may have different CPU features (eg FPU
present or absent).

We put the cluster number in the high 8 bits of the
TB cflags. This gives us up to 256 clusters, which should
be enough for anybody. If we ever need more, or need
more bits in cflags for other purposes, we could make
tb_hash_func() take more data (and expand qemu_xxhash7()
to qemu_xxhash8()).

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 


Reviewed-by: Alistair Francis 

Alistair


---
v1->v2: move the setting of the cluster index in
cf_mask in tb_htable_lookup() up to before we
set desc.cf_mask from it...
---
  include/exec/exec-all.h   | 4 +++-
  accel/tcg/cpu-exec.c  | 3 +++
  accel/tcg/translate-all.c | 3 +++
  3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 815e5b1e838..aa7b81aaf01 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -351,9 +351,11 @@ struct TranslationBlock {
  #define CF_USE_ICOUNT  0x0002
  #define CF_INVALID 0x0004 /* TB is stale. Set with @jmp_lock held */
  #define CF_PARALLEL0x0008 /* Generate code for a parallel context */
+#define CF_CLUSTER_MASK 0xff00 /* Top 8 bits are cluster ID */
+#define CF_CLUSTER_SHIFT 24
  /* cflags' mask for hashing/comparison */
  #define CF_HASH_MASK   \
-(CF_COUNT_MASK | CF_LAST_IO | CF_USE_ICOUNT | CF_PARALLEL)
+(CF_COUNT_MASK | CF_LAST_IO | CF_USE_ICOUNT | CF_PARALLEL | 
CF_CLUSTER_MASK)
  
  /* Per-vCPU dynamic tracing state used to generate this TB */

  uint32_t trace_vcpu_dstate;
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 870027d4359..6c4a33262f5 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -325,6 +325,9 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu, 
target_ulong pc,
  struct tb_desc desc;
  uint32_t h;
  
+cf_mask &= ~CF_CLUSTER_MASK;

+cf_mask |= cpu->cluster_index << CF_CLUSTER_SHIFT;
+
  desc.env = (CPUArchState *)cpu->env_ptr;
  desc.cs_base = cs_base;
  desc.flags = flags;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 8cb8c8870e6..7364e8a071f 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1688,6 +1688,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
  cflags |= CF_NOCACHE | 1;
  }
  
+cflags &= ~CF_CLUSTER_MASK;

+cflags |= cpu->cluster_index << CF_CLUSTER_SHIFT;
+
   buffer_overflow:
  tb = tb_alloc(pc);
  if (unlikely(!tb)) {





Re: [Qemu-devel] [PATCH v6 30/35] target/riscv: Remove decode_RV32_64G()

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

decodetree handles all instructions now so the fallback is not necessary
anymore.

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 
Signed-off-by: Peer Adelt 


Reviewed-by: Alistair Francis 

Alistair


---
  target/riscv/translate.c | 23 +--
  1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 0e37beb68e..b0251b3518 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -600,26 +600,6 @@ bool decode_insn16(DisasContext *ctx, uint16_t insn);
  #include "decode_insn16.inc.c"
  #include "insn_trans/trans_rvc.inc.c"
  
-static void decode_RV32_64G(CPURISCVState *env, DisasContext *ctx)

-{
-uint32_t op;
-
-/* We do not do misaligned address check here: the address should never be
- * misaligned at this point. Instructions that set PC must do the check,
- * since epc must be the address of the instruction that caused us to
- * perform the misaligned instruction fetch */
-
-op = MASK_OP_MAJOR(ctx->opcode);
-
-switch (op) {
-case OPC_RISC_SYSTEM:
-break;
-default:
-gen_exception_illegal(ctx);
-break;
-}
-}
-
  static void decode_opc(DisasContext *ctx)
  {
  /* check for compressed insn */
@@ -636,8 +616,7 @@ static void decode_opc(DisasContext *ctx)
  } else {
  ctx->pc_succ_insn = ctx->base.pc_next + 4;
  if (!decode_insn32(ctx, ctx->opcode)) {
-/* fallback to old decoder */
-decode_RV32_64G(ctx->env, ctx);
+gen_exception_illegal(ctx);
  }
  }
  }





Re: [Qemu-devel] [PATCH v6 28/35] target/riscv: Rename trans_arith to gen_arith

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 


Reviewed-by: Alistair Francis 

Alistair


---
  target/riscv/insn_trans/trans_rvi.inc.c | 18 +-
  target/riscv/insn_trans/trans_rvm.inc.c | 14 +++---
  target/riscv/translate.c|  4 ++--
  3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.inc.c 
b/target/riscv/insn_trans/trans_rvi.inc.c
index eac79f076f..904ae44968 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -307,12 +307,12 @@ static bool trans_srai(DisasContext *ctx, arg_srai *a)
  
  static bool trans_add(DisasContext *ctx, arg_add *a)

  {
-return trans_arith(ctx, a, &tcg_gen_add_tl);
+return gen_arith(ctx, a, &tcg_gen_add_tl);
  }
  
  static bool trans_sub(DisasContext *ctx, arg_sub *a)

  {
-return trans_arith(ctx, a, &tcg_gen_sub_tl);
+return gen_arith(ctx, a, &tcg_gen_sub_tl);
  }
  
  static bool trans_sll(DisasContext *ctx, arg_sll *a)

@@ -322,17 +322,17 @@ static bool trans_sll(DisasContext *ctx, arg_sll *a)
  
  static bool trans_slt(DisasContext *ctx, arg_slt *a)

  {
-return trans_arith(ctx, a, &gen_slt);
+return gen_arith(ctx, a, &gen_slt);
  }
  
  static bool trans_sltu(DisasContext *ctx, arg_sltu *a)

  {
-return trans_arith(ctx, a, &gen_sltu);
+return gen_arith(ctx, a, &gen_sltu);
  }
  
  static bool trans_xor(DisasContext *ctx, arg_xor *a)

  {
-return trans_arith(ctx, a, &tcg_gen_xor_tl);
+return gen_arith(ctx, a, &tcg_gen_xor_tl);
  }
  
  static bool trans_srl(DisasContext *ctx, arg_srl *a)

@@ -347,12 +347,12 @@ static bool trans_sra(DisasContext *ctx, arg_sra *a)
  
  static bool trans_or(DisasContext *ctx, arg_or *a)

  {
-return trans_arith(ctx, a, &tcg_gen_or_tl);
+return gen_arith(ctx, a, &tcg_gen_or_tl);
  }
  
  static bool trans_and(DisasContext *ctx, arg_and *a)

  {
-return trans_arith(ctx, a, &tcg_gen_and_tl);
+return gen_arith(ctx, a, &tcg_gen_and_tl);
  }
  
  #ifdef TARGET_RISCV64

@@ -399,12 +399,12 @@ static bool trans_sraiw(DisasContext *ctx, arg_sraiw *a)
  
  static bool trans_addw(DisasContext *ctx, arg_addw *a)

  {
-return trans_arith(ctx, a, &gen_addw);
+return gen_arith(ctx, a, &gen_addw);
  }
  
  static bool trans_subw(DisasContext *ctx, arg_subw *a)

  {
-return trans_arith(ctx, a, &gen_subw);
+return gen_arith(ctx, a, &gen_subw);
  }
  
  static bool trans_sllw(DisasContext *ctx, arg_sllw *a)

diff --git a/target/riscv/insn_trans/trans_rvm.inc.c 
b/target/riscv/insn_trans/trans_rvm.inc.c
index 949f59ddb2..5844d6f5be 100644
--- a/target/riscv/insn_trans/trans_rvm.inc.c
+++ b/target/riscv/insn_trans/trans_rvm.inc.c
@@ -21,7 +21,7 @@
  
  static bool trans_mul(DisasContext *ctx, arg_mul *a)

  {
-return trans_arith(ctx, a, &tcg_gen_mul_tl);
+return gen_arith(ctx, a, &tcg_gen_mul_tl);
  }
  
  static bool trans_mulh(DisasContext *ctx, arg_mulh *a)

@@ -41,7 +41,7 @@ static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
  
  static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)

  {
-return trans_arith(ctx, a, &gen_mulhsu);
+return gen_arith(ctx, a, &gen_mulhsu);
  }
  
  static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)

@@ -61,28 +61,28 @@ static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
  
  static bool trans_div(DisasContext *ctx, arg_div *a)

  {
-return trans_arith(ctx, a, &gen_div);
+return gen_arith(ctx, a, &gen_div);
  }
  
  static bool trans_divu(DisasContext *ctx, arg_divu *a)

  {
-return trans_arith(ctx, a, &gen_divu);
+return gen_arith(ctx, a, &gen_divu);
  }
  
  static bool trans_rem(DisasContext *ctx, arg_rem *a)

  {
-return trans_arith(ctx, a, &gen_rem);
+return gen_arith(ctx, a, &gen_rem);
  }
  
  static bool trans_remu(DisasContext *ctx, arg_remu *a)

  {
-return trans_arith(ctx, a, &gen_remu);
+return gen_arith(ctx, a, &gen_remu);
  }
  
  #ifdef TARGET_RISCV64

  static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
  {
-return trans_arith(ctx, a, &gen_mulw);
+return gen_arith(ctx, a, &gen_mulw);
  }
  
  static bool trans_divw(DisasContext *ctx, arg_divw *a)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 6a722a0045..d0b0fca12b 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -577,8 +577,8 @@ static bool gen_arith_div_w(DisasContext *ctx, arg_r *a,
  
  #endif
  
-static bool trans_arith(DisasContext *ctx, arg_r *a,

-void(*func)(TCGv, TCGv, TCGv))
+static bool gen_arith(DisasContext *ctx, arg_r *a,
+  void(*func)(TCGv, TCGv, TCGv))
  {
  TCGv source1, source2;
  source1 = tcg_temp_new();





Re: [Qemu-devel] [PATCH v6 24/35] target/riscv: Move gen_arith_imm() decoding into trans_* functions

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

gen_arith_imm() does a lot of decoding manually, which was hard to read
in case of the shift instructions and is not necessary anymore with
decodetree.

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 
Signed-off-by: Peer Adelt 


Reviewed-by: Alistair Francis 

Alistair


---
v4 -> v5:
 - fixed funky indentation

  target/riscv/insn32.decode  |   3 +-
  target/riscv/insn_trans/trans_rvi.inc.c |  98 +-
  target/riscv/translate.c| 107 ++--
  3 files changed, 108 insertions(+), 100 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index ecc46a50cc..d6b4197841 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -35,12 +35,13 @@
  
  # Argument sets:

  &bimm rs2 rs1
+&iimm rs1 rd
  &shift shamt rs1 rd
  &atomicaq rl rs2 rs1 rd
  
  # Formats 32:

  @r   ...   . . ... . ...   %rs2 %rs1 
%rd
-@i   . ... . ... imm=%imm_i %rs1 
%rd
+@i   . ... . ... &i  imm=%imm_i %rs1 
%rd
  @b   ...   . . ... . ... &b  imm=%imm_b %rs2 %rs1
  @s   ...   . . ... . ... imm=%imm_s %rs2 %rs1
  @u     . ... imm=%imm_u  
%rd
diff --git a/target/riscv/insn_trans/trans_rvi.inc.c 
b/target/riscv/insn_trans/trans_rvi.inc.c
index da843b4e99..4e51490f94 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -217,52 +217,96 @@ static bool trans_sd(DisasContext *ctx, arg_sd *a)
  
  static bool trans_addi(DisasContext *ctx, arg_addi *a)

  {
-gen_arith_imm(ctx, OPC_RISC_ADDI, a->rd, a->rs1, a->imm);
-return true;
+return gen_arith_imm(ctx, a, &tcg_gen_add_tl);
  }
  
  static bool trans_slti(DisasContext *ctx, arg_slti *a)

  {
-gen_arith_imm(ctx, OPC_RISC_SLTI, a->rd, a->rs1, a->imm);
+TCGv source1;
+source1 = tcg_temp_new();
+gen_get_gpr(source1, a->rs1);
+
+tcg_gen_setcondi_tl(TCG_COND_LT, source1, source1, a->imm);
+
+gen_set_gpr(a->rd, source1);
+tcg_temp_free(source1);
  return true;
  }
  
  static bool trans_sltiu(DisasContext *ctx, arg_sltiu *a)

  {
-gen_arith_imm(ctx, OPC_RISC_SLTIU, a->rd, a->rs1, a->imm);
+TCGv source1;
+source1 = tcg_temp_new();
+gen_get_gpr(source1, a->rs1);
+
+tcg_gen_setcondi_tl(TCG_COND_LTU, source1, source1, a->imm);
+
+gen_set_gpr(a->rd, source1);
+tcg_temp_free(source1);
  return true;
  }
  
  static bool trans_xori(DisasContext *ctx, arg_xori *a)

  {
-gen_arith_imm(ctx, OPC_RISC_XORI, a->rd, a->rs1, a->imm);
-return true;
+return gen_arith_imm(ctx, a, &tcg_gen_xor_tl);
  }
  static bool trans_ori(DisasContext *ctx, arg_ori *a)
  {
-gen_arith_imm(ctx, OPC_RISC_ORI, a->rd, a->rs1, a->imm);
-return true;
+return gen_arith_imm(ctx, a, &tcg_gen_or_tl);
  }
  static bool trans_andi(DisasContext *ctx, arg_andi *a)
  {
-gen_arith_imm(ctx, OPC_RISC_ANDI, a->rd, a->rs1, a->imm);
-return true;
+return gen_arith_imm(ctx, a, &tcg_gen_and_tl);
  }
  static bool trans_slli(DisasContext *ctx, arg_slli *a)
  {
-gen_arith_imm(ctx, OPC_RISC_SLLI, a->rd, a->rs1, a->shamt);
+if (a->shamt >= TARGET_LONG_BITS) {
+return false;
+}
+
+if (a->rd != 0) {
+TCGv t = tcg_temp_new();
+gen_get_gpr(t, a->rs1);
+
+tcg_gen_shli_tl(t, t, a->shamt);
+
+gen_set_gpr(a->rd, t);
+tcg_temp_free(t);
+} /* NOP otherwise */
  return true;
  }
  
  static bool trans_srli(DisasContext *ctx, arg_srli *a)

  {
-gen_arith_imm(ctx, OPC_RISC_SHIFT_RIGHT_I, a->rd, a->rs1, a->shamt);
+if (a->shamt >= TARGET_LONG_BITS) {
+return false;
+}
+
+if (a->rd != 0) {
+TCGv t = tcg_temp_new();
+gen_get_gpr(t, a->rs1);
+
+tcg_gen_shri_tl(t, t, a->shamt);
+gen_set_gpr(a->rd, t);
+tcg_temp_free(t);
+} /* NOP otherwise */
  return true;
  }
  
  static bool trans_srai(DisasContext *ctx, arg_srai *a)

  {
-gen_arith_imm(ctx, OPC_RISC_SHIFT_RIGHT_I, a->rd, a->rs1, a->shamt | 
0x400);
+if (a->shamt >= TARGET_LONG_BITS) {
+return false;
+}
+
+if (a->rd != 0) {
+TCGv t = tcg_temp_new();
+gen_get_gpr(t, a->rs1);
+
+tcg_gen_sari_tl(t, t, a->shamt);
+gen_set_gpr(a->rd, t);
+tcg_temp_free(t);
+} /* NOP otherwise */
  return true;
  }
  
@@ -329,26 +373,42 @@ static bool trans_and(DisasContext *ctx, arg_and *a)

  #ifdef TARGET_RISCV64
  static bool trans_addiw(DisasContext *ctx, arg_addiw *a)
  {
-gen_arith_imm(ctx, OPC_RISC_ADDIW, a->rd, a->rs1, a->imm);
-return true;
+return gen_arith_imm(ctx, a, &gen_addw);
  }
  
  static bool trans_slliw(DisasContext *ctx

Re: [Qemu-devel] [PATCH v6 23/35] target/riscv: Remove manual decoding from gen_store()

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

With decodetree we don't need to convert RISC-V opcodes into to MemOps
as the old gen_store() did.

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 
Signed-off-by: Peer Adelt 


Reviewed-by: Alistair Francis 

Alistair


---
  target/riscv/insn_trans/trans_rvi.inc.c | 27 +
  target/riscv/translate.c|  8 +---
  2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.inc.c 
b/target/riscv/insn_trans/trans_rvi.inc.c
index 1ad00bd776..da843b4e99 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -168,22 +168,34 @@ static bool trans_lhu(DisasContext *ctx, arg_lhu *a)
  return gen_load(ctx, a, MO_TEUW);
  }
  
-static bool trans_sb(DisasContext *ctx, arg_sb *a)

+static bool gen_store(DisasContext *ctx, arg_sb *a, TCGMemOp memop)
  {
-gen_store(ctx, OPC_RISC_SB, a->rs1, a->rs2, a->imm);
+TCGv t0 = tcg_temp_new();
+TCGv dat = tcg_temp_new();
+gen_get_gpr(t0, a->rs1);
+tcg_gen_addi_tl(t0, t0, a->imm);
+gen_get_gpr(dat, a->rs2);
+
+tcg_gen_qemu_st_tl(dat, t0, ctx->mem_idx, memop);
+tcg_temp_free(t0);
+tcg_temp_free(dat);
  return true;
  }
  
+

+static bool trans_sb(DisasContext *ctx, arg_sb *a)
+{
+return gen_store(ctx, a, MO_SB);
+}
+
  static bool trans_sh(DisasContext *ctx, arg_sh *a)
  {
-gen_store(ctx, OPC_RISC_SH, a->rs1, a->rs2, a->imm);
-return true;
+return gen_store(ctx, a, MO_TESW);
  }
  
  static bool trans_sw(DisasContext *ctx, arg_sw *a)

  {
-gen_store(ctx, OPC_RISC_SW, a->rs1, a->rs2, a->imm);
-return true;
+return gen_store(ctx, a, MO_TESL);
  }
  
  #ifdef TARGET_RISCV64

@@ -199,8 +211,7 @@ static bool trans_ld(DisasContext *ctx, arg_ld *a)
  
  static bool trans_sd(DisasContext *ctx, arg_sd *a)

  {
-gen_store(ctx, OPC_RISC_SD, a->rs1, a->rs2, a->imm);
-return true;
+return gen_store(ctx, a, MO_TEQ);
  }
  #endif
  
diff --git a/target/riscv/translate.c b/target/riscv/translate.c

index d0fefa8fb9..59452be191 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -55,6 +55,7 @@ typedef struct DisasContext {
  CPURISCVState *env;
  } DisasContext;
  
+#ifdef TARGET_RISCV64

  /* convert riscv funct3 to qemu memop for load/store */
  static const int tcg_memop_lookup[8] = {
  [0 ... 7] = -1,
@@ -68,6 +69,7 @@ static const int tcg_memop_lookup[8] = {
  [6] = MO_TEUL,
  #endif
  };
+#endif
  
  #ifdef TARGET_RISCV64

  #define CASE_OP_32_64(X) case X: case glue(X, W)
@@ -509,9 +511,8 @@ static void gen_load_c(DisasContext *ctx, uint32_t opc, int 
rd, int rs1,
  tcg_temp_free(t0);
  tcg_temp_free(t1);
  }
-#endif
  
-static void gen_store(DisasContext *ctx, uint32_t opc, int rs1, int rs2,

+static void gen_store_c(DisasContext *ctx, uint32_t opc, int rs1, int rs2,
  target_long imm)
  {
  TCGv t0 = tcg_temp_new();
@@ -530,6 +531,7 @@ static void gen_store(DisasContext *ctx, uint32_t opc, int 
rs1, int rs2,
  tcg_temp_free(t0);
  tcg_temp_free(dat);
  }
+#endif
  
  #ifdef TARGET_RISCV32

  static void gen_fp_load(DisasContext *ctx, uint32_t opc, int rd,
@@ -653,7 +655,7 @@ static void decode_RV32_64C0(DisasContext *ctx)
  case 7:
  #if defined(TARGET_RISCV64)
  /* C.SD (RV64/128) -> sd rs2', offset[7:3](rs1')*/
-gen_store(ctx, OPC_RISC_SD, rs1s, rd_rs2,
+gen_store_c(ctx, OPC_RISC_SD, rs1s, rd_rs2,
GET_C_LD_IMM(ctx->opcode));
  #else
  /* C.FSW (RV32) -> fsw rs2', offset[6:2](rs1')*/





Re: [Qemu-devel] [PATCH v6 21/35] target/riscv: Remove manual decoding from gen_branch()

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

We now utilizes argument-sets of decodetree such that no manual
decoding is necessary.

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 
Signed-off-by: Peer Adelt 


Reviewed-by: Alistair Francis 

Alistair


---
  target/riscv/insn_trans/trans_rvi.inc.c | 46 +---
  target/riscv/translate.c| 47 -
  2 files changed, 33 insertions(+), 60 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.inc.c 
b/target/riscv/insn_trans/trans_rvi.inc.c
index 3b3aff4803..0db1f79d20 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -72,41 +72,61 @@ static bool trans_jalr(DisasContext *ctx, arg_jalr *a)
  return true;
  }
  
-static bool trans_beq(DisasContext *ctx, arg_beq *a)

+static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
  {
-gen_branch(ctx->env, ctx, OPC_RISC_BEQ, a->rs1, a->rs2, a->imm);
+TCGLabel *l = gen_new_label();
+TCGv source1, source2;
+source1 = tcg_temp_new();
+source2 = tcg_temp_new();
+gen_get_gpr(source1, a->rs1);
+gen_get_gpr(source2, a->rs2);
+
+tcg_gen_brcond_tl(cond, source1, source2, l);
+gen_goto_tb(ctx, 1, ctx->pc_succ_insn);
+gen_set_label(l); /* branch taken */
+
+if (!riscv_has_ext(ctx->env, RVC) && ((ctx->base.pc_next + a->imm) & 0x3)) 
{
+/* misaligned */
+gen_exception_inst_addr_mis(ctx);
+} else {
+gen_goto_tb(ctx, 0, ctx->base.pc_next + a->imm);
+}
+ctx->base.is_jmp = DISAS_NORETURN;
+
+tcg_temp_free(source1);
+tcg_temp_free(source2);
+
  return true;
  }
  
+static bool trans_beq(DisasContext *ctx, arg_beq *a)

+{
+return gen_branch(ctx, a, TCG_COND_EQ);
+}
+
  static bool trans_bne(DisasContext *ctx, arg_bne *a)
  {
-gen_branch(ctx->env, ctx, OPC_RISC_BNE, a->rs1, a->rs2, a->imm);
-return true;
+return gen_branch(ctx, a, TCG_COND_NE);
  }
  
  static bool trans_blt(DisasContext *ctx, arg_blt *a)

  {
-gen_branch(ctx->env, ctx, OPC_RISC_BLT, a->rs1, a->rs2, a->imm);
-return true;
+return gen_branch(ctx, a, TCG_COND_LT);
  }
  
  static bool trans_bge(DisasContext *ctx, arg_bge *a)

  {
-gen_branch(ctx->env, ctx, OPC_RISC_BGE, a->rs1, a->rs2, a->imm);
-return true;
+return gen_branch(ctx, a, TCG_COND_GE);
  }
  
  static bool trans_bltu(DisasContext *ctx, arg_bltu *a)

  {
-gen_branch(ctx->env, ctx, OPC_RISC_BLTU, a->rs1, a->rs2, a->imm);
-return true;
+return gen_branch(ctx, a, TCG_COND_LTU);
  }
  
  static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)

  {
-
-gen_branch(ctx->env, ctx, OPC_RISC_BGEU, a->rs1, a->rs2, a->imm);
-return true;
+return gen_branch(ctx, a, TCG_COND_GEU);
  }
  
  static bool trans_lb(DisasContext *ctx, arg_lb *a)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 1f59b02c84..a0e96b94a9 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -489,53 +489,6 @@ static void gen_jal(CPURISCVState *env, DisasContext *ctx, 
int rd,
  ctx->base.is_jmp = DISAS_NORETURN;
  }
  
-static void gen_branch(CPURISCVState *env, DisasContext *ctx, uint32_t opc,

-   int rs1, int rs2, target_long bimm)
-{
-TCGLabel *l = gen_new_label();
-TCGv source1, source2;
-source1 = tcg_temp_new();
-source2 = tcg_temp_new();
-gen_get_gpr(source1, rs1);
-gen_get_gpr(source2, rs2);
-
-switch (opc) {
-case OPC_RISC_BEQ:
-tcg_gen_brcond_tl(TCG_COND_EQ, source1, source2, l);
-break;
-case OPC_RISC_BNE:
-tcg_gen_brcond_tl(TCG_COND_NE, source1, source2, l);
-break;
-case OPC_RISC_BLT:
-tcg_gen_brcond_tl(TCG_COND_LT, source1, source2, l);
-break;
-case OPC_RISC_BGE:
-tcg_gen_brcond_tl(TCG_COND_GE, source1, source2, l);
-break;
-case OPC_RISC_BLTU:
-tcg_gen_brcond_tl(TCG_COND_LTU, source1, source2, l);
-break;
-case OPC_RISC_BGEU:
-tcg_gen_brcond_tl(TCG_COND_GEU, source1, source2, l);
-break;
-default:
-gen_exception_illegal(ctx);
-return;
-}
-tcg_temp_free(source1);
-tcg_temp_free(source2);
-
-gen_goto_tb(ctx, 1, ctx->pc_succ_insn);
-gen_set_label(l); /* branch taken */
-if (!riscv_has_ext(env, RVC) && ((ctx->base.pc_next + bimm) & 0x3)) {
-/* misaligned */
-gen_exception_inst_addr_mis(ctx);
-} else {
-gen_goto_tb(ctx, 0, ctx->base.pc_next + bimm);
-}
-ctx->base.is_jmp = DISAS_NORETURN;
-}
-
  static void gen_load(DisasContext *ctx, uint32_t opc, int rd, int rs1,
  target_long imm)
  {





Re: [Qemu-devel] [PATCH v6 22/35] target/riscv: Remove manual decoding from gen_load()

2019-01-25 Thread Alistair

On 1/23/19 1:25 AM, Bastian Koppelmann wrote:

With decodetree we don't need to convert RISC-V opcodes into to MemOps
as the old gen_load() did.

Reviewed-by: Richard Henderson 
Signed-off-by: Bastian Koppelmann 
Signed-off-by: Peer Adelt 


Reviewed-by: Alistair Francis 

Alistair


---
  target/riscv/insn_trans/trans_rvi.inc.c | 35 +++--
  target/riscv/translate.c|  6 +++--
  2 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.inc.c 
b/target/riscv/insn_trans/trans_rvi.inc.c
index 0db1f79d20..1ad00bd776 100644
--- a/target/riscv/insn_trans/trans_rvi.inc.c
+++ b/target/riscv/insn_trans/trans_rvi.inc.c
@@ -129,34 +129,43 @@ static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)
  return gen_branch(ctx, a, TCG_COND_GEU);
  }
  
-static bool trans_lb(DisasContext *ctx, arg_lb *a)

+static bool gen_load(DisasContext *ctx, arg_lb *a, TCGMemOp memop)
  {
-gen_load(ctx, OPC_RISC_LB, a->rd, a->rs1, a->imm);
+TCGv t0 = tcg_temp_new();
+TCGv t1 = tcg_temp_new();
+gen_get_gpr(t0, a->rs1);
+tcg_gen_addi_tl(t0, t0, a->imm);
+
+tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, memop);
+gen_set_gpr(a->rd, t1);
+tcg_temp_free(t0);
+tcg_temp_free(t1);
  return true;
  }
  
+static bool trans_lb(DisasContext *ctx, arg_lb *a)

+{
+return gen_load(ctx, a, MO_SB);
+}
+
  static bool trans_lh(DisasContext *ctx, arg_lh *a)
  {
-gen_load(ctx, OPC_RISC_LH, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_TESW);
  }
  
  static bool trans_lw(DisasContext *ctx, arg_lw *a)

  {
-gen_load(ctx, OPC_RISC_LW, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_TESL);
  }
  
  static bool trans_lbu(DisasContext *ctx, arg_lbu *a)

  {
-gen_load(ctx, OPC_RISC_LBU, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_UB);
  }
  
  static bool trans_lhu(DisasContext *ctx, arg_lhu *a)

  {
-gen_load(ctx, OPC_RISC_LHU, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_TEUW);
  }
  
  static bool trans_sb(DisasContext *ctx, arg_sb *a)

@@ -180,14 +189,12 @@ static bool trans_sw(DisasContext *ctx, arg_sw *a)
  #ifdef TARGET_RISCV64
  static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
  {
-gen_load(ctx, OPC_RISC_LWU, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_TEUL);
  }
  
  static bool trans_ld(DisasContext *ctx, arg_ld *a)

  {
-gen_load(ctx, OPC_RISC_LD, a->rd, a->rs1, a->imm);
-return true;
+return gen_load(ctx, a, MO_TEQ);
  }
  
  static bool trans_sd(DisasContext *ctx, arg_sd *a)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index a0e96b94a9..d0fefa8fb9 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -489,7 +489,8 @@ static void gen_jal(CPURISCVState *env, DisasContext *ctx, 
int rd,
  ctx->base.is_jmp = DISAS_NORETURN;
  }
  
-static void gen_load(DisasContext *ctx, uint32_t opc, int rd, int rs1,

+#ifdef TARGET_RISCV64
+static void gen_load_c(DisasContext *ctx, uint32_t opc, int rd, int rs1,
  target_long imm)
  {
  TCGv t0 = tcg_temp_new();
@@ -508,6 +509,7 @@ static void gen_load(DisasContext *ctx, uint32_t opc, int 
rd, int rs1,
  tcg_temp_free(t0);
  tcg_temp_free(t1);
  }
+#endif
  
  static void gen_store(DisasContext *ctx, uint32_t opc, int rs1, int rs2,

  target_long imm)
@@ -640,7 +642,7 @@ static void decode_RV32_64C0(DisasContext *ctx)
  case 3:
  #if defined(TARGET_RISCV64)
  /* C.LD(RV64/128) -> ld rd', offset[7:3](rs1')*/
-gen_load(ctx, OPC_RISC_LD, rd_rs2, rs1s,
+gen_load_c(ctx, OPC_RISC_LD, rd_rs2, rs1s,
   GET_C_LD_IMM(ctx->opcode));
  #else
  /* C.FLW (RV32) -> flw rd', offset[6:2](rs1')*/





[Qemu-devel] [PATCH] MAINTAINERS: Remove Michael Clark as a RISC-V Maintainer

2019-01-25 Thread Palmer Dabbelt
Michael is no longer employed by SiFive and does not want to continue
maintianing the RISC-V port.

Signed-off-by: Palmer Dabbelt 
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index af339b86db76..47cb3c14298e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -259,7 +259,6 @@ F: include/hw/ppc/
 F: disas/ppc.c
 
 RISC-V
-M: Michael Clark 
 M: Palmer Dabbelt 
 M: Alistair Francis 
 M: Sagar Karandikar 
-- 
2.18.1




Re: [Qemu-devel] [PATCH v1] GLib sucks - Remove any connections between me and GLib

2019-01-25 Thread Palmer Dabbelt

On Fri, 18 Jan 2019 16:29:57 PST (-0800), Michael Clark wrote:

One has a basic command of English but one cannot think of a
more or less glib commit message for this commit so this is it.

Palmer, I would like to send you an invoice for this commit
but I do not know what value to place on it. 2^64-1, (-1)
or 0x in hex, sounds too little or too much.

You also will have to pay better for me to use nicer language,
and you can bet your bottom dollar it won’t be C. Also, as we
discussed, having one’s direct competitors in control of our
emulation infrastructure is not a great idea. I thought we
agreed on that then I find out we are in disagreement.

However, I feel comfortable because, while I haven’t been paid
since last August, I am arranging temporary support with
WINZ , however, one must
note, that when one’s work is sabotaged in the real world,
folk end up without jobs, whereas if we were just a student
then we probably only end up with a bruised ego and perhaps
lower grades if our supervisor is not happy. In the real world,
we get the shit scared out of us because we think we could be
fired at any moment. i.e. zero-day. I think because I dropped
out of school, I never learned that. Anyhow, I didn’t read
the contract. I just wanted to get a paycheck so I could play
games with folk on the Internet. That’s bullshit. I wasn’t
playing games, now I am. I was released from a mental ward
after a severe episode of anxiety and depression. True Story.
Anyway Palmer fired me today. Performance issues I suspect.
Could be trust, security, mistakes or simply misalignment on
basic issues. Haven't had post-mortem yet.

BTW What if we auctioned commits and if your commit didn’t
get in, you don’t get paid. Commits would become expensive?
(or cheap?). 🤔

Also, do we have MIA and KIA tags for MAINTAINERS? We could
even put little Unicode medals next to titles like this:

   (sub³)∘maintainer 🎖🎖️  A. Nonymous

What is the operator binding precedence? i.e. do we that need
parenthesis there or does exponentiation bind more tightly
than the ring operator?

Otherwise, please just delete me. It’s bad memories all round.
as I got fired today because of my QEMU performance. I am a
terrible mess when using version control.

Maybe when I have a secure job we can meet up for a beer and
everything will be okay, but at this point we have the wolves
at the door and are worried about an imminent eviction by the
landlord. Also true story (the conditions of port submission).

I know a lot of you are going to say, send me back to the
mental ward, however, I will miss my partner/spouse. We were
actually looking for a house in Palmerston North over the
weekend in preparation for my imminent demise (she still thinks
we can afford to live here in Auckland and everything is fine),
as we thought it would be survivable on social security, however,
I suspect she may not support me any more given the number of
jobs I have lost in the past 6 years since we met. Indeed, rv8
was written mostly on a benefit from the Ministry of Social
Development, C/- New Zealand Government. Oesensibly Job Seeker
Support, but we were searching for a job that didn’t have too
many interruppts, and then SiFive put me to work on a bloody
interrupt controller. That’s ironic. Of course I am slow and
cannot multitask so QEMU queue management was neglected.

Note: the auction reserve on this commit is ten bucks, or
alternatively, /almost/ the price of a beer in Singapore,
so someone will need to top it up. i.e. free as in /almost/
free beer.

P.S. Don’t mess with the RISC-V Virtual Machine Model without
consulting RISC-V SW Dev list or RISC-V ISA Dev list or the
closed doors working groups in the RISC-V Foundation. Personally
I prefer open discussion. Also, it is not the QEMU virt machine,
it is the RISC-V virt machine because SiFive has a VirtIO model
on Amazon F1, so if you mess with it in QEMU you’ll break prod.

P.P.S. No offence to the GLib maintainers, but automatic
reference counting or garbage collection is much safer to
use in higher level code that does user interaction. We could
get all sorts of leaks and what not. As you know I missed free
(or g_free, although you know what I mean).

P.P.P.S I do not represent my employer because I just got
fired today. Anyone know of any jobs going for a crusty old C++
developer who prefers std::unique_ptr over malloc/free?

P.P.P.P.S Is firecracker written in C++? Rust I think dammit,
maybe I should learn Rust. Don’t worry I know nobody likes me,
so it won’t help me no matter how much I learn. I will still
be an overweight middle-aged (human) with no sense of humour
and a fond affection for C+.

P.P.P.P.P.S I’m not trying to rub it in (I lied, I am).

If you are curious about my alter-ego, it’s my (older) brother,
he’s clever, and I am the dumb one with no sense of humour.
I was trying to get a job in California by writing cloud apps
and he somehow ends up in Walnut Creek. Yes, F

Re: [Qemu-devel] [PATCH 13/13] cputlb: Remove static tlb sizing

2019-01-25 Thread Alistair

On 1/23/19 2:57 PM, Richard Henderson wrote:

Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB,
remove the define and the old code.

Signed-off-by: Richard Henderson 


Reviewed-by: Alistair Francis 

Alistair


---
  include/exec/cpu-defs.h  | 46 
  include/exec/cpu_ldst.h  | 14 
  tcg/aarch64/tcg-target.h |  1 -
  tcg/arm/tcg-target.h |  1 -
  tcg/i386/tcg-target.h|  1 -
  tcg/mips/tcg-target.h|  1 -
  tcg/ppc/tcg-target.h |  1 -
  tcg/riscv/tcg-target.h   |  1 -
  tcg/s390/tcg-target.h|  1 -
  tcg/sparc/tcg-target.h   |  1 -
  tcg/tci/tcg-target.h |  1 -
  accel/tcg/cputlb.c   | 21 --
  12 files changed, 90 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 191a1e021f..8f2a848bf5 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -67,11 +67,9 @@ typedef uint64_t target_ulong;
  #define CPU_TLB_ENTRY_BITS 5
  #endif
  
-#if TCG_TARGET_IMPLEMENTS_DYN_TLB

  #define CPU_TLB_DYN_MIN_BITS 6
  #define CPU_TLB_DYN_DEFAULT_BITS 8
  
-

  # if HOST_LONG_BITS == 32
  /* Make sure we do not require a double-word shift for the TLB load */
  #  define CPU_TLB_DYN_MAX_BITS (32 - TARGET_PAGE_BITS)
@@ -87,41 +85,6 @@ typedef uint64_t target_ulong;
  MIN(22, TARGET_VIRT_ADDR_SPACE_BITS - TARGET_PAGE_BITS)
  # endif
  
-#else /* !TCG_TARGET_IMPLEMENTS_DYN_TLB */

-
-/* TCG_TARGET_TLB_DISPLACEMENT_BITS is used in CPU_TLB_BITS to ensure that
- * the TLB is not unnecessarily small, but still small enough for the
- * TLB lookup instruction sequence used by the TCG target.
- *
- * TCG will have to generate an operand as large as the distance between
- * env and the tlb_table[NB_MMU_MODES - 1][0].addend.  For simplicity,
- * the TCG targets just round everything up to the next power of two, and
- * count bits.  This works because: 1) the size of each TLB is a largish
- * power of two, 2) and because the limit of the displacement is really close
- * to a power of two, 3) the offset of tlb_table[0][0] inside env is smaller
- * than the size of a TLB.
- *
- * For example, the maximum displacement 0xFFF0 on PPC and MIPS, but TCG
- * just says "the displacement is 16 bits".  TCG_TARGET_TLB_DISPLACEMENT_BITS
- * then ensures that tlb_table at least 0x8000 bytes large ("not unnecessarily
- * small": 2^15).  The operand then will come up smaller than 0xFFF0 without
- * any particular care, because the TLB for a single MMU mode is larger than
- * 0x1-0xFFF0=16 bytes.  In the end, the maximum value of the operand
- * could be something like 0xC000 (the offset of the last TLB table) plus
- * 0x18 (the offset of the addend field in each TLB entry) plus the offset
- * of tlb_table inside env (which is non-trivial but not huge).
- */
-#define CPU_TLB_BITS \
-MIN(8,   \
-TCG_TARGET_TLB_DISPLACEMENT_BITS - CPU_TLB_ENTRY_BITS -  \
-(NB_MMU_MODES <= 1 ? 0 : \
- NB_MMU_MODES <= 2 ? 1 : \
- NB_MMU_MODES <= 4 ? 2 : \
- NB_MMU_MODES <= 8 ? 3 : 4))
-
-#define CPU_TLB_SIZE (1 << CPU_TLB_BITS)
-#endif /* TCG_TARGET_IMPLEMENTS_DYN_TLB */
-
  typedef struct CPUTLBEntry {
  /* bit TARGET_LONG_BITS to TARGET_PAGE_BITS : virtual address
 bit TARGET_PAGE_BITS-1..4  : Nonzero for accesses that should not
@@ -187,10 +150,8 @@ typedef struct CPUTLBDesc {
  target_ulong large_page_mask;
  /* The next index to use in the tlb victim table.  */
  size_t vindex;
-#if TCG_TARGET_IMPLEMENTS_DYN_TLB
  CPUTLBWindow window;
  size_t n_used_entries;
-#endif
  } CPUTLBDesc;
  
  /*

@@ -215,19 +176,12 @@ typedef struct CPUTLBCommon {
  size_t elide_flush_count;
  } CPUTLBCommon;
  
-#if TCG_TARGET_IMPLEMENTS_DYN_TLB

  # define CPU_TLB\
  /* tlb_mask[i] contains (n_entries - 1) << CPU_TLB_ENTRY_BITS */\
  uintptr_t tlb_mask[NB_MMU_MODES];   \
  CPUTLBEntry *tlb_table[NB_MMU_MODES];
  # define CPU_IOTLB  \
  CPUIOTLBEntry *iotlb[NB_MMU_MODES];
-#else
-# define CPU_TLB\
-CPUTLBEntry tlb_table[NB_MMU_MODES][CPU_TLB_SIZE];
-# define CPU_IOTLB  \
-CPUIOTLBEntry iotlb[NB_MMU_MODES][CPU_TLB_SIZE];
-#endif
  
  /*

   * The meaning of each of the MMU modes is defined in the target code.
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index 83b2907d86..d78041d7a0 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -135,7 +135,6 @@ static inline target_ulong tlb_addr_write(const CPUTLBEntry 
*entry)
  #endif
  }
  
-#if TCG_TARGET_IMPLEMENTS_DYN_TLB

  /* Find the TLB index corresponding to the mmu_

Re: [Qemu-devel] [PATCH 08/13] tcg/riscv: enable dynamic TLB sizing

2019-01-25 Thread Alistair

On 1/23/19 2:57 PM, Richard Henderson wrote:

Signed-off-by: Richard Henderson 


Reviewed-by: Alistair Francis 

Alistair


---
  tcg/riscv/tcg-target.h |   2 +-
  tcg/riscv/tcg-target.inc.c | 126 -
  2 files changed, 56 insertions(+), 72 deletions(-)

diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 1eb032626c..83b123ca03 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -33,7 +33,7 @@
  
  #define TCG_TARGET_INSN_UNIT_SIZE 4

  #define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
-#define TCG_TARGET_IMPLEMENTS_DYN_TLB 0
+#define TCG_TARGET_IMPLEMENTS_DYN_TLB 1
  #define TCG_TARGET_NB_REGS 32
  
  typedef enum {

diff --git a/tcg/riscv/tcg-target.inc.c b/tcg/riscv/tcg-target.inc.c
index 6cf8de32b5..b785f4acb7 100644
--- a/tcg/riscv/tcg-target.inc.c
+++ b/tcg/riscv/tcg-target.inc.c
@@ -958,6 +958,17 @@ static void * const qemu_st_helpers[16] = {
  [MO_BEQ]  = helper_be_stq_mmu,
  };
  
+/* We don't support oversize guests */

+QEMU_BUILD_BUG_ON(TCG_TARGET_REG_BITS < TARGET_LONG_BITS);
+
+/* We expect tlb_mask to be before tlb_table.  */
+QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table) <
+  offsetof(CPUArchState, tlb_mask));
+
+/* We expect tlb_mask to be "near" tlb_table.  */
+QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table) -
+  offsetof(CPUArchState, tlb_mask) >= 0x800);
+
  static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
   TCGReg addrh, TCGMemOpIdx oi,
   tcg_insn_unit **label_ptr, bool is_load)
@@ -965,94 +976,67 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
  TCGMemOp opc = get_memop(oi);
  unsigned s_bits = opc & MO_SIZE;
  unsigned a_bits = get_alignment_bits(opc);
-target_ulong mask;
+tcg_target_long compare_mask;
  int mem_index = get_mmuidx(oi);
-int cmp_off
-= (is_load
-   ? offsetof(CPUArchState, tlb_table[mem_index][0].addr_read)
-   : offsetof(CPUArchState, tlb_table[mem_index][0].addr_write));
-int add_off = offsetof(CPUArchState, tlb_table[mem_index][0].addend);
-RISCVInsn load_cmp_op = (TARGET_LONG_BITS == 64 ? OPC_LD :
- TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW);
-RISCVInsn load_add_op = TCG_TARGET_REG_BITS == 64 ? OPC_LD : OPC_LW;
-TCGReg base = TCG_AREG0;
+int mask_off, table_off;
+TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
  
-/* We don't support oversize guests */

-if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
-g_assert_not_reached();
+mask_off = offsetof(CPUArchState, tlb_mask[mem_index]);
+table_off = offsetof(CPUArchState, tlb_table[mem_index]);
+if (table_off > 0x7ff) {
+int mask_hi = mask_off - sextreg(mask_off, 0, 12);
+int table_hi = table_off - sextreg(table_off, 0, 12);
+
+if (likely(mask_hi == table_hi)) {
+mask_base = table_base = TCG_REG_TMP1;
+tcg_out_opc_upper(s, OPC_LUI, mask_base, mask_hi);
+tcg_out_opc_reg(s, OPC_ADD, mask_base, mask_base, TCG_AREG0);
+mask_off -= mask_hi;
+table_off -= mask_hi;
+} else {
+mask_base = TCG_REG_TMP0;
+table_base = TCG_REG_TMP1;
+tcg_out_opc_upper(s, OPC_LUI, mask_base, mask_hi);
+tcg_out_opc_reg(s, OPC_ADD, mask_base, mask_base, TCG_AREG0);
+table_off -= mask_off;
+mask_off -= mask_hi;
+tcg_out_opc_imm(s, OPC_ADDI, table_base, mask_base, mask_off);
+}
  }
  
+tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_off);

+tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_off);
+
+tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addrl,
+TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
+tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
+
+/* Load the tlb comparator and the addend.  */
+tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
+   is_load ? offsetof(CPUTLBEntry, addr_read)
+   : offsetof(CPUTLBEntry, addr_write));
+tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
+   offsetof(CPUTLBEntry, addend));
+
  /* We don't support unaligned accesses. */
  if (a_bits < s_bits) {
  a_bits = s_bits;
  }
-mask = (target_ulong)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
-
-
-/* Compensate for very large offsets.  */
-if (add_off >= 0x1000) {
-int adj;
-base = TCG_REG_TMP2;
-if (cmp_off <= 2 * 0xfff) {
-adj = 0xfff;
-tcg_out_opc_imm(s, OPC_ADDI, base, TCG_AREG0, adj);
-} else {
-adj = cmp_off - sextreg(cmp_off, 0, 12);
-tcg_debug_assert(add_off - adj >= -0x1000
- && add_off - adj < 0x1000);
-
-

[Qemu-devel] [PULL 8/8] ide/via: Implement and use native PCI IDE mode

2019-01-25 Thread John Snow
From: BALATON Zoltan 

This device only implemented ISA compatibility mode and native PCI IDE
mode was missing but no clients actually need ISA mode but to the
contrary, they usually want to switch to and use device in native
PCI IDE mode. Therefore implement native PCI mode and switch default
to that.

Signed-off-by: BALATON Zoltan 
Message-id: 
c323f08c59b9931310c5d92503d370f77ce3a557.1548160772.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/via.c | 52 ++--
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index c6e43a8812..ac9385228c 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -32,15 +32,6 @@
 #include "hw/ide/pci.h"
 #include "trace.h"
 
-static const struct {
-int iobase;
-int iobase2;
-int isairq;
-} port_info[] = {
-{0x1f0, 0x3f6, 14},
-{0x170, 0x376, 15},
-};
-
 static uint64_t bmdma_read(void *opaque, hwaddr addr,
unsigned size)
 {
@@ -110,6 +101,23 @@ static void bmdma_setup_bar(PCIIDEState *d)
 }
 }
 
+static void via_ide_set_irq(void *opaque, int n, int level)
+{
+PCIDevice *d = PCI_DEVICE(opaque);
+
+if (level) {
+d->config[0x70 + n * 8] |= 0x80;
+} else {
+d->config[0x70 + n * 8] &= ~0x80;
+}
+
+level = (d->config[0x70] & 0x80) || (d->config[0x78] & 0x80);
+n = pci_get_byte(d->config + PCI_INTERRUPT_LINE);
+if (n) {
+qemu_set_irq(isa_get_irq(NULL, n), level);
+}
+}
+
 static void via_ide_reset(void *opaque)
 {
 PCIIDEState *d = opaque;
@@ -121,7 +129,7 @@ static void via_ide_reset(void *opaque)
 ide_bus_reset(&d->bus[i]);
 }
 
-pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_WAIT);
+pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_WAIT);
 pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
  PCI_STATUS_DEVSEL_MEDIUM);
 
@@ -158,10 +166,28 @@ static void via_ide_realize(PCIDevice *dev, Error **errp)
 uint8_t *pci_conf = dev->config;
 int i;
 
-pci_config_set_prog_interface(pci_conf, 0x8a); /* legacy ATA mode */
+pci_config_set_prog_interface(pci_conf, 0x8f); /* native PCI ATA mode */
 pci_set_long(pci_conf + PCI_CAPABILITY_LIST, 0x00c0);
+dev->wmask[PCI_INTERRUPT_LINE] = 0xf;
 
 qemu_register_reset(via_ide_reset, d);
+
+memory_region_init_io(&d->data_bar[0], OBJECT(d), &pci_ide_data_le_ops,
+  &d->bus[0], "via-ide0-data", 8);
+pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &d->data_bar[0]);
+
+memory_region_init_io(&d->cmd_bar[0], OBJECT(d), &pci_ide_cmd_le_ops,
+  &d->bus[0], "via-ide0-cmd", 4);
+pci_register_bar(dev, 1, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd_bar[0]);
+
+memory_region_init_io(&d->data_bar[1], OBJECT(d), &pci_ide_data_le_ops,
+  &d->bus[1], "via-ide1-data", 8);
+pci_register_bar(dev, 2, PCI_BASE_ADDRESS_SPACE_IO, &d->data_bar[1]);
+
+memory_region_init_io(&d->cmd_bar[1], OBJECT(d), &pci_ide_cmd_le_ops,
+  &d->bus[1], "via-ide1-cmd", 4);
+pci_register_bar(dev, 3, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd_bar[1]);
+
 bmdma_setup_bar(d);
 pci_register_bar(dev, 4, PCI_BASE_ADDRESS_SPACE_IO, &d->bmdma_bar);
 
@@ -169,9 +195,7 @@ static void via_ide_realize(PCIDevice *dev, Error **errp)
 
 for (i = 0; i < 2; i++) {
 ide_bus_new(&d->bus[i], sizeof(d->bus[i]), DEVICE(d), i, 2);
-ide_init_ioport(&d->bus[i], NULL, port_info[i].iobase,
-port_info[i].iobase2);
-ide_init2(&d->bus[i], isa_get_irq(NULL, port_info[i].isairq));
+ide_init2(&d->bus[i], qemu_allocate_irq(via_ide_set_irq, d, i));
 
 bmdma_init(&d->bus[i], &d->bmdma[i], d);
 d->bmdma[i].bus = &d->bus[i];
-- 
2.17.2




[Qemu-devel] [PULL 6/8] ide/via: Remove vt82c686b_init_ports() function

2019-01-25 Thread John Snow
From: BALATON Zoltan 

This function is only called once from vt82c686b_ide_realize() and its
content is simple enough to not need a separate function but be
included in realize directly (as done in other IDE models except PIIX
currently).

Signed-off-by: BALATON Zoltan 
Message-id: 
47d854e0fa41dad6861107eac61327c247965566.1548160772.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/via.c | 44 
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index 987d99c5ec..46cac7b8d6 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -32,6 +32,15 @@
 #include "hw/ide/pci.h"
 #include "trace.h"
 
+static const struct {
+int iobase;
+int iobase2;
+int isairq;
+} port_info[] = {
+{0x1f0, 0x3f6, 14},
+{0x170, 0x376, 15},
+};
+
 static uint64_t bmdma_read(void *opaque, hwaddr addr,
unsigned size)
 {
@@ -143,34 +152,12 @@ static void via_reset(void *opaque)
 pci_set_long(pci_conf + 0xc0, 0x00020001);
 }
 
-static void vt82c686b_init_ports(PCIIDEState *d) {
-static const struct {
-int iobase;
-int iobase2;
-int isairq;
-} port_info[] = {
-{0x1f0, 0x3f6, 14},
-{0x170, 0x376, 15},
-};
-int i;
-
-for (i = 0; i < 2; i++) {
-ide_bus_new(&d->bus[i], sizeof(d->bus[i]), DEVICE(d), i, 2);
-ide_init_ioport(&d->bus[i], NULL, port_info[i].iobase,
-port_info[i].iobase2);
-ide_init2(&d->bus[i], isa_get_irq(NULL, port_info[i].isairq));
-
-bmdma_init(&d->bus[i], &d->bmdma[i], d);
-d->bmdma[i].bus = &d->bus[i];
-ide_register_restart_cb(&d->bus[i]);
-}
-}
-
 /* via ide func */
 static void vt82c686b_ide_realize(PCIDevice *dev, Error **errp)
 {
 PCIIDEState *d = PCI_IDE(dev);
 uint8_t *pci_conf = dev->config;
+int i;
 
 pci_config_set_prog_interface(pci_conf, 0x8a); /* legacy ATA mode */
 pci_set_long(pci_conf + PCI_CAPABILITY_LIST, 0x00c0);
@@ -181,7 +168,16 @@ static void vt82c686b_ide_realize(PCIDevice *dev, Error 
**errp)
 
 vmstate_register(DEVICE(dev), 0, &vmstate_ide_pci, d);
 
-vt82c686b_init_ports(d);
+for (i = 0; i < 2; i++) {
+ide_bus_new(&d->bus[i], sizeof(d->bus[i]), DEVICE(d), i, 2);
+ide_init_ioport(&d->bus[i], NULL, port_info[i].iobase,
+port_info[i].iobase2);
+ide_init2(&d->bus[i], isa_get_irq(NULL, port_info[i].isairq));
+
+bmdma_init(&d->bus[i], &d->bmdma[i], d);
+d->bmdma[i].bus = &d->bus[i];
+ide_register_restart_cb(&d->bus[i]);
+}
 }
 
 static void vt82c686b_ide_exitfn(PCIDevice *dev)
-- 
2.17.2




[Qemu-devel] [PULL 5/8] sii3112: Remove duplicated code and use PCI IDE ops instead

2019-01-25 Thread John Snow
From: BALATON Zoltan 

Parts of the SiI3112 mmio are identical to PCI IDE registers so we can
use the corresponding functions that were factored out into ide/pci.c.
This removes code duplication and simplifies the SiI3112 model which
also helped to spot a copy paste error where reading status of the
2nd channel read the 1st channel instead. This is also fixed here.

Signed-off-by: BALATON Zoltan 
Tested-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Message-id: 
793b6a7934ef2bba26b8d066bec446019efa6c5d.1547166960.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/sii3112.c | 52 
 1 file changed, 8 insertions(+), 44 deletions(-)

diff --git a/hw/ide/sii3112.c b/hw/ide/sii3112.c
index 743a50ed51..59db09cfe4 100644
--- a/hw/ide/sii3112.c
+++ b/hw/ide/sii3112.c
@@ -88,35 +88,19 @@ static uint64_t sii3112_reg_read(void *opaque, hwaddr addr,
 val |= (uint32_t)d->i.bmdma[1].status << 16;
 break;
 case 0x80 ... 0x87:
-if (size == 1) {
-val = ide_ioport_read(&d->i.bus[0], addr - 0x80);
-} else if (addr == 0x80) {
-val = (size == 2) ? ide_data_readw(&d->i.bus[0], 0) :
-ide_data_readl(&d->i.bus[0], 0);
-} else {
-val = (1ULL << (size * 8)) - 1;
-}
+val = pci_ide_data_le_ops.read(&d->i.bus[0], addr - 0x80, size);
 break;
 case 0x8a:
-val = (size == 1) ? ide_status_read(&d->i.bus[0], 4) :
-(1ULL << (size * 8)) - 1;
+val = pci_ide_cmd_le_ops.read(&d->i.bus[0], 2, size);
 break;
 case 0xa0:
 val = d->regs[0].confstat;
 break;
 case 0xc0 ... 0xc7:
-if (size == 1) {
-val = ide_ioport_read(&d->i.bus[1], addr - 0xc0);
-} else if (addr == 0xc0) {
-val = (size == 2) ? ide_data_readw(&d->i.bus[1], 0) :
-ide_data_readl(&d->i.bus[1], 0);
-} else {
-val = (1ULL << (size * 8)) - 1;
-}
+val = pci_ide_data_le_ops.read(&d->i.bus[1], addr - 0xc0, size);
 break;
 case 0xca:
-val = (size == 1) ? ide_status_read(&d->i.bus[0], 4) :
-(1ULL << (size * 8)) - 1;
+val = pci_ide_cmd_le_ops.read(&d->i.bus[1], 2, size);
 break;
 case 0xe0:
 val = d->regs[1].confstat;
@@ -186,36 +170,16 @@ static void sii3112_reg_write(void *opaque, hwaddr addr,
 bmdma_addr_ioport_ops.write(&d->i.bmdma[1], addr - 12, val, size);
 break;
 case 0x80 ... 0x87:
-if (size == 1) {
-ide_ioport_write(&d->i.bus[0], addr - 0x80, val);
-} else if (addr == 0x80) {
-if (size == 2) {
-ide_data_writew(&d->i.bus[0], 0, val);
-} else {
-ide_data_writel(&d->i.bus[0], 0, val);
-}
-}
+pci_ide_data_le_ops.write(&d->i.bus[0], addr - 0x80, val, size);
 break;
 case 0x8a:
-if (size == 1) {
-ide_cmd_write(&d->i.bus[0], 4, val);
-}
+pci_ide_cmd_le_ops.write(&d->i.bus[0], 2, val, size);
 break;
 case 0xc0 ... 0xc7:
-if (size == 1) {
-ide_ioport_write(&d->i.bus[1], addr - 0xc0, val);
-} else if (addr == 0xc0) {
-if (size == 2) {
-ide_data_writew(&d->i.bus[1], 0, val);
-} else {
-ide_data_writel(&d->i.bus[1], 0, val);
-}
-}
+pci_ide_data_le_ops.write(&d->i.bus[1], addr - 0xc0, val, size);
 break;
 case 0xca:
-if (size == 1) {
-ide_cmd_write(&d->i.bus[1], 4, val);
-}
+pci_ide_cmd_le_ops.write(&d->i.bus[1], 2, val, size);
 break;
 case 0x100:
 d->regs[0].scontrol = val & 0xfff;
-- 
2.17.2




[Qemu-devel] [PULL 3/8] cmd646: Move PCI IDE specific functions to ide/pci.c

2019-01-25 Thread John Snow
From: BALATON Zoltan 

The io mem ops callbacks are not specific to CMD646 but really follow
the PCI IDE spec so move these from cmd646.c to pci.c to allow other
PCI IDE implementations to use them.

Signed-off-by: BALATON Zoltan 
Tested-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Message-id: 
a2b1b2b74afdc78330b8b75605687f683a249635.1547166960.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/cmd646.c  | 71 ++--
 hw/ide/pci.c | 65 
 include/hw/ide/pci.h |  2 ++
 3 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index c24f71e219..95f0df9742 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -50,81 +50,14 @@
 
 static void cmd646_update_irq(PCIDevice *pd);
 
-static uint64_t cmd646_cmd_read(void *opaque, hwaddr addr,
-unsigned size)
-{
-IDEBus *bus = opaque;
-
-if (addr != 2 || size != 1) {
-return ((uint64_t)1 << (size * 8)) - 1;
-}
-return ide_status_read(bus, addr + 2);
-}
-
-static void cmd646_cmd_write(void *opaque, hwaddr addr,
- uint64_t data, unsigned size)
-{
-IDEBus *bus = opaque;
-
-if (addr != 2 || size != 1) {
-return;
-}
-ide_cmd_write(bus, addr + 2, data);
-}
-
-static const MemoryRegionOps cmd646_cmd_ops = {
-.read = cmd646_cmd_read,
-.write = cmd646_cmd_write,
-.endianness = DEVICE_LITTLE_ENDIAN,
-};
-
-static uint64_t cmd646_data_read(void *opaque, hwaddr addr,
- unsigned size)
-{
-IDEBus *bus = opaque;
-
-if (size == 1) {
-return ide_ioport_read(bus, addr);
-} else if (addr == 0) {
-if (size == 2) {
-return ide_data_readw(bus, addr);
-} else {
-return ide_data_readl(bus, addr);
-}
-}
-return ((uint64_t)1 << (size * 8)) - 1;
-}
-
-static void cmd646_data_write(void *opaque, hwaddr addr,
- uint64_t data, unsigned size)
-{
-IDEBus *bus = opaque;
-
-if (size == 1) {
-ide_ioport_write(bus, addr, data);
-} else if (addr == 0) {
-if (size == 2) {
-ide_data_writew(bus, addr, data);
-} else {
-ide_data_writel(bus, addr, data);
-}
-}
-}
-
-static const MemoryRegionOps cmd646_data_ops = {
-.read = cmd646_data_read,
-.write = cmd646_data_write,
-.endianness = DEVICE_LITTLE_ENDIAN,
-};
-
 static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
 {
 IDEBus *bus = &d->bus[bus_num];
 CMD646BAR *bar = &d->cmd646_bar[bus_num];
 
-memory_region_init_io(&bar->cmd, OBJECT(d), &cmd646_cmd_ops, bus,
+memory_region_init_io(&bar->cmd, OBJECT(d), &pci_ide_cmd_le_ops, bus,
   "cmd646-cmd", 4);
-memory_region_init_io(&bar->data, OBJECT(d), &cmd646_data_ops, bus,
+memory_region_init_io(&bar->data, OBJECT(d), &pci_ide_data_le_ops, bus,
   "cmd646-data", 8);
 }
 
diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index b75154f99f..942613a9a9 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -36,6 +36,71 @@
 (IDE_RETRY_DMA | IDE_RETRY_PIO | \
 IDE_RETRY_READ | IDE_RETRY_FLUSH)
 
+static uint64_t pci_ide_cmd_read(void *opaque, hwaddr addr, unsigned size)
+{
+IDEBus *bus = opaque;
+
+if (addr != 2 || size != 1) {
+return ((uint64_t)1 << (size * 8)) - 1;
+}
+return ide_status_read(bus, addr + 2);
+}
+
+static void pci_ide_cmd_write(void *opaque, hwaddr addr,
+  uint64_t data, unsigned size)
+{
+IDEBus *bus = opaque;
+
+if (addr != 2 || size != 1) {
+return;
+}
+ide_cmd_write(bus, addr + 2, data);
+}
+
+const MemoryRegionOps pci_ide_cmd_le_ops = {
+.read = pci_ide_cmd_read,
+.write = pci_ide_cmd_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static uint64_t pci_ide_data_read(void *opaque, hwaddr addr, unsigned size)
+{
+IDEBus *bus = opaque;
+
+if (size == 1) {
+return ide_ioport_read(bus, addr);
+} else if (addr == 0) {
+if (size == 2) {
+return ide_data_readw(bus, addr);
+} else {
+return ide_data_readl(bus, addr);
+}
+}
+return ((uint64_t)1 << (size * 8)) - 1;
+}
+
+static void pci_ide_data_write(void *opaque, hwaddr addr,
+   uint64_t data, unsigned size)
+{
+IDEBus *bus = opaque;
+
+if (size == 1) {
+ide_ioport_write(bus, addr, data);
+} else if (addr == 0) {
+if (size == 2) {
+ide_data_writew(bus, addr, data);
+} else {
+ide_data_writel(bus, addr, data);
+}
+}
+}
+
+const MemoryRegionOps pci_ide_data_le_ops = {
+.read = pci_ide_data_read,
+.write = pci_ide_data_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static void bmdma_start_dma(IDEDMA *dma, IDEState *s,

[Qemu-devel] [PULL 0/8] Ide patches

2019-01-25 Thread John Snow
The following changes since commit ad7a21e81231ae64540310384fb0f87ac8758b02:

  Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' 
into staging (2019-01-25 17:22:20 +)

are available in the Git repository at:

  https://github.com/jnsnow/qemu.git tags/ide-pull-request

for you to fetch changes up to 4ea98d317eb442c738f898f16cfdd47a18b7ca49:

  ide/via: Implement and use native PCI IDE mode (2019-01-25 14:52:12 -0500)


Pull request



BALATON Zoltan (8):
  cmd646: Remove unused variable
  cmd646: Remove IDEBus from CMD646BAR
  cmd646: Move PCI IDE specific functions to ide/pci.c
  ide: Get rid of CMD646BAR struct
  sii3112: Remove duplicated code and use PCI IDE ops instead
  ide/via: Remove vt82c686b_init_ports() function
  ide/via: Rename functions to match device name
  ide/via: Implement and use native PCI IDE mode

 hw/ide/cmd646.c | 102 +++-
 hw/ide/pci.c|  65 +
 hw/ide/sii3112.c|  52 
 hw/ide/via.c|  87 --
 hw/mips/mips_fulong2e.c |   2 +-
 include/hw/ide.h|   2 +-
 include/hw/ide/pci.h|  14 ++
 7 files changed, 148 insertions(+), 176 deletions(-)

-- 
2.17.2




[Qemu-devel] [PULL 4/8] ide: Get rid of CMD646BAR struct

2019-01-25 Thread John Snow
From: BALATON Zoltan 

Now that no CMD646 specific parts are left in CMD646BAR (all remaining
members are really PCI IDE specific) this struct can be deleted moving
the memory regions for PCI IDE BARs to PCIIDEState where they better
belong. The CMD646 PCI IDE model is adjusted accordingly.

Signed-off-by: BALATON Zoltan 
Tested-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Message-id: 
4b6cb2ae150dc0d21178209e4beb1e35140a7325.1547166960.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/cmd646.c  | 33 -
 include/hw/ide/pci.h | 10 ++
 2 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index 95f0df9742..5a5679134a 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -50,17 +50,6 @@
 
 static void cmd646_update_irq(PCIDevice *pd);
 
-static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
-{
-IDEBus *bus = &d->bus[bus_num];
-CMD646BAR *bar = &d->cmd646_bar[bus_num];
-
-memory_region_init_io(&bar->cmd, OBJECT(d), &pci_ide_cmd_le_ops, bus,
-  "cmd646-cmd", 4);
-memory_region_init_io(&bar->data, OBJECT(d), &pci_ide_data_le_ops, bus,
-  "cmd646-data", 8);
-}
-
 static void cmd646_update_dma_interrupts(PCIDevice *pd)
 {
 /* Sync DMA interrupt status from UDMA interrupt status */
@@ -277,12 +266,22 @@ static void pci_cmd646_ide_realize(PCIDevice *dev, Error 
**errp)
 dev->wmask[MRDMODE] = 0x0;
 dev->w1cmask[MRDMODE] = MRDMODE_INTR_CH0 | MRDMODE_INTR_CH1;
 
-setup_cmd646_bar(d, 0);
-setup_cmd646_bar(d, 1);
-pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_IO, 
&d->cmd646_bar[0].data);
-pci_register_bar(dev, 1, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd646_bar[0].cmd);
-pci_register_bar(dev, 2, PCI_BASE_ADDRESS_SPACE_IO, 
&d->cmd646_bar[1].data);
-pci_register_bar(dev, 3, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd646_bar[1].cmd);
+memory_region_init_io(&d->data_bar[0], OBJECT(d), &pci_ide_data_le_ops,
+  &d->bus[0], "cmd646-data0", 8);
+pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &d->data_bar[0]);
+
+memory_region_init_io(&d->cmd_bar[0], OBJECT(d), &pci_ide_cmd_le_ops,
+  &d->bus[0], "cmd646-cmd0", 4);
+pci_register_bar(dev, 1, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd_bar[0]);
+
+memory_region_init_io(&d->data_bar[1], OBJECT(d), &pci_ide_data_le_ops,
+  &d->bus[1], "cmd646-data1", 8);
+pci_register_bar(dev, 2, PCI_BASE_ADDRESS_SPACE_IO, &d->data_bar[1]);
+
+memory_region_init_io(&d->cmd_bar[1], OBJECT(d), &pci_ide_cmd_le_ops,
+  &d->bus[1], "cmd646-cmd1", 4);
+pci_register_bar(dev, 3, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd_bar[1]);
+
 bmdma_setup_bar(d);
 pci_register_bar(dev, 4, PCI_BASE_ADDRESS_SPACE_IO, &d->bmdma_bar);
 
diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
index 3110633e4c..a9f2c33e68 100644
--- a/include/hw/ide/pci.h
+++ b/include/hw/ide/pci.h
@@ -37,11 +37,6 @@ typedef struct BMDMAState {
 struct PCIIDEState *pci_dev;
 } BMDMAState;
 
-typedef struct CMD646BAR {
-MemoryRegion cmd;
-MemoryRegion data;
-} CMD646BAR;
-
 #define TYPE_PCI_IDE "pci-ide"
 #define PCI_IDE(obj) OBJECT_CHECK(PCIIDEState, (obj), TYPE_PCI_IDE)
 
@@ -54,17 +49,16 @@ typedef struct PCIIDEState {
 BMDMAState bmdma[2];
 uint32_t secondary; /* used only for cmd646 */
 MemoryRegion bmdma_bar;
-CMD646BAR cmd646_bar[2]; /* used only for cmd646 */
+MemoryRegion cmd_bar[2];
+MemoryRegion data_bar[2];
 } PCIIDEState;
 
-
 static inline IDEState *bmdma_active_if(BMDMAState *bmdma)
 {
 assert(bmdma->bus->retry_unit != (uint8_t)-1);
 return bmdma->bus->ifs + bmdma->bus->retry_unit;
 }
 
-
 void bmdma_init(IDEBus *bus, BMDMAState *bm, PCIIDEState *d);
 void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val);
 extern MemoryRegionOps bmdma_addr_ioport_ops;
-- 
2.17.2




[Qemu-devel] [PULL 2/8] cmd646: Remove IDEBus from CMD646BAR

2019-01-25 Thread John Snow
From: BALATON Zoltan 

The cmd646 io mem ops callbacks only need the IDEBus which is
currently passed via a CMD646BAR struct. No need to wrap it up like
that, we can pass it directly to these callbacks which then allows to
drop the IDEBus from the CMD646BAR.

Signed-off-by: BALATON Zoltan 
Tested-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Message-id: 
7a31c155c9899869794499d841d30c7ef32aae47.1547166960.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/cmd646.c  | 29 ++---
 include/hw/ide/pci.h |  1 -
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index 41c1831f9a..c24f71e219 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -53,23 +53,23 @@ static void cmd646_update_irq(PCIDevice *pd);
 static uint64_t cmd646_cmd_read(void *opaque, hwaddr addr,
 unsigned size)
 {
-CMD646BAR *cmd646bar = opaque;
+IDEBus *bus = opaque;
 
 if (addr != 2 || size != 1) {
 return ((uint64_t)1 << (size * 8)) - 1;
 }
-return ide_status_read(cmd646bar->bus, addr + 2);
+return ide_status_read(bus, addr + 2);
 }
 
 static void cmd646_cmd_write(void *opaque, hwaddr addr,
  uint64_t data, unsigned size)
 {
-CMD646BAR *cmd646bar = opaque;
+IDEBus *bus = opaque;
 
 if (addr != 2 || size != 1) {
 return;
 }
-ide_cmd_write(cmd646bar->bus, addr + 2, data);
+ide_cmd_write(bus, addr + 2, data);
 }
 
 static const MemoryRegionOps cmd646_cmd_ops = {
@@ -81,15 +81,15 @@ static const MemoryRegionOps cmd646_cmd_ops = {
 static uint64_t cmd646_data_read(void *opaque, hwaddr addr,
  unsigned size)
 {
-CMD646BAR *cmd646bar = opaque;
+IDEBus *bus = opaque;
 
 if (size == 1) {
-return ide_ioport_read(cmd646bar->bus, addr);
+return ide_ioport_read(bus, addr);
 } else if (addr == 0) {
 if (size == 2) {
-return ide_data_readw(cmd646bar->bus, addr);
+return ide_data_readw(bus, addr);
 } else {
-return ide_data_readl(cmd646bar->bus, addr);
+return ide_data_readl(bus, addr);
 }
 }
 return ((uint64_t)1 << (size * 8)) - 1;
@@ -98,15 +98,15 @@ static uint64_t cmd646_data_read(void *opaque, hwaddr addr,
 static void cmd646_data_write(void *opaque, hwaddr addr,
  uint64_t data, unsigned size)
 {
-CMD646BAR *cmd646bar = opaque;
+IDEBus *bus = opaque;
 
 if (size == 1) {
-ide_ioport_write(cmd646bar->bus, addr, data);
+ide_ioport_write(bus, addr, data);
 } else if (addr == 0) {
 if (size == 2) {
-ide_data_writew(cmd646bar->bus, addr, data);
+ide_data_writew(bus, addr, data);
 } else {
-ide_data_writel(cmd646bar->bus, addr, data);
+ide_data_writel(bus, addr, data);
 }
 }
 }
@@ -122,10 +122,9 @@ static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
 IDEBus *bus = &d->bus[bus_num];
 CMD646BAR *bar = &d->cmd646_bar[bus_num];
 
-bar->bus = bus;
-memory_region_init_io(&bar->cmd, OBJECT(d), &cmd646_cmd_ops, bar,
+memory_region_init_io(&bar->cmd, OBJECT(d), &cmd646_cmd_ops, bus,
   "cmd646-cmd", 4);
-memory_region_init_io(&bar->data, OBJECT(d), &cmd646_data_ops, bar,
+memory_region_init_io(&bar->data, OBJECT(d), &cmd646_data_ops, bus,
   "cmd646-data", 8);
 }
 
diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
index ed723acfb4..013d7937d2 100644
--- a/include/hw/ide/pci.h
+++ b/include/hw/ide/pci.h
@@ -40,7 +40,6 @@ typedef struct BMDMAState {
 typedef struct CMD646BAR {
 MemoryRegion cmd;
 MemoryRegion data;
-IDEBus *bus;
 } CMD646BAR;
 
 #define TYPE_PCI_IDE "pci-ide"
-- 
2.17.2




[Qemu-devel] [PATCH 1/2] i386: kvm: Disable arch_capabilities if MSR can't be set

2019-01-25 Thread Eduardo Habkost
KVM has two bugs in the handling of MSR_IA32_ARCH_CAPABILITIES:

1) Linux commit commit 1eaafe91a0df ("kvm: x86: IA32_ARCH_CAPABILITIES
   is always supported") makes GET_SUPPORTED_CPUID return
   arch_capabilities even if running on SVM.  This makes "-cpu
   host,migratable=off" incorrectly expose arch_capabilities on CPUID on
   AMD hosts (where the MSR is not emulated by KVM).

2) KVM_GET_MSR_INDEX_LIST does not return MSR_IA32_ARCH_CAPABILITIES if
   the MSR is not supported by the host CPU.  This makes QEMU not
   initialize the MSR properly at kvm_put_msrs() on those hosts.

Work around both bugs on the QEMU side, by checking if the MSR
was returned by KVM_GET_MSR_INDEX_LIST before returning the
feature flag on kvm_arch_get_supported_cpuid().

This has the unfortunate side effect of making arch_capabilities
unavailable on hosts without hardware support for the MSR until bug #2
is fixed on KVM, but I can't see another way to work around bug #1
without that side effect.

Signed-off-by: Eduardo Habkost 
---
Cc: Konrad Rzeszutek Wilk 
Cc: Jim Mattson 
Cc: KarimAllah Ahmed 
Cc: David Woodhouse 
Cc: Darren Kenny 
---
 target/i386/kvm.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9af4542fb8..4fa3e3806a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -389,6 +389,15 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, 
uint32_t function,
 if (host_tsx_blacklisted()) {
 ret &= ~(CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_HLE);
 }
+} else if (function == 7 && index == 0 && reg == R_EDX) {
+/*
+ * Linux incorrectly v4.17-v4.20 return ARCH_CAPABILITIES on SVM.
+ * We can detect the bug by checking if MSR_IA32_ARCH_CAPABILITIES is
+ * returned by KVM_GET_MSR_INDEX_LIST.
+ */
+if (!has_msr_arch_capabs) {
+ret &= ~CPUID_7_0_EDX_ARCH_CAPABILITIES;
+}
 } else if (function == 0x8001 && reg == R_ECX) {
 /*
  * It's safe to enable TOPOEXT even if it's not returned by
-- 
2.18.0.rc1.1.g3f1ff2140




[Qemu-devel] [PULL 7/8] ide/via: Rename functions to match device name

2019-01-25 Thread John Snow
From: BALATON Zoltan 

The device is called via-ide and the modelled IDE controller is not
specific to 82C686B but is also usable independently. Therefore, change
function name prefixes accordingly to match device name.

Signed-off-by: BALATON Zoltan 
Message-id: 
2905ced862c8d2ad509d73152171ce2472d72605.1548160772.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/via.c| 15 +++
 hw/mips/mips_fulong2e.c |  2 +-
 include/hw/ide.h|  2 +-
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index 46cac7b8d6..c6e43a8812 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -110,7 +110,7 @@ static void bmdma_setup_bar(PCIIDEState *d)
 }
 }
 
-static void via_reset(void *opaque)
+static void via_ide_reset(void *opaque)
 {
 PCIIDEState *d = opaque;
 PCIDevice *pd = PCI_DEVICE(d);
@@ -152,8 +152,7 @@ static void via_reset(void *opaque)
 pci_set_long(pci_conf + 0xc0, 0x00020001);
 }
 
-/* via ide func */
-static void vt82c686b_ide_realize(PCIDevice *dev, Error **errp)
+static void via_ide_realize(PCIDevice *dev, Error **errp)
 {
 PCIIDEState *d = PCI_IDE(dev);
 uint8_t *pci_conf = dev->config;
@@ -162,7 +161,7 @@ static void vt82c686b_ide_realize(PCIDevice *dev, Error 
**errp)
 pci_config_set_prog_interface(pci_conf, 0x8a); /* legacy ATA mode */
 pci_set_long(pci_conf + PCI_CAPABILITY_LIST, 0x00c0);
 
-qemu_register_reset(via_reset, d);
+qemu_register_reset(via_ide_reset, d);
 bmdma_setup_bar(d);
 pci_register_bar(dev, 4, PCI_BASE_ADDRESS_SPACE_IO, &d->bmdma_bar);
 
@@ -180,7 +179,7 @@ static void vt82c686b_ide_realize(PCIDevice *dev, Error 
**errp)
 }
 }
 
-static void vt82c686b_ide_exitfn(PCIDevice *dev)
+static void via_ide_exitfn(PCIDevice *dev)
 {
 PCIIDEState *d = PCI_IDE(dev);
 unsigned i;
@@ -191,7 +190,7 @@ static void vt82c686b_ide_exitfn(PCIDevice *dev)
 }
 }
 
-void vt82c686b_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn)
+void via_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn)
 {
 PCIDevice *dev;
 
@@ -204,8 +203,8 @@ static void via_ide_class_init(ObjectClass *klass, void 
*data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->realize = vt82c686b_ide_realize;
-k->exit = vt82c686b_ide_exitfn;
+k->realize = via_ide_realize;
+k->exit = via_ide_exitfn;
 k->vendor_id = PCI_VENDOR_ID_VIA;
 k->device_id = PCI_DEVICE_ID_VIA_IDE;
 k->revision = 0x06;
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index 2fbba32c48..42d09f6892 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -249,7 +249,7 @@ static void vt82c686b_southbridge_init(PCIBus *pci_bus, int 
slot, qemu_irq intc,
 isa_create_simple(isa_bus, TYPE_VT82C686B_SUPERIO);
 
 ide_drive_get(hd, ARRAY_SIZE(hd));
-vt82c686b_ide_init(pci_bus, hd, PCI_DEVFN(slot, 1));
+via_ide_init(pci_bus, hd, PCI_DEVFN(slot, 1));
 
 pci_create_simple(pci_bus, PCI_DEVFN(slot, 2), "vt82c686b-usb-uhci");
 pci_create_simple(pci_bus, PCI_DEVFN(slot, 3), "vt82c686b-usb-uhci");
diff --git a/include/hw/ide.h b/include/hw/ide.h
index 3ae087c572..28d8a06439 100644
--- a/include/hw/ide.h
+++ b/include/hw/ide.h
@@ -18,7 +18,7 @@ PCIDevice *pci_piix3_xen_ide_init(PCIBus *bus, DriveInfo 
**hd_table, int devfn);
 PCIDevice *pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
 PCIDevice *pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
 int pci_piix3_xen_ide_unplug(DeviceState *dev, bool aux);
-void vt82c686b_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
+void via_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn);
 
 /* ide-mmio.c */
 void mmio_ide_init_drives(DeviceState *dev, DriveInfo *hd0, DriveInfo *hd1);
-- 
2.17.2




[Qemu-devel] [PATCH 2/2] i386: Make arch_capabilities migratable

2019-01-25 Thread Eduardo Habkost
Now that kvm_arch_get_supported_cpuid() will only return
arch_capabilities if QEMU is able to initialize the MSR properly,
we know that the feature is safely migratable.

Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2f5412592d..3ff91d794d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1088,7 +1088,6 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 .reg = R_EDX,
 },
 .tcg_features = TCG_7_0_EDX_FEATURES,
-.unmigratable_flags = CPUID_7_0_EDX_ARCH_CAPABILITIES,
 },
 [FEAT_8000_0007_EDX] = {
 .type = CPUID_FEATURE_WORD,
-- 
2.18.0.rc1.1.g3f1ff2140




[Qemu-devel] [PULL 1/8] cmd646: Remove unused variable

2019-01-25 Thread John Snow
From: BALATON Zoltan 

There was a pointer to PCIIDEState in CMD646BAR which was set but
not used afterwards. Get rid of this unused variable.

Signed-off-by: BALATON Zoltan 
Tested-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Message-id: 
1e352f091aa601fb2e19771aac46529fe278dd91.1547166960.git.bala...@eik.bme.hu
Signed-off-by: John Snow 
---
 hw/ide/cmd646.c  | 1 -
 include/hw/ide/pci.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index 6bb92d717f..41c1831f9a 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -123,7 +123,6 @@ static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
 CMD646BAR *bar = &d->cmd646_bar[bus_num];
 
 bar->bus = bus;
-bar->pci_dev = d;
 memory_region_init_io(&bar->cmd, OBJECT(d), &cmd646_cmd_ops, bar,
   "cmd646-cmd", 4);
 memory_region_init_io(&bar->data, OBJECT(d), &cmd646_data_ops, bar,
diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
index dbc6a0383d..ed723acfb4 100644
--- a/include/hw/ide/pci.h
+++ b/include/hw/ide/pci.h
@@ -41,7 +41,6 @@ typedef struct CMD646BAR {
 MemoryRegion cmd;
 MemoryRegion data;
 IDEBus *bus;
-struct PCIIDEState *pci_dev;
 } CMD646BAR;
 
 #define TYPE_PCI_IDE "pci-ide"
-- 
2.17.2




[Qemu-devel] [PATCH 0/2] i386: arch_capabilities fixes + migratability

2019-01-25 Thread Eduardo Habkost
This series works around KVM bugs that affect the arch_capabilities
feature.  One bug made the feature be enabled incorrect on AMD hosts,
and another one made the feature unsafe to enable on most Intel hosts.
With the work around, we can finally make arch_capabilities a migratable
feature.

Unfortunately, the work around has the side effect of making
arch_capabilities unavailable on hosts without hardware support for the
feature until one of the KVM bugs is fixed.

Eduardo Habkost (2):
  i386: kvm: Disable arch_capabilities if MSR can't be set
  i386: Make arch_capabilities migratable

 target/i386/cpu.c | 1 -
 target/i386/kvm.c | 9 +
 2 files changed, 9 insertions(+), 1 deletion(-)

-- 
2.18.0.rc1.1.g3f1ff2140




Re: [Qemu-devel] [RFC PATCH] ahci-test: Add dependency to qemu-img tool

2019-01-25 Thread John Snow



On 1/25/19 3:34 PM, Philippe Mathieu-Daudé wrote:
> Since the ahci-test uses qemu-img, add a dependency to build it
> before using it.
> This fixes:
> 
>   $ gmake check-qtest V=1
>   QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img 
> tests/ahci-test
>   Failed to execute child process "/tmp/qemu-test.19tMRF/qemu-img" (No such 
> file or directory)
>   ERROR:tests/libqos/libqos.c:192:mkimg: assertion failed: (ret && !err)
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because while this dependency is valid, I don't think this is the
> clever way to solve this problem (which is, assuming the host
> distribution has the qemu-tools installed).
> I guess remember a thread about it (Eric, John?) where it was asked
> "What do we want to test, qemu-img or AHCI? Can we trust an unstable
> version of a tool to verify a device?"
> 

Yeah, these discussions have a great track record of preventing us from
fixing problems.

Here's my two cents: "make check" already exports QTEST_QEMU_IMG as the
one we're building and gets passed to ahci-test -- might as well make
the dependency explicit in that same way.

> Slighly related is when vm-tests expect qemu-img available:
> https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08415.html
> 
>   $ make vm-build-ubuntu.i386
>   Traceback (most recent call last):
> File "source/qemu/tests/vm/basevm.py", line 236, in main
>   return vm.build_image(args.image)
> File "tests/vm/ubuntu.i386", line 67, in build_image
>   subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
>   OSError: [Errno 2] No such file or directory
>   tests/vm/Makefile.include:23: recipe for target 'tests/vm/ubuntu.i386.img' 
> failed
>   make: *** [tests/vm/ubuntu.i386.img] Error 2
> 
> A better fix would be checking those tools via ./configure...
> ---
>  tests/Makefile.include | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 19b4c0a696..5e03416c81 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -706,7 +706,7 @@ tests/prom-env-test$(EXESUF): tests/prom-env-test.o 
> $(libqos-obj-y)
>  tests/rtas-test$(EXESUF): tests/rtas-test.o $(libqos-spapr-obj-y)
>  tests/fdc-test$(EXESUF): tests/fdc-test.o
>  tests/ide-test$(EXESUF): tests/ide-test.o $(libqos-pc-obj-y)
> -tests/ahci-test$(EXESUF): tests/ahci-test.o $(libqos-pc-obj-y)
> +tests/ahci-test$(EXESUF): tests/ahci-test.o $(libqos-pc-obj-y) 
> qemu-img$(EXESUF)
>  tests/ipmi-kcs-test$(EXESUF): tests/ipmi-kcs-test.o
>  tests/ipmi-bt-test$(EXESUF): tests/ipmi-bt-test.o
>  tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
> 

Reviewed-by: John Snow 



[Qemu-devel] [PATCH] acpi: Make TPM 2.0 with TIS available as MSFT0101

2019-01-25 Thread Stefan Berger
This patch makes the a TPM 2.0 with TIS interface available under the
HID 'MSF0101'. This is supported by Linux and also Windows now
recognizes the TPM 2.0 with TIS interface. Leave the TPM 1.2 as before.

Signed-off-by: Stefan Berger 
---
 hw/i386/acpi-build.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2e21a31f82..f51225b4a7 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2141,8 +2141,16 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_append_pci_bus_devices(scope, bus, pm->pcihp_bridge_en);
 
 if (TPM_IS_TIS(tpm)) {
-dev = aml_device("ISA.TPM");
-aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31")));
+if (misc->tpm_version == TPM_VERSION_2_0) {
+dev = aml_device("TPM");
+aml_append(dev, aml_name_decl("_HID",
+  aml_string("MSFT0101")));
+} else {
+dev = aml_device("ISA.TPM");
+aml_append(dev, aml_name_decl("_HID",
+  aml_eisaid("PNP0C31")));
+}
+
 aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
 crs = aml_resource_template();
 aml_append(crs, aml_memory32_fixed(TPM_TIS_ADDR_BASE,
-- 
2.20.1




Re: [Qemu-devel] [PATCH v2 5/5] vfio-ccw: add handling for async channel instructions

2019-01-25 Thread Eric Farman




On 01/21/2019 06:03 AM, Cornelia Huck wrote:

Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels
as for ssch.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/Makefile   |   3 +-
  drivers/s390/cio/vfio_ccw_async.c   |  91 ++
  drivers/s390/cio/vfio_ccw_drv.c |  45 +++
  drivers/s390/cio/vfio_ccw_fsm.c | 114 +++-
  drivers/s390/cio/vfio_ccw_ops.c |  13 +++-
  drivers/s390/cio/vfio_ccw_private.h |   9 ++-
  include/uapi/linux/vfio.h   |   2 +
  include/uapi/linux/vfio_ccw.h   |  12 +++
  8 files changed, 269 insertions(+), 20 deletions(-)
  create mode 100644 drivers/s390/cio/vfio_ccw_async.c

diff --git a/drivers/s390/cio/Makefile b/drivers/s390/cio/Makefile
index f230516abb96..f6a8db04177c 100644
--- a/drivers/s390/cio/Makefile
+++ b/drivers/s390/cio/Makefile
@@ -20,5 +20,6 @@ obj-$(CONFIG_CCWGROUP) += ccwgroup.o
  qdio-objs := qdio_main.o qdio_thinint.o qdio_debug.o qdio_setup.o
  obj-$(CONFIG_QDIO) += qdio.o
  
-vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o

+vfio_ccw-objs += vfio_ccw_drv.o vfio_ccw_cp.o vfio_ccw_ops.o vfio_ccw_fsm.o \
+   vfio_ccw_async.o
  obj-$(CONFIG_VFIO_CCW) += vfio_ccw.o
diff --git a/drivers/s390/cio/vfio_ccw_async.c 
b/drivers/s390/cio/vfio_ccw_async.c
new file mode 100644
index ..604806c2970f
--- /dev/null
+++ b/drivers/s390/cio/vfio_ccw_async.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async I/O region for vfio_ccw
+ *
+ * Copyright Red Hat, Inc. 2019
+ *
+ * Author(s): Cornelia Huck 
+ */
+
+#include 
+#include 
+
+#include "vfio_ccw_private.h"
+
+static ssize_t vfio_ccw_async_region_read(struct vfio_ccw_private *private,
+ char __user *buf, size_t count,
+ loff_t *ppos)
+{
+   unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+   loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+   struct ccw_cmd_region *region;
+   int ret;
+
+   if (pos + count > sizeof(*region))
+   return -EINVAL;
+
+   mutex_lock(&private->io_mutex);
+   region = private->region[i].data;
+   if (copy_to_user(buf, (void *)region + pos, count))
+   ret = -EFAULT;
+   else
+   ret = count;
+   mutex_unlock(&private->io_mutex);
+   return ret;
+}
+
+static ssize_t vfio_ccw_async_region_write(struct vfio_ccw_private *private,
+  const char __user *buf, size_t count,
+  loff_t *ppos)
+{
+   unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - VFIO_CCW_NUM_REGIONS;
+   loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK;
+   struct ccw_cmd_region *region;
+   int ret;
+
+   if (pos + count > sizeof(*region))
+   return -EINVAL;
+
+   if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+   private->state == VFIO_CCW_STATE_STANDBY)
+   return -EACCES;
+   if (!mutex_trylock(&private->io_mutex))
+   return -EAGAIN;
+
+   region = private->region[i].data;
+   if (copy_from_user((void *)region + pos, buf, count)) {
+   ret = -EFAULT;
+   goto out_unlock;
+   }
+
+   vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_ASYNC_REQ);
+
+   ret = region->ret_code ? region->ret_code : count;
+
+out_unlock:
+   mutex_unlock(&private->io_mutex);
+   return ret;
+}
+
+static void vfio_ccw_async_region_release(struct vfio_ccw_private *private,
+ struct vfio_ccw_region *region)
+{
+
+}
+
+const struct vfio_ccw_regops vfio_ccw_async_region_ops = {
+   .read = vfio_ccw_async_region_read,
+   .write = vfio_ccw_async_region_write,
+   .release = vfio_ccw_async_region_release,
+};
+
+int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private)
+{
+   return vfio_ccw_register_dev_region(private,
+   VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD,
+   &vfio_ccw_async_region_ops,
+   sizeof(struct ccw_cmd_region),
+   VFIO_REGION_INFO_FLAG_READ |
+   VFIO_REGION_INFO_FLAG_WRITE,
+   private->cmd_region);
+}
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 2ef189fe45ed..d807911b8ed5 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -3,9 +3,11 @@
   * VFIO based Physical Subchannel device driver
   *
   * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
   *

Re: [Qemu-devel] [PATCH v2 3/5] vfio-ccw: add capabilities chain

2019-01-25 Thread Eric Farman



On 01/25/2019 11:19 AM, Eric Farman wrote:



On 01/21/2019 06:03 AM, Cornelia Huck wrote:

Allow to extend the regions used by vfio-ccw. The first user will be
handling of halt and clear subchannel.

Signed-off-by: Cornelia Huck 
---
  drivers/s390/cio/vfio_ccw_ops.c | 181 
  drivers/s390/cio/vfio_ccw_private.h |  38 ++

...snip...
diff --git a/drivers/s390/cio/vfio_ccw_private.h 
b/drivers/s390/cio/vfio_ccw_private.h

index e88237697f83..20e75f4f3695 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -3,9 +3,11 @@
   * Private stuff for vfio_ccw driver
   *
   * Copyright IBM Corp. 2017
+ * Copyright Red Hat, Inc. 2019
   *
   * Author(s): Dong Jia Shi 
   *    Xiao Feng Ren 
+ *    Cornelia Huck 
   */
  #ifndef _VFIO_CCW_PRIVATE_H_
@@ -19,6 +21,38 @@
  #include "css.h"
  #include "vfio_ccw_cp.h"
+#define VFIO_CCW_OFFSET_SHIFT   40
+#define VFIO_CCW_OFFSET_TO_INDEX(off)    (off >> VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_INDEX_TO_OFFSET(index)    ((u64)(index) << 
VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_OFFSET_MASK    (((u64)(1) << VFIO_CCW_OFFSET_SHIFT) 
- 1)

+
+/* capability chain handling similar to vfio-pci */
+struct vfio_ccw_private;
+struct vfio_ccw_region;
+
+struct vfio_ccw_regops {
+    size_t    (*read)(struct vfio_ccw_private *private, char __user 
*buf,

+    size_t count, loff_t *ppos);
+    size_t    (*write)(struct vfio_ccw_private *private,
+ const char __user *buf, size_t count, loff_t *ppos);


Oops.  Per my recommendation on v1, you change these to "ssize_t" in 
patch 5.  Might as well just do that here.



+    void    (*release)(struct vfio_ccw_private *private,
+   struct vfio_ccw_region *region);
+};
+
+struct vfio_ccw_region {
+    u32    type;
+    u32    subtype;
+    const struct vfio_ccw_regops    *ops;
+    void    *data;
+    size_t    size;
+    u32    flags;
+};
+
+int vfio_ccw_register_dev_region(struct vfio_ccw_private *private,
+ unsigned int subtype,
+ const struct vfio_ccw_regops *ops,
+ size_t size, u32 flags, void *data);
+
  /**
   * struct vfio_ccw_private
   * @sch: pointer to the subchannel

...snip...

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 02bb7ad6e986..56e2413d3e00 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -353,6 +353,8 @@ struct vfio_region_gfx_edid {
  #define VFIO_DEVICE_GFX_LINK_STATE_DOWN  2
  };
+#define VFIO_REGION_TYPE_CCW    (2)
+


Cool.  :)


  /*
   * 10de vendor sub-type
   *



Looks fine to me.  I'd love to think there was a way to generalize this 
for other vfio drivers, but man that's a tall task.  So...


With the ssize_t fixup from patch 5...



Reviewed-by: Eric Farman 




Re: [Qemu-devel] [PATCH 0/3] scsi-disk: Device Identification fixes

2019-01-25 Thread Eric Blake
On 1/25/19 11:46 AM, Kevin Wolf wrote:
> The vendor specific designator in the Device Identification VPD page has
> two problems:
> 
> 1. It defaults to the BlockBackend name (-drive id=...), which everyone
>expected to be a host detail that the guest never sees
> 
> 2. With -blockdev based setups it defaults to an empty string; if this
>default is used with more than one disk, the guest OS will interpret
>this as a single multipath disk.
> 
> We can address problem 2 immediately, and start running the deprecation
> clock for problem 1.
> 
> Related bug reports:
> https://bugzilla.redhat.com/show_bug.cgi?id=1669446
> https://bugzilla.redhat.com/show_bug.cgi?id=1668248
> 
> Kevin Wolf (3):
>   scsi-disk: Don't use empty string as device id
>   scsi-disk: Add device_id property
>   scsi-disk: Deprecate device_id fallback to BlockBackend name

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v6 1/2] qemu-io: Use error_[gs]et_progname()

2019-01-25 Thread Eric Blake
On 1/25/19 11:22 AM, Christophe Fergeau wrote:

Don't forget the 0/2 cover letter for series :)

> qemu-io reimplements itself what
> error_get_progname()/error_set_progname() already does.
> This commit switches it to use this API from qemu-error.h
> 
> Signed-off-by: Christophe Fergeau 
> ---
>  qemu-io.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [RFC PATCH] ahci-test: Add dependency to qemu-img tool

2019-01-25 Thread Philippe Mathieu-Daudé
Since the ahci-test uses qemu-img, add a dependency to build it
before using it.
This fixes:

  $ gmake check-qtest V=1
  QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img 
tests/ahci-test
  Failed to execute child process "/tmp/qemu-test.19tMRF/qemu-img" (No such 
file or directory)
  ERROR:tests/libqos/libqos.c:192:mkimg: assertion failed: (ret && !err)

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC because while this dependency is valid, I don't think this is the
clever way to solve this problem (which is, assuming the host
distribution has the qemu-tools installed).
I guess remember a thread about it (Eric, John?) where it was asked
"What do we want to test, qemu-img or AHCI? Can we trust an unstable
version of a tool to verify a device?"

Slighly related is when vm-tests expect qemu-img available:
https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08415.html

  $ make vm-build-ubuntu.i386
  Traceback (most recent call last):
File "source/qemu/tests/vm/basevm.py", line 236, in main
  return vm.build_image(args.image)
File "tests/vm/ubuntu.i386", line 67, in build_image
  subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
  OSError: [Errno 2] No such file or directory
  tests/vm/Makefile.include:23: recipe for target 'tests/vm/ubuntu.i386.img' 
failed
  make: *** [tests/vm/ubuntu.i386.img] Error 2

A better fix would be checking those tools via ./configure...
---
 tests/Makefile.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 19b4c0a696..5e03416c81 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -706,7 +706,7 @@ tests/prom-env-test$(EXESUF): tests/prom-env-test.o 
$(libqos-obj-y)
 tests/rtas-test$(EXESUF): tests/rtas-test.o $(libqos-spapr-obj-y)
 tests/fdc-test$(EXESUF): tests/fdc-test.o
 tests/ide-test$(EXESUF): tests/ide-test.o $(libqos-pc-obj-y)
-tests/ahci-test$(EXESUF): tests/ahci-test.o $(libqos-pc-obj-y)
+tests/ahci-test$(EXESUF): tests/ahci-test.o $(libqos-pc-obj-y) 
qemu-img$(EXESUF)
 tests/ipmi-kcs-test$(EXESUF): tests/ipmi-kcs-test.o
 tests/ipmi-bt-test$(EXESUF): tests/ipmi-bt-test.o
 tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
-- 
2.20.1




Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-25 Thread Eric Farman




On 01/25/2019 05:24 AM, Cornelia Huck wrote:

On Thu, 24 Jan 2019 21:37:44 -0500
Eric Farman  wrote:


On 01/24/2019 09:25 PM, Eric Farman wrote:



On 01/21/2019 06:03 AM, Cornelia Huck wrote:



[1] I think these changes are cool.  We end up going into (and staying
in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we
bumble along.

But why can't these be separated out from this patch?  It does change
the behavior of the state machine, and seem distinct from the addition
of the mutex you otherwise add here?  At the very least, this behavior
change should be documented in the commit since it's otherwise lost in
the mutex/EAGAIN stuff.


That's a very good idea. I'll factor them out into a separate patch.

   

   trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
  io_region->ret_code, errstr);
   }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c
b/drivers/s390/cio/vfio_ccw_ops.c
index f673e106c041..3fa9fc570400 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct
mdev_device *mdev,
   {
   struct vfio_ccw_private *private;
   struct ccw_io_region *region;
+    int ret;
   if (*ppos + count > sizeof(*region))
   return -EINVAL;
   private = dev_get_drvdata(mdev_parent_dev(mdev));
+    mutex_lock(&private->io_mutex);
   region = private->io_region;
   if (copy_to_user(buf, (void *)region + *ppos, count))
-    return -EFAULT;
-
-    return count;
+    ret = -EFAULT;
+    else
+    ret = count;
+    mutex_unlock(&private->io_mutex);
+    return ret;
   }
   static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
@@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct
mdev_device *mdev,
   {
   struct vfio_ccw_private *private;
   struct ccw_io_region *region;
+    int ret;
   if (*ppos + count > sizeof(*region))
   return -EINVAL;
   private = dev_get_drvdata(mdev_parent_dev(mdev));
-    if (private->state != VFIO_CCW_STATE_IDLE)
+    if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+    private->state == VFIO_CCW_STATE_STANDBY)
   return -EACCES;
+    if (!mutex_trylock(&private->io_mutex))
+    return -EAGAIN;


Ah, I see Halil's difficulty here.

It is true there is a race condition today, and that this doesn't
address it.  That's fine, add it to the todo list.  But even with that,
I don't see what the mutex is enforcing?  Two simultaneous SSCHs will be
serialized (one will get kicked out with a failed trylock() call), while
still leaving the window open between cc=0 on the SSCH and the
subsequent interrupt.  In the latter case, a second SSCH will come
through here, do the copy_from_user below, and then jump to fsm_io_busy
to return EAGAIN.  Do we really want to stomp on io_region in that case?
   Why can't we simply return EAGAIN if state==BUSY?


(Answering my own questions as I skim patch 5...)

Because of course this series is for async handling, while I was looking
specifically at the synchronous code that exists today.  I guess then my
question just remains on how the mutex is adding protection in the sync
case, because that's still not apparent to me.  (Perhaps I missed it in
a reply to Halil; if so I apologize, there were a lot when I returned.)


My idea behind the mutex was to make sure that we get consistent data
when reading/writing (e.g. if one user space thread is reading the I/O
region while another is writing to it).



And from that angle, this accomplishes that.  It just wasn't apparent to 
me at first.


I'm still not certain of how we handle mdev_write when state=BUSY, so 
let me ask my question a different way...


If we come into mdev_write with state=BUSY and we get the lock, 
copy_from_user, and do our jump table we go to fsm_io_busy to set 
ret_code and return -EAGAIN.  Why then don't we set the jump table for 
state=NOT_OPER||STANDBY to do something that will return -EACCES instead 
of how we currently do a direct return of -EACCES before all the 
lock/copy stuff (and the jump table that would take us to fsm_io_error 
and an error message before returning -EIO)?





Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

2019-01-25 Thread Eric Farman




On 01/25/2019 07:58 AM, Halil Pasic wrote:

On Thu, 24 Jan 2019 21:25:10 -0500
Eric Farman  wrote:


private = dev_get_drvdata(mdev_parent_dev(mdev));
-   if (private->state != VFIO_CCW_STATE_IDLE)
+   if (private->state == VFIO_CCW_STATE_NOT_OPER ||
+   private->state == VFIO_CCW_STATE_STANDBY)
return -EACCES;
+   if (!mutex_trylock(&private->io_mutex))
+   return -EAGAIN;


Ah, I see Halil's difficulty here.

It is true there is a race condition today, and that this doesn't
address it.  That's fine, add it to the todo list.  But even with that,
I don't see what the mutex is enforcing?


It is protecting the io regions. AFAIU the idea was that only one
thread is accessing the io region(s) at a time to prevent corruption and
reading half-morphed data.


Two simultaneous SSCHs will be
serialized (one will get kicked out with a failed trylock() call), while
still leaving the window open between cc=0 on the SSCH and the
subsequent interrupt.  In the latter case, a second SSCH will come
through here, do the copy_from_user below, and then jump to fsm_io_busy
to return EAGAIN.  Do we really want to stomp on io_region in that case?


I'm not sure I understood you correctly. The interrupt handler does not
take the lock before writing to the io_region. That is one race but it is
easy to fix.

The bigger problem is that between the interrupt handler has written IRB
area and userspace has read it we may end up destroying it by stomping on
it (to use your words). The userspace reading a wrong (given todays qemu
zeroed out) IRB could lead to follow on problems.


I wasn't thinking about a race between the start and interrupt handler, 
but rather between two near-simultaneous starts.  Looking at it more 
closely, the orb and scsw structs as well as the ret_code field in 
ccw_io_region are only referenced under the protection of the new mutex 
(within fsm_io_request, for example), which I guess is the point.


So that leaves us with just the irb fields, which you'd mentioned a 
couple days ago (and which I was trying to ignore since it'd seems to 
have been discussed enough at the time).  So I withdraw my concerns on 
this point.  For now.  ;-)


  

   Why can't we simply return EAGAIN if state==BUSY?



Sure we can. That would essentially go back to the old way of things:
if not idle return with error. 


I think this happens both before and after this series.  With this 
series, we just update the io_region with things that are never used 
because we're busy.


Just the error code returned would change

form EACCESS to EAGAIN. Which Isn't necessarily a win, because
conceptually here should be never two interleaved io_requests/start
commands hitting the module.


   
   	region = private->io_region;

-   if (copy_from_user((void *)region + *ppos, buf, count))
-   return -EFAULT;
+   if (copy_from_user((void *)region + *ppos, buf, count)) {
+   ret = -EFAULT;
+   goto out_unlock;
+   }







Re: [Qemu-devel] [PATCH 04/13] tcg/aarch64: enable dynamic TLB sizing

2019-01-25 Thread Richard Henderson
On 1/25/19 11:12 AM, Alex Bennée wrote:
>> +if (table_ofs > 0xfff) {
>> +int table_hi = table_ofs & ~0xfff;
>> +int mask_hi = mask_ofs & ~0xfff;
> 
> Isn't there a #define for this number here?

No.  I don't know what I'd call it, either.
You're just Supposed to Know that arm memory offsets are 12 bits.  ;-P


r~



Re: [Qemu-devel] [PATCH] hw/virtio: Use CONFIG_VIRTIO_PCI switch instead of CONFIG_PCI

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 01:56:00PM +0100, Thomas Huth wrote:
> For downstream s390x builds, we'd like to be able to build QEMU with
> CONFIG_VIRTIO_PCI disabled (since virtio-ccw is used here instead),
> but still with CONFIG_PCI enabled. This currently fails since the
> virtio-*-pci.o files are still included in the build, but virtio-pci.o
> is missing. Use the right config switch CONFIG_VIRTIO_PCI to exclude
> the virtio-*-pci.o files from the build.
> 
> Reported-by: Miroslav Rezanina 
> Signed-off-by: Thomas Huth 
> ---
>  hw/virtio/Makefile.objs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> index ea7913d..d335dd0 100644
> --- a/hw/virtio/Makefile.objs
> +++ b/hw/virtio/Makefile.objs
> @@ -11,7 +11,7 @@ obj-$(call 
> land,$(CONFIG_VIRTIO_CRYPTO),$(CONFIG_VIRTIO_PCI)) += virtio-crypto-p
>  
>  obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
>  obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
> -ifeq ($(CONFIG_PCI),y)
> +ifeq ($(CONFIG_VIRTIO_PCI),y)
>  obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock-pci.o
>  obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk-pci.o
>  obj-$(CONFIG_VHOST_USER_SCSI) += vhost-user-scsi-pci.o

Makes sense will queue.

> -- 
> 1.8.3.1



Re: [Qemu-devel] [PATCH 22/52] hw/hppa/Makefile.objs: Create CONFIG_* for hppa

2019-01-25 Thread Richard Henderson
On 1/25/19 2:06 AM, Paolo Bonzini wrote:
> From: Yang Zhong 
> 
> Add the new configs to default-configs/hppa-sofmmu.mak.
> 
> Signed-off-by: Yang Zhong 
> Message-Id: <20190123065618.3520-19-yang.zh...@intel.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  default-configs/hppa-softmmu.mak | 1 +
>  hw/hppa/Makefile.objs| 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 01:26:53AM -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 10:08:37PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 24, 2019 at 05:14:43PM -0200, Eduardo Habkost wrote:
> > > On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > > > > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > > > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > > > > From: Zhang Yi 
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Zhang Yi 
> > > > > [...]
> > > > > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > > > > +   The backend is a file supporting DAX, e.g., a file on 
> > > > > > > > > > an ext4 or
> > > > > > > > > > +   xfs file system mounted with '-o dax'. if your pmem=on 
> > > > > > > > > > ,but the backend is
> > > > > > > > > > +   not a file supporting DAX, mapping with this flag 
> > > > > > > > > > results in an EOPNOTSUPP
> > > > > > > > > > +   error.
> > > > > > > > > 
> > > > > > > > > Won't this break existing configurations that work today on 
> > > > > > > > > QEMU
> > > > > > > > > 3.1.0?  Why exactly it is OK to break compatibility here?
> > > > > > > > won't, pmem option default is off, if people who start VM don't 
> > > > > > > > know what
> > > > > > > > backend file is, it is suggested and *default to set pmem=off,
> > > > > > > > if people well know the backend file have dax capbility. it is 
> > > > > > > > suggest
> > > > > > > > to set pmem=on. 
> > > > > > > > 
> > > > > > > > For a special case that we use /dev/dax as backend, we already 
> > > > > > > > have a
> > > > > > > > patch to add MAP_SYNC falg mapiing from device dax mode.
> > > > > > > > see https://lkml.org/lkml/2018/4/22/524 
> > > > > > > > 
> > > > > > > > So, if people force set pmem=on, mapping a regular file, it 
> > > > > > > > will results
> > > > > > > > in an EOPNOTSUPP error. 
> > > > > > > 
> > > > > > > This is where compatibility is being broken, isn't it?  People
> > > > > > > currently using pmem=on on a regular file will start getting
> > > > > > > errors after a QEMU upgrade.  Existing VMs with pmem=on may stop
> > > > > > > booting.  Maybe this is OK, but we need to be able to explain why
> > > > > > > it is OK.
> > > > > > 
> > > > > > I think it's OK since pmem explicitly means "persistent":
> > > > > > 
> > > > > > The @option{pmem} option specifies whether the backing file 
> > > > > > specified
> > > > > > by @option{mem-path} is in host persistent memory that can be 
> > > > > > accessed
> > > > > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > > > > If @option{pmem} is set to 'on', QEMU will take necessary 
> > > > > > operations to
> > > > > > guarantee the persistence of its own writes to @option{mem-path}
> > > > > > (e.g. in vNVDIMM label emulation and live migration).
> > > > > 
> > > > > If it's OK, let's at least explicitly document that we are
> > > > > breaking compatibility in those cases.
> > > > > 
> > > > > 
> > > > > > > > 
> > > > > [...]
> > > > > > I think generally MAP_SYNC is required.
> > > > > > But for compatibility reasons we might need to support
> > > > > > !MAP_SYNC on old kernels even though it's risky.
> > > > > 
> > > > > What about making MAP_SYNC optional only on older machine-types?
> > > > 
> > > > I don't think this makes sense. It's not a guest visible change,
> > > > machine types are for that.
> > > 
> > > Losing data written to persistent memory is surely guest-visible
> > > behavior.
> > 
> > I think we need not be purists here. Most people don't lose power and
> > then it's fine and compatible. People who want more robustness need to
> > use more modern kernels, that is all.
> 
> I don't think that's being purist.  I want to avoid hidden bugs
> if we ignore that MAP_SYNC failed for any unexpected reason.  If
> we need to ignore errors in some cases, let's at least limit that
> to cases where we absolutely have to.
> But I would also be happy with just a warning.

Makes sense to me. So if it fails with EOPNOTSUPP,
we try with MAP_SHARED_VALIDATE without MAP_SYNC.
If that succeeds then it's not a dax file, and we warn.
If it fails too then it's an old kernel and we
silently proceed for compatibility reasons.


> 
> -- 
> Eduardo



Re: [Qemu-devel] [PATCH 20/52] hw/alpha/Makefile.objs: Create CONFIG_* for alpha

2019-01-25 Thread Richard Henderson
On 1/25/19 2:06 AM, Paolo Bonzini wrote:
> From: Yang Zhong 
> 
> Add the new configs to default-configs/alpha-sofmmu.mak.
> 
> Signed-off-by: Yang Zhong 
> Message-Id: <20190123065618.3520-17-yang.zh...@intel.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  default-configs/alpha-softmmu.mak | 1 +
>  hw/alpha/Makefile.objs| 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH V10 4/4] docs: Added MAP_SYNC documentation

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 01:26:53AM -0200, Eduardo Habkost wrote:
> > I think we need not be purists here. Most people don't lose power and
> > then it's fine and compatible. People who want more robustness need to
> > use more modern kernels, that is all.
> 
> I don't think that's being purist.

Right I thought that you want to prevent kernel running this
on old kernels. Why it's nice to stay up to date this
flag was not backported into stable kernels so many users
can't really update.


> I want to avoid hidden bugs


Makes total sense, I posted a proposal.

> if we ignore that MAP_SYNC failed for any unexpected reason.  If
> we need to ignore errors in some cases, let's at least limit that
> to cases where we absolutely have to.
> 
> But I would also be happy with just a warning.
> 
> -- 
> Eduardo



Re: [Qemu-devel] [PATCH 47/52] vfio: express vfio dependencies with Kconfig

2019-01-25 Thread Alex Williamson
On Fri, 25 Jan 2019 11:07:06 +0100
Paolo Bonzini  wrote:

> Signed-off-by: Paolo Bonzini 
> ---
>  hw/vfio/Kconfig | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig
> index f896779..ebda9fd 100644
> --- a/hw/vfio/Kconfig
> +++ b/hw/vfio/Kconfig
> @@ -7,28 +7,30 @@ config VFIO_PCI
>  select VFIO
>  depends on LINUX
>  
> -config VFIO_SPAPR
> -bool
> -default y
> -depends on VFIO && LINUX && PSERIES


I can't say I really understand what happened with this through the
course of the series.  In patch 27 spapr.o became obj-y, VFIO_SPAPR
came about in patch 32, tweaked in patch 33, then removed in 47.  I was
really hoping the Makefile was going to reflect this as a config option
so we could follow-up with some patches to stub or ifdef out the
dependencies.  The remainder here seems to set the right precedent and
we can add VFIO_SPAPR back later and wire it through the Makefile.

Unfortunately with the full series applied I'm not able to make either
allnoconfig or defconfig:

$ make allnoconfig
  GEN x86_64-softmmu/config-devices.mak.tmp
  GEN x86_64-softmmu/config-devices.mak
  GEN config-all-devices.mak
CHK version_gen.h
CHK version_gen.h
rm */config-devices.mak config-all-devices.mak
make MINIKCONF="python -B /home/alwillia/Work/qemu.git/scripts/minikconf.py  
--" config-all-devices.mak
make[1]: Entering directory '/home/alwillia/Work/qemu.git'
  GEN x86_64-softmmu/config-devices.mak.tmp
/home/alwillia/Work/qemu.git/scripts/minikconf.py: invalid option --
CHK version_gen.h
make[1]: *** No rule to make target 'x86_64-softmmu/config-devices.mak', needed 
by 'config-all-devices.mak'.  Stop.
make[1]: Leaving directory '/home/alwillia/Work/qemu.git'
make: *** [Makefile:346: allnoconfig] Error 2

Something wrong with this expansion from patch 34 I guess:

.PHONY: allnoconfig defconfig
allnoconfig defconfig:
rm */config-devices.mak config-all-devices.mak
$(MAKE) MINIKCONF="$(MINIKCONF) --$<" config-all-devices.mak

Thanks,
Alex

> -
>  config VFIO_CCW
>  bool
> +default y
>  select VFIO
> -depends on LINUX
> +depends on LINUX && S390_CCW_VIRTIO
>  
>  config VFIO_PLATFORM
>  bool
> +default y
>  select VFIO
> -depends on LINUX
> +depends on LINUX && PLATFORM_BUS
>  
>  config VFIO_XGMAC
>  bool
> +default y
> +depends on VFIO_PLATFORM
>  
>  config VFIO_AMD_XGBE
>  bool
> +default y
> +depends on VFIO_PLATFORM
>  
>  config VFIO_AP
>  bool
> +default y
>  select VFIO
> -depends on LINUX
> +depends on LINUX && S390_CCW_VIRTIO




Re: [Qemu-devel] [PATCH] hw: input: set category of the i8042 device

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 11:30:10PM +0530, ksourav wrote:
> On Fri, Jan 25, 2019 at 10:14 PM Philippe Mathieu-Daudé
>  wrote:
> >
> > On 1/25/19 4:24 PM, Thomas Huth wrote:
> > > On 2019-01-25 16:14, kumar sourav wrote:
> > >> Sets the category of i8042 device as DEVICE_CATEGORY_INPUT
> > >> Devices should be assigned to one of DEVICE_CATEGORY_.
> > >>
> > >> Signed-off-by: kumar sourav 
> > >> ---
> > >>  hw/input/pckbd.c | 1 +
> >
> > It seems we have other potential cases:
> >
> > $ git grep -L 'set_bit(DEVICE_CATEGORY_INPUT' hw/input/*c
> > hw/input/adb.c
> > hw/input/hid.c
> > hw/input/lm832x.c
> > hw/input/milkymist-softusb.c
> > hw/input/pckbd.c
> > hw/input/pl050.c
> > hw/input/ps2.c
> > hw/input/pxa2xx_keypad.c
> > hw/input/stellaris_input.c
> > hw/input/tsc2005.c
> > hw/input/tsc210x.c
> > hw/input/virtio-input-hid.c
> > hw/input/virtio-input-host.c
> >
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
> > >> index 3e66713b47..72e7d5f6cc 100644
> > >> --- a/hw/input/pckbd.c
> > >> +++ b/hw/input/pckbd.c
> > >> @@ -574,6 +574,7 @@ static void i8042_class_initfn(ObjectClass *klass, 
> > >> void *data)
> > >>
> > >>  dc->realize = i8042_realizefn;
> > >>  dc->vmsd = &vmstate_kbd_isa;
> > >> +set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
> > >>  }
> > >>
> > >>  static const TypeInfo i8042_info = {
> > >>
> > >
> > > Reviewed-by: Thomas Huth 
> >
> > Reviewed-by: Philippe Mathieu-Daudé 
> >
> Ok I will add device categories for these devices too. Should I send
> the changes in one patch for all these devices/files ?
> Or should I send it separately - one patch per device/file ?
> 
> Thanks & Regards

Either is fine. One patch will be merged faster :)

-- 
MST



Re: [Qemu-devel] [PATCH] hw: input: set category of the i8042 device

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 08:44:40PM +0530, kumar sourav wrote:
> Sets the category of i8042 device as DEVICE_CATEGORY_INPUT
> Devices should be assigned to one of DEVICE_CATEGORY_.
> 
> Signed-off-by: kumar sourav 

Reviewed-by: Michael S. Tsirkin 

who's merging this? Paolo?

> ---
>  hw/input/pckbd.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
> index 3e66713b47..72e7d5f6cc 100644
> --- a/hw/input/pckbd.c
> +++ b/hw/input/pckbd.c
> @@ -574,6 +574,7 @@ static void i8042_class_initfn(ObjectClass *klass, void 
> *data)
>  
>  dc->realize = i8042_realizefn;
>  dc->vmsd = &vmstate_kbd_isa;
> +set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
>  }
>  
>  static const TypeInfo i8042_info = {
> -- 
> 2.17.1



Re: [Qemu-devel] of apci_1_compatible in CPUHotplugFeatures

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 09:26:05AM +0100, Igor Mammedov wrote:
> On Wed, 23 Jan 2019 18:28:59 +
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Tue, Jan 22, 2019 at 08:07:41PM +, Dr. David Alan Gilbert wrote:  
> > > > Hi,
> > > >   I noticed that the acpi_1_compatible flag was misspelt as
> > > >  apci_1_compatible
> > > > 
> > > > so have a trivial patch to fix that,
> pls post it.
> 
> > > > but looking at it - are
> > > > thre any cases where a[cp]i_1_compatible can possibly be false?
> ATM it's not possible, but I've wrote it with intent to reuse
> build_cpus_aml() in arm/virt board and there we shall use newer
> aml_device() instead of legacy aml_processor(), hence a feature flag
> to toggle behavior.

Are there imminent plans to do this? If not why don't we
drop unused code? One can always bring it back from git
history if need arises but for now there's no guarantee
it's not broken as it never runs.

> 
> > > > 
> > > > Dave  
> > > 
> > > legacy_cpu_hotplug so machine 2.6 and older - no?  
> > 
> > That doesn't seem to affect that flag by my reading;
> > the only place I see legacy_cpu_hotplug checked is
> > acpi-build.c:build_dsdt and we have:
> > 
> > if (pcmc->legacy_cpu_hotplug) {
> > build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base);
> > } else {
> > CPUHotplugFeatures opts = {
> > .acpi_1_compatible = true, .has_legacy_cphp = true
> > };
> > build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
> >"\\_SB.PCI0", "\\_GPE._E02");
> > }
> > 
> > so the 'opts' field is only used in the non-legacy case.
> > 
> > That's the only caller of build_cpus_aml, and I'm not seeing another
> > user of CPUHotplugFeatures.
> > 
> > Dave
> > 
> > > > --
> > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK  
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 3/3] ide/via: Implement and use native PCI IDE mode

2019-01-25 Thread John Snow



On 1/25/19 7:25 AM, BALATON Zoltan wrote:
> On Thu, 24 Jan 2019, BALATON Zoltan wrote:
>> On Wed, 23 Jan 2019, John Snow wrote:
>>> I guess this is technically an external change in behavior... I have no
>>> real read on if this will break anything for anyone, or if anyone was
>>> even using it.
>>
>> Currently nothing but the mips_fulong2e board seems to use this device
>> and I don't think there's anything even on that board that would
>> depend on (or would work with) legacy only IDE mode of this device but
>> I could not find a test image to try. That board seems to be quite
>> unfinished, I've tried to get it in better shape but haven't succeeded
>> yet. So I don't think this change in the IDE device would break
>> anything for anyone as it does not seem to be usable yet but I plan to
>> use this IDE part in subsequent patches and PCI native mode works better.
>>
>>> Any thoughts on why it's okay to switch?
>>
>> As said above it's unlikely anyone would depend on it now and all
>> drivers I know about prefer native mode anyway so likely it would work
>> better after this change. If not and it turns out someone is using the
>> current behaviour I can look at implementing switching between the two
>> modes but that would be more difficult (the ISA ports would need to be
>> mapped and unmapped based on bits in PCI_PROG_INTERFACE but there's no
>> API to do this currently, ide/core.c only has ide_init_ioport to add
>> mapping but not the counterpart to remove it; this could be
>> implemented but unless found to be needed it probably does not worth
>> it). So I suggest to switch based on that this is a quite unused part
>> that's not likely to break anything and if it still found to break
>> something I'll look at fixing it or it could be reverted, but I don't
>> want to spend time now on something that's not actually used.
> 
> I'd appreciated if this could be included now based on the above as I
> have some pathces in the works that will need this, unless there's a
> strong reason to not apply it.
> 
> Thank you,
> BALATON Zoltan

I'll stage it. If anyone disagrees we can roll it back before release if
we are breaking legacy behavior, but I really strongly suspect nobody is
relying on this in any kind of production environment, so I'll take the
risk to just merge the improvement.

--js



Re: [Qemu-devel] [PULL v2 07/49] util: check the return value of fcntl in qemu_set_{block, nonblock}

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 02:04:15PM -0500, Brad Smith wrote:
> On 1/25/2019 1:53 PM, Philippe Mathieu-Daudé wrote:
> 
> > Hi,
> > 
> > On 1/15/19 9:04 PM, Michael S. Tsirkin wrote:
> > > From: Li Qiang 
> > > 
> > > Assert that the return value is not an error. This is like commit
> > > 7e6478e7d4f for qemu_set_cloexec.
> > > 
> > > Signed-off-by: Li Qiang 
> > > Reviewed-by: Thomas Huth 
> > > Reviewed-by: Michael S. Tsirkin 
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >   util/oslib-posix.c | 8 ++--
> > >   1 file changed, 6 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> > > index c1bee2a581..4ce1ba9ca4 100644
> > > --- a/util/oslib-posix.c
> > > +++ b/util/oslib-posix.c
> > > @@ -233,14 +233,18 @@ void qemu_set_block(int fd)
> > >   {
> > >   int f;
> > >   f = fcntl(fd, F_GETFL);
> > > -fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
> > > +assert(f != -1);
> > > +f = fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
> > > +assert(f != -1);
> > >   }
> > >   void qemu_set_nonblock(int fd)
> > >   {
> > >   int f;
> > >   f = fcntl(fd, F_GETFL);
> > > -fcntl(fd, F_SETFL, f | O_NONBLOCK);
> > > +assert(f != -1);
> > > +f = fcntl(fd, F_SETFL, f | O_NONBLOCK);
> > > +assert(f != -1);
> > This commit breaks OpenBSD, when trying to start QEMU I get:
> > assertion "f != -1" failed: file "util/oslib-posix.c", line 247,
> > function "qemu_set_nonblock"
> > 
> > Having a quick look at gdb, the last device opened is /dev/null, and
> > when fcntl() fails we have errno = ENODEV.
> > 
> >  19 ENODEV Operation not supported by device.
> >  An attempt was made to apply an inappropriate function to a device,
> >  for example, trying to read a write-only device such as a printer.
> > 
> > Digging further I found a recent commit which could fix this problem:
> > https://github.com/openbsd/src/commit/c2a35b387f9d3c
> > "fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
> > the memory devices (/dev/null, /dev/zero, etc) need to permit them."
> > 
> > Brad: Do you think this might be the fix? If so, any idea what is the
> > first release to contain this fix? I don't know OpenBSD and can't figure
> > this out... Also, what would be the cleaner QEMU fix?
> 
> I don't know. But that commit was included with 6.3 or newer.

I'm quite prepared to revert that - the value of spreading
asserts around is marginal and if we don't set fd
even on a real file to non-blocking qemu doesn't explode -
it just hangs which isn't much worse than an assert.

Let me know.

-- 
MST



Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 03:12:45PM +, Stefan Hajnoczi wrote:
> On Fri, Jan 25, 2019 at 09:49:03AM +0100, Thomas Huth wrote:
> > On 2019-01-25 09:16, Stefano Garzarella wrote:
> > > On Fri, Jan 25, 2019 at 07:07:35AM +0100, Thomas Huth wrote:
> > >> On 2019-01-25 07:01, Thomas Huth wrote:
> > >>> On 2019-01-24 18:23, Stefano Garzarella wrote:
> >  If the WRITE_ZEROES feature is enabled, we check this
> >  command in the test_basic().
> > 
> >  Signed-off-by: Stefano Garzarella 
> >  ---
> >   tests/virtio-blk-test.c | 63 +
> >   1 file changed, 63 insertions(+)
> > 
> >  diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
> >  index 04c608764b..8cabbcb85a 100644
> >  --- a/tests/virtio-blk-test.c
> >  +++ b/tests/virtio-blk-test.c
> >  @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, 
> >  QGuestAllocator *alloc,
> >   
> >   guest_free(alloc, req_addr);
> >   
> >  +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
> >  +struct virtio_blk_discard_write_zeroes *dwz_hdr;
> >  +void *expected;
> >  +
> >  +/*
> >  + * WRITE_ZEROES request on the same sector of previous test 
> >  where
> >  + * we wrote "TEST".
> >  + */
> >  +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
> >  +req.data = g_malloc0(512);
> > >>>
> > >>> Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> > >>> something similar here, to see whether zeroes or 0xaa is written?
> > >>
> > >> Ah, never mind, I thought req.data would be a sector buffer here, but
> > >> looking at the lines below, it apparently is something different.
> > >>
> > >> Why do you allocate 512 bytes here? I'd rather expect
> > >> g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
> > >> then you could also use a local "struct virtio_blk_discard_write_zeroes
> > >> dwz_hdr" variable instead of a pointer, and drop the g_malloc0() 
> > >> completely?
> > >>
> > > 
> > > Hi Thomas,
> > > it was my initial implementation, but on the first test I discovered
> > > that virtio_blk_request() has an assert on the data_size and it requires
> > > a multiple of 512 bytes.
> > > Then I looked at the virtio-spec #1, and it seems that data should be
> > > multiple of 512 bytes also if it contains the struct
> > > virtio_blk_discard_write_zeroes. (I'm not sure)
> > > 
> > > Anyway I tried to allocate only the space for that struct, commented the
> > > assert and the test works well.
> > > 
> > > How do you suggest to proceed?
> > 
> > Wow, that's a tough question. Looking at the virtio spec, I agree with
> > you, it looks like struct virtio_blk_discard_write_zeroes should be
> > padded to 512 bytes here. But when I look at the Linux sources
> > (drivers/block/virtio_blk.c), I fail to see that they are doing the
> > padding there (but maybe I'm just too blind).
> 
> The only evidence for "pad to 512 bytes" interpretation that I see in
> the spec is "u8 data[][512];".  Or have I missed something more
> explicit?

That's it. But if it doesn't mean "any number of 512 byte chunks"
then what does it mean?

> Based on the Linux guest driver code and the lack of more evidence in
> the spec, I'm pretty sure data[] doesn't need to be padded to 512 bytes
> for discard/write zero requests.

OK. Must devices support such padding?

> > Looking at the QEMU sources, it seems like it can deal with both and
> > always sets the status right behind the last byte:
> > 
> > req->in = (void *)in_iov[in_num - 1].iov_base
> >   + in_iov[in_num - 1].iov_len
> >   - sizeof(struct virtio_blk_inhdr);
> > 
> > Anyway, I think the virtio spec should be clearer here to avoid bad
> > implementations in the future, so maybe Changpeng or Michael could
> > update the spec here a little bit?
> 
> Yep, good point.  VIRTIO 1.1 is available for public comments, so I've
> CCed the list.
> 
> Stefan

Thanks!
Care creating a github issue? And maybe proposing a spec patch.

> >  Thomas
> > 
> > 
> > > [1](https://github.com/oasis-tcs/virtio-spec/blob/master/content.tex#L3944)
> > > 
> > > 
> > > Thanks,
> > > Stefano
> > > 
> > 
> > 





Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 09:49:03AM +0100, Thomas Huth wrote:
> On 2019-01-25 09:16, Stefano Garzarella wrote:
> > On Fri, Jan 25, 2019 at 07:07:35AM +0100, Thomas Huth wrote:
> >> On 2019-01-25 07:01, Thomas Huth wrote:
> >>> On 2019-01-24 18:23, Stefano Garzarella wrote:
>  If the WRITE_ZEROES feature is enabled, we check this
>  command in the test_basic().
> 
>  Signed-off-by: Stefano Garzarella 
>  ---
>   tests/virtio-blk-test.c | 63 +
>   1 file changed, 63 insertions(+)
> 
>  diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
>  index 04c608764b..8cabbcb85a 100644
>  --- a/tests/virtio-blk-test.c
>  +++ b/tests/virtio-blk-test.c
>  @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev, 
>  QGuestAllocator *alloc,
>   
>   guest_free(alloc, req_addr);
>   
>  +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
>  +struct virtio_blk_discard_write_zeroes *dwz_hdr;
>  +void *expected;
>  +
>  +/*
>  + * WRITE_ZEROES request on the same sector of previous test 
>  where
>  + * we wrote "TEST".
>  + */
>  +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
>  +req.data = g_malloc0(512);
> >>>
> >>> Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> >>> something similar here, to see whether zeroes or 0xaa is written?
> >>
> >> Ah, never mind, I thought req.data would be a sector buffer here, but
> >> looking at the lines below, it apparently is something different.
> >>
> >> Why do you allocate 512 bytes here? I'd rather expect
> >> g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
> >> then you could also use a local "struct virtio_blk_discard_write_zeroes
> >> dwz_hdr" variable instead of a pointer, and drop the g_malloc0() 
> >> completely?
> >>
> > 
> > Hi Thomas,
> > it was my initial implementation, but on the first test I discovered
> > that virtio_blk_request() has an assert on the data_size and it requires
> > a multiple of 512 bytes.
> > Then I looked at the virtio-spec #1, and it seems that data should be
> > multiple of 512 bytes also if it contains the struct
> > virtio_blk_discard_write_zeroes. (I'm not sure)
> > 
> > Anyway I tried to allocate only the space for that struct, commented the
> > assert and the test works well.
> > 
> > How do you suggest to proceed?
> 
> Wow, that's a tough question. Looking at the virtio spec, I agree with
> you, it looks like struct virtio_blk_discard_write_zeroes should be
> padded to 512 bytes here. But when I look at the Linux sources
> (drivers/block/virtio_blk.c), I fail to see that they are doing the
> padding there (but maybe I'm just too blind).
> 
> Looking at the QEMU sources, it seems like it can deal with both and
> always sets the status right behind the last byte:
> 
> req->in = (void *)in_iov[in_num - 1].iov_base
>   + in_iov[in_num - 1].iov_len
>   - sizeof(struct virtio_blk_inhdr);
> 
> Anyway, I think the virtio spec should be clearer here to avoid bad
> implementations in the future, so maybe Changpeng or Michael could
> update the spec here a little bit?
> 
>  Thomas

I agree the spec makes it look like data is padded to 512 bytes.

I'm happy to write it up but let's decide what it is we want to
support. Arbitrary padding the way qemu does it? Or packed format?

> 
> > [1](https://github.com/oasis-tcs/virtio-spec/blob/master/content.tex#L3944)
> > 
> > 
> > Thanks,
> > Stefano
> > 



[Qemu-devel] [PATCH 1/3] configure: Disable W^X on OpenBSD

2019-01-25 Thread Philippe Mathieu-Daudé
Since OpenBSD 6.0 [1], W^X is enforced by default [2].
TCG requires WX access. Disable W^X if it is available.
This fixes:

  # lm32-softmmu/qemu-system-lm32
  Could not allocate dynamic translator buffer

  # sysctl kern.wxabort=1
  kern.wxabort: 0 -> 1
  # lm32-softmmu/qemu-system-lm32
  mmap: Not supported
  Abort trap (core dumped)
  # gdb -q lm32-softmmu/qemu-system-lm32 qemu-system-lm32.core
  (gdb) bt
  #0  0x17e3c156c50a in _thread_sys___syscall () at {standard input}:5
  #1  0x17e3c15e5d7a in *_libc_mmap (addr=Variable "addr" is not 
available.) at /usr/src/lib/libc/sys/mmap.c:47
  #2  0x17e17d9abc8b in alloc_code_gen_buffer () at 
/usr/src/qemu/accel/tcg/translate-all.c:1064
  #3  0x17e17d9abd04 in code_gen_alloc (tb_size=0) at 
/usr/src/qemu/accel/tcg/translate-all.c:1112
  #4  0x17e17d9abe81 in tcg_exec_init (tb_size=0) at 
/usr/src/qemu/accel/tcg/translate-all.c:1149
  #5  0x17e17d9897e9 in tcg_init (ms=0x17e45e456800) at 
/usr/src/qemu/accel/tcg/tcg-all.c:66
  #6  0x17e17d9891b8 in accel_init_machine (acc=0x17e3c3f50800, 
ms=0x17e45e456800) at /usr/src/qemu/accel/accel.c:63
  #7  0x17e17d989312 in configure_accelerator (ms=0x17e45e456800, 
progname=0x7f7f07b0 "lm32-softmmu/qemu-system-lm32") at 
/usr/src/qemu/accel/accel.c:111
  #8  0x17e17d9d8616 in main (argc=1, argv=0x7f7f06b8, 
envp=0x7f7f06c8) at vl.c:4325

[1] https://www.openbsd.org/faq/upgrade60.html
[2] https://undeadly.org/cgi?action=article&sid=20160527203200

Signed-off-by: Philippe Mathieu-Daudé 
---
 configure | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/configure b/configure
index b18281c61f..f6acc028a7 100755
--- a/configure
+++ b/configure
@@ -5795,6 +5795,17 @@ if test "$mingw32" = "yes" ; then
 done
 fi
 
+# Disable W^X if available
+if test "$tcg" = "yes" -a "$targetos" = "OpenBSD"; then
+cat > $TMPC <

[Qemu-devel] [PATCH 0/3] OpenBSD fixes

2019-01-25 Thread Philippe Mathieu-Daudé
Fixes I encountered while trying to run QEMU test suite on OpenBSD.
More to come, but please review.

Regards,

Phil.

Philippe Mathieu-Daudé (3):
  configure: Disable W^X on OpenBSD
  XXX oslib-posix: Ignore fcntl("/dev/null", F_SETFL, O_NONBLOCK)
failure
  WIP tests/vm: Run tests on OpenBSD

 configure  | 11 +++
 tests/vm/openbsd   |  4 +---
 util/oslib-posix.c |  2 ++
 3 files changed, 14 insertions(+), 3 deletions(-)

-- 
2.20.1




Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for WRITE_ZEROES command

2019-01-25 Thread Michael S. Tsirkin
On Fri, Jan 25, 2019 at 01:48:26PM +0100, Thomas Huth wrote:
> On 2019-01-25 12:58, Liu, Changpeng wrote:
> > 
> > 
> >> -Original Message-
> >> From: Thomas Huth [mailto:th...@redhat.com]
> >> Sent: Friday, January 25, 2019 4:49 PM
> >> To: Stefano Garzarella ; Michael S. Tsirkin
> >> ; Liu, Changpeng 
> >> Cc: qemu-devel@nongnu.org; Laurent Vivier ; Kevin Wolf
> >> ; qemu-bl...@nongnu.org; Max Reitz
> >> ; Stefan Hajnoczi ; Paolo Bonzini
> >> 
> >> Subject: Re: [Qemu-devel] [PATCH RFC 2/2] tests/virtio-blk: add test for
> >> WRITE_ZEROES command
> >>
> >> On 2019-01-25 09:16, Stefano Garzarella wrote:
> >>> On Fri, Jan 25, 2019 at 07:07:35AM +0100, Thomas Huth wrote:
>  On 2019-01-25 07:01, Thomas Huth wrote:
> > On 2019-01-24 18:23, Stefano Garzarella wrote:
> >> If the WRITE_ZEROES feature is enabled, we check this
> >> command in the test_basic().
> >>
> >> Signed-off-by: Stefano Garzarella 
> >> ---
> >>  tests/virtio-blk-test.c | 63 +
> >>  1 file changed, 63 insertions(+)
> >>
> >> diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
> >> index 04c608764b..8cabbcb85a 100644
> >> --- a/tests/virtio-blk-test.c
> >> +++ b/tests/virtio-blk-test.c
> >> @@ -231,6 +231,69 @@ static void test_basic(QVirtioDevice *dev,
> >> QGuestAllocator *alloc,
> >>
> >>  guest_free(alloc, req_addr);
> >>
> >> +if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
> >> +struct virtio_blk_discard_write_zeroes *dwz_hdr;
> >> +void *expected;
> >> +
> >> +/*
> >> + * WRITE_ZEROES request on the same sector of previous test 
> >> where
> >> + * we wrote "TEST".
> >> + */
> >> +req.type = VIRTIO_BLK_T_WRITE_ZEROES;
> >> +req.data = g_malloc0(512);
> >
> > Wouldn't it be more interesting to do a memset(req.data, 0xaa, 512) or
> > something similar here, to see whether zeroes or 0xaa is written?
> 
>  Ah, never mind, I thought req.data would be a sector buffer here, but
>  looking at the lines below, it apparently is something different.
> 
>  Why do you allocate 512 bytes here? I'd rather expect
>  g_malloc0(sizeof(struct virtio_blk_discard_write_zeroes)) here. ... and
>  then you could also use a local "struct virtio_blk_discard_write_zeroes
>  dwz_hdr" variable instead of a pointer, and drop the g_malloc0() 
>  completely?
> 
> >>>
> >>> Hi Thomas,
> >>> it was my initial implementation, but on the first test I discovered
> >>> that virtio_blk_request() has an assert on the data_size and it requires
> >>> a multiple of 512 bytes.
> >>> Then I looked at the virtio-spec #1, and it seems that data should be
> >>> multiple of 512 bytes also if it contains the struct
> >>> virtio_blk_discard_write_zeroes. (I'm not sure)
> >>>
> >>> Anyway I tried to allocate only the space for that struct, commented the
> >>> assert and the test works well.
> >>>
> >>> How do you suggest to proceed?
> >>
> >> Wow, that's a tough question. Looking at the virtio spec, I agree with
> >> you, it looks like struct virtio_blk_discard_write_zeroes should be
> >> padded to 512 bytes here. But when I look at the Linux sources
> >> (drivers/block/virtio_blk.c), I fail to see that they are doing the
> >> padding there (but maybe I'm just too blind).
> >>
> >> Looking at the QEMU sources, it seems like it can deal with both and
> >> always sets the status right behind the last byte:
> >>
> >> req->in = (void *)in_iov[in_num - 1].iov_base
> >>   + in_iov[in_num - 1].iov_len
> >>   - sizeof(struct virtio_blk_inhdr);
> >>
> >> Anyway, I think the virtio spec should be clearer here to avoid bad
> >> implementations in the future, so maybe Changpeng or Michael could
> >> update the spec here a little bit?
> > The data for Discard and Write Zeroes commands are struct 
> > virtio_blk_discard_write_zeroes
> > aligned, that means you can pass 16 bytes aligned data, based on the 
> > segments number supported,
> > this is also aligned with NVMe specification and  the SCSI specification.
> 
> Ok, thanks, so the "u8 data[][512];" is wrong in the virtio spec in this
> case? See:
> 
>  https://github.com/oasis-tcs/virtio-spec/blob/master/content.tex#L3944
> 
> At least this should be mentioned in the description of the data field,
> I think.
> 
>  Thomas

OK. Is it a multiple of 512 for all other operations?




[Qemu-devel] [PATCH 3/3] WIP tests/vm: Run tests on OpenBSD

2019-01-25 Thread Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/vm/openbsd | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index cfe0572c59..de907dd21c 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -25,9 +25,7 @@ class OpenBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/rsd1c;
 ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 
--python=python2.7 {configure_opts};
-gmake --output-sync -j{jobs} {verbose};
-# XXX: "gmake check" seems to always hang or fail
-#gmake --output-sync -j{jobs} check {verbose};
+gmake --output-sync -j{jobs} check {verbose};
 """
 
 def build_image(self, img):
-- 
2.20.1




[Qemu-devel] [PATCH 2/3] XXX oslib-posix: Ignore fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure

2019-01-25 Thread Philippe Mathieu-Daudé
Previous to OpenBSD 6.3 [1], fcntl(F_SETFL) is not permitted on memory
devices. Do not assert fcntl failures on OpenBSD.
This fixes:

  $ lm32-softmmu/qemu-system-lm32
  assertion "f != -1" failed: file "util/oslib-posix.c", line 247, function 
"qemu_set_nonblock"
  Abort trap (core dumped)

[1] The fix seems https://github.com/openbsd/src/commit/c2a35b387f9d3c
  "fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
  the memory devices (/dev/null, /dev/zero, etc) need to permit them."

Signed-off-by: Philippe Mathieu-Daudé 
---
 util/oslib-posix.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 4ce1ba9ca4..064c3ae2f7 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -244,7 +244,9 @@ void qemu_set_nonblock(int fd)
 f = fcntl(fd, F_GETFL);
 assert(f != -1);
 f = fcntl(fd, F_SETFL, f | O_NONBLOCK);
+#ifndef __OpenBSD__
 assert(f != -1);
+#endif
 }
 
 int socket_set_fast_reuse(int fd)
-- 
2.20.1




Re: [Qemu-devel] [PATCH 04/13] tcg/aarch64: enable dynamic TLB sizing

2019-01-25 Thread Alex Bennée


Richard Henderson  writes:

> Signed-off-by: Richard Henderson 
> ---
>  tcg/aarch64/tcg-target.h |   2 +-
>  tcg/aarch64/tcg-target.inc.c | 100 +--
>  2 files changed, 60 insertions(+), 42 deletions(-)
>
> diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
> index 68868a27eb..5085a81060 100644
> --- a/tcg/aarch64/tcg-target.h
> +++ b/tcg/aarch64/tcg-target.h
> @@ -15,7 +15,7 @@
>
>  #define TCG_TARGET_INSN_UNIT_SIZE  4
>  #define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
> -#define TCG_TARGET_IMPLEMENTS_DYN_TLB 0
> +#define TCG_TARGET_IMPLEMENTS_DYN_TLB 1
>  #undef TCG_TARGET_STACK_GROWSUP
>
>  typedef enum {
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index ee0d5819af..d57f9e500f 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -498,6 +498,9 @@ typedef enum {
>  I3510_EON   = 0x4a20,
>  I3510_ANDS  = 0x6a00,
>
> +/* Logical shifted register instructions (with a shift).  */
> +I3502S_AND_LSR  = I3510_AND | (1 << 22),
> +
>  /* AdvSIMD copy */
>  I3605_DUP  = 0x0e000400,
>  I3605_INS  = 0x4e001c00,
> @@ -1448,6 +1451,14 @@ static void add_qemu_ldst_label(TCGContext *s, bool 
> is_ld, TCGMemOpIdx oi,
>  label->label_ptr[0] = label_ptr;
>  }
>
> +/* We expect tlb_mask to be before tlb_table.  */
> +QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table) <
> +  offsetof(CPUArchState, tlb_mask));
> +
> +/* We expect to use a 24-bit unsigned offset from ENV.  */
> +QEMU_BUILD_BUG_ON(offsetof(CPUArchState, tlb_table[NB_MMU_MODES - 1])
> +  > 0xff);
> +
>  /* Load and compare a TLB entry, emitting the conditional jump to the
> slow path for the failure case, which will be patched later when 
> finalizing
> the slow path. Generated code returns the host addend in X1,
> @@ -1456,15 +1467,55 @@ static void tcg_out_tlb_read(TCGContext *s, TCGReg 
> addr_reg, TCGMemOp opc,
>   tcg_insn_unit **label_ptr, int mem_index,
>   bool is_read)
>  {
> -int tlb_offset = is_read ?
> -offsetof(CPUArchState, tlb_table[mem_index][0].addr_read)
> -: offsetof(CPUArchState, tlb_table[mem_index][0].addr_write);
> +int mask_ofs = offsetof(CPUArchState, tlb_mask[mem_index]);
> +int table_ofs = offsetof(CPUArchState, tlb_table[mem_index]);
>  unsigned a_bits = get_alignment_bits(opc);
>  unsigned s_bits = opc & MO_SIZE;
>  unsigned a_mask = (1u << a_bits) - 1;
>  unsigned s_mask = (1u << s_bits) - 1;
> -TCGReg base = TCG_AREG0, x3;
> -uint64_t tlb_mask;
> +TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0, x3;
> +TCGType mask_type;
> +uint64_t compare_mask;
> +
> +if (table_ofs > 0xfff) {
> +int table_hi = table_ofs & ~0xfff;
> +int mask_hi = mask_ofs & ~0xfff;

Isn't there a #define for this number here?

> +
> +table_base = TCG_REG_X1;
> +if (mask_hi == table_hi) {
> +mask_base = table_base;
> +} else if (mask_hi) {
> +mask_base = TCG_REG_X0;
> +tcg_out_insn(s, 3401, ADDI, TCG_TYPE_I64,
> + mask_base, TCG_AREG0, mask_hi);
> +}
> +tcg_out_insn(s, 3401, ADDI, TCG_TYPE_I64,
> + table_base, TCG_AREG0, table_hi);
> +mask_ofs -= mask_hi;
> +table_ofs -= table_hi;
> +}
> +
> +mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
> + ? TCG_TYPE_I64 : TCG_TYPE_I32);
> +
> +/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx].  */
> +tcg_out_ld(s, mask_type, TCG_REG_X0, mask_base, mask_ofs);
> +tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, table_base, table_ofs);
> +
> +/* Extract the TLB index from the address into X0.  */
> +tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
> + TCG_REG_X0, TCG_REG_X0, addr_reg,
> + TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
> +
> +/* Add the tlb_table pointer, creating the CPUTLBEntry address into X1.  
> */
> +tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
> +
> +/* Load the tlb comparator into X0, and the fast path addend into X1.  */
> +tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1, is_read
> +   ? offsetof(CPUTLBEntry, addr_read)
> +   : offsetof(CPUTLBEntry, addr_write));
> +tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
> +   offsetof(CPUTLBEntry, addend));
>
>  /* For aligned accesses, we check the first byte and include the 
> alignment
> bits within the address.  For unaligned access, we check that we don't
> @@ -1476,47 +1527,14 @@ static void tcg_out_tlb_read(TCGContext *s, TCGReg 
> addr_reg, TCGMemOp opc,
>   TCG_REG_X3, addr_reg, s_mask - a_mask);
>  x3 = TCG_REG_X3;
>  }
> -tlb_mask = (uint64_t)TARGET_PAGE_MAS

Re: [Qemu-devel] [PULL v2 07/49] util: check the return value of fcntl in qemu_set_{block, nonblock}

2019-01-25 Thread Brad Smith

On 1/25/2019 1:53 PM, Philippe Mathieu-Daudé wrote:


Hi,

On 1/15/19 9:04 PM, Michael S. Tsirkin wrote:

From: Li Qiang 

Assert that the return value is not an error. This is like commit
7e6478e7d4f for qemu_set_cloexec.

Signed-off-by: Li Qiang 
Reviewed-by: Thomas Huth 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
  util/oslib-posix.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index c1bee2a581..4ce1ba9ca4 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -233,14 +233,18 @@ void qemu_set_block(int fd)
  {
  int f;
  f = fcntl(fd, F_GETFL);
-fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
+assert(f != -1);
+f = fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
+assert(f != -1);
  }
  
  void qemu_set_nonblock(int fd)

  {
  int f;
  f = fcntl(fd, F_GETFL);
-fcntl(fd, F_SETFL, f | O_NONBLOCK);
+assert(f != -1);
+f = fcntl(fd, F_SETFL, f | O_NONBLOCK);
+assert(f != -1);

This commit breaks OpenBSD, when trying to start QEMU I get:
assertion "f != -1" failed: file "util/oslib-posix.c", line 247,
function "qemu_set_nonblock"

Having a quick look at gdb, the last device opened is /dev/null, and
when fcntl() fails we have errno = ENODEV.

 19 ENODEV Operation not supported by device.
 An attempt was made to apply an inappropriate function to a device,
 for example, trying to read a write-only device such as a printer.

Digging further I found a recent commit which could fix this problem:
https://github.com/openbsd/src/commit/c2a35b387f9d3c
"fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
the memory devices (/dev/null, /dev/zero, etc) need to permit them."

Brad: Do you think this might be the fix? If so, any idea what is the
first release to contain this fix? I don't know OpenBSD and can't figure
this out... Also, what would be the cleaner QEMU fix?


I don't know. But that commit was included with 6.3 or newer.



[Qemu-devel] [PATCH] i386: Disable MSR_PLATFORM_INFO emulation

2019-01-25 Thread Eduardo Habkost
Linux v4.12 introduced[1] emulation of MSR_PLATFORM_INFO and
MSR_MISC_FEATURES_ENABLES, and enabled the
MSR_PLATFORM_INFO_CPUID_FAULT bit unconditionally.  This made
guests incorrectly believe the VM emulates
MSR_MISC_FEATURES_ENABLES properly (which is not true because
QEMU has no migration code to handle the MSR).

The KVM_CAP_MSR_PLATFORM_INFO capability was added[2] to Linux
v4.19 to address the issue.  Use it to disable emulation of
MSR_PLATFORM_INFO and stop incorrectly exposing cpuid_fault to
guests.

References:
[1] commit db2336a80489 ("KVM: x86: virtualize cpuid faulting")
[2] commit 6fbbde9a1969 ("KVM: x86: Control guest reads of MSR_PLATFORM_INFO")

Reported-by: Maxime Coquelin 
Signed-off-by: Eduardo Habkost 
---
 target/i386/kvm.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9af4542fb8..9629f25c90 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1647,6 +1647,14 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 }
 }
 
+/*
+ * QEMU doesn't initialize MSR_PLATFORM_INFO yet, so disable the MSR
+ * unconditionally until support for the MSR is properly implemented
+ */
+if (kvm_check_extension(s, KVM_CAP_MSR_PLATFORM_INFO)) {
+kvm_vm_enable_cap(s, KVM_CAP_MSR_PLATFORM_INFO, 0);
+}
+
 return 0;
 }
 
-- 
2.18.0.rc1.1.g3f1ff2140




Re: [Qemu-devel] [PULL v2 07/49] util: check the return value of fcntl in qemu_set_{block, nonblock}

2019-01-25 Thread Kamil Rytarowski
On 25.01.2019 19:53, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> On 1/15/19 9:04 PM, Michael S. Tsirkin wrote:
>> From: Li Qiang 
>>
>> Assert that the return value is not an error. This is like commit
>> 7e6478e7d4f for qemu_set_cloexec.
>>
>> Signed-off-by: Li Qiang 
>> Reviewed-by: Thomas Huth 
>> Reviewed-by: Michael S. Tsirkin 
>> Signed-off-by: Michael S. Tsirkin 
>> ---
>>  util/oslib-posix.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
>> index c1bee2a581..4ce1ba9ca4 100644
>> --- a/util/oslib-posix.c
>> +++ b/util/oslib-posix.c
>> @@ -233,14 +233,18 @@ void qemu_set_block(int fd)
>>  {
>>  int f;
>>  f = fcntl(fd, F_GETFL);
>> -fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
>> +assert(f != -1);
>> +f = fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
>> +assert(f != -1);
>>  }
>>  
>>  void qemu_set_nonblock(int fd)
>>  {
>>  int f;
>>  f = fcntl(fd, F_GETFL);
>> -fcntl(fd, F_SETFL, f | O_NONBLOCK);
>> +assert(f != -1);
>> +f = fcntl(fd, F_SETFL, f | O_NONBLOCK);
>> +assert(f != -1);
> 
> This commit breaks OpenBSD, when trying to start QEMU I get:
> assertion "f != -1" failed: file "util/oslib-posix.c", line 247,
> function "qemu_set_nonblock"
> 
> Having a quick look at gdb, the last device opened is /dev/null, and
> when fcntl() fails we have errno = ENODEV.
> 
> 19 ENODEV Operation not supported by device.
> An attempt was made to apply an inappropriate function to a device,
> for example, trying to read a write-only device such as a printer.
> 
> Digging further I found a recent commit which could fix this problem:
> https://github.com/openbsd/src/commit/c2a35b387f9d3c
> "fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
> the memory devices (/dev/null, /dev/zero, etc) need to permit them."
> 
> Brad: Do you think this might be the fix? If so, any idea what is the
> first release to contain this fix? I don't know OpenBSD and can't figure
> this out... Also, what would be the cleaner QEMU fix?
> 
> Thanks,
> 

I cannot speak for OpenBSD (never installed it myself), but if there is
a critical patch to test on NetBSD - please let me know.

> Phil.
> 
>>  }
>>  
>>  int socket_set_fast_reuse(int fd)
>>
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PULL v2 07/49] util: check the return value of fcntl in qemu_set_{block, nonblock}

2019-01-25 Thread Philippe Mathieu-Daudé
Hi,

On 1/15/19 9:04 PM, Michael S. Tsirkin wrote:
> From: Li Qiang 
> 
> Assert that the return value is not an error. This is like commit
> 7e6478e7d4f for qemu_set_cloexec.
> 
> Signed-off-by: Li Qiang 
> Reviewed-by: Thomas Huth 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  util/oslib-posix.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> index c1bee2a581..4ce1ba9ca4 100644
> --- a/util/oslib-posix.c
> +++ b/util/oslib-posix.c
> @@ -233,14 +233,18 @@ void qemu_set_block(int fd)
>  {
>  int f;
>  f = fcntl(fd, F_GETFL);
> -fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
> +assert(f != -1);
> +f = fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
> +assert(f != -1);
>  }
>  
>  void qemu_set_nonblock(int fd)
>  {
>  int f;
>  f = fcntl(fd, F_GETFL);
> -fcntl(fd, F_SETFL, f | O_NONBLOCK);
> +assert(f != -1);
> +f = fcntl(fd, F_SETFL, f | O_NONBLOCK);
> +assert(f != -1);

This commit breaks OpenBSD, when trying to start QEMU I get:
assertion "f != -1" failed: file "util/oslib-posix.c", line 247,
function "qemu_set_nonblock"

Having a quick look at gdb, the last device opened is /dev/null, and
when fcntl() fails we have errno = ENODEV.

19 ENODEV Operation not supported by device.
An attempt was made to apply an inappropriate function to a device,
for example, trying to read a write-only device such as a printer.

Digging further I found a recent commit which could fix this problem:
https://github.com/openbsd/src/commit/c2a35b387f9d3c
"fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so
the memory devices (/dev/null, /dev/zero, etc) need to permit them."

Brad: Do you think this might be the fix? If so, any idea what is the
first release to contain this fix? I don't know OpenBSD and can't figure
this out... Also, what would be the cleaner QEMU fix?

Thanks,

Phil.

>  }
>  
>  int socket_set_fast_reuse(int fd)
> 



[Qemu-devel] [PATCH] target/arm: Fix validation of 32-bit address spaces for aa32

2019-01-25 Thread Richard Henderson
When tsz == 0, aarch32 selects the address space via exclusion,
and there are no "top_bits" remaining that require validation.

Fixes: ba97be9f4a4
Reported-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 92666e5208..e24689f767 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10447,7 +10447,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 uint64_t ttbr;
 hwaddr descaddr, indexmask, indexmask_grainsize;
 uint32_t tableattrs;
-target_ulong page_size, top_bits;
+target_ulong page_size;
 uint32_t attrs;
 int32_t stride;
 int addrsize, inputsize;
@@ -10487,12 +10487,19 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
  * We determined the region when collecting the parameters, but we
  * have not yet validated that the address is valid for the region.
  * Extract the top bits and verify that they all match select.
+ *
+ * For aa32, if inputsize == addrsize, then we have selected the
+ * region by exclusion in aa32_va_parameters and there is no more
+ * validation to do here.
  */
-top_bits = sextract64(address, inputsize, addrsize - inputsize);
-if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
-/* In the gap between the two regions, this is a Translation fault */
-fault_type = ARMFault_Translation;
-goto do_fault;
+if (inputsize < addrsize) {
+target_ulong top_bits = sextract64(address, inputsize,
+   addrsize - inputsize);
+if (-top_bits != param.select || (param.select && !ttbr1_valid)) {
+/* The gap between the two regions is a Translation fault */
+fault_type = ARMFault_Translation;
+goto do_fault;
+}
 }
 
 if (param.using64k) {
-- 
2.17.1




Re: [Qemu-devel] [PULL 0/2] Python 3 compatibility fixes

2019-01-25 Thread Peter Maydell
On Fri, 25 Jan 2019 at 14:06, Eduardo Habkost  wrote:
>
> The following changes since commit 9dd0d8111fbb8015db75a38933aee1d45f9e64a3:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2019-01-24' 
> into staging (2019-01-25 11:52:12 +)
>
> are available in the Git repository at:
>
>   git://github.com/ehabkost/qemu.git tags/python-next-pull-request
>
> for you to fetch changes up to 651514df88fd53d537b3b78a7548663cc0816b1b:
>
>   decodetree: re.fullmatch was added in 3.4 (2019-01-25 11:41:42 -0200)
>
> 
> Python 3 compatibility fixes
>
> 

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM



Re: [Qemu-devel] [PATCH RFC 8/9] tests: Add OpenBSD image

2019-01-25 Thread Peter Maydell
On Fri, 25 Jan 2019 at 18:36, Brad Smith  wrote:
>
> On 1/25/2019 1:24 AM, Thomas Huth wrote:
>
> > On 2019-01-25 01:48, Brad Smith wrote:
> >> Our ports tree has an option which results in the QEMU binaries being
> >> linked with "-z wxneeded".
> > Then it's maybe high time to send such changes upstream now ;-)
>
>
> I have not considered submitting such a patch as it's just a workaround
> for an
> issue within QEMU. Everything else that we had as local patches or local
> build
> fiddling to build things properly has been integrated in some manner.

We'll happily take a patch that fixes the underlying issue properly
if you'd prefer...

thanks
-- PMM



Re: [Qemu-devel] [PATCH RFC 8/9] tests: Add OpenBSD image

2019-01-25 Thread Brad Smith

On 1/25/2019 1:24 AM, Thomas Huth wrote:


On 2019-01-25 01:48, Brad Smith wrote:

On 1/24/2019 11:52 AM, Daniel P. Berrangé wrote:


On Thu, Jan 24, 2019 at 05:10:19PM +0100, Philippe Mathieu-Daudé wrote:

On 1/24/19 4:56 PM, Kamil Rytarowski wrote:

On 24.01.2019 16:52, Philippe Mathieu-Daudé wrote:

On 8/16/17 9:21 AM, Fam Zheng wrote:

The image is prepared following instructions as in:

https://wiki.qemu.org/Hosts/BSD

Signed-off-by: Fam Zheng 
---
   tests/vm/openbsd | 45 +
   1 file changed, 45 insertions(+)
   create mode 100755 tests/vm/openbsd

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
new file mode 100755
index 00..d37ff83a59
--- /dev/null
+++ b/tests/vm/openbsd
@@ -0,0 +1,45 @@
+#!/usr/bin/env python
+#
+# OpenBSD VM image
+#
+# Copyright (C) 2017 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version
2.  See
+# the COPYING file in the top-level directory.
+#
+
+import os
+import sys
+import logging
+import subprocess
+import tempfile
+import time
+import basevm
+
+class OpenBSDVM(basevm.BaseVM):
+    name = "openbsd"
+    BUILD_SCRIPT = """
+    set -e;
+    cd $(mktemp -d /var/tmp/qemu-test.XX);
+    tar -xf /dev/rsd1c;
+    ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4
--python=python2.7 {configure_opts};
+    gmake -j{jobs};
+    # XXX: "gmake check" seems to always hang or fail
+    #gmake check;

OK, Now it makes more sense...

After spending various hours trying to fix various issues on
OpenBSD, I
notice that we never ran tests on this OS.
The only binary I can run is qemu-img, the rest seems useless.
I'll summarize in a different thread.


Is this W^X related?

Part of it could be but I'm not sure.

The 6.1 VM provided by Fam has /usr/local mounted with wxallowed, I
tried building/running there and nothing changed, mmap() still returns
ENOTSUP:

ENOTSUP from mmap is certainly what you'd expect from the W^X  scenario

    https://undeadly.org/cgi?action=article&sid=20160527203200

   "W^X violations are no longer permitted by default.  A kernel log
message
    is generated, and mprotect/mmap return ENOTSUP.  If the sysctl(8) flag
    kern.wxabort is set then a SIGABRT occurs instead, for gdb use or
coredump
    creation."

Yes, this policy change was introduced with 6.0.

Our ports tree has an option which results in the QEMU binaries being
linked with "-z wxneeded".

Then it's maybe high time to send such changes upstream now ;-)



I have not considered submitting such a patch as it's just a workaround 
for an
issue within QEMU. Everything else that we had as local patches or local 
build

fiddling to build things properly has been integrated in some manner.




[Qemu-devel] [PATCH 2/7] target/arm/translate-a64: Don't underdecode PRFM

2019-01-25 Thread Peter Maydell
The PRFM prefetch insn in the load/store with imm9 encodings
requires idx field 0b00; we were underdecoding this by
only checking !is_unpriv (which is equivalent to idx != 2).
Correctly UNDEF the unallocated encodings where idx == 0b01
and 0b11 as well as 0b10.

Reported-by: Laurent Desnogues 
Signed-off-by: Peter Maydell 
---
 target/arm/translate-a64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e6df303e321..8e081758e03 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -2803,7 +2803,7 @@ static void disas_ldst_reg_imm9(DisasContext *s, uint32_t 
insn,
 } else {
 if (size == 3 && opc == 2) {
 /* PRFM - prefetch */
-if (is_unpriv) {
+if (idx != 0) {
 unallocated_encoding(s);
 return;
 }
-- 
2.20.1




[Qemu-devel] [PATCH 1/7] target/arm/translate-a64: Don't underdecode system instructions

2019-01-25 Thread Peter Maydell
The "system instructions" and "system register move" subcategories
of "branches, exception generating and system instructions" for A64
only apply if bits [23:22] are zero; other values are currently
unallocated. Correctly UNDEF these unallocated encodings.

Reported-by: Laurent Desnogues 
Signed-off-by: Peter Maydell 
---
 target/arm/translate-a64.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 4d28a27c3bd..e6df303e321 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -2144,7 +2144,11 @@ static void disas_b_exc_sys(DisasContext *s, uint32_t 
insn)
 break;
 case 0x6a: /* Exception generation / System */
 if (insn & (1 << 24)) {
-disas_system(s, insn);
+if (extract32(insn, 22, 2) == 0) {
+disas_system(s, insn);
+} else {
+unallocated_encoding(s);
+}
 } else {
 disas_exc(s, insn);
 }
-- 
2.20.1




Re: [Qemu-devel] [PATCH v2] xen: fix xen-bus state model to allow frontend re-connection

2019-01-25 Thread Anthony PERARD
On Tue, Jan 22, 2019 at 03:53:46PM +, Paul Durrant wrote:
> There is a flaw in the xen-bus state model. To allow a frontend to re-
> connect the backend state of an online XenDevice is transitioned from
> Closed to InitWait, but this is currently done unilaterally which is
> incorrect. The backend state should remain Closed until the frontend state
> transitions to Initialising.
> 
> This patch removes the automatic backend state transition from
> xen_device_backend_state_changed() and, instead, adds an extra check in
> xen_device_frontend_state_changed() to determine whether a frontend is
> trying to re-connect to a previously Closed XenDevice. Only if this is
> found to be the case is the backend state transitioned from Closed to
> InitWait. Note that this transition will be common amongst all XenDevice
> classes and hence xen_device_frontend_state_changed() returns immediately
> afterwards without calling into the XenDeviceClass frontend_changed()
> method.
> 
> Signed-off-by: Paul Durrant 

I've tested OVMF with that patch, and states transitions looks better
when transitionning from ovmf to linux.
(Less Closed->InitWait->Closed..., and ovmf trying to win a race at
reading the backend state at the right time).

Acked-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



[Qemu-devel] [PATCH 0/7] target/arm: Fix various underdecodings

2019-01-25 Thread Peter Maydell
This patchset fixes the various cases of underdecoded instructions
that Laurent spotted and sent a bug report for. (The exception
is "missing default in disas_data_proc_1src", which got fixed in
commit 18de2813c35e359621a.)

thanks
-- PMM

Peter Maydell (7):
  target/arm/translate-a64: Don't underdecode system instructions
  target/arm/translate-a64: Don't underdecode PRFM
  target/arm/translate-a64: Don't underdecode SIMD ld/st multiple
  target/arm/translate-a64: Don't underdecode SIMD ld/st single
  target/arm/translate-a64: Don't underdecode add/sub extended register
  target/arm/translate-a64: Don't underdecode FP insns
  target/arm/translate-a64: Don't underdecode SDOT and UDOT

 target/arm/translate-a64.c | 53 +-
 1 file changed, 46 insertions(+), 7 deletions(-)

-- 
2.20.1




[Qemu-devel] [PATCH 4/7] target/arm/translate-a64: Don't underdecode SIMD ld/st single

2019-01-25 Thread Peter Maydell
In the AdvSIMD load/store single structure encodings, the
non-post-indexed case should have zeroes in [20:16] (which is the
Rm field for the post-indexed case). Bit 31 must also be zero
(a check we got right in ldst_multiple but not here). Correctly
UNDEF these unallocated encodings.

Reported-by: Laurent Desnogues 
Signed-off-by: Peter Maydell 
---
 target/arm/translate-a64.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index c1f0cad7691..2cade64ed25 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -3409,6 +3409,7 @@ static void disas_ldst_single_struct(DisasContext *s, 
uint32_t insn)
 {
 int rt = extract32(insn, 0, 5);
 int rn = extract32(insn, 5, 5);
+int rm = extract32(insn, 16, 5);
 int size = extract32(insn, 10, 2);
 int S = extract32(insn, 12, 1);
 int opc = extract32(insn, 13, 3);
@@ -3424,6 +3425,15 @@ static void disas_ldst_single_struct(DisasContext *s, 
uint32_t insn)
 int ebytes, xs;
 TCGv_i64 tcg_addr, tcg_rn, tcg_ebytes;
 
+if (extract32(insn, 31, 1)) {
+unallocated_encoding(s);
+return;
+}
+if (!is_postidx && rm != 0) {
+unallocated_encoding(s);
+return;
+}
+
 switch (scale) {
 case 3:
 if (!is_load || S) {
@@ -3501,7 +3511,6 @@ static void disas_ldst_single_struct(DisasContext *s, 
uint32_t insn)
 }
 
 if (is_postidx) {
-int rm = extract32(insn, 16, 5);
 if (rm == 31) {
 tcg_gen_mov_i64(tcg_rn, tcg_addr);
 } else {
-- 
2.20.1




[Qemu-devel] [PATCH 5/7] target/arm/translate-a64: Don't underdecode add/sub extended register

2019-01-25 Thread Peter Maydell
In the "add/subtract (extended register)" encoding group, the "opt"
field in bits [23:22] must be zero. Correctly UNDEF the unallocated
encodings where this field is not zero.

Reported-by: Laurent Desnogues 
Signed-off-by: Peter Maydell 
---
 target/arm/translate-a64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2cade64ed25..efd2f6490b5 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4204,12 +4204,13 @@ static void disas_add_sub_ext_reg(DisasContext *s, 
uint32_t insn)
 bool setflags = extract32(insn, 29, 1);
 bool sub_op = extract32(insn, 30, 1);
 bool sf = extract32(insn, 31, 1);
+bool opt = extract32(insn, 22, 2);
 
 TCGv_i64 tcg_rm, tcg_rn; /* temps */
 TCGv_i64 tcg_rd;
 TCGv_i64 tcg_result;
 
-if (imm3 > 4) {
+if (imm3 > 4 || opt != 0) {
 unallocated_encoding(s);
 return;
 }
-- 
2.20.1




  1   2   3   4   >