date:20180103

Re: [Qemu-devel] [PATCH v4 0/2] Rewrite TCP packet comparison in colo

2018-01-03 Thread Mao Zhongyi


Hi, Jason

Long time no news, Ping...

Thanks,
Mao

On 12/25/2017 10:54 AM, Mao Zhongyi wrote:

v4:
  p2: fix some typo
[Zhang Chen]
v3:
  p1: merged the patch1 and patch2 from v2
  p2: -merged the patch3 and patch4 from v2
  -implement the same process flow for tcp, udp and icmp

[Zhang Chen]
v2:
  p1: a new patch
  p2: a new patch
  p3: -rename the fill_pkt_seq to fill_pkt_tcp_info
  -rename pdsize & hdsize to payload_size & header_size
  -reuse duplicated code
  -modified colo_packet_compare_common() to suit the tcp packet
   comparison instead of build a new function service for tcp.
  -add more comments for the 'max_ack'
[Zhang Chen]

Cc: Zhang Chen 
Cc: Li Zhijian 
Cc: Jason Wang 

Mao Zhongyi (2):
  colo: modified the payload compare function
  colo: compare the packet based on the tcp sequence number

 net/colo-compare.c | 411 +
 net/colo.c |   9 ++
 net/colo.h |  15 ++
 net/trace-events   |   2 +-
 4 files changed, 284 insertions(+), 153 deletions(-)

Re: [Qemu-devel] [PATCH v1 03/21] RISC-V CPU Core Definition

2018-01-03 Thread Michael Clark

On Thu, Jan 4, 2018 at 7:47 PM, Antony Pavlov 
wrote:

> On Wed,  3 Jan 2018 13:44:07 +1300
> Michael Clark  wrote:
>
> > Add CPU state header, CPU definitions and initialization routines
> >
> > Signed-off-by: Michael Clark 
> > ---
> >  target/riscv/cpu.c  | 338 +++
> >  target/riscv/cpu.h  | 363 ++
> >  target/riscv/cpu_bits.h | 411 ++
> >  3 files changed, 1112 insertions(+)
> >  create mode 100644 target/riscv/cpu.c
> >  create mode 100644 target/riscv/cpu.h
> >  create mode 100644 target/riscv/cpu_bits.h
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>
> ...
>
> > +static void riscv_cpu_reset(CPUState *cs)
> > +{
> > +RISCVCPU *cpu = RISCV_CPU(cs);
> > +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
> > +CPURISCVState *env = &cpu->env;
> > +
> > +mcc->parent_reset(cs);
> > +#ifndef CONFIG_USER_ONLY
> > +tlb_flush(cs);
> > +env->priv = PRV_M;
> > +env->mtvec = DEFAULT_MTVEC;
> > +#endif
> > +env->pc = DEFAULT_RSTVEC;
>
> The RISC-V Privileged Architecture Manual v1.10 states that
>
>   The pc is set to an implementation-defined reset vector.
>
> But hard-coded DEFAULT_RSTVEC leaves no chance for changing reset vector.
>
> Can we add a mechanism for changing reset vector?
>

That can be added very easily at some point when necessary.

All 5 RISC-V machines in the QEMU port currently have their emulated Mask
ROMs at 0x1000, so it's not necessary until we add a machine that needs a
different value. I certainly wouldn't reject a patch that adds that
functionality if we had a machine with a different reset vector, although
given we have 5 machines using the same vector, it may remain a sensible
default. I would think twice about adding a property that no machine sets,
or duplicating code to have all machines set their reset vector even when
they are all the same. Shall we add the functionality when we need it?

I'd categorise this as a feature request. #define DEFAULT_RSTVEC 0x1000
is the "implementation-defined reset vector".

Folk on the RISC-V mailing list are actually seeking guidance on the blanks
in the RISC-V specification so it may be that a de-facto standard emerges
for some of these "implementation defined" blanks, in which case it may
become part of a platform spec (vs the ISA spec).
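
For reference, here is a minimal sketch of what a per-CPU reset vector with a
shared default might look like. It is standalone C, not QEMU code: ToyCPU,
toy_cpu_init, toy_cpu_set_resetvec and toy_cpu_reset are hypothetical names
made up for illustration, and only the 0x1000 default comes from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Default reset vector discussed in this thread. */
#define DEFAULT_RSTVEC 0x1000

/* Hypothetical, simplified CPU state; not the QEMU RISCVCPU type. */
typedef struct ToyCPU {
    uint64_t pc;
    uint64_t resetvec;   /* per-CPU reset vector a machine may override */
} ToyCPU;

/* Instance init: every CPU starts out with the shared default. */
static void toy_cpu_init(ToyCPU *cpu)
{
    cpu->pc = 0;
    cpu->resetvec = DEFAULT_RSTVEC;
}

/* A machine that needs a different vector sets it before reset. */
static void toy_cpu_set_resetvec(ToyCPU *cpu, uint64_t vec)
{
    cpu->resetvec = vec;
}

/* Reset loads the pc from the per-CPU value instead of a constant. */
static void toy_cpu_reset(ToyCPU *cpu)
{
    cpu->pc = cpu->resetvec;
}
```

With this shape, the existing machines would not need to change, and only a
machine with a different Mask ROM address would call the setter.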

> E.g. there is the "reset-hivecs" property in the ARM emulation code
> so SoC-specific code can change reset vector.
>
>
> > +cs->exception_index = EXCP_NONE;
> > +set_default_nan_mode(1, &env->fp_status);
> > +}
> > +
> > +static void riscv_cpu_disas_set_info(CPUState *s, disassemble_info *info)
> > +{
> > +#if defined(TARGET_RISCV32)
> > +info->print_insn = print_insn_riscv32;
> > +#elif defined(TARGET_RISCV64)
> > +info->print_insn = print_insn_riscv64;
> > +#endif
> > +}
> > +
> > +static void riscv_cpu_realize(DeviceState *dev, Error **errp)
> > +{
> > +CPUState *cs = CPU(dev);
> > +RISCVCPU *cpu = RISCV_CPU(dev);
> > +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
> > +CPURISCVState *env = &cpu->env;
> > +Error *local_err = NULL;
> > +
> > +cpu_exec_realizefn(cs, &local_err);
> > +if (local_err != NULL) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> > +
> > +if (env->misa & RVM) {
> > +set_feature(env, RISCV_FEATURE_RVM);
> > +}
> > +if (env->misa & RVA) {
> > +set_feature(env, RISCV_FEATURE_RVA);
> > +}
> > +if (env->misa & RVF) {
> > +set_feature(env, RISCV_FEATURE_RVF);
> > +}
> > +if (env->misa & RVD) {
> > +set_feature(env, RISCV_FEATURE_RVD);
> > +}
> > +if (env->misa & RVC) {
> > +set_feature(env, RISCV_FEATURE_RVC);
> > +}
> > +
> > +qemu_init_vcpu(cs);
> > +cpu_reset(cs);
> > +
> > +mcc->parent_realize(dev, errp);
> > +}
> > +
> > +static void riscv_cpu_init(Object *obj)
> > +{
> > +CPUState *cs = CPU(obj);
> > +RISCVCPU *cpu = RISCV_CPU(obj);
> > +
> > +cs->env_ptr = &cpu->env;
> > +}
> > +
> > +static const VMStateDescription vmstate_riscv_cpu = {
> > +.name = "cpu",
> > +.unmigratable = 1,
> > +};
> > +
> > +static void riscv_cpu_class_init(ObjectClass *c, void *data)
> > +{
> > +RISCVCPUClass *mcc = RISCV_CPU_CLASS(c);
> > +CPUClass *cc = CPU_CLASS(c);
> > +DeviceClass *dc = DEVICE_CLASS(c);
> > +
> > +mcc->parent_realize = dc->realize;
> > +dc->realize = riscv_cpu_realize;
> > +
> > +mcc->parent_reset = cc->reset;
> > +cc->reset = riscv_cpu_reset;
> > +
> > +cc->class_by_name = riscv_cpu_class_by_name;
> > +cc->has_work = riscv_cpu_has_work;
> > +cc->do_interrupt = riscv_cpu_do_interrupt;
> > +cc->cpu_exec_interrupt = riscv_cpu_exec_interrupt;
> > +cc->dump_state = riscv_cpu_dump_state;
> > +cc->set_pc = riscv_cpu_set_pc;
> > +

Re: [Qemu-devel] CVE-2017-5715: relevant qemu patches

2018-01-03 Thread Alexandre DERUMIER

Does somebody have a Red Hat account to see the content of:

https://access.redhat.com/solutions/3307851
"Impacts of CVE-2017-5754, CVE-2017-5753, and CVE-2017-5715 to Red Hat 
Virtualization products"

- Original Message -
From: "aderumier" 
To: "Stefan Priebe, Profihost AG" 
Cc: "qemu-devel" 
Sent: Thursday, 4 January 2018 08:24:34
Subject: Re: [Qemu-devel] CVE-2017-5715: relevant qemu patches

>>Can anybody point me to the relevant qemu patches?

I haven't found them yet.

Do you know if a VM using the kvm64 CPU model is protected or not?

- Original Message -
From: "Stefan Priebe, Profihost AG"
To: "qemu-devel"
Sent: Thursday, 4 January 2018 07:27:01
Subject: [Qemu-devel] CVE-2017-5715: relevant qemu patches

Hello,

I've seen some vendors have updated qemu regarding meltdown / spectre.

e.g.:

CVE-2017-5715: QEMU was updated to allow passing through new MSR and 
CPUID flags from the host VM to the CPU, to allow enabling/disabling 
branch prediction features in the Intel CPU. (bsc#1068032) 

Can anybody point me to the relevant qemu patches? 

Thanks! 

Greets, 
Stefan

Re: [Qemu-devel] CVE-2017-5715: relevant qemu patches

2018-01-03 Thread Alexandre DERUMIER

>>Can anybody point me to the relevant qemu patches? 

I haven't found them yet.

Do you know if a VM using the kvm64 CPU model is protected or not?

- Original Message -
From: "Stefan Priebe, Profihost AG" 
To: "qemu-devel" 
Sent: Thursday, 4 January 2018 07:27:01
Subject: [Qemu-devel] CVE-2017-5715: relevant qemu patches

Hello, 

I've seen some vendors have updated qemu regarding meltdown / spectre.

e.g.:

CVE-2017-5715: QEMU was updated to allow passing through new MSR and 
CPUID flags from the host VM to the CPU, to allow enabling/disabling 
branch prediction features in the Intel CPU. (bsc#1068032) 

Can anybody point me to the relevant qemu patches? 

Thanks! 

Greets, 
Stefan

Re: [Qemu-devel] [virtio-dev] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

2018-01-03 Thread Jason Wang




On 2018-01-04 14:18, Tiwei Bie wrote:
> On Wed, Jan 03, 2018 at 10:34:36PM +0800, Jason Wang wrote:
>> On 2017-12-22 14:41, Tiwei Bie wrote:
>>> This RFC patch set does some small extensions to vhost-user protocol
>>> to support VFIO based accelerators, and makes it possible to get the
>>> similar performance of VFIO passthru while keeping the virtio device
>>> emulation in QEMU.
>>>
>>> When we have virtio ring compatible devices, it's possible to setup
>>> the device (DMA mapping, PCI config, etc) based on the existing info
>>> (memory-table, features, vring info, etc) which is available on the
>>> vhost-backend (e.g. DPDK vhost library). Then, we will be able to
>>> use such devices to accelerate the emulated device for the VM. And
>>> we call it vDPA: vhost DataPath Acceleration. The key difference
>>> between VFIO passthru and vDPA is that, in vDPA only the data path
>>> (e.g. ring, notify and queue interrupt) is pass-throughed, the device
>>> control path (e.g. PCI configuration space and MMIO regions) is still
>>> defined and emulated by QEMU.
>>>
>>> The benefits of keeping virtio device emulation in QEMU compared
>>> with virtio device VFIO passthru include (but not limit to):
>>>
>>> - consistent device interface from guest OS;
>>> - max flexibility on control path and hardware design;
>>> - leveraging the existing virtio live-migration framework;
>>>
>>> But the critical issue in vDPA is that the data path performance is
>>> relatively low and some host threads are needed for the data path,
>>> because some necessary mechanisms are missing to support:
>>>
>>> 1) guest driver notifies the device directly;
>>> 2) device interrupts the guest directly;
>>>
>>> So this patch set does some small extensions to vhost-user protocol
>>> to make both of them possible. It leverages the same mechanisms (e.g.
>>> EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
>>> achieve the data path pass through.
>>>
>>> A new protocol feature bit is added to negotiate the accelerator feature
>>> support. Two new slave message types are added to enable the notify and
>>> interrupt passthru for each queue. From the view of vhost-user protocol
>>> design, it's very flexible. The passthru can be enabled/disabled for
>>> each queue individually, and it's possible to accelerate each queue by
>>> different devices. More design and implementation details can be found
>>> from the last patch.
>>>
>>> There are some rough edges in this patch set (so this is a RFC patch
>>> set for now), but it's never too early to hear the thoughts from the
>>> community! So any comments and suggestions would be really appreciated!
>>>
>>> Tiwei Bie (3):
>>>   vhost-user: support receiving file descriptors in slave_read
>>>   vhost-user: introduce shared vhost-user state
>>>   vhost-user: add VFIO based accelerators support
>>>
>>>   docs/interop/vhost-user.txt     |  57 ++
>>>   hw/scsi/vhost-user-scsi.c       |   6 +-
>>>   hw/vfio/common.c                |   2 +-
>>>   hw/virtio/vhost-user.c          | 430 +++-
>>>   hw/virtio/vhost.c               |   3 +-
>>>   hw/virtio/virtio-pci.c          |   8 -
>>>   hw/virtio/virtio-pci.h          |   8 +
>>>   include/hw/vfio/vfio.h          |   2 +
>>>   include/hw/virtio/vhost-user.h  |  43
>>>   include/hw/virtio/virtio-scsi.h |   6 +-
>>>   net/vhost-user.c                |  30 +--
>>>   11 files changed, 561 insertions(+), 34 deletions(-)
>>>   create mode 100644 include/hw/virtio/vhost-user.h
>>
>> I may miss something, but may I ask why you must implement them through
>> vhost-use/dpdk. It looks to me you could put all of them in qemu which could
>> simplify a lots of things (just like userspace NVME driver wrote by Fam).
>
> Thanks for your comments! :-)
>
> Yeah, you're right. We can also implement everything in QEMU
> like the userspace NVME driver by Fam. It was also described
> by Cunming on the KVM Forum 2017. Below is the link to the
> slides:
>
> https://events.static.linuxfound.org/sites/events/files/slides/KVM17%27-vDPA.pdf

Thanks for the pointer. Looks rather interesting.

> We're also working on it (including defining a standard device
> for vhost data path acceleration based on mdev to hide vendor
> specific details).

This is exactly what I mean. From my point of view, there's no need for
any extension for vhost protocol, we just need to reuse qemu iothread to
implement a userspace vhost dataplane and do the mdev inside that thread.

> And IMO it's also not a bad idea to extend vhost-user protocol
> to support the accelerators if possible. And it could be more
> flexible because it could support (for example) below things
> easily without introducing any complex command line options or
> monitor commands to QEMU:

Maybe I was wrong but I don't think we care about the complexity of
command line or monitor command in this case.

> - the switching among different accelerators and software version
>   can be done at runtime in vhost process;
> - use different accelerators to accelerate different queue pairs
>   or just accelerate some (instead of all) queue pairs;

Well, technically, if we want, these could be implemented in qemu too.

And here's some more advantages if

Re: [Qemu-devel] [PATCH v7 07/17] target/m68k: add chk and chk2

2018-01-03 Thread Richard Henderson

On 01/03/2018 05:29 PM, Laurent Vivier wrote:
> +/* From the specs:
> + *   X: Not affected, N,V: Undefined,
> + *   Z: Set if val is equal to lb or ub
> + *   V: Set if val < lb or val > ub, cleared otherwise
^^
Just a typo here.

Otherwise,

Reviewed-by: Richard Henderson 


r~
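
For readers following along, the flag rules in the quoted comment can be
sketched in standalone C as below. This assumes the line marked as a typo is
meant to describe the carry flag, and it assumes a plain signed comparison
with lb <= ub, which is a simplification. Chk2Flags and chk2_flags are
hypothetical names for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the two flags with defined results are modelled here. */
typedef struct Chk2Flags {
    bool z;   /* val equals one of the bounds */
    bool c;   /* val lies outside [lb, ub] */
} Chk2Flags;

/* Simplified bounds check: signed compare, lb <= ub assumed. */
static Chk2Flags chk2_flags(int32_t val, int32_t lb, int32_t ub)
{
    Chk2Flags f;

    f.z = (val == lb) || (val == ub);
    f.c = (val < lb) || (val > ub);
    return f;
}
```

X is left untouched and N and V are undefined, so the sketch does not model
them.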

Re: [Qemu-devel] [PATCH v1 03/21] RISC-V CPU Core Definition

2018-01-03 Thread Antony Pavlov

On Wed,  3 Jan 2018 13:44:07 +1300
Michael Clark  wrote:

> Add CPU state header, CPU definitions and initialization routines
> 
> Signed-off-by: Michael Clark 
> ---
>  target/riscv/cpu.c  | 338 +++
>  target/riscv/cpu.h  | 363 ++
>  target/riscv/cpu_bits.h | 411 ++
>  3 files changed, 1112 insertions(+)
>  create mode 100644 target/riscv/cpu.c
>  create mode 100644 target/riscv/cpu.h
>  create mode 100644 target/riscv/cpu_bits.h
> 
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c

...

> +static void riscv_cpu_reset(CPUState *cs)
> +{
> +RISCVCPU *cpu = RISCV_CPU(cs);
> +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
> +CPURISCVState *env = &cpu->env;
> +
> +mcc->parent_reset(cs);
> +#ifndef CONFIG_USER_ONLY
> +tlb_flush(cs);
> +env->priv = PRV_M;
> +env->mtvec = DEFAULT_MTVEC;
> +#endif
> +env->pc = DEFAULT_RSTVEC;

The RISC-V Privileged Architecture Manual v1.10 states that

  The pc is set to an implementation-defined reset vector.

But hard-coded DEFAULT_RSTVEC leaves no chance for changing reset vector.

Can we add a mechanism for changing reset vector?

E.g. there is the "reset-hivecs" property in the ARM emulation code
so SoC-specific code can change reset vector.


> +cs->exception_index = EXCP_NONE;
> +set_default_nan_mode(1, &env->fp_status);
> +}
> +
> +static void riscv_cpu_disas_set_info(CPUState *s, disassemble_info *info)
> +{
> +#if defined(TARGET_RISCV32)
> +info->print_insn = print_insn_riscv32;
> +#elif defined(TARGET_RISCV64)
> +info->print_insn = print_insn_riscv64;
> +#endif
> +}
> +
> +static void riscv_cpu_realize(DeviceState *dev, Error **errp)
> +{
> +CPUState *cs = CPU(dev);
> +RISCVCPU *cpu = RISCV_CPU(dev);
> +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
> +CPURISCVState *env = &cpu->env;
> +Error *local_err = NULL;
> +
> +cpu_exec_realizefn(cs, &local_err);
> +if (local_err != NULL) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +
> +if (env->misa & RVM) {
> +set_feature(env, RISCV_FEATURE_RVM);
> +}
> +if (env->misa & RVA) {
> +set_feature(env, RISCV_FEATURE_RVA);
> +}
> +if (env->misa & RVF) {
> +set_feature(env, RISCV_FEATURE_RVF);
> +}
> +if (env->misa & RVD) {
> +set_feature(env, RISCV_FEATURE_RVD);
> +}
> +if (env->misa & RVC) {
> +set_feature(env, RISCV_FEATURE_RVC);
> +}
> +
> +qemu_init_vcpu(cs);
> +cpu_reset(cs);
> +
> +mcc->parent_realize(dev, errp);
> +}
> +
> +static void riscv_cpu_init(Object *obj)
> +{
> +CPUState *cs = CPU(obj);
> +RISCVCPU *cpu = RISCV_CPU(obj);
> +
> +cs->env_ptr = &cpu->env;
> +}
> +
> +static const VMStateDescription vmstate_riscv_cpu = {
> +.name = "cpu",
> +.unmigratable = 1,
> +};
> +
> +static void riscv_cpu_class_init(ObjectClass *c, void *data)
> +{
> +RISCVCPUClass *mcc = RISCV_CPU_CLASS(c);
> +CPUClass *cc = CPU_CLASS(c);
> +DeviceClass *dc = DEVICE_CLASS(c);
> +
> +mcc->parent_realize = dc->realize;
> +dc->realize = riscv_cpu_realize;
> +
> +mcc->parent_reset = cc->reset;
> +cc->reset = riscv_cpu_reset;
> +
> +cc->class_by_name = riscv_cpu_class_by_name;
> +cc->has_work = riscv_cpu_has_work;
> +cc->do_interrupt = riscv_cpu_do_interrupt;
> +cc->cpu_exec_interrupt = riscv_cpu_exec_interrupt;
> +cc->dump_state = riscv_cpu_dump_state;
> +cc->set_pc = riscv_cpu_set_pc;
> +cc->synchronize_from_tb = riscv_cpu_synchronize_from_tb;
> +cc->gdb_read_register = riscv_cpu_gdb_read_register;
> +cc->gdb_write_register = riscv_cpu_gdb_write_register;
> +cc->gdb_num_core_regs = 65;
> +cc->gdb_stop_before_watchpoint = true;
> +cc->disas_set_info = riscv_cpu_disas_set_info;
> +#ifdef CONFIG_USER_ONLY
> +cc->handle_mmu_fault = riscv_cpu_handle_mmu_fault;
> +#else
> +cc->do_unassigned_access = riscv_cpu_unassigned_access;
> +cc->do_unaligned_access = riscv_cpu_do_unaligned_access;
> +cc->get_phys_page_debug = riscv_cpu_get_phys_page_debug;
> +#endif
> +#ifdef CONFIG_TCG
> +cc->tcg_initialize = riscv_translate_init;
> +#endif
> +/* For now, mark unmigratable: */
> +cc->vmsd = &vmstate_riscv_cpu;
> +}
> +
> +static void cpu_register(const RISCVCPUInfo *info)
> +{
> +TypeInfo type_info = {
> +.name = g_strdup(info->name),
> +.parent = TYPE_RISCV_CPU,
> +.instance_size = sizeof(RISCVCPU),
> +.instance_init = info->initfn,
> +};
> +
> +type_register(&type_info);
> +g_free((void *)type_info.name);
> +}
> +
> +static const TypeInfo riscv_cpu_type_info = {
> +.name = TYPE_RISCV_CPU,
> +.parent = TYPE_CPU,
> +.instance_size = sizeof(RISCVCPU),
> +.instance_init = riscv_cpu_init,
> +.abstract = false,
> +.class_size =

Re: [Qemu-devel] [PATCH v14 9/9] target-arm: kvm64: handle SIGBUS signal from kernel or KVM

2018-01-03 Thread gengdongjiu

On 2018/1/3 21:44, Igor Mammedov wrote:
> On Wed, 3 Jan 2018 17:13:45 +0800
> gengdongjiu  wrote:
> 
>> On 2017/12/28 23:07, Igor Mammedov wrote:
>>> On Thu, 28 Dec 2017 13:54:18 +0800
>>> Dongjiu Geng  wrote:
>>>   
>>>> Add SIGBUS signal handler. In this handler, it checks the SIGBUS type,
>>>> translates the host VA which is delivered by host to guest PA, then fill
>>>> this PA to CPER and fill the CPER to guest APEI GHES memory, finally
>>>> notify guest according to the SIGBUS type. There are two kinds of SIGBUS
>>>> that QEMU needs to handle, which are BUS_MCEERR_AO and BUS_MCEERR_AR.
>>>>
>>>> If guest accesses the poisoned memory, it generates Synchronous External
>>>> Abort(SEA). Then host kernel gets an APEI notification and call memory_failure()
>>>> to unmapped the affected page from the guest's stage2, and SIGBUS_MCEERR_AO
>>> s/unmapped/unmap/  
>> Thanks.
>>
>>>   
>>>> is delivered to Qemu's main thread. If Qemu receives this SIGBUS, it will
>>>> create a new CPER and add it to guest APEI GHES memory, then notify the
>>>> guest with a GPIO-Signal notification.
>>> too long sentence, it's hard get what goes on here, pls split it in simple
>>> sentences/rephrase so it would be easy to understand behavior.  
>> I will split it in simple sentences/rephrase.
>> Thanks for your detailed review.
>>
>>>   

>>>> When guest hits a PG_hwpoison page, it will trap to KVM as stage2 fault, then a
>>>> SIGBUS_MCEERR_AR synchronous signal is delivered to Qemu, Qemu record this error
>>>> into guest APEI GHES memory and notify guest using Synchronous-External-Abort(SEA).

 Suggested-by: James Morse 
 Signed-off-by: Dongjiu Geng 
 ---
 Address James's comments to record CPER and notify guest for SIGBUS signal 
 handling.
 Shown some discussion in [1].

 [1]:
 https://lkml.org/lkml/2017/2/27/246
 https://lkml.org/lkml/2017/9/14/241
 https://lkml.org/lkml/2017/9/22/499
 ---
  include/sysemu/kvm.h |  2 +-
  target/arm/kvm.c |  2 ++
  target/arm/kvm64.c   | 34 ++
  3 files changed, 37 insertions(+), 1 deletion(-)

 diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
 index 3a458f5..90c1605 100644
 --- a/include/sysemu/kvm.h
 +++ b/include/sysemu/kvm.h
 @@ -361,7 +361,7 @@ bool kvm_vcpu_id_is_valid(int vcpu_id);
  /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
  unsigned long kvm_arch_vcpu_id(CPUState *cpu);
  
 -#ifdef TARGET_I386
 +#if defined(TARGET_I386) || defined(TARGET_AARCH64)
  #define KVM_HAVE_MCE_INJECTION 1
  void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
  #endif
 diff --git a/target/arm/kvm.c b/target/arm/kvm.c
 index 7c17f0d..9d25f51 100644
 --- a/target/arm/kvm.c
 +++ b/target/arm/kvm.c
 @@ -26,6 +26,7 @@
  #include "exec/address-spaces.h"
  #include "hw/boards.h"
  #include "qemu/log.h"
 +#include "exec/ram_addr.h"
  
  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
  KVM_CAP_LAST_INFO
 @@ -182,6 +183,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
  
  cap_has_mp_state = kvm_check_extension(s, KVM_CAP_MP_STATE);
  
 +qemu_register_reset(kvm_unpoison_all, NULL);
 type_register_static(&host_arm_cpu_type_info);
  
  return 0;
 diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
 index c00450d..6955d85 100644
 --- a/target/arm/kvm64.c
 +++ b/target/arm/kvm64.c
 @@ -27,6 +27,9 @@
  #include "kvm_arm.h"
  #include "internals.h"
  #include "hw/arm/arm.h"
 +#include "exec/ram_addr.h"
 +#include "hw/acpi/acpi-defs.h"
 +#include "hw/acpi/hest_ghes.h"
  
  static bool have_guest_debug;
  
 @@ -944,6 +947,37 @@ int kvm_arch_get_registers(CPUState *cs)
  return ret;
  }
  
 +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 +{
 +ram_addr_t ram_addr;
 +hwaddr paddr;
 +
 +assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 +if (addr) {
 +ram_addr = qemu_ram_addr_from_host(addr);
 +if (ram_addr != RAM_ADDR_INVALID &&
 +kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 +kvm_hwpoison_page_add(ram_addr);
 +if (code == BUS_MCEERR_AR) {
 +kvm_cpu_synchronize_state(c);
 +ghes_record_errors(ACPI_HEST_NOTIFY_SEA, paddr);
 +kvm_inject_arm_sea(c);
 +} else if (code == BUS_MCEERR_AO) {
 +ghes_record_errors(ACPI_HEST_NOTIFY_GPIO, paddr);
 +qemu_hardware_error_notify();
 +}
 +return;
 +}
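
The dispatch described in the commit message (record the error, then notify
with SEA for BUS_MCEERR_AR or with a GPIO signal for BUS_MCEERR_AO) can be
summarised with a small standalone model. This is only a sketch: the enums
and toy_on_sigbus are hypothetical stand-ins, and the TOY_NOTIFY_NONE result
for an address outside guest RAM is a placeholder, since the rest of the
quoted function is cut off here.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the two SIGBUS codes named in the patch. */
enum ToyBusCode { TOY_BUS_MCEERR_AR, TOY_BUS_MCEERR_AO };

/* How the guest is told about the recorded error. */
enum ToyNotify { TOY_NOTIFY_NONE, TOY_NOTIFY_SEA, TOY_NOTIFY_GPIO };

/* Pick the notification type for a SIGBUS on a given address. */
static enum ToyNotify toy_on_sigbus(enum ToyBusCode code, bool addr_in_guest_ram)
{
    if (!addr_in_guest_ram) {
        /* Placeholder: the real fallback path is not shown in the thread. */
        return TOY_NOTIFY_NONE;
    }

    switch (code) {
    case TOY_BUS_MCEERR_AR:
        /* Synchronous: the vCPU touched the poisoned page itself. */
        return TOY_NOTIFY_SEA;
    case TOY_BUS_MCEERR_AO:
        /* Asynchronous: reported to the guest via a GPIO signal. */
        return TOY_NOTIFY_GPIO;
    }
    return TOY_NOTIFY_NONE;
}
```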
 +

[Qemu-devel] CVE-2017-5715: relevant qemu patches

2018-01-03 Thread Stefan Priebe - Profihost AG

Hello,

I've seen some vendors have updated qemu regarding meltdown / spectre.

e.g.:

 CVE-2017-5715: QEMU was updated to allow passing through new MSR and
 CPUID flags from the host VM to the CPU, to allow enabling/disabling
 branch prediction features in the Intel CPU. (bsc#1068032)

Can anybody point me to the relevant qemu patches?

Thanks!

Greets,
Stefan

Re: [Qemu-devel] [virtio-dev] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

2018-01-03 Thread Tiwei Bie

On Wed, Jan 03, 2018 at 10:34:36PM +0800, Jason Wang wrote:
> On 2017-12-22 14:41, Tiwei Bie wrote:
> > This RFC patch set does some small extensions to vhost-user protocol
> > to support VFIO based accelerators, and makes it possible to get the
> > similar performance of VFIO passthru while keeping the virtio device
> > emulation in QEMU.
> > 
> > When we have virtio ring compatible devices, it's possible to setup
> > the device (DMA mapping, PCI config, etc) based on the existing info
> > (memory-table, features, vring info, etc) which is available on the
> > vhost-backend (e.g. DPDK vhost library). Then, we will be able to
> > use such devices to accelerate the emulated device for the VM. And
> > we call it vDPA: vhost DataPath Acceleration. The key difference
> > between VFIO passthru and vDPA is that, in vDPA only the data path
> > (e.g. ring, notify and queue interrupt) is pass-throughed, the device
> > control path (e.g. PCI configuration space and MMIO regions) is still
> > defined and emulated by QEMU.
> > 
> > The benefits of keeping virtio device emulation in QEMU compared
> > with virtio device VFIO passthru include (but not limit to):
> > 
> > - consistent device interface from guest OS;
> > - max flexibility on control path and hardware design;
> > - leveraging the existing virtio live-migration framework;
> > 
> > But the critical issue in vDPA is that the data path performance is
> > relatively low and some host threads are needed for the data path,
> > because some necessary mechanisms are missing to support:
> > 
> > 1) guest driver notifies the device directly;
> > 2) device interrupts the guest directly;
> > 
> > So this patch set does some small extensions to vhost-user protocol
> > to make both of them possible. It leverages the same mechanisms (e.g.
> > EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
> > achieve the data path pass through.
> > 
> > A new protocol feature bit is added to negotiate the accelerator feature
> > support. Two new slave message types are added to enable the notify and
> > interrupt passthru for each queue. From the view of vhost-user protocol
> > design, it's very flexible. The passthru can be enabled/disabled for
> > each queue individually, and it's possible to accelerate each queue by
> > different devices. More design and implementation details can be found
> > from the last patch.
> > 
> > There are some rough edges in this patch set (so this is a RFC patch
> > set for now), but it's never too early to hear the thoughts from the
> > community! So any comments and suggestions would be really appreciated!
> > 
> > Tiwei Bie (3):
> >vhost-user: support receiving file descriptors in slave_read
> >vhost-user: introduce shared vhost-user state
> >vhost-user: add VFIO based accelerators support
> > 
> >   docs/interop/vhost-user.txt |  57 ++
> >   hw/scsi/vhost-user-scsi.c   |   6 +-
> >   hw/vfio/common.c|   2 +-
> >   hw/virtio/vhost-user.c  | 430 +++-
> >   hw/virtio/vhost.c   |   3 +-
> >   hw/virtio/virtio-pci.c  |   8 -
> >   hw/virtio/virtio-pci.h  |   8 +
> >   include/hw/vfio/vfio.h  |   2 +
> >   include/hw/virtio/vhost-user.h  |  43 
> >   include/hw/virtio/virtio-scsi.h |   6 +-
> >   net/vhost-user.c|  30 +--
> >   11 files changed, 561 insertions(+), 34 deletions(-)
> >   create mode 100644 include/hw/virtio/vhost-user.h
> > 
> 
> I may miss something, but may I ask why you must implement them through
> vhost-use/dpdk. It looks to me you could put all of them in qemu which could
> simplify a lots of things (just like userspace NVME driver wrote by Fam).
> 

Thanks for your comments! :-)

Yeah, you're right. We can also implement everything in QEMU
like the userspace NVME driver by Fam. It was also described
by Cunming on the KVM Forum 2017. Below is the link to the
slides:

https://events.static.linuxfound.org/sites/events/files/slides/KVM17%27-vDPA.pdf

We're also working on it (including defining a standard device
for vhost data path acceleration based on mdev to hide vendor
specific details).

And IMO it's also not a bad idea to extend vhost-user protocol
to support the accelerators if possible. And it could be more
flexible because it could support (for example) below things
easily without introducing any complex command line options or
monitor commands to QEMU:

- the switching among different accelerators and software version
  can be done at runtime in vhost process;
- use different accelerators to accelerate different queue pairs
  or just accelerate some (instead of all) queue pairs;
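
To make the per-queue idea concrete, here is a rough standalone sketch of
negotiating one protocol feature bit and then toggling notify and interrupt
passthru for individual queues. Everything in it is hypothetical:
TOY_PROTOCOL_F_ACCEL, its bit position, ToyVhostDev and the toy_* helpers are
invented for illustration and are not the names or values used in the patch
set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for this sketch. */
#define TOY_PROTOCOL_F_ACCEL 7
#define TOY_MAX_QUEUES 8

typedef struct ToyVhostDev {
    uint64_t negotiated;                    /* features both sides support */
    bool notify_passthru[TOY_MAX_QUEUES];   /* guest kicks the device directly */
    bool irq_passthru[TOY_MAX_QUEUES];      /* device interrupts the guest directly */
} ToyVhostDev;

/* A feature is usable only if both master and slave advertise it. */
static void toy_negotiate(ToyVhostDev *d, uint64_t master, uint64_t slave)
{
    d->negotiated = master & slave;
}

static bool toy_has_accel(const ToyVhostDev *d)
{
    return ((d->negotiated >> TOY_PROTOCOL_F_ACCEL) & 1) != 0;
}

/*
 * Enable or disable passthru for one queue. Returns false when the
 * accelerator feature was not negotiated or the index is out of range.
 */
static bool toy_set_queue_passthru(ToyVhostDev *d, unsigned q,
                                   bool notify, bool irq)
{
    if (!toy_has_accel(d) || q >= TOY_MAX_QUEUES) {
        return false;
    }
    d->notify_passthru[q] = notify;
    d->irq_passthru[q] = irq;
    return true;
}
```

Because the state is kept per queue, accelerating only some queue pairs falls
out naturally: the remaining queues simply stay on the software path.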

Best regards,
Tiwei Bie

[Qemu-devel] [PATCH RESEND V3 16/16] COLO: quick failover process by kick COLO thread

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

The COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem)
when failover work begins. It's better to wake it up to speed up
the process.

Signed-off-by: zhanghailiang 
---
 migration/colo.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 10bc80c..cc616d9 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -134,6 +134,11 @@ static void primary_vm_do_failover(void)
 
 migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
+/*
+ * kick COLO thread which might wait at
+ * qemu_sem_wait(&s->colo_checkpoint_sem).
+ */
+ */
+colo_checkpoint_notify(migrate_get_current());
 
 /*
  * Wake up COLO thread which may blocked in recv() or send(),
@@ -519,6 +524,9 @@ static void colo_process_checkpoint(MigrationState *s)
 
 qemu_sem_wait(&s->colo_checkpoint_sem);
 
+if (s->state != MIGRATION_STATUS_COLO) {
+goto out;
+}
 ret = colo_do_checkpoint_transaction(s, bioc, fb);
 if (ret < 0) {
 goto out;
-- 
2.7.4
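
The wake-then-recheck pattern in this patch can be illustrated with a small
single-threaded model. It is a sketch only: the semaphore is reduced to a
plain counter, there is no real thread, and ToyMigration and the toy_*
functions are hypothetical names rather than QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>

enum ToyState { TOY_STATUS_COLO, TOY_STATUS_COMPLETED };

typedef struct ToyMigration {
    enum ToyState state;
    int checkpoint_sem;     /* pending wake-ups; stands in for a semaphore */
    int checkpoints_done;
} ToyMigration;

/* Normal path: something requests a checkpoint and wakes the loop. */
static void toy_checkpoint_notify(ToyMigration *s)
{
    s->checkpoint_sem++;
}

/* Failover path: change the state first, then kick the waiter. */
static void toy_do_failover(ToyMigration *s)
{
    s->state = TOY_STATUS_COMPLETED;
    toy_checkpoint_notify(s);
}

/*
 * One pass of the checkpoint loop after a wake-up. Returns false when
 * the loop should exit because the state is no longer COLO.
 */
static bool toy_checkpoint_iteration(ToyMigration *s)
{
    if (s->checkpoint_sem == 0) {
        return true;        /* a real thread would block here */
    }
    s->checkpoint_sem--;

    /* Re-check the state after waking: failover may have kicked us. */
    if (s->state != TOY_STATUS_COLO) {
        return false;
    }
    s->checkpoints_done++;
    return true;
}
```

Without the state check after the wait, the kick from failover would start one
more checkpoint transaction instead of leaving the loop.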

[Qemu-devel] [PATCH RESEND V3 14/16] filter-rewriter: handle checkpoint and failover event

2018-01-03 Thread Zhang Chen

After one round of checkpoint, the states of the PVM and SVM
become consistent, so it is unnecessary to adjust the sequence
numbers of net packets for old connections. Besides, when failover
happens, filter-rewriter needs to check whether it still needs to
adjust the sequence numbers of net packets.

Cc: Jason Wang 
Signed-off-by: zhanghailiang 
Signed-off-by: Zhang Chen 
---
 migration/colo.c  | 13 +
 net/filter-rewriter.c | 40 
 2 files changed, 53 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index a931ff2..9eab4a3 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -30,6 +30,7 @@
 #include "block/block.h"
 #include "replication.h"
 #include "sysemu/cpus.h"
+#include "net/filter.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -81,6 +82,11 @@ static void secondary_vm_do_failover(void)
 if (local_err) {
 error_report_err(local_err);
 }
+/* Notify all filters of all NIC to do checkpoint */
+colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
+if (local_err) {
+error_report_err(local_err);
+}
 
 if (!autostart) {
 error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -753,6 +759,13 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+/* Notify all filters of all NIC to do checkpoint */
+colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
+
 vmstate_loading = false;
 vm_start();
 trace_colo_vm_state_change("stop", "run");
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index a58310a..bd4b6cf 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -23,6 +23,8 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "net/colo.h"
+#include "migration/colo.h"
 
 #define FILTER_COLO_REWRITER(obj) \
 OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -280,6 +282,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 return 0;
 }
 
+static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
+{
+Connection *conn = (Connection *)value;
+
+conn->offset = 0;
+}
+
+static gboolean offset_is_nonzero(gpointer key,
+  gpointer value,
+  gpointer user_data)
+{
+Connection *conn = (Connection *)value;
+
+return conn->offset ? true : false;
+}
+
+static void colo_rewriter_handle_event(NetFilterState *nf, int event,
+   Error **errp)
+{
+RewriterState *rs = FILTER_COLO_REWRITER(nf);
+
+switch (event) {
+case COLO_EVENT_CHECKPOINT:
+g_hash_table_foreach(rs->connection_track_table,
+reset_seq_offset, NULL);
+break;
+case COLO_EVENT_FAILOVER:
+if (!g_hash_table_find(rs->connection_track_table,
+  offset_is_nonzero, NULL)) {
+object_property_set_str(OBJECT(nf), "off", "status", errp);
+}
+break;
+default:
+break;
+}
+}
+
 static void colo_rewriter_cleanup(NetFilterState *nf)
 {
 RewriterState *s = FILTER_COLO_REWRITER(nf);
@@ -335,6 +374,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
 nfc->setup = colo_rewriter_setup;
 nfc->cleanup = colo_rewriter_cleanup;
 nfc->receive_iov = colo_rewriter_receive_iov;
+nfc->handle_event = colo_rewriter_handle_event;
 }
 
 static const TypeInfo colo_rewriter_info = {
-- 
2.7.4
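
As a standalone illustration of the event handling in this patch, the sketch
below replaces the GLib hash table with a plain array. On a checkpoint every
connection's offset is reset; on failover the rewriter is switched off only if
no connection still carries a nonzero offset. ToyConn, ToyRewriter and the
toy_* functions are hypothetical names and this is not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ToyEvent { TOY_EVENT_CHECKPOINT, TOY_EVENT_FAILOVER };

typedef struct ToyConn {
    uint32_t offset;        /* sequence number offset for this connection */
} ToyConn;

typedef struct ToyRewriter {
    ToyConn *conns;
    size_t nconns;
    bool enabled;           /* stands in for the filter's "status" property */
} ToyRewriter;

/* True if at least one tracked connection still needs rewriting. */
static bool toy_any_offset_nonzero(const ToyRewriter *rs)
{
    for (size_t i = 0; i < rs->nconns; i++) {
        if (rs->conns[i].offset != 0) {
            return true;
        }
    }
    return false;
}

static void toy_handle_event(ToyRewriter *rs, enum ToyEvent event)
{
    switch (event) {
    case TOY_EVENT_CHECKPOINT:
        /* States are consistent again: no offsets to apply. */
        for (size_t i = 0; i < rs->nconns; i++) {
            rs->conns[i].offset = 0;
        }
        break;
    case TOY_EVENT_FAILOVER:
        /* Keep rewriting only while some old connection needs it. */
        if (!toy_any_offset_nonzero(rs)) {
            rs->enabled = false;
        }
        break;
    }
}
```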

[Qemu-devel] [PATCH RESEND V3 12/16] COLO: flush host dirty ram from cache

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

There is no need to flush all of the VM's RAM from the cache; only
flush the pages dirtied since the last checkpoint.

Cc: Juan Quintela 
Signed-off-by: Li Zhijian 
Signed-off-by: Zhang Chen 
Signed-off-by: zhanghailiang 
---
 migration/ram.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 23c67e0..0188712 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2679,6 +2679,7 @@ int colo_init_ram_cache(void)
 }
 ram_state = g_new0(RAMState, 1);
 ram_state->migration_dirty_pages = 0;
+memory_global_dirty_log_start();
 
 return 0;
 
@@ -2699,10 +2700,12 @@ void colo_release_ram_cache(void)
 {
 RAMBlock *block;
 
+memory_global_dirty_log_stop();
 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 g_free(block->bmap);
 block->bmap = NULL;
 }
+
 rcu_read_lock();
 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 if (block->colo_cache) {
@@ -2919,6 +2922,15 @@ static void colo_flush_ram_cache(void)
 void *src_host;
 unsigned long offset = 0;
 
+memory_global_dirty_log_sync();
+qemu_mutex_lock(&ram_state->bitmap_mutex);
+rcu_read_lock();
+RAMBLOCK_FOREACH(block) {
+migration_bitmap_sync_range(ram_state, block, 0, block->used_length);
+}
+rcu_read_unlock();
+qemu_mutex_unlock(&ram_state->bitmap_mutex);
+
 trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages);
 rcu_read_lock();
 block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 15/16] COLO: notify net filters about checkpoint/failover event

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

Notify all net filters about the checkpoint and failover event.

Cc: Jason Wang 
Signed-off-by: zhanghailiang 
---
 migration/colo.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 9eab4a3..10bc80c 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -87,6 +87,11 @@ static void secondary_vm_do_failover(void)
 if (local_err) {
 error_report_err(local_err);
 }
+/* Notify all filters of all NICs about the failover */
+colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
+if (local_err) {
+error_report_err(local_err);
+}
 
 if (!autostart) {
 error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -766,6 +771,13 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+/* Notify all filters of all NIC to do checkpoint */
+colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
+
 vmstate_loading = false;
 vm_start();
 trace_colo_vm_state_change("stop", "run");
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 10/16] qmp event: Add COLO_EXIT event to notify users while exited COLO

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

If an error happens during the VM's COLO FT stage, it is important to
notify users of the event. Together with 'x-colo-lost-heartbeat',
users can then intervene in COLO's failover work immediately.
Even if users do not want to get involved in COLO's failover verdict,
it is still necessary to notify them that we exited COLO mode.
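
The choice of exit reason can be sketched as follows. This is only an
illustration under stated assumptions: the sketch_* names are hypothetical,
and the event text is built by hand with snprintf(), whereas the patch sends
the event through the generated qapi_event_send_colo_exit().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirrors of the FailoverStatus and COLOExitReason enums. */
enum sketch_failover_status {
    SKETCH_FAILOVER_NONE,
    SKETCH_FAILOVER_REQUIRE,
    SKETCH_FAILOVER_ACTIVE,
    SKETCH_FAILOVER_COMPLETED,
    SKETCH_FAILOVER_RELAUNCH,
};

enum sketch_exit_reason {
    SKETCH_EXIT_REQUEST,
    SKETCH_EXIT_ERROR,
};

/* No failover in progress means we left COLO because of an error. */
static enum sketch_exit_reason
sketch_exit_reason_for(enum sketch_failover_status st)
{
    return st == SKETCH_FAILOVER_NONE ? SKETCH_EXIT_ERROR
                                      : SKETCH_EXIT_REQUEST;
}

static const char *sketch_exit_reason_str(enum sketch_exit_reason r)
{
    return r == SKETCH_EXIT_ERROR ? "error" : "request";
}

/* Format an illustrative COLO_EXIT event; returns snprintf()'s result. */
static int sketch_format_colo_exit(char *buf, size_t len, const char *mode,
                                   enum sketch_failover_status st)
{
    assert(buf != NULL && mode != NULL);
    return snprintf(buf, len,
                    "{\"event\": \"COLO_EXIT\", \"data\": "
                    "{\"mode\": \"%s\", \"reason\": \"%s\"}}",
                    mode, sketch_exit_reason_str(sketch_exit_reason_for(st)));
}
```

A management tool that sees reason "error" may want to trigger failover
itself, while reason "request" only confirms an action it already started.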

Cc: Markus Armbruster 
Cc: Michael Roth 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Zhang Chen 
Reviewed-by: Eric Blake 
---
 migration/colo.c| 19 +++
 qapi-schema.json| 21 +
 qapi/migration.json | 13 +
 3 files changed, 53 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 8d2e3f8..790b122 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -516,6 +516,18 @@ out:
 qemu_fclose(fb);
 }
 
+/*
+ * There are only two reasons we can get here: either some error
+ * happened, or the user triggered failover.
+ */
+if (failover_get_state() == FAILOVER_STATUS_NONE) {
+qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+  COLO_EXIT_REASON_ERROR, NULL);
+} else {
+qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+  COLO_EXIT_REASON_REQUEST, NULL);
+}
+
 /* Hope this not to be too long to wait here */
 qemu_sem_wait(&s->colo_exit_sem);
 qemu_sem_destroy(&s->colo_exit_sem);
@@ -746,6 +758,13 @@ out:
 if (local_err) {
 error_report_err(local_err);
 }
+if (failover_get_state() == FAILOVER_STATUS_NONE) {
+qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+  COLO_EXIT_REASON_ERROR, NULL);
+} else {
+qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+  COLO_EXIT_REASON_REQUEST, NULL);
+}
 
 if (fb) {
 qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 5c06745..4ff6d2c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2921,6 +2921,27 @@
 { 'command': 'query-acpi-ospm-status', 'returns': ['ACPIOSTInfo'] }
 
 ##
+# @COLO_EXIT:
+#
+# Emitted when VM finishes COLO mode due to some errors happening or
+# at the request of users.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.11
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#  "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST:
 #
 # Emitted when guest executes ACPI _OST method.
diff --git a/qapi/migration.json b/qapi/migration.json
index 03f57c9..f7b2cc6 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -854,6 +854,19 @@
 ##
 { 'enum': 'FailoverStatus',
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
+##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.11
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
 
 ##
 # @x-colo-lost-heartbeat:
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 08/16] ram/COLO: Record the dirty pages that SVM received

2018-01-03 Thread Zhang Chen

We record the addresses of the dirty pages that are received;
this helps when flushing the cached pages into the SVM.

The trick here is that we record dirty pages by re-using the migration
dirty bitmap. In a later patch, we will start the dirty log
for the SVM, just like migration. In this way, we can record the
pages dirtied by both the PVM and the SVM, and we only flush those dirty
pages from the RAM cache when doing a checkpoint.
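
The bookkeeping can be sketched like this. It is a minimal stand-alone
example, not the QEMU code: SketchBlock, sketch_test_and_set() and the
12-bit page shift are assumptions, and the bitmap here starts out empty.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_BITS 12 /* assumed 4 KiB pages */
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-RAMBlock bitmap and dirty counter. */
typedef struct {
    unsigned long *bmap;
    size_t nr_pages;
    size_t dirty_pages;
} SketchBlock;

/* Set one bit and report whether it was already set. */
static bool sketch_test_and_set(unsigned long *bmap, size_t bit)
{
    unsigned long mask = 1UL << (bit % SKETCH_BITS_PER_WORD);
    unsigned long *word = &bmap[bit / SKETCH_BITS_PER_WORD];
    bool was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Record that the page containing 'offset' was received. */
static void sketch_record_page(SketchBlock *b, size_t offset)
{
    size_t page = offset >> SKETCH_PAGE_BITS;

    assert(page < b->nr_pages);
    /* Count a page only the first time it is marked dirty. */
    if (!sketch_test_and_set(b->bmap, page)) {
        b->dirty_pages++;
    }
}
```

Because the counter only moves when a bit flips from clear to set, receiving
the same page twice between checkpoints does not inflate the dirty count.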

Cc: Juan Quintela 
Signed-off-by: zhanghailiang 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 0fc0aee..388333d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2477,6 +2477,15 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
  __func__, block->idstr);
 return NULL;
 }
+
+/*
+* During a COLO checkpoint, we need a bitmap of these migrated pages.
+* It helps us decide which pages in the RAM cache should be flushed
+* into the VM's RAM later.
+*/
+if (!test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) {
+ram_state->migration_dirty_pages++;
+}
 return block->colo_cache + offset;
 }
 
@@ -2653,6 +2662,24 @@ int colo_init_ram_cache(void)
 }
 }
 rcu_read_unlock();
+/*
+* Record the dirty pages sent by the PVM. We use this dirty bitmap
+* to decide which pages in the cache should be flushed into the SVM's RAM.
+* Here we use the same name 'ram_bitmap' as for migration.
+*/
+if (ram_bytes_total()) {
+RAMBlock *block;
+
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+unsigned long pages = block->max_length >> TARGET_PAGE_BITS;
+
+block->bmap = bitmap_new(pages);
+bitmap_set(block->bmap, 0, pages);
+ }
+}
+ram_state = g_new0(RAMState, 1);
+ram_state->migration_dirty_pages = 0;
+
 return 0;
 
 out_locked:
@@ -2672,6 +2699,10 @@ void colo_release_ram_cache(void)
 {
 RAMBlock *block;
 
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+g_free(block->bmap);
+block->bmap = NULL;
+}
 rcu_read_lock();
 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 if (block->colo_cache) {
@@ -2680,6 +2711,8 @@ void colo_release_ram_cache(void)
 }
 }
 rcu_read_unlock();
+g_free(ram_state);
+ram_state = NULL;
 }
 
 /**
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 09/16] COLO: Flush memory data from ram cache

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

While the VM is running, the PVM may dirty some pages. We transfer the
PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the next
checkpoint. So the content of the SVM's RAM cache will always be the same as
the PVM's memory after a checkpoint.

Instead of flushing the whole RAM cache into the SVM's memory,
we do this in a more efficient way:
only flush the pages dirtied by the PVM since the last checkpoint.
In this way, we can ensure the SVM's memory is the same as the PVM's.

Besides, we must ensure the RAM cache is flushed before loading the device state.
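
The selective flush can be sketched as below. This is a simplified model
under stated assumptions: sketch_flush_dirty() and the tiny 16-byte page size
are made up for the example, and it scans the bitmap linearly instead of
using migration_bitmap_find_dirty() as the patch does.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_FLUSH_PAGE_SIZE 16 /* tiny pages keep the example small */
#define SKETCH_FLUSH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: copy every page whose dirty bit is set from the
 * cache into guest RAM, clear the bit, and return the number of pages
 * that were flushed.
 */
static size_t sketch_flush_dirty(unsigned char *ram,
                                 const unsigned char *cache,
                                 unsigned long *bmap,
                                 size_t nr_pages)
{
    size_t flushed = 0;

    assert(ram != NULL && cache != NULL && bmap != NULL);
    for (size_t page = 0; page < nr_pages; page++) {
        unsigned long mask = 1UL << (page % SKETCH_FLUSH_WORD_BITS);
        unsigned long *word = &bmap[page / SKETCH_FLUSH_WORD_BITS];

        if (!(*word & mask)) {
            continue; /* clean page: guest RAM is already up to date */
        }
        *word &= ~mask;
        memcpy(ram + page * SKETCH_FLUSH_PAGE_SIZE,
               cache + page * SKETCH_FLUSH_PAGE_SIZE,
               SKETCH_FLUSH_PAGE_SIZE);
        flushed++;
    }
    return flushed;
}
```

After the call the bitmap is empty, which corresponds to the assertion in the
patch that no dirty pages remain once the flush has finished.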

Cc: Juan Quintela 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/ram.c| 39 +++
 migration/trace-events |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 388333d..23c67e0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2908,6 +2908,40 @@ static bool postcopy_is_running(void)
 return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that have been dirtied by the PVM, the SVM, or both.
+ */
+static void colo_flush_ram_cache(void)
+{
+RAMBlock *block = NULL;
+void *dst_host;
+void *src_host;
+unsigned long offset = 0;
+
+trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages);
+rcu_read_lock();
+block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+while (block) {
+offset = migration_bitmap_find_dirty(ram_state, block, offset);
+migration_bitmap_clear_dirty(ram_state, block, offset);
+
+if (offset << TARGET_PAGE_BITS >= block->used_length) {
+offset = 0;
+block = QLIST_NEXT_RCU(block, next);
+} else {
+dst_host = block->host + (offset << TARGET_PAGE_BITS);
+src_host = block->colo_cache + (offset << TARGET_PAGE_BITS);
+memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+}
+}
+
+rcu_read_unlock();
+trace_colo_flush_ram_cache_end();
+assert(ram_state->migration_dirty_pages == 0);
+}
+
 static int ram_load(QEMUFile *f, void *opaque, int version_id)
 {
 int flags = 0, ret = 0, invalid_flags = 0;
@@ -2920,6 +2954,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 bool postcopy_running = postcopy_is_running();
 /* ADVISE is earlier, it shows the source has the postcopy capability on */
 bool postcopy_advised = postcopy_is_advised();
+bool need_flush = false;
 
 seq_iter++;
 
@@ -3095,6 +3130,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 wait_for_decompress_done();
 rcu_read_unlock();
 trace_ram_load_complete(ret, seq_iter);
+
+if (!ret && migration_incoming_in_colo_state() && need_flush) {
+colo_flush_ram_cache();
+}
 return ret;
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index d4738c8..69261a0 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -78,6 +78,8 @@ ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_page(const char *rbname, uint64_t offset, void *host) "%s: offset: 0x%" PRIx64 " host: %p"
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: 0x%zx len: 0x%zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""
 
 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 13/16] filter: Add handle_event method for NetFilterClass

2018-01-03 Thread Zhang Chen

Filters need to process checkpoint/failover events, or
other events passed by the COLO frame.
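
The dispatch pattern can be sketched in plain C as follows. This is not the
QOM-based code from the patch: the Sketch* types, the 'enabled' flag and the
'seq_offset' field are simplified stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_event {
    SKETCH_EVENT_NONE,
    SKETCH_EVENT_CHECKPOINT,
    SKETCH_EVENT_FAILOVER,
};

typedef struct SketchFilter SketchFilter;
typedef void (SketchHandleEvent)(SketchFilter *f, int event);

/* Hypothetical stand-in for NetFilterClass: one optional callback. */
typedef struct {
    SketchHandleEvent *handle_event;
} SketchFilterClass;

struct SketchFilter {
    const SketchFilterClass *klass;
    int enabled;    /* stands in for the "status" property */
    int seq_offset; /* stands in for a rewriter's per-connection offset */
};

/* Default behaviour: ignore checkpoints, switch off on failover. */
static void sketch_default_handle_event(SketchFilter *f, int event)
{
    if (event == SKETCH_EVENT_FAILOVER) {
        f->enabled = 0;
    }
}

/* Rewriter-like override: stay on while an offset is still in use. */
static void sketch_rewriter_handle_event(SketchFilter *f, int event)
{
    switch (event) {
    case SKETCH_EVENT_CHECKPOINT:
        f->seq_offset = 0;
        break;
    case SKETCH_EVENT_FAILOVER:
        if (f->seq_offset == 0) {
            f->enabled = 0;
        }
        break;
    default:
        break;
    }
}

/* Deliver one event to every filter that implements the callback. */
static void sketch_notify_filters(SketchFilter **filters, size_t n, int event)
{
    for (size_t i = 0; i < n; i++) {
        if (!filters[i]->klass->handle_event) {
            continue;
        }
        filters[i]->klass->handle_event(filters[i], event);
    }
}
```

Keeping the callback in the class rather than in each instance lets a filter
type override the default once, which is how the rewriter in this series
changes its failover behaviour.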

Cc: Jason Wang 
Signed-off-by: zhanghailiang 
---
 include/net/filter.h |  5 +
 net/filter.c | 17 +
 net/net.c| 28 
 3 files changed, 50 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 0c4a2ea..df4510d 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -37,6 +37,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
 
 typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
 
+typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp);
+
 typedef struct NetFilterClass {
 ObjectClass parent_class;
 
@@ -44,6 +46,7 @@ typedef struct NetFilterClass {
 FilterSetup *setup;
 FilterCleanup *cleanup;
 FilterStatusChanged *status_changed;
+FilterHandleEvent *handle_event;
 /* mandatory */
 FilterReceiveIOV *receive_iov;
 } NetFilterClass;
@@ -76,4 +79,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
 int iovcnt,
 void *opaque);
 
+void colo_notify_filters_event(int event, Error **errp);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index 2fd7d7d..0f17eba 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -17,6 +17,8 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
+#include "net/colo.h"
+#include "migration/colo.h"
 
 static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
 {
@@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj)
 g_free(nf->netdev_id);
 }
 
+static void dummy_handle_event(NetFilterState *nf, int event, Error **errp)
+{
+switch (event) {
+case COLO_EVENT_CHECKPOINT:
+break;
+case COLO_EVENT_FAILOVER:
+object_property_set_str(OBJECT(nf), "off", "status", errp);
+break;
+default:
+break;
+}
+}
+
 static void netfilter_class_init(ObjectClass *oc, void *data)
 {
 UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+NetFilterClass *nfc = NETFILTER_CLASS(oc);
 
 ucc->complete = netfilter_complete;
+nfc->handle_event = dummy_handle_event;
 }
 
 static const TypeInfo netfilter_info = {
diff --git a/net/net.c b/net/net.c
index 39ef546..babbfd4 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1399,6 +1399,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
 }
 }
 
+void colo_notify_filters_event(int event, Error **errp)
+{
+NetClientState *nc, *peer;
+NetClientDriver type;
+NetFilterState *nf;
+NetFilterClass *nfc = NULL;
+Error *local_err = NULL;
+
+QTAILQ_FOREACH(nc, &net_clients, next) {
+peer = nc->peer;
+type = nc->info->type;
+if (!peer || type != NET_CLIENT_DRIVER_TAP) {
+continue;
+}
+QTAILQ_FOREACH(nf, >filters, next) {
+nfc =  NETFILTER_GET_CLASS(OBJECT(nf));
+if (!nfc->handle_event) {
+continue;
+}
+nfc->handle_event(nf, event, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
+}
+
 void qmp_set_link(const char *name, bool up, Error **errp)
 {
 NetClientState *ncs[MAX_QUEUE_NUM];
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 04/16] COLO: integrate colo compare with colo frame

2018-01-03 Thread Zhang Chen

For COLO FT, both the PVM and the SVM run at the same time,
and state is only synced when needed.

So here, let the SVM run while no checkpoint is in progress, and change
DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100.

Besides, we forgot to release colo_checkpoint_sem and
colo_delay_timer; fix that here.

Cc: Jason Wang 
Signed-off-by: zhanghailiang 
Signed-off-by: Zhang Chen 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/colo.c  | 42 --
 migration/migration.c |  4 ++--
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index dee3aa8..c513805 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -24,8 +24,11 @@
 #include "migration/failover.h"
 #include "replication.h"
 #include "qmp-commands.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"
 
 static bool vmstate_loading;
+static Notifier packets_compare_notifier;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -342,6 +345,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 goto out;
 }
 
+colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err);
+if (local_err) {
+goto out;
+}
+
 /* Disable block migration */
 migrate_set_block_enabled(false, &local_err);
 qemu_savevm_state_header(fb);
@@ -399,6 +407,11 @@ out:
 return ret;
 }
 
+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
 QIOChannelBuffer *bioc;
@@ -415,6 +428,9 @@ static void colo_process_checkpoint(MigrationState *s)
 goto out;
 }
 
+packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+colo_compare_register_notifier(&packets_compare_notifier);
+
 /*
  * Wait for Secondary finish loading VM states and enter COLO
  * restore.
@@ -460,11 +476,21 @@ out:
 qemu_fclose(fb);
 }
 
-timer_del(s->colo_delay_timer);
-
 /* Hope this not to be too long to wait here */
 qemu_sem_wait(&s->colo_exit_sem);
 qemu_sem_destroy(&s->colo_exit_sem);
+
+/*
+ * It is safe to unregister the notifier after failover has finished.
+ * Besides, colo_delay_timer and colo_checkpoint_sem can't be
+ * released before unregistering the notifier, or there will be a
+ * use-after-free error.
+ */
+colo_compare_unregister_notifier(&packets_compare_notifier);
+timer_del(s->colo_delay_timer);
+timer_free(s->colo_delay_timer);
+qemu_sem_destroy(&s->colo_checkpoint_sem);
+
 /*
  * Must be called after failover BH is completed,
  * Or the failover BH may shutdown the wrong fd that
@@ -557,6 +583,11 @@ void *colo_process_incoming_thread(void *opaque)
 fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
 object_unref(OBJECT(bioc));
 
+qemu_mutex_lock_iothread();
+vm_start();
+trace_colo_vm_state_change("stop", "run");
+qemu_mutex_unlock_iothread();
+
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
   &local_err);
 if (local_err) {
@@ -576,6 +607,11 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+qemu_mutex_lock_iothread();
+vm_stop_force_state(RUN_STATE_COLO);
+trace_colo_vm_state_change("run", "stop");
+qemu_mutex_unlock_iothread();
+
 /* FIXME: This is unnecessary for periodic checkpoint mode */
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
  &local_err);
@@ -629,6 +665,8 @@ void *colo_process_incoming_thread(void *opaque)
 }
 
 vmstate_loading = false;
+vm_start();
+trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
 
 if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index 4de3b55..ced463c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -74,9 +74,9 @@
 #define DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE (64 * 1024 * 1024)
 
 /* The delay time (in ms) between two COLO checkpoints
- * Note: Please change this default value to 1 when we support hybrid mode.
+ * Note: Please change this default value to 2 when we support hybrid mode.
  */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
 #define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16
 
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 06/16] COLO: Remove colo_state migration struct

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

We need to know whether migration is going into COLO state on the
incoming side before starting normal migration.

Instead of using the VMStateDescription to send colo_state
from the source side to the destination side, we use MIG_CMD_ENABLE_COLO
to indicate whether COLO is enabled or not.

Signed-off-by: zhanghailiang 
---
 include/migration/colo.h |  5 ++--
 migration/Makefile.objs  |  2 +-
 migration/colo-comm.c| 76 
 migration/colo.c | 13 -
 migration/migration.c| 23 ++-
 migration/savevm.c   | 19 
 migration/savevm.h   |  1 +
 migration/trace-events   |  1 +
 vl.c |  2 --
 9 files changed, 59 insertions(+), 83 deletions(-)
 delete mode 100644 migration/colo-comm.c

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 6adf3a5..546cb9a 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -27,8 +27,9 @@ void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
 /* loadvm */
-bool migration_incoming_enable_colo(void);
-void migration_incoming_exit_colo(void);
+void migration_incoming_enable_colo(void);
+void migration_incoming_disable_colo(void);
+bool migration_incoming_colo_enabled(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
 
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 99e0380..3099eec 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y += migration.o socket.o fd.o exec.o
 common-obj-y += tls.o channel.o savevm.o
-common-obj-y += colo-comm.o colo.o colo-failover.o
+common-obj-y += colo.o colo-failover.o
 common-obj-y += vmstate.o vmstate-types.o page_cache.o
 common-obj-y += qemu-file.o global_state.o
 common-obj-y += qemu-file-channel.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
deleted file mode 100644
index df26e4d..000
--- a/migration/colo-comm.c
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
- * (a.k.a. Fault Tolerance or Continuous Replication)
- *
- * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
- * Copyright (c) 2016 FUJITSU LIMITED
- * Copyright (c) 2016 Intel Corporation
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or
- * later. See the COPYING file in the top-level directory.
- *
- */
-
-#include "qemu/osdep.h"
-#include "migration.h"
-#include "migration/colo.h"
-#include "migration/vmstate.h"
-#include "trace.h"
-
-typedef struct {
- bool colo_requested;
-} COLOInfo;
-
-static COLOInfo colo_info;
-
-COLOMode get_colo_mode(void)
-{
-if (migration_in_colo_state()) {
-return COLO_MODE_PRIMARY;
-} else if (migration_incoming_in_colo_state()) {
-return COLO_MODE_SECONDARY;
-} else {
-return COLO_MODE_UNKNOWN;
-}
-}
-
-static int colo_info_pre_save(void *opaque)
-{
-COLOInfo *s = opaque;
-
-s->colo_requested = migrate_colo_enabled();
-
-return 0;
-}
-
-static bool colo_info_need(void *opaque)
-{
-   return migrate_colo_enabled();
-}
-
-static const VMStateDescription colo_state = {
-.name = "COLOState",
-.version_id = 1,
-.minimum_version_id = 1,
-.pre_save = colo_info_pre_save,
-.needed = colo_info_need,
-.fields = (VMStateField[]) {
-VMSTATE_BOOL(colo_requested, COLOInfo),
-VMSTATE_END_OF_LIST()
-},
-};
-
-void colo_info_init(void)
-{
-vmstate_register(NULL, 0, &colo_state, &colo_info);
-}
-
-bool migration_incoming_enable_colo(void)
-{
-return colo_info.colo_requested;
-}
-
-void migration_incoming_exit_colo(void)
-{
-colo_info.colo_requested = false;
-}
diff --git a/migration/colo.c b/migration/colo.c
index 0e689df..8d2e3f8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -153,6 +153,17 @@ static void primary_vm_do_failover(void)
 qemu_sem_post(&s->colo_exit_sem);
 }
 
+COLOMode get_colo_mode(void)
+{
+if (migration_in_colo_state()) {
+return COLO_MODE_PRIMARY;
+} else if (migration_incoming_in_colo_state()) {
+return COLO_MODE_SECONDARY;
+} else {
+return COLO_MODE_UNKNOWN;
+}
+}
+
 void colo_do_failover(MigrationState *s)
 {
 /* Make sure VM stopped while failover happened. */
@@ -747,7 +758,7 @@ out:
 if (mis->to_src_file) {
 qemu_fclose(mis->to_src_file);
 }
-migration_incoming_exit_colo();
+migration_incoming_disable_colo();
 
 return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 3410145..8c16129 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -240,6 +240,22 @@ void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
 }
 }
 
+static bool migration_colo_enabled;
+bool migration_incoming_colo_enabled(void)
+{
+return

[Qemu-devel] [PATCH RESEND V3 11/16] savevm: split the process of different stages for loadvm/savevm

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

There are several stages during the loadvm/savevm process. In each stage,
the incoming migration processes different types of sections.
We want to control these stages more accurately, which will benefit COLO
performance: we don't have to save sections of type QEMU_VM_SECTION_START
every time we do a checkpoint. Besides, we want to separate
the process of saving/loading memory and device state.

So we add new helper functions, including qemu_load_device_state() and
qemu_savevm_live_state(), to handle the different stages during migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the code of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().

Cc: Juan Quintela 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Zhang Chen 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/colo.c   | 37 +
 migration/savevm.c | 35 ---
 migration/savevm.h |  4 
 3 files changed, 61 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 790b122..a931ff2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -29,6 +29,7 @@
 #include "qapi-event.h"
 #include "block/block.h"
 #include "replication.h"
+#include "sysemu/cpus.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -380,24 +381,31 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 
 /* Disable block migration */
 migrate_set_block_enabled(false, &local_err);
-qemu_savevm_state_header(fb);
-qemu_savevm_state_setup(fb);
 qemu_mutex_lock_iothread();
 replication_do_checkpoint_all(&local_err);
 if (local_err) {
 qemu_mutex_unlock_iothread();
 goto out;
 }
-qemu_savevm_state_complete_precopy(fb, false, false);
-qemu_mutex_unlock_iothread();
-
-qemu_fflush(fb);
 
 colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
 if (local_err) {
 goto out;
 }
 /*
+ * Only save the VM's live state, which does not include device state.
+ * TODO: We may need a timeout mechanism to prevent the COLO process
+ * from being blocked here.
+ */
+qemu_savevm_live_state(s->to_dst_file);
+/* Note: device state is saved into buffer */
+ret = qemu_save_device_state(fb);
+
+qemu_mutex_unlock_iothread();
+
+qemu_fflush(fb);
+
+/*
  * We need the size of the VMstate data in Secondary side,
  * With which we can decide how much data should be read.
  */
@@ -610,6 +618,7 @@ void *colo_process_incoming_thread(void *opaque)
 uint64_t total_size;
 uint64_t value;
 Error *local_err = NULL;
+int ret;
 
 qemu_sem_init(&mis->colo_incoming_sem, 0);
 
@@ -682,6 +691,16 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+qemu_mutex_lock_iothread();
+cpu_synchronize_all_pre_loadvm();
+ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+qemu_mutex_unlock_iothread();
+
+if (ret < 0) {
+error_report("Load VM's live state (ram) error");
+goto out;
+}
+
 value = colo_receive_message_value(mis->from_src_file,
  COLO_MESSAGE_VMSTATE_SIZE, &local_err);
 if (local_err) {
@@ -715,8 +734,9 @@ void *colo_process_incoming_thread(void *opaque)
 qemu_mutex_lock_iothread();
 qemu_system_reset(SHUTDOWN_CAUSE_NONE);
 vmstate_loading = true;
-if (qemu_loadvm_state(fb) < 0) {
-error_report("COLO: loadvm failed");
+ret = qemu_load_device_state(fb);
+if (ret < 0) {
+error_report("COLO: load device state failed");
 qemu_mutex_unlock_iothread();
 goto out;
 }
@@ -777,6 +797,7 @@ out:
 if (mis->to_src_file) {
 qemu_fclose(mis->to_src_file);
 }
+qemu_loadvm_state_cleanup();
 migration_incoming_disable_colo();
 
 return NULL;
diff --git a/migration/savevm.c b/migration/savevm.c
index c582716..30a3c77 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1317,13 +1317,20 @@ done:
 return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-SaveStateEntry *se;
+/* save QEMU_VM_SECTION_END section */
+qemu_savevm_state_complete_precopy(f, true, false);
+qemu_put_byte(f, QEMU_VM_EOF);
+}
 
-qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+SaveStateEntry *se;
 
+if (!migration_in_colo_state()) {
+qemu_savevm_state_header(f);
+}
 cpu_synchronize_all_states();
 
 QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1379,8 +1386,6 @@ enum

[Qemu-devel] [PATCH RESEND V3 07/16] COLO: Load dirty pages into SVM's RAM cache firstly

2018-01-03 Thread Zhang Chen

From: zhanghailiang 

We should not load the PVM's state directly into the SVM, because errors
may happen while the SVM is receiving data, which would break the SVM.

We need to ensure all data has been received before loading the state into the
SVM. We use extra memory to cache this data (the PVM's RAM). The RAM cache on
the secondary side is initially the same as the SVM/PVM's memory. During a
checkpoint, we first cache the PVM's dirty pages in this RAM cache, so the RAM
cache is always the same as the PVM's memory at every checkpoint. Then we flush
the cached RAM to the SVM after we have received all of the PVM's state.
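
The stage-then-commit idea can be sketched as follows. It is a minimal model
and not the QEMU implementation: SketchGuestRam and the sketch_cache_*()
helpers are hypothetical, and malloc() stands in for qemu_anon_ram_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a RAMBlock with its COLO cache. */
typedef struct {
    unsigned char *ram;   /* memory the running guest sees */
    unsigned char *cache; /* staging copy updated during a checkpoint */
    size_t size;
} SketchGuestRam;

/* Allocate the cache and make it an exact copy of guest memory. */
static int sketch_cache_init(SketchGuestRam *g)
{
    g->cache = malloc(g->size);
    if (!g->cache) {
        return -1;
    }
    memcpy(g->cache, g->ram, g->size);
    return 0;
}

/* Stage incoming data in the cache only; guest memory is untouched. */
static int sketch_cache_stage(SketchGuestRam *g, size_t offset,
                              const unsigned char *data, size_t len)
{
    if (offset > g->size || len > g->size - offset) {
        return -1; /* reject out-of-range writes */
    }
    memcpy(g->cache + offset, data, len);
    return 0;
}

/* Called once the whole checkpoint has been received successfully. */
static void sketch_cache_commit(SketchGuestRam *g)
{
    memcpy(g->ram, g->cache, g->size);
}

static void sketch_cache_release(SketchGuestRam *g)
{
    free(g->cache);
    g->cache = NULL;
}
```

If receiving fails halfway, the caller simply skips the commit step, so the
guest keeps running on its old, consistent memory image.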

Cc: Dr. David Alan Gilbert 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Zhang Chen 
---
 include/exec/ram_addr.h |  1 +
 migration/migration.c   |  2 +
 migration/ram.c | 97 +++--
 migration/ram.h |  4 ++
 migration/savevm.c  |  2 +-
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6cbc02a..6b7b0dd 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ struct RAMBlock {
 struct rcu_head rcu;
 struct MemoryRegion *mr;
 uint8_t *host;
+uint8_t *colo_cache; /* For colo, VM's ram cache */
 ram_addr_t offset;
 ram_addr_t used_length;
 ram_addr_t max_length;
diff --git a/migration/migration.c b/migration/migration.c
index 8c16129..315b6d4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -382,6 +382,8 @@ static void process_incoming_migration_co(void *opaque)
 
 /* Wait checkpoint incoming thread exit before free resource */
 qemu_thread_join(&mis->colo_incoming_thread);
+/* We hold the global iothread lock, so it is safe here */
+colo_release_ram_cache();
 }
 
 if (ret < 0) {
diff --git a/migration/ram.c b/migration/ram.c
index 021d583..0fc0aee 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2466,6 +2466,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
 return block->host + offset;
 }
 
+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+ ram_addr_t offset)
+{
+if (!offset_in_ramblock(block, offset)) {
+return NULL;
+}
+if (!block->colo_cache) {
+error_report("%s: colo_cache is NULL in block :%s",
+ __func__, block->idstr);
+return NULL;
+}
+return block->colo_cache + offset;
+}
+
 /**
  * ram_handle_compressed: handle the zero page case
  *
@@ -2619,6 +2633,55 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
 qemu_mutex_unlock(&decomp_done_lock);
 }
 
+/*
+ * colo cache: this is for the secondary VM. We cache the whole
+ * memory of the secondary VM. The global lock must be held
+ * to call this helper.
+ */
+int colo_init_ram_cache(void)
+{
+RAMBlock *block;
+
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+if (!block->colo_cache) {
+error_report("%s: Can't alloc memory for COLO cache of block %s,"
+ "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+ block->used_length);
+goto out_locked;
+}
+}
+rcu_read_unlock();
+return 0;
+
+out_locked:
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+if (block->colo_cache) {
+qemu_anon_ram_free(block->colo_cache, block->used_length);
+block->colo_cache = NULL;
+}
+}
+
+rcu_read_unlock();
+return -errno;
+}
+
+/* The global lock must be held to call this helper */
+void colo_release_ram_cache(void)
+{
+RAMBlock *block;
+
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+if (block->colo_cache) {
+qemu_anon_ram_free(block->colo_cache, block->used_length);
+block->colo_cache = NULL;
+}
+}
+rcu_read_unlock();
+}
+
 /**
  * ram_load_setup: Setup RAM for migration incoming side
  *
@@ -2632,6 +2695,7 @@ static int ram_load_setup(QEMUFile *f, void *opaque)
 xbzrle_load_setup();
 compress_threads_load_setup();
 ramblock_recv_map_init();
+
 return 0;
 }
 
@@ -2645,6 +2709,7 @@ static int ram_load_cleanup(void *opaque)
 g_free(rb->receivedmap);
 rb->receivedmap = NULL;
 }
+
 return 0;
 }
 
@@ -2845,7 +2910,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 
 while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
 ram_addr_t addr, total_ram_bytes;
-void *host = NULL;
+void *host = NULL, *host_bak = NULL;
 uint8_t ch;
 
 addr = qemu_get_be64(f);
@@ -2865,13 +2930,36 @@

[Qemu-devel] [PATCH RESEND V3 02/16] colo-compare: implement the process of checkpoint

2018-01-03 Thread Zhang Chen

While doing a checkpoint, we need to flush all the unhandled packets.
By using the filter notifier mechanism, we can easily notify
every compare object to do this process, which runs inside
the compare threads as a coroutine.
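
The notify-and-wait handshake described above can be sketched outside QEMU with plain POSIX threads. This is only an illustration under stated assumptions: the worker threads stand in for the compare threads, `packets_flushed` stands in for the real packet flush, and the `sketch_` names are hypothetical, not part of the patch.

```c
#include <pthread.h>

#define SKETCH_WORKERS 3

static pthread_mutex_t sketch_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sketch_done_cond = PTHREAD_COND_INITIALIZER;
static int sketch_unhandled_count;
static int packets_flushed; /* hypothetical stand-in for flushed packets */

/* Worker side: handle the event, then report completion under the lock. */
static void *sketch_handle_event(void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&sketch_mtx);
    packets_flushed++;
    sketch_unhandled_count--;
    pthread_cond_broadcast(&sketch_done_cond);
    pthread_mutex_unlock(&sketch_mtx);
    return NULL;
}

/* Notifier side: start every worker, then block until all have reported. */
static int sketch_notify_all_and_wait(void)
{
    pthread_t tids[SKETCH_WORKERS];
    int started = 0;
    int i;

    pthread_mutex_lock(&sketch_mtx);
    for (i = 0; i < SKETCH_WORKERS; i++) {
        if (pthread_create(&tids[started], NULL,
                           sketch_handle_event, NULL) != 0) {
            break;
        }
        /* Workers cannot decrement yet because we still hold the lock. */
        sketch_unhandled_count++;
        started++;
    }
    while (sketch_unhandled_count > 0) {
        /* Releases the lock while sleeping, so workers can make progress. */
        pthread_cond_wait(&sketch_done_cond, &sketch_mtx);
    }
    pthread_mutex_unlock(&sketch_mtx);

    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return packets_flushed;
}
```

Counting pending handlers under the same mutex that guards the condition variable means the waiter cannot miss a wakeup, which is the property the checkpoint path relies on.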

Cc: Jason Wang 
Signed-off-by: zhanghailiang 
Signed-off-by: Zhang Chen 
---
 include/migration/colo.h |  6 
 net/colo-compare.c   | 71 
 net/colo-compare.h   | 22 +++
 3 files changed, 99 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/include/migration/colo.h b/include/migration/colo.h
index ff9874e..6adf3a5 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -15,6 +15,12 @@
 
 #include "qemu-common.h"
 
+enum colo_event {
+COLO_EVENT_NONE,
+COLO_EVENT_CHECKPOINT,
+COLO_EVENT_FAILOVER,
+};
+
 void colo_info_init(void);
 
 void migrate_start_colo_process(MigrationState *s);
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 0ebdec9..e9cfca2 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,28 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "sysemu/iothread.h"
+#include "net/colo-compare.h"
+#include "migration/colo.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
 OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+   QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER,
+   .initialized = true };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER,
+.initialized = true};
+static int event_unhandled_count;
+
 /*
  *  + CompareState ++
  *  |   |
@@ -86,6 +97,11 @@ typedef struct CompareState {
 IOThread *iothread;
 GMainContext *worker_context;
 QEMUTimer *packet_check_timer;
+
+QEMUBH *event_bh;
+enum colo_event event;
+
+QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -631,6 +647,25 @@ static void check_old_packet_regular(void *opaque)
 REGULAR_PACKET_CHECK_MS);
 }
 
+/* Public API, Used for COLO frame to notify compare event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+CompareState *s;
+
+qemu_mutex_lock(&event_mtx);
+QTAILQ_FOREACH(s, &net_compares, next) {
+s->event = event;
+qemu_bh_schedule(s->event_bh);
+event_unhandled_count++;
+}
+/* Wait for all compare threads to finish handling this event */
+while (event_unhandled_count > 0) {
+qemu_cond_wait(&event_complete_cond, &event_mtx);
+}
+
+qemu_mutex_unlock(&event_mtx);
+}
+
 static void colo_compare_timer_init(CompareState *s)
 {
 AioContext *ctx = iothread_get_aio_context(s->iothread);
@@ -651,6 +686,28 @@ static void colo_compare_timer_del(CompareState *s)
 }
  }
 
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque)
+{
+CompareState *s = opaque;
+
+switch (s->event) {
+case COLO_EVENT_CHECKPOINT:
+g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+break;
+case COLO_EVENT_FAILOVER:
+break;
+default:
+break;
+}
+qemu_mutex_lock(&event_mtx);
+assert(event_unhandled_count > 0);
+event_unhandled_count--;
+qemu_cond_broadcast(&event_complete_cond);
+qemu_mutex_unlock(&event_mtx);
+}
+
 static void colo_compare_iothread(CompareState *s)
 {
 object_ref(OBJECT(s->iothread));
@@ -664,6 +721,7 @@ static void colo_compare_iothread(CompareState *s)
  s, s->worker_context, true);
 
 colo_compare_timer_init(s);
+s->event_bh = qemu_bh_new(colo_compare_handle_event, s);
 }
 
 static char *compare_get_pri_indev(Object *obj, Error **errp)
@@ -821,6 +879,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
 net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize, s->vnet_hdr);
 net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize, s->vnet_hdr);
 
+QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
 g_queue_init(&s->conn_list);
 
 s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -885,6 +945,7 @@ static void colo_compare_init(Object *obj)
 static void colo_compare_finalize(Object *obj)
 {
 CompareState *s = COLO_COMPARE(obj);
+CompareState *tmp = NULL;
 
 qemu_chr_fe_deinit(&s->chr_pri_in, false);
 qemu_chr_fe_deinit(&s->chr_sec_in, false);
@@ -892,6 +953,16 @@ static void colo_compare_finalize(Object *obj)
 if (s->iothread) {
 colo_compare_timer_del(s);
 }
+
+qemu_bh_delete(s->event_bh);
+
+QTAILQ_FOREACH(tmp, &net_compares, next) {
+

[Qemu-devel] [PATCH RESEND V3 03/16] colo-compare: use notifier to notify packets comparing result

2018-01-03 Thread Zhang Chen

It's a good idea to use a notifier to tell the COLO frame when
packet comparison finds inconsistent packets.
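
The notifier pattern mentioned above can be illustrated with a minimal, self-contained callback list. This is a sketch and not QEMU's notifier implementation: all `Sketch`/`sketch_` names are hypothetical, and the callback only counts checkpoint requests instead of talking to the COLO frame.

```c
#include <stddef.h>

typedef struct SketchNotifier SketchNotifier;

struct SketchNotifier {
    void (*notify)(SketchNotifier *n, void *data);
    SketchNotifier *next;
};

typedef struct SketchNotifierList {
    SketchNotifier *head;
} SketchNotifierList;

/* Register a callback at the front of the list. */
static void sketch_notifier_add(SketchNotifierList *list, SketchNotifier *n)
{
    n->next = list->head;
    list->head = n;
}

/* Unregister a callback; a notifier that is not in the list is ignored. */
static void sketch_notifier_remove(SketchNotifierList *list, SketchNotifier *n)
{
    SketchNotifier **pp;

    for (pp = &list->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            n->next = NULL;
            return;
        }
    }
}

/* Invoke every registered callback with the same payload. */
static void sketch_notifier_notify(SketchNotifierList *list, void *data)
{
    SketchNotifier *n, *next;

    for (n = list->head; n != NULL; n = next) {
        next = n->next; /* read first, so a callback may remove itself */
        n->notify(n, data);
    }
}

/* Hypothetical consumer: count how often a checkpoint was requested. */
static int sketch_checkpoint_requests;

static void sketch_request_checkpoint(SketchNotifier *n, void *data)
{
    (void)n;
    (void)data;
    sketch_checkpoint_requests++;
}
```

The comparing side only needs the list; it does not have to know who consumes the event, which keeps colo-compare decoupled from the migration code.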

Cc: Jason Wang 
Signed-off-by: Zhang Chen 
Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 32 +---
 net/colo-compare.h |  2 ++
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index e9cfca2..dfaa81f 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -31,6 +31,7 @@
 #include "sysemu/iothread.h"
 #include "net/colo-compare.h"
 #include "migration/colo.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -39,6 +40,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -454,8 +458,24 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
 }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+notifier_list_notify(&colo_compare_notifiers,
+migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+notifier_remove(notify);
+}
+
 static int colo_old_packet_check_one_conn(Connection *conn,
-  void *user_data)
+   void *user_data)
 {
 GList *result = NULL;
 int64_t check_time = REGULAR_PACKET_CHECK_MS;
@@ -466,10 +486,7 @@ static int colo_old_packet_check_one_conn(Connection *conn,
 
 if (result) {
 /* Do checkpoint will flush old packet */
-/*
- * TODO: Notify colo frame to do checkpoint.
- * colo_compare_inconsistent_notify();
- */
+colo_compare_inconsistent_notify();
 return 0;
 }
 
@@ -544,11 +561,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
 /*
  * If one packet arrive late, the secondary_list or
  * primary_list will be empty, so we can't compare it
- * until next comparison.
+ * until next comparison. If the packets in the list have
+ * timed out, it will trigger a checkpoint request.
  */
 trace_colo_compare_main("packet different");
 g_queue_push_head(&conn->primary_list, pkt);
-/* TODO: colo_notify_checkpoint();*/
+colo_compare_inconsistent_notify();
 break;
 }
 }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index 1b1ce76..22ddd51 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -18,5 +18,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
2.7.4

[Qemu-devel] [PATCH RESEND V3 05/16] COLO: Add block replication into colo process

2018-01-03 Thread Zhang Chen

Make sure the master starts block replication after the slave's block
replication has started.

Besides, we need to activate the VM's blocks before going into
COLO state.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Zhang Chen 
Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Xie Changlong 
---
 migration/colo.c  | 46 ++
 migration/migration.c |  9 +
 2 files changed, 55 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index c513805..0e689df 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -26,6 +26,9 @@
 #include "qmp-commands.h"
 #include "net/colo-compare.h"
 #include "net/colo.h"
+#include "qapi-event.h"
+#include "block/block.h"
+#include "replication.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -55,6 +58,7 @@ static void secondary_vm_do_failover(void)
 {
 int old_state;
 MigrationIncomingState *mis = migration_incoming_get_current();
+Error *local_err = NULL;
 
 /* Can not do failover during the process of VM's loading VMstate, Or
  * it will break the secondary VM.
@@ -72,6 +76,11 @@ static void secondary_vm_do_failover(void)
 migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
 
+replication_stop_all(true, &local_err);
+if (local_err) {
+error_report_err(local_err);
+}
+
 if (!autostart) {
 error_report("\"-S\" qemu option will be ignored in secondary side");
 /* recover runstate to normal migration finish state */
@@ -109,6 +118,7 @@ static void primary_vm_do_failover(void)
 {
 MigrationState *s = migrate_get_current();
 int old_state;
+Error *local_err = NULL;
 
 migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
   MIGRATION_STATUS_COMPLETED);
@@ -132,6 +142,13 @@ static void primary_vm_do_failover(void)
  FailoverStatus_str(old_state));
 return;
 }
+
+replication_stop_all(true, &local_err);
+if (local_err) {
+error_report_err(local_err);
+local_err = NULL;
+}
+
 /* Notify COLO thread that failover work is finished */
 qemu_sem_post(&s->colo_exit_sem);
 }
@@ -355,6 +372,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 qemu_savevm_state_header(fb);
 qemu_savevm_state_setup(fb);
 qemu_mutex_lock_iothread();
+replication_do_checkpoint_all(&local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
 qemu_savevm_state_complete_precopy(fb, false, false);
 qemu_mutex_unlock_iothread();
 
@@ -396,6 +418,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 ret = 0;
 
 qemu_mutex_lock_iothread();
+
 vm_start();
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("stop", "run");
@@ -445,6 +468,12 @@ static void colo_process_checkpoint(MigrationState *s)
 object_unref(OBJECT(bioc));
 
 qemu_mutex_lock_iothread();
+replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
+
 vm_start();
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("stop", "run");
@@ -584,6 +613,11 @@ void *colo_process_incoming_thread(void *opaque)
 object_unref(OBJECT(bioc));
 
 qemu_mutex_lock_iothread();
+replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
 vm_start();
 trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
@@ -664,6 +698,18 @@ void *colo_process_incoming_thread(void *opaque)
 goto out;
 }
 
+replication_get_error_all(&local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
+/* discard colo disk buffer */
+replication_do_checkpoint_all(&local_err);
+if (local_err) {
+qemu_mutex_unlock_iothread();
+goto out;
+}
+
 vmstate_loading = false;
 vm_start();
 trace_colo_vm_state_change("stop", "run");
diff --git a/migration/migration.c b/migration/migration.c
index ced463c..3410145 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -318,6 +318,7 @@ static void process_incoming_migration_co(void *opaque)
 MigrationIncomingState *mis = migration_incoming_get_current();
 PostcopyState ps;
 int ret;
+Error *local_err = NULL;
 
 assert(mis->from_src_file);
 mis->largest_page_size = qemu_ram_pagesize_largest();
@@ -349,6 +350,14 @@ static void process_incoming_migration_co(void *opaque)
 
 /* we get COLO info, and know if we are in COLO mode */
 if (!ret &&

[Qemu-devel] [PATCH RESEND V3 01/16] filter-rewriter: fix memory leak for connection in connection_track_table

2018-01-03 Thread Zhang Chen

After a net connection is closed, we didn't clear its related resources
in connection_track_table, which leads to a memory leak.

Let's track the state of the net connection; if it is closed, its related
resources will be cleared up.
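
The close-tracking idea can be sketched in isolation as follows. This is a simplified illustration, not the patch's code: it folds both packet directions into one function, works on host-byte-order values, and every `sk_`/`SK_` name is hypothetical (the flag values follow the standard TCP header bits).

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_TH_FIN 0x01 /* FIN bit in the TCP flags byte */
#define SK_TH_ACK 0x10 /* ACK bit in the TCP flags byte */

enum sk_state { SK_OPEN, SK_LAST_ACK };

struct sk_conn {
    enum sk_state state;
    uint32_t fin_ack_seq; /* sequence number of the 'fin=1,ack=1' packet */
};

/*
 * Feed one TCP header (host byte order) into the tracker.
 * Returns true when the FIN has been acknowledged, i.e. when the
 * connection entry can be removed from the tracking table.
 */
static bool sk_track_close(struct sk_conn *c, uint8_t flags,
                           uint32_t seq, uint32_t ack)
{
    /* The peer acknowledges the FIN by acking its sequence number + 1. */
    if ((flags & SK_TH_ACK) && c->state == SK_LAST_ACK &&
        ack == c->fin_ack_seq + 1) {
        return true;
    }

    /* Remember the sequence number of a 'fin=1,ack=1' packet. */
    if ((flags & (SK_TH_FIN | SK_TH_ACK)) == (SK_TH_FIN | SK_TH_ACK)) {
        c->fin_ack_seq = seq;
        c->state = SK_LAST_ACK;
    }
    return false;
}
```

A caller would remove the entry from its hash table when the function returns true, which is where the leaked connection state gets released.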

Signed-off-by: zhanghailiang 
Signed-off-by: Zhang Chen 
---
 net/colo.h|  4 +++
 net/filter-rewriter.c | 69 +--
 2 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/net/colo.h b/net/colo.h
index 0658e86..0193935 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -18,6 +18,7 @@
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
+#include "slirp/tcp.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -71,6 +72,9 @@ typedef struct Connection {
  * run once in independent tcp connection
  */
 int syn_flag;
+
+int tcp_state; /* TCP FSM state */
+tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
 } Connection;
 
 uint32_t connection_key_hash(const void *opaque);
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 2be388f..a58310a 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -62,9 +62,9 @@ static int is_tcp_packet(Packet *pkt)
 }
 
 /* handle tcp packet from primary guest */
-static int handle_primary_tcp_pkt(NetFilterState *nf,
+static int handle_primary_tcp_pkt(RewriterState *rf,
   Connection *conn,
-  Packet *pkt)
+  Packet *pkt, ConnectionKey *key)
 {
 struct tcphdr *tcp_pkt;
 
@@ -102,15 +102,44 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
 net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
pkt->size - pkt->vnet_hdr_len);
 }
+/*
+ * Case 1:
+ * The *server* side of this connect is VM, *client* tries to close
+ * the connection.
+ *
+ * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
+ * packet from server side. From this point, we can ensure that there
+ * will be no packets in the connection, except that, some errors
+ * happen between the path of 'filter object' and vNIC, if this rare
+ * case really happen, we can still create a new connection,
+ * So it is safe to remove the connection from connection_track_table.
+ *
+ */
+if ((conn->tcp_state == TCPS_LAST_ACK) &&
+(ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+g_hash_table_remove(rf->connection_track_table, key);
+}
+}
+/*
+ * Case 2:
+ * The *server* side of this connect is VM, *server* tries to close
+ * the connection.
+ *
+ * We got 'fin=1, ack=1' packet from client side, we need to
+ * record the seq of 'fin=1, ack=1' packet.
+ */
+if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
+conn->tcp_state = TCPS_LAST_ACK;
 }
 
 return 0;
 }
 
 /* handle tcp packet from secondary guest */
-static int handle_secondary_tcp_pkt(NetFilterState *nf,
+static int handle_secondary_tcp_pkt(RewriterState *rf,
 Connection *conn,
-Packet *pkt)
+Packet *pkt, ConnectionKey *key)
 {
 struct tcphdr *tcp_pkt;
 
@@ -142,8 +171,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
 net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
pkt->size - pkt->vnet_hdr_len);
 }
+/*
+ * Case 2:
+ * The *server* side of this connect is VM, *server* tries to close
+ * the connection.
+ *
+ * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
+ * packet from client side. Like Case 1, there should be no packets
+ * in the connection from now on. But the difference here is that
+ * if the packet is lost, we will get the resent 'fin=1,ack=1' packet.
+ * TODO: Fix above case.
+ */
+if ((conn->tcp_state == TCPS_LAST_ACK) &&
+(ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+g_hash_table_remove(rf->connection_track_table, key);
+}
+}
+/*
+ * Case 1:
+ * The *server* side of this connect is VM, *client* tries to close
+ * the connection.
+ *
+ * We got 'fin=1, ack=1' packet from server side, we need to
+ * record the seq of 'fin=1, ack=1' packet.
+ */
+if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
+conn->tcp_state = TCPS_LAST_ACK;
 }
-
 return 0;
 }
 
@@ -193,7 +248,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 
 if (sender

[Qemu-devel] [PATCH RESEND V3 00/16] COLO: integrate colo frame with block replication and COLO proxy

2018-01-03 Thread Zhang Chen

Hi~

(Sorry, I forgot to add the qemu-devel mailing list; resending this series.)

COLO Frame, block replication and COLO proxy (colo-compare, filter-mirror,
filter-redirector, filter-rewriter) have existed in qemu for a long time.
It's time to integrate these three parts to make COLO really work.

In this series, we have some optimizations for the COLO frame, including
separating the process of saving ram and device state, and using a
COLO_EXIT event to notify users that the VM exits COLO. Most of these parts
were reviewed long ago in old versions, but since this series has just been
rebased on upstream, which had merged a new series of migration, parts of
the patches in this series deserve review again.

We use a notifier/callback method for COLO compare to notify the COLO frame
about net packet inconsistency events, and add a handle_event method to
NetFilterClass to help the COLO frame notify filters and colo-compare about
checkpoint/failover events; it is flexible.

For the newest version, please refer to:
https://github.com/zhangckid/qemu/tree/qemu-colo-18jan4

Please review, thanks.

V3:
 - Address community comments from V2.
 - Rebase on upstream code.
 - Fix several bugs.
 - Split the shared disk part into an independent patch set.
 - Optimize code.

Zhang Chen (8):
  filter-rewriter: fix memory leak for connection in
connection_track_table
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Add block replication into colo process
  ram/COLO: Record the dirty pages that SVM received
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event

zhanghailiang (8):
  COLO: Remove colo_state migration struct
  COLO: Load dirty pages into SVM's RAM cache firstly
  COLO: Flush memory data from ram cache
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  savevm: split the process of different stages for loadvm/savevm
  COLO: flush host dirty ram from cache
  COLO: notify net filters about checkpoint/failover event
  COLO: quick failover process by kick COLO thread

 include/exec/ram_addr.h  |   1 +
 include/migration/colo.h |  11 ++-
 include/net/filter.h |   5 ++
 migration/Makefile.objs  |   2 +-
 migration/colo-comm.c|  76 ---
 migration/colo.c | 188 ---
 migration/migration.c|  38 +-
 migration/ram.c  | 181 -
 migration/ram.h  |   4 +
 migration/savevm.c   |  54 --
 migration/savevm.h   |   5 ++
 migration/trace-events   |   3 +
 net/colo-compare.c   | 103 --
 net/colo-compare.h   |  24 ++
 net/colo.h   |   4 +
 net/filter-rewriter.c| 109 +--
 net/filter.c |  17 +
 net/net.c|  28 +++
 qapi-schema.json |  21 ++
 qapi/migration.json  |  13 
 vl.c |   2 -
 21 files changed, 771 insertions(+), 118 deletions(-)
 delete mode 100644 migration/colo-comm.c
 create mode 100644 net/colo-compare.h

-- 
2.7.4

[Qemu-devel] [PATCH] memory: update comments and fix some typos

2018-01-03 Thread Jay Zhou

Signed-off-by: Jay Zhou 
---
 include/exec/memory.h | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index a4cabdf..6e5684d 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -324,7 +324,7 @@ FlatView *address_space_to_flatview(AddressSpace *as);
  * MemoryRegionSection: describes a fragment of a #MemoryRegion
  *
  * @mr: the region, or %NULL if empty
- * @address_space: the address space the region is mapped in
+ * @fv: the flat view of the address space the region is mapped in
  * @offset_within_region: the beginning of the section, relative to @mr's start
  * @size: the size of the section; will not exceed @mr's boundaries
  * @offset_within_address_space: the address of the first byte of the section
@@ -607,6 +607,7 @@ void memory_region_init_rom_nomigrate(MemoryRegion *mr,
  * @mr: the #MemoryRegion to be initialized.
  * @owner: the object that tracks the region's reference count
  * @ops: callbacks for write access handling (must not be NULL).
+ * @opaque: passed to the read and write callbacks of the @ops structure.
  * @name: Region name, becomes part of RAMBlock name used in migration stream
  *must be unique within any device
  * @size: size of the region.
@@ -650,11 +651,10 @@ static inline void memory_region_init_reservation(MemoryRegion *mr,
  * An IOMMU region translates addresses and forwards accesses to a target
  * memory region.
  *
- * @typename: QOM class name
  * @_iommu_mr: the #IOMMUMemoryRegion to be initialized
  * @instance_size: the IOMMUMemoryRegion subclass instance size
+ * @mrtypename: the type name of the #IOMMUMemoryRegion
  * @owner: the object that tracks the region's reference count
- * @ops: a function that translates addresses into the @target region
  * @name: used for debugging; not visible to the user or ABI
  * @size: size of the region.
  */
@@ -824,8 +824,8 @@ static inline IOMMUMemoryRegion *memory_region_get_iommu(MemoryRegion *mr)
  * memory_region_get_iommu_class_nocheck: returns iommu memory region class
  *   if an iommu or NULL if not
  *
- * Returns pointer to IOMMUMemoryRegioniClass if a memory region is an iommu,
- * otherwise NULL. This is fast path avoinding QOM checking, use with caution.
+ * Returns pointer to IOMMUMemoryRegionClass if a memory region is an iommu,
+ * otherwise NULL. This is fast path avoiding QOM checking, use with caution.
  *
  * @mr: the memory region being queried
  */
@@ -990,7 +990,8 @@ int memory_region_get_fd(MemoryRegion *mr);
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
- * @mr: the memory region being queried.
+ * @ptr: the host pointer to be converted
+ * @offset: the offset within memory region
  */
 MemoryRegion *memory_region_from_host(void *ptr, ram_addr_t *offset);
 
@@ -1267,7 +1268,7 @@ void memory_region_clear_global_locking(MemoryRegion *mr);
  * @size: the size of the access to trigger the eventfd
  * @match_data: whether to match against @data, instead of just @addr
  * @data: the data to match against the guest write
- * @fd: the eventfd to be triggered when @addr, @size, and @data all match.
+ * @e: event notifier to be triggered when @addr, @size, and @data all match.
  **/
 void memory_region_add_eventfd(MemoryRegion *mr,
hwaddr addr,
@@ -1287,7 +1288,7 @@ void memory_region_add_eventfd(MemoryRegion *mr,
  * @size: the size of the access to trigger the eventfd
  * @match_data: whether to match against @data, instead of just @addr
  * @data: the data to match against the guest write
- * @fd: the eventfd to be triggered when @addr, @size, and @data all match.
+ * @e: event notifier to be triggered when @addr, @size, and @data all match.
  */
 void memory_region_del_eventfd(MemoryRegion *mr,
hwaddr addr,
@@ -1523,7 +1524,7 @@ bool memory_region_request_mmio_ptr(MemoryRegion *mr, hwaddr addr);
  * will need to request the pointer again.
  *
  * @mr: #MemoryRegion associated to the pointer.
- * @addr: address within that region
+ * @offset: offset within the memory region
  * @size: size of that area.
  */
 void memory_region_invalidate_mmio_ptr(MemoryRegion *mr, hwaddr offset,
@@ -1592,6 +1593,7 @@ void address_space_destroy(AddressSpace *as);
  * @addr: address within that address space
  * @attrs: memory transaction attributes
  * @buf: buffer with the data transferred
+ * @len: the number of bytes to read or write
  * @is_write: indicates the transfer direction
  */
 MemTxResult address_space_rw(AddressSpace *as, hwaddr addr,
@@ -1609,6 +1611,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr,
  * @addr: address within that address space
  * @attrs: memory transaction attributes
  * @buf: buffer with the data transferred
+ * @len: the number of bytes to write
  */
 MemTxResult address_space_write(AddressSpace *as, hwaddr

[Qemu-devel] [PATCH v3 2/3] qemu: virtio-net: use 64-bit values for feature flags

2018-01-03 Thread Jason Baron via Qemu-devel

In preparation for using some of the high order feature bits, make sure that
virtio-net uses 64-bit values everywhere.
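
A small sketch of why the wider type matters: shifting the `int` constant `1` by 32 or more is undefined where `int` is 32 bits wide, and a 32-bit mask cannot hold bit 63 at all. The `SKETCH_` bit numbers and helper names below are illustrative assumptions for this example, not definitions taken from the virtio headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit numbers for this sketch only. */
#define SKETCH_F_MTU          3
#define SKETCH_F_SPEED_DUPLEX 63

/* Build the mask with a 64-bit constant so high bits are representable. */
static uint64_t sketch_feature_bit(unsigned int bit)
{
    return 1ULL << bit;
}

/* Test a single feature bit in a 64-bit feature word. */
static bool sketch_has_feature(uint64_t features, unsigned int bit)
{
    return (features & sketch_feature_bit(bit)) != 0;
}

/* What a 32-bit flags field would keep of a 64-bit feature word. */
static uint64_t sketch_truncate_to_32(uint64_t features)
{
    return (uint32_t)features;
}
```

With a 32-bit field, a high feature bit would be silently dropped while the low bits still appear to work, which is why the flags field and the shift constants are widened together.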

Signed-off-by: Jason Baron 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtio-...@lists.oasis-open.org
---
 hw/net/virtio-net.c| 54 +-
 include/hw/virtio/virtio-net.h |  2 +-
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 38674b0..adc20df 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -48,18 +48,18 @@
 (offsetof(container, field) + sizeof(((container *)0)->field))
 
 typedef struct VirtIOFeature {
-uint32_t flags;
+uint64_t flags;
 size_t end;
 } VirtIOFeature;
 
 static VirtIOFeature feature_sizes[] = {
-{.flags = 1 << VIRTIO_NET_F_MAC,
+{.flags = 1ULL << VIRTIO_NET_F_MAC,
  .end = endof(struct virtio_net_config, mac)},
-{.flags = 1 << VIRTIO_NET_F_STATUS,
+{.flags = 1ULL << VIRTIO_NET_F_STATUS,
  .end = endof(struct virtio_net_config, status)},
-{.flags = 1 << VIRTIO_NET_F_MQ,
+{.flags = 1ULL << VIRTIO_NET_F_MQ,
  .end = endof(struct virtio_net_config, max_virtqueue_pairs)},
-{.flags = 1 << VIRTIO_NET_F_MTU,
+{.flags = 1ULL << VIRTIO_NET_F_MTU,
  .end = endof(struct virtio_net_config, mtu)},
 {}
 };
@@ -1938,7 +1938,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
 int i;
 
 if (n->net_conf.mtu) {
-n->host_features |= (0x1 << VIRTIO_NET_F_MTU);
+n->host_features |= (1ULL << VIRTIO_NET_F_MTU);
 }
 
 virtio_net_set_config_size(n, n->host_features);
@@ -2109,45 +2109,45 @@ static const VMStateDescription vmstate_virtio_net = {
 };
 
 static Property virtio_net_properties[] = {
-DEFINE_PROP_BIT("csum", VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
-DEFINE_PROP_BIT("guest_csum", VirtIONet, host_features,
+DEFINE_PROP_BIT64("csum", VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
+DEFINE_PROP_BIT64("guest_csum", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_CSUM, true),
-DEFINE_PROP_BIT("gso", VirtIONet, host_features, VIRTIO_NET_F_GSO, true),
-DEFINE_PROP_BIT("guest_tso4", VirtIONet, host_features,
+DEFINE_PROP_BIT64("gso", VirtIONet, host_features, VIRTIO_NET_F_GSO, true),
+DEFINE_PROP_BIT64("guest_tso4", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_TSO4, true),
-DEFINE_PROP_BIT("guest_tso6", VirtIONet, host_features,
+DEFINE_PROP_BIT64("guest_tso6", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_TSO6, true),
-DEFINE_PROP_BIT("guest_ecn", VirtIONet, host_features,
+DEFINE_PROP_BIT64("guest_ecn", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_ECN, true),
-DEFINE_PROP_BIT("guest_ufo", VirtIONet, host_features,
+DEFINE_PROP_BIT64("guest_ufo", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_UFO, true),
-DEFINE_PROP_BIT("guest_announce", VirtIONet, host_features,
+DEFINE_PROP_BIT64("guest_announce", VirtIONet, host_features,
 VIRTIO_NET_F_GUEST_ANNOUNCE, true),
-DEFINE_PROP_BIT("host_tso4", VirtIONet, host_features,
+DEFINE_PROP_BIT64("host_tso4", VirtIONet, host_features,
 VIRTIO_NET_F_HOST_TSO4, true),
-DEFINE_PROP_BIT("host_tso6", VirtIONet, host_features,
+DEFINE_PROP_BIT64("host_tso6", VirtIONet, host_features,
 VIRTIO_NET_F_HOST_TSO6, true),
-DEFINE_PROP_BIT("host_ecn", VirtIONet, host_features,
+DEFINE_PROP_BIT64("host_ecn", VirtIONet, host_features,
 VIRTIO_NET_F_HOST_ECN, true),
-DEFINE_PROP_BIT("host_ufo", VirtIONet, host_features,
+DEFINE_PROP_BIT64("host_ufo", VirtIONet, host_features,
 VIRTIO_NET_F_HOST_UFO, true),
-DEFINE_PROP_BIT("mrg_rxbuf", VirtIONet, host_features,
+DEFINE_PROP_BIT64("mrg_rxbuf", VirtIONet, host_features,
 VIRTIO_NET_F_MRG_RXBUF, true),
-DEFINE_PROP_BIT("status", VirtIONet, host_features,
+DEFINE_PROP_BIT64("status", VirtIONet, host_features,
 VIRTIO_NET_F_STATUS, true),
-DEFINE_PROP_BIT("ctrl_vq", VirtIONet, host_features,
+DEFINE_PROP_BIT64("ctrl_vq", VirtIONet, host_features,
 VIRTIO_NET_F_CTRL_VQ, true),
-DEFINE_PROP_BIT("ctrl_rx", VirtIONet, host_features,
+DEFINE_PROP_BIT64("ctrl_rx", VirtIONet, host_features,
 VIRTIO_NET_F_CTRL_RX, true),
-DEFINE_PROP_BIT("ctrl_vlan", VirtIONet, host_features,
+DEFINE_PROP_BIT64("ctrl_vlan", VirtIONet, host_features,
 VIRTIO_NET_F_CTRL_VLAN, true),
-DEFINE_PROP_BIT("ctrl_rx_extra", VirtIONet, host_features,
+DEFINE_PROP_BIT64("ctrl_rx_extra", VirtIONet, host_features,
 VIRTIO_NET_F_CTRL_RX_EXTRA, true),
-DEFINE_PROP_BIT("ctrl_mac_addr",

[Qemu-devel] [PATCH v3 0/3] virtio_net: allow hypervisor to indicate linkspeed and duplex setting

2018-01-03 Thread Jason Baron via Qemu-devel

We have found it useful to be able to set the linkspeed and duplex
settings from the host-side for virtio_net. This obviates the need
for guest changes and settings for these fields, and does not require
custom ethtool commands for virtio_net.

The ability to set linkspeed and duplex is useful in various cases
as described here:

16032be virtio_net: add ethtool support for set and get of settings 

  
With this patch, 'ethtool -s' continues to overwrite the linkspeed/duplex
settings.

Patch 1/3 is against net-next, while patches 2/3 and 3/3 are the associated
qemu changes, which would go in afterwards because update-linux-headers.sh
should be run first. So the qemu patches are a demonstration of how I intend
this to work.

Thanks,

-Jason  

linux changes:

changes from v2:
* move speed/duplex read into virtnet_config_changed_work() so link up changes
  are detected

Jason Baron (1):
  virtio_net: propagate linkspeed/duplex settings from the hypervisor

 drivers/net/virtio_net.c| 19 ++-
 include/uapi/linux/virtio_net.h | 13 +
 2 files changed, 31 insertions(+), 1 deletion(-)

qemu changes:

changes from v2:
* if the link is up, return the configured speed/duplex; else return UNKNOWN
  speed and duplex

Jason Baron (2):
  qemu: virtio-net: use 64-bit values for feature flags
  qemu: add linkspeed and duplex settings to virtio-net

 hw/net/virtio-net.c | 89 -
 include/hw/virtio/virtio-net.h  |  5 +-
 include/standard-headers/linux/virtio_net.h | 13 +
 3 files changed, 79 insertions(+), 28 deletions(-)


-- 
2.6.1

[Qemu-devel] [PATCH net-next v3 1/3] virtio_net: propagate linkspeed/duplex settings from the hypervisor

2018-01-03 Thread Jason Baron via Qemu-devel

The ability to set speed and duplex for virtio_net is useful in various
scenarios as described here:

16032be virtio_net: add ethtool support for set and get of settings

However, it would be nice to be able to set this from the hypervisor,
such that virtio_net doesn't require custom guest ethtool commands.

Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
overwrite it later if desired via 'ethtool -s'.

Note that VIRTIO_NET_F_SPEED_DUPLEX is defined as bit 63; the intention
is that device feature bits grow down from bit 63, since the transport
bits start at bit 24 and grow up.

Signed-off-by: Jason Baron 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtio-...@lists.oasis-open.org
---
 drivers/net/virtio_net.c| 19 ++-
 include/uapi/linux/virtio_net.h | 13 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6fb7b65..0b2d314 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2146,6 +2146,22 @@ static void virtnet_config_changed_work(struct work_struct *work)
 
vi->status = v;
 
+   if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_SPEED_DUPLEX)) {
+   u32 speed;
+   u8 duplex;
+
+   speed = virtio_cread32(vi->vdev,
+  offsetof(struct virtio_net_config,
+   speed));
+   if (ethtool_validate_speed(speed))
+   vi->speed = speed;
+   duplex = virtio_cread8(vi->vdev,
+  offsetof(struct virtio_net_config,
+   duplex));
+   if (ethtool_validate_duplex(duplex))
+   vi->duplex = duplex;
+   }
+
if (vi->status & VIRTIO_NET_S_LINK_UP) {
netif_carrier_on(vi->dev);
netif_tx_wake_all_queues(vi->dev);
@@ -2796,7 +2812,8 @@ static struct virtio_device_id id_table[] = {
VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN, \
VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ, \
VIRTIO_NET_F_CTRL_MAC_ADDR, \
-   VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
+   VIRTIO_NET_F_MTU, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, \
+   VIRTIO_NET_F_SPEED_DUPLEX
 
 static unsigned int features[] = {
VIRTNET_FEATURES,
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index fc353b5..5de6ed3 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,8 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_SPEED_DUPLEX 63   /* Device set linkspeed and duplex */
+
 #ifndef VIRTIO_NET_NO_LEGACY
 #define VIRTIO_NET_F_GSO   6   /* Host handles pkts w/ any GSO type */
 #endif /* VIRTIO_NET_NO_LEGACY */
@@ -76,6 +78,17 @@ struct virtio_net_config {
__u16 max_virtqueue_pairs;
/* Default maximum transmit unit advice */
__u16 mtu;
+   /*
+* speed, in units of 1Mb. All values 0 to INT_MAX are legal.
+* Any other value stands for unknown.
+*/
+   __u32 speed;
+   /*
+* 0x00 - half duplex
+* 0x01 - full duplex
+* Any other value stands for unknown.
+*/
+   __u8 duplex;
 } __attribute__((packed));
 
 /*
-- 
2.6.1

[Qemu-devel] [PATCH v3 3/3] qemu: add linkspeed and duplex settings to virtio-net

2018-01-03 Thread Jason Baron via Qemu-devel

Although linkspeed and duplex can be set in a Linux guest via 'ethtool -s',
this requires custom ethtool commands for virtio-net by default.

Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
subsequently override it, if desired, via 'ethtool -s'.

Linkspeed and duplex settings can be set as:
'-device virtio-net,speed=1,duplex=full'

where speed is [-1...INT_MAX], and duplex is ["half"|"full"].
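
The accepted values described above can be sketched in standalone C as
follows. This is only an illustration under stated assumptions, not the
QEMU code: parse_duplex() and speed_is_valid() are hypothetical helpers,
and the constants are copied from the defines in this patch. Unlike the
patch, the sketch reports a bad value to the caller through its return
value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the defines added by this patch. */
#define DUPLEX_HALF    0x00
#define DUPLEX_FULL    0x01
#define DUPLEX_UNKNOWN 0xff
#define SPEED_UNKNOWN  -1

/*
 * Hypothetical helper: map the "duplex" property string to a duplex code.
 * A missing string (NULL) means "unknown". Returns false for any string
 * other than "half" or "full".
 */
static bool parse_duplex(const char *str, uint8_t *duplex)
{
    if (str == NULL) {
        *duplex = DUPLEX_UNKNOWN;
        return true;
    }
    if (strcmp(str, "half") == 0) {
        *duplex = DUPLEX_HALF;
        return true;
    }
    if (strcmp(str, "full") == 0) {
        *duplex = DUPLEX_FULL;
        return true;
    }
    return false;
}

/* Hypothetical helper: speed must lie in [-1, INT_MAX]; -1 means unknown. */
static bool speed_is_valid(int32_t speed)
{
    return speed >= SPEED_UNKNOWN;
}
```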

Signed-off-by: Jason Baron 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtio-...@lists.oasis-open.org
---
 hw/net/virtio-net.c | 35 +
 include/hw/virtio/virtio-net.h  |  3 +++
 include/standard-headers/linux/virtio_net.h | 13 +++
 3 files changed, 51 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index adc20df..eec8422 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -40,6 +40,12 @@
 #define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
 #define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
 
+/* duplex and speed */
+#define DUPLEX_UNKNOWN  0xff
+#define DUPLEX_HALF 0x00
+#define DUPLEX_FULL 0x01
+#define SPEED_UNKNOWN   -1
+
 /*
  * Calculate the number of bytes up to and including the given 'field' of
  * 'container'.
@@ -61,6 +67,8 @@ static VirtIOFeature feature_sizes[] = {
  .end = endof(struct virtio_net_config, max_virtqueue_pairs)},
 {.flags = 1ULL << VIRTIO_NET_F_MTU,
  .end = endof(struct virtio_net_config, mtu)},
+{.flags = 1ULL << VIRTIO_NET_F_SPEED_DUPLEX,
+ .end = endof(struct virtio_net_config, duplex)},
 {}
 };
 
@@ -89,6 +97,14 @@ static void virtio_net_get_config(VirtIODevice *vdev, 
uint8_t *config)
 virtio_stw_p(vdev, &netcfg.max_virtqueue_pairs, n->max_queues);
 virtio_stw_p(vdev, &netcfg.mtu, n->net_conf.mtu);
 memcpy(netcfg.mac, n->mac, ETH_ALEN);
+if (n->status & VIRTIO_NET_S_LINK_UP) {
+virtio_stl_p(vdev, &netcfg.speed, n->net_conf.speed);
+netcfg.duplex = n->net_conf.duplex;
+} else {
+virtio_stl_p(vdev, &netcfg.speed, SPEED_UNKNOWN);
+netcfg.duplex = DUPLEX_UNKNOWN;
+}
+
 memcpy(config, &netcfg, n->config_size);
 }
 
@@ -1941,6 +1957,23 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 n->host_features |= (1ULL << VIRTIO_NET_F_MTU);
 }
 
+n->host_features |= (1ULL << VIRTIO_NET_F_SPEED_DUPLEX);
+if (n->net_conf.duplex_str) {
+if (strncmp(n->net_conf.duplex_str, "half", 5) == 0) {
+n->net_conf.duplex = DUPLEX_HALF;
+} else if (strncmp(n->net_conf.duplex_str, "full", 5) == 0) {
+n->net_conf.duplex = DUPLEX_FULL;
+} else {
+error_setg(errp, "'duplex' must be 'half' or 'full'");
+}
+} else {
+n->net_conf.duplex = DUPLEX_UNKNOWN;
+}
+if (n->net_conf.speed < SPEED_UNKNOWN) {
+error_setg(errp, "'speed' must be between -1 (SPEED_UNKNOWN) and "
+   "INT_MAX");
+}
+
 virtio_net_set_config_size(n, n->host_features);
 virtio_init(vdev, "virtio-net", VIRTIO_ID_NET, n->config_size);
 
@@ -2160,6 +2193,8 @@ static Property virtio_net_properties[] = {
 DEFINE_PROP_UINT16("host_mtu", VirtIONet, net_conf.mtu, 0),
 DEFINE_PROP_BOOL("x-mtu-bypass-backend", VirtIONet, mtu_bypass_backend,
  true),
+DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed, SPEED_UNKNOWN),
+DEFINE_PROP_STRING("duplex", VirtIONet, net_conf.duplex_str),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index e7634c9..02484dc 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -38,6 +38,9 @@ typedef struct virtio_net_conf
 uint16_t rx_queue_size;
 uint16_t tx_queue_size;
 uint16_t mtu;
+int32_t speed;
+char *duplex_str;
+uint8_t duplex;
 } virtio_net_conf;
 
 /* Maximum packet size we can receive from tap device: header + 64k */
diff --git a/include/standard-headers/linux/virtio_net.h 
b/include/standard-headers/linux/virtio_net.h
index 30ff249..17c8531 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -57,6 +57,8 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_SPEED_DUPLEX 63   /* Device set linkspeed and duplex */
+
 #ifndef VIRTIO_NET_NO_LEGACY
 #define VIRTIO_NET_F_GSO   6   /* Host handles pkts w/ any GSO type */
 #endif /* VIRTIO_NET_NO_LEGACY */
@@ -76,6 +78,17 @@ struct virtio_net_config {
uint16_t max_virtqueue_pairs;
/* Default maximum transmit unit advice */
uint16_t mtu;
+   /*
+* speed, in units of 1Mb. All values 0 to INT_MAX are

Re: [Qemu-devel] [PATCH v14 7/9] ARM: ACPI: Add GPIO notification type for hardware RAS error

2018-01-03 Thread gengdongjiu

On 2018/1/3 21:36, Igor Mammedov wrote:
> On Wed, 3 Jan 2018 11:48:30 +0800
> gengdongjiu  wrote:
> 
>> On 2017/12/28 22:53, Igor Mammedov wrote:
>>> On Thu, 28 Dec 2017 13:54:16 +0800
>>> Dongjiu Geng  wrote:
> [...]
 +static void acpi_dsdt_add_error_device(Aml *scope)
 +{
 +Aml *dev = aml_device(ACPI_HARDWARE_ERROR_DEVICE);
 +Aml *method;
 +
 +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C33")));
 +aml_append(dev, aml_name_decl("_UID", aml_int(0)));
 +
 +method = aml_method("_STA", 0, AML_NOTSERIALIZED);
 +aml_append(method, aml_return(aml_int(0x0f)));  
>>> no need for dummy _STA method, device is assumed to be present if there is 
>>> no _STA   
>> Igor,
>>   do you mean remove above two line code as shown in [1]?
>> I dump the DSDT table in my host Ubuntu PC for the error device (PNP0C33), 
>> it has the _STA, as shown in [2].
>> do we not want to add the _STA for guest?
>>
>> [1]
>> +method = aml_method("_STA", 0, AML_NOTSERIALIZED);
>> +aml_append(method, aml_return(aml_int(0x0f)));
> compared to host, yours method does nothing,
> read ACPI6.2 "6.3.7 _STA (Status)" one more time
Thanks for pointing that out.
Yes, you are right. As the spec states [1], a device is assumed to be present
if there is no _STA.

[1]:
ACPI6.2 "6.3.7 _STA (Status), Return Value Information"
If a device object (including the processor object) does not have an _STA
object, then OSPM assumes that all of the above bits are set (i.e., the
device is present, enabled, shown in the UI, and functioning).
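
To make the rule concrete, here is a small C sketch of how an OS might
compute the effective status. The macro names and effective_sta() are
assumptions made for this note (they are not taken from QEMU or the
patch); the bit positions follow the _STA return value description in
ACPI 6.2, section 6.3.7.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA result bits 0-3 (names are illustrative). */
#define STA_PRESENT     (1u << 0)  /* device is present */
#define STA_ENABLED     (1u << 1)  /* device is enabled and decoding resources */
#define STA_SHOWN_IN_UI (1u << 2)  /* device should be shown in the UI */
#define STA_FUNCTIONING (1u << 3)  /* device is functioning properly */

/* Status assumed by OSPM when the device has no _STA object: 0x0f. */
#define STA_DEFAULT \
    (STA_PRESENT | STA_ENABLED | STA_SHOWN_IN_UI | STA_FUNCTIONING)

/*
 * Hypothetical helper: 'has_sta' says whether the device declares _STA,
 * 'sta_value' is what evaluating it returned (ignored when has_sta is
 * false).
 */
static uint32_t effective_sta(bool has_sta, uint32_t sta_value)
{
    return has_sta ? sta_value : STA_DEFAULT;
}
```

Under this reading, a _STA method that unconditionally returns 0x0f gives
the same result as declaring no _STA at all, which is why the dummy method
can be dropped.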

> 
>> [2]:
>> Device (WERR)
>> {
>> Name (_HID, EisaId ("PNP0C33"))  // _HID: Hardware ID
>> Method (_STA, 0, NotSerialized)  // _STA: Status
>> {
>> If (LGreaterEqual (OSYS, 0x07D9))
>> {
>> Return (0x0F)
>> }
>> Else
>> {
>> Return (Zero)
>> }
>> }
>> }
>>>   
 +aml_append(dev, method);
 +aml_append(scope, dev);
 +}
 +
> [...]
> 
> 
> .
>

Re: [Qemu-devel] [PATCH] spapr: Correct compatibility mode setting for hotplugged CPUs

2018-01-03 Thread Alexey Kardashevskiy

On 04/01/18 15:24, David Gibson wrote:
> Currently the pseries machine sets the compatibility mode for the
> guest's cpus in two places: 1) at machine reset and 2) after CAS
> negotiation.
> 
> This means that if we set or negotiate a compatibility mode, then
> hotplug a cpu, the hotplugged cpu doesn't get the right mode set and
> will incorrectly have the full native features.
> 
> To correct this, we set the compatibility mode on a cpu when it is
> brought online with the 'start-cpu' RTAS call.  Since we then no
> longer need to set the compatibility mode on all CPUs at machine
> reset, we change that to only set the mode for the boot cpu.
> 
> Signed-off-by: David Gibson 

Reviewed-by: Alexey Kardashevskiy 


> ---
>  hw/ppc/spapr.c  | 2 +-
>  hw/ppc/spapr_rtas.c | 8 
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e22888ba06..d1acfe8858 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1510,7 +1510,7 @@ static void spapr_machine_reset(void)
>  spapr_ovec_cleanup(spapr->ov5_cas);
>  spapr->ov5_cas = spapr_ovec_new();
>  
> -ppc_set_compat_all(spapr->max_compat_pvr, &error_fatal);
> +ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
>  }
>  
>  fdt = spapr_build_fdt(spapr, rtas_addr, spapr->rtas_size);
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 4bb939d3d1..2ed00548c1 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -163,6 +163,7 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, 
> sPAPRMachineState *spapr,
>  CPUState *cs = CPU(cpu);
>  CPUPPCState *env = &cpu->env;
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +Error *local_err = NULL;
>  
>  if (!cs->halted) {
>  rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> @@ -174,6 +175,13 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, 
> sPAPRMachineState *spapr,
>   * new cpu enters */
>  kvm_cpu_synchronize_state(cs);
>  
> +/* Set compatibility mode to match existing cpus */
> +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, &local_err);
> +if (local_err) {
> +rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
> +return;
> +}
> +
>  env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
>  
>  /* Enable Power-saving mode Exit Cause exceptions for the new CPU */
> 


-- 
Alexey

[Qemu-devel] [PATCH] spapr: Correct compatibility mode setting for hotplugged CPUs

2018-01-03 Thread David Gibson

Currently the pseries machine sets the compatibility mode for the
guest's cpus in two places: 1) at machine reset and 2) after CAS
negotiation.

This means that if we set or negotiate a compatibility mode, then
hotplug a cpu, the hotplugged cpu doesn't get the right mode set and
will incorrectly have the full native features.

To correct this, we set the compatibility mode on a cpu when it is
brought online with the 'start-cpu' RTAS call.  Since we then no
longer need to set the compatibility mode on all CPUs at machine
reset, we change that to only set the mode for the boot cpu.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  | 2 +-
 hw/ppc/spapr_rtas.c | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e22888ba06..d1acfe8858 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1510,7 +1510,7 @@ static void spapr_machine_reset(void)
 spapr_ovec_cleanup(spapr->ov5_cas);
 spapr->ov5_cas = spapr_ovec_new();
 
-ppc_set_compat_all(spapr->max_compat_pvr, &error_fatal);
+ppc_set_compat(first_ppc_cpu, spapr->max_compat_pvr, &error_fatal);
 }
 
 fdt = spapr_build_fdt(spapr, rtas_addr, spapr->rtas_size);
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 4bb939d3d1..2ed00548c1 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -163,6 +163,7 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, 
sPAPRMachineState *spapr,
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
+Error *local_err = NULL;
 
 if (!cs->halted) {
 rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
@@ -174,6 +175,13 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, 
sPAPRMachineState *spapr,
  * new cpu enters */
 kvm_cpu_synchronize_state(cs);
 
+/* Set compatibility mode to match existing cpus */
+ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, &local_err);
+if (local_err) {
+rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+return;
+}
+
 env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
 
 /* Enable Power-saving mode Exit Cause exceptions for the new CPU */
-- 
2.14.3

Re: [Qemu-devel] [PATCH v14 2/9] ACPI: Add APEI GHES table generation and CPER record support

2018-01-03 Thread gengdongjiu

On 2018/1/3 21:31, Igor Mammedov wrote:
> On Wed, 3 Jan 2018 10:21:06 +0800
> gengdongjiu  wrote:
> 
> [...]   
>>>   
 In order to simulation, we hard code the error
 type to Multi-bit ECC.  
>>> Not sure what this is about, care to elaborate?  
>>
>> please see Memory Error Record in [1], in which the "Memory Error Type" 
>> field is used to describe the
>> error type, such as  Multi-bit ECC or Parity Error etc. Because KVM or host 
>> does not pass the memory
>> error type to Qemu, so Qemu does not know what is the error type for the 
>> memory section. Hence we let QEMU simulate
>> the error type to Multi-bit ECC.
> Agreed that in case of TCG qemu won't likely have any way to get hw error 
> from kernel
> so it could be useful only for testing purposes (i.e. 'make check' and/or 
> testing
> how guest OS handles errors)
> 
> But with KVM in kernel it should be possible to fish error out from host 
> kernel
> and forward it to guest. If this are intended for handling HW errors,
> I'm not sure that 'Multi-bit ECC' could replace all real errors reported by 
> host
> firmware.
Thanks for the mail.
I understand your point; let me explain in more detail.

(1). In fact the memory error type is not important to the guest OS. When an
OS (such as the guest OS) does memory recovery, it does not use the memory
error type. It mainly uses the memory_failure() function [1] to do the
recovery, and that function does not care what the memory error type is; it
does not even know it.
(2). Having KVM forward the error type to the guest would take more effort
and may not be worth doing. The real memory error type exists in the host
APEI table; only the host APEI driver can get it, and KVM cannot get it
directly. To forward it to the guest, KVM would first need to get the error
type from the APEI driver, which may be opposed by James
(james.mo...@arm.com): I once tried to export more error information to the
guest, but James did not agree to that. On the ARM64 platform, we have no
implementation for passing the error information from the APEI driver to KVM
or to other kernel modules.


[1]:
int memory_failure(unsigned long pfn, int trapno, int flags)
{
  ..
 }
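
For reference, the type codes from the UEFI table quoted further down in
this mail could be written as a C enum, with the simulated record simply
storing "Multi-bit ECC" in the one-byte field at offset 72. This is only a
sketch: the enum names, the offset macro and cper_mem_set_error_type() are
made up for this note and are not the identifiers used in the patch.

```c
#include <stdint.h>

/* Memory Error Type values, UEFI 2.6 Errata A, N.2.5 (names illustrative). */
enum cper_mem_error_type {
    CPER_MEM_UNKNOWN               = 0,
    CPER_MEM_NO_ERROR              = 1,
    CPER_MEM_SINGLE_BIT_ECC        = 2,
    CPER_MEM_MULTI_BIT_ECC         = 3,
    CPER_MEM_SINGLE_SYM_CHIPKILL   = 4,
    CPER_MEM_MULTI_SYM_CHIPKILL    = 5,
    CPER_MEM_MASTER_ABORT          = 6,
    CPER_MEM_TARGET_ABORT          = 7,
    CPER_MEM_PARITY_ERROR          = 8,
    CPER_MEM_WATCHDOG_TIMEOUT      = 9,
    CPER_MEM_INVALID_ADDRESS       = 10,
    CPER_MEM_MIRROR_BROKEN         = 11,
    CPER_MEM_MEMORY_SPARING        = 12,
    CPER_MEM_SCRUB_CORRECTED       = 13,
    CPER_MEM_SCRUB_UNCORRECTED     = 14,
    CPER_MEM_PHYSICAL_MAP_OUT      = 15,
};

/* Byte offset of the 1-byte "Memory Error Type" field in the section. */
#define CPER_MEM_ERROR_TYPE_OFFSET 72

/*
 * Hypothetical helper: store the error type in a Memory Error Section
 * buffer. The caller must supply a buffer that covers offset 72.
 */
static void cper_mem_set_error_type(uint8_t *section, uint8_t type)
{
    section[CPER_MEM_ERROR_TYPE_OFFSET] = type;
}
```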

> 
> 
>> [1]:
>> UEFI Spec 2.6 Errata A:
>>
>> "N.2.5 Memory Error Section"
>> ------------------+-------------+-------------+--------------------------------------------
>> Mnemonic          | Byte Offset | Byte Length | Description
>> ------------------+-------------+-------------+--------------------------------------------
>> ...               | ...         | ...         | ...
>> ------------------+-------------+-------------+--------------------------------------------
>> Memory Error Type | 72          | 1           | Identifies the type of error that occurred:
>>                   |             |             |  0 – Unknown
>>                   |             |             |  1 – No error
>>                   |             |             |  2 – Single-bit ECC
>>                   |             |             |  3 – Multi-bit ECC
>>                   |             |             |  4 – Single-symbol ChipKill ECC
>>                   |             |             |  5 – Multi-symbol ChipKill ECC
>>                   |             |             |  6 – Master abort
>>                   |             |             |  7 – Target abort
>>                   |             |             |  8 – Parity Error
>>                   |             |             |  9 – Watchdog timeout
>>                   |             |             | 10 – Invalid address
>>                   |             |             | 11 – Mirror Broken
>>                   |             |             | 12 – Memory Sparing
>>                   |             |             | 13 - Scrub corrected error
>>                   |             |             | 14 - Scrub uncorrected error
>>                   |             |             | 15 - Physical Memory Map-out event
>>                   |             |             | All other values reserved.
>> ------------------+-------------+-------------+--------------------------------------------
>> ...               | ...         | ...         | ...
>> ------------------+-------------+-------------+--------------------------------------------
> [...]
> 
> .
>

Re: [Qemu-devel] [PATCH v2] hw/ppc: Remove the deprecated spapr-pci-vfio-host-bridge device

2018-01-03 Thread David Gibson

On Wed, Jan 03, 2018 at 10:10:38AM +0100, Thomas Huth wrote:
> It's a deprecated dummy device since QEMU v2.6.0. That should have
> been enough time to allow the users to update their scripts in case
> they still use it, so let's remove this legacy code now.
> 
> Reviewed-by: Alexey Kardashevskiy 
> Signed-off-by: Thomas Huth 
> ---
>  v2: Rebased to the ppc-for-2.12 branch to solve a conflict

Applied, thanks.

> 
>  hw/ppc/spapr_pci_vfio.c   | 47 
> ---
>  qemu-doc.texi |  5 -
>  scripts/device-crash-test |  1 -
>  3 files changed, 53 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> index 1f775ea..053efb0 100644
> --- a/hw/ppc/spapr_pci_vfio.c
> +++ b/hw/ppc/spapr_pci_vfio.c
> @@ -29,31 +29,6 @@
>  #include "qemu/error-report.h"
>  #include "sysemu/qtest.h"
>  
> -#define TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE "spapr-pci-vfio-host-bridge"
> -
> -#define SPAPR_PCI_VFIO_HOST_BRIDGE(obj) \
> -OBJECT_CHECK(sPAPRPHBVFIOState, (obj), TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE)
> -
> -typedef struct sPAPRPHBVFIOState sPAPRPHBVFIOState;
> -
> -struct sPAPRPHBVFIOState {
> -sPAPRPHBState phb;
> -
> -int32_t iommugroupid;
> -};
> -
> -static Property spapr_phb_vfio_properties[] = {
> -DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
> -DEFINE_PROP_END_OF_LIST(),
> -};
> -
> -static void spapr_phb_vfio_instance_init(Object *obj)
> -{
> -if (!qtest_enabled()) {
> -warn_report("spapr-pci-vfio-host-bridge is deprecated");
> -}
> -}
> -
>  bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
>  {
>  return vfio_eeh_as_ok(&sphb->iommu_as);
> @@ -218,25 +193,3 @@ int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
>  
>  return RTAS_OUT_SUCCESS;
>  }
> -
> -static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
> -{
> -DeviceClass *dc = DEVICE_CLASS(klass);
> -
> -dc->props = spapr_phb_vfio_properties;
> -}
> -
> -static const TypeInfo spapr_phb_vfio_info = {
> -.name  = TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE,
> -.parent= TYPE_SPAPR_PCI_HOST_BRIDGE,
> -.instance_size = sizeof(sPAPRPHBVFIOState),
> -.instance_init = spapr_phb_vfio_instance_init,
> -.class_init= spapr_phb_vfio_class_init,
> -};
> -
> -static void spapr_pci_vfio_register_types(void)
> -{
> -type_register_static(&spapr_phb_vfio_info);
> -}
> -
> -type_init(spapr_pci_vfio_register_types)
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index 90bea73..5449695 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -2744,11 +2744,6 @@ The ``host_net_remove'' command is replaced by the 
> ``netdev_del'' command.
>  The ``ivshmem'' device type is replaced by either the ``ivshmem-plain''
>  or ``ivshmem-doorbell`` device types.
>  
> -@subsection spapr-pci-vfio-host-bridge (since 2.6.0)
> -
> -The ``spapr-pci-vfio-host-bridge'' device type is replaced by
> -the ``spapr-pci-host-bridge'' device type.
> -
>  @section System emulator machines
>  
>  @subsection Xilinx EP108 (since 2.11.0)
> diff --git a/scripts/device-crash-test b/scripts/device-crash-test
> index c11fd81..827d8ec 100755
> --- a/scripts/device-crash-test
> +++ b/scripts/device-crash-test
> @@ -119,7 +119,6 @@ ERROR_WHITELIST = [
>  {'device':'scsi-generic', 'expected':True},# drive property 
> not set
>  {'device':'scsi-hd', 'expected':True}, # drive property 
> not set
>  {'device':'spapr-pci-host-bridge', 'expected':True},   # BUID not 
> specified for PHB
> -{'device':'spapr-pci-vfio-host-bridge', 'expected':True}, # BUID not 
> specified for PHB
>  {'device':'spapr-rng', 'expected':True},   # spapr-rng needs 
> an RNG backend!
>  {'device':'spapr-vty', 'expected':True},   # chardev 
> property not set
>  {'device':'tpm-tis', 'expected':True}, # tpm_tis: 
> backend driver with id (null) could not be found

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH] Update dtc to fix compilation problem on Mac OS 10.6

2018-01-03 Thread David Gibson

On Wed, Dec 27, 2017 at 07:17:42PM -0500, John Arbuckle wrote:
> Currently QEMU does not build on Mac OS 10.6
> because of a missing patch in the dtc
> subproject. Updating dtc to make the patch
> available fixes this problem.
> 
> Signed-off-by: John Arbuckle 

So, after some thought I decided I wasn't comfortable updating the
qemu dtc to a random git hash of the master dtc tree, rather than an
actual named and tagged dtc release.

I've therefore wrapped up a new v1.4.6 release of dtc including your fix.
Could you resend the qemu update to point it at that release?

> ---
>  dtc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/dtc b/dtc
> index 558cd81bdd..e671852042 160000
> --- a/dtc
> +++ b/dtc
> @@ -1 +1 @@
> -Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
> +Subproject commit e671852042a77b15ec72ca908291c7d647e4fb01

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)

2018-01-03 Thread Wei Xu

On Wed, Jan 03, 2018 at 04:07:44PM +0100, Stefan Priebe - Profihost AG wrote:
> 
> Am 03.01.2018 um 04:57 schrieb Wei Xu:
> > On Tue, Jan 02, 2018 at 10:17:25PM +0100, Stefan Priebe - Profihost AG 
> > wrote:
> >>
> >> Am 02.01.2018 um 18:04 schrieb Wei Xu:
> >>> On Tue, Jan 02, 2018 at 04:24:33PM +0100, Stefan Priebe - Profihost AG 
> >>> wrote:
>  Hi,
>  Am 02.01.2018 um 15:20 schrieb Wei Xu:
> > On Tue, Jan 02, 2018 at 12:17:29PM +0100, Stefan Priebe - Profihost AG 
> > wrote:
> >> Hello,
> >>
> >> currently i'm trying to fix a problem where we have "random" missing
> >> packets.
> >>
> >> We're doing an ssh connect from machine a to machine b every 5 minutes
> >> via rsync and ssh.
> >>
> >> Sometimes it happens that we get this cron message:
> >> "Connection to 192.168.0.2 closed by remote host.
> >> rsync: connection unexpectedly closed (0 bytes received so far) 
> >> [sender]
> >> rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]
> >> ssh: connect to host 192.168.0.2 port 22: Connection refused"
> >
> > Hi Stefan,
> > What kind of virtio-net backend are you using? Can you paste your qemu
> > command line here?
> 
>  Sure netdev part:
>  -netdev
>  type=tap,id=net0,ifname=tap317i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on
>  -device
>  virtio-net-pci,mac=EA:37:42:5C:F3:33,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
>  -netdev
>  type=tap,id=net1,ifname=tap317i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on,queues=4
>  -device
>  virtio-net-pci,mac=6A:8E:74:45:1A:0B,netdev=net1,bus=pci.0,addr=0x13,id=net1,vectors=10,mq=on,bootindex=301
> >>>
> >>> According to what you have mentioned, the traffic is not heavy for the 
> >>> guests,
> >>> the dropping shouldn't happen for regular case.
> >>
> >> The avg traffic is around 300kb/s.
> >>
> >>> What is your hardware platform?
> >>
> >> Dual Intel Xeon E5-2680 v4
> >>
> >>> and Which versions are you using for both
> >>> guest/host kernel
> >> Kernel v4.4.103
> >>
> >>> and qemu?
> >> 2.9.1
> >>
> >>> Are there other VMs on the same host?
> >> Yes.
> > 
> > What about the CPU load? 
> 
> Host:
> 80-90% Idle
> LoadAvg: 6-7
> 
> VM:
> 97%-99% Idle
> 

OK, then this shouldn't be a concern.

> > 'Connection refused' usually means that the client gets a TCP Reset 
> > rather
> > than losing packets, so this might not be a relevant issue.
> 
>  Mhm so you mean these might be two seperate ones?
> >>>
> >>> Yes.
> >>>
> 
> > Also you can do a tcpdump on both guests and see what happened to SSH 
> > packets
> > (tcpdump -i tapXXX port 22).
> 
>  Sadly not as there's too much traffic on that part as rsync is syncing
>  every 5 minutes through ssh.
> >>>
> >>> You can do a tcpdump for the entire traffic from the guest and host and 
> >>> compare
> >>> what kind of packets are dropped if the traffic is not overloaded.
> >>
> >> Are you sure? I don't get why the same amount and same kind of packets
> >> should be received by both tap which are connected to different bridges
> >> to different HW and physical interfaces.
> > 
> > Exactly, possibly this would be a host or guest kernel bug cos than qemu 
> > issue
> > you are using vhost kernel as the backend and the two stats are independent,
> > you might have to check out what is happening inside the traffic.
> 
> What do you mean by inside the traffic?

You might need to figure out what kind of packets are dropped on the host tap
interface: are they random packets or specific packets?

Besides triaging the traffic, there are a few other tests that can help show
what is happening, or you can try alternatives that suit your test bed.

1). Upgrade the host & guest kernels to the latest kernel and see if the
problem still shows up; you can use the net-next tree.
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

2). Run some throughput tests (netperf, iperf, etc.) on both guests (traffic
from guest to host if the guests are isolated, as your comments suggest) and
check the statistics.

Wei

> 
> Stefan
>

[Qemu-devel] [PATCH] scsi: Don't dereference in_buf if NULL

2018-01-03 Thread Fam Zheng

scsi_disk_emulate_command passes in_buf=NULL and in_len=0 in the
REQUEST_SENSE branch. Inline the fixed_in evaluation and put it after
the in_len test.
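
The effect of the reordering can be shown with a small standalone sketch.
It is not the QEMU function itself: sense_format_matches() is a
hypothetical name, and the sketch isolates only the condition from the
hunk below. Because && evaluates its right-hand side only when the
left-hand side is true, in_buf[0] is never read when in_len is 0, so a
NULL buffer with zero length is safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the input sense data is non-empty and is
 * already in the requested format (fixed vs. descriptor). The buffer is
 * only dereferenced after in_len has been checked.
 */
static bool sense_format_matches(const uint8_t *in_buf, int in_len, bool fixed)
{
    return in_len && fixed == ((in_buf[0] & 2) == 0);
}
```

With this ordering, the REQUEST_SENSE caller that passes in_buf=NULL and
in_len=0 simply gets false and falls through to the conversion path. The
tests for this note assume response code 0x70 for fixed-format sense data
and 0x72 for descriptor-format sense data.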

Signed-off-by: Fam Zheng 
---
 scsi/utils.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/scsi/utils.c b/scsi/utils.c
index ddae650a99..9a0a925ef9 100644
--- a/scsi/utils.c
+++ b/scsi/utils.c
@@ -320,10 +320,8 @@ int scsi_convert_sense(uint8_t *in_buf, int in_len,
uint8_t *buf, int len, bool fixed)
 {
 SCSISense sense;
-bool fixed_in;
 
-fixed_in = (in_buf[0] & 2) == 0;
-if (in_len && fixed == fixed_in) {
+if (in_len && !!fixed == ((in_buf[0] & 2) == 0)) {
 memcpy(buf, in_buf, MIN(len, in_len));
 return MIN(len, in_len);
 }
-- 
2.14.3

Re: [Qemu-devel] [PATCH v2.1 3/3] chardev: introduce qemu_chr_timeout_add() and use

2018-01-03 Thread Peter Xu

On Wed, Jan 03, 2018 at 05:41:53PM +0000, Stefan Hajnoczi wrote:
> On Wed, Jan 03, 2018 at 10:24:18AM +0800, Peter Xu wrote:
> > It's a replacement of g_timeout_add[_seconds]() for chardevs.  Chardevs
> > now can have dedicated gcontext, we should always bind chardev tasks
> > onto those gcontext rather than the default main context.  Since there
> > are quite a few of g_timeout_add[_seconds]() callers, a new function
> > qemu_chr_timeout_add() is introduced.
> > 
> > One thing to mention is that, terminal3270 is still always running on
> > main gcontext.  However let's convert that as well since it's still part
> > of chardev codes and in case one day we'll miss that when we move it out
> > of main gcontext too.
> > 
> > Signed-off-by: Peter Xu 
> > ---
> > 
> > v2 -> v2.1: Sorry I forgot to do the move in char.h.  Did it in this
> > minor version.
> > 
> >  chardev/char-pty.c |  9 ++---
> >  chardev/char-socket.c  |  4 ++--
> >  chardev/char.c | 20 
> >  hw/char/terminal3270.c |  7 ---
> >  include/chardev/char.h |  3 +++
> >  5 files changed, 31 insertions(+), 12 deletions(-)
> > 
> > diff --git a/chardev/char-pty.c b/chardev/char-pty.c
> > index dd17b1b823..cbd8ac5eb7 100644
> > --- a/chardev/char-pty.c
> > +++ b/chardev/char-pty.c
> > @@ -78,13 +78,8 @@ static void pty_chr_rearm_timer(Chardev *chr, int ms)
> >  s->timer_tag = 0;
> >  }
> >  
> > -if (ms == 1000) {
> > -name = g_strdup_printf("pty-timer-secs-%s", chr->label);
> > -s->timer_tag = g_timeout_add_seconds(1, pty_chr_timer, chr);
> > -} else {
> > -name = g_strdup_printf("pty-timer-ms-%s", chr->label);
> > -s->timer_tag = g_timeout_add(ms, pty_chr_timer, chr);
> > -}
> > +name = g_strdup_printf("pty-timer-ms-%s", chr->label);
> > +s->timer_tag = qemu_chr_timeout_add(chr, ms, pty_chr_timer, chr);
> 
> The label is user-visible.  Why did you remove the seconds label format?

It's used for g_source_set_name_by_id() below, and that's not
user-visible AFAICT?

I removed it because I thought it was not user-visible, and I didn't
see the point of keeping it.  Please let me know if I made a
mistake.

> 
> Please either include justification in the commit description or avoid
> spurious changes like this so reviewers don't need to worry about code
> changes that are not essential.

Yes. I can add this to the commit message once I've confirmed the above with you.

> 
> >  g_source_set_name_by_id(s->timer_tag, name);
> >  g_free(name);
> >  }
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 630a7f2995..5cca32f963 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -73,8 +73,8 @@ static void qemu_chr_socket_restart_timer(Chardev *chr)
> >  char *name;
> >  
> >  assert(s->connected == 0);
> > -s->reconnect_timer = g_timeout_add_seconds(s->reconnect_time,
> > -   socket_reconnect_timeout, 
> > chr);
> 
> Here it was clear that reconnect_time is in seconds...
> 
> > +s->reconnect_timer = qemu_chr_timeout_add(chr, s->reconnect_time * 
> > 1000,
> > +  socket_reconnect_timeout, 
> > chr);
> 
> ...now I can't tell what the unit is.
> 
> Please rename qemu_chr_timeout_add() to include the units:
> 
>   s->reconnect_timer = qemu_chr_timeout_add_ms(chr, s->reconnect_time * 1000,

Sure.

> 
> >  name = g_strdup_printf("chardev-socket-reconnect-%s", chr->label);
> >  g_source_set_name_by_id(s->reconnect_timer, name);
> >  g_free(name);
> > diff --git a/chardev/char.c b/chardev/char.c
> > index 8c3765ee99..a1de662fec 100644
> > --- a/chardev/char.c
> > +++ b/chardev/char.c
> > @@ -1084,6 +1084,26 @@ void qmp_chardev_send_break(const char *id, Error 
> > **errp)
> >  qemu_chr_be_event(chr, CHR_EVENT_BREAK);
> >  }
> >  
> > +/*
> > + * Add a timeout callback for the chardev (in milliseconds). Please
> > + * use this to add timeout hook for chardev instead of g_timeout_add()
> > + * and g_timeout_add_seconds(), to make sure the gcontext that the
> > + * task bound to is correct.
> > + */
> 
> What is the return value?

Basically it's a wrapper around the other two functions, so the
return value is the same.  But sure, I'll note that.  Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] osdep: Retry SETLK upon EINTR

2018-01-03 Thread Fam Zheng

On Wed, 01/03 16:57, Eric Blake wrote:
> On 12/26/2017 12:53 AM, Fam Zheng wrote:
> > We could hit lock failure if there is a signal that makes fcntl return
> > -1 and errno set to EINTR. In this case we should retry.
> 
> Did you hit this in practice?  In 'man fcntl' on my Fedora 27 box, the
> DESCRIPTION section only mentions EINTR as possible for F_[OFD_]SETLKW,
> but we don't appear to be using that one (just SETLK and GETLK).  On the
> other hand, the ERRORS section of the same document mentions:
> 
> 
>        EINTR  cmd is F_SETLKW or F_OFD_SETLKW and the operation was
>               interrupted by a signal; see signal(7).
> 
>        EINTR  cmd is F_GETLK, F_SETLK, F_OFD_GETLK, or F_OFD_SETLK, and
>               the operation was interrupted by a signal before the lock
>               was checked or acquired.  Most likely when locking a remote
>               file (e.g., locking over NFS), but can sometimes happen
>               locally.
> 
> (I hate it when information differs between two places in the same
> document, especially if I only read the first place)

Yes, our QE found it when hammering qemu-img convert with SIGUSR1. So both SETLK
and SETLKW can get EINTR.
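
For readers who want to try the retry pattern outside QEMU, here is a
minimal standalone sketch. It is an illustration only: lock_retry_eintr()
is a hypothetical name, and it always uses F_SETLK, whereas the QEMU
function in the patch below goes through its own fcntl_op_setlk variable.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to set (or clear) a byte-range lock on 'fd',
 * retrying while fcntl() is interrupted by a signal. 'type' is F_RDLCK,
 * F_WRLCK or F_UNLCK. Returns 0 on success or a negative errno value.
 */
static int lock_retry_eintr(int fd, off_t start, off_t len, short type)
{
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
        .l_type   = type,
    };
    int ret;

    do {
        ret = fcntl(fd, F_SETLK, &fl);
    } while (ret == -1 && errno == EINTR);

    return ret == -1 ? -errno : 0;
}
```

Any error other than EINTR, such as the lock being held by another
process, is still reported to the caller on the first attempt; only the
interrupted case is retried.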

> 
> > 
> > Cc: qemu-sta...@nongnu.org
> > Signed-off-by: Fam Zheng 
> > ---
> >  util/osdep.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/util/osdep.c b/util/osdep.c
> > index 1231f9f876..a73de0e1ba 100644
> > --- a/util/osdep.c
> > +++ b/util/osdep.c
> > @@ -244,7 +244,9 @@ static int qemu_lock_fcntl(int fd, int64_t start, 
> > int64_t len, int fl_type)
> >  .l_type   = fl_type,
> >  };
> >  qemu_probe_lock_ops();
> > -ret = fcntl(fd, fcntl_op_setlk, );
> > +do {
> > +ret = fcntl(fd, fcntl_op_setlk, );
> > +} while (ret == -1 && errno == EINTR);
> 
> The change makes sense from a maintenance point of view, whether or not
> you hit it in practice.

Thank you for reviewing!

Fam
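
For readers following the thread, the retry pattern from the patch can be
sketched on its own. This is a minimal standalone sketch, not the QEMU code:
lock_range_retry is a hypothetical name, and it hard-codes F_SETLK where QEMU
selects the fcntl operation at runtime (fcntl_op_setlk).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: take (or release) a byte-range lock without
 * blocking, retrying if a signal interrupts fcntl() before the lock is
 * checked or acquired.  Returns 0 on success or a negative errno value.
 */
static int lock_range_retry(int fd, off_t start, off_t len, short fl_type)
{
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
        .l_type   = fl_type,
    };
    int ret;

    assert(fl_type == F_RDLCK || fl_type == F_WRLCK || fl_type == F_UNLCK);

    do {
        ret = fcntl(fd, F_SETLK, &fl);
    } while (ret == -1 && errno == EINTR);

    return ret == -1 ? -errno : 0;
}
```

Only EINTR is retried; a genuine conflict (EAGAIN or EACCES) is still reported
to the caller right away, so the non-blocking behaviour of SETLK is kept.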

Re: [Qemu-devel] [PATCH v2 13/13] migration: remove notify in fd_error

2018-01-03 Thread Peter Xu

On Wed, Jan 03, 2018 at 01:31:01PM +0100, Juan Quintela wrote:
> Peter Xu  wrote:
> > It should be called in migrate_fd_cleanup too.
> 
> It is *already* called in migrate_fd_cleanup.
> 
> I think we should add a comment stating that we _always_ end calling
> migrate_fd_cleanup, independently of how the migration ends.
> 
> 
> > Signed-off-by: Peter Xu 
> 
> Reviewed-by: Juan Quintela 
> 
> I can also fix the comment when pulling if you agree with the change.

Yes.  Please modify according to your suggestions (including the other
patch comment).  Thanks for that!

-- 
Peter Xu

[Qemu-devel] [PATCH v10 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2018-01-03 Thread Changpeng Liu

This commit introduces a new vhost-user device for block. It uses a
chardev to connect to the backend. As with the QEMU virtio-blk device,
the guest OS still uses the virtio-blk frontend driver.

To use it, start QEMU with command line like this:

qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num-queues=2, \
bootindex=2... \

Users can use different parameters for `num-queues` and `bootindex`.

Unlike the existing QEMU virtio-blk host device, it makes it easier
for users to implement their own I/O processing logic, such as an
all-user-space I/O stack on top of a hardware block device. It uses the
new vhost message (VHOST_USER_GET_CONFIG) to get the virtio block config
information from the backend process.
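
The way the host device treats a config change reported by the backend can
be sketched as follows. This is a simplified standalone sketch, not the code
from this patch: struct blk_config_sketch is a hypothetical, trimmed-down
stand-in for struct virtio_blk_config, and the "resize only" policy is taken
from the comment in vhost_user_blk_handle_config_change().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct virtio_blk_config. */
struct blk_config_sketch {
    uint64_t capacity;   /* device size in sectors */
    uint8_t  wce;        /* write cache enable */
};

/*
 * Merge a freshly fetched config into the cached copy.  Only a capacity
 * change (a resize) is honoured; every other field is left untouched.
 * Returns true when the cache changed, i.e. when the caller would need
 * to tell the guest about the new config.
 */
static bool blk_config_apply_resize(struct blk_config_sketch *cached,
                                    const struct blk_config_sketch *fresh)
{
    assert(cached && fresh);

    if (fresh->capacity == cached->capacity) {
        return false;
    }
    cached->capacity = fresh->capacity;
    return true;
}
```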

Signed-off-by: Changpeng Liu 
---
 default-configs/pci.mak|   1 +
 hw/block/Makefile.objs |   3 +
 hw/block/vhost-user-blk.c  | 359 +
 hw/virtio/virtio-pci.c |  55 ++
 hw/virtio/virtio-pci.h |  18 ++
 include/hw/virtio/vhost-user-blk.h |  41 +
 6 files changed, 477 insertions(+)
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index e514bde..49a0f28 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -43,3 +43,4 @@ CONFIG_VGA_PCI=y
 CONFIG_IVSHMEM_DEVICE=$(CONFIG_IVSHMEM)
 CONFIG_ROCKER=y
 CONFIG_VHOST_USER_SCSI=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
+CONFIG_VHOST_USER_BLK=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index e0ed980..4c19a58 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
 
 obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO) += dataplane/
+ifeq ($(CONFIG_VIRTIO),y)
+obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
new file mode 100644
index 000..b53b4c9
--- /dev/null
+++ b/hw/block/vhost-user-blk.c
@@ -0,0 +1,359 @@
+/*
+ * vhost-user-blk host device
+ *
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ *  Changpeng Liu 
+ *
+ * Largely based on the "vhost-user-scsi.c" and "vhost-scsi.c" implemented by:
+ * Felipe Franciosi 
+ * Stefan Hajnoczi 
+ * Nicholas Bellinger 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+
+static const int user_feature_bits[] = {
+VIRTIO_BLK_F_SIZE_MAX,
+VIRTIO_BLK_F_SEG_MAX,
+VIRTIO_BLK_F_GEOMETRY,
+VIRTIO_BLK_F_BLK_SIZE,
+VIRTIO_BLK_F_TOPOLOGY,
+VIRTIO_BLK_F_MQ,
+VIRTIO_BLK_F_RO,
+VIRTIO_BLK_F_FLUSH,
+VIRTIO_BLK_F_CONFIG_WCE,
+VIRTIO_F_VERSION_1,
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+memcpy(config, >blkcfg, sizeof(struct virtio_blk_config));
+}
+
+static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
+int ret;
+
+if (blkcfg->wce == s->blkcfg.wce) {
+return;
+}
+
+ret = vhost_dev_set_config(>dev, >wce,
+   offsetof(struct virtio_blk_config, wce),
+   sizeof(blkcfg->wce),
+   VHOST_SET_CONFIG_TYPE_MASTER);
+if (ret) {
+error_report("set device config space failed");
+return;
+}
+
+s->blkcfg.wce = blkcfg->wce;
+}
+
+static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
+{
+int ret;
+struct virtio_blk_config blkcfg;
+VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
+
+ret = vhost_dev_get_config(dev, (uint8_t *),
+   sizeof(struct virtio_blk_config));
+if (ret < 0) {
+error_report("get config space failed");
+return -1;
+}
+
+/* valid for resize only */
+if (blkcfg.capacity != s->blkcfg.capacity) {
+s->blkcfg.capacity = blkcfg.capacity;
+memcpy(dev->vdev->config, >blkcfg, sizeof(struct

[Qemu-devel] [PATCH v10 0/4] Introduce a new vhost-user-blk host device to QEMU

2018-01-03 Thread Changpeng Liu

Although the virtio-scsi specification was designed as a replacement for
virtio_blk, many users still rely on virtio_blk. QEMU 2.9 introduced a new
vhost-user-scsi device, which processes I/O in user space for virtio_scsi.
This series introduces a new vhost-user block host device, which supports
virtio_blk in the guest OS with I/O processing in a separate I/O target.

Because of a limitation of the virtio_blk specification, a virtio_blk device
cannot obtain block information such as capacity and block size through the
specification alone, so several new vhost-user messages were added to carry
virtio config space information between QEMU and the I/O target. The
VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages get/set the config space
from/to the I/O target, and the VHOST_USER_SLAVE_CONFIG_CHANGE_MSG slave
message serves as the event notifier when the virtio config space changes.
These messages can also be used for vhost device live migration.

CHANGES:
v10: fix the code style error.
v8-v9: several small optimizations and code cleanups according to the comments.
v7-v8: use a new vhost-user slave message, instead of an event file
descriptor, as the event notifier when the virtio configuration space
changes. Several small optimizations to address the comments from v7.
v6-v7: change the parameter of the set configuration function so that it
only contains a valid data buffer.
v5-v6: add header flags for the vhost-user master so that the slave knows
the purpose of a set config; also, the payload of vhost-user get/set
messages no longer contains invalid data buffers.
v4-v5: add header offset and size for virtio config space.
v3-v4: refactor the vhost-user block example patch based on the new
libvhost-user library.
v2-v3: add new vhost-user messages to get/set virtio config space.

Changpeng Liu (4):
  vhost-user: add new vhost user messages to support virtio config space
  vhost-user-blk: introduce a new vhost-user-blk host device
  contrib/libvhost-user: enable virtio config space messages
  contrib/vhost-user-blk: introduce a vhost-user-blk sample application

 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   1 +
 contrib/libvhost-user/libvhost-user.c   |  42 +++
 contrib/libvhost-user/libvhost-user.h   |  33 ++
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 545 
 default-configs/pci.mak |   1 +
 docs/interop/vhost-user.txt |  55 
 hw/block/Makefile.objs  |   3 +
 hw/block/vhost-user-blk.c   | 359 +
 hw/virtio/vhost-user.c  | 118 +++
 hw/virtio/vhost.c   |  32 ++
 hw/virtio/virtio-pci.c  |  55 
 hw/virtio/virtio-pci.h  |  18 ++
 include/hw/virtio/vhost-backend.h   |  12 +
 include/hw/virtio/vhost-user-blk.h  |  41 +++
 include/hw/virtio/vhost.h   |  15 +
 18 files changed, 1335 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

-- 
1.9.3

[Qemu-devel] [PATCH v10 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2018-01-03 Thread Changpeng Liu

This commit introduces a vhost-user-blk backend device. It uses a UNIX
domain socket to communicate with QEMU. The vhost-user-blk sample
application should be used with the QEMU vhost-user-blk-pci device.

To use it, compile with:
make vhost-user-blk

and start like this:
vhost-user-blk -b /dev/sdb -s /path/vhost.socket

Signed-off-by: Changpeng Liu 
---
 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   1 +
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 545 
 5 files changed, 551 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c

diff --git a/.gitignore b/.gitignore
index 433f64f..704b222 100644
--- a/.gitignore
+++ b/.gitignore
@@ -54,6 +54,7 @@
 /module_block.h
 /scsi/qemu-pr-helper
 /vhost-user-scsi
+/vhost-user-blk
 /fsdev/virtfs-proxy-helper
 *.tmp
 *.[1-9]
diff --git a/Makefile b/Makefile
index d86ecd2..f021fc8 100644
--- a/Makefile
+++ b/Makefile
@@ -331,6 +331,7 @@ dummy := $(call unnest-vars,, \
 ivshmem-server-obj-y \
 libvhost-user-obj-y \
 vhost-user-scsi-obj-y \
+vhost-user-blk-obj-y \
 qga-vss-dll-obj-y \
 block-obj-y \
 block-obj-m \
@@ -562,6 +563,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) 
$(COMMON_LDADDS)
 endif
 vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
$(call LINK, $^)
+vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
+   $(call LINK, $^)
 
 module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
diff --git a/Makefile.objs b/Makefile.objs
index 285c6f3..ae9aef7 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -115,6 +115,7 @@ libvhost-user-obj-y = contrib/libvhost-user/
 vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
 vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
 vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
+vhost-user-blk-obj-y = contrib/vhost-user-blk/
 
 ##
 trace-events-subdirs =
diff --git a/contrib/vhost-user-blk/Makefile.objs 
b/contrib/vhost-user-blk/Makefile.objs
new file mode 100644
index 000..72e2cdc
--- /dev/null
+++ b/contrib/vhost-user-blk/Makefile.objs
@@ -0,0 +1 @@
+vhost-user-blk-obj-y = vhost-user-blk.o
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
new file mode 100644
index 000..67dac81
--- /dev/null
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -0,0 +1,545 @@
+/*
+ * vhost-user-blk sample application
+ *
+ * Copyright (c) 2017 Intel Corporation. All rights reserved.
+ *
+ * Author:
+ *  Changpeng Liu 
+ *
+ * This work is based on the "vhost-user-scsi" sample and "virtio-blk" driver
+ * implementation by:
+ *  Felipe Franciosi 
+ *  Anthony Liguori 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 only.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "standard-headers/linux/virtio_blk.h"
+#include "contrib/libvhost-user/libvhost-user-glib.h"
+#include "contrib/libvhost-user/libvhost-user.h"
+
+#include 
+
+struct virtio_blk_inhdr {
+unsigned char status;
+};
+
+/* vhost user block device */
+typedef struct VubDev {
+VugDev parent;
+int blk_fd;
+struct virtio_blk_config blkcfg;
+char *blk_name;
+GMainLoop *loop;
+} VubDev;
+
+typedef struct VubReq {
+VuVirtqElement *elem;
+int64_t sector_num;
+size_t size;
+struct virtio_blk_inhdr *in;
+struct virtio_blk_outhdr *out;
+VubDev *vdev_blk;
+struct VuVirtq *vq;
+} VubReq;
+
+/* refer util/iov.c */
+static size_t vub_iov_size(const struct iovec *iov,
+  const unsigned int iov_cnt)
+{
+size_t len;
+unsigned int i;
+
+len = 0;
+for (i = 0; i < iov_cnt; i++) {
+len += iov[i].iov_len;
+}
+return len;
+}
+
+static void vub_panic_cb(VuDev *vu_dev, const char *buf)
+{
+VugDev *gdev;
+VubDev *vdev_blk;
+
+assert(vu_dev);
+
+gdev = container_of(vu_dev, VugDev, parent);
+vdev_blk = container_of(gdev, VubDev, parent);
+if (buf) {
+g_warning("vu_panic: %s", buf);
+}
+
+g_main_loop_quit(vdev_blk->loop);
+}
+
+static void vub_req_complete(VubReq *req)
+{
+VugDev *gdev = >vdev_blk->parent;
+VuDev *vu_dev = >parent;
+
+/* IO size with 1 extra status byte */
+vu_queue_push(vu_dev, req->vq, req->elem,
+  req->size + 1);
+vu_queue_notify(vu_dev, req->vq);
+
+if (req->elem) {
+free(req->elem);
+}
+
+g_free(req);
+}
+
+static int vub_open(const char *file_name, bool wce)
+{

[Qemu-devel] [PATCH v10 3/4] contrib/libvhost-user: enable virtio config space messages

2018-01-03 Thread Changpeng Liu

Enable the VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages in the
libvhost-user library, so that users can implement their own I/O target
based on the library. This allows the virtio config space to be passed
between the QEMU host device and the I/O target.
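
A backend built on the library would supply a pair of callbacks along these
lines. This is a hedged standalone sketch rather than code against
libvhost-user itself: struct sketch_cfg and both sketch_* functions are
hypothetical, and the dev argument of the real vu_get_config_cb and
vu_set_config_cb types is replaced by a plain config pointer so the example
compiles on its own. The flag values follow VhostSetConfigType from this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined by VhostSetConfigType in the patch. */
enum { CFG_TYPE_MASTER = 0, CFG_TYPE_MIGRATION = 1 };

/* Hypothetical config layout; a real backend would use virtio_blk_config. */
struct sketch_cfg {
    uint64_t capacity;
    uint32_t blk_size;
    uint8_t  wce;        /* the only guest-writable field in this sketch */
};

/* Copy out the first len bytes of the config; 0 on success, -1 on error. */
static int sketch_get_config(const struct sketch_cfg *cfg,
                             uint8_t *config, uint32_t len)
{
    assert(cfg && config);

    if (len > sizeof(*cfg)) {
        return -1;
    }
    memcpy(config, cfg, len);
    return 0;
}

/*
 * Apply a write to the config.  Writes from the master on behalf of the
 * guest may only touch the writable field; a migration restore may
 * overwrite read-only fields as well.
 */
static int sketch_set_config(struct sketch_cfg *cfg, const uint8_t *data,
                             uint32_t offset, uint32_t size, uint32_t flags)
{
    assert(cfg && data);

    if (offset > sizeof(*cfg) || size > sizeof(*cfg) - offset) {
        return -1;      /* range falls outside the config space */
    }
    if (flags != CFG_TYPE_MIGRATION &&
        (offset != offsetof(struct sketch_cfg, wce) || size != 1)) {
        return -1;      /* read-only field and not a migration restore */
    }
    memcpy((uint8_t *)cfg + offset, data, size);
    return 0;
}
```

Returning non-zero from the get callback matches how vu_get_config() in this
patch reports failure: it resizes the reply to zero so the master sees an
error.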

Signed-off-by: Changpeng Liu 
---
 contrib/libvhost-user/libvhost-user.c | 42 +++
 contrib/libvhost-user/libvhost-user.h | 33 +++
 2 files changed, 75 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index f409bd3..27cc597 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -84,6 +84,8 @@ vu_request_to_string(unsigned int req)
 REQ(VHOST_USER_SET_SLAVE_REQ_FD),
 REQ(VHOST_USER_IOTLB_MSG),
 REQ(VHOST_USER_SET_VRING_ENDIAN),
+REQ(VHOST_USER_GET_CONFIG),
+REQ(VHOST_USER_SET_CONFIG),
 REQ(VHOST_USER_MAX),
 };
 #undef REQ
@@ -798,6 +800,42 @@ vu_set_slave_req_fd(VuDev *dev, VhostUserMsg *vmsg)
 }
 
 static bool
+vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+int ret = -1;
+
+if (dev->iface->get_config) {
+ret = dev->iface->get_config(dev, vmsg->payload.config.region,
+ vmsg->payload.config.size);
+}
+
+if (ret) {
+/* resize to zero to indicate an error to master */
+vmsg->size = 0;
+}
+
+return true;
+}
+
+static bool
+vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+int ret = -1;
+
+if (dev->iface->set_config) {
+ret = dev->iface->set_config(dev, vmsg->payload.config.region,
+ vmsg->payload.config.offset,
+ vmsg->payload.config.size,
+ vmsg->payload.config.flags);
+if (ret) {
+vu_panic(dev, "Set virtio configuration space failed");
+}
+}
+
+return false;
+}
+
+static bool
 vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 {
 int do_reply = 0;
@@ -862,6 +900,10 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 return vu_set_vring_enable_exec(dev, vmsg);
 case VHOST_USER_SET_SLAVE_REQ_FD:
 return vu_set_slave_req_fd(dev, vmsg);
+case VHOST_USER_GET_CONFIG:
+return vu_get_config(dev, vmsg);
+case VHOST_USER_SET_CONFIG:
+return vu_set_config(dev, vmsg);
 case VHOST_USER_NONE:
 break;
 default:
diff --git a/contrib/libvhost-user/libvhost-user.h 
b/contrib/libvhost-user/libvhost-user.h
index 2f5864b..f8a730b 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -30,6 +30,16 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
+typedef enum VhostSetConfigType {
+VHOST_SET_CONFIG_TYPE_MASTER = 0,
+VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
+} VhostSetConfigType;
+
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -69,6 +79,8 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
 VHOST_USER_SET_VRING_ENDIAN = 23,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -90,6 +102,18 @@ typedef struct VhostUserLog {
 uint64_t mmap_offset;
 } VhostUserLog;
 
+typedef struct VhostUserConfig {
+uint32_t offset;
+uint32_t size;
+uint32_t flags;
+uint8_t region[VHOST_USER_MAX_CONFIG_SIZE];
+} VhostUserConfig;
+
+static VhostUserConfig c __attribute__ ((unused));
+#define VHOST_USER_CONFIG_HDR_SIZE (sizeof(c.offset) \
+   + sizeof(c.size) \
+   + sizeof(c.flags))
+
 #if defined(_WIN32)
 # define VU_PACKED __attribute__((gcc_struct, packed))
 #else
@@ -112,6 +136,7 @@ typedef struct VhostUserMsg {
 struct vhost_vring_addr addr;
 VhostUserMemory memory;
 VhostUserLog log;
+VhostUserConfig config;
 } payload;
 
 int fds[VHOST_MEMORY_MAX_NREGIONS];
@@ -140,6 +165,10 @@ typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg 
*vmsg,
   int *do_reply);
 typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
 typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
+typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
+typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *data,
+ uint32_t offset, uint32_t size,
+ uint32_t flags);
 
 typedef struct VuDevIface {
 /* called by VHOST_USER_GET_FEATURES to get the features bitmask */
@@ -162,6 +191,10 @@ typedef struct VuDevIface {
  * on unmanaged exit/crash.
  */
 vu_queue_is_processed_in_order_cb

[Qemu-devel] [PATCH v10 1/4] vhost-user: add new vhost user messages to support virtio config space

2018-01-03 Thread Changpeng Liu

Add the VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages, which can be
used for live migration of vhost-user devices. vhost-user devices can
also use these messages to get/set the virtio config space from/to the
I/O target. To support virtio config space changes, the
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
for when the virtio config space changes in the slave I/O target.
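
The payload layout used by both messages (offset, size, flags, then the
config bytes) can be sketched as below. This is an illustrative sketch only:
the sketch_* names are hypothetical, the 256-byte limit mirrors
VHOST_USER_MAX_CONFIG_SIZE from this series, and offsetof() is used here to
compute the header size where the series sums the sizeof of each field. The
range check and the wire-size helper are assumptions about how a caller
might validate a request, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CONFIG_SIZE 256   /* mirrors VHOST_USER_MAX_CONFIG_SIZE */

/* Same field order as the VhostUserConfig struct in this series. */
struct sketch_config_msg {
    uint32_t offset;
    uint32_t size;
    uint32_t flags;
    uint8_t  region[SKETCH_MAX_CONFIG_SIZE];
};

/* Bytes that precede the config bytes: offset + size + flags. */
#define SKETCH_CONFIG_HDR_SIZE offsetof(struct sketch_config_msg, region)

/* True if [offset, offset + size) is non-empty and inside the config space. */
static bool sketch_config_range_ok(uint32_t offset, uint32_t size)
{
    return size != 0 &&
           offset < SKETCH_MAX_CONFIG_SIZE &&
           size <= SKETCH_MAX_CONFIG_SIZE - offset;
}

/* Payload length for a request carrying size config bytes. */
static size_t sketch_config_wire_size(uint32_t size)
{
    assert(size <= SKETCH_MAX_CONFIG_SIZE);
    return SKETCH_CONFIG_HDR_SIZE + size;
}
```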

Signed-off-by: Changpeng Liu 
---
 docs/interop/vhost-user.txt   |  55 ++
 hw/virtio/vhost-user.c| 118 ++
 hw/virtio/vhost.c |  32 +++
 include/hw/virtio/vhost-backend.h |  12 
 include/hw/virtio/vhost.h |  15 +
 5 files changed, 232 insertions(+)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 954771d..9a5cb6a 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -116,6 +116,19 @@ Depending on the request type, payload can be:
 - 3: IOTLB invalidate
 - 4: IOTLB access fail
 
+ * Virtio device config space
+   ---
+   | offset | size | flags | payload |
+   ---
+
+   Offset: a 32-bit offset of virtio device's configuration space
+   Size: a 32-bit configuration space access size in bytes
+   Flags: a 32-bit value:
+- 0: Vhost master messages used for writeable fields
+- 1: Vhost master messages used for live migration
+   Payload: Size bytes array holding the contents of the virtio
+   device's configuration space
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -129,6 +142,7 @@ typedef struct VhostUserMsg {
 VhostUserMemory memory;
 VhostUserLog log;
 struct vhost_iotlb_msg iotlb;
+VhostUserConfig config;
 };
 } QEMU_PACKED VhostUserMsg;
 
@@ -596,6 +610,32 @@ Master message types
   and expect this message once (per VQ) during device configuration
   (ie. before the master starts the VQ).
 
+ * VHOST_USER_GET_CONFIG
+
+  Id: 24
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+  Slave payload: virtio device config space
+
+  Submitted by the vhost-user master to fetch the contents of the virtio
+  device configuration space, vhost-user slave's payload size MUST match
+  master's request, vhost-user slave uses zero length of payload to
+  indicate an error to vhost-user master. The vhost-user master may
+  cache the contents to avoid repeated VHOST_USER_GET_CONFIG calls.
+
+* VHOST_USER_SET_CONFIG
+
+  Id: 25
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+  Slave payload: N/A
+
+  Submitted by the vhost-user master when the Guest changes the virtio
+  device configuration space and also can be used for live migration
+  on the destination host. The vhost-user slave must check the flags
+  field, and slaves MUST NOT accept SET_CONFIG for read-only
+  configuration space fields unless the live migration bit is set.
+
 Slave message types
 ---
 
@@ -614,6 +654,21 @@ Slave message types
   This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
   has been successfully negotiated.
 
+* VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
+
+ Id: 2
+ Equivalent ioctl: N/A
+ Slave payload: N/A
+ Master payload: N/A
+
+ Vhost-user slave sends such messages to notify that the virtio device's
+ configuration space has changed, for those host devices which can support
+ such feature, host driver can send VHOST_USER_GET_CONFIG message to slave
+ to get the latest content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is
+ negotiated, and slave set the VHOST_USER_NEED_REPLY flag, master must
+ respond with zero when operation is successfully completed, or non-zero
+ otherwise.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 ---
 The original vhost-user specification only demands replies for certain
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 093675e..8b94688 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,6 +26,11 @@
 #define VHOST_MEMORY_MAX_NREGIONS8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -65,12 +70,15 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
 VHOST_USER_SET_VRING_ENDIAN = 23,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
 VHOST_USER_MAX
 } VhostUserRequest;
 
 typedef enum VhostUserSlaveRequest {
 VHOST_USER_SLAVE_NONE = 0,
 VHOST_USER_SLAVE_IOTLB_MSG = 1,
+

[Qemu-devel] [PATCH v7 15/17] target/m68k: add andi/ori/eori to SR/CCR

2018-01-03 Thread Laurent Vivier

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 53 ++---
 1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 8f23cade04..499aaa2f3d 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -2201,6 +2201,7 @@ DISAS_INSN(arith_im)
 TCGv dest;
 TCGv addr;
 int opsize;
+bool with_SR = ((insn & 0x3f) == 0x3c);
 
 op = (insn >> 9) & 7;
 opsize = insn_opsize(insn);
@@ -2217,32 +2218,73 @@ DISAS_INSN(arith_im)
 default:
abort();
 }
-SRC_EA(env, src1, opsize, 1, (op == 6) ? NULL : );
+
+if (with_SR) {
+/* SR/CCR can only be used with andi/eori/ori */
+if (op == 2 || op == 3 || op == 6) {
+disas_undef(env, s, insn);
+return;
+}
+switch (opsize) {
+case OS_BYTE:
+src1 = gen_get_ccr(s);
+break;
+case OS_WORD:
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+src1 = gen_get_sr(s);
+break;
+case OS_LONG:
+disas_undef(env, s, insn);
+return;
+}
+} else {
+SRC_EA(env, src1, opsize, 1, (op == 6) ? NULL : );
+}
 dest = tcg_temp_new();
 switch (op) {
 case 0: /* ori */
 tcg_gen_or_i32(dest, src1, im);
-gen_logic_cc(s, dest, opsize);
+if (with_SR) {
+gen_set_sr(s, dest, opsize == OS_BYTE);
+} else {
+DEST_EA(env, insn, opsize, dest, );
+gen_logic_cc(s, dest, opsize);
+}
 break;
 case 1: /* andi */
 tcg_gen_and_i32(dest, src1, im);
-gen_logic_cc(s, dest, opsize);
+if (with_SR) {
+gen_set_sr(s, dest, opsize == OS_BYTE);
+} else {
+DEST_EA(env, insn, opsize, dest, );
+gen_logic_cc(s, dest, opsize);
+}
 break;
 case 2: /* subi */
 tcg_gen_setcond_i32(TCG_COND_LTU, QREG_CC_X, src1, im);
 tcg_gen_sub_i32(dest, src1, im);
 gen_update_cc_add(dest, im, opsize);
 set_cc_op(s, CC_OP_SUBB + opsize);
+DEST_EA(env, insn, opsize, dest, );
 break;
 case 3: /* addi */
 tcg_gen_add_i32(dest, src1, im);
 gen_update_cc_add(dest, im, opsize);
 tcg_gen_setcond_i32(TCG_COND_LTU, QREG_CC_X, dest, im);
 set_cc_op(s, CC_OP_ADDB + opsize);
+DEST_EA(env, insn, opsize, dest, );
 break;
 case 5: /* eori */
 tcg_gen_xor_i32(dest, src1, im);
-gen_logic_cc(s, dest, opsize);
+if (with_SR) {
+gen_set_sr(s, dest, opsize == OS_BYTE);
+} else {
+DEST_EA(env, insn, opsize, dest, );
+gen_logic_cc(s, dest, opsize);
+}
 break;
 case 6: /* cmpi */
 gen_update_cc_cmp(s, src1, im, opsize);
@@ -2251,9 +2293,6 @@ DISAS_INSN(arith_im)
 abort();
 }
 tcg_temp_free(im);
-if (op != 6) {
-DEST_EA(env, insn, opsize, dest, );
-}
 tcg_temp_free(dest);
 }
 
-- 
2.14.3

[Qemu-devel] [PATCH v7 17/17] target/m68k: fix m68k_cpu_dump_state()

2018-01-03 Thread Laurent Vivier

Correctly display the Trace bits for 680x0
(2 bits, instead of 1 for ColdFire).
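
The field extraction behind the new "T:%x I:%x" output can be sketched in
isolation. The SR_T and SR_T_SHIFT values are the ones this patch defines;
SR_I_SHIFT is not shown in this patch, so the value 8 below is an assumption
derived from the 0x0700 mask, and the two helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SR_I       0x0700
#define SR_I_SHIFT 8        /* assumed: lowest set bit of the SR_I mask */
#define SR_T       0xc000
#define SR_T_SHIFT 14

static_assert((SR_T >> SR_T_SHIFT) == 3, "680x0 trace field is two bits wide");

/* Trace field (0..3) of a status register value. */
static unsigned sr_trace_bits(uint16_t sr)
{
    return (sr & SR_T) >> SR_T_SHIFT;
}

/* Interrupt priority mask (0..7) of a status register value. */
static unsigned sr_int_level(uint16_t sr)
{
    return (sr & SR_I) >> SR_I_SHIFT;
}
```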

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/cpu.h   | 3 ++-
 target/m68k/translate.c | 9 ++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 759b30d389..2985b039e1 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -219,7 +219,8 @@ typedef enum {
 #define SR_I  0x0700
 #define SR_M  0x1000
 #define SR_S  0x2000
-#define SR_T  0x8000
+#define SR_T_SHIFT 14
+#define SR_T  0xc000
 
 #define M68K_SSP0
 #define M68K_USP1
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 4d5173c4be..4a6d799ee2 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -5936,9 +5936,12 @@ void m68k_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 }
 cpu_fprintf (f, "PC = %08x   ", env->pc);
 sr = env->sr | cpu_m68k_get_ccr(env);
-cpu_fprintf(f, "SR = %04x %c%c%c%c%c ", sr, (sr & CCF_X) ? 'X' : '-',
-(sr & CCF_N) ? 'N' : '-', (sr & CCF_Z) ? 'Z' : '-',
-(sr & CCF_V) ? 'V' : '-', (sr & CCF_C) ? 'C' : '-');
+cpu_fprintf(f, "SR = %04x T:%x I:%x %c%c %c%c%c%c%c\n",
+sr, (sr & SR_T) >> SR_T_SHIFT, (sr & SR_I) >> SR_I_SHIFT,
+(sr & SR_S) ? 'S' : 'U', (sr & SR_M) ? '%' : 'I',
+(sr & CCF_X) ? 'X' : '-', (sr & CCF_N) ? 'N' : '-',
+(sr & CCF_Z) ? 'Z' : '-', (sr & CCF_V) ? 'V' : '-',
+(sr & CCF_C) ? 'C' : '-');
 cpu_fprintf(f, "FPSR = %08x %c%c%c%c ", env->fpsr,
 (env->fpsr & FPSR_CC_A) ? 'A' : '-',
 (env->fpsr & FPSR_CC_I) ? 'I' : '-',
-- 
2.14.3

Re: [Qemu-devel] [PATCH v9 0/4] Introduce a new vhost-user-blk host device to QEMU

2018-01-03 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1515029086-4206-1-git-send-email-changpeng@intel.com
Subject: [Qemu-devel] [PATCH v9 0/4] Introduce a new vhost-user-blk host device 
to QEMU

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
2dacfa9a5c contrib/vhost-user-blk: introduce a vhost-user-blk sample application
f0afdce50e contrib/libvhost-user: enable virtio config space messages
90674d9197 vhost-user-blk: introduce a new vhost-user-blk host device
754086c200 vhost-user: add new vhost user messages to support virtio config 
space

=== OUTPUT BEGIN ===
Checking PATCH 1/4: vhost-user: add new vhost user messages to support virtio 
config space...
Checking PATCH 2/4: vhost-user-blk: introduce a new vhost-user-blk host 
device...
Checking PATCH 3/4: contrib/libvhost-user: enable virtio config space 
messages...
Checking PATCH 4/4: contrib/vhost-user-blk: introduce a vhost-user-blk sample 
application...
ERROR: space required after that ',' (ctx:VxV)
#268: FILE: contrib/vhost-user-blk/vhost-user-blk.c:191:
+req = g_new0(VubReq,1);
^

WARNING: architecture specific defines should be avoided
#512: FILE: contrib/vhost-user-blk/vhost-user-blk.c:435:
+#if defined(__linux__) && defined(BLKSSZGET)

total: 1 errors, 1 warnings, 575 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH v7 14/17] target/m68k: add 680x0 "move to SR" instruction

2018-01-03 Thread Laurent Vivier

Some cleanup, and allow SR to be set from any addressing mode.
The previous code was wrong for ColdFire: ColdFire also allows any
addressing mode to be used to set SR/CCR. It supports only a data
register as the destination when reading SR/CCR (move from).
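
The two effective-address tests that the old and new code rely on can be
sketched as standalone predicates. The bit masks are the ones used in the
diff below; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The low six bits of the opcode hold the effective-address mode/register. */
#define EA_MASK      0x3f
#define EA_MODE_MASK 0x38
#define EA_IMMEDIATE 0x3c

static_assert((EA_IMMEDIATE & EA_MODE_MASK) != 0,
              "immediate mode never looks like a data-register mode");

/* True when the source operand is an immediate (#imm). */
static bool ea_is_immediate(uint16_t insn)
{
    return (insn & EA_MASK) == EA_IMMEDIATE;
}

/* True when the mode field is 0, i.e. the operand is a data register Dn. */
static bool ea_is_data_reg(uint16_t insn)
{
    return (insn & EA_MODE_MASK) == 0;
}
```

With the patch applied, only the immediate case is special-cased; every other
mode, data registers included, goes through the generic SRC_EA() path.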

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 38 ++
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 1f867a4f7a..8f23cade04 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -2162,27 +2162,34 @@ static void gen_set_sr_im(DisasContext *s, uint16_t 
val, int ccr_only)
 tcg_gen_movi_i32(QREG_CC_N, val & CCF_N ? -1 : 0);
 tcg_gen_movi_i32(QREG_CC_X, val & CCF_X ? 1 : 0);
 } else {
-gen_helper_set_sr(cpu_env, tcg_const_i32(val));
+TCGv sr = tcg_const_i32(val);
+gen_helper_set_sr(cpu_env, sr);
+tcg_temp_free(sr);
 }
 set_cc_op(s, CC_OP_FLAGS);
 }
 
-static void gen_set_sr(CPUM68KState *env, DisasContext *s, uint16_t insn,
-   int ccr_only)
+static void gen_set_sr(DisasContext *s, TCGv val, int ccr_only)
 {
-if ((insn & 0x38) == 0) {
-if (ccr_only) {
-gen_helper_set_ccr(cpu_env, DREG(insn, 0));
-} else {
-gen_helper_set_sr(cpu_env, DREG(insn, 0));
-}
-set_cc_op(s, CC_OP_FLAGS);
-} else if ((insn & 0x3f) == 0x3c) {
+if (ccr_only) {
+gen_helper_set_ccr(cpu_env, val);
+} else {
+gen_helper_set_sr(cpu_env, val);
+}
+set_cc_op(s, CC_OP_FLAGS);
+}
+
+static void gen_move_to_sr(CPUM68KState *env, DisasContext *s, uint16_t insn,
+   bool ccr_only)
+{
+if ((insn & 0x3f) == 0x3c) {
 uint16_t val;
 val = read_im16(env, s);
 gen_set_sr_im(s, val, ccr_only);
 } else {
-disas_undef(env, s, insn);
+TCGv src;
+SRC_EA(env, src, OS_WORD, 0, NULL);
+gen_set_sr(s, src, ccr_only);
 }
 }
 
@@ -2557,7 +2564,7 @@ DISAS_INSN(neg)
 
 DISAS_INSN(move_to_ccr)
 {
-gen_set_sr(env, s, insn, 1);
+gen_move_to_sr(env, s, insn, true);
 }
 
 DISAS_INSN(not)
@@ -4409,7 +4416,7 @@ DISAS_INSN(move_to_sr)
 gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
-gen_set_sr(env, s, insn, 0);
+gen_move_to_sr(env, s, insn, false);
 gen_lookup_tb(s);
 }
 
@@ -5556,9 +5563,8 @@ void register_m68k_insns (CPUM68KState *env)
 BASE(move_to_ccr, 44c0, ffc0);
 INSN(not,   4680, fff8, CF_ISA_A);
 INSN(not,   4600, ff00, M68000);
-INSN(undef, 46c0, ffc0, M68000);
 #if defined(CONFIG_SOFTMMU)
-INSN(move_to_sr, 46c0, ffc0, CF_ISA_A);
+BASE(move_to_sr, 46c0, ffc0);
 #endif
 INSN(nbcd,  4800, ffc0, M68000);
 INSN(linkl, 4808, fff8, M68000);
-- 
2.14.3

[Qemu-devel] [PATCH v7 16/17] target/m68k: add the Interrupt Stack Pointer

2018-01-03 Thread Laurent Vivier

Add the third stack pointer, the Interrupt Stack Pointer (ISP)
(680x0 only). This stack will be needed in softmmu mode.

Update movec to set/get the value of the three stacks.
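
Which of the three stack pointers is active depends on the S and M bits of
SR. The sketch below is an assumption-laden illustration, not the patch's
m68k_switch_sp(), which is not shown in this excerpt: it follows the usual
680x0 rule (S clear selects the user stack, S set with M set selects the
master stack, S set with M clear selects the interrupt stack), assumes that
slot 0 (M68K_SSP) plays the role of the master stack pointer, and ignores
ColdFire entirely. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SR_M 0x1000
#define SR_S 0x2000

/* Slot numbers mirror M68K_SSP, M68K_USP and M68K_ISP from the patch. */
enum { SP_SSP = 0, SP_USP = 1, SP_ISP = 2, SP_COUNT = 3 };

static_assert(SP_ISP < SP_COUNT, "sp[] needs three slots once ISP is added");

/* Pick the stack pointer slot selected by a 680x0 status register value. */
static int sketch_select_sp(uint16_t sr)
{
    if (!(sr & SR_S)) {
        return SP_USP;                      /* user mode */
    }
    return (sr & SR_M) ? SP_SSP : SP_ISP;   /* supervisor: master or interrupt */
}
```

Under these assumptions the softmmu reset value SR_S | SR_I (0x2700) selects
the interrupt stack.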

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---

Notes:
v6: use cpu_m68k_set_sr() to set SR in GDB stub and in m68k_cpu_reset()

 target/m68k/cpu.c   |  8 ++---
 target/m68k/cpu.h   | 70 -
 target/m68k/gdbstub.c   |  2 +-
 target/m68k/helper.c| 82 -
 target/m68k/helper.h|  4 ++-
 target/m68k/monitor.c   |  1 +
 target/m68k/translate.c | 40 ++--
 7 files changed, 190 insertions(+), 17 deletions(-)

diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 1936efd170..03126ba543 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -55,17 +55,17 @@ static void m68k_cpu_reset(CPUState *s)
 mcc->parent_reset(s);
 
 memset(env, 0, offsetof(CPUM68KState, end_reset_fields));
-#if !defined(CONFIG_USER_ONLY)
-env->sr = 0x2700;
+#ifdef CONFIG_SOFTMMU
+cpu_m68k_set_sr(env, SR_S | SR_I);
+#else
+cpu_m68k_set_sr(env, 0);
 #endif
-m68k_switch_sp(env);
 for (i = 0; i < 8; i++) {
 env->fregs[i].d = nan;
 }
 cpu_m68k_set_fpcr(env, 0);
 env->fpsr = 0;
 
-cpu_m68k_set_ccr(env, 0);
 /* TODO: We should set PC from the interrupt vector.  */
 env->pc = 0;
 }
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 2ac4ab191e..759b30d389 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -89,7 +89,7 @@ typedef struct CPUM68KState {
 
 /* SSP and USP.  The current_sp is stored in aregs[7], the other here.  */
 int current_sp;
-uint32_t sp[2];
+uint32_t sp[3];
 
 /* Condition flags.  */
 uint32_t cc_op;
@@ -223,6 +223,74 @@ typedef enum {
 
 #define M68K_SSP    0
 #define M68K_USP    1
+#define M68K_ISP    2
+
+/* m68k Control Registers */
+
+/* ColdFire */
+/* Memory Management Control Registers */
+#define M68K_CR_ASID 0x003
+#define M68K_CR_ACR0 0x004
+#define M68K_CR_ACR1 0x005
+#define M68K_CR_ACR2 0x006
+#define M68K_CR_ACR3 0x007
+#define M68K_CR_MMUBAR   0x008
+
+/* Processor Miscellaneous Registers */
+#define M68K_CR_PC   0x80F
+
+/* Local Memory and Module Control Registers */
+#define M68K_CR_ROMBAR0  0xC00
+#define M68K_CR_ROMBAR1  0xC01
+#define M68K_CR_RAMBAR0  0xC04
+#define M68K_CR_RAMBAR1  0xC05
+#define M68K_CR_MPCR 0xC0C
+#define M68K_CR_EDRAMBAR 0xC0D
+#define M68K_CR_SECMBAR  0xC0E
+#define M68K_CR_MBAR 0xC0F
+
+/* Local Memory Address Permutation Control Registers */
+#define M68K_CR_PCR1U0   0xD02
+#define M68K_CR_PCR1L0   0xD03
+#define M68K_CR_PCR2U0   0xD04
+#define M68K_CR_PCR2L0   0xD05
+#define M68K_CR_PCR3U0   0xD06
+#define M68K_CR_PCR3L0   0xD07
+#define M68K_CR_PCR1U1   0xD0A
+#define M68K_CR_PCR1L1   0xD0B
+#define M68K_CR_PCR2U1   0xD0C
+#define M68K_CR_PCR2L1   0xD0D
+#define M68K_CR_PCR3U1   0xD0E
+#define M68K_CR_PCR3L1   0xD0F
+
+/* MC680x0 */
+/* MC680[1234]0/CPU32 */
+#define M68K_CR_SFC  0x000
+#define M68K_CR_DFC  0x001
+#define M68K_CR_USP  0x800
+#define M68K_CR_VBR  0x801 /* + Coldfire */
+
+/* MC680[234]0 */
+#define M68K_CR_CACR 0x002 /* + Coldfire */
+#define M68K_CR_CAAR 0x802 /* MC68020 and MC68030 only */
+#define M68K_CR_MSP  0x803
+#define M68K_CR_ISP  0x804
+
+/* MC68040/MC68LC040 */
+#define M68K_CR_TC   0x003
+#define M68K_CR_ITT0 0x004
+#define M68K_CR_ITT1 0x005
+#define M68K_CR_DTT0 0x006
+#define M68K_CR_DTT1 0x007
+#define M68K_CR_MMUSR 0x805
+#define M68K_CR_URP  0x806
+#define M68K_CR_SRP  0x807
+
+/* MC68EC040 */
+#define M68K_CR_IACR0 0x004
+#define M68K_CR_IACR1 0x005
+#define M68K_CR_DACR0 0x006
+#define M68K_CR_DACR1 0x007
 
 #define M68K_FPIAR_SHIFT  0
 #define M68K_FPIAR        (1 << M68K_FPIAR_SHIFT)
diff --git a/target/m68k/gdbstub.c b/target/m68k/gdbstub.c
index c7f44c9bb3..99e5be8132 100644
--- a/target/m68k/gdbstub.c
+++ b/target/m68k/gdbstub.c
@@ -63,7 +63,7 @@ int m68k_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 } else {
 switch (n) {
 case 16:
-env->sr = tmp;
+cpu_m68k_set_sr(env, tmp);
 break;
 case 17:
 env->pc = tmp;
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index 52b054e1a3..a999389e9a 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -171,28 +171,84 @@ void m68k_cpu_init_gdb(M68kCPU *cpu)
 /* TODO: Add [E]MAC registers.  */
 }
 
-void HELPER(movec)(CPUM68KState *env, uint32_t reg, uint32_t val)
+void HELPER(cf_movec_to)(CPUM68KState *env, uint32_t reg, uint32_t val)
 {
 M68kCPU *cpu = m68k_env_get_cpu(env);
 
 switch (reg) {
-case 0x02: /* CACR */
+case M68K_CR_CACR:
 env->cacr = val;
 m68k_switch_sp(env);
 break;
-

[Qemu-devel] [PATCH v7 04/17] target/m68k: use insn_pc to generate instruction fault address

2018-01-03 Thread Laurent Vivier

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
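
A small sketch of why a fixed "s->pc - 2" is fragile, for readers who
want the reasoning spelled out. The ToyDisas type and toy_* helpers are
hypothetical and only model a decoder whose pc advances on every fetched
word while insn_pc is latched once per instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t pc;      /* address of the next 16-bit word to fetch */
    uint32_t insn_pc; /* address of the first word of the current insn */
} ToyDisas;

/* Fetch one 16-bit word and advance pc, as an immediate/extension read does. */
static uint16_t toy_fetch16(ToyDisas *s, const uint16_t *mem)
{
    uint16_t w = mem[s->pc / 2];
    s->pc += 2;
    return w;
}

/* Start decoding an instruction: latch its address, then fetch the opcode. */
static uint16_t toy_begin_insn(ToyDisas *s, const uint16_t *mem)
{
    s->insn_pc = s->pc;
    return toy_fetch16(s, mem);
}
```

After the opcode and one extension word have been fetched, pc - 2 points
at the extension word, not at the instruction. That is the case the old
"s->pc - 4" in mull had to special-case, and insn_pc is right in both.
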
 target/m68k/translate.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 1e9fb01252..a1e424e3db 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -1509,12 +1509,12 @@ DISAS_INSN(dbcc)
 
 DISAS_INSN(undef_mac)
 {
-gen_exception(s, s->pc - 2, EXCP_LINEA);
+gen_exception(s, s->insn_pc, EXCP_LINEA);
 }
 
 DISAS_INSN(undef_fpu)
 {
-gen_exception(s, s->pc - 2, EXCP_LINEF);
+gen_exception(s, s->insn_pc, EXCP_LINEF);
 }
 
 DISAS_INSN(undef)
@@ -1523,8 +1523,8 @@ DISAS_INSN(undef)
for the 680x0 series, as well as those that are implemented
but actually illegal for CPU32 or pre-68020.  */
 qemu_log_mask(LOG_UNIMP, "Illegal instruction: %04x @ %08x",
-  insn, s->pc - 2);
-gen_exception(s, s->pc - 2, EXCP_UNSUPPORTED);
+  insn, s->insn_pc);
+gen_exception(s, s->insn_pc, EXCP_UNSUPPORTED);
 }
 
 DISAS_INSN(mulw)
@@ -2583,7 +2583,7 @@ DISAS_INSN(swap)
 
 DISAS_INSN(bkpt)
 {
-gen_exception(s, s->pc - 2, EXCP_DEBUG);
+gen_exception(s, s->insn_pc, EXCP_DEBUG);
 }
 
 DISAS_INSN(pea)
@@ -2636,7 +2636,7 @@ DISAS_INSN(pulse)
 
 DISAS_INSN(illegal)
 {
-gen_exception(s, s->pc - 2, EXCP_ILLEGAL);
+gen_exception(s, s->insn_pc, EXCP_ILLEGAL);
 }
 
 /* ??? This should be atomic.  */
@@ -2666,7 +2666,7 @@ DISAS_INSN(mull)
 
 if (ext & 0x400) {
 if (!m68k_feature(s->env, M68K_FEATURE_QUAD_MULDIV)) {
-gen_exception(s, s->pc - 4, EXCP_UNSUPPORTED);
+gen_exception(s, s->insn_pc, EXCP_UNSUPPORTED);
 return;
 }
 
@@ -4240,7 +4240,7 @@ DISAS_INSN(move_from_sr)
 TCGv sr;
 
 if (IS_USER(s) && !m68k_feature(env, M68K_FEATURE_M68000)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 sr = gen_get_sr(s);
@@ -4250,7 +4250,7 @@ DISAS_INSN(move_from_sr)
 DISAS_INSN(move_to_sr)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 gen_set_sr(env, s, insn, 0);
@@ -4260,7 +4260,7 @@ DISAS_INSN(move_to_sr)
 DISAS_INSN(move_from_usp)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 tcg_gen_ld_i32(AREG(insn, 0), cpu_env,
@@ -4270,7 +4270,7 @@ DISAS_INSN(move_from_usp)
 DISAS_INSN(move_to_usp)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 tcg_gen_st_i32(AREG(insn, 0), cpu_env,
@@ -4287,7 +4287,7 @@ DISAS_INSN(stop)
 uint16_t ext;
 
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 
@@ -4301,10 +4301,10 @@ DISAS_INSN(stop)
 DISAS_INSN(rte)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
-gen_exception(s, s->pc - 2, EXCP_RTE);
+gen_exception(s, s->insn_pc, EXCP_RTE);
 }
 
 DISAS_INSN(movec)
@@ -4313,7 +4313,7 @@ DISAS_INSN(movec)
 TCGv reg;
 
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 
@@ -4331,7 +4331,7 @@ DISAS_INSN(movec)
 DISAS_INSN(intouch)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 /* ICache fetch.  Implement as no-op.  */
@@ -4340,7 +4340,7 @@ DISAS_INSN(intouch)
 DISAS_INSN(cpushl)
 {
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 /* Cache push/invalidate.  Implement as no-op.  */
@@ -4348,7 +4348,7 @@ DISAS_INSN(cpushl)
 
 DISAS_INSN(wddata)
 {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 }
 
 DISAS_INSN(wdebug)
@@ -4356,7 +4356,7 @@ DISAS_INSN(wdebug)
 M68kCPU *cpu = m68k_env_get_cpu(env);
 
 if (IS_USER(s)) {
-gen_exception(s, s->pc - 2, EXCP_PRIVILEGE);
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 /* TODO: Implement wdebug.  */
@@ -4365,7 +4365,7 @@ DISAS_INSN(wdebug)
 
 DISAS_INSN(trap)
 {
-gen_exception(s, s->pc - 2, EXCP_TRAP0 + (insn & 0xf));
+gen_exception(s, s->insn_pc, EXCP_TRAP0 + (insn & 0xf));
 }
 
 static void gen_load_fcr(DisasContext *s, TCGv res, int reg)
-- 
2.14.3

[Qemu-devel] [PATCH v7 08/17] target/m68k: add move16

2018-01-03 Thread Laurent Vivier

move16 moves the source line to the destination line. Lines are aligned
to 16-byte boundaries and are 16 bytes long.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---

Notes:
v6: split move16 in two functions
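
As an illustration of the line semantics described above, here is a
plain C model of the copy on a flat byte array. It is only a sketch: the
name toy_copy_line and the flat-memory model are assumptions, and it
ignores the MMU index and the TCG code generation the patch deals with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_LINE_SIZE 16u

/* Copy the 16-byte line containing src to the line containing dst. */
static void toy_copy_line(uint8_t *mem, size_t mem_size,
                          uint32_t dst, uint32_t src)
{
    /* Both addresses are forced down to a 16-byte boundary. */
    size_t s = src & ~(TOY_LINE_SIZE - 1);
    size_t d = dst & ~(TOY_LINE_SIZE - 1);

    assert(s + TOY_LINE_SIZE <= mem_size);
    assert(d + TOY_LINE_SIZE <= mem_size);
    memmove(&mem[d], &mem[s], TOY_LINE_SIZE);
}
```

For example, copying from 0x13 to 0x25 moves bytes 0x10..0x1f to
0x20..0x2f; the low four bits of both addresses are ignored.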

 target/m68k/cpu.c   | 10 ++-
 target/m68k/cpu.h   |  1 +
 target/m68k/translate.c | 72 +
 3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 57ffcb2114..1936efd170 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -137,7 +137,15 @@ static void m68020_cpu_initfn(Object *obj)
 m68k_set_feature(env, M68K_FEATURE_CHK2);
 }
 #define m68030_cpu_initfn m68020_cpu_initfn
-#define m68040_cpu_initfn m68020_cpu_initfn
+
+static void m68040_cpu_initfn(Object *obj)
+{
+M68kCPU *cpu = M68K_CPU(obj);
+CPUM68KState *env = &cpu->env;
+
+m68020_cpu_initfn(obj);
+m68k_set_feature(env, M68K_FEATURE_M68040);
+}
 
 static void m68060_cpu_initfn(Object *obj)
 {
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 68396bdd70..2ac4ab191e 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -306,6 +306,7 @@ enum m68k_features {
 M68K_FEATURE_BKPT,
 M68K_FEATURE_RTD,
 M68K_FEATURE_CHK2,
+M68K_FEATURE_M68040, /* instructions specific to MC68040 */
 };
 
 static inline int m68k_feature(CPUM68KState *env, int feature)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 7f52065375..0ef933a545 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -4277,6 +4277,76 @@ DISAS_INSN(chk2)
 tcg_temp_free(reg);
 }
 
+static void m68k_copy_line(TCGv dst, TCGv src, int index)
+{
+TCGv addr;
+TCGv_i64 t0, t1;
+
+addr = tcg_temp_new();
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+
+tcg_gen_andi_i32(addr, src, ~15);
+tcg_gen_qemu_ld64(t0, addr, index);
+tcg_gen_addi_i32(addr, addr, 8);
+tcg_gen_qemu_ld64(t1, addr, index);
+
+tcg_gen_andi_i32(addr, dst, ~15);
+tcg_gen_qemu_st64(t0, addr, index);
+tcg_gen_addi_i32(addr, addr, 8);
+tcg_gen_qemu_st64(t1, addr, index);
+
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+tcg_temp_free(addr);
+}
+
+DISAS_INSN(move16_reg)
+{
+int index = IS_USER(s);
+TCGv tmp;
+uint16_t ext;
+
+ext = read_im16(env, s);
+if ((ext & (1 << 15)) == 0) {
+gen_exception(s, s->insn_pc, EXCP_ILLEGAL);
+}
+
+m68k_copy_line(AREG(ext, 12), AREG(insn, 0), index);
+
+/* Ax can be Ay, so save Ay before incrementing Ax */
+tmp = tcg_temp_new();
+tcg_gen_mov_i32(tmp, AREG(ext, 12));
+tcg_gen_addi_i32(AREG(insn, 0), AREG(insn, 0), 16);
+tcg_gen_addi_i32(AREG(ext, 12), tmp, 16);
+tcg_temp_free(tmp);
+}
+
+DISAS_INSN(move16_mem)
+{
+int index = IS_USER(s);
+TCGv reg, addr;
+
+reg = AREG(insn, 0);
+addr = tcg_const_i32(read_im32(env, s));
+
+if ((insn >> 3) & 1) {
+/* MOVE16 (xxx).L, (Ay) */
+m68k_copy_line(reg, addr, index);
+} else {
+/* MOVE16 (Ay), (xxx).L */
+m68k_copy_line(addr, reg, index);
+}
+
+tcg_temp_free(addr);
+
+if (((insn >> 3) & 2) == 0) {
+/* (Ay)+ */
+tcg_gen_addi_i32(reg, reg, 16);
+}
+}
+
 static TCGv gen_get_sr(DisasContext *s)
 {
 TCGv ccr;
@@ -5578,6 +5648,8 @@ void register_m68k_insns (CPUM68KState *env)
 INSN(fsave, f300, ffc0, FPU);
 INSN(intouch,   f340, ffc0, CF_ISA_A);
 INSN(cpushl,f428, ff38, CF_ISA_A);
+INSN(move16_mem, f600, ffe0, M68040);
+INSN(move16_reg, f620, fff8, M68040);
 INSN(wddata,fb00, ff00, CF_ISA_A);
 INSN(wdebug,fbc0, ffc0, CF_ISA_A);
 #undef INSN
-- 
2.14.3

[Qemu-devel] [PATCH v7 07/17] target/m68k: add chk and chk2

2018-01-03 Thread Laurent Vivier

chk and chk2 compare a value to boundaries, and
trigger a CHK exception if the value is out of bounds.

Signed-off-by: Laurent Vivier 
Suggested-by: Richard Henderson 
---

Notes:
v7: chk: always update C and N flags
add some comments
move flush_flags() from the helper to the
code generator, because otherwise we need
to do an update_cc_op() before calling the
helper to be sure env->cc_op and s->cc_op
are synchronized
v6: use helpers as suggested by Richard
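
A stand-alone model of the chk decision may help when reviewing the
helper. It restates the N/C rules and the trap condition given in the
helper's comment as pure C; ToyChk and toy_chk are hypothetical names
and nothing here touches CPU state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool n;    /* N flag after chk */
    bool c;    /* C flag after chk */
    bool trap; /* true if the CHK exception is raised */
} ToyChk;

/* Model of chk: val is checked against the range [0, ub]. */
static ToyChk toy_chk(int32_t val, int32_t ub)
{
    ToyChk r;

    r.n = val < 0;
    /* C depends on the sign of the upper bound, as measured on a 68040. */
    r.c = (0 <= ub) ? (val < 0 || val > ub) : (val > ub && val < 0);
    r.trap = val < 0 || val > ub;
    return r;
}
```

Note that with a negative upper bound a positive value still traps, but
C stays clear: toy_chk(3, -5) gives trap = true and c = false.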

 linux-user/main.c   |  7 +
 target/m68k/cpu.c   |  2 ++
 target/m68k/cpu.h   |  1 +
 target/m68k/helper.h|  3 ++
 target/m68k/op_helper.c | 61 +++
 target/m68k/translate.c | 77 -
 6 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 71696ed33d..99a551b04f 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2985,6 +2985,13 @@ void cpu_loop(CPUM68KState *env)
 info._sifields._sigfault._addr = env->pc;
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
+case EXCP_CHK:
+info.si_signo = TARGET_SIGFPE;
+info.si_errno = 0;
+info.si_code = TARGET_FPE_INTOVF;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 case EXCP_DIV0:
 info.si_signo = TARGET_SIGFPE;
 info.si_errno = 0;
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 0a3dd83548..57ffcb2114 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -134,6 +134,7 @@ static void m68020_cpu_initfn(Object *obj)
 m68k_set_feature(env, M68K_FEATURE_CAS);
 m68k_set_feature(env, M68K_FEATURE_BKPT);
 m68k_set_feature(env, M68K_FEATURE_RTD);
+m68k_set_feature(env, M68K_FEATURE_CHK2);
 }
 #define m68030_cpu_initfn m68020_cpu_initfn
 #define m68040_cpu_initfn m68020_cpu_initfn
@@ -156,6 +157,7 @@ static void m68060_cpu_initfn(Object *obj)
 m68k_set_feature(env, M68K_FEATURE_CAS);
 m68k_set_feature(env, M68K_FEATURE_BKPT);
 m68k_set_feature(env, M68K_FEATURE_RTD);
+m68k_set_feature(env, M68K_FEATURE_CHK2);
 }
 
 static void m5208_cpu_initfn(Object *obj)
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index cd4b3a7c7b..68396bdd70 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -305,6 +305,7 @@ enum m68k_features {
 M68K_FEATURE_CAS,
 M68K_FEATURE_BKPT,
 M68K_FEATURE_RTD,
+M68K_FEATURE_CHK2,
 };
 
 static inline int m68k_feature(CPUM68KState *env, int feature)
diff --git a/target/m68k/helper.h b/target/m68k/helper.h
index eebe52dae5..78483da003 100644
--- a/target/m68k/helper.h
+++ b/target/m68k/helper.h
@@ -94,3 +94,6 @@ DEF_HELPER_FLAGS_4(bfchg_mem, TCG_CALL_NO_WG, i32, env, i32, s32, i32)
 DEF_HELPER_FLAGS_4(bfclr_mem, TCG_CALL_NO_WG, i32, env, i32, s32, i32)
 DEF_HELPER_FLAGS_4(bfset_mem, TCG_CALL_NO_WG, i32, env, i32, s32, i32)
 DEF_HELPER_FLAGS_4(bfffo_mem, TCG_CALL_NO_WG, i64, env, i32, s32, i32)
+
+DEF_HELPER_3(chk, void, env, s32, s32)
+DEF_HELPER_4(chk2, void, env, s32, s32, s32)
diff --git a/target/m68k/op_helper.c b/target/m68k/op_helper.c
index 5c7b27b9ca..06144d436d 100644
--- a/target/m68k/op_helper.c
+++ b/target/m68k/op_helper.c
@@ -947,3 +947,64 @@ uint64_t HELPER(bfffo_mem)(CPUM68KState *env, uint32_t addr,
is already zero.  */
 return n | ffo;
 }
+
+void HELPER(chk)(CPUM68KState *env, int32_t val, int32_t ub)
+{
+/* From the specs:
+ *   X: Not affected, C,V,Z: Undefined,
+ *   N: Set if val < 0; cleared if val > ub, undefined otherwise
+ * We implement here values found from a real MC68040:
+ *   X,V,Z: Not affected
+ *   N: Set if val < 0; cleared if val >= 0
+ *   C: if 0 <= ub: set if val < 0 or val > ub, cleared otherwise
+ *  if 0 > ub: set if val > ub and val < 0, cleared otherwise
+ */
+env->cc_n = val;
+env->cc_c = 0 <= ub ? val < 0 || val > ub : val > ub && val < 0;
+
+if (val < 0 || val > ub) {
+CPUState *cs = CPU(m68k_env_get_cpu(env));
+
+/* Recover PC and CC_OP for the beginning of the insn.  */
+cpu_restore_state(cs, GETPC());
+
+/* flags have been modified by gen_flush_flags() */
+env->cc_op = CC_OP_FLAGS;
+/* Adjust PC to end of the insn.  */
+env->pc += 2;
+
+cs->exception_index = EXCP_CHK;
+cpu_loop_exit(cs);
+}
+}
+
+void HELPER(chk2)(CPUM68KState *env, int32_t val, int32_t lb, int32_t ub)
+{
+/* From the specs:
+ *   X: Not affected, N,V: Undefined,
+ *   Z: Set if val is equal to lb or ub
+ *   C: Set if val < lb or val > ub, cleared otherwise
+ * We implement here values found from a real MC68040:
+ *   X,N,V: Not affected
+ *   Z: Set if val is equal to lb or ub
+ *

[Qemu-devel] [PATCH v7 13/17] target/m68k: move CCR/SR functions

2018-01-03 Thread Laurent Vivier

Move the CCR/SR functions first, so that the following patches,
which add new ones, are easier to read.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 111 
 1 file changed, 55 insertions(+), 56 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index b8ed85c237..1f867a4f7a 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -2131,6 +2131,61 @@ DISAS_INSN(bitop_im)
 }
 }
 
+static TCGv gen_get_ccr(DisasContext *s)
+{
+TCGv dest;
+
+update_cc_op(s);
+dest = tcg_temp_new();
+gen_helper_get_ccr(dest, cpu_env);
+return dest;
+}
+
+static TCGv gen_get_sr(DisasContext *s)
+{
+TCGv ccr;
+TCGv sr;
+
+ccr = gen_get_ccr(s);
+sr = tcg_temp_new();
+tcg_gen_andi_i32(sr, QREG_SR, 0xffe0);
+tcg_gen_or_i32(sr, sr, ccr);
+return sr;
+}
+
+static void gen_set_sr_im(DisasContext *s, uint16_t val, int ccr_only)
+{
+if (ccr_only) {
+tcg_gen_movi_i32(QREG_CC_C, val & CCF_C ? 1 : 0);
+tcg_gen_movi_i32(QREG_CC_V, val & CCF_V ? -1 : 0);
+tcg_gen_movi_i32(QREG_CC_Z, val & CCF_Z ? 0 : 1);
+tcg_gen_movi_i32(QREG_CC_N, val & CCF_N ? -1 : 0);
+tcg_gen_movi_i32(QREG_CC_X, val & CCF_X ? 1 : 0);
+} else {
+gen_helper_set_sr(cpu_env, tcg_const_i32(val));
+}
+set_cc_op(s, CC_OP_FLAGS);
+}
+
+static void gen_set_sr(CPUM68KState *env, DisasContext *s, uint16_t insn,
+   int ccr_only)
+{
+if ((insn & 0x38) == 0) {
+if (ccr_only) {
+gen_helper_set_ccr(cpu_env, DREG(insn, 0));
+} else {
+gen_helper_set_sr(cpu_env, DREG(insn, 0));
+}
+set_cc_op(s, CC_OP_FLAGS);
+} else if ((insn & 0x3f) == 0x3c) {
+uint16_t val;
+val = read_im16(env, s);
+gen_set_sr_im(s, val, ccr_only);
+} else {
+disas_undef(env, s, insn);
+}
+}
+
 DISAS_INSN(arith_im)
 {
 int op;
@@ -2474,16 +2529,6 @@ DISAS_INSN(clr)
 tcg_temp_free(zero);
 }
 
-static TCGv gen_get_ccr(DisasContext *s)
-{
-TCGv dest;
-
-update_cc_op(s);
-dest = tcg_temp_new();
-gen_helper_get_ccr(dest, cpu_env);
-return dest;
-}
-
 DISAS_INSN(move_from_ccr)
 {
 TCGv ccr;
@@ -2510,40 +2555,6 @@ DISAS_INSN(neg)
 tcg_temp_free(dest);
 }
 
-static void gen_set_sr_im(DisasContext *s, uint16_t val, int ccr_only)
-{
-if (ccr_only) {
-tcg_gen_movi_i32(QREG_CC_C, val & CCF_C ? 1 : 0);
-tcg_gen_movi_i32(QREG_CC_V, val & CCF_V ? -1 : 0);
-tcg_gen_movi_i32(QREG_CC_Z, val & CCF_Z ? 0 : 1);
-tcg_gen_movi_i32(QREG_CC_N, val & CCF_N ? -1 : 0);
-tcg_gen_movi_i32(QREG_CC_X, val & CCF_X ? 1 : 0);
-} else {
-gen_helper_set_sr(cpu_env, tcg_const_i32(val));
-}
-set_cc_op(s, CC_OP_FLAGS);
-}
-
-static void gen_set_sr(CPUM68KState *env, DisasContext *s, uint16_t insn,
-   int ccr_only)
-{
-if ((insn & 0x38) == 0) {
-if (ccr_only) {
-gen_helper_set_ccr(cpu_env, DREG(insn, 0));
-} else {
-gen_helper_set_sr(cpu_env, DREG(insn, 0));
-}
-set_cc_op(s, CC_OP_FLAGS);
-} else if ((insn & 0x3f) == 0x3c) {
-uint16_t val;
-val = read_im16(env, s);
-gen_set_sr_im(s, val, ccr_only);
-} else {
-disas_undef(env, s, insn);
-}
-}
-
-
 DISAS_INSN(move_to_ccr)
 {
 gen_set_sr(env, s, insn, 1);
@@ -4359,18 +4370,6 @@ DISAS_INSN(move16_mem)
 }
 }
 
-static TCGv gen_get_sr(DisasContext *s)
-{
-TCGv ccr;
-TCGv sr;
-
-ccr = gen_get_ccr(s);
-sr = tcg_temp_new();
-tcg_gen_andi_i32(sr, QREG_SR, 0xffe0);
-tcg_gen_or_i32(sr, sr, ccr);
-return sr;
-}
-
 DISAS_INSN(strldsr)
 {
 uint16_t ext;
-- 
2.14.3

[Qemu-devel] [PATCH v7 06/17] target/m68k: manage 680x0 stack frames

2018-01-03 Thread Laurent Vivier

680x0 manages several stack frame formats:
  - format 0: four-word stack frame
  - format 1: four-word throwaway stack frame
  - format 2: six-word stack frame
  - format 3: Floating-Point post-instruction stack frame
  - format 4: eight-word stack frame
  - format 7: access-error stack frame

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---

Notes:
v6: update SR with the content of CCR in the logs
introduce cpu_m68k_set_sr() to set SR instead
of calling helper_set_sr().
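
To make the frame layout easier to check, here is a sketch of the format
word and of the extra bytes each format carries beyond the common SR, PC
and format word. The packing follows do_stack_frame() and the sizes
follow m68k_rte() in this patch; the toy_* function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the format/vector word: format in bits 15..12, vector offset below. */
static uint16_t toy_frame_word(unsigned format, unsigned vector)
{
    return (uint16_t)((format << 12) + (vector << 2));
}

static unsigned toy_frame_format(uint16_t word)
{
    return word >> 12;
}

static unsigned toy_frame_vector(uint16_t word)
{
    return (word & 0x0fffu) >> 2;
}

/* Bytes following the 8-byte SR/PC/format header, or -1 if not handled. */
static int toy_frame_extra_bytes(unsigned format)
{
    switch (format) {
    case 0:
    case 1:
        return 0;   /* four-word frames */
    case 2:
    case 3:
        return 4;   /* six-word frames: one extra address */
    case 4:
        return 8;   /* eight-word frame: two extra longwords */
    case 7:
        return 52;  /* access-error frame */
    default:
        return -1;
    }
}
```

For instance, a format 2 frame for vector 6 (CHK) carries the word
0x2018 and occupies 8 + 4 = 12 bytes, i.e. six words.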

 target/m68k/cpu.h   |   1 +
 target/m68k/helper.c|  10 ++-
 target/m68k/op_helper.c | 160 ++--
 3 files changed, 164 insertions(+), 7 deletions(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index acc2629216..cd4b3a7c7b 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -178,6 +178,7 @@ int cpu_m68k_signal_handler(int host_signum, void *pinfo,
void *puc);
 uint32_t cpu_m68k_get_ccr(CPUM68KState *env);
 void cpu_m68k_set_ccr(CPUM68KState *env, uint32_t);
+void cpu_m68k_set_sr(CPUM68KState *env, uint32_t);
 void cpu_m68k_set_fpcr(CPUM68KState *env, uint32_t val);
 
 
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index 7e50ff5871..af57ffcea9 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -316,13 +316,17 @@ uint32_t HELPER(sats)(uint32_t val, uint32_t v)
 return val;
 }
 
-void HELPER(set_sr)(CPUM68KState *env, uint32_t val)
+void cpu_m68k_set_sr(CPUM68KState *env, uint32_t sr)
 {
-env->sr = val & 0xffe0;
-cpu_m68k_set_ccr(env, val);
+env->sr = sr & 0xffe0;
+cpu_m68k_set_ccr(env, sr);
 m68k_switch_sp(env);
 }
 
+void HELPER(set_sr)(CPUM68KState *env, uint32_t val)
+{
+cpu_m68k_set_sr(env, val);
+}
 
 /* MAC unit.  */
 /* FIXME: The MAC unit implementation is a bit of a mess.  Some helpers
diff --git a/target/m68k/op_helper.c b/target/m68k/op_helper.c
index 123981af55..5c7b27b9ca 100644
--- a/target/m68k/op_helper.c
+++ b/target/m68k/op_helper.c
@@ -54,7 +54,7 @@ void tlb_fill(CPUState *cs, target_ulong addr, MMUAccessType access_type,
 }
 }
 
-static void do_rte(CPUM68KState *env)
+static void cf_rte(CPUM68KState *env)
 {
 uint32_t sp;
 uint32_t fmt;
@@ -65,7 +65,46 @@ static void do_rte(CPUM68KState *env)
 sp |= (fmt >> 28) & 3;
 env->aregs[7] = sp + 8;
 
-helper_set_sr(env, fmt);
+cpu_m68k_set_sr(env, fmt);
+}
+
+static void m68k_rte(CPUM68KState *env)
+{
+uint32_t sp;
+uint16_t fmt;
+uint16_t sr;
+
+sp = env->aregs[7];
+throwaway:
+sr = cpu_lduw_kernel(env, sp);
+sp += 2;
+env->pc = cpu_ldl_kernel(env, sp);
+sp += 4;
+if (m68k_feature(env, M68K_FEATURE_QUAD_MULDIV)) {
+/*  all except 68000 */
+fmt = cpu_lduw_kernel(env, sp);
+sp += 2;
+switch (fmt >> 12) {
+case 0:
+break;
+case 1:
+env->aregs[7] = sp;
+cpu_m68k_set_sr(env, sr);
+goto throwaway;
+case 2:
+case 3:
+sp += 4;
+break;
+case 4:
+sp += 8;
+break;
+case 7:
+sp += 52;
+break;
+}
+}
+env->aregs[7] = sp;
+cpu_m68k_set_sr(env, sr);
 }
 
 static const char *m68k_exception_name(int index)
@@ -173,7 +212,7 @@ static const char *m68k_exception_name(int index)
 return "Unassigned";
 }
 
-static void do_interrupt_all(CPUM68KState *env, int is_hw)
+static void cf_interrupt_all(CPUM68KState *env, int is_hw)
 {
 CPUState *cs = CPU(m68k_env_get_cpu(env));
 uint32_t sp;
@@ -189,7 +228,7 @@ static void do_interrupt_all(CPUM68KState *env, int is_hw)
 switch (cs->exception_index) {
 case EXCP_RTE:
 /* Return from an exception.  */
-do_rte(env);
+cf_rte(env);
 return;
 case EXCP_HALT_INSN:
 if (semihosting_enabled()
@@ -247,6 +286,119 @@ static void do_interrupt_all(CPUM68KState *env, int is_hw)
 env->pc = cpu_ldl_kernel(env, env->vbr + vector);
 }
 
+static inline void do_stack_frame(CPUM68KState *env, uint32_t *sp,
+  uint16_t format, uint16_t sr,
+  uint32_t addr, uint32_t retaddr)
+{
+CPUState *cs = CPU(m68k_env_get_cpu(env));
+switch (format) {
+case 4:
+*sp -= 4;
+cpu_stl_kernel(env, *sp, env->pc);
+*sp -= 4;
+cpu_stl_kernel(env, *sp, addr);
+break;
+case 3:
+case 2:
+*sp -= 4;
+cpu_stl_kernel(env, *sp, addr);
+break;
+}
+*sp -= 2;
+cpu_stw_kernel(env, *sp, (format << 12) + (cs->exception_index << 2));
+*sp -= 4;
+cpu_stl_kernel(env, *sp, retaddr);
+*sp -= 2;
+cpu_stw_kernel(env, *sp, sr);
+}
+
+static void m68k_interrupt_all(CPUM68KState *env, int is_hw)
+{
+CPUState *cs = CPU(m68k_env_get_cpu(env));
+

[Qemu-devel] [PATCH v7 05/17] target/m68k: add CPU_LOG_INT trace

2018-01-03 Thread Laurent Vivier

Display interrupt and exception information
in the QEMU logs (-d int).

Signed-off-by: Laurent Vivier 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---

Notes:
v6: update SR with the content of CCR in the logs

 target/m68k/cpu.h   |   8 
 target/m68k/op_helper.c | 117 +++-
 2 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 5d03764eab..acc2629216 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -45,6 +45,8 @@
 #define EXCP_ADDRESS    3   /* Address error.  */
 #define EXCP_ILLEGAL    4   /* Illegal instruction.  */
 #define EXCP_DIV0   5   /* Divide by zero */
+#define EXCP_CHK    6   /* CHK, CHK2 Instructions */
+#define EXCP_TRAPCC 7   /* FTRAPcc, TRAPcc, TRAPV Instructions */
 #define EXCP_PRIVILEGE  8   /* Privilege violation.  */
 #define EXCP_TRACE  9
 #define EXCP_LINEA  10  /* Unimplemented line-A (MAC) opcode.  */
@@ -53,6 +55,9 @@
 #define EXCP_DEBEGBP    13  /* Breakpoint debug interrupt.  */
 #define EXCP_FORMAT 14  /* RTE format error.  */
 #define EXCP_UNINITIALIZED  15
+#define EXCP_SPURIOUS   24  /* Spurious interrupt */
+#define EXCP_INT_LEVEL_1    25  /* Level 1 Interrupt autovector */
+#define EXCP_INT_LEVEL_7    31  /* Level 7 Interrupt autovector */
 #define EXCP_TRAP0  32   /* User trap #0.  */
 #define EXCP_TRAP15 47   /* User trap #15.  */
 #define EXCP_FP_BSUN    48 /* Branch Set on Unordered */
@@ -63,6 +68,9 @@
 #define EXCP_FP_OVFL    53 /* Overflow */
 #define EXCP_FP_SNAN    54 /* Signaling Not-A-Number */
 #define EXCP_FP_UNIMP   55 /* Unimplemented Data type */
+#define EXCP_MMU_CONF   56  /* MMU Configuration Error */
+#define EXCP_MMU_ILLEGAL    57  /* MMU Illegal Operation Error */
+#define EXCP_MMU_ACCESS 58  /* MMU Access Level Violation Error */
 #define EXCP_UNSUPPORTED61
 
 #define EXCP_RTE    0x100
diff --git a/target/m68k/op_helper.c b/target/m68k/op_helper.c
index 63089511cb..123981af55 100644
--- a/target/m68k/op_helper.c
+++ b/target/m68k/op_helper.c
@@ -68,10 +68,116 @@ static void do_rte(CPUM68KState *env)
 helper_set_sr(env, fmt);
 }
 
+static const char *m68k_exception_name(int index)
+{
+switch (index) {
+case EXCP_ACCESS:
+return "Access Fault";
+case EXCP_ADDRESS:
+return "Address Error";
+case EXCP_ILLEGAL:
+return "Illegal Instruction";
+case EXCP_DIV0:
+return "Divide by Zero";
+case EXCP_CHK:
+return "CHK/CHK2";
+case EXCP_TRAPCC:
+return "FTRAPcc, TRAPcc, TRAPV";
+case EXCP_PRIVILEGE:
+return "Privilege Violation";
+case EXCP_TRACE:
+return "Trace";
+case EXCP_LINEA:
+return "A-Line";
+case EXCP_LINEF:
+return "F-Line";
+case EXCP_DEBEGBP: /* 68020/030 only */
+return "Copro Protocol Violation";
+case EXCP_FORMAT:
+return "Format Error";
+case EXCP_UNINITIALIZED:
+return "Uninitialized Interrupt";
+case EXCP_SPURIOUS:
+return "Spurious Interrupt";
+case EXCP_INT_LEVEL_1:
+return "Level 1 Interrupt";
+case EXCP_INT_LEVEL_1 + 1:
+return "Level 2 Interrupt";
+case EXCP_INT_LEVEL_1 + 2:
+return "Level 3 Interrupt";
+case EXCP_INT_LEVEL_1 + 3:
+return "Level 4 Interrupt";
+case EXCP_INT_LEVEL_1 + 4:
+return "Level 5 Interrupt";
+case EXCP_INT_LEVEL_1 + 5:
+return "Level 6 Interrupt";
+case EXCP_INT_LEVEL_1 + 6:
+return "Level 7 Interrupt";
+case EXCP_TRAP0:
+return "TRAP #0";
+case EXCP_TRAP0 + 1:
+return "TRAP #1";
+case EXCP_TRAP0 + 2:
+return "TRAP #2";
+case EXCP_TRAP0 + 3:
+return "TRAP #3";
+case EXCP_TRAP0 + 4:
+return "TRAP #4";
+case EXCP_TRAP0 + 5:
+return "TRAP #5";
+case EXCP_TRAP0 + 6:
+return "TRAP #6";
+case EXCP_TRAP0 + 7:
+return "TRAP #7";
+case EXCP_TRAP0 + 8:
+return "TRAP #8";
+case EXCP_TRAP0 + 9:
+return "TRAP #9";
+case EXCP_TRAP0 + 10:
+return "TRAP #10";
+case EXCP_TRAP0 + 11:
+return "TRAP #11";
+case EXCP_TRAP0 + 12:
+return "TRAP #12";
+case EXCP_TRAP0 + 13:
+return "TRAP #13";
+case EXCP_TRAP0 + 14:
+return "TRAP #14";
+case EXCP_TRAP0 + 15:
+return "TRAP #15";
+case EXCP_FP_BSUN:
+return "FP Branch/Set on unordered condition";
+case EXCP_FP_INEX:
+return "FP Inexact Result";
+case EXCP_FP_DZ:
+return "FP Divide by Zero";
+case EXCP_FP_UNFL:
+return "FP Underflow";
+case EXCP_FP_OPERR:
+return "FP Operand Error";
+case EXCP_FP_OVFL:
+return "FP Overflow";
+case EXCP_FP_SNAN:
+

[Qemu-devel] [PATCH v7 11/17] target/m68k: add reset

2018-01-03 Thread Laurent Vivier

The instruction traps if the CPU is not in
Supervisor state, but the helper is empty because
there is no easy way to reset all the peripherals
without resetting the CPU itself.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
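
The privilege check described above can be modelled in a few lines. This
is a run-time model only (the translator makes the decision at
translation time via IS_USER()), and toy_exec_reset and the TOY_*
constants are hypothetical; 8 is the privilege violation vector from
cpu.h and 0x2000 is the architectural S bit.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SR_S           0x2000u /* supervisor bit in SR */
#define TOY_EXCP_NONE      0
#define TOY_EXCP_PRIVILEGE 8       /* privilege violation vector */

/* Model of RESET: trap in user mode, otherwise do nothing to the CPU. */
static int toy_exec_reset(uint16_t sr)
{
    if (!(sr & TOY_SR_S)) {
        return TOY_EXCP_PRIVILEGE;
    }
    /* Supervisor mode: external devices would be reset here (not modelled). */
    return TOY_EXCP_NONE;
}
```
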
 target/m68k/helper.c|  7 +++
 target/m68k/helper.h|  4 
 target/m68k/translate.c | 13 +
 3 files changed, 24 insertions(+)

diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index af57ffcea9..52b054e1a3 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -711,3 +711,10 @@ void HELPER(set_mac_extu)(CPUM68KState *env, uint32_t val, 
uint32_t acc)
 res |= (uint64_t)(val & 0x) << 16;
 env->macc[acc + 1] = res;
 }
+
+#if defined(CONFIG_SOFTMMU)
+void HELPER(reset)(CPUM68KState *env)
+{
+/* FIXME: reset all except CPU */
+}
+#endif
diff --git a/target/m68k/helper.h b/target/m68k/helper.h
index 78483da003..d27ea37d60 100644
--- a/target/m68k/helper.h
+++ b/target/m68k/helper.h
@@ -97,3 +97,7 @@ DEF_HELPER_FLAGS_4(bfffo_mem, TCG_CALL_NO_WG, i64, env, i32, 
s32, i32)
 
 DEF_HELPER_3(chk, void, env, s32, s32)
 DEF_HELPER_4(chk2, void, env, s32, s32, s32)
+
+#if defined(CONFIG_SOFTMMU)
+DEF_HELPER_FLAGS_1(reset, TCG_CALL_NO_RWG, void, env)
+#endif
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 98efe6b976..e8f7d07f3f 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -2762,6 +2762,18 @@ DISAS_INSN(unlk)
 tcg_temp_free(src);
 }
 
+#if defined(CONFIG_SOFTMMU)
+DISAS_INSN(reset)
+{
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+
+gen_helper_reset(cpu_env);
+}
+#endif
+
 DISAS_INSN(nop)
 {
 }
@@ -5572,6 +5584,7 @@ void register_m68k_insns (CPUM68KState *env)
 #if defined(CONFIG_SOFTMMU)
 INSN(move_to_usp, 4e60, fff8, USP);
 INSN(move_from_usp, 4e68, fff8, USP);
+INSN(reset, 4e70, , M68000);
 BASE(stop,  4e72, );
 BASE(rte,   4e73, );
 INSN(movec, 4e7b, , CF_ISA_A);
-- 
2.14.3

[Qemu-devel] [PATCH v7 02/17] target/m68k: fix gen_get_ccr()

2018-01-03 Thread Laurent Vivier

As gen_helper_get_ccr() is able to compute CCR from cc_op and
flags, we don't need to flush the flags before calling it.
flush_flags() and get_ccr() both use COMPUTE_CCR() to compute
the flags. get_ccr() only computes the CCR value,
whereas flush_flags() updates the live cc_op and flags.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
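
The distinction between "compute" and "flush" can be shown with a toy
lazy-flags model. This is a deliberately simplified sketch with a single
lazy operation; ToyFlags, toy_get_ccr and toy_flush_flags are
hypothetical and do not mirror QEMU's cc_op set. Only the CCR bit
positions (C=0x01, V=0x02, Z=0x04, N=0x08, X=0x10) are architectural.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CCF_Z 0x04u
#define TOY_CCF_N 0x08u
#define TOY_CCF_X 0x10u

typedef enum { TOY_CC_FLAGS, TOY_CC_LOGIC } ToyCCOp;

typedef struct {
    ToyCCOp cc_op;  /* how the flags are currently represented */
    int32_t result; /* last logic result, valid for TOY_CC_LOGIC */
    uint8_t ccr;    /* materialized flags; X is always kept here */
} ToyFlags;

/* Compute the CCR value without touching the lazy state. */
static uint8_t toy_get_ccr(const ToyFlags *f)
{
    uint8_t ccr;

    if (f->cc_op == TOY_CC_FLAGS) {
        return f->ccr;
    }
    ccr = f->ccr & TOY_CCF_X;       /* logic ops leave X alone, clear V and C */
    if (f->result == 0) {
        ccr |= TOY_CCF_Z;
    }
    if (f->result < 0) {
        ccr |= TOY_CCF_N;
    }
    return ccr;
}

/* Materialize the flags: same computation, but the state is rewritten. */
static void toy_flush_flags(ToyFlags *f)
{
    f->ccr = toy_get_ccr(f);
    f->cc_op = TOY_CC_FLAGS;
}
```

Reading CCR through toy_get_ccr() gives the same value whether or not a
flush happened first, which is why the extra flush is redundant.
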
 target/m68k/translate.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 0e9d651a2a..1e9fb01252 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -2478,7 +2478,6 @@ static TCGv gen_get_ccr(DisasContext *s)
 {
 TCGv dest;
 
-gen_flush_flags(s);
 update_cc_op(s);
 dest = tcg_temp_new();
 gen_helper_get_ccr(dest, cpu_env);
-- 
2.14.3

[Qemu-devel] [PATCH v7 12/17] target/m68k: implement fsave/frestore

2018-01-03 Thread Laurent Vivier

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index e8f7d07f3f..b8ed85c237 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -5103,28 +5103,35 @@ DISAS_INSN(fscc)
 #if defined(CONFIG_SOFTMMU)
 DISAS_INSN(frestore)
 {
-M68kCPU *cpu = m68k_env_get_cpu(env);
+TCGv addr;
 
 if (IS_USER(s)) {
 gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
-
-/* TODO: Implement frestore.  */
-cpu_abort(CPU(cpu), "FRESTORE not implemented");
+if (m68k_feature(s->env, M68K_FEATURE_M68040)) {
+SRC_EA(env, addr, OS_LONG, 0, NULL);
+/* FIXME: check the state frame */
+} else {
+disas_undef(env, s, insn);
+}
 }
 
 DISAS_INSN(fsave)
 {
-M68kCPU *cpu = m68k_env_get_cpu(env);
-
 if (IS_USER(s)) {
 gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
 return;
 }
 
-/* TODO: Implement fsave.  */
-cpu_abort(CPU(cpu), "FSAVE not implemented");
+if (m68k_feature(s->env, M68K_FEATURE_M68040)) {
+/* always write IDLE */
+TCGv idle = tcg_const_i32(0x4100);
+DEST_EA(env, insn, OS_LONG, idle, NULL);
+tcg_temp_free(idle);
+} else {
+disas_undef(env, s, insn);
+}
 }
 #endif
 
-- 
2.14.3

[Qemu-devel] [PATCH v7 10/17] target/m68k: add cpush/cinv

2018-01-03 Thread Laurent Vivier

Add the cache line invalidate (cinv) and cache line push (cpush)
instructions as no-ops, as we don't emulate a cache.

These instructions are 68040 only.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index f77005215f..98efe6b976 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -4496,6 +4496,24 @@ DISAS_INSN(cpushl)
 /* Cache push/invalidate.  Implement as no-op.  */
 }
 
+DISAS_INSN(cpush)
+{
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+/* Cache push/invalidate.  Implement as no-op.  */
+}
+
+DISAS_INSN(cinv)
+{
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+/* Invalidate cache line.  Implement as no-op.  */
+}
+
 DISAS_INSN(wddata)
 {
 gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
@@ -5674,6 +5692,8 @@ void register_m68k_insns (CPUM68KState *env)
 INSN(fsave, f300, ffc0, FPU);
 INSN(intouch,   f340, ffc0, CF_ISA_A);
 INSN(cpushl,f428, ff38, CF_ISA_A);
+INSN(cpush, f420, ff20, M68040);
+INSN(cinv,  f400, ff20, M68040);
 INSN(wddata,fb00, ff00, CF_ISA_A);
 INSN(wdebug,fbc0, ffc0, CF_ISA_A);
 #endif
-- 
2.14.3

[Qemu-devel] [PATCH v7 00/17] target/m68k: supervisor mode (part 1)

2018-01-03 Thread Laurent Vivier

This series is the first of several that will introduce
supervisor mode and allow privileged instructions
to be executed.

Some of these patches are only cleanup:

  sync CC_OP before gen_jmp_tb()
  fix gen_get_ccr()
  softmmu cleanup
  add CPU_LOG_INT trace
  use insn_pc to generate instruction fault address
  move CCR/SR functions
  fix m68k_cpu_dump_state()

We also fix a problem with linux-user mode,
where the Status Register is not updated with
the CCR value (found while testing "chk/chk2"
using signal()):

  correctly manage SR in context

We introduce some new non-privileged instructions:

  add chk and chk2
  add move16

move16 is used by the kernel when it is compiled only for
68040 (it's a 68040 only instruction).

We add some trivial privileged instructions (most are empty):

  add cpush/cinv
  add reset
  implement fsave/frestore

And finally, we add the privileged instructions to
modify the status register, the Interrupt Stack
Pointer and the 680x0 stack frame formats:

  add 680x0 "move to SR" instruction
  add andi/ori/eori to SR/CCR
  add the Interrupt Stack Pointer
  manage 680x0 stack frames

The next series will introduce the MC68040 MMU.

v7: chk: always update C and N flags
chk,chk2: add some comments
chk,chk2: move flush_flags() from the helper to the
code generator, because otherwise we need
to do an update_cc_op() before calling the
helper to be sure env->cc_op and s->cc_op
are synchronized

v6: introduce cpu_m68k_set_sr() to set SR instead
of calling helper_set_sr().
update SR with the content of CCR in the logs
use helpers as suggested by Richard for chk/chk2
split move16 in two functions
use cpu_m68k_set_sr() to set SR in GDB stub and in m68k_cpu_reset()

v5: it is in fact v1, there is no previous version.
I've messed up with git-publish on an older branch without checking the
subject prefix. Sorry...

Laurent Vivier (17):
  target-m68k: sync CC_OP before gen_jmp_tb()
  target/m68k: fix gen_get_ccr()
  linux-user,m68k: correctly manage SR in context
  target/m68k: use insn_pc to generate instruction fault address
  target/m68k: add CPU_LOG_INT trace
  target/m68k: manage 680x0 stack frames
  target/m68k: add chk and chk2
  target/m68k: add move16
  target/m68k: softmmu cleanup
  target/m68k: add cpush/cinv
  target/m68k: add reset
  target/m68k: implement fsave/frestore
  target/m68k: move CCR/SR functions
  target/m68k: add 680x0 "move to SR" instruction
  target/m68k: add andi/ori/eori to SR/CCR
  target/m68k: add the Interrupt Stack Pointer
  target/m68k: fix m68k_cpu_dump_state()

 linux-user/main.c   |   7 +
 linux-user/signal.c |   7 +-
 target/m68k/cpu.c   |  20 +-
 target/m68k/cpu.h   |  84 +++-
 target/m68k/gdbstub.c   |   2 +-
 target/m68k/helper.c|  99 +-
 target/m68k/helper.h|  11 +-
 target/m68k/monitor.c   |   1 +
 target/m68k/op_helper.c | 338 +++-
 target/m68k/translate.c | 497 ++--
 10 files changed, 937 insertions(+), 129 deletions(-)

-- 
2.14.3

[Qemu-devel] [PATCH v7 09/17] target/m68k: softmmu cleanup

2018-01-03 Thread Laurent Vivier

Don't compile supervisor-only instructions in linux-user mode.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 39 +--
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 0ef933a545..f77005215f 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -4391,6 +4391,7 @@ DISAS_INSN(move_from_sr)
 DEST_EA(env, insn, OS_WORD, sr, NULL);
 }
 
+#if defined(CONFIG_SOFTMMU)
 DISAS_INSN(move_to_sr)
 {
 if (IS_USER(s)) {
@@ -4423,6 +4424,11 @@ DISAS_INSN(move_to_usp)
 
 DISAS_INSN(halt)
 {
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+
 gen_exception(s, s->pc, EXCP_HALT_INSN);
 }
 
@@ -4506,6 +4512,7 @@ DISAS_INSN(wdebug)
 /* TODO: Implement wdebug.  */
 cpu_abort(CPU(cpu), "WDEBUG not implemented");
 }
+#endif
 
 DISAS_INSN(trap)
 {
@@ -5063,10 +5070,16 @@ DISAS_INSN(fscc)
 tcg_temp_free(tmp);
 }
 
+#if defined(CONFIG_SOFTMMU)
 DISAS_INSN(frestore)
 {
 M68kCPU *cpu = m68k_env_get_cpu(env);
 
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+
 /* TODO: Implement frestore.  */
 cpu_abort(CPU(cpu), "FRESTORE not implemented");
 }
@@ -5075,9 +5088,15 @@ DISAS_INSN(fsave)
 {
 M68kCPU *cpu = m68k_env_get_cpu(env);
 
+if (IS_USER(s)) {
+gen_exception(s, s->insn_pc, EXCP_PRIVILEGE);
+return;
+}
+
 /* TODO: Implement fsave.  */
 cpu_abort(CPU(cpu), "FSAVE not implemented");
 }
+#endif
 
 static inline TCGv gen_mac_extract_word(DisasContext *s, TCGv val, int upper)
 {
@@ -5502,7 +5521,9 @@ void register_m68k_insns (CPUM68KState *env)
 INSN(not,   4680, fff8, CF_ISA_A);
 INSN(not,   4600, ff00, M68000);
 INSN(undef, 46c0, ffc0, M68000);
+#if defined(CONFIG_SOFTMMU)
 INSN(move_to_sr, 46c0, ffc0, CF_ISA_A);
+#endif
 INSN(nbcd,  4800, ffc0, M68000);
 INSN(linkl, 4808, fff8, M68000);
 BASE(pea,   4840, ffc0);
@@ -5517,7 +5538,9 @@ void register_m68k_insns (CPUM68KState *env)
 BASE(tst,   4a00, ff00);
 INSN(tas,   4ac0, ffc0, CF_ISA_B);
 INSN(tas,   4ac0, ffc0, M68000);
+#if defined(CONFIG_SOFTMMU)
 INSN(halt,  4ac8, ffff, CF_ISA_A);
+#endif
 INSN(pulse, 4acc, ffff, CF_ISA_A);
 BASE(illegal,   4afc, ffff);
 INSN(mull,  4c00, ffc0, CF_ISA_A);
@@ -5528,14 +5551,16 @@ void register_m68k_insns (CPUM68KState *env)
 BASE(trap,  4e40, fff0);
 BASE(link,  4e50, fff8);
 BASE(unlk,  4e58, fff8);
+#if defined(CONFIG_SOFTMMU)
 INSN(move_to_usp, 4e60, fff8, USP);
 INSN(move_from_usp, 4e68, fff8, USP);
-BASE(nop,   4e71, ffff);
 BASE(stop,  4e72, ffff);
 BASE(rte,   4e73, ffff);
+INSN(movec, 4e7b, ffff, CF_ISA_A);
+#endif
+BASE(nop,   4e71, ffff);
 INSN(rtd,   4e74, ffff, RTD);
 BASE(rts,   4e75, ffff);
-INSN(movec, 4e7b, ffff, CF_ISA_A);
 BASE(jump,  4e80, ffc0);
 BASE(jump,  4ec0, ffc0);
 INSN(addsubq,   5000, f080, M68000);
@@ -5639,19 +5664,21 @@ void register_m68k_insns (CPUM68KState *env)
 BASE(undef_fpu, f000, f000);
 INSN(fpu,   f200, ffc0, CF_FPU);
 INSN(fbcc,  f280, ffc0, CF_FPU);
-INSN(frestore,  f340, ffc0, CF_FPU);
-INSN(fsave, f300, ffc0, CF_FPU);
 INSN(fpu,   f200, ffc0, FPU);
 INSN(fscc,  f240, ffc0, FPU);
 INSN(fbcc,  f280, ff80, FPU);
+#if defined(CONFIG_SOFTMMU)
+INSN(frestore,  f340, ffc0, CF_FPU);
+INSN(fsave, f300, ffc0, CF_FPU);
 INSN(frestore,  f340, ffc0, FPU);
 INSN(fsave, f300, ffc0, FPU);
 INSN(intouch,   f340, ffc0, CF_ISA_A);
 INSN(cpushl,f428, ff38, CF_ISA_A);
-INSN(move16_mem, f600, ffe0, M68040);
-INSN(move16_reg, f620, fff8, M68040);
 INSN(wddata,fb00, ff00, CF_ISA_A);
 INSN(wdebug,fbc0, ffc0, CF_ISA_A);
+#endif
+INSN(move16_mem, f600, ffe0, M68040);
+INSN(move16_reg, f620, fff8, M68040);
 #undef INSN
 }
 
-- 
2.14.3

[Qemu-devel] [PATCH v7 03/17] linux-user, m68k: correctly manage SR in context

2018-01-03 Thread Laurent Vivier

Use cpu_m68k_get_ccr()/cpu_m68k_set_ccr() to correctly set up and restore
the value of SR in the context structure. Fix target_rt_setup_ucontext().
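
As a minimal sketch of the value saved in the context: the upper byte
comes from the stored SR and the lower byte from the freshly computed CCR.
toy_compose_sr() is a hypothetical helper used only for illustration; the
patch itself open-codes the expression with cpu_m68k_get_ccr().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: combine the system byte of SR (bits 15..8) with
 * a condition code value that occupies the low byte.
 */
static uint32_t toy_compose_sr(uint32_t sr, uint32_t ccr)
{
    /* The CCR must fit in the low byte, otherwise it would clobber SR. */
    assert((ccr & ~0xffu) == 0);
    return (sr & 0xff00u) | ccr;
}
```

Any stale condition bits left in the low byte of the stored SR are
discarded, so the signal handler sees the flags as they really are.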

Fixes: 3219de458c ("linux-user: correctly manage SR in ucontext")
Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 linux-user/signal.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index dae14d4a89..74fa03f96d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -5612,13 +5612,14 @@ struct target_rt_sigframe
 static void setup_sigcontext(struct target_sigcontext *sc, CPUM68KState *env,
  abi_ulong mask)
 {
+uint32_t sr = (env->sr & 0xff00) | cpu_m68k_get_ccr(env);
 __put_user(mask, &sc->sc_mask);
 __put_user(env->aregs[7], &sc->sc_usp);
 __put_user(env->dregs[0], &sc->sc_d0);
 __put_user(env->dregs[1], &sc->sc_d1);
 __put_user(env->aregs[0], &sc->sc_a0);
 __put_user(env->aregs[1], &sc->sc_a1);
-__put_user(env->sr, &sc->sc_sr);
+__put_user(sr, &sc->sc_sr);
 __put_user(env->pc, &sc->sc_pc);
 }
 
@@ -5634,7 +5635,7 @@ restore_sigcontext(CPUM68KState *env, struct 
target_sigcontext *sc)
 __get_user(env->aregs[1], &sc->sc_a1);
 __get_user(env->pc, &sc->sc_pc);
 __get_user(temp, &sc->sc_sr);
-env->sr = (env->sr & 0xff00) | (temp & 0xff);
+cpu_m68k_set_ccr(env, temp);
 }
 
 /*
@@ -5726,7 +5727,7 @@ static inline int target_rt_setup_ucontext(struct 
target_ucontext *uc,
CPUM68KState *env)
 {
 target_greg_t *gregs = uc->tuc_mcontext.gregs;
-uint32_t sr = cpu_m68k_get_ccr(env);
+uint32_t sr = (env->sr & 0xff00) | cpu_m68k_get_ccr(env);
 
 __put_user(TARGET_MCONTEXT_VERSION, &uc->tuc_mcontext.version);
 __put_user(env->dregs[0], &gregs[0]);
-- 
2.14.3

[Qemu-devel] [PATCH v7 01/17] target-m68k: sync CC_OP before gen_jmp_tb()

2018-01-03 Thread Laurent Vivier

And remove update_cc_op() from gen_exception() because there is
one in gen_jmp_im().

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target/m68k/translate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index bbda7399ec..0e9d651a2a 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -270,7 +270,6 @@ static void gen_raise_exception(int nr)
 
 static void gen_exception(DisasContext *s, uint32_t where, int nr)
 {
-update_cc_op(s);
 gen_jmp_im(s, where);
 gen_raise_exception(nr);
 }
@@ -2897,6 +2896,7 @@ DISAS_INSN(branch)
 gen_jmp_tb(s, 0, s->pc);
 } else {
 /* Unconditional branch.  */
+update_cc_op(s);
 gen_jmp_tb(s, 0, base + offset);
 }
 }
@@ -4875,6 +4875,7 @@ static void gen_fjmpcc(DisasContext *s, int cond, 
TCGLabel *l1)
 DisasCompare c;
 
 gen_fcc_cond(&c, s, cond);
+update_cc_op(s);
 tcg_gen_brcond_i32(c.tcond, c.v1, c.v2, l1);
 free_cond(&c);
 }
-- 
2.14.3

Re: [Qemu-devel] [PATCH] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()

2018-01-03 Thread Haozhong Zhang

On 01/03/18 11:45 -0200, Eduardo Habkost wrote:
> On Wed, Jan 03, 2018 at 11:16:39AM +0800, Haozhong Zhang wrote:
> > On 01/02/18 18:02 +0200, Michael S. Tsirkin wrote:
> > > On Wed, Dec 27, 2017 at 02:56:20PM +0800, Haozhong Zhang wrote:
> > > > When a file supporting DAX is used as vNVDIMM backend, mmap it with
> > > > MAP_SYNC flag in addition can guarantee the persistence of guest write
> > > > to the backend file without other QEMU actions (e.g., periodic fsync()
> > > > by QEMU).
> > > > 
> > > > By using MAP_SHARED_VALIDATE flag with MAP_SYNC, we can ensure mmap
> > > > with MAP_SYNC fails if MAP_SYNC is not supported by the kernel or the
> > > > backend file. On such failures, QEMU retries mmap without MAP_SYNC and
> > > > MAP_SHARED_VALIDATE.
> > > > 
> > > > Signed-off-by: Haozhong Zhang 
> > > 
> > > If users rely on MAP_SYNC then don't you need to fail allocation
> > > if you can't use it?
> > 
> > MAP_SYNC is supported since Linux kernel 4.15 and only needed for mmap
> > files on nvdimm. qemu_ram_mmap() has no way to check whether its
> > parameter 'fd' points to files on nvdimm, except by looking up
> > sysfs. However, accessing sysfs may be denied by certain SELinux
> > policies.
> > 
> > The missing of MAP_SYNC should not affect the primary functionality of
> > vNVDIMM when using files on host nvdimm as backend, except the
> > guarantee of write persistence in case of qemu/host crash.
> > 
> > We may check the kernel support of MAP_SYNC and the type of vNVDIMM
> > backend in some management utility (e.g., libvirt?), and deny to
> > launch QEMU if MAP_SYNC is not supported while files on host NVDIMM
> > are in use.
> 
> Instead of making libvirt check if MAP_SYNC is supported and just
> hope it won't fail, it would be safer to let libvirt tell QEMU
> that MAP_SYNC must never fail.

For example, add an option "sync" to memory-backend-file, and pass
it to qemu_ram_mmap()?

> 
> However, it looks like kernel 4.14 won't even fail if MAP_SYNC is
> specified.  How exactly can userspace detect if MAP_SYNC is
> really supported?

Use MAP_SYNC with MAP_SHARED_VALIDATE (both introduced in the 4.15
kernel). Linux kernel 4.15 and later validate whether MAP_SYNC is
supported. Because MAP_SHARED_VALIDATE is defined as
(MAP_SHARED | MAP_PRIVATE), such an mmap also always fails on older
kernels, which do not support MAP_SYNC.

If we agree to introduce an option "sync" or the like, we can do the
above check in qemu_ram_mmap().
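
As a rough sketch of that check (this is not the patch itself):
map_prefer_sync() is a hypothetical helper name, and the fallback flag
values below are the ones I believe are used on x86 and most other
architectures, provided only for headers that predate 4.15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for old headers; assumed asm-generic values. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Try a MAP_SYNC mapping first. MAP_SHARED_VALIDATE makes the kernel
 * reject flags it does not understand, so success means MAP_SYNC is
 * really in effect. On failure, retry as a plain shared mapping.
 * Returns MAP_FAILED if both attempts fail.
 */
static void *map_prefer_sync(int fd, size_t len, bool *synced)
{
    void *p;

    assert(synced != NULL);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        *synced = true;
        return p;
    }
    *synced = false;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With a "sync" option, the caller would treat *synced == false as a hard
error instead of silently continuing with the fallback mapping.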


Haozhong

[Qemu-devel] [PATCH v9 3/4] contrib/libvhost-user: enable virtio config space messages

2018-01-03 Thread Changpeng Liu

Enable the VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages in the
libvhost-user library, so that users can implement their own I/O target
based on the library. This allows the virtio config space to be passed
between the QEMU host device and the I/O target.
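
A minimal sketch of what a get_config callback in an I/O target might look
like. The parameter list mirrors the vu_get_config_cb typedef added below,
but ToyDev stands in for VuDev and all toy_* names are hypothetical; the
only convention taken from the patch is that a non-zero return is reported
to the master as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_CONFIG_SIZE 256   /* mirrors VHOST_USER_MAX_CONFIG_SIZE */

/* Hypothetical stand-in for the device state behind a VuDev. */
typedef struct ToyDev {
    uint8_t config[TOY_MAX_CONFIG_SIZE];
    uint32_t config_len;          /* number of valid bytes in config[] */
} ToyDev;

/* Copy len bytes of config space to the caller; 0 on success, -1 on error. */
static int toy_get_config(ToyDev *dev, uint8_t *config, uint32_t len)
{
    assert(dev != NULL && config != NULL);
    if (len > dev->config_len) {
        /* The master asked for more than the device exposes. */
        return -1;
    }
    memcpy(config, dev->config, len);
    return 0;
}
```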

Signed-off-by: Changpeng Liu 
---
 contrib/libvhost-user/libvhost-user.c | 42 +++
 contrib/libvhost-user/libvhost-user.h | 33 +++
 2 files changed, 75 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index f409bd3..27cc597 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -84,6 +84,8 @@ vu_request_to_string(unsigned int req)
 REQ(VHOST_USER_SET_SLAVE_REQ_FD),
 REQ(VHOST_USER_IOTLB_MSG),
 REQ(VHOST_USER_SET_VRING_ENDIAN),
+REQ(VHOST_USER_GET_CONFIG),
+REQ(VHOST_USER_SET_CONFIG),
 REQ(VHOST_USER_MAX),
 };
 #undef REQ
@@ -798,6 +800,42 @@ vu_set_slave_req_fd(VuDev *dev, VhostUserMsg *vmsg)
 }
 
 static bool
+vu_get_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+int ret = -1;
+
+if (dev->iface->get_config) {
+ret = dev->iface->get_config(dev, vmsg->payload.config.region,
+ vmsg->payload.config.size);
+}
+
+if (ret) {
+/* resize to zero to indicate an error to master */
+vmsg->size = 0;
+}
+
+return true;
+}
+
+static bool
+vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
+{
+int ret = -1;
+
+if (dev->iface->set_config) {
+ret = dev->iface->set_config(dev, vmsg->payload.config.region,
+ vmsg->payload.config.offset,
+ vmsg->payload.config.size,
+ vmsg->payload.config.flags);
+if (ret) {
+vu_panic(dev, "Set virtio configuration space failed");
+}
+}
+
+return false;
+}
+
+static bool
 vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 {
 int do_reply = 0;
@@ -862,6 +900,10 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 return vu_set_vring_enable_exec(dev, vmsg);
 case VHOST_USER_SET_SLAVE_REQ_FD:
 return vu_set_slave_req_fd(dev, vmsg);
+case VHOST_USER_GET_CONFIG:
+return vu_get_config(dev, vmsg);
+case VHOST_USER_SET_CONFIG:
+return vu_set_config(dev, vmsg);
 case VHOST_USER_NONE:
 break;
 default:
diff --git a/contrib/libvhost-user/libvhost-user.h 
b/contrib/libvhost-user/libvhost-user.h
index 2f5864b..f8a730b 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -30,6 +30,16 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
+typedef enum VhostSetConfigType {
+VHOST_SET_CONFIG_TYPE_MASTER = 0,
+VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
+} VhostSetConfigType;
+
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -69,6 +79,8 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
 VHOST_USER_SET_VRING_ENDIAN = 23,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -90,6 +102,18 @@ typedef struct VhostUserLog {
 uint64_t mmap_offset;
 } VhostUserLog;
 
+typedef struct VhostUserConfig {
+uint32_t offset;
+uint32_t size;
+uint32_t flags;
+uint8_t region[VHOST_USER_MAX_CONFIG_SIZE];
+} VhostUserConfig;
+
+static VhostUserConfig c __attribute__ ((unused));
+#define VHOST_USER_CONFIG_HDR_SIZE (sizeof(c.offset) \
+   + sizeof(c.size) \
+   + sizeof(c.flags))
+
 #if defined(_WIN32)
 # define VU_PACKED __attribute__((gcc_struct, packed))
 #else
@@ -112,6 +136,7 @@ typedef struct VhostUserMsg {
 struct vhost_vring_addr addr;
 VhostUserMemory memory;
 VhostUserLog log;
+VhostUserConfig config;
 } payload;
 
 int fds[VHOST_MEMORY_MAX_NREGIONS];
@@ -140,6 +165,10 @@ typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg 
*vmsg,
   int *do_reply);
 typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
 typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
+typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
+typedef int (*vu_set_config_cb) (VuDev *dev, const uint8_t *data,
+ uint32_t offset, uint32_t size,
+ uint32_t flags);
 
 typedef struct VuDevIface {
 /* called by VHOST_USER_GET_FEATURES to get the features bitmask */
@@ -162,6 +191,10 @@ typedef struct VuDevIface {
  * on unmanaged exit/crash.
  */
 vu_queue_is_processed_in_order_cb

[Qemu-devel] [PATCH v9 2/4] vhost-user-blk: introduce a new vhost-user-blk host device

2018-01-03 Thread Changpeng Liu

This commit introduces a new vhost-user device for block. It uses a
chardev to connect with the backend; as with the QEMU virtio-blk device,
the guest OS still uses the virtio-blk frontend driver.

To use it, start QEMU with command line like this:

qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num-queues=2, \
bootindex=2... \

Users can use different parameters for `num-queues` and `bootindex`.

Unlike the existing QEMU virtio-blk host device, it makes it easier
for users to implement their own I/O processing logic, such as an
all-user-space I/O stack against a hardware block device. It uses the new
vhost message (VHOST_USER_GET_CONFIG) to get the virtio block config
information from the backend process.

Signed-off-by: Changpeng Liu 
---
 default-configs/pci.mak|   1 +
 hw/block/Makefile.objs |   3 +
 hw/block/vhost-user-blk.c  | 359 +
 hw/virtio/virtio-pci.c |  55 ++
 hw/virtio/virtio-pci.h |  18 ++
 include/hw/virtio/vhost-user-blk.h |  41 +
 6 files changed, 477 insertions(+)
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index e514bde..49a0f28 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -43,3 +43,4 @@ CONFIG_VGA_PCI=y
 CONFIG_IVSHMEM_DEVICE=$(CONFIG_IVSHMEM)
 CONFIG_ROCKER=y
 CONFIG_VHOST_USER_SCSI=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
+CONFIG_VHOST_USER_BLK=$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX))
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index e0ed980..4c19a58 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
 
 obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO) += dataplane/
+ifeq ($(CONFIG_VIRTIO),y)
+obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
new file mode 100644
index 000..b53b4c9
--- /dev/null
+++ b/hw/block/vhost-user-blk.c
@@ -0,0 +1,359 @@
+/*
+ * vhost-user-blk host device
+ *
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ *  Changpeng Liu 
+ *
+ * Largely based on the "vhost-user-scsi.c" and "vhost-scsi.c" implemented by:
+ * Felipe Franciosi 
+ * Stefan Hajnoczi 
+ * Nicholas Bellinger 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+
+static const int user_feature_bits[] = {
+VIRTIO_BLK_F_SIZE_MAX,
+VIRTIO_BLK_F_SEG_MAX,
+VIRTIO_BLK_F_GEOMETRY,
+VIRTIO_BLK_F_BLK_SIZE,
+VIRTIO_BLK_F_TOPOLOGY,
+VIRTIO_BLK_F_MQ,
+VIRTIO_BLK_F_RO,
+VIRTIO_BLK_F_FLUSH,
+VIRTIO_BLK_F_CONFIG_WCE,
+VIRTIO_F_VERSION_1,
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
+}
+
+static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t 
*config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
+int ret;
+
+if (blkcfg->wce == s->blkcfg.wce) {
+return;
+}
+
+ret = vhost_dev_set_config(&s->dev, &blkcfg->wce,
+   offsetof(struct virtio_blk_config, wce),
+   sizeof(blkcfg->wce),
+   VHOST_SET_CONFIG_TYPE_MASTER);
+if (ret) {
+error_report("set device config space failed");
+return;
+}
+
+s->blkcfg.wce = blkcfg->wce;
+}
+
+static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
+{
+int ret;
+struct virtio_blk_config blkcfg;
+VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
+
+ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
+   sizeof(struct virtio_blk_config));
+if (ret < 0) {
+error_report("get config space failed");
+return -1;
+}
+
+/* valid for resize only */
+if (blkcfg.capacity != s->blkcfg.capacity) {
+s->blkcfg.capacity = blkcfg.capacity;
+memcpy(dev->vdev->config, &s->blkcfg, sizeof(struct

[Qemu-devel] [PATCH v9 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application

2018-01-03 Thread Changpeng Liu

This commit introduces a vhost-user-blk backend device. It uses a UNIX
domain socket to communicate with QEMU. The vhost-user-blk sample
application should be used with the QEMU vhost-user-blk-pci device.

To use it, compile with:
make vhost-user-blk

and start like this:
vhost-user-blk -b /dev/sdb -s /path/vhost.socket

Signed-off-by: Changpeng Liu 
---
 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   1 +
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 545 
 5 files changed, 551 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c

diff --git a/.gitignore b/.gitignore
index 433f64f..704b222 100644
--- a/.gitignore
+++ b/.gitignore
@@ -54,6 +54,7 @@
 /module_block.h
 /scsi/qemu-pr-helper
 /vhost-user-scsi
+/vhost-user-blk
 /fsdev/virtfs-proxy-helper
 *.tmp
 *.[1-9]
diff --git a/Makefile b/Makefile
index d86ecd2..f021fc8 100644
--- a/Makefile
+++ b/Makefile
@@ -331,6 +331,7 @@ dummy := $(call unnest-vars,, \
 ivshmem-server-obj-y \
 libvhost-user-obj-y \
 vhost-user-scsi-obj-y \
+vhost-user-blk-obj-y \
 qga-vss-dll-obj-y \
 block-obj-y \
 block-obj-m \
@@ -562,6 +563,8 @@ ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) 
$(COMMON_LDADDS)
 endif
 vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
$(call LINK, $^)
+vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
+   $(call LINK, $^)
 
 module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
diff --git a/Makefile.objs b/Makefile.objs
index 285c6f3..ae9aef7 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -115,6 +115,7 @@ libvhost-user-obj-y = contrib/libvhost-user/
 vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
 vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
 vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
+vhost-user-blk-obj-y = contrib/vhost-user-blk/
 
 ##
 trace-events-subdirs =
diff --git a/contrib/vhost-user-blk/Makefile.objs 
b/contrib/vhost-user-blk/Makefile.objs
new file mode 100644
index 000..72e2cdc
--- /dev/null
+++ b/contrib/vhost-user-blk/Makefile.objs
@@ -0,0 +1 @@
+vhost-user-blk-obj-y = vhost-user-blk.o
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
new file mode 100644
index 000..0b889fb
--- /dev/null
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -0,0 +1,545 @@
+/*
+ * vhost-user-blk sample application
+ *
+ * Copyright (c) 2017 Intel Corporation. All rights reserved.
+ *
+ * Author:
+ *  Changpeng Liu 
+ *
+ * This work is based on the "vhost-user-scsi" sample and "virtio-blk" driver
+ * implementation by:
+ *  Felipe Franciosi 
+ *  Anthony Liguori 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 only.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "standard-headers/linux/virtio_blk.h"
+#include "contrib/libvhost-user/libvhost-user-glib.h"
+#include "contrib/libvhost-user/libvhost-user.h"
+
+#include 
+
+struct virtio_blk_inhdr {
+unsigned char status;
+};
+
+/* vhost user block device */
+typedef struct VubDev {
+VugDev parent;
+int blk_fd;
+struct virtio_blk_config blkcfg;
+char *blk_name;
+GMainLoop *loop;
+} VubDev;
+
+typedef struct VubReq {
+VuVirtqElement *elem;
+int64_t sector_num;
+size_t size;
+struct virtio_blk_inhdr *in;
+struct virtio_blk_outhdr *out;
+VubDev *vdev_blk;
+struct VuVirtq *vq;
+} VubReq;
+
+/* refer util/iov.c */
+static size_t vub_iov_size(const struct iovec *iov,
+  const unsigned int iov_cnt)
+{
+size_t len;
+unsigned int i;
+
+len = 0;
+for (i = 0; i < iov_cnt; i++) {
+len += iov[i].iov_len;
+}
+return len;
+}
+
+static void vub_panic_cb(VuDev *vu_dev, const char *buf)
+{
+VugDev *gdev;
+VubDev *vdev_blk;
+
+assert(vu_dev);
+
+gdev = container_of(vu_dev, VugDev, parent);
+vdev_blk = container_of(gdev, VubDev, parent);
+if (buf) {
+g_warning("vu_panic: %s", buf);
+}
+
+g_main_loop_quit(vdev_blk->loop);
+}
+
+static void vub_req_complete(VubReq *req)
+{
+VugDev *gdev = &req->vdev_blk->parent;
+VuDev *vu_dev = &gdev->parent;
+
+/* IO size with 1 extra status byte */
+vu_queue_push(vu_dev, req->vq, req->elem,
+  req->size + 1);
+vu_queue_notify(vu_dev, req->vq);
+
+if (req->elem) {
+free(req->elem);
+}
+
+g_free(req);
+}
+
+static int vub_open(const char *file_name, bool wce)
+{

[Qemu-devel] [PATCH v9 0/4] Introduce a new vhost-user-blk host device to QEMU

2018-01-03 Thread Changpeng Liu

Although the virtio-scsi specification was designed as a replacement for
virtio_blk, there are still many users of virtio_blk. QEMU 2.9 introduced
a new device, vhost-user-scsi, which can process I/O in user space for
virtio_scsi. This series introduces a new vhost-user block host device,
which supports virtio_blk in the guest OS with I/O processing in a
separate I/O target.

Due to a limitation of the virtio_blk specification, a virtio_blk device
cannot get block information such as capacity and block size via the
specification, so several new vhost-user messages were added to deliver
virtio config space information between QEMU and the I/O target. The
VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages are used to get/set
the config space from/to the I/O target, and the
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG slave message was added as the event
notifier for changes to the virtio config space. These messages can also
be used for vhost device live migration.

CHANGES:
v8-v9: Several optimizations and code cleanups according to the comments
on v8.
v7-v8: Instead of using an event file descriptor as the event notifier
when the virtio configuration space changes, use a new vhost-user slave
message to deliver such events. Several small optimizations to address
the comments on v7.
v6-v7: Change the parameters of the set configuration function so that it
only contains a valid data buffer.
v5-v6: Add header flags for the vhost-user master so that the slave can
know the purpose of set config; also, the payload of the vhost-user
get/set messages no longer contains invalid data buffers.
v4-v5: Add header offset and size for the virtio config space.
v3-v4: Refactor the vhost-user block example patch based on the new
libvhost-user library.
v2-v3: Add new vhost-user messages to get/set the virtio config space.

Changpeng Liu (4):
  vhost-user: add new vhost user messages to support virtio config space
  vhost-user-blk: introduce a new vhost-user-blk host device
  contrib/libvhost-user: enable virtio config space messages
  contrib/vhost-user-blk: introduce a vhost-user-blk sample application

 .gitignore  |   1 +
 Makefile|   3 +
 Makefile.objs   |   1 +
 contrib/libvhost-user/libvhost-user.c   |  42 +++
 contrib/libvhost-user/libvhost-user.h   |  33 ++
 contrib/vhost-user-blk/Makefile.objs|   1 +
 contrib/vhost-user-blk/vhost-user-blk.c | 545 
 default-configs/pci.mak |   1 +
 docs/interop/vhost-user.txt |  55 
 hw/block/Makefile.objs  |   3 +
 hw/block/vhost-user-blk.c   | 359 +
 hw/virtio/vhost-user.c  | 118 +++
 hw/virtio/vhost.c   |  32 ++
 hw/virtio/virtio-pci.c  |  55 
 hw/virtio/virtio-pci.h  |  18 ++
 include/hw/virtio/vhost-backend.h   |  12 +
 include/hw/virtio/vhost-user-blk.h  |  41 +++
 include/hw/virtio/vhost.h   |  15 +
 18 files changed, 1335 insertions(+)
 create mode 100644 contrib/vhost-user-blk/Makefile.objs
 create mode 100644 contrib/vhost-user-blk/vhost-user-blk.c
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

-- 
1.9.3

[Qemu-devel] [PATCH v9 1/4] vhost-user: add new vhost user messages to support virtio config space

2018-01-03 Thread Changpeng Liu

Add the VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages, which can
be used for live migration of vhost-user devices; vhost-user devices can
also use the messages to get/set the virtio config space from/to the
I/O target. To support virtio config space changes, the
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
for when the virtio config space changes in the slave I/O target.
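
The documentation added by this patch requires a slave to check the flags
field and to refuse SET_CONFIG on read-only fields unless the request is
for live migration. A minimal sketch of such a check follows; it assumes
a single writable window, and ToyConfigSpace and the toy_*/TOY_* names
are hypothetical (only the flag values 0 and 1 come from the patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_CONFIG_SIZE 256          /* as VHOST_USER_MAX_CONFIG_SIZE */
#define TOY_SET_CONFIG_TYPE_MASTER    0u /* guest wrote a writable field */
#define TOY_SET_CONFIG_TYPE_MIGRATION 1u /* restore on the destination */

/* Hypothetical slave-side state: config bytes plus one writable window. */
typedef struct ToyConfigSpace {
    uint8_t space[TOY_MAX_CONFIG_SIZE];
    uint32_t writable_off;
    uint32_t writable_len;
} ToyConfigSpace;

/* Apply a SET_CONFIG request; 0 on success, -1 if it must be rejected. */
static int toy_set_config(ToyConfigSpace *cs, const uint8_t *data,
                          uint32_t offset, uint32_t size, uint32_t flags)
{
    assert(cs != NULL && data != NULL);

    /* Bounds check written so that offset + size cannot overflow. */
    if (offset > TOY_MAX_CONFIG_SIZE ||
        size > TOY_MAX_CONFIG_SIZE - offset) {
        return -1;
    }
    if (flags == TOY_SET_CONFIG_TYPE_MASTER) {
        /* Ordinary writes may only touch the writable window. */
        if (offset < cs->writable_off ||
            offset + size > cs->writable_off + cs->writable_len) {
            return -1;
        }
    } else if (flags != TOY_SET_CONFIG_TYPE_MIGRATION) {
        return -1;                       /* unknown request type */
    }
    memcpy(cs->space + offset, data, size);
    return 0;
}
```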

Signed-off-by: Changpeng Liu 
---
 docs/interop/vhost-user.txt   |  55 ++
 hw/virtio/vhost-user.c| 118 ++
 hw/virtio/vhost.c |  32 +++
 include/hw/virtio/vhost-backend.h |  12 
 include/hw/virtio/vhost.h |  15 +
 5 files changed, 232 insertions(+)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 954771d..9a5cb6a 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -116,6 +116,19 @@ Depending on the request type, payload can be:
 - 3: IOTLB invalidate
 - 4: IOTLB access fail
 
+ * Virtio device config space
+   ---
+   | offset | size | flags | payload |
+   ---
+
+   Offset: a 32-bit offset of virtio device's configuration space
+   Size: a 32-bit configuration space access size in bytes
+   Flags: a 32-bit value:
+- 0: Vhost master messages used for writeable fields
+- 1: Vhost master messages used for live migration
+   Payload: Size bytes array holding the contents of the virtio
+   device's configuration space
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -129,6 +142,7 @@ typedef struct VhostUserMsg {
 VhostUserMemory memory;
 VhostUserLog log;
 struct vhost_iotlb_msg iotlb;
+VhostUserConfig config;
 };
 } QEMU_PACKED VhostUserMsg;
 
@@ -596,6 +610,32 @@ Master message types
   and expect this message once (per VQ) during device configuration
   (ie. before the master starts the VQ).
 
+ * VHOST_USER_GET_CONFIG
+
+  Id: 24
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+  Slave payload: virtio device config space
+
+  Submitted by the vhost-user master to fetch the contents of the virtio
+  device configuration space, vhost-user slave's payload size MUST match
+  master's request, vhost-user slave uses zero length of payload to
+  indicate an error to vhost-user master. The vhost-user master may
+  cache the contents to avoid repeated VHOST_USER_GET_CONFIG calls.
+
+* VHOST_USER_SET_CONFIG
+
+  Id: 25
+  Equivalent ioctl: N/A
+  Master payload: virtio device config space
+  Slave payload: N/A
+
+  Submitted by the vhost-user master when the Guest changes the virtio
+  device configuration space and also can be used for live migration
+  on the destination host. The vhost-user slave must check the flags
+  field, and slaves MUST NOT accept SET_CONFIG for read-only
+  configuration space fields unless the live migration bit is set.
+
 Slave message types
 ---
 
@@ -614,6 +654,21 @@ Slave message types
   This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
   has been successfully negotiated.
 
+* VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
+
+ Id: 2
+ Equivalent ioctl: N/A
+ Slave payload: N/A
+ Master payload: N/A
+
+ Vhost-user slave sends such messages to notify that the virtio device's
+ configuration space has changed, for those host devices which can support
+ such feature, host driver can send VHOST_USER_GET_CONFIG message to slave
+ to get the latest content. If VHOST_USER_PROTOCOL_F_REPLY_ACK is
+ negotiated, and slave set the VHOST_USER_NEED_REPLY flag, master must
+ respond with zero when operation is successfully completed, or non-zero
+ otherwise.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 ---
 The original vhost-user specification only demands replies for certain
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 093675e..8b94688 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -26,6 +26,11 @@
 #define VHOST_MEMORY_MAX_NREGIONS8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+/*
+ * Maximum size of virtio device config space
+ */
+#define VHOST_USER_MAX_CONFIG_SIZE 256
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -65,12 +70,15 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
 VHOST_USER_SET_VRING_ENDIAN = 23,
+VHOST_USER_GET_CONFIG = 24,
+VHOST_USER_SET_CONFIG = 25,
 VHOST_USER_MAX
 } VhostUserRequest;
 
 typedef enum VhostUserSlaveRequest {
 VHOST_USER_SLAVE_NONE = 0,
 VHOST_USER_SLAVE_IOTLB_MSG = 1,
+

Re: [Qemu-devel] MTTCG External Halt

2018-01-03 Thread Alistair Francis

On Wed, Jan 3, 2018 at 2:23 PM, Alistair Francis  wrote:
> On Wed, Jan 3, 2018 at 2:14 PM, Peter Maydell  
> wrote:
>> On 3 January 2018 at 22:10, Alistair Francis  wrote:
>>> Any chance any one has some insight into a way to externally set a
>>> vCPU as halted/un-halted?
>>
>> PSCI (where one vCPU can power off another) does this by
>> calling arm_set_cpu_off(). Does that (or some variation
>> on it) work?
>
> It seems to help with the assert(), but I still see CPU stalls.
>
> I also forgot to mention that we have a sev implementation, which also
> might be contributing.

I figured it out. We have the same thing for reset (a GPIO line can
reset the cores) and apparently resetting the same core twice in a row
was causing the assert(). Resetting the core twice was a bug, so I have
fixed that and I don't see the assert() any more. I'm still not sure
why that assert() was being hit after a reset and halt/un-halt though.

Thanks for your help Peter.

Alistair

>
> Alistair
>
>>
>> thanks
>> -- PMM

Re: [Qemu-devel] [PATCH v6 07/17] target/m68k: add chk and chk2

2018-01-03 Thread Richard Henderson

On 01/03/2018 03:40 PM, Laurent Vivier wrote:
>> Did you examine the real hw change to the other flags?
> 
> yes, C is modified, and the logic is:
>   C = 0 <= ub ? val < 0 || ub < val : val < 0 && ub < val;
> All other flags are not modified.
> 
> I'm going to update the patch to reflect the change of N and C by the
> real hardware.

Ok, thanks.  Adding a comment to note following hw over and above the spec
would be appreciated.


r~

Re: [Qemu-devel] [PATCH v1 15/21] RISC-V Spike Machines

2018-01-03 Thread Richard Henderson

On 01/02/2018 04:44 PM, Michael Clark wrote:
> +object_property_set_int(OBJECT(>soc), smp_cpus, "num-harts",
> +_abort);

Ah, right.  Nevermind my previous question.


r~

Re: [Qemu-devel] [PATCH v1 12/21] RISC-V HART Array

2018-01-03 Thread Richard Henderson

On 01/02/2018 04:44 PM, Michael Clark wrote:
> Holds the state of a heterogenous array of RISC-V hardware threads.

At the moment they are homogeneous, since they are all created from the same
cpu_model.  Is that the ultimate intent?

> +static Property riscv_harts_props[] = {
> +DEFINE_PROP_UINT32("num-harts", RISCVHartArrayState, num_harts, 1),
> +DEFINE_PROP_STRING("cpu-model", RISCVHartArrayState, cpu_model),
> +DEFINE_PROP_END_OF_LIST(),
> +};

How does num_harts interact with max_cpus and smp_cpus, and thus the related
command-line options?

r~

[Qemu-devel] [PATCH v2 2/2] hw/sd/pxa2xx_mmci: add read/write() trace events

2018-01-03 Thread Philippe Mathieu-Daudé

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
---
 hw/sd/pxa2xx_mmci.c | 77 ++---
 hw/sd/trace-events  |  4 +++
 2 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/hw/sd/pxa2xx_mmci.c b/hw/sd/pxa2xx_mmci.c
index 3deccf02c9..63223b797e 100644
--- a/hw/sd/pxa2xx_mmci.c
+++ b/hw/sd/pxa2xx_mmci.c
@@ -19,6 +19,7 @@
 #include "hw/qdev.h"
 #include "hw/qdev-properties.h"
 #include "qemu/error-report.h"
+#include "trace.h"
 
 #define TYPE_PXA2XX_MMCI "pxa2xx-mmci"
 #define PXA2XX_MMCI(obj) OBJECT_CHECK(PXA2xxMMCIState, (obj), TYPE_PXA2XX_MMCI)
@@ -278,45 +279,56 @@ static void pxa2xx_mmci_wakequeues(PXA2xxMMCIState *s)
 static uint64_t pxa2xx_mmci_read(void *opaque, hwaddr offset, unsigned size)
 {
 PXA2xxMMCIState *s = (PXA2xxMMCIState *) opaque;
-uint32_t ret;
+uint32_t ret = 0;
 
 switch (offset) {
 case MMC_STRPCL:
-return 0;
+break;
 case MMC_STAT:
-return s->status;
+ret = s->status;
+break;
 case MMC_CLKRT:
-return s->clkrt;
+ret = s->clkrt;
+break;
 case MMC_SPI:
-return s->spi;
+ret = s->spi;
+break;
 case MMC_CMDAT:
-return s->cmdat;
+ret = s->cmdat;
+break;
 case MMC_RESTO:
-return s->resp_tout;
+ret = s->resp_tout;
+break;
 case MMC_RDTO:
-return s->read_tout;
+ret = s->read_tout;
+break;
 case MMC_BLKLEN:
-return s->blklen;
+ret = s->blklen;
+break;
 case MMC_NUMBLK:
-return s->numblk;
+ret = s->numblk;
+break;
 case MMC_PRTBUF:
-return 0;
+break;
 case MMC_I_MASK:
-return s->intmask;
+ret = s->intmask;
+break;
 case MMC_I_REG:
-return s->intreq;
+ret = s->intreq;
+break;
 case MMC_CMD:
-return s->cmd | 0x40;
+ret = s->cmd | 0x40;
+break;
 case MMC_ARGH:
-return s->arg >> 16;
+ret = s->arg >> 16;
+break;
 case MMC_ARGL:
-return s->arg & 0x;
+ret = s->arg & 0x;
+break;
 case MMC_RES:
-if (s->resp_len < 9)
-return s->resp_fifo[s->resp_len ++];
-return 0;
+ret = (s->resp_len < 9) ? s->resp_fifo[s->resp_len++] : 0;
+break;
 case MMC_RXFIFO:
-ret = 0;
 while (size-- && s->rx_len) {
 ret |= s->rx_fifo[s->rx_start++] << (size << 3);
 s->rx_start &= 0x1f;
@@ -324,16 +336,20 @@ static uint64_t pxa2xx_mmci_read(void *opaque, hwaddr offset, unsigned size)
 }
 s->intreq &= ~INT_RXFIFO_REQ;
 pxa2xx_mmci_fifo_update(s);
-return ret;
+break;
 case MMC_RDWAIT:
-return 0;
+break;
 case MMC_BLKS_REM:
-return s->numblk;
+ret = s->numblk;
+break;
 default:
-hw_error("%s: Bad offset " REG_FMT "\n", __FUNCTION__, offset);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: incorrect register 0x%02" HWADDR_PRIx "\n",
+  __func__, offset);
 }
+trace_pxa2xx_mmci_read(size, offset, ret);
 
-return 0;
+return ret;
 }
 
 static void pxa2xx_mmci_write(void *opaque,
@@ -341,6 +357,7 @@ static void pxa2xx_mmci_write(void *opaque,
 {
 PXA2xxMMCIState *s = (PXA2xxMMCIState *) opaque;
 
+trace_pxa2xx_mmci_write(size, offset, value);
 switch (offset) {
 case MMC_STRPCL:
 if (value & STRPCL_STRT_CLK) {
@@ -368,8 +385,10 @@ static void pxa2xx_mmci_write(void *opaque,
 
 case MMC_SPI:
 s->spi = value & 0xf;
-if (value & SPI_SPI_MODE)
-printf("%s: attempted to use card in SPI mode\n", __FUNCTION__);
+if (value & SPI_SPI_MODE) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: attempted to use card in SPI mode\n", __func__);
+}
 break;
 
 case MMC_CMDAT:
@@ -442,7 +461,9 @@ static void pxa2xx_mmci_write(void *opaque,
 break;
 
 default:
-hw_error("%s: Bad offset " REG_FMT "\n", __FUNCTION__, offset);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: incorrect reg 0x%02" HWADDR_PRIx " "
+  "(value 0x%08" PRIx64 ")\n", __func__, offset, value);
 }
 }
 
diff --git a/hw/sd/trace-events b/hw/sd/trace-events
index 1fc0bcf44b..6eca3470e2 100644
--- a/hw/sd/trace-events
+++ b/hw/sd/trace-events
@@ -3,3 +3,7 @@
 # hw/sd/milkymist-memcard.c
 milkymist_memcard_memory_read(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x"
 milkymist_memcard_memory_write(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x"
+
+# hw/sd/pxa2xx_mmci.c
+pxa2xx_mmci_read(uint8_t size, uint32_t addr, uint32_t value) "size %d addr 0x%02x value 0x%08x"
+pxa2xx_mmci_write(uint8_t size, uint32_t addr,

Re: [Qemu-devel] [PATCH v1 11/21] RISC-V HTIF Console

2018-01-03 Thread Richard Henderson

On 01/02/2018 04:44 PM, Michael Clark wrote:
> +/*
> + * Find the static and dynamic symbol tables and their string
> + * tables in the the mapped binary. The sh_link field in symbol
> + * table section headers gives the section index of the string
> + * table for that symbol table.
> + */
> +shdr = (Elf64_Shdr *)(ep->maddr + ep->ehdr->e_shoff);

This fails to do any byte swapping such that this code works on a big-endian
host.  You should use the routines in "hw/elf_ops.h" as adjusted by 
"hw/loader.h".

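To illustrate why the raw struct cast is a problem on a big-endian
host: the fields of a little-endian ELF file have to be assembled byte
by byte (or swapped) rather than read through a host-endian pointer. The
helpers below are only a standalone sketch with made-up names
(ex_read_le16, ex_read_le64); they are not the "hw/elf_ops.h"
interface, which is what the patch should actually use.

```c
#include <stdint.h>

/* Read a 16-bit little-endian value regardless of host byte order. */
static uint16_t ex_read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 64-bit little-endian value regardless of host byte order. */
static uint64_t ex_read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];    /* most significant byte is stored last */
    }
    return v;
}
```

With something like this, a field such as e_shoff would be read from its
byte offset in the mapped image instead of through an Elf64_Ehdr
pointer.
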
r~

Re: [Qemu-devel] [PATCH v1 10/21] RISC-V Linux User Emulation

2018-01-03 Thread Richard Henderson

On 01/02/2018 04:44 PM, Michael Clark wrote:
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index 20f3d8c..178af56 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -1272,6 +1272,28 @@ static inline void init_thread(struct target_pt_regs 
> *regs,
>  
>  #endif /* TARGET_TILEGX */
>  
> +#ifdef TARGET_RISCV
> +
> +#define ELF_START_MMAP 0x8000

For riscv64 too?  Surely closer to ((TASK_SIZE / 3) * 2).

> diff --git a/linux-user/main.c b/linux-user/main.c
> index 71696ed..8900141 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -227,7 +227,7 @@ void cpu_loop(CPUX86State *env)
>  cpu_exec_end(cs);
>  process_queued_cpu_work(cs);
>  
> -switch(trapnr) {
> +switch (trapnr) {

Even though the formatting is wrong, don't change unrelated code.

> +case EXCP_DEBUG:
> +gdbstep:
> +signum = gdb_handlesig(cs, TARGET_SIGTRAP);
> +sigcode = TARGET_TRAP_BRKPT;
> +break;
> +default:
> +EXCP_DUMP(env, "\nqemu: unhandled CPU exception %#x - 
> aborting\n",
> + trapnr);
> +exit(EXIT_FAILURE);

You will need to handle the generic EXCP_ATOMIC as well.
Though of course you won't see that until you use tcg_gen_atomic_*.


r~

Re: [Qemu-devel] [PATCH v6 07/17] target/m68k: add chk and chk2

2018-01-03 Thread Laurent Vivier

Le 03/01/2018 à 22:52, Richard Henderson a écrit :
> On 01/02/2018 03:40 PM, Laurent Vivier wrote:
>> +void HELPER(chk)(CPUM68KState *env, int32_t val, int32_t ub)
>> +{
>> +if (val < 0 || val > ub) {
>> +CPUState *cs = CPU(m68k_env_get_cpu(env));
>> +
>> +/* Recover PC and CC_OP for the beginning of the insn.  */
>> +cpu_restore_state(cs, GETPC());
>> +
>> +/* Adjust PC and FLAGS to end of the insn.  */
>> +env->pc += 2;
>> +helper_flush_flags(env, env->cc_op);
>> +env->cc_n = val;
>> +
>> +cs->exception_index = EXCP_CHK;
>> +cpu_loop_exit(cs);
>> +}
>> +}
>> +
> 
> I thought you said for 68040, N is always unset for val >= 0.
> That would suggest
> 
>   helper_flush_flags(env, env->cc_op);
>   env->cc_n = val;
>   if (val < 0 || val > ub) {
> ...
>   }

OK, my thought was that it is better not to update the flag if it is not
needed (it should be undefined), but what you suggest is closer to the
real hardware so I will update it.

> Did you examine the real hw change to the other flags?

yes, C is modified, and the logic is:
  C = 0 <= ub ? val < 0 || ub < val : val < 0 && ub < val;
All other flags are not modified.

I'm going to update the patch to reflect the change of N and C by the
real hardware.
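
For reference, the flag behaviour described above can be written down
as plain C, independent of the QEMU helper. This is only a sketch of the
logic as stated in this mail; the function names are made up and the
trap condition is copied from the quoted helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* C flag as observed on real hardware, per the expression above. */
static bool ex_chk_carry(int32_t val, int32_t ub)
{
    return 0 <= ub ? (val < 0 || ub < val) : (val < 0 && ub < val);
}

/* N flag follows the sign of the checked value. */
static bool ex_chk_negative(int32_t val)
{
    return val < 0;
}

/* CHK raises the exception when val is outside [0, ub]. */
static bool ex_chk_traps(int32_t val, int32_t ub)
{
    return val < 0 || val > ub;
}
```

For a non-negative upper bound, C matches the trap condition; the two
only differ when ub itself is negative.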

> Because they're officially undefined, which suggests
> 
>   env->cc_n = val;
>   env->cc_op = CC_OP_LOGIC;
> 
>> +void HELPER(chk2)(CPUM68KState *env, int32_t val, int32_t lb, int32_t ub)
>> +{
>> +helper_flush_flags(env, env->cc_op);
>> +
>> +env->cc_z = val != lb && val != ub;
>> +env->cc_c = lb <= ub ? val < lb || val > ub : val > ub && val < lb;
>> +
>> +if (env->cc_c) {
>> +CPUState *cs = CPU(m68k_env_get_cpu(env));
>> +
>> +cpu_restore_state(cs, GETPC());
>> +env->cc_op = CC_OP_FLAGS;
> 
> A comment that we're reverting a change made during unwind would be helpful 
> here.

Ok

Thanks,
Laurent

Re: [Qemu-devel] [PATCH v1 05/21] RISC-V CPU Helpers

2018-01-03 Thread Richard Henderson

On 01/03/2018 02:59 PM, Michael Clark wrote:
> I see exit(1) called in quite a few of the other ports too. I was wondering at
> the time if there is a canonical error_abort API?

Yes, but they're wrong too.  Lots of that is old code in less maintained 
targets.

The only time errors should exit is when parsing options for startup.  Even
then new code should use qapi/error.h, propagating the error back to generic
code.  (This is where your canonical error_abort API is located.)

Once running, guest errors should continue as best as we can.  Either ignoring
the action or raising an exception are usually the right thing.  The guest --
and even more importantly a guest running without supervisor -- should not be
able to force the hypervisor to shutdown.

Asserting for logic errors that are fully within the hypervisor is permitted.
It should be taken as written that any such assertion actually triggering is a
bug to be fixed.

We prefer g_assert_not_reached() over assert(false) or abort() for protecting
code paths that should not be reachable.  I do not use the other g_assert*
functions myself, though other parts of qemu do.
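
As a sketch of the "continue as best we can" rule for a guest-controlled
value, here is how the SATP mode switch discussed elsewhere in this
thread could report failure instead of exiting. Everything prefixed
EX_/ex_ is hypothetical, the mode numbers are illustrative, and
fprintf() merely stands in for qemu_log_mask() so that the example
compiles on its own; the level/index/PTE-size values are the ones from
the patch.

```c
#include <stdio.h>

/* Illustrative mode numbers and result codes; not the patch's definitions. */
enum { EX_SV32 = 1, EX_SV39 = 8, EX_SV48 = 9, EX_SV57 = 10 };
enum { EX_TRANSLATE_SUCCESS = 0, EX_TRANSLATE_FAIL = 1 };

typedef struct {
    int levels;     /* number of page table levels */
    int ptidxbits;  /* index bits per level */
    int ptesize;    /* bytes per page table entry */
} ExPageTableShape;

static int ex_page_table_shape(int vm, ExPageTableShape *out)
{
    switch (vm) {
    case EX_SV32: *out = (ExPageTableShape){ 2, 10, 4 }; break;
    case EX_SV39: *out = (ExPageTableShape){ 3, 9, 8 }; break;
    case EX_SV48: *out = (ExPageTableShape){ 4, 9, 8 }; break;
    case EX_SV57: *out = (ExPageTableShape){ 5, 9, 8 }; break;
    default:
        /* Guest-controlled value: log and fail the translation, do not exit. */
        fprintf(stderr, "unsupported SATP mode %d\n", vm);
        return EX_TRANSLATE_FAIL;
    }
    return EX_TRANSLATE_SUCCESS;
}
```

A value that the guest cannot influence (such as access_type) is the
opposite case, and there an assertion in the default branch is fine.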

r~

Re: [Qemu-devel] [PATCH v1 21/21] RISC-V Build Infrastructure

2018-01-03 Thread Eric Blake

On 01/02/2018 06:44 PM, Michael Clark wrote:
> This adds RISC-V into the build system enabling the following targets:
> 
> - riscv32-softmmu
> - riscv64-softmmu
> - riscv32-linux-user
> - riscv64-linux-user
> 
> This adds defaults configs for RISC-V, enables the build for the RISC-V
> CPU core, hardware, and Linux User Emulation. The 'qemu-binfmt-conf.sh'
> script is updated to add the RISC-V ELF magic.
> 
> Expected checkpatch errors for consistency reasons:
> 
> ERROR: line over 90 characters
> FILE: scripts/qemu-binfmt-conf.sh
> Signed-off-by: Michael Clark 
> ---

> +++ b/qapi-schema.json
> @@ -413,7 +413,7 @@
>  # Since: 2.6
>  ##
>  { 'enum': 'CpuInfoArch',
> -  'data': ['x86', 'sparc', 'ppc', 'mips', 'tricore', 'other' ] }
> +  'data': ['x86', 'sparc', 'ppc', 'mips', 'tricore', 'riscv', 'other' ] }

Missing documentation that riscv was added in 2.12 (see QKeyCode in
qapi/ui.json for an enum that serves as an example of documenting
changes over time).


>  
>  ##
> +# @CpuInfoRISCV:
> +#
> +# Additional information about a virtual RISCV CPU
> +#
> +# @pc: the instruction pointer
> +#
> +# Since 2.8

2.12, actually.

> +##
> +{ 'struct': 'CpuInfoRISCV', 'data': { 'pc': 'int' } }

Should this be 'uint64' or other specific type, rather than the generic
'int' (which happens to be 64 bits, but signed)?  Other architectures
use 'int' because of history, but we could use this chance to improve
things if desired.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




[Qemu-devel] [Bug 1740219] Re: static linux-user ARM emulation has several-second startup time

2018-01-03 Thread LukeShu

To have a link to it from here, on the 28th I submitted a patchset to
fix this:
https://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg05237.html

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1740219

Title:
  static linux-user ARM emulation has several-second startup time

Status in QEMU:
  New

Bug description:
  static linux-user emulation has several-second startup time

  My problem: I'm a Parabola packager, and I'm updating our
  qemu-user-static package from 2.8 to 2.11.  With my new
  statically-linked 2.11, running `qemu-arm /my/arm-chroot/bin/true`
  went from taking 0.006s to 3s!  This does not happen with the normal
  dynamically linked 2.11, or the old static 2.8.

  What happens is it gets stuck in
  `linux-user/elfload.c:init_guest_space()`.  What `init_guest_space`
  does is map 2 parts of the address space: `[base, base+guest_size]`
  and `[base+0x, base+0x+page_size]`; where it must find
  an acceptable `base`.  Its strategy is to `mmap(NULL, guest_size,
  ...)` to decide where the first range is, and then check if that
  +0x is also available.  If it isn't, then it starts trying
  `mmap(base, ...)` for the entire address space from low-address to
  high-address.

  "Normally," it finds an acceptable `base` within the first 2 tries.
  With a static 2.11, it's taking thousands of tries.

  

  Now, from my understanding, there are 2 factors working together to
  cause that in static 2.11 but not the other builds:

   - 2.11 increased the default `guest_size` from 0xf700 to 0x
   - PIE (and thus ASLR) is disabled for static builds

  For some reason that I don't understand, with the smaller
  `guest_size` the initial `mmap(NULL, guest_size, ...)` usually
  returns an acceptable address range; but larger `guest_size` makes it
  consistently return a block of memory that butts right up against
  another already mapped chunk of memory.  This isn't just true on the
  older builds, it's true with the 2.11 builds if I use the `-R` flag to
  shrink the `guest_size` back down to 0xf700.  That is with
  linux-hardened 4.13.13 on x86-64.

  So then, it falls back to crawling the entire address space; so it
  tries base=0x1000.  With ASLR, that probably succeeds.  But with
  ASLR being disabled on static builds, the text segment is at
  0x6000; which does not leave room for the needed
  0x1000-size block before it.  So then it tries base=0x2000.
  And so on, more than 6000 times until it finally gets to and passes
  the text segment; calling mmap more than 12000 times.

  

  I'm not sure what the fix is.  Perhaps try to mmap a continuous chunk
  of size 0x1000, then munmap it and then mmap the 2 chunks that we
  actually need.  The disadvantage to that is that it does not support
  the sparse address space that the current algorithm supports for
  `guest_size < 0x`.  If `guest_size < 0x` *and* the big
  mmap fails, then it could fall back to a sparse search; though I'm not
  sure the current algorithm is a good choice for it, as we see in this
  bug.  Perhaps it should inspect /proc/self/maps to try to find a
  suitable range before ever calling mmap?
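
  The "mmap one continuous chunk, then munmap it" idea could be
  sketched roughly as below. This is not the submitted patchset, just
  an illustration under Linux assumptions; ex_probe_contiguous is a
  made-up name, and the caller would still have to re-map the two real
  ranges at the returned address and cope with the window between
  munmap and the re-map.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Ask the kernel for any free range of `size` bytes without committing
 * memory, then release it again and return where it was.
 * Returns NULL if no such range could be reserved.
 */
static void *ex_probe_contiguous(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    munmap(p, size);
    return p;   /* candidate base; not guaranteed to stay free */
}
```

  The point of the single large probe is that the kernel does the
  search once, instead of qemu walking the address space page by page.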

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1740219/+subscriptions

Re: [Qemu-devel] [PATCH v3] linux-user: Use *at functions instead of caching interp_prefix contents

2018-01-03 Thread Eric Blake

On 12/29/2017 12:45 PM, no-re...@patchew.org wrote:
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/1: linux-user: Use *at functions instead of caching 
> interp_prefix contents...
> ERROR: do not use assignment in if condition
> #25: FILE: linux-user/elfload.c:2206:
> +if (interp_dirfd < 0

Given that your compact if with embedded assignment makes the syntax
checker unhappy, should we look for a v4 that uses the construct we
debated in v2:

while (1) {
if (interp_dirfd > 0 && filename[0] == '/') {
fd = openat(interp_dirfd, filename + 1, O_RDONLY);
if (fd >= 0 || errno != ENOENT) {
break;
}
}
fd = open(filename, O_RDONLY);
break;
}


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [PATCH v1 09/21] RISC-V Physical Memory Protection

2018-01-03 Thread Richard Henderson

On 01/02/2018 04:44 PM, Michael Clark wrote:
> +#ifdef DEBUG_PMP
> +#define PMP_PRINTF(fmt, ...) \
> +do { fprintf(stderr, "pmp: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define PMP_PRINTF(fmt, ...) \
> +do {} while (0)
> +#endif

Debugging goes to qemu_log.

Rearrange this so that formatting is always compile-time checked.
E.g.

#define DEBUG_PMP 0
#define PMP_PRINTF(fmt, ...)  \
  do {\
if (DEBUG_PMP) {  \
  qemu_log("pmp: " fmt, ##__VA_ARGS__);   \
} \
  } while (0)

> 
> +static target_ulong pmp_get_napot_base_and_range(target_ulong reg,
> +target_ulong *range)
> +{
> +/* construct a mask of all bits bar the top bit */
> +target_ulong mask = 0u;
> +target_ulong base = reg;
> +target_ulong numbits = (sizeof(target_ulong) * 8u) + 2u;
> +mask = (mask - 1u) >> 1;
> +
> +while (mask) {
> +if ((reg & mask) == mask) {
> +/* this is the mask to use */
> +base = reg & ~mask;
> +break;
> +}
> +mask >>= 1;
> +numbits--;
> +}
> +
> +*range = (1lu << numbits) - 1u;
> +return base;
> +}

You can compute napot with ctz64(~reg).
More useless LU suffixes.
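
A minimal sketch of what the count-trailing-zeros approach looks like,
assuming the usual NAPOT encoding (t trailing one bits in pmpaddr
describe a region of 2^(t+3) bytes, and pmpaddr holds the address
shifted right by 2). The names ex_ctz64 and ex_napot_decode are made up,
and ex_ctz64 is a portable stand-in for QEMU's ctz64(); note that it
returns base and size in bytes, which is not the same convention as the
quoted function.

```c
#include <stdint.h>

/* Portable count-trailing-zeros; returns 64 for an input of zero. */
static unsigned ex_ctz64(uint64_t v)
{
    unsigned n = 0;
    if (v == 0) {
        return 64;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * Decode a NAPOT pmpaddr value into a byte base and byte size.
 * Returns 0 on success, -1 if the region size does not fit in 64 bits.
 */
static int ex_napot_decode(uint64_t reg, uint64_t *base, uint64_t *size)
{
    unsigned t = ex_ctz64(~reg);    /* number of trailing one bits in reg */

    if (t > 60) {
        return -1;
    }
    *size = UINT64_C(1) << (t + 3);
    *base = (reg & ~((UINT64_C(1) << t) - 1)) << 2;
    return 0;
}
```
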

> +if (pmp_index >= 1u) {
> +prev_addr = env->pmp_state.pmp[pmp_index].addr_reg;

pmp_index - 1

> 
> +for (i = 0; i < MAX_RISCV_PMPS; i++) {
> +const uint8_t a_field =
> +pmp_get_a_field(env->pmp_state.pmp[i].cfg_reg);
> +if (PMP_AMATCH_OFF != a_field) {
> +env->pmp_state.num_rules++;
> +}
> +}

Doesn't this mean that pmp_index ordering != pmp_state ordering?  Which would
mean that you'd be matching rules in the wrong order for the static priority.

> +static int pmp_is_in_range(CPURISCVState *env, int pmp_index, target_ulong 
> addr)
> +{
> +int result = 0;
> +
> +if ((addr >= env->pmp_state.addr[pmp_index].sa)
> +&& (addr < env->pmp_state.addr[pmp_index].ea)) {
> +result = 1;

Given how the range is computed in pmp_update_rule, surely <= ea.

> +s = pmp_is_in_range(env, i, addr);
> +e = pmp_is_in_range(env, i, addr + size);

Surely addr + size - 1.
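
Both off-by-one points come down to treating the range as inclusive at
both ends. A standalone sketch of a full-containment check, with a
made-up name and plain uint64_t instead of target_ulong (it does not
try to reproduce the partial-match handling of the patch):

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the access [addr, addr + size - 1] lies entirely within [sa, ea]. */
static bool ex_access_in_range(uint64_t sa, uint64_t ea,
                               uint64_t addr, uint64_t size)
{
    uint64_t last;

    if (size == 0) {
        return false;
    }
    last = addr + size - 1;     /* last byte touched by the access */
    if (last < addr) {
        return false;           /* the access wraps around the address space */
    }
    return addr >= sa && last <= ea;
}
```
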

> 
> +/* val &= 0x3ful; */
> +

Why is this commented out?  Surely that's exactly what the spec says.
Although it's easier to compare as

  val = extract64(val, 0, 54);
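
For anyone not familiar with it, extract64() pulls out a bit field of
the given length starting at the given bit. A standalone equivalent,
written from memory of the bitops helper and named ex_extract64 to make
clear it is only a sketch:

```c
#include <stdint.h>

/*
 * Return `length` bits of `value` starting at bit `start`.
 * Requires 0 <= start, 0 < length and start + length <= 64.
 */
static uint64_t ex_extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~UINT64_C(0) >> (64 - length));
}
```

So extract64(val, 0, 54) keeps the low 54 bits and clears the rest.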


r~

Re: [Qemu-devel] [PATCH v1 05/21] RISC-V CPU Helpers

2018-01-03 Thread Michael Clark

On Wed, Jan 3, 2018 at 8:12 PM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 01/02/2018 04:44 PM, Michael Clark wrote:
> > +target_ulong mode = env->priv;
> > +if (access_type != MMU_INST_FETCH) {
> > +if (get_field(env->mstatus, MSTATUS_MPRV)) {
> > +mode = get_field(env->mstatus, MSTATUS_MPP);
> > +}
> > +}
> > +if (env->priv_ver >= PRIV_VERSION_1_10_0) {
> > +if (get_field(env->satp, SATP_MODE) == VM_1_09_MBARE) {
> > +mode = PRV_M;
> > +}
> > +} else {
> > +if (get_field(env->mstatus, MSTATUS_VM) == VM_1_10_MBARE) {
> > +mode = PRV_M;
> > +}
> > +}
>
> This is replicating cpu_mmu_index.
> Therefore you should be relying on mmu_idx.
>
> > +/* check to make sure that mmu_idx and mode that we get matches */
> > +if (unlikely(mode != mmu_idx)) {
> > +fprintf(stderr, "MODE: mmu_idx mismatch\n");
> > +exit(1);
> > +}
>
> As in the opposite of this.


OK. cpu_mmu_index has already translated the mode into mmu_idx for us so we
can eliminate the redundant mode fetch, check and error message.

Essentially we should trust mmu_idx returned from cpu_mmu_index, so this
statement should never trigger.

Will include in the next spin.


> > +
> > +if (mode == PRV_M) {
> > +target_ulong msb_mask = /*0x7FFF; */
> > +(((target_ulong)2) << (TARGET_LONG_BITS - 1)) - 1;
> > +*physical = address & msb_mask;
>
> Or perhaps extract64(address, 0, TARGET_LONG_BITS - 1)?
>
> > +if (env->priv_ver >= PRIV_VERSION_1_10_0) {
> > +base = get_field(env->satp, SATP_PPN) << PGSHIFT;
> > +sum = get_field(env->mstatus, MSTATUS_SUM);
> > +vm = get_field(env->satp, SATP_MODE);
> > +switch (vm) {
> > +case VM_1_10_SV32:
> > +  levels = 2; ptidxbits = 10; ptesize = 4; break;
> > +case VM_1_10_SV39:
> > +  levels = 3; ptidxbits = 9; ptesize = 8; break;
> > +case VM_1_10_SV48:
> > +  levels = 4; ptidxbits = 9; ptesize = 8; break;
> > +case VM_1_10_SV57:
> > +  levels = 5; ptidxbits = 9; ptesize = 8; break;
> > +default:
> > +  printf("unsupported SATP_MODE value\n");
> > +  exit(1);
>
> Just qemu_log_mask with LOG_UNIMP or LOG_GUEST_ERROR, and then return
> TRANSLATE_FAIL.  Printing to stdout and exiting isn't kosher.  Lots more
> occurrences within this file.


Understood.  I had already converted several printfs to error_report.

I wasn't sure which logging API to use. grep -r error_report target also
shows quite a lot of uses.

I'll grep -r for printf and change to qemu_log_mask with appropriate level.

I see exit(1) called in quite a few of the other ports too. I was wondering
at the time if there is a canonical error_abort API?

Will try to improve things in the next spin.

> +static void raise_mmu_exception(CPURISCVState *env, target_ulong address,
> > +MMUAccessType access_type)
> > +{
> > +CPUState *cs = CPU(riscv_env_get_cpu(env));
> > +int page_fault_exceptions =
> > +(env->priv_ver >= PRIV_VERSION_1_10_0) &&
> > +get_field(env->satp, SATP_MODE) != VM_1_10_MBARE;
> > +int exception = 0;
> > +if (access_type == MMU_INST_FETCH) { /* inst access */
> > +exception = page_fault_exceptions ?
> > +RISCV_EXCP_INST_PAGE_FAULT : RISCV_EXCP_INST_ACCESS_FAULT;
> > +env->badaddr = address;
> > +} else if (access_type == MMU_DATA_STORE) { /* store access */
> > +exception = page_fault_exceptions ?
> > +RISCV_EXCP_STORE_PAGE_FAULT : RISCV_EXCP_STORE_AMO_ACCESS_
> FAULT;
> > +env->badaddr = address;
> > +} else if (access_type == MMU_DATA_LOAD) { /* load access */
> > +exception = page_fault_exceptions ?
> > +RISCV_EXCP_LOAD_PAGE_FAULT : RISCV_EXCP_LOAD_ACCESS_FAULT;
> > +env->badaddr = address;
> > +} else {
> > +fprintf(stderr, "FAIL: invalid access_type\n");
> > +exit(1);
>
> Switch with a default: g_assert_not_reached(), since access_type is not
> controlled by the guest.
>
> > +void riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
> > +   MMUAccessType access_type, int
> mmu_idx,
> > +   uintptr_t retaddr)
> > +{
> > +RISCVCPU *cpu = RISCV_CPU(cs);
> > +CPURISCVState *env = >env;
> > +if (access_type == MMU_INST_FETCH) {
> > +fprintf(stderr, "unaligned inst fetch not handled here. should
> not "
> > +"trigger\n");
> > +exit(1);
>
> No exit.  Do something logical.


Got it. Assertion.


> > +} else if (access_type == MMU_DATA_STORE) {
> > +cs->exception_index = RISCV_EXCP_STORE_AMO_ADDR_MIS;
> > +env->badaddr = addr;
>
> Why does STORE imply AMO?  Why can't a normal store trigger an unaligned
> trap?


It's STORE or AMO.

Re: [Qemu-devel] [PATCH] osdep: Retry SETLK upon EINTR

2018-01-03 Thread Eric Blake

On 12/26/2017 12:53 AM, Fam Zheng wrote:
> We could hit lock failure if there is a signal that makes fcntl return
> -1 and errno set to EINTR. In this case we should retry.

Did you hit this in practice?  In 'man fcntl' on my Fedora 27 box, the
DESCRIPTION section only mentions EINTR as possible for F_[OFD_]SETLKW,
but we don't appear to be using that one (just SETLK and GETLK).  On the
other hand, the ERRORS section of the same document mentions:

   EINTR  cmd is F_SETLKW or F_OFD_SETLKW and the operation was
          interrupted by a signal; see signal(7).

   EINTR  cmd is F_GETLK, F_SETLK, F_OFD_GETLK, or F_OFD_SETLK, and the
          operation was interrupted by a signal before the lock was
          checked or acquired.  Most likely when locking a remote file
          (e.g., locking over NFS), but can sometimes happen locally.

(I hate it when information differs between two places in the same
document, especially if I only read the first place)

> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Fam Zheng 
> ---
>  util/osdep.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/util/osdep.c b/util/osdep.c
> index 1231f9f876..a73de0e1ba 100644
> --- a/util/osdep.c
> +++ b/util/osdep.c
> @@ -244,7 +244,9 @@ static int qemu_lock_fcntl(int fd, int64_t start, int64_t 
> len, int fl_type)
>  .l_type   = fl_type,
>  };
>  qemu_probe_lock_ops();
> -ret = fcntl(fd, fcntl_op_setlk, );
> +do {
> +ret = fcntl(fd, fcntl_op_setlk, );
> +} while (ret == -1 && errno == EINTR);

The change makes sense from a maintenance point of view, whether or not
you hit it in practice.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[Qemu-devel] [PATCH v2 1/2] hw/timer/pxa2xx_timer: replace hw_error() -> qemu_log_mask()

2018-01-03 Thread Philippe Mathieu-Daudé

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
---
 hw/timer/pxa2xx_timer.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/timer/pxa2xx_timer.c b/hw/timer/pxa2xx_timer.c
index 68ba5a70b3..a489bf5159 100644
--- a/hw/timer/pxa2xx_timer.c
+++ b/hw/timer/pxa2xx_timer.c
@@ -13,6 +13,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/arm/pxa.h"
 #include "hw/sysbus.h"
+#include "qemu/log.h"
 
 #define OSMR0  0x00
 #define OSMR1  0x04
@@ -252,8 +253,14 @@ static uint64_t pxa2xx_timer_read(void *opaque, hwaddr offset,
 case OSNR:
 return s->snapshot;
 default:
+qemu_log_mask(LOG_UNIMP,
+  "%s: unknown register 0x%02" HWADDR_PRIx "\n",
+  __func__, offset);
+break;
 badreg:
-hw_error("pxa2xx_timer_read: Bad offset " REG_FMT "\n", offset);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: incorrect register 0x%02" HWADDR_PRIx "\n",
+  __func__, offset);
 }
 
 return 0;
@@ -377,8 +384,14 @@ static void pxa2xx_timer_write(void *opaque, hwaddr offset,
 }
 break;
 default:
+qemu_log_mask(LOG_UNIMP,
+  "%s: unknown register 0x%02" HWADDR_PRIx " "
+  "(value 0x%08" PRIx64 ")\n",  __func__, offset, value);
+break;
 badreg:
-hw_error("pxa2xx_timer_write: Bad offset " REG_FMT "\n", offset);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: incorrect register 0x%02" HWADDR_PRIx " "
+  "(value 0x%08" PRIx64 ")\n", __func__, offset, value);
 }
 }
 
-- 
2.15.1

[Qemu-devel] [PATCH v2 0/2] pxa2xx_timer: ignore incorrect registers access to use U-Boot

2018-01-03 Thread Philippe Mathieu-Daudé

since v1:
- fixed qemu_log_mask() lines indentation (Alistair)
- added Alistair's R-b

Tiny patches that allow booting a Gumstix Connex board and using U-Boot.

Using https://wiki.gumstix.com/index.php/Making_qemu_images#Connex
the Linux kernel also boots, but crashes when entering userland:

$ arm-softmmu/qemu-system-arm -M connex -nographic -pflash cflash.img
pxa2xx_clkcfg_write: CPU frequency change attempt
pxa2xx_timer_write: incorrect reg 0xd8 (value 0x00c9)
pxa2xx_timer_write: incorrect reg 0x98 (value 0x0001)
pxa2xx_timer_write: incorrect reg 0x58 (value 0x0001)


U-Boot 1.2.0 (May 10 2008 - 21:17:19) - PXA270@400 MHz - 1604

*** Welcome to Gumstix ***

DRAM:  256 MB
Flash: 16 MB
Using default environment

Hit any key to stop autoboot:  0 
Instruction Cache is ON
Copying kernel to 0xa200 from 0x00f0 (length 0x0010)...done
## Booting image at a200 ...
   Image Name:   Angstrom/2.6.21/gumstix-custom-c
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:1041252 Bytes = 1016.8 kB
   Load Address: a0008000
   Entry Point:  a0008000
OK

Starting kernel ...

Linux version 2.6.21 (otto@otto) (gcc version 4.1.2) #1 PREEMPT Mon May 12 
14:33:32 PDT 2008
CPU: XScale-PXA255 [69052d00] revision 0 (ARMv5TE), cr=7977
Machine: The Gumstix Platform
Memory policy: ECC disabled, Data cache writeback
Memory clock: 0.00MHz (*0)
Run Mode clock: 0.00MHz (*0)
Turbo Mode clock: 0.00MHz (*2.0, active)
CPU0: D VIVT write-back cache
CPU0: I cache: 16384 bytes, associativity 64, 32 byte lines, 8 sets
CPU0: D cache: 16384 bytes, associativity 64, 32 byte lines, 8 sets
Built 1 zonelists.  Total pages: 65024
Kernel command line: console=ttyS0,115200n8 root=1f01 rootfstype=jffs2 
reboot=cold,hard
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour dummy device 80x30
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 256MB = 256MB total
Memory: 257536KB available (1884K code, 191K data, 144K init)
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
NET: Registered protocol family 16
Time: pxa_timer clocksource has been installed.
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
JFFS2 version 2.2. (NAND) (SUMMARY)  (C) 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler cfq registered (default)
Console: switching to colour frame buffer device 80x24
pxa2xx-uart.0: ttyS0 at MMIO 0x4010 (irq = 15) is a FFUART
pxa2xx-uart.1: ttyS1 at MMIO 0x4020 (irq = 14) is a BTUART
pxa2xx-uart.2: ttyS2 at MMIO 0x4070 (irq = 13) is a STUART
pxa2xx-uart.3: ttyS3 at MMIO 0x4160 (irq = 0) is a HWUART
Probing Gumstix Flash ROM at physical address 0x (16-bit bankwidth)
Gumstix Flash ROM: Found 1 x16 devices at 0x0 in 16-bit bank
 Intel/Sharp Extended Query Table at 0x0031
Using buffer write method
Using static partitions on Gumstix Flash ROM
Creating 3 MTD partitions on "Gumstix Flash ROM":
0x-0x0004 : "Bootloader"
0x0004-0x00f0 : "RootFS"
0x00f0-0x0100 : "Kernel"
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
XScale DSP coprocessor detected.
VFS: Mounted root (jffs2 filesystem).
Freeing init memory: 144K
INIT: version 2.86 booting
qemu-system-arm: Trying to execute code outside RAM or ROM at 0x000618e8
This usually means one of the following happened:

(1) You told QEMU to execute a kernel for the wrong machine type, and it 
crashed on startup (eg trying to run a raspberry pi kernel on a versatilepb 
QEMU machine)
(2) You didn't give QEMU a kernel or BIOS filename at all, and QEMU 
executed a ROM full of no-op instructions until it fell off the end
(3) Your guest kernel has a bug and crashed by jumping off into nowhere

This is almost always one of the first two, so check your command line and 
that you are using the right type of kernel for this machine.
If you think option (3) is likely then you can try debugging your guest 
with the -d debug options; in particular -d guest_errors will cause the log to 
include a dump of the guest register state at this point.

Execution cannot continue; stopping here.

qemu: fatal: Trying to execute code outside RAM or ROM at 0x000618e8
R00= R01=be9acd04 R02=000bd818 R03=000b1d78
R04=000bd838 R05=000bd80c R06=0001 R07=000bda88
R08= R09=

Re: [Qemu-devel] [PATCH 1/2] hw/timer/pxa2xx_timer: replace hw_error() -> qemu_log_mask()

2018-01-03 Thread Philippe Mathieu-Daudé

On 01/03/2018 06:53 PM, Alistair Francis wrote:
> On Wed, Jan 3, 2018 at 8:41 AM, Philippe Mathieu-Daudé  
> wrote:
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  hw/timer/pxa2xx_timer.c | 13 +++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/timer/pxa2xx_timer.c b/hw/timer/pxa2xx_timer.c
>> index 68ba5a70b3..cfea0a5e22 100644
>> --- a/hw/timer/pxa2xx_timer.c
>> +++ b/hw/timer/pxa2xx_timer.c
>> @@ -13,6 +13,7 @@
>>  #include "sysemu/sysemu.h"
>>  #include "hw/arm/pxa.h"
>>  #include "hw/sysbus.h"
>> +#include "qemu/log.h"
>>
>>  #define OSMR0  0x00
>>  #define OSMR1  0x04
>> @@ -252,8 +253,12 @@ static uint64_t pxa2xx_timer_read(void *opaque, hwaddr 
>> offset,
>>  case OSNR:
>>  return s->snapshot;
>>  default:
>> +qemu_log_mask(LOG_UNIMP, "%s: unknown reg 0x%02" HWADDR_PRIx
>> +  "\n", __func__, offset);
>> +break;
>>  badreg:
>> -hw_error("pxa2xx_timer_read: Bad offset " REG_FMT "\n", offset);
>> +qemu_log_mask(LOG_GUEST_ERROR, "%s: incorrect reg 0x%02" HWADDR_PRIx
>> +  "\n", __func__, offset);
> 
> It might just be my email display, but if these lines don't line up
> can you fix them?

My guess is your email display is correct but my eyes are tired :S

> 
> Reviewed-by: Alistair Francis 

Thanks!

> 
> Alistair
> 
>>  }
>>
>>  return 0;
>> @@ -377,8 +382,12 @@ static void pxa2xx_timer_write(void *opaque, hwaddr 
>> offset,
>>  }
>>  break;
>>  default:
>> +qemu_log_mask(LOG_UNIMP, "%s: unknown reg 0x%02" HWADDR_PRIx " "
>> +  "(value 0x%08" PRIx64 ")\n", __func__, offset, value);
>> +break;
>>  badreg:
>> -hw_error("pxa2xx_timer_write: Bad offset " REG_FMT "\n", offset);
>> +qemu_log_mask(LOG_GUEST_ERROR, "%s: incorrect reg 0x%02" 
>> HWADDR_PRIx " "
>> +  "(value 0x%08" PRIx64 ")\n", __func__, offset, value);
>>  }
>>  }
>>
>> --
>> 2.15.1
>>
>>

Re: [Qemu-devel] [PATCH 2/2] hw/sd/pxa2xx_mmci: add read/write() trace events

2018-01-03 Thread Philippe Mathieu-Daudé

On 01/03/2018 06:54 PM, Alistair Francis wrote:
> On Wed, Jan 3, 2018 at 8:41 AM, Philippe Mathieu-Daudé  
> wrote:
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  hw/sd/pxa2xx_mmci.c | 63 
>> ++---
>>  hw/sd/trace-events  |  4 
>>  2 files changed, 44 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/sd/pxa2xx_mmci.c b/hw/sd/pxa2xx_mmci.c
>> index 3deccf02c9..0759a0d2eb 100644
>> --- a/hw/sd/pxa2xx_mmci.c
>> +++ b/hw/sd/pxa2xx_mmci.c
>> @@ -19,6 +19,7 @@
>>  #include "hw/qdev.h"
>>  #include "hw/qdev-properties.h"
>>  #include "qemu/error-report.h"
>> +#include "trace.h"
>>
>>  #define TYPE_PXA2XX_MMCI "pxa2xx-mmci"
>>  #define PXA2XX_MMCI(obj) OBJECT_CHECK(PXA2xxMMCIState, (obj), 
>> TYPE_PXA2XX_MMCI)
>> @@ -278,43 +279,55 @@ static void pxa2xx_mmci_wakequeues(PXA2xxMMCIState *s)
>>  static uint64_t pxa2xx_mmci_read(void *opaque, hwaddr offset, unsigned size)
>>  {
>>  PXA2xxMMCIState *s = (PXA2xxMMCIState *) opaque;
>> -uint32_t ret;
>> +uint32_t ret = 0;
>>
>>  switch (offset) {
>>  case MMC_STRPCL:
>> -return 0;
>> +break;
>>  case MMC_STAT:
>> -return s->status;
>> +ret = s->status;
>> +break;
>>  case MMC_CLKRT:
>> -return s->clkrt;
>> +ret = s->clkrt;
>> +break;
>>  case MMC_SPI:
>> -return s->spi;
>> +ret = s->spi;
>> +break;
>>  case MMC_CMDAT:
>> -return s->cmdat;
>> +ret = s->cmdat;
>> +break;
>>  case MMC_RESTO:
>> -return s->resp_tout;
>> +ret = s->resp_tout;
>> +break;
>>  case MMC_RDTO:
>> -return s->read_tout;
>> +ret = s->read_tout;
>> +break;
>>  case MMC_BLKLEN:
>> -return s->blklen;
>> +ret = s->blklen;
>> +break;
>>  case MMC_NUMBLK:
>> -return s->numblk;
>> +ret = s->numblk;
>> +break;
>>  case MMC_PRTBUF:
>> -return 0;
>> +break;
>>  case MMC_I_MASK:
>> -return s->intmask;
>> +ret = s->intmask;
>> +break;
>>  case MMC_I_REG:
>> -return s->intreq;
>> +ret = s->intreq;
>> +break;
>>  case MMC_CMD:
>> -return s->cmd | 0x40;
>> +ret = s->cmd | 0x40;
>> +break;
>>  case MMC_ARGH:
>> -return s->arg >> 16;
>> +ret = s->arg >> 16;
>> +break;
>>  case MMC_ARGL:
>> -return s->arg & 0x;
>> +ret = s->arg & 0x;
>> +break;
>>  case MMC_RES:
>> -if (s->resp_len < 9)
>> -return s->resp_fifo[s->resp_len ++];
>> -return 0;
>> +ret = (s->resp_len < 9) ? s->resp_fifo[s->resp_len++] : 0;
>> +break;
>>  case MMC_RXFIFO:
>>  ret = 0;
>>  while (size-- && s->rx_len) {
>> @@ -324,16 +337,19 @@ static uint64_t pxa2xx_mmci_read(void *opaque, hwaddr 
>> offset, unsigned size)
>>  }
>>  s->intreq &= ~INT_RXFIFO_REQ;
>>  pxa2xx_mmci_fifo_update(s);
>> -return ret;
>> +break;
>> +ret = ret;

Oops...

>>  case MMC_RDWAIT:
>> -return 0;
>> +break;
>>  case MMC_BLKS_REM:
>> -return s->numblk;
>> +ret = s->numblk;
>> +break;
>>  default:
>>  hw_error("%s: Bad offset " REG_FMT "\n", __FUNCTION__, offset);
> 
> Maybe worth removing this as well?

Indeed!

> 
> Either way:
> 
> Reviewed-by: Alistair Francis 

Thanks :)

> 
> Alistair
> 
>>  }
>> +trace_pxa2xx_mmci_read(size, offset, ret);
>>
>> -return 0;
>> +return ret;
>>  }
>>
>>  static void pxa2xx_mmci_write(void *opaque,
>> @@ -341,6 +357,7 @@ static void pxa2xx_mmci_write(void *opaque,
>>  {
>>  PXA2xxMMCIState *s = (PXA2xxMMCIState *) opaque;
>>
>> +trace_pxa2xx_mmci_write(size, offset, value);
>>  switch (offset) {
>>  case MMC_STRPCL:
>>  if (value & STRPCL_STRT_CLK) {
>> diff --git a/hw/sd/trace-events b/hw/sd/trace-events
>> index 1fc0bcf44b..6eca3470e2 100644
>> --- a/hw/sd/trace-events
>> +++ b/hw/sd/trace-events
>> @@ -3,3 +3,7 @@
>>  # hw/sd/milkymist-memcard.c
>>  milkymist_memcard_memory_read(uint32_t addr, uint32_t value) "addr 0x%08x 
>> value 0x%08x"
>>  milkymist_memcard_memory_write(uint32_t addr, uint32_t value) "addr 0x%08x 
>> value 0x%08x"
>> +
>> +# hw/sd/pxa2xx_mmci.c
>> +pxa2xx_mmci_read(uint8_t size, uint32_t addr, uint32_t value) "size %d addr 
>> 0x%02x value 0x%08x"
>> +pxa2xx_mmci_write(uint8_t size, uint32_t addr, uint32_t value) "size %d 
>> addr 0x%02x value 0x%08x"
>> --
>> 2.15.1
>>
>>

Re: [Qemu-devel] [PATCH v1 03/21] RISC-V CPU Core Definition

2018-01-03 Thread Michael Clark

On Wed, Jan 3, 2018 at 6:21 PM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 01/02/2018 04:44 PM, Michael Clark wrote:
> > +#ifdef CONFIG_USER_ONLY
> > +static bool riscv_cpu_has_work(CPUState *cs)
> > +{
> > +return 0;
> > +}
> > +#else
> > +static bool riscv_cpu_has_work(CPUState *cs)
> > +{
> > +return cs->interrupt_request & CPU_INTERRUPT_HARD;
> > +}
> > +#endif
>
> There's no need to conditionalize this.


Got it. Will be in the next spin.


> > +static void riscv_cpu_reset(CPUState *cs)
> > +{
> > +RISCVCPU *cpu = RISCV_CPU(cs);
> > +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
> > +CPURISCVState *env = &cpu->env;
> > +
> > +mcc->parent_reset(cs);
> > +#ifndef CONFIG_USER_ONLY
> > +tlb_flush(cs);
>
> Flush is now generic.  Remove it from here.


OK.

> > +static void riscv_cpu_realize(DeviceState *dev, Error **errp)
> > +{
> > +CPUState *cs = CPU(dev);
> > +RISCVCPU *cpu = RISCV_CPU(dev);
> > +RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
> > +CPURISCVState *env = &cpu->env;
> > +Error *local_err = NULL;
> > +
> > +cpu_exec_realizefn(cs, &local_err);
> > +if (local_err != NULL) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> > +
> > +if (env->misa & RVM) {
> > +set_feature(env, RISCV_FEATURE_RVM);
> > +}
>
> What's the point of replicating this information?
>

This is inherited code. I noticed this too. In this version they are
actually in sync with each other, which they weren't several weeks ago :-D

It may well be that the features flags pre-date the addition of the 'misa'
register in the privilege spec.

This will take a bit of re-work as a reasonable amount of code uses the
FEATURE flags vs misa.

Are you happy for this to be a pending work item? I don't like it either
and eventually want to fix, and already did some work to sync it with
'misa', but it's not a critical issue.

> > +static void cpu_register(const RISCVCPUInfo *info)
> > +{
> > +TypeInfo type_info = {
> > +.name = g_strdup(info->name),
> > +.parent = TYPE_RISCV_CPU,
> > +.instance_size = sizeof(RISCVCPU),
> > +.instance_init = info->initfn,
> > +};
> > +
> > +type_register(&type_info);
> > +g_free((void *)type_info.name);
> > +}
>
> I think type_register does its own strdup; you don't need to do your own.


Got it.


> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > new file mode 100644
> > index 000..0480127
> > --- /dev/null
> > +++ b/target/riscv/cpu.h
> > @@ -0,0 +1,363 @@
> > +#ifndef RISCV_CPU_H
>
> Header comment and license?
>
> > +#define TARGET_HAS_ICE 1
>
> What's this for?


It's redundant. Inherited code. Looks like it came from nios2. Will remove.


> > +#define RV(x) (1L << (x - 'A'))
>
> L is useless since the type of long is variable.  Either U or ULL.
>
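
On the RV(x) comment above, a minimal sketch of what a fixed-width variant
could look like. It follows the "U or ULL" suggestion and also parenthesizes
the macro argument. The helper misa_has_ext() is a hypothetical name used
only for illustration; it is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only: shift an unsigned long long so the result does not
 * depend on the host's width of 'long', and parenthesize the argument.
 */
#define RV(x) (1ULL << ((x) - 'A'))

/* Hypothetical helper: nonzero if extension letter 'ext' is set in misa. */
static inline int misa_has_ext(uint64_t misa, char ext)
{
    assert(ext >= 'A' && ext <= 'Z');
    return (misa & RV(ext)) != 0;
}
```

With this form RV('M') is bit 12 on every host, and a check such as
misa_has_ext(env->misa, 'M') would read the extension straight from misa.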
> > +typedef struct CPURISCVState CPURISCVState;
> > +
> > +#include "pmp.h"
> > +
> > +typedef struct CPURISCVState {
>
> Duplicate typedef.
>

Got it.


> > +target_ulong gpr[32];
> > +uint64_t fpr[32]; /* assume both F and D extensions */
> > +target_ulong pc;
> > +target_ulong load_res;
> > +
> > +target_ulong frm;
> > +target_ulong fstatus;
> > +target_ulong fflags;
> > +
> > +target_ulong badaddr;
> > +
> > +uint32_t mucounteren;
> > +
> > +target_ulong user_ver;
> > +target_ulong priv_ver;
> > +target_ulong misa_mask;
> > +target_ulong misa;
> > +
> > +#ifdef CONFIG_USER_ONLY
> > +uint32_t amoinsn;
> > +target_long amoaddr;
> > +target_long amotest;
> > +#else
> > +target_ulong priv;
> > +
> > +target_ulong mhartid;
> > +target_ulong mstatus;
> > +target_ulong mip;
> > +target_ulong mie;
> > +target_ulong mideleg;
> > +
> > +target_ulong sptbr;  /* until: priv-1.9.1 */
> > +target_ulong satp;   /* since: priv-1.10.0 */
> > +target_ulong sbadaddr;
> > +target_ulong mbadaddr;
> > +target_ulong medeleg;
> > +
> > +target_ulong stvec;
> > +target_ulong sepc;
> > +target_ulong scause;
> > +
> > +target_ulong mtvec;
> > +target_ulong mepc;
> > +target_ulong mcause;
> > +target_ulong mtval;  /* since: priv-1.10.0 */
> > +
> > +uint32_t mscounteren;
> > +target_ulong scounteren; /* since: priv-1.10.0 */
> > +target_ulong mcounteren; /* since: priv-1.10.0 */
> > +
> > +target_ulong sscratch;
> > +target_ulong mscratch;
> > +
> > +/* temporary htif regs */
> > +uint64_t mfromhost;
> > +uint64_t mtohost;
> > +uint64_t timecmp;
> > +
> > +/* physical memory protection */
> > +pmp_table_t pmp_state;
> > +#endif
> > +
> > +float_status fp_status;
> > +
> > +/* Internal CPU feature flags. */
> > +uint64_t features;
> > +
> > +/* QEMU */
> > +CPU_COMMON
> > +
> > +/* Fields from here on are preserved across CPU reset. */
> > +void *irq[8];
> > +QEMUTimer *timer; /* Internal timer */
>
> FWIW, other

Re: [Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-03 Thread Juan Quintela

"Dr. David Alan Gilbert (git)"  wrote:
> From: "Dr. David Alan Gilbert" 
>
> This code has an optimised, word aligned version, and a boring
> unaligned version. My commit f70d345 fixed one alignment issue, but
> there's another.
>
> The optimised version operates on 'longs' dealing with (typically) 64
> pages at a time, replacing the whole long by a 0 and counting the bits.
> If the RAMBlock is less than 64 bits in length that long can contain bits
> representing two different RAMBlocks, but the code will update the
> bmap belonging to the 1st RAMBlock only while having updated the total
> dirty page count for both.
>
> This probably didn't matter prior to 6b6712ef which split the dirty
> bitmap by RAMBlock, but now they're separate RAMBlocks we end up
> with a count that doesn't match the state in the bitmaps.
>
> Symptom:
>   Migration showing a few dirty pages left to be sent constantly
>   Seen on aarch64 and x86 with x86+ovmf
>
> Signed-off-by: Dr. David Alan Gilbert 
> Reported-by: Wei Huang 
> Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec

Reviewed-by: Juan Quintela
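
The miscount described in the commit message can be illustrated with a toy
model. This is not the QEMU code: the function names, the single 64-bit
source word and the per-block destination word are all assumptions made for
the sketch. It only shows why counting a whole word over-counts when a block
owns fewer than 64 of its bits.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits; portable stand-in for a popcount builtin. */
static unsigned count_bits(uint64_t w)
{
    unsigned n = 0;
    while (w) {
        w &= w - 1;             /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Mask covering 'npages' bits starting at bit 'start' of one word. */
static uint64_t block_mask(unsigned start, unsigned npages)
{
    assert(npages > 0 && start + npages <= 64);
    return (npages == 64 ? ~0ULL : ((1ULL << npages) - 1)) << start;
}

/*
 * Word-at-a-time model: the whole source word is taken and cleared and
 * all of its bits are counted, but only the block's own bits reach 'dest'.
 */
static unsigned sync_whole_word(uint64_t *src, uint64_t *dest,
                                unsigned start, unsigned npages)
{
    uint64_t word = *src;

    *src = 0;
    *dest |= (word & block_mask(start, npages)) >> start;
    return count_bits(word);    /* includes the neighbouring block's bits */
}

/* Masked model: only the block's own bits are taken, cleared and counted. */
static unsigned sync_masked(uint64_t *src, uint64_t *dest,
                            unsigned start, unsigned npages)
{
    uint64_t mask = block_mask(start, npages);
    uint64_t bits = *src & mask;

    *src &= ~mask;
    *dest |= bits >> start;
    return count_bits(bits);
}
```

In the whole-word model the neighbouring block's dirty bits are cleared and
counted but never recorded anywhere, so the count and the bitmaps disagree,
which matches the "few dirty pages left" symptom reported above.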

Re: [Qemu-devel] MTTCG External Halt

2018-01-03 Thread Alistair Francis

On Wed, Jan 3, 2018 at 2:14 PM, Peter Maydell  wrote:
> On 3 January 2018 at 22:10, Alistair Francis  wrote:
>> Any chance anyone has some insight into a way to externally set a
>> vCPU as halted/un-halted?
>
> PSCI (where one vCPU can power off another) does this by
> calling arm_set_cpu_off(). Does that (or some variation
> on it) work?

It seems to help with the assert(), but I still see CPU stalls.

I also forgot to mention that we have a sev implementation, which also
might be contributing.

Alistair

>
> thanks
> -- PMM

Re: [Qemu-devel] [PATCH v2] iotests: Test creating overlay when guest running

2018-01-03 Thread Eric Blake

On 12/24/2017 08:51 PM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> 
> ---
> 
> v2: Actually test the thing. [Kevin]
> ---
>  tests/qemu-iotests/153 | 8 +---
>  tests/qemu-iotests/153.out | 7 ---
>  2 files changed, 9 insertions(+), 6 deletions(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
> index fa25eb24bd..adfd02695b 100755
> --- a/tests/qemu-iotests/153
> +++ b/tests/qemu-iotests/153
> @@ -32,6 +32,7 @@ _cleanup()
>  {
>  _cleanup_test_img
>  rm -f "${TEST_IMG}.base"
> +rm -f "${TEST_IMG}.overlay"

Trivial conflict with Jeff's work to do per-test temporary directories
in iotests.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-devel] [RFC PATCH v2 0/4] sdbus: testing sdcards

2018-01-03 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180103214925.16677-1-f4...@amsat.org
Subject: [Qemu-devel] [RFC PATCH v2 0/4] sdbus: testing sdcards

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]patchew/20180102234108.32713-1-laur...@vivier.eu -> 
patchew/20180102234108.32713-1-laur...@vivier.eu
 t [tag update]patchew/20180103162400.10396-1-f4...@amsat.org -> 
patchew/20180103162400.10396-1-f4...@amsat.org
 t [tag update]patchew/20180103164117.11850-1-f4...@amsat.org -> 
patchew/20180103164117.11850-1-f4...@amsat.org
Switched to a new branch 'test'
2fea93749f tests: add some sdcard qtest
0b1bb02ca7 libqos: implement sdbus QMP driver
fab57dc594 libqos: add a sdbus API
5059e2a76e sdbus: add a QMP command to access a SDBus

=== OUTPUT BEGIN ===
Checking PATCH 1/4: sdbus: add a QMP command to access a SDBus...
Checking PATCH 2/4: libqos: add a sdbus API...
ERROR: do not use C99 // comments
#91: FILE: tests/libqos/sdbus.c:61:
+// TODO check rv?

total: 1 errors, 0 warnings, 126 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 3/4: libqos: implement sdbus QMP driver...
WARNING: line over 80 characters
#97: FILE: tests/libqos/sdbus-qmp.c:66:
+static ssize_t qmp_mmc_do_cmd(SDBusAdapter *adapter, enum NCmd cmd, uint32_t 
arg,

ERROR: do not use C99 // comments
#111: FILE: tests/libqos/sdbus-qmp.c:80:
+//QDECREF(response);

total: 1 errors, 1 warnings, 138 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 4/4: tests: add some sdcard qtest...
ERROR: do not use C99 // comments
#70: FILE: tests/sdbus-test.c:32:
+//[PROTO_MMC] = "vexpress-a9",

ERROR: do not use C99 // comments
#71: FILE: tests/sdbus-test.c:33:
+//[PROTO_SPI] = "lm3s6965evb"

ERROR: do not use C99 // comments
#75: FILE: tests/sdbus-test.c:37:
+//512 * M_BYTE,

ERROR: do not use C99 // comments
#76: FILE: tests/sdbus-test.c:38:
+//1 * G_BYTE,

ERROR: do not use C99 // comments
#78: FILE: tests/sdbus-test.c:40:
+//64 * G_BYTE,

ERROR: do not use C99 // comments
#96: FILE: tests/sdbus-test.c:58:
+//g_assert_cmpuint(sz, ==, 0);

ERROR: space prohibited between function name and open parenthesis '('
#104: FILE: tests/sdbus-test.c:66:
+g_assert_cmpmem ([3], 5, "QEMU!", 5);

ERROR: do not use C99 // comments
#138: FILE: tests/sdbus-test.c:100:
+// TODO 8x: sdcard_read_data len 512

ERROR: do not use C99 // comments
#140: FILE: tests/sdbus-test.c:102:
+//sz = sdbus_do_acmd(mmc, SEND_STATUS, 0, rca, );

ERROR: do not use C99 // comments
#141: FILE: tests/sdbus-test.c:103:
+//g_free(response);

WARNING: line over 80 characters
#180: FILE: tests/sdbus-test.c:142:
+path = g_strdup_printf("sdcard/%s/%lu", proto_name[iproto], 
sizes[isize]);

ERROR: do not use C99 // comments
#183: FILE: tests/sdbus-test.c:145:
+// g_free(test)?

total: 11 errors, 1 warnings, 165 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] MTTCG External Halt

2018-01-03 Thread Peter Maydell

On 3 January 2018 at 22:10, Alistair Francis  wrote:
> Any chance anyone has some insight into a way to externally set a
> vCPU as halted/un-halted?

PSCI (where one vCPU can power off another) does this by
calling arm_set_cpu_off(). Does that (or some variation
on it) work?

thanks
-- PMM

Re: [Qemu-devel] [PATCH v1 04/21] RISC-V Disassembler

2018-01-03 Thread Michael Clark

On Wed, Jan 3, 2018 at 6:30 PM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 01/02/2018 04:44 PM, Michael Clark wrote:
> > +static const char *rv_ireg_name_sym[] = {
> > +"zero", "ra",   "sp",   "gp",   "tp",   "t0",   "t1",   "t2",
> > +"s0",   "s1",   "a0",   "a1",   "a2",   "a3",   "a4",   "a5",
> > +"a6",   "a7",   "s2",   "s3",   "s4",   "s5",   "s6",   "s7",
> > +"s8",   "s9",   "s10",  "s11",  "t3",   "t4",   "t5",   "t6",
> > +NULL
> > +};
>
> static const char * const
>

OK.


> But maybe even better as
>
> static const char rv_ireg_name_sym[32][4]
>

Got it, but it would need to be [32][5] to make room for the NUL
terminator on "zero".
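
A minimal sketch of the fixed-size table, assuming the [32][5] sizing
discussed here. The accessor rv_ireg_name() is a hypothetical name used only
for illustration. With [32][4], C would still accept "zero" as an
initializer but would drop its terminator, so the name could not safely be
used as a string.

```c
#include <assert.h>
#include <string.h>

/* [32][5]: the longest name, "zero", needs 4 characters plus '\0'. */
static const char rv_ireg_name_sym[32][5] = {
    "zero", "ra",   "sp",   "gp",   "tp",   "t0",   "t1",   "t2",
    "s0",   "s1",   "a0",   "a1",   "a2",   "a3",   "a4",   "a5",
    "a6",   "a7",   "s2",   "s3",   "s4",   "s5",   "s6",   "s7",
    "s8",   "s9",   "s10",  "s11",  "t3",   "t4",   "t5",   "t6",
};

/* Hypothetical accessor: symbolic name of integer register 'reg'. */
static inline const char *rv_ireg_name(unsigned reg)
{
    assert(reg < 32);
    return rv_ireg_name_sym[reg];
}
```

The table is a flat 160 bytes of constant data with no pointer array, and
the trailing NULL sentinel is no longer needed because the size is fixed.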


> and without the useless NULL.
>

Yes, they are redundant.


> Otherwise,
>
> Reviewed-by: Richard Henderson 
>

Thanks.

These changes will be in the next spin of the patchset.

[Qemu-devel] MTTCG External Halt

2018-01-03 Thread Alistair Francis

Hey guys, I'm super stuck with an ugly MTTCG issue and was wondering
if anyone had any ideas.

In the Xilinx fork of QEMU (based on 2.11) we have a way for CPUs to
halt other CPUs. This is used for example when the power control unit
halts the ARM A53s. To do this we have internal GPIO signals that end
up calling a function that basically does this:

To halt:
cpu->halted = true;
cpu_interrupt(cpu, CPU_INTERRUPT_HALT);

To un-halt:
cpu->halted = false;
cpu_reset_interrupt(cpu, CPU_INTERRUPT_HALT);

We also have the standard ARM WFI (Wait For Interrupt) implementation
in op_helper.c:
cs->halted = 1;
cs->exception_index = EXCP_HLT;
cpu_loop_exit(cs);

Before MTTCG this used to work great, but now either we end up with
the guest Linux complaining about CPU stalls or we hit:
ERROR:/scratch/alistai/master-qemu/cpus.c:1516:qemu_tcg_cpu_thread_fn:
assertion failed: (cpu->halted)

If I remove the instances of manually setting cpu->halted then I don't
see the assertion failures, but then the WFI instruction doesn't work correctly.
So it seems like setting the halted status externally from the CPU
causes the issue. I have tried setting it inside a lock, using atomic
operations and running the setter async on the CPU, but nothing works.

Any chance anyone has some insight into a way to externally set a
vCPU as halted/un-halted?

Thanks,
Alistair
