[Qemu-devel] [PATCH v2 4/4] target-tricore: Added new JNE instruction variant

2016-05-30 Thread peer . adelt
From: Peer Adelt 

If D[15] is != sign_ext(const4) then PC will be set to (PC +
zero_ext(disp4 + 16)).
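
As a rough illustration of the semantics described above, the branch decision can be modelled in plain C. This is only a sketch under stated assumptions: the field positions (disp4 in bits [11:8], const4 in bits [15:12]), the scaling of the displacement by 2 (halfword units), and the helper names are assumptions made for this example, not code taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SBC field layout: disp4 in bits [11:8], const4 in bits [15:12]. */
static uint32_t sbc_disp4(uint16_t opcode)
{
    return (opcode >> 8) & 0xf;
}

/* Sign-extend the 4-bit constant to 32 bits. */
static int32_t sbc_const4_sext(uint16_t opcode)
{
    int32_t v = (opcode >> 12) & 0xf;
    return (v ^ 0x8) - 0x8;
}

/*
 * Hypothetical model of the new JNE variant: returns the next PC.
 * The displacement is assumed to count halfwords, hence the "* 2".
 */
static uint32_t jne16_next_pc(uint32_t pc, uint32_t d15, uint16_t opcode)
{
    assert((pc & 1) == 0); /* instructions are halfword aligned */
    if ((int32_t)d15 != sbc_const4_sext(opcode)) {
        return pc + (sbc_disp4(opcode) + 16) * 2; /* branch taken */
    }
    return pc + 2; /* fall through past the 16-bit instruction */
}
```

Adding 16 to disp4 extends the reach of the short form to displacements 16..31, which the plain 4-bit displacement cannot encode.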

Signed-off-by: Peer Adelt 
---
 target-tricore/translate.c   | 11 +++
 target-tricore/tricore-opcodes.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 960ee33..21732f8 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -3363,6 +3363,7 @@ static void gen_compute_branch(DisasContext *ctx, 
uint32_t opc, int r1,
 gen_branch_condi(ctx, TCG_COND_EQ, cpu_gpr_d[15], constant, offset);
 break;
 case OPC1_16_SBC_JNE:
+case OPC1_16_SBC_JNE16:
 gen_branch_condi(ctx, TCG_COND_NE, cpu_gpr_d[15], constant, offset);
 break;
 /* SBRN-format jumps */
@@ -4097,6 +4098,16 @@ static void decode_16Bit_opc(CPUTriCoreState *env, 
DisasContext *ctx)
 const16 = MASK_OP_SBC_CONST4_SEXT(ctx->opcode);
 gen_compute_branch(ctx, op1, 0, 0, const16, address);
 break;
+case OPC1_16_SBC_JEQ16:
+case OPC1_16_SBC_JNE16:
+if (tricore_feature(env, TRICORE_FEATURE_16)) {
+address = MASK_OP_SBC_DISP4(ctx->opcode);
+const16 = MASK_OP_SBC_CONST4_SEXT(ctx->opcode);
+gen_compute_branch(ctx, op1, 0, 0, const16, address + 16);
+} else {
+generate_trap(ctx, TRAPC_INSN_ERR, TIN2_IOPC);
+}
+break;
 /* SBRN-format */
 case OPC1_16_SBRN_JNZ_T:
 case OPC1_16_SBRN_JZ_T:
diff --git a/target-tricore/tricore-opcodes.h b/target-tricore/tricore-opcodes.h
index 2f25613..7925354 100644
--- a/target-tricore/tricore-opcodes.h
+++ b/target-tricore/tricore-opcodes.h
@@ -318,6 +318,7 @@ enum {
 OPC1_16_SBR_JLEZ = 0x8e,
 OPC1_16_SBR_JLTZ = 0x0e,
 OPC1_16_SBC_JNE  = 0x5e,
+OPC1_16_SBC_JNE16= 0xde,
 OPC1_16_SBR_JNE  = 0x7e,
 OPC1_16_SB_JNZ   = 0xee,
 OPC1_16_SBR_JNZ  = 0xf6,
-- 
2.7.4




[Qemu-devel] [PATCH v2 3/4] target-tricore: Added new MOV instruction variant

2016-05-30 Thread peer . adelt
From: Peer Adelt 

Puts the content of data register D[a] into E[c][63:32] and the
content of data register D[b] into E[c][31:0].
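
The described behaviour can be sketched in plain C as follows. This is a minimal model, assuming that E[c] is the register pair D[c+1]:D[c] with c even; the function name and the flat register array are hypothetical and only serve the example. Both sources are read before either destination is written, so the result is well defined even when a or b overlaps the destination pair.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the extended MOV: E[c][63:32] = D[a],
 * E[c][31:0] = D[b], where E[c] is assumed to be D[c+1]:D[c].
 */
static void mov_ext(uint32_t d[16], unsigned c, unsigned a, unsigned b)
{
    assert(c % 2 == 0 && c < 16 && a < 16 && b < 16);
    uint32_t hi = d[a]; /* read both sources first ... */
    uint32_t lo = d[b];
    d[c + 1] = hi;      /* ... then write the pair */
    d[c] = lo;
}
```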

Signed-off-by: Peer Adelt 
---
 target-tricore/translate.c   | 8 
 target-tricore/tricore-opcodes.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index e66b433..960ee33 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -6224,6 +6224,14 @@ static void decode_rr_accumulator(CPUTriCoreState *env, 
DisasContext *ctx)
 case OPC2_32_RR_MOV:
 tcg_gen_mov_tl(cpu_gpr_d[r3], cpu_gpr_d[r2]);
 break;
+case OPC2_32_RR_MOV_EXT:
+if (tricore_feature(env, TRICORE_FEATURE_16)) {
+tcg_gen_mov_tl(cpu_gpr_d[r3], cpu_gpr_d[r1]);
+tcg_gen_mov_tl(cpu_gpr_d[(r3+1)], cpu_gpr_d[r2]);
+} else {
+generate_trap(ctx, TRAPC_INSN_ERR, TIN2_IOPC);
+}
+break;
 case OPC2_32_RR_NE:
 tcg_gen_setcond_tl(TCG_COND_NE, cpu_gpr_d[r3], cpu_gpr_d[r1],
cpu_gpr_d[r2]);
diff --git a/target-tricore/tricore-opcodes.h b/target-tricore/tricore-opcodes.h
index df666b0..2f25613 100644
--- a/target-tricore/tricore-opcodes.h
+++ b/target-tricore/tricore-opcodes.h
@@ -1062,6 +1062,7 @@ enum {
 OPC2_32_RR_MIN_H = 0x78,
 OPC2_32_RR_MIN_HU= 0x79,
 OPC2_32_RR_MOV   = 0x1f,
+OPC2_32_RR_MOV_EXT   = 0x81,
 OPC2_32_RR_NE= 0x11,
 OPC2_32_RR_OR_EQ = 0x27,
 OPC2_32_RR_OR_GE = 0x2b,
-- 
2.7.4




Re: [Qemu-devel] [PATCH v3] linux-user: Fix qemu-binfmt-conf.h to store config across reboot

2016-05-30 Thread Riku Voipio
On 27 May 2016 at 16:56, Alexander Graf  wrote:
> On 05/25/2016 05:51 PM, Laurent Vivier wrote:
>>
>>
>> On 25/02/2016 at 17:28, Laurent Vivier wrote:
>>>
>>> Please, Alex, Michael:
>>>
>>> We need your ack/review.
>>
>> Someone? :)

> It's definitely an improvement over today's situation.

> Reviewed-by: Alexander Graf 

Applied to linux-user queue, thanks

Riku



Re: [Qemu-devel] [PATCH] scsi: esp: check TI buffer index before read/write

2016-05-30 Thread P J P
  Hello Peter,

+-- On Mon, 30 May 2016, Peter Maydell wrote --+
| > +} else if (s->ti_rptr < TI_BUFSZ) {
| >  s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
| > +} else {
| > +trace_esp_error_fifo_overrun();
| 
| Isn't this an underrun, not an overrun?

  OOB read occurs when 's->ti_rptr' exceeds 'TI_BUFSZ'.
 
| In any case, something weird seems to be going on here:
| it looks like the amount of data in the fifo should be
| constrained by ti_size (which we're already checking), so
| when can we get into a situation where ti_rptr can
| get beyond the buffer size? It seems to me that we should
| fix whatever that underlying bug is, and then have an
| assert() on ti_rptr here.
|
| > -} else if (s->ti_size == TI_BUFSZ - 1) {
| > +} else if (s->ti_wptr == TI_BUFSZ - 1) {
| >  trace_esp_error_fifo_overrun();
|
| Similarly, this looks odd -- the ti_size check should be
| sufficient if the rest of the code is correctly managing
| the ti_size/ti_wptr/ti_rptr values.

  Both issues occur because the guest can control the value of 's->ti_size' by 
alternating between 'esp_reg_write(, ESP_FIFO, )' & 'esp_reg_read(, ESP_FIFO)' 
calls. One increases 's->ti_size' and the other decreases it.
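
For illustration, the index checks under discussion can be sketched on a toy FIFO. This is a minimal model, not the real ESPState: the struct, the function names, and the buffer size of 16 are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TI_BUFSZ 16 /* assumed buffer size for this sketch */

struct toy_fifo {
    int32_t ti_size;   /* byte counter, tracked separately from the indexes */
    uint32_t ti_rptr;
    uint32_t ti_wptr;
    uint8_t ti_buf[TI_BUFSZ];
};

/* Returns 0 on success, -1 if the write index is already out of range. */
static int toy_fifo_write(struct toy_fifo *f, uint8_t val)
{
    if (f->ti_wptr >= TI_BUFSZ) {
        return -1; /* would write past ti_buf */
    }
    f->ti_buf[f->ti_wptr++] = val;
    f->ti_size++;
    return 0;
}

/* Returns 0 on success, -1 if empty or the read index is out of range. */
static int toy_fifo_read(struct toy_fifo *f, uint8_t *out)
{
    if (f->ti_size <= 0 || f->ti_rptr >= TI_BUFSZ) {
        return -1; /* checking ti_size alone would not catch a bad ti_rptr */
    }
    *out = f->ti_buf[f->ti_rptr++];
    f->ti_size--;
    return 0;
}
```

The point of the sketch is only that a counter and an index which can drift apart need separate bounds checks; whether the drift itself should be fixed instead is the question Peter raises.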

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F



Re: [Qemu-devel] [PATCH 15/33] docs: update ACPI CPU hotplug spec with new protocol

2016-05-30 Thread Michael S. Tsirkin
On Tue, May 17, 2016 at 04:43:07PM +0200, Igor Mammedov wrote:
> Signed-off-by: Igor Mammedov 
> ---
>  docs/specs/acpi_cpu_hotplug.txt | 88 +++--
>  1 file changed, 76 insertions(+), 12 deletions(-)
> 
> diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
> index 340b751..c5bce6a 100644
> --- a/docs/specs/acpi_cpu_hotplug.txt
> +++ b/docs/specs/acpi_cpu_hotplug.txt
> @@ -4,21 +4,85 @@ QEMU<->ACPI BIOS CPU hotplug interface
>  QEMU supports CPU hotplug via ACPI. This document
>  describes the interface between QEMU and the ACPI BIOS.
>  
> -ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
> --
> -
> -Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
> -hot-add/remove event to ACPI BIOS, via SCI interrupt.
> +ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
> +and hot-remove events.
>  
> +
> +Legacy ACPI CPU hotplug interface registers:
> +
>  CPU present bitmap for:
> +  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
>ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
>PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
>  ---
> -One bit per CPU. Bit position reflects corresponding CPU APIC ID.
> -Read-only.
> +QEMU sets corresponding CPU bit on hot-add event and issues SCI
> +with GPE.2 event set. CPU present map read by ACPI BIOS GPE.2 handler
> +to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
> +
> +=
> +ACPI CPU hotplug interface registers:
> +-
> +Register block base address:
> +ICH9-LPC IO port 0x0cd8
> +PIIX-PM  IO port 0xaf00

OK but this means we either use legacy or new one,
not both, which is problematic for people using old seabios
without acpi loading support and with -M pc.

I don't say we must support them with >255 CPUs,
but I do say we should make an effort for simple
setups with <255 CPUs.


> +Register block size:
> +ACPI_CPU_HOTPLUG_REG_LEN = 12
> +
> +read access:

So this implies acpi must scan all cpus on each event, and
this seems too aggressive.

I think we need something hierarchical where
you read one level and know which cpus to probe.
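
To make the concern concrete, here is a rough sketch of the flat scan that the selector-based layout quoted in this patch seems to imply. The toy_cpuhp struct and the helper names are hypothetical stand-ins for the I/O port accesses; only the bit meanings (bit 0 enabled, bit 1 insert event, bit 2 remove event, write 1 to clear) are taken from the spec text.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 4 /* arbitrary size for the toy model */

enum {
    STS_ENABLED = 1 << 0,
    STS_INSERT  = 1 << 1,
    STS_REMOVE  = 1 << 2,
};

/* Toy stand-in for the register block: one selector, per-CPU status. */
struct toy_cpuhp {
    uint32_t selector;
    uint8_t status[NCPUS];
};

static void cpuhp_select(struct toy_cpuhp *d, uint32_t idx)
{
    d->selector = idx;
}

static uint8_t cpuhp_read_status(const struct toy_cpuhp *d)
{
    return d->selector < NCPUS ? d->status[d->selector] : 0;
}

/* Event bits are write-1-to-clear in this model. */
static void cpuhp_write_control(struct toy_cpuhp *d, uint8_t val)
{
    if (d->selector < NCPUS) {
        d->status[d->selector] &= ~(val & (STS_INSERT | STS_REMOVE));
    }
}

/* Flat scan: every CPU is selected and probed on each event. */
static int cpuhp_scan(struct toy_cpuhp *d)
{
    int handled = 0;
    for (uint32_t i = 0; i < NCPUS; i++) {
        cpuhp_select(d, i);
        uint8_t sts = cpuhp_read_status(d);
        if ((sts & STS_ENABLED) && (sts & STS_INSERT)) {
            cpuhp_write_control(d, STS_INSERT);
            handled++;
        }
    }
    return handled;
}
```

The loop cost grows with the number of possible CPUs rather than with the number of pending events, which is what a hierarchical layout would avoid.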


> +offset:
> +[0x0-0x3] reserved
> +[0x4] CPU device status fields: (1 byte access)
> +bits:
> +   0: Device is enabled and may be used by guest
> +   1: Device insert event, used to distinguish device for which
> +  no device check event to OSPM was issued.
> +  It's valid only when bit 0 is set.
> +   2: Device remove event, used to distinguish device for which
> +  no device eject request to OSPM was issued.
> +   3-7: reserved and should be ignored by OSPM
> +[0x5-0x7] reserved
> +[0x8] Command data: (DWORD access)
> +  in case of error or unsupported command reads is 0x
> +  current 'Command field' value:
> +  0: returns PXM value corresponding to device
> +
> +write access:
> +offset:
> +[0x0-0x3] CPU selector: (DWORD access)
> +  selects active CPU device. All following accesses to other
> +  registers will read/store data from/to selected CPU.

I've been thinking - is it time to add an EmbeddedControl interface?
This way we have another namespace.
Not insisting on it, just an idea.

> +[0x4] CPU device control fields: (1 byte access)
> +bits:
> +0: reserved, OSPM must clear it before writing to register.
> +1: if set to 1 clears device insert event, set by OSPM
> +   after it has emitted device check event for the
> +   selected CPU device
> +2: if set to 1 clears device remove event, set by OSPM
> +   after it has emitted device eject request for the
> +   selected CPU device
> +3: if set to 1 initiates device eject, set by OSPM when it
> +   triggers CPU device removal and calls _EJ0 method
> +4-7: reserved, OSPM must clear them before writing to register
> +[0x5] Command field: (1 byte access)
> +  value:
> +0: following reads from 'Command data' register returns PXM
> +   value of device
> +1: following writes to 'Command data' register set OST event
> +   register in QEMU
> +2: following writes to 'Command data' register set OST status
> +   register in QEMU
> +other values: reserved
> +[0x6-0x7] reserved
> +[0x8] Command data: (DWORD access)
> +  current 'Command field' value:
> +  1: stores value into OST event register
> +  2: stores value into OST status 

Re: [Qemu-devel] [PATCH V3] tap: vhost busy polling support

2016-05-30 Thread Jason Wang



On 2016-05-31 12:19, Michael S. Tsirkin wrote:

On Tue, May 31, 2016 at 11:04:18AM +0800, Jason Wang wrote:


On 2016-05-31 02:07, Michael S. Tsirkin wrote:

On Thu, Apr 07, 2016 at 12:56:24PM +0800, Jason Wang wrote:

This patch adds the capability of basic vhost net busy polling, which is
supported by recent kernels. Users can configure the maximum number of
us that may be spent on busy polling through a new tap property,
"vhost-poll-us".

I applied this but now I had a thought - should we generalize this to
"poll-us"? Down the road tun could support busy polling just like
sockets do.

These look like two different things. Socket busy polling depends on the value set by
sysctl or SO_BUSY_POLL, which should be transparent to qemu.

This is what I am saying.  qemu can set SO_BUSY_POLL if poll-us is specified,
can it not?


With CAP_NET_ADMIN, it can. Without it, it can only decrease the value.
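
A minimal sketch of what setting the per-socket value looks like from user space, assuming a Linux host where SO_BUSY_POLL is available. The helper name is made up for this example, and nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/*
 * Request up to `usecs` microseconds of busy polling on socket `fd`.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_busy_poll_us(int fd, int usecs)
{
    assert(usecs >= 0);
#ifdef SO_BUSY_POLL
    return setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL, &usecs, sizeof(usecs));
#else
    /* Option not known on this platform. */
    (void)fd;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

Per the point above, an unprivileged caller that tries to raise the value should expect the call to fail, so a caller would need to treat failure as "feature unavailable" rather than as a fatal error.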


On the one hand this suggests a more generic name
for the option.


I see, but there are some differences:

- socket busy polling only polls for rx, while vhost busy polling polls for
both tx and rx.
- vhost busy polling does not depend on socket busy polling; it can
work with socket busy polling disabled.



On the other how does user discover whether it's
implemented for tap without vhost? Do we require kernel support?


I believe this could be done by management through ioctl probing.


Does an unblocking read with tap without vhost also speed
things up a bit?


The kernel will try one more round of napi poll, so we only get a speedup
if that poll actually picks up some packets.





Signed-off-by: Jason Wang 
---
  hw/net/vhost_net.c|  2 +-
  hw/scsi/vhost-scsi.c  |  2 +-
  hw/virtio/vhost-backend.c |  8 
  hw/virtio/vhost.c | 40 ++-
  include/hw/virtio/vhost-backend.h |  3 +++
  include/hw/virtio/vhost.h |  3 ++-
  include/net/vhost_net.h   |  1 +
  net/tap.c | 10 --
  net/vhost-user.c  |  1 +
  qapi-schema.json  |  6 +-
  qemu-options.hx   |  3 +++
  11 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6e1032f..1840c73 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -166,7 +166,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
  }
  r = vhost_dev_init(&net->dev, options->opaque,
-   options->backend_type);
+   options->backend_type, options->busyloop_timeout);
  if (r < 0) {
  goto fail;
  }
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9261d51..2a00f2f 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
**errp)
  s->dev.backend_features = 0;
  ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
- VHOST_BACKEND_TYPE_KERNEL);
+ VHOST_BACKEND_TYPE_KERNEL, 0);
  if (ret < 0) {
  error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
 strerror(-ret));
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index b358902..d62372e 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -138,6 +138,12 @@ static int vhost_kernel_set_vring_call(struct vhost_dev 
*dev,
  return vhost_kernel_call(dev, VHOST_SET_VRING_CALL, file);
  }
+static int vhost_kernel_set_vring_busyloop_timeout(struct vhost_dev *dev,
+   struct vhost_vring_state *s)
+{
+return vhost_kernel_call(dev, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, s);
+}
+
  static int vhost_kernel_set_features(struct vhost_dev *dev,
   uint64_t features)
  {
@@ -185,6 +191,8 @@ static const VhostOps kernel_ops = {
  .vhost_get_vring_base = vhost_kernel_get_vring_base,
  .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
  .vhost_set_vring_call = vhost_kernel_set_vring_call,
+.vhost_set_vring_busyloop_timeout =
+vhost_kernel_set_vring_busyloop_timeout,
  .vhost_set_features = vhost_kernel_set_features,
  .vhost_get_features = vhost_kernel_get_features,
  .vhost_set_owner = vhost_kernel_set_owner,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 4400718..ebf8b08 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -964,6 +964,28 @@ static void vhost_eventfd_del(MemoryListener *listener,
  {
  }
+static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
+int n, uint32_t timeout)
+{
+int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
+struct vhost_vring_state state = {
+.index = vhost_vq_index,
+.num = timeout,
+};
+

Re: [Qemu-devel] [PATCH 24/33] acpi: add CPU hotplug methods to DSDT

2016-05-30 Thread Michael S. Tsirkin
On Tue, May 17, 2016 at 04:43:16PM +0200, Igor Mammedov wrote:
> Add necessary CPU hotplug methods to handle hotplug
> events.
> 
> Signed-off-by: Igor Mammedov 
> ---
> v1:
>   - make replace _MAT method with named buffer object
> as its content is static
> ---
>  hw/acpi/cpu.c | 187 +-
>  hw/i386/acpi-build.c  |   3 +-
>  include/hw/acpi/cpu.h |   4 +-
>  3 files changed, 190 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index b3e1cca..28d3894 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -207,24 +207,178 @@ const VMStateDescription vmstate_cpu_hotplug = {
>  };
>  
>  #define CPU_NAME_FMT  "C%.03X"
> -
> -void build_cpus_aml(Aml *table, MachineState *machine, bool acpi1_compat)
> +#define CPUHP_RES_DEVICE  "PRES"
> +#define CPU_LOCK  "CPLK"
> +#define CPU_STS_METHOD"CSTA"
> +#define CPU_SCAN_METHOD   "CSCN"
> +#define CPU_EJECT_METHOD  "CEJ0"
> +#define CPU_NOTIFY_METHOD "CTFY"
> +
> +#define CPU_ENABLED   "CPEN"
> +#define CPU_SELECTOR  "CSEL"
> +#define CPU_EJECT_EVENT   "CEJ0"
> +#define CPU_INSERT_EVENT  "CINS"
> +#define CPU_REMOVE_EVENT  "CRMV"
> +
> +void build_cpus_aml(Aml *table, MachineState *machine, bool acpi1_compat,
> +const char *res_root, const char *event_handler_method,
> +hwaddr io_base)
>  {
> +Aml *ifctx;
> +Aml *field;
> +Aml *method;
> +Aml *cpu_ctrl_dev;
>  Aml *cpus_dev;
> +Aml *zero = aml_int(0);
>  Aml *sb_scope = aml_scope("_SB");
>  MachineClass *mc = MACHINE_GET_CLASS(machine);
>  CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(machine);
> +char *cphp_res_path = g_strdup_printf("%s." CPUHP_RES_DEVICE, res_root);
> +Object *obj = object_resolve_path_type("", TYPE_ACPI_DEVICE_IF, NULL);
> +AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(obj);
> +AcpiDeviceIf *adev = ACPI_DEVICE_IF(obj);
> +
> +cpu_ctrl_dev = aml_device("%s", cphp_res_path);
> +{
> +Aml *crs;
> +
> +aml_append(cpu_ctrl_dev,
> +aml_name_decl("_HID", aml_eisaid("PNP0A06")));
> +aml_append(cpu_ctrl_dev,
> +aml_name_decl("_UID", aml_string("CPU Hotplug resources")));
> +aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0));
> +
> +crs = aml_resource_template();
> +aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1,
> +   ACPI_CPU_HOTPLUG_REG_LEN));
> +aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs));
> +
> +/* declare CPU hotplug MMIO region with related access fields */
> +aml_append(cpu_ctrl_dev,
> +aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base),
> + ACPI_CPU_HOTPLUG_REG_LEN));
> +
> +field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK,
> +  AML_WRITE_AS_ZEROS);
> +aml_append(field, aml_reserved_field(ACPI_CPU_FLAGS_OFFSET_RW * 8));
> +/* 1 if enabled, read only */
> +aml_append(field, aml_named_field(CPU_ENABLED, 1));
> +/* (read) 1 if has a insert event. (write) 1 to clear event */
> +aml_append(field, aml_named_field(CPU_INSERT_EVENT, 1));
> +/* (read) 1 if has a remove event. (write) 1 to clear event */
> +aml_append(field, aml_named_field(CPU_REMOVE_EVENT, 1));
> +/* initiates device eject, write only */
> +aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
> +aml_append(cpu_ctrl_dev, field);
> +
> +field = aml_field("PRST", AML_DWORD_ACC, AML_NOLOCK, AML_PRESERVE);
> +/* CPU selector, write only */
> +aml_append(field, aml_named_field(CPU_SELECTOR, 32));
> +aml_append(cpu_ctrl_dev, field);
> +
> +}
> +aml_append(sb_scope, cpu_ctrl_dev);
> +
>  cpus_dev = aml_device("\\_SB.CPUS");
>  {
>  int i;
> +Aml *one = aml_int(1);
> +Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR);
> +Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT);
> +Aml *rm_evt = aml_name("%s.%s", cphp_res_path, CPU_REMOVE_EVENT);
> +Aml *ej_evt = aml_name("%s.%s", cphp_res_path, CPU_EJECT_EVENT);
> +Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED);
> +Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK);
>  
>  aml_append(cpus_dev, aml_name_decl("_HID", aml_string("ACPI0010")));
>  aml_append(cpus_dev, aml_name_decl("_CID", aml_eisaid("PNP0A05")));
>  
> +method = aml_method(CPU_NOTIFY_METHOD, 2, AML_NOTSERIALIZED);
> +for (i = 0; i < arch_ids->len; i++) {

wow that will be a ton of acpi code. why not create an AML loop?

> +Aml *cpu = aml_name(CPU_NAME_FMT, i);
> +Aml *uid = aml_arg(0);
> +Aml *event = aml_arg(1);
> +
> +ifctx = 

Re: [Qemu-devel] [PATCH V3] tap: vhost busy polling support

2016-05-30 Thread Michael S. Tsirkin
On Tue, May 31, 2016 at 11:04:18AM +0800, Jason Wang wrote:
> 
> 
> On 2016-05-31 02:07, Michael S. Tsirkin wrote:
> >On Thu, Apr 07, 2016 at 12:56:24PM +0800, Jason Wang wrote:
> >>This patch adds the capability of basic vhost net busy polling, which is
> >>supported by recent kernels. Users can configure the maximum number of
> >>us that may be spent on busy polling through a new tap property,
> >>"vhost-poll-us".
> >I applied this but now I had a thought - should we generalize this to
> >"poll-us"? Down the road tun could support busy polling just like
> >sockets do.
> 
> These look like two different things. Socket busy polling depends on the value set by
> sysctl or SO_BUSY_POLL, which should be transparent to qemu.

This is what I am saying.  qemu can set SO_BUSY_POLL if poll-us is specified,
can it not? On the one hand this suggests a more generic name
for the option. On the other how does user discover whether it's
implemented for tap without vhost? Do we require kernel support?
Does an unblocking read with tap without vhost also speed
things up a bit?

> >>Signed-off-by: Jason Wang 
> >>---
> >>  hw/net/vhost_net.c|  2 +-
> >>  hw/scsi/vhost-scsi.c  |  2 +-
> >>  hw/virtio/vhost-backend.c |  8 
> >>  hw/virtio/vhost.c | 40 ++-
> >>  include/hw/virtio/vhost-backend.h |  3 +++
> >>  include/hw/virtio/vhost.h |  3 ++-
> >>  include/net/vhost_net.h   |  1 +
> >>  net/tap.c | 10 --
> >>  net/vhost-user.c  |  1 +
> >>  qapi-schema.json  |  6 +-
> >>  qemu-options.hx   |  3 +++
> >>  11 files changed, 72 insertions(+), 7 deletions(-)
> >>
> >>diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>index 6e1032f..1840c73 100644
> >>--- a/hw/net/vhost_net.c
> >>+++ b/hw/net/vhost_net.c
> >>@@ -166,7 +166,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions 
> >>*options)
> >>  }
> >>  r = vhost_dev_init(&net->dev, options->opaque,
> >>-   options->backend_type);
> >>+   options->backend_type, options->busyloop_timeout);
> >>  if (r < 0) {
> >>  goto fail;
> >>  }
> >>diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> >>index 9261d51..2a00f2f 100644
> >>--- a/hw/scsi/vhost-scsi.c
> >>+++ b/hw/scsi/vhost-scsi.c
> >>@@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
> >>**errp)
> >>  s->dev.backend_features = 0;
> >>  ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,
> >>- VHOST_BACKEND_TYPE_KERNEL);
> >>+ VHOST_BACKEND_TYPE_KERNEL, 0);
> >>  if (ret < 0) {
> >>  error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
> >> strerror(-ret));
> >>diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> >>index b358902..d62372e 100644
> >>--- a/hw/virtio/vhost-backend.c
> >>+++ b/hw/virtio/vhost-backend.c
> >>@@ -138,6 +138,12 @@ static int vhost_kernel_set_vring_call(struct 
> >>vhost_dev *dev,
> >>  return vhost_kernel_call(dev, VHOST_SET_VRING_CALL, file);
> >>  }
> >>+static int vhost_kernel_set_vring_busyloop_timeout(struct vhost_dev *dev,
> >>+   struct vhost_vring_state *s)
> >>+{
> >>+return vhost_kernel_call(dev, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, s);
> >>+}
> >>+
> >>  static int vhost_kernel_set_features(struct vhost_dev *dev,
> >>   uint64_t features)
> >>  {
> >>@@ -185,6 +191,8 @@ static const VhostOps kernel_ops = {
> >>  .vhost_get_vring_base = vhost_kernel_get_vring_base,
> >>  .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
> >>  .vhost_set_vring_call = vhost_kernel_set_vring_call,
> >>+.vhost_set_vring_busyloop_timeout =
> >>+vhost_kernel_set_vring_busyloop_timeout,
> >>  .vhost_set_features = vhost_kernel_set_features,
> >>  .vhost_get_features = vhost_kernel_get_features,
> >>  .vhost_set_owner = vhost_kernel_set_owner,
> >>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>index 4400718..ebf8b08 100644
> >>--- a/hw/virtio/vhost.c
> >>+++ b/hw/virtio/vhost.c
> >>@@ -964,6 +964,28 @@ static void vhost_eventfd_del(MemoryListener *listener,
> >>  {
> >>  }
> >>+static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
> >>+int n, uint32_t timeout)
> >>+{
> >>+int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
> >>+struct vhost_vring_state state = {
> >>+.index = vhost_vq_index,
> >>+.num = timeout,
> >>+};
> >>+int r;
> >>+
> >>+if (!dev->vhost_ops->vhost_set_vring_busyloop_timeout) {
> >>+return -EINVAL;
> >>+}
> >>+
> >>+r = 

Re: [Qemu-devel] [RFC PATCH V4 0/4] Introduce COLO-compare

2016-05-30 Thread Zhang Chen



On 05/30/2016 11:19 AM, Jason Wang wrote:



On 2016-05-25 20:50, Zhang Chen wrote:

COLO-compare is a part of COLO project. It is used
to compare the network package to help COLO decide
whether to do checkpoint.

the full version in this github:
https://github.com/zhangckid/qemu/tree/colo-v2.7-proxy-mode-compare-with-colo-base-may25 




v4:
  p4:
 - add some comments
 - fix some trace-events
 - fix tcp compare error
  p3:
 - add rcu_read_lock().
 - fix trace name
 - fix jason's other comments
 - rebase some Dave's branch function
  p2:
 - colo_compare_connection() change g_queue_push_head() to
 - g_queue_push_tail() match to sorted order.
 - remove QemuMutex conn_list_lock


Looks like the conn_list lock is still there. I still prefer to do
everything in the comparing thread. Have you tried Fam's suggestion to use
g_main_context_push_thread_default()? If it does not work, does it
work simply by replacing all:


g_source_attach(x, NULL);

with

g_source_attach(x, g_main_context_get_thread_default());

after calling g_main_context_push_thread_default()?

Thanks


I have tried Fam's suggestion, but it does not work.

So I tried what you suggested, like this:

static void *colo_compare_thread(void *opaque)
{
CompareState *s = opaque;
GSource *source;
GMainContext *worker_context;

source = g_source_new(_funcs, sizeof(DemoSource));
worker_context = g_main_context_new ();

g_source_attach(source, g_main_context_get_thread_default());
g_source_set_callback(source, NULL, NULL, NULL);
g_main_context_push_thread_default (worker_context);
qemu_chr_add_handlers(s->chr_pri_in, compare_chr_can_read,
  compare_pri_chr_in, NULL, s);
qemu_chr_add_handlers(s->chr_sec_in, compare_chr_can_read,
  compare_sec_chr_in, NULL, s);
g_main_context_pop_thread_default (worker_context);


But it does not work either. Do you mean like this?

Thanks
Zhang Chen


 - remove pkt->s
 - move data structure to colo-base.h
 - add colo-base.c reuse codes for filter-rewriter
 - add some filter-rewriter needs struct
 - depends on previous SocketReadState patch
  p1:
 - except move qemu_chr_add_handlers()
   to colo thread
 - remove class_finalize
 - remove secondary arp codes
 - depends on previous SocketReadState patch

v3:
   - rebase colo-compare to colo-frame v2.7
   - fix most of Dave's comments
 (except RCU)
   - add TCP,UDP,ICMP and other packet comparison
   - add trace-event
   - add some comments
   - other bug fix
   - add RFC index
   - add usage in patch 1/4

v2:
   - add jhash.h

v1:
   - initial patch


Zhang Chen (4):
   colo-compare: introduce colo compare initialization
   colo-compare: track connection and enqueue packet
   colo-compare: introduce packet comparison thread
   colo-compare: add TCP,UDP,ICMP packet comparison

  include/qemu/jhash.h |  61 +
  net/Makefile.objs|   2 +
  net/colo-base.c  | 183 +
  net/colo-base.h  |  92 +++
  net/colo-compare.c   | 745 +++
  qemu-options.hx  |  34 +++
  trace-events |  11 +
  vl.c |   3 +-
  8 files changed, 1130 insertions(+), 1 deletion(-)
  create mode 100644 include/qemu/jhash.h
  create mode 100644 net/colo-base.c
  create mode 100644 net/colo-base.h
  create mode 100644 net/colo-compare.c





.



--
Thanks
zhangchen






Re: [Qemu-devel] [PATCH V3] tap: vhost busy polling support

2016-05-30 Thread Jason Wang



On 2016-05-31 02:07, Michael S. Tsirkin wrote:

On Thu, Apr 07, 2016 at 12:56:24PM +0800, Jason Wang wrote:

This patch adds the capability of basic vhost net busy polling, which is
supported by recent kernels. Users can configure the maximum number of
us that may be spent on busy polling through a new tap property,
"vhost-poll-us".

I applied this but now I had a thought - should we generalize this to
"poll-us"? Down the road tun could support busy polling just like
sockets do.


These look like two different things. Socket busy polling depends on the value set
by sysctl or SO_BUSY_POLL, which should be transparent to qemu.



Signed-off-by: Jason Wang 
---
  hw/net/vhost_net.c|  2 +-
  hw/scsi/vhost-scsi.c  |  2 +-
  hw/virtio/vhost-backend.c |  8 
  hw/virtio/vhost.c | 40 ++-
  include/hw/virtio/vhost-backend.h |  3 +++
  include/hw/virtio/vhost.h |  3 ++-
  include/net/vhost_net.h   |  1 +
  net/tap.c | 10 --
  net/vhost-user.c  |  1 +
  qapi-schema.json  |  6 +-
  qemu-options.hx   |  3 +++
  11 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6e1032f..1840c73 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -166,7 +166,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
  }
  
  r = vhost_dev_init(&net->dev, options->opaque,

-   options->backend_type);
+   options->backend_type, options->busyloop_timeout);
  if (r < 0) {
  goto fail;
  }
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9261d51..2a00f2f 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
**errp)
  s->dev.backend_features = 0;
  
  ret = vhost_dev_init(&s->dev, (void *)(uintptr_t)vhostfd,

- VHOST_BACKEND_TYPE_KERNEL);
+ VHOST_BACKEND_TYPE_KERNEL, 0);
  if (ret < 0) {
  error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
 strerror(-ret));
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index b358902..d62372e 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -138,6 +138,12 @@ static int vhost_kernel_set_vring_call(struct vhost_dev 
*dev,
  return vhost_kernel_call(dev, VHOST_SET_VRING_CALL, file);
  }
  
+static int vhost_kernel_set_vring_busyloop_timeout(struct vhost_dev *dev,

+   struct vhost_vring_state *s)
+{
+return vhost_kernel_call(dev, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, s);
+}
+
  static int vhost_kernel_set_features(struct vhost_dev *dev,
   uint64_t features)
  {
@@ -185,6 +191,8 @@ static const VhostOps kernel_ops = {
  .vhost_get_vring_base = vhost_kernel_get_vring_base,
  .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
  .vhost_set_vring_call = vhost_kernel_set_vring_call,
+.vhost_set_vring_busyloop_timeout =
+vhost_kernel_set_vring_busyloop_timeout,
  .vhost_set_features = vhost_kernel_set_features,
  .vhost_get_features = vhost_kernel_get_features,
  .vhost_set_owner = vhost_kernel_set_owner,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 4400718..ebf8b08 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -964,6 +964,28 @@ static void vhost_eventfd_del(MemoryListener *listener,
  {
  }
  
+static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,

+int n, uint32_t timeout)
+{
+int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
+struct vhost_vring_state state = {
+.index = vhost_vq_index,
+.num = timeout,
+};
+int r;
+
+if (!dev->vhost_ops->vhost_set_vring_busyloop_timeout) {
+return -EINVAL;
+}
+
+r = dev->vhost_ops->vhost_set_vring_busyloop_timeout(dev, &state);
+if (r) {
+return r;
+}
+
+return 0;
+}
+
  static int vhost_virtqueue_init(struct vhost_dev *dev,
  struct vhost_virtqueue *vq, int n)
  {
@@ -994,7 +1016,7 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue 
*vq)
  }
  
  int vhost_dev_init(struct vhost_dev *hdev, void *opaque,

-   VhostBackendType backend_type)
+   VhostBackendType backend_type, uint32_t busyloop_timeout)
  {
  uint64_t features;
  int i, r;
@@ -1035,6 +1057,17 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
  goto fail_vq;
  }
  }
+
+if (busyloop_timeout) {
+for (i = 0; i < hdev->nvqs; ++i) {
+r = 

Re: [Qemu-devel] [RFC PATCH v4 0/3] Add Mediated device support[was: Add vGPU support]

2016-05-30 Thread Jike Song
On 05/28/2016 10:56 PM, Alex Williamson wrote:
> On Fri, 27 May 2016 22:43:54 +
> "Tian, Kevin"  wrote:
> 
>>
>> My impression was that you don't like hypervisor specific thing in VFIO,
>> which makes it a bit tricky to accomplish those tasks in kernel. If we 
>> can add Xen specific logic directly in VFIO (like vfio-iommu-xen you 
>> mentioned), the whole thing would be easier.
> 
> If vfio is hosted in dom0, then Xen is the platform and we need to
> interact with the hypervisor to manage the iommu.  That said, there are
> aspects of vfio that do not seem to map well to a hypervisor managed
> iommu or a Xen-like hypervisor.  For instance, how does dom0 manage
> iommu groups and what's the distinction of using vfio to manage a
> userspace driver in dom0 versus managing a device for another domain.
> In the case of kvm, vfio has no dependency on kvm, there is some minor
> interaction, but we're not running on kvm and it's not appropriate to
> use vfio as a gateway to interact with a hypervisor that may or may not
> exist.  Thanks,

Hi Alex,

Beyond the iommu, are there other aspects where vfio needs to interact with Xen?
E.g., to pass through MMIO, one has to issue hypercalls to establish EPT
mappings.


--
Thanks,
Jike



Re: [Qemu-devel] [PATCH v2 00/15] PATCH 00/15] NVDIMM: introduce nvdimm label support

2016-05-30 Thread Xiao Guangrong



On 05/31/2016 02:52 AM, Stefan Hajnoczi wrote:


I have reviewed the non-ACPI parts of this series.  Aside from minor
comments:

Reviewed-by: Stefan Hajnoczi 




I really appreciate all your reviews! Thank you, Stefan!



Re: [Qemu-devel] [PATCH v2 03/15] pc-dimm: keep the state of the whole backend memory

2016-05-30 Thread Xiao Guangrong



On 05/31/2016 02:42 AM, Stefan Hajnoczi wrote:

On Fri, May 20, 2016 at 04:20:00PM +0800, Xiao Guangrong wrote:

QEMU keeps the state of a DIMM device's memory during live migration.
However, that is not enough for an NVDIMM device, because its memory region
does not contain its label data, so we should keep the state of the whole
backend memory instead.

Signed-off-by: Xiao Guangrong 
---
  hw/mem/pc-dimm.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 6de2275..72b33ba 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -105,9 +105,16 @@ void pc_dimm_memory_plug(DeviceState *dev, 
MemoryHotplugState *hpms,
  }

  memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-vmstate_register_ram(mr, dev);
  numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);

+/*
+ * save the state only for @mr is not enough as it does not contain
+ * the label data of NVDIMM device, so that we keep the state of
+ * whole hostmem instead.
+ */
+vmstate_register_ram(host_memory_backend_get_memory(dimm->hostmem, errp),
+ dev);
+
  out:
  error_propagate(errp, local_err);
  }


In Patch 1 you introduced a callback to get the guest-visible memory
region.  Instead of mentioning NVDIMM in generic pc-dimm.c code, it
would be cleaner to add another callback to get the vmstate memory
region:

   .get_guest_memory_region() - Patch 1
   .get_vmstate_memory_region() - a new patch in this series



That sounds good to me, will do. Thanks!
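
The two-callback split suggested above can be sketched as follows. This is
only an illustration, not the code from the series: struct region, struct
dimm, struct dimm_class and the helper names are hypothetical stand-ins, and
only the two callback names come from the review comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a MemoryRegion. */
struct region {
    const char *name;
    size_t size;
};

struct dimm;

/* Hypothetical device class with one callback per purpose. */
struct dimm_class {
    struct region *(*get_guest_memory_region)(struct dimm *d);
    struct region *(*get_vmstate_memory_region)(struct dimm *d);
};

struct dimm {
    const struct dimm_class *klass;
    struct region backend;     /* the whole host memory backend */
    struct region guest_view;  /* the part of it mapped into the guest */
};

static struct region *whole_backend(struct dimm *d) { return &d->backend; }
static struct region *guest_window(struct dimm *d) { return &d->guest_view; }

/* Plain DIMM: the guest sees, and migration saves, the whole backend. */
static const struct dimm_class plain_dimm_class = {
    .get_guest_memory_region = whole_backend,
    .get_vmstate_memory_region = whole_backend,
};

/* NVDIMM-like device: the guest sees a window, migration saves everything. */
static const struct dimm_class nvdimm_like_class = {
    .get_guest_memory_region = guest_window,
    .get_vmstate_memory_region = whole_backend,
};

/* Generic plug code asks the class and never mentions NVDIMM itself. */
static struct region *region_to_map(struct dimm *d)
{
    assert(d && d->klass);
    return d->klass->get_guest_memory_region(d);
}

static struct region *region_to_migrate(struct dimm *d)
{
    assert(d && d->klass);
    return d->klass->get_vmstate_memory_region(d);
}
```

With this shape the generic code would pass region_to_migrate(dimm) to the
vmstate registration and region_to_map(dimm) to the subregion mapping, and
the device-specific knowledge stays in the class.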



Re: [Qemu-devel] [PATCH RFC 0/2] enable iommu with -device

2016-05-30 Thread Peter Xu
On Mon, May 30, 2016 at 05:14:15PM +0300, Marcel Apfelbaum wrote:
> On 05/30/2016 04:43 PM, Peter Xu wrote:
> >On Mon, May 23, 2016 at 05:01:28PM +0300, Marcel Apfelbaum wrote:
> >>This is a proposal on how to create the iommu with
> >>'-device intel-iommu' instead of '-machine,iommu=on'.
> >>
> >>The device is part of the machine properties because we wanted
> >>to ensure it is created before any other PCI device.
> >>
> >>The alternative is to skip the bus_master_enable_region at
> >>the time the device is created. We can create this region
> >>at machine_done phase. (patch 1)
> >>
> >>Then we can enable sysbus devices for PC machines and make all the
> >>init steps inside the iommu realize function. (patch 2)
> >>
> >>The series is working, but a lot of issues are not resolved:
> >>   - minimum testing was done
> >>   - the iommu addr should be passed (maybe) in command line rather than 
> >> hard-coded
> >>   - enabling sysbus devices for PC machines is risky, I am not aware yet
> >> of the side effects of this modification.
> >>   - I am not sure moving the bus_master_enable_region to machine_done
> >> is with no undesired effects.
> >
> >I gave it a shot on the patches and it works nicely (of course no
> >complex configurations, like hot plug).
> >
> >Could you explain what we gain by using "-device" rather than "-M"
> >options?  The benefit I can see is that we can specify parameters
> >on the specific device, rather than mixing them into the "machine"
> >options. Are there any other benefits that I may have missed?
> 
> Hi Peter,
> Thanks for trying it!
> 
> Mainly it is about not hard-coding device options (e.g. the PCI address for
> the AMD IOMMU), but also about avoiding devices being added as a side-effect
> of some machine option.
> This will bring us closer to a cleaner model of a modular machine.
> 
> I plan to post a non-rfc version soon.

I just thought about whether we should support multiple IOMMUs in the
future (never know whether there would be a use case for
that). Anyway, looking forward to your non-rfc version. :)

Thanks!

-- peterx



[Qemu-devel] [PATCH v2 8/8] virtio-blk: add num-queues device property

2016-05-30 Thread Stefan Hajnoczi
Multiqueue virtio-blk can be enabled as follows:

  qemu -device virtio-blk-pci,num-queues=8

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c  | 15 +--
 include/hw/virtio/virtio-blk.h |  1 -
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f36b690..c79a9d5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -768,6 +768,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, 
uint8_t *config)
 blkcfg.physical_block_exp = get_physical_block_exp(conf);
 blkcfg.alignment_offset = 0;
 blkcfg.wce = blk_enable_write_cache(s->blk);
+virtio_stw_p(vdev, &blkcfg.num_queues, s->conf.num_queues);
 memcpy(config, &blkcfg, sizeof(struct virtio_blk_config));
 }
 
@@ -811,6 +812,9 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, 
uint64_t features,
 if (blk_is_read_only(s->blk)) {
 virtio_add_feature(&features, VIRTIO_BLK_F_RO);
 }
+if (s->conf.num_queues > 1) {
+virtio_add_feature(&features, VIRTIO_BLK_F_MQ);
+}
 
 return features;
 }
@@ -942,6 +946,7 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 VirtIOBlkConf *conf = &s->conf;
 Error *err = NULL;
 static int virtio_blk_id;
+unsigned i;
 
 if (!conf->conf.blk) {
 error_setg(errp, "drive property not set");
@@ -951,6 +956,10 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 error_setg(errp, "Device needs media, but drive is empty");
 return;
 }
+if (!conf->num_queues) {
+error_setg(errp, "num-queues property must be larger than 0");
+return;
+}
 
 blkconf_serial(&conf->conf, &conf->serial);
 s->original_wce = blk_enable_write_cache(conf->conf.blk);
@@ -968,8 +977,9 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 s->rq = NULL;
 s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
-conf->num_queues = 1;
-s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
+for (i = 0; i < conf->num_queues; i++) {
+virtio_add_queue(vdev, 128, virtio_blk_handle_output);
+}
 virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
 if (err != NULL) {
 error_propagate(errp, err);
@@ -1031,6 +1041,7 @@ static Property virtio_blk_properties[] = {
 #endif
 DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
 true),
+DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, 1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index b6e7860..70bd3b7 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -47,7 +47,6 @@ struct VirtIOBlockReq;
 typedef struct VirtIOBlock {
 VirtIODevice parent_obj;
 BlockBackend *blk;
-VirtQueue *vq;
 void *rq;
 QEMUBH *bh;
 QEMUBH *batch_notify_bh;
-- 
2.5.5




[Qemu-devel] [PATCH v2 4/8] virtio-blk: add VirtIOBlockConf->num_queues

2016-05-30 Thread Stefan Hajnoczi
The num_queues field is always 1 for the time being.  A later patch will
make it a configurable device property so that multiqueue can be
enabled.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c  | 1 +
 include/hw/virtio/virtio-blk.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 4ee4063..c8d66f0 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -931,6 +931,7 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 s->rq = NULL;
 s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
 
+conf->num_queues = 1;
 s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
 virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
 if (err != NULL) {
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 9c6747d..487b28d 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -38,6 +38,7 @@ struct VirtIOBlkConf
 uint32_t scsi;
 uint32_t config_wce;
 uint32_t request_merging;
+uint16_t num_queues;
 };
 
 struct VirtIOBlockDataPlane;
-- 
2.5.5




[Qemu-devel] [PATCH v2 3/8] virtio-blk: associate request with a virtqueue

2016-05-30 Thread Stefan Hajnoczi
Multiqueue requires that each request knows to which virtqueue it
belongs.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c  | 16 +---
 include/hw/virtio/virtio-blk.h |  4 +++-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 797b568..4ee4063 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -29,9 +29,11 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 
-void virtio_blk_init_request(VirtIOBlock *s, VirtIOBlockReq *req)
+void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
+ VirtIOBlockReq *req)
 {
 req->dev = s;
+req->vq = vq;
 req->qiov.size = 0;
 req->in_len = 0;
 req->next = NULL;
@@ -94,7 +96,7 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, 
unsigned char status)
 trace_virtio_blk_req_complete(req, status);
 
 stb_p(&req->in->status, status);
-virtqueue_push(s->vq, &req->elem, req->in_len);
+virtqueue_push(req->vq, &req->elem, req->in_len);
 qemu_bh_schedule(s->batch_notify_bh);
 }
 
@@ -224,12 +226,12 @@ out:
 
 #endif
 
-static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
+static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s, VirtQueue *vq)
 {
-VirtIOBlockReq *req = virtqueue_pop(s->vq, sizeof(VirtIOBlockReq));
+VirtIOBlockReq *req = virtqueue_pop(vq, sizeof(VirtIOBlockReq));
 
 if (req) {
-virtio_blk_init_request(s, req);
+virtio_blk_init_request(s, vq, req);
 }
 return req;
 }
@@ -620,7 +622,7 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 
 blk_io_plug(s->blk);
 
-while ((req = virtio_blk_get_request(s))) {
+while ((req = virtio_blk_get_request(s, vq))) {
 virtio_blk_handle_request(req, &mrb);
 }
 
@@ -877,7 +879,7 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 while (qemu_get_sbyte(f)) {
 VirtIOBlockReq *req;
 req = qemu_get_virtqueue_element(f, sizeof(VirtIOBlockReq));
-virtio_blk_init_request(s, req);
+virtio_blk_init_request(s, s->vq, req);
 req->next = s->rq;
 s->rq = req;
 }
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index b3834bc..9c6747d 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -63,6 +63,7 @@ typedef struct VirtIOBlockReq {
 VirtQueueElement elem;
 int64_t sector_num;
 VirtIOBlock *dev;
+VirtQueue *vq;
 struct virtio_blk_inhdr *in;
 struct virtio_blk_outhdr out;
 QEMUIOVector qiov;
@@ -80,7 +81,8 @@ typedef struct MultiReqBuffer {
 bool is_write;
 } MultiReqBuffer;
 
-void virtio_blk_init_request(VirtIOBlock *s, VirtIOBlockReq *req);
+void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
+ VirtIOBlockReq *req);
 void virtio_blk_free_request(VirtIOBlockReq *req);
 
 void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
-- 
2.5.5




[Qemu-devel] [PATCH v2 2/8] virtio-blk: tell dataplane which vq to notify

2016-05-30 Thread Stefan Hajnoczi
Let the virtio_blk_data_plane_notify() caller decide which virtqueue to
notify.  This will allow the function to be used with multiqueue.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/dataplane/virtio-blk.c | 8 +++-
 hw/block/dataplane/virtio-blk.h | 2 +-
 hw/block/virtio-blk.c   | 2 +-
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index e0ac4f4..592aa95 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -34,7 +34,6 @@ struct VirtIOBlockDataPlane {
 
 VirtIODevice *vdev;
 VirtQueue *vq;  /* virtqueue vring */
-EventNotifier *guest_notifier;  /* irq */
 
 Notifier insert_notifier, remove_notifier;
 
@@ -51,13 +50,13 @@ struct VirtIOBlockDataPlane {
 };
 
 /* Raise an interrupt to signal guest, if necessary */
-void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s)
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s, VirtQueue *vq)
 {
-if (!virtio_should_notify(s->vdev, s->vq)) {
+if (!virtio_should_notify(s->vdev, vq)) {
 return;
 }
 
-event_notifier_set(s->guest_notifier);
+event_notifier_set(virtio_queue_get_guest_notifier(vq));
 }
 
 static void data_plane_set_up_op_blockers(VirtIOBlockDataPlane *s)
@@ -207,7 +206,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 "ensure -enable-kvm is set\n", r);
 goto fail_guest_notifiers;
 }
-s->guest_notifier = virtio_queue_get_guest_notifier(s->vq);
 
 /* Set up virtqueue notify */
 r = k->set_host_notifier(qbus->parent, 0, true);
diff --git a/hw/block/dataplane/virtio-blk.h b/hw/block/dataplane/virtio-blk.h
index 0714c11..b1f0b95 100644
--- a/hw/block/dataplane/virtio-blk.h
+++ b/hw/block/dataplane/virtio-blk.h
@@ -26,6 +26,6 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s);
 void virtio_blk_data_plane_drain(VirtIOBlockDataPlane *s);
-void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s);
+void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s, VirtQueue *vq);
 
 #endif /* HW_DATAPLANE_VIRTIO_BLK_H */
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c108414..797b568 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -54,7 +54,7 @@ static void virtio_blk_batch_notify_bh(void *opaque)
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
 if (s->dataplane_started && !s->dataplane_disabled) {
-virtio_blk_data_plane_notify(s->dataplane);
+virtio_blk_data_plane_notify(s->dataplane, s->vq);
 } else {
 virtio_notify(vdev, s->vq);
 }
-- 
2.5.5




[Qemu-devel] [PATCH v2 1/8] virtio-blk: use batch notify in non-dataplane case

2016-05-30 Thread Stefan Hajnoczi
Commit 5b2ffbe4d99843fd8305c573a100047a8c962327 ("virtio-blk: dataplane:
notify guest as a batch") deferred guest notification to a BH in order to
batch notifications.  This optimization is not specific to dataplane so
move it to the generic virtio-blk code that is shared by both dataplane
and non-dataplane code.

Use the AioContext notifier to detect when dataplane moves the
BlockBackend to the IOThread's AioContext.  This is necessary so the
notification BH is always created in the current AioContext.

Note that this patch adds a flush function to force pending
notifications.  This way we can ensure notifications do not cross device
reset or vmstate saving.

Cc: Ming Lei 
Signed-off-by: Stefan Hajnoczi 
---
 hw/block/dataplane/virtio-blk.c | 10 ---
 hw/block/virtio-blk.c   | 60 -
 include/hw/virtio/virtio-blk.h  |  1 +
 3 files changed, 55 insertions(+), 16 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3cb97c9..e0ac4f4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -35,7 +35,6 @@ struct VirtIOBlockDataPlane {
 VirtIODevice *vdev;
 VirtQueue *vq;  /* virtqueue vring */
 EventNotifier *guest_notifier;  /* irq */
-QEMUBH *bh; /* bh for guest notification */
 
 Notifier insert_notifier, remove_notifier;
 
@@ -54,13 +53,6 @@ struct VirtIOBlockDataPlane {
 /* Raise an interrupt to signal guest, if necessary */
 void virtio_blk_data_plane_notify(VirtIOBlockDataPlane *s)
 {
-qemu_bh_schedule(s->bh);
-}
-
-static void notify_guest_bh(void *opaque)
-{
-VirtIOBlockDataPlane *s = opaque;
-
 if (!virtio_should_notify(s->vdev, s->vq)) {
 return;
 }
@@ -156,7 +148,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
VirtIOBlkConf *conf,
 object_ref(OBJECT(s->iothread));
 }
 s->ctx = iothread_get_aio_context(s->iothread);
-s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
 
 s->insert_notifier.notify = data_plane_blk_insert_notifier;
 s->remove_notifier.notify = data_plane_blk_remove_notifier;
@@ -179,7 +170,6 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 data_plane_remove_op_blockers(s);
 notifier_remove(&s->insert_notifier);
 notifier_remove(&s->remove_notifier);
-qemu_bh_delete(s->bh);
 object_unref(OBJECT(s->iothread));
 g_free(s);
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 284e646..c108414 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -45,20 +45,57 @@ void virtio_blk_free_request(VirtIOBlockReq *req)
 }
 }
 
+/* Many requests can complete in an event loop iteration, we only notify the
+ * guest once.
+ */
+static void virtio_blk_batch_notify_bh(void *opaque)
+{
+VirtIOBlock *s = opaque;
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+if (s->dataplane_started && !s->dataplane_disabled) {
+virtio_blk_data_plane_notify(s->dataplane);
+} else {
+virtio_notify(vdev, s->vq);
+}
+}
+
+/* Force batch notifications to run */
+static void virtio_blk_batch_notify_flush(VirtIOBlock *s)
+{
+qemu_bh_cancel(s->batch_notify_bh);
+virtio_blk_batch_notify_bh(s);
+}
+
+static void virtio_blk_attached_aio_context(AioContext *new_context,
+void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+assert(!s->batch_notify_bh);
+s->batch_notify_bh = aio_bh_new(new_context, virtio_blk_batch_notify_bh,
+s);
+qemu_bh_schedule(s->batch_notify_bh); /* in case notify was pending */
+}
+
+static void virtio_blk_detach_aio_context(void *opaque)
+{
+VirtIOBlock *s = opaque;
+
+assert(s->batch_notify_bh);
+qemu_bh_delete(s->batch_notify_bh);
+s->batch_notify_bh = NULL;
+}
+
 static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
 {
 VirtIOBlock *s = req->dev;
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
 trace_virtio_blk_req_complete(req, status);
 
 stb_p(&req->in->status, status);
 virtqueue_push(s->vq, &req->elem, req->in_len);
-if (s->dataplane_started && !s->dataplane_disabled) {
-virtio_blk_data_plane_notify(s->dataplane);
-} else {
-virtio_notify(vdev, s->vq);
-}
+qemu_bh_schedule(s->batch_notify_bh);
 }
 
 static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
@@ -664,6 +701,8 @@ static void virtio_blk_reset(VirtIODevice *vdev)
 if (s->dataplane) {
 virtio_blk_data_plane_stop(s->dataplane);
 }
+
+virtio_blk_batch_notify_flush(s);
 aio_context_release(ctx);
 
 blk_set_enable_write_cache(s->blk, s->original_wce);
@@ -801,6 +840,8 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
 virtio_blk_data_plane_stop(s->dataplane);
 }
 
+virtio_blk_batch_notify_flush(s);
+
 virtio_save(vdev, f);
 }
 
@@ 
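
The batching idea described in this patch can be sketched in isolation as
follows. This is a simplified model under stated assumptions, not QEMU code:
struct batch_notifier and the three functions are hypothetical, the
"scheduled" flag models a BH for which scheduling twice has the same effect
as scheduling once, and the flush here notifies unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device with one deferred notification callback. */
struct batch_notifier {
    bool scheduled;          /* a deferred callback is pending */
    unsigned notifications;  /* how many times the guest was notified */
};

/* Called once per completed request; repeated scheduling is coalesced. */
static void request_complete(struct batch_notifier *b)
{
    assert(b);
    b->scheduled = true;
}

/* Called once per event loop iteration; notifies only if something is pending. */
static void event_loop_iteration(struct batch_notifier *b)
{
    if (b->scheduled) {
        b->scheduled = false;
        b->notifications++;
    }
}

/* Cancel the deferred callback and run it right away (reset, vmstate save). */
static void batch_notify_flush(struct batch_notifier *b)
{
    b->scheduled = false;
    b->notifications++;
}
```

Five calls to request_complete() followed by one event_loop_iteration()
produce a single notification in this model, and a flush leaves nothing
pending for the next iteration.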

[Qemu-devel] [PATCH v2 7/8] virtio-blk: dataplane multiqueue support

2016-05-30 Thread Stefan Hajnoczi
Monitor ioeventfds for all virtqueues in the device's AioContext.  This
is not true multiqueue because requests from all virtqueues are
processed in a single IOThread.  In the future it will be possible to
use multiple IOThreads when the QEMU block layer supports multiqueue.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/dataplane/virtio-blk.c | 50 -
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 592aa95..48c0bb7 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -31,9 +31,7 @@ struct VirtIOBlockDataPlane {
 bool stopping;
 
 VirtIOBlkConf *conf;
-
 VirtIODevice *vdev;
-VirtQueue *vq;  /* virtqueue vring */
 
 Notifier insert_notifier, remove_notifier;
 
@@ -190,6 +188,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
+unsigned i;
+unsigned nvqs = s->conf->num_queues;
 int r;
 
 if (vblk->dataplane_started || s->starting) {
@@ -197,10 +197,9 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 }
 
 s->starting = true;
-s->vq = virtio_get_queue(s->vdev, 0);
 
 /* Set up guest notifier (irq) */
-r = k->set_guest_notifiers(qbus->parent, 1, true);
+r = k->set_guest_notifiers(qbus->parent, nvqs, true);
 if (r != 0) {
 fprintf(stderr, "virtio-blk failed to set guest notifier (%d), "
 "ensure -enable-kvm is set\n", r);
@@ -208,10 +207,15 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 }
 
 /* Set up virtqueue notify */
-r = k->set_host_notifier(qbus->parent, 0, true);
-if (r != 0) {
-fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r);
-goto fail_host_notifier;
+for (i = 0; i < nvqs; i++) {
+r = k->set_host_notifier(qbus->parent, i, true);
+if (r != 0) {
+fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", 
r);
+while (i--) {
+k->set_host_notifier(qbus->parent, i, false);
+}
+goto fail_guest_notifiers;
+}
 }
 
 s->starting = false;
@@ -221,17 +225,23 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 blk_set_aio_context(s->conf->conf.blk, s->ctx);
 
 /* Kick right away to begin processing requests already in vring */
-event_notifier_set(virtio_queue_get_host_notifier(s->vq));
+for (i = 0; i < nvqs; i++) {
+VirtQueue *vq = virtio_get_queue(s->vdev, i);
+
+event_notifier_set(virtio_queue_get_host_notifier(vq));
+}
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx,
-   
virtio_blk_data_plane_handle_output);
+for (i = 0; i < nvqs; i++) {
+VirtQueue *vq = virtio_get_queue(s->vdev, i);
+
+virtio_queue_aio_set_host_notifier_handler(vq, s->ctx,
+virtio_blk_data_plane_handle_output);
+}
 aio_context_release(s->ctx);
 return;
 
-  fail_host_notifier:
-k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
 vblk->dataplane_disabled = true;
 s->starting = false;
@@ -244,6 +254,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
+unsigned i;
+unsigned nvqs = s->conf->num_queues;
 
 if (!vblk->dataplane_started || s->stopping) {
 return;
@@ -261,17 +273,23 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_context_acquire(s->ctx);
 
 /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, NULL);
+for (i = 0; i < nvqs; i++) {
+VirtQueue *vq = virtio_get_queue(s->vdev, i);
+
+virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, NULL);
+}
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
 
 aio_context_release(s->ctx);
 
-k->set_host_notifier(qbus->parent, 0, false);
+for (i = 0; i < nvqs; i++) {
+k->set_host_notifier(qbus->parent, i, false);
+}
 
 /* Clean up guest notifier (irq) */
-k->set_guest_notifiers(qbus->parent, 1, false);
+k->set_guest_notifiers(qbus->parent, nvqs, false);
 
 vblk->dataplane_started = false;
 s->stopping = false;
-- 
2.5.5
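
The per-queue setup loop in this patch undoes the queues it already enabled
when a later one fails. A minimal standalone sketch of that unwind pattern,
assuming a hypothetical struct fake_bus in place of the virtio bus class and
its set_host_notifier() hook:

```c
#include <assert.h>

#define FAKE_MAX_QUEUES 8

/* Hypothetical bus that can be told to fail when enabling one queue. */
struct fake_bus {
    int enabled[FAKE_MAX_QUEUES];
    unsigned fail_at;  /* queue whose enable fails; FAKE_MAX_QUEUES for none */
};

/* Stand-in for a per-queue notifier hook: returns 0 on success. */
static int fake_set_notifier(struct fake_bus *bus, unsigned n, int assign)
{
    assert(n < FAKE_MAX_QUEUES);
    if (assign && n == bus->fail_at) {
        return -1;
    }
    bus->enabled[n] = assign;
    return 0;
}

/* Enable queues 0..nvqs-1; on failure, disable the ones already enabled. */
static int enable_all_notifiers(struct fake_bus *bus, unsigned nvqs)
{
    unsigned i;

    for (i = 0; i < nvqs; i++) {
        int r = fake_set_notifier(bus, i, 1);
        if (r != 0) {
            /* i is the failed queue; walk back over i-1 .. 0 */
            while (i--) {
                fake_set_notifier(bus, i, 0);
            }
            return r;
        }
    }
    return 0;
}
```

The while (i--) form works with an unsigned counter because the decrement
happens after the test, so the loop body sees i-1 down to 0 and stops
before i would wrap.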




[Qemu-devel] [PATCH v2 6/8] virtio-blk: live migrate s->rq with multiqueue

2016-05-30 Thread Stefan Hajnoczi
Add a field for the virtqueue index when migrating the s->rq request
list.  The new field is only needed when num_queues > 1.  Existing QEMUs
are unaffected by this change and therefore virtio-blk migration stays
compatible.

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 9de749b..f36b690 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -873,6 +873,11 @@ static void virtio_blk_save_device(VirtIODevice *vdev, 
QEMUFile *f)
 
 while (req) {
 qemu_put_sbyte(f, 1);
+
+if (s->conf.num_queues > 1) {
+qemu_put_be32(f, virtio_queue_get_id(req->vq));
+}
+
 qemu_put_virtqueue_element(f, &req->elem);
 req = req->next;
 }
@@ -896,9 +901,22 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
 while (qemu_get_sbyte(f)) {
+unsigned nvqs = s->conf.num_queues;
+unsigned vq_idx = 0;
 VirtIOBlockReq *req;
+
+if (nvqs > 1) {
+vq_idx = qemu_get_be32(f);
+
+if (vq_idx >= nvqs) {
+error_report("Invalid virtqueue index in request list: %#x",
+ vq_idx);
+return -EINVAL;
+}
+}
+
 req = qemu_get_virtqueue_element(f, sizeof(VirtIOBlockReq));
-virtio_blk_init_request(s, s->vq, req);
+virtio_blk_init_request(s, virtio_get_queue(vdev, vq_idx), req);
 req->next = s->rq;
 s->rq = req;
 }
-- 
2.5.5
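
The stream layout described in this patch can be sketched with a toy
in-memory stream. Everything here is hypothetical and simplified: struct
stream stands in for QEMUFile, a request is reduced to a queue index plus a
one-byte payload, and the zero byte that ends the list is an assumption
inferred from the load loop rather than something shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory byte stream standing in for QEMUFile. */
struct stream {
    uint8_t buf[256];
    size_t len;  /* bytes written */
    size_t pos;  /* bytes read */
};

static void put_byte(struct stream *s, uint8_t v)
{
    assert(s->len < sizeof(s->buf));
    s->buf[s->len++] = v;
}

static uint8_t get_byte(struct stream *s)
{
    assert(s->pos < s->len);
    return s->buf[s->pos++];
}

static void put_be32(struct stream *s, uint32_t v)
{
    put_byte(s, (uint8_t)(v >> 24));
    put_byte(s, (uint8_t)(v >> 16));
    put_byte(s, (uint8_t)(v >> 8));
    put_byte(s, (uint8_t)v);
}

static uint32_t get_be32(struct stream *s)
{
    uint32_t v = (uint32_t)get_byte(s) << 24;
    v |= (uint32_t)get_byte(s) << 16;
    v |= (uint32_t)get_byte(s) << 8;
    v |= get_byte(s);
    return v;
}

/* A request reduced to its queue index and a one-byte payload. */
struct req {
    uint32_t vq_idx;
    uint8_t payload;
};

/* Write marker, optional queue index, payload per request, then a 0 marker. */
static void save_requests(struct stream *s, const struct req *reqs, size_t n,
                          unsigned nvqs)
{
    for (size_t i = 0; i < n; i++) {
        put_byte(s, 1);
        if (nvqs > 1) {
            put_be32(s, reqs[i].vq_idx);  /* only present with multiqueue */
        }
        put_byte(s, reqs[i].payload);
    }
    put_byte(s, 0);
}

/* Returns the number of requests loaded, or -1 on a bad index or overflow. */
static int load_requests(struct stream *s, struct req *out, size_t max,
                         unsigned nvqs)
{
    size_t n = 0;

    while (get_byte(s)) {
        uint32_t vq_idx = 0;

        if (nvqs > 1) {
            vq_idx = get_be32(s);
            if (vq_idx >= nvqs) {
                return -1;
            }
        }
        if (n == max) {
            return -1;
        }
        out[n].vq_idx = vq_idx;
        out[n].payload = get_byte(s);
        n++;
    }
    return (int)n;
}
```

Because the index is written only when nvqs > 1, a single-queue stream has
the same layout as before the field existed, which is what keeps migration
compatible in this model.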




[Qemu-devel] [PATCH v2 5/8] virtio-blk: multiqueue batch notify

2016-05-30 Thread Stefan Hajnoczi
The batch notification BH needs to know which virtqueues to notify when
multiqueue is enabled.  Use a bitmap to track the virtqueues with
pending notifications.

Signed-off-by: Stefan Hajnoczi 
---
 hw/block/virtio-blk.c  | 29 +
 include/hw/virtio/virtio-blk.h |  1 +
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c8d66f0..9de749b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -54,11 +54,28 @@ static void virtio_blk_batch_notify_bh(void *opaque)
 {
 VirtIOBlock *s = opaque;
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
+unsigned nvqs = s->conf.num_queues;
+unsigned long bitmap[BITS_TO_LONGS(nvqs)];
+unsigned j;
 
-if (s->dataplane_started && !s->dataplane_disabled) {
-virtio_blk_data_plane_notify(s->dataplane, s->vq);
-} else {
-virtio_notify(vdev, s->vq);
+memcpy(bitmap, s->batch_notify_vqs, sizeof(bitmap));
+memset(s->batch_notify_vqs, 0, sizeof(bitmap));
+
+for (j = 0; j < nvqs; j += BITS_PER_LONG) {
+unsigned long bits = bitmap[j / BITS_PER_LONG];
+
+while (bits != 0) {
+unsigned i = j + ctzl(bits);
+VirtQueue *vq = virtio_get_queue(vdev, i);
+
+if (s->dataplane_started && !s->dataplane_disabled) {
+virtio_blk_data_plane_notify(s->dataplane, vq);
+} else {
+virtio_notify(vdev, vq);
+}
+
+bits &= bits - 1; /* clear right-most bit */
+}
 }
 }
 
@@ -97,6 +114,8 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, 
unsigned char status)
 
 stb_p(&req->in->status, status);
 virtqueue_push(req->vq, &req->elem, req->in_len);
+
+set_bit(virtio_queue_get_id(req->vq), s->batch_notify_vqs);
 qemu_bh_schedule(s->batch_notify_bh);
 }
 
@@ -940,6 +959,7 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 
+s->batch_notify_vqs = bitmap_new(conf->num_queues);
 blk_add_aio_context_notifier(s->blk, virtio_blk_attached_aio_context,
  virtio_blk_detach_aio_context, s);
 virtio_blk_attached_aio_context(blk_get_aio_context(s->blk), s);
@@ -961,6 +981,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, 
Error **errp)
 blk_remove_aio_context_notifier(s->blk, virtio_blk_attached_aio_context,
 virtio_blk_detach_aio_context, s);
 virtio_blk_detach_aio_context(s);
+g_free(s->batch_notify_vqs);
 virtio_blk_data_plane_destroy(s->dataplane);
 s->dataplane = NULL;
 qemu_del_vm_change_state_handler(s->change);
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 487b28d..b6e7860 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -51,6 +51,7 @@ typedef struct VirtIOBlock {
 void *rq;
 QEMUBH *bh;
 QEMUBH *batch_notify_bh;
+unsigned long *batch_notify_vqs;
 VirtIOBlkConf conf;
 unsigned short sector_mask;
 bool original_wce;
-- 
2.5.5
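
The pending-queue bitmap walk used for batch notification can be sketched
standalone as follows. This is an illustration under assumptions, not the
QEMU implementation: the macros are local definitions, struct notify_log
and log_notify() are hypothetical stand-ins for the real notify calls,
ctz_ul() is a portable replacement for a count-trailing-zeros builtin, and
the bitmap is cleared word by word instead of being copied first.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical log recording which queues were notified, in order. */
struct notify_log {
    unsigned idx[16];
    unsigned count;
};

static void log_notify(struct notify_log *log, unsigned vq_idx)
{
    assert(log->count < 16);
    log->idx[log->count++] = vq_idx;
}

/* Portable count-trailing-zeros; the caller guarantees x != 0. */
static unsigned ctz_ul(unsigned long x)
{
    unsigned n = 0;

    assert(x != 0);
    while ((x & 1UL) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Mark queue idx as having a pending notification. */
static void pending_set(unsigned long *pending, unsigned idx)
{
    pending[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}

/* Notify each pending queue once, clear the bitmap, return the count. */
static unsigned pending_flush(unsigned long *pending, unsigned nvqs,
                              struct notify_log *log)
{
    unsigned notified = 0;

    for (size_t word = 0; word < BITS_TO_LONGS(nvqs); word++) {
        unsigned long bits = pending[word];

        pending[word] = 0;
        while (bits != 0) {
            /* bit position within the word plus the word's base offset */
            unsigned idx = (unsigned)(word * BITS_PER_LONG) + ctz_ul(bits);

            log_notify(log, idx);
            notified++;
            bits &= bits - 1; /* clear right-most set bit */
        }
    }
    return notified;
}
```

Note that the word array is indexed by word number while the queue index is
the word's base bit offset plus the position of the set bit; the two only
coincide for the first word.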




[Qemu-devel] [PATCH v2 0/8] virtio-blk: multiqueue support

2016-05-30 Thread Stefan Hajnoczi
v2:
 * Simplify s->rq live migration [Paolo]
 * Use more efficient bitmap ops for batch notification [Paolo]
 * Fix perf regression due to batch notify BH in wrong AioContext [Christian]

The virtio_blk guest driver has supported multiple virtqueues since Linux 3.17.
This patch series adds multiple virtqueues to QEMU's virtio-blk emulated
device.

Ming Lei sent patches previously but these were not merged.  This series
implements virtio-blk multiqueue for QEMU from scratch since the codebase has
changed.  Live migration support for s->rq was also missing from the previous
series and has been added.

It's important to note that QEMU's block layer does not support multiqueue yet.
Therefore virtio-blk device processes all virtqueues in the same AioContext
(IOThread).  Further work is necessary to take advantage of multiqueue support
in QEMU's block layer once it becomes available.

I will post performance results once they are ready.

Stefan Hajnoczi (8):
  virtio-blk: use batch notify in non-dataplane case
  virtio-blk: tell dataplane which vq to notify
  virtio-blk: associate request with a virtqueue
  virtio-blk: add VirtIOBlockConf->num_queues
  virtio-blk: multiqueue batch notify
  virtio-blk: live migrate s->rq with multiqueue
  virtio-blk: dataplane multiqueue support
  virtio-blk: add num-queues device property

 hw/block/dataplane/virtio-blk.c |  68 +++--
 hw/block/dataplane/virtio-blk.h |   2 +-
 hw/block/virtio-blk.c   | 129 +++-
 include/hw/virtio/virtio-blk.h  |   8 ++-
 4 files changed, 159 insertions(+), 48 deletions(-)

-- 
2.5.5




Re: [Qemu-devel] [Qemu-block] [PATCH v19 08/10] Implement new driver for block replication

2016-05-30 Thread Changlong Xie

On 05/31/2016 02:14 AM, Stefan Hajnoczi wrote:

On Fri, May 20, 2016 at 03:36:18PM +0800, Changlong Xie wrote:

+/* start backup job now */
+error_setg(&s->blocker,
+   "block device is in use by internal backup job");
+
+top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
+if (!top_bs || !check_top_bs(top_bs, bs)) {
+reopen_backing_file(s, false, NULL);
+aio_context_release(aio_context);
+return;
+}


Missing error_setg() with an error message when check_top_bs() fails.



Will add.

Thanks
-Xie





Re: [Qemu-devel] [PATCH 0/9] virtio-blk: multiqueue support

2016-05-30 Thread Stefan Hajnoczi
On Tue, May 24, 2016 at 02:51:04PM +0200, Christian Borntraeger wrote:
> On 05/21/2016 01:40 AM, Stefan Hajnoczi wrote:
> > The virtio_blk guest driver has supported multiple virtqueues since Linux 
> > 3.17.
> > This patch series adds multiple virtqueues to QEMU's virtio-blk emulated
> > device.
> > 
> > Ming Lei sent patches previously but these were not merged.  This series
> > implements virtio-blk multiqueue for QEMU from scratch since the codebase 
> > has
> > changed.  Live migration support for s->rq was also missing from the 
> > previous
> > series and has been added.
> > 
> > It's important to note that QEMU's block layer does not support multiqueue 
> > yet.
> > Therefore virtio-blk device processes all virtqueues in the same AioContext
> > (IOThread).  Further work is necessary to take advantage of multiqueue 
> > support
> > in QEMU's block layer once it becomes available.
> > 
> > I will post performance results once they are ready.
> > 
> > Stefan Hajnoczi (9):
> >   virtio-blk: use batch notify in non-dataplane case
> >   virtio-blk: tell dataplane which vq to notify
> >   virtio-blk: associate request with a virtqueue
> >   virtio-blk: add VirtIOBlockConf->num_queues
> >   virtio-blk: multiqueue batch notify
> >   vmstate: add VMSTATE_VARRAY_UINT32_ALLOC
> >   virtio-blk: live migrate s->rq with multiqueue
> >   virtio-blk: dataplane multiqueue support
> >   virtio-blk: add num-queues device property
> > 
> >  hw/block/dataplane/virtio-blk.c |  68 +++---
> >  hw/block/dataplane/virtio-blk.h |   2 +-
> >  hw/block/virtio-blk.c   | 200 
> > 
> >  include/hw/virtio/virtio-blk.h  |  13 ++-
> >  include/migration/vmstate.h |  10 ++
> >  5 files changed, 241 insertions(+), 52 deletions(-)
> > 
> 
> With 2.6 I see 2 host threads consuming a CPU when running fio in a single CPU
> guest with a null-blk device and iothread for that disk. (the vcpu thread and
> the iothread). With this patchset the main thread also consumes almost 80% of 
> a
> CPU doing polling in main_loop_wait. I have not even changed the num-queues 
> values.
> 
> So in essence 3 vs 2 host cpus.

I see the performance regression as well.  It is caused by Patch 1 and
there is a simple explanation: I broke dataplane because the BH is in
the main loop AioContext and not the dataplane AioContext.

Thanks for spotting this, Christian!

Stefan




[Qemu-devel] [PULL 10/12] exec: Do vmstate unregistration from cpu_exec_exit()

2016-05-30 Thread David Gibson
From: Bharata B Rao 

cpu_exec_init() does vmstate_register for the CPU device. This needs to be
undone from cpu_exec_exit(). This change is needed to support CPU hot
removal.

Signed-off-by: Bharata B Rao 
Reviewed-by: Thomas Huth 
Reviewed-by: David Gibson 
Acked-by: Paolo Bonzini 
[dwg: added missing include to fix compile on some archs]
Signed-off-by: David Gibson 
---
 exec.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/exec.c b/exec.c
index a1dfc01..7261172 100644
--- a/exec.c
+++ b/exec.c
@@ -57,6 +57,8 @@
 #include "exec/ram_addr.h"
 #include "exec/log.h"
 
+#include "migration/vmstate.h"
+
 #include "qemu/range.h"
 #ifndef _WIN32
 #include "qemu/mmap-alloc.h"
@@ -637,6 +639,8 @@ static void cpu_release_index(CPUState *cpu)
 
 void cpu_exec_exit(CPUState *cpu)
 {
+CPUClass *cc = CPU_GET_CLASS(cpu);
+
 #if defined(CONFIG_USER_ONLY)
 cpu_list_lock();
 #endif
@@ -654,6 +658,13 @@ void cpu_exec_exit(CPUState *cpu)
 #if defined(CONFIG_USER_ONLY)
 cpu_list_unlock();
 #endif
+
+if (cc->vmsd != NULL) {
+vmstate_unregister(NULL, cc->vmsd, cpu);
+}
+if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
+vmstate_unregister(NULL, &vmstate_cpu_common, cpu);
+}
 }
 
 void cpu_exec_init(CPUState *cpu, Error **errp)
-- 
2.5.5
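
The pairing this patch restores (whatever the init path registers, the exit path must unregister) can be sketched in isolation. The following is a toy model only: the registry, the Toy* types and the toy_* functions are all hypothetical and are not QEMU's vmstate API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8

/* Hypothetical stand-in for a migration-state registry. */
typedef struct {
    const void *desc;   /* description of the state, NULL if the slot is free */
    void *opaque;       /* object the state belongs to */
} ToyEntry;

static ToyEntry toy_registry[TOY_MAX_ENTRIES];

/* Record (desc, opaque); returns 0 on success, -1 if the table is full. */
static int toy_register(const void *desc, void *opaque)
{
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (toy_registry[i].desc == NULL) {
            toy_registry[i].desc = desc;
            toy_registry[i].opaque = opaque;
            return 0;
        }
    }
    return -1;
}

/* Drop the entry matching (desc, opaque); a no-op if there is none. */
static void toy_unregister(const void *desc, void *opaque)
{
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (toy_registry[i].desc == desc &&
            toy_registry[i].opaque == opaque) {
            toy_registry[i].desc = NULL;
            toy_registry[i].opaque = NULL;
        }
    }
}

/* Number of live registrations. */
static size_t toy_count(void)
{
    size_t n = 0;
    for (size_t i = 0; i < TOY_MAX_ENTRIES; i++) {
        n += (toy_registry[i].desc != NULL);
    }
    return n;
}

typedef struct { int id; } ToyCPU;
static const int toy_cpu_desc = 0;   /* only its address is used */

/* Init registers the CPU state ... */
static int toy_cpu_init(ToyCPU *cpu)
{
    return toy_register(&toy_cpu_desc, cpu);
}

/* ... and exit undoes exactly that registration. */
static void toy_cpu_exit(ToyCPU *cpu)
{
    toy_unregister(&toy_cpu_desc, cpu);
}
```

Without the unregister step, a removed CPU would leave a stale entry pointing at freed memory, which is presumably why hot removal needs this change.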




[Qemu-devel] [PULL 09/12] exec: Remove cpu from cpus list during cpu_exec_exit()

2016-05-30 Thread David Gibson
From: Bharata B Rao 

CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_exit() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Now -1 value for cpu->cpu_index indicates that we have already dequeued
the cpu for CONFIG_USER_ONLY case also.

Signed-off-by: Bharata B Rao 
Reviewed-by: David Gibson 
Reviewed-by: Thomas Huth 
Acked-by: Paolo Bonzini 
Signed-off-by: David Gibson 
---
 exec.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/exec.c b/exec.c
index a3a93ae..a1dfc01 100644
--- a/exec.c
+++ b/exec.c
@@ -612,15 +612,9 @@ static int cpu_get_free_index(Error **errp)
 return cpu;
 }
 
-void cpu_exec_exit(CPUState *cpu)
+static void cpu_release_index(CPUState *cpu)
 {
-if (cpu->cpu_index == -1) {
-/* cpu_index was never allocated by this @cpu or was already freed. */
-return;
-}
-
 bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
-cpu->cpu_index = -1;
 }
 #else
 
@@ -635,11 +629,33 @@ static int cpu_get_free_index(Error **errp)
 return cpu_index;
 }
 
-void cpu_exec_exit(CPUState *cpu)
+static void cpu_release_index(CPUState *cpu)
 {
+return;
 }
 #endif
 
+void cpu_exec_exit(CPUState *cpu)
+{
+#if defined(CONFIG_USER_ONLY)
+cpu_list_lock();
+#endif
+if (cpu->cpu_index == -1) {
+/* cpu_index was never allocated by this @cpu or was already freed. */
+#if defined(CONFIG_USER_ONLY)
+cpu_list_unlock();
+#endif
+return;
+}
+
+QTAILQ_REMOVE(&cpus, cpu, node);
+cpu_release_index(cpu);
+cpu->cpu_index = -1;
+#if defined(CONFIG_USER_ONLY)
+cpu_list_unlock();
+#endif
+}
+
 void cpu_exec_init(CPUState *cpu, Error **errp)
 {
 CPUClass *cc = CPU_GET_CLASS(cpu);
-- 
2.5.5
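
The "dequeue only once" guarantee above rests on using -1 as a sentinel for cpu_index. A minimal standalone sketch of that guard follows; the Toy* names and the counter are hypothetical and exist only for illustration, and the locking from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CPUS 8

typedef struct {
    int cpu_index;              /* -1 means "holds no index" */
} ToyCPU;

static bool toy_index_in_use[TOY_MAX_CPUS];
static int toy_teardowns;       /* counts teardowns that did real work */

/* Allocate the lowest free index; returns 0 on success, -1 if none left. */
static int toy_cpu_init(ToyCPU *cpu)
{
    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        if (!toy_index_in_use[i]) {
            toy_index_in_use[i] = true;
            cpu->cpu_index = i;
            return 0;
        }
    }
    cpu->cpu_index = -1;
    return -1;
}

/* Safe to call from more than one teardown path. */
static void toy_cpu_exit(ToyCPU *cpu)
{
    if (cpu->cpu_index == -1) {
        /* Never allocated, or already released by an earlier call. */
        return;
    }
    toy_index_in_use[cpu->cpu_index] = false;
    cpu->cpu_index = -1;        /* mark as released for later callers */
    toy_teardowns++;
}
```

Calling toy_cpu_exit() twice on the same object releases the index once; the second call returns early.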




[Qemu-devel] [PULL 07/12] ppc: Get out of emulation on SMT "OR" ops

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

Otherwise tight loops at smt_low for example, which OPAL does,
eat so much CPU that we can't boot a kernel anymore. With that,
I can boot 8 CPUs just fine with powernv.

Signed-off-by: Benjamin Herrenschmidt 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 target-ppc/translate.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 51f6eb1..fe10bf8 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1392,6 +1392,19 @@ GEN_LOGICAL2(nand, tcg_gen_nand_tl, 0x0E, PPC_INTEGER);
 /* nor & nor. */
 GEN_LOGICAL2(nor, tcg_gen_nor_tl, 0x03, PPC_INTEGER);
 
+#if defined(TARGET_PPC64)
+static void gen_pause(DisasContext *ctx)
+{
+TCGv_i32 t0 = tcg_const_i32(0);
+tcg_gen_st_i32(t0, cpu_env,
+   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
+tcg_temp_free_i32(t0);
+
+/* Stop translation, this gives other CPUs a chance to run */
+gen_exception_err(ctx, EXCP_HLT, 1);
+}
+#endif /* defined(TARGET_PPC64) */
+
 /* or & or. */
 static void gen_or(DisasContext *ctx)
 {
@@ -1447,7 +1460,7 @@ static void gen_or(DisasContext *ctx)
 }
 break;
 case 7:
-if (ctx->hv) {
+if (ctx->hv && !ctx->pr) {
 /* Set process priority to very high */
 prio = 7;
 }
@@ -1464,6 +1477,10 @@ static void gen_or(DisasContext *ctx)
 tcg_gen_ori_tl(t0, t0, ((uint64_t)prio) << 50);
 gen_store_spr(SPR_PPR, t0);
 tcg_temp_free(t0);
+/* Pause us out of TCG otherwise spin loops with smt_low
+ * eat too much CPU and the kernel hangs
+ */
+gen_pause(ctx);
 }
 #endif
 }
@@ -1489,8 +1506,6 @@ static void gen_ori(DisasContext *ctx)
 target_ulong uimm = UIMM(ctx->opcode);
 
 if (rS(ctx->opcode) == rA(ctx->opcode) && uimm == 0) {
-/* NOP */
-/* XXX: should handle special NOPs for POWER series */
 return;
 }
 tcg_gen_ori_tl(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)], uimm);
-- 
2.5.5
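
The store in gen_pause() addresses the halted field relative to the env pointer using "-offsetof(PowerPCCPU, env) + offsetof(CPUState, halted)". A small sketch of that pointer arithmetic, using made-up Toy* structures (the real QEMU layouts are much larger) and assuming, as the expression itself implies, that the generic state sits at offset 0 of the target CPU object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified layouts. */
typedef struct { int halted; } ToyCPUState;     /* generic part */
typedef struct { uint64_t msr; } ToyEnv;        /* target-specific part */
typedef struct {
    ToyCPUState parent;     /* must be the first member for this to work */
    ToyEnv env;
} ToyPowerPCCPU;

/* Signed byte offset from &cpu->env to cpu->parent.halted. */
static ptrdiff_t toy_halted_offset_from_env(void)
{
    return -(ptrdiff_t)offsetof(ToyPowerPCCPU, env)
           + (ptrdiff_t)offsetof(ToyCPUState, halted);
}

/* Recover a pointer to the halted field given only the env pointer. */
static int *toy_halted_from_env(ToyEnv *env)
{
    return (int *)((char *)env + toy_halted_offset_from_env());
}
```

Stepping back by the offset of env reaches the start of the enclosing object; adding the offset of halted then lands on the field, so generated code that only has the env pointer can still write it.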




[Qemu-devel] [PULL 05/12] ppc: Change 'invalid' bit mask of tlbiel and tlbie

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

Otherwise it will trip on the forms used in recent architecture.

Ideally, we should have different handlers for different architecture
levels but our current implementation of TLB flushing is dumb enough
that this will do for now.

Signed-off-by: Benjamin Herrenschmidt 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 target-ppc/translate.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 690ffd2..868ef31 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -9946,8 +9946,10 @@ GEN_HANDLER2(slbmfee, "slbmfee", 0x1F, 0x13, 0x1C, 
0x001F0001, PPC_SEGMENT_64B),
 GEN_HANDLER2(slbmfev, "slbmfev", 0x1F, 0x13, 0x1A, 0x001F0001, 
PPC_SEGMENT_64B),
 #endif
 GEN_HANDLER(tlbia, 0x1F, 0x12, 0x0B, 0x03FFFC01, PPC_MEM_TLBIA),
-GEN_HANDLER(tlbiel, 0x1F, 0x12, 0x08, 0x03FF0001, PPC_MEM_TLBIE),
-GEN_HANDLER(tlbie, 0x1F, 0x12, 0x09, 0x03FF0001, PPC_MEM_TLBIE),
+/* XXX Those instructions will need to be handled differently for
+ * different ISA versions */
+GEN_HANDLER(tlbiel, 0x1F, 0x12, 0x08, 0x001F0001, PPC_MEM_TLBIE),
+GEN_HANDLER(tlbie, 0x1F, 0x12, 0x09, 0x001F0001, PPC_MEM_TLBIE),
 GEN_HANDLER(tlbsync, 0x1F, 0x16, 0x11, 0x03FFF801, PPC_MEM_TLBSYNC),
 #if defined(TARGET_PPC64)
 GEN_HANDLER(slbia, 0x1F, 0x12, 0x0F, 0x03FFFC01, PPC_SLBI),
-- 
2.5.5
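
To see what narrowing the mask from 0x03FF0001 to 0x001F0001 changes, it helps to model the check directly. The sketch below assumes the usual interpretation of the mask (an opcode is rejected when any bit set in the mask is also set in the opcode); the encoder is a hypothetical helper that places one operand field in bits 21..25 and another in bits 11..15 of an X-form instruction with primary opcode 31 and extended opcode 306.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the opcode sets any bit that the handler marks as invalid. */
static bool toy_hits_invalid_bits(uint32_t opcode, uint32_t inval_mask)
{
    return (opcode & inval_mask) != 0;
}

/* Hypothetical encoder: hi_field goes to bits 21..25, rb to bits 11..15. */
static uint32_t toy_encode_tlbie(unsigned hi_field, unsigned rb)
{
    return (31u << 26)
         | ((hi_field & 0x1fu) << 21)
         | ((rb & 0x1fu) << 11)
         | (306u << 1);
}
```

With the old mask, any encoding that uses bits 21..25 is rejected; with the new mask those five bits are free and only bits 16..20 and bit 0 remain reserved. For example, toy_encode_tlbie(5, 3) trips 0x03FF0001 but passes 0x001F0001.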




[Qemu-devel] [PULL 08/12] ppc: Add PPC_64H instruction flag to POWER7 and POWER8

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

This will enable decoding of hrfid

Signed-off-by: Benjamin Herrenschmidt 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index a003c10..8301076 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8359,7 +8359,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
PPC_MEM_SYNC | PPC_MEM_EIEIO |
PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
-   PPC_64B | PPC_ALTIVEC |
+   PPC_64B | PPC_64H | PPC_ALTIVEC |
PPC_SEGMENT_64B | PPC_SLBI |
PPC_POPCNTB | PPC_POPCNTWD;
 pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205 |
@@ -8439,7 +8439,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
PPC_MEM_SYNC | PPC_MEM_EIEIO |
PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
-   PPC_64B | PPC_64BX | PPC_ALTIVEC |
+   PPC_64B | PPC_64H | PPC_64BX | PPC_ALTIVEC |
PPC_SEGMENT_64B | PPC_SLBI |
PPC_POPCNTB | PPC_POPCNTWD;
 pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX |
-- 
2.5.5




[Qemu-devel] [PULL 03/12] ppc: Do some batching of TCG tlb flushes

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time).

Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...

Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.

This reduces the number of tlb_flush() on a boot to an Ubuntu installer
first dialog screen from roughly 360K down to 36K.
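
A minimal sketch of the batching idea, assuming a flag-and-check scheme as described: the tlb_need_flush field name is taken from this patch, while the Toy* type, the counter and the toy_* functions are hypothetical. The body of the real check_tlb_flush() is not visible in this excerpt, so this shows only the general shape.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t tlb_need_flush;    /* nonzero: a flush is owed */
    unsigned flush_count;       /* toy counter standing in for flush cost */
} ToyEnv;

/* Stand-in for the expensive full TLB flush. */
static void toy_tlb_flush(ToyEnv *env)
{
    env->flush_count++;
}

/* slbie/tlbie path: only note that a flush is owed. */
static void toy_invalidate_entry(ToyEnv *env)
{
    env->tlb_need_flush = 1;
}

/* Synchronizing points (interrupt, rfi, isync, tlbsync) call this. */
static void toy_check_tlb_flush(ToyEnv *env)
{
    if (env->tlb_need_flush) {
        env->tlb_need_flush = 0;
        toy_tlb_flush(env);
    }
}
```

A burst of four invalidations followed by one synchronizing event costs a single flush here instead of four.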

Signed-off-by: Benjamin Herrenschmidt 
[clg: added a 'CPUPPCState *' variable in h_remove() and
  h_bulk_remove() ]
Signed-off-by: Cédric Le Goater 
[dwg: removed spurious whitespace change, use 0/1 not true/false
  consistently, since tlb_need_flush has int type]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_hcall.c | 14 +++---
 target-ppc/cpu.h |  2 ++
 target-ppc/excp_helper.c |  8 
 target-ppc/helper.h  |  1 +
 target-ppc/helper_regs.h | 13 +
 target-ppc/mmu-hash64.c  | 11 +++
 target-ppc/mmu_helper.c  |  9 -
 target-ppc/translate.c   | 39 ---
 8 files changed, 82 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index feb3629..9a3f4ec 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -186,6 +186,7 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 static target_ulong h_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
  target_ulong opcode, target_ulong *args)
 {
+CPUPPCState *env = &cpu->env;
 target_ulong flags = args[0];
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
@@ -196,6 +197,7 @@ static target_ulong h_remove(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 
 switch (ret) {
 case REMOVE_SUCCESS:
+check_tlb_flush(env);
 return H_SUCCESS;
 
 case REMOVE_NOT_FOUND:
@@ -232,7 +234,9 @@ static target_ulong h_remove(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 static target_ulong h_bulk_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
   target_ulong opcode, target_ulong *args)
 {
+CPUPPCState *env = &cpu->env;
 int i;
+target_ulong rc = H_SUCCESS;
 
 for (i = 0; i < H_BULK_REMOVE_MAX_BATCH; i++) {
 target_ulong *tsh = &args[i*2];
@@ -265,14 +269,18 @@ static target_ulong h_bulk_remove(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 break;
 
 case REMOVE_PARM:
-return H_PARAMETER;
+rc = H_PARAMETER;
+goto exit;
 
 case REMOVE_HW:
-return H_HARDWARE;
+rc = H_HARDWARE;
+goto exit;
 }
 }
+ exit:
+check_tlb_flush(env);
 
-return H_SUCCESS;
+return rc;
 }
 
 static target_ulong h_protect(PowerPCCPU *cpu, sPAPRMachineState *spapr,
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 2c8c8c0..98a24a5 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -958,6 +958,8 @@ struct CPUPPCState {
 /* PowerPC 64 SLB area */
 ppc_slb_t slb[MAX_SLB_ENTRIES];
 int32_t slb_nr;
+/* tcg TLB needs flush (deferred slb inval instruction typically) */
+uint32_t tlb_need_flush;
 #endif
 /* segment registers */
 hwaddr htab_base;
diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c
index ba3caec..a37009e 100644
--- a/target-ppc/excp_helper.c
+++ b/target-ppc/excp_helper.c
@@ -718,6 +718,11 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* Reset exception state */
 cs->exception_index = POWERPC_EXCP_NONE;
 env->error_code = 0;
+
+/* Any interrupt is context synchronizing, check if TCG TLB
+ * needs a delayed flush on ppc64
+ */
+check_tlb_flush(env);
 }
 
 void ppc_cpu_do_interrupt(CPUState *cs)
@@ -943,6 +948,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong 
nip, target_ulong msr,
  * as rfi is always the last insn of a TB
  */
 cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+
+/* Context synchronizing: check if TCG TLB needs flush */
+check_tlb_flush(env);
 }
 
 void helper_rfi(CPUPPCState *env)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index e5a8f7b..0526322 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -16,6 +16,7 @@ DEF_HELPER_1(rfmci, void, 

[Qemu-devel] [PULL 02/12] ppc: Use split I/D mmu modes to avoid flushes on interrupts

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

We rework the way the MMU indices are calculated, providing separate
indices for I and D side based on MSR:IR and MSR:DR respectively,
and thus no longer need to flush the TLB on context changes. This also
adds correct support for HV as a separate address space.
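
A rough sketch of the idea, using the six-entry server encoding listed in the helper_regs.h comment of this patch (0..2 guest, 3..5 HV; user virtual, kernel virtual, kernel real). The mapping function is my own guess at one way to realize that table, not the patch's code, and it assumes user mode always runs with translation on. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool pr, hv;    /* problem state, hypervisor state */
    bool ir, dr;    /* instruction and data relocation */
    int immu_idx;   /* index used for instruction fetches */
    int dmmu_idx;   /* index used for data accesses */
} ToyEnv;

/* Map one side's state onto the six-entry table. */
static int toy_mode_index(bool hv, bool pr, bool relocate)
{
    int base = hv ? 3 : 0;
    if (pr) {
        return base;                    /* user space, virtual mode */
    }
    return base + (relocate ? 1 : 2);   /* kernel virtual or kernel real */
}

/* Recompute both indices whenever the relevant MSR bits change. */
static void toy_compute_mem_idx(ToyEnv *env)
{
    env->immu_idx = toy_mode_index(env->hv, env->pr, env->ir);
    env->dmmu_idx = toy_mode_index(env->hv, env->pr, env->dr);
}

/* Same shape as the new cpu_mmu_index() in this patch. */
static int toy_mmu_index(const ToyEnv *env, bool ifetch)
{
    return ifetch ? env->immu_idx : env->dmmu_idx;
}
```

Because each translation regime gets its own index, switching IR or DR selects a different set of cached translations instead of forcing a flush of the existing ones.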

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 target-ppc/cpu.h | 11 +++---
 target-ppc/excp_helper.c | 11 --
 target-ppc/helper_regs.h | 54 +---
 target-ppc/machine.c |  5 -
 target-ppc/translate.c   |  7 ---
 5 files changed, 63 insertions(+), 25 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 02e71ea..2c8c8c0 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -359,6 +359,8 @@ struct ppc_slb_t {
 #define MSR_EP   6  /* Exception prefix on 601   */
 #define MSR_IR   5  /* Instruction relocate  */
 #define MSR_DR   4  /* Data relocate */
+#define MSR_IS   5  /* Instruction address space (BookE) */
+#define MSR_DS   4  /* Data address space (BookE)*/
 #define MSR_PE   3  /* Protection enable on 403  */
 #define MSR_PX   2  /* Protection exclusive on 403  x*/
 #define MSR_PMM  2  /* Performance monitor mark on POWERx*/
@@ -410,6 +412,8 @@ struct ppc_slb_t {
 #define msr_ep   ((env->msr >> MSR_EP)   & 1)
 #define msr_ir   ((env->msr >> MSR_IR)   & 1)
 #define msr_dr   ((env->msr >> MSR_DR)   & 1)
+#define msr_is   ((env->msr >> MSR_IS)   & 1)
+#define msr_ds   ((env->msr >> MSR_DS)   & 1)
 #define msr_pe   ((env->msr >> MSR_PE)   & 1)
 #define msr_px   ((env->msr >> MSR_PX)   & 1)
 #define msr_pmm  ((env->msr >> MSR_PMM)  & 1)
@@ -889,7 +893,7 @@ struct ppc_segment_page_sizes {
 
 /*/
 /* The whole PowerPC CPU context */
-#define NB_MMU_MODES 3
+#define NB_MMU_MODES8
 
 #define PPC_CPU_OPCODES_LEN  0x40
 #define PPC_CPU_INDIRECT_OPCODES_LEN 0x20
@@ -1053,7 +1057,8 @@ struct CPUPPCState {
 /* Those resources are used only in QEMU core */
 target_ulong hflags;  /* hflags is a MSR & HFLAGS_MASK */
 target_ulong hflags_nmsr; /* specific hflags, not coming from MSR */
-int mmu_idx; /* precomputed MMU index to speed up mem accesses */
+int immu_idx; /* precomputed MMU index to speed up insn access */
+int dmmu_idx; /* precomputed MMU index to speed up data accesses */
 
 /* Power management */
 int (*check_pow)(CPUPPCState *env);
@@ -1245,7 +1250,7 @@ int ppc_dcr_write (ppc_dcr_t *dcr_env, int dcrn, uint32_t 
val);
 #define MMU_USER_IDX 0
 static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
 {
-return env->mmu_idx;
+return ifetch ? env->immu_idx : env->dmmu_idx;
 }
 
 #include "exec/cpu-all.h"
diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c
index 288903e..ba3caec 100644
--- a/target-ppc/excp_helper.c
+++ b/target-ppc/excp_helper.c
@@ -646,9 +646,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 
 if (env->spr[SPR_LPCR] & LPCR_AIL) {
 new_msr |= (1 << MSR_IR) | (1 << MSR_DR);
-} else if (msr & ((1 << MSR_IR) | (1 << MSR_DR))) {
-/* If we disactivated any translation, flush TLBs */
-tlb_flush(cs, 1);
 }
 
 #ifdef TARGET_PPC64
@@ -721,14 +718,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* Reset exception state */
 cs->exception_index = POWERPC_EXCP_NONE;
 env->error_code = 0;
-
-if ((env->mmu_model == POWERPC_MMU_BOOKE) ||
-(env->mmu_model == POWERPC_MMU_BOOKE206)) {
-/* XXX: The BookE changes address space when switching modes,
-we should probably implement that as different MMU indexes,
-but for the moment we do it the slow way and flush all.  */
-tlb_flush(cs, 1);
-}
 }
 
 void ppc_cpu_do_interrupt(CPUState *cs)
diff --git a/target-ppc/helper_regs.h b/target-ppc/helper_regs.h
index 271fddf..f7edd5b 100644
--- a/target-ppc/helper_regs.h
+++ b/target-ppc/helper_regs.h
@@ -41,11 +41,50 @@ static inline void hreg_swap_gpr_tgpr(CPUPPCState *env)
 
 static inline void hreg_compute_mem_idx(CPUPPCState *env)
 {
-/* Precompute MMU index */
-if (msr_pr == 0 && msr_hv != 0) {
-env->mmu_idx = 2;
+/* This is our encoding for server processors
+ *
+ *   0 = Guest User space virtual mode
+ *   1 = Guest Kernel space virtual mode
+ *   2 = Guest Kernel space real mode
+ *   3 = HV User space virtual mode
+ *   4 = HV Kernel space virtual mode
+ *   5 = HV Kernel space real mode
+ *
+ * The 

[Qemu-devel] [PULL 11/12] cpu: Reclaim vCPU objects

2016-05-30 Thread David Gibson
From: Gu Zheng 

KVM vcpus cannot be removed without some form of protection, so we do not
close the KVM vcpu fd. Instead we record it in a list and mark it as stopped,
so that it can be reused by a subsequent CPU hot-add request if possible.
This is also the approach that the KVM developers suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
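
The park-and-reuse scheme can be sketched as follows. This is a toy model under stated assumptions: the kvm-all.c part of the patch is not visible in this excerpt, so the list layout, the toy_* functions and the fake fd counter are all hypothetical.

```c
#include <assert.h>

#define TOY_MAX_PARKED 8

typedef struct {
    int in_use;             /* slot holds a parked vcpu */
    unsigned long vcpu_id;
    int fd;
} ToyParkedVcpu;

static ToyParkedVcpu toy_parked[TOY_MAX_PARKED];
static int toy_next_fd = 100;   /* stand-in for a real "create vcpu" call */

static int toy_create_fd(void)
{
    return toy_next_fd++;
}

/* On unplug: keep the fd instead of closing it. Returns 0, or -1 if full. */
static int toy_park_vcpu(unsigned long vcpu_id, int fd)
{
    for (int i = 0; i < TOY_MAX_PARKED; i++) {
        if (!toy_parked[i].in_use) {
            toy_parked[i].in_use = 1;
            toy_parked[i].vcpu_id = vcpu_id;
            toy_parked[i].fd = fd;
            return 0;
        }
    }
    return -1;
}

/* On hot-add: reuse a parked fd for this id if one exists, else create. */
static int toy_get_vcpu(unsigned long vcpu_id)
{
    for (int i = 0; i < TOY_MAX_PARKED; i++) {
        if (toy_parked[i].in_use && toy_parked[i].vcpu_id == vcpu_id) {
            toy_parked[i].in_use = 0;
            return toy_parked[i].fd;
        }
    }
    return toy_create_fd();
}
```

A remove followed by an add of the same vcpu id hands back the original fd; only an id with no parked entry triggers a fresh creation.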

Signed-off-by: Chen Fan 
Signed-off-by: Gu Zheng 
Signed-off-by: Zhu Guihua 
Signed-off-by: Bharata B Rao 
   [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
  isn't needed as it is done from cpu_exec_exit()
- Use iothread mutex instead of global mutex during
  destroy
- Don't cleanup vCPU object from vCPU thread context
  but leave it to the callers (device_add/device_del)]
Reviewed-by: Thomas Huth 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 cpus.c   | 39 +--
 include/qom/cpu.h| 10 +
 include/sysemu/kvm.h |  1 +
 kvm-all.c| 57 +++-
 kvm-stub.c   |  5 +
 5 files changed, 109 insertions(+), 3 deletions(-)

diff --git a/cpus.c b/cpus.c
index e75895a..3e3ef95 100644
--- a/cpus.c
+++ b/cpus.c
@@ -972,6 +972,18 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+static void qemu_kvm_destroy_vcpu(CPUState *cpu)
+{
+if (kvm_destroy_vcpu(cpu) < 0) {
+error_report("kvm_destroy_vcpu failed");
+exit(EXIT_FAILURE);
+}
+}
+
+static void qemu_tcg_destroy_vcpu(CPUState *cpu)
+{
+}
+
 static void flush_queued_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -1061,7 +1073,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 cpu->created = true;
 qemu_cond_signal(&qemu_cpu_cond);
 
-while (1) {
+do {
 if (cpu_can_run(cpu)) {
 r = kvm_cpu_exec(cpu);
 if (r == EXCP_DEBUG) {
@@ -1069,8 +1081,10 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 }
 }
 qemu_kvm_wait_io_event(cpu);
-}
+} while (!cpu->unplug || cpu_can_run(cpu));
 
+qemu_kvm_destroy_vcpu(cpu);
+qemu_mutex_unlock_iothread();
 return NULL;
 }
 
@@ -1124,6 +1138,7 @@ static void tcg_exec_all(void);
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+CPUState *remove_cpu = NULL;
 
 rcu_register_thread();
 
@@ -1161,6 +1176,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 }
 }
 qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
+CPU_FOREACH(cpu) {
+if (cpu->unplug && !cpu_can_run(cpu)) {
+remove_cpu = cpu;
+break;
+}
+}
+if (remove_cpu) {
+qemu_tcg_destroy_vcpu(remove_cpu);
+remove_cpu = NULL;
+}
 }
 
 return NULL;
@@ -1317,6 +1342,13 @@ void resume_all_vcpus(void)
 }
 }
 
+void cpu_remove(CPUState *cpu)
+{
+cpu->stop = true;
+cpu->unplug = true;
+qemu_cpu_kick(cpu);
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
@@ -1533,6 +1565,9 @@ static void tcg_exec_all(void)
 break;
 }
 } else if (cpu->stop || cpu->stopped) {
+if (cpu->unplug) {
+next_cpu = CPU_NEXT(cpu);
+}
 break;
 }
 }
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index c9ba16c..3b57757 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -244,6 +244,7 @@ struct qemu_work_item {
  * @halted: Nonzero if the CPU is in suspended state.
  * @stop: Indicates a pending stop request.
  * @stopped: Indicates the CPU has been artificially stopped.
+ * @unplug: Indicates a pending CPU unplug request.
  * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *   CPU and return to its top level loop.
@@ -296,6 +297,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+bool unplug;
 bool crash_occurred;
 bool exit_request;
 bool tb_flushed;
@@ -763,6 +765,14 @@ void cpu_exit(CPUState *cpu);
 void cpu_resume(CPUState *cpu);
 
 /**
+ * cpu_remove:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed.
+ */
+void cpu_remove(CPUState *cpu);
+
+/**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
  *
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index f357ccd..65569ed 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -216,6 +216,7 @@ int kvm_has_intx_set_mask(void);
 
 int kvm_init_vcpu(CPUState *cpu);
 int 

[Qemu-devel] [PULL 06/12] ppc: Fix sign extension issue in mtmsr(d) emulation

2016-05-30 Thread David Gibson
From: Michael Neuling 

Signed-off-by: Michael Neuling 
Signed-off-by: Benjamin Herrenschmidt 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 target-ppc/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 868ef31..51f6eb1 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -4381,7 +4381,7 @@ static void gen_mtmsrd(DisasContext *ctx)
 /* Special form that does not need any synchronisation */
 TCGv t0 = tcg_temp_new();
 tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)], (1 << MSR_RI) | (1 << 
MSR_EE));
-tcg_gen_andi_tl(cpu_msr, cpu_msr, ~((1 << MSR_RI) | (1 << MSR_EE)));
+tcg_gen_andi_tl(cpu_msr, cpu_msr, ~(target_ulong)((1 << MSR_RI) | (1 
<< MSR_EE)));
 tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
 tcg_temp_free(t0);
 } else {
@@ -4412,7 +4412,7 @@ static void gen_mtmsr(DisasContext *ctx)
 /* Special form that does not need any synchronisation */
 TCGv t0 = tcg_temp_new();
 tcg_gen_andi_tl(t0, cpu_gpr[rS(ctx->opcode)], (1 << MSR_RI) | (1 << 
MSR_EE));
-tcg_gen_andi_tl(cpu_msr, cpu_msr, ~((1 << MSR_RI) | (1 << MSR_EE)));
+tcg_gen_andi_tl(cpu_msr, cpu_msr, ~(target_ulong)((1 << MSR_RI) | (1 
<< MSR_EE)));
 tcg_gen_or_tl(cpu_msr, cpu_msr, t0);
 tcg_temp_free(t0);
 } else {
-- 
2.5.5




[Qemu-devel] [PULL 04/12] ppc: tlbie, tlbia and tlbisync are HV only

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

Not that anything remotely recent supports tlbia but ...

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 target-ppc/translate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index dfd3010..690ffd2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -4858,7 +4858,7 @@ static void gen_tlbie(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 #else
-if (unlikely(ctx->pr)) {
+if (unlikely(ctx->pr || !ctx->hv)) {
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 return;
 }
@@ -4879,7 +4879,7 @@ static void gen_tlbsync(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 #else
-if (unlikely(ctx->pr)) {
+if (unlikely(ctx->pr || !ctx->hv)) {
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 return;
 }
@@ -4898,7 +4898,7 @@ static void gen_slbia(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 #else
-if (unlikely(ctx->pr)) {
+if (unlikely(ctx->pr || !ctx->hv)) {
 gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 return;
 }
-- 
2.5.5




[Qemu-devel] [PULL 12/12] cpu: Add a sync version of cpu_remove()

2016-05-30 Thread David Gibson
From: Bharata B Rao 

This sync API will be used by the CPU hotplug code to wait for the CPU to
completely get removed before flagging the failure to the device_add
command.

Sync version of this call is needed to correctly recover from CPU
realization failures when ->plug() handler fails.

Signed-off-by: Bharata B Rao 
Reviewed-by: David Gibson 
Acked-by: Paolo Bonzini 
Signed-off-by: David Gibson 
---
 cpus.c| 12 
 include/qom/cpu.h |  8 
 2 files changed, 20 insertions(+)

diff --git a/cpus.c b/cpus.c
index 3e3ef95..326742f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1084,6 +1084,8 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 } while (!cpu->unplug || cpu_can_run(cpu));
 
 qemu_kvm_destroy_vcpu(cpu);
+cpu->created = false;
+qemu_cond_signal(&qemu_cpu_cond);
 qemu_mutex_unlock_iothread();
 return NULL;
 }
@@ -1184,6 +1186,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 }
 if (remove_cpu) {
 qemu_tcg_destroy_vcpu(remove_cpu);
+cpu->created = false;
+qemu_cond_signal(&qemu_cpu_cond);
 remove_cpu = NULL;
 }
 }
@@ -1349,6 +1353,14 @@ void cpu_remove(CPUState *cpu)
 qemu_cpu_kick(cpu);
 }
 
+void cpu_remove_sync(CPUState *cpu)
+{
+cpu_remove(cpu);
+while (cpu->created) {
+qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
+}
+}
+
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 3b57757..32f3af3 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -772,6 +772,14 @@ void cpu_resume(CPUState *cpu);
  */
 void cpu_remove(CPUState *cpu);
 
+ /**
+ * cpu_remove_sync:
+ * @cpu: The CPU to remove.
+ *
+ * Requests the CPU to be removed and waits till it is removed.
+ */
+void cpu_remove_sync(CPUState *cpu);
+
 /**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
-- 
2.5.5
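
The wait in cpu_remove_sync() is the standard "loop on a predicate under a mutex" use of a condition variable. A standalone sketch with POSIX threads rather than QEMU's wrappers follows; the ToyCPU type and toy_* functions are hypothetical, and a single condition variable is shared for both directions, much like the one global condition in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool created;   /* cleared by the vCPU thread once it has torn down */
    bool unplug;    /* set by the remover to request teardown */
} ToyCPU;

/* vCPU thread: idle until asked to unplug, then report completion. */
static void *toy_cpu_thread(void *arg)
{
    ToyCPU *cpu = arg;

    pthread_mutex_lock(&cpu->lock);
    while (!cpu->unplug) {
        pthread_cond_wait(&cpu->cond, &cpu->lock);
    }
    cpu->created = false;
    pthread_cond_broadcast(&cpu->cond);
    pthread_mutex_unlock(&cpu->lock);
    return NULL;
}

/* Asynchronous request; the caller must hold cpu->lock. */
static void toy_cpu_remove(ToyCPU *cpu)
{
    cpu->unplug = true;
    pthread_cond_broadcast(&cpu->cond);
}

/* Synchronous variant; the caller must hold cpu->lock. */
static void toy_cpu_remove_sync(ToyCPU *cpu)
{
    toy_cpu_remove(cpu);
    while (cpu->created) {
        /* Releases the lock while sleeping, reacquires it on wakeup. */
        pthread_cond_wait(&cpu->cond, &cpu->lock);
    }
}

/* Runs one create/remove cycle; returns true if the thread went away. */
static bool toy_demo(void)
{
    ToyCPU cpu;
    pthread_t tid;
    bool gone;

    pthread_mutex_init(&cpu.lock, NULL);
    pthread_cond_init(&cpu.cond, NULL);
    cpu.created = true;
    cpu.unplug = false;

    if (pthread_create(&tid, NULL, toy_cpu_thread, &cpu) != 0) {
        return false;
    }
    pthread_mutex_lock(&cpu.lock);
    toy_cpu_remove_sync(&cpu);
    gone = !cpu.created;
    pthread_mutex_unlock(&cpu.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&cpu.cond);
    pthread_mutex_destroy(&cpu.lock);
    return gone;
}
```

The predicate loop matters: a wakeup alone proves nothing, so the waiter rechecks created every time it returns from the wait.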




[Qemu-devel] [PULL 00/12] ppc-for-2.7 queue 20160531

2016-05-30 Thread David Gibson
The following changes since commit d6550e9ed2e1a60d889dfb721de00d9a4e3bafbe:

  Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160527' 
into staging (2016-05-27 14:05:48 +0100)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.7-20160531

for you to fetch changes up to 2c579042e3be50bb40a233a6986348b4f40ed026:

  cpu: Add a sync version of cpu_remove() (2016-05-30 14:17:05 +1000)


ppc patch queue for 2016-05-31

Here's another ppc patch queue.  This batch is all preliminaries
towards two significant features:

1) Full hypervisor-mode support for POWER8
Patches 1-8 start fixing various bugs with TCG's handling of
hypervisor mode

2) CPU hotplug support
Patches 9-12 make some preliminary fixes towards implementing CPU
hotplug on ppc64 (and other non-x86 platforms).  These patches are
actually to generic code, not ppc, but are included here with
Paolo's ACK.


Benjamin Herrenschmidt (7):
  ppc: Remove MMU_MODEn_SUFFIX definitions
  ppc: Use split I/D mmu modes to avoid flushes on interrupts
  ppc: Do some batching of TCG tlb flushes
  ppc: tlbie, tlbia and tlbisync are HV only
  ppc: Change 'invalid' bit mask of tlbiel and tlbie
  ppc: Get out of emulation on SMT "OR" ops
  ppc: Add PPC_64H instruction flag to POWER7 and POWER8

Bharata B Rao (3):
  exec: Remove cpu from cpus list during cpu_exec_exit()
  exec: Do vmstate unregistration from cpu_exec_exit()
  cpu: Add a sync version of cpu_remove()

Gu Zheng (1):
  cpu: Reclaim vCPU objects

Michael Neuling (1):
  ppc: Fix sign extension issue in mtmsr(d) emulation

 cpus.c  | 51 ++--
 exec.c  | 43 ++-
 hw/ppc/spapr_hcall.c| 14 ++--
 include/qom/cpu.h   | 18 ++
 include/sysemu/kvm.h|  1 +
 kvm-all.c   | 57 ++-
 kvm-stub.c  |  5 +++
 target-ppc/cpu.h| 16 +
 target-ppc/excp_helper.c| 17 --
 target-ppc/helper.h |  1 +
 target-ppc/helper_regs.h| 67 
 target-ppc/machine.c|  5 ++-
 target-ppc/mmu-hash64.c | 11 ++
 target-ppc/mmu_helper.c |  9 -
 target-ppc/translate.c  | 83 -
 target-ppc/translate_init.c |  4 +--
 16 files changed, 337 insertions(+), 65 deletions(-)



[Qemu-devel] [PULL 01/12] ppc: Remove MMU_MODEn_SUFFIX definitions

2016-05-30 Thread David Gibson
From: Benjamin Herrenschmidt 

We don't use the resulting accessors and this gets in the way of
the split I/D TLB work.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 target-ppc/cpu.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index cd33539..02e71ea 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1242,9 +1242,6 @@ int ppc_dcr_write (ppc_dcr_t *dcr_env, int dcrn, uint32_t 
val);
 #define cpu_list ppc_cpu_list
 
 /* MMU modes definitions */
-#define MMU_MODE0_SUFFIX _user
-#define MMU_MODE1_SUFFIX _kernel
-#define MMU_MODE2_SUFFIX _hypv
 #define MMU_USER_IDX 0
 static inline int cpu_mmu_index (CPUPPCState *env, bool ifetch)
 {
-- 
2.5.5




Re: [Qemu-devel] [PATCH] fixup! exec: Do vmstate unregistration from cpu_exec_exit()

2016-05-30 Thread David Gibson
On Mon, May 30, 2016 at 05:22:29PM +0200, Igor Mammedov wrote:
> with all header changes merged in current master (d6550e9ed2)
> above patch breaks compilation with:
>   exec.c: In function ‘cpu_exec_exit’:
>   exec.c:661:9: error: implicit declaration of function ‘vmstate_unregister’ 
> [-Werror=implicit-function-declaration]
>  vmstate_unregister(NULL, cc->vmsd, cpu);
> 
> Please squash this fixup into the original patch.
> 
> Signed-off-by: Igor Mammedov 

Already made the same change when I pulled it into my tree.

> ---
>  exec.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/exec.c b/exec.c
> index c5dd58e..a0327a9 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -62,6 +62,8 @@
>  #include "qemu/mmap-alloc.h"
>  #endif
>  
> +#include "migration/vmstate.h"
> +
>  //#define DEBUG_SUBPAGE
>  
>  #if !defined(CONFIG_USER_ONLY)

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH 0/2] macio: switch over to new byte-aligned DMA helpers

2016-05-30 Thread Aurelien Jarno
On 2016-05-27 09:48, Mark Cave-Ayland wrote:
> Here is a tidied up version of my patch to convert the macio controller over
> to using the new byte-aligned DMA helpers.
> 
> The first patch is just a hack and temporarily disables unaligned iovec
> truncation in the DMA helper (as discussed in the recent thread) until Paolo
> or someone else can devise a proper solution. Without this, the subsequent
> switch over to the DMA helpers will appear to work during a Darwin PPC install
> but the resulting image is corrupt and will fail to boot.
> 
> The second patch is the real one and switches the macio controller over to use
> the new byte-aligned DMA helpers. Here I see a speed-up of around 2.5x-3x for
> a typical Darwin PPC installation compared to the previous code.
> 
> Aurelien, I'd be grateful if you could test the TRIM path as I know this is
> something you've had issues with before and I couldn't quite figure out how to
> reproduce your TRIM tests from before.

I have just tested the TRIM path, all works fine with your 2 patches
applied.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH] scsi: esp: check TI buffer index before read/write

2016-05-30 Thread Peter Maydell
On 30 May 2016 at 19:58, P J P  wrote:
> From: Prasad J Pandit 
>
> The 53C9X Fast SCSI Controller(FSC) comes with internal 16-byte
> FIFO buffers. One is used to handle commands and other is for
> information transfer. While reading/writing to these buffers
> an index into 's->ti_buf[TI_BUFSZ=16]' could exceed its size,
> as a check was missing to validate it. Add check to avoid OOB
> r/w access.
>
> Reported-by: Huawei PSIRT 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/scsi/esp.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
> index 591c817..80e4287 100644
> --- a/hw/scsi/esp.c
> +++ b/hw/scsi/esp.c
> @@ -407,8 +407,10 @@ uint64_t esp_reg_read(ESPState *s, uint32_t saddr)
>  qemu_log_mask(LOG_UNIMP,
>"esp: PIO data read not implemented\n");
>  s->rregs[ESP_FIFO] = 0;
> -} else {
> +} else if (s->ti_rptr < TI_BUFSZ) {
>  s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
> +} else {
> +trace_esp_error_fifo_overrun();

Isn't this an underrun, not an overrun?

In any case, something weird seems to be going on here:
it looks like the amount of data in the fifo should be
constrained by ti_size (which we're already checking), so
when can we get into a situation where ti_rptr can
get beyond the buffer size? It seems to me that we should
fix whatever that underlying bug is, and then have an
assert() on ti_rptr here.

>  }
>  esp_raise_irq(s);
>  }
> @@ -456,7 +458,7 @@ void esp_reg_write(ESPState *s, uint32_t saddr, uint64_t val)
>  } else {
>  trace_esp_error_fifo_overrun();
>  }
> -} else if (s->ti_size == TI_BUFSZ - 1) {
> +} else if (s->ti_wptr == TI_BUFSZ - 1) {
>  trace_esp_error_fifo_overrun();
>  } else {
>  s->ti_size++;

Similarly, this looks odd -- the ti_size check should be
sufficient if the rest of the code is correctly managing
the ti_size/ti_wptr/ti_rptr values.

It would probably also be possible to rewrite this
FIFO handling into a less error prone style that
isn't reliant on keeping three different indexes
in sync to avoid buffer overruns.
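
One way to read that last suggestion is a FIFO that keeps only a read
position and a fill count, so both bounds checks derive from a single
value. This is a minimal sketch and not the ESP code: the TiFifo type and
the ti_fifo_* helpers are hypothetical, and only the 16-byte size is taken
from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TI_BUFSZ 16  /* FIFO size taken from the patch description */

/* Hypothetical ring buffer: one read position plus a fill count. */
typedef struct {
    uint8_t data[TI_BUFSZ];
    uint32_t head;   /* index of the next byte to pop */
    uint32_t count;  /* number of bytes currently stored */
} TiFifo;

/* Returns false on overrun (FIFO full) instead of writing out of bounds. */
static bool ti_fifo_push(TiFifo *f, uint8_t val)
{
    assert(f->head < TI_BUFSZ && f->count <= TI_BUFSZ);
    if (f->count == TI_BUFSZ) {
        return false;
    }
    f->data[(f->head + f->count) % TI_BUFSZ] = val;
    f->count++;
    return true;
}

/* Returns false on underrun (FIFO empty) and leaves *val untouched. */
static bool ti_fifo_pop(TiFifo *f, uint8_t *val)
{
    assert(f->head < TI_BUFSZ && f->count <= TI_BUFSZ);
    if (f->count == 0) {
        return false;
    }
    *val = f->data[f->head];
    f->head = (f->head + 1) % TI_BUFSZ;
    f->count--;
    return true;
}
```

With this shape the write index is always derived from head and count, so
there is no third value that can drift out of sync, and the asserts only
document an invariant rather than guard against guest-triggered state.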

> --
> 2.5.5

thanks
-- PMM



Re: [Qemu-devel] [PATCH] vnc: add configurable keyboard delay

2016-05-30 Thread Alexander Graf



On 05/23/2016 03:19 PM, Gerd Hoffmann wrote:

Limits the rate kbd events from the vnc server are forwarded to the
guest, so input devices which are typically low-bandwidth can keep
up even on bulky input.

Signed-off-by: Gerd Hoffmann 
---
  ui/vnc.c | 13 +++--
  ui/vnc.h |  1 +


You probably want to extend the man page with this awesome new option :).


Alex




Re: [Qemu-devel] [PATCH 18/33] pc: add 2.7 machine

2016-05-30 Thread Eduardo Habkost
On Mon, May 30, 2016 at 09:53:59PM +0300, Marcel Apfelbaum wrote:
> On 05/17/2016 05:43 PM, Igor Mammedov wrote:
> > Signed-off-by: Igor Mammedov 
> 
> 
> Eduardo also posted a patch for 2.7 machine.

It was the same patch, with a "From:" line from Igor.

-- 
Eduardo



Re: [Qemu-devel] [PATCH 20/33] pc: q35: initialize new CPU hotplug hw

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Add the necessary wiring to init the new CPU hotplug hardware
if ICH9-LPC.cpu-hotplug-legacy is "off".
Set ICH9-LPC.cpu-hotplug-legacy to "off" by default and
keep legacy hotplug enabled only for 2.6 and older
machine types.

Signed-off-by: Igor Mammedov 
---
v1:
   - drop ICH9-LPC.cpu-hotplug property
---
  hw/acpi/ich9.c | 42 +-
  include/hw/acpi/ich9.h |  6 +-
  include/hw/compat.h|  6 +-
  3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 121e30c..b0285b9 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -170,6 +170,24 @@ static const VMStateDescription vmstate_memhp_state = {
  }
  };

+static bool vmstate_test_use_cpuhp(void *opaque)
+{
+ICH9LPCPMRegs *s = opaque;
+return !s->cpu_hotplug_legacy;
+}
+
+static const VMStateDescription vmstate_cpuhp_state = {
+.name = "ich9_pm/cpuhp",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.needed = vmstate_test_use_cpuhp,
+.fields  = (VMStateField[]) {
+VMSTATE_CPU_HOTPLUG(cpuhp.state, ICH9LPCPMRegs),
+VMSTATE_END_OF_LIST()
+}
+};
+
  static bool vmstate_test_use_tco(void *opaque)
  {
  ICH9LPCPMRegs *s = opaque;
@@ -209,6 +227,7 @@ const VMStateDescription vmstate_ich9_pm = {
  .subsections = (const VMStateDescription*[]) {
  &vmstate_memhp_state,
  &vmstate_tco_io_state,
+&vmstate_cpuhp_state,
  NULL
  }
  };
@@ -275,7 +294,10 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,

  if (pm->cpu_hotplug_legacy) {
  legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
-OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
+OBJECT(lpc_pci), &pm->cpuhp.legacy, ICH9_CPU_HOTPLUG_IO_BASE);
+} else {
+cpu_hotplug_hw_init(pci_address_space_io(lpc_pci), OBJECT(lpc_pci),
+&pm->cpuhp.state, ICH9_CPU_HOTPLUG_IO_BASE);
  }

  if (pm->acpi_memory_hotplug.is_enabled) {
@@ -414,7 +436,6 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
  {
  static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN;
  pm->acpi_memory_hotplug.is_enabled = true;
-pm->cpu_hotplug_legacy = true;


Maybe it is too soon to disable legacy CPU hotplug? Will the new CPU hotplug
mechanism work from this patch on, or will it break git bisect?

Thanks,
Marcel


  pm->disable_s3 = 0;
  pm->disable_s4 = 0;
  pm->s4_val = 2;
@@ -461,9 +482,13 @@ void ich9_pm_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
  object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
  acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug,
  dev, errp);
-} else if (lpc->pm.cpu_hotplug_legacy &&
-   object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
-legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+if (lpc->pm.cpu_hotplug_legacy) {
+legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.cpuhp.legacy, dev,
+errp);
+} else {
+acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.cpuhp.state, dev, errp);
+}
  } else {
  error_setg(errp, "acpi: device plug request for not supported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -480,6 +505,10 @@ void ich9_pm_device_unplug_request_cb(HotplugHandler *hotplug_dev,
  acpi_memory_unplug_request_cb(hotplug_dev,
&lpc->pm.acpi_memory_hotplug, dev,
errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) &&
+   !lpc->pm.cpu_hotplug_legacy) {
+acpi_cpu_unplug_request_cb(hotplug_dev, &lpc->pm.cpuhp.state,
+   dev, errp);
  } else {
  error_setg(errp, "acpi: device unplug request for not supported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -494,6 +523,9 @@ void ich9_pm_device_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
  if (lpc->pm.acpi_memory_hotplug.is_enabled &&
  object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
  acpi_memory_unplug_cb(&lpc->pm.acpi_memory_hotplug, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU) &&
+   !lpc->pm.cpu_hotplug_legacy) {
+acpi_cpu_unplug_cb(&lpc->pm.cpuhp.state, dev, errp);
  } else {
  error_setg(errp, "acpi: device unplug for not supported device"
 " type: %s", object_get_typename(OBJECT(dev)));
diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h
index e29a856..198c017 100644
--- a/include/hw/acpi/ich9.h
+++ b/include/hw/acpi/ich9.h
@@ -23,6 +23,7 @@

  #include "hw/acpi/acpi.h"
  #include "hw/acpi/cpu_hotplug.h"

Re: [Qemu-devel] [PATCH 19/33] pc: piix4/ich9: add 'cpu-hotplug-legacy' property

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 
---
  hw/acpi/ich9.c | 29 ++---
  hw/acpi/piix4.c| 12 +---
  include/hw/acpi/ich9.h |  1 +
  3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 853c9c4..121e30c 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -273,8 +273,10 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
  pm->powerdown_notifier.notify = pm_powerdown_req;
  qemu_register_powerdown_notifier(&pm->powerdown_notifier);

-legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
-OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
+if (pm->cpu_hotplug_legacy) {
+legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
+OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
+}

  if (pm->acpi_memory_hotplug.is_enabled) {
  acpi_memory_hotplug_init(pci_address_space_io(lpc_pci), OBJECT(lpc_pci),
@@ -306,6 +308,21 @@ static void ich9_pm_set_memory_hotplug_support(Object *obj, bool value,
  s->pm.acpi_memory_hotplug.is_enabled = value;
  }

+static bool ich9_pm_get_cpu_hotplug_legacy(Object *obj, Error **errp)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
+
+return s->pm.cpu_hotplug_legacy;
+}
+
+static void ich9_pm_set_cpu_hotplug_legacy(Object *obj, bool value,
+   Error **errp)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
+
+s->pm.cpu_hotplug_legacy = value;
+}
+
  static void ich9_pm_get_disable_s3(Object *obj, Visitor *v, const char *name,
 void *opaque, Error **errp)
  {
@@ -397,6 +414,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
  {
  static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN;
  pm->acpi_memory_hotplug.is_enabled = true;
+pm->cpu_hotplug_legacy = true;
  pm->disable_s3 = 0;
  pm->disable_s4 = 0;
  pm->s4_val = 2;
@@ -412,6 +430,10 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
   ich9_pm_get_memory_hotplug_support,
   ich9_pm_set_memory_hotplug_support,
   NULL);
+object_property_add_bool(obj, "cpu-hotplug-legacy",
+ ich9_pm_get_cpu_hotplug_legacy,
+ ich9_pm_set_cpu_hotplug_legacy,
+ NULL);
  object_property_add(obj, ACPI_PM_PROP_S3_DISABLED, "uint8",
  ich9_pm_get_disable_s3,
  ich9_pm_set_disable_s3,
@@ -439,7 +461,8 @@ void ich9_pm_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
  object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
  acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug,
  dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+} else if (lpc->pm.cpu_hotplug_legacy &&
+   object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
  legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp);
  } else {
  error_setg(errp, "acpi: device plug request for not supported device"
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c66d3ed..d25479c 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -85,6 +85,7 @@ typedef struct PIIX4PMState {
  uint8_t disable_s4;
  uint8_t s4_val;

+bool cpu_hotplug_legacy;
  AcpiCpuHotplug gpe_cpu;

  MemHotplugState acpi_memory_hotplug;
@@ -350,7 +351,8 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev,
  acpi_memory_plug_cb(hotplug_dev, &s->acpi_memory_hotplug, dev, errp);
  } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
  acpi_pcihp_device_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+} else if (s->cpu_hotplug_legacy &&
+   object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
  legacy_acpi_cpu_plug_cb(hotplug_dev, &s->gpe_cpu, dev, errp);
  } else {
  error_setg(errp, "acpi: device plug request for not supported device"
@@ -569,8 +571,10 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
  acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
  s->use_acpi_pci_hotplug);

-legacy_acpi_cpu_hotplug_init(parent, OBJECT(s), &s->gpe_cpu,
- PIIX4_CPU_HOTPLUG_IO_BASE);
+if (s->cpu_hotplug_legacy) {
+legacy_acpi_cpu_hotplug_init(parent, OBJECT(s), &s->gpe_cpu,
+ PIIX4_CPU_HOTPLUG_IO_BASE);
+}

  if (s->acpi_memory_hotplug.is_enabled) {
  acpi_memory_hotplug_init(parent, OBJECT(s), &s->acpi_memory_hotplug);
@@ -600,6 +604,8 @@ static Property piix4_pm_properties[] = {
   use_acpi_pci_hotplug, 

[Qemu-devel] [PATCH] scsi: esp: check TI buffer index before read/write

2016-05-30 Thread P J P
From: Prasad J Pandit 

The 53C9X Fast SCSI Controller(FSC) comes with internal 16-byte
FIFO buffers. One is used to handle commands and other is for
information transfer. While reading/writing to these buffers
an index into 's->ti_buf[TI_BUFSZ=16]' could exceed its size,
as a check was missing to validate it. Add check to avoid OOB
r/w access.

Reported-by: Huawei PSIRT 
Signed-off-by: Prasad J Pandit 
---
 hw/scsi/esp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 591c817..80e4287 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -407,8 +407,10 @@ uint64_t esp_reg_read(ESPState *s, uint32_t saddr)
 qemu_log_mask(LOG_UNIMP,
   "esp: PIO data read not implemented\n");
 s->rregs[ESP_FIFO] = 0;
-} else {
+} else if (s->ti_rptr < TI_BUFSZ) {
 s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
+} else {
+trace_esp_error_fifo_overrun();
 }
 esp_raise_irq(s);
 }
@@ -456,7 +458,7 @@ void esp_reg_write(ESPState *s, uint32_t saddr, uint64_t val)
 } else {
 trace_esp_error_fifo_overrun();
 }
-} else if (s->ti_size == TI_BUFSZ - 1) {
+} else if (s->ti_wptr == TI_BUFSZ - 1) {
 trace_esp_error_fifo_overrun();
 } else {
 s->ti_size++;
-- 
2.5.5




Re: [Qemu-devel] [PATCH 18/33] pc: add 2.7 machine

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 



Eduardo also posted a patch for 2.7 machine.

Thanks,
Marcel


---
  hw/i386/pc_piix.c| 16 +---
  hw/i386/pc_q35.c | 13 +++--
  include/hw/compat.h  |  3 +++
  include/hw/i386/pc.h |  4 
  4 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 7f50116..cdbdd69 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -416,13 +416,25 @@ static void pc_i440fx_machine_options(MachineClass *m)
  m->default_display = "std";
  }

-static void pc_i440fx_2_6_machine_options(MachineClass *m)
+static void pc_i440fx_2_7_machine_options(MachineClass *m)
  {
  pc_i440fx_machine_options(m);
  m->alias = "pc";
  m->is_default = 1;
  }

+DEFINE_I440FX_MACHINE(v2_7, "pc-i440fx-2.7", NULL,
+  pc_i440fx_2_7_machine_options);
+
+
+static void pc_i440fx_2_6_machine_options(MachineClass *m)
+{
+pc_i440fx_2_7_machine_options(m);
+m->is_default = 0;
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
+}
+
  DEFINE_I440FX_MACHINE(v2_6, "pc-i440fx-2.6", NULL,
pc_i440fx_2_6_machine_options);

@@ -431,8 +443,6 @@ static void pc_i440fx_2_5_machine_options(MachineClass *m)
  {
  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
  pc_i440fx_2_6_machine_options(m);
-m->alias = NULL;
-m->is_default = 0;
  pcmc->save_tsc_khz = false;
  m->legacy_fw_cfg_order = 1;
  SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 04aae89..4787df1 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -283,12 +283,22 @@ static void pc_q35_machine_options(MachineClass *m)
  m->no_floppy = 1;
  }

-static void pc_q35_2_6_machine_options(MachineClass *m)
+static void pc_q35_2_7_machine_options(MachineClass *m)
  {
  pc_q35_machine_options(m);
  m->alias = "q35";
  }

+DEFINE_Q35_MACHINE(v2_7, "pc-q35-2.7", NULL,
+   pc_q35_2_7_machine_options);
+
+static void pc_q35_2_6_machine_options(MachineClass *m)
+{
+pc_q35_2_7_machine_options(m);
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
+}
+
  DEFINE_Q35_MACHINE(v2_6, "pc-q35-2.6", NULL,
 pc_q35_2_6_machine_options);

@@ -296,7 +306,6 @@ static void pc_q35_2_5_machine_options(MachineClass *m)
  {
  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
  pc_q35_2_6_machine_options(m);
-m->alias = NULL;
  pcmc->save_tsc_khz = false;
  m->legacy_fw_cfg_order = 1;
  SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
diff --git a/include/hw/compat.h b/include/hw/compat.h
index a5dbbf8..636befe 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -1,6 +1,9 @@
  #ifndef HW_COMPAT_H
  #define HW_COMPAT_H

+#define HW_COMPAT_2_6 \
+/* empty */
+
  #define HW_COMPAT_2_5 \
  {\
  .driver   = "isa-fdc",\
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 96f0b66..f1e40ae 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -356,7 +356,11 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
  int e820_get_num_entries(void);
  bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);

+#define PC_COMPAT_2_6 \
+HW_COMPAT_2_6
+
  #define PC_COMPAT_2_5 \
+PC_COMPAT_2_6 \
  HW_COMPAT_2_5

  #define PC_COMPAT_2_4 \
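
The pattern the patch follows is that each older machine version calls the
options function of the next newer one and then overrides what differs, so
the newest version always holds the alias and the default flag. A minimal
sketch of that chaining, assuming a cut-down options struct (MachineOpts,
opts_2_x and the COMPAT_* bits are hypothetical stand-ins, not QEMU names):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical compat bits; the real patch uses property lists instead. */
enum { COMPAT_2_6 = 1, COMPAT_2_5 = 2 };

/* Hypothetical stand-in for the MachineClass fields the patch touches. */
typedef struct {
    const char *alias;
    bool is_default;
    unsigned compat_mask;
} MachineOpts;

/* Newest version: owns the alias and the default flag, no compat quirks. */
static void opts_2_7(MachineOpts *m)
{
    assert(m != NULL);
    m->alias = "pc";
    m->is_default = true;
    m->compat_mask = 0;
}

/* One version older: inherit from 2.7, then drop alias/default. */
static void opts_2_6(MachineOpts *m)
{
    opts_2_7(m);
    m->alias = NULL;
    m->is_default = false;
    m->compat_mask |= COMPAT_2_6;
}

/* 2.5 no longer clears alias/default itself; it inherits that from 2.6. */
static void opts_2_5(MachineOpts *m)
{
    opts_2_6(m);
    m->compat_mask |= COMPAT_2_5;
}
```

This is why the patch removes the alias and is_default resets from the 2.5
options function: once 2.6 stops being the newest version, it performs the
reset and every older version picks it up through the call chain.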






Re: [Qemu-devel] [PATCH v2 13/15] nvdimm acpi: support Get Namespace Label Data function

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:20:10PM +0800, Xiao Guangrong wrote:
> Function 5 is used to get Namespace Label Data
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 83 
> +++-
>  1 file changed, 82 insertions(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 14/15] nvdimm acpi: support Set Namespace Label Data function

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:20:11PM +0800, Xiao Guangrong wrote:
> Function 6 is used to set Namespace Label Data
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 44 +++-
>  1 file changed, 43 insertions(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 00/15] PATCH 00/15] NVDIMM: introduce nvdimm label support

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:19:57PM +0800, Xiao Guangrong wrote:
> Changelog in v2:
> Thanks for Stefan's review, the changes in this version are:
> - rename nvdimm device parameter 'reserve-label' to 'label-size' to
>   specify the size of label
> - add comment to explain why assert() used in nvdimm_assert_rw_label_data()
>   is safe
> - follow the code style of 'foo() return;' if nothing is returned by foo()
> - fix the value of "Non-existing-Memory-Device"
> - fix the handling the DSM functions we currently did not support
> 
>  
> This patchset is against commit 2912f22759 (fixup! virtio: convert to use DMA
> api) on pci branch of Michael's git tree and can be found at:
>   https://github.com/xiaogr/qemu.git nvdimm-label-v2
> 
> This is the last part of vNVDIMM implementation which introduces nvdimm
> label support
> 
> Currently Linux NVDIMM driver does not support namespace operation on this
> kind of PMEM, apply below changes to support dynamical namespace:
> 
> @@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc 
> *a
> continue;
> }
>  
> -   if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +   //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
> +   if (nfit_mem->memdev_pmem)
> flags |= NDD_ALIASING;
> 
> You can append a NVDIMM device in guest and do:   
> # cd /sys/bus/nd/devices/
> # cd namespace0.0/
> # echo `uuidgen` > uuid
> # echo `expr 1024 \* 1024 \* 128` > size
> then reload nd.pmem.ko
> 
> You can see /dev/pmem0 appears
> 
> Xiao Guangrong (15):
>   pc-dimm: get memory region from ->get_memory_region()
>   pc-dimm: introduce realize callback
>   pc-dimm: keep the state of the whole backend memory
>   nvdimm: support nvdimm label
>   acpi: add aml_object_type
>   acpi: add aml_call5
>   nvdimm acpi: set HDLE properly
>   nvdimm acpi: save arg3 of _DSM method
>   nvdimm acpi: check UUID
>   nvdimm acpi: abstract the operations for root & nvdimm devices
>   nvdimm acpi: check revision
>   nvdimm acpi: support Get Namespace Label Size function
>   nvdimm acpi: support Get Namespace Label Data function
>   nvdimm acpi: support Set Namespace Label Data function
>   docs: add NVDIMM ACPI documentation
> 
>  docs/specs/acpi_nvdimm.txt  | 132 +++
>  hw/acpi/aml-build.c |  22 +++
>  hw/acpi/nvdimm.c| 400 
> 
>  hw/mem/nvdimm.c | 122 ++
>  hw/mem/pc-dimm.c|  21 ++-
>  include/hw/acpi/aml-build.h |   3 +
>  include/hw/mem/nvdimm.h |  55 +-
>  include/hw/mem/pc-dimm.h|   6 +-
>  8 files changed, 723 insertions(+), 38 deletions(-)
>  create mode 100644 docs/specs/acpi_nvdimm.txt

I have reviewed the non-ACPI parts of this series.  Aside from minor
comments:

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 16/33] acpi: hardware side of CPU hotplug

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 


Maybe it is worth adding some kind of explanation
of the functionality added here.

Thanks,
Marcel
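
As far as it can be read from the diff in this mail, the flags register
packs three per-CPU status bits on read, and a write either clears one
pending event or requests an eject. A minimal sketch of that behaviour,
assuming simplified types (CpuSlot, the CPU_FLAG_* names and the two
helpers are hypothetical; the real code calls the hotplug handler on eject
instead of returning a flag):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout of the flags register, as read from the patch. */
enum {
    CPU_FLAG_ENABLED   = 1,  /* read: CPU is present and enabled */
    CPU_FLAG_INSERTING = 2,  /* read: insert event pending; write: clear it */
    CPU_FLAG_REMOVING  = 4,  /* read: remove event pending; write: clear it */
    CPU_FLAG_EJECT     = 8,  /* write only: request ejection */
};

/* Hypothetical per-CPU state, mirroring the is_* fields in the patch. */
typedef struct {
    bool is_enabled;
    bool is_inserting;
    bool is_removing;
} CpuSlot;

/* Pack the is_* fields into the value the guest reads. */
static uint32_t cpu_slot_read_flags(const CpuSlot *c)
{
    uint32_t val = 0;

    assert(c != NULL);
    val |= c->is_enabled   ? CPU_FLAG_ENABLED   : 0;
    val |= c->is_inserting ? CPU_FLAG_INSERTING : 0;
    val |= c->is_removing  ? CPU_FLAG_REMOVING  : 0;
    return val;
}

/*
 * Apply a guest write. Only the first matching bit is handled, as in the
 * patch's if/else-if chain. Returns true when the write is an eject
 * request for an enabled CPU.
 */
static bool cpu_slot_write_flags(CpuSlot *c, uint32_t data)
{
    assert(c != NULL);
    if (data & CPU_FLAG_INSERTING) {
        c->is_inserting = false;
    } else if (data & CPU_FLAG_REMOVING) {
        c->is_removing = false;
    } else if (data & CPU_FLAG_EJECT) {
        return c->is_enabled;
    }
    return false;
}
```

Note that a single write carrying both the insert and the remove bit only
clears the insert event, which follows from the else-if ordering.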


---
v1:
   - drop CPUHotplugState.is_enabled field
---
  hw/acpi/Makefile.objs |   1 +
  hw/acpi/cpu.c | 206 ++
  include/hw/acpi/cpu.h |  52 +
  trace-events  |   9 +++
  4 files changed, 268 insertions(+)
  create mode 100644 hw/acpi/cpu.c
  create mode 100644 include/hw/acpi/cpu.h

diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 66bd727..f200419 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -2,6 +2,7 @@ common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o pcihp.o
  common-obj-$(CONFIG_ACPI_X86_ICH) += ich9.o tco.o
  common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
  common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o memory_hotplug_acpi_table.o
+common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
  obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
  common-obj-$(CONFIG_ACPI) += acpi_interface.o
  common-obj-$(CONFIG_ACPI) += bios-linker-loader.o
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
new file mode 100644
index 000..171a5f5
--- /dev/null
+++ b/hw/acpi/cpu.c
@@ -0,0 +1,206 @@
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "hw/acpi/cpu.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#define ACPI_CPU_HOTPLUG_REG_LEN 12
+#define ACPI_CPU_SELECTOR_OFFSET_WR 0
+#define ACPI_CPU_FLAGS_OFFSET_RW 4
+
+static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
+{
+uint64_t val = ~0;
+CPUHotplugState *cpu_st = opaque;
+AcpiCpuStatus *cdev;
+
+if (cpu_st->selector >= cpu_st->dev_count) {
+return val;
+}
+
+cdev = &cpu_st->devs[cpu_st->selector];
+switch (addr) {
+case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
+val = 0;
+val |= cdev->is_enabled   ? 1 : 0;
+val |= cdev->is_inserting ? 2 : 0;
+val |= cdev->is_removing  ? 4 : 0;
+trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
+break;
+default:
+break;
+}
+return val;
+}
+
+static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
+   unsigned int size)
+{
+CPUHotplugState *cpu_st = opaque;
+AcpiCpuStatus *cdev;
+Error *local_err = NULL;
+
+assert(cpu_st->dev_count);
+
+if (addr) {
+if (cpu_st->selector >= cpu_st->dev_count) {
+trace_cpuhp_acpi_invalid_idx_selected(cpu_st->selector);
+return;
+}
+}
+
+switch (addr) {
+case ACPI_CPU_SELECTOR_OFFSET_WR: /* current CPU selector */
+cpu_st->selector = data;
+trace_cpuhp_acpi_write_idx(cpu_st->selector);
+break;
+case ACPI_CPU_FLAGS_OFFSET_RW: /* set is_* fields  */
+cdev = &cpu_st->devs[cpu_st->selector];
+if (data & 2) { /* clear insert event */
+cdev->is_inserting = false;
+trace_cpuhp_acpi_clear_inserting_evt(cpu_st->selector);
+} else if (data & 4) { /* clear remove event */
+cdev->is_removing = false;
+trace_cpuhp_acpi_clear_remove_evt(cpu_st->selector);
+} else if (data & 8) {
+DeviceState *dev = NULL;
+HotplugHandler *hotplug_ctrl = NULL;
+
+if (!cdev->is_enabled) {
+trace_cpuhp_acpi_ejecting_invalid_cpu(cpu_st->selector);
+break;
+}
+
+trace_cpuhp_acpi_ejecting_cpu(cpu_st->selector);
+dev = DEVICE(cdev->cpu);
+hotplug_ctrl = qdev_get_hotplug_handler(dev);
+hotplug_handler_unplug(hotplug_ctrl, dev, &local_err);
+if (local_err) {
+break;
+}
+}
+break;
+default:
+break;
+}
+error_free(local_err);
+}
+
+static const MemoryRegionOps cpu_hotplug_ops = {
+.read = cpu_hotplug_rd,
+.write = cpu_hotplug_wr,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 4,
+},
+};
+
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+ CPUHotplugState *state, hwaddr base_addr)
+{
+MachineState *machine = MACHINE(qdev_get_machine());
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+CPUArchIdList *id_list;
+int i;
+
+id_list = mc->possible_cpu_arch_ids(machine);
+state->dev_count = id_list->len;
+state->devs = g_new0(typeof(*state->devs), state->dev_count);
+for (i = 0; i < id_list->len; i++) {
+state->devs[i].cpu =  id_list->cpus[i].cpu;
+state->devs[i].arch_id = id_list->cpus[i].arch_id;
+state->devs[i].is_enabled =  id_list->cpus[i].cpu ? true : false;
+}
+g_free(id_list);
+memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
+  "acpi-mem-hotplug", 

Re: [Qemu-devel] [PATCH v2 12/15] nvdimm acpi: support Get Namespace Label Size function

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:20:09PM +0800, Xiao Guangrong wrote:
> Function 4 is used to get Namespace label size
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 130 
> +--
>  1 file changed, 127 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 04/15] nvdimm: support nvdimm label

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:20:01PM +0800, Xiao Guangrong wrote:
> Introduce a parameter, 'label-size', which is the size of nvdimm label
> data area which is reserved at the end of backend memory. It is required
> at least 128k
> 
> Two callbacks, read_label_data() and write_label_data(), are used to
> operate the label area
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/nvdimm.c | 122 
> 
>  include/hw/mem/nvdimm.h |  55 +-
>  2 files changed, 176 insertions(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 
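
The layout described in the commit message (label area reserved at the end
of the backend memory, at least 128k) can be sketched as a small size
calculation. This is an illustration only; NvdimmLayout, nvdimm_layout()
and MIN_LABEL_SIZE are hypothetical names, and rejecting a label that
consumes the whole backend is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* "at least 128k", per the commit message */
#define MIN_LABEL_SIZE (128u * 1024u)

/* Hypothetical description of how the backend memory is split. */
typedef struct {
    uint64_t data_offset;   /* guest-visible data starts at offset 0 */
    uint64_t data_size;     /* backend size minus the label area */
    uint64_t label_offset;  /* label area sits at the end of the backend */
    uint64_t label_size;
} NvdimmLayout;

/* Returns false if label_size is too small or leaves no room for data. */
static bool nvdimm_layout(uint64_t backend_size, uint64_t label_size,
                          NvdimmLayout *out)
{
    assert(out != NULL);
    if (label_size < MIN_LABEL_SIZE || label_size >= backend_size) {
        return false;
    }
    out->data_offset = 0;
    out->data_size = backend_size - label_size;
    out->label_offset = out->data_size;
    out->label_size = label_size;
    return true;
}
```

For a 1 GiB backend with a 128 KiB label area, the guest-visible region is
1 GiB minus 128 KiB and the label area starts right after it.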


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 13/33] acpi: extend ACPI interface to provide send_event hook

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

The send_event() hook allows sending an ACPI event in
a target-specific way (GPE- or GPIO-based implementation).
It also simplifies the proxy wrappers in piix4pm/ich9
that access ACPI regs and SCI, which are part of the
piix4pm/lpc_ich9 devices, and call the acpi_foo() API directly.
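
The idea of the hook can be sketched as a per-class function pointer that
callers invoke without knowing how the event is delivered. This is a
simplified model, not the QOM interface from the patch: AcpiDev,
AcpiDevClass, gpe_send_event() and acpi_send_event() are hypothetical, and
only the AcpiEventStatusBits values are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the patch; they are part of the guest ABI. */
typedef enum {
    ACPI_PCI_HOTPLUG_STATUS = 2,
    ACPI_CPU_HOTPLUG_STATUS = 4,
    ACPI_MEMORY_HOTPLUG_STATUS = 8,
} AcpiEventStatusBits;

typedef struct AcpiDev AcpiDev;

/* Hypothetical class with the delivery hook. */
typedef struct {
    void (*send_event)(AcpiDev *dev, AcpiEventStatusBits ev);
} AcpiDevClass;

struct AcpiDev {
    const AcpiDevClass *klass;
    uint32_t gpe_sts;   /* stand-in for the GPE status register */
    int sci_raised;     /* stand-in for the SCI interrupt line */
};

/* GPE-based delivery: latch the status bit, then signal SCI. */
static void gpe_send_event(AcpiDev *dev, AcpiEventStatusBits ev)
{
    dev->gpe_sts |= (uint32_t)ev;
    dev->sci_raised = 1;
}

static const AcpiDevClass gpe_class = { .send_event = gpe_send_event };

/* Generic caller: does not care whether delivery is GPE or GPIO based. */
static void acpi_send_event(AcpiDev *dev, AcpiEventStatusBits ev)
{
    assert(dev != NULL && dev->klass != NULL && dev->klass->send_event != NULL);
    dev->klass->send_event(dev, ev);
}
```

A GPIO-based target would supply a different send_event implementation and
the hotplug callbacks calling acpi_send_event() would stay unchanged.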

Signed-off-by: Igor Mammedov 
---
Following patch will use hook to simplify hotplug callbacks
in piix4pm/ich9.
---
  hw/acpi/core.c   |  2 +-
  hw/acpi/piix4.c  |  8 
  hw/isa/lpc_ich9.c|  8 
  include/hw/acpi/acpi.h   | 10 ++
  include/hw/acpi/acpi_dev_interface.h | 18 ++
  5 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 6a2f452..d05844b 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -692,7 +692,7 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, uint32_t addr)
  }

  void acpi_send_gpe_event(ACPIREGS *ar, qemu_irq irq,
- AcpiGPEStatusBits status)
+ AcpiEventStatusBits status)
  {
  ar->gpe.sts[0] |= status;
  acpi_update_sci(ar, irq);
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 3e8d80b..4f5658f 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -585,6 +585,13 @@ static void piix4_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
  acpi_memory_ospm_status(&s->acpi_memory_hotplug, list);
  }

+static void piix4_send_gpe(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
+{
+PIIX4PMState *s = PIIX4_PM(adev);
+
+acpi_send_gpe_event(&s->ar, s->irq, ev);
+}
+
  static Property piix4_pm_properties[] = {
  DEFINE_PROP_UINT32("smb_io_base", PIIX4PMState, smb_io_base, 0),
  DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 0),
@@ -623,6 +630,7 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data)
  hc->unplug_request = piix4_device_unplug_request_cb;
  hc->unplug = piix4_device_unplug_cb;
  adevc->ospm_status = piix4_ospm_status;
+adevc->send_event = piix4_send_gpe;
  }

  static const TypeInfo piix4_pm_info = {
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 99cd3ba..305ccd6 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -702,6 +702,13 @@ static Property ich9_lpc_properties[] = {
  DEFINE_PROP_END_OF_LIST(),
  };

+static void ich9_send_gpe(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(adev);
+
+acpi_send_gpe_event(&s->pm.acpi_regs, s->pm.irq, ev);
+}
+
  static void ich9_lpc_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
@@ -729,6 +736,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data)
  hc->unplug_request = ich9_device_unplug_request_cb;
  hc->unplug = ich9_device_unplug_cb;
  adevc->ospm_status = ich9_pm_ospm_status;
+adevc->send_event = ich9_send_gpe;
  }

  static const TypeInfo ich9_lpc_info = {
diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h
index e0978c8..24dd572 100644
--- a/include/hw/acpi/acpi.h
+++ b/include/hw/acpi/acpi.h
@@ -23,6 +23,7 @@
  #include "qemu/option.h"
  #include "exec/memory.h"
  #include "hw/irq.h"
+#include "hw/acpi/acpi_dev_interface.h"

  /*
   * current device naming scheme supports up to 256 memory devices
@@ -89,13 +90,6 @@
  /* PM2_CNT */
  #define ACPI_BITMASK_ARB_DISABLE0x0001

-/* These values are part of guest ABI, and can not be changed */
-typedef enum {
-ACPI_PCI_HOTPLUG_STATUS = 2,
-ACPI_CPU_HOTPLUG_STATUS = 4,
-ACPI_MEMORY_HOTPLUG_STATUS = 8,
-} AcpiGPEStatusBits;
-
  /* structs */
  typedef struct ACPIPMTimer ACPIPMTimer;
  typedef struct ACPIPM1EVT ACPIPM1EVT;
@@ -179,7 +173,7 @@ void acpi_gpe_ioport_writeb(ACPIREGS *ar, uint32_t addr, uint32_t val);
  uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, uint32_t addr);

  void acpi_send_gpe_event(ACPIREGS *ar, qemu_irq irq,
- AcpiGPEStatusBits status);
+ AcpiEventStatusBits status);

  void acpi_update_sci(ACPIREGS *acpi_regs, qemu_irq irq);

diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index f245f8d..fdfc163 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -4,6 +4,13 @@
  #include "qom/object.h"
  #include "qapi-types.h"

+/* These values are part of guest ABI, and can not be changed */
+typedef enum {
+ACPI_PCI_HOTPLUG_STATUS = 2,
+ACPI_CPU_HOTPLUG_STATUS = 4,
+ACPI_MEMORY_HOTPLUG_STATUS = 8,
+} AcpiEventStatusBits;
+
  #define TYPE_ACPI_DEVICE_IF "acpi-device-interface"

  #define ACPI_DEVICE_IF_CLASS(klass) \
@@ -22,11 +29,21 @@ typedef struct AcpiDeviceIf {
  Object Parent;
  } AcpiDeviceIf;

+#define ACPI_SEND_EVENT(dev, event)   \
+do {  \
+AcpiDeviceIfClass *adevc = 

Re: [Qemu-devel] [PATCH v2 03/15] pc-dimm: keep the state of the whole backend memory

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:20:00PM +0800, Xiao Guangrong wrote:
> QEMU keeps the state of memory of dimm device during live migration,
> however, it is not enough for nvdimm device as its memory does not
> contain its label data, so that we should protect the whole backend
> memory instead
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/pc-dimm.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> index 6de2275..72b33ba 100644
> --- a/hw/mem/pc-dimm.c
> +++ b/hw/mem/pc-dimm.c
> @@ -105,9 +105,16 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
>  }
>  
>  memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
> -vmstate_register_ram(mr, dev);
>  numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
>  
> +/*
> + * save the state only for @mr is not enough as it does not contain
> + * the label data of NVDIMM device, so that we keep the state of
> + * whole hostmem instead.
> + */
> +vmstate_register_ram(host_memory_backend_get_memory(dimm->hostmem, errp),
> + dev);
> +
>  out:
>  error_propagate(errp, local_err);
>  }

In Patch 1 you introduced a callback to get the guest-visible memory
region.  Instead of mentioning NVDIMM in generic pc-dimm.c code, it
would be cleaner to add another callback to get the vmstate memory
region:

  .get_guest_memory_region() - Patch 1
  .get_vmstate_memory_region() - a new patch in this series


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 11/33] pc: acpi: cpuhp-legacy: switch ProcessorID to possible_cpus idx

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

In legacy cpu-hotplug, ProcessorID == APIC ID is used
in MADT and cpu-hotplug AML. That was fine as both
are 8-bit and unique. The spec deprecated Processor()
with its corresponding ProcessorID and advises using
Device() and UID instead.

However, UID is just 32-bit and can't fit ARM's
arch_id (MPIDR), which is 64-bit. Also, in case of
sparse arch_id() distribution, management/lookup
of maps by arch_id (APIC ID/MPIDR) becomes complex
and expensive.

In preparation for common CPU hotplug with ARM,
and to simplify lookup in the possible_cpus[] map,
switch ProcessorID to the possible_cpus index in
MADT.

Legacy cpu-hotplug considerations:
The HW interface is an APIC ID based bitmask, so
it's impossible to change; the CPON package in
AML is also APIC ID based, as are all the methods.

To avoid a massive rewrite of AML, keep it so and
just break the assumption that ProcessorID == APIC ID,
amending CPU_MAT_METHOD to accept both APIC ID and
possible_cpus index; it needs them both to patch
the MADT entry template. Also switch Processor(ProcessorID)
in AML to the possible_cpus index.
That way changes to MADT/AML are minimal and kept
inside AML/MADT, not affecting external interfaces.
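
To make the two-argument form concrete, here is a minimal C sketch of how the 8-byte MADT Local APIC entry could be patched once ProcessorID and APIC ID are no longer the same value. The function name fill_madt_lapic and its enabled parameter are illustrative assumptions; only the template bytes and the offsets 2, 3 and 4 are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MADT_LAPIC_LEN 8

/*
 * Fill a Processor Local APIC entry:
 *   byte 0    = type (0), byte 1 = length (8)
 *   byte 2    = ACPI Processor ID (here: index into possible_cpus)
 *   byte 3    = APIC ID
 *   bytes 4.. = flags, bit 0 meaning "enabled"
 */
static void fill_madt_lapic(uint8_t out[MADT_LAPIC_LEN],
                            uint8_t apic_id, uint8_t cpu_index, int enabled)
{
    static const uint8_t tmpl[MADT_LAPIC_LEN] = {
        0x00, 0x08, 0x00, 0x00, 0x00, 0, 0, 0
    };

    assert(out != NULL);
    memcpy(out, tmpl, sizeof(tmpl));
    out[2] = cpu_index;          /* no longer assumed equal to apic_id */
    out[3] = apic_id;
    out[4] = enabled ? 1 : 0;
}
```

With a sparse APIC ID layout such as {0, 1, 4, 5}, the third CPU would get index 2 in byte 2 and APIC ID 4 in byte 3.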



I vaguely understood the explanation, I'll let Eduardo have a look :)

Thanks,
Marcel


Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug.c | 23 +--
  hw/i386/acpi-build.c  |  2 +-
  2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 36ea6c2..9d71d2f 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -99,7 +99,8 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState 
*machine,
  int i, apic_idx;
  Aml *sb_scope = aml_scope("_SB");
  uint8_t madt_tmpl[8] = {0x00, 0x08, 0x00, 0x00, 0x00, 0, 0, 0};
-Aml *cpu_id = aml_arg(0);
+Aml *cpu_id = aml_arg(1);
+Aml *apic_id = aml_arg(0);
  Aml *cpu_on = aml_local(0);
  Aml *madt = aml_local(1);
  Aml *cpus_map = aml_name(CPU_ON_BITMAP);
@@ -111,30 +112,31 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState 
*machine,

  /*
   * _MAT method - creates an madt apic buffer
- * cpu_id = Arg0 = Processor ID = Local APIC ID
+ * apic_id = Arg0 = Local APIC ID
+ * cpu_id  = Arg1 = Processor ID
   * cpu_on = Local0 = CPON flag for this cpu
   * madt = Local1 = Buffer (in madt apic form) to return
   */
-method = aml_method(CPU_MAT_METHOD, 1, AML_NOTSERIALIZED);
+method = aml_method(CPU_MAT_METHOD, 2, AML_NOTSERIALIZED);
  aml_append(method,
-aml_store(aml_derefof(aml_index(cpus_map, cpu_id)), cpu_on));
+aml_store(aml_derefof(aml_index(cpus_map, apic_id)), cpu_on));
  aml_append(method,
  aml_store(aml_buffer(sizeof(madt_tmpl), madt_tmpl), madt));
  /* Update the processor id, lapic id, and enable/disable status */
  aml_append(method, aml_store(cpu_id, aml_index(madt, aml_int(2))));
-aml_append(method, aml_store(cpu_id, aml_index(madt, aml_int(3))));
+aml_append(method, aml_store(apic_id, aml_index(madt, aml_int(3))));
  aml_append(method, aml_store(cpu_on, aml_index(madt, aml_int(4))));
  aml_append(method, aml_return(madt));
  aml_append(sb_scope, method);

  /*
   * _STA method - return ON status of cpu
- * cpu_id = Arg0 = Processor ID = Local APIC ID
+ * apic_id = Arg0 = Local APIC ID
   * cpu_on = Local0 = CPON flag for this cpu
   */
  method = aml_method(CPU_STATUS_METHOD, 1, AML_NOTSERIALIZED);
  aml_append(method,
-aml_store(aml_derefof(aml_index(cpus_map, cpu_id)), cpu_on));
+aml_store(aml_derefof(aml_index(cpus_map, apic_id)), cpu_on));
  if_ctx = aml_if(cpu_on);
  {
  aml_append(if_ctx, aml_return(aml_int(0xF)));
@@ -243,11 +245,12 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState 
*machine,

  assert(apic_id < ACPI_CPU_HOTPLUG_ID_LIMIT);

-dev = aml_processor(apic_id, 0, 0, "CP%.02X", apic_id);
+dev = aml_processor(i, 0, 0, "CP%.02X", apic_id);

  method = aml_method("_MAT", 0, AML_NOTSERIALIZED);
  aml_append(method,
-aml_return(aml_call1(CPU_MAT_METHOD, aml_int(apic_id))));
+aml_return(aml_call2(CPU_MAT_METHOD, aml_int(apic_id), aml_int(i))
+));
  aml_append(dev, method);

  method = aml_method("_STA", 0, AML_NOTSERIALIZED);
@@ -268,7 +271,7 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState 
*machine,
  /* build this code:
   *   Method(NTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CP00, Arg1)} ...}
   */
-/* Arg0 = Processor ID = APIC ID */
+/* Arg0 = APIC ID */
  method = aml_method(AML_NOTIFY_METHOD, 2, AML_NOTSERIALIZED);
  for (i = 0; i < apic_ids->len; i++) {
  int apic_id = apic_ids->cpus[i].arch_id;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b33fec9..768918f 100644
--- 

Re: [Qemu-devel] [PATCH v2 02/15] pc-dimm: introduce realize callback

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:19:59PM +0800, Xiao Guangrong wrote:
> nvdimm needs to check if the backend memory is large enough to contain
> label data and init its memory region when the device is realized, so
> introduce realize callback which is called after common dimm has been
> realized
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/pc-dimm.c | 5 +
>  include/hw/mem/pc-dimm.h | 3 +++
>  2 files changed, 8 insertions(+)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH v2 01/15] pc-dimm: get memory region from ->get_memory_region()

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 04:19:58PM +0800, Xiao Guangrong wrote:
> Curretly, the memory region of backed memory is all directly

s/Curretly/Currently/

> mapped to guest's address space, however, it will be not true
> for nvdimm device if we introduce nvdimm label which only can
> be indirectly accessed by ACPI DSM method
> 
> Also it improves the comments a bit to reflect this fact
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/pc-dimm.c | 3 ++-
>  include/hw/mem/pc-dimm.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH 10/33] pc: acpi: simplify build_legacy_cpu_hotplug_aml() signature

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Since the IO block used by CPU hotplug is fixed size and
initialized in the same file as build_legacy_cpu_hotplug_aml(),
just use ACPI_GPE_PROC_LEN directly instead of passing
it around in several files.

Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug.c | 6 +++---
  hw/i386/acpi-build.c  | 5 +
  include/hw/acpi/cpu_hotplug.h | 2 +-
  3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 2d4e034..36ea6c2 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -87,7 +87,7 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
  }

  void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
-  uint16_t io_base, uint16_t io_len)
+  uint16_t io_base)
  {
  Aml *dev;
  Aml *crs;
@@ -226,13 +226,13 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState 
*machine,
  aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
  crs = aml_resource_template();
  aml_append(crs,
-aml_io(AML_DECODE16, io_base, io_base, 1, io_len)
+aml_io(AML_DECODE16, io_base, io_base, 1, ACPI_GPE_PROC_LEN)
  );
  aml_append(dev, aml_name_decl("_CRS", crs));
  aml_append(sb_scope, dev);
  /* declare CPU hotplug MMIO region and PRS field to access it */
  aml_append(sb_scope, aml_operation_region(
-"PRST", AML_SYSTEM_IO, aml_int(io_base), io_len));
+"PRST", AML_SYSTEM_IO, aml_int(io_base), ACPI_GPE_PROC_LEN));
  field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
  aml_append(field, aml_named_field("PRS", 256));
  aml_append(sb_scope, field);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2f6de43..b33fec9 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -94,7 +94,6 @@ typedef struct AcpiPmInfo {
  uint32_t gpe0_blk_len;
  uint32_t io_base;
  uint16_t cpu_hp_io_base;
-uint16_t cpu_hp_io_len;
  uint16_t mem_hp_io_base;
  uint16_t mem_hp_io_len;
  uint16_t pcihp_io_base;
@@ -142,7 +141,6 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
  }
  assert(obj);

-pm->cpu_hp_io_len = ACPI_GPE_PROC_LEN;
  pm->mem_hp_io_base = ACPI_MEMORY_HOTPLUG_BASE;
  pm->mem_hp_io_len = ACPI_MEMORY_HOTPLUG_IO_LEN;

@@ -1935,8 +1933,7 @@ build_dsdt(GArray *table_data, GArray *linker,
  build_q35_pci0_int(dsdt);
  }

-build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base,
- pm->cpu_hp_io_len);
+build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base);
  build_memory_hotplug_aml(dsdt, nr_mem, pm->mem_hp_io_base,
   pm->mem_hp_io_len);

diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 241b50f..6d729d8 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -28,5 +28,5 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
AcpiCpuHotplug *gpe_cpu, uint16_t base);

  void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
-  uint16_t io_base, uint16_t io_len);
+  uint16_t io_base);
  #endif




Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [Qemu-devel] [PATCH 09/33] pc: acpi: consolidate legacy CPU hotplug in one file

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Since AML part of CPU hotplug is tightly coupled with
its hardware part (IO port layout/protocol), move
build_legacy_cpu_hotplug_aml() to cpu_hotplug.c
and remove empty cpu_hotplug_acpi_table.c

Signed-off-by: Igor Mammedov 
---
  hw/acpi/Makefile.objs|   2 +-
  hw/acpi/cpu_hotplug.c| 232 
  hw/acpi/cpu_hotplug_acpi_table.c | 249 ---
  3 files changed, 233 insertions(+), 250 deletions(-)
  delete mode 100644 hw/acpi/cpu_hotplug_acpi_table.c

diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index faee86c..66bd727 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -1,6 +1,6 @@
  common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o pcihp.o
  common-obj-$(CONFIG_ACPI_X86_ICH) += ich9.o tco.o
-common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o cpu_hotplug_acpi_table.o
+common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
  common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o 
memory_hotplug_acpi_table.o
  obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
  common-obj-$(CONFIG_ACPI) += acpi_interface.o
diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index ba9d903..2d4e034 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -14,6 +14,14 @@
  #include "hw/acpi/cpu_hotplug.h"
  #include "qapi/error.h"
  #include "qom/cpu.h"
+#include "hw/i386/pc.h"
+
+#define CPU_EJECT_METHOD "CPEJ"
+#define CPU_MAT_METHOD "CPMA"
+#define CPU_ON_BITMAP "CPON"
+#define CPU_STATUS_METHOD "CPST"
+#define CPU_STATUS_MAP "PRS"
+#define CPU_SCAN_METHOD "PRSC"

  static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
  {
@@ -77,3 +85,227 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
gpe_cpu, "acpi-cpu-hotplug", ACPI_GPE_PROC_LEN);
  memory_region_add_subregion(parent, base, &gpe_cpu->io);
  }
+
+void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
+  uint16_t io_base, uint16_t io_len)
+{
+Aml *dev;
+Aml *crs;
+Aml *pkg;
+Aml *field;
+Aml *method;
+Aml *if_ctx;
+Aml *else_ctx;
+int i, apic_idx;
+Aml *sb_scope = aml_scope("_SB");
+uint8_t madt_tmpl[8] = {0x00, 0x08, 0x00, 0x00, 0x00, 0, 0, 0};
+Aml *cpu_id = aml_arg(0);
+Aml *cpu_on = aml_local(0);
+Aml *madt = aml_local(1);
+Aml *cpus_map = aml_name(CPU_ON_BITMAP);
+Aml *zero = aml_int(0);
+Aml *one = aml_int(1);
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
+PCMachineState *pcms = PC_MACHINE(machine);
+
+/*
+ * _MAT method - creates an madt apic buffer
+ * cpu_id = Arg0 = Processor ID = Local APIC ID
+ * cpu_on = Local0 = CPON flag for this cpu
+ * madt = Local1 = Buffer (in madt apic form) to return
+ */
+method = aml_method(CPU_MAT_METHOD, 1, AML_NOTSERIALIZED);
+aml_append(method,
+aml_store(aml_derefof(aml_index(cpus_map, cpu_id)), cpu_on));
+aml_append(method,
+aml_store(aml_buffer(sizeof(madt_tmpl), madt_tmpl), madt));
+/* Update the processor id, lapic id, and enable/disable status */
+aml_append(method, aml_store(cpu_id, aml_index(madt, aml_int(2))));
+aml_append(method, aml_store(cpu_id, aml_index(madt, aml_int(3))));
+aml_append(method, aml_store(cpu_on, aml_index(madt, aml_int(4))));
+aml_append(method, aml_return(madt));
+aml_append(sb_scope, method);
+
+/*
+ * _STA method - return ON status of cpu
+ * cpu_id = Arg0 = Processor ID = Local APIC ID
+ * cpu_on = Local0 = CPON flag for this cpu
+ */
+method = aml_method(CPU_STATUS_METHOD, 1, AML_NOTSERIALIZED);
+aml_append(method,
+aml_store(aml_derefof(aml_index(cpus_map, cpu_id)), cpu_on));
+if_ctx = aml_if(cpu_on);
+{
+aml_append(if_ctx, aml_return(aml_int(0xF)));
+}
+aml_append(method, if_ctx);
+else_ctx = aml_else();
+{
+aml_append(else_ctx, aml_return(zero));
+}
+aml_append(method, else_ctx);
+aml_append(sb_scope, method);
+
+method = aml_method(CPU_EJECT_METHOD, 2, AML_NOTSERIALIZED);
+aml_append(method, aml_sleep(200));
+aml_append(sb_scope, method);
+
+method = aml_method(CPU_SCAN_METHOD, 0, AML_NOTSERIALIZED);
+{
+Aml *while_ctx, *if_ctx2, *else_ctx2;
+Aml *bus_check_evt = aml_int(1);
+Aml *remove_evt = aml_int(3);
+Aml *status_map = aml_local(5); /* Local5 = active cpu bitmap */
+Aml *byte = aml_local(2); /* Local2 = last read byte from bitmap */
+Aml *idx = aml_local(0); /* Processor ID / APIC ID iterator */
+Aml *is_cpu_on = aml_local(1); /* Local1 = CPON flag for cpu */
+Aml *status = aml_local(3); /* Local3 = active state for cpu */
+
+aml_append(method, aml_store(aml_name(CPU_STATUS_MAP), 

Re: [Qemu-devel] [PATCH 08/33] pc: acpi: mark current CPU hotplug functions as legacy

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:43 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug.c|  8 
  hw/acpi/cpu_hotplug_acpi_table.c |  4 ++--
  hw/acpi/ich9.c   |  7 ---
  hw/acpi/piix4.c  |  6 +++---
  hw/i386/acpi-build.c |  3 ++-
  include/hw/acpi/cpu_hotplug.h| 12 ++--
  6 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 4d86743..ba9d903 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -54,8 +54,8 @@ static void acpi_set_cpu_present_bit(AcpiCpuHotplug *g, 
CPUState *cpu,
  g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
  }

-void acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
-  AcpiCpuHotplug *g, DeviceState *dev, Error **errp)
+void legacy_acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
+ AcpiCpuHotplug *g, DeviceState *dev, Error **errp)
  {
  acpi_set_cpu_present_bit(g, CPU(dev), errp);
  if (*errp != NULL) {
@@ -65,8 +65,8 @@ void acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
  acpi_send_gpe_event(ar, irq, ACPI_CPU_HOTPLUG_STATUS);
  }

-void acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
-   AcpiCpuHotplug *gpe_cpu, uint16_t base)
+void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
+  AcpiCpuHotplug *gpe_cpu, uint16_t base)
  {
  CPUState *cpu;

diff --git a/hw/acpi/cpu_hotplug_acpi_table.c b/hw/acpi/cpu_hotplug_acpi_table.c
index 9fdde6d..fc79c54 100644
--- a/hw/acpi/cpu_hotplug_acpi_table.c
+++ b/hw/acpi/cpu_hotplug_acpi_table.c
@@ -24,8 +24,8 @@
  #define CPU_STATUS_MAP "PRS"
  #define CPU_SCAN_METHOD "PRSC"

-void build_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
-   uint16_t io_base, uint16_t io_len)
+void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
+  uint16_t io_base, uint16_t io_len)
  {
  Aml *dev;
  Aml *crs;
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 27e978f..af340d0 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -273,8 +273,8 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
  pm->powerdown_notifier.notify = pm_powerdown_req;
  qemu_register_powerdown_notifier(&pm->powerdown_notifier);

-acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci), OBJECT(lpc_pci),
-  &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
+legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
+OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);

  if (pm->acpi_memory_hotplug.is_enabled) {
  acpi_memory_hotplug_init(pci_address_space_io(lpc_pci), 
OBJECT(lpc_pci),
@@ -437,7 +437,8 @@ void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState 
*dev, Error **errp)
  acpi_memory_plug_cb(&pm->acpi_regs, pm->irq, &pm->acpi_memory_hotplug,
  dev, errp);
  } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
-acpi_cpu_plug_cb(&pm->acpi_regs, pm->irq, &pm->gpe_cpu, dev, errp);
+legacy_acpi_cpu_plug_cb(&pm->acpi_regs, pm->irq,
+&pm->gpe_cpu, dev, errp);
  } else {
  error_setg(errp, "acpi: device plug request for not supported device"
 " type: %s", object_get_typename(OBJECT(dev)));
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 16abdf1..3e8d80b 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -352,7 +352,7 @@ static void piix4_device_plug_cb(HotplugHandler 
*hotplug_dev,
  acpi_pcihp_device_plug_cb(&s->ar, s->irq, &s->acpi_pci_hotplug, dev,
errp);
  } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
-acpi_cpu_plug_cb(&s->ar, s->irq, &s->gpe_cpu, dev, errp);
+legacy_acpi_cpu_plug_cb(&s->ar, s->irq, &s->gpe_cpu, dev, errp);
  } else {
  error_setg(errp, "acpi: device plug request for not supported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -570,8 +570,8 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
  acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
  s->use_acpi_pci_hotplug);

-acpi_cpu_hotplug_init(parent, OBJECT(s), &s->gpe_cpu,
-  PIIX4_CPU_HOTPLUG_IO_BASE);
+legacy_acpi_cpu_hotplug_init(parent, OBJECT(s), &s->gpe_cpu,
+ PIIX4_CPU_HOTPLUG_IO_BASE);

  if (s->acpi_memory_hotplug.is_enabled) {
  acpi_memory_hotplug_init(parent, OBJECT(s), &s->acpi_memory_hotplug);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 63e2723..2f6de43 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1935,7 +1935,8 @@ build_dsdt(GArray *table_data, GArray *linker,
  build_q35_pci0_int(dsdt);
  }

-build_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base, 
pm->cpu_hp_io_len);
+

Re: [Qemu-devel] [PATCH 07/33] pc: acpi: cpu-hotplug: make AML CPU_foo defines local to cpu_hotplug_acpi_table.c

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:42 PM, Igor Mammedov wrote:

now as those defines are used only locally inside of
cpu_hotplug_acpi_table.c, move them out of header file.

Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug_acpi_table.c | 7 +++
  include/hw/acpi/cpu_hotplug.h| 7 ---
  2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/cpu_hotplug_acpi_table.c b/hw/acpi/cpu_hotplug_acpi_table.c
index c31f346..9fdde6d 100644
--- a/hw/acpi/cpu_hotplug_acpi_table.c
+++ b/hw/acpi/cpu_hotplug_acpi_table.c
@@ -17,6 +17,13 @@
  #include "hw/acpi/cpu_hotplug.h"
  #include "hw/i386/pc.h"

+#define CPU_EJECT_METHOD "CPEJ"
+#define CPU_MAT_METHOD "CPMA"
+#define CPU_ON_BITMAP "CPON"
+#define CPU_STATUS_METHOD "CPST"
+#define CPU_STATUS_MAP "PRS"
+#define CPU_SCAN_METHOD "PRSC"
+
  void build_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
 uint16_t io_base, uint16_t io_len)
  {
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 9b1d0cf..565f96c 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -27,13 +27,6 @@ void acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
  void acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
 AcpiCpuHotplug *gpe_cpu, uint16_t base);

-#define CPU_EJECT_METHOD "CPEJ"
-#define CPU_MAT_METHOD "CPMA"
-#define CPU_ON_BITMAP "CPON"
-#define CPU_STATUS_METHOD "CPST"
-#define CPU_STATUS_MAP "PRS"
-#define CPU_SCAN_METHOD "PRSC"
-
  void build_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
 uint16_t io_base, uint16_t io_len);
  #endif



Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [Qemu-devel] [PATCH 06/33] pc: acpi: consolidate \GPE._E02 with the rest of CPU hotplug AML

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:42 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug_acpi_table.c | 4 
  hw/i386/acpi-build.c | 4 
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/cpu_hotplug_acpi_table.c b/hw/acpi/cpu_hotplug_acpi_table.c
index 730f44c..c31f346 100644
--- a/hw/acpi/cpu_hotplug_acpi_table.c
+++ b/hw/acpi/cpu_hotplug_acpi_table.c
@@ -235,4 +235,8 @@ void build_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
  g_free(apic_ids);

  aml_append(ctx, sb_scope);
+
+method = aml_method("\\_GPE._E02", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_call0("\\_SB." CPU_SCAN_METHOD));
+aml_append(ctx, method);
  }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 822230f..63e2723 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1952,10 +1952,6 @@ build_dsdt(GArray *table_data, GArray *linker,
  aml_append(scope, method);
  }

-method = aml_method("_E02", 0, AML_NOTSERIALIZED);
-aml_append(method, aml_call0("\\_SB." CPU_SCAN_METHOD));
-aml_append(scope, method);
-
  method = aml_method("_E03", 0, AML_NOTSERIALIZED);
  aml_append(method, aml_call0(MEMORY_HOTPLUG_HANDLER_PATH));
  aml_append(scope, method);



Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [Qemu-devel] [PATCH 30/33] acpi: cpuhp: add cpu._OST handling

2016-05-30 Thread Michael S. Tsirkin
On Wed, May 18, 2016 at 10:09:27AM +0200, Igor Mammedov wrote:
> On Tue, 17 May 2016 09:29:15 -0600
> Eric Blake  wrote:
> 
> > On 05/17/2016 08:43 AM, Igor Mammedov wrote:
> > > Signed-off-by: Igor Mammedov 
> > > ---
> > >  hw/acpi/cpu.c | 83 
> > > +++
> > >  hw/acpi/ich9.c|  3 ++
> > >  hw/acpi/piix4.c   |  3 ++
> > >  include/hw/acpi/cpu.h |  4 +++
> > >  qapi-schema.json  |  3 +-
> > >  trace-events  |  2 ++
> > >  6 files changed, 97 insertions(+), 1 deletion(-)
> > >   
> > 
> > > +++ b/qapi-schema.json
> > > @@ -4018,8 +4018,9 @@
> > >  ## @ACPISlotType
> > >  #
> > >  # @DIMM: memory slot
> > > +# @CPU: logical CPU slot  
> > 
> > Missing a marker '(since 2.7)'
> thanks, fixed in v2.

v2 was never posted.

> > >  #
> > > -{ 'enum': 'ACPISlotType', 'data': [ 'DIMM' ] }
> > > +{ 'enum': 'ACPISlotType', 'data': [ 'DIMM', 'CPU' ] }  
> > 
> > Hmm. ACPISlotType is already on our whitelist of exceptions that allow
> > upper-case names (we prefer lower), so adding another one doesn't
> > necessarily hurt.
> I'll keep that in mind.
> 



Re: [Qemu-devel] [Qemu-block] [PATCH v19 00/10] Block replication for continuous checkpoints

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 03:36:10PM +0800, Changlong Xie wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
> 
> You can get the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
> 
> You can get the patch with framework here:
> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>are accepted.
> 
> Change Log:
> V19:
> 1. Rebase to v2.6.0
> 2. Address comments from stefan
> p3: a new patch that export interfaces for extra serialization
> p8: 
> 1. call replication_stop() before freeing s->top_id
> 2. check top_bs
> 3. reopen file readonly in error return paths
> 4. enable extra serialization between read and COW
> p9: try to handle SIGABRT
> V18:
> p6: add local_err in all replication callbacks to prevent "errp == NULL"
> p7: add missing qemu_iovec_destroy(xxx)
> V17:
> 1. Rebase to the latest code
> p2: refactor backup_do_checkpoint addressed comments from Jeff Cody
> p4: fix bugs in "drive_add buddy xxx" hmp commands
> p6: add "since: 2.7"
> p7: fix bug in replication_close(), add missing "qapi/error.h", add 
> test-replication 
> p8: add "since: 2.7"
> V16:
> 1. Rebase to the newest codes
> 2. Address comments from Stefan & hailiang
> p3: we don't need this patch now
> p4: add "top-id" parameters for secondary
> p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs, 
> add doc comments that explain the semantics of Replication
> p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs()
> *Note*: I'm working on replication testcase now, will send out in V17
> V15:
> 1. Rebase to the newest codes
> 2. Fix typos and coding style to address Eric's comments
> 3. Address Stefan's comments
>1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
>2) Update the message and description for [PATCH 4/9]
>3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
>4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
>5) Use BdrvChild instead of holding on to BlockDriverState * pointers
> 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771  
> 5. Introduce replication_get_error_all to check replication status
> 6. Remove useless discard interface
> V14:
> 1. Implement auto complete active commit
> 2. Implement active commit block job for replication.c
> 3. Address the comments from Stefan, add replication-specific API and data
>structure, also remove old block layer APIs
> V13:
> 1. Rebase to the newest codes
> 2. Remove redundant macros and semicolon in replication.c
> 3. Fix typos in block-replication.txt
> V12:
> 1. Rebase to the newest codes
> 2. Use backing reference to replace 'allow-write-backing-file'
> V11:
> 1. Reopen the backing file when starting block replication if it is not
>opened in R/W mode
> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>when opening backing file
> 3. Block the top BDS so there is only one block job for the top BDS and
>its backing chain.
> V10:
> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>reference.
> 2. Address the comments from Eric Blake
> V9:
> 1. Update the error messages
> 2. Rebase to the newest qemu
> 3. Split child add/delete support. These patches are sent in another patchset.
> V8:
> 1. Address Alberto Garcia's comments
> V7:
> 1. Implement adding/removing quorum child. Remove the option non-connect.
> 2. Simplify the backing reference option according to Stefan Hajnoczi's 
> suggestion
> V6:
> 1. Rebase to the newest qemu.
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> 
> Changlong Xie (3):
>   Backup: export interfaces for extra serialization
>   Introduce new APIs to do replication operation
>   tests: add unit test case for replication
> 
> Wen Congyang (7):
>   unblock backup operations in backing file
>   Backup: clear all bitmap when doing block checkpoint
>   Link backup into block core
>   docs: block replication's description
>   auto complete active commit
>   Implement new driver for block replication
>   support replication driver 

Re: [Qemu-devel] [PATCH 05/33] pc: acpi: consolidate CPU hotplug AML

2016-05-30 Thread Marcel Apfelbaum

On 05/17/2016 05:42 PM, Igor Mammedov wrote:

Move the former SSDT part of CPU hotplug close to the DSDT part.
AML is only moved; there isn't any functional change.



The patch looks good to me,
but why did you decide to get rid of the build_processor_devices function?
I would simply move it to the new file.
Maybe it's a matter of taste, anyway:

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Signed-off-by: Igor Mammedov 
---
  hw/acpi/cpu_hotplug_acpi_table.c | 104 +++-
  hw/i386/acpi-build.c | 112 +--
  include/hw/acpi/cpu_hotplug.h|   3 +-
  3 files changed, 106 insertions(+), 113 deletions(-)

diff --git a/hw/acpi/cpu_hotplug_acpi_table.c b/hw/acpi/cpu_hotplug_acpi_table.c
index 97bb109..730f44c 100644
--- a/hw/acpi/cpu_hotplug_acpi_table.c
+++ b/hw/acpi/cpu_hotplug_acpi_table.c
@@ -15,12 +15,19 @@

  #include "qemu/osdep.h"
  #include "hw/acpi/cpu_hotplug.h"
+#include "hw/i386/pc.h"

-void build_cpu_hotplug_aml(Aml *ctx)
+void build_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
+   uint16_t io_base, uint16_t io_len)
  {
+Aml *dev;
+Aml *crs;
+Aml *pkg;
+Aml *field;
  Aml *method;
  Aml *if_ctx;
  Aml *else_ctx;
+int i, apic_idx;
  Aml *sb_scope = aml_scope("_SB");
  uint8_t madt_tmpl[8] = {0x00, 0x08, 0x00, 0x00, 0x00, 0, 0, 0};
  Aml *cpu_id = aml_arg(0);
@@ -29,6 +36,9 @@ void build_cpu_hotplug_aml(Aml *ctx)
  Aml *cpus_map = aml_name(CPU_ON_BITMAP);
  Aml *zero = aml_int(0);
  Aml *one = aml_int(1);
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
+PCMachineState *pcms = PC_MACHINE(machine);

  /*
   * _MAT method - creates an madt apic buffer
@@ -132,5 +142,97 @@ void build_cpu_hotplug_aml(Aml *ctx)
  }
  aml_append(sb_scope, method);

+/* The current AML generator can cover the APIC ID range [0..255],
+ * inclusive, for VCPU hotplug. */
+QEMU_BUILD_BUG_ON(ACPI_CPU_HOTPLUG_ID_LIMIT > 256);
+g_assert(pcms->apic_id_limit <= ACPI_CPU_HOTPLUG_ID_LIMIT);
+
+/* create PCI0.PRES device and its _CRS to reserve CPU hotplug MMIO */
+dev = aml_device("PCI0." stringify(CPU_HOTPLUG_RESOURCE_DEVICE));
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A06")));
+aml_append(dev,
+aml_name_decl("_UID", aml_string("CPU Hotplug resources"))
+);
+/* device present, functioning, decoding, not shown in UI */
+aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
+crs = aml_resource_template();
+aml_append(crs,
+aml_io(AML_DECODE16, io_base, io_base, 1, io_len)
+);
+aml_append(dev, aml_name_decl("_CRS", crs));
+aml_append(sb_scope, dev);
+/* declare CPU hotplug MMIO region and PRS field to access it */
+aml_append(sb_scope, aml_operation_region(
+"PRST", AML_SYSTEM_IO, aml_int(io_base), io_len));
+field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
+aml_append(field, aml_named_field("PRS", 256));
+aml_append(sb_scope, field);
+
+/* build Processor object for each processor */
+for (i = 0; i < apic_ids->len; i++) {
+int apic_id = apic_ids->cpus[i].arch_id;
+
+assert(apic_id < ACPI_CPU_HOTPLUG_ID_LIMIT);
+
+dev = aml_processor(apic_id, 0, 0, "CP%.02X", apic_id);
+
+method = aml_method("_MAT", 0, AML_NOTSERIALIZED);
+aml_append(method,
+aml_return(aml_call1(CPU_MAT_METHOD, aml_int(apic_id))));
+aml_append(dev, method);
+
+method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+aml_append(method,
+aml_return(aml_call1(CPU_STATUS_METHOD, aml_int(apic_id))));
+aml_append(dev, method);
+
+method = aml_method("_EJ0", 1, AML_NOTSERIALIZED);
+aml_append(method,
+aml_return(aml_call2(CPU_EJECT_METHOD, aml_int(apic_id),
+aml_arg(0)))
+);
+aml_append(dev, method);
+
+aml_append(sb_scope, dev);
+}
+
+/* build this code:
+ *   Method(NTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CP00, Arg1)} ...}
+ */
+/* Arg0 = Processor ID = APIC ID */
+method = aml_method(AML_NOTIFY_METHOD, 2, AML_NOTSERIALIZED);
+for (i = 0; i < apic_ids->len; i++) {
+int apic_id = apic_ids->cpus[i].arch_id;
+
+if_ctx = aml_if(aml_equal(aml_arg(0), aml_int(apic_id)));
+aml_append(if_ctx,
+aml_notify(aml_name("CP%.02X", apic_id), aml_arg(1))
+);
+aml_append(method, if_ctx);
+}
+aml_append(sb_scope, method);
+
+/* build "Name(CPON, Package() { One, One, ..., Zero, Zero, ... })"
+ *
+ * Note: The ability to create variable-sized packages was first
+ * introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size packages
+ * with up to 255 elements. Windows guests up to win2k8 fail when
+ 

Re: [Qemu-devel] [Qemu-block] [PATCH v19 08/10] Implement new driver for block replication

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 03:36:18PM +0800, Changlong Xie wrote:
> +/* start backup job now */
> +error_setg(&s->blocker,
> +   "block device is in use by internal backup job");
> +
> +top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
> +if (!top_bs || !check_top_bs(top_bs, bs)) {
> +reopen_backing_file(s, false, NULL);
> +aio_context_release(aio_context);
> +return;
> +}

Missing error_setg() with an error message when check_top_bs() fails.
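
The general pattern is that every early return on a path taking errp should leave an error behind, either set by the callee or set explicitly by the caller. Here is a minimal standalone sketch of that rule; the Err type, err_set(), lookup_top() and start_replication() are hypothetical stand-ins and not QEMU's Error API or the replication driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified error object. */
typedef struct Err { bool set; char msg[96]; } Err;

static void err_set(Err *e, const char *msg)
{
    assert(msg != NULL);
    if (e && !e->set) {
        e->set = true;
        snprintf(e->msg, sizeof(e->msg), "%s", msg);
    }
}

/* Stand-in for a lookup helper that reports its own error on failure. */
static bool lookup_top(bool exists, Err *errp)
{
    if (!exists) {
        err_set(errp, "top node not found");
    }
    return exists;
}

static bool start_replication(bool top_exists, bool top_is_valid, Err *errp)
{
    if (!lookup_top(top_exists, errp)) {
        return false;            /* the lookup already set the error */
    }
    if (!top_is_valid) {
        /* the check itself sets nothing, so the caller must do it */
        err_set(errp, "top node failed the validity check");
        return false;
    }
    return true;
}
```

Splitting the combined condition into two branches is one way to make sure the second failure case does not return silently.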




Re: [Qemu-devel] [PATCH v2 1/33] tests: acpi: report names of expected files in verbose mode

2016-05-30 Thread Marcel Apfelbaum

On 05/26/2016 12:46 PM, Igor Mammedov wrote:

Print the expected file name if it doesn't exist and
verbose mode is enabled*. It helps to avoid running
bios-tables-test under a debugger to figure out the missing
file name.

*)
verbose mode is enabled if "V" env. variable is set

Signed-off-by: Igor Mammedov 
---
  v2: replace 'for' loop with more readable 'goto'
  Marcel Apfelbaum 
---
  tests/bios-tables-test.c | 18 +-
  1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 0352814..f0493f8 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -464,7 +464,6 @@ static GArray *load_expected_aml(test_data *data)
  {
  int i;
  AcpiSdtTable *sdt;
-gchar *aml_file = NULL;
  GError *error = NULL;
  gboolean ret;

@@ -472,6 +471,7 @@ static GArray *load_expected_aml(test_data *data)
  for (i = 0; i < data->tables->len; ++i) {
  AcpiSdtTable exp_sdt;
  uint32_t signature;
+gchar *aml_file = NULL;
  const char *ext = data->variant ? data->variant : "";

  sdt = _array_index(data->tables, AcpiSdtTable, i);
@@ -484,13 +484,21 @@ static GArray *load_expected_aml(test_data *data)
  try_again:
  aml_file = g_strdup_printf("%s/%s/%.4s%s", data_dir, data->machine,
 (gchar *), ext);
-if (data->variant && !g_file_test(aml_file, G_FILE_TEST_EXISTS)) {
-g_free(aml_file);
+if (getenv("V")) {
+fprintf(stderr, "\nLooking for expected file '%s'\n", aml_file);
+}
+if (g_file_test(aml_file, G_FILE_TEST_EXISTS)) {
+exp_sdt.aml_file = aml_file;
+} else if (*ext != '\0') {
+/* try fallback to generic (extension-less) expected file */
  ext = "";
+g_free(aml_file);
  goto try_again;
  }
-exp_sdt.aml_file = aml_file;
-g_assert(g_file_test(aml_file, G_FILE_TEST_EXISTS));
+g_assert(exp_sdt.aml_file);
+if (getenv("V")) {
+fprintf(stderr, "\nUsing expected file '%s'\n", aml_file);
+}
  ret = g_file_get_contents(aml_file, _sdt.aml,
_sdt.aml_len, );
  g_assert(ret);



Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [Qemu-devel] [PATCH V3] tap: vhost busy polling support

2016-05-30 Thread Michael S. Tsirkin
On Thu, Apr 07, 2016 at 12:56:24PM +0800, Jason Wang wrote:
> This patch adds basic vhost net busy polling, which is
> supported by recent kernels. Users can configure the maximum number of
> us that may be spent on busy polling through a new tap property,
> "vhost-poll-us".

I applied this but now I had a thought - should we generalize this to
"poll-us"? Down the road tun could support busy polling just like
sockets do.

> 
> Signed-off-by: Jason Wang 
> ---
>  hw/net/vhost_net.c|  2 +-
>  hw/scsi/vhost-scsi.c  |  2 +-
>  hw/virtio/vhost-backend.c |  8 
>  hw/virtio/vhost.c | 40 
> ++-
>  include/hw/virtio/vhost-backend.h |  3 +++
>  include/hw/virtio/vhost.h |  3 ++-
>  include/net/vhost_net.h   |  1 +
>  net/tap.c | 10 --
>  net/vhost-user.c  |  1 +
>  qapi-schema.json  |  6 +-
>  qemu-options.hx   |  3 +++
>  11 files changed, 72 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 6e1032f..1840c73 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -166,7 +166,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
>  }
>  
>  r = vhost_dev_init(>dev, options->opaque,
> -   options->backend_type);
> +   options->backend_type, options->busyloop_timeout);
>  if (r < 0) {
>  goto fail;
>  }
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index 9261d51..2a00f2f 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -248,7 +248,7 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
> **errp)
>  s->dev.backend_features = 0;
>  
>  ret = vhost_dev_init(>dev, (void *)(uintptr_t)vhostfd,
> - VHOST_BACKEND_TYPE_KERNEL);
> + VHOST_BACKEND_TYPE_KERNEL, 0);
>  if (ret < 0) {
>  error_setg(errp, "vhost-scsi: vhost initialization failed: %s",
> strerror(-ret));
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index b358902..d62372e 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -138,6 +138,12 @@ static int vhost_kernel_set_vring_call(struct vhost_dev 
> *dev,
>  return vhost_kernel_call(dev, VHOST_SET_VRING_CALL, file);
>  }
>  
> +static int vhost_kernel_set_vring_busyloop_timeout(struct vhost_dev *dev,
> +   struct vhost_vring_state 
> *s)
> +{
> +return vhost_kernel_call(dev, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, s);
> +}
> +
>  static int vhost_kernel_set_features(struct vhost_dev *dev,
>   uint64_t features)
>  {
> @@ -185,6 +191,8 @@ static const VhostOps kernel_ops = {
>  .vhost_get_vring_base = vhost_kernel_get_vring_base,
>  .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
>  .vhost_set_vring_call = vhost_kernel_set_vring_call,
> +.vhost_set_vring_busyloop_timeout =
> +vhost_kernel_set_vring_busyloop_timeout,
>  .vhost_set_features = vhost_kernel_set_features,
>  .vhost_get_features = vhost_kernel_get_features,
>  .vhost_set_owner = vhost_kernel_set_owner,
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 4400718..ebf8b08 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -964,6 +964,28 @@ static void vhost_eventfd_del(MemoryListener *listener,
>  {
>  }
>  
> +static int vhost_virtqueue_set_busyloop_timeout(struct vhost_dev *dev,
> +int n, uint32_t timeout)
> +{
> +int vhost_vq_index = dev->vhost_ops->vhost_get_vq_index(dev, n);
> +struct vhost_vring_state state = {
> +.index = vhost_vq_index,
> +.num = timeout,
> +};
> +int r;
> +
> +if (!dev->vhost_ops->vhost_set_vring_busyloop_timeout) {
> +return -EINVAL;
> +}
> +
> +r = dev->vhost_ops->vhost_set_vring_busyloop_timeout(dev, );
> +if (r) {
> +return r;
> +}
> +
> +return 0;
> +}
> +
>  static int vhost_virtqueue_init(struct vhost_dev *dev,
>  struct vhost_virtqueue *vq, int n)
>  {
> @@ -994,7 +1016,7 @@ static void vhost_virtqueue_cleanup(struct 
> vhost_virtqueue *vq)
>  }
>  
>  int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> -   VhostBackendType backend_type)
> +   VhostBackendType backend_type, uint32_t busyloop_timeout)
>  {
>  uint64_t features;
>  int i, r;
> @@ -1035,6 +1057,17 @@ int vhost_dev_init(struct vhost_dev *hdev, void 
> *opaque,
>  goto fail_vq;
>  }
>  }
> +
> +if (busyloop_timeout) {
> +for (i = 0; i < hdev->nvqs; ++i) {
> +r = 

Re: [Qemu-devel] [PATCH v5 00/10] Add Ethernet device for i.MX6 SOC

2016-05-30 Thread Jean-Christophe DUBOIS

Le 30/05/2016 08:10, Jason Wang a écrit :



On 2016年05月30日 14:04, Jean-Christophe DUBOIS wrote:

Le 30/05/2016 04:19, Jason Wang a écrit :



On 2016年05月21日 16:01, Jean-Christophe Dubois wrote:

This patch series adds the Gb ENET Ethernet device to the i.MX6 SOC.

The ENET device is an evolution of the FEC device present on the i.MX25 SOC
and is backward compatible with it.

Therefore the ENET support has been added to the existing QEMU FEC device
(rather than adding a new device).

The Patch has been tested by:
  * Booting Linux on i.MX25 PDK board emulation and accessing the internet
  * Booting Linux on i.MX6 Sabrelite board emulation and accessing the internet


Jean-Christophe Dubois (10):
   net: improve UDP/TCP checksum computation.
   net: handle optional VLAN header in checksum computation.
   i.MX: Fix FEC code for MDIO operation selection
   i.MX: Fix FEC code for MDIO address selection
   i.MX: Fix FEC code for ECR register reset value.
   i.MX: reset TX/RX descriptors when FEC is disabled.
   i.MX: Rename i.MX FEC defines to ENET_XXX
   i.MX: move FEC device to a register array structure.
   Add ENET/Gbps Ethernet support to FEC device
   Add ENET device to i.MX6 SOC.

  hw/arm/fsl-imx25.c|1 +
  hw/arm/fsl-imx6.c |   17 +
  hw/net/imx_fec.c  | 1009 
++---

  include/hw/arm/fsl-imx6.h |6 +-
  include/hw/net/imx_fec.h  |  250 ---
  net/checksum.c|  121 --
  6 files changed, 1077 insertions(+), 327 deletions(-)



Want to merge this, but I get:

Applying: net: improve UDP/TCP checksum computation.
Applying: net: handle optional VLAN header in checksum computation.
Applying: i.MX: Fix FEC code for MDIO operation selection
Applying: i.MX: Fix FEC code for MDIO address selection
Applying: i.MX: Fix FEC code for ECR register reset value.
Applying: i.MX: reset TX/RX descriptors when FEC is disabled.
Applying: i.MX: Rename i.MX FEC defines to ENET_XXX
Applying: i.MX: move FEC device to a register array structure.
Applying: Add ENET/Gbps Ethernet support to FEC device
error: patch failed: hw/net/imx_fec.c:24
error: hw/net/imx_fec.c: patch does not apply
Patch failed at 0009 Add ENET/Gbps Ethernet support to FEC device
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


This is because of commit 03dd024ff57733a55cd2e455f361d053c81b1b29 
"hw: explicitly include qemu/log.h" that has been applied meanwhile.


I'll send a new version soon.

JC



Thanks



v6 is out.

JC




Re: [Qemu-devel] [Qemu-block] [PATCH v19 09/10] tests: add unit test case for replication

2016-05-30 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 03:36:19PM +0800, Changlong Xie wrote:
> +/* primary */
> +#define P_LOCAL_DISK "/tmp/p_local_disk.XX"
> +#define P_COMMAND "driver=replication,mode=primary,node-name=xxx,"\
> +  "file.driver=qcow2,file.file.filename="P_LOCAL_DISK
> +
> +/* secondary */
> +#define S_LOCAL_DISK "/tmp/s_local_disk.XX"
> +#define S_ACTIVE_DISK "/tmp/s_active_disk.XX"
> +#define S_HIDDEN_DISK "/tmp/s_hidden_disk.XX"

Please use unique filenames so that multiple instances of the test can
run in parallel on a single machine.  mkstemp(3) can be used to do this.
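
A minimal sketch of that suggestion, assuming a POSIX system: mkstemp(3) needs a template ending in exactly six 'X' characters and rewrites it in place with a unique name. make_unique_disk() is a hypothetical helper, not something from the patch under review.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Fill 'path' with a unique file name derived from 'prefix' and create
 * the file. Returns an open file descriptor, or -1 on error.
 */
static int make_unique_disk(char *path, size_t path_len, const char *prefix)
{
    static const char suffix[] = ".XXXXXX";   /* six X's, as mkstemp requires */

    assert(path && prefix);
    if (strlen(prefix) + sizeof(suffix) > path_len) {
        return -1;              /* template would not fit in 'path' */
    }
    strcpy(path, prefix);
    strcat(path, suffix);
    return mkstemp(path);       /* replaces the X's in place */
}
```

Each test instance would then pass the generated path into its option string, and close() and unlink() the file during teardown.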

> +static void io_read(BlockDriverState *bs, long pattern, int64_t 
> pattern_offset,
> +int64_t pattern_count, int64_t offset, int64_t count,
> +bool expect_failed)
> +{
> +char *buf;
> +void *cmp_buf;
> +int ret;
> +
> +/* 1. alloc pattern buffer */
> +if (pattern) {
> +cmp_buf = g_malloc(pattern_count);
> +memset(cmp_buf, pattern, pattern_count);
> +}
> +
> +/* 2. alloc read buffer */
> +buf = qemu_blockalign(bs, count);
> +memset(buf, 0xab, count);
> +
> +/* 3. do read */
> +ret = bdrv_read(bs, offset >> 9, (uint8_t *)buf, count >> 9);
> +
> +/* 4. assert and compare buf */
> +if (expect_failed) {
> +g_assert(ret < 0);
> +} else {
> +g_assert(ret >= 0);
> +if (pattern) {
> +g_assert(memcmp(buf + pattern_offset, cmp_buf, pattern_count) <= 
> 0);
> +g_free(cmp_buf);

if pattern && expect_failed then cmp_buf is leaked.  Probably best to
initialize cmp_buf = NULL and have an unconditional g_free(cmp_buf) at
the end of the function to avoid leaks.
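
The cleanup pattern suggested here could look roughly like this. It is a simplified stand-in, not the test's io_read(): check_pattern() is a hypothetical function, and plain malloc()/free() replace g_malloc()/g_free() so the sketch compiles without GLib. Like g_free(), free() accepts NULL, which is what makes the unconditional call safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare 'count' bytes of 'buf' against the fill byte 'pattern'.
 * Returns 0 on match (or when nothing is compared), 1 on mismatch,
 * -1 on allocation failure. cmp_buf starts out NULL and is freed on
 * every exit path, so no branch can leak it.
 */
static int check_pattern(const unsigned char *buf, size_t count,
                         int pattern, int expect_failed)
{
    unsigned char *cmp_buf = NULL;
    int ret = -1;

    assert(buf);
    if (pattern) {
        cmp_buf = malloc(count);
        if (!cmp_buf) {
            goto out;
        }
        memset(cmp_buf, pattern, count);
    }
    if (expect_failed) {
        ret = 0;                /* a failed read leaves nothing to compare */
        goto out;
    }
    ret = cmp_buf ? (memcmp(buf, cmp_buf, count) != 0) : 0;
out:
    free(cmp_buf);              /* free(NULL) is a no-op */
    return ret;
}
```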

> +}
> +}
> +g_free(buf);

qemu_blockalign() memory is freed with qemu_vfree(), not g_free().

> +static void test_primary_do_checkpoint(void)
> +{
> +BlockDriverState *bs;
> +Error *local_err = NULL;
> +
> +bs = start_primary();
> +
> +replication_do_checkpoint_all(_err);
> +g_assert(!local_err);
> +
> +teardown_primary(bs);
> +}

Shouldn't replication_start_all() be called before
replication_do_checkpoint_all()?

> +int main(int argc, char **argv)
> +{
> +int ret;
> +qemu_init_main_loop(_fatal);
> +bdrv_init();
> +
> +do {} while (g_main_context_iteration(NULL, false));

Why is this necessary?




[Qemu-devel] [PATCH v6 09/10] Add ENET/Gbps Ethernet support to FEC device

2016-05-30 Thread Jean-Christophe Dubois
The ENET device (present in i.MX6) is "derived" from FEC and backward
compatible with it.

This patch adds support for the additional ENET features that Linux needs
in order to use the device (on supported processors).

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1
   
Changes since v2:
 * Not present on v2
 
Changes since v3:
 * Separate and fix the 2 supported interrupts
 
Changes since v4:
 * is-fec property was dropped and 2 distinct devices (FEC and ENET) created

Changes since v5:
 * fix patch because of merge conflict.

 hw/arm/fsl-imx25.c   |   1 +
 hw/net/imx_fec.c | 673 ---
 include/hw/net/imx_fec.h | 131 -
 3 files changed, 702 insertions(+), 103 deletions(-)

diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
index 2f878b9..1cd749a 100644
--- a/hw/arm/fsl-imx25.c
+++ b/hw/arm/fsl-imx25.c
@@ -191,6 +191,7 @@ static void fsl_imx25_realize(DeviceState *dev, Error 
**errp)
 }
 
 qdev_set_nic_properties(DEVICE(>fec), _table[0]);
+
 object_property_set_bool(OBJECT(>fec), true, "realized", );
 if (err) {
 error_propagate(errp, err);
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index d8e4145..d91e029 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -25,6 +25,8 @@
 #include "hw/net/imx_fec.h"
 #include "sysemu/dma.h"
 #include "qemu/log.h"
+#include "net/checksum.h"
+#include "net/eth.h"
 
 /* For crc32 */
 #include 
@@ -53,10 +55,93 @@
 } \
 } while (0)
 
-static const char *imx_fec_reg_name(IMXFECState *s, uint32_t index)
+static const char *imx_default_reg_name(IMXFECState *s, uint32_t index)
 {
 static char tmp[20];
+sprintf(tmp, "index %d", index);
+return tmp;
+}
+
+static const char *imx_fec_reg_name(IMXFECState *s, uint32_t index)
+{
+switch (index) {
+case ENET_FRBR:
+return "FRBR";
+case ENET_FRSR:
+return "FRSR";
+case ENET_MIIGSK_CFGR:
+return "MIIGSK_CFGR";
+case ENET_MIIGSK_ENR:
+return "MIIGSK_ENR";
+default:
+return imx_default_reg_name(s, index);
+}
+}
+
+static const char *imx_enet_reg_name(IMXFECState *s, uint32_t index)
+{
+switch (index) {
+case ENET_RSFL:
+return "RSFL";
+case ENET_RSEM:
+return "RSEM";
+case ENET_RAEM:
+return "RAEM";
+case ENET_RAFL:
+return "RAFL";
+case ENET_TSEM:
+return "TSEM";
+case ENET_TAEM:
+return "TAEM";
+case ENET_TAFL:
+return "TAFL";
+case ENET_TIPG:
+return "TIPG";
+case ENET_FTRL:
+return "FTRL";
+case ENET_TACC:
+return "TACC";
+case ENET_RACC:
+return "RACC";
+case ENET_ATCR:
+return "ATCR";
+case ENET_ATVR:
+return "ATVR";
+case ENET_ATOFF:
+return "ATOFF";
+case ENET_ATPER:
+return "ATPER";
+case ENET_ATCOR:
+return "ATCOR";
+case ENET_ATINC:
+return "ATINC";
+case ENET_ATSTMP:
+return "ATSTMP";
+case ENET_TGSR:
+return "TGSR";
+case ENET_TCSR0:
+return "TCSR0";
+case ENET_TCCR0:
+return "TCCR0";
+case ENET_TCSR1:
+return "TCSR1";
+case ENET_TCCR1:
+return "TCCR1";
+case ENET_TCSR2:
+return "TCSR2";
+case ENET_TCCR2:
+return "TCCR2";
+case ENET_TCSR3:
+return "TCSR3";
+case ENET_TCCR3:
+return "TCCR3";
+default:
+return imx_default_reg_name(s, index);
+}
+}
 
+static const char *imx_eth_reg_name(IMXFECState *s, uint32_t index)
+{
 switch (index) {
 case ENET_EIR:
 return "EIR";
@@ -100,21 +185,16 @@ static const char *imx_fec_reg_name(IMXFECState *s, 
uint32_t index)
 return "TDSR";
 case ENET_MRBR:
 return "MRBR";
-case ENET_FRBR:
-return "FRBR";
-case ENET_FRSR:
-return "FRSR";
-case ENET_MIIGSK_CFGR:
-return "MIIGSK_CFGR";
-case ENET_MIIGSK_ENR:
-return "MIIGSK_ENR";
 default:
-sprintf(tmp, "index %d", index);
-return tmp;
+if (s->is_fec) {
+return imx_fec_reg_name(s, index);
+} else {
+return imx_enet_reg_name(s, index);
+}
 }
 }
 
-static const VMStateDescription vmstate_imx_fec = {
+static const VMStateDescription vmstate_imx_eth = {
 .name = TYPE_IMX_FEC,
 .version_id = 2,
 .minimum_version_id = 2,
@@ -140,7 +220,7 @@ static const VMStateDescription vmstate_imx_fec = {
 #define PHY_INT_PARFAULT(1 << 2)
 #define PHY_INT_AUTONEG_PAGE(1 << 1)
 
-static void imx_fec_update(IMXFECState *s);
+static void imx_eth_update(IMXFECState *s);
 
 /*
  * The MII phy could raise a GPIO to the processor which in turn
@@ -150,7 +230,7 @@ static void imx_fec_update(IMXFECState *s);
  */
 static void phy_update_irq(IMXFECState *s)
 {
-imx_fec_update(s);
+imx_eth_update(s);
 }
 

[Qemu-devel] [PATCH v6 06/10] i.MX: reset TX/RX descriptors when FEC is disabled.

2016-05-30 Thread Jean-Christophe Dubois
According to the FEC chapter of the i.MX25 reference manual, the
RX and TX descriptors are reset when the FEC device is disabled through ECR.

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1
 
Changes since v2:
 * Not present on v2
 
Changes since v3:
 * Not present on v3
 
Changes since v4:
 * Not present on v4

Changes since v5:
 * None

 hw/net/imx_fec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 768181e..7369cfa 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -454,6 +454,8 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 }
 if ((s->ecr & FEC_EN) == 0) {
 s->rx_enabled = 0;
+s->rx_descriptor = s->erdsr;
+s->tx_descriptor = s->etdsr;
 }
 break;
 case 0x040: /* MMFR */
-- 
2.7.4




[Qemu-devel] [PATCH v6 05/10] i.MX: Fix FEC code for ECR register reset value.

2016-05-30 Thread Jean-Christophe Dubois
According to the FEC chapter of the i.MX25 reference manual, the ECR register
is initialized to 0xf000 at reset time.

This patch fixes the reset value.

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1

Changes since v2:
 * Not present on v2

Changes since v3:
 * Not present on v3
 
Changes since v4:
 * None

Changes since v5:
 * None

 hw/net/imx_fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index bf68ce6..768181e 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -339,7 +339,7 @@ static void imx_fec_reset(DeviceState *d)
 s->eir = 0;
 s->eimr = 0;
 s->rx_enabled = 0;
-s->ecr = 0;
+s->ecr = 0xf000;
 s->mscr = 0;
 s->mibc = 0xc000;
 s->rcr = 0x05ee0001;
-- 
2.7.4




[Qemu-devel] [PATCH v6 08/10] i.MX: move FEC device to a register array structure.

2016-05-30 Thread Jean-Christophe Dubois
This is to prepare for the ENET Gb device of the i.MX6.

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1.
 
Changes since v2:
 * The patch was split in 2 parts:
   - a "port" to a reg array based device (this patch).
   - the addition of the Gb support (next patch).
 
Changes since v3: 
 * Small fix patches were extracted from this patch (see previous 3 patches)
 * Reset logic through ECR was fixed.
 * TDAR/RDAR behavior was fixed.
 
Changes since v4:
 * #define renaming was extracted from this patch (see previous patch)
 * Small fix patch was extracted from this patch (see previous patch on RX/TX
   descriptor)

Changes since v5:
 * None

 hw/net/imx_fec.c | 398 ++-
 include/hw/net/imx_fec.h |  51 --
 2 files changed, 256 insertions(+), 193 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index f5eede8..d8e4145 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -53,30 +53,75 @@
 } \
 } while (0)
 
+static const char *imx_fec_reg_name(IMXFECState *s, uint32_t index)
+{
+static char tmp[20];
+
+switch (index) {
+case ENET_EIR:
+return "EIR";
+case ENET_EIMR:
+return "EIMR";
+case ENET_RDAR:
+return "RDAR";
+case ENET_TDAR:
+return "TDAR";
+case ENET_ECR:
+return "ECR";
+case ENET_MMFR:
+return "MMFR";
+case ENET_MSCR:
+return "MSCR";
+case ENET_MIBC:
+return "MIBC";
+case ENET_RCR:
+return "RCR";
+case ENET_TCR:
+return "TCR";
+case ENET_PALR:
+return "PALR";
+case ENET_PAUR:
+return "PAUR";
+case ENET_OPD:
+return "OPD";
+case ENET_IAUR:
+return "IAUR";
+case ENET_IALR:
+return "IALR";
+case ENET_GAUR:
+return "GAUR";
+case ENET_GALR:
+return "GALR";
+case ENET_TFWR:
+return "TFWR";
+case ENET_RDSR:
+return "RDSR";
+case ENET_TDSR:
+return "TDSR";
+case ENET_MRBR:
+return "MRBR";
+case ENET_FRBR:
+return "FRBR";
+case ENET_FRSR:
+return "FRSR";
+case ENET_MIIGSK_CFGR:
+return "MIIGSK_CFGR";
+case ENET_MIIGSK_ENR:
+return "MIIGSK_ENR";
+default:
+sprintf(tmp, "index %d", index);
+return tmp;
+}
+}
+
 static const VMStateDescription vmstate_imx_fec = {
 .name = TYPE_IMX_FEC,
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
 .fields = (VMStateField[]) {
-VMSTATE_UINT32(irq_state, IMXFECState),
-VMSTATE_UINT32(eir, IMXFECState),
-VMSTATE_UINT32(eimr, IMXFECState),
-VMSTATE_UINT32(rx_enabled, IMXFECState),
+VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
 VMSTATE_UINT32(rx_descriptor, IMXFECState),
 VMSTATE_UINT32(tx_descriptor, IMXFECState),
-VMSTATE_UINT32(ecr, IMXFECState),
-VMSTATE_UINT32(mmfr, IMXFECState),
-VMSTATE_UINT32(mscr, IMXFECState),
-VMSTATE_UINT32(mibc, IMXFECState),
-VMSTATE_UINT32(rcr, IMXFECState),
-VMSTATE_UINT32(tcr, IMXFECState),
-VMSTATE_UINT32(tfwr, IMXFECState),
-VMSTATE_UINT32(frsr, IMXFECState),
-VMSTATE_UINT32(erdsr, IMXFECState),
-VMSTATE_UINT32(etdsr, IMXFECState),
-VMSTATE_UINT32(emrbr, IMXFECState),
-VMSTATE_UINT32(miigsk_cfgr, IMXFECState),
-VMSTATE_UINT32(miigsk_enr, IMXFECState),
 
 VMSTATE_UINT32(phy_status, IMXFECState),
 VMSTATE_UINT32(phy_control, IMXFECState),
@@ -252,15 +297,13 @@ static void imx_fec_write_bd(IMXFECBufDesc *bd, 
dma_addr_t addr)
 
 static void imx_fec_update(IMXFECState *s)
 {
-uint32_t active;
-uint32_t changed;
-
-active = s->eir & s->eimr;
-changed = active ^ s->irq_state;
-if (changed) {
-qemu_set_irq(s->irq, active);
+if (s->regs[ENET_EIR] & s->regs[ENET_EIMR]) {
+FEC_PRINTF("interrupt raised\n");
+qemu_set_irq(s->irq, 1);
+} else {
+FEC_PRINTF("interrupt lowered\n");
+qemu_set_irq(s->irq, 0);
 }
-s->irq_state = active;
 }
 
 static void imx_fec_do_tx(IMXFECState *s)
@@ -284,7 +327,7 @@ static void imx_fec_do_tx(IMXFECState *s)
 len = bd.length;
 if (frame_size + len > ENET_MAX_FRAME_SIZE) {
 len = ENET_MAX_FRAME_SIZE - frame_size;
-s->eir |= ENET_INT_BABT;
+s->regs[ENET_EIR] |= ENET_INT_BABT;
 }
 dma_memory_read(_space_memory, bd.data, ptr, len);
 ptr += len;
@@ -294,17 +337,17 @@ static void imx_fec_do_tx(IMXFECState *s)
 qemu_send_packet(qemu_get_queue(s->nic), frame, len);
 ptr = frame;
 frame_size = 0;
-s->eir |= ENET_INT_TXF;
+s->regs[ENET_EIR] |= ENET_INT_TXF;
 }
-s->eir |= ENET_INT_TXB;
+s->regs[ENET_EIR] 

[Qemu-devel] [PATCH v6 07/10] i.MX: Rename i.MX FEC defines to ENET_XXX

2016-05-30 Thread Jean-Christophe Dubois
Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1

Changes since v2:
 * Not present on v2
 
Changes since v3:
 * Not present on v3
 
Changes since v4:
 * Not present on v4

Changes since v5:
 * None

 hw/net/imx_fec.c | 54 
 include/hw/net/imx_fec.h | 64 
 2 files changed, 59 insertions(+), 59 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 7369cfa..f5eede8 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -266,7 +266,7 @@ static void imx_fec_update(IMXFECState *s)
 static void imx_fec_do_tx(IMXFECState *s)
 {
 int frame_size = 0;
-uint8_t frame[FEC_MAX_FRAME_SIZE];
+uint8_t frame[ENET_MAX_FRAME_SIZE];
 uint8_t *ptr = frame;
 uint32_t addr = s->tx_descriptor;
 
@@ -277,31 +277,31 @@ static void imx_fec_do_tx(IMXFECState *s)
 imx_fec_read_bd(, addr);
 FEC_PRINTF("tx_bd %x flags %04x len %d data %08x\n",
addr, bd.flags, bd.length, bd.data);
-if ((bd.flags & FEC_BD_R) == 0) {
+if ((bd.flags & ENET_BD_R) == 0) {
 /* Run out of descriptors to transmit.  */
 break;
 }
 len = bd.length;
-if (frame_size + len > FEC_MAX_FRAME_SIZE) {
-len = FEC_MAX_FRAME_SIZE - frame_size;
-s->eir |= FEC_INT_BABT;
+if (frame_size + len > ENET_MAX_FRAME_SIZE) {
+len = ENET_MAX_FRAME_SIZE - frame_size;
+s->eir |= ENET_INT_BABT;
 }
 dma_memory_read(_space_memory, bd.data, ptr, len);
 ptr += len;
 frame_size += len;
-if (bd.flags & FEC_BD_L) {
+if (bd.flags & ENET_BD_L) {
 /* Last buffer in frame.  */
 qemu_send_packet(qemu_get_queue(s->nic), frame, len);
 ptr = frame;
 frame_size = 0;
-s->eir |= FEC_INT_TXF;
+s->eir |= ENET_INT_TXF;
 }
-s->eir |= FEC_INT_TXB;
-bd.flags &= ~FEC_BD_R;
+s->eir |= ENET_INT_TXB;
+bd.flags &= ~ENET_BD_R;
 /* Write back the modified descriptor.  */
 imx_fec_write_bd(, addr);
 /* Advance to the next descriptor.  */
-if ((bd.flags & FEC_BD_W) != 0) {
+if ((bd.flags & ENET_BD_W) != 0) {
 addr = s->etdsr;
 } else {
 addr += 8;
@@ -320,7 +320,7 @@ static void imx_fec_enable_rx(IMXFECState *s)
 
 imx_fec_read_bd(, s->rx_descriptor);
 
-tmp = ((bd.flags & FEC_BD_E) != 0);
+tmp = ((bd.flags & ENET_BD_E) != 0);
 
 if (!tmp) {
 FEC_PRINTF("RX buffer full\n");
@@ -438,21 +438,21 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 s->eimr = value;
 break;
 case 0x010: /* RDAR */
-if ((s->ecr & FEC_EN) && !s->rx_enabled) {
+if ((s->ecr & ENET_ECR_ETHEREN) && !s->rx_enabled) {
 imx_fec_enable_rx(s);
 }
 break;
 case 0x014: /* TDAR */
-if (s->ecr & FEC_EN) {
+if (s->ecr & ENET_ECR_ETHEREN) {
 imx_fec_do_tx(s);
 }
 break;
 case 0x024: /* ECR */
 s->ecr = value;
-if (value & FEC_RESET) {
+if (value & ENET_ECR_RESET) {
 imx_fec_reset(DEVICE(s));
 }
-if ((s->ecr & FEC_EN) == 0) {
+if ((s->ecr & ENET_ECR_ETHEREN) == 0) {
 s->rx_enabled = 0;
 s->rx_descriptor = s->erdsr;
 s->tx_descriptor = s->etdsr;
@@ -467,7 +467,7 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 do_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
 }
 /* raise the interrupt as the PHY operation is done */
-s->eir |= FEC_INT_MII;
+s->eir |= ENET_INT_MII;
 break;
 case 0x044: /* MSCR */
 s->mscr = value & 0xfe;
@@ -484,7 +484,7 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 /* We transmit immediately, so raise GRA immediately.  */
 s->tcr = value;
 if (value & 1) {
-s->eir |= FEC_INT_GRA;
+s->eir |= ENET_INT_GRA;
 }
 break;
 case 0x0e4: /* PALR */
@@ -574,20 +574,20 @@ static ssize_t imx_fec_receive(NetClientState *nc, const 
uint8_t *buf,
 crc_ptr = (uint8_t *) 
 
 /* Huge frames are truncted.  */
-if (size > FEC_MAX_FRAME_SIZE) {
-size = FEC_MAX_FRAME_SIZE;
-flags |= FEC_BD_TR | FEC_BD_LG;
+if (size > ENET_MAX_FRAME_SIZE) {
+size = ENET_MAX_FRAME_SIZE;
+flags |= ENET_BD_TR | ENET_BD_LG;
 }
 
 /* Frames larger than the user limit just set error flags.  */
 if (size > (s->rcr >> 16)) {
-flags |= FEC_BD_LG;
+flags |= ENET_BD_LG;
 }
 
 addr = s->rx_descriptor;
 while (size > 0) {
 imx_fec_read_bd(, addr);
-if ((bd.flags & FEC_BD_E) == 0) {
+if ((bd.flags & ENET_BD_E) == 0) {
 

[Qemu-devel] [PATCH v6 03/10] i.MX: Fix FEC code for MDIO operation selection

2016-05-30 Thread Jean-Christophe Dubois
According to the FEC chapter of the i.MX25 reference manual, when writing
the MMFR register, bits 29 and 28 select the requested operation:
 * 10 means read operation with valid MII mgmt frame
 * 11 means read operation with non compliant MII mgmt frame
 * 01 means write operation with valid MII mgmt frame
 * 00 means write operation with non compliant MII mgmt frame

So while bit 28 does change between read and write for a valid MII mgmt
frame, its meaning is inverted for a non-compliant MII mgmt frame.

Bit 29, on the other hand, distinguishes read from write whatever the type
of mgmt frame involved.

So this patch changes the operation selection from bit 28 to bit 29, as it is
more generic.
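
The table above can be checked with a small standalone sketch. extract_bits() follows the same (value, start, length) contract as QEMU's extract32() but is redefined here so the example compiles on its own; mmfr_is_read() is a hypothetical helper name, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return 'length' bits of 'value', starting at bit 'start'. */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Bit 29 alone tells read (1) from write (0) for both frame types. */
static bool mmfr_is_read(uint32_t mmfr)
{
    return extract_bits(mmfr, 29, 1) != 0;
}
```

Testing bit 28 instead would classify operation 11 (a read with a non-compliant frame) as a write, because bit 28 is set there just as it is for operation 01.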

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1: 
 * Not present on v1 
 
Changes since v2: 
 * Not present on v2
 
Changes since v3:
 * Not present on v3

Changes since v4:
 * None

Changes since v5:
 * None  

 hw/net/imx_fec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 9055ea8..fce3661 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -459,10 +459,10 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 case 0x040: /* MMFR */
 /* store the value */
 s->mmfr = value;
-if (extract32(value, 28, 1)) {
-do_phy_write(s, extract32(value, 18, 9), extract32(value, 0, 16));
-} else {
+if (extract32(value, 29, 1)) {
 s->mmfr = do_phy_read(s, extract32(value, 18, 9));
+} else {
+do_phy_write(s, extract32(value, 18, 9), extract32(value, 0, 16));
 }
 /* raise the interrupt as the PHY operation is done */
 s->eir |= FEC_INT_MII;
-- 
2.7.4




[Qemu-devel] [PATCH v6 02/10] net: handle optional VLAN header in checksum computation.

2016-05-30 Thread Jean-Christophe Dubois
Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1
 
Changes since v2:
 * Not present on v2

Changes since v3:
 * local variable name change.
   
Changes since v4: 
 * None

Changes since v5:
 * None

 net/checksum.c | 35 +++
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/net/checksum.c b/net/checksum.c
index f62b18a..62d3465 100644
--- a/net/checksum.c
+++ b/net/checksum.c
@@ -55,7 +55,7 @@ uint16_t net_checksum_tcpudp(uint16_t length, uint16_t proto,
 
 void net_checksum_calculate(uint8_t *data, int length)
 {
-int ip_len;
+int mac_hdr_len, ip_len;
 struct ip_header *ip;
 
 /*
@@ -64,12 +64,39 @@ void net_checksum_calculate(uint8_t *data, int length)
  * struct members (just in case).
  */
 
-/* Ensure data has complete L2 & L3 headers. */
-if (length < (sizeof(struct eth_header) + sizeof(struct ip_header))) {
+/* Ensure we have at least an Eth header */
+if (length < sizeof(struct eth_header)) {
 return;
 }
 
-ip = (struct ip_header *)(data + sizeof(struct eth_header));
+/* Handle the optional VLAN headers */
+switch (lduw_be_p(_GET_ETH_HDR(data)->h_proto)) {
+case ETH_P_VLAN:
+mac_hdr_len = sizeof(struct eth_header) +
+ sizeof(struct vlan_header);
+break;
+case ETH_P_DVLAN:
+if (lduw_be_p(_GET_VLAN_HDR(data)->h_proto) == ETH_P_VLAN) {
+mac_hdr_len = sizeof(struct eth_header) +
+ 2 * sizeof(struct vlan_header);
+} else {
+mac_hdr_len = sizeof(struct eth_header) +
+ sizeof(struct vlan_header);
+}
+break;
+default:
+mac_hdr_len = sizeof(struct eth_header);
+break;
+}
+
+length -= mac_hdr_len;
+
+/* Now check we have an IP header (with an optional VLAN header) */
+if (length < sizeof(struct ip_header)) {
+return;
+}
+
+ip = (struct ip_header *)(data + mac_hdr_len);
 
 if (IP_HEADER_VERSION(ip) != IP_HEADER_VERSION_4) {
 return; /* not IPv4 */
-- 
2.7.4




[Qemu-devel] [PATCH v6 04/10] i.MX: Fix FEC code for MDIO address selection

2016-05-30 Thread Jean-Christophe Dubois
According to the FEC chapter of the i.MX25 reference manual, when writing
to the MMFR register, the MDIO device and address are selected by
bits 27 to 23 and bits 22 to 18 respectively. This is a total of 10 bits
that need to be used by the PHY chip/address decoding function.

This patch fixes the number of bits used from 9 to 10.
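
As a rough illustration of why 9 bits are not enough: the combined field spans bits 27 to 18, so a 9-bit extraction drops the top bit of the device number. extract_bits() mirrors the (value, start, length) contract of QEMU's extract32(); the mmfr_* helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return 'length' bits of 'value', starting at bit 'start'. */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* MDIO device number: bits 27..23. */
static uint32_t mmfr_device(uint32_t mmfr)
{
    return extract_bits(mmfr, 23, 5);
}

/* Register address within the device: bits 22..18. */
static uint32_t mmfr_address(uint32_t mmfr)
{
    return extract_bits(mmfr, 18, 5);
}

/* Combined 10-bit value handed to the PHY decoding function. */
static uint32_t mmfr_combined(uint32_t mmfr)
{
    return extract_bits(mmfr, 18, 10);
}
```

For device 17 and address 2 the combined value is (17 << 5) | 2 = 546, whereas a 9-bit extraction yields 34, which would decode as device 1.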

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1

Changes since v2:
 * Not present on v2
 
Changes since v3:
 * Not present on v3

Changes since v4: 
 * None
 
Changes since v5:
 * None

 hw/net/imx_fec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index fce3661..bf68ce6 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -460,9 +460,9 @@ static void imx_fec_write(void *opaque, hwaddr addr,
 /* store the value */
 s->mmfr = value;
 if (extract32(value, 29, 1)) {
-s->mmfr = do_phy_read(s, extract32(value, 18, 9));
+s->mmfr = do_phy_read(s, extract32(value, 18, 10));
 } else {
-do_phy_write(s, extract32(value, 18, 9), extract32(value, 0, 16));
+do_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
 }
 /* raise the interrupt as the PHY operation is done */
 s->eir |= FEC_INT_MII;
-- 
2.7.4




[Qemu-devel] [PATCH v6 01/10] net: improve UDP/TCP checksum computation.

2016-05-30 Thread Jean-Christophe Dubois
 * based on Eth, UDP, TCP struct present in eth.h instead of hardcoded
   indexes and sizes.
 * based on various macros present in eth.h.

Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * None

Changes since v2:
 * The patch was split in 2 parts: 
   - a rewrite of the TCP/UDP checksum function (this patch)
   - the addition of the support of the VLAN header (next patch).
  
Changes since v3:
 * None 
  
Changes since v4:
 * None

Changes since v5:
 * None

 net/checksum.c | 94 +-
 1 file changed, 67 insertions(+), 27 deletions(-)

diff --git a/net/checksum.c b/net/checksum.c
index d0fa424..f62b18a 100644
--- a/net/checksum.c
+++ b/net/checksum.c
@@ -18,9 +18,7 @@
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "net/checksum.h"
-
-#define PROTO_TCP  6
-#define PROTO_UDP 17
+#include "net/eth.h"
 
 uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq)
 {
@@ -57,40 +55,82 @@ uint16_t net_checksum_tcpudp(uint16_t length, uint16_t 
proto,
 
 void net_checksum_calculate(uint8_t *data, int length)
 {
-int hlen, plen, proto, csum_offset;
-uint16_t csum;
+int ip_len;
+struct ip_header *ip;
+
+/*
+ * Note: We cannot assume "data" is aligned, so all the code uses
+ * some macros that take care of possible unaligned access for
+ * struct members (just in case).
+ */
 
 /* Ensure data has complete L2 & L3 headers. */
-if (length < 14 + 20) {
+if (length < (sizeof(struct eth_header) + sizeof(struct ip_header))) {
 return;
 }
 
-if ((data[14] & 0xf0) != 0x40)
-   return; /* not IPv4 */
-hlen  = (data[14] & 0x0f) * 4;
-plen  = (data[16] << 8 | data[17]) - hlen;
-proto = data[23];
-
-switch (proto) {
-case PROTO_TCP:
-   csum_offset = 16;
-   break;
-case PROTO_UDP:
-   csum_offset = 6;
-   break;
-default:
-   return;
+ip = (struct ip_header *)(data + sizeof(struct eth_header));
+
+if (IP_HEADER_VERSION(ip) != IP_HEADER_VERSION_4) {
+return; /* not IPv4 */
 }
 
-if (plen < csum_offset + 2 || 14 + hlen + plen > length) {
+ip_len = lduw_be_p(&ip->ip_len);
+
+/* Last, check that we have enough data for the whole IP frame */
+if (length < ip_len) {
 return;
 }
 
-data[14+hlen+csum_offset]   = 0;
-data[14+hlen+csum_offset+1] = 0;
-csum = net_checksum_tcpudp(plen, proto, data+14+12, data+14+hlen);
-data[14+hlen+csum_offset]   = csum >> 8;
-data[14+hlen+csum_offset+1] = csum & 0xff;
+ip_len -= IP_HDR_GET_LEN(ip);
+
+switch (ip->ip_p) {
+case IP_PROTO_TCP:
+{
+uint16_t csum;
+tcp_header *tcp = (tcp_header *)(ip + 1);
+
+if (ip_len < sizeof(tcp_header)) {
+return;
+}
+
+/* Set csum to 0 */
+stw_he_p(&tcp->th_sum, 0);
+
+csum = net_checksum_tcpudp(ip_len, ip->ip_p,
+   (uint8_t *)&ip->ip_src,
+   (uint8_t *)tcp);
+
+/* Store computed csum */
+stw_be_p(&tcp->th_sum, csum);
+
+break;
+}
+case IP_PROTO_UDP:
+{
+uint16_t csum;
+udp_header *udp = (udp_header *)(ip + 1);
+
+if (ip_len < sizeof(udp_header)) {
+return;
+}
+
+/* Set csum to 0 */
+stw_he_p(&udp->uh_sum, 0);
+
+csum = net_checksum_tcpudp(ip_len, ip->ip_p,
+   (uint8_t *)&ip->ip_src,
+   (uint8_t *)udp);
+
+/* Store computed csum */
+stw_be_p(&udp->uh_sum, csum);
+
+break;
+}
+default:
+/* Can't handle any other protocol */
+break;
+}
 }
 
 uint32_t
-- 
2.7.4
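
Editor's note on the patch above: the code comment says "data" cannot be
assumed to be aligned, which is why the new code goes through load/store
helpers instead of dereferencing 16-bit pointers. The following is a
minimal sketch of that idea combined with an RFC 1071 style
one's-complement checksum. The names load_be16(), store_be16() and
inet_checksum() are hypothetical and are not the QEMU helpers; the sketch
also leaves out the TCP/UDP pseudo-header that net_checksum_tcpudp() takes
into account.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value byte by byte; no alignment required. */
static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Write a big-endian 16-bit value byte by byte; no alignment required. */
static void store_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/*
 * One's-complement sum of big-endian 16-bit words, folded to 16 bits and
 * inverted. An odd trailing byte is treated as the high byte of a final
 * word. The 32-bit accumulator is enough for packet-sized buffers.
 */
static uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += load_be16(buf + i);
    }
    if (i < len) {
        sum += (uint32_t)buf[i] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

As in the patch, a caller would zero the checksum field, compute the sum
over the segment, and then store the result in network byte order.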




[Qemu-devel] [PATCH v6 10/10] Add ENET device to i.MX6 SOC.

2016-05-30 Thread Jean-Christophe Dubois
This adds the ENET device to the i.MX6 SOC.

This was tested by booting Linux on a Qemu i.MX6 instance and accessing
the internet from the Linux guest.

Reviewed-by: Peter Maydell 
Signed-off-by: Jean-Christophe Dubois 
---

Changes since v1:
 * Not present on v1
   
Changes since v2:
 * None
 
Changes since v3:
 * switch the 2 Eth interrupt numbers.
 
Changes since v4:
 * None

Changes since v5:
 * Fix patch because of merge conflicts.

 hw/arm/fsl-imx6.c | 17 +
 include/hw/arm/fsl-imx6.h |  6 --
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index a5331bf..0c00e7a 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -105,6 +105,10 @@ static void fsl_imx6_init(Object *obj)
 snprintf(name, NAME_SIZE, "spi%d", i + 1);
 object_property_add_child(obj, name, OBJECT(&s->spi[i]), NULL);
 }
+
+object_initialize(&s->eth, sizeof(s->eth), TYPE_IMX_ENET);
+qdev_set_parent_bus(DEVICE(&s->eth), sysbus_get_default());
+object_property_add_child(obj, "eth", OBJECT(&s->eth), NULL);
 }
 
 static void fsl_imx6_realize(DeviceState *dev, Error **errp)
@@ -381,6 +385,19 @@ static void fsl_imx6_realize(DeviceState *dev, Error 
**errp)
 spi_table[i].irq));
 }
 
+object_property_set_bool(OBJECT(&s->eth), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->eth), 0, FSL_IMX6_ENET_ADDR);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->eth), 0,
+   qdev_get_gpio_in(DEVICE(&s->a9mpcore),
+FSL_IMX6_ENET_MAC_IRQ));
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->eth), 1,
+   qdev_get_gpio_in(DEVICE(&s->a9mpcore),
+FSL_IMX6_ENET_MAC_1588_IRQ));
+
 /* ROM memory */
 memory_region_init_rom_device(&s->rom, NULL, NULL, NULL, "imx6.rom",
   FSL_IMX6_ROM_SIZE, &err);
diff --git a/include/hw/arm/fsl-imx6.h b/include/hw/arm/fsl-imx6.h
index e9157ea..ec6c509 100644
--- a/include/hw/arm/fsl-imx6.h
+++ b/include/hw/arm/fsl-imx6.h
@@ -28,6 +28,7 @@
 #include "hw/gpio/imx_gpio.h"
 #include "hw/sd/sdhci.h"
 #include "hw/ssi/imx_spi.h"
+#include "hw/net/imx_fec.h"
 #include "exec/memory.h"
 #include "cpu.h"
 
@@ -58,6 +59,7 @@ typedef struct FslIMX6State {
 IMXGPIOState   gpio[FSL_IMX6_NUM_GPIOS];
 SDHCIState esdhc[FSL_IMX6_NUM_ESDHCS];
 IMXSPIStatespi[FSL_IMX6_NUM_ECSPIS];
+IMXFECStateeth;
 MemoryRegion   rom;
 MemoryRegion   caam;
 MemoryRegion   ocram;
@@ -436,8 +438,8 @@ typedef struct FslIMX6State {
 #define FSL_IMX6_HDMI_MASTER_IRQ 115
 #define FSL_IMX6_HDMI_CEC_IRQ 116
 #define FSL_IMX6_MLB150_LOW_IRQ 117
-#define FSL_IMX6_ENET_MAC_IRQ 118
-#define FSL_IMX6_ENET_MAC_1588_IRQ 119
+#define FSL_IMX6_ENET_MAC_1588_IRQ 118
+#define FSL_IMX6_ENET_MAC_IRQ 119
 #define FSL_IMX6_PCIE1_IRQ 120
 #define FSL_IMX6_PCIE2_IRQ 121
 #define FSL_IMX6_PCIE3_IRQ 122
-- 
2.7.4




[Qemu-devel] [PATCH v6 00/10] Add Ethernet device for i.MX6 SOC

2016-05-30 Thread Jean-Christophe Dubois
This patch series adds the Gb ENET Ethernet device to the i.MX6 SOC.

The ENET device is an evolution of the FEC device present on the i.MX25 SOC
and is backward compatible with it.

Therefore the ENET support has been added to the existing Qemu FEC device
(rather than adding a new device).

The patch series has been tested by:
 * Booting Linux on i.MX25 PDK board emulation and accessing the internet
 * Booting Linux on i.MX6 Sabrelite board emulation and accessing the internet

Jean-Christophe Dubois (10):
  net: improve UDP/TCP checksum computation.
  net: handle optional VLAN header in checksum computation.
  i.MX: Fix FEC code for MDIO operation selection
  i.MX: Fix FEC code for MDIO address selection
  i.MX: Fix FEC code for ECR register reset value.
  i.MX: reset TX/RX descriptors when FEC is disabled.
  i.MX: Rename i.MX FEC defines to ENET_XXX
  i.MX: move FEC device to a register array structure.
  Add ENET/Gbps Ethernet support to FEC device
  Add ENET device to i.MX6 SOC.

 hw/arm/fsl-imx25.c|1 +
 hw/arm/fsl-imx6.c |   17 +
 hw/net/imx_fec.c  | 1009 ++---
 include/hw/arm/fsl-imx6.h |6 +-
 include/hw/net/imx_fec.h  |  250 ---
 net/checksum.c|  121 --
 6 files changed, 1077 insertions(+), 327 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PATCH v2 06/12] tcg/mips: Add support for fence

2016-05-30 Thread Aurelien Jarno
On 2016-05-26 18:00, Richard Henderson wrote:
> Signed-off-by: Richard Henderson 
> ---
>  tcg/mips/tcg-target.inc.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
> index 50e98ea..cad1d4d 100644
> --- a/tcg/mips/tcg-target.inc.c
> +++ b/tcg/mips/tcg-target.inc.c
> @@ -292,6 +292,7 @@ typedef enum {
>  OPC_JALR = OPC_SPECIAL | 0x09,
>  OPC_MOVZ = OPC_SPECIAL | 0x0A,
>  OPC_MOVN = OPC_SPECIAL | 0x0B,
> +OPC_SYNC = OPC_SPECIAL | 0x0F,
>  OPC_MFHI = OPC_SPECIAL | 0x10,
>  OPC_MFLO = OPC_SPECIAL | 0x12,
>  OPC_MULT = OPC_SPECIAL | 0x18,
> @@ -1636,6 +1637,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
> opc,
>  const_args[4], const_args[5], true);
>  break;
>  
> +case INDEX_op_fence:
> +tcg_out32(s, OPC_SYNC);
> +break;
>  case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
>  case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi.  */
>  case INDEX_op_call: /* Always emitted via tcg_out_call.  */
> @@ -1716,6 +1720,8 @@ static const TCGTargetOpDef mips_op_defs[] = {
>  { INDEX_op_qemu_ld_i64, { "L", "L", "lZ", "lZ" } },
>  { INDEX_op_qemu_st_i64, { "SZ", "SZ", "SZ", "SZ" } },
>  #endif
> +
> +{ INDEX_op_fence, { } },
>  { -1 },
>  };

Reviewed-by: Aurelien Jarno 

Also compile-tested, but we don't really have a way to test that so
far.


-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH] fix xen hvm direct kernel boot

2016-05-30 Thread Stefano Stabellini
On Fri, 29 Apr 2016, Chunyan Liu wrote:
> Since commit a1666142: acpi-build: make ROMs RAM blocks resizeable,
> xen HVM direct kernel boot failed. Xen HVM direct kernel boot will
> insert a linuxboot.bin or multiboot.bin to /genroms, before this
> commit, in acpi_setup, for rom linuxboot.bin/multiboot.bin, it
> only needs 0x2 size; after the commit, it will reserve x16
> size for resize, that is 0x20 size. It causes xen_ram_alloc
> failed due to running out of memory.
> 
> To resolve it, either:
> 1. keep using original rom size instead of max size, don't reserve x16 size.
> 2. guest maxmem needs to be increased. (commit c1d322e6 "xen-hvm: increase
>maxmem before calling xc_domain_populate_physmap" solved the problem for
>a time, by accident. But then it is reverted in commit bb369 due to
>other problem.)
> 
> For 2, more discussion is needed about howto. So this patch tries 1, to
> use unresizable rom size in xen case in rom_set_mr.
> 
> Signed-off-by: Chunyan Liu 

Thank you for the patch!


>  hw/core/loader.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index c049957..5150101 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -55,6 +55,7 @@
>  #include "exec/address-spaces.h"
>  #include "hw/boards.h"
>  #include "qemu/cutils.h"
> +#include "hw/xen/xen.h"
>  
>  #include 
>  
> @@ -818,7 +819,10 @@ static void *rom_set_mr(Rom *rom, Object *owner, const 
> char *name)
>  void *data;
>  
>  rom->mr = g_malloc(sizeof(*rom->mr));
> -memory_region_init_resizeable_ram(rom->mr, owner, name,
> +if (xen_enabled())
> +memory_region_init_ram(rom->mr, owner, name, rom->datasize, 
> &error_fatal);
> +else
> +memory_region_init_resizeable_ram(rom->mr, owner, name,
>rom->datasize, rom->romsize,
>fw_cfg_resized,
>&error_fatal);

Wouldn't it be better to change ram_block_add so that it calls
xen_ram_alloc with used_length rather than max_length?

I think that on Xen we want to only allocate used_length bytes, but
reserve max_length of address space.



Re: [Qemu-devel] [PATCH] virtio-gpu: fix scanout rectangles

2016-05-30 Thread Marc-André Lureau
Hi

On Mon, May 30, 2016 at 10:40 AM, Gerd Hoffmann  wrote:
> Commit "ca58b45 ui/virtio-gpu: add and use qemu_create_displaysurface_pixman"
> breaks scanouts which use a region of the underlying resource only.
>
> So, we need another way to handle the underlying issue.  Lets create a
> new pixman image, grab a reference on the pixman providing the
> underlying storage, hook up a destroy callback which releases the
> reference.  That way regions work again and releasing the backing
> storage should still be impossible thanks to the extra reference we are
> holding.
>
> Signed-off-by: Gerd Hoffmann 

I proposed some similar changes in virgl mailing list, so:

Reviewed-by: Marc-André Lureau 


> ---
>  hw/display/virtio-gpu.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 2116106..4746edc 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -497,6 +497,11 @@ static void virtio_gpu_resource_flush(VirtIOGPU *g,
>  pixman_region_fini(&flush_region);
>  }
>
> +static void virtio_unref_resource(pixman_image_t *image, void *data)
> +{
> +pixman_image_unref(data);
> +}
> +
>  static void virtio_gpu_set_scanout(VirtIOGPU *g,
> struct virtio_gpu_ctrl_command *cmd)
>  {
> @@ -576,8 +581,15 @@ static void virtio_gpu_set_scanout(VirtIOGPU *g,
>  != ((uint8_t *)pixman_image_get_data(res->image) + offset) ||
>  scanout->width != ss.r.width ||
>  scanout->height != ss.r.height) {
> +pixman_image_t *rect;
> +void *ptr = (uint8_t *)pixman_image_get_data(res->image) + offset;
> +rect = pixman_image_create_bits(format, ss.r.width, ss.r.height, ptr,
> +pixman_image_get_stride(res->image));
> +pixman_image_ref(res->image);
> +pixman_image_set_destroy_function(rect, virtio_unref_resource,
> +  res->image);
>  /* realloc the surface ptr */
> -scanout->ds = qemu_create_displaysurface_pixman(res->image);
> +scanout->ds = qemu_create_displaysurface_pixman(rect);
>  if (!scanout->ds) {
>  cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
>  return;
> --
> 1.8.3.1
>
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [Xen-devel] [PATCH] xen: Clean up includes

2016-05-30 Thread Stefano Stabellini
On Tue, 24 May 2016, Peter Maydell wrote:
> Clean up includes so that osdep.h is included first and headers
> which it implies are not included manually.
> 
> This commit was created with scripts/clean-includes.
> 
> Signed-off-by: Peter Maydell 

Reviewed-by: Stefano Stabellini 

Added to my queue


>  hw/usb/xen-usb.c | 5 +
>  include/hw/xen/xen.h | 1 -
>  2 files changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
> index 664df04..8fa47ed 100644
> --- a/hw/usb/xen-usb.c
> +++ b/hw/usb/xen-usb.c
> @@ -19,13 +19,10 @@
>   *  GNU GPL, version 2 or (at your option) any later version.
>   */
>  
> +#include "qemu/osdep.h"
>  #include 
> -#include 
> -#include 
>  #include 
> -#include 
>  
> -#include "qemu/osdep.h"
>  #include "qemu-common.h"
>  #include "qemu/config-file.h"
>  #include "hw/sysbus.h"
> diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
> index 6365483..b2cd992 100644
> --- a/include/hw/xen/xen.h
> +++ b/include/hw/xen/xen.h
> @@ -8,7 +8,6 @@
>   */
>  
>  #include "qemu-common.h"
> -#include "qemu/typedefs.h"
>  #include "exec/cpu-common.h"
>  #include "hw/irq.h"
>  
> -- 
> 1.9.1
> 
> 
> ___
> Xen-devel mailing list
> xen-de...@lists.xen.org
> http://lists.xen.org/xen-devel
> 



Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 06:22:35PM +0300, Dmitry Fleytman wrote:
> 
> > On 30 May 2016, at 18:19 PM, Michael S. Tsirkin  wrote:
> > 
> > On Mon, May 30, 2016 at 06:14:56PM +0300, Dmitry Fleytman wrote:
> >>Does DSN generation function pass unaligned offsets?
> >>It does not look like it does…
> >> 
> >> 
> >> It does according to clang sanitiser.
> > 
> > 
> > Oh so it's a clang false positive?
> 
> I think not.
> The capability itself is 8-bytes aligned but 64-bit serial number inside of 
> it is not because of 32 bit header in front of it.

Oh right. Things like this should really go into the commit log
in the future.
For now, a code comment in pci_set/get_quad explaining that
alignment in capabilities is generally at a dword, not qword,
boundary would be enough.

> > 
> > -- 
> > MST
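
Editor's note on the exchange above: the capability starts on an 8-byte
boundary, but a 32-bit header precedes the 64-bit serial number, so the
serial number sits at offset +4 and is only dword-aligned. The following
is a minimal sketch of 64-bit little-endian accessors that work at any
byte offset. load_le64() and store_le64() are hypothetical names for this
note and are not the implementation of stq_le_p()/ldq_le_p().

```c
#include <stdint.h>

/* Store a 64-bit value in little-endian order, one byte at a time. */
static void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Load a 64-bit little-endian value, one byte at a time. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}
```

Because no uint64_t pointer is ever dereferenced, an access at
"capability offset + 4" does not depend on the buffer being
qword-aligned.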



Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Dmitry Fleytman

> On 30 May 2016, at 18:19 PM, Michael S. Tsirkin  wrote:
> 
> On Mon, May 30, 2016 at 06:14:56PM +0300, Dmitry Fleytman wrote:
>>Does DSN generation function pass unaligned offsets?
>>It does not look like it does…
>> 
>> 
>> It does according to clang sanitiser.
> 
> 
> Oh so it's a clang false positive?

I think not.
The capability itself is 8-bytes aligned but 64-bit serial number inside of it 
is not because of 32 bit header in front of it.

> 
> -- 
> MST




[Qemu-devel] [PATCH] fixup! exec: Do vmstate unregistration from cpu_exec_exit()

2016-05-30 Thread Igor Mammedov
With all the header changes merged in current master (d6550e9ed2),
the above patch breaks compilation with:
  exec.c: In function ‘cpu_exec_exit’:
  exec.c:661:9: error: implicit declaration of function ‘vmstate_unregister’ 
[-Werror=implicit-function-declaration]
 vmstate_unregister(NULL, cc->vmsd, cpu);

Please squash this fixup into the original patch.

Signed-off-by: Igor Mammedov 
---
 exec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/exec.c b/exec.c
index c5dd58e..a0327a9 100644
--- a/exec.c
+++ b/exec.c
@@ -62,6 +62,8 @@
 #include "qemu/mmap-alloc.h"
 #endif
 
+#include "migration/vmstate.h"
+
 //#define DEBUG_SUBPAGE
 
 #if !defined(CONFIG_USER_ONLY)
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 06:14:56PM +0300, Dmitry Fleytman wrote:
> Does DSN generation function pass unaligned offsets?
> It does not look like it does…
> 
> 
> It does according to clang sanitiser.


Oh so it's a clang false positive?

-- 
MST



Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Dmitry Fleytman

> On 30 May 2016, at 18:11 PM, Michael S. Tsirkin  wrote:
> 
> On Mon, May 30, 2016 at 06:05:57PM +0300, Dmitry Fleytman wrote:
>> 
>>> On 30 May 2016, at 17:47 PM, Michael S. Tsirkin  wrote:
>>> 
>>> On Mon, May 30, 2016 at 12:14:26PM +0300, Leonid Bloch wrote:
>>>> From: Dmitry Fleytman 
>>>> 
>>>> Replace legacy cpu_to_le64w()/le64_to_cpup()
>>>> calls with stq_le_p()/ldq_le_p().
>>>> 
>>>> Signed-off-by: Dmitry Fleytman 
>>>> Signed-off-by: Leonid Bloch 
>>> 
>> 
>> Hi Michael,
>> 
>>> Could you please add a code comment to clarify what's going on a bit more?
>>> Something to the point that capabilities are guaranteed to
>>> be dword-aligned only.
>>> 
>> 
>> Just to clarify, do you want to add these comments to 
>> pci_set/get_quad functions or to the new DSN-generation function?
> 
> pci_set/get_quad
> 
>>> Also, this isn't a dependency of this patchset I think -
>>> as far as I could say the only user of this is
>>> pcie: Introduce function for DSN capability creation
>>> but that merely accesses a capability, and all callers pass in
>>> an aligned offset.
>> 
>> Right, this issue appeared after introduction of DSN generation function.
> 
> Does DSN generation function pass unaligned offsets?
> It does not look like it does…

It does according to clang sanitiser.

> 
>> All other callers pass aligned offsets so far.
>> 
>> Thanks,
>> Dmitry
>> 
>>> 
>>>> ---
>>>> include/hw/pci/pci.h | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index ef6ba51..ee238ad 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -468,13 +468,13 @@ pci_get_long(const uint8_t *config)
>>>> static inline void
>>>> pci_set_quad(uint8_t *config, uint64_t val)
>>>> {
>>>> -cpu_to_le64w((uint64_t *)config, val);
>>>> +stq_le_p(config, val);
>>>> }
>>>> 
>>>> static inline uint64_t
>>>> pci_get_quad(const uint8_t *config)
>>>> {
>>>> -return le64_to_cpup((const uint64_t *)config);
>>>> +return ldq_le_p(config);
>>>> }
>>>> 
>>>> static inline void
>>>> -- 
>>>> 2.5.5



Re: [Qemu-devel] [PATCH v4 1/1] Introduce "xen-load-devices-state"

2016-05-30 Thread Stefano Stabellini
On Fri, 27 May 2016, Anthony PERARD wrote:
> On Mon, Apr 11, 2016 at 11:56:02AM +0800, Changlong Xie wrote:
> > From: Wen Congyang 
> > 
> > Introduce a "xen-load-devices-state" QAPI command that can be used to
> > load the state of all devices, but not the RAM or the block devices of
> > the VM.
> > 
> > We only have hmp commands savevm/loadvm, and qmp commands
> > xen-save-devices-state.
> > 
> > We use this new command for COLO:
> > 1. suspend both primary vm and secondary vm
> > 2. sync the state
> > 3. resume both primary vm and secondary vm
> > 
> > In such case, we need to update all devices' state in any time.
> > 
> > Signed-off-by: Wen Congyang 
> > Signed-off-by: Changlong Xie 
> 
> This patch looks good to me.
> 
> Reviewed-by: Anthony PERARD 

It would be nicer (and less problematic) to load the state from a file
descriptor, but given that we are still saving the state to a file, it
would be unfair to ask to use file descriptors here.

Acked-by: Stefano Stabellini 

Given that this is migration code, it still needs an ack from Juan or
Amit.



[Qemu-devel] [PULL 00/30] Misc changes for 2016-05-27

2016-05-30 Thread Paolo Bonzini
The following changes since commit d6550e9ed2e1a60d889dfb721de00d9a4e3bafbe:

  Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160527' 
into staging (2016-05-27 14:05:48 +0100)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 0878d0e11ba8013dd759c6921cbf05ba6a41bd71:

  exec: hide mr->ram_addr from qemu_get_ram_ptr users (2016-05-29 09:11:12 
+0200)

Dropped the jinxed DMA change once more, and seriously thinking of
rewriting the whole thing in assembly language...


* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
  reservations (Paolo)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)


Emilio G. Cota (3):
  docs/atomics: update atomic_read/set comparison with Linux
  atomics: emit an smp_read_barrier_depends() barrier only for Alpha and 
Thread Sanitizer
  atomics: do not emit consume barrier for atomic_rcu_read

Eric Blake (1):
  nbd: Don't trim unrequested bytes

Fam Zheng (1):
  scsi-generic: Merge block max xfer len in INQUIRY response

Paolo Bonzini (13):
  Revert "memory: Drop FlatRange.romd_mode"
  kvm_stat: Remove
  bt: rewrite csrhci_write to avoid out-of-bounds writes
  docs/atomics: update comparison with Linux
  scsi-disk: introduce a common base class
  scsi-disk: introduce dma_readv and dma_writev
  scsi-disk: add need_fua_emulation to SCSIDiskClass
  scsi-disk: introduce scsi_disk_req_check_error
  scsi-block: always use SG_IO
  memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
  exec: remove ram_addr argument from qemu_ram_block_from_host
  memory: split memory_region_from_host from qemu_ram_addr_from_host
  exec: hide mr->ram_addr from qemu_get_ram_ptr users

Paul Durrant (1):
  xen-hvm: ignore background I/O sections

Peter Lieven (1):
  block/iscsi: avoid potential overflow of acb->task->cdb

Prasad J Pandit (5):
  scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)
  scsi: mptsas: infinite loop while fetching requests
  scsi: megasas: use appropriate property buffer size
  scsi: megasas: initialise local configuration data buffer
  scsi: megasas: check 'read_queue_head' index value

xiaoqiang zhao (5):
  hw/char: QOM'ify escc.c
  hw/char: QOM'ify etraxfs_ser.c
  hw/char: QOM'ify lm32_juart.c
  hw/char: QOM'ify lm32_uart.c
  hw/char: QOM'ify milkymist-uart.c

 Makefile |   9 -
 block/iscsi.c|   7 +
 cputlb.c |   3 +-
 docs/atomics.txt |  38 +-
 exec.c   | 110 ++
 hw/bt/hci-csr.c  |  67 ++--
 hw/char/escc.c   |  30 +-
 hw/char/etraxfs_ser.c|  27 +-
 hw/char/lm32_juart.c |  17 +-
 hw/char/lm32_uart.c  |  28 +-
 hw/char/milkymist-uart.c |  10 +-
 hw/cris/axis_dev88.c |   4 +-
 hw/lm32/lm32.h   |  19 +-
 hw/lm32/lm32_boards.c|   9 +-
 hw/lm32/milkymist-hw.h   |   4 +-
 hw/lm32/milkymist.c  |   4 +-
 hw/misc/ivshmem.c|   5 +-
 hw/scsi/megasas.c|   6 +-
 hw/scsi/mptsas.c |   9 +-
 hw/scsi/scsi-disk.c  | 415 --
 hw/scsi/scsi-generic.c   |  12 +
 hw/scsi/vmw_pvscsi.c |  24 +-
 hw/virtio/vhost-user.c   |  25 +-
 include/exec/cpu-common.h|   4 +-
 include/exec/memory.h|  36 +-
 include/exec/ram_addr.h  |   3 -
 include/hw/cris/etraxfs.h|  16 +
 include/qemu/atomic.h|  25 +-
 memory.c |  43 ++-
 migration/postcopy-ram.c |   3 +-
 nbd/server.c |  20 +-
 scripts/dump-guest-memory.py |  19 +-
 scripts/kvm/kvm_stat | 825 ---
 scripts/kvm/kvm_stat.texi|  55 ---
 target-i386/kvm.c|   6 +-
 xen-hvm.c|  14 +-
 36 files changed, 709 insertions(+), 1242 deletions(-)
 delete mode 100755 scripts/kvm/kvm_stat
 delete mode 100644 scripts/kvm/kvm_stat.texi
-- 
2.5.5



Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 06:05:57PM +0300, Dmitry Fleytman wrote:
> 
> > On 30 May 2016, at 17:47 PM, Michael S. Tsirkin  wrote:
> > 
> > On Mon, May 30, 2016 at 12:14:26PM +0300, Leonid Bloch wrote:
> >> From: Dmitry Fleytman 
> >> 
> >> Replace legacy cpu_to_le64w()/le64_to_cpup()
> >> calls with stq_le_p()/ldq_le_p().
> >> 
> >> Signed-off-by: Dmitry Fleytman 
> >> Signed-off-by: Leonid Bloch 
> > 
> 
> Hi Michael,
> 
> > Could you please add a code comment to clarify what's going on a bit more?
> > Something to the point that capabilities are guaranteed to
> > be dword-aligned only.
> > 
> 
> Just to clarify, do you want to add these comments to 
> pci_set/get_quad functions or to the new DSN-generation function?

pci_set/get_quad

> > Also, this isn't a dependency of this patchset I think -
> > as far as I could say the only user of this is
> > pcie: Introduce function for DSN capability creation
> > but that merely accesses a capability, and all callers pass in
> > an aligned offset.
> 
> Right, this issue appeared after introduction of DSN generation function.

Does DSN generation function pass unaligned offsets?
It does not look like it does...

> All other callers pass aligned offsets so far.
> 
> Thanks,
> Dmitry
> 
> > 
> >> ---
> >> include/hw/pci/pci.h | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> >> index ef6ba51..ee238ad 100644
> >> --- a/include/hw/pci/pci.h
> >> +++ b/include/hw/pci/pci.h
> >> @@ -468,13 +468,13 @@ pci_get_long(const uint8_t *config)
> >> static inline void
> >> pci_set_quad(uint8_t *config, uint64_t val)
> >> {
> >> -cpu_to_le64w((uint64_t *)config, val);
> >> +stq_le_p(config, val);
> >> }
> >> 
> >> static inline uint64_t
> >> pci_get_quad(const uint8_t *config)
> >> {
> >> -return le64_to_cpup((const uint64_t *)config);
> >> +return ldq_le_p(config);
> >> }
> >> 
> >> static inline void
> >> -- 
> >> 2.5.5



Re: [Qemu-devel] [PATCH v6 16/17] net: Introduce e1000e device emulation

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 12:14:41PM +0300, Leonid Bloch wrote:
> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
> new file mode 100644
> index 000..4da6bb1
> --- /dev/null
> +++ b/hw/net/e1000e.c

Here are minor style issues that can be fixed after this is upstream.
See below.

> @@ -0,0 +1,739 @@
> +/*
> +* QEMU INTEL 82574 GbE NIC emulation
> +*
> +* Software developer's manuals:
> +* 
> http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf
> +*
> +* Copyright (c) 2015 Ravello Systems LTD (http://ravellosystems.com)
> +* Developed by Daynix Computing LTD (http://www.daynix.com)
> +*
> +* Authors:
> +* Dmitry Fleytman 
> +* Leonid Bloch 
> +* Yan Vugenfirer 
> +*
> +* Based on work done by:
> +* Nir Peleg, Tutis Systems Ltd. for Qumranet Inc.
> +* Copyright (c) 2008 Qumranet
> +* Based on work done by:
> +* Copyright (c) 2007 Dan Aloni
> +* Copyright (c) 2004 Antony T Curtis
> +*
> +* This library is free software; you can redistribute it and/or
> +* modify it under the terms of the GNU Lesser General Public
> +* License as published by the Free Software Foundation; either
> +* version 2 of the License, or (at your option) any later version.
> +*
> +* This library is distributed in the hope that it will be useful,
> +* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +* Lesser General Public License for more details.
> +*
> +* You should have received a copy of the GNU Lesser General Public
> +* License along with this library; if not, see 
> .
> +*/
> +
> +#include "qemu/osdep.h"
> +#include "net/net.h"
> +#include "net/tap.h"
> +#include "qemu/range.h"
> +#include "sysemu/sysemu.h"
> +#include "hw/pci/msi.h"
> +#include "hw/pci/msix.h"
> +
> +#include "hw/net/e1000_regs.h"
> +
> +#include "e1000x_common.h"
> +#include "e1000e_core.h"
> +
> +#include "trace.h"
> +
> +#define TYPE_E1000E "e1000e"
> +#define E1000E(obj) OBJECT_CHECK(E1000EState, (obj), TYPE_E1000E)
> +
> +typedef struct {
> +PCIDevice parent_obj;
> +NICState *nic;
> +NICConf conf;
> +
> +MemoryRegion mmio;
> +MemoryRegion flash;
> +MemoryRegion io;
> +MemoryRegion msix;
> +
> +uint32_t ioaddr;
> +
> +uint16_t subsys_ven;
> +uint16_t subsys;
> +
> +uint16_t subsys_ven_used;
> +uint16_t subsys_used;
> +
> +uint32_t intr_state;
> +bool disable_vnet;
> +
> +E1000ECore core;
> +
> +} E1000EState;

typedef struct E1000EState is preferable because older
gdb versions do not always see typedefs.

> +
> +#define E1000E_MMIO_IDX 0
> +#define E1000E_FLASH_IDX1
> +#define E1000E_IO_IDX   2
> +#define E1000E_MSIX_IDX 3
> +
> +#define E1000E_MMIO_SIZE(128 * 1024)
> +#define E1000E_FLASH_SIZE   (128 * 1024)
> +#define E1000E_IO_SIZE  (32)
> +#define E1000E_MSIX_SIZE(16 * 1024)
> +
> +#define E1000E_MSIX_TABLE   (0x)
> +#define E1000E_MSIX_PBA (0x2000)
> +
> +#define E1000E_USE_MSI BIT(0)
> +#define E1000E_USE_MSIXBIT(1)
> +
> +static uint64_t
> +e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +E1000EState *s = opaque;
> +return e1000e_core_read(&s->core, addr, size);
> +}
> +
> +static void
> +e1000e_mmio_write(void *opaque, hwaddr addr,
> +   uint64_t val, unsigned size)
> +{
> +E1000EState *s = opaque;
> +e1000e_core_write(&s->core, addr, val, size);
> +}
> +
> +static bool
> +e1000e_io_get_reg_index(E1000EState *s, uint32_t *idx)
> +{
> +if (s->ioaddr < 0x1) {
> +*idx = s->ioaddr;
> +return true;
> +}
> +
> +if (s->ioaddr < 0x7) {
> +trace_e1000e_wrn_io_addr_undefined(s->ioaddr);
> +return false;
> +}
> +
> +if (s->ioaddr < 0xF) {
> +trace_e1000e_wrn_io_addr_flash(s->ioaddr);
> +return false;
> +}
> +
> +trace_e1000e_wrn_io_addr_unknown(s->ioaddr);
> +return false;
> +}
> +
> +static uint64_t
> +e1000e_io_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +E1000EState *s = opaque;
> +uint32_t idx;
> +uint64_t val;
> +
> +switch (addr) {
> +case E1000_IOADDR:
> +trace_e1000e_io_read_addr(s->ioaddr);
> +return s->ioaddr;
> +case E1000_IODATA:
> +if (e1000e_io_get_reg_index(s, &idx)) {
> +val = e1000e_core_read(&s->core, idx, sizeof(val));
> +trace_e1000e_io_read_data(idx, val);
> +return val;
> +}
> +return 0;
> +default:
> +trace_e1000e_wrn_io_read_unknown(addr);
> +return 0;
> +}
> +}
> +
> +static void
> +e1000e_io_write(void *opaque, hwaddr addr,
> +uint64_t val, unsigned size)
> +{
> +E1000EState *s = opaque;
> +uint32_t idx;
> +
> +switch (addr) {
> +case E1000_IOADDR:
> +trace_e1000e_io_write_addr(val);
> +s->ioaddr = (uint32_t) val;
> +

Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Dmitry Fleytman

> On 30 May 2016, at 17:47 PM, Michael S. Tsirkin  wrote:
> 
> On Mon, May 30, 2016 at 12:14:26PM +0300, Leonid Bloch wrote:
>> From: Dmitry Fleytman 
>> 
>> Replace legacy cpu_to_le64w()/le64_to_cpup()
>> calls with stq_le_p()/ldq_le_p().
>> 
>> Signed-off-by: Dmitry Fleytman 
>> Signed-off-by: Leonid Bloch 
> 

Hi Michael,

> Could you please add a code comment to clarify what's going on a bit more?
> Something to the point that capabilities are guaranteed to
> be dword-aligned only.
> 

Just to clarify, do you want to add these comments to 
pci_set/get_quad functions or to the new DSN-generation function?

> Also, this isn't a dependency of this patchset I think -
> as far as I could say the only user of this is
>   pcie: Introduce function for DSN capability creation
> but that merely accesses a capability, and all callers pass in
> an aligned offset.

Right, this issue appeared after introduction of DSN generation function.
All other callers pass aligned offsets so far.

Thanks,
Dmitry

> 
>> ---
>> include/hw/pci/pci.h | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index ef6ba51..ee238ad 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -468,13 +468,13 @@ pci_get_long(const uint8_t *config)
>> static inline void
>> pci_set_quad(uint8_t *config, uint64_t val)
>> {
>> -cpu_to_le64w((uint64_t *)config, val);
>> +stq_le_p(config, val);
>> }
>> 
>> static inline uint64_t
>> pci_get_quad(const uint8_t *config)
>> {
>> -return le64_to_cpup((const uint64_t *)config);
>> +return ldq_le_p(config);
>> }
>> 
>> static inline void
>> -- 
>> 2.5.5




Re: [Qemu-devel] [PATCH v6 00/17] Introduce Intel 82574 GbE Controller Emulation (e1000e)

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 12:14:25PM +0300, Leonid Bloch wrote:
> Hello All,
> 
> This is v6 of e1000e series.
> 
> For convenience, the same patches are available at:
> https://github.com/daynix/qemu-e1000e/tree/e1000e-submit-v6
> 
> Best regards,
> Dmitry.

There are some things that can be improved further
but overall I think it's OK to merge this.

Reviewed-by: Michael S. Tsirkin 



> Changes since v5:
> 
> 1. Fixed build failure on old clang versions
> 2. Added patch that fixes unaligned access in pci_[set|get]_quad()
> 3. Rebased to the latest master
> 
> Changes since v4:
> 
> 1. Rebased to the latest master (2.6.0+)
> 
> Changes since v3:
> 
> 1. Various code fixes as suggested by Jason and Michael
> 2. Rebased to the latest master
> 
> Changes since v2:
> 
> 1. Interrupt storm on latest Linux kernels fixed
> 2. Device unit test added
> 3. Introduced code sharing between e1000 and e1000e
> 4. Various code fixes as suggested by Jason
> 5. Rebased to the latest master
> 
> Changes since v1:
> 
> 1. PCI_PM_CAP_VER_1_1 is defined now in include/hw/pci/pci_regs.h and
>    not in include/standard-headers/linux/pci_regs.h.
> 2. Changes in naming and extra comments in hw/pci/pcie.c and in
>    include/hw/pci/pcie.h.
> 3. Defining pci_dsn_ver and pci_dsn_cap static const variables in
>    hw/pci/pcie.c, instead of PCI_DSN_VER and PCI_DSN_CAP symbolic
>    constants in include/hw/pci/pcie_regs.h.
> 4. Changing the vmxnet3_device_serial_num function in hw/net/vmxnet3.c
>    to avoid the cast when it is called.
> 5. Avoiding a preceding underscore in all the e1000e-related names.
> 6. Minor style changes.
> 
> ===
> 
> Hello All,
> 
> This series is the final code of the e1000e device emulation, that we
> have developed. Please review, and consider acceptance of these patches
> to the upstream QEMU repository.
> 
> The code stability was verified by various traffic tests using Fedora 22
> Linux, and Windows Server 2012R2 guests. Also, Microsoft Hardware
> Certification Kit (HCK) tests were run on a Windows Server 2012R2 guest.
> 
> There was a discussion on the possibility of code sharing between the
> e1000e, and the existing e1000 devices. We have reviewed the final code
> for parts that may be shared between this device and the currently
> available e1000 emulation. The device specifications are very different,
> and there are almost no registers, nor functions, that were left as is
> from e1000. The ring descriptor structures were changed as well, by the
> introduction of extended and PS descriptors, as well as additional bits.
> 
> Additional differences stem from the fact that the e1000e device re-uses
> network packet abstractions introduced by the vmxnet3 device, while the
> e1000 has its own code for packet handling. BTW, it may be worth reusing
> those abstractions in e1000 as well. (Following these changes the
> vmxnet3 device was successfully tested for possible regressions.)
> 
> There are a few minor parts that may be shared, e.g. the default
> register handlers, and the ring management functions. The total amount
> of shared lines will be about 100--150, so we're not sure if it makes
> sense bothering, and taking a risk of breaking e1000, which is a good,
> old, and stable device.
> 
> Currently, the e1000e code is stand alone w.r.t. e1000.
> 
> Please share your thoughts.
> 
> Thanks in advance,
> Dmitry.
> 
> Changes since RFCv2:
> 
> 1. Device functionality verified using Microsoft Hardware Certification
> Test Kit (HCK)
> 2. Introduced a number of performance improvements
> 3. The code was cleaned, and rebased to the latest master
> 4. Patches verified with checkpatch.pl
> 
> ===
> 
> Changes since RFCv1:
> 
> 1. Added support for all the device features:
>   - Interrupt moderation.
>   - RSS.
>   - Multiqueue.
> 2. Simulated exact PCI/PCIe configuration space layout.
> 3. Made fixes needed to pass Microsoft's HW certification tests (HCK).
> 
> This series is still an RFC, because the following tasks are not done
> yet:
> 
> 1. See which code can be shared between this device and the existing
> e1000 device.
> 2. Rebase patches to the latest master (current base is v2.3.0).
> 
> Please share your thoughts,
> Thanks, Dmitry.
> 
> ===
> 
> Hello qemu-devel,
> 
> This patch series is an RFC for the new networking device emulation
> we're developing for QEMU.
> 
> This new device emulates the Intel 82574 GbE Controller and works
> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
> 
> The status of the current series is "Functional Device Ready, work
> on Extended Features in Progress".
> 
> More precisely, these patches represent a functional device, which
> is recognized by the standard Intel drivers, and is able to transfer
> TX/RX packets with CSO/TSO offloads, according to the spec.
> 
> Extended features not supported yet (work in progress):
>   1. TX/RX Interrupt moderation mechanisms
>   2. RSS
>   3. 

Re: [Qemu-devel] [PATCH v6 01/17] pci: fix unaligned access in pci_xxx_quad()

2016-05-30 Thread Michael S. Tsirkin
On Mon, May 30, 2016 at 12:14:26PM +0300, Leonid Bloch wrote:
> From: Dmitry Fleytman 
> 
> Replace legacy cpu_to_le64w()/le64_to_cpup()
> calls with stq_le_p()/ldq_le_p().
> 
> Signed-off-by: Dmitry Fleytman 
> Signed-off-by: Leonid Bloch 

Could you please add a code comment to clarify what's going on a bit more?
Something to the point that capabilities are guaranteed to
be dword-aligned only.

Also, this isn't a dependency of this patchset I think -
as far as I can tell the only user of this is
pcie: Introduce function for DSN capability creation
but that merely accesses a capability, and all callers pass in
an aligned offset.

> ---
>  include/hw/pci/pci.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index ef6ba51..ee238ad 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -468,13 +468,13 @@ pci_get_long(const uint8_t *config)
>  static inline void
>  pci_set_quad(uint8_t *config, uint64_t val)
>  {
> -cpu_to_le64w((uint64_t *)config, val);
> +stq_le_p(config, val);
>  }
>  
>  static inline uint64_t
>  pci_get_quad(const uint8_t *config)
>  {
> -return le64_to_cpup((const uint64_t *)config);
> +return ldq_le_p(config);
>  }
>  
>  static inline void
> -- 
> 2.5.5
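
For readers unfamiliar with the alignment issue discussed above: casting a
uint8_t pointer to uint64_t * and dereferencing it is only safe when the
address is 8-byte aligned, while a capability offset is only guaranteed to be
dword-aligned. The sketch below shows one alignment-independent way to do a
little-endian 64-bit store and load, one byte at a time. The names
store_le64() and load_le64() are hypothetical and used only for illustration;
this is not the implementation of stq_le_p()/ldq_le_p().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store val as 8 little-endian bytes at p. Only byte accesses are used,
 * so p may have any alignment and the host byte order does not matter. */
static inline void store_le64(uint8_t *p, uint64_t val)
{
    assert(p != NULL);
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(val >> (8 * i));
    }
}

/* Load 8 little-endian bytes from p, again with byte accesses only. */
static inline uint64_t load_le64(const uint8_t *p)
{
    uint64_t val = 0;

    assert(p != NULL);
    for (int i = 0; i < 8; i++) {
        val |= (uint64_t)p[i] << (8 * i);
    }
    return val;
}
```

With helpers of this shape, an access at an offset that is a multiple of 4 but
not of 8 (for example config + 4) needs no special handling.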



Re: [Qemu-devel] [PATCH 04/10] qcow: add qcow_co_write_compressed

2016-05-30 Thread Pavel Butsykin

On 27.05.2016 20:45, Stefan Hajnoczi wrote:

On Sat, May 14, 2016 at 03:45:52PM +0300, Denis V. Lunev wrote:

+qemu_co_mutex_lock(&s->lock);
+cluster_offset = get_cluster_offset(bs, sector_num << 9, 2, out_len, 0, 0);
+qemu_co_mutex_unlock(&s->lock);
+if (cluster_offset == 0) {
+ret = -EIO;
+goto fail;
+}
+cluster_offset &= s->cluster_offset_mask;
+
+iov = (struct iovec) {
+.iov_base   = out_buf,
+.iov_len= out_len,
+};
+qemu_iovec_init_external(&hd_qiov, &iov, 1);
+ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);


Not sure if this has the same race condition as the qcow2 patch.  It
seems that bdrv_getlength() is used to extend the file on a per-sector
basis.  That would mean compressed data is not packed inside sectors and
no read-modify-write race condition exists, but I haven't fully audited
get_cluster_offset().



Also, get_cluster_offset() does not allow multiple compressed writes
in a single cluster, because for every offset within a cluster it
returns the same cluster_offset. So here we cannot write at an offset
inside the cluster, only at the beginning of the cluster.
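
The point about all offsets in a cluster resolving to the same place can be
sketched as follows, assuming a two-level lookup table indexed by the upper
bits of the guest offset. The struct, the function names and the geometry
values used with them are illustrative assumptions, not the actual
get_cluster_offset() code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed image geometry: a cluster is 2^cluster_bits bytes and an
 * L2 table holds 2^l2_bits entries. */
struct qcow_geom {
    unsigned cluster_bits;
    unsigned l2_bits;
};

/* Index of the L1 entry covering a guest offset. */
static uint64_t l1_index(const struct qcow_geom *g, uint64_t offset)
{
    assert(g->cluster_bits + g->l2_bits < 64);
    return offset >> (g->l2_bits + g->cluster_bits);
}

/* Index of the L2 entry covering a guest offset. */
static uint64_t l2_index(const struct qcow_geom *g, uint64_t offset)
{
    uint64_t l2_size = UINT64_C(1) << g->l2_bits;

    return (offset >> g->cluster_bits) & (l2_size - 1);
}

/* Two guest offsets share one table entry, and therefore one
 * cluster_offset, exactly when both indices match. */
static int same_table_entry(const struct qcow_geom *g, uint64_t a, uint64_t b)
{
    return l1_index(g, a) == l1_index(g, b) &&
           l2_index(g, a) == l2_index(g, b);
}
```

The low cluster_bits bits of the offset never take part in the lookup, which
is why a lookup of this kind cannot tell two writes inside one cluster apart.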


Stefan





[Qemu-devel] [Bug 1587065] [NEW] btrfs qemu-ga - multiple mounts block fsfreeze

2016-05-30 Thread Dadio
Public bug reported:

Having two mounts of the same device makes fsfreeze through qemu-ga impossible.
root@CmsrvMTA:/# mount -l | grep /dev/vda2
/dev/vda2 on / type btrfs (rw,relatime,space_cache,subvolid=257,subvol=/@)
/dev/vda2 on /home type btrfs (rw,relatime,space_cache,subvolid=258,subvol=/@home)

Having two mounts is rather common with btrfs, so the fsfreeze feature
is unusable on these systems.


Below is more information about how we encountered this issue.

Message sent to qemu-disc...@nongnu.org:

Message 1:
--
I use external snapshots to back up my guests. I use the 'quiesce' option to
flush and freeze the guest file system with the QEMU guest agent.

With the exception of one guest, this procedure works fine. On the 'unwilling'
guest, I get this error message:
"ERROR 2016-05-25 00:51:19 | T25-bakVMSCmsrvVH2 | error: internal error: unable
to execute QEMU agent command 'guest-fsfreeze-freeze': failed to freeze /:
Device or resource busy"

I don't think this is some sort of time-out error, because
activation of the fsfreeze and the error message happen immediately
after each other:

$ grep qemu-ga syslog.1
May 25 00:51:19 CmsrvMTA qemu-ga: info: guest-fsfreeze called

This is the only entry of the qemu guest agent in syslog.

$ sudo virsh version
Compiled against library: libvirt 1.3.1
Using library: libvirt 1.3.1
Using API: QEMU 1.3.1
Running hypervisor: QEMU 2.5.0

$ virsh qemu-agent-command CmsrvMTA '{"execute": "guest-info"}'
{"return":{"version":"2.5.0", ... 
,{"enabled":true,"name":"guest-fstrim","success-response":true},{"enabled":true,"name":"guest-fsfreeze-thaw","success-response":true},{"enabled":true,"name":"guest-fsfreeze-status","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze-list","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze","success-response":true},
 ... }

For making an external snapshot, I use this command:
$ virsh snapshot-create-as --domain CmsrvMTA sn1 --disk-only --atomic --quiesce --no-metadata --diskspec vda,file=/srv/poolVMS/CmsrvMTA.sn1

Piece of reply 1:
-
I have encountered this before. Some operating systems may have bind-mounts
that let a device appear multiple times in the mount list. Unfortunately the
guest agent is not smart enough to treat a device that has already been frozen
as a success and keep going. This causes this specific error.

Piece of reply 2:
-
Ok, that seems to be it.

I've got '/' and '/home' on the same btrfs-formatted device, as
two separate subvolumes.
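
The failure mode described in the replies can be reproduced in a small,
self-contained simulation. In the sketch below fake_freeze() is a hypothetical
stand-in for the real freeze call: it returns -EBUSY when the device behind a
mount point is already frozen, matching the "Device or resource busy" error
above. The strict loop aborts on that error, while the tolerant loop treats it
as "already frozen through another mount" and keeps going. None of this is
qemu-ga code; it only illustrates one possible approach.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* One line of the mount list: which device is mounted where. */
struct mount_entry {
    const char *device;
    const char *mountpoint;
};

/* Simulated kernel state: names of the devices that are currently frozen. */
#define MAX_FROZEN 16
static const char *frozen[MAX_FROZEN];
static size_t n_frozen;

/* Hypothetical stand-in for the freeze call: fails with -EBUSY if the
 * device behind this mount point is already frozen. */
static int fake_freeze(const struct mount_entry *m)
{
    for (size_t i = 0; i < n_frozen; i++) {
        if (strcmp(frozen[i], m->device) == 0) {
            return -EBUSY;
        }
    }
    assert(n_frozen < MAX_FROZEN);
    frozen[n_frozen++] = m->device;
    return 0;
}

static void thaw_all(void)
{
    n_frozen = 0;
}

/* Strict loop: any error thaws everything and aborts.
 * Returns the number of freeze calls that succeeded, or a negative errno. */
static int freeze_strict(const struct mount_entry *mounts, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        int ret = fake_freeze(&mounts[i]);
        if (ret < 0) {
            thaw_all();
            return ret;
        }
        count++;
    }
    return count;
}

/* Tolerant loop: -EBUSY is taken to mean that the device was already
 * frozen through another mount point, so the entry is skipped. */
static int freeze_tolerant(const struct mount_entry *mounts, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        int ret = fake_freeze(&mounts[i]);
        if (ret == -EBUSY) {
            continue;
        }
        if (ret < 0) {
            thaw_all();
            return ret;
        }
        count++;
    }
    return count;
}
```

A real implementation would have to be more careful, since -EBUSY can have
causes other than a duplicate mount of an already frozen device.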

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1587065

Title:
  btrfs qemu-ga - multiple mounts block fsfreeze

Status in QEMU:
  New

