date:20150701

[Qemu-devel] [Bug 1470536] [NEW] qemu-img incorrectly prints "qemu-img: Host floppy pass-through is deprecated"

2015-07-01 Thread Richard Jones

Public bug reported:

qemu-img incorrectly prints this warning when you use /dev/fd/ to
pass in file descriptors.  A simple way to demonstrate this uses bash
process substitution, so the following will only work if you are using
bash as your shell:

$ qemu-img info <( cat /dev/null )
qemu-img: Host floppy pass-through is deprecated
Support for it will be removed in a future release.
qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek

The root cause is a bug in block/raw-posix.c:floppy_probe_device() where
it thinks anything starting with /dev/fd is a floppy drive, which is not
the case here:

http://git.qemu.org/?p=qemu.git;a=blob;f=block/raw-posix.c;h=cbe6574bf4da90a124436a40422dce3667da71e6;hb=HEAD#l2425
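
The distinction drawn above can be sketched as a plain prefix test. This is
only an illustration, not the code in block/raw-posix.c and not necessarily
the fix that was applied; has_prefix() and looks_like_host_floppy() are
hypothetical names. It assumes floppy nodes are named like /dev/fd0, while
/dev/fd/N names an open file descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if path begins with prefix. */
static bool has_prefix(const char *path, const char *prefix)
{
    assert(path != NULL && prefix != NULL);
    return strncmp(path, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical probe: accept "/dev/fd0"-style names, but reject
 * "/dev/fd/63", which refers to an open file descriptor.
 */
static bool looks_like_host_floppy(const char *path)
{
    return has_prefix(path, "/dev/fd") && !has_prefix(path, "/dev/fd/");
}
```

With this test, "/dev/fd0" is still treated as a floppy, while the
"/dev/fd/63" path produced by process substitution is not.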

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1470536

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1470536/+subscriptions

Re: [Qemu-devel] [PATCH 9/9] qemu/kvm: kvm hyper-v based guest crash event handling

2015-07-01 Thread Paolo Bonzini

On 30/06/2015 13:33, Denis V. Lunev wrote:
> 
> +static int kvm_arch_handle_hv_crash(CPUState *cs)
> +{
> +X86CPU *cpu = X86_CPU(cs);
> +CPUX86State *env = &cpu->env;
> +
> +/* Mark that Hyper-v guest crash occurred */
> +env->hv_crash_occurred = 1;

This need not be a hv crash.  You can add crash_occurred to CPUState
directly, and set it in qemu_system_guest_panicked:

if (current_cpu) {
current_cpu->crash_occurred = true;
}

Then you would add two subsections: one for crash_occurred in exec.c
(attached to vmstate_cpu_common), one for hyperv crash params in
target-i386/machine.c.

This also gives an idea about splitting the patch: first the
introduction of qemu_system_guest_panicked and crash_occurred, second
the Hyper-V specific bits.

> +if (cpu->hyperv_crash) {
> +c->edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
> +has_msr_hv_crash = true;

You can only set this to true if the kernel also supports the MSRs.

> +}
> +
>  c = &cpuid_data.entries[cpuid_i++];
>  c->function = HYPERV_CPUID_ENLIGHTMENT_INFO;
>  if (cpu->hyperv_relaxed_timing) {
> @@ -761,6 +767,10 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
>  } else {
>  env->mp_state = KVM_MP_STATE_RUNNABLE;
>  }
> +if (has_msr_hv_crash) {
> +env->msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY;

The value is always host-defined, so I think it doesn't need a field in
CPUX86State.  On the other hand, this:

+static bool hyperv_crash_enable_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+
+return (env->msr_hv_crash_ctl & HV_X64_MSR_CRASH_CTL_CONTENTS) ?
+true : false;
+}
+

can just check if any of the params fields is nonzero.
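
A sketch of that check, under stated assumptions: struct crash_state and its
params field are hypothetical stand-ins for whatever CPUX86State ends up
holding, and the count of five assumes the Hyper-V crash parameter registers
P0 to P4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CRASH_PARAMS 5 /* assumption: Hyper-V crash MSRs P0..P4 */

/* Hypothetical stand-in for the state that would be migrated. */
struct crash_state {
    uint64_t params[CRASH_PARAMS];
};

/* The subsection is needed only if some parameter is nonzero. */
static bool crash_params_needed(const struct crash_state *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < CRASH_PARAMS; i++) {
        if (s->params[i] != 0) {
            return true;
        }
    }
    return false;
}
```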

Thanks,

Paolo

> +env->hv_crash_occurred = 0;
> +}

Re: [Qemu-devel] [Qemu-block] [PATCH] block/mirror: limit qiov to IOV_MAX elements

2015-07-01 Thread Paolo Bonzini



On 01/07/2015 16:59, Stefan Hajnoczi wrote:
> I found it annoying to write it backwards too, but it's for consistency:
> 
>   if (s->buf_free_count < nb_chunks + added_chunks) {
>   trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
>   break;
>   }
>   if (IOV_MAX < nb_chunks + added_chunks) {
>   trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
>   break;
>   }
> 
> It's the same type of check as s->buf_free_count (which isn't modified
> by this loop either so it's a yoda conditional).

Hmm, right.  The problem goes back to:

while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
qemu_coroutine_yield();
}

where s->buf_free_count _is_ modified by the loop.  The if below:

if (s->buf_free_count < nb_chunks + added_chunks) {
trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
break;
}

is written as a < check for consistency, and the one you add exacerbates
the problem.  If you want you can change the < to > in the "while" loop
as well; otherwise the patch is okay as is.

Paolo

[Qemu-devel] [Bug 1470536] Re: qemu-img incorrectly prints "qemu-img: Host floppy pass-through is deprecated"

2015-07-01 Thread Richard Jones

I sent a patch to qemu-devel which should fix this.


[Qemu-devel] [PATCH 14/16] nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function

2015-07-01 Thread Xiao Guangrong

Function 4 is used to get the Namespace label size

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 87 ++
 1 file changed, 87 insertions(+)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index b586bf7..7e5446c 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -127,6 +127,20 @@ static uint32_t nvdimm_index_to_handle(int index)
 return index + 1;
 }
 
+static PCNVDIMMDevice
+*get_nvdimm_device_by_handle(GSList *list, uint32_t handle)
+{
+for (; list; list = list->next) {
+PCNVDIMMDevice *nvdimm = list->data;
+
+if (nvdimm_index_to_handle(nvdimm->device_index) == handle) {
+return nvdimm;
+}
+}
+
+return NULL;
+}
+
 typedef struct {
 uint8_t b[16];
 } uuid_le;
@@ -391,6 +405,17 @@ enum {
| (1 << NFIT_CMD_GET_CONFIG_DATA)\
| (1 << NFIT_CMD_SET_CONFIG_DATA))
 
+struct cmd_in_get_config_data {
+uint32_t offset;
+uint32_t length;
+} QEMU_PACKED;
+
+struct cmd_in_set_config_data {
+uint32_t offset;
+uint32_t length;
+uint8_t in_buf[0];
+} QEMU_PACKED;
+
 struct dsm_buffer {
 /* RAM page. */
 uint32_t handle;
@@ -398,6 +423,7 @@ struct dsm_buffer {
 uint32_t arg1;
 uint32_t arg2;
 union {
+struct cmd_in_set_config_data cmd_config_set;
 char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)];
 };
 
@@ -412,10 +438,23 @@ struct cmd_out_implemented {
 uint64_t cmd_list;
 };
 
+struct cmd_out_get_config_size {
+uint32_t status;
+uint32_t config_size;
+uint32_t max_xfer;
+} QEMU_PACKED;
+
+struct cmd_out_get_config_data {
+uint32_t status;
+uint8_t out_buf[0];
+} QEMU_PACKED;
+
 struct dsm_out {
 union {
 uint32_t status;
 struct cmd_out_implemented cmd_implemented;
+struct cmd_out_get_config_size cmd_config_size;
+struct cmd_out_get_config_data cmd_config_get;
 uint8_t data[PAGE_SIZE];
 };
 };
@@ -441,6 +480,51 @@ static void dsm_write_root(struct dsm_buffer *in, struct dsm_out *out)
 nvdebug("Return status %#x.\n", out->status);
 }
 
+/*
+ * the max transfer size is the max size transfered by both a
+ * NFIT_CMD_GET_CONFIG_DATA and a NFIT_CMD_SET_CONFIG_DATA
+ * command.
+ */
+static uint32_t max_xfer_config_size(void)
+{
+struct dsm_buffer *in;
+struct dsm_out *out;
+uint32_t max_get_size, max_set_size;
+
+/*
+ * the max data ACPI can read one time which is transfered by
+ * the response of NFIT_CMD_GET_CONFIG_DATA.
+ */
+max_get_size = sizeof(out->data) - sizeof(out->cmd_config_get);
+
+/*
+ * the max data ACPI can write one time which is transfered by
+ * NFIT_CMD_SET_CONFIG_DATA
+ */
+max_set_size = sizeof(in->arg3) - sizeof(in->cmd_config_set);
+return MIN(max_get_size, max_set_size);
+}
+
+static uint32_t dsm_cmd_config_size(struct dsm_buffer *in, struct dsm_out *out)
+{
+GSList *list = get_nvdimm_built_list();
+PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
+uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
+
+if (!nvdimm) {
+goto exit;
+}
+
+status = NFIT_STATUS_SUCCESS;
+out->cmd_config_size.config_size = nvdimm->config_data_size;
+out->cmd_config_size.max_xfer = max_xfer_config_size();
+nvdebug("%s config_size %#x, max_xfer %#x.\n", __func__,
+out->cmd_config_size.config_size, out->cmd_config_size.max_xfer);
+exit:
+g_slist_free(list);
+return status;
+}
+
 static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 {
 uint32_t function = in->arg2;
@@ -450,6 +534,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 case NFIT_CMD_IMPLEMENTED:
 out->cmd_implemented.cmd_list = DIMM_SUPPORT_CMD;
 return;
+case NFIT_CMD_GET_CONFIG_SIZE:
+status = dsm_cmd_config_size(in, out);
+break;
 default:
 status = NFIT_STATUS_NOT_SUPPORTED;
 };
-- 
2.1.0
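
The arithmetic in max_xfer_config_size() can be checked in isolation. The
sketch below is a standalone model, not the patch's code: the struct names
are hypothetical, and it assumes the input page starts with three uint32_t
fields plus a 16-byte arg0 before arg3, as the arg3 array size in the patch
suggests. With a 4 KiB page it yields 4060 bytes, since the set path
(4068 - 8) is smaller than the get path (4096 - 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Headers only; the payload follows each header in the shared page. */
struct get_data_out_hdr { uint32_t status; };
struct set_data_in_hdr  { uint32_t offset; uint32_t length; };

/* Assumed layout: three uint32_t fields and a 16-byte arg0 precede arg3. */
#define ARG3_SIZE (PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t))

static uint32_t max_xfer_sketch(void)
{
    size_t max_get = PAGE_SIZE - sizeof(struct get_data_out_hdr);
    size_t max_set = ARG3_SIZE - sizeof(struct set_data_in_hdr);

    assert(ARG3_SIZE > sizeof(struct set_data_in_hdr));
    return (uint32_t)(max_get < max_set ? max_get : max_set);
}
```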

[Qemu-devel] [PATCH 15/16] nvdimm: support NFIT_CMD_GET_CONFIG_DATA

2015-07-01 Thread Xiao Guangrong

Function 5 is used to get Namespace Label Data

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 7e5446c..0498de3 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -423,6 +423,7 @@ struct dsm_buffer {
 uint32_t arg1;
 uint32_t arg2;
 union {
+struct cmd_in_get_config_data cmd_config_get;
 struct cmd_in_set_config_data cmd_config_set;
 char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)];
 };
@@ -525,6 +526,35 @@ exit:
 return status;
 }
 
+static uint32_t dsm_cmd_config_get(struct dsm_buffer *in, struct dsm_out *out)
+{
+GSList *list = get_nvdimm_built_list();
+PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
+struct cmd_in_get_config_data *cmd_in = &in->cmd_config_get;
+uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
+
+if (!nvdimm) {
+goto exit;
+}
+
+nvdebug("Read Config: offset %#x length %#x.\n", cmd_in->offset,
+cmd_in->length);
+if (nvdimm->config_data_size < cmd_in->length + cmd_in->offset) {
+nvdebug("position %#x is beyond config data (len = %#lx).\n",
+cmd_in->length + cmd_in->offset, nvdimm->config_data_size);
+status = NFIT_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+status = NFIT_STATUS_SUCCESS;
+memcpy(out->cmd_config_get.out_buf, nvdimm->config_data_addr +
+   cmd_in->offset, cmd_in->length);
+
+exit:
+g_slist_free(list);
+return status;
+}
+
 static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 {
 uint32_t function = in->arg2;
@@ -537,6 +567,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 case NFIT_CMD_GET_CONFIG_SIZE:
 status = dsm_cmd_config_size(in, out);
 break;
+case NFIT_CMD_GET_CONFIG_DATA:
+status = dsm_cmd_config_get(in, out);
+break;
 default:
 status = NFIT_STATUS_NOT_SUPPORTED;
 };
-- 
2.1.0
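
One point about the bounds check in dsm_cmd_config_get(): if offset and
length are both uint32_t, as the struct definition suggests, their sum is
computed in 32 bits and could wrap for large guest-supplied values. Whether
that matters in practice is not established here. The sketch below shows one
way to write the check so it cannot wrap; config_range_ok() is a
hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if [offset, offset + length) lies inside a
 * region of config_size bytes. The sum is widened to 64 bits so that
 * two large uint32_t values cannot wrap around to a small one.
 */
static bool config_range_ok(uint64_t config_size, uint32_t offset,
                            uint32_t length)
{
    return (uint64_t)offset + length <= config_size;
}
```

For a 128 KiB region, an offset of 0xfffffff0 with a length of 0x20 is
rejected here, whereas a 32-bit sum would wrap to 0x10 and pass.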

Re: [Qemu-devel] [Qemu-block] [PATCH] block/mirror: limit qiov to IOV_MAX elements

2015-07-01 Thread Stefan Hajnoczi

On Wed, Jul 1, 2015 at 3:47 PM, Paolo Bonzini  wrote:
> On 01/07/2015 16:45, Stefan Hajnoczi wrote:
>> If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2)
>> EINVAL failures may be encountered.
>>
>> It is possible to trigger this by setting granularity to a low value
>> like 8192.
>>
>> This patch stops appending chunks once IOV_MAX is reached.
>>
>> The spurious EINVAL failure can be reproduced with a qcow2 image file
>> and the following QMP invocation:
>>
>>   qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
>>   granularity=8192, sync='full', mode='absolute-paths',
>>   format='raw')
>>
>> While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
>> bs=4k.
>>
>> Cc: Jeff Cody 
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  block/mirror.c | 4 
>>  trace-events   | 1 +
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 048e452..985ad00 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -241,6 +241,10 @@ static uint64_t coroutine_fn 
>> mirror_iteration(MirrorBlockJob *s)
>>  trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
>>  break;
>>  }
>> +if (IOV_MAX < nb_chunks + added_chunks) {
>
> No Yoda conditions... apart from that,
>
> Reviewed-by: Paolo Bonzini 

I found it annoying to write it backwards too, but it's for consistency:

  if (s->buf_free_count < nb_chunks + added_chunks) {
  trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
  break;
  }
  if (IOV_MAX < nb_chunks + added_chunks) {
  trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
  break;
  }

It's the same type of check as s->buf_free_count (which isn't modified
by this loop either so it's a yoda conditional).
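
The two break conditions can be modelled outside QEMU. This is a minimal
sketch, assuming every candidate request contributes the same number of
chunks; count_chunks() and iov_limit are hypothetical names, and the real
loop in mirror_iteration() does more than this.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: keep adding requests of added_chunks each
 * until the free buffers or the iovec limit would be exceeded.
 * Returns the total number of chunks accepted.
 */
static int count_chunks(int buf_free_count, int iov_limit,
                        int added_chunks, int max_requests)
{
    int nb_chunks = 0;

    assert(added_chunks > 0);
    for (int i = 0; i < max_requests; i++) {
        if (buf_free_count < nb_chunks + added_chunks) {
            break; /* not enough free buffers */
        }
        if (iov_limit < nb_chunks + added_chunks) {
            break; /* would exceed the iovec limit */
        }
        nb_chunks += added_chunks;
    }
    return nb_chunks;
}
```

With 2048 free buffers and a limit of 1024, the model stops at 1024 chunks,
which is the case the patch is meant to cover.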

[Qemu-devel] [PATCH 16/16] nvdimm: support NFIT_CMD_SET_CONFIG_DATA

2015-07-01 Thread Xiao Guangrong

Function 6 is used to set Namespace Label Data

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 0498de3..0d2d9fb 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -450,12 +450,17 @@ struct cmd_out_get_config_data {
 uint8_t out_buf[0];
 } QEMU_PACKED;
 
+struct cmd_out_set_config_data {
+uint32_t status;
+} QEMU_PACKED;
+
 struct dsm_out {
 union {
 uint32_t status;
 struct cmd_out_implemented cmd_implemented;
 struct cmd_out_get_config_size cmd_config_size;
 struct cmd_out_get_config_data cmd_config_get;
+struct cmd_out_set_config_data cmd_config_set;
 uint8_t data[PAGE_SIZE];
 };
 };
@@ -555,6 +560,35 @@ exit:
 return status;
 }
 
+static uint32_t dsm_cmd_config_set(struct dsm_buffer *in, struct dsm_out *out)
+{
+GSList *list = get_nvdimm_built_list();
+PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
+struct cmd_in_set_config_data *cmd_in = &in->cmd_config_set;
+uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
+
+if (!nvdimm) {
+goto exit;
+}
+
+nvdebug("Write Config: offset %#x length %#x.\n", cmd_in->offset,
+cmd_in->length);
+if (nvdimm->config_data_size < cmd_in->length + cmd_in->offset) {
+nvdebug("position %#x is beyond config data (len = %#lx).\n",
+cmd_in->length + cmd_in->offset, nvdimm->config_data_size);
+status = NFIT_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+status = NFIT_STATUS_SUCCESS;
+memcpy(nvdimm->config_data_addr + cmd_in->offset, cmd_in->in_buf,
+   cmd_in->length);
+
+exit:
+g_slist_free(list);
+return status;
+}
+
 static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 {
 uint32_t function = in->arg2;
@@ -570,6 +604,9 @@ static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
 case NFIT_CMD_GET_CONFIG_DATA:
 status = dsm_cmd_config_get(in, out);
 break;
+case NFIT_CMD_SET_CONFIG_DATA:
+status = dsm_cmd_config_set(in, out);
+break;
 default:
 status = NFIT_STATUS_NOT_SUPPORTED;
 };
-- 
2.1.0

[Qemu-devel] [PATCH 06/16] pc: implement NVDIMM device abstract

2015-07-01 Thread Xiao Guangrong

Introduce the "pc-nvdimm" device. It has a single parameter, @file, which
names the file that backs the NVDIMM device's memory.

Use "-device pc-nvdimm,file=/dev/pmem" on the QEMU command line to create
an NVDIMM device for the guest.

Signed-off-by: Xiao Guangrong 
---
 hw/mem/Makefile.objs   |  1 +
 hw/mem/pc-nvdimm.c | 83 ++
 include/hw/mem/pc-nvdimm.h | 32 ++
 3 files changed, 116 insertions(+)
 create mode 100644 hw/mem/pc-nvdimm.c
 create mode 100644 include/hw/mem/pc-nvdimm.h

diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index b000fb4..9a7f5a9 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1 +1,2 @@
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_LINUX) += pc-nvdimm.o
diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
new file mode 100644
index 000..0209ea9
--- /dev/null
+++ b/hw/mem/pc-nvdimm.c
@@ -0,0 +1,83 @@
+/*
+ * NVDIMM (A Non-Volatile Dual In-line Memory Module) Virtualization Implement
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong 
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "hw/mem/pc-nvdimm.h"
+
+static char *get_file(Object *obj, Error **errp)
+{
+PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
+
+return g_strdup(nvdimm->file);
+}
+
+static void set_file(Object *obj, const char *str, Error **errp)
+{
+PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
+
+if (nvdimm->file) {
+g_free(nvdimm->file);
+}
+
+nvdimm->file = g_strdup(str);
+}
+
+static void pc_nvdimm_init(Object *obj)
+{
+object_property_add_str(obj, "file", get_file, set_file, NULL);
+}
+
+static void pc_nvdimm_realize(DeviceState *dev, Error **errp)
+{
+PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev);
+
+if (!nvdimm->file) {
+error_setg(errp, "file property is not set");
+}
+}
+
+static void pc_nvdimm_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+
+/* nvdimm hotplug has not supported yet. */
+dc->hotpluggable = false;
+
+dc->realize = pc_nvdimm_realize;
+dc->desc = "NVDIMM memory module";
+}
+
+static TypeInfo pc_nvdimm_info = {
+.name  = TYPE_PC_NVDIMM,
+.parent= TYPE_DEVICE,
+.instance_size = sizeof(PCNVDIMMDevice),
+.instance_init = pc_nvdimm_init,
+.class_init= pc_nvdimm_class_init,
+};
+
+static void pc_nvdimm_register_types(void)
+{
+type_register_static(&pc_nvdimm_info);
+}
+
+type_init(pc_nvdimm_register_types)
diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h
new file mode 100644
index 000..7f37b46
--- /dev/null
+++ b/include/hw/mem/pc-nvdimm.h
@@ -0,0 +1,32 @@
+/*
+ * NVDIMM (A Non-Volatile Dual In-line Memory Module) Virtualization Implement
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef __PC_NVDIMM_H
+#define __PC_NVDIMM_H
+
+#include "hw/qdev.h"
+
+#ifdef CONFIG_LINUX
+typedef struct PCNVDIMMDevice {
+/* private */
+DeviceState parent_obj;
+
+char *file;
+} PCNVDIMMDevice;
+
+#define TYPE_PC_NVDIMM "pc-nvdimm"
+
+#define PC_NVDIMM(obj) \
+OBJECT_CHECK(PCNVDIMMDevice, (obj), TYPE_PC_NVDIMM)
+#else  /* !CONFIG_LINUX */
+#endif
+#endif
-- 
2.1.0

[Qemu-devel] [PATCH 12/16] nvdimm: save arg3 for NVDIMM device _DSM method

2015-07-01 Thread Xiao Guangrong

Check whether the function (Arg2) takes additional input (Arg3) and save
that input if it does.

Arg3 is saved only for NVDIMM devices, since we are not going to support
any function on the root device.

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 73 +-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 0e2a9d5..c0965ae 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -329,6 +329,26 @@ static void build_nfit_table(GSList *device_list, char *buf)
 }
 }
 
+enum {
+NFIT_CMD_IMPLEMENTED = 0,
+
+/* bus commands */
+NFIT_CMD_ARS_CAP = 1,
+NFIT_CMD_ARS_START = 2,
+NFIT_CMD_ARS_QUERY = 3,
+
+/* per-dimm commands */
+NFIT_CMD_SMART = 1,
+NFIT_CMD_SMART_THRESHOLD = 2,
+NFIT_CMD_DIMM_FLAGS = 3,
+NFIT_CMD_GET_CONFIG_SIZE = 4,
+NFIT_CMD_GET_CONFIG_DATA = 5,
+NFIT_CMD_SET_CONFIG_DATA = 6,
+NFIT_CMD_VENDOR_EFFECT_LOG_SIZE = 7,
+NFIT_CMD_VENDOR_EFFECT_LOG = 8,
+NFIT_CMD_VENDOR = 9,
+};
+
 struct dsm_buffer {
 /* RAM page. */
 uint32_t handle;
@@ -433,6 +453,19 @@ exit:
 g_slist_free(list);
 }
 
+static bool device_cmd_has_arg3[] = {
+false,  /* NFIT_CMD_IMPLEMENTED */
+false,  /* NFIT_CMD_SMART */
+false,  /* NFIT_CMD_SMART_THRESHOLD */
+false,  /* NFIT_CMD_DIMM_FLAGS */
+false,  /* NFIT_CMD_GET_CONFIG_SIZE */
+true,   /* NFIT_CMD_GET_CONFIG_DATA */
+true,   /* NFIT_CMD_SET_CONFIG_DATA */
+false,  /* NFIT_CMD_VENDOR_EFFECT_LOG_SIZE */
+false,  /* NFIT_CMD_VENDOR_EFFECT_LOG */
+false,  /* NFIT_CMD_VENDOR */
+};
+
 #define BUILD_STA_METHOD(_dev_, _method_)  \
 do {   \
 _method_ = aml_method("_STA", 0);  \
@@ -457,10 +490,20 @@ exit:
 
 static void build_nvdimm_devices(Aml *root_dev, GSList *list)
 {
+Aml *has_arg3;
+int i, cmd_nr;
+
+cmd_nr = ARRAY_SIZE(device_cmd_has_arg3);
+has_arg3 = aml_package(cmd_nr);
+for (i = 0; i < cmd_nr; i++) {
+aml_append(has_arg3, aml_int(device_cmd_has_arg3[i]));
+}
+aml_append(root_dev, aml_name_decl("CAG3", has_arg3));
+
 for (; list; list = list->next) {
 PCNVDIMMDevice *nvdimm = list->data;
 uint32_t handle = nvdimm_index_to_handle(nvdimm->device_index);
-Aml *dev, *method;
+Aml *dev, *method, *ifctx;
 
 dev = aml_device("NVD%d", nvdimm->device_index);
 aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
@@ -470,6 +513,34 @@ static void build_nvdimm_devices(Aml *root_dev, GSList *list)
 method = aml_method("_DSM", 4);
 {
 SAVE_ARG012_HANDLE(method, aml_int(handle));
+
+/* Local5 = DeRefOf(Index(CAG3, Arg2)) */
+aml_append(method,
+   aml_store(aml_derefof(aml_index(aml_name("CAG3"),
+   aml_arg(2))), aml_local(5)));
+/* if 0 < local5 */
+ifctx = aml_if(aml_lless(aml_int(0), aml_local(5)));
+{
+/* Local0 = Index(Arg3, 0) */
+aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
+   aml_local(0)));
+/* Local1 = sizeof(Local0) */
+aml_append(ifctx, aml_store(aml_sizeof(aml_local(0)),
+   aml_local(1)));
+/* Local2 = Local1 << 3 */
+aml_append(ifctx, aml_store(aml_shiftleft(aml_local(1),
+   aml_int(3)), aml_local(2)));
+/* Local3 = DeRefOf(Local0) */
+aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
+   aml_local(3)));
+/* CreateField(Local3, 0, local2, IBUF) */
+aml_append(ifctx, aml_create_field(aml_local(3),
+   aml_int(0), aml_local(2), "IBUF"));
+/* ARG3 = IBUF */
+aml_append(ifctx, aml_store(aml_name("IBUF"),
+   aml_name("ARG3")));
+}
+aml_append(method, ifctx);
 NOTIFY_AND_RETURN(method);
 }
 aml_append(dev, method);
-- 
2.1.0
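
The CAG3 package is a per-function lookup table indexed by Arg2. The same
lookup can be sketched in plain C as below. The table contents mirror
device_cmd_has_arg3 from the patch; function_has_arg3() and its bounds check
are hypothetical additions of this sketch, since the AML in the patch
indexes CAG3 with Arg2 directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Indexed by per-DIMM function number, as in device_cmd_has_arg3. */
static const bool cmd_has_arg3[] = {
    false, /* 0: NFIT_CMD_IMPLEMENTED */
    false, /* 1: NFIT_CMD_SMART */
    false, /* 2: NFIT_CMD_SMART_THRESHOLD */
    false, /* 3: NFIT_CMD_DIMM_FLAGS */
    false, /* 4: NFIT_CMD_GET_CONFIG_SIZE */
    true,  /* 5: NFIT_CMD_GET_CONFIG_DATA */
    true,  /* 6: NFIT_CMD_SET_CONFIG_DATA */
    false, /* 7: NFIT_CMD_VENDOR_EFFECT_LOG_SIZE */
    false, /* 8: NFIT_CMD_VENDOR_EFFECT_LOG */
    false, /* 9: NFIT_CMD_VENDOR */
};

/* Hypothetical bounds-checked lookup; unknown functions carry no Arg3. */
static bool function_has_arg3(uint32_t function)
{
    size_t n = sizeof(cmd_has_arg3) / sizeof(cmd_has_arg3[0]);

    return function < n && cmd_has_arg3[function];
}
```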

[Qemu-devel] [PATCH 13/16] nvdimm: support NFIT_CMD_IMPLEMENTED function

2015-07-01 Thread Xiao Guangrong

_DSM is defined in ACPI 6.0: 9.14.1 _DSM (Device Specific Method)

Function 0 is a query function. We do not support any function on the root
device, and only three functions are supported on an NVDIMM device:
NFIT_CMD_GET_CONFIG_SIZE, NFIT_CMD_GET_CONFIG_DATA and
NFIT_CMD_SET_CONFIG_DATA. That means we currently only allow access to the
device's Label Namespace.
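
Function 0 reports the supported functions as a bitmask with one bit per
function number. The sketch below reproduces the per-DIMM mask from the
function numbers used in this series and shows how a caller could test it;
dimm_supports() is a hypothetical helper, not part of the patch. The mask
works out to 0x71 (bits 0, 4, 5 and 6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    NFIT_CMD_IMPLEMENTED     = 0,
    NFIT_CMD_GET_CONFIG_SIZE = 4,
    NFIT_CMD_GET_CONFIG_DATA = 5,
    NFIT_CMD_SET_CONFIG_DATA = 6,
};

/* One bit per supported function number. */
#define DIMM_SUPPORT_CMD ((1u << NFIT_CMD_IMPLEMENTED)     \
                        | (1u << NFIT_CMD_GET_CONFIG_SIZE) \
                        | (1u << NFIT_CMD_GET_CONFIG_DATA) \
                        | (1u << NFIT_CMD_SET_CONFIG_DATA))

/* Hypothetical helper: test one function bit in a Function 0 result. */
static bool dimm_supports(uint64_t cmd_list, unsigned function)
{
    return function < 64 && ((cmd_list >> function) & 1u) != 0;
}
```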

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 126 +
 1 file changed, 126 insertions(+)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index c0965ae..b586bf7 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -29,6 +29,15 @@
 #include "exec/address-spaces.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/mem/pc-nvdimm.h"
+#include "sysemu/sysemu.h"
+
+//#define NVDIMM_DEBUG
+
+#ifdef NVDIMM_DEBUG
+#define nvdebug(fmt, ...) fprintf(stderr, "nvdimm: " fmt, ## __VA_ARGS__)
+#else
+#define nvdebug(...)
+#endif
 
 #define PAGE_SIZE   (1UL << 12)
 
@@ -135,6 +144,22 @@ static void nfit_spa_uuid_pm(void *uuid)
 memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
 }
 
+static bool dsm_is_root_uuid(uint8_t *uuid)
+{
+uuid_le uuid_root = UUID_LE(0x2f10e7a4, 0x9e91, 0x11e4, 0x89,
+0xd3, 0x12, 0x3b, 0x93, 0xf7, 0x5c, 0xba);
+
+return !memcmp(uuid, &uuid_root, sizeof(uuid_root));
+}
+
+static bool dsm_is_dimm_uuid(uint8_t *uuid)
+{
+uuid_le uuid_dimm = UUID_LE(0x4309ac30, 0x0d11, 0x11e4, 0x91,
+0x91, 0x08, 0x00, 0x20, 0x0c, 0x9a, 0x66);
+
+return !memcmp(uuid, &uuid_dimm, sizeof(uuid_dimm));
+}
+
 enum {
 NFIT_TABLE_SPA = 0,
 NFIT_TABLE_MEM = 1,
@@ -349,6 +374,23 @@ enum {
 NFIT_CMD_VENDOR = 9,
 };
 
+enum {
+NFIT_STATUS_SUCCESS = 0,
+NFIT_STATUS_NOT_SUPPORTED = 1,
+NFIT_STATUS_NON_EXISTING_MEM_DEV = 2,
+NFIT_STATUS_INVALID_PARAS = 3,
+NFIT_STATUS_VENDOR_SPECIFIC_ERROR = 4,
+};
+
+#define DSM_REVISION        (1)
+
+/* do not support any command except NFIT_CMD_ARS_CAP on root. */
+#define ROOT_SUPPORT_CMD    (1 << NFIT_CMD_ARS_CAP)
+#define DIMM_SUPPORT_CMD    ((1 << NFIT_CMD_IMPLEMENTED)       \
+                           | (1 << NFIT_CMD_GET_CONFIG_SIZE)   \
+                           | (1 << NFIT_CMD_GET_CONFIG_DATA)   \
+                           | (1 << NFIT_CMD_SET_CONFIG_DATA))
+
 struct dsm_buffer {
 /* RAM page. */
 uint32_t handle;
@@ -366,6 +408,18 @@ struct dsm_buffer {
 };
 };
 
+struct cmd_out_implemented {
+uint64_t cmd_list;
+};
+
+struct dsm_out {
+union {
+uint32_t status;
+struct cmd_out_implemented cmd_implemented;
+uint8_t data[PAGE_SIZE];
+};
+};
+
 static uint64_t dsm_read(void *opaque, hwaddr addr,
  unsigned size)
 {
@@ -374,10 +428,82 @@ static uint64_t dsm_read(void *opaque, hwaddr addr,
 return 0;
 }
 
+static void dsm_write_root(struct dsm_buffer *in, struct dsm_out *out)
+{
+uint32_t function = in->arg2;
+
+if (function == NFIT_CMD_IMPLEMENTED) {
+out->cmd_implemented.cmd_list = ROOT_SUPPORT_CMD;
+return;
+}
+
+out->status = NFIT_STATUS_NOT_SUPPORTED;
+nvdebug("Return status %#x.\n", out->status);
+}
+
+static void dsm_write_nvdimm(struct dsm_buffer *in, struct dsm_out *out)
+{
+uint32_t function = in->arg2;
+uint32_t status;
+
+switch (function) {
+case NFIT_CMD_IMPLEMENTED:
+out->cmd_implemented.cmd_list = DIMM_SUPPORT_CMD;
+return;
+default:
+status = NFIT_STATUS_NOT_SUPPORTED;
+};
+
+nvdebug("Return status %#x.\n", status);
+out->status = status;
+}
+
 static void dsm_write(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
 {
+struct MemoryRegion *dsm_ram_mr = opaque;
+struct dsm_buffer *dsm;
+struct dsm_out *out;
+void *buf;
+
 assert(val == NOTIFY_VALUE);
+
+buf = memory_region_get_ram_ptr(dsm_ram_mr);
+dsm = buf;
+out = buf;
+
+nvdebug("Arg0 " UUID_FMT ".\n", dsm->arg0[0], dsm->arg0[1], dsm->arg0[2],
+dsm->arg0[3], dsm->arg0[4], dsm->arg0[5], dsm->arg0[6],
+dsm->arg0[7], dsm->arg0[8], dsm->arg0[9], dsm->arg0[10],
+dsm->arg0[11], dsm->arg0[12], dsm->arg0[13], dsm->arg0[14],
+dsm->arg0[15]);
+nvdebug("Handler %#x, Arg1 %#x, Arg2 %#x.\n", dsm->handle, dsm->arg1,
+dsm->arg2);
+
+if (dsm->arg1 != DSM_REVISION) {
+nvdebug("Revision %#x is not supported, expect %#x.\n",
+dsm->arg1, DSM_REVISION);
+goto exit;
+}
+
+if (!dsm->handle) {
+if (!dsm_is_root_uuid(dsm->arg0)) {
+nvdebug("Root UUID does not match.\n");
+goto exit;
+}
+
+return dsm_write_root(dsm, out);
+}
+
+if (!dsm_is_dimm_uuid(dsm->arg0)) {
+nvdebug("DIMM UUID does not match.\n");
+goto exit;
+}
+
+return dsm_write_nvdimm(dsm, out);
+
+exit:
+out->status = NFIT_STATUS_NOT_

[Qemu-devel] [PATCH 11/16] nvdimm: build ACPI nvdimm devices

2015-07-01 Thread Xiao Guangrong

NVDIMM devices are defined in ACPI 6.0: 9.20 NVDIMM Devices

There is a root device under \_SB, and the individual NVDIMM devices sit
under that root device. Each NVDIMM device has an _ADR that returns its
handle, which is used to associate it with the MEMDEV table in the NFIT.

We reserve handle 0 for the root device. In this patch, we save the handle,
arg0, arg1 and arg2. Arg3 is conditionally saved in a later patch.

Signed-off-by: Xiao Guangrong 
---
 hw/i386/acpi-build.c   |   2 +
 hw/mem/pc-nvdimm.c | 126 +
 include/hw/mem/pc-nvdimm.h |   6 +++
 3 files changed, 134 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 80c21be..85c7226 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1342,6 +1342,8 @@ build_ssdt(GArray *table_data, GArray *linker,
 aml_append(sb_scope, scope);
 }
 }
+
+pc_nvdimm_build_acpi_devices(sb_scope);
 aml_append(ssdt, sb_scope);
 }
 
diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 4c290cb..0e2a9d5 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -32,6 +32,7 @@
 
 #define PAGE_SIZE   (1UL << 12)
 
+#define NOTIFY_VALUE        (0x99)
 #define MAX_NVDIMM_NUMBER   (10)
 #define MIN_CONFIG_DATA_SIZE    (128 << 10)
 
@@ -348,12 +349,15 @@ struct dsm_buffer {
 static uint64_t dsm_read(void *opaque, hwaddr addr,
  unsigned size)
 {
+fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
+assert(0);
 return 0;
 }
 
 static void dsm_write(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
 {
+assert(val == NOTIFY_VALUE);
 }
 
 static const MemoryRegionOps dsm_ops = {
@@ -429,6 +433,128 @@ exit:
 g_slist_free(list);
 }
 
+#define BUILD_STA_METHOD(_dev_, _method_)  \
+do {   \
+_method_ = aml_method("_STA", 0);  \
+aml_append(_method_, aml_return(aml_int(0x0f)));   \
+aml_append(_dev_, _method_);   \
+} while (0)
+
+#define SAVE_ARG012_HANDLE(_method_, _handle_) \
+do {   \
+aml_append(_method_, aml_store(_handle_, aml_name("HDLE")));   \
+aml_append(_method_, aml_store(aml_arg(0), aml_name("ARG0"))); \
+aml_append(_method_, aml_store(aml_arg(1), aml_name("ARG1"))); \
+aml_append(_method_, aml_store(aml_arg(2), aml_name("ARG2"))); \
+} while (0)
+
+#define NOTIFY_AND_RETURN(_method_)\
+do {   \
+aml_append(_method_, aml_store(aml_int(NOTIFY_VALUE),  \
+   aml_name("NOTI"))); \
+aml_append(_method_, aml_return(aml_name("ODAT")));\
+} while (0)
+
+static void build_nvdimm_devices(Aml *root_dev, GSList *list)
+{
+for (; list; list = list->next) {
+PCNVDIMMDevice *nvdimm = list->data;
+uint32_t handle = nvdimm_index_to_handle(nvdimm->device_index);
+Aml *dev, *method;
+
+dev = aml_device("NVD%d", nvdimm->device_index);
+aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
+
+BUILD_STA_METHOD(dev, method);
+
+method = aml_method("_DSM", 4);
+{
+SAVE_ARG012_HANDLE(method, aml_int(handle));
+NOTIFY_AND_RETURN(method);
+}
+aml_append(dev, method);
+
+aml_append(root_dev, dev);
+}
+}
+
+void pc_nvdimm_build_acpi_devices(Aml *sb_scope)
+{
+Aml *dev, *method, *field;
+struct dsm_buffer *dsm_buf;
+GSList *list = get_nvdimm_built_list();
+int nr = get_nvdimm_device_number(list);
+
+if (nr <= 0 || nr > MAX_NVDIMM_NUMBER) {
+g_slist_free(list);
+return;
+}
+
+dev = aml_device("NVDR");
+aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012")));
+
+/* map DSM buffer into ACPI namespace. */
+aml_append(dev, aml_operation_region("DSMR", AML_SYSTEM_MEMORY,
+   nvdimms_info.dsm_addr, nvdimms_info.dsm_size));
+
+/*
+ * DSM input:
+ * @HDLE: store device's handle, it's zero if the _DSM call happens
+ *on ROOT.
+ * @ARG0 ~ @ARG3: store the parameters of _DSM call.
+ *
+ * They are ram mapping on host so that these access never cause VM-EXIT.
+ */
+field = aml_field("DSMR", AML_DWORD_ACC, AML_PRESERVE);
+aml_append(field, aml_named_field("HDLE",
+   sizeof(dsm_buf->handle) * BITS_PER_BYTE));
+aml_append(field, aml_named_field("ARG0",
+   sizeof(dsm_buf->arg0) * BITS_PER_BYTE));
+aml_append(field, aml_named_field("ARG1",
+   sizeof(dsm_buf->a

[Qemu-devel] [PATCH 04/16] acpi: add aml_sizeof

2015-07-01 Thread Xiao Guangrong

Implement the SizeOf term, which is used by the NVDIMM _DSM method in a
later patch

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 8 
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 9e89efc..a526eed 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1143,6 +1143,14 @@ Aml *aml_derefof(Aml *arg)
 return var;
 }
 
+/* ACPI 6.0: 20.2.5.4 Type 2 Opcodes Encoding: DefSizeOf */
+Aml *aml_sizeof(Aml *arg)
+{
+Aml *var = aml_opcode(0x87 /* SizeOfOp */);
+aml_append(var, arg);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 21dc5e9..6b591ab 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -276,6 +276,7 @@ Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
+Aml *aml_sizeof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
2.1.0

[Qemu-devel] [PATCH 10/16] nvdimm: init the address region used by _DSM method

2015-07-01 Thread Xiao Guangrong

This memory range is used to transfer data between ACPI in the guest and
QEMU. It occupies two pages:
- one is RAM-based; it holds the input of the _DSM method, and QEMU reuses
  it to store the output

- the other is MMIO-based; ACPI writes data to this page to transfer
  control to QEMU

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 80 +-
 1 file changed, 79 insertions(+), 1 deletion(-)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index e7cff29..4c290cb 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -37,6 +37,10 @@
 
 static struct nvdimms_info {
 ram_addr_t current_addr;
+
+ram_addr_t dsm_addr;
+int dsm_size;
+
 int device_index;
 } nvdimms_info;
 
@@ -324,14 +328,88 @@ static void build_nfit_table(GSList *device_list, char 
*buf)
 }
 }
 
+struct dsm_buffer {
+/* RAM page. */
+uint32_t handle;
+uint8_t arg0[16];
+uint32_t arg1;
+uint32_t arg2;
+union {
+char arg3[PAGE_SIZE - 3 * sizeof(uint32_t) - 16 * sizeof(uint8_t)];
+};
+
+/* MMIO page. */
+union {
+uint32_t notify;
+char padding[PAGE_SIZE];
+};
+};
+
+static uint64_t dsm_read(void *opaque, hwaddr addr,
+ unsigned size)
+{
+return 0;
+}
+
+static void dsm_write(void *opaque, hwaddr addr,
+  uint64_t val, unsigned size)
+{
+}
+
+static const MemoryRegionOps dsm_ops = {
+.read = dsm_read,
+.write = dsm_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static int build_dsm_buffer(void)
+{
+MemoryRegion *dsm_ram_mr, *dsm_mmio_mr;
+ram_addr_t addr;
+
+QEMU_BUILD_BUG_ON(PAGE_SIZE * 2 != sizeof(struct dsm_buffer));
+
+/* DSM buffer has already been built. */
+if (nvdimms_info.dsm_addr) {
+return 0;
+}
+
+addr = reserved_range_push(2 * PAGE_SIZE);
+if (!addr) {
+return -1;
+}
+
+nvdimms_info.dsm_addr = addr;
+nvdimms_info.dsm_size = PAGE_SIZE * 2;
+
+dsm_ram_mr = g_new(MemoryRegion, 1);
+memory_region_init_ram(dsm_ram_mr, NULL, "dsm_ram", PAGE_SIZE,
+   &error_abort);
+vmstate_register_ram_global(dsm_ram_mr);
+memory_region_add_subregion(get_system_memory(), addr, dsm_ram_mr);
+
+dsm_mmio_mr = g_new(MemoryRegion, 1);
+memory_region_init_io(dsm_mmio_mr, NULL, &dsm_ops, dsm_ram_mr,
+  "dsm_mmio", PAGE_SIZE);
+memory_region_add_subregion(get_system_memory(), addr + PAGE_SIZE,
+dsm_mmio_mr);
+return 0;
+}
+
 void pc_nvdimm_build_nfit_table(GArray *table_offsets, GArray *table_data,
 GArray *linker)
 {
-GSList *list = get_nvdimm_built_list();
+GSList *list;
 size_t total;
 char *buf;
 int nfit_start, nr;
 
+if (build_dsm_buffer()) {
+fprintf(stderr, "do not have enough space for DSM buffer.\n");
+return;
+}
+
+list = get_nvdimm_built_list();
 nr = get_nvdimm_device_number(list);
 total = get_nfit_total_size(nr);
 
-- 
2.1.0

[Qemu-devel] [PATCH 02/16] i386/acpi-build: allow SSDT to operate on 64 bit

2015-07-01 Thread Xiao Guangrong

Only 512M is left for MMIO below 4G, and it is used by PCI, BIOS, etc.
Other components also reserve regions for their internal usage, e.g.
[0xFED00000, 0xFED00000 + 0x400) is reserved for HPET.

Switch SSDT to 64 bit to use the huge free room above 4G. In later
patches, we will dynamically allocate free space within this region,
which is used by the NVDIMM _DSM method

Signed-off-by: Xiao Guangrong 
---
 hw/i386/acpi-build.c  | 4 ++--
 hw/i386/acpi-dsdt.dsl | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 00818b9..6a1ab09 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1348,7 +1348,7 @@ build_ssdt(GArray *table_data, GArray *linker,
 g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
 build_header(linker, table_data,
 (void *)(table_data->data + table_data->len - ssdt->buf->len),
-"SSDT", ssdt->buf->len, 1);
+"SSDT", ssdt->buf->len, 2);
 free_aml_allocator();
 }
 
@@ -1586,7 +1586,7 @@ build_dsdt(GArray *table_data, GArray *linker, 
AcpiMiscInfo *misc)
 
 memset(dsdt, 0, sizeof *dsdt);
 build_header(linker, table_data, dsdt, "DSDT",
- misc->dsdt_size, 1);
+ misc->dsdt_size, 2);
 }
 
 static GArray *
diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl
index a2d84ec..5cd3f0e 100644
--- a/hw/i386/acpi-dsdt.dsl
+++ b/hw/i386/acpi-dsdt.dsl
@@ -22,7 +22,7 @@ ACPI_EXTRACT_ALL_CODE AcpiDsdtAmlCode
 DefinitionBlock (
 "acpi-dsdt.aml",// Output Filename
 "DSDT", // Signature
-0x01,   // DSDT Compliance Revision
+0x02,   // DSDT Compliance Revision
 "BXPC", // OEMID
 "BXDSDT",   // TABLE ID
 0x1 // OEM Revision
-- 
2.1.0

[Qemu-devel] [PATCH 08/16] nvdimm: init backend memory mapping and config data area

2015-07-01 Thread Xiao Guangrong

The parameter @file is used as the backing memory for NVDIMM. It is
divided into two parts:
- the first part is [0, size - 128K), which is used as PMEM (Persistent
  Memory)
- the last 128K of the file is used as the Config Data Area, which
  stores Label namespace data

@file supports both regular files and block devices. Either kind of file
can be used for test and emulation; in the real world, however, for
performance reasons we usually use one of these as the NVDIMM backing
file:
- a regular file in a DAX-enabled filesystem created on an NVDIMM
  device on the host
- a raw PMEM device on the host, e.g. /dev/pmem0

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-nvdimm.c | 102 -
 include/hw/mem/pc-nvdimm.h |   5 +++
 2 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index b40d4e7..9531935 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -22,12 +22,20 @@
  * License along with this library; if not, see 
  */
 
+#include 
+#include 
+#include 
+
+#include "exec/address-spaces.h"
 #include "hw/mem/pc-nvdimm.h"
 
-#define PAGE_SIZE  (1UL << 12)
+#define PAGE_SIZE   (1UL << 12)
+
+#define MIN_CONFIG_DATA_SIZE(128 << 10)
 
 static struct nvdimms_info {
 ram_addr_t current_addr;
+int device_index;
 } nvdimms_info;
 
 /* the address range [offset, ~0ULL) is reserved for NVDIMM. */
@@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset)
 nvdimms_info.current_addr = offset;
 }
 
+static ram_addr_t reserved_range_push(uint64_t size)
+{
+uint64_t current;
+
+current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);
+
+/* do not have enough space? */
+if (current + size < current) {
+return 0;
+}
+
+nvdimms_info.current_addr = current + size;
+return current;
+}
+
+static uint32_t new_device_index(void)
+{
+return nvdimms_info.device_index++;
+}
+
 static char *get_file(Object *obj, Error **errp)
 {
 PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
@@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str, Error 
**errp)
 {
 PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
 
+if (memory_region_size(&nvdimm->mr)) {
+error_setg(errp, "cannot change property value");
+return;
+}
+
 if (nvdimm->file) {
 g_free(nvdimm->file);
 }
@@ -60,13 +93,80 @@ static void pc_nvdimm_init(Object *obj)
 object_property_add_str(obj, "file", get_file, set_file, NULL);
 }
 
+static uint64_t get_file_size(int fd)
+{
+struct stat stat_buf;
+uint64_t size;
+
+if (fstat(fd, &stat_buf) < 0) {
+return 0;
+}
+
+if (S_ISREG(stat_buf.st_mode)) {
+return stat_buf.st_size;
+}
+
+if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) {
+return size;
+}
+
+return 0;
+}
+
 static void pc_nvdimm_realize(DeviceState *dev, Error **errp)
 {
 PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev);
+char name[512];
+void *buf;
+ram_addr_t addr;
+uint64_t size;
+int fd;
 
 if (!nvdimm->file) {
 error_setg(errp, "file property is not set");
 }
+
+fd = open(nvdimm->file, O_RDWR);
+if (fd < 0) {
+error_setg(errp, "can not open %s", nvdimm->file);
+return;
+}
+
+/* reserve MIN_CONFIG_DATA_SIZE for config data */
+size = get_file_size(fd) - MIN_CONFIG_DATA_SIZE;
+if ((int64_t)size <= 0) {
+error_setg(errp, "file size is too small to store NVDIMM"
+ " configure data");
+goto do_close;
+}
+
+buf = mmap(NULL, size + MIN_CONFIG_DATA_SIZE, PROT_READ | PROT_WRITE,
+   MAP_SHARED, fd, 0);
+if (buf == MAP_FAILED) {
+error_setg(errp, "can not do mmap on %s", nvdimm->file);
+goto do_close;
+}
+
+addr = reserved_range_push(size);
+if (!addr) {
+error_setg(errp, "do not have enough space for size %#lx.\n", size);
+goto do_unmap;
+}
+
+nvdimm->device_index = new_device_index();
+sprintf(name, "NVDIMM-%d", nvdimm->device_index);
+memory_region_init_ram_ptr(&nvdimm->mr, OBJECT(dev), name, size, buf);
+vmstate_register_ram(&nvdimm->mr, DEVICE(dev));
+memory_region_add_subregion(get_system_memory(), addr, &nvdimm->mr);
+
+nvdimm->config_data_addr = buf + size;
+nvdimm->config_data_size = MIN_CONFIG_DATA_SIZE;
+
+return;
+do_unmap:
+munmap(buf, size + MIN_CONFIG_DATA_SIZE);
+do_close:
+close(fd);
 }
 
 static void pc_nvdimm_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h
index 2081e7c..e743ed1 100644
--- a/include/hw/mem/pc-nvdimm.h
+++ b/include/hw/mem/pc-nvdimm.h
@@ -21,6 +21,11 @@ typedef struct PCNVDIMMDevice {
 DeviceState parent_obj;
 
 char *file;
+void *config_data_addr;
+uint64_t config_data_size;
+
+int device_index;
+MemoryRegion mr

Re: [Qemu-devel] [PATCH 3/9] kvm: add hyper-v crash msrs values

2015-07-01 Thread Paolo Bonzini



On 30/06/2015 13:33, Denis V. Lunev wrote:
> +#define HV_X64_MSR_CRASH_CTL_NOTIFY  (1ULL << 63)
> +#define HV_X64_MSR_CRASH_CTL_CONTENTS\
> + (HV_X64_MSR_CRASH_CTL_NOTIFY)

Why is HV_X64_MSR_CRASH_CTL_CONTENTS needed?  Can I just remove it?

Paolo

[Qemu-devel] [PATCH 07/16] nvdimm: reserve address range for NVDIMM

2015-07-01 Thread Xiao Guangrong

NVDIMM reserves all the free range above 4G for:
- Persistent Memory (PMEM) mapping
- implementing the NVDIMM ACPI device _DSM method
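
A minimal sketch of how such a reserved range could be handed out, assuming
a simple page-aligned bump allocator. The names nv_reserve_range() and
nv_range_push() are hypothetical; the series' own helpers are
pc_nvdimm_reserve_range() in this patch and reserved_range_push() in
patch 08/16.

```c
#include <stdint.h>

#define NV_PAGE_SIZE (1ULL << 12)

/* Next free address inside the reserved range [offset, ~0ULL). */
static uint64_t nv_current_addr;

/* Round v up to a multiple of align (align must be a power of two). */
static uint64_t nv_round_up(uint64_t v, uint64_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* Hypothetical: record the page-aligned start of the reserved range. */
static void nv_reserve_range(uint64_t offset)
{
    nv_current_addr = nv_round_up(offset, NV_PAGE_SIZE);
}

/*
 * Hypothetical: carve size bytes out of the reserved range.
 * Returns the start address, or 0 if the request would wrap past ~0ULL.
 */
static uint64_t nv_range_push(uint64_t size)
{
    uint64_t current = nv_round_up(nv_current_addr, NV_PAGE_SIZE);

    /* Either the rounding or the addition overflowed. */
    if (current < nv_current_addr || current + size < current) {
        return 0;
    }

    nv_current_addr = current + size;
    return current;
}
```

Returning 0 on failure works here only because the range starts above 4G,
so 0 can never be a valid allocation.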

Signed-off-by: Xiao Guangrong 
---
 hw/i386/pc.c   | 11 +--
 hw/mem/pc-nvdimm.c | 13 +
 include/hw/mem/pc-nvdimm.h |  5 +
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7072930..82e80a9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -64,6 +64,7 @@
 #include "hw/pci/pci_host.h"
 #include "acpi-build.h"
 #include "hw/mem/pc-dimm.h"
+#include "hw/mem/pc-nvdimm.h"
 #include "trace.h"
 #include "qapi/visitor.h"
 #include "qapi-visit.h"
@@ -1241,6 +1242,7 @@ FWCfgState *pc_memory_init(MachineState *machine,
 MemoryRegion *ram_below_4g, *ram_above_4g;
 FWCfgState *fw_cfg;
 PCMachineState *pcms = PC_MACHINE(machine);
+ram_addr_t offset;
 
 assert(machine->ram_size == below_4g_mem_size + above_4g_mem_size);
 
@@ -1278,6 +1280,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
 exit(EXIT_FAILURE);
 }
 
+offset = 0x100000000ULL + above_4g_mem_size;
+
 /* initialize hotplug memory address space */
 if (guest_info->has_reserved_memory &&
 (machine->ram_size < machine->maxram_size)) {
@@ -1297,8 +1301,7 @@ FWCfgState *pc_memory_init(MachineState *machine,
 exit(EXIT_FAILURE);
 }
 
-pcms->hotplug_memory_base =
-ROUND_UP(0x100000000ULL + above_4g_mem_size, 1ULL << 30);
+pcms->hotplug_memory_base = ROUND_UP(offset, 1ULL << 30);
 
 if (pcms->enforce_aligned_dimm) {
 /* size hotplug region assuming 1G page max alignment per slot */
@@ -1316,8 +1319,12 @@ FWCfgState *pc_memory_init(MachineState *machine,
"hotplug-memory", hotplug_mem_size);
 memory_region_add_subregion(system_memory, pcms->hotplug_memory_base,
 &pcms->hotplug_memory);
+offset = pcms->hotplug_memory_base + hotplug_mem_size;
 }
 
+/* all the space left above 4G is reserved for NVDIMM. */
+pc_nvdimm_reserve_range(offset);
+
 /* Initialize PC system firmware */
 pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw);
 
diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 0209ea9..b40d4e7 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -24,6 +24,19 @@
 
 #include "hw/mem/pc-nvdimm.h"
 
+#define PAGE_SIZE  (1UL << 12)
+
+static struct nvdimms_info {
+ram_addr_t current_addr;
+} nvdimms_info;
+
+/* the address range [offset, ~0ULL) is reserved for NVDIMM. */
+void pc_nvdimm_reserve_range(ram_addr_t offset)
+{
+offset = ROUND_UP(offset, PAGE_SIZE);
+nvdimms_info.current_addr = offset;
+}
+
 static char *get_file(Object *obj, Error **errp)
 {
 PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h
index 7f37b46..2081e7c 100644
--- a/include/hw/mem/pc-nvdimm.h
+++ b/include/hw/mem/pc-nvdimm.h
@@ -27,6 +27,11 @@ typedef struct PCNVDIMMDevice {
 
 #define PC_NVDIMM(obj) \
 OBJECT_CHECK(PCNVDIMMDevice, (obj), TYPE_PC_NVDIMM)
+
+void pc_nvdimm_reserve_range(ram_addr_t offset);
 #else  /* !CONFIG_LINUX */
+static inline void pc_nvdimm_reserve_range(ram_addr_t offset)
+{
+}
 #endif
 #endif
-- 
2.1.0

[Qemu-devel] [PATCH v3 14/25] virtio: add version 1.0 support to vp_notify

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c  | 22 +-
 src/hw/virtio-pci.h  |  7 ++-
 src/hw/virtio-ring.c |  2 +-
 src/hw/virtio-ring.h |  1 +
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index bd51d8a..240cd2f 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -97,6 +97,24 @@ void vp_reset(struct vp_device *vp)
 }
 }
 
+void vp_notify(struct vp_device *vp, struct vring_virtqueue *vq)
+{
+if (vp->use_modern) {
+u32 addr = vp->notify.addr +
+vq->queue_notify_off *
+vp->notify_off_multiplier;
+if (vp->notify.is_io) {
+outw(vq->queue_index, addr);
+} else {
+writew((void*)addr, vq->queue_index);
+}
+dprintf(9, "vp notify %x (%d) -- 0x%x\n",
+addr, 2, vq->queue_index);
+} else {
+vp_write(&vp->legacy, virtio_pci_legacy, queue_notify, 
vq->queue_index);
+}
+}
+
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
@@ -162,7 +180,7 @@ void vp_init_simple(struct vp_device *vp, struct pci_device 
*pci)
 {
 u8 cap = pci_find_capability(pci, PCI_CAP_ID_VNDR, 0);
 struct vp_cap *vp_cap;
-u32 addr, offset;
+u32 addr, offset, mul;
 u8 type;
 
 memset(vp, 0, sizeof(*vp));
@@ -175,6 +193,8 @@ void vp_init_simple(struct vp_device *vp, struct pci_device 
*pci)
 break;
 case VIRTIO_PCI_CAP_NOTIFY_CFG:
 vp_cap = &vp->notify;
+mul = offsetof(struct virtio_pci_notify_cap, 
notify_off_multiplier);
+vp->notify_off_multiplier = pci_config_readl(pci->bdf, cap + mul);
 break;
 case VIRTIO_PCI_CAP_ISR_CFG:
 vp_cap = &vp->isr;
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index f2ae5b9..3054a13 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -125,6 +125,7 @@ struct vp_cap {
 struct vp_device {
 unsigned int ioaddr;
 struct vp_cap common, notify, isr, device, legacy;
+u32 notify_off_multiplier;
 u8 use_modern;
 };
 
@@ -233,11 +234,6 @@ void vp_set_status(struct vp_device *vp, u8 status);
 u8 vp_get_isr(struct vp_device *vp);
 void vp_reset(struct vp_device *vp);
 
-static inline void vp_notify(struct vp_device *vp, int queue_index)
-{
-outw(queue_index, GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_QUEUE_NOTIFY);
-}
-
 static inline void vp_del_vq(struct vp_device *vp, int queue_index)
 {
int ioaddr = GET_LOWFLAT(vp->ioaddr);
@@ -252,6 +248,7 @@ static inline void vp_del_vq(struct vp_device *vp, int 
queue_index)
 struct pci_device;
 struct vring_virtqueue;
 void vp_init_simple(struct vp_device *vp, struct pci_device *pci);
+void vp_notify(struct vp_device *vp, struct vring_virtqueue *vq);
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq);
 #endif /* _VIRTIO_PCI_H_ */
diff --git a/src/hw/virtio-ring.c b/src/hw/virtio-ring.c
index 5c6a32e..6c86c38 100644
--- a/src/hw/virtio-ring.c
+++ b/src/hw/virtio-ring.c
@@ -145,5 +145,5 @@ void vring_kick(struct vp_device *vp, struct 
vring_virtqueue *vq, int num_added)
 smp_wmb();
 SET_LOWFLAT(avail->idx, GET_LOWFLAT(avail->idx) + num_added);
 
-vp_notify(vp, GET_LOWFLAT(vq->queue_index));
+vp_notify(vp, vq);
 }
diff --git a/src/hw/virtio-ring.h b/src/hw/virtio-ring.h
index 553a508..7df9004 100644
--- a/src/hw/virtio-ring.h
+++ b/src/hw/virtio-ring.h
@@ -88,6 +88,7 @@ struct vring_virtqueue {
u16 vdata[MAX_QUEUE_NUM];
/* PCI */
int queue_index;
+   int queue_notify_off;
 };
 
 struct vring_list {
-- 
1.8.3.1

[Qemu-devel] [PATCH 05/16] acpi: add aml_create_field

2015-07-01 Thread Xiao Guangrong

Implement the CreateField term, which is used by the NVDIMM _DSM method in
a later patch

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 14 ++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a526eed..debdad2 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1151,6 +1151,20 @@ Aml *aml_sizeof(Aml *arg)
 return var;
 }
 
+/* ACPI 6.0: 20.2.5.2 Named Objects Encoding: DefCreateField */
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
+{
+Aml *var = aml_alloc();
+
+build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+build_append_byte(var->buf, 0x13); /* CreateFieldOp */
+aml_append(var, srcbuf);
+aml_append(var, index);
+aml_append(var, len);
+build_append_namestring(var->buf, "%s", name);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 6b591ab..d4dbd44 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -277,6 +277,7 @@ Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
2.1.0

[Qemu-devel] [PATCH 09/16] nvdimm: build ACPI NFIT table

2015-07-01 Thread Xiao Guangrong

NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)

Currently, we only support PMEM mode. Each device has 3 tables:
- SPA table, which defines the PMEM region info

- MEM DEV table, which has the @handle used to associate the table with
  the ACPI NVDIMM device we will introduce in a later patch.
  We can also safely ignore the memory device's interleave, since the real
  nvdimm hardware access is hidden behind the host

- DCR table, which defines the Vendor ID used to associate the device with
  the vendor's nvdimm driver. Since we only implement PMEM mode this time,
  the Command window and Data window are not needed

Signed-off-by: Xiao Guangrong 
---
 hw/i386/acpi-build.c   |   3 +
 hw/mem/pc-nvdimm.c | 286 +
 include/hw/mem/pc-nvdimm.h |   8 ++
 3 files changed, 297 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6a1ab09..80c21be 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -39,6 +39,7 @@
 #include "hw/loader.h"
 #include "hw/isa/isa.h"
 #include "hw/acpi/memory_hotplug.h"
+#include "hw/mem/pc-nvdimm.h"
 #include "sysemu/tpm.h"
 #include "hw/acpi/tpm.h"
 #include "sysemu/tpm_backend.h"
@@ -1741,6 +1742,8 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 build_dmar_q35(tables_blob, tables->linker);
 }
 
+pc_nvdimm_build_nfit_table(table_offsets, tables_blob, tables->linker);
+
 /* Add tables supplied by user (if any) */
 for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
 unsigned len = acpi_table_len(u);
diff --git a/hw/mem/pc-nvdimm.c b/hw/mem/pc-nvdimm.c
index 9531935..e7cff29 100644
--- a/hw/mem/pc-nvdimm.c
+++ b/hw/mem/pc-nvdimm.c
@@ -27,10 +27,12 @@
 #include 
 
 #include "exec/address-spaces.h"
+#include "hw/acpi/aml-build.h"
 #include "hw/mem/pc-nvdimm.h"
 
 #define PAGE_SIZE   (1UL << 12)
 
+#define MAX_NVDIMM_NUMBER   (10)
 #define MIN_CONFIG_DATA_SIZE(128 << 10)
 
 static struct nvdimms_info {
@@ -65,6 +67,290 @@ static uint32_t new_device_index(void)
 return nvdimms_info.device_index++;
 }
 
+static int pc_nvdimm_built_list(Object *obj, void *opaque)
+{
+GSList **list = opaque;
+
+if (object_dynamic_cast(obj, TYPE_PC_NVDIMM)) {
+PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
+
+/* only realized NVDIMMs matter */
+if (memory_region_size(&nvdimm->mr)) {
+*list = g_slist_append(*list, nvdimm);
+}
+}
+
+object_child_foreach(obj, pc_nvdimm_built_list, opaque);
+return 0;
+}
+
+static GSList *get_nvdimm_built_list(void)
+{
+GSList *list = NULL;
+
+object_child_foreach(qdev_get_machine(), pc_nvdimm_built_list, &list);
+return list;
+}
+
+static int get_nvdimm_device_number(GSList *list)
+{
+int nr = 0;
+
+for (; list; list = list->next) {
+nr++;
+}
+
+return nr;
+}
+
+static uint32_t nvdimm_index_to_sn(int index)
+{
+return 0x123456 + index;
+}
+
+static uint32_t nvdimm_index_to_handle(int index)
+{
+return index + 1;
+}
+
+typedef struct {
+uint8_t b[16];
+} uuid_le;
+
+#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)   \
+((uuid_le) \
+{ { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
+(b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,  \
+(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } })
+
+static void nfit_spa_uuid_pm(void *uuid)
+{
+uuid_le uuid_pm = UUID_LE(0x66f0d379, 0xb4f3, 0x4074, 0xac, 0x43, 0x0d,
+  0x33, 0x18, 0xb7, 0x8c, 0xdb);
+memcpy(uuid, &uuid_pm, sizeof(uuid_pm));
+}
+
+enum {
+NFIT_TABLE_SPA = 0,
+NFIT_TABLE_MEM = 1,
+NFIT_TABLE_IDT = 2,
+NFIT_TABLE_SMBIOS = 3,
+NFIT_TABLE_DCR = 4,
+NFIT_TABLE_BDW = 5,
+NFIT_TABLE_FLUSH = 6,
+};
+
+enum {
+EFI_MEMORY_UC = 0x1ULL,
+EFI_MEMORY_WC = 0x2ULL,
+EFI_MEMORY_WT = 0x4ULL,
+EFI_MEMORY_WB = 0x8ULL,
+EFI_MEMORY_UCE = 0x10ULL,
+EFI_MEMORY_WP = 0x1000ULL,
+EFI_MEMORY_RP = 0x2000ULL,
+EFI_MEMORY_XP = 0x4000ULL,
+EFI_MEMORY_NV = 0x8000ULL,
+EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
+};
+
+/*
+ * struct nfit - Nvdimm Firmware Interface Table
+ * @signature: "NFIT"
+ */
+struct nfit {
+ACPI_TABLE_HEADER_DEF
+uint32_t reserved;
+} QEMU_PACKED;
+
+/*
+ * struct nfit_spa - System Physical Address Range Structure
+ */
+struct nfit_spa {
+uint16_t type;
+uint16_t length;
+uint16_t spa_index;
+uint16_t flags;
+uint32_t reserved;
+uint32_t proximity_domain;
+uint8_t type_uuid[16];
+uint64_t spa_base;
+uint64_t spa_length;
+uint64_t mem_attr;
+} QEMU_PACKED;
+
+/*
+ * struct nfit_memdev - Memory Device to SPA Map Structure
+ */
+struct nfit_memdev {
+uint16_t type;
+uint16_t length;
+uint32_t nfit_handle;
+uint16_t phys_id;
+uint16_t region_id;
+uint16_t spa_index;
+u

[Qemu-devel] [PATCH 01/16] acpi: allow aml_operation_region() working on 64 bit offset

2015-07-01 Thread Xiao Guangrong

Currently, the offset in OperationRegion is limited to 32 bits. Extend it
to 64 bits so that we can switch SSDT to 64 bit in a later patch
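
As a rough illustration of why the wider type matters (a sketch, not code
from this patch; both helper names are hypothetical), an address above 4G
silently loses its upper bits when it is passed through a 32-bit parameter:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the value a uint32_t offset parameter would receive. */
static uint32_t offset_as_u32(uint64_t addr)
{
    return (uint32_t)addr;   /* the upper 32 bits are discarded */
}

/* Hypothetical: true if addr survives the 32-bit round trip unchanged. */
static bool offset_fits_u32(uint64_t addr)
{
    return addr <= UINT32_MAX;
}
```

For example, 4G + 0x1000 (0x100001000) would arrive as 0x1000, so the
OperationRegion would describe low memory instead of the intended range.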

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 2 +-
 include/hw/acpi/aml-build.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 0d4b324..02f9e3d 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -752,7 +752,7 @@ Aml *aml_package(uint8_t num_elements)
 
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefOpRegion */
 Aml *aml_operation_region(const char *name, AmlRegionSpace rs,
-  uint32_t offset, uint32_t len)
+  uint64_t offset, uint32_t len)
 {
 Aml *var = aml_alloc();
 build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index e3afa13..996ac5b 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -222,7 +222,7 @@ Aml *aml_interrupt(AmlConsumerAndProducer con_and_pro,
 Aml *aml_io(AmlIODecode dec, uint16_t min_base, uint16_t max_base,
 uint8_t aln, uint8_t len);
 Aml *aml_operation_region(const char *name, AmlRegionSpace rs,
-  uint32_t offset, uint32_t len);
+  uint64_t offset, uint32_t len);
 Aml *aml_irq_no_flags(uint8_t irq);
 Aml *aml_named_field(const char *name, unsigned length);
 Aml *aml_reserved_field(unsigned length);
-- 
2.1.0

[Qemu-devel] [PATCH v3 18/25] virtio-blk: fix initialization for version 1.0

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-blk.c | 84 +
 src/hw/virtio-pci.h | 13 -
 2 files changed, 72 insertions(+), 25 deletions(-)

diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index 8378a34..703b147 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -102,6 +102,7 @@ static void
 init_virtio_blk(struct pci_device *pci)
 {
 u16 bdf = pci->bdf;
+u8 status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
 dprintf(1, "found virtio-blk at %x:%x\n", pci_bdf_to_bus(bdf),
 pci_bdf_to_dev(bdf));
 struct virtiodrive_s *vdrive = malloc_fseg(sizeof(*vdrive));
@@ -120,35 +121,82 @@ init_virtio_blk(struct pci_device *pci)
 goto fail;
 }
 
-struct virtio_blk_config cfg;
-vp_get(&vdrive->vp, 0, &cfg, sizeof(cfg));
+if (vdrive->vp.use_modern) {
+struct vp_device *vp = &vdrive->vp;
+u64 features = vp_get_features(vp);
+u64 version1 = 1ull << VIRTIO_F_VERSION_1;
+u64 blk_size = 1ull << VIRTIO_BLK_F_BLK_SIZE;
+if (!(features & version1)) {
+dprintf(1, "modern device without virtio_1 feature bit: %x:%x\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
+goto fail;
+}
 
-u64 f = vp_get_features(&vdrive->vp);
-vdrive->drive.blksize = (f & (1 << VIRTIO_BLK_F_BLK_SIZE)) ?
-cfg.blk_size : DISK_SECTOR_SIZE;
+features = features & (version1 | blk_size);
+vp_set_features(vp, features);
+status |= VIRTIO_CONFIG_S_FEATURES_OK;
+vp_set_status(vp, status);
+if (!(vp_get_status(vp) & VIRTIO_CONFIG_S_FEATURES_OK)) {
+dprintf(1, "device didn't accept features: %x:%x\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
+goto fail;
+}
 
-vdrive->drive.sectors = cfg.capacity;
-dprintf(3, "virtio-blk %x:%x blksize=%d sectors=%u\n",
-pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf),
-vdrive->drive.blksize, (u32)vdrive->drive.sectors);
+vdrive->drive.sectors =
+vp_read(&vp->device, struct virtio_blk_config, capacity);
+if (features & blk_size) {
+vdrive->drive.blksize =
+vp_read(&vp->device, struct virtio_blk_config, blk_size);
+} else {
+vdrive->drive.blksize = DISK_SECTOR_SIZE;
+}
+if (vdrive->drive.blksize != DISK_SECTOR_SIZE) {
+dprintf(1, "virtio-blk %x:%x block size %d is unsupported\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf),
+vdrive->drive.blksize);
+goto fail;
+}
+dprintf(3, "virtio-blk %x:%x blksize=%d sectors=%u\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf),
+vdrive->drive.blksize, (u32)vdrive->drive.sectors);
+
+vdrive->drive.pchs.cylinder =
+vp_read(&vp->device, struct virtio_blk_config, cylinders);
+vdrive->drive.pchs.head =
+vp_read(&vp->device, struct virtio_blk_config, heads);
+vdrive->drive.pchs.sector =
+vp_read(&vp->device, struct virtio_blk_config, sectors);
+} else {
+struct virtio_blk_config cfg;
+vp_get_legacy(&vdrive->vp, 0, &cfg, sizeof(cfg));
 
-if (vdrive->drive.blksize != DISK_SECTOR_SIZE) {
-dprintf(1, "virtio-blk %x:%x block size %d is unsupported\n",
+u64 f = vp_get_features(&vdrive->vp);
+vdrive->drive.blksize = (f & (1 << VIRTIO_BLK_F_BLK_SIZE)) ?
+cfg.blk_size : DISK_SECTOR_SIZE;
+
+vdrive->drive.sectors = cfg.capacity;
+dprintf(3, "virtio-blk %x:%x blksize=%d sectors=%u\n",
 pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf),
-vdrive->drive.blksize);
-goto fail;
+vdrive->drive.blksize, (u32)vdrive->drive.sectors);
+
+if (vdrive->drive.blksize != DISK_SECTOR_SIZE) {
+dprintf(1, "virtio-blk %x:%x block size %d is unsupported\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf),
+vdrive->drive.blksize);
+goto fail;
+}
+vdrive->drive.pchs.cylinder = cfg.cylinders;
+vdrive->drive.pchs.head = cfg.heads;
+vdrive->drive.pchs.sector = cfg.sectors;
 }
 
-vdrive->drive.pchs.cylinder = cfg.cylinders;
-vdrive->drive.pchs.head = cfg.heads;
-vdrive->drive.pchs.sector = cfg.sectors;
 char *desc = znprintf(MAXDESCSIZE, "Virtio disk PCI:%x:%x",
   pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
 
 boot_add_hd(&vdrive->drive, desc, bootprio_find_pci_device(pci));
 
-vp_set_status(&vdrive->vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-  VIRTIO_CONFIG_S_DRIVER | VIRTIO_CONFIG_S_DRIVER_OK);
+status |= VIRTIO_CONFIG_S_DRIVER_OK;
+vp_set_status(&vdrive->vp, status);
 return;
 
 fail:
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.

[Qemu-devel] [PATCH v3 25/25] virtio-pci: use high memory for rings

2015-07-01 Thread Gerd Hoffmann

That way we should be able to manage *a lot* more devices.

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 8e7ea46..6df5194 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -121,7 +121,7 @@ int vp_find_vq(struct vp_device *vp, int queue_index,
u16 num;
 
ASSERT32FLAT();
-   struct vring_virtqueue *vq = *p_vq = memalign_low(PAGE_SIZE, sizeof(*vq));
+   struct vring_virtqueue *vq = *p_vq = memalign_high(PAGE_SIZE, sizeof(*vq));
if (!vq) {
warn_noalloc();
goto fail;
-- 
1.8.3.1

[Qemu-devel] [PATCH 03/16] acpi: add aml_derefof

2015-07-01 Thread Xiao Guangrong

Implement the DeRefOf term, which is used by the NVDIMM _DSM method in a
later patch

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 8 
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 02f9e3d..9e89efc 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1135,6 +1135,14 @@ Aml *aml_unicode(const char *str)
 return var;
 }
 
+/* ACPI 6.0: 20.2.5.4 Type 2 Opcodes Encoding: DefDerefOf */
+Aml *aml_derefof(Aml *arg)
+{
+Aml *var = aml_opcode(0x83 /* DerefOfOp */);
+aml_append(var, arg);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 996ac5b..21dc5e9 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -275,6 +275,7 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const 
char *name);
 Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
+Aml *aml_derefof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
2.1.0

[Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-01 Thread Xiao Guangrong

== Background ==
NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be supported
on Intel's platform. NVDIMMs are discovered via ACPI and configured by the
_DSM method of the NVDIMM device in ACPI. Supporting documents can be
found at:
ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

Currently, the NVDIMM driver has been merged into the upstream Linux kernel,
and this patchset tries to enable it in virtualization

== Design ==
NVDIMM supports two access modes. One is PMEM, which maps NVDIMM into the
CPU's address space so that the CPU can access it directly as normal memory.
The other is BLK, which is used as a block device to reduce the amount of
CPU address space occupied

BLK mode accesses NVDIMM via the Command Register window and the Data
Register window. BLK virtualization has high overhead since each sector
access causes at least two VM-EXITs, so we currently only implement vPMEM
in this patchset

--- vPMEM design ---
We introduce a new device named "pc-nvdimm". It has a parameter, file, which
is the file backing the memory passed to the guest. The file can be a regular
file or a block device. Any file can be used for testing or emulation; however,
in the real world, the files passed to the guest are:
- a regular file in a DAX-enabled filesystem created on an NVDIMM device
  on the host
- the raw PMEM device on the host, e.g. /dev/pmem0
Memory accesses to addresses created by mmap on these kinds of files
reach the NVDIMM device on the host directly.

--- vConfigure data area design ---
Each NVDIMM device has a configuration data area which is used to store label
namespace data. In order to emulate this area, we divide the file into two
parts:
- the first part is [0, size - 128K), which is used as PMEM
- the last 128K of the file, which is used as the Config Data Area
This way the label namespace data persists across power loss or system
failure.
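
The split described above can be sketched in C as follows. This is only an
illustration of the arithmetic, not code from the patchset: the struct and
function names are hypothetical, and only the 128K figure comes from the
description.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the emulated Config Data Area at the end of the backing file
 * (128K, as stated in the cover letter). */
#define CONFIG_DATA_SIZE (128ULL * 1024)

/* Hypothetical layout descriptor; not a structure from the patchset. */
struct nvdimm_layout {
    uint64_t pmem_offset;   /* PMEM starts at the beginning of the file */
    uint64_t pmem_size;     /* file size minus the config data area */
    uint64_t config_offset; /* first byte of the last 128K */
    uint64_t config_size;   /* always CONFIG_DATA_SIZE */
};

/* Split a backing file of file_size bytes into the PMEM part and the
 * trailing Config Data Area. */
static struct nvdimm_layout nvdimm_split(uint64_t file_size)
{
    struct nvdimm_layout l;

    /* The file must be large enough to leave some PMEM. */
    assert(file_size > CONFIG_DATA_SIZE);

    l.pmem_offset = 0;
    l.pmem_size = file_size - CONFIG_DATA_SIZE;
    l.config_offset = l.pmem_size;
    l.config_size = CONFIG_DATA_SIZE;
    return l;
}
```

For the 10G file used in the test section, this would give a PMEM region of
10G - 128K followed immediately by the 128K config area.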

--- _DSM method design ---
_DSM in ACPI is used to configure NVDIMM. Currently we only allow access to
label namespace data, i.e. Get Namespace Label Size (Function Index 4),
Get Namespace Label Data (Function Index 5) and Set Namespace Label Data
(Function Index 6).

_DSM uses two pages to transfer data between ACPI and Qemu. The first page
is RAM-based; it saves the input info of the _DSM method, and Qemu reuses it
to store the output info. The other page is MMIO-based; ACPI writes data to
this page to transfer control to Qemu.

We use the address region above 4G to map these pages because there is a huge
amount of free space above 4G, and it avoids address overlap with PCI and other
components with reserved addresses (e.g. HPET). This is also the reason we
chose MMIO notification instead of PIO.

== Test ==
On the host:
1) create a memory backing file, e.g. # dd if=/dev/zero of=/tmp/nvdimm bs=1G count=10
2) append '-device pc-nvdimm,file=/tmp/nvdimm' to the Qemu command line

In the guest, download the latest upstream kernel (4.2 merge window) and enable
ACPI_NFIT, LIBNVDIMM and BLK_DEV_PMEM.
1) insmod drivers/nvdimm/libnvdimm.ko
2) insmod drivers/acpi/nfit.ko
3) insmod drivers/nvdimm/nd_btt.ko
4) insmod drivers/nvdimm/nd_pmem.ko
The whole nvdimm device is then used as a single namespace and /dev/pmem0
appears. You can do anything with /dev/pmem0, including DAX access.

Currently the Linux NVDIMM driver does not support namespace operations on this
kind of PMEM; apply the changes below to support dynamic namespaces:

@@ -798,7 +823,8 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *a
continue;
}
 
-   if (nfit_mem->bdw && nfit_mem->memdev_pmem)
+   //if (nfit_mem->bdw && nfit_mem->memdev_pmem)
+   if (nfit_mem->memdev_pmem)
flags |= NDD_ALIASING;

You can then add another NVDIMM device to the guest and run:
# cd /sys/bus/nd/devices/
# cd namespace1.0/
# echo `uuidgen` > uuid
# echo `expr 1024 \* 1024 \* 128` > size
then reload nd_pmem.ko

/dev/pmem1 should then appear.

== TODO ==
1) NVDIMM NUMA support
2) NVDIMM hotplug support

Xiao Guangrong (16):
  acpi: allow aml_operation_region() working on 64 bit offset
  i386/acpi-build: allow SSDT to operate on 64 bit
  acpi: add aml_derefof
  acpi: add aml_sizeof
  acpi: add aml_create_field
  pc: implement NVDIMM device abstract
  nvdimm: reserve address range for NVDIMM
  nvdimm: init backend memory mapping and config data area
  nvdimm: build ACPI NFIT table
  nvdimm: init the address region used by _DSM method
  nvdimm: build ACPI nvdimm devices
  nvdimm: save arg3 for NVDIMM device _DSM method
  nvdimm: support NFIT_CMD_IMPLEMENTED function
  nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function
  nvdimm: support NFIT_CMD_GET_CONFIG_DATA
  nvdimm: support NFIT_CMD_SET_CON

[Qemu-devel] [PATCH v3 00/25] virtio: add version 1.0 support, move to 32bit

2015-07-01 Thread Gerd Hoffmann

  Hi,

This patch series adds virtio 1.0 support to the virtio blk and scsi
drivers in seabios.  With this series applied seabios happily boots
in virtio 1.0 mode from both transitional and modern devices.  This
series also moves all virtio code to 32bit.

Tested with Fedora 22 guest, booting from virtio-scsi cdrom (live iso),
virtio-scsi disk and virtio-blk disk.

The patches are also available in the git repository at:
  git://git.kraxel.org/seabios virtio

v3 changes:
  * change vp_device allocation.
  * fix capability detection.
  * add some cleanup patches (drop MAKESEGMENT and GET_* macros).
  * allocate virt queues in high memory.

v2 changes:
  * rename vp_modern_{read,write} to vp_{read,write}
  * switch legacy virtio code to vp_{read,write} too.
  * make vp_read return the values.

Gerd Hoffmann (25):
  pci: allow to loop over capabilities
  virtio: run drivers in 32bit mode
  virtio: add struct vp_device
  virtio: pass struct pci_device to vp_init_simple
  virtio: add version 1.0 structs and #defines
  virtio: add version 0.9.5 struct
  virtio: find version 1.0 virtio capabilities
  virtio: create vp_cap struct for legacy bar
  virtio: add read/write functions and macros
  virtio: make features 64bit, support version 1.0 features
  virtio: add version 1.0 support to vp_{get,set}_status
  virtio: add version 1.0 support to vp_get_isr
  virtio: add version 1.0 support to vp_reset
  virtio: add version 1.0 support to vp_notify
  virtio: remove unused vp_del_vq
  virtio: add version 1.0 support to vp_find_vq
  virtio-scsi: fix initialization for version 1.0
  virtio-blk: fix initialization for version 1.0
  virtio: use version 1.0 if available (flip the big switch)
  virtio: also probe version 1.0 pci ids
  virtio: legacy cleanup
  virtio-blk: 32bit cleanup
  virtio-scsi: 32bit cleanup
  virtio-ring: 32bit cleanup
  virtio-pci: use high memory for rings

 Makefile |   2 +-
 src/block.c  |   8 +-
 src/fw/pciinit.c |   4 +-
 src/hw/blockcmd.c|   5 +-
 src/hw/pci.c |  11 ++-
 src/hw/pci.h |   2 +-
 src/hw/pci_ids.h |   8 +-
 src/hw/virtio-blk.c  | 112 ---
 src/hw/virtio-pci.c  | 228 +-
 src/hw/virtio-pci.h  | 252 ---
 src/hw/virtio-ring.c |  65 +++--
 src/hw/virtio-ring.h |   9 +-
 src/hw/virtio-scsi.c |  75 ++-
 13 files changed, 582 insertions(+), 199 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH v3 16/25] virtio: add version 1.0 support to vp_find_vq

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 58 +++--
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 240cd2f..dcbf6d7 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -118,7 +118,6 @@ void vp_notify(struct vp_device *vp, struct vring_virtqueue 
*vq)
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
-   int ioaddr = GET_LOWFLAT(vp->ioaddr);
u16 num;
 
ASSERT32FLAT();
@@ -129,34 +128,49 @@ int vp_find_vq(struct vp_device *vp, int queue_index,
}
memset(vq, 0, sizeof(*vq));
 
+
/* select the queue */
-
-   outw(queue_index, ioaddr + VIRTIO_PCI_QUEUE_SEL);
+   if (vp->use_modern) {
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_select, queue_index);
+   } else {
+   vp_write(&vp->legacy, virtio_pci_legacy, queue_sel, queue_index);
+   }
 
/* check if the queue is available */
-
-   num = inw(ioaddr + VIRTIO_PCI_QUEUE_NUM);
+   if (vp->use_modern) {
+   num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
+   if (num > MAX_QUEUE_NUM) {
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_size,
+MAX_QUEUE_NUM);
+   num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
+   }
+   } else {
+   num = vp_read(&vp->legacy, virtio_pci_legacy, queue_num);
+   }
if (!num) {
dprintf(1, "ERROR: queue size is 0\n");
goto fail;
}
-
if (num > MAX_QUEUE_NUM) {
dprintf(1, "ERROR: queue size %d > %d\n", num, MAX_QUEUE_NUM);
goto fail;
}
 
/* check if the queue is already active */
-
-   if (inl(ioaddr + VIRTIO_PCI_QUEUE_PFN)) {
-   dprintf(1, "ERROR: queue already active\n");
-   goto fail;
+   if (vp->use_modern) {
+   if (vp_read(&vp->common, virtio_pci_common_cfg, queue_enable)) {
+   dprintf(1, "ERROR: queue already active\n");
+   goto fail;
+   }
+   } else {
+   if (vp_read(&vp->legacy, virtio_pci_legacy, queue_pfn)) {
+   dprintf(1, "ERROR: queue already active\n");
+   goto fail;
+   }
}
-
vq->queue_index = queue_index;
 
/* initialize the queue */
-
struct vring * vr = &vq->vring;
vring_init(vr, num, (unsigned char*)&vq->queue);
 
@@ -165,9 +179,23 @@ int vp_find_vq(struct vp_device *vp, int queue_index,
 * NOTE: vr->desc is initialized by vring_init()
 */
 
-   outl((unsigned long)virt_to_phys(vr->desc) >> PAGE_SHIFT,
-ioaddr + VIRTIO_PCI_QUEUE_PFN);
-
+   if (vp->use_modern) {
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_desc_lo,
+(unsigned long)virt_to_phys(vr->desc));
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_desc_hi, 0);
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_avail_lo,
+(unsigned long)virt_to_phys(vr->avail));
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_avail_hi, 0);
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_used_lo,
+(unsigned long)virt_to_phys(vr->used));
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_used_hi, 0);
+   vp_write(&vp->common, virtio_pci_common_cfg, queue_enable, 1);
+   vq->queue_notify_off = vp_read(&vp->common, virtio_pci_common_cfg,
+  queue_notify_off);
+   } else {
+   vp_write(&vp->legacy, virtio_pci_legacy, queue_pfn,
+(unsigned long)virt_to_phys(vr->desc) >> PAGE_SHIFT);
+   }
return num;
 
 fail:
-- 
1.8.3.1
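
The difference between the two queue-address encodings in the patch above can
be sketched as below. This is an illustration only: the helper names are
hypothetical, and the page shift of 12 (4 KiB pages) is an assumption about
PAGE_SHIFT, which is not shown in the diff. The legacy interface takes a single
page frame number for the whole ring, while the modern interface takes each
64-bit address as separate low and high 32-bit halves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the legacy interface uses 4 KiB pages (shift of 12). */
#define LEGACY_PAGE_SHIFT 12

/* Legacy (0.9.5): the ring location is written as one 32-bit page frame
 * number, so the descriptor table must be page aligned. */
static uint32_t legacy_queue_pfn(uint64_t desc_phys)
{
    assert((desc_phys & ((1ULL << LEGACY_PAGE_SHIFT) - 1)) == 0);
    return (uint32_t)(desc_phys >> LEGACY_PAGE_SHIFT);
}

/* Modern (1.0): each of desc/avail/used is written as two 32-bit halves. */
static uint32_t addr_lo(uint64_t phys)
{
    return (uint32_t)(phys & 0xffffffffULL);
}

static uint32_t addr_hi(uint64_t phys)
{
    return (uint32_t)(phys >> 32);
}
```

The patch writes 0 to every *_hi field; with addresses below 4G, addr_hi() in
this sketch yields the same value.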

Re: [Qemu-devel] [PATCH] block/mirror: limit qiov to IOV_MAX elements

2015-07-01 Thread Paolo Bonzini



On 01/07/2015 16:45, Stefan Hajnoczi wrote:
> If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2)
> EINVAL failures may be encountered.
> 
> It is possible to trigger this by setting granularity to a low value
> like 8192.
> 
> This patch stops appending chunks once IOV_MAX is reached.
> 
> The spurious EINVAL failure can be reproduced with a qcow2 image file
> and the following QMP invocation:
> 
>   qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
>   granularity=8192, sync='full', mode='absolute-paths',
>   format='raw')
> 
> While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
> bs=4k.
> 
> Cc: Jeff Cody 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/mirror.c | 4 
>  trace-events   | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 048e452..985ad00 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -241,6 +241,10 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
>  break;
>  }
> +if (IOV_MAX < nb_chunks + added_chunks) {

No Yoda conditions... apart from that,

Reviewed-by: Paolo Bonzini 

> +trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
> +break;
> +}
>  
>  /* We have enough free space to copy these sectors.  */
>  bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
> diff --git a/trace-events b/trace-events
> index 52b7efa..943cd0c 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int 
> in_flight) "s %p dirt
>  mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p 
> sector_num %"PRId64" in_flight %d"
>  mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested 
> chunks %d in_flight %d"
>  mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested 
> chunks %d in_flight %d"
> +mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p 
> requested chunks %d added_chunks %d"
>  
>  # block/backup.c
>  backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int 
> nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"
>

[Qemu-devel] [PATCH] block/mirror: limit qiov to IOV_MAX elements

2015-07-01 Thread Stefan Hajnoczi

If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2)
EINVAL failures may be encountered.

It is possible to trigger this by setting granularity to a low value
like 8192.

This patch stops appending chunks once IOV_MAX is reached.

The spurious EINVAL failure can be reproduced with a qcow2 image file
and the following QMP invocation:

  qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
  granularity=8192, sync='full', mode='absolute-paths',
  format='raw')

While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
bs=4k.

Cc: Jeff Cody 
Signed-off-by: Stefan Hajnoczi 
---
 block/mirror.c | 4 
 trace-events   | 1 +
 2 files changed, 5 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 048e452..985ad00 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -241,6 +241,10 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
 break;
 }
+if (IOV_MAX < nb_chunks + added_chunks) {
+trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
+break;
+}
 
 /* We have enough free space to copy these sectors.  */
 bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
diff --git a/trace-events b/trace-events
index 52b7efa..943cd0c 100644
--- a/trace-events
+++ b/trace-events
@@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int 
in_flight) "s %p dirt
 mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p 
sector_num %"PRId64" in_flight %d"
 mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested 
chunks %d in_flight %d"
 mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested 
chunks %d in_flight %d"
+mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p requested 
chunks %d added_chunks %d"
 
 # block/backup.c
 backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int 
nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"
-- 
2.4.3
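
The effect of the new check can be sketched outside of QEMU as follows. This
is a simplified model, not the mirror code: the function name and the array of
candidate chunk groups are hypothetical, and the limit is passed in as a
parameter rather than taken from the platform's IOV_MAX. The comparison is
written with the variable expression on the left, as requested in review.

```c
#include <assert.h>

/* Accumulate groups of chunks into one request, stopping before the total
 * would exceed iov_max. Returns the number of chunks in the request. */
static int count_chunks(const int *added, int n_groups, int iov_max)
{
    int nb_chunks = 0;

    for (int i = 0; i < n_groups; i++) {
        assert(added[i] > 0);
        /* Same test as the patch: stop appending once the limit would
         * be exceeded, so the I/O vector never has more than iov_max
         * elements. */
        if (nb_chunks + added[i] > iov_max) {
            break;
        }
        nb_chunks += added[i];
    }
    return nb_chunks;
}
```

With a limit of 1024, groups of 400, 400 and 400 chunks would produce a request
of 800 chunks; the third group is left for the next iteration.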

[Qemu-devel] [PATCH v3 19/25] virtio: use version 1.0 if available (flip the big switch)

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index dcbf6d7..5cb0e73 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -260,16 +260,19 @@ void vp_init_simple(struct vp_device *vp, struct 
pci_device *pci)
 }
 
 if (vp->common.cap && vp->notify.cap && vp->isr.cap && vp->device.cap) {
-dprintf(1, "pci dev %x:%x supports virtio 1.0\n",
+dprintf(1, "pci dev %x:%x using modern (1.0) virtio mode\n",
 pci_bdf_to_bus(pci->bdf), pci_bdf_to_dev(pci->bdf));
+vp->use_modern = 1;
+} else {
+dprintf(1, "pci dev %x:%x using legacy (0.9.5) virtio mode\n",
+pci_bdf_to_bus(pci->bdf), pci_bdf_to_dev(pci->bdf));
+vp->legacy.bar = 0;
+vp->legacy.addr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
+PCI_BASE_ADDRESS_IO_MASK;
+vp->legacy.is_io = 1;
+vp->ioaddr = vp->legacy.addr; /* temporary */
 }
 
-vp->legacy.bar = 0;
-vp->legacy.addr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
-PCI_BASE_ADDRESS_IO_MASK;
-vp->legacy.is_io = 1;
-vp->ioaddr = vp->legacy.addr; /* temporary */
-
 vp_reset(vp);
 pci_config_maskw(pci->bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
 vp_set_status(vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 22/25] virtio-blk: 32bit cleanup

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-blk.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index c3052bb..29bc4a5 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -33,7 +33,7 @@ virtio_blk_op(struct disk_op_s *op, int write)
 {
 struct virtiodrive_s *vdrive_gf =
 container_of(op->drive_gf, struct virtiodrive_s, drive);
-struct vring_virtqueue *vq = GET_GLOBALFLAT(vdrive_gf->vq);
+struct vring_virtqueue *vq = vdrive_gf->vq;
 struct virtio_blk_outhdr hdr = {
 .type = write ? VIRTIO_BLK_T_OUT : VIRTIO_BLK_T_IN,
 .ioprio = 0,
@@ -42,15 +42,15 @@ virtio_blk_op(struct disk_op_s *op, int write)
 u8 status = VIRTIO_BLK_S_UNSUPP;
 struct vring_list sg[] = {
 {
-.addr   = MAKE_FLATPTR(GET_SEG(SS), &hdr),
+.addr   = (void*)(&hdr),
 .length = sizeof(hdr),
 },
 {
 .addr   = op->buf_fl,
-.length = GET_GLOBALFLAT(vdrive_gf->drive.blksize) * op->count,
+.length = vdrive_gf->drive.blksize * op->count,
 },
 {
-.addr   = MAKE_FLATPTR(GET_SEG(SS), &status),
+.addr   = (void*)(&status),
 .length = sizeof(status),
 },
 };
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 07/25] virtio: find version 1.0 virtio capabilities

2015-07-01 Thread Gerd Hoffmann

virtio 1.0 specifies the location of the various virtio regions
using pci capabilities.  Look them up and store the results.

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 56 +
 src/hw/virtio-pci.h |  8 
 2 files changed, 64 insertions(+)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 9428d04..58f3d39 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -87,6 +87,62 @@ fail:
 
 void vp_init_simple(struct vp_device *vp, struct pci_device *pci)
 {
+u8 cap = pci_find_capability(pci, PCI_CAP_ID_VNDR, 0);
+struct vp_cap *vp_cap;
+u32 addr, offset;
+u8 type;
+
+memset(vp, 0, sizeof(*vp));
+while (cap != 0) {
+type = pci_config_readb(pci->bdf, cap +
+offsetof(struct virtio_pci_cap, cfg_type));
+switch (type) {
+case VIRTIO_PCI_CAP_COMMON_CFG:
+vp_cap = &vp->common;
+break;
+case VIRTIO_PCI_CAP_NOTIFY_CFG:
+vp_cap = &vp->notify;
+break;
+case VIRTIO_PCI_CAP_ISR_CFG:
+vp_cap = &vp->isr;
+break;
+case VIRTIO_PCI_CAP_DEVICE_CFG:
+vp_cap = &vp->device;
+break;
+default:
+vp_cap = NULL;
+break;
+}
+if (vp_cap && !vp_cap->cap) {
+vp_cap->cap = cap;
+vp_cap->bar = pci_config_readb(pci->bdf, cap +
+   offsetof(struct virtio_pci_cap, 
bar));
+offset = pci_config_readl(pci->bdf, cap +
+  offsetof(struct virtio_pci_cap, offset));
+addr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0 + 4 * 
vp_cap->bar);
+if (addr & PCI_BASE_ADDRESS_SPACE_IO) {
+vp_cap->is_io = 1;
+addr &= PCI_BASE_ADDRESS_IO_MASK;
+} else {
+vp_cap->is_io = 0;
+addr &= PCI_BASE_ADDRESS_MEM_MASK;
+}
+vp_cap->addr = addr + offset;
+dprintf(3, "pci dev %x:%x virtio cap at 0x%x type %d "
+"bar %d at 0x%08x off +0x%04x [%s]\n",
+pci_bdf_to_bus(pci->bdf), pci_bdf_to_dev(pci->bdf),
+vp_cap->cap, type, vp_cap->bar, addr, offset,
+vp_cap->is_io ? "io" : "mmio");
+}
+
+cap = pci_find_capability(pci, PCI_CAP_ID_VNDR, cap);
+}
+
+if (vp->common.cap && vp->notify.cap && vp->isr.cap && vp->device.cap) {
+dprintf(1, "pci dev %x:%x supports virtio 1.0\n",
+pci_bdf_to_bus(pci->bdf), pci_bdf_to_dev(pci->bdf));
+}
+
 vp->ioaddr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
 PCI_BASE_ADDRESS_IO_MASK;
 
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 42e2b7f..467c02f 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -115,8 +115,16 @@ typedef struct virtio_pci_isr {
 
 /* --- driver structs --- */
 
+struct vp_cap {
+u32 addr;
+u8 cap;
+u8 bar;
+u8 is_io;
+};
+
 struct vp_device {
 unsigned int ioaddr;
+struct vp_cap common, notify, isr, device;
 };
 
 static inline u32 vp_get_features(struct vp_device *vp)
-- 
1.8.3.1
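
The capability lookup in the patch above can be illustrated with a standalone
sketch that walks a copy of PCI config space held in a byte array. It does not
use the SeaBIOS helpers: find_virtio_cap() and the macro names are
hypothetical, and the offsets (capability pointer at 0x34, vendor-specific
capability ID 0x09, cfg_type at byte 3 of the capability) are taken from the
PCI and virtio 1.0 specifications rather than from this diff.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_CAP_PTR      0x34  /* PCI capabilities pointer register */
#define CAP_ID_VNDR      0x09  /* vendor-specific capability ID */
#define CAP_OFF_NEXT     1     /* offset of the next-capability pointer */
#define CAP_OFF_CFG_TYPE 3     /* offset of virtio_pci_cap.cfg_type */

/* Walk the capability list in a 256-byte config space image and return the
 * offset of the first vendor-specific capability whose cfg_type matches,
 * or 0 if there is none. */
static uint8_t find_virtio_cap(const uint8_t *cfg, uint8_t cfg_type)
{
    uint8_t cap = cfg[CFG_CAP_PTR];
    int guard = 48; /* bound the walk in case the list loops */

    while (cap != 0 && guard-- > 0) {
        /* Capabilities live after the 64-byte header; treat anything
         * else as a malformed list and stop. */
        if (cap < 0x40 || cap > 0xf0) {
            break;
        }
        if (cfg[cap] == CAP_ID_VNDR &&
            cfg[cap + CAP_OFF_CFG_TYPE] == cfg_type) {
            return cap;
        }
        cap = cfg[cap + CAP_OFF_NEXT];
    }
    return 0;
}
```

Like the patch, the sketch keeps the first capability of each type; a device is
treated as virtio 1.0 capable only when the common, notify, isr and device
capabilities are all present.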

[Qemu-devel] [PATCH v3 20/25] virtio: also probe version 1.0 pci ids

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/pci_ids.h | 8 ++--
 src/hw/virtio-blk.c  | 5 +++--
 src/hw/virtio-scsi.c | 5 +++--
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h
index 1cd4f72..cdf9b3c 100644
--- a/src/hw/pci_ids.h
+++ b/src/hw/pci_ids.h
@@ -2616,8 +2616,12 @@
 #define PCI_DEVICE_ID_RME_DIGI32_8 0x9898
 
 #define PCI_VENDOR_ID_REDHAT_QUMRANET  0x1af4
-#define PCI_DEVICE_ID_VIRTIO_BLK   0x1001
-#define PCI_DEVICE_ID_VIRTIO_SCSI  0x1004
+/* virtio 0.9.5 ids (legacy/transitional devices) */
+#define PCI_DEVICE_ID_VIRTIO_BLK_090x1001
+#define PCI_DEVICE_ID_VIRTIO_SCSI_09   0x1004
+/* virtio 1.0 ids (modern devices) */
+#define PCI_DEVICE_ID_VIRTIO_BLK_100x1042
+#define PCI_DEVICE_ID_VIRTIO_SCSI_10   0x1048
 
 #define PCI_VENDOR_ID_VMWARE0x15ad
 #define PCI_DEVICE_ID_VMWARE_PVSCSI 0x07C0
diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index 703b147..c3052bb 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -216,8 +216,9 @@ virtio_blk_setup(void)
 
 struct pci_device *pci;
 foreachpci(pci) {
-if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET
-|| pci->device != PCI_DEVICE_ID_VIRTIO_BLK)
+if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET ||
+   (pci->device != PCI_DEVICE_ID_VIRTIO_BLK_09 &&
+pci->device != PCI_DEVICE_ID_VIRTIO_BLK_10))
 continue;
 init_virtio_blk(pci);
 }
diff --git a/src/hw/virtio-scsi.c b/src/hw/virtio-scsi.c
index 89dcb8d..6b4ed1a 100644
--- a/src/hw/virtio-scsi.c
+++ b/src/hw/virtio-scsi.c
@@ -207,8 +207,9 @@ virtio_scsi_setup(void)
 
 struct pci_device *pci;
 foreachpci(pci) {
-if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET
-|| pci->device != PCI_DEVICE_ID_VIRTIO_SCSI)
+if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET ||
+(pci->device != PCI_DEVICE_ID_VIRTIO_SCSI_09 &&
+ pci->device != PCI_DEVICE_ID_VIRTIO_SCSI_10))
 continue;
 init_virtio_scsi(pci);
 }
-- 
1.8.3.1
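
The new probe condition can be written as a small positive predicate, which
may be easier to read than the negated form in the diff. The helper and macro
names below are hypothetical; the numeric IDs are the ones from the patch. The
VIRTIO_MODERN_ID macro reflects the virtio 1.0 rule that modern device IDs are
0x1040 plus the virtio device type (2 for block, 8 for SCSI host).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_REDHAT_QUMRANET 0x1af4

/* virtio 0.9.5 ids (legacy/transitional devices), as in the patch */
#define DEV_VIRTIO_BLK_09  0x1001
#define DEV_VIRTIO_SCSI_09 0x1004

/* virtio 1.0 ids (modern devices): 0x1040 + virtio device type */
#define VIRTIO_MODERN_ID(type) (0x1040 + (type))
#define DEV_VIRTIO_BLK_10  VIRTIO_MODERN_ID(2)
#define DEV_VIRTIO_SCSI_10 VIRTIO_MODERN_ID(8)

/* True if the vendor/device pair is a virtio-blk device of either kind. */
static bool is_virtio_blk(uint16_t vendor, uint16_t device)
{
    return vendor == VENDOR_REDHAT_QUMRANET &&
           (device == DEV_VIRTIO_BLK_09 || device == DEV_VIRTIO_BLK_10);
}

/* True if the vendor/device pair is a virtio-scsi device of either kind. */
static bool is_virtio_scsi(uint16_t vendor, uint16_t device)
{
    return vendor == VENDOR_REDHAT_QUMRANET &&
           (device == DEV_VIRTIO_SCSI_09 || device == DEV_VIRTIO_SCSI_10);
}
```

A setup loop would then skip every device for which the predicate is false,
which is equivalent to the `continue` conditions in the diff.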

[Qemu-devel] [PATCH v3 24/25] virtio-ring: 32bit cleanup

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-ring.c | 61 ++--
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/src/hw/virtio-ring.c b/src/hw/virtio-ring.c
index 6c86c38..7205a0a 100644
--- a/src/hw/virtio-ring.c
+++ b/src/hw/virtio-ring.c
@@ -35,8 +35,8 @@
 
 int vring_more_used(struct vring_virtqueue *vq)
 {
-struct vring_used *used = GET_LOWFLAT(vq->vring.used);
-int more = GET_LOWFLAT(vq->last_used_idx) != GET_LOWFLAT(used->idx);
+struct vring_used *used = vq->vring.used;
+int more = vq->last_used_idx != used->idx;
 /* Make sure ring reads are done after idx read above. */
 smp_rmb();
 return more;
@@ -57,13 +57,13 @@ void vring_detach(struct vring_virtqueue *vq, unsigned int 
head)
 /* find end of given descriptor */
 
 i = head;
-while (GET_LOWFLAT(desc[i].flags) & VRING_DESC_F_NEXT)
-i = GET_LOWFLAT(desc[i].next);
+while (desc[i].flags & VRING_DESC_F_NEXT)
+i = desc[i].next;
 
 /* link it with free list and point to it */
 
-SET_LOWFLAT(desc[i].next, GET_LOWFLAT(vq->free_head));
-SET_LOWFLAT(vq->free_head, head);
+desc[i].next = vq->free_head;
+vq->free_head = head;
 }
 
 /*
@@ -77,22 +77,22 @@ int vring_get_buf(struct vring_virtqueue *vq, unsigned int 
*len)
 {
 struct vring *vr = &vq->vring;
 struct vring_used_elem *elem;
-struct vring_used *used = GET_LOWFLAT(vq->vring.used);
+struct vring_used *used = vq->vring.used;
 u32 id;
 int ret;
 
 //BUG_ON(!vring_more_used(vq));
 
-elem = &used->ring[GET_LOWFLAT(vq->last_used_idx) % GET_LOWFLAT(vr->num)];
-id = GET_LOWFLAT(elem->id);
+elem = &used->ring[vq->last_used_idx % vr->num];
+id = elem->id;
 if (len != NULL)
-*len = GET_LOWFLAT(elem->len);
+*len = elem->len;
 
-ret = GET_LOWFLAT(vq->vdata[id]);
+ret = vq->vdata[id];
 
 vring_detach(vq, id);
 
-SET_LOWFLAT(vq->last_used_idx, GET_LOWFLAT(vq->last_used_idx) + 1);
+vq->last_used_idx = vq->last_used_idx + 1;
 
 return ret;
 }
@@ -104,46 +104,45 @@ void vring_add_buf(struct vring_virtqueue *vq,
 {
 struct vring *vr = &vq->vring;
 int i, av, head, prev;
-struct vring_desc *desc = GET_LOWFLAT(vr->desc);
-struct vring_avail *avail = GET_LOWFLAT(vr->avail);
+struct vring_desc *desc = vr->desc;
+struct vring_avail *avail = vr->avail;
 
 BUG_ON(out + in == 0);
 
 prev = 0;
-head = GET_LOWFLAT(vq->free_head);
-for (i = head; out; i = GET_LOWFLAT(desc[i].next), out--) {
-SET_LOWFLAT(desc[i].flags, VRING_DESC_F_NEXT);
-SET_LOWFLAT(desc[i].addr, (u64)virt_to_phys(list->addr));
-SET_LOWFLAT(desc[i].len, list->length);
+head = vq->free_head;
+for (i = head; out; i = desc[i].next, out--) {
+desc[i].flags = VRING_DESC_F_NEXT;
+desc[i].addr = (u64)virt_to_phys(list->addr);
+desc[i].len = list->length;
 prev = i;
 list++;
 }
-for ( ; in; i = GET_LOWFLAT(desc[i].next), in--) {
-SET_LOWFLAT(desc[i].flags, VRING_DESC_F_NEXT|VRING_DESC_F_WRITE);
-SET_LOWFLAT(desc[i].addr, (u64)virt_to_phys(list->addr));
-SET_LOWFLAT(desc[i].len, list->length);
+for ( ; in; i = desc[i].next, in--) {
+desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
+desc[i].addr = (u64)virt_to_phys(list->addr);
+desc[i].len = list->length;
 prev = i;
 list++;
 }
-SET_LOWFLAT(desc[prev].flags,
-GET_LOWFLAT(desc[prev].flags) & ~VRING_DESC_F_NEXT);
+desc[prev].flags = desc[prev].flags & ~VRING_DESC_F_NEXT;
 
-SET_LOWFLAT(vq->free_head, i);
+vq->free_head = i;
 
-SET_LOWFLAT(vq->vdata[head], index);
+vq->vdata[head] = index;
 
-av = (GET_LOWFLAT(avail->idx) + num_added) % GET_LOWFLAT(vr->num);
-SET_LOWFLAT(avail->ring[av], head);
+av = (avail->idx + num_added) % vr->num;
+avail->ring[av] = head;
 }
 
 void vring_kick(struct vp_device *vp, struct vring_virtqueue *vq, int 
num_added)
 {
 struct vring *vr = &vq->vring;
-struct vring_avail *avail = GET_LOWFLAT(vr->avail);
+struct vring_avail *avail = vr->avail;
 
 /* Make sure idx update is done after ring write. */
 smp_wmb();
-SET_LOWFLAT(avail->idx, GET_LOWFLAT(avail->idx) + num_added);
+avail->idx = avail->idx + num_added;
 
 vp_notify(vp, vq);
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 21/25] virtio: legacy cleanup

2015-07-01 Thread Gerd Hoffmann

Now that all code is switched over to use vp_read/write we can
drop the ioaddr field from vp_device and the offset #defines.

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c |  1 -
 src/hw/virtio-pci.h | 33 +
 2 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 5cb0e73..8e7ea46 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -270,7 +270,6 @@ void vp_init_simple(struct vp_device *vp, struct pci_device 
*pci)
 vp->legacy.addr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
 PCI_BASE_ADDRESS_IO_MASK;
 vp->legacy.is_io = 1;
-vp->ioaddr = vp->legacy.addr; /* temporary */
 }
 
 vp_reset(vp);
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 08267ad..b11c355 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -4,39 +4,9 @@
 #include "x86.h" // inl
 #include "biosvar.h" // GET_LOWFLAT
 
-/* A 32-bit r/o bitmask of the features supported by the host */
-#define VIRTIO_PCI_HOST_FEATURES0
-
-/* A 32-bit r/w bitmask of features activated by the guest */
-#define VIRTIO_PCI_GUEST_FEATURES   4
-
-/* A 32-bit r/w PFN for the currently selected queue */
-#define VIRTIO_PCI_QUEUE_PFN8
-
-/* A 16-bit r/o queue size for the currently selected queue */
-#define VIRTIO_PCI_QUEUE_NUM12
-
-/* A 16-bit r/w queue selector */
-#define VIRTIO_PCI_QUEUE_SEL14
-
-/* A 16-bit r/w queue notifier */
-#define VIRTIO_PCI_QUEUE_NOTIFY 16
-
-/* An 8-bit device status register.  */
-#define VIRTIO_PCI_STATUS   18
-
-/* An 8-bit r/o interrupt status register.  Reading the value will return the
- * current contents of the ISR and will also clear it.  This is effectively
- * a read-and-acknowledge. */
-#define VIRTIO_PCI_ISR  19
-
 /* The bit of the ISR which indicates a device configuration change. */
 #define VIRTIO_PCI_ISR_CONFIG   0x2
 
-/* The remaining space is defined by each driver as the per-driver
- * configuration space */
-#define VIRTIO_PCI_CONFIG   20
-
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION  0
 
@@ -123,7 +93,6 @@ struct vp_cap {
 };
 
 struct vp_device {
-unsigned int ioaddr;
 struct vp_cap common, notify, isr, device, legacy;
 u32 notify_off_multiplier;
 u8 use_modern;
@@ -225,7 +194,7 @@ static inline void vp_get_legacy(struct vp_device *vp, 
unsigned offset,
 unsigned i;
 
 for (i = 0; i < len; i++)
-ptr[i] = inb(vp->ioaddr + VIRTIO_PCI_CONFIG + offset + i);
+ptr[i] = vp_read(&vp->legacy, virtio_pci_legacy, device[i]);
 }
 
 u8 vp_get_status(struct vp_device *vp);
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 13/25] virtio: add version 1.0 support to vp_reset

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 11 +++
 src/hw/virtio-pci.h |  9 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 2f0bb40..bd51d8a 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -86,6 +86,17 @@ u8 vp_get_isr(struct vp_device *vp)
 }
 }
 
+void vp_reset(struct vp_device *vp)
+{
+if (vp->use_modern) {
+vp_write(&vp->common, virtio_pci_common_cfg, device_status, 0);
+vp_read(&vp->isr, virtio_pci_isr, isr);
+} else {
+vp_write(&vp->legacy, virtio_pci_legacy, status, 0);
+vp_read(&vp->legacy, virtio_pci_legacy, isr);
+}
+}
+
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index c891b7c..f2ae5b9 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -231,14 +231,7 @@ static inline void vp_get(struct vp_device *vp, unsigned 
offset,
 u8 vp_get_status(struct vp_device *vp);
 void vp_set_status(struct vp_device *vp, u8 status);
 u8 vp_get_isr(struct vp_device *vp);
-
-static inline void vp_reset(struct vp_device *vp)
-{
-   int ioaddr = GET_LOWFLAT(vp->ioaddr);
-
-   outb(0, ioaddr + VIRTIO_PCI_STATUS);
-   (void)inb(ioaddr + VIRTIO_PCI_ISR);
-}
+void vp_reset(struct vp_device *vp);
 
 static inline void vp_notify(struct vp_device *vp, int queue_index)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 11/25] virtio: add version 1.0 support to vp_{get, set}_status

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 20 
 src/hw/virtio-pci.h | 13 ++---
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 5ae6a76..c9c79b3 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -57,6 +57,26 @@ void vp_set_features(struct vp_device *vp, u64 features)
 }
 }
 
+u8 vp_get_status(struct vp_device *vp)
+{
+if (vp->use_modern) {
+return vp_read(&vp->common, virtio_pci_common_cfg, device_status);
+} else {
+return vp_read(&vp->legacy, virtio_pci_legacy, status);
+}
+}
+
+void vp_set_status(struct vp_device *vp, u8 status)
+{
+if (status == 0)/* reset */
+return;
+if (vp->use_modern) {
+vp_write(&vp->common, virtio_pci_common_cfg, device_status, status);
+} else {
+vp_write(&vp->legacy, virtio_pci_legacy, status, status);
+}
+}
+
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 962d6c0..e3f6f99 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -228,17 +228,8 @@ static inline void vp_get(struct vp_device *vp, unsigned 
offset,
ptr[i] = inb(ioaddr + VIRTIO_PCI_CONFIG + offset + i);
 }
 
-static inline u8 vp_get_status(struct vp_device *vp)
-{
-return inb(GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_STATUS);
-}
-
-static inline void vp_set_status(struct vp_device *vp, u8 status)
-{
-   if (status == 0)/* reset */
-   return;
-   outb(status, GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_STATUS);
-}
+u8 vp_get_status(struct vp_device *vp);
+void vp_set_status(struct vp_device *vp, u8 status);
 
 static inline u8 vp_get_isr(struct vp_device *vp)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 15/25] virtio: remove unused vp_del_vq

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.h | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 3054a13..1b5c61d 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -234,17 +234,6 @@ void vp_set_status(struct vp_device *vp, u8 status);
 u8 vp_get_isr(struct vp_device *vp);
 void vp_reset(struct vp_device *vp);
 
-static inline void vp_del_vq(struct vp_device *vp, int queue_index)
-{
-   int ioaddr = GET_LOWFLAT(vp->ioaddr);
-
-   /* select the queue */
-   outw(queue_index, ioaddr + VIRTIO_PCI_QUEUE_SEL);
-
-   /* deactivate the queue */
-   outl(0, ioaddr + VIRTIO_PCI_QUEUE_PFN);
-}
-
 struct pci_device;
 struct vring_virtqueue;
 void vp_init_simple(struct vp_device *vp, struct pci_device *pci);
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 17/25] virtio-scsi: fix initialization for version 1.0

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-scsi.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/hw/virtio-scsi.c b/src/hw/virtio-scsi.c
index 8073c77..89dcb8d 100644
--- a/src/hw/virtio-scsi.c
+++ b/src/hw/virtio-scsi.c
@@ -151,14 +151,35 @@ init_virtio_scsi(struct pci_device *pci)
 return;
 }
 vp_init_simple(vp, pci);
+u8 status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
+
+if (vp->use_modern) {
+u64 features = vp_get_features(vp);
+u64 version1 = 1ull << VIRTIO_F_VERSION_1;
+if (!(features & version1)) {
+dprintf(1, "modern device without virtio_1 feature bit: %x:%x\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
+goto fail;
+}
+
+vp_set_features(vp, version1);
+status |= VIRTIO_CONFIG_S_FEATURES_OK;
+vp_set_status(vp, status);
+if (!(vp_get_status(vp) & VIRTIO_CONFIG_S_FEATURES_OK)) {
+dprintf(1, "device didn't accept features: %x:%x\n",
+pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
+goto fail;
+}
+}
+
 if (vp_find_vq(vp, 2, &vq) < 0 ) {
 dprintf(1, "fail to find vq for virtio-scsi %x:%x\n",
 pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
 goto fail;
 }
 
-vp_set_status(vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-  VIRTIO_CONFIG_S_DRIVER | VIRTIO_CONFIG_S_DRIVER_OK);
+status |= VIRTIO_CONFIG_S_DRIVER_OK;
+vp_set_status(vp, status);
 
 int i, tot;
 for (tot = 0, i = 0; i < 256; i++)
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 12/25] virtio: add version 1.0 support to vp_get_isr

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 9 +
 src/hw/virtio-pci.h | 6 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index c9c79b3..2f0bb40 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -77,6 +77,15 @@ void vp_set_status(struct vp_device *vp, u8 status)
 }
 }
 
+u8 vp_get_isr(struct vp_device *vp)
+{
+if (vp->use_modern) {
+return vp_read(&vp->isr, virtio_pci_isr, isr);
+} else {
+return vp_read(&vp->legacy, virtio_pci_legacy, isr);
+}
+}
+
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index e3f6f99..c891b7c 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -230,11 +230,7 @@ static inline void vp_get(struct vp_device *vp, unsigned offset,
 
 u8 vp_get_status(struct vp_device *vp);
 void vp_set_status(struct vp_device *vp, u8 status);
-
-static inline u8 vp_get_isr(struct vp_device *vp)
-{
-return inb(GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_ISR);
-}
+u8 vp_get_isr(struct vp_device *vp);
 
 static inline void vp_reset(struct vp_device *vp)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 05/25] virtio: add version 1.0 structs and #defines

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.h  | 61 
 src/hw/virtio-ring.h |  5 +
 2 files changed, 66 insertions(+)

diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 85e623f..83ebcda 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -40,6 +40,67 @@
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION  0
 
+/* --- virtio 1.0 (modern) structs -- */
+
+/* Common configuration */
+#define VIRTIO_PCI_CAP_COMMON_CFG   1
+/* Notifications */
+#define VIRTIO_PCI_CAP_NOTIFY_CFG   2
+/* ISR access */
+#define VIRTIO_PCI_CAP_ISR_CFG  3
+/* Device specific configuration */
+#define VIRTIO_PCI_CAP_DEVICE_CFG   4
+/* PCI configuration access */
+#define VIRTIO_PCI_CAP_PCI_CFG  5
+
+/* This is the PCI capability header: */
+struct virtio_pci_cap {
+u8 cap_vndr;  /* Generic PCI field: PCI_CAP_ID_VNDR */
+u8 cap_next;  /* Generic PCI field: next ptr. */
+u8 cap_len;   /* Generic PCI field: capability length */
+u8 cfg_type;  /* Identifies the structure. */
+u8 bar;   /* Where to find it. */
+u8 padding[3];/* Pad to full dword. */
+u32 offset;   /* Offset within bar. */
+u32 length;   /* Length of the structure, in bytes. */
+};
+
+struct virtio_pci_notify_cap {
+struct virtio_pci_cap cap;
+u32 notify_off_multiplier;   /* Multiplier for queue_notify_off. */
+};
+
+typedef struct virtio_pci_common_cfg {
+/* About the whole device. */
+u32 device_feature_select;   /* read-write */
+u32 device_feature;  /* read-only */
+u32 guest_feature_select;/* read-write */
+u32 guest_feature;   /* read-write */
+u16 msix_config; /* read-write */
+u16 num_queues;  /* read-only */
+u8 device_status;/* read-write */
+u8 config_generation;/* read-only */
+
+/* About a specific virtqueue. */
+u16 queue_select;/* read-write */
+u16 queue_size;  /* read-write, power of 2. */
+u16 queue_msix_vector;   /* read-write */
+u16 queue_enable;/* read-write */
+u16 queue_notify_off;/* read-only */
+u32 queue_desc_lo;   /* read-write */
+u32 queue_desc_hi;   /* read-write */
+u32 queue_avail_lo;  /* read-write */
+u32 queue_avail_hi;  /* read-write */
+u32 queue_used_lo;   /* read-write */
+u32 queue_used_hi;   /* read-write */
+} virtio_pci_common_cfg;
+
+typedef struct virtio_pci_isr {
+u8 isr;
+} virtio_pci_isr;
+
+/* --- driver structs --- */
+
 struct vp_device {
 unsigned int ioaddr;
 };
diff --git a/src/hw/virtio-ring.h b/src/hw/virtio-ring.h
index fe5133b..553a508 100644
--- a/src/hw/virtio-ring.h
+++ b/src/hw/virtio-ring.h
@@ -20,9 +20,14 @@
 #define VIRTIO_CONFIG_S_DRIVER  2
 /* Driver has used its parts of the config, and is happy */
 #define VIRTIO_CONFIG_S_DRIVER_OK   4
+/* Driver has finished configuring features */
+#define VIRTIO_CONFIG_S_FEATURES_OK 8
 /* We've given up on this device. */
 #define VIRTIO_CONFIG_S_FAILED  0x80
 
+/* v1.0 compliant. */
+#define VIRTIO_F_VERSION_1  32
+
 #define MAX_QUEUE_NUM  (128)
 
 #define VRING_DESC_F_NEXT  1
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 09/25] virtio: add read/write functions and macros

2015-07-01 Thread Gerd Hoffmann

Add macros to read/write virtio registers.
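
For illustration only (not part of the patch): a minimal, host-side sketch of the offsetof/sizeof technique the vp_read/vp_write macros rely on. The struct demo_regs layout, the demo_backing byte array and all demo_* names are assumptions made up for this example; the real code goes through port I/O or MMIO rather than a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout, used only for this illustration. */
struct demo_regs {
    uint32_t features;
    uint16_t queue_sel;
    uint8_t  status;
    uint8_t  isr;
};

/* Stand-in for the device: a byte array instead of port I/O or MMIO. */
static uint8_t demo_backing[sizeof(struct demo_regs)];

/* Read 'size' bytes at 'offset', least significant byte first. */
static uint64_t demo_read(size_t offset, size_t size)
{
    uint64_t var = 0;
    size_t i;

    for (i = 0; i < size; i++)
        var |= (uint64_t)demo_backing[offset + i] << (8 * i);
    return var;
}

/* Write the low 'size' bytes of 'var' at 'offset'. */
static void demo_write(size_t offset, size_t size, uint64_t var)
{
    size_t i;

    for (i = 0; i < size; i++)
        demo_backing[offset + i] = (uint8_t)(var >> (8 * i));
}

/* The field name alone selects both the offset and the access width. */
#define demo_reg_read(_struct, _field) \
    demo_read(offsetof(_struct, _field), sizeof(((_struct *)0)->_field))
#define demo_reg_write(_struct, _field, _var) \
    demo_write(offsetof(_struct, _field), sizeof(((_struct *)0)->_field), _var)

/* Usage: a one-byte field only ever receives one byte. */
void demo_regs_selfcheck(void)
{
    demo_reg_write(struct demo_regs, status, 0x103);
    assert(demo_reg_read(struct demo_regs, status) == 0x03);
    assert(demo_reg_read(struct demo_regs, isr) == 0);
}
```

The point of the design is that callers name a struct field instead of passing a raw offset and width, so the two cannot get out of sync.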

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.h | 86 +
 1 file changed, 86 insertions(+)

diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 147e529..f7510f2 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -127,6 +127,92 @@ struct vp_device {
 struct vp_cap common, notify, isr, device, legacy;
 };
 
+static inline u64 _vp_read(struct vp_cap *cap, u32 offset, u8 size)
+{
+u32 addr = cap->addr + offset;
+u64 var;
+
+if (cap->is_io) {
+switch (size) {
+case 8:
+var = inl(addr);
+var |= (u64)inl(addr+4) << 32;
+break;
+case 4:
+var = inl(addr);
+break;
+case 2:
+var = inw(addr);
+break;
+case 1:
+var = inb(addr);
+break;
+default:
+var = 0;
+}
+} else {
+switch (size) {
+case 8:
+var = readl((void*)addr);
+var |= (u64)readl((void*)(addr+4)) << 32;
+break;
+case 4:
+var = readl((void*)addr);
+break;
+case 2:
+var = readw((void*)addr);
+break;
+case 1:
+var = readb((void*)addr);
+break;
+default:
+var = 0;
+}
+}
+dprintf(9, "vp read   %x (%d) -> 0x%llx\n", addr, size, var);
+return var;
+}
+
+static inline void _vp_write(struct vp_cap *cap, u32 offset, u8 size, u64 var)
+{
+u32 addr = cap->addr + offset;
+
+dprintf(9, "vp write  %x (%d) <- 0x%llx\n", addr, size, var);
+if (cap->is_io) {
+switch (size) {
+case 4:
+outl(var, addr);
+break;
+case 2:
+outw(var, addr);
+break;
+case 1:
+outb(var, addr);
+break;
+}
+} else {
+switch (size) {
+case 4:
+writel((void*)addr, var);
+break;
+case 2:
+writew((void*)addr, var);
+break;
+case 1:
+writeb((void*)addr, var);
+break;
+}
+}
+}
+
+#define vp_read(_cap, _struct, _field)\
+_vp_read(_cap, offsetof(_struct, _field), \
+ sizeof(((_struct *)0)->_field))
+
+#define vp_write(_cap, _struct, _field, _var)   \
+_vp_write(_cap, offsetof(_struct, _field),  \
+ sizeof(((_struct *)0)->_field), _var)
+
 static inline u32 vp_get_features(struct vp_device *vp)
 {
 return inl(GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_HOST_FEATURES);
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 03/25] virtio: add struct vp_device

2015-07-01 Thread Gerd Hoffmann

For virtio 1.0 support we will need more state than just the (legacy
mode) ioaddr for each virtio-pci device.  Prepare for that by adding
a new struct for it.  For now it carries the ioaddr only.

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-blk.c  | 19 +--
 src/hw/virtio-pci.c  | 12 ++--
 src/hw/virtio-pci.h  | 46 +++---
 src/hw/virtio-ring.c |  4 ++--
 src/hw/virtio-ring.h |  3 ++-
 src/hw/virtio-scsi.c | 37 ++---
 6 files changed, 68 insertions(+), 53 deletions(-)

diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index 15ac171..3a71510 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -25,7 +25,7 @@
 struct virtiodrive_s {
 struct drive_s drive;
 struct vring_virtqueue *vq;
-u16 ioaddr;
+struct vp_device vp;
 };
 
 static int
@@ -60,7 +60,7 @@ virtio_blk_op(struct disk_op_s *op, int write)
 vring_add_buf(vq, sg, 2, 1, 0, 0);
 else
 vring_add_buf(vq, sg, 1, 2, 0, 0);
-vring_kick(GET_GLOBALFLAT(vdrive_gf->ioaddr), vq, 1);
+vring_kick(&vdrive_gf->vp, vq, 1);
 
 /* Wait for reply */
 while (!vring_more_used(vq))
@@ -72,7 +72,7 @@ virtio_blk_op(struct disk_op_s *op, int write)
 /* Clear interrupt status register.  Avoid leaving interrupts stuck if
  * VRING_AVAIL_F_NO_INTERRUPT was ignored and interrupts were raised.
  */
-vp_get_isr(GET_GLOBALFLAT(vdrive_gf->ioaddr));
+vp_get_isr(&vdrive_gf->vp);
 
 return status == VIRTIO_BLK_S_OK ? DISK_RET_SUCCESS : DISK_RET_EBADTRACK;
 }
@@ -113,18 +113,17 @@ init_virtio_blk(struct pci_device *pci)
 vdrive->drive.type = DTYPE_VIRTIO_BLK;
 vdrive->drive.cntl_id = bdf;
 
-u16 ioaddr = vp_init_simple(bdf);
-vdrive->ioaddr = ioaddr;
-if (vp_find_vq(ioaddr, 0, &vdrive->vq) < 0 ) {
+vp_init_simple(&vdrive->vp, bdf);
+if (vp_find_vq(&vdrive->vp, 0, &vdrive->vq) < 0 ) {
 dprintf(1, "fail to find vq for virtio-blk %x:%x\n",
 pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
 goto fail;
 }
 
 struct virtio_blk_config cfg;
-vp_get(ioaddr, 0, &cfg, sizeof(cfg));
+vp_get(&vdrive->vp, 0, &cfg, sizeof(cfg));
 
-u32 f = vp_get_features(ioaddr);
+u32 f = vp_get_features(&vdrive->vp);
 vdrive->drive.blksize = (f & (1 << VIRTIO_BLK_F_BLK_SIZE)) ?
 cfg.blk_size : DISK_SECTOR_SIZE;
 
@@ -148,12 +147,12 @@ init_virtio_blk(struct pci_device *pci)
 
 boot_add_hd(&vdrive->drive, desc, bootprio_find_pci_device(pci));
 
-vp_set_status(ioaddr, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+vp_set_status(&vdrive->vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
   VIRTIO_CONFIG_S_DRIVER | VIRTIO_CONFIG_S_DRIVER_OK);
 return;
 
 fail:
-vp_reset(ioaddr);
+vp_reset(&vdrive->vp);
 free(vdrive->vq);
 free(vdrive);
 }
diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index b9b3ab1..f648328 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -24,9 +24,10 @@
 #include "virtio-pci.h"
 #include "virtio-ring.h"
 
-int vp_find_vq(unsigned int ioaddr, int queue_index,
+int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
+   int ioaddr = GET_LOWFLAT(vp->ioaddr);
u16 num;
 
ASSERT32FLAT();
@@ -84,14 +85,13 @@ fail:
return -1;
 }
 
-u16 vp_init_simple(u16 bdf)
+void vp_init_simple(struct vp_device *vp, u16 bdf)
 {
-u16 ioaddr = pci_config_readl(bdf, PCI_BASE_ADDRESS_0) &
+vp->ioaddr = pci_config_readl(bdf, PCI_BASE_ADDRESS_0) &
 PCI_BASE_ADDRESS_IO_MASK;
 
-vp_reset(ioaddr);
+vp_reset(vp);
 pci_config_maskw(bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
-vp_set_status(ioaddr, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+vp_set_status(vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
   VIRTIO_CONFIG_S_DRIVER );
-return ioaddr;
 }
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index bc04b03..c1caf67 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -2,6 +2,7 @@
 #define _VIRTIO_PCI_H
 
 #include "x86.h" // inl
+#include "biosvar.h" // GET_LOWFLAT
 
 /* A 32-bit r/o bitmask of the features supported by the host */
 #define VIRTIO_PCI_HOST_FEATURES0
@@ -39,19 +40,24 @@
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION  0
 
-static inline u32 vp_get_features(unsigned int ioaddr)
+struct vp_device {
+unsigned int ioaddr;
+};
+
+static inline u32 vp_get_features(struct vp_device *vp)
 {
-   return inl(ioaddr + VIRTIO_PCI_HOST_FEATURES);
+return inl(GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_HOST_FEATURES);
 }
 
-static inline void vp_set_features(unsigned int ioaddr, u32 features)
+static inline void vp_set_features(struct vp_device *vp, u32 features)
 {
-outl(features, ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+outl(features, GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_GUEST_FEATURES);
 }
 
-static inline void vp_get(unsigned int ioaddr, unsigned offset,
+s

[Qemu-devel] [PATCH v3 02/25] virtio: run drivers in 32bit mode

2015-07-01 Thread Gerd Hoffmann

virtio version 1.0 registers can (and actually do in the qemu
implementation) live in mmio space.  So we must run the blk and
scsi virtio drivers in 32bit mode, otherwise we can't access them.

This also allows us to drop a bunch of GET_LOWFLAT calls from the virtio
code in the following patches.

Signed-off-by: Gerd Hoffmann 
---
 Makefile| 2 +-
 src/block.c | 8 +---
 src/hw/blockcmd.c   | 5 +++--
 src/hw/virtio-blk.c | 2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index f97b1bd..e287530 100644
--- a/Makefile
+++ b/Makefile
@@ -34,7 +34,6 @@ SRCBOTH=misc.c stacks.c output.c string.c block.c cdrom.c disk.c mouse.c kbd.c \
 hw/usb.c hw/usb-uhci.c hw/usb-ohci.c hw/usb-ehci.c \
 hw/usb-hid.c hw/usb-msc.c hw/usb-uas.c \
 hw/blockcmd.c hw/floppy.c hw/ata.c hw/ramdisk.c \
-hw/virtio-ring.c hw/virtio-pci.c hw/virtio-blk.c hw/virtio-scsi.c \
 hw/lsi-scsi.c hw/esp-scsi.c hw/megasas.c
 SRC16=$(SRCBOTH)
 SRC32FLAT=$(SRCBOTH) post.c memmap.c malloc.c romfile.c x86.c optionroms.c \
@@ -43,6 +42,7 @@ SRC32FLAT=$(SRCBOTH) post.c memmap.c malloc.c romfile.c x86.c optionroms.c \
 fw/coreboot.c fw/lzmadecode.c fw/multiboot.c fw/csm.c fw/biostables.c \
 fw/paravirt.c fw/shadow.c fw/pciinit.c fw/smm.c fw/smp.c fw/mtrr.c fw/xen.c \
 fw/acpi.c fw/mptable.c fw/pirtable.c fw/smbios.c fw/romfile_loader.c \
+hw/virtio-ring.c hw/virtio-pci.c hw/virtio-blk.c hw/virtio-scsi.c \
 hw/tpm_drivers.c
 SRC32SEG=string.c output.c pcibios.c apm.c stacks.c hw/pci.c hw/serialio.c
 DIRS=src src/hw src/fw vgasrc
diff --git a/src/block.c b/src/block.c
index 3f7ecb1..a9b9851 100644
--- a/src/block.c
+++ b/src/block.c
@@ -503,8 +503,10 @@ process_op(struct disk_op_s *op)
 case DTYPE_CDEMU:
 ret = process_cdemu_op(op);
 break;
-case DTYPE_VIRTIO_BLK:
-ret = process_virtio_blk_op(op);
+case DTYPE_VIRTIO_BLK: ;
+extern void _cfunc32flat_process_virtio_blk_op(void);
+ret = call32(_cfunc32flat_process_virtio_blk_op
+ , (u32)MAKE_FLATPTR(GET_SEG(SS), op), DISK_RET_EPARAM);
 break;
 case DTYPE_AHCI: ;
 extern void _cfunc32flat_process_ahci_op(void);
@@ -526,7 +528,6 @@ process_op(struct disk_op_s *op)
 break;
 case DTYPE_USB:
 case DTYPE_UAS:
-case DTYPE_VIRTIO_SCSI:
 case DTYPE_LSI_SCSI:
 case DTYPE_ESP_SCSI:
 case DTYPE_MEGASAS:
@@ -534,6 +535,7 @@ process_op(struct disk_op_s *op)
 break;
 case DTYPE_USB_32:
 case DTYPE_UAS_32:
+case DTYPE_VIRTIO_SCSI:
 case DTYPE_PVSCSI: ;
 extern void _cfunc32flat_scsi_process_op(void);
 ret = call32(_cfunc32flat_scsi_process_op
diff --git a/src/hw/blockcmd.c b/src/hw/blockcmd.c
index 78c0e65..4440201 100644
--- a/src/hw/blockcmd.c
+++ b/src/hw/blockcmd.c
@@ -35,14 +35,15 @@ cdb_cmd_data(struct disk_op_s *op, void *cdbcmd, u16 blocksize)
 return usb_cmd_data(op, cdbcmd, blocksize);
 case DTYPE_UAS:
 return uas_cmd_data(op, cdbcmd, blocksize);
-case DTYPE_VIRTIO_SCSI:
-return virtio_scsi_cmd_data(op, cdbcmd, blocksize);
 case DTYPE_LSI_SCSI:
 return lsi_scsi_cmd_data(op, cdbcmd, blocksize);
 case DTYPE_ESP_SCSI:
 return esp_scsi_cmd_data(op, cdbcmd, blocksize);
 case DTYPE_MEGASAS:
 return megasas_cmd_data(op, cdbcmd, blocksize);
+case DTYPE_VIRTIO_SCSI:
+if (!MODESEGMENT)
+return virtio_scsi_cmd_data(op, cdbcmd, blocksize);
 case DTYPE_USB_32:
 if (!MODESEGMENT)
 return usb_cmd_data(op, cdbcmd, blocksize);
diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index e2dbd3c..15ac171 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -77,7 +77,7 @@ virtio_blk_op(struct disk_op_s *op, int write)
 return status == VIRTIO_BLK_S_OK ? DISK_RET_SUCCESS : DISK_RET_EBADTRACK;
 }
 
-int
+int VISIBLE32FLAT
 process_virtio_blk_op(struct disk_op_s *op)
 {
 if (! CONFIG_VIRTIO_BLK)
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 23/25] virtio-scsi: 32bit cleanup

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-scsi.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/hw/virtio-scsi.c b/src/hw/virtio-scsi.c
index 6b4ed1a..cb825d4 100644
--- a/src/hw/virtio-scsi.c
+++ b/src/hw/virtio-scsi.c
@@ -53,10 +53,10 @@ virtio_scsi_cmd(struct vp_device *vp, struct vring_virtqueue *vq,
 int in_num = (datain ? 2 : 1);
 int out_num = (len ? 3 : 2) - in_num;
 
-sg[0].addr   = MAKE_FLATPTR(GET_SEG(SS), &req);
+sg[0].addr   = (void*)(&req);
 sg[0].length = sizeof(req);
 
-sg[out_num].addr   = MAKE_FLATPTR(GET_SEG(SS), &resp);
+sg[out_num].addr   = (void*)(&resp);
 sg[out_num].length = sizeof(resp);
 
 if (len) {
@@ -93,10 +93,10 @@ virtio_scsi_cmd_data(struct disk_op_s *op, void *cdbcmd, u16 blocksize)
 struct virtio_lun_s *vlun_gf =
 container_of(op->drive_gf, struct virtio_lun_s, drive);
 
-return virtio_scsi_cmd(GET_GLOBALFLAT(vlun_gf->vp),
-   GET_GLOBALFLAT(vlun_gf->vq), op, cdbcmd,
-   GET_GLOBALFLAT(vlun_gf->target),
-   GET_GLOBALFLAT(vlun_gf->lun),
+return virtio_scsi_cmd(vlun_gf->vp,
+   vlun_gf->vq, op, cdbcmd,
+   vlun_gf->target,
+   vlun_gf->lun,
blocksize);
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 01/25] pci: allow to loop over capabilities

2015-07-01 Thread Gerd Hoffmann

Add a parameter to pci_find_capability to specify the start point.
This makes it possible to find multiple capabilities of the same type
by calling pci_find_capability again with the offset of the last
capability found.
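
For illustration only (not part of the patch): a minimal sketch of how a caller might walk all capabilities of one type with the new start-point parameter. PCI config space is modelled as a plain 256-byte array, the status-register check is left out, and every demo_* name and the sample capability layout are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAPABILITY_LIST 0x34  /* config offset of the first cap pointer */
#define DEMO_CAP_LIST_ID     0     /* capability ID byte within a capability */
#define DEMO_CAP_LIST_NEXT   1     /* next-pointer byte within a capability */
#define DEMO_CAP_ID_VNDR     0x09  /* vendor-specific capability */
#define DEMO_CAP_ID_EXP      0x10  /* PCI Express capability */

/* Same find-first / find-next logic as the patch, on an in-memory copy. */
static uint8_t demo_find_capability(const uint8_t *cfg, uint8_t cap_id,
                                    uint8_t cap)
{
    int i;

    if (cap == 0)
        cap = cfg[DEMO_CAPABILITY_LIST];              /* find first */
    else
        cap = cfg[(cap + DEMO_CAP_LIST_NEXT) & 0xff]; /* find next */

    for (i = 0; cap && i <= 0xff; i++) {
        if (cfg[(cap + DEMO_CAP_LIST_ID) & 0xff] == cap_id)
            return cap;
        cap = cfg[(cap + DEMO_CAP_LIST_NEXT) & 0xff];
    }
    return 0;
}

/* Collect the offsets of every capability with the given ID. */
static int demo_collect(const uint8_t *cfg, uint8_t cap_id,
                        uint8_t *out, int max)
{
    int n = 0;
    uint8_t cap = demo_find_capability(cfg, cap_id, 0);

    while (cap && n < max) {
        out[n++] = cap;
        cap = demo_find_capability(cfg, cap_id, cap);
    }
    return n;
}

/* Sample list: vendor cap at 0x40, PCIe cap at 0x50, vendor cap at 0x60. */
static void demo_fill_config(uint8_t *cfg)
{
    memset(cfg, 0, 256);
    cfg[DEMO_CAPABILITY_LIST] = 0x40;
    cfg[0x40] = DEMO_CAP_ID_VNDR; cfg[0x41] = 0x50;
    cfg[0x50] = DEMO_CAP_ID_EXP;  cfg[0x51] = 0x60;
    cfg[0x60] = DEMO_CAP_ID_VNDR; cfg[0x61] = 0x00;
}

/* Usage: both vendor capabilities are found, in list order. */
void demo_caps_selfcheck(void)
{
    uint8_t cfg[256], found[4];

    demo_fill_config(cfg);
    assert(demo_collect(cfg, DEMO_CAP_ID_VNDR, found, 4) == 2);
    assert(found[0] == 0x40 && found[1] == 0x60);
}
```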

Signed-off-by: Gerd Hoffmann 
---
 src/fw/pciinit.c |  4 ++--
 src/hw/pci.c | 11 ---
 src/hw/pci.h |  2 +-
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c
index ac39d23..45870f2 100644
--- a/src/fw/pciinit.c
+++ b/src/fw/pciinit.c
@@ -642,7 +642,7 @@ pci_region_create_entry(struct pci_bus *bus, struct pci_device *dev,
 
 static int pci_bus_hotplug_support(struct pci_bus *bus)
 {
-u8 pcie_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_EXP);
+u8 pcie_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_EXP, 0);
 u8 shpc_cap;
 
 if (pcie_cap) {
@@ -666,7 +666,7 @@ static int pci_bus_hotplug_support(struct pci_bus *bus)
 return downstream_port && slot_implemented;
 }
 
-shpc_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_SHPC);
+shpc_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_SHPC, 0);
 return !!shpc_cap;
 }
 
diff --git a/src/hw/pci.c b/src/hw/pci.c
index 0379b55..a241d06 100644
--- a/src/hw/pci.c
+++ b/src/hw/pci.c
@@ -221,16 +221,21 @@ pci_find_init_device(const struct pci_device_id *ids, void *arg)
 return NULL;
 }
 
-u8 pci_find_capability(struct pci_device *pci, u8 cap_id)
+u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap)
 {
 int i;
-u8 cap;
 u16 status = pci_config_readw(pci->bdf, PCI_STATUS);
 
 if (!(status & PCI_STATUS_CAP_LIST))
 return 0;
 
-cap = pci_config_readb(pci->bdf, PCI_CAPABILITY_LIST);
+if (cap == 0) {
+/* find first */
+cap = pci_config_readb(pci->bdf, PCI_CAPABILITY_LIST);
+} else {
+/* find next */
+cap = pci_config_readb(pci->bdf, cap + PCI_CAP_LIST_NEXT);
+}
 for (i = 0; cap && i <= 0xff; i++) {
 if (pci_config_readb(pci->bdf, cap + PCI_CAP_LIST_ID) == cap_id)
 return cap;
diff --git a/src/hw/pci.h b/src/hw/pci.h
index 0aaa84c..fc5e7b9 100644
--- a/src/hw/pci.h
+++ b/src/hw/pci.h
@@ -123,7 +123,7 @@ int pci_init_device(const struct pci_device_id *ids
 , struct pci_device *pci, void *arg);
 struct pci_device *pci_find_init_device(const struct pci_device_id *ids
 , void *arg);
-u8 pci_find_capability(struct pci_device *pci, u8 cap_id);
+u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap);
 int pci_bridge_has_region(struct pci_device *pci,
   enum pci_region_type region_type);
 void pci_reboot(void);
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 06/25] virtio: add version 0.9.5 struct

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 83ebcda..42e2b7f 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -40,6 +40,20 @@
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION  0
 
+/* --- virtio 0.9.5 (legacy) struct - */
+
+typedef struct virtio_pci_legacy {
+u32 host_features;
+u32 guest_features;
+u32 queue_pfn;
+u16 queue_num;
+u16 queue_sel;
+u16 queue_notify;
+u8  status;
+u8  isr;
+u8  device[];
+} virtio_pci_legacy;
+
 /* --- virtio 1.0 (modern) structs -- */
 
 /* Common configuration */
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 10/25] virtio: make features 64bit, support version 1.0 features

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-blk.c |  2 +-
 src/hw/virtio-pci.c | 33 +
 src/hw/virtio-pci.h | 12 +++-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index 1a13129..8378a34 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -123,7 +123,7 @@ init_virtio_blk(struct pci_device *pci)
 struct virtio_blk_config cfg;
 vp_get(&vdrive->vp, 0, &cfg, sizeof(cfg));
 
-u32 f = vp_get_features(&vdrive->vp);
+u64 f = vp_get_features(&vdrive->vp);
 vdrive->drive.blksize = (f & (1 << VIRTIO_BLK_F_BLK_SIZE)) ?
 cfg.blk_size : DISK_SECTOR_SIZE;
 
diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 3c8fb7b..5ae6a76 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -24,6 +24,39 @@
 #include "virtio-pci.h"
 #include "virtio-ring.h"
 
+u64 vp_get_features(struct vp_device *vp)
+{
+u32 f0, f1;
+
+if (vp->use_modern) {
+vp_write(&vp->common, virtio_pci_common_cfg, device_feature_select, 0);
+f0 = vp_read(&vp->common, virtio_pci_common_cfg, device_feature);
+vp_write(&vp->common, virtio_pci_common_cfg, device_feature_select, 1);
+f1 = vp_read(&vp->common, virtio_pci_common_cfg, device_feature);
+} else {
+f0 = vp_read(&vp->legacy, virtio_pci_legacy, host_features);
+f1 = 0;
+}
+return ((u64)f1 << 32) | f0;
+}
+
+void vp_set_features(struct vp_device *vp, u64 features)
+{
+u32 f0, f1;
+
+f0 = features;
+f1 = features >> 32;
+
+if (vp->use_modern) {
+vp_write(&vp->common, virtio_pci_common_cfg, guest_feature_select, 0);
+vp_write(&vp->common, virtio_pci_common_cfg, guest_feature, f0);
+vp_write(&vp->common, virtio_pci_common_cfg, guest_feature_select, 1);
+vp_write(&vp->common, virtio_pci_common_cfg, guest_feature, f1);
+} else {
+vp_write(&vp->legacy, virtio_pci_legacy, guest_features, f0);
+}
+}
+
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq)
 {
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index f7510f2..962d6c0 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -125,6 +125,7 @@ struct vp_cap {
 struct vp_device {
 unsigned int ioaddr;
 struct vp_cap common, notify, isr, device, legacy;
+u8 use_modern;
 };
 
 static inline u64 _vp_read(struct vp_cap *cap, u32 offset, u8 size)
@@ -213,15 +214,8 @@ static inline void _vp_write(struct vp_cap *cap, u32 offset, u8 size, u64 var)
 _vp_write(_cap, offsetof(_struct, _field),  \
  sizeof(((_struct *)0)->_field), _var)
 
-static inline u32 vp_get_features(struct vp_device *vp)
-{
-return inl(GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_HOST_FEATURES);
-}
-
-static inline void vp_set_features(struct vp_device *vp, u32 features)
-{
-outl(features, GET_LOWFLAT(vp->ioaddr) + VIRTIO_PCI_GUEST_FEATURES);
-}
+u64 vp_get_features(struct vp_device *vp);
+void vp_set_features(struct vp_device *vp, u64 features);
 
 static inline void vp_get(struct vp_device *vp, unsigned offset,
  void *buf, unsigned len)
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 08/25] virtio: create vp_cap struct for legacy bar

2015-07-01 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-pci.c | 5 -
 src/hw/virtio-pci.h | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index 58f3d39..3c8fb7b 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -143,8 +143,11 @@ void vp_init_simple(struct vp_device *vp, struct pci_device *pci)
 pci_bdf_to_bus(pci->bdf), pci_bdf_to_dev(pci->bdf));
 }
 
-vp->ioaddr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
+vp->legacy.bar = 0;
+vp->legacy.addr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
 PCI_BASE_ADDRESS_IO_MASK;
+vp->legacy.is_io = 1;
+vp->ioaddr = vp->legacy.addr; /* temporary */
 
 vp_reset(vp);
 pci_config_maskw(pci->bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index 467c02f..147e529 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -124,7 +124,7 @@ struct vp_cap {
 
 struct vp_device {
 unsigned int ioaddr;
-struct vp_cap common, notify, isr, device;
+struct vp_cap common, notify, isr, device, legacy;
 };
 
 static inline u32 vp_get_features(struct vp_device *vp)
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 04/25] virtio: pass struct pci_device to vp_init_simple

2015-07-01 Thread Gerd Hoffmann

... instead of the bdf only.

Signed-off-by: Gerd Hoffmann 
---
 src/hw/virtio-blk.c  | 2 +-
 src/hw/virtio-pci.c  | 6 +++---
 src/hw/virtio-pci.h  | 3 ++-
 src/hw/virtio-scsi.c | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c
index 3a71510..1a13129 100644
--- a/src/hw/virtio-blk.c
+++ b/src/hw/virtio-blk.c
@@ -113,7 +113,7 @@ init_virtio_blk(struct pci_device *pci)
 vdrive->drive.type = DTYPE_VIRTIO_BLK;
 vdrive->drive.cntl_id = bdf;
 
-vp_init_simple(&vdrive->vp, bdf);
+vp_init_simple(&vdrive->vp, pci);
 if (vp_find_vq(&vdrive->vp, 0, &vdrive->vq) < 0 ) {
 dprintf(1, "fail to find vq for virtio-blk %x:%x\n",
 pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c
index f648328..9428d04 100644
--- a/src/hw/virtio-pci.c
+++ b/src/hw/virtio-pci.c
@@ -85,13 +85,13 @@ fail:
return -1;
 }
 
-void vp_init_simple(struct vp_device *vp, u16 bdf)
+void vp_init_simple(struct vp_device *vp, struct pci_device *pci)
 {
-vp->ioaddr = pci_config_readl(bdf, PCI_BASE_ADDRESS_0) &
+vp->ioaddr = pci_config_readl(pci->bdf, PCI_BASE_ADDRESS_0) &
 PCI_BASE_ADDRESS_IO_MASK;
 
 vp_reset(vp);
-pci_config_maskw(bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
+pci_config_maskw(pci->bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
 vp_set_status(vp, VIRTIO_CONFIG_S_ACKNOWLEDGE |
   VIRTIO_CONFIG_S_DRIVER );
 }
diff --git a/src/hw/virtio-pci.h b/src/hw/virtio-pci.h
index c1caf67..85e623f 100644
--- a/src/hw/virtio-pci.h
+++ b/src/hw/virtio-pci.h
@@ -106,8 +106,9 @@ static inline void vp_del_vq(struct vp_device *vp, int queue_index)
outl(0, ioaddr + VIRTIO_PCI_QUEUE_PFN);
 }
 
+struct pci_device;
 struct vring_virtqueue;
-void vp_init_simple(struct vp_device *vp, u16 bdf);
+void vp_init_simple(struct vp_device *vp, struct pci_device *pci);
 int vp_find_vq(struct vp_device *vp, int queue_index,
struct vring_virtqueue **p_vq);
 #endif /* _VIRTIO_PCI_H_ */
diff --git a/src/hw/virtio-scsi.c b/src/hw/virtio-scsi.c
index eb7eb81..8073c77 100644
--- a/src/hw/virtio-scsi.c
+++ b/src/hw/virtio-scsi.c
@@ -150,7 +150,7 @@ init_virtio_scsi(struct pci_device *pci)
 warn_noalloc();
 return;
 }
-vp_init_simple(vp, bdf);
+vp_init_simple(vp, pci);
 if (vp_find_vq(vp, 2, &vq) < 0 ) {
 dprintf(1, "fail to find vq for virtio-scsi %x:%x\n",
 pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf));
-- 
1.8.3.1

[Qemu-devel] [PATCH] block/raw-posix: Don't think /dev/fd/ is a floppy drive.

2015-07-01 Thread Richard W.M. Jones

In libguestfs we use /dev/fd/ to pass pre-opened file descriptors
to qemu-img.  Lately I've discovered that although this works, qemu
believes that these are floppy disk images.  That in itself isn't much
of a problem, but now qemu prints a warning about host floppy
pass-through being deprecated.

Extend the existing test so that it ignores /dev/fd/ as well as
/dev/fdset/.
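
For illustration only (not part of the patch): a minimal sketch of the prefix test after this change. demo_strstart is a simplified stand-in for QEMU's strstart(), which takes a third out-parameter, and demo_looks_like_floppy is a hypothetical name; the real floppy_probe_device() sets a probe priority and performs further checks that are not modelled here.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in: returns 1 if 'str' begins with 'prefix'. */
static int demo_strstart(const char *str, const char *prefix)
{
    return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Prefix test as extended by the patch: /dev/fdN matches, but neither
 * /dev/fdset/... nor /dev/fd/... (file descriptor paths) do. */
static int demo_looks_like_floppy(const char *filename)
{
    return demo_strstart(filename, "/dev/fd") &&
           !demo_strstart(filename, "/dev/fdset/") &&
           !demo_strstart(filename, "/dev/fd/");
}

/* Usage: the process-substitution path from the bug report is rejected. */
void demo_floppy_selfcheck(void)
{
    assert(demo_looks_like_floppy("/dev/fd0"));
    assert(!demo_looks_like_floppy("/dev/fd/63"));
}
```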

A simple test of this, if you are using the bash shell, is:

  qemu-img info <( cat /dev/null )

without this patch:

  $ qemu-img info <( cat /dev/null )
  qemu-img: Host floppy pass-through is deprecated
  Support for it will be removed in a future release.
  qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek

with this patch:

  $ qemu-img info <( cat /dev/null )
  qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek

Signed-off-by: Richard W.M. Jones 
Fixes: https://bugs.launchpad.net/qemu/+bug/1470536
---
 block/raw-posix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index cbe6574..855febe 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -2430,7 +2430,8 @@ static int floppy_probe_device(const char *filename)
 struct stat st;
 
 if (strstart(filename, "/dev/fd", NULL) &&
-!strstart(filename, "/dev/fdset/", NULL)) {
+!strstart(filename, "/dev/fdset/", NULL) &&
+!strstart(filename, "/dev/fd/", NULL)) {
 prio = 50;
 }
 
-- 
2.3.1

Re: [Qemu-devel] [PATCH] disas/mips: fix disassembling R6 instructions

2015-07-01 Thread Aurelien Jarno

On 2015-06-30 16:33, Yongbok Kim wrote:
> In the Release 6 of the MIPS Architecture, LL, SC, LLD, SCD, PREF
> and CACHE instructions have 9 bits offsets.
> 
> Signed-off-by: Yongbok Kim 
> ---
>  disas/mips.c |   12 ++--
>  1 files changed, 6 insertions(+), 6 deletions(-)

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] target-mips: fix to clear MSACSR.Cause

2015-07-01 Thread Aurelien Jarno

On 2015-06-30 15:44, Yongbok Kim wrote:
> MSACSR.Cause bits need to be cleared before vector floating-point
> instructions.
> FEXDO.df, FEXUPL.df and FEXUPR.df were missed out.
> 
> Signed-off-by: Yongbok Kim 
> ---
>  target-mips/msa_helper.c |6 ++
>  1 files changed, 6 insertions(+), 0 deletions(-)

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [PATCH 1/1] block: update BlockDriverState's children in bdrv_set_backing_hd()

2015-07-01 Thread Alberto Garcia

When a backing image is opened using bdrv_open_inherit(), it is added
to the parent image's list of children. However there's no way to
remove it from there.

In particular, changing a BlockDriverState's backing image neither
adds the new one to the list nor removes the old one. If the old one
is then closed, the pointer in the list becomes invalid. This can be
reproduced easily using the block-stream command.
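
For illustration only (not part of the patch): a minimal sketch of the bookkeeping described above, using a hand-rolled singly linked list instead of QEMU's QLIST macros. All demo_* types and functions are assumptions made up for this example, and the blockers, inherits_from handling and open flags from the real code are left out.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_node;

/* One entry in a parent's list of children. */
struct demo_child {
    struct demo_node *bs;
    struct demo_child *next;
};

struct demo_node {
    struct demo_node *backing;
    struct demo_child *children;
};

/* Add child_bs to the parent's list unless it is already there. */
static void demo_attach_child(struct demo_node *parent,
                              struct demo_node *child_bs)
{
    struct demo_child *c;

    for (c = parent->children; c; c = c->next)
        if (c->bs == child_bs)
            return;

    c = malloc(sizeof(*c));
    if (!c)
        abort();
    c->bs = child_bs;
    c->next = parent->children;
    parent->children = c;
}

/* Remove every list entry that points at child_bs. */
static void demo_detach_child(struct demo_node *parent,
                              struct demo_node *child_bs)
{
    struct demo_child **link = &parent->children;

    while (*link) {
        struct demo_child *c = *link;
        if (c->bs == child_bs) {
            *link = c->next;
            free(c);
        } else {
            link = &c->next;
        }
    }
}

/* Changing the backing node keeps the children list in sync. */
static void demo_set_backing(struct demo_node *bs, struct demo_node *backing)
{
    if (bs->backing)
        demo_detach_child(bs, bs->backing);
    bs->backing = backing;
    if (backing)
        demo_attach_child(bs, backing);
}

static int demo_count_children(const struct demo_node *bs)
{
    const struct demo_child *c;
    int n = 0;

    for (c = bs->children; c; c = c->next)
        n++;
    return n;
}

/* Usage: replacing the backing node leaves no stale entry behind. */
void demo_children_selfcheck(void)
{
    struct demo_node top = {0}, a = {0}, b = {0};

    demo_set_backing(&top, &a);
    demo_set_backing(&top, &b);
    assert(demo_count_children(&top) == 1);
    assert(top.children->bs == &b);
    demo_set_backing(&top, NULL);
    assert(top.children == NULL);
}
```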

Signed-off-by: Alberto Garcia 
Cc: Kevin Wolf 
---
 block.c | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 7e130cc..eaf3ad0 100644
--- a/block.c
+++ b/block.c
@@ -88,6 +88,13 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
  const BdrvChildRole *child_role,
  BlockDriver *drv, Error **errp);
 
+static void bdrv_attach_child(BlockDriverState *parent_bs,
+  BlockDriverState *child_bs,
+  const BdrvChildRole *child_role);
+
+static void bdrv_detach_child(BlockDriverState *parent_bs,
+  BlockDriverState *child_bs);
+
 static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
@@ -1108,6 +1115,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
 if (bs->backing_hd) {
 assert(bs->backing_blocker);
 bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
+bdrv_detach_child(bs, bs->backing_hd);
 } else if (backing_hd) {
 error_setg(&bs->backing_blocker,
"node is used as backing hd of '%s'",
@@ -1120,6 +1128,11 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 bs->backing_blocker = NULL;
 goto out;
 }
+
+bdrv_attach_child(bs, backing_hd, &child_backing);
+backing_hd->inherits_from = bs;
+backing_hd->open_flags = child_backing.inherit_flags(bs->open_flags);
+
 bs->open_flags &= ~BDRV_O_NO_BACKING;
 pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
 pstrcpy(bs->backing_format, sizeof(bs->backing_format),
@@ -1332,7 +1345,16 @@ static void bdrv_attach_child(BlockDriverState 
*parent_bs,
   BlockDriverState *child_bs,
   const BdrvChildRole *child_role)
 {
-BdrvChild *child = g_new(BdrvChild, 1);
+BdrvChild *child;
+
+/* Don't attach the child if it's already attached */
+QLIST_FOREACH(child, &parent_bs->children, next) {
+if (child->bs == child_bs) {
+return;
+}
+}
+
+child = g_new(BdrvChild, 1);
 *child = (BdrvChild) {
 .bs = child_bs,
 .role   = child_role,
@@ -1341,6 +1363,21 @@ static void bdrv_attach_child(BlockDriverState 
*parent_bs,
 QLIST_INSERT_HEAD(&parent_bs->children, child, next);
 }
 
+static void bdrv_detach_child(BlockDriverState *parent_bs,
+  BlockDriverState *child_bs)
+{
+BdrvChild *child, *next_child;
+QLIST_FOREACH_SAFE(child, &parent_bs->children, next, next_child) {
+if (child->bs == child_bs) {
+if (child->bs->inherits_from == parent_bs) {
+child->bs->inherits_from = NULL;
+}
+QLIST_REMOVE(child, next);
+g_free(child);
+}
+}
+}
+
 /*
  * Opens a disk image (raw, qcow2, vmdk, ...)
  *
@@ -2116,7 +2153,6 @@ void bdrv_append(BlockDriverState *bs_new, 
BlockDriverState *bs_top)
 /* The contents of 'tmp' will become bs_top, as we are
  * swapping bs_new and bs_top contents. */
 bdrv_set_backing_hd(bs_top, bs_new);
-bdrv_attach_child(bs_top, bs_new, &child_backing);
 }
 
 static void bdrv_delete(BlockDriverState *bs)
-- 
2.1.4

[Qemu-devel] [PATCH 0/1] A couple of problems with BlockDriverState's children list

2015-07-01 Thread Alberto Garcia

I've been debugging a couple of problems related to the recently
merged bdrv_reopen() overhaul code.

1. bs->children is not updated correctly

The problem is described in this e-mail:

   https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg06813.html

In short, changing an image's backing hd does not replace the pointer
in the bs->children list.

The children list is a feature added recently in 6e93e7c41fdfdee30.
In addition to bs->backing_hd and bs->file it also includes other
driver-specific children for cases like Quorum.

However there's no way to remove items from that list. It seems that
this was discussed when the patch was first published, but no one saw
a case where this could break:

   https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02284.html

The problem is that it can break: the block-stream command removes a
BDS's backing image (optionally replacing it with a new one), so the
pointer in bs->children becomes invalid.

I wrote a patch that updates bs->children when bdrv_set_backing_hd()
is called. It also makes sure that the same child is not added
twice to the same parent (that can happen due to bdrv_set_backing_hd()
being called in bdrv_open_backing_file()).
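
The bookkeeping described above can be sketched in isolation. This is only
an illustration, not the QEMU code: Node, Child, attach_child(),
detach_child() and set_backing() are made-up names, and a plain singly
linked list stands in for QEMU's QLIST macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for BlockDriverState and BdrvChild. */
typedef struct Node Node;
typedef struct Child {
    Node *bs;
    struct Child *next;
} Child;

struct Node {
    Child *children;   /* head of the parent's list of children */
    Node *backing;     /* current backing node, or NULL */
};

/* Add child_bs to the parent's list unless it is already there. */
static void attach_child(Node *parent, Node *child_bs)
{
    Child *c;

    for (c = parent->children; c; c = c->next) {
        if (c->bs == child_bs) {
            return;
        }
    }
    c = malloc(sizeof(*c));
    assert(c);
    c->bs = child_bs;
    c->next = parent->children;
    parent->children = c;
}

/* Remove every entry pointing at child_bs; a no-op if there is none. */
static void detach_child(Node *parent, Node *child_bs)
{
    Child **link = &parent->children;

    while (*link) {
        Child *c = *link;
        if (c->bs == child_bs) {
            *link = c->next;
            free(c);
        } else {
            link = &c->next;
        }
    }
}

/* Changing the backing node keeps the list in sync with the pointer. */
static void set_backing(Node *parent, Node *new_backing)
{
    if (parent->backing) {
        detach_child(parent, parent->backing);
    }
    parent->backing = new_backing;
    if (new_backing) {
        attach_child(parent, new_backing);
    }
}

/* Count list entries; only used to check the invariant. */
static int count_children(const Node *parent)
{
    int n = 0;
    const Child *c;

    for (c = parent->children; c; c = c->next) {
        n++;
    }
    return n;
}
```

With this shape, replacing or dropping the backing node never leaves a
stale entry behind, and a second attach of the same node is harmless.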

I think this is enough to solve this problem, but I haven't checked
all other cases of children added using bdrv_attach_child(). Anyway the
assumption that once a BDS is added to that list it will always be
valid seems very broad to me.


2. bdrv_reopen_queue() includes the complete backing chain

Calling bdrv_reopen() on a particular BlockDriverState now adds its
whole backing chain to the queue (formerly I think it was just
bs->file).

I don't know why this behavior is necessary, but there are surely
things that I'm overlooking.

However this breaks one of the features of my intermediate block
streaming patchset: the ability to start several block-stream
operations in parallel as long as the affected chains don't overlap.

This breaks iotest 030, as described here:

   https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg06273.html

Now, this feature was just a nice side effect of the ability to stream
to intermediate images, and is of secondary importance to me; so if I
can no longer assume that bdrv_reopen() is not going to touch the
whole backing chain, I can just remove it very easily and still leave
the main functionality intact.

Comments are welcome.

Thanks,

Berto

Alberto Garcia (1):
  block: update BlockDriverState's children in bdrv_set_backing_hd()

 block.c | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

-- 
2.1.4

Re: [Qemu-devel] [PATCH] target-mips: fix MIPS64R6-generic configuration

2015-07-01 Thread Yongbok Kim

On 01/07/2015 15:06, Aurelien Jarno wrote:
> On 2015-07-01 14:57, Yongbok Kim wrote:
>> On 01/07/2015 14:48, Aurelien Jarno wrote:
>>> On 2015-06-29 10:11, Yongbok Kim wrote:
>>>> Fix core configuration for MIPS64R6-generic to make it as close as
>>>> I6400.
>>>> I6400 core has 48-bit of Virtual Address available (SEGBITS).
>>>> MIPS SIMD Architecture is available.
>>>> Rearrange order of bits to match the specification.
>>>>
>>>> Signed-off-by: Yongbok Kim 
>>>> ---
>>>>  target-mips/mips-defs.h  |2 +-
>>>>  target-mips/translate_init.c |   18 +-
>>>>  2 files changed, 10 insertions(+), 10 deletions(-)
>>>
>>> Reviewed-by: Aurelien Jarno 
>>>
>>> That said given we are getting closer to the I6400 CPU model, shouldn't
>>> we try to directly model a I6400 core (even if we have to disable some
>>> features  like IEEE 754-2008 FP) instead of a generic MIPS64R6 core?
>>>
>>
>> I fully agree with that but detailed specification of I6400 has not been
>> published yet, therefore for the time being we will need to use the generic
> 
> Oh ok.
> 
>> core name. However we could rename mips32r5-generic into P5600 with such
>> restrictions - Hardware page table walk, Virtualization, EVA.
>> What do you think?
> 
> I think it's a good idea, as long as we keep the config register in sync
> with what is actually implemented.
> 

I will form a patch to do that.

> That also reminds me that we should look at implementing hardware page
> table walk. That should be relatively easy to implement, and provide a
> huge performance boost (exceptions cost a lot on QEMU).
> 

Actually I have implemented HTW (for MIPS32 only) but due to lack of
resources, I couldn't upstream it for 2.4. Please have a look at the commits below.
https://github.com/yongbok/prpl-qemu/commit/b39e60b4039bb72ab5eccabfb75f6e6389d89bfd
https://github.com/yongbok/prpl-qemu/commit/4fd75126c1d78d84a91c659de17a5bc45efdef27

Regards,
Yongbok

[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed

2015-07-01 Thread Li Chengyuan

-----Original Message-----
From: Paolo Bonzini [mailto:pbonz...@redhat.com] 
Sent: 1 July 2015 21:39
To: Li, Chengyuan
Cc: kevin...@tencent.com
Subject: Re: [Qemu-devel] [PATCH] Fix irq route entries exceed 
KVM_MAX_IRQ_ROUTES

On 30/06/2015 05:47, Li, Chengyuan wrote:
> Here is my understanding,
> 
> 1) why isn't the existing check in kvm_irqchip_get_virq enough to fix 
> the bug?
> 
> From the kvm_pc_setup_irq_routing() function, we can see that 15 routes
> from the PIC and 23 routes from the IOAPIC are added to the irq route
> table, but only 23 irqs (GSIs) are reserved. As a result, the irq route
> table can be full while there are still 15 free GSIs, so the "retry"
> part of kvm_irqchip_get_virq() never gets a chance to execute.
> 
> 2) If you introduce this extra call to kvm_flush_dynamic_msi_routes, 
> does the existing check become obsolete?
> 
> As gsi_count is the maximum number of entries in the irq route table,
> if the code below is merged the existing check becomes obsolete and can
> be removed.
> 
> +if (!s->direct_msi && s->irq_routes->nr == s->gsi_count) {
> +kvm_flush_dynamic_msi_routes(s);
> +}
> 
> Please let me know if you have any other comments on the patch. Thanks!

Thanks for finally clearing up my doubts about the patch!  I'll apply it
soon.

Paolo
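
The arithmetic in the quoted explanation can be checked with a toy model.
This is a sketch under stated assumptions, not the kvm-all.c code: the
ToyState type, the toy_* functions and the capacity of 64 are invented for
illustration (the real limit is KVM_MAX_IRQ_ROUTES), and only the 15/23
figures come from the mail above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sizes; only the 15 and 23 come from the discussion. */
#define TOY_MAX_ROUTES    64
#define TOY_PIC_ROUTES    15
#define TOY_IOAPIC_ROUTES 23

typedef struct ToyState {
    int nr_routes;   /* entries in the route table */
    int used_gsis;   /* GSIs marked as used */
    int gsi_count;   /* capacity of both */
} ToyState;

/* Static setup: 38 routes are added, but only 23 GSIs are reserved. */
static void toy_setup(ToyState *s)
{
    s->gsi_count = TOY_MAX_ROUTES;
    s->nr_routes = TOY_PIC_ROUTES + TOY_IOAPIC_ROUTES;
    s->used_gsis = TOY_IOAPIC_ROUTES;
}

/* Existing style of check: fires only when no GSI is free. */
static bool toy_gsi_bitmap_full(const ToyState *s)
{
    return s->used_gsis == s->gsi_count;
}

/* Style of check from the patch: fires when the route table is full. */
static bool toy_route_table_full(const ToyState *s)
{
    return s->nr_routes == s->gsi_count;
}

/* Add one dynamic MSI route; returns false once the table is full. */
static bool toy_add_msi_route(ToyState *s)
{
    if (toy_route_table_full(s)) {
        return false;
    }
    s->nr_routes++;
    s->used_gsis++;
    return true;
}
```

Filling the table in this model leaves 15 GSIs unused, so a check that
waits for the GSI space to run out would never trigger.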

https://bugs.launchpad.net/bugs/1465935

Title:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  Several of my QEMU instances crashed, and in the QEMU log I can see
  this assertion failure:

 qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

  The QEMU version is 2.0.0, the hypervisor host OS is Ubuntu 12.04
  (kernel 3.2.0-38), and the guest OS is RHEL 6.3.

Re: [Qemu-devel] [SeaBIOS] [PATCH v2 02/22] virtio: run drivers in 32bit mode

2015-07-01 Thread Kevin O'Connor

On Wed, Jul 01, 2015 at 03:50:50PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jul 01, 2015 at 02:30:29PM +0200, Gerd Hoffmann wrote:
> > On Mi, 2015-07-01 at 10:08 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Jun 30, 2015 at 10:38:53AM +0200, Gerd Hoffmann wrote:
> > > > virtio version 1.0 registers can (and actually do in the qemu
> > > > implementation) live in mmio space.  So we must run the blk and
> > > > scsi virtio drivers in 32bit mode, otherwise we can't access them.
> > > > 
> > > > This also allows to drop a bunch of GET_LOWFLAT calls from the virtio
> > > > code in the following patches.
> > > > 
> > > > Signed-off-by: Gerd Hoffmann 
> > > 
> > > Is there an advantage to running them in a 16 bit mode?
> > 
> > Not really any more.  Switching from 32bit mode back to
> > whatever-was-active-before used to be problematic before we had smm mode
> > support.  In theory.  Because you can't save/restore the complete x86
> > processor state.  In practice we had surprisingly few problems,
> > apparently linux boot loaders simply don't play dirty tricks.
> > 
> > cheers,
> >   Gerd
> > 
> 
> Interesting. Might not be true for non-linux loaders :)

Without SMM, the only issue I've seen with "thunking" to 32bit mode
was DOS-era programs (and in particular those that used emm386).

> Anyway we support SMM now so all should be well, right?

With SMM, I haven't seen any problems.  I don't doubt that some
DOS-era programs might still have issues though.  Also, I haven't
tested Paolo's kvm smm support yet.

SeaBIOS already runs a number of drivers exclusively in 32bit mode:
ahci, xhci, sdcard, ohci disks, pvscsi.  Even without smm support,
virtio is likely a good candidate to move to 32bit mode as I don't
think there is a use case for running DOS-era programs on virtio
disks.  (Using 32bit only drivers results in smaller and easier to
maintain code - indeed we'd like to move exclusively to 32bit drivers
in the future.)

Cheers,
-Kevin

Re: [Qemu-devel] [PATCH v3 0/9] HyperV equivalent of pvpanic driver

2015-07-01 Thread Paolo Bonzini



On 30/06/2015 13:33, Denis V. Lunev wrote:
> Windows 2012 guests can notify the hypervisor about a guest crash
> (Windows bugcheck (BSOD)) by writing specific Hyper-V MSRs. This patch
> handles these MSRs in KVM and sends a notification to user space, which
> allows QEMU/libvirt to gather a Windows guest crash dump.
> 
> The idea is to provide functionality equal to pvpanic device without
> QEMU guest agent for Windows.
> 
> The idea is borrowed from Linux HyperV bus driver and validated against
> Windows 2k12.
> 
> Changes from v2:
> * forbid modification crash ctl msr by guest
> * qemu_system_guest_panicked usage in pvpanic and s390x
> * hyper-v crash handler move from generic kvm to i386
> * hyper-v crash handler: skip fetching crash msrs, just mark crash occurred
> * sync with linux-next 20150629
> * patch 11 squashed to patch 10
> * patch 9 squashed to patch 7
> 
> Changes from v1:
> * hyperv code move to hyperv.c
> * added read handlers of crash data msrs
> * added per vm and per cpu hyperv context structures
> * added saving crash msrs inside qemu cpu state
> * added qemu fetch and update of crash msrs
> * added qemu crash msrs store in cpu state and its migration
> 
> Signed-off-by: Andrey Smetanin 
> Signed-off-by: Denis V. Lunev 
> CC: Gleb Natapov 
> CC: Paolo Bonzini 

The patches look good, thanks.  I'll queue them as soon as I start
merging 4.3 features.

Paolo

Re: [Qemu-devel] [PATCH RFC 1 8/8] xen/pt: Check for return values for xen_host_pci_[get|set] in init

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> and if we have failures we call xen_pt_destroy introduced in
> 'xen/pt: Move bulk of xen_pt_unregister_device in its own routine.'
> and free all of the allocated structures.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Stefano Stabellini 


>  hw/xen/xen_pt.c | 32 +---
>  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 589c6c6..ce202e9 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -779,10 +779,11 @@ static int xen_pt_initfn(PCIDevice *d)
>  }
>  
>  /* Initialize virtualized PCI configuration (Extended 256 Bytes) */
> -if (xen_host_pci_get_block(&s->real_device, 0, d->config,
> -   PCI_CONFIG_SPACE_SIZE) < 0) {
> -xen_host_pci_device_put(&s->real_device);
> -return -1;
> +rc = xen_host_pci_get_block(&s->real_device, 0, d->config,
> +PCI_CONFIG_SPACE_SIZE);
> +if (rc < 0) {
> +XEN_PT_ERR(d,"Could not read PCI_CONFIG space! (rc:%d)\n", rc);
> +goto err_out;
>  }
>  
>  s->memory_listener = xen_pt_memory_listener;
> @@ -792,17 +793,18 @@ static int xen_pt_initfn(PCIDevice *d)
>  xen_pt_register_regions(s, &cmd);
>  
>  /* reinitialize each config register to be emulated */
> -if (xen_pt_config_init(s)) {
> +rc = xen_pt_config_init(s);
> +if (rc) {
>  XEN_PT_ERR(d, "PCI Config space initialisation failed.\n");
> -xen_host_pci_device_put(&s->real_device);
> -return -1;
> +goto err_out;
>  }
>  
>  /* Bind interrupt */
> -if (xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN,
> -  &machine_irq /* temp scratch */)) {
> +rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN,
> +   &machine_irq /* temp scratch */);
> +if (rc) {
>  XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
> -machine_irq = 0;
> +goto err_out;
>  }
>  if (!machine_irq) {
>  XEN_PT_LOG(d, "no pin interrupt\n");
> @@ -859,12 +861,15 @@ out:
>  rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
>  if (rc) {
>  XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
> +goto err_out;
>  }
>  else {
>  val |= cmd;
> -if (xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val)) {
> +rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
> +if (rc) {
>  XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: 
> %d)\n",
> val, rc);
> +goto err_out;
>  }
>  }
>  }
> @@ -877,6 +882,11 @@ out:
> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function);
>  
>  return 0;
> +
> +err_out:
> +xen_pt_destroy(d);
> +assert(rc);
> +return rc;
>  }
>  
>  static void xen_pt_unregister_device(PCIDevice *d)
> -- 
> 2.1.0
>
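
The error-handling shape in this patch (every failing step jumps to one
label that calls a teardown routine which is safe in any state) can be
sketched on its own. Everything here is hypothetical: ToyDev, toy_init()
and toy_destroy() are not QEMU names, and the fail_step parameter exists
only so the failure paths can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyDev {
    int config_fd;       /* -1 when the host device is closed */
    bool listener_set;   /* true once listeners are registered */
    int releases;        /* how many resources were actually released */
} ToyDev;

/* Safe to call any number of times, in any state. */
static void toy_destroy(ToyDev *d)
{
    if (d->listener_set) {
        d->listener_set = false;
        d->releases++;
    }
    if (d->config_fd != -1) {
        d->config_fd = -1;
        d->releases++;
    }
}

/* fail_step selects which init step reports an error (0 = none). */
static int toy_init(ToyDev *d, int fail_step)
{
    int rc;

    d->config_fd = 3;                 /* pretend the device was opened */

    rc = (fail_step == 1) ? -5 : 0;   /* step 1: read config space */
    if (rc < 0) {
        goto err_out;
    }
    rc = (fail_step == 2) ? -22 : 0;  /* step 2: init config registers */
    if (rc) {
        goto err_out;
    }
    d->listener_set = true;
    return 0;

err_out:
    /* One exit path: undo whatever was set up so far. */
    toy_destroy(d);
    assert(rc);
    return rc;
}
```

Because the teardown checks each resource before releasing it, the same
routine serves both the init failure path and the normal exit path.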

[Qemu-devel] [PATCH] target-xtensa: fix gdb register map construction

2015-07-01 Thread Max Filippov

Due to differences in gdb overlay organization between windowed and call0
configurations, the core import script doesn't always work correctly.
Simplify the script: always copy the complete gdb register map from the
overlay and count registers at core registration time. Update existing cores.

Signed-off-by: Max Filippov 
---
 target-xtensa/core-dc232b.c  |  2 +-
 target-xtensa/core-dc233c.c  |  2 +-
 target-xtensa/core-fsf.c |  7 ++-
 target-xtensa/cpu.h  |  1 +
 target-xtensa/helper.c   | 14 ++
 target-xtensa/import_core.sh |  6 ++
 target-xtensa/overlay_tool.h |  2 ++
 7 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/target-xtensa/core-dc232b.c b/target-xtensa/core-dc232b.c
index a3b914b..06826c0 100644
--- a/target-xtensa/core-dc232b.c
+++ b/target-xtensa/core-dc232b.c
@@ -33,7 +33,7 @@
 #include "core-dc232b/core-isa.h"
 #include "overlay_tool.h"
 
-static const XtensaConfig dc232b __attribute__((unused)) = {
+static XtensaConfig dc232b __attribute__((unused)) = {
 .name = "dc232b",
 .gdb_regmap = {
 .num_regs = 120,
diff --git a/target-xtensa/core-dc233c.c b/target-xtensa/core-dc233c.c
index ac745d1..8daf7d9 100644
--- a/target-xtensa/core-dc233c.c
+++ b/target-xtensa/core-dc233c.c
@@ -34,7 +34,7 @@
 #include "core-dc233c/core-isa.h"
 #include "overlay_tool.h"
 
-static const XtensaConfig dc233c __attribute__((unused)) = {
+static XtensaConfig dc233c __attribute__((unused)) = {
 .name = "dc233c",
 .gdb_regmap = {
 .num_regs = 121,
diff --git a/target-xtensa/core-fsf.c b/target-xtensa/core-fsf.c
index cfcc840..f6ea6b9 100644
--- a/target-xtensa/core-fsf.c
+++ b/target-xtensa/core-fsf.c
@@ -33,9 +33,14 @@
 #include "core-fsf/core-isa.h"
 #include "overlay_tool.h"
 
-static const XtensaConfig fsf __attribute__((unused)) = {
+static XtensaConfig fsf __attribute__((unused)) = {
 .name = "fsf",
+.gdb_regmap = {
 /* GDB for this core is not supported currently */
+.reg = {
+XTREG_END
+},
+},
 .clock_freq_khz = 1,
 DEFAULT_SECTIONS
 };
diff --git a/target-xtensa/cpu.h b/target-xtensa/cpu.h
index b592efb..b89c602 100644
--- a/target-xtensa/cpu.h
+++ b/target-xtensa/cpu.h
@@ -400,6 +400,7 @@ XtensaCPU *cpu_xtensa_init(const char *cpu_model);
 void xtensa_translate_init(void);
 void xtensa_breakpoint_handler(CPUState *cs);
 int cpu_xtensa_exec(CPUXtensaState *s);
+void xtensa_finalize_config(XtensaConfig *config);
 void xtensa_register_core(XtensaConfigList *node);
 void check_interrupts(CPUXtensaState *s);
 void xtensa_irq_init(CPUXtensaState *env);
diff --git a/target-xtensa/helper.c b/target-xtensa/helper.c
index d84d259..76be50d 100644
--- a/target-xtensa/helper.c
+++ b/target-xtensa/helper.c
@@ -51,6 +51,20 @@ static void xtensa_core_class_init(ObjectClass *oc, void 
*data)
 cc->gdb_num_core_regs = config->gdb_regmap.num_regs;
 }
 
+void xtensa_finalize_config(XtensaConfig *config)
+{
+unsigned i, n = 0;
+
+if (config->gdb_regmap.num_regs) {
+return;
+}
+
+for (i = 0; config->gdb_regmap.reg[i].targno >= 0; ++i) {
+n += (config->gdb_regmap.reg[i].type != 6);
+}
+config->gdb_regmap.num_regs = n;
+}
+
 void xtensa_register_core(XtensaConfigList *node)
 {
 TypeInfo type = {
diff --git a/target-xtensa/import_core.sh b/target-xtensa/import_core.sh
index 73791ec..351bee4 100755
--- a/target-xtensa/import_core.sh
+++ b/target-xtensa/import_core.sh
@@ -22,8 +22,7 @@ mkdir -p "$TARGET"
 tar -xf "$OVERLAY" -C "$TARGET" --strip-components=1 \
 --xform='s/core/core-isa/' config/core.h
 tar -xf "$OVERLAY" -O gdb/xtensa-config.c | \
-sed -n '1,/*\//p;/pc/,/a15/p' > "$TARGET"/gdb-config.c
-NUM_REGS=$(grep XTREG "$TARGET"/gdb-config.c | wc -l)
+sed -n '1,/*\//p;/XTREG/,/XTREG_END/p' > "$TARGET"/gdb-config.c
 
 cat <<EOF > "${TARGET}.c"
 #include "cpu.h"
@@ -34,10 +33,9 @@ cat <<EOF > "${TARGET}.c"
 #include "core-$NAME/core-isa.h"
 #include "overlay_tool.h"
 
-static const XtensaConfig $NAME __attribute__((unused)) = {
+static XtensaConfig $NAME __attribute__((unused)) = {
 .name = "$NAME",
 .gdb_regmap = {
-.num_regs = $NUM_REGS,
 .reg = {
 #include "core-$NAME/gdb-config.c"
 }
diff --git a/target-xtensa/overlay_tool.h b/target-xtensa/overlay_tool.h
index f7b1510..eda03aa 100644
--- a/target-xtensa/overlay_tool.h
+++ b/target-xtensa/overlay_tool.h
@@ -28,6 +28,7 @@
 #define XTREG(idx, ofs, bi, sz, al, no, flags, cp, typ, grp, name, \
 a1, a2, a3, a4, a5, a6) \
 { .targno = (no), .type = (typ), .group = (grp), .size = (sz) },
+#define XTREG_END { .targno = -1 },
 
 #ifndef XCHAL_HAVE_DIV32
 #define XCHAL_HAVE_DIV32 0
@@ -316,6 +317,7 @@
 static XtensaConfigList node = { \
 .config = &core, \
 }; \
+xtensa_finalize_config(&core); \
 xtensa_register_core(&node); \
 }
 #else
-- 
1.8.1.4
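
The counting step added in helper.c can be tried outside QEMU with a
cut-down table. This is a sketch only: ToyReg, ToyRegMap and
toy_finalize() are invented stand-ins for the XtensaConfig types, the
table is limited to 8 entries, and TOY_SKIPPED_TYPE just names the
literal 6 that the patch compares against.

```c
#include <assert.h>

#define TOY_SKIPPED_TYPE 6   /* the patch skips entries with .type == 6 */

typedef struct ToyReg {
    int targno;   /* -1 marks the end of the table */
    int type;
} ToyReg;

typedef struct ToyRegMap {
    unsigned num_regs;   /* 0 means "not counted yet" */
    ToyReg reg[8];
} ToyRegMap;

/* Count entries up to the sentinel unless a count was already given. */
static void toy_finalize(ToyRegMap *map)
{
    unsigned i, n = 0;

    if (map->num_regs) {
        return;
    }
    for (i = 0; map->reg[i].targno >= 0; ++i) {
        n += (map->reg[i].type != TOY_SKIPPED_TYPE);
    }
    map->num_regs = n;
}
```

A table that already carries a non-zero num_regs is left alone, which is
why the existing cores with hard-coded counts keep working.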

Re: [Qemu-devel] [PATCH] target-mips: fix MIPS64R6-generic configuration

2015-07-01 Thread Aurelien Jarno

On 2015-07-01 14:57, Yongbok Kim wrote:
> On 01/07/2015 14:48, Aurelien Jarno wrote:
> > On 2015-06-29 10:11, Yongbok Kim wrote:
> >> Fix core configuration for MIPS64R6-generic to make it as close as
> >> I6400.
> >> I6400 core has 48-bit of Virtual Address available (SEGBITS).
> >> MIPS SIMD Architecture is available.
> >> Rearrange order of bits to match the specification.
> >>
> >> Signed-off-by: Yongbok Kim 
> >> ---
> >>  target-mips/mips-defs.h  |2 +-
> >>  target-mips/translate_init.c |   18 +-
> >>  2 files changed, 10 insertions(+), 10 deletions(-)
> > 
> > Reviewed-by: Aurelien Jarno 
> > 
> > That said given we are getting closer to the I6400 CPU model, shouldn't
> > we try to directly model a I6400 core (even if we have to disable some
> > features  like IEEE 754-2008 FP) instead of a generic MIPS64R6 core?
> > 
> 
> I fully agree with that but detailed specification of I6400 has not been
> published yet, therefore for the time being we will need to use the generic

Oh ok.

> core name. However we could rename mips32r5-generic into P5600 with such
> restrictions - Hardware page table walk, Virtualization, EVA.
> What do you think?

I think it's a good idea, as long as we keep the config register in sync
with what is actually implemented.

That also reminds me that we should look at implementing hardware page
table walk. That should be relatively easy to implement, and provide a
huge performance boost (exceptions cost a lot on QEMU).

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH RFC 1 7/8] xen/pt: Move bulk of xen_pt_unregister_device in its own routine.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> This way we can call it if we fail during init.
> 
> This code movement introduces no changes.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Stefano Stabellini 



>  hw/xen/xen_pt.c | 119 
> +---
>  1 file changed, 62 insertions(+), 57 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index cda6a2d..589c6c6 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -686,6 +686,67 @@ static const MemoryListener xen_pt_io_listener = {
>  .priority = 10,
>  };
>  
> +/* destroy. */
> +static void xen_pt_destroy(PCIDevice *d) {
> +
> +XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
> +uint8_t machine_irq = s->machine_irq;
> +uint8_t intx;
> +int rc;
> +
> + /* Note that if xen_host_pci_device_put had closed config_fd, then
> +  * intx value becomes 0xff. */
> +intx = xen_pt_pci_intx(s);
> +if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
> +rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
> + PT_IRQ_TYPE_PCI,
> + pci_bus_num(d->bus),
> + PCI_SLOT(s->dev.devfn),
> + intx,
> + 0 /* isa_irq */);
> +if (rc < 0) {
> +XEN_PT_ERR(d, "unbinding of interrupt INT%c failed."
> +   " (machine irq: %i, err: %d)"
> +   " But bravely continuing on..\n",
> +   'a' + intx, machine_irq, errno);
> +}
> +}
> +
> +/* N.B. xen_pt_config_delete takes care of freeing them. */
> +if (s->msi) {
> +xen_pt_msi_disable(s);
> +}
> +if (s->msix) {
> +xen_pt_msix_disable(s);
> +}
> +
> +if (machine_irq) {
> +xen_pt_mapped_machine_irq[machine_irq]--;
> +
> +if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
> +rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq);
> +
> +if (rc < 0) {
> +XEN_PT_ERR(d, "unmapping of interrupt %i failed. (err: %d)"
> +   " But bravely continuing on..\n",
> +   machine_irq, errno);
> +}
> +}
> +s->machine_irq = 0;
> +}
> +
> +/* delete all emulated config registers */
> +xen_pt_config_delete(s);
> +
> +if (s->listener_set) {
> +memory_listener_unregister(&s->memory_listener);
> +memory_listener_unregister(&s->io_listener);
> +s->listener_set = false;
> +}
> +if (!xen_host_pci_device_closed(&s->real_device)) {
> +xen_host_pci_device_put(&s->real_device);
> +}
> +}
>  /* init */
>  
>  static int xen_pt_initfn(PCIDevice *d)
> @@ -820,63 +881,7 @@ out:
>  
>  static void xen_pt_unregister_device(PCIDevice *d)
>  {
> -XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
> -uint8_t machine_irq = s->machine_irq;
> -uint8_t intx;
> -int rc;
> -
> - /* Note that if xen_host_pci_device_put had closed config_fd, then
> -  * intx value becomes 0xff. */
> -intx = xen_pt_pci_intx(s);
> -if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
> -rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
> - PT_IRQ_TYPE_PCI,
> - pci_bus_num(d->bus),
> - PCI_SLOT(s->dev.devfn),
> - intx,
> - 0 /* isa_irq */);
> -if (rc < 0) {
> -XEN_PT_ERR(d, "unbinding of interrupt INT%c failed."
> -   " (machine irq: %i, err: %d)"
> -   " But bravely continuing on..\n",
> -   'a' + intx, machine_irq, errno);
> -}
> -}
> -
> -/* N.B. xen_pt_config_delete takes care of freeing them. */
> -if (s->msi) {
> -xen_pt_msi_disable(s);
> -}
> -if (s->msix) {
> -xen_pt_msix_disable(s);
> -}
> -
> -if (machine_irq) {
> -xen_pt_mapped_machine_irq[machine_irq]--;
> -
> -if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
> -rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq);
> -
> -if (rc < 0) {
> -XEN_PT_ERR(d, "unmapping of interrupt %i failed. (err: %d)"
> -   " But bravely continuing on..\n",
> -   machine_irq, errno);
> -}
> -}
> -s->machine_irq = 0;
> -}
> -
> -/* delete all emulated config registers */
> -xen_pt_config_delete(s);
> -
> -if (s->listener_set) {
> -memory_listener_unregister(&s->memory_listener);
> -memory_listener_unregister(&s->io_listener);
> -s->listener_set = false;
> -}
> -if (!xen_host_pci_device

Re: [Qemu-devel] [PATCH RFC 1 6/8] xen/pt: Make xen_pt_unregister_device idempotent

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> To deal with xen_host_pci_[set|get]_ functions returning error values
> and cleaning up after ourselves in the init function we should make the
> .exit (xen_pt_unregister_device) function idempotent in case
> the generic code starts calling .exit (or for fun does it before
> calling .init!).
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  hw/xen/xen-host-pci-device.c |  5 +
>  hw/xen/xen-host-pci-device.h |  1 +
>  hw/xen/xen_pt.c  | 22 --
>  hw/xen/xen_pt.h  |  2 ++
>  4 files changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
> index 743b37b..5b20570 100644
> --- a/hw/xen/xen-host-pci-device.c
> +++ b/hw/xen/xen-host-pci-device.c
> @@ -387,6 +387,11 @@ error:
>  return rc;
>  }
>  
> +bool xen_host_pci_device_closed(XenHostPCIDevice *d)
> +{
> +return d->config_fd == -1;
> +}
> +
>  void xen_host_pci_device_put(XenHostPCIDevice *d)
>  {
>  if (d->config_fd >= 0) {
> diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h
> index c2486f0..16f4805 100644
> --- a/hw/xen/xen-host-pci-device.h
> +++ b/hw/xen/xen-host-pci-device.h
> @@ -38,6 +38,7 @@ typedef struct XenHostPCIDevice {
>  int xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
>  uint8_t bus, uint8_t dev, uint8_t func);
>  void xen_host_pci_device_put(XenHostPCIDevice *pci_dev);
> +bool xen_host_pci_device_closed(XenHostPCIDevice *d);
>  
>  int xen_host_pci_get_byte(XenHostPCIDevice *d, int pos, uint8_t *p);
>  int xen_host_pci_get_word(XenHostPCIDevice *d, int pos, uint16_t *p);
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 2535352..cda6a2d 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -810,6 +810,7 @@ out:
>  
>  memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
>  memory_listener_register(&s->io_listener, &address_space_io);
> +s->listener_set = true;
>  XEN_PT_LOG(d,
> "Real physical device %02x:%02x.%d registered 
> successfully!\n",
> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function);
> @@ -821,10 +822,13 @@ static void xen_pt_unregister_device(PCIDevice *d)
>  {
>  XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
>  uint8_t machine_irq = s->machine_irq;
> -uint8_t intx = xen_pt_pci_intx(s);
> +uint8_t intx;
>  int rc;
>  
> -if (machine_irq) {
> + /* Note that if xen_host_pci_device_put had closed config_fd, then
> +  * intx value becomes 0xff. */
> +intx = xen_pt_pci_intx(s);
> +if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
>  rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
>   PT_IRQ_TYPE_PCI,
>   pci_bus_num(d->bus),
> @@ -839,6 +843,7 @@ static void xen_pt_unregister_device(PCIDevice *d)
>  }
>  }
>  
> +/* N.B. xen_pt_config_delete takes care of freeing them. */
>  if (s->msi) {
>  xen_pt_msi_disable(s);
>  }
> @@ -858,15 +863,20 @@ static void xen_pt_unregister_device(PCIDevice *d)
> machine_irq, errno);
>  }
>  }
> +s->machine_irq = 0;
>  }
>  
>  /* delete all emulated config registers */
>  xen_pt_config_delete(s);
>  
> -memory_listener_unregister(&s->memory_listener);
> -memory_listener_unregister(&s->io_listener);
> -
> -xen_host_pci_device_put(&s->real_device);
> +if (s->listener_set) {
> +memory_listener_unregister(&s->memory_listener);
> +memory_listener_unregister(&s->io_listener);
> +s->listener_set = false;

If you call QTAILQ_INIT on memory_listener and io_listener, then you
can simply check QTAILQ_EMPTY and remove listener_set.


> +}
> +if (!xen_host_pci_device_closed(&s->real_device)) {
> +xen_host_pci_device_put(&s->real_device);
> +}
>  }
>  
>  static Property xen_pci_passthrough_properties[] = {
> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
> index 09358b1..98eb74c 100644
> --- a/hw/xen/xen_pt.h
> +++ b/hw/xen/xen_pt.h
> @@ -217,6 +217,7 @@ struct XenPCIPassthroughState {
>  
>  MemoryListener memory_listener;
>  MemoryListener io_listener;
> +bool listener_set;
>  };
>  
>  int xen_pt_config_init(XenPCIPassthroughState *s);
> @@ -282,6 +283,7 @@ static inline uint8_t 
> xen_pt_pci_intx(XenPCIPassthroughState *s)
> " value=%i, acceptable range is 1 - 4\n", r_val);
>  r_val = 0;
>  } else {
> +/* Note that if s.real_device.config_fd is closed we make 0xff. */
>  r_val -= 1;
>  }
>  
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH v2 02/22] virtio: run drivers in 32bit mode

2015-07-01 Thread Gerd Hoffmann

  Hi,

> > Not really any more.  Switching from 32bit mode back to
> > whatever-was-active-before used to be problematic before we had smm mode
> > support.  In theory.  Because you can't save/restore the complete x86
> > processor state.  In practice we had surprisingly few problems,
> > apparently linux boot loaders simply don't play dirty tricks.
> > 

> Interesting. Might not be true for non-linux loaders :)

No problems with modern windows too.
DOS with emm386 being active could be more challenging.

Didn't dig that deep though, and as the ide driver still runs
in 16bit mode there is an easy way around any
issues should they pop up.

> Anyway we support SMM now so all should be well, right?

Yes, with smm this issue is gone.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v3 1/5] cpu: Provide vcpu throttling interface

2015-07-01 Thread Paolo Bonzini



On 26/06/2015 20:07, Dr. David Alan Gilbert wrote:
> * Jason J. Herne (jjhe...@linux.vnet.ibm.com) wrote:
>> Provide a method to throttle guest cpu execution. CPUState is augmented with
>> timeout controls and throttle start/stop functions. To throttle the guest cpu
>> the caller simply has to call the throttle set function and provide a 
>> percentage
>> of throttle time.
> 
> I'm worried about atomicity and threads and all those fun things.
> 
> I think all the starting/stopping/setting the throttling level is done in the
> migration thread; I think the timers run in the main/io thread?
> So you really need to be careful with at least:
> throttle_timer_stop - which may have a minor effect  
> throttle_timer  - I worry about the way cpu_timer_active checks the 
> pointer
>   yet it's freed when the timer goes off.   It's probably
>   not too bad because it never dereferences it.

Agreed.  I think the only atomic should be throttle_percentage; if zero,
throttling is inactive.

In particular, throttle_ratio can be computed in cpu_throttle_thread.

If you have exactly one variable that is shared between the threads,
everything is much simpler.
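
To illustrate the single-shared-variable idea, here is a minimal sketch in
plain C11 atomics. It is not QEMU code: the function names, the microsecond
timeslice constant and the integer arithmetic are assumptions made for this
example (the patch uses a float ratio and QEMU's own helpers). The point is
only that the sleep time can be derived locally from one atomic read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define THROTTLE_PCT_MAX      99     /* same upper bound as the patch */
#define THROTTLE_TIMESLICE_US 10000  /* assumed 10 ms run slice, in us */

/* The only state shared between threads; 0 means "not throttling". */
static atomic_int throttle_percentage;

/* Called from the controlling thread (for example the migration thread). */
void throttle_set(int pct)
{
    if (pct < 0) {
        pct = 0;
    }
    if (pct > THROTTLE_PCT_MAX) {
        pct = THROTTLE_PCT_MAX;
    }
    atomic_store(&throttle_percentage, pct);
}

bool throttle_active(void)
{
    return atomic_load(&throttle_percentage) != 0;
}

/* Called from each vCPU thread: sleep/run = pct / (100 - pct), computed
 * from a single atomic read so no shared "ratio" variable is needed. */
long throttle_sleep_us(void)
{
    int pct = atomic_load(&throttle_percentage);

    assert(pct >= 0 && pct <= THROTTLE_PCT_MAX);
    return (long)pct * THROTTLE_TIMESLICE_US / (100 - pct);
}
```

With 50% throttling the vCPU sleeps as long as it runs; setting the
percentage back to zero turns throttling off without any separate stop flag.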

There is no need to allocate and free the timer; it's very cheap and in
fact we probably should convert to statically allocated timers sooner or
later.  So you can just create it once, for example in cpu_ticks_init.

Paolo

> So, probably need some atomic's in there (cc'ing paolo)
> 
> Dave
> 
>> Signed-off-by: Jason J. Herne 
>> Reviewed-by: Matthew Rosato 
>> ---
>>  cpus.c| 76 
>> +++
>>  include/qom/cpu.h | 38 
>>  2 files changed, 114 insertions(+)
>>
>> diff --git a/cpus.c b/cpus.c
>> index de6469f..f57cf4f 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -68,6 +68,16 @@ static CPUState *next_cpu;
>>  int64_t max_delay;
>>  int64_t max_advance;
>>  
>> +/* vcpu throttling controls */
>> +static QEMUTimer *throttle_timer;
>> +static bool throttle_timer_stop;
>> +static int throttle_percentage;
> 
> Unsigned?
> 
>> +static float throttle_ratio;
>> +
>> +#define CPU_THROTTLE_PCT_MIN 1
>> +#define CPU_THROTTLE_PCT_MAX 99
>> +#define CPU_THROTTLE_TIMESLICE 10
>> +
>>  bool cpu_is_stopped(CPUState *cpu)
>>  {
>>  return cpu->stopped || !runstate_is_running();
>> @@ -919,6 +929,72 @@ static void qemu_kvm_wait_io_event(CPUState *cpu)
>>  qemu_wait_io_event_common(cpu);
>>  }
>>  
>> +static void cpu_throttle_thread(void *opaque)
>> +{
>> +long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
>> +
>> +qemu_mutex_unlock_iothread();
>> +g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
>> +qemu_mutex_lock_iothread();
>> +
>> +timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +   CPU_THROTTLE_TIMESLICE);
>> +}
>> +
>> +static void cpu_throttle_timer_pop(void *opaque)
>> +{
>> +CPUState *cpu;
>> +
>> +/* Stop the timer if needed */
>> +if (throttle_timer_stop) {
>> +timer_del(throttle_timer);
>> +timer_free(throttle_timer);
>> +throttle_timer = NULL;
>> +return;
>> +}
>> +
>> +CPU_FOREACH(cpu) {
>> +async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
>> +}
>> +}
> 
> Why pop? I pop stacks, balloons and bubbles.
> 
>> +
>> +void cpu_throttle_set(int new_throttle_pct)
>> +{
>> +double pct;
>> +
>> +/* Ensure throttle percentage is within valid range */
>> +new_throttle_pct = MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX);
>> +throttle_percentage = MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN);
>> +
>> +pct = (double)throttle_percentage/100;
>> +throttle_ratio = pct / (1 - pct);
>> +
>> +if (!cpu_throttle_active()) {
>> +throttle_timer_stop = false;
>> +throttle_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
>> +   cpu_throttle_timer_pop, NULL);
>> +timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +   CPU_THROTTLE_TIMESLICE);
>> +}
>> +}
>> +
>> +void cpu_throttle_stop(void)
>> +{
>> +if (cpu_throttle_active()) {
>> +throttle_timer_stop = true;
>> +}
>> +}
>> +
>> +bool cpu_throttle_active(void)
>> +{
>> +return (throttle_timer != NULL);
>> +}
>> +
>> +int cpu_throttle_get_percentage(void)
>> +{
>> +return throttle_percentage;
>> +}
>> +
>>  static void *qemu_kvm_cpu_thread_fn(void *arg)
>>  {
>>  CPUState *cpu = arg;
>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> index 39f0f19..56eb964 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -553,6 +553,44 @@ CPUState *qemu_get_cpu(int index);
>>   */
>>  bool cpu_exists(int64_t id);
>>  
>> +/**
>> + * cpu_throttle_set:
>> + * @new_throttle_pct: Percent of sleep time to running time.
>> + *Valid range is 1 to

[Qemu-devel] [PATCH] target-mips: fix ASID synchronisation for MIPS MT

2015-07-01 Thread Aurelien Jarno

When syncing the task ASID with EntryHi, correctly OR the value in instead
of assigning it.

Reported-by: "Dr. David Alan Gilbert" 
Signed-off-by: Aurelien Jarno 
Cc: Leon Alrae 
---
 target-mips/op_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 2a9ddff..d457a29 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -661,7 +661,7 @@ static void sync_c0_tcstatus(CPUMIPSState *cpu, int tc,
 
 /* Sync the TASID with EntryHi.  */
 cpu->CP0_EntryHi &= ~0xff;
-cpu->CP0_EntryHi = tasid;
+cpu->CP0_EntryHi |= tasid;
 
 compute_hflags(cpu);
 }
-- 
2.1.4
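
The effect of the one-character change can be seen in isolation with the
sketch below. It is a standalone illustration, not the QEMU helper: the
function names are made up for this example and EntryHi is modelled as a
plain 32-bit value whose low 8 bits hold the ASID.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the ASID field (low 8 bits) of EntryHi, keeping the upper bits. */
uint32_t entryhi_sync_asid(uint32_t entryhi, uint32_t tasid)
{
    assert(tasid <= 0xff);          /* the ASID must fit in 8 bits */
    entryhi &= ~(uint32_t)0xff;     /* clear the old ASID */
    entryhi |= tasid;               /* OR in the new one */
    return entryhi;
}

/* Pre-patch behaviour: plain assignment throws away the upper bits. */
uint32_t entryhi_sync_asid_buggy(uint32_t entryhi, uint32_t tasid)
{
    assert(tasid <= 0xff);
    entryhi &= ~(uint32_t)0xff;
    entryhi = tasid;
    return entryhi;
}
```

With an EntryHi of 0xABCDE012 and a new ASID of 0x34, the first function
yields 0xABCDE034 while the second yields just 0x34.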

Re: [Qemu-devel] [PATCH v3 1/5] cpu: Provide vcpu throttling interface

2015-07-01 Thread Paolo Bonzini



On 25/06/2015 19:46, Jason J. Herne wrote:
> +static void cpu_throttle_thread(void *opaque)
> +{
> +long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> +
> +qemu_mutex_unlock_iothread();
> +g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
> +qemu_mutex_lock_iothread();
> +
> +timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> +   CPU_THROTTLE_TIMESLICE);

The timer need not run while the VM is stopped.  Please use
QEMU_CLOCK_VIRTUAL_RT.

> +}
> +

Are you sure you want each CPU to set the timer?  I think this should be
done in cpu_throttle_timer_pop, or it could use timer_mod_anticipate.

> +static void cpu_throttle_timer_pop(void *opaque)
> +{
> +CPUState *cpu;
> +
> +/* Stop the timer if needed */
> +if (throttle_timer_stop) {
> +timer_del(throttle_timer);

timer_del is not needed in the timer callback.

Paolo

> +timer_free(throttle_timer);
> +throttle_timer = NULL;
> +return;
> +}
> +
> +CPU_FOREACH(cpu) {
> +async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> +}
> +}
> +

Re: [Qemu-devel] [PATCH RFC 1 5/8] xen/pt: Log xen_host_pci_get/set errors in MSI code.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> We seem to only use these functions when de-activating the
> MSI - so just log errors.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Stefano Stabellini 


>  hw/xen/xen_pt_msi.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> index 5822df5..e3d7194 100644
> --- a/hw/xen/xen_pt_msi.c
> +++ b/hw/xen/xen_pt_msi.c
> @@ -75,19 +75,29 @@ static int msi_msix_enable(XenPCIPassthroughState *s,
> bool enable)
>  {
>  uint16_t val = 0;
> +int rc;
>  
>  if (!address) {
>  return -1;
>  }
>  
> -xen_host_pci_get_word(&s->real_device, address, &val);
> +rc = xen_host_pci_get_word(&s->real_device, address, &val);
> +if (rc) {
> +XEN_PT_ERR(&s->dev, "Failed to read MSI/MSI-X register (0x%x), rc:%d\n",
> +   address, rc);
> +return rc;
> +}
>  if (enable) {
>  val |= flag;
>  } else {
>  val &= ~flag;
>  }
> -xen_host_pci_set_word(&s->real_device, address, val);
> -return 0;
> +rc = xen_host_pci_set_word(&s->real_device, address, val);
> +if (rc) {
> +XEN_PT_ERR(&s->dev, "Failed to write MSI/MSI-X register (0x%x), rc:%d\n",
> +   address, rc);
> +}
> +return rc;
>  }
>  
>  static int msi_msix_setup(XenPCIPassthroughState *s,
> @@ -276,7 +286,7 @@ void xen_pt_msi_disable(XenPCIPassthroughState *s)
>  return;
>  }
>  
> -xen_pt_msi_set_enable(s, false);
> +(void)xen_pt_msi_set_enable(s, false);
>  
>  msi_msix_disable(s, msi_addr64(msi), msi->data, msi->pirq, false,
>   msi->initialized);
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH v2 02/22] virtio: run drivers in 32bit mode

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 03:50:50PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jul 01, 2015 at 02:30:29PM +0200, Gerd Hoffmann wrote:
> > On Mi, 2015-07-01 at 10:08 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Jun 30, 2015 at 10:38:53AM +0200, Gerd Hoffmann wrote:
> > > > virtio version 1.0 registers can (and actually do in the qemu
> > > > implementation) live in mmio space.  So we must run the blk and
> > > > scsi virtio drivers in 32bit mode, otherwise we can't access them.
> > > > 
> > > > This also allows to drop a bunch of GET_LOWFLAT calls from the virtio
> > > > code in the following patches.
> > > > 
> > > > Signed-off-by: Gerd Hoffmann 
> > > 
> > > Is there an advantage to running them in a 16 bit mode?
> > 
> > Not really any more.  Switching from 32bit mode back to
> > whatever-was-active-before used to be problematic before we had smm mode
> > support.  In theory.  Because you can't save/restore the complete x86
> > processor state.  In practice we had surprisingly few problems,
> > > apparently linux boot loaders simply don't play dirty tricks.
> > 
> > cheers,
> >   Gerd
> > 
> 
> Interesting. Might not be true for non-linux loaders :)
> 
> Anyway we support SMM now so all should be well, right?

Also a question: what's cheaper on kvm: use SMM to save/restore
or access device through the config cap?

> -- 
> MST

[Qemu-devel] [Bug 1470481] [NEW] qemu-img converts large vhd files into only approx. 127GB raw file causing the VM to crash

2015-07-01 Thread srinivas kv

Public bug reported:

I have a VHD file for Windows 2014 server OS. I use the following
command to convert VHD file (20GB) to a RAW file for KVM.

qemu-img convert -f vpc -O raw WIN-SNRGCQV6O3O.VHD disk.img

The output file is about 127GB. When I install the VM and boot it up, the
OS crashes with a STOP error after the initial screen. I found on the
internet that the file limit of 127GB is an existing bug. Kindly fix the
problem. The workaround of using Hyper-V to convert to a fixed disk is not
a feasible solution.
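
For context on where a figure of roughly 127GB can come from: the VHD footer
stores a cylinders/heads/sectors geometry whose maximum is 65535 x 16 x 255
sectors. Whether that limit is what is being hit in this report is an
assumption, not something the report confirms. The arithmetic is sketched
below; the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Largest geometry a VHD footer's CHS field can express. */
#define VHD_MAX_CYLS    65535u
#define VHD_MAX_HEADS   16u
#define VHD_MAX_SECTORS 255u   /* sectors per track */
#define SECTOR_SIZE     512u   /* bytes */

/* Disk size in bytes implied by a cylinders/heads/sectors geometry. */
uint64_t chs_size_bytes(uint32_t cyls, uint32_t heads, uint32_t secs)
{
    assert(cyls <= VHD_MAX_CYLS && heads <= VHD_MAX_HEADS &&
           secs <= VHD_MAX_SECTORS);
    return (uint64_t)cyls * heads * secs * SECTOR_SIZE;
}
```

The maximum geometry works out to 136899993600 bytes, which is just under
127.5 GiB.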

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1470481

Title:
  qemu-img converts large vhd files into only approx. 127GB raw file
  causing the VM to crash

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1470481/+subscriptions

Re: [Qemu-devel] [PATCH] target-mips: fix MIPS64R6-generic configuration

2015-07-01 Thread Yongbok Kim

On 01/07/2015 14:48, Aurelien Jarno wrote:
> On 2015-06-29 10:11, Yongbok Kim wrote:
>> Fix core configuration for MIPS64R6-generic to make it as close as
>> I6400.
>> I6400 core has 48-bit of Virtual Address available (SEGBITS).
>> MIPS SIMD Architecture is available.
>> Rearrange order of bits to match the specification.
>>
>> Signed-off-by: Yongbok Kim 
>> ---
>>  target-mips/mips-defs.h  |2 +-
>>  target-mips/translate_init.c |   18 +-
>>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> Reviewed-by: Aurelien Jarno 
> 
> That said given we are getting closer to the I6400 CPU model, shouldn't
> we try to directly model a I6400 core (even if we have to disable some
> features  like IEEE 754-2008 FP) instead of a generic MIPS64R6 core?
> 

I fully agree with that, but the detailed specification of the I6400 has not
been published yet, so for the time being we will need to use the generic
core name. However, we could rename mips32r5-generic to P5600 with the
following restrictions: Hardware page table walk, Virtualization, EVA.
What do you think?

Regards,
Yongbok

Re: [Qemu-devel] [PATCH RFC 1 4/8] xen/pt: Log xen_host_pci_get in two init functions

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> To help with troubleshooting in the field.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: Stefano Stabellini 


>  hw/xen/xen_pt_config_init.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> index bc871c9..62b6a7b 100644
> --- a/hw/xen/xen_pt_config_init.c
> +++ b/hw/xen/xen_pt_config_init.c
> @@ -1776,6 +1776,8 @@ static int xen_pt_ptr_reg_init(XenPCIPassthroughState 
> *s,
>  rc = xen_host_pci_get_byte(&s->real_device,
> reg_field + PCI_CAP_LIST_ID, &cap_id);
>  if (rc) {
> +XEN_PT_ERR(&s->dev, "Failed to read capability @0x%x (rc:%d)\n",
> +   reg_field + PCI_CAP_LIST_ID, rc);
>  return rc;
>  }
>  if (xen_pt_emu_reg_grps[i].grp_id == cap_id) {
> @@ -1959,6 +1961,9 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
>reg_grp_offset,
>&reg_grp_entry->size);
>  if (rc < 0) {
> +XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, rc:%d\n",
> +   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
> +   xen_pt_emu_reg_grps[i].grp_type, rc);
>  xen_pt_config_delete(s);
>  return rc;
>  }
> @@ -1973,6 +1978,10 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
>  /* initialize capability register */
>  rc = xen_pt_config_reg_init(s, reg_grp_entry, regs);
>  if (rc < 0) {
> +XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld reg 0x%x in grp_type=0x%x (%d/%ld), rc=%d\n",
> +   j, ARRAY_SIZE(xen_pt_emu_reg_grps[i].emu_regs),
> +   regs->offset, xen_pt_emu_reg_grps[i].grp_type,
> +   i, ARRAY_SIZE(xen_pt_emu_reg_grps), rc);
>  xen_pt_config_delete(s);
>  return rc;
>  }
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH RFC 1 3/8] xen/pt: Check if reg->init is past the reg->size

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> It should never happen, but in case it does we want to
> report. The code will only write up to reg->size so there
> is no runtime danger.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  hw/xen/xen_pt_config_init.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> index 91c3a14..bc871c9 100644
> --- a/hw/xen/xen_pt_config_init.c
> +++ b/hw/xen/xen_pt_config_init.c
> @@ -1901,9 +1901,13 @@ static int 
> xen_pt_config_reg_init(XenPCIPassthroughState *s,
>  } else
>  val = data;
>  
> +if (val & size_mask) {
> +XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04u expands past register size(%d)!\n",
> +   offset, val, reg->size);

should we return early?


> +}
>  /* This could be just pci_set_long as we don't modify the bits
> - * past reg->size, but in case this routine is run in parallel
> - * we do not want to over-write other registers. */
> + * past reg->size, but in case this routine is run in parallel or the
> + * init value is larger, we do not want to over-write registers. */
>  switch (reg->size) {
>  case 1: pci_set_byte(s->dev.config + offset, (uint8_t)val); break;
>  case 2: pci_set_word(s->dev.config + offset, (uint16_t)val); break;
> -- 
> 2.1.0
>
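
For reference, a check that an init value fits into a register of 1, 2 or 4
bytes could look like the sketch below. The helper names are invented for
this example. Note that it tests against the inverted size mask; that is an
assumption about what "expands past register size" is meant to catch, while
the quoted hunk tests val & size_mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the low `size` bytes of a 32-bit value (size is 1..4). */
uint32_t reg_size_mask(unsigned size)
{
    assert(size >= 1 && size <= 4);
    return 0xFFFFFFFFu >> ((4 - size) << 3);
}

/* True when `val` has bits set beyond the register's width. */
bool value_exceeds_reg(uint32_t val, unsigned size)
{
    return (val & ~reg_size_mask(size)) != 0;
}
```

A caller that wants to return early would do so when value_exceeds_reg()
is true, before touching dev.config.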

Re: [Qemu-devel] [PATCH v2 02/22] virtio: run drivers in 32bit mode

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 02:30:29PM +0200, Gerd Hoffmann wrote:
> On Mi, 2015-07-01 at 10:08 +0200, Michael S. Tsirkin wrote:
> > On Tue, Jun 30, 2015 at 10:38:53AM +0200, Gerd Hoffmann wrote:
> > > virtio version 1.0 registers can (and actually do in the qemu
> > > implementation) live in mmio space.  So we must run the blk and
> > > scsi virtio drivers in 32bit mode, otherwise we can't access them.
> > > 
> > > This also allows to drop a bunch of GET_LOWFLAT calls from the virtio
> > > code in the following patches.
> > > 
> > > Signed-off-by: Gerd Hoffmann 
> > 
> > Is there an advantage to running them in a 16 bit mode?
> 
> Not really any more.  Switching from 32bit mode back to
> whatever-was-active-before used to be problematic before we had smm mode
> support.  In theory.  Because you can't save/restore the complete x86
> processor state.  In practice we had surprisingly few problems,
> apparently linux boot loaders simply don't play dirty tricks.
> 
> cheers,
>   Gerd
> 

Interesting. Might not be true for non-linux loaders :)

Anyway we support SMM now so all should be well, right?

-- 
MST

Re: [Qemu-devel] [PATCH RFC 1 2/8] xen/pt: Sync up the dev.config and data values.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> For a passthrough device we maintain a state of emulated
> registers value contained within d->config. We also consult
> the host registers (and apply ro and write masks) whenever
> the guest access the registers. This is done in xen_pt_pci_write_config
> and xen_pt_pci_read_config.
> 
> Also in this picture we call pci_default_write_config which
> updates the d->config and if the d->config[PCI_COMMAND] register
> has PCI_COMMAND_MEMORY (or PCI_COMMAND_IO) acts on those changes.
> 
> On startup the d->config[PCI_COMMAND] are the host values, not
> what the guest initial values should be, which is exactly what
> we do _not_ want to do for 64-bit BARs when the guest just wants
> to read the size of the BAR. Huh you say?
> 
> To get the size of 64-bit memory space BARs, the guest has
> to calculate ((BAR[x] & 0xFFFFFFF0) + ((BAR[x+1] & 0xFFFFFFFF) << 32))
> which means it has to do two writes of ~0 to BARx and BARx+1.
> 
> Prior to this patch and with XSA120-addendum patch (Linux kernel)
> the PCI_COMMAND register is copied from the host, so it can have
> PCI_COMMAND_MEMORY bit set, which means that QEMU will try to
> update the hypervisor's P2M with BARx+1 value to ~0 (0xFFFFFFFF)
> (to sync the guest state to host) instead of just having
> xen_pt_pci_write_config and xen_pt_bar_reg_write apply the
> proper masks and return the size to the guest.
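
The sizing step described above can be sketched in standalone C. This is an
illustration only, with a made-up function name; it takes the two values
read back after ~0 was written to both BAR halves and assumes the usual PCI
convention that the low 4 bits of the lower half are type flags.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a 64-bit memory BAR from the values read back after writing ~0
 * to BARx (low half) and BARx+1 (high half). */
uint64_t bar64_size(uint32_t lo_readback, uint32_t hi_readback)
{
    uint64_t mask = ((uint64_t)hi_readback << 32) |
                    (lo_readback & 0xFFFFFFF0u);   /* drop the flag bits */

    assert(mask != 0);      /* an all-zero readback means no BAR here */
    return ~mask + 1;       /* lowest writable address bit gives the size */
}
```

A readback of 0xFFF00004 / 0xFFFFFFFF therefore describes a 1 MiB BAR. The
writes of ~0 are only a probe, which is why they should not be treated as a
real address to map.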
> 
> To thwart this, this patch syncs up the host values with the
> guest values taking into account the emu_mask (bit set means
> we emulate, PCI_COMMAND_MEMORY and PCI_COMMAND_IO are set).
> That is we copy the host values - masking out any bits which
> we will emulate. Then merge it with the initial emulation register
> values. There is also some reg->size accounting taken
> into consideration - which could be removed.
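
The merge described in this paragraph can be sketched as below. It follows
the prose of the commit message rather than the exact statements in the
hunk, and the function name is made up for this example: bits set in
emu_mask are taken from the emulated initial value, all other bits from the
host, and nothing beyond the register's size is kept.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a host register value with the emulated initial value.
 * `size` is the register width in bytes (1, 2 or 4). */
uint32_t merge_host_emulated(uint32_t host, uint32_t emulated,
                             uint32_t emu_mask, unsigned size)
{
    uint32_t size_mask, host_mask;

    assert(size >= 1 && size <= 4);
    size_mask = 0xFFFFFFFFu >> ((4 - size) << 3);
    host_mask = size_mask & ~emu_mask;      /* pass-through bits */

    return (host & host_mask) | (emulated & emu_mask & size_mask);
}
```

For a 2-byte command register where the host has I/O, memory and bus master
set (0x0007), the emulated initial value is 0 and the two decode bits are
emulated (mask 0x0003), the guest view starts out as 0x0004.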
> 
> This fixes errors such as these:
> 
> (XEN) memory_map:add: dom2 gfn=fffe0 mfn=fbce0 nr=20
> (DEBUG) 189 pci dev 04:0 BAR16 wrote ~0.
> (DEBUG) 200 pci dev 04:0 BAR16 read 0x0fffe0004.
> (XEN) memory_map:remove: dom2 gfn=fffe0 mfn=fbce0 nr=20
> (DEBUG) 204 pci dev 04:0 BAR16 wrote 0x0fffe0004.
> (DEBUG) 217 pci dev 04:0 BAR16 read upper 0x0.
> (XEN) memory_map:add: dom2 gfn=0 mfn=fbce0 nr=20
> (XEN) p2m.c:883:d0v0 p2m_set_entry failed! mfn= rc:-22
> (XEN) memory_map:fail: dom2 gfn=0 mfn=fbce0 nr=20 ret:-22
> (XEN) memory_map:remove: dom2 gfn=0 mfn=fbce0 nr=20
> (XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=0 type:4
> (XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=1 type:4
> ..
> (XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff]
> (DEBUG) 222 pci dev 04:0 BAR16 read upper 0x0.
> (XEN) memory_map:remove: dom2 gfn=0 mfn=fbce0 nr=20
> (XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff]
> 
> [The DEBUG is to illustrate what the hvmloader was doing]
> 
> Reported-by: Sander Eikelenboom 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  hw/xen/xen_pt_config_init.c | 45 
> -
>  1 file changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> index e34f9f8..91c3a14 100644
> --- a/hw/xen/xen_pt_config_init.c
> +++ b/hw/xen/xen_pt_config_init.c
> @@ -1856,6 +1856,10 @@ static int 
> xen_pt_config_reg_init(XenPCIPassthroughState *s,
>  reg_entry->reg = reg;
>  
>  if (reg->init) {
> +uint32_t host_mask, size_mask;
> +unsigned int offset;
> +uint32_t val;
> +
>  /* initialize emulate register */
>  rc = reg->init(s, reg_entry->reg,
> reg_grp->base_offset + reg->offset, &data);
> @@ -1868,8 +1872,47 @@ static int 
> xen_pt_config_reg_init(XenPCIPassthroughState *s,
>  g_free(reg_entry);
>  return 0;
>  }
> +/* Sync up the data to dev.config */
> +offset = reg_grp->base_offset + reg->offset;
> +size_mask = 0xFFFFFFFF >> ((4 - reg->size) << 3);
> +
> +if (xen_host_pci_get_long(&s->real_device, offset, &val))
> +val = pci_get_long(s->dev.config + offset); /* Pfff... */

I don't understand this; a more descriptive comment would be helpful.


> +/* Set bits in emu_mask are the ones we emulate. The dev.config shall
> + * contain the emulated view of the guest - therefore we flip the 
> mask
> + * to mask out the host values (which dev.config initially has) . */
> +host_mask = size_mask & ~reg->emu_mask;
> +
> +if ((data & host_mask) != (val & host_mask)) {
> +uint32_t new_val;
> +
> +/* Mask out host (including past size). */
> +new_val = val & host_mask;
> +/* Merge emulated ones (excluding the non-emulated ones). */
> +new_val |= data & host_mask;
> +/* Leave intact host and emulated values past the size - even 
> though
> +

Re: [Qemu-devel] [PATCH v2 07/22] virtio: find version 1.0 virtio capabilities

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 02:49:54PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > 
> > > Yes, seabios always allocates both mem and io.
> > 
> > What if it can't? E.g. too many devices.
> 
> First tries to move 64bit bars above 64g.  Guess we better should
> exclude virtio devices here (like we do for xhci already).

Why? You can use config capability.

> 
> Failing that it'll panic.
> 
> cheers,
>   Gerd

IO can't well go above 4G :) So eventually we'll make these express
devices.  For express the spec explicitly says devices must still work
if IO is disabled. At that point maybe teaching bios to disable
IO and keep working will have value.

-- 
MST

Re: [Qemu-devel] [PATCH] target-mips: fix MIPS64R6-generic configuration

2015-07-01 Thread Aurelien Jarno

On 2015-06-29 10:11, Yongbok Kim wrote:
> Fix core configuration for MIPS64R6-generic to make it as close as
> I6400.
> I6400 core has 48-bit of Virtual Address available (SEGBITS).
> MIPS SIMD Architecture is available.
> Rearrange order of bits to match the specification.
> 
> Signed-off-by: Yongbok Kim 
> ---
>  target-mips/mips-defs.h  |2 +-
>  target-mips/translate_init.c |   18 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)

Reviewed-by: Aurelien Jarno 

That said, given we are getting closer to the I6400 CPU model, shouldn't
we try to directly model an I6400 core (even if we have to disable some
features like IEEE 754-2008 FP) instead of a generic MIPS64R6 core?

> diff --git a/target-mips/mips-defs.h b/target-mips/mips-defs.h
> index 20aa87c..53b185e 100644
> --- a/target-mips/mips-defs.h
> +++ b/target-mips/mips-defs.h
> @@ -11,7 +11,7 @@
>  #if defined(TARGET_MIPS64)
>  #define TARGET_LONG_BITS 64
>  #define TARGET_PHYS_ADDR_SPACE_BITS 48
> -#define TARGET_VIRT_ADDR_SPACE_BITS 42
> +#define TARGET_VIRT_ADDR_SPACE_BITS 48
>  #else
>  #define TARGET_LONG_BITS 32
>  #define TARGET_PHYS_ADDR_SPACE_BITS 40
> diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
> index ddfaff8..9304e74 100644
> --- a/target-mips/translate_init.c
> +++ b/target-mips/translate_init.c
> @@ -655,14 +655,14 @@ static const mips_def_t mips_defs[] =
> (2 << CP0C1_DS) | (4 << CP0C1_DL) | (3 << CP0C1_DA) |
> (0 << CP0C1_PC) | (1 << CP0C1_WR) | (1 << CP0C1_EP),
>  .CP0_Config2 = MIPS_CONFIG2,
> -.CP0_Config3 = MIPS_CONFIG3 | (1 << CP0C3_RXI) | (1 << CP0C3_BP) |
> -   (1 << CP0C3_BI) | (1 << CP0C3_ULRI) | (1 << 
> CP0C3_LPA) |
> -   (1U << CP0C3_M),
> -.CP0_Config4 = MIPS_CONFIG4 | (0xfc << CP0C4_KScrExist) |
> -   (3 << CP0C4_IE) | (1 << CP0C4_M),
> +.CP0_Config3 = MIPS_CONFIG3 | (1U << CP0C3_M) | (1 << CP0C3_MSAP) |
> +   (1 << CP0C3_BP) | (1 << CP0C3_BI) | (1 << CP0C3_ULRI) 
> |
> +   (1 << CP0C3_RXI) | (1 << CP0C3_LPA),
> +.CP0_Config4 = MIPS_CONFIG4 | (1U << CP0C4_M) | (3 << CP0C4_IE) |
> +   (0xfc << CP0C4_KScrExist),
>  .CP0_Config5 = MIPS_CONFIG5 | (1 << CP0C5_LLB),
> -.CP0_Config5_rw_bitmask = (1 << CP0C5_SBRI) | (1 << CP0C5_FRE) |
> -  (1 << CP0C5_UFE),
> +.CP0_Config5_rw_bitmask = (1 << CP0C5_MSAEn) | (1 << CP0C5_SBRI) |
> +  (1 << CP0C5_FRE) | (1 << CP0C5_UFE),
>  .CP0_LLAddr_rw_bitmask = 0,
>  .CP0_LLAddr_shift = 0,
>  .SYNCI_Step = 32,
> @@ -674,9 +674,9 @@ static const mips_def_t mips_defs[] =
>  .CP1_fcr0 = (1 << FCR0_FREP) | (1 << FCR0_F64) | (1 << FCR0_L) |
>  (1 << FCR0_W) | (1 << FCR0_D) | (1 << FCR0_S) |
>  (0x00 << FCR0_PRID) | (0x0 << FCR0_REV),
> -.SEGBITS = 42,
> +.SEGBITS = 48,
>  .PABITS = 48,
> -.insn_flags = CPU_MIPS64R6,
> +.insn_flags = CPU_MIPS64R6 | ASE_MSA,
>  .mmu_type = MMU_TYPE_R4000,
>  },
>  {
> -- 
> 1.7.5.4
> 
> 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH pic32 v2 5/5] Two new machine platforms: pic32mz7 and pic32mz.

2015-07-01 Thread Aurelien Jarno

On 2015-06-30 21:12, Serge Vakulenko wrote:
> Signed-off-by: Serge Vakulenko 
> ---
>  hw/mips/Makefile.objs   |3 +
>  hw/mips/mips_pic32mx7.c | 1652 +
>  hw/mips/mips_pic32mz.c  | 2840 
> +++
>  hw/mips/pic32_ethernet.c|  557 +
>  hw/mips/pic32_gpio.c|   39 +
>  hw/mips/pic32_load_hex.c|  238 
>  hw/mips/pic32_peripherals.h |  210 
>  hw/mips/pic32_sdcard.c  |  428 +++
>  hw/mips/pic32_spi.c |  121 ++
>  hw/mips/pic32_uart.c|  228 
>  hw/mips/pic32mx.h   | 1290 
>  hw/mips/pic32mz.h   | 2093 +++
>  12 files changed, 9699 insertions(+)
>  create mode 100644 hw/mips/mips_pic32mx7.c
>  create mode 100644 hw/mips/mips_pic32mz.c
>  create mode 100644 hw/mips/pic32_ethernet.c
>  create mode 100644 hw/mips/pic32_gpio.c
>  create mode 100644 hw/mips/pic32_load_hex.c
>  create mode 100644 hw/mips/pic32_peripherals.h
>  create mode 100644 hw/mips/pic32_sdcard.c
>  create mode 100644 hw/mips/pic32_spi.c
>  create mode 100644 hw/mips/pic32_uart.c
>  create mode 100644 hw/mips/pic32mx.h
>  create mode 100644 hw/mips/pic32mz.h

This patch is huge, and needs to be split to ease the review.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH pic32 v2 4/5] Two new processor variants: M4K and microAptivP.

2015-07-01 Thread Aurelien Jarno

On 2015-06-30 21:12, Serge Vakulenko wrote:
> Signed-off-by: Serge Vakulenko 
> ---
>  target-mips/cpu.h|  2 ++
>  target-mips/translate_init.c | 46 
> 
>  2 files changed, 48 insertions(+)
> 
> diff --git a/target-mips/cpu.h b/target-mips/cpu.h
> index ab830ee..9f5890c 100644
> --- a/target-mips/cpu.h
> +++ b/target-mips/cpu.h
> @@ -394,6 +394,7 @@ struct CPUMIPSState {
>  #define CP0C0_M31
>  #define CP0C0_K23  28
>  #define CP0C0_KU   25
> +#define CP0C0_SB   21

Bits in the range 16:24 are implementation specific, so I do wonder if
we want to have this bit there. At least we should mark it as
implementation specific.

>  #define CP0C0_MDU  20
>  #define CP0C0_MM   17
>  #define CP0C0_BM   16
> @@ -479,6 +480,7 @@ struct CPUMIPSState {
>  #define CP0C5_NFExists   0
>  int32_t CP0_Config6;
>  int32_t CP0_Config7;
> +#define CP0C7_WII31

Same as above, Config6 and Config7 are implementation dependent.

>  /* XXX: Maybe make LLAddr per-TC? */
>  uint64_t lladdr;
>  target_ulong llval;
> diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
> index ddfaff8..430a547 100644
> --- a/target-mips/translate_init.c
> +++ b/target-mips/translate_init.c
> @@ -232,6 +232,52 @@ static const mips_def_t mips_defs[] =
>  .mmu_type = MMU_TYPE_FMT,
>  },
>  {
> +/* Configuration for Microchip PIC32MX microcontroller. */
> +.name = "M4K",
> +.CP0_PRid = 0x00018765,
> +.CP0_Config0 = MIPS_CONFIG0 | (2 << CP0C0_K23) | (2 << CP0C0_KU) |
> +   (1 << CP0C0_SB) | (1 << CP0C0_BM) |
> +   (1 << CP0C0_AR) | (MMU_TYPE_FMT << CP0C0_MT),
> +.CP0_Config1 = (1U << CP0C1_M) | (1 << CP0C1_CA) | (1 << CP0C1_EP),
> +.CP0_Config2 = MIPS_CONFIG2,
> +.CP0_Config3 = (1 << CP0C3_VEIC) | (1 << CP0C3_VInt),
> +.CP0_LLAddr_rw_bitmask = 0,
> +.CP0_LLAddr_shift = 4,
> +.SYNCI_Step = 32,
> +.CCRes = 2,
> +.CP0_Status_rw_bitmask = 0x1258FF17,
> +.SEGBITS = 32,
> +.PABITS = 32,
> +.insn_flags = CPU_MIPS32R2 | ASE_MIPS16,
> +.mmu_type = MMU_TYPE_FMT,
> +},
> +{
> +/* Configuration for Microchip PIC32MZ microcontroller. */
> +.name = "microAptivP",
> +.CP0_PRid = 0x00019e28,
> +.CP0_Config0 = MIPS_CONFIG0 | (0x1 << CP0C0_AR) |
> +(MMU_TYPE_R4000 << CP0C0_MT),
> +.CP0_Config1 = MIPS_CONFIG1 | (15 << CP0C1_MMU) | (1 << CP0C1_PC),
> +.CP0_Config2 = MIPS_CONFIG2,
> +.CP0_Config3 = (1 << CP0C3_M) | (1 << CP0C3_IPLW) | (1 << CP0C3_MCU) 
> |
> +(2 << CP0C3_ISA) | (1 << CP0C3_ULRI) | (1 << CP0C3_RXI) |
> +(1 << CP0C3_DSP2P) | (1 << CP0C3_DSPP) | (1 << 
> CP0C3_VEIC) |
> +(1 << CP0C3_VInt),

DSP and DSPr2 are enabled here...

> +.CP0_Config4 = (1 << CP0C4_M),
> +.CP0_Config5 = (1 << CP0C5_NFExists),
> +.CP0_Config6 = 0,
> +.CP0_Config7 = (1 << CP0C7_WII),
> +.CP0_LLAddr_rw_bitmask = 0,
> +.CP0_LLAddr_shift = 4,
> +.SYNCI_Step = 32,
> +.CCRes = 2,
> +.CP0_Status_rw_bitmask = 0x1278FF17,
> +.SEGBITS = 32,
> +.PABITS = 32,
> +.insn_flags = CPU_MIPS32R2,

so I guess you want to enable ASE_DSP and ASE_DSPR2 here.

> +.mmu_type = MMU_TYPE_R4000,
> +},
> +{
>  .name = "24Kc",
>  .CP0_PRid = 0x00019300,
>  .CP0_Config0 = MIPS_CONFIG0 | (0x1 << CP0C0_AR) |

Otherwise it looks ok, though I haven't looked at the PIC32 manual to
check the values.
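
The DSP remark above amounts to a consistency rule between Config3 and
insn_flags, which can be sketched as a small standalone check. The Config3
bit positions are the architectural ones for DSPP and DSP2P; the two ASE_*
values below are hypothetical placeholders chosen for this example and are
not the values from the QEMU headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Config3 bit positions of the DSP presence bits. */
#define CP0C3_DSPP   10
#define CP0C3_DSP2P  11

/* Hypothetical flag values, for illustration only. */
#define ASE_DSP    (1u << 0)
#define ASE_DSPR2  (1u << 1)

/* True when every DSP capability advertised in Config3 has the matching
 * instruction-set flag enabled. */
bool dsp_flags_consistent(uint32_t config3, uint32_t insn_flags)
{
    if ((config3 & (1u << CP0C3_DSPP)) && !(insn_flags & ASE_DSP)) {
        return false;
    }
    if ((config3 & (1u << CP0C3_DSP2P)) && !(insn_flags & ASE_DSPR2)) {
        return false;
    }
    return true;
}
```

Under this rule a definition that sets both DSPP and DSP2P but leaves the
DSP flags out of insn_flags would be reported as inconsistent.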

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH v2 1/1] KVM s390 pci infrastructure modelling

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 08:42:52PM +0800, Hong Bo Li wrote:
> 
> 
> On 7/1/2015 19:57, Michael S. Tsirkin wrote:
> >On Wed, Jul 01, 2015 at 07:46:01PM +0800, Hong Bo Li wrote:
> >>
> >>On 7/1/2015 19:23, Michael S. Tsirkin wrote:
> >>>On Wed, Jul 01, 2015 at 07:11:38PM +0800, Hong Bo Li wrote:
> On 7/1/2015 18:36, Michael S. Tsirkin wrote:
> >On Wed, Jul 01, 2015 at 06:04:24PM +0800, Hong Bo Li wrote:
> >>On 7/1/2015 17:22, Michael S. Tsirkin wrote:
> >>>On Wed, Jul 01, 2015 at 05:13:11PM +0800, Hong Bo Li wrote:
> On 7/1/2015 16:05, Michael S. Tsirkin wrote:
> >On Wed, Jul 01, 2015 at 03:56:25PM +0800, Hong Bo Li wrote:
> >>On 7/1/2015 14:22, Michael S. Tsirkin wrote:
> >>>On Tue, Jun 30, 2015 at 02:16:59PM +0800, Hong Bo Li wrote:
> On 6/29/2015 18:01, Michael S. Tsirkin wrote:
> >On Mon, Jun 29, 2015 at 05:24:53PM +0800, Hong Bo Li wrote:
> >>This patch introduce a new facility(and bus)
> >>to hold devices representing information actually
> >>provided by s390 firmware and I/O configuration.
> >>usage example:
> >>-device s390-pcihost
> >>-device vfio-pci,host=:00:00.0,id=vpci1
> >>-device zpci,fid=2,uid=5,pci_id=vpci1,id=zpci1
> >>
> >>The first line will create a s390 pci host bridge
> >>and init the root bus. The second line will create
> >>a standard vfio pci device, and attach it to the
> >>root bus. These are similar to the standard process
> >>to define a pci device on other platform.
> >>
> >>The third line will create a s390 pci device to
> >>store s390 specific information, and references
> >>the corresponding vfio pci device via device id.
> >>We create a s390 pci facility bus to hold all the
> >>zpci devices.
> >>
> >>Signed-off-by: Hong Bo Li 
> >It's mostly up to s390 maintainers, but I'd like to note
> >one thing below
> >
> >>---
> >>  hw/s390x/s390-pci-bus.c| 314 
> >> +
> >>  hw/s390x/s390-pci-bus.h|  48 ++-
> >>  hw/s390x/s390-pci-inst.c   |   4 +-
> >>  hw/s390x/s390-virtio-ccw.c |   5 +-
> >>  4 files changed, 283 insertions(+), 88 deletions(-)
> >>
> >>diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> >>index 560b66a..d5e7b2e 100644
> >>--- a/hw/s390x/s390-pci-bus.c
> >>+++ b/hw/s390x/s390-pci-bus.c
> >>@@ -32,8 +32,8 @@ int chsc_sei_nt2_get_event(void *res)
> >>  PciCcdfErr *eccdf;
> >>  int rc = 1;
> >>  SeiContainer *sei_cont;
> >>-S390pciState *s = S390_PCI_HOST_BRIDGE(
> >>-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
> >>+S390PCIFacility *s = S390_PCI_FACILITY(
> >>+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
> >>  if (!s) {
> >>  return rc;
> >>@@ -72,8 +72,8 @@ int chsc_sei_nt2_get_event(void *res)
> >>  int chsc_sei_nt2_have_event(void)
> >>  {
> >>-S390pciState *s = S390_PCI_HOST_BRIDGE(
> >>-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
> >>+S390PCIFacility *s = S390_PCI_FACILITY(
> >>+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
> >>  if (!s) {
> >>  return 0;
> >>@@ -82,20 +82,32 @@ int chsc_sei_nt2_have_event(void)
> >>  return !QTAILQ_EMPTY(&s->pending_sei);
> >>  }
> >>+void s390_pci_device_enable(S390PCIBusDevice *zpci)
> >>+{
> >>+zpci->fh = zpci->fh | 1 << ENABLE_BIT_OFFSET;
> >>+}
> >>+
> >>+void s390_pci_device_disable(S390PCIBusDevice *zpci)
> >>+{
> >>+zpci->fh = zpci->fh & ~(1 << ENABLE_BIT_OFFSET);
> >>+if (zpci->is_unplugged)
> >>+object_unparent(OBJECT(zpci));
> >>+}
> >>+
> >>  S390PCIBusDevice *s390_pci_find_dev_by_fid(uint32_t fid)
> >>  {
> >>  S390PCIBusDevice *pbdev;
> >>-int i;
> >>-S390pciState *s = S390_PCI_HOST_BRIDGE(
> >>-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
> >>+BusChild *kid;
> >>+S390PCIFacility *s = S390_PCI_FACILITY(
> >>+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
> >>  if (!s) {
> >>  return NULL;
> >>  }
> >>>

Re: [Qemu-devel] [PATCH v8 3/3] ich9: implement strap SPKR pin logic

2015-07-01 Thread Paolo Bonzini



On 01/07/2015 15:31, Michael S. Tsirkin wrote:
> I don't think we should defer the whole series because of the argument
> about the default.  I've merged these patches, if you like, pls send a
> one-line patch on top to flip the default with some info on how it was
> tested, and we can discuss it separately.
> 
> Makes sense?

Perfect.

Paolo

> BTW 2.4 makes qemu versioned because ahci finally supports migration
> so yes, we'll have to version from now on.
> 
>

Re: [Qemu-devel] [PATCH v8 3/3] ich9: implement strap SPKR pin logic

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 03:18:41PM +0200, Paolo Bonzini wrote:
> 
> 
> On 28/06/2015 19:58, Paulo Alcantara wrote:
> > If the signal is sampled high, this indicates that the system is
> > strapped to the "No Reboot" mode (ICH9 will disable the TCO Timer system
> > reboot feature). The status of this strap is readable via the NO_REBOOT
> > bit (CC: offset 0x3410:bit 5).
> > 
> > The NO_REBOOT bit is set when SPKR pin on ICH9 is sampled high. This bit
> > may be set or cleared by software if the strap is sampled low but may
> > not override the strap when it indicates "No Reboot".
> > 
> > This patch implements the logic where hardware has ability to set SPKR
> > pin through a property named "noreboot" and it's sampled high by
> > default.
> 
> I know Michael suggested this, but I think default high is a worse
> default.  It does not allow recovering from a hard lockup where you
> cannot process an NMI, unlike all other watchdogs implemented by QEMU.
> In fact, the Linux driver fails to start if the strap is high.
> 
> My theory is that hardware manufacturers should only set the strap high
> if they want the firmware to have total control of the watchdog via SMIs
> (TCO_EN).
> 
> If it is just a matter of being late in 2.4, just delay everything to
> 2.5.  It doesn't require any more work from Paulo, as you can just flip
> the default yourself without adding a new machine type (in fact I'm
> still not sure why machine types for Q35 are versioned, since migration
> is not supported...).
> 
> Paolo

I don't think we should defer the whole series because of the argument
about the default.  I've merged these patches, if you like, pls send a
one-line patch on top to flip the default with some info on how it was
tested, and we can discuss it separately.

Makes sense?

BTW 2.4 makes qemu versioned because ahci finally supports migration
so yes, we'll have to version from now on.


> > Signed-off-by: Paulo Alcantara 
> > ---
> > v7 -> v8:
> >   * change property name to "noreboot"
> >   * default "noreboot" property to high
> >   * define property in dc->props
> >   * update tco tests to support and exercise "noreboot" property
> > ---
> >  hw/acpi/tco.c  |  2 +-
> >  hw/isa/lpc_ich9.c  |  6 ++
> >  include/hw/i386/ich9.h |  5 +
> >  tests/tco-test.c   | 18 --
> >  4 files changed, 28 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/acpi/tco.c b/hw/acpi/tco.c
> > index 1794a54..7a026c2 100644
> > --- a/hw/acpi/tco.c
> > +++ b/hw/acpi/tco.c
> > @@ -64,7 +64,7 @@ static void tco_timer_expired(void *opaque)
> >  tr->tco.sts2 |= TCO_BOOT_STS;
> >  tr->timeouts_no = 0;
> >  
> > -if (!(gcs & ICH9_CC_GCS_NO_REBOOT)) {
> > +if (!lpc->pin_strap.spkr_hi && !(gcs & ICH9_CC_GCS_NO_REBOOT)) {
> >  watchdog_perform_action();
> >  tco_timer_stop(tr);
> >  return;
> > diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> > index b547002..3b460d4 100644
> > --- a/hw/isa/lpc_ich9.c
> > +++ b/hw/isa/lpc_ich9.c
> > @@ -688,6 +688,11 @@ static const VMStateDescription vmstate_ich9_lpc = {
> >  }
> >  };
> >  
> > +static Property ich9_lpc_properties[] = {
> > +DEFINE_PROP_BOOL("noreboot", ICH9LPCState, pin_strap.spkr_hi, true),
> > +DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> >  static void ich9_lpc_class_init(ObjectClass *klass, void *data)
> >  {
> >  DeviceClass *dc = DEVICE_CLASS(klass);
> > @@ -699,6 +704,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, 
> > void *data)
> >  dc->reset = ich9_lpc_reset;
> >  k->init = ich9_lpc_init;
> >  dc->vmsd = &vmstate_ich9_lpc;
> > +dc->props = ich9_lpc_properties;
> >  k->config_write = ich9_lpc_config_write;
> >  dc->desc = "ICH9 LPC bridge";
> >  k->vendor_id = PCI_VENDOR_ID_INTEL;
> > diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
> > index f5681a3..63c5cd8 100644
> > --- a/include/hw/i386/ich9.h
> > +++ b/include/hw/i386/ich9.h
> > @@ -46,6 +46,11 @@ typedef struct ICH9LPCState {
> >  ICH9LPCPMRegs pm;
> >  uint32_t sci_level; /* track sci level */
> >  
> > +/* 2.24 Pin Straps */
> > +struct {
> > +bool spkr_hi;
> > +} pin_strap;
> > +
> >  /* 10.1 Chipset Configuration registers(Memory Space)
> >   which is pointed by RCBA */
> >  uint8_t chip_config[ICH9_CC_SIZE];
> > diff --git a/tests/tco-test.c b/tests/tco-test.c
> > index 1a2fe3d..6a48188 100644
> > --- a/tests/tco-test.c
> > +++ b/tests/tco-test.c
> > @@ -42,6 +42,7 @@ enum {
> >  
> >  typedef struct {
> >  const char *args;
> > +bool noreboot;
> >  QPCIDevice *dev;
> >  void *lpc_base;
> >  void *tco_io_base;
> > @@ -53,7 +54,9 @@ static void test_init(TestData *d)
> >  QTestState *qs;
> >  char *s;
> >  
> > -s = g_strdup_printf("-machine q35 %s", !d->args ? "" : d->args);
> > +s = g_strdup_printf("-machine q35 %s %s",
> > +d->noreboot ? "" : "-global 
>

Re: [Qemu-devel] [PATCH RFC 1 1/8] xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> During init time we treat the dev.config area as a cache
> of the host view. However during execution time we treat it
> as guest view (by the generic PCI API). We need to sync Xen's
> code to the generic PCI API view. This is the first step
> by replacing all of the code that uses dev.config or
> pci_get_[byte|word] to get host value to actually use the
> xen_host_pci_get_[byte|word] functions.
> 
> Interestingly in 'xen_pt_ptr_reg_init' we also needed to swap
> reg_field from uint32_t to uint8_t - since the access is only
> for one byte not four bytes. We can split this as a separate
> patch however we would have to use a cast to thwart compiler
> warnings in the meantime.
> 
> We also truncated 'flags' to 'flag' to make the code fit within
> the 80 characters.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  hw/xen/xen_pt.c | 22 +++--
>  hw/xen/xen_pt_config_init.c | 77 +++--
>  2 files changed, 72 insertions(+), 27 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 1d256b9..2535352 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -738,7 +738,12 @@ static int xen_pt_initfn(PCIDevice *d)
>  }
>  
>  /* Bind interrupt */
> -if (!s->dev.config[PCI_INTERRUPT_PIN]) {
> +if (xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN,
> +  &machine_irq /* temp scratch */)) {
> +XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);

printing rc, but rc is not set


> +machine_irq = 0;
> +}

I understand that machine_irq is just used as a scratch value here, but
I would rather introduce a new variable
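
To illustrate the two remarks above (rc printed without being set, and
machine_irq reused as scratch space), here is a minimal sketch and not the
posted patch. fake_host_get_byte() is a hypothetical stand-in for
xen_host_pci_get_byte(), assumed to return 0 on success and a negative
errno value on failure; read_interrupt_pin() is likewise an invented name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for xen_host_pci_get_byte(): 0 on success,
 * negative errno on failure. */
static int fake_host_get_byte(int fail, uint8_t *out)
{
    assert(out != NULL);
    if (fail) {
        return -EIO;
    }
    *out = 1; /* pretend the device reports interrupt pin 1 */
    return 0;
}

/* Returns the interrupt pin, or 0 when the host read fails. */
static uint8_t read_interrupt_pin(int fail)
{
    uint8_t pin = 0;                          /* dedicated variable, not machine_irq */
    int rc = fake_host_get_byte(fail, &pin);  /* rc is assigned before use */

    if (rc) {
        fprintf(stderr, "Failed to read PCI_INTERRUPT_PIN! (rc: %d)\n", rc);
        return 0;
    }
    return pin;
}
```

The caller can then test the returned pin for zero exactly as the original
code tested the cached config byte.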


> +if (!machine_irq) {
>  XEN_PT_LOG(d, "no pin interrupt\n");
>  goto out;
>  }
> @@ -788,8 +793,19 @@ static int xen_pt_initfn(PCIDevice *d)
>  
>  out:
>  if (cmd) {
> -xen_host_pci_set_word(&s->real_device, PCI_COMMAND,
> -  pci_get_word(d->config + PCI_COMMAND) | cmd);
> +uint16_t val;
> +
> +rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
> +if (rc) {
> +XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
> +}
> +else {

 } else { is allowed


> +val |= cmd;
> +if (xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val)) {
> +XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: 
> %d)\n",
> +   val, rc);

rc not set but printed


> +}
> +}
>  }
>  
>  memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> index 21d4938..e34f9f8 100644
> --- a/hw/xen/xen_pt_config_init.c
> +++ b/hw/xen/xen_pt_config_init.c
> @@ -800,15 +800,21 @@ static XenPTRegInfo xen_pt_emu_reg_vendor[] = {
>  static inline uint8_t get_capability_version(XenPCIPassthroughState *s,
>   uint32_t offset)
>  {
> -uint8_t flags = pci_get_byte(s->dev.config + offset + PCI_EXP_FLAGS);
> -return flags & PCI_EXP_FLAGS_VERS;
> +uint8_t flag;
> +if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, 
> &flag)) {
> +return 0;
> +}
> +return flag & PCI_EXP_FLAGS_VERS;
>  }
>  
>  static inline uint8_t get_device_type(XenPCIPassthroughState *s,
>uint32_t offset)
>  {
> -uint8_t flags = pci_get_byte(s->dev.config + offset + PCI_EXP_FLAGS);
> -return (flags & PCI_EXP_FLAGS_TYPE) >> 4;
> +uint8_t flag;
> +if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, 
> &flag)) {
> +return 0;
> +}
> +return (flag & PCI_EXP_FLAGS_TYPE) >> 4;
>  }
>  
>  /* initialize Link Control register */
> @@ -857,8 +863,14 @@ static int 
> xen_pt_linkctrl2_reg_init(XenPCIPassthroughState *s,
>  reg_field = XEN_PT_INVALID_REG;
>  } else {
>  /* set Supported Link Speed */
> -uint8_t lnkcap = pci_get_byte(s->dev.config + real_offset - 
> reg->offset
> -  + PCI_EXP_LNKCAP);
> +uint8_t lnkcap;
> +int rc;
> +rc = xen_host_pci_get_byte(&s->real_device,
> +   real_offset - reg->offset + 
> PCI_EXP_LNKCAP,
> +   &lnkcap);
> +if (rc) {
> +return rc;
> +}
>  reg_field |= PCI_EXP_LNKCAP_SLS & lnkcap;
>  }
>  
> @@ -1039,13 +1051,15 @@ static int 
> xen_pt_msgctrl_reg_init(XenPCIPassthroughState *s,
> XenPTRegInfo *reg, uint32_t real_offset,
> uint32_t *data)
>  {
> -PCIDevice *d = &s->dev;
>  XenPTMSI *msi = s->msi;
> -uint16_t reg_field = 0;
> +uint16_t reg_field;
> +int rc;
>  
>  /* use I/O

Re: [Qemu-devel] [PATCH v8 3/3] ich9: implement strap SPKR pin logic

2015-07-01 Thread Paolo Bonzini



On 28/06/2015 19:58, Paulo Alcantara wrote:
> If the signal is sampled high, this indicates that the system is
> strapped to the "No Reboot" mode (ICH9 will disable the TCO Timer system
> reboot feature). The status of this strap is readable via the NO_REBOOT
> bit (CC: offset 0x3410:bit 5).
> 
> The NO_REBOOT bit is set when SPKR pin on ICH9 is sampled high. This bit
> may be set or cleared by software if the strap is sampled low but may
> not override the strap when it indicates "No Reboot".
> 
> This patch implements the logic where hardware has ability to set SPKR
> pin through a property named "noreboot" and it's sampled high by
> default.

I know Michael suggested this, but I think default high is a worse
default.  It does not allow recovering from a hard lockup where you
cannot process an NMI, unlike all other watchdogs implemented by QEMU.
In fact, the Linux driver fails to start if the strap is high.
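
For readers following the argument, the strap semantics described in the
quoted commit message can be sketched as below. This is an illustration
only, not the code from the patch: NO_REBOOT_BIT, gcs_effective() and
tco_expiry_reboots() are hypothetical names. The only facts taken from the
thread are that NO_REBOOT is bit 5 and that software cannot clear it while
the strap is sampled high.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_REBOOT_BIT (1u << 5) /* hypothetical name for GCS bit 5 */

/* Value seen in the register: a high strap forces NO_REBOOT to stay set,
 * whatever software last wrote. */
static uint32_t gcs_effective(bool spkr_hi, uint32_t gcs_sw)
{
    return spkr_hi ? (gcs_sw | NO_REBOOT_BIT) : gcs_sw;
}

/* A timer expiry leads to a reboot only when NO_REBOOT is clear. */
static bool tco_expiry_reboots(bool spkr_hi, uint32_t gcs_sw)
{
    return !(gcs_effective(spkr_hi, gcs_sw) & NO_REBOOT_BIT);
}

/* Small worked cases. */
void strap_examples(void)
{
    assert(!tco_expiry_reboots(true, 0));              /* strap high: never reboots */
    assert(tco_expiry_reboots(false, 0));              /* strap low, bit clear */
    assert(!tco_expiry_reboots(false, NO_REBOOT_BIT)); /* strap low, bit set by software */
}
```

With the strap high there is no software path to a reboot on expiry, which
is the case the default discussion is about.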

My theory is that hardware manufacturers should only set the strap high
if they want the firmware to have total control of the watchdog via SMIs
(TCO_EN).

If it is just a matter of being late in 2.4, just delay everything to
2.5.  It doesn't require any more work from Paulo, as you can just flip
the default yourself without adding a new machine type (in fact I'm
still not sure why machine types for Q35 are versioned, since migration
is not supported...).

Paolo

> Signed-off-by: Paulo Alcantara 
> ---
> v7 -> v8:
>   * change property name to "noreboot"
>   * default "noreboot" property to high
>   * define property in dc->props
>   * update tco tests to support and exercise "noreboot" property
> ---
>  hw/acpi/tco.c  |  2 +-
>  hw/isa/lpc_ich9.c  |  6 ++
>  include/hw/i386/ich9.h |  5 +
>  tests/tco-test.c   | 18 --
>  4 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/acpi/tco.c b/hw/acpi/tco.c
> index 1794a54..7a026c2 100644
> --- a/hw/acpi/tco.c
> +++ b/hw/acpi/tco.c
> @@ -64,7 +64,7 @@ static void tco_timer_expired(void *opaque)
>  tr->tco.sts2 |= TCO_BOOT_STS;
>  tr->timeouts_no = 0;
>  
> -if (!(gcs & ICH9_CC_GCS_NO_REBOOT)) {
> +if (!lpc->pin_strap.spkr_hi && !(gcs & ICH9_CC_GCS_NO_REBOOT)) {
>  watchdog_perform_action();
>  tco_timer_stop(tr);
>  return;
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index b547002..3b460d4 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -688,6 +688,11 @@ static const VMStateDescription vmstate_ich9_lpc = {
>  }
>  };
>  
> +static Property ich9_lpc_properties[] = {
> +DEFINE_PROP_BOOL("noreboot", ICH9LPCState, pin_strap.spkr_hi, true),
> +DEFINE_PROP_END_OF_LIST(),
> +};
> +
>  static void ich9_lpc_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -699,6 +704,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void 
> *data)
>  dc->reset = ich9_lpc_reset;
>  k->init = ich9_lpc_init;
>  dc->vmsd = &vmstate_ich9_lpc;
> +dc->props = ich9_lpc_properties;
>  k->config_write = ich9_lpc_config_write;
>  dc->desc = "ICH9 LPC bridge";
>  k->vendor_id = PCI_VENDOR_ID_INTEL;
> diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
> index f5681a3..63c5cd8 100644
> --- a/include/hw/i386/ich9.h
> +++ b/include/hw/i386/ich9.h
> @@ -46,6 +46,11 @@ typedef struct ICH9LPCState {
>  ICH9LPCPMRegs pm;
>  uint32_t sci_level; /* track sci level */
>  
> +/* 2.24 Pin Straps */
> +struct {
> +bool spkr_hi;
> +} pin_strap;
> +
>  /* 10.1 Chipset Configuration registers(Memory Space)
>   which is pointed by RCBA */
>  uint8_t chip_config[ICH9_CC_SIZE];
> diff --git a/tests/tco-test.c b/tests/tco-test.c
> index 1a2fe3d..6a48188 100644
> --- a/tests/tco-test.c
> +++ b/tests/tco-test.c
> @@ -42,6 +42,7 @@ enum {
>  
>  typedef struct {
>  const char *args;
> +bool noreboot;
>  QPCIDevice *dev;
>  void *lpc_base;
>  void *tco_io_base;
> @@ -53,7 +54,9 @@ static void test_init(TestData *d)
>  QTestState *qs;
>  char *s;
>  
> -s = g_strdup_printf("-machine q35 %s", !d->args ? "" : d->args);
> +s = g_strdup_printf("-machine q35 %s %s",
> +d->noreboot ? "" : "-global ICH9-LPC.noreboot=false",
> +!d->args ? "" : d->args);
>  qs = qtest_start(s);
>  qtest_irq_intercept_in(qs, "ioapic");
>  g_free(s);
> @@ -135,6 +138,7 @@ static void test_tco_defaults(void)
>  TestData d;
>  
>  d.args = NULL;
> +d.noreboot = true;
>  test_init(&d);
>  g_assert_cmpint(qpci_io_readw(d.dev, d.tco_io_base + TCO_RLD), ==,
>  TCO_RLD_DEFAULT);
> @@ -167,6 +171,7 @@ static void test_tco_timeout(void)
>  int ret;
>  
>  d.args = NULL;
> +d.noreboot = true;
>  test_init(&d);
>  
>  stop_tco(&d);
> @@ -210,6 +215,7 @@ static void test_tco_max_timeout(void)
>  int ret;
>

Re: [Qemu-devel] [PATCH] linux-user: Avoid compilation error with --disable-guest-base

2015-07-01 Thread Aurelien Jarno

On 2015-07-01 01:58, Laurent Vivier wrote:
> 
> 
> Le 30/06/2015 19:20, Peter Maydell a écrit :
> > On 30 June 2015 at 18:13, Laurent Vivier  wrote:
> >>
> >>
> >> Le 30/06/2015 18:45, Peter Maydell a écrit :
> >>> On 30 June 2015 at 17:19, Laurent Vivier  wrote:
>  When guest base is disabled, RESERVED_VA is 0, and
>  (__guest < RESERVED_VA) is always false as __guest is unsigned.
> 
>  With -Werror=type-limits, this triggers an error:
> 
>  include/exec/cpu_ldst.h:60:31: error: comparison of unsigned 
>  expression < 0 is always false [-Werror=type-limits]
>   (!RESERVED_VA || (__guest < RESERVED_VA)); \
> 
>  This patch removes this comparison when guest base is disabled.
> >>>
> >>> Is there a useful reason to compile with --disable-guest-base
> >>> (ie why we should retain the !CONFIG_USE_GUEST_BASE code
> >>> in QEMU at all) ? It was originally optional because we
> >>> didn't support it in all our TCG hosts, but we fixed that
> >>> back in 2012...
> >>
> >> TCG generates less code, so performance is better (well, it is what I
> >> guess).
> >>
> >> I've compiled a kernel with and without guest base in a chrooted
> >> linux-user-qemu.
> >> Without guest base it is ~1 minute less for a 13 minutes build.
> >>
> >> I can do more tests if you want.
> > 
> > Hmm. That's a fair chunk of speedup. On the downside:
> >  * you only get this if you're willing to build QEMU from
> >source with funny options
> >  * it won't work for all guest/host combinations (sometimes
> >the guest really wants to be able to map at low addresses
> >the host won't permit)
> >  * it's an extra configuration to maintain which we're
> >clearly not testing at all upstream
> > 
> > I'd still favour removing it completely, personally...
> 
> In fact, I have made more measurements: it saves only ~10 seconds on a
> 13-minute build.
> 
> my test is: "make -j 4 vmlinux"
> (target: m68k, host: x86_64, 4 cores x 2 threads)

Note that on x86_64, guest base is implemented by using the gs segment
register. That explains why the impact should be relatively low, as your
test shows.

> --enable-guest-base
> 
> real    13m26.134s  13m28.712s  13m28.053s  13m28.875s
> user    52m44.882s  52m56.075s  52m49.223s  52m55.366s
> sys     0m33.452s   0m33.613s   0m33.013s   0m33.336s
> 
> --disable-guest-base
> 
> real    13m20.412s  13m17.773s  13m15.836s  13m13.278s
> user    52m23.165s  52m7.184s   52m1.547s   51m50.277s
> sys     0m33.427s   0m33.392s   0m32.954s   0m33.430s
> 
> Laurent
> 
> 
> 
> 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [PATCH] libseccomp: add cacheflush to whitelist

2015-07-01 Thread Andrew Jones

cacheflush is an arm-specific syscall that qemu built for arm
uses. Add it to the whitelist.

Signed-off-by: Andrew Jones 

---

I'm not sure about the priority selection. Maybe cacheflush gets
used frequently enough that it deserves a higher one?

This patch isn't really necessary yet due to ae6e8ef11e6c: "Revert
seccomp tests that allow it to be used on non-x86 architectures",
which we can't revert until libseccomp has released a fix for
arm-specific syscall symbol naming, but when linking to a patched
libseccomp and reverting ae6e8ef11e6c, then this patch allows
guests to boot with '-sandbox on'.

Signed-off-by: Andrew Jones 
---
 qemu-seccomp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index f9de0d3390feb..33644a4e3c3d3 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -237,7 +237,8 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(fadvise64), 240 },
 { SCMP_SYS(inotify_init1), 240 },
 { SCMP_SYS(inotify_add_watch), 240 },
-{ SCMP_SYS(mbind), 240 }
+{ SCMP_SYS(mbind), 240 },
+{ SCMP_SYS(cacheflush), 240 },
 };
 
 int seccomp_start(void)
-- 
2.1.0

Re: [Qemu-devel] [PATCH RFC 6/6] xen: Add backtrace for serious issues.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> When debugging issues that caused the emulator to kill itself
> or skip certain operations (unable to write to host
> registers), a stack trace will most definitely aid in debugging
> the problem.
> 
> As such this patch uses the most basic backtrace to print out
> details.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

I think it could be useful, but it cannot be done as a xen-hvm.c thing.
It should be somewhere generic, maybe under util? Stefan, any
suggestions?
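
As a rough idea of what a generic helper could look like, here is a sketch
under stated assumptions, not a proposal for the actual location or name.
dump_stack_to_fd() and DUMP_STACK_MAX_FRAMES are hypothetical; backtrace()
and backtrace_symbols_fd() come from glibc's <execinfo.h>, so other C
libraries would need a fallback.

```c
#include <assert.h>
#include <execinfo.h> /* glibc-specific: backtrace(), backtrace_symbols_fd() */
#include <unistd.h>

#define DUMP_STACK_MAX_FRAMES 64

/* Hypothetical generic helper: write the current call stack to fd and
 * return the number of frames captured. */
int dump_stack_to_fd(int fd)
{
    void *frames[DUMP_STACK_MAX_FRAMES];
    int n;

    assert(fd >= 0);
    n = backtrace(frames, DUMP_STACK_MAX_FRAMES);
    if (n > 0) {
        /* Writes directly to fd without calling malloc(). */
        backtrace_symbols_fd(frames, n, fd);
    }
    return n;
}
```

A Xen caller would then invoke dump_stack_to_fd(STDERR_FILENO) where the
patch calls xen_dump_stack(); symbol names are usually only resolved when
the binary is linked with -rdynamic.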


>  hw/xen/xen_pt.c |  3 +++
>  include/hw/xen/xen_common.h |  1 +
>  xen-hvm.c   | 16 
>  3 files changed, 20 insertions(+)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index ea1ceda..1d256b9 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -407,6 +407,7 @@ out:
>  
>  if (rc < 0) {
>  XEN_PT_ERR(d, "xen_host_pci_set_block failed. return value: 
> %d.\n", rc);
> +xen_dump_stack();
>  }
>  }
>  }
> @@ -421,6 +422,7 @@ static uint64_t xen_pt_bar_read(void *o, hwaddr addr,
>   * misconfiguration of the IOMMU. */
>  XEN_PT_ERR(d, "Should not read BAR through QEMU. @0x"TARGET_FMT_plx"\n",
> addr);
> +xen_dump_stack();
>  return 0;
>  }
>  static void xen_pt_bar_write(void *o, hwaddr addr, uint64_t val,
> @@ -430,6 +432,7 @@ static void xen_pt_bar_write(void *o, hwaddr addr, 
> uint64_t val,
>  /* Same comment as xen_pt_bar_read function */
>  XEN_PT_ERR(d, "Should not write BAR through QEMU. @0x"TARGET_FMT_plx"\n",
> addr);
> +xen_dump_stack();
>  }
>  
>  static const MemoryRegionOps ops = {
> diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
> index 38f29fb..3983cfb 100644
> --- a/include/hw/xen/xen_common.h
> +++ b/include/hw/xen/xen_common.h
> @@ -165,6 +165,7 @@ void destroy_hvm_domain(bool reboot);
>  
>  /* shutdown/destroy current domain because of an error */
>  void xen_shutdown_fatal_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
> +void xen_dump_stack(void);
>  
>  #ifdef HVM_PARAM_VMPORT_REGS_PFN
>  static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
> diff --git a/xen-hvm.c b/xen-hvm.c
> index a92bc14..8bf4a57 100644
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -10,6 +10,7 @@
>  
>  #include 
>  
> +#include 
>  #include "hw/pci/pci.h"
>  #include "hw/i386/pc.h"
>  #include "hw/xen/xen_common.h"
> @@ -1328,6 +1329,20 @@ void xen_register_framebuffer(MemoryRegion *mr)
>  framebuffer = mr;
>  }
>  
> +void xen_dump_stack(void)
> +{
> +int nptrs;
> +#define SIZE 1024
> +void *buffer[SIZE];
> +
> +nptrs = backtrace(buffer, SIZE);
> +if (!nptrs)
> +return;
> +
> +backtrace_symbols_fd(buffer, nptrs, STDERR_FILENO);
> +#undef SIZE
> +}
> +
>  void xen_shutdown_fatal_error(const char *fmt, ...)
>  {
>  va_list ap;
> @@ -1335,6 +1350,7 @@ void xen_shutdown_fatal_error(const char *fmt, ...)
>  va_start(ap, fmt);
>  vfprintf(stderr, fmt, ap);
>  va_end(ap);
> +xen_dump_stack();
>  fprintf(stderr, "Will destroy the domain.\n");
>  /* destroy the domain */
>  qemu_system_shutdown_request();
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH RFC 4/6] xen: Print and use errno where applicable.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> In Xen 4.6 commit cd2f100f0f61b3f333d52d1737dd73f02daee592
> "libxc: Fix do_memory_op to return negative value on errors"
> made the libxc API less odd-ball: On errors, return value is
> -1 and error code is in errno. On success the return value
> is either 0 or a positive value.
> 
> Since we could be running with an old toolstack in which the
> Exx value is in rc or the newer, we print both and return
> the -EXX depending on rc == -1 condition.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  xen-hvm.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/xen-hvm.c b/xen-hvm.c
> index 0408462..a92bc14 100644
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -345,11 +345,12 @@ go_physmap:
>  unsigned long idx = pfn + i;
>  xen_pfn_t gpfn = start_gpfn + i;
>  
> +/* In Xen 4.6 rc is -1 and errno contains the error value. */
>  rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
> idx, gpfn);
>  if (rc) {
>  DPRINTF("add_to_physmap MFN %"PRI_xen_pfn" to PFN %"
> -PRI_xen_pfn" failed: %d\n", idx, gpfn, rc);
> -return -rc;
> +PRI_xen_pfn" failed: %d (errno: %d)\n", idx, gpfn, rc, 
> errno);
> +return rc == -1 ? -errno : -rc;

Printing both rc and errno is the right thing to do, but I am not sure
changing return value depending on the libxc version is a good idea.
Maybe we should be consistent and always return rc?
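
One way to picture a consistent return value is a small helper that always
yields 0 or a negative errno, whichever convention the toolstack uses. This
is a sketch only: xc_rc_to_neg_errno() is a hypothetical name, and the
handling of the older convention (error code carried in rc, with either
sign) is an assumption rather than something stated in the thread.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper. saved_errno should be captured immediately after
 * the libxc call, since later library calls may overwrite errno. */
static int xc_rc_to_neg_errno(int rc, int saved_errno)
{
    if (rc == 0) {
        return 0;
    }
    if (rc == -1) {
        /* Newer convention: -1 with the error code in errno. */
        assert(saved_errno > 0);
        return -saved_errno;
    }
    /* Assumed older convention: error code in rc; normalize the sign. */
    return rc > 0 ? -rc : rc;
}
```

Note that the two conventions cannot be told apart when an old toolstack
reports -EPERM, because EPERM is 1 on Linux.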


>  }
>  }
>  
> @@ -422,11 +423,12 @@ static int xen_remove_from_physmap(XenIOState *state,
>  xen_pfn_t idx = start_addr + i;
>  xen_pfn_t gpfn = phys_offset + i;
>  
> +/* In Xen 4.6 rc is -1 and errno contains the error value. */
>  rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
> idx, gpfn);
>  if (rc) {
>  fprintf(stderr, "add_to_physmap MFN %"PRI_xen_pfn" to PFN %"
> -PRI_xen_pfn" failed: %d\n", idx, gpfn, rc);
> -return -rc;
> +PRI_xen_pfn" failed: %d (errno: %d)\n", idx, gpfn, rc, 
> errno);
> +return rc == -1 ? -errno : -rc;
>  }
>  }
>  
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH RFC 3/6] xen/pt: xen_host_pci_config_read returns -errno, not -1 on failure

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> However the init routines assume that on errors the return
> code is -1 (as the libxc API is) - while those xen_host_* routines follow
> another paradigm - negative errno on return, 0 on success.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Stefano Stabellini 
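
For the record, a small sketch of why the old comparison misses failures.
fake_get_block() is a hypothetical stand-in for a xen_host_pci_* call that
returns 0 on success and a negative errno on failure; it is not the real
function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* The sketch relies on -EIO being distinguishable from -1. */
static_assert(EIO != 1, "sketch assumes EIO != 1");

/* Hypothetical stand-in: 0 on success, negative errno on failure. */
static int fake_get_block(bool fail)
{
    return fail ? -EIO : 0;
}

/* Old check: only catches a failure that happens to be reported as -1. */
static bool old_check_detects_failure(bool fail)
{
    return fake_get_block(fail) == -1;
}

/* New check: catches any negative errno value. */
static bool new_check_detects_failure(bool fail)
{
    return fake_get_block(fail) < 0;
}
```

With -EIO the old check reports success and initialization carries on with
an unread config space, while the new check takes the error path.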

>  hw/xen/xen_pt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 706e3d9..ea1ceda 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -716,7 +716,7 @@ static int xen_pt_initfn(PCIDevice *d)
>  
>  /* Initialize virtualized PCI configuration (Extended 256 Bytes) */
>  if (xen_host_pci_get_block(&s->real_device, 0, d->config,
> -   PCI_CONFIG_SPACE_SIZE) == -1) {
> +   PCI_CONFIG_SPACE_SIZE) < 0) {
>  xen_host_pci_device_put(&s->real_device);
>  return -1;
>  }
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH RFC 2/6] xen/pt: Make xen_pt_msi_set_enable static

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> As we do not use it outside our code.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Stefano Stabellini 


>  hw/xen/xen_pt.h | 1 -
>  hw/xen/xen_pt_msi.c | 2 +-
>  2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
> index 393f36c..09358b1 100644
> --- a/hw/xen/xen_pt.h
> +++ b/hw/xen/xen_pt.h
> @@ -289,7 +289,6 @@ static inline uint8_t 
> xen_pt_pci_intx(XenPCIPassthroughState *s)
>  }
>  
>  /* MSI/MSI-X */
> -int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool en);
>  int xen_pt_msi_setup(XenPCIPassthroughState *s);
>  int xen_pt_msi_update(XenPCIPassthroughState *d);
>  void xen_pt_msi_disable(XenPCIPassthroughState *s);
> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> index 263e051..5822df5 100644
> --- a/hw/xen/xen_pt_msi.c
> +++ b/hw/xen/xen_pt_msi.c
> @@ -220,7 +220,7 @@ static int msi_msix_disable(XenPCIPassthroughState *s,
>   * MSI virtualization functions
>   */
>  
> -int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool enable)
> +static int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool enable)
>  {
>  XEN_PT_LOG(&s->dev, "%s MSI.\n", enable ? "enabling" : "disabling");
>  
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH RFC 1/6] xen/pt: Update comments with proper function name.

2015-07-01 Thread Stefano Stabellini

On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote:
> It has changed but the comments still refer to the old names.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Stefano Stabellini 


>  hw/xen/xen_pt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index ed5fcae..706e3d9 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -378,7 +378,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, 
> uint32_t addr,
>  }
>  }
>  
> -/* need to shift back before passing them to xen_host_pci_device */
> +/* need to shift back before passing them to xen_host_pci_set_block. */
>  val >>= (addr & 3) << 3;
>  
>  memory_region_transaction_commit();
> @@ -406,7 +406,7 @@ out:
>  (uint8_t *)&val + index, len);
>  
>  if (rc < 0) {
> -XEN_PT_ERR(d, "pci_write_block failed. return value: %d.\n", rc);
> +XEN_PT_ERR(d, "xen_host_pci_set_block failed. return value: 
> %d.\n", rc);
>  }
>  }
>  }
> -- 
> 2.1.0
>

Re: [Qemu-devel] [PATCH v2 07/22] virtio: find version 1.0 virtio capabilities

2015-07-01 Thread Gerd Hoffmann

  Hi,

> > 
> > Yes, seabios always allocates both mem and io.
> 
> What if it can't? E.g. too many devices.

It first tries to move 64-bit BARs above 64g.  I guess we had better
exclude virtio devices here (as we already do for xhci).

Failing that it'll panic.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v2 1/1] KVM s390 pci infrastructure modelling

2015-07-01 Thread Hong Bo Li




On 7/1/2015 19:57, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 07:46:01PM +0800, Hong Bo Li wrote:


On 7/1/2015 19:23, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 07:11:38PM +0800, Hong Bo Li wrote:

On 7/1/2015 18:36, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 06:04:24PM +0800, Hong Bo Li wrote:

On 7/1/2015 17:22, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 05:13:11PM +0800, Hong Bo Li wrote:

On 7/1/2015 16:05, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 03:56:25PM +0800, Hong Bo Li wrote:

On 7/1/2015 14:22, Michael S. Tsirkin wrote:

On Tue, Jun 30, 2015 at 02:16:59PM +0800, Hong Bo Li wrote:

On 6/29/2015 18:01, Michael S. Tsirkin wrote:

On Mon, Jun 29, 2015 at 05:24:53PM +0800, Hong Bo Li wrote:

This patch introduces a new facility (and bus)
to hold devices representing information actually
provided by s390 firmware and I/O configuration.
usage example:
-device s390-pcihost
-device vfio-pci,host=:00:00.0,id=vpci1
-device zpci,fid=2,uid=5,pci_id=vpci1,id=zpci1

The first line will create a s390 pci host bridge
and init the root bus. The second line will create
a standard vfio pci device, and attach it to the
root bus. These are similar to the standard process
to define a pci device on other platforms.

The third line will create a s390 pci device to
store s390 specific information, and references
the corresponding vfio pci device via device id.
We create a s390 pci facility bus to hold all the
zpci devices.

Signed-off-by: Hong Bo Li 

It's mostly up to s390 maintainers, but I'd like to note
one thing below


---
  hw/s390x/s390-pci-bus.c| 314 +
  hw/s390x/s390-pci-bus.h|  48 ++-
  hw/s390x/s390-pci-inst.c   |   4 +-
  hw/s390x/s390-virtio-ccw.c |   5 +-
  4 files changed, 283 insertions(+), 88 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 560b66a..d5e7b2e 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -32,8 +32,8 @@ int chsc_sei_nt2_get_event(void *res)
  PciCcdfErr *eccdf;
  int rc = 1;
  SeiContainer *sei_cont;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return rc;
@@ -72,8 +72,8 @@ int chsc_sei_nt2_get_event(void *res)
  int chsc_sei_nt2_have_event(void)
  {
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return 0;
@@ -82,20 +82,32 @@ int chsc_sei_nt2_have_event(void)
  return !QTAILQ_EMPTY(&s->pending_sei);
  }
+void s390_pci_device_enable(S390PCIBusDevice *zpci)
+{
+zpci->fh = zpci->fh | 1 << ENABLE_BIT_OFFSET;
+}
+
+void s390_pci_device_disable(S390PCIBusDevice *zpci)
+{
+zpci->fh = zpci->fh & ~(1 << ENABLE_BIT_OFFSET);
+if (zpci->is_unplugged)
+object_unparent(OBJECT(zpci));
+}
+
  S390PCIBusDevice *s390_pci_find_dev_by_fid(uint32_t fid)
  {
  S390PCIBusDevice *pbdev;
-int i;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+BusChild *kid;
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return NULL;
  }
-for (i = 0; i < PCI_SLOT_MAX; i++) {
-pbdev = &s->pbdev[i];
-if ((pbdev->fh != 0) && (pbdev->fid == fid)) {
+QTAILQ_FOREACH(kid, &s->fbus->qbus.children, sibling) {
+pbdev = (S390PCIBusDevice *)kid->child;
+if (pbdev->fid == fid) {
  return pbdev;
  }
  }
@@ -126,39 +138,24 @@ void s390_pci_sclp_configure(int configure, SCCB *sccb)
  return;
  }
-static uint32_t s390_pci_get_pfid(PCIDevice *pdev)
-{
-return PCI_SLOT(pdev->devfn);
-}
-
-static uint32_t s390_pci_get_pfh(PCIDevice *pdev)
-{
-return PCI_SLOT(pdev->devfn) | FH_VIRT;
-}
-
  S390PCIBusDevice *s390_pci_find_dev_by_idx(uint32_t idx)
  {
  S390PCIBusDevice *pbdev;
-int i;
-int j = 0;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+BusChild *kid;
+int i = 0;
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return NULL;
  }
-for (i = 0; i < PCI_SLOT_MAX; i++) {
-pbdev = &s->pbdev[i];
-
-if (pbdev->fh == 0) {
-continue;
-}
-
-if (j == idx) {
+QTAILQ_FOREACH(kid, &s->fbus->qbus.children, sibling) {
+pbdev = (S390PCIBusDevice *)kid->child;
+if (i == idx) {
  return pbdev;
  }
-j++;
+i++;
  }
  return NULL;

This relies on the order of children on the qbus, th

Re: [Qemu-devel] [PATCH 1/1] s390x/migration: Introduce 2.4 machine

2015-07-01 Thread Cornelia Huck

On Wed,  1 Jul 2015 11:16:57 +0200
Christian Borntraeger  wrote:

> The section footer changes commit f68945d42bab ("Add a protective
> section footer") and commit 37fb569c0198 ("Disable section footers
> on older machine types") broke migration for any non-versioned
> machines.
> 
> While one can argue that section footers should be enabled
> explicitly for new versions instead of disabled for old ones,
> this points to a problem of s390-ccw-machines: it needs to
> be versioned to be compatible with future changes in common
> code data structures such as section footers.
> 
> Let's introduce a version scheme for s390-ccw-virtio machines.
> We will use the old s390-ccw-virtio name as alias to the latest
> version as all existing libvirt XML for the ccw type were expanded
> by libvirt to that name.
> 
> The only downside of this patch is that the old alias s390-ccw
> will no longer be available, as machines can have only one alias,
> but it should not really matter.
> 
> Cc: Dr. David Alan Gilbert 
> Cc: Juan Quintela 
> Cc: Boris Fiuczynski 
> Cc: Jason J. Herne 
> Signed-off-by: Christian Borntraeger 
> ---
>  hw/s390x/s390-virtio-ccw.c | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)

Adapted the commit message and applied (with minor tweaks) to my
s390-next branch at

git://github.com/cohuck/qemu s390-next

I'll probably send a pull request including this patch tomorrow, unless
someone has further comments.
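
The alias scheme from the quoted commit message can be pictured with a
minimal sketch. It assumes a simplified, hypothetical registry (struct
machine_type, machine_table and find_machine are invented for
illustration and are not QEMU's machine class code), and the versioned
name used here is an assumption as well. Each entry carries one versioned
name and at most one alias, so the unversioned name keeps resolving to
the latest version while the dropped s390-ccw alias no longer matches:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified machine registry; not QEMU's MachineClass. */
struct machine_type {
    const char *name;   /* versioned name, stable across releases */
    const char *alias;  /* at most one alias, or NULL */
};

/* The versioned name below is an assumption made for illustration. */
const struct machine_type machine_table[] = {
    { "s390-ccw-virtio-2.4", "s390-ccw-virtio" },
};

/* Resolve a requested machine name, matching the name or the alias. */
const struct machine_type *find_machine(const char *requested)
{
    size_t n = sizeof(machine_table) / sizeof(machine_table[0]);

    assert(requested != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct machine_type *m = &machine_table[i];

        if (strcmp(m->name, requested) == 0) {
            return m;
        }
        if (m->alias != NULL && strcmp(m->alias, requested) == 0) {
            return m;
        }
    }
    return NULL; /* unknown name, e.g. an alias that was dropped */
}
```

In this sketch, existing configurations that name s390-ccw-virtio end up
on the versioned entry, and adding a newer version later would only
require moving the alias to the new entry.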

Re: [Qemu-devel] [PATCH v2 1/1] KVM s390 pci infrastructure modelling

2015-07-01 Thread Hong Bo Li




On 7/1/2015 19:57, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 07:46:01PM +0800, Hong Bo Li wrote:


On 7/1/2015 19:23, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 07:11:38PM +0800, Hong Bo Li wrote:

On 7/1/2015 18:36, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 06:04:24PM +0800, Hong Bo Li wrote:

On 7/1/2015 17:22, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 05:13:11PM +0800, Hong Bo Li wrote:

On 7/1/2015 16:05, Michael S. Tsirkin wrote:

On Wed, Jul 01, 2015 at 03:56:25PM +0800, Hong Bo Li wrote:

On 7/1/2015 14:22, Michael S. Tsirkin wrote:

On Tue, Jun 30, 2015 at 02:16:59PM +0800, Hong Bo Li wrote:

On 6/29/2015 18:01, Michael S. Tsirkin wrote:

On Mon, Jun 29, 2015 at 05:24:53PM +0800, Hong Bo Li wrote:

This patch introduces a new facility (and bus)
to hold devices representing information actually
provided by s390 firmware and I/O configuration.
Usage example:
-device s390-pcihost
-device vfio-pci,host=:00:00.0,id=vpci1
-device zpci,fid=2,uid=5,pci_id=vpci1,id=zpci1

The first line will create a s390 pci host bridge
and init the root bus. The second line will create
a standard vfio pci device, and attach it to the
root bus. These are similar to the standard process
to define a pci device on other platforms.

The third line will create a s390 pci device to
store s390 specific information, and references
the corresponding vfio pci device via device id.
We create a s390 pci facility bus to hold all the
zpci devices.

Signed-off-by: Hong Bo Li 

It's mostly up to s390 maintainers, but I'd like to note
one thing below


---
  hw/s390x/s390-pci-bus.c| 314 +
  hw/s390x/s390-pci-bus.h|  48 ++-
  hw/s390x/s390-pci-inst.c   |   4 +-
  hw/s390x/s390-virtio-ccw.c |   5 +-
  4 files changed, 283 insertions(+), 88 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 560b66a..d5e7b2e 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -32,8 +32,8 @@ int chsc_sei_nt2_get_event(void *res)
  PciCcdfErr *eccdf;
  int rc = 1;
  SeiContainer *sei_cont;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return rc;
@@ -72,8 +72,8 @@ int chsc_sei_nt2_get_event(void *res)
  int chsc_sei_nt2_have_event(void)
  {
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return 0;
@@ -82,20 +82,32 @@ int chsc_sei_nt2_have_event(void)
  return !QTAILQ_EMPTY(&s->pending_sei);
  }
+void s390_pci_device_enable(S390PCIBusDevice *zpci)
+{
+zpci->fh = zpci->fh | 1 << ENABLE_BIT_OFFSET;
+}
+
+void s390_pci_device_disable(S390PCIBusDevice *zpci)
+{
+zpci->fh = zpci->fh & ~(1 << ENABLE_BIT_OFFSET);
+if (zpci->is_unplugged)
+object_unparent(OBJECT(zpci));
+}
+
  S390PCIBusDevice *s390_pci_find_dev_by_fid(uint32_t fid)
  {
  S390PCIBusDevice *pbdev;
-int i;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+BusChild *kid;
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return NULL;
  }
-for (i = 0; i < PCI_SLOT_MAX; i++) {
-pbdev = &s->pbdev[i];
-if ((pbdev->fh != 0) && (pbdev->fid == fid)) {
+QTAILQ_FOREACH(kid, &s->fbus->qbus.children, sibling) {
+pbdev = (S390PCIBusDevice *)kid->child;
+if (pbdev->fid == fid) {
  return pbdev;
  }
  }
@@ -126,39 +138,24 @@ void s390_pci_sclp_configure(int configure, SCCB *sccb)
  return;
  }
-static uint32_t s390_pci_get_pfid(PCIDevice *pdev)
-{
-return PCI_SLOT(pdev->devfn);
-}
-
-static uint32_t s390_pci_get_pfh(PCIDevice *pdev)
-{
-return PCI_SLOT(pdev->devfn) | FH_VIRT;
-}
-
  S390PCIBusDevice *s390_pci_find_dev_by_idx(uint32_t idx)
  {
  S390PCIBusDevice *pbdev;
-int i;
-int j = 0;
-S390pciState *s = S390_PCI_HOST_BRIDGE(
-object_resolve_path(TYPE_S390_PCI_HOST_BRIDGE, NULL));
+BusChild *kid;
+int i = 0;
+S390PCIFacility *s = S390_PCI_FACILITY(
+object_resolve_path(TYPE_S390_PCI_FACILITY, NULL));
  if (!s) {
  return NULL;
  }
-for (i = 0; i < PCI_SLOT_MAX; i++) {
-pbdev = &s->pbdev[i];
-
-if (pbdev->fh == 0) {
-continue;
-}
-
-if (j == idx) {
+QTAILQ_FOREACH(kid, &s->fbus->qbus.children, sibling) {
+pbdev = (S390PCIBusDevice *)kid->child;
+if (i == idx) {
  return pbdev;
  }
-j++;
+i++;
  }
  return NULL;

This relies on the order of children on the qbus, th

Re: [Qemu-devel] [PATCH RFC 0/4] vGICv3 support

2015-07-01 Thread Pavel Fedin

 Hello!

> I thought the general sense here was that since emulating the full
> device is much more complicated than driving the KVM part,

 Yes, but still it actually shares 50% of the code with SW emulation. It
reuses the vGICv3 base class as well as the new machine.

> the integration with the virt board should go in via this series, and the
> emulation should build on top of that?

 You know... I could rip parts out of Shlomo's patches and use them as a
base for my series. But is it worth the effort? It would actually be
reposting the same code over and over again...

> It just felt like both patch series have stalled somehow, and I would
> like to see what we can do to get this stuff moving again.

 For this purpose you can talk to Peter, I guess, because it was his
decision. By the way, when does the freeze period end?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

Re: [Qemu-devel] [PATCH v2 02/22] virtio: run drivers in 32bit mode

2015-07-01 Thread Gerd Hoffmann

On Mi, 2015-07-01 at 10:08 +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 30, 2015 at 10:38:53AM +0200, Gerd Hoffmann wrote:
> > virtio version 1.0 registers can (and actually do in the qemu
> > implementation) live in mmio space.  So we must run the blk and
> > scsi virtio drivers in 32bit mode, otherwise we can't access them.
> > 
> > This also allows to drop a bunch of GET_LOWFLAT calls from the virtio
> > code in the following patches.
> > 
> > Signed-off-by: Gerd Hoffmann 
> 
> Is there an advantage to running them in a 16 bit mode?

Not really any more.  Switching from 32bit mode back to
whatever-was-active-before used to be problematic before we had smm mode
support.  In theory.  Because you can't save/restore the complete x86
processor state.  In practice we had surprisingly few problems,
apparently linux boot loaders simply don't play dirty tricks.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v2 07/22] virtio: find version 1.0 virtio capabilities

2015-07-01 Thread Michael S. Tsirkin

On Wed, Jul 01, 2015 at 02:24:02PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Hmm this seems to violate this rule in the spec:
> > 
> > 
> > The driver SHOULD use the first instance of each virtio structure type
> > they can support.
> > 
> > "can support" here means that bios was able to allocate
> > it during enumeration.
> > 
> > For example there could be both IO and memory, in this order
> > you need to check that IO/memory got enabled (in theory,
> > also that they are within parent bridge's windows - used
> > by some guests, but seabios doesn't disable memory/io in this
> > strange way).
> 
> Yes, seabios always allocates both mem and io.

What if it can't? E.g. too many devices.

> So this incremental fix ...
> 
> @@ -234,7 +234,7 @@ void vp_init_simple(struct vp_device *vp, struct pci_device *pci)
>  vp_cap = NULL;
>  break;
>  }
> -if (vp_cap) {
> +if (vp_cap && !vp_cap->cap) {
>  vp_cap->cap = cap;
>  vp_cap->bar = pci_config_readb(pci->bdf, cap +
>  offsetof(struct virtio_pci_cap, bar));
> 
> ... makes seabios use the first not the last and should do the trick,
> right?
> 
> cheers,
>   Gerd
>
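
The "first instance wins" rule discussed above can be sketched on its
own. This is a simplified illustration, not the SeaBIOS code: struct
cap_entry, the CFG_* constants and pick_first_caps are hypothetical
names, and the usable flag stands in for the check that the I/O or memory
resource behind a capability actually got enabled. For each structure
type, the first usable capability in list order is kept and later ones
are ignored:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure types for this sketch; 0 is treated as invalid. */
enum {
    CFG_COMMON = 1,
    CFG_NOTIFY = 2,
    CFG_ISR = 3,
    CFG_DEVICE = 4,
    CFG_TYPE_MAX = 5
};

/* Simplified stand-in for one entry of a device's capability list. */
struct cap_entry {
    uint8_t cfg_type; /* which virtio structure the capability describes */
    uint8_t offset;   /* config space offset of the capability, nonzero */
    uint8_t usable;   /* nonzero if the resource behind it was enabled */
};

/*
 * For every structure type, store the offset of the first usable
 * capability in list order. found[] needs CFG_TYPE_MAX entries;
 * an entry of 0 means no usable capability of that type was seen.
 */
void pick_first_caps(const struct cap_entry *caps, size_t n,
                     uint8_t found[CFG_TYPE_MAX])
{
    assert(caps != NULL || n == 0);

    for (size_t t = 0; t < CFG_TYPE_MAX; t++) {
        found[t] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        uint8_t t = caps[i].cfg_type;

        if (t == 0 || t >= CFG_TYPE_MAX) {
            continue; /* unknown structure type */
        }
        if (!caps[i].usable) {
            continue; /* the driver cannot support this instance */
        }
        if (found[t] != 0) {
            continue; /* an earlier instance already won */
        }
        found[t] = caps[i].offset;
    }
}
```

Skipping unusable entries before applying the first-instance check is
what separates "first instance the driver can support" from simply
"first instance in the list".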
