Re: [Qemu-devel] Reverse execution and deterministic replay
Hi Pavel,

On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
> Hello!
>
> We want to publish a set of patches related to reverse execution and
> deterministic replay in qemu. Our implementation of deterministic replay can
> be used for deterministic and reverse debugging of guest code through the gdb
> remote interface.
>
> Execution recording writes a log of non-deterministic events, which can later
> be used to replay the execution anywhere, an unlimited number of times. It
> also supports checkpointing for faster rewinding during reverse debugging.
> Execution replaying reads the log and replays all non-deterministic events,
> including external input, hardware clocks, and interrupts.
>
> Reverse execution has the following features:
> * Deterministically replays whole system execution and all contents of the
>   memory, state of the hardware devices, clocks, and screen of the VM.
> * Writes the execution log into a file for later replaying, multiple times,
>   on different machines.
> * Supports i386, x86_64, and ARM hardware platforms.
> * Performs deterministic replay of all operations with keyboard, mouse,
>   network adapters, audio devices, serial interfaces, and physical USB
>   devices connected to the emulator.
> * Provides support for gdb reverse debugging commands like reverse-step and
>   reverse-continue.
> * Supports auto-checkpointing for convenient reverse debugging.
> * Allows going to live execution from replay mode.
>
> Our implementation is completely tested for qemu 1.5 and is in beta state
> for 2.0.50.
>
> Some details about our implementation of reverse execution can be found in
> the paper:
> http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html

Add relevant implementation details to the git commit messages.

> Can anyone review our patches?

Fred Konrad is doing a series on reverse exe at the moment. CC. Is this an independent implementation of the same thing or are you building on it?

I suggest posting a full RFC, this looks to me just like a cover letter but without a series. Note that we are going into hard freeze imminently so there will be some delay for merge.

Regards,
Peter

> Pavel Dovgaluk
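The record/replay idea described in the announcement can be sketched as follows. This is only an illustration under stated assumptions: every name here (`rr_log`, `rr_input`, `rr_start_replay`) is hypothetical and not taken from the patches, and a real implementation would also have to log when each event occurs (for example, at which instruction count), which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the API of the patches. */
enum rr_mode { RR_RECORD, RR_REPLAY };

#define RR_LOG_MAX 64

struct rr_log {
    enum rr_mode mode;
    uint64_t events[RR_LOG_MAX]; /* recorded non-deterministic values */
    size_t count;                /* number of recorded events */
    size_t pos;                  /* next event to hand out during replay */
};

/* Switch to replay mode and rewind to the first recorded event. */
static void rr_start_replay(struct rr_log *log)
{
    log->mode = RR_REPLAY;
    log->pos = 0;
}

/*
 * Route one non-deterministic input (for example a clock read) through the
 * log. In record mode the live value is stored and returned. In replay mode
 * the live value is ignored and the stored value is returned instead, so the
 * guest observes exactly what it observed during recording.
 */
static uint64_t rr_input(struct rr_log *log, uint64_t live_value)
{
    if (log->mode == RR_RECORD) {
        assert(log->count < RR_LOG_MAX);
        log->events[log->count++] = live_value;
        return live_value;
    }
    assert(log->pos < log->count);
    return log->events[log->pos++];
}
```

A caller would wrap each source of non-determinism in `rr_input()`; the in-memory array stands in for the log file mentioned in the announcement.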
Re: [Qemu-devel] Reverse execution and deterministic replay
-----Original Message-----
From: peter.crosthwa...@petalogix.com [mailto:peter.crosthwa...@petalogix.com] On Behalf Of Peter Crosthwaite
Sent: Friday, June 27, 2014 10:11 AM
To: Pavel Dovgaluk; Fréderic Konrad
Cc: qemu-devel@nongnu.org Developers; Paolo Bonzini
Subject: Re: [Qemu-devel] Reverse execution and deterministic replay

> Hi Pavel,
>
> On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
>> Hello!
>>
>> We want to publish set of patches related to the reverse execution and
>> deterministic replay of qemu.
>> [...]
>> Some details about our implementation of reverse execution can be found in
>> paper:
>> http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
>
> Add relevant implementation details to the git commit messages.

Do you mean describing the details in the patches that I should submit?

>> Can anyone review our patches?
>
> Fred Konrad is doing a series on reverse exe at the moment. CC. Is the an
> independent implementation of the same thing or are you building on it?

Our implementation is not related to Fred Konrad's.

> I suggest posting a full RFC, this looks to me just like a cover letter but
> without a series.

Of course I will post a full RFC with details of the implementation.

> Note that we are going into hard freeze imminently so there will be some
> delay for merge.

Pavel Dovgaluk
Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
On 27.06.2014 at 06:59, Paolo Bonzini wrote:
> On 27/06/2014 03:15, Ming Lei wrote:
>> On Thu, Jun 26, 2014 at 11:57 PM, Paolo Bonzini pbonz...@redhat.com wrote:
>>> We can implement (advisory) calls like bdrv_plug/bdrv_unplug in order to
>>> restore the previous levels of performance.
>>
>> Yes, that is also what I am thinking, or interfaces like bdrv_queue_io()
>> and bdrv_submit_io(), which may match with aio interfaces.
>
> Would you like to try preparing a patch?

Note that there is already an interface in block.c that takes multiple requests at once, bdrv_aio_multiwrite(). It is currently used by virtio-blk, even though not in dataplane mode. It also submits individual requests to the block drivers currently, so effectively it doesn't make a difference, just the problem occurs in the block layer instead of the device.

We should either improve bdrv_aio_multiwrite() to submit the requests in a batch to the block drivers, add a bdrv_aio_multiwrite() and use it for dataplane as well (possibly with a flag for disabling the request merging if we want to keep the current behaviour for dataplane); or, if we consider it a bad interface, replace it altogether with the new thing even for normal virtio-blk.

If this makes a difference for dataplane, it probably makes a difference for all block devices.

Kevin
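The advisory plug/unplug idea discussed above could look roughly like this. It is a sketch only: `io_queue`, `io_plug`, `io_unplug`, `io_enqueue` and `backend_submit` are made-up names, not existing QEMU functions, and the backend is reduced to a counter standing in for one batched submission call.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 32

/* Hypothetical request queue; none of these names exist in QEMU. */
struct io_queue {
    int plugged;            /* nesting depth of io_plug() calls */
    size_t pending;         /* requests held back while plugged */
    unsigned backend_calls; /* how many times the backend was entered */
    size_t submitted;       /* total requests handed to the backend */
};

/* Stand-in for one batched submission to the block driver. */
static void backend_submit(struct io_queue *q)
{
    if (q->pending == 0) {
        return;
    }
    q->backend_calls++;
    q->submitted += q->pending;
    q->pending = 0;
}

/* Ask the queue to hold requests back until the matching io_unplug(). */
static void io_plug(struct io_queue *q)
{
    q->plugged++;
}

/* Drop one plug level; the outermost unplug flushes the whole batch. */
static void io_unplug(struct io_queue *q)
{
    assert(q->plugged > 0);
    if (--q->plugged == 0) {
        backend_submit(q);
    }
}

/* Queue one request; flush at once if unplugged or if the batch is full. */
static void io_enqueue(struct io_queue *q)
{
    q->pending++;
    if (q->plugged == 0 || q->pending == BATCH_MAX) {
        backend_submit(q);
    }
}
```

The plug is advisory in the sense that a full batch is flushed even while plugged, so a device that never unplugs cannot hold requests back indefinitely once `BATCH_MAX` is reached.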
[Qemu-devel] [PATCH 3/3] ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting up to 255 cpus. This
MAX_CPUS number was percolated back to virsh capabilities with the wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 33f77d2..eab0f5f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -79,7 +79,7 @@
 #define TIMEBASE_FREQ 51200ULL

-#define MAX_CPUS 256
+#define MAX_CPUS 255

 #define PHANDLE_XICP 0x
--
1.8.3.1
[Qemu-devel] [PATCH 2/3] spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the guest. Adds the
following properties to the guest root node:

  vm,uuid     - uuid of the guest
  host-model  - host model number
  host-serial - host machine serial number
  hypervisor  - hypervisor type, tells the guest it is kvm

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c       | 19 +++
 target-ppc/kvm.c     | 44 +++-
 target-ppc/kvm_ppc.h | 12
 3 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a8ba916..33f77d2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -319,6 +319,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
     unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
     uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
+    char char_buf[512];

     add_str(hypertas, "hcall-pft");
     add_str(hypertas, "hcall-term");
@@ -348,6 +349,24 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
     _FDT((fdt_property_string(fdt, "model", "IBM pSeries (emulated by qemu)")));
     _FDT((fdt_property_string(fdt, "compatible", "qemu,pseries")));

+    if (kvm_enabled()) {
+        _FDT((fdt_property_string(fdt, "hypervisor", "kvm")));
+    }
+
+    /*
+     * Add info to guest to indentify which host is it being run on
+     * and what is the uuid of the guest
+     */
+    memset(char_buf, 0, sizeof(char_buf));
+    if (!kvmppc_get_host_model(char_buf, sizeof(char_buf))) {
+        _FDT((fdt_property_string(fdt, "host-model", char_buf)));
+        memset(char_buf, 0, sizeof(char_buf));
+    }
+    if (!kvmppc_get_host_serial(char_buf, sizeof(char_buf))) {
+        _FDT((fdt_property_string(fdt, "host-serial", char_buf)));
+    }
+    _FDT((fdt_property(fdt, "vm,uuid", qemu_uuid, 16)));
+
     _FDT((fdt_property_cell(fdt, "#address-cells", 0x2)));
     _FDT((fdt_property_cell(fdt, "#size-cells", 0x2)));

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 2d87108..25091f8 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1369,7 +1369,7 @@ static int read_cpuinfo(const char *field, char *value, int len)
     }

     do {
-        if(!fgets(line, sizeof(line), f)) {
+        if (!fgets(line, sizeof(line), f)) {
             break;
         }
         if (!strncmp(line, field, field_len)) {
@@ -1404,6 +1404,48 @@ uint32_t kvmppc_get_tbfreq(void)
     return retval;
 }

+int32_t kvmppc_get_host_serial(char *value, int len)
+{
+    FILE *f;
+    int ret = -1;
+    char line[512];
+
+    memset(line, 0, sizeof(line));
+    f = fopen("/proc/device-tree/system-id", "r");
+    if (!f) {
+        return ret;
+    }
+
+    if (fgets(line, sizeof(line), f)) {
+        snprintf(value, len, "IBM,%s", line);
+        ret = 0;
+    }
+    fclose(f);
+
+    return ret;
+}
+
+int32_t kvmppc_get_host_model(char *value, int len)
+{
+    FILE *f;
+    int ret = -1;
+    char line[512];
+
+    memset(line, 0, sizeof(line));
+    f = fopen("/proc/device-tree/model", "r");
+    if (!f) {
+        return ret;
+    }
+
+    if (fgets(line, sizeof(line), f)) {
+        snprintf(value, len, "IBM,%s", line);
+        ret = 0;
+    }
+    fclose(f);
+
+    return ret;
+}
+
 /* Try to find a device tree node for a CPU with clock-frequency property */
 static int kvmppc_find_cpu_dt(char *buf, int buf_len)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 1118122..6fa3314 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -19,6 +19,8 @@ uint32_t kvmppc_get_tbfreq(void);
 uint64_t kvmppc_get_clockfreq(void);
 uint32_t kvmppc_get_vmx(void);
 uint32_t kvmppc_get_dfp(void);
+int32_t kvmppc_get_host_model(char *buf, int buf_len);
+int32_t kvmppc_get_host_serial(char *buf, int buf_len);
 int kvmppc_get_hasidle(CPUPPCState *env);
 int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level);
@@ -60,6 +62,16 @@ static inline uint32_t kvmppc_get_tbfreq(void)
     return 0;
 }

+static inline int32_t kvmppc_get_host_model(char *buf, int buf_len)
+{
+    return 0;
+}
+
+static inline int32_t kvmppc_get_host_serial(char *buf, int buf_len)
+{
+    return 0;
+}
+
 static inline uint64_t kvmppc_get_clockfreq(void)
 {
     return 0;
--
1.8.3.1
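The two getters in this patch share the same body apart from the path, so they could be folded into one helper. The following is a sketch of that idea, not part of the patch: `read_prefixed_stream` and `read_prefixed_property` are hypothetical names, and stripping a trailing newline is an added assumption about what callers would want.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `f` into `value`, prefixed with `prefix`.
 * Returns 0 on success, -1 if nothing could be read.
 * Hypothetical helper, not part of the patch.
 */
static int read_prefixed_stream(FILE *f, const char *prefix,
                                char *value, size_t len)
{
    char line[512];

    assert(f != NULL && value != NULL && len > 0);
    value[0] = '\0';

    if (!fgets(line, sizeof(line), f)) {
        return -1;
    }
    /* Drop a trailing newline, if the file has one. */
    line[strcspn(line, "\n")] = '\0';
    snprintf(value, len, "%s%s", prefix, line);
    return 0;
}

/* Same as above, but opens `path` first; -1 if it cannot be opened. */
static int read_prefixed_property(const char *path, const char *prefix,
                                  char *value, size_t len)
{
    FILE *f;
    int ret;

    assert(value != NULL && len > 0);
    value[0] = '\0';

    f = fopen(path, "r");
    if (!f) {
        return -1;
    }
    ret = read_prefixed_stream(f, prefix, value, len);
    fclose(f);
    return ret;
}
```

With such a helper, each getter would reduce to a single call passing its own path and the "IBM," prefix used in the patch.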
[Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call
A PAPR compliant guest calls this in the absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools (like taking a dump) for further analysis by tools like
crash.

The Linux kernel calls this only when the extended version of os-term is
implemented, to make sure that a return to the Linux kernel is guaranteed.

CC: Benjamin Herrenschmidt b...@au1.ibm.com
CC: Anton Blanchard an...@samba.org
CC: Alexander Graf ag...@suse.de
Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
v2: rebase to ppcnext
v3: Do not stop the VM, and update comments
---
 hw/ppc/spapr_rtas.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9ba1ba6..2da33c8 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -29,6 +29,8 @@
 #include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
+#include "qapi/qmp/qjson.h"
+#include "monitor/monitor.h"

 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
@@ -277,6 +279,41 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
     rtas_st(rets, 0, ret);
 }

+static void rtas_ibm_os_term(PowerPCCPU *cpu,
+                             sPAPREnvironment *spapr,
+                             uint32_t token, uint32_t nargs,
+                             target_ulong args,
+                             uint32_t nret, target_ulong rets)
+{
+    target_ulong ret = 0;
+    QObject *data;
+
+    data = qobject_from_jsonf("{ 'action': %s }", "pause");
+    monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
+    qobject_decref(data);
+
+    rtas_st(rets, 0, ret);
+}
+
+/*
+ * According to PAPR, rtas ibm,os-term, does not gaurantee a return
+ * back to the guest cpu.
+ *
+ * While an additional ibm,extended-os-term property indicates that
+ * rtas call return will always occur. Below function implements a
+ * place holder for the same.
+ */
+static void rtas_ibm_ext_os_term(PowerPCCPU *cpu,
+                                 sPAPREnvironment *spapr,
+                                 uint32_t token, uint32_t nargs,
+                                 target_ulong args,
+                                 uint32_t nret, target_ulong rets)
+{
+    target_ulong ret = RTAS_OUT_NOT_SUPPORTED;
+
+    rtas_st(rets, 0, ret);
+}
+
 static struct rtas_call {
     const char *name;
     spapr_rtas_fn fn;
@@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
     spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
                         "ibm,set-system-parameter",
                         rtas_ibm_set_system_parameter);
+    spapr_rtas_register("ibm,os-term",
+                        rtas_ibm_os_term);
+    spapr_rtas_register("ibm,extended-os-term",
+                        rtas_ibm_ext_os_term);
 }

 type_init(core_rtas_register_types)
--
1.8.3.1
Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call
On 06/27/2014 04:47 PM, Nikunj A Dadhania wrote:
> PAPR compliant guest calls this in absence of kdump. This finally
> reaches the guest and can be handled according to the policies set by
> higher level tools(like taking dump) for further analysis by tools like
> crash.
>
> [...]
>
> @@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
>      spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
>                          "ibm,set-system-parameter",
>                          rtas_ibm_set_system_parameter);
> +    spapr_rtas_register("ibm,os-term",
> +                        rtas_ibm_os_term);

This just won't compile, spapr_rtas_register() takes 3 parameters now.

Tokens for ibm,os-term and ibm,extended-os-term are already defined, just use them.

> +    spapr_rtas_register("ibm,extended-os-term",
> +                        rtas_ibm_ext_os_term);
>  }
>
>  type_init(core_rtas_register_types)

ps. please (please) do not use my ibm's email in public :)

--
Alexey Kardashevskiy
IBM OzLabs, LTC Team
e-mail: a...@au1.ibm.com
notes: Alexey Kardashevskiy/Australia/IBM
Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call
Alexey Kardashevskiy a...@au1.ibm.com writes:

> On 06/27/2014 04:47 PM, Nikunj A Dadhania wrote:
>> PAPR compliant guest calls this in absence of kdump. This finally
>> reaches the guest and can be handled according to the policies set by
>> higher level tools(like taking dump) for further analysis by tools like
>> crash.
>>
>> [...]
>>
>> @@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
>>      spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
>>                          "ibm,set-system-parameter",
>>                          rtas_ibm_set_system_parameter);
>> +    spapr_rtas_register("ibm,os-term",
>> +                        rtas_ibm_os_term);
>
> This just won't compile, spapr_rtas_register() takes 3 parameters now.

duh, i missed that update :( Resending

> Tokens for ibm,os-term and ibm,extended-os-term are already defined, just
> use them.
>
>> +    spapr_rtas_register("ibm,extended-os-term",
>> +                        rtas_ibm_ext_os_term);
>>  }
>>
>>  type_init(core_rtas_register_types)
>
> ps. please (please) do not use my ibm's email in public :)

Sure.

Regards
Nikunj
Re: [Qemu-devel] [PATCH for 2.1] qdev: correctly send DEVICE_DELETED for recursively-deleted devices
Paolo Bonzini pbonz...@redhat.com writes:

> When a device is unparented (i.e. made completely hidden from management)
> we want to send a DEVICE_DELETED event only if the device actually was
> realized. This avoids raising DEVICE_DELETED events when device_add fails.
>
> However, this does not work right for recursively-deleted devices: the
> whole tree is _first_ unrealized, _then_ unparented. Then device_unparent
> sees realized==false and fails to trigger the event.
>
> The solution is simply to move have_realized into the DeviceState struct.
> If device_add fails, we never set the new field to true and DEVICE_DELETED
> is not sent.
>
> Fixes qemu-iotests testcase 067.

Suggest to add "Broken in commit 5942a19" here, to make it clear that it's a recent regression.

> Reported-by: Markus Armbruster arm...@redhat.com
> Signed-off-by: Paolo Bonzini pbonz...@redhat.com
> ---
>  hw/core/qdev.c         | 5 +++--
>  include/hw/qdev-core.h | 1 +
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index d1eba3c..c520415 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -848,6 +848,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>      if (value && !dev->realized) {
> [...]
>          if (dev->hotplugged && local_err == NULL) {
>              device_reset(dev);
>          }
> +        dev->pending_deleted_event = false;

Unset on completion of unrealized -> realized transition.

>      } else if (!value && dev->realized) {
>          QLIST_FOREACH(bus, &dev->child_bus, sibling) {
>              object_property_set_bool(OBJECT(bus), false, "realized",
> @@ -862,6 +863,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>          if (dc->unrealize && local_err == NULL) {
>              dc->unrealize(dev, &local_err);
>          }
> +        dev->pending_deleted_event = true;

Set on completion of realized -> unrealized transition.

>      }
>
>      if (local_err != NULL) {
> @@ -972,7 +974,6 @@ static void device_unparent(Object *obj)
>  {
>      DeviceState *dev = DEVICE(obj);
>      BusState *bus;
> -    bool have_realized = dev->realized;
>
>      if (dev->realized) {
>          object_property_set_bool(obj, false, "realized", NULL);
> @@ -988,7 +989,7 @@ static void device_unparent(Object *obj)
>      }
>
>      /* Only send event if the device had been completely realized */
> -    if (have_realized) {
> +    if (dev->pending_deleted_event) {
>          gchar *path = object_get_canonical_path(OBJECT(dev));
>
>          qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);

Let's see whether I understand how this works. Please correct misunderstandings.

device_unparent() runs right before device deletion, and only then.

First thing it does is setting property "realized" to false.

Does nothing if the device has never been completely realized. dev->pending_deleted_event remains in its initial state false. DEVICE_DELETED not sent. Good.

Else, the device was completely realized at some time.

If it is currently realized, we get a transition to unrealized right now, setting dev->pending_deleted_event.

Else, the last transition must have been realized -> unrealized, setting dev->pending_deleted_event. Since it gets unset only on unrealized -> realized, it's still set.

Therefore, dev->pending_deleted_event is set if and only if the device has been completely realized.

> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 9221cfc..0799ff2 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -156,6 +156,7 @@ struct DeviceState {
>
>      const char *id;
>      bool realized;
> +    bool pending_deleted_event;
>      QemuOpts *opts;
>      int hotplugged;
>      BusState *parent_bus;

Reviewed-by: Markus Armbruster arm...@redhat.com

(Tested, too, but my r-by subsumes that here)
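The flag logic walked through in this review can be modelled in a few lines. This is a toy model only: `toy_device`, `toy_set_realized` and `toy_unparent` are invented stand-ins for DeviceState, device_set_realized() and device_unparent(), realization is assumed never to fail, and sending the event is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for DeviceState; only the two flags under discussion. */
struct toy_device {
    bool realized;
    bool pending_deleted_event;
    int events_sent; /* counts DEVICE_DELETED-like events */
};

/* Mirrors the two transitions annotated in the review. */
static void toy_set_realized(struct toy_device *dev, bool value)
{
    if (value && !dev->realized) {
        /* unrealized -> realized: nothing pending any more */
        dev->realized = true;
        dev->pending_deleted_event = false;
    } else if (!value && dev->realized) {
        /* realized -> unrealized: an event is now owed */
        dev->realized = false;
        dev->pending_deleted_event = true;
    }
}

/* Unrealize if still realized, then emit the event only if one is owed. */
static void toy_unparent(struct toy_device *dev)
{
    if (dev->realized) {
        toy_set_realized(dev, false);
    }
    assert(!dev->realized);
    if (dev->pending_deleted_event) {
        dev->events_sent++;
    }
}
```

A device that was never realized (the failed device_add case) sends nothing, while a child that was unrealized by its parent before being unparented still sends its event, which is the case the old have_realized local missed.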
Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call
Nikunj A Dadhania nik...@linux.vnet.ibm.com writes:

> PAPR compliant guest calls this in absence of kdump. This finally
> reaches the guest and can be handled according to the policies set by
> higher level tools(like taking dump) for further analysis by tools like
> crash.
>
> [...]
>
> +static void rtas_ibm_os_term(PowerPCCPU *cpu,
> +                             sPAPREnvironment *spapr,
> +                             uint32_t token, uint32_t nargs,
> +                             target_ulong args,
> +                             uint32_t nret, target_ulong rets)
> +{
> +    target_ulong ret = 0;
> +    QObject *data;
> +
> +    data = qobject_from_jsonf("{ 'action': %s }", "pause");
> +    monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
> +    qobject_decref(data);

Even the above has changed; the newer API is qapi_event_send_guest_panicked.

Regards
Nikunj
Re: [Qemu-devel] [v5][PATCH 2/5] xen, gfx passthrough: create pseudo intel isa bridge
On 2014/6/25 17:58, Chen, Tiejun wrote:
> On 2014/6/25 17:44, Michael S. Tsirkin wrote:
>> On Wed, Jun 25, 2014 at 05:28:48PM +0800, Chen, Tiejun wrote:
>>> On 2014/6/25 17:21, Michael S. Tsirkin wrote:
>>>> On Wed, Jun 25, 2014 at 05:14:30PM +0800, Chen, Tiejun wrote:
>>>>> On 2014/6/25 17:04, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jun 25, 2014 at 04:48:02PM +0800, Chen, Tiejun wrote:
>>>>>>> On 2014/6/25 16:43, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Jun 25, 2014 at 04:39:07PM +0800, Chen, Tiejun wrote:
>>>>>>>>>> In fact it's exactly what passthrough does. I wonder if more
>>>>>>>>>> bits from ./hw/i386/kvm/pci-assign.c can be reused.
>>>>>>>>>> How do you poke at the host device? sysfs?
>>>>>>>>>
>>>>>>>>> Yes, sysfs.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Tiejun
>>>>>>>>
>>>>>>>> Then you should be able to re-use large chunks of
>>>>>>>> ./hw/i386/kvm/pci-assign.c: basically everything that deals with
>>>>>>>> emulation.
>>>>>>>
>>>>>>> Do you mean those hooks to get info from the real device? Xen have
>>>>>>> its own wrapper, xen_host_pci_get_block(), so we always go there in
>>>>>>> xen scenario.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Tiejun
>>>>>>
>>>>>> Yes and that's not good. We have two pieces of code doing mostly
>>>>>> identical things slightly differently. hw/i386/kvm/pci-assign.c is a
>>>>>> bit younger so it's cleaner, but these really need to be unified.
>>>>>
>>>>> Sorry, take a look at this again,
>>>>>
>>>>> xen_host_pci_get_block(XenHostPCIDevice *d, int pos, uint8_t *buf, int len)
>>>>>   |
>>>>>   + xen_host_pci_config_read(d, pos, buf, len)
>>>>>       |
>>>>>       + pread(d->config_fd, buf, len, pos)
>>>>>
>>>>> I thinks this should be same as kvm.
>>>>>
>>>>> Thanks
>>>>> Tiejun
>>>>
>>>> get_block is trivial. I really mean the whole PT infrastructure for
>>>> - discovering host devices through sysfs
>>>> - virtualizing devices rom, bars, msi
>>>> ... the list goes on.
>>>> logic is mostly the same.
>>>
>>> Looks you mean we can unify the entire PT infrastructure between kvm and
>>> xen inside qemu. But I'm afraid its not easy to do in a short time, so
>>> maybe we can queue this as next phase.
>>>
>>> Thanks
>>> Tiejun
>>
>> I'm afraid once we merge your code, you'll lose interest :)
>
> Currently we have to push this feature into upstream as our first
> priority, so unless something is really needed to address. Of course I hope
> this point what we're talking is not such a thing :)
>
> But I can promise here I'd like to do this optimization with your guide
> next :)
>
>> At least, don't add duplicate code for ROM.
>
> Let me try this.

It's not as easy as expected.

kvm always works with the AssignedDevice structure, which is only activated under kvm_enabled(), and all properties are then set on this structure. In the xen case, the similar structure, XenHostPCIDevice, is not easily converted into an AssignedDevice. This means we would have to split assigned_dev_load_option_rom() line by line for xen and kvm, respectively.

After this attempt I really agree that we definitely need to unify the PT infrastructure between kvm and xen, since I can't understand why we originally introduced two ways of doing the same thing :(

Do you have a better idea? If not, I would prefer that we open this up completely as the next action to follow up on. But this time I'm afraid I can't get into it.

Thanks
Tiejun
Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
On 27/06/2014 08:23, Kevin Wolf wrote:
> Note that there is already an interface in block.c that takes multiple
> requests at once, bdrv_aio_multiwrite(). It is currently used by
> virtio-blk, even though not in dataplane mode. It also submits individual
> requests to the block drivers currently, so effectively it doesn't make a
> difference, just the problem occurs in the block layer instead of the
> device.
>
> We should either improve bdrv_aio_multiwrite() to submit the requests in a
> batch to the block drivers, add a bdrv_aio_multiwrite() and use it for
> dataplane as well (possibly with a flag for disabling the request merging
> if we want to keep the current behaviour for dataplane); or, if we consider
> it a bad interface, replace it altogether with the new thing even for
> normal virtio-blk.

In fact, what's the status of Fam's patches to unify request processing between dataplane and non-dataplane? They would add multiwrite support (also rerror/werror and blockstats). I was hoping that they could get into 2.1.

Paolo

> If this makes a difference for dataplane, it probably makes a difference
> for all block devices.
[Qemu-devel] [PATCH v4] ppc: spapr-rtas - implement os-term rtas call
A PAPR compliant guest calls this in the absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools (like taking a dump) for further analysis by tools like
crash.

The Linux kernel calls this only when the extended version of os-term is
implemented, to make sure that a return to the Linux kernel is guaranteed.

CC: Benjamin Herrenschmidt b...@au1.ibm.com
CC: Anton Blanchard an...@samba.org
CC: Alexander Graf ag...@suse.de
Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
v2: rebase to ppcnext
v3: Do not stop the VM, and update comments
v4: update spapr_register_rtas and qapi_event changes
---
 hw/ppc/spapr_rtas.c | 36
 1 file changed, 36 insertions(+)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9ba1ba6..b11de41 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -277,6 +277,38 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
     rtas_st(rets, 0, ret);
 }

+static void rtas_ibm_os_term(PowerPCCPU *cpu,
+                             sPAPREnvironment *spapr,
+                             uint32_t token, uint32_t nargs,
+                             target_ulong args,
+                             uint32_t nret, target_ulong rets)
+{
+    target_ulong ret = 0;
+
+    qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, &error_abort);
+
+    rtas_st(rets, 0, ret);
+}
+
+/*
+ * According to PAPR, rtas ibm,os-term, does not gaurantee a return
+ * back to the guest cpu.
+ *
+ * While an additional ibm,extended-os-term property indicates that
+ * rtas call return will always occur. Below function implements a
+ * place holder for the same.
+ */
+static void rtas_ibm_ext_os_term(PowerPCCPU *cpu,
+                                 sPAPREnvironment *spapr,
+                                 uint32_t token, uint32_t nargs,
+                                 target_ulong args,
+                                 uint32_t nret, target_ulong rets)
+{
+    target_ulong ret = RTAS_OUT_NOT_SUPPORTED;
+
+    rtas_st(rets, 0, ret);
+}
+
 static struct rtas_call {
     const char *name;
     spapr_rtas_fn fn;
@@ -404,6 +436,10 @@ static void core_rtas_register_types(void)
     spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
                         "ibm,set-system-parameter",
                         rtas_ibm_set_system_parameter);
+    spapr_rtas_register(RTAS_IBM_OS_TERM, "ibm,os-term",
+                        rtas_ibm_os_term);
+    spapr_rtas_register(RTAS_IBM_EXTENDED_OS_TERM, "ibm,extended-os-term",
+                        rtas_ibm_ext_os_term);
 }

 type_init(core_rtas_register_types)
--
1.8.3.1
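The three-argument registration used in v4 (token, name, handler) suggests a token-indexed table of calls. The following is a simplified sketch of that pattern, not the code in spapr_rtas.c: all `toy_`/`TOY_` names and every numeric value are invented for illustration, and the handler signature is reduced to a single argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real token constants live in QEMU headers. */
#define TOY_TOKEN_BASE    0x2000
#define TOY_TOKEN_MAX     (TOY_TOKEN_BASE + 8)
#define TOY_OS_TERM       (TOY_TOKEN_BASE + 1)
#define TOY_EXT_OS_TERM   (TOY_TOKEN_BASE + 2)
#define TOY_NOT_SUPPORTED (-3)
#define TOY_UNKNOWN_CALL  (-1)

typedef int (*toy_rtas_fn)(uint32_t nargs);

struct toy_rtas_call {
    const char *name;
    toy_rtas_fn fn;
};

/* One slot per token; the token is the index, the name is for lookup by string. */
static struct toy_rtas_call toy_table[TOY_TOKEN_MAX - TOY_TOKEN_BASE];

static void toy_rtas_register(int token, const char *name, toy_rtas_fn fn)
{
    assert(token >= TOY_TOKEN_BASE && token < TOY_TOKEN_MAX);
    token -= TOY_TOKEN_BASE;
    assert(toy_table[token].name == NULL); /* no double registration */
    toy_table[token].name = name;
    toy_table[token].fn = fn;
}

/* Dispatch by token; unknown or unregistered tokens are rejected. */
static int toy_rtas_dispatch(int token, uint32_t nargs)
{
    if (token < TOY_TOKEN_BASE || token >= TOY_TOKEN_MAX ||
        toy_table[token - TOY_TOKEN_BASE].fn == NULL) {
        return TOY_UNKNOWN_CALL;
    }
    return toy_table[token - TOY_TOKEN_BASE].fn(nargs);
}

/* Handlers shaped like the two in the patch: success and a placeholder. */
static int toy_os_term(uint32_t nargs)
{
    (void)nargs;
    return 0;
}

static int toy_ext_os_term(uint32_t nargs)
{
    (void)nargs;
    return TOY_NOT_SUPPORTED;
}
```

Registering by fixed token rather than by name alone keeps the guest-visible token values stable regardless of registration order, which is presumably why the third parameter was added.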
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27/06/2014 08:11, Peter Crosthwaite wrote:
> Hi Pavel,
>
> On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
>> Hello!
>>
>> We want to publish set of patches related to the reverse execution and
>> deterministic replay of qemu.
>> [...]
>> Can anyone review our patches?
>
> Fred Konrad is doing a series on reverse exe at the moment. CC. Is the an
> independent implementation of the same thing or are you building on it?

Hi,

Yes, it seems we are doing the same thing, only we use icount as an instruction counter and you created a new instruction counter? Our approach has the advantage of working everywhere icount works, but the disadvantage of requiring icount for reverse execution.

I think we can use both approaches, so that reverse execution will work on other architectures once an instruction counter is added to them.

I'm sure your patches will add to our solution, and I can review them when you send them. It would help if you rebased them on the patch set that is currently on the list, "[RFC PATCH v5 00/13] Reverse execution", which I sent two days ago.

Thanks,
Fred

> I suggest posting a full RFC, this looks to me just like a cover letter but
> without a series. Note that we are going into hard freeze imminently so
> there will be some delay for merge.
>
> Regards,
> Peter
>
>> Pavel Dovgaluk
Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support
On 26/06/14 16:42, Alexander Graf wrote:
> On 26.06.14 16:29, Jens Freimann wrote:
> > Conny, Alex, Christian,
> >
> > here are some fixes for the s390-ccw bios. It's a mixture of
> > additional features (DASD IPL support for different formats) and
> > cleanups.
>
> From a quick glimpse it looks quite clean and straightforward, but
> I'd like to make sure we get rid completely of the static sector size
> assumption.

Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you
then?

> Also, are we guaranteed that virtio always uses 512 byte block size?
> Or was that just an internal API thing?

The virtio-blk API always talks in 512 byte sectors, no matter the
block size.

Overall this is a nice improvement of the boot code - if possible I
would like to see that in 2.1.

Conny, can you carry that in your tree (with
s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)?

Acked-by: Christian Borntraeger <borntrae...@de.ibm.com>
for the series.

Christian
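The point that virtio-blk addresses the disk in 512-byte sectors regardless of the device's logical block size can be illustrated with a small sketch. The helper names below (`block_to_virtio_sector`, `virtio_sector_count`) are made up for this illustration and are not taken from the s390-ccw bios; the only assumption carried over from the thread is the fixed 512-byte unit.

```c
#include <stdint.h>

/* virtio-blk requests address the disk in 512-byte units. */
#define VIRTIO_SECTOR_SIZE 512u

/*
 * Convert a block number expressed in the device's logical block size
 * (for example 4096 bytes) into the 512-byte sector number a virtio-blk
 * request would carry. Hypothetical helper, for illustration only.
 */
static uint64_t block_to_virtio_sector(uint64_t block, uint32_t block_size)
{
    return block * (block_size / VIRTIO_SECTOR_SIZE);
}

/* Number of 512-byte sectors needed to cover `bytes` bytes (rounded up). */
static uint64_t virtio_sector_count(uint64_t bytes)
{
    return (bytes + VIRTIO_SECTOR_SIZE - 1) / VIRTIO_SECTOR_SIZE;
}
```

With a 4096-byte block size, logical block 3 starts at virtio sector 24, which is why a boot loader that hard-codes one sector size for both purposes can read the wrong data.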
Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
On Fri, Jun 27, 2014 at 12:59 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 27/06/2014 03:15, Ming Lei wrote:
> > On Thu, Jun 26, 2014 at 11:57 PM, Paolo Bonzini
> > <pbonz...@redhat.com> wrote:
> > > We can implement (advisory) calls like bdrv_plug/bdrv_unplug in
> > > order to restore the previous levels of performance.
> >
> > Yes, that is also what I am thinking, or interfaces like
> > bdrv_queue_io() and bdrv_submit_io(), which may match with aio
> > interfaces.
>
> Would you like to try preparing a patch?

OK, let me try to do that.

> > > Note that some fallout of the conversion was expected. Dataplane
> > > told us experimentally what level of performance could be reached,
> > > but was a dead end in terms of functionality. Now Stefan added a
> > > whole lot of functionality to dataplane (accounting, throttling,
> > > file formats and protocols, thread-pool based I/O, etc.) and we
> > > need to bring back any performance we lost in the process.
> >
> > These features are very good, but it looks like the conversion is a
> > bit early, :-(
>
> Dataplane is still (and has always been) experimental. For now, it's
> a playground to get rid of the big QEMU lock in hot paths. As such,
> performance going up and down is expected.
>
> The good thing is that every performance improvement we do now will
> not be restricted to dataplane, it can be applied just as well to any
> other device.

Yes, virtio-scsi may benefit from the improvement too, and other block
devices too.

Thanks,
--
Ming Lei
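The advisory plug/unplug idea discussed in this thread can be sketched independently of QEMU: while a queue is plugged, requests are only collected, and unplugging hands the whole batch to the backend in one call. This is a minimal sketch under stated assumptions; the names (`IoQueue`, `ioq_plug`, `ioq_unplug`, `ioq_enqueue`) are hypothetical and are not the interface that was proposed on the list.

```c
#include <stddef.h>

#define IOQ_MAX 32

typedef struct IoQueue {
    int plugged;            /* nesting depth of plug calls */
    size_t pending;         /* requests collected but not yet submitted */
    int reqs[IOQ_MAX];      /* stand-in for real request descriptors */
    unsigned submit_calls;  /* how many times the backend was entered */
    size_t submitted;       /* total requests handed to the backend */
} IoQueue;

/* Hand all collected requests to the backend in a single call. */
static void ioq_flush(IoQueue *q)
{
    if (q->pending == 0) {
        return;
    }
    q->submit_calls++;      /* one batched submission for the whole queue */
    q->submitted += q->pending;
    q->pending = 0;
}

/* Start collecting requests instead of submitting them one by one. */
static void ioq_plug(IoQueue *q)
{
    q->plugged++;
}

/* Stop collecting; the outermost unplug submits the batch. */
static void ioq_unplug(IoQueue *q)
{
    if (q->plugged > 0 && --q->plugged == 0) {
        ioq_flush(q);
    }
}

/* Queue a request; submit at once unless plugged, or when the queue is full. */
static void ioq_enqueue(IoQueue *q, int req)
{
    q->reqs[q->pending++] = req;
    if (q->plugged == 0 || q->pending == IOQ_MAX) {
        ioq_flush(q);
    }
}
```

Because the calls are advisory, a caller that never plugs keeps the old one-submission-per-request behaviour, while a device that plugs around a burst of requests pays for a single submission.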
Re: [Qemu-devel] [RFC PATCH v5 00/13] Reverse execution.
On 26/06/2014 17:52, Sebastian Tanase wrote:
> Hello,
>
> I'll be sending a new version (V3) of the patches on Monday. The
> patches add QemuOpts handling to the -icount option. If you want, I
> can send only the part of the patch that adds QemuOpts support.
>
> Best regards,
>
> Sebastian Tanase

Hi,

Yes, it would be nice if you could split the patch: one patch making
icount a QemuOpts and a second adding the align option, so that I can
pick the first part.

I can do that for you if you want.

Thanks,
Fred

> ----- Original Message -----
> From: Paolo Bonzini <pbonz...@redhat.com>
> To: Frederic Konrad <fred.kon...@greensocs.com>, qemu-devel@nongnu.org
> Cc: peter maydell <peter.mayd...@linaro.org>, quint...@redhat.com,
>     mark burton <mark.bur...@greensocs.com>, dgilb...@redhat.com,
>     amit shah <amit.s...@redhat.com>, vilan...@ac.upc.edu,
>     sebastian tanase <sebastian.tan...@openwide.fr>,
>     camille begue <camille.be...@openwide.fr>
> Sent: Thursday, 26 June 2014 17:32:57
> Subject: Re: [Qemu-devel] [RFC PATCH v5 00/13] Reverse execution.
>
> On 26/06/2014 17:11, Frederic Konrad wrote:
> > Are you talking of this patch on the list:
> > http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03039.html ?
> >
> > It seems to include the align option too. Is it possible to split
> > it up?
>
> Sure, you can split it up, and when the original authors rebase
> they will be able to add align on top.
>
> Paolo
Re: [Qemu-devel] [PATCH] target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs
On 06/26/2014 08:16 PM, Peter Maydell wrote:
> Implement kvm_arm_vcpu_init() as a simple call to arm_arm_vcpu_init()
> (which uses the KVM_ARM_VCPU_INIT vcpu ioctl to tell the kernel to
> re-initialize the vCPU), rather than via the complicated code which
> saves a copy of the register state on first init and then writes it
> back to the kernel. This is much simpler and brings the 32-bit KVM
> code into line with the 64-bit code.
>
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> ---
> The kernel has always supported being able to call VCPU_INIT multiple
> times for this reset effect; I just didn't realize it was possible
> when I wrote the original reset code. When kvm64.c grows support for
> system registers we can probably coalesce the two kvm_arm_reset_cpu()
> functions into one.
>
> I also have a vague recollection that somebody reported that we had
> an actual bug in this area that this patch would fix; however I can't
> now find that in the mailing list archives :-(

I did:
http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg03131.html

> Testing appreciated: my ARMv7 box is being a bit flaky at the moment;
> I don't *think* the occasional weird stuff I see is the effect of
> this patch but it's hard to be certain.

I will test your patch in the following days.

Diana
[Qemu-devel] [PATCH v6 0/5] Support Archipelago as a QEMU block backend
v6:
- Split v5 1/4 patch into two different patches. The first one
  implements QMP structured options and the second one implements
  bdrv_parse_filename().

v5:
- Remove useless qemu_aio_count variable from BDRVArchipelagoState
  struct.
- Clean up xseg signal descriptor, call xseg_quit_local_signal() when
  closing block device.
- Fix ds and volname leaks.
- Make xseg request handler thread joinable and wait until it exits
  before destroying condition variables and mutexes. Thanks to Stefan
  Hajnoczi for pointing this out.
- Remove useless error_propagate() call.
- Use memcpy instead of strncpy.
- Remove check after trying to allocate memory with g_malloc().
- Remove pipe code and complete AIO by introducing QEMU bottom-half.
- Add Archipelago shared memory segment name in options list and QMP.
- Remove functions archipelago_aio_read()/_write() and introduce a new
  and simpler function, __archipelago_submit_request(). Refactor
  archipelago_aio_segmented_rw() function.
- Enable Archipelago support in qemu-iotests.

v4:
- Move Archipelago QMP support from qapi-schema.json file to
  qapi/block-core.json. Fix various typographic errors, thanks to
  Kevin Wolf and Eric Blake.
- Use new .create_opts format, define new QemuOptsList structure and
  refactor qemu_archipelago_create function.

v3:
- Break down initial patch from one to three. First patch implements
  Archipelago QEMU block backend with read/write functionality. Second
  patch implements .bdrv_create() and adds support for creating
  Archipelago images. Third patch adds QMP support.
- Remove global variable g_xseg_init, make xseg_initialize(),
  xseg_join() and xseg_leave() reentrant and thread-safe.
- Introduce new enum BlockdevOptionsArchipelago for the QMP support.

v2:
- Implement .bdrv_parse_filename() function to convert the shortcut
  version with a single string to the individual options.
- Remove global variables and move relevant fields to ArchipelagoAIOCB
  struct.
- Remove ArchipelagoConf struct and use the relevant fields as
  individual arguments.
- Remove ArchipelagoCB struct and use ArchipelagoAIOCB instead.
- Remove ArchipelagoThread struct and move relevant fields to
  ArchipelagoAIOCB instead. Now one I/O thread is spawned per device
  to handle all async I/O requests.
- Remove double data copy, use qemu_iovec_from_buf() and copy data
  directly to the destination buffer.
- Remove archipelago_aio_bh_cb() function, a full request is completed
  in qemu_archipelago_complete_aio() instead.
- Resolve proposed changes from Kevin Wolf and miscellaneous style
  issues.

Chrysostomos Nanakos (5):
  block: Support Archipelago as a QEMU block backend
  block/archipelago: Implement bdrv_parse_filename()
  block/archipelago: Add support for creating images
  QMP: Add support for Archipelago
  qemu-iotests: add support for Archipelago protocol

 MAINTAINERS                  |    6 +
 block/Makefile.objs          |    2 +
 block/archipelago.c          | 1103 ++
 configure                    |   40 ++
 qapi/block-core.json         |   39 +-
 tests/qemu-iotests/common    |    6 +
 tests/qemu-iotests/common.rc |    9 +-
 7 files changed, 1201 insertions(+), 4 deletions(-)
 create mode 100644 block/archipelago.c

-- 
1.7.10.4
[Qemu-devel] [PATCH v6 1/5] block: Support Archipelago as a QEMU block backend
VM Image on Archipelago volume is specified like this: file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[, file.vport=vlmcd_port][,file.segment=segment_name]] 'archipelago' is the protocol. 'mport' is the port number on which mapperd is listening. This is optional and if not specified, QEMU will make Archipelago to use the default port. 'vport' is the port number on which vlmcd is listening. This is optional and if not specified, QEMU will make Archipelago to use the default port. 'segment' is the name of the shared memory segment Archipelago stack is using. This is optional and if not specified, QEMU will make Archipelago to use the default value, 'archipelago'. Examples: file.driver=archipelago,file.volume=my_vm_volume file.driver=archipelago,file.volume=my_vm_volume,file.mport=123 file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, file.vport=1234 file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, file.vport=1234,file.segment=my_segment Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr --- MAINTAINERS |6 + block/Makefile.objs |2 + block/archipelago.c | 819 +++ configure | 40 +++ 4 files changed, 867 insertions(+) create mode 100644 block/archipelago.c diff --git a/MAINTAINERS b/MAINTAINERS index 9b93edd..58ef1e3 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -999,3 +999,9 @@ SSH M: Richard W.M. 
Jones rjo...@redhat.com S: Supported F: block/ssh.c + +ARCHIPELAGO +M: Chrysostomos Nanakos cnana...@grnet.gr +M: Chrysostomos Nanakos ch...@include.gr +S: Maintained +F: block/archipelago.c diff --git a/block/Makefile.objs b/block/Makefile.objs index fd88c03..858d2b3 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -17,6 +17,7 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o block-obj-$(CONFIG_CURL) += curl.o block-obj-$(CONFIG_RBD) += rbd.o block-obj-$(CONFIG_GLUSTERFS) += gluster.o +block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o block-obj-$(CONFIG_LIBSSH2) += ssh.o endif @@ -35,5 +36,6 @@ gluster.o-cflags := $(GLUSTERFS_CFLAGS) gluster.o-libs := $(GLUSTERFS_LIBS) ssh.o-cflags := $(LIBSSH2_CFLAGS) ssh.o-libs := $(LIBSSH2_LIBS) +archipelago.o-libs := $(ARCHIPELAGO_LIBS) qcow.o-libs:= -lz linux-aio.o-libs := -laio diff --git a/block/archipelago.c b/block/archipelago.c new file mode 100644 index 000..c56826a --- /dev/null +++ b/block/archipelago.c @@ -0,0 +1,819 @@ +/* + * QEMU Block driver for Archipelago + * + * Copyright 2014 GRNET S.A. All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY GRNET S.A. ``AS IS'' AND ANY EXPRESS + * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL GRNET S.A OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED + * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN + * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * The views and conclusions contained in the software and + * documentation are those of the authors and should not be + * interpreted as representing official policies, either expressed + * or implied, of GRNET S.A. + */ + +/* +* VM Image on Archipelago volume is specified like this: +* +* file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[, +* file.vport=vlmcd_port][,file.segment=segment_name]] +* +* 'archipelago' is the protocol. +* +* 'mport' is the port number on which mapperd is listening. This is optional +* and if not specified, QEMU will make Archipelago to use the default port. +* +* 'vport' is the port number on which vlmcd is listening. This is optional +* and if not specified, QEMU will make Archipelago to use the default port. +* +* 'segment' is the name of the shared memory segment Archipelago stack is using. +* This is optional and if not specified, QEMU will make Archipelago to use the +* default value, 'archipelago'. +* +* Examples: +* +* file.driver=archipelago,file.volume=my_vm_volume +*
[Qemu-devel] [PATCH v6 4/5] QMP: Add support for Archipelago
Introduce new enum BlockdevOptionsArchipelago.

@volume:  #Name of the Archipelago volume image

@mport:   #'mport' is the port number on which mapperd is
          listening. This is optional and if not specified,
          QEMU will make Archipelago use the default port.

@vport:   #'vport' is the port number on which vlmcd is
          listening. This is optional and if not specified,
          QEMU will make Archipelago use the default port.

@segment: #optional The name of the shared memory segment
          Archipelago stack is using. This is optional
          and if not specified, QEMU will make Archipelago
          use the default value, 'archipelago'.

Signed-off-by: Chrysostomos Nanakos <cnana...@grnet.gr>
---
 qapi/block-core.json | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index af6b436..55eb152 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -190,8 +190,8 @@
 # @ro: true if the backing device was open read-only
 #
 # @drv: the name of the block format used to open the backing device. As of
-#       0.14.0 this can be: 'blkdebug', 'bochs', 'cloop', 'cow', 'dmg',
-#       'file', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
+#       0.14.0 this can be: 'archipelago', 'blkdebug', 'bochs', 'cloop', 'cow',
+#       'dmg', 'file', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
 #       'host_floppy', 'http', 'https', 'nbd', 'parallels', 'qcow',
 #       'qcow2', 'raw', 'tftp', 'vdi', 'vmdk', 'vpc', 'vvfat'
 #
@@ -1077,7 +1077,7 @@
 # Since: 2.0
 ##
 { 'enum': 'BlockdevDriver',
-  'data': [ 'file', 'host_device', 'host_cdrom', 'host_floppy',
+  'data': [ 'archipelago', 'file', 'host_device', 'host_cdrom', 'host_floppy',
             'http', 'https', 'ftp', 'ftps', 'tftp', 'vvfat', 'blkdebug',
             'blkverify', 'bochs', 'cloop', 'cow', 'dmg', 'parallels', 'qcow',
             'qcow2', 'qed', 'raw', 'vdi', 'vhdx', 'vmdk', 'vpc', 'quorum' ] }
@@ -1207,6 +1207,38 @@
             '*pass-discard-snapshot': 'bool',
             '*pass-discard-other': 'bool' } }
+
+##
+# @BlockdevOptionsArchipelago
+#
+# Driver specific block device options for Archipelago.
+#
+# @volume:              Name of the Archipelago volume image
+#
+#
+# @mport:               #optional The port number on which mapperd is
+#                       listening. This is optional
+#                       and if not specified, QEMU will make Archipelago
+#                       use the default port.
+#
+# @vport:               #optional The port number on which vlmcd is
+#                       listening. This is optional
+#                       and if not specified, QEMU will make Archipelago
+#                       use the default port.
+#
+# @segment:             #optional The name of the shared memory segment
+#                       Archipelago stack is using. This is optional
+#                       and if not specified, QEMU will make Archipelago
+#                       use the default value, 'archipelago'.
+# Since: 2.1
+##
+{ 'type': 'BlockdevOptionsArchipelago',
+  'data': { 'volume': 'str',
+            '*mport': 'int',
+            '*vport': 'int',
+            '*segment': 'str' } }
+
+
 ##
 # @BlkdebugEvent
 #
@@ -1347,6 +1379,7 @@
   'base': 'BlockdevOptionsBase',
   'discriminator': 'driver',
   'data': {
+      'archipelago':'BlockdevOptionsArchipelago',
       'file':       'BlockdevOptionsFile',
       'host_device':'BlockdevOptionsFile',
       'host_cdrom': 'BlockdevOptionsFile',
-- 
1.7.10.4
[Qemu-devel] [PATCH v6 2/5] block/archipelago: Implement bdrv_parse_filename()
VM Image on Archipelago volume can also be specified like this: file=archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port][: segment=segment_name]] Examples: file=archipelago:my_vm_volume file=archipelago:my_vm_volume/mport=123 file=archipelago:my_vm_volume/mport=123:vport=1234 file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr --- block/archipelago.c | 139 ++- 1 file changed, 137 insertions(+), 2 deletions(-) diff --git a/block/archipelago.c b/block/archipelago.c index c56826a..3549454 100644 --- a/block/archipelago.c +++ b/block/archipelago.c @@ -40,6 +40,11 @@ * file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[, * file.vport=vlmcd_port][,file.segment=segment_name]] * +* or +* +* file=archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port][: +* segment=segment_name]] +* * 'archipelago' is the protocol. * * 'mport' is the port number on which mapperd is listening. This is optional @@ -57,11 +62,20 @@ * file.driver=archipelago,file.volume=my_vm_volume * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123 * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, -* file.vport=1234 +* file.vport=1234 * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123, -* file.vport=1234,file.segment=my_segment +* file.vport=1234,file.segment=my_segment +* +* or +* +* file=archipelago:my_vm_volume +* file=archipelago:my_vm_volume/mport=123 +* file=archipelago:my_vm_volume/mport=123:vport=1234 +* file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment +* */ +#include qemu-common.h #include block/block_int.h #include qemu/error-report.h #include qemu/thread.h @@ -333,6 +347,126 @@ static void qemu_archipelago_complete_aio(void *opaque) g_free(reqdata); } +static void xseg_find_port(char *pstr, const char *needle, xport *aport) +{ +const char *a; +char *endptr = NULL; +unsigned long port; +if (strstart(pstr, needle, a)) { +if 
(strlen(a) 0) { +port = strtoul(a, endptr, 10); +if (strlen(endptr)) { +*aport = -2; +return; +} +*aport = (xport) port; +} +} +} + +static void xseg_find_segment(char *pstr, const char *needle, + char **segment_name) +{ +const char *a; +if (strstart(pstr, needle, a)) { +if (strlen(a) 0) { +*segment_name = g_strdup(a); +} +} +} + +static void parse_filename_opts(const char *filename, Error **errp, +char **volume, char **segment_name, +xport *mport, xport *vport) +{ +const char *start; +char *tokens[4], *ds; +int idx; +xport lmport = NoPort, lvport = NoPort; + +strstart(filename, archipelago:, start); + +ds = g_strdup(start); +tokens[0] = strtok(ds, /); +tokens[1] = strtok(NULL, :); +tokens[2] = strtok(NULL, :); +tokens[3] = strtok(NULL, \0); + +if (!strlen(tokens[0])) { +error_setg(errp, volume name must be specified first); +g_free(ds); +return; +} + +for (idx = 1; idx 4; idx++) { +if (tokens[idx] != NULL) { +if (strstart(tokens[idx], mport=, NULL)) { +xseg_find_port(tokens[idx], mport=, lmport); +} +if (strstart(tokens[idx], vport=, NULL)) { +xseg_find_port(tokens[idx], vport=, lvport); +} +if (strstart(tokens[idx], segment=, NULL)) { +xseg_find_segment(tokens[idx], segment=, segment_name); +} +} +} + +if ((lmport == -2) || (lvport == -2)) { +error_setg(errp, mport and/or vport must be set); +g_free(ds); +return; +} +*volume = g_strdup(tokens[0]); +*mport = lmport; +*vport = lvport; +g_free(ds); +} + +static void archipelago_parse_filename(const char *filename, QDict *options, + Error **errp) +{ +const char *start; +char *volume = NULL, *segment_name = NULL; +xport mport = NoPort, vport = NoPort; + +if (qdict_haskey(options, ARCHIPELAGO_OPT_VOLUME) +|| qdict_haskey(options, ARCHIPELAGO_OPT_SEGMENT) +|| qdict_haskey(options, ARCHIPELAGO_OPT_MPORT) +|| qdict_haskey(options, ARCHIPELAGO_OPT_VPORT)) { +error_setg(errp, volume/mport/vport/segment and a file name may not be + specified at the same time); +return; +} + +if (!strstart(filename, archipelago:, start)) { 
+error_setg(errp, File name must start with 'archipelago:'); +return; +} + +if (!strlen(start) || strstart(start, /, NULL)) { +error_setg(errp, volume name must be specified); +return; +} + +parse_filename_opts(filename,
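The short-form syntax that this patch introduces, `archipelago:volumename[/mport=N[:vport=M][:segment=name]]`, can be sketched as a small standalone parser. This is an illustration under stated assumptions and not the patch's code: `ArchipelagoOpts`, `parse_port` and `parse_archipelago_filename` are hypothetical names, fixed-size buffers replace the glib allocations, and unlike the patch this sketch rejects unknown options instead of ignoring them.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result structure; not the driver's own state. */
typedef struct ArchipelagoOpts {
    char volume[64];
    char segment[64];   /* empty string: use the default segment */
    long mport;         /* -1: use the default port */
    long vport;         /* -1: use the default port */
} ArchipelagoOpts;

/* Parse a decimal port; returns 0 on success, -1 on malformed input. */
static int parse_port(const char *s, long *out)
{
    char *end;
    unsigned long v;

    if (!isdigit((unsigned char)*s)) {
        return -1;
    }
    v = strtoul(s, &end, 10);
    if (*end != '\0') {
        return -1;      /* trailing garbage such as "12x" */
    }
    *out = (long)v;
    return 0;
}

/* Returns 0 on success, -1 if the file name is not well-formed. */
static int parse_archipelago_filename(const char *filename, ArchipelagoOpts *o)
{
    static const char prefix[] = "archipelago:";
    char buf[256];
    char *opts, *tok;
    size_t len;

    if (strncmp(filename, prefix, sizeof(prefix) - 1) != 0) {
        return -1;
    }
    filename += sizeof(prefix) - 1;
    len = strlen(filename);
    if (len == 0 || len >= sizeof(buf)) {
        return -1;
    }
    memcpy(buf, filename, len + 1);

    o->segment[0] = '\0';
    o->mport = -1;
    o->vport = -1;

    /* Split "volume/opt:opt:opt" at the first '/'. */
    opts = strchr(buf, '/');
    if (opts) {
        *opts++ = '\0';
    }
    if (buf[0] == '\0' || strlen(buf) >= sizeof(o->volume)) {
        return -1;      /* the volume name must come first */
    }
    strcpy(o->volume, buf);

    /* Walk the ':'-separated key=value options, if any. */
    while (opts && *opts) {
        tok = opts;
        opts = strchr(opts, ':');
        if (opts) {
            *opts++ = '\0';
        }
        if (strncmp(tok, "mport=", 6) == 0) {
            if (parse_port(tok + 6, &o->mport) < 0) {
                return -1;
            }
        } else if (strncmp(tok, "vport=", 6) == 0) {
            if (parse_port(tok + 6, &o->vport) < 0) {
                return -1;
            }
        } else if (strncmp(tok, "segment=", 8) == 0) {
            if (tok[8] == '\0' || strlen(tok + 8) >= sizeof(o->segment)) {
                return -1;
            }
            strcpy(o->segment, tok + 8);
        } else {
            return -1;  /* unknown option (a choice made for this sketch) */
        }
    }
    return 0;
}
```

Leaving a port at -1 mirrors the documented behaviour that an unspecified `mport` or `vport` falls back to Archipelago's default.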
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27 June 2014 06:18, Pavel Dovgaluk <pavel.dovga...@ispras.ru> wrote:
> Our implementation is completely tested for qemu 1.5 and is in beta
> state for 2.0.50.

Note that you should post patches against current QEMU master;
patches against old releases like 1.5 are not something we could
use.

thanks
-- PMM
[Qemu-devel] [PATCH v6 3/5] block/archipelago: Add support for creating images
qemu-img archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port] [:segment=segment_name]] [size] Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr --- block/archipelago.c | 149 +++ 1 file changed, 149 insertions(+) diff --git a/block/archipelago.c b/block/archipelago.c index 3549454..3d5aff1 100644 --- a/block/archipelago.c +++ b/block/archipelago.c @@ -613,6 +613,140 @@ err_exit: xseg_leave(s-xseg); } +static int qemu_archipelago_create_volume(Error **errp, const char *volname, + char *segment_name, + uint64_t size, xport mportno, + xport vportno) +{ +int ret, targetlen; +struct xseg *xseg = NULL; +struct xseg_request *req; +struct xseg_request_clone *xclone; +struct xseg_port *port; +xport srcport = NoPort, sport = NoPort; +char *target; + +/* Try default values if none has been set */ +if (mportno == (xport) -1) { +mportno = 1001; +} + +if (vportno == (xport) -1) { +vportno = 501; +} + +if (segment_name == NULL) { +segment_name = g_strdup(archipelago); +} + +if (xseg_initialize()) { +error_setg(errp, Cannot initialize XSEG); +return -1; +} + +xseg = xseg_join((char *)posix, segment_name, + (char *)posixfd, NULL); + +if (!xseg) { +error_setg(errp, Cannot join XSEG shared memory segment); +return -1; +} + +port = xseg_bind_dynport(xseg); +srcport = port-portno; +init_local_signal(xseg, sport, srcport); + +req = xseg_get_request(xseg, srcport, mportno, X_ALLOC); +if (!req) { +error_setg(errp, Cannot get XSEG request); +return -1; +} + +targetlen = strlen(volname); +ret = xseg_prep_request(xseg, req, targetlen, +sizeof(struct xseg_request_clone)); +if (ret 0) { +error_setg(errp, Cannot prepare XSEG request); +goto err_exit; +} + +target = xseg_get_target(xseg, req); +if (!target) { +error_setg(errp, Cannot get XSEG target.\n); +goto err_exit; +} +memcpy(target, volname, targetlen); +xclone = (struct xseg_request_clone *) xseg_get_data(xseg, req); +memset(xclone-target, 0 , XSEG_MAX_TARGETLEN); +xclone-targetlen = 0; +xclone-size = size; +req-offset = 0; 
+req-size = req-datalen; +req-op = X_CLONE; + +xport p = xseg_submit(xseg, req, srcport, X_ALLOC); +if (p == NoPort) { +error_setg(errp, Could not submit XSEG request); +goto err_exit; +} +xseg_signal(xseg, p); + +ret = wait_reply(xseg, srcport, port, req); +if (ret 0) { +error_setg(errp, wait_reply() error.); +} + +xseg_put_request(xseg, req, srcport); +xseg_quit_local_signal(xseg, srcport); +xseg_leave_dynport(xseg, port); +xseg_leave(xseg); +return ret; + +err_exit: +xseg_put_request(xseg, req, srcport); +xseg_quit_local_signal(xseg, srcport); +xseg_leave_dynport(xseg, port); +xseg_leave(xseg); +return -1; +} + +static int qemu_archipelago_create(const char *filename, + QemuOpts *options, + Error **errp) +{ +int ret = 0; +uint64_t total_size = 0; +char *volname = NULL, *segment_name = NULL; +const char *start; +xport mport = NoPort, vport = NoPort; + +if (!strstart(filename, archipelago:, start)) { +error_setg(errp, File name must start with 'archipelago:'); +return -1; +} + +if (!strlen(start) || strstart(start, /, NULL)) { +error_setg(errp, volume name must be specified); +return -1; +} + +parse_filename_opts(filename, errp, volname, segment_name, mport, vport); +total_size = qemu_opt_get_size_del(options, BLOCK_OPT_SIZE, 0); + +/* Create an Archipelago volume */ +ret = qemu_archipelago_create_volume(errp, volname, segment_name, + total_size, mport, + vport); + +if (volname) { +g_free(volname); +} +if (segment_name) { +g_free(segment_name); +} +return ret; +} + static void qemu_archipelago_aio_cancel(BlockDriverAIOCB *blockacb) { ArchipelagoAIOCB *aio_cb = (ArchipelagoAIOCB *) blockacb; @@ -925,6 +1059,19 @@ static int64_t qemu_archipelago_getlength(BlockDriverState *bs) return ret; } +static QemuOptsList qemu_archipelago_create_opts = { +.name = archipelago-create-opts, +.head = QTAILQ_HEAD_INITIALIZER(qemu_archipelago_create_opts.head), +.desc = { +{ +.name = BLOCK_OPT_SIZE, +.type = QEMU_OPT_SIZE, +.help = Virtual disk size +}, +{ /* end of list */ } +} +}; 
+ static BlockDriverAIOCB
[Qemu-devel] [PATCH v6 5/5] qemu-iotests: add support for Archipelago protocol
Signed-off-by: Chrysostomos Nanakos <cnana...@grnet.gr>
---
 tests/qemu-iotests/common    |    6 ++++++
 tests/qemu-iotests/common.rc |    9 ++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 0aaf84d..a0e35c4 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -153,6 +153,7 @@ check options
     -nbd                test nbd
     -ssh                test ssh
     -nfs                test nfs
+    -archipelago        test archipelago
     -xdiff              graphical mode diff
     -nocache            use O_DIRECT on backing file
     -misalign           misalign memory allocations
@@ -264,6 +265,11 @@ testlist options
             xpand=false
             ;;
 
+        -archipelago)
+            IMGPROTO=archipelago
+            xpand=false
+            ;;
+
         -nocache)
             CACHEMODE=none
             CACHEMODE_IS_DEFAULT=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 195c564..8ef1a52 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -64,6 +64,8 @@ elif [ $IMGPROTO = ssh ]; then
 elif [ $IMGPROTO = nfs ]; then
     TEST_DIR=nfs://127.0.0.1/$TEST_DIR
     TEST_IMG=$TEST_DIR/t.$IMGFMT
+elif [ $IMGPROTO = archipelago ]; then
+    TEST_IMG=archipelago:at.$IMGFMT
 else
     TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
 fi
@@ -163,7 +165,8 @@ _make_test_img()
         -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
         -e "s# block_size=[0-9]\\+##g" \
         -e "s# block_state_zero=\\(on\\|off\\)##g" \
-        -e "s# log_size=[0-9]\\+##g"
+        -e "s# log_size=[0-9]\\+##g" \
+        -e "s/archipelago:a/TEST_DIR\//g"
 
     # Start an NBD server on the image file, which is what we'll be talking to
     if [ $IMGPROTO = nbd ]; then
@@ -206,6 +209,10 @@ _cleanup_test_img()
             rbd --no-progress rm $TEST_DIR/t.$IMGFMT > /dev/null
             ;;
 
+        archipelago)
+            vlmc remove at.$IMGFMT > /dev/null
+            ;;
+
         sheepdog)
             collie vdi delete $TEST_DIR/t.$IMGFMT
             ;;
-- 
1.7.10.4
Re: [Qemu-devel] [PATCH] tcg/ppc: Fix failure in tcg_out_mem_long
On Thu, 26 Jun 2014 21:26:00 -0700
Richard Henderson <r...@twiddle.net> wrote:
> With rt != r0 on loads, we use rt for scratch. If we need an index
> register different from base, we can't use rt, but r0 is usable.
>
> Signed-off-by: Richard Henderson <r...@twiddle.net>
> ---
>
> This ought to fix the problem that Greg reported.

Thanks Richard !

> That we need to use --enable-debug-tcg to see the assert, and that I
> didn't previously do testing with that is disappointing.
>
> I'm thinking that we ought to do something like gcc wrt
> --enable-checking=release vs development, so that we can't do normal
> development without these asserts enabled. More on that later...
>
> r~

Makes sense.

Cheers.

--
Greg

> ---
>  tcg/ppc/tcg-target.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
> index c83fd9f..dd84e76 100644
> --- a/tcg/ppc/tcg-target.c
> +++ b/tcg/ppc/tcg-target.c
> @@ -805,7 +805,10 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
>
>      /* For unaligned, or very large offsets, use the indexed form.  */
>      if (offset & align || offset != (int32_t)offset) {
> -        tcg_debug_assert(rs != base && (!is_store || rs != rt));
> +        if (rs == base) {
> +            rs = TCG_REG_R0;
> +        }
> +        tcg_debug_assert(!is_store || rs != rt);
>          tcg_out_movi(s, TCG_TYPE_PTR, rs, orig);
>          tcg_out32(s, opx | TAB(rt, base, rs));
>          return;

-- 
Gregory Kurz                                     kurzg...@fr.ibm.com
                                                 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.
[Qemu-devel] [PATCH] Allow mismatched virtio config-len
From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>

Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true, the one we found
it on was the wqe addition in virtio-blk.

Allow mismatched config-lengths:
   *) If the version on the wire is shorter then ensure that the
      remainder is 0xff filled (as virtio_config_read does on out of
      range reads)
   *) If the version on the wire is longer, load what we have space
      for and skip the rest.

Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
---
 hw/virtio/virtio.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a3082d5..2b11142 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -927,11 +927,33 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
     }
     config_len = qemu_get_be32(f);
     if (config_len != vdev->config_len) {
-        error_report("Unexpected config length 0x%x. Expected 0x%zx",
-                     config_len, vdev->config_len);
-        return -1;
+        /*
+         * Unfortunately the reality is that there are cases where we
+         * see mismatched config lengths, so we have to deal with them
+         * rather than rejecting them.
+         */
+
+        if (config_len < vdev->config_len) {
+            /* This is normal in some devices when they add a new option */
+            memset(vdev->config, 0xff, vdev->config_len);
+            qemu_get_buffer(f, vdev->config, config_len);
+        } else {
+            int32_t diff;
+            /* config_len > vdev->config_len
+             * This is rarer, but is here to allow us to fix the case above
+             */
+            qemu_get_buffer(f, vdev->config, vdev->config_len);
+            /*
+             * Even though we expect the diff to be small, we can't use
+             * qemu_file_skip because it's not safe for a large skip.
+             */
+            for (diff = config_len - vdev->config_len; diff > 0; diff--) {
+                qemu_get_byte(f);
+            }
+        }
+    } else {
+        qemu_get_buffer(f, vdev->config, vdev->config_len);
     }
-    qemu_get_buffer(f, vdev->config, vdev->config_len);
 
     num = qemu_get_be32(f);
 
-- 
1.9.3
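The two cases in this commit message (wire config shorter than the device's, and wire config longer) can be sketched outside QEMU with a plain byte buffer standing in for the migration stream. This is a minimal sketch under stated assumptions: `Stream`, `stream_read` and `load_config` are hypothetical names, not QEMU's `QEMUFile` API, and the skip is done by advancing a cursor rather than by reading byte by byte as the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for a migration stream: a byte buffer with a cursor. */
typedef struct Stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
} Stream;

/*
 * Consume up to n bytes from the stream. If dst is NULL the bytes are
 * skipped. Returns the number of bytes actually consumed.
 */
static size_t stream_read(Stream *s, uint8_t *dst, size_t n)
{
    size_t avail = s->len - s->pos;

    if (n > avail) {
        n = avail;
    }
    if (dst) {
        memcpy(dst, s->data + s->pos, n);
    }
    s->pos += n;
    return n;
}

/*
 * Load a config blob of wire_len bytes into a device config area of
 * dev_len bytes, tolerating a length mismatch in either direction.
 * Returns 0 on success, -1 if the stream ends early.
 */
static int load_config(Stream *s, uint8_t *config,
                       size_t dev_len, size_t wire_len)
{
    size_t surplus;

    if (wire_len < dev_len) {
        /* Shorter on the wire: the fields the sender lacked read as 0xff. */
        memset(config, 0xff, dev_len);
        return stream_read(s, config, wire_len) == wire_len ? 0 : -1;
    }

    /* Equal or longer on the wire: take what fits, then skip the rest. */
    if (stream_read(s, config, dev_len) != dev_len) {
        return -1;
    }
    surplus = wire_len - dev_len;
    return stream_read(s, NULL, surplus) == surplus ? 0 : -1;
}
```

Skipping the surplus matters because the fields that follow the config blob in the stream must still be read from the right offset.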
Re: [Qemu-devel] [v5][PATCH 4/5] xen, gfx passthrough: create host bridge to passthrough
On 2014/6/25 14:24, Paolo Bonzini wrote:
> On 25/06/2014 04:17, Tiejun Chen wrote:
> > +    if (xen_enabled() && xen_has_gfx_passthru) {
> > +        d = pci_create_simple(b, 0, TYPE_I440FX_XEN_PCI_DEVICE);
> > +        *pi440fx_state = I440FX_XEN_PCI_DEVICE(d);
> > +        pci_create_pch(b);
> > +    } else {
> > +        d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
> > +        *pi440fx_state = I440FX_PCI_DEVICE(d);
> > +    }
>
> As mentioned in the review of v4, this should be a separate,
> Xen-specific machine. pci_create_pch should not be called in generic
> PC code.

I traced this path:

qemu_register_pc_machine(xenfv_machine);
    |
    + .init = pc_xen_hvm_init,
        |
        + pc_init_pci(machine);
            |
            + pc_init1(machine, 1, 1);
                |
                + i440fx_init()

So how can this be made specific to Xen? Or do you mean we need to
create a new machine to address this scenario? That machine would be
the same as xenfv_machine except for this small piece of code.

If you don't want this to affect other cases, we could move this chunk
of code into a function guarded by CONFIG_XEN. But that is not ideal
either, since this is a device feature and KVM may need it one day.

Thanks
Tiejun
Re: [Qemu-devel] [PATCH for 2.1] qdev: correctly send DEVICE_DELETED for recursively-deleted devices
On 27.06.2014 09:16, Markus Armbruster wrote:
> Paolo Bonzini <pbonz...@redhat.com> writes:
>
> > When a device is unparented (i.e. made completely hidden from
> > management) we want to send a DEVICE_DELETED event only if the
> > device actually was realized. This avoids raising DEVICE_DELETED
> > events when device_add fails.
> >
> > However, this does not work right for recursively-deleted devices:
> > the whole tree is _first_ unrealized, _then_ unparented. Then
> > device_unparent sees realized==false and fails to trigger the event.
> >
> > The solution is simply to move have_realized into the DeviceState
> > struct. If device_add fails, we never set the new field to true and
> > DEVICE_DELETED is not sent.
> >
> > Fixes qemu-iotests testcase 067.
>
> Suggest to add "Broken in commit 5942a19" here, to make it clear that
> it's a recent regression.

I vaguely recall that something like this was in Bandan's RFC (which I
assume the above commit forward-ported; the subject would be handy to
mention too), but once again without any explanation why, so I saw no
need to apply that during hard freeze.

Andreas

> > Reported-by: Markus Armbruster <arm...@redhat.com>
> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
>
> Reviewed-by: Markus Armbruster <arm...@redhat.com>

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
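The ordering problem described in the commit message (the tree is unrealized first and unparented second, so a check of `realized` at unparent time is always false) can be shown with a toy model. This is a sketch under stated assumptions: `Device`, `was_realized` and the event counter are hypothetical stand-ins and not QEMU's DeviceState or its event code.

```c
#include <stdbool.h>

/* Toy device; the fields are illustrative, not QEMU's DeviceState. */
typedef struct Device {
    bool realized;       /* current state */
    bool was_realized;   /* sticky: set once the device has been realized */
    int deleted_events;  /* stand-in for emitting DEVICE_DELETED */
} Device;

static void device_set_realized(Device *d, bool value)
{
    if (value) {
        d->was_realized = true;   /* remember it for the later unparent */
    }
    d->realized = value;
}

static void device_unparent(Device *d)
{
    if (d->realized) {
        device_set_realized(d, false);
    }
    /*
     * Checking d->realized here would miss recursively deleted children,
     * which were already unrealized before being unparented.
     */
    if (d->was_realized) {
        d->deleted_events++;
        d->was_realized = false;  /* send the event at most once */
    }
}
```

A device whose `device_add` failed never sets the sticky flag, so no event is raised for it, which matches the behaviour the commit message asks for.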
Re: [Qemu-devel] [PATCH 1/4] mips/kvm: Init EBase to correct KSEG0
On Thu, Jun 26, 2014 at 10:44:22AM +0100, James Hogan wrote: The EBase CP0 register is initialised to 0x8000, however with KVM the guest's KSEG0 is at 0x4000. The incorrect value doesn't get passed to KVM yet as KVM doesn't implement the EBase register, however we should set it correctly now so as not to break migration/loadvm to a future version of QEMU that does support EBase. Signed-off-by: James Hogan james.ho...@imgtec.com Cc: Aurelien Jarno aurel...@aurel32.net Cc: Paolo Bonzini pbonz...@redhat.com --- target-mips/translate.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/target-mips/translate.c b/target-mips/translate.c index 2f91959ed7b1..d7b8c4dbc81a 100644 --- a/target-mips/translate.c +++ b/target-mips/translate.c @@ -28,6 +28,7 @@ #include exec/helper-proto.h #include exec/helper-gen.h +#include sysemu/kvm.h #define MIPS_DEBUG_DISAS 0 //#define MIPS_DEBUG_SIGN_EXTENSIONS @@ -16076,7 +16077,12 @@ void cpu_state_reset(CPUMIPSState *env) env-CP0_Random = env-tlb-nb_tlb - 1; env-tlb-tlb_in_use = env-tlb-nb_tlb; env-CP0_Wired = 0; -env-CP0_EBase = 0x8000 | (cs-cpu_index 0x3FF); +env-CP0_EBase = (cs-cpu_index 0x3FF); +if (kvm_enabled()) { +env-CP0_EBase |= 0x4000; +} else { +env-CP0_EBase |= 0x8000; +} env-CP0_Status = (1 CP0St_BEV) | (1 CP0St_ERL); /* vectored interrupts not implemented, timer on int 7, no performance counters. */ Reviewed-by: Aurelien Jarno aurel...@aurel32.net -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
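The reset logic in the patch can be sketched in isolation as below. This is only an illustration: the helper name is hypothetical, and the segment bases are assumptions written out in full (0x80000000 is the architectural KSEG0 base, 0x40000000 is where MIPS KVM trap-and-emulate places the guest KSEG0), since the constants in the archived diff appear shortened.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed segment bases, see the note above. */
#define KSEG0_BASE_TCG 0x80000000u
#define KSEG0_BASE_KVM 0x40000000u

/* Hypothetical helper: compute the EBase reset value for one CPU. */
static uint32_t ebase_reset_value(uint32_t cpu_index, bool kvm)
{
    uint32_t ebase = cpu_index & 0x3FF; /* CPU number in the low 10 bits */

    ebase |= kvm ? KSEG0_BASE_KVM : KSEG0_BASE_TCG;
    return ebase;
}
```

Keeping the CPU number and the segment base as separate steps mirrors the patch, which first stores the masked cpu_index and then ORs in the base depending on kvm_enabled().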
Re: [Qemu-devel] [PATCH 2/4] mips_malta: Change default KVM cpu to 24Kc (no FP)
On Thu, Jun 26, 2014 at 10:44:23AM +0100, James Hogan wrote: Change the default Malta CPU model for when KVM is enabled to 24Kc which doesn't have floating point support compared to the 24Kf. The resulting incorrect Config CP0 register value doesn't get passed to KVM yet as KVM doesn't expose it, however we should ensure it is set correctly now to reduce the risk of breaking migration/loadvm to a future version of QEMU/Linux that does support them. Signed-off-by: James Hogan james.ho...@imgtec.com Cc: Aurelien Jarno aurel...@aurel32.net Cc: Paolo Bonzini pbonz...@redhat.com --- hw/mips/mips_malta.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c index 2868ee5b0307..c0841991f4e9 100644 --- a/hw/mips/mips_malta.c +++ b/hw/mips/mips_malta.c @@ -949,7 +949,12 @@ void mips_malta_init(MachineState *machine) #ifdef TARGET_MIPS64 cpu_model = 20Kc; #else -cpu_model = 24Kf; +if (kvm_enabled()) { +/* Don't enable FPU on KVM yet */ +cpu_model = 24Kc; +} else { +cpu_model = 24Kf; +} #endif } Given the explanations in the other mails, that looks fine to me, that said I think we should at least warn the user that we are disabling some features, instead of doing it silently. This is what is done for example on x86 when a CPU feature is not available. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 3/4] mips_malta: Remove incorrect KVM TE references
On Thu, Jun 26, 2014 at 10:44:24AM +0100, James Hogan wrote: Fix the error message and code comments relating to KVM not supporting booting from the flash mapping when no kernel is provided. The issue is a general MIPS KVM issue and isn't specific to the Trap Emulate version of MIPS KVM. Reported-by: Andreas Färber afaer...@suse.de Signed-off-by: James Hogan james.ho...@imgtec.com Cc: Aurelien Jarno aurel...@aurel32.net Cc: Paolo Bonzini pbonz...@redhat.com --- hw/mips/mips_malta.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c index c0841991f4e9..76cf5f2c48f4 100644 --- a/hw/mips/mips_malta.c +++ b/hw/mips/mips_malta.c @@ -1033,7 +1033,7 @@ void mips_malta_init(MachineState *machine) fl_idx++; if (kernel_filename) { ram_low_size = MIN(ram_size, 256 20); -/* For KVM TE we reserve 1MB of RAM for running bootloader */ +/* For KVM we reserve 1MB of RAM for running bootloader */ if (kvm_enabled()) { ram_low_size -= 0x10; bootloader_run_addr = 0x4000 + ram_low_size; @@ -1057,10 +1057,10 @@ void mips_malta_init(MachineState *machine) bootloader_run_addr, kernel_entry); } } else { -/* The flash region isn't executable from a KVM TE guest */ +/* The flash region isn't executable from a KVM guest */ if (kvm_enabled()) { error_report(KVM enabled but no -kernel argument was specified. - Booting from flash is not supported with KVM TE.); + Booting from flash is not supported with KVM.); exit(1); } /* Load firmware from flash. */ Reviewed-by: Aurelien Jarno aurel...@aurel32.net -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 4/4] mips_malta: Catch kernels linked at wrong address
On Thu, Jun 26, 2014 at 10:44:25AM +0100, James Hogan wrote: Add error reporting if the wrong type of kernel is provided for the current mode of acceleration. Currently a KVM kernel linked at 0x4000 can't be used with TCG, and a normal kernel linked at 0x8000 can't be used with KVM. Signed-off-by: James Hogan james.ho...@imgtec.com Cc: Aurelien Jarno aurel...@aurel32.net Cc: Paolo Bonzini pbonz...@redhat.com --- hw/mips/mips_malta.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c index 76cf5f2c48f4..95df42e6a4d5 100644 --- a/hw/mips/mips_malta.c +++ b/hw/mips/mips_malta.c @@ -792,9 +792,23 @@ static int64_t load_kernel (void) loaderparams.kernel_filename); exit(1); } + +/* Sanity check where the kernel has been linked */ if (kvm_enabled()) { +if (kernel_entry 0x8000ll) { +error_report(KVM guest kernels must be linked in useg. + Did you forget to enable CONFIG_KVM_GUEST?); +exit(1); +} + xlate_to_kseg0 = cpu_mips_kvm_um_phys_to_kseg0; } else { +if (!(kernel_entry 0x8000ll)) { +error_report(KVM guest kernels aren't supported with TCG. + Did you unintentionally enable CONFIG_KVM_GUEST?); +exit(1); +} + xlate_to_kseg0 = cpu_mips_phys_to_kseg0; } Reviewed-by: Aurelien Jarno aurel...@aurel32.net -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
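The sanity check above boils down to testing one address bit of the kernel entry point. A minimal sketch, assuming the mask is 0x80000000 (the archived diff shows a shortened constant) and using a hypothetical enum and function name instead of error_report()/exit():

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    KERNEL_OK,
    KERNEL_NEEDS_USEG,  /* KVM guest kernels must be linked in useg */
    KERNEL_NEEDS_KSEG0  /* TCG expects a normally linked kernel */
} KernelCheck;

/* Hypothetical helper: classify a kernel entry address for the given mode. */
static KernelCheck check_kernel_entry(int64_t kernel_entry, bool kvm)
{
    bool in_kseg = (kernel_entry & 0x80000000ll) != 0; /* assumed mask */

    if (kvm && in_kseg) {
        return KERNEL_NEEDS_USEG;
    }
    if (!kvm && !in_kseg) {
        return KERNEL_NEEDS_KSEG0;
    }
    return KERNEL_OK;
}
```

Returning a status instead of exiting is a choice made only for this sketch, so that both failure cases can be exercised in a test.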
Re: [Qemu-devel] [PATCH v5 0/3] s390: Support for Hotplug of Standby Memory
On Wed, 25 Jun 2014 10:26:57 -0400 Matthew Rosato mjros...@linux.vnet.ibm.com wrote:

This patchset adds support in s390 for a pool of standby memory, which can be set online/offline by the guest (i.e., via chmem). The standby pool of memory is allocated as the difference between the initial memory setting and the maxmem setting. As part of this work, additional results are provided for the Read SCP Information SCLP, and a new implementation is added for the Read Storage Element Information, Attach Storage Element, Assign Storage and Unassign Storage SCLPs, which enables the s390 guest to manipulate the standby memory pool. This patchset is based on work originally done by Jeng-Fang (Nick) Wang.

Could you add a short description of how to test it, please?

Changes for v5:
* Since ACPI memory hotplug is now in, removed Igor's patches from this set.
* Updated sclp.c to use object_resolve_path() instead of object_property_find().

Changes for v4:
* Remove initialization code from get_sclp_memory_hotplug_dev() and place in its own function, init_sclp_memory_hotplug_dev().
* Add hint to qemu-options.hx to note the fact that the memory size specified via -m might be forced to a boundary.
* Account for the legacy s390 machine, which does not support memory hotplug.
* Fix a bug in sclp.c - Change memory hotplug device parent to sysbus.
* Pulled latest version of Igor's patch.

Matthew Rosato (3):
  sclp-s390: Add device to manage s390 memory hotplug
  virtio-ccw: Include standby memory when calculating storage increment
  sclp-s390: Add memory hotplug SCLPs

 hw/s390x/s390-virtio-ccw.c |  46 +--
 hw/s390x/sclp.c            | 289 +++-
 include/hw/s390x/sclp.h    |  20 +++
 qemu-options.hx            |   3 +-
 target-s390x/cpu.h         |  18 +++
 target-s390x/kvm.c         |   5 +
 6 files changed, 366 insertions(+), 15 deletions(-)
Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support
Am 27.06.2014 um 09:53 schrieb Christian Borntraeger borntrae...@de.ibm.com: On 26/06/14 16:42, Alexander Graf wrote: On 26.06.14 16:29, Jens Freimann wrote: Conny, Alex, Christian, here are some fixes for the s390-ccw bios. It's a mixture of additional features (DASD IPL support for different formats) and cleanups. From a quick glimpse it looks quite clean and straight forward, but I'd like to make sure we get rid completely of the static sector size assumption. Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then? I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So please make sure to replace the occasions manually and audit every single one. Alex Also, are we guaranteed that virtio always uses 512 byte block size? Or was that just an internal API thing? The virtio-blk API always talks in 512 byte sectors, no matter the block size. Overall this is a nice improvement of the boot code - if possible I would like to see that in 2.1. Conny, can you carry that in your tree (with s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)? Acked-by: Christian Borntraeger borntrae...@de.ibm.com for the series. Christian
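To make the last point concrete (the virtio-blk interface counts in 512-byte sectors regardless of the device's block size), here is a small sketch of the unit conversion a boot loader would need. The macro and function names are hypothetical and are not taken from the s390-ccw bios sources.

```c
#include <stdint.h>

/* Unit used by the virtio-blk interface, per the statement above. */
#define TOY_VIRTIO_SECTOR_SIZE 512u

/* Hypothetical helper: translate a block number expressed in the
 * device's own block size into a 512-byte virtio sector number. */
static uint64_t block_to_virtio_sector(uint64_t block, uint32_t block_size)
{
    return block * (block_size / TOY_VIRTIO_SECTOR_SIZE);
}
```

With a 4096-byte block size, block 10 starts at virtio sector 80; with 512-byte blocks the two numbers coincide. The sketch assumes block_size is a multiple of 512.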
Re: [Qemu-devel] [v5][PATCH 3/5] xen, gfx passthrough: support Intel IGD passthrough with VT-D
On 2014/6/25 15:04, Michael S. Tsirkin wrote: On Wed, Jun 25, 2014 at 10:17:19AM +0800, Tiejun Chen wrote: Some registers of Intel IGD are mapped in host bridge, so it needs to [snip] static int is_vga_passthrough(XenHostPCIDevice *dev) { @@ -291,3 +292,158 @@ static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev) XEN_PT_LOG(dev, The pseudo Intel PCH ISA bridge created.\n); return 0; } + +int pci_create_pch(PCIBus *bus) Please prefix all xen specific non static functions with xen_ or something like this. Okay. pci_ is for pci core. In fact it's a good idea to do this for static functions as well, in case we add a conflicting function in some header. +{ +XenHostPCIDevice hdev; +int r = 0; + +if (!xen_has_gfx_passthru) { +return r; +} + +r = xen_host_pci_device_get(hdev, 0, 0, 0x1f, 0); +if (r) { +XEN_PT_ERR(NULL, Failed to find Intel PCH on host\n); +goto err; +} + +if (hdev.vendor_id == PCI_VENDOR_ID_INTEL) { +r = create_pseudo_pch_isa_bridge(bus, hdev); +if (r) { +XEN_PT_ERR(NULL, Failed to create PCH ISA bridge.\n); +goto err; +} +} Does it work on non intel? IGD means this should work on Intel platform. It seems to return success. Okay, I'd like to change this a void. Maybe you should just verify that vendor and device ID have the expected values on the host, and Vendor id is enough. fail otherwise. + +xen_host_pci_device_put(hdev); + +err: +return r; +} + +/* + * Currently we just pass this physical host bridge for IGD, 00:02.0. + * + * Here pci_dev is just that host bridge, so we have to get that real + * passthrough device by that given devfn to further confirm. + */ confirm what? So change like: * passthrough device by that given devfn to avoid other devices access. Comments like this need to document what function does. Maybe /* Can we support IGD passthrough for this device? * We require ... 
XYZ - fill in here */ +static int is_igd_passthrough(PCIDevice *pci_dev) +{ +PCIDevice *f = pci_dev-bus-devices[PCI_DEVFN(2, 0)]; +if (pci_dev-bus-devices[PCI_DEVFN(2, 0)]) { +XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, f); +return (is_vga_passthrough(s-real_device) + (s-real_device.vendor_id == PCI_VENDOR_ID_INTEL)); +} else { +return 0; +} +} + +void igd_pci_write(PCIDevice *pci_dev, uint32_t config_addr, + uint32_t val, int len) Same here, xen_ everywhere please. Okay. +{ +XenHostPCIDevice dev; +int r; + +/* IGD read/write is through the host bridge. + * ISA bridge is only for detect purpose. In i915 driver it will + * probe ISA bridge to discover the IGD, see comment in i915_drv.c: + * intel_detect_pch(). You mean in linux kernel I guess? So change like, * probe ISA bridge to discover the IGD, see comment in Linux:i915_drv.c: + */ + +assert(pci_dev-devfn == 0x00); + +if (!is_igd_passthrough(pci_dev)) { +goto write_default; +} + +/* Just work for the i915 driver. */ +switch (config_addr) { +case 0x58: /* PAVPC Offset */ +break; +default: +/* Just sets the emulated values. */ +goto write_default; +} + +/* Host write */ +r = xen_host_pci_device_get(dev, 0, 0, 0, 0); +if (r) { +XEN_PT_ERR(pci_dev, Can't get pci_dev_host_bridge\n); +abort(); +} + +r = xen_host_pci_set_block(dev, config_addr, (uint8_t *)val, len); +if (r) { +XEN_PT_ERR(pci_dev, Can't get pci_dev_host_bridge\n); +abort(); +} Cleaner: if (config_addr == 0x58) { Maybe we add other offset in the future, so we'd better keep in them in switch(). /* Host write */ r = xen_host_pci_device_get(dev, 0, 0, 0, 0); if (r) { XEN_PT_ERR(pci_dev, Can't get pci_dev_host_bridge\n); abort(); } r = xen_host_pci_set_block(dev, config_addr, (uint8_t *)val, len); if (r) { XEN_PT_ERR(pci_dev, Can't get pci_dev_host_bridge\n); abort(); } } Note this does not work on e.g. BE. Why do we need take BE into consideration here? Shouldn't PCI already be LE? 
The best way is really to make the register writeable in wmask. Then pci_default_write_config(pci_dev, config_addr, val, len); if (range_covers_byte(addr, len, 0x58)) { r = xen_host_pci_set_block(dev, config_addr, pci_dev-config + config_addr, len); } + +xen_host_pci_device_put(dev); + +return; + +write_default: +pci_default_write_config(pci_dev, config_addr, val, len); +} + +uint32_t igd_pci_read(PCIDevice *pci_dev, uint32_t config_addr, int len) +{ +XenHostPCIDevice
Re: [Qemu-devel] [v5][PATCH 5/5] xen, gfx passthrough: add opregion mapping
On 2014/6/25 15:13, Michael S. Tsirkin wrote: On Wed, Jun 25, 2014 at 10:17:21AM +0800, Tiejun Chen wrote: [snip] diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index 507165c..25147cf 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -63,7 +63,7 @@ typedef int (*xen_pt_conf_byte_read) #define XEN_PT_BAR_UNMAPPED (-1) #define PCI_CAP_MAX 48 - +#define PCI_INTEL_OPREGION 0xfc XEN_ please PCI_CAP_MAX should be fixed too. They are specific to PCI, not XEN. Why should we add such a prefix? [snip] +if (igd_guest_opregion) { +ret = xc_domain_memory_mapping(xen_xc, xen_domid, +(unsigned long)(igd_guest_opregion XC_PAGE_SHIFT), +(unsigned long)(igd_host_opregion XC_PAGE_SHIFT), don't spread casts all around. Should be a last resort. Okay. +3, +DPCI_REMOVE_MAPPING); +if (ret) { +return ret; +} +} + return 0; } @@ -447,3 +462,52 @@ err_out: XEN_PT_ERR(pci_dev, Can't get pci_dev_host_bridge\n); return -1; } + +uint32_t igd_read_opregion(XenPCIPassthroughState *s) +{ +uint32_t val = 0; + +if (igd_guest_opregion == 0) { !igd_guest_opregion is shorter and does the same, Okay. +return val; +} + +val = igd_guest_opregion; + +XEN_PT_LOG(s-dev, Read opregion val=%x\n, val); +return val; +} + +void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val) +{ +int ret; + +if (igd_guest_opregion) { +XEN_PT_LOG(s-dev, opregion register already been set, ignoring %x\n, + val); +return; +} + +xen_host_pci_get_block(s-real_device, PCI_INTEL_OPREGION, +(uint8_t *)igd_host_opregion, 4); +igd_guest_opregion = (unsigned long)(val ~0xfff) +| (igd_host_opregion 0xfff); + Clearly broken on BE. I still can't understand why we need to address this in BE case. Maybe not important here but writing clean code is just as easy. uint8_t igd_host_opregion[4]; ... xen_host_pci_get_block(s-real_device, PCI_INTEL_OPREGION, igd_host_opregion, sizeof igd_host_opregion); igd_guest_opregion = (val ~0xfff) | (pci_get_word(igd_host_opregion) 0xfff); 0xfff should be a macro too to avoid duplication. Okay. 
Thanks Tiejun
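On the big-endian question: casting a uint32_t to uint8_t * and filling it from config space only yields the right value on a little-endian host, whereas decoding the byte buffer explicitly works on either. A minimal sketch of the approach suggested in the review, with a hypothetical load_le16() standing in for pci_get_word() and a hypothetical macro for the 0xfff mask (the operators in the archived snippet are missing, so the expression below is one reading of it):

```c
#include <stdint.h>

#define TOY_OPREGION_OFFSET_MASK 0xfffu /* hypothetical name for 0xfff */

/* Decode a little-endian 16-bit value from a byte buffer,
 * independent of host byte order. */
static uint16_t load_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Combine the guest-chosen page with the host's in-page offset. */
static uint32_t toy_guest_opregion(uint32_t guest_val, const uint8_t host_reg[4])
{
    return (guest_val & ~TOY_OPREGION_OFFSET_MASK) |
           (load_le16(host_reg) & TOY_OPREGION_OFFSET_MASK);
}
```

Only the low 12 bits of the host register are needed, which is why a 16-bit load is enough here.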
Re: [Qemu-devel] [patch qemu] net: move queue number into NICPeers
On Mon, May 26, 2014 at 12:04:08PM +0200, Jiri Pirko wrote: It indicates the number of elements in ncs field and makes sense to have int inside NICPeers. Also in parse_netdev we do not need to access container and work with NICPeers only. Signed-off-by: Jiri Pirko j...@resnulli.us --- hw/core/qdev-properties-system.c | 3 +-- hw/net/virtio-net.c | 2 +- include/net/net.h| 2 +- net/net.c| 4 ++-- 4 files changed, 5 insertions(+), 6 deletions(-) Thanks, applied to my net tree: https://github.com/stefanha/qemu/commits/net Stefan
Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support
On 27/06/14 11:05, Alexander Graf wrote: Am 27.06.2014 um 09:53 schrieb Christian Borntraeger borntrae...@de.ibm.com: On 26/06/14 16:42, Alexander Graf wrote: On 26.06.14 16:29, Jens Freimann wrote: Conny, Alex, Christian, here are some fixes for the s390-ccw bios. It's a mixture of additional features (DASD IPL support for different formats) and cleanups. From a quick glimpse it looks quite clean and straight forward, but I'd like to make sure we get rid completely of the static sector size assumption. Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then? I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So please make sure to replace the occasions manually and audit every single one. Yes, a mindless sed, would also replace VIRTIO_SECTOR_SIZE with VIRTIO_MAX_SECTOR_SIZE. Fortunately there are only 3 place in bootmap.c. Should be simple enough to review. Alex Also, are we guaranteed that virtio always uses 512 byte block size? Or was that just an internal API thing? The virtio-blk API always talks in 512 byte sectors, no matter the block size. Overall this is a nice improvement of the boot code - if possible I would like to see that in 2.1. Conny, can you carry that in your tree (with s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)? Acked-by: Christian Borntraeger borntrae...@de.ibm.com for the series. Christian
Re: [Qemu-devel] [PATCH 4/5] PPC: e500: Support platform devices
On 06/04/2014 02:28 PM, Alexander Graf wrote: For e500 our approach to supporting platform devices is to create a simple bus from the guest's point of view within which we map platform devices dynamically. We allocate memory regions always within the platform hole in address space and map IRQs to predetermined IRQ lines that are reserved for platform device usage. This maps really nicely into device tree logic, so we can just tell the guest about our virtual simple bus in device tree as well. Signed-off-by: Alexander Graf ag...@suse.de --- default-configs/ppc-softmmu.mak | 1 + default-configs/ppc64-softmmu.mak | 1 + hw/ppc/e500.c | 221 ++ hw/ppc/e500.h | 1 + hw/ppc/e500plat.c | 1 + 5 files changed, 225 insertions(+) diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak index 33f8d84..d6ec8b9 100644 --- a/default-configs/ppc-softmmu.mak +++ b/default-configs/ppc-softmmu.mak @@ -45,6 +45,7 @@ CONFIG_PREP=y CONFIG_MAC=y CONFIG_E500=y CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) +CONFIG_PLATFORM=y # For PReP CONFIG_MC146818RTC=y CONFIG_ETSEC=y diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak index 37a15b7..06677bf 100644 --- a/default-configs/ppc64-softmmu.mak +++ b/default-configs/ppc64-softmmu.mak @@ -45,6 +45,7 @@ CONFIG_PSERIES=y CONFIG_PREP=y CONFIG_MAC=y CONFIG_E500=y +CONFIG_PLATFORM=y CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) # For pSeries CONFIG_XICS=$(CONFIG_PSERIES) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index 33d54b3..bc26215 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -36,6 +36,7 @@ #include exec/address-spaces.h #include qemu/host-utils.h #include hw/pci-host/ppce500.h +#include hw/platform/device.h #define EPAPR_MAGIC(0x45504150) #define BINARY_DEVICE_TREE_FILEmpc8544ds.dtb @@ -47,6 +48,14 @@ #define RAM_SIZES_ALIGN(64UL 20) +#define E500_PLATFORM_BASE 0xF000ULL +#define E500_PLATFORM_HOLE (128ULL * 1024 * 1024) /* 128 MB */ +#define E500_PLATFORM_PAGE_SHIFT 12 
+#define E500_PLATFORM_HOLE_PAGES (E500_PLATFORM_HOLE \ +E500_PLATFORM_PAGE_SHIFT) +#define E500_PLATFORM_FIRST_IRQ5 +#define E500_PLATFORM_NUM_IRQS 10 + /* TODO: parameterize */ #define MPC8544_CCSRBAR_BASE 0xE000ULL #define MPC8544_CCSRBAR_SIZE 0x0010ULL @@ -122,6 +131,62 @@ static void dt_serial_create(void *fdt, unsigned long long offset, } } +typedef struct PlatformDevtreeData { +void *fdt; +const char *mpic; +int irq_start; +const char *node; +} PlatformDevtreeData; + +static int platform_device_create_devtree(Object *obj, void *opaque) +{ +PlatformDevtreeData *data = opaque; +Object *dev; +PlatformDeviceState *pdev; + +dev = object_dynamic_cast(obj, TYPE_PLATFORM_DEVICE); +pdev = (PlatformDeviceState *)dev; + +if (!pdev) { +/* Container, traverse it for children */ +return object_child_foreach(obj, platform_device_create_devtree, data); +} + +return 0; +} + +static void platform_create_devtree(void *fdt, const char *node, uint64_t addr, +const char *mpic, int irq_start, +int nr_irqs) +{ +const char platcomp[] = qemu,platform\0simple-bus; +PlatformDevtreeData data; + +/* Create a /platform node that we can put all devices into */ + +qemu_fdt_add_subnode(fdt, node); +qemu_fdt_setprop(fdt, node, compatible, platcomp, sizeof(platcomp)); +qemu_fdt_setprop_string(fdt, node, device_type, platform); + +/* Our platform hole is less than 32bit big, so 1 cell is enough for address + and size */ +qemu_fdt_setprop_cells(fdt, node, #size-cells, 1); +qemu_fdt_setprop_cells(fdt, node, #address-cells, 1); +qemu_fdt_setprop_cells(fdt, node, ranges, 0, addr 32, addr, + E500_PLATFORM_HOLE); + +qemu_fdt_setprop_phandle(fdt, node, interrupt-parent, mpic); + +/* Loop through all devices and create nodes for known ones */ + +data.fdt = fdt; +data.mpic = mpic; +data.irq_start = irq_start; +data.node = node; + +platform_device_create_devtree(qdev_get_machine(), data); +} + static int ppce500_load_device_tree(MachineState *machine, PPCE500Params *params, hwaddr addr, @@ -379,6 +444,12 
@@ static int ppce500_load_device_tree(MachineState *machine, qemu_fdt_setprop_cell(fdt, pci, #address-cells, 3); qemu_fdt_setprop_string(fdt, /aliases, pci0, pci); +if (params-has_platform) { +platform_create_devtree(fdt, /platform,
Re: [Qemu-devel] [PATCH qom v2 1/4] sdhci: Fix misuse of qemu_free_irqs()
Am 18.06.2014 09:54, schrieb Peter Crosthwaite: From: Andreas Färber afaer...@suse.de It does a g_free() on the pointer. Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Andreas Färber afaer...@suse.de Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com Thanks for picking this up and reviewing, applied to qom-next with extended commit message: https://github.com/afaerber/qemu-cpu/commits/qom-next Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 0/5] qemu-char/monitor: make monitor_puts thread safe
On Tue, Jun 03, 2014 at 06:39:05PM +0200, Paolo Bonzini wrote: Even though virtio-blk-dataplane mostly synchronizes with the block layer by means of the AioContext, we still need to introduce mutexes for other QEMU subsystems that the dataplane thread might encounter on its way. Adding rerror/werror support, for example, means that the dataplane thread will have to generate QMP events. monitor_puts is the entry point for generating QMP responses and events. Making it thread-safe lets virtio-blk-dataplane threads generate QMP events; because the same entry point is also used for responses, a response and an event will never be intertwined. Protection is inserted at both the qemu-char and monitor levels. A generic mutex is necessary in qemu_fe_chr_write so that qemu_chr_fe_write_all does not break its output; we reuse that mutex in some of the character devices. There is no need to protect against removal of the monitor's backend, since the monitor itself cannot be removed. Paolo Bonzini (6): qemu-char: introduce qemu_chr_alloc qemu-char: do not call chr_write directly qemu-char: move pty_chr_update_read_handler around qemu-char: make writes thread-safe monitor: protect outbuf with mutex monitor: protect event emission backends/baum.c | 2 +- backends/msmouse.c| 2 +- include/sysemu/char.h | 20 ++-- monitor.c | 55 ++ qemu-char.c | 125 +- spice-qemu-char.c | 2 +- ui/console.c | 2 +- 7 files changed, 149 insertions(+), 59 deletions(-) Modulo Fam's missing unlock comment: Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
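The idea of protecting the output buffer with a mutex can be illustrated with a toy monitor, assuming POSIX threads. This is not the monitor.c code; the struct, its fields and toy_monitor_puts() are hypothetical. The point it shows is that the whole append happens under one lock, and that every path out of the function releases it (the kind of issue the missing-unlock review comment refers to).

```c
#include <pthread.h>
#include <string.h>

/* Toy monitor with a fixed-size output buffer guarded by a mutex. */
typedef struct ToyMonitor {
    pthread_mutex_t out_lock;
    char outbuf[256];
    size_t len;
} ToyMonitor;

/* Append str to the buffer atomically with respect to other callers. */
static void toy_monitor_puts(ToyMonitor *mon, const char *str)
{
    size_t n = strlen(str);
    size_t room;

    pthread_mutex_lock(&mon->out_lock);
    room = sizeof(mon->outbuf) - 1 - mon->len;
    if (n > room) {
        n = room; /* truncate rather than overflow */
    }
    memcpy(mon->outbuf + mon->len, str, n);
    mon->len += n;
    mon->outbuf[mon->len] = '\0';
    pthread_mutex_unlock(&mon->out_lock); /* single exit path, always unlocks */
}
```

Because each call holds the lock for the full append, a response written by one thread and an event written by another cannot interleave inside a single string.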
Re: [Qemu-devel] [PATCH for 2.1 0/2] Fix commit of oversized layer
Am 25.06.2014 um 22:55 hat Jeff Cody geschrieben: This fixes a regression in block-commit; if the top image is larger than the base image, we attempt to resize the base image. The regression is that we fail the image truncate operation, returning -EBUSY. Thanks, applied to the block branch. One thing I'm not sure about is whether commit (all of synchronous, live and live on active layer) should check the RESIZE blocker before resizing the backing file. In general, it feels like it would be the right thing to do, especially considering the goal of operation categories in the final state, but on the other hand it means that RESIZE would have to be excluded from bs-backing_blocker, too, allowing standalone resize commands on backing files. Not sure that this would be a good idea... Kevin
Re: [Qemu-devel] [PATCH qom v2 2/4] hw: Fix qemu_allocate_irqs() leaks
Am 18.06.2014 09:55, schrieb Peter Crosthwaite: From: Andreas Färber afaer...@suse.de Replace qemu_allocate_irqs(foo, bar, 1)[0] with qemu_allocate_irq(foo, bar, 0). This avoids leaking the dereferenced qemu_irq *. Cc: Kirill Batuzov batuz...@ispras.ru Cc: Markus Armbruster arm...@redhat.com Cc: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com Reviewed-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Andreas Färber afaer...@suse.de [PC Changes: * Applied change to instance in sh4/sh7750.c ] Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- Changed since 1: Applied change to instance in sh4/sh7750.c (Kirill review) [...] diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c index 4a39357..9ccd770 100644 --- a/hw/sh4/sh7750.c +++ b/hw/sh4/sh7750.c @@ -838,6 +838,5 @@ SH7750State *sh7750_init(SuperHCPU *cpu, MemoryRegion *sysmem) qemu_irq sh7750_irl(SH7750State *s) { sh_intc_toggle_source(sh_intc_source(s-intc, IRL), 1, 0); /* enable */ -return qemu_allocate_irqs(sh_intc_set_irl, sh_intc_source(s-intc, IRL), - 1)[0]; +return qemu_allocate_irq(sh_intc_set_irl, sh_intc_source(s-intc, IRL), 1); Thanks for catching this, my grep expression failed due to the line break. But shouldn't this be 0 due to the zero-based index, as per my commit message? Will fix up unless I hear objections. Regards, Andreas } -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
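The leak and the index question can both be seen in a toy model of the two allocation styles. The types and functions below are hypothetical stand-ins, not the QEMU API; they assume, as the commit message describes, that the array variant takes a count and the single variant takes the line index.

```c
#include <stdlib.h>

typedef void (*toy_irq_handler)(void *opaque, int n, int level);

typedef struct ToyIRQ {
    toy_irq_handler handler;
    void *opaque;
    int n; /* line index passed back to the handler */
} ToyIRQ;

/* Single-IRQ variant: the last argument is the line index. */
static ToyIRQ *toy_allocate_irq(toy_irq_handler h, void *opaque, int n)
{
    ToyIRQ *irq = malloc(sizeof(*irq));

    if (!irq) {
        abort();
    }
    irq->handler = h;
    irq->opaque = opaque;
    irq->n = n;
    return irq;
}

/* Array variant: the last argument is a count; the caller owns the array. */
static ToyIRQ **toy_allocate_irqs(toy_irq_handler h, void *opaque, int count)
{
    ToyIRQ **arr = malloc(sizeof(*arr) * (size_t)count);

    if (!arr) {
        abort();
    }
    for (int i = 0; i < count; i++) {
        arr[i] = toy_allocate_irq(h, opaque, i);
    }
    return arr;
}
```

Writing toy_allocate_irqs(h, o, 1)[0] keeps element 0 but drops the only pointer to the array, which is the leak. The equivalent single allocation is toy_allocate_irq(h, o, 0), since element 0 of the array carries index 0; passing 1 would produce a different line index.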
Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support
On Fri, 27 Jun 2014 11:27:12 +0200 Christian Borntraeger borntrae...@de.ibm.com wrote: On 27/06/14 11:05, Alexander Graf wrote: Am 27.06.2014 um 09:53 schrieb Christian Borntraeger borntrae...@de.ibm.com: On 26/06/14 16:42, Alexander Graf wrote: On 26.06.14 16:29, Jens Freimann wrote: Conny, Alex, Christian, here are some fixes for the s390-ccw bios. It's a mixture of additional features (DASD IPL support for different formats) and cleanups. From a quick glimpse it looks quite clean and straight forward, but I'd like to make sure we get rid completely of the static sector size assumption. Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then? I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So please make sure to replace the occasions manually and audit every single one. Yes, a mindless sed, would also replace VIRTIO_SECTOR_SIZE with VIRTIO_MAX_SECTOR_SIZE. Fortunately there are only 3 place in bootmap.c. Should be simple enough to review. Yes, all places that use it want a MAX_SECTOR_SIZE. All places using the actual sector size are now using the helper function. Also, are we guaranteed that virtio always uses 512 byte block size? Or was that just an internal API thing? The virtio-blk API always talks in 512 byte sectors, no matter the block size. Overall this is a nice improvement of the boot code - if possible I would like to see that in 2.1. Conny, can you carry that in your tree (with s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)? Acked-by: Christian Borntraeger borntrae...@de.ibm.com for the series. Will push out shortly. Unless there are objections, I'll send a pull request for this.
Re: [Qemu-devel] [PATCH qom v2 0/4] QOMify IRQs
On 25.06.2014 11:39, Peter Crosthwaite wrote: Ping! This is fully reviewed and should be ready for a merge. I'd like to see this through for 2.1. I have been very wary of applying the QOM conversion without full device test coverage, similar to realization. People actually testing this conversion would've been more reaffirming than a bit of review - the hardfreeze can but does not necessarily uncover all corner cases. But time is running out, so I intend to apply the series unless I discover issues. qtests for missing devices or statistics of how incomplete our coverage actually is are appreciated as always. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH v11 1/3] sPAPR: Implement EEH RTAS calls
On Thu, Jun 26, 2014 at 12:46:50PM +0200, Alexander Graf wrote: On 26.06.14 12:43, Gavin Shan wrote: On Thu, Jun 26, 2014 at 12:30:16PM +0200, Alexander Graf wrote: On 26.06.14 03:35, Gavin Shan wrote: The emulation for EEH RTAS requests from guest isn't covered by QEMU yet and the patch implements them. The patch defines constants used by EEH RTAS calls and adds callback sPAPRPHBClass::eeh_handler, which is going to be used this way: 1. RTAS calls are received in spapr_pci.c, sanity check is done there. 2. RTAS handlers handle what they can. If there is something it cannot handle and sPAPRPHBClass::eeh_handler callback is defined, it is called. 3. sPAPRPHBClass::eeh_handler is only implemented for VFIO now. It does ioctl() to the IOMMU container fd to complete the call. Error codes from that ioctl() are transferred back to the guest. Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com --- hw/ppc/spapr_pci.c | 240 include/hw/pci-host/spapr.h | 7 ++ include/hw/ppc/spapr.h | 33 ++ 3 files changed, 280 insertions(+) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 131434b..8712051 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -422,6 +422,233 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu, rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */ } +static int rtas_handle_eeh_request(sPAPREnvironment *spapr, + uint64_t buid, uint32_t req, uint32_t opt) +{ +sPAPRPHBState *sphb = spapr_find_phb(spapr, buid); +sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); + +if (!sphb || !info-eeh_handler) { +return -ENOENT; +} + +return info-eeh_handler(sphb, req, opt); +} + +static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, +sPAPREnvironment *spapr, +uint32_t token, uint32_t nargs, +target_ulong args, uint32_t nret, +target_ulong rets) +{ +uint32_t addr, option; +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +int ret; + +if ((nargs != 4) || (nret != 1)) { +goto param_error_exit; +} + +addr = rtas_ld(args, 0); 
+option = rtas_ld(args, 3); +switch (option) { +case RTAS_EEH_ENABLE: +if (!find_dev(spapr, buid, addr)) { +goto param_error_exit; +} +break; +case RTAS_EEH_DISABLE: +case RTAS_EEH_THAW_IO: +case RTAS_EEH_THAW_DMA: +break; +default: +goto param_error_exit; +} + +ret = rtas_handle_eeh_request(spapr, buid, + RTAS_EEH_REQ_SET_OPTION, option); +if (ret = 0) { +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +return; +} + +param_error_exit: +rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, uint32_t nret, + target_ulong rets) +{ +uint32_t addr, option; +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +sPAPRPHBState *sphb = spapr_find_phb(spapr, buid); +sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); +PCIDevice *pdev; + +if (!sphb || !info-eeh_handler) { +goto param_error_exit; +} + +if ((nargs != 4) || (nret != 2)) { +goto param_error_exit; +} + +addr = rtas_ld(args, 0); +option = rtas_ld(args, 3); +if (option != RTAS_GET_PE_ADDR option != RTAS_GET_PE_MODE) { +goto param_error_exit; +} + +pdev = find_dev(spapr, buid, addr); +if (!pdev) { +goto param_error_exit; +} + +/* + * For now, we always have bus level PE whose address + * has format 00BBSS00. The guest OS might regard + * PE address 0 as invalid. We avoid that simply by + * extending it with one. + */ +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +if (option == RTAS_GET_PE_ADDR) { +rtas_st(rets, 1, (pci_bus_num(pdev-bus) 16) + 1); +} else { +rtas_st(rets, 1, RTAS_PE_MODE_SHARED); +} + +return; + +param_error_exit: +rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, +sPAPREnvironment *spapr, +uint32_t token, uint32_t nargs, +target_ulong args, uint32_t nret, +target_ulong rets) +{ +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +int ret; + +if ((nargs != 3) || (nret != 4
Re: [Qemu-devel] VNC memory corruption during resolution change
Found the issue: during a resolution change, Windows 7 sometimes switches to an intermediate resolution where server_stride % cmp_bytes != 0. The memory corruption happens where the guest fb is copied to the server fb. It can easily be fixed by truncating cmp_bytes in vnc_refresh_server_surface. But looking at the code, it seems that none of the encoders called in vnc_send_framebuffer_update really care about w > pixman_image_get_width(vd->server).

I will send a patch that removes all DIV_ROUND_UPs for now to avoid the corruption. There are almost no real resolutions out there where width % 16 != 0. If we find some, we might need to either decrease VNC_DIRTY_PIXELS_PER_BIT or make it dynamic depending on the resolution.

Peter

On 26.06.2014 17:44, Peter Lieven wrote:
> Hi all,
>
> while playing around with the vmware vga driver I noticed that there seems to be a race condition when the resolution is changed. I was able to trigger this with std vga as well. Attached valgrind always produces output similar to this:
>
> ==3346== Thread 1:
> ==3346== Invalid read of size 8
> ==3346==    at 0x4C2D108: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x12555180 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid write of size 8
> ==3346==    at 0x4C2D10D: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x15731080 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid read of size 8
> ==3346==    at 0x4C2D11A: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x12555170 is not stack'd, malloc'd or (recently) free'd
> ==3346==
> ==3346== Invalid read of size 1
> ==3346==    at 0x4C2DCC0: bcmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x400D91: vnc_refresh_server_surface (vnc.c:2720)
> ==3346==    by 0x400F19: vnc_refresh (vnc.c:2753)
> ==3346==    by 0x3DA903: dpy_refresh (console.c:1416)
> ==3346==    by 0x3D6D93: gui_update (console.c:194)
> ==3346==    by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
> ==3346==    by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
> ==3346==    by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
> ==3346==    by 0x3649CF: main_loop_wait (main-loop.c:490)
> ==3346==    by 0x406540: main_loop (vl.c:2051)
> ==3346==    by 0x40DEA0: main (vl.c:4507)
> ==3346==  Address 0x15731050 is 0 bytes after a block of size 196,560 alloc'd
> ==3346==    at 0x4C29DB4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==3346==    by 0x70C8B1A: ??? (in /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x70C8BF4: ??? (in /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
> ==3346==    by 0x3FAECC: vnc_dpy_switch (vnc.c:590)
> ==3346==    by 0x3DA87C: dpy_gfx_replace_surface (console.c:1404)
> ==3346==    by 0x3DBCF0: qemu_console_resize (console.c:1857)
> ==3346==    by 0x450A39: vga_draw_text (vga.c:1344)
> ==3346==    by 0x4521B0: vga_update_display (vga.c:1910)
> ==3346==    by 0x2A665B: vmsvga_update_display (vmware_vga.c:1071)
> ==3346==    by 0x3D7087: graphic_hw_update (console.c:256)
> ==3346==
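
For readers following the diagnosis above, here is a minimal sketch of the "truncate cmp_bytes" idea. It is not the QEMU code or the patch that was eventually sent: refresh_row() and its parameters are hypothetical names, and the only assumption is that a row is compared and copied in fixed-size chunks, with the last chunk clamped so that nothing past the end of the row is read or written.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code: compare one framebuffer row in chunks
 * of cmp_bytes and copy the chunks that differ from guest to server.
 * The last chunk is truncated to the bytes that remain in the row, so a
 * row length that is not a multiple of cmp_bytes cannot overrun the buffer.
 * Returns the number of chunks that were copied.
 */
static int refresh_row(unsigned char *server, const unsigned char *guest,
                       size_t row_bytes, size_t cmp_bytes)
{
    int dirty = 0;
    size_t off;

    if (cmp_bytes == 0) {
        return 0;
    }

    for (off = 0; off < row_bytes; off += cmp_bytes) {
        size_t n = row_bytes - off;     /* bytes left in this row */

        if (n > cmp_bytes) {
            n = cmp_bytes;              /* full chunk */
        }
        if (memcmp(server + off, guest + off, n) != 0) {
            memcpy(server + off, guest + off, n);
            dirty++;
        }
    }
    return dirty;
}
```

With a 10-byte row and cmp_bytes of 4, the chunks are 4, 4 and 2 bytes long; an unclamped loop would touch 12 bytes, which is the kind of access valgrind reports above.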
Re: [Qemu-devel] About AddressSpace in intel-iommu emulation
On 2014-06-27 07:46, Le Tan wrote:
> 2014-06-27 12:55 GMT+08:00 Paolo Bonzini pbonz...@redhat.com:
> > On 27/06/2014 04:08, Le Tan wrote:
> > > 1. In struct IOMMUTLBEntry, I think the addr_mask field should be the mask of the page offset, right? But I see different usages of this field. In spapr_tce_translate_iommu(), addr_mask is assigned the mask of the page offset. However, in pbm_translate_iommu(), in the passthrough case, addr_mask seems to be assigned the mask of the page number. Is there any problem here?
> >
> > The intended usage is the one of spapr_tce_translate_iommu(). In practice it doesn't matter, both work.
> >
> > > 2. For q35, how do we identify the origin of DMA requests? The VT-d manual says we should use the source-id (for PCI Express devices, the requester identifier) to map devices to domains. What is the related part in QEMU? Where can I get the source-id of a DMA request?
> >
> > You need to create a different AddressSpace for each PCI bus or device.
>
> How do I create a different AddressSpace for each device? I previously thought an AddressSpace just belonged to a PCI bus. The paging structures for different functions of the same device can also differ, so maybe we should create a different AddressSpace for each function? How can that be achieved? Could you give me some more hints, or is there an existing example in QEMU?

I would suggest studying the apb IOMMU implementation Paolo referenced and the PCI layer functions used by that code. Specifically, pci_setup_iommu takes a callback that is supposed to return an address space to be used for a particular device. For apb, it's the same for all devices on a bus, but that's not required...

Jan
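
As an illustration of the first question, this is a small sketch of how an address mask in the "page offset" sense is typically combined with a translated page base. The TlbEntry struct and apply_translation() are hypothetical stand-ins written for this example; they only mirror the two fields discussed in the thread and are not QEMU's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a translation result; not QEMU's IOMMUTLBEntry. */
typedef struct TlbEntry {
    uint64_t translated_addr;   /* base address of the target page */
    uint64_t addr_mask;         /* in-page offset bits, e.g. 0xfff for 4 KiB */
} TlbEntry;

/*
 * Keep the page bits from the translation and the offset bits from the
 * original address. This only works as intended when addr_mask covers
 * the page offset, which is the usage described as intended above.
 */
static uint64_t apply_translation(const TlbEntry *e, uint64_t addr)
{
    return (e->translated_addr & ~e->addr_mask) | (addr & e->addr_mask);
}
```

For a 4 KiB page at 0x80000000, an input address of 0x12345 keeps its low 12 bits and lands at 0x80000345.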
Re: [Qemu-devel] [PATCH 0/3] virtio-blk: Suppress error action on r/w beyond end
On Thu, Jun 05, 2014 at 02:15:33PM +0200, Markus Armbruster wrote:
> When a device model's I/O operation fails, we execute the error
> action. This lets layers above QEMU implement thin provisioning, or
> attempt to correct errors before they reach the guest. But when the
> I/O operation fails because it's invalid, reporting the error to the
> guest is the only sensible action.
>
> This short series does exactly that for virtio-blk. I intend to do
> the same for IDE and SCSI.
>
> Markus Armbruster (3):
>   virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
>   virtio-blk: Bypass error action and I/O accounting on invalid r/w
>   virtio-blk: Treat read/write beyond end as invalid
>
>  hw/block/virtio-blk.c | 45 +
>  1 file changed, 29 insertions(+), 16 deletions(-)
>
> --
> 1.9.3

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Re: [Qemu-devel] [PATCH 3/3] virtio-blk: Treat read/write beyond end as invalid
On Mon, Jun 23, 2014 at 02:57:36PM +0200, Markus Armbruster wrote:
> Markus Armbruster arm...@redhat.com writes:
> > Stefan Hajnoczi stefa...@redhat.com writes:
> > > On Thu, Jun 05, 2014 at 02:15:36PM +0200, Markus Armbruster wrote:
> > > > +if (sector > total_sectors || nb_sectors > total_sectors - sector) {
> > > > +return false;
> > > > +}
> > >
> > > if (sector >= total_sectors || ...) {
> >
> > I suspect reading bdrv_check_byte_request() put the '>' in my brain:
> >
> >     if ((offset > len) || (len - offset < size))
> >         return -EIO;
> >
> > Don't we need offset >= len here?
>
> Just remembered: we don't, because we allow I/O at offset len provided size is zero. Same reasoning applies to my patch.

Okay. I didn't remember the offset=eof length=0 thing.

Stefan
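
The shape of the check discussed above can be tried out in isolation. The following is a sketch only: sect_range_ok() is a hypothetical helper that copies the comparison quoted in the thread, not the virtio-blk function itself. It shows the two properties mentioned: a zero-length request exactly at the end is accepted, and comparing against total_sectors - sector avoids the wrap-around that sector + nb_sectors could cause.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper modelled on the comparison quoted above.
 * 'sector' is the first sector of the request, 'nb_sectors' its length
 * and 'total_sectors' the size of the device, all in sectors.
 */
static bool sect_range_ok(uint64_t sector, uint64_t nb_sectors,
                          uint64_t total_sectors)
{
    /*
     * Checking sector first guarantees that total_sectors - sector
     * cannot underflow in the second comparison.
     */
    if (sector > total_sectors || nb_sectors > total_sectors - sector) {
        return false;
    }
    return true;
}
```

With total_sectors = 100, a request at sector 100 with length 0 passes, the same request with length 1 fails, and a request at sector 1 with length UINT64_MAX fails even though 1 + UINT64_MAX wraps to 0 in unsigned arithmetic.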
Re: [Qemu-devel] [PATCH] docs/multiple-iothreads.txt: add documentation on IOThread programming
On Mon, Jun 09, 2014 at 09:29:31AM -0600, Eric Blake wrote:
> On 06/09/2014 07:59 AM, Stefan Hajnoczi wrote:
> > This document explains how IOThreads and the main loop are related,
> > especially how to write code that can run in an IOThread. Currently on
> > virtio-blk-data-plane uses these techniques. The next obvious target is
> > virtio-scsi; there has also been work on virtio-net.
> >
> > Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
> > ---
> >  docs/multiple-iothreads.txt | 124
> >  1 file changed, 124 insertions(+)
> >  create mode 100644 docs/multiple-iothreads.txt
> >
> > diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
> > new file mode 100644
> > index 0000000..f2b008d
> > --- /dev/null
> > +++ b/docs/multiple-iothreads.txt
> > @@ -0,0 +1,124 @@
> > +This document explains the IOThread feature and how to write code that runs
> > +outside the QEMU global mutex.
>
> Pre-existing epidemic in this directory, but should you assert copyright and a license?

Yes, I'm happy to do that.
Re: [Qemu-devel] [PATCH] docs/multiple-iothreads.txt: add documentation on IOThread programming
On Mon, Jun 09, 2014 at 04:11:29PM +0200, Paolo Bonzini wrote:
> > +The main loop and IOThreads
> > +---------------------------
> > +QEMU is an event-driven program that can do several things at once using an
> > +event loop. The VNC server and the QMP monitor are both processed from the
> > +same event loop which monitors their file descriptors until they become
> > +readable and then invokes a callback.
> > +
> > +The default event loop is called the main loop (see main-loop.c). It is
> > +possible to create additional event loop threads using -object
> > +iothread,id=my-iothread.
> > +
> > +Side note: The main loop and IOThread are both event loops but their code is
> > +not shared completely. Sometimes it is useful to remember that although they
> > +are conceptually similar they are currently not interchangeable.
>
> Actually, the main loop does include all the iothread code. So you could say that the main loop is a superset of the iothread.

Not quite. The main loop includes AioContext but it does not use iothread.c (IOThread).

> > + * LEGACY timer_new_ms() - create a timer
> > + * LEGACY qemu_bh_new() - create a BH
> > + * LEGACY qemu_aio_wait() - run an event loop iteration
>
> also seems to be unused except for qemu-io-cmds.c (and easily removed from there).
>
> Perhaps add a note (here or elsewhere) that timer_new_ms/qemu_bh_new should never be used in the block layer?

I'll note it further down where the block layer is mentioned.
[Qemu-devel] [RFC PATCH 0/3] cpu: add device_add foo-x86_64-cpu support
This series is based on the previous patchset from Chen Fan:
https://lists.nongnu.org/archive/html/qemu-devel/2014-05/msg02360.html

These patches make CPU hotplug work with device_add and make -device foo-x86_64-cpu available. The apic-id property can also be set on the command line; if it is not set, the first unoccupied APIC ID is used as the default for the new CPU. When a CPU is hotplugged with device_add, the additional APIC ID check is done after CPU object initialization, which differs from the 'cpu_add' command, which checks 'ids' at the beginning.

Chen Fan (2):
  cpu: introduce CpuTopoInfo structure for argument simplification
  cpu: add device_add foo-x86_64-cpu support

Gu Zheng (1):
  qom/cpu: move register_vmstate to common CPUClass.realizefn

 exec.c                          | 32 ++---
 hw/intc/apic_common.c           |  3 +-
 include/hw/i386/apic_internal.h |  3 +-
 include/qom/cpu.h               |  3 ++
 qdev-monitor.c                  |  1 +
 qom/cpu.c                       |  2 +
 target-i386/cpu.c               | 76 --
 target-i386/topology.h          | 51 ++
 8 files changed, 135 insertions(+), 36 deletions(-)

--
1.7.7
[Qemu-devel] [RFC PATCH 1/3] cpu: introduce CpuTopoInfo structure for argument simplification
Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
Reviewed-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 target-i386/topology.h | 33 +
 1 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/target-i386/topology.h b/target-i386/topology.h
index 07a6c5f..e9ff89c 100644
--- a/target-i386/topology.h
+++ b/target-i386/topology.h
@@ -47,6 +47,12 @@
  */
 typedef uint32_t apic_id_t;
 
+typedef struct X86CPUTopoInfo {
+    unsigned pkg_id;
+    unsigned core_id;
+    unsigned smt_id;
+} X86CPUTopoInfo;
+
 /* Return the bit width needed for 'count' IDs
  */
 static unsigned apicid_bitwidth_for_count(unsigned count)
@@ -92,13 +98,11 @@ static inline unsigned apicid_pkg_offset(unsigned nr_cores, unsigned nr_threads)
  */
 static inline apic_id_t apicid_from_topo_ids(unsigned nr_cores,
                                              unsigned nr_threads,
-                                             unsigned pkg_id,
-                                             unsigned core_id,
-                                             unsigned smt_id)
+                                             const X86CPUTopoInfo *topo)
 {
-    return (pkg_id << apicid_pkg_offset(nr_cores, nr_threads)) |
-           (core_id << apicid_core_offset(nr_cores, nr_threads)) |
-           smt_id;
+    return (topo->pkg_id << apicid_pkg_offset(nr_cores, nr_threads)) |
+           (topo->core_id << apicid_core_offset(nr_cores, nr_threads)) |
+           topo->smt_id;
 }
 
 /* Calculate thread/core/package IDs for a specific topology,
@@ -107,14 +111,12 @@ static inline apic_id_t apicid_from_topo_ids(unsigned nr_cores,
 static inline void x86_topo_ids_from_idx(unsigned nr_cores,
                                          unsigned nr_threads,
                                          unsigned cpu_index,
-                                         unsigned *pkg_id,
-                                         unsigned *core_id,
-                                         unsigned *smt_id)
+                                         X86CPUTopoInfo *topo)
 {
     unsigned core_index = cpu_index / nr_threads;
-    *smt_id = cpu_index % nr_threads;
-    *core_id = core_index % nr_cores;
-    *pkg_id = core_index / nr_cores;
+    topo->smt_id = cpu_index % nr_threads;
+    topo->core_id = core_index % nr_cores;
+    topo->pkg_id = core_index / nr_cores;
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
@@ -125,10 +127,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_cores,
                                                 unsigned nr_threads,
                                                 unsigned cpu_index)
 {
-    unsigned pkg_id, core_id, smt_id;
-    x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index,
-                          &pkg_id, &core_id, &smt_id);
-    return apicid_from_topo_ids(nr_cores, nr_threads, pkg_id, core_id, smt_id);
+    X86CPUTopoInfo topo;
+    x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
+    return apicid_from_topo_ids(nr_cores, nr_threads, &topo);
 }
 
 #endif /* TARGET_I386_TOPOLOGY_H */
--
1.7.7
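
To see what the refactored helpers compute, here is a self-contained sketch of the same index-to-topology-to-APIC-ID mapping. The struct and the division/modulo logic follow the patch above; bitwidth_for_count() is an assumption about what the real apicid_bitwidth_for_count() returns (the number of bits needed for 'count' IDs), and the function names are shortened to make clear this is not the QEMU header.

```c
#include <stdint.h>

typedef uint32_t apic_id_t;

typedef struct X86CPUTopoInfo {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} X86CPUTopoInfo;

/* Assumed behaviour: bits needed to encode 'count' IDs (0 when count is 1). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    while (width < 31 && (1u << width) < count) {
        width++;
    }
    return width;
}

/* Split a linear CPU index into thread, core and package IDs. */
static void topo_ids_from_idx(unsigned nr_cores, unsigned nr_threads,
                              unsigned cpu_index, X86CPUTopoInfo *topo)
{
    unsigned core_index = cpu_index / nr_threads;

    topo->smt_id = cpu_index % nr_threads;
    topo->core_id = core_index % nr_cores;
    topo->pkg_id = core_index / nr_cores;
}

/* Pack the three IDs into one APIC ID, each field in its own bit range. */
static apic_id_t apicid_from_topo(unsigned nr_cores, unsigned nr_threads,
                                  const X86CPUTopoInfo *topo)
{
    unsigned core_offset = bitwidth_for_count(nr_threads);
    unsigned pkg_offset = core_offset + bitwidth_for_count(nr_cores);

    return (topo->pkg_id << pkg_offset) |
           (topo->core_id << core_offset) |
           topo->smt_id;
}
```

With 3 cores and 2 threads per package, CPU index 5 maps to APIC ID 5, but index 6 maps to APIC ID 8: the core field is 2 bits wide, so IDs 6 and 7 are never used. This is why APIC IDs cannot be assumed to be contiguous when checking or allocating them.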
[Qemu-devel] [RFC PATCH 2/3] qom/cpu: move register_vmstate to common CPUClass.realizefn
Move the CPU vmstate registration from cpu_exec_init into cpu_common_realizefn, and the APIC vmstate registration into x86_cpu_apic_realize. Also use cc->get_arch_id as the instance id, as suggested by Igor, to fix the migration issue.

Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 exec.c                          | 32 +++-
 hw/intc/apic_common.c           |  3 +--
 include/hw/i386/apic_internal.h |  3 ++-
 include/qom/cpu.h               |  2 ++
 qom/cpu.c                       |  2 ++
 target-i386/cpu.c               | 12 +---
 6 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/exec.c b/exec.c
index 4e179a6..61ad996 100644
--- a/exec.c
+++ b/exec.c
@@ -468,10 +468,28 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
 }
 #endif
 
+void cpu_vmstate_register(CPUState *cpu)
+{
+    CPUClass *cc = CPU_GET_CLASS(cpu);
+    int cpu_index = cc->get_arch_id(cpu);
+
+    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
+        vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
+    }
+#if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
+    register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
+                    cpu_save, cpu_load, cpu->env_ptr);
+    assert(cc->vmsd == NULL);
+    assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
+#endif
+    if (cc->vmsd != NULL) {
+        vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
+    }
+}
+
 void cpu_exec_init(CPUArchState *env)
 {
     CPUState *cpu = ENV_GET_CPU(env);
-    CPUClass *cc = CPU_GET_CLASS(cpu);
     CPUState *some_cpu;
     int cpu_index;
 
@@ -494,18 +512,6 @@ void cpu_exec_init(CPUArchState *env)
 #if defined(CONFIG_USER_ONLY)
     cpu_list_unlock();
 #endif
-    if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
-        vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
-    }
-#if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
-    register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
-                    cpu_save, cpu_load, env);
-    assert(cc->vmsd == NULL);
-    assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
-#endif
-    if (cc->vmsd != NULL) {
-        vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
-    }
 }
 
 #if defined(TARGET_HAS_ICE)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index ce3d903..029f67d 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -345,7 +345,7 @@ static int apic_dispatch_post_load(void *opaque, int version_id)
     return 0;
 }
 
-static const VMStateDescription vmstate_apic_common = {
+const VMStateDescription vmstate_apic_common = {
     .name = "apic",
     .version_id = 3,
     .minimum_version_id = 3,
@@ -391,7 +391,6 @@ static void apic_common_class_init(ObjectClass *klass, void *data)
     ICCDeviceClass *idc = ICC_DEVICE_CLASS(klass);
     DeviceClass *dc = DEVICE_CLASS(klass);
 
-    dc->vmsd = &vmstate_apic_common;
     dc->reset = apic_reset_common;
     dc->props = apic_properties_common;
     idc->realize = apic_common_realize;
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 83e2a42..8a645cf 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -23,6 +23,7 @@
 #include "exec/memory.h"
 #include "hw/cpu/icc_bus.h"
 #include "qemu/timer.h"
+#include "migration/vmstate.h"
 
 /* APIC Local Vector Table */
 #define APIC_LVT_TIMER 0
@@ -136,7 +137,7 @@ typedef struct VAPICState {
 } QEMU_PACKED VAPICState;
 
 extern bool apic_report_tpr_access;
-
+extern const VMStateDescription vmstate_apic_common;
 void apic_report_irq_delivered(int delivered);
 bool apic_next_timer(APICCommonState *s, int64_t current_time);
 void apic_enable_tpr_access_reporting(DeviceState *d, bool enable);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 4b352a2..87eecd2 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -548,6 +548,8 @@ void cpu_interrupt(CPUState *cpu, int mask);
 
 #endif /* USER_ONLY */
 
+void cpu_vmstate_register(CPUState *cpu);
+
 #ifdef CONFIG_SOFTMMU
 static inline void cpu_unassigned_access(CPUState *cpu, hwaddr addr,
                                          bool is_write, bool is_exec,
diff --git a/qom/cpu.c b/qom/cpu.c
index fada2d4..5158343 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -296,6 +296,8 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
     CPUState *cpu = CPU(dev);
 
+    cpu_vmstate_register(cpu);
+
     if (dev->hotplugged) {
         cpu_synchronize_post_init(cpu);
         notifier_list_notify(&cpu_added_notifiers, dev);
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 8983457..10f6d53 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2554,13 +2554,19 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
 
 static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
 {
-    if (cpu->apic_state == NULL) {
+    DeviceState *apic_state = cpu->apic_state;
+    CPUClass *cc = CPU_GET_CLASS(CPU(cpu));
+
+    if (apic_state == NULL) {
         return;
     }
 
-    if
[Qemu-devel] [RFC PATCH 3/3] cpu: add device_add foo-x86_64-cpu support
From: Chen Fan chen.fan.f...@cn.fujitsu.com

Add support for device_add foo-x86_64-cpu. Additional APIC ID checks for duplicates are added to x86_cpuid_set_apic_id() and x86_cpu_apic_create(). In addition, to support device/device_add foo-x86_64-cpu without a specified APIC ID, add a new function get_free_apic_id() that provides the first free APIC ID each time, avoiding duplicate APIC IDs.

Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 include/qom/cpu.h      |  1 +
 qdev-monitor.c         |  1 +
 target-i386/cpu.c      | 64 +++-
 target-i386/topology.h | 18 +
 4 files changed, 83 insertions(+), 1 deletions(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 87eecd2..87bd652 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -291,6 +291,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
 #define CPU_FOREACH_SAFE(cpu, next_cpu) \
     QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
diff --git a/qdev-monitor.c b/qdev-monitor.c
index f87f3d8..48327c8 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -24,6 +24,7 @@
 #include "qmp-commands.h"
 #include "sysemu/arch_init.h"
 #include "qemu/config-file.h"
+#include "qom/object_interfaces.h"
 
 /*
  * Aliases were a bad idea from the start. Let's keep them
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 10f6d53..b058b70 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -49,6 +49,7 @@
 #include "hw/i386/apic_internal.h"
 #endif
 
+#include "qom/object_interfaces.h"
 
 /* Cache topology CPUID constants: */
 
@@ -1550,6 +1551,7 @@ static void x86_cpuid_set_apic_id(Object *obj, Visitor *v, void *opaque,
     const int64_t max = UINT32_MAX;
     Error *error = NULL;
     int64_t value;
+    X86CPUTopoInfo topo;
 
     if (dev->realized) {
         error_setg(errp, "Attempt to set property '%s' on '%s' after "
@@ -1569,10 +1571,24 @@ static void x86_cpuid_set_apic_id(Object *obj, Visitor *v, void *opaque,
         return;
     }
 
+    if (value > x86_cpu_apic_id_from_index(max_cpus - 1)) {
+        error_setg(errp, "CPU with APIC ID %" PRIi64
+                   " is more than MAX APIC ID limits", value);
+        return;
+    }
+
+    x86_topo_ids_from_apic_id(smp_cores, smp_threads, value, &topo);
+    if (topo.smt_id >= smp_threads || topo.core_id >= smp_cores) {
+        error_setg(errp, "CPU with APIC ID %" PRIi64 " does not match "
+                   "topology configuration.", value);
+        return;
+    }
+
     if ((value != cpu->env.cpuid_apic_id) && cpu_exists(value)) {
         error_setg(errp, "CPU with APIC ID %" PRIi64 " exists", value);
         return;
     }
+
     cpu->env.cpuid_apic_id = value;
 }
 
@@ -1994,12 +2010,22 @@ out:
     return cpu;
 }
 
+static void x86_cpu_cpudef_instance_init(Object *obj)
+{
+    DeviceState *dev = DEVICE(obj);
+
+    dev->hotplugged = true;
+}
+
 static void x86_cpu_cpudef_class_init(ObjectClass *oc, void *data)
 {
     X86CPUDefinition *cpudef = data;
     X86CPUClass *xcc = X86_CPU_CLASS(oc);
+    DeviceClass *dc = DEVICE_CLASS(oc);
 
     xcc->cpu_def = cpudef;
+
+    dc->cannot_instantiate_with_device_add_yet = false;
 }
 
 static void x86_register_cpudef_type(X86CPUDefinition *def)
@@ -2008,6 +2034,8 @@ static void x86_register_cpudef_type(X86CPUDefinition *def)
     TypeInfo ti = {
         .name = typename,
         .parent = TYPE_X86_CPU,
+        .instance_size = sizeof(X86CPU),
+        .instance_init = x86_cpu_cpudef_instance_init,
         .class_init = x86_cpu_cpudef_class_init,
         .class_data = def,
     };
@@ -2544,8 +2572,17 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
         return;
     }
 
+    if (env->cpuid_apic_id > x86_cpu_apic_id_from_index(max_cpus - 1)) {
+        error_setg(errp, "CPU with APIC ID %" PRIi32
+                   " is more than MAX APIC ID:%" PRIi32,
+                   env->cpuid_apic_id,
+                   x86_cpu_apic_id_from_index(max_cpus - 1));
+        return;
+    }
+
     object_property_add_child(OBJECT(cpu), "apic",
                               OBJECT(cpu->apic_state), NULL);
+
     qdev_prop_set_uint8(cpu->apic_state, "id", env->cpuid_apic_id);
     /* TODO: convert to link */
     apic = APIC_COMMON(cpu->apic_state);
@@ -2681,6 +2718,21 @@ uint32_t x86_cpu_apic_id_from_index(unsigned int cpu_index)
     }
 }
 
+static uint32_t get_free_apic_id(void)
+{
+    int i;
+
+    for (i = 0; i < max_cpus; i++) {
+        uint32_t id = x86_cpu_apic_id_from_index(i);
+
+        if (!cpu_exists(id)) {
+            return id;
+        }
+    }
+
+    return x86_cpu_apic_id_from_index(max_cpus);
+}
+
 static void x86_cpu_initfn(Object *obj)
 {
     CPUState *cs = CPU(obj);
@@ -2688,7 +2740,9 @@ static void x86_cpu_initfn(Object
Re: [Qemu-devel] [RFC PATCH 1/3] cpu: introduce CpuTopoInfo structure for argument simplification
Correct the author.

From: Chen Fan chen.fan.f...@cn.fujitsu.com

On 06/27/2014 06:03 PM, Gu Zheng wrote:
> Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
> Reviewed-by: Eduardo Habkost ehabk...@redhat.com
> Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
> ---
>  target-i386/topology.h | 33 +
>  1 files changed, 17 insertions(+), 16 deletions(-)
[...]
Re: [Qemu-devel] [PATCH qom v2 2/4] hw: Fix qemu_allocate_irqs() leaks
On Fri, Jun 27, 2014 at 7:45 PM, Andreas Färber afaer...@suse.de wrote:
> On 18.06.2014 09:55, Peter Crosthwaite wrote:
> > From: Andreas Färber afaer...@suse.de
> >
> > Replace qemu_allocate_irqs(foo, bar, 1)[0]
> > with qemu_allocate_irq(foo, bar, 0).
> >
> > This avoids leaking the dereferenced qemu_irq *.
> >
> > Cc: Kirill Batuzov batuz...@ispras.ru
> > Cc: Markus Armbruster arm...@redhat.com
> > Cc: Peter Maydell peter.mayd...@linaro.org
> > Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
> > Reviewed-by: Peter Maydell peter.mayd...@linaro.org
> > Signed-off-by: Andreas Färber afaer...@suse.de
> > [PC Changes:
> >  * Applied change to instance in sh4/sh7750.c
> > ]
> > Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
> > ---
> > Changed since 1:
> > Applied change to instance in sh4/sh7750.c (Kirill review)
> [...]
> > diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
> > index 4a39357..9ccd770 100644
> > --- a/hw/sh4/sh7750.c
> > +++ b/hw/sh4/sh7750.c
> > @@ -838,6 +838,5 @@ SH7750State *sh7750_init(SuperHCPU *cpu, MemoryRegion *sysmem)
> >  qemu_irq sh7750_irl(SH7750State *s)
> >  {
> >      sh_intc_toggle_source(sh_intc_source(&s->intc, IRL), 1, 0); /* enable */
> > -    return qemu_allocate_irqs(sh_intc_set_irl, sh_intc_source(&s->intc, IRL),
> > -                              1)[0];
> > +    return qemu_allocate_irq(sh_intc_set_irl, sh_intc_source(&s->intc, IRL), 1);
>
> Thanks for catching this, my grep expression failed due to the line break.
>
> But shouldn't this be 0 due to the zero-based index, as per my commit message? Will fix up unless I hear objections.

Yep, sorry.

Regards,
Peter

> Regards,
> Andreas
>
> >  }
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] [PATCH v2] docs/multiple-iothreads.txt: add documentation on IOThread programming
This document explains how IOThreads and the main loop are related, especially how to write code that can run in an IOThread. Currently only virtio-blk-data-plane uses these techniques. The next obvious target is virtio-scsi; there has also been work on virtio-net.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
v2:
 * Mention AioContext file descriptor monitoring is POSIX host only [Paolo]
 * Add note that block layer code must use AioContext APIs [Paolo]
 * Add copyright and license header [Eric]
 * Add missing comma [Eric]
 * Fix s/on/only/ typo in commit description [Fam]

 docs/multiple-iothreads.txt | 134
 1 file changed, 134 insertions(+)
 create mode 100644 docs/multiple-iothreads.txt

diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
new file mode 100644
index 0000000..01d2491
--- /dev/null
+++ b/docs/multiple-iothreads.txt
@@ -0,0 +1,134 @@
+Copyright (c) 2014 Red Hat Inc.
+
+This work is licensed under the terms of the GNU GPL, version 2. See
+the COPYING file in the top-level directory.
+
+
+This document explains the IOThread feature and how to write code that runs
+outside the QEMU global mutex.
+
+The main loop and IOThreads
+---------------------------
+QEMU is an event-driven program that can do several things at once using an
+event loop. The VNC server and the QMP monitor are both processed from the
+same event loop, which monitors their file descriptors until they become
+readable and then invokes a callback.
+
+The default event loop is called the main loop (see main-loop.c). It is
+possible to create additional event loop threads using -object
+iothread,id=my-iothread.
+
+Side note: The main loop and IOThread are both event loops but their code is
+not shared completely. Sometimes it is useful to remember that although they
+are conceptually similar they are currently not interchangeable.
+
+Why IOThreads are useful
+------------------------
+IOThreads allow the user to control the placement of work. The main loop is a
+scalability bottleneck on hosts with many CPUs. Work can be spread across
+several IOThreads instead of just one main loop. When set up correctly this
+can improve I/O latency and reduce jitter seen by the guest.
+
+The main loop is also deeply associated with the QEMU global mutex, which is a
+scalability bottleneck in itself. vCPU threads and the main loop use the QEMU
+global mutex to serialize execution of QEMU code. This mutex is necessary
+because a lot of QEMU's code historically was not thread-safe.
+
+The fact that all I/O processing is done in a single main loop and that the
+QEMU global mutex is contended by all vCPU threads and the main loop explain
+why it is desirable to place work into IOThreads.
+
+The experimental virtio-blk data-plane implementation has been benchmarked and
+shows these effects:
+ftp://public.dhe.ibm.com/linux/pdfs/KVM_Virtualized_IO_Performance_Paper.pdf
+
+How to program for IOThreads
+----------------------------
+The main difference between legacy code and new code that can run in an
+IOThread is dealing explicitly with the event loop object, AioContext
+(see include/block/aio.h). Code that only works in the main loop
+implicitly uses the main loop's AioContext. Code that supports running
+in IOThreads must be aware of its AioContext.
+
+AioContext supports the following services:
+ * File descriptor monitoring (read/write/error on POSIX hosts)
+ * Event notifiers (inter-thread signalling)
+ * Timers
+ * Bottom Halves (BH) deferred callbacks
+
+There are several old APIs that use the main loop AioContext:
+ * LEGACY qemu_aio_set_fd_handler() - monitor a file descriptor
+ * LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
+ * LEGACY timer_new_ms() - create a timer
+ * LEGACY qemu_bh_new() - create a BH
+ * LEGACY qemu_aio_wait() - run an event loop iteration
+
+Since they implicitly work on the main loop they cannot be used in code that
+runs in an IOThread. They might cause a crash or deadlock if called from an
+IOThread since the QEMU global mutex is not held.
+
+Instead, use the AioContext functions directly (see include/block/aio.h):
+ * aio_set_fd_handler() - monitor a file descriptor
+ * aio_set_event_notifier() - monitor an event notifier
+ * aio_timer_new() - create a timer
+ * aio_bh_new() - create a BH
+ * aio_poll() - run an event loop iteration
+
+The AioContext can be obtained from the IOThread using
+iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
+Code that takes an AioContext argument works both in IOThreads or the main
+loop, depending on which AioContext instance the caller passes in.
+
+How to synchronize with an IOThread
+-----------------------------------
+AioContext is not thread-safe so some rules must be followed when using file
+descriptors, event notifiers, timers, or BHs across threads:
+
+1. AioContext functions can be called safely from file descriptor, event
+notifier,
Re: [Qemu-devel] [PATCH v1 1/1] char: cadence_uart: Convert to realize()
On 27 June 2014 01:11, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Tue, Jun 24, 2014 at 4:06 PM, Alistair Francis alistair.fran...@xilinx.com wrote: SysBusDevice::init is deprecated. Convert to Object::init and Device::realize as prescribed by QOM conventions. Signed-off-by: Alistair Francis alistair.fran...@xilinx.com Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com CC Peter for target-arm. I think at this point given we're quite close to hardfreeze I'd prefer not to take this, since it's just cleanup. thanks -- PMM
Re: [Qemu-devel] [PATCH v2] hw/net/eepro100: Implement read-only bits in MDI registers
On Mon, Jun 09, 2014 at 04:03:08PM +0100, Peter Maydell wrote:

Although we defined an eepro100_mdi_mask[] array indicating which bits in the registers are read-only, we weren't actually doing anything with it. Make the MDI register-write code use it rather than manually making register 1 read-only and leaving the rest as reads-as-written. (The special-case handling of register 0 remains as before since its mask is all-zeros and the special casing happens before we apply the masking.)

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Message-id: 1402159924-13853-1-git-send-email-peter.mayd...@linaro.org
---
No code change, but I fixed the errors in the commit message.

 hw/net/eepro100.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan
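For readers following along, the read-only masking described in the commit message can be sketched roughly as below. This is only an illustration under assumptions: masked_write() and masked_write_selfcheck() are hypothetical names, the register values are made up, and none of it is taken from hw/net/eepro100.c. It assumes, as the commit message states, that a set bit in the mask marks a read-only bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from hw/net/eepro100.c): bits set in
 * ro_mask are read-only and keep their previous value; every other bit
 * takes the newly written value. */
static uint16_t masked_write(uint16_t old, uint16_t val, uint16_t ro_mask)
{
    return (uint16_t)((old & ro_mask) | (val & ~ro_mask));
}

/* Small usage check with made-up register contents. */
static void masked_write_selfcheck(void)
{
    /* Fully read-only register: the write is ignored. */
    assert(masked_write(0x7809, 0x1234, 0xffff) == 0x7809);
    /* All-zero mask: the register reads back as written. */
    assert(masked_write(0x0000, 0xabcd, 0x0000) == 0xabcd);
}
```

With an all-zero mask the helper degenerates to a plain store, which is consistent with the note that register 0 needs its special casing to happen before any masking is applied.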
Re: [Qemu-devel] [PATCH 0/5] Platform device support
Am 26.06.2014 14:01, schrieb Alexander Graf: On 20.06.14 08:43, Peter Crosthwaite wrote: On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote: Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device. Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now. What you are doing seem to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine specifics wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of 1: and interface that can be added to sysbus devices 2: a new abstraction that inherits from SYS_BUS_DEVICE 3: just new features to the sysbus core. Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest. The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is that it exposes qdev properties to give mapping code hints where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. Whether they actually do get mapped and the generation of device tree chunks still stays in the the machine file's court. 
As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and please creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is. The pure QOM approach would be link properties instead of a bus, but then the machine needs to know how many slots there shall be in advance. Note that the docking procedure is always initiated from the realizing device, whether bus or no bus. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] Reverse execution and deterministic replay
-Original Message-
From: Frederic Konrad [mailto:fred.kon...@greensocs.com]
Sent: Friday, June 27, 2014 11:48 AM
To: Pavel Dovgaluk
Cc: Peter Crosthwaite; Paolo Bonzini; qemu-devel@nongnu.org Developers; Mark Burton
Subject: Re: [Qemu-devel] Reverse execution and deterministic replay

On 27/06/2014 08:11, Peter Crosthwaite wrote:

Hi Pavel,

On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:

Hello!

We want to publish a set of patches related to the reverse execution and deterministic replay of qemu. Our implementation of deterministic replay can be used for deterministic and reverse debugging of guest code through the gdb remote interface.

Execution recording writes a log of non-deterministic events, which can later be used to replay the execution anywhere and an unlimited number of times. It also supports checkpointing for faster rewinding during reverse debugging. Execution replaying reads the log and replays all non-deterministic events including external input, hardware clocks, and interrupts.

Reverse execution has the following features:
* Deterministically replays whole system execution and all contents of the memory, state of the hardware devices, clocks, and screen of the VM.
* Writes execution log into the file for later replaying multiple times on different machines.
* Supports i386, x86_64, and ARM hardware platforms.
* Performs deterministic replay of all operations with keyboard, mouse, network adapters, audio devices, serial interfaces, and physical USB devices connected to the emulator.
* Provides support for gdb reverse debugging commands like reverse-step and reverse-continue.
* Supports auto-checkpointing for convenient reverse debugging.
* Allows going to the live execution from the replay mode.

Our implementation is completely tested for qemu 1.5 and is in beta state for 2.0.50.
Some details about our implementation of reverse execution can be found in paper: http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html

Add relevant implementation details to the git commit messages.

Can anyone review our patches?

Fred Konrad is doing a series on reverse exe at the moment. CC. Is this an independent implementation of the same thing or are you building on it?

Hi, Yes, it seems we are doing the same thing, only we use icount as an instruction counter and you created a new instruction counter?

Yes, we created a new instruction-accurate counter.

This has the advantage of working everywhere icount works, but the disadvantage of having to use icount for reverse execution.

The major disadvantage of icount is that it's updated only on TB boundaries. When an instruction in the middle of a block uses the virtual clock, it can see different values depending on how the code is divided into TBs. E.g. you can stop the execution using the debugger in the middle of the block. This leads to the creation of a new block starting from the next instruction (which previously was in the middle of the TB). Reading the virtual clock from this instruction can then give you different values.

I think we can use both ways, so that reverse execution will work on other architectures once an instruction counter is added to them. I'm sure your patches will add to our solution and I can review your patches when you send them. It would help if you rebased them on the patch set that is currently on the list: [RFC PATCH v5 00/13] Reverse execution, which I sent two days ago.

We do not use icount at all. We record virtual time into the replay log instead. But we implemented an icount-like feature, which computes the values of the virtual clock and TSC using our internal instruction counter.

Thanks, Fred

I suggest posting a full RFC; this looks to me just like a cover letter without a series. Note that we are going into hard freeze imminently so there will be some delay for merge.

Regards, Peter

Pavel Dovgaluk

Pavel Dovgaluk
[Qemu-devel] [PATCH] ui/vnc: avoid memory corruption if width % VNC_DIRTY_PIXELS_PER_BIT != 0
During a resolution change in Windows 7 it sometimes happens that Windows switches to an intermediate resolution where server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface). The memory corruption occurs where the guest fb is copied to the server fb. It could be easily fixed by truncating cmp_bytes in vnc_refresh_server_surface. But looking at the code, it seems that none of the encoders called in vnc_send_framebuffer_update really cares about w > pixman_image_get_width(vd->server).

This patch therefore removes all DIV_ROUND_UPs for now to avoid corruption or illegal reads. I think there are almost no real resolutions out there where width % 16 != 0. If we do find some, we might need to either decrease VNC_DIRTY_PIXELS_PER_BIT or make it dynamic depending on the resolution.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven p...@kamp.de
---
 ui/vnc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 14a86c3..9e37d47 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -577,7 +577,7 @@ void *vnc_server_fb_ptr(VncDisplay *vd, int x, int y)
         memset(bitmap, 0x00, sizeof(bitmap));\
         for (y = 0; y < h; y++) {\
             bitmap_set(bitmap[y], 0,\
-                       DIV_ROUND_UP(w, VNC_DIRTY_PIXELS_PER_BIT));\
+                       w / VNC_DIRTY_PIXELS_PER_BIT);\
         } \
     }
 
@@ -2738,7 +2738,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
         }
         guest_ptr += x * cmp_bytes;
 
-        for (; x < DIV_ROUND_UP(width, VNC_DIRTY_PIXELS_PER_BIT);
+        for (; x < width / VNC_DIRTY_PIXELS_PER_BIT;
              x++, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
             if (!test_and_clear_bit(x, vd->guest.dirty[y])) {
                 continue;
-- 
1.7.9.5
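The arithmetic behind the patch above can be checked with a small standalone sketch. It is not code from ui/vnc.c: DIRTY_PIXELS_PER_BIT is a stand-in for VNC_DIRTY_PIXELS_PER_BIT and is assumed to be 16 (as the "width % 16" remark suggests), the two helper names are hypothetical, and the width 1366 is just an illustrative value that is not a multiple of 16.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed value, standing in for VNC_DIRTY_PIXELS_PER_BIT. */
#define DIRTY_PIXELS_PER_BIT 16

/* Pixels covered per row when the number of dirty bits is rounded up
 * and every bit is processed as a full 16-pixel chunk. */
static int pixels_covered_round_up(int width)
{
    return DIV_ROUND_UP(width, DIRTY_PIXELS_PER_BIT) * DIRTY_PIXELS_PER_BIT;
}

/* Pixels covered per row when the number of dirty bits is truncated,
 * as in the patch. */
static int pixels_covered_truncated(int width)
{
    return (width / DIRTY_PIXELS_PER_BIT) * DIRTY_PIXELS_PER_BIT;
}

/* Usage check for a width that is not a multiple of 16. */
static void coverage_selfcheck(void)
{
    assert(pixels_covered_round_up(1366) == 1376);  /* 10 pixels past the row */
    assert(pixels_covered_truncated(1366) == 1360); /* stays inside the row */
}
```

Under these assumptions, rounding up processes 10 pixels beyond the end of a 1366-pixel row, while truncation stays within the row but leaves the trailing width % 16 pixels (6 here) outside the dirty tracking. That trade-off appears to be what the closing remark about decreasing VNC_DIRTY_PIXELS_PER_BIT refers to.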
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote: The major disadvantage of icount is that it's updated only on TB boundaries. When one instruction in the middle of the block uses virtual clock, it could have different values for different divisions of the code to TB. This is only true if the instruction is incorrectly not marked as being I/O. The idea behind icount is that in general we update it on TB boundaries (it's much faster than doing it once per insn) but for those places which do turn out to need an exact icount we then retranslate the block to get the instruction-to-icount-adjustment mapping. It wouldn't surprise me if this turned out to have some bugs in corner cases, but fixing these issues seems to me like a much better design than ignoring icount completely and reimplementing a second instruction counter. thanks -- PMM
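To make the disagreement above concrete, here is a toy model of a counter that is advanced once per translation block (TB). It is a sketch under stated assumptions, not QEMU's icount code: read_at(), read_single_tb() and read_split_tb() are hypothetical names, and the "exact" mode simply subtracts the instructions of the current TB that have not run yet, which is one way to picture the per-instruction adjustment Peter mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: run TBs of the given lengths until the instruction with
 * global index `target` (0-based) executes, and return the counter value
 * that instruction would read. The counter is advanced by the whole TB
 * length on TB entry. With exact != 0, the instructions of the current TB
 * that have not executed yet are subtracted again. */
static uint64_t read_at(const unsigned *tb_lens, unsigned n_tbs,
                        unsigned target, int exact)
{
    uint64_t icount = 0;
    unsigned first = 0;              /* global index of the TB's first insn */

    for (unsigned i = 0; i < n_tbs; i++) {
        unsigned len = tb_lens[i];

        icount += len;               /* whole TB accounted for on entry */
        if (target < first + len) {
            unsigned not_yet_run = first + len - 1 - target;
            return exact ? icount - not_yet_run : icount;
        }
        first += len;
    }
    assert(!"target instruction not covered by the TBs");
    return 0;
}

/* Same 8 instructions, clock read by the 6th one (index 5):
 * once as a single TB, once split so that the read ends the first TB. */
static uint64_t read_single_tb(int exact)
{
    static const unsigned tbs[] = { 8 };
    return read_at(tbs, 1, 5, exact);
}

static uint64_t read_split_tb(int exact)
{
    static const unsigned tbs[] = { 6, 2 };
    return read_at(tbs, 2, 5, exact);
}
```

In this model the coarse reading depends on how the code is cut into TBs (8 versus 6), which is the effect Pavel describes, while the adjusted reading is 6 in both cases, which is the behaviour Peter expects once the instruction is handled as I/O.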
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote: Am 26.06.2014 14:01, schrieb Alexander Graf: On 20.06.14 08:43, Peter Crosthwaite wrote: On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote: Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device. Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now. What you are doing seem to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine specifics wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of 1: and interface that can be added to sysbus devices 2: a new abstraction that inherits from SYS_BUS_DEVICE 3: just new features to the sysbus core. Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest. The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is that it exposes qdev properties to give mapping code hints where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. 
Whether they actually do get mapped and the generation of device tree chunks still stays in the machine file's court.

As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and please creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is.

Do you mean address_space_memory (as used by sysbus_mmio_map)? We all hate that global singleton, but can we decouple it from sysbus, which is not the root cause of that problem? sysbus_mmio_map usages just need to be replaced with sysbus_mmio_get_region and you can create whatever hierarchy you want using unchanged sysbus devices. Even if we phase out the global singleton and the SysBus bus, the sysbus device abstraction is still sound and should be usable busless. Then there's no need for a tree-wide change to implement Alex's feature for all devs (assuming his plugger can be made to work hintless?).

Regards, Peter

The pure QOM approach would be link properties instead of a bus, but then the machine needs to know how many slots there shall be in advance. Note that the docking procedure is always initiated from the realizing device, whether bus or no bus.

Regards, Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote: The major disadvantage of icount is that it's updated only on TB boundaries. When one instruction in the middle of the block uses virtual clock, it could have different values for different divisions of the code to TB. This is only true if the instruction is incorrectly not marked as being I/O. The idea behind icount is that in general we update it on TB boundaries (it's much faster than doing it once per insn) but for those places which do turn out to need an exact icount we then retranslate the block to get the instruction-to-icount-adjustment mapping. I see. But if we want virtual clock in real mode then we still should create new timer (based on icount code). It wouldn't surprise me if this turned out to have some bugs in corner cases, but fixing these issues seems to me like a much better design than ignoring icount completely and reimplementing a second instruction counter. When we started an implementation, we didn't have enough resources to fix all such bugs. That is why we selected such conservative approach. But I believe that in future we will adopt the icount for replay purposes. Pavel Dovgaluk
Re: [Qemu-devel] [PATCH 0/5] Platform device support
Am 27.06.2014 12:54, schrieb Peter Crosthwaite: On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote: Am 26.06.2014 14:01, schrieb Alexander Graf: On 20.06.14 08:43, Peter Crosthwaite wrote: On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote: Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device. Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now. What you are doing seem to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine specifics wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of 1: and interface that can be added to sysbus devices 2: a new abstraction that inherits from SYS_BUS_DEVICE 3: just new features to the sysbus core. Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest. The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is that it exposes qdev properties to give mapping code hints where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. 
Whether they actually do get mapped and the generation of device tree chunks still stays in the the machine file's court. As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and please creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is. Do you mean address_space_memory (as used by sysbus_mmio_map)? No, I mean the QOM composition model. When we think of using -device, then they will go to /machine/peripheral/id or /machine/peripheral-anon/device[n]; in your case that means that you get a flat list of devices rather than a structure matching your device tree. And like I said above, in both your and Alex' case SysBus is something that has no real place in the composition tree unless we go from that single unholy qdev-required bus to buses as they exist in the hardware, like Anthony suggested long time ago. Alex' problem with that is that he doesn't want to implement the same UART logic for 50 different-but-same buses, so some form of reuse or inheritance would be needed. Disclaimer: I have not yet reviewed this series, I was commenting on abstract ideas that Alex requested feedback for. Cheers, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On 27.06.14 13:17, Andreas Färber wrote: Am 27.06.2014 12:54, schrieb Peter Crosthwaite: On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote: Am 26.06.2014 14:01, schrieb Alexander Graf: On 20.06.14 08:43, Peter Crosthwaite wrote: On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote: Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device. Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now. What you are doing seem to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine specifics wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of 1: and interface that can be added to sysbus devices 2: a new abstraction that inherits from SYS_BUS_DEVICE 3: just new features to the sysbus core. Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest. The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is that it exposes qdev properties to give mapping code hints where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. 
Whether they actually do get mapped and the generation of device tree chunks still stays in the the machine file's court. As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and please creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is. Do you mean address_space_memory (as used by sysbus_mmio_map)? No, I mean the QOM composition model. When we think of using -device, then they will go to /machine/peripheral/id or /machine/peripheral-anon/device[n]; in your case that means that you get a flat list of devices rather than a structure matching your device tree. And like I said above, in both your and Alex' case SysBus is something that has no real place in the composition tree unless we go from that single unholy qdev-required bus to buses as they exist in the hardware, like Anthony suggested long time ago. Alex' problem with that is that he doesn't want to implement the same UART logic for 50 different-but-same buses, so some form of reuse or inheritance would be needed. Disclaimer: I have not yet reviewed this series, I was commenting on abstract ideas that Alex requested feedback for. I think we can all agree that the sysbus bus is not a bus per se. So conceptually, what's the difference between a device attached to a non-bus and a device not attached to a bus at all? And why can't we convert sysbus to not be a bus anymore? Alex
[Qemu-devel] [PULL 02/10] pc-bios/s390-ccw: cleanup and enhance bootmap defintions
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Add declarations to describe structure of different dasd IPL sources (eckd and fba). Move the structure definitions to a new header bootmap.h. While we are at it, change structs to typedefs. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 66 +++- pc-bios/s390-ccw/bootmap.h | 254 2 files changed, 269 insertions(+), 51 deletions(-) create mode 100644 pc-bios/s390-ccw/bootmap.h diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index 753c288..c216030 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -9,6 +9,7 @@ */ #include s390-ccw.h +#include bootmap.h /* #define DEBUG_FALLBACK */ @@ -20,41 +21,6 @@ do { } while (0) #endif -struct scsi_blockptr { -uint64_t blockno; -uint16_t size; -uint16_t blockct; -uint8_t reserved[4]; -} __attribute__ ((packed)); - -struct component_entry { -struct scsi_blockptr data; -uint8_t pad[7]; -uint8_t component_type; -uint64_t load_address; -} __attribute((packed)); - -struct component_header { -uint8_t magic[4]; -uint8_t type; -uint8_t reserved[27]; -} __attribute((packed)); - -struct mbr { -uint8_t magic[4]; -uint32_t version_id; -uint8_t reserved[8]; -struct scsi_blockptr blockptr; -} __attribute__ ((packed)); - -#define ZIPL_MAGIC zIPL - -#define ZIPL_COMP_HEADER_IPL0x00 -#define ZIPL_COMP_HEADER_DUMP 0x01 - -#define ZIPL_COMP_ENTRY_LOAD0x02 -#define ZIPL_COMP_ENTRY_EXEC0x01 - /* Scratch space */ static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE))); @@ -118,8 +84,6 @@ static int zipl_magic(uint8_t *ptr) return 1; } -#define FREE_SPACE_FILLER '\xAA' - static inline bool unused_space(const void *p, unsigned int size) { int i; @@ -133,10 +97,10 @@ static inline bool unused_space(const void *p, unsigned int 
size) return true; } -static int zipl_load_segment(struct component_entry *entry) +static int zipl_load_segment(ComponentEntry *entry) { -const int max_entries = (SECTOR_SIZE / sizeof(struct scsi_blockptr)); -struct scsi_blockptr *bprs = (void *)sec; +const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr)); +ScsiBlockPtr *bprs = (void *)sec; const int bprs_size = sizeof(sec); uint64_t blockno; long address; @@ -170,7 +134,7 @@ static int zipl_load_segment(struct component_entry *entry) } if (bprs[i].blockct == 0 unused_space(bprs[i + 1], -sizeof(struct scsi_blockptr))) { +sizeof(ScsiBlockPtr))) { /* This is a continue pointer. * This ptr is the last one in the current script section. * I.e. the next ptr must point to the unused memory area. @@ -195,14 +159,14 @@ fail: } /* Run a zipl program */ -static int zipl_run(struct scsi_blockptr *pte) +static int zipl_run(ScsiBlockPtr *pte) { -struct component_header *header; -struct component_entry *entry; +ComponentHeader *header; +ComponentEntry *entry; uint8_t tmp_sec[SECTOR_SIZE]; virtio_read(pte-blockno, tmp_sec); -header = (struct component_header *)tmp_sec; +header = (ComponentHeader *)tmp_sec; if (!zipl_magic(tmp_sec)) { goto fail; @@ -215,7 +179,7 @@ static int zipl_run(struct scsi_blockptr *pte) dputs(start loading images\n); /* Load image(s) into RAM */ -entry = (struct component_entry *)(header[1]); +entry = (ComponentEntry *)(header[1]); while (entry-component_type == ZIPL_COMP_ENTRY_LOAD) { if (zipl_load_segment(entry) 0) { goto fail; @@ -244,11 +208,11 @@ fail: int zipl_load(void) { -struct mbr *mbr = (void *)sec; +ScsiMbr *mbr = (void *)sec; uint8_t *ns, *ns_end; int program_table_entries = 0; -int pte_len = sizeof(struct scsi_blockptr); -struct scsi_blockptr *prog_table_entry; +const int pte_len = sizeof(ScsiBlockPtr); +ScsiBlockPtr *prog_table_entry; const char *error = ; /* Grab the MBR */ @@ -276,7 +240,7 @@ int zipl_load(void) ns_end = sec + SECTOR_SIZE; for (ns = (sec + pte_len); (ns + pte_len) 
ns_end; ns++) { -prog_table_entry = (struct scsi_blockptr *)ns; +prog_table_entry = (ScsiBlockPtr *)ns; if (!prog_table_entry-blockno) { break; } @@ -292,7 +256,7 @@ int zipl_load(void) /* Run the default entry */ -prog_table_entry = (struct scsi_blockptr *)(sec + pte_len); +prog_table_entry = (ScsiBlockPtr *)(sec + pte_len); return zipl_run(prog_table_entry); diff --git a/pc-bios/s390-ccw/bootmap.h
[Qemu-devel] [PULL 03/10] pc-bios/s390-ccw: handle different sector sizes
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Use the virtio device's configuration to figure out the disk geometry and use a sector size based upon the layout. [CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g] Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 12 +++--- pc-bios/s390-ccw/s390-ccw.h |2 +- pc-bios/s390-ccw/virtio.c | 96 --- pc-bios/s390-ccw/virtio.h | 48 ++ 4 files changed, 147 insertions(+), 11 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index c216030..fa2ca26 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -10,6 +10,7 @@ #include s390-ccw.h #include bootmap.h +#include virtio.h /* #define DEBUG_FALLBACK */ @@ -22,7 +23,8 @@ #endif /* Scratch space */ -static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE))); +static uint8_t sec[MAX_SECTOR_SIZE] +__attribute__((__aligned__(MAX_SECTOR_SIZE))); typedef struct ResetInfo { uint32_t ipl_mask; @@ -99,7 +101,7 @@ static inline bool unused_space(const void *p, unsigned int size) static int zipl_load_segment(ComponentEntry *entry) { -const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr)); +const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr)); ScsiBlockPtr *bprs = (void *)sec; const int bprs_size = sizeof(sec); uint64_t blockno; @@ -163,7 +165,7 @@ static int zipl_run(ScsiBlockPtr *pte) { ComponentHeader *header; ComponentEntry *entry; -uint8_t tmp_sec[SECTOR_SIZE]; +uint8_t tmp_sec[MAX_SECTOR_SIZE]; virtio_read(pte-blockno, tmp_sec); header = (ComponentHeader *)tmp_sec; @@ -187,7 +189,7 @@ static int zipl_run(ScsiBlockPtr *pte) entry++; -if ((uint8_t *)(entry[1]) (tmp_sec + SECTOR_SIZE)) { +if ((uint8_t *)(entry[1]) (tmp_sec + MAX_SECTOR_SIZE)) { goto fail; } } @@ -238,7 +240,7 @@ int zipl_load(void) goto 
fail; } -ns_end = sec + SECTOR_SIZE; +ns_end = sec + virtio_get_block_size(); for (ns = (sec + pte_len); (ns + pte_len) ns_end; ns++) { prog_table_entry = (ScsiBlockPtr *)ns; if (!prog_table_entry-blockno) { diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h index fe1dd22..b6c0a5b 100644 --- a/pc-bios/s390-ccw/s390-ccw.h +++ b/pc-bios/s390-ccw/s390-ccw.h @@ -130,6 +130,6 @@ static inline void yield(void) : memory, cc); } -#define SECTOR_SIZE 512 +#define MAX_SECTOR_SIZE 4096 #endif /* S390_CCW_H */ diff --git a/pc-bios/s390-ccw/virtio.c b/pc-bios/s390-ccw/virtio.c index c845b14..31b23b0 100644 --- a/pc-bios/s390-ccw/virtio.c +++ b/pc-bios/s390-ccw/virtio.c @@ -202,7 +202,7 @@ static int vring_wait_reply(struct vring *vr, int timeout) * Virtio block * ***/ -static int virtio_read_many(ulong sector, void *load_addr, int sec_num) +int virtio_read_many(ulong sector, void *load_addr, int sec_num) { struct virtio_blk_outhdr out_hdr; u8 status; @@ -211,12 +211,12 @@ static int virtio_read_many(ulong sector, void *load_addr, int sec_num) /* Tell the host we want to read */ out_hdr.type = VIRTIO_BLK_T_IN; out_hdr.ioprio = 99; -out_hdr.sector = sector; +out_hdr.sector = virtio_sector_adjust(sector); vring_send_buf(block, out_hdr, sizeof(out_hdr), VRING_DESC_F_NEXT); /* This is where we want to receive data */ -vring_send_buf(block, load_addr, SECTOR_SIZE * sec_num, +vring_send_buf(block, load_addr, virtio_get_block_size() * sec_num, VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN | VRING_DESC_F_NEXT); @@ -244,7 +244,7 @@ unsigned long virtio_load_direct(ulong rec_list1, ulong rec_list2, int sec_len = rec_list2 48; ulong addr = (ulong)load_addr; -if (sec_len != SECTOR_SIZE) { +if (sec_len != virtio_get_block_size()) { return -1; } @@ -253,7 +253,7 @@ unsigned long virtio_load_direct(ulong rec_list1, ulong rec_list2, if (status) { virtio_panic(I/O Error); } -addr += sec_num * SECTOR_SIZE; +addr += sec_num * virtio_get_block_size(); return addr; } @@ -263,15 
+263,95 @@ int virtio_read(ulong sector, void *load_addr) return virtio_read_many(sector, load_addr, 1); } +static VirtioBlkConfig blk_cfg = {}; +static bool guessed_disk_nature; + +bool virtio_guessed_disk_nature(void) +{ +return guessed_disk_nature; +} + +void virtio_assume_scsi(void) +{ +guessed_disk_nature = true; +blk_cfg.blk_size = 512; +} + +void virtio_assume_eckd(void) +{ +guessed_disk_nature = true; +
[Qemu-devel] [PULL 00/10] for-2.1: s390-ccw bios patches
Here are some s390-ccw bios patches I'd like to see in 2.1. Being able to
finally boot from dasd is quite a useful feature.

Please consider pulling.

The following changes since commit ff4873cb8c81db89668d8b56e19e57b852edb5f5:

  coroutine-win32.c: Add noinline attribute to work around gcc bug (2014-06-26 14:08:14 +0100)

are available in the git repository at:

  git://github.com/cohuck/qemu.git tags/s390x-20140627

for you to fetch changes up to 77416f4075a673a27cfe5a7a34e93c0fa9810e35:

  pc-bios/s390-ccw: update binary (2014-06-27 12:11:53 +0200)

A series of patches to the s390-ccw bios:
- code cleanup
- improved error reporting
- most important, support to ipl (boot) from ECKD DASD (CDL, LDL or CMS
  formatted)

Eugene (jno) Dvurechenski (9):
      pc-bios/s390-ccw: make checkpatch happy
      pc-bios/s390-ccw: cleanup and enhance bootmap defintions
      pc-bios/s390-ccw: handle different sector sizes
      pc-bios/s390-ccw: add some utility code
      pc-bios/s390-ccw: Unify error handling
      pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
      pc-bios/s390-ccw: factor out ipl code
      pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
      pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD

Jens Freimann (1):
      pc-bios/s390-ccw: update binary

 pc-bios/s390-ccw.img          |  Bin 9432 -> 17624 bytes
 pc-bios/s390-ccw/bootmap.c    |  445 -
 pc-bios/s390-ccw/bootmap.h    |  344 +++
 pc-bios/s390-ccw/main.c       |   13 +-
 pc-bios/s390-ccw/s390-ccw.h   |   38 ++--
 pc-bios/s390-ccw/sclp-ascii.c |    4 +-
 pc-bios/s390-ccw/virtio.c     |  122 +--
 pc-bios/s390-ccw/virtio.h     |   50 -
 8 files changed, 837 insertions(+), 179 deletions(-)
 create mode 100644 pc-bios/s390-ccw/bootmap.h

-- 
1.7.9.5
[Qemu-devel] [PULL 04/10] pc-bios/s390-ccw: add some utility code
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com IPL_assert(term,message) is introduced to handle error conditions. ebcdic_to_ascii() to convert chars (mostly to print VOLSERs). read_block() provision for unified block-number handling. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 15 +--- pc-bios/s390-ccw/bootmap.h | 83 2 files changed, 84 insertions(+), 14 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index fa2ca26..bb8dd69 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -86,25 +86,12 @@ static int zipl_magic(uint8_t *ptr) return 1; } -static inline bool unused_space(const void *p, unsigned int size) -{ -int i; -const unsigned char *m = p; - -for (i = 0; i size; i++) { -if (m[i] != FREE_SPACE_FILLER) { -return false; -} -} -return true; -} - static int zipl_load_segment(ComponentEntry *entry) { const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr)); ScsiBlockPtr *bprs = (void *)sec; const int bprs_size = sizeof(sec); -uint64_t blockno; +block_number_t blockno; long address; int i; diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h index 59267b0..1846632 100644 --- a/pc-bios/s390-ccw/bootmap.h +++ b/pc-bios/s390-ccw/bootmap.h @@ -12,6 +12,10 @@ #define _PC_BIOS_S390_CCW_BOOTMAP_H #include s390-ccw.h +#include virtio.h + +typedef uint64_t block_number_t; +#define NULL_BLOCK_NR 0x #define FREE_SPACE_FILLER '\xAA' @@ -251,4 +255,83 @@ typedef struct IplVolumeLabel { }; } __attribute__((packed)) IplVolumeLabel; +/* utility code below */ + +static inline void IPL_assert(bool term, const char *message) +{ +if (!term) { +sclp_print(\n! 
); +sclp_print(message); +virtio_panic( !\n); /* no return */ +} +} + +static const unsigned char ebc2asc[256] = + /* 0123456789abcdef0123456789abcdef */ + /* 1F */ + /* 3F */ + ...(+|.!$*);. /* 5F first.chr.here.is.real.space */ +-/.,%_?.`:#@'=\/* 7F */ +.abcdefghi...jklmnopqr.. /* 9F */ +..stuvwxyz.. /* BF */ +.ABCDEFGHI...JKLMNOPQR.. /* DF */ +..STUVWXYZ..0123456789..;/* FF */ + +static inline void ebcdic_to_ascii(const char *src, + char *dst, + unsigned int size) +{ +unsigned int i; +for (i = 0; i size; i++) { +unsigned c = src[i]; +dst[i] = ebc2asc[c]; +} +} + +static inline void print_volser(const void *volser) +{ +char ascii[8]; + +ebcdic_to_ascii((char *)volser, ascii, 6); +ascii[6] = '\0'; +sclp_print(VOLSER=[); +sclp_print(ascii); +sclp_print(]\n); +} + +static inline bool unused_space(const void *p, size_t size) +{ +size_t i; +const unsigned char *m = p; + +for (i = 0; i size; i++) { +if (m[i] != FREE_SPACE_FILLER) { +return false; +} +} +return true; +} + +static inline bool is_null_block_number(block_number_t x) +{ +return x == NULL_BLOCK_NR; +} + +static inline void read_block(block_number_t blockno, + void *buffer, + const char *errmsg) +{ +IPL_assert(virtio_read(blockno, buffer) == 0, errmsg); +} + +static inline bool block_size_ok(uint32_t block_size) +{ +return block_size == virtio_get_block_size(); +} + +static inline bool magic_match(const void *data, const void *magic) +{ +return *((uint32_t *)data) == *((uint32_t *)magic); +} + #endif /* _PC_BIOS_S390_CCW_BOOTMAP_H */ -- 1.7.9.5
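
To illustrate the VOLSER conversion this patch adds, here is a minimal host-side sketch. It is not the patch's code: instead of the 256-entry ebc2asc lookup table it maps the EBCDIC (code page 037) letter, digit and space ranges arithmetically, and the names ebcdic_char_to_ascii() and volser_to_ascii() are made up for this example. Bytes outside those ranges are printed as '.'.

```c
#include <stddef.h>

/*
 * Map one EBCDIC (code page 037) byte to ASCII.
 * Only space, letters and digits are handled; anything else becomes '.'.
 * Hypothetical helper, not part of the s390-ccw bios.
 */
char ebcdic_char_to_ascii(unsigned char c)
{
    if (c == 0x40) {
        return ' ';
    }
    if (c >= 0x81 && c <= 0x89) {
        return (char)('a' + (c - 0x81));
    }
    if (c >= 0x91 && c <= 0x99) {
        return (char)('j' + (c - 0x91));
    }
    if (c >= 0xA2 && c <= 0xA9) {
        return (char)('s' + (c - 0xA2));
    }
    if (c >= 0xC1 && c <= 0xC9) {
        return (char)('A' + (c - 0xC1));
    }
    if (c >= 0xD1 && c <= 0xD9) {
        return (char)('J' + (c - 0xD1));
    }
    if (c >= 0xE2 && c <= 0xE9) {
        return (char)('S' + (c - 0xE2));
    }
    if (c >= 0xF0 && c <= 0xF9) {
        return (char)('0' + (c - 0xF0));
    }
    return '.';
}

/* Convert a 6-byte EBCDIC VOLSER into a NUL-terminated ASCII string. */
void volser_to_ascii(const unsigned char *volser, char out[7])
{
    size_t i;

    for (i = 0; i < 6; i++) {
        out[i] = ebcdic_char_to_ascii(volser[i]);
    }
    out[6] = '\0';
}
```

Note that the source bytes are taken as unsigned char here, so values above 0x7F index correctly on hosts where plain char is signed.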
[Qemu-devel] [PULL 05/10] pc-bios/s390-ccw: Unify error handling
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Convert to IPL_assert and friends Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 82 +++ pc-bios/s390-ccw/main.c | 13 --- pc-bios/s390-ccw/s390-ccw.h |2 +- 3 files changed, 31 insertions(+), 66 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index bb8dd69..1866a20 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -86,7 +86,7 @@ static int zipl_magic(uint8_t *ptr) return 1; } -static int zipl_load_segment(ComponentEntry *entry) +static void zipl_load_segment(ComponentEntry *entry) { const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr)); ScsiBlockPtr *bprs = (void *)sec; @@ -103,10 +103,8 @@ static int zipl_load_segment(ComponentEntry *entry) do { memset(bprs, FREE_SPACE_FILLER, bprs_size); -if (virtio_read(blockno, (uint8_t *)bprs)) { -debug_print_int(failed reading bprs at, blockno); -goto fail; -} +debug_print_int(reading bprs at, blockno); +read_block(blockno, bprs, zipl_load_segment: cannot read block); for (i = 0;; i++) { u64 *cur_desc = (void *)bprs[i]; @@ -134,21 +132,13 @@ static int zipl_load_segment(ComponentEntry *entry) } address = virtio_load_direct(cur_desc[0], cur_desc[1], 0, (void *)address); -if (address == -1) { -goto fail; -} +IPL_assert(address != -1, zipl_load_segment: wrong IPL address); } } while (blockno); - -return 0; - -fail: -sclp_print(failed loading segment\n); -return -1; } /* Run a zipl program */ -static int zipl_run(ScsiBlockPtr *pte) +static void zipl_run(ScsiBlockPtr *pte) { ComponentHeader *header; ComponentEntry *entry; @@ -157,75 +147,53 @@ static int zipl_run(ScsiBlockPtr *pte) virtio_read(pte-blockno, tmp_sec); header = (ComponentHeader *)tmp_sec; -if (!zipl_magic(tmp_sec)) { -goto fail; -} 
+IPL_assert(zipl_magic(tmp_sec), zipl_run: zipl_magic); -if (header-type != ZIPL_COMP_HEADER_IPL) { -goto fail; -} +IPL_assert(header-type == ZIPL_COMP_HEADER_IPL, + zipl_run: wrong header type); dputs(start loading images\n); /* Load image(s) into RAM */ entry = (ComponentEntry *)(header[1]); while (entry-component_type == ZIPL_COMP_ENTRY_LOAD) { -if (zipl_load_segment(entry) 0) { -goto fail; -} +zipl_load_segment(entry); entry++; -if ((uint8_t *)(entry[1]) (tmp_sec + MAX_SECTOR_SIZE)) { -goto fail; -} +IPL_assert((uint8_t *)(entry[1]) = (tmp_sec + MAX_SECTOR_SIZE), + zipl_run: wrong entry size); } -if (entry-component_type != ZIPL_COMP_ENTRY_EXEC) { -goto fail; -} +IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC, + zipl_run: no EXEC entry); /* should not return */ jump_to_IPL_code(entry-load_address); - -return 0; - -fail: -sclp_print(failed running zipl\n); -return -1; } -int zipl_load(void) +void zipl_load(void) { ScsiMbr *mbr = (void *)sec; uint8_t *ns, *ns_end; int program_table_entries = 0; const int pte_len = sizeof(ScsiBlockPtr); ScsiBlockPtr *prog_table_entry; -const char *error = ; /* Grab the MBR */ -virtio_read(0, (void *)mbr); +read_block(0, mbr, zipl_load: cannot read block 0); dputs(checking magic\n); -if (!zipl_magic(mbr-magic)) { -error = zipl_magic 1; -goto fail; -} +IPL_assert(zipl_magic(mbr-magic), zipl_load: zipl_magic 1); debug_print_int(program table, mbr-blockptr.blockno); /* Parse the program table */ -if (virtio_read(mbr-blockptr.blockno, sec)) { -error = virtio_read; -goto fail; -} +read_block(mbr-blockptr.blockno, sec, + zipl_load: cannot read program table); -if (!zipl_magic(sec)) { -error = zipl_magic 2; -goto fail; -} +IPL_assert(zipl_magic(sec), zipl_load: zipl_magic 2); ns_end = sec + virtio_get_block_size(); for (ns = (sec + pte_len); (ns + pte_len) ns_end; ns++) { @@ -239,19 +207,11 @@ int zipl_load(void) debug_print_int(program table entries, program_table_entries); -if (!program_table_entries) { -goto fail; -} 
+IPL_assert(program_table_entries, zipl_load: no program table); /* Run the default entry */ prog_table_entry = (ScsiBlockPtr *)(sec + pte_len); -return
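
The conversion in this patch replaces "return -1 / goto fail" chains with assertions that never return on failure. The following host-side sketch shows the same shape under stated assumptions: sketch_panic() uses longjmp() as a stand-in for the bios halting the machine, and the header layout (4 magic bytes followed by a type byte) as well as all function names are hypothetical.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static jmp_buf panic_env;          /* stand-in for "halt the machine" */
static const char *panic_reason;   /* last message passed to sketch_panic() */

/* Does not return to the caller, like virtio_panic() in the bios. */
void sketch_panic(const char *message)
{
    panic_reason = message;
    longjmp(panic_env, 1);
}

/* Same contract as IPL_assert(): continue only if the condition holds. */
void sketch_assert(bool term, const char *message)
{
    if (!term) {
        sketch_panic(message);
    }
}

/* With assertions, the checker needs neither a status return nor a fail label. */
void check_header(const uint8_t *sector, size_t len)
{
    sketch_assert(len >= 5, "header too short");
    sketch_assert(memcmp(sector, "zIPL", 4) == 0, "No zIPL magic");
    sketch_assert(sector[4] == 0x00, "Bad header type");
}

/* Test harness: returns NULL on success, or the panic message. */
const char *run_check(const uint8_t *sector, size_t len)
{
    panic_reason = NULL;
    if (setjmp(panic_env) == 0) {
        check_header(sector, len);
    }
    return panic_reason;
}
```

The trade-off is the one visible in the diff: callers get shorter, but every failure becomes fatal, so this style only fits code paths where no recovery is intended.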
[Qemu-devel] [PULL 08/10] pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Add code that allows us to start from ECKD DASD using the z/OS compatible disk layout (CDL), which is the most common format for ECKD DASD. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 168 1 file changed, 168 insertions(+) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index 3c08f82..beda4d6 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -74,6 +74,171 @@ static void jump_to_IPL_code(uint64_t address) } /*** + * IPL an ECKD DASD (CDL or LDL/CMS format) + */ + +static unsigned char _bprs[8*1024]; /* guessed max ECKD sector size */ +const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr); + +static bool eckd_valid_address(BootMapPointer *p) +{ +const uint64_t cylinder = p-eckd.cylinder ++ ((p-eckd.head 0xfff0) 12); +const uint64_t head = p-eckd.head 0x000f; + +if (head = virtio_get_heads() +|| p-eckd.sector virtio_get_sectors() +|| p-eckd.sector = 0) { +return false; +} + +if (!virtio_guessed_disk_nature() cylinder = virtio_get_cylinders()) { +return false; +} + +return true; +} + +static block_number_t eckd_block_num(BootMapPointer *p) +{ +const uint64_t sectors = virtio_get_sectors(); +const uint64_t heads = virtio_get_heads(); +const uint64_t cylinder = p-eckd.cylinder ++ ((p-eckd.head 0xfff0) 12); +const uint64_t head = p-eckd.head 0x000f; +const block_number_t block = sectors * heads * cylinder + + sectors * head + + p-eckd.sector + - 1; /* block nr starts with zero */ +return block; +} + +static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address) +{ +block_number_t block_nr; +int j, rc; +BootMapPointer *bprs = (void *)_bprs; +bool more_data; + +memset(_bprs, FREE_SPACE_FILLER, sizeof(_bprs)); +read_block(blk, 
bprs, BPRS read failed); + +do { +more_data = false; +for (j = 0;; j++) { +block_nr = eckd_block_num((void *)(bprs[j].xeckd)); +if (is_null_block_number(block_nr)) { /* end of chunk */ +break; +} + +/* we need the updated blockno for the next indirect entry + * in the chain, but don't want to advance address + */ +if (j == (max_bprs_entries - 1)) { +break; +} + +IPL_assert(block_size_ok(bprs[j].xeckd.bptr.size), + bad chunk block size); +IPL_assert(eckd_valid_address(bprs[j]), bad chunk ECKD addr); + +if ((bprs[j].xeckd.bptr.count == 0) unused_space((bprs[j+1]), +sizeof(EckdBlockPtr))) { +/* This is a continue pointer. + * This ptr should be the last one in the current + * script section. + * I.e. the next ptr must point to the unused memory area + */ +memset(_bprs, FREE_SPACE_FILLER, sizeof(_bprs)); +read_block(block_nr, bprs, BPRS continuation read failed); +more_data = true; +break; +} + +/* Load (count+1) blocks of code at (block_nr) + * to memory (address). + */ +rc = virtio_read_many(block_nr, (void *)(*address), + bprs[j].xeckd.bptr.count+1); +IPL_assert(rc == 0, code chunk read failed); + +*address += (bprs[j].xeckd.bptr.count+1) * virtio_get_block_size(); +} +} while (more_data); +return block_nr; +} + +static void run_eckd_boot_script(block_number_t mbr_block_nr) +{ +int i; +block_number_t block_nr; +uint64_t address; +ScsiMbr *scsi_mbr = (void *)sec; +BootMapScript *bms = (void *)sec; + +memset(sec, FREE_SPACE_FILLER, sizeof(sec)); +read_block(mbr_block_nr, sec, Cannot read MBR); + +block_nr = eckd_block_num((void *)(scsi_mbr-blockptr)); + +memset(sec, FREE_SPACE_FILLER, sizeof(sec)); +read_block(block_nr, sec, Cannot read Boot Map Script); + +for (i = 0; bms-entry[i].type == BOOT_SCRIPT_LOAD; i++) { +address = bms-entry[i].address.load_address; +block_nr = eckd_block_num((bms-entry[i].blkptr)); + +do { +block_nr = load_eckd_segments(block_nr, address); +} while (block_nr != -1); +} + +IPL_assert(bms-entry[i].type == BOOT_SCRIPT_EXEC, + Unknown script 
entry
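
The cylinder/head/sector arithmetic in eckd_block_num() can be tried out on a host with the sketch below. The operators in the archived diff were lost in extraction, so the mask (&) and shift (<<) used here are an assumption about what the patch does: the upper 12 bits of the head field extend the cylinder number, and the low 4 bits are the actual head. The struct and function names are made up for this example.

```c
#include <stdint.h>

/* Disk geometry as reported by the device (hypothetical container type). */
typedef struct {
    uint64_t heads;    /* tracks per cylinder */
    uint64_t sectors;  /* blocks per track */
} SketchGeometry;

/* An ECKD-style address; sector numbering starts at 1. */
typedef struct {
    uint16_t cylinder;
    uint16_t head;     /* low 4 bits: head, upper 12 bits: high cylinder bits */
    uint8_t sector;
} SketchEckdAddr;

/* Convert a CHS address into a zero-based linear block number. */
uint64_t sketch_eckd_block_num(const SketchGeometry *g, const SketchEckdAddr *a)
{
    const uint64_t cylinder = a->cylinder
                              + ((uint64_t)(a->head & 0xfff0) << 12);
    const uint64_t head = a->head & 0x000f;

    return g->sectors * g->heads * cylinder
           + g->sectors * head
           + a->sector
           - 1; /* block numbers start at zero */
}
```

For a geometry of 15 heads and 12 sectors per track, cylinder 1, head 2, sector 3 maps to 12 * 15 * 1 + 12 * 2 + 3 - 1 = 206.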
[Qemu-devel] [PULL 10/10] pc-bios/s390-ccw: update binary
From: Jens Freimann <jf...@linux.vnet.ibm.com>

Signed-off-by: Jens Freimann <jf...@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.h...@de.ibm.com>
---
 pc-bios/s390-ccw.img | Bin 9432 -> 17624 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/pc-bios/s390-ccw.img b/pc-bios/s390-ccw.img
index 1c7f7640fc0c5f2505c4f1114a21b3b712852dbc..603e19e003d574b24bb3b97bacda2bf38077e8fd 100644
GIT binary patch
literal 17624
[base85-encoded binary payload omitted]
[Qemu-devel] [PULL 01/10] pc-bios/s390-ccw: make checkpatch happy
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Remove tabs, tweak whitespace and comments. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c| 37 - pc-bios/s390-ccw/s390-ccw.h | 20 ++-- pc-bios/s390-ccw/sclp-ascii.c |4 ++-- pc-bios/s390-ccw/virtio.c | 26 +- pc-bios/s390-ccw/virtio.h |2 +- 5 files changed, 46 insertions(+), 43 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index 5ee3fcb..753c288 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -10,7 +10,7 @@ #include s390-ccw.h -// #define DEBUG_FALLBACK +/* #define DEBUG_FALLBACK */ #ifdef DEBUG_FALLBACK #define dputs(txt) \ @@ -47,13 +47,13 @@ struct mbr { struct scsi_blockptr blockptr; } __attribute__ ((packed)); -#define ZIPL_MAGIC zIPL +#define ZIPL_MAGIC zIPL -#define ZIPL_COMP_HEADER_IPL 0x00 -#define ZIPL_COMP_HEADER_DUMP 0x01 +#define ZIPL_COMP_HEADER_IPL0x00 +#define ZIPL_COMP_HEADER_DUMP 0x01 -#define ZIPL_COMP_ENTRY_LOAD 0x02 -#define ZIPL_COMP_ENTRY_EXEC 0x01 +#define ZIPL_COMP_ENTRY_LOAD0x02 +#define ZIPL_COMP_ENTRY_EXEC0x01 /* Scratch space */ static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE))); @@ -107,8 +107,8 @@ static void jump_to_IPL_code(uint64_t address) /* Check for ZIPL magic. Returns 0 if not matched. 
*/ static int zipl_magic(uint8_t *ptr) { -uint32_t *p = (void*)ptr; -uint32_t *z = (void*)ZIPL_MAGIC; +uint32_t *p = (void *)ptr; +uint32_t *z = (void *)ZIPL_MAGIC; if (*p != *z) { debug_print_int(invalid magic, *p); @@ -136,7 +136,7 @@ static inline bool unused_space(const void *p, unsigned int size) static int zipl_load_segment(struct component_entry *entry) { const int max_entries = (SECTOR_SIZE / sizeof(struct scsi_blockptr)); -struct scsi_blockptr *bprs = (void*)sec; +struct scsi_blockptr *bprs = (void *)sec; const int bprs_size = sizeof(sec); uint64_t blockno; long address; @@ -156,16 +156,18 @@ static int zipl_load_segment(struct component_entry *entry) } for (i = 0;; i++) { -u64 *cur_desc = (void*)bprs[i]; +u64 *cur_desc = (void *)bprs[i]; blockno = bprs[i].blockno; -if (!blockno) +if (!blockno) { break; +} /* we need the updated blockno for the next indirect entry in the chain, but don't want to advance address */ -if (i == (max_entries - 1)) +if (i == (max_entries - 1)) { break; +} if (bprs[i].blockct == 0 unused_space(bprs[i + 1], sizeof(struct scsi_blockptr))) { @@ -178,9 +180,10 @@ static int zipl_load_segment(struct component_entry *entry) break; } address = virtio_load_direct(cur_desc[0], cur_desc[1], 0, - (void*)address); -if (address == -1) + (void *)address); +if (address == -1) { goto fail; +} } } while (blockno); @@ -220,7 +223,7 @@ static int zipl_run(struct scsi_blockptr *pte) entry++; -if ((uint8_t*)(entry[1]) (tmp_sec + SECTOR_SIZE)) { +if ((uint8_t *)(entry[1]) (tmp_sec + SECTOR_SIZE)) { goto fail; } } @@ -241,7 +244,7 @@ fail: int zipl_load(void) { -struct mbr *mbr = (void*)sec; +struct mbr *mbr = (void *)sec; uint8_t *ns, *ns_end; int program_table_entries = 0; int pte_len = sizeof(struct scsi_blockptr); @@ -249,7 +252,7 @@ int zipl_load(void) const char *error = ; /* Grab the MBR */ -virtio_read(0, (void*)mbr); +virtio_read(0, (void *)mbr); dputs(checking magic\n); diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h 
index 5e871ac..fe1dd22 100644 --- a/pc-bios/s390-ccw/s390-ccw.h +++ b/pc-bios/s390-ccw/s390-ccw.h @@ -34,10 +34,10 @@ typedef unsigned long long __u64; #define PAGE_SIZE 4096 #ifndef EIO -#define EIO1 +#define EIO 1 #endif #ifndef EBUSY -#define EBUSY 2 +#define EBUSY 2 #endif #ifndef NULL #define NULL0 @@ -57,7 +57,7 @@ void sclp_setup(void); /* virtio.c */ unsigned long virtio_load_direct(ulong rec_list1, ulong rec_list2, -ulong subchan_id, void *load_addr); + ulong subchan_id, void *load_addr); bool virtio_is_blk(struct subchannel_id schid); void virtio_setup_block(struct
Re: [Qemu-devel] [PATCH v4 0/6] iotests: Allow out-of-tree run
Max Reitz <mre...@redhat.com> writes:

> On 07.06.2014 23:21, Max Reitz wrote:
>> On 24.05.2014 23:24, Max Reitz wrote:
>>> This series enables qemu-iotests to be run in a build tree outside of
>>> the source tree. It also makes the tests use the command for invoking
>>> the Python interpreter specified through configure instead of always
>>> using /usr/bin/env python.
>>
>> Ping; I do understand that this series is not urgent, but since I
>> realized out-of-tree builds to be probably superior, I personally base
>> all my own patches on this series, as I don't want to fiddle around
>> with the iotests. Therefore, I'd be glad if someone would review the
>> remaining patches so it can be merged soon. :-)
>
> Ping again. Because this is just convenient for development, I don't
> need it in any specific release, though.

I haven't found the time for a proper review, and I can't promise one
right now, so I should probably keep my mouth where my money is, but here
goes anyway: unless running tests is utterly trivial, tests will not be
run, and avoidable mistakes happen.

Case in point: I spent a non-trivial chunk of time yesterday to debug
three regressions clearly visible in iotests. I did not scold the people
involved in getting the regressions committed for not running these
tests, because I feel strongly I can't demand tests to be run that
require instructions more complex than "make WHATEVER".

I don't think this is just convenient for development. I'd say it's a
must-have.
Re: [Qemu-devel] [PATCH] Allow mismatched virtio config-len
On 27/06/2014 10:34, Dr. David Alan Gilbert (git) wrote:
> From: Dr. David Alan Gilbert <dgilb...@redhat.com>
>
> Commit 'virtio: validate config_len on load' restricted config_len
> loaded from the wire to match the config_len that the device had.
>
> Unfortunately, there are cases where this isn't true, the one
> we found it on was the wce addition in virtio-blk.

Indeed, the alternative here is to break migration.

As a follow-up, it would be nice to let the bus detect whether the
config_len change is valid or not. For virtio-mmio and s390, mst said
that config length must always match (luckily, these machines aren't
versioned so they are not affected by the wce change). For virtio-pci,
it is okay as long as old_length + VIRTIO_PCI_REGION_SIZE(vdev) and
new_length + VIRTIO_PCI_REGION_SIZE(vdev) do not cross a power of two.

Paolo

> Allow mismatched config-lengths:
>    *) If the version on the wire is shorter then ensure that the
>       remainder is 0xff filled (as virtio_config_read does on
>       out of range reads)
>    *) If the version on the wire is longer, load what we have space
>       for and skip the rest.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> ---
>  hw/virtio/virtio.c | 30 ++++++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index a3082d5..2b11142 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -927,11 +927,33 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
>      }
>      config_len = qemu_get_be32(f);
>      if (config_len != vdev->config_len) {
> -        error_report("Unexpected config length 0x%x. Expected 0x%zx",
> -                     config_len, vdev->config_len);
> -        return -1;
> +        /*
> +         * Unfortunately the reality is that there are cases where we
> +         * see mismatched config lengths, so we have to deal with them
> +         * rather than rejecting them.
> +         */
> +
> +        if (config_len < vdev->config_len) {
> +            /* This is normal in some devices when they add a new option */
> +            memset(vdev->config, 0xff, vdev->config_len);
> +            qemu_get_buffer(f, vdev->config, config_len);
> +        } else {
> +            int32_t diff;
> +            /* config_len > vdev->config_len
> +             * This is rarer, but is here to allow us to fix the case above
> +             */
> +            qemu_get_buffer(f, vdev->config, vdev->config_len);
> +            /*
> +             * Even though we expect the diff to be small, we can't use
> +             * qemu_file_skip because it's not safe for a large skip.
> +             */
> +            for (diff = config_len - vdev->config_len; diff > 0; diff--) {
> +                qemu_get_byte(f);
> +            }
> +        }
> +    } else {
> +        qemu_get_buffer(f, vdev->config, vdev->config_len);
>      }
> -    qemu_get_buffer(f, vdev->config, vdev->config_len);
>
>      num = qemu_get_be32(f);
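
For readers who want to experiment with the two mismatch cases outside of QEMU, here is a small self-contained sketch. It does not use QEMUFile; ByteStream and the stream_* helpers are stand-ins invented for this example, and load_config() only mirrors the logic described in the patch (0xff padding when the wire data is shorter, byte-by-byte skipping when it is longer).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for a migration stream (hypothetical, not QEMUFile). */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} ByteStream;

/* Read one byte; reads past the end yield 0. */
uint8_t stream_get_byte(ByteStream *s)
{
    return s->pos < s->len ? s->data[s->pos++] : 0;
}

/* Read n bytes into dst. */
void stream_get_buffer(ByteStream *s, uint8_t *dst, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        dst[i] = stream_get_byte(s);
    }
}

/*
 * Load a config blob whose on-wire length may differ from the device's.
 * dev_len is the size of the config buffer, wire_len the length that was
 * announced in the stream.
 */
void load_config(ByteStream *s, uint8_t *config, size_t dev_len,
                 size_t wire_len)
{
    if (wire_len < dev_len) {
        /* Shorter on the wire: pre-fill with 0xff, then overwrite the head. */
        memset(config, 0xff, dev_len);
        stream_get_buffer(s, config, wire_len);
    } else {
        size_t extra;

        /* Equal or longer: take what fits and consume the remainder. */
        stream_get_buffer(s, config, dev_len);
        for (extra = wire_len - dev_len; extra > 0; extra--) {
            (void)stream_get_byte(s);
        }
    }
}
```

Consuming the surplus bytes matters as much as ignoring them: the fields that follow in the stream are only parsed correctly if the read position ends up exactly wire_len bytes further on.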
[Qemu-devel] [PULL 07/10] pc-bios/s390-ccw: factor out ipl code
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Move the scsi-disk specific ipl code from zipl_load() into a new function ipl_scsi(). This makes it easier to add ipl routines for other disk types. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 83 1 file changed, 45 insertions(+), 38 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index 1866a20..3c08f82 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -12,7 +12,9 @@ #include bootmap.h #include virtio.h +#ifdef DEBUG /* #define DEBUG_FALLBACK */ +#endif #ifdef DEBUG_FALLBACK #define dputs(txt) \ @@ -23,8 +25,7 @@ #endif /* Scratch space */ -static uint8_t sec[MAX_SECTOR_SIZE] -__attribute__((__aligned__(MAX_SECTOR_SIZE))); +static uint8_t sec[MAX_SECTOR_SIZE*4] __attribute__((__aligned__(PAGE_SIZE))); typedef struct ResetInfo { uint32_t ipl_mask; @@ -72,19 +73,9 @@ static void jump_to_IPL_code(uint64_t address) virtio_panic(\n! IPL returns !\n); } -/* Check for ZIPL magic. Returns 0 if not matched. 
*/ -static int zipl_magic(uint8_t *ptr) -{ -uint32_t *p = (void *)ptr; -uint32_t *z = (void *)ZIPL_MAGIC; - -if (*p != *z) { -debug_print_int(invalid magic, *p); -virtio_panic(invalid magic); -} - -return 1; -} +/*** + * IPL a SCSI disk + */ static void zipl_load_segment(ComponentEntry *entry) { @@ -92,8 +83,10 @@ static void zipl_load_segment(ComponentEntry *entry) ScsiBlockPtr *bprs = (void *)sec; const int bprs_size = sizeof(sec); block_number_t blockno; -long address; +uint64_t address; int i; +char err_msg[] = zIPL failed to read BPRS at 0x; +char *blk_no = err_msg[30]; /* where to print blockno in (those ZZs) */ blockno = entry-data.blockno; address = entry-load_address; @@ -103,11 +96,11 @@ static void zipl_load_segment(ComponentEntry *entry) do { memset(bprs, FREE_SPACE_FILLER, bprs_size); -debug_print_int(reading bprs at, blockno); -read_block(blockno, bprs, zipl_load_segment: cannot read block); +fill_hex_val(blk_no, blockno, sizeof(blockno)); +read_block(blockno, bprs, err_msg); for (i = 0;; i++) { -u64 *cur_desc = (void *)bprs[i]; +uint64_t *cur_desc = (void *)bprs[i]; blockno = bprs[i].blockno; if (!blockno) { @@ -132,7 +125,7 @@ static void zipl_load_segment(ComponentEntry *entry) } address = virtio_load_direct(cur_desc[0], cur_desc[1], 0, (void *)address); -IPL_assert(address != -1, zipl_load_segment: wrong IPL address); +IPL_assert(address != -1, zIPL load segment failed); } } while (blockno); } @@ -144,13 +137,11 @@ static void zipl_run(ScsiBlockPtr *pte) ComponentEntry *entry; uint8_t tmp_sec[MAX_SECTOR_SIZE]; -virtio_read(pte-blockno, tmp_sec); +read_block(pte-blockno, tmp_sec, Cannot read header); header = (ComponentHeader *)tmp_sec; -IPL_assert(zipl_magic(tmp_sec), zipl_run: zipl_magic); - -IPL_assert(header-type == ZIPL_COMP_HEADER_IPL, - zipl_run: wrong header type); +IPL_assert(magic_match(tmp_sec, ZIPL_MAGIC), No zIPL magic); +IPL_assert(header-type == ZIPL_COMP_HEADER_IPL, Bad header type); dputs(start loading images\n); @@ -162,17 +153,16 
@@ static void zipl_run(ScsiBlockPtr *pte) entry++; IPL_assert((uint8_t *)(entry[1]) = (tmp_sec + MAX_SECTOR_SIZE), - zipl_run: wrong entry size); + Wrong entry value); } -IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC, - zipl_run: no EXEC entry); +IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC, No EXEC entry); /* should not return */ jump_to_IPL_code(entry-load_address); } -void zipl_load(void) +static void ipl_scsi(void) { ScsiMbr *mbr = (void *)sec; uint8_t *ns, *ns_end; @@ -180,20 +170,16 @@ void zipl_load(void) const int pte_len = sizeof(ScsiBlockPtr); ScsiBlockPtr *prog_table_entry; -/* Grab the MBR */ -read_block(0, mbr, zipl_load: cannot read block 0); - -dputs(checking magic\n); - -IPL_assert(zipl_magic(mbr-magic), zipl_load: zipl_magic 1); +/* The 0-th block (MBR) was already read into sec[] */ +sclp_print(Using SCSI scheme.\n); debug_print_int(program table, mbr-blockptr.blockno); /* Parse the program table */ read_block(mbr-blockptr.blockno, sec, - zipl_load: cannot read program table); +
Re: [Qemu-devel] [PATCH 4/5] PPC: e500: Support platform devices
On 27.06.14 11:29, Eric Auger wrote: On 06/04/2014 02:28 PM, Alexander Graf wrote: For e500 our approach to supporting platform devices is to create a simple bus from the guest's point of view within which we map platform devices dynamically. We allocate memory regions always within the platform hole in address space and map IRQs to predetermined IRQ lines that are reserved for platform device usage. This maps really nicely into device tree logic, so we can just tell the guest about our virtual simple bus in device tree as well. Signed-off-by: Alexander Graf ag...@suse.de --- default-configs/ppc-softmmu.mak | 1 + default-configs/ppc64-softmmu.mak | 1 + hw/ppc/e500.c | 221 ++ hw/ppc/e500.h | 1 + hw/ppc/e500plat.c | 1 + 5 files changed, 225 insertions(+) diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak index 33f8d84..d6ec8b9 100644 --- a/default-configs/ppc-softmmu.mak +++ b/default-configs/ppc-softmmu.mak @@ -45,6 +45,7 @@ CONFIG_PREP=y CONFIG_MAC=y CONFIG_E500=y CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) +CONFIG_PLATFORM=y # For PReP CONFIG_MC146818RTC=y CONFIG_ETSEC=y diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak index 37a15b7..06677bf 100644 --- a/default-configs/ppc64-softmmu.mak +++ b/default-configs/ppc64-softmmu.mak @@ -45,6 +45,7 @@ CONFIG_PSERIES=y CONFIG_PREP=y CONFIG_MAC=y CONFIG_E500=y +CONFIG_PLATFORM=y CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM)) # For pSeries CONFIG_XICS=$(CONFIG_PSERIES) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index 33d54b3..bc26215 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -36,6 +36,7 @@ #include exec/address-spaces.h #include qemu/host-utils.h #include hw/pci-host/ppce500.h +#include hw/platform/device.h #define EPAPR_MAGIC(0x45504150) #define BINARY_DEVICE_TREE_FILEmpc8544ds.dtb @@ -47,6 +48,14 @@ #define RAM_SIZES_ALIGN(64UL 20) +#define E500_PLATFORM_BASE 0xF000ULL +#define E500_PLATFORM_HOLE (128ULL * 1024 * 1024) /* 128 MB */ 
+#define E500_PLATFORM_PAGE_SHIFT 12 +#define E500_PLATFORM_HOLE_PAGES (E500_PLATFORM_HOLE \ +E500_PLATFORM_PAGE_SHIFT) +#define E500_PLATFORM_FIRST_IRQ5 +#define E500_PLATFORM_NUM_IRQS 10 + /* TODO: parameterize */ #define MPC8544_CCSRBAR_BASE 0xE000ULL #define MPC8544_CCSRBAR_SIZE 0x0010ULL @@ -122,6 +131,62 @@ static void dt_serial_create(void *fdt, unsigned long long offset, } } +typedef struct PlatformDevtreeData { +void *fdt; +const char *mpic; +int irq_start; +const char *node; +} PlatformDevtreeData; + +static int platform_device_create_devtree(Object *obj, void *opaque) +{ +PlatformDevtreeData *data = opaque; +Object *dev; +PlatformDeviceState *pdev; + +dev = object_dynamic_cast(obj, TYPE_PLATFORM_DEVICE); +pdev = (PlatformDeviceState *)dev; + +if (!pdev) { +/* Container, traverse it for children */ +return object_child_foreach(obj, platform_device_create_devtree, data); +} + +return 0; +} + +static void platform_create_devtree(void *fdt, const char *node, uint64_t addr, +const char *mpic, int irq_start, +int nr_irqs) +{ +const char platcomp[] = qemu,platform\0simple-bus; +PlatformDevtreeData data; + +/* Create a /platform node that we can put all devices into */ + +qemu_fdt_add_subnode(fdt, node); +qemu_fdt_setprop(fdt, node, compatible, platcomp, sizeof(platcomp)); +qemu_fdt_setprop_string(fdt, node, device_type, platform); + +/* Our platform hole is less than 32bit big, so 1 cell is enough for address + and size */ +qemu_fdt_setprop_cells(fdt, node, #size-cells, 1); +qemu_fdt_setprop_cells(fdt, node, #address-cells, 1); +qemu_fdt_setprop_cells(fdt, node, ranges, 0, addr 32, addr, + E500_PLATFORM_HOLE); + +qemu_fdt_setprop_phandle(fdt, node, interrupt-parent, mpic); + +/* Loop through all devices and create nodes for known ones */ + +data.fdt = fdt; +data.mpic = mpic; +data.irq_start = irq_start; +data.node = node; + +platform_device_create_devtree(qdev_get_machine(), data); +} + static int ppce500_load_device_tree(MachineState *machine, PPCE500Params 
*params, hwaddr addr, @@ -379,6 +444,12 @@ static int ppce500_load_device_tree(MachineState *machine, qemu_fdt_setprop_cell(fdt, pci, #address-cells, 3); qemu_fdt_setprop_string(fdt, /aliases, pci0, pci); +if (params-has_platform) { +platform_create_devtree(fdt, /platform, E500_PLATFORM_BASE, + mpic,
[Qemu-devel] [PULL 06/10] pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Factor out helper function for dumping a hex value into a buffer.

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/s390-ccw.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index 29468fb..959aed0 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -86,15 +86,21 @@ static inline void fill_hex(char *out, unsigned char val)
     out[1] = hex[val & 0xf];
 }

-static inline void print_int(const char *desc, u64 addr)
+static inline void fill_hex_val(char *out, void *ptr, unsigned size)
 {
-    unsigned char *addr_c = (unsigned char *)&addr;
-    char out[] = ": 0xffffffffffffffff\n";
+    unsigned char *value = ptr;
     unsigned int i;

-    for (i = 0; i < sizeof(addr); i++) {
-        fill_hex(&out[4 + (i*2)], addr_c[i]);
+    for (i = 0; i < size; i++) {
+        fill_hex(&out[i*2], value[i]);
     }
+}
+
+static inline void print_int(const char *desc, u64 addr)
+{
+    char out[] = ": 0xffffffffffffffff\n";
+
+    fill_hex_val(&out[4], &addr, sizeof(addr));

     sclp_print(desc);
     sclp_print(out);
-- 
1.7.9.5
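For readers who want to try the helper outside the firmware, here is a minimal standalone sketch of the same idea. It is not the s390-ccw code itself: `format_bytes` is a hypothetical wrapper added for this example, and the `size_t`/`const` signatures are my own choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the two lowercase hex digits of val to out[0] and out[1]. */
static void fill_hex(char *out, unsigned char val)
{
    static const char hex[] = "0123456789abcdef";

    out[0] = hex[(val >> 4) & 0xf];
    out[1] = hex[val & 0xf];
}

/*
 * Write size bytes starting at ptr as 2 * size hex digits into out.
 * Bytes are emitted in memory order, so a multi-byte integer reads
 * naturally only on a big-endian host. No terminator is written;
 * the caller owns the buffer layout.
 */
static void fill_hex_val(char *out, const void *ptr, size_t size)
{
    const unsigned char *value = ptr;
    size_t i;

    assert(out != NULL && ptr != NULL);
    for (i = 0; i < size; i++) {
        fill_hex(&out[i * 2], value[i]);
    }
}

/* Hypothetical wrapper for this example: produce "0x" + digits + NUL. */
static void format_bytes(char *dst, size_t dst_len,
                         const unsigned char *bytes, size_t n)
{
    assert(dst_len >= 2 + 2 * n + 1);
    memcpy(dst, "0x", 2);
    fill_hex_val(dst + 2, bytes, n);
    dst[2 + 2 * n] = '\0';
}
```

Separating the digit writer from the template string is what lets the same helper serve messages of different widths.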
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:

The major disadvantage of icount is that it's updated only on TB boundaries. When one instruction in the middle of the block uses virtual clock, it could have different values for different divisions of the code to TB.

This is only true if the instruction is incorrectly not marked as being I/O. The idea behind icount is that in general we update it on TB boundaries (it's much faster than doing it once per insn) but for those places which do turn out to need an exact icount we then retranslate the block to get the instruction-to-icount-adjustment mapping.

I forgot about one more issue. When qemu stops execution on the breakpoint, the icount is decreased to the number of instructions in the block. But in this case the last instruction is not executed and should not affect the counter.

Pavel Dovgaluk
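The breakpoint case can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU's icount implementation: `Block`, `Counter` and `run_block` are hypothetical names, and the model simply charges a whole block up front and refunds whatever did not run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical translation block: just a count of guest instructions. */
typedef struct {
    uint32_t num_insns;
} Block;

/* Hypothetical counter state for this example. */
typedef struct {
    int64_t executed;   /* instructions accounted for so far */
} Counter;

/*
 * Run one block. The whole block is charged at the block boundary
 * (one update per block). If execution stops early at index stop_at
 * (0-based; the instruction at stop_at itself was NOT executed), the
 * unexecuted tail is refunded so the counter covers only instructions
 * that really ran. Pass stop_at >= num_insns to run to completion.
 */
static void run_block(Counter *c, const Block *b, uint32_t stop_at)
{
    c->executed += b->num_insns;                /* charge whole block */
    if (stop_at < b->num_insns) {
        c->executed -= b->num_insns - stop_at;  /* refund the tail */
    }
}
```

In this model a breakpoint on the first instruction of a block leaves the counter unchanged, which is the behaviour the message above asks for.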
[Qemu-devel] [PULL 09/10] pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Add code that allows us to start from two further ECKD DASD disk layouts: LDL (Linux disk layout) and CMS (cms-formatted disk). Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 92 pc-bios/s390-ccw/bootmap.h |7 2 files changed, 92 insertions(+), 7 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index beda4d6..fa54abb 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -80,6 +80,17 @@ static void jump_to_IPL_code(uint64_t address) static unsigned char _bprs[8*1024]; /* guessed max ECKD sector size */ const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr); +static inline void verify_boot_info(BootInfo *bip) +{ +IPL_assert(magic_match(bip-magic, ZIPL_MAGIC), No zIPL magic); +IPL_assert(bip-version == BOOT_INFO_VERSION, Wrong zIPL version); +IPL_assert(bip-bp_type == BOOT_INFO_BP_TYPE_IPL, DASD is not for IPL); +IPL_assert(bip-dev_type == BOOT_INFO_DEV_TYPE_ECKD, DASD is not ECKD); +IPL_assert(bip-flags == BOOT_INFO_FLAGS_ARCH, Not for this arch); +IPL_assert(block_size_ok(bip-bp.ipl.bm_ptr.eckd.bptr.size), + Bad block size in zIPL section of the 1st record.); +} + static bool eckd_valid_address(BootMapPointer *p) { const uint64_t cylinder = p-eckd.cylinder @@ -198,19 +209,15 @@ static void run_eckd_boot_script(block_number_t mbr_block_nr) jump_to_IPL_code(bms-entry[i].address.load_address); /* no return */ } -static void ipl_eckd(void) +static void ipl_eckd_cdl(void) { XEckdMbr *mbr; Ipl2 *ipl2 = (void *)sec; IplVolumeLabel *vlbl = (void *)sec; block_number_t block_nr; -sclp_print(Using ECKD scheme.\n); -if (virtio_guessed_disk_nature()) { -sclp_print(Using guessed DASD geometry.\n); -virtio_assume_eckd(); -} /* we have just read the 
block #0 and recognized it as IPL1 */ +sclp_print(CDL\n); memset(sec, FREE_SPACE_FILLER, sizeof(sec)); read_block(1, ipl2, Cannot read IPL2 record at block 1); @@ -238,6 +245,57 @@ static void ipl_eckd(void) /* no return */ } +static void ipl_eckd_ldl(ECKD_IPL_mode_t mode) +{ +LDL_VTOC *vlbl = (void *)sec; /* already read, 3rd block */ +char msg[4] = { '?', '.', '\n', '\0' }; +block_number_t block_nr; +BootInfo *bip; + +sclp_print((mode == ECKD_CMS) ? CMS : LDL); +sclp_print( version ); +switch (vlbl-LDL_version) { +case LDL1_VERSION: +msg[0] = '1'; +break; +case LDL2_VERSION: +msg[0] = '2'; +break; +default: +msg[0] = vlbl-LDL_version; +msg[0] = 0x0f; /* convert EBCDIC */ +msg[0] |= 0x30; /* to ASCII (digit) */ +msg[1] = '?'; +break; +} +sclp_print(msg); +print_volser(vlbl-volser); + +/* DO NOT read BootMap pointer (only one, xECKD) at block #2 */ + +memset(sec, FREE_SPACE_FILLER, sizeof(sec)); +read_block(0, sec, Cannot read block 0); +bip = (void *)(sec + 0x70); /* boot info is eckd mbr for LDL */ +verify_boot_info(bip); + +block_nr = eckd_block_num((void *)(bip-bp.ipl.bm_ptr.eckd.bptr)); +run_eckd_boot_script(block_nr); +/* no return */ +} + +static void ipl_eckd(ECKD_IPL_mode_t mode) +{ +switch (mode) { +case ECKD_CDL: +ipl_eckd_cdl(); /* no return */ +case ECKD_CMS: +case ECKD_LDL: +ipl_eckd_ldl(mode); /* no return */ +default: +virtio_panic(\n! 
Unknown ECKD IPL mode !\n); +} +} + /*** * IPL a SCSI disk */ @@ -374,6 +432,7 @@ static void ipl_scsi(void) void zipl_load(void) { ScsiMbr *mbr = (void *)sec; +LDL_VTOC *vlbl = (void *)sec; /* Grab the MBR */ memset(sec, FREE_SPACE_FILLER, sizeof(sec)); @@ -384,8 +443,27 @@ void zipl_load(void) if (magic_match(mbr-magic, ZIPL_MAGIC)) { ipl_scsi(); /* no return */ } + +/* We have failed to follow the SCSI scheme, so */ +sclp_print(Using ECKD scheme.\n); +if (virtio_guessed_disk_nature()) { +sclp_print(Using guessed DASD geometry.\n); +virtio_assume_eckd(); +} + if (magic_match(mbr-magic, IPL1_MAGIC)) { -ipl_eckd(); /* CDL ECKD; no return */ +ipl_eckd(ECKD_CDL); /* no return */ +} + +/* LDL/CMS? */ +memset(sec, FREE_SPACE_FILLER, sizeof(sec)); +read_block(2, vlbl, Cannot read block 2); + +if (magic_match(vlbl-magic, CMS1_MAGIC)) { +ipl_eckd(ECKD_CMS); /* no return */ +} +if (magic_match(vlbl-magic, LNX1_MAGIC)) { +ipl_eckd(ECKD_LDL); /* no return */ } virtio_panic(\n* invalid MBR
Re: [Qemu-devel] [PATCH 2/4] mips_malta: Change default KVM cpu to 24Kc (no FP)
On 27/06/2014 10:43, Aurelien Jarno wrote:

On Thu, Jun 26, 2014 at 10:44:23AM +0100, James Hogan wrote:

Change the default Malta CPU model for when KVM is enabled to 24Kc which doesn't have floating point support compared to the 24Kf.

The resulting incorrect Config CP0 register value doesn't get passed to KVM yet as KVM doesn't expose it, however we should ensure it is set correctly now to reduce the risk of breaking migration/loadvm to a future version of QEMU/Linux that does support them.

Signed-off-by: James Hogan james.ho...@imgtec.com
Cc: Aurelien Jarno aurel...@aurel32.net
Cc: Paolo Bonzini pbonz...@redhat.com
---
 hw/mips/mips_malta.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 2868ee5b0307..c0841991f4e9 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -949,7 +949,12 @@ void mips_malta_init(MachineState *machine)
 #ifdef TARGET_MIPS64
         cpu_model = "20Kc";
 #else
-        cpu_model = "24Kf";
+        if (kvm_enabled()) {
+            /* Don't enable FPU on KVM yet */
+            cpu_model = "24Kc";
+        } else {
+            cpu_model = "24Kf";
+        }
 #endif
     }

Given the explanations in the other mails, that looks fine to me. That said, I think we should at least warn the user that we are disabling some features, instead of doing it silently. This is what is done for example on x86 when a CPU feature is not available.

I agree. James, can you send v2 of this patch only?

Paolo
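The review request (pick the FPU-less model under KVM, but tell the user) could look roughly like the sketch below. It is an assumption-laden illustration, not the v2 patch: `default_malta_cpu` and its parameters are hypothetical, and the warning text is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: choose the default CPU model name.
 * If warn is non-NULL, a note is printed there whenever the choice
 * silently drops a feature (here: the FPU under KVM).
 */
static const char *default_malta_cpu(bool mips64, bool kvm, FILE *warn)
{
    if (mips64) {
        return "20Kc";
    }
    if (kvm) {
        if (warn != NULL) {
            fprintf(warn, "warning: FPU not enabled under KVM yet; "
                          "defaulting to 24Kc instead of 24Kf\n");
        }
        return "24Kc";
    }
    return "24Kf";
}
```

Passing the output stream in keeps the selection logic testable without capturing stderr.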
Re: [Qemu-devel] [v5][PATCH 4/5] xen, gfx passthrough: create host bridge to passthrough
On 27/06/2014 10:34, Chen, Tiejun wrote:

So how do we make this specific to Xen? Or do you mean we need to create a new machine to address this scenario? But actually this is the same as xenfv_machine except for these few lines of code.

Yes, please create a new machine so that -M pc doesn't have any of these hacks. Note that -M xenfv is obsolete; Xen can now use -M pc (i.e. the default).

Paolo
Re: [Qemu-devel] [PATCH] ahci.c: mask the interrupt on complete flag to allow ahci.c to read the correct size for the PRDT
On 27/06/2014 01:28, Reza Jelveh wrote:

+static int prdt_tbl_entry_size(const AHCI_SG tbl) {
+  return (le32_to_cpu(tbl.flags_size) & AHCI_PRDT_SIZE_MASK) + 1;
+}

Apart from the incorrect indentation/formatting here, the patch seems okay. How can this be reproduced?

Paolo
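To make the masking concrete, here is a small standalone sketch. It assumes the usual AHCI layout as I understand it: the byte count (minus one) sits in bits 21:0 of the entry's last dword and the interrupt-on-completion flag in bit 31. The names `PRDT_SIZE_MASK`, `PRDT_IOC_FLAG`, `load_le32` and `prdt_entry_size` are hypothetical, chosen for this example rather than taken from ahci.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 21:0 hold the byte count minus one. */
#define PRDT_SIZE_MASK 0x3fffffu
/* Assumed layout: bit 31 is the interrupt-on-completion flag. */
#define PRDT_IOC_FLAG  0x80000000u

/* Decode a little-endian 32-bit value from raw bytes, on any host. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the transfer size in bytes described by one PRDT entry. */
static uint32_t prdt_entry_size(const uint8_t *flags_size_le)
{
    return (load_le32(flags_size_le) & PRDT_SIZE_MASK) + 1;
}
```

Without the mask, an entry with the flag bit set would decode to a size above 2 GiB instead of the intended byte count.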
Re: [Qemu-devel] [PATCH v1] trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events
On Sun, Jun 22, 2014 at 02:43:03AM +0800, Yang Zhiyong wrote:

We have seen cases where a guest did not stop even though it was instructed to shut down. The root cause is usually not in QEMU. However, QEMU is often suspected at first simply because the issue occurred in a virtualization environment. Therefore, we need to confirm that QEMU received the shutdown request and raised the ACPI IRQ to the VM, whether the request came from the virsh shutdown command, virt-manager, or stopping the QEMU process. Then we can confirm that the problem lies in the guest OS rather than in QEMU itself.

When we stop guests by the virsh shutdown command, virt-manager, or stopping the QEMU process, qemu_system_powerdown_request() or qemu_system_shutdown_request() is called. Then the functions below, in main_loop_should_exit() of vl.c, are called roughly in the following order.

if (qemu_powerdown_requested())
    qemu_system_powerdown()
    monitor_protocol_event(QEVENT_POWERDOWN, NULL)

OR

if (qemu_shutdown_requested())
    monitor_protocol_event(QEVENT_SHUTDOWN, NULL);

The tracepoint of monitor_protocol_event() already exists, but no tracepoints are defined for qemu_system_powerdown_request() and qemu_system_shutdown_request(). So this patch adds two tracepoints for the two functions. We believe these tracepoints will make it much easier to isolate the problem described above.

Signed-off-by: Yang Zhiyong yangzy.f...@cn.fujitsu.com
---
 trace-events | 2 ++
 vl.c         | 2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

Thanks, applied to my tracing tree:
https://github.com/stefanha/qemu/commits/tracing

Stefan
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On 27.06.14 12:30, Andreas Färber wrote:

On 26.06.2014 14:01, Alexander Graf wrote:

On 20.06.14 08:43, Peter Crosthwaite wrote:

On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:

Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device.

Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now.

What you are doing seems to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine-specific wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of

1: an interface that can be added to sysbus devices
2: a new abstraction that inherits from SYS_BUS_DEVICE
3: just new features to the sysbus core.

Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest.

The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is expose qdev properties that give the mapping code hints about where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. Whether they actually do get mapped and the generation of device tree chunks still stays in the machine file's court.

As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is.

The pure QOM approach would be link properties instead of a bus, but then the machine needs to know how many slots there shall be in advance. Note that the docking procedure is always initiated from the realizing device, whether bus or no bus.

So my goal is to make life easy for users, not to fulfill some wet Anthony dreams :). And as a user, I want to be able to say -device foo and have that device created, like I do with PCI devices today.

There are 2 approaches to this that I can see:

1) A new special type of bus that allows for dynamic allocation and that knows a flat numbering scheme
2) Individual devices that get attached to whatever the machine file thinks makes it happy (basically emulating the above bus, but with more flexibility)

I implemented option 1 with the Platform bus. It's basically an abstraction of the Sys/AXI/AMBA idea but only with a single bus implementation, as everything else would just be ridiculously redundant (and if necessary could be implemented as a subclass on top of the bridge device). People didn't like it.

I implemented option 2 with the Platform devices - this patch set. People didn't like it because it duplicates SysBus devices - and it does.

I'm implementing 2 as an add-on of SysBusDevice now which to me really isn't too much different from a dangling QOM device.

Linking devices by force (set IRQ0 to MPIC IRQ 32, map region0 to physical address space offset 0x12300) is a nice thing to have for people who know what they're doing. That matches probably about 0.1% of our user base - I personally am not included there. We *have* to have a mechanism to make device creation easy for users if we want to have any.

Alex
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On 27.06.14 12:54, Peter Crosthwaite wrote:

On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote:

On 26.06.2014 14:01, Alexander Graf wrote:

On 20.06.14 08:43, Peter Crosthwaite wrote:

On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:

Platforms without ISA and/or PCI have had a seriously hard time in the dynamic device creation world of QEMU. Devices on these were modeled as SysBus devices which can only be instantiated in machine files, not through -device.

Why is that so? Well, SysBus is trying to be incredibly generic. It allows you to plug any interrupt sender into any other interrupt receiver. It allows you to map a device's memory regions into any other random memory region. All of that only works from C code or via really complicated command line arguments under discussion upstream right now.

What you are doing seems to me to be an extension of SysBus - you are defining the same interfaces as sysbus but also adding some machine-specific wiring info. I think it's a candidate for QOM inheritance to avoid having to dup all the sysbus device models for both regular sysbus and platform bus. I think your functionality should be added as one of

1: an interface that can be added to sysbus devices
2: a new abstraction that inherits from SYS_BUS_DEVICE
3: just new features to the sysbus core.

Then both of us are using the same suite of device models and the differences between our approaches are limited to machine level instantiation method. My gut says #2 is the cleanest.

The more I think about it the more I believe #3 would be the cleanest. The only thing my platform devices do in addition to sysbus devices is expose qdev properties that give the mapping code hints about where a device wants to be mapped. If we just add qdev properties for all the possible hints in generic sysbus core code, we should be able to automatically convert all devices into dynamically allocatable devices. Whether they actually do get mapped and the generation of device tree chunks still stays in the machine file's court.

As discussed offline with Alex, one issue I see is that this would be encouraging people to add more devices to an artificial global bus in /machine/unassigned that we've been trying to obsolete, rather than sitting down and creating an e500 SoC object as a start. Maybe we should start generating a list of shame for 2.1. ;) Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would make me much happier than using SysBus as is.

Do you mean address_space_memory (as used by sysbus_mmio_map)? We all hate that global singleton, but can we decouple it from sysbus, which is not the root cause of that problem? sysbus_mmio_map usages just need to be replaced with sysbus_mmio_get_region and you can create whatever hierarchy you want using unchanged sysbus devices.

Even if we phase out the global singleton and the SysBus bus, the sysbus device abstraction is still sound and should be usable busless. Then there's no need for a tree-wide change to implement Alex's feature for all devs (assuming his plugger can be made to work hintless?).

The plugger works just fine when you don't give hints - then it's up to dynamic allocation (same as PCI).

Yes I fully agree with you here.

Alex
Re: [Qemu-devel] [PULL 03/10] pc-bios/s390-ccw: handle different sector sizes
On 27.06.14 13:25, Cornelia Huck wrote: From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Use the virtio device's configuration to figure out the disk geometry and use a sector size based upon the layout. [CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g] Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- pc-bios/s390-ccw/bootmap.c | 12 +++--- pc-bios/s390-ccw/s390-ccw.h |2 +- pc-bios/s390-ccw/virtio.c | 96 --- pc-bios/s390-ccw/virtio.h | 48 ++ 4 files changed, 147 insertions(+), 11 deletions(-) diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c index c216030..fa2ca26 100644 --- a/pc-bios/s390-ccw/bootmap.c +++ b/pc-bios/s390-ccw/bootmap.c @@ -10,6 +10,7 @@ #include s390-ccw.h #include bootmap.h +#include virtio.h /* #define DEBUG_FALLBACK */ @@ -22,7 +23,8 @@ #endif /* Scratch space */ -static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE))); +static uint8_t sec[MAX_SECTOR_SIZE] +__attribute__((__aligned__(MAX_SECTOR_SIZE))); typedef struct ResetInfo { uint32_t ipl_mask; @@ -99,7 +101,7 @@ static inline bool unused_space(const void *p, unsigned int size) static int zipl_load_segment(ComponentEntry *entry) { -const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr)); +const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr)); Is this really safe to increase? Doesn't max_entries depend on the real sector size? Alex
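One way to read the concern raised at the end of this message: a buffer sized by the maximum sector size is safe, but an entry count derived from the maximum may overstate what a smaller real sector holds. The sketch below illustrates that distinction only. `BlockPtr`, `entries_per_sector` and the 4096 value for `MAX_SECTOR_SIZE` are assumptions made for the example, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound used to size a static scratch buffer (assumed value). */
#define MAX_SECTOR_SIZE 4096u

/* Hypothetical 16-byte block pointer standing in for ScsiBlockPtr. */
typedef struct {
    uint64_t blockno;
    uint16_t size;
    uint16_t blockct;
    uint8_t  reserved[4];
} BlockPtr;

/*
 * Number of block pointers that fit in one sector as actually read
 * from the device. The bound is computed from the real sector size,
 * so a 512-byte sector is not scanned as if it held 4096 bytes.
 */
static size_t entries_per_sector(size_t real_sector_size)
{
    assert(real_sector_size > 0 && real_sector_size <= MAX_SECTOR_SIZE);
    return real_sector_size / sizeof(BlockPtr);
}
```

Whether the real code needs this depends on how the remaining buffer space is filled and checked, which the quoted hunk does not show.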
Re: [Qemu-devel] [PATCH v11 1/3] sPAPR: Implement EEH RTAS calls
On 27.06.14 11:53, Gavin Shan wrote: On Thu, Jun 26, 2014 at 12:46:50PM +0200, Alexander Graf wrote: On 26.06.14 12:43, Gavin Shan wrote: On Thu, Jun 26, 2014 at 12:30:16PM +0200, Alexander Graf wrote: On 26.06.14 03:35, Gavin Shan wrote: The emulation for EEH RTAS requests from guest isn't covered by QEMU yet and the patch implements them. The patch defines constants used by EEH RTAS calls and adds callback sPAPRPHBClass::eeh_handler, which is going to be used this way: 1. RTAS calls are received in spapr_pci.c, sanity check is done there. 2. RTAS handlers handle what they can. If there is something it cannot handle and sPAPRPHBClass::eeh_handler callback is defined, it is called. 3. sPAPRPHBClass::eeh_handler is only implemented for VFIO now. It does ioctl() to the IOMMU container fd to complete the call. Error codes from that ioctl() are transferred back to the guest. Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com --- hw/ppc/spapr_pci.c | 240 include/hw/pci-host/spapr.h | 7 ++ include/hw/ppc/spapr.h | 33 ++ 3 files changed, 280 insertions(+) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 131434b..8712051 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -422,6 +422,233 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu, rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */ } +static int rtas_handle_eeh_request(sPAPREnvironment *spapr, + uint64_t buid, uint32_t req, uint32_t opt) +{ +sPAPRPHBState *sphb = spapr_find_phb(spapr, buid); +sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); + +if (!sphb || !info-eeh_handler) { +return -ENOENT; +} + +return info-eeh_handler(sphb, req, opt); +} + +static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, +sPAPREnvironment *spapr, +uint32_t token, uint32_t nargs, +target_ulong args, uint32_t nret, +target_ulong rets) +{ +uint32_t addr, option; +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +int ret; + +if ((nargs != 4) || (nret != 1)) { +goto 
param_error_exit; +} + +addr = rtas_ld(args, 0); +option = rtas_ld(args, 3); +switch (option) { +case RTAS_EEH_ENABLE: +if (!find_dev(spapr, buid, addr)) { +goto param_error_exit; +} +break; +case RTAS_EEH_DISABLE: +case RTAS_EEH_THAW_IO: +case RTAS_EEH_THAW_DMA: +break; +default: +goto param_error_exit; +} + +ret = rtas_handle_eeh_request(spapr, buid, + RTAS_EEH_REQ_SET_OPTION, option); +if (ret = 0) { +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +return; +} + +param_error_exit: +rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, uint32_t nret, + target_ulong rets) +{ +uint32_t addr, option; +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +sPAPRPHBState *sphb = spapr_find_phb(spapr, buid); +sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); +PCIDevice *pdev; + +if (!sphb || !info-eeh_handler) { +goto param_error_exit; +} + +if ((nargs != 4) || (nret != 2)) { +goto param_error_exit; +} + +addr = rtas_ld(args, 0); +option = rtas_ld(args, 3); +if (option != RTAS_GET_PE_ADDR option != RTAS_GET_PE_MODE) { +goto param_error_exit; +} + +pdev = find_dev(spapr, buid, addr); +if (!pdev) { +goto param_error_exit; +} + +/* + * For now, we always have bus level PE whose address + * has format 00BBSS00. The guest OS might regard + * PE address 0 as invalid. We avoid that simply by + * extending it with one. + */ +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +if (option == RTAS_GET_PE_ADDR) { +rtas_st(rets, 1, (pci_bus_num(pdev-bus) 16) + 1); +} else { +rtas_st(rets, 1, RTAS_PE_MODE_SHARED); +} + +return; + +param_error_exit: +rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, +sPAPREnvironment *spapr, +uint32_t token, uint32_t nargs, +target_ulong args, uint32_t nret, +target_ulong rets) +{ +uint64_t buid = ((uint64_t)rtas_ld(args, 1) 32) | rtas_ld(args, 2); +int ret; +
Re: [Qemu-devel] [PATCH V3] qemu-img create: add 'nocow' option
On Mon, Jun 23, 2014 at 05:17:02PM +0800, Chunyan Liu wrote: Add 'nocow' option so that users could have a chance to set NOCOW flag to newly created files. It's useful on btrfs file system to enhance performance. Btrfs has low performance when hosting VM images, even more when the guest in those VM are also using btrfs as file system. One way to mitigate this bad performance is to turn off COW attributes on VM files. Generally, there are two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then all newly created files will be NOCOW. b) per file. Add the NOCOW file attribute. It could only be done to empty or new files. This patch tries the second way, according to the option, it could add NOCOW per file. For most block drivers, since the create file step is in raw-posix.c, so we can do setting NOCOW flag ioctl in raw-posix.c only. But there are some exceptions, like block/vpc.c and block/vdi.c, they are creating file by calling qemu_open directly. For them, do the same setting NOCOW flag ioctl work in them separately. Signed-off-by: Chunyan Liu cy...@suse.com --- Changes to v2: * based on QemuOpts instead of old QEMUOptionParameters * add nocow description in man page and html doc Old v2 is here: http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg02429.html --- block/cow.c | 5 + block/qcow.c | 5 + block/qcow2.c | 5 + block/qed.c | 11 --- block/raw-posix.c | 25 + block/vdi.c | 29 + block/vhdx.c | 5 + block/vmdk.c | 11 --- block/vpc.c | 29 + include/block/block_int.h | 1 + qemu-doc.texi | 16 qemu-img.texi | 16 12 files changed, 152 insertions(+), 6 deletions(-) Are you sure it's necessary to touch all image formats in order to pass through the nocow option? 
Looking at bdrv_img_create() I think it will work without touching all image formats since both drv and proto_drv->create_opts are appended:

void bdrv_img_create(const char *filename, const char *fmt,
                     const char *base_filename, const char *base_fmt,
                     char *options, uint64_t img_size, int flags,
                     Error **errp, bool quiet)
{
    QemuOptsList *create_opts = NULL;
    ...
    create_opts = qemu_opts_append(create_opts, drv->create_opts);
    create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);

    /* Create parameter list with default values */
    opts = qemu_opts_create(create_opts, NULL, 0, &error_abort);
    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, img_size);

    /* Parse -o options */
    if (options) {
        if (qemu_opts_do_parse(opts, options, NULL) != 0) {
            error_setg(errp, "Invalid options for file format '%s'", fmt);
            goto out;
        }
    }
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On 27/06/2014 13:24, Alexander Graf wrote:

I think we can all agree that the sysbus bus is not a bus per se. So conceptually, what's the difference between a device attached to a non-bus and a device not attached to a bus at all? And why can't we convert sysbus to not be a bus anymore?

I think there is no difference, and I don't think moving out of sysbus is really a goal that we need to pursue. I agree with Andreas that having a SoC object as the parent of sysbus (instead of nothing at all) would be slightly better. We could also make TYPE_MACHINE a subclass of TYPE_DEVICE, to have an obvious place for this SoC object.

Paolo
Re: [Qemu-devel] [PATCH trivial v2] block.c: Add return value for bdrv_append_temp_snapshot() to avoid incorrect failure processing issue
On Tue, Jun 24, 2014 at 01:01:52PM +0200, Markus Armbruster wrote:

Kevin Wolf kw...@redhat.com writes:

On 23.06.2014 at 17:28, Chen Gang wrote:

When a failure occurs, 'ret' needs to be set; otherwise the function may return 0, indicating success. error_propagate() also needs to be called only once within a function.

It is unusual for bdrv_append_temp_snapshot() to return nothing yet still set errp when an error occurs, although it has a return value internally. So expose bdrv_append_temp_snapshot()'s internal return value to the caller, which makes things regular and fixes the issue too.

Signed-off-by: Chen Gang gang.chen.5...@gmail.com

What does this fix?

It fixes the return value of bdrv_open() when bdrv_append_temp_snapshot() fails. Before this patch, it returns a positive value, which is wrong. After the patch, it returns the negative error code bdrv_append_temp_snapshot() now returns.

Exactly. I asked for the -errno return because otherwise bdrv_open() would have no accurate errno.

Stefan
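The convention discussed here (return a negative errno and set the error object, so the caller has both) can be sketched in a few lines. This is an illustration under stated assumptions: `Err`, `append_snapshot` and `open_image` are hypothetical stand-ins, and QEMU's real Error API is richer than this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical minimal error object for this example. */
typedef struct {
    const char *msg;
} Err;

/*
 * Hypothetical helper step: returns 0 on success or a negative errno
 * on failure, and fills *errp with a message on failure only.
 */
static int append_snapshot(int should_fail, Err *errp)
{
    if (should_fail) {
        if (errp != NULL) {
            errp->msg = "could not create temporary snapshot";
        }
        return -EIO;
    }
    return 0;
}

/*
 * Caller: propagates the helper's negative code instead of leaving
 * its own return value at a stale, non-negative number.
 */
static int open_image(int should_fail, Err *errp)
{
    int ret = append_snapshot(should_fail, errp);

    if (ret < 0) {
        return ret;     /* errp was already set by the helper */
    }
    return 0;
}
```

If the helper returned void and only set the error object, the caller would have to invent an errno, which is the inaccuracy the reply refers to.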
Re: [Qemu-devel] Reverse execution and deterministic replay
On 27 June 2014 12:31, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote: I forgot about one more issue. When qemu stops execution on the breakpoint, the icount is decreased to the number of instructions in the block. But in this case the last instruction is not executed and should not affect the counter. Yes, indeed, that's the sort of edge case bug we should fix. -- PMM
Re: [Qemu-devel] [PATCH 0/5] Platform device support
On 27 June 2014 12:48, Paolo Bonzini pbonz...@redhat.com wrote: We could also make TYPE_MACHINE a subclass of TYPE_DEVICE, to have an obvious place for this SoC object. Why isn't TYPE_MACHINE a subclass of TYPE_DEVICE anyway? thanks -- PMM
[Qemu-devel] [PULL 14/32] target-ppc: Remove unused gen_qemu_ld8s()
From: Peter Maydell peter.mayd...@linaro.org

The gen_qemu_ld8s() function is unused; remove it.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/translate.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b501655..b23933f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2662,11 +2662,6 @@ static inline void gen_qemu_ld8u(DisasContext *ctx, TCGv arg1, TCGv arg2)
     tcg_gen_qemu_ld8u(arg1, arg2, ctx->mem_idx);
 }

-static inline void gen_qemu_ld8s(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_qemu_ld8s(arg1, arg2, ctx->mem_idx);
-}
-
 static inline void gen_qemu_ld16u(DisasContext *ctx, TCGv arg1, TCGv arg2)
 {
     TCGMemOp op = MO_UW | ctx->default_tcg_memop_mask;
-- 
1.8.1.4
[Qemu-devel] [PULL 03/32] linux-user: Identify Addition Hardware Capabilities for PowerPC
From: Tom Musta tommu...@gmail.com

Add VSX, DFP and ISA 2.06 to the bits identified in the AT_HWCAP entry of the AUXV.

Signed-off-by: Tom Musta tommu...@gmail.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-user/elfload.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 64d23fa..9a41882 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -749,6 +749,8 @@ static uint32_t get_elf_hwcap(void)
        Altivec/FP/SPE support. Anything else is just a bonus. */
 #define GET_FEATURE(flag, feature) \
     do { if (cpu->env.insns_flags & flag) { features |= feature; } } while (0)
+#define GET_FEATURE2(flag, feature) \
+    do { if (cpu->env.insns_flags2 & flag) { features |= feature; } } while (0)
     GET_FEATURE(PPC_64B, QEMU_PPC_FEATURE_64);
     GET_FEATURE(PPC_FLOAT, QEMU_PPC_FEATURE_HAS_FPU);
     GET_FEATURE(PPC_ALTIVEC, QEMU_PPC_FEATURE_HAS_ALTIVEC);
@@ -757,7 +759,13 @@ static uint32_t get_elf_hwcap(void)
     GET_FEATURE(PPC_SPE_DOUBLE, QEMU_PPC_FEATURE_HAS_EFP_DOUBLE);
     GET_FEATURE(PPC_BOOKE, QEMU_PPC_FEATURE_BOOKE);
     GET_FEATURE(PPC_405_MAC, QEMU_PPC_FEATURE_HAS_4xxMAC);
+    GET_FEATURE2(PPC2_DFP, QEMU_PPC_FEATURE_HAS_DFP);
+    GET_FEATURE2(PPC2_VSX, QEMU_PPC_FEATURE_HAS_VSX);
+    GET_FEATURE2((PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 |
+                  PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206),
+                 QEMU_PPC_FEATURE_ARCH_2_06);
 #undef GET_FEATURE
+#undef GET_FEATURE2

     return features;
 }
-- 
1.8.1.4
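The macro pattern in this patch (one helper per flag word, each translating an internal instruction flag into a guest-visible HWCAP bit) can be tried in isolation. The sketch below uses hypothetical flag names and bit positions throughout; none of the values are QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical instruction-set flags (first and second flag word). */
#define INSN_FPU   (1u << 0)
#define INSN_VEC   (1u << 1)
#define INSN2_DFP  (1u << 0)
#define INSN2_VSX  (1u << 1)

/* Hypothetical guest-visible capability bits. */
#define HWCAP_FPU  (1u << 27)
#define HWCAP_VEC  (1u << 28)
#define HWCAP_DFP  (1u << 10)
#define HWCAP_VSX  (1u << 7)

/* Hypothetical CPU state holding the two flag words. */
typedef struct {
    uint64_t insns_flags;
    uint64_t insns_flags2;
} CpuEnv;

static uint32_t get_hwcap(const CpuEnv *env)
{
    uint32_t features = 0;

    /* One macro per flag word; arguments are parenthesized for safety. */
#define GET_FEATURE(flag, feature) \
    do { if (env->insns_flags & (flag)) { features |= (feature); } } while (0)
#define GET_FEATURE2(flag, feature) \
    do { if (env->insns_flags2 & (flag)) { features |= (feature); } } while (0)

    GET_FEATURE(INSN_FPU, HWCAP_FPU);
    GET_FEATURE(INSN_VEC, HWCAP_VEC);
    GET_FEATURE2(INSN2_DFP, HWCAP_DFP);
    GET_FEATURE2(INSN2_VSX, HWCAP_VSX);

#undef GET_FEATURE
#undef GET_FEATURE2

    return features;
}
```

Note that with a multi-bit mask, as in the ISA 2.06 line of the patch, a plain `&` test fires when any of the masked bits is set, not only when all of them are.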