Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On Wed, Nov 19, 2014 at 01:01:14PM +0530, Amit Shah wrote: On (Mon) 17 Nov 2014 [22:08:46], Michael S. Tsirkin wrote: At the moment we migrate ROMs which reside in fw cfg, which allows changing ROM code at will, and supports migrating largish blocks early, with good performance. However, we are running into a problem: changing size breaks migration every time. This already requires somewhat messy compatibility support in acpi generation code, and it looks like there'll be more to come. Rather than try to guess the correct size once and for all, this patchset tries to make code future-proof, by adding support for resizeable ram blocks. A (possibly very high) amount of space in ram_addr_t space is reserved for each block, but never allocated. If incoming block size differs from current size, block is reallocated. FW CFG is also notified and updated accordingly. To simplify things, I didn't add support for resizing actual RAM: device RAM such as fw cfg ROMs are never mapped into guests directly, so instead I added an API to flag device RAM explicitly, and manage them using simple alloc/free/realloc Considering this promises to rid us of worries about ROM size considerations once and for all, I thinking about pushing this as a kind of bugfix before 2.2, so we don't need to maintain more band-aids in 2.3 and on. I'd rather wait for 2.3; we've done this for a couple of releases already, so what's one more. And we're at rc2 already.. Paolo feels the same, and I agree. Note: migration stream is unaffected by these patches. This makes it possible to enable this functionality unconditionally, for all machine types. In the future, this might be handy for other things, such as changing kernels loaded on command line across migrations. I think that'll be too risky; unless we do S4 before / after migration to ensure the kernel realises things might be changing beneath its feet. Amit Well - guest never sees the resizing. It happens before we start the VM. So I don't see the issue - could you clarify please? -- MST
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
Michael S. Tsirkin m...@redhat.com writes: On Tue, Nov 18, 2014 at 03:47:16PM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: At the moment we migrate ROMs which reside in fw cfg, which allows changing ROM code at will, and supports migrating largish blocks early, with good performance. However, we are running into a problem: changing size breaks migration every time. Pardon my ignorance... changing ROM in migration? I would expect migration to be as transparent for ROM as it is for RAM. What am I missing? [...] Nothing really. We migrate RAM size too - but if there's a mismatch, migration fails. RAM size is user configurable. ROM is used internally so we have to figure out the size, and it turned out to be a hard problem, so we end up maintaining logic for ROM size like we did in 1.7 like we did in 2.0 etc. I don't want to add any more: let's just accept whatever is migrated, and stick to it. Are you proposing to change ROM size on the target from whatever was configured during startup to whatever is configured on the source?
Re: [Qemu-devel] [PATCH] configure: Replace which(1) with command -v
On 19 November 2014 07:13, Fam Zheng f...@redhat.com wrote: Because which(1) is not always installed, whereas command -v is the more native way to check for a command. Signed-off-by: Fam Zheng f...@redhat.com --- configure | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure b/configure index 47048f0..986a13d 100755 --- a/configure +++ b/configure @@ -2727,7 +2727,7 @@ fi if test $modules = yes; then shacmd_probe=sha1sum sha1 shasum for c in $shacmd_probe; do -if which $c /dev/null 21; then +if command -v $c /dev/null 21; then shacmd=$c break fi Configure already provides the shell function has for checking for existence of a command -- can we just use that? thanks -- PMM
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On (Wed) 19 Nov 2014 [10:15:16], Michael S. Tsirkin wrote: On Wed, Nov 19, 2014 at 01:01:14PM +0530, Amit Shah wrote: On (Mon) 17 Nov 2014 [22:08:46], Michael S. Tsirkin wrote: Note: migration stream is unaffected by these patches. This makes it possible to enable this functionality unconditionally, for all machine types. In the future, this might be handy for other things, such as changing kernels loaded on command line across migrations. I think that'll be too risky; unless we do S4 before / after migration to ensure the kernel realises things might be changing beneath its feet. Well - guest never sees the resizing. It happens before we start the VM. So I don't see the issue - could you clarify please? Before we start the VM? That's a really odd corner case (ie not worth bothering about?). I took this to mean that the guest was running while migration happened. Amit
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
Hi, Ping...? Should this be fixed in 2.2? Thanks, zhanghailiang On 2014/11/17 13:11, zhanghailiang wrote: Hi, Patch 1 and 2 mainly fix bug about balloon not working correctly when we do hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong after we hotplug memory. This bug exists since we begin to support hotplug memory, and it is better to fix it. Patch 3 add some trace events, it helps debugging balloon. If it is unnecessary, pls feel free to remove it. Thanks, zhanghailiang v2: - fix compiling break for other targets that don't support pc-dimm zhanghailiang (3): pc-dimm: add a function to calculate VM's current RAM size virtio-balloon: Fix balloon not working correctly when hotplug memory virtio-balloon: Add some trace events hw/mem/pc-dimm.c| 26 ++ hw/virtio/virtio-balloon.c | 21 +++-- include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + trace-events| 4 5 files changed, 51 insertions(+), 6 deletions(-)
Re: [Qemu-devel] [PATCH 1/5] cpu: add cpu_physical_memory_clear_dirty_range_nocode
Michael S. Tsirkin m...@redhat.com wrote: simple wrapper so callers don't need to know about dirty bitmap clients. Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Tue, Nov 18, 2014 at 07:03:58AM +0100, Paolo Bonzini wrote: On 17/11/2014 21:08, Michael S. Tsirkin wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size Also, I am afraid that this design could make it easier to introduce backwards-incompatible changes. Well the point is exactly to make it easy to make *compatible* changes. As I mentioned in the cover letter, it's not just ACPI. For example, we now change boot index dynamically. People using large fw cfg blobs, e.g. -initrd, would benefit from ability to change the blob dynamically. There could be other examples. changing the size of the initrd, on the fly and wanting to migrate? Is that a real use case? One that we should really care? I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. That is what is making this whole issue complicated. We have two clear options: a- require BIOS memory regions to be exactly the same in both sides. No need to add compat machinery. b- trying to accomodate any potential change that could appear and use the same BIOS. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan.
Re: [Qemu-devel] [PATCH 1/4] MAINTAINERS: Add myself to migration maintainers
Amit Shah amit.s...@redhat.com wrote: Signed-off-by: Amit Shah amit.s...@redhat.com --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 2/4] MAINTAINERS: migration: add vmstate static checker files
Amit Shah amit.s...@redhat.com wrote: Signed-off-by: Amit Shah amit.s...@redhat.com --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 4/4] MAINTAINERS: add include files to virtio-serial entry
Amit Shah amit.s...@redhat.com wrote: Signed-off-by: Amit Shah amit.s...@redhat.com --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 3/4] MAINTAINERS: add entry for virtio-rng
Amit Shah amit.s...@redhat.com wrote: Signed-off-by: Amit Shah amit.s...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Tue, Nov 18, 2014 at 07:03:58AM +0100, Paolo Bonzini wrote: On 17/11/2014 21:08, Michael S. Tsirkin wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size Also, I am afraid that this design could make it easier to introduce backwards-incompatible changes. Well the point is exactly to make it easy to make *compatible* changes. As I mentioned in the cover letter, it's not just ACPI. For example, we now change boot index dynamically. People using large fw cfg blobs, e.g. -initrd, would benefit from ability to change the blob dynamically. There could be other examples. changing the size of the initrd, on the fly and wanting to migrate? Is that a real use case? One that we should really care? I'm not sure. At the moment one can swap kernels by doing halt in guest and restarting with the new one. If we wanted to allow reboot in guest to bring a new kernel instead, that would be one way to implement it. I was merely pointing out that the capability might find other uses. I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. This claim is very wrong. This would make is impossible to change BIOS bus without breaking migration. Look at history of qemu, we change BIOS every release. That is what is making this whole issue complicated. We have two clear options: a- require BIOS memory regions to be exactly the same in both sides. No need to add compat machinery. b- trying to accomodate any potential change that could appear and use the same BIOS. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Yes, it is hard to support. But it's a required feature, and in fact, it's an existing one. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region -- MST
Re: [Qemu-devel] [PATCH v2] Pass semihosting exit code back to system.
On 18/11/2014 23:50, Peter Maydell wrote: On 18 November 2014 21:19, Paolo Bonzini pbonz...@redhat.com wrote: The isa-debugexit and testdev character device do not exit with exitcode 0, in order to distinguish an exit from a system power down; the low-order bit is always 1. The return values then should be 1 and 3 instead of 0 and 1. Semihosting isn't a test device -- it's an implementation of an ABI (mostly intended for almost-but-not-quite-bare-metal programs). I'm pretty sure that exiting anything except 0 on a successful exit request by the guest will break usage of QEMU in scenarios like the gcc test suite. Thanks! (FWIW, I don't find the behavior of those devices very useful either... just mentioning them for the sake of consistency). Paolo
Re: [Qemu-devel] Tunneled Migration with Non-Shared Storage
* Paolo Bonzini (pbonz...@redhat.com) wrote: On 18/11/2014 21:28, Dr. David Alan Gilbert wrote: This seems odd, since as far as I know the tunneling code is quite separate to the migration code; I thought the only thing that the migration code sees different is the file descriptors it gets past. (Having said that, again I don't know storage stuff, so if this is a storage special there may be something there...) Tunnelled migration uses the old block-migration.c code. Non-tunnelled migration uses the NBD server and block/mirror.c. OK, that explains that. Is that because the tunneling code can't deal with tunneling the NBD server connection? The main problem with the old code is that uses a possibly unbounded amount of memory in mig_save_device_dirty and can have huge jitter if any serious workload is running in the guest. So that's sending dirty blocks iteratively? Not that I can see when the allocations get freed; but is the amount allocated there related to total disk size (as Gary suggested) or to the amount of dirty blocks? Dave Paolo -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] spice: remove spice-experimental.h include
On Mo, 2014-11-17 at 20:19 +0300, Michael Tokarev wrote: 17.11.2014 18:52, Marc-André Lureau wrote: Nothing seems to be using functions from spice-experimental.h (better that way). Let's remove its inclusion. Is it with current spice, or with some older spice too? I mean, why this include has been added to start with -- was it because of some feature which initially was only in -experimental.h but later moved to main spice? If yes, it'd be interesting to see when, in which version of spice, that has been done. Or at least to verify that the minimal version of spice required by qemu (in configure) allows to stop including -experimental.h So basically, my only question here is -- what version of spice did you check -- especially, did you check the current minimal required version too? minimum required spice-server is 0.12.0. IIRC the char channel stuff used to be in experimental, but I think that predates 0.12.0. Marc? cheers, Gerd
Re: [Qemu-devel] hotremoving a disk qmp/hmp
On Nov18 18:15, Markus Armbruster wrote: Looks like you're using an old version of QEMU. drive_del has been fixed to refuse touching a backend created with blockdev-add: I am using the last v2.1.x version but there was indeed some commits since You can't destroy a backend created with blockdev-add, yet. I hope to have blockdev-del in 2.3. thanks for the information My advice is to stick to drive_add and drive_del for now. Not in QMP, so you need to use HMP. You can do that in QMP with human-monitor-command. ok understood that's what I was suspecting after not finding a way to delete an object added by blockdev-add Thanks, -- William signature.asc Description: Digital signature
Re: [Qemu-devel] [PATCH v2 1/3] pc-dimm: add a function to calculate VM's current RAM size
On Mon, 17 Nov 2014 13:11:08 +0800 zhanghailiang zhang.zhanghaili...@huawei.com wrote: The global parameter 'ram_size' does not take into account the hotplugged memory. In some codes, we use 'ram_size' as current VM's real RAM size, which is not correct. Add function 'get_current_ram_size' to calculate VM's current RAM size, it will enumerate present memory devices and also plus ram_size. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- hw/mem/pc-dimm.c| 26 ++ include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + 3 files changed, 32 insertions(+) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index a800ea7..38465d0 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -62,6 +62,32 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque) return 0; } +ram_addr_t get_current_ram_size(void) +{ +MemoryDeviceInfoList *info_list = NULL; +MemoryDeviceInfoList **prev = info_list; +MemoryDeviceInfoList *info; +ram_addr_t size = ram_size; + +qmp_pc_dimm_device_list(qdev_get_machine(), prev); +for (info = info_list; info; info = info-next) { +MemoryDeviceInfo *value = info-value; + +if (value) { +switch (value-kind) { +case MEMORY_DEVICE_INFO_KIND_DIMM: +size += value-dimm-size; +break; +default: +break; +} +} +} +qapi_free_MemoryDeviceInfoList(info_list); + +return size; +} function is a generic one and it could handle not only pc-dimm device in future, so it would be better to put it in some device neutral place. That would allow to drop following stub as well. + static int pc_dimm_slot2bitmap(Object *obj, void *opaque) { unsigned long *bitmap = opaque; diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 427b851..fcc3162 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -52,6 +52,7 @@ typedef uintptr_t ram_addr_t; #endif extern ram_addr_t ram_size; +ram_addr_t get_current_ram_size(void); /* memory API */ diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_pc_dimm_device_list.c index 5cb220c..b584bd8 100644 --- a/stubs/qmp_pc_dimm_device_list.c +++ b/stubs/qmp_pc_dimm_device_list.c @@ -5,3 +5,8 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque) { return 0; } + +ram_addr_t get_current_ram_size(void) +{ +return ram_size; +}
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
On Wed, 19 Nov 2014 16:28:09 +0800 zhanghailiang zhang.zhanghaili...@huawei.com wrote: Hi, Ping...? Should this be fixed in 2.2? I'd preffer it go after memory unplug is merged to see how balooning will fare along with it. Also ack from Luiz could be useful to make it sure that balloning will work just fine with sparse memory and memory unplug. Thanks, zhanghailiang On 2014/11/17 13:11, zhanghailiang wrote: Hi, Patch 1 and 2 mainly fix bug about balloon not working correctly when we do hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong after we hotplug memory. This bug exists since we begin to support hotplug memory, and it is better to fix it. Patch 3 add some trace events, it helps debugging balloon. If it is unnecessary, pls feel free to remove it. Thanks, zhanghailiang v2: - fix compiling break for other targets that don't support pc-dimm zhanghailiang (3): pc-dimm: add a function to calculate VM's current RAM size virtio-balloon: Fix balloon not working correctly when hotplug memory virtio-balloon: Add some trace events hw/mem/pc-dimm.c| 26 ++ hw/virtio/virtio-balloon.c | 21 +++-- include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + trace-events| 4 5 files changed, 51 insertions(+), 6 deletions(-)
Re: [Qemu-devel] [PATCH for-2.2] acpi-build: mark RAM dirty on table update
On Wed, 19 Nov 2014 12:51:00 +0530 Amit Shah amit.s...@redhat.com wrote: On (Mon) 17 Nov 2014 [19:04:13], Michael S. Tsirkin wrote: acpi build modifies internal FW CFG RAM on first access but we forgot to mark it dirty. If this RAM has been migrated already, it won't be migrated again, returning corrupted tables to guest. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- include/hw/loader.h | 2 +- hw/core/loader.c | 8 +--- hw/i386/acpi-build.c | 11 --- 3 files changed, 14 insertions(+), 7 deletions(-) diff --git a/include/hw/loader.h b/include/hw/loader.h index 054c6a2..6481639 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -59,7 +59,7 @@ extern bool rom_file_has_mr; int rom_add_file(const char *file, const char *fw_dir, hwaddr addr, int32_t bootindex, bool option_rom); -void *rom_add_blob(const char *name, const void *blob, size_t len, +ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, FWCfgReadCallback fw_callback, void *callback_opaque); Here, and in the next hunks where function signatures are modified, indent of following lines go off. Minor nit. int rom_add_elf_program(const char *name, void *data, size_t datasize, diff --git a/hw/core/loader.c b/hw/core/loader.c index bbe6eb3..5cf686d 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -798,12 +798,12 @@ err: return -1; } -void *rom_add_blob(const char *name, const void *blob, size_t len, +ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, FWCfgReadCallback fw_callback, void *callback_opaque) { Rom *rom; -void *data = NULL; +ram_addr_t ret = RAM_ADDR_MAX; rom = g_malloc0(sizeof(*rom)); rom-name = g_strdup(name); @@ -815,11 +815,13 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, rom_insert(rom); if (fw_file_name fw_cfg) { char devpath[100]; +void *data; snprintf(devpath, sizeof(devpath), /rom@%s, fw_file_name); if (rom_file_has_mr) { data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); +ret = memory_region_get_ram_addr(rom-mr); } else { data = rom-data; } @@ -828,7 +830,7 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, fw_callback, callback_opaque, data, rom-romsize); } -return data; +return ret; } /* This function is specific for elf program because we don't need to allocate diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 4003b6b..92a36e3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -56,6 +56,7 @@ #include qapi/qmp/qint.h #include qom/qom-qobject.h +#include exec/ram_addr.h /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows @@ -1511,7 +1512,7 @@ static inline void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre) typedef struct AcpiBuildState { /* Copy of table in RAM (for patching). */ -uint8_t *table_ram; +ram_addr_t table_ram; uint32_t table_size; /* Is table patched? */ uint8_t patched; @@ -1716,9 +1717,12 @@ static void acpi_build_update(void *build_opaque, uint32_t offset) acpi_build(build_state-guest_info, tables); assert(acpi_data_len(tables.table_data) == build_state-table_size); -memcpy(build_state-table_ram, tables.table_data-data, +memcpy(qemu_get_ram_ptr(build_state-table_ram), tables.table_data-data, build_state-table_size); This looks like something not necessary for this patch? Can be split off into another one? + cpu_physical_memory_set_dirty_range_nocode(build_state-table_ram, + build_state-table_size); + acpi_build_tables_cleanup(tables, true); } @@ -1728,7 +1732,7 @@ static void acpi_build_reset(void *build_opaque) build_state-patched = 0; } -static void *acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, +static ram_addr_t acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, const char *name) { return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name, @@ -1777,6 +1781,7 @@ void acpi_setup(PcGuestInfo *guest_info) /* Now expose it all to Guest */ build_state-table_ram = acpi_add_rom_blob(build_state, tables.table_data, ACPI_BUILD_TABLE_FILE); +assert(build_state-table_ram != RAM_ADDR_MAX); build_state-table_size = acpi_data_len(tables.table_data); Isn't an assert too strong if this happens during hotplug? I'm trying to follow this code, but looks like this
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. This claim is very wrong. This would make is impossible to change BIOS bus without breaking migration. Look at history of qemu, we change BIOS every release. And we should do: qemu -M pc-2.0 -L BIOS-2.0.bin And that way for every version and every bios. If they are the same, a symlink does. If they are not, different file. And we would not have this problems. I fully agree that we have problems with BIOS every relase. What we don't agree is _what_ is the best way to fix the issue. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Yes, it is hard to support. But it's a required feature, and in fact, it's an existing one. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. I am sure it is unsafe. Fixing them requires incompatible changes, that is the whole point :-( However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Yes and no. You are assuming that a guest launched with a device ram size of 256KB receives a 512KB section and it knows what to do with it. It knows what to do with the 256KB section, with the 512KB answer is always a perhaps. It depends of what is on the extra space. My solution would be that RAM/ROM sizes are always the same for migration, so destination would always understand it. It just forbids migrating between different machine types. And I think that is good, not bad. Later, Juan.
Re: [Qemu-devel] [PATCH for-2.2] acpi-build: mark RAM dirty on table update
On Wed, Nov 19, 2014 at 12:51:00PM +0530, Amit Shah wrote: On (Mon) 17 Nov 2014 [19:04:13], Michael S. Tsirkin wrote: acpi build modifies internal FW CFG RAM on first access but we forgot to mark it dirty. If this RAM has been migrated already, it won't be migrated again, returning corrupted tables to guest. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- include/hw/loader.h | 2 +- hw/core/loader.c | 8 +--- hw/i386/acpi-build.c | 11 --- 3 files changed, 14 insertions(+), 7 deletions(-) diff --git a/include/hw/loader.h b/include/hw/loader.h index 054c6a2..6481639 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -59,7 +59,7 @@ extern bool rom_file_has_mr; int rom_add_file(const char *file, const char *fw_dir, hwaddr addr, int32_t bootindex, bool option_rom); -void *rom_add_blob(const char *name, const void *blob, size_t len, +ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, FWCfgReadCallback fw_callback, void *callback_opaque); Here, and in the next hunks where function signatures are modified, indent of following lines go off. Minor nit. int rom_add_elf_program(const char *name, void *data, size_t datasize, diff --git a/hw/core/loader.c b/hw/core/loader.c index bbe6eb3..5cf686d 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -798,12 +798,12 @@ err: return -1; } -void *rom_add_blob(const char *name, const void *blob, size_t len, +ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, FWCfgReadCallback fw_callback, void *callback_opaque) { Rom *rom; -void *data = NULL; +ram_addr_t ret = RAM_ADDR_MAX; rom = g_malloc0(sizeof(*rom)); rom-name = g_strdup(name); @@ -815,11 +815,13 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, rom_insert(rom); if (fw_file_name fw_cfg) { char devpath[100]; +void *data; snprintf(devpath, sizeof(devpath), /rom@%s, fw_file_name); if (rom_file_has_mr) { data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); +ret = memory_region_get_ram_addr(rom-mr); } else { data = rom-data; } @@ -828,7 +830,7 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, fw_callback, callback_opaque, data, rom-romsize); } -return data; +return ret; } /* This function is specific for elf program because we don't need to allocate diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 4003b6b..92a36e3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -56,6 +56,7 @@ #include qapi/qmp/qint.h #include qom/qom-qobject.h +#include exec/ram_addr.h /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows @@ -1511,7 +1512,7 @@ static inline void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre) typedef struct AcpiBuildState { /* Copy of table in RAM (for patching). */ -uint8_t *table_ram; +ram_addr_t table_ram; uint32_t table_size; /* Is table patched? */ uint8_t patched; @@ -1716,9 +1717,12 @@ static void acpi_build_update(void *build_opaque, uint32_t offset) acpi_build(build_state-guest_info, tables); assert(acpi_data_len(tables.table_data) == build_state-table_size); -memcpy(build_state-table_ram, tables.table_data-data, +memcpy(qemu_get_ram_ptr(build_state-table_ram), tables.table_data-data, build_state-table_size); This looks like something not necessary for this patch? Can be split off into another one? How can it? We need to track ram address in order to dirty it. I could track both pointer and ram address, but this seems like unnecessary data duplication that will just lead to bugs. the benefit of splitting one like off seems to small. +cpu_physical_memory_set_dirty_range_nocode(build_state-table_ram, + build_state-table_size); + acpi_build_tables_cleanup(tables, true); } @@ -1728,7 +1732,7 @@ static void acpi_build_reset(void *build_opaque) build_state-patched = 0; } -static void *acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, +static ram_addr_t acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, const char *name) { return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name, @@ -1777,6 +1781,7
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com writes: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Tue, Nov 18, 2014 at 07:03:58AM +0100, Paolo Bonzini wrote: On 17/11/2014 21:08, Michael S. Tsirkin wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size Also, I am afraid that this design could make it easier to introduce backwards-incompatible changes. Well the point is exactly to make it easy to make *compatible* changes. As I mentioned in the cover letter, it's not just ACPI. For example, we now change boot index dynamically. People using large fw cfg blobs, e.g. -initrd, would benefit from ability to change the blob dynamically. There could be other examples. changing the size of the initrd, on the fly and wanting to migrate? Is that a real use case? One that we should really care? I'm not sure. At the moment one can swap kernels by doing halt in guest and restarting with the new one. If we wanted to allow reboot in guest to bring a new kernel instead, that would be one way to implement it. I was merely pointing out that the capability might find other uses. I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. This claim is very wrong. This would make is impossible to change BIOS bus without breaking migration. Look at history of qemu, we change BIOS every release. Since migration doesn't transport configuration, we require a compatibly configured target, and that includes identical memory sizes. RAM size is explicit and the user's problem. ROM size is generally implicit, and we use machine type compatibility machinery to keep it fixed. BIOS changes can break migration only when we screw up or forget the compatibility machinery. Same as for lots of other stuff. No big deal, really, just a consequence of not migrating configuration. That is what is making this whole issue complicated. We have two clear options: a- require BIOS memory regions to be exactly the same in both sides. No need to add compat machinery. b- trying to accomodate any potential change that could appear and use the same BIOS. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Yes, it is hard to support. But it's a required feature, and in fact, it's an existing one. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Are you proposing to make incoming migration adjust some or all memory sizes on the target from whatever was configured during startup to whatever is configured on the source? Possibly with some limitations, such as can only adjust downwards?
[Qemu-devel] [PATCH v2 0/6] Geometry and blocksize support for backing devices
Hi folks, I'm sorry for the recent spam. I messed up during code submission last time. So please ignore any previous notes you received from me and answer only to this thread. This is the rework of the geometry+blocksize patch, which was recently discussed here: http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg01148.html Markus suggested that we only detect blocksize and geometry for DASDs. According to this agreement new version contains DASD special casing. The driver methods are implemented only for host_device and inner hdev_xxx functions check if the backing storage is a DASD by means of BIODASDINFO2 ioctl. Original patchset can be found here: http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg03791.html Ekaterina Tumanova (6): geometry: add bdrv functions for geometry and blocksize geometry: Detect blocksize via ioctls in separate static functions geometry: Add driver methods to probe blocksizes and geometry geometry: Add block-backend wrappers for geometry probing geometry: Call backend function to detect geometry and blocksize geometry: Target specific hook for s390x in geometry guessing block.c| 26 + block/block-backend.c | 10 block/raw-posix.c | 123 ++--- block/raw_bsd.c| 12 hw/block/Makefile.objs | 6 +- hw/block/block.c | 11 hw/block/hd-geometry.c | 43 -- hw/block/virtio-blk.c | 1 + include/block/block.h | 20 +++ include/block/block_int.h | 3 + include/hw/block/block.h | 1 + include/sysemu/block-backend.h | 2 + 12 files changed, 234 insertions(+), 24 deletions(-) -- 1.8.5.5
[Qemu-devel] [PATCH v2 4/6] geometry: Add block-backend wrappers for geometry probing
Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- block/block-backend.c | 10 ++ include/sysemu/block-backend.h | 2 ++ 2 files changed, 12 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index d0692b1..6717301 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -629,3 +629,13 @@ void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk, { return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque); } + +struct ProbeBlockSize blk_probe_blocksizes(BlockBackend *blk) +{ +return bdrv_probe_blocksizes(blk-bs); +} + +struct ProbeGeometry blk_probe_geometry(BlockBackend *blk) +{ +return bdrv_probe_geometry(blk-bs); +} diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 52d13c1..e8b497a 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -138,5 +138,7 @@ BlockAcctStats *blk_get_stats(BlockBackend *blk); void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk, BlockCompletionFunc *cb, void *opaque); +struct ProbeBlockSize blk_probe_blocksizes(BlockBackend *blk); +struct ProbeGeometry blk_probe_geometry(BlockBackend *blk); #endif -- 1.8.5.5
[Qemu-devel] [PATCH v2 2/6] geometry: Detect blocksize via ioctls in separate static functions
Move the IOCTL calls that detect logical blocksize from raw_probe_alignment into separate function (probe_logical_blocksize). Introduce function which detect physical blocksize via IOCTL (probe_physical_blocksize). Both functions will be used in the next patch. Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- block/raw-posix.c | 58 +-- 1 file changed, 39 insertions(+), 19 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index e100ae2..45f1d79 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -223,50 +223,70 @@ static int raw_normalize_devicepath(const char **filename) } #endif -static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) +static unsigned int probe_logical_blocksize(BlockDriverState *bs, int fd) { -BDRVRawState *s = bs-opaque; -char *buf; -unsigned int sector_size; - -/* For /dev/sg devices the alignment is not really used. - With buffered I/O, we don't have any restrictions. */ -if (bs-sg || !s-needs_alignment) { -bs-request_alignment = 1; -s-buf_align = 1; -return; -} +unsigned int sector_size = 0; /* Try a few ioctls to get the right size */ -bs-request_alignment = 0; -s-buf_align = 0; - #ifdef BLKSSZGET if (ioctl(fd, BLKSSZGET, sector_size) = 0) { -bs-request_alignment = sector_size; +return sector_size; } #endif #ifdef DKIOCGETBLOCKSIZE if (ioctl(fd, DKIOCGETBLOCKSIZE, sector_size) = 0) { -bs-request_alignment = sector_size; +return sector_size; } #endif #ifdef DIOCGSECTORSIZE if (ioctl(fd, DIOCGSECTORSIZE, sector_size) = 0) { -bs-request_alignment = sector_size; +return sector_size; } #endif #ifdef CONFIG_XFS if (s-is_xfs) { struct dioattr da; if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, da) = 0) { -bs-request_alignment = da.d_miniosz; +sector_size = da.d_miniosz; /* The kernel returns wrong information for d_mem */ /* s-buf_align = da.d_mem; */ +return sector_size; } } #endif +return 0; +} + +static unsigned int probe_physical_blocksize(BlockDriverState *bs, int fd) +{ +unsigned int blk_size = 0; +#ifdef BLKPBSZGET +if (ioctl(fd, BLKPBSZGET, blk_size) = 0) { +return blk_size; +} +#endif + +return 0; +} + +static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) +{ +BDRVRawState *s = bs-opaque; +char *buf; + +/* For /dev/sg devices the alignment is not really used. + With buffered I/O, we don't have any restrictions. */ +if (bs-sg || !s-needs_alignment) { +bs-request_alignment = 1; +s-buf_align = 1; +return; +} + +s-buf_align = 0; +/* Let's try to use the logical blocksize for the alignment. */ +bs-request_alignment = probe_logical_blocksize(bs, fd); + /* If we could not get the sizes so far, we can only guess them */ if (!s-buf_align) { size_t align; -- 1.8.5.5
[Qemu-devel] [PATCH v2 5/6] geometry: Call backend function to detect geometry and blocksize
hd_geometry_guess function autodetects the drive geometry. This patch adds a block backend call, that probes the backing device geometry. If the inner driver method is implemented and succeeds (currently only DASDs), the blkconf_geometry will pass-through the backing device geometry. Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- hw/block/block.c | 11 +++ hw/block/hd-geometry.c | 9 + hw/block/virtio-blk.c| 1 + include/hw/block/block.h | 1 + 4 files changed, 22 insertions(+) diff --git a/hw/block/block.c b/hw/block/block.c index a625773..f1d29bc 100644 --- a/hw/block/block.c +++ b/hw/block/block.c @@ -25,6 +25,17 @@ void blkconf_serial(BlockConf *conf, char **serial) } } +void blkconf_blocksizes(BlockConf *conf) +{ +BlockBackend *blk = conf-blk; +struct ProbeBlockSize blocksize = blk_probe_blocksizes(blk); + +if (blocksize.rc == 0) { +conf-physical_block_size = blocksize.size.phys; +conf-logical_block_size = blocksize.size.log; +} +} + void blkconf_geometry(BlockConf *conf, int *ptrans, unsigned cyls_max, unsigned heads_max, unsigned secs_max, Error **errp) diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c index 6fcf74d..4972114 100644 --- a/hw/block/hd-geometry.c +++ b/hw/block/hd-geometry.c @@ -121,6 +121,14 @@ void hd_geometry_guess(BlockBackend *blk, int *ptrans) { int cylinders, heads, secs, translation; +struct ProbeGeometry geometry = blk_probe_geometry(blk); + +if (geometry.rc == 0) { +*pcyls = geometry.geo.cylinders; +*psecs = geometry.geo.sectors; +*pheads = geometry.geo.heads; +goto done; +} if (guess_disk_lchs(blk, cylinders, heads, secs) 0) { /* no LCHS guess: use a standard physical disk geometry */ @@ -143,6 +151,7 @@ void hd_geometry_guess(BlockBackend *blk, the logical geometry */ translation = BIOS_ATA_TRANSLATION_NONE; } +done: if (ptrans) { *ptrans = translation; } diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index b19b102..6f01565 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -745,6 +745,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) error_propagate(errp, err); return; } +blkconf_blocksizes(conf-conf); virtio_init(vdev, virtio-blk, VIRTIO_ID_BLOCK, sizeof(struct virtio_blk_config)); diff --git a/include/hw/block/block.h b/include/hw/block/block.h index 0d0ce9a..856bf75 100644 --- a/include/hw/block/block.h +++ b/include/hw/block/block.h @@ -63,6 +63,7 @@ void blkconf_serial(BlockConf *conf, char **serial); void blkconf_geometry(BlockConf *conf, int *trans, unsigned cyls_max, unsigned heads_max, unsigned secs_max, Error **errp); +void blkconf_blocksizes(BlockConf *conf); /* Hard disk geometry */ -- 1.8.5.5
[Qemu-devel] [PATCH v2 1/6] geometry: add bdrv functions for geometry and blocksize
Add driver functions for geometry and blocksize detection Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- block.c | 26 ++ include/block/block.h | 20 include/block/block_int.h | 3 +++ 3 files changed, 49 insertions(+) diff --git a/block.c b/block.c index a612594..5df35cf 100644 --- a/block.c +++ b/block.c @@ -548,6 +548,32 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp) } } +struct ProbeBlockSize bdrv_probe_blocksizes(BlockDriverState *bs) +{ +BlockDriver *drv = bs-drv; +struct ProbeBlockSize err_geo = { .rc = -1 }; + +assert(drv != NULL); +if (drv-bdrv_probe_blocksizes) { +return drv-bdrv_probe_blocksizes(bs); +} + +return err_geo; +} + +struct ProbeGeometry bdrv_probe_geometry(BlockDriverState *bs) +{ +BlockDriver *drv = bs-drv; +struct ProbeGeometry err_geo = { .rc = -1 }; + +assert(drv != NULL); +if (drv-bdrv_probe_geometry) { +return drv-bdrv_probe_geometry(bs); +} + +return err_geo; +} + /* * Create a uniquely-named empty temporary file. * Return 0 upon success, otherwise a negative errno value. diff --git a/include/block/block.h b/include/block/block.h index 5450610..3287dbc 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -60,6 +60,24 @@ typedef enum { BDRV_REQ_MAY_UNMAP= 0x4, } BdrvRequestFlags; +struct ProbeBlockSize { +int rc; +struct BlockSize { +uint16_t phys; +uint16_t log; +} size; +}; + +struct ProbeGeometry { +int rc; +struct HDGeometry { +uint32_t heads; +uint32_t sectors; +uint32_t cylinders; +uint32_t start; +} geo; +}; + #define BDRV_O_RDWR0x0002 #define BDRV_O_SNAPSHOT0x0008 /* open the file read only and save writes in a snapshot */ #define BDRV_O_TEMPORARY 0x0010 /* delete the file after use */ @@ -538,6 +556,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs); * the old #AioContext is not executing. */ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context); +struct ProbeBlockSize bdrv_probe_blocksizes(BlockDriverState *bs); +struct ProbeGeometry bdrv_probe_geometry(BlockDriverState *bs); void bdrv_io_plug(BlockDriverState *bs); void bdrv_io_unplug(BlockDriverState *bs); diff --git a/include/block/block_int.h b/include/block/block_int.h index a1c17b9..830e564 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -271,6 +271,9 @@ struct BlockDriver { void (*bdrv_io_unplug)(BlockDriverState *bs); void (*bdrv_flush_io_queue)(BlockDriverState *bs); +struct ProbeBlockSize (*bdrv_probe_blocksizes)(BlockDriverState *bs); +struct ProbeGeometry (*bdrv_probe_geometry)(BlockDriverState *bs); + QLIST_ENTRY(BlockDriver) list; }; -- 1.8.5.5
[Qemu-devel] [PATCH v2 6/6] geometry: Target specific hook for s390x in geometry guessing
For target-s390x, the behaviour is chosen as follows: If no geo could be guessed from backing device, guess_disk_lchs (partition table guessing) is called. If this is not successful (e.g. image files) geometry is derived from the size of the disk (as before). We have no translation on s390, so we simply set in to NONE. Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- hw/block/Makefile.objs | 6 +- hw/block/hd-geometry.c | 36 +++- 2 files changed, 36 insertions(+), 6 deletions(-) diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs index d4c3ab7..1f7da7a 100644 --- a/hw/block/Makefile.objs +++ b/hw/block/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += block.o cdrom.o hd-geometry.o +common-obj-y += block.o cdrom.o common-obj-$(CONFIG_FDC) += fdc.o common-obj-$(CONFIG_SSI_M25P80) += m25p80.o common-obj-$(CONFIG_NAND) += nand.o @@ -13,3 +13,7 @@ obj-$(CONFIG_SH4) += tc58128.o obj-$(CONFIG_VIRTIO) += virtio-blk.o obj-$(CONFIG_VIRTIO) += dataplane/ + +# geometry is target/architecture dependent and therefore needs to be built +# for every target platform +obj-y += hd-geometry.o diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c index 4972114..b462225 100644 --- a/hw/block/hd-geometry.c +++ b/hw/block/hd-geometry.c @@ -49,11 +49,12 @@ struct partition { /* try to guess the disk logical geometry from the MSDOS partition table. Return 0 if OK, -1 if could not guess */ -static int guess_disk_lchs(BlockBackend *blk, - int *pcylinders, int *pheads, int *psectors) +static int guess_disk_lchs(BlockBackend *blk, uint32_t *pcylinders, + uint32_t *pheads, uint32_t *psectors) { uint8_t buf[BDRV_SECTOR_SIZE]; -int i, heads, sectors, cylinders; +int i; +uint32_t heads, sectors, cylinders; struct partition *p; uint32_t nr_sects; uint64_t nb_sectors; @@ -116,11 +117,11 @@ static void guess_chs_for_size(BlockBackend *blk, *psecs = 63; } +#ifdef TARGET_S390X void hd_geometry_guess(BlockBackend *blk, uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs, int *ptrans) { -int cylinders, heads, secs, translation; struct ProbeGeometry geometry = blk_probe_geometry(blk); if (geometry.rc == 0) { @@ -129,7 +130,32 @@ void hd_geometry_guess(BlockBackend *blk, *pheads = geometry.geo.heads; goto done; } +if (guess_disk_lchs(blk, pcyls, pheads, psecs) == 0) { +goto done; +} +guess_chs_for_size(blk, pcyls, pheads, psecs); +done: +if (ptrans) { +*ptrans = BIOS_ATA_TRANSLATION_NONE; +} +trace_hd_geometry_guess(blk, *pcyls, *pheads, *psecs, +BIOS_ATA_TRANSLATION_NONE); +} +#else +void hd_geometry_guess(BlockBackend *blk, + uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs, + int *ptrans) +{ +int cylinders, heads, secs, translation; +struct ProbeGeometry geometry = blk_probe_geometry(blk); + +if (geometry.rc == 0) { +*pcyls = geometry.geo.cylinders; +*psecs = geometry.geo.sectors; +*pheads = geometry.geo.heads; +goto done; +} if (guess_disk_lchs(blk, cylinders, heads, secs) 0) { /* no LCHS guess: use a standard physical disk geometry */ guess_chs_for_size(blk, pcyls, pheads, psecs); @@ -157,7 +183,7 @@ done: } trace_hd_geometry_guess(blk, *pcyls, *pheads, *psecs, translation); } - +#endif int hd_bios_chs_auto_trans(uint32_t cyls, uint32_t heads, uint32_t secs) { return cyls = 1024 heads = 16 secs = 63 -- 1.8.5.5
[Qemu-devel] [PATCH v2 3/6] geometry: Add driver methods to probe blocksizes and geometry
This patch introduces driver methods of defining disk blocksizes (physical and logical) and hard drive geometry. The method is only implemented for host_device. For raw devices driver calls child's method. The detection will only work for DASD devices. In order to check that a local CheckForDASD function was introduced, which calls BIODASDINFO2 ioctl and returns 0 only if it succeeds. Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- block/raw-posix.c | 65 +++ block/raw_bsd.c | 12 ++ 2 files changed, 77 insertions(+) diff --git a/block/raw-posix.c b/block/raw-posix.c index 45f1d79..274ceda 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -56,6 +56,7 @@ #include linux/cdrom.h #include linux/fd.h #include linux/fs.h +#include linux/hdreg.h #ifndef FS_NOCOW_FL #define FS_NOCOW_FL 0x0080 /* Do not cow file */ #endif @@ -93,6 +94,10 @@ #include xfs/xfs.h #endif +#ifdef __s390__ +#include asm/dasd.h +#endif + //#define DEBUG_FLOPPY //#define DEBUG_BLOCK @@ -678,6 +683,64 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp) bs-bl.opt_mem_alignment = s-buf_align; } +static int CheckForDASD(int fd) +{ +#ifdef BIODASDINFO2 +struct dasd_information2_t info = {0}; + +return ioctl(fd, BIODASDINFO2, info); +#endif +return -1; +} + +static struct ProbeBlockSize hdev_probe_blocksizes(BlockDriverState *bs) +{ +BDRVRawState *s = bs-opaque; +struct ProbeBlockSize block_sizes = {0}; + +block_sizes.rc = CheckForDASD(s-fd); +/* If DASD, get blocksizes */ +if (block_sizes.rc == 0) { +block_sizes.size.log = probe_logical_blocksize(bs, s-fd); +block_sizes.size.phys = probe_physical_blocksize(bs, s-fd); +} + +return block_sizes; +} + +static struct ProbeGeometry hdev_probe_geometry(BlockDriverState *bs) +{ +BDRVRawState *s = bs-opaque; +struct ProbeGeometry geometry = {0}; +struct hd_geometry ioctl_geo = {0}; + +geometry.rc = CheckForDASD(s-fd); +if (geometry.rc != 0) { +goto done; +} +/* If DASD, get it's geometry */ +geometry.rc = ioctl(s-fd, HDIO_GETGEO, ioctl_geo); +/* Do not return a geometry for partition */ +if (ioctl_geo.start != 0) { +geometry.rc = -1; +goto done; +} +/* HDIO_GETGEO may return success even though geo contains zeros + (e.g. certain multipath setups) */ +if (!ioctl_geo.heads || !ioctl_geo.sectors || !ioctl_geo.cylinders) { +geometry.rc = -1; +goto done; +} +if (geometry.rc == 0) { +geometry.geo.heads = (uint32_t)ioctl_geo.heads; +geometry.geo.sectors = (uint32_t)ioctl_geo.sectors; +geometry.geo.cylinders = (uint32_t)ioctl_geo.cylinders; +geometry.geo.start = (uint32_t)ioctl_geo.start; +} +done: + return geometry; +} + static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb) { int ret; @@ -2128,6 +2191,8 @@ static BlockDriver bdrv_host_device = { .bdrv_get_info = raw_get_info, .bdrv_get_allocated_file_size = raw_get_allocated_file_size, +.bdrv_probe_blocksizes = hdev_probe_blocksizes, +.bdrv_probe_geometry = hdev_probe_geometry, .bdrv_detach_aio_context = raw_detach_aio_context, .bdrv_attach_aio_context = raw_attach_aio_context, diff --git a/block/raw_bsd.c b/block/raw_bsd.c index 401b967..d164eba 100644 --- a/block/raw_bsd.c +++ b/block/raw_bsd.c @@ -173,6 +173,16 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename) return 1; } +static struct ProbeBlockSize raw_probe_blocksizes(BlockDriverState *bs) +{ +return bdrv_probe_blocksizes(bs-file); +} + +static struct ProbeGeometry raw_probe_geometry(BlockDriverState *bs) +{ +return bdrv_probe_geometry(bs-file); +} + static BlockDriver bdrv_raw = { .format_name = raw, .bdrv_probe = raw_probe, @@ -190,6 +200,8 @@ static BlockDriver bdrv_raw = { .has_variable_length = true, .bdrv_get_info= raw_get_info, .bdrv_refresh_limits = raw_refresh_limits, +.bdrv_probe_blocksizes = raw_probe_blocksizes, +.bdrv_probe_geometry = raw_probe_geometry, .bdrv_is_inserted = raw_is_inserted, .bdrv_media_changed = raw_media_changed, .bdrv_eject = raw_eject, -- 1.8.5.5
Re: [Qemu-devel] Tunneled Migration with Non-Shared Storage
On 19/11/2014 10:35, Dr. David Alan Gilbert wrote: * Paolo Bonzini (pbonz...@redhat.com) wrote: On 18/11/2014 21:28, Dr. David Alan Gilbert wrote: This seems odd, since as far as I know the tunneling code is quite separate to the migration code; I thought the only thing that the migration code sees different is the file descriptors it gets past. (Having said that, again I don't know storage stuff, so if this is a storage special there may be something there...) Tunnelled migration uses the old block-migration.c code. Non-tunnelled migration uses the NBD server and block/mirror.c. OK, that explains that. Is that because the tunneling code can't deal with tunneling the NBD server connection? The main problem with the old code is that uses a possibly unbounded amount of memory in mig_save_device_dirty and can have huge jitter if any serious workload is running in the guest. So that's sending dirty blocks iteratively? Not that I can see when the allocations get freed; but is the amount allocated there related to total disk size (as Gary suggested) or to the amount of dirty blocks? It should be related to the maximum rate limit (which can be set to arbitrarily high values, however). The reads are started, then the ones that are ready are sent and the blocks are freed in flush_blks. The jitter happens when the guest reads a lot but only writes a few blocks. In that case, the bdrv_drain_all in mig_save_device_dirty can be called relatively often and it can be expensive because it also waits for all guest-initiated reads to complete. The bulk phase is similar, just with different functions (the reads are done in mig_save_device_bulk). With a high rate limit, the total allocated memory can reach a few gigabytes indeed. Depending on the scenario, a possible disadvantage of NBD migration is that it can only throttle each disk separately, while the old code will apply a single limit to all migrations. Paolou
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On Wed, Nov 19, 2014 at 11:11:26AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. This claim is very wrong. This would make is impossible to change BIOS bus without breaking migration. Look at history of qemu, we change BIOS every release. And we should do: qemu -M pc-2.0 -L BIOS-2.0.bin And that way for every version and every bios. If they are the same, a symlink does. If they are not, different file. And we would not have this problems. You want to keep old bios around forever, and never fix bugs in it? I disagree, quite strongly. I fully agree that we have problems with BIOS every relase. What we don't agree is _what_ is the best way to fix the issue. You don't seem to want to fix them. Your solution is just to keep bugs around forver. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Yes, it is hard to support. But it's a required feature, and in fact, it's an existing one. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. I am sure it is unsafe. Fixing them requires incompatible changes, that is the whole point :-( I don't see the point, sorry. Want to elaborate? However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Yes and no. You are assuming that a guest launched with a device ram size of 256KB receives a 512KB section and it knows what to do with it. It knows what to do with the 256KB section, with the 512KB answer is always a perhaps. It depends of what is on the extra space. My solution would be that RAM/ROM sizes are always the same for migration, so destination would always understand it. It just forbids migrating between different machine types. And I think that is good, not bad. Later, Juan. You basically ask firmware to be perfect, or want us to carry old bugs around forever. It makes things simple for migration code, at huge cost elsewhere, and pain for users. I don't want to maintain old bugs in ACPI, and same applies to other firmware. This is really the whole point of the patchset. Keep bugs in device ram around until next reboot. At that point users get new device with non buggy behaviour. -- MST
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On Wed, Nov 19, 2014 at 11:16:57AM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Tue, Nov 18, 2014 at 07:03:58AM +0100, Paolo Bonzini wrote: On 17/11/2014 21:08, Michael S. Tsirkin wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size Also, I am afraid that this design could make it easier to introduce backwards-incompatible changes. Well the point is exactly to make it easy to make *compatible* changes. As I mentioned in the cover letter, it's not just ACPI. For example, we now change boot index dynamically. People using large fw cfg blobs, e.g. -initrd, would benefit from ability to change the blob dynamically. There could be other examples. changing the size of the initrd, on the fly and wanting to migrate? Is that a real use case? One that we should really care? I'm not sure. At the moment one can swap kernels by doing halt in guest and restarting with the new one. If we wanted to allow reboot in guest to bring a new kernel instead, that would be one way to implement it. I was merely pointing out that the capability might find other uses. I very much prefer to have user-controlled ACPI information (coming from the command-line) byte-for-byte identical for a given machine type. Patches for that have been on the list for almost two months, and it's not nice. Paolo I guess we just disagree on whether these patches will effectively achieve this goal. For example, some people want to rewrite iasl bits, generating everything in C. This will affect static bits too. I don't want to make every single change in code conditional on a machine type. You can't migrate with a different BIOS on destination, period. This claim is very wrong. This would make is impossible to change BIOS bus without breaking migration. Look at history of qemu, we change BIOS every release. Since migration doesn't transport configuration, we require a compatibly configured target, and that includes identical memory sizes. RAM size is explicit and the user's problem. ROM size is generally implicit, and we use machine type compatibility machinery to keep it fixed. BIOS changes can break migration only when we screw up or forget the compatibility machinery. Same as for lots of other stuff. No big deal, really, just a consequence of not migrating configuration. You don't get to maintain it, so it's no big deal for you. I see pain every single release and code is becoming spaghetty-like very quickly. We thought it would work. It does not. We do need a solution. And the pain is completely self-inflicted: we already migrate all necessary information! It's just a question of adjusting our datastructures to it. That is what is making this whole issue complicated. We have two clear options: a- require BIOS memory regions to be exactly the same in both sides. No need to add compat machinery. b- trying to accomodate any potential change that could appear and use the same BIOS. IMHO, b) is just asking for trouble. Being able to go from random changes to random changes look strange. Yes, it is hard to support. But it's a required feature, and in fact, it's an existing one. Just think about it for a second. We are sending more data for some regions that it was allocated. And we just grow the regions and expect that everything is going to be ok. It is me, or this goes against every security discipline that I can think of? Later, Juan. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Are you proposing to make incoming migration adjust some or all memory sizes on the target from whatever was configured during startup to whatever is configured on the source? Yes. At the moment, I only propose this for internal on-device RAM, for the simple reason that users don't know or care about it. So migrating it just removes maintainance pain. It wouldn't be hard to extend it for user-specified RAM, but I don't know whether that is useful. Possibly with some limitations, such as can only adjust downwards? Yes. Can adjust downwards is too limiting, since especially downstreams want two-way migration to work. So I just have devices specify an
Re: [Qemu-devel] [PATCH v2] Add the -semihosting-config option.
On 18 November 2014 20:19, Liviu Ionescu i...@livius.net wrote: The usual semihosting behaviour is to process the system calls locally and return; unfortuantelly the initial implementation dinamically changed the target to GDB during debug sessions, which, for the usual arm-none-eabi-gdb, is not implemented. The result was that during debug sessions the semihosting calls were discarded. This patch adds a configuration variable and an option to set it on the command line: -semihosting-config [enable=on|off,]target=native|gdb|auto This option enables semihosting and defines where the semihosting calls will be addressed, to QEMU ('native') or to GDB ('gdb'). The default is auto, which means 'gdb' during debug sessions and 'native' otherwise. Signed-off-by: Liviu Ionescu i...@livius.net Applied to target-arm.next, thanks. -- PMM
Re: [Qemu-devel] [PATCH v2 1/3] pc-dimm: add a function to calculate VM's current RAM size
On Mon, Nov 17, 2014 at 01:11:08PM +0800, zhanghailiang wrote: The global parameter 'ram_size' does not take into account the hotplugged memory. In some codes, we use 'ram_size' as current VM's real RAM size, which is not correct. Add function 'get_current_ram_size' to calculate VM's current RAM size, it will enumerate present memory devices and also plus ram_size. Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com This affects QMP right? Cc Luiz. --- hw/mem/pc-dimm.c| 26 ++ include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + 3 files changed, 32 insertions(+) diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index a800ea7..38465d0 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -62,6 +62,32 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque) return 0; } +ram_addr_t get_current_ram_size(void) +{ +MemoryDeviceInfoList *info_list = NULL; +MemoryDeviceInfoList **prev = info_list; +MemoryDeviceInfoList *info; +ram_addr_t size = ram_size; + +qmp_pc_dimm_device_list(qdev_get_machine(), prev); +for (info = info_list; info; info = info-next) { +MemoryDeviceInfo *value = info-value; + +if (value) { +switch (value-kind) { +case MEMORY_DEVICE_INFO_KIND_DIMM: +size += value-dimm-size; +break; +default: +break; +} +} +} +qapi_free_MemoryDeviceInfoList(info_list); + +return size; +} + static int pc_dimm_slot2bitmap(Object *obj, void *opaque) { unsigned long *bitmap = opaque; diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 427b851..fcc3162 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -52,6 +52,7 @@ typedef uintptr_t ram_addr_t; #endif extern ram_addr_t ram_size; +ram_addr_t get_current_ram_size(void); /* memory API */ diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_pc_dimm_device_list.c index 5cb220c..b584bd8 100644 --- a/stubs/qmp_pc_dimm_device_list.c +++ b/stubs/qmp_pc_dimm_device_list.c @@ -5,3 +5,8 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque) { return 0; } + +ram_addr_t get_current_ram_size(void) +{ +return ram_size; +} -- 1.7.12.4
Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
Am 19.11.2014 um 06:48 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Tuesday 11 November 2014 08:54 AM, David Gibson wrote: [..] So, this may not still be possible depending on whether the KVM side of this is already merged, but it occurs to me that there's a simpler way. Rather than mucking about with having to update the hypervisor on the RTAS location, they have qemu copy the code out of RTAS, patch it and copy it back into the vector, you could instead do this: 1. Make KVM instead of immediately delivering a 0x200 for a guest machine check, cause a special exit to qemu. 2. Have the register-nmi RTAS call store the guest side MC handler address in the spapr structure, but perform no actual guest code patching. 3. Allocate the error log buffer independently from the RTAS blob, so qemu always knows where it is. 4. When qemu gets the MC exit condition, instead of going via a patched 0x200 vector, just directly set the guest register state and jump straight into the guest side MC handler. Before I proceed further I would like to know what others think about the approach proposed above (except for step 3 - as per PAPR the error log buffer should be part of RTAS blob and hence we cannot have error log buffer independent of RTAS blob). Alex, Alexey, Ben: Any thoughts? If in doubt, stick to PAPR please. Alex
Re: [Qemu-devel] Sending packets up to VM using vhost-net User.
Any suggestions here.. Thanks Anshul Makkar On Tue, Nov 18, 2014 at 5:34 PM, Anshul Makkar anshul.mak...@profitbricks.com wrote: Sorry, forgot to mention I am using git clone -b vhost-user-v5 https://github.com/virtualopensystems/qemu.git; for vhost-user backend implementation. and git clone https://github.com/virtualopensystems/vapp.git for reference implementation. Anshul Makkar On Tue, Nov 18, 2014 at 5:29 PM, Anshul Makkar anshul.mak...@profitbricks.com wrote: Hi, I am developing an application that is using vhost-user backend for packet transfer. The architecture: 1) VM1 is using Vhost-user and executing on server1. .qemu-system-x86_64 -m 1024 -mem-path /dev/hugepages,prealloc=on,share=on -drive file=/home/amakkar/test.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native -device virtio-blk-pci,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -vga std -vnc 0.0.0.0:3 -netdev type=vhost-user,id=net0,file=/home/amakkar/qemu.sock -device virtio-net-pci,netdev=net0 2) App1 on server1: executing in user-mode connects with vhost-user backend over qemu.sock. As expected, initialization is done and guest addresses including addresses of descriptor ring , available ring and used ring and mapped to my userspace app and I can directly access them. I launch PACKETH on VM1 and transfer some packets using eth0 on VM1 (packet transfer uses virtio-net backend. ifconfig eth0 shows correct TX stats) In App1 I directly access the avail_ring buffer and consume the packet and then I do RDMA transfer to server 2 . 3) VM2 and App2 executing on server2 and again using VHost-User. App2: Vring initializations are successfully done and vring buffers are mapped. I get the buffer from App1 and now *I want to transfer this buffer (Raw packet) to VM2.* To transfer the buffer from App2 to VM2, I directly access the descriptor ring, place the buffer in it and update the available index and then issue the kick. code snippet for it: dest_buf = (void *)handler-map_handler(handler-context, desc[a_idx].addr); memcpy(dest_buf + hdr_len, buf, size); avail-ring[avail-idx % num] = a_idx; avail-idx++; fprintf(stdout, put_vring, synching memory \n); sync_shm(dest_buf, size); sync_shm((void *)(avail), sizeof(struct vring_avail)); kick(vhost_user-vring_table, rx_idx); But the buffer never reaches to VM2. (I do ifconfig eth0 in VM2 and RX stats are 0) Please can you share if my approach is correct in transferring the packet from App2 to VM. Can I directly place the buffer in descriptor ring and issue a kick to notify virtio-net that a packet is available or you can smell some implementation problem. Thanks Anshul Makkar
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:11:26AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: qemu -M pc-2.0 -L BIOS-2.0.bin And that way for every version and every bios. If they are the same, a symlink does. If they are not, different file. And we would not have this problems. You want to keep old bios around forever, and never fix bugs in it? I disagree, quite strongly. One thing is fix bugs, and a different one is completely change the way the tables/data are generated. About keeping old bios forever, no. Only while the machine types that depend on that BIOS are supported. At the very point that they get not supported, we can drop it. I fully agree that we have problems with BIOS every relase. What we don't agree is _what_ is the best way to fix the issue. You don't seem to want to fix them. Your solution is just to keep bugs around forver. That is unfair. It is the same that if I answer that your solution is just paper over the bugs that we have already foound and hope that there are no more bugs. Do you think that describes your position? Mine is also not described but your statement. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. I am sure it is unsafe. Fixing them requires incompatible changes, that is the whole point :-( I don't see the point, sorry. Want to elaborate? In general, I don't have specific examples: - when we have a string buffer, we sent/receive it with: VMSTATE_BUFFER(queue.data, PS2State), We are sending whatever the size of queue.data is on source to whatever the queue.data is on destination. We are hopping that they are the same. In this case, if sizes are different, we got just a migration stream that is out of sync. And if attacker modifies stream, bad things could happen. In the case of buffers/arrays (so far it appears that everyplace that recives a size from the network, it checks that it is small enough). However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Yes and no. You are assuming that a guest launched with a device ram size of 256KB receives a 512KB section and it knows what to do with it. It knows what to do with the 256KB section, with the 512KB answer is always a perhaps. It depends of what is on the extra space. My solution would be that RAM/ROM sizes are always the same for migration, so destination would always understand it. It just forbids migrating between different machine types. And I think that is good, not bad. Later, Juan. You basically ask firmware to be perfect, or want us to carry old bugs around forever. It makes things simple for migration code, at huge cost elsewhere, and pain for users. I don't want to maintain old bugs in ACPI, and same applies to other firmware. This is really the whole point of the patchset. Keep bugs in device ram around until next reboot. At that point users get new device with non buggy behaviour. False. qemu-2.2 -M pc-2.0 - qemu-2.0 -M pc-2.0 You migrate that way, and you go from new-good BIOS to bad-old BIOS. I think the question here is, when we do this migration: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 what BIOS should the second qemu use? the one that existed on qemu-2.0 time or the one that existed on qemu-2.2 time? You can allow for bugfixes, but not for big changes like moving where the acpi tables were generated, etc, etc. I really think that we should use the BIOS from qemu-2.0 era. And my understanding is that you defend that we should use the qemu-2.2 era BIOS. I think that is the whole point of the discussion. Have I at least framed correctly what the discussion is about? Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:16:57AM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: Since migration doesn't transport configuration, we require a compatibly configured target, and that includes identical memory sizes. RAM size is explicit and the user's problem. ROM size is generally implicit, and we use machine type compatibility machinery to keep it fixed. BIOS changes can break migration only when we screw up or forget the compatibility machinery. Same as for lots of other stuff. No big deal, really, just a consequence of not migrating configuration. You don't get to maintain it, so it's no big deal for you. I see pain every single release and code is becoming spaghetty-like very quickly. We thought it would work. It does not. We do need a solution. And the pain is completely self-inflicted: we already migrate all necessary information! It's just a question of adjusting our datastructures to it. migration from version foo to version bar. You have two options here: - You make source (foo) send the data on the format/sizes that destination (bar) wants. - You make destination (bar) handle whatever source (foo) sends. You need to put the spagueti code in foo or bar. It needs to be in one of the two places, because if that code was not needed, we would not be discussion here, see? So, what we are discussing is where is better to put this code. Emit the code that destination expects, or make destination handle whatever is sent. Amound of mangling need to be basically the same. Later, Juan.
Re: [Qemu-devel] hw/arm/virt: linux,stdout-path - stdout-path
On 19 November 2014 12:08, Leif Lindholm leif.lindh...@linaro.org wrote: As of Linux 3.15, the generic stdout-path property described by ePAPR 1.1 is supported by the upstream kernel: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/of/base.c?id=676e1b2fcd9dbb47a59baac13d089621d22c68b8 ARM virt still sets the legacy linux,stdout-path. Given that this step was added to ARM virt ~ 3 months after 3.15 was released, could we simply replace it (patch below)? Failing that, could we set both for now? / Leif From 25a51745c6243ff279684a3990c8c6aad25ed7b5 Mon Sep 17 00:00:00 2001 From: Leif Lindholm leif.lindh...@linaro.org Date: Wed, 19 Nov 2014 11:02:42 + Subject: [RFC] hw/arm/virt: set stdout-path instead of linux,stdout-path ePAPR 1.1 defines the stdout-path property, making the os-specific linux,stdout-path property redundant. Change the DT setup for ARM virt to use the generic property - supported by Linux since 3.15. Signed-off-by: Leif Lindholm leif.lindh...@linaro.org Acked-by: Ard Biesheuvel ard.biesheu...@linaro.org Note that for the original patch $ git log --oneline f022b8e95379b f022b8e95379 hw/arm/virt: add linux, stdout-path to /chosen DT node $ git tag --contains f022b8e95379b v2.2.0-rc0 v2.2.0-rc1 so it makes sense to take it for 2.2 imo -- Ard. --- hw/arm/virt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 78f618d..314e55b 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -389,7 +389,7 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic) qemu_fdt_setprop(vbi-fdt, nodename, clock-names, clocknames, sizeof(clocknames)); -qemu_fdt_setprop_string(vbi-fdt, /chosen, linux,stdout-path, nodename); +qemu_fdt_setprop_string(vbi-fdt, /chosen, stdout-path, nodename); g_free(nodename); } -- 1.7.10.4
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 18 Nov 2014, at 21:46, Liviu Ionescu i...@livius.net wrote: once you integrate them... thank you for integrating them. so can I base my next patches on them? I'll work on another patch, to fix passing the semihosting command line down to the armv7m code, since currently this triggers exceptions due to null pointers. my suggestion would be to extend the -semihosting-config option with cmdline=... instead of adding a separate option (in my first patch I used -semihosting-cmdline). is this ok? Liviu
Re: [Qemu-devel] [PATCH v2 0/6] Geometry and blocksize support for backing devices
Am 19.11.2014 um 11:17 schrieb Ekaterina Tumanova: Hi folks, I'm sorry for the recent spam. I messed up during code submission last time. So please ignore any previous notes you received from me and answer only to this thread. This is the rework of the geometry+blocksize patch, which was recently discussed here: http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg01148.html Markus suggested that we only detect blocksize and geometry for DASDs. According to this agreement new version contains DASD special casing. The driver methods are implemented only for host_device and inner hdev_xxx functions check if the backing storage is a DASD by means of BIODASDINFO2 ioctl. Original patchset can be found here: http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg03791.html Ekaterina Tumanova (6): geometry: add bdrv functions for geometry and blocksize geometry: Detect blocksize via ioctls in separate static functions geometry: Add driver methods to probe blocksizes and geometry geometry: Add block-backend wrappers for geometry probing geometry: Call backend function to detect geometry and blocksize geometry: Target specific hook for s390x in geometry guessing block.c| 26 + block/block-backend.c | 10 block/raw-posix.c | 123 ++--- block/raw_bsd.c| 12 hw/block/Makefile.objs | 6 +- hw/block/block.c | 11 hw/block/hd-geometry.c | 43 -- hw/block/virtio-blk.c | 1 + include/block/block.h | 20 +++ include/block/block_int.h | 3 + include/hw/block/block.h | 1 + include/sysemu/block-backend.h | 2 + 12 files changed, 234 insertions(+), 24 deletions(-) I can confirm that it makes dasd devices on s390 work (partition detection is fine, so geometry/sector size must be as well) This patch set needs to be fixed for i386, though: /home/cborntra/REPOS/qemu/hw/block/hd-geometry.c: In function 'hd_geometry_guess': /home/cborntra/REPOS/qemu/hw/block/hd-geometry.c:159:5: error: pointer targets in passing argument 2 of 'guess_disk_lchs' differ in signedness [-Werror=pointer-sign] if (guess_disk_lchs(blk, cylinders, heads, secs) 0) { ^ /home/cborntra/REPOS/qemu/hw/block/hd-geometry.c:52:12: note: expected 'uint32_t *' but argument is of type 'int *' static int guess_disk_lchs(BlockBackend *blk, uint32_t *pcylinders,
[Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
On 19/11/2014 12:39, Prof. Dr. Michael Schefczyk wrote: Dear Paolo, thank you very much for your prompt reply. Hi, please keep the replies on the mailing list. I've added the qemu-devel mailing list, where more people should be able to see the message and hopefully reply with something helpful. Also, how do you do the backups? Paolo For example, I have a guest named gatewayb72.img where the backup failed. If I thereafter try to create or delete a snapshot, the following reply occurs on the command line: [root@linuxhost2 kvm02]# qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img qemu-img: Could not open '/kvm02/gatewayb72.img': Could not read snapshots: File too large If I want to reboot that machine, I get the following error: Fehler beim Starten der Domain: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Traceback (most recent call last): File /usr/share/virt-manager/virtManager/asyncjob.py, line 100, in cb_wrapper callback(asyncjob, *args, **kwargs) File /usr/share/virt-manager/virtManager/asyncjob.py, line 122, in tmpcb callback(*args, **kwargs) File /usr/share/virt-manager/virtManager/domain.py, line 1220, in startup self._backend.create() File /usr/lib64/python2.7/site-packages/libvirt.py, line 698, in create if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self) libvirtError: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Based on the facts I can see, the file is not too large. When reading the first error, the file size was 13.8 GB, while the limit is 14.5 GB. The same does also happen with files which are only, for example 6 GB big, while their limit is also 14.5 GB. Therefore, I think that the file too large really stands for something else. Can I provide any additional information? Regards, Michael ___ Technische Universität Dresden Fakultät Wirtschaftswissenschaften Lehrstuhl für Entrepreneurship und Innovation Prof. Dr. Michael Schefczyk D-01062 Dresden Fon: +49-3 51-4 63-3 68 81 Fax: +49-3 51-4 63-3 68 83 www.gruenderlehrstuhl.de Since some months ago, I am facing the problem, that backups fail unpredictably. A failed backup does not generate a backup file and thereafter, snapshots can no longer be created or deleted and the guest cannot be started anymore. The resulting error message is File too large. Who is reporting the File too large error? Can you please include the error in full detail? Paolo
Re: [Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
On 19/11/2014 12:48, Prof. Dr. Michael Schefczyk wrote: Thank you very much. To execute the backup, I run a script. For the machine in question, it looks as follows: #!/bin/bash dt=`date +%y%m%d` qemu-img snapshot -c gatewayb72 /kvm02/gatewayb72.img qemu-img convert -f qcow2 -O qcow2 /kvm02/gatewayb72.img /backup/gatewayb72$dt.img qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img /bin/find /backup/* -mtime +100 -exec rm {} \; Is it done while the VM is running? Paolo
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 19 November 2014 11:37, Liviu Ionescu i...@livius.net wrote: On 18 Nov 2014, at 21:46, Liviu Ionescu i...@livius.net wrote: once you integrate them... thank you for integrating them. so can I base my next patches on them? You can, yes. (Bear in mind that my target-arm.next branch is a rebasing branch, so you can't git merge or pull it; you need to fetch it and then rebase your patchstack on it.) I'll work on another patch, to fix passing the semihosting command line down to the armv7m code, since currently this triggers exceptions due to null pointers. Yes, that should be fixed. my suggestion would be to extend the -semihosting-config option with cmdline=... instead of adding a separate option (in my first patch I used -semihosting-cmdline). On the A and R profile code path this command line is taken from the existing -append option. Is there a reason we can't just make the M profile code do the same thing? thanks -- PMM
Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote: Am 19.11.2014 um 06:48 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Tuesday 11 November 2014 08:54 AM, David Gibson wrote: [..] So, this may not still be possible depending on whether the KVM side of this is already merged, but it occurs to me that there's a simpler way. Rather than mucking about with having to update the hypervisor on the RTAS location, they have qemu copy the code out of RTAS, patch it and copy it back into the vector, you could instead do this: 1. Make KVM instead of immediately delivering a 0x200 for a guest machine check, cause a special exit to qemu. 2. Have the register-nmi RTAS call store the guest side MC handler address in the spapr structure, but perform no actual guest code patching. 3. Allocate the error log buffer independently from the RTAS blob, so qemu always knows where it is. 4. When qemu gets the MC exit condition, instead of going via a patched 0x200 vector, just directly set the guest register state and jump straight into the guest side MC handler. Before I proceed further I would like to know what others think about the approach proposed above (except for step 3 - as per PAPR the error log buffer should be part of RTAS blob and hence we cannot have error log buffer independent of RTAS blob). Alex, Alexey, Ben: Any thoughts? If in doubt, stick to PAPR please. Apart from (3), which was a misunderstanding on my part, this doesn't diverge from PAPR - it's just a question of how we're implementing the PAPR behaviour. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpo3TEoR23sA.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
Am 19.11.2014 um 12:44 schrieb David Gibson da...@gibson.dropbear.id.au: On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote: Am 19.11.2014 um 06:48 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Tuesday 11 November 2014 08:54 AM, David Gibson wrote: [..] So, this may not still be possible depending on whether the KVM side of this is already merged, but it occurs to me that there's a simpler way. Rather than mucking about with having to update the hypervisor on the RTAS location, they have qemu copy the code out of RTAS, patch it and copy it back into the vector, you could instead do this: 1. Make KVM instead of immediately delivering a 0x200 for a guest machine check, cause a special exit to qemu. 2. Have the register-nmi RTAS call store the guest side MC handler address in the spapr structure, but perform no actual guest code patching. 3. Allocate the error log buffer independently from the RTAS blob, so qemu always knows where it is. 4. When qemu gets the MC exit condition, instead of going via a patched 0x200 vector, just directly set the guest register state and jump straight into the guest side MC handler. Before I proceed further I would like to know what others think about the approach proposed above (except for step 3 - as per PAPR the error log buffer should be part of RTAS blob and hence we cannot have error log buffer independent of RTAS blob). Alex, Alexey, Ben: Any thoughts? If in doubt, stick to PAPR please. Apart from (3), which was a misunderstanding on my part, this doesn't diverge from PAPR - it's just a question of how we're implementing the PAPR behaviour. Do we need a guest handler at all? Couldn't we make MCs a new exit type and handle it all straight from QEMU? Alex -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au| minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [BUGFIX][PATCH for 2.2 1/1] hw/ide/core.c: Prevent SIGSEGV during migration
Don Slutz dsl...@verizon.com writes: The other callers to blk_set_enable_write_cache() in this file already check for s-blk == NULL. Signed-off-by: Don Slutz dsl...@verizon.com --- I think this is a bugfix that should be back ported to stable releases. I also think this should be done in xen's copy of QEMU for 4.5 with back port(s) to active stable releases. Note: In 2.1 and earlier the routine is bdrv_set_enable_write_cache(); variable is s-bs. Got a reproducer? I'm asking because I believe s-identify_set implies s-blk. s-identify_set is initialized to zero, and gets set to non-zero exactly on the first successful IDENTIFY DEVICE or IDENTIFY PACKET DEVICE, in ide_identify(), ide_atapi_identify() or ide_cfata_identify(), respectively. Only called via cmd_identify() / cmd_identify_packet() via ide_exec_cmd(). The latter immediately fails when !s-blk: s = idebus_active_if(bus); /* ignore commands to non existent slave */ if (s != bus-ifs !s-blk) { return; } Even if I'm right, your patch is fine, because it makes this spot more obviously correct, and consistent with the other uses of blk_set_enable_write_cache(). The case for stable is weak, though. hw/ide/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ide/core.c b/hw/ide/core.c index 00e21cf..d4af5e2 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -2401,7 +2401,7 @@ static int ide_drive_post_load(void *opaque, int version_id) { IDEState *s = opaque; -if (s-identify_set) { +if (s-blk s-identify_set) { blk_set_enable_write_cache(s-blk, !!(s-identify_data[85] (1 5))); } return 0;
Re: [Qemu-devel] [PATCH v2 4/4] target-tricore: Add instructions of RCR opcode format
On 11/14/2014 01:39 PM, Richard Henderson wrote: On 11/13/2014 06:12 PM, Bastian Koppelmann wrote: +tcg_gen_ext_i32_i64(t3, r3); +tcg_gen_concat_i32_i64(t2, r2_low, r2_high); +/* extend the sign for r2 to high 64 bits */ +tcg_gen_sari_i64(t4, t2, 63); +tcg_gen_ext_i32_i64(t1, r1); + +tcg_gen_muls2_i64(t1, t3, t1, t3); +tcg_gen_add2_i64(t1, t3, t2, t4, t1, t3); + I don't believe that you need 128 bit arithemetic for multiply-accumulate, either here or elsewhere (e.g. msub). Looking at unsigned, the maximum result of the multiply is 2*(2^n-1), or 2^(2n) - 2^(n+1). Which means that the accumulate with a 2^n-1 value cannot overflow a double-word intermediate result. Madd.u has the following signature 64 + (32 * 32) -- 64, as far as I read the documentation, and would result as you described in a max result of 2^(2n) - 2^(n+1) for the multiplication, but it would accumulate with 2^(2n) -1, which can definitly overflow, with n = 32. However for signed multiply accumulate I don't need 128 bit arithmetic, because only the add/sub operation of those two can overflow. Thanks for the tip! Cheers, Bastian
Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
Am 19.11.2014 um 13:22 schrieb Alexander Graf ag...@suse.de: Am 19.11.2014 um 12:44 schrieb David Gibson da...@gibson.dropbear.id.au: On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote: Am 19.11.2014 um 06:48 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Tuesday 11 November 2014 08:54 AM, David Gibson wrote: [..] So, this may not still be possible depending on whether the KVM side of this is already merged, but it occurs to me that there's a simpler way. Rather than mucking about with having to update the hypervisor on the RTAS location, they have qemu copy the code out of RTAS, patch it and copy it back into the vector, you could instead do this: 1. Make KVM instead of immediately delivering a 0x200 for a guest machine check, cause a special exit to qemu. 2. Have the register-nmi RTAS call store the guest side MC handler address in the spapr structure, but perform no actual guest code patching. 3. Allocate the error log buffer independently from the RTAS blob, so qemu always knows where it is. 4. When qemu gets the MC exit condition, instead of going via a patched 0x200 vector, just directly set the guest register state and jump straight into the guest side MC handler. Before I proceed further I would like to know what others think about the approach proposed above (except for step 3 - as per PAPR the error log buffer should be part of RTAS blob and hence we cannot have error log buffer independent of RTAS blob). Alex, Alexey, Ben: Any thoughts? If in doubt, stick to PAPR please. Apart from (3), which was a misunderstanding on my part, this doesn't diverge from PAPR - it's just a question of how we're implementing the PAPR behaviour. Do we need a guest handler at all? Couldn't we make MCs a new exit type and handle it all straight from QEMU? Ah, that was your proposal ;). Sure, works for me. Alex
Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
On Wed, Nov 19, 2014 at 01:22:01PM +0100, Alexander Graf wrote: Am 19.11.2014 um 12:44 schrieb David Gibson da...@gibson.dropbear.id.au: On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote: Am 19.11.2014 um 06:48 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Tuesday 11 November 2014 08:54 AM, David Gibson wrote: [..] So, this may not still be possible depending on whether the KVM side of this is already merged, but it occurs to me that there's a simpler way. Rather than mucking about with having to update the hypervisor on the RTAS location, they have qemu copy the code out of RTAS, patch it and copy it back into the vector, you could instead do this: 1. Make KVM instead of immediately delivering a 0x200 for a guest machine check, cause a special exit to qemu. 2. Have the register-nmi RTAS call store the guest side MC handler address in the spapr structure, but perform no actual guest code patching. 3. Allocate the error log buffer independently from the RTAS blob, so qemu always knows where it is. 4. When qemu gets the MC exit condition, instead of going via a patched 0x200 vector, just directly set the guest register state and jump straight into the guest side MC handler. Before I proceed further I would like to know what others think about the approach proposed above (except for step 3 - as per PAPR the error log buffer should be part of RTAS blob and hence we cannot have error log buffer independent of RTAS blob). Alex, Alexey, Ben: Any thoughts? If in doubt, stick to PAPR please. Apart from (3), which was a misunderstanding on my part, this doesn't diverge from PAPR - it's just a question of how we're implementing the PAPR behaviour. Do we need a guest handler at all? Couldn't we make MCs a new exit type and handle it all straight from QEMU? Well, PAPR allows the OS to register a handler, which existing guests will expect to be able to do. The registered handler expects various information collated for it though, so it isn't a raw 0x200 vector. IIUC, traditionally pHyp implemented this by patching the guests 0x200 vector to collate the necessary information then jump to the supplied handler. I'm suggesting that instead we indeed make a new exit type, have qemu collate the information internally then jump directly back into the guest registered handler. I'm not sure if that's quite what you were suggesting, but I think we have pretty close to the same idea here. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpSQ4cAO4vNS.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 19 Nov 2014, at 14:08, Peter Maydell peter.mayd...@linaro.org wrote: ... (Bear in mind that my target-arm.next branch is a rebasing branch, so you can't git merge or pull it; you need to fetch it and then rebase your patchstack on it.) I currently have two more branches in my git, for each of the patches applied. I planned to make another branch, based on the -semihosting-config branch, and add the cmdline patches there. my suggestion would be to extend the -semihosting-config option with cmdline=... instead of adding a separate option (in my first patch I used -semihosting-cmdline). On the A and R profile code path this command line is taken from the existing -append option. Is there a reason we can't just make the M profile code do the same thing? I already explained, but will do it again. first, we are talking about two completely different use cases. on the A profile you have a kernel, and when you start it, you 'append' to the kernel path some options, and this is what you pass to the bootloader. this has nothing to do with semihosting, the command line is always used by the bootloader to start the kernel. on the M profile we do not have a kernel, but an image (that you do not want to accept as a different thing), and for normal emulation (i.e. without semihosting) we do not have any command line options. now comes semihosting. I do not know how you used semihosting for other platforms, but for arm, the current code breaks with faults, so I have my doubts it was really used (if it was, ok). anyway, the current code uses the same strategy as for the bootloader, i.e. tries to use the kernel full path and concatenate with the 'append' string. on M profiles, the semihosting startup code cannot afford to allocate very large buffers for the command line, as for the A profiles, and usually only some tens of bytes are available; with the full paths as argv[0] this buffer is already not enough, and the semihosting code returns an error, without passing anything back to the application. both the J-Link GDB server and the openOCD GDB server have a method of passing a string as an *entire*, self-contained, command line to be returned by the semihosting SYS_GET_CMDLINE call, including the argv[0], without using anything else like executable path. I consider this strategy to be reasonable for QEMU too, and I would insist on considering it. so, for the A profile, nothing changes, use -kernel with the kernel name and -append with argv[1]...argv[n], as before. for semihosting, regardless of profile, but especially to satisfy the M profile limitations, I suggest an explicit: -semihosting-config target=native,cmdline=myTest 1 2 3 I do not know if it makes any sense to start a linux kernel with semihosting enabled, but if it does, we can make the cmdline= optional, and, if missing and the -kernel and -append are present, we can apply the previous strategy, i.e. compose the long string from the full kernel path and the appended options. if it does not, we simplify things and always use cmdline= (my favourite). regards, Liviu p.s. in my GNU ARM Eclipse branch I'll continue to use -image, and completely get rid of the misleading -kernel -append options; I hope at a certain moment you'll accept qemu is also fit for non linux exclusive use cases. thanks -- PMM
Re: [Qemu-devel] [PATCH v2] tests: Use command -v instead of which(1) in shell scripts
On 11/19/2014 12:07 AM, Fam Zheng wrote: When which(1) is not installed, we would complain perl not found because it's the first set_prog_path check. The error message is wrong. Fix it by using command -v, a native way to query the existence of a command. Suggested-by: Eric Blake ebl...@redhat.com Signed-off-by: Fam Zheng f...@redhat.com +++ b/tests/qemu-iotests/common.config @@ -47,7 +47,7 @@ export PWD=`pwd` # $1 = prog to look for, $2* = default pathnames if not found in $PATH set_prog_path() { -p=`which $1 2 /dev/null` +p=`command -v $1 2 /dev/null` Reviewed-by: Eric Blake ebl...@redhat.com if [ -n $p -a -x $p ]; then Unrelated: this line works because this is a /bin/bash script, but it is non-portable. Use of -a and -o inside [] is a mistake waiting to happen. For example, is [ ! $a -a $b ] supposed to be true or false for all values of $a and $b? Naively, this says return true if '! $a' (a is empty) and '$b' (b is non-empty); but if $a is '(' and $b is ')' it could also be parsed as returning the negation of whether the parenthesized string -a is non-empty. Use of -a and -o in [[]] is a bit better, but I still HIGHLY recommend that constructs like this be rewritten as [ -n $p ] [ -x $p ] for avoidance of confusion and prevention of copy-pasting the test to non-bash shells. But that would be a separate patch. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 19 November 2014 13:06, Liviu Ionescu i...@livius.net wrote: I already explained, but will do it again. Sorry if I missed that; there's a lot of mail on qemu-devel... first, we are talking about two completely different use cases. on the A profile you have a kernel, and when you start it, you 'append' to the kernel path some options, and this is what you pass to the bootloader. this has nothing to do with semihosting, the command line is always used by the bootloader to start the kernel. on the M profile we do not have a kernel, but an image (that you do not want to accept as a different thing), and for normal emulation (i.e. without semihosting) we do not have any command line options. now comes semihosting. I do not know how you used semihosting for other platforms, but for arm, the current code breaks with faults, so I have my doubts it was really used (if it was, ok). It works fine on the A and R profile cores. The problems you are encountering are with M profile in particular. anyway, the current code uses the same strategy as for the bootloader, i.e. tries to use the kernel full path and concatenate with the 'append' string. on M profiles, the semihosting startup code cannot afford to allocate very large buffers for the command line, as for the A profiles, and usually only some tens of bytes are available; with the full paths as argv[0] this buffer is already not enough, and the semihosting code returns an error, without passing anything back to the application. OK, this seems reasonable. You might want to pass an argv[0] which wasn't the name of your binary anyway. so, for the A profile, nothing changes, use -kernel with the kernel name and -append with argv[1]...argv[n], as before. I think we should support this new option in the same way for both A profile and M profile. I do not know if it makes any sense to start a linux kernel with semihosting enabled It does; there are kernel options to use semihosting for debug console output. , but if it does, we can make the cmdline= optional, and, if missing and the -kernel and -append are present, we can apply the previous strategy, i.e. compose the long string from the full kernel path and the appended options. Yes, we should do this. (We have to make cmdline= optional anyway, to support old and previously working command lines.) I should probably also look at getting these new arguments to work for the linux-user qemu binary (which has its own separate option parsing code). p.s. in my GNU ARM Eclipse branch I'll continue to use -image, and completely get rid of the misleading -kernel -append options; I hope at a certain moment you'll accept qemu is also fit for non linux exclusive use cases. I absolutely think we should work with non-linux use cases (and that those are the majority on M profile). It's just that the option for providing a binary image to load happens to be called -kernel regardless of whether the thing you're loading is a Linux kernel or some other binary. That's unfortunate and confusing, but it's not missing functionality, and it's really hard to change our command line options without breaking existing working setups. (And any change absolutely has to be a very carefully considered design which accounts for all architectures QEMU supports, otherwise we're just making difficulties for ourselves in future. I actually would like to see this confusion cleaned up, but it's really a very hard problem to solve properly.) -- PMM
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On Wed, Nov 19, 2014 at 11:45:15AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:11:26AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: qemu -M pc-2.0 -L BIOS-2.0.bin And that way for every version and every bios. If they are the same, a symlink does. If they are not, different file. And we would not have this problems. You want to keep old bios around forever, and never fix bugs in it? I disagree, quite strongly. One thing is fix bugs, and a different one is completely change the way the tables/data are generated. I want the ability to do both without keeping a ton of old code around. About keeping old bios forever, no. Only while the machine types that depend on that BIOS are supported. At the very point that they get not supported, we can drop it. Still too much. This is unsupportable and is not what we did historically. I fully agree that we have problems with BIOS every relase. What we don't agree is _what_ is the best way to fix the issue. You don't seem to want to fix them. Your solution is just to keep bugs around forver. That is unfair. It is the same that if I answer that your solution is just paper over the bugs that we have already foound and hope that there are no more bugs. Do you think that describes your position? Mine is also not described but your statement. Then I don't understand. How do you suggest fixing BIOS bugs without changing BIOS? People have legitimate reasons to run compat machine types, and they also need bugs fixed. We have many devices that just get N from stream, do malloc(N), then read data from stream into it. You think it's unsafe? Go ahead and fix them all. I am sure it is unsafe. Fixing them requires incompatible changes, that is the whole point :-( I don't see the point, sorry. Want to elaborate? In general, I don't have specific examples: - when we have a string buffer, we sent/receive it with: VMSTATE_BUFFER(queue.data, PS2State), We are sending whatever the size of queue.data is on source to whatever the queue.data is on destination. We are hopping that they are the same. In this case, if sizes are different, we got just a migration stream that is out of sync. And if attacker modifies stream, bad things could happen. In the case of buffers/arrays (so far it appears that everyplace that recives a size from the network, it checks that it is small enough). However, my patch does address your concern: callers specify the upper limit on the region size. Trying to migrate in a 1Gbyte region Yes and no. You are assuming that a guest launched with a device ram size of 256KB receives a 512KB section and it knows what to do with it. It knows what to do with the 256KB section, with the 512KB answer is always a perhaps. It depends of what is on the extra space. My solution would be that RAM/ROM sizes are always the same for migration, so destination would always understand it. It just forbids migrating between different machine types. And I think that is good, not bad. Later, Juan. You basically ask firmware to be perfect, or want us to carry old bugs around forever. It makes things simple for migration code, at huge cost elsewhere, and pain for users. I don't want to maintain old bugs in ACPI, and same applies to other firmware. This is really the whole point of the patchset. Keep bugs in device ram around until next reboot. At that point users get new device with non buggy behaviour. False. qemu-2.2 -M pc-2.0 - qemu-2.0 -M pc-2.0 You migrate that way, and you go from new-good BIOS to bad-old BIOS. So? What is the point? I think the question here is, when we do this migration: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 what BIOS should the second qemu use? the one that existed on qemu-2.0 time or the one that existed on qemu-2.2 time? You can allow for bugfixes, but not for big changes like moving where the acpi tables were generated, etc, etc. If you just ship old BIOS, you do not allow for bugfixes. I really think that we should use the BIOS from qemu-2.0 era. And my understanding is that you defend that we should use the qemu-2.2 era BIOS. Not only that. We already do. And we don't intend to change that for 2.2. I think that is the whole point of the discussion. Have I at least framed correctly what the discussion is about? Later, Juan. I think so. Basically you want to freeze all firmware at time of release and never change it for that machine type. Correct? This would be a regression, this is not how we did things in the past. Real hardware lets users update firmware and so should virtual hardware. -- MST
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On Wed, Nov 19, 2014 at 01:52:35PM +0530, Amit Shah wrote: On (Wed) 19 Nov 2014 [10:15:16], Michael S. Tsirkin wrote: On Wed, Nov 19, 2014 at 01:01:14PM +0530, Amit Shah wrote: On (Mon) 17 Nov 2014 [22:08:46], Michael S. Tsirkin wrote: Note: migration stream is unaffected by these patches. This makes it possible to enable this functionality unconditionally, for all machine types. In the future, this might be handy for other things, such as changing kernels loaded on command line across migrations. I think that'll be too risky; unless we do S4 before / after migration to ensure the kernel realises things might be changing beneath its feet. Well - guest never sees the resizing. It happens before we start the VM. So I don't see the issue - could you clarify please? Before we start the VM? That's a really odd corner case (ie not worth bothering about?). I took this to mean that the guest was running while migration happened. Amit At the moment you get old ROM before reboot, new ROM after reboot. Anyway, let's argue about this when someone proposes this. Guys, please drop responding to patch 0. There's no code there. Please review the actual code. -- MST
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On Wed, Nov 19, 2014 at 11:50:28AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:16:57AM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: Since migration doesn't transport configuration, we require a compatibly configured target, and that includes identical memory sizes. RAM size is explicit and the user's problem. ROM size is generally implicit, and we use machine type compatibility machinery to keep it fixed. BIOS changes can break migration only when we screw up or forget the compatibility machinery. Same as for lots of other stuff. No big deal, really, just a consequence of not migrating configuration. You don't get to maintain it, so it's no big deal for you. I see pain every single release and code is becoming spaghetty-like very quickly. We thought it would work. It does not. We do need a solution. And the pain is completely self-inflicted: we already migrate all necessary information! It's just a question of adjusting our datastructures to it. migration from version foo to version bar. You have two options here: - You make source (foo) send the data on the format/sizes that destination (bar) wants. - You make destination (bar) handle whatever source (foo) sends. You need to put the spagueti code in foo or bar. It needs to be in one of the two places, because if that code was not needed, we would not be discussion here, see? So, what we are discussing is where is better to put this code. Emit the code that destination expects, or make destination handle whatever is sent. Amound of mangling need to be basically the same. Later, Juan. This is not what the patch does at all. There is no special-casing depending on machine type anywhere. Please review the code and respond to actual patches. -- MST
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On Wed, Nov 19, 2014 at 09:16:09AM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: On Tue, Nov 18, 2014 at 03:47:16PM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: At the moment we migrate ROMs which reside in fw cfg, which allows changing ROM code at will, and supports migrating largish blocks early, with good performance. However, we are running into a problem: changing size breaks migration every time. Pardon my ignorance... changing ROM in migration? I would expect migration to be as transparent for ROM as it is for RAM. What am I missing? [...] Nothing really. We migrate RAM size too - but if there's a mismatch, migration fails. RAM size is user configurable. ROM is used internally so we have to figure out the size, and it turned out to be a hard problem, so we end up maintaining logic for ROM size like we did in 1.7 like we did in 2.0 etc. I don't want to add any more: let's just accept whatever is migrated, and stick to it. Are you proposing to change ROM size on the target from whatever was configured during startup to whatever is configured on the source? Exactly. ROMs are running within guest, they should just migrate together with VM. *They already do* except for their size. Which kind of mostly does not create problems anyway because we round size up. -- MST
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 14:28, Michael S. Tsirkin wrote: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 what BIOS should the second qemu use? the one that existed on qemu-2.0 time or the one that existed on qemu-2.2 time? You can allow for bugfixes, but not for big changes like moving where the acpi tables were generated, etc, etc. I really think that we should use the BIOS from qemu-2.0 era. And my understanding is that you defend that we should use the qemu-2.2 era BIOS. Not only that. We already do. And we don't intend to change that for 2.2. Am I missing a part of the discussion? When we migrate, the second QEMU uses the BIOS from the first. So: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 uses 2.0 BIOS qemu-2.2 -M pc-2.0 - qemu-2.0 -M pc-2.0 uses 2.2 BIOS Both should work, in general. BIOS is rarely the reason for incompatibilities. However, breakage can happen, for example I know that RHEL7 SeaBIOS does not work on RHEL6. RHEL6 SeaBIOS works on RHEL7, but it needs a couple workarounds. Shipping a separate BIOS for different machine types is unrealistic and pointless. It would also be a good terrain for bug reports, unless you also do things like forbid creating -device megasas-gen2 on 2.1 because it was introduced in 2.2. Remember that libvirt keeps the same machine type for the whole life of a virtual machine definition, even if other parts of the hardware (e.g. disks or NICs) change. Paolo
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 19 Nov 2014, at 15:20, Peter Maydell peter.mayd...@linaro.org wrote: Yes, we should do this. (We have to make cmdline= optional anyway, to support old and previously working command lines.) ok, so I'll add the optional 'cmdline=' to the newly added '-semihosting-config'. since this was not used before, it will not break any compatibility. I should probably also look at getting these new arguments to work for the linux-user qemu binary (which has its own separate option parsing code). sure, you can do it, but -kernel and -append will continue to work, as before. I absolutely think we should work with non-linux use cases ... great! as I said, I offer to catalyse and coordinate development for cortex-m related ones, to use them as part of the GNU ARM Eclipse project. ... and it's really hard to change our command line options without breaking existing working setups. in my patch, the -image was implemented as a kind of special alias, the definition was finally copied to -kernel, so it should not break any working setup. but no problem, I'll keep it local to my branch. regards, Liviu
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:50:28AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:16:57AM +0100, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: Since migration doesn't transport configuration, we require a compatibly configured target, and that includes identical memory sizes. RAM size is explicit and the user's problem. ROM size is generally implicit, and we use machine type compatibility machinery to keep it fixed. BIOS changes can break migration only when we screw up or forget the compatibility machinery. Same as for lots of other stuff. No big deal, really, just a consequence of not migrating configuration. You don't get to maintain it, so it's no big deal for you. I see pain every single release and code is becoming spaghetty-like very quickly. We thought it would work. It does not. We do need a solution. And the pain is completely self-inflicted: we already migrate all necessary information! It's just a question of adjusting our datastructures to it. migration from version foo to version bar. You have two options here: - You make source (foo) send the data on the format/sizes that destination (bar) wants. - You make destination (bar) handle whatever source (foo) sends. You need to put the spagueti code in foo or bar. It needs to be in one of the two places, because if that code was not needed, we would not be discussion here, see? So, what we are discussing is where is better to put this code. Emit the code that destination expects, or make destination handle whatever is sent. Amound of mangling need to be basically the same. Later, Juan. This is not what the patch does at all. There is no special-casing depending on machine type anywhere. Please review the code and respond to actual patches. The code allows increasing of the ram regions if they are bigger on source. Without further explanation. Without knowing _why_ they are bigger on the other side. In general it would not work, even if it works on one particular case. If they are bigger, it is because device code use that for something. not necesarely something that can be ignored. Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 14:49, Juan Quintela wrote: Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? Wait wait wait. I totally cannot follow. What would be the equivalent in QEMU? Paolo
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On 19 November 2014 07:31, Amit Shah amit.s...@redhat.com wrote: On (Mon) 17 Nov 2014 [22:08:46], Michael S. Tsirkin wrote: Considering this promises to rid us of worries about ROM size considerations once and for all, I thinking about pushing this as a kind of bugfix before 2.2, so we don't need to maintain more band-aids in 2.3 and on. I'd rather wait for 2.3; we've done this for a couple of releases already, so what's one more. And we're at rc2 already.. It certainly seems pretty risky to introduce this change with only two weeks to go til release; I wouldn't want to merge it without a strong consensus from everybody involved that it really needed to go in for 2.2. thanks -- PMM
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 01:52:35PM +0530, Amit Shah wrote: On (Wed) 19 Nov 2014 [10:15:16], Michael S. Tsirkin wrote: On Wed, Nov 19, 2014 at 01:01:14PM +0530, Amit Shah wrote: On (Mon) 17 Nov 2014 [22:08:46], Michael S. Tsirkin wrote: Note: migration stream is unaffected by these patches. This makes it possible to enable this functionality unconditionally, for all machine types. In the future, this might be handy for other things, such as changing kernels loaded on command line across migrations. I think that'll be too risky; unless we do S4 before / after migration to ensure the kernel realises things might be changing beneath its feet. Well - guest never sees the resizing. It happens before we start the VM. So I don't see the issue - could you clarify please? Before we start the VM? That's a really odd corner case (ie not worth bothering about?). I took this to mean that the guest was running while migration happened. Amit At the moment you get old ROM before reboot, new ROM after reboot. Anyway, let's argue about this when someone proposes this. Guys, please drop responding to patch 0. There's no code there. Please review the actual code. How can we answer: The code does what it tell it does, we are not happy with the whole idea? Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:45:15AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 11:11:26AM +0100, Juan Quintela wrote: Michael S. Tsirkin m...@redhat.com wrote: On Wed, Nov 19, 2014 at 10:19:22AM +0100, Juan Quintela wrote: qemu -M pc-2.0 -L BIOS-2.0.bin And that way for every version and every bios. If they are the same, a symlink does. If they are not, different file. And we would not have this problems. You want to keep old bios around forever, and never fix bugs in it? I disagree, quite strongly. One thing is fix bugs, and a different one is completely change the way the tables/data are generated. I want the ability to do both without keeping a ton of old code around. You want. About keeping old bios forever, no. Only while the machine types that depend on that BIOS are supported. At the very point that they get not supported, we can drop it. Still too much. This is unsupportable and is not what we did historically. And we have failed spectacularly. If what you do don't work changing what you do? That is unfair. It is the same that if I answer that your solution is just paper over the bugs that we have already foound and hope that there are no more bugs. Do you think that describes your position? Mine is also not described but your statement. Then I don't understand. How do you suggest fixing BIOS bugs without changing BIOS? People have legitimate reasons to run compat machine types, and they also need bugs fixed. if the change in the BIOS is big enough, they need to change also machine type. Is an easy as that. You should not put a big change in BIOS without previous warning to the user. This is independent of migration. Extreme example: You used to have BIOS and now move to UEFI. And don't want to ship the old BIOS? You consider that right? if no, then we are discussing about where is the limit, not if there is a limit. This is really the whole point of the patchset. Keep bugs in device ram around until next reboot. At that point users get new device with non buggy ^ behaviour. ^ False. qemu-2.2 -M pc-2.0 - qemu-2.0 -M pc-2.0 You migrate that way, and you go from new-good BIOS to bad-old BIOS. So? What is the point? You got new BIOS, and you got that they don't get the new non-buggy behaviour. They are running the old BIOS. So, your solution don't fix all problems. I think the question here is, when we do this migration: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 what BIOS should the second qemu use? the one that existed on qemu-2.0 time or the one that existed on qemu-2.2 time? You can allow for bugfixes, but not for big changes like moving where the acpi tables were generated, etc, etc. If you just ship old BIOS, you do not allow for bugfixes. We have qemu-2.0 And now we have qemu-2.0.1 and qemu-2.1 You know that some changes are not valid for qemu-2.0.1, right? Some for BIOS. I really think that we should use the BIOS from qemu-2.0 era. And my understanding is that you defend that we should use the qemu-2.2 era BIOS. Not only that. We already do. And we don't intend to change that for 2.2. I think that is the whole point of the discussion. Have I at least framed correctly what the discussion is about? Later, Juan. I think so. Basically you want to freeze all firmware at time of release and never change it for that machine type. Correct? Bug fixes are ok. Big changes no. Think of what is permisible for 2.0.1 and not. For instance, no new features allowed. This would be a regression, this is not how we did things in the past. Back to square one. We are failing with compatibility in the past. Time to try new approachs? Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? NO. And for good reason. If you are able to upgrade the BIOS while hibernated, would it try to resume or just normal reboot? Same for S3. Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Paolo Bonzini pbonz...@redhat.com wrote: Am I missing a part of the discussion? When we migrate, the second QEMU uses the BIOS from the first. So: qemu-2.0 -M pc-2.0 - qemu-2.2 -M pc-2.0 uses 2.0 BIOS qemu-2.2 -M pc-2.0 - qemu-2.0 -M pc-2.0 uses 2.2 BIOS Both should work, in general. BIOS is rarely the reason for incompatibilities. However, breakage can happen, for example I know that RHEL7 SeaBIOS does not work on RHEL6. RHEL6 SeaBIOS works on RHEL7, but it needs a couple workarounds. Shipping a separate BIOS for different machine types is unrealistic and pointless. It would also be a good terrain for bug reports, unless you also do things like forbid creating -device megasas-gen2 on 2.1 because it was introduced in 2.2. And I agree with that. If it got introduced on 2.2, it should not be allowed on pc-2.1. It just makes things more complicated. We don't have infrastructure to enforce that. And I am claining that is the problem. We are just papering over this problem each time that it happens. I honestely think that the only way to really fix compatibility is enforcing that machine types are stable. right now they are now, and we ended nothing it. Remember that libvirt keeps the same machine type for the whole life of a virtual machine definition, even if other parts of the hardware (e.g. disks or NICs) change. There is a way to upgrade the machine type of a specific machine. If you want to update it, just do it. If you want it is not broking, so not mess with it, it means just that, not changing anything that we can avoid to change. Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 17 November 2014 20:08, Michael S. Tsirkin m...@redhat.com wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - it is never exposed to guest - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size +/* On-device RAM allocated with g_malloc: supports realloc, + * not accessible to vcpu on kvm. + */ +#define RAM_DEVICE (1 2) Does this comment mean KVM guests cannot access this memory, so it's a board bug to attempt to map it into guest address space?. If so, what breaks? Can we have an assert or something to catch usage errors if it is mapped? Would it be possible to drop the restriction? I'm not convinced about the naming either -- isn't this for BIOSes rather than generic on-device scratch RAM (which you'd model either with a plain RAM memoryregion or with a locally allocated block of memory or array, depending on the device semantics)? thanks -- PMM
[Qemu-devel] [PATCH] geometry: fix i386 compilation
Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com --- hw/block/hd-geometry.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/block/hd-geometry.c b/hw/block/hd-geometry.c index b462225..905d2c6 100644 --- a/hw/block/hd-geometry.c +++ b/hw/block/hd-geometry.c @@ -147,7 +147,8 @@ void hd_geometry_guess(BlockBackend *blk, uint32_t *pcyls, uint32_t *pheads, uint32_t *psecs, int *ptrans) { -int cylinders, heads, secs, translation; +uint32_t cylinders, heads, secs; +int translation = BIOS_ATA_TRANSLATION_NONE; struct ProbeGeometry geometry = blk_probe_geometry(blk); if (geometry.rc == 0) { @@ -173,9 +174,6 @@ void hd_geometry_guess(BlockBackend *blk, *pcyls = cylinders; *pheads = heads; *psecs = secs; -/* disable any translation to be in sync with - the logical geometry */ -translation = BIOS_ATA_TRANSLATION_NONE; } done: if (ptrans) { -- 1.8.5.5
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 14:49, Juan Quintela wrote: Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? Wait wait wait. I totally cannot follow. What would be the equivalent in QEMU? qemu-2.0 -M pc-2.0 migrate to disk/s3/s4 upgrade qemu qemu-2.2 -M pc-2.0 try interesting variation of s3/s4/migration to disk. Migration to disk should work (we migrate BIOS ROM blocks, enphasis on ROM), s3 perhaps (machine needs to be saved to disk), s4 . depends how it ends being done. Later, Juan.
Re: [Qemu-devel] qemu 2.2 crash on linux hvm domU (full backtrace included)
Il 14/11/2014 12:25, Fabio Fantoni ha scritto: dom0 xen-unstable from staging git with x86/hvm: Extend HVM cpuid leaf with vcpu id and x86/hvm: Add per-vcpu evtchn upcalls patches, and qemu 2.2 from spice git (spice/next commit e779fa0a715530311e6f59fc8adb0f6eca914a89): https://github.com/Fantu/Xen/commits/rebase/m2r-staging I tried with qemu tag v2.2.0-rc2 and crash still happen, here the full backtrace of latest test: Program received signal SIGSEGV, Segmentation fault. 0x55689b07 in vmport_ioport_read (opaque=0x564443a0, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 73 eax = env-regs[R_EAX]; (gdb) bt full #0 0x55689b07 in vmport_ioport_read (opaque=0x564443a0, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 s = 0x564443a0 cs = 0x0 cpu = 0x0 __func__ = vmport_ioport_read env = 0x8250 command = 0 '\000' eax = 0 #1 0x55655fc4 in memory_region_read_accessor (mr=0x5628, addr=0, value=0x7fffd8d0, size=4, shift=0, mask=4294967295) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:410 tmp = 0 #2 0x556562b7 in access_with_adjusted_size (addr=0, value=0x7fffd8d0, size=4, access_size_min=4, access_size_max=4, access=0x55655f62 memory_region_read_accessor, mr=0x5628) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:480 access_mask = 4294967295 access_size = 4 i = 0 #3 0x556590e9 in memory_region_dispatch_read1 (mr=0x5628, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1077 data = 0 #4 0x556591b1 in memory_region_dispatch_read (mr=0x5628, addr=0, pval=0x7fffd9a8, size=4) ---Type return to continue, or q return to quit--- at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1099 No locals. #5 0x5565cbbc in io_mem_read (mr=0x5628, addr=0, pval=0x7fffd9a8, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1962 No locals. #6 0x5560a1ca in address_space_rw (as=0x55eaf920, addr=22104, buf=0x7fffda50 \377\377\377\377, len=4, is_write=false) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2167 l = 4 ptr = 0x55a92d87 %s/%d:\n val = 7852232130387826944 addr1 = 0 mr = 0x5628 error = false #7 0x5560a38f in address_space_read (as=0x55eaf920, addr=22104, buf=0x7fffda50 \377\377\377\377, len=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2205 No locals. #8 0x5564fd4b in cpu_inl (addr=22104) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/ioport.c:117 buf = \377\377\377\377 val = 21845 #9 0x55670c73 in do_inp (addr=22104, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:684 ---Type return to continue, or q return to quit--- No locals. #10 0x55670ee0 in cpu_ioreq_pio (req=0x77ff3020) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:747 i = 1 #11 0x556714b3 in handle_ioreq (state=0x563c2510, req=0x77ff3020) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:853 No locals. #12 0x55671826 in cpu_handle_ioreq (opaque=0x563c2510) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:931 state = 0x563c2510 req = 0x77ff3020 #13 0x5596e240 in qemu_iohandler_poll (pollfds=0x56389a30, ret=1) at iohandler.c:143 revents = 1 pioh = 0x563f7610 ioh = 0x56450a40 #14 0x5596de1c in main_loop_wait (nonblocking=0) at main-loop.c:495 ret = 1 timeout = 4294967295 timeout_ns = 3965432 #15 0x55756d3f in main_loop () at vl.c:1882 nonblocking = false last_io = 0 #16 0x5575ea49 in main (argc=62, argv=0x7fffe048, envp=0x7fffe240) at vl.c:4400 ---Type return to continue, or q return to quit--- i = 128 snapshot = 0 linux_boot = 0 initrd_filename = 0x0 kernel_filename = 0x0 kernel_cmdline = 0x55a48f86 boot_order = 0x56387460 dc ds = 0x564b2040 cyls = 0 heads = 0 secs = 0 translation = 0 hda_opts = 0x0 opts = 0x563873b0 machine_opts = 0x56389010 icount_opts = 0x0 olist = 0x55e57e80 optind = 62 optarg = 0x7fffe914 file=/mnt/vm/disks/FEDORA19.disk1.xm,if=ide,index=0,media=disk,format=raw,cache=writeback loadvm = 0x0 machine_class = 0x5637d5c0 cpu_model = 0x0 vga_model = 0x0 qtest_chrdev = 0x0 ---Type return to continue, or q return to quit--- qtest_log = 0x0 pid_file = 0x0 incoming = 0x0 show_vnc_port = 0 defconfig = true userconfig = true log_mask = 0x0 log_file = 0x0
Re: [Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
On 19.11.2014 12:41, Paolo Bonzini wrote: On 19/11/2014 12:39, Prof. Dr. Michael Schefczyk wrote: Dear Paolo, thank you very much for your prompt reply. Hi, please keep the replies on the mailing list. I've added the qemu-devel mailing list, where more people should be able to see the message and hopefully reply with something helpful. Also, how do you do the backups? Paolo For example, I have a guest named gatewayb72.img where the backup failed. If I thereafter try to create or delete a snapshot, the following reply occurs on the command line: [root@linuxhost2 kvm02]# qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img qemu-img: Could not open '/kvm02/gatewayb72.img': Could not read snapshots: File too large If I want to reboot that machine, I get the following error: Fehler beim Starten der Domain: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Traceback (most recent call last): File /usr/share/virt-manager/virtManager/asyncjob.py, line 100, in cb_wrapper callback(asyncjob, *args, **kwargs) File /usr/share/virt-manager/virtManager/asyncjob.py, line 122, in tmpcb callback(*args, **kwargs) File /usr/share/virt-manager/virtManager/domain.py, line 1220, in startup self._backend.create() File /usr/lib64/python2.7/site-packages/libvirt.py, line 698, in create if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self) libvirtError: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Based on the facts I can see, the file is not too large. When reading the first error, the file size was 13.8 GB, while the limit is 14.5 GB. The same does also happen with files which are only, for example 6 GB big, while their limit is also 14.5 GB. Therefore, I think that the file too large really stands for something else. It most probably does, yes. Some months ago, we included a limit in qemu's qcow2 implementation for the (internal) snapshot table size. When this limit is hit, File too large will be printed (it's a pretty bad error message, I know, but we thought it never to occur for reasonable images, so we did not see the need to improve it). That limit is 64 MB, which comes from having 1 kB of metadata per snapshot in average and a maximum of 64k snapshots. 1 kB per snapshot is reasonable: Normally, you have 56 bytes plus the length of the snapshot's name plus the length of the snapshot's ID (plus a couple of bytes of padding). We have indeed seen some bug reports regarding this limit; however, nobody could provide us with an image to look into this issue so far (as far as I'm aware), and we could not reproduce it ourselves. I understand that providing a 13.8 GB image is a bit much, but maybe you are able to do it anyway (the TU Dresden does have a reasonable internet connection, as far as I'm aware) or, if possible, provide us with information on how to recreate a similar image ourselves. Max
Re: [Qemu-devel] [PATCH v2] tests: Use command -v instead of which(1) in shell scripts
On 19 November 2014 13:19, Eric Blake ebl...@redhat.com wrote: Use of -a and -o in [[]] is a bit better, but I still HIGHLY recommend that constructs like this be rewritten as [ -n $p ] [ -x $p ] for avoidance of confusion and prevention of copy-pasting the test to non-bash shells. But that would be a separate patch. Yeah, this is one of those issues I tend to pick up in patches which change our shell scripts, but we still have a fair amount of existing code that isn't up to that standard. -- PMM
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Peter Maydell peter.mayd...@linaro.org wrote: On 17 November 2014 20:08, Michael S. Tsirkin m...@redhat.com wrote: Add API to manage on-device RAM. This looks just like regular RAM from migration POV, but has two special properties internally: - it is never exposed to guest - block is sized on migration, making it easier to extend without breaking migration compatibility or wasting virtual memory - callers must specify an upper bound on size +/* On-device RAM allocated with g_malloc: supports realloc, + * not accessible to vcpu on kvm. + */ +#define RAM_DEVICE (1 2) Does this comment mean KVM guests cannot access this memory, so it's a board bug to attempt to map it into guest address space?. If so, what breaks? Can we have an assert or something to catch usage errors if it is mapped? Would it be possible to drop the restriction? I'm not convinced about the naming either -- isn't this for BIOSes rather than generic on-device scratch RAM (which you'd model either with a plain RAM memoryregion or with a locally allocated block of memory or array, depending on the device semantics)? My understanding is that it is a trick. We have internal memory for a device that is needed for the emulation, but not showed to the guest. And it is big enough that we want to save it during the live stage of migration, so we mark it as RAM. if it is somekind of cash, we can just enlarge it on destination, and it don't matter. If this has anything different on the other part of the RAM, we are on trouble. Have I understood it correctly? If so, it appears that all the cases (or the ones that mst cares about) don't care about this size. Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19 November 2014 14:07, Juan Quintela quint...@redhat.com wrote: My understanding is that it is a trick. We have internal memory for a device that is needed for the emulation, but not showed to the guest. And it is big enough that we want to save it during the live stage of migration, so we mark it as RAM. if it is somekind of cash, we can just enlarge it on destination, and it don't matter. If this has anything different on the other part of the RAM, we are on trouble. Would it be feasible to just have the migration code provide an API for registering things to be migrated in the live migration stage, rather than creating memory regions which you can't actually use for most of the purposes the memory region API exists for? -- PMM
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:03, Juan Quintela wrote: Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 14:49, Juan Quintela wrote: Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? Wait wait wait. I totally cannot follow. What would be the equivalent in QEMU? qemu-2.0 -M pc-2.0 migrate to disk/s3/s4 upgrade qemu qemu-2.2 -M pc-2.0 try interesting variation of s3/s4/migration to disk. Migration to disk should work (we migrate BIOS ROM blocks, enphasis on ROM), s3 perhaps (machine needs to be saved to disk), s4 . depends how it ends being done. Ok, got it. S3 + migrate to disk should work. S4 probably would work, but I think it would work on a real system too as long as you update software and not hardware (e.g. changing the motherboard would change the MAC address of the on-board NIC, for example). Consider the similar case on real hardware: boot update microcode RPM s4 turn on CPU microcode is installed early by the kernel, before looking for a hibernation image to resume from, so the CPU microcode after resume from S4 is different from the microcode at the time you suspended to disk. This probably would work. Paolo
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Since we've wondered off the actual ACPI table stuff into general ROM sizing, I'd like to propose some concrete fixes: 1) We explicitly name the bios file in a .romfile attribute for all ROMs. 2) The code that uses .romfile has an expansion for $MACHINETYPE 3) We actually symlink all of those together, anyone who wants/has to deal with different versions can downstream. 4) The machine types contain size attributes for the ROMs that are generoously larger than the ROMs anyone currently uses. I think 1..3 should deal with those of us who have to deal with different ROM versions on different machine types. 4 might be good enough for the ACPI tables if you can bound it. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
* Paolo Bonzini (pbonz...@redhat.com) wrote: On 19/11/2014 15:03, Juan Quintela wrote: Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 14:49, Juan Quintela wrote: Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? Wait wait wait. I totally cannot follow. What would be the equivalent in QEMU? qemu-2.0 -M pc-2.0 migrate to disk/s3/s4 upgrade qemu qemu-2.2 -M pc-2.0 try interesting variation of s3/s4/migration to disk. Migration to disk should work (we migrate BIOS ROM blocks, enphasis on ROM), s3 perhaps (machine needs to be saved to disk), s4 . depends how it ends being done. Ok, got it. S3 + migrate to disk should work. S4 probably would work, but I think it would work on a real system too as long as you update software and not hardware (e.g. changing the motherboard would change the MAC address of the on-board NIC, for example). Consider the similar case on real hardware: boot update microcode RPM s4 turn on CPU microcode is installed early by the kernel, before looking for a hibernation image to resume from, so the CPU microcode after resume from S4 is different from the microcode at the time you suspended to disk. This probably would work. You mean, unless for example, someone had disabled a CPU feature in the new microcode? Dave Paolo -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Peter Maydell peter.mayd...@linaro.org wrote: On 19 November 2014 14:07, Juan Quintela quint...@redhat.com wrote: My understanding is that it is a trick. We have internal memory for a device that is needed for the emulation, but not showed to the guest. And it is big enough that we want to save it during the live stage of migration, so we mark it as RAM. if it is somekind of cash, we can just enlarge it on destination, and it don't matter. If this has anything different on the other part of the RAM, we are on trouble. Would it be feasible to just have the migration code provide an API for registering things to be migrated in the live migration stage, rather than creating memory regions which you can't actually use for most of the purposes the memory region API exists for? If somebody told me what they need, we can do it. Stefan, you needed something like that for data-plane? Or that memory is mapped on the guest? Later, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:10, Peter Maydell wrote: On 19 November 2014 14:07, Juan Quintela quint...@redhat.com wrote: My understanding is that it is a trick. We have internal memory for a device that is needed for the emulation, but not showed to the guest. And it is big enough that we want to save it during the live stage of migration, so we mark it as RAM. if it is somekind of cash, we can just enlarge it on destination, and it don't matter. If this has anything different on the other part of the RAM, we are on trouble. Would it be feasible to just have the migration code provide an API for registering things to be migrated in the live migration stage, rather than creating memory regions which you can't actually use for most of the purposes the memory region API exists for? It does already, for example PPC uses it for its IOMMU tables. But in any case this is really just memory that is auto-resized on migration. And it can work even if it is mapped in memory, as long as your resize callback (or some post_load code executed while migrating devices) does something useful to update the memory map. Let's call it like what it is. Basically it is an I know how to deal with the source's configuration whatever it is, so I'm not bothering to reconstruct it equivalently on the destination flag. The fine print is that it will create more differences between a given machine type on different versions of QEMU. It can have ripple effects on the memory map and it can make existing VMs a bit more likely to break when updating QEMU. This is why I do not particularly like it, and I posted different patches to do the same thing. I understand that it is a simple fix that will mostly work and probably avoids work more than causing it. And BTW, those patches are still unreviewed,. Juan, do as you preach---review patches that are closer to what you preach. And Michael, you too---review patches in addition to asking people to review yours, so that we can compare the approaches. Paolo
[Qemu-devel] [PATCH for-2.3 1/4] blockdev: acquire AioContext in blockdev-snapshot-delete-internal-sync
Add dataplane support to the blockdev-snapshot-delete-internal-sync QMP command. By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 16 +--- hw/block/dataplane/virtio-blk.c | 2 ++ 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/blockdev.c b/blockdev.c index 57910b8..fb9a005 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1105,6 +1105,7 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, Error **errp) { BlockDriverState *bs = bdrv_find(device); +AioContext *aio_context; QEMUSnapshotInfo sn; Error *local_err = NULL; SnapshotInfo *info = NULL; @@ -1128,25 +1129,30 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, return NULL; } +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); + ret = bdrv_snapshot_find_by_id_and_name(bs, id, name, sn, local_err); if (local_err) { error_propagate(errp, local_err); -return NULL; +goto out_aio_context; } if (!ret) { error_setg(errp, Snapshot with id '%s' and name '%s' does not exist on device '%s', STR_OR_NULL(id), STR_OR_NULL(name), device); -return NULL; +goto out_aio_context; } bdrv_snapshot_delete(bs, id, name, local_err); if (local_err) { error_propagate(errp, local_err); -return NULL; +goto out_aio_context; } +aio_context_release(aio_context); + info = g_new0(SnapshotInfo, 1); info-id = g_strdup(sn.id_str); info-name = g_strdup(sn.name); @@ -1157,6 +1163,10 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, info-vm_clock_sec = sn.vm_clock_nsec / 10; return info; + +out_aio_context: +aio_context_release(aio_context); +return NULL; } /* New and old BlockDriverState structs for group snapshots */ diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 1222a37..5ab64c5 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -198,6 +198,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_DRIVE_DEL, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_BACKUP_SOURCE, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_COMMIT, s-blocker); +blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE, + s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_MIRROR, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_STREAM, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_REPLACE, s-blocker); -- 2.1.0
[Qemu-devel] [PATCH for-2.3 0/4] blockdev: support dataplane in remaining QMP commands
This patch series adds virtio-blk dataplane support for the following QMP commands: * eject * change * change-backing-file * block_passwd * blockdev-snapshot-delete-internal-sync This requires acquiring and releasing the BlockDriverState's AioContext so that the IOThread does not run while the monitor command is executing. Monitor commands that use AioContext can be unblocked in the virtio-blk dataplane op blockers list. After this series only the transaction QMP command remains to be converted. Stefan Hajnoczi (4): blockdev: acquire AioContext in blockdev-snapshot-delete-internal-sync blockdev: check for BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE blockdev: acquire AioContext in eject, change, and block_passwd blockdev: acquire AioContext in change-backing-file blockdev.c | 75 - hw/block/dataplane/virtio-blk.c | 4 +++ 2 files changed, 63 insertions(+), 16 deletions(-) -- 2.1.0
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 15:03, Juan Quintela wrote: Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 14:49, Juan Quintela wrote: Real hardware lets users update firmware and so should virtual hardware. But you can hibernate your laptop, update the firmware, and reboot? Where the change can be anyting, like moving from traditional BIOS to UEFI? Wait wait wait. I totally cannot follow. What would be the equivalent in QEMU? qemu-2.0 -M pc-2.0 migrate to disk/s3/s4 upgrade qemu qemu-2.2 -M pc-2.0 try interesting variation of s3/s4/migration to disk. Migration to disk should work (we migrate BIOS ROM blocks, enphasis on ROM), s3 perhaps (machine needs to be saved to disk), s4 . depends how it ends being done. Ok, got it. S3 + migrate to disk should work. S4 probably would work, but I think it would work on a real system too as long as you update software and not hardware (e.g. changing the motherboard would change the MAC address of the on-board NIC, for example). Consider the similar case on real hardware: boot update microcode RPM s4 turn on CPU microcode is installed early by the kernel, before looking for a hibernation image to resume from, so the CPU microcode after resume from S4 is different from the microcode at the time you suspended to disk. This probably would work. I am not an expert of cpu microcode, but I would assume that changes there tend to be minimal, no? And anyways, I wouldn't expect to introduce/remove features like NX (i.e. visible by the guest) on a microcode update? Later, Juan. Paolo
[Qemu-devel] [PATCH for-2.3 4/4] blockdev: acquire AioContext in change-backing-file
Add dataplane support to the change-backing-file QMP commands. By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Note that this command operates on both bs and a node in its chain (image_bs). The bdrv_chain_contains(bs, image_bs) check guarantees that bs and image_bs are in the same AioContext. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 19 +-- hw/block/dataplane/virtio-blk.c | 1 + 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/blockdev.c b/blockdev.c index 7bf88d4..a52f205 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2584,6 +2584,7 @@ void qmp_change_backing_file(const char *device, Error **errp) { BlockDriverState *bs = NULL; +AioContext *aio_context; BlockDriverState *image_bs = NULL; Error *local_err = NULL; bool ro; @@ -2597,34 +2598,37 @@ void qmp_change_backing_file(const char *device, return; } +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); + image_bs = bdrv_lookup_bs(NULL, image_node_name, local_err); if (local_err) { error_propagate(errp, local_err); -return; +goto out; } if (!image_bs) { error_setg(errp, image file not found); -return; +goto out; } if (bdrv_find_base(image_bs) == image_bs) { error_setg(errp, not allowing backing file change on an image without a backing file); -return; +goto out; } /* even though we are not necessarily operating on bs, we need it to * determine if block ops are currently prohibited on the chain */ if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_CHANGE, errp)) { -return; +goto out; } /* final sanity check */ if (!bdrv_chain_contains(bs, image_bs)) { error_setg(errp, '%s' and image file are not in the same chain, device); -return; +goto out; } /* if not r/w, reopen to make r/w */ @@ -2635,7 +2639,7 @@ void qmp_change_backing_file(const char *device, bdrv_reopen(image_bs, open_flags | BDRV_O_RDWR, local_err); if (local_err) { error_propagate(errp, local_err); -return; +goto out; } } @@ -2655,6 +2659,9 @@ void qmp_change_backing_file(const char *device, error_propagate(errp, local_err); /* will preserve prior errp */ } } + +out: +aio_context_release(aio_context); } void qmp_blockdev_add(BlockdevOptions *options, Error **errp) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 2548113..90ab27e 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -197,6 +197,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_RESIZE, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_DRIVE_DEL, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_BACKUP_SOURCE, s-blocker); +blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_CHANGE, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_COMMIT, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_EJECT, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE, -- 2.1.0
[Qemu-devel] [PATCH for-2.3 3/4] blockdev: acquire AioContext in eject, change, and block_passwd
By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Fix up eject, change, and block_passwd in a single patch because qmp_eject() and qmp_change_blockdev() both call eject_device(). Also fix block_passwd while we're tackling a command that takes a block encryption password. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 36 +--- hw/block/dataplane/virtio-blk.c | 1 + 2 files changed, 30 insertions(+), 7 deletions(-) diff --git a/blockdev.c b/blockdev.c index a7f1e09..7bf88d4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1617,14 +1617,18 @@ exit: static void eject_device(BlockBackend *blk, int force, Error **errp) { BlockDriverState *bs = blk_bs(blk); +AioContext *aio_context; + +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) { -return; +goto out; } if (!blk_dev_has_removable_media(blk)) { error_setg(errp, Device '%s' is not removable, bdrv_get_device_name(bs)); -return; +goto out; } if (blk_dev_is_medium_locked(blk) !blk_dev_is_tray_open(blk)) { @@ -1632,11 +1636,14 @@ static void eject_device(BlockBackend *blk, int force, Error **errp) if (!force) { error_setg(errp, Device '%s' is locked, bdrv_get_device_name(bs)); -return; +goto out; } } bdrv_close(bs); + +out: +aio_context_release(aio_context); } void qmp_eject(const char *device, bool has_force, bool force, Error **errp) @@ -1658,6 +1665,7 @@ void qmp_block_passwd(bool has_device, const char *device, { Error *local_err = NULL; BlockDriverState *bs; +AioContext *aio_context; int err; bs = bdrv_lookup_bs(has_device ? device : NULL, @@ -1668,16 +1676,23 @@ void qmp_block_passwd(bool has_device, const char *device, return; } +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); + err = bdrv_set_key(bs, password); if (err == -EINVAL) { error_set(errp, QERR_DEVICE_NOT_ENCRYPTED, bdrv_get_device_name(bs)); -return; +goto out; } else if (err 0) { error_set(errp, QERR_INVALID_PASSWORD); -return; +goto out; } + +out: +aio_context_release(aio_context); } +/* Assumes AioContext is held */ static void qmp_bdrv_open_encrypted(BlockDriverState *bs, const char *filename, int bdrv_flags, BlockDriver *drv, const char *password, Error **errp) @@ -1710,6 +1725,7 @@ void qmp_change_blockdev(const char *device, const char *filename, { BlockBackend *blk; BlockDriverState *bs; +AioContext *aio_context; BlockDriver *drv = NULL; int bdrv_flags; Error *err = NULL; @@ -1721,24 +1737,30 @@ void qmp_change_blockdev(const char *device, const char *filename, } bs = blk_bs(blk); +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); + if (format) { drv = bdrv_find_whitelisted_format(format, bs-read_only); if (!drv) { error_set(errp, QERR_INVALID_BLOCK_FORMAT, format); -return; +goto out; } } eject_device(blk, 0, err); if (err) { error_propagate(errp, err); -return; +goto out; } bdrv_flags = bdrv_is_read_only(bs) ? 0 : BDRV_O_RDWR; bdrv_flags |= bdrv_is_snapshot(bs) ? BDRV_O_SNAPSHOT : 0; qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, errp); + +out: +aio_context_release(aio_context); } /* throttling disk I/O limits */ diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 5ab64c5..2548113 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -198,6 +198,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_DRIVE_DEL, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_BACKUP_SOURCE, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_COMMIT, s-blocker); +blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_EJECT, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE, s-blocker); blk_op_unblock(conf-conf.blk, BLOCK_OP_TYPE_MIRROR, s-blocker); -- 2.1.0
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 14:57, Juan Quintela wrote: Shipping a separate BIOS for different machine types is unrealistic and pointless. It would also be a good terrain for bug reports, unless you also do things like forbid creating -device megasas-gen2 on 2.1 because it was introduced in 2.2. And I agree with that. If it got introduced on 2.2, it should not be allowed on pc-2.1. It just makes things more complicated. We don't have infrastructure to enforce that. And I am claining that is the problem. We are just papering over this problem each time that it happens. I honestely think that the only way to really fix compatibility is enforcing that machine types are stable. right now they are now, and we ended nothing it. Weird, I have bought this USB device last month and I plugged it into a two-year-old laptop. QEMU version = when did I last update firmware / buy hardware Machine type = when did I buy the computer I honestly think that you are talking out of design dogma, without really thinking through the consequences of the design. Paolo
[Qemu-devel] [PATCH for-2.3 2/4] blockdev: check for BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE
The BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE op blocker exists but was never used! Let's fix that so snapshot delete can be blocked. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 4 1 file changed, 4 insertions(+) diff --git a/blockdev.c b/blockdev.c index fb9a005..a7f1e09 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1132,6 +1132,10 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, aio_context = bdrv_get_aio_context(bs); aio_context_acquire(aio_context); +if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE, errp)) { +goto out_aio_context; +} + ret = bdrv_snapshot_find_by_id_and_name(bs, id, name, sn, local_err); if (local_err) { error_propagate(errp, local_err); -- 2.1.0
[Qemu-devel] [PATCH] target-mips: Tighten ISA level checks
Tighten ISA level checks down to MIPS II that many of our instructions are missing. Also make sure any 64-bit instruction enables are only applied to 64-bit processors, that is ones that implement at least the MIPS III ISA. Signed-off-by: Maciej W. Rozycki ma...@codesourcery.com --- Hi, As usually with changes that require manual code review I may have missed something here, but I believe I did not. I went thoroughly through all the code and checked all the instructions. Of course that was before MIPSr6, but I do hope for the latter the right checks have been implemented from the beginning. I rigorously tested this change by running full GCC, G++ and glibc mips-linux-gnu toolchain test suites under Linux (Debian Wheezy) run in QEMU in the system emulation mode, for the following multilibs: -EB -EB -mips16 -EB -mmicromips -EB -mabi=n32 -EB -mabi=64 and the -EL variants of same. Of these standard MIPS o32 testing was run on a 24Kf processor and n64/n32 testing was run on a 5KEf processor, using a 32-bit and a 64-bit kernel respectively. MIPS16 o32 testing was run on an artificial 5KEf-mips16 processor -- like a real 5KEf one, but with the MIPS16 ASE enabled, and a 64-bit kernel. Finally microMIPS o32 testing was run on an artificial 24Kf-micromips processor -- like a real 24Kf one, but with the microMIPS ASE enabled, and a 32-bit kernel built as microMIPS code itself. That should have executed enough paths in the emulator, both for user instructions and privileged (CP0) instructions, as both the user side and the kernel side were tested at the same time. The original test result counts were as follows: Result Count ERROR 20 FAIL 300 PASS 1732076 XPASS 335 XFAIL 6527 UNRESOLVED 26 UNSUPPORTED 50854 Total 1790138 After the change they were: Result Count ERROR 20 FAIL 303 PASS 1732073 XPASS 336 XFAIL 6526 UNRESOLVED 26 UNSUPPORTED 50854 Total 1790138 The changes in results were as follows: PASS - FAIL: qemu-64/glibc.sum:malloc/tst-mallocfork PASS - FAIL: qemu-mips16/glibc.sum:nptl/tst-setuid3 PASS - FAIL: qemu-n32-el/glibc.sum:malloc/tst-malloc-usable PASS - FAIL: qemu-n32/glibc.sum: nptl/tst-cancel17 FAIL - PASS: qemu-micromips/glibc.sum: nptl/tst-setuid3 XFAIL - XPASS: qemu-micromips/glibc.sum: nptl/tst-cancel7 -- which I qualify as intermittent failures that unfortunately happen all the time, also on real hardware, so they do not appear to be issues arising from this change to QEMU. I believe this satisfies quality verification requirements for such an extensive change. Please apply. Maciej qemu-mips-isa-check.diff Index: qemu-git-trunk/target-mips/cpu.h === --- qemu-git-trunk.orig/target-mips/cpu.h 2014-11-12 07:38:07.577674675 + +++ qemu-git-trunk/target-mips/cpu.h2014-11-12 07:38:24.077713983 + @@ -831,9 +831,10 @@ static inline void compute_hflags(CPUMIP env-hflags |= (env-CP0_Status CP0St_KSU) MIPS_HFLAG_KSU; } #if defined(TARGET_MIPS64) -if (((env-hflags MIPS_HFLAG_KSU) != MIPS_HFLAG_UM) || -(env-CP0_Status (1 CP0St_PX)) || -(env-CP0_Status (1 CP0St_UX))) { +if ((env-insn_flags ISA_MIPS3) +(((env-hflags MIPS_HFLAG_KSU) != MIPS_HFLAG_UM) || + (env-CP0_Status (1 CP0St_PX)) || + (env-CP0_Status (1 CP0St_UX { env-hflags |= MIPS_HFLAG_64; } Index: qemu-git-trunk/target-mips/helper.c === --- qemu-git-trunk.orig/target-mips/helper.c2014-11-12 07:33:08.548361020 + +++ qemu-git-trunk/target-mips/helper.c 2014-11-12 07:38:24.077713983 + @@ -527,7 +527,9 @@ void mips_cpu_do_interrupt(CPUState *cs) env-CP0_DEPC = exception_resume_pc(env); env-hflags = ~MIPS_HFLAG_BMASK; enter_debug_mode: -env-hflags |= MIPS_HFLAG_DM | MIPS_HFLAG_64 | MIPS_HFLAG_CP0; +if (env-insn_flags ISA_MIPS3) +env-hflags |= MIPS_HFLAG_64; +env-hflags |= MIPS_HFLAG_DM | MIPS_HFLAG_CP0; env-hflags = ~(MIPS_HFLAG_KSU); /* EJTAG probe trap enable is not implemented... */ if (!(env-CP0_Status (1 CP0St_EXL))) @@ -548,7 +550,9 @@ void mips_cpu_do_interrupt(CPUState *cs) env-CP0_ErrorEPC = exception_resume_pc(env); env-hflags = ~MIPS_HFLAG_BMASK; env-CP0_Status |= (1 CP0St_ERL) | (1 CP0St_BEV); -env-hflags |= MIPS_HFLAG_64 | MIPS_HFLAG_CP0; +if (env-insn_flags ISA_MIPS3) +env-hflags |= MIPS_HFLAG_64; +env-hflags |= MIPS_HFLAG_CP0; env-hflags = ~(MIPS_HFLAG_KSU); if (!(env-CP0_Status (1 CP0St_EXL))) env-CP0_Cause = ~(1U CP0Ca_BD); @@ -726,7 +730,9 @@
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
Dr. David Alan Gilbert dgilb...@redhat.com wrote: Since we've wondered off the actual ACPI table stuff into general ROM sizing, I'd like to propose some concrete fixes: 1) We explicitly name the bios file in a .romfile attribute for all ROMs. 2) The code that uses .romfile has an expansion for $MACHINETYPE 3) We actually symlink all of those together, anyone who wants/has to deal with different versions can downstream. 4) The machine types contain size attributes for the ROMs that are generoously larger than the ROMs anyone currently uses. I think 1..3 should deal with those of us who have to deal with different ROM versions on different machine types. 4 might be good enough for the ACPI tables if you can bound it. Dave I would agree with something like that. Thanks, Juan.
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19 November 2014 14:19, Paolo Bonzini pbonz...@redhat.com wrote: On 19/11/2014 15:10, Peter Maydell wrote: Would it be feasible to just have the migration code provide an API for registering things to be migrated in the live migration stage, rather than creating memory regions which you can't actually use for most of the purposes the memory region API exists for? It does already, for example PPC uses it for its IOMMU tables. But in any case this is really just memory that is auto-resized on migration. And it can work even if it is mapped in memory, as long as your resize callback (or some post_load code executed while migrating devices) does something useful to update the memory map. Let's call it like what it is. ...so why all the this isn't going to work in KVM warnings in the patchset? thanks -- PMM
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:13, Dr. David Alan Gilbert wrote: Since we've wondered off the actual ACPI table stuff into general ROM sizing, I'd like to propose some concrete fixes: 1) We explicitly name the bios file in a .romfile attribute for all ROMs. 2) The code that uses .romfile has an expansion for $MACHINETYPE 3) We actually symlink all of those together, anyone who wants/has to deal with different versions can downstream. 4) The machine types contain size attributes for the ROMs that are generoously larger than the ROMs anyone currently uses. I think 1..3 should deal with those of us who have to deal with different ROM versions on different machine types. It should, but it's a solution in search of a problem. 4 might be good enough for the ACPI tables if you can bound it. Already doing that (rounding to 128k, warning if 64k), but it is not a definitive solution. We also do (4) for ROMs, since VGA BIOSes use only 36k out of 64k and iPXE ROMs use only ~200k out of 256k. Paolo
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
* Paolo Bonzini (pbonz...@redhat.com) wrote: On 19/11/2014 15:13, Dr. David Alan Gilbert wrote: Since we've wondered off the actual ACPI table stuff into general ROM sizing, I'd like to propose some concrete fixes: 1) We explicitly name the bios file in a .romfile attribute for all ROMs. 2) The code that uses .romfile has an expansion for $MACHINETYPE 3) We actually symlink all of those together, anyone who wants/has to deal with different versions can downstream. 4) The machine types contain size attributes for the ROMs that are generoously larger than the ROMs anyone currently uses. I think 1..3 should deal with those of us who have to deal with different ROM versions on different machine types. It should, but it's a solution in search of a problem. Well we already do something close to 1 2 downstream but more ad-hoc; it's just a generalisation (and 4 from padding the size of our images). So we already had that problem. 4 might be good enough for the ACPI tables if you can bound it. Already doing that (rounding to 128k, warning if 64k), but it is not a definitive solution. We also do (4) for ROMs, since VGA BIOSes use only 36k out of 64k and iPXE ROMs use only ~200k out of 256k. Paolo -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:26, Dr. David Alan Gilbert wrote: * Paolo Bonzini (pbonz...@redhat.com) wrote: On 19/11/2014 15:13, Dr. David Alan Gilbert wrote: Since we've wondered off the actual ACPI table stuff into general ROM sizing, I'd like to propose some concrete fixes: 1) We explicitly name the bios file in a .romfile attribute for all ROMs. 2) The code that uses .romfile has an expansion for $MACHINETYPE 3) We actually symlink all of those together, anyone who wants/has to deal with different versions can downstream. 4) The machine types contain size attributes for the ROMs that are generoously larger than the ROMs anyone currently uses. I think 1..3 should deal with those of us who have to deal with different ROM versions on different machine types. It should, but it's a solution in search of a problem. Well we already do something close to 1 2 downstream but more ad-hoc; it's just a generalisation (and 4 from padding the size of our images). So we already had that problem. Upstream too. See pxe-* vs. efi-* NIC option ROMs. The latter includes both PXE firmware for BIOS and EFI drivers. We keep two copies because they have different sizes. Having explicit expansions for $MACHINETYPE would be hugely overkill, in my opinion. Paolo 4 might be good enough for the ACPI tables if you can bound it. Already doing that (rounding to 128k, warning if 64k), but it is not a definitive solution. We also do (4) for ROMs, since VGA BIOSes use only 36k out of 64k and iPXE ROMs use only ~200k out of 256k. Paolo -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:16, Dr. David Alan Gilbert wrote: Consider the similar case on real hardware: boot update microcode RPM s4 turn on CPU microcode is installed early by the kernel, before looking for a hibernation image to resume from, so the CPU microcode after resume from S4 is different from the microcode at the time you suspended to disk. This probably would work. You mean, unless for example, someone had disabled a CPU feature in the new microcode? A random example, right? :) I think it reinforces my point---just like it would work almost always on real hardware, but can fails if the stars align right, it should work almost always on QEMU. It doesn't have to be _perfect_. Bugs are always possible, of course, but things can be tested. Interested people/downstreams (hint, hint!) can try doing S4 on a release and restarting on the next, with and without machine type changes. If it breaks, it can be fixed or just documented. Paolo
Re: [Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
Yes! My level of knowledge is that one uses the qcow2 format in order to be able to create live snapshots/backups. Otherwise one would tend to use the more efficient raw format. Is this not correct and did I apply the backup mechanism in the wrong way? -Ursprüngliche Nachricht- Von: Paolo Bonzini [mailto:pbonz...@redhat.com] Gesendet: Mittwoch, 19. November 2014 12:54 An: Prof. Dr. Michael Schefczyk; qemu-devel Betreff: Re: AW: File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request) On 19/11/2014 12:48, Prof. Dr. Michael Schefczyk wrote: Thank you very much. To execute the backup, I run a script. For the machine in question, it looks as follows: #!/bin/bash dt=`date +%y%m%d` qemu-img snapshot -c gatewayb72 /kvm02/gatewayb72.img qemu-img convert -f qcow2 -O qcow2 /kvm02/gatewayb72.img /backup/gatewayb72$dt.img qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img /bin/find /backup/* -mtime +100 -exec rm {} \; Is it done while the VM is running? Paolo smime.p7s Description: S/MIME Cryptographic Signature
[Qemu-devel] hw/arm/virt: linux,stdout-path - stdout-path
As of Linux 3.15, the generic stdout-path property described by ePAPR 1.1 is supported by the upstream kernel: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/of/base.c?id=676e1b2fcd9dbb47a59baac13d089621d22c68b8 ARM virt still sets the legacy linux,stdout-path. Given that this step was added to ARM virt ~ 3 months after 3.15 was released, could we simply replace it (patch below)? Failing that, could we set both for now? / Leif From 25a51745c6243ff279684a3990c8c6aad25ed7b5 Mon Sep 17 00:00:00 2001 From: Leif Lindholm leif.lindh...@linaro.org Date: Wed, 19 Nov 2014 11:02:42 + Subject: [RFC] hw/arm/virt: set stdout-path instead of linux,stdout-path ePAPR 1.1 defines the stdout-path property, making the os-specific linux,stdout-path property redundant. Change the DT setup for ARM virt to use the generic property - supported by Linux since 3.15. Signed-off-by: Leif Lindholm leif.lindh...@linaro.org --- hw/arm/virt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 78f618d..314e55b 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -389,7 +389,7 @@ static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic) qemu_fdt_setprop(vbi-fdt, nodename, clock-names, clocknames, sizeof(clocknames)); -qemu_fdt_setprop_string(vbi-fdt, /chosen, linux,stdout-path, nodename); +qemu_fdt_setprop_string(vbi-fdt, /chosen, stdout-path, nodename); g_free(nodename); } -- 1.7.10.4
Re: [Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
Dear Paolo, Thank you very much. To execute the backup, I run a script. For the machine in question, it looks as follows: #!/bin/bash dt=`date +%y%m%d` qemu-img snapshot -c gatewayb72 /kvm02/gatewayb72.img qemu-img convert -f qcow2 -O qcow2 /kvm02/gatewayb72.img /backup/gatewayb72$dt.img qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img /bin/find /backup/* -mtime +100 -exec rm {} \; Regards, Michael -Ursprüngliche Nachricht- Von: Paolo Bonzini [mailto:pbonz...@redhat.com] Gesendet: Mittwoch, 19. November 2014 12:42 An: Prof. Dr. Michael Schefczyk; qemu-devel Betreff: File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request) On 19/11/2014 12:39, Prof. Dr. Michael Schefczyk wrote: Dear Paolo, thank you very much for your prompt reply. Hi, please keep the replies on the mailing list. I've added the qemu-devel mailing list, where more people should be able to see the message and hopefully reply with something helpful. Also, how do you do the backups? Paolo For example, I have a guest named gatewayb72.img where the backup failed. If I thereafter try to create or delete a snapshot, the following reply occurs on the command line: [root@linuxhost2 kvm02]# qemu-img snapshot -d gatewayb72 /kvm02/gatewayb72.img qemu-img: Could not open '/kvm02/gatewayb72.img': Could not read snapshots: File too large If I want to reboot that machine, I get the following error: Fehler beim Starten der Domain: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2, cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Traceback (most recent call last): File /usr/share/virt-manager/virtManager/asyncjob.py, line 100, in cb_wrapper callback(asyncjob, *args, **kwargs) File /usr/share/virt-manager/virtManager/asyncjob.py, line 122, in tmpcb callback(*args, **kwargs) File /usr/share/virt-manager/virtManager/domain.py, line 1220, in startup self._backend.create() File /usr/lib64/python2.7/site-packages/libvirt.py, line 698, in create if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self) libvirtError: Interner Fehler: Prozess wurde während der Verbindungsaufnahme zum Monitor beendet : qemu-kvm: -drive file=/kvm02/gatewayb72.img,if=none,id=drive-virtio-disk0,format=qcow2, cache=none: could not open disk image /kvm02/gatewayb72.img: Could not read snapshots: File too large Based on the facts I can see, the file is not too large. When reading the first error, the file size was 13.8 GB, while the limit is 14.5 GB. The same does also happen with files which are only, for example 6 GB big, while their limit is also 14.5 GB. Therefore, I think that the file too large really stands for something else. Can I provide any additional information? Regards, Michael ___ Technische Universität Dresden Fakultät Wirtschaftswissenschaften Lehrstuhl für Entrepreneurship und Innovation Prof. Dr. Michael Schefczyk D-01062 Dresden Fon: +49-3 51-4 63-3 68 81 Fax: +49-3 51-4 63-3 68 83 www.gruenderlehrstuhl.de Since some months ago, I am facing the problem, that backups fail unpredictably. A failed backup does not generate a backup file and thereafter, snapshots can no longer be created or deleted and the guest cannot be started anymore. The resulting error message is File too large. Who is reporting the File too large error? Can you please include the error in full detail? Paolo smime.p7s Description: S/MIME Cryptographic Signature
Re: [Qemu-devel] [PATCH 2/5] exec: qemu_ram_alloc_device, qemu_ram_resize
On 19/11/2014 15:21, Peter Maydell wrote: But in any case this is really just memory that is auto-resized on migration. And it can work even if it is mapped in memory, as long as your resize callback (or some post_load code executed while migrating devices) does something useful to update the memory map. Let's call it like what it is. ...so why all the this isn't going to work in KVM warnings in the patchset? Just a limitation of the patches, and one thing I was going to ask to change for v2. :) Paolo
Re: [Qemu-devel] [PATCH] geometry: fix i386 compilation
On 19 November 2014 14:01, Ekaterina Tumanova tuman...@linux.vnet.ibm.com wrote: Signed-off-by: Ekaterina Tumanova tuman...@linux.vnet.ibm.com Could you give the compiler error/warning message, please (and the compiler version)? These changes look sensible but it's not clear to me why the current code would cause a compilation failure or why that would be specific to i386... thanks -- PMM
Re: [Qemu-devel] [PATCH 0/5] pc: make ROMs resizeable
On 19/11/2014 14:52, Peter Maydell wrote: It certainly seems pretty risky to introduce this change with only two weeks to go til release; I wouldn't want to merge it without a strong consensus from everybody involved that it really needed to go in for 2.2. I think there's consensus (you, Amit, me) that it should wait for 2.3. Paolo
Re: [Qemu-devel] File too large error from qemu-img snapshot (was Re: AW: Bug Repoting Directions Request)
On 19/11/2014 13:07, Prof. Dr. Michael Schefczyk wrote: Yes! My level of knowledge is that one uses the qcow2 format in order to be able to create live snapshots/backups. Otherwise one would tend to use the more efficient raw format. Is this not correct and did I apply the backup mechanism in the wrong way? That's correct, but you still have to create live snapshots from within QEMU. This is done with a QMP (QEMU Management Protocol) command like { execute: blockdev-snapshot-internal-sync, arguments: { device: ide-hd0, name: snapshot0 } } QMP is accessed through normal sockets, or via libvirt. However, I'm not sure if running qemu-img convert on the resulting snapshot is possible though, and there is no equivalent of qemu-img snapshot -d. You can instead use QEMU's support for backup, which will do what you wanted directly while the VM is running. For example: { execute: drive-backup, arguments: { device: ide-hd0, sync: full, format: qcow2, target: backup.img } } This does not even require qcow2 for the image. The downside is that you must not turn off the VM until the job has completed. Paolo
Re: [Qemu-devel] qemu 2.2 crash on linux hvm domU (full backtrace included)
I think I know what is happening here. But you are pointing at the wrong change. commit 9b23cfb76b3a5e9eb5cc899eaf2f46bc46d33ba4 Is what I am guessing at this time is the issue. I think that xen_enabled() is returning false in pc_machine_initfn. Where as in pc_init1 is is returning true. I am thinking that: diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 7bb97a4..3268c29 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -914,7 +914,7 @@ static QEMUMachine xenfv_machine = { .desc = Xen Fully-virtualized PC, .init = pc_xen_hvm_init, .max_cpus = HVM_MAX_VCPUS, -.default_machine_opts = accel=xen, +.default_machine_opts = accel=xen,vmport=off, .hot_add_cpu = pc_hot_add_cpu, }; #endif Will fix your issue. I have not tested this yet. -Don Slutz On 11/19/14 09:04, Fabio Fantoni wrote: Il 14/11/2014 12:25, Fabio Fantoni ha scritto: dom0 xen-unstable from staging git with x86/hvm: Extend HVM cpuid leaf with vcpu id and x86/hvm: Add per-vcpu evtchn upcalls patches, and qemu 2.2 from spice git (spice/next commit e779fa0a715530311e6f59fc8adb0f6eca914a89): https://github.com/Fantu/Xen/commits/rebase/m2r-staging I tried with qemu tag v2.2.0-rc2 and crash still happen, here the full backtrace of latest test: Program received signal SIGSEGV, Segmentation fault. 0x55689b07 in vmport_ioport_read (opaque=0x564443a0, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 73 eax = env-regs[R_EAX]; (gdb) bt full #0 0x55689b07 in vmport_ioport_read (opaque=0x564443a0, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/hw/misc/vmport.c:73 s = 0x564443a0 cs = 0x0 cpu = 0x0 __func__ = vmport_ioport_read env = 0x8250 command = 0 '\000' eax = 0 #1 0x55655fc4 in memory_region_read_accessor (mr=0x5628, addr=0, value=0x7fffd8d0, size=4, shift=0, mask=4294967295) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:410 tmp = 0 #2 0x556562b7 in access_with_adjusted_size (addr=0, value=0x7fffd8d0, size=4, access_size_min=4, access_size_max=4, access=0x55655f62 memory_region_read_accessor, mr=0x5628) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:480 access_mask = 4294967295 access_size = 4 i = 0 #3 0x556590e9 in memory_region_dispatch_read1 (mr=0x5628, addr=0, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1077 data = 0 #4 0x556591b1 in memory_region_dispatch_read (mr=0x5628, addr=0, pval=0x7fffd9a8, size=4) ---Type return to continue, or q return to quit--- at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1099 No locals. #5 0x5565cbbc in io_mem_read (mr=0x5628, addr=0, pval=0x7fffd9a8, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/memory.c:1962 No locals. #6 0x5560a1ca in address_space_rw (as=0x55eaf920, addr=22104, buf=0x7fffda50 \377\377\377\377, len=4, is_write=false) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2167 l = 4 ptr = 0x55a92d87 %s/%d:\n val = 7852232130387826944 addr1 = 0 mr = 0x5628 error = false #7 0x5560a38f in address_space_read (as=0x55eaf920, addr=22104, buf=0x7fffda50 \377\377\377\377, len=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/exec.c:2205 No locals. #8 0x5564fd4b in cpu_inl (addr=22104) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/ioport.c:117 buf = \377\377\377\377 val = 21845 #9 0x55670c73 in do_inp (addr=22104, size=4) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:684 ---Type return to continue, or q return to quit--- No locals. #10 0x55670ee0 in cpu_ioreq_pio (req=0x77ff3020) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:747 i = 1 #11 0x556714b3 in handle_ioreq (state=0x563c2510, req=0x77ff3020) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:853 No locals. #12 0x55671826 in cpu_handle_ioreq (opaque=0x563c2510) at /mnt/vm/xen/Xen/tools/qemu-xen-dir/xen-hvm.c:931 state = 0x563c2510 req = 0x77ff3020 #13 0x5596e240 in qemu_iohandler_poll (pollfds=0x56389a30, ret=1) at iohandler.c:143 revents = 1 pioh = 0x563f7610 ioh = 0x56450a40 #14 0x5596de1c in main_loop_wait (nonblocking=0) at main-loop.c:495 ret = 1 timeout = 4294967295 timeout_ns = 3965432 #15 0x55756d3f in main_loop () at vl.c:1882 nonblocking = false last_io = 0 #16 0x5575ea49 in main (argc=62, argv=0x7fffe048, envp=0x7fffe240) at vl.c:4400 ---Type return to continue, or q return to quit--- i = 128 snapshot = 0 linux_boot = 0 initrd_filename = 0x0 kernel_filename =