Re: [Qemu-devel] Help needed: Sparc 64, kernel panic
Hi Mark,

Thanks a lot for this, it did work!

So now I'm wondering: how did you install the ISO on disk? Would you mind sharing your command line for the install, and any hacks you needed?

I am now able to install and boot, but my boot hangs inside user space after this line:

    [ 22.559491] [TTM] Initializing pool allocator

I did blacklist bochs_drm for the install, with "-append 'modprobe.blacklist=bochs_drm'".

I also have a couple of questions, if you don't mind answering them. I'd appreciate it a lot:

1- Did you use virtio for your install? I.e. not using the "-cdrom" option, blacklisting pata_cmd64x, and providing your own virtio device (/dev/vdb)? I used this link for hints on how to install with virtio:
   http://tyom.blogspot.ch/2013/03/debiansparc64-wheezy-under-qemu-how-to.html

2- I see that you used ext3 for your OS install. Isn't that slower than ext4? Again I'm referencing this:
   http://tyom.blogspot.ch/2013/03/virtio-performance-and-filesystems.html

3- I also notice that you didn't define root=/dev/sda or root=/dev/vda1 when running qemu. When I run your image I have to do that, otherwise I only get to the initramfs. Is there a trick behind this?

4- I don't see you defining a kernel and initrd. Is there a reason for this? Also, are you extracting the kernel and initrd from the image or from the ISO? If I just use your command line, I get this:

    OpenBIOS for Sparc64
    Configuration device id QEMU version 1 machine id 0
    kernel cmdline
    CPUs: 1 x SUNW,UltraSPARC-IIi
    UUID: ----
    Welcome to OpenBIOS v1.1 built on Mar 15 2017 19:37
    Type 'help' for detailed information
    Trying disk:a...
    Not a bootable ELF image
    Loading a.out image...
    Loaded 7680 bytes
    entry point is 0x4000
    SILO Version 1.4.14
    boot:

Thanks a lot in advance for your help.

Cheers
Hoss

From: Mark Cave-Ayland
Sent: Saturday, April 22, 2017 11:12 AM
To: Ajallooiean Hossein; qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Help needed: Sparc 64, kernel panic

On 21/04/17 16:12, Ajallooiean Hossein wrote:
> Thanks for the files and the notes.
>
> I am attaching my config-host file for you. I'm on x86_64, running Linux 64,
> Ubuntu 16.04.2 LTS
>
> So, I can also boot like you do - this worked before as well. Btw, if you try
> to install that to a disk, it'll not be able to, as you'll need to define memory
> for it.
>
> The problem is after I install the ISO on the qcow2 disk and then try to boot
> qemu-system-sparc64.
>
> So here are the steps to reproduce the issue:
> 1- create a qcow2 image: debian-9.0-sparc64-NETINST-1.qcow2
> 2- download the Debian image: debian-9.0-sparc64-NETINST-1.iso
> 3- install the OS on disk
> I use the below command line to do it:
>
> ./qemu-system-sparc64 -cdrom /home/nihosa/Downloads/debian-9.0-sparc64-NETINST-1.iso -hda /home/nihosa/Downloads/debian-sparc.qcow2 -nographic -boot d -L pc-bios -m 200
>
> I guess I don't have to define a kernel in the above command?
>
> 4- installation goes all well.
> 5- I try to run the new disk image: here I add a kernel, as if I don't add it
> I'll get the below:
>
> https://pastebin.com/cFwrX9E9

I've just done a test install with
https://people.debian.org/~glaubitz/debian-cd/2017-03-24/debian-9.0-sparc64-NETINST-1.iso
and I didn't see any errors similar to the ones you mention (although I
did have to blacklist the bochs_drm module upon boot).

The resulting qcow2 image can be found temporarily at
https://www.ilande.co.uk/tmp/qemu/sparc64-kernel/deb90.qcow2.xz and you
can launch it with:

    ./qemu-system-sparc64 -hda deb90.qcow2 -m 256 -nographic

Username and password are both root.

ATB,
Mark.
[Qemu-devel] NetBSD maintenance
Hello,

I noted a call for NetBSD maintainers in the 2.9.0 release notes.

I'm willing to attach a NetBSD machine to the CI cluster and volunteer basic maintenance. I'm mostly interested in NetBSD as host and as guest, as this is my daily and work driver on my desktop and development machines.

If I understand correctly there is currently no infrastructure for CI:
http://wiki.qemu.org/Testing/ContinuousIntegration

I'm familiar with Python buildbot; I run a machine for the LLVM toolchain with Clang and LLDB. This setup works very well and it's easy to set up with minimal dependencies:
http://llvm.org/docs/HowToAddABuilder.html

My LLVM machine:
http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd7

I plan to upgrade and rename it pretty soon, though, so look for lldb-amd64-ninja-netbsd8.

As of today NetBSD patches for qemu are maintained in pkgsrc. There are also at least DragonFly and SunOS (SmartOS) diffs available.
https://github.com/NetBSD/pkgsrc/tree/trunk/emulators/qemu/patches

Qemu is one of the core tools in NetBSD development and it's used in our release engineering infrastructure:
http://releng.netbsd.org/test-results.html

I will start with upstreaming local diffs and move on to running tests. I'm aware that there have lately been issues with ACPI, APIC, SMP and certain device drivers... but not everything at once.
Re: [Qemu-devel] [RFC] hw/arm/exynos: Add generic SDHCI devices
On Sat, Apr 22, 2017 at 08:05:41PM +0100, Peter Maydell wrote:
> On 22 April 2017 at 19:46, Krzysztof Kozlowski wrote:
> > On Thu, Apr 20, 2017 at 02:51:00PM +0100, Peter Maydell wrote:
> >> Incidentally someday maybe we should convert this Exynos4210 code
> >> to a proper QOM SoC container object, but that would be a lot of
> >> work.
> >
> > Any existing platforms which I could take as a good example? Maybe I'll
> > have some time to do it.
>
> The Xilinx ones, or stm32f205_soc, maybe. Basically the idea
> is that instead of having a random function which is doing
> a lot of instantiation of SoC devices, you have a QOM device
> which encapsulates all this, and the board level code just
> creates and configures that device.

Right, I've seen this pattern. Thanks for the hint.

Best regards,
Krzysztof
[Qemu-devel] [PATCH v2] hw/arm/exynos: Add generic SDHCI devices
Exynos4210 has four SD/MMC controllers supporting:
 - SD Standard Host Specification Version 2.0,
 - MMC Specification Version 4.3,
 - SDIO Card Specification Version 2.0,
 - DMA and ADMA.

Add emulation of SDHCI devices which allows accessing storage through SD
cards. Differences from real hardware:
 - Devices are shipped with eMMC memory, not SD card.
 - The Exynos4210 SDHCI has few more registers, e.g. for
   controlling the clocks, additional status (0x80, 0x84, 0x8c). These
   are not implemented.

Testing on smdkc210 machine with "-drive file=FILE,if=sd,bus=0,index=2".

Signed-off-by: Krzysztof Kozlowski

---

Changes since v1:
1. Fixes after Peter's review.
---
 hw/arm/exynos4210.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/arm/exynos4210.c b/hw/arm/exynos4210.c
index 0ec4250f0c05..5a622cfedfc8 100644
--- a/hw/arm/exynos4210.c
+++ b/hw/arm/exynos4210.c
@@ -32,6 +32,7 @@
 #include "hw/arm/arm.h"
 #include "hw/loader.h"
 #include "hw/arm/exynos4210.h"
+#include "hw/sd/sd.h"
 #include "hw/usb/hcd-ehci.h"

 #define EXYNOS4210_CHIPID_ADDR 0x1000
@@ -72,6 +73,13 @@
 #define EXYNOS4210_EXT_COMBINER_BASE_ADDR 0x1044
 #define EXYNOS4210_INT_COMBINER_BASE_ADDR 0x10448000

+/* SD/MMC host controllers */
+#define EXYNOS4210_SDHCI_CAPABILITIES 0x05E80080
+#define EXYNOS4210_SDHCI_BASE_ADDR 0x1251
+#define EXYNOS4210_SDHCI_ADDR(n) (EXYNOS4210_SDHCI_BASE_ADDR + \
+                                  0x0001 * (n))
+#define EXYNOS4210_SDHCI_NUMBER 4
+
 /* PMU SFR base address */
 #define EXYNOS4210_PMU_BASE_ADDR 0x1002

@@ -386,6 +394,27 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
             EXYNOS4210_UART3_FIFO_SIZE, 3, NULL,
             s->irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, 3)]);

+    /*** SD/MMC host controllers ***/
+    for (n = 0; n < EXYNOS4210_SDHCI_NUMBER; n++) {
+        DeviceState *carddev;
+        BlockBackend *blk;
+        DriveInfo *di;
+
+        dev = qdev_create(NULL, "generic-sdhci");
+        qdev_prop_set_uint32(dev, "capareg", EXYNOS4210_SDHCI_CAPABILITIES);
+        qdev_init_nofail(dev);
+
+        busdev = SYS_BUS_DEVICE(dev);
+        sysbus_mmio_map(busdev, 0, EXYNOS4210_SDHCI_ADDR(n));
+        sysbus_connect_irq(busdev, 0, s->irq_table[exynos4210_get_irq(29, n)]);
+
+        di = drive_get(IF_SD, 0, n);
+        blk = di ? blk_by_legacy_dinfo(di) : NULL;
+        carddev = qdev_create(qdev_get_child_bus(dev, "sd-bus"), TYPE_SD_CARD);
+        qdev_prop_set_drive(carddev, "drive", blk, &error_abort);
+        qdev_init_nofail(carddev);
+    }
+
     /*** Display controller (FIMD) ***/
     sysbus_create_varargs("exynos4210.fimd", EXYNOS4210_FIMD0_BASE_ADDR,
             s->irq_table[exynos4210_get_irq(11, 0)],
--
2.9.3
Re: [Qemu-devel] [RFC] hw/arm/exynos: Add generic SDHCI devices
On 22 April 2017 at 19:46, Krzysztof Kozlowski wrote:
> On Thu, Apr 20, 2017 at 02:51:00PM +0100, Peter Maydell wrote:
>> Incidentally someday maybe we should convert this Exynos4210 code
>> to a proper QOM SoC container object, but that would be a lot of
>> work.
>
> Any existing platforms which I could take as a good example? Maybe I'll
> have some time to do it.

The Xilinx ones, or stm32f205_soc, maybe. Basically the idea
is that instead of having a random function which is doing
a lot of instantiation of SoC devices, you have a QOM device
which encapsulates all this, and the board level code just
creates and configures that device.

thanks
-- PMM
Re: [Qemu-devel] [RFC] hw/arm/exynos: Add generic SDHCI devices
On Thu, Apr 20, 2017 at 02:51:00PM +0100, Peter Maydell wrote: > On 16 April 2017 at 16:16, Krzysztof Kozlowski wrote: > > Exynos4210 has four SD/MMC controllers supporting: > > - SD Standard Host Specification Version 2.0, > > - MMC Specification Version 4.3, > > - SDIO Card Specification Version 2.0, > > - DMA and ADMA. > > > > Add emulation of SDHCI devices which allows accessing storage through SD > > cards. Differences from real hardware: > > - Devices are shipped with eMMC memory, not SD card. > > - The Exynos4210 SDHCI has few more registers, e.g. for > >controlling the clocks, additional status (0x80, 0x84, 0x8c). These > >are not implemented. > > > > Testing on smdkc210 machine with "-drive file=FILE,if=sd,bus=0,index=2". > > > > Signed-off-by: Krzysztof Kozlowski > > > > --- > > > > Mostly it works: > > [ 11.763786] sdhci: Secure Digital Host Controller Interface driver > > [ 11.764593] sdhci: Copyright(c) Pierre Ossman > > [ 11.777295] s3c-sdhci 1253.sdhci: clock source 2: mmc_busclk.2 > > (1200 Hz) > > [ 11.976250] mmc0: SDHCI controller on samsung-hsmmc [1253.sdhci] > > using ADMA > > [ 11.980283] Synopsys Designware Multimedia Card Interface Driver > > [ 12.086757] mmc0: new SDHC card at address 4567 > > [ 12.151653] mmcblk0: mmc0:4567 QEMU! 4.00 GiB > > > > ... except that for guest, the storage starts from 0x5. It just > > skips first 0x5 bytes thus the paritions (MBR) and initial data is > > not visible. > > > > I don't know what is the cause. > > > > Any hints? > > That is strange and it sounds like we should try to track down > what is going on there. Hopefully it shouldn't be too hard to trace > through what happens when the guest accesses what it thinks is the > first block on the SD card... > > Otherwise I just have a couple of nits below. Nah, I am just total dumbass. Skipped bytes from image are okay because that is the header of QEMU image (e.g. QCOW2). 
The partitions were not visible because I tried to attach snapshot of partition itself, not entire disk. Everything works fine. > > > Signed-off-by: Krzysztof Kozlowski > > --- > > hw/arm/exynos4210.c | 29 + > > 1 file changed, 29 insertions(+) > > > > diff --git a/hw/arm/exynos4210.c b/hw/arm/exynos4210.c > > index 0ec4250f0c05..d581f2217253 100644 > > --- a/hw/arm/exynos4210.c > > +++ b/hw/arm/exynos4210.c > > @@ -32,6 +32,7 @@ > > #include "hw/arm/arm.h" > > #include "hw/loader.h" > > #include "hw/arm/exynos4210.h" > > +#include "hw/sd/sd.h" > > #include "hw/usb/hcd-ehci.h" > > > > #define EXYNOS4210_CHIPID_ADDR 0x1000 > > @@ -72,6 +73,13 @@ > > #define EXYNOS4210_EXT_COMBINER_BASE_ADDR 0x1044 > > #define EXYNOS4210_INT_COMBINER_BASE_ADDR 0x10448000 > > > > +/* SD/MMC host controllers */ > > +#define EXYNOS4210_SDHCI_CAPABILITIES 0x05E80080 > > +#define EXYNOS4210_SDHCI_BASE_ADDR 0x1251 > > +#define EXYNOS4210_SDHCI_ADDR(n)(EXYNOS4210_SDHCI_BASE_ADDR + \ > > +0x0001 * (n)) > > +#define EXYNOS4210_SDHCI_NUMBER 4 > > + > > /* PMU SFR base address */ > > #define EXYNOS4210_PMU_BASE_ADDR0x1002 > > > > @@ -386,6 +394,27 @@ Exynos4210State *exynos4210_init(MemoryRegion > > *system_mem, > > EXYNOS4210_UART3_FIFO_SIZE, 3, NULL, > >s->irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, > > 3)]); > > > > +/*** SD/MMC host controllers ***/ > > +for (n = 0; n < EXYNOS4210_SDHCI_NUMBER; n++) { > > +DeviceState *carddev; > > +DriveInfo *di; > > +BlockBackend *blk; > > + > > +dev = qdev_create(NULL, "generic-sdhci"); > > +qdev_prop_set_uint32(dev, "capareg", > > EXYNOS4210_SDHCI_CAPABILITIES); > > +qdev_init_nofail(dev); > > + > > +busdev = SYS_BUS_DEVICE(dev); > > +sysbus_mmio_map(busdev, 0, EXYNOS4210_SDHCI_ADDR(n)); > > +sysbus_connect_irq(busdev, 0, s->irq_table[exynos4210_get_irq(29, > > n)]); > > + > > +di = drive_get_next(IF_SD); > > di = drive_get(IF_SD, 0, n); > > would be better -- this explicitly states that SD cards 0,1,2,3 > connect to these controllers, rather 
than implicitly assigning > whatever the "next" ones happen to be. Sure. > > > +blk = di ? blk_by_legacy_dinfo(di) : NULL; > > +carddev = qdev_create(qdev_get_child_bus(dev, "sd-bus"), > > TYPE_SD_CARD); > > +qdev_prop_set_drive(carddev, "drive", blk, &error_abort); > > +qdev_prop_set_bit(carddev, "realized", true); > > This isn't the right way to set the realized property. You can either: > (a) use qdev_init_nofail(), which sets the property and prints an > error and exits if the device realize fails > (b) use object_property_set_bool() to set the property and handle > errors yourself I'll fix it. > > > +} > > + > > /*** Display controller (FIMD) ***/ >
Re: [Qemu-devel] [PATCH for-2.9? v2] tests: Ignore more test executables
08.03.2017 18:15, Eric Blake wrote: > Ignore test executables when building in-tree: > test-arm-mptimer introduced in commit 882fac3 > test-crypto-hmac introduced in commit 4fd460b > test-aio-multithread introduced in commit 0c330a7 Applied to -trivial, thank you! /mjt
Re: [Qemu-devel] [PATCH] Add 'none' as type for drive's if option
17.03.2017 18:49, Craig Jellick wrote: > qemu-options.hx | 2 +- Applied to -trivial, thank you! /mjt
Re: [Qemu-devel] [PATCH] doc: fix function spelling
22.03.2017 14:52, Marc-André Lureau wrote: > include/io/channel.h | 2 +- Applied to -trivial, thanks! /mjt
Re: [Qemu-devel] proposed qcow2 extension: cluster reservations [was: [RFC] Proposed qcow2 extension: subcluster allocation
On 21.04.2017 23:09, Eric Blake wrote: > On 04/06/2017 11:40 AM, Eric Blake wrote: > >>> === Changes to the on-disk format === >>> >>> The qcow2 on-disk format needs to change so each L2 entry has a bitmap >>> indicating the allocation status of each subcluster. There are three >>> possible states (unallocated, allocated, all zeroes), so we need two >>> bits per subcluster. >> >> You also have to add a new incompatible feature bit, so that older tools >> know they can't read the new image correctly, and therefore don't >> accidentally corrupt it. > > As long as we are talking about incompatible feature bits, I had some > thoughts about image mapping today. > > tl;dr: summary> As long as we are considering incompatible features, maybe we > should > make it easier to have an image file that explicitly preserves > guest=>host mapping, such that nothing the guest can do will reorder the > mapping. This way, it would be possible to fully preallocate an image > such that all guest offsets are adjusted by a constant offset to become > the corresponding host offset (basically, no qcow2 metadata is > interleaved in the middle of guest data). > > I don't see any way to do it on current qcow2 images, but with > subclusters, you get it for free by having a cluster with an offset but > with all subclusters marked as unallocated. But perhaps it is something > orthogonal enough that we would want a separate incompatible feature bit > for just this, without having subclusters at the same time. > > In the process of exploring the topic, I expose a couple of existing > bugs in our qcow2 handling. > > > > Longer version: > > If I'm reading qcow2-clusters.c and qcow2-refcount.c correctly, our > current implementation of bdrv_discard() says that except for clusters > already marked QCOW2_CLUSTER_ZERO, we will unconditionally remove the L2 > mapping of the address. As I've said, I think the ZERO bit is just a bug and we should free preallocated zero clusters. 
> Whether we actually punch a hole in the
> underlying image, or merely add it to a list of free clusters for use in
> subsequent allocations, is later decided based on
> s->discard_passthrough[type] (that is, the caller can pass different
> type levels that control whether to never punch, always punch, or let
> the blockdev parameters of the caller control whether to punch).
>
> Presumably, if I've preallocated my image because I want to guarantee
> enough host space, then I would use blockdev parameters that ensure that
> guest actions never punch a hole. But based on my reading, I still have
> the situation that if I initially preallocated things so that guest
> cluster 0 and 1 are consecutive clusters in the host, and the guest
> triggers bdrv_pdiscard() over both clusters, then writes to cluster 1
> before cluster 0, then even though I'm not changing the underlying
> allocation of the host file, I _am_ resulting in fragmentation in the
> qcow2 file, where cluster 1 in the guest now falls prior to cluster 0.

[...]

> But if we can preserve mappings of clusters that are explicitly marked
> zero, I started to wonder if it would also be reasonable to have a mode
> where we could ALWAYS preserve mappings. Adding another bit that says
> that a cluster has a reserved mapping, but still defers to the backing
> file for its current data, would let us extend the existing behavior of
> map-preservation on write zeros to work with ALL writes, when writing to
> a fully pre-allocated image.

Yes, and it also means that we may want to think about implementing a
preallocation mode in qemu which puts all of the data into a single
consecutive chunk (as you have hinted at somewhere above).

> When I chatted with Max on IRC about the idea, we said this:
>
>
> I mean, sure, we can add both, but I'd still want them to be
> two incompatible bits
> if you want the features to be orthogonal (with exponentially
> more cases to check), then having multiple incompatible bits is okay
> Because FEATURE_BIT_UNALLOCATED_AND_SUBCLUSTERS sounds weird
> and FEATURE_BIT_EXTENDED_L2_ENTRIES a bit pretentious
> Well, orthogonal is a good question. If we want to have an
> UNALLOCATED flag we should think so before adding subclusters
> Because then we should at least make clear that the ZERO flag
> for a subcluster requires the ALLOCATED flag to be set or something
> So we can reserve ZERO/!ALLOCATED for the case where you want
> to fall through to the backing file
>
> So, if you got this far in reading, the question becomes whether having
> a mode where you can mark a cluster as mapping-reserved-but-unallocated
> has enough use case to be worth pursuing, knowing that it will burn an
> incompatible feature bit; or if it should be rolled into the subcluster
> proposal, or whether it's not a feature that anyone needs after all.

I just forgot that just saying !ALLOCATED will be enough, regardless of
the ZERO flag... Yeah, subclusters will give us
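To make the "two bits per subcluster" idea from the quoted proposal concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: the names, the numeric encoding of the three states, and the choice of 32 subclusters packed into one 64-bit word are hypothetical and are not the qcow2 on-disk format, which this thread had not settled. The last helper illustrates the "offset present, every subcluster unallocated" case that Eric describes as a reserved mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of the three states named in the proposal. */
typedef enum {
    SC_UNALLOCATED = 0, /* defer to the backing file */
    SC_ALLOCATED   = 1, /* data is stored in this image */
    SC_ZERO        = 2  /* reads back as all zeroes */
} SubclusterState;

#define SC_BITS      2           /* two bits per subcluster */
#define SC_PER_ENTRY 32          /* assumption: 32 subclusters per 64-bit word */
#define SC_MASK      UINT64_C(3)

/* Read the state of subcluster idx from the packed bitmap. */
static inline SubclusterState sc_get(uint64_t bitmap, unsigned idx)
{
    assert(idx < SC_PER_ENTRY);
    return (SubclusterState)((bitmap >> (idx * SC_BITS)) & SC_MASK);
}

/* Return a copy of the bitmap with subcluster idx set to st. */
static inline uint64_t sc_set(uint64_t bitmap, unsigned idx, SubclusterState st)
{
    unsigned shift = idx * SC_BITS;

    assert(idx < SC_PER_ENTRY);
    bitmap &= ~(SC_MASK << shift);               /* clear the old state */
    bitmap |= ((uint64_t)st & SC_MASK) << shift; /* store the new one */
    return bitmap;
}

/*
 * "Mapping reserved but unallocated": the entry carries a host offset,
 * yet every subcluster still defers to the backing file.
 */
static inline int sc_mapping_reserved_only(uint64_t host_offset, uint64_t bitmap)
{
    return host_offset != 0 && bitmap == 0;
}
```

With this layout an all-zero bitmap means "nothing allocated", so a preallocated but untouched cluster needs no bitmap writes at all; whether that is worth a separate incompatible feature bit is exactly the open question in the thread.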
Re: [Qemu-devel] [PATCH] ppc_booke: drop useless assignment
24.03.2017 15:55, fred.kon...@greensocs.com wrote: > From: KONRAD Frederic > > The tb_env variable is set two lines above. So just drop the double > assignment. Applied to -trivial, thanks! /mjt
Re: [Qemu-devel] [PATCH v7 00/13] Improvements for SM501 display controller emulation
On 2017-04-21 17:18, BALATON Zoltan wrote: > v7: Define default values for some variables to avoid an (invalid) > warning from gcc 6 or 7 as suggested by Aurelien Jarno. Thanks a lot for this new version, I confirm it now builds fine with GCC 6 or 7. -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [Qemu-block] [PATCH] qemu_iotests: Remove _readlink()
On 21.04.2017 10:01, Kevin Wolf wrote:
> It is unused.
>
> Suggestetd-by: Fam Zheng

*Suggested

And, since it now would be rude not to, with that fixed:

Reviewed-by: Max Reitz

> Signed-off-by: Kevin Wolf
> ---
>  tests/qemu-iotests/common.config | 18 --
>  1 file changed, 18 deletions(-)
>
> diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
> index 1222e43..66f4e0b 100644
> --- a/tests/qemu-iotests/common.config
> +++ b/tests/qemu-iotests/common.config
> @@ -204,23 +204,5 @@ fi
>
>  export SAMPLE_IMG_DIR
>
> -_readlink()
> -{
> -    if [ $# -ne 1 ]; then
> -        echo "Usage: _readlink filename" 1>&2
> -        exit 1
> -    fi
> -
> -    perl -e "\$in=\"$1\";" -e '
> -    $lnk = readlink($in);
> -    if ($lnk =~ m!^/.*!) {
> -        print "$lnk\n";
> -    }
> -    else {
> -        chomp($dir = `dirname $in`);
> -        print "$dir/$lnk\n";
> -    }'
> -}
> -
>  # make sure this script returns success
>  true
>
[Qemu-devel] [Bug 1685526] [NEW] UEFI firmware can't write to "fake" FAT hard disk
Public bug reported:

Using the Tianocore OVMF UEFI firmware, a UEFI application cannot write
to the emulated FAT disk (-hda fat:rw:path/here). A file will get created
or written, but will be corrupted.

** Affects: qemu
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1685526

Title:
  UEFI firmware can't write to "fake" FAT hard disk

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1685526/+subscriptions
Re: [Qemu-devel] [PATCH] Block layer core: Fix qemu-img 'amend' subcommand failure of adjusting backing file in different path
On 22.04.2017 00:34, Ping Li wrote:
> Currently, qemu-img 'amend' subcommand would fail to adjust image's backing file
> which was moved into different path.
> For example, parent.qcow2, the backing file of leaf.qcow2, first is at /home/a/,
> then moved into /home/b/. Originally this command,
> "qemu-img amend -f qcow2 -o backing_fmt=qcow2,backing_file=/home/b/parent.qcow2 leaf.qcow2",
> would fail because qemu-img failed to open the old backing file of leaf.qcow2.
> Give the 'amend' subcommand a '-u' option to not open the old backing file
> while opening leaf.qcow2.
>
> Signed-off-by: Li Ping
> ---
>  qemu-img.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)

So why can't you just use rebase -u?

I'm not completely opposed to adding this functionality to amend, but
I'd actually rather remove the ability to change the backing file from
amend than to add functionality that may turn out to be rather
complicated to implement...

As Eric has proposed, I also think we could turn on BDRV_O_NO_BACKING
automatically if the option list consists of nothing but backing_file
and backing_fmt. But then we'd have to think about the case where the
user gives both backing_file and some other option at the same time; it
may be unexpected that we do open the backing file in that case. The
best might actually be to error out so the user is forced to do both
changes separately -- but at that point the logic gets so complicated
that I think we should at most add a note to qemu-img's man page that
qemu-img amend will open the backing file and that users should use
rebase -u if that is not desired...

Max
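The three cases Max walks through (no backing option, only backing options, backing options mixed with others) can be sketched as a small decision function. This is a hypothetical illustration in plain C, not code from qemu-img: the enum, the function names and the string-array interface are all assumptions, and the "reject" branch models only the "error out" variant that Max mentions before concluding that documenting `rebase -u` may be the simpler route.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of inspecting the amend option list. */
typedef enum {
    AMEND_OPEN_BACKING, /* no backing option given: open the chain as usual */
    AMEND_SKIP_BACKING, /* only backing_file/backing_fmt: do not open the old backing file */
    AMEND_REJECT_MIXED  /* backing option mixed with others: ask for separate calls */
} AmendBackingPolicy;

/* True if the option only rewrites the backing file reference. */
static inline int is_backing_opt(const char *name)
{
    return strcmp(name, "backing_file") == 0 ||
           strcmp(name, "backing_fmt") == 0;
}

/* Classify an option list of n names according to the three cases. */
static inline AmendBackingPolicy amend_backing_policy(const char *const *opts,
                                                      size_t n)
{
    size_t backing = 0;

    for (size_t i = 0; i < n; i++) {
        assert(opts[i] != NULL);
        if (is_backing_opt(opts[i])) {
            backing++;
        }
    }
    if (backing == 0) {
        return AMEND_OPEN_BACKING;
    }
    if (backing == n) {
        return AMEND_SKIP_BACKING;
    }
    return AMEND_REJECT_MIXED;
}
```

Even in this reduced form the function has three outcomes for what users see as a single command, which gives a feel for the complexity concern raised in the reply.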
Re: [Qemu-devel] [Qemu-block] [RFC for-3.0 0/4] block: Add qcow2-rust block driver
On 21.04.2017 17:51, Stefan Hajnoczi wrote:
> On Sat, Apr 01, 2017 at 05:57:47PM +0200, Max Reitz wrote:
>> The issues of using C are well understood and nobody likes it. Let's use
>> a better language. C++ is not a better language, Rust is. Everybody
>> loves Rust. Rust is good. Rust is hip. It will attract developers, it
>> will improve code quality, it will improve performance, it will even
>> improve your marriage. Rust is the future and the future is now.
>>
>> As the block layer, let's show our commitment to the future by replacing
>> one of our core parts, the LEGACY (Bah! Yuck! Ugh!) qcow2 driver, by
>> a shiny (Oooh! Aaah!) Rust driver. Much better. My VMs now run thrice as
>> fast. Promise.
>
> This is actually a good exercise.
>
> Did you feel like there were places where Rust allowed you to express
> things better than C?

Well, I very much like the try!() macro and generally the fact that you
can return both a success and an error value at the same time. (Which
allowed me to represent the Error ** thing we generally use in a much
more natural way.)

> I like pattern matching, wish C had it.

That, including the (Haskell-like (?)) enums, is nice, too, yes.

I would like to say I hate that you have to explicitly cast integers
when converting between different widths, but if we had to do that in C,
it would have probably prevented quite some bugs we had in the past. (It
still is pretty annoying most of the time.)

Something that's not better than C but which surprised me in a really
positive way was how nice Rust's C interface is.

Having said that, there are things that really enrage me every time I
try my hands at Rust; most of which are intentional and probably
familiar to anyone having done so. But there are some things I have
missed for which there are proposed solutions, but they have been in a
"should be done soon" state for some years now.

For instance, when implementing methods for a generic type:

impl BlockDriver { ... }

you cannot test *whether* the type implements a certain trait.
https://github.com/XanClic/qemu/blob/rust-qcow2/block/rust/src/interface/mod.rs#L651
is where I complain.

You can implement functions only if T implements a trait:

impl BlockDriver { pub fn provides_open(&mut self) { ... } }

But you cannot implement e.g. the same function but with a different
body (e.g. doing nothing) if T does not implement a trait, or provide a
generic implementation that is overwritten by these specialized
implementations. Both have been proposed and especially the first one is
supposed to be added in some form at some point in the future; but it
has been in that proposal state since 2014 or so...

(This is also a good thing, mind you, because it means they don't just
throw anything into the language just because it looks useful to
someone. They really think long and hard about what they add and what
implications it will have so it doesn't end up as a second C++.)

So all in all, Rust is not completely unusable, but of course you swear
a lot (at least I do) and it still feels kind of unfinished. I hate it
very much but at the same time, *for me*, I think it's the best language
for cases where I would have used C or C++ in the past.

Max
Re: [Qemu-devel] [PATCH 2/7] target/openrisc: add shutdown logic
On 04/22/2017 12:09 PM, Stafford Horne wrote: I discussed a bit on #qemu and Alexander Graf suggested to properly define shutdown semantics for openrisc. Some examples were shown from ppc, s390 and arm. Yes, properly defining this in the spec goes a long way toward fixing the underlying problem. It's probably worth thinking about a wait-state condition as well, so that qemu can avoid staying pegged at 100% host cpu for an idle guest cpu. ppc - platform http://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc/e500.c;hb=HEAD#l936 Registers hardware device mpc8xxx_gpio which handles shutdown via gpio That's one possibility. Another is to define an SPR. r~
Re: [Qemu-devel] dns server not working in QEMU using usermode networking (SLIRP)
Stefan Weil, on ven. 21 avril 2017 21:58:18 +0200, wrote: > Am 17.04.2017 um 00:10 schrieb FONNEMANN Mark: > > I hadn't seen the original report on the list, sorry. There is too much > > traffic on qemu-devel for me to manage to catch these :/ > > > > This problem was fixed by > > e42f869b ("slirp: Make RA build more flexible") and a2f80fdf ("slirp: Send > > RDNSS in RA only if host has an IPv6 DNS server") which will be included in > > qemu 2.9. > > > > Samuel > > See report on > http://stackoverflow.com/questions/43308310/dns-server-not-working-in-qemu-usermode-networking. > > Mark, did you also get that problem with QEMU running on a Linux host, > or is it specific for QEMU running on Windows? Ah, I hadn't noticed the report was about windows. The abovementioned fix should still be fine for windows: there, get_dns6_addr just always returns -1, and thus the RDNSS option is never added. So I don't know what to do on my side, more investigation is needed there. Samuel
Re: [Qemu-devel] [PATCH 2/7] target/openrisc: add shutdown logic
On Tue, Apr 18, 2017 at 11:20:55PM +0900, Stafford Horne wrote:
> On Tue, Apr 18, 2017 at 12:52:52AM -0700, Richard Henderson wrote:
> > On 04/16/2017 04:23 PM, Stafford Horne wrote:
> > > In openrisc simulators we use hooks like 'l.nop 1' to cause the
> > > simulator to exit. Implement that for qemu too.
> > >
> > > Reported-by: Waldemar Brodkorb
> > > Signed-off-by: Stafford Horne
> >
> > As I said the first time this was posted: This is horrible.
> >
> > If you want to do something like this, it needs to be buried under a special
> > run mode like -semihosting.
>
> Understood, I will revise this. I didn't know this was posted before.
>
> > >      case 0x01:    /* l.nop */
> > >          LOG_DIS("l.nop %d\n", I16);
> > > +        {
> > > +            TCGv_i32 arg = tcg_const_i32(I16);
> > > +            gen_helper_nop(arg);
> > > +        }
> >
> > You also really really must special-case l.nop 0 so that it doesn't generate
> > a function call. Just think of all the extra calls you're adding for every
> > delay slot that couldn't be filled.
>
> Yeah, that makes sense. I'll add that for l.nop 0.

FYI, I am going to drop this patch for now. I think Waldemar can apply
this patch for the time being.

I looked through the semihosting route and I don't think poking l.nop
through there makes much sense, since that looks to be mainly for syscalls.

I also considered making another flag like `-or1k-hacks`, but I figured
that wouldn't be appropriate.

I discussed a bit on #qemu and Alexander Graf suggested to properly
define shutdown semantics for openrisc. Some examples were shown from
ppc, s390 and arm.

s390x
http://git.qemu.org/?p=qemu.git;a=blob;f=target/s390x/helper.c;hb=HEAD#l265
Detects that the cpu is in WAIT state and shuts down qemu.

ppc - platform
http://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc/e500.c;hb=HEAD#l936
Registers hardware device mpc8xxx_gpio which handles shutdown via gpio.

I will have a think about this; it will require some kernel changes.

-Stafford
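Richard's request to special-case `l.nop 0` can be illustrated with a small stand-alone sketch. The names below (`TransCtx`, `emit_helper_call`, `translate_nop`) are hypothetical stand-ins and not QEMU's TCG API; the counter simply models "a helper call was emitted into the translated code". The point is only the control flow: the plain no-op, which fills every unused delay slot, emits nothing, and only a non-zero immediate pays for a call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translation context that records emitted helper calls. */
typedef struct {
    unsigned helper_calls; /* how many helper calls were emitted */
    uint16_t last_arg;     /* immediate passed to the most recent call */
} TransCtx;

/* Stand-in for generating a call to a run-time helper. */
static inline void emit_helper_call(TransCtx *ctx, uint16_t imm)
{
    ctx->helper_calls++;
    ctx->last_arg = imm;
}

/* Translate one l.nop instruction with 16-bit immediate imm. */
static inline void translate_nop(TransCtx *ctx, uint16_t imm)
{
    assert(ctx != NULL);
    if (imm == 0) {
        return; /* plain no-op: emit no code at all */
    }
    emit_helper_call(ctx, imm); /* only "hook" variants reach the helper */
}
```

In a real translator the non-zero branch would additionally be gated behind whatever run mode is eventually agreed on, which is the other half of the review comment.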
Re: [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side (CPUMASK)
Hello David, this mail is just for the CPUMASK discussion.

On Fri, Apr 21, 2017 at 01:00:32PM +0100, Dr. David Alan Gilbert wrote:
> * Alexey Perevalov (a.pereva...@samsung.com) wrote:
> > This patch provides downtime calculation per vCPU,
> > as a summary and as a overlapped value for all vCPUs.
> >
> > This approach just keeps tree with page fault addr as a key,
> > and t1-t2 interval of pagefault time and page copy time, with
> > affected vCPU bit mask.
> > For more implementation details please see comment to
> > get_postcopy_total_downtime function.
> >
> > Signed-off-by: Alexey Perevalov
> > ---
> >  include/migration/migration.h |  14 +++
> >  migration/migration.c         | 280 +-
> >  migration/postcopy-ram.c      |  24 +++-
> >  migration/qemu-file.c         |   1 -
> >  migration/trace-events        |   9 +-
> >  5 files changed, 323 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index 5720c88..5d2c628 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -123,10 +123,24 @@ struct MigrationIncomingState {
> >
> >      /* See savevm.c */
> >      LoadStateEntry_Head loadvm_handlers;
> > +
> > +    /*
> > +     * Tree for keeping postcopy downtime,
> > +     * necessary to calculate correct downtime, during multiple
> > +     * vm suspends, it keeps host page address as a key and
> > +     * DowntimeDuration as a data
> > +     * NULL means kernel couldn't provide process thread id,
> > +     * and QEMU couldn't identify which vCPU raise page fault
> > +     */
> > +    GTree *postcopy_downtime;
> >  };
> >
> >  MigrationIncomingState *migration_incoming_get_current(void);
> >  void migration_incoming_state_destroy(void);
> > +void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
> > +void mark_postcopy_downtime_end(uint64_t addr);
> > +uint64_t get_postcopy_total_downtime(void);
> > +void destroy_downtime_duration(gpointer data);
> >
> >  /*
> >   * An outstanding page request, on the source, having been received
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 79f6425..5bac434 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -38,6 +38,8 @@
> >  #include "io/channel-tls.h"
> >  #include "migration/colo.h"
> >
> > +#define DEBUG_VCPU_DOWNTIME 1
> > +
> >  #define MAX_THROTTLE  (32 << 20)      /* Migration transfer speed throttling */
> >
> >  /* Amount of time to allocate to each "chunk" of bandwidth-throttled
> > @@ -77,6 +79,19 @@ static NotifierList migration_state_notifiers =
> >
> >  static bool deferred_incoming;
> >
> > +typedef struct {
> > +    int64_t begin;
> > +    int64_t end;
> > +    uint64_t *cpus; /* cpus bit mask array, QEMU bit functions support
> > +     bit operation on memory regions, but doesn't check out of range */
> > +} DowntimeDuration;
> > +
> > +typedef struct {
> > +    int64_t tp; /* point in time */
> > +    bool is_end;
> > +    uint64_t *cpus;
> > +} OverlapDowntime;
> > +
> >  /*
> >   * Current state of incoming postcopy; note this is not part of
> >   * MigrationIncomingState since it's state is used during cleanup
> > @@ -117,6 +132,13 @@ MigrationState *migrate_get_current(void)
> >      return &current_migration;
> >  }
> >
> > +void destroy_downtime_duration(gpointer data)
> > +{
> > +    DowntimeDuration *dd = (DowntimeDuration *)data;
> > +    g_free(dd->cpus);
> > +    g_free(data);
> > +}
> > +
> >  MigrationIncomingState *migration_incoming_get_current(void)
> >  {
> >      static bool once;
> > @@ -138,10 +160,13 @@ void migration_incoming_state_destroy(void)
> >      struct MigrationIncomingState *mis = migration_incoming_get_current();
> >
> >      qemu_event_destroy(&mis->main_thread_load_event);
> > +    if (mis->postcopy_downtime) {
> > +        g_tree_destroy(mis->postcopy_downtime);
> > +        mis->postcopy_downtime = NULL;
> > +    }
> >      loadvm_free_handlers(mis);
> >  }
> >
> > -
> >  typedef struct {
> >      bool optional;
> >      uint32_t size;
> > @@ -1754,7 +1779,6 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
> >       */
> >      ms->postcopy_after_devices = true;
> >      notifier_list_notify(&migration_state_notifiers, ms);
> > -
>
> Stray deletion
>
> >      ms->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - time_at_stop;
> >
> >      qemu_mutex_unlock_iothread();
> > @@ -2117,3 +2141,255 @@ PostcopyState postcopy_state_set(PostcopyState new_state)
> >      return atomic_xchg(&incoming_postcopy_state, new_state);
> >  }
> >
> > +#define SIZE_TO_KEEP_CPUBITS (1 + smp_cpus/sizeof(guint64))
>
> Split out your cpu-sets so that you have an 'alloc_cpu_set',
> a 'set bit' a 'set all bits', dup etc
> (I see Linux has cpumask.h that has a 'cpu_set' that's
> basically the same thing, but we need something portablish.)
>
Agree, the way I'm working with cp
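For illustration only, the cpu-set split suggested in the review could look roughly like the sketch below. It is plain C rather than QEMU code: the CpuSet type and the cpu_set_* names are hypothetical and are not taken from the patch or from any QEMU header. The sketch sizes the array by bits per word (the quoted SIZE_TO_KEEP_CPUBITS macro divides by sizeof(guint64), which is a byte count) and range-checks every bit access, which the comment in DowntimeDuration says the generic bit functions do not do.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of bits held by one word of the set. */
#define CPU_SET_WORD_BITS (sizeof(uint64_t) * CHAR_BIT)

/* Hypothetical type, not a QEMU structure: one bit per vCPU. */
typedef struct {
    unsigned int nr_cpus; /* number of valid bits */
    size_t nr_words;      /* number of words needed for nr_cpus bits */
    uint64_t *words;
} CpuSet;

/* Allocate an empty set able to hold nr_cpus bits. */
static CpuSet *cpu_set_alloc(unsigned int nr_cpus)
{
    CpuSet *set = malloc(sizeof(*set));

    if (!set) {
        return NULL;
    }
    set->nr_cpus = nr_cpus;
    /* Round up by bits per word, not by bytes per word. */
    set->nr_words = (nr_cpus + CPU_SET_WORD_BITS - 1) / CPU_SET_WORD_BITS;
    /* Allocate at least one word so that words is never NULL. */
    set->words = calloc(set->nr_words ? set->nr_words : 1, sizeof(uint64_t));
    if (!set->words) {
        free(set);
        return NULL;
    }
    return set;
}

static void cpu_set_free(CpuSet *set)
{
    if (set) {
        free(set->words);
        free(set);
    }
}

/* Set one bit; returns false and changes nothing if cpu is out of range. */
static bool cpu_set_set(CpuSet *set, unsigned int cpu)
{
    if (cpu >= set->nr_cpus) {
        return false;
    }
    set->words[cpu / CPU_SET_WORD_BITS] |=
        UINT64_C(1) << (cpu % CPU_SET_WORD_BITS);
    return true;
}

/* Test one bit; out-of-range cpus are reported as not set. */
static bool cpu_set_test(const CpuSet *set, unsigned int cpu)
{
    if (cpu >= set->nr_cpus) {
        return false;
    }
    return (set->words[cpu / CPU_SET_WORD_BITS] >>
            (cpu % CPU_SET_WORD_BITS)) & 1;
}

/* Set every valid bit (e.g. when the faulting vCPU is unknown). */
static void cpu_set_set_all(CpuSet *set)
{
    unsigned int cpu;

    for (cpu = 0; cpu < set->nr_cpus; cpu++) {
        cpu_set_set(set, cpu);
    }
}

/* Return an independent copy of src, or NULL on allocation failure. */
static CpuSet *cpu_set_dup(const CpuSet *src)
{
    CpuSet *copy;

    assert(src != NULL);
    copy = cpu_set_alloc(src->nr_cpus);
    if (copy) {
        memcpy(copy->words, src->words, src->nr_words * sizeof(uint64_t));
    }
    return copy;
}
```

Keeping the bit count inside the set is a design choice of this sketch: it lets each helper reject an out-of-range vCPU index instead of relying on the caller.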
Re: [Qemu-devel] [QEMU-2.8] Source QEMU crashes with: "bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed"
Hi,

I think the patch below can fix your problem:

[PATCH 2/4] qmp-cont: invalidate on RUN_STATE_PRELAUNCH
https://patchwork.kernel.org/patch/9591885/

Actually, we encountered the same problem in our tests, and we fixed it with
the following patch:

From 0e4d6d706afd9909b5fd71536b45c58af60892f8 Mon Sep 17 00:00:00 2001
From: zhanghailiang
Date: Tue, 21 Mar 2017 09:44:36 +0800
Subject: [PATCH] migration: Re-activate blocks whenever migration been cancelled

In commit 1d2acc3162d9c7772510c973f446353fbdd1f9a8, we tried to fix the bug
'bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed',
which occurred while a migration was being cancelled.

But it seems that we didn't cover all the cases. In our tests we caught
a case that slipped past the old fix: if libvirtd cancels the migration
of a VM that is shutting down, it sends a 'system_reset' command first
and then a 'cont' command. After the VM resumes running, it triggers
the above error, because we didn't regain control of the blocks
for the VM.

Signed-off-by: zhanghailiang Signed-off-by: Hongyang Yang --- block.c | 12 +++- include/block/block.h | 1 + include/migration/migration.h | 3 --- migration/migration.c | 7 +-- qmp.c | 4 +--- 5 files changed, 14 insertions(+), 13 deletions(-) diff --git a/block.c b/block.c index 6e906ec..c2555b0 100644 --- a/block.c +++ b/block.c @@ -3875,6 +3875,13 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp) } } +static bool is_inactivated; + +bool bdrv_is_inactivated(void) +{ +return is_inactivated; +} + void bdrv_invalidate_cache_all(Error **errp) { BlockDriverState *bs; @@ -3892,6 +3899,7 @@ void bdrv_invalidate_cache_all(Error **errp) return; } } +is_inactivated = false; } static int bdrv_inactivate_recurse(BlockDriverState *bs, @@ -3948,7 +3956,9 @@ out: for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) { aio_context_release(bdrv_get_aio_context(bs)); } - +if (!ret) { +is_inactivated = true; +} return ret; } diff --git a/include/block/block.h b/include/block/block.h index 5149260..f77b57f 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -365,6 +365,7 @@ int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf); void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp); void bdrv_invalidate_cache_all(Error **errp); int bdrv_inactivate_all(void); +bool bdrv_is_inactivated(void); /* Ensure contents are flushed to disk. 
*/ int bdrv_flush(BlockDriverState *bs); diff --git a/include/migration/migration.h b/include/migration/migration.h index 5720c88..a9a2071 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -183,9 +183,6 @@ struct MigrationState /* Flag set once the migration thread is running (and needs joining) */ bool migration_thread_running; -/* Flag set once the migration thread called bdrv_inactivate_all */ -bool block_inactive; - /* Queue of outstanding page requests from the destination */ QemuMutex src_page_req_mutex; QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest) src_page_requests; diff --git a/migration/migration.c b/migration/migration.c index 54060f7..7c3d593 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1015,14 +1015,12 @@ static void migrate_fd_cancel(MigrationState *s) if (s->state == MIGRATION_STATUS_CANCELLING && f) { qemu_file_shutdown(f); } -if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) { +if (bdrv_is_inactivated()) { Error *local_err = NULL; bdrv_invalidate_cache_all(&local_err); if (local_err) { error_report_err(local_err); -} else { -s->block_inactive = false; } } } @@ -1824,7 +1822,6 @@ static void migration_completion(MigrationState *s, int current_active_state, if (ret >= 0) { qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX); qemu_savevm_state_complete_precopy(s->to_dst_file, false); -s->block_inactive = true; } } qemu_mutex_unlock_iothread(); @@ -1879,8 +1876,6 @@ fail_invalidate: bdrv_invalidate_cache_all(&local_err); if (local_err) { error_report_err(local_err);
Re: [Qemu-devel] Help needed: Sparc 64, kernel panic
On 21/04/17 16:12, Ajallooiean Hossein wrote:

> Thanks for the files and the notes.
>
> I am attaching my config-host file for you. I'm on x86_64, running Linux 64,
> Ubuntu 16.04.2 LTS
>
> so, i can also boot like you do - this worked before as well. btw, if you try
> to install that to a disk, itll not be able to as youll need to define memory
> for it.
>
> The problem is after i install the iso on the qcow2 disk and then try to boot
> qemu-system-sparc64.
>
> so here is the steps to reproduce the issue:
> 1- create a qcow2 image : debian-9.0-sparc64-NETINST-1.qcow2
> 2- download debian image: debian-9.0-sparc64-NETINST-1.iso
> 3- install the OS on dIsk
> i use the below command line to do it:
>
> ./qemu-system-sparc64 -cdrom
> /home/nihosa/Downloads/debian-9.0-sparc64-NETINST-1.iso -hda
> /home/nihosa/Downloads/debian-sparc.qcow2 -nographic -boot d -L pc-bios -m 200
>
> i guess i dont have to define a kernel in the above code???
>
> 4- installation goes all well.
> 5- i try to run the new disk image: - here i add kernel as if i dont add it
> ill get the below:
>
> https://pastebin.com/cFwrX9E9

I've just done a test install with
https://people.debian.org/~glaubitz/debian-cd/2017-03-24/debian-9.0-sparc64-NETINST-1.iso
and I didn't see any errors similar to the ones you mention (although I did
have to blacklist the bochs_drm module upon boot).

The resulting qcow2 image can be found temporarily at
https://www.ilande.co.uk/tmp/qemu/sparc64-kernel/deb90.qcow2.xz and you can
launch it with:

./qemu-system-sparc64 -hda deb90.qcow2 -m 256 -nographic

Username and password are both root.


ATB,

Mark.
[Qemu-devel] [PATCH RESEND v2 17/18] filter-rewriter: handle checkpoint and failover event
After one round of checkpoint, the states between PVM and SVM
become consistent, so it is unnecessary to adjust the sequence
of net packets for old connections, besides, while failover
happens, filter-rewriter needs to check if it still needs to
adjust sequence of net packets.

Cc: Jason Wang
Signed-off-by: zhanghailiang
---
 net/filter-rewriter.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index c9a6d43..0a90b11 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -22,6 +22,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "net/colo.h"

 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -270,6 +271,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
     return 0;
 }

+static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    conn->offset = 0;
+}
+
+static gboolean offset_is_nonzero(gpointer key,
+                                  gpointer value,
+                                  gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    return conn->offset ? true : false;
+}
+
+static void colo_rewriter_handle_event(NetFilterState *nf, int event,
+                                       Error **errp)
+{
+    RewriterState *rs = FILTER_COLO_REWRITER(nf);
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_hash_table_foreach(rs->connection_track_table,
+                            reset_seq_offset, NULL);
+        break;
+    case COLO_FAILOVER:
+        if (!g_hash_table_find(rs->connection_track_table,
+                              offset_is_nonzero, NULL)) {
+            object_property_set_str(OBJECT(nf), "off", "status", errp);
+        }
+        break;
+    default:
+        break;
+    }
+}
+
 static void colo_rewriter_cleanup(NetFilterState *nf)
 {
     RewriterState *s = FILTER_COLO_REWRITER(nf);
@@ -299,6 +337,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
     nfc->setup = colo_rewriter_setup;
     nfc->cleanup = colo_rewriter_cleanup;
     nfc->receive_iov = colo_rewriter_receive_iov;
+    nfc->handle_event = colo_rewriter_handle_event;
 }

 static const TypeInfo colo_rewriter_info = {
-- 
1.8.3.1
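The event handling in this patch can be sketched outside QEMU as follows. This is a simplified stand-in, not the patch's code: a plain array replaces the GLib hash table, and the Conn and Rewriter types, the EVENT_* constants and the enabled flag (standing in for the filter's "status" property) are all hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for COLO_CHECKPOINT / COLO_FAILOVER. */
enum { EVENT_CHECKPOINT = 1, EVENT_FAILOVER = 2 };

/* Hypothetical connection record: only the sequence offset matters here. */
typedef struct {
    uint32_t offset;
} Conn;

/* Hypothetical rewriter state; enabled models the filter's status. */
typedef struct {
    Conn *conns;
    size_t nr_conns;
    bool enabled;
} Rewriter;

static void rewriter_handle_event(Rewriter *rw, int event)
{
    size_t i;

    assert(rw != NULL);
    switch (event) {
    case EVENT_CHECKPOINT:
        /* PVM and SVM are consistent again: no offset is needed. */
        for (i = 0; i < rw->nr_conns; i++) {
            rw->conns[i].offset = 0;
        }
        break;
    case EVENT_FAILOVER:
        /* Keep rewriting while any connection still has an offset. */
        for (i = 0; i < rw->nr_conns; i++) {
            if (rw->conns[i].offset != 0) {
                return;
            }
        }
        rw->enabled = false;
        break;
    default:
        break;
    }
}
```

In this sketch a failover that arrives before any checkpoint leaves the rewriter enabled as long as one tracked connection has a non-zero offset; a failover right after a checkpoint switches it off.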
[Qemu-devel] [QEMU-2.8] Source QEMU crashes with: "bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed"
I have seen a bunch of reports about this assertion error (on source
QEMU).  [At least I recall Greg Kurz mentioning this a week or so ago on
#qemu, OFTC.]

I just noticed this crash in the upstream OpenStack CI environment.  This
seems to occur (only intermittently, though) during live migration
without shared storage.  [I've attached the complete source and
destination QEMU command-line in a separate plain text file.]

But here are the errors from source and destination QEMU:

Source QEMU:
---
[...]
2017-04-21 13:54:08.505+: initiating migration
qemu-system-x86_64: /build/qemu-5OJ39u/qemu-2.8+dfsg/block/io.c:1514: bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2017-04-21 13:54:08.791+: shutting down, reason=crashed
---

Destination QEMU:
---
[...]
/build/qemu-5OJ39u/qemu-2.8+dfsg/nbd/server.c:nbd_receive_request():L710: read failed
2017-04-21 13:54:08.792+: shutting down, reason=failed
2017-04-21T13:54:08.793259Z qemu-system-x86_64: terminating on signal 15 from pid 12160 (/usr/sbin/libvirtd)
---

Any hints as to how to deal with this in QEMU 2.8?

FWIW, the upstream OpenStack CI only very recently moved to QEMU 2.8, so
it is unlikely that the CI env will move to the just-released 2.9
anytime soon.  (But there's work in progress to create a CI job that
tests with QEMU 2.9.)

-- /kashyap Source QEMU: --- 2017-04-21 13:54:03.632+: starting up libvirt version: 2.5.0, package: 3ubuntu5~cloud0 (Openstack Ubuntu Testing Bot Tue, 21 Mar 2017 21:54:49 +), qemu version: 2.8.0(Debian 1:2.8+dfsg-3ubuntu2~cloud0), hostname: ubuntu-xenial-2-node-osic-cloud1-s3500-8527282-539390 LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name guest=instance-0001,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-0001/master-key.aes -machine pc-i440fx-zesty,accel=tcg,usb=off,dump-guest-core=off -m 64 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=15.0.4,serial=84d813aa-3d3e-4250-bcac-cb0c61adf1ef,uuid=9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e,family=Virtual Machine' -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-instance-0001/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -kernel /opt/stack/data/nova/instances/9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e/kernel -initrd /opt/stack/data/nova/instances/9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e/ramdisk -append 'root=/dev/vda console=tty0 console=ttyS0 no_timer_check' -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/opt/stack/data/nova/instances/9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -add-fd set=0,fd=29 -chardev pty,id=charserial0,logfile=/dev/fdset/0,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on char device redirected to /dev/pts/0 (label charserial0) warning: TCG doesn't support 
requested feature: CPUID.01H:ECX.vmx [bit 5] 2017-04-21 13:54:08.505+: initiating migration qemu-system-x86_64: /build/qemu-5OJ39u/qemu-2.8+dfsg/block/io.c:1514: bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. 2017-04-21 13:54:08.791+: shutting down, reason=crashed --- Destination QEMU: --- 2017-04-21 13:54:08.142+: starting up libvirt version: 2.5.0, package: 3ubuntu5~cloud0 (Openstack Ubuntu Testing Bot Tue, 21 Mar 2017 21:54:49 +), qemu version: 2.8.0(Debian 1:2.8+dfsg-3ubuntu2~cloud0), hostname: ubuntu-xenial-2-node-osic-cloud1-s3500-8527282 LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name guest=instance-0001,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-instance-0001/master-key.aes -machine pc-i440fx-zesty,accel=tcg,usb=off,dump-guest-core=off -m 64 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 9bf9f268-5242-4b1d-8fe6-ee348b2b8d3e -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,ve
[Qemu-devel] [PATCH RESEND v2 16/18] filter: Add handle_event method for NetFilterClass
Filter needs to process the event of checkpoint/failover or
other event passed by COLO frame.

Cc: Jason Wang
Signed-off-by: zhanghailiang
---
 include/net/filter.h |  5 +
 net/filter.c         | 16
 net/net.c            | 28
 3 files changed, 49 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 0c4a2ea..df4510d 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -37,6 +37,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,

 typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);

+typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp);
+
 typedef struct NetFilterClass {
     ObjectClass parent_class;

@@ -44,6 +46,7 @@ typedef struct NetFilterClass {
     FilterSetup *setup;
     FilterCleanup *cleanup;
     FilterStatusChanged *status_changed;
+    FilterHandleEvent *handle_event;
     /* mandatory */
     FilterReceiveIOV *receive_iov;
 } NetFilterClass;
@@ -76,4 +79,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     int iovcnt,
                                     void *opaque);

+void colo_notify_filters_event(int event, Error **errp);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index 1dfd2ca..993b35e 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -17,6 +17,7 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
+#include "net/colo.h"

 static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
 {
@@ -245,11 +246,26 @@ static void netfilter_finalize(Object *obj)
     g_free(nf->netdev_id);
 }

+static void dummy_handle_event(NetFilterState *nf, int event, Error **errp)
+{
+    switch (event) {
+    case COLO_CHECKPOINT:
+        break;
+    case COLO_FAILOVER:
+        object_property_set_str(OBJECT(nf), "off", "status", errp);
+        break;
+    default:
+        break;
+    }
+}
+
 static void netfilter_class_init(ObjectClass *oc, void *data)
 {
     UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+    NetFilterClass *nfc = NETFILTER_CLASS(oc);

     ucc->complete = netfilter_complete;
+    nfc->handle_event = dummy_handle_event;
 }

 static const TypeInfo netfilter_info = {
diff --git a/net/net.c b/net/net.c
index 0ac3b9e..1373f63 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1373,6 +1373,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
     }
 }

+void colo_notify_filters_event(int event, Error **errp)
+{
+    NetClientState *nc, *peer;
+    NetClientDriver type;
+    NetFilterState *nf;
+    NetFilterClass *nfc = NULL;
+    Error *local_err = NULL;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        peer = nc->peer;
+        type = nc->info->type;
+        if (!peer || type != NET_CLIENT_DRIVER_NIC) {
+            continue;
+        }
+        QTAILQ_FOREACH(nf, &nc->filters, next) {
+            nfc = NETFILTER_GET_CLASS(OBJECT(nf));
+            if (!nfc->handle_event) {
+                continue;
+            }
+            nfc->handle_event(nf, event, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+        }
+    }
+}
+
 void qmp_set_link(const char *name, bool up, Error **errp)
 {
     NetClientState *ncs[MAX_QUEUE_NUM];
-- 
1.8.3.1
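The dispatch pattern added here (an optional per-class handler, a default that switches the filter off on failover, and a notify loop that stops at the first error) can be sketched in plain C as below. The Filter type, the int return code used in place of QEMU's Error ** convention, and all function names are assumptions made for this illustration; none of them is QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for COLO_CHECKPOINT / COLO_FAILOVER. */
enum { EVENT_CHECKPOINT = 1, EVENT_FAILOVER = 2 };

typedef struct Filter Filter;

/* Returns 0 on success, a negative value on error (assumption:
 * a plain return code stands in for QEMU's Error ** reporting). */
typedef int (*FilterHandleEvent)(Filter *f, int event);

struct Filter {
    bool enabled;                   /* models the "status" property */
    int events_seen;                /* bookkeeping for the example only */
    FilterHandleEvent handle_event; /* may be NULL: filter is skipped */
};

/* Default behaviour: ignore checkpoints, switch off on failover. */
static int default_handle_event(Filter *f, int event)
{
    f->events_seen++;
    if (event == EVENT_FAILOVER) {
        f->enabled = false;
    }
    return 0;
}

/* A handler that always fails, to exercise the error path. */
static int failing_handle_event(Filter *f, int event)
{
    (void)event;
    f->events_seen++;
    return -1;
}

/* Deliver event to each filter in order; stop at the first error. */
static int notify_filters(Filter *filters, size_t n, int event)
{
    size_t i;

    assert(filters != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int ret;

        if (!filters[i].handle_event) {
            continue;
        }
        ret = filters[i].handle_event(&filters[i], event);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

As in the patch, filters that come after a failing one never see the event, so a caller that needs all filters notified would have to handle that case itself.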
[Qemu-devel] [PATCH RESEND v2 09/18] COLO: Flush memory data from ram cache
During the time of VM's running, PVM may dirty some pages, we will transfer
PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint
time. So, the content of SVM's RAM cache will always be same with PVM's memory
after checkpoint.

Instead of flushing all content of PVM's RAM cache into SVM's MEMORY,
we do this in a more efficient way:
Only flush any page that dirtied by PVM since last checkpoint.
In this way, we can ensure SVM's memory same with PVM's.

Besides, we must ensure flush RAM cache before load device state.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
 include/migration/migration.h |  1 +
 migration/ram.c               | 40
 migration/trace-events        |  2 ++
 3 files changed, 43 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba765eb..2aa7654 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -364,4 +364,5 @@ PostcopyState postcopy_state_set(PostcopyState new_state);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_flush_ram_cache(void);
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 0653a24..df10d4b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2602,6 +2602,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
     bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
+    bool need_flush = false;

     seq_iter++;

@@ -2636,6 +2637,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (migration_incoming_in_colo_state()) {
                 host = colo_cache_from_block_offset(block, addr);
+                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2742,6 +2744,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     wait_for_decompress_done();
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
+
+    if (!ret && ram_cache_enable && need_flush) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }

@@ -2810,6 +2816,40 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }

+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that be dirtied by PVM or SVM or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    unsigned long offset = 0;
+
+    trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        offset = migration_bitmap_find_dirty(&ram_state, block, offset);
+        migration_bitmap_clear_dirty(&ram_state, block, offset);
+
+        if (offset << TARGET_PAGE_BITS >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + (offset << TARGET_PAGE_BITS);
+            src_host = block->colo_cache + (offset << TARGET_PAGE_BITS);
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+    assert(ram_state.migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/trace-events b/migration/trace-events
index b8f01a2..93f4337 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -72,6 +72,8 @@ ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %"
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""

 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
1.8.3.1
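The idea behind colo_flush_ram_cache (copy only the pages marked dirty from the cache into guest memory, clearing each dirty bit as it goes) can be shown with a small stand-alone sketch. It uses a flat buffer and a simple linear scan instead of QEMU's RAMBlock list and bitmap helpers; SKETCH_PAGE_SIZE and flush_dirty_pages are hypothetical names, and the tiny page size is only there to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical page size, kept tiny so the example stays small. */
#define SKETCH_PAGE_SIZE 16u

/*
 * Copy every page whose dirty bit is set from cache to mem and clear
 * the bit. dirty holds one bit per page, 64 pages per word.
 * Returns the number of pages copied.
 */
static size_t flush_dirty_pages(uint8_t *mem, const uint8_t *cache,
                                uint64_t *dirty, size_t nr_pages)
{
    size_t page;
    size_t copied = 0;

    assert(mem != NULL && cache != NULL && dirty != NULL);
    for (page = 0; page < nr_pages; page++) {
        uint64_t mask = UINT64_C(1) << (page % 64);

        if (!(dirty[page / 64] & mask)) {
            continue; /* clean page: memory already matches */
        }
        dirty[page / 64] &= ~mask;
        memcpy(mem + page * SKETCH_PAGE_SIZE,
               cache + page * SKETCH_PAGE_SIZE,
               SKETCH_PAGE_SIZE);
        copied++;
    }
    return copied;
}
```

After the call the dirty bitmap is empty, which mirrors the patch's closing assertion that no dirty pages remain once the flush has finished.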
[Qemu-devel] [PATCH RESEND v2 15/18] COLO: flush host dirty ram from cache
Don't need to flush all VM's ram from cache, only
flush the dirty pages since last checkpoint

Cc: Juan Quintela
Signed-off-by: Li Zhijian
Signed-off-by: Zhang Chen
Signed-off-by: zhanghailiang
---
v2:
 - stop dirty log after exit from COLO state. (Dave)
---
 migration/ram.c | 12
 1 file changed, 12 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index f171a82..7bf3515 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2775,6 +2775,7 @@ int colo_init_ram_cache(void)
     ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
     ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
     ram_state.migration_dirty_pages = 0;
+    memory_global_dirty_log_start();

     return 0;

@@ -2798,6 +2799,7 @@ void colo_release_ram_cache(void)

     atomic_rcu_set(&ram_state.ram_bitmap, NULL);
     if (bitmap) {
+        memory_global_dirty_log_stop();
         call_rcu(bitmap, migration_bitmap_free, rcu);
     }

@@ -2822,6 +2824,16 @@ void colo_flush_ram_cache(void)
     void *src_host;
     unsigned long offset = 0;

+    memory_global_dirty_log_sync();
+    qemu_mutex_lock(&ram_state.bitmap_mutex);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        migration_bitmap_sync_range(&ram_state, block, block->offset,
+                                    block->used_length);
+    }
+    rcu_read_unlock();
+    qemu_mutex_unlock(&ram_state.bitmap_mutex);
+
     trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
1.8.3.1
[Qemu-devel] [PATCH RESEND v2 10/18] qmp event: Add COLO_EXIT event to notify users while exited COLO
If some errors happen during VM's COLO FT stage, it's important to
notify the users of this event. Together with 'x_colo_lost_heartbeat',
Users can intervene in COLO's failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify users that we exited COLO mode.

Cc: Markus Armbruster
Cc: Michael Roth
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Eric Blake
---
 migration/colo.c | 19 +++
 qapi-schema.json | 14 ++
 qapi/event.json  | 21 +
 3 files changed, 54 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 9949293..e62da93 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -516,6 +516,18 @@ out:
         qemu_fclose(fb);
     }

+    /*
+     * There are only two reasons we can go here, some error happened.
+     * Or the user triggered failover.
+     */
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
+
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
@@ -757,6 +769,13 @@ out:
     if (local_err) {
         error_report_err(local_err);
     }
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }

     if (fb) {
         qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 4b3e1b7..460ca53 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1233,6 +1233,20 @@
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }

 ##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.10
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
+
+##
 # @x-colo-lost-heartbeat:
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index e80f3f4..924bc6f 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -441,6 +441,27 @@
   'data': { 'pass': 'int' } }

 ##
+# @COLO_EXIT:
+#
+# Emitted when VM finishes COLO mode due to some errors happening or
+# at the request of users.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.10
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST:
 #
 # Emitted when guest executes ACPI _OST method.
-- 
1.8.3.1
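The rule the patch applies on both sides (failover state still 'none' means the exit was caused by an error, anything else means it was requested) can be sketched in a few lines of plain C. The enum and function names below are hypothetical, the failover states are assumed to be the five values visible in the qapi-schema.json context line, and the snprintf output only imitates the "data" member of the documented example event; it is not produced by QEMU's QAPI generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror 'none', 'require', 'active', 'completed', 'relaunch'. */
typedef enum {
    FAILOVER_NONE,
    FAILOVER_REQUIRE,
    FAILOVER_ACTIVE,
    FAILOVER_COMPLETED,
    FAILOVER_RELAUNCH
} FailoverState;

typedef enum { MODE_PRIMARY, MODE_SECONDARY } ColoMode;
typedef enum { EXIT_REQUEST, EXIT_ERROR } ExitReason;

/* No failover in progress means COLO ended because something failed. */
static ExitReason colo_exit_reason(FailoverState state)
{
    return state == FAILOVER_NONE ? EXIT_ERROR : EXIT_REQUEST;
}

/* Write the event's "data" object into buf; returns snprintf's result. */
static int format_colo_exit_data(char *buf, size_t len,
                                 ColoMode mode, FailoverState state)
{
    const char *mode_str = mode == MODE_PRIMARY ? "primary" : "secondary";
    const char *reason_str =
        colo_exit_reason(state) == EXIT_ERROR ? "error" : "request";

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "{\"mode\": \"%s\", \"reason\": \"%s\"}",
                    mode_str, reason_str);
}
```

A management application listening for this event would presumably branch on the "reason" field: "request" confirms a failover it (or a user) asked for, while "error" signals that COLO stopped on its own.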
[Qemu-devel] [PATCH RESEND v2 06/18] COLO: Add block replication into colo process
Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Cc: Stefan Hajnoczi Cc: Kevin Wolf Cc: Max Reitz Cc: Xie Changlong --- migration/colo.c | 50 ++ migration/migration.c | 16 2 files changed, 66 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index c4fc865..9949293 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -23,6 +23,9 @@ #include "qmp-commands.h" #include "net/colo-compare.h" #include "net/colo.h" +#include "qapi-event.h" +#include "block/block.h" +#include "replication.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -57,6 +60,7 @@ static void secondary_vm_do_failover(void) { int old_state; MigrationIncomingState *mis = migration_incoming_get_current(); +Error *local_err = NULL; /* Can not do failover during the process of VM's loading VMstate, Or * it will break the secondary VM. 
@@ -74,6 +78,11 @@ static void secondary_vm_do_failover(void) migrate_set_state(&mis->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); +replication_stop_all(true, &local_err); +if (local_err) { +error_report_err(local_err); +} + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side"); /* recover runstate to normal migration finish state */ @@ -111,6 +120,7 @@ static void primary_vm_do_failover(void) { MigrationState *s = migrate_get_current(); int old_state; +Error *local_err = NULL; migrate_set_state(&s->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); @@ -134,6 +144,13 @@ static void primary_vm_do_failover(void) FailoverStatus_lookup[old_state]); return; } + +replication_stop_all(true, &local_err); +if (local_err) { +error_report_err(local_err); +local_err = NULL; +} + /* Notify COLO thread that failover work is finished */ qemu_sem_post(&s->colo_exit_sem); } @@ -345,6 +362,15 @@ static int colo_do_checkpoint_transaction(MigrationState *s, s->params.shared = 0; qemu_savevm_state_header(fb); qemu_savevm_state_begin(fb, &s->params); + +/* We call this API although this may do nothing on primary side. 
*/ +qemu_mutex_lock_iothread(); +replication_do_checkpoint_all(&local_err); +qemu_mutex_unlock_iothread(); +if (local_err) { +goto out; +} + qemu_mutex_lock_iothread(); qemu_savevm_state_complete_precopy(fb, false); qemu_mutex_unlock_iothread(); @@ -451,6 +477,12 @@ static void colo_process_checkpoint(MigrationState *s) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_PRIMARY, &local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vm_start(); qemu_mutex_unlock_iothread(); trace_colo_vm_state_change("stop", "run"); @@ -554,6 +586,7 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request, case COLO_MESSAGE_GUEST_SHUTDOWN: qemu_mutex_lock_iothread(); vm_stop_force_state(RUN_STATE_COLO); +replication_stop_all(false, NULL); qemu_system_shutdown_request_core(); qemu_mutex_unlock_iothread(); /* @@ -602,6 +635,11 @@ void *colo_process_incoming_thread(void *opaque) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_SECONDARY, &local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} vm_start(); trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); @@ -682,6 +720,18 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +replication_get_error_all(&local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} +/* discard colo disk buffer */ +replication_do_checkpoint_all(&local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vmstate_loading = false; vm_start(); trace_colo_vm_state_change("stop", "run"); diff --git a/migration/migration.c b/migration/migration.c index 2ade2aa..755ea54 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -394,6 +394,7 @@ static void process_incoming_migration_co(void *opaque) MigrationIncomingState *mis = migration_incoming_get_current(); PostcopyState ps; int ret; +Error *local_err = NULL; mis->from_src_file = f; 
mis->largest_page_size = qemu_ram_pagesize_largest(); @@ -425,6 +426,21 @@ static void process_incomi
[Qemu-devel] [PATCH RESEND v2 02/18] colo-compare: implement the process of checkpoint
While doing a checkpoint, we need to flush all the unhandled packets. By using the filter notifier mechanism, we can easily notify every compare object to do this. The flush runs inside the compare threads as a coroutine.

Cc: Jason Wang
Signed-off-by: zhanghailiang
Signed-off-by: Zhang Chen
---
 net/colo-compare.c | 78 ++
 net/colo-compare.h | 6 +
 2 files changed, 84 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 97bf0e5..3adccfb 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,24 @@
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"

 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)

+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024

 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000

+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER };
+static int event_unhandled_count;
 /*
   + CompareState ++
   |               |
@@ -87,6 +94,10 @@ typedef struct CompareState {

     GMainContext *worker_context;
     GMainLoop *compare_loop;
+    /* Used for COLO to notify compare to do something */
+    FilterNotifier *notifier;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;

 typedef struct CompareClass {
@@ -417,6 +428,11 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -538,6 +554,53 @@ static gboolean check_old_packet_regular(void *opaque)
     return TRUE;
 }

+/* Public API, Used for COLO frame to notify compare event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+    int ret;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        ret = filter_notifier_set(s->notifier, event);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write value to eventfd");
+            goto fail;
+        }
+        event_unhandled_count++;
+    }
+    /* Wait all compare threads to finish handling this event */
+    while (event_unhandled_count > 0) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+fail:
+    qemu_mutex_unlock(&event_mtx);
+}
+
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque, int event)
+{
+    FilterNotifier *notify = opaque;
+    CompareState *s = notify->opaque;
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    assert(event_unhandled_count > 0);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void *colo_compare_thread(void *opaque)
 {
     CompareState *s = opaque;
@@ -558,10 +621,15 @@ static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);

+    s->notifier = filter_notifier_new(colo_compare_handle_event, s, NULL);
+    g_source_attach(&s->notifier->source, s->worker_context);
+
     qemu_sem_post(&s->thread_ready);

     g_main_loop_run(s->compare_loop);

+    g_source_destroy(&s->notifier->source);
+    g_source_unref(&s->notifier->source);
     g_source_destroy(timeout_source);
     g_source_unref(timeout_source);

@@ -706,6 +774,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);

+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);

     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -765,6 +835,7 @@ static void colo_compare_init(Object *obj)
 static void colo_compare_finalize(Object *obj)
 {
     CompareState *s = COLO_COMPARE(obj);
+    CompareState *tmp = NULL;

     qemu_chr_fe_set_handlers(&s->chr_pri_in, NULL, NULL, NULL, NULL,
                              s->worker_context, true);
@@ -777,6 +848,13 @@ static void colo_compare_finalize(Object *o
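For readers following the event_unhandled_count logic in this patch, here is a minimal standalone sketch of the same rendezvous using plain pthreads. It is not QEMU code: the pending[] flags stand in for the per-object eventfd notifier, and notify_all_and_wait(), worker() and MAX_WORKERS are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 8

static pthread_mutex_t event_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t event_posted = PTHREAD_COND_INITIALIZER;
static pthread_cond_t event_complete = PTHREAD_COND_INITIALIZER;
static int pending[MAX_WORKERS];   /* hypothetical stand-in for the eventfd */
static int event_unhandled_count;  /* events posted but not yet handled */
static int handled_total;

/* Worker: wait for its event, handle it, then report completion. */
static void *worker(void *arg)
{
    int id = *(int *)arg;

    pthread_mutex_lock(&event_mtx);
    while (!pending[id]) {
        pthread_cond_wait(&event_posted, &event_mtx);
    }
    pending[id] = 0;
    pthread_mutex_unlock(&event_mtx);

    /* A real handler would flush its queued packets here, unlocked. */

    pthread_mutex_lock(&event_mtx);
    assert(event_unhandled_count > 0);
    event_unhandled_count--;
    handled_total++;
    pthread_cond_broadcast(&event_complete);
    pthread_mutex_unlock(&event_mtx);
    return NULL;
}

/* Post one event to every worker and block until all have handled it. */
int notify_all_and_wait(int nworkers)
{
    pthread_t tid[MAX_WORKERS];
    int ids[MAX_WORKERS];
    int i, rc;

    if (nworkers < 0 || nworkers > MAX_WORKERS) {
        return -1;
    }
    handled_total = 0;
    for (i = 0; i < nworkers; i++) {
        ids[i] = i;
        pending[i] = 0;
        rc = pthread_create(&tid[i], NULL, worker, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }

    pthread_mutex_lock(&event_mtx);
    for (i = 0; i < nworkers; i++) {
        pending[i] = 1;
        event_unhandled_count++;
    }
    pthread_cond_broadcast(&event_posted);
    /* The counter is only decremented under event_mtx, so this cannot miss. */
    while (event_unhandled_count > 0) {
        pthread_cond_wait(&event_complete, &event_mtx);
    }
    pthread_mutex_unlock(&event_mtx);

    for (i = 0; i < nworkers; i++) {
        pthread_join(tid[i], NULL);
    }
    return handled_total;
}
```

Because the notifier holds event_mtx from the first increment until it sleeps in the condition wait, a worker can never observe a zero counter before decrementing it, which is what the assert() in the patch relies on.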
[Qemu-devel] [PATCH RESEND v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly
We should not load the PVM's state directly into the SVM, because errors may occur while the SVM is receiving data, which would break the SVM. We need to ensure that all data has been received before loading the state into the SVM. We use extra memory to cache this data (the PVM's RAM). The RAM cache on the secondary side is initially the same as the SVM/PVM's memory. During a checkpoint, we first cache the PVM's dirty pages into this RAM cache, so the cache always matches the PVM's memory at every checkpoint. We then flush the cached RAM to the SVM after we have received all of the PVM's state.

Cc: Dr. David Alan Gilbert
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
---
v2:
- Move colo_init_ram_cache() and colo_release_ram_cache() out of the incoming thread, since both of them need the global lock. Keeping colo_release_ram_cache() in the incoming thread could deadlock.
- Remove the bool ram_cache_enable flag and use migration_incoming_in_colo_state() instead.
- Remove the Reviewed-by tag because of the above changes.
---
 include/exec/ram_addr.h | 1 +
 include/migration/migration.h | 4 +++
 migration/migration.c | 6
 migration/ram.c | 71 ++-
 4 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index c9ddcd0..0b3d77c 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba1a16c..ba765eb 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -360,4 +360,8 @@ uint64_t ram_pagesize_summary(void);
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
 PostcopyState postcopy_state_set(PostcopyState new_state);
+
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index 755ea54..7419404 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -441,6 +441,10 @@ static void process_incoming_migration_co(void *opaque)
             error_report_err(local_err);
             exit(EXIT_FAILURE);
         }
+        if (colo_init_ram_cache() < 0) {
+            error_report("Init ram cache failed");
+            exit(EXIT_FAILURE);
+        }
         mis->migration_incoming_co = qemu_coroutine_self();
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
@@ -449,6 +453,8 @@ static void process_incoming_migration_co(void *opaque)

         /* Wait checkpoint incoming thread exit before free resource */
         qemu_thread_join(&mis->colo_incoming_thread);
+        /* We hold the global iothread lock, so it is safe here */
+        colo_release_ram_cache();
     }

     if (ret < 0) {
diff --git a/migration/ram.c b/migration/ram.c
index f48664e..05d1b06 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2265,6 +2265,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }

+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+                                                 ram_addr_t offset)
+{
+    if (!offset_in_ramblock(block, offset)) {
+        return NULL;
+    }
+    if (!block->colo_cache) {
+        error_report("%s: colo_cache is NULL in block :%s",
+                     __func__, block->idstr);
+        return NULL;
+    }
+    return block->colo_cache + offset;
+}
+
 /**
  * ram_handle_compressed: handle the zero page case
  *
@@ -2605,7 +2619,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);

-            host = host_from_ram_block_offset(block, addr);
+            /* After going into COLO, we should load the Page into colo_cache */
+            if (migration_incoming_in_colo_state()) {
+                host = colo_cache_from_block_offset(block, addr);
+            } else {
+                host = host_from_ram_block_offset(block, addr);
+            }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
@@ -2712,6 +2731,56 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }

+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM, it is need to hold the global lock
+ * to call this helper.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_
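The staging idea in this patch (load into a cache first, publish to guest RAM only once everything has arrived) can be shown with a small standalone sketch. This is only an illustration under simplifying assumptions: StagedRam and the staged_ram_*() helpers are hypothetical names, there is a single block, and no locking or RCU is modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a RAMBlock with a COLO cache. */
typedef struct {
    uint8_t *host;   /* memory the running secondary VM sees */
    uint8_t *cache;  /* staging copy that tracks the primary's memory */
    size_t len;
} StagedRam;

/* Allocate both buffers; the cache starts as a copy of host memory. */
int staged_ram_init(StagedRam *r, size_t len)
{
    r->host = calloc(1, len);
    r->cache = malloc(len);
    r->len = len;
    if (!r->host || !r->cache) {
        free(r->host);
        free(r->cache);
        r->host = r->cache = NULL;
        return -1;
    }
    memcpy(r->cache, r->host, len);
    return 0;
}

/* Incoming page data lands in the cache only; host memory is untouched. */
int staged_ram_load(StagedRam *r, size_t off, const void *data, size_t n)
{
    if (off > r->len || n > r->len - off) {
        return -1;  /* out of range, like an illegal RAM offset */
    }
    memcpy(r->cache + off, data, n);
    return 0;
}

/* Called once the whole checkpoint has been received: publish it. */
void staged_ram_flush(StagedRam *r)
{
    memcpy(r->host, r->cache, r->len);
}

void staged_ram_release(StagedRam *r)
{
    free(r->host);
    free(r->cache);
    r->host = r->cache = NULL;
}
```

If the transfer fails part way through, the caller simply never calls staged_ram_flush(), so host memory still holds the last consistent state.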
[Qemu-devel] [PATCH RESEND v2 18/18] COLO: notify net filters about checkpoint/failover event
Notify all net filters about the checkpoint and failover events.

Cc: Jason Wang
Signed-off-by: zhanghailiang
---
 migration/colo.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 66bb5b2..62f58c6 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -26,6 +26,7 @@
 #include "qapi-event.h"
 #include "block/block.h"
 #include "replication.h"
+#include "net/filter.h"

 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void)
     if (local_err) {
         error_report_err(local_err);
     }
+    /* Notify all filters of all NIC to do failover */
+    colo_notify_filters_event(COLO_FAILOVER, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }

     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -794,6 +800,13 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }

+        /* Notify all filters of all NIC to do checkpoint */
+        colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
-- 
1.8.3.1
[Qemu-devel] [PATCH RESEND v2 08/18] ram/COLO: Record the dirty pages that SVM received
We record the addresses of the dirty pages that are received. This helps when flushing the cached pages into the SVM. The trick here is that we record the dirty pages by reusing the migration dirty bitmap. A later patch will start dirty logging for the SVM, just as migration does. In this way, we can record the pages dirtied by both the PVM and the SVM, and flush only those dirty pages from the RAM cache while doing a checkpoint.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/ram.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 05d1b06..0653a24 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2268,6 +2268,9 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
 static inline void *colo_cache_from_block_offset(RAMBlock *block,
                                                  ram_addr_t offset)
 {
+    unsigned long *bitmap;
+    long k;
+
     if (!offset_in_ramblock(block, offset)) {
         return NULL;
     }
@@ -2276,6 +2279,17 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
                      __func__, block->idstr);
         return NULL;
     }
+
+    k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS;
+    bitmap = atomic_rcu_read(&ram_state.ram_bitmap)->bmap;
+    /*
+    * During colo checkpoint, we need bitmap of these migrated pages.
+    * It help us to decide which pages in ram cache should be flushed
+    * into VM's RAM later.
+    */
+    if (!test_and_set_bit(k, bitmap)) {
+        ram_state.migration_dirty_pages++;
+    }
     return block->colo_cache + offset;
 }

@@ -2752,6 +2766,15 @@ int colo_init_ram_cache(void)
         memcpy(block->colo_cache, block->host, block->used_length);
     }
     rcu_read_unlock();
+    /*
+    * Record the dirty pages that sent by PVM, we use this dirty bitmap together
+    * with to decide which page in cache should be flushed into SVM's RAM. Here
+    * we use the same name 'ram_bitmap' as for migration.
+    */
+    ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
+    ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
+    ram_state.migration_dirty_pages = 0;
+
     return 0;

 out_locked:
@@ -2770,6 +2793,12 @@ out_locked:
 void colo_release_ram_cache(void)
 {
     RAMBlock *block;
+    RAMBitmap *bitmap = ram_state.ram_bitmap;
+
+    atomic_rcu_set(&ram_state.ram_bitmap, NULL);
+    if (bitmap) {
+        call_rcu(bitmap, migration_bitmap_free, rcu);
+    }

     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-- 
1.8.3.1
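As a rough illustration of why recording received pages in a bitmap pays off, the sketch below marks pages with a test-and-set and later copies only the marked pages from the cache. It is a simplified, single-threaded model and not the QEMU implementation: DEMO_PAGE_SIZE, DEMO_PAGES, record_received() and flush_dirty() are hypothetical names, the test-and-set is not atomic, and the flush step is only an assumption about how the bitmap is consumed by the later "Flush memory data from ram cache" patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_PAGES 8u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

static unsigned long dirty_bitmap[(DEMO_PAGES + BITS_PER_WORD - 1) / BITS_PER_WORD];
static long dirty_pages;

/* Non-atomic test-and-set: returns the previous value of the page's bit. */
static int test_and_set_page(unsigned long page)
{
    unsigned long mask = 1UL << (page % BITS_PER_WORD);
    unsigned long *word = &dirty_bitmap[page / BITS_PER_WORD];
    int old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/* Record that the page containing 'offset' was received into the cache. */
void record_received(size_t offset)
{
    if (offset >= (size_t)DEMO_PAGES * DEMO_PAGE_SIZE) {
        return;
    }
    if (!test_and_set_page(offset / DEMO_PAGE_SIZE)) {
        dirty_pages++;  /* count each page once, however often it is hit */
    }
}

/* Copy only the recorded pages from cache to host; returns pages copied. */
long flush_dirty(uint8_t *host, const uint8_t *cache)
{
    unsigned long page;
    long copied = 0;

    for (page = 0; page < DEMO_PAGES; page++) {
        unsigned long mask = 1UL << (page % BITS_PER_WORD);
        unsigned long *word = &dirty_bitmap[page / BITS_PER_WORD];

        if (*word & mask) {
            memcpy(host + page * DEMO_PAGE_SIZE,
                   cache + page * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
            *word &= ~mask;
            copied++;
        }
    }
    dirty_pages -= copied;
    return copied;
}
```

Two writes to the same page bump the counter once, which mirrors the `if (!test_and_set_bit(k, bitmap))` guard around migration_dirty_pages++ in the patch.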
[Qemu-devel] [PATCH RESEND v2 00/18] COLO: integrate colo frame with block replication and net compare
Hi,

(Sorry, I misspelled Dave's email address, so I am resending this series.)

The COLO frame, block replication and COLO net compare have existed in QEMU for a long time. It's time to integrate these three parts to make COLO really work.

This series contains some optimizations for the COLO frame, including separating the process of saving RAM and device state, and using a COLO_EXIT event to notify users that the VM has exited COLO. Most of these parts were reviewed a long time ago in older versions, but since this series has just been rebased on upstream, which has merged a new series of migration changes, some of the patches in this series deserve another review.

We use a notifier/callback method for COLO compare to notify the COLO frame about inconsistent network packets, and add a handle_event method to NetFilterClass to help the COLO frame notify filters and colo-compare about checkpoint/failover events, which keeps it flexible.

Besides, this series is on top of the '[PATCH 0/3] colo-compare: fix three bugs' series.

For the newest version, please refer to:
https://github.com/coloft/qemu/tree/colo-for-qemu-2.10-2017-4-22

Please review, thanks.
Cc: Dong eddie
Cc: Jiang yunhong
Cc: Xu Quan
Cc: Jason Wang

zhanghailiang (18):
  net/colo: Add notifier/callback related helpers for filter
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Handle shutdown command for VM in COLO state
  COLO: Add block replication into colo process
  COLO: Load dirty pages into SVM's RAM cache firstly
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush memory data from ram cache
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  savevm: split save/find loadvm_handlers entry into two helper functions
  savevm: split the process of different stages for loadvm/savevm
  COLO: Separate the process of saving/loading ram and device state
  COLO: Split qemu_savevm_state_begin out of checkpoint process
  COLO: flush host dirty ram from cache
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event
  COLO: notify net filters about checkpoint/failover event

 include/exec/ram_addr.h | 1 +
 include/migration/colo.h | 1 +
 include/migration/migration.h | 5 +
 include/net/filter.h | 5 +
 include/sysemu/sysemu.h | 9 ++
 migration/colo.c | 242 +++---
 migration/migration.c | 24 -
 migration/ram.c | 147 -
 migration/savevm.c | 113
 migration/trace-events | 2 +
 net/colo-compare.c | 110 ++-
 net/colo-compare.h | 8 ++
 net/colo.c | 105 ++
 net/colo.h | 19
 net/filter-rewriter.c | 39 +++
 net/filter.c | 16 +++
 net/net.c | 28 +
 qapi-schema.json | 18 +++-
 qapi/event.json | 21
 vl.c | 19 +++-
 20 files changed, 886 insertions(+), 46 deletions(-)
 create mode 100644 net/colo-compare.h

-- 
1.8.3.1
[Qemu-devel] [PATCH RESEND v2 04/18] COLO: integrate colo compare with colo frame
For COLO FT, the PVM and SVM run at the same time and only sync their state when needed. So let the SVM run while no checkpoint is in progress, and change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200 * 100. Besides, we forgot to release colo_checkpoint_sem and colo_delay_timer, so fix that here.

Cc: Jason Wang
Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c | 42 --
 migration/migration.c | 2 +-
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index c19eb3f..a3344ce 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -21,8 +21,11 @@
 #include "migration/failover.h"
 #include "replication.h"
 #include "qmp-commands.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"

 static bool vmstate_loading;
+static Notifier packets_compare_notifier;

 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)

@@ -332,6 +335,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }

+    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
@@ -390,6 +398,11 @@ out:
     return ret;
 }

+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+    colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -406,6 +419,9 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }

+    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+    colo_compare_register_notifier(&packets_compare_notifier);
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -451,11 +467,21 @@ out:
         qemu_fclose(fb);
     }

-    timer_del(s->colo_delay_timer);
-
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
+
+    /*
+     * It is safe to unregister notifier after failover finished.
+     * Besides, colo_delay_timer and colo_checkpoint_sem can't be
+     * released before unregister notifier, or there will be use-after-free
+     * error.
+     */
+    colo_compare_unregister_notifier(&packets_compare_notifier);
+    timer_del(s->colo_delay_timer);
+    timer_free(s->colo_delay_timer);
+    qemu_sem_destroy(&s->colo_checkpoint_sem);
+
     /*
      * Must be called after failover BH is completed,
      * Or the failover BH may shutdown the wrong fd that
@@ -548,6 +574,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));

+    qemu_mutex_lock_iothread();
+    vm_start();
+    trace_colo_vm_state_change("stop", "run");
+    qemu_mutex_unlock_iothread();
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -567,6 +598,11 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }

+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        trace_colo_vm_state_change("run", "stop");
+        qemu_mutex_unlock_iothread();
+
         /* FIXME: This is unnecessary for periodic checkpoint mode */
         colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                      &local_err);
@@ -620,6 +656,8 @@ void *colo_process_incoming_thread(void *opaque)
         }

         vmstate_loading = false;
+        vm_start();
+        trace_colo_vm_state_change("stop", "run");
         qemu_mutex_unlock_iothread();

         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index 353f272..2ade2aa 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -70,7 +70,7 @@
 /* The delay time (in ms) between two COLO checkpoints
  * Note: Please change this default value to 1 when we support hybrid mode.
  */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)

 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
-- 
1.8.3.1
[Qemu-devel] [PATCH RESEND v2 03/18] colo-compare: use notifier to notify packets comparing result
It's a good idea to use a notifier to notify the COLO frame when packet comparison finds an inconsistency.

Cc: Jason Wang
Signed-off-by: Zhang Chen
Signed-off-by: zhanghailiang
---
 net/colo-compare.c | 32
 net/colo-compare.h | 2 ++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 3adccfb..bb234dd 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -30,6 +30,7 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "net/colo-compare.h"
+#include "migration/migration.h"

 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -38,6 +39,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);

+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024

@@ -384,6 +388,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }

+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static void colo_old_packet_check_one_conn(void *opaque,
                                            void *user_data)
 {
@@ -397,7 +417,7 @@ static void colo_old_packet_check_one_conn(void *opaque,

     if (result) {
         /* do checkpoint will flush old packet */
-        /* TODO: colo_notify_checkpoint();*/
+        colo_compare_inconsistent_notify();
     }
 }

@@ -415,7 +435,10 @@ static void colo_old_packet_check(void *opaque)

 /*
  * Called from the compare thread on the primary
- * for compare connection
+ * for compare connection.
+ * TODO: Reconstruct this function, we should hold the max handled sequence
+ * of the connect, Don't trigger a checkpoint request if we only get packets
+ * from one side (primary or secondary).
  */
 static void colo_compare_connection(void *opaque, void *user_data)
 {
@@ -464,11 +487,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
             /*
              * If one packet arrive late, the secondary_list or
              * primary_list will be empty, so we can't compare it
-             * until next comparison.
+             * until next comparison. If the packets in the list are
+             * timeout, it will trigger a checkpoint request.
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index c9c62f5..a0b573e 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -2,5 +2,7 @@
 #define QEMU_COLO_COMPARE_H

 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);

 #endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1
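For anyone unfamiliar with the notifier pattern this patch relies on, the following is a minimal standalone sketch of register / notify / unregister. It does not use QEMU's Notifier or NotifierList: the Demo* types and demo_*() functions are hypothetical and use a plain singly linked list, so unlike notifier_remove() the remove helper here also needs the list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoNotifier DemoNotifier;
struct DemoNotifier {
    void (*notify)(DemoNotifier *n, void *data);
    DemoNotifier *next;
};

typedef struct {
    DemoNotifier *head;
} DemoNotifierList;

/* Register a callback at the head of the list. */
void demo_notifier_add(DemoNotifierList *l, DemoNotifier *n)
{
    n->next = l->head;
    l->head = n;
}

/* Unregister a callback; does nothing if it is not on the list. */
void demo_notifier_remove(DemoNotifierList *l, DemoNotifier *n)
{
    DemoNotifier **pp;

    for (pp = &l->head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            n->next = NULL;
            return;
        }
    }
}

/* Invoke every registered callback with the same payload. */
void demo_notifier_list_notify(DemoNotifierList *l, void *data)
{
    DemoNotifier *n, *next;

    for (n = l->head; n; n = next) {
        next = n->next;  /* saved first, so a callback may remove itself */
        n->notify(n, data);
    }
}

/* Example callback: counts how many checkpoint requests were raised. */
void demo_count_request(DemoNotifier *n, void *data)
{
    (void)n;
    ++*(int *)data;
}
```

The compare side only ever calls the notify function, so it does not need to know who is listening. That decoupling is what lets colo-compare raise a checkpoint request without depending on the COLO frame's internals.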
[Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state
If the VM is in COLO FT state, we need to do some extra work before starting the normal shutdown process. The Secondary VM will ignore the shutdown command if users issue it directly to the Secondary VM. COLO captures the shutdown request from the user and performs the shutdown only after the current checkpoint transaction has finished.

Cc: Paolo Bonzini
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
 include/migration/colo.h | 1 +
 include/sysemu/sysemu.h | 3 +++
 migration/colo.c | 46 +-
 qapi-schema.json | 4 +++-
 vl.c | 19 ---
 5 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2bbff9e..aadd040 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(MigrationState *s);

 void colo_checkpoint_notify(void *opaque);
+bool colo_handle_shutdown(void);
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 16175f7..8054f53 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -49,6 +49,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;

+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -56,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index a3344ce..c4fc865 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -384,6 +384,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }

+    if (colo_shutdown_requested) {
+        colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN,
+                          &local_err);
+        if (local_err) {
+            error_free(local_err);
+            /* Go on the shutdown process and throw the error message */
+            error_report("Failed to send shutdown message to SVM");
+        }
+        qemu_fflush(s->to_dst_file);
+        colo_shutdown_requested = 0;
+        qemu_system_shutdown_request_core();
+        /* Fix me: Just let the colo thread exit ? */
+        qemu_thread_exit(0);
+    }
+
     ret = 0;

     qemu_mutex_lock_iothread();
@@ -449,7 +464,9 @@ static void colo_process_checkpoint(MigrationState *s)
             goto out;
         }

-        qemu_sem_wait(&s->colo_checkpoint_sem);
+        if (!colo_shutdown_requested) {
+            qemu_sem_wait(&s->colo_checkpoint_sem);
+        }

         ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
@@ -534,6 +551,16 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     case COLO_MESSAGE_CHECKPOINT_REQUEST:
         *checkpoint_request = 1;
         break;
+    case COLO_MESSAGE_GUEST_SHUTDOWN:
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_system_shutdown_request_core();
+        qemu_mutex_unlock_iothread();
+        /*
+         * The main thread will be exit and terminate the whole
+         * process, do need some cleanup ?
+         */
+        qemu_thread_exit(0);
     default:
         *checkpoint_request = 0;
         error_setg(errp, "Got unknown COLO message: %d", msg);
@@ -696,3 +723,20 @@ out:

     return NULL;
 }
+
+bool colo_handle_shutdown(void)
+{
+    /*
+     * If VM is in COLO-FT mode, we need do some significant work before
+     * respond to the shutdown request. Besides, Secondary VM will ignore
+     * the shutdown request from users.
+     */
+    if (migration_incoming_in_colo_state()) {
+        return true;
+    }
+    if (migration_in_colo_state()) {
+        colo_shutdown_requested = 1;
+        return true;
+    }
+    return false;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 01b087f..4b3e1b7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1187,12 +1187,14 @@
 #
 # @vmstate-loaded: VM's state has been loaded by SVM.
 #
+# @guest-shutdown: shutdown requested from PVM to SVM. (Since 2.9)
+#
+# Since: 2.8
 ##
 { 'enum': 'COLOMessage',
   'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
             'vmstate-send', 'vmstate-size', 'vmstate-received',
-            'vmstate-loaded' ] }
+            'vmstate-loaded', 'guest-shutdown' ] }

 ##
 # @COLOMode:
diff --git a/vl.c b/vl.c
index 0b4ed52..72638c9 100644
--- a/vl.c
+++ b/vl.c
@@ -1611,6 +1611,8 @@ static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers)
[Qemu-devel] [PATCH RESEND v2 14/18] COLO: Split qemu_savevm_state_begin out of checkpoint process
It is unnecessary to call qemu_savevm_state_begin() in every checkpoint. It mainly sets up devices and does the first device state pass, and this data does not change during later checkpoints. So we split it out of colo_do_checkpoint_transaction(). This way, the data is not transferred again in subsequent checkpoints.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c | 51 ---
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 8e27a4c..66bb5b2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -362,16 +362,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }

-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("Save VM state begin error");
-        goto out;
-    }
-
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
@@ -459,6 +449,21 @@ static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
     colo_checkpoint_notify(data);
 }

+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+    }
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -478,6 +483,11 @@ static void colo_process_checkpoint(MigrationState *s)
     packets_compare_notifier.notify = colo_compare_notify_checkpoint;
     colo_compare_register_notifier(&packets_compare_notifier);

+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -628,6 +638,17 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     }
 }

+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("Load VM state begin error, ret = %d", ret);
+    }
+    return ret;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -662,6 +683,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));

+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
     if (local_err) {
@@ -709,11 +735,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }

-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("Load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_loadvm_state_main(mis->from_src_file, mis);
         if (ret < 0) {
             error_report("Load VM's live state (ram) error");
-- 
1.8.3.1
[Qemu-devel] [PATCH RESEND v2 12/18] savevm: split the process of different stages for loadvm/savevm
There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_loadvm_state_begin(), qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Cc: Juan Quintela Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v2: - Use the wrapper qemu_savevm_state_header() to simplify the codes of qemu_save_device_state() (Dave's suggestion) --- include/sysemu/sysemu.h | 6 ++ migration/savevm.c | 54 ++--- 2 files changed, 53 insertions(+), 7 deletions(-) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 8054f53..0255c4e 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, uint64_t *start_list, uint64_t *length_list); +void qemu_savevm_live_state(QEMUFile *f); +int qemu_save_device_state(QEMUFile *f); + int qemu_loadvm_state(QEMUFile *f); +int qemu_loadvm_state_begin(QEMUFile *f); +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); +int qemu_load_device_state(QEMUFile *f); extern int autostart; diff --git a/migration/savevm.c b/migration/savevm.c index f87cd8d..8c2ce0b 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -54,6 +54,7 @@ #include "qemu/cutils.h" #include "io/channel-buffer.h" #include "io/channel-file.h" +#include "migration/colo.h" #ifndef ETH_P_RARP #define ETH_P_RARP 0x8035 @@ 
-1285,13 +1286,20 @@ done: return ret; } -static int qemu_save_device_state(QEMUFile *f) +void qemu_savevm_live_state(QEMUFile *f) { -SaveStateEntry *se; +/* save QEMU_VM_SECTION_END section */ +qemu_savevm_state_complete_precopy(f, true); +qemu_put_byte(f, QEMU_VM_EOF); +} -qemu_put_be32(f, QEMU_VM_FILE_MAGIC); -qemu_put_be32(f, QEMU_VM_FILE_VERSION); +int qemu_save_device_state(QEMUFile *f) +{ +SaveStateEntry *se; +if (!migration_in_colo_state()) { +qemu_savevm_state_header(f); +} cpu_synchronize_all_states(); QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { @@ -1342,8 +1350,6 @@ enum LoadVMExitCodes { LOADVM_QUIT = 1, }; -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); - /* -- incoming postcopy messages -- */ /* 'advise' arrives before any transfers just to tell us that a postcopy * *might* happen - it might be skipped if precopy transferred everything @@ -1957,7 +1963,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis) return 0; } -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis) +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis) { uint8_t section_type; int ret = 0; @@ -2095,6 +2101,40 @@ int qemu_loadvm_state(QEMUFile *f) return ret; } +int qemu_loadvm_state_begin(QEMUFile *f) +{ +MigrationIncomingState *mis = migration_incoming_get_current(); +Error *local_err = NULL; +int ret; + +if (qemu_savevm_state_blocked(&local_err)) { +error_report_err(local_err); +return -EINVAL; +} +/* Load QEMU_VM_SECTION_START section */ +ret = qemu_loadvm_state_main(f, mis); +if (ret < 0) { +error_report("Failed to loadvm begin work: %d", ret); +} +return ret; +} + +int qemu_load_device_state(QEMUFile *f) +{ +MigrationIncomingState *mis = migration_incoming_get_current(); +int ret; + +/* Load QEMU_VM_SECTION_FULL section */ +ret = qemu_loadvm_state_main(f, mis); +if (ret < 0) { +error_report("Failed to load device state: %d", ret); +return ret; +} + 
+cpu_synchronize_all_post_init(); +return 0; +} + int save_vmstate(Monitor *mon, const char *name) { BlockDriverState *bs, *bs1; -- 1.8.3.1
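The new load helpers above all funnel into qemu_loadvm_state_main(). A toy sketch of why one section loop can serve several stages follows. It is not QEMU's wire format: the marker values, `Stream` and `load_stage` are hypothetical, and it assumes (the patch does not state this for every stage) that each stage of the stream is terminated by an end-of-stage marker.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy section markers; illustrative values only, not taken from the patch. */
enum { SEC_EOF = 0, SEC_START = 1, SEC_PART = 2, SEC_END = 3, SEC_FULL = 4 };

/* Hypothetical in-memory stand-in for a QEMUFile. */
typedef struct Stream {
    const uint8_t *buf;
    size_t len;
    size_t pos;
} Stream;

/*
 * Consume section markers up to and including the next SEC_EOF.
 * Returns the number of sections consumed in this stage, or -1 if the
 * stream ends without a marker or contains an unknown section type.
 */
static int load_stage(Stream *s, unsigned seen[5])
{
    int n = 0;

    while (s->pos < s->len) {
        uint8_t type = s->buf[s->pos++];

        if (type == SEC_EOF) {
            return n;          /* end of this stage; caller may call again */
        }
        if (type > SEC_FULL) {
            return -1;         /* unknown section type */
        }
        seen[type]++;
        n++;
    }
    return -1;                 /* ran out of data before the stage ended */
}
```

Calling `load_stage()` once per stage on the same stream picks up where the previous call stopped, which is the property the split helpers rely on in this sketch.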
[Qemu-devel] [PATCH v2 10/18] qmp event: Add COLO_EXIT event to notify users while exited COLO
If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x_colo_lost_heartbeat', Users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users that we exited COLO mode. Cc: Markus Armbruster Cc: Michael Roth Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Eric Blake --- migration/colo.c | 19 +++ qapi-schema.json | 14 ++ qapi/event.json | 21 + 3 files changed, 54 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 9949293..e62da93 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -516,6 +516,18 @@ out: qemu_fclose(fb); } +/* + * There are only two reasons we can go here, some error happened. + * Or the user triggered failover. + */ +if (failover_get_state() == FAILOVER_STATUS_NONE) { +qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_ERROR, NULL); +} else { +qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_REQUEST, NULL); +} + /* Hope this not to be too long to wait here */ qemu_sem_wait(&s->colo_exit_sem); qemu_sem_destroy(&s->colo_exit_sem); @@ -757,6 +769,13 @@ out: if (local_err) { error_report_err(local_err); } +if (failover_get_state() == FAILOVER_STATUS_NONE) { +qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_ERROR, NULL); +} else { +qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_REQUEST, NULL); +} if (fb) { qemu_fclose(fb); diff --git a/qapi-schema.json b/qapi-schema.json index 4b3e1b7..460ca53 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1233,6 +1233,20 @@ 'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] } ## +# @COLOExitReason: +# +# The reason for a COLO exit +# +# @request: COLO exit is due to an external request +# +# @error: COLO exit is due to an internal error +# +# Since: 2.10 +## +{ 'enum': 'COLOExitReason', + 'data': [ 'request', 'error' ] } + +## # 
@x-colo-lost-heartbeat: # # Tell qemu that heartbeat is lost, request it to do takeover procedures. diff --git a/qapi/event.json b/qapi/event.json index e80f3f4..924bc6f 100644 --- a/qapi/event.json +++ b/qapi/event.json @@ -441,6 +441,27 @@ 'data': { 'pass': 'int' } } ## +# @COLO_EXIT: +# +# Emitted when VM finishes COLO mode due to some errors happening or +# at the request of users. +# +# @mode: which COLO mode the VM was in when it exited. +# +# @reason: describes the reason for the COLO exit. +# +# Since: 2.10 +# +# Example: +# +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172}, +# "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } } +# +## +{ 'event': 'COLO_EXIT', + 'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } } + +## # @ACPI_DEVICE_OST: # # Emitted when guest executes ACPI _OST method. -- 1.8.3.1
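As a small illustration of the reason selection in this patch, the sketch below maps a failover state to the `reason` string and formats the event's data payload. The names `FailoverState`, `exit_reason` and `format_colo_exit` are hypothetical, the two-value state enum is a simplification, and the timestamp that a real QMP event carries is omitted.

```c
#include <stdio.h>

/* Simplified, hypothetical stand-in for the failover status. */
typedef enum { FAILOVER_NONE, FAILOVER_REQUESTED } FailoverState;

/* No failover in progress means COLO exited because of an error. */
static const char *exit_reason(FailoverState st)
{
    return st == FAILOVER_NONE ? "error" : "request";
}

/* Format a COLO_EXIT-like event without the timestamp field. */
static int format_colo_exit(char *buf, size_t len, const char *mode,
                            FailoverState st)
{
    return snprintf(buf, len,
                    "{\"event\": \"COLO_EXIT\", "
                    "\"data\": {\"mode\": \"%s\", \"reason\": \"%s\"}}",
                    mode, exit_reason(st));
}
```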
[Qemu-devel] [PATCH v2 18/18] COLO: notify net filters about checkpoint/failover event
Notify all net filters about the checkpoint and failover event. Cc: Jason Wang Signed-off-by: zhanghailiang --- migration/colo.c | 13 + 1 file changed, 13 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 66bb5b2..62f58c6 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -26,6 +26,7 @@ #include "qapi-event.h" #include "block/block.h" #include "replication.h" +#include "net/filter.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void) if (local_err) { error_report_err(local_err); } +/* Notify all filters of all NIC to do checkpoint */ +colo_notify_filters_event(COLO_FAILOVER, &local_err); +if (local_err) { +error_report_err(local_err); +} if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side"); @@ -794,6 +800,13 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +/* Notify all filters of all NIC to do checkpoint */ +colo_notify_filters_event(COLO_CHECKPOINT, &local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vmstate_loading = false; vm_start(); trace_colo_vm_state_change("stop", "run"); -- 1.8.3.1
[Qemu-devel] [PATCH RESEND v2 11/18] savevm: split save/find loadvm_handlers entry into two helper functions
COLO's checkpoint process is based on the migration process: every time we do a checkpoint, we repeat the savevm and loadvm process. So qemu_loadvm_section_start_full() is called repeatedly, and each call adds all the migration section information to the loadvm_handlers list again, which leads to a memory leak. To fix it, we split saving and finding a section entry into two helper functions, and check whether the section info already exists in the loadvm_handlers list before saving it. This modification has no side effect on normal migration. Cc: Juan Quintela Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/savevm.c | 55 +++--- 1 file changed, 40 insertions(+), 15 deletions(-) diff --git a/migration/savevm.c b/migration/savevm.c index 03ae1bd..f87cd8d 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1836,6 +1836,37 @@ void loadvm_free_handlers(MigrationIncomingState *mis) } } +static LoadStateEntry *loadvm_add_section_entry(MigrationIncomingState *mis, + SaveStateEntry *se, + uint32_t section_id, + uint32_t version_id) +{ +LoadStateEntry *le; + +/* Add entry */ +le = g_malloc0(sizeof(*le)); + +le->se = se; +le->section_id = section_id; +le->version_id = version_id; +QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry); +return le; +} + +static LoadStateEntry *loadvm_find_section_entry(MigrationIncomingState *mis, + uint32_t section_id) +{ +LoadStateEntry *le; + +QLIST_FOREACH(le, &mis->loadvm_handlers, entry) { +if (le->section_id == section_id) { +break; +} +} + +return le; +} + static int qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis) { @@ -1878,15 +1909,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis) return -EINVAL; } -/* Add entry */ -le = g_malloc0(sizeof(*le)); - -le->se = se; -le->section_id = section_id; -le->version_id = version_id; -QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry); - -ret = vmstate_load(f, le->se, le->version_id); + /* Check if we have saved 
this section info before, if not, save it */ +le = loadvm_find_section_entry(mis, section_id); +if (!le) { +le = loadvm_add_section_entry(mis, se, section_id, version_id); +} +ret = vmstate_load(f, se, version_id); if (ret < 0) { error_report("error while loading state for instance 0x%x of" " device '%s'", instance_id, idstr); @@ -1909,12 +1937,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis) section_id = qemu_get_be32(f); trace_qemu_loadvm_state_section_partend(section_id); -QLIST_FOREACH(le, &mis->loadvm_handlers, entry) { -if (le->section_id == section_id) { -break; -} -} -if (le == NULL) { + +le = loadvm_find_section_entry(mis, section_id); +if (!le) { error_report("Unknown savevm section %d", section_id); return -EINVAL; } -- 1.8.3.1
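For readers following the leak fix, here is a minimal standalone sketch of the find-before-add pattern this patch introduces. It is not QEMU code: `Entry`, `HandlerList` and the helper names are hypothetical stand-ins for LoadStateEntry and the loadvm_handlers list, and a plain singly linked list replaces QLIST.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for LoadStateEntry. */
typedef struct Entry {
    uint32_t section_id;
    struct Entry *next;
} Entry;

/* Hypothetical stand-in for mis->loadvm_handlers. */
typedef struct HandlerList {
    Entry *head;
    size_t count;
} HandlerList;

/* Return the entry for section_id, or NULL if it was never added. */
static Entry *find_entry(const HandlerList *list, uint32_t section_id)
{
    for (Entry *e = list->head; e; e = e->next) {
        if (e->section_id == section_id) {
            return e;
        }
    }
    return NULL;
}

/* Unconditionally allocate a new entry and link it at the head. */
static Entry *add_entry(HandlerList *list, uint32_t section_id)
{
    Entry *e = calloc(1, sizeof(*e));

    if (!e) {
        return NULL;
    }
    e->section_id = section_id;
    e->next = list->head;
    list->head = e;
    list->count++;
    return e;
}

/* Reuse an existing entry so repeated checkpoints do not grow the list. */
static Entry *find_or_add_entry(HandlerList *list, uint32_t section_id)
{
    Entry *e = find_entry(list, section_id);

    return e ? e : add_entry(list, section_id);
}

/* Release every entry and reset the list. */
static void free_entries(HandlerList *list)
{
    while (list->head) {
        Entry *next = list->head->next;

        free(list->head);
        list->head = next;
    }
    list->count = 0;
}
```

With `add_entry()` alone, three checkpoint rounds over four sections would leave twelve entries; with `find_or_add_entry()` the list stays at four.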
[Qemu-devel] [PATCH RESEND v2 13/18] COLO: Separate the process of saving/loading ram and device state
We separate saving/loading of RAM from saving/loading of device state when doing a checkpoint, and add new helpers for each. With this change, we can transfer RAM directly from the primary side to the secondary side without staging it in the channel-buffer, which also reduces the extra memory used during a checkpoint. Besides, we move the colo_flush_ram_cache() call to its proper position after the above change. Cc: Juan Quintela Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 49 +++-- migration/ram.c| 5 - migration/savevm.c | 4 3 files changed, 43 insertions(+), 15 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index e62da93..8e27a4c 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -357,11 +357,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s, goto out; } +colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err); +if (local_err) { +goto out; +} + /* Disable block migration */ s->params.blk = 0; s->params.shared = 0; -qemu_savevm_state_header(fb); -qemu_savevm_state_begin(fb, &s->params); +qemu_savevm_state_begin(s->to_dst_file, &s->params); +ret = qemu_file_get_error(s->to_dst_file); +if (ret < 0) { +error_report("Save VM state begin error"); +goto out; +} /* We call this API although this may do nothing on primary side. */ qemu_mutex_lock_iothread(); @@ -372,15 +381,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s, } qemu_mutex_lock_iothread(); -qemu_savevm_state_complete_precopy(fb, false); +/* + * Only save VM's live state, which not including device state. + * TODO: We may need a timeout mechanism to prevent COLO process + * to be blocked here. 
+ */ +qemu_savevm_live_state(s->to_dst_file); +/* Note: device state is saved into buffer */ +ret = qemu_save_device_state(fb); qemu_mutex_unlock_iothread(); - -qemu_fflush(fb); - -colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err); -if (local_err) { +if (ret < 0) { +error_report("Save device state error"); goto out; } +qemu_fflush(fb); + /* * We need the size of the VMstate data in Secondary side, * With which we can decide how much data should be read. @@ -621,6 +636,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t total_size; uint64_t value; Error *local_err = NULL; +int ret; qemu_sem_init(&mis->colo_incoming_sem, 0); @@ -693,6 +709,17 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +ret = qemu_loadvm_state_begin(mis->from_src_file); +if (ret < 0) { +error_report("Load vm state begin error, ret=%d", ret); +goto out; +} +ret = qemu_loadvm_state_main(mis->from_src_file, mis); +if (ret < 0) { +error_report("Load VM's live state (ram) error"); +goto out; +} + value = colo_receive_message_value(mis->from_src_file, COLO_MESSAGE_VMSTATE_SIZE, &local_err); if (local_err) { @@ -726,8 +753,10 @@ void *colo_process_incoming_thread(void *opaque) qemu_mutex_lock_iothread(); qemu_system_reset(VMRESET_SILENT); vmstate_loading = true; -if (qemu_loadvm_state(fb) < 0) { -error_report("COLO: loadvm failed"); +colo_flush_ram_cache(); +ret = qemu_load_device_state(fb); +if (ret < 0) { +error_report("COLO: load device state failed"); qemu_mutex_unlock_iothread(); goto out; } diff --git a/migration/ram.c b/migration/ram.c index df10d4b..f171a82 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2602,7 +2602,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING; /* ADVISE is earlier, it shows the source has the postcopy capability on */ bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE; -bool need_flush = false; 
seq_iter++; @@ -2637,7 +2636,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) /* After going into COLO, we should load the Page into colo_cache */ if (migration_incoming_in_colo_state()) { host = colo_cache_from_block_offset(block, addr); -need_flush = true; } else { host = host_from_ram_block_offset(block, addr); } @@ -2745,9 +2743,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); -if (!ret && ram_cache_enable && need_flush) { -
[Qemu-devel] [PATCH v2 16/18] filter: Add handle_event method for NetFilterClass
Filters need to process checkpoint/failover events, or other events passed by the COLO framework. Cc: Jason Wang Signed-off-by: zhanghailiang --- include/net/filter.h | 5 + net/filter.c | 16 net/net.c| 28 3 files changed, 49 insertions(+) diff --git a/include/net/filter.h b/include/net/filter.h index 0c4a2ea..df4510d 100644 --- a/include/net/filter.h +++ b/include/net/filter.h @@ -37,6 +37,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); +typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp); + typedef struct NetFilterClass { ObjectClass parent_class; @@ -44,6 +46,7 @@ typedef struct NetFilterClass { FilterSetup *setup; FilterCleanup *cleanup; FilterStatusChanged *status_changed; +FilterHandleEvent *handle_event; /* mandatory */ FilterReceiveIOV *receive_iov; } NetFilterClass; @@ -76,4 +79,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender, int iovcnt, void *opaque); +void colo_notify_filters_event(int event, Error **errp); + #endif /* QEMU_NET_FILTER_H */ diff --git a/net/filter.c b/net/filter.c index 1dfd2ca..993b35e 100644 --- a/net/filter.c +++ b/net/filter.c @@ -17,6 +17,7 @@ #include "net/vhost_net.h" #include "qom/object_interfaces.h" #include "qemu/iov.h" +#include "net/colo.h" static inline bool qemu_can_skip_netfilter(NetFilterState *nf) { @@ -245,11 +246,26 @@ static void netfilter_finalize(Object *obj) g_free(nf->netdev_id); } +static void dummy_handle_event(NetFilterState *nf, int event, Error **errp) +{ +switch (event) { +case COLO_CHECKPOINT: +break; +case COLO_FAILOVER: +object_property_set_str(OBJECT(nf), "off", "status", errp); +break; +default: +break; +} +} + static void netfilter_class_init(ObjectClass *oc, void *data) { UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); +NetFilterClass *nfc = NETFILTER_CLASS(oc); ucc->complete = netfilter_complete; +nfc->handle_event = dummy_handle_event; } static const TypeInfo netfilter_info = 
{ diff --git a/net/net.c b/net/net.c index 0ac3b9e..1373f63 100644 --- a/net/net.c +++ b/net/net.c @@ -1373,6 +1373,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict) } } +void colo_notify_filters_event(int event, Error **errp) +{ +NetClientState *nc, *peer; +NetClientDriver type; +NetFilterState *nf; +NetFilterClass *nfc = NULL; +Error *local_err = NULL; + +QTAILQ_FOREACH(nc, &net_clients, next) { +peer = nc->peer; +type = nc->info->type; +if (!peer || type != NET_CLIENT_DRIVER_NIC) { +continue; +} +QTAILQ_FOREACH(nf, &nc->filters, next) { +nfc = NETFILTER_GET_CLASS(OBJECT(nf)); +if (!nfc->handle_event) { +continue; +} +nfc->handle_event(nf, event, &local_err); +if (local_err) { +error_propagate(errp, local_err); +return; +} +} +} +} + void qmp_set_link(const char *name, bool up, Error **errp) { NetClientState *ncs[MAX_QUEUE_NUM]; -- 1.8.3.1
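The dispatch logic in colo_notify_filters_event() can be sketched without QOM as a loop over optional function pointers that stops at the first failure. Everything here is hypothetical scaffolding: `Filter`, `notify_filters` and the `EV_*` codes stand in for NetFilterState, the real notifier and the COLO event enum, and an int return replaces the Error ** convention.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes mirroring COLO_CHECKPOINT / COLO_FAILOVER. */
enum { EV_CHECKPOINT = 2, EV_FAILOVER };

typedef struct Filter Filter;
typedef int (*HandleEvent)(Filter *f, int event);

struct Filter {
    bool enabled;
    HandleEvent handle_event;   /* optional, may be NULL */
};

/* Default behaviour: ignore checkpoints, switch the filter off on failover. */
static int default_handle_event(Filter *f, int event)
{
    switch (event) {
    case EV_FAILOVER:
        f->enabled = false;
        break;
    case EV_CHECKPOINT:
    default:
        break;
    }
    return 0;
}

/* A handler that always fails, used to show the early return. */
static int failing_handle_event(Filter *f, int event)
{
    (void)f;
    (void)event;
    return -1;
}

/* Notify every filter; skip those without a handler, stop on first error. */
static int notify_filters(Filter *filters, size_t n, int event)
{
    for (size_t i = 0; i < n; i++) {
        int ret;

        if (!filters[i].handle_event) {
            continue;
        }
        ret = filters[i].handle_event(&filters[i], event);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

One consequence of stopping at the first error, visible in the sketch, is that filters later in the list are left un-notified when an earlier one fails.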
[Qemu-devel] [PATCH v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly
We should not load the PVM's state directly into the SVM, because errors may happen while the SVM is receiving data, which would break the SVM. We need to ensure that all data has been received before loading the state into the SVM. We use extra memory to cache this data (the PVM's RAM). The RAM cache on the secondary side is initially the same as the SVM/PVM's memory. During a checkpoint, we first cache the PVM's dirty pages in this RAM cache, so the cache is always the same as the PVM's memory at every checkpoint; we then flush the cached RAM to the SVM after we have received all of the PVM's state. Cc: Dr. David Alan Gilbert Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian --- v2: - Move colo_init_ram_cache() and colo_release_ram_cache() out of the incoming thread since both of them need the global lock; if we keep colo_release_ram_cache() in the incoming thread, there is a potential deadlock. - Remove the bool ram_cache_enable flag, use migration_incoming_in_colo_state() instead. - Remove the Reviewed-by tag because of the above changes. 
--- include/exec/ram_addr.h | 1 + include/migration/migration.h | 4 +++ migration/migration.c | 6 migration/ram.c | 71 ++- 4 files changed, 81 insertions(+), 1 deletion(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index c9ddcd0..0b3d77c 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -27,6 +27,7 @@ struct RAMBlock { struct rcu_head rcu; struct MemoryRegion *mr; uint8_t *host; +uint8_t *colo_cache; /* For colo, VM's ram cache */ ram_addr_t offset; ram_addr_t used_length; ram_addr_t max_length; diff --git a/include/migration/migration.h b/include/migration/migration.h index ba1a16c..ba765eb 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -360,4 +360,8 @@ uint64_t ram_pagesize_summary(void); PostcopyState postcopy_state_get(void); /* Set the state and return the old state */ PostcopyState postcopy_state_set(PostcopyState new_state); + +/* ram cache */ +int colo_init_ram_cache(void); +void colo_release_ram_cache(void); #endif diff --git a/migration/migration.c b/migration/migration.c index 755ea54..7419404 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -441,6 +441,10 @@ static void process_incoming_migration_co(void *opaque) error_report_err(local_err); exit(EXIT_FAILURE); } +if (colo_init_ram_cache() < 0) { +error_report("Init ram cache failed"); +exit(EXIT_FAILURE); +} mis->migration_incoming_co = qemu_coroutine_self(); qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); @@ -449,6 +453,8 @@ static void process_incoming_migration_co(void *opaque) /* Wait checkpoint incoming thread exit before free resource */ qemu_thread_join(&mis->colo_incoming_thread); +/* We hold the global iothread lock, so it is safe here */ +colo_release_ram_cache(); } if (ret < 0) { diff --git a/migration/ram.c b/migration/ram.c index f48664e..05d1b06 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2265,6 +2265,20 @@ static 
inline void *host_from_ram_block_offset(RAMBlock *block, return block->host + offset; } +static inline void *colo_cache_from_block_offset(RAMBlock *block, + ram_addr_t offset) +{ +if (!offset_in_ramblock(block, offset)) { +return NULL; +} +if (!block->colo_cache) { +error_report("%s: colo_cache is NULL in block :%s", + __func__, block->idstr); +return NULL; +} +return block->colo_cache + offset; +} + /** * ram_handle_compressed: handle the zero page case * @@ -2605,7 +2619,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) { RAMBlock *block = ram_block_from_stream(f, flags); -host = host_from_ram_block_offset(block, addr); +/* After going into COLO, we should load the Page into colo_cache */ +if (migration_incoming_in_colo_state()) { +host = colo_cache_from_block_offset(block, addr); +} else { +host = host_from_ram_block_offset(block, addr); +} if (!host) { error_report("Illegal RAM offset " RAM_ADDR_FMT, addr); ret = -EINVAL; @@ -2712,6 +2731,56 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) return ret; } +/* + * colo cache: this is for secondary VM, we cache the whole + * memory of the secondary VM, it is need to hold the global lock + * to call this helper. + */ +int colo_init_ram_cache(void) +{ +RAMBlock *block; + +rcu_read_lock(); +QLIST_FOREACH_RCU(block, &ram_
[Qemu-devel] [PATCH RESEND v2 05/18] COLO: Handle shutdown command for VM in COLO state
If the VM is in COLO FT state, we need to do some extra work before starting the normal shutdown process. The Secondary VM will ignore the shutdown command if users issue it directly to the Secondary VM. On the Primary VM, COLO captures the shutdown request from the user, notifies the Secondary VM after the next checkpoint transaction, and only then starts the shutdown. Cc: Paolo Bonzini Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- include/migration/colo.h | 1 + include/sysemu/sysemu.h | 3 +++ migration/colo.c | 46 +- qapi-schema.json | 4 +++- vl.c | 19 --- 5 files changed, 68 insertions(+), 5 deletions(-) diff --git a/include/migration/colo.h b/include/migration/colo.h index 2bbff9e..aadd040 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -37,4 +37,5 @@ COLOMode get_colo_mode(void); void colo_do_failover(MigrationState *s); void colo_checkpoint_notify(void *opaque); +bool colo_handle_shutdown(void); #endif diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 16175f7..8054f53 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -49,6 +49,8 @@ typedef enum WakeupReason { QEMU_WAKEUP_REASON_OTHER, } WakeupReason; +extern int colo_shutdown_requested; + void qemu_system_reset_request(void); void qemu_system_suspend_request(void); void qemu_register_suspend_notifier(Notifier *notifier); @@ -56,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason); void qemu_system_wakeup_enable(WakeupReason reason, bool enabled); void qemu_register_wakeup_notifier(Notifier *notifier); void qemu_system_shutdown_request(void); +void qemu_system_shutdown_request_core(void); void qemu_system_powerdown_request(void); void qemu_register_powerdown_notifier(Notifier *notifier); void qemu_system_debug_request(void); diff --git a/migration/colo.c b/migration/colo.c index a3344ce..c4fc865 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -384,6 +384,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s, goto out; } +if (colo_shutdown_requested) { 
+colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN, + &local_err); +if (local_err) { +error_free(local_err); +/* Go on the shutdown process and throw the error message */ +error_report("Failed to send shutdown message to SVM"); +} +qemu_fflush(s->to_dst_file); +colo_shutdown_requested = 0; +qemu_system_shutdown_request_core(); +/* Fix me: Just let the colo thread exit ? */ +qemu_thread_exit(0); +} + ret = 0; qemu_mutex_lock_iothread(); @@ -449,7 +464,9 @@ static void colo_process_checkpoint(MigrationState *s) goto out; } -qemu_sem_wait(&s->colo_checkpoint_sem); +if (!colo_shutdown_requested) { +qemu_sem_wait(&s->colo_checkpoint_sem); +} ret = colo_do_checkpoint_transaction(s, bioc, fb); if (ret < 0) { @@ -534,6 +551,16 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request, case COLO_MESSAGE_CHECKPOINT_REQUEST: *checkpoint_request = 1; break; +case COLO_MESSAGE_GUEST_SHUTDOWN: +qemu_mutex_lock_iothread(); +vm_stop_force_state(RUN_STATE_COLO); +qemu_system_shutdown_request_core(); +qemu_mutex_unlock_iothread(); +/* + * The main thread will be exit and terminate the whole + * process, do need some cleanup ? + */ +qemu_thread_exit(0); default: *checkpoint_request = 0; error_setg(errp, "Got unknown COLO message: %d", msg); @@ -696,3 +723,20 @@ out: return NULL; } + +bool colo_handle_shutdown(void) +{ +/* + * If VM is in COLO-FT mode, we need do some significant work before + * respond to the shutdown request. Besides, Secondary VM will ignore + * the shutdown request from users. + */ +if (migration_incoming_in_colo_state()) { +return true; +} +if (migration_in_colo_state()) { +colo_shutdown_requested = 1; +return true; +} +return false; +} diff --git a/qapi-schema.json b/qapi-schema.json index 01b087f..4b3e1b7 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1187,12 +1187,14 @@ # # @vmstate-loaded: VM's state has been loaded by SVM. # +# @guest-shutdown: shutdown requested from PVM to SVM. 
(Since 2.9) +# # Since: 2.8 ## { 'enum': 'COLOMessage', 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply', 'vmstate-send', 'vmstate-size', 'vmstate-received', -'vmstate-loaded' ] } +'vmstate-loaded', 'guest-shutdown' ] } ## # @COLOMode: diff --git a/vl.c b/vl.c index 0b4ed52..72638c9 100644 --- a/vl.c +++ b/vl.c @@ -1611,6 +1611,8 @@ static NotifierList wakeup_notifiers = NOTIFIER_LIST_INITIALIZER(wakeup_notifiers)
[Qemu-devel] [PATCH RESEND v2 01/18] net/colo: Add notifier/callback related helpers for filter
We will use this notifier to let COLO notify filter objects of events such as a checkpoint or a failover. Cc: Jason Wang Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Li Zhijian --- net/colo.c | 105 + net/colo.h | 19 +++ 2 files changed, 124 insertions(+) diff --git a/net/colo.c b/net/colo.c index 8cc166b..8aef670 100644 --- a/net/colo.c +++ b/net/colo.c @@ -15,6 +15,7 @@ #include "qemu/osdep.h" #include "trace.h" #include "net/colo.h" +#include "qapi/error.h" uint32_t connection_key_hash(const void *opaque) { @@ -209,3 +210,107 @@ Connection *connection_get(GHashTable *connection_track_table, return conn; } + +static gboolean +filter_notify_prepare(GSource *source, gint *timeout) +{ +*timeout = -1; + +return FALSE; +} + +static gboolean +filter_notify_check(GSource *source) +{ +FilterNotifier *notify = (FilterNotifier *)source; + +return notify->pfd.revents & (G_IO_IN | G_IO_HUP | G_IO_ERR); +} + +static gboolean +filter_notify_dispatch(GSource *source, + GSourceFunc callback, + gpointer user_data) +{ +FilterNotifier *notify = (FilterNotifier *)source; +int revents; +uint64_t value; +int ret; + +revents = notify->pfd.revents & notify->pfd.events; +if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) { +ret = filter_notifier_get(notify, &value); +if (notify->cb && !ret) { +notify->cb(notify, value); +} +} +return TRUE; +} + +static void +filter_notify_finalize(GSource *source) +{ +FilterNotifier *notify = (FilterNotifier *)source; + +event_notifier_cleanup(&notify->event); +} + +static GSourceFuncs notifier_source_funcs = { +filter_notify_prepare, +filter_notify_check, +filter_notify_dispatch, +filter_notify_finalize, +}; + +FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb, +void *opaque, Error **errp) +{ +FilterNotifier *notify; +int ret; + +notify = (FilterNotifier *)g_source_new(&notifier_source_funcs, +sizeof(FilterNotifier)); +ret = event_notifier_init(&notify->event, false); +if (ret < 0) { +error_setg_errno(errp, 
-ret, "Failed to initialize event notifier"); +goto fail; +} +notify->pfd.fd = event_notifier_get_fd(&notify->event); +notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR; +notify->cb = cb; +notify->opaque = opaque; +g_source_add_poll(&notify->source, &notify->pfd); + +return notify; + +fail: +g_source_destroy(&notify->source); +return NULL; +} + +int filter_notifier_set(FilterNotifier *notify, uint64_t value) +{ +ssize_t ret; + +do { +ret = write(notify->event.wfd, &value, sizeof(value)); +} while (ret < 0 && errno == EINTR); + +/* EAGAIN is fine, a read must be pending. */ +if (ret < 0 && errno != EAGAIN) { +return -errno; +} +return 0; +} + +int filter_notifier_get(FilterNotifier *notify, uint64_t *value) +{ +ssize_t len; + +/* Drain the notify pipe. For eventfd, only 8 bytes will be read. */ +do { +len = read(notify->event.rfd, value, sizeof(*value)); +} while ((len == -1 && errno == EINTR)); + +return len != sizeof(*value) ? -1 : 0; +} diff --git a/net/colo.h b/net/colo.h index cd9027f..b586db3 100644 --- a/net/colo.h +++ b/net/colo.h @@ -19,6 +19,7 @@ #include "qemu/jhash.h" #include "qemu/timer.h" #include "slirp/tcp.h" +#include "qemu/event_notifier.h" #define HASHTABLE_MAX_SIZE 16384 @@ -89,4 +90,22 @@ void connection_hashtable_reset(GHashTable *connection_track_table); Packet *packet_new(const void *data, int size); void packet_destroy(void *opaque, void *user_data); +typedef void FilterNotifierCallback(void *opaque, int value); +typedef struct FilterNotifier { +GSource source; +EventNotifier event; +GPollFD pfd; +FilterNotifierCallback *cb; +void *opaque; +} FilterNotifier; + +FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb, +void *opaque, Error **errp); +int filter_notifier_set(FilterNotifier *notify, uint64_t value); +int filter_notifier_get(FilterNotifier *notify, uint64_t *value); + +enum { +COLO_CHECKPOINT = 2, +COLO_FAILOVER, +}; #endif /* QEMU_COLO_PROXY_H */ -- 1.8.3.1
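A minimal sketch of the set/get pair on a raw Linux eventfd, to show the EINTR retry loops in isolation. It bypasses QEMU's EventNotifier entirely: `notifier_set` and `notifier_get` are hypothetical names, and the sketch assumes a Linux host where eventfd(2) is available and the descriptor was opened non-blocking.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Post a value to the eventfd, retrying if interrupted by a signal. */
static int notifier_set(int fd, uint64_t value)
{
    ssize_t ret;

    do {
        ret = write(fd, &value, sizeof(value));
    } while (ret < 0 && errno == EINTR);

    /* EAGAIN means the counter is full; treated as non-fatal, as in the patch. */
    if (ret < 0 && errno != EAGAIN) {
        return -errno;
    }
    return 0;
}

/* Read and reset the eventfd counter; returns 0 on success, -1 otherwise. */
static int notifier_get(int fd, uint64_t *value)
{
    ssize_t len;

    do {
        len = read(fd, value, sizeof(*value));
    } while (len == -1 && errno == EINTR);

    return len == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

One property worth keeping in mind when passing event codes this way: on an eventfd, values written before a read are added together, so two pending writes of 2 and 3 are read back as a single 5.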
[Qemu-devel] [PATCH v2 17/18] filter-rewriter: handle checkpoint and failover event
After a checkpoint, the states of the PVM and SVM are consistent, so it
is unnecessary to adjust the sequence numbers of net packets for old
connections. In addition, when failover happens, filter-rewriter needs
to check whether it still has to adjust the sequence numbers of net
packets.

Cc: Jason Wang
Signed-off-by: zhanghailiang
---
 net/filter-rewriter.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index c9a6d43..0a90b11 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -22,6 +22,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "net/colo.h"

 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -270,6 +271,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
     return 0;
 }

+static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    conn->offset = 0;
+}
+
+static gboolean offset_is_nonzero(gpointer key,
+                                  gpointer value,
+                                  gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    return conn->offset ? true : false;
+}
+
+static void colo_rewriter_handle_event(NetFilterState *nf, int event,
+                                       Error **errp)
+{
+    RewriterState *rs = FILTER_COLO_REWRITER(nf);
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_hash_table_foreach(rs->connection_track_table,
+                            reset_seq_offset, NULL);
+        break;
+    case COLO_FAILOVER:
+        if (!g_hash_table_find(rs->connection_track_table,
+                               offset_is_nonzero, NULL)) {
+            object_property_set_str(OBJECT(nf), "off", "status", errp);
+        }
+        break;
+    default:
+        break;
+    }
+}
+
 static void colo_rewriter_cleanup(NetFilterState *nf)
 {
     RewriterState *s = FILTER_COLO_REWRITER(nf);
@@ -299,6 +337,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
     nfc->setup = colo_rewriter_setup;
     nfc->cleanup = colo_rewriter_cleanup;
     nfc->receive_iov = colo_rewriter_receive_iov;
+    nfc->handle_event = colo_rewriter_handle_event;
 }

 static const TypeInfo colo_rewriter_info = {
--
1.8.3.1
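
The event handling in this patch can be illustrated without GLib or QOM. The sketch below is only an approximation under stated assumptions: a fixed array replaces the connection hash table, a boolean replaces the filter's "status" property, and all Demo*/demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event values, numbered like COLO_CHECKPOINT/COLO_FAILOVER. */
enum { DEMO_CHECKPOINT = 2, DEMO_FAILOVER };

/* Hypothetical connection record: only the sequence offset matters here. */
typedef struct DemoConn {
    uint32_t offset;
} DemoConn;

/* Hypothetical rewriter state: an array instead of a GHashTable. */
typedef struct DemoRewriter {
    DemoConn *conns;
    size_t nconns;
    bool enabled; /* stands in for the filter's "status" property */
} DemoRewriter;

static void demo_handle_event(DemoRewriter *rs, int event)
{
    size_t i;

    switch (event) {
    case DEMO_CHECKPOINT:
        /* States are consistent again: forget every recorded offset. */
        for (i = 0; i < rs->nconns; i++) {
            rs->conns[i].offset = 0;
        }
        break;
    case DEMO_FAILOVER:
        /* Keep rewriting while any connection still has an offset. */
        for (i = 0; i < rs->nconns; i++) {
            if (rs->conns[i].offset) {
                return;
            }
        }
        rs->enabled = false;
        break;
    default:
        break;
    }
}
```

With one connection at offset 5, a failover leaves the rewriter enabled; after a checkpoint clears the offsets, a later failover switches it off.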
[Qemu-devel] [PATCH v2 02/18] colo-compare: implement the process of checkpoint
While do checkpoint, we need to flush all the unhandled packets, By using the filter notifier mechanism, we can easily to notify every compare object to do this process, which runs inside of compare threads as a coroutine. Cc: Jason Wang Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen --- net/colo-compare.c | 78 ++ net/colo-compare.h | 6 + 2 files changed, 84 insertions(+) create mode 100644 net/colo-compare.h diff --git a/net/colo-compare.c b/net/colo-compare.c index 97bf0e5..3adccfb 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -29,17 +29,24 @@ #include "qemu/sockets.h" #include "qapi-visit.h" #include "net/colo.h" +#include "net/colo-compare.h" #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE) +static QTAILQ_HEAD(, CompareState) net_compares = + QTAILQ_HEAD_INITIALIZER(net_compares); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 /* TODO: Should be configurable */ #define REGULAR_PACKET_CHECK_MS 3000 +static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER }; +static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER }; +static int event_unhandled_count; /* + CompareState ++ | | @@ -87,6 +94,10 @@ typedef struct CompareState { GMainContext *worker_context; GMainLoop *compare_loop; +/* Used for COLO to notify compare to do something */ +FilterNotifier *notifier; + +QTAILQ_ENTRY(CompareState) next; } CompareState; typedef struct CompareClass { @@ -417,6 +428,11 @@ static void colo_compare_connection(void *opaque, void *user_data) while (!g_queue_is_empty(&conn->primary_list) && !g_queue_is_empty(&conn->secondary_list)) { pkt = g_queue_pop_tail(&conn->primary_list); +if (!pkt) { +error_report("colo-compare pop pkt failed"); +return; +} + switch (conn->ip_proto) { case IPPROTO_TCP: result = g_queue_find_custom(&conn->secondary_list, @@ -538,6 +554,53 @@ static gboolean check_old_packet_regular(void *opaque) return TRUE; } +/* 
Public API, Used for COLO frame to notify compare event */ +void colo_notify_compares_event(void *opaque, int event, Error **errp) +{ +CompareState *s; +int ret; + +qemu_mutex_lock(&event_mtx); +QTAILQ_FOREACH(s, &net_compares, next) { +ret = filter_notifier_set(s->notifier, event); +if (ret < 0) { +error_setg_errno(errp, -ret, "Failed to write value to eventfd"); +goto fail; +} +event_unhandled_count++; +} +/* Wait all compare threads to finish handling this event */ +while (event_unhandled_count > 0) { +qemu_cond_wait(&event_complete_cond, &event_mtx); +} + +fail: +qemu_mutex_unlock(&event_mtx); +} + +static void colo_flush_packets(void *opaque, void *user_data); + +static void colo_compare_handle_event(void *opaque, int event) +{ +FilterNotifier *notify = opaque; +CompareState *s = notify->opaque; + +switch (event) { +case COLO_CHECKPOINT: +g_queue_foreach(&s->conn_list, colo_flush_packets, s); +break; +case COLO_FAILOVER: +break; +default: +break; +} +qemu_mutex_lock(&event_mtx); +assert(event_unhandled_count > 0); +event_unhandled_count--; +qemu_cond_broadcast(&event_complete_cond); +qemu_mutex_unlock(&event_mtx); +} + static void *colo_compare_thread(void *opaque) { CompareState *s = opaque; @@ -558,10 +621,15 @@ static void *colo_compare_thread(void *opaque) (GSourceFunc)check_old_packet_regular, s, NULL); g_source_attach(timeout_source, s->worker_context); +s->notifier = filter_notifier_new(colo_compare_handle_event, s, NULL); +g_source_attach(&s->notifier->source, s->worker_context); + qemu_sem_post(&s->thread_ready); g_main_loop_run(s->compare_loop); +g_source_destroy(&s->notifier->source); +g_source_unref(&s->notifier->source); g_source_destroy(timeout_source); g_source_unref(timeout_source); @@ -706,6 +774,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp) net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize); net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize); +QTAILQ_INSERT_TAIL(&net_compares, s, next); + 
g_queue_init(&s->conn_list); s->connection_track_table = g_hash_table_new_full(connection_key_hash, @@ -765,6 +835,7 @@ static void colo_compare_init(Object *obj) static void colo_compare_finalize(Object *obj) { CompareState *s = COLO_COMPARE(obj); +CompareState *tmp = NULL; qemu_chr_fe_set_handlers(&s->chr_pri_in, NULL, NULL, NULL, NULL, s->worker_context, true); @@ -777,6 +848,13 @@ static void colo_compare_finalize(Object *o
[Qemu-devel] [PATCH v2 15/18] COLO: flush host dirty ram from cache
There is no need to flush all of the VM's RAM from the cache; flush only
the pages dirtied since the last checkpoint.

Cc: Juan Quintela
Signed-off-by: Li Zhijian
Signed-off-by: Zhang Chen
Signed-off-by: zhanghailiang
---
v2:
- stop dirty log after exit from COLO state. (Dave)
---
 migration/ram.c | 12
 1 file changed, 12 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index f171a82..7bf3515 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2775,6 +2775,7 @@ int colo_init_ram_cache(void)
     ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
     ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
     ram_state.migration_dirty_pages = 0;
+    memory_global_dirty_log_start();

     return 0;

@@ -2798,6 +2799,7 @@ void colo_release_ram_cache(void)
     atomic_rcu_set(&ram_state.ram_bitmap, NULL);

     if (bitmap) {
+        memory_global_dirty_log_stop();
         call_rcu(bitmap, migration_bitmap_free, rcu);
     }

@@ -2822,6 +2824,16 @@ void colo_flush_ram_cache(void)
     void *src_host;
     unsigned long offset = 0;

+    memory_global_dirty_log_sync();
+    qemu_mutex_lock(&ram_state.bitmap_mutex);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        migration_bitmap_sync_range(&ram_state, block, block->offset,
+                                    block->used_length);
+    }
+    rcu_read_unlock();
+    qemu_mutex_unlock(&ram_state.bitmap_mutex);
+
     trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
--
1.8.3.1
[Qemu-devel] [PATCH v2 13/18] COLO: Separate the process of saving/loading ram and device state
We separate the process of saving/loading ram and device state when do checkpoint. We add new helpers for save/load ram/device. With this change, we can directly transfer RAM from primary side to secondary side without using channel-buffer as assistant, which also reduce the size of extra memory was used during checkpoint. Besides, we move the colo_flush_ram_cache to the proper position after the above change. Cc: Juan Quintela Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 49 +++-- migration/ram.c| 5 - migration/savevm.c | 4 3 files changed, 43 insertions(+), 15 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index e62da93..8e27a4c 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -357,11 +357,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s, goto out; } +colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err); +if (local_err) { +goto out; +} + /* Disable block migration */ s->params.blk = 0; s->params.shared = 0; -qemu_savevm_state_header(fb); -qemu_savevm_state_begin(fb, &s->params); +qemu_savevm_state_begin(s->to_dst_file, &s->params); +ret = qemu_file_get_error(s->to_dst_file); +if (ret < 0) { +error_report("Save VM state begin error"); +goto out; +} /* We call this API although this may do nothing on primary side. */ qemu_mutex_lock_iothread(); @@ -372,15 +381,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s, } qemu_mutex_lock_iothread(); -qemu_savevm_state_complete_precopy(fb, false); +/* + * Only save VM's live state, which not including device state. + * TODO: We may need a timeout mechanism to prevent COLO process + * to be blocked here. 
+ */ +qemu_savevm_live_state(s->to_dst_file); +/* Note: device state is saved into buffer */ +ret = qemu_save_device_state(fb); qemu_mutex_unlock_iothread(); - -qemu_fflush(fb); - -colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err); -if (local_err) { +if (ret < 0) { +error_report("Save device state error"); goto out; } +qemu_fflush(fb); + /* * We need the size of the VMstate data in Secondary side, * With which we can decide how much data should be read. @@ -621,6 +636,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t total_size; uint64_t value; Error *local_err = NULL; +int ret; qemu_sem_init(&mis->colo_incoming_sem, 0); @@ -693,6 +709,17 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +ret = qemu_loadvm_state_begin(mis->from_src_file); +if (ret < 0) { +error_report("Load vm state begin error, ret=%d", ret); +goto out; +} +ret = qemu_loadvm_state_main(mis->from_src_file, mis); +if (ret < 0) { +error_report("Load VM's live state (ram) error"); +goto out; +} + value = colo_receive_message_value(mis->from_src_file, COLO_MESSAGE_VMSTATE_SIZE, &local_err); if (local_err) { @@ -726,8 +753,10 @@ void *colo_process_incoming_thread(void *opaque) qemu_mutex_lock_iothread(); qemu_system_reset(VMRESET_SILENT); vmstate_loading = true; -if (qemu_loadvm_state(fb) < 0) { -error_report("COLO: loadvm failed"); +colo_flush_ram_cache(); +ret = qemu_load_device_state(fb); +if (ret < 0) { +error_report("COLO: load device state failed"); qemu_mutex_unlock_iothread(); goto out; } diff --git a/migration/ram.c b/migration/ram.c index df10d4b..f171a82 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2602,7 +2602,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING; /* ADVISE is earlier, it shows the source has the postcopy capability on */ bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE; -bool need_flush = false; 
seq_iter++; @@ -2637,7 +2636,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) /* After going into COLO, we should load the Page into colo_cache */ if (migration_incoming_in_colo_state()) { host = colo_cache_from_block_offset(block, addr); -need_flush = true; } else { host = host_from_ram_block_offset(block, addr); } @@ -2745,9 +2743,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); -if (!ret && ram_cache_enable && need_flush) { -
[Qemu-devel] [PATCH v2 06/18] COLO: Add block replication into colo process
Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Cc: Stefan Hajnoczi Cc: Kevin Wolf Cc: Max Reitz Cc: Xie Changlong --- migration/colo.c | 50 ++ migration/migration.c | 16 2 files changed, 66 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index c4fc865..9949293 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -23,6 +23,9 @@ #include "qmp-commands.h" #include "net/colo-compare.h" #include "net/colo.h" +#include "qapi-event.h" +#include "block/block.h" +#include "replication.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -57,6 +60,7 @@ static void secondary_vm_do_failover(void) { int old_state; MigrationIncomingState *mis = migration_incoming_get_current(); +Error *local_err = NULL; /* Can not do failover during the process of VM's loading VMstate, Or * it will break the secondary VM. 
@@ -74,6 +78,11 @@ static void secondary_vm_do_failover(void) migrate_set_state(&mis->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); +replication_stop_all(true, &local_err); +if (local_err) { +error_report_err(local_err); +} + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side"); /* recover runstate to normal migration finish state */ @@ -111,6 +120,7 @@ static void primary_vm_do_failover(void) { MigrationState *s = migrate_get_current(); int old_state; +Error *local_err = NULL; migrate_set_state(&s->state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); @@ -134,6 +144,13 @@ static void primary_vm_do_failover(void) FailoverStatus_lookup[old_state]); return; } + +replication_stop_all(true, &local_err); +if (local_err) { +error_report_err(local_err); +local_err = NULL; +} + /* Notify COLO thread that failover work is finished */ qemu_sem_post(&s->colo_exit_sem); } @@ -345,6 +362,15 @@ static int colo_do_checkpoint_transaction(MigrationState *s, s->params.shared = 0; qemu_savevm_state_header(fb); qemu_savevm_state_begin(fb, &s->params); + +/* We call this API although this may do nothing on primary side. 
*/ +qemu_mutex_lock_iothread(); +replication_do_checkpoint_all(&local_err); +qemu_mutex_unlock_iothread(); +if (local_err) { +goto out; +} + qemu_mutex_lock_iothread(); qemu_savevm_state_complete_precopy(fb, false); qemu_mutex_unlock_iothread(); @@ -451,6 +477,12 @@ static void colo_process_checkpoint(MigrationState *s) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_PRIMARY, &local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vm_start(); qemu_mutex_unlock_iothread(); trace_colo_vm_state_change("stop", "run"); @@ -554,6 +586,7 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request, case COLO_MESSAGE_GUEST_SHUTDOWN: qemu_mutex_lock_iothread(); vm_stop_force_state(RUN_STATE_COLO); +replication_stop_all(false, NULL); qemu_system_shutdown_request_core(); qemu_mutex_unlock_iothread(); /* @@ -602,6 +635,11 @@ void *colo_process_incoming_thread(void *opaque) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_SECONDARY, &local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} vm_start(); trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); @@ -682,6 +720,18 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +replication_get_error_all(&local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} +/* discard colo disk buffer */ +replication_do_checkpoint_all(&local_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vmstate_loading = false; vm_start(); trace_colo_vm_state_change("stop", "run"); diff --git a/migration/migration.c b/migration/migration.c index 2ade2aa..755ea54 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -394,6 +394,7 @@ static void process_incoming_migration_co(void *opaque) MigrationIncomingState *mis = migration_incoming_get_current(); PostcopyState ps; int ret; +Error *local_err = NULL; mis->from_src_file = f; 
mis->largest_page_size = qemu_ram_pagesize_largest(); @@ -425,6 +426,21 @@ static void process_incomi
[Qemu-devel] [PATCH v2 14/18] COLO: Split qemu_savevm_state_begin out of checkpoint process
It is unnecessary to call qemu_savevm_state_begin() at every checkpoint.
It mainly sets up devices and does the first device state pass, and this
data does not change across later checkpoints. So we split it out of
colo_do_checkpoint_transaction(); this way we avoid transferring the
same data again in subsequent checkpoints.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c | 51 ---
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 8e27a4c..66bb5b2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -362,16 +362,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }

-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("Save VM state begin error");
-        goto out;
-    }
-
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
@@ -459,6 +449,21 @@ static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
     colo_checkpoint_notify(data);
 }

+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+    }
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -478,6 +483,11 @@ static void colo_process_checkpoint(MigrationState *s)
     packets_compare_notifier.notify = colo_compare_notify_checkpoint;
     colo_compare_register_notifier(&packets_compare_notifier);

+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -628,6 +638,17 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     }
 }

+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("Load VM state begin error, ret = %d", ret);
+    }
+    return ret;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -662,6 +683,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));

+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
     if (local_err) {
@@ -709,11 +735,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }

-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("Load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_loadvm_state_main(mis->from_src_file, mis);
         if (ret < 0) {
             error_report("Load VM's live state (ram) error");
--
1.8.3.1
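
The effect of hoisting the "begin" step out of the per-checkpoint transaction can be shown with a toy model. This is a sketch under loud assumptions: the counters below merely stand in for data written to the migration stream, and every Demo*/demo_* name is hypothetical rather than part of the patch.

```c
/* Hypothetical counters standing in for data sent to the secondary. */
typedef struct DemoStream {
    int begin_sections; /* one-time setup data (the "begin" step) */
    int live_sections;  /* data sent at every checkpoint */
} DemoStream;

/* Plays the role of colo_prepare_before_save(): runs once. */
static void demo_prepare_before_save(DemoStream *s)
{
    s->begin_sections++;
}

/* Plays the role of one checkpoint transaction: runs every round. */
static void demo_checkpoint_transaction(DemoStream *s)
{
    s->live_sections++;
}

/* Setup happens before the loop, so it is not repeated per checkpoint. */
static void demo_process_checkpoints(DemoStream *s, int rounds)
{
    int i;

    demo_prepare_before_save(s);
    for (i = 0; i < rounds; i++) {
        demo_checkpoint_transaction(s);
    }
}
```

After three rounds the model has sent the setup data once and the live data three times, which is the saving the commit message describes.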
[Qemu-devel] [PATCH v2 11/18] savevm: split save/find loadvm_handlers entry into two helper functions
COLO's checkpoint process is based on the migration process: every time
we do a checkpoint we repeat the savevm and loadvm process. So
qemu_loadvm_section_start_full() is called repeatedly, and each call
adds all migration section information to the loadvm_handlers list,
which leads to a memory leak.

To fix it, we split the saving and finding of a section entry into two
helper functions, and check whether the section info already exists in
the loadvm_handlers list before saving it.

This modification has no side effects for normal migration.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Reviewed-by: Dr. David Alan Gilbert
---
 migration/savevm.c | 55 +++---
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 03ae1bd..f87cd8d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1836,6 +1836,37 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
     }
 }

+static LoadStateEntry *loadvm_add_section_entry(MigrationIncomingState *mis,
+                                                SaveStateEntry *se,
+                                                uint32_t section_id,
+                                                uint32_t version_id)
+{
+    LoadStateEntry *le;
+
+    /* Add entry */
+    le = g_malloc0(sizeof(*le));
+
+    le->se = se;
+    le->section_id = section_id;
+    le->version_id = version_id;
+    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
+    return le;
+}
+
+static LoadStateEntry *loadvm_find_section_entry(MigrationIncomingState *mis,
+                                                 uint32_t section_id)
+{
+    LoadStateEntry *le;
+
+    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
+        if (le->section_id == section_id) {
+            break;
+        }
+    }
+
+    return le;
+}
+
 static int
 qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
 {
@@ -1878,15 +1909,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
         return -EINVAL;
     }

-    /* Add entry */
-    le = g_malloc0(sizeof(*le));
-
-    le->se = se;
-    le->section_id = section_id;
-    le->version_id = version_id;
-    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
-
-    ret = vmstate_load(f, le->se, le->version_id);
+    /* Check if we have saved this section info before, if not, save it */
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
+        le = loadvm_add_section_entry(mis, se, section_id, version_id);
+    }
+    ret = vmstate_load(f, se, version_id);
     if (ret < 0) {
         error_report("error while loading state for instance 0x%x of"
                      " device '%s'", instance_id, idstr);
@@ -1909,12 +1937,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     section_id = qemu_get_be32(f);

     trace_qemu_loadvm_state_section_partend(section_id);
-    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
-        if (le->section_id == section_id) {
-            break;
-        }
-    }
-    if (le == NULL) {
+
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
         error_report("Unknown savevm section %d", section_id);
         return -EINVAL;
     }
--
1.8.3.1
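
The find-before-add pattern that this patch introduces can be sketched in isolation. The following is an illustration only, assuming a hand-rolled singly linked list instead of QEMU's QLIST macros; all Demo*/demo_* names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for LoadStateEntry. */
typedef struct DemoEntry {
    uint32_t section_id;
    uint32_t version_id;
    struct DemoEntry *next;
} DemoEntry;

/* Return the entry for section_id, or NULL if it was never recorded. */
static DemoEntry *demo_find_entry(DemoEntry *head, uint32_t section_id)
{
    DemoEntry *e;

    for (e = head; e != NULL; e = e->next) {
        if (e->section_id == section_id) {
            return e;
        }
    }
    return NULL;
}

/* Allocate a new entry and push it on the front of the list. */
static DemoEntry *demo_add_entry(DemoEntry **head, uint32_t section_id,
                                 uint32_t version_id)
{
    DemoEntry *e = calloc(1, sizeof(*e));

    if (e == NULL) {
        return NULL;
    }
    e->section_id = section_id;
    e->version_id = version_id;
    e->next = *head;
    *head = e;
    return e;
}

/* Reuse an existing entry so repeated loads do not grow the list. */
static DemoEntry *demo_find_or_add(DemoEntry **head, uint32_t section_id,
                                   uint32_t version_id)
{
    DemoEntry *e = demo_find_entry(*head, section_id);

    return e ? e : demo_add_entry(head, section_id, version_id);
}

/* Count the entries, to observe whether the list grew. */
static size_t demo_count(const DemoEntry *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}
```

Calling demo_find_or_add() twice with the same section id returns the same entry and leaves the list at one element, which is the behaviour each repeated checkpoint relies on.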
[Qemu-devel] [PATCH v2 12/18] savevm: split the process of different stages for loadvm/savevm
The loadvm/savevm process has several stages, and in each stage the
incoming side processes different types of sections. We want finer
control over these stages, which benefits COLO performance: we do not
have to save sections of type QEMU_VM_SECTION_START at every checkpoint,
and we can separate saving/loading memory from saving/loading device
state.

So we add three new helper functions, qemu_loadvm_state_begin(),
qemu_load_device_state() and qemu_savevm_live_state(), to handle the
different stages of migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the code of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
v2:
- Use the wrapper qemu_savevm_state_header() to simplify the code
  of qemu_save_device_state() (Dave's suggestion)
---
 include/sysemu/sysemu.h | 6 ++
 migration/savevm.c | 54 ++---
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 8054f53..0255c4e 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);

+void qemu_savevm_live_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
+
 int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state_begin(QEMUFile *f);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_load_device_state(QEMUFile *f);

 extern int autostart;

diff --git a/migration/savevm.c b/migration/savevm.c
index f87cd8d..8c2ce0b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -54,6 +54,7 @@
 #include "qemu/cutils.h"
 #include "io/channel-buffer.h"
 #include "io/channel-file.h"
+#include "migration/colo.h"

 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
@@ -1285,13 +1286,20 @@ done:
     return ret;
 }

-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-    SaveStateEntry *se;
+    /* save QEMU_VM_SECTION_END section */
+    qemu_savevm_state_complete_precopy(f, true);
+    qemu_put_byte(f, QEMU_VM_EOF);
+}

-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;

+    if (!migration_in_colo_state()) {
+        qemu_savevm_state_header(f);
+    }
     cpu_synchronize_all_states();

     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1342,8 +1350,6 @@ enum LoadVMExitCodes {
     LOADVM_QUIT = 1,
 };

-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-
 /* -- incoming postcopy messages -- */
 /* 'advise' arrives before any transfers just to tell us that a postcopy
  * *might* happen - it might be skipped if precopy transferred everything
@@ -1957,7 +1963,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     return 0;
 }

-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2095,6 +2101,40 @@ int qemu_loadvm_state(QEMUFile *f)
     return ret;
 }

+int qemu_loadvm_state_begin(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
+    int ret;
+
+    if (qemu_savevm_state_blocked(&local_err)) {
+        error_report_err(local_err);
+        return -EINVAL;
+    }
+    /* Load QEMU_VM_SECTION_START section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to loadvm begin work: %d", ret);
+    }
+    return ret;
+}
+
+int qemu_load_device_state(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret;
+
+    /* Load QEMU_VM_SECTION_FULL section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to load device state: %d", ret);
+        return ret;
+    }
+
+    cpu_synchronize_all_post_init();
+    return 0;
+}
+
 int save_vmstate(Monitor *mon, const char *name)
 {
     BlockDriverState *bs, *bs1;
--
1.8.3.1
[Qemu-devel] [PATCH v2 09/18] COLO: Flush memory data from ram cache
While the VM is running, the PVM may dirty some pages. We transfer the
PVM's dirty pages to the SVM and store them in the SVM's RAM cache at
the next checkpoint, so the content of the SVM's RAM cache is always the
same as the PVM's memory after a checkpoint.

Instead of flushing the whole content of the RAM cache into the SVM's
memory, we do this more efficiently: flush only the pages dirtied by the
PVM since the last checkpoint. This ensures that the SVM's memory is the
same as the PVM's.

Besides, we must make sure the RAM cache is flushed before the device
state is loaded.

Cc: Juan Quintela
Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
 include/migration/migration.h | 1 +
 migration/ram.c | 40
 migration/trace-events| 2 ++
 3 files changed, 43 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba765eb..2aa7654 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -364,4 +364,5 @@ PostcopyState postcopy_state_set(PostcopyState new_state);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_flush_ram_cache(void);
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 0653a24..df10d4b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2602,6 +2602,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
     bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
+    bool need_flush = false;

     seq_iter++;

@@ -2636,6 +2637,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (migration_incoming_in_colo_state()) {
                 host = colo_cache_from_block_offset(block, addr);
+                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2742,6 +2744,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     wait_for_decompress_done();
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
+
+    if (!ret && ram_cache_enable && need_flush) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }

@@ -2810,6 +2816,40 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }

+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that be dirtied by PVM or SVM or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    unsigned long offset = 0;
+
+    trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        offset = migration_bitmap_find_dirty(&ram_state, block, offset);
+        migration_bitmap_clear_dirty(&ram_state, block, offset);
+
+        if (offset << TARGET_PAGE_BITS >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + (offset << TARGET_PAGE_BITS);
+            src_host = block->colo_cache + (offset << TARGET_PAGE_BITS);
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+    assert(ram_state.migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/trace-events b/migration/trace-events
index b8f01a2..93f4337 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -72,6 +72,8 @@ ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %"
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""

 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
--
1.8.3.1
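
The idea of flushing only dirty pages from the cache can be sketched on plain buffers. This is a simplified illustration, not the patch's algorithm verbatim: it assumes a tiny page size, one flag byte per page instead of the migration dirty bitmap, and a single contiguous block; all DEMO_/demo_ names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tiny page size, for illustration only. */
#define DEMO_PAGE_SIZE 4

/*
 * Copy every page marked dirty from cache into mem and clear its mark.
 * dirty[] holds one flag byte per page (not a packed bitmap).
 * Returns the number of pages flushed.
 */
static size_t demo_flush_cache(unsigned char *mem,
                               const unsigned char *cache,
                               unsigned char *dirty, size_t npages)
{
    size_t page;
    size_t flushed = 0;

    for (page = 0; page < npages; page++) {
        if (!dirty[page]) {
            continue; /* clean page: the VM's memory is left untouched */
        }
        memcpy(mem + page * DEMO_PAGE_SIZE,
               cache + page * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
        dirty[page] = 0;
        flushed++;
    }
    return flushed;
}
```

With three pages and only the middle one marked dirty, a flush copies that single page and leaves the other two as they were.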
[Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare
Hi,

The COLO frame, block replication and COLO net compare have existed in
QEMU for a long time; it is time to integrate these three parts to make
COLO really work.

This series contains some optimizations for the COLO frame, including
separating the process of saving RAM and device state, and using a
COLO_EXIT event to notify users that the VM has exited COLO. Most of
these parts were reviewed long ago in older versions, but since this
series has just been rebased on upstream, which has merged a new series
of migration changes, some of the patches in this series deserve review
again.

We use a notifier/callback method for COLO compare to notify the COLO
frame about inconsistent net packets, and add a handle_event method to
NetFilterClass so that the COLO frame can notify filters and
colo-compare about checkpoint/failover events, which is flexible.

Besides, this series is on top of the
'[PATCH 0/3] colo-compare: fix three bugs' series.

For the newest version, please refer to:
https://github.com/coloft/qemu/tree/colo-for-qemu-2.10-2017-4-22

Please review, thanks.
Cc: Dong eddie
Cc: Jiang yunhong
Cc: Xu Quan
Cc: Jason Wang

zhanghailiang (18):
  net/colo: Add notifier/callback related helpers for filter
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Handle shutdown command for VM in COLO state
  COLO: Add block replication into colo process
  COLO: Load dirty pages into SVM's RAM cache firstly
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush memory data from ram cache
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  savevm: split save/find loadvm_handlers entry into two helper functions
  savevm: split the process of different stages for loadvm/savevm
  COLO: Separate the process of saving/loading ram and device state
  COLO: Split qemu_savevm_state_begin out of checkpoint process
  COLO: flush host dirty ram from cache
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event
  COLO: notify net filters about checkpoint/failover event

 include/exec/ram_addr.h       |   1 +
 include/migration/colo.h      |   1 +
 include/migration/migration.h |   5 +
 include/net/filter.h          |   5 +
 include/sysemu/sysemu.h       |   9 ++
 migration/colo.c              | 242 +++---
 migration/migration.c         |  24 -
 migration/ram.c               | 147 -
 migration/savevm.c            | 113
 migration/trace-events        |   2 +
 net/colo-compare.c            | 110 ++-
 net/colo-compare.h            |   8 ++
 net/colo.c                    | 105 ++
 net/colo.h                    |  19
 net/filter-rewriter.c         |  39 +++
 net/filter.c                  |  16 +++
 net/net.c                     |  28 +
 qapi-schema.json              |  18 +++-
 qapi/event.json               |  21
 vl.c                          |  19 +++-
 20 files changed, 886 insertions(+), 46 deletions(-)
 create mode 100644 net/colo-compare.h

--
1.8.3.1
[Qemu-devel] [PATCH v2 08/18] ram/COLO: Record the dirty pages that SVM received
We record the addresses of the dirty pages received; this helps when flushing the cached pages into the SVM. The trick here is that we record dirty pages by re-using the migration dirty bitmap. In a later patch, we will start the dirty log for the SVM, just like migration; in this way, we can record the dirty pages caused by both the PVM and the SVM, and flush only those dirty pages from the RAM cache when doing a checkpoint. Cc: Juan Quintela Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 29 + 1 file changed, 29 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 05d1b06..0653a24 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2268,6 +2268,9 @@ static inline void *host_from_ram_block_offset(RAMBlock *block, static inline void *colo_cache_from_block_offset(RAMBlock *block, ram_addr_t offset) { +unsigned long *bitmap; +long k; + if (!offset_in_ramblock(block, offset)) { return NULL; } @@ -2276,6 +2279,17 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block, __func__, block->idstr); return NULL; } + +k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS; +bitmap = atomic_rcu_read(&ram_state.ram_bitmap)->bmap; +/* +* During colo checkpoint, we need bitmap of these migrated pages. +* It helps us to decide which pages in ram cache should be flushed +* into VM's RAM later. +*/ +if (!test_and_set_bit(k, bitmap)) { +ram_state.migration_dirty_pages++; +} return block->colo_cache + offset; } @@ -2752,6 +2766,15 @@ int colo_init_ram_cache(void) memcpy(block->colo_cache, block->host, block->used_length); } rcu_read_unlock(); +/* +* Record the dirty pages that sent by PVM, we use this dirty bitmap together +* with to decide which page in cache should be flushed into SVM's RAM. Here +* we use the same name 'ram_bitmap' as for migration. 
+*/ +ram_state.ram_bitmap = g_new0(RAMBitmap, 1); +ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page()); +ram_state.migration_dirty_pages = 0; + return 0; out_locked: @@ -2770,6 +2793,12 @@ out_locked: void colo_release_ram_cache(void) { RAMBlock *block; +RAMBitmap *bitmap = ram_state.ram_bitmap; + +atomic_rcu_set(&ram_state.ram_bitmap, NULL); +if (bitmap) { +call_rcu(bitmap, migration_bitmap_free, rcu); +} rcu_read_lock(); QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { -- 1.8.3.1
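The recording step this patch adds (test-and-set a bit per received page, and count only pages that were not already dirty) can be sketched in plain C as follows. The names `dirty_tracker`, `test_and_set_page` and `record_received_page` are hypothetical, and the helper is deliberately non-atomic, unlike a real `test_and_set_bit`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the bitmap plus dirty-page counter. */
struct dirty_tracker {
    unsigned long *bitmap; /* one bit per page */
    size_t npages;         /* number of valid bits in 'bitmap' */
    size_t dirty_pages;    /* number of bits currently set */
};

/* Set the bit for 'page' and return its previous value (non-atomic). */
static bool test_and_set_page(unsigned long *bitmap, size_t page)
{
    unsigned long *word = &bitmap[page / BITS_PER_WORD];
    unsigned long mask = 1UL << (page % BITS_PER_WORD);
    bool was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Mark 'page' dirty; the counter only moves on a clean-to-dirty change. */
static void record_received_page(struct dirty_tracker *t, size_t page)
{
    assert(page < t->npages);
    if (!test_and_set_page(t->bitmap, page)) {
        t->dirty_pages++;
    }
}
```

Counting only on the clean-to-dirty transition keeps the counter equal to the number of set bits even when the same page is received several times between checkpoints.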
[Qemu-devel] [PATCH v2 04/18] COLO: integrate colo compare with colo frame
For COLO FT, both the PVM and SVM run at the same time and only sync state when needed. So here, let the SVM run while not doing a checkpoint, and change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_sem and colo_delay_timer; fix that here. Cc: Jason Wang Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 42 -- migration/migration.c | 2 +- 2 files changed, 41 insertions(+), 3 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index c19eb3f..a3344ce 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -21,8 +21,11 @@ #include "migration/failover.h" #include "replication.h" #include "qmp-commands.h" +#include "net/colo-compare.h" +#include "net/colo.h" static bool vmstate_loading; +static Notifier packets_compare_notifier; #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024) @@ -332,6 +335,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s, goto out; } +colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err); +if (local_err) { +goto out; +} + /* Disable block migration */ s->params.blk = 0; s->params.shared = 0; @@ -390,6 +398,11 @@ out: return ret; } +static void colo_compare_notify_checkpoint(Notifier *notifier, void *data) +{ +colo_checkpoint_notify(data); +} + static void colo_process_checkpoint(MigrationState *s) { QIOChannelBuffer *bioc; @@ -406,6 +419,9 @@ static void colo_process_checkpoint(MigrationState *s) goto out; } +packets_compare_notifier.notify = colo_compare_notify_checkpoint; +colo_compare_register_notifier(&packets_compare_notifier); + /* * Wait for Secondary finish loading VM states and enter COLO * restore. @@ -451,11 +467,21 @@ out: qemu_fclose(fb); } -timer_del(s->colo_delay_timer); - /* Hope this not to be too long to wait here */ qemu_sem_wait(&s->colo_exit_sem); qemu_sem_destroy(&s->colo_exit_sem); + +/* + * It is safe to unregister notifier after failover finished. 
+ * Besides, colo_delay_timer and colo_checkpoint_sem can't be + * released befor unregister notifier, or there will be use-after-free + * error. + */ +colo_compare_unregister_notifier(&packets_compare_notifier); +timer_del(s->colo_delay_timer); +timer_free(s->colo_delay_timer); +qemu_sem_destroy(&s->colo_checkpoint_sem); + /* * Must be called after failover BH is completed, * Or the failover BH may shutdown the wrong fd that @@ -548,6 +574,11 @@ void *colo_process_incoming_thread(void *opaque) fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc)); object_unref(OBJECT(bioc)); +qemu_mutex_lock_iothread(); +vm_start(); +trace_colo_vm_state_change("stop", "run"); +qemu_mutex_unlock_iothread(); + colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY, &local_err); if (local_err) { @@ -567,6 +598,11 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +qemu_mutex_lock_iothread(); +vm_stop_force_state(RUN_STATE_COLO); +trace_colo_vm_state_change("run", "stop"); +qemu_mutex_unlock_iothread(); + /* FIXME: This is unnecessary for periodic checkpoint mode */ colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY, &local_err); @@ -620,6 +656,8 @@ void *colo_process_incoming_thread(void *opaque) } vmstate_loading = false; +vm_start(); +trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) { diff --git a/migration/migration.c b/migration/migration.c index 353f272..2ade2aa 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -70,7 +70,7 @@ /* The delay time (in ms) between two COLO checkpoints * Note: Please change this default value to 1 when we support hybrid mode. */ -#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200 +#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100) static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); -- 1.8.3.1
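The register/notify/unregister pattern used between colo-compare and the COLO frame can be illustrated with a small self-contained notifier list. This is a sketch, not QEMU's `Notifier` implementation: `notifier_register`, `notifier_unregister`, `notifier_list_notify` and `checkpoint_state` are hypothetical names, and no threading is modelled.

```c
#include <assert.h>
#include <stddef.h>

struct notifier {
    void (*notify)(struct notifier *n, void *data);
    struct notifier *next;
};

struct notifier_list {
    struct notifier *head;
};

static void notifier_register(struct notifier_list *l, struct notifier *n)
{
    assert(n->notify != NULL);
    n->next = l->head;
    l->head = n;
}

static void notifier_unregister(struct notifier_list *l, struct notifier *n)
{
    for (struct notifier **p = &l->head; *p != NULL; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            n->next = NULL;
            return;
        }
    }
}

static void notifier_list_notify(struct notifier_list *l, void *data)
{
    struct notifier *n = l->head;

    while (n != NULL) {
        struct notifier *next = n->next; /* survives unregister in callback */
        n->notify(n, data);
        n = next;
    }
}

/* Example subscriber: counts how many checkpoints were requested. */
struct checkpoint_state {
    struct notifier notifier; /* must stay the first member */
    int requests;
};

static void request_checkpoint(struct notifier *n, void *data)
{
    /* Valid because 'notifier' is the first member of checkpoint_state. */
    struct checkpoint_state *s = (struct checkpoint_state *)n;

    (void)data;
    s->requests++;
}
```

The ordering matters in the same way as in the patch: the subscriber is unregistered first, and only then are the resources its callback touches torn down, so a late notification cannot reach freed state.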
Re: [Qemu-devel] [PATCH v3 00/18] crypto: add afalg-backend support
Hi, This series seems to have some coding style problems. See output below for more information: Message-id: 1492845627-4384-1-git-send-email-longpe...@huawei.com Type: series Subject: [Qemu-devel] [PATCH v3 00/18] crypto: add afalg-backend support === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 # Useful git options git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' 661f423 tests: crypto: add hmac speed benchmark support 5171494 tests: crypto: add hash speed benchmark support 13c3e03 tests: crypto: add cipher speed benchmark support 961a401 crypto: hmac: add af_alg hmac support 063286b crypto: hash: add afalg-backend hash support b5355c4 crypto: cipher: add afalg-backend cipher support 2d2350b crypto: introduce some common functions for af_alg backend cd954fa crypto: hmac: add hmac driver framework a1fd00b crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend 9db6301 crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend f1ed9b7 crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend c024c50 crypto: hmac: move crypto/hmac.h into include/crypto/ 6c9afb6 crypto: hash: add hash driver framework e3e9bc0 crypto: cipher: add cipher driver framework df0cae3 crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend 44bbb18 crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend 8089468 crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend 7b46541 crypto: cipher: introduce context free function === OUTPUT BEGIN === Checking PATCH 1/18: crypto: cipher: introduce context 
free function... Checking PATCH 2/18: crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend... Checking PATCH 3/18: crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend... Checking PATCH 4/18: crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend... Checking PATCH 5/18: crypto: cipher: add cipher driver framework... Checking PATCH 6/18: crypto: hash: add hash driver framework... Checking PATCH 7/18: crypto: hmac: move crypto/hmac.h into include/crypto/... Checking PATCH 8/18: crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend... Checking PATCH 9/18: crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend... Checking PATCH 10/18: crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend... Checking PATCH 11/18: crypto: hmac: add hmac driver framework... Checking PATCH 12/18: crypto: introduce some common functions for af_alg backend... ERROR: g_free(NULL) is safe this check is probably not required #175: FILE: crypto/afalg.c:105: +if (afalg->name) { +g_free(afalg->name); total: 1 errors, 0 warnings, 217 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 13/18: crypto: cipher: add afalg-backend cipher support... Checking PATCH 14/18: crypto: hash: add afalg-backend hash support... Checking PATCH 15/18: crypto: hmac: add af_alg hmac support... Checking PATCH 16/18: tests: crypto: add cipher speed benchmark support... Checking PATCH 17/18: tests: crypto: add hash speed benchmark support... Checking PATCH 18/18: tests: crypto: add hmac speed benchmark support... === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
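The checkpatch error above refers to the fact that g_free(NULL), like the standard free(NULL), is a no-op, so guarding the call with a NULL check is redundant. A minimal sketch using the standard allocator follows; `alg_ctx`, `alg_ctx_new` and `alg_ctx_free` are hypothetical names, not the ones from crypto/afalg.c.

```c
#include <stdlib.h>
#include <string.h>

struct alg_ctx {
    char *name; /* may be NULL if no name was ever assigned */
};

/* Allocate a context, optionally with a copy of 'name'. */
static struct alg_ctx *alg_ctx_new(const char *name)
{
    struct alg_ctx *ctx = calloc(1, sizeof(*ctx));

    if (ctx == NULL) {
        return NULL;
    }
    if (name != NULL) {
        size_t len = strlen(name) + 1;

        ctx->name = malloc(len);
        if (ctx->name == NULL) {
            free(ctx);
            return NULL;
        }
        memcpy(ctx->name, name, len);
    }
    return ctx;
}

static void alg_ctx_free(struct alg_ctx *ctx)
{
    if (ctx == NULL) {
        return; /* this check is needed: ctx is dereferenced below */
    }
    free(ctx->name); /* no NULL check: free(NULL) does nothing */
    free(ctx);
}
```

The check on `ctx` itself stays because the pointer is dereferenced; only the check wrapped around the inner free call is unnecessary.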
Re: [Qemu-devel] [PATCH v3 00/18] crypto: add afalg-backend support
Hi, This series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Message-id: 1492845627-4384-1-git-send-email-longpe...@huawei.com Type: series Subject: [Qemu-devel] [PATCH v3 00/18] crypto: add afalg-backend support === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=8 make docker-test-quick@centos6 make docker-test-mingw@fedora make docker-test-build@min-glib === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/1492845627-4384-1-git-send-email-longpe...@huawei.com -> patchew/1492845627-4384-1-git-send-email-longpe...@huawei.com Switched to a new branch 'test' 661f423 tests: crypto: add hmac speed benchmark support 5171494 tests: crypto: add hash speed benchmark support 13c3e03 tests: crypto: add cipher speed benchmark support 961a401 crypto: hmac: add af_alg hmac support 063286b crypto: hash: add afalg-backend hash support b5355c4 crypto: cipher: add afalg-backend cipher support 2d2350b crypto: introduce some common functions for af_alg backend cd954fa crypto: hmac: add hmac driver framework a1fd00b crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend 9db6301 crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend f1ed9b7 crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend c024c50 crypto: hmac: move crypto/hmac.h into include/crypto/ 6c9afb6 crypto: hash: add hash driver framework e3e9bc0 crypto: cipher: add cipher driver framework df0cae3 crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend 44bbb18 crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend 8089468 crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend 7b46541 crypto: cipher: introduce context free function === OUTPUT BEGIN === Submodule 'dtc' 
(git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into '/var/tmp/patchew-tester-tmp-3q7t_6bj/src/dtc'... Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d' BUILD centos6 make[1]: Entering directory '/var/tmp/patchew-tester-tmp-3q7t_6bj/src' ARCHIVE qemu.tgz ARCHIVE dtc.tgz COPYRUNNER RUN test-quick in qemu:centos6 Packages installed: SDL-devel-1.2.14-7.el6_7.1.x86_64 ccache-3.1.6-2.el6.x86_64 epel-release-6-8.noarch gcc-4.4.7-17.el6.x86_64 git-1.7.1-4.el6_7.1.x86_64 glib2-devel-2.28.8-5.el6.x86_64 libfdt-devel-1.4.0-1.el6.x86_64 make-3.81-23.el6.x86_64 package g++ is not installed pixman-devel-0.32.8-1.el6.x86_64 tar-1.23-15.el6_8.x86_64 zlib-devel-1.2.3-29.el6.x86_64 Environment variables: PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release HOSTNAME=02a3bb3b76a2 TERM=xterm MAKEFLAGS= -j8 HISTSIZE=1000 J=8 USER=root CCACHE_DIR=/var/tmp/ccache EXTRA_CONFIGURE_OPTS= V= SHOW_ENV=1 MAIL=/var/spool/mail/root PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ LANG=en_US.UTF-8 TARGET_LIST= HISTCONTROL=ignoredups SHLVL=1 HOME=/root TEST_DIR=/tmp/qemu-test LOGNAME=root LESSOPEN=||/usr/bin/lesspipe.sh %s FEATURES= dtc DEBUG= G_BROKEN_FILENAMES=1 CCACHE_HASHDIR= _=/usr/bin/env Configure options: --enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install grep: scripts/tracetool/backend/*.py: No such file or directory No C++ compiler available; disabling C++ specific optional code Install prefix/var/tmp/qemu-build/install BIOS directory/var/tmp/qemu-build/install/share/qemu binary directory /var/tmp/qemu-build/install/bin library directory /var/tmp/qemu-build/install/lib module directory /var/tmp/qemu-build/install/lib/qemu libexec directory /var/tmp/qemu-build/install/libexec include directory /var/tmp/qemu-build/install/include config directory /var/tmp/qemu-build/install/etc local 
state directory /var/tmp/qemu-build/install/var Manual directory /var/tmp/qemu-build/install/share/man ELF interp prefix /usr/gnemul/qemu-%M Source path /tmp/qemu-test/src C compilercc Host C compiler cc C++ compiler Objective-C compiler cc ARFLAGS rv CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g QEMU_CFLAGS -I/usr/include/pixman-1 -I$(SRC_PATH)/dtc/libfdt -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wno-missing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all LDFLAGS -Wl,-
[Qemu-devel] [PATCH v3 11/18] crypto: hmac: add hmac driver framework
1) makes the public APIs in hmac-nettle/gcrypt/glib static, and rename them with "nettle/gcrypt/glib" prefix. 2) introduces hmac framework, including QCryptoHmacDriver and new public APIs. Signed-off-by: Longpeng(Mike) --- crypto/hmac-gcrypt.c | 51 --- crypto/hmac-glib.c| 75 +-- crypto/hmac-nettle.c | 52 --- crypto/hmac.c | 44 ++ crypto/hmacpriv.h | 36 + include/crypto/hmac.h | 1 + 6 files changed, 145 insertions(+), 114 deletions(-) create mode 100644 crypto/hmacpriv.h diff --git a/crypto/hmac-gcrypt.c b/crypto/hmac-gcrypt.c index 42489f3..76ca61b 100644 --- a/crypto/hmac-gcrypt.c +++ b/crypto/hmac-gcrypt.c @@ -15,6 +15,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "crypto/hmac.h" +#include "hmacpriv.h" #include static int qcrypto_hmac_alg_map[QCRYPTO_HASH_ALG__MAX] = { @@ -42,10 +43,9 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) return false; } -static QCryptoHmacGcrypt * -qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) +void *qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) { QCryptoHmacGcrypt *ctx; gcry_error_t err; @@ -81,27 +81,24 @@ error: return NULL; } -void qcrypto_hmac_free(QCryptoHmac *hmac) +static void +qcrypto_gcrypt_hmac_ctx_free(QCryptoHmac *hmac) { QCryptoHmacGcrypt *ctx; -if (!hmac) { -return; -} - ctx = hmac->opaque; gcry_mac_close(ctx->handle); g_free(ctx); -g_free(hmac); } -int qcrypto_hmac_bytesv(QCryptoHmac *hmac, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp) +static int +qcrypto_gcrypt_hmac_bytesv(QCryptoHmac *hmac, + const struct iovec *iov, + size_t niov, + uint8_t **result, + size_t *resultlen, + Error **errp) { QCryptoHmacGcrypt *ctx; gcry_error_t err; @@ -147,21 +144,7 @@ int qcrypto_hmac_bytesv(QCryptoHmac *hmac, return 0; } -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) -{ -QCryptoHmac *hmac; 
-QCryptoHmacGcrypt *ctx; - -ctx = qcrypto_hmac_ctx_new(alg, key, nkey, errp); -if (ctx == NULL) { -return NULL; -} - -hmac = g_new0(QCryptoHmac, 1); -hmac->alg = alg; -hmac->opaque = ctx; - -return hmac; -} +QCryptoHmacDriver qcrypto_hmac_lib_driver = { +.hmac_bytesv = qcrypto_gcrypt_hmac_bytesv, +.hmac_free = qcrypto_gcrypt_hmac_ctx_free, +}; diff --git a/crypto/hmac-glib.c b/crypto/hmac-glib.c index d9f88d8..8cf6b22 100644 --- a/crypto/hmac-glib.c +++ b/crypto/hmac-glib.c @@ -15,6 +15,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "crypto/hmac.h" +#include "hmacpriv.h" /* Support for HMAC Algos has been added in GLib 2.30 */ #if GLIB_CHECK_VERSION(2, 30, 0) @@ -49,10 +50,9 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) return false; } -static QCryptoHmacGlib * -qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) +void *qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) { QCryptoHmacGlib *ctx; @@ -78,27 +78,24 @@ error: return NULL; } -void qcrypto_hmac_free(QCryptoHmac *hmac) +static void +qcrypto_glib_hmac_ctx_free(QCryptoHmac *hmac) { QCryptoHmacGlib *ctx; -if (!hmac) { -return; -} - ctx = hmac->opaque; g_hmac_unref(ctx->ghmac); g_free(ctx); -g_free(hmac); } -int qcrypto_hmac_bytesv(QCryptoHmac *hmac, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp) +static int +qcrypto_glib_hmac_bytesv(QCryptoHmac *hmac, + const struct iovec *iov, + size_t niov, + uint8_t **result, + size_t *resultlen, + Error **errp) { QCryptoHmacGlib *ctx; int i, ret; @@ -129,25 +126,6 @@ int qcrypto_hmac_bytesv(QCryptoHmac *hmac, return 0; } -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey,
[Qemu-devel] [PATCH v3 15/18] crypto: hmac: add af_alg hmac support
Adds afalg-backend hmac support: first introduces some private APIs, and then integrates them into qcrypto_hmac_afalg_driver. Signed-off-by: Longpeng(Mike) --- crypto/hash-afalg.c | 108 +++--- crypto/hmac.c | 27 - crypto/hmacpriv.h | 9 + include/crypto/hmac.h | 8 4 files changed, 136 insertions(+), 16 deletions(-) diff --git a/crypto/hash-afalg.c b/crypto/hash-afalg.c index f577c83..0670481 100644 --- a/crypto/hash-afalg.c +++ b/crypto/hash-afalg.c @@ -1,5 +1,5 @@ /* - * QEMU Crypto af_alg-backend hash support + * QEMU Crypto af_alg-backend hash/hmac support * * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. * @@ -16,10 +16,13 @@ #include "qemu-common.h" #include "qapi/error.h" #include "crypto/hash.h" +#include "crypto/hmac.h" #include "hashpriv.h" +#include "hmacpriv.h" static char * qcrypto_afalg_hash_format_name(QCryptoHashAlgorithm alg, + bool is_hmac, Error **errp) { char *name; @@ -55,10 +58,14 @@ qcrypto_afalg_hash_format_name(QCryptoHashAlgorithm alg, } name = g_new0(char, SALG_NAME_LEN_MAX); -ret = snprintf(name, SALG_NAME_LEN_MAX, "%s", alg_name); +if (is_hmac) { +ret = snprintf(name, SALG_NAME_LEN_MAX, "hmac(%s)", alg_name); +} else { /* hash */ +ret = snprintf(name, SALG_NAME_LEN_MAX, "%s", alg_name); +} if (ret < 0 || ret >= SALG_NAME_LEN_MAX) { -error_setg(errp, "Build hash name(name='%s') failed", - alg_name); +error_setg(errp, "Build %s name(name='%s') failed", + is_hmac ? 
"hmac" : "hash", alg_name); g_free(name); return NULL; } @@ -67,12 +74,14 @@ qcrypto_afalg_hash_format_name(QCryptoHashAlgorithm alg, } static QCryptoAFAlg * -qcrypto_afalg_hash_ctx_new(QCryptoHashAlgorithm alg, Error **errp) +qcrypto_afalg_hash_hmac_ctx_new(QCryptoHashAlgorithm alg, +const uint8_t *key, size_t nkey, +bool is_hmac, Error **errp) { QCryptoAFAlg *afalg; char *name; -name = qcrypto_afalg_hash_format_name(alg, errp); +name = qcrypto_afalg_hash_format_name(alg, is_hmac, errp); if (!name) { return NULL; } @@ -84,22 +93,49 @@ qcrypto_afalg_hash_ctx_new(QCryptoHashAlgorithm alg, Error **errp) } afalg->name = name; +/* HMAC needs setkey */ +if (is_hmac) { +if (qemu_setsockopt(afalg->tfmfd, SOL_ALG, ALG_SET_KEY, +key, nkey) != 0) { +error_setg_errno(errp, errno, "Set hmac key failed"); +qcrypto_afalg_comm_free(afalg); +return NULL; +} +} + /* prepare msg header */ afalg->msg = g_new0(struct msghdr, 1); return afalg; } +static QCryptoAFAlg * +qcrypto_afalg_hash_ctx_new(QCryptoHashAlgorithm alg, + Error **errp) +{ +return qcrypto_afalg_hash_hmac_ctx_new(alg, NULL, 0, false, errp); +} + +QCryptoAFAlg * +qcrypto_afalg_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) +{ +return qcrypto_afalg_hash_hmac_ctx_new(alg, key, nkey, true, errp); +} + static int -qcrypto_afalg_hash_bytesv(QCryptoHashAlgorithm alg, - const struct iovec *iov, - size_t niov, uint8_t **result, - size_t *resultlen, - Error **errp) +qcrypto_afalg_hash_hmac_bytesv(QCryptoAFAlg *hmac, + QCryptoHashAlgorithm alg, + const struct iovec *iov, + size_t niov, uint8_t **result, + size_t *resultlen, + Error **errp) { QCryptoAFAlg *afalg; struct iovec outv; int ret = 0; +bool is_hmac = (hmac != NULL) ? 
true : false; const int except_len = qcrypto_hash_digest_len(alg); if (*resultlen == 0) { @@ -112,9 +148,13 @@ qcrypto_afalg_hash_bytesv(QCryptoHashAlgorithm alg, return -1; } -afalg = qcrypto_afalg_hash_ctx_new(alg, errp); -if (afalg == NULL) { -return -1; +if (is_hmac) { +afalg = hmac; +} else { +afalg = qcrypto_afalg_hash_ctx_new(alg, errp); +if (afalg == NULL) { +return -1; +} } /* send data to kernel's crypto core */ @@ -138,10 +178,48 @@ qcrypto_afalg_hash_bytesv(QCryptoHashAlgorithm alg, } out: -qcrypto_afalg_comm_free(afalg); +if (!is_hmac) { +qcrypto_afalg_comm_free(afalg); +} return ret; } +static int +qcrypto_afalg_hash_bytesv(QCryptoHashAlgorithm alg, + const struct iovec *iov, + size_t niov, uint8_t **result, + size_t
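The name-building step in this patch (plain "sha256" for a hash, "hmac(sha256)" for an HMAC, with a check that snprintf did not truncate) can be sketched on its own as below. `format_alg_name` is a hypothetical helper that writes into a caller-supplied buffer rather than allocating as the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write the kernel algorithm name for 'alg_name' into 'buf'.
 * Returns false if formatting failed or the result did not fit.
 */
static bool format_alg_name(char *buf, size_t buflen,
                            const char *alg_name, bool is_hmac)
{
    int ret;

    if (is_hmac) {
        ret = snprintf(buf, buflen, "hmac(%s)", alg_name);
    } else {
        ret = snprintf(buf, buflen, "%s", alg_name);
    }
    /* snprintf returns the untruncated length, excluding the NUL. */
    return ret >= 0 && (size_t)ret < buflen;
}
```

Comparing the return value against the buffer size is what detects truncation; checking only for a negative result would silently accept a cut-off name.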
[Qemu-devel] [PATCH v3 14/18] crypto: hash: add afalg-backend hash support
Adds afalg-backend hash support: first introduces some private APIs, and then integrates them into qcrypto_hash_afalg_driver. Signed-off-by: Longpeng(Mike) --- crypto/Makefile.objs | 1 + crypto/afalgpriv.h | 1 + crypto/hash-afalg.c | 147 +++ crypto/hash.c| 11 crypto/hashpriv.h| 4 ++ 5 files changed, 164 insertions(+) create mode 100644 crypto/hash-afalg.c diff --git a/crypto/Makefile.objs b/crypto/Makefile.objs index d2e8fa8..2b99e08 100644 --- a/crypto/Makefile.objs +++ b/crypto/Makefile.objs @@ -12,6 +12,7 @@ crypto-obj-y += desrfb.o crypto-obj-y += cipher.o crypto-obj-$(CONFIG_AF_ALG) += afalg.o crypto-obj-$(CONFIG_AF_ALG) += cipher-afalg.o +crypto-obj-$(CONFIG_AF_ALG) += hash-afalg.o crypto-obj-y += tlscreds.o crypto-obj-y += tlscredsanon.o crypto-obj-y += tlscredsx509.o diff --git a/crypto/afalgpriv.h b/crypto/afalgpriv.h index e384b15..a0950db 100644 --- a/crypto/afalgpriv.h +++ b/crypto/afalgpriv.h @@ -26,6 +26,7 @@ #endif #define AFALG_TYPE_CIPHER "skcipher" +#define AFALG_TYPE_HASH "hash" #define ALG_OPTYPE_LEN 4 #define ALG_MSGIV_LEN(len) (sizeof(struct af_alg_iv) + (len)) diff --git a/crypto/hash-afalg.c b/crypto/hash-afalg.c new file mode 100644 index 000..f577c83 --- /dev/null +++ b/crypto/hash-afalg.c @@ -0,0 +1,147 @@ +/* + * QEMU Crypto af_alg-backend hash support + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. + * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. 
+ */ +#include "qemu/osdep.h" +#include "qemu/iov.h" +#include "qemu/sockets.h" +#include "qemu-common.h" +#include "qapi/error.h" +#include "crypto/hash.h" +#include "hashpriv.h" + +static char * +qcrypto_afalg_hash_format_name(QCryptoHashAlgorithm alg, + Error **errp) +{ +char *name; +const char *alg_name; +int ret; + +switch (alg) { +case QCRYPTO_HASH_ALG_MD5: +alg_name = "md5"; +break; +case QCRYPTO_HASH_ALG_SHA1: +alg_name = "sha1"; +break; +case QCRYPTO_HASH_ALG_SHA224: +alg_name = "sha224"; +break; +case QCRYPTO_HASH_ALG_SHA256: +alg_name = "sha256"; +break; +case QCRYPTO_HASH_ALG_SHA384: +alg_name = "sha384"; +break; +case QCRYPTO_HASH_ALG_SHA512: +alg_name = "sha512"; +break; +case QCRYPTO_HASH_ALG_RIPEMD160: +alg_name = "rmd160"; +break; + +default: +error_setg(errp, "Unsupported hash algorithm %d", alg); +return NULL; +} + +name = g_new0(char, SALG_NAME_LEN_MAX); +ret = snprintf(name, SALG_NAME_LEN_MAX, "%s", alg_name); +if (ret < 0 || ret >= SALG_NAME_LEN_MAX) { +error_setg(errp, "Build hash name(name='%s') failed", + alg_name); +g_free(name); +return NULL; +} + +return name; +} + +static QCryptoAFAlg * +qcrypto_afalg_hash_ctx_new(QCryptoHashAlgorithm alg, Error **errp) +{ +QCryptoAFAlg *afalg; +char *name; + +name = qcrypto_afalg_hash_format_name(alg, errp); +if (!name) { +return NULL; +} + +afalg = qcrypto_afalg_comm_alloc(AFALG_TYPE_HASH, name, errp); +if (!afalg) { +g_free(name); +return NULL; +} +afalg->name = name; + +/* prepare msg header */ +afalg->msg = g_new0(struct msghdr, 1); + +return afalg; +} + +static int +qcrypto_afalg_hash_bytesv(QCryptoHashAlgorithm alg, + const struct iovec *iov, + size_t niov, uint8_t **result, + size_t *resultlen, + Error **errp) +{ +QCryptoAFAlg *afalg; +struct iovec outv; +int ret = 0; +const int except_len = qcrypto_hash_digest_len(alg); + +if (*resultlen == 0) { +*resultlen = except_len; +*result = g_new0(uint8_t, *resultlen); +} else if (*resultlen != except_len) { +error_setg(errp, + "Result buffer size %zu 
is not match hash %d", + *resultlen, except_len); +return -1; +} + +afalg = qcrypto_afalg_hash_ctx_new(alg, errp); +if (afalg == NULL) { +return -1; +} + +/* send data to kernel's crypto core */ +ret = iov_send_recv(afalg->opfd, iov, niov, +0, iov_size(iov, niov), true); +if (ret < 0) { +error_setg_errno(errp, errno, "Send data to afalg-core failed"); +goto out; +} + +/* hash && get result */ +outv.iov_base = *result; +outv.iov_len = *resultlen; +afalg->msg->msg_iov = &outv; +afalg->msg->msg_iovlen = 1; +ret = recvmsg(afalg->opfd, afalg->msg, 0); +if (ret != -1) { +ret = 0; +} else { +error_setg_errno(errp, errno, "Recv result from afalg-core failed"); +
[Qemu-devel] [PATCH v3 12/18] crypto: introduce some common functions for af_alg backend
The AF_ALG socket family is the userspace interface to the Linux crypto API. This patch adds af_alg family support and some common functions for the af_alg backend. They will be used by the afalg-backend crypto code later. Signed-off-by: Longpeng(Mike) --- configure| 21 + crypto/Makefile.objs | 1 + crypto/afalg.c | 118 +++ crypto/afalgpriv.h | 59 ++ 4 files changed, 199 insertions(+) create mode 100644 crypto/afalg.c create mode 100644 crypto/afalgpriv.h diff --git a/configure b/configure index 6db3044..db0e183 100755 --- a/configure +++ b/configure @@ -4744,6 +4744,23 @@ if compile_prog "" "" ; then have_af_vsock=yes fi +## +# check for usable AF_ALG environment +have_af_alg=no +cat > $TMPC << EOF +#include +#include +#include +int main(void) { +int sock; +sock = socket(AF_ALG, SOCK_SEQPACKET, 0); +return sock; +} +EOF +if compile_prog "" "" ; then +have_af_alg=yes +fi + # # Sparc implicitly links with --relax, which is # incompatible with -r, so --no-relax should be @@ -5774,6 +5791,10 @@ if test "$have_af_vsock" = "yes" ; then echo "CONFIG_AF_VSOCK=y" >> $config_host_mak fi +if test "$have_af_alg" = "yes" ; then + echo "CONFIG_AF_ALG=y" >> $config_host_mak +fi + if test "$have_sysmacros" = "yes" ; then echo "CONFIG_SYSMACROS=y" >> $config_host_mak fi diff --git a/crypto/Makefile.objs b/crypto/Makefile.objs index 1f749f2..2be5a3a 100644 --- a/crypto/Makefile.objs +++ b/crypto/Makefile.objs @@ -10,6 +10,7 @@ crypto-obj-$(if $(CONFIG_NETTLE),n,$(if $(CONFIG_GCRYPT_HMAC),n,y)) += hmac-glib crypto-obj-y += aes.o crypto-obj-y += desrfb.o crypto-obj-y += cipher.o +crypto-obj-$(CONFIG_AF_ALG) += afalg.o crypto-obj-y += tlscreds.o crypto-obj-y += tlscredsanon.o crypto-obj-y += tlscredsx509.o diff --git a/crypto/afalg.c b/crypto/afalg.c new file mode 100644 index 000..80c5cfd --- /dev/null +++ b/crypto/afalg.c @@ -0,0 +1,118 @@ +/* + * QEMU Crypto af_alg support + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. 
+ * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. + */ +#include "qemu/osdep.h" +#include "qemu/cutils.h" +#include "qemu/sockets.h" +#include "qapi/error.h" +#include "afalgpriv.h" + +static bool +qcrypto_afalg_build_saddr(const char *type, const char *name, + struct sockaddr_alg *salg, Error **errp) +{ +salg->salg_family = AF_ALG; + +if (qemu_strnlen(type, SALG_TYPE_LEN_MAX) == SALG_TYPE_LEN_MAX) { +error_setg(errp, "Afalg type(%s) is larger than %d bytes", + type, SALG_TYPE_LEN_MAX); +return false; +} + +if (qemu_strnlen(name, SALG_NAME_LEN_MAX) == SALG_NAME_LEN_MAX) { +error_setg(errp, "Afalg name(%s) is larger than %d bytes", + name, SALG_NAME_LEN_MAX); +return false; +} + +pstrcpy((char *)salg->salg_type, SALG_TYPE_LEN_MAX, type); +pstrcpy((char *)salg->salg_name, SALG_NAME_LEN_MAX, name); + +return true; +} + +static int +qcrypto_afalg_socket_bind(const char *type, const char *name, + Error **errp) +{ +int sbind; +struct sockaddr_alg salg = {0}; + +if (!qcrypto_afalg_build_saddr(type, name, &salg, errp)) { +return -1; +} + +sbind = qemu_socket(AF_ALG, SOCK_SEQPACKET, 0); +if (sbind < 0) { +error_setg_errno(errp, errno, "Failed to create socket"); +return -1; +} + +if (bind(sbind, (const struct sockaddr *)&salg, sizeof(salg)) != 0) { +error_setg_errno(errp, errno, "Failed to bind socket"); +closesocket(sbind); +return -1; +} + +return sbind; +} + +QCryptoAFAlg * +qcrypto_afalg_comm_alloc(const char *type, const char *name, + Error **errp) +{ +QCryptoAFAlg *afalg; + +afalg = g_new0(QCryptoAFAlg, 1); +/* initilize crypto API socket */ +afalg->opfd = -1; +afalg->tfmfd = qcrypto_afalg_socket_bind(type, name, errp); +if (afalg->tfmfd == -1) { +goto error; +} + +afalg->opfd = qemu_accept(afalg->tfmfd, NULL, 0); +if (afalg->opfd == -1) { +error_setg_errno(errp, errno, "Failed to accept socket"); +goto error; +} + 
+return afalg; + +error: +qcrypto_afalg_comm_free(afalg); +return NULL; +} + +void qcrypto_afalg_comm_free(QCryptoAFAlg *afalg) +{ +if (afalg) { +if (afalg->msg) { +g_free(afalg->msg->msg_control); +g_free(afalg->msg); +} + +if (afalg->name) { +g_free(afalg->name); +} + +if (afalg->tfmfd != -1) { +closesocket(afalg->tfmfd); +} + +if (afalg->opfd != -1) { +
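A side note on the length checks in qcrypto_afalg_build_saddr() above: they compare a bounded string length against the field size and reject the name when the two are equal, because such a name would leave no room for the trailing NUL. The sketch below shows that idea in plain C, outside QEMU. DEMO_FIELD_LEN, demo_strnlen() and demo_copy_field() are hypothetical names made up for this illustration; the patch itself uses qemu_strnlen(), pstrcpy() and the SALG_*_LEN_MAX limits from afalgpriv.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field size; the patch takes the real limits from afalgpriv.h. */
#define DEMO_FIELD_LEN 14

/* Length of s, never reading more than max bytes (a portable strnlen). */
static size_t demo_strnlen(const char *s, size_t max)
{
    size_t n = 0;

    while (n < max && s[n] != '\0') {
        n++;
    }
    return n;
}

/*
 * Copy src into a fixed-size field. A source whose bounded length equals
 * the field size has no room left for the trailing NUL, so it is rejected
 * instead of being silently truncated.
 */
static bool demo_copy_field(char dst[DEMO_FIELD_LEN], const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);

    len = demo_strnlen(src, DEMO_FIELD_LEN);
    if (len == DEMO_FIELD_LEN) {
        return false;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return true;
}
```

Rejecting up front, rather than truncating, means a misspelled or overlong algorithm name fails with a clear error instead of binding to some other algorithm whose name happens to match the truncated prefix.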
[Qemu-devel] [PATCH v3 16/18] tests: crypto: add cipher speed benchmark support
We now have two qcrypto backends, the library backend and the afalg backend, but which one is faster? This patch adds a cipher speed benchmark so that the performance can be measured with "make check-speed" or by running "./tests/benchmark-crypto-cipher" directly. Signed-off-by: Longpeng(Mike) --- tests/Makefile.include | 9 - tests/benchmark-crypto-cipher.c | 90 + 2 files changed, 97 insertions(+), 2 deletions(-) create mode 100644 tests/benchmark-crypto-cipher.c diff --git a/tests/Makefile.include b/tests/Makefile.include index 579ec07..3a01523 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -101,6 +101,7 @@ gcov-files-test-write-threshold-y = block/write-threshold.c check-unit-y += tests/test-crypto-hash$(EXESUF) check-unit-y += tests/test-crypto-hmac$(EXESUF) check-unit-y += tests/test-crypto-cipher$(EXESUF) +check-speed-y += tests/benchmark-crypto-cipher$(EXESUF) check-unit-y += tests/test-crypto-secret$(EXESUF) check-unit-$(CONFIG_GNUTLS) += tests/test-crypto-tlscredsx509$(EXESUF) check-unit-$(CONFIG_GNUTLS) += tests/test-crypto-tlssession$(EXESUF) @@ -524,6 +525,7 @@ test-qom-obj-y = $(qom-obj-y) $(test-util-obj-y) test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o \ tests/test-qapi-event.o tests/test-qmp-introspect.o \ $(test-qom-obj-y) +benchmark-crypto-obj-y = $(crypto-obj-y) $(test-qom-obj-y) test-crypto-obj-y = $(crypto-obj-y) $(test-qom-obj-y) test-io-obj-y = $(io-obj-y) $(test-crypto-obj-y) test-block-obj-y = $(block-obj-y) $(test-io-obj-y) tests/iothread.o @@ -628,6 +630,7 @@ tests/test-bitcnt$(EXESUF): tests/test-bitcnt.o $(test-util-obj-y) tests/test-crypto-hash$(EXESUF): tests/test-crypto-hash.o $(test-crypto-obj-y) tests/test-crypto-hmac$(EXESUF): tests/test-crypto-hmac.o $(test-crypto-obj-y) tests/test-crypto-cipher$(EXESUF): tests/test-crypto-cipher.o $(test-crypto-obj-y) +tests/benchmark-crypto-cipher$(EXESUF): tests/benchmark-crypto-cipher.o $(test-crypto-obj-y) tests/test-crypto-secret$(EXESUF):
tests/test-crypto-secret.o $(test-crypto-obj-y) tests/test-crypto-xts$(EXESUF): tests/test-crypto-xts.o $(test-crypto-obj-y) @@ -792,6 +795,7 @@ check-help: @echo " make check-qtest-TARGET Run qtest tests for given target" @echo " make check-qtest Run qtest tests" @echo " make check-unit Run qobject tests" + @echo " make check-speed Run qobject speed tests" @echo " make check-qapi-schemaRun QAPI schema tests" @echo " make check-block Run block tests" @echo " make check-report.htmlGenerates an HTML test report" @@ -822,8 +826,8 @@ $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): check-qtest-%: $(check-qtest-y) $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \ done,) -.PHONY: $(patsubst %, check-%, $(check-unit-y)) -$(patsubst %, check-%, $(check-unit-y)): check-%: % +.PHONY: $(patsubst %, check-%, $(check-unit-y) $(check-speed-y)) +$(patsubst %, check-%, $(check-unit-y) $(check-speed-y)): check-%: % $(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,) $(call quiet-command, \ MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \ @@ -882,6 +886,7 @@ check-tests/qapi-schema/doc-good.texi: tests/qapi-schema/doc-good.test.texi check-qapi-schema: $(patsubst %,check-%, $(check-qapi-schema-y)) check-tests/qapi-schema/doc-good.texi check-qtest: $(patsubst %,check-qtest-%, $(QTEST_TARGETS)) check-unit: $(patsubst %,check-%, $(check-unit-y)) +check-speed: $(patsubst %,check-%, $(check-speed-y)) check-block: $(patsubst %,check-%, $(check-block-y)) check: check-qapi-schema check-unit check-qtest check-clean: diff --git a/tests/benchmark-crypto-cipher.c b/tests/benchmark-crypto-cipher.c new file mode 100644 index 000..40594e3 --- /dev/null +++ b/tests/benchmark-crypto-cipher.c @@ -0,0 +1,90 @@ +/* + * QEMU Crypto cipher speed benchmark + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. + * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. 
See the COPYING file in the + * top-level directory. + */ +#include "qemu/osdep.h" +#include "crypto/init.h" +#include "crypto/cipher.h" + +static void test_cipher_speed(const void *opaque) +{ +QCryptoCipher *cipher; +Error *err = NULL; +double total = 0.0; +size_t chunk_size = (size_t)opaque; +uint8_t *key = NULL, *iv = NULL; +uint8_t *plaintext = NULL, *ciphertext = NULL; +size_t nkey = qcrypto_cipher_get_key_len(QCRYPTO_CIPHER_ALG_AES_128); +size_t niv = qcrypto_cipher_get_iv_len(QCRYPTO_CIPHER_ALG_AES_128, + QCRYPTO_CIPHER_MODE_CBC); + +key = g_new0(uint8_t, nkey); +memset(key, g_test_rand_int(), nkey); + +iv = g_new0(uint8_t, niv); +memset(iv, g_test_rand_int(), niv); +
[Qemu-devel] [PATCH v3 10/18] crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend
Extracts qcrypto_hmac_ctx_new() from qcrypto_hmac_new() for glib-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/hmac-glib.c | 34 -- 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/crypto/hmac-glib.c b/crypto/hmac-glib.c index 08a1fdd..d9f88d8 100644 --- a/crypto/hmac-glib.c +++ b/crypto/hmac-glib.c @@ -49,11 +49,11 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) return false; } -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoHmacGlib * +qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) { -QCryptoHmac *hmac; QCryptoHmacGlib *ctx; if (!qcrypto_hmac_supports(alg)) { @@ -62,9 +62,6 @@ QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, return NULL; } -hmac = g_new0(QCryptoHmac, 1); -hmac->alg = alg; - ctx = g_new0(QCryptoHmacGlib, 1); ctx->ghmac = g_hmac_new(qcrypto_hmac_alg_map[alg], @@ -74,12 +71,10 @@ QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, goto error; } -hmac->opaque = ctx; -return hmac; +return ctx; error: g_free(ctx); -g_free(hmac); return NULL; } @@ -134,6 +129,25 @@ int qcrypto_hmac_bytesv(QCryptoHmac *hmac, return 0; } +QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) +{ +QCryptoHmac *hmac; +QCryptoHmacGlib *ctx; + +ctx = qcrypto_hmac_ctx_new(alg, key, nkey, errp); +if (ctx == NULL) { +return NULL; +} + +hmac = g_new0(QCryptoHmac, 1); +hmac->alg = alg; +hmac->opaque = ctx; + +return hmac; +} + #else bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) -- 1.8.3.1
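The refactoring in this patch splits construction into two steps: a static function that builds only the backend context, and a public constructor that wraps the finished context in the generic object. A minimal standalone sketch of that shape follows. All Demo* names are hypothetical, the context here merely copies the key, and plain calloc()/malloc() stand in for g_new0(); it is not the glib-backend code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend context: just keeps a private copy of the key. */
typedef struct DemoCtx {
    uint8_t *key;
    size_t nkey;
} DemoCtx;

/* Hypothetical generic object, shaped like QCryptoHmac (alg + opaque). */
typedef struct DemoHmac {
    int alg;
    void *opaque;
} DemoHmac;

/* Step 1: build only the backend context; it knows nothing about DemoHmac. */
static DemoCtx *demo_ctx_new(const uint8_t *key, size_t nkey)
{
    DemoCtx *ctx;

    if (key == NULL || nkey == 0) {
        return NULL;
    }
    ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) {
        return NULL;
    }
    ctx->key = malloc(nkey);
    if (ctx->key == NULL) {
        free(ctx);
        return NULL;
    }
    memcpy(ctx->key, key, nkey);
    ctx->nkey = nkey;
    return ctx;
}

static void demo_ctx_free(DemoCtx *ctx)
{
    if (ctx != NULL) {
        free(ctx->key);
        free(ctx);
    }
}

/* Step 2: the public constructor only wraps a ready-made context. */
static DemoHmac *demo_hmac_new(int alg, const uint8_t *key, size_t nkey)
{
    DemoCtx *ctx = demo_ctx_new(key, nkey);
    DemoHmac *hmac;

    if (ctx == NULL) {
        return NULL;
    }
    hmac = calloc(1, sizeof(*hmac));
    if (hmac == NULL) {
        demo_ctx_free(ctx);
        return NULL;
    }
    hmac->alg = alg;
    hmac->opaque = ctx;
    return hmac;
}

static void demo_hmac_free(DemoHmac *hmac)
{
    if (hmac != NULL) {
        demo_ctx_free(hmac->opaque);
        free(hmac);
    }
}
```

With the split in place, the wrapper never has to unwind a half-built generic object when the backend fails, and another caller can reuse the context constructor on its own.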
[Qemu-devel] [PATCH v3 05/18] crypto: cipher: add cipher driver framework
1) makes the public APIs in cipher-nettle/gcrypt/builtin static, and rename them with "nettle/gcrypt/builtin" prefix. 2) introduces cipher framework, including QCryptoCipherDriver and new public APIs. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/cipher-builtin.c | 64 +-- crypto/cipher-gcrypt.c | 72 + crypto/cipher-nettle.c | 71 crypto/cipher.c | 65 crypto/cipherpriv.h | 40 +++ include/crypto/cipher.h | 1 + 6 files changed, 190 insertions(+), 123 deletions(-) create mode 100644 crypto/cipherpriv.h diff --git a/crypto/cipher-builtin.c b/crypto/cipher-builtin.c index 8cf47d1..16a36d4 100644 --- a/crypto/cipher-builtin.c +++ b/crypto/cipher-builtin.c @@ -22,6 +22,7 @@ #include "crypto/aes.h" #include "crypto/desrfb.h" #include "crypto/xts.h" +#include "cipherpriv.h" typedef struct QCryptoCipherBuiltinAESContext QCryptoCipherBuiltinAESContext; struct QCryptoCipherBuiltinAESContext { @@ -466,25 +467,22 @@ static QCryptoCipherBuiltin *qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg, return ctxt; } -void qcrypto_cipher_free(QCryptoCipher *cipher) +static void +qcrypto_builtin_cipher_ctx_free(QCryptoCipher *cipher) { QCryptoCipherBuiltin *ctxt; -if (!cipher) { -return; -} - ctxt = cipher->opaque; ctxt->free(cipher); -g_free(cipher); } -int qcrypto_cipher_encrypt(QCryptoCipher *cipher, - const void *in, - void *out, - size_t len, - Error **errp) +static int +qcrypto_builtin_cipher_encrypt(QCryptoCipher *cipher, + const void *in, + void *out, + size_t len, + Error **errp) { QCryptoCipherBuiltin *ctxt = cipher->opaque; @@ -498,11 +496,12 @@ int qcrypto_cipher_encrypt(QCryptoCipher *cipher, } -int qcrypto_cipher_decrypt(QCryptoCipher *cipher, - const void *in, - void *out, - size_t len, - Error **errp) +static int +qcrypto_builtin_cipher_decrypt(QCryptoCipher *cipher, + const void *in, + void *out, + size_t len, + Error **errp) { QCryptoCipherBuiltin *ctxt = cipher->opaque; @@ -516,9 +515,10 @@ int qcrypto_cipher_decrypt(QCryptoCipher *cipher, } -int 
qcrypto_cipher_setiv(QCryptoCipher *cipher, - const uint8_t *iv, size_t niv, - Error **errp) +static int +qcrypto_builtin_cipher_setiv(QCryptoCipher *cipher, + const uint8_t *iv, size_t niv, + Error **errp) { QCryptoCipherBuiltin *ctxt = cipher->opaque; @@ -526,23 +526,9 @@ int qcrypto_cipher_setiv(QCryptoCipher *cipher, } -QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, - QCryptoCipherMode mode, - const uint8_t *key, size_t nkey, - Error **errp) -{ -QCryptoCipher *cipher; -QCryptoCipherBuiltin *ctxt; - -ctxt = qcrypto_cipher_ctx_new(alg, mode, key, nkey, errp); -if (ctxt == NULL) { -return NULL; -} - -cipher = g_new0(QCryptoCipher, 1); -cipher->alg = alg; -cipher->mode = mode; -cipher->opaque = ctxt; - -return cipher; -} +static struct QCryptoCipherDriver qcrypto_cipher_lib_driver = { +.cipher_encrypt = qcrypto_builtin_cipher_encrypt, +.cipher_decrypt = qcrypto_builtin_cipher_decrypt, +.cipher_setiv = qcrypto_builtin_cipher_setiv, +.cipher_free = qcrypto_builtin_cipher_ctx_free, +}; diff --git a/crypto/cipher-gcrypt.c b/crypto/cipher-gcrypt.c index 871730b..0489147 100644 --- a/crypto/cipher-gcrypt.c +++ b/crypto/cipher-gcrypt.c @@ -20,6 +20,7 @@ #include "qemu/osdep.h" #include "crypto/xts.h" +#include "cipherpriv.h" #include @@ -64,8 +65,9 @@ struct QCryptoCipherGcrypt { uint8_t *iv; }; -static void gcrypt_cipher_free_ctx(QCryptoCipherGcrypt *ctx, - QCryptoCipherMode mode) +static void +qcrypto_gcrypt_cipher_free_ctx(QCryptoCipherGcrypt *ctx, + QCryptoCipherMode mode) { if (!ctx) { return; @@ -239,18 +241,15 @@ static QCryptoCipherGcrypt *qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg, return ctx; error: -gcrypt_cipher_free_ctx(ctx, mode); +qcrypto_gcrypt_cipher_free_ctx(ctx, mode); return NULL; } -void qcrypto_cipher_free(QCryptoCipher *cipher) +static void +qcrypto_gcrypt_ciphe
[Qemu-devel] [PATCH v3 17/18] tests: crypto: add hash speed benchmark support
This patch adds a hash speed benchmark so that the performance can be measured with "make check-speed" or by running "./tests/benchmark-crypto-hash" directly. Signed-off-by: Longpeng(Mike) --- tests/Makefile.include| 2 ++ tests/benchmark-crypto-hash.c | 67 +++ 2 files changed, 69 insertions(+) create mode 100644 tests/benchmark-crypto-hash.c diff --git a/tests/Makefile.include b/tests/Makefile.include index 3a01523..045d16f 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -99,6 +99,7 @@ gcov-files-test-keyval-y = util/keyval.c check-unit-y += tests/test-write-threshold$(EXESUF) gcov-files-test-write-threshold-y = block/write-threshold.c check-unit-y += tests/test-crypto-hash$(EXESUF) +check-speed-y += tests/benchmark-crypto-hash$(EXESUF) check-unit-y += tests/test-crypto-hmac$(EXESUF) check-unit-y += tests/test-crypto-cipher$(EXESUF) check-speed-y += tests/benchmark-crypto-cipher$(EXESUF) @@ -628,6 +629,7 @@ tests/test-mul64$(EXESUF): tests/test-mul64.o $(test-util-obj-y) tests/test-bitops$(EXESUF): tests/test-bitops.o $(test-util-obj-y) tests/test-bitcnt$(EXESUF): tests/test-bitcnt.o $(test-util-obj-y) tests/test-crypto-hash$(EXESUF): tests/test-crypto-hash.o $(test-crypto-obj-y) +tests/benchmark-crypto-hash$(EXESUF): tests/benchmark-crypto-hash.o $(test-crypto-obj-y) tests/test-crypto-hmac$(EXESUF): tests/test-crypto-hmac.o $(test-crypto-obj-y) tests/test-crypto-cipher$(EXESUF): tests/test-crypto-cipher.o $(test-crypto-obj-y) tests/benchmark-crypto-cipher$(EXESUF): tests/benchmark-crypto-cipher.o $(test-crypto-obj-y) diff --git a/tests/benchmark-crypto-hash.c b/tests/benchmark-crypto-hash.c new file mode 100644 index 000..6769d2a --- /dev/null +++ b/tests/benchmark-crypto-hash.c @@ -0,0 +1,67 @@ +/* + * QEMU Crypto hash speed benchmark + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. + * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version.
See the COPYING file in the + * top-level directory. + */ +#include "qemu/osdep.h" +#include "crypto/init.h" +#include "crypto/hash.h" + +static void test_hash_speed(const void *opaque) +{ +size_t chunk_size = (size_t)opaque; +uint8_t *in = NULL, *out = NULL; +size_t out_len = 0; +double total = 0.0; +struct iovec iov; +int ret; + +in = g_new0(uint8_t, chunk_size); +memset(in, g_test_rand_int(), chunk_size); + +iov.iov_base = (char *)in; +iov.iov_len = chunk_size; + +g_test_timer_start(); +do { +ret = qcrypto_hash_bytesv(QCRYPTO_HASH_ALG_SHA256, + &iov, 1, &out, &out_len, + NULL); +g_assert(ret == 0); + +total += chunk_size; +} while (g_test_timer_elapsed() < 5.0); + +total /= 1024 * 1024; /* to MB */ +g_print("sha256: "); +g_print("Testing chunk_size %zu bytes ", chunk_size); +g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last()); +g_print("%.2f MB/sec\n", total / g_test_timer_last()); + +g_free(out); +g_free(in); +} + +int main(int argc, char **argv) +{ +size_t i; +char name[64]; + +g_test_init(&argc, &argv, NULL); +g_assert(qcrypto_init(NULL) == 0); + +for (i = 512; i <= (64 * 1024); i *= 2) { +memset(name, 0, sizeof(name)); +snprintf(name, sizeof(name), "/crypto/hash/speed-%zu", i); +g_test_add_data_func(name, (void *)i, test_hash_speed); +} + +return g_test_run(); +} -- 1.8.3.1
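The arithmetic in this benchmark is simple enough to check by hand: one test is registered per chunk size, doubling from 512 bytes up to 64 KiB, and the throughput is the number of bytes processed, converted to MB, divided by the elapsed seconds. The following sketch restates that outside the GLib test harness. demo_chunk_sizes() and demo_mb_per_sec() are hypothetical helpers for illustration only, not part of the patch.

```c
#include <stddef.h>

/*
 * Fill sizes[] with the doubling chunk sizes first, first*2, ... that are
 * <= last, and return how many were written (at most cap). The i != 0
 * guard stops the loop if first is 0 or the doubling wraps around.
 */
static size_t demo_chunk_sizes(size_t first, size_t last,
                               size_t *sizes, size_t cap)
{
    size_t n = 0;
    size_t i;

    for (i = first; i != 0 && i <= last && n < cap; i *= 2) {
        sizes[n++] = i;
    }
    return n;
}

/* Convert a byte count and an elapsed time to MB/sec (1 MB = 1024 * 1024 bytes). */
static double demo_mb_per_sec(double total_bytes, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    return total_bytes / (1024.0 * 1024.0) / seconds;
}
```

For the 512-byte to 64 KiB range this gives eight chunk sizes, so a "make check-speed" run of this test takes roughly eight times the 5-second measurement window.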
[Qemu-devel] [PATCH v3 18/18] tests: crypto: add hmac speed benchmark support
This patch adds an hmac speed benchmark so that the performance can be measured with "make check-speed" or by running "./tests/benchmark-crypto-hmac" directly. Signed-off-by: Longpeng(Mike) --- tests/Makefile.include| 2 + tests/benchmark-crypto-hmac.c | 96 +++ 2 files changed, 98 insertions(+) create mode 100644 tests/benchmark-crypto-hmac.c diff --git a/tests/Makefile.include b/tests/Makefile.include index 045d16f..7b170be 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -101,6 +101,7 @@ gcov-files-test-write-threshold-y = block/write-threshold.c check-unit-y += tests/test-crypto-hash$(EXESUF) check-speed-y += tests/benchmark-crypto-hash$(EXESUF) check-unit-y += tests/test-crypto-hmac$(EXESUF) +check-speed-y += tests/benchmark-crypto-hmac$(EXESUF) check-unit-y += tests/test-crypto-cipher$(EXESUF) check-speed-y += tests/benchmark-crypto-cipher$(EXESUF) check-unit-y += tests/test-crypto-secret$(EXESUF) @@ -631,6 +632,7 @@ tests/test-bitcnt$(EXESUF): tests/test-bitcnt.o $(test-util-obj-y) tests/test-crypto-hash$(EXESUF): tests/test-crypto-hash.o $(test-crypto-obj-y) tests/benchmark-crypto-hash$(EXESUF): tests/benchmark-crypto-hash.o $(test-crypto-obj-y) tests/test-crypto-hmac$(EXESUF): tests/test-crypto-hmac.o $(test-crypto-obj-y) +tests/benchmark-crypto-hmac$(EXESUF): tests/benchmark-crypto-hmac.o $(test-crypto-obj-y) tests/test-crypto-cipher$(EXESUF): tests/test-crypto-cipher.o $(test-crypto-obj-y) tests/benchmark-crypto-cipher$(EXESUF): tests/benchmark-crypto-cipher.o $(test-crypto-obj-y) tests/test-crypto-secret$(EXESUF): tests/test-crypto-secret.o $(test-crypto-obj-y) diff --git a/tests/benchmark-crypto-hmac.c b/tests/benchmark-crypto-hmac.c new file mode 100644 index 000..be2f2a5 --- /dev/null +++ b/tests/benchmark-crypto-hmac.c @@ -0,0 +1,96 @@ +/* + * QEMU Crypto hmac speed benchmark + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD.
+ * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. + */ +#include "qemu/osdep.h" +#include "crypto/init.h" +#include "crypto/hmac.h" + +#define KEY "monkey monkey monkey monkey" + +static void test_hmac_speed(const void *opaque) +{ +size_t chunk_size = (size_t)opaque; +QCryptoHmac *hmac = NULL; +uint8_t *in = NULL, *out = NULL; +size_t out_len = 0; +double total = 0.0; +struct iovec iov; +Error *err = NULL; +int ret; + +if (!qcrypto_hmac_supports(QCRYPTO_HASH_ALG_SHA256)) { +return; +} + +hmac = qcrypto_hmac_new(QCRYPTO_HASH_ALG_SHA256, (const uint8_t *)KEY, +strlen(KEY), &err); +g_assert(err == NULL); +g_assert(hmac != NULL); + +in = g_new0(uint8_t, chunk_size); +memset(in, g_test_rand_int(), chunk_size); + +iov.iov_base = (char *)in; +iov.iov_len = chunk_size; + +g_test_timer_start(); +do { +ret = qcrypto_hmac_bytesv(hmac, &iov, 1, &out, &out_len, &err); +g_assert(ret == 0); +g_assert(err == NULL); + +#if !defined(CONFIG_NETTLE) && !defined(CONFIG_GCRYPT) +/* + * qcrypto_hmac_bytesv() uses g_checksum_get_digest() to get the + * digest. Once this function has been called, the #GChecksum is + * closed and can no longer be updated with g_checksum_update(). + * So...we must free glib-backend hmac object and renew one here. + */ +qcrypto_hmac_free(hmac); +hmac = qcrypto_hmac_new(QCRYPTO_HASH_ALG_SHA256, (const uint8_t *)KEY, +strlen(KEY), &err); +g_assert(err == NULL); +g_assert(hmac != NULL); +#endif +total += chunk_size; +} while (g_test_timer_elapsed() < 5.0); + +total /= 1024 * 1024; /* to MB */ + +g_print("[drv:%s]", qcrypto_hmac_using_afalg_drv(hmac) ? 
+"afalg" : "libs"); +g_print("hmac(sha256): "); +g_print("Testing chunk_size %zu bytes ", chunk_size); +g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last()); +g_print("%.2f MB/sec\n", total / g_test_timer_last()); + +qcrypto_hmac_free(hmac); +g_free(out); +g_free(in); +} + +int main(int argc, char **argv) +{ +size_t i; +char name[64]; + +g_test_init(&argc, &argv, NULL); +g_assert(qcrypto_init(NULL) == 0); + +for (i = 512; i <= (64 * 1024); i *= 2) { +memset(name, 0, sizeof(name)); +snprintf(name, sizeof(name), "/crypto/hmac/speed-%zu", i); +g_test_add_data_func(name, (void *)i, test_hmac_speed); +} + +return g_test_run(); +} -- 1.8.3.1
[Qemu-devel] [PATCH v3 09/18] crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend
Extracts qcrypto_hmac_ctx_new() from qcrypto_hmac_new() for nettle-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/hmac-nettle.c | 34 -- 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/crypto/hmac-nettle.c b/crypto/hmac-nettle.c index 4a9e6b2..19fbb4f 100644 --- a/crypto/hmac-nettle.c +++ b/crypto/hmac-nettle.c @@ -97,11 +97,11 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) return false; } -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoHmacNettle * +qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) { -QCryptoHmac *hmac; QCryptoHmacNettle *ctx; if (!qcrypto_hmac_supports(alg)) { @@ -110,16 +110,11 @@ QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, return NULL; } -hmac = g_new0(QCryptoHmac, 1); -hmac->alg = alg; - ctx = g_new0(QCryptoHmacNettle, 1); qcrypto_hmac_alg_map[alg].setkey(&ctx->u, nkey, key); -hmac->opaque = ctx; - -return hmac; +return ctx; } void qcrypto_hmac_free(QCryptoHmac *hmac) @@ -173,3 +168,22 @@ int qcrypto_hmac_bytesv(QCryptoHmac *hmac, return 0; } + +QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) +{ +QCryptoHmac *hmac; +QCryptoHmacNettle *ctx; + +ctx = qcrypto_hmac_ctx_new(alg, key, nkey, errp); +if (ctx == NULL) { +return NULL; +} + +hmac = g_new0(QCryptoHmac, 1); +hmac->alg = alg; +hmac->opaque = ctx; + +return hmac; +} -- 1.8.3.1
[Qemu-devel] [PATCH v3 06/18] crypto: hash: add hash driver framework
1) makes the public APIs in hash-nettle/gcrypt/glib static, and rename them with "nettle/gcrypt/glib" prefix. 2) introduces hash framework, including QCryptoHashDriver and new public APIs. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/hash-gcrypt.c | 19 +-- crypto/hash-glib.c | 19 +-- crypto/hash-nettle.c | 19 +-- crypto/hash.c| 13 + crypto/hashpriv.h| 31 +++ 5 files changed, 83 insertions(+), 18 deletions(-) create mode 100644 crypto/hashpriv.h diff --git a/crypto/hash-gcrypt.c b/crypto/hash-gcrypt.c index 7690690..972beaa 100644 --- a/crypto/hash-gcrypt.c +++ b/crypto/hash-gcrypt.c @@ -22,6 +22,7 @@ #include #include "qapi/error.h" #include "crypto/hash.h" +#include "hashpriv.h" static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = { @@ -44,12 +45,13 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg) } -int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp) +static int +qcrypto_gcrypt_hash_bytesv(QCryptoHashAlgorithm alg, + const struct iovec *iov, + size_t niov, + uint8_t **result, + size_t *resultlen, + Error **errp) { int i, ret; gcry_md_hd_t md; @@ -107,3 +109,8 @@ int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, gcry_md_close(md); return -1; } + + +QCryptoHashDriver qcrypto_hash_lib_driver = { +.hash_bytesv = qcrypto_gcrypt_hash_bytesv, +}; diff --git a/crypto/hash-glib.c b/crypto/hash-glib.c index ec99ac9..fb16ac0 100644 --- a/crypto/hash-glib.c +++ b/crypto/hash-glib.c @@ -21,6 +21,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "crypto/hash.h" +#include "hashpriv.h" static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = { @@ -47,12 +48,13 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg) } -int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp) +static int +qcrypto_glib_hash_bytesv(QCryptoHashAlgorithm alg, +const struct iovec 
*iov, +size_t niov, +uint8_t **result, +size_t *resultlen, +Error **errp) { int i, ret; GChecksum *cs; @@ -95,3 +97,8 @@ int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, g_checksum_free(cs); return -1; } + + +QCryptoHashDriver qcrypto_hash_lib_driver = { +.hash_bytesv = qcrypto_glib_hash_bytesv, +}; diff --git a/crypto/hash-nettle.c b/crypto/hash-nettle.c index 6a206dc..96f186f 100644 --- a/crypto/hash-nettle.c +++ b/crypto/hash-nettle.c @@ -21,6 +21,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "crypto/hash.h" +#include "hashpriv.h" #include #include #include @@ -103,12 +104,13 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg) } -int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp) +static int +qcrypto_nettle_hash_bytesv(QCryptoHashAlgorithm alg, + const struct iovec *iov, + size_t niov, + uint8_t **result, + size_t *resultlen, + Error **errp) { int i; union qcrypto_hash_ctx ctx; @@ -152,3 +154,8 @@ int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, return 0; } + + +QCryptoHashDriver qcrypto_hash_lib_driver = { +.hash_bytesv = qcrypto_nettle_hash_bytesv, +}; diff --git a/crypto/hash.c b/crypto/hash.c index 0f1ceac..c43fd87 100644 --- a/crypto/hash.c +++ b/crypto/hash.c @@ -21,6 +21,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "crypto/hash.h" +#include "hashpriv.h" static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = { [QCRYPTO_HASH_ALG_MD5] = 16, @@ -38,6 +39,18 @@ size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg) return qcrypto_hash_alg_size[alg]; } +int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg, +const struct iovec *iov, +size_t niov, +uint8_t **result, +size_t *resultlen, +Error **errp) +{ +return qcrypto_hash_lib_driver.hash_bytesv(alg, iov, niov, +
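The framework introduced by this patch has a simple shape: each backend keeps its implementation static and exports only a table of function pointers, and the public entry point in hash.c forwards to whatever table was linked in. A reduced standalone sketch of that dispatch is shown here. The Demo* names are hypothetical, and the byte-sum "hash" is a toy stand-in for a real digest so that the example has no library dependencies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver table, shaped like QCryptoHashDriver: one op per entry point. */
typedef struct DemoHashDriver {
    int (*hash_bytes)(const uint8_t *buf, size_t len, uint32_t *result);
} DemoHashDriver;

/* A backend keeps its implementation static (toy byte sum, NOT a real hash) ... */
static int demo_sum_hash_bytes(const uint8_t *buf, size_t len, uint32_t *result)
{
    uint32_t acc = 0;
    size_t i;

    if (buf == NULL || result == NULL) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        acc += buf[i];
    }
    *result = acc;
    return 0;
}

/* ... and exposes only the table, as each hash-*.c file does in the patch. */
static DemoHashDriver demo_hash_lib_driver = {
    .hash_bytes = demo_sum_hash_bytes,
};

/* The public API is a thin forwarder, so callers never name a backend. */
static int demo_hash_bytes(const uint8_t *buf, size_t len, uint32_t *result)
{
    return demo_hash_lib_driver.hash_bytes(buf, len, result);
}
```

Because callers only ever see the forwarder, a second table can later be consulted first and the library table used as a fallback without touching any call site.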
[Qemu-devel] [PATCH v3 02/18] crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend
Extracts qcrypto_cipher_ctx_new() from qcrypto_cipher_new() for gcrypt-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/cipher-gcrypt.c | 50 +- 1 file changed, 33 insertions(+), 17 deletions(-) diff --git a/crypto/cipher-gcrypt.c b/crypto/cipher-gcrypt.c index 0ecffa2..871730b 100644 --- a/crypto/cipher-gcrypt.c +++ b/crypto/cipher-gcrypt.c @@ -80,12 +80,12 @@ static void gcrypt_cipher_free_ctx(QCryptoCipherGcrypt *ctx, } -QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, - QCryptoCipherMode mode, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoCipherGcrypt *qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + const uint8_t *key, + size_t nkey, + Error **errp) { -QCryptoCipher *cipher; QCryptoCipherGcrypt *ctx; gcry_error_t err; int gcryalg, gcrymode; @@ -162,10 +162,6 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, return NULL; } -cipher = g_new0(QCryptoCipher, 1); -cipher->alg = alg; -cipher->mode = mode; - ctx = g_new0(QCryptoCipherGcrypt, 1); err = gcry_cipher_open(&ctx->handle, gcryalg, gcrymode, 0); @@ -174,7 +170,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, gcry_strerror(err)); goto error; } -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { +if (mode == QCRYPTO_CIPHER_MODE_XTS) { err = gcry_cipher_open(&ctx->tweakhandle, gcryalg, gcrymode, 0); if (err != 0) { error_setg(errp, "Cannot initialize cipher: %s", @@ -183,7 +179,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, } } -if (cipher->alg == QCRYPTO_CIPHER_ALG_DES_RFB) { +if (alg == QCRYPTO_CIPHER_ALG_DES_RFB) { /* We're using standard DES cipher from gcrypt, so we need * to munge the key so that the results are the same as the * bizarre RFB variant of DES :-) @@ -193,7 +189,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, g_free(rfbkey); ctx->blocksize = 8; } else { -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { +if (mode == QCRYPTO_CIPHER_MODE_XTS) 
{ nkey /= 2; err = gcry_cipher_setkey(ctx->handle, key, nkey); if (err != 0) { @@ -210,7 +206,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, gcry_strerror(err)); goto error; } -switch (cipher->alg) { +switch (alg) { case QCRYPTO_CIPHER_ALG_AES_128: case QCRYPTO_CIPHER_ALG_AES_192: case QCRYPTO_CIPHER_ALG_AES_256: @@ -230,7 +226,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, } } -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { +if (mode == QCRYPTO_CIPHER_MODE_XTS) { if (ctx->blocksize != XTS_BLOCK_SIZE) { error_setg(errp, "Cipher block size %zu must equal XTS block size %d", @@ -240,12 +236,10 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, ctx->iv = g_new0(uint8_t, ctx->blocksize); } -cipher->opaque = ctx; -return cipher; +return ctx; error: gcrypt_cipher_free_ctx(ctx, mode); -g_free(cipher); return NULL; } @@ -385,3 +379,25 @@ int qcrypto_cipher_setiv(QCryptoCipher *cipher, return 0; } + + +QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + const uint8_t *key, size_t nkey, + Error **errp) +{ +QCryptoCipher *cipher; +QCryptoCipherGcrypt *ctx; + +ctx = qcrypto_cipher_ctx_new(alg, mode, key, nkey, errp); +if (ctx == NULL) { +return NULL; +} + +cipher = g_new0(QCryptoCipher, 1); +cipher->alg = alg; +cipher->mode = mode; +cipher->opaque = ctx; + +return cipher; +} -- 1.8.3.1
[Qemu-devel] [PATCH v3 04/18] crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend
Extracts qcrypto_cipher_ctx_new() from qcrypto_cipher_new() for builtin-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/cipher-builtin.c | 101 ++-- 1 file changed, 55 insertions(+), 46 deletions(-) diff --git a/crypto/cipher-builtin.c b/crypto/cipher-builtin.c index b4bc2b9..8cf47d1 100644 --- a/crypto/cipher-builtin.c +++ b/crypto/cipher-builtin.c @@ -235,23 +235,24 @@ static int qcrypto_cipher_setiv_aes(QCryptoCipher *cipher, -static int qcrypto_cipher_init_aes(QCryptoCipher *cipher, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoCipherBuiltin * +qcrypto_cipher_init_aes(QCryptoCipherMode mode, +const uint8_t *key, size_t nkey, +Error **errp) { QCryptoCipherBuiltin *ctxt; -if (cipher->mode != QCRYPTO_CIPHER_MODE_CBC && -cipher->mode != QCRYPTO_CIPHER_MODE_ECB && -cipher->mode != QCRYPTO_CIPHER_MODE_XTS) { +if (mode != QCRYPTO_CIPHER_MODE_CBC && +mode != QCRYPTO_CIPHER_MODE_ECB && +mode != QCRYPTO_CIPHER_MODE_XTS) { error_setg(errp, "Unsupported cipher mode %s", - QCryptoCipherMode_lookup[cipher->mode]); -return -1; + QCryptoCipherMode_lookup[mode]); +return NULL; } ctxt = g_new0(QCryptoCipherBuiltin, 1); -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { +if (mode == QCRYPTO_CIPHER_MODE_XTS) { if (AES_set_encrypt_key(key, nkey * 4, &ctxt->state.aes.key.enc) != 0) { error_setg(errp, "Failed to set encryption key"); goto error; @@ -291,13 +292,11 @@ static int qcrypto_cipher_init_aes(QCryptoCipher *cipher, ctxt->encrypt = qcrypto_cipher_encrypt_aes; ctxt->decrypt = qcrypto_cipher_decrypt_aes; -cipher->opaque = ctxt; - -return 0; +return ctxt; error: g_free(ctxt); -return -1; +return NULL; } @@ -370,16 +369,17 @@ static int qcrypto_cipher_setiv_des_rfb(QCryptoCipher *cipher, } -static int qcrypto_cipher_init_des_rfb(QCryptoCipher *cipher, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoCipherBuiltin * +qcrypto_cipher_init_des_rfb(QCryptoCipherMode mode, +const uint8_t *key, size_t nkey, +Error **errp) { 
QCryptoCipherBuiltin *ctxt; -if (cipher->mode != QCRYPTO_CIPHER_MODE_ECB) { +if (mode != QCRYPTO_CIPHER_MODE_ECB) { error_setg(errp, "Unsupported cipher mode %s", - QCryptoCipherMode_lookup[cipher->mode]); -return -1; + QCryptoCipherMode_lookup[mode]); +return NULL; } ctxt = g_new0(QCryptoCipherBuiltin, 1); @@ -394,9 +394,7 @@ static int qcrypto_cipher_init_des_rfb(QCryptoCipher *cipher, ctxt->encrypt = qcrypto_cipher_encrypt_des_rfb; ctxt->decrypt = qcrypto_cipher_decrypt_des_rfb; -cipher->opaque = ctxt; - -return 0; +return ctxt; } @@ -426,12 +424,13 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg, } -QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, - QCryptoCipherMode mode, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoCipherBuiltin *qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg, +QCryptoCipherMode mode, +const uint8_t *key, +size_t nkey, +Error **errp) { -QCryptoCipher *cipher; +QCryptoCipherBuiltin *ctxt; switch (mode) { case QCRYPTO_CIPHER_MODE_ECB: @@ -444,39 +443,27 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, return NULL; } -cipher = g_new0(QCryptoCipher, 1); -cipher->alg = alg; -cipher->mode = mode; - if (!qcrypto_cipher_validate_key_length(alg, mode, nkey, errp)) { -goto error; +return NULL; } -switch (cipher->alg) { +switch (alg) { case QCRYPTO_CIPHER_ALG_DES_RFB: -if (qcrypto_cipher_init_des_rfb(cipher, key, nkey, errp) < 0) { -goto error; -} +ctxt = qcrypto_cipher_init_des_rfb(mode, key, nkey, errp); break; case QCRYPTO_CIPHER_ALG_AES_128: case QCRYPTO_CIPHER_ALG_AES_192: case QCRYPTO_CIPHER_ALG_AES_256: -if (qcrypto_cipher_init_aes(cipher, key, nkey, errp) < 0) { -goto error; -} +ctxt = qcrypto_cipher_init_aes(mode, key, nkey, errp); break; default: error_setg(errp, "Unsupported cipher algorithm %s", -
[Qemu-devel] [PATCH v3 07/18] crypto: hmac: move crypto/hmac.h into include/crypto/
Moves crypto/hmac.h into include/crypto/, like cipher.h and hash.h. Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/hmac.h | 166 -- include/crypto/hmac.h | 166 ++ 2 files changed, 166 insertions(+), 166 deletions(-) delete mode 100644 crypto/hmac.h create mode 100644 include/crypto/hmac.h diff --git a/crypto/hmac.h b/crypto/hmac.h deleted file mode 100644 index 0d3acd7..000 --- a/crypto/hmac.h +++ /dev/null @@ -1,166 +0,0 @@ -/* - * QEMU Crypto hmac algorithms - * - * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. - * - * This work is licensed under the terms of the GNU GPL, version 2 or - * (at your option) any later version. See the COPYING file in the - * top-level directory. - * - */ - -#ifndef QCRYPTO_HMAC_H -#define QCRYPTO_HMAC_H - -#include "qapi-types.h" - -typedef struct QCryptoHmac QCryptoHmac; -struct QCryptoHmac { -QCryptoHashAlgorithm alg; -void *opaque; -}; - -/** - * qcrypto_hmac_supports: - * @alg: the hmac algorithm - * - * Determine if @alg hmac algorithm is supported by - * the current configured build - * - * Returns: - * true if the algorithm is supported, false otherwise - */ -bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg); - -/** - * qcrypto_hmac_new: - * @alg: the hmac algorithm - * @key: the key bytes - * @nkey: the length of @key - * @errp: pointer to a NULL-initialized error object - * - * Creates a new hmac object with the algorithm @alg - * - * The @key parameter provides the bytes representing - * the secret key to use.
The @nkey parameter specifies - * the length of @key in bytes - * - * Note: must use qcrypto_hmac_free() to release the - * returned hmac object when no longer required - * - * Returns: - * a new hmac object, or NULL on error - */ -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp); - -/** - * qcrypto_hmac_free: - * @hmac: the hmac object - * - * Release the memory associated with @hmac that was - * previously allocated by qcrypto_hmac_new() - */ -void qcrypto_hmac_free(QCryptoHmac *hmac); - -/** - * qcrypto_hmac_bytesv: - * @hmac: the hmac object - * @iov: the array of memory regions to hmac - * @niov: the length of @iov - * @result: pointer to hold output hmac - * @resultlen: pointer to hold length of @result - * @errp: pointer to a NULL-initialized error object - * - * Computes the hmac across all the memory regions - * present in @iov. The @result pointer will be - * filled with raw bytes representing the computed - * hmac, which will have length @resultlen. The - * memory pointer in @result must be released - * with a call to g_free() when no longer required. - * - * Returns: - * 0 on success, -1 on error - */ -int qcrypto_hmac_bytesv(QCryptoHmac *hmac, -const struct iovec *iov, -size_t niov, -uint8_t **result, -size_t *resultlen, -Error **errp); - -/** - * qcrypto_hmac_bytes: - * @hmac: the hmac object - * @buf: the memory region to hmac - * @len: the length of @buf - * @result: pointer to hold output hmac - * @resultlen: pointer to hold length of @result - * @errp: pointer to a NULL-initialized error object - * - * Computes the hmac across all the memory region - * @buf of length @len. The @result pointer will be - * filled with raw bytes representing the computed - * hmac, which will have length @resultlen. The - * memory pointer in @result must be released - * with a call to g_free() when no longer required. 
- * - * Returns: - * 0 on success, -1 on error - */ -int qcrypto_hmac_bytes(QCryptoHmac *hmac, - const char *buf, - size_t len, - uint8_t **result, - size_t *resultlen, - Error **errp); - -/** - * qcrypto_hmac_digestv: - * @hmac: the hmac object - * @iov: the array of memory regions to hmac - * @niov: the length of @iov - * @digest: pointer to hold output hmac - * @errp: pointer to a NULL-initialized error object - * - * Computes the hmac across all the memory regions - * present in @iov. The @digest pointer will be - * filled with the printable hex digest of the computed - * hmac, which will be terminated by '\0'. The - * memory pointer in @digest must be released - * with a call to g_free() when no longer required. - * - * Returns: - * 0 on success, -1 on error - */ -int qcrypto_hmac_digestv(QCryptoHmac *hmac, - const struct iovec *iov, - size_t niov, - char **digest, - Error **errp); - -/** - * qcrypto_hmac_digest: - * @hmac: the hmac object - * @buf: the memory region to hmac - * @len: the length of @buf - * @digest: pointer
[Qemu-devel] [PATCH v3 13/18] crypto: cipher: add afalg-backend cipher support
Adds afalg-backend cipher support: first introduces some private APIs, then integrates them into qcrypto_cipher_afalg_driver. Signed-off-by: Longpeng(Mike) --- crypto/Makefile.objs | 1 + crypto/afalgpriv.h | 9 ++ crypto/cipher-afalg.c | 229 + crypto/cipher.c| 28 +- crypto/cipherpriv.h| 11 +++ include/crypto/cipher.h| 8 ++ tests/test-crypto-cipher.c | 10 +- 7 files changed, 294 insertions(+), 2 deletions(-) create mode 100644 crypto/cipher-afalg.c diff --git a/crypto/Makefile.objs b/crypto/Makefile.objs index 2be5a3a..d2e8fa8 100644 --- a/crypto/Makefile.objs +++ b/crypto/Makefile.objs @@ -11,6 +11,7 @@ crypto-obj-y += aes.o crypto-obj-y += desrfb.o crypto-obj-y += cipher.o crypto-obj-$(CONFIG_AF_ALG) += afalg.o +crypto-obj-$(CONFIG_AF_ALG) += cipher-afalg.o crypto-obj-y += tlscreds.o crypto-obj-y += tlscredsanon.o crypto-obj-y += tlscredsx509.o diff --git a/crypto/afalgpriv.h b/crypto/afalgpriv.h index f1b0ae5..e384b15 100644 --- a/crypto/afalgpriv.h +++ b/crypto/afalgpriv.h @@ -21,6 +21,15 @@ #define SALG_TYPE_LEN_MAX 14 #define SALG_NAME_LEN_MAX 64 +#ifndef SOL_ALG +#define SOL_ALG 279 +#endif + +#define AFALG_TYPE_CIPHER "skcipher" + +#define ALG_OPTYPE_LEN 4 +#define ALG_MSGIV_LEN(len) (sizeof(struct af_alg_iv) + (len)) + typedef struct QCryptoAFAlg QCryptoAFAlg; struct QCryptoAFAlg { diff --git a/crypto/cipher-afalg.c b/crypto/cipher-afalg.c new file mode 100644 index 000..cce8e6b --- /dev/null +++ b/crypto/cipher-afalg.c @@ -0,0 +1,229 @@ +/* + * QEMU Crypto af_alg-backend cipher support + * + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD. + * + * Authors: + *Longpeng(Mike) + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. 
+ */ +#include "qemu/osdep.h" +#include "qemu/sockets.h" +#include "qemu-common.h" +#include "qapi/error.h" +#include "crypto/cipher.h" +#include "cipherpriv.h" + + +static char * +qcrypto_afalg_cipher_format_name(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + Error **errp) +{ +char *name; +const char *alg_name; +const char *mode_name; +int ret; + +switch (alg) { +case QCRYPTO_CIPHER_ALG_AES_128: +case QCRYPTO_CIPHER_ALG_AES_192: +case QCRYPTO_CIPHER_ALG_AES_256: +alg_name = "aes"; +break; +case QCRYPTO_CIPHER_ALG_CAST5_128: +alg_name = "cast5"; +break; +case QCRYPTO_CIPHER_ALG_SERPENT_128: +case QCRYPTO_CIPHER_ALG_SERPENT_192: +case QCRYPTO_CIPHER_ALG_SERPENT_256: +alg_name = "serpent"; +break; +case QCRYPTO_CIPHER_ALG_TWOFISH_128: +case QCRYPTO_CIPHER_ALG_TWOFISH_192: +case QCRYPTO_CIPHER_ALG_TWOFISH_256: +alg_name = "twofish"; +break; + +default: +error_setg(errp, "Unsupported cipher algorithm %d", alg); +return NULL; +} + +mode_name = QCryptoCipherMode_lookup[mode]; + +name = g_new0(char, SALG_NAME_LEN_MAX); +ret = snprintf(name, SALG_NAME_LEN_MAX, "%s(%s)", mode_name, + alg_name); +if (ret < 0 || ret >= SALG_NAME_LEN_MAX) { +error_setg(errp, "Build ciphername(name='%s',mode='%s') failed", + alg_name, mode_name); +g_free(name); +return NULL; +} + +return name; +} + +QCryptoAFAlg * +qcrypto_afalg_cipher_ctx_new(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + const uint8_t *key, + size_t nkey, Error **errp) +{ +QCryptoAFAlg *afalg; +size_t except_niv; +char *name; + +name = qcrypto_afalg_cipher_format_name(alg, mode, errp); +if (!name) { +return NULL; +} + +afalg = qcrypto_afalg_comm_alloc(AFALG_TYPE_CIPHER, name, errp); +if (!afalg) { +g_free(name); +return NULL; +} +afalg->name = name; + +/* setkey */ +if (qemu_setsockopt(afalg->tfmfd, SOL_ALG, ALG_SET_KEY, key, +nkey) != 0) { +error_setg_errno(errp, errno, "Set key failed"); +qcrypto_afalg_comm_free(afalg); +return NULL; +} + +/* prepare msg header */ +afalg->msg = g_new0(struct msghdr, 1); 
+afalg->msg->msg_controllen += CMSG_SPACE(ALG_OPTYPE_LEN); +except_niv = qcrypto_cipher_get_iv_len(alg, mode); +if (except_niv) { +afalg->msg->msg_controllen += CMSG_SPACE(ALG_MSGIV_LEN(except_niv)); +} +afalg->msg->msg_control = g_new0(uint8_t, afalg->msg->msg_controllen); + +/* We use 1st msghdr for crypto-info and 2nd msghdr for IV-info */ +afalg->cmsg = CMSG_FIRSTHDR(afalg->msg); +afalg->cmsg->cmsg_level = SOL_ALG; +afalg->cmsg->cmsg_type = ALG_SET_OP; +afalg->cmsg->cmsg_len = CMSG_SPACE(ALG_OPTYPE_LEN); + +
[Qemu-devel] [PATCH v3 08/18] crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend
1) Fixes a handle leak in qcrypto_hmac_new(): ctx->handle was not closed if gcry_mac_setkey() failed. 2) Extracts qcrypto_hmac_ctx_new() from qcrypto_hmac_new() for gcrypt-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/hmac-gcrypt.c | 35 +-- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/crypto/hmac-gcrypt.c b/crypto/hmac-gcrypt.c index 21189e6..42489f3 100644 --- a/crypto/hmac-gcrypt.c +++ b/crypto/hmac-gcrypt.c @@ -42,11 +42,11 @@ bool qcrypto_hmac_supports(QCryptoHashAlgorithm alg) return false; } -QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoHmacGcrypt * +qcrypto_hmac_ctx_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) { -QCryptoHmac *hmac; QCryptoHmacGcrypt *ctx; gcry_error_t err; @@ -56,9 +56,6 @@ QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, return NULL; } -hmac = g_new0(QCryptoHmac, 1); -hmac->alg = alg; - ctx = g_new0(QCryptoHmacGcrypt, 1); err = gcry_mac_open(&ctx->handle, qcrypto_hmac_alg_map[alg], @@ -73,15 +70,14 @@ QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, if (err != 0) { error_setg(errp, "Cannot set key: %s", gcry_strerror(err)); +gcry_mac_close(ctx->handle); goto error; } -hmac->opaque = ctx; -return hmac; +return ctx; error: g_free(ctx); -g_free(hmac); return NULL; } @@ -150,3 +146,22 @@ int qcrypto_hmac_bytesv(QCryptoHmac *hmac, return 0; } + +QCryptoHmac *qcrypto_hmac_new(QCryptoHashAlgorithm alg, + const uint8_t *key, size_t nkey, + Error **errp) +{ +QCryptoHmac *hmac; +QCryptoHmacGcrypt *ctx; + +ctx = qcrypto_hmac_ctx_new(alg, key, nkey, errp); +if (ctx == NULL) { +return NULL; +} + +hmac = g_new0(QCryptoHmac, 1); +hmac->alg = alg; +hmac->opaque = ctx; + +return hmac; +} -- 1.8.3.1
[Qemu-devel] [PATCH v3 03/18] crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend
Extracts qcrypto_cipher_ctx_new() from qcrypto_cipher_new() for nettle-backend impls. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/cipher-nettle.c | 41 + 1 file changed, 29 insertions(+), 12 deletions(-) diff --git a/crypto/cipher-nettle.c b/crypto/cipher-nettle.c index e04e3a1..e6d6e6c 100644 --- a/crypto/cipher-nettle.c +++ b/crypto/cipher-nettle.c @@ -262,12 +262,12 @@ static void nettle_cipher_free_ctx(QCryptoCipherNettle *ctx) } -QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, - QCryptoCipherMode mode, - const uint8_t *key, size_t nkey, - Error **errp) +static QCryptoCipherNettle *qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + const uint8_t *key, + size_t nkey, + Error **errp) { -QCryptoCipher *cipher; QCryptoCipherNettle *ctx; uint8_t *rfbkey; @@ -287,12 +287,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, return NULL; } -cipher = g_new0(QCryptoCipher, 1); -cipher->alg = alg; -cipher->mode = mode; - ctx = g_new0(QCryptoCipherNettle, 1); -cipher->opaque = ctx; switch (alg) { case QCRYPTO_CIPHER_ALG_DES_RFB: @@ -436,10 +431,10 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, ctx->iv = g_new0(uint8_t, ctx->blocksize); -return cipher; +return ctx; error: -qcrypto_cipher_free(cipher); +nettle_cipher_free_ctx(ctx); return NULL; } @@ -561,3 +556,25 @@ int qcrypto_cipher_setiv(QCryptoCipher *cipher, memcpy(ctx->iv, iv, niv); return 0; } + + +QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, + QCryptoCipherMode mode, + const uint8_t *key, size_t nkey, + Error **errp) +{ +QCryptoCipher *cipher; +QCryptoCipherNettle *ctx; + +ctx = qcrypto_cipher_ctx_new(alg, mode, key, nkey, errp); +if (!ctx) { +return NULL; +} + +cipher = g_new0(QCryptoCipher, 1); +cipher->alg = alg; +cipher->mode = mode; +cipher->opaque = ctx; + +return cipher; +} -- 1.8.3.1
[Qemu-devel] [PATCH v3 01/18] crypto: cipher: introduce context free function
Refactors qcrypto_cipher_free() by splitting it into two parts. One is gcrypt_cipher_free_ctx()/nettle_cipher_free_ctx(), which frees the backend-specific context. This makes the code clearer, and the new function will also be used by a later patch. Reviewed-by: Gonglei Signed-off-by: Longpeng(Mike) --- crypto/cipher-gcrypt.c | 31 ++- crypto/cipher-nettle.c | 18 ++ 2 files changed, 32 insertions(+), 17 deletions(-) diff --git a/crypto/cipher-gcrypt.c b/crypto/cipher-gcrypt.c index 6487eca..0ecffa2 100644 --- a/crypto/cipher-gcrypt.c +++ b/crypto/cipher-gcrypt.c @@ -64,6 +64,22 @@ struct QCryptoCipherGcrypt { uint8_t *iv; }; +static void gcrypt_cipher_free_ctx(QCryptoCipherGcrypt *ctx, + QCryptoCipherMode mode) +{ +if (!ctx) { +return; +} + +gcry_cipher_close(ctx->handle); +if (mode == QCRYPTO_CIPHER_MODE_XTS) { +gcry_cipher_close(ctx->tweakhandle); +} +g_free(ctx->iv); +g_free(ctx); +} + + QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, QCryptoCipherMode mode, const uint8_t *key, size_t nkey, @@ -228,11 +244,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, return cipher; error: -gcry_cipher_close(ctx->handle); -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { -gcry_cipher_close(ctx->tweakhandle); -} -g_free(ctx); +gcrypt_cipher_free_ctx(ctx, mode); g_free(cipher); return NULL; } @@ -240,17 +252,10 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, void qcrypto_cipher_free(QCryptoCipher *cipher) { -QCryptoCipherGcrypt *ctx; if (!cipher) { return; } -ctx = cipher->opaque; -gcry_cipher_close(ctx->handle); -if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) { -gcry_cipher_close(ctx->tweakhandle); -} -g_free(ctx->iv); -g_free(ctx); +gcrypt_cipher_free_ctx(cipher->opaque, cipher->mode); g_free(cipher); } diff --git a/crypto/cipher-nettle.c b/crypto/cipher-nettle.c index dfc9030..e04e3a1 100644 --- a/crypto/cipher-nettle.c +++ b/crypto/cipher-nettle.c @@ -249,6 +249,19 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg, } +static void 
nettle_cipher_free_ctx(QCryptoCipherNettle *ctx) +{ +if (!ctx) { +return; +} + +g_free(ctx->iv); +g_free(ctx->ctx); +g_free(ctx->ctx_tweak); +g_free(ctx); +} + + QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg, QCryptoCipherMode mode, const uint8_t *key, size_t nkey, @@ -440,10 +453,7 @@ void qcrypto_cipher_free(QCryptoCipher *cipher) } ctx = cipher->opaque; -g_free(ctx->iv); -g_free(ctx->ctx); -g_free(ctx->ctx_tweak); -g_free(ctx); +nettle_cipher_free_ctx(ctx); g_free(cipher); } -- 1.8.3.1
[Qemu-devel] [PATCH v3 00/18] crypto: add afalg-backend support
The AF_ALG socket family is the userspace interface to the Linux kernel
crypto API; users can use it to access hardware accelerators.

This patchset adds an afalg backend to the QEMU crypto subsystem. Currently,
when performing encryption/decryption, we try the afalg backend first and
fall back to the library backend if it fails. As a next step, it would
support a command parameter to specify which backend to prefer, plus some
other improvements.

I measured the performance of the afalg-backend impls by testing how much
data could be encrypted in 5 seconds.

NOTE: If we use specific hardware crypto cards, I think the afalg backend
would be even faster.

test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz

*sha256*
chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)   MB/sec(nettle)
512                 93.03                        185.87
1024                146.32                       201.78
2048                213.32                       210.93
4096                275.48                       215.26
8192                321.77                       217.49
16384               349.60                       219.26
32768               363.59                       219.73
65536               375.79                       219.99

*hmac(sha256)*
chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)   MB/sec(nettle)
512                 71.26                        165.55
1024                117.43                       189.15
2048                180.96                       203.24
4096                247.60                       211.38
8192                301.99                       215.65
16384               340.79                       218.22
32768               365.51                       219.49
65536               377.92                       220.24

*cbc(aes128)*
chunk_size(bytes)   MB/sec(afalg:cbc-aes-aesni)  MB/sec(nettle)
512                 371.76                       188.41
1024                559.86                       189.64
2048                768.66                       192.11
4096                939.15                       192.40
8192                1029.48                      192.49
16384               1072.79                      190.52
32768               1109.38                      190.41
65536               1102.38                      190.40

---
Changes since v2:
- init sockaddr_alg object when it's defined. [Gonglei]
- fix some superfluous initialization. [Gonglei]
- s/opeartion/operation/g in crypto/afalgpriv.h. [Gonglei]
- check 'niv' in qcrypto_afalg_cipher_setiv. [Gonglei]

Changes since v1:
- use "make check-speed" to test the performance. [Daniel]
- put private definitions into crypto/***priv.h. [Daniel]
- remove afalg socket from qapi-schema, put them into crypto/. [Daniel]
- some error-reporting changes. [Daniel]
- s/QCryptoAfalg/QCryptoAFAlg. [Daniel]
- use snprintf with bounds checking instead of sprintf. [Daniel]
- use "qcrypto_afalg_" prefix and "qcrypto_nettle(gcrypt,glib,builtin)_" prefix. [Daniel]
- add testing results in cover-letter. [Gonglei]

---
Longpeng(Mike) (18):
  crypto: cipher: introduce context free function
  crypto: cipher: introduce qcrypto_cipher_ctx_new for gcrypt-backend
  crypto: cipher: introduce qcrypto_cipher_ctx_new for nettle-backend
  crypto: cipher: introduce qcrypto_cipher_ctx_new for builtin-backend
  crypto: cipher: add cipher driver framework
  crypto: hash: add hash driver framework
  crypto: hmac: move crypto/hmac.h into include/crypto/
  crypto: hmac: introduce qcrypto_hmac_ctx_new for gcrypt-backend
  crypto: hmac: introduce qcrypto_hmac_ctx_new for nettle-backend
  crypto: hmac: introduce qcrypto_hmac_ctx_new for glib-backend
  crypto: hmac: add hmac driver framework
  crypto: introduce some common functions for af_alg backend
  crypto: cipher: add afalg-backend cipher support
  crypto: hash: add afalg-backend hash support
  crypto: hmac: add af_alg hmac support
  tests: crypto: add cipher speed benchmark support
  tests: crypto: add hash speed benchmark support
  tests: crypto: add hmac speed benchmark support

 configure               |  21
 crypto/Makefile.objs    |   3 +
 crypto/afalg.c          | 118 +
 crypto/afalgpriv.h      |  69
 crypto/cipher-afalg.c   | 229
 crypto/cipher-builtin.c | 125 +++---
 crypto/cipher-gcrypt.c  | 105 +-
 crypto/cipher-nettle.c  |  84 +--
 crypto/cipher.c         |  91
 crypto/cipherpriv.h     |  51 +
 crypto/hash-afalg.c     | 225 +++
 crypto/hash-gcrypt.c    |  19 ++--
 crypto/hash-glib.c      |  19 ++--
 crypto/hash-nettle.c    |  19 ++--
 crypto/hash.c           |  24 +
 crypto/hashpriv.h       |  35 ++
 crypto/hmac-gcrypt.c