Re: [Qemu-devel] [PATCH v6] Add optionrom compatible with fw_cfg DMA version
On Mon, 25 Apr 2016 17:04:41 +0100 "Richard W.M. Jones" wrote: > From: Marc Marí > > This optionrom is based on linuxboot.S. > > Signed-off-by: Marc Marí > Signed-off-by: Richard W.M. Jones > --- > .gitignore| 4 + > hw/i386/pc.c | 10 +- > hw/nvram/fw_cfg.c | 2 +- > include/hw/nvram/fw_cfg.h | 1 + > pc-bios/optionrom/Makefile| 13 +- > pc-bios/optionrom/linuxboot_dma.c | 302 > ++ 6 files changed, 327 > insertions(+), 5 deletions(-) create mode 100644 > pc-bios/optionrom/linuxboot_dma.c Not sure if I should send my reviewed-by if it's already signed-off by me, but here it goes: Reviewed-by: Marc Marí
Re: [Qemu-devel] [PATCH for-2.7 3/8] libqos: drop duplicated virtio_config.h definitions
On Sun, 8 May 2016 20:22:53 +0200 Marc Marí wrote: > On Mon, 25 Apr 2016 13:46:08 +0100 > Stefan Hajnoczi wrote: > > > Signed-off-by: Stefan Hajnoczi > > --- > > tests/libqos/virtio.c | 19 ++- > > tests/libqos/virtio.h | 9 - > > tests/virtio-blk-test.c | 5 +++-- > > 3 files changed, 13 insertions(+), 20 deletions(-) > > > > diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c > > index 613dece..ee9e892 100644 > > --- a/tests/libqos/virtio.c > > +++ b/tests/libqos/virtio.c > > @@ -11,6 +11,7 @@ > > #include > > #include "libqtest.h" > > #include "libqos/virtio.h" > > +#include "standard-headers/linux/virtio_config.h" > > > > uint8_t qvirtio_config_readb(const QVirtioBus *bus, QVirtioDevice > > *d, uint64_t > > addr) @@ -55,28 +56,28 @@ QVirtQueue *qvirtqueue_setup(const > > QVirtioBus *bus, QVirtioDevice *d, > > void qvirtio_reset(const QVirtioBus *bus, QVirtioDevice *d) > > { > > -bus->set_status(d, QVIRTIO_RESET); > > -g_assert_cmphex(bus->get_status(d), ==, QVIRTIO_RESET); > > +bus->set_status(d, 0); > > +g_assert_cmphex(bus->get_status(d), ==, 0); > > } > > > > void qvirtio_set_acknowledge(const QVirtioBus *bus, QVirtioDevice > > *d) { > > -bus->set_status(d, bus->get_status(d) | QVIRTIO_ACKNOWLEDGE); > > -g_assert_cmphex(bus->get_status(d), ==, QVIRTIO_ACKNOWLEDGE); > > +bus->set_status(d, bus->get_status(d) | > > VIRTIO_CONFIG_S_ACKNOWLEDGE); > > +g_assert_cmphex(bus->get_status(d), ==, > > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > > > void qvirtio_set_driver(const QVirtioBus *bus, QVirtioDevice *d) > > { > > -bus->set_status(d, bus->get_status(d) | QVIRTIO_DRIVER); > > +bus->set_status(d, bus->get_status(d) | > > VIRTIO_CONFIG_S_DRIVER); g_assert_cmphex(bus->get_status(d), ==, > > -QVIRTIO_DRIVER | > > QVIRTIO_ACKNOWLEDGE); > > +VIRTIO_CONFIG_S_DRIVER | > > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > > > void qvirtio_set_driver_ok(const QVirtioBus *bus, QVirtioDevice *d) > > { > > -bus->set_status(d, bus->get_status(d) | QVIRTIO_DRIVER_OK); > > -g_assert_cmphex(bus->get_status(d), ==, > > -QVIRTIO_DRIVER_OK | QVIRTIO_DRIVER | > > QVIRTIO_ACKNOWLEDGE); > > +bus->set_status(d, bus->get_status(d) | > > VIRTIO_CONFIG_S_DRIVER_OK); > > +g_assert_cmphex(bus->get_status(d), ==, > > VIRTIO_CONFIG_S_DRIVER_OK | > > +VIRTIO_CONFIG_S_DRIVER | > > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > > > void qvirtio_wait_queue_isr(const QVirtioBus *bus, QVirtioDevice > > *d, diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h > > index e663bcf..993ae0e 100644 > > --- a/tests/libqos/virtio.h > > +++ b/tests/libqos/virtio.h > > @@ -12,13 +12,6 @@ > > > > #include "libqos/malloc.h" > > > > -#define QVIRTIO_RESET 0x0 > > -#define QVIRTIO_ACKNOWLEDGE 0x1 > > -#define QVIRTIO_DRIVER 0x2 > > -#define QVIRTIO_DRIVER_OK 0x4 > > - > > -#define QVIRTIO_F_NOTIFY_ON_EMPTY 0x0100 > > -#define QVIRTIO_F_ANY_LAYOUT0x0800 > > #define QVIRTIO_F_RING_INDIRECT_DESC0x1000 > > #define QVIRTIO_F_RING_EVENT_IDX0x2000 > > #define QVIRTIO_F_BAD_FEATURE 0x4000 > > @@ -27,8 +20,6 @@ > > #define QVRING_DESC_F_WRITE 0x2 > > #define QVRING_DESC_F_INDIRECT 0x4 > > > > -#define QVIRTIO_F_NOTIFY_ON_EMPTY 0x0100 > > -#define QVIRTIO_F_ANY_LAYOUT0x0800 > > #define QVIRTIO_F_RING_INDIRECT_DESC0x1000 > > #define QVIRTIO_F_RING_EVENT_IDX0x2000 > > #define QVIRTIO_F_BAD_FEATURE 0x4000 > > diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c > > index 02107a6..bfeffc4 100644 > > --- a/tests/virtio-blk-test.c > > +++ b/tests/virtio-blk-test.c > > @@ -20,6 +20,7 @@ > > #include "libqos/malloc-generic.h" > > #include "qemu/bswap.h" > > #include "standard-headers/linux/virtio_ids.h" > > +#include "standard-headers/linux/virtio_config.h" > > > > #define QVIRTIO_BLK_F_BARRIER 0x0001 > > #define QVIRTIO_BLK_F_SIZE_MAX 0x0002 > > @@ -240,7 +241,7 @@ static void test_basic(const QVirtioBus *bus, > > QVirtioDevic
Re: [Qemu-devel] [PATCH for-2.7 3/8] libqos: drop duplicated virtio_config.h definitions
On Mon, 25 Apr 2016 13:46:08 +0100 Stefan Hajnoczi wrote: > Signed-off-by: Stefan Hajnoczi > --- > tests/libqos/virtio.c | 19 ++- > tests/libqos/virtio.h | 9 - > tests/virtio-blk-test.c | 5 +++-- > 3 files changed, 13 insertions(+), 20 deletions(-) > > diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c > index 613dece..ee9e892 100644 > --- a/tests/libqos/virtio.c > +++ b/tests/libqos/virtio.c > @@ -11,6 +11,7 @@ > #include > #include "libqtest.h" > #include "libqos/virtio.h" > +#include "standard-headers/linux/virtio_config.h" > > uint8_t qvirtio_config_readb(const QVirtioBus *bus, QVirtioDevice *d, > uint64_t > addr) @@ -55,28 +56,28 @@ QVirtQueue *qvirtqueue_setup(const > QVirtioBus *bus, QVirtioDevice *d, > void qvirtio_reset(const QVirtioBus *bus, QVirtioDevice *d) > { > -bus->set_status(d, QVIRTIO_RESET); > -g_assert_cmphex(bus->get_status(d), ==, QVIRTIO_RESET); > +bus->set_status(d, 0); > +g_assert_cmphex(bus->get_status(d), ==, 0); > } > > void qvirtio_set_acknowledge(const QVirtioBus *bus, QVirtioDevice *d) > { > -bus->set_status(d, bus->get_status(d) | QVIRTIO_ACKNOWLEDGE); > -g_assert_cmphex(bus->get_status(d), ==, QVIRTIO_ACKNOWLEDGE); > +bus->set_status(d, bus->get_status(d) | > VIRTIO_CONFIG_S_ACKNOWLEDGE); > +g_assert_cmphex(bus->get_status(d), ==, > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > void qvirtio_set_driver(const QVirtioBus *bus, QVirtioDevice *d) > { > -bus->set_status(d, bus->get_status(d) | QVIRTIO_DRIVER); > +bus->set_status(d, bus->get_status(d) | VIRTIO_CONFIG_S_DRIVER); > g_assert_cmphex(bus->get_status(d), ==, > -QVIRTIO_DRIVER | > QVIRTIO_ACKNOWLEDGE); > +VIRTIO_CONFIG_S_DRIVER | > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > void qvirtio_set_driver_ok(const QVirtioBus *bus, QVirtioDevice *d) > { > -bus->set_status(d, bus->get_status(d) | QVIRTIO_DRIVER_OK); > -g_assert_cmphex(bus->get_status(d), ==, > -QVIRTIO_DRIVER_OK | QVIRTIO_DRIVER | > QVIRTIO_ACKNOWLEDGE); > +bus->set_status(d, bus->get_status(d) | > VIRTIO_CONFIG_S_DRIVER_OK); > +g_assert_cmphex(bus->get_status(d), ==, > VIRTIO_CONFIG_S_DRIVER_OK | > +VIRTIO_CONFIG_S_DRIVER | > VIRTIO_CONFIG_S_ACKNOWLEDGE); } > > void qvirtio_wait_queue_isr(const QVirtioBus *bus, QVirtioDevice *d, > diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h > index e663bcf..993ae0e 100644 > --- a/tests/libqos/virtio.h > +++ b/tests/libqos/virtio.h > @@ -12,13 +12,6 @@ > > #include "libqos/malloc.h" > > -#define QVIRTIO_RESET 0x0 > -#define QVIRTIO_ACKNOWLEDGE 0x1 > -#define QVIRTIO_DRIVER 0x2 > -#define QVIRTIO_DRIVER_OK 0x4 > - > -#define QVIRTIO_F_NOTIFY_ON_EMPTY 0x0100 > -#define QVIRTIO_F_ANY_LAYOUT0x0800 > #define QVIRTIO_F_RING_INDIRECT_DESC0x1000 > #define QVIRTIO_F_RING_EVENT_IDX0x2000 > #define QVIRTIO_F_BAD_FEATURE 0x4000 > @@ -27,8 +20,6 @@ > #define QVRING_DESC_F_WRITE 0x2 > #define QVRING_DESC_F_INDIRECT 0x4 > > -#define QVIRTIO_F_NOTIFY_ON_EMPTY 0x0100 > -#define QVIRTIO_F_ANY_LAYOUT0x0800 > #define QVIRTIO_F_RING_INDIRECT_DESC0x1000 > #define QVIRTIO_F_RING_EVENT_IDX0x2000 > #define QVIRTIO_F_BAD_FEATURE 0x4000 > diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c > index 02107a6..bfeffc4 100644 > --- a/tests/virtio-blk-test.c > +++ b/tests/virtio-blk-test.c > @@ -20,6 +20,7 @@ > #include "libqos/malloc-generic.h" > #include "qemu/bswap.h" > #include "standard-headers/linux/virtio_ids.h" > +#include "standard-headers/linux/virtio_config.h" > > #define QVIRTIO_BLK_F_BARRIER 0x0001 > #define QVIRTIO_BLK_F_SIZE_MAX 0x0002 > @@ -240,7 +241,7 @@ static void test_basic(const QVirtioBus *bus, > QVirtioDevice *dev, > guest_free(alloc, req_addr); > > -if (features & QVIRTIO_F_ANY_LAYOUT) { > +if (features & VIRTIO_F_ANY_LAYOUT) { > /* Write and read with 2 descriptor layout */ > /* Write request */ > req.type = QVIRTIO_BLK_T_OUT; > @@ -607,7 +608,7 @@ static void pci_idx(void) > features = qvirtio_get_features(&qvirtio_pci, &dev->vdev); > features = features & ~(QVIRTIO_F_BAD_FEATURE | > QVIRTIO_F_RING_INDIRECT_DESC | > -QVIRTIO_F_NOTIFY_ON_EMPTY | > QVIRTIO_BLK_F_SCSI); > +VIRTIO_F_NOTIFY_ON_EMPTY | > QVIRTIO_BLK_F_SCSI); qvirtio_set_features(&qvirtio_pci, &dev->vdev, > features); > vqpci = (QVirtQueuePCI *)qvirtqueue_setup(&qvirtio_pci, > &dev->vdev, In standard-headers/linux/virtio_config.h, there is; #define VIRTIO_F_NOTIFY_ON_EMPTY24 #define VIRTIO_F_ANY_LAYOUT 27 whereas in tests/libqos
Re: [Qemu-devel] [PATCH v5] Add optionrom compatible with fw_cfg DMA version
On Fri, 22 Apr 2016 14:02:28 +0100 "Richard W.M. Jones" wrote: > From: Marc Marí > > This optionrom is based on linuxboot.S. > > Signed-off-by: Marc Marí > Signed-off-by: Richard W.M. Jones > --- > .gitignore| 4 + > hw/i386/pc.c | 9 +- > hw/nvram/fw_cfg.c | 2 +- > include/hw/nvram/fw_cfg.h | 1 + > pc-bios/optionrom/Makefile| 13 +- > pc-bios/optionrom/linuxboot_dma.c | 305 > ++ 6 files changed, 329 > insertions(+), 5 deletions(-) create mode 100644 > pc-bios/optionrom/linuxboot_dma.c Thanks for picking this up and solving the issues! Reviewed-by: Marc Marí
Re: [Qemu-devel] [PATCH v4.1] Add optionrom compatible with fw_cfg DMA version
On Mon, 4 Apr 2016 16:02:04 +0100 Stefan Hajnoczi wrote: > On Fri, Apr 01, 2016 at 01:43:47PM +0100, Richard W.M. Jones wrote: > > From: Marc Marí > > > > This optionrom is based on linuxboot.S. > > > > Signed-off-by: Marc Marí > > Signed-off-by: Richard W.M. Jones > > --- > > .gitignore| 4 + > > hw/i386/pc.c | 9 +- > > hw/nvram/fw_cfg.c | 2 +- > > include/hw/nvram/fw_cfg.h | 1 + > > pc-bios/optionrom/Makefile| 7 +- > > pc-bios/optionrom/linuxboot_dma.c | 297 > > ++ 6 files changed, 315 > > insertions(+), 5 deletions(-) create mode 100644 > > pc-bios/optionrom/linuxboot_dma.c > > It would be great to merge this but there's more work necessary. > > I CCed Marc's current email address. He was interning at Red Hat but > is now back at university so the Red Hat address is not being read. > > Marc: What was still missing from this patch? I'm sorry I have not answered before, it's been a very busy week. What's missing (as I remember it): - Debug loading an initrd from the new linuxboot - Other minor comments in the thread (mainly, compile using mingw). - Make it possible to disable DMA fw_cfg. I've been putting this patch at the back of my todo list for too long. Let me address some deadlines for this week, and then I'll finish this patch. Marc > > diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile > > index ce4852a..bdd0cc1 100644 > > --- a/pc-bios/optionrom/Makefile > > +++ b/pc-bios/optionrom/Makefile > > @@ -13,15 +13,18 @@ CFLAGS := -Wall -Wstrict-prototypes -Werror > > -fomit-frame-pointer -fno-builtin CFLAGS += -I$(SRC_PATH) > > CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) > > CFLAGS += $(CFLAGS_NOPIE) > > +CFLAGS += -m32 > > QEMU_CFLAGS = $(CFLAGS) > > > > -build-all: multiboot.bin linuxboot.bin kvmvapic.bin > > +ASFLAGS += -32 > > + > > +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin > > kvmvapic.bin > > # suppress auto-removal of intermediate files > > .SECONDARY: > > > > %.img: %.o > > - $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e > > _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 > > -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > Gerd noted in <1454411187.9300.54.ca...@redhat.com>: > > Hmm, that breaks the windows cross build: > > make: Entering directory `/home/kraxel/projects/qemu/build-win32' > Building optionrom/linuxboot_dma.img > i686-w64-mingw32-ld: unrecognised emulation mode: elf_i386 > Supported emulations: i386pe > make[1]: *** [linuxboot_dma.img] Error 1 > > and then: > > Testing shows two more problems: > > (1) initrd loading is broken, kernel complains it finds only > gibberish: > > [0.934582] Unpacking initramfs... > [1.166983] Initramfs unpacking failed: junk in compressed > archive [1.168458] Freeing initrd memory: 32812k freed > > (2) going back to non-dma boot via -M pc-$old doesn't work, > appearently fw_cfg dma is enabled even for old machine types.
Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version
El Tue, 02 Feb 2016 13:04:55 +0100 Gerd Hoffmann escribió: > On Di, 2016-02-02 at 12:06 +0100, Gerd Hoffmann wrote: > > Hi, > > > > > %.img: %.o > > > - $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e > > > _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 > > > -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > > > Hmm, that breaks the windows cross build: > > > > make: Entering directory `/home/kraxel/projects/qemu/build-win32' > > Building optionrom/linuxboot_dma.img > > i686-w64-mingw32-ld: unrecognised emulation mode: elf_i386 > > Supported emulations: i386pe > > make[1]: *** [linuxboot_dma.img] Error 1 > > Testing shows two more problems: > > (1) initrd loading is broken, kernel complains it finds only > gibberish: > > [0.934582] Unpacking initramfs... > [1.166983] Initramfs unpacking failed: junk in compressed archive > [1.168458] Freeing initrd memory: 32812k freed Will look at it. It is probably being copied to a wrong location. > (2) going back to non-dma boot via -M pc-$old doesn't work, > appearently fw_cfg dma is enabled even for old machine types. I don't remember discussing the topic of machine types when touching fw_cfg DMA. Which means, it probably slipped amongst the other details. But it is now merged and in stable, so it should probably be left as it is now. Should this optionrom be enabled only with the latest machine type? Thanks Marc
Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version
El Tue, 02 Feb 2016 12:06:27 +0100 Gerd Hoffmann escribió: > Hi, > > > %.img: %.o > > - $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e > > _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 > > -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > Hmm, that breaks the windows cross build: > > make: Entering directory `/home/kraxel/projects/qemu/build-win32' > Building optionrom/linuxboot_dma.img > i686-w64-mingw32-ld: unrecognised emulation mode: elf_i386 > Supported emulations: i386pe > make[1]: *** [linuxboot_dma.img] Error 1 Thanks for reporting. I don't know much about Windows cross-builds. Any idea on how to solve the issue? Marc
[Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version
This optionrom is based on linuxboot.S. Added changes proposed by Gerd Hoffman, Stefan Hajnoczi and Kevin O'Connor. All optionroms are now compiled in 32 bits. This also forces to not use any standard C header because this would need cross-compiling support check and a big modification on the configuration script. Signed-off-by: Marc Marí --- .gitignore| 4 + hw/i386/pc.c | 9 +- hw/nvram/fw_cfg.c | 2 +- include/hw/nvram/fw_cfg.h | 1 + pc-bios/optionrom/Makefile| 7 +- pc-bios/optionrom/linuxboot_dma.c | 288 ++ 6 files changed, 306 insertions(+), 5 deletions(-) create mode 100644 pc-bios/optionrom/linuxboot_dma.c diff --git a/.gitignore b/.gitignore index 88a80ff..101d1e0 100644 --- a/.gitignore +++ b/.gitignore @@ -94,6 +94,10 @@ /pc-bios/optionrom/linuxboot.bin /pc-bios/optionrom/linuxboot.raw /pc-bios/optionrom/linuxboot.img +/pc-bios/optionrom/linuxboot_dma.asm +/pc-bios/optionrom/linuxboot_dma.bin +/pc-bios/optionrom/linuxboot_dma.raw +/pc-bios/optionrom/linuxboot_dma.img /pc-bios/optionrom/multiboot.asm /pc-bios/optionrom/multiboot.bin /pc-bios/optionrom/multiboot.raw diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 459260b..00339fa 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms, fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); -option_rom[nb_option_roms].name = "linuxboot.bin"; -option_rom[nb_option_roms].bootindex = 0; +if (fw_cfg_dma_enabled(fw_cfg)) { +option_rom[nb_option_roms].name = "linuxboot_dma.bin"; +option_rom[nb_option_roms].bootindex = 0; +} else { +option_rom[nb_option_roms].name = "linuxboot.bin"; +option_rom[nb_option_roms].bootindex = 0; +} nb_option_roms++; } diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index a1d650d..d0a5753 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id) return version_id == 1; } -static bool fw_cfg_dma_enabled(void *opaque) +bool fw_cfg_dma_enabled(void *opaque) { FWCfgState *s = opaque; diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h index 664eaf6..953e58d 100644 --- a/include/hw/nvram/fw_cfg.h +++ b/include/hw/nvram/fw_cfg.h @@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr, hwaddr dma_addr, AddressSpace *dma_as); FWCfgState *fw_cfg_find(void); +bool fw_cfg_dma_enabled(void *opaque); #endif /* NO_QEMU_PROTOS */ diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index ce4852a..bdd0cc1 100644 --- a/pc-bios/optionrom/Makefile +++ b/pc-bios/optionrom/Makefile @@ -13,15 +13,18 @@ CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin CFLAGS += -I$(SRC_PATH) CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) CFLAGS += $(CFLAGS_NOPIE) +CFLAGS += -m32 QEMU_CFLAGS = $(CFLAGS) -build-all: multiboot.bin linuxboot.bin kvmvapic.bin +ASFLAGS += -32 + +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin # suppress auto-removal of intermediate files .SECONDARY: %.img: %.o - $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") %.raw: %.img $(call quiet-command,$(OBJCOPY) -O binary -j .text $< $@," Building $(TARGET_DIR)$@") diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c new file mode 100644 index 000..c1181cd --- /dev/null +++ b/pc-bios/optionrom/linuxboot_dma.c @@ -0,0 +1,288 @@ +/* + * Linux Boot Option ROM for fw_cfg DMA + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + * + * Copyright (c) 2015 Red Hat Inc. + * Authors: Marc Marí + */ + +asm( +".text\n" +".global _start\n" +"_start:\n" +" .short 0xaa55\n" +" .byte (_end - _start) / 512\n" +" lret\n" +" .org 0x18\n" +"
Re: [Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
> When I first tried to compile this (on fc23), I got: > > In file included from /usr/include/features.h:389:0, > from /usr/include/stdint.h:25, > from > /usr/lib/gcc/x86_64-redhat-linux/5.3.1/include/stdint.h:9, > from linuxboot_dma.c:62: > /usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such > file or directory compilation terminated. > > which I fixed by running "dnf install glibc-devel.i686". Is a > configure check needed? I forgot to address this point this morning. I now put up a clean setup and I had the same issue. I'll add the configure check, but I don't know if there is a nicer solution. Thanks Marc
Re: [Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
On Wed, 27 Jan 2016 19:14:54 -0500 "Kevin O'Connor" wrote: > On Mon, Jan 25, 2016 at 02:17:48PM +0100, Marc Marí wrote: > > This optionrom is based on linuxboot.S. > > Hi Marc, > > Out of curiousity, how does the timing with this option rom compare to > the previous SeaBIOS patches that implemented linux dma loading? This patch QEMU startup time: .092 BIOS startup time: .047 Kernel setup time: .003 Total time: .142 Current master (fw_cfg DMA enabled, but using linuxboot.img) QEMU startup time: .083 BIOS startup time: .047 Kernel setup time: .600 Total time: .730 You can see the time loading the kernel (between SeaBIOS function do_boot and the last instruction in the optionrom) is reduced a lot. > When I first tried to compile this (on fc23), I got: > > In file included from /usr/include/features.h:389:0, > from /usr/include/stdint.h:25, > from > /usr/lib/gcc/x86_64-redhat-linux/5.3.1/include/stdint.h:9, > from linuxboot_dma.c:62: > /usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such > file or directory compilation terminated. > > which I fixed by running "dnf install glibc-devel.i686". Is a > configure check needed? > > > See further comments below. > > [...] > > --- /dev/null > > +++ b/pc-bios/optionrom/linuxboot_dma.c > > @@ -0,0 +1,262 @@ > > +/* > > + * Linux Boot Option ROM for fw_cfg DMA > > + * > > + * This program is free software; you can redistribute it and/or > > modify > > + * it under the terms of the GNU General Public License as > > published by > > + * the Free Software Foundation; either version 2 of the License, > > or > > + * (at your option) any later version. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public > > License > > + * along with this program; if not, see > > <http://www.gnu.org/licenses/>. > > + * > > + * Copyright (c) 2015 Red Hat Inc. > > + * Authors: Marc Marí > > + */ > > + > > +asm( > > +".text\n" > > +".global _start\n" > > +"_start:\n" > > +" .short 0xaa55\n" > > +" .byte (_end - _start) / 512\n" > > +" lret\n" > > +" .org 0x18\n" > > +" .short 0\n" > > +" .short _pnph\n" > > +"_pnph:\n" > > +" .ascii \"$PnP\"\n" > > +" .byte 0x01\n" > > +" .byte ( _pnph_len / 16 )\n" > > +" .short 0x\n" > > +" .byte 0x00\n" > > +" .byte 0x00\n" > > +" .long 0x\n" > > +" .short _manufacturer\n" > > +" .short _product\n" > > +" .long 0x\n" > > +" .short 0x\n" > > +" .short 0x\n" > > +" .short _bev\n" > > +" .short 0x\n" > > +" .short 0x\n" > > +" .equ _pnph_len, . - _pnph\n" > > +" .align 4, 0\n" > > +"_bev:\n" > > +".code16gcc\n" > > +/* DS = CS */ > > +" movw %cs, %ax\n" > > +" movw %ax, %ds\n" > > +" movl %esp, %ebp\n" > > +"run_linuxboot:\n" > > +" cli\n" > > +" cld\n" > > +" jmp load_kernel\n" > > +); > > The run_linuxboot label doesn't seem to be used anywhere. No it isn't, I can remove it or leave it as a reference of "code starts here". > [...] > > +static inline uint16_t readw_addr32(const void *addr) { > > +uint16_t val; > > +asm("addr32 movw %1, %0" : "=r"(val) : "g"(addr)); > > +barrier(); > > +return val; > > +} > > + > > +static inline uint32_t readl_addr32(const void *addr) { > > +uint32_t val; > > +asm("addr32 movl %1, %0" : "=r"(val) : "g"(addr)); > > +barrier(); > > +return val; > > +} > > + > > +static inline void writel_addr32(void *addr, uint32_t val) { > > +barrier(); > > +asm("addr32 movl %0, %1" : : "r"(val), "g"(addr)); > > +} > > The above does not
Re: [Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
On Wed, 27 Jan 2016 12:17:27 -0500 "Kevin O'Connor" wrote: > On Wed, Jan 27, 2016 at 05:57:28PM +0100, Marc Marí wrote: > > On Wed, 27 Jan 2016 16:43:29 + > > Stefan Hajnoczi wrote: > > > > > On Tue, Jan 26, 2016 at 12:26:12PM +0100, Gerd Hoffmann wrote: > > > > On Di, 2016-01-26 at 12:20 +0100, Marc Marí wrote: > > > > > On Tue, 26 Jan 2016 11:11:54 + > > > > > Stefan Hajnoczi wrote: > > > > > > > > > > > On Mon, Jan 25, 2016 at 02:17:48PM +0100, Marc Marí > > > > > > wrote: > > > > > > > +linuxboot_dma.img: linuxboot_dma.o > > > > > > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m > > > > > > > elf_i386 -Ttext 0 -e _start -s -o $@ $<," Building > > > > > > > $(TARGET_DIR)$@") + %.img: %.o > > > > > > > $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) > > > > > > > -Ttext 0 -e _start -s -o $@ $<," Building > > > > > > > $(TARGET_DIR)$@") > > > > > > > > > > > > Why is -m elf_i386 necessary for linuxboot_dma.img but not > > > > > > for the other *.img files? > > > > > > > > > > I cannot give a precise explanation. But if I don't force an > > > > > output type, I get this error: > > > > > > > > > > Building optionrom/linuxboot_dma.img > > > > > ld: i386 architecture of input file `linuxboot_dma.o' is > > > > > incompatible with i386:x86-64 output > > > > > > > > Any chance the linker needs -m32 too? > > > > > > I wonder why this isn't a problem for the existing firmware > > > code. Are we really building x86_64 ELF files for our firmware? > > > > Yes they are x86_64: > > > > pc-bios/optionrom/linuxboot.img: ELF 64-bit LSB executable, x86-64, > > version 1 (SYSV), statically linked, stripped > > > > But as they are written directly in assembly, it does not give any > > problem. Whereas when mixing C and ASM, it does give problems. > > Would it hurt anything to add "-m elf_i386" to the generic assembler > rule and then use that for both targets? (Just to keep the makefile a > little simpler.) It doesn't seem to hurt, and it is simpler. Marc
Re: [Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
On Wed, 27 Jan 2016 16:43:29 + Stefan Hajnoczi wrote: > On Tue, Jan 26, 2016 at 12:26:12PM +0100, Gerd Hoffmann wrote: > > On Di, 2016-01-26 at 12:20 +0100, Marc Marí wrote: > > > On Tue, 26 Jan 2016 11:11:54 + > > > Stefan Hajnoczi wrote: > > > > > > > On Mon, Jan 25, 2016 at 02:17:48PM +0100, Marc Marí wrote: > > > > > +linuxboot_dma.img: linuxboot_dma.o > > > > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m > > > > > elf_i386 -Ttext 0 -e _start -s -o $@ $<," Building > > > > > $(TARGET_DIR)$@") + %.img: %.o > > > > > $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 > > > > > -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > > > > > > > Why is -m elf_i386 necessary for linuxboot_dma.img but not for > > > > the other *.img files? > > > > > > I cannot give a precise explanation. But if I don't force an > > > output type, I get this error: > > > > > > Building optionrom/linuxboot_dma.img > > > ld: i386 architecture of input file `linuxboot_dma.o' is > > > incompatible with i386:x86-64 output > > > > Any chance the linker needs -m32 too? > > I wonder why this isn't a problem for the existing firmware code. Are > we really building x86_64 ELF files for our firmware? Yes they are x86_64: pc-bios/optionrom/linuxboot.img: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped But as they are written directly in assembly, it does not give any problem. Whereas when mixing C and ASM, it does give problems. Marc
Re: [Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
On Tue, 26 Jan 2016 11:11:54 + Stefan Hajnoczi wrote: > On Mon, Jan 25, 2016 at 02:17:48PM +0100, Marc Marí wrote: > > +linuxboot_dma.img: linuxboot_dma.o > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 > > -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") + > > %.img: %.o > > $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e > > _start -s -o $@ $<," Building $(TARGET_DIR)$@") > > Why is -m elf_i386 necessary for linuxboot_dma.img but not for the > other *.img files? I cannot give a precise explanation. But if I don't force an output type, I get this error: Building optionrom/linuxboot_dma.img ld: i386 architecture of input file `linuxboot_dma.o' is incompatible with i386:x86-64 output Marc
[Qemu-devel] [PATCH v3] Add optionrom compatible with fw_cfg DMA version
This optionrom is based on linuxboot.S. Signed-off-by: Marc Marí --- .gitignore| 4 + hw/i386/pc.c | 9 +- hw/nvram/fw_cfg.c | 2 +- include/hw/nvram/fw_cfg.h | 1 + pc-bios/optionrom/Makefile| 6 +- pc-bios/optionrom/linuxboot_dma.c | 262 ++ 6 files changed, 280 insertions(+), 4 deletions(-) create mode 100644 pc-bios/optionrom/linuxboot_dma.c diff --git a/.gitignore b/.gitignore index 88a80ff..101d1e0 100644 --- a/.gitignore +++ b/.gitignore @@ -94,6 +94,10 @@ /pc-bios/optionrom/linuxboot.bin /pc-bios/optionrom/linuxboot.raw /pc-bios/optionrom/linuxboot.img +/pc-bios/optionrom/linuxboot_dma.asm +/pc-bios/optionrom/linuxboot_dma.bin +/pc-bios/optionrom/linuxboot_dma.raw +/pc-bios/optionrom/linuxboot_dma.img /pc-bios/optionrom/multiboot.asm /pc-bios/optionrom/multiboot.bin /pc-bios/optionrom/multiboot.raw diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 459260b..00339fa 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms, fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); -option_rom[nb_option_roms].name = "linuxboot.bin"; -option_rom[nb_option_roms].bootindex = 0; +if (fw_cfg_dma_enabled(fw_cfg)) { +option_rom[nb_option_roms].name = "linuxboot_dma.bin"; +option_rom[nb_option_roms].bootindex = 0; +} else { +option_rom[nb_option_roms].name = "linuxboot.bin"; +option_rom[nb_option_roms].bootindex = 0; +} nb_option_roms++; } diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index a1d650d..d0a5753 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id) return version_id == 1; } -static bool fw_cfg_dma_enabled(void *opaque) +bool fw_cfg_dma_enabled(void *opaque) { FWCfgState *s = opaque; diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h index 664eaf6..953e58d 100644 --- a/include/hw/nvram/fw_cfg.h +++ b/include/hw/nvram/fw_cfg.h @@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr, hwaddr dma_addr, AddressSpace *dma_as); FWCfgState *fw_cfg_find(void); +bool fw_cfg_dma_enabled(void *opaque); #endif /* NO_QEMU_PROTOS */ diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index ce4852a..8c769ef 100644 --- a/pc-bios/optionrom/Makefile +++ b/pc-bios/optionrom/Makefile @@ -13,13 +13,17 @@ CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin CFLAGS += -I$(SRC_PATH) CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) CFLAGS += $(CFLAGS_NOPIE) +CFLAGS += -m32 QEMU_CFLAGS = $(CFLAGS) -build-all: multiboot.bin linuxboot.bin kvmvapic.bin +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin # suppress auto-removal of intermediate files .SECONDARY: +linuxboot_dma.img: linuxboot_dma.o + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") + %.img: %.o $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c new file mode 100644 index 000..43594c2 --- /dev/null +++ b/pc-bios/optionrom/linuxboot_dma.c @@ -0,0 +1,262 @@ +/* + * Linux Boot Option ROM for fw_cfg DMA + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + * + * Copyright (c) 2015 Red Hat Inc. + * Authors: Marc Marí + */ + +asm( +".text\n" +".global _start\n" +"_start:\n" +" .short 0xaa55\n" +" .byte (_end - _start) / 512\n" +" lret\n" +" .org 0x18\n" +" .short 0\n" +" .short _pnph\n" +"_pnph:\n" +" .ascii \"$PnP\"\n" +" .byte 0x01\n" +" .byte ( _pnph_len / 16 )\n" +" .short 0x\n" +" .byte 0x00\n" +" .byte 0x00\n" +" .long 0x\n" +" .short _manufacturer\n" +" .short _product\n" +&
Re: [Qemu-devel] [PATCH v2] Add optionrom compatible with fw_cfg DMA version
On Mon, 18 Jan 2016 14:42:06 + Stefan Hajnoczi wrote: > On Mon, Jan 18, 2016 at 10:59:00AM +0100, Marc Marí wrote: > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > > index 459260b..00339fa 100644 > > --- a/hw/i386/pc.c > > +++ b/hw/i386/pc.c > > @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms, > > fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size); > > fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); > > > > -option_rom[nb_option_roms].name = "linuxboot.bin"; > > -option_rom[nb_option_roms].bootindex = 0; > > +if (fw_cfg_dma_enabled(fw_cfg)) { > > +option_rom[nb_option_roms].name = "linuxboot_dma.bin"; > > +option_rom[nb_option_roms].bootindex = 0; > > +} else { > > +option_rom[nb_option_roms].name = "linuxboot.bin"; > > +option_rom[nb_option_roms].bootindex = 0; > > +} > > Live migration compatibility requires that guest-visible changes to > the machine are only introduced in a new -machine . > > This ensures that migrating from an older QEMU version to a newer QEMU > version will not change the guest-visible memory layout or device > behavior. > > The Option ROM should not change when migration between different QEMU > versions. > > I've CCed Gerd and Juan, I think they know how changes to Option ROMs > affect live migration better than me. What needs to be done to > preserve live migration compatibility? They are CC'd now :) > > diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile > > index ce4852a..076f351 100644 > > --- a/pc-bios/optionrom/Makefile > > +++ b/pc-bios/optionrom/Makefile > > @@ -2,6 +2,7 @@ all: build-all > > # Dummy command so that make thinks it has done something > > @true > > > > +BULD_DIR=$(CURDIR) > > Is this a typo (BUILD_DIR)? Is it even needed? Experiments and trials in the past that are not necessary now and I forgot to remove. They have been removed now. > > include ../../config-host.mak > > include $(SRC_PATH)/rules.mak > > > > @@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom) > > > > CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer > > -fno-builtin CFLAGS += -I$(SRC_PATH) > > +CFLAGS += -I$(SRC_PATH)/include > > Are you using any QEMU headers? If not, then this change can be > dropped. Same as above. > > +linuxboot_dma.img: linuxboot_dma.o > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 > > -static -Ttext 0 -e _start -s -o $@ $<," Building > > $(TARGET_DIR)$@") > > Why is -static necessary? Same as above. > > > +#define BOOT_ROM_PRODUCT "Linux loader" > > Please differentiate from linuxboot.S. Maybe call it "Linux loader > DMA". > > > +typedef struct FWCfgDmaAccess { > > +uint32_t control; > > +uint32_t length; > > +uint64_t address; > > +} __attribute__((gcc_struct, packed)) FWCfgDmaAccess; > > gcc_struct is for gcc vs Windows compiler compatibility when the > struct contains bitfields. No bitfields are used and no Windows > compiled code is linked, so it seems unnecessary. > > > +static void bios_cfg_read_entry(void *buf, uint16_t entry, > > uint32_t len) +{ > > +FWCfgDmaAccess access; > > +uint32_t control = (entry << 16) | BIOS_CFG_DMA_CTL_SELECT > > +| BIOS_CFG_DMA_CTL_READ; > > + > > +access.address = cpu_to_be64((uint64_t)(uint32_t)buf); > > +access.length = cpu_to_be32(len); > > +access.control = cpu_to_be32(control); > > + > > +barrier(); > > + > > +outl(cpu_to_be32((uint32_t)&access), BIOS_CFG_DMA_ADDR_LOW); > > + > > +while(be32_to_cpu(access.control) & ~BIOS_CFG_DMA_CTL_ERROR) { > > +barrier(); > > +} > > +} > > This function is the unique part of linuxboot_dma.c. Everything else > is copied and translated from linuxboot.S. The refactorer in me > wonders how hard it would be to extend linuxboot.S instead of > introducing this new boot ROM. > > Was there a technical reason why linuxboot.S cannot be extended > (e.g. a size limit)? I don't think there's a technical reason. It is a lot simpler to write the fw_cfg DMA stuff in C. To extend linuxboot.S these things should be modified: - Add fw_cfg DMA detection support - Change read_fw from a macro to a function that checks for fw_cfg DMA support and does the operation using IO or memory - Extract bits and pieces from
[Qemu-devel] [PATCH v2] Add optionrom compatible with fw_cfg DMA version
This optionrom is based on linuxboot.S. Signed-off-by: Marc Marí --- .gitignore| 4 + hw/i386/pc.c | 9 +- hw/nvram/fw_cfg.c | 2 +- include/hw/nvram/fw_cfg.h | 1 + pc-bios/optionrom/Makefile| 8 +- pc-bios/optionrom/linuxboot_dma.c | 262 ++ pc-bios/optionrom/optionrom.h | 4 +- 7 files changed, 285 insertions(+), 5 deletions(-) create mode 100644 pc-bios/optionrom/linuxboot_dma.c diff --git a/.gitignore b/.gitignore index 88a80ff..101d1e0 100644 --- a/.gitignore +++ b/.gitignore @@ -94,6 +94,10 @@ /pc-bios/optionrom/linuxboot.bin /pc-bios/optionrom/linuxboot.raw /pc-bios/optionrom/linuxboot.img +/pc-bios/optionrom/linuxboot_dma.asm +/pc-bios/optionrom/linuxboot_dma.bin +/pc-bios/optionrom/linuxboot_dma.raw +/pc-bios/optionrom/linuxboot_dma.img /pc-bios/optionrom/multiboot.asm /pc-bios/optionrom/multiboot.bin /pc-bios/optionrom/multiboot.raw diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 459260b..00339fa 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms, fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); -option_rom[nb_option_roms].name = "linuxboot.bin"; -option_rom[nb_option_roms].bootindex = 0; +if (fw_cfg_dma_enabled(fw_cfg)) { +option_rom[nb_option_roms].name = "linuxboot_dma.bin"; +option_rom[nb_option_roms].bootindex = 0; +} else { +option_rom[nb_option_roms].name = "linuxboot.bin"; +option_rom[nb_option_roms].bootindex = 0; +} nb_option_roms++; } diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index a1d650d..d0a5753 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id) return version_id == 1; } -static bool fw_cfg_dma_enabled(void *opaque) +bool fw_cfg_dma_enabled(void *opaque) { FWCfgState *s = opaque; diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h index 664eaf6..953e58d 100644 --- a/include/hw/nvram/fw_cfg.h +++ b/include/hw/nvram/fw_cfg.h @@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr, hwaddr dma_addr, AddressSpace *dma_as); FWCfgState *fw_cfg_find(void); +bool fw_cfg_dma_enabled(void *opaque); #endif /* NO_QEMU_PROTOS */ diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index ce4852a..076f351 100644 --- a/pc-bios/optionrom/Makefile +++ b/pc-bios/optionrom/Makefile @@ -2,6 +2,7 @@ all: build-all # Dummy command so that make thinks it has done something @true +BULD_DIR=$(CURDIR) include ../../config-host.mak include $(SRC_PATH)/rules.mak @@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom) CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin CFLAGS += -I$(SRC_PATH) +CFLAGS += -I$(SRC_PATH)/include CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) CFLAGS += $(CFLAGS_NOPIE) +CFLAGS += -m32 QEMU_CFLAGS = $(CFLAGS) -build-all: multiboot.bin linuxboot.bin kvmvapic.bin +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin # suppress auto-removal of intermediate files .SECONDARY: +linuxboot_dma.img: linuxboot_dma.o + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -static -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") + %.img: %.o $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c new file mode 100644 index 000..9c355fd --- /dev/null +++ b/pc-bios/optionrom/linuxboot_dma.c @@ -0,0 +1,262 @@ +/* + * Linux Boot Option ROM for fw_cfg DMA + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + * + * Copyright (c) 2015 Red Hat Inc. + * Authors: Marc Marí + */ + +asm( +".text\n" +".global _start\n" +"_start:\n" +" .short 0xaa55\n" +" .byte (_end - _start) / 512\n" +" lret\n" +" .org 0x18\n" +" .short 0\n" +" .short _pnph\n" +"
Re: [Qemu-devel] [PATCH] Add optionrom compatible with fw_cfg DMA version
On Fri, 8 Jan 2016 16:54:07 +0100 Paolo Bonzini wrote: > > > On 08/01/2016 15:58, Marc Marí wrote: > > > > +static inline uint16_t readw_addr32(const void *addr) { > > +uint16_t val; > > +asm("addr32 movw %1, %0" : "=r"(val) : "g"(addr)); > > +barrier(); > > +return val; > > +} > > + > > Does SeaBIOS ensure that DS base is zero here? DS = CS: " movw %cs, %ax\n" " movw %ax, %ds\n" At the beginning of the ROM. > > +static void transition32(void) > > +{ > > +extern void *gdt; > > +uint32_t data_segment; > > +struct length_addr rombios_gdt; > > + > > +data_segment = read_ds(); > > +rombios_gdt.addr = (uint32_t)((data_segment << 4) + > > (uint32_t)(&gdt)); > > +rombios_gdt.length = (3 * 8) - 1; > > + > > +/* Load GDT */ > > +asm("data32 lgdt %0" : : "m"(rombios_gdt): "memory"); > > + > > + /* Get us to protected mode and set ES to a 32 bit segment > > */ > > +asm("mov $1, %%eax\n" > > +"mov %%eax, %%cr0\n" > > +"mov $0x10, %%eax\n" > > +"mov %%eax, %%es\n" > > +: : : "eax"); > > + > > +/* We're now running in 16-bit CS, but 32-bit ES! */ > > +} > > Do you even need to go to 32-bit mode? The only reason to do so in > the original ROM was to do a "rep insb" above 1 GB, but here fw_cfg > can do DMA to high addresses for you. There's no need of course. I was looking at the original ROM and I did not think this through. Thanks for your comments Marc
[Qemu-devel] [PATCH] Add optionrom compatible with fw_cfg DMA version
This optionrom is based on linuxboot.S. Signed-off-by: Marc Marí --- .gitignore| 4 + hw/i386/pc.c | 9 +- hw/nvram/fw_cfg.c | 2 +- include/hw/nvram/fw_cfg.h | 1 + pc-bios/optionrom/Makefile| 8 +- pc-bios/optionrom/linuxboot_dma.c | 338 ++ pc-bios/optionrom/optionrom.h | 4 +- 7 files changed, 361 insertions(+), 5 deletions(-) create mode 100644 pc-bios/optionrom/linuxboot_dma.c diff --git a/.gitignore b/.gitignore index 88a80ff..101d1e0 100644 --- a/.gitignore +++ b/.gitignore @@ -94,6 +94,10 @@ /pc-bios/optionrom/linuxboot.bin /pc-bios/optionrom/linuxboot.raw /pc-bios/optionrom/linuxboot.img +/pc-bios/optionrom/linuxboot_dma.asm +/pc-bios/optionrom/linuxboot_dma.bin +/pc-bios/optionrom/linuxboot_dma.raw +/pc-bios/optionrom/linuxboot_dma.img /pc-bios/optionrom/multiboot.asm /pc-bios/optionrom/multiboot.bin /pc-bios/optionrom/multiboot.raw diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 459260b..00339fa 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms, fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); -option_rom[nb_option_roms].name = "linuxboot.bin"; -option_rom[nb_option_roms].bootindex = 0; +if (fw_cfg_dma_enabled(fw_cfg)) { +option_rom[nb_option_roms].name = "linuxboot_dma.bin"; +option_rom[nb_option_roms].bootindex = 0; +} else { +option_rom[nb_option_roms].name = "linuxboot.bin"; +option_rom[nb_option_roms].bootindex = 0; +} nb_option_roms++; } diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index a1d650d..d0a5753 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id) return version_id == 1; } -static bool fw_cfg_dma_enabled(void *opaque) +bool fw_cfg_dma_enabled(void *opaque) { FWCfgState *s = opaque; diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h index 664eaf6..953e58d 100644 --- a/include/hw/nvram/fw_cfg.h +++ b/include/hw/nvram/fw_cfg.h @@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr, hwaddr dma_addr, AddressSpace *dma_as); FWCfgState *fw_cfg_find(void); +bool fw_cfg_dma_enabled(void *opaque); #endif /* NO_QEMU_PROTOS */ diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index ce4852a..076f351 100644 --- a/pc-bios/optionrom/Makefile +++ b/pc-bios/optionrom/Makefile @@ -2,6 +2,7 @@ all: build-all # Dummy command so that make thinks it has done something @true +BULD_DIR=$(CURDIR) include ../../config-host.mak include $(SRC_PATH)/rules.mak @@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom) CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin CFLAGS += -I$(SRC_PATH) +CFLAGS += -I$(SRC_PATH)/include CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector) CFLAGS += $(CFLAGS_NOPIE) +CFLAGS += -m32 QEMU_CFLAGS = $(CFLAGS) -build-all: multiboot.bin linuxboot.bin kvmvapic.bin +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin # suppress auto-removal of intermediate files .SECONDARY: +linuxboot_dma.img: linuxboot_dma.o + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -static -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") + %.img: %.o $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ $<," Building $(TARGET_DIR)$@") diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c new file mode 100644 index 000..c420398 --- /dev/null +++ b/pc-bios/optionrom/linuxboot_dma.c @@ -0,0 +1,338 @@ +/* + * Linux Boot Option ROM for fw_cfg DMA + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + * + * Copyright (c) 2015 Red Hat Inc. + * Authors: Marc Marí + */ + +asm( +".text\n" +".global _start\n" +"_start:\n" +" .short 0xaa55\n" +" .byte (_end - _start) / 512\n" +" lret\n" +" .org 0x18\n" +" .short 0\n" +" .short _pnph\n" +"
[Qemu-devel] Bug with GDB from commit ca3e40e
Hi I was trying to debug some stuff in the bootloader and the optionrom, and I found a bug with QEMU and GDB. In commit ca3e40e (a pull request from Michael), I get this: x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ -kernel /boot/vmlinuz-4.1.7-200.fc22.x86_64 -s -S qemu-system-x86_64: -s: address resolution failed for :: Name or service not known In the previous commit (3c23402) I don't get the error: x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ -kernel /boot/vmlinuz-4.1.7-200.fc22.x86_64 -s -S (Opens QEMU and blocks, as expected, and I can connect GDB) I cannot see the problem (maybe because it's a bit late). I'll look at it again tomorrow, but any help is appreciated. My system is a Fedora 22, Linux 4.2.5-201, x86_64, and GDB version 7.9.1-20. Thanks Marc
Re: [Qemu-devel] Bug with GDB from commit ca3e40e
On Mon, 16 Nov 2015 08:45:09 +0200 "Michael S. Tsirkin" wrote: > On Mon, Nov 16, 2015 at 02:04:41AM +0100, Marc Marí wrote: > > Hi > > > > I was trying to debug some stuff in the bootloader and the > > optionrom, and I found a bug with QEMU and GDB. In commit ca3e40e > > (a pull request from Michael), I get this: > > > > x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ > > -kernel /boot/vmlinuz-4.1.7-200.fc22.x86_64 -s -S > > qemu-system-x86_64: -s: address resolution failed for :: Name or > > service not known > > > > In the previous commit (3c23402) I don't get the error: > > x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ > > -kernel /boot/vmlinuz-4.1.7-200.fc22.x86_64 -s -S > > (Opens QEMU and blocks, as expected, and I can connect GDB) > > > > I cannot see the problem (maybe because it's a bit late). I'll look > > at it again tomorrow, but any help is appreciated. > > > > My system is a Fedora 22, Linux 4.2.5-201, x86_64, and GDB version > > 7.9.1-20. > > > > Thanks > > Marc > > This command line seems to work for me. > Old QEMU version. Issue solved in commit 2ea1793. It seems it has something to do with the port in InetSocketAddress in qapi-schema.json. I should not trace "bugs" until so late. Thanks Marc
Re: [Qemu-devel] [PATCH v5 0/5] add ACPI node for fw_cfg on pc and arm
On Fri, 13 Nov 2015 21:57:13 -0500 "Gabriel L. Somlo" wrote: > New since v4: > > - rebased on top of Marc's DMA series > - drop machine compat dependency for insertion into x86/ssdt > (patch 3/5), following agreement between Igor and Eduardo > - [mm]io register range now covers DMA register as well, if > available. > - s/bios/firmware in doc file updates > > Thanks, > --Gabriel > > >New since v3: > > > > - rebased to work on top of 87e896ab (introducing pc-*-25 > > classes), inserting fw_cfg acpi node only for machines >= > > 2.5. > > > > - reintroduce _STA with value 0x0B (bit 2 for u/i visibility > > turned off to avoid Windows complaining -- thanks Igor for > > catching that!) > > > >If there's any other feedback besides questions regarding the > >appropriateness of "QEMU0002" as the value of _HID, please don't > >hesitate! > > > >>New since v2: > >> > >>- pc/i386 node in ssdt only on machine types *newer* than > >>2.4 (as suggested by Eduardo) > >> > >>I appreciate any further comments and reviews. Hopefully we can make > >>this palatable for upstream, modulo the lingering concerns about > >>whether "QEMU0002" is ok to use as the value of _HID, which I'll > >>hopefully get sorted out with the kernel crew... > >> > >>>New since v1: > >>> > >>> - expose control register size (suggested by Marc Marí) > >>> > >>> - leaving out _UID and _STA fields (thanks Shannon & Igor) > >>> > >>> - using "QEMU0002" as the value of _HID (thanks Michael) > >>> > >>> - added documentation blurb to docs/specs/fw_cfg.txt > >>> (mainly to record usage of the "QEMU0002" string with > >>> fw_cfg). > >>> > >>>> This series adds a fw_cfg device node to the SSDT (on pc), or to > >>>> the DSDT (on arm). > >>>> > >>>> - Patch 1/3 moves (and renames) the BIOS_CFG_IOPORT > >>>> (0x510) define from pc.c to pc.h, so that it could be used from > >>>>acpi-build.c in patch 2/3. > >>>> > >>>> - Patch 2/3 adds a fw_cfg node to the pc SSDT. > >>>> > >>>> - Patch 3/3 adds a fw_cfg node to the arm DSDT. > >>>> > >>>> I made up some names - "FWCF" for the node name, and "FWCF0001" > >>>> for _HID; no idea whether that's appropriate, or how else I > >>>> should figure out what to use instead... > >>>> > >>>> Also, using scope "\\_SB", based on where fw_cfg shows up in the > >>>> output of "info qtree". Again, if that's wrong, please point me > >>>> in the right direction. > >>>> > >>>> Re. 3/3 (also mentioned after the commit blurb in the patch > >>>> itself), I noticed none of the other DSDT entries contain a _STA > >>>> field, wondering why it would (not) make sense to include that, > >>>> same as on the PC. > Gabriel L. Somlo (5): > fw_cfg: expose control register size in fw_cfg.h > pc: fw_cfg: move ioport base constant to pc.h > acpi: pc: add fw_cfg device node to ssdt > acpi: arm: add fw_cfg device node to dsdt > fw_cfg: document ACPI device node information > > docs/specs/fw_cfg.txt | 9 + > hw/arm/virt-acpi-build.c | 15 +++ > hw/i386/acpi-build.c | 29 + > hw/i386/pc.c | 5 ++--- > hw/nvram/fw_cfg.c | 4 +++- > include/hw/i386/pc.h | 2 ++ > include/hw/nvram/fw_cfg.h | 3 +++ > 7 files changed, 63 insertions(+), 4 deletions(-) > Thanks for your work on this series and the other series and sorry I couldn't be more active in the last few weeks. Reviewed-by: Marc Marí
Re: [Qemu-devel] fw_cfg DMA security
On Fri, 23 Oct 2015 08:56:26 +0200 Gerd Hoffmann wrote: > Hi, > > > One complication I thought of was that it might be tricky to deal > > with the implications of allowing this DMA to specify any old > > address to fill with fw_cfg data. > > > > So, for example, since Red Hat is working on SMM. Would a DMA to > > SMRAM be protected? > > > > I haven't watched the fw_cfg DMA discussion too closely, but has > > this been thought about? > > Yes. That problem isn't new and it isn't specific to fw_cfg. You > also don't want grant dma access to smram/tseg to your ide/sata/scsi > controller or NIC. > > > One idea I had was that near the end of the firmware boot, the > > firmware could trigger fw_cfg in QEMU to stop supporting DMA until a > > reset. > > Should not be needed. We have address spaces in qemu, and the > smram/tseg regions are explicitly excluded (when enabled) from > dma-able memory. > > mark: when writing a fw_cfg_dma tests it is a good idea to add a > testcase for this, so make sure this works as intended and to avoid > security-sensitive regressions. Noted, thanks Marc > cheers, > Gerd > >
Re: [Qemu-devel] [PULL 0/7] fw_cfg: add dma interface, add strings via cmdline.
On Mon, 19 Oct 2015 17:17:12 +0200 Laszlo Ersek wrote: > On 10/19/15 13:12, Peter Maydell wrote: > > On 19 October 2015 at 10:17, Gerd Hoffmann > > wrote: > >> Hi, > >> > >> As promised last week I've picked up the fw_cfg patches which are > >> ready to go, so here comes the pull request for them. Headline > >> feature is the fw_cfg dam support fo course, enabled for arm and > >> x86. Also small fixes and an option to add string fw_cfg entries > >> via command line. > >> > >> please pull, > >> Gerd > >> > >> The following changes since commit > >> 40fe17bea478793fc9106a630fa3610dad51f939: > >> > >> hw/ide/ahci.c: Fix shift left into sign bit (2015-10-18 11:00:40 > >> +0100) > >> > >> are available in the git repository at: > >> > >> git://git.kraxel.org/qemu tags/pull-fw_cfg-20151019-1 > >> > >> for you to fetch changes up to > >> 7b0eec285d447057405df73beae78c8e8aeb9793: > >> > >> fw_cfg: Define a static signature to be returned on DMA port > >> reads (2015-10-19 11:00:50 +0200) > >> > >> > >> fw_cfg: add dma interface, add strings via cmdline. > >> > >> > > > > Hi. I'm afraid this fails 'make check': > > > > TEST: tests/fw_cfg-test... (pid=17533) > > /i386/fw_cfg/signature: > > OK /i386/fw_cfg/id: > > ** ERROR:/home/petmay01/qemu/tests/fw_cfg-test.c:40:test_fw_cfg_id: > > assertion failed (qfw_cfg_get_u32(fw_cfg, FW_CFG_ID) == 1): (3 == 1) > > FAIL > > > > (same failure on 64-bit ARM, ppc64be, OSX, 32-bit ARM). > > Heh, tests. We have tests to keep fixing them. :) > > Marc, can you add a hunk to patch "Implement fw_cfg DMA interface" > that changes the above test to mask out everything but bit 0? I don't > think we need to test DMA in the test suite right now (can be done in > another series I guess), but DMA's presence shouldn't break the > current test. I guess something like this would suffice: > > diff --git a/tests/fw_cfg-test.c b/tests/fw_cfg-test.c > index 9be78e9..27f115d 100644 > --- a/tests/fw_cfg-test.c > +++ b/tests/fw_cfg-test.c > @@ -37,7 +37,7 @@ static void test_fw_cfg_signature(void) > > static void test_fw_cfg_id(void) > { > -g_assert_cmpint(qfw_cfg_get_u32(fw_cfg, FW_CFG_ID), ==, 1); > +g_assert_cmpint(qfw_cfg_get_u32(fw_cfg, FW_CFG_ID) & 1, ==, 1); > } > > static void test_fw_cfg_uuid(void) > > (Sorry if this has been suggested already, trying to catch up with my > mailbox...) "Fixed that. Patch updated, tests running, updated pull soon." - Gerd Hoffman > Thanks > Laszlo > > > > > Windows fails to compile: > > /home/petmay01/linaro/qemu-for-merges/hw/nvram/fw_cfg.c: In function > > ‘fw_cfg_dma_mem_read’: > > /home/petmay01/linaro/qemu-for-merges/hw/nvram/fw_cfg.c:406: > > warning: integer constant is too large for ‘long’ type > > > > thanks > > -- PMM > > >
Re: [Qemu-devel] [PULL 0/7] fw_cfg: add dma interface, add strings via cmdline.
On Mon, 19 Oct 2015 08:12:22 -0400 "Kevin O'Connor" wrote: > On Mon, Oct 19, 2015 at 01:02:41PM +0100, Peter Maydell wrote: > > On 19 October 2015 at 12:52, Kevin O'Connor > > wrote: > > > On Mon, Oct 19, 2015 at 12:12:34PM +0100, Peter Maydell wrote: > > >> Windows fails to compile: > > >> /home/petmay01/linaro/qemu-for-merges/hw/nvram/fw_cfg.c: In > > >> function ‘fw_cfg_dma_mem_read’: > > >> /home/petmay01/linaro/qemu-for-merges/hw/nvram/fw_cfg.c:406: > > >> warning: integer constant is too large for ‘long’ type > > > > > > I don't have a Windows test environment, but I suspect the > > > following: > > > > > > #define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ > > > > > > should be changed to: > > > > > > #define FW_CFG_DMA_SIGNATURE 0x51454d5520434647ULL /* "QEMU CFG" > > > */ > > > > Yes. All 64 bit constants should have an LL or ULL suffix. > > Okay - I'll respin patch 7. Thanks > > (The test failure is presumably something different.) > > I think the fix for the test failure is to make the test less picky > (it should accept a feature bitmap of either 1 or 3). But I'll leave > that to Marc or Gerd. Yes, that's the solution. We updated the version number, and the test expects exclusively version 1. And it would also be nice to update the test to test also fw_cfg DMA I can send a patch for the test (at least the version) tomorrow. Thanks Marc
[Qemu-devel] [PATCH v5 1/6] fw_cfg: document fw_cfg_modify_iXX() update functions
From: "Gabriel L. Somlo" Document the behavior of fw_cfg_modify_iXX() for leak-less updating of integer-type blobs. Currently only fw_cfg_modify_i16() is coded, but 32- and 64-bit versions may be added later if necessary.. Signed-off-by: Gabriel Somlo Signed-off-by: Gerd Hoffmann Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 74351dd..5bc7b96 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -159,6 +159,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add a dynamically allocated copy of the appropriately sized item to fw_cfg under the given selector key value. +== fw_cfg_modify_iXX() == + +Modify the value of an XX-bit item (where XX may be 16, 32, or 64). +Similarly to the corresponding fw_cfg_add_iXX() function set, convert +a 16-, 32-, or 64-bit integer to little endian, create a dynamically +allocated copy of the required size, and replace the existing item at +the given selector key value with the newly allocated one. The previous +item, assumed to have been allocated during an earlier call to +fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed +before the function returns. + == fw_cfg_add_file() == Given a filename (i.e., fw_cfg item name), starting pointer, and size, -- 2.4.3
[Qemu-devel] [PATCH v5 6/6] fw_cfg: Define a static signature to be returned on DMA port reads
From: Kevin O'Connor Return a static signature ("QEMU CFG") if the guest does a read to the DMA address io register. Signed-off-by: Kevin O'Connor Reviewed-by: Laszlo Ersek Reviewed-by: Stefan Hajnoczi --- docs/specs/fw_cfg.txt | 3 +++ hw/nvram/fw_cfg.c | 14 -- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index f217d85..274b53c 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". +If the DMA interface is available, then reading the DMA Address +Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format). + === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === A 32-bit little-endian unsigned int, this item is used to check for enabled diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index d2d1bca..309490c 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -53,6 +53,8 @@ #define FW_CFG_DMA_CTL_SKIP0x04 #define FW_CFG_DMA_CTL_SELECT 0x08 +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -397,6 +399,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s) trace_fw_cfg_read(s, 0); } +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr, +unsigned size) +{ +/* Return a signature value (and handle various read sizes) */ +return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size * 8); +} + static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, uint64_t value, unsigned size) { @@ -420,8 +429,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr, unsigned size, bool is_write) { -return is_write && ((size == 4 && (addr == 0 || addr == 4)) || -(size == 8 && addr == 0)); +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) || + (size == 8 && addr == 0)); } static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr, @@ -492,6 +501,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = { }; static const MemoryRegionOps fw_cfg_dma_mem_ops = { +.read = fw_cfg_dma_mem_read, .write = fw_cfg_dma_mem_write, .endianness = DEVICE_BIG_ENDIAN, .valid.accepts = fw_cfg_dma_mem_valid, -- 2.4.3
[Qemu-devel] [PATCH v5 4/6] Enable fw_cfg DMA interface for ARM
Enable the fw_cfg DMA interface for the ARM virt machine. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí Reviewed-by: Peter Maydell Reviewed-by: Laszlo Ersek --- hw/arm/virt.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 7ae984f..0a39087 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -119,7 +119,7 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_REDIST] = { 0x080A, 0x00F6 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, -[VIRT_FW_CFG] = { 0x0902, 0x000a }, +[VIRT_FW_CFG] = { 0x0902, 0x0018 }, [VIRT_MMIO] = { 0x0a00, 0x0200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, @@ -677,13 +677,13 @@ static void create_flash(const VirtBoardInfo *vbi) g_free(nodename); } -static void create_fw_cfg(const VirtBoardInfo *vbi) +static void create_fw_cfg(const VirtBoardInfo *vbi, AddressSpace *as) { hwaddr base = vbi->memmap[VIRT_FW_CFG].base; hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); @@ -1026,7 +1026,7 @@ static void machvirt_init(MachineState *machine) */ create_virtio_devices(vbi, pic); -create_fw_cfg(vbi); +create_fw_cfg(vbi, &address_space_memory); rom_set_fw(fw_cfg_find()); guest_info->smp_cpus = smp_cpus; -- 2.4.3
[Qemu-devel] [PATCH v5 5/6] Enable fw_cfg DMA interface for x86
Enable the fw_cfg DMA interface for all the x86 platforms. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí Reviewed-by: Laszlo Ersek --- hw/i386/pc.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 9275297..da27553 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -752,14 +752,15 @@ static void pc_build_smbios(FWCfgState *fw_cfg) } } -static FWCfgState *bochs_bios_init(void) +static FWCfgState *bochs_bios_init(AddressSpace *as) { FWCfgState *fw_cfg; uint64_t *numa_fw_cfg; int i, j; unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); +fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 4, as); + /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: * * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug @@ -1389,7 +1390,8 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, option_rom_mr, 1); -fw_cfg = bochs_bios_init(); +fw_cfg = bochs_bios_init(&address_space_memory); + rom_set_fw(fw_cfg); if (guest_info->has_reserved_memory && pcms->hotplug_memory.base) { -- 2.4.3
[Qemu-devel] [PATCH v5 0/6] fw_cfg DMA interface
Implement host-side of the FW CFG DMA interface both for x86 and ARM. Based on Gerd Hoffman's initial implementation. Gabriel L. Somlo (1): fw_cfg: document fw_cfg_modify_iXX() update functions Kevin O'Connor (1): fw_cfg: Define a static signature to be returned on DMA port reads Marc Marí (4): fw_cfg DMA interface documentation Implement fw_cfg DMA interface Enable fw_cfg DMA interface for ARM Enable fw_cfg DMA interface for x86 docs/specs/fw_cfg.txt | 79 ++- hw/arm/virt.c | 8 +- hw/i386/pc.c | 8 +- hw/nvram/fw_cfg.c | 250 -- include/hw/nvram/fw_cfg.h | 16 ++- 5 files changed, 337 insertions(+), 24 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH v5 2/6] fw_cfg DMA interface documentation
Add fw_cfg DMA interface specification in the documentation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí Reviewed-by: Peter Maydell Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 65 +++ 1 file changed, 61 insertions(+), 4 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 5bc7b96..f217d85 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -76,6 +76,13 @@ increasing address order, similar to memcpy(). Selector Register IOport: 0x510 Data Register IOport: 0x511 +DMA Address IOport: 0x514 + +=== ARM Register Locations === + +Selector Register address: Base + 8 (2 bytes) +Data Register address: Base + 0 (8 bytes) +DMA Address address: Base + 16 (8 bytes) == Firmware Configuration Items == @@ -86,11 +93,12 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". -=== Revision (Key 0x0001, FW_CFG_ID) === +=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === -A 32-bit little-endian unsigned int, this item is used as an interface -revision number, and is currently set to 1 by QEMU when fw_cfg is -initialized. +A 32-bit little-endian unsigned int, this item is used to check for enabled +features. + - Bit 0: traditional interface. Always set. + - Bit 1: DMA interface. === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === @@ -132,6 +140,55 @@ Selector Reg.Range Usage In practice, the number of allowed firmware configuration items is given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). += Guest-side DMA Interface = + +If bit 1 of the feature bitmap is set, the DMA interface is present. This does +not replace the existing fw_cfg interface, it is an add-on. This interface +can be used through the 64-bit wide address register. + +The address register is in big-endian format. The value for the register is 0 +at startup and after an operation. A write to the least significant half (at +offset 4) triggers an operation. This means that operations with 32-bit +addresses can be triggered with just one write, whereas operations with +64-bit addresses can be triggered with one 64-bit write or two 32-bit writes, +starting with the most significant half (at offset 0). + +In this register, the physical address of a FWCfgDmaAccess structure in RAM +should be written. This is the format of the FWCfgDmaAccess structure: + +typedef struct FWCfgDmaAccess { +uint32_t control; +uint32_t length; +uint64_t address; +} FWCfgDmaAccess; + +The fields of the structure are in big endian mode, and the field at the lowest +address is the "control" field. + +The "control" field has the following bits: + - Bit 0: Error + - Bit 1: Read + - Bit 2: Skip + - Bit 3: Select. The upper 16 bits are the selected index. + +When an operation is triggered, if the "control" field has bit 3 set, the +upper 16 bits are interpreted as an index of a firmware configuration item. +This has the same effect as writing the selector register. + +If the "control" field has bit 1 set, a read operation will be performed. +"length" bytes for the current selector and offset will be copied into the +physical RAM address specified by the "address" field. + +If the "control" field has bit 2 set (and not bit 1), a skip operation will be +performed. The offset for the current selector will be advanced "length" bytes. + +To check the result, read the "control" field: + error bit set-> something went wrong. + all bits cleared -> transfer finished successfully. + otherwise-> transfer still in progress (doesn't happen +today due to implementation not being async, +but may in the future). + = Host-side API = The following functions are available to the QEMU programmer for adding -- 2.4.3
[Qemu-devel] [PATCH v5 3/6] Implement fw_cfg DMA interface
Based on the specifications on docs/specs/fw_cfg.txt This interface is an addon. The old interface can still be used as usual. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 2 +- hw/nvram/fw_cfg.c | 240 +++--- include/hw/nvram/fw_cfg.h | 16 +++- 3 files changed, 244 insertions(+), 14 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index d25d6cf..7ae984f 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -683,7 +683,7 @@ static void create_fw_cfg(const VirtBoardInfo *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8); +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 658f8c4..d2d1bca 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -23,6 +23,7 @@ */ #include "hw/hw.h" #include "sysemu/sysemu.h" +#include "sysemu/dma.h" #include "hw/isa/isa.h" #include "hw/nvram/fw_cfg.h" #include "hw/sysbus.h" @@ -30,7 +31,7 @@ #include "qemu/error-report.h" #include "qemu/config-file.h" -#define FW_CFG_SIZE 2 +#define FW_CFG_CTL_SIZE 2 #define FW_CFG_NAME "fw_cfg" #define FW_CFG_PATH "/machine/" FW_CFG_NAME @@ -42,6 +43,16 @@ #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM) +/* FW_CFG_VERSION bits */ +#define FW_CFG_VERSION 0x01 +#define FW_CFG_VERSION_DMA 0x02 + +/* FW_CFG_DMA_CONTROL bits */ +#define FW_CFG_DMA_CTL_ERROR 0x01 +#define FW_CFG_DMA_CTL_READ0x02 +#define FW_CFG_DMA_CTL_SKIP0x04 +#define FW_CFG_DMA_CTL_SELECT 0x08 + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -59,6 +70,11 @@ struct FWCfgState { uint16_t cur_entry; uint32_t cur_offset; Notifier machine_ready; + +bool dma_enabled; +dma_addr_t dma_addr; +AddressSpace *dma_as; +MemoryRegion dma_iomem; }; struct FWCfgIoState { @@ -67,7 +83,7 @@ struct FWCfgIoState { /*< public >*/ MemoryRegion comb_iomem; -uint32_t iobase; +uint32_t iobase, dma_iobase; }; struct FWCfgMemState { @@ -292,6 +308,122 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr addr, } while (i); } +static void fw_cfg_dma_transfer(FWCfgState *s) +{ +dma_addr_t len; +FWCfgDmaAccess dma; +int arch; +FWCfgEntry *e; +int read; +dma_addr_t dma_addr; + +/* Reset the address before the next access */ +dma_addr = s->dma_addr; +s->dma_addr = 0; + +if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) { +stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, control), + FW_CFG_DMA_CTL_ERROR); +return; +} + +dma.address = be64_to_cpu(dma.address); +dma.length = be32_to_cpu(dma.length); +dma.control = be32_to_cpu(dma.control); + +if (dma.control & FW_CFG_DMA_CTL_SELECT) { +fw_cfg_select(s, dma.control >> 16); +} + +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; + +if (dma.control & FW_CFG_DMA_CTL_READ) { +read = 1; +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) { +read = 0; +} else { +dma.length = 0; +} + +dma.control = 0; + +while (dma.length > 0 && !(dma.control & FW_CFG_DMA_CTL_ERROR)) { +if (s->cur_entry == FW_CFG_INVALID || !e->data || +s->cur_offset >= e->len) { +len = dma.length; + +/* If the access is not a read access, it will be a skip access, + * tested before. + */ +if (read) { +if (dma_memory_set(s->dma_as, dma.address, 0, len)) { +dma.control |= FW_CFG_DMA_CTL_ERROR; +} +} + +} else { +if (dma.length <= (e->len - s->cur_offset)) { +len = dma.length; +} else { +len = (e->len - s->cur_offset); +} + +if (e->read_callback) { +e->read_callback(e->callback_opaque, s->cur_offset); +} + +/* If the access is not a read access, it will be a skip access, + * tested before. + */ +if (read) { +if (dma_memory_write(s->dma_as, dma.address, +&e->data[s->cur_offset], len)) { +dma.control |= FW_CFG_DMA_CTL_ERROR; +} +
[Qemu-devel] [cross-post] QEMU fw_cfg DMA interface
Implementation of the FW CFG DMA interface. When running a Linux guest on top of QEMU, using the -kernel option and with fw_cfg DMA Linux boot support, this is the timing improvement for x86: Original QEMU and SeaBIOS QEMU startup time: .080 BIOS startup time: .060 Kernel setup time: .586 Total time: .726 QEMU and SeaBIOS with this patch series and fw_cfg DMA Linux boot support QEMU startup time: .080 BIOS startup time: .039 Kernel setup time: .005 Total time: .126 QEMU startup time is the time between the start and the first kvm_entry. BIOS startup time is the time between the first kvm_entry and the start of function do_boot, in SeaBIOS. Kernel setup time is the time between the start of the function do_boot in SeaBIOS and the jump to the Linux kernel. As you can see, both the BIOS (because of ACPI tables and other configurations) and the Linux kernel boot (because of the copy to memory) are greatly improved with this new interface. Also, this new interface is an addon to the old interface. Both interfaces are compatible and interchangeable. Changes from v1: - Take into account order of fields in the FWCfgDmaAccess structure - Check and change endianness of FWCfgDmaAccess fields - Change order of fields in the FWCfgDmaAccess structure - Add FW_CFG_DMA_CTL_SKIP feature for control field - Split FW_CFG_SIZE in QEMU - Make FW_CFG_ID a bitmap of features - Add 64 bit address support for the transfer. Trigger when writing the low address, and address is 0 by default and at the end of each transfer. - Align ports and addresses. - Preserve old fw_cfg_comb_valid behaviour in QEMU - Update documentation to reflect all these changes Changes from v2: - Make IOports fw_cfg DMA region a different IO region. - Reuse everything for MMIO and IOport DMA regions - Make transfer status only based on control field - Use DMA helpers instead of direct map/unmap - Change ARM fw_cfg DMA address space - Change Linux boot process to match linuxboot.S - Add select capabilities in the FWCfgDmaAccess struct - Update documentation to reflect all these changes Changes from v3: - Set properly fw_cfg DMA fields in ARM - Set fw_cfg DMA boot process properly (by Laszlo Ersek) - Add signature to fw_cfg DMA address field (by Kevin O'Connor) - Minor nitpicks Changes from v4: - Remove Linux fw_cfg boot from this series (will be sent separately) - Minor nitpicks
Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface
On Thu, 8 Oct 2015 10:07:34 +0100 Stefan Hajnoczi wrote: > On Tue, Oct 06, 2015 at 03:53:33PM +0100, Peter Maydell wrote: > > On 6 October 2015 at 15:44, Stefan Hajnoczi > > wrote: > > > On Thu, Oct 01, 2015 at 02:16:55PM +0200, Marc Marí wrote: > > >> @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void > > >> *opaque, hwaddr addr, } while (i); > > >> } > > >> > > >> +static void fw_cfg_dma_transfer(FWCfgState *s) > > >> +{ > > >> +dma_addr_t len; > > >> +FWCfgDmaAccess dma; > > >> +int arch; > > >> +FWCfgEntry *e; > > >> +int read; > > >> +dma_addr_t dma_addr; > > >> + > > >> +/* Reset the address before the next access */ > > >> +dma_addr = s->dma_addr; > > >> +s->dma_addr = 0; > > >> + > > >> +dma.address = ldq_be_dma(s->dma_as, > > >> +dma_addr + offsetof(FWCfgDmaAccess, > > >> address)); > > >> +dma.length = ldl_be_dma(s->dma_as, > > >> +dma_addr + offsetof(FWCfgDmaAccess, > > >> length)); > > >> +dma.control = ldl_be_dma(s->dma_as, > > >> +dma_addr + offsetof(FWCfgDmaAccess, > > >> control)); > > > > > > ldq_be_dma() doesn't report errors. If dma_addr is invalid the > > > return value could be anything. Memory corruption inside the > > > guest is possible if the address/length/control values happen to > > > cause a memory read operation! > > > > We discussed this in a previous revision. IMHO if the guest has > > passed us a bogus dma_addr it should expect memory corruption. > > We only need to be sure we don't allow a VM escape. > > Even if the guest-visible behavior doesn't matter, Valgrind won't like > this. ldq_be_dma() reads from uninitialized stack memory: > > #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \ > static inline uint##_bits##_t > ld##_lname##_##_end##_dma(AddressSpace *as, \ dma_addr_t addr) \ > { \ > uint##_bits##_t > val;\ dma_memory_read(as, > addr, &val, (_bits) / 8); \ return > _end##_bits##_to_cpu(val); \ } > > Bad QEMU, bad userspace process :). > > I think we really need to check the error and at least return early. It doesn't hurt to check the error. I'll add it. Thanks Marc > > > Instead, please use: > > > > > > if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) { > > > stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, > > > control), FW_CFG_DMA_CTL_ERROR); > > > > If the guest handed us a bad dma_addr then this write will also > > be bogus and could corrupt the guest's memory. > > That's fine because it's not a random address - it's the address that > the guest gave us. > > Stefan
Re: [Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads
On Tue, 6 Oct 2015 10:29:25 -0400 "Kevin O'Connor" wrote: > On Tue, Oct 06, 2015 at 09:30:18AM +0200, Laszlo Ersek wrote: > > On 10/06/15 01:51, Kevin O'Connor wrote: > > > Return a static signature ("QEMU CFG") if the guest does a read > > > to the DMA address io register. > > > > > > Signed-off-by: Kevin O'Connor > > > --- > > > > > > Marc, if you decide to respin your fw_cfg series, I've updated > > > the dma signature patch. This addresses the comments from > > > Stefan, and I hope it addresses the comments from Laszlo. > > > > Thank you -- I didn't know about extract64(). > > > > The patch looks good to me, but I think the QEMU coding style > > requries /* ... */ comments, and forbids //. > > I always forget about that one. Marc, if you do respin the series I > updated the patch below. > This patch doesn't apply without the series. I'll add it to the series and send it as soon as posible. Thanks Marc > > > commit 02a449ece95da00fa64a9c704555c6afa8e03579 > Author: Kevin O'Connor > Date: Thu Oct 1 14:16:59 2015 +0200 > > fw_cfg: Define a static signature to be returned on DMA port reads > > Return a static signature ("QEMU CFG") if the guest does a read > to the DMA address io register. > > Reviewed-by: Stefan Hajnoczi > Signed-off-by: Kevin O'Connor > > diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt > index 2d6b2da..cbdce7d 100644 > --- a/docs/specs/fw_cfg.txt > +++ b/docs/specs/fw_cfg.txt > @@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x > (FW_CFG_SIGNATURE), and reading four bytes from the data register. If > the fw_cfg device is present, the four bytes read will contain the > characters "QEMU". > +If the DMA interface is available, then reading the DMA Address > +Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian > format). + > === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === > > A 32-bit little-endian unsigned int, this item is used to check for > enabled diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > index 59933b3..5098bfc 100644 > --- a/hw/nvram/fw_cfg.c > +++ b/hw/nvram/fw_cfg.c > @@ -53,6 +53,8 @@ > #define FW_CFG_DMA_CTL_SKIP0x04 > #define FW_CFG_DMA_CTL_SELECT 0x08 > > +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ > + > typedef struct FWCfgEntry { > uint32_t len; > uint8_t *data; > @@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s) > trace_fw_cfg_read(s, 0); > } > > +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr, > +unsigned size) > +{ > +/* Return a signature value (and handle various read sizes) */ > +return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, > size * 8); +} > + > static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, > uint64_t value, unsigned size) > { > @@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, > hwaddr addr, static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr > addr, unsigned size, bool is_write) > { > -return is_write && ((size == 4 && (addr == 0 || addr == 4)) || > -(size == 8 && addr == 0)); > +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) || > + (size == 8 && addr == 0)); > } > > static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr, > @@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops > = { }; > > static const MemoryRegionOps fw_cfg_dma_mem_ops = { > +.read = fw_cfg_dma_mem_read, > .write = fw_cfg_dma_mem_write, > .endianness = DEVICE_BIG_ENDIAN, > .valid.accepts = fw_cfg_dma_mem_valid, >
Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface
On Tue, 6 Oct 2015 15:44:53 +0100 Stefan Hajnoczi wrote: > On Thu, Oct 01, 2015 at 02:16:55PM +0200, Marc Marí wrote: > > @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void > > *opaque, hwaddr addr, } while (i); > > } > > > > +static void fw_cfg_dma_transfer(FWCfgState *s) > > +{ > > +dma_addr_t len; > > +FWCfgDmaAccess dma; > > +int arch; > > +FWCfgEntry *e; > > +int read; > > +dma_addr_t dma_addr; > > + > > +/* Reset the address before the next access */ > > +dma_addr = s->dma_addr; > > +s->dma_addr = 0; > > + > > +dma.address = ldq_be_dma(s->dma_as, > > +dma_addr + offsetof(FWCfgDmaAccess, > > address)); > > +dma.length = ldl_be_dma(s->dma_as, > > +dma_addr + offsetof(FWCfgDmaAccess, > > length)); > > +dma.control = ldl_be_dma(s->dma_as, > > +dma_addr + offsetof(FWCfgDmaAccess, > > control)); > > ldq_be_dma() doesn't report errors. If dma_addr is invalid the return > value could be anything. Memory corruption inside the guest is > possible if the address/length/control values happen to cause a > memory read operation! > > Instead, please use: > > if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) { > stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, > control), FW_CFG_DMA_CTL_ERROR); > return; > } > > dma.address = be64_to_cpu(dma.address); > dma.length = be32_to_cpu(dma.length); > dma.control = be32_to_cpu(dma.control); > > > + > > +if (dma.control & FW_CFG_DMA_CTL_SELECT) { > > +fw_cfg_select(s, dma.control >> 16); > > +} > > + > > +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); > > +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; > > + > > +if (dma.control & FW_CFG_DMA_CTL_READ) { > > +read = 1; > > +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) { > > +read = 0; > > +} else { > > +dma.length = 0; > > +} > > + > > +dma.control = 0; > > + > > +while (dma.length > 0 && !(dma.control & > > FW_CFG_DMA_CTL_ERROR)) { > > +if (s->cur_entry == FW_CFG_INVALID || !e->data || > > +s->cur_offset >= e->len) { > > +len = dma.length; > > + > > +/* If the access is not a read access, it will be a > > skip access, > > + * tested before. > > + */ > > +if (read) { > > +if (dma_memory_set(s->dma_as, dma.address, 0, > > len)) { > > +dma.control |= FW_CFG_DMA_CTL_ERROR; > > +} > > +} > > + > > +} else { > > +if (dma.length <= (e->len - s->cur_offset)) { > > +len = dma.length; > > +} else { > > +len = (e->len - s->cur_offset); > > +} > > + > > +if (e->read_callback) { > > +e->read_callback(e->callback_opaque, > > s->cur_offset); > > +} > > + > > +/* If the access is not a read access, it will be a > > skip access, > > + * tested before. > > + */ > > +if (read) { > > +if (dma_memory_write(s->dma_as, dma.address, > > +&e->data[s->cur_offset], len)) > > { > > +dma.control |= FW_CFG_DMA_CTL_ERROR; > > +} > > +} > > + > > +s->cur_offset += len; > > +} > > + > > +dma.address += len; > > +dma.length -= len; > > I thought these fields are written back to guest memory? For example, > so the guest knows how many bytes were read before the error occurred. This was proposed here: http://lists.gnu.org/archive/html/qemu-devel/2015-08/msg04001.html I also don't see much benefit into knowing how many bytes were read. If the guest is trying to read an invalid entry or past the end of that entry, the memory will be filled with zeros. The only moment when an error would be reported is when there's some problem in the mapping. And this problem is bad enough to just abort the whole operation. Regards Marc
Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable
On Fri, 02 Oct 2015 10:16:26 +0200 Gerd Hoffmann wrote: > Hi, > > > That's fine with me. Marc - I think qemu_vmlinux_setup() in SeaBIOS > > with the following would work: > > > > void qemu_vmlinux_setup(void) > > { > > u32 kernel_size; > > qemu_cfg_read_entry(&kernel_size, QEMU_CFG_KERNEL_SIZE, > > sizeof(kernel_size)); if (kernel_size) > > boot_add_qemu_vmlinux("QEMU Kernel image", 0); > > } > > It isn't that simple. We also have support for multiboot kernels > (using multiboot.bin option rom). So when doing this you need to be > prepared to find a multiboot kernel in fw_cfg. > > cheers, > Gerd A solution that I can see is adding DMA boot capabilities to the linuxboot.S optionrom. I was trying to avoid this, but it looks like not doing so creates lots of problems. It may be better than adding a "nice" shortcut somewhere in QEMU or SeaBIOS. Who uses this optionrom (and will benefit from this change)? Thanks Marc
Re: [Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable
On Thu, 1 Oct 2015 12:02:42 -0400 "Kevin O'Connor" wrote: > On Thu, Oct 01, 2015 at 05:25:11PM +0200, Laszlo Ersek wrote: > > On 10/01/15 14:16, Marc Marí wrote: > > > Add an entry to the bootorder file with name "vmlinux". > > > Give this entry more priority than the romfile. > > > > > > Signed-off-by: Marc Marí > > > --- > > > hw/i386/pc.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > > > index 81d93b4..c4c51f7 100644 > > > --- a/hw/i386/pc.c > > > +++ b/hw/i386/pc.c > > > @@ -1012,8 +1012,10 @@ static void load_linux(PCMachineState > > > *pcms, fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, > > > setup_size); > > > option_rom[nb_option_roms].name = "linuxboot.bin"; > > > -option_rom[nb_option_roms].bootindex = 0; > > > +option_rom[nb_option_roms].bootindex = 1; > > > nb_option_roms++; > > > + > > > +add_boot_device_path(0, NULL, "vmlinux"); > > > } > > > > > > #define NE2000_NB_MAX 6 > > > > > > > Where does this idea come from? > > > > This will yet again break the invariant that the bootorder fw_cfg > > file is a list of OpenFirmware device paths. > > I believe it came from a discussion between myself and Marc, because I > did not like the way Marc's original SeaBIOS patches overloaded the > meaning of "genroms/linuxboot.bin" in the bootorder file. I didn't like it either. > [...] > > Given that direct kernel boot is always expected to take priority > > over anything else (which is ensured by this QEMU patch too), can > > bootprio_find_vmlinux() in SeaBIOS just look at the same fw_cfg key > > (0x0008)? > > That's fine with me. Marc - I think qemu_vmlinux_setup() in SeaBIOS > with the following would work: > > void qemu_vmlinux_setup(void) > { > u32 kernel_size; > qemu_cfg_read_entry(&kernel_size, QEMU_CFG_KERNEL_SIZE, > sizeof(kernel_size)); if (kernel_size) > boot_add_qemu_vmlinux("QEMU Kernel image", 0); > } > > Marc, if you're okay with the above, you don't have to keep respinning > patches - I can fix it up upon commit. I'm ok with it Thanks Marc
Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface
On Thu, 1 Oct 2015 16:36:07 +0200 Laszlo Ersek wrote: > This looks good to me. Thanks for addressing my v3 request. > > I have some new remarks here. I feel *really* bad for not finding them > earlier. (If you get tired of working on this series, I could pick it > up and try to shepherd it further.) I'll continue with it, don't worry. Thanks for your comments. Marc > > On 10/01/15 14:16, Marc Marí wrote: > > Based on the specifications on docs/specs/fw_cfg.txt > > > > This interface is an addon. The old interface can still be used as > > usual. > > > > Based on Gerd Hoffman's initial implementation. > > > > Signed-off-by: Marc Marí > > --- > > hw/arm/virt.c | 2 +- > > hw/nvram/fw_cfg.c | 231 > > +++--- > > include/hw/nvram/fw_cfg.h | 16 +++- 3 files changed, 233 > > insertions(+), 16 deletions(-) > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > index d25d6cf..7ae984f 100644 > > --- a/hw/arm/virt.c > > +++ b/hw/arm/virt.c > > @@ -683,7 +683,7 @@ static void create_fw_cfg(const VirtBoardInfo > > *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; > > char *nodename; > > > > -fw_cfg_init_mem_wide(base + 8, base, 8); > > +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); > > > > nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); > > qemu_fdt_add_subnode(vbi->fdt, nodename); > > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > > index 658f8c4..59933b3 100644 > > --- a/hw/nvram/fw_cfg.c > > +++ b/hw/nvram/fw_cfg.c > > @@ -23,6 +23,7 @@ > > */ > > #include "hw/hw.h" > > #include "sysemu/sysemu.h" > > +#include "sysemu/dma.h" > > #include "hw/isa/isa.h" > > #include "hw/nvram/fw_cfg.h" > > #include "hw/sysbus.h" > > @@ -30,7 +31,7 @@ > > #include "qemu/error-report.h" > > #include "qemu/config-file.h" > > > > -#define FW_CFG_SIZE 2 > > +#define FW_CFG_CTL_SIZE 2 > > #define FW_CFG_NAME "fw_cfg" > > #define FW_CFG_PATH "/machine/" FW_CFG_NAME > > > > @@ -42,6 +43,16 @@ > > #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), > > TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, > > (obj), TYPE_FW_CFG_MEM) > > +/* FW_CFG_VERSION bits */ > > +#define FW_CFG_VERSION 0x01 > > +#define FW_CFG_VERSION_DMA 0x02 > > + > > +/* FW_CFG_DMA_CONTROL bits */ > > +#define FW_CFG_DMA_CTL_ERROR 0x01 > > +#define FW_CFG_DMA_CTL_READ0x02 > > +#define FW_CFG_DMA_CTL_SKIP0x04 > > +#define FW_CFG_DMA_CTL_SELECT 0x08 > > + > > typedef struct FWCfgEntry { > > uint32_t len; > > uint8_t *data; > > @@ -59,6 +70,10 @@ struct FWCfgState { > > uint16_t cur_entry; > > uint32_t cur_offset; > > Notifier machine_ready; > > + > > +bool dma_enabled; > > +AddressSpace *dma_as; > > +dma_addr_t dma_addr; > > }; > > > > struct FWCfgIoState { > > @@ -66,8 +81,8 @@ struct FWCfgIoState { > > FWCfgState parent_obj; > > /*< public >*/ > > > > -MemoryRegion comb_iomem; > > -uint32_t iobase; > > +MemoryRegion comb_iomem, dma_iomem; > > +uint32_t iobase, dma_iobase; > > }; > > > > struct FWCfgMemState { > > @@ -75,7 +90,7 @@ struct FWCfgMemState { > > FWCfgState parent_obj; > > /*< public >*/ > > > > -MemoryRegion ctl_iomem, data_iomem; > > +MemoryRegion ctl_iomem, data_iomem, dma_iomem; > > uint32_t data_width; > > MemoryRegionOps wide_data_ops; > > }; > > (1) I *think* the new "dma_iomem" field, of type MemoryRegion, could > be moved up to the parent struct FWCfgEntry, from both FWCfgMemState > and FWCfgIoState. (And the references in the rest of the code could > be updated.) > > ( > > Independently, some loud thinking, mostly for myself: I've always been > surprised by the difference between (a) FWCfgIoState *carrying* > "dma_iobase" as a field -- and a property! --, and (b) FWCfgMemState > *not* carrying the same as a field -- nor as a property. > > I think I finally understand this difference now. It is all rooted in > the difference between the internal APIs sysbus_add_io() and > sysbus_init_mmio(). Both of these are called from the device realize > functions, but the fir
[Qemu-devel] [PATCH v4 4/7] Enable fw_cfg DMA interface for ARM
Enable the fw_cfg DMA interface for the ARM virt machine. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí Reviewed-by: Peter Maydell --- hw/arm/virt.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 7ae984f..0a39087 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -119,7 +119,7 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_REDIST] = { 0x080A, 0x00F6 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, -[VIRT_FW_CFG] = { 0x0902, 0x000a }, +[VIRT_FW_CFG] = { 0x0902, 0x0018 }, [VIRT_MMIO] = { 0x0a00, 0x0200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, @@ -677,13 +677,13 @@ static void create_flash(const VirtBoardInfo *vbi) g_free(nodename); } -static void create_fw_cfg(const VirtBoardInfo *vbi) +static void create_fw_cfg(const VirtBoardInfo *vbi, AddressSpace *as) { hwaddr base = vbi->memmap[VIRT_FW_CFG].base; hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); @@ -1026,7 +1026,7 @@ static void machvirt_init(MachineState *machine) */ create_virtio_devices(vbi, pic); -create_fw_cfg(vbi); +create_fw_cfg(vbi, &address_space_memory); rom_set_fw(fw_cfg_find()); guest_info->smp_cpus = smp_cpus; -- 2.4.3
[Qemu-devel] [PATCH v4 0/7] fw_cfg DMA interface
Implement host-side of the FW CFG DMA interface both for x86 and ARM. Based on Gerd Hoffman's initial implementation. Gabriel L. Somlo (1): fw_cfg: document fw_cfg_modify_iXX() update functions Kevin O'Connor (1): fw_cfg: Define a static signature to be returned on DMA port reads Marc Marí (5): fw_cfg DMA interface documentation Implement fw_cfg DMA interface Enable fw_cfg DMA interface for ARM Enable fw_cfg DMA interface for x86 Make the kernel image in the fw_cfg DMA interface bootable docs/specs/fw_cfg.txt | 80 +++- hw/arm/virt.c | 8 +- hw/i386/pc.c | 12 ++- hw/nvram/fw_cfg.c | 240 +++--- include/hw/nvram/fw_cfg.h | 16 +++- 5 files changed, 329 insertions(+), 27 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH v4 2/7] fw_cfg DMA interface documentation
Add fw_cfg DMA interface specification in the documentation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí Reviewed-by: Peter Maydell --- docs/specs/fw_cfg.txt | 65 +++ 1 file changed, 61 insertions(+), 4 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 5bc7b96..2d6b2da 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -76,6 +76,13 @@ increasing address order, similar to memcpy(). Selector Register IOport: 0x510 Data Register IOport: 0x511 +DMA Address IOport: 0x514 + +=== ARM Register Locations === + +Selector Register address: Base + 8 (2 bytes) +Data Register address: Base + 0 (8 bytes) +DMA Address address: Base + 16 (8 bytes) == Firmware Configuration Items == @@ -86,11 +93,12 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". -=== Revision (Key 0x0001, FW_CFG_ID) === +=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === -A 32-bit little-endian unsigned int, this item is used as an interface -revision number, and is currently set to 1 by QEMU when fw_cfg is -initialized. +A 32-bit little-endian unsigned int, this item is used to check for enabled +features. + - Bit 0: traditional interface. Always set. + - Bit 1: DMA interface. === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === @@ -132,6 +140,55 @@ Selector Reg.Range Usage In practice, the number of allowed firmware configuration items is given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). += Guest-side DMA Interface = + +If bit 1 of the feature bitmap is set, the DMA interface is present. This does +not replace the existing fw_cfg interface, it is an add-on. This interface +can be used through the 64-bit wide address register. + +The address register is in big-endian format. The value for the register is 0 +at startup and after an operation. A write to the lower half triggers an +operation. This means that operations with 32-bit addresses can be triggered +with just one write, whereas operations with 64-bit addresses can be +triggered with one 64-bit write or two 32-bit writes, starting with the +higher part. + +In this register, the physical address of a FWCfgDmaAccess structure in RAM +should be written. This is the format of the FWCfgDmaAccess structure: + +typedef struct FWCfgDmaAccess { +uint32_t control; +uint32_t length; +uint64_t address; +} FWCfgDmaAccess; + +The fields of the structure are in big endian mode, and the field at the lowest +address is the "control" field. + +The "control" field has the following bits: + - Bit 0: Error + - Bit 1: Read + - Bit 2: Skip + - Bit 3: Select. The upper 16 bits are the selected index. + +When an operation is triggered, if the "control" field has bit 3 set, the +upper 16 bits are interpreted as an index of a firmware configuration item. +This has the same effect as writing the selector register. + +If the "control" field has bit 1 set, a read operation will be performed. +"length" bytes for the current selector and offset will be copied into the +physical RAM address specified by the "address" field. + +If the "control" field has bit 2 set (and not bit 1), a skip operation will be +performed. The offset for the current selector will be advanced "length" bytes. + +To check the result, read the "control" field: + error bit set-> something went wrong. + all bits cleared -> transfer finished successfully. + otherwise-> transfer still in progress (doesn't happen +today due to implementation not being async, +but may in the future). + = Host-side API = The following functions are available to the QEMU programmer for adding -- 2.4.3
[Qemu-devel] [PATCH v4 6/7] Make the kernel image in the fw_cfg DMA interface bootable
Add an entry to the bootorder file with name "vmlinux". Give this entry more priority than the romfile. Signed-off-by: Marc Marí --- hw/i386/pc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 81d93b4..c4c51f7 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1012,8 +1012,10 @@ static void load_linux(PCMachineState *pcms, fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size); option_rom[nb_option_roms].name = "linuxboot.bin"; -option_rom[nb_option_roms].bootindex = 0; +option_rom[nb_option_roms].bootindex = 1; nb_option_roms++; + +add_boot_device_path(0, NULL, "vmlinux"); } #define NE2000_NB_MAX 6 -- 2.4.3
[Qemu-devel] [PATCH v4 5/7] Enable fw_cfg DMA interface for x86
Enable the fw_cfg DMA interface for all the x86 platforms. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/i386/pc.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 461c128..81d93b4 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -753,14 +753,15 @@ static void pc_build_smbios(FWCfgState *fw_cfg) } } -static FWCfgState *bochs_bios_init(void) +static FWCfgState *bochs_bios_init(AddressSpace *as) { FWCfgState *fw_cfg; uint64_t *numa_fw_cfg; int i, j; unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); +fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 4, as); + /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: * * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug @@ -1407,7 +1408,8 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, option_rom_mr, 1); -fw_cfg = bochs_bios_init(); +fw_cfg = bochs_bios_init(&address_space_memory); + rom_set_fw(fw_cfg); if (guest_info->has_reserved_memory && pcms->hotplug_memory.base) { -- 2.4.3
[Qemu-devel] [PATCH v4 7/7] fw_cfg: Define a static signature to be returned on DMA port reads
From: Kevin O'Connor Return a static signature ("QEMU CFG") if the guest does a read to the DMA address io register. Signed-off-by: Kevin O'Connor --- docs/specs/fw_cfg.txt | 4 hw/nvram/fw_cfg.c | 13 +++-- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 2d6b2da..249b99e 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -93,6 +93,10 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". +Additionaly, if the DMA interface is available then a read to the DMA +Address will return 0x51454d5520434647 ("QEMU CFG" in big-endian +format). + === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === A 32-bit little-endian unsigned int, this item is used to check for enabled diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 59933b3..c6dcce4 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -53,6 +53,8 @@ #define FW_CFG_DMA_CTL_SKIP0x04 #define FW_CFG_DMA_CTL_SELECT 0x08 +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -393,6 +395,12 @@ static void fw_cfg_dma_transfer(FWCfgState *s) trace_fw_cfg_read(s, 0); } +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr, +unsigned size) +{ +return FW_CFG_DMA_SIGNATURE >> ((8 - addr - size) * 8); +} + static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, uint64_t value, unsigned size) { @@ -416,8 +424,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr, unsigned size, bool is_write) { -return is_write && ((size == 4 && (addr == 0 || addr == 4)) || -(size == 8 && addr == 0)); +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) || + (size == 8 && addr == 0)); } static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr, @@ -488,6 +496,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = { }; static const MemoryRegionOps fw_cfg_dma_mem_ops = { +.read = fw_cfg_dma_mem_read, .write = fw_cfg_dma_mem_write, .endianness = DEVICE_BIG_ENDIAN, .valid.accepts = fw_cfg_dma_mem_valid, -- 2.4.3
[Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface
Based on the specifications on docs/specs/fw_cfg.txt This interface is an addon. The old interface can still be used as usual. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 2 +- hw/nvram/fw_cfg.c | 231 +++--- include/hw/nvram/fw_cfg.h | 16 +++- 3 files changed, 233 insertions(+), 16 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index d25d6cf..7ae984f 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -683,7 +683,7 @@ static void create_fw_cfg(const VirtBoardInfo *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8); +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 658f8c4..59933b3 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -23,6 +23,7 @@ */ #include "hw/hw.h" #include "sysemu/sysemu.h" +#include "sysemu/dma.h" #include "hw/isa/isa.h" #include "hw/nvram/fw_cfg.h" #include "hw/sysbus.h" @@ -30,7 +31,7 @@ #include "qemu/error-report.h" #include "qemu/config-file.h" -#define FW_CFG_SIZE 2 +#define FW_CFG_CTL_SIZE 2 #define FW_CFG_NAME "fw_cfg" #define FW_CFG_PATH "/machine/" FW_CFG_NAME @@ -42,6 +43,16 @@ #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM) +/* FW_CFG_VERSION bits */ +#define FW_CFG_VERSION 0x01 +#define FW_CFG_VERSION_DMA 0x02 + +/* FW_CFG_DMA_CONTROL bits */ +#define FW_CFG_DMA_CTL_ERROR 0x01 +#define FW_CFG_DMA_CTL_READ0x02 +#define FW_CFG_DMA_CTL_SKIP0x04 +#define FW_CFG_DMA_CTL_SELECT 0x08 + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -59,6 +70,10 @@ struct FWCfgState { uint16_t cur_entry; uint32_t cur_offset; Notifier machine_ready; + +bool dma_enabled; +AddressSpace *dma_as; +dma_addr_t dma_addr; }; struct FWCfgIoState { @@ -66,8 +81,8 @@ struct FWCfgIoState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion comb_iomem; -uint32_t iobase; +MemoryRegion comb_iomem, dma_iomem; +uint32_t iobase, dma_iobase; }; struct FWCfgMemState { @@ -75,7 +90,7 @@ struct FWCfgMemState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion ctl_iomem, data_iomem; +MemoryRegion ctl_iomem, data_iomem, dma_iomem; uint32_t data_width; MemoryRegionOps wide_data_ops; }; @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr addr, } while (i); } +static void fw_cfg_dma_transfer(FWCfgState *s) +{ +dma_addr_t len; +FWCfgDmaAccess dma; +int arch; +FWCfgEntry *e; +int read; +dma_addr_t dma_addr; + +/* Reset the address before the next access */ +dma_addr = s->dma_addr; +s->dma_addr = 0; + +dma.address = ldq_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, address)); +dma.length = ldl_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, length)); +dma.control = ldl_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, control)); + +if (dma.control & FW_CFG_DMA_CTL_SELECT) { +fw_cfg_select(s, dma.control >> 16); +} + +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; + +if (dma.control & FW_CFG_DMA_CTL_READ) { +read = 1; +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) { +read = 0; +} else { +dma.length = 0; +} + +dma.control = 0; + +while (dma.length > 0 && !(dma.control & FW_CFG_DMA_CTL_ERROR)) { +if (s->cur_entry == FW_CFG_INVALID || !e->data || +s->cur_offset >= e->len) { +len = dma.length; + +/* If the access is not a read access, it will be a skip access, + * tested before. + */ +if (read) { +if (dma_memory_set(s->dma_as, dma.address, 0, len)) { +dma.control |= FW_CFG_DMA_CTL_ERROR; +} +} + +} else { +if (dma.length <= (e->len - s->cur_offset)) { +len = dma.length; +} else { +len = (e->len - s->cur_offset); +} + +if (e->read_callback) { +e->read_callback(e->callback_opaque, s->cur_offset); +} + +/* If the access is not a read access, it will be a ski
[Qemu-devel] [PATCH v4 1/7] fw_cfg: document fw_cfg_modify_iXX() update functions
From: "Gabriel L. Somlo" Document the behavior of fw_cfg_modify_iXX() for leak-less updating of integer-type blobs. Currently only fw_cfg_modify_i16() is coded, but 32- and 64-bit versions may be added later if necessary.. Signed-off-by: Gabriel Somlo Signed-off-by: Gerd Hoffmann Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 74351dd..5bc7b96 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -159,6 +159,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add a dynamically allocated copy of the appropriately sized item to fw_cfg under the given selector key value. +== fw_cfg_modify_iXX() == + +Modify the value of an XX-bit item (where XX may be 16, 32, or 64). +Similarly to the corresponding fw_cfg_add_iXX() function set, convert +a 16-, 32-, or 64-bit integer to little endian, create a dynamically +allocated copy of the required size, and replace the existing item at +the given selector key value with the newly allocated one. The previous +item, assumed to have been allocated during an earlier call to +fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed +before the function returns. + == fw_cfg_add_file() == Given a filename (i.e., fw_cfg item name), starting pointer, and size, -- 2.4.3
[Qemu-devel] QEMU fw_cfg DMA interface
Implementation of the FW CFG DMA interface. When running a Linux guest on top of QEMU, using the -kernel options, this is the timing improvement for x86: QEMU commit b2312c6 and SeaBIOS commit 423542e QEMU startup time: .080 BIOS startup time: .060 Kernel setup time: .586 Total time: .726 QEMU with this patch series and SeaBIOS with this patch series QEMU startup time: .080 BIOS startup time: .039 Kernel setup time: .005 Total time: .126 QEMU startup time is the time between the start and the first kvm_entry. BIOS startup time is the time between the first kvm_entry and the start of function do_boot, in SeaBIOS. Kernel setup time is the time between the start of the function do_boot in SeaBIOS and the jump to the Linux kernel. As you can see, both the BIOS (because of ACPI tables and other configurations) and the Linux kernel boot (because of the copy to memory) are greatly improved with this new interface. Also, this new interface is an addon to the old interface. Both interfaces are compatible and interchangeable. Changes from v1: - Take into account order of fields in the FWCfgDmaAccess structure - Check and change endianness of FWCfgDmaAccess fields - Change order of fields in the FWCfgDmaAccess structure - Add FW_CFG_DMA_CTL_SKIP feature for control field - Split FW_CFG_SIZE in QEMU - Make FW_CFG_ID a bitmap of features - Add 64 bit address support for the transfer. Trigger when writing the low address, and address is 0 by default and at the end of each transfer. - Align ports and addresses. - Preserve old fw_cfg_comb_valid behaviour in QEMU - Update documentation to reflect all these changes Changes from v2: - Make IOports fw_cfg DMA region a different IO region. - Reuse everything for MMIO and IOport DMA regions - Make transfer status only based on control field - Use DMA helpers instead of direct map/unmap - Change ARM fw_cfg DMA address space - Change Linux boot process to match linuxboot.S - Add select capabilities in the FWCfgDmaAccess struct - Update documentation to reflect all these changes Changes from v3: - Set properly fw_cfg DMA fields in ARM - Set fw_cfg DMA boot process properly (by Laszlo Ersek) - Add signature to fw_cfg DMA address field (by Kevin O'Connor) - Minor nitpicks
Re: [Qemu-devel] [PATCH v2 0/2] Dynamic module support for block drivers
On Mon, 21 Sep 2015 16:43:27 +0300 "Denis V. Lunev" wrote: > On 09/08/2015 04:53 PM, Marc Marí wrote: > > The current module infrastructure has been improved to enable > > dynamic module loading. > > > > This reduces the load time for very simple guests. For the following > > configuration (very loaded) > > > > ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ > > --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ > > --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ > > --enable-kvm --enable-rdma --enable-uuid --enable-vde \ > > --enable-linux-aio --enable-cap-ng --enable-attr > > --enable-vhost-net \ --enable-vhost-scsi --enable-spice > > --enable-rbd --enable-libiscsi \ --enable-smartcard-nss > > --enable-guest-agent --enable-libusb \ --enable-usb-redir > > --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp > > --enable-coroutine-pool --enable-glusterfs \ --enable-tpm > > --enable-libssh2 --enable-vhdx --enable-numa \ --enable-tcmalloc > > --target-list=x86_64-softmmu > > > > With modules disabled, there are 142 libraries loaded at startup. > > Time is the following: > > LD time: 0.065 seconds > > QEMU time: 0.02 seconds > > Total time: 0.085 seconds > > > > With this patch series and modules enabled, there are 128 libraries > > loaded at startup. Time is the following: > > LD time: 0.02 seconds > > QEMU time: 0.02 seconds > > Total time: 0.04 seconds > > > > Where LD time is the time between the program startup and the jump > > to main, and QEMU time is the time between the start of main and > > the first kvm_entry. > > > > These results are just with a few block drivers, that were already > > a module. Adding more modules (block or not block) should be easy, > > and will reduce the load time even more. > > > > Marc Marí (2): > >Add dynamic module loading for block drivers > >Add dynamic generation of module_block.h > > > > .gitignore | 1 + > > Makefile| 10 ++- > > block.c | 70 + > > configure | 2 +- > > include/qemu/module.h | 3 + > > scripts/modules/module_block.py | 134 > > > > util/module.c | 38 7 files changed, > > 227 insertions(+), 31 deletions(-) create mode 100755 > > scripts/modules/module_block.py > > > > From my point of view the design looks a bit complex. > The approach should be quite similar to one used > in Linux kernel. > > If the block driver is configured as a module, block_init > should register proper hooks and create generic lists. > C code parsing does not look like a good approach > to me. > > If block_init is a bad name, we could use something like > module_init. > I think applying the kind of modules of the Linux kernel here would mean reworking an important part of the drivers that want to be converted into modules. If that's the case, I think it's better to have a less good solution that is still good enough, and doesn't mean reworking half of the QEMU codebase. Correct me if I'm wrong. But of course, this is not the only valid approach. Thanks Marc
Re: [Qemu-devel] [PATCH v2 0/2] Dynamic module support for block drivers
Ping! Anyone has more comments for the next version? Thanks Marc >On Tue, 8 Sep 2015 15:53:20 +0200 >Marc Marí wrote: > > The current module infrastructure has been improved to enable dynamic > module loading. > > This reduces the load time for very simple guests. For the following > configuration (very loaded) > > ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ > --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ > --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ > --enable-kvm --enable-rdma --enable-uuid --enable-vde \ > --enable-linux-aio --enable-cap-ng --enable-attr > --enable-vhost-net \ --enable-vhost-scsi --enable-spice --enable-rbd > --enable-libiscsi \ --enable-smartcard-nss --enable-guest-agent > --enable-libusb \ --enable-usb-redir --enable-lzo --enable-snappy > --enable-bzip2 \ --enable-seccomp --enable-coroutine-pool > --enable-glusterfs \ --enable-tpm --enable-libssh2 --enable-vhdx > --enable-numa \ --enable-tcmalloc --target-list=x86_64-softmmu > > With modules disabled, there are 142 libraries loaded at startup. > Time is the following: > LD time: 0.065 seconds > QEMU time: 0.02 seconds > Total time: 0.085 seconds > > With this patch series and modules enabled, there are 128 libraries > loaded at startup. Time is the following: > LD time: 0.02 seconds > QEMU time: 0.02 seconds > Total time: 0.04 seconds > > Where LD time is the time between the program startup and the jump to > main, and QEMU time is the time between the start of main and the > first kvm_entry. > > These results are just with a few block drivers, that were already a > module. Adding more modules (block or not block) should be easy, and > will reduce the load time even more. > > Marc Marí (2): > Add dynamic module loading for block drivers > Add dynamic generation of module_block.h > > .gitignore | 1 + > Makefile| 10 ++- > block.c | 70 + > configure | 2 +- > include/qemu/module.h | 3 + > scripts/modules/module_block.py | 134 > > util/module.c | 38 7 files changed, > 227 insertions(+), 31 deletions(-) create mode 100755 > scripts/modules/module_block.py >
Re: [Qemu-devel] [PATCH v3 4/5] Enable fw_cfg DMA interface for ARM
On Sat, 19 Sep 2015 01:10:46 +0200 Laszlo Ersek wrote: > On 09/18/15 22:24, Marc Marí wrote: > > On Fri, 18 Sep 2015 22:16:46 +0200 > > Laszlo Ersek wrote: > > > >> On 09/18/15 10:58, Marc Marí wrote: > >>> Enable the fw_cfg DMA interface for the ARM virt machine. > >>> > >>> Based on Gerd Hoffman's initial implementation. > >>> > >>> Signed-off-by: Marc Marí > >>> --- > >>> hw/arm/virt.c | 9 + > >>> 1 file changed, 5 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c > >>> index 3568107..47f4ad3 100644 > >>> --- a/hw/arm/virt.c > >>> +++ b/hw/arm/virt.c > >>> @@ -113,7 +113,7 @@ static const MemMapEntry a15memmap[] = { > >>> [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, > >>> [VIRT_UART] = { 0x0900, 0x1000 }, > >>> [VIRT_RTC] ={ 0x0901, 0x1000 }, > >>> -[VIRT_FW_CFG] = { 0x0902, 0x000a }, > >>> +[VIRT_FW_CFG] = { 0x0902, 0x0014 }, > >> > >> Okay, Laszlo is the hateful reviewer. Sorry about that. I'm late, > >> yes. > >> > >> But: this says 0x0014, ie 20 bytes in decimal. I don't think > >> that's correct; it should be 0x18 -- 24 bytes in decimal. From > >> patch #2: "DMA Address address: Base + 16 (8 bytes)". > > > > It's not your problem if I don't know how to count. So don't > > apologize :). > > > > And it's better to catch this stupid little mistakes now. > > Got some good news: with those two fixups in place (register block > size corrected, and dma_enabled set via device property), I could > test the AAVMF / ArmVirtPkg / > patches. > > On my APM Mustang, downloading a decompressed kernel (14,475,776 > bytes), a decompressed initrd (18,177,264), and a cmdline (104 > bytes :)), in total 32,653,144 bytes, takes approx. 24 seconds with > the 8-byte wide MMIO data register. (Yeah, it's *really* slow.) > > Using the DMA interface, the same takes about 52 milliseconds, and > that still includes one progress message per 1 MB downloaded :) > > It's a factor of approx. 450. Not bad. Not bad. :) Not bad. Not bad :). In x86 the speedup is high but not so brutal. I'm really happy that it works so well. Thanks Marc > Thanks > Laszlo > > > > Thanks > > Marc > > > >> Thanks (and I'm sorry about being late!) > >> Laszlo > >> > >>> [VIRT_MMIO] = { 0x0a00, 0x0200 }, > >>> /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of > >>> that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, > >>> 0x0200 }, @@ -651,13 +651,13 @@ static void > >>> create_flash(const VirtBoardInfo *vbi) g_free(nodename); > >>> } > >>> > >>> -static void create_fw_cfg(const VirtBoardInfo *vbi) > >>> +static void create_fw_cfg(const VirtBoardInfo *vbi, AddressSpace > >>> *as) { > >>> hwaddr base = vbi->memmap[VIRT_FW_CFG].base; > >>> hwaddr size = vbi->memmap[VIRT_FW_CFG].size; > >>> char *nodename; > >>> > >>> -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); > >>> +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); > >>> > >>> nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); > >>> qemu_fdt_add_subnode(vbi->fdt, nodename); > >>> @@ -919,6 +919,7 @@ static void machvirt_init(MachineState > >>> *machine) > >>> create_fdt(vbi); > >>> > >>> + > >>> for (n = 0; n < smp_cpus; n++) { > >>> ObjectClass *oc = cpu_class_by_name(TYPE_ARM_CPU, > >>> cpustr[0]); CPUClass *cc = CPU_CLASS(oc); > >>> @@ -984,7 +985,7 @@ static void machvirt_init(MachineState > >>> *machine) */ > >>> create_virtio_devices(vbi, pic); > >>> > >>> -create_fw_cfg(vbi); > >>> +create_fw_cfg(vbi, &address_space_memory); > >>> rom_set_fw(fw_cfg_find()); > >>> > >>> guest_info->smp_cpus = smp_cpus; > >>> > >> > > >
Re: [Qemu-devel] [PATCH v3 4/5] Enable fw_cfg DMA interface for ARM
On Fri, 18 Sep 2015 22:16:46 +0200 Laszlo Ersek wrote: > On 09/18/15 10:58, Marc Marí wrote: > > Enable the fw_cfg DMA interface for the ARM virt machine. > > > > Based on Gerd Hoffman's initial implementation. > > > > Signed-off-by: Marc Marí > > --- > > hw/arm/virt.c | 9 + > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > index 3568107..47f4ad3 100644 > > --- a/hw/arm/virt.c > > +++ b/hw/arm/virt.c > > @@ -113,7 +113,7 @@ static const MemMapEntry a15memmap[] = { > > [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, > > [VIRT_UART] = { 0x0900, 0x1000 }, > > [VIRT_RTC] ={ 0x0901, 0x1000 }, > > -[VIRT_FW_CFG] = { 0x0902, 0x000a }, > > +[VIRT_FW_CFG] = { 0x0902, 0x0014 }, > > Okay, Laszlo is the hateful reviewer. Sorry about that. I'm late, yes. > > But: this says 0x0014, ie 20 bytes in decimal. I don't think > that's correct; it should be 0x18 -- 24 bytes in decimal. From patch > #2: "DMA Address address: Base + 16 (8 bytes)". It's not your problem if I don't know how to count. So don't apologize :). And it's better to catch this stupid little mistakes now. Thanks Marc > Thanks (and I'm sorry about being late!) > Laszlo > > > [VIRT_MMIO] = { 0x0a00, 0x0200 }, > > /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of > > that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, > > @@ -651,13 +651,13 @@ static void create_flash(const VirtBoardInfo > > *vbi) g_free(nodename); > > } > > > > -static void create_fw_cfg(const VirtBoardInfo *vbi) > > +static void create_fw_cfg(const VirtBoardInfo *vbi, AddressSpace > > *as) { > > hwaddr base = vbi->memmap[VIRT_FW_CFG].base; > > hwaddr size = vbi->memmap[VIRT_FW_CFG].size; > > char *nodename; > > > > -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); > > +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); > > > > nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); > > qemu_fdt_add_subnode(vbi->fdt, nodename); > > @@ -919,6 +919,7 @@ static void machvirt_init(MachineState *machine) > > > > create_fdt(vbi); > > > > + > > for (n = 0; n < smp_cpus; n++) { > > ObjectClass *oc = cpu_class_by_name(TYPE_ARM_CPU, > > cpustr[0]); CPUClass *cc = CPU_CLASS(oc); > > @@ -984,7 +985,7 @@ static void machvirt_init(MachineState *machine) > > */ > > create_virtio_devices(vbi, pic); > > > > -create_fw_cfg(vbi); > > +create_fw_cfg(vbi, &address_space_memory); > > rom_set_fw(fw_cfg_find()); > > > > guest_info->smp_cpus = smp_cpus; > > >
Re: [Qemu-devel] [PATCH v3 0/5] fw_cfg DMA interface
On Fri, 18 Sep 2015 14:25:09 -0400 "Kevin O'Connor" wrote: > On Fri, Sep 18, 2015 at 10:58:44AM +0200, Marc Marí wrote: > > Implement host-side of the FW CFG DMA interface both for x86 and > > ARM. > > > > Based on Gerd Hoffman's initial implementation. > > Thanks for working on this Marc! > > Any chance you could add the patch below to the series (or merge it > into your series)? Unless it is decided to merge the series as is, I'll send another version with the little nitpicks corrected. I'll add this patch too. Thank you also for all the comments! Marc > The patch adds a signature to the DMA address IO register. With the > current implementation, a future firmware would have to implement the > V1 fw_cfg interface just to probe for the dma interface. It might be > useful if future firmwares (that don't care about backwards > compatibility with old versions of qemu) could probe for the dma > fw_cfg interface by just checking for a signature (and therefore not > require all the V1 code just to probe). > > -Kevin > > > commit ae6d8df012ef9b21ae17bfb0383d116f71ba1d58 > Author: Kevin O'Connor > Date: Fri Sep 18 14:14:55 2015 -0400 > > fw_cfg: Define a static signature to be returned on DMA port reads > > Return a static signature ("QEMU CFG") if the guest does a read > to the DMA address io register. > > Signed-off-by: Kevin O'Connor > > diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt > index d5f9ddd..5bf3f65 100644 > --- a/docs/specs/fw_cfg.txt > +++ b/docs/specs/fw_cfg.txt > @@ -93,6 +93,10 @@ by selecting the "signature" item using key 0x > (FW_CFG_SIGNATU RE), > and reading four bytes from the data register. If the fw_cfg device > is present, the four bytes read will contain the characters "QEMU". > > +Additionaly, if the DMA interface is available then a read to the DMA > +Address will return 0x51454d5520434647 ("QEMU CFG" in big-endian > +format). > + > === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === > > A 32-bit little-endian unsigned int, this item is used to check for > enabled diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > index d11d8c5..d95075d 100644 > --- a/hw/nvram/fw_cfg.c > +++ b/hw/nvram/fw_cfg.c > @@ -53,6 +53,8 @@ > #define FW_CFG_DMA_CTL_SKIP0x04 > #define FW_CFG_DMA_CTL_SELECT 0x08 > > +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */ > + > typedef struct FWCfgEntry { > uint32_t len; > uint8_t *data; > @@ -393,6 +395,12 @@ static void fw_cfg_dma_transfer(FWCfgState *s) > trace_fw_cfg_read(s, 0); > } > > +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr, > +unsigned size) > +{ > +return FW_CFG_DMA_SIGNATURE >> ((8 - addr - size) * 8); > +} > + > static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, > uint64_t value, unsigned size) > { > @@ -416,8 +424,8 @@ static void fw_cfg_dma_mem_write(void *opaque, > hwaddr addr, static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr > addr, unsigned size, bool is_write) > { > -return is_write && ((size == 4 && (addr == 0 || addr == 4)) || > -(size == 8 && addr == 0)); > +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) || > + (size == 8 && addr == 0)); > } > > static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr, > @@ -488,6 +496,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops > = { }; > > static const MemoryRegionOps fw_cfg_dma_mem_ops = { > +.read = fw_cfg_dma_mem_read, > .write = fw_cfg_dma_mem_write, > .endianness = DEVICE_BIG_ENDIAN, > .valid.accepts = fw_cfg_dma_mem_valid, >
[Qemu-devel] [PATCH v3 5/5] Enable fw_cfg DMA interface for x86
Enable the fw_cfg DMA interface for all the x86 platforms. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/i386/pc.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 56aecce..6e02061 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -753,14 +753,15 @@ static void pc_build_smbios(FWCfgState *fw_cfg) } } -static FWCfgState *bochs_bios_init(void) +static FWCfgState *bochs_bios_init(AddressSpace *as) { FWCfgState *fw_cfg; uint64_t *numa_fw_cfg; int i, j; unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); +fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 4, as); + /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: * * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug @@ -1316,6 +1317,7 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, MemoryRegion *ram_below_4g, *ram_above_4g; FWCfgState *fw_cfg; MachineState *machine = MACHINE(pcms); +AddressSpace *as; assert(machine->ram_size == pcms->below_4g_mem_size + pcms->above_4g_mem_size); @@ -1407,7 +1409,10 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, option_rom_mr, 1); -fw_cfg = bochs_bios_init(); +as = g_malloc(sizeof(*as)); +address_space_init(as, ram_below_4g, "pc.as"); +fw_cfg = bochs_bios_init(as); + rom_set_fw(fw_cfg); if (guest_info->has_reserved_memory && pcms->hotplug_memory.base) { -- 2.4.3
[Qemu-devel] [PATCH v3 3/5] Implement fw_cfg DMA interface
Based on the specifications on docs/specs/fw_cfg.txt This interface is an addon. The old interface can still be used as usual. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 2 +- hw/nvram/fw_cfg.c | 228 +++--- include/hw/nvram/fw_cfg.h | 16 +++- 3 files changed, 230 insertions(+), 16 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index e9324f5..3568107 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -657,7 +657,7 @@ static void create_fw_cfg(const VirtBoardInfo *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8); +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 658f8c4..d11d8c5 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -23,6 +23,7 @@ */ #include "hw/hw.h" #include "sysemu/sysemu.h" +#include "sysemu/dma.h" #include "hw/isa/isa.h" #include "hw/nvram/fw_cfg.h" #include "hw/sysbus.h" @@ -30,7 +31,7 @@ #include "qemu/error-report.h" #include "qemu/config-file.h" -#define FW_CFG_SIZE 2 +#define FW_CFG_CTL_SIZE 2 #define FW_CFG_NAME "fw_cfg" #define FW_CFG_PATH "/machine/" FW_CFG_NAME @@ -42,6 +43,16 @@ #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM) +/* FW_CFG_VERSION bits */ +#define FW_CFG_VERSION 0x01 +#define FW_CFG_VERSION_DMA 0x02 + +/* FW_CFG_DMA_CONTROL bits */ +#define FW_CFG_DMA_CTL_ERROR 0x01 +#define FW_CFG_DMA_CTL_READ0x02 +#define FW_CFG_DMA_CTL_SKIP0x04 +#define FW_CFG_DMA_CTL_SELECT 0x08 + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -59,6 +70,10 @@ struct FWCfgState { uint16_t cur_entry; uint32_t cur_offset; Notifier machine_ready; + +bool dma_enabled; +AddressSpace *dma_as; +dma_addr_t dma_addr; }; struct FWCfgIoState { @@ -66,8 +81,8 @@ struct FWCfgIoState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion comb_iomem; -uint32_t iobase; +MemoryRegion comb_iomem, dma_iomem; +uint32_t iobase, dma_iobase; }; struct FWCfgMemState { @@ -75,7 +90,7 @@ struct FWCfgMemState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion ctl_iomem, data_iomem; +MemoryRegion ctl_iomem, data_iomem, dma_iomem; uint32_t data_width; MemoryRegionOps wide_data_ops; }; @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr addr, } while (i); } +static void fw_cfg_dma_transfer(FWCfgState *s) +{ +dma_addr_t len; +FWCfgDmaAccess dma; +int arch; +FWCfgEntry *e; +int read; +dma_addr_t dma_addr; + +/* Reset the address before the next access */ +dma_addr = s->dma_addr; +s->dma_addr = 0; + +dma.address = ldq_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, address)); +dma.length = ldl_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, length)); +dma.control = ldl_be_dma(s->dma_as, +dma_addr + offsetof(FWCfgDmaAccess, control)); + +if (dma.control & FW_CFG_DMA_CTL_SELECT) { +fw_cfg_select(s, dma.control >> 16); +} + +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; + +if (dma.control & FW_CFG_DMA_CTL_READ) { +read = 1; +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) { +read = 0; +} else { +return; +} + +dma.control = 0; + +while (dma.length > 0 && !(dma.control & FW_CFG_DMA_CTL_ERROR)) { +if (s->cur_entry == FW_CFG_INVALID || !e->data || +s->cur_offset >= e->len) { +len = dma.length; + +/* If the access is not a read access, it will be a skip access, + * tested before. + */ +if (read) { +if (dma_memory_set(s->dma_as, dma.address, 0, len)) { +dma.control |= FW_CFG_DMA_CTL_ERROR; +} +} + +} else { +if (dma.length <= (e->len - s->cur_offset)) { +len = dma.length; +} else { +len = (e->len - s->cur_offset); +} + +if (e->read_callback) { +e->read_callback(e->callback_opaque, s->cur_offset); +} + +/* If the access is not a read access, it will be a skip access, +
[Qemu-devel] [PATCH v3 2/5] fw_cfg DMA interface documentation
Add fw_cfg DMA interface specification in the documentation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- docs/specs/fw_cfg.txt | 65 +++ 1 file changed, 61 insertions(+), 4 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 5bc7b96..d5f9ddd 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -76,6 +76,13 @@ increasing address order, similar to memcpy(). Selector Register IOport: 0x510 Data Register IOport: 0x511 +DMA Address IOport: 0x514 + +=== ARM Register Locations === + +Selector Register address: Base + 8 (2 bytes) +Data Register address: Base + 0 (8 bytes) +DMA Address address: Base + 16 (8 bytes) == Firmware Configuration Items == @@ -86,11 +93,12 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". -=== Revision (Key 0x0001, FW_CFG_ID) === +=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === -A 32-bit little-endian unsigned int, this item is used as an interface -revision number, and is currently set to 1 by QEMU when fw_cfg is -initialized. +A 32-bit little-endian unsigned int, this item is used to check for enabled +features. + - Bit 0: traditional interface. Always set. + - Bit 1: DMA interface. === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === @@ -132,6 +140,55 @@ Selector Reg.Range Usage In practice, the number of allowed firmware configuration items is given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). += Guest-side DMA Interface = + +If bit 1 of the feature bitmap is set, the DMA interface is present. This does +not replace the existing fw_cfg interface, it is an add-on. This interface +can be used through the 64-bit wide address register. + +The address register is in big-endian format. The value for the register is 0 +at startup and after an operation. A write to the lower half triggers an +operation. This means that operations with 32-bit addresses can be triggered +with just one write, whereas operations with 64-bit addresses can be +triggered with one 64-bit write or two 32-bit writes, starting with the +higher part. + +In this register, the physical address of a FWCfgDmaAccess structure in RAM +should be written. This is the format of the FWCfgDmaAccess structure: + +typedef struct FWCfgDmaAccess { +uint32_t control; +uint32_t length; +uint64_t address; +} FWCfgDmaAccess; + +The fields of the structure are in big endian mode, and the field at the lowest +address is the "control" field. + +The "control" field has the following bits: + - Bit 0: Error + - Bit 1: Read + - Bit 2: Skip + - Bit 3: Select. The upper 16 bits are the selected index. + +When an operation is triggered, if the "control" field has bit 3 set, the +upper 16 bits are interpreted as an index of a firmware configuration item. +This has the same effect as writing the selector register. + +If the "control" field has bit 1 set, a read operation will be performed. +"length" bytes for the current selector and offset will be copied into the +physical RAM address specified by the "address" field. + +If the "control" field has bit 2 set (and not bit 1), a skip operation will be +performed. The offset for the current selector will be advanced "length" bytes. + +To check result, read the "control" field: + error bit set-> something went wrong. + all bits cleared -> transfer finished successfully. + otherwise-> transfer still in progress (doesn't happen +today due to implementation not being async, +but may in the future). + = Host-side API = The following functions are available to the QEMU programmer for adding -- 2.4.3
[Qemu-devel] [PATCH v3 0/5] fw_cfg DMA interface
Implement host-side of the FW CFG DMA interface both for x86 and ARM. Based on Gerd Hoffman's initial implementation. Gabriel L. Somlo (1): fw_cfg: document fw_cfg_modify_iXX() update functions Marc Marí (4): fw_cfg DMA interface documentation Implement fw_cfg DMA interface Enable fw_cfg DMA interface for ARM Enable fw_cfg DMA interface for x86 docs/specs/fw_cfg.txt | 76 +++- hw/arm/virt.c | 9 +- hw/i386/pc.c | 11 ++- hw/nvram/fw_cfg.c | 228 +++--- include/hw/nvram/fw_cfg.h | 16 +++- 5 files changed, 314 insertions(+), 26 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH v3 4/5] Enable fw_cfg DMA interface for ARM
Enable the fw_cfg DMA interface for the ARM virt machine. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 3568107..47f4ad3 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -113,7 +113,7 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, -[VIRT_FW_CFG] = { 0x0902, 0x000a }, +[VIRT_FW_CFG] = { 0x0902, 0x0014 }, [VIRT_MMIO] = { 0x0a00, 0x0200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, @@ -651,13 +651,13 @@ static void create_flash(const VirtBoardInfo *vbi) g_free(nodename); } -static void create_fw_cfg(const VirtBoardInfo *vbi) +static void create_fw_cfg(const VirtBoardInfo *vbi, AddressSpace *as) { hwaddr base = vbi->memmap[VIRT_FW_CFG].base; hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); @@ -919,6 +919,7 @@ static void machvirt_init(MachineState *machine) create_fdt(vbi); + for (n = 0; n < smp_cpus; n++) { ObjectClass *oc = cpu_class_by_name(TYPE_ARM_CPU, cpustr[0]); CPUClass *cc = CPU_CLASS(oc); @@ -984,7 +985,7 @@ static void machvirt_init(MachineState *machine) */ create_virtio_devices(vbi, pic); -create_fw_cfg(vbi); +create_fw_cfg(vbi, &address_space_memory); rom_set_fw(fw_cfg_find()); guest_info->smp_cpus = smp_cpus; -- 2.4.3
[Qemu-devel] [PATCH v3 1/5] fw_cfg: document fw_cfg_modify_iXX() update functions
From: "Gabriel L. Somlo" Document the behavior of fw_cfg_modify_iXX() for leak-less updating of integer-type blobs. Currently only fw_cfg_modify_i16() is coded, but 32- and 64-bit versions may be added later if necessary.. Signed-off-by: Gabriel Somlo Signed-off-by: Gerd Hoffmann Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 74351dd..5bc7b96 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -159,6 +159,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add a dynamically allocated copy of the appropriately sized item to fw_cfg under the given selector key value. +== fw_cfg_modify_iXX() == + +Modify the value of an XX-bit item (where XX may be 16, 32, or 64). +Similarly to the corresponding fw_cfg_add_iXX() function set, convert +a 16-, 32-, or 64-bit integer to little endian, create a dynamically +allocated copy of the required size, and replace the existing item at +the given selector key value with the newly allocated one. The previous +item, assumed to have been allocated during an earlier call to +fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed +before the function returns. + == fw_cfg_add_file() == Given a filename (i.e., fw_cfg item name), starting pointer, and size, -- 2.4.3
[Qemu-devel] QEMU fw_cfg DMA interface
Implementation of the FW CFG DMA interface. When running a Linux guest on top of QEMU, using the -kernel options, this is the timing improvement for x86: QEMU commit 16a1b6e and SeaBIOS commit e4d2b8c QEMU startup time: .080 BIOS startup time: .060 Kernel setup time: .586 Total time: .726 QEMU with this patch series and SeaBIOS with this patch series QEMU startup time: .080 BIOS startup time: .039 Kernel setup time: .002 Total time: .121 QEMU startup time is the time between the start and the first kvm_entry. BIOS startup time is the time between the first kvm_entry and the start of function do_boot, in SeaBIOS. Kernel setup time is the time between the start of the function do_boot in SeaBIOS and the jump to the Linux kernel. As you can see, both the BIOS (because of ACPI tables and other configurations) and the Linux kernel boot (because of the copy to memory) are greatly improved with this new interface. Also, this new interface is an addon to the old interface. Both interfaces are compatible and interchangeable. Changes from v1: - Take into account order of fields in the FWCfgDmaAccess structure - Check and change endianness of FWCfgDmaAccess fields - Change order of fields in the FWCfgDmaAccess structure - Add FW_CFG_DMA_CTL_SKIP feature for control field - Split FW_CFG_SIZE in QEMU - Make FW_CFG_ID a bitmap of features - Add 64 bit address support for the transfer. Trigger when writing the low address, and address is 0 by default and at the end of each transfer. - Align ports and addresses. - Preserve old fw_cfg_comb_valid behaviour in QEMU - Update documentation to reflect all these changes Changes from v2: - Make IOports fw_cfg DMA region a different IO region. - Reuse everything for MMIO and IOport DMA regions - Make transfer status only based on control field - Use DMA helpers instead of direct map/unmap - Change ARM fw_cfg DMA address space - Change Linux boot process to match linuxboot.S - Add select capabilities in the FWCfgDmaAccess struct - Update documentation to reflect all these changes
Re: [Qemu-devel] [RFC PATCH 1/3] pc: fw_cfg: move ioport base constant to pc.h
On Sun, 13 Sep 2015 13:28:24 -0400 "Gabriel L. Somlo" wrote: > On Sun, Sep 13, 2015 at 12:51:53PM +0200, Marc Marí wrote: > > On Sat, 12 Sep 2015 19:30:40 -0400 > > "Gabriel L. Somlo" wrote: > > > > > Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename > > > it to FW_CFG_IO_BASE. Also, add FW_CFG_IO_SIZE define (set > > > to 0x02, to cover the overlapping 16-bit control and 8-bit > > > data ports). > > > > > > Signed-off-by: Gabriel Somlo > > > --- > > > hw/i386/pc.c | 5 ++--- > > > include/hw/i386/pc.h | 3 +++ > > > 2 files changed, 5 insertions(+), 3 deletions(-) > > > > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > > > index b5107f7..1a92b4f 100644 > > > --- a/hw/i386/pc.c > > > +++ b/hw/i386/pc.c > > > @@ -86,7 +86,6 @@ void pc_set_legacy_acpi_data_size(void) > > > acpi_data_size = 0x1; > > > } > > > > > > -#define BIOS_CFG_IOPORT 0x510 > > > #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0) > > > #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1) > > > #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2) > > > @@ -760,7 +759,7 @@ static FWCfgState *bochs_bios_init(void) > > > int i, j; > > > unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); > > > > > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); > > > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE); > > > /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: > > > * > > > * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU > > > hotplug @@ -1292,7 +1291,7 @@ FWCfgState > > > *xen_load_linux(PCMachineState *pcms, > > > assert(MACHINE(pcms)->kernel_filename != NULL); > > > > > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); > > > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE); > > > rom_set_fw(fw_cfg); > > > > > > load_linux(pcms, fw_cfg); > > > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h > > > index 3e002c9..0cab3c5 100644 > > > --- a/include/hw/i386/pc.h > > > +++ b/include/hw/i386/pc.h > > > @@ -206,6 +206,9 @@ typedef void (*cpu_set_smm_t)(int smm, void > > > *arg); > > > void ioapic_init_gsi(GSIState *gsi_state, const char > > > *parent_name); > > > +#define FW_CFG_IO_BASE 0x510 > > > +#define FW_CFG_IO_SIZE 0x02 > > > + > > > /* acpi_piix.c */ > > > > > > I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t > > > smb_io_base, > > > > There is already a size defined in hw/nvram/fw_cfg.c (FW_CFG_SIZE). > > You could move this definition to the .h and reuse it for ACPI. > > This way, it is easier to modify. > > > > Note that this value is used both for the size of the IO port and > > the size of the CTL field when using memory regions. You can split > > it now in your patches, or it will be split in my patches. > > Thanks for the feedback! It does look like FW_CFG_SIZE in fw_cfg.c > appears to be mainly concerned with the width of the control register, > which is a "private" property of fw_cfg.c, rather than the total size > of the fw_cfg ioport region, which is a property of hw/i386/pc.c > (same as a15memmap[VIRT_FW_CFG] contains the same (base,size) > properties for the equivalent mmio region on arm). > > We could rename FW_CFG_SIZE in fw_cfg.c to FW_CFG_CTL_SIZE for > increased clarity, but the fact that it's equal to FW_CFG_IO_SIZE > on hw/i386/... seems to me like more of a coincidence... In hw/nvram/fw_cfg.c L. 675, in fw_cfg_io_realize(): memory_region_init_io(&s->comb_iomem, OBJECT(s), &fw_cfg_comb_mem_ops, FW_CFG(s), "fwcfg", FW_CFG_SIZE); L. 707, in fw_cfg_mem_realize(): memory_region_init_io(&s->ctl_iomem, OBJECT(s), &fw_cfg_ctl_mem_ops, FW_CFG(s), "fwcfg.ctl", FW_CFG_SIZE); The first one is the size of all the IO region, and the second one is the size of the memory mapped CTL field. The value for the ACPI size could be the same value used in memory_region_init_io. But it needs to be deatached from the CTL size. That's what I meant before. Of course, it can be the same, but it's better to split it. Thanks Marc > OTOH, i386/acpi_build.c includes both pc.h and fw_cfg.h, so if I have > to, I could use FW_CFG_IO_BASE from the former and FW_CFG_SIZE from > the latter. > > It's more of a question of aesthetics at this point, so I'm happy > to do it whichever way I'm told :) > > > > > I'm not going to comment on the other patches, because I don't know > > ACPI. > > > > Thanks > > Marc
Re: [Qemu-devel] [RFC PATCH 1/3] pc: fw_cfg: move ioport base constant to pc.h
On Sat, 12 Sep 2015 19:30:40 -0400 "Gabriel L. Somlo" wrote: > Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename > it to FW_CFG_IO_BASE. Also, add FW_CFG_IO_SIZE define (set > to 0x02, to cover the overlapping 16-bit control and 8-bit > data ports). > > Signed-off-by: Gabriel Somlo > --- > hw/i386/pc.c | 5 ++--- > include/hw/i386/pc.h | 3 +++ > 2 files changed, 5 insertions(+), 3 deletions(-) > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index b5107f7..1a92b4f 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -86,7 +86,6 @@ void pc_set_legacy_acpi_data_size(void) > acpi_data_size = 0x1; > } > > -#define BIOS_CFG_IOPORT 0x510 > #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0) > #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1) > #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2) > @@ -760,7 +759,7 @@ static FWCfgState *bochs_bios_init(void) > int i, j; > unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE); > /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: > * > * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU > hotplug @@ -1292,7 +1291,7 @@ FWCfgState > *xen_load_linux(PCMachineState *pcms, > assert(MACHINE(pcms)->kernel_filename != NULL); > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE); > rom_set_fw(fw_cfg); > > load_linux(pcms, fw_cfg); > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h > index 3e002c9..0cab3c5 100644 > --- a/include/hw/i386/pc.h > +++ b/include/hw/i386/pc.h > @@ -206,6 +206,9 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg); > > void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name); > > +#define FW_CFG_IO_BASE 0x510 > +#define FW_CFG_IO_SIZE 0x02 > + > /* acpi_piix.c */ > > I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base, There is already a size defined in hw/nvram/fw_cfg.c (FW_CFG_SIZE). You could move this definition to the .h and reuse it for ACPI. This way, it is easier to modify. Note that this value is used both for the size of the IO port and the size of the CTL field when using memory regions. You can split it now in your patches, or it will be split in my patches. I'm not going to comment on the other patches, because I don't know ACPI. Thanks Marc
Re: [Qemu-devel] [PATCH v2 2/2] Add dynamic generation of module_block.h
On Wed, 9 Sep 2015 10:27:13 +0800 Fam Zheng wrote: > On Tue, 09/08 15:53, Marc Marí wrote: > > diff --git a/include/qemu/module_block.h > > b/include/qemu/module_block.h deleted file mode 100644 > > index d725db8..000 > > --- a/include/qemu/module_block.h > > +++ /dev/null > > @@ -1,90 +0,0 @@ > > -/* AUTOMATICALLY GENERATED, DO NOT MODIFY */ > > -/* > > - * QEMU Block Module Infrastructure > > - * > > - * Copyright Red Hat, Inc. 2015 > > - * > > - * Authors: > > - * Marc Mari > > - * > > - * This work is licensed under the terms of the GNU GPL, version > > 2. See > > - * the COPYING file in the top-level directory. > > - * > > - */ > > - > > Could you reorder the patches so you don't need to add them remove the > generated header? > > > diff --git a/scripts/modules/module_block.py > > b/scripts/modules/module_block.py new file mode 100755 > > index 000..0846362 > > --- /dev/null > > +++ b/scripts/modules/module_block.py > > @@ -0,0 +1,134 @@ > > +#!/usr/bin/python > > +# > > +# Module information generator > > +# > > +# Copyright Red Hat, Inc. 2015 > > +# > > +# Authors: > > +# Marc Mari > > +# > > +# This work is licensed under the terms of the GNU GPL, version 2. > > +# See the COPYING file in the top-level directory. > > + > > +from __future__ import print_function > > +import sys > > +import os > > + > > +def get_string_struct(line): > > +data = line.split() > > + > > +# data[0] -> struct element name > > +# data[1] -> = > > +# data[2] -> value > > + > > +return data[2].replace('"', '')[:-1] > > + > > +def add_module(fhader, library, format_name, protocol_name, > > +probe, probe_device): > > +lines = [] > > +lines.append('.library_name = "' + library + '",') > > +if format_name != "": > > +lines.append('.format_name = "' + format_name + '",') > > +if protocol_name != "": > > +lines.append('.protocol_name = "' + protocol_name + '",') > > +if probe: > > +lines.append('.has_probe = true,') > > +if probe_device: > > +lines.append('.has_probe_device = true,') > > + > > +text = '\n\t'.join(lines) > > +fheader.write('\n\t{\n\t' + text + '\n\t},') > > + > > +def process_file(fheader, filename): > > +# This parser assumes the coding style rules are being followed > > +with open(filename, "r") as cfile: > > +found_something = False > > +found_start = False > > +library, _ = os.path.splitext(os.path.basename(filename)) > > +for line in cfile: > > +if found_start: > > +line = line.replace('\n', '') > > +if line.find(".format_name") != -1: > > +format_name = get_string_struct(line) > > +elif line.find(".protocol_name") != -1: > > +protocol_name = get_string_struct(line) > > +elif line.find(".bdrv_probe") != -1: > > +probe = True > > +elif line.find(".bdrv_probe_device") != -1: > > +probe_device = True > > +elif line == "};": > > +add_module(fheader, library, format_name, > > protocol_name, > > +probe, probe_device) > > +found_start = False > > +elif line.find("static BlockDriver") != -1: > > +found_something = True > > +found_start = True > > +format_name = "" > > +protocol_name = "" > > +probe = False > > +probe_device = False > > + > > +if not found_something: > > +print("No BlockDriver struct found in " + filename + > > ". \ > > +Is this really a module?", file=sys.stderr) > > +sys.exit(1) > > + > > +def print_top(fheader): > > +fheader.write('''/* AUTOMATICALLY GENERATED, DO NOT MODIFY */ > > +/* > > + * QEMU Block Module Infrastr
[Qemu-devel] [PATCH v2 0/2] Dynamic module support for block drivers
The current module infrastructure has been improved to enable dynamic module loading. This reduces the load time for very simple guests. For the following configuration (very loaded) ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ --enable-kvm --enable-rdma --enable-uuid --enable-vde \ --enable-linux-aio --enable-cap-ng --enable-attr --enable-vhost-net \ --enable-vhost-scsi --enable-spice --enable-rbd --enable-libiscsi \ --enable-smartcard-nss --enable-guest-agent --enable-libusb \ --enable-usb-redir --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp --enable-coroutine-pool --enable-glusterfs \ --enable-tpm --enable-libssh2 --enable-vhdx --enable-numa \ --enable-tcmalloc --target-list=x86_64-softmmu With modules disabled, there are 142 libraries loaded at startup. Time is the following: LD time: 0.065 seconds QEMU time: 0.02 seconds Total time: 0.085 seconds With this patch series and modules enabled, there are 128 libraries loaded at startup. Time is the following: LD time: 0.02 seconds QEMU time: 0.02 seconds Total time: 0.04 seconds Where LD time is the time between the program startup and the jump to main, and QEMU time is the time between the start of main and the first kvm_entry. These results are just with a few block drivers, that were already a module. Adding more modules (block or not block) should be easy, and will reduce the load time even more. Marc Marí (2): Add dynamic module loading for block drivers Add dynamic generation of module_block.h .gitignore | 1 + Makefile| 10 ++- block.c | 70 + configure | 2 +- include/qemu/module.h | 3 + scripts/modules/module_block.py | 134 util/module.c | 38 7 files changed, 227 insertions(+), 31 deletions(-) create mode 100755 scripts/modules/module_block.py -- 2.4.3
[Qemu-devel] [PATCH v2 1/2] Add dynamic module loading for block drivers
Extend the current module interface to allow for block drivers to be loaded dynamically on request. The only block drivers that can be converted into modules are the drivers that don't perform any init operation except for registering themselves. This is why libiscsi has been disabled as a module. All the necessary module information is located in a new structure found in include/qemu/module_block.h Signed-off-by: Marc Marí --- block.c | 70 +++ configure | 2 +- include/qemu/module.h | 3 ++ include/qemu/module_block.h | 90 + util/module.c | 38 ++- 5 files changed, 175 insertions(+), 28 deletions(-) create mode 100644 include/qemu/module_block.h diff --git a/block.c b/block.c index 090923c..814429a 100644 --- a/block.c +++ b/block.c @@ -27,6 +27,7 @@ #include "block/block_int.h" #include "block/blockjob.h" #include "qemu/error-report.h" +#include "qemu/module_block.h" #include "qemu/module.h" #include "qapi/qmp/qerror.h" #include "qapi/qmp/qjson.h" @@ -277,11 +278,30 @@ void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify) BlockDriver *bdrv_find_format(const char *format_name) { BlockDriver *drv1; +size_t i; + QLIST_FOREACH(drv1, &bdrv_drivers, list) { if (!strcmp(drv1->format_name, format_name)) { return drv1; } } + +for (i = 0; i < ARRAY_SIZE(block_driver_modules); ++i) { +if (!strcmp(block_driver_modules[i].format_name, format_name)) { +block_module_load_one(block_driver_modules[i].library_name); +/* Copying code is not nice, but this way the current discovery is + * not modified. Calling recursively could fail if the library + * has been deleted. + */ +QLIST_FOREACH(drv1, &bdrv_drivers, list) { +if (!strcmp(drv1->format_name, format_name)) { +return drv1; +} +} +} +} + + return NULL; } @@ -484,8 +504,15 @@ int get_tmp_filename(char *filename, int size) static BlockDriver *find_hdev_driver(const char *filename) { int score_max = 0, score; +size_t i; BlockDriver *drv = NULL, *d; +for (i = 0; i < ARRAY_SIZE(block_driver_modules); ++i) { +if (block_driver_modules[i].has_probe_device) { +block_module_load_one(block_driver_modules[i].library_name); +} +} + QLIST_FOREACH(d, &bdrv_drivers, list) { if (d->bdrv_probe_device) { score = d->bdrv_probe_device(filename); @@ -507,6 +534,7 @@ BlockDriver *bdrv_find_protocol(const char *filename, char protocol[128]; int len; const char *p; +size_t i; /* TODO Drivers without bdrv_file_open must be specified explicitly */ @@ -533,6 +561,7 @@ BlockDriver *bdrv_find_protocol(const char *filename, len = sizeof(protocol) - 1; memcpy(protocol, filename, len); protocol[len] = '\0'; + QLIST_FOREACH(drv1, &bdrv_drivers, list) { if (drv1->protocol_name && !strcmp(drv1->protocol_name, protocol)) { @@ -540,6 +569,23 @@ BlockDriver *bdrv_find_protocol(const char *filename, } } +for (i = 0; i < ARRAY_SIZE(block_driver_modules); ++i) { +if (block_driver_modules[i].protocol_name && +!strcmp(block_driver_modules[i].protocol_name, protocol)) { +block_module_load_one(block_driver_modules[i].library_name); +/* Copying code is not nice, but this way the current discovery is + * not modified. Calling recursively could fail if the library + * has been deleted. + */ +QLIST_FOREACH(drv1, &bdrv_drivers, list) { +if (drv1->protocol_name && +!strcmp(drv1->protocol_name, protocol)) { +return drv1; +} +} +} +} + error_setg(errp, "Unknown protocol '%s'", protocol); return NULL; } @@ -562,8 +608,15 @@ BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size, const char *filename) { int score_max = 0, score; +size_t i; BlockDriver *drv = NULL, *d; +for (i = 0; i < ARRAY_SIZE(block_driver_modules); ++i) { +if (block_driver_modules[i].has_probe) { +block_module_load_one(block_driver_modules[i].library_name); +} +} + QLIST_FOREACH(d, &bdrv_drivers, list) { if (d->bdrv_probe) { score = d->bdrv_probe(buf, buf_size, filename); @@ -2784,6 +2837,7 @@ void bdrv_iterate_format(void (*it)(void *opaque, const char *name), BlockDriver *drv
[Qemu-devel] [PATCH v2 2/2] Add dynamic generation of module_block.h
To simplify the addition of new block modules, add a script that generates include/qemu/module_block.h automatically from the modules' source code. This script assumes that the QEMU coding style rules are followed. Signed-off-by: Marc Marí --- .gitignore | 1 + Makefile| 10 ++- include/qemu/module_block.h | 90 --- scripts/modules/module_block.py | 134 4 files changed, 142 insertions(+), 93 deletions(-) delete mode 100644 include/qemu/module_block.h create mode 100755 scripts/modules/module_block.py diff --git a/.gitignore b/.gitignore index cb4b8ec..8428dd0 100644 --- a/.gitignore +++ b/.gitignore @@ -106,3 +106,4 @@ cscope.* tags TAGS *~ +/include/qemu/module_block.h diff --git a/Makefile b/Makefile index 9ce3972..6b23e14 100644 --- a/Makefile +++ b/Makefile @@ -73,6 +73,8 @@ GENERATED_HEADERS += trace/generated-ust-provider.h GENERATED_SOURCES += trace/generated-ust.c endif +GENERATED_HEADERS += include/qemu/module_block.h + # Don't try to regenerate Makefile or configure # We don't generate any of them Makefile: ; @@ -220,9 +222,6 @@ Makefile: $(version-obj-y) $(version-lobj-y) libqemustub.a: $(stub-obj-y) libqemuutil.a: $(util-obj-y) -block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL -util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)' - ## qemu-img.o: qemu-img-cmds.h @@ -313,6 +312,11 @@ msi: @echo "MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option)" endif +include/qemu/module_block.h: $(SRC_PATH)/scripts/modules/module_block.py + $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/modules/module_block.py \ + "./include/qemu/" $(patsubst %.mo,%.c,$(block-obj-m)), \ + " GEN $@") + clean: # avoid old build problems by removing potentially incorrect old files rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h diff --git a/include/qemu/module_block.h b/include/qemu/module_block.h deleted file mode 100644 index d725db8..000 --- a/include/qemu/module_block.h +++ /dev/null @@ -1,90 +0,0 @@ -/* AUTOMATICALLY GENERATED, DO NOT MODIFY */ -/* - * QEMU Block Module Infrastructure - * - * Copyright Red Hat, Inc. 2015 - * - * Authors: - * Marc Mari - * - * This work is licensed under the terms of the GNU GPL, version 2. See - * the COPYING file in the top-level directory. - * - */ - -#ifndef QEMU_MODULE_BLOCK_H -#define QEMU_MODULE_BLOCK_H - -#include "qemu-common.h" - -static const struct { -const char *format_name; -const char *protocol_name; -const char *library_name; -bool has_probe; -bool has_probe_device; -} block_driver_modules[] = { - { - .library_name = "curl", - .format_name = "http", - .protocol_name = "http", - }, - { - .library_name = "curl", - .format_name = "https", - .protocol_name = "https", - }, - { - .library_name = "curl", - .format_name = "ftp", - .protocol_name = "ftp", - }, - { - .library_name = "curl", - .format_name = "ftps", - .protocol_name = "ftps", - }, - { - .library_name = "curl", - .format_name = "tftp", - .protocol_name = "tftp", - }, - { - .library_name = "rbd", - .format_name = "rbd", - .protocol_name = "rbd", - }, - { - .library_name = "gluster", - .format_name = "gluster", - .protocol_name = "gluster", - }, - { - .library_name = "gluster", - .format_name = "gluster", - .protocol_name = "gluster+tcp", - }, - { - .library_name = "gluster", - .format_name = "gluster", - .protocol_name = "gluster+unix", - }, - { - .library_name = "gluster", - .format_name = "gluster", - .protocol_name = "gluster+rdma", - }, - { - .library_name = "ssh", - .format_name = "ssh", - .protocol_name = "ssh", - }, - { - .library_name = "dmg", - .format_name = "dmg", - .has_probe = true, - }, -}; - -#endif - diff --git a/scripts/modules/module_block.py b/scripts/modules/module_block.py new file mode 100755 index 000..0846362 --- /dev/null +++ b/scripts/modules/module_block.py @@ -0,0 +
Re: [Qemu-devel] [PATCH 0/2] Dynamic module support for block drivers
On Thu, 3 Sep 2015 17:12:29 +0100 Stefan Hajnoczi wrote: > On Mon, Aug 17, 2015 at 10:09:33AM +0200, Marc Marí wrote: > > The current module infrastructure has been improved to enable > > dynamic module loading. > > > > This reduces the load time for very simple guests. For the following > > configuration (very loaded) > > > > ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ > > --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ > > --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ > > --enable-kvm --enable-rdma --enable-uuid --enable-vde \ > > --enable-linux-aio --enable-cap-ng --enable-attr > > --enable-vhost-net \ --enable-vhost-scsi --enable-spice > > --enable-rbd --enable-libiscsi \ --enable-smartcard-nss > > --enable-guest-agent --enable-libusb \ --enable-usb-redir > > --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp > > --enable-coroutine-pool --enable-glusterfs \ --enable-tpm > > --enable-libssh2 --enable-vhdx --enable-numa \ --enable-tcmalloc > > --target-list=x86_64-softmmu > > > > With modules disabled, there are 142 libraries loaded at startup. > > Time is the following: > > LD time: 0.065 seconds > > QEMU time: 0.02 seconds > > Total time: 0.085 seconds > > > > With this patch series and modules enabled, there are 128 libraries > > loaded at startup. Time is the following: > > LD time: 0.02 seconds > > QEMU time: 0.02 seconds > > Total time: 0.04 seconds > > > > Where LD time is the time between the program startup and the jump > > to main, and QEMU time is the time between the start of main and > > the first kvm_entry. > > > > These results are just with a few block drivers, that were already > > a module. Adding more modules (block or not block) should be easy, > > and will reduce the load time even more. > > Which QEMU command-line did you benchmark? > > Did you have a UI enabled (SDL/GTK/VNC)? x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ -kernel /boot/vmlinuz-4.1.4-200.fc22.x86_64 -nographic Thanks Marc
Re: [Qemu-devel] [PATCH 1/2] Add dynamic module loading for block drivers
On Thu, 3 Sep 2015 17:33:16 +0100 Stefan Hajnoczi wrote: > On Mon, Aug 17, 2015 at 10:09:34AM +0200, Marc Marí wrote: > > +static const struct { > > +const char *format_name; > > +const char *protocol_name; > > +const char *library_name; > > +bool has_probe; > > +bool has_probe_device; > > +} block_driver_module[] = { > > Why is this list incomplete? It doesn't cover all block drivers. > Perhaps these are the only modular block drivers. I think we can decide on a protocol first (these patches), and then apply the changes to all (possible) drivers. At least, that was what I had in mind. > Also, it ignores CONFIG_CURL and friends. Perhaps it doesn't matter > because the module loading code will just see that there is no file > there, but maybe conditional compilation should be used? It's true that this patch doesn't look at the CONFIG options. But the next one does (it takes block-obj-m from the Makefile), and also replaces this file. Thanks Marc > A plural name would more consistent (i.e. you deleted the > plural block_modules[] variable and introduced a singular > block_driver_module[] variable).
[Qemu-devel] [PATCH v2 2/5] fw_cfg DMA interface documentation
Add fw_cfg DMA interface specification in the documentation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- docs/specs/fw_cfg.txt | 68 --- 1 file changed, 64 insertions(+), 4 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 5bc7b96..06302f6 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -76,6 +76,13 @@ increasing address order, similar to memcpy(). Selector Register IOport: 0x510 Data Register IOport: 0x511 +DMA Address IOport: 0x514 + +=== ARM Register Locations === + +Selector Register address: 0x0902 +Data Register address: 0x09020008 +DMA Address address: 0x0902000c == Firmware Configuration Items == @@ -86,11 +93,12 @@ by selecting the "signature" item using key 0x (FW_CFG_SIGNATURE), and reading four bytes from the data register. If the fw_cfg device is present, the four bytes read will contain the characters "QEMU". -=== Revision (Key 0x0001, FW_CFG_ID) === +=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) === -A 32-bit little-endian unsigned int, this item is used as an interface -revision number, and is currently set to 1 by QEMU when fw_cfg is -initialized. +A 32-bit little-endian unsigned int, this item is used to check for enabled +features. + - Bit 0: traditional interface. Always set. + - Bit 1: DMA interface. === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === @@ -132,6 +140,58 @@ Selector Reg.Range Usage In practice, the number of allowed firmware configuration items is given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). += Guest-side DMA Interface = + +If bit 1 of the feature bitmap is set, the DMA interface is present. This does +not replace the existing fw_cfg interface, it is an add-on. This interface +can be used through the 64-bit wide address register. + +The address register, as the selector register, is in little-endian format +when using IOports, and in big-endian format when using MMIO. The value for +the register is 0 at startup and after an operation. A write to the lower +half triggers an operation. This means, that operations with 32-bit addresses +can be triggered with just one write, whereas operations with 64-bit addresses +can be triggered with one 64-bit write or two 32-bit writes, starting with the +higher part. + +In this register, a physical RAM address to a FWCfgDmaAccess structure should +be written. This is the format of the FWCfgDmaAccess structure: + +typedef struct FWCfgDmaAccess { +uint32_t control; +uint32_t length; +uint64_t address; +} FWCfgDmaAccess; + +The fields of the structure are in big endian mode, and the field at the lowest +address is the "control" field. + +The "control" field has the following bits: + - Bit 0: Error + - Bit 1: Read + - Bit 2: Skip + +When an operation is triggered, if the "control" field has bit 1 set, a read +operation will be performed. "length" bytes for the current selector and +offset will be copied into the address specified by the "address" field. + +If the control field has only bit 2 set, a skip operation will be perfomed. +The offset for the current selector will be advanced "length" bytes. + +To check result, read the "control" field: + error bit set-> something went wrong. + all bits cleared -> transfer finished successfully. + otherwise-> transfer still in progress (doesn't happen +today due to implementation not being async, +but may in the future). + +Target address goes up and transfer length goes down as the transfer happens, +so after a successful transfer the length field is zero and the address field +points right after the memory block written. + +If a partial transfer happened before an error occured the address and +length registers indicate how much data has been transfered successfully. + = Host-side API = The following functions are available to the QEMU programmer for adding -- 2.4.3
[Qemu-devel] [PATCH v2 3/5] Implement fw_cfg DMA interface
Based on the specifications on docs/specs/fw_cfg.txt This interface is an addon. The old interface can still be used as usual. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 2 +- hw/nvram/fw_cfg.c | 261 +++--- include/hw/nvram/fw_cfg.h | 15 ++- 3 files changed, 260 insertions(+), 18 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index d5a8417..b88c104 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -620,7 +620,7 @@ static void create_fw_cfg(const VirtBoardInfo *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8); +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 88481b7..7e5ba96 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -23,6 +23,7 @@ */ #include "hw/hw.h" #include "sysemu/sysemu.h" +#include "sysemu/dma.h" #include "hw/isa/isa.h" #include "hw/nvram/fw_cfg.h" #include "hw/sysbus.h" @@ -30,7 +31,8 @@ #include "qemu/error-report.h" #include "qemu/config-file.h" -#define FW_CFG_SIZE 2 +#define FW_CFG_IO_SIZE 12 /* Address aligned to 4 bytes */ +#define FW_CFG_CTL_SIZE 2 #define FW_CFG_NAME "fw_cfg" #define FW_CFG_PATH "/machine/" FW_CFG_NAME @@ -42,6 +44,15 @@ #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM) +/* FW_CFG_VERSION bits */ +#define FW_CFG_VERSION 0x01 +#define FW_CFG_VERSION_DMA 0x02 + +/* FW_CFG_DMA_CONTROL bits */ +#define FW_CFG_DMA_CTL_ERROR 0x01 +#define FW_CFG_DMA_CTL_READ0x02 +#define FW_CFG_DMA_CTL_SKIP0x04 + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -59,6 +70,10 @@ struct FWCfgState { uint16_t cur_entry; uint32_t cur_offset; Notifier machine_ready; + +bool dma_enabled; +AddressSpace *dma_as; +dma_addr_t dma_addr; }; struct FWCfgIoState { @@ -75,7 +90,7 @@ struct FWCfgMemState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion ctl_iomem, data_iomem; +MemoryRegion ctl_iomem, data_iomem, dma_iomem; uint32_t data_width; MemoryRegionOps wide_data_ops; }; @@ -294,6 +309,142 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr addr, } while (i); } +static void fw_cfg_dma_transfer(FWCfgState *s) +{ +dma_addr_t len; +uint8_t *ptr; +void *addr; +FWCfgDmaAccess dma; +int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); +FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; + +len = sizeof(dma); +addr = dma_memory_map(s->dma_as, s->dma_addr, &len, +DMA_DIRECTION_FROM_DEVICE); + +s->dma_addr = 0; + +if (!addr || !len) { +return; +} + +dma.address = be64_to_cpu(*(uint64_t *)(addr + +offsetof(FWCfgDmaAccess, address))); +dma.length = be32_to_cpu(*(uint32_t *)(addr + +offsetof(FWCfgDmaAccess, length))); +dma.control = be32_to_cpu(*(uint32_t *)(addr + +offsetof(FWCfgDmaAccess, control))); + +if (dma.control & FW_CFG_DMA_CTL_ERROR) { +goto out; +} + +if (!(dma.control & (FW_CFG_DMA_CTL_READ | FW_CFG_DMA_CTL_SKIP))) { +goto out; +} + +while (dma.length > 0) { +if (s->cur_entry == FW_CFG_INVALID || !e->data || +s->cur_offset >= e->len) { +len = dma.length; + +/* If the access is not a read access, it will be a skip access, + * tested before. + */ +if (dma.control & FW_CFG_DMA_CTL_READ) { +ptr = dma_memory_map(s->dma_as, dma.address, &len, + DMA_DIRECTION_FROM_DEVICE); +if (!ptr || !len) { +dma.control |= FW_CFG_DMA_CTL_ERROR; +goto out; +} + +memset(ptr, 0, len); + +dma_memory_unmap(s->dma_as, ptr, len, + DMA_DIRECTION_FROM_DEVICE, len); +} + +} else { +if (dma.length <= e->len) { +len = dma.length; +} else { +len = e->len; +} + +if (e->read_callback) { +e->read_callback(e->callback_opaque, s->cur_offset); +} + +/* If the access is not a read access, it will be a skip access, + * tested before. + */
[Qemu-devel] [PATCH v2 5/5] Enable fw_cfg DMA interface for x86
Enable the fw_cfg DMA interface for all the x86 platforms. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/i386/pc.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 9f2924e..c6dc84f 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -753,14 +753,15 @@ static void pc_build_smbios(FWCfgState *fw_cfg) } } -static FWCfgState *bochs_bios_init(void) +static FWCfgState *bochs_bios_init(AddressSpace *as) { FWCfgState *fw_cfg; uint64_t *numa_fw_cfg; int i, j; unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); +fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, as); + /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: * * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug @@ -1316,6 +1317,7 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, MemoryRegion *ram_below_4g, *ram_above_4g; FWCfgState *fw_cfg; MachineState *machine = MACHINE(pcms); +AddressSpace *as; assert(machine->ram_size == pcms->below_4g_mem_size + pcms->above_4g_mem_size); @@ -1407,7 +1409,10 @@ FWCfgState *pc_memory_init(PCMachineState *pcms, option_rom_mr, 1); -fw_cfg = bochs_bios_init(); +as = g_malloc(sizeof(*as)); +address_space_init(as, ram_below_4g, "pc.as"); +fw_cfg = bochs_bios_init(as); + rom_set_fw(fw_cfg); if (guest_info->has_reserved_memory && pcms->hotplug_memory.base) { -- 2.4.3
[Qemu-devel] [PATCH v2 1/5] fw_cfg: document fw_cfg_modify_iXX() update functions
From: "Gabriel L. Somlo" Document the behavior of fw_cfg_modify_iXX() for leak-less updating of integer-type blobs. Currently only fw_cfg_modify_i16() is coded, but 32- and 64-bit versions may be added later if necessary.. Signed-off-by: Gabriel Somlo Signed-off-by: Gerd Hoffmann Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 74351dd..5bc7b96 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -159,6 +159,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add a dynamically allocated copy of the appropriately sized item to fw_cfg under the given selector key value. +== fw_cfg_modify_iXX() == + +Modify the value of an XX-bit item (where XX may be 16, 32, or 64). +Similarly to the corresponding fw_cfg_add_iXX() function set, convert +a 16-, 32-, or 64-bit integer to little endian, create a dynamically +allocated copy of the required size, and replace the existing item at +the given selector key value with the newly allocated one. The previous +item, assumed to have been allocated during an earlier call to +fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed +before the function returns. + == fw_cfg_add_file() == Given a filename (i.e., fw_cfg item name), starting pointer, and size, -- 2.4.3
[Qemu-devel] [PATCH v2 4/5] Enable fw_cfg DMA interface for ARM
Enable the fw_cfg DMA interface for the ARM virt machine. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index b88c104..54d5f54 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -111,7 +111,7 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, -[VIRT_FW_CFG] = { 0x0902, 0x000a }, +[VIRT_FW_CFG] = { 0x0902, 0x0014 }, [VIRT_MMIO] = { 0x0a00, 0x0200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, @@ -614,13 +614,13 @@ static void create_flash(const VirtBoardInfo *vbi) g_free(nodename); } -static void create_fw_cfg(const VirtBoardInfo *vbi) +static void create_fw_cfg(AddressSpace *as, const VirtBoardInfo *vbi) { hwaddr base = vbi->memmap[VIRT_FW_CFG].base; hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); +fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); @@ -808,6 +808,7 @@ static void machvirt_init(MachineState *machine) VirtGuestInfoState *guest_info_state = g_malloc0(sizeof *guest_info_state); VirtGuestInfo *guest_info = &guest_info_state->info; char **cpustr; +AddressSpace *as = NULL; if (!cpu_model) { cpu_model = "cortex-a15"; @@ -845,6 +846,10 @@ static void machvirt_init(MachineState *machine) } cpuobj = object_new(object_class_get_name(oc)); +if (!as) { +as = CPU(cpuobj)->as; +} + /* Handle any CPU options specified by the user */ cc->parse_features(CPU(cpuobj), cpuopts, &err); g_free(cpuopts); @@ -897,7 +902,7 @@ static void machvirt_init(MachineState *machine) */ create_virtio_devices(vbi, pic); -create_fw_cfg(vbi); +create_fw_cfg(as, vbi); rom_set_fw(fw_cfg_find()); guest_info->smp_cpus = smp_cpus; -- 2.4.3
[Qemu-devel] [PATCH v2 0/5] fw_cfg DMA interface
Implement host-side of the FW CFG DMA interface both for x86 and ARM. Based on Gerd Hoffman's initial implementation. Gabriel L. Somlo (1): fw_cfg: document fw_cfg_modify_iXX() update functions Marc Marí (4): fw_cfg DMA interface documentation Implement fw_cfg DMA interface Enable fw_cfg DMA interface for ARM Enable fw_cfg DMA interface for x86 docs/specs/fw_cfg.txt | 79 +- hw/arm/virt.c | 13 ++- hw/i386/pc.c | 11 +- hw/nvram/fw_cfg.c | 261 +++--- include/hw/nvram/fw_cfg.h | 15 ++- 5 files changed, 351 insertions(+), 28 deletions(-) -- 2.4.3
[Qemu-devel] QEMU fw_cfg DMA interface
Implementation of the FW CFG DMA interface. When running a Linux guest on top of QEMU, using the -kernel options, this is the timing improvement for x86: QEMU commit 090d0bf and SeaBIOS commit 2fc20dc QEMU startup time: .078 BIOS startup time: .060 Kernel setup time: .578 Total time: .716 QEMU with this patch series and SeaBIOS with this patch series QEMU startup time: .080 BIOS startup time: .039 Kernel setup time: .002 Total time: .121 QEMU startup time is the time between the start and the first kvm_entry. BIOS startup time is the time between the first kvm_entry and the start of function do_boot, in SeaBIOS. Kernel setup time is the time between the start of the function do_boot in SeaBIOS and the jump to the Linux kernel. As you can see, both the BIOS (because of ACPI tables and other configurations) and the Linux kernel boot (because of the copy to memory) are greatly improved with this new interface. Also, this new interface is an addon to the old interface. Both interfaces are compatible and interchangeable. Changes from v1: - Take into account order of fields in the FWCfgDmaAccess structure - Check and change endianness of FWCfgDmaAccess fields - Change order of fields in the FWCfgDmaAccess structure - Add FW_CFG_DMA_CTL_SKIP feature for control field - Split FW_CFG_SIZE in QEMU - Make FW_CFG_ID a bitmap of features - Add 64 bit address support for the transfer. Trigger when writing the low address, and address is 0 by default and at the end of each transfer. - Align ports and addresses. - Preserve old fw_cfg_comb_valid behaviour in QEMU - Update documentation to reflect all these changes
Re: [Qemu-devel] [PATCH 2/2] Add dynamic generation of module_block.h
On Thu, 27 Aug 2015 10:23:32 +0100 "Daniel P. Berrange" wrote: > On Mon, Aug 17, 2015 at 10:09:35AM +0200, Marc Marí wrote: > > To simplify the addition of new block modules, add a script that > > generates include/qemu/module_block.h automatically from the > > modules' source code. > > > > This script assumes that the QEMU coding style rules are followed. > > > > Signed-off-by: Marc Marí > > --- > > .gitignore | 1 + > > Makefile| 10 ++- > > scripts/modules/module_block.py | 132 > > 3 files changed, 140 > > insertions(+), 3 deletions(-) create mode 100755 > > scripts/modules/module_block.py > > I'd expect to see module_block.h deleted in this commit, otherwise > you're re-generating a file stored in git each time someone runs > make. > > > diff --git a/scripts/modules/module_block.py > > b/scripts/modules/module_block.py new file mode 100755 > > index 000..a9a9412 > > --- /dev/null > > +++ b/scripts/modules/module_block.py > > @@ -0,0 +1,132 @@ > > +#!/usr/bin/python > > +# > > +# Module information generator > > +# > > +# Copyright Red Hat, Inc. 2015 > > +# > > +# Authors: > > +# Marc Mari > > +# > > +# This work is licensed under the terms of the GNU GPL, version 2. > > +# See the COPYING file in the top-level directory. > > + > > +import sys > > +import os > > + > > +def get_string_struct(line): > > +data = line.split() > > + > > +# data[0] -> struct element name > > +# data[1] -> = > > +# data[2] -> value > > + > > +return data[2].replace('"', '')[:-1] > > + > > +def add_module(fhader, library, format_name, protocol_name, > > +probe, probe_device): > > +lines = [] > > +lines.append('.library_name = "' + library + '",') > > +if format_name != "": > > +lines.append('.format_name = "' + format_name + '",') > > +if protocol_name != "": > > +lines.append('.protocol_name = "' + protocol_name + '",') > > +if probe: > > +lines.append('.has_probe = true,') > > +if probe_device: > > +lines.append('.has_probe_device = true,') > > + > > +text = '\n\t'.join(lines) > > +fheader.write('\n\t{\n\t' + text + '\n\t},') > > + > > +def process_file(fheader, filename): > > +# This parser assumes the coding style rules are being followed > > +with open(filename, "r") as cfile: > > +found_something = False > > +found_start = False > > +library, _ = os.path.splitext(os.path.basename(filename)) > > +for line in cfile: > > +if found_start: > > +line = line.replace('\n', '') > > +if line.find(".format_name") != -1: > > +format_name = get_string_struct(line) > > +elif line.find(".protocol_name") != -1: > > +protocol_name = get_string_struct(line) > > +elif line.find(".bdrv_probe") != -1: > > +probe = True > > +elif line.find(".bdrv_probe_device") != -1: > > +probe_device = True > > +elif line == "};": > > +add_module(fheader, library, format_name, > > protocol_name, > > +probe, probe_device) > > +found_start = False > > +elif line.find("static BlockDriver") != -1: > > +found_something = True > > +found_start = True > > +format_name = "" > > +protocol_name = "" > > +probe = False > > +probe_device = False > > + > > +if not found_something: > > +print("No BlockDriver struct found in " + filename + > > ". \ > > +Is this really a module?") > > +sys.exit(1) > > > Errors ought to go to sys.stderr rather than stdout. > > > > > + > > +def print_top(fheader): > > +fheader.write('''/* AUTOMATICA
Re: [Qemu-devel] [PATCH 1/2] Add dynamic module loading for block drivers
On Thu, 27 Aug 2015 10:19:35 +0100 "Daniel P. Berrange" wrote: > On Mon, Aug 17, 2015 at 10:09:34AM +0200, Marc Marí wrote: > > Extend the current module interface to allow for block drivers to > > be loaded dynamically on request. > > > > The only block drivers that can be converted into modules are the > > drivers that don't perform any init operation except for > > registering themselves. This is why libiscsi has been disabled as a > > module. > > Seems like we would just need to split the iscsi opts registration out > into a separate file that is permanently linked. > > > All the necessary module information is located in a new structure > > found in include/qemu/module_block.h > > > > Signed-off-by: Marc Marí > > --- > > block.c | 73 > > +++-- configure > > | 2 +- include/qemu/module.h | 3 ++ > > include/qemu/module_block.h | 89 > > + > > util/module.c | 38 ++- 5 files > > changed, 174 insertions(+), 31 deletions(-) create mode 100644 > > include/qemu/module_block.h > > > > diff --git a/block.c b/block.c > > index d088ee0..f24a624 100644 > > --- a/block.c > > +++ b/block.c > > @@ -27,6 +27,7 @@ > > #include "block/block_int.h" > > #include "block/blockjob.h" > > #include "qemu/error-report.h" > > +#include "qemu/module_block.h" > > #include "qemu/module.h" > > #include "qapi/qmp/qerror.h" > > #include "qapi/qmp/qjson.h" > > @@ -277,11 +278,30 @@ void bdrv_add_close_notifier(BlockDriverState > > *bs, Notifier *notify) BlockDriver *bdrv_find_format(const char > > *format_name) { > > BlockDriver *drv1; > > +int i; > > Nit-pick 'size_t' is a better type for loop iterators, especially > when combined with a sizeof() comparison. Some comment in later > functions too. > > > + > > QLIST_FOREACH(drv1, &bdrv_drivers, list) { > > if (!strcmp(drv1->format_name, format_name)) { > > return drv1; > > } > > } > > + > > +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { > > +if (!strcmp(block_driver_module[i].format_name, > > format_name)) { > > + > > block_module_load_one(block_driver_module[i].library_name); > > +/* Copying code is not nice, but this way the current > > discovery is > > + * not modified. Calling recursively could fail if the > > library > > + * has been deleted. > > + */ > > Can you explaiin what you mean here about "if the library has been > deleted" ? > > Are you referring to possibilty of dlclose()ing the previously loaded > library, or about possibility of the module on disk having been > deleted or something else ? I was thinking of relaunching the search by calling recursively the function. But this would loop infinitely if somebody, manually, deleted the library file. > > @@ -483,9 +503,15 @@ int get_tmp_filename(char *filename, int size) > > */ > > static BlockDriver *find_hdev_driver(const char *filename) > > { > > -int score_max = 0, score; > > +int score_max = 0, score, i; > > BlockDriver *drv = NULL, *d; > > > > +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { > > +if (block_driver_module[i].has_probe_device) { > > + > > block_module_load_one(block_driver_module[i].library_name); > > +} > > +} > > If we have multiuple disks of the same type given to QEMU, it > seems like we might end up calling block_module_load_one() > multiple times for the same module & end up loading the same > .so multiple times as a result. Should module_load() keep a > record of everything it has loaded and short-circuit itself > to a no-op, so that callers of module_load() don't have to > worry about avoiding multiple calls. I avoided that because glib already has it. It just increments the reference count. Which is not important unless we want to dlclose it. https://developer.gnome.org/glib/stable/glib-Dynamic-Loading-of-Modules.html#g-module-open > > + > > QLIST_FOREACH(d, &bdrv_drivers, list) { > > if (d->bdrv_probe_device) { > > score = d->bdrv_probe_device(filename); > > @@ -507,6 +533,7 @@ BlockDriver *bdrv_find_protocol(const char > > *filename, char protocol[128]; > > int len; &g
Re: [Qemu-devel] [PATCH 0/2] Dynamic module support for block drivers
Ping >On Mon, 17 Aug 2015 10:09:33 +0200 >Marc Marí wrote: > > The current module infrastructure has been improved to enable dynamic > module loading. > > This reduces the load time for very simple guests. For the following > configuration (very loaded) > > ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ > --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ > --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ > --enable-kvm --enable-rdma --enable-uuid --enable-vde \ > --enable-linux-aio --enable-cap-ng --enable-attr > --enable-vhost-net \ --enable-vhost-scsi --enable-spice --enable-rbd > --enable-libiscsi \ --enable-smartcard-nss --enable-guest-agent > --enable-libusb \ --enable-usb-redir --enable-lzo --enable-snappy > --enable-bzip2 \ --enable-seccomp --enable-coroutine-pool > --enable-glusterfs \ --enable-tpm --enable-libssh2 --enable-vhdx > --enable-numa \ --enable-tcmalloc --target-list=x86_64-softmmu > > With modules disabled, there are 142 libraries loaded at startup. > Time is the following: > LD time: 0.065 seconds > QEMU time: 0.02 seconds > Total time: 0.085 seconds > > With this patch series and modules enabled, there are 128 libraries > loaded at startup. Time is the following: > LD time: 0.02 seconds > QEMU time: 0.02 seconds > Total time: 0.04 seconds > > Where LD time is the time between the program startup and the jump to > main, and QEMU time is the time between the start of main and the > first kvm_entry. > > These results are just with a few block drivers, that were already a > module. Adding more modules (block or not block) should be easy, and > will reduce the load time even more. > > Marc Marí (2): > Add dynamic module loading for block drivers > Add dynamic generation of module_block.h > > .gitignore | 1 + > Makefile| 10 ++- > block.c | 73 +- > configure | 2 +- > include/qemu/module.h | 3 + > include/qemu/module_block.h | 89 +++ > scripts/modules/module_block.py | 132 > > util/module.c | 38 8 files changed, > 314 insertions(+), 34 deletions(-) create mode 100644 > include/qemu/module_block.h create mode 100755 > scripts/modules/module_block.py >
[Qemu-devel] [PATCH 2/2] Add dynamic generation of module_block.h
To simplify the addition of new block modules, add a script that generates include/qemu/module_block.h automatically from the modules' source code. This script assumes that the QEMU coding style rules are followed. Signed-off-by: Marc Marí --- .gitignore | 1 + Makefile| 10 ++- scripts/modules/module_block.py | 132 3 files changed, 140 insertions(+), 3 deletions(-) create mode 100755 scripts/modules/module_block.py diff --git a/.gitignore b/.gitignore index 61bc492..8a37067 100644 --- a/.gitignore +++ b/.gitignore @@ -105,3 +105,4 @@ cscope.* tags TAGS *~ +/include/qemu/module_block.h diff --git a/Makefile b/Makefile index 340d9c8..47d593e 100644 --- a/Makefile +++ b/Makefile @@ -73,6 +73,8 @@ GENERATED_HEADERS += trace/generated-ust-provider.h GENERATED_SOURCES += trace/generated-ust.c endif +GENERATED_HEADERS += include/qemu/module_block.h + # Don't try to regenerate Makefile or configure # We don't generate any of them Makefile: ; @@ -219,9 +221,6 @@ Makefile: $(version-obj-y) $(version-lobj-y) libqemustub.a: $(stub-obj-y) libqemuutil.a: $(util-obj-y) -block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL -util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)' - ## qemu-img.o: qemu-img-cmds.h @@ -313,6 +312,11 @@ msi: @echo MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option) endif +include/qemu/module_block.h: $(SRC_PATH)/scripts/modules/module_block.py + $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/modules/module_block.py \ + "./include/qemu/" $(patsubst %.mo,%.c,$(block-obj-m)), \ + " GEN $@") + clean: # avoid old build problems by removing potentially incorrect old files rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h diff --git a/scripts/modules/module_block.py b/scripts/modules/module_block.py new file mode 100755 index 000..a9a9412 --- /dev/null +++ b/scripts/modules/module_block.py @@ -0,0 +1,132 @@ +#!/usr/bin/python +# +# Module information generator +# +# Copyright Red Hat, Inc. 2015 +# +# Authors: +# Marc Mari +# +# This work is licensed under the terms of the GNU GPL, version 2. +# See the COPYING file in the top-level directory. + +import sys +import os + +def get_string_struct(line): +data = line.split() + +# data[0] -> struct element name +# data[1] -> = +# data[2] -> value + +return data[2].replace('"', '')[:-1] + +def add_module(fhader, library, format_name, protocol_name, +probe, probe_device): +lines = [] +lines.append('.library_name = "' + library + '",') +if format_name != "": +lines.append('.format_name = "' + format_name + '",') +if protocol_name != "": +lines.append('.protocol_name = "' + protocol_name + '",') +if probe: +lines.append('.has_probe = true,') +if probe_device: +lines.append('.has_probe_device = true,') + +text = '\n\t'.join(lines) +fheader.write('\n\t{\n\t' + text + '\n\t},') + +def process_file(fheader, filename): +# This parser assumes the coding style rules are being followed +with open(filename, "r") as cfile: +found_something = False +found_start = False +library, _ = os.path.splitext(os.path.basename(filename)) +for line in cfile: +if found_start: +line = line.replace('\n', '') +if line.find(".format_name") != -1: +format_name = get_string_struct(line) +elif line.find(".protocol_name") != -1: +protocol_name = get_string_struct(line) +elif line.find(".bdrv_probe") != -1: +probe = True +elif line.find(".bdrv_probe_device") != -1: +probe_device = True +elif line == "};": +add_module(fheader, library, format_name, protocol_name, +probe, probe_device) +found_start = False +elif line.find("static BlockDriver") != -1: +found_something = True +found_start = True +format_name = "" +protocol_name = "" +probe = False +probe_device = False + +if not found_something: +print("No BlockDriver struct found in " + fi
[Qemu-devel] [PATCH 0/2] Dynamic module support for block drivers
The current module infrastructure has been improved to enable dynamic module loading. This reduces the load time for very simple guests. For the following configuration (very loaded) ./configure --enable-sdl --enable-gtk --enable-vte --enable-curses \ --enable-vnc --enable-vnc-{jpeg,tls,sasl,png} --enable-virtfs \ --enable-brlapi --enable-curl --enable-fdt --enable-bluez \ --enable-kvm --enable-rdma --enable-uuid --enable-vde \ --enable-linux-aio --enable-cap-ng --enable-attr --enable-vhost-net \ --enable-vhost-scsi --enable-spice --enable-rbd --enable-libiscsi \ --enable-smartcard-nss --enable-guest-agent --enable-libusb \ --enable-usb-redir --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp --enable-coroutine-pool --enable-glusterfs \ --enable-tpm --enable-libssh2 --enable-vhdx --enable-numa \ --enable-tcmalloc --target-list=x86_64-softmmu With modules disabled, there are 142 libraries loaded at startup. Time is the following: LD time: 0.065 seconds QEMU time: 0.02 seconds Total time: 0.085 seconds With this patch series and modules enabled, there are 128 libraries loaded at startup. Time is the following: LD time: 0.02 seconds QEMU time: 0.02 seconds Total time: 0.04 seconds Where LD time is the time between the program startup and the jump to main, and QEMU time is the time between the start of main and the first kvm_entry. These results are just with a few block drivers, that were already a module. Adding more modules (block or not block) should be easy, and will reduce the load time even more. Marc Marí (2): Add dynamic module loading for block drivers Add dynamic generation of module_block.h .gitignore | 1 + Makefile| 10 ++- block.c | 73 +- configure | 2 +- include/qemu/module.h | 3 + include/qemu/module_block.h | 89 +++ scripts/modules/module_block.py | 132 util/module.c | 38 8 files changed, 314 insertions(+), 34 deletions(-) create mode 100644 include/qemu/module_block.h create mode 100755 scripts/modules/module_block.py -- 2.4.3
[Qemu-devel] [PATCH 1/2] Add dynamic module loading for block drivers
Extend the current module interface to allow for block drivers to be loaded dynamically on request. The only block drivers that can be converted into modules are the drivers that don't perform any init operation except for registering themselves. This is why libiscsi has been disabled as a module. All the necessary module information is located in a new structure found in include/qemu/module_block.h Signed-off-by: Marc Marí --- block.c | 73 +++-- configure | 2 +- include/qemu/module.h | 3 ++ include/qemu/module_block.h | 89 + util/module.c | 38 ++- 5 files changed, 174 insertions(+), 31 deletions(-) create mode 100644 include/qemu/module_block.h diff --git a/block.c b/block.c index d088ee0..f24a624 100644 --- a/block.c +++ b/block.c @@ -27,6 +27,7 @@ #include "block/block_int.h" #include "block/blockjob.h" #include "qemu/error-report.h" +#include "qemu/module_block.h" #include "qemu/module.h" #include "qapi/qmp/qerror.h" #include "qapi/qmp/qjson.h" @@ -277,11 +278,30 @@ void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify) BlockDriver *bdrv_find_format(const char *format_name) { BlockDriver *drv1; +int i; + QLIST_FOREACH(drv1, &bdrv_drivers, list) { if (!strcmp(drv1->format_name, format_name)) { return drv1; } } + +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { +if (!strcmp(block_driver_module[i].format_name, format_name)) { +block_module_load_one(block_driver_module[i].library_name); +/* Copying code is not nice, but this way the current discovery is + * not modified. Calling recursively could fail if the library + * has been deleted. + */ +QLIST_FOREACH(drv1, &bdrv_drivers, list) { +if (!strcmp(drv1->format_name, format_name)) { +return drv1; +} +} +} +} + + return NULL; } @@ -483,9 +503,15 @@ int get_tmp_filename(char *filename, int size) */ static BlockDriver *find_hdev_driver(const char *filename) { -int score_max = 0, score; +int score_max = 0, score, i; BlockDriver *drv = NULL, *d; +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { +if (block_driver_module[i].has_probe_device) { +block_module_load_one(block_driver_module[i].library_name); +} +} + QLIST_FOREACH(d, &bdrv_drivers, list) { if (d->bdrv_probe_device) { score = d->bdrv_probe_device(filename); @@ -507,6 +533,7 @@ BlockDriver *bdrv_find_protocol(const char *filename, char protocol[128]; int len; const char *p; +int i; /* TODO Drivers without bdrv_file_open must be specified explicitly */ @@ -533,6 +560,7 @@ BlockDriver *bdrv_find_protocol(const char *filename, len = sizeof(protocol) - 1; memcpy(protocol, filename, len); protocol[len] = '\0'; + QLIST_FOREACH(drv1, &bdrv_drivers, list) { if (drv1->protocol_name && !strcmp(drv1->protocol_name, protocol)) { @@ -540,6 +568,23 @@ BlockDriver *bdrv_find_protocol(const char *filename, } } +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { +if (block_driver_module[i].protocol_name && +!strcmp(block_driver_module[i].protocol_name, protocol)) { +block_module_load_one(block_driver_module[i].library_name); +/* Copying code is not nice, but this way the current discovery is + * not modified. Calling recursively could fail if the library + * has been deleted. + */ +QLIST_FOREACH(drv1, &bdrv_drivers, list) { +if (drv1->protocol_name && +!strcmp(drv1->protocol_name, protocol)) { +return drv1; +} +} +} +} + error_setg(errp, "Unknown protocol '%s'", protocol); return NULL; } @@ -561,9 +606,15 @@ BlockDriver *bdrv_find_protocol(const char *filename, BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size, const char *filename) { -int score_max = 0, score; +int score_max = 0, score, i; BlockDriver *drv = NULL, *d; +for (i = 0; i < ARRAY_SIZE(block_driver_module); ++i) { +if (block_driver_module[i].has_probe) { +block_module_load_one(block_driver_module[i].library_name); +} +} + QLIST_FOREACH(d, &bdrv_drivers, list) { if (d->bdrv_probe) { score = d->bdrv_probe(buf, buf_size, filename); @@ -2783,7 +2834,7 @@ void b
Re: [Qemu-devel] [PATCH 2/5] fw_cfg DMA interface documentation
On Thu, 6 Aug 2015 10:20:38 -0400 "Kevin O'Connor" wrote: > On Thu, Aug 06, 2015 at 01:01:15PM +0200, Marc Marí wrote: > > Add fw_cfg DMA interface specification in the documentation. > > > > Based on Gerd Hoffman's initial implementation. > > > > Signed-off-by: Marc Marí > > --- > > docs/specs/fw_cfg.txt | 42 > > -- 1 file changed, 40 > > insertions(+), 2 deletions(-) > > > > diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt > > index 5bc7b96..dc8051e 100644 > > --- a/docs/specs/fw_cfg.txt > > +++ b/docs/specs/fw_cfg.txt > > @@ -76,6 +76,7 @@ increasing address order, similar to memcpy(). > > > > Selector Register IOport: 0x510 > > Data Register IOport: 0x511 > > +DMA Address IOport: 0x512 > > > > == Firmware Configuration Items == > > > > @@ -89,8 +90,9 @@ present, the four bytes read will contain the > > characters "QEMU". === Revision (Key 0x0001, FW_CFG_ID) === > > > > A 32-bit little-endian unsigned int, this item is used as an > > interface -revision number, and is currently set to 1 by QEMU when > > fw_cfg is -initialized. > > +revision number. If it is set to 1, the interface is the > > traditional +selector / data interface. If it is set to 2, the DMA > > extension is +also present. > > > > === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === > > > > @@ -132,6 +134,42 @@ Selector Reg.Range Usage > > In practice, the number of allowed firmware configuration items is > > given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). > > > > += Guest-side DMA Interface = > > + > > +For revision value 2, the DMA interface is present. This does not > > replace +the existing fw_cfg interface, it is an add-on. > > + > > +When this interface is enabled the DMA Address register can be > > used to +write the address of a FWCfgDmaAccess structure: > > + > > +typedef struct FWCfgDmaAccess { > > +uint64_t address; > > +uint32_t length; > > +uint32_t control; > > +} FWCfgDmaAccess; > > + > > +If "control" has the bit 1 set (FW_CFG_DMA_CTL_READ), a read > > operation is +performed on the selected entry. "length" bytes of > > data in that fw_cfg +entry are copied to the address specified by > > "address". + > > +If the field "address" has value 0, the read is considered a skip, > > and +the data is not copied anywhere, but the offset is still > > incremented. > > Thanks! > > I have a few suggestions on the interface: > > - I think it would be better to place the 'control' field first (ie, > control/length/address instead of address/length/control). Placing > the 'control' field first makes potential future extensions easier - > that is, it would make it easier for future control bits to change > the layout of the struct. > > - It would be good to add a 'select' field to the struct. I think > this could be done by using a 'u16 control; u16 select' instead of > the current 'u32 control'. It's common for guests to select a > fw_cfg entry and read its entire contents - with the 'select' field > in the struct this could be done with a single guest/host fault. A > new control bit could be added (eg, FW_CFG_DMA_CTL_SELECT) to > determine whether or not the select field is used - iff the bit is > set then the given fw_cfg entry is selected and the read position is > reset to the start prior to the DMA data copy. (Problems with email, I got this delivered today) I don't think there's much problem in using the original fw cfg select field. Adding a new select field will add complexity to the guest (when use one select field or the other), and the host (how the two select fields interact with each other). I think this part is good enough as it is now. Thanks Marc
Re: [Qemu-devel] [PATCH 3/5] Implement fw_cfg DMA interface
On Thu, 6 Aug 2015 23:32:50 +0200 Laszlo Ersek wrote: > On 08/06/15 23:11, Marc Marí wrote: > > On Thu, 6 Aug 2015 22:49:12 +0200 > > Laszlo Ersek wrote: > > > > [...] > > > >>> +static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, > >>> + uint64_t value, unsigned size) > >>> +{ > >>> +FWCfgState *s = opaque; > >>> + > >>> +s->dma_addr = be64_to_cpu(value); > >>> +fw_cfg_dma_transfer(s); > >>> +} > >> > >> So, this is similar to the ioport size limitation I described at > >> the top. Namely, I think that an Aarch32 guest won't be able to > >> transfer a 64-bit value with a single MMIO access. (I believe > >> double-width store instructions do exist, but they cannot be > >> virtualized well. They trap, but the instruction syndrome register > >> won't give enough info to the hypervisor.) > >> > >> Therefore, the address of the dma control structure should be > >> passed in two 32-bit wide accesses, both for the ioport mapping > >> and the mmio mapping. This can be done in two ways: > >> - write the two halves to the same register, and use a latch to > >> identify each 2nd access > >> - use different addresses. > >> > >> The latch sucks, because the guest has no way to bring the register > >> to a known good state. Therefore: > >> > >> In the ioport mapped case, the port range should go up to 0x519, > >> and two outl's are going to be necessary in the guest. The > >> documentation should spell out which outl (@ 0x512 or @ 0x516) > >> will trigger the actual transfer. > >> > >> (I vaguely recall that someone already described this, but I can't > >> find the message!) > > > > Previous answer to this patch, by Kevin O'Connor: > > > > "So, I think this code needs to be able to handle a 32bit write to a > > high bits address and then store those bits until the 32bit write to > > the low bits address occurs. (I'd also recommend that after every > > dma transfer the stored high bits are reset to zero so that the > > common case of a 32bit address can be performed with a single 32bit > > write to the low bits field.)" > > > > It's easier to do it this way. > > Thank you for the pointer. :) I don't remember this answer, at least > not consciously. > > But, "this way" is not different from my suggestion. It just has more > details filled in (and therefore it is admittedly a smarter, more > elaborate suggestion than mine). > > - I asked for two separate addresses, and Kevin had asked for two > separate addresses too. (The idea that one half is stored until the > other half is written was implicit in my suggestion.) > > - I requested that the docs spell out which address would trigger the > dma. Kevin had specifically suggested that the low bits address > trigger it. > > - The clearing of the stored high bits is extra smartness in Kevin's > suggestion (I'm jealous for not thinking of that :)), but that will > require more code (one line, probably), which is not "easier". :) > > We're in violent agreement. > > (If you wanted to poke fun at me, you could say that I just repeated > what Kevin had said, only worse. Thing is, I really don't recall > seeing his message. Let me search my mailbox for a substring from > your above quote... Yep, I don't have that message. My email has been > acting up today. I only have parts of that message *within* one of > your emails. ... Which I mostly missed too. Doing too many things at > once, sorry.) > No fun intended, just complement the answer, because you said "I vaguely recall that someone already described this, but I can't find the message!". And although I did not elaborate my answer (my fault), I wanted to say: Starting on your idea (which is similar to Kevin's), it is easier for the guest to have the upper address as 0 and trigger on the lower address, because this lets 32 bit guests to trigger with just one access. Of course, this has to be documented, even if not explicitly stated. Marc
Re: [Qemu-devel] QEMU fw_cfg DMA interface
On Thu, 6 Aug 2015 11:30:43 -0400 "Kevin O'Connor" wrote: > On Thu, Aug 06, 2015 at 02:37:45PM +0200, Marc Marí wrote: > > On Thu, 6 Aug 2015 13:27:16 +0100 > > Stefan Hajnoczi wrote: > > > > > On Thu, Aug 6, 2015 at 12:00 PM, Marc Marí > > > wrote: > > > > When running a Linux guest on top of QEMU, using the -kernel > > > > options, this is the timing improvement for x86: > > > > > > > > QEMU commit 2be4f242b50a8 and SeaBIOS commit 908a58c1d5ff > > > > QEMU startup time: .078 > > > > BIOS startup time: .060 > > > > Kernel setup time: .578 > > > > Total time: .716 > > > > > > > > QEMU with this patch series and SeaBIOS with this patch series > > > > QEMU startup time: .080 > > > > BIOS startup time: .039 > > > > Kernel setup time: .002 > > > > Total time: .121 > > > > > > Impressive results! > > > > > > Is this a fully-featured QEMU build or did you disable things? > > > > > > Is this the default SeaBIOS build or did you disable things? > > > > > > > This is the default QEMU configuration I get for my system. It's > > not a fully-featured QEMU, but it has a lot of things enabled. > > SeaBIOS has a default configuration (with debugging disabled). > > Thanks! > > What qemu command-line did you use during testing? Also, do you have > a quick primer on how to use the kvm trace stuff to obtain timings? > The command line I used is: x86_64-softmmu/qemu-system-x86_64 --enable-kvm \ -kernel /boot/vmlinuz-4.0.7-300.fc22.x86_64 \ -L pc-bios/optionrom/ \ -bios roms/seabios/out/bios.bin -nographic And I used perf (and two out instructions in the BIOS) to measure the times: perf record -a -e kvm:\* -e sched:sched_process_exec And searching for sched:sched_process_exec, kvm:kvm_entry, pio_write at 0xf5 and pio_write at 0xf4. Out at 0xf5 is the one in "do_boot" function, and out at 0xf4 is the one just before the jump to the Linux kernel. I attach the script I've been using. It can be improved, but it works. It can be run like this: sudo ../../results/run_test.sh x86_64-softmmu/qemu-system-x86_64 \ --enable-kvm -kernel /boot/vmlinuz-4.0.8-300.fc22.x86_64 \ -L pc-bios/optionrom/ \ -bios roms/seabios/out/bios.bin -nographic Thanks Marc run_test.sh Description: application/shellscript
Re: [Qemu-devel] [PATCH 3/5] Implement fw_cfg DMA interface
On Thu, 6 Aug 2015 22:49:12 +0200 Laszlo Ersek wrote: [...] > > +static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr, > > + uint64_t value, unsigned size) > > +{ > > +FWCfgState *s = opaque; > > + > > +s->dma_addr = be64_to_cpu(value); > > +fw_cfg_dma_transfer(s); > > +} > > So, this is similar to the ioport size limitation I described at the > top. Namely, I think that an Aarch32 guest won't be able to transfer > a 64-bit value with a single MMIO access. (I believe double-width > store instructions do exist, but they cannot be virtualized well. > They trap, but the instruction syndrome register won't give enough > info to the hypervisor.) > > Therefore, the address of the dma control structure should be passed > in two 32-bit wide accesses, both for the ioport mapping and the mmio > mapping. This can be done in two ways: > - write the two halves to the same register, and use a latch to > identify each 2nd access > - use different addresses. > > The latch sucks, because the guest has no way to bring the register > to a known good state. Therefore: > > In the ioport mapped case, the port range should go up to 0x519, and > two outl's are going to be necessary in the guest. The documentation > should spell out which outl (@ 0x512 or @ 0x516) will trigger the > actual transfer. > > (I vaguely recall that someone already described this, but I can't > find the message!) Previous answer to this patch, by Kevin O'Connor: "So, I think this code needs to be able to handle a 32bit write to a high bits address and then store those bits until the 32bit write to the low bits address occurs. (I'd also recommend that after every dma transfer the stored high bits are reset to zero so that the common case of a 32bit address can be performed with a single 32bit write to the low bits field.)" It's easier to do it this way. Thanks for you comments Marc
Re: [Qemu-devel] [PATCH 3/5] Implement fw_cfg DMA interface
On Thu, 6 Aug 2015 10:47:21 -0400 "Kevin O'Connor" wrote: > On Thu, Aug 06, 2015 at 01:01:16PM +0200, Marc Marí wrote: > > Based on the specifications on docs/specs/fw_cfg.txt > > > > This interface is an addon. The old interface can still be used as > > usual. > > > > For x86 arch, this addon will use one extra port (0x512). For ARM, > > it will use 8 more bytes, following the actual implementation. > > > > Based on Gerd Hoffman's initial implementation. > > > > Signed-off-by: Marc Marí > > --- > > hw/arm/virt.c | 2 +- > > hw/nvram/fw_cfg.c | 212 > > +++--- > > include/hw/nvram/fw_cfg.h | 12 ++- 3 files changed, 211 > > insertions(+), 15 deletions(-) > > > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > > index 4846892..374660c 100644 > > --- a/hw/arm/virt.c > > +++ b/hw/arm/virt.c > > @@ -612,7 +612,7 @@ static void create_fw_cfg(const VirtBoardInfo > > *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; > > char *nodename; > > > > -fw_cfg_init_mem_wide(base + 8, base, 8); > > +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); > > > > nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); > > qemu_fdt_add_subnode(vbi->fdt, nodename); > > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > > index 88481b7..7399008 100644 > > --- a/hw/nvram/fw_cfg.c > > +++ b/hw/nvram/fw_cfg.c > > @@ -23,6 +23,7 @@ > > */ > > #include "hw/hw.h" > > #include "sysemu/sysemu.h" > > +#include "sysemu/dma.h" > > #include "hw/isa/isa.h" > > #include "hw/nvram/fw_cfg.h" > > #include "hw/sysbus.h" > > @@ -30,10 +31,13 @@ > > #include "qemu/error-report.h" > > #include "qemu/config-file.h" > > > > -#define FW_CFG_SIZE 2 > > +#define FW_CFG_SIZE 3 > > #define FW_CFG_NAME "fw_cfg" > > #define FW_CFG_PATH "/machine/" FW_CFG_NAME > > > > +#define FW_CFG_VERSION 1 > > +#define FW_CFG_VERSION_DMA 2 > > + > > #define TYPE_FW_CFG "fw_cfg" > > #define TYPE_FW_CFG_IO "fw_cfg_io" > > #define TYPE_FW_CFG_MEM "fw_cfg_mem" > > @@ -42,6 +46,11 @@ > > #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), > > TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, > > (obj), TYPE_FW_CFG_MEM) > > +/* FW_CFG_DMA_CONTROL bits */ > > +#define FW_CFG_DMA_CTL_ERROR 0x01 > > +#define FW_CFG_DMA_CTL_READ0x02 > > +#define FW_CFG_DMA_CTL_MASK0x03 > > + > > typedef struct FWCfgEntry { > > uint32_t len; > > uint8_t *data; > > @@ -59,6 +68,10 @@ struct FWCfgState { > > uint16_t cur_entry; > > uint32_t cur_offset; > > Notifier machine_ready; > > + > > +bool dma_enabled; > > +AddressSpace *dma_as; > > +dma_addr_t dma_addr; > > }; > > > > struct FWCfgIoState { > > @@ -75,7 +88,7 @@ struct FWCfgMemState { > > FWCfgState parent_obj; > > /*< public >*/ > > > > -MemoryRegion ctl_iomem, data_iomem; > > +MemoryRegion ctl_iomem, data_iomem, dma_iomem; > > uint32_t data_width; > > MemoryRegionOps wide_data_ops; > > }; > > @@ -294,6 +307,108 @@ static void fw_cfg_data_mem_write(void > > *opaque, hwaddr addr, } while (i); > > } > > > > +static void fw_cfg_dma_transfer(FWCfgState *s) > > +{ > > +dma_addr_t len; > > +uint8_t *ptr; > > +FWCfgDmaAccess *dma; > > +int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); > > +FWCfgEntry *e = &s->entries[arch][s->cur_entry & > > FW_CFG_ENTRY_MASK]; + > > +len = sizeof(*dma); > > +dma = dma_memory_map(s->dma_as, s->dma_addr, &len, > > +DMA_DIRECTION_FROM_DEVICE); > > + > > +if (!dma || !len) { > > +return; > > +} > > + > > +if (dma->control & FW_CFG_DMA_CTL_ERROR) { > > I think the dma structure would need to be copied to a local instance > of the struct and the fields endianness updated. (Also, I think the > documentation should state the required endianness of the fields.) > I'm not an expert at this, but if this is not done, I think there > could be problems both with endian and with unaligned memory accesses > in the host. >
Re: [Qemu-devel] QEMU fw_cfg DMA interface
On Thu, 6 Aug 2015 13:27:16 +0100 Stefan Hajnoczi wrote: > On Thu, Aug 6, 2015 at 12:00 PM, Marc Marí wrote: > > When running a Linux guest on top of QEMU, using the -kernel > > options, this is the timing improvement for x86: > > > > QEMU commit 2be4f242b50a8 and SeaBIOS commit 908a58c1d5ff > > QEMU startup time: .078 > > BIOS startup time: .060 > > Kernel setup time: .578 > > Total time: .716 > > > > QEMU with this patch series and SeaBIOS with this patch series > > QEMU startup time: .080 > > BIOS startup time: .039 > > Kernel setup time: .002 > > Total time: .121 > > Impressive results! > > Is this a fully-featured QEMU build or did you disable things? > > Is this the default SeaBIOS build or did you disable things? > This is the default QEMU configuration I get for my system. It's not a fully-featured QEMU, but it has a lot of things enabled. SeaBIOS has a default configuration (with debugging disabled). The QEMU configuration is: [...] tcg debug enabled no gprof enabled no sparse enabledno strip binariesyes profiler no static build no pixmansystem SDL support yes GTK support yes GNUTLS supportyes GNUTLS hash yes GNUTLS gcrypt no GNUTLS nettle yes (2.7.1) VTE support yes curses supportyes curl support yes mingw32 support no Audio drivers oss Block whitelist (rw) Block whitelist (ro) VirtFS supportyes VNC support yes VNC TLS support yes VNC SASL support yes VNC JPEG support yes VNC PNG support yes xen support no brlapi supportyes bluez supportyes Documentation no GUEST_BASEyes PIE yes vde support yes netmap supportno Linux AIO support yes ATTR/XATTR support yes Install blobs yes KVM support yes RDMA support yes TCG interpreter no fdt support yes preadv supportyes fdatasync yes madvise yes posix_madvise yes sigev_thread_id yes uuid support yes libcap-ng support yes vhost-net support yes vhost-scsi support yes Trace backendsnop spice support yes (0.12.7/0.12.5) rbd support yes xfsctl supportno nss used yes libusbyes usb net redir yes OpenGL supportno libiscsi support yes libnfs supportno build guest agent yes QGA VSS support no QGA w32 disk info no seccomp support yes coroutine backend ucontext coroutine poolyes GlusterFS support yes Archipelago support no gcov gcov gcov enabled no TPM support yes libssh2 support yes TPM passthrough yes QOM debugging yes vhdx yes lzo support yes snappy supportyes bzip2 support yes NUMA host support yes tcmalloc support no Marc
[Qemu-devel] [PATCH 5/5] Enable fw_cfg DMA interface for x86
Enable the fw_cfg DMA interface for all the x86 platforms. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/i386/pc.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 7661ea9..e5ca805 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -718,7 +718,7 @@ static unsigned int pc_apic_id_limit(unsigned int max_cpus) return x86_cpu_apic_id_from_index(max_cpus - 1) + 1; } -static FWCfgState *bochs_bios_init(void) +static FWCfgState *bochs_bios_init(AddressSpace *as) { FWCfgState *fw_cfg; uint8_t *smbios_tables, *smbios_anchor; @@ -727,7 +727,8 @@ static FWCfgState *bochs_bios_init(void) int i, j; unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT); +fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, as); + /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86: * * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug @@ -1302,6 +1303,7 @@ FWCfgState *pc_memory_init(MachineState *machine, MemoryRegion *ram_below_4g, *ram_above_4g; FWCfgState *fw_cfg; PCMachineState *pcms = PC_MACHINE(machine); +AddressSpace *as; assert(machine->ram_size == below_4g_mem_size + above_4g_mem_size); @@ -1391,7 +1393,10 @@ FWCfgState *pc_memory_init(MachineState *machine, option_rom_mr, 1); -fw_cfg = bochs_bios_init(); +as = g_malloc(sizeof(*as)); +address_space_init(as, ram_below_4g, "pc.as"); +fw_cfg = bochs_bios_init(as); + rom_set_fw(fw_cfg); if (guest_info->has_reserved_memory && pcms->hotplug_memory.base) { -- 2.4.3
[Qemu-devel] [PATCH 3/5] Implement fw_cfg DMA interface
Based on the specifications on docs/specs/fw_cfg.txt This interface is an addon. The old interface can still be used as usual. For x86 arch, this addon will use one extra port (0x512). For ARM, it will use 8 more bytes, following the actual implementation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 2 +- hw/nvram/fw_cfg.c | 212 +++--- include/hw/nvram/fw_cfg.h | 12 ++- 3 files changed, 211 insertions(+), 15 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 4846892..374660c 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -612,7 +612,7 @@ static void create_fw_cfg(const VirtBoardInfo *vbi) hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8); +fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 88481b7..7399008 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -23,6 +23,7 @@ */ #include "hw/hw.h" #include "sysemu/sysemu.h" +#include "sysemu/dma.h" #include "hw/isa/isa.h" #include "hw/nvram/fw_cfg.h" #include "hw/sysbus.h" @@ -30,10 +31,13 @@ #include "qemu/error-report.h" #include "qemu/config-file.h" -#define FW_CFG_SIZE 2 +#define FW_CFG_SIZE 3 #define FW_CFG_NAME "fw_cfg" #define FW_CFG_PATH "/machine/" FW_CFG_NAME +#define FW_CFG_VERSION 1 +#define FW_CFG_VERSION_DMA 2 + #define TYPE_FW_CFG "fw_cfg" #define TYPE_FW_CFG_IO "fw_cfg_io" #define TYPE_FW_CFG_MEM "fw_cfg_mem" @@ -42,6 +46,11 @@ #define FW_CFG_IO(obj) OBJECT_CHECK(FWCfgIoState, (obj), TYPE_FW_CFG_IO) #define FW_CFG_MEM(obj) OBJECT_CHECK(FWCfgMemState, (obj), TYPE_FW_CFG_MEM) +/* FW_CFG_DMA_CONTROL bits */ +#define FW_CFG_DMA_CTL_ERROR 0x01 +#define FW_CFG_DMA_CTL_READ0x02 +#define FW_CFG_DMA_CTL_MASK0x03 + typedef struct FWCfgEntry { uint32_t len; uint8_t *data; @@ -59,6 +68,10 @@ struct FWCfgState { uint16_t cur_entry; uint32_t cur_offset; Notifier machine_ready; + +bool dma_enabled; +AddressSpace *dma_as; +dma_addr_t dma_addr; }; struct FWCfgIoState { @@ -75,7 +88,7 @@ struct FWCfgMemState { FWCfgState parent_obj; /*< public >*/ -MemoryRegion ctl_iomem, data_iomem; +MemoryRegion ctl_iomem, data_iomem, dma_iomem; uint32_t data_width; MemoryRegionOps wide_data_ops; }; @@ -294,6 +307,108 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr addr, } while (i); } +static void fw_cfg_dma_transfer(FWCfgState *s) +{ +dma_addr_t len; +uint8_t *ptr; +FWCfgDmaAccess *dma; +int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL); +FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK]; + +len = sizeof(*dma); +dma = dma_memory_map(s->dma_as, s->dma_addr, &len, +DMA_DIRECTION_FROM_DEVICE); + +if (!dma || !len) { +return; +} + +if (dma->control & FW_CFG_DMA_CTL_ERROR) { +return; +} + +if (!(dma->control & FW_CFG_DMA_CTL_READ)) { +return; +} + +while (dma->length > 0) { +if (s->cur_entry == FW_CFG_INVALID || !e->data || +s->cur_offset >= e->len) { +len = dma->length; +if (dma->address) { +ptr = dma_memory_map(s->dma_as, dma->address, &len, + DMA_DIRECTION_FROM_DEVICE); +if (!ptr || !len) { +dma->control |= FW_CFG_DMA_CTL_ERROR; +goto out; +} + +memset(ptr, 0, len); + +dma_memory_unmap(s->dma_as, ptr, len, + DMA_DIRECTION_FROM_DEVICE, len); +} + +dma->address += len; +dma->length -= len; +} else { +if (dma->length <= e->len) { +len = dma->length; +} else { +len = e->len; +} + +if (e->read_callback) { +e->read_callback(e->callback_opaque, s->cur_offset); +} + +if (dma->address) { +ptr = dma_memory_map(s->dma_as, dma->address, &len, + DMA_DIRECTION_FROM_DEVICE); +if (!ptr || !len) { +dma->control |= FW_CFG_DMA_CTL_ERROR; +goto out; +} + +memcpy(ptr, &e->dat
[Qemu-devel] [PATCH 4/5] Enable fw_cfg DMA interface for ARM
Enable the fw_cfg DMA interface for the ARM virt machine. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- hw/arm/virt.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 374660c..eaad274 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -109,7 +109,7 @@ static const MemMapEntry a15memmap[] = { [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, [VIRT_UART] = { 0x0900, 0x1000 }, [VIRT_RTC] ={ 0x0901, 0x1000 }, -[VIRT_FW_CFG] = { 0x0902, 0x000a }, +[VIRT_FW_CFG] = { 0x0902, 0x0012 }, [VIRT_MMIO] = { 0x0a00, 0x0200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */ [VIRT_PLATFORM_BUS] = { 0x0c00, 0x0200 }, @@ -606,13 +606,13 @@ static void create_flash(const VirtBoardInfo *vbi) g_free(nodename); } -static void create_fw_cfg(const VirtBoardInfo *vbi) +static void create_fw_cfg(AddressSpace *as, const VirtBoardInfo *vbi) { hwaddr base = vbi->memmap[VIRT_FW_CFG].base; hwaddr size = vbi->memmap[VIRT_FW_CFG].size; char *nodename; -fw_cfg_init_mem_wide(base + 8, base, 8, 0, NULL); +fw_cfg_init_mem_wide(base + 8, base, 8, 16, as); nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(vbi->fdt, nodename); @@ -800,6 +800,7 @@ static void machvirt_init(MachineState *machine) VirtGuestInfoState *guest_info_state = g_malloc0(sizeof *guest_info_state); VirtGuestInfo *guest_info = &guest_info_state->info; char **cpustr; +AddressSpace *as = NULL; if (!cpu_model) { cpu_model = "cortex-a15"; @@ -837,6 +838,10 @@ static void machvirt_init(MachineState *machine) } cpuobj = object_new(object_class_get_name(oc)); +if (!as) { +as = CPU(cpuobj)->as; +} + /* Handle any CPU options specified by the user */ cc->parse_features(CPU(cpuobj), cpuopts, &err); g_free(cpuopts); @@ -889,7 +894,7 @@ static void machvirt_init(MachineState *machine) */ create_virtio_devices(vbi, pic); -create_fw_cfg(vbi); +create_fw_cfg(as, vbi); rom_set_fw(fw_cfg_find()); guest_info->smp_cpus = smp_cpus; -- 2.4.3
[Qemu-devel] [PATCH 2/5] fw_cfg DMA interface documentation
Add fw_cfg DMA interface specification in the documentation. Based on Gerd Hoffman's initial implementation. Signed-off-by: Marc Marí --- docs/specs/fw_cfg.txt | 42 -- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 5bc7b96..dc8051e 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -76,6 +76,7 @@ increasing address order, similar to memcpy(). Selector Register IOport: 0x510 Data Register IOport: 0x511 +DMA Address IOport: 0x512 == Firmware Configuration Items == @@ -89,8 +90,9 @@ present, the four bytes read will contain the characters "QEMU". === Revision (Key 0x0001, FW_CFG_ID) === A 32-bit little-endian unsigned int, this item is used as an interface -revision number, and is currently set to 1 by QEMU when fw_cfg is -initialized. +revision number. If it is set to 1, the interface is the traditional +selector / data interface. If it is set to 2, the DMA extension is +also present. === File Directory (Key 0x0019, FW_CFG_FILE_DIR) === @@ -132,6 +134,42 @@ Selector Reg.Range Usage In practice, the number of allowed firmware configuration items is given by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h). += Guest-side DMA Interface = + +For revision value 2, the DMA interface is present. This does not replace +the existing fw_cfg interface, it is an add-on. + +When this interface is enabled the DMA Address register can be used to +write the address of a FWCfgDmaAccess structure: + +typedef struct FWCfgDmaAccess { +uint64_t address; +uint32_t length; +uint32_t control; +} FWCfgDmaAccess; + +If "control" has the bit 1 set (FW_CFG_DMA_CTL_READ), a read operation is +performed on the selected entry. "length" bytes of data in that fw_cfg +entry are copied to the address specified by "address". + +If the field "address" has value 0, the read is considered a skip, and +the data is not copied anywhere, but the offset is still incremented. + +To check result, read the control register: + error bit set -> something went wrong. + all bits cleared -> transfer finished successfully. + otherwise -> transfer still in progress (doesn't happen + today due to implementation not being async, + but may in the future). + +Target address goes up and transfer length goes down as the transfer +happens, so after a successful transfer the length register is zero +and the address register points right after the memory block written. + +If a partial transfer happened before an error occured the address and +length registers indicate how much data has been transfered +successfully. + = Host-side API = The following functions are available to the QEMU programmer for adding -- 2.4.3
[Qemu-devel] [PATCH 0/5] fw_cfg DMA interface
Implement host-side of the FW CFG DMA interface both for x86 and ARM. Based on Gerd Hoffman's initial implementation. Gabriel L. Somlo (1): fw_cfg: document fw_cfg_modify_iXX() update functions Marc Marí (4): fw_cfg DMA interface documentation Implement fw_cfg DMA interface Enable fw_cfg DMA interface for ARM Enable fw_cfg DMA interface for x86 docs/specs/fw_cfg.txt | 53 +++- hw/arm/virt.c | 13 ++- hw/i386/pc.c | 11 ++- hw/nvram/fw_cfg.c | 212 +++--- include/hw/nvram/fw_cfg.h | 12 ++- 5 files changed, 278 insertions(+), 23 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH 1/5] fw_cfg: document fw_cfg_modify_iXX() update functions
From: "Gabriel L. Somlo" Document the behavior of fw_cfg_modify_iXX() for leak-less updating of integer-type blobs. Currently only fw_cfg_modify_i16() is coded, but 32- and 64-bit versions may be added later if necessary.. Signed-off-by: Gabriel Somlo Signed-off-by: Gerd Hoffmann Reviewed-by: Laszlo Ersek --- docs/specs/fw_cfg.txt | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt index 74351dd..5bc7b96 100644 --- a/docs/specs/fw_cfg.txt +++ b/docs/specs/fw_cfg.txt @@ -159,6 +159,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add a dynamically allocated copy of the appropriately sized item to fw_cfg under the given selector key value. +== fw_cfg_modify_iXX() == + +Modify the value of an XX-bit item (where XX may be 16, 32, or 64). +Similarly to the corresponding fw_cfg_add_iXX() function set, convert +a 16-, 32-, or 64-bit integer to little endian, create a dynamically +allocated copy of the required size, and replace the existing item at +the given selector key value with the newly allocated one. The previous +item, assumed to have been allocated during an earlier call to +fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed +before the function returns. + == fw_cfg_add_file() == Given a filename (i.e., fw_cfg item name), starting pointer, and size, -- 2.4.3
[Qemu-devel] QEMU fw_cfg DMA interface
Implementation of the FW CFG DMA interface. When running a Linux guest on top of QEMU, using the -kernel options, this is the timing improvement for x86: QEMU commit 2be4f242b50a8 and SeaBIOS commit 908a58c1d5ff QEMU startup time: .078 BIOS startup time: .060 Kernel setup time: .578 Total time: .716 QEMU with this patch series and SeaBIOS with this patch series QEMU startup time: .080 BIOS startup time: .039 Kernel setup time: .002 Total time: .121 QEMU startup time is the time between the start and the first kvm_entry. BIOS startup time is the time between the first kvm_entry and the start of function do_boot, in SeaBIOS. Kernel setup time is the time between the start of the function do_boot in SeaBIOS and the jump to the Linux kernel. As you can see, both the BIOS (because of ACPI tables and other configurations) and the Linux kernel boot (because of the copy to memory) are greatly improved with this new interface. Also, this new interface is an addon to the old interface. Both interfaces are compatible and interchangeable.
Re: [Qemu-devel] Modularizing QEMU RFC
On Mon, 3 Aug 2015 17:24:57 +0800 Fam Zheng wrote: > On Mon, 08/03 11:01, Marc Marí wrote: > > Some profiling: > > > > A QEMU with this configuration: > > ./configure --enable-sparse --enable-sdl --enable-gtk --enable-vte \ > > --enable-curses --enable-vnc --enable-vnc-{jpeg,tls,sasl,png,ws} \ > > --enable-virtfs --enable-brlapi --enable-curl --enable-fdt \ > > --enable-bluez --enable-kvm --enable-rdma --enable-uuid > > --enable-vde \ --enable-linux-aio --enable-cap-ng --enable-attr > > --enable-vhost-net \ --enable-vhost-scsi --enable-spice > > --enable-rbd --enable-libiscsi \ --enable-smartcard-nss > > --enable-guest-agent --enable-libusb \ --enable-usb-redir > > --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp > > --enable-coroutine-pool --enable-glusterfs \ --enable-tpm > > --enable-libssh2 --enable-vhdx --enable-quorum \ --enable-numa > > --enable-tcmalloc --target-list=x86_64-softmmu > > > > Has dependencies on 142 libraries. It takes 60 ms between the run > > and the jump to the main function, and 80 ms between the run and the > > first kvm_entry. > > > > A QEMU with the same configuration and --enable-modules has > > dependencies on 125 libraries. It takes 20 ms between the run and > > the jump to the main function, and 100 ms between the run and the > > first kvm_entry. > > Which means 40 ms is saved because we reduced the size and dependency > of QEMU executable, but 60 ms is the extra cost of dynamical loading. > That's a net loss. > > In your --enable-modules configuration, could you try comment out > module_load body and compare again, so we know how much time is spent > in looking up and loading modules? > With the module load disabled, 20 ms from run to main, and 40 ms from run to kvm_entry. Which is "the expected", from the numbers above. Marc
Re: [Qemu-devel] Modularizing QEMU RFC
On Mon, 3 Aug 2015 10:23:37 +0100 "Daniel P. Berrange" wrote: > On Fri, Jul 31, 2015 at 05:45:42PM +0200, Marc Marí wrote: > > Hi everyone > > > > I propose improving the current modular driver system for QEMU so it > > can benefit everybody in speed and flexibility. I'm looking for > > other ideas, comments, critics, etc. > > > > - Background - > > In order to speed up QEMU, I'm looking at the high number of > > libraries and dependencies that it loads. I've generated a QEMU > > image that needs 145 shared libraries to start, and there were > > still some options disabled. This is a lot, and this means that it > > is slow. > > > > So I've been looking at actual module system. Yes, QEMU does have a > > module system, but disabled by default. The problem is, the modules > > get loaded always during startup. This means, booting with modules > > enabled is even slower, because loading at runtime is slower that > > letting the linker do all the work at the start. At this point, I > > doubt of the benefits of the actual modular system. > > > > But, if disabling the actual block modules (iscsi, curl, rbd, > > gluster, ssh and dmg) gives 40 ms of speedup, I think is worth an > > effort of improving modules, and modularizing new parts > > > > - Current module flow - > > The current module system is based on shared libraries. Each of > > these libraries has a constructor that registers the BlockDriver > > structure to the global bdrv_drivers list. This list is later > > searched by the type, the protocol, etc. > > > > - Proposals - > > I have in mind two ideas, which are not mutually exclusive: > > [snip] > > > - My comments on my proposals - > > Ideally, I'd like a mixed solution. The user can specify what wants > > to load, but also, when something is not found, it is automatically > > searched. > > > > In both options, the current module system has to be partly > > rewritten, and some of the current drivers with module capability > > might need to be modified to adapt to the new specifications. > > > > And, a part from improving the current modular interface, there are > > a lot of other devices that might benefit from it, not just the > > block devices. > > > > I still haven't looked at the memory footprint of QEMU, but I'm sure > > that the QEMU binary will lose a lot of weight with this addition. > > One think I don't see mentioned is how this impacts on QEMU feature > detection by apps. For example, the recommended approach currnetly > is to launch QEMU with 'qemu-system-BLAH --machine none > -qmp /some/sock' and then query QMP for lists of devices supported, > list of various backends and other features. > > If you're going to suggest a fully modular system, then when doing > QMP feature detection we still need to see the full list of features. > So either that implies all the metadata associated with the modules > remains built-in to QEMU, so QMP can answer this without lodaing the > modules, or the QMP feature detection must imply auto-loading of all > modules that exist. Not everything can be trivially modularized. But, I think that if we are able to get in modules the "very independent" drivers, it will be already a huge improvement. And then, we can think if it's worth the rest of the trouble. Marc
Re: [Qemu-devel] Modularizing QEMU RFC
On Mon, 03 Aug 2015 10:24:56 +0100 Alex Bennée wrote: > > Marc Marí writes: > > > On Mon, 3 Aug 2015 16:22:34 +0800 > > Fam Zheng wrote: > > > >> On Mon, 08/03 09:52, Marc Marí wrote: > >> > So any other ideas to reduce the library overhead are > >> > appreciated. > >> > >> It would be interesting to see your profiling on the library > >> loading overhead. For example, how much does it help to reduce the > >> library size, and how much does it help to reduce the # of > >> libraries? > > > > > Some profiling: > > > > A QEMU with this configuration: > > ./configure --enable-sparse --enable-sdl --enable-gtk --enable-vte \ > > --enable-curses --enable-vnc --enable-vnc-{jpeg,tls,sasl,png,ws} \ > > --enable-virtfs --enable-brlapi --enable-curl --enable-fdt \ > > --enable-bluez --enable-kvm --enable-rdma --enable-uuid > > --enable-vde \ --enable-linux-aio --enable-cap-ng --enable-attr > > --enable-vhost-net \ --enable-vhost-scsi --enable-spice > > --enable-rbd --enable-libiscsi \ --enable-smartcard-nss > > --enable-guest-agent --enable-libusb \ --enable-usb-redir > > --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp > > --enable-coroutine-pool --enable-glusterfs \ --enable-tpm > > --enable-libssh2 --enable-vhdx --enable-quorum \ --enable-numa > > --enable-tcmalloc --target-list=x86_64-softmmu > > > > Has dependencies on 142 libraries. It takes 60 ms between the run > > and the jump to the main function, and 80 ms between the run and the > > first kvm_entry. > > > > A QEMU with the same configuration and --enable-modules has > > dependencies on 125 libraries. It takes 20 ms between the run and > > the jump to the main function, and 100 ms between the run and the > > first kvm_entry. > > > > The libraries that are not loaded are: libiscsi, libcurl, librbd, > > librados, ligfapi, libglusterfs, libgfrpc, libgfxdr, libssh2, > > libcrypt, libidin, libgssapi, liblber, libldap, libboost_thread, > > libbost_system and libatomic_ops. > > > > As I already explained, the current implementation of modules loads > > the modules at startup always. That's why the QEMU setup takes > > longer, even though it uses G_MODULE_BIND_LAZY. And that's why I > > was proposing hotplugging. > > > > I don't know if loading one big library is more efficent than a lot > > of small ones, but it would make sense. > > What's the actual use-case here where start-up latency is so > important? If it is an ephemeral cloudy thing then you might just > have a base QEMU with VIRT drivers and one big .so call "the-rest.so"? > > I don't wish to disparage the idea but certainly in emulation world > the difference of 100ms or so is neither here nor there. > Clear Containers: https://lwn.net/Articles/644675/ We are looking for making QEMU more lightweight for the general use case and also for the container use case. It is a lot better to have the same tool for both cases, and not start a new one from scratch as Intel has done. This also benefits the general QEMU community, and that's why I'm having this discussion here. If there's a point where QEMU is still too slow for containers, but optimizing means breaking, then we will have to take a step back and change the point of view. And making QEMU modular I think is benefitial for everyone. Marc
Re: [Qemu-devel] Modularizing QEMU RFC
On Mon, 3 Aug 2015 16:22:34 +0800 Fam Zheng wrote: > On Mon, 08/03 09:52, Marc Marí wrote: > > So any other ideas to reduce the library overhead are appreciated. > > It would be interesting to see your profiling on the library loading > overhead. For example, how much does it help to reduce the library > size, and how much does it help to reduce the # of libraries? > > The protocol drivers are modularized for the sake of library > dependencies, so they should stay that way. However, we can > "sensibly" combine all non-native format drivers (VMDK, VHDX, ...) > into a cold-formats.so (if it turns out that loading one big .so is > much faster than loading separate ones). But we should leave qcow2 > as a separate one for obvious reasons, or make a hot-formats.so with > one or two other formats if that makes more sense. > > With that, for the first step, we can lazy load the cold-formats.so > whenever we need to probe or a non-qcow2 format is involved. Then on > top of that we can implement what Peter has suggested. > Some profiling: A QEMU with this configuration: ./configure --enable-sparse --enable-sdl --enable-gtk --enable-vte \ --enable-curses --enable-vnc --enable-vnc-{jpeg,tls,sasl,png,ws} \ --enable-virtfs --enable-brlapi --enable-curl --enable-fdt \ --enable-bluez --enable-kvm --enable-rdma --enable-uuid --enable-vde \ --enable-linux-aio --enable-cap-ng --enable-attr --enable-vhost-net \ --enable-vhost-scsi --enable-spice --enable-rbd --enable-libiscsi \ --enable-smartcard-nss --enable-guest-agent --enable-libusb \ --enable-usb-redir --enable-lzo --enable-snappy --enable-bzip2 \ --enable-seccomp --enable-coroutine-pool --enable-glusterfs \ --enable-tpm --enable-libssh2 --enable-vhdx --enable-quorum \ --enable-numa --enable-tcmalloc --target-list=x86_64-softmmu Has dependencies on 142 libraries. It takes 60 ms between the run and the jump to the main function, and 80 ms between the run and the first kvm_entry. A QEMU with the same configuration and --enable-modules has dependencies on 125 libraries. It takes 20 ms between the run and the jump to the main function, and 100 ms between the run and the first kvm_entry. The libraries that are not loaded are: libiscsi, libcurl, librbd, librados, ligfapi, libglusterfs, libgfrpc, libgfxdr, libssh2, libcrypt, libidin, libgssapi, liblber, libldap, libboost_thread, libbost_system and libatomic_ops. As I already explained, the current implementation of modules loads the modules at startup always. That's why the QEMU setup takes longer, even though it uses G_MODULE_BIND_LAZY. And that's why I was proposing hotplugging. I don't know if loading one big library is more efficent than a lot of small ones, but it would make sense. Marc
Re: [Qemu-devel] Modularizing QEMU RFC
On Mon, 3 Aug 2015 11:09:06 +0800 Fam Zheng wrote: > On Fri, 07/31 17:45, Marc Marí wrote: > > Hi everyone > > > > I propose improving the current modular driver system for QEMU so it > > can benefit everybody in speed and flexibility. I'm looking for > > other ideas, comments, critics, etc. > > > > - Background - > > In order to speed up QEMU, I'm looking at the high number of > > libraries and dependencies that it loads. I've generated a QEMU > > image that needs 145 shared libraries to start, and there were > > still some options disabled. This is a lot, and this means that it > > is slow. > > > > So I've been looking at actual module system. Yes, QEMU does have a > > module system, but disabled by default. The problem is, the modules > > get loaded always during startup. This means, booting with modules > > enabled is even slower, because loading at runtime is slower that > > letting the linker do all the work at the start. At this point, I > > doubt of the benefits of the actual modular system. > > > > But, if disabling the actual block modules (iscsi, curl, rbd, > > gluster, ssh and dmg) gives 40 ms of speedup, I think is worth an > > effort of improving modules, and modularizing new parts > > > > - Current module flow - > > The current module system is based on shared libraries. Each of > > these libraries has a constructor that registers the BlockDriver > > structure to the global bdrv_drivers list. This list is later > > searched by the type, the protocol, etc. > > > > - Proposals - > > I have in mind two ideas, which are not mutually exclusive: > > > > -- A -- > > Having a huge lookup table with the names of the drivers that are in > > modules, and the name of the modules where it can be found. When > > some type is not found in the driver lists, this table can be > > traversed to find the library and load it. > > > > This requires an effort by the driver developer to fill the tables. > > But the refactoring work can stay localized in the drivers and the > > modules, it is (possibly) not necessary to touch any other part of > > the QEMU system. > > > > I don't specially like this huge manual-edited table. > > Yeah, the way to fill the tables is an (implementation) question, but > there are still more questions if we go this way... > > bdrv_find_protocol is easy, the protocol name is extracted from image > uri or blockdev options, we can load the module by the name. > > bdrv_probe_all is harder. If we modularize a format driver, > its .bdrv_probe code will be in the module. If we want to do the > format detection, we need to load all format drivers. This means if > the command line has an unspecified format, we'll still need to load > all drivers at starting phase. (I wish all formats are probed > according to magic bytes at offset 0, so we can simplify > the .bdrv_probe logic and do it with data matching in block.c like > the protocol case, but that's not true for VMDK :( ) > > I believe other sub systems have this kind of paradox as well. I managed to overlook that... If the user doesn't specify the type of the block device, then, all block drivers will have to be tested. I see this is based on scores. If there is a score that means "this is my type for sure" (which I don't know), the probe could be stopped there. But the user can also specify the type for its block device (if I remember correctly). So, if he wants more speed, he could just specify the type of the block device. > > > > -- B -- > > The same --enable-X that is done at compile time, but directly on > > the command line when running QEMU or even in hot at the monitor. > > > > This solution requires work on the monitor, the command line > > processing, the modules and the drivers system. It is also less > > transparent to the final user. > > I'm afraid this breaks backward compatibility. :( :( Maybe my ideas are a bit too ideal without rewriting half QEMU. I should have left my ideas to cool down and rest before writting them down. So any other ideas to reduce the library overhead are appreciated. Thanks Marc
[Qemu-devel] Modularizing QEMU RFC
Hi everyone I propose improving the current modular driver system for QEMU so it can benefit everybody in speed and flexibility. I'm looking for other ideas, comments, critics, etc. - Background - In order to speed up QEMU, I'm looking at the high number of libraries and dependencies that it loads. I've generated a QEMU image that needs 145 shared libraries to start, and there were still some options disabled. This is a lot, and this means that it is slow. So I've been looking at actual module system. Yes, QEMU does have a module system, but disabled by default. The problem is, the modules get loaded always during startup. This means, booting with modules enabled is even slower, because loading at runtime is slower that letting the linker do all the work at the start. At this point, I doubt of the benefits of the actual modular system. But, if disabling the actual block modules (iscsi, curl, rbd, gluster, ssh and dmg) gives 40 ms of speedup, I think is worth an effort of improving modules, and modularizing new parts - Current module flow - The current module system is based on shared libraries. Each of these libraries has a constructor that registers the BlockDriver structure to the global bdrv_drivers list. This list is later searched by the type, the protocol, etc. - Proposals - I have in mind two ideas, which are not mutually exclusive: -- A -- Having a huge lookup table with the names of the drivers that are in modules, and the name of the modules where it can be found. When some type is not found in the driver lists, this table can be traversed to find the library and load it. This requires an effort by the driver developer to fill the tables. But the refactoring work can stay localized in the drivers and the modules, it is (possibly) not necessary to touch any other part of the QEMU system. I don't specially like this huge manual-edited table. -- B -- The same --enable-X that is done at compile time, but directly on the command line when running QEMU or even in hot at the monitor. This solution requires work on the monitor, the command line processing, the modules and the drivers system. It is also less transparent to the final user. - My comments on my proposals - Ideally, I'd like a mixed solution. The user can specify what wants to load, but also, when something is not found, it is automatically searched. In both options, the current module system has to be partly rewritten, and some of the current drivers with module capability might need to be modified to adapt to the new specifications. And, a part from improving the current modular interface, there are a lot of other devices that might benefit from it, not just the block devices. I still haven't looked at the memory footprint of QEMU, but I'm sure that the QEMU binary will lose a lot of weight with this addition. - Closing - This is just a brief draft. These proposals can be improved a lot, or there might be some other solutions that I haven't thought of. So, I'd like to ask for ideas, comments, critics, improvements, etc. And also ask for contributors to this endeavour, because it will be a huge amount of work. Thanks for reading and have a nice weekend Marc
Re: [Qemu-devel] [RFC 2/7] fw_cfg dma interface
On Thu, 23 Jul 2015 15:45:25 +0200 Laszlo Ersek wrote: > On 07/23/15 15:35, Peter Maydell wrote: > > On 23 July 2015 at 14:13, Laszlo Ersek wrote: > >> On 07/22/15 19:18, Kevin O'Connor wrote: > >>> Another possibility would be to place the new fw_cfg dma register > >>> address into a named fw_cfg "file" (eg, "fw_cfg_dma"). The > >>> firmware could then use the existing select/data fw_cfg interface > >>> to check if the new dma interface is available by scanning for > >>> that "fw_cfg_dma" file. This has the advantage of not requiring > >>> a new "magic address", but has the disadvantage of a more complex > >>> probe. > >> > >> I like this one so much that I'm worried I'm missing some > >> details. :) > > > > This requires the device itself to know its own address, which > > is in QEMU possible but ugly enough to be worth avoiding. > > > > For ARM MMIO the obvious answer is "the new register should > > just go next to the first one". Does x86 do something that > > means we can't put it somewhere equally straightforward > > or do discovery via whatever x86 uses for discovering MMIO? > > I don't know how x86 determines the MMIO mapping. As far as I gather > from the SeaBIOS patches and this QEMU series, 0xfef0 is a > hand-picked fixed address. (See BIOS_CFG_DMA_ADDR in 7/7.) > > 0xfef0 seems to fall right above the 1MB LAPIC range; I guess > there's no conflict with anything else... > For the x86 part, I chose a completely arbitrary "magic address". Ideally, this address should be discoverable by the BIOS, so it can be moved around if some specification is changed. But a lot of device configuration used in SeaBIOS comes from fw_cfg. That might be the reason why the IO port was also hardcoded. It would be good if somebody comes with an idea on how to make it discoverable by the BIOS in x86.