Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
Added CC to Tristan. I doubt that he is still interested in EFI, though. On Thu, Mar 03, 2011 at 04:46:34PM +0900, Isaku Yamahata wrote: > > Seabios has the patch to address the similar issue with > the changeset of b82a1e49fc0e72fb9bf1a642d6aa707345b0f398, > which enables memory/io unconditionally. > > I suppose the EFI bios is very old so that it has the same issue. > I think the following file is the one to modify. > > efi-vfirmware.hg/edk2-sparse/EdkQemuPkg/Pei/BochsPciScan/BochsPciScan.c > > thanks, > > On Thu, Mar 03, 2011 at 08:43:11AM +0200, vagran wrote: > > I am using TianoCore EFI by Tristan Gingold which is published > > on http://wiki.qemu.org/download/efi-bios.tar.bz2. If you would try > > to load it on Qemu 0.14.0 (built either for i386 or x86_64) you will > > see nothing on VGA display or serial console. But it still will be > > able to load OS after timeout if you have proper disk image. > >> It seems your EFI BIOS doesn't enable memor, io or master bits > >> in command register. > >> > >> > >> or disableintx. > >> > > I have checked your guess and figured out that it works only > > if both memory and io bits are not cleared. So the following > > patch also works: > > diff --git a/hw/pci.c b/hw/pci.c > > index 8b76cea..bcf9b16 100644 > > --- a/hw/pci.c > > +++ b/hw/pci.c > > @@ -163,8 +163,9 @@ void pci_device_reset(PCIDevice *dev) > > pci_device_deassert_intx(dev); > > /* Clear all writeable bits */ > > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > > - pci_get_word(dev->wmask + PCI_COMMAND) | > > - pci_get_word(dev->w1cmask + PCI_COMMAND)); > > + (pci_get_word(dev->wmask + PCI_COMMAND) | > > + pci_get_word(dev->w1cmask + > > PCI_COMMAND)) & > > + ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)); > > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > > pci_get_word(dev->wmask + PCI_STATUS) | > > pci_get_word(dev->w1cmask + PCI_STATUS)); > > > > So probably the problem is in EFI BIOS. But I was not able to find > > its source code. Anyone knows how is it built? > > > > Best regards, > > Artyom. > > > > > > Isaku Yamahata wrote: > >> On Thu, Mar 03, 2011 at 12:03:53PM +0900, Isaku Yamahata wrote: > >> > >>> Hi. Thank you for reporting. > >>> Can you elaborate on the changeset that you pointed out and > >>> your work around? > >>> > >>> Regarding to the changeset, it had the issue, but I suppose > >>> 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. > >>> Do you found any other issue? > >>> > >>> Regarding to your workaround, what was the problem? > >>> What EFI BIOS are you using? Tiano-core derivatives that > >>> Tristan Gingold worked on? Or other one? > >>> It seems your EFI BIOS doesn't enable memor, io or master bits > >>> in command register. > >>> > >> > >> or disableintx. > >> > >> > >>> If so, the issue is in the bios, not qemu. > >>> > >>> thanks, > >>> > >>> On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: > >>> > vagran wrote: > > > Hi, > > I have noted that Qemu VGA and serial console with EFI BIOS > > stopped working in > > 0.14.0 (and in latest development snapshot is still not working). > > Everything was > > fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was > > properly configured on used disk image. The only effect is that > > neither VGA nor > > serial console is not functioning. After short investigation I > > have discovered > > that this functionality was broken by this commit: > > > > commit 9bb3358627d87d8de25fb41b7276575539d799a7 > > Author: Isaku Yamahata > > Date: Fri Nov 19 18:56:02 2010 +0900 > > > > Do you have any idea how this change could affect EFI consoles? > > > > > After further investigation I have found that the following patch > provides > a workaround for the problem, may be it could be useful for somebody who > is more familiar with Qemu PCI code: > > diff --git a/hw/pci.c b/hw/pci.c > index 8b76cea..06dd7ab 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) > pci_update_irq_status(dev); > pci_device_deassert_intx(dev); > /* Clear all writeable bits */ > +#if 0 > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > pci_get_word(dev->wmask + PCI_COMMAND) | > pci_get_word(dev->w1cmask + > PCI_COMMAND)); > +#endif > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > pci_get_word(dev->wmask + PCI_STATUS) | > pci_get
[Qemu-devel] Re: [PATCH v4 2/2] rtl8139: add vlan tag insertion
On Wed, Mar 02, 2011 at 05:36:20PM -0500, Benjamin Poirier wrote: > Add support to the emulated hardware to insert vlan tags in packets > going from the guest to the network. > > Signed-off-by: Benjamin Poirier > Cc: Igor V. Kovalenko > Cc: Jason Wang > Cc: Michael S. Tsirkin > --- > hw/rtl8139.c | 102 > ++ > 1 files changed, 74 insertions(+), 28 deletions(-) > > diff --git a/hw/rtl8139.c b/hw/rtl8139.c > index 6e770bd..0ea79e4 100644 > --- a/hw/rtl8139.c > +++ b/hw/rtl8139.c > @@ -832,11 +832,14 @@ static int rtl8139_can_receive(VLANClientState *nc) > } > } > > -static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, > size_t size_, int do_interrupt) > +static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, > +size_t buf_size, int do_interrupt, const uint8_t *dot1q_buf) > { > RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque; > +/* size_ is the total length of argument buffers */ > +int size_ = buf_size + (dot1q_buf ? VLAN_HDR_LEN : 0); > +/* size is the length of the buffer passed to the driver */ > int size = size_; > -const uint8_t *dot1q_buf; > const uint8_t *next_part; > size_t next_part_size; > > @@ -1018,10 +1021,13 @@ static ssize_t rtl8139_do_receive(VLANClientState > *nc, const uint8_t *buf, size_ > /* next_part starts right after the vlan header (if any), at the > * ethertype for the payload */ > next_part = &buf[ETHER_ADDR_LEN * 2]; > -if (s->CpCmd & CPlusRxVLAN && be16_to_cpup((uint16_t *) > -&buf[ETHER_ADDR_LEN * 2]) == ETHERTYPE_VLAN) { > -dot1q_buf = &buf[ETHER_ADDR_LEN * 2]; > -next_part += VLAN_HDR_LEN; > +if (s->CpCmd & CPlusRxVLAN && (dot1q_buf || be16_to_cpup((uint16_t *) > +&buf[ETHER_ADDR_LEN * 2]) == ETHERTYPE_VLAN)) { > +if (!dot1q_buf) { > +/* the tag is in the buffer */ > +dot1q_buf = &buf[ETHER_ADDR_LEN * 2]; > +next_part += VLAN_HDR_LEN; > +} > size -= VLAN_HDR_LEN; > > rxdw1 &= ~CP_RX_VLAN_TAG_MASK; > @@ -1034,9 +1040,8 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, > const uint8_t *buf, size_ > } else { > /* reset VLAN tag flag */ > rxdw1 &= ~CP_RX_TAVA; > -dot1q_buf = NULL; > } > -next_part_size = buf + size_ - next_part; > +next_part_size = buf + buf_size - next_part; > > /* if too small buffer, then expand it */ > if (size < MIN_BUF_SIZE) { > @@ -1164,10 +1169,18 @@ static ssize_t rtl8139_do_receive(VLANClientState > *nc, const uint8_t *buf, size_ > > /* if too small buffer, then expand it */ > if (size < MIN_BUF_SIZE) { > -memcpy(buf1, buf, size); > +if (unlikely(dot1q_buf)) { unlikely kind of iffy here > +memcpy(buf1, buf, 2 * ETHER_ADDR_LEN); > +memcpy(buf1 + 2 * ETHER_ADDR_LEN, dot1q_buf, VLAN_HDR_LEN); > +memcpy(buf1 + 2 * ETHER_ADDR_LEN + VLAN_HDR_LEN, buf + 2 * > +ETHER_ADDR_LEN, buf_size - 2 * ETHER_ADDR_LEN); > +} else { > +memcpy(buf1, buf, size); > +} > memset(buf1 + size, 0, MIN_BUF_SIZE - size); > buf = buf1; > size = MIN_BUF_SIZE; > +dot1q_buf = NULL; > } > > /* begin ring receiver mode */ > @@ -1196,8 +1209,19 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, > const uint8_t *buf, size_ > rtl8139_write_buffer(s, (uint8_t *)&val, 4); > > /* receive/copy to target memory */ > -rtl8139_write_buffer(s, buf, size); > -val = rtl8139_crc32(0, buf, size); > +if (unlikely(dot1q_buf)) { > +rtl8139_write_buffer(s, buf, 2 * ETHER_ADDR_LEN); > +val = rtl8139_crc32(0, buf, 2 * ETHER_ADDR_LEN); > +rtl8139_write_buffer(s, dot1q_buf, VLAN_HDR_LEN); > +val = rtl8139_crc32(val, dot1q_buf, VLAN_HDR_LEN); > +rtl8139_write_buffer(s, buf + 2 * ETHER_ADDR_LEN, buf_size - 2 * > +ETHER_ADDR_LEN); > +val = rtl8139_crc32(val, buf + 2 * ETHER_ADDR_LEN, buf_size - 2 * > +ETHER_ADDR_LEN); > +} else { > +rtl8139_write_buffer(s, buf, size); > +val = rtl8139_crc32(0, buf, size); > +} > > /* write checksum */ > #if defined (RTL8139_CALCULATE_RXCRC) > @@ -1227,7 +1251,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, > const uint8_t *buf, size_ > > static ssize_t rtl8139_receive(VLANClientState *nc, const uint8_t *buf, > size_t size) > { > -return rtl8139_do_receive(nc, buf, size, 1); > +return rtl8139_do_receive(nc, buf, size, 1, NULL); > } > > static void rtl8139_reset_rxring
Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
Seabios has the patch to address the similar issue with the changeset of b82a1e49fc0e72fb9bf1a642d6aa707345b0f398, which enables memory/io unconditionally. I suppose the EFI bios is very old so that it has the same issue. I think the following file is the one to modify. efi-vfirmware.hg/edk2-sparse/EdkQemuPkg/Pei/BochsPciScan/BochsPciScan.c thanks, On Thu, Mar 03, 2011 at 08:43:11AM +0200, vagran wrote: > I am using TianoCore EFI by Tristan Gingold which is published > on http://wiki.qemu.org/download/efi-bios.tar.bz2. If you would try > to load it on Qemu 0.14.0 (built either for i386 or x86_64) you will > see nothing on VGA display or serial console. But it still will be > able to load OS after timeout if you have proper disk image. >> It seems your EFI BIOS doesn't enable memor, io or master bits >> in command register. >> >> >> or disableintx. >> > I have checked your guess and figured out that it works only > if both memory and io bits are not cleared. So the following > patch also works: > diff --git a/hw/pci.c b/hw/pci.c > index 8b76cea..bcf9b16 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -163,8 +163,9 @@ void pci_device_reset(PCIDevice *dev) > pci_device_deassert_intx(dev); > /* Clear all writeable bits */ > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > - pci_get_word(dev->wmask + PCI_COMMAND) | > - pci_get_word(dev->w1cmask + PCI_COMMAND)); > + (pci_get_word(dev->wmask + PCI_COMMAND) | > + pci_get_word(dev->w1cmask + > PCI_COMMAND)) & > + ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)); > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > pci_get_word(dev->wmask + PCI_STATUS) | > pci_get_word(dev->w1cmask + PCI_STATUS)); > > So probably the problem is in EFI BIOS. But I was not able to find > its source code. Anyone knows how is it built? > > Best regards, > Artyom. > > > Isaku Yamahata wrote: >> On Thu, Mar 03, 2011 at 12:03:53PM +0900, Isaku Yamahata wrote: >> >>> Hi. Thank you for reporting. >>> Can you elaborate on the changeset that you pointed out and >>> your work around? >>> >>> Regarding to the changeset, it had the issue, but I suppose >>> 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. >>> Do you found any other issue? >>> >>> Regarding to your workaround, what was the problem? >>> What EFI BIOS are you using? Tiano-core derivatives that >>> Tristan Gingold worked on? Or other one? >>> It seems your EFI BIOS doesn't enable memor, io or master bits >>> in command register. >>> >> >> or disableintx. >> >> >>> If so, the issue is in the bios, not qemu. >>> >>> thanks, >>> >>> On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: >>> vagran wrote: > Hi, > I have noted that Qemu VGA and serial console with EFI BIOS > stopped working in > 0.14.0 (and in latest development snapshot is still not working). > Everything was > fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was > properly configured on used disk image. The only effect is that > neither VGA nor > serial console is not functioning. After short investigation I > have discovered > that this functionality was broken by this commit: > > commit 9bb3358627d87d8de25fb41b7276575539d799a7 > Author: Isaku Yamahata > Date: Fri Nov 19 18:56:02 2010 +0900 > > Do you have any idea how this change could affect EFI consoles? > > After further investigation I have found that the following patch provides a workaround for the problem, may be it could be useful for somebody who is more familiar with Qemu PCI code: diff --git a/hw/pci.c b/hw/pci.c index 8b76cea..06dd7ab 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) pci_update_irq_status(dev); pci_device_deassert_intx(dev); /* Clear all writeable bits */ +#if 0 pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, pci_get_word(dev->wmask + PCI_COMMAND) | pci_get_word(dev->w1cmask + PCI_COMMAND)); +#endif pci_word_test_and_clear_mask(dev->config + PCI_STATUS, pci_get_word(dev->wmask + PCI_STATUS) | pci_get_word(dev->w1cmask + PCI_STATUS)); Best regards, Artyom. >>> -- >>> yamahata >>> >>> > -- yamahata
Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
On Thu, Mar 03, 2011 at 08:43:11AM +0200, vagran wrote: > I am using TianoCore EFI by Tristan Gingold which is published > on http://wiki.qemu.org/download/efi-bios.tar.bz2. If you would try > to load it on Qemu 0.14.0 (built either for i386 or x86_64) you will > see nothing on VGA display or serial console. But it still will be > able to load OS after timeout if you have proper disk image. Thank you for the info. Then I can also test it locally. thanks, >> It seems your EFI BIOS doesn't enable memor, io or master bits >> in command register. >> >> >> or disableintx. >> > I have checked your guess and figured out that it works only > if both memory and io bits are not cleared. So the following > patch also works: > diff --git a/hw/pci.c b/hw/pci.c > index 8b76cea..bcf9b16 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -163,8 +163,9 @@ void pci_device_reset(PCIDevice *dev) > pci_device_deassert_intx(dev); > /* Clear all writeable bits */ > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > - pci_get_word(dev->wmask + PCI_COMMAND) | > - pci_get_word(dev->w1cmask + PCI_COMMAND)); > + (pci_get_word(dev->wmask + PCI_COMMAND) | > + pci_get_word(dev->w1cmask + > PCI_COMMAND)) & > + ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)); > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > pci_get_word(dev->wmask + PCI_STATUS) | > pci_get_word(dev->w1cmask + PCI_STATUS)); > > So probably the problem is in EFI BIOS. But I was not able to find > its source code. Anyone knows how is it built? > > Best regards, > Artyom. > > > Isaku Yamahata wrote: >> On Thu, Mar 03, 2011 at 12:03:53PM +0900, Isaku Yamahata wrote: >> >>> Hi. Thank you for reporting. >>> Can you elaborate on the changeset that you pointed out and >>> your work around? >>> >>> Regarding to the changeset, it had the issue, but I suppose >>> 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. >>> Do you found any other issue? >>> >>> Regarding to your workaround, what was the problem? >>> What EFI BIOS are you using? Tiano-core derivatives that >>> Tristan Gingold worked on? Or other one? >>> It seems your EFI BIOS doesn't enable memor, io or master bits >>> in command register. >>> >> >> or disableintx. >> >> >>> If so, the issue is in the bios, not qemu. >>> >>> thanks, >>> >>> On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: >>> vagran wrote: > Hi, > I have noted that Qemu VGA and serial console with EFI BIOS > stopped working in > 0.14.0 (and in latest development snapshot is still not working). > Everything was > fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was > properly configured on used disk image. The only effect is that > neither VGA nor > serial console is not functioning. After short investigation I > have discovered > that this functionality was broken by this commit: > > commit 9bb3358627d87d8de25fb41b7276575539d799a7 > Author: Isaku Yamahata > Date: Fri Nov 19 18:56:02 2010 +0900 > > Do you have any idea how this change could affect EFI consoles? > > After further investigation I have found that the following patch provides a workaround for the problem, may be it could be useful for somebody who is more familiar with Qemu PCI code: diff --git a/hw/pci.c b/hw/pci.c index 8b76cea..06dd7ab 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) pci_update_irq_status(dev); pci_device_deassert_intx(dev); /* Clear all writeable bits */ +#if 0 pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, pci_get_word(dev->wmask + PCI_COMMAND) | pci_get_word(dev->w1cmask + PCI_COMMAND)); +#endif pci_word_test_and_clear_mask(dev->config + PCI_STATUS, pci_get_word(dev->wmask + PCI_STATUS) | pci_get_word(dev->w1cmask + PCI_STATUS)); Best regards, Artyom. >>> -- >>> yamahata >>> >>> > -- yamahata
[Qemu-devel] [PATCH v2 3/3] correct VNC_DIRTY_WORDS on 64 bit machine
VNC_DIRTY_WORDS is wrong on 64 bit long machine. Signed-off-by: Wen Congyang --- ui/vnc.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/ui/vnc.h b/ui/vnc.h index 8a1e7b9..5fc54e5 100644 --- a/ui/vnc.h +++ b/ui/vnc.h @@ -81,7 +81,7 @@ typedef void VncSendHextileTile(VncState *vs, #define VNC_MAX_WIDTH 2560 #define VNC_MAX_HEIGHT 2048 -#define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) +#define VNC_DIRTY_WORDS (DIV_ROUND_UP(VNC_MAX_WIDTH, (16 * BITS_PER_LONG))) #define VNC_STAT_RECT 64 #define VNC_STAT_COLS (VNC_MAX_WIDTH / VNC_STAT_RECT) -- 1.7.1
[Qemu-devel] Re: [PATCH 3/3] correct VNC_DIRTY_WORDS on 64 bit machine
At 03/03/2011 02:41 PM, Corentin Chary Write: > On Thu, Mar 3, 2011 at 3:44 AM, Wen Congyang wrote: >> VNC_DIRTY_WORDS is wrong on 64 bit long machine. >> >> Signed-off-by: Wen Congyang >> >> --- >> ui/vnc.h |3 ++- >> 1 files changed, 2 insertions(+), 1 deletions(-) >> >> diff --git a/ui/vnc.h b/ui/vnc.h >> index 8a1e7b9..239a7a7 100644 >> --- a/ui/vnc.h >> +++ b/ui/vnc.h >> @@ -81,7 +81,8 @@ typedef void VncSendHextileTile(VncState *vs, >> >> #define VNC_MAX_WIDTH 2560 >> #define VNC_MAX_HEIGHT 2048 >> -#define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) >> +#define divideup(x, y) (((x) + ((y) - 1)) / (y)) > > osdep.h:#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d)) Thank you for pointing out it. I will update my patch. > >
Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
I am using TianoCore EFI by Tristan Gingold which is published on http://wiki.qemu.org/download/efi-bios.tar.bz2. If you would try to load it on Qemu 0.14.0 (built either for i386 or x86_64) you will see nothing on VGA display or serial console. But it still will be able to load OS after timeout if you have proper disk image. It seems your EFI BIOS doesn't enable memor, io or master bits in command register. or disableintx. I have checked your guess and figured out that it works only if both memory and io bits are not cleared. So the following patch also works: diff --git a/hw/pci.c b/hw/pci.c index 8b76cea..bcf9b16 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -163,8 +163,9 @@ void pci_device_reset(PCIDevice *dev) pci_device_deassert_intx(dev); /* Clear all writeable bits */ pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, - pci_get_word(dev->wmask + PCI_COMMAND) | - pci_get_word(dev->w1cmask + PCI_COMMAND)); + (pci_get_word(dev->wmask + PCI_COMMAND) | + pci_get_word(dev->w1cmask + PCI_COMMAND)) & + ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)); pci_word_test_and_clear_mask(dev->config + PCI_STATUS, pci_get_word(dev->wmask + PCI_STATUS) | pci_get_word(dev->w1cmask + PCI_STATUS)); So probably the problem is in EFI BIOS. But I was not able to find its source code. Anyone knows how is it built? Best regards, Artyom. Isaku Yamahata wrote: On Thu, Mar 03, 2011 at 12:03:53PM +0900, Isaku Yamahata wrote: Hi. Thank you for reporting. Can you elaborate on the changeset that you pointed out and your work around? Regarding to the changeset, it had the issue, but I suppose 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. Do you found any other issue? Regarding to your workaround, what was the problem? What EFI BIOS are you using? Tiano-core derivatives that Tristan Gingold worked on? Or other one? It seems your EFI BIOS doesn't enable memor, io or master bits in command register. or disableintx. If so, the issue is in the bios, not qemu. thanks, On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: vagran wrote: Hi, I have noted that Qemu VGA and serial console with EFI BIOS stopped working in 0.14.0 (and in latest development snapshot is still not working). Everything was fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was properly configured on used disk image. The only effect is that neither VGA nor serial console is not functioning. After short investigation I have discovered that this functionality was broken by this commit: commit 9bb3358627d87d8de25fb41b7276575539d799a7 Author: Isaku Yamahata Date: Fri Nov 19 18:56:02 2010 +0900 Do you have any idea how this change could affect EFI consoles? After further investigation I have found that the following patch provides a workaround for the problem, may be it could be useful for somebody who is more familiar with Qemu PCI code: diff --git a/hw/pci.c b/hw/pci.c index 8b76cea..06dd7ab 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) pci_update_irq_status(dev); pci_device_deassert_intx(dev); /* Clear all writeable bits */ +#if 0 pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, pci_get_word(dev->wmask + PCI_COMMAND) | pci_get_word(dev->w1cmask + PCI_COMMAND)); +#endif pci_word_test_and_clear_mask(dev->config + PCI_STATUS, pci_get_word(dev->wmask + PCI_STATUS) | pci_get_word(dev->w1cmask + PCI_STATUS)); Best regards, Artyom. -- yamahata
[Qemu-devel] Re: [PATCH 3/3] correct VNC_DIRTY_WORDS on 64 bit machine
On Thu, Mar 3, 2011 at 3:44 AM, Wen Congyang wrote: > VNC_DIRTY_WORDS is wrong on 64 bit long machine. > > Signed-off-by: Wen Congyang > > --- > ui/vnc.h | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/ui/vnc.h b/ui/vnc.h > index 8a1e7b9..239a7a7 100644 > --- a/ui/vnc.h > +++ b/ui/vnc.h > @@ -81,7 +81,8 @@ typedef void VncSendHextileTile(VncState *vs, > > #define VNC_MAX_WIDTH 2560 > #define VNC_MAX_HEIGHT 2048 > -#define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) > +#define divideup(x, y) (((x) + ((y) - 1)) / (y)) osdep.h:#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d)) -- Corentin Chary http://xf.iksaif.net
[Qemu-devel] RE: Printed Roll-up Banners USD7 Only (GIGAPrint Overseas)
Dear IBT Ing. B?ro Trncik V. GIGAPrint Ltd. | 11/F, Fu Hop Fty Bldg, 209-211 Wai Yip St, Kwun Tong, Kowloon, HK | 23892088
Re: [Qemu-devel] [PATCH v2] disable sigcld handling before calling pclose()
At 12/21/2010 12:05 PM, Wen Congyang Write: > When I use the command 'virsh save' to save the domain state, > I receive the following error message: > operation failed: Migration unexpectedly failed. > > I debug the qemu by adding some printf(), and find the function > pclose() returns -1. > > I use strace to trace qemu, the log is as the following: > == > close(17) = 0 > --- SIGCHLD (Child exited) @ 0 (0) --- > wait4(-1, NULL, WNOHANG, NULL) = 22016 > rt_sigreturn(0) = 0 > wait4(22016, 0x7fff7f1034fc, 0, NULL) = -1 ECHILD (No child processes) > == > > We wait the child twice: one is in signal SIGCHLD handling and the other > one is in pclose(). > > We should disable sigcld handling before calling pclose(). > > v2: > - Add stub functions for Win32 > > Signed-off-by: Wen Congyang > > --- > os-posix.c | 19 +++ > qemu-os-posix.h |2 ++ > qemu-os-win32.h |2 ++ > savevm.c|2 ++ > 4 files changed, 25 insertions(+), 0 deletions(-) > > diff --git a/os-posix.c b/os-posix.c > index 38c29d1..b163995 100644 > --- a/os-posix.c > +++ b/os-posix.c > @@ -86,6 +86,25 @@ void os_setup_signal_handling(void) > sigaction(SIGCHLD, &act, NULL); > } > > +void os_stop_sigchld_handling(void) > +{ > +struct sigaction act; > + > +memset(&act, 0, sizeof(act)); > +act.sa_handler = SIG_DFL; > +sigaction(SIGCHLD, &act, NULL); > +} > + > +void os_resume_sigchld_handling(void) > +{ > +struct sigaction act; > + > +memset(&act, 0, sizeof(act)); > +act.sa_handler = sigchld_handler; > +act.sa_flags = SA_NOCLDSTOP; > +sigaction(SIGCHLD, &act, NULL); > +} > + > /* Find a likely location for support files using the location of the binary. > For installed binaries this will be "$bindir/../share/qemu". When > running from the build tree this will be "$bindir/../pc-bios". */ > diff --git a/qemu-os-posix.h b/qemu-os-posix.h > index 81fd9ab..1c317f1 100644 > --- a/qemu-os-posix.h > +++ b/qemu-os-posix.h > @@ -33,6 +33,8 @@ static inline void os_host_main_loop_wait(int *timeout) > void os_set_line_buffering(void); > void os_set_proc_name(const char *s); > void os_setup_signal_handling(void); > +void os_stop_sigchld_handling(void); > +void os_resume_sigchld_handling(void); > void os_daemonize(void); > void os_setup_post(void); > > diff --git a/qemu-os-win32.h b/qemu-os-win32.h > index 1a07e5e..f31c5ef 100644 > --- a/qemu-os-win32.h > +++ b/qemu-os-win32.h > @@ -43,6 +43,8 @@ void qemu_del_wait_object(HANDLE handle, WaitObjectFunc > *func, void *opaque); > void os_host_main_loop_wait(int *timeout); > > static inline void os_setup_signal_handling(void) {} > +static inline void os_stop_sigchld_handling(void) {} > +static inline void os_resume_sigchld_handling(void) {} > static inline void os_daemonize(void) {} > static inline void os_setup_post(void) {} > void os_set_line_buffering(void); > diff --git a/savevm.c b/savevm.c > index 90aa237..387b70b 100644 > --- a/savevm.c > +++ b/savevm.c > @@ -234,7 +234,9 @@ static int stdio_pclose(void *opaque) > { > QEMUFileStdio *s = opaque; > int ret; > +os_stop_sigchld_handling(); > ret = pclose(s->stdio_file); > +os_resume_sigchld_handling(); > qemu_free(s); > return ret; > } Ping Again... :) This is a bug fix. 2 months has gone, but I do not receive any comment. Should we remove SIGCHLD handling as Paolo said?
Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
On Thu, Mar 03, 2011 at 12:03:53PM +0900, Isaku Yamahata wrote: > Hi. Thank you for reporting. > Can you elaborate on the changeset that you pointed out and > your work around? > > Regarding to the changeset, it had the issue, but I suppose > 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. > Do you found any other issue? > > Regarding to your workaround, what was the problem? > What EFI BIOS are you using? Tiano-core derivatives that > Tristan Gingold worked on? Or other one? > It seems your EFI BIOS doesn't enable memor, io or master bits > in command register. or disableintx. > If so, the issue is in the bios, not qemu. > > thanks, > > On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: > > vagran wrote: > >> Hi, > >> I have noted that Qemu VGA and serial console with EFI BIOS stopped > >> working in > >> 0.14.0 (and in latest development snapshot is still not working). > >> Everything was > >> fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was > >> properly configured on used disk image. The only effect is that > >> neither VGA nor > >> serial console is not functioning. After short investigation I have > >> discovered > >> that this functionality was broken by this commit: > >> > >> commit 9bb3358627d87d8de25fb41b7276575539d799a7 > >> Author: Isaku Yamahata > >> Date: Fri Nov 19 18:56:02 2010 +0900 > >> > >> Do you have any idea how this change could affect EFI consoles? > >> > > After further investigation I have found that the following patch provides > > a workaround for the problem, may be it could be useful for somebody who > > is more familiar with Qemu PCI code: > > > > diff --git a/hw/pci.c b/hw/pci.c > > index 8b76cea..06dd7ab 100644 > > --- a/hw/pci.c > > +++ b/hw/pci.c > > @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) > > pci_update_irq_status(dev); > > pci_device_deassert_intx(dev); > > /* Clear all writeable bits */ > > +#if 0 > > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > > pci_get_word(dev->wmask + PCI_COMMAND) | > > pci_get_word(dev->w1cmask + PCI_COMMAND)); > > +#endif > > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > > pci_get_word(dev->wmask + PCI_STATUS) | > > pci_get_word(dev->w1cmask + PCI_STATUS)); > > > > Best regards, > > Artyom. > > > > > > -- > yamahata > -- yamahata
Re: [Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
Hi. Thank you for reporting. Can you elaborate on the changeset that you pointed out and your work around? Regarding to the changeset, it had the issue, but I suppose 80376c3fc2c38fdd45354e4b0eb45031f35587ed fixed it. Do you found any other issue? Regarding to your workaround, what was the problem? What EFI BIOS are you using? Tiano-core derivatives that Tristan Gingold worked on? Or other one? It seems your EFI BIOS doesn't enable memor, io or master bits in command register. If so, the issue is in the bios, not qemu. thanks, On Wed, Mar 02, 2011 at 11:27:31PM +0200, vagran wrote: > vagran wrote: >> Hi, >> I have noted that Qemu VGA and serial console with EFI BIOS stopped >> working in >> 0.14.0 (and in latest development snapshot is still not working). >> Everything was >> fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was >> properly configured on used disk image. The only effect is that >> neither VGA nor >> serial console is not functioning. After short investigation I have >> discovered >> that this functionality was broken by this commit: >> >> commit 9bb3358627d87d8de25fb41b7276575539d799a7 >> Author: Isaku Yamahata >> Date: Fri Nov 19 18:56:02 2010 +0900 >> >> Do you have any idea how this change could affect EFI consoles? >> > After further investigation I have found that the following patch provides > a workaround for the problem, may be it could be useful for somebody who > is more familiar with Qemu PCI code: > > diff --git a/hw/pci.c b/hw/pci.c > index 8b76cea..06dd7ab 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) > pci_update_irq_status(dev); > pci_device_deassert_intx(dev); > /* Clear all writeable bits */ > +#if 0 > pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, > pci_get_word(dev->wmask + PCI_COMMAND) | > pci_get_word(dev->w1cmask + PCI_COMMAND)); > +#endif > pci_word_test_and_clear_mask(dev->config + PCI_STATUS, > pci_get_word(dev->wmask + PCI_STATUS) | > pci_get_word(dev->w1cmask + PCI_STATUS)); > > Best regards, > Artyom. > > -- yamahata
[Qemu-devel] [PATCH 3/3] correct VNC_DIRTY_WORDS on 64 bit machine
VNC_DIRTY_WORDS is wrong on 64 bit long machine. Signed-off-by: Wen Congyang --- ui/vnc.h |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/ui/vnc.h b/ui/vnc.h index 8a1e7b9..239a7a7 100644 --- a/ui/vnc.h +++ b/ui/vnc.h @@ -81,7 +81,8 @@ typedef void VncSendHextileTile(VncState *vs, #define VNC_MAX_WIDTH 2560 #define VNC_MAX_HEIGHT 2048 -#define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) +#define divideup(x, y) (((x) + ((y) - 1)) / (y)) +#define VNC_DIRTY_WORDS (divideup(VNC_MAX_WIDTH, (16 * BITS_PER_LONG))) #define VNC_STAT_RECT 64 #define VNC_STAT_COLS (VNC_MAX_WIDTH / VNC_STAT_RECT) -- 1.7.1
Re: [Qemu-devel] [RFC][PATCH] Preliminary BeBox support
Le 2 mars 2011 à 22:59, Andreas Färber a écrit : > Hello François, > > Am 01.03.2011 um 01:15 schrieb François Revol: > >> Since Natalia raised the subject I though I'd post my current patch for the >> BeBox support. >> I think the loader stuff can probably be committed already with some cleanup. >> The rest is mostly a copy of the prep file with tweaks and needs more work. >> The boot nub images can be extracted with this script: >> http://revolf.free.fr/beos/extract_bebox_images.sh >> Running -M bebox -bios bootnub.image makes it try to probe the PCI bridge >> for now. >> Comments ? > > How does this relate to the BeBox emulation I started? I assume not at all? > I tried to do the PEF parsing and nub extraction inside QEMU. > http://repo.or.cz/w/qemu/afaerber.git/shortlog/refs/heads/bebox Eh, not at all, I didn't notice this one. Hopefully I won't be dupping work anymore :p > I'd prefer the BeBox machine to go into ppc_prep.c and to benefit from the > PReP cleanup that I started. Yeah it needs some. > I'll have a closer look the weekend, please cc me on future patches. Ok. François.
[Qemu-devel] Invitation to connect on LinkedIn
LinkedIn Starrry Han requested to add you as a connection on LinkedIn: -- Jiajun, I'd like to add you to my professional network on LinkedIn. - Starrry Accept invitation from Starrry Han http://www.linkedin.com/e/-kkb1ec-gkt15x24-k/qTMmi8QEI_f3FNXUkL1mvZgy00BGYniwg3/blk/I102453064_11/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYNclYQdz0PdjgOc359bS55t6xakn5KbPsQc3sNcz8Pd38LrCBxbOYWrSlI/EML_comm_afe/ View invitation from Starrry Han http://www.linkedin.com/e/-kkb1ec-gkt15x24-k/qTMmi8QEI_f3FNXUkL1mvZgy00BGYniwg3/blk/I102453064_11/34NnPgSc3cRd38MckALqnpPbOYWrSlI/svi/ -- DID YOU KNOW that LinkedIn can find the answers to your most difficult questions? Post those vexing questions on LinkedIn Answers to tap into the knowledge of the world's foremost business experts: http://www.linkedin.com/e/-kkb1ec-gkt15x24-k/ask/inv-23/ -- (c) 2011, LinkedIn Corporation
Re: [Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
At 03/03/2011 06:27 AM, Stefan Weil Write: > Am 02.03.2011 23:01, schrieb Stefan Weil: >> Am 02.03.2011 19:47, schrieb Peter Maydell: >>> On 2 March 2011 18:36, Stefan Weil wrote: No. I dont't think that the third parameter of bitmap_clear is ok like that. See my patch for the correct value. >>> >>> Wen's patch: >>> >>> + const size_t width = ds_get_width(vd->ds) / 16; >>> [...] >>> -bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); >>> -bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), >>> - VNC_DIRTY_WORDS * BITS_PER_LONG); >>> +bitmap_set(width_mask, 0, width); >>> +bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG >>> - width); >>> >>> Your patch: >>> >>> bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), >>> - VNC_DIRTY_WORDS * BITS_PER_LONG); >>> + (VNC_MAX_WIDTH - ds_get_width(vd->ds)) / 16); >>> >>> Since ui/vnc.h has: >>> >>> #define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) >>> >>> the third parameter to bitmap_clear is the same value in >>> both cases, isn't it? Or is this a rounding bug? >>> >>> -- PMM >> >> Because of rounding effects, both values can be different. >> >> The part missing in my patch is correct handling of another >> rounding effect: >> >> VNC_DIRTY_WORDS is exact for 32 bit long values (and the >> "old" code which used uint32_t until some weeks ago), where >> VNC_DIRTY_WORDS = 2560/16/32 = 5. >> >> For 64 bit values, VNC_DIRTY_WORDS = 2560/16/64 = 2 (rounded)! >> >> Stefan W. > > > Is bitmap_clear() really needed here? Meanwhile I think it is not, > so this might be a new patch variant... I do not know why we call bitmap_clear() hear. I only know it is the same as the old code: -static inline void vnc_set_bits(uint32_t *d, int n, int nb_words) -{ -int j; - -j = 0; -while (n >= 32) { -d[j++] = -1; -n -= 32; -} -if (n > 0) -d[j++] = (1 << n) - 1; -while (j < nb_words) <=== bitmap_clear() -d[j++] = 0; -} -vnc_set_bits(width_mask, (ds_get_width(vd->ds) / 16), VNC_DIRTY_WORDS); +bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); +bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), + VNC_DIRTY_WORDS * BITS_PER_LONG); > > >
Re: [Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
On 2 March 2011 22:01, Stefan Weil wrote: > The part missing in my patch is correct handling of another > rounding effect: > > VNC_DIRTY_WORDS is exact for 32 bit long values (and the > "old" code which used uint32_t until some weeks ago), where > VNC_DIRTY_WORDS = 2560/16/32 = 5. > > For 64 bit values, VNC_DIRTY_WORDS = 2560/16/64 = 2 (rounded)! Yes, I noticed that after I posted. Given that we have arrays like unsigned long dirty[VNC_MAX_HEIGHT][VNC_DIRTY_WORDS]; rounding down rather than up looks a bit suspicious... (Can we just make VNC_MAX_WIDTH a multiple of 64, or is it baked into the VNC protocol?) -- PMM
[Qemu-devel] [PATCH v4 1/2] rtl8139: add vlan tag extraction
Add support to the emulated hardware to extract vlan tags in packets going from the network to the guest. Signed-off-by: Benjamin Poirier Cc: Igor V. Kovalenko Cc: Jason Wang Cc: Michael S. Tsirkin -- AFAIK, extraction is optional to get vlans working. The driver requests rx detagging but should not assume that it was done. Under Linux, the mac layer will catch the vlan ethertype. I only added this part for completeness (to emulate the hardware more truthfully...) --- hw/rtl8139.c | 100 +++--- 1 files changed, 81 insertions(+), 19 deletions(-) diff --git a/hw/rtl8139.c b/hw/rtl8139.c index a22530c..6e770bd 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -68,6 +68,16 @@ #if defined(RTL8139_CALCULATE_RXCRC) /* For crc32 */ #include + +static inline uLong rtl8139_crc32(uLong crc, const Bytef *buf, uInt len) +{ +return crc32(crc, buf, len); +} +#else +static inline uLong rtl8139_crc32(uLong crc, const Bytef *buf, uInt len) +{ +return 0; +} #endif #define SET_MASKED(input, mask, curr) \ @@ -77,6 +87,14 @@ #define MOD2(input, size) \ ( ( input ) & ( size - 1 ) ) +#define ETHER_ADDR_LEN 6 +#define ETHER_TYPE_LEN 2 +#define ETHER_HDR_LEN (ETHER_ADDR_LEN * 2 + ETHER_TYPE_LEN) +#define ETHERTYPE_VLAN 0x8100 + +#define VLAN_TCI_LEN 2 +#define VLAN_HDR_LEN (ETHER_TYPE_LEN + VLAN_TCI_LEN) + #if defined (DEBUG_RTL8139) # define DEBUG_PRINT(x) do { printf x ; } while (0) #else @@ -818,10 +836,13 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ { RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque; int size = size_; +const uint8_t *dot1q_buf; +const uint8_t *next_part; +size_t next_part_size; uint32_t packet_header = 0; -uint8_t buf1[60]; +uint8_t buf1[MIN_BUF_SIZE]; static const uint8_t broadcast_macaddr[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; @@ -933,14 +954,6 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ } } -/* if too small buffer, then expand it */ -if (size < MIN_BUF_SIZE) { -memcpy(buf1, buf, size); -memset(buf1 + size, 0, MIN_BUF_SIZE - size); -buf = buf1; -size = MIN_BUF_SIZE; -} - if (rtl8139_cp_receiver_enabled(s)) { DEBUG_PRINT(("RTL8139: in C+ Rx mode \n")); @@ -1001,6 +1014,41 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ uint32_t rx_space = rxdw0 & CP_RX_BUFFER_SIZE_MASK; +/* write VLAN info to descriptor variables */ +/* next_part starts right after the vlan header (if any), at the + * ethertype for the payload */ +next_part = &buf[ETHER_ADDR_LEN * 2]; +if (s->CpCmd & CPlusRxVLAN && be16_to_cpup((uint16_t *) +&buf[ETHER_ADDR_LEN * 2]) == ETHERTYPE_VLAN) { +dot1q_buf = &buf[ETHER_ADDR_LEN * 2]; +next_part += VLAN_HDR_LEN; +size -= VLAN_HDR_LEN; + +rxdw1 &= ~CP_RX_VLAN_TAG_MASK; +/* BE + ~le_to_cpu()~ + cpu_to_le() = BE */ +rxdw1 |= CP_RX_TAVA | le16_to_cpup((uint16_t *) +&buf[ETHER_HDR_LEN]); + +DEBUG_PRINT(("RTL8139: C+ Rx mode : extracted vlan tag with tci: " +"%u\n", be16_to_cpup((uint16_t *) &buf[ETHER_HDR_LEN]))); +} else { +/* reset VLAN tag flag */ +rxdw1 &= ~CP_RX_TAVA; +dot1q_buf = NULL; +} +next_part_size = buf + size_ - next_part; + +/* if too small buffer, then expand it */ +if (size < MIN_BUF_SIZE) { +size_t tmp_size = MIN_BUF_SIZE - ETHER_ADDR_LEN * 2; + +memcpy(buf1, next_part, next_part_size); +memset(buf1 + next_part_size, 0, tmp_size - next_part_size); +next_part = buf1; +next_part_size = tmp_size; +size = MIN_BUF_SIZE; +} + /* TODO: scatter the packet over available receive ring descriptors space */ if (size+4 > rx_space) @@ -1022,7 +1070,18 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ target_phys_addr_t rx_addr = rtl8139_addr64(rxbufLO, rxbufHI); /* receive/copy to target memory */ -cpu_physical_memory_write( rx_addr, buf, size ); +if (unlikely(dot1q_buf)) { +cpu_physical_memory_write(rx_addr, buf, 2 * ETHER_ADDR_LEN); +val = rtl8139_crc32(0, buf, 2 * ETHER_ADDR_LEN); +val = rtl8139_crc32(val, dot1q_buf, VLAN_HDR_LEN); +cpu_physical_memory_write(rx_addr + 2 * ETHER_ADDR_LEN, next_part, +next_part_size); +val = rtl8139_crc32(val, buf + 2 * ETHER_ADDR_LEN, +next_part_size); +} else { +cpu_physical_memory_write(rx_addr, buf, size); +val = rtl8139_crc32(0, buf, size); +}
[Qemu-devel] [PATCH v4 2/2] rtl8139: add vlan tag insertion
Add support to the emulated hardware to insert vlan tags in packets going from the guest to the network. Signed-off-by: Benjamin Poirier Cc: Igor V. Kovalenko Cc: Jason Wang Cc: Michael S. Tsirkin --- hw/rtl8139.c | 102 ++ 1 files changed, 74 insertions(+), 28 deletions(-) diff --git a/hw/rtl8139.c b/hw/rtl8139.c index 6e770bd..0ea79e4 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -832,11 +832,14 @@ static int rtl8139_can_receive(VLANClientState *nc) } } -static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_t size_, int do_interrupt) +static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, +size_t buf_size, int do_interrupt, const uint8_t *dot1q_buf) { RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque; +/* size_ is the total length of argument buffers */ +int size_ = buf_size + (dot1q_buf ? VLAN_HDR_LEN : 0); +/* size is the length of the buffer passed to the driver */ int size = size_; -const uint8_t *dot1q_buf; const uint8_t *next_part; size_t next_part_size; @@ -1018,10 +1021,13 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ /* next_part starts right after the vlan header (if any), at the * ethertype for the payload */ next_part = &buf[ETHER_ADDR_LEN * 2]; -if (s->CpCmd & CPlusRxVLAN && be16_to_cpup((uint16_t *) -&buf[ETHER_ADDR_LEN * 2]) == ETHERTYPE_VLAN) { -dot1q_buf = &buf[ETHER_ADDR_LEN * 2]; -next_part += VLAN_HDR_LEN; +if (s->CpCmd & CPlusRxVLAN && (dot1q_buf || be16_to_cpup((uint16_t *) +&buf[ETHER_ADDR_LEN * 2]) == ETHERTYPE_VLAN)) { +if (!dot1q_buf) { +/* the tag is in the buffer */ +dot1q_buf = &buf[ETHER_ADDR_LEN * 2]; +next_part += VLAN_HDR_LEN; +} size -= VLAN_HDR_LEN; rxdw1 &= ~CP_RX_VLAN_TAG_MASK; @@ -1034,9 +1040,8 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ } else { /* reset VLAN tag flag */ rxdw1 &= ~CP_RX_TAVA; -dot1q_buf = NULL; } -next_part_size = buf + size_ - next_part; +next_part_size = buf + buf_size - next_part; /* if too small buffer, then expand it */ if (size < MIN_BUF_SIZE) { @@ -1164,10 +1169,18 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ /* if too small buffer, then expand it */ if (size < MIN_BUF_SIZE) { -memcpy(buf1, buf, size); +if (unlikely(dot1q_buf)) { +memcpy(buf1, buf, 2 * ETHER_ADDR_LEN); +memcpy(buf1 + 2 * ETHER_ADDR_LEN, dot1q_buf, VLAN_HDR_LEN); +memcpy(buf1 + 2 * ETHER_ADDR_LEN + VLAN_HDR_LEN, buf + 2 * +ETHER_ADDR_LEN, buf_size - 2 * ETHER_ADDR_LEN); +} else { +memcpy(buf1, buf, size); +} memset(buf1 + size, 0, MIN_BUF_SIZE - size); buf = buf1; size = MIN_BUF_SIZE; +dot1q_buf = NULL; } /* begin ring receiver mode */ @@ -1196,8 +1209,19 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ rtl8139_write_buffer(s, (uint8_t *)&val, 4); /* receive/copy to target memory */ -rtl8139_write_buffer(s, buf, size); -val = rtl8139_crc32(0, buf, size); +if (unlikely(dot1q_buf)) { +rtl8139_write_buffer(s, buf, 2 * ETHER_ADDR_LEN); +val = rtl8139_crc32(0, buf, 2 * ETHER_ADDR_LEN); +rtl8139_write_buffer(s, dot1q_buf, VLAN_HDR_LEN); +val = rtl8139_crc32(val, dot1q_buf, VLAN_HDR_LEN); +rtl8139_write_buffer(s, buf + 2 * ETHER_ADDR_LEN, buf_size - 2 * +ETHER_ADDR_LEN); +val = rtl8139_crc32(val, buf + 2 * ETHER_ADDR_LEN, buf_size - 2 * +ETHER_ADDR_LEN); +} else { +rtl8139_write_buffer(s, buf, size); +val = rtl8139_crc32(0, buf, size); +} /* write checksum */ #if defined (RTL8139_CALCULATE_RXCRC) @@ -1227,7 +1251,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ static ssize_t rtl8139_receive(VLANClientState *nc, const uint8_t *buf, size_t size) { -return rtl8139_do_receive(nc, buf, size, 1); +return rtl8139_do_receive(nc, buf, size, 1, NULL); } static void rtl8139_reset_rxring(RTL8139State *s, uint32_t bufferSize) @@ -1802,7 +1826,8 @@ static uint32_t rtl8139_RxConfig_read(RTL8139State *s) return ret; } -static void rtl8139_transfer_frame(RTL8139State *s, const uint8_t *buf, int size, int do_interrupt) +static void rtl8139_transfer_frame(RTL8139State *s, uint8_t *buf, int size, +int do_interru
[Qemu-devel] [PATCH v4] rtl8139: add vlan support
I've tested v4 with x86_64 host/guest. I used the same testing procedure as before. I've tested a plain configuration as well as one with tso + vlan offload, successfully. I had to hack around the Linux 8139cp driver to be able to enable tso on vlan which leads me to wonder, can someone with access to the C+ spec or a real card confirm that it can do tso and vlan offload at the same time? The patch I used for the kernel is at https://gist.github.com/851895. Changes since v2: insertion: * moved insertion later in the process, to handle tso * use qemu_sendv_packet() to insert the tag for us * added dot1q_buf parameter to rtl8139_do_receive() to avoid some memcpy() in loopback mode. Note that the code path through that function is unchanged when dot1q_buf is NULL. extraction: * reduced the amount of copying by moving the "frame too short" logic after the removal of the vlan tag (as is done in e1000.c for example). Unfortunately, that logic can no longer be shared betwen C+ and C mode. I've posted v2 of these patches back in November http://article.gmane.org/gmane.comp.emulators.qemu/84252 I've tested v3 on the following combinations of guest and hosts: host: x86_64, guest: x86_64 host: x86_64, guest: ppc32 host: ppc32, guest: ppc32 Testing on the x86_64 host used '-net tap' and consisted of: * making an http transfert on the untagged interface. * ping -s 0-1472 to another host on a vlan. * making an scp upload to another host on a vlan. Testing on the ppc32 host used '-net socket' connected to an x86_64 qemu-kvm running the virtio nic and consisted of: * establishing an ssh connection between the two using an untagged interface. * ping -s 0-1472 between the two using a vlan. * making an scp transfer in both directions using a vlan. All that was successful. Nevertheless, it doesn't exercise all code paths so care is in order. Please note that the lack of vlan support in rtl8139 has taken a few people aback: https://bugzilla.redhat.com/show_bug.cgi?id=516587 http://article.gmane.org/gmane.linux.network.general/14266 Thanks, -Ben
Re: [Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
Am 02.03.2011 23:01, schrieb Stefan Weil: Am 02.03.2011 19:47, schrieb Peter Maydell: On 2 March 2011 18:36, Stefan Weil wrote: No. I dont't think that the third parameter of bitmap_clear is ok like that. See my patch for the correct value. Wen's patch: + const size_t width = ds_get_width(vd->ds) / 16; [...] -bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); -bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); +bitmap_set(width_mask, 0, width); +bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width); Your patch: bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); + (VNC_MAX_WIDTH - ds_get_width(vd->ds)) / 16); Since ui/vnc.h has: #define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) the third parameter to bitmap_clear is the same value in both cases, isn't it? Or is this a rounding bug? -- PMM Because of rounding effects, both values can be different. The part missing in my patch is correct handling of another rounding effect: VNC_DIRTY_WORDS is exact for 32 bit long values (and the "old" code which used uint32_t until some weeks ago), where VNC_DIRTY_WORDS = 2560/16/32 = 5. For 64 bit values, VNC_DIRTY_WORDS = 2560/16/64 = 2 (rounded)! Stefan W. Is bitmap_clear() really needed here? Meanwhile I think it is not, so this might be a new patch variant...
[Qemu-devel] [PATCH 2/2] net: fix qemu_can_send_packet logic
If any of the clients is not ready to receive (ie it has a can_receive callback and can_receive() returns false), we don't want to start sending, else this client may miss/discard the packet. I got this behaviour with the following setup : the emulated machine is using an USB-ethernet adapter, it is connected to the network using SLIRP and I'm dumping the traffic in a .pcap file. As per the following command line : -net nic,model=usb,vlan=1 -net user,vlan=1 -net dump,vlan=1,file=/tmp/pkt.pcap Every time that two packets are coming in a row from the host, the usb-net code will receive the first one, then returns 0 to can_receive call since it has a 1 packet long queue. But as the dump code is always ready to receive, qemu_can_send_packet will return true and the next packet will discard the previous one in the usb-net code. Signed-off-by: Vincent Palatin --- net.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/net.c b/net.c index ec4745d..72ac4cf 100644 --- a/net.c +++ b/net.c @@ -411,11 +411,11 @@ int qemu_can_send_packet(VLANClientState *sender) } /* no can_receive() handler, they can always receive */ -if (!vc->info->can_receive || vc->info->can_receive(vc)) { -return 1; +if (vc->info->can_receive && !vc->info->can_receive(vc)) { +return 0; } } -return 0; +return 1; } static ssize_t qemu_deliver_packet(VLANClientState *sender, -- 1.7.3.1
[Qemu-devel] net: small fixes
Dear Qemu developers, While debugging a machine emulation using SLIRP based user networking, I ran into a couple of issues. Please find attached the patches for them : 1) fix the SLIRP compilation when the debug traces are activated. 2) avoid packet loss with several receivers on the same vlan. Regards, -- Vincent
[Qemu-devel] [PATCH 1/2] net: fix trace when debug is activated in slirp
make the code compile correctly when DEBUG is activated. Signed-off-by: Vincent Palatin --- slirp/bootp.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/slirp/bootp.c b/slirp/bootp.c index 0905c6d..1eb2ed1 100644 --- a/slirp/bootp.c +++ b/slirp/bootp.c @@ -284,7 +284,7 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t *bp) } else { static const char nak_msg[] = "requested address not available"; -DPRINTF("nak'ed addr=%08x\n", ntohl(preq_addr->s_addr)); +DPRINTF("nak'ed addr=%08x\n", ntohl(preq_addr.s_addr)); *q++ = RFC2132_MSG_TYPE; *q++ = 1; -- 1.7.3.1
Re: [Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
Am 02.03.2011 19:47, schrieb Peter Maydell: On 2 March 2011 18:36, Stefan Weil wrote: No. I dont't think that the third parameter of bitmap_clear is ok like that. See my patch for the correct value. Wen's patch: + const size_t width = ds_get_width(vd->ds) / 16; [...] -bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); -bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); +bitmap_set(width_mask, 0, width); +bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width); Your patch: bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); + (VNC_MAX_WIDTH - ds_get_width(vd->ds)) / 16); Since ui/vnc.h has: #define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) the third parameter to bitmap_clear is the same value in both cases, isn't it? Or is this a rounding bug? -- PMM Because of rounding effects, both values can be different. The part missing in my patch is correct handling of another rounding effect: VNC_DIRTY_WORDS is exact for 32 bit long values (and the "old" code which used uint32_t until some weeks ago), where VNC_DIRTY_WORDS = 2560/16/32 = 5. For 64 bit values, VNC_DIRTY_WORDS = 2560/16/64 = 2 (rounded)! Stefan W.
Re: [Qemu-devel] [RFC][PATCH] Preliminary BeBox support
Hello François, Am 01.03.2011 um 01:15 schrieb François Revol: Since Natalia raised the subject I though I'd post my current patch for the BeBox support. I think the loader stuff can probably be committed already with some cleanup. The rest is mostly a copy of the prep file with tweaks and needs more work. The boot nub images can be extracted with this script: http://revolf.free.fr/beos/extract_bebox_images.sh Running -M bebox -bios bootnub.image makes it try to probe the PCI bridge for now. Comments ? How does this relate to the BeBox emulation I started? I assume not at all? I tried to do the PEF parsing and nub extraction inside QEMU. http://repo.or.cz/w/qemu/afaerber.git/shortlog/refs/heads/bebox I'd prefer the BeBox machine to go into ppc_prep.c and to benefit from the PReP cleanup that I started. I'll have a closer look the weekend, please cc me on future patches. Thanks, Andreas
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/02/2011 11:30 AM, Avi Kivity wrote: It's really the natural generalization of what you're proposing. So basically, the only differences are: 1) always use the new RAID1 format 2) drop the progress bitmap 3) support multiple devices per file 4) let drive properties be specified beyond filename All reasonable things to do. Well, I dislike 3, and the whole "qemu is authoritative source of configuration" thing. Yeah, at this point, I think the new no-stored-state approach is obviously better than anything proposed. So we'll have to revisit this discussion if/when there's another use-case that demands it. Regards, Anthony Liguori
[Qemu-devel] Re: EFI console stopped working in Qemu 0.14.0
vagran wrote: Hi, I have noted that Qemu VGA and serial console with EFI BIOS stopped working in 0.14.0 (and in latest development snapshot is still not working). Everything was fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was properly configured on used disk image. The only effect is that neither VGA nor serial console is not functioning. After short investigation I have discovered that this functionality was broken by this commit: commit 9bb3358627d87d8de25fb41b7276575539d799a7 Author: Isaku Yamahata Date: Fri Nov 19 18:56:02 2010 +0900 Do you have any idea how this change could affect EFI consoles? After further investigation I have found that the following patch provides a workaround for the problem, may be it could be useful for somebody who is more familiar with Qemu PCI code: diff --git a/hw/pci.c b/hw/pci.c index 8b76cea..06dd7ab 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -162,9 +162,11 @@ void pci_device_reset(PCIDevice *dev) pci_update_irq_status(dev); pci_device_deassert_intx(dev); /* Clear all writeable bits */ +#if 0 pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, pci_get_word(dev->wmask + PCI_COMMAND) | pci_get_word(dev->w1cmask + PCI_COMMAND)); +#endif pci_word_test_and_clear_mask(dev->config + PCI_STATUS, pci_get_word(dev->wmask + PCI_STATUS) | pci_get_word(dev->w1cmask + PCI_STATUS)); Best regards, Artyom.
[Qemu-devel] [PATCH] fix offset for MMIO subpage access
When using a MMIO subpage not starting on a page boundary, the offset value given to the access handler is based on the start of the MMU page not on the subpage base. As a consequence, if you are mapping the same subpage sized MMIO device at different addresses, this is somewhat impractical and confusing since the same register will be called with different "offset" depending on the base address. My proposal is to workaround this by recording the offset in region_offset field. Signed-off-by: Vincent Palatin --- exec.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/exec.c b/exec.c index d611100..b59e7c9 100644 --- a/exec.c +++ b/exec.c @@ -2626,6 +2626,7 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr, CHECK_SUBPAGE(addr, start_addr, start_addr2, end_addr, end_addr2, need_subpage); if (need_subpage) { +region_offset -= (start_addr & ~TARGET_PAGE_MASK); if (!(orig_memory & IO_MEM_SUBPAGE)) { subpage = subpage_init((addr & TARGET_PAGE_MASK), &p->phys_offset, orig_memory, @@ -2658,6 +2659,7 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr, end_addr2, need_subpage); if (need_subpage) { +region_offset -= (start_addr & ~TARGET_PAGE_MASK); subpage = subpage_init((addr & TARGET_PAGE_MASK), &p->phys_offset, IO_MEM_UNASSIGNED, addr & TARGET_PAGE_MASK); -- 1.7.3.1
Re: [Qemu-devel] Memory Map
Hi, On Wed, Mar 2, 2011 at 12:11, Salvatore Lionetti wrote: > Still now, some memory region is called with base+offset. > > So: > > [0x204] <= value (write from uP register) > cause > read(opaque, offset=204, value) > > while > [0x504] <= value (write from uP register) > cause > read(opaque, offset=4, value) > > The two opaque are different as expected. > > Where i am wrong? If you mean 0x5004 and not 0x504 (as stated in your previous email), IMO it is a current limitation of the Qemu feature called "subpage" (which is used when you are mapping a memory area smaller than the MMU page size as in your example). When using subpages in the current code, the "offset" becomes the distance to the MMU page boundary instead of the distance to the base address of the peripheral. This is somewhat impractical and confusing when you are mapping the same subpage sized MMIO device at different addresses. As the emulation I'm working on has a lot of subpage sized peripherals, I have written a patch to workaround this limitation. I will send it to the list for comment if you want to try it. -- Vincent
Re: [Qemu-devel] Re: [PATCH] moving eeprom initialization
On Wed, Mar 2, 2011 at 7:28 PM, Gerhard Wiesinger wrote: > Your patch should be based on fixes for correct EEPROM initialization, see > for details: http://www.mail-archive.com/qemu-devel@nongnu.org/msg56414.html This patch is not yet integrated upstream. I will correct it if needed. -- William
[Qemu-devel] EFI console stopped working in Qemu 0.14.0
Hi, I have noted that Qemu VGA and serial console with EFI BIOS stopped working in 0.14.0 (and in latest development snapshot is still not working). Everything was fine in 0.13.0. However EFI BIOS itself is able to load kernel if it was properly configured on used disk image. The only effect is that neither VGA nor serial console is not functioning. After short investigation I have discovered that this functionality was broken by this commit: commit 9bb3358627d87d8de25fb41b7276575539d799a7 Author: Isaku Yamahata Date: Fri Nov 19 18:56:02 2010 +0900 Do you have any idea how this change could affect EFI consoles? -- Best regards, Artyom.
[Qemu-devel] Re: [PATCH v3 uq/master 00/22] Win32 iothread support
On Wed, Mar 02, 2011 at 04:36:34PM -0300, Marcelo Tosatti wrote: > On Wed, Mar 02, 2011 at 08:03:42PM +0100, Jan Kiszka wrote: > > On 2011-03-02 19:43, Marcelo Tosatti wrote: > > > On Tue, Mar 01, 2011 at 02:35:56PM +0200, Avi Kivity wrote: > > >> On 02/28/2011 04:05 PM, Paolo Bonzini wrote: > > >>> On 02/28/2011 01:13 PM, Avi Kivity wrote: > > > > > > > If there's a git tree of this I'll be happy to do an autotest run. > > >>> > > >>> Sure, it's branch iothread-win32 of git://github.com/bonzini/qemu.git > > >> > > >> Fails on Fedora 9 i386 install, hangs right after "Performing post > > >> install configuration...". The guest is processing interrupts but > > >> the mouse won't move, and it doesn't make progress. > > >> > > >> Configured with --enable-io-thread. Perhaps the problem exists even > > >> before the patchset. > > > > > > Probably unrelated, looks similar to the regression seen with qemu-kvm. > > > > > > > Do these patches change some behavior or not? > > Yes, they change some behaviour. Autotest fails. Sorry, i misunderstood. I don't think these patches change any behaviour. Avi's failure case is similar to what i've seen earlier upon qemu->qemu-kvm merge. Conclusion is there is no new regression introduced by these patches. > > > Is it this same effect you saw with my qemu-kvm queue? > > Yes. >
[Qemu-devel] Re: [PATCH v3 uq/master 00/22] Win32 iothread support
On Wed, Mar 02, 2011 at 08:03:42PM +0100, Jan Kiszka wrote: > On 2011-03-02 19:43, Marcelo Tosatti wrote: > > On Tue, Mar 01, 2011 at 02:35:56PM +0200, Avi Kivity wrote: > >> On 02/28/2011 04:05 PM, Paolo Bonzini wrote: > >>> On 02/28/2011 01:13 PM, Avi Kivity wrote: > > > > If there's a git tree of this I'll be happy to do an autotest run. > >>> > >>> Sure, it's branch iothread-win32 of git://github.com/bonzini/qemu.git > >> > >> Fails on Fedora 9 i386 install, hangs right after "Performing post > >> install configuration...". The guest is processing interrupts but > >> the mouse won't move, and it doesn't make progress. > >> > >> Configured with --enable-io-thread. Perhaps the problem exists even > >> before the patchset. > > > > Probably unrelated, looks similar to the regression seen with qemu-kvm. > > > > Do these patches change some behavior or not? Yes, they change some behaviour. Autotest fails. > Is it this same effect you saw with my qemu-kvm queue? Yes.
[Qemu-devel] Re: [PATCH v3 uq/master 00/22] Win32 iothread support
On 2011-03-02 19:43, Marcelo Tosatti wrote: > On Tue, Mar 01, 2011 at 02:35:56PM +0200, Avi Kivity wrote: >> On 02/28/2011 04:05 PM, Paolo Bonzini wrote: >>> On 02/28/2011 01:13 PM, Avi Kivity wrote: > If there's a git tree of this I'll be happy to do an autotest run. >>> >>> Sure, it's branch iothread-win32 of git://github.com/bonzini/qemu.git >> >> Fails on Fedora 9 i386 install, hangs right after "Performing post >> install configuration...". The guest is processing interrupts but >> the mouse won't move, and it doesn't make progress. >> >> Configured with --enable-io-thread. Perhaps the problem exists even >> before the patchset. > > Probably unrelated, looks similar to the regression seen with qemu-kvm. > Do these patches change some behavior or not? Is it this same effect you saw with my qemu-kvm queue? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] Re: [PATCH v3 uq/master 00/22] Win32 iothread support
On Tue, Mar 01, 2011 at 02:35:56PM +0200, Avi Kivity wrote: > On 02/28/2011 04:05 PM, Paolo Bonzini wrote: > >On 02/28/2011 01:13 PM, Avi Kivity wrote: > >>> > >> > >>If there's a git tree of this I'll be happy to do an autotest run. > > > >Sure, it's branch iothread-win32 of git://github.com/bonzini/qemu.git > > Fails on Fedora 9 i386 install, hangs right after "Performing post > install configuration...". The guest is processing interrupts but > the mouse won't move, and it doesn't make progress. > > Configured with --enable-io-thread. Perhaps the problem exists even > before the patchset. Probably unrelated, looks similar to the regression seen with qemu-kvm. Agree this is no uq/master material. Please send directly.
Re: [Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
On 2 March 2011 18:36, Stefan Weil wrote: > No. I dont't think that the third parameter of bitmap_clear is > ok like that. See my patch for the correct value. Wen's patch: +const size_t width = ds_get_width(vd->ds) / 16; [...] - bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); - bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); + bitmap_set(width_mask, 0, width); + bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width); Your patch: bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); + (VNC_MAX_WIDTH - ds_get_width(vd->ds)) / 16); Since ui/vnc.h has: #define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * BITS_PER_LONG)) the third parameter to bitmap_clear is the same value in both cases, isn't it? Or is this a rounding bug? -- PMM
Re: [Qemu-devel] Re: [PATCH] moving eeprom initialization
Hello, Your patch should be based on fixes for correct EEPROM initialization, see for details: http://www.mail-archive.com/qemu-devel@nongnu.org/msg56414.html Ciao, Gerhard -- http://www.wiesinger.com/ On Wed, 2 Mar 2011, William Dauchy wrote: On Wed, Mar 2, 2011 at 2:36 PM, William Dauchy wrote: The initialization should not be only on reset but also when initializing the device. It resolves a bug when hot plugging a pci network device: the mac address was always null. --- hw/pcnet.c | 27 ++- hw/rtl8139.c | 24 2 files changed, 26 insertions(+), 25 deletions(-) diff --git a/hw/pcnet.c b/hw/pcnet.c index db52dc5..4e30e9c 100644 --- a/hw/pcnet.c +++ b/hw/pcnet.c @@ -1557,19 +1557,6 @@ uint32_t pcnet_bcr_readw(PCNetState *s, uint32_t rap) void pcnet_h_reset(void *opaque) { PCNetState *s = opaque; - int i; - uint16_t checksum; - - /* Initialize the PROM */ - - memcpy(s->prom, s->conf.macaddr.a, 6); - s->prom[12] = s->prom[13] = 0x00; - s->prom[14] = s->prom[15] = 0x57; - - for (i = 0,checksum = 0; i < 16; i++) - checksum += s->prom[i]; - *(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); - s->bcr[BCR_MSRDA] = 0x0005; s->bcr[BCR_MSWRA] = 0x0005; @@ -1736,6 +1723,9 @@ void pcnet_common_cleanup(PCNetState *d) int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) { + int i; + uint16_t checksum; + s->poll_timer = qemu_new_timer(vm_clock, pcnet_poll_timer, s); qemu_macaddr_default_if_unset(&s->conf.macaddr); @@ -1744,5 +1734,16 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) add_boot_device_path(s->conf.bootindex, dev, "/ethernet-phy@0"); + /* Initialize the PROM */ + + memcpy(s->prom, s->conf.macaddr.a, 6); + s->prom[12] = s->prom[13] = 0x00; + s->prom[14] = s->prom[15] = 0x57; + + for (i = 0, checksum = 0; i < 16; i++) { + checksum += s->prom[i]; + } + *(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); + return 0; } diff --git a/hw/rtl8139.c b/hw/rtl8139.c index a22530c..8356d5a 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -1189,18 +1189,6 @@ static void rtl8139_reset(DeviceState *d) rtl8139_update_irq(s); - /* prepare eeprom */ - s->eeprom.contents[0] = 0x8129; -#if 1 - // PCI vendor and device ID should be mirrored here - s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; - s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; -#endif - - s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; - s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; - s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; - /* mark all status registers as owned by host */ for (i = 0; i < 4; ++i) { @@ -3392,6 +3380,18 @@ static int pci_rtl8139_init(PCIDevice *dev) qemu_macaddr_default_if_unset(&s->conf.macaddr); + /* prepare eeprom */ + s->eeprom.contents[0] = 0x8129; +#if 1 + /* PCI vendor and device ID should be mirrored here */ + s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; + s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; +#endif + + s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; + s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; + s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; + s->nic = qemu_new_nic(&net_rtl8139_info, &s->conf, dev->qdev.info->name, dev->qdev.id, s); qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a); -- William hm I just noticed your correction here http://lists.gnu.org/archive/html/qemu-devel/2011-02/msg03232.html ; sorry Anyway, I tested my version and it works fine. Regards, -- William
[Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
Am 02.03.2011 11:57, schrieb Corentin Chary: On Wed, Mar 2, 2011 at 3:58 AM, Wen Congyang wrote: This bug is reported by Stefan Weil: Commit bc2429b9174ac2d3c56b7fd35884b0d89ec7fb02 introduced a severe bug (heap corruption). bitmap_clear was called with a wrong argument which caused out-of-bound writes to width_mask. This bug was detected with QEMU running on windows. It also occurs with wine: *** stack smashing detected ***: terminated wine: Unhandled illegal instruction at address 0x6115c7 (thread 0009), starting debugger... The bug is not windows specific! The third argument of bitmap_clear() is number of bits to be cleared, but we pass the end bits to be cleared to bitmap_clear(). Signed-off-by: Wen Congyang Reported-by: Stefan Weil Acked-by: Corentin Chary No. I dont't think that the third parameter of bitmap_clear is ok like that. See my patch for the correct value. My own patch is also incomplete, so I'll send an update. Stefan --- ui/vnc.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/ui/vnc.c b/ui/vnc.c index e3761b0..e7d0b5b 100644 --- a/ui/vnc.c +++ b/ui/vnc.c @@ -2390,6 +2390,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd) unsigned long width_mask[VNC_DIRTY_WORDS]; VncState *vs; int has_dirty = 0; +const size_t width = ds_get_width(vd->ds) / 16; struct timeval tv = { 0, 0 }; @@ -2403,9 +2404,8 @@ static int vnc_refresh_server_surface(VncDisplay *vd) * Check and copy modified bits from guest to server surface. * Update server dirty map. */ -bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); -bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), - VNC_DIRTY_WORDS * BITS_PER_LONG); +bitmap_set(width_mask, 0, width); +bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width); cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds); guest_row = vd->guest.ds->data; server_row = vd->server->data; -- 1.7.1
Re: [Qemu-devel] Re: OMAP3 bug: disassembler disagreed with translator
Good day! Right now it works, you can see source at the http://gitorious.org/droid/qemu Don't see at the hw/motorola.c - here is only stub yet, not with real hardware, i'm experimenting with it. But omap3* files works ok with every other hardware. Here you can see examples how it works: https://www.droid-developers.org/wiki/QEMU It works ok, but somewhere at the middle of bootrom chain it also rewrite low bootrom location. (see attached trace files). For this case i'm only change register for jump to the high bootrom location, but i think it need to be more smart. Also i'm dont know where I can add support for Secure World/Services/Any other early TrustZone staff + efuse driver. Best regards, Anton Kochkov. On Fri, Feb 18, 2011 at 15:39, Peter Maydell wrote: > On 18 February 2011 12:24, wrote: >> On Feb 18, 2011, at 14:08, ext Антон Кочков wrote: >> >>> Here also modified omap3_boot.c file. I'm dont know why it is >>> successfully load bootrom at 0x40014000, but cant do this for 0x14000 >>> - it always fill by zeroes. >>> I'm only need copy memory from already loaded rom image from >>> 0x40014000 to 0x00014000 and nothing more. >> >> Our OMAP3 boot emulation is a very simplified model. For example, >> the boot ROM is never mapped to Q0 region, only Q1 region -- >> the boot ROM execution is started at 0x40014000 rather than >> 0x00014000. There is no memory mapped at Q0 unless you attach >> something to the GPMC before reset. > > However most boards do attach something to the GPMC; > the Beagle puts a NAND device on GPMC CS0, which means that > the GPMC will try to map NAND read/write functions on address > 0 [*]. If you're trying to map a boot ROM there as well this will > clash and you need to do something clever to work however > the hardware does (presumably mapping the ROM there at startup > and unmapping the ROM later in response to some undocumented > event). > > [*] Only true in the version of omap_gpmc.c in the meego > tree, but if you're doing omap3 stuff then I assume you're > using that, not upstream. > > -- PMM > qemu.log.gz Description: GNU Zip compressed data omap3430_boot_rom.log.gz Description: GNU Zip compressed data qemu-crash.log.gz Description: GNU Zip compressed data
[Qemu-devel] [PATCH] target-arm: Set carry flag correctly for Thumb2 ORNS
The code for Thumb2 ORNS (or negated and set flags) was trashing a TCG input register which was needed later for use in calculating flags, with the effect that the carry flag was always set with the wrong sense. Fix this by using a TCG temporary instead. Signed-off-by: Peter Maydell --- target-arm/translate.c | 11 +-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index dbd958b..8f4e16b 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -7326,10 +7326,17 @@ gen_thumb2_data_op(DisasContext *s, int op, int conds, uint32_t shifter_out, TCG logic_cc = conds; break; case 3: /* orn */ -tcg_gen_not_i32(t1, t1); -tcg_gen_or_i32(t0, t0, t1); +{ +/* We can't just invert t1 in place because we might need it + * to calculate the carry flag later. + */ +TCGv tmp = tcg_temp_new_i32(); +tcg_gen_not_i32(tmp, t1); +tcg_gen_or_i32(t0, t0, tmp); +tcg_temp_free_i32(tmp); logic_cc = conds; break; +} case 4: /* eor */ tcg_gen_xor_i32(t0, t0, t1); logic_cc = conds; -- 1.7.1
Re: [Qemu-devel] Memory Map
Hi, many thanks for your response. Now i'm i've avoided the unregistering stuff, map done already at desired address space. Still now, some memory region is called with base+offset. So: [0x204] <= value (write from uP register) cause read(opaque, offset=204, value) while [0x504] <= value (write from uP register) cause read(opaque, offset=4, value) The two opaque are different as expected. Where i am wrong? --- Ven 25/2/11, Blue Swirl ha scritto: > Da: Blue Swirl > Oggetto: Re: [Qemu-devel] Memory Map > A: "Salvatore Lionetti" > Cc: qemu-devel@nongnu.org > Data: Venerdì 25 febbraio 2011, 17:10 > On Thu, Feb 24, 2011 at 11:08 AM, > Salvatore Lionetti > > wrote: > > Hi, > > > > This is what my board do > > > > cpu_register_physical_memory(0, 128*1024*1024, ...) > > cpu_register_physical_memory(0xFF80, 8*1024*1024, > ...) > > > > and this layout does not change over the entire live > (virtual) of the board. > > > > For the following offset (1st column) and size in > bytes (2nd column) > > {0x00, 512}, > > {0x000200, 16}, > > {0x000300, 32}, > > {0x000400, 32}, > > {0x000500, 64}, > > {0x000600, 64}, > > {0x000700, 128}, > > {0x000800, 30}, > > {0x000900, 256}, > > {0x000A00, 44}, > > {0x000B00, 256}, > > {0x000C00, 24}, > > {0x000F00, 20}, > > {0x001000, 20}, > > {0x001100, 20}, > > {0x001400, 168}, > > {0x001800, 24}, > > {0x002000, 4096}, > > {0x003000, 24}, > > {0x003100, 24}, > > {0x004500, 36}, > > {0x005000, 224}, > > {0x008000, 768}, > > {0x008300, 16}, > > > > i do, for each item, > > > > a = cpu_register_io_memory(r, w, o, > DEVICE_NATIVE_ENDIAN) > > cpu_register_physical_memory(_base+offset, len, a) > > > > And _base could be reprogrammed at any time. So before > to change _base i: > > > > cpu_unregister_io_memory(a) > > > > What i see is that accessing to _base+ > > _base+0x005000 => Wake up r/w with offset 0 > > _base+0x000204 => Wake up r/w with offset 0x204 > > > > So the question > > - Am i wrong something? > > cpu_unregister_io_memory() is the counterpart of > cpu_register_io_memory(), it does not affect mappings > created by > cpu_register_physical_memory(). They should be removed > first. > > > - Is possible to map address with last > TARGET_PAGE_BITS (es 0x200) bits set? > > Yes. >
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/01/2011 05:51 PM, Anthony Liguori wrote: Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guests config. libvirt is right and the monitor is wrong. On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there. It's up to the hardware vendor. Since it's ACPI, it can result in any number of operations. Usually, there's some logic to flip on an LED or something. There's nothing that prevents a vendor from ejecting the card. My point is that there aren't cleanly separated lines in the real world. We can implement out virtual hardware like real hardware, or we can do some new stuff, and break our management model in the process. Unless I'm hallucinating, you're suggesting quite a bit more. A revolution in how qemu is to be managed. Let me take another route to see if I can't persuade you. First, let's clarify your proposal. You want to introduce a new block format that references to block devices. It may also store a dirty bitmap to keep track of which blocks are out of sync. Hopefully, it goes without saying that the dirty bitmap is strictly optional (it's a performance optimization) so let's ignore it. (as was related elsewhere, the state is also optional) Your format, as a text file, looks like: [raid1] primary=diska.img secondary=diskb.img active=primary To use it, here's the sequence: 0) qemu uses disk A for a block device 1) create a raid1 block device pointing to disk A and disk B. 2) management tool asks qemu to us the new raid1 block device. 3) qemu acks (2) 4) at some point, the mirror completes, writes are going to both disks 5) qemu sends out an event indicating that the disks are in sync 6) management tool then sends a command to fail over to disk B 7) qemu acks (6) We're making the management tool the "authoritative" source of how to launch QEMU. That means that the management tool ultimately determines which command line to relaunch QEMU with. Here are the races: A) If QEMU crashes between (2) and (3), it may have issues a write to the new raid1 block device before the management tool sees (3). If this happens, when the management tool restarts QEMU with disk A, we're left with a dangling raid1 block device. Not a critical failure, but not ideal. You can restart qemu with the RAID1 blockdev. B) If QEMU crashes between (6) and (7), QEMU may have started writing to disk B before the management tool sees (7). This means that the management tool will create the guest with the raid1 block device which no longer is the correct disk. This could fail in subtly bad ways. Depending on how read is implemented (if you try to do striping for instance), bad data could be returned. You could try to implement a policy of always reading from B if the block has been copied but this gets harry really quickly. It's definitely not RAID1 anymore. As related elsewhere, you restart qemu with image B. The trick is to partition the problem into idempotent commands; these allow you to recover from any failure. You may observe that the problem is not the RAID1 mechanism, but changing from using a normal device and the RAID1 mechanism. It would then be wise to say, let's always use this image format. Since that eliminates the race, we don't really need the copy bitmap anymore. Now we're left with a simple format that just refers to two filenames. However, block devices are more than just a filename. It needs a format, cache settings, etc. So let's put this all in the RAID1 block format. We also need a way to indicate which block device is selected. Let's make it a text file for purposes of discussion. It will look something like: [primary] filename=diska.img cache=none format=raw [secondary] filename=diskb.img cache=writethrough format=qcow2 [global] active=primary Since we might want to mirror multiple drives at once, we should probablyn support having multiple drives configured which means we need to not just have a single active entry, but an entry associated with a particular device. Or you have one file per RAID-1 image set. This is important because images are not associated with a qemu instance. You can hot-unplug an image from one qemu and hot-plug it into another. [drive "diskA"] filename=diska.img cache=none format=raw [drive "diskB"] filename=diskb.img cache=writethrough format=qcow2 [device "vda"] drive=diskB And this is exactly what I'm proposing. It's exactly what I'm opposing. Making qemu manage all this stuff. It's really the natural generalization of what you're proposing. So basically, the only differences are: 1) always use the new RAID1 format 2) drop the progress bitmap 3) support multi
[Qemu-devel] [PATCH] allow to load android binary
Hi, Android binary start with a weird elf program header : the first one is of size 0 pointing to NULL addr. Ignore LOAD program where MemSiz is 0. Elf file type is EXEC (Executable file) Entry point 0xb0001000 There are 5 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0xd4 0x 0xb000 0x0 0x0 R 0x1000 LOAD 0x001000 0xb0001000 0xb0001000 0x073d4 0x073d4 R E 0x1000 LOAD 0x009000 0xb0009000 0xb0009000 0x0068c 0x0969c RW 0x1000 GNU_STACK 0x00 0x 0x 0x0 0x0 RW 0 EXIDX 0x00801c 0xb000801c 0xb000801c 0x003b8 0x003b8 R 0x4 Section to Segment mapping: Segment Sections... 00 01 .text .rodata .ARM.extab .ARM.exidx 02 .preinit_array .init_array .fini_array .ctors .data.rel.ro .got .data .bss 03 04 .ARM.exidx >From 4d986b66e9ae04efeabde9ad73f60d3c2d6912f9 Mon Sep 17 00:00:00 2001 From: Matthieu CASTET Date: Wed, 2 Mar 2011 17:04:39 +0100 Subject: [PATCH] allow to load android binary Android binary start with a weird elf program header : the first one is of size 0 pointing to NULL addr. Ignore LOAD program where MemSiz is 0. Elf file type is EXEC (Executable file) Entry point 0xb0001000 There are 5 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0xd4 0x 0xb000 0x0 0x0 R 0x1000 LOAD 0x001000 0xb0001000 0xb0001000 0x073d4 0x073d4 R E 0x1000 LOAD 0x009000 0xb0009000 0xb0009000 0x0068c 0x0969c RW 0x1000 GNU_STACK 0x00 0x 0x 0x0 0x0 RW 0 EXIDX 0x00801c 0xb000801c 0xb000801c 0x003b8 0x003b8 R 0x4 Section to Segment mapping: Segment Sections... 00 01 .text .rodata .ARM.extab .ARM.exidx 02 .preinit_array .init_array .fini_array .ctors .data.rel.ro .got .data .bss 03 04 .ARM.exidx --- linux-user/elfload.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 33d776d..284f3be 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -1201,7 +1201,7 @@ static void load_elf_image(const char *image_name, int image_fd, amount of memory to handle that. */ loaddr = -1, hiaddr = 0; for (i = 0; i < ehdr->e_phnum; ++i) { -if (phdr[i].p_type == PT_LOAD) { +if (phdr[i].p_type == PT_LOAD && phdr[i].p_memsz) { abi_ulong a = phdr[i].p_vaddr; if (a < loaddr) { loaddr = a; @@ -1301,7 +1301,7 @@ static void load_elf_image(const char *image_name, int image_fd, for (i = 0; i < ehdr->e_phnum; i++) { struct elf_phdr *eppnt = phdr + i; -if (eppnt->p_type == PT_LOAD) { +if (eppnt->p_type == PT_LOAD && eppnt->p_memsz) { abi_ulong vaddr, vaddr_po, vaddr_ps, vaddr_ef, vaddr_em; int elf_prot = 0; -- 1.7.4.1
[Qemu-devel] Re: [PATCH] moving eeprom initialization
On Wed, Mar 2, 2011 at 2:36 PM, William Dauchy wrote: > The initialization should not be only on reset but also when initializing > the device. > It resolves a bug when hot plugging a pci network device: the mac address > was always null. > --- > hw/pcnet.c | 27 ++- > hw/rtl8139.c | 24 > 2 files changed, 26 insertions(+), 25 deletions(-) > > diff --git a/hw/pcnet.c b/hw/pcnet.c > index db52dc5..4e30e9c 100644 > --- a/hw/pcnet.c > +++ b/hw/pcnet.c > @@ -1557,19 +1557,6 @@ uint32_t pcnet_bcr_readw(PCNetState *s, uint32_t rap) > void pcnet_h_reset(void *opaque) > { > PCNetState *s = opaque; > - int i; > - uint16_t checksum; > - > - /* Initialize the PROM */ > - > - memcpy(s->prom, s->conf.macaddr.a, 6); > - s->prom[12] = s->prom[13] = 0x00; > - s->prom[14] = s->prom[15] = 0x57; > - > - for (i = 0,checksum = 0; i < 16; i++) > - checksum += s->prom[i]; > - *(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); > - > > s->bcr[BCR_MSRDA] = 0x0005; > s->bcr[BCR_MSWRA] = 0x0005; > @@ -1736,6 +1723,9 @@ void pcnet_common_cleanup(PCNetState *d) > > int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) > { > + int i; > + uint16_t checksum; > + > s->poll_timer = qemu_new_timer(vm_clock, pcnet_poll_timer, s); > > qemu_macaddr_default_if_unset(&s->conf.macaddr); > @@ -1744,5 +1734,16 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, > NetClientInfo *info) > > add_boot_device_path(s->conf.bootindex, dev, "/ethernet-phy@0"); > > + /* Initialize the PROM */ > + > + memcpy(s->prom, s->conf.macaddr.a, 6); > + s->prom[12] = s->prom[13] = 0x00; > + s->prom[14] = s->prom[15] = 0x57; > + > + for (i = 0, checksum = 0; i < 16; i++) { > + checksum += s->prom[i]; > + } > + *(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); > + > return 0; > } > diff --git a/hw/rtl8139.c b/hw/rtl8139.c > index a22530c..8356d5a 100644 > --- a/hw/rtl8139.c > +++ b/hw/rtl8139.c > @@ -1189,18 +1189,6 @@ static void rtl8139_reset(DeviceState *d) > > rtl8139_update_irq(s); > > - /* prepare eeprom */ > - s->eeprom.contents[0] = 0x8129; > -#if 1 > - // PCI vendor and device ID should be mirrored here > - s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; > - s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; > -#endif > - > - s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; > - s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; > - s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; > - > /* mark all status registers as owned by host */ > for (i = 0; i < 4; ++i) > { > @@ -3392,6 +3380,18 @@ static int pci_rtl8139_init(PCIDevice *dev) > > qemu_macaddr_default_if_unset(&s->conf.macaddr); > > + /* prepare eeprom */ > + s->eeprom.contents[0] = 0x8129; > +#if 1 > + /* PCI vendor and device ID should be mirrored here */ > + s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; > + s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; > +#endif > + > + s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; > + s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; > + s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; > + > s->nic = qemu_new_nic(&net_rtl8139_info, &s->conf, > dev->qdev.info->name, dev->qdev.id, s); > qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a); > -- > William hm I just noticed your correction here http://lists.gnu.org/archive/html/qemu-devel/2011-02/msg03232.html ; sorry Anyway, I tested my version and it works fine. Regards, -- William
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/02/2011 08:00 AM, Avi Kivity wrote: On 03/02/2011 02:39 PM, Anthony Liguori wrote: Here is where your race is: 2. Management sends a switch command 3. QEMU receives switch command 4. QEMU stops doubling IO and switches to the destination 5. QEMU sends acknowledgement of switch command 6. Management receives acknowledge of switch command 7. Management changes internal state definition to reflect the new destination If QEMU or the management tool crashes after step 4 and before step 6, when the management tool restarts QEMU with the source image, data loss will have occurred (and potentially corruption if a flush had happened). No. After step 2, any qemu restart will be with the destination image. If the management tool restarts, it can query the state (or just re-issue the switch command, which is idempotent). Yeah, I think you're right, although I need to think through it a bit more. Regards, Anthony Liguori
[Qemu-devel] [PATCH] Don't allow multiwrites against a block device without underlying medium
If the block device has been closed, we no longer have a medium to submit IO against, check for this before submitting io. This prevents a segfault further in the code where we dereference elements of the block driver. Signed-off-by: Ryan Harper --- block.c |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 92dd3fe..534e1bc 100644 --- a/block.c +++ b/block.c @@ -2407,6 +2407,11 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) return 0; } +/* don't submit writes if we don't have a medium */ +if (bs->drv == NULL) { + return -1; +} + // Create MultiwriteCB structure mcb = qemu_mallocz(sizeof(*mcb) + num_reqs * sizeof(*mcb->callbacks)); mcb->num_requests = 0; -- 1.7.1 -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ry...@us.ibm.com
[Qemu-devel] [PATCH] Do not delete BlockDriverState when deleting the drive
When removing a drive from the host-side via drive_del we currently have the following path: drive_del qemu_aio_flush() bdrv_close() drive_uninit() bdrv_delete() When we bdrv_delete() we end up qemu_free()'ing the BlockDriverState pointer however, the block devices retain a copy of this pointer, see hw/virtio-blk.c:virtio_blk_init() where we s->bs = conf->bs. We now have a use-after-free situation. If the guest continues to issue IO against the device, and we've reallocated the memory that the BlockDriverState pointed at, then we will fail the bs->drv checks in the various bdrv_ methods. To resolve this issue as simply as possible, we can chose to not actually delete the BlockDriverState pointer. Since bdrv_close() handles setting the drv pointer to NULL, we just need to remove the BlockDriverState from the QLIST that is used to enumerate the block devices. This is currently handled within bdrv_delete, so move this into it's own function, bdrv_remove(). The result is that we can now invoke drive_del, this closes the file descriptors and sets BlockDriverState->drv to NULL which prevents futher IO to the device, and since we do not free BlockDriverState, we don't have to worry about the copy retained in the block devices. This state will be deleted if the guest is asked and responds to a pci device removal request. Reported-by: Markus Armbruster Signed-off-by: Ryan Harper --- block.c| 11 --- block.h|1 + blockdev.c |2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index f7d91a2..92dd3fe 100644 --- a/block.c +++ b/block.c @@ -697,14 +697,19 @@ void bdrv_close_all(void) } } +void bdrv_remove(BlockDriverState *bs) +{ +if (bs->device_name[0] != '\0') { +QTAILQ_REMOVE(&bdrv_states, bs, list); +} +} + void bdrv_delete(BlockDriverState *bs) { assert(!bs->peer); /* remove from list, if necessary */ -if (bs->device_name[0] != '\0') { -QTAILQ_REMOVE(&bdrv_states, bs, list); -} +bdrv_remove(bs); bdrv_close(bs); if (bs->file != NULL) { diff --git a/block.h b/block.h index 5d78fc0..8447397 100644 --- a/block.h +++ b/block.h @@ -66,6 +66,7 @@ int bdrv_create(BlockDriver *drv, const char* filename, QEMUOptionParameter *options); int bdrv_create_file(const char* filename, QEMUOptionParameter *options); BlockDriverState *bdrv_new(const char *device_name); +void bdrv_remove(BlockDriverState *bs); void bdrv_delete(BlockDriverState *bs); int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, diff --git a/blockdev.c b/blockdev.c index 0690cc8..1f57b50 100644 --- a/blockdev.c +++ b/blockdev.c @@ -760,7 +760,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) } /* clean up host side */ -drive_uninit(drive_get_by_blockdev(bs)); +bdrv_remove(bs); return 0; } -- 1.7.1 -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ry...@us.ibm.com
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/2011 07:18 AM, Jes Sorensen wrote: On 03/02/11 14:13, Michael Roth wrote: On 03/02/2011 04:19 AM, Jes Sorensen wrote: It is absolutely vital for me that we do not make things much more complicated for users with this move. I don't want to get into a situation where we start forcing external packages or daemons in order to run basic QEMU. If we start requiring such, we have failed! However, I think it is a reasonable compromise to have one daemon you launch, and then a simple client you launch straight after which will then provide the same/similar views and interfaces that users are used to when they launch current QEMU (monitor, vnc server etc). I think the challenge with this approach is defining an IPC mechanism that is robust enough to support it. For instance, on one end of the spectrum we have Spice, where shared memory paired with some mechanism for IO notification between the client/server might make sense. But then we have things like the human monitor where a socket interface or pipe is clearly more the more straight-forward route. We may find that it more desirable to have multiple classes of client components, each contain within a client process with it's own IPC mechanism. Although, multiple IPC mechanisms doesn't sound particularly enticing either...but 2 might not be so unreasonable. Hi Michael, I think we need two types for sure, even for the video case, we will still need a control channel as well. However, I don't think it is desirable to split things up more than we have to, so if we can keep it within one client process that is good. Maybe there are cases where it makes sense to split it into more processes, I could be convinced, but I think we really need to be careful making it too much of a complex mess either. Yup, if it's doable I'd prefer a single client process as well. Just hard to predict how difficult it'll be to support 2 or more mechanisms. Although, I'd imagine we'd end up with something like qemu's io loop, with event-driven shmem and fd-based work, which does seem doable. Cheers, Jes
[Qemu-devel] [PATCH] moving eeprom initialization
The initialization should not be only on reset but also when initializing the device. It resolves a bug when hot plugging a pci network device: the mac address was always null. --- hw/pcnet.c | 27 ++- hw/rtl8139.c | 24 2 files changed, 26 insertions(+), 25 deletions(-) diff --git a/hw/pcnet.c b/hw/pcnet.c index db52dc5..4e30e9c 100644 --- a/hw/pcnet.c +++ b/hw/pcnet.c @@ -1557,19 +1557,6 @@ uint32_t pcnet_bcr_readw(PCNetState *s, uint32_t rap) void pcnet_h_reset(void *opaque) { PCNetState *s = opaque; -int i; -uint16_t checksum; - -/* Initialize the PROM */ - -memcpy(s->prom, s->conf.macaddr.a, 6); -s->prom[12] = s->prom[13] = 0x00; -s->prom[14] = s->prom[15] = 0x57; - -for (i = 0,checksum = 0; i < 16; i++) -checksum += s->prom[i]; -*(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); - s->bcr[BCR_MSRDA] = 0x0005; s->bcr[BCR_MSWRA] = 0x0005; @@ -1736,6 +1723,9 @@ void pcnet_common_cleanup(PCNetState *d) int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) { +int i; +uint16_t checksum; + s->poll_timer = qemu_new_timer(vm_clock, pcnet_poll_timer, s); qemu_macaddr_default_if_unset(&s->conf.macaddr); @@ -1744,5 +1734,16 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) add_boot_device_path(s->conf.bootindex, dev, "/ethernet-phy@0"); +/* Initialize the PROM */ + +memcpy(s->prom, s->conf.macaddr.a, 6); +s->prom[12] = s->prom[13] = 0x00; +s->prom[14] = s->prom[15] = 0x57; + +for (i = 0, checksum = 0; i < 16; i++) { +checksum += s->prom[i]; +} +*(uint16_t *)&s->prom[12] = cpu_to_le16(checksum); + return 0; } diff --git a/hw/rtl8139.c b/hw/rtl8139.c index a22530c..8356d5a 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -1189,18 +1189,6 @@ static void rtl8139_reset(DeviceState *d) rtl8139_update_irq(s); -/* prepare eeprom */ -s->eeprom.contents[0] = 0x8129; -#if 1 -// PCI vendor and device ID should be mirrored here -s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; -s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; -#endif - -s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; -s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; -s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; - /* mark all status registers as owned by host */ for (i = 0; i < 4; ++i) { @@ -3392,6 +3380,18 @@ static int pci_rtl8139_init(PCIDevice *dev) qemu_macaddr_default_if_unset(&s->conf.macaddr); +/* prepare eeprom */ +s->eeprom.contents[0] = 0x8129; +#if 1 +/* PCI vendor and device ID should be mirrored here */ +s->eeprom.contents[1] = PCI_VENDOR_ID_REALTEK; +s->eeprom.contents[2] = PCI_DEVICE_ID_REALTEK_8139; +#endif + +s->eeprom.contents[7] = s->conf.macaddr.a[0] | s->conf.macaddr.a[1] << 8; +s->eeprom.contents[8] = s->conf.macaddr.a[2] | s->conf.macaddr.a[3] << 8; +s->eeprom.contents[9] = s->conf.macaddr.a[4] | s->conf.macaddr.a[5] << 8; + s->nic = qemu_new_nic(&net_rtl8139_info, &s->conf, dev->qdev.info->name, dev->qdev.id, s); qemu_format_nic_info_str(&s->nic->nc, s->conf.macaddr.a); -- William
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/11 14:13, Michael Roth wrote: > On 03/02/2011 04:19 AM, Jes Sorensen wrote: >> It is absolutely vital for me that we do not make things much more >> complicated for users with this move. I don't want to get into a >> situation where we start forcing external packages or daemons in order >> to run basic QEMU. If we start requiring such, we have failed! However, >> I think it is a reasonable compromise to have one daemon you launch, and >> then a simple client you launch straight after which will then provide >> the same/similar views and interfaces that users are used to when they >> launch current QEMU (monitor, vnc server etc). > > I think the challenge with this approach is defining an IPC mechanism > that is robust enough to support it. For instance, on one end of the > spectrum we have Spice, where shared memory paired with some mechanism > for IO notification between the client/server might make sense. But then > we have things like the human monitor where a socket interface or pipe > is clearly more the more straight-forward route. > > We may find that it more desirable to have multiple classes of client > components, each contain within a client process with it's own IPC > mechanism. Although, multiple IPC mechanisms doesn't sound particularly > enticing either...but 2 might not be so unreasonable. Hi Michael, I think we need two types for sure, even for the video case, we will still need a control channel as well. However, I don't think it is desirable to split things up more than we have to, so if we can keep it within one client process that is good. Maybe there are cases where it makes sense to split it into more processes, I could be convinced, but I think we really need to be careful making it too much of a complex mess either. Cheers, Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/2011 04:19 AM, Jes Sorensen wrote: On 02/28/11 18:44, Anthony Liguori wrote: On Feb 28, 2011 10:44 AM, "Jes Sorensen" wrote: Separating host-side virtagent and other tasks from core QEMU = To improve auditing of the core QEMU code, it would be ideal to be able to separate the core QEMU functionality from utility functionality by moving the utility functionality into its own process. This process will be referred to as the QEMU client below. Separating as in moving code outside of qemu.git, making code optionally built in, making code optionally built in or loadable as a separate executable that is automatically launched, or making code always built outside the main executable? I'm very nervous about having a large number of daemons necessary to run QEMU. I think a reasonable approach would be a single front-end daemond. Once QAPI is merged, there is a very incremental approach we can take for this sort of work. Take your favorite subsystem (like gdbstub or SDL) and make it only use QMP apis. Once we're only using QMP internally in a subsystem, then building it out of the main QEMU and using libqmp should be fairly easy. Hi Anthony, Back from a day off playing with power drills and concrete walls :) Sorry I should have made it a little more clear, it was obvious to me, but not written down: The idea is to keep everything as part of the QEMU package, ie. part of qemu.git. My idea is really to have one QEMU host daemon and one QEMU client, which provides the various services. You type make and you get two binaries instead of one. We could allow other daemons to connect to the host daemon, but that wouldn't be a primary target for this in my book, and I am not sure we really want to do this. It is absolutely vital for me that we do not make things much more complicated for users with this move. I don't want to get into a situation where we start forcing external packages or daemons in order to run basic QEMU. If we start requiring such, we have failed! However, I think it is a reasonable compromise to have one daemon you launch, and then a simple client you launch straight after which will then provide the same/similar views and interfaces that users are used to when they launch current QEMU (monitor, vnc server etc). I think the challenge with this approach is defining an IPC mechanism that is robust enough to support it. For instance, on one end of the spectrum we have Spice, where shared memory paired with some mechanism for IO notification between the client/server might make sense. But then we have things like the human monitor where a socket interface or pipe is clearly more the more straight-forward route. We may find that it more desirable to have multiple classes of client components, each contain within a client process with it's own IPC mechanism. Although, multiple IPC mechanisms doesn't sound particularly enticing either...but 2 might not be so unreasonable. I hope this clarifies things. Cheers, Jes
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/02/2011 02:39 PM, Anthony Liguori wrote: Here is where your race is: 2. Management sends a switch command 3. QEMU receives switch command 4. QEMU stops doubling IO and switches to the destination 5. QEMU sends acknowledgement of switch command 6. Management receives acknowledge of switch command 7. Management changes internal state definition to reflect the new destination If QEMU or the management tool crashes after step 4 and before step 6, when the management tool restarts QEMU with the source image, data loss will have occurred (and potentially corruption if a flush had happened). No. After step 2, any qemu restart will be with the destination image. If the management tool restarts, it can query the state (or just re-issue the switch command, which is idempotent). This all boils down to the Two Generals Problem[1]. It's simply not fixable without making one end reliable and that means that someone needs to fsync() something *after* the switchover happens but before the first write happens. That can be QEMU (Avi's RAID proposal and my state file proposal) or it can be the management tool (if we introduce synchronous events). The two problems are not equivalent. Once the management tool receives acknowledgement that the switch occurred, the protocol terminates. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] spice/qxl: locking fix for qemu-kvm
On Wed, Mar 02, 2011 at 02:32:03PM +0200, Alon Levy wrote: > From: Gerd Hoffmann Err, that "From" got there by mistake, and the title should of course not say "for qemu-kvm".. > > qxl needs to release the qemu lock before calling some libspice > functions (and re-aquire it later). In upstream qemu qxl can just > use qemu_mutex_{unlock,lock}_iothread. In qemu-kvm this doesn't > work, qxl needs additionally save+restore the cpu_single_env pointer > on unlock+lock. > > This fixes the following assertion in kvm_mutex_unlock that happened in the > released qemu-kvm 0.14.0 on gentoo when using spice's qxl device: > > > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724: > > kvm_mutex_unlock: Assertion `!cpu_single_env' failed. > > Happening as a result of io from the guest (qxl reset): > > (gdb) bt > > #0 0x75daa165 in raise () from /lib/libc.so.6 > > #1 0x75dab580 in abort () from /lib/libc.so.6 > > #2 0x75da3201 in __assert_fail () from /lib/libc.so.6 > > #3 0x00436f7e in kvm_mutex_unlock () > > at > > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724 > > #4 qemu_mutex_unlock_iothread () > > at > > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737 > > #5 0x005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0) > > at > > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665 > > #6 0x005e9f9a in ioport_write (opaque=0x15d3080, addr= > According to Jan, this bug (the wrong value for cpu_single_env) is also > present > in qemu, but no abort is triggered because it isn't asserted. > > Signed-off-by: Gerd Hoffmann > --- > hw/qxl.c | 37 + > ui/spice-display.c | 12 ++-- > ui/spice-display.h |6 ++ > 3 files changed, 41 insertions(+), 14 deletions(-) > > diff --git a/hw/qxl.c b/hw/qxl.c > index fe4212b..117f7c8 100644 > --- a/hw/qxl.c > +++ b/hw/qxl.c > @@ -125,6 +125,27 @@ static void qxl_reset_memslots(PCIQXLDevice *d); > static void qxl_reset_surfaces(PCIQXLDevice *d); > static void qxl_ring_set_dirty(PCIQXLDevice *qxl); > > +/* qemu-kvm locking ... */ > +void qxl_unlock_iothread(SimpleSpiceDisplay *ssd) > +{ > +if (cpu_single_env) { > +assert(ssd->env == NULL); > +ssd->env = cpu_single_env; > +cpu_single_env = NULL; > +} > +qemu_mutex_unlock_iothread(); > +} > + > +void qxl_lock_iothread(SimpleSpiceDisplay *ssd) > +{ > +qemu_mutex_lock_iothread(); > +if (ssd->env) { > +assert(cpu_single_env == NULL); > +cpu_single_env = ssd->env; > +ssd->env = NULL; > +} > +} > + > static inline uint32_t msb_mask(uint32_t val) > { > uint32_t mask; > @@ -662,10 +683,10 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) > dprint(d, 1, "%s: start%s\n", __FUNCTION__, > loadvm ? " (loadvm)" : ""); > > -qemu_mutex_unlock_iothread(); > +qxl_unlock_iothread(&d->ssd); > d->ssd.worker->reset_cursor(d->ssd.worker); > d->ssd.worker->reset_image_cache(d->ssd.worker); > -qemu_mutex_lock_iothread(); > +qxl_lock_iothread(&d->ssd); > qxl_reset_surfaces(d); > qxl_reset_memslots(d); > > @@ -795,9 +816,9 @@ static void qxl_reset_surfaces(PCIQXLDevice *d) > { > dprint(d, 1, "%s:\n", __FUNCTION__); > d->mode = QXL_MODE_UNDEFINED; > -qemu_mutex_unlock_iothread(); > +qxl_unlock_iothread(&d->ssd); > d->ssd.worker->destroy_surfaces(d->ssd.worker); > -qemu_mutex_lock_iothread(); > +qxl_lock_iothread(&d->ssd); > memset(&d->guest_surfaces.cmds, 0, sizeof(d->guest_surfaces.cmds)); > } > > @@ -866,9 +887,9 @@ static void qxl_destroy_primary(PCIQXLDevice *d) > dprint(d, 1, "%s\n", __FUNCTION__); > > d->mode = QXL_MODE_UNDEFINED; > -qemu_mutex_unlock_iothread(); > +qxl_unlock_iothread(&d->ssd); > d->ssd.worker->destroy_primary_surface(d->ssd.worker, 0); > -qemu_mutex_lock_iothread(); > +qxl_lock_iothread(&d->ssd); > } > > static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm) > @@ -938,10 +959,10 @@ static void ioport_write(void *opaque, uint32_t addr, > uint32_t val) > case QXL_IO_UPDATE_AREA: > { > QXLRect update = d->ram->update_area; > -qemu_mutex_unlock_iothread(); > +qxl_unlock_iothread(&d->ssd); > d->ssd.worker->update_area(d->ssd.worker, d->ram->update_surface, > &update, NULL, 0, 0); > -qemu_mutex_lock_iothread(); > +qxl_lock_iothread(&d->ssd); > break; > } > case QXL_IO_NOTIFY_CMD: > diff --git a/ui/spice-display.c b/ui/spice-display.c > index 020b423..defe652 100644 > --- a/ui/spice-display.c > +++ b/ui/spice-display.c > @@ -186,18 +186,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay > *ssd) > surface.mem= (i
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On Wed, Mar 02, 2011 at 01:04:58PM +0200, Dor Laor wrote: > On 03/02/2011 12:58 PM, Alon Levy wrote: > >On Wed, Mar 02, 2011 at 11:25:44AM +0100, Jes Sorensen wrote: > >>On 03/01/11 15:25, Dor Laor wrote: > >>>On 03/01/2011 02:40 PM, Anthony Liguori wrote: > > On Mar 1, 2011 7:07 AM, "Dor Laor" > Qemu is the one that should spawn them and they should be transparent > from the management. This way running qemu stays the same and qemu just > need to add the logic to get a SIGCHILD and potentially re-execute an > dead son process. > > Spice is the logical place to start, no? It's the largest single > dependency we have and it does some scary things with qemu_mutex. I > would use spice as a way to prove the concept. > >>> > >>>I agree it is desirable to the this for spice but it is allot more > >>>complex than virtagent isolation. Spice is performance sensitive and > >>>contains much more state. It needs to access the guest memory for > >>>reading the surfaces. It can be solved but needs some major changes. > >>>Adding spice-devel to the discussion. > >> > >>I had a few thoughts about this already, which I think will work for > >>both spice and vnc. What we could do is to expose the video memory via > >>shared memory. That way a spice or vnc daemon could get direct access to > >>the memory, this would limit communication to keyboard/mouse events, as > >>well as video mode info, and possibly notifications to the client about > >>which ranges of memory have been updated. > >> > >>Using shared memory this way should allow us to implement the video > >>clients without performance loss, in fact it should be beneficial since > >>it would allow them to run fully separate from the host daemon. > >> > > > >I think that would work well for spice. Spice uses shared memory from the > >pci device for both the framebuffer and surfaces/commands, but this is > > Is that the only DMA do you do? That's good for this model. > Yes. We have two memory bars (another for rom) and an io bar. The bars contain the framebuffer (primary surface backing store, also used in vga mode), the non primary surfaces backing store and the rings (display, cursor, and release). In qxl (native) mode the rings are read/written to all the time, the rest by demand. In vga mode it's identical to vnc, the framebuffer is written from the guest and we track the dirty regions. > >not really relevant at this level. What about IO and irq? that would add > >additional latencies, no? because each io exit would need to be ipc'ed over > >to the spice/vnc process? and same way in the other direction, requesting > >qemu to trigger an interrupt in the next vm entry. > > The qxl device can be in the privileged qemu (as a start) and it > will handle irqs directly. Even today you need to notify the spice > server thread, so nothing will change ok, so qxl remains in qemu process. But all those io exits (well, most) end up as calls to spice-server, now in another process. So latency wise it's the same. > > > >>Cheers, > >>Jes > >> > >
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/01/2011 03:59 AM, Dor Laor wrote: On 02/28/2011 08:12 PM, Anthony Liguori wrote: On Feb 28, 2011 11:47 AM, "Avi Kivity" mailto:a...@redhat.com>> wrote: > > On 02/28/2011 07:33 PM, Anthony Liguori wrote: >> >> >> > >> > You're just ignoring what I've written. >> >> No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem. >> >> There's a lot of good, reasonably straight forward changes we can make that have a high return on investment. >> > > Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low? I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything. Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guests config. To me, that's the definition of authoritative. > "No" to all three (ignoring for the moment whether it is good or not, which we were debating). > > >> The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places. >> > > No, you're suggesting a lot more than that. That's exactly what I'm suggesting from a technical perspective. >> I don't think this creates a fundamental break in how management tools interact with QEMU. I don't think introducing RAID support in the block layer is a reasonable alternative. >> >> > > Why not? Because its a lot of complexity and code that can go wrong while only solving the race for one specific case. Not to mention that we double the iop rate. > Something that avoids the whole state thing altogether: > > - instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy > - issue a notification to management > - management receives the notification, and issues an atomic blockdev switch command > this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap. It still suffers from the two generals problem. You cannot solve this without making one node reliable and that takes us back to it being either QEMU (posted event and state file) or the management tool (sync event). It is safe w/o a state file by changing the basic live copy algorithm: 1. Live copy in progress stage Once live copy command is issued, a dirty bit map is created for tracking. There is a single pass over the entire image where we copy blocks from the src to the dst. Write commands for blocks that were already copied will be done twice for the src and dst. Once the full copy single pass ends, we trigger a QMP event that this stage can end. The live copy stage keeps running till the management issue a switch command. When it will happen, the switch is immediate and no need to copy additional blocks (but flush pending IOs). 2. Management sends a switch command. Qemu stops the doubling the IO and switches to the destination. End. Here is where your race is: 2. Management sends a switch command 3. QEMU receives switch command 4. QEMU stops doubling IO and switches to the destination 5. QEMU sends acknowledgement of switch command 6. Management receives acknowledge of switch command 7. Management changes internal state definition to reflect the new destination If QEMU or the management tool crashes after step 4 and before step 6, when the management tool restarts QEMU with the source image, data loss will have occurred (and potentially corruption if a flush had happened). This all boils down to the Two Generals Problem[1]. It's simply not fixable without making one end reliable and that means that someone needs to fsync() something *after* the switchover happens but before the first write happens. That can be QEMU (Avi's RAID proposal and my state file proposal) or it can be the management tool (if we introduce synchronous events). [1] http://en.wikipedia.org/wiki/Two_Generals%27_Problem Regards, Anthony Liguori
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Wed, Mar 02, 2011 at 12:34:24PM +0100, Jan Kiszka wrote: > On 2011-03-02 11:56, Alon Levy wrote: > > On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote: > >> On 2011-03-01 13:58, Alon Levy wrote: > >>> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: > On 2011-02-27 20:03, Alon Levy wrote: > > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: > >> On 2011-02-26 12:43, xming wrote: > >>> When trying to start X (and it loads qxl driver) the kvm process just > >>> crashes. > > > > This is fixed by Gerd's attached patch (taken from rhel repository, > > don't know > > why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as > > well (separate email). > > Patch looks OK on first glance, but the changelog is misleading: This > was broken for _both_ trees, but upstream didn't detect the bug. > >>> > >>> So I didn't test with qemu not having this patch, but according to the > >>> discussion in the > >>> launchpad bug the problem only happens with qemu-kvm. This doesn't rule > >>> out it being a > >>> bug, perhaps it is just triggered much less frequently I guess. > >> > >> Again: qemu-kvm has the instrumentation to detect the bug, qemu is > >> lacking this, but both trees will break subtly if cpu_current_env is not > >> properly restored. > > > > ok, so what do you want to be done further before this patch is applied? > > The patch posted to qemu-devel just requires a changelog that correctly > reflects what it addresses (and where). Just sent, Alon > > Jan >
[Qemu-devel] [PATCH] spice/qxl: locking fix for qemu-kvm
From: Gerd Hoffmann qxl needs to release the qemu lock before calling some libspice functions (and re-aquire it later). In upstream qemu qxl can just use qemu_mutex_{unlock,lock}_iothread. In qemu-kvm this doesn't work, qxl needs additionally save+restore the cpu_single_env pointer on unlock+lock. This fixes the following assertion in kvm_mutex_unlock that happened in the released qemu-kvm 0.14.0 on gentoo when using spice's qxl device: > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724: > kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Happening as a result of io from the guest (qxl reset): > (gdb) bt > #0 0x75daa165 in raise () from /lib/libc.so.6 > #1 0x75dab580 in abort () from /lib/libc.so.6 > #2 0x75da3201 in __assert_fail () from /lib/libc.so.6 > #3 0x00436f7e in kvm_mutex_unlock () > at > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724 > #4 qemu_mutex_unlock_iothread () > at > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737 > #5 0x005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0) > at > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665 > #6 0x005e9f9a in ioport_write (opaque=0x15d3080, addr= --- hw/qxl.c | 37 + ui/spice-display.c | 12 ++-- ui/spice-display.h |6 ++ 3 files changed, 41 insertions(+), 14 deletions(-) diff --git a/hw/qxl.c b/hw/qxl.c index fe4212b..117f7c8 100644 --- a/hw/qxl.c +++ b/hw/qxl.c @@ -125,6 +125,27 @@ static void qxl_reset_memslots(PCIQXLDevice *d); static void qxl_reset_surfaces(PCIQXLDevice *d); static void qxl_ring_set_dirty(PCIQXLDevice *qxl); +/* qemu-kvm locking ... */ +void qxl_unlock_iothread(SimpleSpiceDisplay *ssd) +{ +if (cpu_single_env) { +assert(ssd->env == NULL); +ssd->env = cpu_single_env; +cpu_single_env = NULL; +} +qemu_mutex_unlock_iothread(); +} + +void qxl_lock_iothread(SimpleSpiceDisplay *ssd) +{ +qemu_mutex_lock_iothread(); +if (ssd->env) { +assert(cpu_single_env == NULL); +cpu_single_env = ssd->env; +ssd->env = NULL; +} +} + static inline uint32_t msb_mask(uint32_t val) { uint32_t mask; @@ -662,10 +683,10 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm) dprint(d, 1, "%s: start%s\n", __FUNCTION__, loadvm ? " (loadvm)" : ""); -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(&d->ssd); d->ssd.worker->reset_cursor(d->ssd.worker); d->ssd.worker->reset_image_cache(d->ssd.worker); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(&d->ssd); qxl_reset_surfaces(d); qxl_reset_memslots(d); @@ -795,9 +816,9 @@ static void qxl_reset_surfaces(PCIQXLDevice *d) { dprint(d, 1, "%s:\n", __FUNCTION__); d->mode = QXL_MODE_UNDEFINED; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(&d->ssd); d->ssd.worker->destroy_surfaces(d->ssd.worker); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(&d->ssd); memset(&d->guest_surfaces.cmds, 0, sizeof(d->guest_surfaces.cmds)); } @@ -866,9 +887,9 @@ static void qxl_destroy_primary(PCIQXLDevice *d) dprint(d, 1, "%s\n", __FUNCTION__); d->mode = QXL_MODE_UNDEFINED; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(&d->ssd); d->ssd.worker->destroy_primary_surface(d->ssd.worker, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(&d->ssd); } static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm) @@ -938,10 +959,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val) case QXL_IO_UPDATE_AREA: { QXLRect update = d->ram->update_area; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(&d->ssd); d->ssd.worker->update_area(d->ssd.worker, d->ram->update_surface, &update, NULL, 0, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(&d->ssd); break; } case QXL_IO_NOTIFY_CMD: diff --git a/ui/spice-display.c b/ui/spice-display.c index 020b423..defe652 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -186,18 +186,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) surface.mem= (intptr_t)ssd->buf; surface.group_id = MEMSLOT_GROUP_HOST; -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(ssd); ssd->worker->create_primary_surface(ssd->worker, 0, &surface); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(ssd); } void qemu_spice_destroy_host_primary(SimpleSpiceDisplay *ssd) { dprint(1, "%s:\n", __FUNCTION__); -qemu_mutex_unlock_iothread(); +qxl_unlock_iothread(ssd); ssd->worker->destroy_primary_surface(ssd->worker, 0); -qemu_mutex_lock_iothread(); +qxl_lock_iothread(ssd); } void qemu_spice_vm_chang
[Qemu-devel] [Request for inputs]Qemu parameters that need runtime change.
Hi, QEMU at present can be started with a huge list of parameters, and only a subset of these can be changed at runtime. For the remaining ones, one needs to restart the qemu instance. I've been trying to put together a list of some such parameters, which would make good candidates for a runtime change. Request inputs on more such parameters that could make it here, and also whether the following are good-to-have features: 1. Allowing a runtime change from one chardev backend to another ( Eg, from TCP socket to unix, and vice-versa ) 2. Changing the network interfaces (from -net user to -net tap ? ) I'm presently aware of these; it would be good to get more inputs on what more can be done here. -- Prerna Saxena Linux Technology Centre, IBM Systems and Technology Lab, Bangalore, India
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On 2011-03-02 11:56, Alon Levy wrote: > On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote: >> On 2011-03-01 13:58, Alon Levy wrote: >>> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: >> On 2011-02-26 12:43, xming wrote: >>> When trying to start X (and it loads qxl driver) the kvm process just >>> crashes. > > This is fixed by Gerd's attached patch (taken from rhel repository, don't > know > why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as > well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. >>> >>> So I didn't test with qemu not having this patch, but according to the >>> discussion in the >>> launchpad bug the problem only happens with qemu-kvm. This doesn't rule out >>> it being a >>> bug, perhaps it is just triggered much less frequently I guess. >> >> Again: qemu-kvm has the instrumentation to detect the bug, qemu is >> lacking this, but both trees will break subtly if cpu_current_env is not >> properly restored. > > ok, so what do you want to be done further before this patch is applied? The patch posted to qemu-devel just requires a changelog that correctly reflects what it addresses (and where). Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Wed, 2 Mar 2011 10:20:41 +, Stefan Hajnoczi wrote: > On Wed, Mar 2, 2011 at 5:05 AM, Aneesh Kumar K. V > wrote: > > On Tue, 1 Mar 2011 20:27:19 +, Stefan Hajnoczi > > wrote: > >> On Tue, Mar 1, 2011 at 6:02 PM, Aneesh Kumar K. V > >> wrote: > >> > On Tue, 1 Mar 2011 15:59:19 +, Stefan Hajnoczi > >> > wrote: > >> >> >> Please explain the semantics of P9_TSYNCFS. Won't returning success > >> >> >> without doing anything lead to data integrity issues? > >> >> > > >> >> > I should actually do the 9P Operation format as commit message. Will > >> >> > add in the next update. Whether returning here would cause a data > >> >> > integrity issue, it depends what sort of guarantee we want to > >> >> > provide. So calling sync on the guest will cause all the dirty pages > >> >> > in > >> >> > the guest to be flushed to host. Now all those changes are in the host > >> >> > page cache and it would be nice to flush them as a part of sync but > >> >> > then since we don't have a per file system sync, the above would imply > >> >> > we flush all dirty pages on the host which can result in large > >> >> > performance impact. > >> >> > >> >> You get the define the semantics of P9_TSYNCFS? I thought this is > >> >> part of a well-defined protocol :). If this is a .L extension then > >> >> it's probably a bad design and shouldn't be added to the protocol if > >> >> we can't implement it. > >> > > >> > It is a part of .L extension and we can definitely implement it. There > >> > is patch out there which is yet to be merged > >> > > >> > http://thread.gmane.org/gmane.linux.file-systems/44628 > >> > >> A future Linux-only ioctl :/. > >> > >> >> Is this operation supposed to flush the disk write cache too? > >> > > >> > I am not sure we need to specify that as a part of 9p operation. I guess > >> > we can only say maximum possible data integrity. Whether a sync will > >> > cause disk write cache flush depends on the file system. For ext* that > >> > can be controlled by mount option barrier. > >> > >> So on a host with a safe configuration this operation should put data > >> on stable storage? > >> > >> >> > >> >> I think virtio-9p has a file descriptor cache. Would it be possible > >> >> to fsync() those file descriptors? > >> >> > >> > > >> > Ideally we should. But that would involve a large number of fsync calls. > >> > >> Yep, that's why this is a weird operation to support, especially since > >> it's a .L add-on and not original 9P. What's the use-case since > >> today's Linux userland cannot directly make use of this operation? I > >> guess it has been added in order to pass-through a Linux internal vfs > >> super block sync function? > > > > IMHO it would be nice to have a syncfs 9p operation because that enables > > the client to say "if possible" flush the dirty data on the server. I > > guess we should consider this as something server can choose to > > ignore. In a cloud setup even doing a per file system sync can imply > > performance impact because VirtFS export may not 1:1 map to mount point > > on host. There is also plan to add a new option to the VirtFs export point > > which enable write to exported files to be either O_SYNC or > > O_DIRECT, similar to the way done for image files. That would imply > > Tsyncfs doesn't have much to do because we don't have dirty data on host > > pagecache anymore. > > > > So from 9p .L protocol point of view, it is a valid operation which > > enables client to request a flush of server cache if possible. And qemu > > 9p server choose to ignore because of the performance impact. If you are > > not comfortable with not doing anything specific in Tsyncfs > > operation, we can add sync(2) call as part of this 9p operation and > > later switch to FS_IOC_SYNCFS when it become available. > > The case we need to prevent is where applications running on virtfs > think they are getting guarantees that the implementation does not > provide. Silently noping a sync operation is a step in that direction > so I agree that sync(2) would be safer. I should capture as a part of the commit message that some servers can chose to ignore the request and return success. > > I'm not sure I understand your 1:1 mount point mapping argument. The > FS_IOC_SYNCFS ioctl does not help us there since it syncs the entire > filesystem, not the directory tree that virtfs is mapped to. It will > do a bunch of extra I/O similar to how sync(2) does this across all > filesystems today. This suggests again that P9_TSYNCFS is hard to > implement because FS_IOC_SYNCFS ends up not being useful. Did I miss > something? What i meant was that even with 1:1 mount point mapping some servers can choose to not implement cache flush due to the performance implication associated with TSYNCFS. (That argument also justifies Qemu 9p server ignoring the TSYNCFS request.) So from the client perspective it is just a hint to the server to flush the server cache "if possible". > > I'm looking for
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/2011 12:58 PM, Alon Levy wrote: On Wed, Mar 02, 2011 at 11:25:44AM +0100, Jes Sorensen wrote: On 03/01/11 15:25, Dor Laor wrote: On 03/01/2011 02:40 PM, Anthony Liguori wrote: On Mar 1, 2011 7:07 AM, "Dor Laor" Qemu is the one that should spawn them and they should be transparent from the management. This way running qemu stays the same and qemu just need to add the logic to get a SIGCHILD and potentially re-execute an dead son process. Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. I agree it is desirable to the this for spice but it is allot more complex than virtagent isolation. Spice is performance sensitive and contains much more state. It needs to access the guest memory for reading the surfaces. It can be solved but needs some major changes. Adding spice-devel to the discussion. I had a few thoughts about this already, which I think will work for both spice and vnc. What we could do is to expose the video memory via shared memory. That way a spice or vnc daemon could get direct access to the memory, this would limit communication to keyboard/mouse events, as well as video mode info, and possibly notifications to the client about which ranges of memory have been updated. Using shared memory this way should allow us to implement the video clients without performance loss, in fact it should be beneficial since it would allow them to run fully separate from the host daemon. I think that would work well for spice. Spice uses shared memory from the pci device for both the framebuffer and surfaces/commands, but this is Is that the only DMA do you do? That's good for this model. not really relevant at this level. What about IO and irq? that would add additional latencies, no? because each io exit would need to be ipc'ed over to the spice/vnc process? and same way in the other direction, requesting qemu to trigger an interrupt in the next vm entry. The qxl device can be in the privileged qemu (as a start) and it will handle irqs directly. Even today you need to notify the spice server thread, so nothing will change Cheers, Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/11 11:58, Alon Levy wrote: > On Wed, Mar 02, 2011 at 11:25:44AM +0100, Jes Sorensen wrote: >> I had a few thoughts about this already, which I think will work for >> both spice and vnc. What we could do is to expose the video memory via >> shared memory. That way a spice or vnc daemon could get direct access to >> the memory, this would limit communication to keyboard/mouse events, as >> well as video mode info, and possibly notifications to the client about >> which ranges of memory have been updated. >> >> Using shared memory this way should allow us to implement the video >> clients without performance loss, in fact it should be beneficial since >> it would allow them to run fully separate from the host daemon. >> > > I think that would work well for spice. Spice uses shared memory from the > pci device for both the framebuffer and surfaces/commands, but this is > not really relevant at this level. What about IO and irq? that would add > additional latencies, no? because each io exit would need to be ipc'ed over > to the spice/vnc process? and same way in the other direction, requesting > qemu to trigger an interrupt in the next vm entry. I am glad the shmem approach will work for Spice. There would be something there with IRQs etc, I agree. but there are methods to help that too, we could use a shared memory event ring for example. Cheers, Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/11 11:56, Dor Laor wrote: > On 03/02/2011 12:25 PM, Jes Sorensen wrote: >> On 03/01/11 15:25, Dor Laor wrote: >> Using shared memory this way should allow us to implement the video >> clients without performance loss, in fact it should be beneficial since >> it would allow them to run fully separate from the host daemon. > > Why do you call it a daemon? Each VM instance should have only one, the > 'host daemon' naming is misleading. I refer to it as a daemon because it is something the client(s) will connect to. But yes, there will be a daemon per VM. > The proper solution long term is to sandbox qemu in a way that there > privileged mode and non privileged mode. It might be implemented using > separate address space or not. Most operations like vnc/rpc/spice/usb > should be run with less privileges. > > The main issue is that doing it right will take time and we'll want > virt-agent be merged before the long term solution is ready. The best > approach would be gradual development Yes I agree, I don't think this will happen overnight, and blocking virtagent with this would be bad. Cheers, Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On Wed, Mar 02, 2011 at 11:25:44AM +0100, Jes Sorensen wrote: > On 03/01/11 15:25, Dor Laor wrote: > > On 03/01/2011 02:40 PM, Anthony Liguori wrote: > >> > >> On Mar 1, 2011 7:07 AM, "Dor Laor" >> > Qemu is the one that should spawn them and they should be transparent > >> from the management. This way running qemu stays the same and qemu just > >> need to add the logic to get a SIGCHILD and potentially re-execute an > >> dead son process. > >> > >> Spice is the logical place to start, no? It's the largest single > >> dependency we have and it does some scary things with qemu_mutex. I > >> would use spice as a way to prove the concept. > > > > I agree it is desirable to the this for spice but it is allot more > > complex than virtagent isolation. Spice is performance sensitive and > > contains much more state. It needs to access the guest memory for > > reading the surfaces. It can be solved but needs some major changes. > > Adding spice-devel to the discussion. > > I had a few thoughts about this already, which I think will work for > both spice and vnc. What we could do is to expose the video memory via > shared memory. That way a spice or vnc daemon could get direct access to > the memory, this would limit communication to keyboard/mouse events, as > well as video mode info, and possibly notifications to the client about > which ranges of memory have been updated. > > Using shared memory this way should allow us to implement the video > clients without performance loss, in fact it should be beneficial since > it would allow them to run fully separate from the host daemon. > I think that would work well for spice. Spice uses shared memory from the pci device for both the framebuffer and surfaces/commands, but this is not really relevant at this level. What about IO and irq? that would add additional latencies, no? because each io exit would need to be ipc'ed over to the spice/vnc process? and same way in the other direction, requesting qemu to trigger an interrupt in the next vm entry. > Cheers, > Jes >
[Qemu-devel] Re: [PATCH RESEND v2 1/2] fix vnc regression
On Wed, Mar 2, 2011 at 3:46 AM, Wen Congyang wrote: > This patch fix the following two regressions: > 1. we should use bitmap_set() and bitmap_clear() to replace vnc_set_bits(). > 2. The unit of bitmap_intersects()'third parameter is bit, not words. > But we pass the num of words to bitmap_intersects(). > > Changes from v1 to v2: > 1. fix the third argument of bitmap_clear() > > Signed-off-by: Wen Congyang Acked-by: Corentin Chary > --- > ui/vnc.c | 11 --- > 1 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/ui/vnc.c b/ui/vnc.c > index 610f884..e3761b0 100644 > --- a/ui/vnc.c > +++ b/ui/vnc.c > @@ -1645,17 +1645,21 @@ static void framebuffer_update_request(VncState *vs, > int incremental, > int x_position, int y_position, > int w, int h) > { > + int i; > + const size_t width = ds_get_width(vs->ds) / 16; > + > if (y_position > ds_get_height(vs->ds)) > y_position = ds_get_height(vs->ds); > if (y_position + h >= ds_get_height(vs->ds)) > h = ds_get_height(vs->ds) - y_position; > > - int i; > vs->need_update = 1; > if (!incremental) { > vs->force_update = 1; > for (i = 0; i < h; i++) { > - bitmap_set(vs->dirty[y_position + i], x_position / 16, w / 16); > + bitmap_set(vs->dirty[y_position + i], 0, width); > + bitmap_clear(vs->dirty[y_position + i], width, > + VNC_DIRTY_WORDS * BITS_PER_LONG - width); > } > } > } > @@ -2406,7 +2410,8 @@ static int vnc_refresh_server_surface(VncDisplay *vd) > guest_row = vd->guest.ds->data; > server_row = vd->server->data; > for (y = 0; y < vd->guest.ds->height; y++) { > - if (bitmap_intersects(vd->guest.dirty[y], width_mask, > VNC_DIRTY_WORDS)) { > + if (bitmap_intersects(vd->guest.dirty[y], width_mask, > + VNC_DIRTY_WORDS * BITS_PER_LONG)) { > int x; > uint8_t *guest_ptr; > uint8_t *server_ptr; > -- > 1.7.1 > -- Corentin Chary http://xf.iksaif.net
[Qemu-devel] Re: [PATCH RESEND 2/2] vnc: Fix heap corruption
On Wed, Mar 2, 2011 at 3:58 AM, Wen Congyang wrote: > This bug is reported by Stefan Weil: > > Commit bc2429b9174ac2d3c56b7fd35884b0d89ec7fb02 introduced > a severe bug (heap corruption). > > bitmap_clear was called with a wrong argument > which caused out-of-bound writes to width_mask. > > This bug was detected with QEMU running on windows. > It also occurs with wine: > > *** stack smashing detected ***: terminated > wine: Unhandled illegal instruction at address 0x6115c7 (thread 0009), > starting debugger... > > The bug is not windows specific! > > > The third argument of bitmap_clear() is number of bits to be cleared, but we > pass > the end bits to be cleared to bitmap_clear(). > > Signed-off-by: Wen Congyang > Reported-by: Stefan Weil Acked-by: Corentin Chary > --- > ui/vnc.c | 6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/ui/vnc.c b/ui/vnc.c > index e3761b0..e7d0b5b 100644 > --- a/ui/vnc.c > +++ b/ui/vnc.c > @@ -2390,6 +2390,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd) > unsigned long width_mask[VNC_DIRTY_WORDS]; > VncState *vs; > int has_dirty = 0; > + const size_t width = ds_get_width(vd->ds) / 16; > > struct timeval tv = { 0, 0 }; > > @@ -2403,9 +2404,8 @@ static int vnc_refresh_server_surface(VncDisplay *vd) > * Check and copy modified bits from guest to server surface. > * Update server dirty map. > */ > - bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16)); > - bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16), > - VNC_DIRTY_WORDS * BITS_PER_LONG); > + bitmap_set(width_mask, 0, width); > + bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width); > cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds); > guest_row = vd->guest.ds->data; > server_row = vd->server->data; > -- > 1.7.1 > > -- Corentin Chary http://xf.iksaif.net
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote: > On 2011-03-01 13:58, Alon Levy wrote: > > On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: > >> On 2011-02-27 20:03, Alon Levy wrote: > >>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: > On 2011-02-26 12:43, xming wrote: > > When trying to start X (and it loads qxl driver) the kvm process just > > crashes. > >>> > >>> This is fixed by Gerd's attached patch (taken from rhel repository, don't > >>> know > >>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as > >>> well (separate email). > >> > >> Patch looks OK on first glance, but the changelog is misleading: This > >> was broken for _both_ trees, but upstream didn't detect the bug. > > > > So I didn't test with qemu not having this patch, but according to the > > discussion in the > > launchpad bug the problem only happens with qemu-kvm. This doesn't rule out > > it being a > > bug, perhaps it is just triggered much less frequently I guess. > > Again: qemu-kvm has the instrumentation to detect the bug, qemu is > lacking this, but both trees will break subtly if cpu_current_env is not > properly restored. ok, so what do you want to be done further before this patch is applied? > > Jan >
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/2011 12:25 PM, Jes Sorensen wrote: On 03/01/11 15:25, Dor Laor wrote: On 03/01/2011 02:40 PM, Anthony Liguori wrote: On Mar 1, 2011 7:07 AM, "Dor Laor" Qemu is the one that should spawn them and they should be transparent from the management. This way running qemu stays the same and qemu just need to add the logic to get a SIGCHILD and potentially re-execute an dead son process. Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. I agree it is desirable to the this for spice but it is allot more complex than virtagent isolation. Spice is performance sensitive and contains much more state. It needs to access the guest memory for reading the surfaces. It can be solved but needs some major changes. Adding spice-devel to the discussion. I had a few thoughts about this already, which I think will work for both spice and vnc. What we could do is to expose the video memory via shared memory. That way a spice or vnc daemon could get direct access to the memory, this would limit communication to keyboard/mouse events, as well as video mode info, and possibly notifications to the client about which ranges of memory have been updated. Using shared memory this way should allow us to implement the video clients without performance loss, in fact it should be beneficial since it would allow them to run fully separate from the host daemon. Why do you call it a daemon? Each VM instance should have only one, the 'host daemon' naming is misleading. The proper solution long term is to sandbox qemu in a way that there privileged mode and non privileged mode. It might be implemented using separate address space or not. Most operations like vnc/rpc/spice/usb should be run with less privileges. The main issue is that doing it right will take time and we'll want virt-agent be merged before the long term solution is ready. The best approach would be gradual development Cheers, Jes
Re: [Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
On Tue, Mar 01, 2011 at 12:53:40PM -0600, Rick Vernam wrote: > On Tuesday 01 March 2011 12:29:14 Serge Hallyn wrote: > > @Rick, > > > > would you expect a fedora guest to reproduce this? Would it have the > > qxl driver? Or must it be Windows? > > I don't have a fedora guest to test on, and I don't know the implementation > details well enough to postulate. > -Rick > This is guest independent, as long as the guest uses a qxl driver.
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/11 11:42, Dor Laor wrote: > On 03/02/2011 12:28 PM, Jes Sorensen wrote: >> On 03/01/11 15:25, Dor Laor wrote: >>> I agree it is desirable to the this for spice but it is allot more >>> complex than virtagent isolation. Spice is performance sensitive and >>> contains much more state. It needs to access the guest memory for >>> reading the surfaces. It can be solved but needs some major changes. >>> Adding spice-devel to the discussion. >> >> Please don't add broken misconfigured mailing lists, which require >> moderator or subscription, to discussions on public lists. > > Isn't it simpler to ask the spice list maintainer (Alon Levy, CCed) to > do it? We cannot have a discussion on a mailing list if we constantly have to wait for the list maintainer to approve it. There is really zero justification for closing a development list of an open source project. Spam is handled in other ways, this doesn't solve it. Cheers, Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/02/2011 12:28 PM, Jes Sorensen wrote: On 03/01/11 15:25, Dor Laor wrote: On 03/01/2011 02:40 PM, Anthony Liguori wrote: Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. I agree it is desirable to the this for spice but it is allot more complex than virtagent isolation. Spice is performance sensitive and contains much more state. It needs to access the guest memory for reading the surfaces. It can be solved but needs some major changes. Adding spice-devel to the discussion. Please don't add broken misconfigured mailing lists, which require moderator or subscription, to discussions on public lists. Isn't it simpler to ask the spice list maintainer (Alon Levy, CCed) to do it? Jes
Re: [Qemu-devel] REPOST: [PATCH v3] tracetool: Add optional argument to specify dtrace probe names
On Wed, Mar 2, 2011 at 8:22 AM, wrote: > From: Jes Sorensen > > Optional feature allowing a user to generate the probe list to match > the name of the binary, in case they wish to install qemu under a > different name than qemu-{system,user}, > > Signed-off-by: Jes Sorensen > --- > scripts/tracetool | 19 +-- > 1 files changed, 13 insertions(+), 6 deletions(-) Yes, please merge. Acked-by: Stefan Hajnoczi
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
@Serge, I had to re-target your branch (and merge) against 0.14.0~rc1 +noroms-0ubuntu4 as *ubuntu3 had already been uploaded for a different fix, and the package-importer failed to suck it in. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/723871 Title: qemu-kvm-0.14.0 Aborts with -vga qxl Status in QEMU: Confirmed Status in “qemu-kvm” package in Ubuntu: Fix Released Bug description: Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3. qemu-kvm is started like so: qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio and crashes with: qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Aborted If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/01/11 15:25, Dor Laor wrote: > On 03/01/2011 02:40 PM, Anthony Liguori wrote: >> Spice is the logical place to start, no? It's the largest single >> dependency we have and it does some scary things with qemu_mutex. I >> would use spice as a way to prove the concept. > > I agree it is desirable to the this for spice but it is allot more > complex than virtagent isolation. Spice is performance sensitive and > contains much more state. It needs to access the guest memory for > reading the surfaces. It can be solved but needs some major changes. > Adding spice-devel to the discussion. Please don't add broken misconfigured mailing lists, which require moderator or subscription, to discussions on public lists. Jes
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/01/11 15:25, Dor Laor wrote: > On 03/01/2011 02:40 PM, Anthony Liguori wrote: >> >> On Mar 1, 2011 7:07 AM, "Dor Laor" > > Qemu is the one that should spawn them and they should be transparent >> from the management. This way running qemu stays the same and qemu just >> need to add the logic to get a SIGCHILD and potentially re-execute an >> dead son process. >> >> Spice is the logical place to start, no? It's the largest single >> dependency we have and it does some scary things with qemu_mutex. I >> would use spice as a way to prove the concept. > > I agree it is desirable to the this for spice but it is allot more > complex than virtagent isolation. Spice is performance sensitive and > contains much more state. It needs to access the guest memory for > reading the surfaces. It can be solved but needs some major changes. > Adding spice-devel to the discussion. I had a few thoughts about this already, which I think will work for both spice and vnc. What we could do is to expose the video memory via shared memory. That way a spice or vnc daemon could get direct access to the memory, this would limit communication to keyboard/mouse events, as well as video mode info, and possibly notifications to the client about which ranges of memory have been updated. Using shared memory this way should allow us to implement the video clients without performance loss, in fact it should be beneficial since it would allow them to run fully separate from the host daemon. Cheers, Jes
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
This bug was fixed in the package qemu-kvm - 0.14.0~rc1+noroms-0ubuntu4 --- qemu-kvm (0.14.0~rc1+noroms-0ubuntu4) natty; urgency=low * Apply spice-qxl-locking-fix-for-qemu-kvm.patch to fix bug with -qxl. (LP: #723871) -- Serge HallynTue, 01 Mar 2011 11:12:44 -0600 ** Changed in: qemu-kvm (Ubuntu) Status: In Progress => Fix Released ** Branch linked: lp:ubuntu/qemu-kvm -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/723871 Title: qemu-kvm-0.14.0 Aborts with -vga qxl Status in QEMU: Confirmed Status in “qemu-kvm” package in Ubuntu: Fix Released Bug description: Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3. qemu-kvm is started like so: qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio and crashes with: qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Aborted If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/01/11 13:07, Dor Laor wrote: > On 02/28/2011 07:44 PM, Anthony Liguori wrote: >> I'm very nervous about having a large number of daemons necessary to run >> QEMU. I think a reasonable approach would be a single front-end daemond. > > s/daemon/son processes/ > Qemu is the one that should spawn them and they should be transparent > from the management. This way running qemu stays the same and qemu just > need to add the logic to get a SIGCHILD and potentially re-execute an > dead son process. I disagree here, I do not want to require QEMU to spawn the new processes. Having a daemon you can use to connect will provide more flexibility and isn't unreasonably complex. Cheers, Jes
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Wed, Mar 2, 2011 at 5:05 AM, Aneesh Kumar K. V wrote: > On Tue, 1 Mar 2011 20:27:19 +, Stefan Hajnoczi wrote: >> On Tue, Mar 1, 2011 at 6:02 PM, Aneesh Kumar K. V >> wrote: >> > On Tue, 1 Mar 2011 15:59:19 +, Stefan Hajnoczi >> > wrote: >> >> >> Please explain the semantics of P9_TSYNCFS. Won't returning success >> >> >> without doing anything lead to data integrity issues? >> >> > >> >> > I should actually do the 9P Operation format as commit message. Will >> >> > add in the next update. Whether returning here would cause a data >> >> > integrity issue, it depends what sort of guarantee we want to >> >> > provide. So calling sync on the guest will cause all the dirty pages in >> >> > the guest to be flushed to host. Now all those changes are in the host >> >> > page cache and it would be nice to flush them as a part of sync but >> >> > then since we don't have a per file system sync, the above would imply >> >> > we flush all dirty pages on the host which can result in large >> >> > performance impact. >> >> >> >> You get the define the semantics of P9_TSYNCFS? I thought this is >> >> part of a well-defined protocol :). If this is a .L extension then >> >> it's probably a bad design and shouldn't be added to the protocol if >> >> we can't implement it. >> > >> > It is a part of .L extension and we can definitely implement it. There >> > is patch out there which is yet to be merged >> > >> > http://thread.gmane.org/gmane.linux.file-systems/44628 >> >> A future Linux-only ioctl :/. >> >> >> Is this operation supposed to flush the disk write cache too? >> > >> > I am not sure we need to specify that as a part of 9p operation. I guess >> > we can only say maximum possible data integrity. Whether a sync will >> > cause disk write cache flush depends on the file system. For ext* that >> > can be controlled by mount option barrier. >> >> So on a host with a safe configuration this operation should put data >> on stable storage? >> >> >> >> >> I think virtio-9p has a file descriptor cache. Would it be possible >> >> to fsync() those file descriptors? >> >> >> > >> > Ideally we should. But that would involve a large number of fsync calls. >> >> Yep, that's why this is a weird operation to support, especially since >> it's a .L add-on and not original 9P. What's the use-case since >> today's Linux userland cannot directly make use of this operation? I >> guess it has been added in order to pass-through a Linux internal vfs >> super block sync function? > > IMHO it would be nice to have a syncfs 9p operation because that enables > the client to say "if possible" flush the dirty data on the server. I > guess we should consider this as something server can choose to > ignore. In a cloud setup even doing a per file system sync can imply > performance impact because VirtFS export may not 1:1 map to mount point > on host. There is also plan to add a new option to the VirtFs export point > which enable write to exported files to be either O_SYNC or > O_DIRECT, similar to the way done for image files. That would imply > Tsyncfs doesn't have much to do because we don't have dirty data on host > pagecache anymore. > > So from 9p .L protocol point of view, it is a valid operation which > enables client to request a flush of server cache if possible. And qemu > 9p server choose to ignore because of the performance impact. If you are > not comfortable with not doing anything specific in Tsyncfs > operation, we can add sync(2) call as part of this 9p operation and > later switch to FS_IOC_SYNCFS when it become available. The case we need to prevent is where applications running on virtfs think they are getting guarantees that the implementation does not provide. Silently noping a sync operation is a step in that direction so I agree that sync(2) would be safer. I'm not sure I understand your 1:1 mount point mapping argument. The FS_IOC_SYNCFS ioctl does not help us there since it syncs the entire filesystem, not the directory tree that virtfs is mapped to. It will do a bunch of extra I/O similar to how sync(2) does this across all filesystems today. This suggests again that P9_TSYNCFS is hard to implement because FS_IOC_SYNCFS ends up not being useful. Did I miss something? I'm looking for a use case where guests need a P9_TSYNCFS operation. P9_TSYNCFS is not in linux-2.6 yet so I don't have any example code that exploits it. Can you point me to something that shows off why this operation is necessary? It must an optimization if 9P and NFS make do without an equivalent? Stefan
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 02/28/11 18:44, Anthony Liguori wrote: > On Feb 28, 2011 10:44 AM, "Jes Sorensen" wrote: >> > Separating host-side virtagent and other tasks from core QEMU >> > = >> > >> > To improve auditing of the core QEMU code, it would be ideal to be >> > able to separate the core QEMU functionality from utility >> > functionality by moving the utility functionality into its own >> > process. This process will be referred to as the QEMU client below. > Separating as in moving code outside of qemu.git, making code optionally > built in, making code optionally built in or loadable as a separate > executable that is automatically launched, or making code always built > outside the main executable? > > I'm very nervous about having a large number of daemons necessary to run > QEMU. I think a reasonable approach would be a single front-end daemond. > > Once QAPI is merged, there is a very incremental approach we can take for > this sort of work. Take your favorite subsystem (like gdbstub or SDL) and > make it only use QMP apis. Once we're only using QMP internally in a > subsystem, then building it out of the main QEMU and using libqmp should be > fairly easy. Hi Anthony, Back from a day off playing with power drills and concrete walls :) Sorry I should have made it a little more clear, it was obvious to me, but not written down: The idea is to keep everything as part of the QEMU package, ie. part of qemu.git. My idea is really to have one QEMU host daemon and one QEMU client, which provides the various services. You type make and you get two binaries instead of one. We could allow other daemons to connect to the host daemon, but that wouldn't be a primary target for this in my book, and I am not sure we really want to do this. It is absolutely vital for me that we do not make things much more complicated for users with this move. I don't want to get into a situation where we start forcing external packages or daemons in order to run basic QEMU. If we start requiring such, we have failed! However, I think it is a reasonable compromise to have one daemon you launch, and then a simple client you launch straight after which will then provide the same/similar views and interfaces that users are used to when they launch current QEMU (monitor, vnc server etc). I hope this clarifies things. Cheers, Jes
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On 2011-03-01 13:58, Alon Levy wrote: > On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: >> On 2011-02-27 20:03, Alon Levy wrote: >>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: > When trying to start X (and it loads qxl driver) the kvm process just > crashes. >>> >>> This is fixed by Gerd's attached patch (taken from rhel repository, don't >>> know >>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as >>> well (separate email). >> >> Patch looks OK on first glance, but the changelog is misleading: This >> was broken for _both_ trees, but upstream didn't detect the bug. > > So I didn't test with qemu not having this patch, but according to the > discussion in the > launchpad bug the problem only happens with qemu-kvm. This doesn't rule out > it being a > bug, perhaps it is just triggered much less frequently I guess. Again: qemu-kvm has the instrumentation to detect the bug, qemu is lacking this, but both trees will break subtly if cpu_current_env is not properly restored. Jan signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH v3 11/17] kvm: x86: Inject pending MCE events on state writeback
The current way of injecting MCE events without updating of and synchronizing with the CPUState is broken and causes spurious corruptions of the MCE-related parts of the CPUState. As a first step towards a fix, enhance the state writeback code with support for injecting events that are pending in the CPUState. A pending exception will then be signaled via cpu_interrupt(CPU_INTERRUPT_MCE). And, just like for TCG, we need to leave the halt state when CPU_INTERRUPT_MCE is pending (left broken for the to-be-removed old KVM code). This will also allow to unify TCG and KVM injection code. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/kvm.c | 60 + 1 files changed, 60 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index a416554..939edc8 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -467,6 +467,38 @@ void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, #endif /* !KVM_CAP_MCE*/ } +static int kvm_inject_mce_oldstyle(CPUState *env) +{ +#ifdef KVM_CAP_MCE +if (!kvm_has_vcpu_events() && env->exception_injected == EXCP12_MCHK) { +unsigned int bank, bank_num = env->mcg_cap & 0xff; +struct kvm_x86_mce mce; + +env->exception_injected = -1; + +/* + * There must be at least one bank in use if an MCE is pending. + * Find it and use its values for the event injection. + */ +for (bank = 0; bank < bank_num; bank++) { +if (env->mce_banks[bank * 4 + 1] & MCI_STATUS_VAL) { +break; +} +} +assert(bank < bank_num); + +mce.bank = bank; +mce.status = env->mce_banks[bank * 4 + 1]; +mce.mcg_status = env->mcg_status; +mce.addr = env->mce_banks[bank * 4 + 2]; +mce.misc = env->mce_banks[bank * 4 + 3]; + +return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, &mce); +} +#endif /* KVM_CAP_MCE */ +return 0; +} + static void cpu_update_state(void *opaque, int running, int reason) { CPUState *env = opaque; @@ -1539,6 +1571,11 @@ int kvm_arch_put_registers(CPUState *env, int level) if (ret < 0) { return ret; } +/* must be before kvm_put_msrs */ +ret = kvm_inject_mce_oldstyle(env); +if (ret < 0) { +return ret; +} ret = kvm_put_msrs(env, level); if (ret < 0) { return ret; @@ -1677,6 +1714,29 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run) int kvm_arch_process_async_events(CPUState *env) { +if (env->interrupt_request & CPU_INTERRUPT_MCE) { +/* We must not raise CPU_INTERRUPT_MCE if it's not supported. */ +assert(env->mcg_cap); + +env->interrupt_request &= ~CPU_INTERRUPT_MCE; + +kvm_cpu_synchronize_state(env); + +if (env->exception_injected == EXCP08_DBLE) { +/* this means triple fault */ +qemu_system_reset_request(); +env->exit_request = 1; +return 0; +} +env->exception_injected = EXCP12_MCHK; +env->has_error_code = 0; + +env->halted = 0; +if (kvm_irqchip_in_kernel() && env->mp_state == KVM_MP_STATE_HALTED) { +env->mp_state = KVM_MP_STATE_RUNNABLE; +} +} + if (kvm_irqchip_in_kernel()) { return 0; } -- 1.7.1
[Qemu-devel] [PATCH v3 02/17] kvm: Fix build warning when KVM_CAP_SET_GUEST_DEBUG is lacking
Original fix by David Gibson. CC: David Gibson Signed-off-by: Jan Kiszka --- kvm-all.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index e6a7de4..7753c8a 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -998,7 +998,9 @@ int kvm_cpu_exec(CPUState *env) } ret = EXCP_INTERRUPT; +#ifdef KVM_CAP_SET_GUEST_DEBUG out: +#endif env->exit_request = 0; cpu_single_env = NULL; return ret; -- 1.7.1
[Qemu-devel] [PATCH v3 13/17] kvm: x86: Consolidate TCG and KVM MCE injection code
This switches KVM's MCE injection path to cpu_x86_inject_mce, both for SIGBUS and monitor initiated events. This means we prepare the MCA MSRs in the VCPUState also for KVM. We have to drop the MSRs writeback restrictions for this purpose which is now safe as every uncoordinated MSR injection is removed with this patch. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/helper.c | 34 +++- target-i386/kvm.c | 238 +--- target-i386/kvm_x86.h | 25 - 3 files changed, 37 insertions(+), 260 deletions(-) delete mode 100644 target-i386/kvm_x86.h diff --git a/target-i386/helper.c b/target-i386/helper.c index a32960c..a08309f 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -27,7 +27,6 @@ #include "exec-all.h" #include "qemu-common.h" #include "kvm.h" -#include "kvm_x86.h" #ifndef CONFIG_USER_ONLY #include "sysemu.h" #include "monitor.h" @@ -1167,7 +1166,6 @@ void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, }; unsigned bank_num = cenv->mcg_cap & 0xff; CPUState *env; -int flag = 0; if (!cenv->mcg_cap) { monitor_printf(mon, "MCE injection not supported\n"); @@ -1187,27 +1185,19 @@ void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, return; } -if (kvm_enabled()) { -if (flags & MCE_INJECT_BROADCAST) { -flag |= MCE_BROADCAST; -} - -kvm_inject_x86_mce(cenv, bank, status, mcg_status, addr, misc, flag); -} else { -run_on_cpu(cenv, do_inject_x86_mce, ¶ms); -if (flags & MCE_INJECT_BROADCAST) { -params.bank = 1; -params.status = MCI_STATUS_VAL | MCI_STATUS_UC; -params.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV; -params.addr = 0; -params.misc = 0; -for (env = first_cpu; env != NULL; env = env->next_cpu) { -if (cenv == env) { -continue; -} -params.env = env; -run_on_cpu(cenv, do_inject_x86_mce, ¶ms); +run_on_cpu(cenv, do_inject_x86_mce, ¶ms); +if (flags & MCE_INJECT_BROADCAST) { +params.bank = 1; +params.status = MCI_STATUS_VAL | MCI_STATUS_UC; +params.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV; +params.addr = 0; +params.misc = 0; +for (env = first_cpu; env != NULL; env = env->next_cpu) { +if (cenv == env) { +continue; } +params.env = env; +run_on_cpu(cenv, do_inject_x86_mce, ¶ms); } } } diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 939edc8..be896dd 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -28,7 +28,6 @@ #include "hw/pc.h" #include "hw/apic.h" #include "ioport.h" -#include "kvm_x86.h" #ifdef CONFIG_KVM_PARA #include @@ -193,164 +192,23 @@ static int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap) return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap); } -static int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m) +static void kvm_mce_inject(CPUState *env, target_phys_addr_t paddr, int code) { -return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, m); -} - -static int kvm_get_msr(CPUState *env, struct kvm_msr_entry *msrs, int n) -{ -struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs); -int r; - -kmsrs->nmsrs = n; -memcpy(kmsrs->entries, msrs, n * sizeof *msrs); -r = kvm_vcpu_ioctl(env, KVM_GET_MSRS, kmsrs); -memcpy(msrs, kmsrs->entries, n * sizeof *msrs); -free(kmsrs); -return r; -} - -/* FIXME: kill this and kvm_get_msr, use env->mcg_status instead */ -static int kvm_mce_in_progress(CPUState *env) -{ -struct kvm_msr_entry msr_mcg_status = { -.index = MSR_MCG_STATUS, -}; -int r; - -r = kvm_get_msr(env, &msr_mcg_status, 1); -if (r == -1 || r == 0) { -fprintf(stderr, "Failed to get MCE status\n"); -return 0; -} -return !!(msr_mcg_status.data & MCG_STATUS_MCIP); -} - -struct kvm_x86_mce_data -{ -CPUState *env; -struct kvm_x86_mce *mce; -int abort_on_error; -}; - -static void kvm_do_inject_x86_mce(void *_data) -{ -struct kvm_x86_mce_data *data = _data; -int r; - -/* If there is an MCE exception being processed, ignore this SRAO MCE */ -if ((data->env->mcg_cap & MCG_SER_P) && -!(data->mce->status & MCI_STATUS_AR)) { -if (kvm_mce_in_progress(data->env)) { -return; -} -} - -r = kvm_set_mce(data->env, data->mce); -if (r < 0) { -perror("kvm_set_mce FAILED"); -if (data->abort_on_error) { -abort(); -} -} -} - -static void kvm_inject_x86_mce_on(CPUState *env, struct kvm_x86_mce *mce, - int flag) -{ -struct kvm_x86_mce_data data = { -.env = env, -.mce = mce, -.ab
[Qemu-devel] REPOST: [PATCH v3] tracetool: Add optional argument to specify dtrace probe names
From: Jes Sorensen Optional feature allowing a user to generate the probe list to match the name of the binary, in case they wish to install qemu under a different name than qemu-{system,user}, Signed-off-by: Jes Sorensen --- scripts/tracetool | 19 +-- 1 files changed, 13 insertions(+), 6 deletions(-) diff --git a/scripts/tracetool b/scripts/tracetool index e046683..412f695 100755 --- a/scripts/tracetool +++ b/scripts/tracetool @@ -30,9 +30,11 @@ Output formats: --stap Generate .stp file (DTrace with SystemTAP only) Options: - --binary [path] Full path to QEMU binary - --target-arch [arch] QEMU emulator target arch - --target-type [type] QEMU emulator target type ('system' or 'user') + --binary [path]Full path to QEMU binary + --target-arch [arch]QEMU emulator target arch + --target-type [type]QEMU emulator target type ('system' or 'user') + --probe-prefix [prefix] Prefix for dtrace probe names + (default: qemu-\$targettype-\$targetarch) EOF exit 1 @@ -472,7 +474,7 @@ linetostap_dtrace() # Define prototype for probe arguments cat <
[Qemu-devel] [PATCH v3 00/17] [uq/master] Patch queue, part IV (MCE edition)
This is mostly a rebase of the previous round, just including one additional ppc build fix (patch 2) for a regression in a previous part. Please merge. CC: David Gibson CC: Hidetoshi Seto CC: Huang Ying CC: Jin Dongming Huang Ying (2): Add qemu_ram_remap KVM, MCE, unpoison memory address across reboot Jan Kiszka (15): kvm: ppc: Fix breakage of kvm_arch_pre_run/process_irqchip_events kvm: Fix build warning when KVM_CAP_SET_GUEST_DEBUG is lacking x86: Account for MCE in cpu_has_work x86: Perform implicit mcg_status reset x86: Small cleanups of MCE helpers x86: Refine error reporting of MCE injection services x86: Optionally avoid injecting AO MCEs while others are pending Synchronize VCPU states before reset kvm: x86: Move MCE functions together kvm: Rename kvm_arch_process_irqchip_events to async_events kvm: x86: Inject pending MCE events on state writeback x86: Run qemu_inject_x86_mce on target VCPU kvm: x86: Consolidate TCG and KVM MCE injection code kvm: x86: Clean up kvm_setup_mce kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails cpu-all.h |8 +- cpu-common.h |1 + exec.c| 63 +++- kvm-all.c |4 +- kvm.h |2 +- monitor.c | 11 +- qemu-common.h |6 +- target-i386/cpu.h | 11 +- target-i386/exec.h| 15 +- target-i386/helper.c | 185 +--- target-i386/kvm.c | 463 - target-i386/kvm_x86.h | 25 --- target-ppc/kvm.c |6 +- target-s390x/kvm.c|2 +- vl.c |1 + 15 files changed, 403 insertions(+), 400 deletions(-) delete mode 100644 target-i386/kvm_x86.h
[Qemu-devel] [PATCH v3 09/17] kvm: x86: Move MCE functions together
Pure function suffling to avoid multiple #ifdef KVM_CAP_MCE sections, no functional changes. While at it, annotate some #ifdef sections. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/kvm.c | 346 ++--- 1 files changed, 171 insertions(+), 175 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 0aa0a41..f909661 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -172,7 +172,7 @@ static int get_para_features(CPUState *env) #endif return features; } -#endif +#endif /* CONFIG_KVM_PARA */ #ifdef KVM_CAP_MCE static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap, @@ -273,8 +273,174 @@ static void kvm_inject_x86_mce_on(CPUState *env, struct kvm_x86_mce *mce, run_on_cpu(env, kvm_do_inject_x86_mce, &data); } -static void kvm_mce_broadcast_rest(CPUState *env); -#endif +static void kvm_mce_broadcast_rest(CPUState *env) +{ +struct kvm_x86_mce mce = { +.bank = 1, +.status = MCI_STATUS_VAL | MCI_STATUS_UC, +.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV, +.addr = 0, +.misc = 0, +}; +CPUState *cenv; + +/* Broadcast MCA signal for processor version 06H_EH and above */ +if (cpu_x86_support_mca_broadcast(env)) { +for (cenv = first_cpu; cenv != NULL; cenv = cenv->next_cpu) { +if (cenv == env) { +continue; +} +kvm_inject_x86_mce_on(cenv, &mce, ABORT_ON_ERROR); +} +} +} + +static void kvm_mce_inj_srar_dataload(CPUState *env, target_phys_addr_t paddr) +{ +struct kvm_x86_mce mce = { +.bank = 9, +.status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN + | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S + | MCI_STATUS_AR | 0x134, +.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_EIPV, +.addr = paddr, +.misc = (MCM_ADDR_PHYS << 6) | 0xc, +}; +int r; + +r = kvm_set_mce(env, &mce); +if (r < 0) { +fprintf(stderr, "kvm_set_mce: %s\n", strerror(errno)); +abort(); +} +kvm_mce_broadcast_rest(env); +} + +static void kvm_mce_inj_srao_memscrub(CPUState *env, target_phys_addr_t paddr) +{ +struct kvm_x86_mce mce = { +.bank = 9, +.status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN + | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S + | 0xc0, +.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV, +.addr = paddr, +.misc = (MCM_ADDR_PHYS << 6) | 0xc, +}; +int r; + +r = kvm_set_mce(env, &mce); +if (r < 0) { +fprintf(stderr, "kvm_set_mce: %s\n", strerror(errno)); +abort(); +} +kvm_mce_broadcast_rest(env); +} + +static void kvm_mce_inj_srao_memscrub2(CPUState *env, target_phys_addr_t paddr) +{ +struct kvm_x86_mce mce = { +.bank = 9, +.status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN + | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S + | 0xc0, +.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV, +.addr = paddr, +.misc = (MCM_ADDR_PHYS << 6) | 0xc, +}; + +kvm_inject_x86_mce_on(env, &mce, ABORT_ON_ERROR); +kvm_mce_broadcast_rest(env); +} +#endif /* KVM_CAP_MCE */ + +static void hardware_memory_error(void) +{ +fprintf(stderr, "Hardware memory error!\n"); +exit(1); +} + +int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr) +{ +#ifdef KVM_CAP_MCE +void *vaddr; +ram_addr_t ram_addr; +target_phys_addr_t paddr; + +if ((env->mcg_cap & MCG_SER_P) && addr +&& (code == BUS_MCEERR_AR +|| code == BUS_MCEERR_AO)) { +vaddr = (void *)addr; +if (qemu_ram_addr_from_host(vaddr, &ram_addr) || +!kvm_physical_memory_addr_from_ram(env->kvm_state, ram_addr, &paddr)) { +fprintf(stderr, "Hardware memory error for memory used by " +"QEMU itself instead of guest system!\n"); +/* Hope we are lucky for AO MCE */ +if (code == BUS_MCEERR_AO) { +return 0; +} else { +hardware_memory_error(); +} +} + +if (code == BUS_MCEERR_AR) { +/* Fake an Intel architectural Data Load SRAR UCR */ +kvm_mce_inj_srar_dataload(env, paddr); +} else { +/* + * If there is an MCE excpetion being processed, ignore + * this SRAO MCE + */ +if (!kvm_mce_in_progress(env)) { +/* Fake an Intel architectural Memory scrubbing UCR */ +kvm_mce_inj_srao_memscrub(env, paddr); +} +} +} else +#endif /* KVM_CAP_MCE */ +{ +if (code == BUS_MCEERR_AO) { +return 0; +} else if (code == BUS_MCEERR_AR) { +hardware_memo
[Qemu-devel] [PATCH v3 06/17] x86: Refine error reporting of MCE injection services
As this service is used by the human monitor, make sure that errors get reported to the right channel, and also raise the verbosity. This requires to move Monitor typedef in qemu-common.h to resolve the include dependency. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- monitor.c|4 +- qemu-common.h|6 ++-- target-i386/cpu.h|6 ++-- target-i386/helper.c | 79 +- 4 files changed, 54 insertions(+), 41 deletions(-) diff --git a/monitor.c b/monitor.c index 45b0cc2..662df7c 100644 --- a/monitor.c +++ b/monitor.c @@ -2712,8 +2712,8 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) int broadcast = qdict_get_try_bool(qdict, "broadcast", 0); for (cenv = first_cpu; cenv != NULL; cenv = cenv->next_cpu) { -if (cenv->cpu_index == cpu_index && cenv->mcg_cap) { -cpu_x86_inject_mce(cenv, bank, status, mcg_status, addr, misc, +if (cenv->cpu_index == cpu_index) { +cpu_x86_inject_mce(mon, cenv, bank, status, mcg_status, addr, misc, broadcast); break; } diff --git a/qemu-common.h b/qemu-common.h index 40dad52..3ce6b6c 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -18,6 +18,9 @@ typedef struct QEMUFile QEMUFile; typedef struct QEMUBH QEMUBH; typedef struct DeviceState DeviceState; +struct Monitor; +typedef struct Monitor Monitor; + /* we put basic includes here to avoid repeating them in device drivers */ #include #include @@ -326,9 +329,6 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count); void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count, size_t skip); -struct Monitor; -typedef struct Monitor Monitor; - /* Convert a byte between binary and BCD. */ static inline uint8_t to_bcd(uint8_t val) { diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 52bb48e..486af1d 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -987,8 +987,8 @@ static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc, void do_cpu_init(CPUState *env); void do_cpu_sipi(CPUState *env); -void cpu_x86_inject_mce(CPUState *cenv, int bank, uint64_t status, -uint64_t mcg_status, uint64_t addr, uint64_t misc, -int broadcast); +void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, +uint64_t status, uint64_t mcg_status, uint64_t addr, +uint64_t misc, int broadcast); #endif /* CPU_I386_H */ diff --git a/target-i386/helper.c b/target-i386/helper.c index ba3bed9..462d332 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -30,6 +30,7 @@ #include "kvm_x86.h" #ifndef CONFIG_USER_ONLY #include "sysemu.h" +#include "monitor.h" #endif //#define DEBUG_MMU @@ -1067,33 +1068,38 @@ static void breakpoint_handler(CPUState *env) } static void -qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, +qemu_inject_x86_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, uint64_t misc) { uint64_t mcg_cap = cenv->mcg_cap; -uint64_t *banks = cenv->mce_banks; - -/* - * if MSR_MCG_CTL is not all 1s, the uncorrected error - * reporting is disabled - */ -if ((status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && -cenv->mcg_ctl != ~(uint64_t)0) { -return; -} -banks += 4 * bank; -/* - * if MSR_MCi_CTL is not all 1s, the uncorrected error - * reporting is disabled for the bank - */ -if ((status & MCI_STATUS_UC) && banks[0] != ~(uint64_t)0) { -return; -} +uint64_t *banks = cenv->mce_banks + 4 * bank; + if (status & MCI_STATUS_UC) { +/* + * if MSR_MCG_CTL is not all 1s, the uncorrected error + * reporting is disabled + */ +if ((mcg_cap & MCG_CTL_P) && cenv->mcg_ctl != ~(uint64_t)0) { +monitor_printf(mon, + "CPU %d: Uncorrected error reporting disabled\n", + cenv->cpu_index); +return; +} + +/* + * if MSR_MCi_CTL is not all 1s, the uncorrected error + * reporting is disabled for the bank + */ +if (banks[0] != ~(uint64_t)0) { +monitor_printf(mon, "CPU %d: Uncorrected error reporting disabled " + "for bank %d\n", cenv->cpu_index, bank); +return; +} + if ((cenv->mcg_status & MCG_STATUS_MCIP) || !(cenv->cr[4] & CR4_MCE_MASK)) { -fprintf(stderr, "injects mce exception while previous " -"one is in progress!\n"); +monitor_printf(mon, "CPU %d: Previous MCE still in progress, " +"raising triple fault\n", cenv->cpu_index); qemu_log_mask(C
[Qemu-devel] [PATCH v3 07/17] x86: Optionally avoid injecting AO MCEs while others are pending
Allow to tell cpu_x86_inject_mce that it should ignore Action Optional MCE events when the target VCPU is still processing another one. This will be used by KVM soon. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- monitor.c|7 +-- target-i386/cpu.h|5 - target-i386/helper.c | 26 +++--- 3 files changed, 28 insertions(+), 10 deletions(-) diff --git a/monitor.c b/monitor.c index 662df7c..ae20927 100644 --- a/monitor.c +++ b/monitor.c @@ -2709,12 +2709,15 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) uint64_t mcg_status = qdict_get_int(qdict, "mcg_status"); uint64_t addr = qdict_get_int(qdict, "addr"); uint64_t misc = qdict_get_int(qdict, "misc"); -int broadcast = qdict_get_try_bool(qdict, "broadcast", 0); +int flags = MCE_INJECT_UNCOND_AO; +if (qdict_get_try_bool(qdict, "broadcast", 0)) { +flags |= MCE_INJECT_BROADCAST; +} for (cenv = first_cpu; cenv != NULL; cenv = cenv->next_cpu) { if (cenv->cpu_index == cpu_index) { cpu_x86_inject_mce(mon, cenv, bank, status, mcg_status, addr, misc, - broadcast); + flags); break; } } diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 486af1d..d0eae75 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -987,8 +987,11 @@ static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc, void do_cpu_init(CPUState *env); void do_cpu_sipi(CPUState *env); +#define MCE_INJECT_BROADCAST1 +#define MCE_INJECT_UNCOND_AO2 + void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, -uint64_t misc, int broadcast); +uint64_t misc, int flags); #endif /* CPU_I386_H */ diff --git a/target-i386/helper.c b/target-i386/helper.c index 462d332..e3ef40c 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -1069,11 +1069,20 @@ static void breakpoint_handler(CPUState *env) static void qemu_inject_x86_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, -uint64_t mcg_status, uint64_t addr, uint64_t misc) +uint64_t mcg_status, uint64_t addr, uint64_t misc, +int flags) { uint64_t mcg_cap = cenv->mcg_cap; uint64_t *banks = cenv->mce_banks + 4 * bank; +/* + * If there is an MCE exception being processed, ignore this SRAO MCE + * unless unconditional injection was requested. + */ +if (!(flags & MCE_INJECT_UNCOND_AO) && !(status & MCI_STATUS_AR) +&& (cenv->mcg_status & MCG_STATUS_MCIP)) { +return; +} if (status & MCI_STATUS_UC) { /* * if MSR_MCG_CTL is not all 1s, the uncorrected error @@ -1127,7 +1136,7 @@ qemu_inject_x86_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, -uint64_t misc, int broadcast) +uint64_t misc, int flags) { unsigned bank_num = cenv->mcg_cap & 0xff; CPUState *env; @@ -1145,27 +1154,30 @@ void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, monitor_printf(mon, "Invalid MCE status code\n"); return; } -if (broadcast && !cpu_x86_support_mca_broadcast(cenv)) { +if ((flags & MCE_INJECT_BROADCAST) +&& !cpu_x86_support_mca_broadcast(cenv)) { monitor_printf(mon, "Guest CPU does not support MCA broadcast\n"); return; } if (kvm_enabled()) { -if (broadcast) { +if (flags & MCE_INJECT_BROADCAST) { flag |= MCE_BROADCAST; } kvm_inject_x86_mce(cenv, bank, status, mcg_status, addr, misc, flag); } else { -qemu_inject_x86_mce(mon, cenv, bank, status, mcg_status, addr, misc); -if (broadcast) { +qemu_inject_x86_mce(mon, cenv, bank, status, mcg_status, addr, misc, +flags); +if (flags & MCE_INJECT_BROADCAST) { for (env = first_cpu; env != NULL; env = env->next_cpu) { if (cenv == env) { continue; } qemu_inject_x86_mce(mon, env, 1, MCI_STATUS_VAL | MCI_STATUS_UC, -MCG_STATUS_MCIP | MCG_STATUS_RIPV, 0, 0); +MCG_STATUS_MCIP | MCG_STATUS_RIPV, 0, 0, +flags); } } } -- 1.7.1
[Qemu-devel] [PATCH v3 05/17] x86: Small cleanups of MCE helpers
Fix some code style issues, use proper headers, and align to cpu_x86 naming scheme. No functional changes. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- cpu-all.h|4 monitor.c|2 +- target-i386/cpu.h|5 + target-i386/helper.c | 41 - 4 files changed, 30 insertions(+), 22 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index 87b0f86..caf5e6c 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -971,8 +971,4 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf); int cpu_memory_rw_debug(CPUState *env, target_ulong addr, uint8_t *buf, int len, int is_write); -void cpu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, -uint64_t mcg_status, uint64_t addr, uint64_t misc, -int broadcast); - #endif /* CPU_ALL_H */ diff --git a/monitor.c b/monitor.c index 22ae3bb..45b0cc2 100644 --- a/monitor.c +++ b/monitor.c @@ -2713,7 +2713,7 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict) for (cenv = first_cpu; cenv != NULL; cenv = cenv->next_cpu) { if (cenv->cpu_index == cpu_index && cenv->mcg_cap) { -cpu_inject_x86_mce(cenv, bank, status, mcg_status, addr, misc, +cpu_x86_inject_mce(cenv, bank, status, mcg_status, addr, misc, broadcast); break; } diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 75156e7..52bb48e 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -986,4 +986,9 @@ static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc, void do_cpu_init(CPUState *env); void do_cpu_sipi(CPUState *env); + +void cpu_x86_inject_mce(CPUState *cenv, int bank, uint64_t status, +uint64_t mcg_status, uint64_t addr, uint64_t misc, +int broadcast); + #endif /* CPU_I386_H */ diff --git a/target-i386/helper.c b/target-i386/helper.c index f41416f..ba3bed9 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -28,6 +28,9 @@ #include "qemu-common.h" #include "kvm.h" #include "kvm_x86.h" +#ifndef CONFIG_USER_ONLY +#include "sysemu.h" +#endif //#define DEBUG_MMU @@ -1063,11 +1066,9 @@ static void breakpoint_handler(CPUState *env) prev_debug_excp_handler(env); } -/* This should come from sysemu.h - if we could include it here... */ -void qemu_system_reset_request(void); - -static void qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, -uint64_t mcg_status, uint64_t addr, uint64_t misc) +static void +qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, +uint64_t mcg_status, uint64_t addr, uint64_t misc) { uint64_t mcg_cap = cenv->mcg_cap; uint64_t *banks = cenv->mce_banks; @@ -1077,15 +1078,17 @@ static void qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, * reporting is disabled */ if ((status & MCI_STATUS_UC) && (mcg_cap & MCG_CTL_P) && -cenv->mcg_ctl != ~(uint64_t)0) +cenv->mcg_ctl != ~(uint64_t)0) { return; +} banks += 4 * bank; /* * if MSR_MCi_CTL is not all 1s, the uncorrected error * reporting is disabled for the bank */ -if ((status & MCI_STATUS_UC) && banks[0] != ~(uint64_t)0) +if ((status & MCI_STATUS_UC) && banks[0] != ~(uint64_t)0) { return; +} if (status & MCI_STATUS_UC) { if ((cenv->mcg_status & MCG_STATUS_MCIP) || !(cenv->cr[4] & CR4_MCE_MASK)) { @@ -1095,8 +1098,9 @@ static void qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, qemu_system_reset_request(); return; } -if (banks[1] & MCI_STATUS_VAL) +if (banks[1] & MCI_STATUS_VAL) { status |= MCI_STATUS_OVER; +} banks[2] = addr; banks[3] = misc; cenv->mcg_status = mcg_status; @@ -1104,16 +1108,18 @@ static void qemu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, cpu_interrupt(cenv, CPU_INTERRUPT_MCE); } else if (!(banks[1] & MCI_STATUS_VAL) || !(banks[1] & MCI_STATUS_UC)) { -if (banks[1] & MCI_STATUS_VAL) +if (banks[1] & MCI_STATUS_VAL) { status |= MCI_STATUS_OVER; +} banks[2] = addr; banks[3] = misc; banks[1] = status; -} else +} else { banks[1] |= MCI_STATUS_OVER; +} } -void cpu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, +void cpu_x86_inject_mce(CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, uint64_t misc, int broadcast) { @@ -1155,15 +1161,16 @@ void cpu_inject_x86_mce(CPUState *cenv, int bank, uint64_t status, static void mce_init(CPUX86State *cenv) { -unsigned int bank, bank_num; +unsigned int
[Qemu-devel] [PATCH v3 17/17] KVM, MCE, unpoison memory address across reboot
From: Huang Ying In Linux kernel HWPoison processing implementation, the virtual address in processes mapping the error physical memory page is marked as HWPoison. So that, the further accessing to the virtual address will kill corresponding processes with SIGBUS. If the error physical memory page is used by a KVM guest, the SIGBUS will be sent to QEMU, and QEMU will simulate a MCE to report that memory error to the guest OS. If the guest OS can not recover from the error (for example, the page is accessed by kernel code), guest OS will reboot the system. But because the underlying host virtual address backing the guest physical memory is still poisoned, if the guest system accesses the corresponding guest physical memory even after rebooting, the SIGBUS will still be sent to QEMU and MCE will be simulated. That is, guest system can not recover via rebooting. In fact, across rebooting, the contents of guest physical memory page need not to be kept. We can allocate a new host physical page to back the corresponding guest physical address. This patch fixes this issue in QEMU-KVM via calling qemu_ram_remap() to clear the corresponding page table entry, so that make it possible to allocate a new page to recover the issue. [ Jan: rebasing and tiny cleanups] Signed-off-by: Huang Ying Signed-off-by: Jan Kiszka --- target-i386/kvm.c | 36 1 files changed, 36 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 44e5504..7b7105d 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -173,7 +173,40 @@ static int get_para_features(CPUState *env) } #endif /* CONFIG_KVM_PARA */ +typedef struct HWPoisonPage { +ram_addr_t ram_addr; +QLIST_ENTRY(HWPoisonPage) list; +} HWPoisonPage; + +static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list = +QLIST_HEAD_INITIALIZER(hwpoison_page_list); + +static void kvm_unpoison_all(void *param) +{ +HWPoisonPage *page, *next_page; + +QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) { +QLIST_REMOVE(page, list); +qemu_ram_remap(page->ram_addr, TARGET_PAGE_SIZE); +qemu_free(page); +} +} + #ifdef KVM_CAP_MCE +static void kvm_hwpoison_page_add(ram_addr_t ram_addr) +{ +HWPoisonPage *page; + +QLIST_FOREACH(page, &hwpoison_page_list, list) { +if (page->ram_addr == ram_addr) { +return; +} +} +page = qemu_malloc(sizeof(HWPoisonPage)); +page->ram_addr = ram_addr; +QLIST_INSERT_HEAD(&hwpoison_page_list, page, list); +} + static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap, int *max_banks) { @@ -233,6 +266,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr) hardware_memory_error(); } } +kvm_hwpoison_page_add(ram_addr); kvm_mce_inject(env, paddr, code); } else #endif /* KVM_CAP_MCE */ @@ -263,6 +297,7 @@ int kvm_arch_on_sigbus(int code, void *addr) "QEMU itself instead of guest system!: %p\n", addr); return 0; } +kvm_hwpoison_page_add(ram_addr); kvm_mce_inject(first_cpu, paddr, code); } else #endif /* KVM_CAP_MCE */ @@ -571,6 +606,7 @@ int kvm_arch_init(KVMState *s) fprintf(stderr, "e820_add_entry() table is full\n"); return ret; } +qemu_register_reset(kvm_unpoison_all, NULL); return 0; } -- 1.7.1
[Qemu-devel] [PATCH v3 16/17] Add qemu_ram_remap
From: Huang Ying qemu_ram_remap() unmaps the specified RAM pages, then re-maps these pages again. This is used by KVM HWPoison support to clear HWPoisoned page tables across guest rebooting, so that a new page may be allocated later to recover the memory error. [ Jan: style fixlets, WIN32 fix ] Signed-off-by: Huang Ying Signed-off-by: Jan Kiszka --- cpu-all.h|4 +++ cpu-common.h |1 + exec.c | 63 +- 3 files changed, 67 insertions(+), 1 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index caf5e6c..4f4631d 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -863,10 +863,14 @@ target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, target_ulong addr); extern int phys_ram_fd; extern ram_addr_t ram_size; +/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ +#define RAM_PREALLOC_MASK (1 << 0) + typedef struct RAMBlock { uint8_t *host; ram_addr_t offset; ram_addr_t length; +uint32_t flags; char idstr[256]; QLIST_ENTRY(RAMBlock) next; #if defined(__linux__) && !defined(TARGET_S390X) diff --git a/cpu-common.h b/cpu-common.h index 54d21d4..ef4e8da 100644 --- a/cpu-common.h +++ b/cpu-common.h @@ -50,6 +50,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, ram_addr_t size, void *host); ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size); void qemu_ram_free(ram_addr_t addr); +void qemu_ram_remap(ram_addr_t addr, ram_addr_t length); /* This should only be used for ram local to a device. */ void *qemu_get_ram_ptr(ram_addr_t addr); /* Same but slower, to use for migration, where the order of diff --git a/exec.c b/exec.c index d611100..9308a97 100644 --- a/exec.c +++ b/exec.c @@ -2867,6 +2867,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, if (host) { new_block->host = host; +new_block->flags |= RAM_PREALLOC_MASK; } else { if (mem_path) { #if defined (__linux__) && !defined(TARGET_S390X) @@ -2920,7 +2921,9 @@ void qemu_ram_free(ram_addr_t addr) QLIST_FOREACH(block, &ram_list.blocks, next) { if (addr == block->offset) { QLIST_REMOVE(block, next); -if (mem_path) { +if (block->flags & RAM_PREALLOC_MASK) { +; +} else if (mem_path) { #if defined (__linux__) && !defined(TARGET_S390X) if (block->fd) { munmap(block->host, block->length); @@ -2943,6 +2946,64 @@ void qemu_ram_free(ram_addr_t addr) } +#ifndef _WIN32 +void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) +{ +RAMBlock *block; +ram_addr_t offset; +int flags; +void *area, *vaddr; + +QLIST_FOREACH(block, &ram_list.blocks, next) { +offset = addr - block->offset; +if (offset < block->length) { +vaddr = block->host + offset; +if (block->flags & RAM_PREALLOC_MASK) { +; +} else { +flags = MAP_FIXED; +munmap(vaddr, length); +if (mem_path) { +#if defined(__linux__) && !defined(TARGET_S390X) +if (block->fd) { +#ifdef MAP_POPULATE +flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED : +MAP_PRIVATE; +#else +flags |= MAP_PRIVATE; +#endif +area = mmap(vaddr, length, PROT_READ | PROT_WRITE, +flags, block->fd, offset); +} else { +flags |= MAP_PRIVATE | MAP_ANONYMOUS; +area = mmap(vaddr, length, PROT_READ | PROT_WRITE, +flags, -1, 0); +} +#endif +} else { +#if defined(TARGET_S390X) && defined(CONFIG_KVM) +flags |= MAP_SHARED | MAP_ANONYMOUS; +area = mmap(vaddr, length, PROT_EXEC|PROT_READ|PROT_WRITE, +flags, -1, 0); +#else +flags |= MAP_PRIVATE | MAP_ANONYMOUS; +area = mmap(vaddr, length, PROT_READ | PROT_WRITE, +flags, -1, 0); +#endif +} +if (area != vaddr) { +fprintf(stderr, "Could not remap addr: %lx@%lx\n", +length, addr); +exit(1); +} +qemu_madvise(vaddr, length, QEMU_MADV_MERGEABLE); +} +return; +} +} +} +#endif /* !_WIN32 */ + /* Return a host pointer to ram allocated with qemu_ram_alloc. With the exception of the softmmu code in this file, this should only be used for local memory (e.g. video ram) that the device owns, -- 1.7.1
[Qemu-devel] [PATCH v3 04/17] x86: Perform implicit mcg_status reset
Reorder mcg_status in CPUState to achieve automatic clearing on reset. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/cpu.h|3 ++- target-i386/helper.c |2 -- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 5f1df8b..75156e7 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -687,6 +687,8 @@ typedef struct CPUX86State { uint64_t pat; +uint64_t mcg_status; + /* exception/interrupt handling */ int error_code; int exception_is_int; @@ -741,7 +743,6 @@ typedef struct CPUX86State { struct DeviceState *apic_state; uint64_t mcg_cap; -uint64_t mcg_status; uint64_t mcg_ctl; uint64_t mce_banks[MCE_BANKS_DEF*4]; diff --git a/target-i386/helper.c b/target-i386/helper.c index f0c546d..f41416f 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -101,8 +101,6 @@ void cpu_reset(CPUX86State *env) env->dr[7] = DR7_FIXED_1; cpu_breakpoint_remove_all(env, BP_CPU); cpu_watchpoint_remove_all(env, BP_CPU); - -env->mcg_status = 0; } void cpu_x86_close(CPUX86State *env) -- 1.7.1
[Qemu-devel] [PATCH v3 14/17] kvm: x86: Clean up kvm_setup_mce
There is nothing to abstract here. Fold kvm_setup_mce into its caller and fix up the error reporting (return code of kvm_vcpu_ioctl holds the error value). Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/kvm.c | 11 --- 1 files changed, 4 insertions(+), 7 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index be896dd..486efb9 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -187,11 +187,6 @@ static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap, return -ENOSYS; } -static int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap) -{ -return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap); -} - static void kvm_mce_inject(CPUState *env, target_phys_addr_t paddr, int code) { uint64_t status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN | @@ -440,6 +435,7 @@ int kvm_arch_init_vcpu(CPUState *env) && kvm_check_extension(env->kvm_state, KVM_CAP_MCE) > 0) { uint64_t mcg_cap; int banks; +int ret; if (kvm_get_mce_cap_supported(env->kvm_state, &mcg_cap, &banks)) { perror("kvm_get_mce_cap_supported FAILED"); @@ -448,8 +444,9 @@ int kvm_arch_init_vcpu(CPUState *env) banks = MCE_BANKS_DEF; mcg_cap &= MCE_CAP_DEF; mcg_cap |= banks; -if (kvm_setup_mce(env, &mcg_cap)) { -perror("kvm_setup_mce FAILED"); +ret = kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, &mcg_cap); +if (ret < 0) { +fprintf(stderr, "KVM_X86_SETUP_MCE: %s", strerror(-ret)); } else { env->mcg_cap = mcg_cap; } -- 1.7.1
[Qemu-devel] [PATCH v3 03/17] x86: Account for MCE in cpu_has_work
MCEs can be injected asynchronously, so they can also terminate the halt state. Signed-off-by: Jan Kiszka CC: Huang Ying CC: Hidetoshi Seto CC: Jin Dongming --- target-i386/exec.h | 15 ++- 1 files changed, 6 insertions(+), 9 deletions(-) diff --git a/target-i386/exec.h b/target-i386/exec.h index fc8945b..d050dd0 100644 --- a/target-i386/exec.h +++ b/target-i386/exec.h @@ -293,15 +293,12 @@ static inline void load_eflags(int eflags, int update_mask) static inline int cpu_has_work(CPUState *env) { -int work; - -work = (env->interrupt_request & CPU_INTERRUPT_HARD) && - (env->eflags & IF_MASK); -work |= env->interrupt_request & CPU_INTERRUPT_NMI; -work |= env->interrupt_request & CPU_INTERRUPT_INIT; -work |= env->interrupt_request & CPU_INTERRUPT_SIPI; - -return work; +return ((env->interrupt_request & CPU_INTERRUPT_HARD) && +(env->eflags & IF_MASK)) || + (env->interrupt_request & (CPU_INTERRUPT_NMI | + CPU_INTERRUPT_INIT | + CPU_INTERRUPT_SIPI | + CPU_INTERRUPT_MCE)); } static inline int cpu_halted(CPUState *env) { -- 1.7.1
[Qemu-devel] [PATCH v3 12/17] x86: Run qemu_inject_x86_mce on target VCPU
We will use the current TCG-only MCE injection path for KVM as well, and then this read-modify-write of the target VCPU state has to be performed synchronously in the corresponding thread. Signed-off-by: Jan Kiszka --- target-i386/helper.c | 87 + 1 files changed, 58 insertions(+), 29 deletions(-) diff --git a/target-i386/helper.c b/target-i386/helper.c index e3ef40c..a32960c 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -1067,29 +1067,42 @@ static void breakpoint_handler(CPUState *env) prev_debug_excp_handler(env); } -static void -qemu_inject_x86_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, -uint64_t mcg_status, uint64_t addr, uint64_t misc, -int flags) +typedef struct MCEInjectionParams { +Monitor *mon; +CPUState *env; +int bank; +uint64_t status; +uint64_t mcg_status; +uint64_t addr; +uint64_t misc; +int flags; +} MCEInjectionParams; + +static void do_inject_x86_mce(void *data) { -uint64_t mcg_cap = cenv->mcg_cap; -uint64_t *banks = cenv->mce_banks + 4 * bank; +MCEInjectionParams *params = data; +CPUState *cenv = params->env; +uint64_t *banks = cenv->mce_banks + 4 * params->bank; + +cpu_synchronize_state(cenv); /* * If there is an MCE exception being processed, ignore this SRAO MCE * unless unconditional injection was requested. */ -if (!(flags & MCE_INJECT_UNCOND_AO) && !(status & MCI_STATUS_AR) +if (!(params->flags & MCE_INJECT_UNCOND_AO) +&& !(params->status & MCI_STATUS_AR) && (cenv->mcg_status & MCG_STATUS_MCIP)) { return; } -if (status & MCI_STATUS_UC) { + +if (params->status & MCI_STATUS_UC) { /* * if MSR_MCG_CTL is not all 1s, the uncorrected error * reporting is disabled */ -if ((mcg_cap & MCG_CTL_P) && cenv->mcg_ctl != ~(uint64_t)0) { -monitor_printf(mon, +if ((cenv->mcg_cap & MCG_CTL_P) && cenv->mcg_ctl != ~(uint64_t)0) { +monitor_printf(params->mon, "CPU %d: Uncorrected error reporting disabled\n", cenv->cpu_index); return; @@ -1100,35 +1113,39 @@ qemu_inject_x86_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, * reporting is disabled for the bank */ if (banks[0] != ~(uint64_t)0) { -monitor_printf(mon, "CPU %d: Uncorrected error reporting disabled " - "for bank %d\n", cenv->cpu_index, bank); +monitor_printf(params->mon, + "CPU %d: Uncorrected error reporting disabled for" + " bank %d\n", + cenv->cpu_index, params->bank); return; } if ((cenv->mcg_status & MCG_STATUS_MCIP) || !(cenv->cr[4] & CR4_MCE_MASK)) { -monitor_printf(mon, "CPU %d: Previous MCE still in progress, " -"raising triple fault\n", cenv->cpu_index); +monitor_printf(params->mon, + "CPU %d: Previous MCE still in progress, raising" + " triple fault\n", + cenv->cpu_index); qemu_log_mask(CPU_LOG_RESET, "Triple fault\n"); qemu_system_reset_request(); return; } if (banks[1] & MCI_STATUS_VAL) { -status |= MCI_STATUS_OVER; +params->status |= MCI_STATUS_OVER; } -banks[2] = addr; -banks[3] = misc; -cenv->mcg_status = mcg_status; -banks[1] = status; +banks[2] = params->addr; +banks[3] = params->misc; +cenv->mcg_status = params->mcg_status; +banks[1] = params->status; cpu_interrupt(cenv, CPU_INTERRUPT_MCE); } else if (!(banks[1] & MCI_STATUS_VAL) || !(banks[1] & MCI_STATUS_UC)) { if (banks[1] & MCI_STATUS_VAL) { -status |= MCI_STATUS_OVER; +params->status |= MCI_STATUS_OVER; } -banks[2] = addr; -banks[3] = misc; -banks[1] = status; +banks[2] = params->addr; +banks[3] = params->misc; +banks[1] = params->status; } else { banks[1] |= MCI_STATUS_OVER; } @@ -1138,6 +1155,16 @@ void cpu_x86_inject_mce(Monitor *mon, CPUState *cenv, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, uint64_t misc, int flags) { +MCEInjectionParams params = { +.mon = mon, +.env = cenv, +.bank = bank, +.status = status, +.mcg_status = mcg_status, +.addr = addr, +.misc = misc, +.flags = flags, +}; unsigned bank_num = cenv->mcg_cap & 0xff; CPUState *env; int flag = 0; @@ -1167,17 +1
[Qemu-devel] [PATCH v3 10/17] kvm: Rename kvm_arch_process_irqchip_events to async_events
We will broaden the scope of this function on x86 beyond irqchip events. Signed-off-by: Jan Kiszka --- kvm-all.c |2 +- kvm.h |2 +- target-i386/kvm.c |2 +- target-ppc/kvm.c |2 +- target-s390x/kvm.c |2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 7753c8a..226843c 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -893,7 +893,7 @@ int kvm_cpu_exec(CPUState *env) DPRINTF("kvm_cpu_exec()\n"); -if (kvm_arch_process_irqchip_events(env)) { +if (kvm_arch_process_async_events(env)) { env->exit_request = 0; return EXCP_HLT; } diff --git a/kvm.h b/kvm.h index 59b2c29..7bc04e0 100644 --- a/kvm.h +++ b/kvm.h @@ -102,7 +102,7 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run); int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run); -int kvm_arch_process_irqchip_events(CPUState *env); +int kvm_arch_process_async_events(CPUState *env); int kvm_arch_get_registers(CPUState *env); diff --git a/target-i386/kvm.c b/target-i386/kvm.c index f909661..a416554 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1675,7 +1675,7 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run) cpu_set_apic_base(env->apic_state, run->apic_base); } -int kvm_arch_process_irqchip_events(CPUState *env) +int kvm_arch_process_async_events(CPUState *env) { if (kvm_irqchip_in_kernel()) { return 0; diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 3924f4b..6c99a16 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -259,7 +259,7 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run) { } -int kvm_arch_process_irqchip_events(CPUState *env) +int kvm_arch_process_async_events(CPUState *env) { return 0; } diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c index b349812..5673a95 100644 --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -177,7 +177,7 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run) { } -int kvm_arch_process_irqchip_events(CPUState *env) +int kvm_arch_process_async_events(CPUState *env) { return 0; } -- 1.7.1