On Tue, Mar 01, 2011 at 03:32:08PM +0800, Wen Congyang wrote: > > The issue is, Qemu injected sci interrupt into guest, > > but before the guest completes to handled it, > > users/qemu can start to inject the next sci event triggered by hot > > plug/unplug. > > Thus qemu loses sci interrupt and up/down status. > > Yes, you are right. But I still do not understand why introducing pending bits > can not solve the issue?
Maybe your patch works. The difference is to introduce new variables or new state machine. Anyway I suppose that the eventual solution is to switch to express native hotplug which is much more reliable and doesn't rely on ACPI. (note: I'm very biased here.) thanks > > Thanks > Wen Congyang > > > > > thanks > >> > >>> > >>> - on power-on (0, 0) > >>> - on hot plug (0, 0) -> (1, 0) > >>> if other combination -> error > >>> - on hot unpug (1, 0) or (0, 0) -> (0, 1) > >>> (0, 0) is for cold plugged devices. > >>> (1, 0) is for hot plugged devices. > >>> if other combination -> error > >>> - on pciej_write(write on PCI_EJ_BASE) > >>> (0, 1) -> (0, 0) and qdev_free() > >>> if other combination -> ignore > >>> > >>> When enabling sci, those bits are checked and raise sci if necessary. > >>> > >>> Any comments on this? > >>> Anyway the current pci hotplug-related commands are somewhat broken, > >>> and needs clean up with multifunction hotplug support which is > >>> on my todo list. > >>> > >>> thanks > >>> > >>> On Mon, Feb 28, 2011 at 10:52:40AM +0800, Wen Congyang wrote: > >>>> Hi Markus Armbruster > >>>> > >>>> At 02/23/2011 04:30 PM, Markus Armbruster Write: > >>>>> Isaku Yamahata <yamah...@valinux.co.jp> writes: > >>>>> > >>>> > >>>> <snip> > >>>> > >>>>> > >>>>> I don't think this patch is correct. Let me explain. > >>>>> > >>>>> Device hot unplug is *not* guaranteed to succeed. > >>>>> > >>>>> For some buses, such as USB, it always succeeds immediately, i.e. when > >>>>> the device_del monitor command finishes, the device is gone. Live is > >>>>> good. > >>>>> > >>>>> But for PCI, device_del merely initiates the ACPI unplug rain dance. It > >>>>> doesn't wait for the dance to complete. Why? The dance can take an > >>>>> unpredictable amount of time, including forever. > >>>>> > >>>>> Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI > >>>>> slot, and the unplug has not yet completed (race condition), or it > >>>>> failed. Yes, Virginia, PCI hotplug *can* fail. > >>>>> > >>>>> When unplug succeeds, the qdev is automatically destroyed. > >>>>> pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event() > >>>>> does it for PCIE. > >>>> > >>>> I got a similar problem. When I unplug a pci device by hand, it works > >>>> as expected, and I can hotplug it again. But when I use a srcipt to > >>>> do the same thing, sometimes it failed. I think I may find another bug. > >>>> > >>>> Steps to reproduce this bug: > >>>> 1. cat ./test-e1000.sh # RHEL6RC is domain name > >>>> #! /bin/bash > >>>> > >>>> while true; do > >>>> virsh attach-interface RHEL6RC network default --mac > >>>> 52:54:00:1f:db:c7 --model e1000 > >>>> if [[ $? -ne 0 ]]; then > >>>> break > >>>> fi > >>>> virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7 > >>>> if [[ $? -ne 0 ]]; then > >>>> break > >>>> fi > >>>> sleep 5 > >>>> done > >>>> > >>>> 2. ./test-e1000.sh > >>>> Interface attached successfully > >>>> > >>>> Interface detached successfully > >>>> > >>>> Interface attached successfully > >>>> > >>>> Interface detached successfully > >>>> > >>>> error: Failed to attach interface > >>>> error: operation failed: adding > >>>> e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 > >>>> device failed: Duplicate ID 'net1' for device > >>>> > >>>> > >>>> I use ftrace to trace whether sci interrupt is passed to guest OS, here > >>>> is log: > >>>> # cat trace | grep 'irq=9' > >>>> <idle>-0 [000] 118342.634772: irq_handler_entry: irq=9 > >>>> name=acpi > >>>> <idle>-0 [000] 118342.634831: irq_handler_exit: irq=9 > >>>> ret=handled > >>>> <idle>-0 [000] 118342.693216: irq_handler_entry: irq=9 > >>>> name=acpi > >>>> <idle>-0 [000] 118342.693254: irq_handler_exit: irq=9 > >>>> ret=handled > >>>> <idle>-0 [000] 118347.737689: irq_handler_entry: irq=9 > >>>> name=acpi > >>>> <idle>-0 [000] 118347.737738: irq_handler_exit: irq=9 > >>>> ret=handled > >>>> According to the log, we only receive 3 sci interrupt, and one is lost. > >>>> > >>>> I enable piix4's debug and 1 line printf into piix4_device_hotplug: > >>>> printf("slot: %d, up: %d, down: %d, state: %d\n", slot, > >>>> s->pci0_status.up, s->pci0_status.down, state); > >>>> here is the log: > >>>> ======== > >>>> PM: mapping to 0xb000 > >>>> PM readw port=0x0004 val=0x0000 > >>>> ... > >>>> gpe write afe2 <== 0 > >>>> gpe write afe0 <== 255 > >>>> gpe write afe3 <== 0 > >>>> gpe write afe1 <== 255 > >>>> PM readw port=0x0000 val=0x0001 > >>>> PM readw port=0x0002 val=0x0000 > >>>> gpe read afe0 == 0 > >>>> gpe read afe2 == 0 > >>>> gpe read afe1 == 0 > >>>> gpe read afe3 == 0 > >>>> PM writew port=0x0000 val=0x0020 > >>>> PM readw port=0x0002 val=0x0000 > >>>> PM writew port=0x0002 val=0x0020 > >>>> PM readw port=0x0002 val=0x0020 > >>>> gpe write afe2 <== 255 > >>>> gpe write afe3 <== 255 > >>>> ... > >>>> slot: 6, up: 0, down: 0, state: 1 <==== we attach interface the first > >>>> time > >>>> PM readw port=0x0000 val=0x0001 > >>>> PM readw port=0x0002 val=0x0120 > >>>> gpe read afe0 == 2 > >>>> gpe read afe2 == ff > >>>> gpe read afe2 == ff > >>>> gpe write afe2 <== 253 > >>>> gpe read afe1 == 0 > >>>> gpe read afe3 == ff > >>>> pcihotplug read ae00 == 40 > >>>> pcihotplug read ae04 == 0 > >>>> ... > >>>> gpe write afe0 <== 2 > >>>> gpe write afe2 <== 255 > >>>> slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first > >>>> time > >>>> PM readw port=0x0000 val=0x0001 > >>>> PM readw port=0x0002 val=0x0120 > >>>> gpe read afe0 == 2 > >>>> gpe read afe2 == ff > >>>> gpe read afe2 == ff > >>>> gpe write afe2 <== 253 > >>>> gpe read afe1 == 0 > >>>> gpe read afe3 == ff > >>>> pcihotplug read ae00 == 0 > >>>> pcihotplug read ae04 == 40 > >>>> ... > >>>> pciej write ae08 <== 64 <=== we will call qdev_free() > >>>> pciej write ae08 <== 64 > >>>> gpe write afe0 <== 2 > >>>> gpe write afe2 <== 255 > >>>> slot: 7, up: 0, down: 64, state: 1 <=== we attach interface the second > >>>> time. > >>>> PM readw port=0x0000 val=0x0001 > >>>> PM readw port=0x0002 val=0x0120 > >>>> gpe read afe0 == 2 > >>>> gpe read afe2 == ff > >>>> gpe read afe2 == ff > >>>> gpe write afe2 <== 253 > >>>> gpe read afe1 == 0 > >>>> gpe read afe3 == ff > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> pcihotplug read ae00 == 80 > >>>> pcihotplug read ae04 == 0 > >>>> slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second > >>>> time > >>>> gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci > >>>> interupt to be lost > >>>> gpe write afe2 <== 255 > >>>> =========== > >>>> > >>>> Here is short log, and show the different between the first time and > >>>> second time: > >>>> =========== > >>>> gpe write afe0 <== 2 > >>>> gpe write afe2 <== 255 > >>>> slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first > >>>> time > >>>> .... > >>>> slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second > >>>> time > >>>> gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci > >>>> interupt to be lost > >>>> gpe write afe2 <== 255 > >>>> > >>>> =========== > >>>> > >>>> Does this behavior is a normal behavior? > >>>> > >>>> The following patch can fix this problem. > >>>> > >>>> From 3c149575b52261b8da9a73d5c4ebdfedce9866c1 Mon Sep 17 00:00:00 2001 > >>>> From: Wen Congyang <we...@cn.fujitsu.com> > >>>> Date: Mon, 28 Feb 2011 10:33:45 +0800 > >>>> Subject: [PATCH] pend hotplug event until last event is handled by guest > >>>> OS > >>>> > >>>> --- > >>>> hw/acpi_piix4.c | 51 > >>>> ++++++++++++++++++++++++++++++++++++++++++++------- > >>>> 1 files changed, 44 insertions(+), 7 deletions(-) > >>>> > >>>> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c > >>>> index b5a2762..4ff3475 100644 > >>>> --- a/hw/acpi_piix4.c > >>>> +++ b/hw/acpi_piix4.c > >>>> @@ -74,6 +74,7 @@ typedef struct PIIX4PMState { > >>>> /* for pci hotplug */ > >>>> struct gpe_regs gpe; > >>>> struct pci_status pci0_status; > >>>> + struct pci_status pci0_status_pending; > >>>> uint32_t pci0_hotplug_enable; > >>>> } PIIX4PMState; > >>>> > >>>> @@ -518,6 +519,19 @@ static void gpe_reset_val(uint16_t *cur, int addr, > >>>> uint32_t val) > >>>> *cur = (*cur & (0xff << (8 - shift))) | (x1 << shift); > >>>> } > >>>> > >>>> +static void raise_pending_hotplug(PIIX4PMState *s) > >>>> +{ > >>>> + if (s->pci0_status_pending.up || s->pci0_status_pending.down) { > >>>> + s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; > >>>> + s->pci0_status.up = s->pci0_status_pending.up; > >>>> + s->pci0_status.down = s->pci0_status_pending.down; > >>>> + s->pci0_status_pending.up = 0; > >>>> + s->pci0_status_pending.down = 0; > >>>> + > >>>> + pm_update_sci(s); > >>>> + } > >>>> +} > >>>> + > >>>> static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val) > >>>> { > >>>> PIIX4PMState *s = opaque; > >>>> @@ -539,6 +553,11 @@ static void gpe_writeb(void *opaque, uint32_t addr, > >>>> uint32_t val) > >>>> pm_update_sci(s); > >>>> > >>>> PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val); > >>>> + > >>>> + if (!(g->sts & PIIX4_PCI_HOTPLUG_STATUS) && g->en & > >>>> PIIX4_PCI_HOTPLUG_STATUS) { > >>>> + /* check and reraise the pending hotplug event */ > >>>> + raise_pending_hotplug(s); > >>>> + } > >>>> } > >>>> > >>>> static uint32_t pcihotplug_read(void *opaque, uint32_t addr) > >>>> @@ -639,12 +658,22 @@ static void enable_device(PIIX4PMState *s, int > >>>> slot) > >>>> s->pci0_status.up |= (1 << slot); > >>>> } > >>>> > >>>> +static void pend_enable_device(PIIX4PMState *s, int slot) > >>>> +{ > >>>> + s->pci0_status_pending.up |= (1 << slot); > >>>> +} > >>>> + > >>>> static void disable_device(PIIX4PMState *s, int slot) > >>>> { > >>>> s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; > >>>> s->pci0_status.down |= (1 << slot); > >>>> } > >>>> > >>>> +static void pend_disable_device(PIIX4PMState *s, int slot) > >>>> +{ > >>>> + s->pci0_status_pending.down |= (1 << slot); > >>>> +} > >>>> + > >>>> static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, > >>>> PCIHotplugState state) > >>>> { > >>>> @@ -659,15 +688,23 @@ static int piix4_device_hotplug(DeviceState *qdev, > >>>> PCIDevice *dev, > >>>> return 0; > >>>> } > >>>> > >>>> - s->pci0_status.up = 0; > >>>> - s->pci0_status.down = 0; > >>>> - if (state == PCI_HOTPLUG_ENABLED) { > >>>> - enable_device(s, slot); > >>>> + if (s->gpe.sts & PIIX4_PCI_HOTPLUG_STATUS) { > >>>> + if (state == PCI_HOTPLUG_ENABLED) { > >>>> + pend_enable_device(s, slot); > >>>> + } else { > >>>> + pend_disable_device(s, slot); > >>>> + } > >>>> } else { > >>>> - disable_device(s, slot); > >>>> - } > >>>> + s->pci0_status.up = 0; > >>>> + s->pci0_status.down = 0; > >>>> + if (state == PCI_HOTPLUG_ENABLED) { > >>>> + enable_device(s, slot); > >>>> + } else { > >>>> + disable_device(s, slot); > >>>> + } > >>>> > >>>> - pm_update_sci(s); > >>>> + pm_update_sci(s); > >>>> + } > >>>> > >>>> return 0; > >>>> } > >>>> -- > >>>> 1.7.1 > >>>> > >>>>> > >>>>> Your patch adds a *second* qdev_free(). No good. > >>>>> > >>>>> > >>>> > >>>> > >>> > >> > > > -- yamahata