Re: Patch for MS Hyper V (virtualization)
I was talking about the Hyper-V problem with a guy from MS, and he followed up on it for me. It seems this is a known issue, which should be fixed in the latest version of Hyper-V (i.e. the RC of Windows Server 2008 R2 that was released on TechNet last week).

David.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: Patch for MS Hyper V (virtualization)
On Tuesday 14 April 2009 8:46:51 pm Sergey Babkin wrote:
> John Baldwin wrote:
> > Your printf() probably isn't in the right place. pci_add_map() uses PCIB_READ_CONFIG() directly and doesn't use pci_read_config(), so if your printf is in pci_read_config_method() in pci.c it won't see them. Try hooking the cfg operations in sys/amd64/pci/pci_cfgreg.c instead.
>
> The printf was in pci_write_config().

Yes, that won't catch the PCIB_WRITE_CONFIG()'s in pci_add_map().

--
John Baldwin
Re: Patch for MS Hyper V (virtualization)
John Baldwin wrote:
> On Tuesday 07 April 2009 9:14:26 pm Sergey Babkin wrote:
> > John Baldwin wrote:
> > > On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> > > > Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.
> > >
> > > Can you test unmodified FreeBSD 8 on Hyper-V? It has an extra fix relative to 7 to disable decoding via the PCI command register while sizing BARs that may address this.
> >
> > 8.0 (February snapshot) seems to have the same issue.
>
> Ok.
>
> > I've also saved the log of writes that 7.1 does for this device:
> >
> > reg 10 val ec01
> > reg 14 val febff000
> > reg 18 val 0
> > reg 1c val 0
> > reg 20 val 0
> > reg 24 val 0
> > reg 30 val febe
> > reg 4 val 117
> > reg 3c val b
> > reg 3d val 1
> > reg 3e val 14
> > reg 3f val 28
> > reg c val 8
> > reg d val 40
> > reg 9 val 0
> > reg 8 val 20
> > reg 10 val ec00
> > reg 14 val febff000
> > reg 4 val 117
> > reg 4 val 117
> >
> > I don't see any values.
>
> Your printf() probably isn't in the right place. pci_add_map() uses PCIB_READ_CONFIG() directly and doesn't use pci_read_config(), so if your printf is in pci_read_config_method() in pci.c it won't see them. Try hooking the cfg operations in sys/amd64/pci/pci_cfgreg.c instead.

The printf was in pci_write_config().

-SB
Re: Patch for MS Hyper V (virtualization)
On Tuesday 07 April 2009 9:14:26 pm Sergey Babkin wrote:
> John Baldwin wrote:
> > On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> > > Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.
> >
> > Can you test unmodified FreeBSD 8 on Hyper-V? It has an extra fix relative to 7 to disable decoding via the PCI command register while sizing BARs that may address this.
>
> 8.0 (February snapshot) seems to have the same issue.

Ok.

> I've also saved the log of writes that 7.1 does for this device:
>
> reg 10 val ec01
> reg 14 val febff000
> reg 18 val 0
> reg 1c val 0
> reg 20 val 0
> reg 24 val 0
> reg 30 val febe
> reg 4 val 117
> reg 3c val b
> reg 3d val 1
> reg 3e val 14
> reg 3f val 28
> reg c val 8
> reg d val 40
> reg 9 val 0
> reg 8 val 20
> reg 10 val ec00
> reg 14 val febff000
> reg 4 val 117
> reg 4 val 117
>
> I don't see any values.

Your printf() probably isn't in the right place. pci_add_map() uses PCIB_READ_CONFIG() directly and doesn't use pci_read_config(), so if your printf is in pci_read_config_method() in pci.c it won't see them. Try hooking the cfg operations in sys/amd64/pci/pci_cfgreg.c instead.

--
John Baldwin
Re: Patch for MS Hyper V (virtualization)
On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> John Baldwin wrote:
> > On Monday 06 April 2009 1:07:38 pm Ivan Voras wrote:
> > > 2009/4/6 John Baldwin j...@freebsd.org:
> > > > On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> > > > Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.
> > >
> > > I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.
> >
> > Every OS writes to BARs to size them during boot. It's the defined procedure for sizing them. A BAR is a base address register, and it is how a PCI device gets memory and I/O port resources. The OS (or BIOS) writes a starting address into the register to tell the PCI device where a given resource starts.
>
> The OS doesn't have to write to the BAR if the BIOS has already done it. And the BIOS in the Hyper-V VM is obviously special, so it doesn't trip on itself.

Yes it does, because we don't know how _big_ the BAR is. The OS has to know if the device is decoding 1MB or 64KB because we need to reserve the entire window to prevent any other devices from using it. We don't just write the existing value, we write all 1's to it and read it back. The lower N bits stick at zero and we use that to figure out the BAR's size. See pci_add_map() in sys/dev/pci/pci.c

> Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.

Can you test unmodified FreeBSD 8 on Hyper-V? It has an extra fix relative to 7 to disable decoding via the PCI command register while sizing BARs that may address this.

--
John Baldwin
Re: Patch for MS Hyper V (virtualization)
On Tue, Apr 7, 2009 at 9:21 AM, John Baldwin j...@freebsd.org wrote:
> On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> > John Baldwin wrote:
> > > On Monday 06 April 2009 1:07:38 pm Ivan Voras wrote:
> > > > 2009/4/6 John Baldwin j...@freebsd.org:
> > > > > On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> > > > > Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.
> > > >
> > > > I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.
> > >
> > > Every OS writes to BARs to size them during boot. It's the defined procedure for sizing them. A BAR is a base address register, and it is how a PCI device gets memory and I/O port resources. The OS (or BIOS) writes a starting address into the register to tell the PCI device where a given resource starts.
> >
> > The OS doesn't have to write to the BAR if the BIOS has already done it. And the BIOS in the Hyper-V VM is obviously special, so it doesn't trip on itself.
>
> Yes it does, because we don't know how _big_ the BAR is. The OS has to know if the device is decoding 1MB or 64KB because we need to reserve the entire window to prevent any other devices from using it. We don't just write the existing value, we write all 1's to it and read it back. The lower N bits stick at zero and we use that to figure out the BAR's size. See pci_add_map() in sys/dev/pci/pci.c

John is 100% correct. Every kernel PCI driver has to figure out how big the BAR is, and IN FACT the BIOS typically assigns more address space than the register set you are mapping. This is straight out of the PCI spec.

-aps
Re: Re: Patch for MS Hyper V (virtualization)
(Let's see if I've figured out yet another workaround for the web interface).

The address space used by the card is, I think, actually 0x80 bytes, in the I/O port space. The card has it located at port 0xEC00. Yesterday I had all the values and addresses written to this card's registers printed for debugging, and I don't remember seeing all ones written to this register. It was writing back first 0xEC01 (1 is the I/O space indicator), then 0xEC00. And no, the culprit is not the missing bit 0 (which should be read-only by the PCI spec): I've tried adding it back, and it made no difference.

I'll try FreeBSD 8 and see what happens.

-SB

Apr 7, 2009 10:28:50 AM, j...@freebsd.org wrote:
> On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> > John Baldwin wrote:
> > > On Monday 06 April 2009 1:07:38 pm Ivan Voras wrote:
> > > > 2009/4/6 John Baldwin j...@freebsd.org:
> > > > > On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> > > > > Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.
> > > >
> > > > I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.
> > >
> > > Every OS writes to BARs to size them during boot. It's the defined procedure for sizing them. A BAR is a base address register, and it is how a PCI device gets memory and I/O port resources. The OS (or BIOS) writes a starting address into the register to tell the PCI device where a given resource starts.
> >
> > The OS doesn't have to write to the BAR if the BIOS has already done it. And the BIOS in the Hyper-V VM is obviously special, so it doesn't trip on itself.
>
> Yes it does, because we don't know how _big_ the BAR is. The OS has to know if the device is decoding 1MB or 64KB because we need to reserve the entire window to prevent any other devices from using it. We don't just write the existing value, we write all 1's to it and read it back. The lower N bits stick at zero and we use that to figure out the BAR's size. See pci_add_map() in sys/dev/pci/pci.c
>
> > Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.
>
> Can you test unmodified FreeBSD 8 on Hyper-V? It has an extra fix relative to 7 to disable decoding via the PCI command register while sizing BARs that may address this.
>
> --
> John Baldwin
Re: Patch for MS Hyper V (virtualization)
John Baldwin wrote:
> On Monday 06 April 2009 11:12:33 pm Sergey Babkin wrote:
> > Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.
>
> Can you test unmodified FreeBSD 8 on Hyper-V? It has an extra fix relative to 7 to disable decoding via the PCI command register while sizing BARs that may address this.

8.0 (February snapshot) seems to have the same issue.

I've also saved the log of writes that 7.1 does for this device:

reg 10 val ec01
reg 14 val febff000
reg 18 val 0
reg 1c val 0
reg 20 val 0
reg 24 val 0
reg 30 val febe
reg 4 val 117
reg 3c val b
reg 3d val 1
reg 3e val 14
reg 3f val 28
reg c val 8
reg d val 40
reg 9 val 0
reg 8 val 20
reg 10 val ec00
reg 14 val febff000
reg 4 val 117
reg 4 val 117

I don't see any values.

-SB
Re: Patch for MS Hyper V (virtualization)
On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> Apr 4, 2009 02:10:23 PM, ivo...@freebsd.org wrote:
> > Can someone please review and commit (if appropriate) the tweak for Hyper-V shutdown issue at http://shell.peach.ne.jp/aoyama/archives/40 ?
> >
> > The problem is: the VM appears to hang on shutdown without it (hanging the Hyper-V VM with it so the host also can't shut down or reboot reliably - someone at MS skipped the part where an error in the VM isn't supposed to bring the host down with it)
>
> I don't have the commit permission any more but I can review :-)
>
> Yes, Hyper-V does not like the writes into the PCI config space. Very specifically, writing the base register window address of the simulated 21140 screws up something that prevents the VM from shutting down. Interestingly, even reading and writing back the same value has this effect. So the patch is valid.
>
> I don't particularly like the hackish checking for the 21140 chip, and I'm not sure if it would break some real 21140 chip out there. If the driver does the same as another one I've seen, the driver tries to align the register window to 0x80, and in the simulated 21140 it's already aligned. I've had a quick look but couldn't say it for sure. I'd do it differently: check if the value being written is the same that was read, and skip the write in this case.
>
> Let me see, maybe I'll make a different patch.

Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.

--
John Baldwin
Re: Patch for MS Hyper V (virtualization)
2009/4/6 John Baldwin j...@freebsd.org:
> On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.

I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.
Re: Patch for MS Hyper V (virtualization)
In message: 200904061154.19601@freebsd.org
John Baldwin j...@freebsd.org writes:
: On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
: > Apr 4, 2009 02:10:23 PM, ivo...@freebsd.org wrote:
: > > Can someone please review and commit (if appropriate) the tweak for Hyper-V shutdown issue at http://shell.peach.ne.jp/aoyama/archives/40 ?
: > >
: > > The problem is: the VM appears to hang on shutdown without it (hanging the Hyper-V VM with it so the host also can't shut down or reboot reliably - someone at MS skipped the part where an error in the VM isn't supposed to bring the host down with it)
: >
: > I don't have the commit permission any more but I can review :-)
: >
: > Yes, Hyper-V does not like the writes into the PCI config space.

Why not? We need to understand exactly what it doesn't like, because this is non-standard-compliant behavior.

: > Very specifically, writing the base register window address of the simulated 21140 screws up something that prevents the VM from shutting down. Interestingly, even reading and writing back the same value has this effect. So the patch is valid.

Then the Hyper-V is broken. This is bog-standard PCI behavior. The OS must be able to write to the BARs to size the resource being decoded. In addition, the OS is allowed to move the location of an allocation for a BAR, so avoiding writes to it is bad. Finally, some BIOSes don't allocate resources for a card, and this would totally prevent 21140's from being usable on those machines.

: > I don't particularly like the hackish checking for the 21140 chip, and I'm not sure if it would break some real 21140 chip out there. If the driver does the same as another one I've seen, the driver tries to align the register window to 0x80, and in the simulated 21140 it's already aligned. I've had a quick look but couldn't say it for sure. I'd do it differently: check if the value being written is the same that was read, and skip the write in this case.
: >
: > Let me see, maybe I'll make a different patch.
:
: Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.

Exactly. Not only do we have to read/write them to size the BAR resource, but as I indicated above, one must write to them when the BIOS doesn't assign resources to the BAR and the driver requests that resource.

Warner
Re: Patch for MS Hyper V (virtualization)
On Monday 06 April 2009 1:07:38 pm Ivan Voras wrote:
> 2009/4/6 John Baldwin j...@freebsd.org:
> > On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> > Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.
>
> I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.

Every OS writes to BARs to size them during boot. It's the defined procedure for sizing them. A BAR is a base address register, and it is how a PCI device gets memory and I/O port resources. The OS (or BIOS) writes a starting address into the register to tell the PCI device where a given resource starts.

--
John Baldwin
Re: Patch for MS Hyper V (virtualization)
John Baldwin wrote:
> On Monday 06 April 2009 1:07:38 pm Ivan Voras wrote:
> > 2009/4/6 John Baldwin j...@freebsd.org:
> > > On Sunday 05 April 2009 12:23:39 pm Sergey Babkin wrote:
> > > Hmm, the problem is we need to be able to write to BARs to size them. Any OS needs to be able to do this to know what address space regions are being decoded by devices. We can't avoid writing to BARs.
> >
> > I have only a vague idea what BARs are and whether it's the correct diagnosis in this case, but the fact is that other operating systems (Windows, Linux tested) work, so either there is a way around it or the original premise is wrong-ish.
>
> Every OS writes to BARs to size them during boot. It's the defined procedure for sizing them. A BAR is a base address register, and it is how a PCI device gets memory and I/O port resources. The OS (or BIOS) writes a starting address into the register to tell the PCI device where a given resource starts.

The OS doesn't have to write to the BAR if the BIOS has already done it. And the BIOS in the Hyper-V VM is obviously special, so it doesn't trip on itself.

Anyway, as far as I can tell, it's only the base register of the simulated DEC21140 device that has this issue, so it's quite possible that the bug is in that device's simulator. I've attached a modified patch that checks conservatively for this precise situation, so it should not break compatibility with anything else. I've tested it on Hyper-V.

-SB

--- dev/pci/pci.c.0 2009-04-06 21:35:26.0 +
+++ dev/pci/pci.c 2009-04-06 22:43:08.0 +
@@ -3590,6 +3590,18 @@
         struct pci_devinfo *dinfo = device_get_ivars(child);
         pcicfgregs *cfg = &dinfo->cfg;
 
+        /* A workaround for Hyper-V that hangs on VM stop
+         * if the base address register of the 21140 simulator is written;
+         * since on Hyper-V the value written is the same as the one
+         * already in the register, it can be simply skipped.
+         * 0x1011: DEC, 0x0009: 21140 */
+        if (dinfo->cfg.vendor == 0x1011 && dinfo->cfg.device == 0x0009) {
+                if (reg == PCIR_BARS
+                && (val & ~3) == (PCIB_READ_CONFIG(device_get_parent(dev),
+                        cfg->bus, cfg->slot, cfg->func, reg, width) & ~3) )
+                        return;
+        }
+
         PCIB_WRITE_CONFIG(device_get_parent(dev),
             cfg->bus, cfg->slot, cfg->func, reg, val, width);
 }
Re: Patch for MS Hyper V (virtualization)
Apr 4, 2009 02:10:23 PM, ivo...@freebsd.org wrote:
> Can someone please review and commit (if appropriate) the tweak for Hyper-V shutdown issue at http://shell.peach.ne.jp/aoyama/archives/40 ?
>
> The problem is: the VM appears to hang on shutdown without it (hanging the Hyper-V VM with it so the host also can't shut down or reboot reliably - someone at MS skipped the part where an error in the VM isn't supposed to bring the host down with it)

I don't have the commit permission any more but I can review :-)

Yes, Hyper-V does not like the writes into the PCI config space. Very specifically, writing the base register window address of the simulated 21140 screws up something that prevents the VM from shutting down. Interestingly, even reading and writing back the same value has this effect. So the patch is valid.

I don't particularly like the hackish checking for the 21140 chip, and I'm not sure if it would break some real 21140 chip out there. If the driver does the same as another one I've seen, the driver tries to align the register window to 0x80, and in the simulated 21140 it's already aligned. I've had a quick look but couldn't say it for sure. I'd do it differently: check if the value being written is the same that was read, and skip the write in this case.

Let me see, maybe I'll make a different patch.

-SB