On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > > Hi,
> > > In this post
> > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > > address range is selected for it.
> > > The issue affects all recent qemu releases and all
> > > old and recent guest Linux kernel versions.
> > >
> > > We've done some investigations. Let me explain what happens.
> > > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > > 0xF2000000]
> > >
> > > When Linux guest starts it does PCI bus enumeration.
> > > The OS enumerates 64BIT bars using the following procedure.
> > > 1. Write all FF's to lower half of 64bit BAR
> > > 2. Write address back to lower half of 64bit BAR
> > > 3. Write all FF's to higher half of 64bit BAR
> > > 4. Write address back to higher half of 64bit BAR
> > >
> > > Linux code is here:
> > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> > >
> > > What does it mean for qemu?
> > >
> > > At step 1. qemu pci_default_write_config() recevies all FFs for lower
> > > part of the 64bit BAR. Then it applies the mask and converts the value
> > > to "All FF's - size + 1" (FE000000 if size is 32MB).
> > > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > > updates topology and sends request to update mappings in KVM with new
> > > range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> > > panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> > > range, which is quite common.
> >
> > Do you know why does it panic? As far as I can see
> > from code at
> > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> >
> > 171 pci_read_config_dword(dev, pos, &l);
> > 172 pci_write_config_dword(dev, pos, l | mask);
> > 173 pci_read_config_dword(dev, pos, &sz);
> > 174 pci_write_config_dword(dev, pos, l);
> >
> > BAR is restored: what triggers an access between lines 172 and 174?
>
> Random interrupt reading the time, likely.

Weird, what the backtrace shows is init, unrelated to interrupts.

> > Also, what you describe happens on a 32 bit BAR in the same way, no?
>
> So it seems. Btw, is this procedure correct for sizing a BAR which is
> larger than 4GB?

There's more code sizing 64 bit BARs, but generally software is allowed
to write any junk into enabled BARs as long as there aren't any memory
accesses.

> --
> error compiling committee.c: too many arguments to function
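
To make the arithmetic in Alexey's step 1 concrete, here is a minimal C sketch of the transient address. It is not QEMU or Linux code: the helper names (bar_low_after_write, bar_address, bar_transient_address) are hypothetical, and the sketch assumes a memory BAR whose size is a power of two and ignores the low flag bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a QEMU function: the value the low dword of a
 * memory BAR holds after the guest writes `val` to it. Address bits below
 * the BAR size are hardwired to zero. Flag bits 3:0 are ignored here. */
static inline uint32_t bar_low_after_write(uint32_t val, uint32_t size)
{
    /* Assumption: size is a power of two and at least 16 bytes. */
    assert(size >= 16 && (size & (size - 1)) == 0);
    return val & ~(size - 1);
}

/* Combine the two halves of a 64-bit BAR into the address it decodes. */
static inline uint64_t bar_address(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Address seen after step 1 of the sizing procedure: the low half has
 * been written with all FFs while the high half still holds its old
 * value. */
static inline uint64_t bar_transient_address(uint32_t high, uint32_t size)
{
    return bar_address(bar_low_after_write(0xFFFFFFFFu, size), high);
}
```

With size = 0x02000000 (32MB) and a high half of 0, bar_transient_address() gives 0xFE000000, so the BAR would cover FE000000 - 0xFFFFFFFF until the original address is written back, which is the range described above.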