Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> On 27/01/12 04:12, Avi Kivity wrote:
>> On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
>>> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>>>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>>>>>> Hi,
>>>>>> In this post
>>>>>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
>>>>>> mentioned about the issues when 64Bit PCI BAR is present and 32bit
>>>>>> address range is selected for it.
>>>>>> The issue affects all recent qemu releases and all
>>>>>> old and recent guest Linux kernel versions.
>>>>>>
>>>>>> We've done some investigations. Let me explain what happens.
>>>>>> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>>>>>> 0xF2000000]
>>>>>>
>>>>>> When Linux guest starts it does PCI bus enumeration.
>>>>>> The OS enumerates 64BIT bars using the following procedure.
>>>>>> 1. Write all FF's to lower half of 64bit BAR
>>>>>> 2. Write address back to lower half of 64bit BAR
>>>>>> 3. Write all FF's to higher half of 64bit BAR
>>>>>> 4. Write address back to higher half of 64bit BAR
>>>>>>
>>>>>> Linux code is here:
>>>>>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>>>>>
>>>>>> What does it mean for qemu?
>>>>>>
>>>>>> At step 1. qemu pci_default_write_config() receives all FFs for lower
>>>>>> part of the 64bit BAR. Then it applies the mask and converts the value
>>>>>> to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>>>>>> Then pci_bar_address() checks if BAR address is valid. Since it is a
>>>>>> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>>>>>> updates topology and sends request to update mappings in KVM with new
>>>>>> range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means kernel
>>>>>> panic on boot, if there is another mapping in the 0xFE000000 - 0xFFFFFFFF
>>>>>> range, which is quite common.
>>>>> Do you know why does it panic? As far as I can see
>>>>> from code at
>>>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>>>
>>>>> 171 pci_read_config_dword(dev, pos, &l);
>>>>> 172 pci_write_config_dword(dev, pos, l | mask);
>>>>> 173 pci_read_config_dword(dev, pos, &sz);
>>>>> 174 pci_write_config_dword(dev, pos, l);
>>>>>
>>>>> BAR is restored: what triggers an access between lines 172 and 174?
>>>> Random interrupt reading the time, likely.
>>> Weird, what the backtrace shows is init, unrelated
>>> to interrupts.
>>>
>> It's a bug then. qemu doesn't undo the mapping correctly.
>>
>> If you have clear instructions, I'll try to reproduce it.
>>
> Well, the easiest way to reproduce this is:
>
> 1. Get kernel bzImage (version < 2.6.36)
> 2. Apply patch to ivshmem.c
>

I have some patches that fix this, but they're very hacky since they're
dealing with the old and rotten core. I much prefer to let this resolve
itself in my continuing rewrite. Is this an urgent problem for you or
can you live with this for a while?

-- 
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> On 27/01/12 04:12, Avi Kivity wrote:
>> On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
>>> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>>>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>>>>>> Hi,
>>>>>> In this post
>>>>>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
>>>>>> mentioned about the issues when 64Bit PCI BAR is present and 32bit
>>>>>> address range is selected for it.
>>>>>> The issue affects all recent qemu releases and all
>>>>>> old and recent guest Linux kernel versions.
>>>>>>
>>>>>> We've done some investigations. Let me explain what happens.
>>>>>> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>>>>>> 0xF2000000]
>>>>>>
>>>>>> When Linux guest starts it does PCI bus enumeration.
>>>>>> The OS enumerates 64BIT bars using the following procedure.
>>>>>> 1. Write all FF's to lower half of 64bit BAR
>>>>>> 2. Write address back to lower half of 64bit BAR
>>>>>> 3. Write all FF's to higher half of 64bit BAR
>>>>>> 4. Write address back to higher half of 64bit BAR
>>>>>>
>>>>>> Linux code is here:
>>>>>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>>>>>
>>>>>> What does it mean for qemu?
>>>>>>
>>>>>> At step 1. qemu pci_default_write_config() receives all FFs for lower
>>>>>> part of the 64bit BAR. Then it applies the mask and converts the value
>>>>>> to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>>>>>> Then pci_bar_address() checks if BAR address is valid. Since it is a
>>>>>> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>>>>>> updates topology and sends request to update mappings in KVM with new
>>>>>> range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means kernel
>>>>>> panic on boot, if there is another mapping in the 0xFE000000 - 0xFFFFFFFF
>>>>>> range, which is quite common.
>>>>> Do you know why does it panic? As far as I can see
>>>>> from code at
>>>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>>>
>>>>> 171 pci_read_config_dword(dev, pos, &l);
>>>>> 172 pci_write_config_dword(dev, pos, l | mask);
>>>>> 173 pci_read_config_dword(dev, pos, &sz);
>>>>> 174 pci_write_config_dword(dev, pos, l);
>>>>>
>>>>> BAR is restored: what triggers an access between lines 172 and 174?
>>>> Random interrupt reading the time, likely.
>>> Weird, what the backtrace shows is init, unrelated
>>> to interrupts.
>>>
>> It's a bug then. qemu doesn't undo the mapping correctly.
>>
>> If you have clear instructions, I'll try to reproduce it.
>>
> Well, the easiest way to reproduce this is:
>
> 1. Get kernel bzImage (version < 2.6.36)
> 2. Apply patch to ivshmem.c
>
> ---
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 1aa9e3b..71f8c21 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd) {
>      memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>
>      /* region for shared memory */
> -    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> +    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar);
>  }
>
>  static void close_guest_eventfds(IVShmemState *s, int posn)
> ---
>
> 3. Launch qemu with a command like this:
>
> /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp
> 1,socket=1,cores=1,threads=1 -name centos54 -uuid
> d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> base=utc -drive
> file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
> file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
> -device
> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev
> file,id=charserial0,path=/home/alexey/cent54.log -device
> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga
> cirrus -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0
> --device ivshmem,size=32,shm="shm" -kernel bzImage -append
> "root=/dev/hda1 console=ttyS0,115200n8 console=tty0"
>
> In other words, add: --device ivshmem,size=32,shm="shm"
>
> That is all.
>
> Note: it won't necessarily cause a panic message; on some kernels it just
> hangs or reboots.
>

In fact qemu segfaults for me, since registering a ram region not on a
page boundary is broken. This happens when the ivshmem bar is split by
the hpet region, which is less than a page long.

-- 
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 27/01/12 04:12, Avi Kivity wrote:
> On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
>> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>>>>> Hi,
>>>>> In this post
>>>>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
>>>>> mentioned about the issues when 64Bit PCI BAR is present and 32bit
>>>>> address range is selected for it.
>>>>> The issue affects all recent qemu releases and all
>>>>> old and recent guest Linux kernel versions.
>>>>>
>>>>> We've done some investigations. Let me explain what happens.
>>>>> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>>>>> 0xF2000000]
>>>>>
>>>>> When Linux guest starts it does PCI bus enumeration.
>>>>> The OS enumerates 64BIT bars using the following procedure.
>>>>> 1. Write all FF's to lower half of 64bit BAR
>>>>> 2. Write address back to lower half of 64bit BAR
>>>>> 3. Write all FF's to higher half of 64bit BAR
>>>>> 4. Write address back to higher half of 64bit BAR
>>>>>
>>>>> Linux code is here:
>>>>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>>>>
>>>>> What does it mean for qemu?
>>>>>
>>>>> At step 1. qemu pci_default_write_config() receives all FFs for lower
>>>>> part of the 64bit BAR. Then it applies the mask and converts the value
>>>>> to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>>>>> Then pci_bar_address() checks if BAR address is valid. Since it is a
>>>>> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>>>>> updates topology and sends request to update mappings in KVM with new
>>>>> range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means kernel
>>>>> panic on boot, if there is another mapping in the 0xFE000000 - 0xFFFFFFFF
>>>>> range, which is quite common.
>>>> Do you know why does it panic? As far as I can see
>>>> from code at
>>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>>
>>>> 171 pci_read_config_dword(dev, pos, &l);
>>>> 172 pci_write_config_dword(dev, pos, l | mask);
>>>> 173 pci_read_config_dword(dev, pos, &sz);
>>>> 174 pci_write_config_dword(dev, pos, l);
>>>>
>>>> BAR is restored: what triggers an access between lines 172 and 174?
>>> Random interrupt reading the time, likely.
>> Weird, what the backtrace shows is init, unrelated
>> to interrupts.
>>
> It's a bug then. qemu doesn't undo the mapping correctly.
>
> If you have clear instructions, I'll try to reproduce it.
>

Well, the easiest way to reproduce this is:

1. Get kernel bzImage (version < 2.6.36)
2. Apply patch to ivshmem.c

---
diff --git a/hw/ivshmem.c b/hw/ivshmem.c
index 1aa9e3b..71f8c21 100644
--- a/hw/ivshmem.c
+++ b/hw/ivshmem.c
@@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd) {
     memory_region_add_subregion(&s->bar, 0, &s->ivshmem);

     /* region for shared memory */
-    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
+    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar);
 }

 static void close_guest_eventfds(IVShmemState *s, int posn)
---

3. Launch qemu with a command like this:

/usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp
1,socket=1,cores=1,threads=1 -name centos54 -uuid
d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=readline -rtc
base=utc -drive
file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
-device
ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev
file,id=charserial0,path=/home/alexey/cent54.log -device
isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga
cirrus -device
virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0
--device ivshmem,size=32,shm="shm" -kernel bzImage -append
"root=/dev/hda1 console=ttyS0,115200n8 console=tty0"

In other words, add: --device ivshmem,size=32,shm="shm"

That is all.

Note: it won't necessarily cause a panic message; on some kernels it just
hangs or reboots.
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 27/01/12 03:36, Michael S. Tsirkin wrote: > On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote: >> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote: >>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: Hi, In this post http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've mentioned about the issues when 64Bit PCI BAR is present and 32bit address range is selected for it. The issue affects all recent qemu releases and all old and recent guest Linux kernel versions. We've done some investigations. Let me explain what happens. Assume we have 64bit BAR with size 32MB mapped at [0xF000 - 0xF200] When Linux guest starts it does PCI bus enumeration. The OS enumerates 64BIT bars using the following procedure. 1. Write all FF's to lower half of 64bit BAR 2. Write address back to lower half of 64bit BAR 3. Write all FF's to higher half of 64bit BAR 4. Write address back to higher half of 64bit BAR Linux code is here: http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 What does it mean for qemu? At step 1. qemu pci_default_write_config() recevies all FFs for lower part of the 64bit BAR. Then it applies the mask and converts the value to "All FF's - size + 1" (FE00 if size is 32MB). Then pci_bar_address() checks if BAR address is valid. Since it is a 64bit bar it reads 0xFE00 - this address is valid. So qemu updates topology and sends request to update mappings in KVM with new range for the 64bit BAR FE00 - 0x. This usually means kernel panic on boot, if there is another mapping in the FE00 - 0x range, which is quite common. >>> Do you know why does it panic? As far as I can see >>> from code at >>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162 >>> >>> 171pci_read_config_dword(dev, pos, &l); >>> 172pci_write_config_dword(dev, pos, l | mask); >>> 173pci_read_config_dword(dev, pos, &sz); >>> 174pci_write_config_dword(dev, pos, l); >>> >>> BAR is restored: what triggers an access between lines 172 and 174? 
>> Random interrupt reading the time, likely.
> Weird, what the backtrace shows is init, unrelated
> to interrupts.

Yes, it fails during the ordered late_hpet_init() call, which is part of the kernel's fs_initcall list, so no timer interrupts are involved here. Basically, once the region is programmed (even temporarily), the area behind it is lost. That is, if we overlap the HPET region with our BAR, which is backed by host user space memory, and commit a mapping request to kvm, the information about the old mappings belonging to the HPET is lost. This happens even if the overlap lasts only a short time and the original address is restored afterwards.

>>> Also, what you describe happens on a 32 bit BAR in the same way, no?
>> So it seems. Btw, is this procedure correct for sizing a BAR which is
>> larger than 4GB?
> There's more code sizing 64 bit BARs, but generally
> software is allowed to write any junk into enabled BARs
> as long as there aren't any memory accesses.
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote: > On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote: > > On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote: > > > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > > > > Hi, > > > > In this post > > > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > > > > mentioned about the issues when 64Bit PCI BAR is present and 32bit > > > > address range is selected for it. > > > > The issue affects all recent qemu releases and all > > > > old and recent guest Linux kernel versions. > > > > > > > > We've done some investigations. Let me explain what happens. > > > > Assume we have 64bit BAR with size 32MB mapped at [0xF000 - > > > > 0xF200] > > > > > > > > When Linux guest starts it does PCI bus enumeration. > > > > The OS enumerates 64BIT bars using the following procedure. > > > > 1. Write all FF's to lower half of 64bit BAR > > > > 2. Write address back to lower half of 64bit BAR > > > > 3. Write all FF's to higher half of 64bit BAR > > > > 4. Write address back to higher half of 64bit BAR > > > > > > > > Linux code is here: > > > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 > > > > > > > > What does it mean for qemu? > > > > > > > > At step 1. qemu pci_default_write_config() recevies all FFs for lower > > > > part of the 64bit BAR. Then it applies the mask and converts the value > > > > to "All FF's - size + 1" (FE00 if size is 32MB). > > > > Then pci_bar_address() checks if BAR address is valid. Since it is a > > > > 64bit bar it reads 0xFE00 - this address is valid. So qemu > > > > updates topology and sends request to update mappings in KVM with new > > > > range for the 64bit BAR FE00 - 0x. This usually means kernel > > > > panic on boot, if there is another mapping in the FE00 - 0x > > > > range, which is quite common. > > > > > > Do you know why does it panic? 
As far as I can see > > > from code at > > > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162 > > > > > > 171pci_read_config_dword(dev, pos, &l); > > > 172pci_write_config_dword(dev, pos, l | mask); > > > 173pci_read_config_dword(dev, pos, &sz); > > > 174pci_write_config_dword(dev, pos, l); > > > > > > BAR is restored: what triggers an access between lines 172 and 174? > > > > Random interrupt reading the time, likely. > > Weird, what the backtrace shows is init, unrelated > to interrupts. > It's a bug then. qemu doesn't undo the mapping correctly. If you have clear instructions, I'll try to reproduce it. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote: > On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote: > > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > > > Hi, > > > In this post > > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > > > mentioned about the issues when 64Bit PCI BAR is present and 32bit > > > address range is selected for it. > > > The issue affects all recent qemu releases and all > > > old and recent guest Linux kernel versions. > > > > > > We've done some investigations. Let me explain what happens. > > > Assume we have 64bit BAR with size 32MB mapped at [0xF000 - > > > 0xF200] > > > > > > When Linux guest starts it does PCI bus enumeration. > > > The OS enumerates 64BIT bars using the following procedure. > > > 1. Write all FF's to lower half of 64bit BAR > > > 2. Write address back to lower half of 64bit BAR > > > 3. Write all FF's to higher half of 64bit BAR > > > 4. Write address back to higher half of 64bit BAR > > > > > > Linux code is here: > > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 > > > > > > What does it mean for qemu? > > > > > > At step 1. qemu pci_default_write_config() recevies all FFs for lower > > > part of the 64bit BAR. Then it applies the mask and converts the value > > > to "All FF's - size + 1" (FE00 if size is 32MB). > > > Then pci_bar_address() checks if BAR address is valid. Since it is a > > > 64bit bar it reads 0xFE00 - this address is valid. So qemu > > > updates topology and sends request to update mappings in KVM with new > > > range for the 64bit BAR FE00 - 0x. This usually means kernel > > > panic on boot, if there is another mapping in the FE00 - 0x > > > range, which is quite common. > > > > Do you know why does it panic? 
As far as I can see > > from code at > > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162 > > > > 171pci_read_config_dword(dev, pos, &l); > > 172pci_write_config_dword(dev, pos, l | mask); > > 173pci_read_config_dword(dev, pos, &sz); > > 174pci_write_config_dword(dev, pos, l); > > > > BAR is restored: what triggers an access between lines 172 and 174? > > Random interrupt reading the time, likely. Weird, what the backtrace shows is init, unrelated to interrupts. > > Also, what you describe happens on a 32 bit BAR in the same way, no? > > So it seems. Btw, is this procedure correct for sizing a BAR which is > larger than 4GB? There's more code sizing 64 bit BARs, but generally software is allowed to write any junk into enabled BARs as long as there aren't any memory accesses. > -- > error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/26/2012 04:05 PM, Michael S. Tsirkin wrote:
> > Let me see if I get this right: during BAR sizing, the guest sets the
> > BAR to ~1, which means 4GB-32MB -> 4GB, which overlaps the HPET. If so,
> > that's expected behaviour.
>
> Yes BAR sizing temporarily sets the BAR to an invalid value then
> restores it. What I don't understand is how come something accesses the
> HPET range in between.

Interrupt -> read time.

> > If the guest doesn't want this memory there,
> > it should disable mmio.
>
> Recent kernels do this for most devices, but not for
> platform devices.

Then they are vulnerable to this issue. The i440fx spec states that the entire range from top-of-memory to 4GB is forwarded to PCI, so qemu appears to be correct here.

--
error compiling committee.c: too many arguments to function
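The "4GB-32MB -> 4GB" arithmetic in this message can be checked with a small sketch. It is only an illustration, not qemu code: the helper names are hypothetical, the HPET base is assumed to be the conventional x86 value 0xFED00000 (the boot log in this thread shows it only in truncated form), and the 1KB HPET register block size is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: conventional x86 HPET base and a 1KB register block. */
#define HPET_BASE 0xFED00000u
#define HPET_SIZE 0x400u

/* Value a 32-bit memory BAR holds after all ones are written to it:
 * the address bits at or above the BAR size stick, the lower bits are
 * hardwired to zero. Type/prefetch flag bits are ignored for simplicity. */
static uint32_t bar_sizing_readback(uint32_t size)
{
    assert(size >= 16 && (size & (size - 1)) == 0); /* power of two */
    return ~(size - 1);
}

/* True if [a, a+alen) and [b, b+blen) intersect; 64-bit math avoids wrap. */
static bool ranges_overlap(uint64_t a, uint64_t alen,
                           uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}
```

Under these assumptions a 32MB BAR transiently decodes from 0xFE000000 up to 4GB, which covers the assumed HPET window, while a 4KB BAR decodes only the last page below 4GB and does not.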
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Thu, Jan 26, 2012 at 03:51:06PM +0200, Avi Kivity wrote: > > Please look at HPET lines. HPET is mapped to 0xfed0. > > Size of ivshmem is 32MB. During pci enumeration ivshmem will corrupt the > > range from 0xfe00 - 0x. > > It overlaps HPET memory. When Linux does late_hpet init, it finds garbage > > and this is causing panic. > > > > Let me see if I get this right: during BAR sizing, the guest sets the > BAR to ~1, which means 4GB-32MB -> 4GB, which overlaps the HPET. If so, > that's expected behaviour. Yes BAR sizing temporarily sets the BAR to an invalid value then restores it. What I don't understand is how come something accesses the HPET range in between. > If the guest doesn't want this memory there, > it should disable mmio. Recent kernels do this for most devices, but not for platform devices. > -- > error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote: > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > > Hi, > > In this post > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > > mentioned about the issues when 64Bit PCI BAR is present and 32bit > > address range is selected for it. > > The issue affects all recent qemu releases and all > > old and recent guest Linux kernel versions. > > > > We've done some investigations. Let me explain what happens. > > Assume we have 64bit BAR with size 32MB mapped at [0xF000 - > > 0xF200] > > > > When Linux guest starts it does PCI bus enumeration. > > The OS enumerates 64BIT bars using the following procedure. > > 1. Write all FF's to lower half of 64bit BAR > > 2. Write address back to lower half of 64bit BAR > > 3. Write all FF's to higher half of 64bit BAR > > 4. Write address back to higher half of 64bit BAR > > > > Linux code is here: > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 > > > > What does it mean for qemu? > > > > At step 1. qemu pci_default_write_config() recevies all FFs for lower > > part of the 64bit BAR. Then it applies the mask and converts the value > > to "All FF's - size + 1" (FE00 if size is 32MB). > > Then pci_bar_address() checks if BAR address is valid. Since it is a > > 64bit bar it reads 0xFE00 - this address is valid. So qemu > > updates topology and sends request to update mappings in KVM with new > > range for the 64bit BAR FE00 - 0x. This usually means kernel > > panic on boot, if there is another mapping in the FE00 - 0x > > range, which is quite common. > > Do you know why does it panic? As far as I can see > from code at > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162 > > 171pci_read_config_dword(dev, pos, &l); > 172pci_write_config_dword(dev, pos, l | mask); > 173pci_read_config_dword(dev, pos, &sz); > 174pci_write_config_dword(dev, pos, l); > > BAR is restored: what triggers an access between lines 172 and 174? 
Random interrupt reading the time, likely. > Also, what you describe happens on a 32 bit BAR in the same way, no? So it seems. Btw, is this procedure correct for sizing a BAR which is larger than 4GB? -- error compiling committee.c: too many arguments to function
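On the question of BARs larger than 4GB, the following is a minimal sketch of how the size of a 64bit memory BAR can be derived from both halves, assuming standard PCI BAR semantics (low 4 bits of a memory BAR are flag bits). The function name is hypothetical and this is not taken from the Linux or qemu sources.

```c
#include <assert.h>
#include <stdint.h>

#define BAR_MEM_FLAG_MASK 0xFu /* low 4 bits of a memory BAR are flags */

/* Size of a 64-bit memory BAR, given the values read back from the low
 * and high dwords after all ones were written to each of them. */
static uint64_t bar64_size(uint32_t lo_readback, uint32_t hi_readback)
{
    uint64_t mask = ((uint64_t)hi_readback << 32) |
                    (lo_readback & ~BAR_MEM_FLAG_MASK);

    assert(mask != 0); /* an all-zero mask means the BAR is unimplemented */
    return ~mask + 1;  /* two's complement of the address mask */
}
```

For a BAR of 4GB or more, the low dword reads back with no address bits set at all, so the low half alone cannot give the size; only the combined 64bit mask does.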
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 01/26/2012 05:19 AM, Alexey Korolev wrote: > If you apply the following patch and add to qemu command: --device > ivshmem,size=32,shm="shm" > --- > diff --git a/hw/ivshmem.c b/hw/ivshmem.c > index 1aa9e3b..71f8c21 100644 > --- a/hw/ivshmem.c > +++ b/hw/ivshmem.c > @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int > fd) { > memory_region_add_subregion(&s->bar, 0, &s->ivshmem); > > /* region for shared memory */ > -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar); > +pci_register_bar(&s->dev, 2, > PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar) > } > > static void close_guest_eventfds(IVShmemState *s, int posn) > --- > > You can get the following bootup log: > > > Bootdata ok (command line is root=/dev/hda1 console=ttyS0,115200n8 > console=tty0) > Linux version 2.6.18 (root@localhost.localdomain) (gcc version 4.1.2 20080704 > (Red Hat 4.1.2-48)) #3 SMP Tue Jan 17 16:37:33 NZDT 2012 > BIOS-provided physical RAM map: > BIOS-e820: - 0009f400 (usable) > BIOS-e820: 0009f400 - 000a (reserved) > BIOS-e820: 000f - 0010 (reserved) > BIOS-e820: 0010 - 7fffd000 (usable) > BIOS-e820: 7fffd000 - 8000 (reserved) > BIOS-e820: feffc000 - ff00 (reserved) > BIOS-e820: fffc - 0001 (reserved) > DMI 2.4 present. 
> No NUMA configuration found > Faking a node at -7fffd000 > Bootmem setup node 0 -7fffd000 > ACPI: PM-Timer IO Port: 0xb008 > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) > Processor #0 6:2 APIC version 17 > ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0]) > IOAPIC[0]: apic_id 1, version 17, address 0xfec0, GSI 0-23 > ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) > ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) > ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) > ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) > ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) > Setting APIC routing to physical flat > ACPI: HPET id: 0x8086a201 base: 0xfed0 > Using ACPI (MADT) for SMP configuration information > Allocating PCI resources starting at 8800 (gap: 8000:7effc000) > SMP: Allowing 1 CPUs, 0 hotplug CPUs > Built 1 zonelists. Total pages: 515393 > Kernel command line: root=/dev/hda1 console=ttyS0,115200n8 console=tty0 > Initializing CPU#0 > PID hash table entries: 4096 (order: 12, 32768 bytes) > time.c: Using 100.00 MHz WALL HPET GTOD HPET/TSC timer. > time.c: Detected 2500.081 MHz processor. > Console: colour VGA+ 80x25 > Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes) > Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes) > Checking aperture... > Memory: 2058096k/2097140k available (3256k kernel code, 38656k reserved, > 2266k data, 204k init) > Calibrating delay using timer specific routine.. 5030.07 BogoMIPS > (lpj=10060155) > Mount-cache hash table entries: 256 > CPU: L1 I cache: 32K, L1 D cache: 32K > CPU: L2 cache: 4096K > MCE: warning: using only 10 banks > SMP alternatives: switching to UP code > Freeing SMP alternatives: 36k freed > ACPI: Core revision 20060707 > activating NMI Watchdog ... done. > Using local APIC timer interrupts. > result 62501506 > Detected 62.501 MHz APIC timer. > Brought up 1 CPUs > testing NMI watchdog ... OK. 
> migration_cost=0 > NET: Registered protocol family 16 > ACPI: bus type pci registered > PCI: Using configuration type 1 > ACPI: Interpreter enabled > ACPI: Using IOAPIC for interrupt routing > ACPI: PCI Root Bridge [PCI0] (:00) > ACPI: Assume root bridge [\_SB_.PCI0] bus is 0 > PCI quirk: region b000-b03f claimed by PIIX4 ACPI > PCI quirk: region b100-b10f claimed by PIIX4 SMB > ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11) > ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11) > ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11) > ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11) > ACPI: PCI Interrupt Link [LNKS] (IRQs 9) *0, disabled. > SCSI subsystem initialized > usbcore: registered new driver usbfs > usbcore: registered new driver hub > PCI: Using ACPI for IRQ routing > PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report > divide error: [1] SMP > CPU 0 > Modules linked in: > Pid: 1, comm: swapper Not tainted 2.6.18 #3 > RIP: 0010:[] [] hpet_alloc+0x12a/0x30c > RSP: :81007e3a1e20 EFLAGS: 00010246 > RAX: 00038d7ea4c68000 RBX: RCX: > RDX: RSI: RDI: 8057fc2b > RBP: 81007e2e28c0 R08: 8055b492 R09: 81007e39f510 > R10: 81007e3a1e50 R11: 0098 R12: 81007e3a1e50 > R13: R14: ff5fe000 R15: > FS: () GS:807fc000() knlGS: > CS: 0010 D
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > Hi, > In this post > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > mentioned about the issues when 64Bit PCI BAR is present and 32bit > address range is selected for it. > The issue affects all recent qemu releases and all > old and recent guest Linux kernel versions. > > We've done some investigations. Let me explain what happens. > Assume we have 64bit BAR with size 32MB mapped at [0xF000 - > 0xF200] > > When Linux guest starts it does PCI bus enumeration. > The OS enumerates 64BIT bars using the following procedure. > 1. Write all FF's to lower half of 64bit BAR > 2. Write address back to lower half of 64bit BAR > 3. Write all FF's to higher half of 64bit BAR > 4. Write address back to higher half of 64bit BAR > > Linux code is here: > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 > > What does it mean for qemu? > > At step 1. qemu pci_default_write_config() recevies all FFs for lower > part of the 64bit BAR. Then it applies the mask and converts the value > to "All FF's - size + 1" (FE00 if size is 32MB). > Then pci_bar_address() checks if BAR address is valid. Since it is a > 64bit bar it reads 0xFE00 - this address is valid. So qemu > updates topology and sends request to update mappings in KVM with new > range for the 64bit BAR FE00 - 0x. This usually means kernel > panic on boot, if there is another mapping in the FE00 - 0x > range, which is quite common. Do you know why does it panic? As far as I can see from code at http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162 171pci_read_config_dword(dev, pos, &l); 172pci_write_config_dword(dev, pos, l | mask); 173pci_read_config_dword(dev, pos, &sz); 174pci_write_config_dword(dev, pos, l); BAR is restored: what triggers an access between lines 172 and 174? Also, what you describe happens on a 32 bit BAR in the same way, no? -- MST
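The window being asked about can be made visible with a toy model of the four config accesses quoted above. This is a sketch under stated assumptions: struct toy_bar and its helpers are hypothetical, they model a single 32bit memory BAR with no flag bits, and the sizing write uses plain all ones rather than the kernel's `l | mask`.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one 32-bit memory BAR (hypothetical, not qemu's PCIDevice). */
struct toy_bar {
    uint32_t value;     /* current register contents */
    uint32_t size;      /* power of two, at least 16 */
    uint32_t transient; /* what the register held between write and restore */
};

static void toy_bar_write(struct toy_bar *b, uint32_t v)
{
    /* Address bits below the BAR size are hardwired to zero. */
    b->value = v & ~(b->size - 1);
}

static uint32_t toy_bar_read(const struct toy_bar *b)
{
    return b->value;
}

/* Read, write all ones, read the size mask, restore the old value. */
static uint32_t toy_bar_probe_size(struct toy_bar *b)
{
    uint32_t l, sz;

    assert(b->size >= 16 && (b->size & (b->size - 1)) == 0);
    l = toy_bar_read(b);
    toy_bar_write(b, 0xFFFFFFFFu);
    b->transient = toy_bar_read(b); /* BAR decodes here until restored */
    sz = b->transient;
    toy_bar_write(b, l);
    return ~sz + 1;
}
```

With a 32MB BAR initially at 0xF0000000, the register holds 0xFE000000 between the second and fourth access. If the device model commits a mapping at that moment, the restore at the end does not by itself undo whatever the transient mapping displaced.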
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On 26/01/12 01:51, Michael S. Tsirkin wrote: > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: >> Hi, >> In this post >> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've >> mentioned about the issues when 64Bit PCI BAR is present and 32bit >> address range is selected for it. >> The issue affects all recent qemu releases and all >> old and recent guest Linux kernel versions. >> >> We've done some investigations. Let me explain what happens. >> Assume we have 64bit BAR with size 32MB mapped at [0xF000 - >> 0xF200] >> >> When Linux guest starts it does PCI bus enumeration. >> The OS enumerates 64BIT bars using the following procedure. >> 1. Write all FF's to lower half of 64bit BAR >> 2. Write address back to lower half of 64bit BAR >> 3. Write all FF's to higher half of 64bit BAR >> 4. Write address back to higher half of 64bit BAR >> >> Linux code is here: >> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 >> >> What does it mean for qemu? >> >> At step 1. qemu pci_default_write_config() recevies all FFs for lower >> part of the 64bit BAR. Then it applies the mask and converts the value >> to "All FF's - size + 1" (FE00 if size is 32MB). >> Then pci_bar_address() checks if BAR address is valid. Since it is a >> 64bit bar it reads 0xFE00 - this address is valid. So qemu >> updates topology and sends request to update mappings in KVM with new >> range for the 64bit BAR FE00 - 0x. This usually means kernel >> panic on boot, if there is another mapping in the FE00 - 0x >> range, which is quite common. >> >> >> The following patch fixes the issue. It affects 64bit PCI BAR's only. >> The idea of the patch is: we introduce the states for low and high BARs >> whose can have 3 possible values: BAR_VALID, PCIBAR64_PARTIAL_SIZE_QUERY >> - someone has requested size of one half of the 64bit PCI BAR, >> PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a request to update the >> address of one half of the 64bit PCI BAR. 
The state becomes BAR_VALID
>> when both halfs are in the same state. We ignore BAR value until both
>> states become BAR_VALID
>>
>> Note: Please use the latest Seabios version (commit
>> 139d5ac037de828f89c36e39c6dd15610650cede and later), as older versions
>> didn't initialize high part of 64bit BAR.
>>
>> The patch is tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server
>>
>> Signed-off-by: Alexey Korolev
> Interesting. However, looking at guest code,
> I note that memory and io are disabled
> during BAR sizing unless mmio always on is set.
> pci_bar_address should return PCI_BAR_UNMAPPED
> in this case, and we should never map this BAR
> until it's enabled. What's going on?
>
>

Oh, good point. You are right here. Linux developers added this protection for the lower part of the PCI BAR starting with 2.6.36, so the issue affects all guest kernels before 2.6.36. Sorry about the confusion.

The code without the protection is here:
http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162

The submitted patch is still relevant for solving this issue with older kernel versions.
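The per-half state tracking described in the quoted patch summary can be sketched in isolation as follows. The enum values and the state_lo/state_hi field names follow the patch, but the enum definition itself, struct bar64_track and both helper functions are assumptions made for this illustration, since the header part of the patch is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed definition; the patch declares its states in hw/pci.h. */
typedef enum {
    PCIBAR_VALID,
    PCIBAR64_PARTIAL_SIZE_QUERY,
    PCIBAR64_PARTIAL_ADDR_PROGRAM
} PCIBARState;

/* Hypothetical stand-in for the two state fields added to PCIIORegion. */
struct bar64_track {
    PCIBARState state_lo;
    PCIBARState state_hi;
};

/* Record a config write to one half of a 64-bit BAR. */
static void bar64_note_write(struct bar64_track *t, bool high_half,
                             uint32_t val)
{
    PCIBARState *s = high_half ? &t->state_hi : &t->state_lo;

    *s = (val == 0xFFFFFFFFu) ? PCIBAR64_PARTIAL_SIZE_QUERY
                              : PCIBAR64_PARTIAL_ADDR_PROGRAM;

    /* Both halves in the same state: the BAR value is consistent again. */
    if (t->state_lo == t->state_hi) {
        t->state_lo = t->state_hi = PCIBAR_VALID;
    }
}

/* Mappings are only updated while both halves are valid. */
static bool bar64_mappable(const struct bar64_track *t)
{
    return t->state_lo == PCIBAR_VALID && t->state_hi == PCIBAR_VALID;
}
```

Feeding this the four-step sequence from the original post (all FF's to low, address to low, all FF's to high, address to high) leaves the BAR unmappable after each of the first three writes and mappable only after the fourth.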
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
Hi Alex and Michael

>> For testing, I applied the following patch to qemu,
>> converting msix bar to 64 bit.
>> Guest did not seem to crash.
>> I booted Fedora Live CD 32 bit guest on a 32 bit host
>> to level 3 without crash, and verified that
>> the BAR is a 64 bit one, and that I got assigned an address
>> at fe00.
>> command line I used:
>> qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive
>> file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe
>> -device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci
>> -cdrom Fedora-15-i686-Live-LXDE.iso
>>
>> At boot prompt type tab and add '3' to kernel command line
>> to have guest boot into a fast text console instead
>> of a graphical one which is very slow.
>>
>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>> index 2ac87ea..5271394 100644
>> --- a/hw/virtio-pci.c
>> +++ b/hw/virtio-pci.c
>> @@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice
>> *vdev)
>> memory_region_init(&proxy->msix_bar, "virtio-msix", 4096);
>> if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors,
>> &proxy->msix_bar, 1, 0)) {
>> -pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
>> +pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY |
>> + PCI_BASE_ADDRESS_MEM_TYPE_64,
>> &proxy->msix_bar);
>> } else
>> vdev->nvectors = 0;
>>
> I was also able to add MEM64 BARs to device assignment pretty trivially
> and it seems to work, guest sees 64bit BARs for an 82576 VF, programs it
> to an fexx address and it works.
>
> Alex
>

I'd suggest using ivshmem with a 32MB buffer to reproduce the problem, for example in a 2.6.18 guest.

The msix case does not fail because:
1. The buffer size is just 4KB, so sizing reprograms the range 0xE000-0x, which does not overlap any critical resource and so does not cause an immediate panic.
2. The memory_region_init() function doesn't create a backing user memory region, so kvm does no remapping in this case.

If you apply the following patch and add to qemu command: --device ivshmem,size=32,shm="shm" --- diff --git a/hw/ivshmem.c b/hw/ivshmem.c index 1aa9e3b..71f8c21 100644 --- a/hw/ivshmem.c +++ b/hw/ivshmem.c @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd) { memory_region_add_subregion(&s->bar, 0, &s->ivshmem); /* region for shared memory */ -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar); +pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar) } static void close_guest_eventfds(IVShmemState *s, int posn) --- You can get the following bootup log: Bootdata ok (command line is root=/dev/hda1 console=ttyS0,115200n8 console=tty0) Linux version 2.6.18 (root@localhost.localdomain) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #3 SMP Tue Jan 17 16:37:33 NZDT 2012 BIOS-provided physical RAM map: BIOS-e820: - 0009f400 (usable) BIOS-e820: 0009f400 - 000a (reserved) BIOS-e820: 000f - 0010 (reserved) BIOS-e820: 0010 - 7fffd000 (usable) BIOS-e820: 7fffd000 - 8000 (reserved) BIOS-e820: feffc000 - ff00 (reserved) BIOS-e820: fffc - 0001 (reserved) DMI 2.4 present. 
No NUMA configuration found Faking a node at -7fffd000 Bootmem setup node 0 -7fffd000 ACPI: PM-Timer IO Port: 0xb008 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 6:2 APIC version 17 ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0]) IOAPIC[0]: apic_id 1, version 17, address 0xfec0, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) Setting APIC routing to physical flat ACPI: HPET id: 0x8086a201 base: 0xfed0 Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at 8800 (gap: 8000:7effc000) SMP: Allowing 1 CPUs, 0 hotplug CPUs Built 1 zonelists. Total pages: 515393 Kernel command line: root=/dev/hda1 console=ttyS0,115200n8 console=tty0 Initializing CPU#0 PID hash table entries: 4096 (order: 12, 32768 bytes) time.c: Using 100.00 MHz WALL HPET GTOD HPET/TSC timer. time.c: Detected 2500.081 MHz processor. Console: colour VGA+ 80x25 Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes) Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes) Checking aperture... Memory: 2058096k/2097140k available (3256k kernel code, 38656k reserved, 2266k data, 204k init) Calibrating delay using timer specific routin
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Wed, 2012-01-25 at 17:38 +0200, Michael S. Tsirkin wrote: > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > > Hi, > > In this post > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > > mentioned about the issues when 64Bit PCI BAR is present and 32bit > > address range is selected for it. > > The issue affects all recent qemu releases and all > > old and recent guest Linux kernel versions. > > > > For testing, I applied the following patch to qemu, > converting msix bar to 64 bit. > Guest did not seem to crash. > I booted Fedora Live CD 32 bit guest on a 32 bit host > to level 3 without crash, and verified that > the BAR is a 64 bit one, and that I got assigned an address > at fe00. > command line I used: > qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive > file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe > -device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci > -cdrom Fedora-15-i686-Live-LXDE.iso > > At boot prompt type tab and add '3' to kernel command line > to have guest boot into a fast text console instead > of a graphical one which is very slow. > > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c > index 2ac87ea..5271394 100644 > --- a/hw/virtio-pci.c > +++ b/hw/virtio-pci.c > @@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice > *vdev) > memory_region_init(&proxy->msix_bar, "virtio-msix", 4096); > if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors, > &proxy->msix_bar, 1, 0)) { > -pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, > +pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY | > + PCI_BASE_ADDRESS_MEM_TYPE_64, > &proxy->msix_bar); > } else > vdev->nvectors = 0; > I was also able to add MEM64 BARs to device assignment pretty trivially and it seems to work, guest sees 64bit BARs for an 82576 VF, programs it to an fexx address and it works. Alex
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > Hi, > In this post > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > mentioned about the issues when 64Bit PCI BAR is present and 32bit > address range is selected for it. > The issue affects all recent qemu releases and all > old and recent guest Linux kernel versions. > For testing, I applied the following patch to qemu, converting msix bar to 64 bit. Guest did not seem to crash. I booted Fedora Live CD 32 bit guest on a 32 bit host to level 3 without crash, and verified that the BAR is a 64 bit one, and that I got assigned an address at fe00. command line I used: qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe -device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci -cdrom Fedora-15-i686-Live-LXDE.iso At boot prompt type tab and add '3' to kernel command line to have guest boot into a fast text console instead of a graphical one which is very slow. diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index 2ac87ea..5271394 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev) memory_region_init(&proxy->msix_bar, "virtio-msix", 4096); if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors, &proxy->msix_bar, 1, 0)) { -pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, +pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY | +PCI_BASE_ADDRESS_MEM_TYPE_64, &proxy->msix_bar); } else vdev->nvectors = 0;
Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote: > Hi, > In this post > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've > mentioned about the issues when 64Bit PCI BAR is present and 32bit > address range is selected for it. > The issue affects all recent qemu releases and all > old and recent guest Linux kernel versions. > > We've done some investigations. Let me explain what happens. > Assume we have 64bit BAR with size 32MB mapped at [0xF000 - > 0xF200] > > When Linux guest starts it does PCI bus enumeration. > The OS enumerates 64BIT bars using the following procedure. > 1. Write all FF's to lower half of 64bit BAR > 2. Write address back to lower half of 64bit BAR > 3. Write all FF's to higher half of 64bit BAR > 4. Write address back to higher half of 64bit BAR > > Linux code is here: > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 > > What does it mean for qemu? > > At step 1. qemu pci_default_write_config() recevies all FFs for lower > part of the 64bit BAR. Then it applies the mask and converts the value > to "All FF's - size + 1" (FE00 if size is 32MB). > Then pci_bar_address() checks if BAR address is valid. Since it is a > 64bit bar it reads 0xFE00 - this address is valid. So qemu > updates topology and sends request to update mappings in KVM with new > range for the 64bit BAR FE00 - 0x. This usually means kernel > panic on boot, if there is another mapping in the FE00 - 0x > range, which is quite common. > > > The following patch fixes the issue. It affects 64bit PCI BAR's only. > The idea of the patch is: we introduce the states for low and high BARs > whose can have 3 possible values: BAR_VALID, PCIBAR64_PARTIAL_SIZE_QUERY > - someone has requested size of one half of the 64bit PCI BAR, > PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a request to update the > address of one half of the 64bit PCI BAR. The state becomes BAR_VALID > when both halfs are in the same state. 
We ignore BAR value until both > states become BAR_VALID > > Note: Please use the latest Seabios version (commit > 139d5ac037de828f89c36e39c6dd15610650cede and later), as older versions > didn't initialize high part of 64bit BAR. > > The patch is tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server > > Signed-off-by: Alexey Korolev Interesting. However, looking at guest code, I note that memory and io are disabled during BAR sizing unless mmio always on is set. pci_bar_address should return PCI_BAR_UNMAPPED in this case, and we should never map this BAR until it's enabled. What's going on?
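The expectation stated in this reply, that a BAR is never mapped while memory decoding is disabled, can be sketched as below. This is a hedged illustration rather than qemu's pci_bar_address(): toy_bar_address and BAR_UNMAPPED are hypothetical stand-ins, and CMD_MEMORY is the memory space enable bit (bit 1) of the PCI command register.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_MEMORY   0x2u           /* PCI command register: memory enable */
#define BAR_UNMAPPED (~(uint64_t)0) /* stand-in for qemu's PCI_BAR_UNMAPPED */

/* Address at which a 32-bit memory BAR should be mapped, or BAR_UNMAPPED
 * when the guest has turned memory decoding off. */
static uint64_t toy_bar_address(uint16_t command, uint32_t bar_value,
                                uint32_t size)
{
    assert(size >= 16 && (size & (size - 1)) == 0);

    if (!(command & CMD_MEMORY)) {
        return BAR_UNMAPPED; /* junk in the BAR is harmless while disabled */
    }
    return bar_value & ~(size - 1); /* strip flag bits and sub-size bits */
}
```

A guest that clears the memory enable bit around sizing never exposes the transient value; a guest that sizes the BAR with decoding still enabled does.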
[Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present
Hi, In this post http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've mentioned about the issues when 64Bit PCI BAR is present and 32bit address range is selected for it. The issue affects all recent qemu releases and all old and recent guest Linux kernel versions. We've done some investigations. Let me explain what happens. Assume we have 64bit BAR with size 32MB mapped at [0xF000 - 0xF200] When Linux guest starts it does PCI bus enumeration. The OS enumerates 64BIT bars using the following procedure. 1. Write all FF's to lower half of 64bit BAR 2. Write address back to lower half of 64bit BAR 3. Write all FF's to higher half of 64bit BAR 4. Write address back to higher half of 64bit BAR Linux code is here: http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149 What does it mean for qemu? At step 1. qemu pci_default_write_config() recevies all FFs for lower part of the 64bit BAR. Then it applies the mask and converts the value to "All FF's - size + 1" (FE00 if size is 32MB). Then pci_bar_address() checks if BAR address is valid. Since it is a 64bit bar it reads 0xFE00 - this address is valid. So qemu updates topology and sends request to update mappings in KVM with new range for the 64bit BAR FE00 - 0x. This usually means kernel panic on boot, if there is another mapping in the FE00 - 0x range, which is quite common. The following patch fixes the issue. It affects 64bit PCI BAR's only. The idea of the patch is: we introduce the states for low and high BARs whose can have 3 possible values: BAR_VALID, PCIBAR64_PARTIAL_SIZE_QUERY - someone has requested size of one half of the 64bit PCI BAR, PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a request to update the address of one half of the 64bit PCI BAR. The state becomes BAR_VALID when both halfs are in the same state. 
We ignore BAR value until both states become BAR_VALID

Note: Please use the latest Seabios version (commit 139d5ac037de828f89c36e39c6dd15610650cede and later), as older versions didn't initialize high part of 64bit BAR.

The patch is tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server

Signed-off-by: Alexey Korolev
---
 hw/pci.c |   45 +
 hw/pci.h |    7 +++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 57ec104..3a7deb2 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1055,6 +1055,40 @@ static pcibus_t pci_bar_address(PCIDevice *d,
     return new_addr;
 }
 
+static void pci_update_region_state(PCIDevice *d, uint32_t addr, uint32_t val)
+{
+    PCIIORegion *r;
+    int barnum = (addr - PCI_BASE_ADDRESS_0) >> 2;
+    PCIBARState *state;
+
+    r = &d->io_regions[barnum];
+
+    if (d->io_regions[barnum].type & PCI_BASE_ADDRESS_MEM_TYPE_64) {
+        /* Programming low part of the 64bit BAR */
+        r = &d->io_regions[barnum];
+        state = &r->state_lo;
+    } else if (barnum > 0 &&
+        (d->io_regions[barnum - 1].type & PCI_BASE_ADDRESS_MEM_TYPE_64)) {
+        /* Programming high part of the 64bit BAR */
+        r = &d->io_regions[barnum - 1];
+        state = &r->state_hi;
+    } else {
+        /* Not a 64bit BAR's */
+        d->io_regions[barnum].state_lo = PCIBAR_VALID;
+        return;
+    }
+
+    /* Request to read BAR size */
+    if (val == -1U)
+        *state = PCIBAR64_PARTIAL_SIZE_QUERY;
+    else
+        *state = PCIBAR64_PARTIAL_ADDR_PROGRAM;
+
+
+    if (r->state_lo == r->state_hi)
+        r->state_lo = r->state_hi = PCIBAR_VALID;
+}
+
 static void pci_update_mappings(PCIDevice *d)
 {
     PCIIORegion *r;
@@ -1068,6 +1102,13 @@ static void pci_update_mappings(PCIDevice *d)
         if (!r->size)
             continue;
 
+        /* this region state is invalid */
+        if (r->state_lo != PCIBAR_VALID)
+            continue;
+        if ((r->type & PCI_BASE_ADDRESS_MEM_TYPE_64) &&
+             (r->state_hi != PCIBAR_VALID))
+            continue;
+
         new_addr = pci_bar_address(d, i, r->type, r->size);
 
         /* This bar isn't changed */
@@ -1117,6 +1158,7 @@ uint32_t pci_default_read_config(PCIDevice *d,
 void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
 {
     int i, was_irq_disabled = pci_irq_disabled(d);
+    uint32_t orig_val = val;
 
     for (i = 0; i < l; val >>= 8, ++i) {
         uint8_t wmask = d->wmask[addr + i];
@@ -1133,6 +1175,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
         assigned_dev_update_irqs();
 #endif /* CONFIG_KVM_DEVICE_ASSIGNMENT */
 
+    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24))
+        pci_update_region_state(d, addr, orig_val);
+
     if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
diff --git a/hw/pci.h b/hw/pci.h
index 4220151..5d1e529 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -86,12 +86,19 @@ t