Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-02-01 Thread Alexey Korolev
On 01/02/12 20:04, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2012 at 06:44:42PM +1300, Alexey Korolev wrote:
>> On 31/01/12 22:43, Avi Kivity wrote:
>>> On 01/31/2012 11:40 AM, Avi Kivity wrote:
 On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> On 27/01/12 04:12, Avi Kivity wrote:
>> On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
>>> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
 On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>> Hi, 
>> In this post
>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html 
>> I've
>> mentioned about the issues when 64Bit PCI BAR is present and 32bit
>> address range is selected for it.
>> The issue affects all recent qemu releases and all
>> old and recent guest Linux kernel versions.
>>
>> We've done some investigations. Let me explain what happens.
>> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>> 0xF2000000]
>>
>> When Linux guest starts it does PCI bus enumeration.
>> The OS enumerates 64BIT bars using the following procedure.
>> 1. Write all FF's to lower half of 64bit BAR
>> 2. Write address back to lower half of 64bit BAR
>> 3. Write all FF's to higher half of 64bit BAR
>> 4. Write address back to higher half of 64bit BAR
>>
>> Linux code is here: 
>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>
>> What does it mean for qemu?
>>
>> At step 1. qemu pci_default_write_config() receives all FFs for lower
>> part of the 64bit BAR. Then it applies the mask and converts the value
>> to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>> Then pci_bar_address() checks if BAR address is valid. Since it is a
>> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>> updates topology and sends request to update mappings in KVM with new
>> range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
>> kernel panic on boot, if there is another mapping in the
>> 0xFE000000 - 0xFFFFFFFF range, which is quite common.
> Do you know why does it panic? As far as I can see
> from code at
> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>
>  171pci_read_config_dword(dev, pos, &l);
>  172pci_write_config_dword(dev, pos, l | mask);
>  173pci_read_config_dword(dev, pos, &sz);
>  174pci_write_config_dword(dev, pos, l);
>
> BAR is restored: what triggers an access between lines 172 and 174?
 Random interrupt reading the time, likely.
>>> Weird, what the backtrace shows is init, unrelated
>>> to interrupts.
>>>
>> It's a bug then.  qemu doesn't undo the mapping correctly.
>>
>> If you have clear instructions, I'll try to reproduce it.
>>
> Well the easiest way to reproduce this is:
>
>
> 1. Get kernel bzImage (version < 2.6.36)
> 2. Apply patch to ivshmem.c
>
> ---
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 1aa9e3b..71f8c21 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, 
> int fd) {
>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>  
>  /* region for shared memory */
> -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> +pci_register_bar(&s->dev, 2, 
> PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
>  }
>  
>  static void close_guest_eventfds(IVShmemState *s, int posn)
> ---
>
> 3. Launch qemu with a command like that
>
> /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp 
> 1,socket=1,cores=1,threads=1 -name centos54 -uuid
> d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
>  -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> base=utc -drive 
> file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
> -drive
> file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
>  -device
> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev 
> file,id=charserial0,path=/home/alexey/cent54.log -device
> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us 
> -vga cirrus -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=o

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-31 Thread Michael S. Tsirkin
On Wed, Feb 01, 2012 at 06:44:42PM +1300, Alexey Korolev wrote:
> On 31/01/12 22:43, Avi Kivity wrote:
> > On 01/31/2012 11:40 AM, Avi Kivity wrote:
> >> On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> >>> On 27/01/12 04:12, Avi Kivity wrote:
>  On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> > On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> >> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>  Hi, 
>  In this post
>  http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html 
>  I've
>  mentioned about the issues when 64Bit PCI BAR is present and 32bit
>  address range is selected for it.
>  The issue affects all recent qemu releases and all
>  old and recent guest Linux kernel versions.
> 
>  We've done some investigations. Let me explain what happens.
>  Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>  0xF2000000]
> 
>  When Linux guest starts it does PCI bus enumeration.
>  The OS enumerates 64BIT bars using the following procedure.
>  1. Write all FF's to lower half of 64bit BAR
>  2. Write address back to lower half of 64bit BAR
>  3. Write all FF's to higher half of 64bit BAR
>  4. Write address back to higher half of 64bit BAR
> 
>  Linux code is here: 
>  http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> 
>  What does it mean for qemu?
> 
>  At step 1. qemu pci_default_write_config() receives all FFs for lower
>  part of the 64bit BAR. Then it applies the mask and converts the value
>  to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>  Then pci_bar_address() checks if BAR address is valid. Since it is a
>  64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>  updates topology and sends request to update mappings in KVM with new
>  range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
>  kernel panic on boot, if there is another mapping in the
>  0xFE000000 - 0xFFFFFFFF range, which is quite common.
> >>> Do you know why does it panic? As far as I can see
> >>> from code at
> >>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> >>>
> >>>  171pci_read_config_dword(dev, pos, &l);
> >>>  172pci_write_config_dword(dev, pos, l | mask);
> >>>  173pci_read_config_dword(dev, pos, &sz);
> >>>  174pci_write_config_dword(dev, pos, l);
> >>>
> >>> BAR is restored: what triggers an access between lines 172 and 174?
> >> Random interrupt reading the time, likely.
> > Weird, what the backtrace shows is init, unrelated
> > to interrupts.
> >
>  It's a bug then.  qemu doesn't undo the mapping correctly.
> 
>  If you have clear instructions, I'll try to reproduce it.
> 
> >>> Well the easiest way to reproduce this is:
> >>>
> >>>
> >>> 1. Get kernel bzImage (version < 2.6.36)
> >>> 2. Apply patch to ivshmem.c
> >>>
> >>> ---
> >>> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> >>> index 1aa9e3b..71f8c21 100644
> >>> --- a/hw/ivshmem.c
> >>> +++ b/hw/ivshmem.c
> >>> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, 
> >>> int fd) {
> >>>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
> >>>  
> >>>  /* region for shared memory */
> >>> -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> >>> +pci_register_bar(&s->dev, 2, 
> >>> PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
> >>>  }
> >>>  
> >>>  static void close_guest_eventfds(IVShmemState *s, int posn)
> >>> ---
> >>>
> >>> 3. Launch qemu with a command like that
> >>>
> >>> /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp 
> >>> 1,socket=1,cores=1,threads=1 -name centos54 -uuid
> >>> d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
> >>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
> >>>  -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> >>> base=utc -drive 
> >>> file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
> >>> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
> >>> -drive
> >>> file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
> >>>  -device
> >>> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev 
> >>> file,id=charserial0,path=/home/alexey/cent54.log -device
> >>> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us 
> >>> -vga cirrus -device
> >>> virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0 
> >>> --device ivshmem,size=32,s

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-31 Thread Alexey Korolev
On 31/01/12 22:43, Avi Kivity wrote:
> On 01/31/2012 11:40 AM, Avi Kivity wrote:
>> On 01/27/2012 06:42 AM, Alexey Korolev wrote:
>>> On 27/01/12 04:12, Avi Kivity wrote:
 On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
 Hi, 
 In this post
 http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
 mentioned about the issues when 64Bit PCI BAR is present and 32bit
 address range is selected for it.
 The issue affects all recent qemu releases and all
 old and recent guest Linux kernel versions.

 We've done some investigations. Let me explain what happens.
 Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
 0xF2000000]

 When Linux guest starts it does PCI bus enumeration.
 The OS enumerates 64BIT bars using the following procedure.
 1. Write all FF's to lower half of 64bit BAR
 2. Write address back to lower half of 64bit BAR
 3. Write all FF's to higher half of 64bit BAR
 4. Write address back to higher half of 64bit BAR

 Linux code is here: 
 http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149

 What does it mean for qemu?

 At step 1. qemu pci_default_write_config() receives all FFs for lower
 part of the 64bit BAR. Then it applies the mask and converts the value
 to "All FF's - size + 1" (0xFE000000 if size is 32MB).
 Then pci_bar_address() checks if BAR address is valid. Since it is a
 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
 updates topology and sends request to update mappings in KVM with new
 range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
 kernel panic on boot, if there is another mapping in the
 0xFE000000 - 0xFFFFFFFF range, which is quite common.
>>> Do you know why does it panic? As far as I can see
>>> from code at
>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>
>>>  171pci_read_config_dword(dev, pos, &l);
>>>  172pci_write_config_dword(dev, pos, l | mask);
>>>  173pci_read_config_dword(dev, pos, &sz);
>>>  174pci_write_config_dword(dev, pos, l);
>>>
>>> BAR is restored: what triggers an access between lines 172 and 174?
>> Random interrupt reading the time, likely.
> Weird, what the backtrace shows is init, unrelated
> to interrupts.
>
 It's a bug then.  qemu doesn't undo the mapping correctly.

 If you have clear instructions, I'll try to reproduce it.

>>> Well the easiest way to reproduce this is:
>>>
>>>
>>> 1. Get kernel bzImage (version < 2.6.36)
>>> 2. Apply patch to ivshmem.c
>>>
>>> ---
>>> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
>>> index 1aa9e3b..71f8c21 100644
>>> --- a/hw/ivshmem.c
>>> +++ b/hw/ivshmem.c
>>> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, 
>>> int fd) {
>>>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>>>  
>>>  /* region for shared memory */
>>> -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
>>> +pci_register_bar(&s->dev, 2, 
>>> PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
>>>  }
>>>  
>>>  static void close_guest_eventfds(IVShmemState *s, int posn)
>>> ---
>>>
>>> 3. Launch qemu with a command like that
>>>
>>> /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp 
>>> 1,socket=1,cores=1,threads=1 -name centos54 -uuid
>>> d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
>>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
>>>  -mon chardev=charmonitor,id=monitor,mode=readline -rtc
>>> base=utc -drive 
>>> file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
>>> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
>>> -drive
>>> file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
>>>  -device
>>> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev 
>>> file,id=charserial0,path=/home/alexey/cent54.log -device
>>> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us 
>>> -vga cirrus -device
>>> virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0 
>>> --device ivshmem,size=32,shm="shm" -kernel bzImage -append
>>> "root=/dev/hda1 console=ttyS0,115200n8 console=tty0"
>>>
>>> in other words add: --device ivshmem,size=32,shm="shm"
>>>
>>> That is all.
>>>
>>> Note: it won't necessary cause panic message on some kernels it just hangs 
>>> or reboots.
>>>
>> In fact qemu segfaults for me, since 

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-31 Thread Avi Kivity
On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> On 27/01/12 04:12, Avi Kivity wrote:
> > On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> >> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> >>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>  On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > Hi, 
> > In this post
> > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > address range is selected for it.
> > The issue affects all recent qemu releases and all
> > old and recent guest Linux kernel versions.
> >
> > We've done some investigations. Let me explain what happens.
> > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > 0xF2000000]
> >
> > When Linux guest starts it does PCI bus enumeration.
> > The OS enumerates 64BIT bars using the following procedure.
> > 1. Write all FF's to lower half of 64bit BAR
> > 2. Write address back to lower half of 64bit BAR
> > 3. Write all FF's to higher half of 64bit BAR
> > 4. Write address back to higher half of 64bit BAR
> >
> > Linux code is here: 
> > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> >
> > What does it mean for qemu?
> >
> > At step 1. qemu pci_default_write_config() receives all FFs for lower
> > part of the 64bit BAR. Then it applies the mask and converts the value
> > to "All FF's - size + 1" (0xFE000000 if size is 32MB).
> > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > updates topology and sends request to update mappings in KVM with new
> > range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
> > kernel panic on boot, if there is another mapping in the
> > 0xFE000000 - 0xFFFFFFFF range, which is quite common.
>  Do you know why does it panic? As far as I can see
>  from code at
>  http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> 
>   171pci_read_config_dword(dev, pos, &l);
>   172pci_write_config_dword(dev, pos, l | mask);
>   173pci_read_config_dword(dev, pos, &sz);
>   174pci_write_config_dword(dev, pos, l);
> 
>  BAR is restored: what triggers an access between lines 172 and 174?
> >>> Random interrupt reading the time, likely.
> >> Weird, what the backtrace shows is init, unrelated
> >> to interrupts.
> >>
> > It's a bug then.  qemu doesn't undo the mapping correctly.
> >
> > If you have clear instructions, I'll try to reproduce it.
> >
> Well the easiest way to reproduce this is:
>
>
> 1. Get kernel bzImage (version < 2.6.36)
> 2. Apply patch to ivshmem.c
>
>

I have some patches that fix this, but they're very hacky since they're
dealing with the old and rotten core.  I much prefer to let this resolve
itself in my continuing rewrite.  Is this an urgent problem for you or
can you live with this for a while?

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-31 Thread Avi Kivity
On 01/31/2012 11:40 AM, Avi Kivity wrote:
> On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> > On 27/01/12 04:12, Avi Kivity wrote:
> > > On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> > >> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> > >>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> >  On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > > Hi, 
> > > In this post
> > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html 
> > > I've
> > > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > > address range is selected for it.
> > > The issue affects all recent qemu releases and all
> > > old and recent guest Linux kernel versions.
> > >
> > > We've done some investigations. Let me explain what happens.
> > > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > > 0xF2000000]
> > >
> > > When Linux guest starts it does PCI bus enumeration.
> > > The OS enumerates 64BIT bars using the following procedure.
> > > 1. Write all FF's to lower half of 64bit BAR
> > > 2. Write address back to lower half of 64bit BAR
> > > 3. Write all FF's to higher half of 64bit BAR
> > > 4. Write address back to higher half of 64bit BAR
> > >
> > > Linux code is here: 
> > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> > >
> > > What does it mean for qemu?
> > >
> > > At step 1. qemu pci_default_write_config() receives all FFs for lower
> > > part of the 64bit BAR. Then it applies the mask and converts the value
> > > to "All FF's - size + 1" (0xFE000000 if size is 32MB).
> > > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > > updates topology and sends request to update mappings in KVM with new
> > > range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
> > > kernel panic on boot, if there is another mapping in the
> > > 0xFE000000 - 0xFFFFFFFF range, which is quite common.
> >  Do you know why does it panic? As far as I can see
> >  from code at
> >  http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> > 
> >   171pci_read_config_dword(dev, pos, &l);
> >   172pci_write_config_dword(dev, pos, l | mask);
> >   173pci_read_config_dword(dev, pos, &sz);
> >   174pci_write_config_dword(dev, pos, l);
> > 
> >  BAR is restored: what triggers an access between lines 172 and 174?
> > >>> Random interrupt reading the time, likely.
> > >> Weird, what the backtrace shows is init, unrelated
> > >> to interrupts.
> > >>
> > > It's a bug then.  qemu doesn't undo the mapping correctly.
> > >
> > > If you have clear instructions, I'll try to reproduce it.
> > >
> > Well the easiest way to reproduce this is:
> >
> >
> > 1. Get kernel bzImage (version < 2.6.36)
> > 2. Apply patch to ivshmem.c
> >
> > ---
> > diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> > index 1aa9e3b..71f8c21 100644
> > --- a/hw/ivshmem.c
> > +++ b/hw/ivshmem.c
> > @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, 
> > int fd) {
> >  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
> >  
> >  /* region for shared memory */
> > -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> > +pci_register_bar(&s->dev, 2, 
> > PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
> >  }
> >  
> >  static void close_guest_eventfds(IVShmemState *s, int posn)
> > ---
> >
> > 3. Launch qemu with a command like that
> >
> > /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp 
> > 1,socket=1,cores=1,threads=1 -name centos54 -uuid
> > d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
> > socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
> >  -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> > base=utc -drive 
> > file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
> > ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
> > -drive
> > file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
> >  -device
> > ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev 
> > file,id=charserial0,path=/home/alexey/cent54.log -device
> > isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us 
> > -vga cirrus -device
> > virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0 
> > --device ivshmem,size=32,shm="shm" -kernel bzImage -append
> > "root=/dev/hda1 console=ttyS0,115200n8 console=tty0"
> >
> > in other words add: --device ivshmem,size=32,shm="shm"
> >
> > That is all.
> >
> > Note: it won't necessary cause panic message on some kernels it just hangs 
> > or reboots.
> >
>

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-31 Thread Avi Kivity
On 01/27/2012 06:42 AM, Alexey Korolev wrote:
> On 27/01/12 04:12, Avi Kivity wrote:
> > On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> >> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> >>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>  On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > Hi, 
> > In this post
> > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > address range is selected for it.
> > The issue affects all recent qemu releases and all
> > old and recent guest Linux kernel versions.
> >
> > We've done some investigations. Let me explain what happens.
> > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > 0xF2000000]
> >
> > When Linux guest starts it does PCI bus enumeration.
> > The OS enumerates 64BIT bars using the following procedure.
> > 1. Write all FF's to lower half of 64bit BAR
> > 2. Write address back to lower half of 64bit BAR
> > 3. Write all FF's to higher half of 64bit BAR
> > 4. Write address back to higher half of 64bit BAR
> >
> > Linux code is here: 
> > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> >
> > What does it mean for qemu?
> >
> > At step 1. qemu pci_default_write_config() receives all FFs for lower
> > part of the 64bit BAR. Then it applies the mask and converts the value
> > to "All FF's - size + 1" (0xFE000000 if size is 32MB).
> > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > updates topology and sends request to update mappings in KVM with new
> > range for the 64bit BAR 0xFE000000 - 0xFFFFFFFF. This usually means
> > kernel panic on boot, if there is another mapping in the
> > 0xFE000000 - 0xFFFFFFFF range, which is quite common.
>  Do you know why does it panic? As far as I can see
>  from code at
>  http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> 
>   171pci_read_config_dword(dev, pos, &l);
>   172pci_write_config_dword(dev, pos, l | mask);
>   173pci_read_config_dword(dev, pos, &sz);
>   174pci_write_config_dword(dev, pos, l);
> 
>  BAR is restored: what triggers an access between lines 172 and 174?
> >>> Random interrupt reading the time, likely.
> >> Weird, what the backtrace shows is init, unrelated
> >> to interrupts.
> >>
> > It's a bug then.  qemu doesn't undo the mapping correctly.
> >
> > If you have clear instructions, I'll try to reproduce it.
> >
> Well the easiest way to reproduce this is:
>
>
> 1. Get kernel bzImage (version < 2.6.36)
> 2. Apply patch to ivshmem.c
>
> ---
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 1aa9e3b..71f8c21 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int 
> fd) {
>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>  
>  /* region for shared memory */
> -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> +pci_register_bar(&s->dev, 2, 
> PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
>  }
>  
>  static void close_guest_eventfds(IVShmemState *s, int posn)
> ---
>
> 3. Launch qemu with a command like that
>
> /usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 -smp 
> 1,socket=1,cores=1,threads=1 -name centos54 -uuid
> d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait
>  -mon chardev=charmonitor,id=monitor,mode=readline -rtc
> base=utc -drive 
> file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
> file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
>  -device
> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev 
> file,id=charserial0,path=/home/alexey/cent54.log -device
> isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga 
> cirrus -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0 
> --device ivshmem,size=32,shm="shm" -kernel bzImage -append
> "root=/dev/hda1 console=ttyS0,115200n8 console=tty0"
>
> in other words add: --device ivshmem,size=32,shm="shm"
>
> That is all.
>
> Note: it won't necessary cause panic message on some kernels it just hangs or 
> reboots.
>

In fact qemu segfaults for me, since registering a RAM region that is not
on a page boundary is broken.  This happens when the ivshmem BAR is split by
the HPET region, which is less than a page long.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Alexey Korolev
On 27/01/12 04:12, Avi Kivity wrote:
> On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
>> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>>>>> Hi,
>>>>> In this post
>>>>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html
>>>>> I mentioned the issues that occur when a 64bit PCI BAR is present
>>>>> and a 32bit address range is selected for it.
>>>>> The issue affects all recent qemu releases and all
>>>>> old and recent guest Linux kernel versions.
>>>>>
>>>>> We've done some investigation. Let me explain what happens.
>>>>> Assume we have a 64bit BAR with size 32MB mapped at [0xF0000000 -
>>>>> 0xF2000000]
>>>>>
>>>>> When a Linux guest starts it does PCI bus enumeration.
>>>>> The OS enumerates 64bit BARs using the following procedure.
>>>>> 1. Write all FF's to lower half of 64bit BAR
>>>>> 2. Write address back to lower half of 64bit BAR
>>>>> 3. Write all FF's to higher half of 64bit BAR
>>>>> 4. Write address back to higher half of 64bit BAR
>>>>>
>>>>> Linux code is here:
>>>>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>>>>
>>>>> What does it mean for qemu?
>>>>>
>>>>> At step 1. qemu pci_default_write_config() receives all FFs for the
>>>>> lower part of the 64bit BAR. Then it applies the mask and converts
>>>>> the value to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>>>>> Then pci_bar_address() checks if the BAR address is valid. Since it
>>>>> is a 64bit BAR it reads 0x00000000FE000000 - this address is valid.
>>>>> So qemu updates topology and sends a request to update mappings in
>>>>> KVM with the new range for the 64bit BAR, 0xFE000000 - 0xFFFFFFFF.
>>>>> This usually means kernel panic on boot, if there is another mapping
>>>>> in the 0xFE000000 - 0xFFFFFFFF range, which is quite common.
>>>> Do you know why it panics? As far as I can see
>>>> from code at
>>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>>
>>>>  171  pci_read_config_dword(dev, pos, &l);
>>>>  172  pci_write_config_dword(dev, pos, l | mask);
>>>>  173  pci_read_config_dword(dev, pos, &sz);
>>>>  174  pci_write_config_dword(dev, pos, l);
>>>>
>>>> BAR is restored: what triggers an access between lines 172 and 174?
>>> Random interrupt reading the time, likely.
>> Weird, what the backtrace shows is init, unrelated
>> to interrupts.
>>
> It's a bug then.  qemu doesn't undo the mapping correctly.
>
> If you have clear instructions, I'll try to reproduce it.
>
Well the easiest way to reproduce this is:


1. Get kernel bzImage (version < 2.6.36)
2. Apply patch to ivshmem.c

---
diff --git a/hw/ivshmem.c b/hw/ivshmem.c
index 1aa9e3b..71f8c21 100644
--- a/hw/ivshmem.c
+++ b/hw/ivshmem.c
@@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd) {
     memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
 
     /* region for shared memory */
-    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
+    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar);
 }
 
 static void close_guest_eventfds(IVShmemState *s, int posn)
---

3. Launch qemu with a command like this:

/usr/bin/qemu-system-x86_64 -M pc-0.14 -enable-kvm -m 2048 \
  -smp 1,socket=1,cores=1,threads=1 -name centos54 \
  -uuid d37daefd-75bd-4387-cee1-7f0b153ee2af -nodefconfig -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/centos54.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=readline -rtc base=utc \
  -drive file=/dev/dock200-1/centos54,if=none,id=drive-ide0-0-0,format=raw \
  -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
  -drive file=/data/CentOS-5.4-x86_64-bin-DVD.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
  -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
  -chardev file,id=charserial0,path=/home/alexey/cent54.log \
  -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 \
  -k en-us -vga cirrus \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,multifunction=on,addr=0x4.0x0 \
  --device ivshmem,size=32,shm="shm" -kernel bzImage \
  -append "root=/dev/hda1 console=ttyS0,115200n8 console=tty0"

in other words add: --device ivshmem,size=32,shm="shm"

That is all.

Note: this won't necessarily produce a panic message; on some kernels the
guest just hangs or reboots.







Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Alexey Korolev
On 27/01/12 03:36, Michael S. Tsirkin wrote:
> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
>> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>>>> Hi,
>>>> In this post
>>>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html
>>>> I mentioned the issues that occur when a 64bit PCI BAR is present
>>>> and a 32bit address range is selected for it.
>>>> The issue affects all recent qemu releases and all
>>>> old and recent guest Linux kernel versions.
>>>>
>>>> We've done some investigation. Let me explain what happens.
>>>> Assume we have a 64bit BAR with size 32MB mapped at [0xF0000000 -
>>>> 0xF2000000]
>>>>
>>>> When a Linux guest starts it does PCI bus enumeration.
>>>> The OS enumerates 64bit BARs using the following procedure.
>>>> 1. Write all FF's to lower half of 64bit BAR
>>>> 2. Write address back to lower half of 64bit BAR
>>>> 3. Write all FF's to higher half of 64bit BAR
>>>> 4. Write address back to higher half of 64bit BAR
>>>>
>>>> Linux code is here:
>>>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>>>
>>>> What does it mean for qemu?
>>>>
>>>> At step 1. qemu pci_default_write_config() receives all FFs for the
>>>> lower part of the 64bit BAR. Then it applies the mask and converts
>>>> the value to "All FF's - size + 1" (0xFE000000 if size is 32MB).
>>>> Then pci_bar_address() checks if the BAR address is valid. Since it
>>>> is a 64bit BAR it reads 0x00000000FE000000 - this address is valid.
>>>> So qemu updates topology and sends a request to update mappings in
>>>> KVM with the new range for the 64bit BAR, 0xFE000000 - 0xFFFFFFFF.
>>>> This usually means kernel panic on boot, if there is another mapping
>>>> in the 0xFE000000 - 0xFFFFFFFF range, which is quite common.
>>> Do you know why does it panic? As far as I can see
>>> from code at
>>> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>>>
>>>  171  pci_read_config_dword(dev, pos, &l);
>>>  172  pci_write_config_dword(dev, pos, l | mask);
>>>  173  pci_read_config_dword(dev, pos, &sz);
>>>  174  pci_write_config_dword(dev, pos, l);
>>>
>>> BAR is restored: what triggers an access between lines 172 and 174?
>> Random interrupt reading the time, likely.
> Weird, what the backtrace shows is init, unrelated
> to interrupts.
Yes, it fails during the ordered late_hpet_init() call, which is part of the kernel
fs_initcall list. So no timer interrupts are involved here.
Basically, once the region is programmed (even temporarily), the area behind it is 
lost.
I mean that if we overlap the HPET region with our BAR, even temporarily, and the BAR is 
backed by host user space memory, and we
commit a mapping request to kvm, then the information about the old mappings 
belonging to the HPET is lost.
This holds even if we did it for a short period of time and later restored the original 
address.
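
To make the overlap concrete, here is a minimal sketch in C (not qemu or kvm code).
It assumes the HPET base is 0xfed00000, the usual x86 location and consistent with
the partly truncated value in the boot log, and a 1KB HPET register block. The names
gpa_range, ranges_overlap, sizing_window and sizing_hits_hpet are made up for this
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch, not values taken from qemu sources. */
#define HPET_BASE_ASSUMED 0xFED00000ULL
#define HPET_LEN_ASSUMED  0x400ULL

/* Closed interval [start, end] in guest physical address space. */
struct gpa_range {
    uint64_t start;
    uint64_t end;
};

bool ranges_overlap(struct gpa_range a, struct gpa_range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/*
 * Window a BAR of 'size' bytes (power of two, at most 4GB) would occupy
 * right after all FFs are written to its lower dword while the upper
 * dword is still 0: the top 'size' bytes below 4GB.
 */
struct gpa_range sizing_window(uint64_t size)
{
    struct gpa_range r = { 0x100000000ULL - size, 0xFFFFFFFFULL };
    return r;
}

/* True if that transient window covers the assumed HPET block. */
bool sizing_hits_hpet(uint64_t bar_size)
{
    struct gpa_range hpet = {
        HPET_BASE_ASSUMED,
        HPET_BASE_ASSUMED + HPET_LEN_ASSUMED - 1
    };
    return ranges_overlap(sizing_window(bar_size), hpet);
}
```

With a 32MB BAR the transient window starts at 0xfe000000, so sizing_hits_hpet()
reports a hit on the assumed HPET block.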

>>> Also, what you describe happens on a 32 bit BAR in the same way, no?
>> So it seems.  Btw, is this procedure correct for sizing a BAR which is
>> larger than 4GB?
> There's more code sizing 64 bit BARs, but generally
> software is allowed to write any junk into enabled BARs
> as long as there aren't any memory accesses.





Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Avi Kivity
On 01/26/2012 04:36 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> > On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> > > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > > > Hi, 
> > > > In this post
> > > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > > > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > > > address range is selected for it.
> > > > The issue affects all recent qemu releases and all
> > > > old and recent guest Linux kernel versions.
> > > > 
> > > > We've done some investigations. Let me explain what happens.
> > > > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > > > 0xF2000000]
> > > > 
> > > > When Linux guest starts it does PCI bus enumeration.
> > > > The OS enumerates 64BIT bars using the following procedure.
> > > > 1. Write all FF's to lower half of 64bit BAR
> > > > 2. Write address back to lower half of 64bit BAR
> > > > 3. Write all FF's to higher half of 64bit BAR
> > > > 4. Write address back to higher half of 64bit BAR
> > > > 
> > > > Linux code is here: 
> > > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> > > > 
> > > > What does it mean for qemu?
> > > > 
> > > > At step 1. qemu pci_default_write_config() recevies all FFs for lower
> > > > part of the 64bit BAR. Then it applies the mask and converts the value
> > > > to "All FF's - size + 1" (FE000000 if size is 32MB).
> > > > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > > > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > > > updates topology and sends request to update mappings in KVM with new
> > > > range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> > > > panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> > > > range, which is quite common.
> > >
> > > Do you know why does it panic? As far as I can see
> > > from code at
> > > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> > >
> > > >  171  pci_read_config_dword(dev, pos, &l);
> > > >  172  pci_write_config_dword(dev, pos, l | mask);
> > > >  173  pci_read_config_dword(dev, pos, &sz);
> > > >  174  pci_write_config_dword(dev, pos, l);
> > >
> > > BAR is restored: what triggers an access between lines 172 and 174?
> > 
> > Random interrupt reading the time, likely.
>
> Weird, what the backtrace shows is init, unrelated
> to interrupts.
>

It's a bug then.  qemu doesn't undo the mapping correctly.

If you have clear instructions, I'll try to reproduce it.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Michael S. Tsirkin
On Thu, Jan 26, 2012 at 03:52:27PM +0200, Avi Kivity wrote:
> On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> > On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > > Hi, 
> > > In this post
> > > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > > address range is selected for it.
> > > The issue affects all recent qemu releases and all
> > > old and recent guest Linux kernel versions.
> > > 
> > > We've done some investigations. Let me explain what happens.
> > > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > > 0xF2000000]
> > > 
> > > When Linux guest starts it does PCI bus enumeration.
> > > The OS enumerates 64BIT bars using the following procedure.
> > > 1. Write all FF's to lower half of 64bit BAR
> > > 2. Write address back to lower half of 64bit BAR
> > > 3. Write all FF's to higher half of 64bit BAR
> > > 4. Write address back to higher half of 64bit BAR
> > > 
> > > Linux code is here: 
> > > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> > > 
> > > What does it mean for qemu?
> > > 
> > > At step 1. qemu pci_default_write_config() recevies all FFs for lower
> > > part of the 64bit BAR. Then it applies the mask and converts the value
> > > to "All FF's - size + 1" (FE000000 if size is 32MB).
> > > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > > updates topology and sends request to update mappings in KVM with new
> > > range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> > > panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> > > range, which is quite common.
> >
> > Do you know why does it panic? As far as I can see
> > from code at
> > http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
> >
> > >  171  pci_read_config_dword(dev, pos, &l);
> > >  172  pci_write_config_dword(dev, pos, l | mask);
> > >  173  pci_read_config_dword(dev, pos, &sz);
> > >  174  pci_write_config_dword(dev, pos, l);
> >
> > BAR is restored: what triggers an access between lines 172 and 174?
> 
> Random interrupt reading the time, likely.

Weird, what the backtrace shows is init, unrelated
to interrupts.

> > Also, what you describe happens on a 32 bit BAR in the same way, no?
> 
> So it seems.  Btw, is this procedure correct for sizing a BAR which is
> larger than 4GB?

There's more code sizing 64 bit BARs, but generally
software is allowed to write any junk into enabled BARs
as long as there aren't any memory accesses.

> -- 
> error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Avi Kivity
On 01/26/2012 04:05 PM, Michael S. Tsirkin wrote:
> > 
> > Let me see if I get this right: during BAR sizing, the guest sets the
> > BAR to ~1, which means 4GB-32MB -> 4GB, which overlaps the HPET.  If so,
> > that's expected behaviour.
>
> Yes BAR sizing temporarily sets the BAR to an invalid value then
> restores it.  What I don't understand is how come something accesses the
> HPET range in between.

Interrupt -> read time.

> > If the guest doesn't want this memory there,
> > it should disable mmio.
>
> Recent kernels do this for most devices, but not for
> platform devices.

Then they are vulnerable to this issue.

The i440fx spec states that the entire top-of-memory range to 4GB is
forwarded to PCI, so qemu appears to be correct here.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Michael S. Tsirkin
On Thu, Jan 26, 2012 at 03:51:06PM +0200, Avi Kivity wrote:
> > Please look at HPET lines. HPET is mapped to 0xfed00000.
> > Size of ivshmem is 32MB. During pci enumeration ivshmem will corrupt the 
> > range from 0xfe000000 - 0xffffffff.
> > It overlaps HPET memory. When Linux does late_hpet init, it finds garbage 
> > and this is causing panic.
> >
> 
> Let me see if I get this right: during BAR sizing, the guest sets the
> BAR to ~1, which means 4GB-32MB -> 4GB, which overlaps the HPET.  If so,
> that's expected behaviour.

Yes BAR sizing temporarily sets the BAR to an invalid value then
restores it.  What I don't understand is how come something accesses the
HPET range in between.

> If the guest doesn't want this memory there,
> it should disable mmio.

Recent kernels do this for most devices, but not for
platform devices.

> -- 
> error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Avi Kivity
On 01/26/2012 11:14 AM, Michael S. Tsirkin wrote:
> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > Hi, 
> > In this post
> > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > address range is selected for it.
> > The issue affects all recent qemu releases and all
> > old and recent guest Linux kernel versions.
> > 
> > We've done some investigations. Let me explain what happens.
> > Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> > 0xF2000000]
> > 
> > When Linux guest starts it does PCI bus enumeration.
> > The OS enumerates 64BIT bars using the following procedure.
> > 1. Write all FF's to lower half of 64bit BAR
> > 2. Write address back to lower half of 64bit BAR
> > 3. Write all FF's to higher half of 64bit BAR
> > 4. Write address back to higher half of 64bit BAR
> > 
> > Linux code is here: 
> > http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> > 
> > What does it mean for qemu?
> > 
> > At step 1. qemu pci_default_write_config() recevies all FFs for lower
> > part of the 64bit BAR. Then it applies the mask and converts the value
> > to "All FF's - size + 1" (FE000000 if size is 32MB).
> > Then pci_bar_address() checks if BAR address is valid. Since it is a
> > 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> > updates topology and sends request to update mappings in KVM with new
> > range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> > panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> > range, which is quite common.
>
> Do you know why does it panic? As far as I can see
> from code at
> http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162
>
>  171  pci_read_config_dword(dev, pos, &l);
>  172  pci_write_config_dword(dev, pos, l | mask);
>  173  pci_read_config_dword(dev, pos, &sz);
>  174  pci_write_config_dword(dev, pos, l);
>
> BAR is restored: what triggers an access between lines 172 and 174?

Random interrupt reading the time, likely.

> Also, what you describe happens on a 32 bit BAR in the same way, no?

So it seems.  Btw, is this procedure correct for sizing a BAR which is
larger than 4GB?

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Avi Kivity
On 01/26/2012 05:19 AM, Alexey Korolev wrote:
> If you apply the following patch and add to qemu command: --device 
> ivshmem,size=32,shm="shm"
> ---
> diff --git a/hw/ivshmem.c b/hw/ivshmem.c
> index 1aa9e3b..71f8c21 100644
> --- a/hw/ivshmem.c
> +++ b/hw/ivshmem.c
> @@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int 
> fd) {
>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>  
>  /* region for shared memory */
> -pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
> +pci_register_bar(&s->dev, 2, 
> PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar)
>  }
>  
>  static void close_guest_eventfds(IVShmemState *s, int posn)
> ---
>
> You can get the following bootup log:
>
>
> Bootdata ok (command line is root=/dev/hda1 console=ttyS0,115200n8 
> console=tty0)
> Linux version 2.6.18 (root@localhost.localdomain) (gcc version 4.1.2 20080704 
> (Red Hat 4.1.2-48)) #3 SMP Tue Jan 17 16:37:33 NZDT 2012
> BIOS-provided physical RAM map:
>  BIOS-e820:  - 0009f400 (usable)
>  BIOS-e820: 0009f400 - 000a (reserved)
>  BIOS-e820: 000f - 0010 (reserved)
>  BIOS-e820: 0010 - 7fffd000 (usable)
>  BIOS-e820: 7fffd000 - 8000 (reserved)
>  BIOS-e820: feffc000 - ff00 (reserved)
>  BIOS-e820: fffc - 0001 (reserved)
> DMI 2.4 present.
> No NUMA configuration found
> Faking a node at -7fffd000
> Bootmem setup node 0 -7fffd000
> ACPI: PM-Timer IO Port: 0xb008
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 6:2 APIC version 17
> ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
> IOAPIC[0]: apic_id 1, version 17, address 0xfec0, GSI 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
> Setting APIC routing to physical flat
> ACPI: HPET id: 0x8086a201 base: 0xfed00000
> Using ACPI (MADT) for SMP configuration information
> Allocating PCI resources starting at 8800 (gap: 8000:7effc000)
> SMP: Allowing 1 CPUs, 0 hotplug CPUs
> Built 1 zonelists.  Total pages: 515393
> Kernel command line: root=/dev/hda1 console=ttyS0,115200n8 console=tty0
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> time.c: Using 100.00 MHz WALL HPET GTOD HPET/TSC timer.
> time.c: Detected 2500.081 MHz processor.
> Console: colour VGA+ 80x25
> Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
> Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
> Checking aperture...
> Memory: 2058096k/2097140k available (3256k kernel code, 38656k reserved, 
> 2266k data, 204k init)
> Calibrating delay using timer specific routine.. 5030.07 BogoMIPS 
> (lpj=10060155)
> Mount-cache hash table entries: 256
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 4096K
> MCE: warning: using only 10 banks
> SMP alternatives: switching to UP code
> Freeing SMP alternatives: 36k freed
> ACPI: Core revision 20060707
> activating NMI Watchdog ... done.
> Using local APIC timer interrupts.
> result 62501506
> Detected 62.501 MHz APIC timer.
> Brought up 1 CPUs
> testing NMI watchdog ... OK.
> migration_cost=0
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: Using configuration type 1
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
> ACPI: PCI Root Bridge [PCI0] (:00)
> ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
> PCI quirk: region b000-b03f claimed by PIIX4 ACPI
> PCI quirk: region b100-b10f claimed by PIIX4 SMB
> ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
> ACPI: PCI Interrupt Link [LNKS] (IRQs 9) *0, disabled.
> SCSI subsystem initialized
> usbcore: registered new driver usbfs
> usbcore: registered new driver hub
> PCI: Using ACPI for IRQ routing
> PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
> divide error:  [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18 #3
> RIP: 0010:[]  [] hpet_alloc+0x12a/0x30c
> RSP: :81007e3a1e20  EFLAGS: 00010246
> RAX: 00038d7ea4c68000 RBX:  RCX: 
> RDX:  RSI:  RDI: 8057fc2b
> RBP: 81007e2e28c0 R08: 8055b492 R09: 81007e39f510
> R10: 81007e3a1e50 R11: 0098 R12: 81007e3a1e50
> R13:  R14: ff5fe000 R15: 
> FS:  () GS:807fc000() knlGS:
> CS:  0010 D

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-26 Thread Michael S. Tsirkin
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> Hi, 
> In this post
> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> mentioned about the issues when 64Bit PCI BAR is present and 32bit
> address range is selected for it.
> The issue affects all recent qemu releases and all
> old and recent guest Linux kernel versions.
> 
> We've done some investigations. Let me explain what happens.
> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> 0xF2000000]
> 
> When Linux guest starts it does PCI bus enumeration.
> The OS enumerates 64BIT bars using the following procedure.
> 1. Write all FF's to lower half of 64bit BAR
> 2. Write address back to lower half of 64bit BAR
> 3. Write all FF's to higher half of 64bit BAR
> 4. Write address back to higher half of 64bit BAR
> 
> Linux code is here: 
> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> 
> What does it mean for qemu?
> 
> At step 1. qemu pci_default_write_config() recevies all FFs for lower
> part of the 64bit BAR. Then it applies the mask and converts the value
> to "All FF's - size + 1" (FE000000 if size is 32MB).
> Then pci_bar_address() checks if BAR address is valid. Since it is a
> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> updates topology and sends request to update mappings in KVM with new
> range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> range, which is quite common.

Do you know why does it panic? As far as I can see
from code at
http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162

 171  pci_read_config_dword(dev, pos, &l);
 172  pci_write_config_dword(dev, pos, l | mask);
 173  pci_read_config_dword(dev, pos, &sz);
 174  pci_write_config_dword(dev, pos, l);

BAR is restored: what triggers an access between lines 172 and 174?


Also, what you describe happens on a 32 bit BAR in the same way, no?

-- 
MST



Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-25 Thread Alexey Korolev
On 26/01/12 01:51, Michael S. Tsirkin wrote:
> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
>> Hi, 
>> In this post
>> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
>> mentioned about the issues when 64Bit PCI BAR is present and 32bit
>> address range is selected for it.
>> The issue affects all recent qemu releases and all
>> old and recent guest Linux kernel versions.
>>
>> We've done some investigations. Let me explain what happens.
>> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
>> 0xF2000000]
>>
>> When Linux guest starts it does PCI bus enumeration.
>> The OS enumerates 64BIT bars using the following procedure.
>> 1. Write all FF's to lower half of 64bit BAR
>> 2. Write address back to lower half of 64bit BAR
>> 3. Write all FF's to higher half of 64bit BAR
>> 4. Write address back to higher half of 64bit BAR
>>
>> Linux code is here: 
>> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
>>
>> What does it mean for qemu?
>>
>> At step 1. qemu pci_default_write_config() recevies all FFs for lower
>> part of the 64bit BAR. Then it applies the mask and converts the value
>> to "All FF's - size + 1" (FE000000 if size is 32MB).
>> Then pci_bar_address() checks if BAR address is valid. Since it is a
>> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
>> updates topology and sends request to update mappings in KVM with new
>> range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
>> panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
>> range, which is quite common.
>>
>>
>> The following patch fixes the issue. It affects 64bit PCI BAR's only. 
>> The idea of the patch is: we introduce the states for low and high BARs
>> whose can have 3 possible values: BAR_VALID, PCIBAR64_PARTIAL_SIZE_QUERY
>> - someone has requested size of one half of the 64bit PCI BAR,
>> PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a request to update the
>> address of one half of the 64bit PCI BAR. The state becomes BAR_VALID
>> when both halfs are in the same state. We ignore BAR value until both
>> states become BAR_VALID
>>
>> Note: Please use the latest Seabios version (commit
>> 139d5ac037de828f89c36e39c6dd15610650cede and later), as older versions
>> didn't initialize high part of 64bit BAR. 
>>
>> The patch is tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server
>>
>> Signed-off-by: Alexey Korolev 
> Interesting. However, looking at guest code,
> I note that memory and io are disabled
> during BAR sizing unless mmio always on is set.
> pci_bar_address should return PCI_BAR_UNMAPPED
> in this case, and we should never map this BAR
> until it's enabled. What's going on?
>
>
Oh, good point. You are right here. Linux developers
added a protection for the lower part of the PCI BAR starting with 2.6.36.
So this issue affects all guest kernels before 2.6.36.
Sorry about the confusion.
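
A minimal sketch of what that protection amounts to, modelled on a fake config
space rather than taken from the kernel source: memory and I/O decoding are
switched off in the command register while the BAR holds the sizing value, then
switched back on. PCI_COMMAND, PCI_COMMAND_IO, PCI_COMMAND_MEMORY and
PCI_BASE_ADDRESS_0 use the standard PCI register values; struct fake_dev, the
cfg_* helpers, size_bar0 and the moved_while_decoding counter are hypothetical
names for this illustration only.

```c
#include <stdint.h>
#include <string.h>

#define PCI_COMMAND         0x04
#define PCI_COMMAND_IO      0x1
#define PCI_COMMAND_MEMORY  0x2
#define PCI_BASE_ADDRESS_0  0x10

/* Hypothetical in-memory stand-in for one PCI function. */
struct fake_dev {
    uint8_t  cfg[64];
    uint32_t bar0_wmask;          /* writable bits of BAR0 */
    int      moved_while_decoding; /* BAR0 changes seen with decode on */
};

uint16_t cfg_read16(struct fake_dev *d, int off)
{
    uint16_t v;
    memcpy(&v, d->cfg + off, sizeof(v));
    return v;
}

void cfg_write16(struct fake_dev *d, int off, uint16_t v)
{
    memcpy(d->cfg + off, &v, sizeof(v));
}

uint32_t cfg_read32(struct fake_dev *d, int off)
{
    uint32_t v;
    memcpy(&v, d->cfg + off, sizeof(v));
    return v;
}

void cfg_write32(struct fake_dev *d, int off, uint32_t v)
{
    if (off == PCI_BASE_ADDRESS_0) {
        uint32_t old = cfg_read32(d, off);
        /* Only size-aligned address bits are writable. */
        v = (v & d->bar0_wmask) | (old & ~d->bar0_wmask);
        /* Count BAR moves an emulator would act on (decode enabled). */
        if (v != old && (cfg_read16(d, PCI_COMMAND) & PCI_COMMAND_MEMORY))
            d->moved_while_decoding++;
    }
    memcpy(d->cfg + off, &v, sizeof(v));
}

/* Size BAR0; with 'protect' set, decoding is off during the probe. */
uint32_t size_bar0(struct fake_dev *d, int protect)
{
    uint16_t cmd = cfg_read16(d, PCI_COMMAND);
    uint32_t orig, sz;

    if (protect)
        cfg_write16(d, PCI_COMMAND,
                    (uint16_t)(cmd & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)));

    orig = cfg_read32(d, PCI_BASE_ADDRESS_0);
    cfg_write32(d, PCI_BASE_ADDRESS_0, 0xFFFFFFFFu);
    sz = cfg_read32(d, PCI_BASE_ADDRESS_0);
    cfg_write32(d, PCI_BASE_ADDRESS_0, orig);

    if (protect)
        cfg_write16(d, PCI_COMMAND, cmd);

    return ~(sz & ~0xFu) + 1;   /* drop flag bits, invert, add one */
}
```

In this model the unprotected probe moves the BAR twice while decoding is on
(to the sizing value and back), and the protected probe never does.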

The code without protection is here:

http://lxr.linux.no/#linux+v2.6.35.9/drivers/pci/probe.c#L162


To solve this issue for older kernel versions the submitted patch is still 
relevant.
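
For readers who have not looked at the patch, the state tracking described in the
quoted text could be sketched roughly as below. This is only one possible reading
of that description, not the submitted patch: the three state names come from the
description, while struct bar64_track, bar64_note_write and the rule "an all-FFs
write is a size query, anything else is an address program" are assumptions made
for the sketch.

```c
#include <stdint.h>

enum half_state {
    BAR_VALID,
    PCIBAR64_PARTIAL_SIZE_QUERY,   /* all FFs written to this half */
    PCIBAR64_PARTIAL_ADDR_PROGRAM  /* an address written to this half */
};

/* Hypothetical per-BAR tracking of the low and high dwords. */
struct bar64_track {
    enum half_state lo;
    enum half_state hi;
};

/*
 * Record a config write to one half of a 64bit BAR.
 * Returns 1 when both halves have reached the same state, meaning the
 * combined 64bit value may be evaluated; returns 0 while the value is
 * half-updated and should be ignored.
 * Assumption: a legitimate address of 0xFFFFFFFF in one half cannot be
 * told apart from a size query in this simplified model.
 */
int bar64_note_write(struct bar64_track *t, int high_half, uint32_t val)
{
    enum half_state s = (val == 0xFFFFFFFFu)
                            ? PCIBAR64_PARTIAL_SIZE_QUERY
                            : PCIBAR64_PARTIAL_ADDR_PROGRAM;

    if (high_half)
        t->hi = s;
    else
        t->lo = s;

    if (t->lo == t->hi) {
        t->lo = BAR_VALID;
        t->hi = BAR_VALID;
        return 1;
    }
    return 0;
}
```

Fed with the four-step Linux sequence (FFs to low, address to low, FFs to high,
address to high), this model reports the BAR as usable only after the last write.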





Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-25 Thread Alexey Korolev
Hi Alex and Michael
>> For testing, I applied the following patch to qemu,
>> converting msix bar to 64 bit.
>> Guest did not seem to crash.
>> I booted Fedora Live CD 32 bit guest on a 32 bit host
>> to level 3 without crash, and verified that
>> the BAR is a 64 bit one, and that I got assigned an address
>> at fe000000.
>> command line I used:
>> qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive
>> file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe
>> -device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci
>> -cdrom Fedora-15-i686-Live-LXDE.iso
>>
>> At boot prompt type tab and add '3' to kernel command line
>> to have guest boot into a fast text console instead
>> of a graphical one which is very slow.
>>
>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
>> index 2ac87ea..5271394 100644
>> --- a/hw/virtio-pci.c
>> +++ b/hw/virtio-pci.c
>> @@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice 
>> *vdev)
>>  memory_region_init(&proxy->msix_bar, "virtio-msix", 4096);
>>  if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors,
>>   &proxy->msix_bar, 1, 0)) {
>> -pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
>> +pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY |
>> + PCI_BASE_ADDRESS_MEM_TYPE_64,
>>   &proxy->msix_bar);
>>  } else
>>  vdev->nvectors = 0;
>>
> I was also able to add MEM64 BARs to device assignment pretty trivially
> and it seems to work, guest sees 64bit BARs for an 82576 VF, programs it
> to an fexx address and it works.
>
> Alex
>

I'd suggest using ivshmem with a buffer size of 32MB to reproduce the problem, in a 
2.6.18 guest for example.

The msix case is not failing because:
1. Buffer size is just 4KB - it will reprogram range from 0xFFFFE000-0xFFFFFFFF 
(it doesn't overlap critical resources to cause immediate panic)
2. The memory_region_init() function doesn't create a backing user memory region, 
so kvm does nothing about remapping in this case.

If you apply the following patch and add to qemu command: --device 
ivshmem,size=32,shm="shm"
---
diff --git a/hw/ivshmem.c b/hw/ivshmem.c
index 1aa9e3b..71f8c21 100644
--- a/hw/ivshmem.c
+++ b/hw/ivshmem.c
@@ -341,7 +341,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd) {
     memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
 
     /* region for shared memory */
-    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar);
+    pci_register_bar(&s->dev, 2, PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64, &s->bar);
 }
 
 static void close_guest_eventfds(IVShmemState *s, int posn)
---

You can get the following bootup log:


Bootdata ok (command line is root=/dev/hda1 console=ttyS0,115200n8 console=tty0)
Linux version 2.6.18 (root@localhost.localdomain) (gcc version 4.1.2 20080704 
(Red Hat 4.1.2-48)) #3 SMP Tue Jan 17 16:37:33 NZDT 2012
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f400 (usable)
 BIOS-e820: 0009f400 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 7fffd000 (usable)
 BIOS-e820: 7fffd000 - 8000 (reserved)
 BIOS-e820: feffc000 - ff00 (reserved)
 BIOS-e820: fffc - 0001 (reserved)
DMI 2.4 present.
No NUMA configuration found
Faking a node at -7fffd000
Bootmem setup node 0 -7fffd000
ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:2 APIC version 17
ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 1, version 17, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
Setting APIC routing to physical flat
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 8800 (gap: 8000:7effc000)
SMP: Allowing 1 CPUs, 0 hotplug CPUs
Built 1 zonelists.  Total pages: 515393
Kernel command line: root=/dev/hda1 console=ttyS0,115200n8 console=tty0
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Using 100.00 MHz WALL HPET GTOD HPET/TSC timer.
time.c: Detected 2500.081 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Checking aperture...
Memory: 2058096k/2097140k available (3256k kernel code, 38656k reserved, 2266k 
data, 204k init)
Calibrating delay using timer specific routin

Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-25 Thread Alex Williamson
On Wed, 2012-01-25 at 17:38 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> > Hi, 
> > In this post
> > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> > mentioned about the issues when 64Bit PCI BAR is present and 32bit
> > address range is selected for it.
> > The issue affects all recent qemu releases and all
> > old and recent guest Linux kernel versions.
> > 
> 
> For testing, I applied the following patch to qemu,
> converting msix bar to 64 bit.
> Guest did not seem to crash.
> I booted Fedora Live CD 32 bit guest on a 32 bit host
> to level 3 without crash, and verified that
> the BAR is a 64 bit one, and that I got assigned an address
> at fe000000.
> command line I used:
> qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive
> file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe
> -device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci
> -cdrom Fedora-15-i686-Live-LXDE.iso
> 
> At boot prompt type tab and add '3' to kernel command line
> to have guest boot into a fast text console instead
> of a graphical one which is very slow.
> 
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 2ac87ea..5271394 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice 
> *vdev)
>  memory_region_init(&proxy->msix_bar, "virtio-msix", 4096);
>  if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors,
>   &proxy->msix_bar, 1, 0)) {
> -pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
> +pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY |
> +  PCI_BASE_ADDRESS_MEM_TYPE_64,
>   &proxy->msix_bar);
>  } else
>  vdev->nvectors = 0;
> 

I was also able to add MEM64 BARs to device assignment pretty trivially
and it seems to work, guest sees 64bit BARs for an 82576 VF, programs it
to an fexx address and it works.

Alex




Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-25 Thread Michael S. Tsirkin
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> Hi, 
> In this post
> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> mentioned about the issues when 64Bit PCI BAR is present and 32bit
> address range is selected for it.
> The issue affects all recent qemu releases and all
> old and recent guest Linux kernel versions.
> 

For testing, I applied the following patch to qemu,
converting msix bar to 64 bit.
Guest did not seem to crash.
I booted Fedora Live CD 32 bit guest on a 32 bit host
to level 3 without crash, and verified that
the BAR is a 64 bit one, and that I got assigned an address
at fe000000.
command line I used:
qemu-system-x86_64 -bios /scm/seabios/out/bios.bin -snapshot -drive
file=qemu-images/f15-test.qcow2,if=none,id=diskid,cache=unsafe
-device virtio-blk-pci,drive=diskid -net user -net nic,model=ne2k_pci
-cdrom Fedora-15-i686-Live-LXDE.iso

At boot prompt type tab and add '3' to kernel command line
to have guest boot into a fast text console instead
of a graphical one which is very slow.

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 2ac87ea..5271394 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -711,7 +711,8 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
     memory_region_init(&proxy->msix_bar, "virtio-msix", 4096);
     if (vdev->nvectors && !msix_init(&proxy->pci_dev, vdev->nvectors,
                                      &proxy->msix_bar, 1, 0)) {
-        pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
+        pci_register_bar(&proxy->pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
                          &proxy->msix_bar);
     } else
         vdev->nvectors = 0;



Re: [Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-25 Thread Michael S. Tsirkin
On Wed, Jan 25, 2012 at 06:46:03PM +1300, Alexey Korolev wrote:
> Hi, 
> In this post
> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I've
> mentioned about the issues when 64Bit PCI BAR is present and 32bit
> address range is selected for it.
> The issue affects all recent qemu releases and all
> old and recent guest Linux kernel versions.
> 
> We've done some investigations. Let me explain what happens.
> Assume we have 64bit BAR with size 32MB mapped at [0xF0000000 -
> 0xF2000000]
> 
> When Linux guest starts it does PCI bus enumeration.
> The OS enumerates 64BIT bars using the following procedure.
> 1. Write all FF's to lower half of 64bit BAR
> 2. Write address back to lower half of 64bit BAR
> 3. Write all FF's to higher half of 64bit BAR
> 4. Write address back to higher half of 64bit BAR
> 
> Linux code is here: 
> http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
> 
> What does it mean for qemu?
> 
> At step 1. qemu pci_default_write_config() recevies all FFs for lower
> part of the 64bit BAR. Then it applies the mask and converts the value
> to "All FF's - size + 1" (FE000000 if size is 32MB).
> Then pci_bar_address() checks if BAR address is valid. Since it is a
> 64bit bar it reads 0x00000000FE000000 - this address is valid. So qemu
> updates topology and sends request to update mappings in KVM with new
> range for the 64bit BAR FE000000 - 0xFFFFFFFF. This usually means kernel
> panic on boot, if there is another mapping in the FE000000 - 0xFFFFFFFF
> range, which is quite common.
> 
> 
> The following patch fixes the issue. It affects 64bit PCI BARs only.
> The idea of the patch is: we introduce states for the low and high halves
> of the BAR, which can have 3 possible values: PCIBAR_VALID;
> PCIBAR64_PARTIAL_SIZE_QUERY - someone has requested the size of one half
> of the 64bit PCI BAR; PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a
> request to update the address of one half of the 64bit PCI BAR. The state
> becomes PCIBAR_VALID when both halves are in the same state. We ignore the
> BAR value until both states become PCIBAR_VALID.
> 
> Note: Please use the latest Seabios version (commit
> 139d5ac037de828f89c36e39c6dd15610650cede or later), as older versions
> didn't initialize the high part of the 64bit BAR.
> 
> The patch has been tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server
> 
> Signed-off-by: Alexey Korolev 

Interesting. However, looking at guest code,
I note that memory and io are disabled
during BAR sizing unless mmio always on is set.
pci_bar_address should return PCI_BAR_UNMAPPED
in this case, and we should never map this BAR
until it's enabled. What's going on?
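
To make the expectation concrete, here is a minimal sketch of the check
I have in mind. It assumes the standard PCI command register layout (I/O
enable is bit 0, memory enable is bit 1). The TOY_* names and the
function are hypothetical and are not the actual code in hw/pci.c.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PCI_COMMAND_IO      0x1u        /* I/O space decoding enabled */
#define TOY_PCI_COMMAND_MEMORY  0x2u        /* memory space decoding enabled */
#define TOY_BAR_UNMAPPED        UINT64_MAX  /* hypothetical "no mapping" marker */

/*
 * Return the address a BAR should be mapped at, or TOY_BAR_UNMAPPED when
 * the matching enable bit in the command register is clear.
 */
static uint64_t toy_bar_address(uint16_t command, int is_io_bar,
                                uint64_t bar_value)
{
    uint16_t enable = is_io_bar ? TOY_PCI_COMMAND_IO : TOY_PCI_COMMAND_MEMORY;

    if (!(command & enable))
        return TOY_BAR_UNMAPPED;
    return bar_value;
}
```

With decoding disabled (command == 0), a transient value written during
sizing would yield TOY_BAR_UNMAPPED and so would never reach the mapping
code.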





[Qemu-devel] [RFC/PATCH] Fix guest OS panic when 64bit BAR is present

2012-01-24 Thread Alexey Korolev
Hi, 
In this post
http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg03171.html I
mentioned the issues that arise when a 64bit PCI BAR is present and a 32bit
address range is selected for it.
The issue affects all recent qemu releases and all
old and recent guest Linux kernel versions.

We've done some investigation. Let me explain what happens.
Assume we have a 64bit BAR with size 32MB mapped at [0xF0000000 -
0xF2000000]

When a Linux guest starts, it does PCI bus enumeration.
The OS enumerates 64bit BARs using the following procedure:
1. Write all FF's to lower half of 64bit BAR
2. Write address back to lower half of 64bit BAR
3. Write all FF's to higher half of 64bit BAR
4. Write address back to higher half of 64bit BAR

Linux code is here: 
http://lxr.linux.no/#linux+v3.2.1/drivers/pci/probe.c#L149
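
For reference, a rough sketch of how a guest can turn the values read back
after steps 1 and 3 into a size. This assumes the usual memory BAR layout,
where the low four bits are type/prefetch flags. The helper name is
hypothetical and this is not the Linux implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Low four bits of a memory BAR hold type/prefetch flags, not address bits. */
#define BAR_MEM_FLAG_MASK 0xFu

/*
 * Decode the size of a 64-bit memory BAR from the values read back
 * after all FFs were written to each half.
 */
static uint64_t bar64_size_from_readback(uint32_t lo, uint32_t hi)
{
    uint64_t mask = ((uint64_t)hi << 32) | (lo & ~BAR_MEM_FLAG_MASK);

    /* An unimplemented BAR reads back as zero in its address bits. */
    if (mask == 0)
        return 0;

    /* Writable address bits read back as 1; the size is the two's complement. */
    return ~mask + 1;
}
```

For a 32MB prefetchable 64bit BAR, the read-back halves would be
0xFE00000C and 0xFFFFFFFF, which decode to 0x2000000 bytes.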

What does it mean for qemu?

At step 1, qemu's pci_default_write_config() receives all FFs for the lower
part of the 64bit BAR. It then applies the mask and converts the value
to "all FFs - size + 1" (0xFE000000 if the size is 32MB).
Then pci_bar_address() checks whether the BAR address is valid. Since it is a
64bit BAR, it reads 0xFE000000, and this address is valid. So qemu
updates the topology and sends a request to update the mappings in KVM with
the new range for the 64bit BAR, 0xFE000000 - 0xFFFFFFFF. This usually means
a kernel panic on boot if there is another mapping in the 0xFE000000 -
0xFFFFFFFF range, which is quite common.
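
The arithmetic can be reproduced with a small standalone model. This is
only a sketch: the struct and function names are hypothetical, the flag
bits are left out, and it assumes the writable mask of a BAR half is
~(size - 1), which is how I understand the masking described above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one 64-bit memory BAR: two 32-bit halves plus the size. */
struct toy_bar64 {
    uint32_t lo;    /* lower half (address bits only, flags omitted) */
    uint32_t hi;    /* upper half */
    uint64_t size;  /* region size in bytes, a power of two */
};

/* Write one half; only address bits at or above log2(size) are writable. */
static void toy_bar64_write(struct toy_bar64 *bar, int high_half, uint32_t val)
{
    uint64_t wmask = ~(bar->size - 1);

    assert(bar->size != 0 && (bar->size & (bar->size - 1)) == 0);
    if (high_half)
        bar->hi = val & (uint32_t)(wmask >> 32);
    else
        bar->lo = val & (uint32_t)wmask;
}

/* The address obtained by combining both halves as they stand right now. */
static uint64_t toy_bar64_address(const struct toy_bar64 *bar)
{
    return ((uint64_t)bar->hi << 32) | bar->lo;
}
```

Starting from a 32MB BAR with lo = 0xF0000000 and hi = 0, writing
0xFFFFFFFF to the lower half leaves a combined address of 0xFE000000, so
the transient range ends at 0xFFFFFFFF.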


The following patch fixes the issue. It affects 64bit PCI BARs only.
The idea of the patch is: we introduce states for the low and high halves
of the BAR, which can have 3 possible values: PCIBAR_VALID;
PCIBAR64_PARTIAL_SIZE_QUERY - someone has requested the size of one half
of the 64bit PCI BAR; PCIBAR64_PARTIAL_ADDR_PROGRAM - someone has sent a
request to update the address of one half of the 64bit PCI BAR. The state
becomes PCIBAR_VALID when both halves are in the same state. We ignore the
BAR value until both states become PCIBAR_VALID.
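
To illustrate how the states evolve over the four enumeration steps, here
is a standalone model of the rule. The TOY_* names are hypothetical
stand-ins for the three states, and the model is a simplification, not
the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the three states described above. */
enum toy_bar_state {
    TOY_VALID,
    TOY_SIZE_QUERY,
    TOY_ADDR_PROGRAM
};

struct toy_bar64_state {
    enum toy_bar_state lo;  /* state of the lower half */
    enum toy_bar_state hi;  /* state of the upper half */
};

/* Record a write to one half and apply the "same state means valid" rule. */
static void toy_note_write(struct toy_bar64_state *s, int high_half,
                           uint32_t val)
{
    enum toy_bar_state next =
        (val == UINT32_MAX) ? TOY_SIZE_QUERY : TOY_ADDR_PROGRAM;

    if (high_half)
        s->hi = next;
    else
        s->lo = next;

    if (s->lo == s->hi)
        s->lo = s->hi = TOY_VALID;
}

/* The BAR value is used for mapping only when both halves are valid. */
static int toy_mappable(const struct toy_bar64_state *s)
{
    return s->lo == TOY_VALID && s->hi == TOY_VALID;
}
```

Starting with both halves valid, steps 1 to 3 leave the halves in
different states, so the BAR value is ignored. Only step 4 brings both
halves to the address-program state and makes the BAR mappable again.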

Note: Please use the latest Seabios version (commit
139d5ac037de828f89c36e39c6dd15610650cede or later), as older versions
didn't initialize the high part of the 64bit BAR.

The patch has been tested on Linux 2.6.18 - 3.1.0 and Windows 2008 Server

Signed-off-by: Alexey Korolev 
---
 hw/pci.c |   45 +
 hw/pci.h |7 +++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 57ec104..3a7deb2 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1055,6 +1055,40 @@ static pcibus_t pci_bar_address(PCIDevice *d,
     return new_addr;
 }
 
+static void pci_update_region_state(PCIDevice *d, uint32_t addr, uint32_t val)
+{
+    PCIIORegion *r;
+    int barnum = (addr - PCI_BASE_ADDRESS_0) >> 2;
+    PCIBARState *state;
+
+    r = &d->io_regions[barnum];
+
+    if (d->io_regions[barnum].type & PCI_BASE_ADDRESS_MEM_TYPE_64) {
+        /* Programming low part of the 64bit BAR */
+        r = &d->io_regions[barnum];
+        state = &r->state_lo;
+    } else if (barnum > 0 &&
+        (d->io_regions[barnum - 1].type & PCI_BASE_ADDRESS_MEM_TYPE_64)) {
+        /* Programming high part of the 64bit BAR */
+        r = &d->io_regions[barnum - 1];
+        state = &r->state_hi;
+    } else {
+        /* Not a 64bit BAR's */
+        d->io_regions[barnum].state_lo = PCIBAR_VALID;
+        return;
+    }
+
+    /* Request to read BAR size */
+    if (val == -1U)
+        *state = PCIBAR64_PARTIAL_SIZE_QUERY;
+    else
+        *state = PCIBAR64_PARTIAL_ADDR_PROGRAM;
+
+
+    if (r->state_lo == r->state_hi)
+        r->state_lo = r->state_hi = PCIBAR_VALID;
+}
+
 static void pci_update_mappings(PCIDevice *d)
 {
     PCIIORegion *r;
@@ -1068,6 +1102,13 @@ static void pci_update_mappings(PCIDevice *d)
         if (!r->size)
             continue;
 
+        /* this region state is invalid */
+        if (r->state_lo != PCIBAR_VALID)
+            continue;
+        if ((r->type & PCI_BASE_ADDRESS_MEM_TYPE_64) &&
+            (r->state_hi != PCIBAR_VALID))
+            continue;
+
         new_addr = pci_bar_address(d, i, r->type, r->size);
 
         /* This bar isn't changed */
@@ -1117,6 +1158,7 @@ uint32_t pci_default_read_config(PCIDevice *d,
 void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
 {
     int i, was_irq_disabled = pci_irq_disabled(d);
+    uint32_t orig_val = val;
 
     for (i = 0; i < l; val >>= 8, ++i) {
         uint8_t wmask = d->wmask[addr + i];
@@ -1133,6 +1175,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
         assigned_dev_update_irqs();
 #endif /* CONFIG_KVM_DEVICE_ASSIGNMENT */
 
+    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24))
+        pci_update_region_state(d, addr, orig_val);
+
     if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
diff --git a/hw/pci.h b/hw/pci.h
index 4220151..5d1e529 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -86,12 +86,19 @@ t