Re: [Qemu-devel] [PATCH] RFC: delay pci_update_mappings for 64-bit BARs

2010-12-14 Thread Cam Macdonell
On Mon, Dec 13, 2010 at 8:00 PM, Isaku Yamahata yamah...@valinux.co.jp wrote:
 On Mon, Dec 13, 2010 at 03:43:44PM -0700, Cam Macdonell wrote:
 Do not call pci_update_mappings on the lower 32-bits of a 64-bit bar.  Wait 
 for the upper 32 or else Qemu will try to map on just the lower 32 which is 
 probably going to corrupt memory.

 I was encountering crashes when mapping certain PCI region sizes.  The 
 problem turns out that pci_update_mappings is being called without all 
 64-bits in the BAR.  For example when mapping to 0x18000, once the lower 
 32-bits were written the remapping happened (mapping to 0x800) which 
 would overwrite something.

 I'm not certain if this is completely correct, I'm simply testing the lower 
 4-bits to only be MEM_TYPE_64 flag.  Upper 32-bit address parts can be 
 values like 0xff which is tricky to test against.

 You're assuming that guest OS always write lower 32bit and them upper 32bit.
 Is the assumption correct?
 I found Linux does, but I don't know about other OSes.
 And I couldn't find any sentence about how to update (64bit) BAR in the specs.
 (Please correct me if I missed it)

I think you're right, we probably can't assume the order.


 Some work around would be necessary regardless of 32bit-or-64bit.
 because qemu doesn't emulate bus accurately at the moment.
 How about the followings?
 If BAR overlaps with RAM, don't map BAR.
 If BAR overlaps with other BARs, record the overlapping and
 when updating one of the BARs, update all the overlapping BARs.
 Which BAR wins depends on the order of updating, it doesn't matter because
 it's anomaly case.

But the addresses in the BARs may not overlap.  For example, Linux
allocates memory from top down, so I recently had the mapping of a BAR
to address 0xffc000

So BAR 0x18 sees 0xc004
Then BAR 0x1c sees 0xff

So if I understand what you mean by overlapping BARs, 0xc000 and
0xffc000 will not be detected as overlapping and so we can't
record it.  But, we can allow harmless mappings of the incomplete
lower-32 to proceed and then get remapped when the upper bits are
written.  (This is what happens currently, but fails when the lower-32
overwrite RAM).

Case of writing upper-then-lower (non-Linux case):
The addresses in the upper 32-bits are going to be limited to 16-bits
(at most 48-bit addresses currently) and so those shouldn't update
mappings because they will overlap with RAM.  When the lower-bits are
written, we have the full 64-bit address and can update mappings.

Case of writing lower-then-upper:
If the lower 32-bit BAR address doesn't conflict with RAM, map it.
When the upper bits are written, update to the correct mapping.

We would just have to ensure the first mapping is indeed harmless.

Would that work?
Cam


 This way, 32bit BAR case is also covered.

 thanks,


 Cam
 ---
  hw/pci.c |    5 -
  1 files changed, 4 insertions(+), 1 deletions(-)

 diff --git a/hw/pci.c b/hw/pci.c
 index 438c0d1..3b81792 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -1000,6 +1000,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
 addr, uint32_t val, int l)
  {
      int i, was_irq_disabled = pci_irq_disabled(d);
      uint32_t config_size = pci_config_size(d);
 +    int is_64 = 0;
 +
 +    is_64 = ((val  0xf) == PCI_BASE_ADDRESS_MEM_TYPE_64);

      for (i = 0; i  l  addr + i  config_size; val = 8, ++i) {
          uint8_t wmask = d-wmask[addr + i];
 @@ -1008,7 +1011,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
 addr, uint32_t val, int l)
          d-config[addr + i] = (d-config[addr + i]  ~wmask) | (val  
 wmask);
          d-config[addr + i] = ~(val  w1cmask); /* W1C: Write 1 to Clear */
      }
 -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
 +    if ((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24)  (!is_64)) ||
          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
          ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
          range_covers_byte(addr, l, PCI_COMMAND))
 --
 1.7.0.4



 --
 yamahata


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] RFC: delay pci_update_mappings for 64-bit BARs

2010-12-13 Thread Isaku Yamahata
On Mon, Dec 13, 2010 at 03:43:44PM -0700, Cam Macdonell wrote:
 Do not call pci_update_mappings on the lower 32-bits of a 64-bit bar.  Wait 
 for the upper 32 or else Qemu will try to map on just the lower 32 which is 
 probably going to corrupt memory.
 
 I was encountering crashes when mapping certain PCI region sizes.  The 
 problem turns out that pci_update_mappings is being called without all 
 64-bits in the BAR.  For example when mapping to 0x18000, once the lower 
 32-bits were written the remapping happened (mapping to 0x800) which 
 would overwrite something.
 
 I'm not certain if this is completely correct, I'm simply testing the lower 
 4-bits to only be MEM_TYPE_64 flag.  Upper 32-bit address parts can be values 
 like 0xff which is tricky to test against.

You're assuming that guest OS always write lower 32bit and them upper 32bit.
Is the assumption correct?
I found Linux does, but I don't know about other OSes.
And I couldn't find any sentence about how to update (64bit) BAR in the specs.
(Please correct me if I missed it)

Some work around would be necessary regardless of 32bit-or-64bit.
because qemu doesn't emulate bus accurately at the moment.
How about the followings?
If BAR overlaps with RAM, don't map BAR.
If BAR overlaps with other BARs, record the overlapping and
when updating one of the BARs, update all the overlapping BARs.
Which BAR wins depends on the order of updating, it doesn't matter because
it's anomaly case.

This way, 32bit BAR case is also covered.

thanks,

 
 Cam
 ---
  hw/pci.c |5 -
  1 files changed, 4 insertions(+), 1 deletions(-)
 
 diff --git a/hw/pci.c b/hw/pci.c
 index 438c0d1..3b81792 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
 @@ -1000,6 +1000,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
 addr, uint32_t val, int l)
  {
  int i, was_irq_disabled = pci_irq_disabled(d);
  uint32_t config_size = pci_config_size(d);
 +int is_64 = 0;
 +
 +is_64 = ((val  0xf) == PCI_BASE_ADDRESS_MEM_TYPE_64);
  
  for (i = 0; i  l  addr + i  config_size; val = 8, ++i) {
  uint8_t wmask = d-wmask[addr + i];
 @@ -1008,7 +1011,7 @@ void pci_default_write_config(PCIDevice *d, uint32_t 
 addr, uint32_t val, int l)
  d-config[addr + i] = (d-config[addr + i]  ~wmask) | (val  wmask);
  d-config[addr + i] = ~(val  w1cmask); /* W1C: Write 1 to Clear */
  }
 -if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
 +if ((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24)  (!is_64)) ||
  ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
  ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
  range_covers_byte(addr, l, PCI_COMMAND))
 -- 
 1.7.0.4
 
 

-- 
yamahata
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html