On Mon, Mar 04, 2024 at 02:25:45PM +0100, Jan Beulich wrote:
> On 04.03.2024 11:02, Roger Pau Monné wrote:
> > On Mon, Mar 04, 2024 at 08:32:22AM +0100, Jan Beulich wrote:
> >> BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
> >> the lower 2Gb range and the top of the higher 2Gb range have special
> >> purpose. Don't even have them influence whether to (perhaps) relocate
> >> low RAM.
> > 
> > Here you mention 2Gb BARs, yet the code below sets the maximum BAR
> > size supported below 4Gb to 1Gb.
> 
> Hmm, I'm puzzled: There are no other BAR sizes between 1Gb and 2Gb.
> Anything up to 1Gb continues to work as is, while everything 2Gb and
> up has behavior changed.

My bad, I was confused.

> >> --- a/tools/firmware/hvmloader/pci.c
> >> +++ b/tools/firmware/hvmloader/pci.c
> >> @@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
> >>  const uint32_t pci_mem_end = RESERVED_MEMBASE;
> >>  uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
> >>  
> >> +/*
> >> + * BARs larger than this value are put in 64-bit space unconditionally.  
> >> That
> >> + * is, such BARs also don't play into the determination of how big the 
> >> lowmem
> >> + * MMIO hole needs to be.
> >> + */
> >> +#define HUGE_BAR_THRESH GB(1)
> > 
> > I would be fine with defining this to an even lower number, like
> > 256Mb, as to avoid as much as possible memory relocation in order to
> > make the MMIO hole bigger.
> 
> As suggested in a post-commit-message remark, the main question then is
> how to justify this.

I think the justification is to avoid having to relocate memory in
order to attempt to make the hole below 4Gb larger.

> >> @@ -367,7 +376,7 @@ void pci_setup(void)
> >>              pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
> >>      }
> >>  
> >> -    if ( mmio_total > (pci_mem_end - pci_mem_start) )
> >> +    if ( mmio_total > (pci_mem_end - pci_mem_start) || bar64_relocate )
> >>      {
> >>          printf("Low MMIO hole not large enough for all devices,"
> >>                 " relocating some BARs to 64-bit\n");
> > 
> > Is the above message now accurate?  Given the current code the low
> > MMIO could be expanded up to 2Gb, yet BAR relocation will happen
> > unconditionally once a 1Gb BAR is found.
> 
> Well, "all" may not be quite accurate anymore, yet would making it e.g.
> "all applicable" really much more meaningful?
> 
> >> @@ -446,8 +455,9 @@ void pci_setup(void)
> >>           *   the code here assumes it to be.)
> >>           * Should either of those two conditions change, this code will 
> >> break.
> >>           */
> >> -        using_64bar = bars[i].is_64bar && bar64_relocate
> >> -            && (mmio_total > (mem_resource.max - mem_resource.base));
> >> +        using_64bar = bars[i].is_64bar && bar64_relocate &&
> >> +            (mmio_total > (mem_resource.max - mem_resource.base) ||
> >> +             bar_sz > HUGE_BAR_THRESH);
> >>          bar_data = pci_readl(devfn, bar_reg);
> >>  
> >>          if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
> >> @@ -467,7 +477,8 @@ void pci_setup(void)
> >>                  resource = &mem_resource;
> >>                  bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
> >>              }
> >> -            mmio_total -= bar_sz;
> >> +            if ( bar_sz <= HUGE_BAR_THRESH )
> >> +                mmio_total -= bar_sz;
> > 
> > I'm missing the part where hvmloader notifies QEMU of the possibly
> > expanded base and size memory PCI MMIO regions, so that those are
> > reflected in the PCI root complex registers?
> 
> I don't understand this comment: I'm not changing the interaction
> with qemu at all. Whatever the new calculation it'll be communicated
> to qemu just as before.

That wasn't a complain about the patch, just me failing to see where
this is done.

> > Overall I think we could simplify the code by having a hardcoded 1Gb
> > PCI MMIO hole below 4Gb, fill it with all the 32bit BARs and
> > (re)locate all 64bit BARs above 4Gb (not that I'm requesting you to do
> > it here).
> 
> I'm afraid that would not work very well with OSes which aren't 64-bit
> BAR / PA aware (first and foremost non-PAE 32-bit ones). Iirc that's
> the reason why it wasn't done like you suggest back at the time.

There will still be a ~1Gb window < 4Gb, so quite a bit of space.

I'm unsure whether such OSes will have drivers to manage devices with
that huge BARs in the first place.

Thanks, Roger.

Reply via email to