On 02/01/16 16:07, Marcel Apfelbaum wrote:
> On 01/26/2016 07:17 AM, Ni, Ruiyu wrote:
>> Laszlo,
>> I now understand your problem.
>> Can you tell me why OVMF needs multiple root bridges support?
>> My understanding of OVMF is that it's firmware used in a guest VM
>> environment to boot an OS.
>> The requirement for multiple root bridges currently comes mainly from
>> high-end servers.
>> Do you mean that the VM guest needs to be like a high-end server?
>> This may help me to think about the possible solution to your problem.
> 
> Hi Ray,
> 
> Laszlo's explanation is very good; this is not exactly about high-end VMs.
> We need the extra root bridges to match assigned devices to their
> corresponding NUMA nodes.
> 
> Regarding the OVMF issue, the main problem is that the extra root
> bridges are created dynamically for the VMs (via command-line
> parameters) and their resources are computed on the fly.
> 
> Not directly related to the above: the optimal way to allocate resources
> for PCI root bridges sharing the same PCI domain is to sort the devices'
> MEM/IO ranges from biggest to smallest and use that order during
> allocation, as sketched below.
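>
> (Purely to illustrate that ordering; the type and helper names below are
> invented, not taken from any existing driver:)
>
>   typedef struct {
>     UINT64  Length;       // requested MEM or IO window size
>     UINT64  Alignment;    // required alignment (Size - 1 for a BAR)
>     UINT64  BaseAddress;  // filled in by the allocator
>   } RESOURCE_REQUEST;
>
>   //
>   // Comparator that could be fed to SortLib's PerformQuickSort():
>   // biggest requests first, so that large, strongly aligned windows are
>   // placed before small ones and the shared per-domain range is
>   // fragmented as little as possible.
>   //
>   INTN
>   EFIAPI
>   CompareRequestsDescending (
>     IN CONST VOID  *Left,
>     IN CONST VOID  *Right
>     )
>   {
>     CONST RESOURCE_REQUEST  *L;
>     CONST RESOURCE_REQUEST  *R;
>
>     L = Left;
>     R = Right;
>     if (L->Length == R->Length) {
>       return 0;
>     }
>     return (L->Length < R->Length) ? 1 : -1;
>   }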
> 
> After the resource allocation is finished, we can build the CRS for each
> PCI root bridge and pass it back to the firmware/OS.
> 
> While for "real" machines we can hard-code the root bridge resources in
> some ROM and have them extracted early in the boot process, for the VM
> world this would not be possible. Also, any effort to divide the
> resource range before the resource allocation would be odd and far from
> optimal.
> 
> Regarding a possible solution, I first need to understand why the
> resource allocation is done per PCI root bridge and not per PCI domain.
> The CRS allows a PCI root bridge to have several MEM/IO ranges, so there
> is really no need to impose per-root-bridge logic.
> 
> I am new to the edk2 project so I might get things wrong, but I think we
> need a way to specify whether the PCI root bridges will supply their own
> resources or an external allocator will do the job.
> Laszlo's proposed solution looks like a way to implement such a policy;
> I am personally OK with it.
> 
> I really think the generic PciHostBridgeDxe driver is the right way to
> go for OVMF as well; we just need a way to deal with this issue.

For the sake of discussion, I pushed my WIP patches to:

https://github.com/lersek/edk2/commits/pci_host_wip

It is a faithful conversion of OvmfPkg/PciHostBridgeDxe/ to
MdeModulePkg's PciHostBridgeDxe, plus an OVMF-specific PciHostBridgeLib
instance.

The problem is of course patch

  MdeModulePkg: PciHostBridgeDxe: allow PciHostBridgeLib to prime the
  GCD map

which we've been discussing in this thread.

Thanks
Laszlo


>> -----Original Message-----
>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>> Sent: Tuesday, January 26, 2016 11:10 AM
>> To: Ni, Ruiyu <ruiyu...@intel.com>
>> Cc: edk2-de...@ml01.01.org; Tian, Feng <feng.t...@intel.com>; Fan,
>> Jeff <jeff....@intel.com>; Justen, Jordan L
>> <jordan.l.jus...@intel.com>; Marcel Apfelbaum <mar...@redhat.com>
>> Subject: Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic
>> PciHostBridgeDxe driver.
>>
>> On 01/26/16 03:42, Ni, Ruiyu wrote:
>>> Laszlo,
>>> Thanks for the detailed explanation. I agree with your idea that a
>>> generic driver should consider the case where part of the IO/MMIO
>>> resources may already have been added by the platform, whether in PEI
>>> or in DXE.
>>>
>>> So I think we may not need the flag you proposed here. To achieve
>>> maximum flexibility, we can use gDS->GetMemorySpaceMap() /
>>> gDS->GetIoSpaceMap() to retrieve the current MMIO/IO space maps and
>>> only add those ranges that are not already present.
>>>
>>> And to be more robust, we can check that the range assigned to the
>>> root bridge has not been allocated by anyone (it CAN already have been
>>> added by someone).
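>>>
>>> (A minimal sketch of that check, untested; the GCD services are real,
>>> but the surrounding variables, e.g. "Mem" standing for the root
>>> bridge's MMIO aperture, are only for illustration:)
>>>
>>>   EFI_STATUS                       Status;
>>>   UINTN                            NumberOfDescriptors;
>>>   EFI_GCD_MEMORY_SPACE_DESCRIPTOR  *MemorySpaceMap;
>>>   UINTN                            Index;
>>>   BOOLEAN                          AlreadyAdded;
>>>
>>>   Status = gDS->GetMemorySpaceMap (&NumberOfDescriptors, &MemorySpaceMap);
>>>   ASSERT_EFI_ERROR (Status);
>>>
>>>   //
>>>   // Only call gDS->AddMemorySpace() for the aperture if no existing
>>>   // descriptor already covers it; the platform may have added it in
>>>   // PEI via a resource descriptor HOB.
>>>   //
>>>   AlreadyAdded = FALSE;
>>>   for (Index = 0; Index < NumberOfDescriptors; Index++) {
>>>     if ((MemorySpaceMap[Index].GcdMemoryType == EfiGcdMemoryTypeMemoryMappedIo) &&
>>>         (MemorySpaceMap[Index].BaseAddress <= Mem.Base) &&
>>>         (MemorySpaceMap[Index].BaseAddress + MemorySpaceMap[Index].Length >
>>>          Mem.Limit)) {
>>>       AlreadyAdded = TRUE;
>>>       break;
>>>     }
>>>   }
>>>   FreePool (MemorySpaceMap);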
>>>
>>> What's your opinion?
>>
>> Very good, but painful questions. :)
>>
>> Until now, OVMF has relied upon PlatformPei producing the HOBs in
>> question (where the base of the 32-bit PCI MMIO window varies in the
>> HOB, based on guest RAM size), *and* OVMF's PCI host bridge driver has
>> never even tried to match the root bridges' 32-bit MMIO apertures to
>> anything at all. All of those apertures are set as [2GB, 4GB).
>>
>> The idea being that the gDS->AllocateMemorySpace() calls can be
>> satisfied from whatever room is left in the range that the HOB
>> identified.
>> Especially in the case of multiple root bridges, I wouldn't know *how*
>> to divide up the MMIO address space between them. (Dividing up the bus
>> range was hard enough.)
>>
>> Same for IO ports -- all root bridges share the same [0xC000, 0xFFFF]
>> port range.
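>>
>> (Expressed with the new PciHostBridgeLib types, just to illustrate;
>> this is not the actual conversion:)
>>
>>   //
>>   // Every root bridge gets the same windows; individual allocations are
>>   // then carved out of whatever the PlatformPei HOB left available.
>>   //
>>   PCI_ROOT_BRIDGE_APERTURE  Io  = { 0xC000,     0xFFFF     };  // shared IO ports
>>   PCI_ROOT_BRIDGE_APERTURE  Mem = { 0x80000000, 0xFFFFFFFF };  // [2GB, 4GB)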
>>
>> Therefore your idea above would immediately break OVMF, functionally :)
>>
>> (I suspect that OVMF is kinda broken already, in this aspect, but it
>> happens to work.)
>>
>> I know precious little about PCI resource allocation. This is one of the
>> areas we intend to look into seriously in the future, with Marcel
>> (CC'd). Thus, if porting OVMF to the generic PciHostBridgeDxe driver
>> requires incompatible changes at once, then I think we should postpone
>> the port until Marcel returns from his vacation.
>>
>> To reiterate, my current problem is ultimately the lack of an algorithm
>> for dynamically dividing up the IO and MMIO space between root bridges.
>> Do you have an idea for that? (The algorithm, if there is one, could
>> very well be specific to QEMU -- which too is where we'll need Marcel's
>> input.)
>>
>> Thanks!
>> Laszlo
>>
>>>
>>>
>>> Regards,
>>> Ray
>>>
>>>
>>> -----Original Message-----
>>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>>> Sent: Tuesday, January 26, 2016 10:20 AM
>>> To: Ni, Ruiyu <ruiyu...@intel.com>
>>> Cc: edk2-de...@ml01.01.org; Tian, Feng <feng.t...@intel.com>; Fan,
>>> Jeff <jeff....@intel.com>; Justen, Jordan L <jordan.l.jus...@intel.com>
>>> Subject: Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic
>>> PciHostBridgeDxe driver.
>>>
>>> Hi Ray,
>>>
>>> I have a work-in-progress series to convert OvmfPkg to use this driver.
>>> While testing it, I ran into a failed assertion:
>>>
>>> On 01/13/16 09:00, Ruiyu Ni wrote:
>>>> This driver links to PciHostBridgeLib provided by platform/silicon to
>>>> produce PciRootBridgeIo and PciHostBridgeResourceAllocation protocol.
>>>>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Ruiyu Ni <ruiyu...@intel.com>
>>>> Cc: Jeff Fan <jeff....@intel.com>
>>>> Cc: Feng Tian <feng.t...@intel.com>
>>>> ---
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciHostBridge.c       | 1134 ++++++++++++++
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciHostBridge.h       |  252 ++++
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciHostBridgeDxe.inf  |   55 +
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciHostResource.h     |   47 +
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciRootBridge.h       |  568 +++++++
>>>>   .../Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c     | 1561 ++++++++++++++++++++
>>>>   MdeModulePkg/MdeModulePkg.dsc                      |    2 +
>>>>   7 files changed, 3619 insertions(+)
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.h
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridgeDxe.inf
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostResource.h
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridge.h
>>>>   create mode 100644 MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
>>>>
>>>> diff --git a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>>>> new file mode 100644
>>>> index 0000000..08285d8
>>>> --- /dev/null
>>>> +++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>>>> @@ -0,0 +1,1134 @@
>>>> +/** @file
>>>> +
>>>> +  Provides the basic interfaces to abstract a PCI Host Bridge Resource Allocation.
>>>> +
>>>> +Copyright (c) 1999 - 2016, Intel Corporation. All rights reserved.<BR>
>>>> +This program and the accompanying materials
>>>> +are licensed and made available under the terms and conditions of the BSD License
>>>> +which accompanies this distribution.  The full text of the license may be found at
>>>> +http://opensource.org/licenses/bsd-license.php
>>>> +
>>>> +THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
>>>> +WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
>>>> +
>>>> +**/
>>>> +
>>>> +#include "PciHostBridge.h"
>>>> +#include "PciRootBridge.h"
>>>> +#include "PciHostResource.h"
>>>> +
>>>> +
>>>> +EFI_METRONOME_ARCH_PROTOCOL *mMetronome;
>>>> +EFI_CPU_IO2_PROTOCOL        *mCpuIo;
>>>> +
>>>> +GLOBAL_REMOVE_IF_UNREFERENCED CHAR16 *mAcpiAddressSpaceTypeStr[] = {
>>>> +  L"Mem", L"I/O", L"Bus"
>>>> +};
>>>> +GLOBAL_REMOVE_IF_UNREFERENCED CHAR16 *mPciResourceTypeStr[] = {
>>>> +  L"I/O", L"Mem", L"PMem", L"Mem64", L"PMem64", L"Bus"
>>>> +};
>>>> +
>>>> +/**
>>>> +
>>>> +  Entry point of this driver.
>>>> +
>>>> +  @param ImageHandle  Image handle of this driver.
>>>> +  @param SystemTable  Pointer to standard EFI system table.
>>>> +
>>>> +  @retval EFI_SUCCESS       Succeed.
>>>> +  @retval EFI_DEVICE_ERROR  Fail to install PCI_ROOT_BRIDGE_IO protocol.
>>>> +
>>>> +**/
>>>> +EFI_STATUS
>>>> +EFIAPI
>>>> +InitializePciHostBridge (
>>>> +  IN EFI_HANDLE         ImageHandle,
>>>> +  IN EFI_SYSTEM_TABLE   *SystemTable
>>>> +  )
>>>> +{
>>>> +  EFI_STATUS                  Status;
>>>> +  PCI_HOST_BRIDGE_INSTANCE    *HostBridge;
>>>> +  PCI_ROOT_BRIDGE_INSTANCE    *RootBridge;
>>>> +  PCI_ROOT_BRIDGE             *RootBridges;
>>>> +  UINTN                       RootBridgeCount;
>>>> +  UINTN                       Index;
>>>> +  PCI_ROOT_BRIDGE_APERTURE    *MemApertures[4];
>>>> +  UINTN                       MemApertureIndex;
>>>> +
>>>> +  RootBridges = PciHostBridgeGetRootBridges (&RootBridgeCount);
>>>> +  if ((RootBridges == NULL) || (RootBridgeCount == 0)) {
>>>> +    return EFI_UNSUPPORTED;
>>>> +  }
>>>> +
>>>> +  Status = gBS->LocateProtocol (&gEfiMetronomeArchProtocolGuid, NULL, (VOID **) &mMetronome);
>>>> +  ASSERT_EFI_ERROR (Status);
>>>> +  Status = gBS->LocateProtocol (&gEfiCpuIo2ProtocolGuid, NULL, (VOID **) &mCpuIo);
>>>> +  ASSERT_EFI_ERROR (Status);
>>>> +
>>>> +  //
>>>> +  // Most systems in the world including complex servers have only one Host Bridge.
>>>> +  //
>>>> +  HostBridge = AllocateZeroPool (sizeof (PCI_HOST_BRIDGE_INSTANCE));
>>>> +  ASSERT (HostBridge != NULL);
>>>> +
>>>> +  HostBridge->Signature        = PCI_HOST_BRIDGE_SIGNATURE;
>>>> +  HostBridge->CanRestarted     = TRUE;
>>>> +  InitializeListHead (&HostBridge->RootBridges);
>>>> +
>>>> +  HostBridge->ResAlloc.NotifyPhase          = NotifyPhase;
>>>> +  HostBridge->ResAlloc.GetNextRootBridge    = GetNextRootBridge;
>>>> +  HostBridge->ResAlloc.GetAllocAttributes   = GetAttributes;
>>>> +  HostBridge->ResAlloc.StartBusEnumeration  = StartBusEnumeration;
>>>> +  HostBridge->ResAlloc.SetBusNumbers        = SetBusNumbers;
>>>> +  HostBridge->ResAlloc.SubmitResources      = SubmitResources;
>>>> +  HostBridge->ResAlloc.GetProposedResources = GetProposedResources;
>>>> +  HostBridge->ResAlloc.PreprocessController = PreprocessController;
>>>> +
>>>> +  Status = gBS->InstallMultipleProtocolInterfaces (
>>>> +                  &HostBridge->Handle,
>>>> +                  &gEfiPciHostBridgeResourceAllocationProtocolGuid, &HostBridge->ResAlloc,
>>>> +                  NULL
>>>> +                  );
>>>> +  if (EFI_ERROR (Status)) {
>>>> +    FreePool (HostBridge);
>>>> +    PciHostBridgeFreeRootBridges (RootBridges, RootBridgeCount);
>>>> +    return Status;
>>>> +  }
>>>> +
>>>> +  //
>>>> +  // Create Root Bridge Device Handle in this Host Bridge
>>>> +  //
>>>> +  for (Index = 0; Index < RootBridgeCount; Index++) {
>>>> +    //
>>>> +    // Create Root Bridge Handle Instance
>>>> +    //
>>>> +    RootBridge = CreateRootBridge (&RootBridges[Index], HostBridge->Handle);
>>>> +    ASSERT (RootBridge != NULL);
>>>> +    if (RootBridge == NULL) {
>>>> +      continue;
>>>> +    }
>>>> +
>>>> +    if (RootBridges[Index].Io.Limit > RootBridges[Index].Io.Base) {
>>>> +      Status = gDS->AddIoSpace (
>>>> +                      EfiGcdIoTypeIo,
>>>> +                      RootBridges[Index].Io.Base,
>>>> +                      RootBridges[Index].Io.Limit - RootBridges[Index].Io.Base + 1
>>>> +                      );
>>>> +      ASSERT_EFI_ERROR (Status);
>>>> +    }
>>>
>>> This assertion fails here with "Access Denied".
>>>
>>> The reason is that we still add the IO and MMIO space in PEI, as
>>> resource descriptor HOBs, and the DXE core then primes the GCD memory
>>> space map from them. Adding the same IO space here with the
>>> corresponding DXE service justifiably runs into a conflict, and it is
>>> rejected.
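>>>
>>> (For reference, this is roughly how such an aperture ends up in the
>>> GCD map before this driver runs; the variable names and the attribute
>>> set are illustrative, not OVMF's exact code:)
>>>
>>>   //
>>>   // In PlatformPei: describe the 32-bit PCI window as an MMIO resource.
>>>   // The DXE core turns this HOB into a GCD memory space map entry, so a
>>>   // later gDS->AddMemorySpace()/AddIoSpace() for the same range fails
>>>   // with EFI_ACCESS_DENIED. The IO window is described similarly, with
>>>   // EFI_RESOURCE_IO.
>>>   //
>>>   BuildResourceDescriptorHob (
>>>     EFI_RESOURCE_MEMORY_MAPPED_IO,
>>>     EFI_RESOURCE_ATTRIBUTE_PRESENT |
>>>     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |
>>>     EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE,
>>>     PciMmioBase,   // depends on guest RAM size
>>>     PciMmioSize    // up to the 4 GB boundary
>>>     );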
>>>
>>> My question is whether we could make the GCD memory space map
>>> manipulation optional in this driver:
>>>
>>> The PCI_ROOT_BRIDGE structure could be extended with a UINT32 flag. For
>>> each of Io, Mem, MemAbove4G, PMem, PMemAbove4G, the PciHostBridgeLib
>>> instance producing the PCI_ROOT_BRIDGE structure could set a separate
>>> bit, indicating whether PciHostBridgeDxe should add (and set the
>>> attributes of) the given aperture to the GCD memory space map.
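>>>
>>> (Something along these lines; the member and macro names below are
>>> only a suggestion:)
>>>
>>>   //
>>>   // Candidate bits for a new UINT32 field (say, "AddGcdFlags") in
>>>   // PCI_ROOT_BRIDGE. PciHostBridgeDxe would add an aperture to the GCD
>>>   // map, and set its attributes, only if the matching bit is set. A
>>>   // platform whose PEI phase already describes the apertures via
>>>   // resource descriptor HOBs would leave all bits clear.
>>>   //
>>>   #define PCI_ROOT_BRIDGE_ADD_IO             BIT0
>>>   #define PCI_ROOT_BRIDGE_ADD_MEM            BIT1
>>>   #define PCI_ROOT_BRIDGE_ADD_MEM_ABOVE_4G   BIT2
>>>   #define PCI_ROOT_BRIDGE_ADD_PMEM           BIT3
>>>   #define PCI_ROOT_BRIDGE_ADD_PMEM_ABOVE_4G  BIT4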
>>>
>>> Since this is a generic driver, I think it should accommodate the case
>>> where the relevant apertures are described during PEI. (Not just because
>>> that's what OVMF does, but because said practice is valid, as far as I
>>> understand / recall the PI specs.)
>>>
>>> In my (limited) experience it varies between PCI host bridge drivers
>>> whether they install the apertures themselves, or rely on them already
>>> being present.
>>>
>>> In ArmVirtPkg/PciHostBridgeDxe/, the driver installs the (emulated) IO
>>> and (real) MMIO space itself; while in OvmfPkg (and the original
>>> PcAtChipsetPkg driver), this does not happen.
>>>
>>> If you agree with the idea, I can try writing a patch for it.
>>>
>>> Alternatively, you could require from the PciHostBridgeGetRootBridges()
>>> API that on return, all the necessary GCD memory space map entries be in
>>> place. At the moment, both approaches would work for me.
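>>>
>>> (That is, the contract would become something like the following; the
>>> comment is only a description of the alternative, and the prototype is
>>> as implied by the call in the quoted patch:)
>>>
>>>   //
>>>   // Under this alternative, the library instance itself calls
>>>   // gDS->AddIoSpace() / gDS->AddMemorySpace() for every aperture (or
>>>   // relies on PEI resource descriptor HOBs), so that the returned
>>>   // apertures are guaranteed to be present in the GCD maps already.
>>>   //
>>>   PCI_ROOT_BRIDGE *
>>>   EFIAPI
>>>   PciHostBridgeGetRootBridges (
>>>     UINTN  *Count
>>>     );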
>>>
>>> ... Hm, I was too curious, and wrote the patch just now. OVMF (ported to
>>> PciHostBridgeDxe) works well with it; I retested the multiple root
>>> bridges case too, and compared the OVMF debug log & the Linux guest
>>> dmesg before<->after. The resource assignments are the same. I also
>>> regression tested a Windows 8.1 guest, with 6GB RAM.
>>>
>>> So, what do you think of the attached patch? If you are okay with it, I
>>> can submit it as part of my upcoming OVMF series.
>>>
>>> Thanks!
>>> Laszlo
>>>
>>> [...]
>>>
>>
>>
> 

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
