>
> Regards,
> Ray
>
>
>> -----Original Message-----
>> From: Marcel Apfelbaum [mailto:marcel.apfelb...@gmail.com]
>> Sent: Monday, February 8, 2016 6:56 PM
>> To: Ni, Ruiyu <ruiyu...@intel.com>; Laszlo Ersek <ler...@redhat.com>
>> Cc: Justen, Jordan L <jordan.l.jus...@intel.com>;
>> edk2-de...@ml01.01.org;
>> Tian, Feng <feng.t...@intel.com>; Fan, Jeff <jeff....@intel.com>
>> Subject: Re: [edk2] [Patch V4 4/4] MdeModulePkg: Add generic
>> PciHostBridgeDxe driver.
>>
>> Hi,
>>
>> I am sorry for the noise, I am re-sending this mail from an e-mail
>> address subscribed to the list.
>>
>> Thanks,
>> Marcel
>>
>> On 02/08/2016 12:41 PM, Marcel Apfelbaum wrote:
>>> On 02/06/2016 09:09 AM, Ni, Ruiyu wrote:
>>>> Marcel,
>>>> Please see my reply embedded below.
>>>>
>>>> On 2016-02-02 19:07, Laszlo Ersek wrote:
>>>>> On 02/01/16 16:07, Marcel Apfelbaum wrote:
>>>>>> On 01/26/2016 07:17 AM, Ni, Ruiyu wrote:
>>>>>>> Laszlo,
>>>>>>> I now understand your problem.
>>>>>>> Can you tell me why OVMF needs multiple root bridges support?
>>>>>>> My understanding of OVMF is that it is firmware used in a guest
>>>>>>> VM environment to boot an OS.
>>>>>>> The requirement for multiple root bridges currently comes mainly
>>>>>>> from high-end servers.
>>>>>>> Do you mean that the VM guest needs to be like a high-end server?
>>>>>>> This may help me think about a possible solution to your problem.
>>>>>> Hi Ray,
>>>>>>
>>>>>> Laszlo's explanation is very good. This is not exactly about
>>>>>> high-end VMs; we need the extra root bridges to match assigned
>>>>>> devices to their corresponding NUMA node.
>>>>>>
>>>>>> Regarding the OVMF issue, the main problem is that the extra root
>>>>>> bridges are created dynamically for the VMs (command line
>>>>>> parameter) and their resources are computed on the fly.
>>>>>>
>>>>>> Not directly related to the above, the optimal way to allocate
>>>>>> resources for PCI root bridges sharing the same PCI domain is to
>>>>>> sort the devices' MEM/IO ranges from biggest to smallest and use
>>>>>> this order during allocation.
>>>>>>
>>>>>> After the resource allocation is finished we can build the CRS
>>>>>> for each PCI root bridge and pass it back to firmware/OS.
>>>>>>
>>>>>> While for "real" machines we can hard-code the root bridge
>>>>>> resources in some ROM and have them extracted early in the boot
>>>>>> process, for the VM world this would not be possible. Also, any
>>>>>> effort to divide the resource range before the resource
>>>>>> allocation would be odd and far from optimal.
>>
>> Hi Ray,
>> Thank you for your response,
>>
>>>> Real machines use hard-coded resources for root bridges. But when
>>>> the resources cannot meet certain root bridges' requirements,
>>>> firmware can save the real resource requirement per root bridge to
>>>> NV storage and divide the resources among the root bridges on the
>>>> next boot according to the NV settings.
>>>> The MMIO/IO routing in the real machine I mentioned above needs to
>>>> be fixed in a very early phase, before the PciHostBridgeDxe driver
>>>> runs. That is to say, if [2G, 2.8G) is configured to route to root
>>>> bridge #1, only [2G, 2.8G) is allowed to be assigned to root
>>>> bridge #1. And the routing cannot be changed unless a platform
>>>> reset is performed.
>>
>> I understand.
>>
>>>>
>>>> Based on your description, it sounds like all the root bridges in
>>>> OVMF share the same range of resources and any MMIO/IO in the range
>>>> can be routed to any root bridge. For example, every root bridge can
>>>> use [2G, 3G) MMIO.
>>>
>>> Exactly. This is true for "snooping" host-bridges which do not have
>>> their own configuration registers (or MMConfig region). They are
>>> sniffing host-bridge 0 for configuration cycles and if they are
>>> meant for a device on a bus number owned by them, they will forward
>>> the transaction to their primary root bus.
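
As a rough illustration of the snooping behaviour described above, each
extra root bridge can be modelled as owning a range of bus numbers and
claiming any configuration cycle whose target bus falls in that range.
The sketch below is Python; the bridge names, the bus ranges and the
function name are hypothetical and are not taken from QEMU or OVMF.

```python
def owning_root_bridge(bus, bridges):
    """Return the name of the root bridge that owns `bus`, or None.

    `bridges` maps a (hypothetical) bridge name to an inclusive
    (first_bus, last_bus) range of bus numbers it owns.
    """
    for name, (first_bus, last_bus) in bridges.items():
        if first_bus <= bus <= last_bus:
            return name
    return None


# Assumed bus ownership, for illustration only.
EXAMPLE_BRIDGES = {
    "rb0": (0x00, 0x3F),
    "rb1": (0x40, 0x7F),
    "rb2": (0x80, 0xFF),
}
```

A configuration cycle aimed at bus 0x45 would be claimed by "rb1" in
this model and forwarded to its primary root bus.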
>>>
>>>> Until, in the allocation phase, root bridge #1 is assigned to
>>>> [2G, 2.8G), #2 is assigned to [2.8G, 2.9G), #3 is assigned to
>>>> [2.9G, 3G).
>>
>> Correct, but the regions do not have to be contiguous in the above
>> scenario. Root bridge #1 can have [2G, 2.4G) and [2.8G, 3G) while root
>> bridge #2 can have [2.4G, 2.8G).
>>
>> This is so the firmware can distribute the resources in an optimal
>> way. An example can be:
>> - root bridge #1 has a PCI device A with a huge BAR and a PCI
>>   device B with a little BAR.
>> - root bridge #2 has a PCI device C with a medium BAR.
>> The best way to distribute resources over [2G, 3G) is A BAR, C BAR,
>> and only then B BAR.
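
The ordering in this example can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the OVMF or
PciHostBridgeDxe implementation: the BAR sizes (512 MiB, 16 MiB and
256 MiB) are made-up stand-ins for "huge", "little" and "medium", BARs
are assumed to be power-of-two sized and naturally aligned, and all
function names are hypothetical.

```python
from collections import defaultdict

GIB = 1 << 30
MIB = 1 << 20


def align_up(addr, alignment):
    """Round addr up to a multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def allocate_shared_window(bars, base, limit):
    """Place BARs from all root bridges into one shared [base, limit).

    bars is a list of (bridge, device, size) tuples. BARs are placed
    from biggest to smallest, each at its natural alignment.
    """
    placements = []
    cursor = base
    for bridge, device, size in sorted(bars, key=lambda b: b[2], reverse=True):
        start = align_up(cursor, size)
        end = start + size
        if end > limit:
            raise ValueError(f"BAR of device {device} does not fit")
        placements.append((bridge, device, start, end))
        cursor = end
    return placements


def per_bridge_ranges(placements):
    """Collect the [start, end) ranges each bridge ended up with,
    merging ranges that are directly adjacent."""
    ranges = defaultdict(list)
    for bridge, _device, start, end in placements:
        if ranges[bridge] and ranges[bridge][-1][1] == start:
            ranges[bridge][-1] = (ranges[bridge][-1][0], end)
        else:
            ranges[bridge].append((start, end))
    return dict(ranges)


# Assumed sizes for devices A, B (root bridge #1) and C (root bridge #2).
EXAMPLE_BARS = [
    (1, "A", 512 * MIB),
    (1, "B", 16 * MIB),
    (2, "C", 256 * MIB),
]
```

With these assumed sizes and a [2G, 3G) window, A lands at [2G, 2.5G),
C at [2.5G, 2.75G) and B at [2.75G, 2.75G + 16M), so root bridge #1 ends
up with two non-adjacent ranges and root bridge #2 with the range
between them. The per-bridge ranges are what would then be reported
back for each root bridge.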
>>
>>>> So it seems that we need a way to tell the PciHostBridgeDxe driver
>>>> from the PciHostBridgeLib that all resources are sharable among all
>>>> root bridges.
>>
>> This is exactly what we need, indeed.
>>
>>>>
>>>> The real platform case is allocation per root bridge and the OVMF
>>>> case is allocation per PCI domain.
>>
>> Indeed, bare metal servers use a different PCI domain per host
>> bridge, but I've actually seen real servers that have multiple root
>> bridges sharing the same PCI domain, 0.
>>
>>
>>>> Is my understanding correct?
>>
>> It is, and thank you for taking your time to understand the issue,
>> Marcel
>>
>>>>
>>> [...]