On 06/04/2015 02:11 AM, Laszlo Ersek wrote:
> On 06/03/15 22:34, Marcel Apfelbaum wrote:
>> On 06/03/2015 01:20 PM, Laszlo Ersek wrote:
>
>>> Maybe we can experiment some more; for example we could start by
>>> you explaining to me how exactly to probe for a root bus's presence
>>> (you mentioned device 0, but I'll need more than that).
>
>> Well, I lied. :)
>> I had a look now on seabios and it does the following:
>> - Receives using a fw_config file the number of extra root buses.
>> - It starts scanning from bus 0 to bus 0xff until it discovers all
>>    the extra root buses. The 'discovery' is "go over all bus's slots
>>    and probe for a non empty PCI header". If you find at least one
>>    device you just discovered a new PCI root bus.
>
> I thought about checking the VendorId header field for dev=0 func=0 on
> each bus. (Sources on the net indicate that the VendorId field is
> usually queried for presence -- all bits one means "nope".)
>
>> I think that we can improve the fw_config file to pass the actually
>> bus numbers and not only the total. In this way should be relatively
>> easy for edk2 to handle the extra root buses.
>
> Yes. I had thought this would be the easiest. I wasn't sure though if
> you'd appreciate such an idea :)
Well, this is the right thing to do and also the implementation will be 
straight forward.

>
>>> For the bus range allocation, here's an idea:
>>> - create a bitmap with 256 bits (32 bytes) with all bits zero
>>> - probe all root buses; whatever is found, flip its bit to 1
>>> - assuming N root buses were found, divide the number of remaining
>>>    zero bits with N. The quotient Q means how many subordinate buses
>>>    each root bus would be able to accommodate
>>> - for each root bus:
>>>     - create an ACPI bus range descriptor that includes only the root
>>>       bus's number
>>>     - pull out Q zero bits from the bitmap, from the left, flipping
>>>       them to one as you proceed
>>>     - for each zero bit pulled, try to append that bus number to the
>>>       ACPI bus range descriptor (simply bumping the end). If there's a
>>>       discontinuity, start a new ACPI bus range descriptor.
>>>
>>> This greedy algorithm would grant each root bus the same number of
>>> possible subordinate buses, could be implemented in linear time, and
>>> would keep the individual bus ranges "reasonably continuous" (ie.
>>> there should be a reasonably low number of ACPI bus range
>>> descriptors, per root bus).
>>>
>>> What do you think? This wouldn't be a very hard patch to write, and
>>> then we could experiment with various -device pxb,bus_nr=xxx
>>> parameters.
>
>> Well, it looks nice but I think that we can do something much simpler
>> :)
>> Let's continue the above idea that QEMU passes to edk2 the *extra*
>> root bus numbers in ascending order for simplicity.
>> For example 8,16,32. From here you can derive that the bus ranges are:
>> 0-7 host bridge 0
>> 8-15 pxb root bridge 1
>> 16-31 pxb root bridge 2
>> 32-0xff pxb root bridge 3
>
> Sounds good, at least if the bus numbers assigned to the pxb's partition
> the full range fairly uniformly.
The management will assign the ranges, so libvirt will take care of that.
>
>> BTW, this is the way, as far as I know, that the real hw divides the
>> ranges.
>> Limitation:
>>    - How do you know you have enough bus numbers for a host bridge to
>>      cover all PCI-2-PCI bridges behind it? Let's say bus 0 has 10
>>      bridges, 0-7 range is not enough.
>
> Exactly.
>
>> Reasoning:
>>    - This is *hw vendor* issue, not firmware, in our case QEMU should
>>      check the ranges are enough before starting edk2.
>
> If you're willing to do the work in QEMU, you certainly won't meet any
> resistance on my part! :)
Of course I will handle it in QEMU.
For the moment I check only that 2 pxbs do not receive the same root bus number.
I'll improve that by checking on PXB init that it does not interfere withe the 
other ranges.

It will go like this.
Sort root/host bridges in ascending order of theri root bus number.
- For each root bridge
   - Go over pci-2-pci-bridges
     - Count all the buses using the subordinate bus pci header field.
- See that the above nr is less then <next PCI root bus nr> - <this PCI root 
bus nr> + 1


>
>> In conclusion, this assumption does not break anything or gives as a
>> big limitation.
>> And Seabios already assumes that... and QEMU is not going to break it.
>
> Great!
:)

>
>>> The MMIO and IO spaces I would just share between all of them; the
>>> allocations from those are delegated back to the host bridge / root
>>> bridge driver, and the current implementation seems sufficient -- it
>>> just assings blocks from the same big MMIO ( / IO) space downwards
>
>> Yes, this is how it should be done, I am happy that it already works
>> that way.
>
> Tonight I've started to work on this anyway. Before attacking the bitmap
> idea, I wanted to -- had to, really -- rewrap OVMF's fresh clone of
> "PcAtChipsetPkg/PciHostBridgeDxe" to 79 columns. I expect to delve into
> the driver more deeply this time than last time, and the consistently
> overlong (130-148 character) lines make the code simply unreadable.
>
> So, I just finished that. (It was surprisingly difficult; the rewrapping
> took 8 patches, the cumulative diffstat is 9 files changed, 2261
> insertions(+), 1445 deletions(-).) I thought I'd check my email before
> embarking on the bitmap thing. Your email arrived at the best possible
> moment! Not just because I don't have to implement the bitmap, the
> search, the multiple ACPI bus ranges per root bridge, but also because
> the internals of the driver rely quite heavily on each root bridge
> having a single contiguous bus range.
Good!

>
> I think I could have rebased that to bitmap checks, but the approach
> you're suggesting makes it all unnecessary. (Plus, I don't have to worry
> about any incompatibility with the PCI bus driver, which I won't touch.)
>
> What element type do you propose for the array in the new fw_cfg file?
> (And what name for the fw_cfg file itself?)
>
> "etc/extra-pci-roots" uses uint64_t, little endian, for the number of
> extra root buses. (In fact if you expose the explicit list in a separate
> file, then the element count is not even necessary separately, because
> file sizes are available in the fw_cfg directory, and I can divide the
> file size with the element size.)
I can prepare another file. Regarding the new  array, each element should be
a number between 0x0 and 0xff, so a uint8_t seems fair.
>
> I have two more questions (raised earlier), about the _HID and the _UIDs
> in the SSDT.
>
> First, I can see in your patch
>
>    hw/acpi: add support for i440fx 'snooping' root busses
>
> that the _UID is populated for each root bus with a string of the form
>
>    PC%02X
>
> where the argument is "bus_num". UEFI can accommodate this, with the
> Expanded ACPI Device Path node, but I'll have to know if the "bus_num"
> argument matches the exact numer that you're going to pass down in the
> new fw_cfg file. Does it?
Yes.

>
> Second, about the _HID and _UID objects being populated by strings, and
> not numbers. In *theory* this should be all fine, but I'm concerned that
> in practice they will cause complications (eg. in booting).
The _UID was inspired, I think, by a real HW machine with multiple root buses.
But I have no problem changing it.

>
> Can you perhaps change (or update, separately) the QEMU patch in
> question, so that _HID is populated with aml_eisaid() instead of
> aml_string(), and that _UID is populated with a flat integer (the
> "bus_num" value)? It should not be very intrusive for QEMU at this
> point, and down the road I think it would ensure better compatibility.
> Something to the tune of:
I see no problem with this, I only need to check that it plays nice with
Linux and Windows guest OS.

>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index db32fd1..8fae3b9 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -944,9 +944,8 @@ build_ssdt(GArray *table_data, GArray *linker,
>>
>>               scope = aml_scope("\\_SB");
>>               dev = aml_device("PC%.02X", bus_num);
>> -            aml_append(dev,
>> -                       aml_name_decl("_UID", aml_string("PC%.02X", 
>> bus_num)));
>> -            aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03")));
>> +            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
>> +            aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>>               aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
>>
>>               if (numa_node != NUMA_NODE_UNASSIGNED) {
>
> As far as I can see in the QEMU source, filling in _HID and _UID like
> this is existing practice.
I can submit the patch , (or you can submit and I'll ack) on top of PXB series.
I am going to be on PTO, so it will wait a week :)

Thanks a lot!
Marcel
>
> Thanks!
> Laszlo
>


------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel

Reply via email to