On 05/23/2018 02:11 PM, Zihan Yang wrote:
Hi all,
Thanks for all your comments and suggestions, I wasn't expecting so
many professional
reviewers. Some of the things you mentioned are beyond my knowledge
right now.
Please correct me if I'm wrong below.
The original purpose was just to support multiple segments in Intel
Q35 architecture
for PCIe topology, which makes bus number a less scarce resource. The
patches are
very primitive and many things are left for firmware to finish (the
initial plan was
to implement it in SeaBIOS), the AML part in QEMU is not finished
either. I'm not
familiar with OVMF or edk2, so there is no plan to touch it yet, but
it seems not
necessary since it already supports multi-segment in the end.
Also, in this patch the assumption is one domain per host bridge,
described by '_SEG()'
in AML, which means an ECAM range per host bridge, but that should be
configurable
if the user prefers to stay in the same domain, I guess?
Yes.
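To make the "one ECAM range per host bridge" point concrete, here is a minimal sketch of how a config-space address is derived inside one segment's ECAM window. The bit layout (bus << 20, device << 15, function << 12) is the standard ECAM encoding; the segment base addresses and the helper names are hypothetical and only illustrate the idea, they are not taken from the patches.

```python
# Standard ECAM layout: 4 KiB per function, 8 functions per device,
# 32 devices per bus, so 1 MiB per bus and 256 MiB per full segment.
ECAM_BUS_SHIFT = 20
ECAM_DEV_SHIFT = 15
ECAM_FN_SHIFT = 12
BUSES_PER_SEGMENT = 256


def ecam_size(num_buses=BUSES_PER_SEGMENT):
    """Bytes of ECAM space needed to cover num_buses buses."""
    return num_buses << ECAM_BUS_SHIFT


def ecam_address(segment_base, bus, dev, fn, offset=0):
    """Address of a config register inside one segment's ECAM window."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8
            and 0 <= offset < 4096):
        raise ValueError("bus/dev/fn/offset out of range")
    return segment_base + ((bus << ECAM_BUS_SHIFT) |
                           (dev << ECAM_DEV_SHIFT) |
                           (fn << ECAM_FN_SHIFT) |
                           offset)


# Hypothetical per-segment ECAM bases (assumed values for illustration only).
SEGMENT_BASES = {0: 0xB0000000, 1: 0x1000000000}

if __name__ == "__main__":
    print(hex(ecam_size()))
    print(hex(ecam_address(SEGMENT_BASES[1], bus=1, dev=2, fn=3, offset=0x10)))
```

Each extra segment therefore costs another 256 MiB window if all 256 buses are exposed, which is why the placement of these ranges matters.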
I'd like to list a few things you've discussed to confirm I understood
them correctly:
* ARI enlarges the number of functions, but it does not solve the
hot-plug issue.
The multifunction PCIe endpoints cannot span PCIe domains
Right
* IOMMUs cannot span domains either, so bringing new domains
introduces the need
to add a VT-d DRHD or vIOMMU per PCIe domain
Not really, you may have PCI domains not associated with a vIOMMU. As a
first step,
you should not deal with it. The IOMMU can't span multiple domains,
yes.
* 64-bit space is crowded and there are no standards within QEMU for
placing per
domain 64-bit MMIO and MMCFG ranges
Yes, but we do have some layout for the "over 4G" area and we can continue
building on it.
* NUMA modeling seems to be a stronger motivation than the limitation
of 256 bus
numbers, in that each NUMA node holds its own PCI(e) sub-hierarchy
No, the 256 devices limitation is the biggest issue we are trying to solve.
* We cannot put ECAM arbitrarily high because guest's PA width is
limited by host's
when EPT is enabled.
Indeed, we should be careful about the size and placement of
the MMCFGs so that they do not exceed the CPU's addressable bits.
* Compatibility issues in platforms that do not have MCFG table at all
(perhaps we limit
it to only q35 at present in which MCFG is present).
For sure.
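The constraint discussed above, that per-domain MMCFG windows must stay below the guest's physical address limit, can be sketched as a simple placement check. This is only an illustration under stated assumptions: the starting address, the 36-bit width, the size-aligned placement, and the function names are all hypothetical and do not describe QEMU's actual "over 4G" layout.

```python
ECAM_SIZE = 256 << 20  # 256 MiB: one full segment of 256 buses


def fits_in_pa_width(base, size, phys_bits):
    """True if [base, base + size) lies below 2**phys_bits."""
    return base >= 0 and size > 0 and base + size <= (1 << phys_bits)


def place_mmcfg_windows(start, num_segments, phys_bits, size=ECAM_SIZE):
    """Lay out one ECAM window per segment, back to back from start.

    Assumption: each window is aligned to its own size (size must be a
    power of two for the mask below to work).
    """
    base = (start + size - 1) & ~(size - 1)  # round start up to alignment
    windows = []
    for seg in range(num_segments):
        if not fits_in_pa_width(base, size, phys_bits):
            raise ValueError(
                "segment %d at %#x exceeds %d-bit physical address space"
                % (seg, base, phys_bits))
        windows.append((seg, base))
        base += size
    return windows


if __name__ == "__main__":
    # Hypothetical example: two segments placed just above 4 GiB on a
    # guest limited to 36 physical address bits.
    for seg, base in place_mmcfg_windows(0x100000000, 2, phys_bits=36):
        print(seg, hex(base))
```

Failing early with an error, rather than silently truncating the layout, keeps a misconfigured guest from seeing an ECAM range it cannot address.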
Based on your discussions, I guess this proposal is still worth doing
overall, but it seems
many restrictions should be imposed to be compatible with some
complicated situations.
Correct.
Please correct me if I got anything wrong or missed anything.
You are on the right path, this discussion is meant to help you
understand wider concerns
and make you aware of different constraints we didn't think about.
Good luck with the next version!
Thanks,
Marcel
Thanks
Zihan