On Wed, 30 May 2018 12:53:53 +0100 Jean-Philippe Brucker <jean-philippe.bruc...@arm.com> wrote:
> On 30/05/18 04:45, Tian, Kevin wrote: > >>>>>> On SMMUv3 the minimum alignment for base_ptr is 64 bytes, so > >>>>>> a > >>>> guest > >>>>>> under a vSMMU might pass a pointer that's not aligned on 4k. > >>>>>> > >>>>> PASID table pointer for VT-d is 4K aligned. > >>>>>> Maybe this information could be part of the data passed to > >> userspace > >>>>>> about IOMMU table formats and features? They're not part of > >>>>>> this series, but I think we wanted to communicate > >>>>>> IOMMU-specific > >> features > >>>>>> via sysfs. > >>>>>> > >>>>> Agreed, I believe Yi Liu is working on a sysfs interface such > >>>>> that QEMU can match IOMMU model and features. > >>>> > >>>> Digging this up again since v5 still has this issue. The IOMMU > >>>> API is a kernel internal abstraction of the IOMMU. sysfs is a > >>>> userspace interface. Are we suggesting that the /only/ way to > >>>> make use of the internal IOMMU API here is to have a user > >>>> provided opaque pasid table that we can't even do minimal > >>>> compatibility sanity testing on and we simply hope that hardware > >>>> covers all the fault conditions without taking the host down > >>>> with it? I guess we have to assume the latter since the user > >>>> has full control of the table, but I have a hard time getting > >>>> past lack of internal ability to use the interface and no > >>>> ability to provide even the slimmest sanity testing. Thanks, > >>> > >>> checking size, alignment, ... is OK, which I think is already > >>> considered by vendor IOMMU driver. However sanity testing table > >>> format might be difficult. The initial table provided by guest is > >>> likely just all ZEROs. whatever format violation may be caught > >>> only when a PASID entry is updated... > >> > >> There's sanity testing the actual contents of the table, which I > >> agree would be difficult and would likely require some sort of > >> shadowing at additional overhead, but what about even basic > >> consistency checking? For example, is it possible that due to > >> hardware variations a user might generate a table which works on > >> some systems but not others? What > >> if two table formats are sufficiently similar that the IOMMU driver > >> puts an incompatible table in place but it continuously generates > >> faults, how do we debug that? As an intermediary in this whole > >> process I'd really rather be able to identify that the user claims > >> to be providing a TypeA table but the IOMMU only supports TypeB, > >> so clearly this won't work. I don't see that we have that > >> capability. Thanks, > > > > I remember we ever discussed to define some vendor/model ID, > > which can be retrieved by user space and then passed back when > > doing table binding. Then above simple model matching check can > > be done accordingly. It is actually a basic requirement when using > > virtio-iommu, same driver expecting to work on all vendor IOMMUs. > > > > However I don't remember whether/where that logic is implemented > > in this series (especially when there are two tracks moving in > > parallel). I'll leave to Jacob/Jean to further comment. > > For Arm we do need some form of sanity checking. As each architecture > version brings a new set of features that may be supported and enabled > individually, we need to communicate fine-grained features to users. > They describes the general capability of the physical IOMMU, and also > which fields are available in the PASID table (entries are 512-bits > and leave some space for future extensions). > > In the past I briefly tried using a ioctl-based interface through VFIO > only, but it seemed more complicated to extend than sysfs for this > kind of probing. > > Note that the following is from my own prototype. I'm not sure how > much Yi Liu's implementation differs but I think this was roughly > what we agreed on last time. In sysfs an IOMMU device is described > with: > > * A model number, for example intel-vtd=1, arm-smmu-v3=2. > * Properties and features, describing in detail what the pIOMMU device > and driver support. > > /sys/class/iommu/<iommu-dev>/<model>/<property> > > For example an SMMUv3: > > The model number is described as a property > /sys/class/iommu/smmu.0x00000000e0600000/arm-smmu-v3/model = 2 > > A few feature bits and values: > .../arm-smmu-v3/asid_bits // max address space ID bits, %d > .../arm-smmu-v3/ssid_bits // max substream ID (PASID) bits, %d > .../arm-smmu-v3/input_bits // max input address size, %d > .../arm-smmu-v3/output_bits // max output address size, %d > .../arm-smmu-v3/btm // broadcast TLB maintenance, > enabled/disabled .../arm-smmu-v3/httu // Hardware > table update, > access+dirty/access/none .../arm-smmu-v3/stall // > transaction stalling, enabled/disabled/force > > (Note that the base pointer alignment previously discussed could be > implied by the model number, or added explicitly here.) > > Which page table formats are supported: > .../arm-smmu-v3/pgtable_format/lpae-64 > .../arm-smmu-v3/pgtable_format/v7s > I'm not sure yet what values these will have, they might simply > contain arbitrary format numbers because fields available in the page > tables can be deduced from the above features bits. (Out of laziness, > in my prototype I just describe a preferred format in a > pgtable_format file) > > As you can imagine I'd rather not pass the fine details back to the > kernel in bind_pasid_table. The list of features is growing, and > describing them is a pain. It could be done for debugging purpose, but > all we'd be achieving is telling the kernel that userspace has read > the values, not that the guest intends to use them. The guest selects > features by writing PASID table entries, which aren't read by the > host. > > If the guest writes invalid values in the PASID table then yes, we > have to rely on the hardware to contain the fault and not bring the > host down with it. If the IOMMU cannot do that, then the driver > really shouldn't implement bind_pasid_table... Otherwise, a fault > while reading the PASID table can be injected into the guest as an > unrecoverable fault (IOMMU_FAULT_REASON_PASID_INVALID or > IOMMU_FAULT_REASON_PGD_FETCH in patch 10) or printed by the host when > debugging. > > However I think the model number should be added to > pasid_table_config. For one thing it gives us a simple sanity-check, > but it also tells which other fields are valid in pasid_table_config. > Arm-smmu-v3 needs at least two additional 8-bit fields describing the > PASID table format (number of levels and PASID0 behaviour), which are > written to device context tables when installing the PASID table > pointer. > We had model number field in v2 of this patchset. My thought was that since the config info is meant to be generic, we shouldn't include model info. But I also think a simple sanity check can be useful, would that be sufficient to address Alex's concern? Of course we still need sysfs for more specific IOMMU features. Would this work? enum pasid_table_model { PASID_TABLE_FORMAT_HOST, PASID_TABLE_FORMAT_ARM_1LVL, PASID_TABLE_FORMAT_ARM_2LVL, PASID_TABLE_FORMAT_AMD, PASID_TABLE_FORMAT_INTEL, }; /** * PASID table data used to bind guest PASID table to the host IOMMU. This will * enable guest managed first level page tables. * @version: for future extensions and identification of the data format * @bytes: size of this structure * @model: PASID table format for different IOMMU models * @base_ptr: PASID table pointer * @pasid_bits: number of bits supported in the guest PASID table, must be less * or equal than the host supported PASID size. */ struct pasid_table_config { __u32 version; #define PASID_TABLE_CFG_VERSION_1 1 __u32 bytes; enum pasid_table_model model; __u64 base_ptr; __u8 pasid_bits; }; > Compatibility: new optional features are easy to add to a given model, > just add a new sysfs file. If in the future, the host describes a new > feature that is mandatory, or implements a different PASID table > format, how does it ensure that user understands it? Perhaps use a > new model number for this, e.g. "arm-smmu-v3-a=3", with similar > features. I think it would be the same if the host stops supporting a > feature for a given model, because they are ABI. But we can also > define default values from the start, for example "if ssid_bits file > isn't present, default value is 0 - PASID not supported" > > Thanks, > Jean [Jacob Pan]