On Friday 14 July 2017 04:21 PM, Hemant Agrawal wrote:
> On 7/14/2017 3:59 PM, santosh wrote:
>> On Friday 14 July 2017 03:52 PM, santosh wrote:
>>
>>> On Friday 14 July 2017 03:09 PM, Hemant Agrawal wrote:
>>>
>>>> On 7/14/2017 2:00 PM, santosh wrote:
>>>>> On Friday 14 July 2017 01:37 PM, Hemant Agrawal wrote:
>>>>>
>>>>>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>>>>>> API(rte_bus_get_iommu_class) helps to automatically detect and select
>>>>>>> appropriate iova mapping scheme for iommu capable device on that bus.
>>>>>>>
>>>>>>> Algorithm for iova scheme selection for bus:
>>>>>>> 0. Iterate through bus_list.
>>>>>>> 1. Collect each bus iova mode value and update into 'mode' var.
>>>>>>> 2. Here value '1' is _pa and value '2' is _va mode.
>>>>>>> So mode selection scheme is like:
>>>>>>> if mode == 2 then iova mode is _va.
>>>>>>> if mode == 1 then iova mode is _pa
>>>>>>> if mode == 3 then iova mode is _pa.
>>>>>>>
>>>>>>> So mode !=2 will be default iova mode.
>>>>>>>
>>>>>>> Signed-off-by: Santosh Shukla <[email protected]>
>>>>>>> Signed-off-by: Jerin Jacob <[email protected]>
>>>>>>> ---
>>>>>>> lib/librte_eal/bsdapp/eal/rte_eal_version.map | 1 +
>>>>>>> lib/librte_eal/common/eal_common_bus.c | 23
>>>>>>> +++++++++++++++++++++++
>>>>>>> lib/librte_eal/common/eal_common_pci.c | 1 +
>>>>>>> lib/librte_eal/common/include/rte_bus.h | 22
>>>>>>> ++++++++++++++++++++++
>>>>>>> lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>>>>>>> 5 files changed, 48 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> index 33c2c32c0..a2dd65a33 100644
>>>>>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> @@ -202,6 +202,7 @@ DPDK_17.08 {
>>>>>>> rte_bus_find_by_name;
>>>>>>> rte_pci_match;
>>>>>>> rte_pci_get_iommu_class;
>>>>>>> + rte_bus_get_iommu_class;
>>>>>>>
>>>>>>> } DPDK_17.05;
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/common/eal_common_bus.c
>>>>>>> b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> index 08bec2d93..5d5753ac9 100644
>>>>>>> --- a/lib/librte_eal/common/eal_common_bus.c
>>>>>>> +++ b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
>>>>>>> c[0] = '\0';
>>>>>>> return rte_bus_find(NULL, bus_can_parse, name);
>>>>>>> }
>>>>>>> +
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * Get iommu class of devices on the bus.
>>>>>>> + */
>>>>>>> +enum rte_iova_mode
>>>>>>> +rte_bus_get_iommu_class(void)
>>>>>>> +{
>>>>>>> + int mode = 0;
>>>>>>> + struct rte_bus *bus;
>>>>>>> +
>>>>>>> + TAILQ_FOREACH(bus, &rte_bus_list, next) {
>>>>>>> +
>>>>>>> + if (bus->get_iommu_class)
>>>>>>> + mode |= bus->get_iommu_class();
>>>>>>> + }
>>>>>>> +
>>>>>> If you change the default return to '0' for buses, this code will work.
>>>>>> e.g. PCI will return '0' when no device is probed. FSL MC will return
>>>>>> VA, so the default mode will be 'VA'.
>>>>>>
>>>>> I'm confused why it won't work for fslmc case?
>>>>>
>>>>> Let me walk through the code:
>>>>>
>>>>> If no PCI device or (in future) no platform device is probed, then the bus
>>>>> opts for the default mapping scheme, which is iova_pa.
>>>>>
>>>>> Lets take PCI_bus example:
>>>>> bus->get_iommu_class()
>>>>> ---> bus->_pci_get_iommu_class()
>>>>> * Now consider that no interface is bound to any PCI device; then
>>>>> it will return RTE_IOVA_PA mode to the rte_bus layer (aka
>>>>> bus->get_iommu_class).
>>>>> So the iova mapping result from iommu_class scan is RTE_IOVA_PA
>>>>> (default).
>>>>> It works for PCI_bus case, tested for both iova_va and iova_pa
>>>>> case, no-pci device case.
>>>>>
>>>>> Now in fslmc bus case:
>>>>> bus->get_iommu_class()
>>>>> ---> bus->_fslmc_get_iommu_class()
>>>>>
>>>>> * IIUC your comment - You want fslmc bus to return RTE_IOVA_VA if
>>>>> no device
>>>>> detected, Right?
>>>> why?
>>>>
>>> Because I didn't understand your previous reply:
>>> `e.g. PCI will return '0' - when no device is probed. FSL MC will return
>>> VA. the default mode will be 'VA'`
>>>
>>> So, I'm asking you that in fslmc bus case - if no device found then are you
>>> opting _va scheme or not?
>>> Seems like _not_ per your below comment.
>>>
>>>
>>>> If bus is just present but no device is in use for dpdk, then bus should
>>>> return 0 and it *should not* participate in the IOMMU class decision.
>>>>
>>> I think I understand your point. For example, if no device is found on the
>>> first (PCI) bus
>>> but a device is found on the 2nd (platform) bus, then you don't want to fall
>>> back to the default (_pa) mode;
>>> instead you want to use the 2nd bus's mode for mapping, which is _va. Right?
>>>
>>> If so, then in my first version we did introduce a case called _DC.
>>> _DC:0 --> stands for no-device found case.
>>>
>>>> Right now there are only two buses. There can be more buses. (e.g. PCI,
>>>> platform, fslmc in case of dpaa2 as well).
>>>>
>>>> If the bus is not being used at all, why should it influence the decision
>>>> of other buses?
>>>>
>>> If you're referring to the above case then I agree. We'll re-introduce the
>>> _DC state from v1 in the next revision.
>>> That will look like
>>> rte_pci_get_iommu_class() {
>>> int mode = RTE_IOVA_DC; /* '0' */
>>>
>>> return _DC; /* if no device found */
>>> }
>>>
>>> Right?
>
> Yes! Thanks!
>
> As I explained in the other thread, PCI devices can be present while none of
> them is used by DPDK:
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL: probe driver: 8086:10d3 net_e1000_em
> EAL: Not managed by a supported kernel driver, skipped
>
>
Ok, I will queue the _DC changes for the next version. Thanks for confirming.
>>>
>>>> if no bus has any device, the System default is anyway PA.
>>>>
>>> Right. If no bus is present then it is also the responsibility of
>>> `rte_bus_get_iommu_class`
>>> to use the default mapping scheme, which is _pa, and it does so.
>>>
>>>>> if so then your fslmc bus handle should do something like below
>>>>> -- If no device on fslmc bus : return RTE_IOVA_VA.
>>>>> -- If device detected on fslmc bus and bound to iommu driver
>>>>> : return RTE_IOVA_VA
>>>>> -- If device detected fslmc but not bound to iommu drv :
>>>>> return RTE_IOVA_PA..
>>>>>
>>>>> make sense? If not then can you describe fslmc mapping scheme?
>>>>>
>>>>>> if fslmc is not present. The default mode will be PA.
>>>>>>
>>>>>>> + if (mode != RTE_IOVA_VA) {
>>>>>>> + /* Use default IOVA mode */
>>>>>>> + mode = RTE_IOVA_PA;
>>>>>>> + }
>>>> The system default is anyway PA.
>>>>
>>> No, that check is needed for a case where the 1st bus returns _PA and the
>>> 2nd bus returns _VA;
>>> then mode = 3 (mix mode), which we don't support, so (as I mentioned before)
>>> it is the responsibility of
>>> rte_bus_get_iommu_class() to return the default mode (_pa). That's why.
>>>
>>>
>> Does your platform support `mix mode`? I asked the same question in thread
>> [04/11] too.
>> Let's say that dpaa2 supports mix mode; is it then OK if the bus layer opts
>> for the default mapping
>> in the mix mode case? Do you see any issue if the bus layer uses the default
>> scheme for mix mode?
>>
>>
>
> yes! We can support mix mode. However with your suggested changes in mempool
> etc APIs, now the DPDK will not work for us in mix mode (when both PCI and
> DPAA2 devices are available) with VA support only for DPAA2 :)
>
> In case of mix mode, your logic is already there to default to PA. That is
> fine.
>
> But when PCI devices are not hooked to DPDK, we should be able to use VA for
> dpaa2.
>
Ok.
I assume that You'll implement bus handle for fslmc something like
`rte_fslmc_get_iommu_class()` and make sure that you return:
- _VA in no-device found case.
- _VA if iommu capable interface detected for device.
- _PA if no-iommu.
And the only change you expect at the bus layer (rte_bus_get_iommu_class()) is
to honor the `no device found` situation for the multiple-bus case. Right?
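
To make sure we mean the same thing, here is a minimal standalone sketch of the
bus-layer selection with a "don't care" value. It is not the actual patch: the
names (iova_mode, IOVA_DC, struct bus, select_iova_mode and the three example
callbacks) are hypothetical stand-ins, and a plain array replaces rte_bus_list.
It only assumes the values discussed above: 0 for no device found, 1 for _pa,
2 for _va.

```c
#include <stddef.h>

/* Hypothetical values mirroring the thread: 0 = don't care, 1 = PA, 2 = VA. */
enum iova_mode { IOVA_DC = 0, IOVA_PA = 1, IOVA_VA = 2 };

/* Stand-in for struct rte_bus; only the callback matters for this sketch. */
struct bus {
	const char *name;
	enum iova_mode (*get_iommu_class)(void);
};

/*
 * OR together the mode reported by every bus. A bus that found no usable
 * device reports IOVA_DC (0) and therefore does not change the result.
 */
static enum iova_mode
select_iova_mode(const struct bus *buses, size_t n)
{
	int mode = IOVA_DC;
	size_t i;

	for (i = 0; i < n; i++) {
		if (buses[i].get_iommu_class)
			mode |= buses[i].get_iommu_class();
	}

	/* Only a pure VA result selects VA; 0, PA and mixed (3) fall back to PA. */
	return (mode == IOVA_VA) ? IOVA_VA : IOVA_PA;
}

/* Example callbacks (assumptions, for illustration only). */
static enum iova_mode pci_no_dev(void) { return IOVA_DC; }
static enum iova_mode pci_pa(void) { return IOVA_PA; }
static enum iova_mode fslmc_va(void) { return IOVA_VA; }
```

With pci_no_dev and fslmc_va registered, the result is VA; with pci_pa and
fslmc_va it is the mixed case and falls back to PA; with no bus at all it is
PA as well.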