On 07/06/2017 03:08 PM, Maxime Coquelin wrote:
On 07/06/2017 01:19 PM, santosh wrote:
On Thursday 06 July 2017 04:29 PM, Maxime Coquelin wrote:
On 07/06/2017 11:49 AM, Jerin Jacob wrote:
-----Original Message-----
Date: Thu, 6 Jul 2017 09:58:41 +0200
From: Maxime Coquelin <maxime.coque...@redhat.com>
To: Jerin Jacob <jerin.ja...@caviumnetworks.com>
CC: Santosh Shukla <santosh.shu...@caviumnetworks.com>,
tho...@monjalon.net, bruce.richard...@intel.com, dev@dpdk.org,
hemant.agra...@nxp.com, shreyansh.j...@nxp.com,
gaetan.ri...@6wind.com
Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode before mapping
On 07/05/2017 05:43 PM, Jerin Jacob wrote:
-----Original Message-----
Date: Wed, 5 Jul 2017 11:14:01 +0200
From: Maxime Coquelin <maxime.coque...@redhat.com>
To: Santosh Shukla <santosh.shu...@caviumnetworks.com>,
tho...@monjalon.net, bruce.richard...@intel.com, dev@dpdk.org
CC: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com,
shreyansh.j...@nxp.com, gaetan.ri...@6wind.com
Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode before mapping
On 06/08/2017 01:05 PM, Santosh Shukla wrote:
Check iova mode and accordingly map iova to pa or va.
Signed-off-by: Santosh Shukla<santosh.shu...@caviumnetworks.com>
Signed-off-by: Jerin Jacob<jerin.ja...@caviumnetworks.com>
---
lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 04914406f..348b7a7f4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
 		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
 		dma_map.vaddr = ms[i].addr_64;
 		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].phys_addr;
+		if (rte_eal_iova_mode() == RTE_IOVA_VA)
+			dma_map.iova = dma_map.vaddr;
+		else
+			dma_map.iova = ms[i].phys_addr;
 		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
IIUC, this changes the default behavior for VFIO devices.
I see a possible problem, but I'm not sure the case is valid.
Imagine you have two devices in the same IOMMU group, and the two
devices are used in separate processes. Each process could try to map
a different physical address at the same virtual address, and so the
second map would fail.
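To make the concern concrete, here is a minimal sketch in plain C. It is not DPDK or VFIO code: `enum iova_mode`, `struct seg`, `pick_iova()`, `struct toy_domain` and `toy_domain_map()` are hypothetical stand-ins invented for illustration. It assumes the two processes really do share one IOMMU domain, which is exactly the point in question, and it models the domain as a small table that rejects overlapping IOVA ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the DPDK definitions. */
enum iova_mode { IOVA_MODE_PA, IOVA_MODE_VA };

struct seg {
	uint64_t vaddr;     /* process virtual address of the segment */
	uint64_t phys_addr; /* physical address backing the segment */
	uint64_t len;       /* segment length in bytes */
};

/* Same choice as the patched branch: VA as IOVA in VA mode, else PA. */
uint64_t pick_iova(enum iova_mode mode, const struct seg *s)
{
	assert(s != NULL);
	return mode == IOVA_MODE_VA ? s->vaddr : s->phys_addr;
}

#define TOY_DOMAIN_MAX 8

/* Toy model of one IOMMU domain shared by every device in a group. */
struct toy_domain {
	struct { uint64_t iova; uint64_t len; } maps[TOY_DOMAIN_MAX];
	size_t n;
};

/* Returns 0 on success, -1 if the range overlaps an existing mapping
 * or the table is full. Real driver error reporting is not modeled. */
int toy_domain_map(struct toy_domain *d, uint64_t iova, uint64_t len)
{
	size_t i;

	assert(d != NULL);
	for (i = 0; i < d->n; i++) {
		if (iova < d->maps[i].iova + d->maps[i].len &&
		    d->maps[i].iova < iova + len)
			return -1;
	}
	if (d->n == TOY_DOMAIN_MAX)
		return -1;
	d->maps[d->n].iova = iova;
	d->maps[d->n].len = len;
	d->n++;
	return 0;
}
```

With two segments that share a virtual address but have different physical addresses, mapping both into the same toy domain succeeds in PA mode and fails on the second call in VA mode.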
IMO, it doesn't look like a problem. Here is the data flow:
1) The VFIO DMA map function (vfio_type1_dma_map()) is called only
in the primary process.
http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_vfio.c#n359
2) In the secondary process, DPDK's rte_eal_huge_page_attach() makes
sure that the secondary process has the _same_ virtual addresses as
the primary, or it exits during the attach.
http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_memory.c#n1452
3) The secondary process adds the virtual addresses mapped in step (2)
to the page table in the OS. On an SMMU entry miss (when the device
issues an I/O transaction), the OS will load the mapping and update
the SMMU "context" with page tables from the MMU.
OK, thanks for the detailed info, but what about the case where the
same IOMMU group is used by two primary processes?
Does that case exist with DPDK? We always need to blacklist the same
BDF in the secondary process to make things work with the existing
DPDK setup, which makes sense as well: only the primary process
configures the HW blocks.
I meant the case where two BDFs are in the same IOMMU group (if ACS is
not supported at some point in the hierarchy). And I meant two primary
processes running, for example two containers each running a DPDK
application.
Maybe this is not a valid use-case (it is not secure, as it would break
isolation between the two containers), but it seems that it is something
DPDK allows today, if I'm not mistaken.
I'm not sure how two primary processes could run, because the latter
primary process would try to access /var/run/.rte_config and would
fail at this [1] point.
It's not a valid use case for DPDK (IMO).
[1]
http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal.c#n204
Yes, this is possible. I had never used it before, but Thomas told me
it is supported by setting the --file-prefix option. I tried it, and I
confirm it works:
session 1> ./install/bin/testpmd -l 0,2 --socket-mem=1024 -w
0000:05:00.0 --proc-type=primary --file-prefix=app1 -- --disable-hw-vlan
-i --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io
session 2> ./install/bin/testpmd -l 0,3 --socket-mem=1024 -w
0000:05:00.1 --proc-type=primary --file-prefix=app2 -- --disable-hw-vlan
-i --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io
In the above example, two ports of the same card are used by two
processes. Note that in this case, ACS is supported and both ports have
their own IOMMU group.
# ls -al /var/run/.app*
-rw-r-----. 1 root root 208420 Jul 6 09:08 /var/run/.app1_config
-rw-r--r--. 1 root root 49728 Jul 6 09:08 /var/run/.app1_hugepage_info
srwxr-xr-x. 1 root root 0 Jul 6 09:08 /var/run/.app1_mp_socket
-rw-r-----. 1 root root 208420 Jul 6 09:08 /var/run/.app2_config
-rw-r--r--. 1 root root 45584 Jul 6 09:08 /var/run/.app2_hugepage_info
srwxr-xr-x. 1 root root 0 Jul 6 09:08 /var/run/.app2_mp_socket
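
The listing suggests why the two primaries do not collide: each runtime file name carries the --file-prefix value. The sketch below only reproduces that naming pattern as seen in the listing above. `runtime_path()` is a hypothetical helper written for illustration, not an EAL function, and the "/var/run/.<prefix>_<suffix>" layout is inferred from this output rather than taken from the DPDK sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "/var/run/.<prefix>_<suffix>" into buf.
 * Returns 0 on success, -1 on an empty prefix or a too-small buffer. */
int runtime_path(char *buf, size_t len, const char *prefix,
		 const char *suffix)
{
	int n;

	assert(buf != NULL && prefix != NULL && suffix != NULL);
	if (strlen(prefix) == 0)
		return -1;
	n = snprintf(buf, len, "/var/run/.%s_%s", prefix, suffix);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Calling it with the prefixes "app1" and "app2" and the suffix "config" gives two distinct paths, so the second primary no longer contends for the first one's config file.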