From: Alex Markuze [mailto:a...@weka.io]
Sent: Wednesday, November 05, 2014 11:19 PM
To: Thomas Monjalon
Cc: Zhou, Danny; dev at dpdk.org; Fastabend, John R
Subject: Re: [dpdk-dev] bifurcated driver



On Wed, Nov 5, 2014 at 5:14 PM, Alex Markuze <alex at weka.io> wrote:
On Wed, Nov 5, 2014 at 3:00 PM, Thomas Monjalon <thomas.monjalon at 6wind.com> wrote:
Hi Danny,

2014-10-31 17:36, O'driscoll, Tim:
> Bifurcated Driver (Danny.Zhou at intel.com)

Thanks for the presentation of bifurcated driver during the community call.
I asked if you looked at ibverbs and you wanted a link to check.
The kernel module is here:
        
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core
The userspace library:
        http://git.kernel.org/cgit/libs/infiniband/libibverbs.git

Extract from Kconfig:
"
config INFINIBAND_USER_ACCESS
        tristate "InfiniBand userspace access (verbs and CM)"
        select ANON_INODES
        ---help---
          Userspace InfiniBand access support.  This enables the
          kernel side of userspace verbs and the userspace
          communication manager (CM).  This allows userspace processes
          to set up connections and directly access InfiniBand
          hardware for fast-path operations.  You will also need
          libibverbs, libibcm and a hardware driver library from
          <http://www.openfabrics.org/git/>.
"

It seems to be close to the bifurcated driver's needs.
Not sure if it can solve the security issues if there is no dedicated MMU
in the NIC.

Mellanox NICs and other RDMA HW (InfiniBand/RoCE/iWARP) have MTTs - memory
translation tables - a dedicated MMU. These are filled via ibv_reg_mr()
calls, which create a process-VA to physical/IOVA memory mapping in the
NIC. Thus each process can access only its own memory via the NIC. This is
the way RNICs* resolve the security issue. I'm not sure how standard Intel
NICs could support this scheme.

DZ: Intel NICs do not provide such an embedded memory translation unit, but
the Intel chipset supports an IOMMU with a generic memory protection
mechanism that provides physical/IOVA memory mapping for DMA transactions
on any PCIe device, rather than NICs only.
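As a rough sketch of how a user-space process obtains such an IOMMU-backed
DMA mapping through VFIO; the container setup (opening /dev/vfio/vfio,
attaching the device's group, selecting the IOMMU type) is omitted, and
map_for_dma is a hypothetical helper name:

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* 'container' is an already-configured VFIO container fd
     * (VFIO_SET_IOMMU done); buf/len is ordinary process memory. */
    static int map_for_dma(int container, void *buf, size_t len,
                           uint64_t iova)
    {
        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (uintptr_t)buf,
            .iova  = iova,   /* address the device will use */
            .size  = len,
        };
        /* The IOMMU now translates device accesses at 'iova' to this
         * process's pages; accesses outside the mapping fault. */
        return ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
    }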

There is already a 6WIND PMD for Mellanox NICs. I'm assuming this PMD is
verbs-based and behaves similarly to the proposed bifurcated driver.
http://www.mellanox.com/page/press_release_item?id=979

DZ: is it open sourced for the community to use? I guess the answer is no.
Also, that PMD must have ported the majority of the Mellanox kernel driver
code to DPDK, as lots of NIC-control-related code is needed, while the
bifurcated driver approach only needs to support the minimal
Mellanox-specific packet rx/tx routines to achieve DPDK's claimed high
performance, using all the DPDK performance optimization techniques such as
huge pages, fixed-size packet buffers, zero-copy, and the PMD model. The
kernel driver still retains NIC control, without being ported to DPDK.

One thing that I don't understand (and I'll be happy if someone could shed
some light on it) is how the NIC is supposed to distinguish between packets
that need to go to the kernel driver rings and packets going to the
user-space rings.

DZ: it depends on the user. The user should use standard ethtool (see the
examples below) to enable flow director and distribute packets to a kernel-
or user-space-owned rx queue, by specifying a 5-tuple as well as the
destination rxq index. The flow director embedded in the NIC does the flow
classification and distribution, rather than a software approach like DPDK
KNI. If you argue that SR-IOV has a similar rx/tx queue pair partitioning
capability, I would say the bifurcated driver approach provides much more
flexibility than SR-IOV (e.g. a variable number of qpairs allocated to user
space, and L3 5-tuple based flow classification and distribution rather
than SR-IOV's L2 classification based on MAC or VLAN).

ethtool -K ethX ntuple on   # enable flow director
ethtool -N ethX flow-type udp4 src-ip 0.0.0.0 action 0   # distribute udp
packets with source IP 0.0.0.0 to rx queue No. 0
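For completeness, the installed rules can then be listed with:

ethtool -u ethX   # show the configured ntuple/flow director rules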

I feel we should sum up pros and cons of
        - igb_uio
        - uio_pci_generic
        - VFIO
        - ibverbs
        - bifurcated driver
I suggest considering these criteria:
        - upstream status
        - usable with kernel netdev
        - usable in a vm
        - usable for ethernet
        - hardware requirements
        - security protection
        - performance
Regarding ibverbs - I'm not sure how it's relevant to future DPDK
development, but this is the rundown as I know it.
 This is a veteran package called OFED, or its counterpart Mellanox OFED.
   ---- The kernel drivers are upstream.
   ---- The PCI device stays in the kernel's care throughout its life span.
   ---- SR-IOV support exists; paravirt support exists only (AFAIK) as an
Office of the CTO (VMware) project called vRDMA.
   ---- Eth/RoCE (RDMA over Converged Ethernet)/IB.
   ---- HW: RDMA-capable HW ONLY.
   ---- Security is designed into the RDMA HW.
   ---- Stellar performance - favored by HPC.

*RNIC - RDMA (Remote DMA - iWARP/InfiniBand/RoCE) capable NIC.

--
Thomas
