[dpdk-dev] Jumbo frame support in pktgen
On 10/6/15, 11:07 PM, "dev on behalf of Hyunseok" wrote:
>Hi,
>
>Can we generate 9k jumbo frames using the latest pktgen?

Internally I do not create mbufs larger than 2K at this time in Pktgen. I guess it could be changed, but I do not have the time. If you want to submit a patch that would be great.

>Thanks!
>-hs

Regards,
++Keith Wiles, Intel Corporation
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On 10/06/15 18:00, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:49:21PM +0300, Vlad Zolotarov wrote:
>>> and read/write the config space.
>>> This means that a single userspace bug is enough to corrupt kernel
>>> memory.
>> Could u, pls., provide an example of this simple bug? Because it's
>> absolutely not obvious...
> Stick a value that happens to match a kernel address in the Msg Addr field
> in an unmasked MSI-X entry.

This patch neither configures MSI-X entries in user space nor provides additional means to do so; therefore this "sticking" would be a matter of some extra code that is absolutely unrelated to this patch. So, this example seems irrelevant to this particular discussion.

thanks,
vlad
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On 10/06/2015 05:07 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 03:15:57PM +0300, Avi Kivity wrote:
>> btw, (2) doesn't really add any insecurity. The user could already poke at
>> the msix tables (as well as perform DMA); they just couldn't get a useful
>> interrupt out of them.
> Poking at msix tables won't cause memory corruption unless msix and bus
> mastering is enabled.

It's a given that bus mastering is enabled. It's true that msix is unlikely to be enabled, unless msix support is added.

> It's true root can enable msix and bus mastering
> through sysfs - but that's easy to block or detect. Even if you don't
> buy a security story, it seems less likely to trigger as a result
> of a userspace bug.

If you're doing DMA, that's the least of your worries. Still, zero-mapping the msix space seems reasonable, and can protect userspace from silly stuff. It can't be considered to have anything to do with security though, as long as users can simply DMA to every bit of RAM in the system they want to.
[dpdk-dev] Jumbo frame support in pktgen
Hi,

Can we generate 9k jumbo frames using the latest pktgen?

Thanks!
-hs
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Tue, Oct 06, 2015 at 05:49:21PM +0300, Vlad Zolotarov wrote:
> >and read/write the config space.
> >This means that a single userspace bug is enough to corrupt kernel
> >memory.
>
> Could u, pls., provide an example of this simple bug? Because it's
> absolutely not obvious...

Stick a value that happens to match a kernel address in the Msg Addr field in an unmasked MSI-X entry.

-- 
MST
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On 10/06/15 16:58, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:23:11AM +0300, Vlad Zolotarov wrote:
>> Michael, how is this or any other related patch related to the problem u r
>> describing?
>> The above ability is there for years and if memory serves me
>> well it was u who wrote uio_pci_generic with this "security flaw". ;)
> I answered all this already.
>
> This patch enables bus mastering, enables MSI or MSI-X

This may be done from user space right now, without this patch...

> , and requires
> userspace to map the MSI-X table

Hmmm... I must have missed this requirement. Could u, pls., clarify? From what I see, the MSI/MSI-X table is configured completely in the kernel here...

> and read/write the config space.
> This means that a single userspace bug is enough to corrupt kernel
> memory.

Could u, pls., provide an example of this simple bug? Because it's absolutely not obvious...

> uio_pci_generic does not enable bus mastering or MSI, and
> it might be a good idea to have uio_pci_generic block
> access to MSI/MSI-X config.

Since device BARs may be mapped bypassing UIO/uio_pci_generic, this won't solve any issue.
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Tue, Oct 06, 2015 at 03:15:57PM +0300, Avi Kivity wrote:
> btw, (2) doesn't really add any insecurity. The user could already poke at
> the msix tables (as well as perform DMA); they just couldn't get a useful
> interrupt out of them.

Poking at msix tables won't cause memory corruption unless msix and bus mastering is enabled. It's true root can enable msix and bus mastering through sysfs - but that's easy to block or detect. Even if you don't buy a security story, it seems less likely to trigger as a result of a userspace bug.

-- 
MST
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Tue, Oct 06, 2015 at 11:23:11AM +0300, Vlad Zolotarov wrote:
> Michael, how is this or any other related patch related to the problem u r
> describing?
> The above ability is there for years and if memory serves me
> well it was u who wrote uio_pci_generic with this "security flaw". ;)

I answered all this already.

This patch enables bus mastering, enables MSI or MSI-X, and requires userspace to map the MSI-X table and read/write the config space. This means that a single userspace bug is enough to corrupt kernel memory.

uio_pci_generic does not enable bus mastering or MSI, and it might be a good idea to have uio_pci_generic block access to MSI/MSI-X config.

-- 
MST
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Tue, Oct 06, 2015 at 08:33:56AM +0100, Stephen Hemminger wrote:
> Other than implementation objections, so far the two main arguments
> against this reduce to:
> 1. If you allow UIO ioctl then it opens an API hook for all the crap out
>    of tree UIO drivers to do what they want.
> 2. If you allow UIO MSI-X then you are expanding the usage of userspace
>    device access in an insecure manner.

That's not all. Without MSI, one can detect insecure usage by detecting userspace enabling bus mastering. This can be detected simply using lspci. One can also imagine a configuration where this ability is disabled, is logged, or taints the kernel. This seems like something that might be worth having for some locked-down systems. OTOH enabling MSI requires enabling bus mastering, so suddenly we have no idea whether the device can be/is used in a safe way.

> Another alternative which I explored was making a version of VFIO that
> works without IOMMU. It solves #1 but actually increases the likely negative
> response to argument #2.

No - because VFIO has limited protection against device misuse by userspace, by limiting access to sub-ranges of device BARs and config space. For a device that doesn't do DMA, that will be enough to make it secure to use. That's a pretty weak excuse to support userspace drivers for PCI devices without an IOMMU, but it's the best I have heard so far. Is that worth the security trade-off? I'm still not sure.

> This would keep the same API, and avoid having to
> modify UIO. But we would still have the same (if not more) resistance
> from IOMMU developers who believe all systems have to be secure against
> root.

"Secure against root" is a confusing way to put it IMHO. We are talking about memory protection, so that's not IOMMU developers IIUC. I believe most kernel developers will agree it's not a good idea to let userspace corrupt kernel memory. Otherwise, the driver can't be supported, and maintaining upstream drivers that can't be supported serves no useful purpose.
Anyone can load out-of-tree ones just as well. VFIO already supports MSI, so VFIO developers already have a lot of experience with these issues. Getting their input would be valuable.

-- 
MST
[dpdk-dev] [PATCH v2 4/4] Modifying configuration scripts for Netronome's nfp_uio driver.
From: "Alejandro.Lucero"

Signed-off-by: Alejandro.Lucero
Signed-off-by: Rolf.Neugebauer
---
 tools/dpdk_nic_bind.py |   8 ++--
 tools/setup.sh         | 122 ++--
 2 files changed, 101 insertions(+), 29 deletions(-)

diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index b7bd877..f7f8a39 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -43,7 +43,7 @@ ETHERNET_CLASS = "0200"
 # Each device within this is itself a dictionary of device properties
 devices = {}
 # list of supported DPDK drivers
-dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic" ]
+dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic", "nfp_uio" ]
 # command-line arg flags
 b_flag = None
@@ -153,7 +153,7 @@ def find_module(mod):
     return path

 def check_modules():
-    '''Checks that igb_uio is loaded'''
+    '''Checks that at least one dpdk module is loaded'''
     global dpdk_drivers

     fd = file("/proc/modules")
@@ -261,7 +261,7 @@ def get_nic_details():
                 devices[d]["Active"] = "*Active*"
                 break;

-    # add igb_uio to list of supporting modules if needed
+    # add module to list of supporting modules if needed
     if "Module_str" in devices[d]:
         for driver in dpdk_drivers:
             if driver not in devices[d]["Module_str"]:
@@ -440,7 +440,7 @@ def display_devices(title, dev_list, extra_params = None):
 def show_status():
     '''Function called when the script is passed the "--status" option. Displays
-    to the user what devices are bound to the igb_uio driver, the kernel driver
+    to the user what devices are bound to a dpdk driver, the kernel driver
     or to no driver'''
     global dpdk_drivers
     kernel_drv = []
diff --git a/tools/setup.sh b/tools/setup.sh
index 5a8b2f3..e434ddb 100755
--- a/tools/setup.sh
+++ b/tools/setup.sh
@@ -236,6 +236,52 @@ load_vfio_module()
 }

 #
+# Unloads nfp_uio.ko.
+#
+remove_nfp_uio_module()
+{
+	echo "Unloading any existing DPDK UIO module"
+	/sbin/lsmod | grep -s nfp_uio > /dev/null
+	if [ $? -eq 0 ] ; then
+		sudo /sbin/rmmod nfp_uio
+	fi
+}
+
+#
+# Loads new nfp_uio.ko (and uio module if needed).
+#
+load_nfp_uio_module()
+{
+	echo "Using RTE_SDK=$RTE_SDK and RTE_TARGET=$RTE_TARGET"
+	if [ ! -f $RTE_SDK/$RTE_TARGET/kmod/nfp_uio.ko ];then
+		echo "## ERROR: Target does not have the DPDK UIO Kernel Module."
+		echo "       To fix, please try to rebuild target."
+		return
+	fi
+
+	remove_nfp_uio_module
+
+	/sbin/lsmod | grep -s uio > /dev/null
+	if [ $? -ne 0 ] ; then
+		modinfo uio > /dev/null
+		if [ $? -eq 0 ]; then
+			echo "Loading uio module"
+			sudo /sbin/modprobe uio
+		fi
+	fi
+
+	# UIO may be compiled into kernel, so it may not be an error if it can't
+	# be loaded.
+
+	echo "Loading DPDK UIO module"
+	sudo /sbin/insmod $RTE_SDK/$RTE_TARGET/kmod/nfp_uio.ko
+	if [ $? -ne 0 ] ; then
+		echo "## ERROR: Could not load kmod/nfp_uio.ko."
+		quit
+	fi
+}
+
+#
 # Unloads the rte_kni.ko module.
 #
 remove_kni_module()
@@ -427,10 +473,10 @@ grep_meminfo()
 #
 show_nics()
 {
-	if /sbin/lsmod | grep -q -e igb_uio -e vfio_pci; then
+	if /sbin/lsmod | grep -q -e igb_uio -e vfio_pci -e nfp_uio; then
 		${RTE_SDK}/tools/dpdk_nic_bind.py --status
 	else
-		echo "# Please load the 'igb_uio' or 'vfio-pci' kernel module before "
+		echo "# Please load the 'igb_uio', 'vfio-pci' or 'nfp_uio' kernel module before "
 		echo "# querying or adjusting NIC device bindings"
 	fi
 }
@@ -471,6 +517,23 @@ bind_nics_to_igb_uio()
 }

 #
+# Uses dpdk_nic_bind.py to move devices to work with nfp_uio
+#
+bind_nics_to_nfp_uio()
+{
+	if /sbin/lsmod | grep -q nfp_uio ; then
+		${RTE_SDK}/tools/dpdk_nic_bind.py --status
+		echo ""
+		echo -n "Enter PCI address of device to bind to NFP UIO driver: "
+		read PCI_PATH
+		sudo ${RTE_SDK}/tools/dpdk_nic_bind.py -b nfp_uio $PCI_PATH && echo "OK"
+	else
+		echo "# Please load the 'nfp_uio' kernel module before querying or "
+		echo "# adjusting NIC device bindings"
+	fi
+}
+
+#
 # Uses dpdk_nic_bind.py to move devices to work with kernel drivers again
 #
 unbind_nics()
@@ -513,29 +576,35 @@ step2_func()
 	TEXT[1]="Insert IGB UIO module"
 	FUNC[1]="load_igb_uio_module"

-	TEXT[2]="Insert VFIO module"
-	FUNC[2]="load_vfio_module"
+	TEXT[2]="Insert NFP UIO module"
+	FUNC[2]="load_nfp_uio_module"

-	TEXT[3]="Insert KNI module"
-	FUNC[3]="load_kni_module"
+	TEXT[3]="Insert VFIO
[dpdk-dev] [PATCH v2 3/4] This patch adds documentation about Netronome's NFP nic
From: "Alejandro.Lucero"Signed-off-by: Alejandro.Lucero Signed-off-by: Rolf.Neugebauer --- doc/guides/nics/index.rst |1 + doc/guides/nics/nfp.rst | 270 + 2 files changed, 271 insertions(+) create mode 100644 doc/guides/nics/nfp.rst diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index d1a92f8..596ff88 100644 --- a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -48,6 +48,7 @@ Network Interface Controller Drivers virtio vmxnet3 pcap_ring +nfp **Figures** diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst new file mode 100644 index 000..57b34c6 --- /dev/null +++ b/doc/guides/nics/nfp.rst @@ -0,0 +1,270 @@ +.. BSD LICENSE +Copyright(c) 2015 Netronome Systems, Inc. All rights reserved. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in +the documentation and/or other materials provided with the +distribution. +* Neither the name of Intel Corporation nor the names of its +contributors may be used to endorse or promote products derived +from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +NFP poll mode driver library + + +Netronome's sixth generation of flow processors pack 216 programmable +cores and over 100 hardware accelerators that uniquely combine packet, +flow, security and content processing in a single device that scales +up to 400 Gbps. + +This document explains how to use DPDK with the Netronome Poll Mode +Driver (PMD) supporting Netronome's Network Flow Processor 6xxx +(NFP-6xxx). + +Currently the driver supports virtual functions (VFs) only. + +Dependencies + + +Before using the Netronome's DPDK PMD some NFP-6xxx configuration, +which is not related to DPDK, is required. The system requires +installation of **Netronome's BSP (Board Support Package)** which includes +Linux drivers, programs and libraries. + +If you have a NFP-6xxx device you should already have the code and +documentation for doing this configuration. Contact +**support at netronome.com** to obtain the latest available firmware. + +The NFP Linux kernel drivers (including the required PF driver for the +NFP) are available on Github at +**https://github.com/Netronome/nfp-drv-kmods** along with build +instructions. + +DPDK runs in userspace and PMDs uses the Linux kernel UIO interface to +allow access to physical devices from userspace. The NFP PMD requires +a separate UIO driver, **nfp_uio**, to perform correct +initialization. This driver is part of the DPDK source tree and is +equivalent to Intel's igb_uio driver. 
+ +Building the software +- + +Netronome's PMD code is provided in the **drivers/net/nfp** directory and +nfp_uio is present in the **lib/librte_eal/linuxapp/nfp_uio** directory. Both +are part of the DPDK build if the **common_linuxapp configuration** file is +used. If you use another configuration file and want to have NFP support +just add: + +- **CONFIG_RTE_EAL_NFP_UIO=y** + +- **CONFIG_RTE_LIBRTE_NFP_PMD=y** + +Once DPDK is built all the DPDK apps and examples include support for +the NFP PMD. The nfp_uio.ko module will be at build/kmods directory or +at the directory specified when building DPDK. + + +System configuration + + +Using the NFP PMD is not different to using other PMDs. Usual steps are: + +#. **Configure hugepages:** All major Linux distributions have the hugepages + functionality enabled by default. By default this allows the system uses for + working with transparent hugepages. But in this case some hugepages need to + be created/reserved for use with the DPDK through the hugetlbfs file system. + First the virtual
[dpdk-dev] [PATCH v2 2/4] This patch adds a new UIO driver for Netronome NFP PCI cards.
From: "Alejandro.Lucero"Current Netronome's PMD just supports Virtual Functions. Future Physical Function support will require specific Netronome code here. Signed-off-by: Alejandro.Lucero Signed-off-by: Rolf.Neugebauer --- lib/librte_eal/common/include/rte_pci.h |1 + lib/librte_eal/linuxapp/eal/eal_pci.c |4 + lib/librte_eal/linuxapp/eal/eal_pci_uio.c |2 +- lib/librte_eal/linuxapp/nfp_uio/Makefile | 53 +++ lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c | 497 + lib/librte_ether/rte_ethdev.c |1 + 6 files changed, 557 insertions(+), 1 deletion(-) create mode 100644 lib/librte_eal/linuxapp/nfp_uio/Makefile create mode 100644 lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h index 83e3c28..89baaf6 100644 --- a/lib/librte_eal/common/include/rte_pci.h +++ b/lib/librte_eal/common/include/rte_pci.h @@ -146,6 +146,7 @@ struct rte_devargs; enum rte_kernel_driver { RTE_KDRV_UNKNOWN = 0, RTE_KDRV_IGB_UIO, + RTE_KDRV_NFP_UIO, RTE_KDRV_VFIO, RTE_KDRV_UIO_GENERIC, RTE_KDRV_NIC_UIO, diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c index bc5b5be..19a93fe 100644 --- a/lib/librte_eal/linuxapp/eal/eal_pci.c +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c @@ -137,6 +137,7 @@ pci_map_device(struct rte_pci_device *dev) #endif break; case RTE_KDRV_IGB_UIO: + case RTE_KDRV_NFP_UIO: case RTE_KDRV_UIO_GENERIC: /* map resources for devices that use uio */ ret = pci_uio_map_resource(dev); @@ -161,6 +162,7 @@ pci_unmap_device(struct rte_pci_device *dev) RTE_LOG(ERR, EAL, "Hotplug doesn't support vfio yet\n"); break; case RTE_KDRV_IGB_UIO: + case RTE_KDRV_NFP_UIO: case RTE_KDRV_UIO_GENERIC: /* unmap resources for devices that use uio */ pci_uio_unmap_resource(dev); @@ -357,6 +359,8 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus, dev->kdrv = RTE_KDRV_VFIO; else if (!strcmp(driver, "igb_uio")) dev->kdrv = RTE_KDRV_IGB_UIO; + else if (!strcmp(driver, "nfp_uio")) 
+ dev->kdrv = RTE_KDRV_NFP_UIO; else if (!strcmp(driver, "uio_pci_generic")) dev->kdrv = RTE_KDRV_UIO_GENERIC; else diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c index ac50e13..29ec9cb 100644 --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c @@ -270,7 +270,7 @@ pci_uio_alloc_resource(struct rte_pci_device *dev, goto error; } - if (dev->kdrv == RTE_KDRV_IGB_UIO) + if (dev->kdrv == RTE_KDRV_IGB_UIO || dev->kdrv == RTE_KDRV_NFP_UIO) dev->intr_handle.type = RTE_INTR_HANDLE_UIO; else { dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX; diff --git a/lib/librte_eal/linuxapp/nfp_uio/Makefile b/lib/librte_eal/linuxapp/nfp_uio/Makefile new file mode 100644 index 000..b9e2f0a --- /dev/null +++ b/lib/librte_eal/linuxapp/nfp_uio/Makefile @@ -0,0 +1,53 @@ +# BSD LICENSE +# +# Copyright(c) 2014-2015 Netronome. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
[dpdk-dev] [PATCH v2 1/4] This patch adds a PMD driver for Netronome NFP PCI cards.
From: "Alejandro.Lucero"Signed-off-by: Alejandro.Lucero Signed-off-by: Rolf.Neugebauer --- MAINTAINERS |9 + config/common_linuxapp |6 + doc/guides/rel_notes/release_2_2.rst |8 + drivers/net/Makefile |1 + drivers/net/nfp/Makefile | 88 ++ drivers/net/nfp/nfp_net.c| 2495 ++ drivers/net/nfp/nfp_net_ctrl.h | 290 drivers/net/nfp/nfp_net_logs.h | 75 + drivers/net/nfp/nfp_net_pmd.h| 434 ++ lib/librte_eal/linuxapp/Makefile |3 + mk/rte.app.mk|1 + 11 files changed, 3410 insertions(+) create mode 100644 drivers/net/nfp/Makefile create mode 100644 drivers/net/nfp/nfp_net.c create mode 100644 drivers/net/nfp/nfp_net_ctrl.h create mode 100644 drivers/net/nfp/nfp_net_logs.h create mode 100644 drivers/net/nfp/nfp_net_pmd.h diff --git a/MAINTAINERS b/MAINTAINERS index 080a8e8..1fb2ba6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -167,6 +167,10 @@ FreeBSD UIO M: Bruce Richardson F: lib/librte_eal/bsdapp/nic_uio/ +NFP UIO +M: Alejandro Lucero +F: lib/librte_eal/linuxapp/nfp_uio/ + Core Libraries -- @@ -255,6 +259,11 @@ M: Adrien Mazarguil F: drivers/net/mlx4/ F: doc/guides/nics/mlx4.rst +Netronome NFP +M: Alejandro Lucero +F: drivers/net/nfp/ +F: doc/guides/nics/nfp.rst + RedHat virtio M: Huawei Xie M: Changchun Ouyang diff --git a/config/common_linuxapp b/config/common_linuxapp index 0de43d5..d8d6384 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -108,6 +108,7 @@ CONFIG_RTE_LIBEAL_USE_HPET=n CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n CONFIG_RTE_EAL_IGB_UIO=y +CONFIG_RTE_EAL_NFP_UIO=y CONFIG_RTE_EAL_VFIO=y CONFIG_RTE_MALLOC_DEBUG=n @@ -238,6 +239,11 @@ CONFIG_RTE_LIBRTE_ENIC_PMD=y CONFIG_RTE_LIBRTE_ENIC_DEBUG=n # +# Compile burst-oriented Netronome PMD driver +# +CONFIG_RTE_LIBRTE_NFP_PMD=y + +# # Compile burst-oriented VIRTIO PMD driver # CONFIG_RTE_LIBRTE_VIRTIO_PMD=y diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst index 5687676..364cca3 100644 --- a/doc/guides/rel_notes/release_2_2.rst 
+++ b/doc/guides/rel_notes/release_2_2.rst @@ -16,6 +16,10 @@ EAL Fixed issue where the ``rte_epoll_wait()`` function didn't return when the underlying call to ``epoll_wait()`` timed out. +* **eal/linuxapp: New UIO driver for Netronome's NFP support** + + Netronome's NFP PMD requires some specific configuration. Current implementation + supports just VFs. Future PF support will require major changes to this driver. Drivers ~~~ @@ -39,6 +43,10 @@ Drivers Fixed issue with libvirt ``virsh destroy`` not killing the VM. +* **drivers/net: New PMD for Netronome's NFP 6xxx cards** + + PMD supporting VFs with Netronome's NFP card. It requires specific UIO + driver, nfp_uio, and previous configuration using Netronome's BSP. Libraries ~ diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 5ebf963..bc08591 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -48,6 +48,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt +DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp include $(RTE_SDK)/mk/rte.sharelib.mk include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile new file mode 100644 index 000..ef74e27 --- /dev/null +++ b/drivers/net/nfp/Makefile @@ -0,0 +1,88 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. 
+# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
[PATCH v2 0/4] Support for Netronome's NFP-6xxx card
From: "Alejandro.Lucero"

This patchset adds a new PMD for Netronome's NFP-6xxx card along with a new UIO driver, documentation and minor changes to configuration scripts.

Alejandro.Lucero (4):
  This patch adds a PMD driver for Netronome NFP PCI cards.
  This patch adds a new UIO driver for Netronome NFP PCI cards.
  This patch adds documentation about Netronome's NFP nic
  Modifying configuration scripts for Netronome's nfp_uio driver.

 MAINTAINERS                               |    9 +
 config/common_linuxapp                    |    6 +
 doc/guides/nics/index.rst                 |    1 +
 doc/guides/nics/nfp.rst                   |  270 ++
 doc/guides/rel_notes/release_2_2.rst      |    8 +
 drivers/net/Makefile                      |    1 +
 drivers/net/nfp/Makefile                  |   88 +
 drivers/net/nfp/nfp_net.c                 | 2495 ++
 drivers/net/nfp/nfp_net_ctrl.h            |  290 +
 drivers/net/nfp/nfp_net_logs.h            |   75 +
 drivers/net/nfp/nfp_net_pmd.h             |  434 +
 lib/librte_eal/common/include/rte_pci.h   |    1 +
 lib/librte_eal/linuxapp/Makefile          |    3 +
 lib/librte_eal/linuxapp/eal/eal_pci.c     |    4 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |    2 +-
 lib/librte_eal/linuxapp/nfp_uio/Makefile  |   53 +
 lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c |  497 ++
 lib/librte_ether/rte_ethdev.c             |    1 +
 mk/rte.app.mk                             |    1 +
 tools/dpdk_nic_bind.py                    |    8 +-
 tools/setup.sh                            |  122 +-
 21 files changed, 4339 insertions(+), 30 deletions(-)
 create mode 100644 doc/guides/nics/nfp.rst
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 lib/librte_eal/linuxapp/nfp_uio/Makefile
 create mode 100644 lib/librte_eal/linuxapp/nfp_uio/nfp_uio.c

-- 
1.7.9.5
[dpdk-dev] [PATCH v5 3/4] ethdev: redesign link speed config API
Hi Marc,

On Sun, Oct 04, 2015 at 11:12:46PM +0200, Marc Sune wrote:
> [...]
>  /**
> + * Device supported speeds bitmap flags
> + */
> +#define ETH_LINK_SPEED_AUTONEG    (0 <<  0)  /*< Autonegotiate (all speeds) */
> +#define ETH_LINK_SPEED_NO_AUTONEG (1 <<  0)  /*< Disable autoneg (fixed speed) */
> +#define ETH_LINK_SPEED_10M_HD     (1 <<  1)  /*< 10 Mbps half-duplex */
> +#define ETH_LINK_SPEED_10M        (1 <<  2)  /*< 10 Mbps full-duplex */
> +#define ETH_LINK_SPEED_100M_HD    (1 <<  3)  /*< 100 Mbps half-duplex */
> +#define ETH_LINK_SPEED_100M       (1 <<  4)  /*< 100 Mbps full-duplex */
> +#define ETH_LINK_SPEED_1G         (1 <<  5)  /*< 1 Gbps */
> +#define ETH_LINK_SPEED_2_5G       (1 <<  6)  /*< 2.5 Gbps */
> +#define ETH_LINK_SPEED_5G         (1 <<  7)  /*< 5 Gbps */
> +#define ETH_LINK_SPEED_10G        (1 <<  8)  /*< 10 Gbps */
> +#define ETH_LINK_SPEED_20G        (1 <<  9)  /*< 20 Gbps */
> +#define ETH_LINK_SPEED_25G        (1 << 10)  /*< 25 Gbps */
> +#define ETH_LINK_SPEED_40G        (1 << 11)  /*< 40 Gbps */
> +#define ETH_LINK_SPEED_50G        (1 << 12)  /*< 50 Gbps */
> +#define ETH_LINK_SPEED_56G        (1 << 13)  /*< 56 Gbps */
> +#define ETH_LINK_SPEED_100G       (1 << 14)  /*< 100 Gbps */
> +
> +/**
> + * Ethernet numeric link speeds in Mbps
> + */
> +#define ETH_SPEED_NUM_NONE   0       /*< Not defined */
> +#define ETH_SPEED_NUM_10M    10      /*< 10 Mbps */
> +#define ETH_SPEED_NUM_100M   100     /*< 100 Mbps */
> +#define ETH_SPEED_NUM_1G     1000    /*< 1 Gbps */
> +#define ETH_SPEED_NUM_2_5G   2500    /*< 2.5 Gbps */
> +#define ETH_SPEED_NUM_5G     5000    /*< 5 Gbps */
> +#define ETH_SPEED_NUM_10G    10000   /*< 10 Gbps */
> +#define ETH_SPEED_NUM_20G    20000   /*< 20 Gbps */
> +#define ETH_SPEED_NUM_25G    25000   /*< 25 Gbps */
> +#define ETH_SPEED_NUM_40G    40000   /*< 40 Gbps */
> +#define ETH_SPEED_NUM_50G    50000   /*< 50 Gbps */
> +#define ETH_SPEED_NUM_56G    56000   /*< 56 Gbps */
> +#define ETH_SPEED_NUM_100G   100000  /*< 100 Gbps */
> +
> +/**
> + * A structure used to retrieve link-level information of an Ethernet port.
> + */
>  struct rte_eth_link {
> -	uint16_t link_speed;      /**< ETH_LINK_SPEED_[10, 100, 1000, 10000] */
> -	uint16_t link_duplex;     /**< ETH_LINK_[HALF_DUPLEX, FULL_DUPLEX] */
> -	uint8_t  link_status : 1; /**< 1 -> link up, 0 -> link down */
> -} __attribute__((aligned(8))); /**< aligned for atomic64 read/write */
> -
> -#define ETH_LINK_SPEED_AUTONEG  0      /**< Auto-negotiate link speed. */
> -#define ETH_LINK_SPEED_10       10     /**< 10 megabits/second. */
> -#define ETH_LINK_SPEED_100      100    /**< 100 megabits/second. */
> -#define ETH_LINK_SPEED_1000     1000   /**< 1 gigabits/second. */
> -#define ETH_LINK_SPEED_10000    10000  /**< 10 gigabits/second. */
> -#define ETH_LINK_SPEED_10G      10000  /**< alias of 10 gigabits/second. */
> -#define ETH_LINK_SPEED_20G      20000  /**< 20 gigabits/second. */
> -#define ETH_LINK_SPEED_40G      40000  /**< 40 gigabits/second. */
> +	uint32_t link_speed;       /**< Link speed (ETH_SPEED_NUM_) */
> +	uint16_t link_duplex;      /**< 1 -> full duplex, 0 -> half duplex */
> +	uint8_t  link_autoneg : 1; /**< 1 -> link speed has been autoneg */
> +	uint8_t  link_status : 1;  /**< 1 -> link up, 0 -> link down */
> +} __attribute__((aligned(8))); /**< aligned for atomic64 read/write */
> [...]

Pretty good. One question: why did you not merge link_duplex, autoneg, and status, like:

struct rte_eth_link {
	uint32_t link_speed;
	uint32_t link_duplex:1;
	uint32_t link_autoneg:1;
	uint32_t link_status:1;
};

Is it really useful to keep a uint16_t for the duplex alone?

Another point: the comment about the link_duplex field should point to the defines you have changed, i.e. ETH_LINK_HALF_DUPLEX, ETH_LINK_FULL_DUPLEX.

Regards,

-- 
Nélio Laranjeiro
6WIND
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On 10/06/2015 10:33 AM, Stephen Hemminger wrote:
> Other than implementation objections, so far the two main arguments
> against this reduce to:
> 1. If you allow UIO ioctl then it opens an API hook for all the crap out
>    of tree UIO drivers to do what they want.
> 2. If you allow UIO MSI-X then you are expanding the usage of userspace
>    device access in an insecure manner.
>
> Another alternative which I explored was making a version of VFIO that
> works without IOMMU. It solves #1 but actually increases the likely negative
> response to argument #2. This would keep the same API, and avoid having to
> modify UIO. But we would still have the same (if not more) resistance
> from IOMMU developers who believe all systems have to be secure against
> root.

vfio's charter was explicitly aiming for modern setups with IOMMUs. This could be revisited, but I agree it will have even more resistance, justified IMO.

btw, (2) doesn't really add any insecurity. The user could already poke at the msix tables (as well as perform DMA); they just couldn't get a useful interrupt out of them.

Maybe a module parameter "allow_insecure_dma" can be added to uio_pci_generic. Without the parameter, bus mastering and msix are disabled; with the parameter they are allowed. This requires the sysadmin to take a positive step in order to make use of their hardware.
[dpdk-dev] [PATCH 00/17] Enhance mlx5 with Mellanox OFED 3.1
On 6 Oct 2015 09:54, "Stephen Hemminger" wrote:
>
> On Mon, 5 Oct 2015 19:54:35 +0200
> Adrien Mazarguil wrote:
>
> > Mellanox OFED 3.1 [1] comes with improved APIs that Mellanox ConnectX-4
> > (mlx5) adapters can take advantage of, such as:
> >
> > - Separate post and doorbell operations on all queues.
> > - Lightweight RX queues called Work Queues (WQs).
> > - Low-level RSS indirection table and hash key configuration.
> >
> > This patchset enhances mlx5 with all of these for better performance and
> > flexibility. Documentation is updated accordingly.
>
> Has anybody explored doing a driver without the dependency on OFED?
> It is certainly possible. The Linux kernel drivers don't depend on it.
> And dropping OFED would certainly be faster.

OFED is an established kernel API. I agree that F from infiniband should be deprecated since it has broader scope of use. Using OFED avoids wasting effort duplicating the kernel's code. It also provides security that UIO could not.

Best regards,
Vincent
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On 10/06/15 01:49, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 01:09:55AM +0300, Vladislav Zolotarov wrote:
>> How about instead of trying to invent the wheel just go and attack the
>> problem directly, just like I've proposed already a few times in the last
>> few days: instead of limiting the UIO, limit the users that are allowed to
>> use UIO to privileged users only (e.g. root). This would solve all the
>> clearly unresolvable issues you are raising here all together, wouldn't it?
> No - root or no root, if the user can modify the addresses in the MSI-X
> table and make the chip corrupt random memory, this is IMHO a non-starter.

Michael, how is this or any other related patch relevant to the problem you are describing? The above ability has been there for years, and if memory serves me well it was you who wrote uio_pci_generic with this "security flaw". ;) This patch in general only adds the ability to receive notifications per MSI-X interrupt; it has nothing to do with the ability to reprogram the MSI-X related registers from user space, which was always there.

> And tainting the kernel is not a solution - your patch adds a pile of
> code that either goes completely unused or taints the kernel.
> Not just that - it's a dedicated userspace API that either
> goes completely unused or taints the kernel.
[dpdk-dev] [PATCH v2] devargs: add blacklisting by linux interface name
On Tue, 2015-10-06 at 08:35 +0100, Stephen Hemminger wrote:
> On Mon, 5 Oct 2015 11:26:08 -0400
> Chas Williams <3chas3 at gmail.com> wrote:
>
> > diff --git a/lib/librte_eal/common/include/rte_pci.h
> > b/lib/librte_eal/common/include/rte_pci.h
> > index 83e3c28..852c149 100644
> > --- a/lib/librte_eal/common/include/rte_pci.h
> > +++ b/lib/librte_eal/common/include/rte_pci.h
> > @@ -161,6 +161,7 @@ struct rte_pci_device {
> > 	struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE]; /**< PCI Memory Resource */
> > 	struct rte_intr_handle intr_handle; /**< Interrupt handle */
> > 	struct rte_pci_driver *driver; /**< Associated driver */
> > +	char name[32];
>
> Why not use IFNAMSIZ rather than a magic constant here?

No particular reason. It just matches the virtual device name size. I will change it.
[dpdk-dev] [PATCH 00/17] Enhance mlx5 with Mellanox OFED 3.1
On Mon, 5 Oct 2015 19:54:35 +0200
Adrien Mazarguil wrote:

> Mellanox OFED 3.1 [1] comes with improved APIs that Mellanox ConnectX-4
> (mlx5) adapters can take advantage of, such as:
>
> - Separate post and doorbell operations on all queues.
> - Lightweight RX queues called Work Queues (WQs).
> - Low-level RSS indirection table and hash key configuration.
>
> This patchset enhances mlx5 with all of these for better performance and
> flexibility. Documentation is updated accordingly.

Has anybody explored doing a driver without the dependency on OFED? It is certainly possible. The Linux kernel drivers don't depend on it. And dropping OFED would certainly be faster.
[dpdk-dev] [PATCH v2] devargs: add blacklisting by linux interface name
On Mon, 5 Oct 2015 11:26:08 -0400
Chas Williams <3chas3 at gmail.com> wrote:

> diff --git a/lib/librte_eal/common/include/rte_pci.h
> b/lib/librte_eal/common/include/rte_pci.h
> index 83e3c28..852c149 100644
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -161,6 +161,7 @@ struct rte_pci_device {
> 	struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE]; /**< PCI Memory Resource */
> 	struct rte_intr_handle intr_handle; /**< Interrupt handle */
> 	struct rte_pci_driver *driver; /**< Associated driver */
> +	char name[32];

Why not use IFNAMSIZ rather than a magic constant here?
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
Other than implementation objections, so far the two main arguments against this reduce to:

1. If you allow UIO ioctl then it opens an API hook for all the crap out of tree UIO drivers to do what they want.

2. If you allow UIO MSI-X then you are expanding the usage of userspace device access in an insecure manner.

Another alternative which I explored was making a version of VFIO that works without an IOMMU. It solves #1 but actually increases the likely negative response to argument #2. This would keep the same API, and avoid having to modify UIO. But we would still have the same (if not more) resistance from IOMMU developers who believe all systems have to be secure against root.
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Tue, Oct 06, 2015 at 01:09:55AM +0300, Vladislav Zolotarov wrote:
> How about instead of trying to invent the wheel just go and attack the
> problem directly, just like I've proposed already a few times in the last
> few days: instead of limiting the UIO, limit the users that are allowed to
> use UIO to privileged users only (e.g. root). This would solve all the
> clearly unresolvable issues you are raising here all together, wouldn't it?

No - root or no root, if the user can modify the addresses in the MSI-X table and make the chip corrupt random memory, this is IMHO a non-starter.

And tainting the kernel is not a solution - your patch adds a pile of code that either goes completely unused or taints the kernel. Not just that - it's a dedicated userspace API that either goes completely unused or taints the kernel.
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Oct 6, 2015 12:55 AM, "Michael S. Tsirkin" wrote:
>
> On Thu, Oct 01, 2015 at 11:33:06AM +0300, Michael S. Tsirkin wrote:
> > Just forwarding events is not enough to make a valid driver.
> > What is missing is a way to access the device in a safe way.
>
> Thinking about it some more, maybe some devices don't do DMA, and merely
> signal events with MSI/MSI-X.
>
> The fact you mention igb_uio in the cover letter seems to hint that this
> isn't the case, and that the real intent is to abuse it for DMA-capable
> devices, but still ...
>
> If we assume such a simple device, we need to block userspace from
> tweaking at least the MSI control and the MSI-X table.
> And changing BARs might make someone else corrupt the MSI-X
> table, so we need to block it from changing BARs, too.
>
> Things like device reset will clear the table. I guess this means we
> need to track access to reset, too, and make sure we restore the
> table to a sane config.
>
> The PM capability can be used to reset things too, I think. Better be
> careful about that.
>
> And a bunch of devices could be doing weird things that need
> to be special-cased.
>
> All of this is what VFIO is already dealing with.
>
> Maybe extending VFIO for this use case, or finding another way to share
> code might be a better idea than duplicating the code within uio?

How about instead of trying to invent the wheel, just go and attack the problem directly, as I've proposed already a few times in the last few days: instead of limiting the UIO, limit the users that are allowed to use UIO to privileged users only (e.g. root). This would solve all the clearly unresolvable issues you are raising here all together, wouldn't it?
[dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
On Thu, Oct 01, 2015 at 11:33:06AM +0300, Michael S. Tsirkin wrote:
> Just forwarding events is not enough to make a valid driver.
> What is missing is a way to access the device in a safe way.

Thinking about it some more, maybe some devices don't do DMA, and merely signal events with MSI/MSI-X.

The fact you mention igb_uio in the cover letter seems to hint that this isn't the case, and that the real intent is to abuse it for DMA-capable devices, but still ...

If we assume such a simple device, we need to block userspace from tweaking at least the MSI control and the MSI-X table. And changing BARs might make someone else corrupt the MSI-X table, so we need to block it from changing BARs, too.

Things like device reset will clear the table. I guess this means we need to track access to reset, too, and make sure we restore the table to a sane config.

The PM capability can be used to reset things too, I think. Better be careful about that.

And a bunch of devices could be doing weird things that need to be special-cased.

All of this is what VFIO is already dealing with.

Maybe extending VFIO for this use case, or finding another way to share code, might be a better idea than duplicating the code within uio?

-- 
MST
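Blocking userspace from tweaking the MSI-X table presupposes the driver can locate it, which starts with finding the MSI-X capability (ID 0x11) in the device's config-space capability list. A sketch of that walk over a config-space snapshot (e.g. bytes read from sysfs .../config); the helper name and loop guard are illustrative, not code from uio or VFIO:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI config-space offsets and bits. */
#define PCI_STATUS          0x06  /* 16-bit status register */
#define PCI_STATUS_CAP_LIST 0x10  /* capability list is present */
#define PCI_CAPABILITY_LIST 0x34  /* offset of first capability */
#define PCI_CAP_ID_MSIX     0x11  /* MSI-X capability ID */

/* Walk the PCI capability linked list in a config-space image and
 * return the offset of the MSI-X capability, or 0 if absent.
 * Each capability starts with: [ID byte][next-pointer byte]. */
static unsigned int find_msix_cap(const uint8_t *cfg, size_t len)
{
	uint16_t status;
	unsigned int pos, guard;

	if (len < 64)
		return 0;
	status = cfg[PCI_STATUS] | (uint16_t)(cfg[PCI_STATUS + 1] << 8);
	if (!(status & PCI_STATUS_CAP_LIST))
		return 0;
	pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
	/* The guard bounds the walk so a malformed (or malicious)
	 * next-pointer chain cannot loop forever. */
	for (guard = 0; guard < 48 && pos && pos + 1 < len; guard++) {
		if (cfg[pos] == PCI_CAP_ID_MSIX)
			return pos;
		pos = cfg[pos + 1] & ~3u;
	}
	return 0;
}
```

Once the capability is found, the table BAR indicator and offset follow at fixed offsets within it, which is how a driver would know which config and BAR accesses to filter.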
[dpdk-dev] Unlinking hugepage backing file after initialization
On Mon, Oct 05, 2015 at 01:08:52PM +, Xie, Huawei wrote:
> On 9/30/2015 5:36 AM, Michael S. Tsirkin wrote:
> > On Tue, Sep 29, 2015 at 05:50:00PM +, shesha Sreenivasamurthy (shesha)
> > wrote:
> >> Sure. Then, is there any real reason why the backing files should not be
> >> unlinked?
> > AFAIK qemu unlinks them already.
> Sorry, I didn't make it clear. Let us take the physical Ethernet
> controller in the host for example:
>
> 1) DPDK app1 unlinked the huge page after initialization.
> 2) DPDK app1 crashed or got killed unexpectedly.
> 3) The nic device is still DMAing to the buffer memory allocated from
> the huge page.
> 4) Another app2 started, allocated memory from the hugetlbfs, and the
> memory allocated happened to be the buffer memory. Now the nic device
> DMAed to the memory of app2, which corrupted app2.
> Btw, the window opened is very, very narrow, but we could avoid this
> corruption if we don't unlink the huge page immediately. We could
> reinitialize the nic through a binding operation and then remove the huge
> page.
>
> I mentioned virtio the first time. In its case, the one who does DMA
> is vhost, and I am talking about the guest huge page, not the huge pages
> used to back guest memory.
>
> So we had better not unlink huge pages unless we have another solution to
> avoid the corruption.

Oh, I get it now. It's when you (ab)use UIO to bypass all normal kernel protections. There's no problem when using VFIO.

So the kernel doesn't protect you in case of a crash, but I guess you can try to protect yourself. For example, write a separate service that you can pass the hugepage FDs and the device FDs to. Have it hold on to them, and when it detects your app crashed, have it reset the device before closing the FDs. Just make sure that one doesn't crash :).

But really, people should just use VFIO.

-- 
MST
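The "separate service holding the FDs" idea relies on the standard Unix mechanism for handing open file descriptors between processes: SCM_RIGHTS ancillary data over a Unix-domain socket. A minimal sketch of that mechanism (the helper names are hypothetical; a real watchdog service would add reconnect and device-reset logic around this):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one open file descriptor over a connected Unix-domain socket.
 * The kernel duplicates the descriptor into the receiver, so the fd
 * stays open even if the sender later crashes. */
static int send_fd(int sock, int fd)
{
	struct msghdr msg = {0};
	struct iovec iov;
	char byte = 'F';
	char cbuf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cm;

	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);
	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns it, or -1 on error. */
static int recv_fd(int sock)
{
	struct msghdr msg = {0};
	struct iovec iov;
	char byte;
	char cbuf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cm;
	int fd = -1;

	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);
	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cm = CMSG_FIRSTHDR(&msg);
	if (cm && cm->cmsg_level == SOL_SOCKET && cm->cmsg_type == SCM_RIGHTS)
		memcpy(&fd, CMSG_DATA(cm), sizeof(int));
	return fd;
}
```

A watchdog built on this would receive the hugepage and device FDs at app startup, then detect the app's death (e.g. EOF on the socket) and reset the device before closing them, which closes exactly the race described above.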