> -----Original Message----- > From: Richardson, Bruce > Sent: Monday, April 20, 2015 4:03 PM > To: Ananyev, Konstantin > Cc: dev at dpdk.org; Wiles, Keith > Subject: Re: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation > > On Mon, Apr 20, 2015 at 12:26:43PM +0100, Ananyev, Konstantin wrote: > > > > > > > -----Original Message----- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson > > > Sent: Friday, April 17, 2015 4:17 PM > > > To: dev at dpdk.org; Wiles, Keith > > > Subject: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation > > > > > > This commit demonstrates what a minimal API for all packet handling > > > types would look like. It simply provides the necessary parts for > > > receiving and transmiting packets, and is based off the ethdev > > > implementation. > > > --- > > > config/common_bsdapp | 5 ++ > > > config/common_linuxapp | 5 ++ > > > lib/Makefile | 1 + > > > lib/librte_pktdev/Makefile | 56 ++++++++++++++++ > > > lib/librte_pktdev/rte_pktdev.c | 35 ++++++++++ > > > lib/librte_pktdev/rte_pktdev.h | 144 > > > +++++++++++++++++++++++++++++++++++++++++ > > > 6 files changed, 246 insertions(+) > > > create mode 100644 lib/librte_pktdev/Makefile > > > create mode 100644 lib/librte_pktdev/rte_pktdev.c > > > create mode 100644 lib/librte_pktdev/rte_pktdev.h > > > > > > diff --git a/config/common_bsdapp b/config/common_bsdapp > > > index 8ff4dc2..d2b932c 100644 > > > --- a/config/common_bsdapp > > > +++ b/config/common_bsdapp > > > @@ -132,6 +132,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y > > > CONFIG_RTE_LIBRTE_KVARGS=y > > > > > > # > > > +# Compile generic packet handling device library > > > +# > > > +CONFIG_RTE_LIBRTE_PKTDEV=y > > > + > > > +# > > > # Compile generic ethernet library > > > # > > > CONFIG_RTE_LIBRTE_ETHER=y > > > diff --git a/config/common_linuxapp b/config/common_linuxapp > > > index 09a58ac..5bda416 100644 > > > --- a/config/common_linuxapp > > > +++ b/config/common_linuxapp > > > @@ -129,6 +129,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y > > > CONFIG_RTE_LIBRTE_KVARGS=y > > > > > > # > > > +# Compile generic packet handling device library > > > +# > > > +CONFIG_RTE_LIBRTE_PKTDEV=y > > > + > > > +# > > > # Compile generic ethernet library > > > # > > > CONFIG_RTE_LIBRTE_ETHER=y > > > diff --git a/lib/Makefile b/lib/Makefile > > > index d94355d..4db5ee0 100644 > > > --- a/lib/Makefile > > > +++ b/lib/Makefile > > > @@ -32,6 +32,7 @@ > > > include $(RTE_SDK)/mk/rte.vars.mk > > > > > > DIRS-y += librte_compat > > > +DIRS-$(CONFIG_RTE_LIBRTE_PKTDEV) += librte_pktdev > > > DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal > > > DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc > > > DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring > > > diff --git a/lib/librte_pktdev/Makefile b/lib/librte_pktdev/Makefile > > > new file mode 100644 > > > index 0000000..2d3b3a1 > > > --- /dev/null > > > +++ b/lib/librte_pktdev/Makefile > > > @@ -0,0 +1,56 @@ > > > +# BSD LICENSE > > > +# > > > +# Copyright(c) 2015 Intel Corporation. All rights reserved. > > > +# All rights reserved. > > > +# > > > +# Redistribution and use in source and binary forms, with or without > > > +# modification, are permitted provided that the following conditions > > > +# are met: > > > +# > > > +# * Redistributions of source code must retain the above copyright > > > +# notice, this list of conditions and the following disclaimer. > > > +# * Redistributions in binary form must reproduce the above copyright > > > +# notice, this list of conditions and the following disclaimer in > > > +# the documentation and/or other materials provided with the > > > +# distribution. > > > +# * Neither the name of Intel Corporation nor the names of its > > > +# contributors may be used to endorse or promote products derived > > > +# from this software without specific prior written permission. > > > +# > > > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > > > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > > > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > > > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > > > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > > > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > > > + > > > +include $(RTE_SDK)/mk/rte.vars.mk > > > + > > > +# > > > +# library name > > > +# > > > +LIB = libpktdev.a > > > + > > > +CFLAGS += -O3 > > > +CFLAGS += $(WERROR_FLAGS) > > > + > > > +EXPORT_MAP := rte_pktdev_version.map > > > + > > > +LIBABIVER := 1 > > > + > > > +SRCS-y += rte_pktdev.c > > > + > > > +# > > > +# Export include files > > > +# > > > +SYMLINK-y-include += rte_pktdev.h > > > + > > > +# this lib depends upon no others: > > > +DEPDIRS-y += > > > + > > > +include $(RTE_SDK)/mk/rte.lib.mk > > > diff --git a/lib/librte_pktdev/rte_pktdev.c > > > b/lib/librte_pktdev/rte_pktdev.c > > > new file mode 100644 > > > index 0000000..4c32d86 > > > --- /dev/null > > > +++ b/lib/librte_pktdev/rte_pktdev.c > > > @@ -0,0 +1,36 @@ > > > +/*- > > > + * BSD LICENSE > > > + * > > > + * Copyright(c) 2015 Intel Corporation. All rights reserved. > > > + * All rights reserved. > > > + * > > > + * Redistribution and use in source and binary forms, with or without > > > + * modification, are permitted provided that the following conditions > > > + * are met: > > > + * > > > + * * Redistributions of source code must retain the above copyright > > > + * notice, this list of conditions and the following disclaimer. > > > + * * Redistributions in binary form must reproduce the above > > > copyright > > > + * notice, this list of conditions and the following disclaimer in > > > + * the documentation and/or other materials provided with the > > > + * distribution. > > > + * * Neither the name of Intel Corporation nor the names of its > > > + * contributors may be used to endorse or promote products derived > > > + * from this software without specific prior written permission. > > > + * > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS > > > FOR > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, > > > INCIDENTAL, > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF > > > USE, > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON > > > ANY > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE > > > USE > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > > > + */ > > > + > > > +#include "rte_pktdev.h" > > > + > > > +/* For future use */ > > > diff --git a/lib/librte_pktdev/rte_pktdev.h > > > b/lib/librte_pktdev/rte_pktdev.h > > > new file mode 100644 > > > index 0000000..8a5699a > > > --- /dev/null > > > +++ b/lib/librte_pktdev/rte_pktdev.h > > > @@ -0,0 +1,144 @@ > > > +/*- > > > + * BSD LICENSE > > > + * > > > + * Copyright(c) 2015 Intel Corporation. All rights reserved. > > > + * All rights reserved. > > > + * > > > + * Redistribution and use in source and binary forms, with or without > > > + * modification, are permitted provided that the following conditions > > > + * are met: > > > + * > > > + * * Redistributions of source code must retain the above copyright > > > + * notice, this list of conditions and the following disclaimer. > > > + * * Redistributions in binary form must reproduce the above > > > copyright > > > + * notice, this list of conditions and the following disclaimer in > > > + * the documentation and/or other materials provided with the > > > + * distribution. > > > + * * Neither the name of Intel Corporation nor the names of its > > > + * contributors may be used to endorse or promote products derived > > > + * from this software without specific prior written permission. > > > + * > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > > > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > > > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS > > > FOR > > > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > > > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, > > > INCIDENTAL, > > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF > > > USE, > > > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON > > > ANY > > > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > > > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE > > > USE > > > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > > > + */ > > > + > > > +#ifndef _RTE_PKTDEV_H_ > > > +#define _RTE_PKTDEV_H_ > > > + > > > +#include <stdint.h> > > > + > > > +/** > > > + * @file > > > + * > > > + * RTE Packet Processing Device API > > > + */ > > > + > > > +#ifdef __cplusplus > > > +extern "C" { > > > +#endif > > > + > > > +/* forward definition of mbuf structure. We don't need full mbuf header > > > here */ > > > +struct rte_mbuf; > > > + > > > +#define RTE_PKT_NAME_MAX_LEN (32) > > > + > > > +typedef uint16_t (*pkt_rx_burst_t)(void *rxq, > > > + struct rte_mbuf **rx_pkts, > > > + uint16_t nb_pkts); > > > +/**< @internal Retrieve packets from a queue of a device. */ > > > + > > > +typedef uint16_t (*pkt_tx_burst_t)(void *txq, > > > + struct rte_mbuf **tx_pkts, > > > + uint16_t nb_pkts); > > > +/**< @internal Send packets on a queue of a device. */ > > > + > > > +#define RTE_PKT_DEV_HDR(structname) struct { \ > > > + pkt_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ \ > > > + pkt_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ \ > > > + struct structname ## _data *data; /**< Pointer to device data */ \ > > > +} > > > + > > > +#define RTE_PKT_DEV_DATA_HDR struct { \ > > > + char name[RTE_PKT_NAME_MAX_LEN]; /**< Unique identifier name */ \ > > > +\ > > > + void **rx_queues; /**< Array of pointers to RX queues. */ \ > > > + void **tx_queues; /**< Array of pointers to TX queues. */ \ > > > + uint16_t nb_rx_queues; /**< Number of RX queues. */ \ > > > + uint16_t nb_tx_queues; /**< Number of TX queues. */ \ > > > +} > > > + > > > +struct rte_pkt_dev { > > > + RTE_PKT_DEV_HDR(rte_pkt_dev); > > > +}; > > > + > > > +struct rte_pkt_dev_data { > > > + RTE_PKT_DEV_DATA_HDR; > > > +}; > > > + > > > +/** > > > + * > > > + * Retrieve a burst of input packets from a receive queue of a > > > + * device. The retrieved packets are stored in *rte_mbuf* structures > > > whose > > > + * pointers are supplied in the *rx_pkts* array. > > > + * > > > + * @param dev > > > + * The device to be polled for packets > > > + * @param queue_id > > > + * The index of the receive queue from which to retrieve input packets. > > > + * @param rx_pkts > > > + * The address of an array of pointers to *rte_mbuf* structures that > > > + * must be large enough to store *nb_pkts* pointers in it. > > > + * @param nb_pkts > > > + * The maximum number of packets to retrieve. > > > + * @return > > > + * The number of packets actually retrieved, which is the number > > > + * of pointers to *rte_mbuf* structures effectively supplied to the > > > + * *rx_pkts* array. > > > + */ > > > +static inline uint16_t > > > +rte_pkt_rx_burst(struct rte_pkt_dev *dev, uint16_t queue_id, > > > + struct rte_mbuf **rx_pkts, uint16_t nb_pkts) > > > +{ > > > + return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], > > > + rx_pkts, nb_pkts); > > > +} > > > + > > > +/** > > > + * Send a burst of output packets on a transmit queue of a device. > > > + * > > > + * @param dev > > > + * The device to be given the packets. > > > + * @param queue_id > > > + * The index of the queue through which output packets must be sent. > > > + * @param tx_pkts > > > + * The address of an array of *nb_pkts* pointers to *rte_mbuf* > > > structures > > > + * which contain the output packets. > > > + * @param nb_pkts > > > + * The maximum number of packets to transmit. > > > + * @return > > > + * The number of output packets actually stored in transmit > > > descriptors of > > > + * the transmit ring. The return value can be less than the value of > > > the > > > + * *tx_pkts* parameter when the transmit ring is full or has been > > > filled up. > > > + */ > > > +static inline uint16_t > > > +rte_pkt_tx_burst(struct rte_pkt_dev *dev, uint16_t queue_id, > > > + struct rte_mbuf **tx_pkts, uint16_t nb_pkts) > > > +{ > > > + return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, > > > nb_pkts); > > > +} > > > > That one looks much more lightweight, then Keith one :) > > I have a question here: > > Why are you guys so confident, that all foreseeable devices would fit into > > current eth_dev rx_burst/tx_burst API? > > As I understand, QAT devices have HW request/response ring pairs, so the > > idea probably is to make tx_burst() > > populate an request ring and rx_burst() to read from response ring, right? > > Though right now rte_mbuf contains a lot of flags and data fields that are > > specific for ethdev, and probably have no sense for crypto dev. > > From other side, for QAT devices, I suppose you'll need crypto specific > > flags and data: > > encrypt/decrypt, source and destination buffer addresses, some cipher > > specific data, etc. > > Wonder do you plan to fit all that into current rte_mbuf structure, or do > > you plan to have some superset structure, > > that would consist of rte_mbuf plus some extra stuff, or ... ? > > Konstantin > > > > > > From my point of view, the purpose of a pktdev API to to define a common API > for devices that read/write (i.e. RX/TX) mbufs. If other devices don't fit > that > model, then they don't use it. However, we already have a number of different > device types that already read/write mbufs, and there are likely more in > future, > so it would be nice to have a common API that can be used across those types > so > the data path can be source-neutral when processing mbufs. So far, we have > three proposals on how that might be achieved. :-)
Ok, probably I misunderstood previous RFCs about pktdev. So pktdev is not going to be a generic abstraction for any foreseeable device types: eth, crypto, compress, dpi, etc? It would represent only subset of dev types: real eth devices, plus SW emulated ones that can mimic rx_burst/tx_burst, but don't support any other eth specific ops (kni, pmd_ring)? Basically what we support right now? And for future device types to support (crypto, etc) there would be something else? > > As for the crypto, there is still a lot of investigation work to be done here, > and Keith is probably better placed than me to comment on it. But if you look > at any > application using DPDK for packet processing, the mbuf is still going to be at > the heart of it, so whatever the actual crypto API is like, it's going to have > to have to fit into an "mbuf world" anyway. > If the crypto-dev APIs don't work > with mbufs, then it's pushing work onto the app (or higher level libs) to > extract > the meaningful fields from the mbuf to pass to the dev. As I understand application (or some upper layer lib) would have to do that work anyway: RX packet, find and read related to this packet crypto data, and pass all that info to the crypto dev, right? I can hardly imagine, that in reality you always would be able to do: n = rx_burst(eth_dev, pkts, n); n = tx_burst(crypto_dev, pkts, n); without any processing of incoming packets in between. The difference is only how that information will be passed into crypto device: Would you need to to update an mbuf, or fill some other structure. > Therefore, in a DPDK > library, I would think it better to have the dev work with mbufs directly, as > the rest of DPDK does. I wouldn't say that all DPDK libs accepts data as mbufs only. Only those where it makes sense: PMDs, ip_frag/reassembly, pipeline fw. Let say rte_hash and rte_acl accept input data in a neutral way. You wouldn't expect rte_hash_lookup() or rte_acl_classify() to accept pointer(s) to the mbuf. if you'll have a SW implemented encrypt()/decrypt() routines, you probably wouldn't require an mbuf as input parameter either. Wouldn't it be more plausible to create a new struct rte_crypto_buf (or something) to pass information to/from crypto devices. Then we can provide for it an ability to attach/detach to rte_mbuf(s) buf_addr(s). That way we can probably make crypto_buf to attach to 2 mbufs at once (source and destination). Konstantin > For the extra data, my suggestion would be to have the > mbuf store a pointer to a crypto context, rather than trying to store it > directly > in the mbuf itself - there's no way we'd fit a crypto key in there! - and > the context would be shared among a set of packets rather than per-mbuf. > > /Bruce