Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
On 3/12/21 8:46 PM, Thomas Monjalon wrote:
> Since rte_flow is the only API for filtering operations,
> the legacy driver interface filter_ctrl was much too complicated
> for the simple task of getting the struct rte_flow_ops.
>
> The filter type RTE_ETH_FILTER_GENERIC and
> the filter operation RTE_ETH_FILTER_GET are removed.
> The new driver callback flow_ops_get replaces filter_ctrl.
>
> Signed-off-by: Thomas Monjalon

[snip]

> diff --git a/lib/librte_ethdev/rte_eth_ctrl.h b/lib/librte_ethdev/rte_eth_ctrl.h
> index 8a50dbfef9..42652f9cce 100644
> --- a/lib/librte_ethdev/rte_eth_ctrl.h
> +++ b/lib/librte_ethdev/rte_eth_ctrl.h
> @@ -339,7 +339,7 @@ struct rte_eth_fdir_action {
>  };
>
>  /**
> - * A structure used to define the flow director filter entry by filter_ctrl API.
> + * A structure used to define the flow director filter entry.
>   */
>  struct rte_eth_fdir_filter {
>  	uint32_t soft_id;
> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> index 241af6c4ca..1a896e3e64 100644
> --- a/lib/librte_ethdev/rte_flow.c
> +++ b/lib/librte_ethdev/rte_flow.c
> @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
>
>  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
>  		code = ENODEV;
> -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> -			  dev->dev_ops->filter_ctrl(dev,
> -						    RTE_ETH_FILTER_GENERIC,
> -						    RTE_ETH_FILTER_GET,
> -						    &ops) ||
> -			  !ops))
> -		code = ENOSYS;
> +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
> +		code = ENOTSUP;
>  	else
> -		return ops;
> -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> -			   NULL, rte_strerror(code));
> -	return NULL;
> +		code = dev->dev_ops->flow_ops_get(dev, &ops);
> +	if (code == 0 && ops == NULL)
> +		code = EACCES;

It looks like something new. I think it should be mentioned in the
flow_ops_get type documentation (similar to eth_promiscuous_enable_t)
and in the return value descriptions of rte_flow_validate() and the
other functions.

> +
> +	if (code != 0) {
> +		rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +				   NULL, rte_strerror(code));
> +		return NULL;
> +	}
> +	return ops;
>  }
>
>  /* Check whether a flow rule can be created on a given port. */

[snip]
Re: [dpdk-dev] [PATCH v2 2/2] drivers/net: remove explicit include of legacy filtering
For dpaa2:
Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
For dpaa2:
Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
15/03/2021 08:18, Andrew Rybchenko:
> On 3/12/21 8:46 PM, Thomas Monjalon wrote:
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
> >
> >  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> >  		code = ENODEV;
> > -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> > -			  dev->dev_ops->filter_ctrl(dev,
> > -						    RTE_ETH_FILTER_GENERIC,
> > -						    RTE_ETH_FILTER_GET,
> > -						    &ops) ||
> > -			  !ops))
> > -		code = ENOSYS;
> > +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
> > +		code = ENOTSUP;
> >  	else
> > -		return ops;
> > -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> > -			   NULL, rte_strerror(code));
> > -	return NULL;
> > +		code = dev->dev_ops->flow_ops_get(dev, &ops);
> > +	if (code == 0 && ops == NULL)
> > +		code = EACCES;
>
> It looks something new. I think it should be mentioned in flow_ops_get
> type documentation (similar to eth_promiscuous_enable_t) and
> rte_flow_validate() etc functions
> return values description.

It is an internal function used only in rte_flow.c.
The real consequence is that rte_errno is set in a lot of rte_flow API
functions.
I am not sure there is a good way to document the error code details.
The other codes are not documented in rte_flow.h.
[dpdk-dev] [PATCH v1 1/1] net/hinic: fix coredump when PMD used by fstack
fstack uses a secondary process to access the memory of eth_dev_ops
and wants to get the device info, but the hinic driver does not
initialize dev_ops in a secondary process.

Fixes: 66f64dd6dc86 ("net/hinic: fix secondary process")
Cc: sta...@dpdk.org

Signed-off-by: Guoyang Zhou
---
 drivers/net/hinic/base/hinic_compat.h | 25 -
 drivers/net/hinic/hinic_pmd_ethdev.c  |  5 +
 2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/net/hinic/base/hinic_compat.h b/drivers/net/hinic/base/hinic_compat.h
index 6dd210e..aea3320 100644
--- a/drivers/net/hinic/base/hinic_compat.h
+++ b/drivers/net/hinic/base/hinic_compat.h
@@ -171,6 +171,7 @@ static inline u32 readl(const volatile void *addr)
 #else
 #define CLOCK_TYPE CLOCK_MONOTONIC
 #endif
+#define HINIC_MUTEX_TIMEOUT 10

 static inline unsigned long clock_gettime_ms(void)
 {
@@ -225,24 +226,14 @@ static inline int hinic_mutex_destroy(pthread_mutex_t *pthreadmutex)
 static inline int hinic_mutex_lock(pthread_mutex_t *pthreadmutex)
 {
 	int err;
+	struct timespec tout;

-	err = pthread_mutex_lock(pthreadmutex);
-	if (!err) {
-		return err;
-	} else if (err == EOWNERDEAD) {
-		PMD_DRV_LOG(ERR, "Mutex lock failed. (ErrorNo=%d)", errno);
-#if defined(__GLIBC__)
-#if __GLIBC_PREREQ(2, 12)
-		(void)pthread_mutex_consistent(pthreadmutex);
-#else
-		(void)pthread_mutex_consistent_np(pthreadmutex);
-#endif
-#else
-		(void)pthread_mutex_consistent(pthreadmutex);
-#endif
-	} else {
-		PMD_DRV_LOG(ERR, "Mutex lock failed. (ErrorNo=%d)", errno);
-	}
+	(void)clock_gettime(CLOCK_TYPE, &tout);
+
+	tout.tv_sec += HINIC_MUTEX_TIMEOUT;
+	err = pthread_mutex_timedlock(pthreadmutex, &tout);
+	if (err)
+		PMD_DRV_LOG(ERR, "Mutex lock failed. (ErrorNo=%d)", err);

 	return err;
 }
diff --git a/drivers/net/hinic/hinic_pmd_ethdev.c b/drivers/net/hinic/hinic_pmd_ethdev.c
index 1d6b710..057e7b1 100644
--- a/drivers/net/hinic/hinic_pmd_ethdev.c
+++ b/drivers/net/hinic/hinic_pmd_ethdev.c
@@ -3085,6 +3085,10 @@ static int hinic_dev_close(struct rte_eth_dev *dev)
 	.filter_ctrl = hinic_dev_filter_ctrl,
 };

+static const struct eth_dev_ops hinic_dev_sec_ops = {
+	.dev_infos_get = hinic_dev_infos_get,
+};
+
 static int hinic_func_init(struct rte_eth_dev *eth_dev)
 {
 	struct rte_pci_device *pci_dev;
@@ -3099,6 +3103,7 @@ static int hinic_func_init(struct rte_eth_dev *eth_dev)
 	/* EAL is SECONDARY and eth_dev is already created */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		eth_dev->dev_ops = &hinic_dev_sec_ops;
 		PMD_DRV_LOG(INFO, "Initialize %s in secondary process",
 			    eth_dev->data->name);
--
1.8.3.1
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
On 3/15/21 10:54 AM, Thomas Monjalon wrote:
> 15/03/2021 08:18, Andrew Rybchenko:
>> On 3/12/21 8:46 PM, Thomas Monjalon wrote:
>>> --- a/lib/librte_ethdev/rte_flow.c
>>> +++ b/lib/librte_ethdev/rte_flow.c
>>> @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
>>>
>>>  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
>>>  		code = ENODEV;
>>> -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
>>> -			  dev->dev_ops->filter_ctrl(dev,
>>> -						    RTE_ETH_FILTER_GENERIC,
>>> -						    RTE_ETH_FILTER_GET,
>>> -						    &ops) ||
>>> -			  !ops))
>>> -		code = ENOSYS;
>>> +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
>>> +		code = ENOTSUP;
>>>  	else
>>> -		return ops;
>>> -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
>>> -			   NULL, rte_strerror(code));
>>> -	return NULL;
>>> +		code = dev->dev_ops->flow_ops_get(dev, &ops);
>>> +	if (code == 0 && ops == NULL)
>>> +		code = EACCES;
>> It looks something new. I think it should be mentioned in flow_ops_get
>> type documentation (similar to eth_promiscuous_enable_t) and
>> rte_flow_validate() etc functions
>> return values description.
> It is an internal function used only in rte_flow.c.
> The real consequence is to set rte_errno in a lot of rte_flow API.
> Not sure there is a good way to document the code details.
> Other codes are not documented in rte_flow.h

First of all, it is a behaviour of the flow_ops_get callback, and
driver developers should know that it is legal to return 0 with
ops==NULL and know what it means.

Second, it is visible as a return value of rte_flow_validate() (and of
other functions which use rte_flow_ops_get()) which has a special
meaning. So, it should be documented.
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
15/03/2021 09:43, Andrew Rybchenko:
> On 3/15/21 10:54 AM, Thomas Monjalon wrote:
> > 15/03/2021 08:18, Andrew Rybchenko:
> >> On 3/12/21 8:46 PM, Thomas Monjalon wrote:
> >>> --- a/lib/librte_ethdev/rte_flow.c
> >>> +++ b/lib/librte_ethdev/rte_flow.c
> >>> @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
> >>>
> >>>  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> >>>  		code = ENODEV;
> >>> -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> >>> -			  dev->dev_ops->filter_ctrl(dev,
> >>> -						    RTE_ETH_FILTER_GENERIC,
> >>> -						    RTE_ETH_FILTER_GET,
> >>> -						    &ops) ||
> >>> -			  !ops))
> >>> -		code = ENOSYS;
> >>> +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
> >>> +		code = ENOTSUP;
> >>>  	else
> >>> -		return ops;
> >>> -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> >>> -			   NULL, rte_strerror(code));
> >>> -	return NULL;
> >>> +		code = dev->dev_ops->flow_ops_get(dev, &ops);
> >>> +	if (code == 0 && ops == NULL)
> >>> +		code = EACCES;
> >> It looks something new. I think it should be mentioned in flow_ops_get
> >> type documentation (similar to eth_promiscuous_enable_t) and
> >> rte_flow_validate() etc functions
> >> return values description.
> >
> > It is an internal function used only in rte_flow.c.
> > The real consequence is to set rte_errno in a lot of rte_flow API.
> > Not sure there is a good way to document the code details.
> > Other codes are not documented in rte_flow.h
>
> First of all it is a behaviour of the flow_ops_get callback and
> driver developers should know that it is a legal to return 0 and
> ops==NULL and know what it means.

The combination of code 0 and NULL ops is not new.
Previously, it was returning ENOSYS.
I've just given it a more meaningful error code, EACCES,
while replacing ENOSYS with ENOTSUP for the other case.

> Second, it is visible as rte_flow_validate() (and other functions
> which use rte_flow_ops_get()) return value value which has
> special meaning. So, should be documented.

Yes, I should update the API doc where ENOSYS was mentioned.
Or probably better: I should keep the error code ENOSYS
and not break the API.
Preference?
[dpdk-dev] [RFC] net/i40e: change the timing of FDIR input set configuration
The configuration of FDIR input set should not be set during flow validate. It should be set when flow create. Signed-off-by: Murphy Yang --- drivers/net/i40e/i40e_ethdev.h | 1 + drivers/net/i40e/i40e_fdir.c | 88 +++ drivers/net/i40e/i40e_flow.c | 95 +++--- 3 files changed, 96 insertions(+), 88 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h index 1e8f5d3a87..c6ec071f44 100644 --- a/drivers/net/i40e/i40e_ethdev.h +++ b/drivers/net/i40e/i40e_ethdev.h @@ -631,6 +631,7 @@ struct i40e_fdir_flow_ext { uint8_t raw_id; uint8_t is_vf; /* 1 for VF, 0 for port dev */ uint16_t dst_id; /* VF ID, available when is_vf is 1*/ + uint64_t input_set; bool inner_ip; /* If there is inner ip */ enum i40e_fdir_ip_type iip_type; /* ip type for inner ip */ enum i40e_fdir_ip_type oip_type; /* ip type for outer ip */ diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c index c572d003cb..af0c00de04 100644 --- a/drivers/net/i40e/i40e_fdir.c +++ b/drivers/net/i40e/i40e_fdir.c @@ -1588,6 +1588,83 @@ i40e_flow_set_fdir_flex_msk(struct i40e_pf *pf, pf->fdir.flex_mask_flag[pctype] = 1; } +static int +i40e_flow_set_fdir_inset(struct i40e_pf *pf, +enum i40e_filter_pctype pctype, +uint64_t input_set) +{ + uint32_t mask_reg[I40E_INSET_MASK_NUM_REG] = {0}; + struct i40e_hw *hw = I40E_PF_TO_HW(pf); + uint64_t inset_reg = 0; + int i, num; + + /* Check if the input set is valid */ + if (i40e_validate_input_set(pctype, RTE_ETH_FILTER_FDIR, + input_set) != 0) { + PMD_DRV_LOG(ERR, "Invalid input set"); + return -EINVAL; + } + + /* Check if the configuration is conflicted */ + if (pf->fdir.inset_flag[pctype] && + memcmp(&pf->fdir.input_set[pctype], &input_set, sizeof(uint64_t))) + return -1; + + if (pf->fdir.inset_flag[pctype] && + !memcmp(&pf->fdir.input_set[pctype], &input_set, sizeof(uint64_t))) + return 0; + + num = i40e_generate_inset_mask_reg(input_set, mask_reg, + I40E_INSET_MASK_NUM_REG); + if (num < 0) + return -EINVAL; + + if 
(pf->support_multi_driver) { + for (i = 0; i < num; i++) + if (i40e_read_rx_ctl(hw, + I40E_GLQF_FD_MSK(i, pctype)) != + mask_reg[i]) { + PMD_DRV_LOG(ERR, "Input set setting is not" + " supported with" + " `support-multi-driver`" + " enabled!"); + return -EPERM; + } + for (i = num; i < I40E_INSET_MASK_NUM_REG; i++) + if (i40e_read_rx_ctl(hw, + I40E_GLQF_FD_MSK(i, pctype)) != 0) { + PMD_DRV_LOG(ERR, "Input set setting is not" + " supported with" + " `support-multi-driver`" + " enabled!"); + return -EPERM; + } + + } else { + for (i = 0; i < num; i++) + i40e_check_write_reg(hw, I40E_GLQF_FD_MSK(i, pctype), + mask_reg[i]); + /*clear unused mask registers of the pctype */ + for (i = num; i < I40E_INSET_MASK_NUM_REG; i++) + i40e_check_write_reg(hw, + I40E_GLQF_FD_MSK(i, pctype), 0); + } + + inset_reg |= i40e_translate_input_set_reg(hw->mac.type, input_set); + + i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 0), +(uint32_t)(inset_reg & UINT32_MAX)); + i40e_check_write_reg(hw, I40E_PRTQF_FD_INSET(pctype, 1), +(uint32_t)((inset_reg >> +I40E_32_BIT_WIDTH) & UINT32_MAX)); + + I40E_WRITE_FLUSH(hw); + + pf->fdir.input_set[pctype] = input_set; + pf->fdir.inset_flag[pctype] = 1; + return 0; +} + static inline unsigned char * i40e_find_available_buffer(struct rte_eth_dev *dev) { @@ -1686,6 +1763,17 @@ i40e_flow_add_del_fdir_filter(struct rte_eth_dev *dev, if (add) { if (filter->input.flow_ext.is_flex_flow) { + ret = i40e_flow_set_fdir_inset(pf, pctype, + filter->input.flow_ext.input_set); + if (ret == -1) { +
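The conflict handling at the top of i40e_flow_set_fdir_inset() in the RFC above has three outcomes: a different input set already configured for the pctype is a conflict (-1), an identical one is a no-op (0), and otherwise the input set is programmed and cached. A minimal sketch of just that decision follows. The names sketch_fdir_state, sketch_set_fdir_inset and SKETCH_PCTYPE_MAX are hypothetical, register programming is replaced by a counter, and the validation and multi-driver paths are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PCTYPE_MAX 64 /* assumed bound, for illustration only */

/* Hypothetical per-port cache of configured FDIR input sets. */
struct sketch_fdir_state {
	uint64_t input_set[SKETCH_PCTYPE_MAX];
	bool inset_flag[SKETCH_PCTYPE_MAX];
	unsigned int hw_writes; /* stands in for register programming */
};

/*
 * Returns 0 when the input set is (or already was) configured,
 * -1 when a different input set is already active for this pctype,
 * -EINVAL for an out-of-range pctype.
 */
static int
sketch_set_fdir_inset(struct sketch_fdir_state *st, unsigned int pctype,
		      uint64_t input_set)
{
	if (pctype >= SKETCH_PCTYPE_MAX)
		return -EINVAL;

	if (st->inset_flag[pctype]) {
		/* Plain comparison; the RFC uses memcmp() on the uint64_t. */
		return st->input_set[pctype] == input_set ? 0 : -1;
	}

	st->hw_writes++; /* the real code writes mask and inset registers */
	st->input_set[pctype] = input_set;
	st->inset_flag[pctype] = true;
	return 0;
}
```

Doing this at flow creation rather than at validation, as the RFC proposes, means that a validate call leaves this cached state untouched.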
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
On 3/15/21 11:55 AM, Thomas Monjalon wrote:
> 15/03/2021 09:43, Andrew Rybchenko:
>> On 3/15/21 10:54 AM, Thomas Monjalon wrote:
>>> 15/03/2021 08:18, Andrew Rybchenko:
>>>> On 3/12/21 8:46 PM, Thomas Monjalon wrote:
>>>>> --- a/lib/librte_ethdev/rte_flow.c
>>>>> +++ b/lib/librte_ethdev/rte_flow.c
>>>>> @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
>>>>>
>>>>>  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
>>>>>  		code = ENODEV;
>>>>> -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
>>>>> -			  dev->dev_ops->filter_ctrl(dev,
>>>>> -						    RTE_ETH_FILTER_GENERIC,
>>>>> -						    RTE_ETH_FILTER_GET,
>>>>> -						    &ops) ||
>>>>> -			  !ops))
>>>>> -		code = ENOSYS;
>>>>> +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
>>>>> +		code = ENOTSUP;

It is described as:
  -ENOTSUP: valid but unsupported rule specification (e.g.
  partial bit-masks are unsupported).
So, it looks different. Maybe it is really better to keep
ENOSYS.

>>>>>  	else
>>>>> -		return ops;
>>>>> -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
>>>>> -			   NULL, rte_strerror(code));
>>>>> -	return NULL;
>>>>> +		code = dev->dev_ops->flow_ops_get(dev, &ops);
>>>>> +	if (code == 0 && ops == NULL)
>>>>> +		code = EACCES;
>>>> It looks something new. I think it should be mentioned in flow_ops_get
>>>> type documentation (similar to eth_promiscuous_enable_t) and
>>>> rte_flow_validate() etc functions
>>>> return values description.
>>>
>>> It is an internal function used only in rte_flow.c.
>>> The real consequence is to set rte_errno in a lot of rte_flow API.
>>> Not sure there is a good way to document the code details.
>>> Other codes are not documented in rte_flow.h
>>
>> First of all it is a behaviour of the flow_ops_get callback and
>> driver developers should know that it is a legal to return 0 and
>> ops==NULL and know what it means.
>
> The combination code 0 and ops NULL is not new.
> Previously, it was returning ENOSYS.
> I've just given a more meaningful error code: EACCES,
> while replacing ENOSYS with ENOTSUP for the other case.

Yes, exactly. What I'm trying to say is that it would be
helpful to make it a bit more transparent to PMD developers.
Yes, it was not documented before, I agree. I think it is
a good time to improve the documentation.

>> Second, it is visible as rte_flow_validate() (and other functions
>> which use rte_flow_ops_get()) return value value which has
>> special meaning. So, should be documented.
>
> Yes, I should update the API doc where ENOSYS was mentioned.
> Or probably better: I should keep the error code ENOSYS
> and do not break API.
> Preference?

Good question. I think we should not distinguish between a NULL
callback and NULL ops returned by a non-NULL callback. So, I think
keeping ENOSYS is the best option here.
Re: [dpdk-dev] [PATCH v2 1/2] ethdev: replace callback getting filter operations
15/03/2021 10:08, Andrew Rybchenko:
> On 3/15/21 11:55 AM, Thomas Monjalon wrote:
> > 15/03/2021 09:43, Andrew Rybchenko:
> >> On 3/15/21 10:54 AM, Thomas Monjalon wrote:
> >>> 15/03/2021 08:18, Andrew Rybchenko:
> >>>> On 3/12/21 8:46 PM, Thomas Monjalon wrote:
> >>>>> --- a/lib/librte_ethdev/rte_flow.c
> >>>>> +++ b/lib/librte_ethdev/rte_flow.c
> >>>>> @@ -255,18 +255,19 @@ rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error)
> >>>>>
> >>>>>  	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> >>>>>  		code = ENODEV;
> >>>>> -	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> >>>>> -			  dev->dev_ops->filter_ctrl(dev,
> >>>>> -						    RTE_ETH_FILTER_GENERIC,
> >>>>> -						    RTE_ETH_FILTER_GET,
> >>>>> -						    &ops) ||
> >>>>> -			  !ops))
> >>>>> -		code = ENOSYS;
> >>>>> +	else if (unlikely(dev->dev_ops->flow_ops_get == NULL))
> >>>>> +		code = ENOTSUP;
>
> It is described as:
>   -ENOTSUP: valid but unsupported rule specification (e.g.
>   partial bit-masks are unsupported).
> So, it looks different. May be it is really better to keep
> ENOSYS.
>
> >>>>>  	else
> >>>>> -		return ops;
> >>>>> -	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> >>>>> -			   NULL, rte_strerror(code));
> >>>>> -	return NULL;
> >>>>> +		code = dev->dev_ops->flow_ops_get(dev, &ops);
> >>>>> +	if (code == 0 && ops == NULL)
> >>>>> +		code = EACCES;
> >>>> It looks something new. I think it should be mentioned in flow_ops_get
> >>>> type documentation (similar to eth_promiscuous_enable_t) and
> >>>> rte_flow_validate() etc functions
> >>>> return values description.
> >>>
> >>> It is an internal function used only in rte_flow.c.
> >>> The real consequence is to set rte_errno in a lot of rte_flow API.
> >>> Not sure there is a good way to document the code details.
> >>> Other codes are not documented in rte_flow.h
> >>
> >> First of all it is a behaviour of the flow_ops_get callback and
> >> driver developers should know that it is a legal to return 0 and
> >> ops==NULL and know what it means.
> >
> > The combination code 0 and ops NULL is not new.
> > Previously, it was returning ENOSYS.
> > I've just given a more meaningful error code: EACCES,
> > while replacing ENOSYS with ENOTSUP for the other case.
>
> Yes, exactly. What I'm trying to say that it would be
> helpful to make it a bit more transparent to PMD developers.
> Yes, it was not documented before, I agree. I think it is
> a good time to improve documentation.
>
> >> Second, it is visible as rte_flow_validate() (and other functions
> >> which use rte_flow_ops_get()) return value value which has
> >> special meaning. So, should be documented.
> >
> > Yes, I should update the API doc where ENOSYS was mentioned.
> > Or probably better: I should keep the error code ENOSYS
> > and do not break API.
> > Preference?
>
> Good question. I think we should not distinguish NULL callback
> and NULL ops returned by not-NULL callback. So, I think
> keeping ENOSYS is the best option here.

OK, thank you for the review.
So the conclusion is: keep ENOSYS and document the NULL ops case.
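The outcome agreed in this thread (keep ENOSYS, and treat NULL ops from a non-NULL callback the same as a missing callback) can be sketched roughly as below. This is an illustration under stated assumptions, not the final patch: every sketch_* type and function is a hypothetical stand-in for the real ethdev definitions, and error reporting through rte_flow_error_set() is reduced to an output parameter.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ethdev types; NOT the real DPDK structs. */
struct sketch_flow_ops {
	int placeholder;
};

struct sketch_eth_dev;
typedef int (*sketch_flow_ops_get_t)(struct sketch_eth_dev *dev,
				     const struct sketch_flow_ops **ops);

struct sketch_eth_dev {
	sketch_flow_ops_get_t flow_ops_get; /* may be NULL */
};

static const struct sketch_flow_ops sketch_example_ops = { 0 };

/* Example driver callback that provides a flow ops table. */
static int
sketch_drv_with_ops(struct sketch_eth_dev *dev,
		    const struct sketch_flow_ops **ops)
{
	(void)dev;
	*ops = &sketch_example_ops;
	return 0;
}

/* Example driver callback that succeeds but reports no ops (NULL). */
static int
sketch_drv_null_ops(struct sketch_eth_dev *dev,
		    const struct sketch_flow_ops **ops)
{
	(void)dev;
	*ops = NULL;
	return 0;
}

/* Returns the ops table, or NULL with *code set to a positive errno. */
static const struct sketch_flow_ops *
sketch_flow_ops_get(struct sketch_eth_dev *dev, int *code)
{
	const struct sketch_flow_ops *ops = NULL;

	if (dev == NULL)
		*code = ENODEV; /* stands in for the invalid port check */
	else if (dev->flow_ops_get == NULL)
		*code = ENOSYS; /* driver has no flow_ops_get callback */
	else
		*code = dev->flow_ops_get(dev, &ops);
	if (*code == 0 && ops == NULL)
		*code = ENOSYS; /* callback succeeded but returned no ops */

	return *code == 0 ? ops : NULL;
}
```

With this shape, a caller sees the same ENOSYS whether the driver lacks the callback or the callback reports no ops, which is the non-distinction Andrew asked for.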
[dpdk-dev] [PATCH 0/2] support block cipher DIGEST_ENCRYPTED mode
This series adds support for block cipher DIGEST_ENCRYPTED mode in
OCTEON TX, OCTEON TX2 PMDs and sample unit test application.

Tejasree Kondoj (2):
  common/cpt: support DIGEST_ENCRYPTED mode
  test/crypto: support block cipher DIGEST_ENCRYPTED mode

 app/test/test_cryptodev_aes_test_vectors.h    | 589 ++
 app/test/test_cryptodev_blockcipher.c         |  95 ++-
 app/test/test_cryptodev_blockcipher.h         |  10 +
 doc/guides/cryptodevs/features/octeontx.ini   |   1 +
 doc/guides/cryptodevs/features/octeontx2.ini  |   1 +
 doc/guides/rel_notes/release_21_05.rst        |   8 +
 drivers/common/cpt/cpt_mcode_defines.h        |   7 +-
 drivers/common/cpt/cpt_ucode.h                |  42 +-
 drivers/crypto/octeontx/otx_cryptodev_ops.c   |  11 +-
 drivers/crypto/octeontx2/otx2_cryptodev.c     |   3 +-
 drivers/crypto/octeontx2/otx2_cryptodev_ops.c |   8 +-
 11 files changed, 749 insertions(+), 26 deletions(-)

--
2.27.0
[dpdk-dev] [PATCH 1/2] common/cpt: support DIGEST_ENCRYPTED mode
Adding support for DIGEST_ENCRYPTED mode. Signed-off-by: Tejasree Kondoj --- doc/guides/cryptodevs/features/octeontx.ini | 1 + doc/guides/cryptodevs/features/octeontx2.ini | 1 + doc/guides/rel_notes/release_21_05.rst| 8 drivers/common/cpt/cpt_mcode_defines.h| 7 +++- drivers/common/cpt/cpt_ucode.h| 42 +++ drivers/crypto/octeontx/otx_cryptodev_ops.c | 11 +++-- drivers/crypto/octeontx2/otx2_cryptodev.c | 3 +- drivers/crypto/octeontx2/otx2_cryptodev_ops.c | 8 +++- 8 files changed, 67 insertions(+), 14 deletions(-) diff --git a/doc/guides/cryptodevs/features/octeontx.ini b/doc/guides/cryptodevs/features/octeontx.ini index 10d94e3f7b..d9776a5788 100644 --- a/doc/guides/cryptodevs/features/octeontx.ini +++ b/doc/guides/cryptodevs/features/octeontx.ini @@ -13,6 +13,7 @@ OOP SGL In LB Out = Y OOP SGL In SGL Out = Y OOP LB In LB Out = Y RSA PRIV OP KEY QT = Y +Digest encrypted = Y Symmetric sessionless = Y ; diff --git a/doc/guides/cryptodevs/features/octeontx2.ini b/doc/guides/cryptodevs/features/octeontx2.ini index b0d50ce984..66c5fefde6 100644 --- a/doc/guides/cryptodevs/features/octeontx2.ini +++ b/doc/guides/cryptodevs/features/octeontx2.ini @@ -14,6 +14,7 @@ OOP SGL In LB Out = Y OOP SGL In SGL Out = Y OOP LB In LB Out = Y RSA PRIV OP KEY QT = Y +Digest encrypted = Y Symmetric sessionless = Y ; diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst index 23f7f0bff9..d7c65091a9 100644 --- a/doc/guides/rel_notes/release_21_05.rst +++ b/doc/guides/rel_notes/release_21_05.rst @@ -65,6 +65,14 @@ New Features * Added support for txgbevf PMD. +* **Updated the OCTEON TX crypto PMD.** + + * Added support for DIGEST_ENCRYPTED mode in OCTEON TX crypto PMD. + +* **Updated the OCTEON TX2 crypto PMD.** + + * Added support for DIGEST_ENCRYPTED mode in OCTEON TX2 crypto PMD. + * **Updated testpmd.** * Added command to display Rx queue used descriptor count. 
diff --git a/drivers/common/cpt/cpt_mcode_defines.h b/drivers/common/cpt/cpt_mcode_defines.h index 56a745f419..624bdcf3cf 100644 --- a/drivers/common/cpt/cpt_mcode_defines.h +++ b/drivers/common/cpt/cpt_mcode_defines.h @@ -20,6 +20,9 @@ #define CPT_MAJOR_OP_ZUC_SNOW3G0x37 #define CPT_MAJOR_OP_KASUMI0x38 #define CPT_MAJOR_OP_MISC 0x01 +#define CPT_HMAC_FIRST_BIT_POS 0x4 +#define CPT_FC_MINOR_OP_ENCRYPT0x0 +#define CPT_FC_MINOR_OP_DECRYPT0x1 /* AE opcodes */ #define CPT_MAJOR_OP_MODEX 0x03 @@ -314,8 +317,10 @@ struct cpt_ctx { uint64_t hmac :1; uint64_t zsk_flags :3; uint64_t k_ecb :1; + uint64_t auth_enc :1; + uint64_t dec_auth :1; uint64_t snow3g :2; - uint64_t rsvd :21; + uint64_t rsvd :19; /* Below fields are accessed by hardware */ union { mc_fc_context_t fctx; diff --git a/drivers/common/cpt/cpt_ucode.h b/drivers/common/cpt/cpt_ucode.h index 0536620710..ee6d49aae7 100644 --- a/drivers/common/cpt/cpt_ucode.h +++ b/drivers/common/cpt/cpt_ucode.h @@ -752,7 +752,9 @@ cpt_enc_hmac_prep(uint32_t flags, /* Encryption */ vq_cmd_w0.s.opcode.major = CPT_MAJOR_OP_FC; - vq_cmd_w0.s.opcode.minor = 0; + vq_cmd_w0.s.opcode.minor = CPT_FC_MINOR_OP_ENCRYPT; + vq_cmd_w0.s.opcode.minor |= (cpt_ctx->auth_enc << + CPT_HMAC_FIRST_BIT_POS); if (hash_type == GMAC_TYPE) { encr_offset = 0; @@ -779,6 +781,9 @@ cpt_enc_hmac_prep(uint32_t flags, outputlen = enc_dlen + mac_len; } + if (cpt_ctx->auth_enc != 0) + outputlen = enc_dlen; + /* GP op header */ vq_cmd_w0.s.param1 = encr_data_len; vq_cmd_w0.s.param2 = auth_data_len; @@ -1112,7 +1117,9 @@ cpt_dec_hmac_prep(uint32_t flags, /* Decryption */ vq_cmd_w0.s.opcode.major = CPT_MAJOR_OP_FC; - vq_cmd_w0.s.opcode.minor = 1; + vq_cmd_w0.s.opcode.minor = CPT_FC_MINOR_OP_DECRYPT; + vq_cmd_w0.s.opcode.minor |= (cpt_ctx->dec_auth << + CPT_HMAC_FIRST_BIT_POS); if (hash_type == GMAC_TYPE) { encr_offset = 0; @@ -1130,6 +1137,9 @@ cpt_dec_hmac_prep(uint32_t flags, outputlen = enc_dlen; } + if (cpt_ctx->dec_auth != 0) + outputlen = inputlen = enc_dlen; + 
vq_cmd_w0.s.param1 = encr_data_len; vq_cmd_w0.s.param2 = auth_data_len; @@ -2566,6 +2576,7 @@ fill_sess_cipher(struct rte_crypto_sym_xform *xform, struct cpt_sess_misc *sess) { struct rte_crypto_cipher_xform *c_form; + struct cpt_ctx *ctx = SESS_PRIV(sess); cipher_type_t enc_type = 0; /* NULL Cipher type */ uint32_t cipher_key_len = 0; ui
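In the cpt_ucode.h hunks above, the flow-control minor opcode is built from the direction (encrypt or decrypt) OR-ed with a one-bit flag shifted to CPT_HMAC_FIRST_BIT_POS. A minimal sketch of that composition follows. The three macros are copied from the patch, while sketch_fc_minor_opcode() is a hypothetical helper, and reading bit 4 as "digest handled first" is an interpretation of the macro name, not something the patch states.

```c
#include <stdint.h>

/* Values as defined in the patch's cpt_mcode_defines.h hunk. */
#define CPT_HMAC_FIRST_BIT_POS  0x4
#define CPT_FC_MINOR_OP_ENCRYPT 0x0
#define CPT_FC_MINOR_OP_DECRYPT 0x1

/*
 * Hypothetical helper: combine the direction with the one-bit
 * auth_enc/dec_auth flag, as the patch does inline on opcode.minor.
 */
static uint8_t
sketch_fc_minor_opcode(uint8_t direction, unsigned int hmac_first)
{
	/* Mask to one bit, mirroring the ":1" bitfields in struct cpt_ctx. */
	return (uint8_t)(direction |
			 ((hmac_first & 1u) << CPT_HMAC_FIRST_BIT_POS));
}
```

For example, sketch_fc_minor_opcode(CPT_FC_MINOR_OP_DECRYPT, 1) gives 0x11, and with the flag cleared the result is the plain direction value that the old code hard-coded as 0 or 1.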
[dpdk-dev] [PATCH 2/2] test/crypto: support block cipher DIGEST_ENCRYPTED mode
Adding support for block cipher DIGEST_ENCRYPTED mode. Signed-off-by: Tejasree Kondoj --- app/test/test_cryptodev_aes_test_vectors.h | 589 + app/test/test_cryptodev_blockcipher.c | 95 +++- app/test/test_cryptodev_blockcipher.h | 10 + 3 files changed, 682 insertions(+), 12 deletions(-) diff --git a/app/test/test_cryptodev_aes_test_vectors.h b/app/test/test_cryptodev_aes_test_vectors.h index c192d75a7e..7755b271c2 100644 --- a/app/test/test_cryptodev_aes_test_vectors.h +++ b/app/test/test_cryptodev_aes_test_vectors.h @@ -1093,6 +1093,172 @@ static const uint8_t ciphertext512_aes128cbc_aad[] = { 0x73, 0x65, 0x72, 0x73, 0x2C, 0x20, 0x73, 0x75 }; +static const uint8_t plaintext_aes_common_digest_enc[] = { + 0x57, 0x68, 0x61, 0x74, 0x20, 0x61, 0x20, 0x6C, + 0x6F, 0x75, 0x73, 0x79, 0x20, 0x65, 0x61, 0x72, + 0x74, 0x68, 0x21, 0x20, 0x48, 0x65, 0x20, 0x77, + 0x6F, 0x6E, 0x64, 0x65, 0x72, 0x65, 0x64, 0x20, + 0x68, 0x6F, 0x77, 0x20, 0x6D, 0x61, 0x6E, 0x79, + 0x20, 0x70, 0x65, 0x6F, 0x70, 0x6C, 0x65, 0x20, + 0x77, 0x65, 0x72, 0x65, 0x20, 0x64, 0x65, 0x73, + 0x74, 0x69, 0x74, 0x75, 0x74, 0x65, 0x20, 0x74, + 0x68, 0x61, 0x74, 0x20, 0x73, 0x61, 0x6D, 0x65, + 0x20, 0x6E, 0x69, 0x67, 0x68, 0x74, 0x20, 0x65, + 0x76, 0x65, 0x6E, 0x20, 0x69, 0x6E, 0x20, 0x68, + 0x69, 0x73, 0x20, 0x6F, 0x77, 0x6E, 0x20, 0x70, + 0x72, 0x6F, 0x73, 0x70, 0x65, 0x72, 0x6F, 0x75, + 0x73, 0x20, 0x63, 0x6F, 0x75, 0x6E, 0x74, 0x72, + 0x79, 0x2C, 0x20, 0x68, 0x6F, 0x77, 0x20, 0x6D, + 0x61, 0x6E, 0x79, 0x20, 0x68, 0x6F, 0x6D, 0x65, + 0x73, 0x20, 0x77, 0x65, 0x72, 0x65, 0x20, 0x73, + 0x68, 0x61, 0x6E, 0x74, 0x69, 0x65, 0x73, 0x2C, + 0x20, 0x68, 0x6F, 0x77, 0x20, 0x6D, 0x61, 0x6E, + 0x79, 0x20, 0x68, 0x75, 0x73, 0x62, 0x61, 0x6E, + 0x64, 0x73, 0x20, 0x77, 0x65, 0x72, 0x65, 0x20, + 0x64, 0x72, 0x75, 0x6E, 0x6B, 0x20, 0x61, 0x6E, + 0x64, 0x20, 0x77, 0x69, 0x76, 0x65, 0x73, 0x20, + 0x73, 0x6F, 0x63, 0x6B, 0x65, 0x64, 0x2C, 0x20, + 0x61, 0x6E, 0x64, 0x20, 0x68, 0x6F, 0x77, 0x20, + 0x6D, 0x61, 0x6E, 0x79, 0x20, 0x63, 
0x68, 0x69, + 0x6C, 0x64, 0x72, 0x65, 0x6E, 0x20, 0x77, 0x65, + 0x72, 0x65, 0x20, 0x62, 0x75, 0x6C, 0x6C, 0x69, + 0x65, 0x64, 0x2C, 0x20, 0x61, 0x62, 0x75, 0x73, + 0x65, 0x64, 0x2C, 0x20, 0x6F, 0x72, 0x20, 0x61, + 0x62, 0x61, 0x6E, 0x64, 0x6F, 0x6E, 0x65, 0x64, + 0x2E, 0x20, 0x48, 0x6F, 0x77, 0x20, 0x6D, 0x61, + 0x6E, 0x79, 0x20, 0x66, 0x61, 0x6D, 0x69, 0x6C, + 0x69, 0x65, 0x73, 0x20, 0x68, 0x75, 0x6E, 0x67, + 0x65, 0x72, 0x65, 0x64, 0x20, 0x66, 0x6F, 0x72, + 0x20, 0x66, 0x6F, 0x6F, 0x64, 0x20, 0x74, 0x68, + 0x65, 0x79, 0x20, 0x63, 0x6F, 0x75, 0x6C, 0x64, + 0x20, 0x6E, 0x6F, 0x74, 0x20, 0x61, 0x66, 0x66, + 0x6F, 0x72, 0x64, 0x20, 0x74, 0x6F, 0x20, 0x62, + 0x75, 0x79, 0x3F, 0x20, 0x48, 0x6F, 0x77, 0x20, + 0x6D, 0x61, 0x6E, 0x79, 0x20, 0x68, 0x65, 0x61, + 0x72, 0x74, 0x73, 0x20, 0x77, 0x65, 0x72, 0x65, + 0x20, 0x62, 0x72, 0x6F, 0x6B, 0x65, 0x6E, 0x3F, + 0x20, 0x48, 0x6F, 0x77, 0x20, 0x6D, 0x61, 0x6E, + 0x79, 0x20, 0x73, 0x75, 0x69, 0x63, 0x69, 0x64, + 0x65, 0x73, 0x20, 0x77, 0x6F, 0x75, 0x6C, 0x64, + 0x20, 0x74, 0x61, 0x6B, 0x65, 0x20, 0x70, 0x6C, + 0x61, 0x63, 0x65, 0x20, 0x74, 0x68, 0x61, 0x74, + 0x20, 0x73, 0x61, 0x6D, 0x65, 0x20, 0x6E, 0x69, + 0x67, 0x68, 0x74, 0x2C, 0x20, 0x68, 0x6F, 0x77, + 0x20, 0x6D, 0x61, 0x6E, 0x79, 0x20, 0x70, 0x65, + 0x6F, 0x70, 0x6C, 0x65, 0x20, 0x77, 0x6F, 0x75, + 0x6C, 0x64, 0x20, 0x67, 0x6F, 0x20, 0x69, 0x6E, + 0x73, 0x61, 0x6E, 0x65, 0x3F, 0x20, 0x48, 0x6F, + 0x77, 0x20, 0x6D, 0x61, 0x6E, 0x79, 0x20, 0x63, + 0x6F, 0x63, 0x6B, 0x72, 0x6F, 0x61, 0x63, 0x68, + 0x65, 0x73, 0x20, 0x61, 0x6E, 0x64, 0x20, 0x6C, + 0x61, 0x6E, 0x64, 0x6C, 0x6F, 0x72, 0x64, 0x73, + 0x20, 0x77, 0x6F, 0x75, 0x6C, 0x64, 0x20, 0x74, + 0x72, 0x69, 0x75, 0x6D, 0x70, 0x68, 0x3F, 0x20, + 0x48, 0x6F, 0x77, 0x20, 0x6D, 0x61, 0x6E, 0x79, + 0x20, 0x77, 0x69, 0x6E, 0x6E, 0x65, 0x72, 0x73, + 0x20, 0x77, 0x65, 0x72, 0x65, 0x20, 0x6c, 0x6f, + 0x73, 0x65, 0x72, 0x73, 0x2c, 0x20, 0x73, 0x75, + /* mac */ + 0xC4, 0xB7, 0x0E, 0x6B, 0xDE, 0xD1, 0xE7, 0x77, + 0x7E, 0x2E, 0x8F, 0xFC, 
0x48, 0x39, 0x46, 0x17, + 0x3F, 0x91, 0x64, 0x59 +}; + +static const uint8_t ciphertext512_aes128cbc_digest_enc[] = { + 0x8B, 0x4D, 0xDA, 0x1B, 0xCF, 0x04, 0xA0, 0x31, + 0xB4, 0xBF, 0xBD, 0x68, 0x43, 0x20, 0x7E, 0x76, + 0xB1, 0x96, 0x8B, 0xA2, 0x7C, 0xA2, 0x83, 0x9E, + 0x39, 0x5A, 0x2F, 0x7E, 0x92, 0xB4, 0x48, 0x1A, + 0x3F, 0x6B, 0x5D, 0xDF, 0x52, 0x85, 0x5F, 0x8E, + 0x42, 0x3C, 0xFB, 0xE9, 0x1A, 0x24, 0xD6, 0x08, + 0xDD, 0xFD, 0x16, 0xFB, 0xE9, 0x55, 0xEF, 0xF0, + 0xA0, 0x8D, 0x13, 0xAB,
[dpdk-dev] [PATCH] crypto/octeontx2: remove redundant code
Removing redundant field in a union.

Signed-off-by: Tejasree Kondoj
---
 drivers/crypto/octeontx2/otx2_ipsec_po.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/crypto/octeontx2/otx2_ipsec_po.h b/drivers/crypto/octeontx2/otx2_ipsec_po.h
index 8a672a38ea..eda9f19738 100644
--- a/drivers/crypto/octeontx2/otx2_ipsec_po.h
+++ b/drivers/crypto/octeontx2/otx2_ipsec_po.h
@@ -203,7 +203,6 @@ struct otx2_ipsec_po_out_sa {
 	/* w8-w55 */
 	union {
-		uint8_t raw[384];
 		struct {
 			struct otx2_ipsec_po_ip_template template;
 		} aes_gcm;
--
2.27.0
[dpdk-dev] [PATCH 0/3] add lookaside IPsec UDP encapsulation and transport mode
This series adds lookaside IPsec UDP encapsulation and transport mode
support. The functionality has been tested with ipsec-secgw application
running in lookaside protocol offload mode.

Tejasree Kondoj (3):
  crypto/octeontx2: add UDP encapsulation support
  examples/ipsec-secgw: add UDP encapsulation support
  crypto/octeontx2: support lookaside IPv4 transport mode

 doc/guides/cryptodevs/octeontx2.rst           |   2 +
 doc/guides/rel_notes/release_21_05.rst        |  10 ++
 doc/guides/sample_app_ug/ipsec_secgw.rst      |   5 +-
 drivers/crypto/octeontx2/otx2_cryptodev_ops.c |   7 +-
 drivers/crypto/octeontx2/otx2_cryptodev_sec.c | 126 --
 drivers/crypto/octeontx2/otx2_cryptodev_sec.h |   4 +-
 drivers/crypto/octeontx2/otx2_ipsec_po.h      |   6 +
 drivers/crypto/octeontx2/otx2_ipsec_po_ops.h  |   8 +-
 examples/ipsec-secgw/ipsec-secgw.c            |  33 -
 examples/ipsec-secgw/ipsec-secgw.h            |   2 +
 examples/ipsec-secgw/ipsec.c                  |   1 +
 examples/ipsec-secgw/ipsec.h                  |   1 +
 examples/ipsec-secgw/sad.h                    |   5 +-
 13 files changed, 130 insertions(+), 80 deletions(-)

--
2.27.0
[dpdk-dev] [PATCH 1/3] crypto/octeontx2: add UDP encapsulation support
Adding UDP encapsulation support for IPsec in lookaside protocol mode. Signed-off-by: Tejasree Kondoj --- doc/guides/cryptodevs/octeontx2.rst | 1 + doc/guides/rel_notes/release_21_05.rst| 5 +++ drivers/crypto/octeontx2/otx2_cryptodev_sec.c | 40 ++- 3 files changed, 18 insertions(+), 28 deletions(-) diff --git a/doc/guides/cryptodevs/octeontx2.rst b/doc/guides/cryptodevs/octeontx2.rst index d312eeb74c..b30f98180a 100644 --- a/doc/guides/cryptodevs/octeontx2.rst +++ b/doc/guides/cryptodevs/octeontx2.rst @@ -181,6 +181,7 @@ Features supported * Tunnel mode * ESN * Anti-replay +* UDP Encapsulation * AES-128/192/256-GCM * AES-128/192/256-CBC-SHA1-HMAC * AES-128/192/256-CBC-SHA256-128-HMAC diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst index 23f7f0bff9..66e28e21be 100644 --- a/doc/guides/rel_notes/release_21_05.rst +++ b/doc/guides/rel_notes/release_21_05.rst @@ -65,6 +65,11 @@ New Features * Added support for txgbevf PMD. +* **Updated the OCTEON TX2 crypto PMD.** + + * Updated the OCTEON TX2 crypto PMD lookaside protocol offload for IPsec with +UDP encapsulation support for NAT Traversal. + * **Updated testpmd.** * Added command to display Rx queue used descriptor count. 
diff --git a/drivers/crypto/octeontx2/otx2_cryptodev_sec.c b/drivers/crypto/octeontx2/otx2_cryptodev_sec.c index 342f089df8..8942ff1fac 100644 --- a/drivers/crypto/octeontx2/otx2_cryptodev_sec.c +++ b/drivers/crypto/octeontx2/otx2_cryptodev_sec.c @@ -203,6 +203,7 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, struct rte_security_session *sec_sess) { struct rte_crypto_sym_xform *auth_xform, *cipher_xform; + struct otx2_ipsec_po_ip_template *template; const uint8_t *cipher_key, *auth_key; struct otx2_sec_session_ipsec_lp *lp; struct otx2_ipsec_po_sa_ctl *ctl; @@ -248,11 +249,7 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, if (ipsec->tunnel.type == RTE_SECURITY_IPSEC_TUNNEL_IPV4) { if (ctl->enc_type == OTX2_IPSEC_PO_SA_ENC_AES_GCM) { - if (ipsec->options.udp_encap) { - sa->aes_gcm.template.ip4.udp_src = 4500; - sa->aes_gcm.template.ip4.udp_dst = 4500; - } - ip = &sa->aes_gcm.template.ip4.ipv4_hdr; + template = &sa->aes_gcm.template; ctx_len = offsetof(struct otx2_ipsec_po_out_sa, aes_gcm.template) + sizeof( sa->aes_gcm.template.ip4); @@ -260,11 +257,7 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, lp->ctx_len = ctx_len >> 3; } else if (ctl->auth_type == OTX2_IPSEC_PO_SA_AUTH_SHA1) { - if (ipsec->options.udp_encap) { - sa->sha1.template.ip4.udp_src = 4500; - sa->sha1.template.ip4.udp_dst = 4500; - } - ip = &sa->sha1.template.ip4.ipv4_hdr; + template = &sa->sha1.template; ctx_len = offsetof(struct otx2_ipsec_po_out_sa, sha1.template) + sizeof( sa->sha1.template.ip4); @@ -272,11 +265,7 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, lp->ctx_len = ctx_len >> 3; } else if (ctl->auth_type == OTX2_IPSEC_PO_SA_AUTH_SHA2_256) { - if (ipsec->options.udp_encap) { - sa->sha2.template.ip4.udp_src = 4500; - sa->sha2.template.ip4.udp_dst = 4500; - } - ip = &sa->sha2.template.ip4.ipv4_hdr; + template = &sa->sha2.template; ctx_len = offsetof(struct otx2_ipsec_po_out_sa, 
sha2.template) + sizeof( sa->sha2.template.ip4); @@ -285,8 +274,15 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, } else { return -EINVAL; } + ip = &template->ip4.ipv4_hdr; + if (ipsec->options.udp_encap) { + ip->next_proto_id = IPPROTO_UD
[dpdk-dev] [PATCH 2/3] examples/ipsec-secgw: add UDP encapsulation support
Adding lookaside IPsec UDP encapsulation support for NAT traversal. Added --udp-encap option for the application to specify whether UDP encapsulation needs to be enabled. Example secgw command with UDP encapsulation enabled: -c 0x1 -- -P -p 0x1 --config "(0,0,0)" -f ep0.cfg --udp-encap Signed-off-by: Tejasree Kondoj --- doc/guides/rel_notes/release_21_05.rst | 5 doc/guides/sample_app_ug/ipsec_secgw.rst | 5 +++- examples/ipsec-secgw/ipsec-secgw.c | 33 ++-- examples/ipsec-secgw/ipsec-secgw.h | 2 ++ examples/ipsec-secgw/ipsec.c | 1 + examples/ipsec-secgw/ipsec.h | 1 + examples/ipsec-secgw/sad.h | 5 +++- 7 files changed, 48 insertions(+), 4 deletions(-) diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst index 66e28e21be..2e67038bfe 100644 --- a/doc/guides/rel_notes/release_21_05.rst +++ b/doc/guides/rel_notes/release_21_05.rst @@ -75,6 +75,11 @@ New Features * Added command to display Rx queue used descriptor count. ``show port (port_id) rxq (queue_id) desc used count`` +* **Updated ipsec-secgw sample application.** + + * Updated the ``ipsec-secgw`` sample application with UDP encapsulation +support for NAT Traversal. + Removed Items - diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst index 176e292d3f..099f499c18 100644 --- a/doc/guides/sample_app_ug/ipsec_secgw.rst +++ b/doc/guides/sample_app_ug/ipsec_secgw.rst @@ -139,6 +139,7 @@ The application has a number of command line options:: --reassemble NUM --mtu MTU --frag-ttl FRAG_TTL_NS +--udp-encap Where: @@ -234,6 +235,8 @@ Where: Should be lower for low number of reassembly buckets. Valid values: from 1 ns to 10 s. Default value: 1000 (10 s). +* ``--udp-encap``: enables IPsec UDP Encapsulation for NAT Traversal. + The mapping of lcores to port/queues is similar to other l3fwd applications. @@ -1023,4 +1026,4 @@ Available options: * ``-h`` Show usage. If is specified, only tests for that mode will be invoked. 
For the -list of available modes please refer to run_test.sh. \ No newline at end of file +list of available modes please refer to run_test.sh. diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c index 20d69ba813..57c8973e9d 100644 --- a/examples/ipsec-secgw/ipsec-secgw.c +++ b/examples/ipsec-secgw/ipsec-secgw.c @@ -115,6 +115,7 @@ struct flow_info flow_info_tbl[RTE_MAX_ETHPORTS]; #define CMD_LINE_OPT_REASSEMBLE"reassemble" #define CMD_LINE_OPT_MTU "mtu" #define CMD_LINE_OPT_FRAG_TTL "frag-ttl" +#define CMD_LINE_OPT_UDP_ENCAP "udp-encap" #define CMD_LINE_ARG_EVENT "event" #define CMD_LINE_ARG_POLL "poll" @@ -139,6 +140,7 @@ enum { CMD_LINE_OPT_REASSEMBLE_NUM, CMD_LINE_OPT_MTU_NUM, CMD_LINE_OPT_FRAG_TTL_NUM, + CMD_LINE_OPT_UDP_ENCAP_NUM, }; static const struct option lgopts[] = { @@ -152,6 +154,7 @@ static const struct option lgopts[] = { {CMD_LINE_OPT_REASSEMBLE, 1, 0, CMD_LINE_OPT_REASSEMBLE_NUM}, {CMD_LINE_OPT_MTU, 1, 0, CMD_LINE_OPT_MTU_NUM}, {CMD_LINE_OPT_FRAG_TTL, 1, 0, CMD_LINE_OPT_FRAG_TTL_NUM}, + {CMD_LINE_OPT_UDP_ENCAP, 0, 0, CMD_LINE_OPT_UDP_ENCAP_NUM}, {NULL, 0, 0, 0} }; @@ -360,6 +363,9 @@ prepare_one_packet(struct rte_mbuf *pkt, struct ipsec_traffic *t) const struct rte_ether_hdr *eth; const struct rte_ipv4_hdr *iph4; const struct rte_ipv6_hdr *iph6; + const struct rte_udp_hdr *udp; + uint16_t nat_port; + uint16_t ip4_hdr_len; eth = rte_pktmbuf_mtod(pkt, const struct rte_ether_hdr *); if (eth->ether_type == rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4)) { @@ -368,9 +374,26 @@ prepare_one_packet(struct rte_mbuf *pkt, struct ipsec_traffic *t) RTE_ETHER_HDR_LEN); adjust_ipv4_pktlen(pkt, iph4, 0); - if (iph4->next_proto_id == IPPROTO_ESP) + switch (iph4->next_proto_id) { + case IPPROTO_ESP: t->ipsec.pkts[(t->ipsec.num)++] = pkt; - else { + break; + case IPPROTO_UDP: + if (app_sa_prm.udp_encap == 1) { + ip4_hdr_len = ((iph4->version_ihl & + RTE_IPV4_HDR_IHL_MASK) * + RTE_IPV4_IHL_MULTIPLIER); + udp = rte_pktmbuf_mtod_offset(pkt, + 
struct rte_udp_hdr *, ip4_hdr_len); + nat_port = rte_cpu_to_be_16(IPSEC_NAT_T_PORT); + if (udp->src_port == nat_port || +
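For readers following the truncated hunk above, the classification it introduces can be sketched in plain C. This is only an illustration under stated assumptions: `NAT_T_PORT` stands in for `IPSEC_NAT_T_PORT` and is assumed to be 4500 (the port used in patch 1/3), the function name `ipv4_is_ipsec` is hypothetical, and the check on the destination port is a guess at how the cut-off condition continues.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h> /* IPPROTO_ESP, IPPROTO_UDP */

#define NAT_T_PORT 4500u /* assumed value of IPSEC_NAT_T_PORT */

/*
 * Hypothetical helper: l3 points at the start of an IPv4 header.
 * Returns true when the packet should go to the IPsec path.
 */
static bool
ipv4_is_ipsec(const uint8_t *l3, size_t len, bool udp_encap)
{
	size_t ihl;
	uint16_t src, dst;

	if (len < 20)
		return false;

	/* Plain ESP is always an IPsec candidate (protocol field is byte 9). */
	if (l3[9] == IPPROTO_ESP)
		return true;
	if (l3[9] != IPPROTO_UDP || !udp_encap)
		return false;

	/* Header length is the low nibble of the first byte, in 32-bit words. */
	ihl = (size_t)(l3[0] & 0x0f) * 4;
	if (ihl < 20 || len < ihl + 8)
		return false;

	/* UDP ports are big-endian on the wire; read them byte by byte. */
	src = (uint16_t)((l3[ihl] << 8) | l3[ihl + 1]);
	dst = (uint16_t)((l3[ihl + 2] << 8) | l3[ihl + 3]);

	/* Assumption: either port matching the NAT-T port is enough. */
	return src == NAT_T_PORT || dst == NAT_T_PORT;
}
```

Reading the ports byte by byte avoids the byte-order conversion that the patch performs with rte_cpu_to_be_16(); the comparison is equivalent.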
[dpdk-dev] [PATCH 3/3] crypto/octeontx2: support lookaside IPv4 transport mode
Adding support for IPv4 lookaside IPsec transport mode. Signed-off-by: Tejasree Kondoj --- doc/guides/cryptodevs/octeontx2.rst | 1 + drivers/crypto/octeontx2/otx2_cryptodev_ops.c | 7 +- drivers/crypto/octeontx2/otx2_cryptodev_sec.c | 110 ++ drivers/crypto/octeontx2/otx2_cryptodev_sec.h | 4 +- drivers/crypto/octeontx2/otx2_ipsec_po.h | 6 + drivers/crypto/octeontx2/otx2_ipsec_po_ops.h | 8 +- 6 files changed, 76 insertions(+), 60 deletions(-) diff --git a/doc/guides/cryptodevs/octeontx2.rst b/doc/guides/cryptodevs/octeontx2.rst index b30f98180a..811e61a1f6 100644 --- a/doc/guides/cryptodevs/octeontx2.rst +++ b/doc/guides/cryptodevs/octeontx2.rst @@ -179,6 +179,7 @@ Features supported * IPv6 * ESP * Tunnel mode +* Transport mode(IPv4) * ESN * Anti-replay * UDP Encapsulation diff --git a/drivers/crypto/octeontx2/otx2_cryptodev_ops.c b/drivers/crypto/octeontx2/otx2_cryptodev_ops.c index cec20b5c6d..c20170bcaa 100644 --- a/drivers/crypto/octeontx2/otx2_cryptodev_ops.c +++ b/drivers/crypto/octeontx2/otx2_cryptodev_ops.c @@ -928,7 +928,7 @@ otx2_cpt_sec_post_process(struct rte_crypto_op *cop, uintptr_t *rsp) struct rte_mbuf *m = sym_op->m_src; struct rte_ipv6_hdr *ip6; struct rte_ipv4_hdr *ip; - uint16_t m_len; + uint16_t m_len = 0; int mdata_len; char *data; @@ -938,11 +938,12 @@ otx2_cpt_sec_post_process(struct rte_crypto_op *cop, uintptr_t *rsp) if (word0->s.opcode.major == OTX2_IPSEC_PO_PROCESS_IPSEC_INB) { data = rte_pktmbuf_mtod(m, char *); - if (rsp[4] == RTE_SECURITY_IPSEC_TUNNEL_IPV4) { + if (rsp[4] == OTX2_IPSEC_PO_TRANSPORT || + rsp[4] == OTX2_IPSEC_PO_TUNNEL_IPV4) { ip = (struct rte_ipv4_hdr *)(data + OTX2_IPSEC_PO_INB_RPTR_HDR); m_len = rte_be_to_cpu_16(ip->total_length); - } else { + } else if (rsp[4] == OTX2_IPSEC_PO_TUNNEL_IPV6) { ip6 = (struct rte_ipv6_hdr *)(data + OTX2_IPSEC_PO_INB_RPTR_HDR); m_len = rte_be_to_cpu_16(ip6->payload_len) + diff --git a/drivers/crypto/octeontx2/otx2_cryptodev_sec.c b/drivers/crypto/octeontx2/otx2_cryptodev_sec.c index 
8942ff1fac..6493ce8370 100644 --- a/drivers/crypto/octeontx2/otx2_cryptodev_sec.c +++ b/drivers/crypto/octeontx2/otx2_cryptodev_sec.c @@ -25,12 +25,15 @@ ipsec_lp_len_precalc(struct rte_security_ipsec_xform *ipsec, { struct rte_crypto_sym_xform *cipher_xform, *auth_xform; - if (ipsec->tunnel.type == RTE_SECURITY_IPSEC_TUNNEL_IPV4) - lp->partial_len = sizeof(struct rte_ipv4_hdr); - else if (ipsec->tunnel.type == RTE_SECURITY_IPSEC_TUNNEL_IPV6) - lp->partial_len = sizeof(struct rte_ipv6_hdr); - else - return -EINVAL; + lp->partial_len = 0; + if (ipsec->mode == RTE_SECURITY_IPSEC_SA_MODE_TUNNEL) { + if (ipsec->tunnel.type == RTE_SECURITY_IPSEC_TUNNEL_IPV4) + lp->partial_len = sizeof(struct rte_ipv4_hdr); + else if (ipsec->tunnel.type == RTE_SECURITY_IPSEC_TUNNEL_IPV6) + lp->partial_len = sizeof(struct rte_ipv6_hdr); + else + return -EINVAL; + } if (ipsec->proto == RTE_SECURITY_IPSEC_SA_PROTO_ESP) { lp->partial_len += sizeof(struct rte_esp_hdr); @@ -203,7 +206,7 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, struct rte_security_session *sec_sess) { struct rte_crypto_sym_xform *auth_xform, *cipher_xform; - struct otx2_ipsec_po_ip_template *template; + struct otx2_ipsec_po_ip_template *template = NULL; const uint8_t *cipher_key, *auth_key; struct otx2_sec_session_ipsec_lp *lp; struct otx2_ipsec_po_sa_ctl *ctl; @@ -229,10 +232,10 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, memset(sa, 0, sizeof(struct otx2_ipsec_po_out_sa)); /* Initialize lookaside ipsec private data */ + lp->mode_type = OTX2_IPSEC_PO_TRANSPORT; lp->ip_id = 0; lp->seq_lo = 1; lp->seq_hi = 0; - lp->tunnel_type = ipsec->tunnel.type; ret = ipsec_po_sa_ctl_set(ipsec, crypto_xform, ctl); if (ret) @@ -242,46 +245,47 @@ crypto_sec_ipsec_outb_session_create(struct rte_cryptodev *crypto_dev, if (ret) return ret; - if (ipsec->mode == RTE_SECURITY_IPSEC_SA_MODE_TUNNEL) { - /* Start ip id from 1 */ - lp->ip_id = 1; + /* Start ip id from 1 */ + lp->ip_id = 1; + + 
if (ctl->enc_type == OTX2_IPSEC_PO_SA_ENC_AES_GCM) { + template = &sa->aes_gcm.template; + ctx_len = offsetof(struct otx2_ipsec_po_out_sa, +
Re: [dpdk-dev] [PATCH v3 00/11] improve options help
On Fri, Mar 12, 2021 at 07:17:09PM +0100, Thomas Monjalon wrote: > The main intent of this series is to provide a nice help > for the --log-level option. > More patches are added to improve options help in general. > > > v3: > - fix use of RTE_LOG_MAX > - accept (with warning) log level higher than RTE_LOG_MAX > v2: > - fix use of the new macro RTE_LOG_MAX in level parsing > - improve parameters type and name while moving functions > > Series-acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH 7/7] eventdev: fix ABI breakage due to event vector
On 08/03/2021 18:44, Jerin Jacob wrote: > On Sun, Feb 21, 2021 at 3:41 AM wrote: >> >> From: Pavan Nikhilesh >> >> Fix ABI breakage due to event vector configuration by moving >> the vector configuration into a new structure and having a separate >> function for enabling the vector config on a given ethernet device and >> queue pair. >> This vector config and function can be merged to queue config in >> v21.11. >> >> Fixes: 44c81670cf0a ("eventdev: introduce event vector Rx capability") > > Hi @Ray Kinsella @Neil Horman @Thomas Monjalon @David Marchand > > Is the ABI breakage contract between release to release. Right? i.e it > is not between each patch. Right? > > Summary: > 1) Ideal way of adding this feature is to add elements in the > existing structure as mentioned > in ("eventdev: introduce event vector Rx capability") in this series. > 2) Since this breaking ABI, Introducing a new structure to fix this. I > think, we can remove this > limitation in 21.11 as that time we can change ABI as required. > > So, Is this patch needs to be squashed to ("eventdev: introduce event > vector Rx capability") to avoid > ABI compatibility between patches? Or Is it OK to break the ABI > compatibility in a patch in the series > and later fix it in the same series?(This is for more readability as > we can revert this patch in 21.11). You are essentially writing it as you want it to appear in 21.11, then adding one patch at the end to fix ABI compatibility until then. You then only have one patch to revert in the 21.11 cycle. I agree with David; I like the approach. +1 from me. 
> > > >> >> Signed-off-by: Pavan Nikhilesh >> --- >> app/test-eventdev/test_pipeline_common.c | 16 +- >> lib/librte_eventdev/eventdev_pmd.h| 29 +++ >> .../rte_event_eth_rx_adapter.c| 168 -- >> .../rte_event_eth_rx_adapter.h| 27 +++ >> lib/librte_eventdev/version.map | 1 + >> 5 files changed, 184 insertions(+), 57 deletions(-) >> >> diff --git a/app/test-eventdev/test_pipeline_common.c >> b/app/test-eventdev/test_pipeline_common.c >> index 89f73be86..9aeefdd5f 100644 >> --- a/app/test-eventdev/test_pipeline_common.c >> +++ b/app/test-eventdev/test_pipeline_common.c >> @@ -331,6 +331,7 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, >> uint8_t stride, >> uint16_t prod; >> struct rte_mempool *vector_pool = NULL; >> struct rte_event_eth_rx_adapter_queue_conf queue_conf; >> + struct rte_event_eth_rx_adapter_event_vector_config vec_conf; >> >> memset(&queue_conf, 0, >> sizeof(struct rte_event_eth_rx_adapter_queue_conf)); >> @@ -360,12 +361,8 @@ pipeline_event_rx_adapter_setup(struct evt_options >> *opt, uint8_t stride, >> } >> if (opt->ena_vector) { >> if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) >> { >> - queue_conf.vector_sz = opt->vector_size; >> - queue_conf.vector_timeout_ns = >> - opt->vector_tmo_nsec; >> queue_conf.rx_queue_flags |= >> RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR; >> - queue_conf.vector_mp = vector_pool; >> } else { >> evt_err("Rx adapter doesn't support event >> vector"); >> return -EINVAL; >> @@ -385,6 +382,17 @@ pipeline_event_rx_adapter_setup(struct evt_options >> *opt, uint8_t stride, >> return ret; >> } >> >> + if (opt->ena_vector) { >> + vec_conf.vector_sz = opt->vector_size; >> + vec_conf.vector_timeout_ns = opt->vector_tmo_nsec; >> + vec_conf.vector_mp = vector_pool; >> + if >> (rte_event_eth_rx_adapter_queue_event_vector_config( >> + prod, prod, -1, &vec_conf) < 0) { >> + evt_err("Failed to configure event >> vectorization for Rx adapter"); >> + return -EINVAL; >> + } >> + } >> + >> if (!(cap & 
RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT)) { >> uint32_t service_id = -1U; >> >> diff --git a/lib/librte_eventdev/eventdev_pmd.h >> b/lib/librte_eventdev/eventdev_pmd.h >> index 60bfaebc0..d79dfd612 100644 >> --- a/lib/librte_eventdev/eventdev_pmd.h >> +++ b/lib/librte_eventdev/eventdev_pmd.h >> @@ -667,6 +667,32 @@ typedef int >> (*eventdev_eth_rx_adapter_vector_limits_get_t)( >> const struct rte_eventdev *dev, const struct rte_eth_dev *eth_dev, >> struct rte_event_eth_rx_adapter_vector_limits *limits); >> >> +struct rte_event_eth_rx_adapter_event_vector_config; >> +/** >> +
Re: [dpdk-dev] [PATCH v2] ethdev: introduce enable_driver_sdk to install driver headers
On Fri, Mar 12, 2021 at 02:20:06PM -0800, Tyler Retzlaff wrote: > Introduce a meson option enable_driver_sdk when true installs internal > driver headers for ethdev. this allows drivers that do not depend on > stable api/abi to be built external to the dpdk source tree. > > Signed-off-by: Tyler Retzlaff > --- > lib/librte_ethdev/meson.build | 6 ++ > lib/meson.build | 4 > meson_options.txt | 2 ++ > 3 files changed, 12 insertions(+) > The infrastructure looks good to me. However, you need to make the same change to cryptodev, eventdev, etc. to add the headers from there too.
Re: [dpdk-dev] [PATCH v11 2/2] bus/pci: support MMIO in PCI ioport accessors
On Thu, Mar 11, 2021 at 7:43 AM Wang, Haiyue wrote: > Like kernel use macro to do pio and mmio, maybe we can also to do so for > making code clean: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/iomap.c > > #define IO_COND(addr, is_pio, is_mmio) do { \ > unsigned long port = (unsigned long __force)addr; \ > if (port >= PIO_RESERVED) { \ > is_mmio;\ > } else if (port > PIO_OFFSET) { \ > port &= PIO_MASK; \ > is_pio; \ > } else \ > bad_io_access(port, #is_pio ); \ > } while (0) > > > Like: > > #if defined(RTE_ARCH_X86) > #define IO_COND(addr, is_pio, is_mmio) do { \ > if ((uint64_t)(uintptr_t)addr >= PIO_MAX) { \ > is_mmio; \ > } else { \ > is_pio; \ > } \ > } while (0) > #else > #define IO_COND(addr, is_pio, is_mmio) do { \ > is_mmio; \ > } while (0) > #endif We should not just copy/paste kernel code. Plus here, this seems a bit overkill. And there are other parts in this code that could use some polishing. What do you think of merging this series as is (now that we got non regression reports) and doing such cleanups in followup patches? -- David Marchand
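As a point of comparison for the macro quoted above, the x86 branch reduces to one address comparison, which can also be written as a small function. This is a sketch only: the names `io_kind` and `io_classify` are hypothetical, and the value of `PIO_MAX` is an assumption, since the thread does not state it.

```c
#include <stdint.h>

#define PIO_MAX 0x10000u /* assumed threshold; not given in the thread */

enum io_kind {
	IO_KIND_PIO,  /* address is an x86 I/O port number */
	IO_KIND_MMIO, /* address is a mapped memory address */
};

/* Hypothetical helper: decide which access method an address needs. */
static enum io_kind
io_classify(uintptr_t addr)
{
	return ((uint64_t)addr >= PIO_MAX) ? IO_KIND_MMIO : IO_KIND_PIO;
}
```

An accessor could then switch on the result instead of passing two statements to a macro, at the cost of a little more code per call site.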
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
On 12/03/2021 18:17, Thomas Monjalon wrote: > The option --log-level was not completely described in the usage text, > and it was difficult to guess the names of the log types and levels. > > A new value "help" is accepted after --log-level to give more details > about the syntax and listing the log types and levels. > > The array "levels" used for level name parsing is replaced with > a (modified) existing function which was used in rte_log_dump(). > > The new function rte_log_list_types() is exported in the API > for allowing an application to give this info to the user > if not exposing the EAL option --log-level. > The list of log types cannot include all drivers if not linked in the > application (shared object plugin case). > > Signed-off-by: Thomas Monjalon > --- > lib/librte_eal/common/eal_common_log.c | 24 +--- > lib/librte_eal/common/eal_common_options.c | 44 +++--- > lib/librte_eal/common/eal_log.h| 5 +++ > lib/librte_eal/include/rte_log.h | 11 ++ > lib/librte_eal/version.map | 3 ++ > 5 files changed, 69 insertions(+), 18 deletions(-) > > diff --git a/lib/librte_eal/common/eal_common_log.c > b/lib/librte_eal/common/eal_common_log.c > index 40cac36f89..d695b04068 100644 > --- a/lib/librte_eal/common/eal_common_log.c > +++ b/lib/librte_eal/common/eal_common_log.c > @@ -397,12 +397,12 @@ RTE_INIT_PRIO(log_init, LOG) > rte_logs.dynamic_types_len = RTE_LOGTYPE_FIRST_EXT_ID; > } > > -static const char * > -loglevel_to_string(uint32_t level) > +const char * > +eal_log_level2str(uint32_t level) > { > switch (level) { > case 0: return "disabled"; > - case RTE_LOG_EMERG: return "emerg"; > + case RTE_LOG_EMERG: return "emergency"; > case RTE_LOG_ALERT: return "alert"; > case RTE_LOG_CRIT: return "critical"; > case RTE_LOG_ERR: return "error"; > @@ -414,6 +414,20 @@ loglevel_to_string(uint32_t level) > } > } > > +/* Dump name of each logtype, one per line. 
*/ > +void > +rte_log_list_types(FILE *out, const char *prefix) > +{ > + size_t type; > + > + for (type = 0; type < rte_logs.dynamic_types_len; ++type) { > + if (rte_logs.dynamic_types[type].name == NULL) > + continue; > + fprintf(out, "%s%s\n", > + prefix, rte_logs.dynamic_types[type].name); > + } > +} > + > /* dump global level and registered log types */ > void > rte_log_dump(FILE *f) > @@ -421,14 +435,14 @@ rte_log_dump(FILE *f) > size_t i; > > fprintf(f, "global log level is %s\n", > - loglevel_to_string(rte_log_get_global_level())); > + eal_log_level2str(rte_log_get_global_level())); > > for (i = 0; i < rte_logs.dynamic_types_len; i++) { > if (rte_logs.dynamic_types[i].name == NULL) > continue; > fprintf(f, "id %zu: %s, level is %s\n", > i, rte_logs.dynamic_types[i].name, > - loglevel_to_string(rte_logs.dynamic_types[i].loglevel)); > + eal_log_level2str(rte_logs.dynamic_types[i].loglevel)); > } > } > > diff --git a/lib/librte_eal/common/eal_common_options.c > b/lib/librte_eal/common/eal_common_options.c > index 2df3ae04ea..1da6583d71 100644 > --- a/lib/librte_eal/common/eal_common_options.c > +++ b/lib/librte_eal/common/eal_common_options.c > @@ -1227,19 +1227,31 @@ eal_parse_syslog(const char *facility, struct > internal_config *conf) > } > #endif > > +static void > +eal_log_usage(void) > +{ > + unsigned int level; > + > + printf("Log type is a pattern matching items of this list" > + " (plugins may be missing):\n"); > + rte_log_list_types(stdout, "\t"); > + printf("\n"); > + printf("Syntax using globbing pattern: "); > + printf("--"OPT_LOG_LEVEL" pattern:level\n"); > + printf("Syntax using regular expression: "); > + printf("--"OPT_LOG_LEVEL" regexp,level\n"); > + printf("Syntax for the global level: "); > + printf("--"OPT_LOG_LEVEL" level\n"); > + printf("Logs are emitted if allowed by both global and specific > levels.\n"); > + printf("\n"); > + printf("Log level can be a number or the first letters of its name:\n"); > + for (level = 1; level <= 
RTE_LOG_MAX; level++) > + printf("\t%d %s\n", level, eal_log_level2str(level)); > +} > + > static int > eal_parse_log_priority(const char *level) > { > - static const char * const levels[] = { > - [RTE_LOG_EMERG] = "emergency", > - [RTE_LOG_ALERT] = "alert", > - [RTE_LOG_CRIT]= "critical", > - [RTE_LOG_ERR] = "error", > - [RTE_LOG_WARNING] = "warning", > - [RTE_LOG_NOTICE] = "notice", > - [RTE_LOG_INFO]= "info", > - [RTE_LOG_DEBUG] = "debug", > - }; > size_t len = strlen(level
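The usage text above says a level "can be a number or the first letters of its name". A minimal sketch of that matching rule follows. `LEVEL_MAX`, `level_names` and `parse_log_priority` are hypothetical stand-ins rather than the EAL symbols, and rejecting numbers above the maximum is a simplification (the v3 changelog says the series accepts them with a warning).

```c
#include <stdlib.h>
#include <string.h>

#define LEVEL_MAX 8 /* stand-in for RTE_LOG_MAX */

/* Index 0 is unused: level 0 is not a valid priority here. */
static const char *const level_names[LEVEL_MAX + 1] = {
	NULL, "emergency", "alert", "critical", "error",
	"warning", "notice", "info", "debug",
};

/* Return the level number, or -1 if the string is not a valid level. */
static int
parse_log_priority(const char *level)
{
	size_t len = strlen(level);
	unsigned long num;
	char *end;
	int i;

	if (len == 0)
		return -1;

	/* First match wins, so "e" selects "emergency", not "error". */
	for (i = 1; i <= LEVEL_MAX; i++) {
		if (strncmp(level, level_names[i], len) == 0)
			return i;
	}

	/* Not a name prefix; try a number. */
	num = strtoul(level, &end, 0);
	if (*end != '\0' || num == 0 || num > LEVEL_MAX)
		return -1;
	return (int)num;
}
```

Because the comparison length is the length of the user's input, an input longer than a level name never matches that name.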
Re: [dpdk-dev] [PATCH v4 4/5] examples/l3fwd: implement FIB lookup method
Hi Vladimir, > > + /* Add IPv4 and IPv6 hops to one array depending on type. */ > > + for (i = 0; i < nb_rx; i++) { > > + if (type_arr[i]) > > + nh = (uint16_t)hopsv4[ipv4_arr_assem++]; > > + else > > + nh = (uint16_t)hopsv6[ipv6_arr_assem++]; > > + hops[i] = (nh != FIB_DEFAULT_HOP && nh <= > RTE_MAX_ETHPORTS && > > + (enabled_port_mask & 1 << nh) != 0) ? nh : portid; > > I think you can get rid of > "nh <= RTE_MAX_ETHPORTS && (enabled_port_mask & 1 << nh) != 0" > because it can be controlled during initialization when installing > routes to the table. So you can check it just before rte_fib_add() and > install FIB_DEFAULT_HOP if needed. I will change this ternary to: hops[i] = nh != FIB_DEFAULT_HOP ? nh : portid; I will also update the event code to match this. > > Apart from that LGTM. I will push a v5 with this change. Thanks, Conor. > > > > + } > > + > > +#if defined FIB_SEND_MULTI > > + send_packets_multi(qconf, pkts_burst, hops, nb_rx); > > +#else > > + fib_send_single(nb_rx, qconf, pkts_burst, hops); > > +#endif > > +} > > + > > > > > > > -- > Regards, > Vladimir
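The split Vladimir suggests, validating once at route installation and keeping the fast path to a single comparison, can be sketched as below. This is an illustration, not the l3fwd code: `MAX_PORTS` and `DEFAULT_HOP` are stand-ins for RTE_MAX_ETHPORTS and FIB_DEFAULT_HOP with assumed values, and both function names are hypothetical.

```c
#include <stdint.h>

#define MAX_PORTS 32u    /* stand-in for RTE_MAX_ETHPORTS (assumed value) */
#define DEFAULT_HOP 999u /* stand-in for FIB_DEFAULT_HOP (assumed value) */

/*
 * Setup time: decide which next hop to install for a route.
 * Ports outside the enabled mask fall back to the default hop.
 */
static uint64_t
hop_to_install(uint32_t enabled_port_mask, uint16_t port)
{
	if (port < MAX_PORTS && (enabled_port_mask & (1u << port)) != 0)
		return port;
	return DEFAULT_HOP;
}

/*
 * Fast path: the table only contains validated hops, so a single
 * comparison against the default hop is enough.
 */
static uint16_t
hop_select(uint64_t nh, uint16_t portid)
{
	return nh != DEFAULT_HOP ? (uint16_t)nh : portid;
}
```

The sketch bounds the port with `<` so the shift stays within the width of the mask.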
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
On Mon, Mar 15, 2021 at 10:19:47AM +, Kinsella, Ray wrote: > > > On 12/03/2021 18:17, Thomas Monjalon wrote: > > The option --log-level was not completely described in the usage text, > > and it was difficult to guess the names of the log types and levels. > > > > A new value "help" is accepted after --log-level to give more details > > about the syntax and listing the log types and levels. > > > > The array "levels" used for level name parsing is replaced with > > a (modified) existing function which was used in rte_log_dump(). > > > > The new function rte_log_list_types() is exported in the API > > for allowing an application to give this info to the user > > if not exposing the EAL option --log-level. > > The list of log types cannot include all drivers if not linked in the > > application (shared object plugin case). > > > > Signed-off-by: Thomas Monjalon > > --- > > lib/librte_eal/common/eal_common_log.c | 24 +--- > > lib/librte_eal/common/eal_common_options.c | 44 +++--- > > lib/librte_eal/common/eal_log.h| 5 +++ > > lib/librte_eal/include/rte_log.h | 11 ++ > > lib/librte_eal/version.map | 3 ++ > > 5 files changed, 69 insertions(+), 18 deletions(-) > > > > @@ -1274,6 +1286,11 @@ eal_parse_log_level(const char *arg) > > char *str, *level; > > int priority; > > > > + if (strcmp(arg, "help") == 0) { > > So I think the convention is to support both "?" and "help". > Qemu does this at least. > I've seen "/?" used for help on windows binaries, but "-?" not so much in the linux world, where --help (and often -h for short) seem to be the standard.
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
On 15/03/2021 10:31, Bruce Richardson wrote: > On Mon, Mar 15, 2021 at 10:19:47AM +, Kinsella, Ray wrote: >> >> >> On 12/03/2021 18:17, Thomas Monjalon wrote: >>> The option --log-level was not completely described in the usage text, >>> and it was difficult to guess the names of the log types and levels. >>> >>> A new value "help" is accepted after --log-level to give more details >>> about the syntax and listing the log types and levels. >>> >>> The array "levels" used for level name parsing is replaced with >>> a (modified) existing function which was used in rte_log_dump(). >>> >>> The new function rte_log_list_types() is exported in the API >>> for allowing an application to give this info to the user >>> if not exposing the EAL option --log-level. >>> The list of log types cannot include all drivers if not linked in the >>> application (shared object plugin case). >>> >>> Signed-off-by: Thomas Monjalon >>> --- >>> lib/librte_eal/common/eal_common_log.c | 24 +--- >>> lib/librte_eal/common/eal_common_options.c | 44 +++--- >>> lib/librte_eal/common/eal_log.h| 5 +++ >>> lib/librte_eal/include/rte_log.h | 11 ++ >>> lib/librte_eal/version.map | 3 ++ >>> 5 files changed, 69 insertions(+), 18 deletions(-) >>> > >>> @@ -1274,6 +1286,11 @@ eal_parse_log_level(const char *arg) >>> char *str, *level; >>> int priority; >>> >>> + if (strcmp(arg, "help") == 0) { >> >> So I think the convention is to support both "?" and "help". >> Qemu does this at least. >> > I've seen "/?" used for help on windows binaries, but "-?" not so much in the > linux world, where --help (and often -h for short) seem to be the standard. > This is slightly different - it is where you are looking to return a list of valid values for a parameter. So for instance in qemu mentioned above ~ > qemu-system-x86_64 -cpu ? 
| head -n 10 Available CPUs: x86 486 (alias configured by machine type) x86 486-v1 x86 Broadwell (alias configured by machine type) x86 Broadwell-IBRS(alias of Broadwell-v3) x86 Broadwell-noTSX (alias of Broadwell-v2) x86 Broadwell-noTSX-IBRS (alias of Broadwell-v4) x86 Broadwell-v1 Intel Core Processor (Broadwell) x86 Broadwell-v2 Intel Core Processor (Broadwell, no TSX) x86 Broadwell-v3 Intel Core Processor (Broadwell, IBRS)
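If both spellings were to be accepted as Ray suggests, the check could look roughly like the sketch below. The patch under review only compares against "help"; the function name `is_help_request` is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when the option value asks for the value list. */
static bool
is_help_request(const char *arg)
{
	return arg != NULL &&
		(strcmp(arg, "help") == 0 || strcmp(arg, "?") == 0);
}
```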
Re: [dpdk-dev] [PATCH v3 00/11] improve options help
On 3/15/21 12:40 PM, Bruce Richardson wrote: > On Fri, Mar 12, 2021 at 07:17:09PM +0100, Thomas Monjalon wrote: >> The main intent of this series is to provide a nice help >> for the --log-level option. >> More patches are added to improve options help in general. >> >> >> v3: >> - fix use of RTE_LOG_MAX >> - accept (with warning) log level higher than RTE_LOG_MAX >> v2: >> - fix use of the new macro RTE_LOG_MAX in level parsing >> - improve parameters type and name while moving functions >> >> > Series-acked-by: Bruce Richardson > Series-acked-by: Andrew Rybchenko
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
15/03/2021 11:42, Kinsella, Ray: > > On 15/03/2021 10:31, Bruce Richardson wrote: > > On Mon, Mar 15, 2021 at 10:19:47AM +, Kinsella, Ray wrote: > >> > >> > >> On 12/03/2021 18:17, Thomas Monjalon wrote: > >>> The option --log-level was not completely described in the usage text, > >>> and it was difficult to guess the names of the log types and levels. > >>> > >>> A new value "help" is accepted after --log-level to give more details > >>> about the syntax and listing the log types and levels. > >>> > >>> The array "levels" used for level name parsing is replaced with > >>> a (modified) existing function which was used in rte_log_dump(). > >>> > >>> The new function rte_log_list_types() is exported in the API > >>> for allowing an application to give this info to the user > >>> if not exposing the EAL option --log-level. > >>> The list of log types cannot include all drivers if not linked in the > >>> application (shared object plugin case). > >>> > >>> Signed-off-by: Thomas Monjalon > >>> --- > >>> lib/librte_eal/common/eal_common_log.c | 24 +--- > >>> lib/librte_eal/common/eal_common_options.c | 44 +++--- > >>> lib/librte_eal/common/eal_log.h| 5 +++ > >>> lib/librte_eal/include/rte_log.h | 11 ++ > >>> lib/librte_eal/version.map | 3 ++ > >>> 5 files changed, 69 insertions(+), 18 deletions(-) > >>> > > > >>> @@ -1274,6 +1286,11 @@ eal_parse_log_level(const char *arg) > >>> char *str, *level; > >>> int priority; > >>> > >>> + if (strcmp(arg, "help") == 0) { > >> > >> So I think the convention is to support both "?" and "help". > >> Qemu does this at least. > >> > > I've seen "/?" used for help on windows binaries, but "-?" not so much in > > the > > linux world, where --help (and often -h for short) seem to be the standard. > > > > This is slightly different - it is where you are looking to return a list of > valid > values for a parameter. So for instance in qemu mentioned above > > ~ > qemu-system-x86_64 -cpu ? | head -n 10 "?" is a special character. 
In my zsh, I need to quote it to avoid globbing parsing, so I'm not a fan. I will let you extend the syntax in a separate patch :)
[dpdk-dev] [RFC PATCH] meson: remove unnecessary explicit link to libpcap
libpcap is already found and registered as a dependency by meson, and the dependency is already correctly used in librte_port. This line is just unnecessary. It also has the side effect of messing with the meson link line: dpdk link will be declared twice: manually and then through pkg-config. If you configure meson to prefer static linking over dynamic, this will cause the build to fail on librte_port, since the pcap deps are not yet seen by the linker. Signed-off-by: Gabriel Ganne --- config/meson.build | 1 - 1 file changed, 1 deletion(-) diff --git a/config/meson.build b/config/meson.build index 0fb7e1b27a0f..3eb90327dfcc 100644 --- a/config/meson.build +++ b/config/meson.build @@ -177,7 +177,6 @@ if not pcap_dep.found() endif if pcap_dep.found() and cc.has_header('pcap.h', dependencies: pcap_dep) dpdk_conf.set('RTE_PORT_PCAP', 1) - dpdk_extra_ldflags += '-lpcap' endif # for clang 32-bit compiles we need libatomic for 64-bit atomic ops -- 2.29.2
Re: [dpdk-dev] [PATCH 2/3] net/virtio: allocate fake mbuf in Rx queue
On 1/11/21 3:50 AM, Xia, Chenbo wrote: > Hi Maxime, > >> -Original Message- >> From: Maxime Coquelin >> Sent: Tuesday, December 22, 2020 12:15 AM >> To: dev@dpdk.org; Xia, Chenbo ; amore...@redhat.com; >> david.march...@redhat.com; olivier.m...@6wind.com >> Cc: Maxime Coquelin >> Subject: [PATCH 2/3] net/virtio: allocate fake mbuf in Rx queue >> >> While it is worth clarifying whether the fake mbuf >> in virtnet_rx struct is really necessary, it is sure >> that it heavily impacts cache usage by being part of >> the struct. Indeed, it takes uses cachelines, and > > Did you mean 'uses cachelines'? I don't know what I meant here :) I will rework it! >> requires alignement on a cacheline. > > Alignment? > > With above fixed: > > Reviewed-by: Chenbo Xia > >> >> Before this series, it means it took 120 bytes in >> virtnet_rx struct: >> >> struct virtnet_rx { >> struct virtqueue * vq; /* 0 8 */ >> >> /* XXX 56 bytes hole, try to pack */ >> >> /* --- cacheline 1 boundary (64 bytes) --- */ >> struct rte_mbuffake_mbuf __attribute__((__aligned__(64))); >> /*64 128 */ >> /* --- cacheline 3 boundary (192 bytes) --- */ >> >> This patch allocates it using malloc in order to optimize >> virtnet_rx cache usage and so virtqueue cache usage. 
>> >> Signed-off-by: Maxime Coquelin >> --- >> drivers/net/virtio/virtio_ethdev.c | 10 ++ >> drivers/net/virtio/virtio_rxtx.c | 8 +++- >> drivers/net/virtio/virtio_rxtx.h | 2 +- >> 3 files changed, 14 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/net/virtio/virtio_ethdev.c >> b/drivers/net/virtio/virtio_ethdev.c >> index 297c01a70d..a1351b36ca 100644 >> --- a/drivers/net/virtio/virtio_ethdev.c >> +++ b/drivers/net/virtio/virtio_ethdev.c >> @@ -539,6 +539,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> } >> >> if (queue_type == VTNET_RQ) { >> +struct rte_mbuf *fake_mbuf; >> size_t sz_sw = (RTE_PMD_VIRTIO_RX_MAX_BURST + vq_size) * >> sizeof(vq->sw_ring[0]); >> >> @@ -550,10 +551,18 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> goto fail_q_alloc; >> } >> >> +fake_mbuf = malloc(sizeof(*fake_mbuf)); >> +if (!fake_mbuf) { >> +PMD_INIT_LOG(ERR, "can not allocate fake mbuf"); >> +ret = -ENOMEM; >> +goto fail_q_alloc; >> +} >> + >> vq->sw_ring = sw_ring; >> rxvq = &vq->rxq; >> rxvq->port_id = dev->data->port_id; >> rxvq->mz = mz; >> +rxvq->fake_mbuf = fake_mbuf; >> } else if (queue_type == VTNET_TQ) { >> txvq = &vq->txq; >> txvq->port_id = dev->data->port_id; >> @@ -636,6 +645,7 @@ virtio_free_queues(struct virtio_hw *hw) >> >> queue_type = virtio_get_queue_type(hw, i); >> if (queue_type == VTNET_RQ) { >> +free(vq->rxq.fake_mbuf); >> rte_free(vq->sw_ring); >> rte_memzone_free(vq->rxq.mz); >> } else if (queue_type == VTNET_TQ) { >> diff --git a/drivers/net/virtio/virtio_rxtx.c >> b/drivers/net/virtio/virtio_rxtx.c >> index 1fcce36cbd..d147d7300a 100644 >> --- a/drivers/net/virtio/virtio_rxtx.c >> +++ b/drivers/net/virtio/virtio_rxtx.c >> @@ -703,11 +703,9 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev >> *dev, >> uint16_t queue_idx) >> virtio_rxq_vec_setup(rxvq); >> } >> >> -memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf)); >> -for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; >> - 
desc_idx++) { >> -vq->sw_ring[vq->vq_nentries + desc_idx] = >> -&rxvq->fake_mbuf; >> +memset(rxvq->fake_mbuf, 0, sizeof(*rxvq->fake_mbuf)); >> +for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; desc_idx++) { >> +vq->sw_ring[vq->vq_nentries + desc_idx] = rxvq->fake_mbuf; >> } >> >> if (hw->use_vec_rx && !virtio_with_packed_queue(hw)) { >> diff --git a/drivers/net/virtio/virtio_rxtx.h >> b/drivers/net/virtio/virtio_rxtx.h >> index 7f1036be6f..6ce5d67d15 100644 >> --- a/drivers/net/virtio/virtio_rxtx.h >> +++ b/drivers/net/virtio/virtio_rxtx.h >> @@ -19,7 +19,7 @@ struct virtnet_stats { >> >> struct virtnet_rx { >> /* dummy mbuf, for wraparound when processing RX ring. */ >> -struct rte_mbuf fake_mbuf; >> +struct rte_mbuf *fake_mbuf; >> uint64_t mbuf_initializer; /**< value to init mbufs. */ >> struct rte_mempool *mpool; /**< mempool for mbuf allocation */ >> >> -- >> 2.29.2 >
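The layout argument in the commit message can be reproduced with a small stand-alone sketch. It is not taken from the patch: `fake_mbuf_like`, `rx_embedded`, `rx_pointer` and `hole_before_embedded()` are hypothetical names, and the 128-byte size and 64-byte alignment are assumptions read off the pahole output quoted above rather than from `struct rte_mbuf` itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct rte_mbuf: assumed to be 128 bytes
 * and aligned on a 64-byte cache line, as in the quoted pahole output.
 */
struct fake_mbuf_like {
	_Alignas(64) unsigned char bytes[128];
};

/* Layout before the patch: the aligned object is embedded. */
struct rx_embedded {
	void *vq;
	struct fake_mbuf_like fake_mbuf;
	uint64_t mbuf_initializer;
};

/* Layout after the patch: only a pointer is stored. */
struct rx_pointer {
	void *vq;
	struct fake_mbuf_like *fake_mbuf;
	uint64_t mbuf_initializer;
};

/* Padding the compiler inserts between vq and the embedded object. */
static size_t
hole_before_embedded(void)
{
	return offsetof(struct rx_embedded, fake_mbuf) - sizeof(void *);
}
```

With 8-byte pointers, `hole_before_embedded()` matches the 56-byte hole that pahole reports, while the pointer variant keeps the first members of the struct together in one cache line. The allocation itself (plain `malloc()` in the patch) is outside the scope of this sketch.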
[dpdk-dev] [PATCH v5 0/5] examples/l3fwd: add FIB lookup method to l3fwd
Currently the l3fwd sample app supports the LPM and EM lookup methods.
This patchset implements the FIB library as another lookup method for l3fwd.
Instead of adding an individual flag for FIB, a new flag '--lookup' has been
added that allows the user to select their desired lookup method.
The flags '-E' and '-L' have been retained for backwards compatibility.

---
v5:
- Removed runtime checks to ensure desired port is within portmask,
  unused ports are still removed during setup

v4:
- Changed individual switches for lookup methods to an enum for all
  lookup methods
- Removed '-F' and introduced '--lookup' flag to select lookup methods
- Fixed indentation issues
- Renamed some variables for increased clarity
- Minor changes to some logic for readability
- Implemented MAC updating for FIB on non-SSE machines
- Implemented RFC1812 for FIB on non-SSE machines
- Added checks to ensure desired port is within portmask

v3:
add support for NEON, PPC 64 and machines that do not support SSE, NEON
or PPC 64.

v2:
added the socket header file to fix FreeBSD build.

Conor Walsh (5):
  examples/l3fwd: fix LPM IPv6 subnets
  examples/l3fwd: move l3fwd routes to common header
  examples/l3fwd: add FIB infrastructure
  examples/l3fwd: implement FIB lookup method
  doc/guides/l3_forward: update documentation for FIB

 doc/guides/sample_app_ug/l3_forward.rst | 113 -
 examples/l3fwd/Makefile | 2 +-
 examples/l3fwd/l3fwd.h | 27 +-
 examples/l3fwd/l3fwd_common_route.h | 48 +++
 examples/l3fwd/l3fwd_event.c | 9 +
 examples/l3fwd/l3fwd_event.h | 1 +
 examples/l3fwd/l3fwd_fib.c | 528
 examples/l3fwd/l3fwd_lpm.c | 68 +--
 examples/l3fwd/main.c | 107 +++--
 examples/l3fwd/meson.build | 4 +-
 10 files changed, 809 insertions(+), 98 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_common_route.h
 create mode 100644 examples/l3fwd/l3fwd_fib.c

-- 
2.25.1
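As a rough illustration of the '--lookup' selection described in the cover letter, the sketch below maps the option argument to an enum and falls back to LPM when nothing was selected. The accepted strings (`em`, `lpm`, `fib`) and the LPM default follow the documentation patch in this series; the enum and function names are hypothetical and are not the ones used in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the enum used by the patch is not shown in this thread. */
enum lookup_method {
	LOOKUP_INVALID = -1,	/* unrecognised argument */
	LOOKUP_DEFAULT = 0,	/* nothing selected on the command line */
	LOOKUP_LPM,
	LOOKUP_EM,
	LOOKUP_FIB,
};

/* Map the argument of --lookup to a lookup method. */
static enum lookup_method
parse_lookup(const char *arg)
{
	if (arg == NULL)
		return LOOKUP_INVALID;
	if (strcmp(arg, "em") == 0)
		return LOOKUP_EM;
	if (strcmp(arg, "lpm") == 0)
		return LOOKUP_LPM;
	if (strcmp(arg, "fib") == 0)
		return LOOKUP_FIB;
	return LOOKUP_INVALID;
}

/* Fall back to LPM when no method was selected, as the guide documents. */
static enum lookup_method
effective_lookup(enum lookup_method selected)
{
	return selected == LOOKUP_DEFAULT ? LOOKUP_LPM : selected;
}
```

In an option-parsing loop this would presumably be called with the option argument when '--lookup' is seen, while '-E' and '-L' would set `LOOKUP_EM` and `LOOKUP_LPM` directly; how the patch resolves conflicting flags is not shown here.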
[dpdk-dev] [PATCH v5 1/5] examples/l3fwd: fix LPM IPv6 subnets
The IPv6 subnets used were not within the 2001:200::/48 subnet.
Changed to 2001:200:0:{0-7}::/64, where 0-7 is the port ID.

Fixes: 37afe381bde4 ("examples/l3fwd: use reserved IP addresses")

Signed-off-by: Conor Walsh
Acked-by: Vladimir Medvedkin
---
 examples/l3fwd/l3fwd_lpm.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index 3dcf1fef18..1cfaf36572 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -42,7 +42,10 @@ struct ipv6_l3fwd_lpm_route {
 	uint8_t if_out;
 };
 
-/* 198.18.0.0/16 are set aside for RFC2544 benchmarking (RFC5735). */
+/*
+ * 198.18.0.0/16 are set aside for RFC2544 benchmarking (RFC5735).
+ * 198.18.{0-7}.0/24 = Port {0-7}
+ */
 static const struct ipv4_l3fwd_lpm_route ipv4_l3fwd_lpm_route_array[] = {
 	{RTE_IPV4(198, 18, 0, 0), 24, 0},
 	{RTE_IPV4(198, 18, 1, 0), 24, 1},
@@ -54,16 +57,19 @@ static const struct ipv4_l3fwd_lpm_route ipv4_l3fwd_lpm_route_array[] = {
 	{RTE_IPV4(198, 18, 7, 0), 24, 7},
 };
 
-/* 2001:0200::/48 is IANA reserved range for IPv6 benchmarking (RFC5180) */
+/*
+ * 2001:200::/48 is IANA reserved range for IPv6 benchmarking (RFC5180).
+ * 2001:200:0:{0-7}::/64 = Port {0-7}
+ */
 static const struct ipv6_l3fwd_lpm_route ipv6_l3fwd_lpm_route_array[] = {
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 48, 0},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, 48, 1},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0}, 48, 2},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0}, 48, 3},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0}, 48, 4},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0}, 48, 5},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0}, 48, 6},
-	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0}, 48, 7},
+	{{32, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 0},
+	{{32, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 1},
+	{{32, 1, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 2},
+	{{32, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 3},
+	{{32, 1, 2, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 4},
+	{{32, 1, 2, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 5},
+	{{32, 1, 2, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 6},
+	{{32, 1, 2, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 7},
 };
 
 #define IPV4_L3FWD_LPM_MAX_RULES 1024
-- 
2.25.1
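Reading the route arrays byte by byte, the old entries carry the port ID in byte 9, which lies beyond a depth of 48 bits, whereas the new entries carry it in byte 7 with a depth of 64. The sketch below checks what that means for two of the entries. `same_prefix()` and the four array names are hypothetical helpers written for this note; only the byte values are copied from the diff.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if the first `depth` bits of a and b are equal, else 0. */
static int
same_prefix(const uint8_t a[16], const uint8_t b[16], unsigned int depth)
{
	unsigned int full = depth / 8;	/* whole bytes covered by the prefix */
	unsigned int rest = depth % 8;	/* remaining bits in the next byte */
	uint8_t mask;

	if (memcmp(a, b, full) != 0)
		return 0;
	if (rest == 0)
		return 1;
	mask = (uint8_t)(0xff << (8 - rest));
	return (a[full] & mask) == (b[full] & mask);
}

/* Old entries for ports 0 and 1: port ID in byte 9, depth 48. */
static const uint8_t old_port0[16] = {32, 1, 2, 0, 0, 0, 0, 0, 0, 0};
static const uint8_t old_port1[16] = {32, 1, 2, 0, 0, 0, 0, 0, 0, 1};

/* New entries for ports 0 and 1: port ID in byte 7, depth 64. */
static const uint8_t new_port0[16] = {32, 1, 2, 0, 0, 0, 0, 0};
static const uint8_t new_port1[16] = {32, 1, 2, 0, 0, 0, 0, 1};
```

Under these definitions the two old entries are indistinguishable at depth 48, while the two new entries differ at depth 64 and still share the first 48 bits, that is, both stay inside 2001:200::/48.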
[dpdk-dev] [PATCH v5 2/5] examples/l3fwd: move l3fwd routes to common header
To prevent code duplication from the addition of lookup methods the routes specified in lpm should be moved to a common header. Signed-off-by: Conor Walsh Acked-by: Konstantin Ananyev Acked-by: Vladimir Medvedkin --- examples/l3fwd/l3fwd_common_route.h | 48 +++ examples/l3fwd/l3fwd_lpm.c | 74 +++-- 2 files changed, 65 insertions(+), 57 deletions(-) create mode 100644 examples/l3fwd/l3fwd_common_route.h diff --git a/examples/l3fwd/l3fwd_common_route.h b/examples/l3fwd/l3fwd_common_route.h new file mode 100644 index 00..7f0125a8a5 --- /dev/null +++ b/examples/l3fwd/l3fwd_common_route.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include +#include + +struct ipv4_l3fwd_common_route { + uint32_t ip; + uint8_t depth; + uint8_t if_out; +}; + +struct ipv6_l3fwd_common_route { + uint8_t ip[16]; + uint8_t depth; + uint8_t if_out; +}; + +/* + * 198.18.0.0/16 are set aside for RFC2544 benchmarking (RFC5735). + * 198.18.{0-7}.0/24 = Port {0-7} + */ +static const struct ipv4_l3fwd_common_route ipv4_l3fwd_common_route_array[] = { + {RTE_IPV4(198, 18, 0, 0), 24, 0}, + {RTE_IPV4(198, 18, 1, 0), 24, 1}, + {RTE_IPV4(198, 18, 2, 0), 24, 2}, + {RTE_IPV4(198, 18, 3, 0), 24, 3}, + {RTE_IPV4(198, 18, 4, 0), 24, 4}, + {RTE_IPV4(198, 18, 5, 0), 24, 5}, + {RTE_IPV4(198, 18, 6, 0), 24, 6}, + {RTE_IPV4(198, 18, 7, 0), 24, 7}, +}; + +/* + * 2001:200::/48 is IANA reserved range for IPv6 benchmarking (RFC5180). 
+ * 2001:200:0:{0-7}::/64 = Port {0-7} + */ +static const struct ipv6_l3fwd_common_route ipv6_l3fwd_common_route_array[] = { + {{32, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 0}, + {{32, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 1}, + {{32, 1, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 2}, + {{32, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 3}, + {{32, 1, 2, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 4}, + {{32, 1, 2, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 5}, + {{32, 1, 2, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 6}, + {{32, 1, 2, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 7}, +}; diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c index 1cfaf36572..818cf717d1 100644 --- a/examples/l3fwd/l3fwd_lpm.c +++ b/examples/l3fwd/l3fwd_lpm.c @@ -30,47 +30,7 @@ #include "l3fwd.h" #include "l3fwd_event.h" -struct ipv4_l3fwd_lpm_route { - uint32_t ip; - uint8_t depth; - uint8_t if_out; -}; - -struct ipv6_l3fwd_lpm_route { - uint8_t ip[16]; - uint8_t depth; - uint8_t if_out; -}; - -/* - * 198.18.0.0/16 are set aside for RFC2544 benchmarking (RFC5735). - * 198.18.{0-7}.0/24 = Port {0-7} - */ -static const struct ipv4_l3fwd_lpm_route ipv4_l3fwd_lpm_route_array[] = { - {RTE_IPV4(198, 18, 0, 0), 24, 0}, - {RTE_IPV4(198, 18, 1, 0), 24, 1}, - {RTE_IPV4(198, 18, 2, 0), 24, 2}, - {RTE_IPV4(198, 18, 3, 0), 24, 3}, - {RTE_IPV4(198, 18, 4, 0), 24, 4}, - {RTE_IPV4(198, 18, 5, 0), 24, 5}, - {RTE_IPV4(198, 18, 6, 0), 24, 6}, - {RTE_IPV4(198, 18, 7, 0), 24, 7}, -}; - -/* - * 2001:200::/48 is IANA reserved range for IPv6 benchmarking (RFC5180). 
- * 2001:200:0:{0-7}::/64 = Port {0-7} - */ -static const struct ipv6_l3fwd_lpm_route ipv6_l3fwd_lpm_route_array[] = { - {{32, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 0}, - {{32, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 1}, - {{32, 1, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 2}, - {{32, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 3}, - {{32, 1, 2, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 4}, - {{32, 1, 2, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 5}, - {{32, 1, 2, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 6}, - {{32, 1, 2, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0}, 64, 7}, -}; +#include "l3fwd_common_route.h" #define IPV4_L3FWD_LPM_MAX_RULES 1024 #define IPV4_L3FWD_LPM_NUMBER_TBL8S (1 << 8) @@ -485,18 +445,18 @@ setup_lpm(const int socketid) socketid); /* populate the LPM table */ - for (i = 0; i < RTE_DIM(ipv4_l3fwd_lpm_route_array); i++) { + for (i = 0; i < RTE_DIM(ipv4_l3fwd_common_route_array); i++) { struct in_addr in; /* skip unused ports */ - if ((1 << ipv4_l3fwd_lpm_route_array[i].if_out & + if ((1 << ipv4_l3fwd_common_route_array[i].if_out & enabled_port_mask) == 0) continue; ret = rte_lpm_add(ipv4_l3fwd_lpm_lookup_struct[socketid], - ipv4_l3fwd_lpm_route_array[i].ip, - ipv4_l3fwd_lpm_route_array[i].depth, - ipv4_l3fwd_lpm_route_array[i].if_out); +
[dpdk-dev] [PATCH v5 3/5] examples/l3fwd: add FIB infrastructure
The purpose of this commit is to add the necessary function calls and supporting infrastructure to allow the Forwarding Information Base (FIB) library to be integrated into the l3fwd sample app. Instead of adding an individual flag for FIB, a new flag '--lookup' has been added that allows the user to select their desired lookup method. The flags '-E' and '-L' have been retained for backwards compatibility. Signed-off-by: Conor Walsh Acked-by: Konstantin Ananyev Acked-by: Vladimir Medvedkin --- examples/l3fwd/Makefile | 2 +- examples/l3fwd/l3fwd.h | 27 - examples/l3fwd/l3fwd_event.c | 9 +++ examples/l3fwd/l3fwd_event.h | 1 + examples/l3fwd/l3fwd_fib.c | 60 examples/l3fwd/main.c| 107 ++- examples/l3fwd/meson.build | 4 +- 7 files changed, 176 insertions(+), 34 deletions(-) create mode 100644 examples/l3fwd/l3fwd_fib.c diff --git a/examples/l3fwd/Makefile b/examples/l3fwd/Makefile index 7e70bbd826..5f7baffbf7 100644 --- a/examples/l3fwd/Makefile +++ b/examples/l3fwd/Makefile @@ -5,7 +5,7 @@ APP = l3fwd # all source are stored in SRCS-y -SRCS-y := main.c l3fwd_lpm.c l3fwd_em.c l3fwd_event.c +SRCS-y := main.c l3fwd_lpm.c l3fwd_fib.c l3fwd_em.c l3fwd_event.c SRCS-y += l3fwd_event_generic.c l3fwd_event_internal_port.c # Build using pkg-config variables if possible diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h index 2cf06099e0..a808d60247 100644 --- a/examples/l3fwd/l3fwd.h +++ b/examples/l3fwd/l3fwd.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: BSD-3-Clause - * Copyright(c) 2010-2016 Intel Corporation + * Copyright(c) 2010-2021 Intel Corporation */ #ifndef __L3_FWD_H__ @@ -180,13 +180,16 @@ is_valid_ipv4_pkt(struct rte_ipv4_hdr *pkt, uint32_t link_len) int init_mem(uint16_t portid, unsigned int nb_mbuf); -/* Function pointers for LPM or EM functionality. */ +/* Function pointers for LPM, EM or FIB functionality. 
*/ void setup_lpm(const int socketid); void setup_hash(const int socketid); +void +setup_fib(const int socketid); + int em_check_ptype(int portid); @@ -207,6 +210,9 @@ em_main_loop(__rte_unused void *dummy); int lpm_main_loop(__rte_unused void *dummy); +int +fib_main_loop(__rte_unused void *dummy); + int lpm_event_main_loop_tx_d(__rte_unused void *dummy); int @@ -225,8 +231,17 @@ em_event_main_loop_tx_q(__rte_unused void *dummy); int em_event_main_loop_tx_q_burst(__rte_unused void *dummy); +int +fib_event_main_loop_tx_d(__rte_unused void *dummy); +int +fib_event_main_loop_tx_d_burst(__rte_unused void *dummy); +int +fib_event_main_loop_tx_q(__rte_unused void *dummy); +int +fib_event_main_loop_tx_q_burst(__rte_unused void *dummy); + -/* Return ipv4/ipv6 fwd lookup struct for LPM or EM. */ +/* Return ipv4/ipv6 fwd lookup struct for LPM, EM or FIB. */ void * em_get_ipv4_l3fwd_lookup_struct(const int socketid); @@ -239,4 +254,10 @@ lpm_get_ipv4_l3fwd_lookup_struct(const int socketid); void * lpm_get_ipv6_l3fwd_lookup_struct(const int socketid); +void * +fib_get_ipv4_l3fwd_lookup_struct(const int socketid); + +void * +fib_get_ipv6_l3fwd_lookup_struct(const int socketid); + #endif /* __L3_FWD_H__ */ diff --git a/examples/l3fwd/l3fwd_event.c b/examples/l3fwd/l3fwd_event.c index 4d31593a0a..961860ea18 100644 --- a/examples/l3fwd/l3fwd_event.c +++ b/examples/l3fwd/l3fwd_event.c @@ -227,6 +227,12 @@ l3fwd_event_resource_setup(struct rte_eth_conf *port_conf) [1][0] = em_event_main_loop_tx_q, [1][1] = em_event_main_loop_tx_q_burst, }; + const event_loop_cb fib_event_loop[2][2] = { + [0][0] = fib_event_main_loop_tx_d, + [0][1] = fib_event_main_loop_tx_d_burst, + [1][0] = fib_event_main_loop_tx_q, + [1][1] = fib_event_main_loop_tx_q_burst, + }; uint32_t event_queue_cfg; int ret; @@ -264,4 +270,7 @@ l3fwd_event_resource_setup(struct rte_eth_conf *port_conf) evt_rsrc->ops.em_event_loop = em_event_loop[evt_rsrc->tx_mode_q] [evt_rsrc->has_burst]; + + evt_rsrc->ops.fib_event_loop = 
fib_event_loop[evt_rsrc->tx_mode_q] + [evt_rsrc->has_burst]; } diff --git a/examples/l3fwd/l3fwd_event.h b/examples/l3fwd/l3fwd_event.h index 0e46164170..3ad1902ab5 100644 --- a/examples/l3fwd/l3fwd_event.h +++ b/examples/l3fwd/l3fwd_event.h @@ -55,6 +55,7 @@ struct l3fwd_event_setup_ops { adapter_setup_cb adapter_setup; event_loop_cb lpm_event_loop; event_loop_cb em_event_loop; + event_loop_cb fib_event_loop; }; struct l3fwd_event_resources { diff --git a/examples/l3fwd/l3fwd_fib.c b/examples/l3fwd/l3fwd_fib.c new file mode 100644 index 00..0a2d02db2f --- /dev/null +++ b/examples/l3fwd/l3fwd_fib.c @@ -0,0 +1,60 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Cor
[dpdk-dev] [PATCH v5 4/5] examples/l3fwd: implement FIB lookup method
This patch implements the Forwarding Information Base (FIB) library in l3fwd using the function calls and infrastructure introduced in the previous patch. Signed-off-by: Conor Walsh Acked-by: Konstantin Ananyev --- examples/l3fwd/l3fwd_fib.c | 480 - 1 file changed, 474 insertions(+), 6 deletions(-) diff --git a/examples/l3fwd/l3fwd_fib.c b/examples/l3fwd/l3fwd_fib.c index 0a2d02db2f..a58b933f83 100644 --- a/examples/l3fwd/l3fwd_fib.c +++ b/examples/l3fwd/l3fwd_fib.c @@ -2,59 +2,527 @@ * Copyright(c) 2021 Intel Corporation */ +#include +#include +#include +#include +#include + #include #include #include "l3fwd.h" +#if defined RTE_ARCH_X86 +#include "l3fwd_sse.h" +#elif defined __ARM_NEON +#include "l3fwd_neon.h" +#elif defined RTE_ARCH_PPC_64 +#include "l3fwd_altivec.h" +#endif #include "l3fwd_event.h" #include "l3fwd_common_route.h" +/* Configure how many packets ahead to prefetch for fib. */ +#define FIB_PREFETCH_OFFSET 4 + +/* A non-existent portid is needed to denote a default hop for fib. */ +#define FIB_DEFAULT_HOP 999 + +/* + * If the machine has SSE, NEON or PPC 64 then multiple packets + * can be sent at once if not only single packets will be sent + */ +#if defined RTE_ARCH_X86 || defined __ARM_NEON \ + || defined RTE_ARCH_PPC_64 +#define FIB_SEND_MULTI +#endif + +static struct rte_fib *ipv4_l3fwd_fib_lookup_struct[NB_SOCKETS]; +static struct rte_fib6 *ipv6_l3fwd_fib_lookup_struct[NB_SOCKETS]; + +/* Parse packet type and ip address. 
*/ +static inline void +fib_parse_packet(struct rte_mbuf *mbuf, + uint32_t *ipv4, uint32_t *ipv4_cnt, + uint8_t ipv6[RTE_FIB6_IPV6_ADDR_SIZE], + uint32_t *ipv6_cnt, uint8_t *ip_type) +{ + struct rte_ether_hdr *eth_hdr; + struct rte_ipv4_hdr *ipv4_hdr; + struct rte_ipv6_hdr *ipv6_hdr; + + eth_hdr = rte_pktmbuf_mtod(mbuf, struct rte_ether_hdr *); + /* IPv4 */ + if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) { + ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1); + *ipv4 = rte_be_to_cpu_32(ipv4_hdr->dst_addr); + /* Store type of packet in type_arr (IPv4=1, IPv6=0). */ + *ip_type = 1; + (*ipv4_cnt)++; + } + /* IPv6 */ + else { + ipv6_hdr = (struct rte_ipv6_hdr *)(eth_hdr + 1); + rte_mov16(ipv6, (const uint8_t *)ipv6_hdr->dst_addr); + *ip_type = 0; + (*ipv6_cnt)++; + } +} + +/* + * If the machine does not have SSE, NEON or PPC 64 then the packets + * are sent one at a time using send_single_packet() + */ +#if !defined FIB_SEND_MULTI +static inline void +fib_send_single(int nb_tx, struct lcore_conf *qconf, + struct rte_mbuf **pkts_burst, uint16_t hops[nb_tx]) +{ + int32_t j; + struct rte_ether_hdr *eth_hdr; + + for (j = 0; j < nb_tx; j++) { + /* Run rfc1812 if packet is ipv4 and checks enabled. */ +#if defined DO_RFC_1812_CHECKS + rfc1812_process((struct rte_ipv4_hdr *)(rte_pktmbuf_mtod( + pkts_burst[j], struct rte_ether_hdr *) + 1), + &hops[j], pkts_burst[j]->packet_type); +#endif + + /* Set MAC addresses. */ + eth_hdr = rte_pktmbuf_mtod(pkts_burst[j], + struct rte_ether_hdr *); + *(uint64_t *)ð_hdr->d_addr = dest_eth_addr[hops[j]]; + rte_ether_addr_copy(&ports_eth_addr[hops[j]], + ð_hdr->s_addr); + + /* Send single packet. */ + send_single_packet(qconf, pkts_burst[j], hops[j]); + } +} +#endif + +/* Bulk parse, fib lookup and send. 
*/ +static inline void +fib_send_packets(int nb_rx, struct rte_mbuf **pkts_burst, + uint16_t portid, struct lcore_conf *qconf) +{ + uint32_t ipv4_arr[nb_rx]; + uint8_t ipv6_arr[nb_rx][RTE_FIB6_IPV6_ADDR_SIZE]; + uint16_t hops[nb_rx]; + uint64_t hopsv4[nb_rx], hopsv6[nb_rx]; + uint8_t type_arr[nb_rx]; + uint32_t ipv4_cnt = 0, ipv6_cnt = 0; + uint32_t ipv4_arr_assem = 0, ipv6_arr_assem = 0; + uint16_t nh; + int32_t i; + + /* Prefetch first packets. */ + for (i = 0; i < FIB_PREFETCH_OFFSET && i < nb_rx; i++) + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i], void *)); + + /* Parse packet info and prefetch. */ + for (i = 0; i < (nb_rx - FIB_PREFETCH_OFFSET); i++) { + /* Prefetch packet. */ + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[ + i + FIB_PREFETCH_OFFSET], void *)); + fib_parse_packet(pkts_burst[i], + &ipv4_arr[ipv4_cnt], &ipv4_cnt, + ipv6_arr[ipv6_cnt], &ipv6_cnt, + &type_arr[i]); + } + + /* Parse remaining packet
[dpdk-dev] [PATCH v5 5/5] doc/guides/l3_forward: update documentation for FIB
The purpose of this patch is to update the l3fwd user guide to include the changes proposed in this patchset. Signed-off-by: Conor Walsh --- doc/guides/sample_app_ug/l3_forward.rst | 113 +--- 1 file changed, 100 insertions(+), 13 deletions(-) diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst index e7875f8dcd..d5892bdcf8 100644 --- a/doc/guides/sample_app_ug/l3_forward.rst +++ b/doc/guides/sample_app_ug/l3_forward.rst @@ -11,7 +11,7 @@ The application performs L3 forwarding. Overview -The application demonstrates the use of the hash and LPM libraries in the DPDK +The application demonstrates the use of the hash, LPM and FIB libraries in DPDK to implement packet forwarding using poll or event mode PMDs for packet I/O. The initialization and run-time paths are very similar to those of the :doc:`l2_forward_real_virtual` and :doc:`l2_forward_event`. @@ -22,7 +22,7 @@ decision is made based on information read from the input packet. Eventdev can optionally use S/W or H/W (if supported by platform) scheduler implementation for packet I/O based on run time parameters. -The lookup method is either hash-based or LPM-based and is selected at run time. When the selected lookup method is hash-based, +The lookup method is hash-based, LPM-based or FIB-based and is selected at run time. When the selected lookup method is hash-based, a hash object is used to emulate the flow classification stage. The hash object is used in correlation with a flow table to map each input packet to its flow at runtime. @@ -30,14 +30,14 @@ The hash lookup key is represented by a DiffServ 5-tuple composed of the followi Source IP Address, Destination IP Address, Protocol, Source Port and Destination Port. The ID of the output interface for the input packet is read from the identified flow table entry. The set of flows used by the application is statically configured and loaded into the hash at initialization time. 
-When the selected lookup method is LPM based, an LPM object is used to emulate the forwarding stage for IPv4 packets. -The LPM object is used as the routing table to identify the next hop for each input packet at runtime. +When the selected lookup method is LPM or FIB based, an LPM or FIB object is used to emulate the forwarding stage for IPv4 packets. +The LPM or FIB object is used as the routing table to identify the next hop for each input packet at runtime. -The LPM lookup key is represented by the Destination IP Address field read from the input packet. -The ID of the output interface for the input packet is the next hop returned by the LPM lookup. -The set of LPM rules used by the application is statically configured and loaded into the LPM object at initialization time. +The LPM and FIB lookup keys are represented by the Destination IP Address field read from the input packet. +The ID of the output interface for the input packet is the next hop returned by the LPM or FIB lookup. +The set of LPM and FIB rules used by the application is statically configured and loaded into the LPM or FIB object at initialization time. -In the sample application, hash-based forwarding supports IPv4 and IPv6. LPM-based forwarding supports IPv4 only. +In the sample application, hash-based and FIB-based forwarding supports both IPv4 and IPv6. LPM-based forwarding supports IPv4 only. 
Compiling the Application - @@ -53,8 +53,7 @@ The application has a number of command line options:: ./dpdk-l3fwd [EAL options] -- -p PORTMASK [-P] - [-E] - [-L] + [--lookup LOOKUP_METHOD] --config(port,queue,lcore)[,(port,queue,lcore)] [--eth-dest=X,MM:MM:MM:MM:MM:MM] [--enable-jumbo [--max-pkt-len PKTLEN]] @@ -66,6 +65,8 @@ The application has a number of command line options:: [--mode] [--eventq-sched] [--event-eth-rxqs] + [-E] + [-L] Where, @@ -74,9 +75,7 @@ Where, * ``-P:`` Optional, sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted. -* ``-E:`` Optional, enable exact match. - -* ``-L:`` Optional, enable longest prefix match. +* ``--lookup:`` Optional, Select the lookup method. Accepted options ``em`` (Exact Match), ``lpm`` (Longest Prefix Match), ``fib`` (Forwarding Information Base). Default is ``lpm``. * ``--config (port,queue,lcore)[,(port,queue,lcore)]:`` Determines which queues from which ports are mapped to which cores. @@ -102,6 +101,10 @@
Re: [dpdk-dev] [PATCH v3 0/4] net: replace Windows networking shim
On 3/13/2021 10:22 PM, Dmitry Kozlyuk wrote:
> Networking header shim in Windows EAL conflicts with system headers and
> tries to provide POSIX compatibility out of scope for DPDK.
> Remove dependency on POSIX headers from libraries supported on Windows,
> then replace shim with librte_net with workarounds.
>
> A proposed deprecation notice is assumed:
> http://patchwork.dpdk.org/project/dpdk/list/?series=15595
>
> v3: Fix build on FreeBSD for real (CI).
> v2: Fix build on FreeBSD (CI).
>
> Depends-on: series-15513 ("eal/windows: do not expose POSIX symbols")
>
> Dmitry Kozlyuk (4):
>   cmdline: remove POSIX dependency
>   ethdev: remove POSIX dependency
>   net/mlx5: remove POSIX dependency
>   net: replace Windows networking shim

Hi Dmitry,

Have you seen the CI reported build errors:
http://mails.dpdk.org/archives/test-report/2021-March/182361.html

Briefly:
./lib/librte_net/rte_net.c:132:7: error: 'IPPROTO_GRE' undeclared
./lib/librte_net/rte_net.c:163:7: error: 'IPPROTO_IPIP' undeclared
Re: [dpdk-dev] [PATCH v11 2/2] bus/pci: support MMIO in PCI ioport accessors
> -----Original Message-----
> From: David Marchand
> Sent: Monday, March 15, 2021 18:20
> To: Wang, Haiyue ; 谢华伟(此时此刻)
> Cc: maxime.coque...@redhat.com; Yigit, Ferruh ; dev@dpdk.org; Burakov, Anatoly ; xuemi...@nvidia.com; gr...@u256.net
> Subject: Re: [dpdk-dev] [PATCH v11 2/2] bus/pci: support MMIO in PCI ioport accessors
>
> On Thu, Mar 11, 2021 at 7:43 AM Wang, Haiyue wrote:
> > Like kernel use macro to do pio and mmio, maybe we can also to do so for
> > making code clean:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/iomap.c
> >
> > #define IO_COND(addr, is_pio, is_mmio) do { \
> > 	unsigned long port = (unsigned long __force)addr; \
> > 	if (port >= PIO_RESERVED) { \
> > 		is_mmio; \
> > 	} else if (port > PIO_OFFSET) { \
> > 		port &= PIO_MASK; \
> > 		is_pio; \
> > 	} else \
> > 		bad_io_access(port, #is_pio ); \
> > } while (0)
> >
> >
> > Like:
> >
> > #if defined(RTE_ARCH_X86)
> > #define IO_COND(addr, is_pio, is_mmio) do { \
> > 	if ((uint64_t)(uintptr_t)addr >= PIO_MAX) { \
> > 		is_mmio; \
> > 	} else { \
> > 		is_pio; \
> > 	} \
> > } while (0)
> > #else
> > #define IO_COND(addr, is_pio, is_mmio) do { \
> > 	is_mmio; \
> > } while (0)
> > #endif
>
> We should not just copy/paste kernel code.
>

Got it ;-)

>
>
> --
> David Marchand
Re: [dpdk-dev] Duplicating traffic with RTE Flow
Hello Jiawei, On Fri, 12 Mar 2021 09:32:44 + "Jiawei(Jonny) Wang" wrote: > Hi Jan, > > > -Original Message- > > From: Jan Viktorin > > Sent: Friday, March 12, 2021 12:33 AM > > To: Jiawei(Jonny) Wang > > Cc: Slava Ovsiienko ; Asaf Penso > > ; dev@dpdk.org; Ori Kam > > Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow > > > > On Thu, 11 Mar 2021 02:11:07 + > > "Jiawei(Jonny) Wang" wrote: > > > > > Hi Jan, > > > > > > Sorry for late response, > > > > > > First rule is invalid, port only works on FDB domain so need > > > 'transfer' here; Second rule should be ok, could you please check if the > > > > > port 1 was enabled on you dpdk application? > > > > I assume that it is enabled, see full transcript: > > > > $ ofed_info > > MLNX_OFED_LINUX-5.2-1.0.4.0 (OFED-5.2-1.0.4): > > ... > > $ sudo dpdk-testpmd -v -- -i > > EAL: Detected 24 lcore(s) > > EAL: Detected 1 NUMA nodes > > EAL: RTE Version: 'DPDK 20.11.0' > > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket > > EAL: Selected IOVA mode 'PA' > > EAL: No available hugepages reported in hugepages-1048576kB > > EAL: Probing VFIO support... > > EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.0 (socket 0) > > mlx5_pci: No available register for Sampler. > > mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1. > > EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.1 (socket 0) > > mlx5_pci: No available register for Sampler. > > mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1. > > EAL: No legacy callbacks, legacy socket not created Interactive-mode > > selected > > testpmd: create a new mbuf pool : n=331456, size=2176, > > socket=0 > > testpmd: preferred mempool ops selected: ring_mp_mc Configuring Port 0 > > (socket 0) Port 0: B8:59:9F:E2:09:F6 Configuring Port 1 (socket 0) Port > > 1: > > B8:59:9F:E2:09:F7 Checking link statuses... 
> > Done > > Seems that you start two PF port here, Port 1 is not VF port; > FDB rule can steering the packet form PF to its VFs and vice versa, Could you > please try to open the > VF ports and start the testpmd with representor=. I did not know this, so I tried with VFs: # echo 2 > /sys/class/net/hge1/device/sriov_numvfs # echo switchdev > /sys/class/net/hge1/compat/devlink/mode # dpdk-testpmd -v -a ':05:00.1,representor=[0-1]' -- -i EAL: Detected 24 lcore(s) EAL: Detected 1 NUMA nodes EAL: RTE Version: 'DPDK 20.11.0' EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: Selected IOVA mode 'VA' EAL: No available hugepages reported in hugepages-1048576kB EAL: Probing VFIO support... EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :05:00.1 (socket 0) mlx5_pci: No available register for Sampler. mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1. mlx5_pci: No available register for Sampler. mlx5_pci: No available register for Sampler. EAL: No legacy callbacks, legacy socket not created Interactive-mode selected testpmd: create a new mbuf pool : n=331456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Warning! port-topology=paired and odd forward ports number, the last port will pair with itself. Configuring Port 0 (socket 0) Port 0: B8:59:9F:E2:09:F7 Configuring Port 1 (socket 0) Port 1: B2:57:D6:72:F3:31 Configuring Port 2 (socket 0) Port 2: 9E:CB:D0:73:59:CE Checking link statuses... 
Done testpmd> show port summary all Number of available ports: 3 Port MAC Address Name Driver Status Link 0B8:59:9F:E2:09:F7 :05:00.1 mlx5_pci up 100 Gbps 1B2:57:D6:72:F3:31 :05:00.1_representor_0 mlx5_pci up 100 Gbps 29E:CB:D0:73:59:CE :05:00.1_representor_1 mlx5_pci up 100 Gbps testpmd> set sample_actions 0 port_id id 1 / end testpmd> flow validate 0 ingress transfer pattern end actions sample ratio 1 index 0 / drop / end port_flow_complain(): Caught PMD error type 1 (cause unspecified): sample action not supported: Operation not supported Still no luck. However, there is this message 3-times in the log: mlx5_pci: No available register for Sampler. It looks like it might be related. What does it mean? Jan > > Thanks. > > > testpmd> port start 1 > > Port 1 is now not stopped > > Please stop the ports first > > Done > > testpmd> set sample_actions 0 port_id id 1 / end testpmd> flow validate 0 > > > > ingress transfer pattern end actions sample ratio 1 index 0 / drop / end > > port_flow_complain(): Caught PMD error type 1 (cause unspecified): (no > > stated reason): Operation not supported testpmd> flow create 0 ingress > > transfer pattern end actions sample ratio 1 index 0 / drop / end > > port_flow_complain(): Caught PMD error type 1 (cause unspecified): (no > > stated reason): Operation not supported testpmd> Stopping port 0... > > Stopping ports... > > Done > > > > Stopping port 1... > > Stoppi
Re: [dpdk-dev] [PATCH v3 0/4] net: replace Windows networking shim
Hi Ferruh,

> Have you seen the CI reported build errors:
> http://mails.dpdk.org/archives/test-report/2021-March/182361.html
>
> Briefly:
> ./lib/librte_net/rte_net.c:132:7: error: 'IPPROTO_GRE' undeclared
> ./lib/librte_net/rte_net.c:163:7: error: 'IPPROTO_IPIP' undeclared
>

This is because CI doesn't apply the patches listed in Depends-on.
In this case, the missing constants are defined when RTE_BUILD_INTERNAL is
defined (so that the symbols are only visible to DPDK), and this is
introduced by the dependency series.
[dpdk-dev] [Bug 661] [mlx5] VLAN packets will not do RSS
https://bugs.dpdk.org/show_bug.cgi?id=661

Bug ID: 661
Summary: [mlx5] VLAN packets will not do RSS
Product: DPDK
Version: 19.11
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: alia...@nvidia.com
Target Milestone: ---

RSS doesn't seem to be working correctly with packets that have VLAN.
The packets will be received, but will always go to the first queue.

To reproduce, run testpmd and set RSS type to TCP, then send the following
packet:
Ether / Dot1Q / IP / TCP / Raw

Packets will be spread correctly when sending a packet without VLAN. Example:
Ether / IP / TCP / Raw

-- 
You are receiving this mail because:
You are the assignee for the bug.
[dpdk-dev] [Bug 661] [mlx5] VLAN packets will not do RSS
https://bugs.dpdk.org/show_bug.cgi?id=661

Ali Alnubani (alia...@nvidia.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Ali Alnubani (alia...@nvidia.com) ---
Will not fix this in 19.11.

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [dpdk-dev] 19.11.7 patches review and test
Hi, > -Original Message- > From: Christian Ehrhardt > Sent: Wednesday, March 10, 2021 3:37 PM > To: sta...@dpdk.org > Cc: dev@dpdk.org; Abhishek Marathe ; > Akhil Goyal ; Ali Alnubani ; > benjamin.wal...@intel.com; David Christensen ; > hariprasad.govindhara...@intel.com; Hemant Agrawal > ; Ian Stokes ; Jerin > Jacob ; John McNamara ; > Ju-Hyoung Lee ; Kevin Traynor > ; Luca Boccassi ; Pei Zhang > ; pingx...@intel.com; qian.q...@intel.com; Raslan > Darawsheh ; NBU-Contact-Thomas Monjalon > ; yuan.p...@intel.com; zhaoyan.c...@intel.com > Subject: 19.11.7 patches review and test > > Hi all, > > Here is a list of patches targeted for stable release 19.11.7. > > The (new) planned date for the final release is 17th of March. > > Please help with testing and validation of your use cases and report > any issues/results with reply-all to this mail. For the final release > the fixes and reported validations will be added to the release notes. > > A release candidate tarball can be found at: > > https://dpdk.org/browse/dpdk-stable/tag/?id=v19.11.7-rc2 > > These patches are located at branch 19.11 of dpdk-stable repo: > https://dpdk.org/browse/dpdk-stable/ > > Thanks. > > Christian Ehrhardt > > --- Thanks Christian for creating the new release candidate. The following covers the functional tests that we ran on Mellanox hardware for this release: - Basic functionality: Send and receive multiple types of traffic. - testpmd xstats counter test. - testpmd timestamp test. - Changing/checking link status through testpmd. - RTE flow tests: Items: eth / vlan / ipv4 / ipv6 / tcp / udp / icmp / gre / nvgre / vxlan / ip in ip / mplsoudp / mplsogre Actions: drop / queue / rss / mark / flag / jump / count / raw_encap / raw_decap / vxlan_encap / vxlan_decap / NAT / dec_ttl - Some RSS tests. - VLAN filtering, stripping and insertion tests. - Checksum and TSO tests. - ptype tests. - link_status_interrupt example application tests. - l3fwd-power example application tests. 
- Multi-process example applications tests.

Functional tests ran on:
- NIC: ConnectX-4 Lx / OS: RHEL7.4 / Driver: MLNX_OFED_LINUX-5.2-2.2.0.0 / Firmware: 14.29.2002
- NIC: ConnectX-5 / OS: RHEL7.4 / Driver: MLNX_OFED_LINUX-5.2-2.2.0.0 / Firmware: 16.29.2002

Compilation tests with multiple configurations in the following
OS/driver combinations are also passing:
- Ubuntu 20.04.2 with MLNX_OFED_LINUX-5.2-2.2.0.0.
- Ubuntu 20.04.2 with rdma-core master (a1a9ffb).
- Ubuntu 20.04.2 with rdma-core v28.0.
- Ubuntu 18.04.5 with rdma-core v17.1.
- Ubuntu 18.04.5 with rdma-core master (a1a9ffb) (i386).
- Ubuntu 16.04.7 with rdma-core v22.7.
- Fedora 32 with rdma-core v33.0.
- CentOS 7 7.9.2009 with rdma-core master (a1a9ffb).
- CentOS 7 7.9.2009 with MLNX_OFED_LINUX-5.2-2.2.0.0.
- CentOS 8 8.3.2011 with rdma-core master (a1a9ffb).
- OpenSUSE Leap 15.2 with rdma-core v27.1.

We don't see any new issues in this release candidate. However, due to
environment changes, we started seeing the following issue, which
reproduces in older 19.11 releases as well:
https://bugs.dpdk.org/show_bug.cgi?id=661
We will not fix this issue in this release.

Regards,
Ali
Re: [dpdk-dev] DPDK 21.05 NVIDIA Mellanox Roadmap
This roadmap has been integrated in the web page https://core.dpdk.org/roadmap/#2105 01/03/2021 20:17, Asaf Penso: > Below is NVIDIA Mellanox's roadmap for DPDK21.05, on which we are currently > working: > > rte_flow new APIs: > === > [1]Support a new action offload which perform connection tracking window > validation. > Motivation: > TCP connection tracking is needed for many applications that act as a > mediator and perform forwarding. The new offload connection tracking(CT) > window validation is used for enforcing TCP protocol adherence. > It also enforces several sanity checks for TCP packets like the validity of > L3 and L4 headers as well as the accuracy of L3 and L4 checksum. > The new offload action API will provide means to create, configure, query, > and modify the connection tracking object by a SW application. It will > support a bi-directional, cross vport TCP handshake in an optimized manner > > [2]Add support for matching based on sanity checks of TCP packets. Able to > match the validity of L3 and L4 headers as well as L3 checksum and L4 > checksum. > Motivation: > Allow TCP connection tracking flow to intercept corrupted packets before they > alter the connection tracking object. An application may match on such cases > and handle differently than regular route(e.g drop or pass to SW queue > > [3]Extend meter capabilities with the concept of meter policy > Motivation: > Extend meter capabilities to add support for a shared meter policy. A meter > policy is an object that can be shared among different meters. It provides > the ability to associate different actions per color Red/Yellow/Green and > thus use a meter as a steering mechanism. The first implementation will > support queue, rss, jump, mark, and set_tag actions. Given the fact that the > policy is shared across many meter flows a performance gain is also expected. > rte API will be augmented with an additional create meter API to make use of > the new policy object. 
> > [4]Add support for writing information related to a single rte flow > Motivation: > Allow finely grained debug of how flows are represented in the HW. Previously > support was added to dump all rte flows using 'flow dump all > . Now we are extending to support single flow dump using flow > dump rule > > > rte_mtr new APIs > === > [5] add support for a meter profile that enable packet per second metering > Motivation: > Provide flexibility to applications that would like to meter based on packets > per second granularity on top of byte per second granularity that exist today > as part of meter profile. > > > mlx5 PMD updates: > > mlx5 PMD will support the rte_flow update changes listed above and below > > [6]Extend support for VLAN pop on egress direction and VLAN push on ingress > direction > Motivation: > Some applications like firewalls, need to alter the routing information > bi-directionally. Today mlx5 PMD supports VLAN pop on ingress and VLAN push > on ingress and the intention here is the augment with the corresponding > pop/push actions. > > [7]Add support for rte_security API > Motivation: enable IPsec inline offload to be used in conjunction with other > rte flow API to enable inline encrypt/decrypt of packets. Mlx5 will support > Encapsulating Security Payload(ESP) with ConnectX-6 Dx and BlueField-2 > > [8]Add support for power saving in rx queues > Motivation: support for umwait command to enable reduction of power > consumption if no packets are received. 
> > [9]Add support for using HW configured timestamp format > Motivation: modify the pmd to use the timestamp format based on HW ability - > either UTC or free-running > > > New PMDs: > == > [10]Implement look aside AES-XTS encryption/decryption PMD over BlueField-2 > SmartNIC and ConnectX-6 Dx to support existing rte_cryptodev API > > > Regex PMD updates: > = > [11]Added support for regex(regular expression engine in BlueField-2 with > chained mbuf > Motivation: Allow regex to handle jobs that require a multiple chained mbuf > jobs efficiently > > > testpmd updates: > > testpmd updated to support the changes listed above > > > flow-perf updates: > > enhance flow-perf application to support the connection tracking window > validation offload
Re: [dpdk-dev] DPDK 21.05 Wangxun Roadmap
The new driver has been added to the roadmap web page:
https://core.dpdk.org/roadmap/#2105

04/03/2021 09:30, Jiawen Wu:
> There is Wangxun's roadmap for DPDK 21.05.
>
> Bug fixes:
> [1] fix TXGBE Rx drop statistics
> [2] fix TXGBE packet type
> [3] fix TXGBE IPsec
> [4] fix TXGBE backplane link process, and support to control training for
> auto-negotiation
>
> Others:
> [1] remove redundancy code for TXGBE
>
> New PMD:
> [1] add a new PMD for Wangxun 1Gb NICs
Re: [dpdk-dev] [PATCH 2/3] net/virtio: allocate fake mbuf in Rx queue
On 1/11/21 6:39 AM, Xia, Chenbo wrote: > Hi Maxime, > >> -Original Message- >> From: Maxime Coquelin >> Sent: Tuesday, December 22, 2020 12:15 AM >> To: dev@dpdk.org; Xia, Chenbo ; amore...@redhat.com; >> david.march...@redhat.com; olivier.m...@6wind.com >> Cc: Maxime Coquelin >> Subject: [PATCH 2/3] net/virtio: allocate fake mbuf in Rx queue >> >> While it is worth clarifying whether the fake mbuf >> in virtnet_rx struct is really necessary, it is sure >> that it heavily impacts cache usage by being part of >> the struct. Indeed, it takes uses cachelines, and >> requires alignement on a cacheline. >> >> Before this series, it means it took 120 bytes in >> virtnet_rx struct: >> >> struct virtnet_rx { >> struct virtqueue * vq; /* 0 8 */ >> >> /* XXX 56 bytes hole, try to pack */ >> >> /* --- cacheline 1 boundary (64 bytes) --- */ >> struct rte_mbuffake_mbuf __attribute__((__aligned__(64))); >> /*64 128 */ >> /* --- cacheline 3 boundary (192 bytes) --- */ >> >> This patch allocates it using malloc in order to optimize >> virtnet_rx cache usage and so virtqueue cache usage. 
>> >> Signed-off-by: Maxime Coquelin >> --- >> drivers/net/virtio/virtio_ethdev.c | 10 ++ >> drivers/net/virtio/virtio_rxtx.c | 8 +++- >> drivers/net/virtio/virtio_rxtx.h | 2 +- >> 3 files changed, 14 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/net/virtio/virtio_ethdev.c >> b/drivers/net/virtio/virtio_ethdev.c >> index 297c01a70d..a1351b36ca 100644 >> --- a/drivers/net/virtio/virtio_ethdev.c >> +++ b/drivers/net/virtio/virtio_ethdev.c >> @@ -539,6 +539,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> } >> >> if (queue_type == VTNET_RQ) { >> +struct rte_mbuf *fake_mbuf; >> size_t sz_sw = (RTE_PMD_VIRTIO_RX_MAX_BURST + vq_size) * >> sizeof(vq->sw_ring[0]); >> >> @@ -550,10 +551,18 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> goto fail_q_alloc; >> } >> >> +fake_mbuf = malloc(sizeof(*fake_mbuf)); >> +if (!fake_mbuf) { >> +PMD_INIT_LOG(ERR, "can not allocate fake mbuf"); >> +ret = -ENOMEM; >> +goto fail_q_alloc; >> +} >> + >> vq->sw_ring = sw_ring; >> rxvq = &vq->rxq; >> rxvq->port_id = dev->data->port_id; >> rxvq->mz = mz; >> +rxvq->fake_mbuf = fake_mbuf; >> } else if (queue_type == VTNET_TQ) { >> txvq = &vq->txq; >> txvq->port_id = dev->data->port_id; >> @@ -636,6 +645,7 @@ virtio_free_queues(struct virtio_hw *hw) >> >> queue_type = virtio_get_queue_type(hw, i); >> if (queue_type == VTNET_RQ) { >> +free(vq->rxq.fake_mbuf); > > After thinking about this again, although you add the free of fake mbuf > here, it's better to add free in virtio_init_queue too after fail_q_alloc. > And when setup_queue(hw, vq) fails, it's better to goto fail_q_alloc to > free fake mbuf. Now it will not memory leak as we use virtio_free_queues when > virtio_alloc_queues fails. But inside virtio_init_queue, it's better to > handle the errors well.. If you agree with above, it may also be good to > change the name 'fail_q_alloc' since now it may also fail when setting up > queues. The error path indeed needs some rework. 
I will add a preliminary patch to rework it before this patch is applied. > Sorry for an extra email about this... No worries, that's much appreciated! Thanks, Maxime > Thanks, > Chenbo > >> rte_free(vq->sw_ring); >> rte_memzone_free(vq->rxq.mz); >> } else if (queue_type == VTNET_TQ) { >> diff --git a/drivers/net/virtio/virtio_rxtx.c >> b/drivers/net/virtio/virtio_rxtx.c >> index 1fcce36cbd..d147d7300a 100644 >> --- a/drivers/net/virtio/virtio_rxtx.c >> +++ b/drivers/net/virtio/virtio_rxtx.c >> @@ -703,11 +703,9 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev >> *dev, >> uint16_t queue_idx) >> virtio_rxq_vec_setup(rxvq); >> } >> >> -memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf)); >> -for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; >> - desc_idx++) { >> -vq->sw_ring[vq->vq_nentries + desc_idx] = >> -&rxvq->fake_mbuf; >> +memset(rxvq->fake_mbuf, 0, sizeof(*rxvq->fake_mbuf)); >> +for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; desc_idx++) { >> +vq->sw_ring[vq->vq_nentries + desc_idx] = rxvq->fake_mbuf; >> } >> >> if (hw->use_vec_rx && !virtio_with_packed_queue(hw)) { >> diff --git a/drivers/net/virtio/virtio_rxtx.h >> b/drivers/net/virtio/virtio_rxtx.h >> index 7f1036be6f..6ce5d67d15 100644 >> --- a/drivers/net/virtio/virtio_rxtx.h >> +++ b/drivers/net/virtio/
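
The error-path rework discussed in this thread could look roughly like the following. This is a standalone sketch, not the virtio PMD code: `demo_queue`, `demo_init_queue()` and the `fail_step` knob are hypothetical, and plain `malloc()`/`free()` stand in for the driver's allocators. It only illustrates ordering the labels so that each failure frees exactly what was already allocated, including the fake mbuf.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Rx queue state; not the real virtnet_rx. */
struct demo_queue {
	void *sw_ring;
	void *fake_mbuf;
	int ready;
};

/*
 * fail_step simulates a failure: 1 = sw_ring allocation, 2 = fake mbuf
 * allocation, 3 = queue setup. Any other value lets every step succeed.
 */
static int
demo_init_queue(struct demo_queue *q, int fail_step)
{
	int ret;

	assert(q != NULL);

	q->sw_ring = (fail_step == 1) ? NULL : malloc(64);
	if (q->sw_ring == NULL) {
		ret = -ENOMEM;
		goto fail_sw_ring;
	}

	q->fake_mbuf = (fail_step == 2) ? NULL : malloc(128);
	if (q->fake_mbuf == NULL) {
		ret = -ENOMEM;
		goto fail_fake_mbuf;
	}

	if (fail_step == 3) {	/* stands in for a failing setup_queue() */
		ret = -EINVAL;
		goto fail_setup;
	}

	q->ready = 1;
	return 0;

	/* Unwind in reverse order of allocation. */
fail_setup:
	free(q->fake_mbuf);
	q->fake_mbuf = NULL;
fail_fake_mbuf:
	free(q->sw_ring);
	q->sw_ring = NULL;
fail_sw_ring:
	q->ready = 0;
	return ret;
}
```

With one label per acquired resource, a generic name such as `fail_q_alloc` is no longer needed, which matches the renaming suggestion made in the review.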
[dpdk-dev] [PATCH v2 0/8] common/sfc_efx: prepare to introduce vDPA driver
Update base driver to provide functionality required by vDPA driver.
Factor out helper functions to be shared by net and vDPA drivers.

v2:
 - fix windows build breakage
 - do not build common/sfc_efx in the case of windows
 - remove undefined efx_virtio_* functions from version.map
   (since EFSYS_OPT_VIRTIO is disabled)

Vijay Kumar Srivastava (6):
  common/sfc_efx/base: add virtio build dependency
  common/sfc_efx/base: add support to get virtio features
  common/sfc_efx/base: add support to verify virtio features
  common/sfc_efx: add support to get the device class
  net/sfc: skip driver probe for incompatible device class
  drivers: add common driver API to get efx family

Vijay Srivastava (2):
  common/sfc_efx/base: add base virtio support for vDPA
  common/sfc_efx/base: add API to get VirtQ doorbell offset

 doc/guides/nics/sfc_efx.rst                |   8 +
 drivers/common/meson.build                 |   2 +-
 drivers/common/sfc_efx/base/efx.h          | 142
 drivers/common/sfc_efx/base/efx_check.h    |   9 +
 drivers/common/sfc_efx/base/efx_impl.h     |  42 +++
 drivers/common/sfc_efx/base/efx_virtio.c   | 340 ++
 drivers/common/sfc_efx/base/meson.build    |   2 +
 drivers/common/sfc_efx/base/rhead_impl.h   |  37 ++
 drivers/common/sfc_efx/base/rhead_virtio.c | 379 +
 drivers/common/sfc_efx/efsys.h             |   2 +
 drivers/common/sfc_efx/meson.build         |   5 +
 drivers/common/sfc_efx/sfc_efx.c           | 105 ++
 drivers/common/sfc_efx/sfc_efx.h           |  44 +++
 drivers/common/sfc_efx/version.map         |   3 +
 drivers/meson.build                        |   1 +
 drivers/net/sfc/sfc.c                      |  61 +---
 drivers/net/sfc/sfc.h                      |   1 +
 drivers/net/sfc/sfc_ethdev.c               |   7 +
 drivers/net/sfc/sfc_kvargs.c               |   1 +
 19 files changed, 1133 insertions(+), 58 deletions(-)
 create mode 100644 drivers/common/sfc_efx/base/efx_virtio.c
 create mode 100644 drivers/common/sfc_efx/base/rhead_virtio.c
 create mode 100644 drivers/common/sfc_efx/sfc_efx.h

--
2.30.1
[dpdk-dev] [PATCH v2 2/8] common/sfc_efx/base: add API to get VirtQ doorbell offset
From: Vijay Srivastava Add an API to query the virtqueue doorbell offset in the BAR for a VI. For vDPA, the virtio net driver notifies the device directly by writing doorbell. This API would be invoked from vDPA client driver. Signed-off-by: Vijay Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/sfc_efx/base/efx.h | 12 +++ drivers/common/sfc_efx/base/efx_impl.h | 2 + drivers/common/sfc_efx/base/efx_virtio.c | 41 ++ drivers/common/sfc_efx/base/rhead_impl.h | 6 ++ drivers/common/sfc_efx/base/rhead_virtio.c | 93 ++ 5 files changed, 154 insertions(+) diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h index c2c73bd382..d4b7d7f47e 100644 --- a/drivers/common/sfc_efx/base/efx.h +++ b/drivers/common/sfc_efx/base/efx.h @@ -4475,6 +4475,18 @@ extern void efx_virtio_qdestroy( __inefx_virtio_vq_t *evvp); +/* + * Get the offset in the BAR of the doorbells for a VI. + * net device : doorbell offset of RX & TX queues + * block device : request doorbell offset in the BAR. 
+ * For further details refer section of 4 of SF-119689 + */ +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_get_doorbell_offset( + __inefx_virtio_vq_t *evvp, + __out uint32_t *offsetp); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/efx_impl.h b/drivers/common/sfc_efx/base/efx_impl.h index f27d9fa82c..d6742f4a8c 100644 --- a/drivers/common/sfc_efx/base/efx_impl.h +++ b/drivers/common/sfc_efx/base/efx_impl.h @@ -316,6 +316,8 @@ typedef struct efx_virtio_ops_s { efx_virtio_vq_dyncfg_t *); efx_rc_t(*evo_virtio_qstop)(efx_virtio_vq_t *, efx_virtio_vq_dyncfg_t *); + efx_rc_t(*evo_get_doorbell_offset)(efx_virtio_vq_t *, + uint32_t *); } efx_virtio_ops_t; #endif /* EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/efx_virtio.c b/drivers/common/sfc_efx/base/efx_virtio.c index 1b7b01556e..de998fcad9 100644 --- a/drivers/common/sfc_efx/base/efx_virtio.c +++ b/drivers/common/sfc_efx/base/efx_virtio.c @@ -12,6 +12,7 @@ static const efx_virtio_ops_t __efx_virtio_rhead_ops = { rhead_virtio_qstart,/* evo_virtio_qstart */ rhead_virtio_qstop, /* evo_virtio_qstop */ + rhead_virtio_get_doorbell_offset, /* evo_get_doorbell_offset */ }; #endif /* EFSYS_OPT_RIVERHEAD */ @@ -213,4 +214,44 @@ efx_virtio_qdestroy( } } + __checkReturn efx_rc_t +efx_virtio_get_doorbell_offset( + __inefx_virtio_vq_t *evvp, + __out uint32_t *offsetp) +{ + efx_nic_t *enp; + const efx_virtio_ops_t *evop; + efx_rc_t rc; + + if ((evvp == NULL) || (offsetp == NULL)) { + rc = EINVAL; + goto fail1; + } + + enp = evvp->evv_enp; + evop = enp->en_evop; + + EFSYS_ASSERT3U(enp->en_magic, ==, EFX_NIC_MAGIC); + EFSYS_ASSERT3U(enp->en_mod_flags, &, EFX_MOD_VIRTIO); + + if (evop == NULL) { + rc = ENOTSUP; + goto fail2; + } + + if ((rc = evop->evo_get_doorbell_offset(evvp, offsetp)) != 0) + goto fail3; + + return (0); + +fail3: + EFSYS_PROBE(fail3); +fail2: + EFSYS_PROBE(fail2); +fail1: + EFSYS_PROBE1(fail1, efx_rc_t, rc); + + return (rc); +} + #endif /* 
EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/rhead_impl.h b/drivers/common/sfc_efx/base/rhead_impl.h index a15ac52a58..4304f63f4c 100644 --- a/drivers/common/sfc_efx/base/rhead_impl.h +++ b/drivers/common/sfc_efx/base/rhead_impl.h @@ -492,6 +492,12 @@ rhead_virtio_qstop( __inefx_virtio_vq_t *evvp, __out_opt efx_virtio_vq_dyncfg_t *evvdp); +LIBEFX_INTERNAL +extern __checkReturn efx_rc_t +rhead_virtio_get_doorbell_offset( + __inefx_virtio_vq_t *evvp, + __out uint32_t *offsetp); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/rhead_virtio.c b/drivers/common/sfc_efx/base/rhead_virtio.c index d1719f834e..147460c95c 100644 --- a/drivers/common/sfc_efx/base/rhead_virtio.c +++ b/drivers/common/sfc_efx/base/rhead_virtio.c @@ -187,4 +187,97 @@ rhead_virtio_qstop( return (rc); } + __checkReturn efx_rc_t +rhead_virtio_get_doorbell_offset( + __inefx_virtio_vq_t *evvp, + __out uint32_t *offsetp) +{ + efx_nic_t *enp = evvp->evv_enp; + efx_mcdi_req_t req; + uint32_t type; + EFX_MCDI_DECLARE_BUF(payload, MC_CMD_VIRTIO_GET_DOORBELL_OFFSET_REQ_LEN, + MC_CMD_VIRTIO_GET_NET_DOORBELL_OFFSET_RESP_LEN); + efx_rc_t
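
The dispatch pattern used by `efx_virtio_get_doorbell_offset()` in this patch (validate arguments, look up the per-family ops table, call through a function pointer) can be sketched in isolation. Everything below is a simplified illustration with hypothetical `demo_*` names; it does not use libefx types, probes or MCDI, and the offset arithmetic is invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_vq;

/* Per-family method table, loosely modelled on efx_virtio_ops_t. */
struct demo_virtio_ops {
	int (*get_doorbell_offset)(struct demo_vq *vq, uint32_t *offsetp);
};

struct demo_nic {
	const struct demo_virtio_ops *ops;	/* NULL if virtio is unsupported */
};

struct demo_vq {
	struct demo_nic *nic;
	uint32_t index;
};

/* Hypothetical family-specific implementation; the layout is made up. */
static int
demo_family_get_doorbell_offset(struct demo_vq *vq, uint32_t *offsetp)
{
	assert(vq != NULL && offsetp != NULL);
	*offsetp = 0x1000u + vq->index * 0x10u;
	return 0;
}

static const struct demo_virtio_ops demo_family_ops = {
	.get_doorbell_offset = demo_family_get_doorbell_offset,
};

/* Generic entry point: returns 0 or a positive errno value. */
static int
demo_get_doorbell_offset(struct demo_vq *vq, uint32_t *offsetp)
{
	const struct demo_virtio_ops *ops;

	if (vq == NULL || offsetp == NULL)
		return EINVAL;

	ops = vq->nic->ops;
	if (ops == NULL)
		return ENOTSUP;

	return ops->get_doorbell_offset(vq, offsetp);
}
```

Keeping the argument checks in the generic function means a family-specific callback can assume valid pointers, which is why the sketch only asserts there.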
[dpdk-dev] [PATCH v2 1/8] common/sfc_efx/base: add base virtio support for vDPA
From: Vijay Srivastava In the vDPA mode, only data path is offloaded in the hardware and control path still goes through the hypervisor and it configures virtqueues via vDPA driver so new virtqueue APIs are required. Implement virtio init/fini and virtqueue create/destroy APIs. Signed-off-by: Vijay Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/sfc_efx/base/efx.h | 109 +++ drivers/common/sfc_efx/base/efx_check.h| 6 + drivers/common/sfc_efx/base/efx_impl.h | 36 drivers/common/sfc_efx/base/efx_virtio.c | 216 + drivers/common/sfc_efx/base/meson.build| 2 + drivers/common/sfc_efx/base/rhead_impl.h | 17 ++ drivers/common/sfc_efx/base/rhead_virtio.c | 190 ++ drivers/common/sfc_efx/efsys.h | 2 + 8 files changed, 578 insertions(+) create mode 100644 drivers/common/sfc_efx/base/efx_virtio.c create mode 100644 drivers/common/sfc_efx/base/rhead_virtio.c diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h index 2c820022b2..c2c73bd382 100644 --- a/drivers/common/sfc_efx/base/efx.h +++ b/drivers/common/sfc_efx/base/efx.h @@ -4368,6 +4368,115 @@ efx_mae_action_rule_remove( #endif /* EFSYS_OPT_MAE */ +#if EFSYS_OPT_VIRTIO + +/* A Virtio net device can have one or more pairs of Rx/Tx virtqueues + * while virtio block device has a single virtqueue, + * for further details refer section of 4.2.3 of SF-120734 + */ +typedef enum efx_virtio_vq_type_e { + EFX_VIRTIO_VQ_TYPE_NET_RXQ, + EFX_VIRTIO_VQ_TYPE_NET_TXQ, + EFX_VIRTIO_VQ_TYPE_BLOCK, + EFX_VIRTIO_VQ_NTYPES +} efx_virtio_vq_type_t; + +typedef struct efx_virtio_vq_dyncfg_s { + /* +* If queue is being created to be migrated then this +* should be the FINAL_PIDX value returned by MC_CMD_VIRTIO_FINI_QUEUE +* of the queue being migrated from. Otherwise, it should be zero. +*/ + uint32_tevvd_vq_pidx; + /* +* If this queue is being created to be migrated then this +* should be the FINAL_CIDX value returned by MC_CMD_VIRTIO_FINI_QUEUE +* of the queue being migrated from. 
Otherwise, it should be zero. +*/ + uint32_tevvd_vq_cidx; +} efx_virtio_vq_dyncfg_t; + +/* + * Virtqueue size must be a power of 2, maximum size is 32768 + * (see VIRTIO v1.1 section 2.6) + */ +#define EFX_VIRTIO_MAX_VQ_SIZE 0x8000 + +typedef struct efx_virtio_vq_cfg_s { + unsigned intevvc_vq_num; + efx_virtio_vq_type_tevvc_type; + /* +* vDPA as VF : It is target VF number if queue is being created on VF. +* vDPA as PF : If queue to be created on PF then it should be +* EFX_PCI_VF_INVALID. +*/ + uint16_tevvc_target_vf; + /* +* Maximum virtqueue size is EFX_VIRTIO_MAX_VQ_SIZE and +* virtqueue size 0 means the queue is unavailable. +*/ + uint32_tevvc_vq_size; + efsys_dma_addr_tevvc_desc_tbl_addr; + efsys_dma_addr_tevvc_avail_ring_addr; + efsys_dma_addr_tevvc_used_ring_addr; + /* MSIX vector number for the virtqueue or 0x if MSIX is not used */ + uint16_tevvc_msix_vector; + /* +* evvc_pas_id contains a PCIe address space identifier if the queue +* uses PASID. +*/ + boolean_t evvc_use_pasid; + uint32_tevvc_pas_id; + /* Negotiated virtio features to be applied to this virtqueue */ + uint64_tevcc_features; +} efx_virtio_vq_cfg_t; + +typedef struct efx_virtio_vq_s efx_virtio_vq_t; + +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_init( + __inefx_nic_t *enp); + +LIBEFX_API +extern void +efx_virtio_fini( + __inefx_nic_t *enp); + +/* + * When virtio net driver in the guest sets VIRTIO_CONFIG_STATUS_DRIVER_OK bit, + * hypervisor starts configuring all the virtqueues in the device. When the + * vhost_user has received VHOST_USER_SET_VRING_ENABLE for all the virtqueues, + * then it invokes VDPA driver callback dev_conf. APIs qstart and qcreate would + * be invoked from dev_conf callback to create the virtqueues, For further + * details refer SF-122427. 
+ */ +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_qcreate( + __inefx_nic_t *enp, + __deref_out efx_virtio_vq_t **evvpp); + +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_qstart( + __inefx_virtio_vq_t *evvp, + __inefx_virtio_vq_cfg_t *evvcp, + __in_optefx_virtio_vq_dyncfg_t *evvdp); + +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_qstop( + __inefx_virtio_vq_t *evvp, + __out_opt efx_virtio_vq_dyncfg_t *evvdp); + +LIBEFX_API +extern void +efx_virtio_qdestroy( +
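
The size constraints stated in the comments of this patch (a virtqueue size must be a power of 2, at most 32768, and 0 means the queue is unavailable) can be checked with a few lines of C. This is a minimal sketch under those stated rules only; `demo_vq_size_is_valid()` and `DEMO_VIRTIO_MAX_VQ_SIZE` are hypothetical names and are not part of libefx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VIRTIO_MAX_VQ_SIZE 0x8000u	/* mirrors EFX_VIRTIO_MAX_VQ_SIZE */

/* The maximum itself must be a power of two for the checks below to agree. */
static_assert((DEMO_VIRTIO_MAX_VQ_SIZE & (DEMO_VIRTIO_MAX_VQ_SIZE - 1)) == 0,
	      "maximum virtqueue size must be a power of two");

/* Hypothetical helper: true if size is a usable virtqueue size. */
static bool
demo_vq_size_is_valid(uint32_t size)
{
	if (size == 0)			/* 0 means the queue is unavailable */
		return false;
	if (size > DEMO_VIRTIO_MAX_VQ_SIZE)
		return false;
	/* A power of two has exactly one bit set. */
	return (size & (size - 1)) == 0;
}
```

A caller filling in a queue configuration could run such a check on the requested size before starting the queue; whether the real driver validates in software or leaves it to firmware is not stated in this message.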
[dpdk-dev] [PATCH v2 5/8] common/sfc_efx/base: add support to verify virtio features
From: Vijay Kumar Srivastava Add an API to verify virtio features supported by device. Signed-off-by: Vijay Kumar Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/sfc_efx/base/efx.h | 7 drivers/common/sfc_efx/base/efx_impl.h | 2 + drivers/common/sfc_efx/base/efx_virtio.c | 38 +++ drivers/common/sfc_efx/base/rhead_impl.h | 7 drivers/common/sfc_efx/base/rhead_virtio.c | 44 ++ 5 files changed, 98 insertions(+) diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h index e3ac51eae0..ff5091a36b 100644 --- a/drivers/common/sfc_efx/base/efx.h +++ b/drivers/common/sfc_efx/base/efx.h @@ -4501,6 +4501,13 @@ efx_virtio_get_features( __inefx_virtio_device_type_t type, __out uint64_t *featuresp); +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_verify_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __inuint64_t features); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/efx_impl.h b/drivers/common/sfc_efx/base/efx_impl.h index 758206d382..aa878014c1 100644 --- a/drivers/common/sfc_efx/base/efx_impl.h +++ b/drivers/common/sfc_efx/base/efx_impl.h @@ -320,6 +320,8 @@ typedef struct efx_virtio_ops_s { uint32_t *); efx_rc_t(*evo_get_features)(efx_nic_t *, efx_virtio_device_type_t, uint64_t *); + efx_rc_t(*evo_verify_features)(efx_nic_t *, + efx_virtio_device_type_t, uint64_t); } efx_virtio_ops_t; #endif /* EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/efx_virtio.c b/drivers/common/sfc_efx/base/efx_virtio.c index 20c22f02b5..b46997c09e 100644 --- a/drivers/common/sfc_efx/base/efx_virtio.c +++ b/drivers/common/sfc_efx/base/efx_virtio.c @@ -14,6 +14,7 @@ static const efx_virtio_ops_t __efx_virtio_rhead_ops = { rhead_virtio_qstop, /* evo_virtio_qstop */ rhead_virtio_get_doorbell_offset, /* evo_get_doorbell_offset */ rhead_virtio_get_features, /* evo_get_features */ + rhead_virtio_verify_features, /* evo_verify_features */ }; #endif /* EFSYS_OPT_RIVERHEAD */ @@ 
-299,4 +300,41 @@ efx_virtio_get_features( return (rc); } + __checkReturn efx_rc_t +efx_virtio_verify_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __inuint64_t features) +{ + const efx_virtio_ops_t *evop = enp->en_evop; + efx_rc_t rc; + + if (type >= EFX_VIRTIO_DEVICE_NTYPES) { + rc = EINVAL; + goto fail1; + } + + EFSYS_ASSERT3U(enp->en_magic, ==, EFX_NIC_MAGIC); + EFSYS_ASSERT3U(enp->en_mod_flags, &, EFX_MOD_VIRTIO); + + if (evop == NULL) { + rc = ENOTSUP; + goto fail2; + } + + if ((rc = evop->evo_verify_features(enp, type, features)) != 0) + goto fail3; + + return (0); + +fail3: + EFSYS_PROBE(fail3); +fail2: + EFSYS_PROBE(fail2); +fail1: + EFSYS_PROBE1(fail1, efx_rc_t, rc); + + return (rc); +} + #endif /* EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/rhead_impl.h b/drivers/common/sfc_efx/base/rhead_impl.h index 69d701a47e..3bf9beceb0 100644 --- a/drivers/common/sfc_efx/base/rhead_impl.h +++ b/drivers/common/sfc_efx/base/rhead_impl.h @@ -505,6 +505,13 @@ rhead_virtio_get_features( __inefx_virtio_device_type_t type, __out uint64_t *featuresp); +LIBEFX_INTERNAL +extern __checkReturn efx_rc_t +rhead_virtio_verify_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __inuint64_t features); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/rhead_virtio.c b/drivers/common/sfc_efx/base/rhead_virtio.c index 508d03d58f..0023ea1e83 100644 --- a/drivers/common/sfc_efx/base/rhead_virtio.c +++ b/drivers/common/sfc_efx/base/rhead_virtio.c @@ -332,4 +332,48 @@ rhead_virtio_get_features( return (rc); } + __checkReturn efx_rc_t +rhead_virtio_verify_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __inuint64_t features) +{ + efx_mcdi_req_t req; + EFX_MCDI_DECLARE_BUF(payload, MC_CMD_VIRTIO_TEST_FEATURES_IN_LEN, + MC_CMD_VIRTIO_TEST_FEATURES_OUT_LEN); + efx_rc_t rc; + + EFX_STATIC_ASSERT(EFX_VIRTIO_DEVICE_TYPE_NET == + MC_CMD_VIRTIO_GET_FEATURES_IN_NET); + 
EFX_STATIC_ASSERT(EFX_VIRTIO_DEVICE_TYPE_BLOCK == +
[dpdk-dev] [PATCH v2 3/8] common/sfc_efx/base: add virtio build dependency
From: Vijay Kumar Srivastava

Make EFSYS_OPT_VIRTIO depend on EFSYS_HAS_UINT64 at build time.
Virtio features are represented as a bitmask in a 64-bit unsigned
integer.

Signed-off-by: Vijay Kumar Srivastava
Signed-off-by: Andrew Rybchenko
---
 drivers/common/sfc_efx/base/efx_check.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/common/sfc_efx/base/efx_check.h b/drivers/common/sfc_efx/base/efx_check.h
index 86a6d92fef..66b38eeae0 100644
--- a/drivers/common/sfc_efx/base/efx_check.h
+++ b/drivers/common/sfc_efx/base/efx_check.h
@@ -411,6 +411,9 @@
 # if !EFSYS_OPT_RIVERHEAD
 # error "VIRTIO requires RIVERHEAD"
 # endif
+# if !EFSYS_HAS_UINT64
+# error "VIRTIO requires UINT64"
+# endif
 #endif /* EFSYS_OPT_VIRTIO */

 #endif /* _SYS_EFX_CHECK_H */
--
2.30.1
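
To illustrate why a 64-bit type is required, here is a small sketch of handling a virtio feature bitmask. The bit numbers are taken from the VIRTIO specification; the `DEMO_*` macros and helper functions are hypothetical names for this example. The subset check is only an illustration: in this series, feature verification is delegated to firmware through an MCDI command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the VIRTIO specification. */
#define DEMO_VIRTIO_NET_F_CSUM		0
#define DEMO_VIRTIO_NET_F_MAC		5
#define DEMO_VIRTIO_F_VERSION_1		32	/* does not fit in 32 bits */

#define DEMO_FEATURE(bit)	(UINT64_C(1) << (bit))

/* True if the given feature bit is set in the mask. */
static bool
demo_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & DEMO_FEATURE(bit)) != 0;
}

/* True if every requested feature is also offered by the device. */
static bool
demo_features_are_subset(uint64_t requested, uint64_t supported)
{
	return (requested & ~supported) == 0;
}
```

Truncating such a mask to 32 bits would silently drop `VIRTIO_F_VERSION_1` (bit 32), which is the kind of error the build-time check guards against.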
[dpdk-dev] [PATCH v2 4/8] common/sfc_efx/base: add support to get virtio features
From: Vijay Kumar Srivastava Add an API to get virtio features supported by device. Signed-off-by: Vijay Kumar Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/sfc_efx/base/efx.h | 14 ++ drivers/common/sfc_efx/base/efx_impl.h | 2 + drivers/common/sfc_efx/base/efx_virtio.c | 45 +++ drivers/common/sfc_efx/base/rhead_impl.h | 7 +++ drivers/common/sfc_efx/base/rhead_virtio.c | 52 ++ 5 files changed, 120 insertions(+) diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h index d4b7d7f47e..e3ac51eae0 100644 --- a/drivers/common/sfc_efx/base/efx.h +++ b/drivers/common/sfc_efx/base/efx.h @@ -4433,6 +4433,13 @@ typedef struct efx_virtio_vq_cfg_s { typedef struct efx_virtio_vq_s efx_virtio_vq_t; +typedef enum efx_virtio_device_type_e { + EFX_VIRTIO_DEVICE_TYPE_RESERVED, + EFX_VIRTIO_DEVICE_TYPE_NET, + EFX_VIRTIO_DEVICE_TYPE_BLOCK, + EFX_VIRTIO_DEVICE_NTYPES +} efx_virtio_device_type_t; + LIBEFX_API extern __checkReturn efx_rc_t efx_virtio_init( @@ -4487,6 +4494,13 @@ efx_virtio_get_doorbell_offset( __inefx_virtio_vq_t *evvp, __out uint32_t *offsetp); +LIBEFX_API +extern __checkReturn efx_rc_t +efx_virtio_get_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __out uint64_t *featuresp); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/efx_impl.h b/drivers/common/sfc_efx/base/efx_impl.h index d6742f4a8c..758206d382 100644 --- a/drivers/common/sfc_efx/base/efx_impl.h +++ b/drivers/common/sfc_efx/base/efx_impl.h @@ -318,6 +318,8 @@ typedef struct efx_virtio_ops_s { efx_virtio_vq_dyncfg_t *); efx_rc_t(*evo_get_doorbell_offset)(efx_virtio_vq_t *, uint32_t *); + efx_rc_t(*evo_get_features)(efx_nic_t *, + efx_virtio_device_type_t, uint64_t *); } efx_virtio_ops_t; #endif /* EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/efx_virtio.c b/drivers/common/sfc_efx/base/efx_virtio.c index de998fcad9..20c22f02b5 100644 --- a/drivers/common/sfc_efx/base/efx_virtio.c +++ 
b/drivers/common/sfc_efx/base/efx_virtio.c @@ -13,6 +13,7 @@ static const efx_virtio_ops_t __efx_virtio_rhead_ops = { rhead_virtio_qstart,/* evo_virtio_qstart */ rhead_virtio_qstop, /* evo_virtio_qstop */ rhead_virtio_get_doorbell_offset, /* evo_get_doorbell_offset */ + rhead_virtio_get_features, /* evo_get_features */ }; #endif /* EFSYS_OPT_RIVERHEAD */ @@ -254,4 +255,48 @@ efx_virtio_get_doorbell_offset( return (rc); } + __checkReturn efx_rc_t +efx_virtio_get_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __out uint64_t *featuresp) +{ + const efx_virtio_ops_t *evop = enp->en_evop; + efx_rc_t rc; + + if (featuresp == NULL) { + rc = EINVAL; + goto fail1; + } + + if (type >= EFX_VIRTIO_DEVICE_NTYPES) { + rc = EINVAL; + goto fail2; + } + + EFSYS_ASSERT3U(enp->en_magic, ==, EFX_NIC_MAGIC); + EFSYS_ASSERT3U(enp->en_mod_flags, &, EFX_MOD_VIRTIO); + + if (evop == NULL) { + rc = ENOTSUP; + goto fail3; + } + + if ((rc = evop->evo_get_features(enp, type, featuresp)) != 0) + goto fail4; + + return (0); + +fail4: + EFSYS_PROBE(fail4); +fail3: + EFSYS_PROBE(fail3); +fail2: + EFSYS_PROBE(fail2); +fail1: + EFSYS_PROBE1(fail1, efx_rc_t, rc); + + return (rc); +} + #endif /* EFSYS_OPT_VIRTIO */ diff --git a/drivers/common/sfc_efx/base/rhead_impl.h b/drivers/common/sfc_efx/base/rhead_impl.h index 4304f63f4c..69d701a47e 100644 --- a/drivers/common/sfc_efx/base/rhead_impl.h +++ b/drivers/common/sfc_efx/base/rhead_impl.h @@ -498,6 +498,13 @@ rhead_virtio_get_doorbell_offset( __inefx_virtio_vq_t *evvp, __out uint32_t *offsetp); +LIBEFX_INTERNAL +extern __checkReturn efx_rc_t +rhead_virtio_get_features( + __inefx_nic_t *enp, + __inefx_virtio_device_type_t type, + __out uint64_t *featuresp); + #endif /* EFSYS_OPT_VIRTIO */ #ifdef __cplusplus diff --git a/drivers/common/sfc_efx/base/rhead_virtio.c b/drivers/common/sfc_efx/base/rhead_virtio.c index 147460c95c..508d03d58f 100644 --- a/drivers/common/sfc_efx/base/rhead_virtio.c +++ 
b/drivers/common/sfc_efx/base/rhead_virtio.c @@ -280,4 +280,56 @@ rhead_virtio_get_doorbell_offset( return (rc); } + __checkReturn
[dpdk-dev] [PATCH v2 7/8] net/sfc: skip driver probe for incompatible device class
From: Vijay Kumar Srivastava Driver would be probed only for the net device class. Signed-off-by: Vijay Kumar Srivastava Signed-off-by: Andrew Rybchenko --- doc/guides/nics/sfc_efx.rst | 8 drivers/net/sfc/sfc.h| 1 + drivers/net/sfc/sfc_ethdev.c | 7 +++ drivers/net/sfc/sfc_kvargs.c | 1 + 4 files changed, 17 insertions(+) diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst index b6047cf5c7..cf1269cc03 100644 --- a/doc/guides/nics/sfc_efx.rst +++ b/doc/guides/nics/sfc_efx.rst @@ -357,6 +357,14 @@ allow option like "-a 02:00.0,arg1=value1,...". Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify boolean parameters value. +- ``class`` [net|vdpa] (default **net**) + + Choose the mode of operation of ef100 device. + **net** device will work as network device and will be probed by net/sfc driver. + **vdpa** device will work as vdpa device and will be probed by vdpa/sfc driver. + If this parameter is not specified then ef100 device will operate as + network device. + - ``rx_datapath`` [auto|efx|ef10|ef10_essb] (default **auto**) Choose receive datapath implementation. 
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h index c2945b6ba2..b48a818adb 100644 --- a/drivers/net/sfc/sfc.h +++ b/drivers/net/sfc/sfc.h @@ -22,6 +22,7 @@ #include "efx.h" #include "sfc_efx_mcdi.h" +#include "sfc_efx.h" #include "sfc_debug.h" #include "sfc_log.h" diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c index 00a0fd3d02..23828c24ff 100644 --- a/drivers/net/sfc/sfc_ethdev.c +++ b/drivers/net/sfc/sfc_ethdev.c @@ -2161,6 +2161,13 @@ sfc_eth_dev_init(struct rte_eth_dev *dev) const struct rte_ether_addr *from; int ret; + if (sfc_efx_dev_class_get(pci_dev->device.devargs) != + SFC_EFX_DEV_CLASS_NET) { + SFC_GENERIC_LOG(DEBUG, + "Incompatible device class: skip probing, should be probed by other sfc driver."); + return 1; + } + sfc_register_dp(); logtype_main = sfc_register_logtype(&pci_dev->addr, diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c index c42b326ab0..0efa92ed28 100644 --- a/drivers/net/sfc/sfc_kvargs.c +++ b/drivers/net/sfc/sfc_kvargs.c @@ -28,6 +28,7 @@ sfc_kvargs_parse(struct sfc_adapter *sa) SFC_KVARG_TX_DATAPATH, SFC_KVARG_FW_VARIANT, SFC_KVARG_RXD_WAIT_TIMEOUT_NS, + SFC_EFX_KVARG_DEV_CLASS, NULL, }; -- 2.30.1
[dpdk-dev] [PATCH v2 6/8] common/sfc_efx: add support to get the device class
From: Vijay Kumar Srivastava Device class argument would be used to select compatible driver. Driver probe would be skipped for incompatible device class. Signed-off-by: Vijay Kumar Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/sfc_efx/sfc_efx.c | 49 ++ drivers/common/sfc_efx/sfc_efx.h | 34 + drivers/common/sfc_efx/version.map | 2 ++ 3 files changed, 85 insertions(+) create mode 100644 drivers/common/sfc_efx/sfc_efx.h diff --git a/drivers/common/sfc_efx/sfc_efx.c b/drivers/common/sfc_efx/sfc_efx.c index d7a84c9835..a3146db255 100644 --- a/drivers/common/sfc_efx/sfc_efx.c +++ b/drivers/common/sfc_efx/sfc_efx.c @@ -7,12 +7,61 @@ * for Solarflare) and Solarflare Communications, Inc. */ +#include #include +#include +#include #include "sfc_efx_log.h" +#include "sfc_efx.h" uint32_t sfc_efx_logtype; +static int +sfc_efx_kvarg_dev_class_handler(__rte_unused const char *key, + const char *class_str, void *opaque) +{ + enum sfc_efx_dev_class *dev_class = opaque; + + if (class_str == NULL) + return *dev_class; + + if (strcmp(class_str, "vdpa") == 0) { + *dev_class = SFC_EFX_DEV_CLASS_VDPA; + } else if (strcmp(class_str, "net") == 0) { + *dev_class = SFC_EFX_DEV_CLASS_NET; + } else { + SFC_EFX_LOG(ERR, "Unsupported class %s.", class_str); + *dev_class = SFC_EFX_DEV_CLASS_INVALID; + } + + return 0; +} + +enum sfc_efx_dev_class +sfc_efx_dev_class_get(struct rte_devargs *devargs) +{ + struct rte_kvargs *kvargs; + const char *key = SFC_EFX_KVARG_DEV_CLASS; + enum sfc_efx_dev_class dev_class = SFC_EFX_DEV_CLASS_NET; + + if (devargs == NULL) + return dev_class; + + kvargs = rte_kvargs_parse(devargs->args, NULL); + if (kvargs == NULL) + return dev_class; + + if (rte_kvargs_count(kvargs, key) != 0) { + rte_kvargs_process(kvargs, key, sfc_efx_kvarg_dev_class_handler, + &dev_class); + } + + rte_kvargs_free(kvargs); + + return dev_class; +} + RTE_INIT(sfc_efx_register_logtype) { int ret; diff --git a/drivers/common/sfc_efx/sfc_efx.h b/drivers/common/sfc_efx/sfc_efx.h new 
file mode 100644 index 00..bbccd3e9e8 --- /dev/null +++ b/drivers/common/sfc_efx/sfc_efx.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * + * Copyright(c) 2019-2020 Xilinx, Inc. + * Copyright(c) 2019 Solarflare Communications Inc. + * + * This software was jointly developed between OKTET Labs (under contract + * for Solarflare) and Solarflare Communications, Inc. + */ + +#ifndef _SFC_EFX_H_ +#define _SFC_EFX_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#define SFC_EFX_KVARG_DEV_CLASS"class" + +enum sfc_efx_dev_class { + SFC_EFX_DEV_CLASS_INVALID = 0, + SFC_EFX_DEV_CLASS_NET, + SFC_EFX_DEV_CLASS_VDPA, + + SFC_EFX_DEV_NCLASS +}; + +__rte_internal +enum sfc_efx_dev_class sfc_efx_dev_class_get(struct rte_devargs *devargs); + +#ifdef __cplusplus +} +#endif + +#endif /* _SFC_EFX_H_ */ diff --git a/drivers/common/sfc_efx/version.map b/drivers/common/sfc_efx/version.map index 403feeaf11..a3345d34f7 100644 --- a/drivers/common/sfc_efx/version.map +++ b/drivers/common/sfc_efx/version.map @@ -221,6 +221,8 @@ INTERNAL { efx_txq_nbufs; efx_txq_size; + sfc_efx_dev_class_get; + sfc_efx_mcdi_init; sfc_efx_mcdi_fini; -- 2.30.1
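The class selection logic in this patch can be sketched without any DPDK dependency. The snippet below is only an illustration under stated assumptions: `dev_class_from_args()` and the `DEV_CLASS_*` names are hypothetical stand-ins, and a hand-written tokenizer replaces `rte_kvargs`. It keeps the behaviour described above: the default is the net class, an unknown value is reported as invalid, and if the key is repeated the last value wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of enum sfc_efx_dev_class from the patch. */
enum dev_class {
	DEV_CLASS_INVALID = 0,
	DEV_CLASS_NET,
	DEV_CLASS_VDPA
};

/* Compare a non NUL-terminated token of length len with a literal. */
static int
token_equals(const char *tok, size_t len, const char *lit)
{
	assert(tok != NULL && lit != NULL);
	return strlen(lit) == len && strncmp(tok, lit, len) == 0;
}

/*
 * Scan a comma-separated "key=value" list for "class=<value>".
 * Returns the net class when args is NULL or the key is absent.
 */
static enum dev_class
dev_class_from_args(const char *args)
{
	static const char key[] = "class=";
	const size_t key_len = sizeof(key) - 1;
	enum dev_class result = DEV_CLASS_NET;
	const char *p = args;

	if (args == NULL)
		return result;

	while (*p != '\0') {
		size_t len = strcspn(p, ",");

		if (len >= key_len && strncmp(p, key, key_len) == 0) {
			const char *val = p + key_len;
			size_t val_len = len - key_len;

			if (token_equals(val, val_len, "net"))
				result = DEV_CLASS_NET;
			else if (token_equals(val, val_len, "vdpa"))
				result = DEV_CLASS_VDPA;
			else
				result = DEV_CLASS_INVALID;
		}
		p += len;
		if (*p == ',')
			p++;
	}

	return result;
}
```

A net driver following this scheme would skip probing whenever the returned class is not the net class, which is what patch 7/8 does with the real helper.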
Re: [dpdk-dev] [PATCH 8/8] drivers: add common driver API to get efx family
On 3/14/21 3:36 AM, Ferruh Yigit wrote:
> On 3/11/2021 11:03 AM, Andrew Rybchenko wrote:
>> From: Vijay Kumar Srivastava
>>
>> Move function to get efx family from net driver into common driver.
>>
>> Signed-off-by: Vijay Kumar Srivastava
>> Signed-off-by: Andrew Rybchenko
>
> <...>
>
>> diff --git a/drivers/meson.build b/drivers/meson.build
>> index fdf76120ac..9c8eded697 100644
>> --- a/drivers/meson.build
>> +++ b/drivers/meson.build
>> @@ -7,6 +7,7 @@ subdirs = [
>> 'bus',
>> 'common/mlx5', # depends on bus.
>> 'common/qat', # depends on bus.
>> + 'common/sfc_efx', # depends on bus.
>> 'mempool', # depends on common and bus.
>> 'net', # depends on common, bus, mempool
>> 'raw', # depends on common, bus and net.
>
> This enables building 'common/sfc_efx' for windows and fail. A windows
> build check needs to be added to 'common/sfc_efx'

I've sent v2, which fixes this problem and one more: undefined functions
mentioned in version.map.

Thanks,
Andrew.
[dpdk-dev] [PATCH v2 8/8] drivers: add common driver API to get efx family
From: Vijay Kumar Srivastava Move function to get efx family from net driver into common driver. Signed-off-by: Vijay Kumar Srivastava Signed-off-by: Andrew Rybchenko --- drivers/common/meson.build | 2 +- drivers/common/sfc_efx/meson.build | 5 +++ drivers/common/sfc_efx/sfc_efx.c | 56 +++ drivers/common/sfc_efx/sfc_efx.h | 10 + drivers/common/sfc_efx/version.map | 1 + drivers/meson.build| 1 + drivers/net/sfc/sfc.c | 61 ++ 7 files changed, 78 insertions(+), 58 deletions(-) diff --git a/drivers/common/meson.build b/drivers/common/meson.build index ba6325adf3..66e12143b2 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -6,4 +6,4 @@ if is_windows endif std_deps = ['eal'] -drivers = ['cpt', 'dpaax', 'iavf', 'mvep', 'octeontx', 'octeontx2', 'sfc_efx'] +drivers = ['cpt', 'dpaax', 'iavf', 'mvep', 'octeontx', 'octeontx2'] diff --git a/drivers/common/sfc_efx/meson.build b/drivers/common/sfc_efx/meson.build index d9afcf3eeb..9faf5161f5 100644 --- a/drivers/common/sfc_efx/meson.build +++ b/drivers/common/sfc_efx/meson.build @@ -5,6 +5,10 @@ # This software was jointly developed between OKTET Labs (under contract # for Solarflare) and Solarflare Communications, Inc. 
+if is_windows + subdir_done() +endif + if (arch_subdir != 'x86' or not dpdk_conf.get('RTE_ARCH_64')) and (arch_subdir != 'arm' or not host_machine.cpu_family().startswith('aarch64')) build = false reason = 'only supported on x86_64 and aarch64' @@ -32,6 +36,7 @@ endforeach subdir('base') objs = [base_objs] +deps += ['bus_pci'] sources = files( 'sfc_efx.c', 'sfc_efx_mcdi.c', diff --git a/drivers/common/sfc_efx/sfc_efx.c b/drivers/common/sfc_efx/sfc_efx.c index a3146db255..0b78933d9f 100644 --- a/drivers/common/sfc_efx/sfc_efx.c +++ b/drivers/common/sfc_efx/sfc_efx.c @@ -62,6 +62,62 @@ sfc_efx_dev_class_get(struct rte_devargs *devargs) return dev_class; } +static efx_rc_t +sfc_efx_find_mem_bar(efsys_pci_config_t *configp, int bar_index, +efsys_bar_t *barp) +{ + efsys_bar_t result; + struct rte_pci_device *dev; + + memset(&result, 0, sizeof(result)); + + if (bar_index < 0 || bar_index >= PCI_MAX_RESOURCE) + return -EINVAL; + + dev = configp->espc_dev; + + result.esb_rid = bar_index; + result.esb_dev = dev; + result.esb_base = dev->mem_resource[bar_index].addr; + + *barp = result; + + return 0; +} + +static efx_rc_t +sfc_efx_pci_config_readd(efsys_pci_config_t *configp, uint32_t offset, +efx_dword_t *edp) +{ + int rc; + + rc = rte_pci_read_config(configp->espc_dev, edp->ed_u32, sizeof(*edp), +offset); + + return (rc < 0 || rc != sizeof(*edp)) ? 
EIO : 0; +} + +int +sfc_efx_family(struct rte_pci_device *pci_dev, + efx_bar_region_t *mem_ebrp, efx_family_t *family) +{ + static const efx_pci_ops_t ops = { + .epo_config_readd = sfc_efx_pci_config_readd, + .epo_find_mem_bar = sfc_efx_find_mem_bar, + }; + + efsys_pci_config_t espcp; + int rc; + + espcp.espc_dev = pci_dev; + + rc = efx_family_probe_bar(pci_dev->id.vendor_id, + pci_dev->id.device_id, + &espcp, &ops, family, mem_ebrp); + + return rc; +} + RTE_INIT(sfc_efx_register_logtype) { int ret; diff --git a/drivers/common/sfc_efx/sfc_efx.h b/drivers/common/sfc_efx/sfc_efx.h index bbccd3e9e8..71288b7299 100644 --- a/drivers/common/sfc_efx/sfc_efx.h +++ b/drivers/common/sfc_efx/sfc_efx.h @@ -10,6 +10,11 @@ #ifndef _SFC_EFX_H_ #define _SFC_EFX_H_ +#include + +#include "efx.h" +#include "efsys.h" + #ifdef __cplusplus extern "C" { #endif @@ -27,6 +32,11 @@ enum sfc_efx_dev_class { __rte_internal enum sfc_efx_dev_class sfc_efx_dev_class_get(struct rte_devargs *devargs); +__rte_internal +int sfc_efx_family(struct rte_pci_device *pci_dev, + efx_bar_region_t *mem_ebrp, + efx_family_t *family); + #ifdef __cplusplus } #endif diff --git a/drivers/common/sfc_efx/version.map b/drivers/common/sfc_efx/version.map index a3345d34f7..c3414b760b 100644 --- a/drivers/common/sfc_efx/version.map +++ b/drivers/common/sfc_efx/version.map @@ -222,6 +222,7 @@ INTERNAL { efx_txq_size; sfc_efx_dev_class_get; + sfc_efx_family; sfc_efx_mcdi_init; sfc_efx_mcdi_fini; diff --git a/drivers/meson.build b/drivers/meson.build index fdf76120ac..9c8eded697 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -7,6 +7,7 @@ subdirs = [ 'bus', 'common/mlx5', # depends on bus. 'common/qat', # depends on bus. + 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. 'net',
Re: [dpdk-dev] [PATCH v11 0/2] support both PIO and MMIO BAR for legacy virito device
On Wed, Mar 10, 2021 at 6:37 PM 谢华伟(此时此刻) wrote:
>
> virtio PMD assumes legacy device only supports PIO(port-mapped) BAR
> resource. This is wrong. As we need to create lots of devices, and PIO
> resource on x86 is very limited, we expose MMIO(memory-mapped I/O) BAR.
>
> Kernel supports both PIO and MMIO BAR for legacy virtio-pci device, and
> for all other pci devices. This patchset handles different type of BAR in
> the similar way.
>
> In previous implementation, under igb_uio driver we get PIO address from
> igb_uio sysfs entry; with uio_pci_generic, we get PIO address from
> /proc/ioports for x86, and for other ARCHs, we get PIO address from
> standard PCI sysfs entry. For PIO/MMIO RW, there is different path for
> different drivers and arch.
>
> All of the above is too much twisted. This patchset unifies the way to get
> both PIO and MMIO address for different driver and ARCHs, all from standard
> resource attr under pci sysfs. This is most generic.
>
> We distinguish PIO and MMIO by their address range like how kernel does.
> It is ugly but works.
>
> v2 changes:
> - add more explanation in the commit message
>
> v3 changes:
> - fix patch format issues
>
> v4 changes:
> - fixes for RTE_KDRV_UIO_GENERIC -> RTE_PCI_KDRV_UIO_GENERIC
>
> v5 changes:
> - split into three separate patches
>
> v6 changes:
> - change to DEBUG level for IO bar detection in pci_uio_ioport_map
> - rework the code in iobar branch
> - fixes commit message format issue
> - temporarily remove the 3rd patch for vfio path, leave it for future
> discussion
> - rework against virtio_pmd_rework_v2
>
> v7 changes:
> - fix compilation issues of in/out instruction on non X86 archs
>
> v8 changes:
> - change the word fix to refactor in patch 1's commit message
>
> v9 changes:
> - keep pause version in in/out instructions
>
> v10 changes:
> - trivial fixes in commit message, like > 75 chars
>
> v11 changes:
> - commit message fix and change
>

Aligned Sob and Author to fix the last checkpatch warning.
Series applied to the main branch.

Thanks Huawei and thanks too to reviewers/testers.

--
David Marchand
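The "distinguish PIO and MMIO by their address range" idea from the quoted cover letter can be illustrated roughly as follows. This is a sketch, not the code from the series: the `PIO_LIMIT` value assumes the 64 KiB x86 I/O port space, and `bar_kind_from_addr()`, `bar_read8()` and `fake_port_in()` are hypothetical names. The port read is passed in as a callback so the example does not need real `in`/`out` instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: x86 I/O port numbers fit in 16 bits, so an address below
 * this limit is treated as a port number rather than a mapped address.
 */
#define PIO_LIMIT ((uintptr_t)0x10000)

enum bar_kind { BAR_PIO, BAR_MMIO };

static enum bar_kind
bar_kind_from_addr(uintptr_t addr)
{
	return addr < PIO_LIMIT ? BAR_PIO : BAR_MMIO;
}

/* Hypothetical stand-in for a port read such as inb(). */
static uint8_t
fake_port_in(uint16_t port)
{
	return (uint8_t)(port & 0xff);
}

/* Read one byte through whichever access method the address implies. */
static uint8_t
bar_read8(uintptr_t addr, uint8_t (*pio_in)(uint16_t port))
{
	assert(pio_in != NULL);

	if (bar_kind_from_addr(addr) == BAR_PIO)
		return pio_in((uint16_t)addr);

	return *(volatile uint8_t *)addr;
}
```

With this split, callers use a single accessor for both BAR types, which is the unification the cover letter describes.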
Re: [dpdk-dev] [PATCH v2] update Intel roadmap for 21.05
10/03/2021 23:20, Ferruh Yigit: > Signed-off-by: Ferruh Yigit > --- > v2: > * there won't be a new driver for dlb2.5 > * reword thash library support Applied with minor updates for sorting things, thanks.
Re: [dpdk-dev] [RFC] eventdev: introduce event dispatcher
On 2021-03-07 14:04, Jerin Jacob wrote: > On Fri, Feb 26, 2021 at 1:31 PM Mattias Rönnblom > wrote: >> On 2021-02-25 13:32, Jerin Jacob wrote: >>> On Fri, Feb 19, 2021 at 12:00 AM Mattias Rönnblom >>> wrote: The purpose of the event dispatcher is primarily to decouple different parts of an application (e.g., processing pipeline stages), which share the same underlying event device. The event dispatcher replaces the conditional logic (often, a switch statement) that typically follows an event device dequeue operation, where events are dispatched to different parts of the application based on the destination queue id. >>> # If the device has all type queue[1] this RFC would restrict to >>> use queue ONLY as stage. A stage can be a Queue Type also. >>> How we can abstract this in this model? >> >> "All queue type" is about scheduling policy. I would think that would be >> independent of the "logical endpoint" of the event (i.e., the queue id). >> I feel like I'm missing something here. > Each queue type also can be represented as a stage. > For example, If the system has only one queue, the Typical IPsec > outbound stages can be > Q0-Ordered(For SA lookup) -> Q0(Atomic)(For Sequence number update) -> > Q0(Orderd)(Crypto operation)->Q0(Atomic)(Send on wire) OK, this makes sense. Would such an application want to add a callback per-queue-per-sched-type, or just per-sched-type? In your example, if you would have a queue Q1 as well, would want to have the option to have different callbacks for atomic-type events on Q0 and Q1? Would you want to dispatch based on anything else in the event? You could basically do it on any field (flow id, priority, etc.), but is there some other field that's commonly used to denote a processing stage? >> >>> # Also, I think, it may make sense to add this type of infrastructure as >>> helper functions as these are built on top of existing APIs i.e There >>> is no support >>> required from the driver to establish this model. 
IMO, If we need to >>> add such support as >>> one fixed set of functionality, we could have helper APIs to express a >>> certain >>> usage of eventdev. Rather defining the that's only way to do this. >>> I think, A helper function can be used to as abstraction to define >>> this kind of model. >>> >>> # Also, There is function pointer overhead and aggregating the events >>> in implementation, >>> That may be not always "the" optimized model of making it work vs switch >>> case in >>> application. >> >> Sure, but what to do in a reasonable generic framework? >> >> >> If you are very sensitive to that 20 cc or whatever function pointer >> call, you won't use this library. Or you will, and use static linking >> and LTO to get rid of that overhead. >> >> >> Probably, you have a few queues, not many. Probably, your dequeue bursts >> are large, if the system load is high (and otherwise, you don't care >> about efficiency). Then, you will have at least of couple of events per >> function call. > I am fine with this library and exposing it as a function pointer if > someone needs to > have a "helper" function to model the system around this logic. > > This RFC looks good to me in general. I would suggest to make it as > > - Helper functions i.e if someone chooses to do write the stage in > this way, it can be enabled through this helper function. > By choosing as helper function it depicts, this is one way to do the > stuff but the NOT ONLY WAY. > - Abstract stages as a queue(which already added in the patch) and > each type in the queue for all type queue cases. > - Enhance test-eventdev to showcase the functionality and performance > of these helpers. > > Thanks for the RFC. 
> >> >>> [1] >>> See RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES in >>> https://protect2.fireeye.com/v1/url?k=dcf3a2b9-83689b94-dcf3e222-8692dc8284cb-5ba19813a1556a85&q=1&e=0ff1861f-8e24-453c-a93b-73fd88e0f316&u=https%3A%2F%2Fdoc.dpdk.org%2Fguides%2Fprog_guide%2Feventdev.html >>> >>> The concept is similar to a UNIX file descriptor event loop library. Instead of tying callback functions to fds as for example libevent does, the event dispatcher binds callbacks to queue ids. An event dispatcher is configured to dequeue events from a specific event device, and ties into the service core framework, to do its (and the application's) work. The event dispatcher provides a convenient way for an eventdev-based application to use service cores for application-level processing, and thus for sharing those cores with other DPDK services. Signed-off-by: Mattias Rönnblom --- lib/librte_eventdev/Makefile | 2 + lib/librte_eventdev/meson.build | 6 +- lib/librte_eventdev/rte_event_dispatcher.c | 420 +++ lib/librte_eventdev/rte_event_dispatcher.h | 251 +++ lib/librte_eventdev/rte_eventdev_version.map |
Re: [dpdk-dev] [RFC] eventdev: introduce event dispatcher
> -Original Message- > From: dev On Behalf Of Mattias Rönnblom > Sent: Monday, March 15, 2021 2:45 PM > To: Jerin Jacob > Cc: Jerin Jacob ; dpdk-dev ; Richardson, > Bruce > Subject: Re: [dpdk-dev] [RFC] eventdev: introduce event dispatcher > > On 2021-03-07 14:04, Jerin Jacob wrote: > > On Fri, Feb 26, 2021 at 1:31 PM Mattias Rönnblom > > wrote: > >> On 2021-02-25 13:32, Jerin Jacob wrote: > >>> On Fri, Feb 19, 2021 at 12:00 AM Mattias Rönnblom > >>> wrote: > The purpose of the event dispatcher is primarily to decouple different > parts of an application (e.g., processing pipeline stages), which > share the same underlying event device. > > The event dispatcher replaces the conditional logic (often, a switch > statement) that typically follows an event device dequeue operation, > where events are dispatched to different parts of the application > based on the destination queue id. > >>> # If the device has all type queue[1] this RFC would restrict to > >>> use queue ONLY as stage. A stage can be a Queue Type also. > >>> How we can abstract this in this model? > >> > >> "All queue type" is about scheduling policy. I would think that would be > >> independent of the "logical endpoint" of the event (i.e., the queue id). > >> I feel like I'm missing something here. > > Each queue type also can be represented as a stage. > > For example, If the system has only one queue, the Typical IPsec > > outbound stages can be > > Q0-Ordered(For SA lookup) -> Q0(Atomic)(For Sequence number update) -> > > Q0(Orderd)(Crypto operation)->Q0(Atomic)(Send on wire) > > > OK, this makes sense. > > > Would such an application want to add a callback > per-queue-per-sched-type, or just per-sched-type? In your example, if > you would have a queue Q1 as well, would want to have the option to have > different callbacks for atomic-type events on Q0 and Q1? > > > Would you want to dispatch based on anything else in the event? 
You > could basically do it on any field (flow id, priority, etc.), but is > there some other field that's commonly used to denote a processing stage? I expect that struct rte_event::event_type and sub_event_type would regularly be used to split out different type of "things" that would be handled separately. Overall, I think we could imagine the Queue number, Queue Scheduling type (Re-Ordered, Atomic), Event type, sub event type, Flow-ID.. all contributing somehow to what function to execute in some situation. As a somewhat extreme example to prove a point: An RX core might use rte_flow rules to split traffic into some arbitrary grouping, and then the rte_event::flow_id could be used to select the function-pointer to jump to handle it? I like the *concept* of having a table of func-ptrs, and removing of a switch() in that way, but I'm not sure that DPDK Eventdev APIs are the right place for it. I think Jerin already suggested the "helper function" concept, which seems a good idea to allow optional usage. To be clear, I'm not against upstreaming of such an event-dispatcher, but I'm not sure its possible to build it to be generic enough for all use-cases. Maybe focusing on an actual use-case and driving the design from that is a good approach? Regards, -Harry
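For readers following the thread, the "table of func-ptrs" idea under discussion could look roughly like the sketch below. It is not the API proposed in the RFC: `struct event`, `struct dispatcher`, `dispatcher_register()` and `dispatcher_dispatch()` are hypothetical names, and the table is keyed on queue id only. Keying on scheduling type, event type or flow id, as raised above, would change the lookup but not the overall shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 8

/* Hypothetical, reduced stand-in for struct rte_event. */
struct event {
	uint8_t queue_id;
	uint32_t flow_id;
};

typedef void (*event_cb_t)(const struct event *ev, void *cb_data);

/* One callback slot per queue id replaces the switch statement. */
struct dispatcher {
	event_cb_t cb[MAX_QUEUES];
	void *cb_data[MAX_QUEUES];
};

static int
dispatcher_register(struct dispatcher *d, uint8_t queue_id,
		    event_cb_t cb, void *cb_data)
{
	assert(d != NULL);

	if (queue_id >= MAX_QUEUES || cb == NULL)
		return -1;

	d->cb[queue_id] = cb;
	d->cb_data[queue_id] = cb_data;
	return 0;
}

/* Deliver a dequeued burst; returns the number of events delivered. */
static unsigned int
dispatcher_dispatch(const struct dispatcher *d, const struct event *evs,
		    unsigned int n)
{
	unsigned int i, delivered = 0;

	for (i = 0; i < n; i++) {
		uint8_t q = evs[i].queue_id;

		if (q < MAX_QUEUES && d->cb[q] != NULL) {
			d->cb[q](&evs[i], d->cb_data[q]);
			delivered++;
		}
	}

	return delivered;
}

/* Example callback: count the events seen by one stage. */
static void
count_cb(const struct event *ev, void *cb_data)
{
	(void)ev;
	(*(unsigned int *)cb_data)++;
}
```

This version calls the callback once per event; the RFC discussion mentions aggregating events per callback, which would amortize the function pointer cost over a burst.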
[dpdk-dev] [PATCH v2 0/4] net/virtio: make virtqueue struct cache-friendly
This series optimizes the cache usage of the virtqueue struct by
dynamically allocating the "fake" mbuf in the Rx virtnet struct, by
removing a useless virtqueue pointer from the virtnet structs, and by
moving a few fields to fill holes.

With these 3 patches, the virtqueue struct size goes from 576 bytes
(9 cachelines) to 248 bytes (4 cachelines).

Changes in v2:
==============
- Rebase on latest main
- Improve error path in virtio_init_queue
- Fix various typos in commit messages

Maxime Coquelin (4):
  net/virtio: remove reference to virtqueue in vrings
  net/virtio: improve queue init error path
  net/virtio: allocate fake mbuf in Rx queue
  net/virtio: pack virtuqueue struct

 drivers/net/virtio/virtio_ethdev.c            | 68 ---
 drivers/net/virtio/virtio_rxtx.c              | 36 +-
 drivers/net/virtio/virtio_rxtx.h              |  5 +-
 drivers/net/virtio/virtio_rxtx_packed.c       |  4 +-
 drivers/net/virtio/virtio_rxtx_packed.h       |  6 +-
 drivers/net/virtio/virtio_rxtx_packed_avx.h   |  4 +-
 drivers/net/virtio/virtio_rxtx_simple.h       |  2 +-
 .../net/virtio/virtio_rxtx_simple_altivec.c   |  2 +-
 drivers/net/virtio/virtio_rxtx_simple_neon.c  |  2 +-
 drivers/net/virtio/virtio_rxtx_simple_sse.c   |  2 +-
 .../net/virtio/virtio_user/virtio_user_dev.c  |  4 +-
 drivers/net/virtio/virtio_user_ethdev.c       |  2 +-
 drivers/net/virtio/virtqueue.h                | 24 ---
 13 files changed, 88 insertions(+), 73 deletions(-)

--
2.29.2
[dpdk-dev] [PATCH v2 2/4] net/virtio: improve queue init error path
This patch improves the error path of virtio_init_queue() by cleaning
up, in reverse order, all resources that have been allocated.

Suggested-by: Chenbo Xia
Signed-off-by: Maxime Coquelin
---
 drivers/net/virtio/virtio_ethdev.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index af090fdf9c..65ad71f1a6 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -507,7 +507,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 		mz = rte_memzone_lookup(vq_name);
 		if (mz == NULL) {
 			ret = -ENOMEM;
-			goto fail_q_alloc;
+			goto free_vq;
 		}
 	}
 
@@ -533,7 +533,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 			hdr_mz = rte_memzone_lookup(vq_hdr_name);
 			if (hdr_mz == NULL) {
 				ret = -ENOMEM;
-				goto fail_q_alloc;
+				goto free_mz;
 			}
 		}
 	}
@@ -547,7 +547,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 		if (!sw_ring) {
 			PMD_INIT_LOG(ERR, "can not allocate RX soft ring");
 			ret = -ENOMEM;
-			goto fail_q_alloc;
+			goto free_hdr_mz;
 		}
 
 		vq->sw_ring = sw_ring;
@@ -604,15 +604,22 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 
 	if (VIRTIO_OPS(hw)->setup_queue(hw, vq) < 0) {
 		PMD_INIT_LOG(ERR, "setup_queue failed");
-		return -EINVAL;
+		ret = -EINVAL;
+		goto clean_vq;
 	}
 
 	return 0;
 
-fail_q_alloc:
-	rte_free(sw_ring);
+clean_vq:
+	hw->cvq = NULL;
+
+	if (sw_ring)
+		rte_free(sw_ring);
+free_hdr_mz:
 	rte_memzone_free(hdr_mz);
+free_mz:
 	rte_memzone_free(mz);
+free_vq:
 	rte_free(vq);
 
 	return ret;
--
2.29.2
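The pattern applied by this patch, one label per acquired resource with labels unwound in reverse order of allocation, can be shown in isolation. The sketch below is a generic illustration rather than the driver code: `struct queue`, `queue_init()` and `queue_fini()` are hypothetical, plain `malloc()` stands in for the memzone and `rte_malloc` allocations, and `fail_step` is an artificial knob for simulating a failure at a given step.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct queue {
	void *ring;
	void *hdr;
	void *sw_ring;
};

/*
 * Allocate a queue and its three buffers.
 * fail_step (1..4) simulates a failure at that step; 0 means no failure.
 * On error, only what was already allocated is released, newest first.
 */
static int
queue_init(struct queue **out, int fail_step)
{
	struct queue *q;
	int ret;

	q = (fail_step == 1) ? NULL : calloc(1, sizeof(*q));
	if (q == NULL)
		return -ENOMEM;

	q->ring = (fail_step == 2) ? NULL : malloc(64);
	if (q->ring == NULL) {
		ret = -ENOMEM;
		goto free_q;
	}

	q->hdr = (fail_step == 3) ? NULL : malloc(64);
	if (q->hdr == NULL) {
		ret = -ENOMEM;
		goto free_ring;
	}

	q->sw_ring = (fail_step == 4) ? NULL : malloc(64);
	if (q->sw_ring == NULL) {
		ret = -ENOMEM;
		goto free_hdr;
	}

	*out = q;
	return 0;

free_hdr:
	free(q->hdr);
free_ring:
	free(q->ring);
free_q:
	free(q);

	return ret;
}

/* Release a fully initialized queue, again newest resource first. */
static void
queue_fini(struct queue *q)
{
	assert(q != NULL);

	free(q->sw_ring);
	free(q->hdr);
	free(q->ring);
	free(q);
}
```

Each failure site jumps to the label that frees the most recent successful allocation, so adding a new resource only requires one new label and one new jump.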
[dpdk-dev] [PATCH v2 1/4] net/virtio: remove reference to virtqueue in vrings
Vrings are part of the virtqueues, so we don't need to have a pointer to it in Vrings descriptions. Instead, let's just subtract from its offset to calculate virtqueue address. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- drivers/net/virtio/virtio_ethdev.c| 36 +-- drivers/net/virtio/virtio_rxtx.c | 28 +++ drivers/net/virtio/virtio_rxtx.h | 3 -- drivers/net/virtio/virtio_rxtx_packed.c | 4 +-- drivers/net/virtio/virtio_rxtx_packed.h | 6 ++-- drivers/net/virtio/virtio_rxtx_packed_avx.h | 4 +-- drivers/net/virtio/virtio_rxtx_simple.h | 2 +- .../net/virtio/virtio_rxtx_simple_altivec.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_neon.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_sse.c | 2 +- .../net/virtio/virtio_user/virtio_user_dev.c | 4 +-- drivers/net/virtio/virtio_user_ethdev.c | 2 +- drivers/net/virtio/virtqueue.h| 6 +++- 13 files changed, 49 insertions(+), 52 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 333a5243a9..af090fdf9c 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -133,7 +133,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl, int *dlen, int pkt_num) { - struct virtqueue *vq = cvq->vq; + struct virtqueue *vq = virtnet_cq_to_vq(cvq); int head; struct vring_packed_desc *desc = vq->vq_packed.ring.desc; struct virtio_pmd_ctrl *result; @@ -229,7 +229,7 @@ virtio_send_command_split(struct virtnet_ctl *cvq, int *dlen, int pkt_num) { struct virtio_pmd_ctrl *result; - struct virtqueue *vq = cvq->vq; + struct virtqueue *vq = virtnet_cq_to_vq(cvq); uint32_t head, i; int k, sum = 0; @@ -316,13 +316,13 @@ virtio_send_command(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl, ctrl->status = status; - if (!cvq || !cvq->vq) { + if (!cvq) { PMD_INIT_LOG(ERR, "Control queue is not supported."); return -1; } rte_spinlock_lock(&cvq->lock); - vq = cvq->vq; + vq = virtnet_cq_to_vq(cvq); PMD_INIT_LOG(DEBUG, 
"vq->vq_desc_head_idx = %d, status = %d, " "vq->hw->cvq = %p vq = %p", @@ -552,19 +552,16 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) vq->sw_ring = sw_ring; rxvq = &vq->rxq; - rxvq->vq = vq; rxvq->port_id = dev->data->port_id; rxvq->mz = mz; } else if (queue_type == VTNET_TQ) { txvq = &vq->txq; - txvq->vq = vq; txvq->port_id = dev->data->port_id; txvq->mz = mz; txvq->virtio_net_hdr_mz = hdr_mz; txvq->virtio_net_hdr_mem = hdr_mz->iova; } else if (queue_type == VTNET_CQ) { cvq = &vq->cq; - cvq->vq = vq; cvq->mz = mz; cvq->virtio_net_hdr_mz = hdr_mz; cvq->virtio_net_hdr_mem = hdr_mz->iova; @@ -851,7 +848,7 @@ virtio_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id) { struct virtio_hw *hw = dev->data->dev_private; struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id]; - struct virtqueue *vq = rxvq->vq; + struct virtqueue *vq = virtnet_rxq_to_vq(rxvq); virtqueue_enable_intr(vq); virtio_mb(hw->weak_barriers); @@ -862,7 +859,7 @@ static int virtio_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id) { struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id]; - struct virtqueue *vq = rxvq->vq; + struct virtqueue *vq = virtnet_rxq_to_vq(rxvq); virtqueue_disable_intr(vq); return 0; @@ -2180,8 +2177,7 @@ static int virtio_dev_start(struct rte_eth_dev *dev) { uint16_t nb_queues, i; - struct virtnet_rx *rxvq; - struct virtnet_tx *txvq __rte_unused; + struct virtqueue *vq; struct virtio_hw *hw = dev->data->dev_private; int ret; @@ -2238,27 +2234,27 @@ virtio_dev_start(struct rte_eth_dev *dev) PMD_INIT_LOG(DEBUG, "nb_queues=%d", nb_queues); for (i = 0; i < dev->data->nb_rx_queues; i++) { - rxvq = dev->data->rx_queues[i]; + vq = virtnet_rxq_to_vq(dev->data->rx_queues[i]); /* Flush the old packets */ - virtqueue_rxvq_flush(rxvq->vq); - virtqueue_notify(rxvq->vq); + virtqueue_rxvq_flush(vq); + virtqueue_notify(vq); } for (i = 0; i < dev->data->nb_tx_queues; i++) { - txvq = dev->data->tx_queues[i]; - 
virtqueue_notify(txvq->vq); + vq = virtnet_t
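The commit message's "subtract from its offset" approach is the usual container-of technique. The helper definitions themselves are not visible in the excerpt above, so the following is only a sketch under assumptions: `struct vq_sketch`, `struct rx_state` and `rxq_to_vq()` are hypothetical names that mimic a queue-specific struct embedded in its parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct virtnet_rx. */
struct rx_state {
	unsigned int port_id;
};

/* Hypothetical stand-in for struct virtqueue, with the Rx state embedded. */
struct vq_sketch {
	unsigned int nentries;
	struct rx_state rxq;
};

/*
 * Recover the enclosing structure from a pointer to its embedded member
 * by subtracting the member's offset, so no back-pointer is stored.
 */
static struct vq_sketch *
rxq_to_vq(struct rx_state *rxq)
{
	assert(rxq != NULL);
	return (struct vq_sketch *)((char *)rxq -
				    offsetof(struct vq_sketch, rxq));
}
```

The conversion is only valid for a pointer that really refers to the embedded member; it saves one pointer per queue and the cache line that pointer would occupy.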
[dpdk-dev] [PATCH v2 3/4] net/virtio: allocate fake mbuf in Rx queue
While it is worth clarifying whether the fake mbuf in virtnet_rx struct
is really necessary, it is sure that it heavily impacts cache usage by
being part of the struct. Indeed, it uses two cachelines, and requires
alignment on a cacheline.

Before this series, it means it took 120 bytes in virtnet_rx struct:

struct virtnet_rx {
	struct virtqueue *         vq;                   /*     0     8 */

	/* XXX 56 bytes hole, try to pack */

	/* --- cacheline 1 boundary (64 bytes) --- */
	struct rte_mbuf            fake_mbuf __attribute__((__aligned__(64))); /*    64   128 */
	/* --- cacheline 3 boundary (192 bytes) --- */

This patch allocates it using malloc in order to optimize virtnet_rx
cache usage and so virtqueue cache usage.

Signed-off-by: Maxime Coquelin
---
 drivers/net/virtio/virtio_ethdev.c | 13 +++++++++++++
 drivers/net/virtio/virtio_rxtx.c   |  8 +++-----
 drivers/net/virtio/virtio_rxtx.h   |  2 +-
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 65ad71f1a6..0ff0b16027 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -435,6 +435,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 	int queue_type = virtio_get_queue_type(hw, queue_idx);
 	int ret;
 	int numa_node = dev->device->numa_node;
+	struct rte_mbuf *fake_mbuf = NULL;
 
 	PMD_INIT_LOG(INFO, "setting up queue: %u on NUMA node %d",
 			queue_idx, numa_node);
@@ -550,10 +551,18 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 			goto free_hdr_mz;
 		}
 
+		fake_mbuf = malloc(sizeof(*fake_mbuf));
+		if (!fake_mbuf) {
+			PMD_INIT_LOG(ERR, "can not allocate fake mbuf");
+			ret = -ENOMEM;
+			goto free_sw_ring;
+		}
+
 		vq->sw_ring = sw_ring;
 		rxvq = &vq->rxq;
 		rxvq->port_id = dev->data->port_id;
 		rxvq->mz = mz;
+		rxvq->fake_mbuf = fake_mbuf;
 	} else if (queue_type == VTNET_TQ) {
 		txvq = &vq->txq;
 		txvq->port_id = dev->data->port_id;
@@ -613,6 +622,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
 clean_vq:
 	hw->cvq = NULL;
 
+	if (fake_mbuf)
+		free(fake_mbuf);
+free_sw_ring:
 	if (sw_ring)
 		rte_free(sw_ring);
 free_hdr_mz:
@@ -643,6 +655,7 @@ virtio_free_queues(struct virtio_hw *hw)
 
 		queue_type = virtio_get_queue_type(hw, i);
 		if (queue_type == VTNET_RQ) {
+			free(vq->rxq.fake_mbuf);
 			rte_free(vq->sw_ring);
 			rte_memzone_free(vq->rxq.mz);
 		} else if (queue_type == VTNET_TQ) {
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 32af8d3d11..c1ce15c8f5 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -703,11 +703,9 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx)
 		virtio_rxq_vec_setup(rxvq);
 	}
 
-	memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf));
-	for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST;
-	     desc_idx++) {
-		vq->sw_ring[vq->vq_nentries + desc_idx] =
-			&rxvq->fake_mbuf;
+	memset(rxvq->fake_mbuf, 0, sizeof(*rxvq->fake_mbuf));
+	for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; desc_idx++) {
+		vq->sw_ring[vq->vq_nentries + desc_idx] = rxvq->fake_mbuf;
 	}
 
 	if (hw->use_vec_rx && !virtio_with_packed_queue(hw)) {
diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h
index 7f1036be6f..6ce5d67d15 100644
--- a/drivers/net/virtio/virtio_rxtx.h
+++ b/drivers/net/virtio/virtio_rxtx.h
@@ -19,7 +19,7 @@ struct virtnet_stats {
 
 struct virtnet_rx {
 	/* dummy mbuf, for wraparound when processing RX ring. */
-	struct rte_mbuf fake_mbuf;
+	struct rte_mbuf *fake_mbuf;
 	uint64_t mbuf_initializer; /**< value to init mbufs. */
 	struct rte_mempool *mpool; /**< mempool for mbuf allocation */
 
--
2.29.2
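The cache effect described in the commit message can be reproduced with a toy layout. The sketch below makes several assumptions: `struct mbuf_like` is a hypothetical 128-byte, 64-byte-aligned stand-in for `struct rte_mbuf`, the other struct names are invented, and the `aligned` attribute is a GCC/Clang extension. Embedding the aligned member forces a hole after the leading pointer, while storing a pointer to a separately allocated object does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf: two cache lines, aligned. */
struct mbuf_like {
	unsigned char bytes[128];
} __attribute__((aligned(64)));

/* Layout before the patch: the aligned object lives inside the struct. */
struct rx_embedded {
	void *vq;
	struct mbuf_like fake;
	unsigned long init;
};

/* Layout after the patch: only a pointer is kept in the struct. */
struct rx_pointer {
	void *vq;
	struct mbuf_like *fake;
	unsigned long init;
};

/* Bytes no longer occupied in the hot structure after the change. */
static size_t
bytes_saved_by_pointer(void)
{
	assert(sizeof(struct rx_pointer) <= sizeof(struct rx_embedded));
	return sizeof(struct rx_embedded) - sizeof(struct rx_pointer);
}
```

The trade-off is an extra pointer dereference and a separate allocation that must be freed on every error and teardown path, as the diff above does.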
[dpdk-dev] [PATCH v2 4/4] net/virtio: pack virtuqueue struct
This patch optimizes packing of the virtuqueue struct by moving fields around to fill holes. Offset field is not used and so can be removed. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- drivers/net/virtio/virtqueue.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index 17e76f0e8c..4536b0ef9d 100644 --- a/drivers/net/virtio/virtqueue.h +++ b/drivers/net/virtio/virtqueue.h @@ -244,6 +244,15 @@ struct virtqueue { uint16_t vq_avail_idx; /**< sync until needed */ uint16_t vq_free_thresh; /**< free threshold */ + /** +* Head of the free chain in the descriptor table. If +* there are no free descriptors, this will be set to +* VQ_RING_DESC_CHAIN_END. +*/ + uint16_t vq_desc_head_idx; + uint16_t vq_desc_tail_idx; + uint16_t vq_queue_index; /**< PCI queue index */ + void *vq_ring_virt_mem; /**< linear address of vring*/ unsigned int vq_ring_size; @@ -256,15 +265,6 @@ struct virtqueue { rte_iova_t vq_ring_mem; /**< physical address of vring, * or virtual address for virtio_user. */ - /** -* Head of the free chain in the descriptor table. If -* there are no free descriptors, this will be set to -* VQ_RING_DESC_CHAIN_END. -*/ - uint16_t vq_desc_head_idx; - uint16_t vq_desc_tail_idx; - uint16_t vq_queue_index; - uint16_t offset; /**< relative offset to obtain addr in mbuf */ uint16_t *notify_addr; struct rte_mbuf **sw_ring; /**< RX software ring. */ struct vq_desc_extra vq_descx[0]; -- 2.29.2
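As a rough illustration of why moving the three 16-bit indexes next to the other 16-bit fields fills holes, here is a sketch with two hypothetical structs. The field names only echo the patch; the real struct virtqueue has many more members and its exact layout is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with small fields scattered between 8-byte ones. */
struct vq_loose {
	uint16_t avail_idx;
	void *ring_virt_mem;     /* padding is inserted before this member */
	uint16_t desc_head_idx;
	uint64_t ring_mem;       /* and before this one */
	uint16_t desc_tail_idx;  /* plus tail padding after this one */
};

/* Same members, with the 16-bit fields grouped together. */
struct vq_packed {
	uint16_t avail_idx;
	uint16_t desc_head_idx;
	uint16_t desc_tail_idx;
	void *ring_virt_mem;
	uint64_t ring_mem;
};

/* Bytes recovered purely by reordering members. */
static size_t bytes_saved(void)
{
	return sizeof(struct vq_loose) - sizeof(struct vq_packed);
}
```

A tool such as pahole reports the same holes on the real structure, which is how layouts like the one quoted in patch 3/4 are usually produced.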
Re: [dpdk-dev] [PATCH v11 2/2] bus/pci: support MMIO in PCI ioport accessors
On 2021/3/15 18:19, David Marchand wrote:
>> #else
>> #define IO_COND(addr, is_pio, is_mmio) do { \
>> 	is_mmio; \
>> } while (0)
>> #endif
>
> We should not just copy/paste kernel code.
> Plus here, this seems a bit overkill.
> And there are other parts in this code that could use some polishing.
>
> What do you think of merging this series as is (now that we got non
> regression reports) and doing such cleanups in followup patches?

I am OK. Yes, we could do some cleanup after it is merged, for example against vfio, if it is really necessary for virtio PMD only to use vfio to access IO port.
[dpdk-dev] [PATCH] net/mlx5: add power monitoring support
Support the PMD power management API in MLX5 driver. The monitor policy of this API puts a CPU core to sleep until a data in some monitored memory address is changed by the NIC. Implement the get_monitor_addr function to return an address of a CQE owner bit to monitor the arrival of a new packet. Signed-off-by: Alexander Kozyrev --- doc/guides/rel_notes/release_21_05.rst | 4 drivers/net/mlx5/mlx5.c| 2 ++ drivers/net/mlx5/mlx5_rxtx.c | 19 +++ drivers/net/mlx5/mlx5_rxtx.h | 1 + 4 files changed, 26 insertions(+) diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst index f262d48e82..928eafd92f 100644 --- a/doc/guides/rel_notes/release_21_05.rst +++ b/doc/guides/rel_notes/release_21_05.rst @@ -65,6 +65,10 @@ New Features * Added support for freeing Tx mbuf on demand. * Added support for copper port in Kunpeng930. +* **Updated Mellanox mlx5 driver.** + + * Added support for the monitor policy of Power Management API. + * **Updated NXP DPAA driver.** * Added support for shared ethernet interface. diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index abd7ff70df..7b419deb2c 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1496,6 +1496,7 @@ const struct eth_dev_ops mlx5_dev_ops = { .hairpin_queue_peer_update = mlx5_hairpin_queue_peer_update, .hairpin_queue_peer_bind = mlx5_hairpin_queue_peer_bind, .hairpin_queue_peer_unbind = mlx5_hairpin_queue_peer_unbind, + .get_monitor_addr = mlx5_get_monitor_addr, }; /* Available operations from secondary process. 
*/ @@ -1580,6 +1581,7 @@ const struct eth_dev_ops mlx5_dev_ops_isolate = { .hairpin_queue_peer_update = mlx5_hairpin_queue_peer_update, .hairpin_queue_peer_bind = mlx5_hairpin_queue_peer_bind, .hairpin_queue_peer_unbind = mlx5_hairpin_queue_peer_unbind, + .get_monitor_addr = mlx5_get_monitor_addr, }; /** diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index e3ce9fd224..0fc0c2096a 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -712,6 +712,25 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id) return rx_queue_count(rxq); } +int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc) +{ + struct mlx5_rxq_data *rxq = rx_queue; + const unsigned int cqe_num = 1 << rxq->cqe_n; + const unsigned int cqe_mask = cqe_num - 1; + const uint16_t idx = rxq->cq_ci & cqe_num; + volatile struct mlx5_cqe *cqe = &(*rxq->cqes)[rxq->cq_ci & cqe_mask]; + + if (unlikely(!cqe)) { + rte_errno = EINVAL; + return -rte_errno; + } + pmc->addr = &cqe->op_own; + pmc->val = !!idx; + pmc->mask = MLX5_CQE_OWNER_MASK; + pmc->size = sizeof(uint8_t); + return 0; +} + #define MLX5_SYSTEM_LOG_DIR "/var/log" /** * Dump debug information to log file. diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 0fd98af9d1..35a1bba486 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -441,6 +441,7 @@ int mlx5_rx_burst_mode_get(struct rte_eth_dev *dev, uint16_t rx_queue_id, struct rte_eth_burst_mode *mode); int mlx5_tx_burst_mode_get(struct rte_eth_dev *dev, uint16_t tx_queue_id, struct rte_eth_burst_mode *mode); +int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc); /* Vectorized version of mlx5_rxtx.c */ int mlx5_rxq_check_vec_support(struct mlx5_rxq_data *rxq_data); -- 2.24.1
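The index arithmetic in mlx5_get_monitor_addr() can be read as follows: the low bits of the consumer index select the CQE slot, and the next bit up flips on every pass over the ring, which is the value the patch stores in pmc->val for comparison with the owner bit. The sketch below isolates that arithmetic; `struct cqe_wait` and `next_cqe_wait()` are hypothetical names, and the interpretation of the owner bit is an assumption based on the patch text, not on the hardware documentation.

```c
#include <stdint.h>

/* Hypothetical result type: which slot to watch and which owner value to expect. */
struct cqe_wait {
	uint32_t slot;   /* index of the next CQE in the ring */
	uint8_t owner;   /* owner-bit value the patch puts in pmc->val */
};

/* log2_entries plays the role of rxq->cqe_n in the patch. */
static struct cqe_wait next_cqe_wait(uint32_t cq_ci, unsigned int log2_entries)
{
	const uint32_t entries = UINT32_C(1) << log2_entries;
	struct cqe_wait w = {
		.slot = cq_ci & (entries - 1),      /* cq_ci & cqe_mask */
		.owner = (cq_ci & entries) != 0,    /* !!(cq_ci & cqe_num) */
	};
	return w;
}
```

For an 8-entry ring, consumer indexes 5, 13 and 21 all map to slot 5, while the expected owner value alternates 0, 1, 0.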
[dpdk-dev] [PATCH v2] eal, power: use UINT64_MAX instead of -1ULL
use UINT64_MAX instead of -1ULL when manipulating uint64_t masks and initializing sentinel values. some compilers generate a warning when applying a '-' to an unsigned literal so avoid this by initializing with unsigned preprocessor definitions where appropriate. Signed-off-by: Tyler Retzlaff --- lib/librte_eal/common/eal_common_fbarray.c | 12 ++-- lib/librte_power/rte_power_pmd_mgmt.c | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c index 592ec5859..3a28a5324 100644 --- a/lib/librte_eal/common/eal_common_fbarray.c +++ b/lib/librte_eal/common/eal_common_fbarray.c @@ -138,7 +138,7 @@ find_next_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, */ last = MASK_LEN_TO_IDX(arr->len); last_mod = MASK_LEN_TO_MOD(arr->len); - last_msk = ~(-1ULL << last_mod); + last_msk = ~(UINT64_MAX << last_mod); for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) { uint64_t cur_msk, lookahead_msk; @@ -398,8 +398,8 @@ find_prev_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, first_mod = MASK_LEN_TO_MOD(start); /* we're going backwards, so mask must start from the top */ ignore_msk = first_mod == MASK_ALIGN - 1 ? - -1ULL : /* prevent overflow */ - ~(-1ULL << (first_mod + 1)); + UINT64_MAX : /* prevent overflow */ + ~(UINT64_MAX << (first_mod + 1)); /* go backwards, include zero */ msk_idx = first; @@ -513,7 +513,7 @@ find_prev_n(const struct rte_fbarray *arr, unsigned int start, unsigned int n, * no runs in the space we've lookbehind-scanned * as well, so skip that on next iteration. */ - ignore_msk = -1ULL << need; + ignore_msk = UINT64_MAX << need; msk_idx = lookbehind_idx; break; } @@ -560,8 +560,8 @@ find_prev(const struct rte_fbarray *arr, unsigned int start, bool used) first_mod = MASK_LEN_TO_MOD(start); /* we're going backwards, so mask must start from the top */ ignore_msk = first_mod == MASK_ALIGN - 1 ? 
- -1ULL : /* prevent overflow */ - ~(-1ULL << (first_mod + 1)); + UINT64_MAX : /* prevent overflow */ + ~(UINT64_MAX << (first_mod + 1)); /* go backwards, include zero */ idx = first; diff --git a/lib/librte_power/rte_power_pmd_mgmt.c b/lib/librte_power/rte_power_pmd_mgmt.c index 454ef7091..db03cbf42 100644 --- a/lib/librte_power/rte_power_pmd_mgmt.c +++ b/lib/librte_power/rte_power_pmd_mgmt.c @@ -111,7 +111,7 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused, ret = rte_eth_get_monitor_addr(port_id, qidx, &pmc); if (ret == 0) - rte_power_monitor(&pmc, -1ULL); + rte_power_monitor(&pmc, UINT64_MAX); } q_conf->umwait_in_progress = false; -- 2.30.0.vfs.0.2
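The mask idiom touched by this patch can be exercised in isolation. The helper below is a sketch, not fbarray code: `low_bits_mask()` is a hypothetical name, and the explicit branch for 64 mirrors the "prevent overflow" comment in the diff, since shifting a 64-bit value by 64 is undefined in C.

```c
#include <stdint.h>

/* Mask with the n lowest bits set, built from UINT64_MAX rather than -1ULL. */
static uint64_t low_bits_mask(unsigned int n)
{
	if (n >= 64)
		return UINT64_MAX;       /* a shift by 64 would be undefined */
	return ~(UINT64_MAX << n);
}
```

Both spellings produce the same bit pattern; the UINT64_MAX form simply avoids applying unary minus to an unsigned literal, which is what some compilers warn about.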
Re: [dpdk-dev] [PATCH v4 1/2] eal: error number enhancement for thread TLS API
> Subject: Re: [PATCH v4 1/2] eal: error number enhancement for thread TLS > API > > On Wed, Mar 10, 2021 at 02:48:55PM +0200, Tal Shnaiderman wrote: > > add error number reporting to rte_errno in all functions in the > > rte_thread_tls_* API. > > > > Suggested-by: Anatoly Burakov > > Signed-off-by: Tal Shnaiderman > > --- > > lib/librte_eal/include/rte_thread.h | 14 +++--- > > lib/librte_eal/unix/rte_thread.c| 6 ++ > > lib/librte_eal/windows/rte_thread.c | 6 ++ > > 3 files changed, 23 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eal/include/rte_thread.h > > b/lib/librte_eal/include/rte_thread.h > > After we introduce a translation function to map from Windows error codes > to errno style codes (as part of EAL threads API), should we change this to > directly return the error code from the functions? > Or do we follow the pattern of setting rte_errno? Sorry for the late reply, I'd stick to errors in rte_errno, note that in cases like rte_thread_value_get the only way to get the errors is with rte_errno since it's returning the value itself. BTW will you also add translation function for the UNIX errors to get identical errors?
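The point about rte_thread_value_get can be sketched without DPDK: a getter that returns the stored value itself has no spare return value for an error code, so a side channel is needed. Everything named below is hypothetical (`struct tls_key`, `sketch_errno`, `tls_value_get()`); it only illustrates the pattern under discussion, not the EAL implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical key object; the real API uses an opaque key type. */
struct tls_key {
	int valid;
	void *value;
};

/* Hypothetical per-thread error slot, standing in for rte_errno. */
static _Thread_local int sketch_errno;

/*
 * Returns the stored value. NULL is ambiguous (stored NULL or failure),
 * so failures are reported through sketch_errno, errno-style.
 */
static void *tls_value_get(const struct tls_key *key)
{
	if (key == NULL || !key->valid) {
		sketch_errno = EINVAL;
		return NULL;
	}
	return key->value;
}
```

A caller that needs to tell the two NULL cases apart clears the error slot before the call and inspects it afterwards.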
Re: [dpdk-dev] [PATCH v2 2/4] net/virtio: improve queue init error path
On Mon, Mar 15, 2021 at 4:20 PM Maxime Coquelin wrote: > @@ -604,15 +604,22 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t > queue_idx) > > if (VIRTIO_OPS(hw)->setup_queue(hw, vq) < 0) { > PMD_INIT_LOG(ERR, "setup_queue failed"); > - return -EINVAL; > + ret = -EINVAL; > + goto clean_vq; > } > > return 0; > > -fail_q_alloc: > - rte_free(sw_ring); > +clean_vq: > + hw->cvq = NULL; > + > + if (sw_ring) > + rte_free(sw_ring); Nit: rte_free handles NULL fine, you can remove the test, the same way it was done before. > +free_hdr_mz: > rte_memzone_free(hdr_mz); > +free_mz: > rte_memzone_free(mz); > +free_vq: > rte_free(vq); > > return ret; -- David Marchand
Re: [dpdk-dev] [PATCH v2 2/4] net/virtio: improve queue init error path
On 3/15/21 4:38 PM, David Marchand wrote: > On Mon, Mar 15, 2021 at 4:20 PM Maxime Coquelin > wrote: >> @@ -604,15 +604,22 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> >> if (VIRTIO_OPS(hw)->setup_queue(hw, vq) < 0) { >> PMD_INIT_LOG(ERR, "setup_queue failed"); >> - return -EINVAL; >> + ret = -EINVAL; >> + goto clean_vq; >> } >> >> return 0; >> >> -fail_q_alloc: >> - rte_free(sw_ring); >> +clean_vq: >> + hw->cvq = NULL; >> + >> + if (sw_ring) >> + rte_free(sw_ring); > > Nit: rte_free handles NULL fine, you can remove the test, the same way > it was done before. The API doc indeed specifies the NULL case, I'll remove it in v3. >> +free_hdr_mz: >> rte_memzone_free(hdr_mz); >> +free_mz: >> rte_memzone_free(mz); >> +free_vq: >> rte_free(vq); >> >> return ret; > Thanks, Maxime
Re: [dpdk-dev] [PATCH v2 3/4] net/virtio: allocate fake mbuf in Rx queue
On Mon, Mar 15, 2021 at 4:20 PM Maxime Coquelin wrote:
> @@ -550,10 +551,18 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
> 			goto free_hdr_mz;
> 		}
>
> +		fake_mbuf = malloc(sizeof(*fake_mbuf));
> +		if (!fake_mbuf) {
> +			PMD_INIT_LOG(ERR, "can not allocate fake mbuf");
> +			ret = -ENOMEM;
> +			goto free_sw_ring;
> +		}
> +
> 		vq->sw_ring = sw_ring;
> 		rxvq = &vq->rxq;
> 		rxvq->port_id = dev->data->port_id;
> 		rxvq->mz = mz;
> +		rxvq->fake_mbuf = fake_mbuf;

IIRC, vq is allocated as dpdk memory (rte_malloc).
Generally speaking, storing a local pointer inside such an object is dangerous if other processes start to look at this part.

> 	} else if (queue_type == VTNET_TQ) {
> 		txvq = &vq->txq;
> 		txvq->port_id = dev->data->port_id;
> @@ -613,6 +622,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx)
> clean_vq:
> 	hw->cvq = NULL;
>
> +	if (fake_mbuf)
> +		free(fake_mbuf);

No need for if().

-- 
David Marchand
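The review remarks on this patch and on patch 2/4 both rest on the same property: the release function accepts NULL, so the error path needs no guards. A minimal sketch with the standard allocator, where free(NULL) is defined as a no-op, is shown below; `struct queue`, `queue_init()` and the `fail_at` fault-injection parameter are hypothetical and exist only to make each error branch reachable.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical queue object holding two separately allocated buffers. */
struct queue {
	int *sw_ring;
	int *fake_buf;
};

/*
 * fail_at is a hypothetical fault-injection knob:
 * 0 = succeed, 1 = ring allocation fails, 2 = buffer allocation fails,
 * 3 = a later setup step fails.
 */
static int queue_init(struct queue *q, size_t n, int fail_at)
{
	int *sw_ring = NULL;
	int *fake_buf = NULL;
	int ret;

	sw_ring = fail_at == 1 ? NULL : calloc(n, sizeof(*sw_ring));
	if (sw_ring == NULL) {
		ret = -ENOMEM;
		goto out;
	}
	fake_buf = fail_at == 2 ? NULL : malloc(sizeof(*fake_buf));
	if (fake_buf == NULL) {
		ret = -ENOMEM;
		goto out;
	}
	if (fail_at == 3) {
		ret = -EINVAL;
		goto out;
	}
	q->sw_ring = sw_ring;
	q->fake_buf = fake_buf;
	return 0;

out:
	/* free(NULL) is a no-op, so no "if (ptr)" test is needed here. */
	free(fake_buf);
	free(sw_ring);
	return ret;
}
```

Because every pointer starts as NULL, a single exit label is enough in this sketch; the driver keeps several labels because its resources are released by different functions.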
Re: [dpdk-dev] [PATCH v2 4/4] net/virtio: pack virtuqueue struct
On Mon, Mar 15, 2021 at 4:20 PM Maxime Coquelin wrote:
>
> This patch optimizes packing of the virtuqueue

virtqueue ? and same typo in the title.

> struct by moving fields around to fill holes.
>
> Offset field is not used and so can be removed.
>
> Signed-off-by: Maxime Coquelin
> Reviewed-by: Chenbo Xia

-- 
David Marchand
[dpdk-dev] [PATCH v2] vhost: add header check in dequeue offload
When parsing the virtio net header and packet header for dequeue offload, we need to perform sanity check on the packet header to ensure: - No out-of-boundary memory access. - The packet header and virtio_net header are valid and aligned. Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities") Cc: sta...@dpdk.org Signed-off-by: Xiao Wang --- v2: Allow empty L4 payload for cksum offload. --- lib/librte_vhost/virtio_net.c | 49 +-- 1 file changed, 43 insertions(+), 6 deletions(-) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 583bf379c6..53a8ff2898 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -1821,44 +1821,64 @@ virtio_net_with_host_offload(struct virtio_net *dev) return false; } -static void -parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr) +static int +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr, + uint16_t *len) { struct rte_ipv4_hdr *ipv4_hdr; struct rte_ipv6_hdr *ipv6_hdr; void *l3_hdr = NULL; struct rte_ether_hdr *eth_hdr; uint16_t ethertype; + uint16_t data_len = m->data_len; eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *); + if (data_len <= sizeof(struct rte_ether_hdr)) + return -EINVAL; + m->l2_len = sizeof(struct rte_ether_hdr); ethertype = rte_be_to_cpu_16(eth_hdr->ether_type); + data_len -= sizeof(struct rte_ether_hdr); if (ethertype == RTE_ETHER_TYPE_VLAN) { + if (data_len <= sizeof(struct rte_vlan_hdr)) + return -EINVAL; + struct rte_vlan_hdr *vlan_hdr = (struct rte_vlan_hdr *)(eth_hdr + 1); m->l2_len += sizeof(struct rte_vlan_hdr); ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto); + data_len -= sizeof(struct rte_vlan_hdr); } l3_hdr = (char *)eth_hdr + m->l2_len; switch (ethertype) { case RTE_ETHER_TYPE_IPV4: + if (data_len <= sizeof(struct rte_ipv4_hdr)) + return -EINVAL; ipv4_hdr = l3_hdr; *l4_proto = ipv4_hdr->next_proto_id; m->l3_len = rte_ipv4_hdr_len(ipv4_hdr); + if (data_len <= m->l3_len) { + m->l3_len = 0; + return 
-EINVAL; + } *l4_hdr = (char *)l3_hdr + m->l3_len; m->ol_flags |= PKT_TX_IPV4; + data_len -= m->l3_len; break; case RTE_ETHER_TYPE_IPV6: + if (data_len <= sizeof(struct rte_ipv6_hdr)) + return -EINVAL; ipv6_hdr = l3_hdr; *l4_proto = ipv6_hdr->proto; m->l3_len = sizeof(struct rte_ipv6_hdr); *l4_hdr = (char *)l3_hdr + m->l3_len; m->ol_flags |= PKT_TX_IPV6; + data_len -= m->l3_len; break; default: m->l3_len = 0; @@ -1866,6 +1886,9 @@ parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr) *l4_hdr = NULL; break; } + + *len = data_len; + return 0; } static __rte_always_inline void @@ -1874,24 +1897,30 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m) uint16_t l4_proto = 0; void *l4_hdr = NULL; struct rte_tcp_hdr *tcp_hdr = NULL; + uint16_t len = 0; if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE) return; - parse_ethernet(m, &l4_proto, &l4_hdr); + if (parse_ethernet(m, &l4_proto, &l4_hdr, &len) < 0) + return; + if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) { if (hdr->csum_start == (m->l2_len + m->l3_len)) { switch (hdr->csum_offset) { case (offsetof(struct rte_tcp_hdr, cksum)): - if (l4_proto == IPPROTO_TCP) + if (l4_proto == IPPROTO_TCP && + len >= sizeof(struct rte_tcp_hdr)) m->ol_flags |= PKT_TX_TCP_CKSUM; break; case (offsetof(struct rte_udp_hdr, dgram_cksum)): - if (l4_proto == IPPROTO_UDP) + if (l4_proto == IPPROTO_UDP && + len >= sizeof(struct rte_udp_hdr)) m->ol_flags |= PKT_TX_UDP_CKSUM; break; case (offsetof(struct rte_sctp_hdr, cksum)): - if (l4_proto == IPPROTO_SCTP) + if (l4_proto == IPPROTO_SCTP && +
Re: [dpdk-dev] [PATCH 2/2] net/mlx5: avoid unbind step to enable switchdev mode
Hi, Jan

Yes, bullet [4] explicitly requires unbinding the VFs and detaching the netdevs from the mlx5_core driver.
Otherwise, the kernel driver refuses to be configured with switchdev mode in [5]. So, [4] can't be skipped.
After setting switchdev mode, VFs can be bound back (if it is needed, and these ones are not mapped to VMs):

echo -n "" > /sys/bus/pci/drivers/mlx5_core/bind

With best regards,
Slava

> -Original Message-
> From: Jan Viktorin
> Sent: Monday, March 15, 2021 17:34
> To: dev@dpdk.org
> Cc: Jan Viktorin ; Asaf Penso ; Shahaf Shuler ; Slava Ovsiienko ; Matan Azrad
> Subject: [PATCH 2/2] net/mlx5: avoid unbind step to enable switchdev mode
>
> From: Jan Viktorin
>
> The step 4 is a contradiction. It advices to unbind the device from the
> mlx5_core which removes the associated system network interface (e.g.
> eth0). In the step 5, the same system network interface (e.g. eth0) is
> required to exist.
>
> Signed-off-by: Jan Viktorin
> ---
>  doc/guides/nics/mlx5.rst | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index 0a2dc3dee..122d8e0fc 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -1370,11 +1370,7 @@ the DPDK application.
>
>  echo /sys/class/net//device/sriov_numvfs
>
> -4. Unbind the device (can be rebind after the switchdev mode)::
> -
> -echo -n "" > /sys/bus/pci/drivers/mlx5_core/unbind
> -
> -5. Enable switchdev mode::
> +4. Enable switchdev mode::
>
>  echo switchdev > /sys/class/net//compat/devlink/mode
>
> --
> 2.30.1
Re: [dpdk-dev] [PATCH v2 0/8] common/sfc_efx: prepare to introduce vDPA driver
On 3/15/2021 1:58 PM, Andrew Rybchenko wrote: Update base driver to provide functionality required by vDPA driver. Factor out helper functions to be shared by net and vDPA drivers. v2: - fix windows build breakage - do not build common/sfc_efx in the case of windows - remove undefined efx_virtio_* functions from version.map (since EFSYS_OPT_VIRTIO is disabled) Vijay Kumar Srivastava (6): common/sfc_efx/base: add virtio build dependency common/sfc_efx/base: add support to get virtio features common/sfc_efx/base: add support to verify virtio features common/sfc_efx: add support to get the device class net/sfc: skip driver probe for incompatible device class drivers: add common driver API to get efx family Vijay Srivastava (2): common/sfc_efx/base: add base virtio support for vDPA common/sfc_efx/base: add API to get VirtQ doorbell offset build still fails for windows, http://mails.dpdk.org/archives/test-report/2021-March/182539.html I guess it is missing, "build=false": if is_windows + build=false subdir_done() endif
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
On Mon, 15 Mar 2021 11:52:13 +0100 Thomas Monjalon wrote: > 15/03/2021 11:42, Kinsella, Ray: > > > > On 15/03/2021 10:31, Bruce Richardson wrote: > > > On Mon, Mar 15, 2021 at 10:19:47AM +, Kinsella, Ray wrote: > > >> > > >> > > >> On 12/03/2021 18:17, Thomas Monjalon wrote: > > >>> The option --log-level was not completely described in the usage text, > > >>> and it was difficult to guess the names of the log types and levels. > > >>> > > >>> A new value "help" is accepted after --log-level to give more details > > >>> about the syntax and listing the log types and levels. > > >>> > > >>> The array "levels" used for level name parsing is replaced with > > >>> a (modified) existing function which was used in rte_log_dump(). > > >>> > > >>> The new function rte_log_list_types() is exported in the API > > >>> for allowing an application to give this info to the user > > >>> if not exposing the EAL option --log-level. > > >>> The list of log types cannot include all drivers if not linked in the > > >>> application (shared object plugin case). > > >>> > > >>> Signed-off-by: Thomas Monjalon > > >>> --- > > >>> lib/librte_eal/common/eal_common_log.c | 24 +--- > > >>> lib/librte_eal/common/eal_common_options.c | 44 +++--- > > >>> lib/librte_eal/common/eal_log.h| 5 +++ > > >>> lib/librte_eal/include/rte_log.h | 11 ++ > > >>> lib/librte_eal/version.map | 3 ++ > > >>> 5 files changed, 69 insertions(+), 18 deletions(-) > > >>> > > > > > >>> @@ -1274,6 +1286,11 @@ eal_parse_log_level(const char *arg) > > >>> char *str, *level; > > >>> int priority; > > >>> > > >>> + if (strcmp(arg, "help") == 0) { > > >> > > >> So I think the convention is to support both "?" and "help". > > >> Qemu does this at least. > > >> > > > I've seen "/?" used for help on windows binaries, but "-?" not so much in > > > the > > > linux world, where --help (and often -h for short) seem to be the > > > standard. 
> > > > > > > This is slightly different - it is where you are looking to return a list > > of valid > > values for a parameter. So for instance in qemu mentioned above > > > > ~ > qemu-system-x86_64 -cpu ? | head -n 10 > > "?" is a special character. > In my zsh, I need to quote it to avoid globbing parsing, > so I'm not a fan. > > I will let you extend the syntax in a separate patch :) > > Also '?' is used by getopt to match unknown option. So qemu might just be doing that as unintended side effect of any unknown option
Re: [dpdk-dev] [PATCH 2/2] net/mlx5: avoid unbind step to enable switchdev mode
Hello Salva, On Mon, 15 Mar 2021 15:53:51 + Slava Ovsiienko wrote: > Hi, Jan > > Yes, bullet [4] explicitly requires to unbind VFs, and detach the netdevs > from the mlx5_core driver. > Otherwise, kernel driver refuses to be configured with switchdev mode in [5]. > So, [4] can't be skipped. > After setting swithdev mode, VFs can be bound back (if it is needed, and > these ones are not mapped to VMs): OK, but I believe that it is **not possible** to follow the rule [5]. The guide explicitly says in [4] "can be rebind **after** the switchdev mode". Just, if you unbind the device, there is no way how to configure the switchdev mode, this is the contradiction I mentioned in the commit. You cannot configure switchdev mode because the interface is gone and the path /sys/class/net//compat/devlink/mode no longer exists. So, maybe, just the formulation is wrong. So, what is the **exact right** way how to do it? I would change the commit accordingly. Just, let's make it right. Would it work this way? # echo -n "" > /sys/bus/pci/drivers/mlx5_core/unbind # echo -n "" > /sys/bus/pci/drivers/mlx5_core/bind # echo switchdev > /sys/class/net//compat/devlink/mode It is good to mention that after the rebind, the can change. Regards, Jan > > echo -n "" > > /sys/bus/pci/drivers/mlx5_core/bind > > With best regards, > Slava > > > -Original Message- > > From: Jan Viktorin > > Sent: Monday, March 15, 2021 17:34 > > To: dev@dpdk.org > > Cc: Jan Viktorin ; Asaf Penso ; > > Shahaf Shuler ; Slava Ovsiienko > > ; Matan Azrad > > Subject: [PATCH 2/2] net/mlx5: avoid unbind step to enable switchdev mode > > > > From: Jan Viktorin > > > > The step 4 is a contradiction. It advices to unbind the device from the > > mlx5_core which removes the associated system network interface (e.g. > > eth0). In the step 5, the same system network interface (e.g. eth0) is > > required to exist. 
> > > > Signed-off-by: Jan Viktorin > > --- > > doc/guides/nics/mlx5.rst | 6 +- > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index > > 0a2dc3dee..122d8e0fc 100644 > > --- a/doc/guides/nics/mlx5.rst > > +++ b/doc/guides/nics/mlx5.rst > > @@ -1370,11 +1370,7 @@ the DPDK application. > > > > echo /sys/class/net//device/sriov_numvfs > > > > -4. Unbind the device (can be rebind after the switchdev mode):: > > - > > -echo -n "" > > > /sys/bus/pci/drivers/mlx5_core/unbind > > - > > -5. Enable switchdev mode:: > > +4. Enable switchdev mode:: > > > > echo switchdev > /sys/class/net//compat/devlink/mode > > > > -- > > 2.30.1 >
Re: [dpdk-dev] [dpdk-stable] [PATCH v2] vhost: add header check in dequeue offload
On Mon, Mar 15, 2021 at 4:52 PM Xiao Wang wrote: > > When parsing the virtio net header and packet header for dequeue offload, > we need to perform sanity check on the packet header to ensure: > - No out-of-boundary memory access. > - The packet header and virtio_net header are valid and aligned. > > Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities") > Cc: sta...@dpdk.org > > Signed-off-by: Xiao Wang > --- > v2: > Allow empty L4 payload for cksum offload. > --- > lib/librte_vhost/virtio_net.c | 49 > +-- > 1 file changed, 43 insertions(+), 6 deletions(-) > > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c > index 583bf379c6..53a8ff2898 100644 > --- a/lib/librte_vhost/virtio_net.c > +++ b/lib/librte_vhost/virtio_net.c > @@ -1821,44 +1821,64 @@ virtio_net_with_host_offload(struct virtio_net *dev) > return false; > } > > -static void > -parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr) > +static int > +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr, > + uint16_t *len) > { > struct rte_ipv4_hdr *ipv4_hdr; > struct rte_ipv6_hdr *ipv6_hdr; > void *l3_hdr = NULL; > struct rte_ether_hdr *eth_hdr; > uint16_t ethertype; > + uint16_t data_len = m->data_len; > > eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *); > > + if (data_len <= sizeof(struct rte_ether_hdr)) > + return -EINVAL; On principle, the check should happen before calling rte_pktmbuf_mtod, like what rte_pktmbuf_read does. Looking at the rest of the patch, does this helper function only handle mono segment mbufs? My reading of copy_desc_to_mbuf() was that it could generate multi segments mbufs... [snip] > case RTE_ETHER_TYPE_IPV4: > + if (data_len <= sizeof(struct rte_ipv4_hdr)) > + return -EINVAL; > ipv4_hdr = l3_hdr; > *l4_proto = ipv4_hdr->next_proto_id; > m->l3_len = rte_ipv4_hdr_len(ipv4_hdr); > + if (data_len <= m->l3_len) { > + m->l3_len = 0; > + return -EINVAL; > + } ... 
so here, comparing l3 length to only the first segment length (data_len) would be invalid. If this helper must deal with multi segments, why not use rte_pktmbuf_read? This function returns access to mbuf data after checking offset and length are contiguous, else copy the needed data in a passed buffer. > *l4_hdr = (char *)l3_hdr + m->l3_len; > m->ol_flags |= PKT_TX_IPV4; > + data_len -= m->l3_len; > break; -- David Marchand
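The behaviour described for rte_pktmbuf_read (return a direct pointer when the requested bytes are contiguous, otherwise copy them into a caller buffer, and fail when the chain is too short) can be sketched on a simplified segment chain. This is not the DPDK implementation: `struct seg` and `chain_read()` are hypothetical and only model the contract as stated in the review.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified segment chain (an mbuf chain carries much more state). */
struct seg {
	const uint8_t *data;
	uint16_t len;
	const struct seg *next;
};

/*
 * Returns a pointer to len bytes starting at offset off.
 * Contiguous data is returned in place; data spanning segments is copied
 * into buf (which must hold len bytes); NULL means the chain is too short.
 */
static const void *chain_read(const struct seg *s, uint32_t off, uint32_t len, void *buf)
{
	uint8_t *out = buf;
	uint32_t remaining = len;

	/* Skip segments that end before the requested offset. */
	while (s != NULL && off >= s->len) {
		off -= s->len;
		s = s->next;
	}
	if (s == NULL)
		return NULL;

	/* Fast path: everything is inside the current segment. */
	if (len <= (uint32_t)s->len - off)
		return s->data + off;

	/* Slow path: gather the bytes across segments. */
	while (s != NULL && remaining > 0) {
		uint32_t n = (uint32_t)s->len - off;

		if (n > remaining)
			n = remaining;
		memcpy(out, s->data + off, n);
		out += n;
		remaining -= n;
		off = 0;
		s = s->next;
	}
	return remaining == 0 ? buf : NULL;
}
```

With a helper of this shape, each header check becomes a single NULL test on the returned pointer, and the parser no longer depends on the first segment holding every header.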
[dpdk-dev] [PATCH 0/7] Add support for VXLAN and NVGRE encap as a sample actions
This series adds support for VXLAN and NVGRE encap as a sample actions with the proper documentation, this series depends on [1] for the documentation part. [1] http://patches.dpdk.org/project/dpdk/patch/1615774238-51875-1-git-send-email-jiaw...@nvidia.com/ Jiawei Wang (1): app/testpmd: store VXLAN/NVGRE encap data globally Salem Sol (6): net/mlx5: support VXLAN encap action in sample net/mlx5: support NVGRE encap action in sample app/testpmd: support VXLAN encap for sample action app/testpmd: support NVGRE encap for sample action doc: update sample actions support in testpmd guide doc: update sample actions support in mlx5 guide app/test-pmd/cmdline_flow.c | 90 ++--- doc/guides/nics/mlx5.rst| 4 +- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 21 + drivers/net/mlx5/mlx5_flow_dv.c | 13 +++ 4 files changed, 99 insertions(+), 29 deletions(-) -- 2.21.0
[dpdk-dev] [PATCH 1/7] app/testpmd: store VXLAN/NVGRE encap data globally
From: Jiawei Wang With the current code the VXLAN/NVGRE parsing routine stored the configuration of the header on stack, this might lead to overwriting the data on the stack. This patch stores the external data of vxlan and nvgre encap into global data as a pre-step to supporting vxlan and nvgre encap as a sample actions. Signed-off-by: Jiawei Wang --- app/test-pmd/cmdline_flow.c | 76 - 1 file changed, 49 insertions(+), 27 deletions(-) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 49d9f9c043..84676a2e45 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -5244,31 +5244,14 @@ parse_vc_action_rss_queue(struct context *ctx, const struct token *token, return len; } -/** Parse VXLAN encap action. */ +/** Setup VXLAN encap configuration. */ static int -parse_vc_action_vxlan_encap(struct context *ctx, const struct token *token, - const char *str, unsigned int len, - void *buf, unsigned int size) +parse_setup_vxlan_encap_data + (struct action_vxlan_encap_data *action_vxlan_encap_data) { - struct buffer *out = buf; - struct rte_flow_action *action; - struct action_vxlan_encap_data *action_vxlan_encap_data; - int ret; - - ret = parse_vc(ctx, token, str, len, buf, size); - if (ret < 0) - return ret; - /* Nothing else to do if there is no buffer. */ - if (!out) - return ret; - if (!out->args.vc.actions_n) + if (!action_vxlan_encap_data) return -1; - action = &out->args.vc.actions[out->args.vc.actions_n - 1]; - /* Point to selected object. */ - ctx->object = out->args.vc.data; - ctx->objmask = NULL; /* Set up default configuration. 
*/ - action_vxlan_encap_data = ctx->object; *action_vxlan_encap_data = (struct action_vxlan_encap_data){ .conf = (struct rte_flow_action_vxlan_encap){ .definition = action_vxlan_encap_data->items, @@ -5372,19 +5355,18 @@ parse_vc_action_vxlan_encap(struct context *ctx, const struct token *token, } memcpy(action_vxlan_encap_data->item_vxlan.vni, vxlan_encap_conf.vni, RTE_DIM(vxlan_encap_conf.vni)); - action->conf = &action_vxlan_encap_data->conf; - return ret; + return 0; } -/** Parse NVGRE encap action. */ +/** Parse VXLAN encap action. */ static int -parse_vc_action_nvgre_encap(struct context *ctx, const struct token *token, +parse_vc_action_vxlan_encap(struct context *ctx, const struct token *token, const char *str, unsigned int len, void *buf, unsigned int size) { struct buffer *out = buf; struct rte_flow_action *action; - struct action_nvgre_encap_data *action_nvgre_encap_data; + struct action_vxlan_encap_data *action_vxlan_encap_data; int ret; ret = parse_vc(ctx, token, str, len, buf, size); @@ -5399,8 +5381,20 @@ parse_vc_action_nvgre_encap(struct context *ctx, const struct token *token, /* Point to selected object. */ ctx->object = out->args.vc.data; ctx->objmask = NULL; + action_vxlan_encap_data = ctx->object; + parse_setup_vxlan_encap_data(action_vxlan_encap_data); + action->conf = &action_vxlan_encap_data->conf; + return ret; +} + +/** Setup NVGRE encap configuration. */ +static int +parse_setup_nvgre_encap_data + (struct action_nvgre_encap_data *action_nvgre_encap_data) +{ + if (!action_nvgre_encap_data) + return -1; /* Set up default configuration. 
*/ - action_nvgre_encap_data = ctx->object; *action_nvgre_encap_data = (struct action_nvgre_encap_data){ .conf = (struct rte_flow_action_nvgre_encap){ .definition = action_nvgre_encap_data->items, @@ -5463,6 +5457,34 @@ parse_vc_action_nvgre_encap(struct context *ctx, const struct token *token, RTE_FLOW_ITEM_TYPE_VOID; memcpy(action_nvgre_encap_data->item_nvgre.tni, nvgre_encap_conf.tni, RTE_DIM(nvgre_encap_conf.tni)); + return 0; +} + +/** Parse NVGRE encap action. */ +static int +parse_vc_action_nvgre_encap(struct context *ctx, const struct token *token, + const char *str, unsigned int len, + void *buf, unsigned int size) +{ + struct buffer *out = buf; + struct rte_flow_action *action; + struct action_nvgre_encap_data *action_nvgre_encap_data; + int ret; + + ret = parse_vc(ctx, token, str, len, buf, size); + if (ret < 0) + return ret; + /* Nothing else to do if there is no buffer. */ + if (!out) + return ret; + if (!out->args.vc.actions_n) + return -1; + action = &out->
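The refactoring in this patch separates "fill in a default encap configuration" from "parse the action", so the same setup routine can target storage that outlives the parser call. A reduced sketch of that shape follows; all names (`struct encap_data`, `sample_encap_store`, `setup_encap_data()`, `sample_encap_conf()`) are hypothetical simplifications, not the testpmd definitions.

```c
#include <stddef.h>
#include <string.h>

#define SAMPLE_CONFS_MAX 8  /* hypothetical; plays the role of RAW_SAMPLE_CONFS_MAX_NUM */
#define ENCAP_ITEMS 4       /* hypothetical item count */

struct encap_item {
	int type;
};

struct encap_conf {
	struct encap_item *definition;  /* points into the owning encap_data */
};

struct encap_data {
	struct encap_conf conf;
	struct encap_item items[ENCAP_ITEMS];
};

/* Static storage: pointers handed out stay valid after the parser returns. */
static struct encap_data sample_encap_store[SAMPLE_CONFS_MAX];

/* Setup routine usable on any encap_data, wherever it is stored. */
static int setup_encap_data(struct encap_data *data)
{
	if (data == NULL)
		return -1;
	memset(data, 0, sizeof(*data));
	data->conf.definition = data->items;
	for (int i = 0; i < ENCAP_ITEMS; i++)
		data->items[i].type = i + 1;
	return 0;
}

/* Returns a configuration backed by the global slot idx, or NULL on a bad index. */
static const struct encap_conf *sample_encap_conf(unsigned int idx)
{
	if (idx >= SAMPLE_CONFS_MAX || setup_encap_data(&sample_encap_store[idx]) != 0)
		return NULL;
	return &sample_encap_store[idx].conf;
}
```

Since conf.definition points back into the same object, the configuration is only usable while that object is alive, which is the reason for moving it off the stack.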
[dpdk-dev] [PATCH 2/7] net/mlx5: support VXLAN encap action in sample
Add support for VXLAN encap as a sample action and validate it. Signed-off-by: Salem Sol --- drivers/net/mlx5/mlx5_flow_dv.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 1a74d5ac2b..4b2db47e39 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -5242,6 +5242,16 @@ flow_dv_validate_action_sample(uint64_t *action_flags, return ret; ++actions_n; break; + case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP: + ret = flow_dv_validate_action_l2_encap(dev, + sub_action_flags, + act, attr, + error); + if (ret < 0) + return ret; + sub_action_flags |= MLX5_FLOW_ACTION_ENCAP; + ++actions_n; + break; default: return rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION, @@ -10407,6 +10417,7 @@ flow_dv_translate_action_sample(struct rte_eth_dev *dev, action_flags |= MLX5_FLOW_ACTION_PORT_ID; break; } + case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP: case RTE_FLOW_ACTION_TYPE_RAW_ENCAP: /* Save the encap resource before sample */ pre_rix = dev_flow->handle->dvh.rix_encap_decap; -- 2.21.0
[dpdk-dev] [PATCH 4/7] app/testpmd: support VXLAN encap for sample action
Add support for rte_flow_action_vxlan_encap as a sample action. Example testpmd commands: 1. set vxlan ip-version ... vni ... udp-src ... set raw_encap 1 eth src.../ ipv4.../... set sample_actions 2 vxlan_encap / port_id id 0 / end flow create 0 ... pattern eth / end actions sample ratio 1 index 2 / raw_encap index 1 / port_id id 0... With this flow, all matched egress packets are encapsulated and sent to the wire, and are also mirrored using the VXLAN encapsulation data and sent to the wire. Signed-off-by: Salem Sol --- app/test-pmd/cmdline_flow.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 84676a2e45..61dfaab8fd 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -582,6 +582,7 @@ struct rte_flow_action_queue sample_queue[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_port_id sample_port_id[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_raw_encap sample_encap[RAW_SAMPLE_CONFS_MAX_NUM]; +struct action_vxlan_encap_data sample_vxlan_encap[RAW_SAMPLE_CONFS_MAX_NUM]; struct action_rss_data sample_rss_data[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_vf sample_vf[RAW_SAMPLE_CONFS_MAX_NUM]; @@ -1615,6 +1616,7 @@ static const enum index next_action_sample[] = { ACTION_COUNT, ACTION_PORT_ID, ACTION_RAW_ENCAP, + ACTION_VXLAN_ENCAP, ACTION_NEXT, ZERO, }; @@ -7949,6 +7951,11 @@ cmd_set_raw_parsed_sample(const struct buffer *in) (const void *)action->conf, size); action->conf = &sample_vf[idx]; break; + case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP: + size = sizeof(struct rte_flow_action_vxlan_encap); + parse_setup_vxlan_encap_data(&sample_vxlan_encap[idx]); + action->conf = &sample_vxlan_encap[idx].conf; + break; default: printf("Error - Not supported action\n"); return; -- 2.21.0
[dpdk-dev] [PATCH 3/7] net/mlx5: support NVGRE encap action in sample
Add support for NVGRE encap as a sample action and validate it. Signed-off-by: Salem Sol --- drivers/net/mlx5/mlx5_flow_dv.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index 4b2db47e39..590abdc822 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -5243,6 +5243,7 @@ flow_dv_validate_action_sample(uint64_t *action_flags, ++actions_n; break; case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP: + case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP: ret = flow_dv_validate_action_l2_encap(dev, sub_action_flags, act, attr, @@ -10418,6 +10419,7 @@ flow_dv_translate_action_sample(struct rte_eth_dev *dev, break; } case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP: + case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP: case RTE_FLOW_ACTION_TYPE_RAW_ENCAP: /* Save the encap resource before sample */ pre_rix = dev_flow->handle->dvh.rix_encap_decap; -- 2.21.0
[dpdk-dev] [PATCH 5/7] app/testpmd: support NVGRE encap for sample action
Add support for rte_flow_action_nvgre_encap as a sample action. Example testpmd commands: 1. set nvgre ip-version ... tni ... ip-src ... ip-dst ... set raw_encap 1 eth src... / ipv4... /... set sample_actions 2 nvgre_encap / port_id id 0 / end flow create 0 ... pattern eth / end actions sample ratio 1 index 2 / raw_encap index 1 / port_id id 0... With this flow, all matched egress packets are encapsulated and sent to the wire, and are also mirrored using the NVGRE encapsulation data and sent to the wire. Signed-off-by: Salem Sol --- app/test-pmd/cmdline_flow.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 61dfaab8fd..0e33410005 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -583,6 +583,7 @@ struct rte_flow_action_count sample_count[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_port_id sample_port_id[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_raw_encap sample_encap[RAW_SAMPLE_CONFS_MAX_NUM]; struct action_vxlan_encap_data sample_vxlan_encap[RAW_SAMPLE_CONFS_MAX_NUM]; +struct action_nvgre_encap_data sample_nvgre_encap[RAW_SAMPLE_CONFS_MAX_NUM]; struct action_rss_data sample_rss_data[RAW_SAMPLE_CONFS_MAX_NUM]; struct rte_flow_action_vf sample_vf[RAW_SAMPLE_CONFS_MAX_NUM]; @@ -1617,6 +1618,7 @@ static const enum index next_action_sample[] = { ACTION_PORT_ID, ACTION_RAW_ENCAP, ACTION_VXLAN_ENCAP, + ACTION_NVGRE_ENCAP, ACTION_NEXT, ZERO, }; @@ -7956,6 +7958,11 @@ cmd_set_raw_parsed_sample(const struct buffer *in) parse_setup_vxlan_encap_data(&sample_vxlan_encap[idx]); action->conf = &sample_vxlan_encap[idx].conf; break; + case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP: + size = sizeof(struct rte_flow_action_nvgre_encap); + parse_setup_nvgre_encap_data(&sample_nvgre_encap[idx]); + action->conf = &sample_nvgre_encap[idx]; + break; default: printf("Error - Not supported action\n"); return; -- 2.21.0
[dpdk-dev] [PATCH 6/7] doc: update sample actions support in testpmd guide
Update documentation for sample action usage in testpmd utilizing rte_flow_action_vxlan_encap and rte_flow_action_nvgre_encap and show the command line example. Signed-off-by: Salem Sol --- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 21 + 1 file changed, 21 insertions(+) diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index 3a31cc6237..5e40c9bc1c 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -4900,6 +4900,27 @@ and also mirrored the packets with encapsulation header and sent to port id 0. testpmd> set sample_actions 0 raw_encap / port_id id 0 / end testpmd> flow create 0 ingress transfer pattern eth / end actions sample ratio 1 index 0 / port_id id 2 / end +E-Switch Mirroring rule, the matched ingress packets are sent to port id 2, +and also mirrored the packets with VXLAN encapsulation header and sent to port id 0. + +:: + + testpmd> set vxlan ip-version ipv4 vni 4 udp-src 4 udp-dst 4 ip-src 127.0.0.1 +ip-dst 128.0.0.1 eth-src 11:11:11:11:11:11 eth-dst 22:22:22:22:22:22 + testpmd> set sample_actions 0 vxlan_encap / port_id id 0 / end + testpmd> flow create 0 ingress transfer pattern eth / end actions +sample ratio 1 index 0 / port_id id 2 / end + +E-Switch Mirroring rule, the matched ingress packets are sent to port id 2, +and also mirrored the packets with NVGRE encapsulation header and sent to port id 0. + +:: + + testpmd> set nvgre ip-version ipv4 tni 4 ip-src 127.0.0.1 ip-dst 128.0.0.1 +eth-src 11:11:11:11:11:11 eth-dst 22:22:22:22:22:22 + testpmd> set sample_actions 0 nvgre_encap / port_id id 0 / end + testpmd> flow create 0 ingress transfer pattern eth / end actions +sample ratio 1 index 0 / port_id id 2 / end BPF Functions -- -- 2.21.0
[dpdk-dev] [PATCH 7/7] doc: update sample actions support in mlx5 guide
Updates the documentation with the added support for sample actions VXLAN and NVGRE encap in E-Switch steering flow. Signed-off-by: Salem Sol --- doc/guides/nics/mlx5.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 96fce36e3c..378b7202d9 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -376,8 +376,8 @@ Limitations encapsulation actions. - For NIC Rx flow, supports ``MARK``, ``COUNT``, ``QUEUE``, ``RSS`` in the sample actions list. - - For E-Switch mirroring flow, supports ``RAW ENCAP``, ``Port ID`` in the -sample actions list. + - For E-Switch mirroring flow, supports ``RAW ENCAP``, ``Port ID``, +``VXLAN ENCAP``, ``NVGRE ENCAP`` in the sample actions list. - Modify Field flow: -- 2.21.0
[dpdk-dev] [PATCH 1/2] net/mlx5: fix typos
From: Jan Viktorin Signed-off-by: Jan Viktorin --- doc/guides/nics/mlx5.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 7c50497fb..0a2dc3dee 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1372,9 +1372,9 @@ the DPDK application. 4. Unbind the device (can be rebind after the switchdev mode):: -echo -n " /sys/bus/pci/drivers/mlx5_core/unbind +echo -n "" > /sys/bus/pci/drivers/mlx5_core/unbind -5. Enbale switchdev mode:: +5. Enable switchdev mode:: echo switchdev > /sys/class/net//compat/devlink/mode -- 2.30.1
[dpdk-dev] [PATCH 2/2] net/mlx5: avoid unbind step to enable switchdev mode
From: Jan Viktorin Step 4 contradicts step 5. It advises unbinding the device from mlx5_core, which removes the associated system network interface (e.g. eth0). Step 5 then requires that same system network interface (e.g. eth0) to exist. Signed-off-by: Jan Viktorin --- doc/guides/nics/mlx5.rst | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 0a2dc3dee..122d8e0fc 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1370,11 +1370,7 @@ the DPDK application. echo /sys/class/net//device/sriov_numvfs -4. Unbind the device (can be rebind after the switchdev mode):: - -echo -n "" > /sys/bus/pci/drivers/mlx5_core/unbind - -5. Enable switchdev mode:: +4. Enable switchdev mode:: echo switchdev > /sys/class/net//compat/devlink/mode -- 2.30.1
Re: [dpdk-dev] [PATCH v2 3/4] net/virtio: allocate fake mbuf in Rx queue
On 3/15/21 4:50 PM, David Marchand wrote: > On Mon, Mar 15, 2021 at 4:20 PM Maxime Coquelin > wrote: >> @@ -550,10 +551,18 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> goto free_hdr_mz; >> } >> >> + fake_mbuf = malloc(sizeof(*fake_mbuf)); >> + if (!fake_mbuf) { >> + PMD_INIT_LOG(ERR, "can not allocate fake mbuf"); >> + ret = -ENOMEM; >> + goto free_sw_ring; >> + } >> + >> vq->sw_ring = sw_ring; >> rxvq = &vq->rxq; >> rxvq->port_id = dev->data->port_id; >> rxvq->mz = mz; >> + rxvq->fake_mbuf = fake_mbuf; > > IIRC, vq is allocated as dpdk memory (rte_malloc). > Generally speaking, storing a local pointer inside such an object is > dangerous if other processes start to look at this part. Agreed, I will change to rte_zmalloc_socket, as vq (which it was part of) is allocated like that. Thanks, Maxime > >> } else if (queue_type == VTNET_TQ) { >> txvq = &vq->txq; >> txvq->port_id = dev->data->port_id; >> @@ -613,6 +622,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t >> queue_idx) >> clean_vq: >> hw->cvq = NULL; >> >> + if (fake_mbuf) >> + free(fake_mbuf); > > No need for if(). > >
Re: [dpdk-dev] [PATCH v2] sched : Initialize tc ov watermark.
> -Original Message- > From: dev On Behalf Of Savinay Dharmappa > Sent: Tuesday, March 9, 2021 4:10 PM > To: Singh, Jasvinder ; Dumitrescu, Cristian > ; dev@dpdk.org > Cc: Dharmappa, Savinay > Subject: [dpdk-dev] [PATCH v2] sched : Initialize tc ov watermark. > > tc ov watermark is initialized with computed value of max tc ov watermark. > > Signed-off-by: Savinay Dharmappa > --- > v2: fix spelling error. > --- > lib/librte_sched/rte_sched.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Tested-by: David Coyle
[dpdk-dev] [PATCH v3 1/4] net/virtio: remove reference to virtqueue in vrings
Vrings are part of the virtqueues, so we don't need to have a pointer to it in Vrings descriptions. Instead, let's just subtract from its offset to calculate virtqueue address. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- drivers/net/virtio/virtio_ethdev.c| 36 +-- drivers/net/virtio/virtio_rxtx.c | 28 +++ drivers/net/virtio/virtio_rxtx.h | 3 -- drivers/net/virtio/virtio_rxtx_packed.c | 4 +-- drivers/net/virtio/virtio_rxtx_packed.h | 6 ++-- drivers/net/virtio/virtio_rxtx_packed_avx.h | 4 +-- drivers/net/virtio/virtio_rxtx_simple.h | 2 +- .../net/virtio/virtio_rxtx_simple_altivec.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_neon.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_sse.c | 2 +- .../net/virtio/virtio_user/virtio_user_dev.c | 4 +-- drivers/net/virtio/virtio_user_ethdev.c | 2 +- drivers/net/virtio/virtqueue.h| 6 +++- 13 files changed, 49 insertions(+), 52 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 333a5243a9..af090fdf9c 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -133,7 +133,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl, int *dlen, int pkt_num) { - struct virtqueue *vq = cvq->vq; + struct virtqueue *vq = virtnet_cq_to_vq(cvq); int head; struct vring_packed_desc *desc = vq->vq_packed.ring.desc; struct virtio_pmd_ctrl *result; @@ -229,7 +229,7 @@ virtio_send_command_split(struct virtnet_ctl *cvq, int *dlen, int pkt_num) { struct virtio_pmd_ctrl *result; - struct virtqueue *vq = cvq->vq; + struct virtqueue *vq = virtnet_cq_to_vq(cvq); uint32_t head, i; int k, sum = 0; @@ -316,13 +316,13 @@ virtio_send_command(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl, ctrl->status = status; - if (!cvq || !cvq->vq) { + if (!cvq) { PMD_INIT_LOG(ERR, "Control queue is not supported."); return -1; } rte_spinlock_lock(&cvq->lock); - vq = cvq->vq; + vq = virtnet_cq_to_vq(cvq); PMD_INIT_LOG(DEBUG, 
"vq->vq_desc_head_idx = %d, status = %d, " "vq->hw->cvq = %p vq = %p", @@ -552,19 +552,16 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) vq->sw_ring = sw_ring; rxvq = &vq->rxq; - rxvq->vq = vq; rxvq->port_id = dev->data->port_id; rxvq->mz = mz; } else if (queue_type == VTNET_TQ) { txvq = &vq->txq; - txvq->vq = vq; txvq->port_id = dev->data->port_id; txvq->mz = mz; txvq->virtio_net_hdr_mz = hdr_mz; txvq->virtio_net_hdr_mem = hdr_mz->iova; } else if (queue_type == VTNET_CQ) { cvq = &vq->cq; - cvq->vq = vq; cvq->mz = mz; cvq->virtio_net_hdr_mz = hdr_mz; cvq->virtio_net_hdr_mem = hdr_mz->iova; @@ -851,7 +848,7 @@ virtio_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id) { struct virtio_hw *hw = dev->data->dev_private; struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id]; - struct virtqueue *vq = rxvq->vq; + struct virtqueue *vq = virtnet_rxq_to_vq(rxvq); virtqueue_enable_intr(vq); virtio_mb(hw->weak_barriers); @@ -862,7 +859,7 @@ static int virtio_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id) { struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id]; - struct virtqueue *vq = rxvq->vq; + struct virtqueue *vq = virtnet_rxq_to_vq(rxvq); virtqueue_disable_intr(vq); return 0; @@ -2180,8 +2177,7 @@ static int virtio_dev_start(struct rte_eth_dev *dev) { uint16_t nb_queues, i; - struct virtnet_rx *rxvq; - struct virtnet_tx *txvq __rte_unused; + struct virtqueue *vq; struct virtio_hw *hw = dev->data->dev_private; int ret; @@ -2238,27 +2234,27 @@ virtio_dev_start(struct rte_eth_dev *dev) PMD_INIT_LOG(DEBUG, "nb_queues=%d", nb_queues); for (i = 0; i < dev->data->nb_rx_queues; i++) { - rxvq = dev->data->rx_queues[i]; + vq = virtnet_rxq_to_vq(dev->data->rx_queues[i]); /* Flush the old packets */ - virtqueue_rxvq_flush(rxvq->vq); - virtqueue_notify(rxvq->vq); + virtqueue_rxvq_flush(vq); + virtqueue_notify(vq); } for (i = 0; i < dev->data->nb_tx_queues; i++) { - txvq = dev->data->tx_queues[i]; - 
virtqueue_notify(txvq->vq); + vq = virtnet_t
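For readers following the thread: the back-pointers can go away because each virtnet_rx/tx/ctl lives inside its virtqueue, so the parent address can be recovered by subtracting the member's offset. Below is a minimal standalone sketch of that idea, not the driver code. The struct layouts and the helper name rxq_to_queue() are simplified stand-ins invented for illustration; the patch's real helpers (virtnet_rxq_to_vq() and friends in virtqueue.h) are not shown in this excerpt and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the driver structs; field names are illustrative. */
struct rx_state {
	uint16_t port_id;
};

struct queue {
	uint16_t nentries;
	struct rx_state rxq; /* embedded member, so no back-pointer is stored */
};

/* Recover the enclosing queue from a pointer to its embedded rxq member. */
static inline struct queue *
rxq_to_queue(struct rx_state *rxq)
{
	return (struct queue *)((char *)rxq - offsetof(struct queue, rxq));
}

/* Returns 1 when the round trip yields the original queue. */
static int
roundtrip_ok(void)
{
	struct queue q = { .nentries = 256, .rxq = { .port_id = 3 } };
	struct rx_state *rxq = &q.rxq;

	return rxq_to_queue(rxq) == &q && rxq_to_queue(rxq)->nentries == 256;
}
```

The trade-off is that the helper is only valid for pointers that really point into an enclosing struct queue, which is why the NULL check in the patch moves to the embedded pointer itself.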
[dpdk-dev] [PATCH v3 0/4] net/virtio: make virtqueue struct cache-friendly
This series optimizes the cache usage of the virtqueue struct by dynamically allocating the "fake" mbuf in the Rx virtnet struct, by removing a useless virtqueue pointer from the virtnet structs and by moving a few fields to fill holes. With these 3 patches, the virtqueue struct size goes from 576 bytes (9 cachelines) to 248 bytes (4 cachelines). Changes in v3: == - Use rte_zmalloc_socket for fake mbuf alloc (David) - Fix typos in commit messages - Remove superfluous pointer check before freeing (David) - Fix checkpatch warnings Changes in v2: == - Rebase on latest main - Improve error path in virtio_init_queue - Fix various typos in commit messages Maxime Coquelin (4): net/virtio: remove reference to virtqueue in vrings net/virtio: improve queue init error path net/virtio: allocate fake mbuf in Rx queue net/virtio: pack virtqueue struct drivers/net/virtio/virtio_ethdev.c| 64 +++ drivers/net/virtio/virtio_rxtx.c | 37 +-- drivers/net/virtio/virtio_rxtx.h | 5 +- drivers/net/virtio/virtio_rxtx_packed.c | 4 +- drivers/net/virtio/virtio_rxtx_packed.h | 6 +- drivers/net/virtio/virtio_rxtx_packed_avx.h | 4 +- drivers/net/virtio/virtio_rxtx_simple.h | 2 +- .../net/virtio/virtio_rxtx_simple_altivec.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_neon.c | 2 +- drivers/net/virtio/virtio_rxtx_simple_sse.c | 2 +- .../net/virtio/virtio_user/virtio_user_dev.c | 4 +- drivers/net/virtio/virtio_user_ethdev.c | 2 +- drivers/net/virtio/virtqueue.h| 24 --- 13 files changed, 85 insertions(+), 73 deletions(-) -- 2.29.2
[dpdk-dev] [PATCH v3 2/4] net/virtio: improve queue init error path
This patch improves the error path of virtio_init_queue(), by cleaning in reversing order all resources that have been allocated. Suggested-by: Chenbo Xia Signed-off-by: Maxime Coquelin --- drivers/net/virtio/virtio_ethdev.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index af090fdf9c..d5643733f7 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -507,7 +507,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) mz = rte_memzone_lookup(vq_name); if (mz == NULL) { ret = -ENOMEM; - goto fail_q_alloc; + goto free_vq; } } @@ -533,7 +533,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) hdr_mz = rte_memzone_lookup(vq_hdr_name); if (hdr_mz == NULL) { ret = -ENOMEM; - goto fail_q_alloc; + goto free_mz; } } } @@ -547,7 +547,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) if (!sw_ring) { PMD_INIT_LOG(ERR, "can not allocate RX soft ring"); ret = -ENOMEM; - goto fail_q_alloc; + goto free_hdr_mz; } vq->sw_ring = sw_ring; @@ -604,15 +604,20 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) if (VIRTIO_OPS(hw)->setup_queue(hw, vq) < 0) { PMD_INIT_LOG(ERR, "setup_queue failed"); - return -EINVAL; + ret = -EINVAL; + goto clean_vq; } return 0; -fail_q_alloc: +clean_vq: + hw->cvq = NULL; rte_free(sw_ring); +free_hdr_mz: rte_memzone_free(hdr_mz); +free_mz: rte_memzone_free(mz); +free_vq: rte_free(vq); return ret; -- 2.29.2
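The label ordering in this patch follows the usual reverse-order unwinding idiom: each failure jumps to the label that releases only what has been acquired so far. A minimal standalone sketch of that idiom follows, assuming plain malloc()/free() in place of the DPDK allocators. The helpers step_alloc(), step_free(), the counter live and the function init_queue_sketch() are hypothetical names for the demo, with fail_at used to simulate a failure at a given step.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live; /* outstanding allocations, tracked for the demo only */

/* Allocate for a given step, or simulate a failure when step == fail_at. */
static void *
step_alloc(int step, int fail_at)
{
	void *p = (step == fail_at) ? NULL : malloc(16);

	if (p != NULL)
		live++;
	return p;
}

static void
step_free(void *p)
{
	if (p != NULL)
		live--;
	free(p);
}

/* Three dependent allocations, unwound in reverse order on failure. */
static int
init_queue_sketch(int fail_at)
{
	void *vq, *mz, *ring;
	int ret;

	vq = step_alloc(1, fail_at);
	if (vq == NULL)
		return -ENOMEM;

	mz = step_alloc(2, fail_at);
	if (mz == NULL) {
		ret = -ENOMEM;
		goto free_vq;
	}

	ring = step_alloc(3, fail_at);
	if (ring == NULL) {
		ret = -ENOMEM;
		goto free_mz;
	}

	/* Success: a real driver would keep these; the demo releases them. */
	step_free(ring);
	step_free(mz);
	step_free(vq);
	return 0;

free_mz:
	step_free(mz);
free_vq:
	step_free(vq);
	return ret;
}
```

Calling init_queue_sketch() with fail_at set to 2 or 3 exercises each unwind label, and the live counter should be back at zero afterwards.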
[dpdk-dev] [PATCH v3 3/4] net/virtio: allocate fake mbuf in Rx queue
While it is worth clarifying whether the fake mbuf in the virtnet_rx struct is really necessary, it clearly has a heavy impact on cache usage by being part of the struct. Indeed, it uses two cachelines, and requires alignment on a cacheline. Before this series, it means it took 120 bytes in virtnet_rx struct: struct virtnet_rx { struct virtqueue * vq; /* 0 8 */ /* XXX 56 bytes hole, try to pack */ /* --- cacheline 1 boundary (64 bytes) --- */ struct rte_mbuf fake_mbuf __attribute__((__aligned__(64))); /* 64 128 */ /* --- cacheline 3 boundary (192 bytes) --- */ This patch allocates it dynamically (with rte_zmalloc_socket() as of v3) in order to optimize virtnet_rx cache usage and so virtqueue cache usage. Signed-off-by: Maxime Coquelin --- drivers/net/virtio/virtio_ethdev.c | 13 + drivers/net/virtio/virtio_rxtx.c | 9 +++-- drivers/net/virtio/virtio_rxtx.h | 2 +- 3 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index d5643733f7..fda6c141dd 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -435,6 +435,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) int queue_type = virtio_get_queue_type(hw, queue_idx); int ret; int numa_node = dev->device->numa_node; + struct rte_mbuf *fake_mbuf = NULL; PMD_INIT_LOG(INFO, "setting up queue: %u on NUMA node %d", queue_idx, numa_node); @@ -550,10 +551,19 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t queue_idx) goto free_hdr_mz; } + fake_mbuf = rte_zmalloc_socket("sw_ring", sizeof(*fake_mbuf), + RTE_CACHE_LINE_SIZE, numa_node); + if (!fake_mbuf) { + PMD_INIT_LOG(ERR, "can not allocate fake mbuf"); + ret = -ENOMEM; + goto free_sw_ring; + } + vq->sw_ring = sw_ring; rxvq = &vq->rxq; rxvq->port_id = dev->data->port_id; rxvq->mz = mz; + rxvq->fake_mbuf = fake_mbuf; } else if (queue_type == VTNET_TQ) { txvq = &vq->txq; txvq->port_id = dev->data->port_id; @@ -612,6 +622,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t
queue_idx) clean_vq: hw->cvq = NULL; + rte_free(fake_mbuf); +free_sw_ring: rte_free(sw_ring); free_hdr_mz: rte_memzone_free(hdr_mz); @@ -641,6 +653,7 @@ virtio_free_queues(struct virtio_hw *hw) queue_type = virtio_get_queue_type(hw, i); if (queue_type == VTNET_RQ) { + free(vq->rxq.fake_mbuf); rte_free(vq->sw_ring); rte_memzone_free(vq->rxq.mz); } else if (queue_type == VTNET_TQ) { diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 32af8d3d11..8df913b0ba 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -703,12 +703,9 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) virtio_rxq_vec_setup(rxvq); } - memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf)); - for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; -desc_idx++) { - vq->sw_ring[vq->vq_nentries + desc_idx] = - &rxvq->fake_mbuf; - } + memset(rxvq->fake_mbuf, 0, sizeof(*rxvq->fake_mbuf)); + for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; desc_idx++) + vq->sw_ring[vq->vq_nentries + desc_idx] = rxvq->fake_mbuf; if (hw->use_vec_rx && !virtio_with_packed_queue(hw)) { while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h index 7f1036be6f..6ce5d67d15 100644 --- a/drivers/net/virtio/virtio_rxtx.h +++ b/drivers/net/virtio/virtio_rxtx.h @@ -19,7 +19,7 @@ struct virtnet_stats { struct virtnet_rx { /* dummy mbuf, for wraparound when processing RX ring. */ - struct rte_mbuf fake_mbuf; + struct rte_mbuf *fake_mbuf; uint64_t mbuf_initializer; /**< value to init mbufs. */ struct rte_mempool *mpool; /**< mempool for mbuf allocation */ -- 2.29.2
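The layout dump in the commit message can be reproduced in miniature. The sketch below is only an illustration under assumptions: struct fake_buf is a 128-byte, 64-byte-aligned stand-in for struct rte_mbuf, the two rx_* structs are hypothetical reductions of virtnet_rx, and the alignment attribute is the GCC/Clang spelling. It shows how embedding a cache-line-aligned member forces a hole after a leading pointer, while storing a pointer to a separately allocated buffer does not.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct rte_mbuf: 128 bytes, aligned on an assumed 64-byte line. */
struct fake_buf {
	char bytes[128];
} __attribute__((__aligned__(64)));

/* Before: the aligned member is embedded, so padding follows the pointer. */
struct rx_embedded {
	void *vq;
	struct fake_buf fake;
};

/* After: only a pointer is kept; the buffer lives in its own allocation. */
struct rx_pointer {
	void *vq;
	struct fake_buf *fake;
};

static size_t
embedded_offset(void)
{
	return offsetof(struct rx_embedded, fake);
}

static size_t
embedded_size(void)
{
	return sizeof(struct rx_embedded);
}

static size_t
pointer_size(void)
{
	return sizeof(struct rx_pointer);
}
```

The cost of the pointer variant is one extra dereference and a separate allocation that has to be released with the matching free routine.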
[dpdk-dev] [PATCH v3 4/4] net/virtio: pack virtqueue struct
This patch optimizes packing of the virtqueue struct by moving fields around to fill holes. Offset field is not used and so can be removed. Signed-off-by: Maxime Coquelin Reviewed-by: Chenbo Xia --- drivers/net/virtio/virtqueue.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index 17e76f0e8c..e9992b745d 100644 --- a/drivers/net/virtio/virtqueue.h +++ b/drivers/net/virtio/virtqueue.h @@ -244,6 +244,15 @@ struct virtqueue { uint16_t vq_avail_idx; /**< sync until needed */ uint16_t vq_free_thresh; /**< free threshold */ + /** +* Head of the free chain in the descriptor table. If +* there are no free descriptors, this will be set to +* VQ_RING_DESC_CHAIN_END. +*/ + uint16_t vq_desc_head_idx; + uint16_t vq_desc_tail_idx; + uint16_t vq_queue_index; /**< PCI queue index */ + void *vq_ring_virt_mem; /**< linear address of vring*/ unsigned int vq_ring_size; @@ -256,15 +265,6 @@ struct virtqueue { rte_iova_t vq_ring_mem; /**< physical address of vring, * or virtual address for virtio_user. */ - /** -* Head of the free chain in the descriptor table. If -* there are no free descriptors, this will be set to -* VQ_RING_DESC_CHAIN_END. -*/ - uint16_t vq_desc_head_idx; - uint16_t vq_desc_tail_idx; - uint16_t vq_queue_index; - uint16_t offset; /**< relative offset to obtain addr in mbuf */ uint16_t *notify_addr; struct rte_mbuf **sw_ring; /**< RX software ring. */ struct vq_desc_extra vq_descx[0]; -- 2.29.2
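As a rough illustration of why moving the three uint16_t fields next to the other 16-bit members helps, the sketch below compares two orderings of the same members. The structs are hypothetical reductions, not the real struct virtqueue, and the exact sizes depend on the ABI, so only the relative comparison is relied on here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small fields split by a pointer: alignment padding appears around it. */
struct loose_order {
	uint16_t avail_idx;
	void *ring_mem;    /* pointer alignment can leave a hole before this */
	uint16_t head_idx;
	uint16_t tail_idx; /* tail padding can follow */
};

/* Same members with the small fields grouped ahead of the pointer. */
struct grouped_order {
	uint16_t avail_idx;
	uint16_t head_idx;
	uint16_t tail_idx;
	void *ring_mem;
};

static size_t
loose_size(void)
{
	return sizeof(struct loose_order);
}

static size_t
grouped_size(void)
{
	return sizeof(struct grouped_order);
}
```

The grouped layout is never larger than the loose one, and on common 64-bit ABIs it is smaller; a tool such as pahole reports the holes directly for the real struct.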
Re: [dpdk-dev] [PATCH v3 3/4] net/virtio: allocate fake mbuf in Rx queue
On Mon, Mar 15, 2021 at 5:46 PM Maxime Coquelin wrote: > @@ -612,6 +622,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t > queue_idx) > > clean_vq: > hw->cvq = NULL; > + rte_free(fake_mbuf); > +free_sw_ring: > rte_free(sw_ring); > free_hdr_mz: > rte_memzone_free(hdr_mz); > @@ -641,6 +653,7 @@ virtio_free_queues(struct virtio_hw *hw) > > queue_type = virtio_get_queue_type(hw, i); > if (queue_type == VTNET_RQ) { > + free(vq->rxq.fake_mbuf); rte_free() -- David Marchand
Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4
On 3/1/2021 4:38 PM, Stephen Hemminger wrote: On Mon, 1 Mar 2021 11:10:01 +0300 Igor Ryzhov wrote: Stephen, No, I don't have a better proposal, but I think it is not correct to change the behavior of KNI (making link down without a real response). Even though we know that communicating with userspace under rtnl_lock is a bad idea, it has worked as it is for many years already. Elad, I agree with you that KNI should be removed from the main tree if it is not possible to fix this __dev_close_many issue. There were discussions about this multiple times already, but no one is working on this AFAIK. Last time the discussion was a month ago: https://www.mail-archive.com/dev@dpdk.org/msg196033.html Igor The better proposal would be to make DPDK virtio smarter. There already are virtio devices that must handle this (VDPA) etc. And when you can control link through virtio, then put a big warning in KNI that says "Don't use this" Hi Igor, Elad, I think it is reasonable to make the ifdown asynchronous to solve the problem. We can still keep sync as the default and enable async with a kernel parameter, to cover both cases. I will put more details on the patches.
Re: [dpdk-dev] [PATCH v3 0/4] net/virtio: make virtqueue struct cache-friendly
On Mon, Mar 15, 2021 at 5:46 PM Maxime Coquelin wrote: > > This series optimizes the cache usage of virtqueue struct, > by making a "fake" mbuf being dynamically allocated in Rx > virtnet struct, by removing a useless virtqueue pointer > into the virtnet structs and by moving a few fields > to pack holes. > > With these 3 patches, the virtqueue struct size goes from > 576 bytes (9 cachelines) to 248 bytes (4 cachelines). > > Changes in v3: > == > - Use rte_zmalloc_socket for fake mbuf alloc (David) > - Fix typos in commit messages > - Remove superfluous pointer check before freeing (David) > - Fix checkpatch warnings Once patch 3 is fixed, you can add for the series: Reviewed-by: David Marchand -- David Marchand
Re: [dpdk-dev] [PATCH v3 07/11] eal: add log level help
On 15/03/2021 15:59, Stephen Hemminger wrote: > On Mon, 15 Mar 2021 11:52:13 +0100 > Thomas Monjalon wrote: > >> 15/03/2021 11:42, Kinsella, Ray: >>> >>> On 15/03/2021 10:31, Bruce Richardson wrote: On Mon, Mar 15, 2021 at 10:19:47AM +, Kinsella, Ray wrote: > > > On 12/03/2021 18:17, Thomas Monjalon wrote: >> The option --log-level was not completely described in the usage text, >> and it was difficult to guess the names of the log types and levels. >> >> A new value "help" is accepted after --log-level to give more details >> about the syntax and listing the log types and levels. >> >> The array "levels" used for level name parsing is replaced with >> a (modified) existing function which was used in rte_log_dump(). >> >> The new function rte_log_list_types() is exported in the API >> for allowing an application to give this info to the user >> if not exposing the EAL option --log-level. >> The list of log types cannot include all drivers if not linked in the >> application (shared object plugin case). >> >> Signed-off-by: Thomas Monjalon >> --- >> lib/librte_eal/common/eal_common_log.c | 24 +--- >> lib/librte_eal/common/eal_common_options.c | 44 +++--- >> lib/librte_eal/common/eal_log.h| 5 +++ >> lib/librte_eal/include/rte_log.h | 11 ++ >> lib/librte_eal/version.map | 3 ++ >> 5 files changed, 69 insertions(+), 18 deletions(-) >> >> @@ -1274,6 +1286,11 @@ eal_parse_log_level(const char *arg) >> char *str, *level; >> int priority; >> >> +if (strcmp(arg, "help") == 0) { > > So I think the convention is to support both "?" and "help". > Qemu does this at least. > I've seen "/?" used for help on windows binaries, but "-?" not so much in the linux world, where --help (and often -h for short) seem to be the standard. >>> >>> This is slightly different - it is where you are looking to return a list >>> of valid >>> values for a parameter. So for instance in qemu mentioned above >>> >>> ~ > qemu-system-x86_64 -cpu ? | head -n 10 >> >> "?" is a special character. 
>> In my zsh, I need to quote it to avoid globbing parsing, >> so I'm not a fan. >> >> I will let you extend the syntax in a separate patch :) >> >> > > Also '?' is used by getopt to match unknown option. So qemu might just be > doing that as unintended side effect of any unknown option > for other unknowns it explicitly complains ... ~ > qemu-system-x86_64 -cpu unknown Unable to init server: Could not connect: Connection refused qemu-system-x86_64: unable to find CPU model 'unknown