Re: [PATCH] doc: avoid meson deprecation in setup

2022-11-16 Thread David Marchand
On Tue, Nov 15, 2022 at 6:39 PM Bruce Richardson
 wrote:
>
> On Tue, Nov 15, 2022 at 09:35:16AM -0800, Stephen Hemminger wrote:
> > The command "meson build" causes a deprecation warning with meson 0.64.0.
> > 
> > WARNING: Running the setup command as `meson [options]` instead of 
> > `meson setup [options]` is ambiguous and deprecated.
> >
> > Therefore fix the examples in the documentation.
> >
> > Signed-off-by: Stephen Hemminger 
> > ---
> >  doc/guides/cryptodevs/armv8.rst  |  2 +-
> >  doc/guides/cryptodevs/uadk.rst   |  2 +-
> >  doc/guides/freebsd_gsg/build_dpdk.rst|  2 +-
> >  doc/guides/gpus/cuda.rst |  4 ++--
> >  doc/guides/howto/openwrt.rst |  4 ++--
> >  doc/guides/nics/ark.rst  |  2 +-
> >  doc/guides/nics/mvneta.rst   |  2 +-
> >  doc/guides/nics/mvpp2.rst|  2 +-
> >  doc/guides/platform/bluefield.rst|  4 ++--
> >  doc/guides/platform/cnxk.rst |  4 ++--
> >  doc/guides/platform/octeontx.rst |  8 
> >  doc/guides/prog_guide/build-sdk-meson.rst|  4 ++--
> >  doc/guides/prog_guide/lto.rst|  2 +-
> >  doc/guides/prog_guide/profile_app.rst|  2 +-
> >  doc/guides/sample_app_ug/vm_power_management.rst | 14 ++
> >  15 files changed, 28 insertions(+), 30 deletions(-)
> >
> Acked-by: Bruce Richardson 
>
> The fact that this needs to be changed in so many places is indicative that
> we need some cleanup in the docs - but then again, we knew that already!
> :-)
>

Indeed, and I see many places still showing the issue after the patch.
Stephen, can you look at other guides?

Thanks!

-- 
David Marchand



[Bug 1030] rte_malloc() and rte_free() get stuck when used with signal handler

2022-11-16 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1030

Thomas Monjalon (tho...@monjalon.net) changed:

            What    |Removed      |Added
 ----------------------------------------------------
          Status    |UNCONFIRMED  |RESOLVED
      Resolution    |---          |FIXED

--- Comment #1 from Thomas Monjalon (tho...@monjalon.net) ---
Resolved in http://git.dpdk.org/dpdk/commit/?id=8f8e8f0226

-- 
You are receiving this mail because:
You are the assignee for the bug.

[PATCH] net/mlx5: fix GENEVE resource management

2022-11-16 Thread Suanming Mou
The item translation split causes the GENEVE TLV option resource register
function flow_dev_geneve_tlv_option_resource_register() to be incorrectly
called twice, once for the spec and once for the mask translation.

In SWS mode the refcnt is only decreased by 1 on flow release, so it never
reaches 0 again and the resource is leaked.
In HWS mode the resource is allocated as global, so the refcnt should not
be increased after the resource has been allocated, and the resource
should be released when the PMD exits.

This commit fixes GENEVE resource management.
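
For context, a minimal standalone sketch of the leaking refcount pattern
described above (hypothetical simplified structure and function names, not
the mlx5 code): registering the resource twice per flow while releasing it
once leaves the counter above zero, so the resource is never freed.

/* Hypothetical, simplified illustration of the SWS leak described above. */
#include <stdint.h>
#include <stdlib.h>

struct tlv_opt_resource {
	uint32_t refcnt;
};

static struct tlv_opt_resource *res;

/* Called once for the spec and once for the mask translation. */
static void
resource_register(void)
{
	if (res == NULL) {
		res = calloc(1, sizeof(*res));
		if (res == NULL)
			return;
	}
	__atomic_fetch_add(&res->refcnt, 1, __ATOMIC_RELAXED);
}

/* Called once per flow on release. */
static void
resource_release(void)
{
	if (__atomic_fetch_sub(&res->refcnt, 1, __ATOMIC_RELAXED) == 1) {
		free(res);
		res = NULL;
	}
}

int
main(void)
{
	resource_register();	/* spec translation */
	resource_register();	/* mask translation: the extra, incorrect call */
	resource_release();	/* flow destroy: refcnt goes 2 -> 1, never 0 */
	return res != NULL;	/* returns 1: the resource leaked */
}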

Fixes: 75a00812b18f ("net/mlx5: add hardware steering item translation")
Fixes: cd4ab742064a ("net/mlx5: split flow item matcher and value translation")

Signed-off-by: Suanming Mou 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.c |  8 +++-
 drivers/net/mlx5/mlx5_flow.h|  2 ++
 drivers/net/mlx5/mlx5_flow_dv.c | 31 ++-
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index b3efdad293..6a0d66247a 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1757,7 +1757,13 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh)
} while (++i < sh->bond.n_port);
if (sh->td)
claim_zero(mlx5_devx_cmd_destroy(sh->td));
-   MLX5_ASSERT(sh->geneve_tlv_option_resource == NULL);
+#ifdef HAVE_MLX5_HWS_SUPPORT
+   /* HWS manages geneve_tlv_option resource as global. */
+   if (sh->config.dv_flow_en == 2)
+   flow_dev_geneve_tlv_option_resource_release(sh);
+   else
+#endif
+   MLX5_ASSERT(sh->geneve_tlv_option_resource == NULL);
pthread_mutex_destroy(&sh->txpp.mutex);
mlx5_lwm_unset(sh);
mlx5_free(sh);
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 94e0ac99b9..1f57ecd6e1 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -2484,6 +2484,8 @@ struct mlx5_aso_age_action 
*flow_aso_age_get_by_idx(struct rte_eth_dev *dev,
 int flow_dev_geneve_tlv_option_resource_register(struct rte_eth_dev *dev,
 const struct rte_flow_item *item,
 struct rte_flow_error *error);
+void flow_dev_geneve_tlv_option_resource_release(struct mlx5_dev_ctx_shared 
*sh);
+
 void flow_release_workspace(void *data);
 int mlx5_flow_os_init_workspace_once(void);
 void *mlx5_flow_os_get_specific_workspace(void);
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index bc9a75f225..a9357096f5 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -9493,9 +9493,13 @@ flow_dev_geneve_tlv_option_resource_register(struct 
rte_eth_dev *dev,
geneve_opt_v->option_type &&
geneve_opt_resource->length ==
geneve_opt_v->option_len) {
-   /* We already have GENEVE TLV option obj allocated. */
-   __atomic_fetch_add(&geneve_opt_resource->refcnt, 1,
-  __ATOMIC_RELAXED);
+   /*
+* We already have GENEVE TLV option obj allocated.
+* Increasing refcnt only in SWS. HWS uses it as global.
+*/
+   if (priv->sh->config.dv_flow_en == 1)
+   
__atomic_fetch_add(&geneve_opt_resource->refcnt, 1,
+  __ATOMIC_RELAXED);
} else {
ret = rte_flow_error_set(error, ENOMEM,
RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
@@ -9571,11 +9575,14 @@ flow_dv_translate_item_geneve_opt(struct rte_eth_dev 
*dev, void *key,
return -1;
MLX5_ITEM_UPDATE(item, key_type, geneve_opt_v, geneve_opt_m,
 &rte_flow_item_geneve_opt_mask);
-   ret = flow_dev_geneve_tlv_option_resource_register(dev, item,
-  error);
-   if (ret) {
-   DRV_LOG(ERR, "Failed to create geneve_tlv_obj");
-   return ret;
+   /* Register resource requires item spec. */
+   if (key_type & MLX5_SET_MATCHER_V) {
+   ret = flow_dev_geneve_tlv_option_resource_register(dev, item,
+  error);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to create geneve_tlv_obj");
+   return ret;
+   }
}
/*
 * Set the option length in GENEVE header if not requested.
@@ -15226,11 +15233,9 @@ flow_dv_dest_array_resource_release(struct rte_eth_dev 
*dev,
&resource->entry);
 }
 
-static void
-flow_dv_geneve_tlv_option_resource_release(struct rte_eth_dev *dev)
+void
+fl

[PATCH] net/failsafe: Fix crash due to invalid sub-device port id

2022-11-16 Thread madhuker . mythri
From: Madhuker Mythri 

A crash occurs while DPDK secondary processes try to probe the tap device,
where the tap device is a sub-device of the fail-safe device.
Sometimes we get invalid sub-devices (with invalid port IDs and device
names), due to which the IPC communication gets no response and
communication between the primary and secondary processes fails.
So the sub-device (tap) needs to be validated during the secondary process
probe in the fail-safe PMD to avoid such issues.

Bugzilla Id: 1116

Signed-off-by: Madhuker Mythri 
---
 drivers/net/failsafe/failsafe.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 32811403b4..3663976697 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -361,6 +361,10 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
if (sdev->devargs.name[0] == '\0')
continue;
 
+   if (!rte_eth_dev_is_valid_port(PORT_ID(sdev))) {
+   continue;
+   }
+
/* rebuild devargs to be able to get the bus name. */
ret = rte_devargs_parse(&devargs,
sdev->devargs.name);
-- 
2.32.0.windows.1



[PATCH] mempool: micro-optimize put function

2022-11-16 Thread Morten Brørup
Micro-optimization:
Reduced the most likely code path in the generic put function by moving an
unlikely check out of the most likely code path and further down.

Also updated the comments in the function.

Signed-off-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..aba90dbb5b 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1364,32 +1364,33 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
 {
void **cache_objs;
 
-   /* No cache provided */
+   /* No cache provided? */
if (unlikely(cache == NULL))
goto driver_enqueue;
 
-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-   /* The request itself is too big for the cache */
-   if (unlikely(n > cache->flushthresh))
-   goto driver_enqueue_stats_incremented;
-
-   /*
-* The cache follows the following algorithm:
-*   1. If the objects cannot be added to the cache without crossing
-*  the flush threshold, flush the cache to the backend.
-*   2. Add the objects to the cache.
-*/
-
-   if (cache->len + n <= cache->flushthresh) {
+   if (likely(cache->len + n <= cache->flushthresh)) {
+   /*
+* The objects can be added to the cache without crossing the
+* flush threshold.
+*/
cache_objs = &cache->objs[cache->len];
cache->len += n;
-   } else {
+   } else if (likely(n <= cache->flushthresh)) {
+   /*
+* The request itself fits into the cache.
+* But first, the cache must be flushed to the backend, so
+* adding the objects does not cross the flush threshold.
+*/
cache_objs = &cache->objs[0];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
cache->len = n;
+   } else {
+   /* The request itself is too big for the cache. */
+   goto driver_enqueue_stats_incremented;
}
 
/* Add the objects to the cache. */
@@ -1399,13 +1400,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
 
 driver_enqueue:
 
-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
 driver_enqueue_stats_incremented:
 
-   /* push objects to the backend */
+   /* Push the objects to the backend. */
rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
 }
 
-- 
2.17.1



Re: [PATCH] maintainers: update for gve

2022-11-16 Thread Jordan Kimbrough
I've also registered

On Tue, Nov 15, 2022 at 12:04 PM Rushil Gupta  wrote:

> Done. Thanks Junfeng!
>
>
>
> On Tue, Nov 15, 2022 at 1:22 AM Guo, Junfeng 
> wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Thomas Monjalon 
> > > Sent: Tuesday, November 15, 2022 16:33
> > > To: Guo, Junfeng 
> > > Cc: dev@dpdk.org; Zhang, Qi Z ; Wu, Jingjing
> > > ; ferruh.yi...@xilinx.com; Xing, Beilei
> > > ; dev@dpdk.org; jeroe...@google.com;
> > > jr...@google.com; Zhang, Helin ; Rushil Gupta
> > > ; Jeroen de Borst ;
> > > Jordan Kimbrough 
> > > Subject: Re: [PATCH] maintainers: update for gve
> > >
> > > 09/11/2022 20:37, Rushil Gupta:
> > > > Thanks a lot Junfeng!
> > > >
> > > > On Tue, Nov 8, 2022 at 11:26 PM Junfeng Guo 
> > > wrote:
> > > >
> > > > > Add co-maintainers from Google team for gve (Google Virtual
> > > Ethernet).
> > > > >
> > > > > Signed-off-by: Junfeng Guo 
> > > > > ---
> > > > >  Google Virtual Ethernet
> > > > >  M: Junfeng Guo 
> > > > > +M: Jeroen de Borst 
> > > > > +M: Rushil Gupta 
> > > > > +M: Jordan Kimbrough 
> > >
> > > They were not part of the patch review process in the mailing list,
> > > why do you want them to become maintainers?
> > > I think it would be saner to have them involved first.
> >
> > Yes, make sense! Thanks for reminding this!
> >
> > Hi @Rushil Gupta @Jeroen de Borst @Jordan Kimbrough,
> > could you help register for dev@dpdk.org
> > at https://www.dpdk.org/contribute/#Mailing-Lists as well as for
> > the patchwork at http://patchwork.dpdk.org/project/dpdk/list/ first?
> > Then you can add more info & explanation for your maintaining plan here.
> > Thanks!
> >
> > >
> >
> >
>


[PATCH v2] mempool: fix rte_mempool_avail_count may segment fault when used in multiprocess

2022-11-16 Thread Fengnan Chang
rte_mempool_create() puts the tailq entry into the rte_mempool_tailq list
before populating the mempool, while pool_data is only set during populate.
So in a multi-process setup, if process A creates a mempool, process B can
get it through rte_mempool_lookup() before pool_data is set; if B then
calls rte_mempool_avail_count(), it causes a segmentation fault.

Fix this by putting the tailq entry into rte_mempool_tailq after populate.
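
To make the failure mode concrete, a hedged sketch of the secondary-process
sequence described above (error handling trimmed, and the pool name
"shared_pool" is a placeholder): before this fix the lookup can succeed
while mp->pool_data is still unset, so the avail-count walk dereferences an
invalid pointer.

/* Secondary process, racing with the primary's rte_mempool_create(). */
#include <rte_eal.h>
#include <rte_mempool.h>

int
main(int argc, char **argv)
{
	struct rte_mempool *mp;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/*
	 * The tailq entry is visible as soon as the primary inserted it,
	 * which (before this fix) happened before populate set pool_data.
	 */
	mp = rte_mempool_lookup("shared_pool");
	if (mp == NULL)
		return -1;

	/* May dereference mp->pool_data before the primary populated it. */
	return (int)rte_mempool_avail_count(mp);
}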

Signed-off-by: Fengnan Chang 
---
 lib/mempool/rte_mempool.c | 43 ++-
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
index 4c78071a34..b3a6572fc8 100644
--- a/lib/mempool/rte_mempool.c
+++ b/lib/mempool/rte_mempool.c
@@ -155,6 +155,27 @@ get_min_page_size(int socket_id)
return wa.min == SIZE_MAX ? (size_t) rte_mem_page_size() : wa.min;
 }
 
+static int
+add_mempool_to_list(struct rte_mempool *mp)
+{
+   struct rte_mempool_list *mempool_list;
+   struct rte_tailq_entry *te = NULL;
+
+   /* try to allocate tailq entry */
+   te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
+   if (te == NULL) {
+   RTE_LOG(ERR, MEMPOOL, "Cannot allocate tailq entry!\n");
+   return -ENOMEM;
+   }
+
+   te->data = mp;
+   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
+   rte_mcfg_tailq_write_lock();
+   TAILQ_INSERT_TAIL(mempool_list, te, next);
+   rte_mcfg_tailq_write_unlock();
+
+   return 0;
+}
 
 static void
 mempool_add_elem(struct rte_mempool *mp, __rte_unused void *opaque,
@@ -304,6 +325,9 @@ mempool_ops_alloc_once(struct rte_mempool *mp)
if (ret != 0)
return ret;
mp->flags |= RTE_MEMPOOL_F_POOL_CREATED;
+   ret = add_mempool_to_list(mp);
+   if (ret != 0)
+   return ret;
}
return 0;
 }
@@ -798,9 +822,7 @@ rte_mempool_create_empty(const char *name, unsigned n, 
unsigned elt_size,
int socket_id, unsigned flags)
 {
char mz_name[RTE_MEMZONE_NAMESIZE];
-   struct rte_mempool_list *mempool_list;
struct rte_mempool *mp = NULL;
-   struct rte_tailq_entry *te = NULL;
const struct rte_memzone *mz = NULL;
size_t mempool_size;
unsigned int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
@@ -820,8 +842,6 @@ rte_mempool_create_empty(const char *name, unsigned n, 
unsigned elt_size,
  RTE_CACHE_LINE_MASK) != 0);
 #endif
 
-   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
-
/* asked for zero items */
if (n == 0) {
rte_errno = EINVAL;
@@ -866,14 +886,6 @@ rte_mempool_create_empty(const char *name, unsigned n, 
unsigned elt_size,
private_data_size = (private_data_size +
 RTE_MEMPOOL_ALIGN_MASK) & 
(~RTE_MEMPOOL_ALIGN_MASK);
 
-
-   /* try to allocate tailq entry */
-   te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
-   if (te == NULL) {
-   RTE_LOG(ERR, MEMPOOL, "Cannot allocate tailq entry!\n");
-   goto exit_unlock;
-   }
-
mempool_size = RTE_MEMPOOL_HEADER_SIZE(mp, cache_size);
mempool_size += private_data_size;
mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
@@ -923,20 +935,13 @@ rte_mempool_create_empty(const char *name, unsigned n, 
unsigned elt_size,
   cache_size);
}
 
-   te->data = mp;
-
-   rte_mcfg_tailq_write_lock();
-   TAILQ_INSERT_TAIL(mempool_list, te, next);
-   rte_mcfg_tailq_write_unlock();
rte_mcfg_mempool_write_unlock();
-
rte_mempool_trace_create_empty(name, n, elt_size, cache_size,
private_data_size, flags, mp);
return mp;
 
 exit_unlock:
rte_mcfg_mempool_write_unlock();
-   rte_free(te);
rte_mempool_free(mp);
return NULL;
 }
-- 
2.37.0 (Apple Git-136)



Reminder - DPDK DTS Working Group - Tomorrow Morning 11/16/22 @ 9am EST/'6am PST/2pm UTC

2022-11-16 Thread Nathan Southern
Good evening,

Tomorrow morning, Wed. 11/16/22, we will hold our DPDK DTS Working Group at
9am EST/6am PST/2pm UTC. Zoom information to follow. We hope to see you
there.

Thanks,

Nathan

Nathan C. Southern, Project Coordinator

Data Plane Development Kit

The Linux Foundation

248.835.4812 (mobile)

nsouth...@linuxfoundation.org


Description: DPDK Project is inviting you to a scheduled Zoom meeting.

Topic: DTS Working Group - DPDK
Time: Oct 20, 2022 09:00 AM Eastern Time (US and Canada)
Every 2 weeks on Thu, until Dec 14, 2023, 31 occurrence(s)
Oct 20, 2022 09:00 AM
Nov 3, 2022 09:00 AM
Nov 17, 2022 09:00 AM
Dec 1, 2022 09:00 AM
Dec 15, 2022 09:00 AM
Dec 29, 2022 09:00 AM
Jan 12, 2023 09:00 AM
Jan 26, 2023 09:00 AM
Feb 9, 2023 09:00 AM
Feb 23, 2023 09:00 AM
Mar 9, 2023 09:00 AM
Mar 23, 2023 09:00 AM
Apr 6, 2023 09:00 AM
Apr 20, 2023 09:00 AM
May 4, 2023 09:00 AM
May 18, 2023 09:00 AM
Jun 1, 2023 09:00 AM
Jun 15, 2023 09:00 AM
Jun 29, 2023 09:00 AM
Jul 13, 2023 09:00 AM
Jul 27, 2023 09:00 AM
Aug 10, 2023 09:00 AM
Aug 24, 2023 09:00 AM
Sep 7, 2023 09:00 AM
Sep 21, 2023 09:00 AM
Oct 5, 2023 09:00 AM
Oct 19, 2023 09:00 AM
Nov 2, 2023 09:00 AM
Nov 16, 2023 09:00 AM
Nov 30, 2023 09:00 AM
Dec 14, 2023 09:00 AM
Please download and import the following iCalendar (.ics) files to your
calendar system.
Weekly:
https://zoom.us/meeting/tJIodumhqDoiGddm__jlxmO14AUoLwAA_SST/ics?icsToken=98tyKuCuqzoqE9KUuBqERowAGYj4c_Pwpn5HjadZkSDaCSxLbyynYsN3PZ5oMfnv


Join Zoom Meeting
https://zoom.us/j/96510961833?pwd=YUJHWXF0bjAzWHFkTnRRNThkT0xXQT09


Meeting ID: 965 1096 1833
Passcode: 046886
One tap mobile
+13126266799,,96510961833#*046886# US (Chicago)
+16465588656,,96510961833#*046886# US (New York)

Dial by your location
+1 312 626 6799 US (Chicago)
+1 646 558 8656 US (New York)
+1 646 931 3860 US
+1 301 715 8592 US (Washington DC)
+1 309 205 3325 US
+1 346 248 7799 US (Houston)
+1 386 347 5053 US
+1 564 217 2000 US
+1 669 444 9171 US
+1 669 900 6833 US (San Jose)
+1 719 359 4580 US
+1 253 215 8782 US (Tacoma)
877 369 0926 US Toll-free
855 880 1246 US Toll-free
+1 438 809 7799 Canada
+1 587 328 1099 Canada
+1 647 374 4685 Canada
+1 647 558 0588 Canada
+1 778 907 2071 Canada
+1 780 666 0144 Canada
+1 204 272 7920 Canada
855 703 8985 Canada Toll-free
Meeting ID: 965 1096 1833
Passcode: 046886
Find your local number: https://zoom.us/u/asSaixgHd

Organizer: DPDK (Data Plane Development Kit)
Created by: Nathan Southern


Reminder - DPDK Techboard Meeting Tomorrow Wed. 11/16/22 @ 11am EST/8am PST/4pm UTC

2022-11-16 Thread Nathan Southern
Dear DPDK Community,

Tomorrow we will hold our biweekly techboard meeting for DPDK.

Read-only agenda will be posted here:

https://annuel.framapad.org/p/r.0c3cc4d1e011214183872a98f6b5c7db

And you can join the call via Jitsi here:

https://meet.jit.si/dpdk

Hope to see you there.

Thanks,

Nathan

Nathan C. Southern, Project Coordinator

Data Plane Development Kit

The Linux Foundation

248.835.4812 (mobile)

nsouth...@linuxfoundation.org


Re: [PATCH] mempool: micro-optimize put function

2022-11-16 Thread Andrew Rybchenko

On 11/16/22 13:18, Morten Brørup wrote:

Micro-optimization:
Reduced the most likely code path in the generic put function by moving an
unlikely check out of the most likely code path and further down.

Also updated the comments in the function.

Signed-off-by: Morten Brørup 
---
  lib/mempool/rte_mempool.h | 35 ++-
  1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..aba90dbb5b 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1364,32 +1364,33 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
  {
void **cache_objs;
  
-	/* No cache provided */

+   /* No cache provided? */
if (unlikely(cache == NULL))
goto driver_enqueue;
  
-	/* increment stat now, adding in mempool always success */

+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
  
-	/* The request itself is too big for the cache */

-   if (unlikely(n > cache->flushthresh))
-   goto driver_enqueue_stats_incremented;


I've kept the check here since it protects against overflow in len plus 
n below if n is really huge.



-
-   /*
-* The cache follows the following algorithm:
-*   1. If the objects cannot be added to the cache without crossing
-*  the flush threshold, flush the cache to the backend.
-*   2. Add the objects to the cache.
-*/
-
-   if (cache->len + n <= cache->flushthresh) {
+   if (likely(cache->len + n <= cache->flushthresh)) {
+   /*
+* The objects can be added to the cache without crossing the
+* flush threshold.
+*/
cache_objs = &cache->objs[cache->len];
cache->len += n;
-   } else {
+   } else if (likely(n <= cache->flushthresh)) {
+   /*
+* The request itself fits into the cache.
+* But first, the cache must be flushed to the backend, so
+* adding the objects does not cross the flush threshold.
+*/
cache_objs = &cache->objs[0];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
cache->len = n;
+   } else {
+   /* The request itself is too big for the cache. */
+   goto driver_enqueue_stats_incremented;
}
  
  	/* Add the objects to the cache. */

@@ -1399,13 +1400,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
  
  driver_enqueue:
  
-	/* increment stat now, adding in mempool always success */

+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
  
  driver_enqueue_stats_incremented:
  
-	/* push objects to the backend */

+   /* Push the objects to the backend. */
rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
  }
  




RE: [PATCH] mempool: micro-optimize put function

2022-11-16 Thread Morten Brørup
> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
> Sent: Wednesday, 16 November 2022 12.05
> 
> On 11/16/22 13:18, Morten Brørup wrote:
> > Micro-optimization:
> > Reduced the most likely code path in the generic put function by
> moving an
> > unlikely check out of the most likely code path and further down.
> >
> > Also updated the comments in the function.
> >
> > Signed-off-by: Morten Brørup 
> > ---
> >   lib/mempool/rte_mempool.h | 35 ++-
> >   1 file changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > index 9f530db24b..aba90dbb5b 100644
> > --- a/lib/mempool/rte_mempool.h
> > +++ b/lib/mempool/rte_mempool.h
> > @@ -1364,32 +1364,33 @@ rte_mempool_do_generic_put(struct rte_mempool
> *mp, void * const *obj_table,
> >   {
> > void **cache_objs;
> >
> > -   /* No cache provided */
> > +   /* No cache provided? */
> > if (unlikely(cache == NULL))
> > goto driver_enqueue;
> >
> > -   /* increment stat now, adding in mempool always success */
> > +   /* Increment stats now, adding in mempool always succeeds. */
> > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> >
> > -   /* The request itself is too big for the cache */
> > -   if (unlikely(n > cache->flushthresh))
> > -   goto driver_enqueue_stats_incremented;
> 
> I've kept the check here since it protects against overflow in len plus
> n below if n is really huge.

We can fix that, see below.

> 
> > -
> > -   /*
> > -* The cache follows the following algorithm:
> > -*   1. If the objects cannot be added to the cache without
> crossing
> > -*  the flush threshold, flush the cache to the backend.
> > -*   2. Add the objects to the cache.
> > -*/
> > -
> > -   if (cache->len + n <= cache->flushthresh) {
> > +   if (likely(cache->len + n <= cache->flushthresh)) {

It is an invariant that cache->len <= cache->flushthresh, so the above 
comparison can be rewritten to protect against overflow:

if (likely(n <= cache->flushthresh - cache->len)) {

> > +   /*
> > +* The objects can be added to the cache without crossing
> the
> > +* flush threshold.
> > +*/
> > cache_objs = &cache->objs[cache->len];
> > cache->len += n;
> > -   } else {
> > +   } else if (likely(n <= cache->flushthresh)) {
> > +   /*
> > +* The request itself fits into the cache.
> > +* But first, the cache must be flushed to the backend, so
> > +* adding the objects does not cross the flush threshold.
> > +*/
> > cache_objs = &cache->objs[0];
> > rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> > cache->len = n;
> > +   } else {
> > +   /* The request itself is too big for the cache. */
> > +   goto driver_enqueue_stats_incremented;
> > }
> >
> > /* Add the objects to the cache. */
> > @@ -1399,13 +1400,13 @@ rte_mempool_do_generic_put(struct rte_mempool
> *mp, void * const *obj_table,
> >
> >   driver_enqueue:
> >
> > -   /* increment stat now, adding in mempool always success */
> > +   /* Increment stats now, adding in mempool always succeeds. */
> > RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> > RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> >
> >   driver_enqueue_stats_incremented:
> >
> > -   /* push objects to the backend */
> > +   /* Push the objects to the backend. */
> > rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
> >   }
> >
> 



Re: [PATCH] mempool: micro-optimize put function

2022-11-16 Thread Andrew Rybchenko

On 11/16/22 14:10, Morten Brørup wrote:

From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
Sent: Wednesday, 16 November 2022 12.05

On 11/16/22 13:18, Morten Brørup wrote:

Micro-optimization:
Reduced the most likely code path in the generic put function by

moving an

unlikely check out of the most likely code path and further down.

Also updated the comments in the function.

Signed-off-by: Morten Brørup 
---
   lib/mempool/rte_mempool.h | 35 ++-
   1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..aba90dbb5b 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1364,32 +1364,33 @@ rte_mempool_do_generic_put(struct rte_mempool

*mp, void * const *obj_table,

   {
void **cache_objs;

-   /* No cache provided */
+   /* No cache provided? */
if (unlikely(cache == NULL))
goto driver_enqueue;

-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);

-   /* The request itself is too big for the cache */
-   if (unlikely(n > cache->flushthresh))
-   goto driver_enqueue_stats_incremented;


I've kept the check here since it protects against overflow in len plus
n below if n is really huge.


We can fix that, see below.




-
-   /*
-* The cache follows the following algorithm:
-*   1. If the objects cannot be added to the cache without

crossing

-*  the flush threshold, flush the cache to the backend.
-*   2. Add the objects to the cache.
-*/
-
-   if (cache->len + n <= cache->flushthresh) {
+   if (likely(cache->len + n <= cache->flushthresh)) {


It is an invariant that cache->len <= cache->flushthresh, so the above 
comparison can be rewritten to protect against overflow:

if (likely(n <= cache->flushthresh - cache->len)) {



True, but it would be useful to highlight the usage of the
invariant here using either a comment or an assert.

IMHO it is wrong to use likely here since, as far as I know, it makes the
else branch very expensive, but crossing the flush threshold is an expected
branch and it must not be that expensive.


+   /*
+* The objects can be added to the cache without crossing

the

+* flush threshold.
+*/
cache_objs = &cache->objs[cache->len];
cache->len += n;
-   } else {
+   } else if (likely(n <= cache->flushthresh)) {
+   /*
+* The request itself fits into the cache.
+* But first, the cache must be flushed to the backend, so
+* adding the objects does not cross the flush threshold.
+*/
cache_objs = &cache->objs[0];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
cache->len = n;
+   } else {
+   /* The request itself is too big for the cache. */
+   goto driver_enqueue_stats_incremented;
}

/* Add the objects to the cache. */
@@ -1399,13 +1400,13 @@ rte_mempool_do_generic_put(struct rte_mempool

*mp, void * const *obj_table,


   driver_enqueue:

-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);

   driver_enqueue_stats_incremented:

-   /* push objects to the backend */
+   /* Push the objects to the backend. */
rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
   }









[Bug 1126] i40e: Rx interrupt behaviour is possibly wrong

2022-11-16 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1126

Bug ID: 1126
   Summary: i40e: Rx interrupt behaviour is possibly wrong
   Product: DPDK
   Version: unspecified
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: ivan.ma...@oktetlabs.ru
  Target Milestone: ---

We've been seeing odd behaviour of Rx interrupts on i40e rigs in the opensource
ethdev test suite [1]. There's a test, "usecases/rx_intr", which checks various
corner cases.

In particular, as shown in log [2], when the test enables Rx interrupts and
attempts to receive two packets in a row (though, not in a single burst), an
interrupt is triggered once, that is, for the 1st packet only. This result
suggests that the driver does not automatically re-enable ("rearm") interrupts,
which might be incorrect.

At the same time, according to the API contract, once the application has
invoked rte_eth_dev_rx_intr_enable, interrupts are not supposed to stop working
when the 1st packet arrives. Instead, it is only when the application invokes
rte_eth_dev_rx_intr_disable that interrupts shall cease to arrive.

That is also supported by a statement found in Intel(R) Ethernet Controller
X710/ XXV710/XL710 Datasheet, section 7.5.1.3, which is as follows: "At the end
of the interrupt handler the software re-enables the interrupts by setting the
INTENA".

Another corner case (see [3]) is to check whether normal poll mode Rx resumes
working when the application first enables Rx interrupts and then opts to
disable the feature. In the case of i40e, normal Rx poll mode is not restored:
the test doesn't see the packet after disabling interrupts. Might be incorrect
behaviour as well.
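
For reference, a rough hedged sketch of the sequence exercised by the two
corner cases above (not the actual test code; the port/queue IDs and the
wait_for_rx_event() helper are placeholders for the test's epoll plumbing):

/* Hedged sketch of the "two packets with interrupts enabled" scenario. */
#include <stdbool.h>
#include <rte_ethdev.h>

/* Placeholder: waits (with timeout) for the queue's Rx interrupt event. */
extern bool wait_for_rx_event(uint16_t port_id, uint16_t queue_id);

static int
check_rx_intr_rearm(uint16_t port_id, uint16_t queue_id)
{
	if (rte_eth_dev_rx_intr_enable(port_id, queue_id) != 0)
		return -1;

	/* 1st packet from the peer: an event is seen, as expected. */
	if (!wait_for_rx_event(port_id, queue_id))
		return -1;

	/*
	 * 2nd packet: rte_eth_dev_rx_intr_disable() has not been called,
	 * so per the API contract another event is expected. On i40e no
	 * event is observed here, i.e. the interrupt was not rearmed.
	 */
	if (!wait_for_rx_event(port_id, queue_id))
		return -1;

	/*
	 * Disable the feature; per the report, plain rte_eth_rx_burst()
	 * polling should then resume (second corner case above).
	 */
	return rte_eth_dev_rx_intr_disable(port_id, queue_id);
}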

[1] https://mails.dpdk.org/archives/dev/2022-October/251663.html
[2] https://ts-factory.io/bublik/v2/log/189828?focusId=190963&mode=treeAndlog
[3] https://ts-factory.io/bublik/v2/log/189828?focusId=190961&mode=treeAndlog

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev v6] doc: support IPsec Multi-buffer lib v1.3

2022-11-16 Thread Kai Ji
From: Pablo de Lara 

Updated the AESNI MB, AESNI GCM, KASUMI, ZUC, SNOW3G
and CHACHA20_POLY1305 PMD documentation guides
with information about the latest supported Intel IPsec Multi-buffer
library version.

Signed-off-by: Pablo de Lara 
Acked-by: Ciara Power 
Acked-by: Brian Dooley 
Signed-off-by: Kai Ji 
---
-v6: Release notes update reword
-v5: Release notes update
-v4: Added information on CHACHA20_POLY1305 PMD guide
-v3: Fixed library version from 1.2 to 1.3 in one line
-v2: Removed repeated word 'the'
---
 doc/guides/cryptodevs/aesni_gcm.rst |  8 +++---
 doc/guides/cryptodevs/aesni_mb.rst  | 29 -
 doc/guides/cryptodevs/chacha20_poly1305.rst | 12 ++---
 doc/guides/cryptodevs/kasumi.rst| 15 ---
 doc/guides/cryptodevs/snow3g.rst| 15 ---
 doc/guides/cryptodevs/zuc.rst   | 14 +++---
 doc/guides/rel_notes/release_22_11.rst  | 11 +++-
 7 files changed, 77 insertions(+), 27 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index 6229392f58..5192287ed8 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -40,8 +40,8 @@ Installation
 To build DPDK with the AESNI_GCM_PMD the user is required to download the 
multi-buffer
 library from `here `_
 and compile it on their user system before building DPDK.
-The latest version of the library supported by this PMD is v1.2, which
-can be downloaded in 
``_.
+The latest version of the library supported by this PMD is v1.3, which
+can be downloaded in 
``_.

 .. code-block:: console

@@ -84,8 +84,8 @@ and the external crypto libraries supported by them:
17.08 - 18.02  Multi-buffer library 0.46 - 0.48
18.05 - 19.02  Multi-buffer library 0.49 - 0.52
19.05 - 20.08  Multi-buffer library 0.52 - 0.55
-   20.11 - 21.08  Multi-buffer library 0.53 - 1.2*
-   21.11+ Multi-buffer library 1.0  - 1.2*
+   20.11 - 21.08  Multi-buffer library 0.53 - 1.3*
+   21.11+ Multi-buffer library 1.0  - 1.3*
=  

 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index 599ed5698f..b9bf03655d 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -1,7 +1,7 @@
 ..  SPDX-License-Identifier: BSD-3-Clause
 Copyright(c) 2015-2018 Intel Corporation.

-AESN-NI Multi Buffer Crypto Poll Mode Driver
+AES-NI Multi Buffer Crypto Poll Mode Driver
 


@@ -10,8 +10,6 @@ support for utilizing Intel multi buffer library, see the 
white paper
 `Fast Multi-buffer IPsec Implementations on Intel® Architecture Processors
 
`_.

-The AES-NI MB PMD has current only been tested on Fedora 21 64-bit with gcc.
-
 The AES-NI MB PMD supports synchronous mode of operation with
 ``rte_cryptodev_sym_cpu_crypto_process`` function call.

@@ -77,6 +75,23 @@ Limitations
 * RTE_CRYPTO_CIPHER_DES_DOCSISBPI is not supported for combined Crypto-CRC
   DOCSIS security protocol.

+AESNI MB PMD selection over SNOW3G/ZUC/KASUMI PMDs
+--
+
+This PMD supports wireless cipher suite (SNOW3G, ZUC and KASUMI).
+On Intel processors, it is recommended to use this PMD instead of SNOW3G, ZUC 
and KASUMI PMDs,
+as it enables algorithm mixing (e.g. cipher algorithm SNOW3G-UEA2 with
+authentication algorithm AES-CMAC-128) and performance over IMIX (packet size 
mix) traffic
+is significantly higher.
+
+AESNI MB PMD selection over CHACHA20-POLY1305 PMD
+-
+
+This PMD supports Chacha20-Poly1305 algorithm.
+On Intel processors, it is recommended to use this PMD instead of 
CHACHA20-POLY1305 PMD,
+as it delivers better performance on single segment buffers.
+For multi-segment buffers, it is still recommended to use CHACHA20-POLY1305 
PMD,
+until the new SGL API is introduced in the AESNI MB PMD.

 Installation
 
@@ -84,8 +99,8 @@ Installation
 To build DPDK with the AESNI_MB_PMD the user is required to download the 
multi-buffer
 library from `here `_
 and compile it on their user system before building DPDK.
-The latest version of the library supported by this PMD is v1.2, which
-can be downloaded from 
``_.
+The latest version of the library supported by this PMD is v1.3, which
+can be downloaded from 
``_.

 .. code-block:: console

@@ -130,8

[PATCH] net/failsafe: Fix crash due to invalid sub-device port id

2022-11-16 Thread madhuker . mythri
From: Madhuker Mythri 

A crash occurs while DPDK secondary processes try to probe the tap device,
where the tap device is a sub-device of the fail-safe device.
Sometimes we get invalid sub-devices (with invalid port IDs),
due to which the IPC communication gets no response and communication
between the primary and secondary processes fails.
So the sub-device (tap) needs to be validated during the secondary process
probe in the fail-safe PMD to avoid such issues.

Bugzilla Id: 1116

Signed-off-by: Madhuker Mythri 
---
 drivers/net/failsafe/failsafe.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 32811403b4..51d4440ac7 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -361,6 +361,9 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
if (sdev->devargs.name[0] == '\0')
continue;
 
+   if (!rte_eth_dev_is_valid_port(PORT_ID(sdev)))
+   continue;
+
/* rebuild devargs to be able to get the bus name. */
ret = rte_devargs_parse(&devargs,
sdev->devargs.name);
-- 
2.32.0.windows.1



[PATCH v2] mempool: micro-optimize put function

2022-11-16 Thread Morten Brørup
Micro-optimization:
Reduced the most likely code path in the generic put function by moving an
unlikely check out of the most likely code path and further down.

Also updated the comments in the function.

v2 (feedback from Andrew Rybchenko):
* Modified comparison to prevent overflow if n is really huge and len is
  non-zero (see the sketch after this list).
* Added assertion about the invariant preventing overflow in the
  comparison.
* Crossing the threshold is not extremely unlikely, so removed likely()
  from that comparison.
  The compiler will generate code with optimal static branch prediction
  here anyway.
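
To illustrate the overflow point in the first item, a minimal standalone
sketch (not the mempool code; the values are arbitrary) of why the
rewritten comparison is safe while the original one can wrap around:

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint32_t len = 64;           /* cache->len (<= flushthresh, invariant) */
	uint32_t thresh = 512;       /* cache->flushthresh */
	uint32_t n = UINT32_MAX - 8; /* a "really huge" request */

	/* Original form: len + n wraps around and compares as small. */
	printf("len + n <= thresh -> %d (false positive)\n", len + n <= thresh);

	/* Rewritten form: thresh - len cannot underflow, so no wrap. */
	printf("n <= thresh - len -> %d (correct)\n", n <= thresh - len);
	return 0;
}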

Signed-off-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 36 
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..dd1a3177d6 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
 {
void **cache_objs;
 
-   /* No cache provided */
+   /* No cache provided? */
if (unlikely(cache == NULL))
goto driver_enqueue;
 
-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
 
-   /* The request itself is too big for the cache */
-   if (unlikely(n > cache->flushthresh))
-   goto driver_enqueue_stats_incremented;
-
-   /*
-* The cache follows the following algorithm:
-*   1. If the objects cannot be added to the cache without crossing
-*  the flush threshold, flush the cache to the backend.
-*   2. Add the objects to the cache.
-*/
+   /* Assert the invariant preventing overflow in the comparison below. */
+   RTE_ASSERT(cache->len <= cache->flushthresh);
 
-   if (cache->len + n <= cache->flushthresh) {
+   if (n <= cache->flushthresh - cache->len) {
+   /*
+* The objects can be added to the cache without crossing the
+* flush threshold.
+*/
cache_objs = &cache->objs[cache->len];
cache->len += n;
-   } else {
+   } else if (likely(n <= cache->flushthresh)) {
+   /*
+* The request itself fits into the cache.
+* But first, the cache must be flushed to the backend, so
+* adding the objects does not cross the flush threshold.
+*/
cache_objs = &cache->objs[0];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
cache->len = n;
+   } else {
+   /* The request itself is too big for the cache. */
+   goto driver_enqueue_stats_incremented;
}
 
/* Add the objects to the cache. */
@@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void 
* const *obj_table,
 
 driver_enqueue:
 
-   /* increment stat now, adding in mempool always success */
+   /* Increment stats now, adding in mempool always succeeds. */
RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
 
 driver_enqueue_stats_incremented:
 
-   /* push objects to the backend */
+   /* Push the objects to the backend. */
rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
 }
 
-- 
2.17.1



[PATCH v2 0/6] doc: some fixes

2022-11-16 Thread Michael Baum
Some doc fixes in testpmd doc and release notes.

The first 3 were split from commit [1] after discussion.

[1]
https://patchwork.dpdk.org/project/dpdk/patch/20221019144904.2543586-3-michae...@nvidia.com/

v2:
- rebase.
- add "Reviewed-by" and "Acked-by" lables.
- add detailes to cover letter.

Michael Baum (6):
  doc: fix underlines too long in testpmd documentation
  doc: fix the colon type in listing aged flow rules
  doc: fix miss blank line in testpmd flow syntax doc
  doc: fix miss blank line in release notes
  doc: add mlx5 HWS aging support to release notes
  doc: add ethdev pre-config flags to release notes

 doc/guides/rel_notes/release_22_11.rst  |  8 
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 11 ++-
 2 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.25.1



[PATCH v2 1/6] doc: fix underlines too long in testpmd documentation

2022-11-16 Thread Michael Baum
In the testpmd documentation, there are two underlines which do not
match the length of the text above.

This patch updates them to align with the guideline [1].

[1]
https://doc.dpdk.org/guides/contributing/documentation.html#section-headers

Fixes: a69c335d56b5 ("doc: add flow dump command in testpmd guide")
Fixes: 0e459ffa0889 ("app/testpmd: support flow aging")
Cc: jack...@mellanox.com
Cc: do...@mellanox.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
Acked-by: Yuying Zhang 
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b5649d9d9a 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -4240,7 +4240,7 @@ Disabling isolated mode::
  testpmd>
 
 Dumping HW internal information
-
+~~~
 
 ``flow dump`` dumps the hardware's internal representation information of
 all flows. It is bound to ``rte_flow_dev_dump()``::
@@ -4256,7 +4256,7 @@ Otherwise, it will complain error occurred::
Caught error type [...] ([...]): [...]
 
 Listing and destroying aged flow rules
-
+~~
 
 ``flow aged`` simply lists aged flow rules be get from api 
``rte_flow_get_aged_flows``,
 and ``destroy`` parameter can be used to destroy those flow rules in PMD.
-- 
2.25.1



[PATCH v2 2/6] doc: fix the colon type in listing aged flow rules

2022-11-16 Thread Michael Baum
In the testpmd documentation, for listing aged-out flow rules there are
some boxes of examples.

In Sphinx syntax, those boxes are introduced by a preceding "::". However,
in two places ":" is used instead and the example looks like regular text.

This patch replaces ":" with "::" to get a code box.

Fixes: 0e459ffa0889 ("app/testpmd: support flow aging")
Cc: do...@mellanox.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index b5649d9d9a..b5fea1396c 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -4259,7 +4259,7 @@ Listing and destroying aged flow rules
 ~~
 
 ``flow aged`` simply lists aged flow rules be get from api 
``rte_flow_get_aged_flows``,
-and ``destroy`` parameter can be used to destroy those flow rules in PMD.
+and ``destroy`` parameter can be used to destroy those flow rules in PMD::
 
flow aged {port_id} [destroy]
 
@@ -4294,7 +4294,7 @@ will be ID 3, ID 1, ID 0::
1   0   0   i--
0   0   0   i--
 
-If attach ``destroy`` parameter, the command will destroy all the list aged 
flow rules.
+If attach ``destroy`` parameter, the command will destroy all the list aged 
flow rules::
 
testpmd> flow aged 0 destroy
Port 0 total aged flows: 4
-- 
2.25.1



[PATCH v2 3/6] doc: fix miss blank line in testpmd flow syntax doc

2022-11-16 Thread Michael Baum
In the flow syntax documentation, there is an example for creating a
pattern template.

A blank line is missing before the example, causing it to look like
regular bold text.
In addition, inside the example, a tab is used instead of spaces, which
expands the indentation of one line.

This patch adds the blank line and replaces the tab with spaces.

Fixes: 04cc665fab38 ("app/testpmd: add flow template management")
Cc: akozy...@nvidia.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
Acked-by: Yuying Zhang 
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index b5fea1396c..0037506a79 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -2894,9 +2894,10 @@ following sections.
[meters_number {number}] [flags {number}]
 
 - Create a pattern template::
+
flow pattern_template {port_id} create [pattern_template_id {id}]
[relaxed {boolean}] [ingress] [egress] [transfer]
-  template {item} [/ {item} [...]] / end
+   template {item} [/ {item} [...]] / end
 
 - Destroy a pattern template::
 
-- 
2.25.1



[PATCH v2 4/6] doc: fix miss blank line in release notes

2022-11-16 Thread Michael Baum
The NVIDIA mlx5 driver section in the 22.11 release notes lists all
features supported for queue-based async HW steering.

A blank line is missing before the list, causing it to look like a
regular text line.

This patch adds the missing blank line.

Fixes: ddb68e47331e ("net/mlx5: add extended metadata mode for HWS")
Fixes: 0f4aa72b99da ("net/mlx5: support flow modify field with HWS")
Cc: bi...@nvidia.com
Cc: suanmi...@nvidia.com

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
---
 doc/guides/rel_notes/release_22_11.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 5e091403ad..aa857e8203 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -185,6 +185,7 @@ New Features
 * **Updated NVIDIA mlx5 driver.**
 
   * Added full support for queue-based async HW steering.
+
 - Support of FDB.
 - Support of control flow and isolate mode.
 - Support of conntrack.
-- 
2.25.1



[PATCH v2 5/6] doc: add mlx5 HWS aging support to release notes

2022-11-16 Thread Michael Baum
Add to 22.11 release note the NVIDIA mlx5 HWS aging support.

Fixes: 04a4de756e14 ("net/mlx5: support flow age action with HWS")
Cc: michae...@nvidia.com

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
---
 doc/guides/rel_notes/release_22_11.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index aa857e8203..ba8b97c09a 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -190,6 +190,7 @@ New Features
 - Support of control flow and isolate mode.
 - Support of conntrack.
 - Support of counter.
+- Support of aging.
 - Support of meter.
 - Support of modify fields.
 
-- 
2.25.1



[PATCH v2 6/6] doc: add ethdev pre-config flags to release notes

2022-11-16 Thread Michael Baum
Add to the release notes the flags field in the pre-configuration
structure and the strict-queue flag.

Fixes: dcc9a80c20b8 ("ethdev: add strict queue to pre-configuration flow hints")
Cc: michae...@nvidia.com

Signed-off-by: Michael Baum 
Reviewed-by: Thomas Monjalon 
---
 doc/guides/rel_notes/release_22_11.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index ba8b97c09a..3dee012636 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -108,6 +108,12 @@ New Features
   Each flag has a corresponding capability flag
   in ``struct rte_eth_hairpin_queue_cap``.
 
+* **Added strict queue to pre-configuration flow hints.**
+
+  * Added flags option to ``rte_flow_configure`` and ``rte_flow_info_get``.
+  * Added ``RTE_FLOW_PORT_FLAG_STRICT_QUEUE`` flag to indicate all operations
+for a given flow rule will strictly happen on the same queue.
+
 * **Added configuration for asynchronous flow connection tracking.**
 
   Added connection tracking action number hint to ``rte_flow_configure``
-- 
2.25.1



[RFC PATCH v2] net/memif: change socket listener owner uid/gid

2022-11-16 Thread Junxiao Shi
This allows a DPDK application running with root privilege to create a
memif socket listener with non-root owner uid and gid, which can be
connected from client applications running without root privilege.
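
A minimal standalone sketch of the underlying mechanism (plain POSIX calls
with a hypothetical helper name, not the PMD code): the server creates and
binds the listener as root, then hands ownership of the socket file to the
unprivileged uid/gid so a non-root client can connect.

#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a UNIX listener and give the socket file to an unprivileged owner. */
static int
make_listener(const char *path, uid_t uid, gid_t gid)
{
	struct sockaddr_un un;
	int fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);

	if (fd < 0)
		return -1;
	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	strncpy(un.sun_path, path, sizeof(un.sun_path) - 1);

	if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
	    listen(fd, 1) < 0 ||
	    chown(path, uid, gid) < 0) {	/* the patch skips this when uid/gid == -1 */
		close(fd);
		return -1;
	}
	return fd;
}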

Signed-off-by: Junxiao Shi 
---
 doc/guides/nics/memif.rst |  2 ++
 drivers/net/memif/memif_socket.c  | 13 +++--
 drivers/net/memif/rte_eth_memif.c | 46 +--
 drivers/net/memif/rte_eth_memif.h |  2 ++
 4 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
index aca843640b..8a8141aa72 100644
--- a/doc/guides/nics/memif.rst
+++ b/doc/guides/nics/memif.rst
@@ -44,6 +44,8 @@ client.
"rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", 
"10", "1-14"
"socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 
108"
"socket-abstract=no", "Set usage of abstract socket address", "yes", 
"yes|no"
+   "uid=1000", "Set socket listener owner uid. Only relevant to server with 
socket-abstract=no", "unchanged", "uid_t"
+   "gid=1000", "Set socket listener owner gid. Only relevant to server with 
socket-abstract=no", "unchanged", "gid_t"
"mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
"secret=abc123", "Secret is an optional security option, which if 
specified, must be matched by peer", "", "string len 24"
"zero-copy=yes", "Enable/disable zero-copy client mode. Only relevant to 
client, requires '--single-file-segments' eal argument", "no", "yes|no"
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
index 7886644412..c2b038d01a 100644
--- a/drivers/net/memif/memif_socket.c
+++ b/drivers/net/memif/memif_socket.c
@@ -889,7 +889,7 @@ memif_listener_handler(void *arg)
 }
 
 static struct memif_socket *
-memif_socket_create(char *key, uint8_t listener, bool is_abstract)
+memif_socket_create(char *key, uint8_t listener, bool is_abstract, uid_t 
owner_uid, gid_t owner_gid)
 {
struct memif_socket *sock;
struct sockaddr_un un = { 0 };
@@ -941,6 +941,14 @@ memif_socket_create(char *key, uint8_t listener, bool 
is_abstract)
 
MIF_LOG(DEBUG, "Memif listener socket %s created.", 
sock->filename);
 
+   if (!is_abstract && (owner_uid != (uid_t)-1 || owner_gid != 
(gid_t)-1)) {
+   ret = chown(sock->filename, owner_uid, owner_gid);
+   if (ret < 0) {
+   MIF_LOG(ERR, "Failed to change listener socket 
owner %d", errno);
+   goto error;
+   }
+   }
+
/* Allocate interrupt instance */
sock->intr_handle =
rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_SHARED);
@@ -1017,7 +1025,8 @@ memif_socket_init(struct rte_eth_dev *dev, const char 
*socket_filename)
if (ret < 0) {
socket = memif_socket_create(key,
(pmd->role == MEMIF_ROLE_CLIENT) ? 0 : 1,
-   pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT);
+   pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT,
+   pmd->owner_uid, pmd->owner_gid);
if (socket == NULL)
return -1;
ret = rte_hash_add_key_data(hash, key, socket);
diff --git a/drivers/net/memif/rte_eth_memif.c 
b/drivers/net/memif/rte_eth_memif.c
index dd951b8296..d69f0e823f 100644
--- a/drivers/net/memif/rte_eth_memif.c
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -37,6 +37,8 @@
 #define ETH_MEMIF_RING_SIZE_ARG"rsize"
 #define ETH_MEMIF_SOCKET_ARG   "socket"
 #define ETH_MEMIF_SOCKET_ABSTRACT_ARG  "socket-abstract"
+#define ETH_MEMIF_OWNER_UID_ARG"owner-uid"
+#define ETH_MEMIF_OWNER_GID_ARG"owner-gid"
 #define ETH_MEMIF_MAC_ARG  "mac"
 #define ETH_MEMIF_ZC_ARG   "zero-copy"
 #define ETH_MEMIF_SECRET_ARG   "secret"
@@ -48,6 +50,8 @@ static const char * const valid_arguments[] = {
ETH_MEMIF_RING_SIZE_ARG,
ETH_MEMIF_SOCKET_ARG,
ETH_MEMIF_SOCKET_ABSTRACT_ARG,
+   ETH_MEMIF_OWNER_UID_ARG,
+   ETH_MEMIF_OWNER_GID_ARG,
ETH_MEMIF_MAC_ARG,
ETH_MEMIF_ZC_ARG,
ETH_MEMIF_SECRET_ARG,
@@ -1515,7 +1519,7 @@ static const struct eth_dev_ops ops = {
 static int
 memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
 memif_interface_id_t id, uint32_t flags,
-const char *socket_filename,
+const char *socket_filename, uid_t owner_uid, gid_t owner_gid,
 memif_log2_ring_size_t log2_ring_size,
 uint16_t pkt_buffer_size, const char *secret,
 struct rte_ether_addr *ether_addr)
@@ -1554,6 +1558,8 @@ memif_create(struct rte_vdev_device *vdev, enum 
memif_role_t role,
/* Zero-copy flag irelevant to server. */
if (pmd->role == MEMIF_ROLE_SERVER)

Re: [RFC PATCH v2 03/10] dts: add dpdk build on sut

2022-11-16 Thread Owen Hilyard
On Mon, Nov 14, 2022 at 11:54 AM Juraj Linkeš 
wrote:

> Add the ability to build DPDK and apps, using a configured target.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/framework/exception.py|  17 +++
>  dts/framework/remote_session/os/os_session.py |  90 +++-
>  .../remote_session/os/posix_session.py| 128 +
>  .../remote_session/remote_session.py  |  34 -
>  dts/framework/remote_session/ssh_session.py   |  64 -
>  dts/framework/settings.py |  40 +-
>  dts/framework/testbed_model/node/sut_node.py  | 131 ++
>  dts/framework/utils.py|  15 ++
>  8 files changed, 505 insertions(+), 14 deletions(-)
>
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> index b282e48198..93d99432ae 100644
> --- a/dts/framework/exception.py
> +++ b/dts/framework/exception.py
> @@ -26,6 +26,7 @@ class ReturnCode(IntEnum):
>  GENERIC_ERR = 1
>  SSH_ERR = 2
>  REMOTE_CMD_EXEC_ERR = 3
> +DPDK_BUILD_ERR = 10
>  NODE_SETUP_ERR = 20
>  NODE_CLEANUP_ERR = 21
>
> @@ -110,6 +111,22 @@ def __str__(self) -> str:
>  )
>
>
> +class RemoteDirectoryExistsError(DTSError):
> +"""
> +Raised when a remote directory to be created already exists.
> +"""
> +
> +return_code: ClassVar[ReturnCode] = ReturnCode.REMOTE_CMD_EXEC_ERR
> +
> +
> +class DPDKBuildError(DTSError):
> +"""
> +Raised when DPDK build fails for any reason.
> +"""
> +
> +return_code: ClassVar[ReturnCode] = ReturnCode.DPDK_BUILD_ERR
> +
> +
>  class NodeSetupError(DTSError):
>  """
>  Raised when setting up a node.
> diff --git a/dts/framework/remote_session/os/os_session.py
> b/dts/framework/remote_session/os/os_session.py
> index 2a72082628..57e2865282 100644
> --- a/dts/framework/remote_session/os/os_session.py
> +++ b/dts/framework/remote_session/os/os_session.py
> @@ -2,12 +2,15 @@
>  # Copyright(c) 2022 PANTHEON.tech s.r.o.
>  # Copyright(c) 2022 University of New Hampshire
>
> -from abc import ABC
> +from abc import ABC, abstractmethod
> +from pathlib import PurePath
>
> -from framework.config import NodeConfiguration
> +from framework.config import Architecture, NodeConfiguration
>  from framework.logger import DTSLOG
>  from framework.remote_session.factory import create_remote_session
>  from framework.remote_session.remote_session import RemoteSession
> +from framework.settings import SETTINGS
> +from framework.utils import EnvVarsDict
>
>
>  class OSSession(ABC):
> @@ -44,3 +47,86 @@ def is_alive(self) -> bool:
>  Check whether the remote session is still responding.
>  """
>  return self.remote_session.is_alive()
> +
> +@abstractmethod
> +def guess_dpdk_remote_dir(self, remote_dir) -> PurePath:
> +"""
> +Try to find DPDK remote dir in remote_dir.
> +"""
> +
> +@abstractmethod
> +def get_remote_tmp_dir(self) -> PurePath:
> +"""
> +Get the path of the temporary directory of the remote OS.
> +"""
> +
> +@abstractmethod
> +def get_dpdk_build_env_vars(self, arch: Architecture) -> dict:
> +"""
> +Create extra environment variables needed for the target
> architecture. Get
> +information from the node if needed.
> +"""
> +
> +@abstractmethod
> +def join_remote_path(self, *args: str | PurePath) -> PurePath:
> +"""
> +Join path parts using the path separator that fits the remote OS.
> +"""
> +
> +@abstractmethod
> +def copy_file(
> +self,
> +source_file: str | PurePath,
> +destination_file: str | PurePath,
> +source_remote: bool = False,
> +) -> None:
> +"""
> +Copy source_file from local storage to destination_file on the
> remote Node
> +associated with the remote session.
> +If source_remote is True, reverse the direction - copy
> source_file from the
> +associated remote Node to destination_file on local storage.
> +"""
> +
> +@abstractmethod
> +def remove_remote_dir(
> +self,
> +remote_dir_path: str | PurePath,
> +recursive: bool = True,
> +force: bool = True,
> +) -> None:
> +"""
> +Remove remote directory, by default remove recursively and
> forcefully.
> +"""
> +
> +@abstractmethod
> +def extract_remote_tarball(
> +self,
> +remote_tarball_path: str | PurePath,
> +expected_dir: str | PurePath | None = None,
> +) -> None:
> +"""
> +Extract remote tarball in place. If expected_dir is a non-empty
> string, check
> +whether the dir exists after extracting the archive.
> +"""
> +
> +@abstractmethod
> +def build_dpdk(
> +self,
> +env_vars: EnvVarsDict,
> +meson_args: str,
> +remote_dpdk_dir: str | PurePath,
> +target_name: str,
> 

RE: [PATCH] net/mlx5: fix wrong error log in async flow destruction

2022-11-16 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: Michael Baum 
> Sent: Sunday, November 13, 2022 1:07 PM
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Slava Ovsiienko ;
> Suanming Mou ; sta...@dpdk.org
> Subject: [PATCH] net/mlx5: fix wrong error log in async flow destruction
> 
> The flow_hw_async_flow_destroy() function fills the error structure in
> case of failure.
> 
> The error log reported by function is "fail to create rte flow" while
> the correct failure is in destruction.
> 
> This patch changes the error log to report "fail to destroy rte flow".
> 
> Fixes: c40c061a022e ("net/mlx5: add basic flow queue operation")
> Cc: suanmi...@nvidia.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Michael Baum 
> Acked-by: Matan Azrad 

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh


Re: [RFC PATCH v2 04/10] dts: add dpdk execution handling

2022-11-16 Thread Owen Hilyard
On Mon, Nov 14, 2022 at 11:54 AM Juraj Linkeš 
wrote:

> Add methods for setting up and shutting down DPDK apps and for
> constructing EAL parameters.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/conf.yaml |   4 +
>  dts/framework/config/__init__.py  |  85 -
>  dts/framework/config/conf_yaml_schema.json|  22 +++
>  .../remote_session/os/linux_session.py|  15 ++
>  dts/framework/remote_session/os/os_session.py |  16 +-
>  .../remote_session/os/posix_session.py|  80 
>  dts/framework/testbed_model/hw/__init__.py|  17 ++
>  dts/framework/testbed_model/hw/cpu.py | 164 
>  dts/framework/testbed_model/node/node.py  |  36 
>  dts/framework/testbed_model/node/sut_node.py  | 178 +-
>  dts/framework/utils.py|  20 ++
>  11 files changed, 634 insertions(+), 3 deletions(-)
>  create mode 100644 dts/framework/testbed_model/hw/__init__.py
>  create mode 100644 dts/framework/testbed_model/hw/cpu.py
>
> diff --git a/dts/conf.yaml b/dts/conf.yaml
> index 6b0bc5c2bf..976888a88e 100644
> --- a/dts/conf.yaml
> +++ b/dts/conf.yaml
> @@ -12,4 +12,8 @@ nodes:
>- name: "SUT 1"
>  hostname: sut1.change.me.localhost
>  user: root
> +arch: x86_64
>  os: linux
> +bypass_core0: true
> +cpus: ""
> +memory_channels: 4
> diff --git a/dts/framework/config/__init__.py
> b/dts/framework/config/__init__.py
> index 1b97dc3ab9..344d697a69 100644
> --- a/dts/framework/config/__init__.py
> +++ b/dts/framework/config/__init__.py
> @@ -11,12 +11,13 @@
>  import pathlib
>  from dataclasses import dataclass
>  from enum import Enum, auto, unique
> -from typing import Any
> +from typing import Any, Iterable
>
>  import warlock  # type: ignore
>  import yaml
>
>  from framework.settings import SETTINGS
> +from framework.utils import expand_range
>
>
>  class StrEnum(Enum):
> @@ -60,6 +61,80 @@ class Compiler(StrEnum):
>  msvc = auto()
>
>
> +@dataclass(slots=True, frozen=True)
> +class CPU:
> +cpu: int
> +core: int
> +socket: int
> +node: int
> +
> +def __str__(self) -> str:
> +return str(self.cpu)
> +
> +
> +class CPUList(object):
> +"""
> +Convert these options into a list of int cpus
> +cpu_list=[CPU1, CPU2] - a list of CPUs
> +cpu_list=[0,1,2,3] - a list of int indices
> +cpu_list=['0','1','2-3'] - a list of str indices; ranges are supported
> +cpu_list='0,1,2-3' - a comma delimited str of indices; ranges are
> supported
> +
> +The class creates a unified format used across the framework and
> allows
> +the user to use either a str representation (using str(instance) or
> directly
> +in f-strings) or a list representation (by accessing
> instance.cpu_list).
> +Empty cpu_list is allowed.
> +"""
> +
> +_cpu_list: list[int]
> +
> +def __init__(self, cpu_list: list[int | str | CPU] | str):
> +self._cpu_list = []
> +if isinstance(cpu_list, str):
> +self._from_str(cpu_list.split(","))
> +else:
> +self._from_str((str(cpu) for cpu in cpu_list))
> +
> +# the input cpus may not be sorted
> +self._cpu_list.sort()
> +
> +@property
> +def cpu_list(self) -> list[int]:
> +return self._cpu_list
> +
> +def _from_str(self, cpu_list: Iterable[str]) -> None:
> +for cpu in cpu_list:
> +self._cpu_list.extend(expand_range(cpu))
> +
> +def _get_consecutive_cpus_range(self, cpu_list: list[int]) ->
> list[str]:
> +formatted_core_list = []
> +tmp_cpus_list = list(sorted(cpu_list))
> +segment = tmp_cpus_list[:1]
> +for core_id in tmp_cpus_list[1:]:
> +if core_id - segment[-1] == 1:
> +segment.append(core_id)
> +else:
> +formatted_core_list.append(
> +f"{segment[0]}-{segment[-1]}"
> +if len(segment) > 1
> +else f"{segment[0]}"
> +)
> +current_core_index = tmp_cpus_list.index(core_id)
> +formatted_core_list.extend(
> +
> self._get_consecutive_cpus_range(tmp_cpus_list[current_core_index:])
> +)
> +segment.clear()
> +break
> +if len(segment) > 0:
> +formatted_core_list.append(
> +f"{segment[0]}-{segment[-1]}" if len(segment) > 1 else
> f"{segment[0]}"
> +)
> +return formatted_core_list
> +
> +def __str__(self) -> str:
> +return
> f'{",".join(self._get_consecutive_cpus_range(self._cpu_list))}'
> +
> +
>  # Slots enables some optimizations, by pre-allocating space for the
> defined
>  # attributes in the underlying data structure.
>  #
> @@ -71,7 +146,11 @@ class NodeConfiguration:
>  hostname: str
>  user: str
>  password: str | None
> +arch: Architecture
>  os: OS
> +bypass_co

Re: [RFC PATCH v2 05/10] dts: add node memory setup

2022-11-16 Thread Owen Hilyard
On Mon, Nov 14, 2022 at 11:54 AM Juraj Linkeš 
wrote:

> Setup hugepages on nodes. This is useful not only on SUT nodes, but
> also on TG nodes which use TGs that utilize hugepages.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/framework/remote_session/__init__.py  |  1 +
>  dts/framework/remote_session/arch/__init__.py | 20 +
>  dts/framework/remote_session/arch/arch.py | 57 +
>  .../remote_session/os/linux_session.py| 85 +++
>  dts/framework/remote_session/os/os_session.py | 10 +++
>  dts/framework/testbed_model/node/node.py  | 15 +++-
>  6 files changed, 187 insertions(+), 1 deletion(-)
>  create mode 100644 dts/framework/remote_session/arch/__init__.py
>  create mode 100644 dts/framework/remote_session/arch/arch.py
>
> diff --git a/dts/framework/remote_session/__init__.py
> b/dts/framework/remote_session/__init__.py
> index f2339b20bd..f0deeadac6 100644
> --- a/dts/framework/remote_session/__init__.py
> +++ b/dts/framework/remote_session/__init__.py
> @@ -11,4 +11,5 @@
>
>  # pylama:ignore=W0611
>
> +from .arch import Arch, create_arch
>  from .os import OSSession, create_session
> diff --git a/dts/framework/remote_session/arch/__init__.py
> b/dts/framework/remote_session/arch/__init__.py
> new file mode 100644
> index 00..d78ad42ac5
> --- /dev/null
> +++ b/dts/framework/remote_session/arch/__init__.py
> @@ -0,0 +1,20 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +
> +from framework.config import Architecture, NodeConfiguration
> +
> +from .arch import PPC64, Arch, Arm64, i686, x86_32, x86_64
> +
> +
> +def create_arch(node_config: NodeConfiguration) -> Arch:
> +match node_config.arch:
> +case Architecture.x86_64:
> +return x86_64()
> +case Architecture.x86_32:
> +return x86_32()
> +case Architecture.i686:
> +return i686()
> +case Architecture.ppc64le:
> +return PPC64()
> +case Architecture.arm64:
> +return Arm64()
> diff --git a/dts/framework/remote_session/arch/arch.py
> b/dts/framework/remote_session/arch/arch.py
> new file mode 100644
> index 00..05c7602def
> --- /dev/null
> +++ b/dts/framework/remote_session/arch/arch.py
> @@ -0,0 +1,57 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +
> +
> +class Arch(object):
> +"""
> +Stores architecture-specific information.
> +"""
> +
> +@property
> +def default_hugepage_memory(self) -> int:
> +"""
> +Return the default amount of memory allocated for hugepages DPDK
> will use.
> +The default is an amount equal to 256 2MB hugepages (512MB
> memory).
> +"""
> +return 256 * 2048
> +
> +@property
> +def hugepage_force_first_numa(self) -> bool:
> +"""
> +An architecture may need to force configuration of hugepages to
> first socket.
> +"""
> +return False
> +
> +
> +class x86_64(Arch):
> +@property
> +def default_hugepage_memory(self) -> int:
> +return 4096 * 2048
> +
> +
> +class x86_32(Arch):
> +@property
> +def hugepage_force_first_numa(self) -> bool:
> +return True
> +
> +
> +class i686(Arch):
> +@property
> +def default_hugepage_memory(self) -> int:
> +return 512 * 2048
> +
> +@property
> +def hugepage_force_first_numa(self) -> bool:
> +return True
> +
> +
> +class PPC64(Arch):
> +@property
> +def default_hugepage_memory(self) -> int:
> +return 512 * 2048
> +
> +
> +class Arm64(Arch):
> +@property
> +def default_hugepage_memory(self) -> int:
> +return 2048 * 2048
> diff --git a/dts/framework/remote_session/os/linux_session.py
> b/dts/framework/remote_session/os/linux_session.py
> index 21f117b714..fad33d7613 100644
> --- a/dts/framework/remote_session/os/linux_session.py
> +++ b/dts/framework/remote_session/os/linux_session.py
> @@ -3,6 +3,8 @@
>  # Copyright(c) 2022 University of New Hampshire
>
>  from framework.config import CPU
> +from framework.exception import RemoteCommandExecutionError
> +from framework.utils import expand_range
>
>  from .posix_session import PosixSession
>
> @@ -24,3 +26,86 @@ def get_remote_cpus(self, bypass_core0: bool) ->
> list[CPU]:
>  continue
>  cpus.append(CPU(int(cpu), int(core), int(socket), int(node)))
>  return cpus
> +
> +def setup_hugepages(
> +self, hugepage_amount: int = -1, force_first_numa: bool = False
>

I think that hugepage_amount: int | None = None is better, since it
expresses that it is an optional argument, and the type checker will force anyone
using the value to check whether it is None, whereas that will not happen with
-1.


> +) -> None:
> +self.logger.info("Getting Hugepage information.")
> +hugepage_size = self._get_hugepage_size()
> +hugepages_total = self._get_hugepages_total()
> +   

Re: [RFC PATCH v2 07/10] dts: add simple stats report

2022-11-16 Thread Owen Hilyard
You are missing type annotations throughout this.

On Mon, Nov 14, 2022 at 11:54 AM Juraj Linkeš 
wrote:

> Provide a summary of testcase passed/failed/blocked counts.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/framework/dts.py|  3 ++
>  dts/framework/stats_reporter.py | 65 +
>  2 files changed, 68 insertions(+)
>  create mode 100644 dts/framework/stats_reporter.py
>
> diff --git a/dts/framework/dts.py b/dts/framework/dts.py
> index d606f8de2e..a7c243a5c3 100644
> --- a/dts/framework/dts.py
> +++ b/dts/framework/dts.py
> @@ -14,11 +14,13 @@
>  from .exception import DTSError, ReturnCode
>  from .logger import DTSLOG, getLogger
>  from .settings import SETTINGS
> +from .stats_reporter import TestStats
>  from .test_result import Result
>  from .utils import check_dts_python_version
>
>  dts_logger: DTSLOG = getLogger("dts")
>  result: Result = Result()
> +test_stats: TestStats = TestStats(SETTINGS.output_dir + "/statistics.txt")
>
>
>  def run_all() -> None:
> @@ -29,6 +31,7 @@ def run_all() -> None:
>  return_code = ReturnCode.NO_ERR
>  global dts_logger
>  global result
> +global test_stats
>
>  # check the python version of the server that run dts
>  check_dts_python_version()
> diff --git a/dts/framework/stats_reporter.py
> b/dts/framework/stats_reporter.py
> new file mode 100644
> index 00..a2735d0a1d
> --- /dev/null
> +++ b/dts/framework/stats_reporter.py
> @@ -0,0 +1,65 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2010-2014 Intel Corporation
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +
> +"""
> +Simple text file statistics generator
> +"""
> +
> +
> +class TestStats(object):
> +"""
> +Generates a small statistics file containing the number of passing,
> +failing and blocked tests. It makes use of a Result instance as input.
> +"""
> +
> +def __init__(self, filename):
> +self.filename = filename
> +
> +def __add_stat(self, test_result):


I think that this should probably be an option of an enum that gets matched
over. ex:

match test_result:
    case None:
        pass
    case TestResult.PASSED:
        self.passed += 1
    case TestResult.FAILED:
        self.failed += 1
    case TestResult.BLOCKED:
        self.blocked += 1
    case unknown:
        # log the unknown value and raise an error


> +if test_result is not None:
> +if test_result[0] == "PASSED":
> +self.passed += 1
> +if test_result[0] == "FAILED":
> +self.failed += 1
> +if test_result[0] == "BLOCKED":
> +self.blocked += 1
> +self.total += 1
> +
> +def __count_stats(self):
> +for sut in self.result.all_suts():
> +for target in self.result.all_targets(sut):
> +for suite in self.result.all_test_suites(sut, target):
> +for case in self.result.all_test_cases(sut, target,
> suite):
> +test_result = self.result.result_for(sut, target,
> suite, case)
> +if len(test_result):
> +self.__add_stat(test_result)
> +
> +def __write_stats(self):
> +sut_nodes = self.result.all_suts()
> +if len(sut_nodes) == 1:
> +self.stats_file.write(
> +f"dpdk_version =
> {self.result.current_dpdk_version(sut_nodes[0])}\n"
> +)
> +else:
> +for sut in sut_nodes:
> +dpdk_version = self.result.current_dpdk_version(sut)
> +self.stats_file.write(f"{sut}.dpdk_version =
> {dpdk_version}\n")
> +self.__count_stats()
> +self.stats_file.write(f"Passed = {self.passed}\n")
> +self.stats_file.write(f"Failed = {self.failed}\n")
> +self.stats_file.write(f"Blocked= {self.blocked}\n")
> +rate = 0
> +if self.total > 0:
> +rate = self.passed * 100.0 / self.total
> +self.stats_file.write(f"Pass rate  = {rate:.1f}\n")
> +
> +def save(self, result):
> +self.passed = 0
> +self.failed = 0
> +self.blocked = 0
> +self.total = 0
> +self.stats_file = open(self.filename, "w+")
> +self.result = result
> +self.__write_stats()
> +self.stats_file.close()
> --
> 2.30.2
>
>


Re: [RFC PATCH v2 08/10] dts: add testsuite class

2022-11-16 Thread Owen Hilyard
On Mon, Nov 14, 2022 at 11:54 AM Juraj Linkeš 
wrote:

> This is the base class that all test suites inherit from. The base class
> implements methods common to all test suites. The derived test suites
> implement tests and any particular setup needed for the suite or tests.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/conf.yaml  |   4 +
>  dts/framework/config/__init__.py   |  33 ++-
>  dts/framework/config/conf_yaml_schema.json |  49 
>  dts/framework/dts.py   |  29 +++
>  dts/framework/exception.py |  65 ++
>  dts/framework/settings.py  |  25 +++
>  dts/framework/test_case.py | 246 +
>  7 files changed, 450 insertions(+), 1 deletion(-)
>  create mode 100644 dts/framework/test_case.py
>
> diff --git a/dts/conf.yaml b/dts/conf.yaml
> index 976888a88e..0b0f2c59b0 100644
> --- a/dts/conf.yaml
> +++ b/dts/conf.yaml
> @@ -7,6 +7,10 @@ executions:
>  os: linux
>  cpu: native
>  compiler: gcc
> +perf: false
> +func: true
> +test_suites:
> +  - hello_world
>  system_under_test: "SUT 1"
>  nodes:
>- name: "SUT 1"
> diff --git a/dts/framework/config/__init__.py
> b/dts/framework/config/__init__.py
> index 344d697a69..8874b10030 100644
> --- a/dts/framework/config/__init__.py
> +++ b/dts/framework/config/__init__.py
> @@ -11,7 +11,7 @@
>  import pathlib
>  from dataclasses import dataclass
>  from enum import Enum, auto, unique
> -from typing import Any, Iterable
> +from typing import Any, Iterable, TypedDict
>
>  import warlock  # type: ignore
>  import yaml
> @@ -186,9 +186,34 @@ def from_dict(d: dict) -> "BuildTargetConfiguration":
>  )
>
>
> +class TestSuiteConfigDict(TypedDict):
> +suite: str
> +cases: list[str]
> +
> +
> +@dataclass(slots=True, frozen=True)
> +class TestSuiteConfig:
> +test_suite: str
> +test_cases: list[str]
> +
> +@staticmethod
> +def from_dict(
> +entry: str | TestSuiteConfigDict,
> +) -> "TestSuiteConfig":
> +if isinstance(entry, str):
> +return TestSuiteConfig(test_suite=entry, test_cases=[])
> +elif isinstance(entry, dict):
> +return TestSuiteConfig(test_suite=entry["suite"],
> test_cases=entry["cases"])
> +else:
> +raise TypeError(f"{type(entry)} is not valid for a test suite
> config.")
> +
> +
>  @dataclass(slots=True, frozen=True)
>  class ExecutionConfiguration:
>  build_targets: list[BuildTargetConfiguration]
> +perf: bool
> +func: bool
> +test_suites: list[TestSuiteConfig]
>  system_under_test: NodeConfiguration
>
>  @staticmethod
> @@ -196,11 +221,17 @@ def from_dict(d: dict, node_map: dict) ->
> "ExecutionConfiguration":
>  build_targets: list[BuildTargetConfiguration] = list(
>  map(BuildTargetConfiguration.from_dict, d["build_targets"])
>  )
> +test_suites: list[TestSuiteConfig] = list(
> +map(TestSuiteConfig.from_dict, d["test_suites"])
> +)
>  sut_name = d["system_under_test"]
>  assert sut_name in node_map, f"Unknown SUT {sut_name} in
> execution {d}"
>
>  return ExecutionConfiguration(
>  build_targets=build_targets,
> +perf=d["perf"],
> +func=d["func"],
> +test_suites=test_suites,
>  system_under_test=node_map[sut_name],
>  )
>
> diff --git a/dts/framework/config/conf_yaml_schema.json
> b/dts/framework/config/conf_yaml_schema.json
> index c59d3e30e6..e37ced65fe 100644
> --- a/dts/framework/config/conf_yaml_schema.json
> +++ b/dts/framework/config/conf_yaml_schema.json
> @@ -63,6 +63,31 @@
>  }
>},
>"additionalProperties": false
> +},
> +"test_suite": {
> +  "type": "string",
> +  "enum": [
> +"hello_world"
> +  ]
> +},
> +"test_target": {
> +  "type": "object",
> +  "properties": {
> +"suite": {
> +  "$ref": "#/definitions/test_suite"
> +},
> +"cases": {
> +  "type": "array",
> +  "items": {
> +"type": "string"
> +  },
> +  "minimum": 1
> +}
> +  },
> +  "required": [
> +"suite"
> +  ],
> +  "additionalProperties": false
>  }
>},
>"type": "object",
> @@ -130,6 +155,27 @@
>  },
>  "minimum": 1
>},
> +  "perf": {
> +"type": "boolean",
> +"description": "Enable performance testing"
> +  },
> +  "func": {
> +"type": "boolean",
> +"description": "Enable functional testing"
> +  },
> +  "test_suites": {
> +"type": "array",
> +"items": {
> +  "oneOf": [
> +{
> +  "$ref": "#/definitions/test_suite"
> +},
> +{
> +  

RE: [PATCH v2] mempool: micro-optimize put function

2022-11-16 Thread Honnappa Nagarahalli

> 
> Micro-optimization:
> Reduced the most likely code path in the generic put function by moving an
> unlikely check out of the most likely code path and further down.
> 
> Also updated the comments in the function.
> 
> v2 (feedback from Andrew Rybchenko):
> * Modified comparison to prevent overflow if n is really huge and len is
>   non-zero.
> * Added assertion about the invariant preventing overflow in the
>   comparison.
> * Crossing the threshold is not extremely unlikely, so removed likely()
>   from that comparison.
>   The compiler will generate code with optimal static branch prediction
>   here anyway.
> 
> Signed-off-by: Morten Brørup 
> ---
>  lib/mempool/rte_mempool.h | 36 
>  1 file changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 9f530db24b..dd1a3177d6 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct
> rte_mempool *mp, void * const *obj_table,  {
>   void **cache_objs;
> 
> - /* No cache provided */
> + /* No cache provided? */
>   if (unlikely(cache == NULL))
>   goto driver_enqueue;
> 
> - /* increment stat now, adding in mempool always success */
> + /* Increment stats now, adding in mempool always succeeds. */
>   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
>   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> 
> - /* The request itself is too big for the cache */
> - if (unlikely(n > cache->flushthresh))
> - goto driver_enqueue_stats_incremented;
> -
> - /*
> -  * The cache follows the following algorithm:
> -  *   1. If the objects cannot be added to the cache without crossing
> -  *  the flush threshold, flush the cache to the backend.
> -  *   2. Add the objects to the cache.
> -  */
> + /* Assert the invariant preventing overflow in the comparison below.
> */
> + RTE_ASSERT(cache->len <= cache->flushthresh);
> 
> - if (cache->len + n <= cache->flushthresh) {
> + if (n <= cache->flushthresh - cache->len) {
> + /*
> +  * The objects can be added to the cache without crossing the
> +  * flush threshold.
> +  */
>   cache_objs = &cache->objs[cache->len];
>   cache->len += n;
> - } else {
> + } else if (likely(n <= cache->flushthresh)) {
IMO, this is a misconfiguration on the application part. In the PMDs I have 
looked at, max value of 'n' is controlled by compile time constants. 
Application could do a compile time check on the cache threshold or we could 
have another RTE_ASSERT on this.
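
(Editorial illustration of that suggestion, not from the thread: a build-time
guard the application could add, with hypothetical constants APP_CACHE_SIZE and
APP_PUT_BURST_MAX. The real flush threshold is derived from the cache size
inside the mempool library, so the cache size is used here as a conservative
bound.)

    #include <rte_common.h>

    #define APP_CACHE_SIZE    256 /* cache_size passed at mempool creation */
    #define APP_PUT_BURST_MAX  64 /* largest bulk the application ever puts */

    static inline void
    app_check_mempool_config(void)
    {
            /* fail the build if a put burst can never fit in the cache */
            RTE_BUILD_BUG_ON(APP_PUT_BURST_MAX > APP_CACHE_SIZE);
    }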

> + /*
> +  * The request itself fits into the cache.
> +  * But first, the cache must be flushed to the backend, so
> +  * adding the objects does not cross the flush threshold.
> +  */
>   cache_objs = &cache->objs[0];
>   rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache-
> >len);
>   cache->len = n;
> + } else {
> + /* The request itself is too big for the cache. */
> + goto driver_enqueue_stats_incremented;
>   }
> 
>   /* Add the objects to the cache. */
> @@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct
> rte_mempool *mp, void * const *obj_table,
> 
>  driver_enqueue:
> 
> - /* increment stat now, adding in mempool always success */
> + /* Increment stats now, adding in mempool always succeeds. */
>   RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
>   RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> 
>  driver_enqueue_stats_incremented:
> 
> - /* push objects to the backend */
> + /* Push the objects to the backend. */
>   rte_mempool_ops_enqueue_bulk(mp, obj_table, n);  }
> 
> --
> 2.17.1



RE: [PATCH v2] mempool: micro-optimize put function

2022-11-16 Thread Morten Brørup
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Wednesday, 16 November 2022 16.51
> 
> 
> >
> > Micro-optimization:
> > Reduced the most likely code path in the generic put function by
> moving an
> > unlikely check out of the most likely code path and further down.
> >
> > Also updated the comments in the function.
> >
> > v2 (feedback from Andrew Rybchenko):
> > * Modified comparison to prevent overflow if n is really huge and len
> is
> >   non-zero.
> > * Added assertion about the invariant preventing overflow in the
> >   comparison.
> > * Crossing the threshold is not extremely unlikely, so removed
> likely()
> >   from that comparison.
> >   The compiler will generate code with optimal static branch
> prediction
> >   here anyway.
> >
> > Signed-off-by: Morten Brørup 
> > ---
> >  lib/mempool/rte_mempool.h | 36 
> >  1 file changed, 20 insertions(+), 16 deletions(-)
> >
> > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > index 9f530db24b..dd1a3177d6 100644
> > --- a/lib/mempool/rte_mempool.h
> > +++ b/lib/mempool/rte_mempool.h
> > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct
> > rte_mempool *mp, void * const *obj_table,  {
> > void **cache_objs;
> >
> > -   /* No cache provided */
> > +   /* No cache provided? */
> > if (unlikely(cache == NULL))
> > goto driver_enqueue;
> >
> > -   /* increment stat now, adding in mempool always success */
> > +   /* Increment stats now, adding in mempool always succeeds. */
> > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> >
> > -   /* The request itself is too big for the cache */
> > -   if (unlikely(n > cache->flushthresh))
> > -   goto driver_enqueue_stats_incremented;
> > -
> > -   /*
> > -* The cache follows the following algorithm:
> > -*   1. If the objects cannot be added to the cache without
> crossing
> > -*  the flush threshold, flush the cache to the backend.
> > -*   2. Add the objects to the cache.
> > -*/
> > +   /* Assert the invariant preventing overflow in the comparison
> below.
> > */
> > +   RTE_ASSERT(cache->len <= cache->flushthresh);
> >
> > -   if (cache->len + n <= cache->flushthresh) {
> > +   if (n <= cache->flushthresh - cache->len) {
> > +   /*
> > +* The objects can be added to the cache without crossing
> the
> > +* flush threshold.
> > +*/
> > cache_objs = &cache->objs[cache->len];
> > cache->len += n;
> > -   } else {
> > +   } else if (likely(n <= cache->flushthresh)) {
> IMO, this is a misconfiguration on the application part. In the PMDs I
> have looked at, max value of 'n' is controlled by compile time
> constants. Application could do a compile time check on the cache
> threshold or we could have another RTE_ASSERT on this.

There could be applications using a mempool for something else than mbufs.

In that case, the application should be allowed to get/put many objects in one 
transaction.
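
(Editorial sketch of such a use case, with illustrative names: a mempool of
plain fixed-size buffers rather than mbufs, where a single transaction is
larger than the per-lcore cache flush threshold.)

    #include <rte_common.h>
    #include <rte_lcore.h>
    #include <rte_mempool.h>

    static void
    app_large_bulk_example(void)
    {
            struct rte_mempool *pool;
            void *bufs[512];

            /* pool of 8 KiB scratch buffers, not mbufs */
            pool = rte_mempool_create("scratch_pool", 8192, 8192,
                            256 /* cache_size */, 0, NULL, NULL, NULL, NULL,
                            rte_socket_id(), 0);
            if (pool == NULL)
                    return;

            /* one get/put transaction larger than the flush threshold */
            if (rte_mempool_get_bulk(pool, bufs, RTE_DIM(bufs)) == 0) {
                    /* ... use the buffers ... */
                    rte_mempool_put_bulk(pool, bufs, RTE_DIM(bufs));
            }
    }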

> 
> > +   /*
> > +* The request itself fits into the cache.
> > +* But first, the cache must be flushed to the backend, so
> > +* adding the objects does not cross the flush threshold.
> > +*/
> > cache_objs = &cache->objs[0];
> > rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache-
> > >len);
> > cache->len = n;
> > +   } else {
> > +   /* The request itself is too big for the cache. */
> > +   goto driver_enqueue_stats_incremented;
> > }
> >
> > /* Add the objects to the cache. */
> > @@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct
> > rte_mempool *mp, void * const *obj_table,
> >
> >  driver_enqueue:
> >
> > -   /* increment stat now, adding in mempool always success */
> > +   /* Increment stats now, adding in mempool always succeeds. */
> > RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> > RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> >
> >  driver_enqueue_stats_incremented:
> >
> > -   /* push objects to the backend */
> > +   /* Push the objects to the backend. */
> > rte_mempool_ops_enqueue_bulk(mp, obj_table, n);  }
> >
> > --
> > 2.17.1



RE: [PATCH v2] mempool: micro-optimize put function

2022-11-16 Thread Honnappa Nagarahalli


> > >
> > > Micro-optimization:
> > > Reduced the most likely code path in the generic put function by
> > moving an
> > > unlikely check out of the most likely code path and further down.
> > >
> > > Also updated the comments in the function.
> > >
> > > v2 (feedback from Andrew Rybchenko):
> > > * Modified comparison to prevent overflow if n is really huge and
> > > len
> > is
> > >   non-zero.
> > > * Added assertion about the invariant preventing overflow in the
> > >   comparison.
> > > * Crossing the threshold is not extremely unlikely, so removed
> > likely()
> > >   from that comparison.
> > >   The compiler will generate code with optimal static branch
> > prediction
> > >   here anyway.
> > >
> > > Signed-off-by: Morten Brørup 
> > > ---
> > >  lib/mempool/rte_mempool.h | 36 
> > >  1 file changed, 20 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > > index 9f530db24b..dd1a3177d6 100644
> > > --- a/lib/mempool/rte_mempool.h
> > > +++ b/lib/mempool/rte_mempool.h
> > > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct
> > > rte_mempool *mp, void * const *obj_table,  {
> > >   void **cache_objs;
> > >
> > > - /* No cache provided */
> > > + /* No cache provided? */
> > >   if (unlikely(cache == NULL))
> > >   goto driver_enqueue;
> > >
> > > - /* increment stat now, adding in mempool always success */
> > > + /* Increment stats now, adding in mempool always succeeds. */
> > >   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > >   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > >
> > > - /* The request itself is too big for the cache */
> > > - if (unlikely(n > cache->flushthresh))
> > > - goto driver_enqueue_stats_incremented;
> > > -
> > > - /*
> > > -  * The cache follows the following algorithm:
> > > -  *   1. If the objects cannot be added to the cache without
> > crossing
> > > -  *  the flush threshold, flush the cache to the backend.
> > > -  *   2. Add the objects to the cache.
> > > -  */
> > > + /* Assert the invariant preventing overflow in the comparison
> > below.
> > > */
> > > + RTE_ASSERT(cache->len <= cache->flushthresh);
> > >
> > > - if (cache->len + n <= cache->flushthresh) {
> > > + if (n <= cache->flushthresh - cache->len) {
> > > + /*
> > > +  * The objects can be added to the cache without crossing
> > the
> > > +  * flush threshold.
> > > +  */
> > >   cache_objs = &cache->objs[cache->len];
> > >   cache->len += n;
> > > - } else {
> > > + } else if (likely(n <= cache->flushthresh)) {
> > IMO, this is a misconfiguration on the application part. In the PMDs I
> > have looked at, max value of 'n' is controlled by compile time
> > constants. Application could do a compile time check on the cache
> > threshold or we could have another RTE_ASSERT on this.
> 
> There could be applications using a mempool for something else than mbufs.
Agree

> 
> In that case, the application should be allowed to get/put many objects in
> one transaction.
Still, this is a misconfiguration on the application. On one hand the threshold 
is configured for 'x' but they are sending a request which is more than 'x'. It 
should be possible to change the threshold configuration or reduce the request 
size.

If the application does not fix the misconfiguration, it is possible that it 
will always hit this case and does not get the benefit of using the per-core 
cache.

With this check, we are introducing an additional memcpy as well. I am not sure 
if reusing the latest buffers is better than having a memcpy.

> 
> >
> > > + /*
> > > +  * The request itself fits into the cache.
> > > +  * But first, the cache must be flushed to the backend, so
> > > +  * adding the objects does not cross the flush threshold.
> > > +  */
> > >   cache_objs = &cache->objs[0];
> > >   rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache-
> > > >len);
> > >   cache->len = n;
> > > + } else {
> > > + /* The request itself is too big for the cache. */
> > > + goto driver_enqueue_stats_incremented;
> > >   }
> > >
> > >   /* Add the objects to the cache. */ @@ -1399,13 +1403,13 @@
> > > rte_mempool_do_generic_put(struct rte_mempool *mp, void * const
> > > *obj_table,
> > >
> > >  driver_enqueue:
> > >
> > > - /* increment stat now, adding in mempool always success */
> > > + /* Increment stats now, adding in mempool always succeeds. */
> > >   RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> > >   RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> > >
> > >  driver_enqueue_stats_incremented:
> > >
> > > - /* push objects to the backend */
> > > + /* Push the objects to the backend. */
> > >   rte_mempool_ops_enqueue_bulk(mp, obj_table, n);  }
> > >
> > > --
> > > 2.17.1



[RFC PATCH v3] net/memif: change socket listener owner uid/gid

2022-11-16 Thread Junxiao Shi
This allows a DPDK application running with root privilege to create a
memif socket listener with non-root owner uid and gid, which can be
connected from client applications running without root privilege.

Signed-off-by: Junxiao Shi 
---
 doc/guides/nics/memif.rst |  2 ++
 drivers/net/memif/memif_socket.c  | 13 +++--
 drivers/net/memif/rte_eth_memif.c | 48 +--
 drivers/net/memif/rte_eth_memif.h |  2 ++
 4 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
index aca843640b..8a8141aa72 100644
--- a/doc/guides/nics/memif.rst
+++ b/doc/guides/nics/memif.rst
@@ -44,6 +44,8 @@ client.
"rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", 
"10", "1-14"
"socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 
108"
"socket-abstract=no", "Set usage of abstract socket address", "yes", 
"yes|no"
+   "uid=1000", "Set socket listener owner uid. Only relevant to server with 
socket-abstract=no", "unchanged", "uid_t"
+   "gid=1000", "Set socket listener owner gid. Only relevant to server with 
socket-abstract=no", "unchanged", "gid_t"
"mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
"secret=abc123", "Secret is an optional security option, which if 
specified, must be matched by peer", "", "string len 24"
"zero-copy=yes", "Enable/disable zero-copy client mode. Only relevant to 
client, requires '--single-file-segments' eal argument", "no", "yes|no"
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
index 7886644412..c2b038d01a 100644
--- a/drivers/net/memif/memif_socket.c
+++ b/drivers/net/memif/memif_socket.c
@@ -889,7 +889,7 @@ memif_listener_handler(void *arg)
 }
 
 static struct memif_socket *
-memif_socket_create(char *key, uint8_t listener, bool is_abstract)
+memif_socket_create(char *key, uint8_t listener, bool is_abstract, uid_t 
owner_uid, gid_t owner_gid)
 {
struct memif_socket *sock;
struct sockaddr_un un = { 0 };
@@ -941,6 +941,14 @@ memif_socket_create(char *key, uint8_t listener, bool 
is_abstract)
 
MIF_LOG(DEBUG, "Memif listener socket %s created.", 
sock->filename);
 
+   if (!is_abstract && (owner_uid != (uid_t)-1 || owner_gid != 
(gid_t)-1)) {
+   ret = chown(sock->filename, owner_uid, owner_gid);
+   if (ret < 0) {
+   MIF_LOG(ERR, "Failed to change listener socket 
owner %d", errno);
+   goto error;
+   }
+   }
+
/* Allocate interrupt instance */
sock->intr_handle =
rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_SHARED);
@@ -1017,7 +1025,8 @@ memif_socket_init(struct rte_eth_dev *dev, const char 
*socket_filename)
if (ret < 0) {
socket = memif_socket_create(key,
(pmd->role == MEMIF_ROLE_CLIENT) ? 0 : 1,
-   pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT);
+   pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT,
+   pmd->owner_uid, pmd->owner_gid);
if (socket == NULL)
return -1;
ret = rte_hash_add_key_data(hash, key, socket);
diff --git a/drivers/net/memif/rte_eth_memif.c 
b/drivers/net/memif/rte_eth_memif.c
index dd951b8296..092f1cbc92 100644
--- a/drivers/net/memif/rte_eth_memif.c
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -37,6 +37,8 @@
 #define ETH_MEMIF_RING_SIZE_ARG"rsize"
 #define ETH_MEMIF_SOCKET_ARG   "socket"
 #define ETH_MEMIF_SOCKET_ABSTRACT_ARG  "socket-abstract"
+#define ETH_MEMIF_OWNER_UID_ARG"owner-uid"
+#define ETH_MEMIF_OWNER_GID_ARG"owner-gid"
 #define ETH_MEMIF_MAC_ARG  "mac"
 #define ETH_MEMIF_ZC_ARG   "zero-copy"
 #define ETH_MEMIF_SECRET_ARG   "secret"
@@ -48,6 +50,8 @@ static const char * const valid_arguments[] = {
ETH_MEMIF_RING_SIZE_ARG,
ETH_MEMIF_SOCKET_ARG,
ETH_MEMIF_SOCKET_ABSTRACT_ARG,
+   ETH_MEMIF_OWNER_UID_ARG,
+   ETH_MEMIF_OWNER_GID_ARG,
ETH_MEMIF_MAC_ARG,
ETH_MEMIF_ZC_ARG,
ETH_MEMIF_SECRET_ARG,
@@ -1515,7 +1519,7 @@ static const struct eth_dev_ops ops = {
 static int
 memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
 memif_interface_id_t id, uint32_t flags,
-const char *socket_filename,
+const char *socket_filename, uid_t owner_uid, gid_t owner_gid,
 memif_log2_ring_size_t log2_ring_size,
 uint16_t pkt_buffer_size, const char *secret,
 struct rte_ether_addr *ether_addr)
@@ -1554,6 +1558,8 @@ memif_create(struct rte_vdev_device *vdev, enum 
memif_role_t role,
/* Zero-copy flag irelevant to server. */
if (pmd->role == MEMIF_ROLE_SERVER)

[RFC v2] mempool: add API to return pointer to free space on per-core cache

2022-11-16 Thread Kamalakshitha Aligeri
Expose the pointer to the free space in the per-core cache to the PMD, so that
objects can be copied directly into the cache without any temporary storage.

Signed-off-by: Kamalakshitha Aligeri 
---
v2: Integration of API in vector PMD
v1: API to return pointer to free space on per-core cache  and 
integration of API in scalar PMD

 app/test/test_mempool.c | 140 
 drivers/net/i40e/i40e_rxtx_vec_avx512.c |  46 +++-
 drivers/net/i40e/i40e_rxtx_vec_common.h |  22 +++-
 lib/mempool/rte_mempool.h   |  46 
 4 files changed, 219 insertions(+), 35 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 8e493eda47..a0160336dd 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -187,6 +187,142 @@ test_mempool_basic(struct rte_mempool *mp, int 
use_external_cache)
return ret;
 }
 
+/* basic tests (done on one core) */
+static int
+test_mempool_get_cache(struct rte_mempool *mp, int use_external_cache)
+{
+   uint32_t *objnum;
+   void **objtable;
+   void *obj, *obj2;
+   char *obj_data;
+   int ret = 0;
+   unsigned int i, j;
+   int offset;
+   struct rte_mempool_cache *cache;
+   void **cache_objs;
+
+   if (use_external_cache) {
+   /* Create a user-owned mempool cache. */
+   cache = rte_mempool_cache_create(RTE_MEMPOOL_CACHE_MAX_SIZE,
+SOCKET_ID_ANY);
+   if (cache == NULL)
+   RET_ERR();
+   } else {
+   /* May be NULL if cache is disabled. */
+   cache = rte_mempool_default_cache(mp, rte_lcore_id());
+   }
+
+   /* dump the mempool status */
+   rte_mempool_dump(stdout, mp);
+
+   printf("get an object\n");
+   if (rte_mempool_generic_get(mp, &obj, 1, cache) < 0)
+   GOTO_ERR(ret, out);
+   rte_mempool_dump(stdout, mp);
+
+   /* tests that improve coverage */
+   printf("get object count\n");
+   /* We have to count the extra caches, one in this case. */
+   offset = use_external_cache ? 1 * cache->len : 0;
+   if (rte_mempool_avail_count(mp) + offset != MEMPOOL_SIZE - 1)
+   GOTO_ERR(ret, out);
+
+   printf("get private data\n");
+   if (rte_mempool_get_priv(mp) != (char *)mp +
+   RTE_MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
+   GOTO_ERR(ret, out);
+
+#ifndef RTE_EXEC_ENV_FREEBSD /* rte_mem_virt2iova() not supported on bsd */
+   printf("get physical address of an object\n");
+   if (rte_mempool_virt2iova(obj) != rte_mem_virt2iova(obj))
+   GOTO_ERR(ret, out);
+#endif
+
+
+   printf("put the object back\n");
+   cache_objs = rte_mempool_get_cache(mp, 1);
+   if (cache_objs != NULL)
+   rte_memcpy(cache_objs, &obj, sizeof(void *));
+   else
+   rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
+
+   rte_mempool_dump(stdout, mp);
+
+   printf("get 2 objects\n");
+   if (rte_mempool_generic_get(mp, &obj, 1, cache) < 0)
+   GOTO_ERR(ret, out);
+   if (rte_mempool_generic_get(mp, &obj2, 1, cache) < 0) {
+   rte_mempool_generic_put(mp, &obj, 1, cache);
+   GOTO_ERR(ret, out);
+   }
+   rte_mempool_dump(stdout, mp);
+
+   printf("put the objects back\n");
+   cache_objs = rte_mempool_get_cache(mp, 1);
+   if (cache_objs != NULL)
+   rte_memcpy(mp, &obj, sizeof(void *));
+   else
+   rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
+
+   cache_objs = rte_mempool_get_cache(mp, 1);
+   if (cache_objs != NULL)
+   rte_memcpy(mp, &obj2, sizeof(void *));
+   else
+   rte_mempool_ops_enqueue_bulk(mp, &obj2, 1);
+   rte_mempool_dump(stdout, mp);
+
+   /*
+* get many objects: we cannot get them all because the cache
+* on other cores may not be empty.
+*/
+   objtable = malloc(MEMPOOL_SIZE * sizeof(void *));
+   if (objtable == NULL)
+   GOTO_ERR(ret, out);
+
+   for (i = 0; i < MEMPOOL_SIZE; i++) {
+   if (rte_mempool_generic_get(mp, &objtable[i], 1, cache) < 0)
+   break;
+   }
+
+   /*
+* for each object, check that its content was not modified,
+* and put objects back in pool
+*/
+   cache_objs = rte_mempool_get_cache(mp, MEMPOOL_SIZE);
+   if (cache_objs != NULL) {
+   while (i--) {
+   obj = objtable[i];
+   obj_data = obj;
+   objnum = obj;
+   if (*objnum > MEMPOOL_SIZE) {
+   printf("bad object number(%d)\n", *objnum);
+   ret = -1;
+   break;
+   }
+   for (j = sizeof(*objnum); j < mp->elt_size; j++) {
+   

RE: [PATCH] failsafe: fix segfault on hotplug event

2022-11-16 Thread Konstantin Ananyev



 
> When the failsafe PMD encounters a hotplug event, it switches its rx/tx
> functions to "safe" ones that validate the sub-device's rx/tx functions
> before calling them. It switches the rx/tx functions by changing the
> function pointers in the rte_eth_dev structure.
> 
> Following commit 7a0935239b, the rx/tx functions of PMDs are no longer
> called through the function pointers in the rte_eth_dev structure. They
> are rather called through a flat array named rte_eth_fp_ops. The
> function pointers in that array are initialized when the devices are started.
> 
> When a hotplug event occurs, the function pointers in rte_eth_fp_ops
> still point to the "unsafe" rx/tx functions in the failsafe PMD since
> they haven't been updated. This results in a segmentation fault because
> it ends up using the "unsafe" functions, when the "safe" functions
> should have been used.
> 
> To fix the problem, the failsafe PMD code was changed to update the
> function pointers in the rte_eth_fp_ops array when a hotplug event
> occurs.

 
It is not the recommended way to update rte_eth_fp_ops[] contents directly.
There are eth_dev_fp_ops_setup()/eth_dev_fp_ops_reset() that are supposed
to be used for that.
About the fix itself - while it might help to some extent,
I think it will not remove the problem completely.
There still remains a race condition between rte_eth_rx_burst() and 
failsafe_eth_rmv_event_callback().
Right now DPDK doesn't support switching PMD fast-ops functions (or updating 
rxq/txq data)
on the fly.
  
> Fixes: 7a0935239b ("ethdev: make fast-path functions to use new flat array")
> Cc: Konstantin Ananyev 
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Luc Pelletier 
> ---
>  drivers/net/failsafe/failsafe_rxtx.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/net/failsafe/failsafe_rxtx.c 
> b/drivers/net/failsafe/failsafe_rxtx.c
> index fe67293299..34d59dfbb1 100644
> --- a/drivers/net/failsafe/failsafe_rxtx.c
> +++ b/drivers/net/failsafe/failsafe_rxtx.c
> @@ -5,6 +5,7 @@
> 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -44,9 +45,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int 
> force_safe)
>   DEBUG("Using safe RX bursts%s",
> (force_safe ? " (forced)" : ""));
>   dev->rx_pkt_burst = &failsafe_rx_burst;
> + rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
> + &failsafe_rx_burst;
>   } else if (!need_safe && safe_set) {
>   DEBUG("Using fast RX bursts");
>   dev->rx_pkt_burst = &failsafe_rx_burst_fast;
> + rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
> + &failsafe_rx_burst_fast;
>   }
>   need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
>   safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
> @@ -54,9 +59,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int 
> force_safe)
>   DEBUG("Using safe TX bursts%s",
> (force_safe ? " (forced)" : ""));
>   dev->tx_pkt_burst = &failsafe_tx_burst;
> + rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
> + &failsafe_tx_burst;
>   } else if (!need_safe && safe_set) {
>   DEBUG("Using fast TX bursts");
>   dev->tx_pkt_burst = &failsafe_tx_burst_fast;
> + rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
> + &failsafe_tx_burst_fast;
>   }
>   rte_wmb();
>  }
> --
> 2.25.1



RE: [PATCH v2] mempool: micro-optimize put function

2022-11-16 Thread Morten Brørup
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Wednesday, 16 November 2022 17.27
> 
> 
> 
> > > >
> > > > Micro-optimization:
> > > > Reduced the most likely code path in the generic put function by
> > > moving an
> > > > unlikely check out of the most likely code path and further down.
> > > >
> > > > Also updated the comments in the function.
> > > >
> > > > v2 (feedback from Andrew Rybchenko):
> > > > * Modified comparison to prevent overflow if n is really huge and
> > > > len
> > > is
> > > >   non-zero.
> > > > * Added assertion about the invariant preventing overflow in the
> > > >   comparison.
> > > > * Crossing the threshold is not extremely unlikely, so removed
> > > likely()
> > > >   from that comparison.
> > > >   The compiler will generate code with optimal static branch
> > > prediction
> > > >   here anyway.
> > > >
> > > > Signed-off-by: Morten Brørup 
> > > > ---
> > > >  lib/mempool/rte_mempool.h | 36 -
> ---
> > > >  1 file changed, 20 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/lib/mempool/rte_mempool.h
> b/lib/mempool/rte_mempool.h
> > > > index 9f530db24b..dd1a3177d6 100644
> > > > --- a/lib/mempool/rte_mempool.h
> > > > +++ b/lib/mempool/rte_mempool.h
> > > > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct
> > > > rte_mempool *mp, void * const *obj_table,  {
> > > > void **cache_objs;
> > > >
> > > > -   /* No cache provided */
> > > > +   /* No cache provided? */
> > > > if (unlikely(cache == NULL))
> > > > goto driver_enqueue;
> > > >
> > > > -   /* increment stat now, adding in mempool always success */
> > > > +   /* Increment stats now, adding in mempool always succeeds.
> */
> > > > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > > RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > > >
> > > > -   /* The request itself is too big for the cache */
> > > > -   if (unlikely(n > cache->flushthresh))
> > > > -   goto driver_enqueue_stats_incremented;
> > > > -
> > > > -   /*
> > > > -* The cache follows the following algorithm:
> > > > -*   1. If the objects cannot be added to the cache without
> > > crossing
> > > > -*  the flush threshold, flush the cache to the
> backend.
> > > > -*   2. Add the objects to the cache.
> > > > -*/
> > > > +   /* Assert the invariant preventing overflow in the
> comparison
> > > below.
> > > > */
> > > > +   RTE_ASSERT(cache->len <= cache->flushthresh);
> > > >
> > > > -   if (cache->len + n <= cache->flushthresh) {
> > > > +   if (n <= cache->flushthresh - cache->len) {
> > > > +   /*
> > > > +* The objects can be added to the cache without
> crossing
> > > the
> > > > +* flush threshold.
> > > > +*/
> > > > cache_objs = &cache->objs[cache->len];
> > > > cache->len += n;
> > > > -   } else {
> > > > +   } else if (likely(n <= cache->flushthresh)) {
> > > IMO, this is a misconfiguration on the application part. In the
> PMDs I
> > > have looked at, max value of 'n' is controlled by compile time
> > > constants. Application could do a compile time check on the cache
> > > threshold or we could have another RTE_ASSERT on this.
> >
> > There could be applications using a mempool for something else than
> mbufs.
> Agree
> 
> >
> > In that case, the application should be allowed to get/put many
> objects in
> > one transaction.
> Still, this is a misconfiguration on the application. On one hand the
> threshold is configured for 'x' but they are sending a request which is
> more than 'x'. It should be possible to change the threshold
> configuration or reduce the request size.
> 
> If the application does not fix the misconfiguration, it is possible
> that it will always hit this case and does not get the benefit of using
> the per-core cache.

Correct. I suppose this is the intended behavior of this API.

The zero-copy API proposed in another patch [1] has stricter requirements to 
the bulk size.

[1]: 
http://inbox.dpdk.org/dev/20221115161822.70886-1...@smartsharesystems.com/T/#u

> 
> With this check, we are introducing an additional memcpy as well. I am
> not sure if reusing the latest buffers is better than having an memcpy.

There is no additional memcpy. The large bulk transfer is stored directly in 
the backend pool, bypassing the mempool cache.

Please note that this check is not new, it has just been moved. Before this 
patch, it was checked on every call (if a cache is present); with this patch, 
it is only checked if the entire request cannot go directly into the cache.

> 
> >
> > >
> > > > +   /*
> > > > +* The request itself fits into the cache.
> > > > +* But first, the cache must be flushed to the
> backend, so
> > > > +* adding the objects d

Re: [PATCH] net/failsafe: Fix crash due to in-valid sub-device port id

2022-11-16 Thread Stephen Hemminger
On Wed, 16 Nov 2022 15:22:24 +0530
madhuker.myt...@oracle.com wrote:

>  
> + if (!rte_eth_dev_is_valid_port(PORT_ID(sdev))) {
> + continue;
> + }
> +

Looks ok, but DPDK follows kernel style: {} is unnecessary on a single statement.
Checkpatch will give you warnings on this.


Re: [RFC PATCH v3] net/memif: change socket listener owner uid/gid

2022-11-16 Thread Stephen Hemminger
On Wed, 16 Nov 2022 17:14:13 +
Junxiao Shi  wrote:

> This allows a DPDK application running with root privilege to create a
> memif socket listener with non-root owner uid and gid, which can be
> connected from client applications running without root privilege.
> 
> Signed-off-by: Junxiao Shi 

Looks good, hope you tested this with the example.

Acked-by: Stephen Hemminger 


[PATCH v2] mempool cache: add zero-copy get and put functions

2022-11-16 Thread Morten Brørup
Zero-copy access to mempool caches is beneficial for PMD performance, and
must be provided by the mempool library to fix [Bug 1052] without a
performance regression.

[Bug 1052]: https://bugs.dpdk.org/show_bug.cgi?id=1052
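
(Editorial note, not part of the patch: a minimal usage sketch of the proposed
put function, roughly as a PMD Tx free path might use it. The txep array and n
are hypothetical, n must not exceed RTE_MEMPOOL_CACHE_MAX_SIZE, and the cache
pointer must be valid, e.g. obtained via rte_mempool_default_cache() with
caching enabled.)

    struct rte_mempool_cache *cache;
    void **cache_objs;
    unsigned int i;

    cache = rte_mempool_default_cache(mp, rte_lcore_id());
    cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
    /* copy the object pointers straight into the cache, no staging array */
    for (i = 0; i < n; i++)
            cache_objs[i] = txep[i].mbuf;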

v2:
* Fix checkpatch warnings.
* Fix missing registration of trace points.
* The functions are inline, so they don't go into the map file.
v1 changes from the RFC:
* Removed run-time parameter checks. (Honnappa)
  This is a hot fast path function; requiring correct application
  behaviour, i.e. function parameters must be valid.
* Added RTE_ASSERT for parameters instead.
  Code for this is only generated if built with RTE_ENABLE_ASSERT.
* Removed fallback when 'cache' parameter is not set. (Honnappa)
* Chose the simple get function; i.e. do not move the existing objects in
  the cache to the top of the new stack, just leave them at the bottom.
* Renamed the functions. Other suggestions are welcome, of course. ;-)
* Updated the function descriptions.
* Added the functions to trace_fp and version.map.

Signed-off-by: Morten Brørup 
---
 lib/mempool/mempool_trace_points.c |   6 ++
 lib/mempool/rte_mempool.h  | 124 +
 lib/mempool/rte_mempool_trace_fp.h |  16 
 lib/mempool/version.map|   4 +
 4 files changed, 150 insertions(+)

diff --git a/lib/mempool/mempool_trace_points.c 
b/lib/mempool/mempool_trace_points.c
index 4ad76deb34..a6070799af 100644
--- a/lib/mempool/mempool_trace_points.c
+++ b/lib/mempool/mempool_trace_points.c
@@ -77,3 +77,9 @@ RTE_TRACE_POINT_REGISTER(rte_mempool_trace_ops_free,
 
 RTE_TRACE_POINT_REGISTER(rte_mempool_trace_set_ops_byname,
lib.mempool.set.ops.byname)
+
+RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_put_bulk,
+   lib.mempool.cache.zc.put.bulk)
+
+RTE_TRACE_POINT_REGISTER(rte_mempool_trace_cache_zc_get_bulk,
+   lib.mempool.cache.zc.get.bulk)
diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 9f530db24b..5e6da06bc7 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_mempool_trace_fp.h"
 
@@ -1346,6 +1347,129 @@ rte_mempool_cache_flush(struct rte_mempool_cache *cache,
cache->len = 0;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
+ *
+ * Zero-copy put objects in a user-owned mempool cache backed by the specified 
mempool.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param mp
+ *   A pointer to the mempool.
+ * @param n
+ *   The number of objects to be put in the mempool cache.
+ *   Must not exceed RTE_MEMPOOL_CACHE_MAX_SIZE.
+ * @return
+ *   The pointer to where to put the objects in the mempool cache.
+ */
+__rte_experimental
+static __rte_always_inline void *
+rte_mempool_cache_zc_put_bulk(struct rte_mempool_cache *cache,
+   struct rte_mempool *mp,
+   unsigned int n)
+{
+   void **cache_objs;
+
+   RTE_ASSERT(cache != NULL);
+   RTE_ASSERT(mp != NULL);
+   RTE_ASSERT(n <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+
+   rte_mempool_trace_cache_zc_put_bulk(cache, mp, n);
+
+   /* Increment stats now, adding in mempool always succeeds. */
+   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
+   RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
+
+   /*
+* The cache follows the following algorithm:
+*   1. If the objects cannot be added to the cache without crossing
+*  the flush threshold, flush the cache to the backend.
+*   2. Add the objects to the cache.
+*/
+
+   if (cache->len + n <= cache->flushthresh) {
+   cache_objs = &cache->objs[cache->len];
+   cache->len += n;
+   } else {
+   cache_objs = &cache->objs[0];
+   rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
+   cache->len = n;
+   }
+
+   return cache_objs;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: This API may change, or be removed, without prior notice.
+ *
+ * Zero-copy get objects from a user-owned mempool cache backed by the 
specified mempool.
+ *
+ * @param cache
+ *   A pointer to the mempool cache.
+ * @param mp
+ *   A pointer to the mempool.
+ * @param n
+ *   The number of objects to prefetch into the mempool cache.
+ *   Must not exceed RTE_MEMPOOL_CACHE_MAX_SIZE.
+ * @return
+ *   The pointer to the objects in the mempool cache.
+ *   NULL on error; i.e. the cache + the pool does not contain n objects.
+ *   With rte_errno set to the error code of the mempool dequeue function.
+ */
+__rte_experimental
+static __rte_always_inline void *
+rte_mempool_cache_zc_get_bulk(struct rte_mempool_cache *cache,
+   struct rte_mempool *mp,
+   unsigned int n)
+{
+   unsigned int len;
+
+   RTE_ASSERT(cache != NULL);
+   RTE_ASSERT(mp != NULL);
+   RTE_ASSERT(n <= RTE_MEMPOOL_CACHE_MAX_SIZE);
+
+   rte_memp

Re: [PATCH] failsafe: fix segfault on hotplug event

2022-11-16 Thread Luc Pelletier
Hi Konstantin,

> It is not recommended way to update rte_eth_fp_ops[] contents directly.
> There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> to be used for that.

Good to know. I see another fix that was made in a different PMD that
does exactly the same thing:

https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74

CC'ing the authors for awareness.

> About the fix itself - while it might help till some extent,
> I think it will not remove the problem completely.
> There still remain a race-condition between rte_eth_rx_burst() and 
> failsafe_eth_rmv_event_callback().
> Right now DPDK doesn't support switching PMD fast-ops functions (or updating 
> rxq/txq data)
> on the fly.

Thanks for the information. This is very helpful.

Are you saying that the previous code also had that same race
condition? It was only updating the rte_eth_dev structure, but I
assume the problem would have been the same since rte_eth_rx_burst()
in DPDK versions <=20 use the function pointers in rte_eth_dev, not
rte_eth_fp_ops.

Can you think of a possible solution to this problem? I'm happy to
provide a patch to properly fix the problem. Having your guidance
would be extremely helpful.

Thanks!


net_af_xdp pmd memory leak

2022-11-16 Thread Lei Kong
My test reports memory leaks on the receive path. Looking at the code, it seems
the following allocations are never released; should they be? Thanks.
https://github.com/DPDK/dpdk/blob/903ec2b1b49e496815c016b0104fd655cd972661/drivers/net/af_xdp/rte_eth_af_xdp.c#L312




Re: [PATCH] failsafe: fix segfault on hotplug event

2022-11-16 Thread Stephen Hemminger
On Wed, 16 Nov 2022 16:51:59 -0500
Luc Pelletier  wrote:

> Hi Konstantin,
> 
> > It is not recommended way to update rte_eth_fp_ops[] contents directly.
> > There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> > to be used for that.  
> 
> Good to know. I see another fix that was made in a different PMD that
> does exactly the same thing:
> 
> https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74
> 
> CC'ing the authors for awareness.
> 
> > About the fix itself - while it might help till some extent,
> > I think it will not remove the problem completely.
> > There still remain a race-condition between rte_eth_rx_burst() and 
> > failsafe_eth_rmv_event_callback().
> > Right now DPDK doesn't support switching PMD fast-ops functions (or 
> > updating rxq/txq data)
> > on the fly.  
> 
> Thanks for the information. This is very helpful.
> 
> Are you saying that the previous code also had that same race
> condition? It was only updating the rte_eth_dev structure, but I
> assume the problem would have been the same since rte_eth_rx_burst()
> in DPDK versions <=20 use the function pointers in rte_eth_dev, not
> rte_eth_fp_ops.
> 
> Can you think of a possible solution to this problem? I'm happy to
> provide a patch to properly fix the problem. Having your guidance
> would be extremely helpful.
> 
> Thanks!

Changing burst mode on a running device is not safe because
of lack of locking and/or memory barriers.

It would have been better not to do this optimization.
Just have one rx_burst/tx_burst function and look at whatever
conditions are present there.
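
(Editorial sketch of that single-function approach; the struct fields and the
removal flag are hypothetical, and publishing that flag safely from
failsafe_eth_rmv_event_callback() is exactly the synchronization problem
discussed above. failsafe_rx_burst and failsafe_rx_burst_fast are the existing
safe/fast paths.)

    static uint16_t
    failsafe_rx_burst_single(void *queue, struct rte_mbuf **rx_pkts,
                             uint16_t nb_pkts)
    {
            struct rxq *rxq = queue;

            /* hypothetical flag set by the hotplug callback */
            if (__atomic_load_n(&rxq->priv->removal_in_progress,
                                __ATOMIC_ACQUIRE))
                    return failsafe_rx_burst(queue, rx_pkts, nb_pkts);

            return failsafe_rx_burst_fast(queue, rx_pkts, nb_pkts);
    }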


[PATCH] net/idpf: fix port start

2022-11-16 Thread beilei . xing
From: Beilei Xing 

Port can't start successfully if the port is stopped and then started
again.
This patch fixes port start by clearing the stopped flag when the port is
started.

Fixes: e9ff6df15b9a ("net/idpf: stop before closing device")

Signed-off-by: Beilei Xing 
---
 drivers/net/idpf/idpf_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c
index 0b90f885a8..20f088eb80 100644
--- a/drivers/net/idpf/idpf_ethdev.c
+++ b/drivers/net/idpf/idpf_ethdev.c
@@ -552,6 +552,8 @@ idpf_dev_start(struct rte_eth_dev *dev)
uint16_t req_vecs_num;
int ret;
 
+   vport->stopped = 0;
+
if (dev->data->mtu > vport->max_mtu) {
PMD_DRV_LOG(ERR, "MTU should be less than %d", vport->max_mtu);
ret = -EINVAL;
-- 
2.26.2



[PATCH] net/iavf: support vxlan gpe tunnel offload

2022-11-16 Thread Zhichao Zeng
Add support for Vxlan-GPE tunnel packet checksum offloading by adding
the VXLAN_GPE flag during processing of Tx context descriptor.
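
(Editorial sketch, not part of the patch: roughly what an application sets on
the mbuf for this offload to take effect, assuming an outer
Ethernet/IPv4/UDP + VXLAN-GPE encapsulation; the exact flag combination depends
on which checksums are requested.)

    m->outer_l2_len = sizeof(struct rte_ether_hdr);
    m->outer_l3_len = sizeof(struct rte_ipv4_hdr);
    /* l2_len covers the tunnel headers plus the inner Ethernet header */
    m->l2_len = sizeof(struct rte_udp_hdr) + sizeof(struct rte_vxlan_gpe_hdr) +
                sizeof(struct rte_ether_hdr);
    m->l3_len = sizeof(struct rte_ipv4_hdr);
    m->ol_flags |= RTE_MBUF_F_TX_TUNNEL_VXLAN_GPE |
                   RTE_MBUF_F_TX_OUTER_IPV4 | RTE_MBUF_F_TX_OUTER_IP_CKSUM |
                   RTE_MBUF_F_TX_IPV4 | RTE_MBUF_F_TX_IP_CKSUM |
                   RTE_MBUF_F_TX_UDP_CKSUM;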

Signed-off-by: Zhichao Zeng 
---
 drivers/net/iavf/iavf_rxtx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index bd5dd2d4ed..cf87a6beda 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -2424,6 +2424,7 @@ iavf_fill_ctx_desc_tunnelling_field(volatile uint64_t 
*qw0,
/* for non UDP / GRE tunneling, set to 00b */
break;
case RTE_MBUF_F_TX_TUNNEL_VXLAN:
+   case RTE_MBUF_F_TX_TUNNEL_VXLAN_GPE:
case RTE_MBUF_F_TX_TUNNEL_GTP:
case RTE_MBUF_F_TX_TUNNEL_GENEVE:
eip_typ |= IAVF_TXD_CTX_UDP_TUNNELING;
-- 
2.25.1



RE: [PATCH] net/iavf: support vxlan gpe tunnel offload

2022-11-16 Thread Xu, Ke1


> -Original Message-
> From: Zhichao Zeng 
> Sent: Thursday, November 17, 2022 11:30 AM
> To: dev@dpdk.org
> Cc: Zhou, YidingX ; Zhang, Qi Z
> ; Zeng, ZhichaoX ; Wu,
> Jingjing ; Xing, Beilei 
> Subject: [PATCH] net/iavf: support vxlan gpe tunnel offload
> 
> Add support for Vxlan-GPE tunnel packet checksum offloading by adding the
> VXLAN_GPE flag during processing of Tx context descriptor.
> 
> Signed-off-by: Zhichao Zeng 

Verified and Passed.
Regards,

Tested-by: Ke Xu 

> ---
>  drivers/net/iavf/iavf_rxtx.c | 1 +
>  1 file changed, 1 insertion(+)
> 



RE: [PATCH] net/iavf: support vxlan gpe tunnel offload

2022-11-16 Thread Zhang, Qi Z



> -Original Message-
> From: Xu, Ke1 
> Sent: Thursday, November 17, 2022 11:31 AM
> To: Zeng, ZhichaoX ; dev@dpdk.org
> Cc: Zhou, YidingX ; Zhang, Qi Z
> ; Zeng, ZhichaoX ; Wu,
> Jingjing ; Xing, Beilei 
> Subject: RE: [PATCH] net/iavf: support vxlan gpe tunnel offload
> 
> 
> > -Original Message-
> > From: Zhichao Zeng 
> > Sent: Thursday, November 17, 2022 11:30 AM
> > To: dev@dpdk.org
> > Cc: Zhou, YidingX ; Zhang, Qi Z
> > ; Zeng, ZhichaoX ; Wu,
> > Jingjing ; Xing, Beilei 
> > Subject: [PATCH] net/iavf: support vxlan gpe tunnel offload
> >
> > Add support for Vxlan-GPE tunnel packet checksum offloading by adding
> > the VXLAN_GPE flag during processing of Tx context descriptor.
> >
> > Signed-off-by: Zhichao Zeng 
> 
> Verified and Passed.
> Regards,
> 
> Tested-by: Ke Xu 

Applied to dpdk-next-net-intel.

Thanks
Qi



RE: [PATCH] net/ixgbevf: fix promiscuous and allmulti

2022-11-16 Thread Zhang, Qi Z



> -Original Message-
> From: Wu, Wenjun1 
> Sent: Friday, October 14, 2022 9:14 AM
> To: Matz, Olivier 
> Cc: dev@dpdk.org; Yang, Qiming ; Zhao1, Wei
> 
> Subject: RE: [PATCH] net/ixgbevf: fix promiscuous and allmulti
> 
> 
> 
> > -Original Message-
> > From: Olivier Matz 
> > Sent: Thursday, October 13, 2022 10:46 PM
> > To: Wu, Wenjun1 
> > Cc: dev@dpdk.org; Yang, Qiming ; Zhao1, Wei
> > 
> > Subject: Re: [PATCH] net/ixgbevf: fix promiscuous and allmulti
> >
> > Hi Wenjun,
> >
> > On Mon, Oct 10, 2022 at 01:30:54AM +, Wu, Wenjun1 wrote:
> > > Hi Olivier,
> > >
> > > > -Original Message-
> > > > From: Olivier Matz 
> > > > Sent: Thursday, September 29, 2022 8:22 PM
> > > > To: dev@dpdk.org
> > > > Cc: Yang, Qiming ; Wu, Wenjun1
> > > > ; Zhao1, Wei 
> > > > Subject: [PATCH] net/ixgbevf: fix promiscuous and allmulti
> > > >
> > > > The configuration of allmulti and promiscuous modes conflicts
> > > > together. For instance, if we enable promiscuous mode, then enable
> > > > and disable allmulti, then the promiscuous mode is wrongly disabled.
> > > >
> > > > Fix this behavior by:
> > > > - doing nothing when we set/unset allmulti if promiscuous mode is
> > > > on
> > > > - restorting the proper mode (none or allmulti) when we disable
> > > >   promiscuous mode
> > > >
> > > > Fixes: 1f4564ed7696 ("net/ixgbevf: enable promiscuous mode")
> > > >
> > > > Signed-off-by: Olivier Matz 
> > > > ---
> > > >
> > > > Hi,
> > > >
> > > > For reference, this was tested with this plan:
> > > >
> > > > echo 8 > "/sys/bus/pci/devices/:01:00.1/sriov_numvfs"
> > > > ip link set dev eno2 up
> > > > ip link set dev eno2 promisc on
> > > > bridge link set dev eno2 hwmode veb ip link set dev eno2 mtu 9000
> > > >
> > > > ip link set dev eno2 vf 0 mac ac:1f:6b:fe:ba:b0 ip link set dev
> > > > eno2 vf 0 spoofchk off ip link set dev eno2 vf 0 trust on
> > > >
> > > > ip link set dev eno2 vf 1 mac ac:1f:6b:fe:ba:b1 ip link set dev
> > > > eno2 vf 1 spoofchk off ip link set dev eno2 vf 1 trust on
> > > >
> > > > python3 usertools/dpdk-devbind.py -s
> > > > python3 usertools/dpdk-devbind.py -b vfio-pci :01:10.1   # vf 0
> > > > python3 usertools/dpdk-devbind.py -b ixgbevf :01:10.3# vf 1
> > > >
> > > >
> > > > # in another terminal
> > > > scapy
> > > > while True:
> > > >   sendp(Ether(dst='ac:1f:6b:00:00:00'), iface='eno2v1')  # wrong mac
> > > >   sendp(Ether(dst='ac:1f:6b:fe:ba:b0'), iface='eno2v1')  # correct mac
> > > >   time.sleep(1)
> > > >
> > > >
> > > > ./build/app/dpdk-testpmd -l 1,2 -a :01:10.1 -- -i --total-num-
> > > > mbufs=32768 show port info all set fwd rxonly set verbose 1 set
> > > > promisc all off set allmulti all off start
> > > >
> > > > # ok, only packets to dst='ac:1f:6b:fe:ba:b0' are received
> > > >
> > > >
> > > > # ok, both packets are received
> > > > set promisc all on
> > > >
> > > >
> > > > # nok, only packets to dst='ac:1f:6b:fe:ba:b0' are received set
> > > > allmulti all on set allmulti all off
> > > >
> > > >
> > > >  drivers/net/ixgbe/ixgbe_ethdev.c | 12 +++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > index 8cec951d94..cc8383c5a9 100644
> > > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > @@ -7785,9 +7785,13 @@ static int
> > > >  ixgbevf_dev_promiscuous_disable(struct rte_eth_dev *dev)  {
> > > > struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data-
> > > > >dev_private);
> > > > +   int mode = IXGBEVF_XCAST_MODE_NONE;
> > > > int ret;
> > > >
> > > > -   switch (hw->mac.ops.update_xcast_mode(hw,
> > > > IXGBEVF_XCAST_MODE_NONE)) {
> > > > +   if (dev->data->all_multicast)
> > > > +   mode = IXGBEVF_XCAST_MODE_ALLMULTI;
> > > > +
> > > > +   switch (hw->mac.ops.update_xcast_mode(hw, mode)) {
> > > > case IXGBE_SUCCESS:
> > > > ret = 0;
> > > > break;
> > > > @@ -7809,6 +7813,9 @@ ixgbevf_dev_allmulticast_enable(struct
> > > > rte_eth_dev *dev)
> > > > int ret;
> > > > int mode = IXGBEVF_XCAST_MODE_ALLMULTI;
> > > >
> > > > +   if (dev->data->promiscuous)
> > > > +   return 0;
> > > > +
> > > > switch (hw->mac.ops.update_xcast_mode(hw, mode)) {
> > > > case IXGBE_SUCCESS:
> > > > ret = 0;
> > > > @@ -7830,6 +7837,9 @@ ixgbevf_dev_allmulticast_disable(struct
> > > > rte_eth_dev *dev)
> > > > struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data-
> > > > >dev_private);
> > > > int ret;
> > > >
> > > > +   if (dev->data->promiscuous)
> > > > +   return 0;
> > > > +
> > >
> > > It seems that we cannot actually turn off allmulticast mode when
> > > promiscuous mode is enabled, so can we return error and add a log
> > > message here as a reminder?
> >
> > I think we should not return an 

RE: [PATCH] net/idpf: fix port start

2022-11-16 Thread Zhang, Qi Z



> -Original Message-
> From: beilei.x...@intel.com 
> Sent: Thursday, November 17, 2022 11:08 AM
> To: Wu, Jingjing 
> Cc: dev@dpdk.org; Peng, Yuan ; Xing, Beilei
> 
> Subject: [PATCH] net/idpf: fix port start
> 
> From: Beilei Xing 
> 
> Port can't start successfully if stopping port and starting port again.
> This patch fixes port start by initialization.
> 
> Fixes: e9ff6df15b9a ("net/idpf: stop before closing device")
> 
> Signed-off-by: Beilei Xing 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi



[PATCH v1 00/13] graph enhancement for multi-core dispatch

2022-11-16 Thread Zhirun Yan
Currently, rte_graph supports the RTC (Run-To-Completion) model within
a single core.
RTC is one of the typical packet processing models. Others, like
Pipeline or Hybrid, are not supported.

The patch set introduces a 'generic' model selection which is a
self-reacting scheme according to the core affinity.
The new model enables a cross-core dispatching mechanism which employs a
scheduling work-queue to dispatch streams to other worker cores which
being associated with the destination node. When core flavor of the
destination node is a default 'current', the stream can be continue
executed as normal.

Example:
3-node graph targets 3-core budget

Generic Model
RTC:
Config Graph-A: node-0->current; node-1->current; node-2->current;
Graph-A':node-0/1/2 @0, Graph-A':node-0/1/2 @1, Graph-A':node-0/1/2 @2

+ - - - - - - - - - - - - - - - - - - - - - +
'Core #0/1/2'
'   '
' ++ +-+ ++ '
' | Node-0 | --> | Node-1  | --> | Node-2 | '
' ++ +-+ ++ '
'   '
+ - - - - - - - - - - - - - - - - - - - - - +

Pipeline:
Config Graph-A: node-0->0; node-1->1; node-2->2;
Graph-A':node-0 @0, Graph-A':node-1 @1, Graph-A':node-2 @2

+ - - - - - -+ +- - - - - - + + - - - - - -+
'  Core #0   ' '  Core #1   ' '  Core #2   '
'' '' ''
' ++ ' ' ++ ' ' ++ '
' | Node-0 | ' --> ' | Node-1 | ' --> ' | Node-2 | '
' ++ ' ' ++ ' ' ++ '
'' '' ''
+ - - - - - -+ +- - - - - - + + - - - - - -+

Hybrid:
Config Graph-A: node-0->current; node-1->current; node-2->2;
Graph-A':node-0/1 @0, Graph-A':node-0/1 @1, Graph-A':node-2 @2

+ - - - - - - - - - - - - - - - + + - - - - - -+
'Core #0' '  Core #2   '
'   ' ''
' ++ ++ ' ' ++ '
' | Node-0 | --> | Node-1 | ' --> ' | Node-2 | '
' ++ ++ ' ' ++ '
'   ' ''
+ - - - - - - - - - - - - - - - + + - - - - - -+
  ^
  |
  |
+ - - - - - - - - - - - - - - - + |
'Core #1' |
'   ' |
' ++ ++ ' |
' | Node-0 | --> | Node-1 | ' +
' ++ ++ '
'   '
+ - - - - - - - - - - - - - - - +


The patch set is broken down as below:

1. Split graph worker into common and default model part.
2. Inline graph node processing and graph circular buffer walking to make
  it reusable.
3. Add set/get APIs to choose worker model.
4. Introduce core affinity API to set a node to run on a specific worker core.
  (only used in the new model)
5. Introduce graph affinity API to bind one graph with specific worker
  core.
6. Introduce graph clone API.
7. Introduce stream moving with scheduler work-queue in patch 8,9,10.
8. Add stats for new models.
9. Abstract default graph config process and integrate new model into
  example/l3fwd-graph. Add new parameters for model choosing.

We can run with the new worker model like this:
./dpdk-l3fwd-graph -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P
--model="generic"
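
For illustration, a minimal control-path sketch using the APIs added in
this series (the node name and lcore ids below are only examples):

#include <rte_graph.h>
#include <rte_graph_worker.h>

static int
setup_generic_model(rte_graph_t main_graph_id)
{
	rte_graph_t clone_id;

	/* Select the cross-core dispatch model before any graph walk. */
	if (rte_graph_worker_model_set(RTE_GRAPH_MODEL_GENERIC) < 0)
		return -1;

	/* Pin a node to a specific worker lcore (patch 05). */
	if (rte_node_model_generic_set_lcore_affinity("ethdev_rx-0-0", 1) < 0)
		return -1;

	/* Clone the graph for another worker and bind the clone to its
	 * lcore (patches 06 and 07). */
	clone_id = rte_graph_clone(main_graph_id, "worker_2");
	if (clone_id == RTE_GRAPH_ID_INVALID)
		return -1;

	return rte_graph_bind_core(clone_id, 2);
}

Each worker lcore then calls rte_graph_walk() on its own (cloned) graph
as usual; the walk implementation dispatches by the selected model
(patch 11).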

References:
https://static.sched.com/hosted_files/dpdkuserspace22/a6/graph%20introduce%20remote%20dispatch%20for%20mult-core%20scaling.pdf

Zhirun Yan (13):
  graph: split graph worker into common and default model
  graph: move node process into inline function
  graph: add macro to walk on graph circular buffer
  graph: add get/set graph worker model APIs
  graph: introduce core affinity API
  graph: introduce graph affinity API
  graph: introduce graph clone API for other worker core
  graph: introduce stream moving cross cores
  graph: enable create and destroy graph scheduling workqueue
  graph: introduce graph walk by cross-core dispatch
  graph: enable graph generic scheduler model
  graph: add stats for cross-core dispatching
  examples/l3fwd-graph: introduce generic worker model

 examples/l3fwd-graph/main.c | 218 +--
 lib/graph/graph.c   | 179 +
 lib/graph/graph_debug.c |   6 +
 lib/graph/graph_populate.c  |   1 +
 lib/graph/graph_private.h   |  44 +++
 lib/graph/graph_stats.c |  74 +++-
 lib/graph/meson.build   |   3 +-
 lib/graph/node.c|   1 +
 lib/graph/rte_graph.h   |  44 +++
 lib/graph/rte_graph_model_generic.c | 179 +
 lib/graph/rte_graph_model_generic.h | 114 ++
 lib/graph/rte_graph_model_rtc.h |  22 ++
 lib/graph/rte_graph_worker.h| 516 ++

[PATCH v1 01/13] graph: split graph worker into common and default model

2022-11-16 Thread Zhirun Yan
To support multiple graph worker models, split the graph worker into
common and default parts. Name the current walk function
rte_graph_model_rtc since the default model is RTC (Run-To-Completion).

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_rtc.h |  57 
 lib/graph/rte_graph_worker.h| 498 +---
 lib/graph/rte_graph_worker_common.h | 456 +
 3 files changed, 515 insertions(+), 496 deletions(-)
 create mode 100644 lib/graph/rte_graph_model_rtc.h
 create mode 100644 lib/graph/rte_graph_worker_common.h

diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h
new file mode 100644
index 00..fb58730bde
--- /dev/null
+++ b/lib/graph/rte_graph_model_rtc.h
@@ -0,0 +1,57 @@
+#include "rte_graph_worker_common.h"
+
+/**
+ * Perform graph walk on the circular buffer and invoke the process function
+ * of the nodes and collect the stats.
+ *
+ * @param graph
+ *   Graph pointer returned from rte_graph_lookup function.
+ *
+ * @see rte_graph_lookup()
+ */
+static inline void
+rte_graph_walk_rtc(struct rte_graph *graph)
+{
+   const rte_graph_off_t *cir_start = graph->cir_start;
+   const rte_node_t mask = graph->cir_mask;
+   uint32_t head = graph->head;
+   struct rte_node *node;
+   uint64_t start;
+   uint16_t rc;
+   void **objs;
+
+   /*
+* Walk on the source node(s) ((cir_start - head) -> cir_start) and then
+* on the pending streams (cir_start -> (cir_start + mask) -> cir_start)
+* in a circular buffer fashion.
+*
+*  +-+ <= cir_start - head [number of source nodes]
+*  | |
+*  | ... | <= source nodes
+*  | |
+*  +-+ <= cir_start [head = 0] [tail = 0]
+*  | |
+*  | ... | <= pending streams
+*  | |
+*  +-+ <= cir_start + mask
+*/
+   while (likely(head != graph->tail)) {
+   node = (struct rte_node *)RTE_PTR_ADD(graph, 
cir_start[(int32_t)head++]);
+   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
+   objs = node->objs;
+   rte_prefetch0(objs);
+
+   if (rte_graph_has_stats_feature()) {
+   start = rte_rdtsc();
+   rc = node->process(graph, node, objs, node->idx);
+   node->total_cycles += rte_rdtsc() - start;
+   node->total_calls++;
+   node->total_objs += rc;
+   } else {
+   node->process(graph, node, objs, node->idx);
+   }
+   node->idx = 0;
+   head = likely((int32_t)head > 0) ? head & mask : head;
+   }
+   graph->tail = 0;
+}
diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index 6dc7461659..54d1390786 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -1,122 +1,4 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(C) 2020 Marvell International Ltd.
- */
-
-#ifndef _RTE_GRAPH_WORKER_H_
-#define _RTE_GRAPH_WORKER_H_
-
-/**
- * @file rte_graph_worker.h
- *
- * @warning
- * @b EXPERIMENTAL:
- * All functions in this file may be changed or removed without prior notice.
- *
- * This API allows a worker thread to walk over a graph and nodes to create,
- * process, enqueue and move streams of objects to the next nodes.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "rte_graph.h"
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/**
- * @internal
- *
- * Data structure to hold graph data.
- */
-struct rte_graph {
-   uint32_t tail;   /**< Tail of circular buffer. */
-   uint32_t head;   /**< Head of circular buffer. */
-   uint32_t cir_mask;   /**< Circular buffer wrap around mask. */
-   rte_node_t nb_nodes; /**< Number of nodes in the graph. */
-   rte_graph_off_t *cir_start;  /**< Pointer to circular buffer. */
-   rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */
-   rte_graph_t id; /**< Graph identifier. */
-   int socket; /**< Socket ID where memory is allocated. */
-   char name[RTE_GRAPH_NAMESIZE];  /**< Name of the graph. */
-   uint64_t fence; /**< Fence. */
-} __rte_cache_aligned;
-
-/**
- * @internal
- *
- * Data structure to hold node data.
- */
-struct rte_node {
-   /* Slow path area  */
-   uint64_t fence; /**< Fence. */
-   rte_graph_off_t next;   /**< Index to next node. */
-   rte_node_t id;  /**< Node identifier. */
-   rte_node_t parent_id;   /**< Parent Node identifier. */
-   rte_edge_t nb_edges;/**< Number of edges from this node. */
-   uint32_t realloc_count; /**< Number of times realloced. */
-
-   char parent[RTE_NODE_NAMESIZE]; /**< Parent node n

[PATCH v1 02/13] graph: move node process into inline function

2022-11-16 Thread Zhirun Yan
Node process is a single and reusable block, move the code into an inline
function.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_rtc.h | 18 +---
 lib/graph/rte_graph_worker_common.h | 33 +
 2 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h
index fb58730bde..c80b0ce962 100644
--- a/lib/graph/rte_graph_model_rtc.h
+++ b/lib/graph/rte_graph_model_rtc.h
@@ -16,9 +16,6 @@ rte_graph_walk_rtc(struct rte_graph *graph)
const rte_node_t mask = graph->cir_mask;
uint32_t head = graph->head;
struct rte_node *node;
-   uint64_t start;
-   uint16_t rc;
-   void **objs;
 
/*
 * Walk on the source node(s) ((cir_start - head) -> cir_start) and then
@@ -37,20 +34,7 @@ rte_graph_walk_rtc(struct rte_graph *graph)
 */
while (likely(head != graph->tail)) {
node = (struct rte_node *)RTE_PTR_ADD(graph, 
cir_start[(int32_t)head++]);
-   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
-   objs = node->objs;
-   rte_prefetch0(objs);
-
-   if (rte_graph_has_stats_feature()) {
-   start = rte_rdtsc();
-   rc = node->process(graph, node, objs, node->idx);
-   node->total_cycles += rte_rdtsc() - start;
-   node->total_calls++;
-   node->total_objs += rc;
-   } else {
-   node->process(graph, node, objs, node->idx);
-   }
-   node->idx = 0;
+   __rte_node_process(graph, node);
head = likely((int32_t)head > 0) ? head & mask : head;
}
graph->tail = 0;
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index 91a5de7fa4..b7b2bb958c 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -121,6 +121,39 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 
 /* Fast path helper functions */
 
+/**
+ * @internal
+ *
+ * Enqueue a given node to the tail of the graph reel.
+ *
+ * @param graph
+ *   Pointer Graph object.
+ * @param node
+ *   Pointer to node object to be enqueued.
+ */
+static __rte_always_inline void
+__rte_node_process(struct rte_graph *graph, struct rte_node *node)
+{
+   uint64_t start;
+   uint16_t rc;
+   void **objs;
+
+   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
+   objs = node->objs;
+   rte_prefetch0(objs);
+
+   if (rte_graph_has_stats_feature()) {
+   start = rte_rdtsc();
+   rc = node->process(graph, node, objs, node->idx);
+   node->total_cycles += rte_rdtsc() - start;
+   node->total_calls++;
+   node->total_objs += rc;
+   } else {
+   node->process(graph, node, objs, node->idx);
+   }
+   node->idx = 0;
+}
+
 /**
  * @internal
  *
-- 
2.25.1



[PATCH v1 03/13] graph: add macro to walk on graph circular buffer

2022-11-16 Thread Zhirun Yan
Walking the graph circular buffer is a common operation; turn it into
a macro so that it can be reused by other worker models.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_rtc.h | 23 ++-
 lib/graph/rte_graph_worker_common.h | 23 +++
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h
index c80b0ce962..5474b06063 100644
--- a/lib/graph/rte_graph_model_rtc.h
+++ b/lib/graph/rte_graph_model_rtc.h
@@ -12,30 +12,11 @@
 static inline void
 rte_graph_walk_rtc(struct rte_graph *graph)
 {
-   const rte_graph_off_t *cir_start = graph->cir_start;
-   const rte_node_t mask = graph->cir_mask;
uint32_t head = graph->head;
struct rte_node *node;
 
-   /*
-* Walk on the source node(s) ((cir_start - head) -> cir_start) and then
-* on the pending streams (cir_start -> (cir_start + mask) -> cir_start)
-* in a circular buffer fashion.
-*
-*  +-+ <= cir_start - head [number of source nodes]
-*  | |
-*  | ... | <= source nodes
-*  | |
-*  +-+ <= cir_start [head = 0] [tail = 0]
-*  | |
-*  | ... | <= pending streams
-*  | |
-*  +-+ <= cir_start + mask
-*/
-   while (likely(head != graph->tail)) {
-   node = (struct rte_node *)RTE_PTR_ADD(graph, 
cir_start[(int32_t)head++]);
+   rte_graph_walk_node(graph, head, node)
__rte_node_process(graph, node);
-   head = likely((int32_t)head > 0) ? head & mask : head;
-   }
+
graph->tail = 0;
 }
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index b7b2bb958c..df33204336 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -121,6 +121,29 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 
 /* Fast path helper functions */
 
+/**
+ * Macro to walk on the source node(s) ((cir_start - head) -> cir_start)
+ * and then on the pending streams
+ * (cir_start -> (cir_start + mask) -> cir_start)
+ * in a circular buffer fashion.
+ *
+ * +-+ <= cir_start - head [number of source nodes]
+ * | |
+ * | ... | <= source nodes
+ * | |
+ * +-+ <= cir_start [head = 0] [tail = 0]
+ * | |
+ * | ... | <= pending streams
+ * | |
+ * +-+ <= cir_start + mask
+ */
+#define rte_graph_walk_node(graph, head, node) 
\
+   for ((node) = RTE_PTR_ADD((graph), 
(graph)->cir_start[(int32_t)(head)]);\
+likely((head) != (graph)->tail);   
\
+(head)++,  
\
+(node) = RTE_PTR_ADD((graph), 
(graph)->cir_start[(int32_t)(head)]),\
+(head) = likely((int32_t)(head) > 0) ? (head) & (graph)->cir_mask 
: (head))
+
 /**
  * @internal
  *
-- 
2.25.1



[PATCH v1 04/13] graph: add get/set graph worker model APIs

2022-11-16 Thread Zhirun Yan
Add new get/set APIs to configure the graph worker model, which
determines which model will be used.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_worker.h| 51 +
 lib/graph/rte_graph_worker_common.h | 13 
 lib/graph/version.map   |  3 ++
 3 files changed, 67 insertions(+)

diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index 54d1390786..a0ea0df153 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -1,5 +1,56 @@
 #include "rte_graph_model_rtc.h"
 
+static enum rte_graph_worker_model worker_model = RTE_GRAPH_MODEL_DEFAULT;
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ * Set the graph worker model
+ *
+ * @note This function does not perform any locking, and is only safe to call
+ *before graph running.
+ *
+ * @param name
+ *   Name of the graph worker model.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+__rte_experimental
+static inline int
+rte_graph_worker_model_set(enum rte_graph_worker_model model)
+{
+   if (model >= RTE_GRAPH_MODEL_MAX)
+   goto fail;
+
+   worker_model = model;
+   return 0;
+
+fail:
+   worker_model = RTE_GRAPH_MODEL_DEFAULT;
+   return -1;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Get the graph worker model
+ *
+ * @param name
+ *   Name of the graph worker model.
+ *
+ * @return
+ *   Graph worker model on success.
+ */
+__rte_experimental
+static inline
+enum rte_graph_worker_model
+rte_graph_worker_model_get(void)
+{
+   return worker_model;
+}
+
 /**
  * Perform graph walk on the circular buffer and invoke the process function
  * of the nodes and collect the stats.
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index df33204336..507a344afd 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -86,6 +86,19 @@ struct rte_node {
struct rte_node *nodes[] __rte_cache_min_aligned; /**< Next nodes. */
 } __rte_cache_aligned;
 
+
+
+/** Graph worker models */
+enum rte_graph_worker_model {
+#define WORKER_MODEL_DEFAULT "default"
+   RTE_GRAPH_MODEL_DEFAULT = 0,
+#define WORKER_MODEL_RTC "rtc"
+   RTE_GRAPH_MODEL_RTC,
+#define WORKER_MODEL_GENERIC "generic"
+   RTE_GRAPH_MODEL_GENERIC,
+   RTE_GRAPH_MODEL_MAX,
+};
+
 /**
  * @internal
  *
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 13b838752d..eea73ec9ca 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -43,5 +43,8 @@ EXPERIMENTAL {
rte_node_next_stream_put;
rte_node_next_stream_move;
 
+   rte_graph_worker_model_set;
+   rte_graph_worker_model_get;
+
local: *;
 };
-- 
2.25.1



[PATCH v1 05/13] graph: introduce core affinity API

2022-11-16 Thread Zhirun Yan
1. Add lcore_id to node to hold the affinity core id.
2. Implement rte_node_model_generic_set_lcore_affinity to bind a node
   to one lcore.
3. Update the version map for the graph public API.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_private.h   |  1 +
 lib/graph/meson.build   |  1 +
 lib/graph/node.c|  1 +
 lib/graph/rte_graph_model_generic.c | 31 +
 lib/graph/rte_graph_model_generic.h | 43 +
 lib/graph/version.map   |  2 ++
 6 files changed, 79 insertions(+)
 create mode 100644 lib/graph/rte_graph_model_generic.c
 create mode 100644 lib/graph/rte_graph_model_generic.h

diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index f9a85c8926..627090f802 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -49,6 +49,7 @@ struct node {
STAILQ_ENTRY(node) next;  /**< Next node in the list. */
char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */
uint64_t flags;   /**< Node configuration flag. */
+   unsigned int lcore_id;/**< Node runs on the Lcore ID */
rte_node_process_t process;   /**< Node process function. */
rte_node_init_t init; /**< Node init function. */
rte_node_fini_t fini; /**< Node fini function. */
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index c7327549e8..8c8b11ed27 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -14,6 +14,7 @@ sources = files(
 'graph_debug.c',
 'graph_stats.c',
 'graph_populate.c',
+'rte_graph_model_generic.c',
 )
 headers = files('rte_graph.h', 'rte_graph_worker.h')
 
diff --git a/lib/graph/node.c b/lib/graph/node.c
index fc6345de07..8ad4b3cbeb 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -100,6 +100,7 @@ __rte_node_register(const struct rte_node_register *reg)
goto free;
}
 
+   node->lcore_id = RTE_MAX_LCORE;
node->id = node_id++;
 
/* Add the node at tail */
diff --git a/lib/graph/rte_graph_model_generic.c 
b/lib/graph/rte_graph_model_generic.c
new file mode 100644
index 00..54ff659c7b
--- /dev/null
+++ b/lib/graph/rte_graph_model_generic.c
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Intel Corporation
+ */
+
+#include "graph_private.h"
+#include "rte_graph_model_generic.h"
+
+int
+rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int 
lcore_id)
+{
+   struct node *node;
+   int ret = -EINVAL;
+
+   if (lcore_id >= RTE_MAX_LCORE)
+   return ret;
+
+   graph_spinlock_lock();
+
+   STAILQ_FOREACH(node, node_list_head_get(), next) {
+   if (strncmp(node->name, name, RTE_NODE_NAMESIZE) == 0) {
+   node->lcore_id = lcore_id;
+   ret = 0;
+   break;
+   }
+   }
+
+   graph_spinlock_unlock();
+
+   return ret;
+}
+
diff --git a/lib/graph/rte_graph_model_generic.h 
b/lib/graph/rte_graph_model_generic.h
new file mode 100644
index 00..20ca48a9e3
--- /dev/null
+++ b/lib/graph/rte_graph_model_generic.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Intel Corporation
+ */
+
+#ifndef _RTE_GRAPH_MODEL_GENERIC_H_
+#define _RTE_GRAPH_MODEL_GENERIC_H_
+
+/**
+ * @file rte_graph_model_generic.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * This API allows a worker thread to walk over a graph and nodes to create,
+ * process, enqueue and move streams of objects to the next nodes.
+ */
+#include "rte_graph_worker_common.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Set lcore affinity to the node.
+ *
+ * @param name
+ *   Valid node name. In the case of the cloned node, the name will be
+ * "parent node name" + "-" + name.
+ * @param lcore_id
+ *   The lcore ID value.
+ *
+ * @return
+ *   0 on success, error otherwise.
+ */
+__rte_experimental
+int rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int 
lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRAPH_MODEL_GENERIC_H_ */
diff --git a/lib/graph/version.map b/lib/graph/version.map
index eea73ec9ca..33ff055be6 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -46,5 +46,7 @@ EXPERIMENTAL {
rte_graph_worker_model_set;
rte_graph_worker_model_get;
 
+   rte_node_model_generic_set_lcore_affinity;
+
local: *;
 };
-- 
2.25.1



[PATCH v1 06/13] graph: introduce graph affinity API

2022-11-16 Thread Zhirun Yan
Add lcore_id to graph to hold the affinity core id the graph should run on.
Add bind/unbind APIs to set/unset the graph affinity attribute. lcore_id is
set to RTE_MAX_LCORE by default, which means the attribute is not enabled.

Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 59 +++
 lib/graph/graph_private.h |  2 ++
 lib/graph/rte_graph.h | 22 +++
 lib/graph/version.map |  2 ++
 4 files changed, 85 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index 3a617cc369..a8d8eb633e 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -245,6 +245,64 @@ graph_mem_fixup_secondary(struct rte_graph *graph)
return graph_mem_fixup_node_ctx(graph);
 }
 
+static __rte_always_inline bool
+graph_src_node_avail(struct graph *graph)
+{
+   struct graph_node *graph_node;
+
+   STAILQ_FOREACH(graph_node, &graph->node_list, next)
+   if ((graph_node->node->flags & RTE_NODE_SOURCE_F) &&
+   (graph_node->node->lcore_id == RTE_MAX_LCORE ||
+graph->lcore_id == graph_node->node->lcore_id))
+   return true;
+
+   return false;
+}
+
+int
+rte_graph_bind_core(rte_graph_t id, int lcore)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   if (!rte_lcore_is_enabled(lcore))
+   SET_ERR_JMP(ENOLINK, fail,
+   "lcore %d not enabled\n",
+   lcore);
+
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   break;
+
+   graph->lcore_id = lcore;
+   graph->socket = rte_lcore_to_socket_id(lcore);
+
+   /* check the availability of source node */
+   if (!graph_src_node_avail(graph))
+   graph->graph->head = 0;
+
+   return 0;
+
+fail:
+   return -rte_errno;
+}
+
+void
+rte_graph_unbind_core(rte_graph_t id)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   break;
+
+   graph->lcore_id = RTE_MAX_LCORE;
+
+fail:
+   return;
+}
+
 struct rte_graph *
 rte_graph_lookup(const char *name)
 {
@@ -328,6 +386,7 @@ rte_graph_create(const char *name, struct rte_graph_param 
*prm)
graph->src_node_count = src_node_count;
graph->node_count = graph_nodes_count(graph);
graph->id = graph_id;
+   graph->lcore_id = RTE_MAX_LCORE;
 
/* Allocate the Graph fast path memory and populate the data */
if (graph_fp_mem_create(graph))
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 627090f802..7326975a86 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -97,6 +97,8 @@ struct graph {
/**< Circular buffer mask for wrap around. */
rte_graph_t id;
/**< Graph identifier. */
+   unsigned int lcore_id;
+   /**< Lcore identifier where the graph prefer to run on. */
size_t mem_sz;
/**< Memory size of the graph. */
int socket;
diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h
index b32c4bc217..1d938f6979 100644
--- a/lib/graph/rte_graph.h
+++ b/lib/graph/rte_graph.h
@@ -280,6 +280,28 @@ char *rte_graph_id_to_name(rte_graph_t id);
 __rte_experimental
 int rte_graph_export(const char *name, FILE *f);
 
+/**
+ * Set graph lcore affinity attribute
+ *
+ * @param id
+ *   Graph id to get the pointer of graph object
+ * @param lcore
+ * The lcore where the graph will run on
+ * @return
+ *   0 on success, error otherwise.
+ */
+__rte_experimental
+int rte_graph_bind_core(rte_graph_t id, int lcore);
+
+/**
+ * Unset the graph lcore affinity attribute
+ *
+ * @param id
+ * Graph id to get the pointer of graph object
+ */
+__rte_experimental
+void rte_graph_unbind_core(rte_graph_t id);
+
 /**
  * Get graph object from its name.
  *
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 33ff055be6..1c599b5b47 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -18,6 +18,8 @@ EXPERIMENTAL {
rte_graph_node_get_by_name;
rte_graph_obj_dump;
rte_graph_walk;
+   rte_graph_bind_core;
+   rte_graph_unbind_core;
 
rte_graph_cluster_stats_create;
rte_graph_cluster_stats_destroy;
-- 
2.25.1



[PATCH v1 07/13] graph: introduce graph clone API for other worker core

2022-11-16 Thread Zhirun Yan
This patch adds a graph API to clone the graph object for a specified
worker core. The new graph also clones all the nodes.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 110 ++
 lib/graph/graph_private.h |   2 +
 lib/graph/rte_graph.h |  20 +++
 lib/graph/version.map |   1 +
 4 files changed, 133 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index a8d8eb633e..17a9c87032 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -386,6 +386,7 @@ rte_graph_create(const char *name, struct rte_graph_param 
*prm)
graph->src_node_count = src_node_count;
graph->node_count = graph_nodes_count(graph);
graph->id = graph_id;
+   graph->parent_id = RTE_GRAPH_ID_INVALID;
graph->lcore_id = RTE_MAX_LCORE;
 
/* Allocate the Graph fast path memory and populate the data */
@@ -447,6 +448,115 @@ rte_graph_destroy(rte_graph_t id)
return rc;
 }
 
+static int
+clone_name(struct graph *graph, struct graph *parent_graph, const char *name)
+{
+   ssize_t sz, rc;
+
+#define SZ RTE_GRAPH_NAMESIZE
+   rc = rte_strscpy(graph->name, parent_graph->name, SZ);
+   if (rc < 0)
+   goto fail;
+   sz = rc;
+   rc = rte_strscpy(graph->name + sz, "-", RTE_MAX((int16_t)(SZ - sz), 0));
+   if (rc < 0)
+   goto fail;
+   sz += rc;
+   sz = rte_strscpy(graph->name + sz, name, RTE_MAX((int16_t)(SZ - sz), 
0));
+   if (sz < 0)
+   goto fail;
+
+   return 0;
+fail:
+   rte_errno = E2BIG;
+   return -rte_errno;
+}
+
+static rte_graph_t
+graph_clone(struct graph *parent_graph, const char *name)
+{
+   struct graph_node *graph_node;
+   struct graph *graph;
+
+   graph_spinlock_lock();
+
+   /* Don't allow to clone a node from a cloned graph */
+   if (parent_graph->parent_id != RTE_GRAPH_ID_INVALID)
+   SET_ERR_JMP(EEXIST, fail, "A cloned graph is not allowed to be 
cloned");
+
+   /* Create graph object */
+   graph = calloc(1, sizeof(*graph));
+   if (graph == NULL)
+   SET_ERR_JMP(ENOMEM, fail, "Failed to calloc cloned graph 
object");
+
+   /* Naming ceremony of the new graph. name is node->name + "-" + name */
+   if (clone_name(graph, parent_graph, name))
+   goto free;
+
+   /* Check for existence of duplicate graph */
+   if (rte_graph_from_name(graph->name) != RTE_GRAPH_ID_INVALID)
+   SET_ERR_JMP(EEXIST, free, "Found duplicate graph %s",
+   graph->name);
+
+   /* Clone nodes from parent graph firstly */
+   STAILQ_INIT(&graph->node_list);
+   STAILQ_FOREACH(graph_node, &parent_graph->node_list, next) {
+   if (graph_node_add(graph, graph_node->node))
+   goto graph_cleanup;
+   }
+
+   /* Just update adjacency list of all nodes in the graph */
+   if (graph_adjacency_list_update(graph))
+   goto graph_cleanup;
+
+   /* Initialize the graph object */
+   graph->src_node_count = parent_graph->src_node_count;
+   graph->node_count = parent_graph->node_count;
+   graph->parent_id = parent_graph->id;
+   graph->lcore_id = parent_graph->lcore_id;
+   graph->socket = parent_graph->socket;
+   graph->id = graph_id;
+
+   /* Allocate the Graph fast path memory and populate the data */
+   if (graph_fp_mem_create(graph))
+   goto graph_cleanup;
+
+   /* Call init() of the all the nodes in the graph */
+   if (graph_node_init(graph))
+   goto graph_mem_destroy;
+
+   /* All good, Lets add the graph to the list */
+   graph_id++;
+   STAILQ_INSERT_TAIL(&graph_list, graph, next);
+
+   graph_spinlock_unlock();
+   return graph->id;
+
+graph_mem_destroy:
+   graph_fp_mem_destroy(graph);
+graph_cleanup:
+   graph_cleanup(graph);
+free:
+   free(graph);
+fail:
+   graph_spinlock_unlock();
+   return RTE_GRAPH_ID_INVALID;
+}
+
+rte_graph_t
+rte_graph_clone(rte_graph_t id, const char *name)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   return graph_clone(graph, name);
+
+fail:
+   return RTE_GRAPH_ID_INVALID;
+}
+
 rte_graph_t
 rte_graph_from_name(const char *name)
 {
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 7326975a86..c1f2aadd42 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -97,6 +97,8 @@ struct graph {
/**< Circular buffer mask for wrap around. */
rte_graph_t id;
/**< Graph identifier. */
+   rte_graph_t parent_id;
+   /**< Parent graph identifier. */
unsigned int lcore_id;
/**< Lcore identifier where the graph prefer to run on. */
size_t me

[PATCH v1 09/13] graph: enable create and destroy graph scheduling workqueue

2022-11-16 Thread Zhirun Yan
This patch hooks the creation and destruction of the scheduling
workqueue into the common graph operations.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index 8ea0daaa35..63d9bcffd2 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -428,6 +428,10 @@ rte_graph_destroy(rte_graph_t id)
while (graph != NULL) {
tmp = STAILQ_NEXT(graph, next);
if (graph->id == id) {
+   /* Destroy the schedule work queue if has */
+   if (rte_graph_worker_model_get() == 
RTE_GRAPH_MODEL_GENERIC)
+   graph_sched_wq_destroy(graph);
+
/* Call fini() of the all the nodes in the graph */
graph_node_fini(graph);
/* Destroy graph fast path memory */
@@ -522,6 +526,11 @@ graph_clone(struct graph *parent_graph, const char *name)
if (graph_fp_mem_create(graph))
goto graph_cleanup;
 
+   /* Create the graph schedule work queue */
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC &&
+   graph_sched_wq_create(graph, parent_graph))
+   goto graph_mem_destroy;
+
/* Call init() of the all the nodes in the graph */
if (graph_node_init(graph))
goto graph_mem_destroy;
-- 
2.25.1



[PATCH v1 08/13] graph: introduce stream moving cross cores

2022-11-16 Thread Zhirun Yan
This patch introduces the key functions that allow a worker thread to
enqueue and move streams of objects to the next nodes across
different cores.

1. Add graph_sched_wq_node to hold the graph scheduling workqueue node
stream.
2. Add workqueue helper functions to create/destroy/enqueue/dequeue.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c   |   1 +
 lib/graph/graph_populate.c  |   1 +
 lib/graph/graph_private.h   |  39 
 lib/graph/meson.build   |   2 +-
 lib/graph/rte_graph_model_generic.c | 145 
 lib/graph/rte_graph_model_generic.h |  35 +++
 lib/graph/rte_graph_worker_common.h |  18 
 7 files changed, 240 insertions(+), 1 deletion(-)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index 17a9c87032..8ea0daaa35 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -275,6 +275,7 @@ rte_graph_bind_core(rte_graph_t id, int lcore)
break;
 
graph->lcore_id = lcore;
+   graph->graph->lcore_id = graph->lcore_id;
graph->socket = rte_lcore_to_socket_id(lcore);
 
/* check the availability of source node */
diff --git a/lib/graph/graph_populate.c b/lib/graph/graph_populate.c
index 102fd6c29b..26f9670406 100644
--- a/lib/graph/graph_populate.c
+++ b/lib/graph/graph_populate.c
@@ -84,6 +84,7 @@ graph_nodes_populate(struct graph *_graph)
}
node->id = graph_node->node->id;
node->parent_id = pid;
+   node->lcore_id = graph_node->node->lcore_id;
nb_edges = graph_node->node->nb_edges;
node->nb_edges = nb_edges;
off += sizeof(struct rte_node);
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index c1f2aadd42..f58d0d1d63 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -59,6 +59,18 @@ struct node {
char next_nodes[][RTE_NODE_NAMESIZE]; /**< Names of next nodes. */
 };
 
+/**
+ * @internal
+ *
+ * Structure that holds the graph scheduling workqueue node stream.
+ * Used for generic worker model.
+ */
+struct graph_sched_wq_node {
+   rte_graph_off_t node_off;
+   uint16_t nb_objs;
+   void *objs[RTE_GRAPH_BURST_SIZE];
+} __rte_cache_aligned;
+
 /**
  * @internal
  *
@@ -349,4 +361,31 @@ void graph_dump(FILE *f, struct graph *g);
  */
 void node_dump(FILE *f, struct node *n);
 
+/**
+ * @internal
+ *
+ * Create the graph schedule work queue. And all cloned graphs attached to the
+ * parent graph MUST be destroyed together for fast schedule design limitation.
+ *
+ * @param _graph
+ *   The graph object
+ * @param _parent_graph
+ *   The parent graph object which holds the run-queue head.
+ *
+ * @return
+ *   - 0: Success.
+ *   - <0: Graph schedule work queue related error.
+ */
+int graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph);
+
+/**
+ * @internal
+ *
+ * Destroy the graph schedule work queue.
+ *
+ * @param _graph
+ *   The graph object
+ */
+void graph_sched_wq_destroy(struct graph *_graph);
+
 #endif /* _RTE_GRAPH_PRIVATE_H_ */
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index 8c8b11ed27..f93ab6fdcb 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -18,4 +18,4 @@ sources = files(
 )
 headers = files('rte_graph.h', 'rte_graph_worker.h')
 
-deps += ['eal']
+deps += ['eal', 'mempool', 'ring']
diff --git a/lib/graph/rte_graph_model_generic.c 
b/lib/graph/rte_graph_model_generic.c
index 54ff659c7b..c862237432 100644
--- a/lib/graph/rte_graph_model_generic.c
+++ b/lib/graph/rte_graph_model_generic.c
@@ -5,6 +5,151 @@
 #include "graph_private.h"
 #include "rte_graph_model_generic.h"
 
+int
+graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph)
+{
+   struct rte_graph *parent_graph = _parent_graph->graph;
+   struct rte_graph *graph = _graph->graph;
+   unsigned int wq_size;
+
+   wq_size = GRAPH_SCHED_WQ_SIZE(graph->nb_nodes);
+   wq_size = rte_align32pow2(wq_size + 1);
+
+   graph->wq = rte_ring_create(graph->name, wq_size, graph->socket,
+   RING_F_SC_DEQ);
+   if (graph->wq == NULL)
+   SET_ERR_JMP(EIO, fail, "Failed to allocate graph WQ");
+
+   graph->mp = rte_mempool_create(graph->name, wq_size,
+  sizeof(struct graph_sched_wq_node),
+  0, 0, NULL, NULL, NULL, NULL,
+  graph->socket, MEMPOOL_F_SP_PUT);
+   if (graph->mp == NULL)
+   SET_ERR_JMP(EIO, fail_mp,
+   "Failed to allocate graph WQ schedule entry");
+
+   graph->lcore_id = _graph->lcore_id;
+
+   if (parent_graph->rq == NULL) {
+   parent_graph->rq = &parent_graph->rq_head;
+   SLIST_INIT(parent_graph->rq);
+   }
+
+   graph->rq = parent_graph->rq;
+   SLIS

[PATCH v1 10/13] graph: introduce graph walk by cross-core dispatch

2022-11-16 Thread Zhirun Yan
This patch introduces the task scheduler mechanism to enable dispatching
tasks to other worker cores. Currently, there is only a local work
queue for one graph to walk. We introduce a scheduler work queue on
each worker core for dispatching tasks. The walk is performed on the
scheduler work queue first, then the local work queue is handled.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_generic.h | 36 +
 1 file changed, 36 insertions(+)

diff --git a/lib/graph/rte_graph_model_generic.h 
b/lib/graph/rte_graph_model_generic.h
index 5715fc8ffb..c29fc31309 100644
--- a/lib/graph/rte_graph_model_generic.h
+++ b/lib/graph/rte_graph_model_generic.h
@@ -71,6 +71,42 @@ void __rte_noinline __rte_graph_sched_wq_process(struct 
rte_graph *graph);
 __rte_experimental
 int rte_node_model_generic_set_lcore_affinity(const char *name, unsigned int 
lcore_id);
 
+/**
+ * Perform graph walk on the circular buffer and invoke the process function
+ * of the nodes and collect the stats.
+ *
+ * @param graph
+ *   Graph pointer returned from rte_graph_lookup function.
+ *
+ * @see rte_graph_lookup()
+ */
+__rte_experimental
+static inline void
+rte_graph_walk_generic(struct rte_graph *graph)
+{
+   uint32_t head = graph->head;
+   struct rte_node *node;
+
+   if (graph->wq != NULL)
+   __rte_graph_sched_wq_process(graph);
+
+   rte_graph_walk_node(graph, head, node) {
+   /* skip the src nodes which not bind with current worker */
+   if ((int32_t)head < 0 && node->lcore_id != graph->lcore_id)
+   continue;
+
+   /* Schedule the node until all task/objs are done */
+   if (node->lcore_id != RTE_MAX_LCORE &&
+   graph->lcore_id != node->lcore_id && graph->rq != NULL &&
+   __rte_graph_sched_node_enqueue(node, graph->rq))
+   continue;
+
+   __rte_node_process(graph, node);
+   }
+
+   graph->tail = 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.25.1



[PATCH v1 11/13] graph: enable graph generic scheduler model

2022-11-16 Thread Zhirun Yan
This patch enables choosing the new scheduler model.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_worker.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index a0ea0df153..dea207ca46 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -1,4 +1,5 @@
 #include "rte_graph_model_rtc.h"
+#include "rte_graph_model_generic.h"
 
 static enum rte_graph_worker_model worker_model = RTE_GRAPH_MODEL_DEFAULT;
 
@@ -64,5 +65,11 @@ __rte_experimental
 static inline void
 rte_graph_walk(struct rte_graph *graph)
 {
-   rte_graph_walk_rtc(graph);
+   int model = rte_graph_worker_model_get();
+
+   if (model == RTE_GRAPH_MODEL_DEFAULT ||
+   model == RTE_GRAPH_MODEL_RTC)
+   rte_graph_walk_rtc(graph);
+   else if (model == RTE_GRAPH_MODEL_GENERIC)
+   rte_graph_walk_generic(graph);
 }
-- 
2.25.1



[PATCH v1 12/13] graph: add stats for cross-core dispatching

2022-11-16 Thread Zhirun Yan
Add stats for cross-core dispatching scheduler if stats collection is
enabled.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_debug.c |  6 +++
 lib/graph/graph_stats.c | 74 +
 lib/graph/rte_graph.h   |  2 +
 lib/graph/rte_graph_model_generic.c |  3 ++
 lib/graph/rte_graph_worker_common.h |  2 +
 5 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c
index b84412f5dd..080ba16ad9 100644
--- a/lib/graph/graph_debug.c
+++ b/lib/graph/graph_debug.c
@@ -74,6 +74,12 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all)
fprintf(f, "   size=%d\n", n->size);
fprintf(f, "   idx=%d\n", n->idx);
fprintf(f, "   total_objs=%" PRId64 "\n", n->total_objs);
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) {
+   fprintf(f, "   total_sched_objs=%" PRId64 "\n",
+   n->total_sched_objs);
+   fprintf(f, "   total_sched_fail=%" PRId64 "\n",
+   n->total_sched_fail);
+   }
fprintf(f, "   total_calls=%" PRId64 "\n", n->total_calls);
for (i = 0; i < n->nb_edges; i++)
fprintf(f, "  edge[%d] <%s>\n", i,
diff --git a/lib/graph/graph_stats.c b/lib/graph/graph_stats.c
index c0140ba922..801fcb832d 100644
--- a/lib/graph/graph_stats.c
+++ b/lib/graph/graph_stats.c
@@ -40,13 +40,19 @@ struct rte_graph_cluster_stats {
struct cluster_node clusters[];
 } __rte_cache_aligned;
 
+#define boarder_model_generic()
  \
+   fprintf(f, "+---+---+" \
+  "---+---+---+---+" \
+  "---+---+-" \
+  "--+\n")
+
 #define boarder()  
\
fprintf(f, "+---+---+" \
   "---+---+---+---+-" \
   "--+\n")
 
 static inline void
-print_banner(FILE *f)
+print_banner_default(FILE *f)
 {
boarder();
fprintf(f, "%-32s%-16s%-16s%-16s%-16s%-16s%-16s\n", "|Node", "|calls",
@@ -55,6 +61,27 @@ print_banner(FILE *f)
boarder();
 }
 
+static inline void
+print_banner_generic(FILE *f)
+{
+   boarder_model_generic();
+   fprintf(f, "%-32s%-16s%-16s%-16s%-16s%-16s%-16s%-16s%-16s\n",
+   "|Node", "|calls",
+   "|objs", "|sched objs", "|sched fail",
+   "|realloc_count", "|objs/call", "|objs/sec(10E6)",
+   "|cycles/call|");
+   boarder_model_generic();
+}
+
+static inline void
+print_banner(FILE *f)
+{
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC)
+   print_banner_generic(f);
+   else
+   print_banner_default(f);
+}
+
 static inline void
 print_node(FILE *f, const struct rte_graph_cluster_node_stats *stat)
 {
@@ -76,11 +103,21 @@ print_node(FILE *f, const struct 
rte_graph_cluster_node_stats *stat)
objs_per_sec = ts_per_hz ? (objs - prev_objs) / ts_per_hz : 0;
objs_per_sec /= 100;
 
-   fprintf(f,
-   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
-   "|%-15.3f|%-15.6f|%-11.4f|\n",
-   stat->name, calls, objs, stat->realloc_count, objs_per_call,
-   objs_per_sec, cycles_per_call);
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC) {
+   fprintf(f,
+   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15.3f|%-15.6f|%-11.4f|\n",
+   stat->name, calls, objs, stat->sched_objs,
+   stat->sched_fail, stat->realloc_count, objs_per_call,
+   objs_per_sec, cycles_per_call);
+   } else {
+   fprintf(f,
+   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15.3f|%-15.6f|%-11.4f|\n",
+   stat->name, calls, objs, stat->realloc_count, 
objs_per_call,
+   objs_per_sec, cycles_per_call);
+   }
 }
 
 static int
@@ -88,13 +125,20 @@ graph_cluster_stats_cb(bool is_first, bool is_last, void 
*cookie,
   const struct rte_graph_cluster_node_stats *stat)
 {
FILE *f = cookie;
+   int model;
+
+   model = rte_graph_worker_model_get();
 
if (unlikely(is_first))
print_banner(f);
if (stat->objs)
print_node(f, stat);
-   if (unlikely(is_last))
- 

[PATCH v1 13/13] examples/l3fwd-graph: introduce generic worker model

2022-11-16 Thread Zhirun Yan
Add a new parameter "model" to choose the generic or rtc worker model.
In the generic model, nodes are bound to worker cores successively.

Note:
only one RX node is supported for the remote model in the current implementation.

./dpdk-l3fwd-graph  -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P
--model="generic"

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 examples/l3fwd-graph/main.c | 218 +---
 1 file changed, 179 insertions(+), 39 deletions(-)

diff --git a/examples/l3fwd-graph/main.c b/examples/l3fwd-graph/main.c
index 6dcb6ee92b..c145a3e3e8 100644
--- a/examples/l3fwd-graph/main.c
+++ b/examples/l3fwd-graph/main.c
@@ -147,6 +147,19 @@ static struct ipv4_l3fwd_lpm_route 
ipv4_l3fwd_lpm_route_array[] = {
{RTE_IPV4(198, 18, 6, 0), 24, 6}, {RTE_IPV4(198, 18, 7, 0), 24, 7},
 };
 
+static int
+check_worker_model_params(void)
+{
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_GENERIC &&
+   nb_lcore_params > 1) {
+   printf("Exceeded max number of lcore params for remote model: 
%hu\n",
+  nb_lcore_params);
+   return -1;
+   }
+
+   return 0;
+}
+
 static int
 check_lcore_params(void)
 {
@@ -291,6 +304,20 @@ parse_max_pkt_len(const char *pktlen)
return len;
 }
 
+static int
+parse_worker_model(const char *model)
+{
+   if (strcmp(model, WORKER_MODEL_DEFAULT) == 0)
+   return RTE_GRAPH_MODEL_DEFAULT;
+   else if (strcmp(model, WORKER_MODEL_GENERIC) == 0) {
+   rte_graph_worker_model_set(RTE_GRAPH_MODEL_GENERIC);
+   return RTE_GRAPH_MODEL_GENERIC;
+   }
+   rte_exit(EXIT_FAILURE, "Invalid worker model: %s", model);
+
+   return RTE_GRAPH_MODEL_MAX;
+}
+
 static int
 parse_portmask(const char *portmask)
 {
@@ -404,6 +431,7 @@ static const char short_options[] = "p:" /* portmask */
 #define CMD_LINE_OPT_NO_NUMA  "no-numa"
 #define CMD_LINE_OPT_MAX_PKT_LEN   "max-pkt-len"
 #define CMD_LINE_OPT_PER_PORT_POOL "per-port-pool"
+#define CMD_LINE_OPT_WORKER_MODEL  "model"
 enum {
/* Long options mapped to a short option */
 
@@ -416,6 +444,7 @@ enum {
CMD_LINE_OPT_NO_NUMA_NUM,
CMD_LINE_OPT_MAX_PKT_LEN_NUM,
CMD_LINE_OPT_PARSE_PER_PORT_POOL,
+   CMD_LINE_OPT_WORKER_MODEL_TYPE,
 };
 
 static const struct option lgopts[] = {
@@ -424,6 +453,7 @@ static const struct option lgopts[] = {
{CMD_LINE_OPT_NO_NUMA, 0, 0, CMD_LINE_OPT_NO_NUMA_NUM},
{CMD_LINE_OPT_MAX_PKT_LEN, 1, 0, CMD_LINE_OPT_MAX_PKT_LEN_NUM},
{CMD_LINE_OPT_PER_PORT_POOL, 0, 0, CMD_LINE_OPT_PARSE_PER_PORT_POOL},
+   {CMD_LINE_OPT_WORKER_MODEL, 1, 0, CMD_LINE_OPT_WORKER_MODEL_TYPE},
{NULL, 0, 0, 0},
 };
 
@@ -498,6 +528,11 @@ parse_args(int argc, char **argv)
per_port_pool = 1;
break;
 
+   case CMD_LINE_OPT_WORKER_MODEL_TYPE:
+   printf("Use new worker model: %s\n", optarg);
+   parse_worker_model(optarg);
+   break;
+
default:
print_usage(prgname);
return -1;
@@ -735,6 +770,140 @@ config_port_max_pkt_len(struct rte_eth_conf *conf,
return 0;
 }
 
+static void
+graph_config_generic(struct rte_graph_param graph_conf)
+{
+   uint16_t nb_patterns = graph_conf.nb_node_patterns;
+   int worker_count = rte_lcore_count() - 1;
+   int main_lcore_id = rte_get_main_lcore();
+   int worker_lcore = main_lcore_id;
+   rte_graph_t main_graph_id = 0;
+   struct rte_node *node_tmp;
+   struct lcore_conf *qconf;
+   struct rte_graph *graph;
+   rte_graph_t graph_id;
+   rte_graph_off_t off;
+   int n_rx_node = 0;
+   rte_node_t count;
+   rte_edge_t i;
+   int ret;
+
+   for (int j = 0; j < nb_lcore_params; j++) {
+   qconf = &lcore_conf[lcore_params[j].lcore_id];
+   /* Add rx node patterns of all lcore */
+   for (i = 0; i < qconf->n_rx_queue; i++) {
+   char *node_name = qconf->rx_queue_list[i].node_name;
+
+   graph_conf.node_patterns[nb_patterns + n_rx_node + i] = 
node_name;
+   n_rx_node++;
+   ret = 
rte_node_model_generic_set_lcore_affinity(node_name,
+   
lcore_params[j].lcore_id);
+   if (ret == 0)
+   printf("Set node %s affinity to lcore %u\n", 
node_name,
+  lcore_params[j].lcore_id);
+   }
+   }
+
+   graph_conf.nb_node_patterns = nb_patterns + n_rx_node;
+   graph_conf.socket_id = rte_lcore_to_socket_id(main_lcore_id);
+
+   snprintf(qconf->name, sizeof(qconf->name), "worker_%u",
+main_lcore_id);
+
+   /* create main graph */

[PATCH V1] doc: add tested Intel platforms with Intel NICs

2022-11-16 Thread Lingli Chen
Add tested Intel platforms with Intel NICs to v22.11 release note.

Signed-off-by: Lingli Chen 
---
 doc/guides/rel_notes/release_22_11.rst | 108 +
 1 file changed, 108 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 5e091403ad..aa9ff1fd63 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -635,3 +635,111 @@ Tested Platforms
This section is a comment. Do not overwrite or remove it.
Also, make sure to start the actual text at the margin.
===
+
+
+* Intel\ |reg| platforms with Intel\ |reg| NICs combinations
+
+  * CPU
+
+* Intel\ |reg| Atom\ |trade| CPU C3758 @ 2.20GHz
+* Intel\ |reg| Xeon\ |reg| CPU D-1553N @ 2.30GHz
+* Intel\ |reg| Xeon\ |reg| CPU E5-2680 v2 @ 2.80GHz
+* Intel\ |reg| Xeon\ |reg| CPU E5-2699 v3 @ 2.30GHz
+* Intel\ |reg| Xeon\ |reg| CPU E5-2699 v4 @ 2.20GHz
+* Intel\ |reg| Xeon\ |reg| D-2796NT CPU @ 2.00GHz
+* Intel\ |reg| Xeon\ |reg| Gold 6139 CPU @ 2.30GHz
+* Intel\ |reg| Xeon\ |reg| Gold 6140M CPU @ 2.30GHz
+* Intel\ |reg| Xeon\ |reg| Gold 6252N CPU @ 2.30GHz
+* Intel\ |reg| Xeon\ |reg| Gold 6348 CPU @ 2.60GHz
+* Intel\ |reg| Xeon\ |reg| Platinum 8180M CPU @ 2.50GHz
+* Intel\ |reg| Xeon\ |reg| Platinum 8280M CPU @ 2.70GHz
+* Intel\ |reg| Xeon\ |reg| Platinum 8380 CPU @ 2.30GHz
+
+  * OS:
+
+* Fedora 36
+* FreeBSD 13.1
+* Red Hat Enterprise Linux Server release 8.6
+* Red Hat Enterprise Linux Server release 9
+* CentOS 7.9
+* Ubuntu 20.04.5
+* Ubuntu 22.04.1
+* Ubuntu 22.10
+* SUSE Linux Enterprise Server 15 SP4
+
+  * NICs:
+
+* Intel\ |reg| Ethernet Controller E810-C for SFP (4x25G)
+
+  * Firmware version: 4.10 0x800151d8 1.3310.0
+  * Device id (pf/vf): 8086:1593 / 8086:1889
+  * Driver version(out-tree): 1.10.6 (ice)
+  * Driver version(in-tree): 5.15.0-46-generic / 
4.18.0-372.9.1.rt7.166.el8.x86_64 (ice)
+  * OS Default DDP: 1.3.30.0
+  * COMMS DDP: 1.3.37.0
+  * Wireless Edge DDP: 1.3.10.0
+
+* Intel\ |reg| Ethernet Controller E810-C for QSFP (2x100G)
+
+  * Firmware version: 4.10 0x8001518e 1.3310.0
+  * Device id (pf/vf): 8086:1592 / 8086:1889
+  * Driver version: 1.10.6 (ice)
+  * OS Default DDP: 1.3.30.0
+  * COMMS DDP: 1.3.37.0
+  * Wireless Edge DDP: 1.3.10.0
+
+* Intel\ |reg| Ethernet Controller E810-XXV for SFP (2x25G)
+
+  * Firmware version: 4.10 0x80015188 1.3310.0
+  * Device id (pf/vf): 8086:159b / 8086:1889
+  * Driver version: 1.10.6 (ice)
+  * OS Default DDP: 1.3.30.0
+  * COMMS DDP: 1.3.37.0
+
+* Intel\ |reg| 82599ES 10 Gigabit Ethernet Controller
+
+  * Firmware version: 0x61bf0001
+  * Device id (pf/vf): 8086:10fb / 8086:10ed
+  * Driver version(out-tree): 5.16.5 (ixgbe)
+  * Driver version(in-tree): 5.15.0-46-generic (ixgbe)
+
+* Intel\ |reg| Ethernet Converged Network Adapter X710-DA4 (4x10G)
+
+  * Firmware version: 9.00 0x8000cead 1.3179.0
+  * Device id (pf/vf): 8086:1572 / 8086:154c
+  * Driver version(out-tree): 2.20.12 (i40e)
+  * Driver version(in-tree): 5.15.0-46-generic (i40e)
+
+* Intel\ |reg| Corporation Ethernet Connection X722 for 10GbE SFP+ (2x10G)
+
+  * Firmware version: 6.00 0x800039ec 1.3179.0
+  * Device id (pf/vf): 8086:37d0 / 8086:37cd
+  * Driver version(out-tree): 2.20.12 (i40e)
+  * Driver version(in-tree): 5.15.0-46-generic (i40e)
+
+* Intel\ |reg| Corporation Ethernet Connection X722 for 10GBASE-T
+
+  * Firmware version: 6.00 0x800039aa 1.2935.0
+  * Device id (pf/vf): 8086:37d2 / 8086:37cd
+  * Driver version(out-tree): 2.20.12 (i40e)
+  * Driver version(in-tree): 5.15.0-46-generic (i40e)
+
+* Intel\ |reg| Ethernet Converged Network Adapter XXV710-DA2 (2x25G)
+
+  * Firmware version: 9.00 0x8000ce90 1.3179.0
+  * Device id (pf/vf): 8086:158b / 8086:154c
+  * Driver version(out-tree): 2.20.12 (i40e)
+  * Driver version(in-tree): 5.15.0-46-generic (i40e)
+
+* Intel\ |reg| Ethernet Converged Network Adapter XL710-QDA2 (2X40G)
+
+  * Firmware version(PF): 9.00 0x8000ce86 1.3179.0
+  * Device id (pf/vf): 8086:1583 / 8086:154c
+  * Driver version(out-tree): 2.20.12 (i40e)
+  * Driver version(in-tree): 5.15.0-46-generic (i40e)
+
+* Intel\ |reg| Ethernet Converged Network Adapter X710-T2L
+
+  * Firmware version: 9.00 0x8000ce67 1.3179.0
+  * Device id (pf): 8086:15ff
+  * Driver version: 2.20.12 (i40e)
-- 
2.17.1



RE: [PATCH V1] doc: add tested Intel platforms with Intel NICs

2022-11-16 Thread Peng, Yuan
Acked-by: Peng, Yuan 

> -Original Message-
> From: Chen, LingliX 
> Sent: Thursday, November 17, 2022 12:36 PM
> To: Zhang, Qi Z ; dev@dpdk.org
> Cc: Peng, Yuan ; Chen, LingliX
> 
> Subject: [PATCH V1] doc: add tested Intel platforms with Intel NICs
> 
> Add tested Intel platforms with Intel NICs to v22.11 release note.
> 
> Signed-off-by: Lingli Chen 
> ---


compile failed with t3.c

2022-11-16 Thread Yi Li
Hi list,
   When I try to compile examples/bpf/t3.c with the command annotated in
that source file, I get this error:
[root@server bpf]# clang -O2 -U __GNUC__ -target bpf -Wno-int-to-void-pointer-cast -c t3.c
In file included from t3.c:27:
In file included from /usr/local/include/rte_mbuf_core.h:22:
/usr/local/include/rte_byteorder.h:30:16: error: invalid output constraint '=Q' in asm
  : [x1] "=Q" (x)
 ^
1 error generated.
[root@server bpf]#

   I think this is the root cause, as noted in the Linux kernel
Documentation/bpf/bpf_devel_QA.txt:

BPF programs may recursively include header file(s) with file scope
inline assembly codes. The default target can handle this well, while
bpf target may fail if bpf backend assembler does not understand these
assembly codes, which is true in most cases.


[PATCH v1 2/3] net/cnxk: add sg2 descriptor support

2022-11-16 Thread Ashwin Sekhar T K
Add support for transmitting packets whose segments come from
different mempools. This is enabled by using SG2 descriptors.
SG2 descriptors are used only when the segment is to be freed
by the HW.
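
A rough worked example of the layout (my reading of the new
cn10k_nix_mbuf_sg_dwords() helper, for illustration only): an SG
subdescriptor packs up to three pointers from the first segment's
aura behind one header word, while every segment from a different
aura gets its own two-word SG2 subdescriptor. A 3-segment mbuf whose
segments all come from one pool therefore needs 4 words (2 SG dwords);
the same mbuf with its second segment drawn from another pool needs
6 words (3 SG dwords).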

Signed-off-by: Ashwin Sekhar T K 
---
 drivers/net/cnxk/cn10k_tx.h | 161 +++-
 1 file changed, 123 insertions(+), 38 deletions(-)

diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
index a4c578354c..3f08a8a473 100644
--- a/drivers/net/cnxk/cn10k_tx.h
+++ b/drivers/net/cnxk/cn10k_tx.h
@@ -54,6 +54,36 @@
 
 #define NIX_NB_SEGS_TO_SEGDW(x) ((NIX_SEGDW_MAGIC >> ((x) << 2)) & 0xF)
 
+static __plt_always_inline uint8_t
+cn10k_nix_mbuf_sg_dwords(struct rte_mbuf *m)
+{
+   uint32_t nb_segs = m->nb_segs;
+   uint16_t aura0, aura;
+   int segw, sg_segs;
+
+   aura0 = roc_npa_aura_handle_to_aura(m->pool->pool_id);
+
+   nb_segs--;
+   segw = 2;
+   sg_segs = 1;
+   while (nb_segs) {
+   m = m->next;
+   aura = roc_npa_aura_handle_to_aura(m->pool->pool_id);
+   if (aura != aura0) {
+   segw += 2 + (sg_segs == 2);
+   sg_segs = 0;
+   } else {
+   segw += (sg_segs == 0); /* SUBDC */
+   segw += 1;  /* IOVA */
+   sg_segs += 1;
+   sg_segs %= 3;
+   }
+   nb_segs--;
+   }
+
+   return (segw + 1) / 2;
+}
+
 static __plt_always_inline void
 cn10k_nix_vwqe_wait_fc(struct cn10k_eth_txq *txq, int64_t req)
 {
@@ -915,15 +945,15 @@ cn10k_nix_xmit_prepare_tstamp(struct cn10k_eth_txq *txq, uintptr_t lmt_addr,
 static __rte_always_inline uint16_t
 cn10k_nix_prepare_mseg(struct rte_mbuf *m, uint64_t *cmd, const uint16_t flags)
 {
+   uint64_t prefree = 0, aura0, aura, nb_segs, segdw;
struct nix_send_hdr_s *send_hdr;
-   union nix_send_sg_s *sg;
+   union nix_send_sg_s *sg, l_sg;
+   union nix_send_sg2_s l_sg2;
struct rte_mbuf *m_next;
-   uint64_t *slist, sg_u;
+   uint8_t off, is_sg2;
uint64_t len, dlen;
uint64_t ol_flags;
-   uint64_t nb_segs;
-   uint64_t segdw;
-   uint8_t off, i;
+   uint64_t *slist;
 
send_hdr = (struct nix_send_hdr_s *)cmd;
 
@@ -938,20 +968,22 @@ cn10k_nix_prepare_mseg(struct rte_mbuf *m, uint64_t *cmd, const uint16_t flags)
ol_flags = m->ol_flags;
 
/* Start from second segment, first segment is already there */
-   i = 1;
-   sg_u = sg->u;
-   len -= sg_u & 0x;
+   is_sg2 = 0;
+   l_sg.u = sg->u;
+   len -= l_sg.u & 0x;
nb_segs = m->nb_segs - 1;
m_next = m->next;
slist = &cmd[3 + off + 1];
 
/* Set invert df if buffer is not to be freed by H/W */
-   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)
-   sg_u |= (cnxk_nix_prefree_seg(m) << 55);
+   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F) {
+   prefree = cnxk_nix_prefree_seg(m);
+   l_sg.i1 = prefree;
+   }
 
-   /* Mark mempool object as "put" since it is freed by NIX */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-   if (!(sg_u & (1ULL << 55)))
+   /* Mark mempool object as "put" since it is freed by NIX */
+   if (!prefree)
RTE_MEMPOOL_CHECK_COOKIES(m->pool, (void **)&m, 1, 0);
rte_io_wmb();
 #endif
@@ -964,55 +996,103 @@ cn10k_nix_prepare_mseg(struct rte_mbuf *m, uint64_t *cmd, const uint16_t flags)
if (!(flags & NIX_TX_MULTI_SEG_F))
goto done;
 
+   aura0 = send_hdr->w0.aura;
m = m_next;
if (!m)
goto done;
 
/* Fill mbuf segments */
do {
+   uint64_t iova;
+
+   /* Save the current mbuf properties. These can get cleared in
+* cnxk_nix_prefree_seg()
+*/
m_next = m->next;
+   iova = rte_mbuf_data_iova(m);
dlen = m->data_len;
len -= dlen;
-   sg_u = sg_u | ((uint64_t)dlen << (i << 4));
-   *slist = rte_mbuf_data_iova(m);
-   /* Set invert df if buffer is not to be freed by H/W */
-   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)
-   sg_u |= (cnxk_nix_prefree_seg(m) << (i + 55));
-   /* Mark mempool object as "put" since it is freed by NIX
-*/
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-   if (!(sg_u & (1ULL << (i + 55
-   RTE_MEMPOOL_CHECK_COOKIES(m->pool, (void **)&m, 1, 0);
-#endif
-   slist++;
-   i++;
+
nb_segs--;
-   if (i > 2 && nb_segs) {
-   i = 0;
+   aura = aura0;
+   prefree = 0;
+
+   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F) {
+   aura = roc_npa_aura_handle_to_aura(m->pool->pool_id)

[PATCH v1 1/3] net/cnxk: rework no-fast-free offload handling

2022-11-16 Thread Ashwin Sekhar T K
Add a separate routine to handle the no-fast-free offload
in the vector Tx path for multi-segmented packets.
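
For context (an assumption from the flag naming on my side, not
something this patch changes): the no-fast-free ("NOFF") Tx routines
are the ones a cnxk port ends up in when the application leaves
RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE out of its Tx offloads, roughly:

#include <string.h>
#include <rte_ethdev.h>

/* Hypothetical helper, for illustration only: configure a port so that
 * the PMD has to take the no-fast-free Tx path.
 */
static int
configure_port_without_fast_free(uint16_t port_id)
{
	struct rte_eth_conf conf;

	memset(&conf, 0, sizeof(conf));
	/* Leaving RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE out of txmode.offloads
	 * means the PMD cannot assume all segments come from one mempool
	 * with refcnt == 1, so it runs cnxk_nix_prefree_seg() on every
	 * segment before the buffer is handed to the NIX HW.
	 */
	conf.txmode.offloads = RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
	return rte_eth_dev_configure(port_id, 1, 1, &conf);
}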

Signed-off-by: Ashwin Sekhar T K 
---
 drivers/net/cnxk/cn10k_tx.h | 124 +---
 1 file changed, 59 insertions(+), 65 deletions(-)

diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
index 815cd2ff1f..a4c578354c 100644
--- a/drivers/net/cnxk/cn10k_tx.h
+++ b/drivers/net/cnxk/cn10k_tx.h
@@ -956,6 +956,14 @@ cn10k_nix_prepare_mseg(struct rte_mbuf *m, uint64_t *cmd, const uint16_t flags)
rte_io_wmb();
 #endif
m->next = NULL;
+
+   /* Quickly handle single segmented packets. With this if-condition
+* compiler will completely optimize out the below do-while loop
+* from the Tx handler when NIX_TX_MULTI_SEG_F offload is not set.
+*/
+   if (!(flags & NIX_TX_MULTI_SEG_F))
+   goto done;
+
m = m_next;
if (!m)
goto done;
@@ -1360,6 +1368,30 @@ cn10k_nix_prepare_tso(struct rte_mbuf *m, union nix_send_hdr_w1_u *w1,
}
 }
 
+static __rte_always_inline uint16_t
+cn10k_nix_prepare_mseg_vec_noff(struct rte_mbuf *m, uint64_t *cmd,
+   uint64x2_t *cmd0, uint64x2_t *cmd1,
+   uint64x2_t *cmd2, uint64x2_t *cmd3,
+   const uint32_t flags)
+{
+   uint16_t segdw;
+
+   vst1q_u64(cmd, *cmd0); /* Send hdr */
+   if (flags & NIX_TX_NEED_EXT_HDR) {
+   vst1q_u64(cmd + 2, *cmd2); /* ext hdr */
+   vst1q_u64(cmd + 4, *cmd1); /* sg */
+   } else {
+   vst1q_u64(cmd + 2, *cmd1); /* sg */
+   }
+
+   segdw = cn10k_nix_prepare_mseg(m, cmd, flags);
+
+   if (flags & NIX_TX_OFFLOAD_TSTAMP_F)
+   vst1q_u64(cmd + segdw * 2 - 2, *cmd3);
+
+   return segdw;
+}
+
 static __rte_always_inline void
 cn10k_nix_prepare_mseg_vec_list(struct rte_mbuf *m, uint64_t *cmd,
union nix_send_hdr_w0_u *sh,
@@ -1389,17 +1421,6 @@ cn10k_nix_prepare_mseg_vec_list(struct rte_mbuf *m, uint64_t *cmd,
 
nb_segs = m->nb_segs - 1;
m_next = m->next;
-
-   /* Set invert df if buffer is not to be freed by H/W */
-   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)
-   sg_u |= (cnxk_nix_prefree_seg(m) << 55);
-   /* Mark mempool object as "put" since it is freed by NIX */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-   if (!(sg_u & (1ULL << 55)))
-   RTE_MEMPOOL_CHECK_COOKIES(m->pool, (void **)&m, 1, 0);
-   rte_io_wmb();
-#endif
-
m->next = NULL;
m = m_next;
/* Fill mbuf segments */
@@ -1409,16 +1430,6 @@ cn10k_nix_prepare_mseg_vec_list(struct rte_mbuf *m, uint64_t *cmd,
len -= dlen;
sg_u = sg_u | ((uint64_t)dlen << (i << 4));
*slist = rte_mbuf_data_iova(m);
-   /* Set invert df if buffer is not to be freed by H/W */
-   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)
-   sg_u |= (cnxk_nix_prefree_seg(m) << (i + 55));
-   /* Mark mempool object as "put" since it is freed by NIX
-*/
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-   if (!(sg_u & (1ULL << (i + 55
-   RTE_MEMPOOL_CHECK_COOKIES(m->pool, (void **)&m, 1, 0);
-   rte_io_wmb();
-#endif
slist++;
i++;
nb_segs--;
@@ -1456,21 +1467,8 @@ cn10k_nix_prepare_mseg_vec(struct rte_mbuf *m, uint64_t *cmd, uint64x2_t *cmd0,
union nix_send_hdr_w0_u sh;
union nix_send_sg_s sg;
 
-   if (m->nb_segs == 1) {
-   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F) {
-   sg.u = vgetq_lane_u64(cmd1[0], 0);
-   sg.u |= (cnxk_nix_prefree_seg(m) << 55);
-   cmd1[0] = vsetq_lane_u64(sg.u, cmd1[0], 0);
-   }
-
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-   sg.u = vgetq_lane_u64(cmd1[0], 0);
-   if (!(sg.u & (1ULL << 55)))
-   RTE_MEMPOOL_CHECK_COOKIES(m->pool, (void **)&m, 1, 0);
-   rte_io_wmb();
-#endif
+   if (m->nb_segs == 1)
return;
-   }
 
sh.u = vgetq_lane_u64(cmd0[0], 0);
sg.u = vgetq_lane_u64(cmd1[0], 0);
@@ -1491,16 +1489,32 @@ cn10k_nix_prep_lmt_mseg_vector(struct rte_mbuf **mbufs, uint64x2_t *cmd0,
   uint64_t *lmt_addr, __uint128_t *data128,
   uint8_t *shift, const uint16_t flags)
 {
-   uint8_t j, off, lmt_used;
+   uint8_t j, off, lmt_used = 0;
+
+   if (flags & NIX_TX_OFFLOAD_MBUF_NOFF_F) {
+   off = 0;
+   for (j = 0; j < NIX_DESCS_PER_LOOP; j++) {
+   if (off + segdw[j] > 8) {
+   *data128 |= ((__uint128_t)off - 1) << *shift;
+   *shift += 3;
+ 

[PATCH v1 3/3] net/cnxk: add debug check for number of Tx descriptors

2022-11-16 Thread Ashwin Sekhar T K
When SG2 descriptors are used and more than 5 segments are
present, certain combinations of segments require more than
16 descriptors.

In debug builds, add an assert to capture this scenario.
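
One way to exercise the new check in a local build (an assumption on
my side about how RTE_LIBRTE_MBUF_DEBUG is picked up, not part of this
patch) is to define it through c_args at setup time:

meson setup build -Dc_args='-DRTE_LIBRTE_MBUF_DEBUG'
ninja -C build

Depending on how PLT_ASSERT is mapped, assertions may also need to be
enabled in the build for the check to actually trigger.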

Signed-off-by: Ashwin Sekhar T K 
---
 drivers/net/cnxk/cn10k_tx.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
index 3f08a8a473..09c332b2b5 100644
--- a/drivers/net/cnxk/cn10k_tx.h
+++ b/drivers/net/cnxk/cn10k_tx.h
@@ -84,6 +84,22 @@ cn10k_nix_mbuf_sg_dwords(struct rte_mbuf *m)
return (segw + 1) / 2;
 }
 
+static __plt_always_inline void
+cn10k_nix_tx_mbuf_validate(struct rte_mbuf *m, const uint32_t flags)
+{
+#ifdef RTE_LIBRTE_MBUF_DEBUG
+   uint16_t segdw;
+
+   segdw = cn10k_nix_mbuf_sg_dwords(m);
+   segdw += 1 + !!(flags & NIX_TX_NEED_EXT_HDR) + !!(flags & NIX_TX_OFFLOAD_TSTAMP_F);
+
+   PLT_ASSERT(segdw <= 8);
+#else
+   RTE_SET_USED(m);
+   RTE_SET_USED(flags);
+#endif
+}
+
 static __plt_always_inline void
 cn10k_nix_vwqe_wait_fc(struct cn10k_eth_txq *txq, int64_t req)
 {
@@ -1307,6 +1323,8 @@ cn10k_nix_xmit_pkts_mseg(void *tx_queue, uint64_t *ws,
}
 
for (i = 0; i < burst; i++) {
+   cn10k_nix_tx_mbuf_validate(tx_pkts[i], flags);
+
/* Perform header writes for TSO, barrier at
 * lmt steorl will suffice.
 */
@@ -1906,6 +1924,8 @@ cn10k_nix_xmit_pkts_vector(void *tx_queue, uint64_t *ws,
for (j = 0; j < NIX_DESCS_PER_LOOP; j++) {
struct rte_mbuf *m = tx_pkts[j];
 
+   cn10k_nix_tx_mbuf_validate(m, flags);
+
/* Get dwords based on nb_segs. */
if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F &&
  flags & NIX_TX_MULTI_SEG_F))
-- 
2.25.1



[PATCH] net/iavf: fix slow memory allocation

2022-11-16 Thread Kaisen You
In some cases DPDK does not allocate hugepage heap memory to some
sockets because of the user's core settings (e.g. with -l 40-79,
SOCKET 0 has no memory).
When the interrupt thread runs on a core of such a socket, each
allocation/release executes a whole set of heap allocation/release
operations, resulting in poor performance.
Fix this by calling malloc() to get the memory from the system heap
instead.
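
For example (an illustration of the core list above; the exact command
line and core/socket layout are hypothetical): on a two-socket machine
where cores 40-79 all belong to socket 1,

dpdk-testpmd -l 40-79 -a <iavf device BDF> -- -i

leaves socket 0 without a hugepage heap. If the iavf event handler
thread then ends up on a socket 0 core, every posted event goes through
the slow rte_malloc()/rte_free() path described above.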

Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
Cc: sta...@dpdk.org

Signed-off-by: Kaisen You 
---
 drivers/net/iavf/iavf_vchnl.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index f92daf97f2..a05791fe48 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -36,7 +36,6 @@ struct iavf_event_element {
struct rte_eth_dev *dev;
enum rte_eth_event_type event;
void *param;
-   size_t param_alloc_size;
uint8_t param_alloc_data[0];
 };
 
@@ -80,7 +79,7 @@ iavf_dev_event_handle(void *param __rte_unused)
TAILQ_FOREACH_SAFE(pos, &pending, next, save_next) {
TAILQ_REMOVE(&pending, pos, next);
rte_eth_dev_callback_process(pos->dev, pos->event, pos->param);
-   rte_free(pos);
+   free(pos);
}
}
 
@@ -94,14 +93,13 @@ iavf_dev_event_post(struct rte_eth_dev *dev,
 {
struct iavf_event_handler *handler = &event_handler;
char notify_byte;
-   struct iavf_event_element *elem = rte_malloc(NULL, sizeof(*elem) + param_alloc_size, 0);
+   struct iavf_event_element *elem = malloc(sizeof(*elem) + param_alloc_size);
if (!elem)
return;
 
elem->dev = dev;
elem->event = event;
elem->param = param;
-   elem->param_alloc_size = param_alloc_size;
if (param && param_alloc_size) {
rte_memcpy(elem->param_alloc_data, param, param_alloc_size);
elem->param = elem->param_alloc_data;
@@ -165,7 +163,7 @@ iavf_dev_event_handler_fini(void)
struct iavf_event_element *pos, *save_next;
TAILQ_FOREACH_SAFE(pos, &handler->pending, next, save_next) {
TAILQ_REMOVE(&handler->pending, pos, next);
-   rte_free(pos);
+   free(pos);
}
 }
 
-- 
2.34.1