from:"Ferruh Yigit"

Re: [PATCH v5] doc: add new driver guidelines

2024-10-04 Thread Ferruh Yigit

On 10/4/2024 5:39 PM, Nandini Persad wrote:
> +* Each patch should be organized logically as a new feature.
> +* Run test tools per patch (See :ref:`tool_list`:).
>

':' after `tool_list` is extra and can be dropped.

Rest looks good to me, thanks.

Re: [PATCH 0/8] fix the representor port link status and speed

2024-10-04 Thread Ferruh Yigit

On 9/5/2024 7:25 AM, Chaoyong He wrote:
> This patch series aims to fix some problems in the representor port link
> status and speed update logic, also do some refactors to the related
> logics.
> 
> Qin Ke (8):
>   net/nfp: fix incorrect type declaration of some variables
>   net/nfp: add help function to check link speed
>   net/nfp: add help function to update VF link speed
>   net/nfp: rename PF speed update function
>   net/nfp: add new data field into representor port structure
>   net/nfp: fix representor port link speed update problem
>   net/nfp: standardize the use of port index in some functions
>   net/nfp: fix representor port link status update problem
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH dpdk v2 04/16] net: use ipv6 structure for header addresses

2024-10-04 Thread Ferruh Yigit

On 10/1/2024 9:17 AM, Robin Jarry wrote:
> @@ -918,12 +918,14 @@ static struct chrte_fparse parseitem[] = {
>   .fptr  = ch_rte_parsetype_ipv6,
>   .dmask = &(const struct rte_flow_item_ipv6) {
>   .hdr = {
> - .src_addr =
> + .src_addr = { .a =
>   "\xff\xff\xff\xff\xff\xff\xff\xff"
>   "\xff\xff\xff\xff\xff\xff\xff\xff",
> - .dst_addr =
> + },
> + .dst_addr = { .a =
>   "\xff\xff\xff\xff\xff\xff\xff\xff"
>   "\xff\xff\xff\xff\xff\xff\xff\xff",
> + },
>

Hi Robin,

gcc 15 complains about the string initialization [1], can we change
these initializers?


[1]
https://patches.dpdk.org/project/dpdk/patch/20241004041335.2916435-1-ferruh.yi...@amd.com/

Re: [RFC] ethdev: convert string initialization

2024-10-04 Thread Ferruh Yigit

On 10/4/2024 4:17 PM, Stephen Hemminger wrote:
> On Thu, 1 Aug 2024 02:27:22 -0700
> Ferruh Yigit  wrote:
> 
>> gcc 15 experimental [1], with -Wextra flag, gives warning in variable
>> initialization as string [2].
>>
>> The warning has a point when initialized variable is intended to use as
>> string, since assignment is missing the required null terminator for
>> this case. But warning is useless for our usecase.
>>
>> I don't know if this behaviour will change in gcc15, as it is still
>> under development. But if not we may need to update our initialization.
>>
>> In this patch only updated a few instance to show the issue, there are
>> many instances to fix, if we prefer to go this way.
>> Other option is to disable warning but it can be useful for actual
>> string usecases, so I prefer to keep it.
>>
>> [1]
>> gcc (GCC) 15.0.0 20240801 (experimental)
> 
> 
> I saw Robin added a bunch more of these in the ipv6 struct changes.
>

Thanks for the heads up, I will comment on his patch.

Re: [PATCH] doc: update TAP device features

2024-10-04 Thread Ferruh Yigit

On 10/4/2024 8:54 AM, Bruce Richardson wrote:
> On Fri, Oct 04, 2024 at 05:09:21AM +0100, Ferruh Yigit wrote:
>> On 10/4/2024 3:26 AM, Stephen Hemminger wrote:
>>> On Fri, 4 Oct 2024 02:48:21 +0100
>>> Ferruh Yigit  wrote:
>>>
>>>> On 9/4/2024 4:42 PM, Stephen Hemminger wrote:
>>>>> The TAP device does have per-queue stats and handles multi-process.
>>>>>
>>>>> Signed-off-by: Stephen Hemminger 
>>>>> ---
>>>>>  doc/guides/nics/features/tap.ini | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/doc/guides/nics/features/tap.ini 
>>>>> b/doc/guides/nics/features/tap.ini
>>>>> index f26355e57f..f2ea5cd833 100644
>>>>> --- a/doc/guides/nics/features/tap.ini
>>>>> +++ b/doc/guides/nics/features/tap.ini
>>>>> @@ -14,10 +14,12 @@ Basic stats  = Y
>>>>>  L3 checksum offload  = Y
>>>>>  L4 checksum offload  = Y
>>>>>  MTU update   = Y
>>>>> +Multiprocess aware   = Y
>>>>>  
>>>>
>>>> ack
>>>>
>>>>>  Multicast MAC filter = Y
>>>>>  Unicast MAC filter   = Y
>>>>>  Packet type parsing  = Y
>>>>>  Flow control = Y
>>>>> +Stats per queue  = Y
>>>>>  
>>>>
>>>> This feature name is misleading,
>>>> it is for 'rte_eth_dev_set_[rt]x_queue_stats_mapping()' API, which is
>>>> indeed for covering limitation for some drivers.
>>>> Tap does support getting stats per queue, but doesn't support above
>>>> documented feature.
>>>
>>> The stats queue mapping was a feature that was hinted at being removed.
>>> It only exists because of HW limitations on Intel ixgbe NIC and SW
>>> limitations from RTE_ETHDEV_QUEUE_STAT_CNTRS.
>>>
>>
>>
>> We have a plan to remove 'RTE_ETHDEV_QUEUE_STAT_CNTRS', by moving queue
>> stats to xstats.
>>
>> But ixgbe limitation is there.
>>
>>> Perhaps there should be a generic SW emulation for this the mapping?
>>>
>>
>> Ack, cc'ed Bruce.
>> But I am not sure ROI of the effort at this stage.
> 
> Not sure what the specific ask for me is here. :-) Overall, I think moving
> queue stats to xstats is the best way to go.
> 

cc'ed because of "generic SW emulation" comment.

I was thinking if this mapping can be done transparent to the user by
driver mapping queue <-> stats_register before reading stats, but @Bruce
let me know this won't work because the stats tracking only happens
after the mapping.

@Stephen, do you have something specific in your mind for SW emulation
for mapping?

Re: [PATCH 0/3] support load firmware from flash

2024-10-03 Thread Ferruh Yigit

On 9/3/2024 6:52 AM, Chaoyong He wrote:
> This patch series add the needed logic to support load firmware from the
> flash in card.
> 
> Chaoyong He (3):
>   net/nfp: fix potential problem on certain version BSP
>   net/nfp: add two APIs of the NSP module
>   net/nfp: support load firmware from flash
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH] net/nfp: reserve BAR for expansion ROM

2024-10-03 Thread Ferruh Yigit

On 9/3/2024 3:34 AM, Chaoyong He wrote:
> For some platform, the warm restart of host doesn't trigger
> the reset of NIC, which causes the initialize process of NIC
> not executed with the right expansion ROM mapping. Consequently,
> the PXE boot won't work in this case.
> 
> Now reserve BAR 2.0 which used by expansion ROM so that the
> mapping is fixed.
> 
> Signed-off-by: Chaoyong He 
> Reviewed-by: Long Wu 
> Reviewed-by: Peng Zhang 
>

Applied to dpdk-next-net/main, thanks.

Re: [RFC PATCH] app/testpmd: display TM parameters when adding nodes

2024-10-03 Thread Ferruh Yigit

On 8/12/2024 2:26 PM, Bruce Richardson wrote:
> The commands to add TM nodes and shapers take many parameters without
> any descriptive words in between to identify what parameter is what. To
> help make it clear what a command is actually doing, and to help catch
> any issues with parameters put in the wrong order, print out the
> parameters for each command after it is entered.
> 
> Signed-off-by: Bruce Richardson 
>

Acked-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.

Re: [PATCH v2 0/3] app/testpmd: improve sse based macswap

2024-10-03 Thread Ferruh Yigit

On 8/21/2024 3:38 PM, Vipin Varghese wrote:
> Goal of the patch series is to improve SSE macswap on x86_64 by
> reducing the stalls in backend engine. Original implementation of
> the SSE-mac-swap makes loop call to multiple load, shuffle & store.
> 
> Using SIMD ISA interleaving, register variable and reducing L1 & L2
> cache eviction, we can reduce the stalls for
>  - load SSE token exhaustion
>  - Shuffle and Load dependency
> 
> Build test using meson script:
> ``
> build-gcc-static
> buildtools
> build-gcc-shared
> build-mini
> build-clang-static
> build-clang-shared
> build-x86-generic
> 
> Test Results:
> `
> 
> Platform-1: AMD EPYC SIENA 8594P @2.3GHz, no boost
> Platform-2: AMD EPYC 9554 @3.1GHz, no boost
> 
> NIC:
>  1) mellanox CX-7 1*200Gbps
>  2) intel E810 1*100Gbps
>  3) intel E810 2*200Gbps (2CQ-DA2) - loopback
>  4) braodcom P2100 2*100Gbps - loopback
> 
> 
> TEST IO 64B: baseline 
>  - NIC-1: 42.0
>  - NIC-2: 82.0
>  - NIC-3: 82.45
>  - NIC-3: 47.03
> 
> TEST MACSWAP 64B: 
>  - NIC-1: 31.533 : 31.90
>  - NIC-2: 48.0   : 48.9 
>  - NIC-3: 48.840 : 49.827
>  - NIC-4: 44.3   : 45.5
> 
> TEST MACSWAP 128B: 
>  - NIC-1: 30.946 : 31.770
>  - NIC-2: 47.4   : 48.3
>  - NIC-3: 47.979 : 48.503
>  - NIC-4: 41.53  : 44.59
> 
> TEST MACSWAP 256B: 
>  - NIC-1: 32.480 : 33.150
>  - NIC-2: 45.29  : 45.571
>  - NIC-3: 45.033 : 45.117
>  - NIC-4: 36.49  : 37.5
> 
> 
> 
> 
> TEST IO 64B: baseline 
>  - intel E810 2*200Gbps (2CQ-DA2): 82.49
> 
> 
> TEST MACSWAP: 1Q 1C1T
>  64B: : 45.0 : 45.54
> 128B: : 44.48 : 44.43
> 256B: : 42.0 : 41.99
> +
> TEST MACSWAP: 2Q 2C2T
>  64B: : 59.5 : 60.55
> 128B: : 56.78 : 58.1
> 256B: : 41.85 : 41.99
> ----
> 
> Signed-off-by: Vipin Varghese 
> 
> Vipin Varghese (3):
>   app/testpmd: add register keyword
>   app/testpmd: move offload update
>   app/testpmd: interleave SSE SIMD
>

For series,
Acked-by: Ferruh Yigit 


Bruce, if your testing is not aligned with the presented results please
let us know, otherwise I am planning to get the patch for -rc1.

Re: [PATCH v2 1/3] app/testpmd: add register keyword

2024-10-03 Thread Ferruh Yigit

On 9/6/2024 2:02 PM, Varghese, Vipin wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> 
>  
>> > > >> --- a/app/test-pmd/macswap_sse.h
>> > > >> +++ b/app/test-pmd/macswap_sse.h
>> > > >> @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t
>> nb,
>> > > >>    uint64_t ol_flags;
>> > > >>    int i;
>> > > >>    int r;
>> > > >> - __m128i addr0, addr1, addr2, addr3;
>> > > >> + register __m128i addr0, addr1, addr2, addr3;
>> > > > Some compilers treat register as a no-op. Are you sure? Did you check
>> with godbolt.
>> > >
>> > > Thank you Stephen, I have tested the code changes on Linux using GCC
>> > > and Clang compiler.
>> > >
>> > > In both cases in Linux environment, we have seen the the values
>> > > loaded onto register `xmm`.
>> > >
>> > > ```
>> > > registerconst__m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4,
>> > > 3, 2, 1, 0, 11, 10, 9, 8, 7, 6); vmovdqaxmm0, xmmwordptr[rip+
>> > > .LCPI0_0]
>> 
>> Yep, that what I would probably expect: one time load before the loop starts,
>> right?
>> Curious  what exactly it would generate then if 'register' keyword is missed?
>> BTW, on my box,  gcc-11  with '-O3 -msse4.2 ...'  I am seeing expected
>> behavior without 'register' keyword.
>> Is it some particular compiler version that misbehaves?
>  
> Thank you, Konstantin, for this pointer. I have been trying this
> understand this a bit more internally. Here are my observations
>  
> 1. shuf simd ISA works on XMM register only.
> 2. Any values from variables has to be loaded to `xmm` register before
> processing.
> 3. when compiled for `-march=native` with compiler not aware (SoC Arch
> gcc weights) without patch might have generating with `movzx   eax, BYTE
> PTR [rbp-48]`
> 4. when register keyword is applied for both shufl_mask and addr, the
> compiler generates trying to get the variables directly into xmm using `
> vmovdqu (%rsi),%xmm1`
>  
> So, I think you are right, from gcc12.3 and gcc 13.1 which supports `-
> march=znver4` this problem will not come.
>  

Hi Konstantin, Stephen,

There is no negative impact of adding 'register' keyword, right? At
worst it is useless, but Vipin can demonstrate that it has benefit for
some cases, so I think OK to get it.

Re: [PATCH v2] app/testpmd: show output of commands read from file

2024-10-03 Thread Ferruh Yigit

On 8/22/2024 11:41 AM, Bruce Richardson wrote:
> Testpmd supports the "--cmdline-file" parameter to read a set of initial
> commands from a file. However, the only indication that this has been
> done successfully on startup is a single-line message, no output from
> the commands is seen.
> 
> To improve usability here, we can use cmdline_new rather than
> cmdline_file_new and have the output from the various commands sent to
> stdout, allowing the user to see better what is happening.
> 
> Signed-off-by: Bruce Richardson 
> 
> ---
> v2: use STDOUT_FILENO in place of hard-coded "1"
> ---
>

After discussion, I think it is OK to have the update in the testpmd
(instead of a new function in cmdline), hence;

Acked-by: Ferruh Yigit

Re: [PATCH] app/testpmd: show output of commands read from file

2024-10-03 Thread Ferruh Yigit

On 10/4/2024 5:55 AM, Ferruh Yigit wrote:
> On 8/22/2024 11:36 AM, Bruce Richardson wrote:
>> Testpmd supports the "--cmdline-file" parameter to read a set of initial
>> commands from a file. However, the only indication that this has been
>> done successfully on startup is a single-line message, no output from
>> the commands is seen.
>>
>> To improve usability here, we can use cmdline_new rather than
>> cmdline_file_new and have the output from the various commands sent to
>> stdout, allowing the user to see better what is happening.
>>
>> Signed-off-by: Bruce Richardson 
>>
> 
> After discussion, I think it is OK to have the update in the testpmd
> (instead of a new function in cmdline), hence;
> 
> Acked-by: Ferruh Yigit 
>

Ahh, ack was for v2, please scratch this one.

Re: [PATCH] app/testpmd: show output of commands read from file

2024-10-03 Thread Ferruh Yigit

On 8/22/2024 11:36 AM, Bruce Richardson wrote:
> Testpmd supports the "--cmdline-file" parameter to read a set of initial
> commands from a file. However, the only indication that this has been
> done successfully on startup is a single-line message, no output from
> the commands is seen.
> 
> To improve usability here, we can use cmdline_new rather than
> cmdline_file_new and have the output from the various commands sent to
> stdout, allowing the user to see better what is happening.
> 
> Signed-off-by: Bruce Richardson 
>

After discussion, I think it is OK to have the update in the testpmd
(instead of a new function in cmdline), hence;

Acked-by: Ferruh Yigit

Re: [PATCH] net/gve: always attempt Rx refill on DQ

2024-10-03 Thread Ferruh Yigit

On 10/2/2024 12:45 AM, Joshua Washington wrote:
> Before this patch, gve_rx_refill_dqo() is only called if the number of
> packets received in a cycle is non-zero. However, in a
> memory-constrained scenario, this doesn't behave well, as this could be
> a potential source of lockup, if there is no memory and all buffers have
> been received before memory is freed up for the driver to use.
> 
> This patch moves the gve_rx_refill_dqo() call to occur regardless of
> whether packets have been received so that in the case that enough
> memory is freed, the driver can recover.
> 
> Fixes: 45da16b5b181 ("net/gve: support basic Rx data path for DQO")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Joshua Washington 
> Reviewed-by: Praveen Kaligineedi 
> Reviewed-by: Rushil Gupta 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH] net/gve: fix mbuf allocation memory leak for DQ Rx

2024-10-03 Thread Ferruh Yigit

On 10/2/2024 12:48 AM, Joshua Washington wrote:
> Currently, gve_rxq_mbufs_alloc_dqo() allocates RING_SIZE buffers, but
> only posts RING_SIZE - 1 of them, inevitably leaking a buffer every
> time queues are stopped/started. This could eventually lead to running
> out of mbufs if an application stops/starts traffic enough.
> 
> Fixes: b044845bb015 ("net/gve: support queue start/stop")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Joshua Washington 
> Reviewed-by: Rushil Gupta 
> Reviewed-by: Praveen Kaligineedi 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH 0/4] Support new card using NFP 3800 chip

2024-10-03 Thread Ferruh Yigit

On 9/3/2024 2:41 AM, Chaoyong He wrote:
> This patch series add support of a new card which using the NFP 3800 chip.
> Because it has different configure BAR size and flow steering rules
> limit, we also do some refactor to the related logic.
> 
> Chaoyong He (4):
>   net/nfp: add a new flag to indicate PF
>   net/nfp: refactor the firmware version logic
>   net/nfp: support different configuration BAR size
>   net/nfp: support different flow steering rules limit
>

Series applied to dpdk-next-net/main, thanks.

[PATCH] ethdev: convert string initialization

2024-10-03 Thread Ferruh Yigit

gcc 15 experimental [1], with -Wextra flag, gives warning in variable
initialization as string [2].

The warning has a point when initialized variable is intended to use as
string, since assignment is missing the required null terminator for
this case. But warning is useless for our usecase.

In this patch only updated a few instance to show the issue, there are
many instances to fix, if we prefer to go this way.
Other option is to disable warning but it can be useful for actual
string usecases, so I prefer to keep it.

Converted string initialization to array initialization.

[1]
gcc (GCC) 15.0.0 20241003 (experimental)

[2]
../lib/ethdev/rte_flow.h:906:36:
  error: initializer-string for array of ‘unsigned char’ is too long
[-Werror=unterminated-string-initialization]
906 | .hdr.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
|^~

../lib/ethdev/rte_flow.h:907:36:
  error: initializer-string for array of ‘unsigned char’ is too long
 [-Werror=unterminated-string-initialization]
907 | .hdr.src_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
|^~

../lib/ethdev/rte_flow.h:1009:25:
  error: initializer-string for array of ‘unsigned char’ is too long
 [-Werror=unterminated-string-initialization]
1009 | "\xff\xff\xff\xff\xff\xff\xff\xff"
 | ^~

../lib/ethdev/rte_flow.h:1012:25:
  error: initializer-string for array of ‘unsigned char’ is too long
 [-Werror=unterminated-string-initialization]
1012 | "\xff\xff\xff\xff\xff\xff\xff\xff"
 | ^~

../lib/ethdev/rte_flow.h:1135:20:
  error: initializer-string for array of ‘unsigned char’ is too long
 [-Werror=unterminated-string-initialization]
1135 | .hdr.vni = "\xff\xff\xff",
 |        ^~

Signed-off-by: Ferruh Yigit 
Acked-by: Morten Brørup 
Acked-by: Bruce Richardson 
---
 app/test-pmd/cmdline_flow.c   | 31 -
 app/test/test_bpf.c   |  2 +-
 app/test/test_pcapng.c|  2 +-
 app/test/test_security_inline_macsec.c|  4 +-
 doc/guides/prog_guide/ethdev/flow_offload.rst |  2 +-
 drivers/net/cxgbe/cxgbe_flow.c| 14 ++--
 drivers/net/dpaa2/dpaa2_flow.c| 14 ++--
 drivers/net/mlx4/mlx4_flow.c  |  6 +-
 drivers/net/mlx5/mlx5_flow.c  | 20 +++---
 drivers/net/mlx5/mlx5_flow_dv.c   | 26 
 drivers/net/mlx5/mlx5_flow_hw.c   | 66 +--
 drivers/net/mlx5/mlx5_trigger.c   | 20 +++---
 drivers/net/nfp/flower/nfp_flower_flow.c  | 14 ++--
 drivers/net/nfp/nfp_net_flow.c|  8 +--
 drivers/net/tap/tap_flow.c| 30 -
 examples/l2fwd-macsec/main.c  |  4 +-
 lib/ethdev/rte_flow.h | 44 ++---
 17 files changed, 144 insertions(+), 163 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b7bcf18311a2..5451b3a45338 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -898,20 +898,20 @@ struct vxlan_encap_conf vxlan_encap_conf = {
.select_ipv4 = 1,
.select_vlan = 0,
.select_tos_ttl = 0,
-   .vni = "\x00\x00\x00",
+   .vni = { 0x00, 0x00, 0x00 },
.udp_src = 0,
.udp_dst = RTE_BE16(RTE_VXLAN_DEFAULT_PORT),
.ipv4_src = RTE_IPV4(127, 0, 0, 1),
.ipv4_dst = RTE_IPV4(255, 255, 255, 255),
-   .ipv6_src = "\x00\x00\x00\x00\x00\x00\x00\x00"
-   "\x00\x00\x00\x00\x00\x00\x00\x01",
-   .ipv6_dst = "\x00\x00\x00\x00\x00\x00\x00\x00"
-   "\x00\x00\x00\x00\x00\x00\x11\x11",
+   .ipv6_src = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01 },
+   .ipv6_dst = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x11, 0x11 },
.vlan_tci = 0,
.ip_tos = 0,
.ip_ttl = 255,
-   .eth_src = "\x00\x00\x00\x00\x00\x00",
-   .eth_dst = "\xff\xff\xff\xff\xff\xff",
+   .eth_src = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
+   .eth_dst = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
 };
 
 /** Maximum number of items in struct rte_flow_action_vxlan_encap. */
@@ -934,16 +934,16 @@ struct action_vxlan_encap_data {
 struct nvgre_encap_conf nvgre_encap_conf = {
.select_ipv4 = 1,
.select_vlan = 0,
-   .tni = "\x00\x00\x00",
+   .tni = { 0x00, 0x00, 0x00 },
.ipv4_src = RTE_IPV4(127, 0, 0, 1),

Re: [PATCH] doc: update TAP device features

2024-10-03 Thread Ferruh Yigit

On 10/4/2024 3:26 AM, Stephen Hemminger wrote:
> On Fri, 4 Oct 2024 02:48:21 +0100
> Ferruh Yigit  wrote:
> 
>> On 9/4/2024 4:42 PM, Stephen Hemminger wrote:
>>> The TAP device does have per-queue stats and handles multi-process.
>>>
>>> Signed-off-by: Stephen Hemminger 
>>> ---
>>>  doc/guides/nics/features/tap.ini | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/doc/guides/nics/features/tap.ini 
>>> b/doc/guides/nics/features/tap.ini
>>> index f26355e57f..f2ea5cd833 100644
>>> --- a/doc/guides/nics/features/tap.ini
>>> +++ b/doc/guides/nics/features/tap.ini
>>> @@ -14,10 +14,12 @@ Basic stats  = Y
>>>  L3 checksum offload  = Y
>>>  L4 checksum offload  = Y
>>>  MTU update   = Y
>>> +Multiprocess aware   = Y
>>>  
>>
>> ack
>>
>>>  Multicast MAC filter = Y
>>>  Unicast MAC filter   = Y
>>>  Packet type parsing  = Y
>>>  Flow control = Y
>>> +Stats per queue  = Y
>>>  
>>
>> This feature name is misleading,
>> it is for 'rte_eth_dev_set_[rt]x_queue_stats_mapping()' API, which is
>> indeed for covering limitation for some drivers.
>> Tap does support getting stats per queue, but doesn't support above
>> documented feature.
> 
> The stats queue mapping was a feature that was hinted at being removed.
> It only exists because of HW limitations on Intel ixgbe NIC and SW
> limitations from RTE_ETHDEV_QUEUE_STAT_CNTRS.
> 


We have a plan to remove 'RTE_ETHDEV_QUEUE_STAT_CNTRS', by moving queue
stats to xstats.

But ixgbe limitation is there.

> Perhaps there should be a generic SW emulation for this the mapping?
>

Ack, cc'ed Bruce.
But I am not sure ROI of the effort at this stage.

Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs

2024-10-03 Thread Ferruh Yigit

On 9/23/2024 9:41 PM, Akhil Goyal wrote:
> Hi Morten,
> 
> Apologies for delayed response.
>> Maybe a combination, returning the lowest end of the two versions of the 
>> list,
>> would work...
>>
>> --
>> Common header file (rte_common.h):
>> --
>>
>> /* Add at end of enum list in the header file. */
>> #define RTE_ENUM_LIST_END(name) \
>> _ # name # _ENUM_LIST_END /**< @internal */
>>
>> /* Add somewhere in header file, preferably after the enum list. */
>> #define rte_declare_enum_list_end(name) \
>> /** @internal */ \
>> int _# name # _enum_list_end(void); \
>> \
>> static int name # _enum_list_end(void) \
>> { \
>>  static int cached = 0; \
>> \
>>  if (likely(cached != 0)) \
>>  return cached; \
>> \
>>  return cached = RTE_MIN( \
>>  RTE_ENUM_LIST_END(name), \
>>  _ # name # _enum_list_end()); \
>> } \
>> \
>> int _# name # _enum_list_end(void)
>>
>> /* Add in the library/driver implementation. */
>> #define rte_define_enum_list_end(name) \
>> int _# name # _enum_list_end(void) \
>> { \
>>  return RTE_ENUM_LIST_END(name); \
>> } \
>> \
>> int _# name # _enum_list_end(void)
>>
>> 
>> Library header file:
>> 
>>
>> enum rte_crypto_asym_op_type {
>>  RTE_CRYPTO_ASYM_OP_VERIFY,
>>  /**< Signature Verification operation */
>>  RTE_ENUM_LIST_END(rte_crypto_asym_op)
> 
> Will the ABI check be ok for adding anything in between 
> RTE_CRYPTO_ASYM_OP_VERIFY and RTE_ENUM_LIST_END(rte_crypto_asym_op)?
> Don’t we need to add exception for that if we somehow make it internal by 
> adding a comment only?
> Library is actually not restricting the application to not use 
> RTE_ENUM_LIST_END(rte_crypto_asym_op) directly.
> 
> Also we may need to expose the .c file internal function as experimental in 
> version.map
> 
>> }
>>
>> rte_declare_enum_list_end(rte_crypto_asym_op);
>>
>> ---
>> Library C file:
>> ---
>>
>> rte_define_enum_list_end(rte_crypto_asym_op);
> 
> If we want to make it a generic thing in rte_common.h
> Will the below change be ok?
> --
> Common header file (rte_common.h):
> --
> #define rte_define_enum_list_end(name, last_value) \
> static inline int name ## _enum_list_end(void) \
> { \
>return last_value + 1; \
> }
> 
> 
> Lib header file
> 
> //After the enum definition define the list end as below
> rte_define_enum_list_end(rte_crypto_asym_op, RTE_CRYPTO_ASYM_OP_VERIFY);
> 

I assume Morten suggests his macros to escape from maintenance cost of
updating inline function each time a new enum is added.

But with above suggestion, that cost is still there, so I don't think
this one has a benefit against the original suggestion.

Re: [EXTERNAL] Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs

2024-10-03 Thread Ferruh Yigit

On 9/6/2024 8:45 AM, Akhil Goyal wrote:
>>>
>>> Here's an idea...
>>>
>>> We can introduce a generic design pattern where we keep the _LIST_END enum
>> value at the end, somehow marking it private (and not part of the API/ABI), 
>> and
>> move the _list_end() function inside the C file, so it uses the _LIST_END 
>> enum
>> value that the library was built with. E.g. like this:
>>>
>>>
>>> In the header file:
>>>
>>> enum rte_crypto_asym_op_type {
>>> RTE_CRYPTO_ASYM_OP_VERIFY,
>>> /**< Signature Verification operation */
>>> #if RTE_BUILDING_INTERNAL
>>> __RTE_CRYPTO_ASYM_OP_LIST_END /* internal */
>>> #endif
>>> }
>>>
>>> int rte_crypto_asym_op_list_end(void);
>>>
>>>
>>> And in the associated library code file, when including rte_crypto_asym.h:
>>>
>>> #define RTE_BUILDING_INTERNAL
>>> #include 
>>>
>>> int
>>> rte_crypto_asym_op_list_end(void)
>>> {
>>> return __RTE_CRYPTO_ASYM_OP_LIST_END;
>>> }
>>
>> It's more generic, and also keep LIST_END in the define, we just add new enum
>> before it.
>> But based on my understanding of ABI compatibility, from the point view of
>> application,
>> this API should return old-value even with the new library, but it will 
>> return new-
>> value
>> with new library. It could also break ABI.
>>
>> So this API should force inline, just as this patch did. But it seem can't 
>> work if
>> move
>> this API to header file and add static inline.
>>
> Yes, moving to c file does not seem to solve the purpose.
> So should we move with the way the patch is submitted or we have some other 
> suggestion?
> 

+1, when it is not inline function, this approach won't work.

Re: [PATCH] [RFC] cryptodev: replace LIST_END enumerators with APIs

2024-10-03 Thread Ferruh Yigit

On 9/5/2024 11:14 AM, Akhil Goyal wrote:
> Replace *_LIST_END enumerators from asymmetric crypto
> lib to avoid ABI breakage for every new addition in
> enums with inline APIs.
> 
> Signed-off-by: Akhil Goyal 
> ---
> This patch was discussed in ML long time back.
> https://patches.dpdk.org/project/dpdk/patch/20211008204516.3497060-1-gak...@marvell.com/
> Now added inline APIs for getting the list end which need to be updated
> for each new entry to the enum. This shall help in avoiding ABI break
> for adding new algo.
> 

Hi Akhil,

*I think* this hides the problem instead of fixing it, and this may be
partially because of the tooling (libabigail) limitation.

This patch prevents the libabigail warning, true, but it doesn't improve
anything from the application's perspective.
Before or after this patch, application knows a fixed value as END value.

Not all changes in the END (MAX) enum values cause ABI break, but tool
warns on all, that is why I think this may be tooling limitation [1].
(Just to stress, I am NOT talking about regular enum values change, I am
talking about only END (MAX) value changes caused by appending new enum
items.)

As far as I can see (please Dodji, David correct me if I am wrong) ABI
break only happens if application and library exchange enum values in
the API (directly or within a struct).

Exchanging enum values via API cause ABI issues because:
1. enum size is not fixed, it uses min storage size that can hold all
values, adding new enums may cause its size change and of course this
breaks ABI.
2. Library can send enum values that application doesn't know, this may
cause application behave undefined.
3. Application sending MAX value (or more) to API expects error, but
that may be a valid value in the new library and applications gets
unexpected result.

Let's assume above 1. happens, with this patch warning is prevented, but
ABI break still can occur.

One option can be not exchanging enums in APIs at all, this way changes
in the END (MAX) enum values are harmless. But I can see cryptodev does
this a lot.

When enum is exchanged in APIs, and if we want to be strict about the
ABI, safest option can be not appending to enums at all, and keeping END
(MAX) items can be a way to enable warnings for this.

More relaxed option is protect against only 1. since that is the real
ABI issue, 2 & 3 are more defensive programming, so as long as enum size
is not changed we may ignore the END (MAX) value change warnings.

[1] It would be better if tool gives END (MAX) enum value warnings only
if it is exchanged in an API, but not sure if this can be possible to
detect.

>  app/test/test_cryptodev_asym.c  |  2 +-
>  lib/cryptodev/rte_crypto_asym.h | 18 ++
>  2 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test/test_cryptodev_asym.c b/app/test/test_cryptodev_asym.c
> index f0b5d38543..f1ece475b8 100644
> --- a/app/test/test_cryptodev_asym.c
> +++ b/app/test/test_cryptodev_asym.c
> @@ -581,7 +581,7 @@ static inline void print_asym_capa(
>   rte_cryptodev_asym_get_xform_string(capa->xform_type));
>   printf("operation supported -");
>  
> - for (i = 0; i < RTE_CRYPTO_ASYM_OP_LIST_END; i++) {
> + for (i = 0; i < rte_crypto_asym_op_list_end(); i++) {
>   /* check supported operations */
>   if (rte_cryptodev_asym_xform_capability_check_optype(capa, i)) {
>   if (capa->xform_type == RTE_CRYPTO_ASYM_XFORM_DH)
> diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
> index 39d3da3952..290b300f84 100644
> --- a/lib/cryptodev/rte_crypto_asym.h
> +++ b/lib/cryptodev/rte_crypto_asym.h
> @@ -72,6 +72,7 @@ enum rte_crypto_curve_id {
>   * Asymmetric crypto transformation types.
>   * Each xform type maps to one asymmetric algorithm
>   * performing specific operation
> + * Note: Update rte_crypto_asym_xform_type_list_end() for every new type 
> added.
>   */
>  enum rte_crypto_asym_xform_type {
>   RTE_CRYPTO_ASYM_XFORM_UNSPECIFIED = 0,
> @@ -119,12 +120,17 @@ enum rte_crypto_asym_xform_type {
>* Performs Encrypt, Decrypt, Sign and Verify.
>* Refer to rte_crypto_asym_op_type.
>*/
> - RTE_CRYPTO_ASYM_XFORM_TYPE_LIST_END
> - /**< End of list */
>  };
>  
> +static inline int
> +rte_crypto_asym_xform_type_list_end(void)
> +{
> + return RTE_CRYPTO_ASYM_XFORM_SM2 + 1;
> +}
> +
>  /**
>   * Asymmetric crypto operation type variants
> + * Note: Update rte_crypto_asym_op_list_end for every new type added.
>   */
>  enum rte_crypto_asym_op_type {
>   RTE_CRYPTO_ASYM_OP_ENCRYPT,
> @@ -135,9 +141,14 @@ enum rte_crypto_asym_op_type {
>   /**< Signature Generation operation */
>   RTE_CRYPTO_ASYM_OP_VERIFY,
>   /**< Signature Verification operation */
> - RTE_CRYPTO_ASYM_OP_LIST_END
>  };
>  
> +static inline int
> +rte_crypto_asym_op_list_end(void)
> +{
> + return RTE_CRYPTO_ASYM_OP_VERIFY + 1;
> +}
> +
>  /**
>

Re: [PATCH] doc: update TAP device features

2024-10-03 Thread Ferruh Yigit

On 9/4/2024 4:42 PM, Stephen Hemminger wrote:
> The TAP device does have per-queue stats and handles multi-process.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  doc/guides/nics/features/tap.ini | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/doc/guides/nics/features/tap.ini 
> b/doc/guides/nics/features/tap.ini
> index f26355e57f..f2ea5cd833 100644
> --- a/doc/guides/nics/features/tap.ini
> +++ b/doc/guides/nics/features/tap.ini
> @@ -14,10 +14,12 @@ Basic stats  = Y
>  L3 checksum offload  = Y
>  L4 checksum offload  = Y
>  MTU update   = Y
> +Multiprocess aware   = Y
>

ack

>  Multicast MAC filter = Y
>  Unicast MAC filter   = Y
>  Packet type parsing  = Y
>  Flow control = Y
> +Stats per queue  = Y
>

This feature name is misleading,
it is for 'rte_eth_dev_set_[rt]x_queue_stats_mapping()' API, which is
indeed for covering limitation for some drivers.
Tap does support getting stats per queue, but doesn't support above
documented feature.

>  Linux= Y
>  ARMv7= Y
>  ARMv8= Y

Re: [PATCH] Add bus master enable/disable APIs for CDX devices

2024-10-03 Thread Ferruh Yigit

On 10/3/2024 11:16 PM, Stephen Hemminger wrote:
> On Fri, 27 Oct 2023 17:22:11 +0100
> Ferruh Yigit  wrote:
> 
>> From: Shubham Rohila 
>>
>> Define rte_cdx_vfio_bm_enable and rte_cdx_vfio_bm_disable to
>> enable or disable bus master functionality for cdx devices.
>>
>> Signed-off-by: Shubham Rohila 
> 
> This version of the patch had lots test failures, could you clean
> up and resubmit. 
>

v2 of the patch already merged.
Probably missed to update the status of this patch in the patchwork, I
will update it as 'superseded'

Re: [PATCH] drivers: remove compile-time option for IEEE 1588

2024-10-03 Thread Ferruh Yigit

On 10/3/2024 8:12 PM, Stephen Hemminger wrote:
> On Wed, 17 Apr 2024 14:24:00 +0100
> Ferruh Yigit  wrote:
> 
>>> :) remove compile flag in 24.11 LTS.
>>>
>>> But we may end up in same situation in 24.11, some drivers not being
>>> ready, so we should merge this patch very early in the 24.11 to give
>>> time to drivers to fix if they were not ready at that time.
>>>
>>> Other option is merge this patch in 24.07, so that vendors which were
>>> not ready can plan a fix for 24.11? Although enforcing an update by
>>> breaking the driver is not best planning, it may be an efficient one.
>>>  
>>
>> Hi Thomas and et al. ,
>>
>> Is is good time to merge this patch, there is enough time to fix
>> potential issues in this release, what do you think?
> 
> Patch needs rebase if going into 24.11
>

Indeed we talked about this in the release status meeting this morning,
as we discussed similar issue with Huisong [1].

If we will continue with the patch it should be merged before -rc1, so
that drivers can fix any impact from it.

Some options I can think of:

1. Move `RTE_LIBRTE_IEEE1588` macro to `config/rte_config.h`, so make it
a little more configurable and keep macro (easiest, but not preferable).

2. Add a new meson config option to 'config/rte_config.h' to set/unset
this macro, keep macro as it is.

3. Remove the macro, replace it with device argument per driver. So make
it dynamically configured instead of build time.

4. Remove the macro, introduce a new offload flag (Huisong has similar
patch [2]) for user to explicitly request this feature.

5. Remove the macro, implicitly enable it in the driver when vector path
is not enabled. So feature will be always on for scalar path, always off
for the vector path. User can disable vector path
(RTE_VECT_SIMD_DISABLED) to enable PTP.

Question is, will drivers have enough time form -rc1 to end of the
release to fix drivers?

Drivers in questions are:
txgbe
ngbe
ixgbe
i40e
igb
fm10k
bnxt
dpaa2
cnxk

Can maintainers of the above drivers please comment?

[1]
https://inbox.dpdk.org/dev/1240fd64-ba5d-4dcc-b825-851da25b8...@amd.com/

[2]
https://patches.dpdk.org/project/dpdk/patch/20240130035820.29713-1-lihuis...@huawei.com/

Re: [PATCH v5 00/18] NXP DPAA ETH driver enhancement and fixes

2024-10-01 Thread Ferruh Yigit

On 10/1/2024 12:03 PM, Hemant Agrawal wrote:
> v5: fix individual patch compilation and checkpatch warning
> v4: fix clang compilation issues
> v3: addressed Ferruh's comments 
>  - dropped Tx rate limit API patch
>  - added one small bug fix
>  - fixed removal/add of fman_offline type
> 
> v2: address review comments
>  - improve commit message
>  - add documentarion for new functions
>  - make IEEE1588 config runtime
> 
> This series adds several enhancement to the NXP DPAA Ethernet driver.
> 
> Primarily:
> 1. timestamp and IEEE 1588 support
> 2. OH and ONIC based virtual port config in DPAA
> 3. frame display and debugging infra
> 
> 
> Gagandeep Singh (3):
>   bus/dpaa: fix PFDRs leaks due to FQRNIs
>   net/dpaa: support mempool debug
>   net/dpaa: improve the dpaa port cleanup
> 
> Hemant Agrawal (5):
>   bus/dpaa: fix VSP for 1G fm1-mac9 and 10
>   bus/dpaa: fix the fman details status
>   bus/dpaa: add port buffer manager stats
>   net/dpaa: implement detailed packet parsing
>   net/dpaa: enhance DPAA frame display
> 
> Jun Yang (2):
>   net/dpaa: share MAC FMC scheme and CC parse
>   net/dpaa: improve dpaa errata A010022 handling
> 
> Rohit Raj (3):
>   net/dpaa: fix typecasting ch ID to u32
>   bus/dpaa: add OH port mode for dpaa eth
>   bus/dpaa: add ONIC port mode for the DPAA eth
> 
> Vanshika Shukla (5):
>   net/dpaa: support Tx confirmation to enable PTP
>   net/dpaa: add support to separate Tx conf queues
>   net/dpaa: support Rx/Tx timestamp read
>   net/dpaa: support IEEE 1588 PTP
>   net/dpaa: fix reallocate_mbuf handling
>

Adding implied ack from Hemant:
For series,
Acked-by: Hemant Agrawal 


Series applied to dpdk-next-net/main, thanks.

Re: [PATCH v4 00/18] NXP DPAA ETH driver enhancement and fixes

2024-10-01 Thread Ferruh Yigit

On 9/30/2024 1:15 PM, Hemant Agrawal wrote:
> v4: fix clan compilation issues
> v3: addressed Ferruh's comments 
>  - dropped Tx rate limit API patch
>  - added one small bug fix
>  - fixed removal/add of fman_offline type
> 
> v2: address review comments
>  - improve commit message
>  - add documentarion for new functions
>  - make IEEE1588 config runtime
> 
> This series adds several enhancement to the NXP DPAA Ethernet driver.
> 
> Primarily:
> 1. timestamp and IEEE 1588 support
> 2. OH and ONIC based virtual port config in DPAA
> 3. frame display and debugging infra
> 
> 
> Gagandeep Singh (3):
>   bus/dpaa: fix PFDRs leaks due to FQRNIs
>   net/dpaa: support mempool debug
>   net/dpaa: improve the dpaa port cleanup
> 
> Hemant Agrawal (5):
>   bus/dpaa: fix VSP for 1G fm1-mac9 and 10
>   bus/dpaa: fix the fman details status
>   bus/dpaa: add port buffer manager stats
>   net/dpaa: implement detailed packet parsing
>   net/dpaa: enhance DPAA frame display
> 
> Jun Yang (2):
>   net/dpaa: share MAC FMC scheme and CC parse
>   net/dpaa: improve dpaa errata A010022 handling
> 
> Rohit Raj (3):
>   net/dpaa: fix typecasting ch ID to u32
>   bus/dpaa: add OH port mode for dpaa eth
>   bus/dpaa: add ONIC port mode for the DPAA eth
> 
> Vanshika Shukla (5):
>   net/dpaa: support Tx confirmation to enable PTP
>   net/dpaa: add support to separate Tx conf queues
>   net/dpaa: support Rx/Tx timestamp read
>   net/dpaa: support IEEE 1588 PTP
>   net/dpaa: fix reallocate_mbuf handling
>

Patch by patch build fails [1] on following patch:

[PATCH v4 14/18] bus/dpaa: add OH port mode for dpaa eth


Also there are some valid checkpatch warnings, can you please fix them:
https://mails.dpdk.org/archives/test-report/2024-September/801309.html

WARNING:TYPO_SPELLING: 'assited' may be misspelled - perhaps 'assisted'?
#8:
This is an hardware assited IPC mechanism for communiting between two
^^^


[1]
../drivers/net/dpaa/dpaa_flow.c: In function ‘get_rx_port_type’:
../drivers/net/dpaa/dpaa_flow.c:648:30: error: ‘fman_offline_internal’
undeclared (first use in this function)
  648 | if (fif->mac_type == fman_offline_internal)
  |  ^
../drivers/net/dpaa/dpaa_flow.c:648:30: note: each undeclared identifier
is reported only once for each function it appears in
../drivers/net/dpaa/dpaa_flow.c: In function ‘dpaa_port_vsp_configure’:
../drivers/net/dpaa/dpaa_flow.c:963:30: error: ‘fman_offline_internal’
undeclared (first use in this function)
  963 | if (fif->mac_type == fman_offline_internal)
  |  ^

Re: [PATCH v3 1/1] ethdev: fix int overflow in descriptor count logic

2024-09-30 Thread Ferruh Yigit

On 9/30/2024 2:40 PM, Niall Meade wrote:
> Addressed a specific overflow issue in the eth_dev_adjust_nb_desc()
> function where the uint16_t variable nb_desc would overflow when its
> value was greater than (2^16 - nb_align). This overflow caused nb_desc
> to incorrectly wrap around between 0 and nb_align-1, leading to the
> function setting nb_desc to nb_min instead of the expected nb_max.
> 
> To give an example, let nb_desc=UINT16_MAX, nb_align=32, nb_max=4096 and
> nb_min=64. RTE_ALIGN_CEIL(nb_desc, nb_align) calls
> RTE_ALIGN_FLOOR(nb_desc + nb_align - 1, nb_align). This results in an
> overflow of nb_desc, leading to nb_desc being set to 30 and then 0 when
> the macros return. As a result of this, nb_desc is then set to nb_min
> later on.
> 
> The resolution involves upcasting nb_desc to a uint32_t before the
> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
> result in an overflow, as it would when nb_desc is a uint16_t. By using
> a uint32_t for these operations, the correct behavior is maintained
> without the risk of overflow.
> 
> Fixes: 0f67fc3baeb9 ("ethdev: add function to adjust number of descriptors")
> 
> Signed-off-by: Niall Meade 
>

Reviewed-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.

Re: [PATCH v3 0/3] add frequency adjustment support for PTP

2024-09-30 Thread Ferruh Yigit

On 9/30/2024 9:42 AM, Mingjin Ye wrote:
> [1/3] ethdev: add frequency adjustment API
> [2/3] net/ice: add frequency adjustment support for PTP
> [3/3] examples/ptpclient: add frequency adjustment support
> ---
> v2: rte_eth_timesync_adjust_freq marked as experimental.
> ---
> v3: Add more description for API.
> 
> Mingjin Ye (3):
>   ethdev: add frequency adjustment API
>   net/ice: add frequency adjustment support for PTP
>   examples/ptpclient: add frequency adjustment support
>

Who is to ack the net/ice driver, there is no maintainer listed in the
maintainers file?

@Kirill, can you please review the ptpclient changes?

Thanks,
ferruh

Re: [PATCH v3 1/3] ethdev: add frequency adjustment API

2024-09-30 Thread Ferruh Yigit

On 9/30/2024 9:42 AM, Mingjin Ye wrote:
> This patch adds freq adjustment API for PTP high accuracy.
> 
> Signed-off-by: Simei Su 
> Signed-off-by: Mingjin Ye 
>

Reviewed-by: Ferruh Yigit

Re: [PATCH v3 3/3] examples/ptpclient: add frequency adjustment support

2024-09-30 Thread Ferruh Yigit

On 9/30/2024 9:42 AM, Mingjin Ye wrote:
> This patch adds PI servo algorithm to support frequency
> adjustment API for IEEE1588 PTP.
> 
> For example, the command for starting ptpclient with PI algorithm is:
> ./build/examples/dpdk-ptpclient -a :81:00.0 -c 1 -n 3 -- -T 0 -p 0x1
> --controller=pi
> 
> Signed-off-by: Simei Su 
> Signed-off-by: Wenjun Wu 
> Signed-off-by: Mingjin Ye 
> ---
>  examples/ptpclient/ptpclient.c | 303 +
>  1 file changed, 268 insertions(+), 35 deletions(-)
> 

It can be good to update the sample application document for this update.

Also can you please provide more information how this sample can be used
to test the application. What is the environment, how to verify the
function is working etc..
This can be documented in the sample application document if makes
sense, if not in the commit log.

<...>

> @@ -724,6 +941,11 @@ ptp_parse_args(int argc, char **argv)
>  
>   ptp_data.kernel_time_set = ret;
>   break;
> + case 0:
> + if (!strcmp(lgopts[option_index].name, "controller"))
> + if (!strcmp(optarg, "pi"))
> + mode = MODE_PI;
>

Please use "struct option lgopts" for the return values of
'getopt_long()', instead of strcmp.
'l2fwd' (l2fwd_parse_args()) has sample usage.

Re: [RFC 0/4] ethdev: rework config restore

2024-09-29 Thread Ferruh Yigit

On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote:
> Hi all,
> 
> We have been working on optimizing the latency of calls to 
> rte_eth_dev_start(),
> on ports spawned by mlx5 PMD. Most of the work requires changes in the 
> implementation of
> .dev_start() PMD callback, but I also wanted to start a discussion regarding
> configuration restore.
> 
> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> 
> - Before calling it:
> - eth_dev_mac_restore() - if device supports RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> - After calling it:
> - eth_dev_mac_restore() - if device does not support 
> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> - restore promiscuous config
> - restore all multicast config
> 
> eth_dev_mac_restore() iterates over all known MAC addresses -
> stored in rte_eth_dev_data.mac_addrs array - and calls
> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> 
> Promiscuous config restore checks if promiscuous mode is enabled or not,
> and calls .promiscuous_enable() or .promiscuous_disable() callback.
> 
> All multicast config restore checks if all multicast mode is enabled or not,
> and calls .allmulticast_enable() or .allmulticast_disable() callback.
> 
> Callbacks are called directly in all of these cases, to bypass the checks
> for applying the same configuration, which exist in relevant APIs.
> Checks are bypassed to force drivers to reapply the configuration.
> 
> Let's consider what happens in the following sequence of API calls.
> 
> 1. rte_eth_dev_configure()
> 2. rte_eth_tx_queue_setup()
> 3. rte_eth_rx_queue_setup()
> 4. rte_eth_promiscuous_enable()
> - Call dev->dev_ops->promiscuous_enable()
> - Stores promiscuous state in dev->data->promiscuous
> 5. rte_eth_allmulticast_enable()
> - Call dev->dev_ops->allmulticast_enable()
> - Stores allmulticast state in dev->data->allmulticast
> 6. rte_eth_dev_start()
> - Call dev->dev_ops->dev_start()
> - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> - Call dev->dev_ops->promiscuous_enable()
> - Call dev->dev_ops->allmulticast_enable()
> 
> Even though all configuration is available in dev->data after step 5,
> library forces reapplying this configuration in step 6.
> 
> In mlx5 PMD case all relevant callbacks require communication with the kernel 
> driver,
> to configure the device (mlx5 PMD must create/destroy new kernel flow rules 
> and/or
> change netdev config).
> 
> mlx5 PMD handles applying all configuration in .dev_start(), so the following 
> forced callbacks
> force additional communication with the kernel. The same configuration is 
> applied multiple times.
> 
> As an optimization, mlx5 PMD could check if a given configuration was applied,
> but this would duplicate the functionality of the library
> (for example rte_eth_promiscuous_enable() does not call the driver
> if dev->data->promiscuous is set).
> 
> Question: Since all of the configuration is available before .dev_start() 
> callback is called,
> why ethdev library does not expect .dev_start() to take this configuration 
> into account?
> In other words, why library has to reapply the configuration?
> 
> I could not find any particular reason why configuration restore exists
> as part of the process (it was in the initial DPDK commit).
> 

My assumption is .dev_stop() cause these values reset in some devices,
so .dev_start() restores them back.
@Bruce or @Konstantin may remember the history.

But I agree this is device specific behavior, and can be managed by what
device requires.


> The patches included in this RFC, propose a mechanism which would help with 
> managing which drivers
> rely on forceful configuration restore.
> Drivers could advertise if forceful configuration restore is needed through
> `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is set, then the 
> driver in question
> requires ethdev to forcefully restore configuration.
> 

OK to use flag for it, but not sure about using 'dev_info->dev_flags'
(RTE_ETH_DEV_*) for this, as this flag is shared with user and this is
all dpdk internal.

What about to have a dedicated flag for it? We can have a dedicated set
of flag values for restore.
Also perhaps we have go into details what needs to be restored after
'stop' and what needs to be restored after 'reset' and use similar
mechanism etc...

> This way, if we would conclude that it makes sense for .dev_start() to handle 
> all
> starting configuration aspects, we could track which drivers still rely on 
> configuration restore.
> 
> Dariusz Sosnowski (4):
>   ethdev: rework config restore
>   ethdev: omit promiscuous config restore if not required
>   ethdev: omit all multicast config restore if not required
>   ethdev: omit MAC address restore if not required
> 
>  lib/ethdev/rte_ethdev.c | 39 ++-
>  lib/ethdev/rte_ethdev.h | 18 ++
>  2 files changed, 52 insertions(+), 5 deletions(-)
> 
> --
> 2.39.5
>

Re: [PATCH v4 2/2] app/testpmd: add set dev led on/off command

2024-09-29 Thread Ferruh Yigit

On 9/18/2024 3:38 AM, Chaoyong He wrote:
> From: James Hershaw 
> 
> Add command to change the state of a controllable LED on an ethdev port
> to on/off. This is for the purpose of identifying which physical port is
> associated with an ethdev.
> 
> Usage:
>   testpmd> set port  led 
> 
> Signed-off-by: James Hershaw 
> Reviewed-by: Chaoyong He 
> ---
>  app/test-pmd/cmdline.c  | 52 +
>  app/test-pmd/config.c   | 20 
>  app/test-pmd/testpmd.h  |  1 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  7 +++
>  4 files changed, 80 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index a227d3cdc5..a117a4c2fb 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -523,6 +523,10 @@ static void cmd_help_long_parsed(void *parsed_result,
>   " user is acknowledging and taking responsibility for"
>   " this risk.\n\n"
>  
> + "set port (port_id) led (on|off)\n"
> + "set a controllable LED associated with a certain"
> + " port on or off.\n\n"
> +
>

new command seems added in between some mac related commands (excluding
new eeprom cmd, assuming it will be moved), as we don't have anything
related to the led before maybe it can go last.
Same comment for all updates in this patch, can you please double check
the order, and put updates in middle of other grouping.

Re: [PATCH v4 1/2] app/testpmd: add support for setting device EEPROM

2024-09-29 Thread Ferruh Yigit

On 9/18/2024 3:38 AM, Chaoyong He wrote:
> From: James Hershaw 
> 
> There is currently no means to test the .set_eeprom function callback of
> a given PMD in drivers/net/. This patch adds functionality to allow a
> user to set device eeprom from the testpmd cmdline.
> 
> Usage:
>   testpmd> set port  eeprom  magic  \
>   value  offset 
> 
> -  is a fixed string that is required from the user to
>   acknowledge the risk involved in using this command and accepting the
>   reposibility for that.
>

+1 have a warning for user, but 'confirm' may not be clear enough, what
about something around "accept_risk"?

s/reposibility/responsibility/


> -  is a decimal
> -  is a hex-as-string with no leading "0x"
> -  is a decimal (this field is optional and defaults to 0 if
>   not specified.)
> 
> Signed-off-by: James Hershaw 
> Reviewed-by: Chaoyong He 
> ---
>  app/test-pmd/cmdline.c  | 123 
>  app/test-pmd/config.c   |  39 +++
>  app/test-pmd/testpmd.h  |   2 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  18 +++
>  4 files changed, 182 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 358319c20a..a227d3cdc5 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -514,6 +514,15 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "set eth-peer (port_id) (peer_addr)\n"
>   "set the peer address for certain port.\n\n"
>  
> + "set port (port_id) eeprom (confirm) magic (magic_num)"
> + " value (value) offset (offset)\n"
> + "Set the device eeprom for certain port.\nNote:\n"
> + "This is a high-risk command and its misuse may"
> + " result in unexpected behaviour from the NIC.\n"
> + "By inserting \"confirm\" into the command, the"
> + " user is acknowledging and taking responsibility for"
> + " this risk.\n\n"
> +
>

There is another eeprom command above [1], lets put this one below it.
Also can you please move the function implementation in similar order.


[1]
"show port port_id (module_eeprom|eeprom)\n"


>   "set port (port_id) uta (mac_address|all) (on|off)\n"
>   "Add/Remove a or all unicast hash filter(s)"
>   "from port X.\n\n"
> @@ -7488,6 +7497,119 @@ static cmdline_parse_inst_t cmd_set_fwd_eth_peer = {
>   },
>  };
>  
> +/* *** SET PORT EEPROM *** */
> +struct cmd_seteeprom_result {
> + cmdline_fixed_string_t set;
> + cmdline_fixed_string_t port;
> + uint16_t portnum;
> + cmdline_fixed_string_t eeprom;
> + cmdline_fixed_string_t confirm_str;
> + cmdline_fixed_string_t magic_str;
> + uint32_t magic;
> + cmdline_fixed_string_t value;
> + cmdline_multi_string_t token_str;
> +};
> +
> +static void cmd_seteeprom_parsed(void *parsed_result,
> + __rte_unused struct cmdline *cl,
> + __rte_unused void *data)
> +{
> + struct cmd_seteeprom_result *res = parsed_result;
> + uint32_t offset = 0;
> + uint32_t length;
> + char *token_str;
> + char *token;
> +
> + token_str = res->token_str;
> + token = strtok_r(token_str, " \f\n\r\t\v", &token_str);
> +
> + /* Parse Hex string to Byte data */
> + if (strlen(token) % 2 != 0) {
> + fprintf(stderr, "Bad Argument: %s\nHex string must be in 
> multiples of 2 Bytes\n",
> + token);
> + return;
> + }
> +
> + length = strlen(token) / 2;
> + uint8_t value[length];
>

VLA can cause stack overflow, what about dynamic memory allocation.


> + for (int count = 0; count < (int)(length); count++) {
> + if (sscanf(token, "%2hhx", &value[count]) != 1) {
> + fprintf(stderr, "Bad Argument: %s\nValue must be a hex 
> string\n",
> + token - (count + 1));
> + return;
> + }
> + token += 2;
> + }
> +
> + /* Second token: offset string */
> + token = strtok_r(token_str, " \f\n\r\t\v", &token_str);
> + if (token != NULL) {
> + if (strcmp(token, "offset") == 0) {
> + /* Third token: offset */
> + token = strtok_r(token_str, " \f\n\r\t\v", &token_str);
> + if (token == NULL) {
> + fprintf(stderr, "No offset specified\n");
> + return;
> + }
>

Why not move 'offset' before value, so can use it as res->offset?


> +
> + char *end;
> + offset = strtoul(token, &end, 10);
> + if (offset == 0 && strcmp(end, "") != 0) {
> + fprintf(stderr, "Bad Argument: %s\n", toke

Re: [PATCH] net/tap: add new macpair option for split MAC address

2024-09-29 Thread Ferruh Yigit

On 9/17/2024 12:51 PM, Isaac Boukris wrote:
> Normally, the MAC address of the kernel interface is the same as in the
> interface in dpdk, as they represent the same interface. It is useful
> to allow viewing them as separate connected interfaces (like ip's veth).
> 
> This solves a problem I have running a freebsd-based IPv6 stack on top
> of dpdk and using the tap interface, as both the kernel and freebsd
> stacks configure the MAC derived IPv6 address on the interface (as can
> be seen with ifconfig for the kernel), and they both complain about
> duplicate IPv6 address and the freebsd disables IPv6 as a result.
> 

How kernel side knows about the IPv6 address DPDK side uses, what is
your tap usecase?


> Signed-off-by: Isaac Boukris 
> ---
>  doc/guides/nics/tap.rst   | 18 +
>  drivers/net/tap/rte_eth_tap.c | 37 ++-
>  drivers/net/tap/rte_eth_tap.h |  1 +
>  3 files changed, 51 insertions(+), 5 deletions(-)
> 
> diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
> index f01663c700..fd936c6084 100644
> --- a/doc/guides/nics/tap.rst
> +++ b/doc/guides/nics/tap.rst
> @@ -71,6 +71,24 @@ But this behavior can be overridden by the use of the 
> persist flag, example::
>  
>--vdev=net_tap0,iface=tap0,persist ...
>  
> +Normally, the MAC address of the kernel interface is the same as in the
> +interface in dpdk, as they represent the same interface. If a MAC address is 
> set
> +by the dpdk application using ``rte_eth_dev_default_mac_addr_set()`` then the
> +kernel interface changes accordingly (but not vice versa).
> +
> +If the ``macpair`` option is given, then the MAC address of the kernel
> +interface is initially set to a different random MAC address and it is no
> +longer altered by dpdk. This allows to view the two interfaces as separate
> +connected interfaces, similar to linux's veth pair interfaces. For instance,
> +you can configure an IP address on the kernel interface with the ``ip`` 
> command
> +in the same subnet as the one used in dpdk and use it as next gateway.
> +
> +The ``macpair`` options is incompatible with the ``remote`` option (see 
> above).
> +
>

I can see 'macpair' is coming from "veth pair" above, but I don't think
it is very explanatory in the context of tap. Do you have any more
suggestion?

Another concern is how to test this, tap PMD got multiple arguments now,
and as the pmd keeps getting more updates and features, how can we sure
this devarg usecase is not broken?

Re: [RFC] ethdev: introduce PTP device capability

2024-09-29 Thread Ferruh Yigit

On 9/29/2024 2:31 AM, lihuisong (C) wrote:
> 
> 在 2024/9/26 4:42, Ferruh Yigit 写道:
>> On 9/24/2024 8:24 AM, lihuisong (C) wrote:
>>> Hi Ferruh,
>>>
>>>
>>> 在 2024/9/23 6:23, Ferruh Yigit 写道:
>>>> On 1/30/2024 3:58 AM, Huisong Li wrote:
>>>>> Currently, the PTP feature is a little messy and has some issue.
>>>>> 1> There is different implementation for PTP. This feature of some
>>>>>  hardware depends on the Rx HW timestamp offload, and use this
>>>>>  offload to detect if enable PTP feature. Others can enable PTP
>>>>>  feature with only ethdev ops.
>>>>> 2> For some drivers, enabling PTP feature also depends on the macro
>>>>>  RTE_LIBRTE_IEEE1588. Actually, whether device support, enable
>>>>>  or disable this feature should not be determined at compilation
>>>>>  time. This has been discussed in thread [1].
>>>>> 3> The user haven't a good way to distinguish which port supports
>>>>>  the PTP feature in the case that a couple of device belong to the
>>>>>  same driver. And PTP related API in ethdev layer doesn't do check
>>>>>  for PTP capability. This has been mentioned and discussed in
>>>>>  thread [2].
>>>>>
>>>>> In the thread [1], there is a conclusion that remove the compiling
>>>>> macro RTE_LIBRTE_IEEE1588. And in the thread [2], there is a common
>>>>> opinion that the RTE_ETH_RX_OFFLOAD_TIMESTAMP cannot be used as the
>>>>> PTP capability.
>>>>>
>>>>> For the above issuse, this patch introduces a PTP capability in
>>>>> rte_eth_dev_info.dev_capa to report PTP capability.
>>>>>
>>>>> Welcome to jumping into discussion.
>>>>>
>>>>> [1] https://patchwork.dpdk.org/project/dpdk/
>>>>> patch/20230203132810.14187-1-tho...@monjalon.net/
>>>>> [2] https://inbox.dpdk.org/dev/20230817084226.55327-1-
>>>>> lihuis...@huawei.com/
>>>>>
>>>>> Signed-off-by: Huisong Li 
>>>>> ---
>>>>>    lib/ethdev/rte_ethdev.h | 6 ++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>> index efa4f12b2a..4c8bd691bd 100644
>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>> @@ -1613,6 +1613,12 @@ struct rte_eth_conf {
>>>>>    #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP RTE_BIT64(3)
>>>>>    /** Device supports keeping shared flow objects across restart. */
>>>>>    #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
>>>>> +/**
>>>>> + * Device supports PTP feature.
>>>>> + * For some hardware, this feature also need to set the offload
>>>>> + * RTE_ETH_RX_OFFLOAD_TIMESTAMP, please check
>>>>> rte_eth_dev_info.rx_offload_capa.
>>>>> + */
>>>>> +#define RTE_ETH_DEV_CAPA_PTP RTE_BIT64(5)
>>>>>    /**@}*/
>>>>>      /*
>>>> Hi Huisong,
>>>>
>>>> Thanks for the effort, as you mentioned PTP feature can be improved and
>>>> there is a target to remove RTE_LIBRTE_IEEE1588 build time macro.
>>>>
>>>> As far as I remember, one of the main reasons of the
>>>> RTE_LIBRTE_IEEE1588
>>>> macro is some HW has resource restrictions, when RTE_LIBRTE_IEEE1588 is
>>>> enabled some other features, like vector datapath, are not usable, that
>>>> is why this is a build time selection.
>>> I think the main reason that driver don't support PTP feature in vector
>>> datapath is for performance.
>>> This can be controled by releated dev_ops API or TIMESTAMP offload and
>>> no need to use RTE_LIBRTE_IEEE1588 macro, like hns3.
>>>> And related to the PTP capability, can you please give some more
>>>> information, what does device having PTP capability exactly means.
>>> Just the device having PTP capability can be used to enable PTP feature.
>>>
>> Hi Huisong,
>>
>> I am asking for your support but not able to get detailed information is
>> not helping to progress the patch.
>>
>> If application implements PTP protocol, we already have APIs for
>> application to read time from NIC, or to adjust NI

Re: [PATCH v3] net/nfp: implement the device packet type set interface

2024-09-27 Thread Ferruh Yigit

On 9/27/2024 4:10 AM, Chaoyong He wrote:
> From: Long Wu 
> 
> Using the Rx packet offload flag rather than the device
> capability to control the packet type offload configuration.
> Also implement the device packet type set interface to
> let application can set the Rx packet offload flag.
> 
> Signed-off-by: Long Wu 
> Reviewed-by: Chaoyong He 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH v2 1/1] ethdev: fix int overflow in descriptor count logic

2024-09-27 Thread Ferruh Yigit

On 9/27/2024 2:23 PM, Niall Meade wrote:
> Addressed a specific overflow issue in the eth_dev_adjust_nb_desc()
> function where the uint16_t variable nb_desc would overflow when its
> value was greater than (2^16 - nb_align). This overflow caused nb_desc
> to incorrectly wrap around between 0 and nb_align-1, leading to the
> function setting nb_desc to nb_min instead of the expected nb_max.
> 
> To give an example, let nb_desc=UINT16_MAX, nb_align=32, nb_max=4096 and
> nb_min=64. RTE_ALIGN_CEIL(nb_desc, nb_align) calls
> RTE_ALIGN_FLOOR(nb_desc + nb_align - 1, nb_align). This results in an
> overflow of nb_desc, leading to nb_desc being set to 30 and then 0 when
> the macros return. As a result of this, nb_desc is then set to nb_min
> later on.
> 
> The resolution involves upcasting nb_desc to a uint32_t before the
> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
> result in an overflow, as it would when nb_desc is a uint16_t. By using
> a uint32_t for these operations, the correct behavior is maintained
> without the risk of overflow.
> 
> Fixes: 0f67fc3baeb9 ("ethdev: add function to adjust number of descriptors")
> 
> Signed-off-by: Niall Meade 
> 
> ---
> v2:
> * add example of issue to commit message
> ---
>  .mailmap|  1 +
>  lib/ethdev/rte_ethdev.c | 12 +---
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/.mailmap b/.mailmap
> index 4a508bafad..c1941e78bb 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1053,6 +1053,7 @@ Nelson Escobar 
>  Nemanja Marjanovic 
>  Netanel Belgazal 
>  Netanel Gonen 
> +Niall Meade 
>  Niall Power 
>  Nicholas Pratte 
>  Nick Connolly  
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index f1c658f49e..f978283edf 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6577,13 +6577,19 @@ static void
>  eth_dev_adjust_nb_desc(uint16_t *nb_desc,
>   const struct rte_eth_desc_lim *desc_lim)
>  {
> + /* Upcast to uint32 to avoid potential overflow with RTE_ALIGN_CEIL(). 
> */
> + uint32_t nb_desc_32 = *nb_desc;
> +
>   if (desc_lim->nb_align != 0)
> - *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
> + nb_desc_32 = RTE_ALIGN_CEIL(nb_desc_32, desc_lim->nb_align);
>  
>   if (desc_lim->nb_max != 0)
> - *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);
> + nb_desc_32 = RTE_MIN(nb_desc_32, desc_lim->nb_max);
> +
> + nb_desc_32 = RTE_MAX(nb_desc_32, desc_lim->nb_min);
>  
> - *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min);
> + /* Assign clipped u32 back to u16. */
> + *nb_desc = nb_desc_32;
>

Better to add explicit cast here, in case compiler has '-Wconversion'
flag etc..

Except from above, lgtm.

Re: [PATCH v6 0/2] Add link_speed lanes support

2024-09-27 Thread Ferruh Yigit

On 9/26/2024 10:43 PM, Damodharam Ammepalli wrote:
> This patch series is a continuation of the patch set that supports 
> configuring speed lanes.
> https://patchwork.dpdk.org/project/dpdk/patch/20240708232351.491529-1-
> damodharam.ammepa...@broadcom.com/
> 
> The patchset consists
> 1) rtelib/testpmd changes (Addressing the comments). Earlier comments are 
> available here, 
> https://patchwork.dpdk.org/project/dpdk/list/?
> q=Add%20link_speed%20lanes%20support&archive=both&series=&submitter=&delegate=&state=*
> 
> 2) Driver implementation of bnxt PMD.
> 
> v2->v3 Consolidating the testpmd and rtelib patches into a single patch as 
> requested.
> v3->v4 Addressed comments and fix help string and documentation.
> v4->v5 Addressed comments given in v4 and also driver implementation of bnxt 
> PMD.
> v5->v6 Removed LANE_xx enums and updated driver source files. Minor help
> string fix.
> 
> Damodharam Ammepalli (2):
>   ethdev: Add link_speed lanes support
>   net/bnxt: code refactor for supporting speed lanes
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic

2024-09-27 Thread Ferruh Yigit

On 9/27/2024 11:46 AM, Meade, Niall wrote:
>> On 9/26/2024 3:03 PM, Meade, Niall wrote:
>>>> From: Ferruh Yigit 
>>>> Sent: Thursday, September 26, 2024 12:16 AM
>>>> To: Meade, Niall ; Thomas Monjalon 
>>>> ; Andrew Rybchenko ; 
>>>> Roman Zhukov 
>>>> Cc: dev@dpdk.org 
>>>> Subject: Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic
>>> 
>>>>> The resolution involves upcasting nb_desc to a uint32_t before the
>>>>> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
>>>>> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
>>>>> result in an overflow, as it would when nb_desc is a uint16_t. By using
>>>>> a uint32_t for these operations, the correct behavior is maintained
>>>>> without the risk of overflow.
>>>>>
>>>>
>>>> Hi Niall,
>>>
>>> Hi Ferruh,
>>>
>>>> Thanks for the patch.
>>>>
>>>> For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
>>>> two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
>>>> 'val' should be smaller than or equal to 'align', so '*nb_desc' can be
>>>> maximum 2^15.
>>>>
>>>> So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
>>>> (although I didn't test).
>>>>
>>>> And even with your uint32_t cast, I think following will fail:
>>>> RTE_ALIGN_CEIL(2^16-1, 2^15)
>>>> (again, not tested).
>>>>
>>>
>>> I tested my code with these values and the behaviour is as expected from
>>> what I can see.
>>> At a high level I ran into this issue when passing uint16_tMAX into
>>> rte_eth_dev_adjust_nb_rx_tx_desc() with the intent of selecting the maximum
>>> ring descriptor size but the minimum was selected.
>>>
>>>> Or maybe I am missing a case, can you please give some actual numbers to
>>>> show the problem and the fix?
>>>
>>> Yes sure! If we take an example of val= (2^16)-1 and align= 32.
>>> RTE_ALIGN_CEIL(val, align) calls RTE_ALIGN_FLOOR(val + align - 1, align). 
>>> With
>>> val as a uint16_t this subsequent macro call results in a wrap around for 
>>> val
>>> (originally was the max uint16_t and now we are attempting to add align to
>>> it). The returned value of RTE_ALIGN_CEIL() in this case is 0. This results 
>>> in
>>> nb_desc being set to 0, and later set to the minimum ring descriptor size 
>>> for
>>> that NIC with *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min).
>>>
>>> While this example is an unreasonably large request for a descriptor ring 
>>> size,
>>> the expected behaviour would be that the descriptor ring size defaults back 
>>> to
>>> the maximum possible for that particular NIC, not to the minimum which it
>>> currently does.
>>> By introducing a uint32_t, the wrap around in RTE_ALIGN_FLOOR() is avoided,
>>> keeping the large value of nb_desc_32 which is later set to an appropriate 
>>> size
>>> in RTE_MIN(*nb_desc_32, desc_lim->nb_max)
>>>
>>
>> I see the problem now, thanks.
>>
>> When value > (2^16 - align), next aligned value is 2^16, which is
>> UINT16_MAX + 1, hence wraps to 0, this is kind of expected.
>>
>> For the relevant code, assuming 'desc_lim->nb_max' & 'desc_lim->nb_min'
>> are already aligned to 'desc_lim->nb_align', following should fix the
>> issue, that seems simpler to me, what do you think:
> 
> Yes, while it is a simpler solution there is still potential for an overflow 
> if nb_max
> is equal to 0. If nb_max is 0 while nb_desc is UINT16_MAX, UINT16_MAX will be
> passed to the align macro resulting in an overflow again.
> 

ack

>>
>> ```
>> if (desc_lim->nb_max != 0)
>>     *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);
>>
>> nb_desc_32 = RTE_MAX(nb_desc_32, desc_lim->nb_min);
>>
>> if (desc_lim->nb_align != 0)
>>     *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
>> ```
>>
>> Basically just changing the order of the operations...
>>
>> It is not easy to see the problem, can you please give sample values in
>> the commit log (for '*nb_desc', 'nb_align', 'nb_max' & 'nb_min'), that
>> makes much easier to see why above works.
> 
> Yes, good idea! I'll add an example to the commit log for clarity.
> --
> Intel Research and Development Ireland Limited
> Registered in Ireland
> Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
> Registered Number: 308263
> 
> 
> This e-mail and any attachments may contain confidential material for the sole
> use of the intended recipient(s). Any review or distribution by others is
> strictly prohibited. If you are not the intended recipient, please contact the
> sender and delete all copies.
>

Re: [PATCH v9 0/8] support dump reigster names and filter

2024-09-27 Thread Ferruh Yigit

On 9/26/2024 1:42 PM, Jie Hai wrote:
> The registers can be dumped through the API rte_eth_dev_get_reg_info.
> However, only register values are exported, which is inconvenient for
> users to interpret. Therefore, an extension of the structure
> "rte_dev_reg_info" and a new API rte_eth_dev_get_reg_info_ext is added
> to support the capability of exporting the name of the corresponding
> register and filtering by module names.
> 
> --
> v9:
> 1. add Acked-by.
> 2. fix typo and inaccurate expression of hns3.rst.
> 
> v8:
> 1. fix typo on patch[0/8].
> 2. add Reviewed-bys and Acked-bys.
> 3. fix potentital memory leak and error information on telemetry.
> 4. fix on reviews about hns3 driver.
> 
> v7: fix doc indention.
> --
> 
> Jie Hai (8):
>   ethdev: support report register names and filter
>   ethdev: add telemetry cmd for registers
>   net/hns3: remove some basic address dump
>   net/hns3: fix dump counter of registers
>   net/hns3: remove separators between register module
>   net/hns3: refactor register dump
>   net/hns3: support report names of registers
>   net/hns3: support filter registers by module names
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH v2 0/7] ethdev: jump to table support

2024-09-26 Thread Ferruh Yigit

On 9/25/2024 7:05 PM, Alexander Kozyrev wrote:
> Introduce new Flow API JUMP_TO_TABLE_INDEX action.
> It allows bypassing a hierarchy of groups and going directly
> to a specified flow table. That gives a user the flexibility
> to jump between different priorities in a group and eliminates
> the need to do a table lookup in the group hierarchy.
> The JUMP_TO_TABLE_INDEX action forwards a packet to the
> specified rule index inside the index-based flow table.
> 
> The current index-based flow table doesn't do any matching
> on the packet and executes the actions immediately.
> Add a new index-based flow table with pattern matching.
> The JUMP_TO_TABLE_INDEX can redirect a packet to another
> matching criteria at the specified index in this case.
> 
> RFC: https://patchwork.dpdk.org/project/dpdk/patch/20240822202753.3856703-1-
> akozy...@nvidia.com/
> v2: added trace point to flow insertion by index functions.
> 
> Alexander Kozyrev (7):
>   ethdev: add insertion by index with pattern
>   app/testpmd: add index with pattern insertion type
>   ethdev: add flow rule insertion by index with pattern
>   app/testpmd: add insertion by index with pattern option
>   ethdev: add jump to table index action
>   app/testpmd: add jump to table index action
>   ethdev: add trace points to flow insertion by index
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH v2] net/nfp: implement the device packet type set interface

2024-09-26 Thread Ferruh Yigit

On 9/26/2024 8:16 AM, Chaoyong He wrote:
> From: Long Wu 
> 
> Using the Rx packet offload flag rather than the device
> capability to control the packet type offload configuration.
> Also implement the device packet type set interface to
> let application can set the Rx packet offload flag.
> 
> Signed-off-by: Long Wu 
> Reviewed-by: Chaoyong He 
> 
> ---
> V2:
> * Following the advice from reviewer, abandon the modification of RTE
>   layer.
> ---
>  drivers/net/nfp/nfp_ethdev.c |  1 +
>  drivers/net/nfp/nfp_net_common.c | 42 +++-
>  drivers/net/nfp/nfp_net_common.h |  1 +
>  3 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
> index bd35df2dc9..09c15eedac 100644
> --- a/drivers/net/nfp/nfp_ethdev.c
> +++ b/drivers/net/nfp/nfp_ethdev.c
> @@ -932,6 +932,7 @@ static const struct eth_dev_ops nfp_net_eth_dev_ops = {
>   .xstats_get_names_by_id = nfp_net_xstats_get_names_by_id,
>   .dev_infos_get  = nfp_net_infos_get,
>   .dev_supported_ptypes_get = nfp_net_supported_ptypes_get,
> + .dev_ptypes_set = nfp_net_ptypes_set,
>

Hi Chaoyong,

It looks like nfp support "Packet type parsing", can you please
advertise this feature in the nfp.ini?
I think it is OK to add it into this patch.

Re: [PATCH v6 0/2] Add link_speed lanes support

2024-09-26 Thread Ferruh Yigit

On 9/26/2024 10:43 PM, Damodharam Ammepalli wrote:
> This patch series is a continuation of the patch set that supports 
> configuring speed lanes.
> https://patchwork.dpdk.org/project/dpdk/patch/20240708232351.491529-1-
> damodharam.ammepa...@broadcom.com/
> 
> The patchset consists
> 1) rtelib/testpmd changes (Addressing the comments). Earlier comments are 
> available here, 
> https://patchwork.dpdk.org/project/dpdk/list/?
> q=Add%20link_speed%20lanes%20support&archive=both&series=&submitter=&delegate=&state=*
> 
> 2) Driver implementation of bnxt PMD.
> 
> v2->v3 Consolidating the testpmd and rtelib patches into a single patch as 
> requested.
> v3->v4 Addressed comments and fix help string and documentation.
> v4->v5 Addressed comments given in v4 and also driver implementation of bnxt 
> PMD.
> v5->v6 Removed LANE_xx enums and updated driver source files. Minor help
> string fix.
> 
> Damodharam Ammepalli (2):
>   ethdev: Add link_speed lanes support
>   net/bnxt: code refactor for supporting speed lanes
>

Hi Ajit,

Can you please review the bnxt patch, if it is good I can progress with
the series.

Re: [PATCH v6 1/2] ethdev: Add link_speed lanes support

2024-09-26 Thread Ferruh Yigit

On 9/26/2024 10:43 PM, Damodharam Ammepalli wrote:
> Update the eth_dev_ops structure with new function vectors
> to get, get capabilities and set ethernet link speed lanes.
> Update the testpmd to provide required config and information
> display infrastructure.
> 
> The supporting ethernet controller driver will register callbacks
> to avail link speed lanes config and get services. This lanes
> configuration is applicable only when the nic is forced to fixed
> speeds. In Autonegiation mode, the hardware automatically
> negotiates the number of lanes.
> 
> These are the new commands.
> 
> testpmd> show port 0 speed_lanes capabilities
> 
>  Supported speeds Valid lanes
> ---
>  10 Gbps  1
>  25 Gbps  1
>  40 Gbps  4
>  50 Gbps  1 2
>  100 Gbps 1 2 4
>  200 Gbps 2 4
>  400 Gbps 4 8
> testpmd>
> 
> testpmd>
> testpmd> port stop 0
> testpmd> port config 0 speed_lanes 4
> testpmd> port config 0 speed 20 duplex full
> testpmd> port start 0
> testpmd>
> testpmd> show port info 0
> 
> * Infos for port 0  *
> MAC address: 14:23:F2:C3:BA:D2
> Device name: :b1:00.0
> Driver name: net_bnxt
> Firmware-version: 228.9.115.0
> Connect to socket: 2
> memory allocation on the socket: 2
> Link status: up
> Link speed: 200 Gbps
> Active Lanes: 4
> Link duplex: full-duplex
> Autoneg status: Off
> 
> Signed-off-by: Damodharam Ammepalli 
>

Reviewed-by: Ferruh Yigit 

Suggested patch title:
ethdev: support link speed lanes

<...>

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get Active lanes.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param lanes
> + *   Driver updates lanes with the number of active lanes.
> + *   On a supported NIC on link up, lanes will be a non-zero value 
> irrespective whether the
> + *   link is Autonegotiated or Fixed speed. No information is dispalyed for 
> error.
>

s/dispalyed/displayed/

I can fix while merging

Re: [PATCH v4] doc: add new driver guidelines

2024-09-26 Thread Ferruh Yigit

On 9/16/2024 5:28 PM, Stephen Hemminger wrote:
> From: Nandini Persad 
> 
> This document was created to assist contributors in creating DPDK drivers
> and provides suggestions and guidelines on how to upstream effectively.
> 
> Co-authored-by: Ferruh Yigit 
> Co-authored-by: Thomas Monjalon 
> Signed-off-by: Nandini Persad 
> Reviewed-by: Stephen Hemminger 
> Acked-by: Chengwen Feng 
>

Thanks for the update, build warnings and format issues seems fixed.

<...>


> ---
>  doc/guides/contributing/index.rst  |   1 +
>  doc/guides/contributing/new_driver.rst | 207 +
>  2 files changed, 208 insertions(+)
>  create mode 100644 doc/guides/contributing/new_driver.rst
> 
> diff --git a/doc/guides/contributing/index.rst 
> b/doc/guides/contributing/index.rst
> index dcb9b1fbf0..7fc6511361 100644
> --- a/doc/guides/contributing/index.rst
> +++ b/doc/guides/contributing/index.rst
> @@ -15,6 +15,7 @@ Contributor's Guidelines
>  documentation
>  unit_test
>  new_library
> +new_driver
>  patches
>  vulnerability
>  stable
> diff --git a/doc/guides/contributing/new_driver.rst 
> b/doc/guides/contributing/new_driver.rst
> new file mode 100644
> index 00..f95d3b90bc
> --- /dev/null
> +++ b/doc/guides/contributing/new_driver.rst
> @@ -0,0 +1,207 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright 2024 The DPDK contributors
> +
> +
> +Upstreaming New DPDK Drivers Guide
> +==
> +
>

The title of this guide is:
"Upstreaming New DPDK Drivers Guide"

Title of the document for adding new libraries:
"Adding a new library"

In the index they are next to each other, and different wording looks
inconsistent.
We can simply call this document "Adding a new driver",
or update library one title, but I think better to align them somehow.

<...>

> +Avoid doing the following:
> +
> +* Using PMD specific macros when DPDK macros exist
> +* Including unused headers (process-iwyu.py)
> +* Disabling compiler warnings for driver
> +* #ifdef with driver-defined macros
> +* DPDK version checks (via RTE_VERSION_NUM) in the upstream code
> +* Introducing Public APIs directly from the driver
> +* Adding driver private APIs. If a new feature is needed, it is
> +  better to extend the corresponding framework API
> +
>

Last two items are for the same issue, we sometime call public APIs from
drivers as "driver private APIs".
First one looks simpler, perhaps we can keep that one, but I don't have
strong opinion, as long as we remove the duplication.

<...>

> +
> +Be sure to run the following test tools per patch in a patch series:
>

The list is missing in the patch, it was:

 * checkpatches.sh
 * check-git-log.sh
 * check-meson.py
 * check-doc-vs-code.sh
 * check-spdx-tag.sh
 * Build documentation and validate how output looks

Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic

2024-09-26 Thread Ferruh Yigit

On 9/26/2024 3:03 PM, Meade, Niall wrote:
>> From: Ferruh Yigit 
>> Sent: Thursday, September 26, 2024 12:16 AM
>> To: Meade, Niall ; Thomas Monjalon 
>> ; Andrew Rybchenko ; 
>> Roman Zhukov 
>> Cc: dev@dpdk.org 
>> Subject: Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic
> 
>>> The resolution involves upcasting nb_desc to a uint32_t before the
>>> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
>>> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
>>> result in an overflow, as it would when nb_desc is a uint16_t. By using
>>> a uint32_t for these operations, the correct behavior is maintained
>>> without the risk of overflow.
>>>
>>
>> Hi Niall,
> 
> Hi Ferruh,
> 
>> Thanks for the patch.
>>
>> For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
>> two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
>> 'val' should be smaller than or equal to 'align', so '*nb_desc' can be
>> maximum 2^15.
>>
>> So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
>> (although I didn't test).
>>
>> And even with your uint32_t cast, I think following will fail:
>> RTE_ALIGN_CEIL(2^16-1, 2^15)
>> (again, not tested).
>>
> 
> I tested my code with these values and the behaviour is as expected from
> what I can see.
> At a high level I ran into this issue when passing uint16_tMAX into
> rte_eth_dev_adjust_nb_rx_tx_desc() with the intent of selecting the maximum
> ring descriptor size but the minimum was selected.
> 
>> Or maybe I am missing a case, can you please give some actual numbers to
>> show the problem and the fix?
> 
> Yes sure! If we take an example of val= (2^16)-1 and align= 32.
> RTE_ALIGN_CEIL(val, align) calls RTE_ALIGN_FLOOR(val + align - 1, align). With
> val as a uint16_t this subsequent macro call results in a wrap around for val
> (originally was the max uint16_t and now we are attempting to add align to
> it). The returned value of RTE_ALIGN_CEIL() in this case is 0. This results in
> nb_desc being set to 0, and later set to the minimum ring descriptor size for
> that NIC with *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min).
> 
> While this example is an unreasonably large request for a descriptor ring 
> size,
> the expected behaviour would be that the descriptor ring size defaults back to
> the maximum possible for that particular NIC, not to the minimum which it
> currently does.
> By introducing a uint32_t, the wrap around in RTE_ALIGN_FLOOR() is avoided,
> keeping the large value of nb_desc_32 which is later set to an appropriate 
> size
> in RTE_MIN(*nb_desc_32, desc_lim->nb_max)
> 

I see the problem now, thanks.

When value > (2^16 - align), next aligned value is 2^16, which is
UINT16_MAX + 1, hence wraps to 0, this is kind of expected.

For the relevant code, assuming 'desc_lim->nb_max' & 'desc_lim->nb_min'
are already aligned to 'desc_lim->nb_align', following should fix the
issue, that seems simpler to me, what do you think:

```
if (desc_lim->nb_max != 0)
*nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);

nb_desc_32 = RTE_MAX(nb_desc_32, desc_lim->nb_min);

if (desc_lim->nb_align != 0)
*nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
```

Basically just changing the order of the operations...

It is not easy to see the problem, can you please give sample values in
the commit log (for '*nb_desc', 'nb_align', 'nb_max' & 'nb_min'), that
makes much easier to see why above works.

Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic

2024-09-25 Thread Ferruh Yigit

On 9/23/2024 10:26 AM, Niall Meade wrote:
> Addressed a specific overflow issue in the eth_dev_adjust_nb_desc()
> function where the uint16_t variable nb_desc would overflow when its
> value was greater than (2^16 - nb_align). This overflow caused nb_desc
> to incorrectly wrap around between 0 and nb_align-1, leading to the
> function setting nb_desc to nb_min instead of the expected nb_max.
> 
> The resolution involves upcasting nb_desc to a uint32_t before the
> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
> result in an overflow, as it would when nb_desc is a uint16_t. By using
> a uint32_t for these operations, the correct behavior is maintained
> without the risk of overflow.
> 

Hi Niall,


Thanks for the patch.

For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
'val' should be smaller than or equal to 'align', so '*nb_desc' can be
maximum 2^15.

So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
(although I didn't test).

And even with your uint32_t cast, I think following will fail:
RTE_ALIGN_CEIL(2^16-1, 2^15)
(again, not tested).

Or maybe I am missing a case, can you please give some actual numbers to
show the problem and the fix?


Perhaps what we need is to verify mentioned requirements of the macro in
the function:
- 'align' should be power of two
- val <= align
But as this is a static function, these checks can be done in caller
function and preconditions can be enforced.


> Fixes: 0f67fc3baeb9 ("ethdev: add function to adjust number of descriptors")
> 
> Signed-off-by: Niall Meade 
> ---
>  .mailmap|  1 +
>  lib/ethdev/rte_ethdev.c | 12 +---
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/.mailmap b/.mailmap
> index 4a508bafad..c1941e78bb 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1053,6 +1053,7 @@ Nelson Escobar 
>  Nemanja Marjanovic 
>  Netanel Belgazal 
>  Netanel Gonen 
> +Niall Meade 
>  Niall Power 
>  Nicholas Pratte 
>  Nick Connolly  
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index f1c658f49e..f978283edf 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6577,13 +6577,19 @@ static void
>  eth_dev_adjust_nb_desc(uint16_t *nb_desc,
>   const struct rte_eth_desc_lim *desc_lim)
>  {
> + /* Upcast to uint32 to avoid potential overflow with RTE_ALIGN_CEIL(). 
> */
> + uint32_t nb_desc_32 = *nb_desc;
> +
>   if (desc_lim->nb_align != 0)
> - *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
> + nb_desc_32 = RTE_ALIGN_CEIL(nb_desc_32, desc_lim->nb_align);
>  
>   if (desc_lim->nb_max != 0)
> - *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);
> + nb_desc_32 = RTE_MIN(nb_desc_32, desc_lim->nb_max);
> +
> + nb_desc_32 = RTE_MAX(nb_desc_32, desc_lim->nb_min);
>  
> - *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min);
> + /* Assign clipped u32 back to u16. */
> + *nb_desc = nb_desc_32;
>  }
>  
>  int

Re: [PATCH v4] net/zxdh: Provided zxdh basic init

2024-09-25 Thread Ferruh Yigit

On 9/10/2024 1:00 PM, Junlong Wang wrote:
> provided zxdh initialization of zxdh PMD driver.
> include msg channel, np init and etc.
> 

Hi Junlong,

It is very hard to review the driver as a single patch, it helps to
split driver into multiple patches, please check the suggestion on it:
https://patches.dpdk.org/project/dpdk/patch/20240916162856.11566-1-step...@networkplumber.org/

Also there are errors reported by scripts, please fix them:
./devtools/checkpatches.sh -n1
./devtools/check-meson.py


> Signed-off-by: Junlong Wang 
> ---
> V4: Resolve compilation issues
> V3: Resolve compilation issues
> V2: Resolve compilation issues and modify doc(zxdh.ini zdh.rst)
> V1: Provide zxdh basic init and open source NPSDK lib
> ---
>  doc/guides/nics/features/zxdh.ini |   10 +
>  doc/guides/nics/index.rst |1 +
>  doc/guides/nics/zxdh.rst  |   34 +
>  drivers/net/meson.build   |1 +
>  drivers/net/zxdh/meson.build  |   23 +
>  drivers/net/zxdh/zxdh_common.c|   59 ++
>  drivers/net/zxdh/zxdh_common.h|   32 +
>  drivers/net/zxdh/zxdh_ethdev.c| 1328 +
>  drivers/net/zxdh/zxdh_ethdev.h|  202 +
>  drivers/net/zxdh/zxdh_logs.h  |   38 +
>  drivers/net/zxdh/zxdh_msg.c   | 1177 +
>  drivers/net/zxdh/zxdh_msg.h   |  408 +
>  drivers/net/zxdh/zxdh_npsdk.c |  158 
>  drivers/net/zxdh/zxdh_npsdk.h |  216 +
>  drivers/net/zxdh/zxdh_pci.c   |  462 ++
>  drivers/net/zxdh/zxdh_pci.h   |  259 ++
>  drivers/net/zxdh/zxdh_queue.c |  138 +++
>  drivers/net/zxdh/zxdh_queue.h |   85 ++
>  drivers/net/zxdh/zxdh_ring.h  |   87 ++
>  drivers/net/zxdh/zxdh_rxtx.h  |   48 ++
>  20 files changed, 4766 insertions(+)
>  create mode 100644 doc/guides/nics/features/zxdh.ini
>  create mode 100644 doc/guides/nics/zxdh.rst
>  create mode 100644 drivers/net/zxdh/meson.build
>  create mode 100644 drivers/net/zxdh/zxdh_common.c
>  create mode 100644 drivers/net/zxdh/zxdh_common.h
>  create mode 100644 drivers/net/zxdh/zxdh_ethdev.c
>  create mode 100644 drivers/net/zxdh/zxdh_ethdev.h
>  create mode 100644 drivers/net/zxdh/zxdh_logs.h
>  create mode 100644 drivers/net/zxdh/zxdh_msg.c
>  create mode 100644 drivers/net/zxdh/zxdh_msg.h
>  create mode 100644 drivers/net/zxdh/zxdh_npsdk.c
>  create mode 100644 drivers/net/zxdh/zxdh_npsdk.h
>  create mode 100644 drivers/net/zxdh/zxdh_pci.c
>  create mode 100644 drivers/net/zxdh/zxdh_pci.h
>  create mode 100644 drivers/net/zxdh/zxdh_queue.c
>  create mode 100644 drivers/net/zxdh/zxdh_queue.h
>  create mode 100644 drivers/net/zxdh/zxdh_ring.h
>  create mode 100644 drivers/net/zxdh/zxdh_rxtx.h
> 
> diff --git a/doc/guides/nics/features/zxdh.ini b/doc/guides/nics/
> features/zxdh.ini
> new file mode 100644
> index 00..083c75511b
> --- /dev/null
> +++ b/doc/guides/nics/features/zxdh.ini
> @@ -0,0 +1,10 @@
> +;
> +; Supported features of the 'zxdh' network poll mode driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Linux= Y
> +x86-64   = Y
> +ARMv8= Y
> +
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index c14bc7988a..8e371ac4a5 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -69,3 +69,4 @@ Network Interface Controller Drivers
>  vhost
>  virtio
>  vmxnet3
> +zxdh
> diff --git a/doc/guides/nics/zxdh.rst b/doc/guides/nics/zxdh.rst
> new file mode 100644
> index 00..e878058b7b
> --- /dev/null
> +++ b/doc/guides/nics/zxdh.rst
> @@ -0,0 +1,34 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright(c) 2023 ZTE Corporation.
> +
> +ZXDH Poll Mode Driver
> +==
> +
> +The ZXDH PMD (**librte_net_zxdh**) provides poll mode driver support
> +for 25/100 Gbps ZXDH NX Series Ethernet Controller based on
> +the ZTE Ethernet Controller E310/E312.
> +
>

Can you please provide a link for the product?
There is a link in the prerequisetes section below, if that link is for
product please move it here.

> +
> +Features
> +
> +
> +Features of the zxdh PMD are:
> +
> +- Multi arch support: x86_64, ARMv8.
> +
> +Prerequisites
> +-
> +
> +- Learning about ZXDH NX Series Ethernet Controller NICs using
> +  ``_.
> +
> +Driver compilation and testing
> +--
> +
> +Refer to the document :ref:`compiling and testing a PMD for a NIC 
> `
> +for details.
> +
> +Limitations or Known issues
> +---
> +X86-32, Power8, ARMv7 and BSD are not supported yet.
> +
> diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> index fb6d34b782..1a3db8a04d 100644
> --- a/drivers/net/meson.build
> +++ b/drivers/net/meson.build
> @@ -62,6 +62,7 @@ drivers = [
>  'vhost',
>  'virtio',
>  'vmxnet3',
> +'zxdh'

Re: [PATCH v5 1/2] ethdev: Add link_speed lanes support

2024-09-25 Thread Ferruh Yigit

On 9/24/2024 11:59 PM, Damodharam Ammepalli wrote:
> On Sun, Sep 22, 2024 at 10:02 AM Ferruh Yigit  wrote:
>>
>> On 9/4/2024 6:50 PM, Damodharam Ammepalli wrote:
>>> Update the eth_dev_ops structure with new function vectors
>>> to get, get capabilities and set ethernet link speed lanes.
>>> Update the testpmd to provide required config and information
>>> display infrastructure.
>>>
>>> The supporting ethernet controller driver will register callbacks
>>> to avail link speed lanes config and get services. This lanes
>>> configuration is applicable only when the nic is forced to fixed
>>> speeds. In Autonegiation mode, the hardware automatically
>>> negotiates the number of lanes.
>>>
>>> These are the new commands.
>>>
>>> testpmd> show port 0 speed_lanes capabilities
>>>
>>>  Supported speeds Valid lanes
>>> ---
>>>  10 Gbps  1
>>>  25 Gbps  1
>>>  40 Gbps  4
>>>  50 Gbps  1 2
>>>  100 Gbps 1 2 4
>>>  200 Gbps 2 4
>>>  400 Gbps 4 8
>>> testpmd>
>>>
>>> testpmd>
>>> testpmd> port stop 0
>>> testpmd> port config 0 speed_lanes 4
>>> testpmd> port config 0 speed 20 duplex full
>>> testpmd> port start 0
>>> testpmd>
>>> testpmd> show port info 0
>>>
>>> * Infos for port 0  *
>>> MAC address: 14:23:F2:C3:BA:D2
>>> Device name: :b1:00.0
>>> Driver name: net_bnxt
>>> Firmware-version: 228.9.115.0
>>> Connect to socket: 2
>>> memory allocation on the socket: 2
>>> Link status: up
>>> Link speed: 200 Gbps
>>> Active Lanes: 4
>>> Link duplex: full-duplex
>>> Autoneg status: Off
>>>
>>> Signed-off-by: Damodharam Ammepalli 
>>> ---
>>>  app/test-pmd/cmdline.c | 248 -
>>>  app/test-pmd/config.c  |   4 +
>>>  lib/ethdev/ethdev_driver.h |  91 ++
>>>  lib/ethdev/rte_ethdev.c|  52 
>>>  lib/ethdev/rte_ethdev.h|  95 ++
>>>  lib/ethdev/version.map |   5 +
>>>  6 files changed, 494 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>>> index b7759e38a8..643102032e 100644
>>> --- a/app/test-pmd/cmdline.c
>>> +++ b/app/test-pmd/cmdline.c
>>> @@ -284,6 +284,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>
>>>   "dump_log_types\n"
>>>   "Dumps the log level for all the dpdk modules\n\n"
>>> +
>>> + "show port (port_id) speed_lanes capabilities"
>>> + "   Show speed lanes capabilities of a port.\n\n"
>>>   );
>>>   }
>>>
>>> @@ -823,6 +826,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>   "port config (port_id) txq (queue_id) affinity 
>>> (value)\n"
>>>   "Map a Tx queue with an aggregated port "
>>>   "of the DPDK port\n\n"
>>> +
>>> + "port config (port_id|all) speed_lanes (0|1|4|8)\n"
>>>
>>
>> This help string, and the implementation, implies there has to be fixed
>> lane values, like 1, 4, 8. Why not leave this part to the capability
>> reporting, and not limit (and worry) those values in the command, so
>> from command's perspective it can be only .
>>
> 
> ok, will update the help string to 
> 
>>> + "Set number of lanes for all ports or port_id for 
>>> a forced speed\n\n"
>>>   );
>>>   }
>>>
>>> @@ -1560,6 +1566,244 @@ static cmdline_parse_inst_t 
>>> cmd_config_speed_specific = {
>>>   },
>>>  };
>>>
>>> +static int
>>> +parse_speed_lanes_cfg(portid_t pid, uint32_t lanes)
>>> +{
>>> + int ret;
>>> +
>>> + ret = rte_eth_speed_lanes_set(pid, lanes);
>>>
>>
>> As a sample application usage, I think it is better if it gets the
>> capability and verify that 'lanes' is with

Re: [PATCH v5 1/2] ethdev: Add link_speed lanes support

2024-09-25 Thread Ferruh Yigit

On 9/25/2024 4:00 PM, Damodharam Ammepalli wrote:
> On Tue, Sep 24, 2024 at 3:59 PM Damodharam Ammepalli
>  wrote:
>>
>> On Sun, Sep 22, 2024 at 10:02 AM Ferruh Yigit  wrote:
>>>
>>> On 9/4/2024 6:50 PM, Damodharam Ammepalli wrote:
>>>> Update the eth_dev_ops structure with new function vectors
>>>> to get, get capabilities and set ethernet link speed lanes.
>>>> Update the testpmd to provide required config and information
>>>> display infrastructure.
>>>>
>>>> The supporting ethernet controller driver will register callbacks
>>>> to avail link speed lanes config and get services. This lanes
>>>> configuration is applicable only when the nic is forced to fixed
>>>> speeds. In Autonegiation mode, the hardware automatically
>>>> negotiates the number of lanes.
>>>>
>>>> These are the new commands.
>>>>
>>>> testpmd> show port 0 speed_lanes capabilities
>>>>
>>>>  Supported speeds Valid lanes
>>>> ---
>>>>  10 Gbps  1
>>>>  25 Gbps  1
>>>>  40 Gbps  4
>>>>  50 Gbps  1 2
>>>>  100 Gbps 1 2 4
>>>>  200 Gbps 2 4
>>>>  400 Gbps 4 8
>>>> testpmd>
>>>>
>>>> testpmd>
>>>> testpmd> port stop 0
>>>> testpmd> port config 0 speed_lanes 4
>>>> testpmd> port config 0 speed 20 duplex full
>>>> testpmd> port start 0
>>>> testpmd>
>>>> testpmd> show port info 0
>>>>
>>>> * Infos for port 0  *
>>>> MAC address: 14:23:F2:C3:BA:D2
>>>> Device name: :b1:00.0
>>>> Driver name: net_bnxt
>>>> Firmware-version: 228.9.115.0
>>>> Connect to socket: 2
>>>> memory allocation on the socket: 2
>>>> Link status: up
>>>> Link speed: 200 Gbps
>>>> Active Lanes: 4
>>>> Link duplex: full-duplex
>>>> Autoneg status: Off
>>>>
>>>> Signed-off-by: Damodharam Ammepalli 
>>>> ---
>>>>  app/test-pmd/cmdline.c | 248 -
>>>>  app/test-pmd/config.c  |   4 +
>>>>  lib/ethdev/ethdev_driver.h |  91 ++
>>>>  lib/ethdev/rte_ethdev.c|  52 
>>>>  lib/ethdev/rte_ethdev.h|  95 ++
>>>>  lib/ethdev/version.map |   5 +
>>>>  6 files changed, 494 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>>>> index b7759e38a8..643102032e 100644
>>>> --- a/app/test-pmd/cmdline.c
>>>> +++ b/app/test-pmd/cmdline.c
>>>> @@ -284,6 +284,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>>
>>>>   "dump_log_types\n"
>>>>   "Dumps the log level for all the dpdk 
>>>> modules\n\n"
>>>> +
>>>> + "show port (port_id) speed_lanes capabilities"
>>>> + "   Show speed lanes capabilities of a port.\n\n"
>>>>   );
>>>>   }
>>>>
>>>> @@ -823,6 +826,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>>   "port config (port_id) txq (queue_id) affinity 
>>>> (value)\n"
>>>>   "Map a Tx queue with an aggregated port "
>>>>   "of the DPDK port\n\n"
>>>> +
>>>> + "port config (port_id|all) speed_lanes (0|1|4|8)\n"
>>>>
>>>
>>> This help string, and the implementation, implies there has to be fixed
>>> lane values, like 1, 4, 8. Why not leave this part to the capability
>>> reporting, and not limit (and worry) those values in the command, so
>>> from command's perspective it can be only .
>>>
>>
>> ok, will update the help string to 
>>
>>>> + "Set number of lanes for all ports or port_id 
>>>> for a forced speed\n\n"
>>>>   );
>>>>   }
>>>>
>>>> @@ -1560,6 +1566,244 @@ static cmdline_

Re: [RFC] ethdev: introduce PTP device capability

2024-09-25 Thread Ferruh Yigit

On 9/24/2024 8:24 AM, lihuisong (C) wrote:
> Hi Ferruh,
> 
> 
> 在 2024/9/23 6:23, Ferruh Yigit 写道:
>> On 1/30/2024 3:58 AM, Huisong Li wrote:
>>> Currently, the PTP feature is a little messy and has some issue.
>>> 1> There is different implementation for PTP. This feature of some
>>>     hardware depends on the Rx HW timestamp offload, and use this
>>>     offload to detect if enable PTP feature. Others can enable PTP
>>>     feature with only ethdev ops.
>>> 2> For some drivers, enabling PTP feature also depends on the macro
>>>     RTE_LIBRTE_IEEE1588. Actually, whether device support, enable
>>>     or disable this feature should not be determined at compilation
>>>     time. This has been discussed in thread [1].
>>> 3> The user haven't a good way to distinguish which port supports
>>>     the PTP feature in the case that a couple of device belong to the
>>>     same driver. And PTP related API in ethdev layer doesn't do check
>>>     for PTP capability. This has been mentioned and discussed in
>>>     thread [2].
>>>
>>> In the thread [1], there is a conclusion that remove the compiling
>>> macro RTE_LIBRTE_IEEE1588. And in the thread [2], there is a common
>>> opinion that the RTE_ETH_RX_OFFLOAD_TIMESTAMP cannot be used as the
>>> PTP capability.
>>>
>>> For the above issuse, this patch introduces a PTP capability in
>>> rte_eth_dev_info.dev_capa to report PTP capability.
>>>
>>> Welcome to jumping into discussion.
>>>
>>> [1] https://patchwork.dpdk.org/project/dpdk/
>>> patch/20230203132810.14187-1-tho...@monjalon.net/
>>> [2] https://inbox.dpdk.org/dev/20230817084226.55327-1-
>>> lihuis...@huawei.com/
>>>
>>> Signed-off-by: Huisong Li 
>>> ---
>>>   lib/ethdev/rte_ethdev.h | 6 ++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index efa4f12b2a..4c8bd691bd 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1613,6 +1613,12 @@ struct rte_eth_conf {
>>>   #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP RTE_BIT64(3)
>>>   /** Device supports keeping shared flow objects across restart. */
>>>   #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
>>> +/**
>>> + * Device supports PTP feature.
>>> + * For some hardware, this feature also need to set the offload
>>> + * RTE_ETH_RX_OFFLOAD_TIMESTAMP, please check
>>> rte_eth_dev_info.rx_offload_capa.
>>> + */
>>> +#define RTE_ETH_DEV_CAPA_PTP RTE_BIT64(5)
>>>   /**@}*/
>>>     /*
>> Hi Huisong,
>>
>> Thanks for the effort, as you mentioned PTP feature can be improved and
>> there is a target to remove RTE_LIBRTE_IEEE1588 build time macro.
>>
>> As far as I remember, one of the main reasons of the RTE_LIBRTE_IEEE1588
>> macro is some HW has resource restrictions, when RTE_LIBRTE_IEEE1588 is
>> enabled some other features, like vector datapath, are not usable, that
>> is why this is a build time selection.
> I think the main reason that driver don't support PTP feature in vector
> datapath is for performance.
> This can be controled by releated dev_ops API or TIMESTAMP offload and
> no need to use RTE_LIBRTE_IEEE1588 macro, like hns3.
>>
>> And related to the PTP capability, can you please give some more
>> information, what does device having PTP capability exactly means.
> Just the device having PTP capability can be used to enable PTP feature.
>

Hi Huisong,

I am asking for your support but not able to get detailed information is
not helping to progress the patch.

If application implements PTP protocol, we already have APIs for
application to read time from NIC, or to adjust NIC clock, etc..:
'rte_eth_timesync_read_time()'
'rte_eth_timesync_adjust_time()'
...

Application can check if these APIs are supported by the device to
deduce if PTP can be supported by device or not, why this is not sufficient?

If application PTP implementation requires HW timestamp offload,
availability of this offload also can be checked.

I think for most of the cases, with combination of above two,
application can decide if it can support PTP with given device or not.

What else is missing so that application needs this additional
capability flag from NIC?


>>
>> PTP is protocol and it is implemented in userspace daemon. I guess even
>> NIC does not support timestamp offloading, by using time information
>> from S

Re: [PATCH 1/2] ethdev: add Rx packet type offload control flag

2024-09-25 Thread Ferruh Yigit

On 9/24/2024 3:03 AM, Chaoyong He wrote:
>> On 6/19/2024 11:11 AM, Chaoyong He wrote:
>>> From: Long Wu 
>>>
>>> The Rx packet type offload feature may affect the performance, so add
>>> a control flag for applications to turn it on or off.
>>>
>>> Signed-off-by: Long Wu 
>>> ---
>>>  lib/ethdev/rte_ethdev.h | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>> 548fada1c7..be86983e24 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1555,6 +1555,7 @@ struct rte_eth_conf {  #define
>>> RTE_ETH_RX_OFFLOAD_OUTER_UDP_CKSUM  RTE_BIT64(18)
>>>  #define RTE_ETH_RX_OFFLOAD_RSS_HASH RTE_BIT64(19)
>>>  #define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT RTE_BIT64(20)
>>> +#define RTE_ETH_RX_OFFLOAD_PTYPES   RTE_BIT64(21)
>>>
>>>  #define RTE_ETH_RX_OFFLOAD_CHECKSUM
>> (RTE_ETH_RX_OFFLOAD_IPV4_CKSUM | \
>>>  RTE_ETH_RX_OFFLOAD_UDP_CKSUM | \
>>
>> Hi Chaoyong,
>>
>> Instead of having an offload for ptypes, we have APIs for this,
>>
>> First one is 'rte_eth_dev_get_supported_ptypes()' that application can learn
>> the supported packet types.
>>
>> Second one is more related to above flag, it is 'rte_eth_dev_set_ptypes()'
>> which application can set which pytpes is required, it can be set to disable 
>> all
>> packet type parsing, can be similar to not requesting
>> 'RTE_ETH_RX_OFFLOAD_PTYPES'.
>>
>> With above two APIs, do we still need the offload flag?
>>
> 
> At present, the purpose of the ops 'rte_eth_dev_set_ptypes()' is to set the 
> range of packet types to handle.
>

Yes, and setting 'ptype_mask' to zero should disable packet type parsing.

Packet type parsing is an offload, but when we have an API that has
finer grained control to what packet type to parse, why not use it
instead of having offload flag, which is all on or off configuration.

> Of course, we can maintain a flag for each application and driver based on 
> the return value of this ops;
> but this is a bit troublesome.
>

I didn't get your point, why maintain a flag?

> So, we hope to follow the example of RSS, in addition to
> 'rte_eth_dev_rss_hash_update()' and 'rte_eth_dev_rss_hash_conf_get()', we 
> also want to set a flag for
> the ptype function similar to RTE_ETH_RX_OFFLOAD_RSS_HASH.
> 
>>
>> Another concern with adding new offload flag is backward compatibility, all
>> existing drivers that support packet type parsing should be updated to list 
>> this
>> offload flag as capability. Also they need to be updated to configure packet
>> parsing based on user requested offload configuration.
>>
> 
> If you agree with this design suggestion, we will adapt all the related code 
> to ptypes for each PMDs and 'test-pmd' applications in the next patch.
> Do you think this okay?
> 
>> Briefly, we can't just introduce a new offload flag for an existing 
>> capability and
>> update only one driver, all drivers needs to be updated.

Re: [PATCH 1/2] ethdev: add Rx packet type offload control flag

2024-09-22 Thread Ferruh Yigit

On 6/19/2024 11:11 AM, Chaoyong He wrote:
> From: Long Wu 
> 
> The Rx packet type offload feature may affect the performance,
> so add a control flag for applications to turn it on or off.
> 
> Signed-off-by: Long Wu 
> ---
>  lib/ethdev/rte_ethdev.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 548fada1c7..be86983e24 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1555,6 +1555,7 @@ struct rte_eth_conf {
>  #define RTE_ETH_RX_OFFLOAD_OUTER_UDP_CKSUM  RTE_BIT64(18)
>  #define RTE_ETH_RX_OFFLOAD_RSS_HASH RTE_BIT64(19)
>  #define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT RTE_BIT64(20)
> +#define RTE_ETH_RX_OFFLOAD_PTYPES   RTE_BIT64(21)
>  
>  #define RTE_ETH_RX_OFFLOAD_CHECKSUM (RTE_ETH_RX_OFFLOAD_IPV4_CKSUM | \
>RTE_ETH_RX_OFFLOAD_UDP_CKSUM | \

Hi Chaoyong,

Instead of having an offload for ptypes, we have APIs for this,

First one is 'rte_eth_dev_get_supported_ptypes()' that application can
learn the supported packet types.

Second one is more related to above flag, it is
'rte_eth_dev_set_ptypes()' which application can set which pytpes is
required, it can be set to disable all packet type parsing, can be
similar to not requesting 'RTE_ETH_RX_OFFLOAD_PTYPES'.

With above two APIs, do we still need the offload flag?

Another concern with adding new offload flag is backward compatibility,
all existing drivers that support packet type parsing should be updated
to list this offload flag as capability. Also they need to be updated to
configure packet parsing based on user requested offload configuration.

Briefly, we can't just introduce a new offload flag for an existing
capability and update only one driver, all drivers needs to be updated.

Re: [PATCH v6 00/14] Enhance the bond framework to support offload

2024-09-22 Thread Ferruh Yigit

On 12/26/2023 7:28 AM, Chaoyong He wrote:
> This patch series try to enhance the bond framework to support the
> offload feature better:
> * Add new API to make the member port can access some information of the
>   bond port which belongs.
> * Add new API to get the result of whether bond port is created by the
>   member port.
> * Add two command line argument to control if enable member port
>   notification and dedicated queue features.
> * Add logic to support add ports which share the same PCI address into
>   bond port.
> * Also modify the testpmd application to test the new APIs and logics
>   added by this patch series.
> 
> ---
> v2:
> * Fix compile error on github-robot by removing the redundancy function
>   declaration in the header file.
> v3:
> * Use the hole in the structure for the new added flag data field.
> v4:
> * Drop two commits not necessary for this series.
> * Modify some logic as the review comments from reviewers.
> v5:
> * Add a new 'rte_eth_bond_flow.h' header file.
> * Add the patches of NFP PMD as the example of support bond flow
>   offload.
> v6:
> * Try to solve the CI build error.
>

Hi Chaoyong,

This patch is active in the patchwork for a while, it is waiting mainly
because of specific driver (bonding driver) implementation in ethdev layer.

Are you still pursing for this implementation?
As this is almost a year old patch series, I will update it as change
requested. If you want to continue with the patch, please send a new
version and later request a technical board meeting to discuss how to
progress with it.

Thanks,
ferruh

Re: [RFC] ethdev: introduce PTP device capability

2024-09-22 Thread Ferruh Yigit

On 1/30/2024 3:58 AM, Huisong Li wrote:
> Currently, the PTP feature is a little messy and has some issue.
> 1> There is different implementation for PTP. This feature of some
>hardware depends on the Rx HW timestamp offload, and use this
>offload to detect if enable PTP feature. Others can enable PTP
>feature with only ethdev ops.
> 2> For some drivers, enabling PTP feature also depends on the macro
>RTE_LIBRTE_IEEE1588. Actually, whether device support, enable
>or disable this feature should not be determined at compilation
>time. This has been discussed in thread [1].
> 3> The user haven't a good way to distinguish which port supports
>the PTP feature in the case that a couple of device belong to the
>same driver. And PTP related API in ethdev layer doesn't do check
>for PTP capability. This has been mentioned and discussed in
>thread [2].
> 
> In the thread [1], there is a conclusion that remove the compiling
> macro RTE_LIBRTE_IEEE1588. And in the thread [2], there is a common
> opinion that the RTE_ETH_RX_OFFLOAD_TIMESTAMP cannot be used as the
> PTP capability.
> 
> For the above issuse, this patch introduces a PTP capability in
> rte_eth_dev_info.dev_capa to report PTP capability.
> 
> Welcome to jumping into discussion.
> 
> [1] 
> https://patchwork.dpdk.org/project/dpdk/patch/20230203132810.14187-1-tho...@monjalon.net/
> [2] https://inbox.dpdk.org/dev/20230817084226.55327-1-lihuis...@huawei.com/
> 
> Signed-off-by: Huisong Li 
> ---
>  lib/ethdev/rte_ethdev.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index efa4f12b2a..4c8bd691bd 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1613,6 +1613,12 @@ struct rte_eth_conf {
>  #define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP RTE_BIT64(3)
>  /** Device supports keeping shared flow objects across restart. */
>  #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
> +/**
> + * Device supports PTP feature.
> + * For some hardware, this feature also need to set the offload
> + * RTE_ETH_RX_OFFLOAD_TIMESTAMP, please check 
> rte_eth_dev_info.rx_offload_capa.
> + */
> +#define RTE_ETH_DEV_CAPA_PTP RTE_BIT64(5)
>  /**@}*/
>  
>  /*

Hi Huisong,

Thanks for the effort, as you mentioned PTP feature can be improved and
there is a target to remove RTE_LIBRTE_IEEE1588 build time macro.

As far as I remember, one of the main reasons of the RTE_LIBRTE_IEEE1588
macro is some HW has resource restrictions, when RTE_LIBRTE_IEEE1588 is
enabled some other features, like vector datapath, are not usable, that
is why this is a build time selection.

And related to the PTP capability, can you please give some more
information, what does device having PTP capability exactly means.

PTP is protocol and it is implemented in userspace daemon. I guess even
NIC does not support timestamp offloading, by using time information
from SW it can still implement PTP, right?
Is PTP offload means, offloading some part of the protocol communication
withing the device?

Btw, the relevant RTE_ETH_RX_OFFLOAD_TIMESTAMP offload is, a little more
generic HW capability that HW can add timestamp to Rx/Tx packets, which
can be used for custom diagnostics. HW supporting this offload means
that HW has some specific clock HW in it.

Both having RTE_ETH_RX_OFFLOAD_TIMESTAMP and RTE_ETH_DEV_CAPA_PTP
capability can be confusing, lets clarify it more.

Re: [RFC 2/2] ethdev: add data size field to GENEVE option item

2024-09-22 Thread Ferruh Yigit

On 4/17/2024 8:23 AM, Michael Baum wrote:
> The "rte_flow_item_geneve_opt" structure has field for option length.
> This field has 2 different usages which might limit each other:
> 
>  1. field to matching with value/mask.
>  2. descriptor for data array size.
> 
> This patch adds a new field "data_array_size" into
> "rte_flow_item_geneve_opt" structure in addition to existing
> "option_len" field.
> 
> Signed-off-by: Michael Baum 
>

Hi Michael,

The first patch of the RFC sent separately later and merged [1], but I
don't know status of patch 2/2, is it still valid?

Since this is old patch, I will mark whole set as "Superseded", if this
patch is still relevant please send a new version, RFC or an actual patch.

Thanks,
ferruh

[1]
https://patches.dpdk.org/project/dpdk/patch/20240715121316.3429593-1-michae...@nvidia.com/

Re: [RFC v2] ethdev: an API for cache stashing hints

2024-09-22 Thread Ferruh Yigit

On 7/26/2024 9:01 PM, Wathsala Wathawana Vithanage wrote:
>> rte_eth_X_get_capability()
>>
> 
> rte_eth_dev_stashing_hints_discover is somewhat similar.
> 
>> Instead of adding RTE_ETH_DEV_CAPA_ macro and contaminating
>> 'rte_eth_dev_info' with this edge use case, what do you think follow above
>> design and have dedicated get capability API?
> 
> I think it's better to have a dedicated API, given that we already have a fine
> grained capabilities discovery function. I will add this feedback to V3 of the
> RFC.
> 
>>
>> And I can see set() has two different APIs, 'rte_eth_dev_stashing_hints_rx' &
>> 'rte_eth_dev_stashing_hints_tx', is there a reason to have two separate APIs
>> instead of having one which gets RX & TX as argument, as done in internal
>> device ops?
> 
> Some types/hints may only apply to a single queue direction, so I thought it
> would be better to separate them out into separate Rx and Tx APIs for ease
> of comprehension/use for the developer.
> In fact, underneath, it uses one API for both Rx and Tx.
> 

Hi Wathsala,

Do you still pursue this RFC, should we expect a new version for this
release?

Did you have any change to measure the impact of the changes in this patch?


Btw, do you think the LLC aware lcore selection patch [1] can be
relevant or can it help for the cases this patch addresses?

[1]
https://patches.dpdk.org/project/dpdk/list/?series=32851

Re: [PATCH v7 0/8] support dump reigser names and filter

2024-09-22 Thread Ferruh Yigit

On 9/14/2024 8:13 AM, Jie Hai wrote:
> The registers can be dumped through the API rte_eth_dev_get_reg_info.
> However, only register values are exported, which is inconvenient for
> users to interpret. Therefore, an extension of the structure
> "rte_dev_reg_info" and a new API rte_eth_dev_get_reg_info_ext is added
> to support the capability of exporting the name of the corresponding
> register and filtering by module names.
> 
> --
> v7: fix doc indention.
> --
> Jie Hai (8):
>   ethdev: support report register names and filter
>   ethdev: add telemetry cmd for registers
>   net/hns3: remove some basic address dump
>   net/hns3: fix dump counter of registers
>   net/hns3: remove separators between register module
>   net/hns3: refactor register dump
>   net/hns3: support report names of registers
>   net/hns3: support filter registers by module names
>

Hi Jie,

ethdev changes looks good to me, I can see there are some comments so I
assume there will be new version, if there is no major changes please
feel free to keep the review tags.

After new version I will wait for a few more days to give chance to
others for final review, and will merge later.

Thanks,
ferruh

Re: [PATCH v7 2/8] ethdev: add telemetry cmd for registers

2024-09-22 Thread Ferruh Yigit

On 9/14/2024 8:13 AM, Jie Hai wrote:
> This patch adds a telemetry command for registers dump,
> and supports obtaining the registers of a specified module.
> 
> In one way, the number of registers that can be exported
> is limited by the number of elements carried by dict and
> container. In another way, the length of the string
> exported by telemetry is limited by MAX_OUTPUT_LEN.
> Therefore, when the number of registers to be exported
> exceeds, some information will be lost. Warn on the former
> case.
> 
> An example usage is shown below:
> --> /ethdev/regs,0,ring
> {
>   "/ethdev/regs": {
> "registers_length": 318,
> "registers_width": 4,
> "register_offset": "0x0",
> "version": "0x1140011",
> "group_0": {
>   "Q0_ring_rx_bd_num": "0x0",
>   "Q0_ring_rx_bd_len": "0x0",
>   ...
>   },
> "group_1": {
> ...
> },
> ...
> }
> 
> Signed-off-by: Jie Hai 
>

Reviewed-by: Ferruh Yigit

Re: [PATCH v7 1/8] ethdev: support report register names and filter

2024-09-22 Thread Ferruh Yigit

On 9/14/2024 8:13 AM, Jie Hai wrote:
> This patch adds "filter" and "names" fields to "rte_dev_reg_info"
> structure. Names of registers in data fields can be reported and
> the registers can be filtered by their module names.
> 
> The new API rte_eth_dev_get_reg_info_ext() is added to support
> reporting names and filtering by modules. And the original API
> rte_eth_dev_get_reg_info() does not use the names and filter fields.
> A local variable is used in rte_eth_dev_get_reg_info for
> compatibility. If the drivers does not report the names, set them
> to "index_XXX", which means the location in the register table.
> 
> Signed-off-by: Jie Hai 
> Acked-by: Huisong Li 
> Acked-by: Chengwen Feng 
>

Reviewed-by: Ferruh Yigit

Re: [PATCH v2 0/3] add frequency adjustment support for PTP

2024-09-22 Thread Ferruh Yigit

On 9/10/2024 10:13 AM, Mingjin Ye wrote:
> [1/3] ethdev: add frequency adjustment API
> [2/3] net/ice: add frequency adjustment support for PTP
> [3/3] examples/ptpclient: add frequency adjustment support
> ---
> v2: rte_eth_timesync_adjust_freq marked as experimental.
> 
> Mingjin Ye (3):
>   ethdev: add frequency adjustment API
>   net/ice: add frequency adjustment support for PTP
>   examples/ptpclient: add frequency adjustment support
>

Hi Mingjin,

Overall OK to have the API, and except from a minor documentation issue
patch is OK.

Only lets try to make describe it a little more detail, so users and
driver developers can use is better without confusion.

Re: [PATCH v2 1/3] ethdev: add frequency adjustment API

2024-09-22 Thread Ferruh Yigit

On 9/10/2024 10:13 AM, Mingjin Ye wrote:
> This patch adds freq adjustment API for PTP high accuracy.
> 
> Signed-off-by: Simei Su 
> Signed-off-by: Mingjin Ye 

<...>

> diff --git a/doc/guides/rel_notes/release_24_11.rst 
> b/doc/guides/rel_notes/release_24_11.rst
> index 0ff70d9057..10ac35e5c0 100644
> --- a/doc/guides/rel_notes/release_24_11.rst
> +++ b/doc/guides/rel_notes/release_24_11.rst
> @@ -25,35 +25,11 @@ New Features
>  
>  
>  .. This section should contain new features added in this release.
> -   Sample format:
>  
> -   * **Add a title in the past tense with a full stop.**
> +* **Added Ethernet device clock incremental rate adjustment.**
>  
> - Add a short 1-2 sentence description in the past tense.
> - The description should be enough to allow someone scanning
> - the release notes to understand the new feature.
> -
> - If the feature adds a lot of sub-features you can use a bullet list
> - like this:
> -
> - * Added feature foo to do something.
> - * Enhanced feature bar to do something else.
> -
> - Refer to the previous release notes for examples.
> -
> - Suggested order in release notes items:
> - * Core libs (EAL, mempool, ring, mbuf, buses)
> - * Device abstraction libs and PMDs (ordered alphabetically by vendor 
> name)
> -   - ethdev (lib, PMDs)
> -   - cryptodev (lib, PMDs)
> -   - eventdev (lib, PMDs)
> -   - etc
> - * Other libs
> - * Apps, Examples, Tools (if significant)
> -
> - This section is a comment. Do not overwrite or remove it.
> - Also, make sure to start the actual text at the margin.
> - ===
> +  Added new function ``rte_eth_timesync_adjust_freq`` to
> +  adjust the clock increment rate for Ethernet devices.
> 

Can you please check above, patch removes unrelated parts of the
document, I assume caused by rebase issues.

>  
>  Removed Items
> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 883e59a927..68cadc14a5 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -664,6 +664,9 @@ typedef int (*eth_timesync_read_tx_timestamp_t)(struct 
> rte_eth_dev *dev,
>  /** @internal Function used to adjust the device clock. */
>  typedef int (*eth_timesync_adjust_time)(struct rte_eth_dev *dev, int64_t);
>  
> +/** @internal Function used to adjust the clock frequency. */
> +typedef int (*eth_timesync_adjust_freq)(struct rte_eth_dev *dev, int64_t);
> +
>  /** @internal Function used to get time from the device clock. */
>  typedef int (*eth_timesync_read_time)(struct rte_eth_dev *dev,
> struct timespec *timestamp);
> @@ -1378,6 +1381,8 @@ struct eth_dev_ops {
>   eth_timesync_read_tx_timestamp_t timesync_read_tx_timestamp;
>   /** Adjust the device clock */
>   eth_timesync_adjust_time   timesync_adjust_time;
> + /** Adjust the clock frequency */
> + eth_timesync_adjust_freq   timesync_adjust_freq;
>   /** Get the device clock time */
>   eth_timesync_read_time timesync_read_time;
>   /** Set the device clock time */
> diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
> index 3bec87bfdb..e273d5853e 100644
> --- a/lib/ethdev/ethdev_trace.h
> +++ b/lib/ethdev/ethdev_trace.h
> @@ -2183,6 +2183,15 @@ RTE_TRACE_POINT_FP(
>   rte_trace_point_emit_int(ret);
>  )
>  
> +/* Called in loop in examples/ptpclient */
> +RTE_TRACE_POINT_FP(
> + rte_eth_trace_timesync_adjust_freq,
> + RTE_TRACE_POINT_ARGS(uint16_t port_id, int64_t ppm, int ret),
> + rte_trace_point_emit_u16(port_id);
> + rte_trace_point_emit_i64(ppm);
> + rte_trace_point_emit_int(ret);
> +)
> +
>  /* Called in loop in app/test-flow-perf */
>  RTE_TRACE_POINT_FP(
>   rte_flow_trace_create,
> diff --git a/lib/ethdev/ethdev_trace_points.c 
> b/lib/ethdev/ethdev_trace_points.c
> index 99e04f5893..a99fec0c1e 100644
> --- a/lib/ethdev/ethdev_trace_points.c
> +++ b/lib/ethdev/ethdev_trace_points.c
> @@ -409,6 +409,9 @@ 
> RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_read_tx_timestamp,
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_adjust_time,
>   lib.ethdev.timesync_adjust_time)
>  
> +RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_adjust_freq,
> + lib.ethdev.timesync_adjust_freq)
> +
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_read_time,
>   lib.ethdev.timesync_read_time)
>  
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index f1c658f49e..660eab2f1e 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6310,6 +6310,24 @@ rte_eth_timesync_adjust_time(uint16_t port_id, int64_t 
> delta)
>   return ret;
>  }
>  
> +int
> +rte_eth_timesync_adjust_freq(uint16_t port_id, int64_t ppm)
> +{
> + struct rte_eth_dev *dev;
> + int ret;
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> + dev = &rte_eth_devices[port_id];
> +
> +

Re: [PATCH v5 1/2] ethdev: Add link_speed lanes support

2024-09-22 Thread Ferruh Yigit

On 9/4/2024 6:50 PM, Damodharam Ammepalli wrote:
> Update the eth_dev_ops structure with new function vectors
> to get, get capabilities and set ethernet link speed lanes.
> Update the testpmd to provide required config and information
> display infrastructure.
> 
> The supporting ethernet controller driver will register callbacks
> to avail link speed lanes config and get services. This lanes
> configuration is applicable only when the nic is forced to fixed
> speeds. In Autonegiation mode, the hardware automatically
> negotiates the number of lanes.
> 
> These are the new commands.
> 
> testpmd> show port 0 speed_lanes capabilities
> 
>  Supported speeds Valid lanes
> ---
>  10 Gbps  1
>  25 Gbps  1
>  40 Gbps  4
>  50 Gbps  1 2
>  100 Gbps 1 2 4
>  200 Gbps 2 4
>  400 Gbps 4 8
> testpmd>
> 
> testpmd>
> testpmd> port stop 0
> testpmd> port config 0 speed_lanes 4
> testpmd> port config 0 speed 20 duplex full
> testpmd> port start 0
> testpmd>
> testpmd> show port info 0
> 
> * Infos for port 0  *
> MAC address: 14:23:F2:C3:BA:D2
> Device name: :b1:00.0
> Driver name: net_bnxt
> Firmware-version: 228.9.115.0
> Connect to socket: 2
> memory allocation on the socket: 2
> Link status: up
> Link speed: 200 Gbps
> Active Lanes: 4
> Link duplex: full-duplex
> Autoneg status: Off
> 
> Signed-off-by: Damodharam Ammepalli 
> ---
>  app/test-pmd/cmdline.c | 248 -
>  app/test-pmd/config.c  |   4 +
>  lib/ethdev/ethdev_driver.h |  91 ++
>  lib/ethdev/rte_ethdev.c|  52 
>  lib/ethdev/rte_ethdev.h|  95 ++
>  lib/ethdev/version.map |   5 +
>  6 files changed, 494 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index b7759e38a8..643102032e 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -284,6 +284,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>  
>   "dump_log_types\n"
>   "Dumps the log level for all the dpdk modules\n\n"
> +
> + "show port (port_id) speed_lanes capabilities"
> + "   Show speed lanes capabilities of a port.\n\n"
>   );
>   }
>  
> @@ -823,6 +826,9 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "port config (port_id) txq (queue_id) affinity 
> (value)\n"
>   "Map a Tx queue with an aggregated port "
>   "of the DPDK port\n\n"
> +
> + "port config (port_id|all) speed_lanes (0|1|4|8)\n"
>

This help string, and the implementation, implies there has to be fixed
lane values, like 1, 4, 8. Why not leave this part to the capability
reporting, and not limit (and worry) those values in the command, so
from command's perspective it can be only .

> + "Set number of lanes for all ports or port_id for a 
> forced speed\n\n"
>   );
>   }
>  
> @@ -1560,6 +1566,244 @@ static cmdline_parse_inst_t cmd_config_speed_specific 
> = {
>   },
>  };
>  
> +static int
> +parse_speed_lanes_cfg(portid_t pid, uint32_t lanes)
> +{
> + int ret;
> +
> + ret = rte_eth_speed_lanes_set(pid, lanes);
>

As a sample application usage, I think it is better if it gets the
capability and verify that 'lanes' is withing the capability and later
set it, what do you think?

<...>

> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h> index
548fada1c7..9444e0a836 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -357,6 +357,27 @@ struct rte_eth_link {
>  #define RTE_ETH_LINK_MAX_STR_LEN 40 /**< Max length of default link string. 
> */
>  /**@}*/
>  
> +/**
> + * Constants used to indicate the possible link speed lanes of an ethdev 
> port.
> + */
> +#define RTE_ETH_SPEED_LANE_UNKNOWN  0  /**< speed lanes unsupported mode or 
> default */
> +#define RTE_ETH_SPEED_LANE_1  1/**< Link speed lane  1 */
> +#define RTE_ETH_SPEED_LANE_2  2/**< Link speed lanes 2 */
> +#define RTE_ETH_SPEED_LANE_4  4/**< Link speed lanes 4 */
> +#define RTE_ETH_SPEED_LANE_8  8/**< Link speed lanes 8 */
> +
> +/* Translate from link speed lanes to speed lanes capa */
> +#define RTE_ETH_SPEED_LANES_TO_CAPA(x) RTE_BIT32(x)
> +
> +/* This macro indicates link speed lanes capa mask */
> +#define RTE_ETH_SPEED_LANES_CAPA_MASK(x) RTE_BIT32(RTE_ETH_SPEED_ ## x)
> +
>

I am not clear why we need these macros, why not use lane number as
unsigned integer, instead of macro (RTE_ETH_SPEED_LANE_2), it will be
more flexible.

Probably all we need is 'RTE_ETH_SPEED_LANES_TO_CAPA' one to use as
helper in drivers.
Again not sure about the ..CAPA_MASK one, does it actually produce a
mask value?

Re: [PATCH 4/4] ethdev: add traffic manager query function

2024-09-22 Thread Ferruh Yigit

On 8/6/2024 4:24 PM, Bruce Richardson wrote:
> +/**
> + * Return information about a traffic management node
> + *
> + * Return information about a hierarchy node, using the same format of 
> parameters
> + * as was passed to the rte_rm_node_add() function.
> + * Each of the "out" parameters pointers (except error) may be passed as 
> NULL if the
> + * information is not needed by the caller. For example, to one may check if 
> a node id
> + * is in use by:
> + *
> + *  struct rte_tm_error error;
> + *  int ret = rte_tm_node_query(port, node_id, NULL, NULL, NULL, NULL, NULL, 
> &error);
> + *  if (ret == ENOENT) ...
> + *
> + * @param[in] port_id
> + *   The port identifier of the Ethernet device.
> + * @param[in] node_id
> + *   Node ID. Should be a valid node id.
> + * @param[out] parent_node_id
> + *   Parent node ID.
> + * @param[out] priority
> + *   Node priority. The highest node priority is zero. Used by the SP 
> algorithm
> + *   running on the parent of the current node for scheduling this child 
> node.
> + * @param[out] weight
> + *   Node weight. The node weight is relative to the weight sum of all 
> siblings
> + *   that have the same priority. The lowest weight is one. Used by the WFQ
> + *   algorithm running on the parent of the current node for scheduling this
> + *   child node.
> + * @param[out] level_id
> + *   The node level in the scheduler hierarchy.
> + * @param[out] params
> + *   Node parameters, as would be used when creating the node.
> + * @param[out] error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + *   -EINVAL - port or node id value is invalid
> + *   -ENOENT - no node exists with the provided id
> + */
> +int
> +rte_tm_node_query(uint16_t port_id,
> + uint32_t node_id,
> + uint32_t *parent_node_id,
> + uint32_t *priority,
> + uint32_t *weight,
> + uint32_t *level_id,
> + struct rte_tm_node_params *params,
> + struct rte_tm_error *error);
> +
>

No objection to get an TM node query API overall, but it would be good
to get more comment on the what correct API should be, we are missing it.
Both because it is not discussed much, and it is first release, better
to add this API as experimental.

Also we should have an implementation in driver and a sample application
usage (testpmd?) with new API. Are these planned separately for this
release, or can it be available part of next version of this patch?

Finally, does it worth documenting this in release notes, as just a
query API I am not sure if this a notable feature, but just a reminder.

Re: [PATCH 3/4] ethdev: make TM shaper parameters constant

2024-09-22 Thread Ferruh Yigit

On 8/7/2024 8:29 AM, Xu, Rosen wrote:
> Hi,
> 
>> -Original Message-
<...>
>> Subject: [PATCH 3/4] ethdev: make TM shaper parameters constant
>>
>> The function to add a new shaper profile in rte_tm should not (and does
>> not) modify the profile parameters passed in via struct pointer. We should
>> guarantee this by marking the parameter pointer as const. This allows SW to
>> create multiple profiles using the same parameter struct without having to
>> reset it each time.
>>
>> Signed-off-by: Bruce Richardson 
> 
> Reviewed-by: Rosen Xu 
> 

Acked-by: Ferruh Yigit

Re: [PATCH 2/4] ethdev: make parameters to TM profile add fn constant

2024-09-22 Thread Ferruh Yigit

On 8/7/2024 8:27 AM, Xu, Rosen wrote:
> Hi,
> 
>> -Original Message-
<...>
>> Subject: [PATCH 2/4] ethdev: make parameters to TM profile add fn constant
>>
>> The function to add a new profile in rte_tm should not (and does not) modify
>> the profile parameters passed in via struct pointer. We should guarantee this
>> by marking the parameter pointer as const.  This allows SW to create multiple
>> profiles using the same parameter struct without having to reset it each 
>> time.
>>
>> Signed-off-by: Bruce Richardson 
> 
> Reviewed-by: Rosen Xu 
>

Acked-by: Ferruh Yigit

Re: [PATCH 1/4] ethdev: make parameters to TM node add fn constant

2024-09-22 Thread Ferruh Yigit

On 8/7/2024 8:27 AM, Xu, Rosen wrote:
> Hi,
> 
>> -Original Message-
<...>
>> Subject: [PATCH 1/4] ethdev: make parameters to TM node add fn constant
>>
>> The function to add a new scheduling node in rte_tm should not (and does
>> not) modify the actual node parameters passed in via struct pointer. We
>> should guarantee this by marking the parameter pointer as const. This allows
>> SW to create multiple scheduling nodes using the same parameter struct
>> without having to reset it each time.
>>
>> Signed-off-by: Bruce Richardson 
>> ---
>>  drivers/net/ipn3ke/ipn3ke_tm.c  |  4 ++--
>> diff --git a/drivers/net/ipn3ke/ipn3ke_tm.c
>> b/drivers/net/ipn3ke/ipn3ke_tm.c index 0260227900..cffe1fdaa4 100644
>> --- a/drivers/net/ipn3ke/ipn3ke_tm.c
>> +++ b/drivers/net/ipn3ke/ipn3ke_tm.c
>> @@ -1010,7 +1010,7 @@ ipn3ke_tm_tdrop_profile_delete(struct
>> rte_eth_dev *dev,  static int
>> ipn3ke_tm_node_add_check_parameter(uint32_t tm_id,
>>  uint32_t node_id, uint32_t parent_node_id, uint32_t priority,
>> -uint32_t weight, uint32_t level_id, struct rte_tm_node_params
>> *params,
>> +uint32_t weight, uint32_t level_id, const struct rte_tm_node_params
>> +*params,
>>  struct rte_tm_error *error)
>>  {
>>  uint32_t level_of_node_id;
>> @@ -1168,7 +1168,7 @@ ipn3ke_tm_node_add_check_mount(uint32_t
>> tm_id,  static int  ipn3ke_tm_node_add(struct rte_eth_dev *dev,
>>  uint32_t node_id, uint32_t parent_node_id, uint32_t priority,
>> -uint32_t weight, uint32_t level_id, struct rte_tm_node_params
>> *params,
>> +uint32_t weight, uint32_t level_id, const struct rte_tm_node_params
>> +*params,
>>  struct rte_tm_error *error)
>>  {
>>  struct ipn3ke_hw *hw = IPN3KE_DEV_PRIVATE_TO_HW(dev); diff --
>>  /** @internal Traffic manager node delete */
>> --
>> 2.43.0
> 
> Reviewed-by: Rosen Xu 
>

Acked-by: Ferruh Yigit

Re: [PATCH v2 15/18] bus/dpaa: add OH port mode for dpaa eth

2024-09-22 Thread Ferruh Yigit

On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
> @@ -78,7 +78,7 @@ TAILQ_HEAD(rte_fman_if_list, __fman_if);
>  
>  /* Represents the different flavour of network interface */
>  enum fman_mac_type {
> - fman_offline_internal = 0,
> + fman_offline = 0,
>   fman_mac_1g,
>   fman_mac_10g,
>   fman_mac_2_5g,
>

Is this rename required, since next patch renames it back to
'fman_offline_internal'?

Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API

2024-09-22 Thread Ferruh Yigit

On 9/22/2024 2:27 PM, Ferruh Yigit wrote:
> On 9/22/2024 5:40 AM, Hemant Agrawal wrote:
>> HI Ferruh,
>>
>>> -Original Message-
>>> From: Ferruh Yigit 
>>> Sent: Sunday, September 22, 2024 8:44 AM
>>> To: Hemant Agrawal ; dev@dpdk.org
>>> Cc: ferruh.yi...@intel.com; Vinod Pullabhatla ;
>>> Rohit Raj 
>>> Subject: Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API
>>> Importance: High
>>>
>>> On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
>>>> diff --git a/drivers/net/dpaa/version.map
>>>> b/drivers/net/dpaa/version.map index 3fdb63caf3..24a28ce649 100644
>>>> --- a/drivers/net/dpaa/version.map
>>>> +++ b/drivers/net/dpaa/version.map
>>>> @@ -6,6 +6,13 @@ DPDK_25 {
>>>>local: *;
>>>>  };
>>>>
>>>> +EXPERIMENTAL {
>>>> +  global:
>>>> +
>>>> +  # added in 24.11
>>>> +  rte_pmd_dpaa_port_set_rate_limit;
>>>> +};
>>>> +
>>>>  INTERNAL {
>>>>global:
>>>>
>>>
>>> PMD specific API needs to be justified, can't we use TM framework for this,
>>> does TM needs to be improved for this support?
>>>
>>> What do you think to send the rest of the set without this patch, so they 
>>> can
>>> progress, and this one can be discussed separately (assuming there is no
>>> dependency).
>>
>> [Hemant] I think, I replied to your earlier concerns. 
>>
>> We are yet to implement TM framework for DPAA1.  But that involves more of 
>> egress QoS.
>>
>> This one is additional capability to limit the ingress port. Kind of 
>> policing in Rx side.
>>
>> However, if you still disagree. Please apply the series without this patch.
>>
> 
> Let's detect what is missing in ethdev layer for it, and what is the
> required effort to cover it in ethdev, first. Based on findings, we can
> continue with PMD API as last resort.
> 
> I will continue with patch series without this path.
>

Hi Hemant,

I removed the commit 14/18 from set, but it impacts other patches, I can
fix the build but can't verify the functionality, probably better if you
send a new version without patch 14/18.

Btw, while resolving conflict I recognized that patch 15/18 renames
'fman_offline_internal' -> 'fman_offline', and next patch (16/18),
renames it back to 'fman_offline_internal', this creates lots of noise
in both patches, can this be prevented, I will comment on the patch?

Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API

2024-09-22 Thread Ferruh Yigit

On 9/22/2024 5:40 AM, Hemant Agrawal wrote:
> HI Ferruh,
> 
>> -Original Message-----
>> From: Ferruh Yigit 
>> Sent: Sunday, September 22, 2024 8:44 AM
>> To: Hemant Agrawal ; dev@dpdk.org
>> Cc: ferruh.yi...@intel.com; Vinod Pullabhatla ;
>> Rohit Raj 
>> Subject: Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API
>> Importance: High
>>
>> On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
>>> diff --git a/drivers/net/dpaa/version.map
>>> b/drivers/net/dpaa/version.map index 3fdb63caf3..24a28ce649 100644
>>> --- a/drivers/net/dpaa/version.map
>>> +++ b/drivers/net/dpaa/version.map
>>> @@ -6,6 +6,13 @@ DPDK_25 {
>>> local: *;
>>>  };
>>>
>>> +EXPERIMENTAL {
>>> +   global:
>>> +
>>> +   # added in 24.11
>>> +   rte_pmd_dpaa_port_set_rate_limit;
>>> +};
>>> +
>>>  INTERNAL {
>>> global:
>>>
>>
>> PMD specific API needs to be justified, can't we use TM framework for this,
>> does TM needs to be improved for this support?
>>
>> What do you think to send the rest of the set without this patch, so they can
>> progress, and this one can be discussed separately (assuming there is no
>> dependency).
> 
> [Hemant] I think, I replied to your earlier concerns. 
> 
> We are yet to implement TM framework for DPAA1.  But that involves more of 
> egress QoS.
> 
> This one is additional capability to limit the ingress port. Kind of policing 
> in Rx side.
> 
> However, if you still disagree. Please apply the series without this patch.
>

Let's detect what is missing in ethdev layer for it, and what is the
required effort to cover it in ethdev, first. Based on findings, we can
continue with PMD API as last resort.

I will continue with patch series without this path.

Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API

2024-09-22 Thread Ferruh Yigit

On 9/22/2024 5:40 AM, Hemant Agrawal wrote:
> HI Ferruh,
> 
>> -Original Message-----
>> From: Ferruh Yigit 
>> Sent: Sunday, September 22, 2024 8:44 AM
>> To: Hemant Agrawal ; dev@dpdk.org
>> Cc: ferruh.yi...@intel.com; Vinod Pullabhatla ;
>> Rohit Raj 
>> Subject: Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API
>> Importance: High
>>
>> On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
>>> diff --git a/drivers/net/dpaa/version.map
>>> b/drivers/net/dpaa/version.map index 3fdb63caf3..24a28ce649 100644
>>> --- a/drivers/net/dpaa/version.map
>>> +++ b/drivers/net/dpaa/version.map
>>> @@ -6,6 +6,13 @@ DPDK_25 {
>>> local: *;
>>>  };
>>>
>>> +EXPERIMENTAL {
>>> +   global:
>>> +
>>> +   # added in 24.11
>>> +   rte_pmd_dpaa_port_set_rate_limit;
>>> +};
>>> +
>>>  INTERNAL {
>>> global:
>>>
>>
>> PMD specific API needs to be justified, can't we use TM framework for this,
>> does TM needs to be improved for this support?
>>
>> What do you think to send the rest of the set without this patch, so they can
>> progress, and this one can be discussed separately (assuming there is no
>> dependency).
> 
> [Hemant] I think, I replied to your earlier concerns. 
> 
> We are yet to implement TM framework for DPAA1.  But that involves more of 
> egress QoS.
> 
> This one is additional capability to limit the ingress port. Kind of policing 
> in Rx side.
> 
> However, if you still disagree. Please apply the series without this patch.
>

Let's detect what is missing in ethdev layer for it, and what is the
required effort to cover it in ethdev, first. Based on findings, we can
continue with PMD API as last resort.

I will continue with patch series without this path.

Re: [PATCH v3 0/3] net/enic: support VF and fix minor issues

2024-09-21 Thread Ferruh Yigit

On 8/9/2024 8:07 AM, Hyong Youb Kim wrote:
> This series contains minor updates for net/enic. The first patch
> supports SR-IOV VF, which now requires the use of admin channel. The
> other patches are not related to VF, but included here to ease review.
> 
> ---
> 
> v3:
> * add spdx, doc, rel_notes (Ferruh's comments)
> 
> v2:
> * fix compiler warnings
> 
> Hyong Youb Kim (3):
>   net/enic: support SR-IOV VF using admin channel
>   net/enic: add speed capabilities for newer models
>   net/enic: allow multicast in MAC address add callback
>

Series applied to dpdk-next-net/main, thanks.

Re: [PATCH] net/ena: revert redefining memcpy

2024-09-21 Thread Ferruh Yigit

On 9/12/2024 6:12 AM, Tyler Retzlaff wrote:
> On Mon, Aug 12, 2024 at 08:34:17AM -0700, Stephen Hemminger wrote:
>> Redefining memcpy as rte_memcpy has no performance gain on
>> current compilers, and introduced bugs like this one where
>> rte_memcpy() will be detected as referencing past the destination.
>>
>> Bugzilla ID: 1510
>> Fixes: 142778b3702a ("net/ena: switch memcpy to optimized version")
>>
>> Signed-off-by: Stephen Hemminger 
>> ---
> 
> Acked-by: Tyler Retzlaff 
> 

Acked-by: Shai Brandes 

Applied to dpdk-next-net/main, thanks.

Re: [PATCH v2 14/18] net/dpaa: add Tx rate limiting DPAA PMD API

2024-09-21 Thread Ferruh Yigit

On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
> diff --git a/drivers/net/dpaa/version.map b/drivers/net/dpaa/version.map
> index 3fdb63caf3..24a28ce649 100644
> --- a/drivers/net/dpaa/version.map
> +++ b/drivers/net/dpaa/version.map
> @@ -6,6 +6,13 @@ DPDK_25 {
>   local: *;
>  };
>  
> +EXPERIMENTAL {
> + global:
> +
> + # added in 24.11
> + rte_pmd_dpaa_port_set_rate_limit;
> +};
> +
>  INTERNAL {
>   global:
>

PMD specific API needs to be justified, can't we use TM framework for
this, does TM needs to be improved for this support?

What do you think to send the rest of the set without this patch, so
they can progress, and this one can be discussed separately (assuming
there is no dependency).

Re: [PATCH v2 00/18] NXP DPAA ETH driver enhancement and fixes

2024-09-21 Thread Ferruh Yigit

On 8/23/2024 8:32 AM, Hemant Agrawal wrote:
> v2: address review comments
>  - improve commit message
>  - add documentarion for new functions
>  - make IEEE1588 config runtime
> 
> This series adds several enhancement to the NXP DPAA Ethernet driver.
> 
> Primarily:
> 1. timestamp and IEEE 1588 support
> 2. OH and ONIC based virtual port config in DPAA
> 3. frame display and debugging infra
> 
> Gagandeep Singh (3):
>   bus/dpaa: fix PFDRs leaks due to FQRNIs
>   net/dpaa: support mempool debug
>   net/dpaa: improve the dpaa port cleanup
> 
> Hemant Agrawal (5):
>   bus/dpaa: fix VSP for 1G fm1-mac9 and 10
>   bus/dpaa: fix the fman details status
>   bus/dpaa: add port buffer manager stats
>   net/dpaa: implement detailed packet parsing
>   net/dpaa: enhance DPAA frame display
> 
> Jun Yang (2):
>   net/dpaa: share MAC FMC scheme and CC parse
>   net/dpaa: improve dpaa errata A010022 handling
> 
> Rohit Raj (3):
>   net/dpaa: fix typecasting ch ID to u32
>   bus/dpaa: add OH port mode for dpaa eth
>   bus/dpaa: add ONIC port mode for the DPAA eth
> 
> Vanshika Shukla (4):
>   net/dpaa: support Tx confirmation to enable PTP
>   net/dpaa: add support to separate Tx conf queues
>   net/dpaa: support Rx/Tx timestamp read
>   net/dpaa: support IEEE 1588 PTP
> 
> Vinod Pullabhatla (1):
>   net/dpaa: add Tx rate limiting DPAA PMD API
>

Hi Hemant,

Can you please ack/review series to proceed?

Re: [PATCH v3] net/af_packet: add timestamp offloading support

2024-09-21 Thread Ferruh Yigit

On 9/20/2024 8:14 AM, Stefan Laesser wrote:
> Add the packet timestamp from TPACKET_V2 to the mbuf
> dynamic rx timestamp register if offload RTE_ETH_RX_OFFLOAD_TIMESTAMP
> is enabled.
> 
> TPACKET_V2 provides the timestamp with nanosecond resolution
> and UNIX origo, i.e. time since 1-JAN-1970 UTC.
> 
> Signed-off-by: Stefan Laesser 
> Acked-by: Morten Brørup 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH] netvsc: optimize stats counters performance

2024-09-21 Thread Ferruh Yigit

On 9/19/2024 7:06 PM, Long Li wrote:
>> Subject: RE: [PATCH] netvsc: optimize stats counters performance
>>
>>> From: Stephen Hemminger [mailto:step...@networkplumber.org]
>>> Sent: Friday, 2 August 2024 19.33
>>>
>>> On Fri, 2 Aug 2024 19:28:26 +0200
>>> Morten Brørup  wrote:
> 
> Reviewed-by: Long Li 
> 

Applied to dpdk-next-net/main, thanks.

Re: [PATCH] net/gve: add ptype parsing to DQ format

2024-09-21 Thread Ferruh Yigit

On 9/18/2024 9:33 PM, Joshua Washington wrote:
> Currently, the packet type is parsed as part of adding the
> checksum-related ol_flags for a received packet, but the parsed
> information is not added to the mbuf.
> 
> This change adds the parsed ptypes to the mbuf and updates the RX
> checksum validation to rely on the mbuf instead of re-capturing the
> ptype from the descriptor. This helps with compatibility with programs
> which rely on the packet type value stored in the mbuf.
> 
> Signed-off-by: Joshua Washington 
> Reviewed-by: Rushil Gupta 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH v3] ethdev: optimize the activation of fast-path tracepoints

2024-09-21 Thread Ferruh Yigit

On 9/19/2024 5:37 PM, Jerin Jacob wrote:
> On Thu, Sep 19, 2024 at 12:16 AM Adel Belkhiri  
> wrote:
>>
>> From: Adel Belkhiri 
>>
>> Split the tracepoints rte_ethdev_trace_rx_burst and
>> rte_eth_trace_call_rx_callbacks into two separate ones
>> for empty and non-empty calls to avoid saturating
>> quickly the trace buffer.
>>
>> Signed-off-by: Adel Belkhiri 
> 
> Acked-by: Jerin Jacob 
>

Applied to dpdk-next-net/main, thanks.

Re: [PATCH] net/mana: support rdma-core via pkg-config in meson

2024-09-21 Thread Ferruh Yigit

On 9/20/2024 6:14 PM, Long Li wrote:
>> Subject: [PATCH] net/mana: support rdma-core via pkg-config in meson
>>
>> Currently building with custom rdma-core installed in /opt/rdma-core after
>> setting PKG_CONFIG_PATH=/opt/rdma-core/lib64/pkgconfig/ results in the below
>> meson logs:
>> Run-time dependency libmana found: YES 1.0.54.0 Header
>> "infiniband/manadv.h" has symbol "manadv_set_context_attr" : NO
>>
>> Thus to fix this, the libs is updated in meson.build and is passed to the
>> cc.has_header_symbol call using dependencies. After this change, the libmana
>> header files are getting included and net/mana is successfully enabled.
>>
>> Fixes: 517ed6e2d590 ("net/mana: add basic driver with build environment")
>> Cc: lon...@microsoft.com
>> Cc: sta...@dpdk.org
>> Signed-off-by: Shreesh Adiga <16567adigashre...@gmail.com>
> 
> Acked-by: Long Li 
> 

Applied to dpdk-next-net/main, thanks.

Re: [PATCH 0/5] Increase minimum meson version

2024-09-21 Thread Ferruh Yigit

On 9/20/2024 1:57 PM, Bruce Richardson wrote:
> This patchset proposed increasing the minimum meson version to 0.57
> and makes changes to update our build files appropriately for that
> change: replacing deprecated functions, removing unnecessary version
> checks and taking advantage of some new capabilities.
> 
> Why 0.57? No one particular reason; it's mainly a conservative version
> bump that doesn't have many impacts, but still gives us the minimum
> updates we need to replace the deprecated get_cross_properties fn
> and have a few extra features guaranteed available.
> 
> Bruce Richardson (5):
>   build: increase minimum meson version to 0.57
>   build: remove version check on compiler links function
>   build: remove unnecessary version checks
>   build: use version file support from meson
>   build: replace deprecated meson function
>

Tested-by: Ferruh Yigit

Re: Unable to access DPDK KMODS folder

2024-09-13 Thread Ferruh Yigit

On 9/13/2024 1:48 PM, Pavithra Raghavendran wrote:
> Hi Team
> 
> I have subscribed to the mailer lists. And hence re-adding the request
> below.
> 
> As part of DPDK-22.11.2, we were downloading the dpdk-kmods folder from
> the following link: http://dpdk.org/git/dpdk-kmods  dpdk-kmods>. 
> Off late, this link is down and even the alternate link: https://
> git.dpdk.org/dpdk-kmods/  is
> intermittently down. 
> Can you please guide us to the latest link to download the dpdk-kmods
> folder ?
> 
> 
> Thanks
> Pavithra
> 
> *From:* Pavithra Raghavendran
> *Sent:* Friday, September 13, 2024 11:06 AM
> *To:* dev@dpdk.org 
> *Cc:* Vikram Thirthappa 
> *Subject:* Unable to access DPDK KMODS folder
>  
> Hi Team
> 
> As part of DPDK-22.11.2, we were downloading the dpdk-kmods folder from
> the following link: http://dpdk.org/git/dpdk-kmods  dpdk-kmods>. 
> Off late, this link is down and even the alternate link: https://
> git.dpdk.org/dpdk-kmods/  is
> intermittently down. 
> Can you please guide us to the latest link to download the dpdk-kmods
> folder ?
> 

Hi Pavithra,

'https://dpdk.org/git/dpdk-kmods' is the URL of the git repo, please try
git clone from it, it works for me [1]

Web page is 'https://git.dpdk.org/dpdk-kmods/'.


Can you point where is this URL mentioned in documentation, if it is not
clear that this is git repository we can update it to clarify?



[1]
git clone https://dpdk.org/git/dpdk-kmods

Re: [PATCH v3] doc: add new driver guidelines

2024-09-13 Thread Ferruh Yigit

On 9/13/2024 5:19 AM, WanRenyong wrote:
> On 2024/9/13 4:26, Stephen Hemminger wrote
>> +
>> +Finalizing
>> +--
>> +
>> +Once the driver has been upstreamed, the author has
>> +a responsibility to the community to maintain it.
>> +
>> +This includes the public test report. Authors must send a public
>> +test report after the first upstreaming of the PMD. The same
>> +public test procedure may be reproduced regularly per release.
> Is there any guildelines about how to write a test report? Is there any 
> template?
>

This is a good question and indeed I got it a few times before, I think
we don't have it documented anywhere.
@Nandini, @Thomas, can this be next thing to work on?

We are using free format for test result reporting, you can find some
samples in mail list, please check replies to release candidate emails
in dev mailing list for samples, like:
https://inbox.dpdk.org/dev/mw4pr11mb5912fbe30092b821029858ea9f...@mw4pr11mb5912.namprd11.prod.outlook.com/

>> +
>> +Dependencies
>> +
>> +
>> +At times, drivers may have dependencies to external software.
>> +For driver dependencies, same DPDK rules for dependencies applies.
>> +Dependencies should be publicly and freely available,
>> +drivers which depend on non-available components will not be accepted.
>> +If the required dependency is not yet publicly available, then wait to 
>> submit
>> +the driver until the dependent library is available.
>> +
> Could you please interpret dependencies publicly and freely？There are 4 
> scenarios as below:
> 1. A dependency is niche software, but it's open-sourced on github, is 
> it publicly or freely?
> 2. A dependency which belongs to our company and open-sourced on github, 
> is it publicly or freely?
> 3. A dependency which is not available in the upstream distribution, but 
> available in the downstream distribution. For instance, a kernel driver 
> dependent upon by PMD, which is not available in kernel.org，but it's 
> available in openeuler kernel, the openeuler kernel is publicly and 
> freely.  Is it publicly or freely?
> 4. If a distribution does not include > the dependency, I redistribute
it with the dependency and open source,
> this is somewhat similar to mlnx_ofed, is it publicly or freely?
> 

All looks good from DPDK perspective, although it is preferred that
dependency upstreamed to its upstream distribution.

Problematic cases are like (not limited to), dependency only delivered
if you purchase the HW, or it is distributed only if you sign some
agreement, or you need to reach out to company and provide some
information to be able to get the SW etc...


> Hello, Stephen,
> 
> These guildelines are very useful for begineers like me :). I have some 
> questions above, could you please explain them? Thank you.
>

Re: [PATCH v3] doc: add new driver guidelines

2024-09-12 Thread Ferruh Yigit

On 9/12/2024 2:18 PM, Nandini Persad wrote:
> I like the separation. I can include it in V4 and see, it would be
> helpful to know if it’s more or less confusing that way. 
> 
> For the prompt before each list, can we say something like “Avoid doing
> the following” and “Suggested actions” or something a little better
> grammatically. We could also just say “Avoid”.  
>

Agree to have better header for the lists, what about:
"Avoid doing the following" and
"Remember to do the following"

Or we can go with whatever you think more convenient.

> --------
> *From:* Ferruh Yigit 
> *Sent:* Thursday, September 12, 2024 1:13:33 AM
> *To:* Nandini Persad 
> *Cc:* dev@dpdk.org ; Thomas Mojalon ;
> Stephen Hemminger 
> *Subject:* Re: [PATCH v3] doc: add new driver guidelines
>  
> On 9/11/2024 5:04 PM, Nandini Persad wrote:
>> Hi Ferruh,
>> 
>> I will work with Stephen on this. For the tone of the list, I believe we
>> can try different ways to make the tone more friendly, while still being
>> concise.
>> 
>> What about something like this:
>> # Avoid including unused headers (process-iwyu.py).
>> # Keep compiler warnings enabled in the build file.
>> # Instead of using #ifdef with driver-specific macros, opt for runtime
>> configuration.
>> # Document device parameters in the driver guide.
>> # Make device operations structs 'const'.
>> # Use dynamic logging.
>> # Skip DPDK version checks (RTE_VERSION_NUM) in the upstream code.
>> # Add SPDX license tags and copyright notices on all sides.
>> # Don’t introduce public APIs directly from the driver.
>> 
>> It's slightly more friendly.
>> 
>> Let me know what you think, I'm open to trying another way.
>> 
> 
> I think above is better.
> 
> Another option can be separating it as "Do" and "Do Not" list, as
> following, do you think does it help, or makes it harder to understand?
> 
> Avoid doing:
> - Using PMD specific macros when DPDK ones exist
> - Including unused headers
> - Disable compiler warnings for driver
> - #ifdef with driver-defined macros
> - DPDK version checks (via RTE_VERSION_NUM) in the upstream code
> - Public APIs directly from the driver
> 
> Suggested to do:
> - Runtime configuration when applicable
> - Document device parameters in the driver guide
> - Make device operations struct 'const'
> - Dynamic logging
> - SPDX license tags and copyright notice on each file
> 
> 
>> On Tue, Sep 10, 2024 at 5:16 PM Ferruh Yigit > <mailto:ferruh.yi...@amd.com <mailto:ferruh.yi...@amd.com>>> wrote:
>> 
>> On 9/10/2024 3:58 PM, Nandini Persad wrote:
>> > This document was created to assist contributors
>> > in creating DPDK drivers, providing suggestions
>> > and guidelines for how to upstream effectively.
>> >
>> 
>> There are minor differences in this v3 and v2, isn't this version on top
>> of v2, can those changes be from Stephen?
>> 
>> <...>
>> 
>> > +
>> > +Additional Suggestions
>> > +--
>> > +
>> > +* We recommend using DPDK macros instead of inventing new ones in
>> the PMD.
>> > +* Do not include unused headers (process-iwyu.py).
>> > +* Do not disable compiler warning in the build file.
>> > +* Do not use #ifdef with driver-defined macros, instead prefer
>> runtime configuration.
>> > +* Document device parameters in the driver guide.
>> > +* Make device operations struct 'const'.
>> > +* Use dynamic logging.
>> > +* Do not use DPDK version checks (via RTE_VERSION_NUM) in the
>> upstream code.
>> > +* Be sure to have SPDX license tags and copyright notice on each
>> side.
>> > +* Do not introduce public Apis directly from the driver.
>> >
>> 
>> API (Application Programming Interface) is an acronym and should be all
>> uppercase, like 'APIs'.
>> 
>> Overall the language in this list is imperative, I think it helps to
>> make it simple, but I am not sure about the tone, I wonder if we can do
>> better, do you have any suggestions?
>> 
>> 
>> > +
>> > +
>> > +Dependencies
>> > +
>> > +
>> > +At times, drivers may have dependencies to external software.
>> > +For driver dependencies,

Re: [RFC v3 0/3] Import Kernel uAPI header files

2024-09-12 Thread Ferruh Yigit

On 9/12/2024 1:08 PM, Maxime Coquelin wrote:
> Hi Ferruh,
> 
> On 9/12/24 10:30, Ferruh Yigit wrote:
>> On 9/11/2024 8:32 PM, Maxime Coquelin wrote:
>>> This series enables importing Linux Kernel uAPI headers
>>> into the DPDK repository. It aims at solving alignment
>>> issues between the build system and the system applications
>>> linked ot DPDK libraries are run on.
>>>
>>
>> Hi Maxime,
>>
>> Is the imported Linux Kernel uAPI headers are always backward compatible?
> 
> Yes, at least that's what is advertised! :)
> 
> "
> The binary interface provided by the kernel to user space cannot be
> broken except under the most severe circumstances.
> "
> 

Ack.

> 
>> If I build and run on identical platform, lets say has kernel v5.4,
>> and DPDK has the kernel uAPI header from v6.10, is this has any chance
>> to create backward compatibility issues?
> 
> It should not if backward compatibility is preserved as advertised.
> 
>> Or can it enable some features which are indeed not exist/supported in
>> the platform that has old version of kernel?
> 
> It can enable new feature, like for example a new ioctl command.
> 
> In this case, it should return something like ENOIOCTLCMD if the Kernel
> does not support it.
> 

OK. I think this has the same risk with building dpdk application in a
platform with newer kernel version, and distribute it to the another
platform with older kernel. If interface is backward compatible, this
should be OK.


In this approach we are duplicating the kernel headers, which is not a
great option. Also this requires maintenance, updating kernel headers in
DPDK as needed, which is not great in long run.

I guess intention is to enable a feature that comes with newer kernel
versions, if so why not force enable it in DPDK, like:
#ifndef FEATURE_MACRO
#define FEATURE_MACRO
#define X
#define Y
#define Z
#endif

If the kernel uAPI is stable, moving these defines to library should be
OK. When this feature hits the distribution kernels we can remove
defines from the DPDK library.


OR, won't it help if whoever building DPDK, download kernel headers
package for necessary version, and build DPDK with them?
For this documenting required versions, and a how to documentation to
explain meson commands etc.. may be sufficient.


>>
>>> It can also help simplify spaghetti code done to support
>>> different versions of the Linux Kernel headers used by
>>> the build system.
>>>
>>> Guidelines and import script are also part of first patch.
>>> A uAPI checker script is added in this 3rd RFC, goals is to
>>> use it in CI for any patch touching linux-headers/uapi.
>>>
>>> Follow-up patches are an example of imported uAPI inclusion
>>> of VDUSE header into the Vhost library.
>>>
>>> Morten, I did not (again) apply your Ack on patch 1, as it
>>> has some significant changes and additions.
>>>
>>> Changes in RFC v3:
>>> ==
>>> - Support nested headers include
>>> - Interactive mode to select which headers to include
>>> - Store version in a file instead of git commit messages
>>> - All imported headers aligned on same version
>>> - Improve loops in scripts (David)
>>> - Update documentation
>>>
>>>
>>> Changes in RFC v2:
>>> ==
>>> - Fix typos in documentation and commit messags (David, Morten)
>>> - Add uAPI checker script
>>> - Add uAPI to global_inc
>>> - Fix build issues on FreeBSD and documentation (CI, David)
>>> - Simplify import script (David)
>>>
>>> Maxime Coquelin (3):
>>>    uapi: introduce kernel uAPI headers import
>>>    uapi: import VDUSE header
>>>    vduse: use imported VDUSE uAPI header
>>>
>>>   devtools/check-linux-uapi.sh   |  85 ++
>>>   devtools/import-linux-uapi.sh  | 119 +
>>>   doc/guides/contributing/index.rst  |   1 +
>>>   doc/guides/contributing/linux_uapi.rst |  71 +
>>>   lib/vhost/meson.build  |   5 +-
>>>   lib/vhost/vduse.c  |   2 +-
>>>   lib/vhost/vduse.h  |  22 --
>>>   linux-headers/uapi/.gitignore  |   4 +
>>>   linux-headers/uapi/linux/vduse.h   | 353 +
>>>   linux-headers/uapi/version |   1 +
>>>   meson.build    |   8 +-
>>>   11 files changed, 642 insertions(+), 29 deletions(-)
>>>   create mode 100755 devtools/check-linux-uapi.sh
>>>   create mode 100755 devtools/import-linux-uapi.sh
>>>   create mode 100644 doc/guides/contributing/linux_uapi.rst
>>>   create mode 100644 linux-headers/uapi/.gitignore
>>>   create mode 100644 linux-headers/uapi/linux/vduse.h
>>>   create mode 100644 linux-headers/uapi/version
>>>
>>
>

Re: [RFC v3 0/3] Import Kernel uAPI header files

2024-09-12 Thread Ferruh Yigit

On 9/11/2024 8:32 PM, Maxime Coquelin wrote:
> This series enables importing Linux Kernel uAPI headers
> into the DPDK repository. It aims at solving alignment
> issues between the build system and the system applications
> linked ot DPDK libraries are run on.
> 

Hi Maxime,

Is the imported Linux Kernel uAPI headers are always backward compatible?

If I build and run on identical platform, lets say has kernel v5.4,
and DPDK has the kernel uAPI header from v6.10, is this has any chance
to create backward compatibility issues?
Or can it enable some features which are indeed not exist/supported in
the platform that has old version of kernel?

> It can also help simplify spaghetti code done to support
> different versions of the Linux Kernel headers used by
> the build system.
> 
> Guidelines and import script are also part of first patch.
> A uAPI checker script is added in this 3rd RFC, goals is to
> use it in CI for any patch touching linux-headers/uapi.
> 
> Follow-up patches are an example of imported uAPI inclusion
> of VDUSE header into the Vhost library.
> 
> Morten, I did not (again) apply your Ack on patch 1, as it
> has some significant changes and additions.
> 
> Changes in RFC v3:
> ==
> - Support nested headers include
> - Interactive mode to select which headers to include
> - Store version in a file instead of git commit messages
> - All imported headers aligned on same version
> - Improve loops in scripts (David)
> - Update documentation
> 
> 
> Changes in RFC v2:
> ==
> - Fix typos in documentation and commit messags (David, Morten)
> - Add uAPI checker script
> - Add uAPI to global_inc
> - Fix build issues on FreeBSD and documentation (CI, David)
> - Simplify import script (David)
> 
> Maxime Coquelin (3):
>   uapi: introduce kernel uAPI headers import
>   uapi: import VDUSE header
>   vduse: use imported VDUSE uAPI header
> 
>  devtools/check-linux-uapi.sh   |  85 ++
>  devtools/import-linux-uapi.sh  | 119 +
>  doc/guides/contributing/index.rst  |   1 +
>  doc/guides/contributing/linux_uapi.rst |  71 +
>  lib/vhost/meson.build  |   5 +-
>  lib/vhost/vduse.c  |   2 +-
>  lib/vhost/vduse.h  |  22 --
>  linux-headers/uapi/.gitignore  |   4 +
>  linux-headers/uapi/linux/vduse.h   | 353 +
>  linux-headers/uapi/version |   1 +
>  meson.build|   8 +-
>  11 files changed, 642 insertions(+), 29 deletions(-)
>  create mode 100755 devtools/check-linux-uapi.sh
>  create mode 100755 devtools/import-linux-uapi.sh
>  create mode 100644 doc/guides/contributing/linux_uapi.rst
>  create mode 100644 linux-headers/uapi/.gitignore
>  create mode 100644 linux-headers/uapi/linux/vduse.h
>  create mode 100644 linux-headers/uapi/version
>

Re: [PATCH] net/ena: revert redefining memcpy

2024-09-12 Thread Ferruh Yigit

On 9/12/2024 6:12 AM, Tyler Retzlaff wrote:
> On Mon, Aug 12, 2024 at 08:34:17AM -0700, Stephen Hemminger wrote:
>> Redefining memcpy as rte_memcpy has no performance gain on
>> current compilers, and introduced bugs like this one where
>> rte_memcpy() will be detected as referencing past the destination.
>>
>> Bugzilla ID: 1510
>> Fixes: 142778b3702a ("net/ena: switch memcpy to optimized version")
>>
>> Signed-off-by: Stephen Hemminger 
>> ---
> 
> Acked-by: Tyler Retzlaff 
> 

Hi Shai,

Did you have any chance to check/test this patch?

Re: [PATCH v3] doc: add new driver guidelines

2024-09-12 Thread Ferruh Yigit

On 9/11/2024 5:04 PM, Nandini Persad wrote:
> Hi Ferruh,
> 
> I will work with Stephen on this. For the tone of the list, I believe we
> can try different ways to make the tone more friendly, while still being
> concise.
> 
> What about something like this:
> # Avoid including unused headers (process-iwyu.py).
> # Keep compiler warnings enabled in the build file.
> # Instead of using #ifdef with driver-specific macros, opt for runtime
> configuration.
> # Document device parameters in the driver guide.
> # Make device operations structs 'const'.
> # Use dynamic logging.
> # Skip DPDK version checks (RTE_VERSION_NUM) in the upstream code.
> # Add SPDX license tags and copyright notices on all sides.
> # Don’t introduce public APIs directly from the driver.
> 
> It's slightly more friendly.
> 
> Let me know what you think, I'm open to trying another way.
> 

I think above is better.

Another option can be separating it as "Do" and "Do Not" list, as
following, do you think does it help, or makes it harder to understand?

Avoid doing:
- Using PMD specific macros when DPDK ones exist
- Including unused headers
- Disable compiler warnings for driver
- #ifdef with driver-defined macros
- DPDK version checks (via RTE_VERSION_NUM) in the upstream code
- Public APIs directly from the driver

Suggested to do:
- Runtime configuration when applicable
- Document device parameters in the driver guide
- Make device operations struct 'const'
- Dynamic logging
- SPDX license tags and copyright notice on each file


> On Tue, Sep 10, 2024 at 5:16 PM Ferruh Yigit  <mailto:ferruh.yi...@amd.com>> wrote:
> 
> On 9/10/2024 3:58 PM, Nandini Persad wrote:
> > This document was created to assist contributors
> > in creating DPDK drivers, providing suggestions
> > and guidelines for how to upstream effectively.
> >
> 
> There are minor differences in this v3 and v2, isn't this version on top
> of v2, can those changes be from Stephen?
> 
> <...>
> 
> > +
> > +Additional Suggestions
> > +--
> > +
> > +* We recommend using DPDK macros instead of inventing new ones in
> the PMD.
> > +* Do not include unused headers (process-iwyu.py).
> > +* Do not disable compiler warning in the build file.
> > +* Do not use #ifdef with driver-defined macros, instead prefer
> runtime configuration.
> > +* Document device parameters in the driver guide.
> > +* Make device operations struct 'const'.
> > +* Use dynamic logging.
> > +* Do not use DPDK version checks (via RTE_VERSION_NUM) in the
> upstream code.
> > +* Be sure to have SPDX license tags and copyright notice on each
> side.
> > +* Do not introduce public Apis directly from the driver.
> >
> 
> API (Application Programming Interface) is an acronym and should be all
> uppercase, like 'APIs'.
> 
> Overall the language in this list is imperative, I think it helps to
> make it simple, but I am not sure about the tone, I wonder if we can do
> better, do you have any suggestions?
> 
> 
> > +
> > +
> > +Dependencies
> > +
> > +
> > +At times, drivers may have dependencies to external software.
> > +For driver dependencies, same DPDK rules for dependencies applies.
> > +Dependencies should be publicly and freely available to
> > +upstream the driver.
> > +
> > +
> > +Test Tools
> > +--
> > +
> > +Per patch in a patch series, be sure to use the proper test tools.
> > +
> > +* checkpatches.sh
> > +* check-git-log.sh
> > +* check-meson.py
> > +* check-doc-vs-code.sh
> > +* check-spdx-tag.sh
> >
> 
> `check-spdx-tag.sh` seems moved in v2 to "additional suggestions", I am
> for keeping it here, as "additional suggestions" are more things to take
> into consideration during design/development, above are actual scripts
> that we can use to verify code.
> 
> And long term intention was to move this "tools to run list" to a more
> generic documentation, as these are not really specific to new PMD
> guide, but "additional suggestions" will stay in this document.
>

Re: virtio: RSS support capa

2024-09-11 Thread Ferruh Yigit

On 9/11/2024 3:59 PM, Morten Brørup wrote:
>> From: Ferruh Yigit [mailto:ferruh.yi...@amd.com]
>> Sent: Wednesday, 11 September 2024 16.02
>>
>> On 9/11/2024 2:20 PM, Thomas Monjalon wrote:
>>> 11/09/2024 14:17, Morten Brørup:
>>>> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
>>>>> On 9/7/24 23:55, Morten Brørup wrote:
>>>>>>> From: Morten Brørup [mailto:m...@smartsharesystems.com]
>>>>>>> dev_info->rx_offload_capa |=
>> RTE_ETH_RX_OFFLOAD_RSS_HASH;
>>>>>>
>>>>>> Or perhaps I'm misunderstanding this capability flag.
>>>>>>
>>>>>> I thought it indicated RSS ability, i.e. multi-queue, effectively
>> shadowing
>>>>> rte_eth_conf.rxmode.mq_mode RTE_ETH_MQ_RX_RSS_FLAG.
>>>>>> But maybe it doesn't. Maybe it indicates the ability to store the
>> RSS hash
>>>>> value in the mbuf.
>>>>>>
>>>>>> The RTE_ETH_RX_OFFLOAD_RSS_HASH flag is completely undocumented.
>>>>>>
>>>>>> Can someone please clarify?
>>>>>>
>>>>> RTE_ETH_RX_OFFLOAD_RSS_HASH means that the driver can provide RSS
>> hash
>>>>> value in mbuf (it makes sense if HW can provide it to the driver).
>>>>
>>>> OK, thanks.
>>>>
>>>> Then what indicates this ethdev capability: Use RSS to distribute
>> packets into multiple RX queues?
>>>> I.e. what to check before setting
>> rte_eth_conf.rxmode.mq_mode=RTE_ETH_MQ_RX_RSS_FLAG?
>>>
>>> It is supposed to be implemented by all DPDK drivers I think.
>>> Is there any case where RSS is not supported?
> 
> Depending on the KVM host, virtio in the KVM guest might not support RSS:
> https://elixir.bootlin.com/dpdk/v24.07/source/drivers/net/virtio/virtio_ethdev.c#L2660
> 
> So we're currently checking if rte_eth_dev_info.flow_type_rss_offloads != 0.
> And I was hoping for an official way to check, which we could generally rely 
> on (e.g. for future drivers).
> Apparently not available.
> 

Indeed that check can work, although this is not explicitly documented,
not having any RSS offload type implies that there is no RSS support.

Perhaps short term solution can be to document this as a way to detect
RSS support.

>>>
>>
>> Hi Morten,
>>
>> There is no formal capability reporting support in ethdev [1].
>>
>> `struct rte_eth_dev_info::[rt]x_[queue]_offload_capa` uses
>> RTE_ETH_RX_OFFLOAD_* flags, and they are for limited scope, only to
>> report the offloading capability of device.
>>
>> Also some capability reporting distributed to
>> - struct rte_eth_dev_info::dev_capa (RTE_ETH_DEV_CAPA_*)
>> - struct rte_eth_dev_info (various files like speed_capa)
>> - struct rte_eth_dev_data::dev_flags (RTE_ETH_DEV_*)
>>
>>
>> Programmatically there is no way for device to report RSS support.
> 
> OK, that answers my question.
> Not the answer I was hoping for, but much better than not knowing.
> 
>> The .ini file has "RSS hash" feature, which covers RSS support, please
>> check 'features.rst'.
> 
> Yeah, perhaps the lack of capability reporting and inconsistent 
> interpretation of features (and no conformance testing to detect/reveal this) 
> means that the application must maintain its own list of drivers it supports 
> with details about capabilities and feature deviations.
> 
>>
>> [1]
>> We mentioned formal capability reporting a few times in the past, but
>> this is a big task to take, specially with all historical usages.
> 
> So instead of fixing this omission, because it is a big task, we pass on the 
> challenge to the application developers to find out how the various drivers 
> behave.
> 
>> Two things specially bothers me:
>> 1. When we need a capability, since there is no dedicated place for it,
>> it finds itself one of above locations, mostly in dev_info.
>> 2. 'struct rte_eth_dev_info' has not defined role, it is combination of
>> the status, configuration and capability reporting. With more
>> capabilities hitting it, it is becoming a more mixed, central struct
>> that many features depends on.
> 
> Some of it could be fixed by documenting what we already have.
> E.g. the RTE_ETH_RX_OFFLOAD_RSS_HASH is undocumented, and my initial guess at 
> its purpose was wrong.
> 

This is added later as an optimization for some NICs.
Although these NICs support RSS, accessing to the internally calculated
hash value has performance impact, so this flag enables them to escape
from reporting RSS hash value to application.

I was thinking we documented it in 'features.rst', but looking again
that information is not much useful.

Would you mind adding a comment to macro in 'rte_ethdev.h' as documentation?


> Long term, we really need good feature descriptions and conformance testing, 
> so all features behave the same way on all drivers.
> 

Ack.

> Short term I got my specific RSS questions answered. :-)
>

Re: virtio: RSS support capa

2024-09-11 Thread Ferruh Yigit

On 9/11/2024 3:02 PM, Ferruh Yigit wrote:
> On 9/11/2024 2:20 PM, Thomas Monjalon wrote:
>> 11/09/2024 14:17, Morten Brørup:
>>> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
>>>> On 9/7/24 23:55, Morten Brørup wrote:
>>>>>> From: Morten Brørup [mailto:m...@smartsharesystems.com]
>>>>>>  dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_RSS_HASH;
>>>>>
>>>>> Or perhaps I'm misunderstanding this capability flag.
>>>>>
>>>>> I thought it indicated RSS ability, i.e. multi-queue, effectively 
>>>>> shadowing
>>>> rte_eth_conf.rxmode.mq_mode RTE_ETH_MQ_RX_RSS_FLAG.
>>>>> But maybe it doesn't. Maybe it indicates the ability to store the RSS hash
>>>> value in the mbuf.
>>>>>
>>>>> The RTE_ETH_RX_OFFLOAD_RSS_HASH flag is completely undocumented.
>>>>>
>>>>> Can someone please clarify?
>>>>>
>>>> RTE_ETH_RX_OFFLOAD_RSS_HASH means that the driver can provide RSS hash
>>>> value in mbuf (it makes sense if HW can provide it to the driver).
>>>
>>> OK, thanks.
>>>
>>> Then what indicates this ethdev capability: Use RSS to distribute packets 
>>> into multiple RX queues?
>>> I.e. what to check before setting 
>>> rte_eth_conf.rxmode.mq_mode=RTE_ETH_MQ_RX_RSS_FLAG?
>>
>> It is supposed to be implemented by all DPDK drivers I think.
>> Is there any case where RSS is not supported?
>>
> 
> Hi Morten,
> 
> There is no formal capability reporting support in ethdev [1].
> 
> `struct rte_eth_dev_info::[rt]x_[queue]_offload_capa` uses
> RTE_ETH_RX_OFFLOAD_* flags, and they are for limited scope, only to
> report the offloading capability of device.
> 
> Also some capability reporting distributed to
> - struct rte_eth_dev_info::dev_capa (RTE_ETH_DEV_CAPA_*)
> - struct rte_eth_dev_info (various files like speed_capa)
> - struct rte_eth_dev_data::dev_flags (RTE_ETH_DEV_*)
> 
> 
> Programmatically there is no way for device to report RSS support.
> The .ini file has "RSS hash" feature, which covers RSS support, please
> check 'features.rst'.
> 
> 

[1] // forgot the label for above pointer
> We mentioned formal capability reporting a few times in the past, but
> this is a big task to take, specially with all historical usages.
> Two things specially bothers me:
> 1. When we need a capability, since there is no dedicated place for it,
> it finds itself one of above locations, mostly in dev_info.
> 2. 'struct rte_eth_dev_info' has not defined role, it is combination of
> the status, configuration and capability reporting. With more
> capabilities hitting it, it is becoming a more mixed, central struct
> that many features depends on.

Re: virtio: RSS support capa

2024-09-11 Thread Ferruh Yigit

On 9/11/2024 2:20 PM, Thomas Monjalon wrote:
> 11/09/2024 14:17, Morten Brørup:
>> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
>>> On 9/7/24 23:55, Morten Brørup wrote:
> From: Morten Brørup [mailto:m...@smartsharesystems.com]
>   dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_RSS_HASH;

 Or perhaps I'm misunderstanding this capability flag.

 I thought it indicated RSS ability, i.e. multi-queue, effectively shadowing
>>> rte_eth_conf.rxmode.mq_mode RTE_ETH_MQ_RX_RSS_FLAG.
 But maybe it doesn't. Maybe it indicates the ability to store the RSS hash
>>> value in the mbuf.

 The RTE_ETH_RX_OFFLOAD_RSS_HASH flag is completely undocumented.

 Can someone please clarify?

>>> RTE_ETH_RX_OFFLOAD_RSS_HASH means that the driver can provide RSS hash
>>> value in mbuf (it makes sense if HW can provide it to the driver).
>>
>> OK, thanks.
>>
>> Then what indicates this ethdev capability: Use RSS to distribute packets 
>> into multiple RX queues?
>> I.e. what to check before setting 
>> rte_eth_conf.rxmode.mq_mode=RTE_ETH_MQ_RX_RSS_FLAG?
> 
> It is supposed to be implemented by all DPDK drivers I think.
> Is there any case where RSS is not supported?
> 

Hi Morten,

There is no formal capability reporting support in ethdev [1].

`struct rte_eth_dev_info::[rt]x_[queue]_offload_capa` uses
RTE_ETH_RX_OFFLOAD_* flags, and they are for limited scope, only to
report the offloading capability of device.

Also some capability reporting distributed to
- struct rte_eth_dev_info::dev_capa (RTE_ETH_DEV_CAPA_*)
- struct rte_eth_dev_info (various files like speed_capa)
- struct rte_eth_dev_data::dev_flags (RTE_ETH_DEV_*)

Programmatically there is no way for device to report RSS support.
The .ini file has "RSS hash" feature, which covers RSS support, please
check 'features.rst'.

We mentioned formal capability reporting a few times in the past, but
this is a big task to take, specially with all historical usages.
Two things specially bothers me:
1. When we need a capability, since there is no dedicated place for it,
it finds itself one of above locations, mostly in dev_info.
2. 'struct rte_eth_dev_info' has not defined role, it is combination of
the status, configuration and capability reporting. With more
capabilities hitting it, it is becoming a more mixed, central struct
that many features depends on.

Re: [PATCH v2 00/19] XSC PMD for Yunsilicon NICs

2024-09-11 Thread Ferruh Yigit

On 9/11/2024 3:07 AM, WanRenyong wrote:
> This xsc PMD (**librte_net_xsc**) provides poll mode driver for 
> Yunsilicon metaScale serials NICs.
> 
> Features:
> -
> - MTU update
> - TSO
> - RSS hash
> - RSS key update
> - RSS reta update
> - L3 checksum offload
> - L4 checksum offload
> - Inner L3 checksum
> - Inner L4 checksum
> - Basic stats 
> 
> Support NICs:
> -
> - metaScale-200S   Single QSFP56 Port 200GE SmartNIC
> - metaScale-200Quad QSFP28 Ports 100GE SmartNIC
> - metaScale-50 Dual QSFP28 Port 25GE SmartNIC
> - metaScale-100Q   Quad QSFP28 Port 25GE SmartNIC
> 
> 
> ---
> WanRenyong (19):
>   net/xsc: add doc and minimum build framework
>   net/xsc: add log macro
>   net/xsc: add PCI device probe and remove
>   net/xsc: add xsc device init and uninit
>   net/xsc: add ioctl command interface
>   net/xsc: initialize hardware information
>   net/xsc: add representor ports probe
>   net/xsc: create eth devices for representor ports
>   net/xsc: initial representor eth device
>   net/xsc: add ethdev configure and rxtx queue setup ops
>   net/xsc: add mailbox and structure
>   net/xsc: add ethdev RSS hash ops
>   net/xsc: add ethdev start and stop ops
>   net/xsc: add ethdev Rx burst
>   net/xsc: add ethdev Tx burst
>   net/xsc: configure xsc device hardware table
>   net/xsc: add dev link and MTU ops
>   net/xsc: add dev infos get
>   net/xsc: add dev basic stats ops
>

Hi WanRenyong,

Thanks for upstreaming the driver,
I did not had a chance to check in detail yet, but when sending next
version can you please send patch series as threaded, with following
git-send-email arguments:
`--thread --no-chain-reply-to`

Or this can be added to .gitconfig as documented in [1].


Also there is new PMD upstreaming guide work in progress [2], current
version still can be useful, can you please check it.

Thanks,
ferruh


[1]
https://old.dpdk.org/dev

[2]
https://patches.dpdk.org/project/dpdk/patch/20240910145801.46186-1-nandinipersad...@gmail.com/

Re: ci: version.map check

2024-09-10 Thread Ferruh Yigit

On 9/11/2024 1:08 AM, Morten Brørup wrote:
>> From: Ferruh Yigit [mailto:ferruh.yi...@amd.com]
>> Sent: Wednesday, 11 September 2024 01.48
>>
>> On 9/10/2024 5:56 PM, Morten Brørup wrote:
>>> It would be convenient if the CI checked that the symbols in
>> version.map files actually exist in the resulting code.
>>>
>>
>> Is there instances that is not the case?
> 
> While reviewing a patch, I noticed one such instance.
> A previous version of the patch contained a global variable that had since 
> been removed, but not from the version.map file.
> 
> Specifically "rte_lcore_var" in this earlier version:
> https://patchwork.dpdk.org/project/dpdk/patch/20240208181644.455233-2-mattias.ronnb...@ericsson.com/
> 
> Which in this more recent patch has been removed, but remains in version.map:
> https://patchwork.dpdk.org/project/dpdk/patch/20240910070344.699183-2-mattias.ronnb...@ericsson.com/
> 

I expect build to catch them, via `./buildtools/check-symbols.sh`, which
uses `./devtools/check-symbol-maps.sh` internally. I did a quick check,
and add a non-existing symbol to a version.map file, build failed [1] as
expected.

For 'rte_lcore_var' instance, `check-symbol-maps.sh` does not catch it,
but if you update the variable name in version.map to 'rte_lcore_bar' it
does catch :)

So I assume `check-symbol-maps.sh` uses wildcard search, and because of
existing 'rte_lcore_var_alloc', detection fails.

Probably we need to fix `check-symbol-maps.sh` for this, and build will
catch extra symbols.


[1]
[133/264] Generating lib/ethdev.sym_chk with a custom command (wrapped
by meson to capture output)
FAILED: lib/ethdev.sym_chk

Re: [PATCH v3] doc: add new driver guidelines

2024-09-10 Thread Ferruh Yigit

On 9/10/2024 3:58 PM, Nandini Persad wrote:
> This document was created to assist contributors
> in creating DPDK drivers, providing suggestions
> and guidelines for how to upstream effectively.
> 

There are minor differences in this v3 and v2, isn't this version on top
of v2, can those changes be from Stephen?

<...>

> +
> +Additional Suggestions
> +--
> +
> +* We recommend using DPDK macros instead of inventing new ones in the PMD.
> +* Do not include unused headers (process-iwyu.py).
> +* Do not disable compiler warning in the build file.
> +* Do not use #ifdef with driver-defined macros, instead prefer runtime 
> configuration.
> +* Document device parameters in the driver guide.
> +* Make device operations struct 'const'.
> +* Use dynamic logging.
> +* Do not use DPDK version checks (via RTE_VERSION_NUM) in the upstream code.
> +* Be sure to have SPDX license tags and copyright notice on each side.
> +* Do not introduce public Apis directly from the driver.
>

API (Application Programming Interface) is an acronym and should be all
uppercase, like 'APIs'.

Overall the language in this list is imperative, I think it helps to
make it simple, but I am not sure about the tone, I wonder if we can do
better, do you have any suggestions?

> +
> +
> +Dependencies
> +
> +
> +At times, drivers may have dependencies to external software.
> +For driver dependencies, same DPDK rules for dependencies applies.
> +Dependencies should be publicly and freely available to
> +upstream the driver.
> +
> +
> +Test Tools
> +--
> +
> +Per patch in a patch series, be sure to use the proper test tools.
> +
> +* checkpatches.sh
> +* check-git-log.sh
> +* check-meson.py
> +* check-doc-vs-code.sh
> +* check-spdx-tag.sh
>

`check-spdx-tag.sh` seems moved in v2 to "additional suggestions", I am
for keeping it here, as "additional suggestions" are more things to take
into consideration during design/development, above are actual scripts
that we can use to verify code.

And long term intention was to move this "tools to run list" to a more
generic documentation, as these are not really specific to new PMD
guide, but "additional suggestions" will stay in this document.

Re: ci: version.map check

2024-09-10 Thread Ferruh Yigit

On 9/10/2024 5:56 PM, Morten Brørup wrote:
> It would be convenient if the CI checked that the symbols in version.map 
> files actually exist in the resulting code.
> 

Is there instances that is not the case?

Re: Crash in tap pmd when using more than 8 rx queues

2024-09-10 Thread Ferruh Yigit

On 9/10/2024 5:58 PM, Stephen Hemminger wrote:
> On Fri, 6 Sep 2024 12:16:47 +0100
> Ferruh Yigit  wrote:
> 
>> On 9/5/2024 1:55 PM, Edwin Brossette wrote:
>>> Hello,
>>>
>>> I have recently stumbled into an issue with my DPDK-based application
>>> running the failsafe pmd. This pmd uses a tap device, with which my
>>> application fails to start if more than 8 rx queues are used. This issue
>>> appears to be related to this patch:
>>> https://git.dpdk.org/dpdk/commit/?
>>> id=c36ce7099c2187926cd62cff7ebd479823554929 <https://git.dpdk.org/dpdk/  
>>> commit/?id=c36ce7099c2187926cd62cff7ebd479823554929>  
>>>
>>> I have seen in the documentation that there was a limitation to 8 max
>>> queues shared when using a tap device shared between multiple processes.
>>> However, my application uses a single primary process, with no secondary
>>> process, but it appears that I am still running into this limitation.
>>>
>>> Now if we look at this small chunk of code:
>>>
>>> memset(&msg, 0, sizeof(msg));
>>> strlcpy(msg.name <http://msg.name>, TAP_MP_REQ_START_RXTX,
>>> sizeof(msg.name <http://msg.name>));
>>> strlcpy(request_param->port_name, dev->data->name, sizeof(request_param-  
>>>> port_name));  
>>> msg.len_param = sizeof(*request_param);
>>> for (i = 0; i < dev->data->nb_tx_queues; i++) {
>>>     msg.fds[fd_iterator++] = process_private->txq_fds[i];
>>>     msg.num_fds++;
>>>     request_param->txq_count++;
>>> }
>>> for (i = 0; i < dev->data->nb_rx_queues; i++) {
>>>     msg.fds[fd_iterator++] = process_private->rxq_fds[i];
>>>     msg.num_fds++;
>>>     request_param->rxq_count++;
>>> }
>>> (Note that I am not using the latest DPDK version, but stable v23.11.1.
>>> But I believe the issue is still present on latest.)
>>>
>>> There are no checks on the maximum value i can take in the for loops.
>>> Since the size of msg.fds is limited by the maximum of 8 queues shared
>>> between process because of the IPC API, there is a potential buffer
>>> overflow which can happen here.
>>>
>>> See the struct declaration:
>>> struct rte_mp_msg {
>>>  char name[RTE_MP_MAX_NAME_LEN];
>>>  int len_param;
>>>  int num_fds;
>>>  uint8_t param[RTE_MP_MAX_PARAM_LEN];
>>>  int fds[RTE_MP_MAX_FD_NUM];
>>> };
>>>
>>> This means that if the number of queues used is more than 8, the program
>>> will crash. This is what happens on my end as I get the following log:
>>> *** stack smashing detected ***: terminated
>>>
>>> Reverting the commit mentionned above fixes my issue. Also setting a
>>> check like this works for me:
>>>
>>> if (dev->data->nb_tx_queues + dev->data->nb_rx_queues > RTE_MP_MAX_FD_NUM)
>>>  return -1;
>>>
>>> I've made the changes on my local branch to fix my issue. This mail is
>>> just to bring attention on this problem.
>>> Thank you in advance for considering it.
>>>   
>>
>> Hi Edwin,
>>
>> Thanks for the report, I confirm issue is valid, although that code
>> changed a little (to increase 8 limit) [3].
>>
>> And in this release Stephen put another patch [1] to increase the limit
>> even more, but irrelevant from the limit, tap code needs to be fixed.
>>
>> To fix:
>> 1. We need to add "nb_rx_queues > RTE_MP_MAX_FD_NUM" check you
>> mentioned, to not blindly update the 'msg.fds[]'
>> 2. We should prevent this to be a limit for tap PMD when there is only
>> primary process, this seems was oversight in our end.
>>
> 
> It is not clear what the error handling should be if the user requests
> 10 queues but RTE_MP_MAX_FD_NUM is 8. Ideally, it should work if no secondary
> process is used. But there is no good way to know that in the driver.
> 
> That is why it is best to just set TAP max queues to be less than
> or equal to RTE_MP_MAX_FD_NUM, and enforce that with a static assertion
> at compile time.
>

We can limit queue number to RTE_MP_MAX_FD_NUM when there is secondary
process, but when there is only primary process this restriction is
artificial, and I think we should try to remove it.

One solution can be to provide a devarg if tap pmd will be used only in
primary process, and in this case it skips all code required for
secondary process support, this removes the queue number limit.
Also this can be backward compatible, without devarg default behavior
can be secondary support and queue number is limited, with devarg limit
removed etc..

By spending some time on it probably we can come with even better
solution. Like improving MP socket communication to remove this
restriction completely independent from RTE_MP_MAX_FD_NUM size...

Re: DPDK Summit Montreal - Schedule

2024-09-06 Thread Ferruh Yigit

On 9/6/2024 3:09 PM, Konstantin Ananyev wrote:
> 
> 
>>> We will talk about the future of DPDK, the best userland networking 
>>> libraries
>>> having an incredible hardware support from our large community.
>>> It will be an opportunity to connect, learn and collaborate with developers
>>> from around the world who contribute to and utilize DPDK.
>>>
>>> Talks will cover CPU optimizations, GPU processing, machine learning,
>>> hashing, packet offload, cryptography, testing and more.
>>>
>>> The schedule can be found here, almost complete:
>>> https://events.linuxfoundation.org/dpdk-summit/program/schedule/
>>>
>>>
>>> A workshop session is planned to allow debating and making progress
>>> in smaller group discussions about specific topics to be determined.
>>> Examples of such topics could be:
>>> - debuggability
>>> - power management techniques & efficiency
>>> - config restore bypass in ethdev port start
>>> - secondary process usage and limitations
>>>
>>> Feel free to propose your ideas in advance so we can come prepared.
>>> Then we will organize ourselves in discussion groups
>>> in order to progress and hopefully reach some new conclusions.
>>>
>>
>> What do you think about 'rte_flow', it is a powerful tool but complex
>> and not adopted in same level by all vendors. As it hard to test, we are
>> having difficulty to provide consistency between vendor implementations.
>> We can discuss how to spread understanding among various vendors and
>> users, how to increase adoption, and future targets/plans.
>>
>>
>> Another one can be 'tooling', as a result of kernel bypass, some known
>> Linux networking tools does not work with DPDK solutions, and this
>> creates confusing and entry barrier for some people, this problem
>> mentioned a few times before.
>> Perhaps we should address this problem in a more structured way, to
>> design and later implement gradually some solutions. We can discuss
>> methods and plans to improve our tooling support.
>>
> 
> Thanks Ferruh, sound like an interesting ones to me, specially the second one.
> As another possible subject: we talk about how to make core DPDK 
> data-structures
> (mempool, hash-table, ring, etc.) less static: i.e. add ability to 
> grow/shrink on demand?
> Another long-hanging thing - RTE_MAX_LCORE... - can it be runtime parameter, 
> instead
> of build-time parameter?
> All these things I think would help overall  by reducing memory footprint, 
> improving usability, etc. 
> 

+1 to converting static data structures to dynamic ones discussion,
`RTE_MAX_ETHPORTS` is the one in my focus that I would like to see it
gone, (and a few friends of it, like RTE_MAX_QUEUES_PER_PORT,
RTE_ETHDEV_RXTX_CALLBACKS, etc..)

I wonder if we can introduce a generic, C++ vector like dynamic resizing
data structure first, and apply it to various cases we have at hand
gradually.

Re: Crash in tap pmd when using more than 8 rx queues

2024-09-06 Thread Ferruh Yigit

On 9/6/2024 3:04 PM, Edwin Brossette wrote:
> Hello,
> 
> I created a Bugzilla PR, just as you requested:
> https://bugs.dpdk.org/show_bug.cgi?id=1536  show_bug.cgi?id=1536>
> 
> As for the bug resolution, I have other matters to attend to and I'm
> afraid I cannot spend more time on this issue, so I was only planning to
> report it.
>

Fair enough, thanks for the report.

Re: [PATCH 0/3] eal: mark API's as stable

2024-09-06 Thread Ferruh Yigit

On 9/6/2024 11:04 AM, Morten Brørup wrote:
>> From: Ferruh Yigit [mailto:ferruh.yi...@amd.com]
>> Sent: Friday, 6 September 2024 10.54
>>
>> On 9/5/2024 3:01 PM, Jerin Jacob wrote:
>>> On Thu, Sep 5, 2024 at 3:14 PM Morten Brørup 
>> wrote:
>>>>
>>>>> From: David Marchand [mailto:david.march...@redhat.com]
>>>>> Sent: Thursday, 5 September 2024 11.03
>>>>>
>>>>> On Thu, Sep 5, 2024 at 10:55 AM Morten Brørup 
>>>>> wrote:
>>>>>>
>>>>>>> From: David Marchand [mailto:david.march...@redhat.com]
>>>>>>> Sent: Thursday, 5 September 2024 09.59
>>>>>>>
>>>>>>> On Wed, Sep 4, 2024 at 8:10 PM Stephen Hemminger
>>>>>>>  wrote:
>>>>>>>>
>>>>>>>> The API's in ethtool from before 23.11 should be marked stable.
>>>>>>>
>>>>>>> EAL* ?
>>>>>>>
>>>>>>>> Should probably include the trace api's but that is more complex
>> change.
>>>>>>>
>>>>>>> On the trace API itself it should be ok.
>>>>>>
>>>>>> No!
>>>>>
>>>>> *sigh*
>>>>>
>>>>>>
>>>>>> Trace must remain experimental until controlled by a meson option, e.g.
>>>>> "enable_trace", whereby trace can be completely disabled and omitted from
>> the
>>>>> compiled application/libraries/drivers at build time.
>>>>>
>>>>> This seems unrelated to marking the API stable as regardless of the
>>>>> API state at the moment, this code is always present.
>>>>
>>>> I cannot foresee if disabling trace at build time will require changes to
>> the trace API. So I'm being cautious here.
>>>>
>>>> However, if Jerin (as author of the trace subsystem) foresees that it will
>> be possible to disable trace at build time without affecting the trace API, I
>> don't object to marking the trace API (or some of it) stable.
>>>
>>> I don't for foresee any ABI changes when adding disabling trace
>>> compile time support. However, I don't understand why we need to do
>>> that. In the sense, fast path functions are already having an option
>>> to compile out.
>>> Slow path functions can be disabled at runtime at the cost of 1 cycle
>>> as instrumentation cost. Having said that, I don't have any concern
>>> about disabling trace as an option.
>>>
>>
>> I agree with Jerin, I don't see motivation to disable slow path traces
>> when they can be disabled in runtime.
>> And fast path traces already have compile flag to disable them.
>>
>> Build time configurations in long term has problems too, so I am for not
>> using them unless we don't have to.
> 
> For some use cases, trace is dead code, and should be omitted.
> You don't want dead code in production systems.
> 
> Please remember that DPDK is also being used in highly optimized embedded 
> systems, hardware appliances and other systems where memory is not abundant.
> 
> DPDK is not only for cloud and distros. ;-)
> 
> The CI only tests DPDK with a build time configuration expected to be usable 
> for distros.
> I'm not asking to change that.
> I'm only asking for more build time configurability to support other use 
> cases.
> 

I see, but that build time configuration argument exists in multiple
aspects. And with meson switch we lean to dynamic configuration approach.

When a build time config introduced, again and again we are having cases
that specific code enabled with compile time macro broken and nobody
noticed.
Having code enabled always and configured in runtime produces more
robust deliverable.

We are aware that DPDK is used in embedded device, but they are not
mostly very resource restricted devices, is it really matter to have a
few megabytes (I didn't check but I expect this the max binary size can
increase with tracing code) larger DPDK binary, does it really makes any
difference?

Re: [PATCH 0/3] eal: mark API's as stable

2024-09-06 Thread Ferruh Yigit

On 9/6/2024 2:11 PM, Jerin Jacob wrote:
> On Fri, Sep 6, 2024 at 3:04 PM Ferruh Yigit  wrote:
>>
>> On 9/5/2024 8:58 AM, David Marchand wrote:
>>> On Wed, Sep 4, 2024 at 8:10 PM Stephen Hemminger
>>>  wrote:
>>>>
>>>> The API's in ethtool from before 23.11 should be marked stable.
>>>
>>> EAL* ?
>>>
>>>> Should probably include the trace api's but that is more complex change.
>>>
>>> On the trace API itself it should be ok.
>>> The problem is with the tracepoint variables themselves, and I don't
>>> think we should mark them stable.
>>>
>>
>> We cleaned tracepoint variables from ethdev map file, why they exist for
>> 'eal'?
>>
>> I can see .map file has bunch of "__rte_eal_trace_generic_*", I think
>> they exists to support 'rte_eal_trace_generic_*()' APIs which can be
>> called from other libraries.
>>
>> Do we really need them?
>> Why not whoever calls them directly call 'rte_trace_point_emit_*' instead?
>> As these rte_eal_trace_generic_*()' not used at all, I assume this is
>> what done already.
>>
>> @Jerin,
>> what do think to remove 'rte_eal_trace_generic_*()' APIs, so trace
>> always keeps local to library, and don't bloat the eal .map file?
> 
> The purpose of exposing rte_eal_trace_generic_* is that, applications
> can add generic trace points
> in the application.
> 

Can't applications use 'rte_trace_point_emit_*()' directly, as libraries
does?

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1114 matches

Mail list logo